Control System Synthesis: A Factorization Approach

Control System Synthesis
A Factorization Approach
Part II
Synthesis Lectures on Control
and Mechatronics
Editors
Chaouki Abdallah, University of New Mexico
Mark W. Spong, University of Texas at Dallas
Control System Synthesis: A Factorization Approach, Part II

Mathukumalli Vidyasagar
2011
Control System Synthesis: A Factorization Approach, Part I

2011
The Reaction Wheel Pendulum

Daniel J. Block, Karl J. Åström, and Mark W. Spong
2007
Copyright © 2011 by Morgan & Claypool
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in
any form or by any means—electronic, mechanical, photocopy, recording, or any other except for brief quotations in
printed reviews, without the prior permission of the publisher.
Control System Synthesis: A Factorization Approach, Part II

www.morganclaypool.com
ISBN: 9781608456611 paperback

ISBN: 9781608456635 ebook (Part II)
DOI 10.2200/S00358ED1V01Y201105CRM003
A Publication in the Morgan & Claypool Publishers series

SYNTHESIS LECTURES ON CONTROL AND MECHATRONICS
Lecture #3
Series Editors: Chaouki Abdallah, University of New Mexico
Mark W. Spong, University of Texas at Dallas
Series ISSN
Synthesis Lectures on Control and Mechatronics
Print 1939-0564 Electronic 1939-0572
Control System Synthesis
A Factorization Approach
Part II
University of Texas at Dallas
SYNTHESIS LECTURES ON CONTROL AND MECHATRONICS #3
M
&C Morgan & cLaypool publishers
ABSTRACT
This book introduces the so-called “stable factorization approach” to the synthesis of feedback
controllers for linear control systems. The key to this approach is to view the multi-input, multi-
output (MIMO) plant for which one wishes to design a controller as a matrix over the fraction field
F associated with a commutative ring with identity, denoted by R, which also has no divisors of
zero. In this setting, the set of single-input, single-output (SISO) stable control systems is precisely
the ring R, while the set of stable MIMO control systems is the set of matrices whose elements all
belong to R. The set of unstable, meaning not necessarily stable, control systems is then taken to be
the field of fractions F associated with R in the SISO case, and the set of matrices with elements in
F in the MIMO case.
The central notion introduced in the book is that, in most situations of practical interest,
every matrix P whose elements belong to F can be “factored” as a “ratio” of two matrices N, D
whose elements belong to R, in such a way that N, D are coprime. In the familiar case where the
ring R corresponds to the set of bounded-input, bounded-output (BIBO)-stable rational transfer
functions, coprimeness is equivalent to two functions not having any common zeros in the closed right
half-plane including infinity. However, the notion of coprimeness extends readily to discrete-time
systems, distributed-parameter systems in both the continuous- as well as discrete-time domains,
and to multi-dimensional systems. Thus the stable factorization approach enables one to capture all
these situations within a common framework.
The key result in the stable factorization approach is the parametrization of all controllers
that stabilize a given plant. It is shown that the set of all stabilizing controllers can be parametrized
by a single parameter R, whose elements all belong to R. Moreover, every transfer matrix in the
closed-loop system is an affine function of the design parameter R. Thus problems of reliable
stabilization, disturbance rejection, robust stabilization etc. can all be formulated in terms of choosing
an appropriate R.
This is a reprint of the book Control System Synthesis: A Factorization Approach originally
published by M.I.T. Press in 1985.
KEYWORDS
stable factorization, coprimeness, Bezout identity, simultaneous stabilization, robust
stabilization, robust regulation, genericity
Dedicated to the memories of
My paternal grandfather,
Mathukumalli Narasimharao (1881-1965)
and
My father,
Prof. Mathukumalli Venkata Subbarao (1921-2006)
ix

kt

я
dv

May God protect us together

May God nourish us together
May we work together with great energy
May our study be vigorous and effective
May we not have ill feelings
Let there be peace, peace, peace
(A vedic prayer recited by the teacher and student(s) together at the commencement of studies.
Found in several upanishads including Taittiriyopanishad )
xi
Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Preface for the Original Edition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
6 Filtering and Sensitivity Minimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

6.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
6.2 Some Facts about Hardy Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
6.3 Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
6.4 Sensitivity Minimization: Scalar Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
6.5 Sensitivity Minimization: Fat Plant Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
6.6 Sensitivity Minimization: General Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
6.7 Two-Parameter Compensator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
7 Robustness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
7.2 Graph Topology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
7.3 Graph Metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
7.4 Robustness Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
7.5 Robust and Optimal Regulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
7.6 Genericity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
8 Extensions to General Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279

8.1 Coprime Factorizations over a Commutative Ring . . . . . . . . . . . . . . . . . . . . . . . . . 279
8.2 Coprime Factorizations for Distributed Systems . . . . . . . . . . . . . . . . . . . . . . . . . . 298
8.3 Controller Synthesis over a General Ring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
A Algebraic Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307

A.1 Rings, Fields and Ideals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
A.2 Rings and Fields of Fractions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
A.3 Principal Ideal Domains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
A.4 Euclidean Domains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
xii
B Preliminaries on Matrix Rings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323

B.1 Matrices and Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
B.2 Canonical Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
C Topological Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337

C.1 Topological Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
C.2 Topological Rings and Normed Algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
Author’s Biography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
Preface
This is a reprint of my book Control System Synthesis: A Factorization Approach, originally
published by M.I.T. Press in 1985. That book went out of print about ten years after its publication,
but nevertheless continued to be cited in the published literature. I would like to believe that this is
because the contents of the book are still relevant to linear control theory as it is currently practiced.
I am therefore grateful to Morgan & Claypool for their decision to reprint my book, so that the
current generation of graduate students and researchers are able to access its contents without having
to photocopy it surreptitiously from the library.
The original text was created by me using the troff text processing system created at Bell
Laboratories. Indeed, the 1985 book was the first time that I produced a camera-ready copy, instead
of subjecting myself to the mercy and vagaries of typists and/or typesetters. Back then this was a
novel thing to do; however today it is standard practice. Dr. C. L. Tondo of T&T TechWorks, Inc.
and his troops have expertly keyed in the entire text into LATEX, for which I am extremely grateful.
There is no doubt that the physical appearance of the text has been significantly improved as a result
of this switch.
With the entire text at my disposal, I could in principle have made major changes. After
thinking over the matter, I decided to stick with the original text, and restricted myself to correcting
typographical errors. Upon re-reading the original text after a gap of perhaps twenty years, I felt that
the entire material still continues to be relevant, except for Chapter 6 on H∞ -control. Just about the
time that I was finalizing the book, two seminal papers appeared, giving the connection between
interpolation theory and H∞ -control; these are cited in the book as references [40] and [115]. A
few years after my book was published, the influential book [37] appeared. This was followed in
short order by the paper [28], which gives a complete state-space computational procedure for H∞ -
control; this paper is perhaps the most cited paper in the history of control theory. Subsequently, the
book [29] gave an elementary introduction to the theory, while [45, 117] are advanced treatments
of the subject. It would therefore have required a massive effort on my part to rewrite Chapter 6 of
the book to bring it up to date, and I felt that I could contribute nothing beyond the excellent texts
already in print. So I decided to leave the book as it is, on the basis that the conceptual framework
for H∞ -control presented here still remains relevant.
I had dedicated the original book to my paternal grandfather. In the interim, my father too
has passed away, and I have therefore added his name to the dedication of the Morgan & Claypool
edition. About twenty years ago, while perusing a book on Telugu writers, I discovered that one of
my ancestors, bearing the same name as my paternal grandfather, was a well-known Telugu poet
during the 19th century; he lived between 1816 and 1873. The article about him states that the clan
xiv PREFACE
was famous for scholarship for three centuries (emphasis added). I am truly blessed to have such a
distinguished lineage.
Dallas and Hyderabad, June 2011

Preface for the Original Edition
The objective of this book is to present a comprehensive treatment of some recent results in the
area of linear multivariable control that can be obtained using the so-called "factorization" approach.
It is intended as a second level graduate text in linear systems and as a research monograph. The
prerequisites for reading this book are covered in three appendices, but a reader encountering these
topics for the first time would undoubtedly have difficulty in mastering this background on the basis
of these appendices alone. Moreover, the appendices concentrate on the mathematical background
needed to understand the material covered here, but for motivational background a standard first
course in graduate level linear system theory would be desirable.
The central idea that is used repeatedly in the book is that of "factoring" the transfer matrix of
a (not necessarily stable) system as the "ratio" of two stable rational matrices. This idea was first used
in a paper published in 1972 (see [92]), but the emphasis there was on analyzing the stability of a
given system rather than on the synthesis of control systems as is the case here. It turns out that this
seemingly simple stratagem leads to conceptually simple and computationally tractable solutions to
many important and interesting problems; a detailed description can be found in Chapter l.
The starting point of the factorization approach is to obtain a simple parametrization of all
compensators that stabilize a given plant. One could then, in principle, choose the best compensator
for various applications. This idea was presented in the 1976 paper by Youla, Jabr, and Bongiorno
entitled "Modern Wiener-Hopf Design of Optimal Controllers, Part II: The Multivariable Case,"
which can be considered to have launched this entire area of research. The viewpoint adopted in
this book, namely that of setting up all problems in a ring, was initially proposed in a 1980 paper by
Desoer, Liu, Murray, and Saeks. This paper greatly streamlined the Youla et al. paper and reduced
the problem to its essentials. Thus virtually all of the research reported here is less than five years
old, which bears out the power of this approach to formulate and solve important problems.
In writing the book, some assumptions have been made about the potential readership. First,
it is assumed that the reader is already well versed in the aims and problems of control system analysis
and design. Thus, for example, the book starts off discussing the problem of stabilization without any
attempt to justify the importance of this problem; it is assumed that the reader already knows that
stable systems are better than unstable ones. Also, as the book is aimed at professional researchers
as well as practitioners of control system synthesis, a theorem-proof format has been adopted to
bring out clearly the requisite conditions under which a particular statement is true, but at the same
time, the principal results of each section have been stated as close to the start of the section as is
practicable. In this way, a reader who is interested in pursuing the topics presented here in greater
depth is enabled to do so; one who is only interested in using the results can rapidly obtain an idea
of what they are by scanning the beginnings of various sections and by skipping proofs. In this
xvi PREFACE FOR THE ORIGINAL EDITION
connection it is worth noting that Chapter 3 is a painless introduction to the factorization approach
to scalar systems that could, in my opinion, be taught to undergraduates without difficulty.
At various times, I have taught the material covered here at Waterloo, Berkeley, M.I.T., and
the Indian Institute of Science. Based on these experiences, I believe that the appendices plus the
first five chapters can be covered in a thirteen-week period, with three hours of lectures per week. In a
standard American semester consisting of eighteen weeks of lectures with three hours of lectures per
week, it should be possible to cover the entire book, especially if one starts directly from Chapter I
rather than the appendices. Most of the sections in the appendices and the first five chapters contain
several problems, which contain various ancillary results. The reader is encouraged to attempt all of
these problems, especially as the results contained in the problems are freely used in the subsequent
sections.
It is now my pleasure to thank several persons who aided me in this project. My wife Shakun-
thala was always a great source of support and encouragement during the writing of this book, which
took considerably more time than either of us thought it would. Little Aparna came into being at
about the same time as the third draft, which gave me a major incentive to finish the book as quickly
as possible, so that I might then be able to spend more time with her. Several colleagues gave me
the benefit of their helpful comments on various parts of the manuscript. Of these, I would like to
mention Ken Davidson, Charlie Desoer, John Doyle, Bruce Francis, Allen Tannenbaum, George
Verghese, N. Viswanadham, and Alan Willsky. My students Chris Ma and Dean Minto went over
large parts of the material with a great deal of care and exposed more than one serious mistake. I
would like to acknowledge my indebtedness to all of these individuals. The final camera ready copy
was produced by me using the troff facility and the pic preprocessor to generate the diagrams. In this
connection, I would like to thank. Brian Haggman, a differential topologist turned computer hack,
for much valuable advice. Finally, at the outset of this project, I was fortunate enough to receive an
E. W. R. Steacie Memorial Fellowship awarded by the Natural Sciences and Engineering Research
Council of Canada, which freed me from all teaching duties for a period of two years and enabled
me to concentrate on the research and writing that is reflected in these pages.
I would like to conclude this preface with a historical aside, which I hope the reader will find
diverting. It is easy enough to discover, even by a cursory glance at the contents of this book, that one
of the main tools used repeatedly is the formulation of the general solution of the matrix equation
XN + Y D = I , where all entities are matrices with elements in a principal ideal domain. Among
recent writers, Tom Kailath [49] refers to this equation as the Bezout identity, while V. Kučera [60]
refers to it as the Diophantine equation. In an effort to pin down just exactly what it should be
called, I started searching the literature in the area of the history of mathematics for the details of
the person(s) who first obtained the general solution to the equation ax + by = 1, where a and b
are given integers and one seeks integer-valued solutions for x and y.
It appears that the equation "Diophantine equation" was commonly applied by European
mathematicians of the seventeenth century and later to any equation where the unknowns were
required to assume only integer values. The phrase honors the Greek mathematician Diophantus,
PREFACE FOR THE ORIGINAL EDITION xvii
who lived (in Alexandria) during the latter part of the third century A. D. However, the general
solution of the linear equation in integer variables mentioned above was never studied by him.
In fact, Smith [90, p. 134] states that Diophantus never studied indeterminate equations, that is
equations that have more than one solution. According to Colebrooke [17, p. xiii], the first occidental
mathematician to study this equation and to derive its general solution was one Bachet de Meziriac
in the year 1624. The first mathematician of antiquity to formulate and find all solutions to this
equation was an ancient Hindu by the name of Aryabhatta, born in the year 476. A detailed and
enjoyable exposition of this subject can be found in the recent comprehensive book by van der
Waerden [104]. Thus, in order to respect priority, I submit that the equation in question should
henceforth be referred to as Aryabhatta’s identity.
153
CHAPTER 6
Filtering and Sensitivity

Minimization
6.1 PROBLEM STATEMENT
The problem studied in this chapter is the following: Suppose P is a given plant, not necessarily
stable, and W1 , W2 are given proper stable rational matrices. The objective is to find, among all
stabilizing compensators of P , a C that minimizes the weighted performance measure
J = W1 (I + P C)−1 W2 . (6.1.1)
The significance of the problem can be explained in terms of Figure 6.1.
v
W1 W2
+
t r e + d
W2 C P
+–
y
z
W1
Figure 6.1: Pictorial Representation of the Weighted Performance Measure.
Here W2 can be interpreted as the generator of the disturbance input d, and (I + P C)−1 W2
it the transfer matrix from j to the measured output y. In this case W1 y is the weighted measured
output, and W1 (I + P C)−1 W2 is the transfer matrix from v to the weighted measured output.
Alternately, one can think of W2 as generating a (bounded) reference input r, and (I + P C)−1 W2 is
the transfer matrix from t to the tracking error e. In this case W1 e is the weighted error. By suitable
definition of the norm · , the weighted performance measure J can be made to represent several
quantities of practical significance.
Case I. Filtering Suppose r is a random process with power spectral density W2 (j ω)2 , and that it
is desired to minimize the weighted energy of the resulting process e by selecting C appropriately.
154 6. FILTERING AND SENSITIVITY MINIMIZATION
In this case, t represents white noise and W2 is a whitening filter; e(s) = [(I + P C)−1 W2 )](s), and
the power spectral density of e is given by
e(j ω)2 = [(I + P C)−1 W2 ](j ω)2 . (6.1.2)
In assessing the situation, one may be more concerned about certain frequencies more than others.
The role of the weighting matrix W1 is to reflect the relative importance attached to energy at
different frequencies. Thus, the weighted error z equals W1 e and its power spectral density is
z(j ω)2 = [W1 (I + P C)−1 W2 ](j ω)2 . (6.1.3)
The objective is to minimize the energy of the process z, i.e.,

∞
1
J = z(j ω)2 dω
2π
−∞
∞
1
= [W1 (I + P C)−1 W2 ](j ω)2 dω . (6.1.4)
2π
−∞
Case II. Sensitivity Minimization Suppose u is a square-integrable input with an L2 -norm no larger
than one, and suppose it is desired to minimize the largest L2 -norm that z may have in response to
such an input. Here again the objective function is to minimize the weighted energy of the error e,
but in this case the input u is assumed to be any L2 -input of unit norm or less, rather than white
noise as in Case I. In this case the objective function to be minimized is
J = sup z2
u2 ≤1
= sup W1 (I + P C)−1 W2 u2

u2 ≤1
= sup [W1 (I + P C)−1 W2 ](j ω) , (6.1.5)

ω∈R
where || · ||2 denotes the L2 -norm, and the last equality is a standard result from [26, 94] as pointed
out in the next section. Since in this case the quantity measures the maximum error in response to
a unit-norm input u, this problem is sometimes referred to as the minimax sensitivity problem.
In both the above cases, as well as in others that the reader may construct, the minimization
of J in (6.1.1) is required to be carried out subject to the constraint that C stabilizes P . As things
stand, the compensator C is thus constrained, and also enters the objective function J in a highly
nonlinear fashion. However, using the results of Chapter 5, it is possible to transform this into an
unconstrained optimization problem involving a much simpler objective function.
The set of all compensators C that stabilize a given plant P is neatly parametrized in The-
orem 5.2.1. Let (N, D), (D̃, Ñ) be any r.c.f. and any l.c.f. of P , and select X̃, Ỹ ∈ M(S) such that
6.1. PROBLEM STATEMENT 155
Ñ X̃ + D̃ Ỹ = I . Then, from this theorem,
S(P ) = {(X̃ + DR)(Ỹ − NR)−1 : R ∈ M(S) and |Ỹ − NR| = 0} . (6.1.6)
Moreover, if C ∈ S(P) equals (X̃ + DR)(Ỹ − NR)−1 for a certain R ∈ M(S), then (cf. (5.2.7))
(I + P C)−1 = (Ỹ − NR)D̃ . (6.1.7)
Thus, the problem of minimizing J (R) = W1 (I + P C)−1 W2 subject to the constraint that C ∈
S(P) is equivalent to minimizing
J (R) = W1 (Ỹ − NR)D̃W2 , (6.1.8)
subject to the constraint that |Ỹ − NR| = 0. Now, from Lemma 5.2.4, |Ỹ − NR| = 0 for all R
belonging to an open dense subset of M(S), and if P is strictly proper, then from Lemma 5.2.6,
|Ỹ − N R| = 0 for all R ∈ M(S). Hence, the original problem of minimizing the weighted perfor-
mance measure J of (6.1.1) is equivalent, in the case where P is strictly proper, to an unconstrained
minimization of the objective function J (·) of (6.1.8) with respect to R. Moreover, the original
expression W1 (I + P C)−1 W2 has been replaced by W1 (Ỹ − NR)D̃W2 , where the noteworthy
feature is that the quantity inside the norm is affine in R. If the plant is not strictly proper, the quantity
R is constrained to the extent that it has to satisfy the “nonsingularity” constraint. However, since
this constraint is only operational on a nowhere dense set, it can be ignored during the minimization
and can be tested once an unconstrained minimum has been found. If by chance the optimal choice
of R does not satisfy the nonsingularity constraint, then arbitrarily small perturbations in R will
satisfy the constraint and will result in only an arbitrarily small increase in the performance measure
(since the latter is a continuous function).
Up to now we have only talked about minimizing the weighted energy of the error signal
in response to various types of inputs. Depending on the application, one might prefer instead
to minimize the weighted energy of some other signal within the loop. Since the transfer matrix
between any two points can be expressed as an affine function of R using the results of Chapter 5,
such a minimization is qualitatively no different from the problems discussed earlier. For instance,
if it is desired to minimize the weighted energy of the plant output in response to an input u, then
J = W1 P C(I + P C)−1 W2 . (6.1.9)
However, if C ∈ S(P ), then by (5.2.7),
P C(I + P C)−1 = (X̃ + DR)Ñ , (6.1.10)
for some R ∈ M(S), so that

J = W1 (X̃ + DR)ÑW2 . (6.1.11)
In fact, it is easy to see that the general problem becomes one of minimizing
J = F − GRH , (6.1.12)
where F, G, H ∈ M(S) are given matrices.
Finally, the above discussion applies even when a two-parameter control scheme is used. In
this case, by (5.6.15), all transfer matrices are affine functions of the matrix S = [Q R], and the
general cost function assumes the form
J = F − GSH , (6.1.13)
where F, G, H ∈ M(S) are given matrices. The discrete-time analogs of the above problems are
simply obtained by replacing S by R∞ .
6.2 SOME FACTS ABOUT HARDY SPACES

In the previous section, it was shown that several problems of filtering and sensitivity minimization
can be formulated as follows: Minimize
J (R) = F − GRH , (6.2.1)
where F, G, H ∈ M(S) are given and R ∈ M(S) is to be chosen. The solution of this class of
problems is facilitated by the use of some concepts and results from the theory of Hp - (or Hardy)
spaces. The purpose of this section is to present a listing of the basic concepts and results from this
theory, mostly without proof. The reader is referred to [34] for an excellent treatment of this theory,
and to [59] for some specialized results.
Throughout this section, D denotes the open unit disc in the complex plane, and D̄ the closed
unit disc. For each p ∈ [1, ∞), the set Hp consists of all analytic functions f on the open unit disc
D with the property that
⎡ ⎤1/p
2π
1
sup ⎣ |f (rej θ )|p dθ ⎦ =: f p < ∞ . (6.2.2)
r∈(0,1) 2π
0
The set H∞ consists of all analytic functions f on D such that

sup max |f (rej θ )| =: f ∞ < ∞ . (6.2.3)
r∈(0,1) θ∈[0,2π]
If f ∈ Hp for some p ∈ [1, ∞], then strictly speaking f (z) is only defined when |z| < 1.
However, for almost all θ ∈ [0, 2π ], it is possible to define f (ej θ ) as a nontangential limit of f (z)
as z → ej θ . If f ∈ Hp then its boundary value function f (ej θ ) belongs to the Lebesgue space
Lp [0, 2π]. Moreover,
⎡ ⎤1/p
2π
1
f p = ⎣ |f (ej θ )|p dθ ⎦ if 1 ≤ p < ∞ ,
2π
0
= ess. sup |f (ej θ )| if p = ∞ . (6.2.4)
θ∈[0,2π]
6.2. SOME FACTS ABOUT HARDY SPACES 157
In other words, the norm f p of a function f ∈ Hp can be computed from its boundary value
function.
For each p ∈ [1, ∞], the space Hp is a Banach space. In addition, H2 is a (complex) Hilbert
space with inner product defined by
2π
1
f, g = f¯(ej θ )g(ej θ ) dθ , (6.2.5)
2π
0
where the bar denotes complex conjugation. It turns out that the set of functions {zi }∞
i=0 forms an
orthonormal basis for H2 . Hence, a function f (·) analytic in D belongs to H2 if and only if it can
be expanded in the form
∞
f (z) = fi z i , (6.2.6)
i=0
where the sequence {fi } is square-summable and the infinite sum converges in the sense of the
H2 -norm. Thus, one can think of H2 as the set of z-transforms of square-summable sequences.
Similarly, if {fi } is an absolutely summable sequence, then
∞

f (z) = fi z i , (6.2.7)
i=0
belongs to H∞ , and in fact

∞

f ∞ ≤ |fi | . (6.2.8)
i=0
In general, the converse of the above statement is false: Not every function in H∞ is of the form (6.2.7)
where the coefficient sequence is absolutely summable. However, it is easy to verify that a rational
function belongs to H∞ if and only if it is the z-transform of an absolutely summable sequence.
If the product of two H∞ -functions is defined to be their pointwise product, then H∞ becomes a
Banach algebra.
Suppose f ∈ H∞ . Then one can associate with f a continuous linear operator Tf mapping
H2 into itself by
(Tf g)(z) = f (z)g(z) . (6.2.9)
In other words, an application of Tf consists simply of pointwise multiplication by f (·). It can be

shown that this association is norm-preserving. That is, the operator norm of Tf equals f ∞ ; or
in other words,
Tf := sup f g2 = f ∞ . (6.2.10)
g2 ≤1
Suppose m is a nonnegative integer, and that {ai } is a sequence of nonzero complex numbers such

∞
that |ai | < 1 for all i, and the sum (1 − |ai |) converges. Then the function
i=1
∞
|ai | z − ai
b(z) = z m
, (6.2.11)
ai 1 − āi z
i=1
is called the Blaschke product corresponding to m and the sequence {ai }. The function b(·) is an
element of H∞ and satisfies |b(z)| < 1 for all z in D, |b(ej θ )| = 1 for almost all θ ∈ [0, 2π ]. More-
over, the set of zeros of b(·) in D consists precisely of the set {ai }, where repeated occurrence of a
particular complex number corresponds to multiple zeros of b. If m is a nonnegative integer and
{a1 , · · · , an } is a finite set of points (not necessarily distinct) such that 0 < |ai | < 1 for all i, then

n
|ai | z − ai
b(z) = zm , (6.2.12)
ai 1 − āi z
i=1
is called the associated (finite) Blaschke product. It too belongs to H∞ and satisfies |b(z)| < 1 for
all z ∈ D and |b(ej θ )| = 1 ∀θ ∈ [0, 2π ].
Suppose f is any rational function in H∞ . Let m denote the multiplicity of z = 0 as a zero
of f (if f (0) = 0 set m = 0), and let a1 , · · · , an denote the other zeros of f in D. If f has a
multiple zero at some point in D, repeat this point in the sequence {a1 , · · · , an } as many times as
the multiplicity. If b(·) is now defined as in (6.2.12), then it is clear that f can be factored in the
form f (z) = a(z)b(z), where both a, b ∈ H∞ , b is a Blaschke product, and a has no zeros in the
open unit disc. Further, if f has no zeros on the unit circle, then a is actually a unit of H∞ , i.e., it
has an inverse in H∞ .
The factorization described above is a special case of what is known as an inner-outer fac-
torization. A function g ∈ H∞ is inner if it satisfies |g(ej θ )| = 1 for almost all θ ∈ [0, 2π]. The
only rational inner functions are finite Blaschke products multiplied by a complex number of unit
modulus. In general, an inner function can be expressed as a product of a Blaschke product and a
so-called singular inner function; we will not require the latter concept in this book. The definition
of an outer function is more technical: A function f ∈ H∞ is outer if the range of the map Tf
defined in (6.2.9) is dense in H2 . However, a rational function f ∈ H∞ is outer if and only if it has
no zeros in the open unit disc (but it could have zeros on the unit circle). With these definitions, we
have the following result:
Fact 6.2.1 Every f ∈ H∞ can be factored as ab where a is outer and b is inner. If f is rational
and has no zeros on the unit circle, then f can be factored in the form ab where a is a unit of H∞
and b is a finite Blaschke product.
In the case of matrices over Hp -spaces, the essential results for our purposes are as follows:
Let n > 1 be an integer. Then Hn2 is a Hilbert space with the inner product
2π
1
f, g = f ∗ (ej θ )g(ej θ ) dθ . (6.2.13)
2π
0
where ∗ denotes the conjugate transpose. Corresponding to every F ∈ H∞

n×m
one can define a
continuous linear operator TF : H2 → H2 by
m n
(TF g)(z) = F (z)g(z) ∀z ∈ D . (6.2.14)
Moreover, the norm of the operator TF equals the norm of F ; that is,
TF = sup TF g2 = F ∞ = sup σ̄ (F (ej θ )) , (6.2.15)

g2 ≤1 θ∈[0,2π]
where σ (·) denotes the largest singular value.

The matrix analog of Fact 6.2.1 is stated next. A matrix G ∈ H∞ n×m
is inner if
∗
G (e )G(e ) = I for almost all θ ∈ [0, 2π ] (in this definition it is important to note that n
j θ j θ
need not equal m). A matrix F ∈ H∞ n×m

is outer if the range of the map TF defined in (6.2.14) is
dense in H2 . In particular, a rational matrix F ∈ H∞
n n×m
is outer if and only if rank F (z) = n for all
z ∈ D. Note that if G inner (resp. outer) then n ≥ m (resp. n ≤ m).
To state the next result precisely, we recall the set R∞ introduced in Section 2.1, which consists
of all rational functions that have no poles in the closed unit disc. It is easy to see that R∞ is just
the set of rational functions in H∞ (this explains the symbolism), and that the norm on R∞ defined
in (2.2.21) is consistent with (6.2.4).
Fact 6.2.2 Suppose G ∈ H∞ n×m

and has rank min{n, m}.1 Then G can be factored as Gi Go where
Gi is inner and Go is outer. If n ≥ m, then Go is square, whereas if n ≤ m, then Gi is square.
Fact 6.2.2 has several interesting consequences. First, suppose G ∈ R∞ n×m

has rank n and

that n ≤ m. Then, by applying Fact 6.2.2 to F = G , we see that G can be factored as AB where
A ∈ R∞ n×n
is outer and B ∈ R∞n×m
is the transpose of an inner matrix. This is quite distinct from the
factorization in Fact 6.2.2. Second, noting that the transpose of a square inner matrix is also inner,
we see that a square matrix G ∈ M(R∞ ) of full rank can be factored as Gi Go and also as Fo Fi
where Gi , Fi are inner and Go , Fo are outer.
Finally, observe that if G is a square inner matrix, then |G| is an inner function, and the
adjoint matrix of G is also a square inner matrix. Also, note that left multiplication by an inner
matrix preserves norms in both H2 and H∞ . That is, if F ∈ M(H∞ ) is inner, G ∈ M(H∞ ) and
H ∈ M(H2 ), then F G∞ = G∞ and F H 2 = H 2 .
1This means that at least one full-size minor of G is nonzero.
One can also define the class Hp over the open right half-plane, rather than the open unit
disc as is the case above. Suppose f : C+ → C is analytic; then f belongs to the class Hp (C+ ),
where 1 ≤ p < ∞, if
⎡ ⎤1/p
∞
1
sup ⎣ |f (σ + j ω)|p dω⎦ =: f p+ < ∞ . (6.2.16)
σ >0 2π
−∞
The function f belongs to H∞ (C+ ) if
sup sup |f (σ + j ω)| =: f ∞+ < ∞ . (6.2.17)

σ >0 ω
If f ∈ Hp (C+ ), then its domain definition can be extended to the j ω-axis, and the function ω →
f (j ω) belongs to the Lebesgue space Lp (R). Moreover,
⎡ ⎤1/p
∞
1
f p+ = ⎣ |f (j ω)|p dω⎦ if p ∈ [1, ∞),
2π
−∞
f ∞+ = ess. sup |f (j ω)| if p = ∞ . (6.2.18)
ω
H2 (C+ ) is a Hilbert space, with the inner product of two functions f and g defined by
∞
1
f, g + = f¯(j ω)g(j ω) dω . (6.2.19)
2π
−∞
The notions of Blaschke product, inner and outer functions all have their counterparts in this
setting. In particular, a function f ∈ H∞ (C+ ) is inner if |f (j ω)| = 1 for almost all ω; a rational
f ∈ H∞ (C+ ) with f (j ω) = 0 for all ω is outer if it is a unit. Facts 6.2.1 and 6.2.2 have their obvious
analogs for the space H∞ (C+ ), obtained by replacing “unit circle” by “imaginary axis” and “ej θ ” by
“j ω” throughout.
Suppose f ∈ H∞ (of the unit disc). Then the function g : C+ → C defined by
g(s) = f [(s − 1)/(s + 1)] , (6.2.20)
belongs to H∞ (C+ ), and satisfies, g∞+ = f ∞ . However, not all functions in H∞ (C+ ) can
be generated in this manner. For example, g(s) = e−s belongs to H∞ (C+ ), but is not of the
form (6.2.20) for any f ∈ H∞ .The situation is worse when p < ∞. Even if f ∈ Hp , the correspond-
ing g might not belong to Hp (C+ ). An easy way to see this is to note that every g ∈ Hp (C+ ) must
be strictly proper. Thus, if f ∈ Hp is rational and f (1) = 0, then from (6.2.20) g(∞) = f (1) = 0,
so that g ∈/ Hp (C+ ). Over the unit disc, H∞ is a subset of Hp for all p < ∞, and indeed Hp is a
subset of Hq whenever p > q; this follows from the compactness of the interval [0, 2π ]. However,
Hp (C+ ) and Hq (C+ ) are distinct sets whenever p = q, with neither being a subset of the other.
There is however an interesting relationship, which is given next.
Fact 6.2.3 Suppose f ∈ Hp , for some p ∈ [1, ∞), and define g by (6.2.20). Let
21/p g(s)
h(s) = . (6.2.21)
(1 + s)2/p
Then h ∈ Hp (C+ ) and hp+ = f p .
Proof. By definition
∞
p 1
f p = |f (ej θ )|p dθ, (6.2.22)
2π
0
∞

1
g(j ω)
p
p
hp =

2π
(1 + ω2 )1/p
2 dω
−∞
∞
1 2dω
= |f ([(j ω − 1)/(j ω + 1)])|p . (6.2.23)
2π 1 + ω2
−∞
To relate (6.2.22) and (6.2.23), let ej θ = (j ω − 1)/(j ω + 1). Then
1 + ej θ e−j θ/2 + ej θ/2

jω = =
1 − ej θ e−j θ/2 − ej θ/2
= j cot θ/2 . (6.2.24)
Hence,
2dω
θ = 2 cot−1 ω; dθ = . (6.2.25)
1 + ω2
Combining (6.2.25) and (6.2.23) completes the proof. 2
Finally, if f ∈ H2 , then its boundary function θ → f (ej θ ) belongs to L2 [0, 2π ]. Let
H+ +
2 [0, 2π] denote the set of boundary values of functions in H2 . Then H2 [0, 2π ] is a closed sub-
space of L2 [0, 2π]. Let H2 [0, 2π ] denote the orthogonal complement of H+
−
2 [0, 2π ] in L2 [0, 2π ].
Then every function f ∈ L2 can be expressed uniquely in the form f = f+ + f− where f+ ∈ H+ 2
and f− ∈ H− 2 (we drop the [0, 2π ] for brevity). This splitting up has a very intuitive interpretation.
The collection of functions {ej nθ , n ∈ Z} forms an orthonormal basis for L2 , so that every f ∈ L2
has the Fourier series
∞
f (ej θ ) = fn ej nθ . (6.2.26)
n=−∞
Now H+ 2 is the span of the functions {e
j nθ , n ∈ Z }, while H− is the span of the set {ej nθ , n < 0}.
+ 2
Hence,
∞
−1
f+ (ej θ ) = fn ej nθ , f− (ej θ ) = fn ej nθ . (6.2.27)
n=0 n=−∞
Note that the function z → z−1 f− (z−1 ) belongs to H2 . Such a function is called coanalytic. In case
f (·) is rational and has no poles on the unit circle, the above decomposition is even more familiar:
Expand f into partial fractions; then f+ (resp. f− ) is the sum of the terms corresponding to poles
outside (resp. inside) the unit disc.
The above discussion applies, mutatis mutandis, to the space L2 (R).
6.3 FILTERING
The problem studied in this section is the following: Suppose a plant P ∈ M(R(s)) is specified,
together with weighting matrices W1 , W2 . The objective is to minimize the weighted sensitivity
measure
J = W1 (I + P C)−1 W2 2+ , (6.3.1)
over all C ∈ S(P ). At this stage, several assumptions are made to facilitate a solution. First, note that
J might not even be defined for all C ∈ S(P ), because W1 (I + P C)−1 W2 , might not be strictly
proper. To get around this problem, it is assumed that either W1 or W2 is strictly proper. Next,
it is assumed that W1 , W2 are square, and that W1 (s) = 0, W2 (s) = 0 for all s ∈ C+ . Finally, for
technical reasons, it is assumed that the plant P has at least as many inputs as outputs. This is the
so-called “square” or “fat” plant case; the case of “tall” plants is discussed in Section 6.6.
As pointed out in Section 6.1, this problem can be reformulated in a more manageable
form. Let (N, D), (D̃, Ñ ) be an r.c.f. and an l.c.f. of P , and let X̃, Ỹ ∈ M(S) be any solutions of
Ñ X̃ + D̃ Ỹ = I . Then, by Theorem 5.2.1, every C ∈ S(P ) is of the form (X̃ + DR)(Ỹ − NR)−1
for some R ∈ M(S) such that |Ỹ − NR| = 0. Moreover, the corresponding weighted sensitivity
measure equals
J (R) = W1 (Ỹ − NR)D̃W2 2

= W1 Ỹ D̃W2 − W1 NR D̃W2 2 . (6.3.2)
Hence, the problem is to minimize J (R) by a suitable choice of R ∈ M(S).2 This problem has a
very nice geometrical interpretation. As R varies over M(S), the product W1 NR D̃W2 generates a
subspace of M(H2 (C+ )). To minimize J (·), one has to find the closest point in this subspace to
W1 Ỹ D̃W2 . Now, since at least one of the matrices W1 or W2 is strictly proper, it turns out that the
subspace {W1 N R D̃W2 : R ∈ M(S)} is not closed. As a result, there may not exist a closest point in
this subspace to W1 Ỹ D̃W2 . In other words, the infimum of J (·) is not attained by any R, except in
trivial cases. It is nevertheless possible to compute the infimum of J (·), and to construct a family
2The constraint that |Ỹ − N R| = 0 is ignored since it is satisfied by almost all R (see Lemma 5.2.4).
6.3. FILTERING 163
(Rε ) of functions in M(S) such that J (Rε ) approaches this infimum as ε → 0+ . This procedure is
based on the following lemma.
Lemma 6.3.1 Suppose F, G, H ∈ M(S), and suppose in addition that (i) F is strictly proper, (ii) G
has full row rank at all points on the j ω-axis and at infinity,3 and (iii) H is square and nonsingular at
all points on the j ω-axis and at infinity. Define
Jm (R) = F − GRH 2 , (6.3.3)
and let Jm (R) = ∞ if F − GRH ∈ / M(H2 (C+ )). Factorize G, H as G = Gi Go , H = Ho Hi , where

Gi , Hi are inner and Go , Ho are outer. For any, A ∈ M(S), let A∗ denote the rational matrix, A∗ (s) =
A (−s).4 Then
min Jm (R) = (Gi ∗ F Hi )− 2 , (6.3.4)
R
and this minimum is attained by any R satisfying
Go RHo = (Gi ∗ F Hi ∗ )+ . (6.3.5)
Proof. First of all, observe that the equation Go RHo = A has a solution for R corresponding to
every A ∈ M(S). The reason is that Go has a right inverse and Ho is a unit.
To prove the lemma, note that Gi ∗ (s)Gi (s) = I, Hi ∗ (s)Hi (s) = Hi (s)Hi ∗ (s) = I ∀s, since
both Gi and Hi are inner matrices. In particular, Gi ∗ (j ω) is a unitary matrix for all ω, and as a result
for any A ∈ M(H2 (C+ )) we have
A22 = Gi ∗ A22 , (6.3.6)
where the norm on the left is the H2 -norm or the L2 -norm, while the norm on the right side is
strictly the L2 -norm (since Gi ∗ A need not belong to M(H2 (C+ ))). In the same way, multiplication
by Hi ∗ is also norm-preserving.
Using this fact leads to
(Jm (R))2 = F − GRH 22 = F − Gi Go RHo Hi 22

= Gi ∗ F Hi ∗ − Go RHo 22 . (6.3.7)
If Gi ∗ F Hi is decomposed as (Gi ∗ F Hi ∗ )+ + (Gi ∗ F Hi ∗ )− , then by orthogonality,
(Jm (R))2 = (Gi ∗ F Hi ∗ )− 22 + (Gi ∗ F Hi ∗ )+ − Go RHo 22

≥ (Gi ∗ F Hi ∗ )− 22 . (6.3.8)
On the other hand, equality can be achieved in (6.3.8) by choosing R to satisfy (6.3.5). 2
3This means that G has at least as many columns as rows.
4 In particular, if s = j ω, then, A∗ (j ω) is the conjugate transpose of A(j ω), so that this notation is consistent with earlier usage.
If we attempt to apply Lemma 6.3.1 to minimize the function J (·) of (6.3.2), we run
into the difficulty that the functions G = W1 N and H = D̃W2 may not satisfy the hypotheses
of Lemma 6.3.1. In particular, if P (j ω) has less than full row rank for some ω, then so does N(j ω)
(i.e., G(j ω)). Similarly, any j ω-axis poles of P correspond to zeros of |D̃| and of |H |. Finally, since
W1 or W2 is assumed to be strictly proper, either |G(∞)| = 0 or |H (∞)| = 0. To get around these
difficulties, the following modification of Lemma 6.3.1 is required.
Lemma 6.3.2 Suppose F, G, H ∈ M(S), and suppose in addition that

(i) G has the form G = G1 Dg , where G1 ∈ M(S) has full row rank at all points on the j ω-axis
and at infinity, while Dg ∈ M(S) is a diagonal matrix Dg = Diag {dg1 , · · · , dgn }, where the only
C+e -zeros of dgi (if any) are at infinity or on the j ω-axis.
(ii) H has the form H = Dh H1 where H1 ∈ M(S) is square and nonsingular at all points on the
j ω-axis and at infinity, while Dh ∈ M(S) is a diagonal matrix of the form Dh = Diag {dh1 , · · · , dhm },
where the only C+e -zeros of dhi (if any) are at infinity or on the j ω-axis.
Under these conditions,
inf F − GRH 2 = inf F − G1 RH1 2 . (6.3.9)
R∈M(S) R∈M(S)
Proof. As shown by Lemma 6.3.1, the infimum on the right side of (6.3.9) is actually attained. Let
R0 ∈ M(S) be an element that achieves this minimum. Now every matrix of the form GRH is also
of the form G1 R1 H1 (just let R1 = Dg RDh ). Thus, the infimum on the left side of (6.3.9) can be
no smaller than that on the right side. Hence, in order to prove the lemma, it is enough to construct
a family of functions (Rε )ε>0 in M(S) such that F − GRε H 2 → F − G1 R0 H1 2 as ε → 0+ .
Suppose a ∈ S has the property that a(s) = 0 whenever Re s > 0, so that the only possible
C+e -zeros of a are at infinity or on the j ω-axis. (Note that all the dgi , dhi , are of this type.) To be
specific, let j ω1 , · · · , j ωl (where ωl may be infinite) denote the distinct j ω-axis zeros of a, with
multiplicities m1 , · · · , ml , respectively. It is claimed that there exists a family of functions (bε )ε>0
in S such that
(i) bε /a is a unit of S for each ε > 0.
(ii) bε (j ω) → 1 uniformly on all compact subsets of (−∞, ∞) that do not contain any of the
ωi ’s, as ε → 0+ .
To see this, note that a(s) is of the form
l
s − j ωi mi 1 n
a(s) = u(s) · , (6.3.10)
s+1 s+1
i=1
for some unit u ∈ S. Hence,

l
mi n
s − j ωi 1
bε (s) = · , (6.3.11)
s + ε − j ωi εs + 1
i=1
6.3. FILTERING 165
satisfies the requirements (i) and (ii).
In view of this, it follows that there exists families Bεg , Bεh of diagonal matrices in M(S) such
that
(i) Dg−1 Bεg is unimodular for all ε > 0.
(ii) Bεh Dh−1 is unimodular for all ε > 0.
(iii) Bεg (j ω) → I, Bεh (j ω) → I as ε → 0+ uniformly on all compact subsets of (−∞, ∞)
that do not contain the zeros of |Dg |, |Dh |.
Now define Rε = Dg−1 Bεg R0 Bεh Dh−1 , where R0 ∈ M(S) attains the infimum on the right
side of (6.3.9). Then
F − GRε H 2 = F − G1 Dg · Dg−1 Bεg R0 Bεh Dh−1 · Dh H1 2

= F − G1 Bεg R0 Bεh H1 2 . (6.3.12)
The proof of the lemma is complete if it can be shown that the last quantity in (6.3.12) approaches
F − G1 R0 H1 2 as ε → 0+ . But this follows from the limiting behavior of Bεg , Bεh as described
in (iii) above. 2
We are now ready to tackle the general problem of minimizing J (·) as defined in (6.3.2). Let
Gs denote the Smith form of W1 N and Hs the Smith form of D̃W2 , and suppose
Ug Gs Vg = W1 N, Uh Hs Vh = D̃W2 , (6.3.13)
where Ug , Vg , Uh , Vh ∈ U(S). Then
J (R) = W1 Ỹ D̃W2 − W1 NR D̃W2 2

= W1 Ỹ D̃W2 − Ug Gs Vg RUh Hs Vh 2 . (6.3.14)
Since Vg , Uh are unimodular, one can replace Vg RUh by another “free” parameter, say R1 . Moreover,
Gs (and Hs ) can be expressed as a product of two diagonal matrices Eg and Dg (resp. Eh and Dh )
such that |Eg | is nonzero at all points on the j ω-axis and at infinity, while |Dg (s)| = 0 whenever
s ∈ C+ . Now, defining
G1 = Ug Eg , H1 = Eh Vh , (6.3.15)
shows that
J (R) = W1 Ỹ D̃W2 − G1 Dg R1 Dh H1 2 , (6.3.16)
where R1 = Vg RUh . Hence, by Lemma 6.3.2,
inf J (R) = inf W1 Ỹ D̃W2 − G1 RH1 2 , (6.3.17)

R R
which can be computed using Lemma 6.3.1.

The situation is much cleaner in the case of discrete-time systems. First, since the interval
[0, 2π ] is compact, every H∞ -function is also in H2 . Further, a strictly causal plant, which is the
discrete analog of a strictly proper plant, is one whose transfer matrix is zero at z = 0. As a result,
in the formulation of the discrete-time filtering problem, the weighting matrices W1 and W2 can be
any rational H∞ -matrices and need not be strictly causal. Moreover, an optimal controller always
exists and the approximation procedure of Lemma 6.3.2 is not needed.
6.4 SENSITIVITY MINIMIZATION: SCALAR CASE

Because the sensitivity minimization problem is much more involved technically than the filtering
problem, the discussion of the former is divided into three parts, namely the scalar case, the fat plant
case, and the general case. Accordingly, the problem studied in this section is the following: Suppose
p is a given scalar plant, and w is a given element of S. The objective is to find a c ∈ S(p) that
minimizes the weighted sensitivity measure5

w(j ω)
J = sup

. (6.4.1)

ω∈R 1 + p(j ω)c(j ω)
As in the previous section, one can transform this problem into an affine minimization problem.
Let (n, d) be a coprime factorization of p, and let x, y ∈ S be such that xn + yd = 1. Then the
problem becomes: Minimize
J (r) = sup |[(y − rn)dw](j ω)|
ω∈R
= (y − rn)dw∞ , (6.4.2)
by suitable choice of r ∈ S. Moreover, it can be assumed without loss of generality that the weighting
function w is outer. To see this, factor w in the form w = wi wo where wi and wo are inner and
outer, respectively. Then, since multiplication by the inner function wi is norm-preserving, we have
J (r) = (y − rn)dwo ∞ .
In order to take full advantage of the available results on Hp -spaces, a bilinear transformation
is now used to map the set S into H∞ . Suppose f ∈ S, and define the function f˜ by
f˜(z) = f [(1 + z)/(1 − z)] . (6.4.3)
Since the bilinear transformation s = (1 + z)(1 − z) maps the open unit disc into the open right
half-plane, it follows that f˜ is a rational element of H∞ , i.e., f ∈ R∞ . Moreover,
f˜∞ = sup |f˜(ej θ )|
θ∈[0,2π]
= sup |f (j ω)| = f S , (6.4.4)
ω∈R
since the unit circle maps onto the imaginary axis plus the point at infinity. Conversely, suppose
g̃ ∈ H∞ is rational, and define g by
g(s) = g̃[(s − 1)/(s + 1)] . (6.4.5)
5 Note that the two weights w and w are combined into one, since we are dealing with the scalar case.
1 2
6.4. SENSITIVITY MINIMIZATION: SCALAR CASE 167
Then g ∈ S and g̃∞ = gS . In this way one can set up a one-to-one norm-preserving corre-
spondence between functions in S and functions in R∞ . If this procedure is adopted, the problem
under study becomes the following: Find a function r̃ ∈ R∞ that minimizes
J = (ỹ − r̃ ñ)d̃ w̃∞ . (6.4.6)
At this stage we drop the tilde and state the problem in H∞ rather than in R∞ ; the reason for the
latter is that H∞ is a Banach space while R∞ is merely a normed space. The problem in essence is as
follows: Given two rational functions f, g ∈ H∞ , find a rational function r ∈ H∞ that minimizes
J (r) = f − gr∞ . (6.4.7)
The definitions of f and g when J is as in (6.4.6) are obvious. The problem of minimizing J has
a very simple geometric interpretation: Let gH∞ denote the subspace {gr : r ∈ H∞ }. Then the
problem is to find the closest point to f in the subspace gH∞ , where distances are measured using
the norm · ∞ .
It turns out that the subspace gH∞ is closed if and only if g has no zeros on the unit circle. If
gH∞ is a closed subspace, then it can be shown that, corresponding to each rational f ∈ H∞ there
exists a unique closest point in gH∞ (more on this shortly). Unfortunately, if the plant p has zeros
on the j ω-axis or at infinity, then so does n. As a result, g = ñ = ñd̃ w̃ has one or more zeros on the
unit disc, and the subspace gH∞ is not closed. As a consequence, if one defines
J0 = inf f − gr∞ , (6.4.8)

r∈H∞
then the infimum in (6.4.8) may or may not be attained, depending on f . In the original context,
this means that in general there does not exist a single compensator that minimizes the weighted
sensitivity measure (6.4.1), in the practically important case where the plant is strictly proper (or has
j ω-axis zeros).
In view of the above discussion, one would nevertheless like to know the value of J0 , as well as a
sequence {ri } of functions in H∞ such that J (ri ) → J0 . This information can then be translated back
into a greatest lower bound on the weighted sensitivity measure of (6.4.1), as well as an optimizing
sequence {ci } of compensators.
In order to do this, first carry out an inner-outer factorization of g. Thus, g = bg0 where b is
a finite Blaschke product and g0 is outer. Now factor g0 as vu where v ∈ H∞ has all of its zeros on
the unit circle and u is a unit of H∞ . Then
J0 = inf f − bdur∞ = inf f − bvr∞ , (6.4.9)

r∈H∞ r∈H∞
since the unit factor u can be absorbed into the “free” parameter r. Now consider the modified
problem: Minimize f − rb∞ by a suitable choice of r ∈ H∞ . Define
Jm = inf f − rb∞ , (6.4.10)

r∈H∞
as the value of the modified problem. In view of earlier discussion, the infimum in (6.4.10) is actually
attained, since b has no zeros on the unit circle. Now, since every function of the form bvr is also a
multiple of b, it is clear that Jm ≤ J0 . The next lemma gives conditions under which equality holds.
Lemma 6.4.1 Let ej θ1 , · · · , ej θl denote the distinct zeros of v on the unit circle. Then J0 = Jm if and
only if |f (ej θi )| ≤ Jm ∀i.
Proof. “only if ” Since v(ej θi ) = 0 ∀i, we have (bvr)(ej θi ) = 0 ∀i, ∀r ∈ H∞ .6 Hence, (f −

bvr)(ej θi ) = f (ej θi ) ∀i, and as a result
f − bvr∞ = sup |[f − bvr](ej θ )|

θ∈[0,2π]
≥ max |[f − bvr](ej θi )|
i
= max |f (ej θi )|, ∀r ∈ H∞ . (6.4.11)
i
Combining (6.4.11) with the hypothesis that J0 = Jm leads to
Jm = J0 = inf f − bvr∞ ≥ max |f (ej θi )| . (6.4.12)

r∈H∞ i
“if ” Select an r0 ∈ H∞ such that Jm = f − r0 b∞ ; such an r0 exists, by earlier remarks.

Let f0 = f − r0 b. Next, introduce the function
hθ,ε (z) = (z − ej θ )/[z − (1 + ε)ej θ ] , (6.4.13)
and define

l
hε (z) = [hθi ,ε (z)]mi , (6.4.14)
i=1
where mi is the multiplicity of ej θi as a zero of v(·). Now observe that (i) hε /v is a unit of H∞ for
each ε > 0, and (ii) hε (ej θ ) → 1 uniformly on all compact subsets of [0, 2π ] that do not contain any
of the θi ’s, as ε → 0+ . Consider the family (rε )ε>0 in H∞ defined by rε = r0 hε /v. Then by (6.4.9),
f − bvrε ∞ ≥ J0 , for each ε > 0 . (6.4.15)
On the other hand, f − bvrε = f − br0 hε so that |(f − bvrε )(ej θ )| → |f0 (ej θ )| ≤ Jm uniformly
on all compact subsets of [0, 2π ] that do not contain any of the θi ’s, and |(f − bvrε )(ej θi )| =
|f (ej θi )| ≤ Jm by hypothesis. This means that
lim sup f − bvrε ∞ ≤ Jm . (6.4.16)

ε→0+
6 Strictly speaking, r(ej θi ) is undefined for a general H
∞ -function, and this statement is only valid under more restrictive conditions,
e.g., r is rational. These technicalities can be addressed, but spelling out the details would obscure the main ideas of the proof.
Combining (6.4.15), (6.4.16) and the fact that J0 ≥ Jm leads to the conclusion that J0 = Jm and
that
f − bvrε ∞ → Jm as ε → 0+ . (6.4.17)
This completes the proof. 2
In the above proof, it is easy to see that there is nothing special about the function hε of (6.4.14),
and that any function hε can be used so long as (i) hε ∈ H∞ ∀ε > 0, (ii) hε (ej θi ) = 0 ∀i, and
(iii) hε (ej θ ) → 1 uniformly on all compact subsets of [0, 2π ] that do not contain any of the 0i ’s.
Note that it f (ej θi ) = 0 ∀i, then the condition of Lemma 6.4.1 is automatically satisfied and
J0 = Jm . Going back to (6.4.6), we have f = ỹ d̃ w̃ and g = ñd̃ w̃. Hence, if the weighting function
w is chosen such that every unit circle zero of ñ is also a zero of w̃, then J0 = Jm . (Note that
unit circle zeros of d̃ are already common to both f and g.) Translating this condition back to the
s-domain yields the result that J0 = Jm provided w is strictly proper whenever p is, and vanishes at
the j ω-axis zeros of the plant. Once such a weighting function is chosen, then any unit circle zeros
of ñd̃ w̃ can be “deleted” without affecting the infimum of J .
Example 6.4.2 Let p be a strictly proper, nonminimum phase plant with the transfer function
s−1
p(s) = .
(s − 3)(s 2 + 1)
Let the weighting function w equal

1
w(s) = .
s+1
Then w(∞) = p(∞) = 0, so that J0 = Jm in this case. Now, substituting s = (1 + z)/(1 − z) gives
z(z − 1)2 ñ(z)

p̃(z) = = ,
2(2z − 1)(z + 1)
2
d̃(z)
where ñ, d̃ are the numerator and denominator polynomials of p̃, respectively. A particular solution
of x̃ ñ + ỹ d̃ = 1 can be obtained by Euclidean division and back substitution, namely
13 1
x̃(z) = 6z2 + , ỹ(z) = − (6z2 − 9z + 2) .
2 4
Finally,
1−z
w̃(z) = .
2
Hence,
J (r) = f − rg∞ ,
where
1
f (z) = (w̃ ỹ d̃)(z) = (z − 1)(2z − 1)(z2 + 1)(6z2 − 9z + 2),
4
g(z) = (w̃ ñd̃)(z) = −z(z − 1)(2z − 1)(z2 + 1) .
If we factor g in the form g = bvu where b is a Blaschke product, v has all of its zeros on the unit
circle, and u is a unit, then
z(z − 0.5)
b(z) = ,
1 − 0.5z
v(z) = (z − 1)3 (z2 + 1),
u(z) = −2(1 − 0.5z) .
Now Lemma 6.4.1 allows us to ignore v and u, in that
inf J (r) = Jm = inf f − rb∞ .

r r
Lemma 6.4.1 provides conditions under which, by solving the minimization problem
in (6.4.10) (which does have a solution), it is possible to determine the value of the infimum in (6.4.9).
Actually, it turns out that it is possible to compute Jm (the value of the problem in (6.4.10)) without
actually computing the r0 that attains the minimum. If the conditions of Lemma 6.4.1 hold, this
Jm is also the infimum in (6.4.9). In this way, it is possible to obtain rapidly a figure of merit against
which suboptimal choices of r in (6.4.9) can be compared. If the optimization problem in (6.4.10)
is solved completely, i.e., if r0 is also determined, then a family rε such that f − grε ∞ → Jm can
be computed as in the proof of Lemma 6.4.1.
The next step then is to give a procedure for computing Jm . We begin with a result that is
now obvious, but has some practical significance.
Lemma 6.4.3 Suppose the plant p is strictly proper and stable, and has no zeros in the open right
half-plane; suppose also that the weighting function w is stable, strictly proper and vanishes at all j ω-axis
zeros (if any) of p. Then

inf sup
(j ω)
= 0 . (6.4.18)
c∈S(p) ω 1 + pc
Remarks 6.4.4 In other words, Lemma 6.4.3 states that if the plant is strictly proper, stable, and
“minimum phase,”7 then the weighted sensitivity measure can be made arbitrarily small provided
the weighting function is also stable and vanishes at all RHP zeros of the plant.
7 Normally, the term “minimum phase” implies that the plant has no zeros in the closed right half-plane; here the usage is slightly
different.
Proof. In this case d̃ is a unit, and all zeros of ñ in the closed unit disc are in fact on the unit circle.
As a result, the finite Blaschke product b equals 1, and the condition of Lemma 6.4.1 is satisfied.
Finally, it is clear that
inf f − rb∞ = inf f − r∞ = 0 , (6.4.19)
r r
corresponding to the choice r = f . 2
In the general case the infimum Jm in (6.4.10) can be computed by associating, with each
H∞ -function f − rb, the corresponding continuous linear operator Tf −rb mapping H2 into itself,
as defined in (6.2.9). Since Th = h∞ ∀h ∈ H∞ , it follows that
Jm = inf Tf −rb . (6.4.20)

r∈H∞
Let bH2 denote the range of the operator Tb ; i.e.,
bH2 = {bg : g ∈ H2 } , (6.4.21)
and let N denote the orthogonal complement of bH2 . Thus,
N = {h ∈ H2 : h, bg = 0 ∀g ∈ H2 } . (6.4.22)
Finally, let denote the orthogonal projection mapping H2 onto N. Now, as r varies over H∞ , the
operator Tf −rb ranges over various operators. However, the operator Tf −rb is the same for all r,
and in fact equals Tf . To see this, let g ∈ H2 , be arbitrary; then
Tf −rb g = [(f − rb)g] = (f g) − (rbg)

= (f g) − (brg) = (f g) = Tf g , (6.4.23)
since brg ∈ bH2 and hence its projection onto N is zero. Using this idea, one can prove the following
result (see Sarason [86] for the proof ):
Fact 6.4.5 Define a linear map A : N → N to be the restriction of Tf to N. Then
inf f − rb∞ = A . (6.4.24)

r∈H∞
Moreover, if f (ej θ ) is a continuous function of θ, then there is a unique function r0 ∈ H∞ that attains
the infimum in (6.4.26). Let f0 = f − r0 b, and let g ∈ N be a function such that Ag2 = g2 .
Then f0 /A is inner, and
(Ag)(z)
f0 (z) = . (6.4.25)
g(z)
Fact 6.4.5 is very useful because it is possible to compute all the quantities involved using only
linear algebra. First of all, the subspace N is finite-dimensional, and in fact its dimension equals the
number of zeros of the Blaschke product b. Next, one can readily find a basis for N, which can be
converted to an orthonormal basis using the Gram-Schmidt procedure. Let {v1 , · · · , vn } denote
this orthonormal basis. Then one can compute a matrix representation of the linear operator A with
respect to this basis; call it A0 . Now (6.4.24) states that the infimum of f − rb∞ is precisely the
largest singular value of A0 . Further, if p = [p1 · · · pn ] denotes the (column) eigenvector of A0 ∗ A0
corresponding to the eigenvalue (σ̄ (A0 ))2 , then

n
(A0 p)i vi (z)
i=1
f0 (z) = . (6.4.26)

n
pi vi (z)
i=1
Let us elaborate each of these points in turn.
Lemma 6.4.6 Let b be a finite Blaschke product, and define N as in (6.4.22). Let {λ1 , · · · , λn } denote
the distinct zeros of b, with multiplicities {m1 , · · · , mn }, respectively. Then a (nonorthonormal) basis for
N is given by the functions

zk−1
, k = 1, · · · , mi ; i = 1, · · · , n . (6.4.27)
(1 − λ̄i z)k
The proof follows readily from the following lemma.
Lemma 6.4.7 Suppose λ is a complex number with |λ| < 1, k is an integer greater than or equal to 1,
and define
zk−1
φ(z) = . (6.4.28)
(1 − λ̄z)k
Then φ ∈ H2 ; moreover, for any g ∈ H2 ,
g (k−1) (λ)
φ, g = . (6.4.29)
(k − 1)!
Proof. Clearly φ ∈ H2 , since its only singularity is at 1/λ̄. By definition (6.2.5), we have
2π
1
φ, g = φ̄(ej θ )g(ej θ ) dθ
2π
0
2π
1 e−j (k−1)θ
= g(ej θ ) dθ
2π (1 − λ̄e−j θ )k
0
2π
1 ej θ
= g(ej θ ) dθ
2π (ej θ − λ)k
0

1 g(z)
= dz , (6.4.30)
2πj (z − λ)k
C
where C denotes the unit circle oriented counterclockwise (note that z = ej θ on C, so that dz =
j ej θ dθ). Finally, by the Cauchy integral formula, the last term in (6.4.30) equals g (k−1) (λ)/(k − 1)!.
2
Proof of Lemma 6.4.6. The subspace N is defined by (6.4.22). Let g ∈ H2 be arbitrary. Then bg ∈
H2 has a zero of multiplicity mi at λi , for i = 1, · · · , n. Hence, if φik denotes the function
zk−1
φik (z) = , (6.4.31)
(1 − λ̄i z)k
then, by Lemma 6.4.7,

(bg)(k−1) (λi )
φik , bg = =0, (6.4.32)
(k − 1)!
whenever 1 ≤ k ≤ mi . Thus, each of the functions φik belongs to N. It is routine to verify that they
are linearly independent. Finally, in order to show that they span N, it is enough to show that the
orthogonal complement of the span of the φik ’s is contained in bH2 . Accordingly, suppose h ∈ H2
is orthogonal to each φik . Then, by Lemma 6.4.7, h has a zero of multiplicity mi at λi , for all i, and
is hence of the form bg for some g ∈ H2 . 2
Once a basis has been found for N, it is easy to construct an orthonormal basis using the
Gram-Schmidt procedure. Let us drop the double subscript notation, let n denote the dimension of
the null space N, and let {φ1 , · · · , φn } denote the basis for N given by 6.4.6. Let
∈ C n×n denote
the associated Grammian matrix; i.e.,
γij = φi , φj . (6.4.33)

Note that the constants γij can be readily computed using Lemma 6.4.7. For instance, if b has only
simple poles at λ1 , · · · , λn , then Lemma 6.4.6 shows that the set
1
φi (z) = , (6.4.34)
1 − λ̄i z
is a basis for N. Next, by Lemma 6.4.7,
1
γij = φi , φj = φj (λi ) = . (6.4.35)
1 − λ̄j λi
Since
is the Grammian of a linearly independent set, it is Hermitian and positive definite. Hence,
it is possible to factor
in the form B ∗ B where B is nonsingular. Let X = B −1 , and define

n
vj (z) = φi (z)xij , j = 1, · · · , n . (6.4.36)
i=1
Then {v1 , · · · , vn } is an orthonormal basis for N. If the factor B is selected to be upper triangular,
then this is just the Gram-Schmidt procedure.
The next step is to obtain a matrix representation A0 for the operator A = Tf with respect
to the basis {vi }. This is given by
aij = vi , Avj = vi , Tf vj = vi , f vj . (6.4.37)
Now, by Lemma 6.4.7, a quantity of the form φi , f φj is very easy to compute. Hence, it is desirable
to expand aij as a summation of terms of the form φi , f φj using (6.4.36). Let denote the matrix
in C n×n whose ij -th element is φi , f φj . Then routine computations show that
A0 = X∗ X . (6.4.38)
Now Fact 6.4.5 asserts that the infimum Jm in (6.4.10) equals A0 . Further, if p ∈ C n is a
column vector such that A0 p2 = p2 , then the optimal function f0 is given by (6.4.26).
Finally, as is the case with the filtering problem, the solution of the sensitivity minimization
problem is a lot cleaner in the discrete-time case than in the continuous-time case. If p(z) is the
transfer function of a strictly-causal plant, then p(0) = 0. If, in addition, p(·) has no poles nor zeros
on the unit circle, and the weighting function w(·) has no zeros on the unit circle, then an optimal
controller always exists.
This section is concluded with an illustrative example.
Example 6.4.8 Consider once again the plant of Example 6.4.2, namely
s−1
p(s) = ,
(s − 3)(s 2 + 1)
together with the weighting function
1
w(s) = .
s+1
By previous discussion, the sensitivity minimization problem reduces to finding an r ∈ H∞ that
minimizes f − rb∞ , where f and b are computed in Example 6.4.2. The first objective of this
example is to show that it is unnecessary to compute f and b explicitly. Now b is the Blaschke
product part of ñd̃ w̃. Hence, to form b, it is only necessary to know the locations of the zeros of
ñ, d̃, w̃ inside the open unit disc. In the s-domain, this translates to knowing the open RHP zeros
and poles of the plant, as well as the open RHP zeros of the weighting function. Moreover, as stated
earlier, it can be assumed without loss of generality that the weighting function is outer, so that it
has no open RHP zeros. In the problem at hand, p has a zero at s = 1, and a pole at s = 3. Under
the bilinear transformation z = (s − 1)/(s + 1), these points correspond to z = 0, 0.5, respectively.
Hence,
z(z − 0.5)
b(z) = .
1 − 0.5z
Next, in order to apply Fact 6.4.5 to compute the optimal function f0 , it is only necessary to know
the values of f (and its derivatives in case b has repeated zeros) at the zeros of b. Since b is the inner
part of ñd̃ w̃ and f = ỹ d̃ w̃, it follows that if z is a zero of b, then f (z) = w̃(z) or 0, depending on
whether z is a zero of ñ (i.e., a plant zero) or a zero of d̃ i.e., a plant pole). In the present instance,
f (0) = w̃(0) = w(1) = 0.5, f (0.5) = 0. All of this discussion is made more precise in the next
section, where it is shown that the sensitivity minimization problem is one of interpolation by a
function of minimum norm.
To compute the optimal function f0 , we first find a basis for the null space N of (6.4.22). By
Lemma 6.4.6, such a basis is given by
1
φ1 (z) = 1, φ2 (z) = .
1 − 0.5z
From (6.4.35), the Grammian matrix is

1 1

= ,
1 4/3
which can be factored as

1 0√ 1 1√

= = B B .
1 1/ 3 0 1/ 3
Hence,
√

−1 1 −√3
X=B = ,
0 3
and an orthonormal basis for N is given by
√ √ z
v1 (z) = φ1 (z) = 1, v2 (z) = 3(φ2 (z) − φ1 (z)) = 3 .
2−z
Next we need to compute A0 = X Xwhere φij = φi , f φj . From Lemma 6.4.7,
φ1 , f φ1 = (f φ1 )(0) = w̃(0) = 0.5,

φ1 , f φ2 = (f φ2 )(0) = 0.5,
φ2 , f φ1 = (f φ1 )(0.5) = 0,
φ2 , f φ2 = (f φ2 )(0.5) = 0 .
Therefore
0.5
√ 0
A0 = X X = .
−0.5 3 0

√ that σ̄ (A0 ) = 1, and that the associated vector p is [1 0] , which leads to
It is routine to compute
A0 p = [0.5 − 0.5 3] . Hence, by (6.4.26) the optimal function f0 is given by
√
0.5v1 (z) − 0.5 3v2 (z)
f0 (z) =
v1 (z)
1 − 2z
= .
2−z
6.5 SENSITIVITY MINIMIZATION: FAT PLANT CASE

The sensitivity minimization problem for multivariable systems is much deeper than the correspond-
ing problem for scalar systems. At present, a clean solution to the problem is available only in the
case where the plant has at least as many inputs as outputs. Not all of the details of this solution are
given in this section, as they require some background concepts from functional analysis not covered
previously.
Suppose P is a given strictly proper plant with at least as many inputs as outputs, and that
W1 , W2 ∈ M(S) are weighting matrices such that W1 has full row rank (as a rational matrix) and
W2 is square and nonsingular. The objective is to minimize the weighted sensitivity measure
J = W1 (I + P C)−1 W2 ∞ . (6.5.1)
Let (N, D), (D̃, Ñ ) be any r.c.f. and any l.c.f. of P , and let X̃, Ỹ ∈ M(S) be any solutions of
Ñ X̃ + D̃ Ỹ = I . Then, as shown in Section 6.2, one can replace (6.5.1) by the equivalent problem
of minimizing
J (R) = W1 Ỹ D̃W2 − W1 NR D̃W2 ∞ , (6.5.2)
6.5. SENSITIVITY MINIMIZATION: FAT PLANT CASE 177
as the “free” parameter R varies over M(S). Now define rational H∞ -matrices F, G, H by
F (z) = (W1 Ỹ D̃W2 )[(1 + z)/(1 − z)] , (6.5.3)

G(z) = (W1 N)[(1 + z)/(1 − z)] , (6.5.4)
H (z) = (D̃W2 )[(1 + z)/(1 − z)] . (6.5.5)
Then the problem becomes one of minimizing
J (R) = F − GRH ∞ , (6.5.6)
suitably choosing R from the set of rational H∞ -matrices.

Since the plant is strictly proper, we have that G(1) = 0. As a result, the infimum of J is not
attained in general. To deal with this situation, it is necessary to extend Lemma 6.4.1 to the matrix
case. For this purpose, define
J0 = inf F − GRH ∞ . (6.5.7)

R∈M(H∞ )
Suppose G and H have the following properties: G is of the form G1 g where G1 ∈ M(H∞ ), g ∈
H∞ , and in addition G1 (ej θ ) has full row rank at all θ ∈ [0, 2π ], and all zeros of g(·) are on the unit
circle. Similarly, H is of the form H1 h where H1 ∈ M(H∞ ), h ∈ H∞ , and in addition H1 (ej θ ) is
square and nonsingular at all θ ∈ [0, 2π], and all zeros of h(·) are on the unit circle. This means in
particular that if G(ej θ ) has less than full rank for some θ , then G(ej θ ) must in fact equal the zero
matrix; similar remarks apply to H . Note that the assumptions on G and H are trivially satisfied in
the scalar case. Now define
Jm = inf F − G1 RH1 ∞ . (6.5.8)

R∈M(H∞ )
It can be shown that the infimum in (6.5.8) is achieved, owing to the rank conditions on G1 and H1 .
The next result is a generalization of Lemma 6.4.1.
Lemma 6.5.1 Let ej θ1 , · · · , ej θl denote the distinct zeros of the product gh on the unit circle. Then
J0 = Jm if and only if F (ej θi ) ≤ Jm ∀i.
Proof. Since every matrix of the form GRH is also of the form G1 RH1 it is easy to see that Jm ≤ J0 .
“only if ” Since gh(ej θi ) = 0 ∀i, it follows that GRH (ej θi ) = 0 ∀i, ∀R ∈ M(H∞ ). Hence,
(F − GRH )(ej θi ) = F (ej θi ) ∀i, ∀R ∈ M(H∞ ), and as a result
F − GRH ∞ = sup (F − GRH )(ej θ )

θ∈[0,2π]
≥ max (F − GRH )(ej θi )
i
= max F (ej θi ), ∀R ∈ M(H∞ ) . (6.5.9)
i
Combining (6.5.9) with the hypothesis that J0 = Jm leads to
Jm = J0 = inf F − GRH ∞ ≥ max F (ej θi ) . (6.5.10)

R∈M(H∞ ) i
“if ” Select an R0 ∈ M(H∞ ) such that Jm = F − G1 R0 H1 ∞ . Such an R0 exists, by earlier remarks.

Now recall the function (cf. (6.4.13))
z − ej θ
hθ,ε (z) = , (6.5.11)
z − (1 + ε)ej θ
and define

l
hε (z) = [hθi ,ε (z)]mi , (6.5.12)
i−1
where mi is the multiplicity of ej θi as a zero of gh. Define the family (Rε ) in M(H∞ ) by Rε =
R0 hε /gh. Then F − GRε H = F − G1 R0 H1 hε . Moreover, as in the proof of Lemma 6.4.1, it will
follow that F − G1 R0 H1 hε ∞ → F − G1 R0 H1 ∞ = Jm as ε → 0+ . Hence, J0 = Jm . 2
To interpret Lemma 6.5.1 in the context of the sensitivity minimization problem, suppose
the plant P has blocking zeros8 at s = ±j ω1 , · · · , ±j ωn and at infinity, but has full row rank at all
other points on the j ω-axis. Suppose also that P has no j ω-axis poles. Select the weighting matrices
W1 , W2 such that W1 has the same blocking zeros as P on the extended j ω-axis, but has full row
rank at all other points on the j ω-axis, while W2 (j ω) is square and nonsingular at all points on the
extended j ω-axis.Then F (ej θi ) = 0 ∀i and the condition F (ej θi ) ≤ Jm is automatically satisfied;
as a result, J0 = Jm . Moreover, if R0 is a choice of R that attains the infimum in (6.5.8), then the
family (Rε = R0 hε /gh) is an optimizing family for J (·). This leads to a family of compensators
(Cε ) for P such that W1 (I + P Cε )−1 W2 ∞ → Jm as ε → 0+ .
Let us therefore turn our attention to the problem of computing the infimum in (6.5.8) and
an R that attains this infimum. As a first step, factorize G1 and H1 in the form G1 = Gi Go , H1 =
Ho Hi , where Gi , Hi are inner, Go ∈ M(H∞ ) has a right inverse, and Ho ∈ M(H∞ ) is a unit. This
is possible in view of the rank conditions on G1 and H1 (see Fact 6.2.2). Now, for any V ∈ M(H∞ ),
the equation V = Go RHo has a solution for R in M(H∞ ). Hence,
inf F − G1 RH1 ∞ = inf F − Gi V Hi ∞ . (6.5.13)

R∈M(H∞ ) V ∈M(H∞ )
adj adj
Next, let Gi , Hi denote the adjoint matrices of Gi , Hi , respectively, and note that these adjoint
matrices are themselves inner. Let b = |Gi | · |Hi |, and note that b is an inner function. Now, since
multiplication by inner matrices is norm-preserving, it follows that
adj adj adj adj
F − Gi V Hi ∞ = Gi F Hi − Gi Gi V Hi Hi ∞
= F0 − bV ∞ , ∀V ∈ M(H∞ ) , (6.5.14)
8 Recall that s is a blocking zero of P if P (s) = 0.
adj adj
where F0 = Gi F Hi . Hence, from (6.5.8), (6.5.13), and (6.5.14),
Jm = inf F0 − bV ∞ . (6.5.15)
V ∈M(H∞ )
As a final simplification, we may as well assume that b is a (finite) Blaschke product: Since b is inner,
it equals a constant times a Blaschke product; the constant can be absorbed into the “free” matrix V .
Let us now switch notation and restate the problem. Given a rational matrix F ∈ M(H∞ )
and a finite Blaschke product b, find an R ∈ M(H∞ ) that minimizes F − bR∞ . There are two
aspects to this problem: (i) computing the minimum, and (ii) finding an R that attains the minimum.
Each aspect is discussed separately.
Backtracking through the various manipulations, the reader can verify that the matrices F
and R have the same dimensions, and that they have at least as many columns as rows. To be specific,
suppose they are of dimension r × s, where r ≤ s. Recall from Section 6.2 that Hs2 is a Hilbert space
under the inner product
2π
1
f, g = f ∗ (ej θ )g(ej θ ) dθ . (6.5.16)
2π
0
As in the scalar case, with each matrix H ∈ Hs×s
∞ one can associate a continuous linear operator
TH : Hs2 → Hs2 in the obvious way, namely:
(TH f )(z) = H (z)f (z) ∀z ∈ D, f ∈ Hs2 . (6.5.17)
Moreover, this association is norm-preserving; that is,
TH f 2
TH = sup
f ∈Hs \0 f 2
2
= H ∞ = ess. sup σ̄ (H (ej θ )) . (6.5.18)

θ∈[0,2π]
It turns out that Fact 6.4.5 has a complete analog in the multivariable case in so far as computing
the infimum of F − bR∞ . This analog is based on a preliminary result taken from [86]. Some
notation smoothens the way to a presentation of the latter. Let
bHs2 = {bf : f ∈ Hs2 } , (6.5.19)
and let N denote the orthogonal complement of bHs2 in Hs2 ; thus
N = {g ∈ Hs2 : g, bf = 0 ∀f ∈ Hs2 } . (6.5.20)
Finally, let denote the orthogonal projection mapping Hs2 onto N.
Lemma 6.5.2 For each ∈ Hs×s s

∞ , let denote the map T from H2 into N. Let denote the
collection { : ∈ Hs×s∞ }. Then, corresponding to each map M in the collection , there exists a matrix
∈ Hs×s ∞ such that M = ∞ and = M.
Proof. See [86], Theorem 3. 2
Theorem 6.5.3 Suppose F ∈ Hr×s

∞ with r ≤ s. If r < s, define

F
F0 = , (6.5.21)
0r−s×s
so that F0 ∈ Hs×s
∞ . Define a linear mapping, A : N → N as the restriction of TF0 to N. Then
inf F − bR∞ = A . (6.5.22)

s×s
R∈H∞
Remarks 6.5.4 As in the scalar case, the subspace N is finite-dimensional, so that A can be
computed by purely algebraic means. More on this later.
Proof. As a first step, suppose r = s so that F is square. As R varies over Hs×s

∞ , the operator TF −bR
ranges over various operators. However, the operator TF −bR is the same for all R, and equals TF .
To see this, let g ∈ Hs2 be arbitrary. Then
TF −bR g = (F − bR)g = (F g) − (bRg) = TF g , (6.5.23)
since bRg ∈ Hs2 and hence its projection onto N is zero. Now TF is what is called F in
Lemma 6.5.2, which states that there exists a matrix ∈ Hs×s
∞ such that = F and such
that ∞ = F = (TF ). But = F is equivalent to (F − ) = 0, i.e.,
(F − )g ∈ Hs2 ∀g ∈ Hs2 . (6.5.24)
Let ei denote the i-th elementary unit vector, which has a “1” in the i-th position and zeros elsewhere,
and let
g(z) = ei , ∀z ∈ D . (6.5.25)
Then g ∈ Hs∞ (which also implies that g ∈ Hs2 ); moreover, (6.5.24) now implies that the i-th column
of the matrix F − is a multiple (in H∞ ) of the inner function b. Repeating this reasoning for
all indices i shows that F − ∈ bHs×s
∞ , i.e., that = F − bR0 for some R0 ∈ H∞ . It is now
s×s
claimed that
F − bR0 ∞ = inf F − bR∞ . (6.5.26)
s×s
R∈H∞
To prove (6.5.26), note that
F − bR∞ = TF −bR by (6.5.18),

≥ TF −bR since ≤ 1,

= TF , by (6.5.23),
= ∞ = F − bR0 ∞ . (6.5.27)
The preceding discussion can be summarized as follows: If F is square, then

inf F − bR∞ = TF . (6.5.28)
s×s
R∈H∞
To complete the proof of the theorem, it only remains to show that TF restricted to N has the
same norm as TF . Suppose g ∈ Hs2 . Then g can be expressed uniquely in the form f + bh where
f ∈ N and h ∈ Hs2 ( since Hs2 is the direct sum of N and bHs2 ). Now9
TF g = (Ff + F bh) = (Ff + bF h) = Ff . (6.5.29)
Moreover, g2 ≥ f 2 , so that

F g Ff
2
≤ 2
. (6.5.30)
g2 f 2
Hence, A (which is the restriction of TF to N) has the same norm as TF .
Next, the restriction that F be square is removed. Suppose r < s and define F0 as in (6.5.21).
Partition any R0 ∈ Hs×s
∞ as R0 = [R R1 ] where R ∈ H∞ . Then, for any R0 ∈ H∞ , we have
r×s s×s

F − bR
F0 − bR0 ∞ =
−bR1 ∞

(F − bR)(ej θ )
= ess. sup σ̄ . (6.5.31)
θ∈[0,2π] −bR1 (ej θ )
Now for any two complex matrices A, B of compatible dimensions, it is easy to show that

A
σ̄ ≥ σ̄ (A) . (6.5.32)
B
Hence, from (6.5.31), F0 − bR0 ∞ ≥ F − bR∞ , so that
inf F0 − bR0 ∞ ≥ inf F − bR∞ . (6.5.33)
R0 ∈H∞
s×s r×s
R∈H∞
To prove the reverse inequality, let R be any matrix in Hs×s∞ that attains the infimum on the right
side of (6.5.33), and define R0 = [R 0s×r−s ] . Then F0 − bR0 ∞ = F − bR∞ . Hence, the
inequality in (6.5.33) is actually an equality. Moreover, from the previous paragraph, the infimum
on the left side of (6.5.33) equals the norm of TF0 when restricted to N. 2
9 Note that this argument fails if b is replaced by an inner matrix.
As in the scalar case, it is easy to compute a basis for N.
Lemma 6.5.5 Let b be a finite Blaschke product with distinct zeros λ1 , · · · , λn , of multiplicities
m1 , · · · , mn , respectively. For i ∈ {1, · · · , s}, let ei denote the i-th s × 1 elementary unit vector. Define
N as in (6.5.20). Then, a (nonorthonormal) basis for N is given by the vector-valued functions
zk−1
i
vkl (z) = · ei , 1 ≤ i ≤ s; 1 ≤ k ≤ ml ; 1 ≤ l ≤ n . (6.5.34)
1 − λ̄l z
Proof. By Lemma 6.4.7 and (6.5.16),
(k−1)
vkl
i
, f = fi (zl ), ∀f ∈ Hs2 , (6.5.35)
Since f ∈ bHs2 if and only if the right side of (6.5.35) is zero for all i, k, l, the lemma follows. 2
Once a basis for N has been determined, it is possible to convert it into an orthonormal basis,
compute a matrix representation of A0 of A with respect to this basis, and finally to compute σ̄ (A0 );
the details are exactly the same as in the scalar case (see (6.4.33) et. seq.). However, if one is only
interested in computing σ̄ (A0 ), a short-cut is available, and is described next.
It should be emphasized that the procedure described next is equally applicable to the scalar
case, and can be used to compute σ̄ (A0 ) more readily than by using the formula (6.4.38). However,
there would be no advantage gained by this: In the scalar case, Fact 6.4.5 gives a method for computing
the infimum of f − br∞ as well as for finding an r that attains the infimum, all in one shot. In
contrast, the procedure given below only determines the infimum, and a separate interpolation
procedure has to be used to find an optimizing r. This is no disadvantage in the multivariable case,
however, since such a two-stage procedure is all that is available at present.
Now for the short-cut: Theorem 6.5.3 states that the infimum of F − bR∞ equals A,
where A : N → N is the restriction of TF to N. Lemma 6.5.5 provides a (nonorthonormal) basis for
N. To compute A, the first step is to observe that A = A∗ , where A∗ : N → N is the adjoint
map of A. Now compute a representation of A∗ with respect to the basis given in Lemma 6.5.5, and
denote it by . Also, let
denote the Grammian matrix of the basis in Lemma 6.5.5. Then it is
easy to show that
A = inf{λ : λ
− ∗ is positive definite} . (6.5.36)
Alternately, A is the largest solution to the generalized eigenvalue problem |λ

− ∗ | = 0.
The effectiveness of the above short-cut derives from the fact that the matrix representation
is block-diagonal, as shown in Lemma 6.5.6 below. For the sake of clarity of notation, Lemma 6.5.6 is
stated only for the case where the Blaschke product b has simple zeros; the general case is conceptually
not more difficult and is left to the reader.
Lemma 6.5.6 Suppose b is a finite Blaschke product with distinct simple zeros λ1 , · · · , λm , and that
F ∈ Hr×s
∞ is rational. Define ∈ C ms×mr ,
∈ C mr×mr by
= Block Diag {F ∗ (λ1 ), · · · , F ∗ (λm )} , (6.5.37)

= (
ij ), i, j = 1, · · · , m , (6.5.38)
where
1

ij = · Ir . (6.5.39)
1 − λ̄j λi
Then
inf F − bR∞ = inf{λ : λ
− ∗ is positive definite} . (6.5.40)
s×s
R∈H∞
Proof. Define N as in (6.5.20).Then N has dimension ms, and a basis for N is given by Lemma 6.5.5
as
1
vij (z) = ej , j = 1, · · · , s; i = 1, · · · , m . (6.5.41)
1 − λ̄i z
Thus, any h ∈ N can be expanded as

m
s
h(z) = hij vij (z) . (6.5.42)
i=1 j =1
Arrange the coordinates hij in an ms × 1 column vector as follows:

⎡ ⎤
h11
⎢ .. ⎥
⎢ . ⎥
⎢ ⎥
⎢ h1s ⎥
⎢ ⎥
⎢ .. ⎥
⎢ . ⎥ . (6.5.43)
⎢ ⎥
⎢hm1 ⎥
⎢ ⎥
⎢ . ⎥
⎣ .. ⎦
hms
Then the claim is that the matrix representation of A∗ with respect to this basis is
0 = Block Diag {F0 (λ1 ), · · · , F0 (λm )} . (6.5.44)

To prove the claim, let vkl be any basis function of the form (6.5.41). We must show that the ma-
trix (6.5.44) gives the right answer for A∗ vkl . Now the coordinate vector of vkl is [0 · · · 1 · · · 0] , where
the 1 is in the position (k − 1)s + l. Applying the matrix 0 to this vector gives [0 · · · F0 (λk )el · · · 0]∗
as the coordinate vector of A∗ vkl ; in other words,

s
∗ ∗
A vkl = f0lp (λk )vkp . (6.5.45)
p=1
To show that (6.5.45) is correct, it is enough to show that it gives the right answer for the inner
products vij , A∗ vkl for all i, j . By definition,
vij , A∗ vkl = A∗ vkl , vij = vkl , Avij

= vkl , TF0 vij = vkl , F0 vij
= el∗ F0 (λk )vij (λk )
1
= f0lj (λk ) , (6.5.46)
1 − λ̄i λk
where the last two steps follow from Lemma 6.4.7 and (6.5.41), respectively. On the other hand,
letting ukl denote the function in (6.5.45) leads to

s
∗
vij , ukl = f0lp (λk )vij , vkp . (6.5.47)
p=1
Note that
1
vij , vkp = δjp , (6.5.48)
1 − λ̄k λi
where δ is the Kronecker delta. Hence, (6.5.47) becomes
∗ 1
vij , ukl = f0lj (λk ) · , (6.5.49)
1 − λ̄k λj
which agrees with (6.5.46).
By previous discussion, A∗ is given by
A∗ = inf{λ : λ
0 − ∗0 0 is positive definite} , (6.5.50)
where
0 is the Grammian matrix of the basis in (6.5.41). One can readily compute that
0 =
(
0ij )m
i,j =1 where
1

0ij = · Is , (6.5.51)
1 − λ̄j λi
Now the only difference between
and
0 is in their dimensions. Similarly, if we define as
in (6.5.37), then is a submatrix of 0 , since F is a submatrix of F0 . If r = s then we are done,
since
=
0 , = 0 and (6.5.50) is the same as (6.5.40). Otherwise, by symmetric row and column
permutations, one can transform λ
0 − ∗0 0 into the form

λ 0
λ
0 −→ , (6.5.52)
0 λ
− ∗
where = (ij )m
i,j =1 , and
1
ij = · Is−r . (6.5.53)
1 − λ̄j λi
It is clear from (6.5.52) that λ
0 − 0 ∗ 0 is positive definite if and only if λ
− ∗ is; hence,
(6.5.50) reduces to (6.5.40). 2
Now that an expression is available for the infimum of F − bR∞ the next step is to give
a procedure for computing an R that attains the infimum. In the case where the inner function b
has only simple zeros, such a procedure is provided by the matrix Nevanlinna-Pick algorithm.10 This
algorithm is also applicable to the scalar problem studied in Section 6.4, and forms an alternative to
the procedure of Fact 6.4.5. However, the latter is preferable, since it leads to an optimal interpolating
function in one step, where as the Nevanlinna-Pick algorithm is iterative, and requires as many
iterations as there are zeros of b.
The main result in the sequel is Theorem 6.5.9, which overlaps somewhat with Lemma 6.5.6.
However, an independent proof is given for this theorem, as the iterative interpolation algorithm is
really a consequence of the method of proof.
To set up the procedure, define
μ= inf F − bR∞ , (6.5.54)

r×s
R∈H∞
and let λ1 , · · · , λn denote the (simple) zeros of b. Then the problem is to find a matrix G ∈ Hr×s
∞
such that
G(λi ) = F (λi ), i = 1, · · · , n, and G∞ = μ . (6.5.55)
(Note that G ∈ Hr×s∞ is of the form F − bR for some R ∈ H∞ if and only if G(λi ) = F (λi ) ∀i).
s×s
Equivalently, if we define
Fi = [F (λi )]/μ , (6.5.56)
then the problem is one of finding a matrix ∈ Hr×s ∞ such that ∞ = 1 and (λi ) = Fi ∀i.
The remainder of this section is devoted to the matrix version of the so-called Nevanlinna-
Pick problem, which can be stated as follows: Given complex numbers λ1 , · · · , λn , with |λi | < 1 ∀i,
together with complex matrices F1 , · · · , Fn , with Fi ≤ 1 ∀i, find all matrix-valued functions
(if any) such that ∞ ≤ 1 and (λi ) = Fi ∀i. The relevance of this problem to sensitivity
minimization is clear.
10 Strictly speaking, the Pick problem only makes sense for square matrices, so it is probably more accurate to speak of the matrix
Nevanlinna algorithm.
In the sequel, it is important to note that · denotes the norm of a complex matrix whereas
· ∞ denotes the norm of a matrix-valued function.
Before proceeding to the interpolation algorithm, it is established that one may assume that
Fi < 1 ∀i, without any loss of generality. Consider first the scalar case, where one is given scalars
f1 , · · · , fn , with fi ≤ 1 ∀i, and is asked to find a φ ∈ H∞ such that φ∞ ≤ 1 and φ(λi ) = fi ∀i.
Suppose one of the constants has magnitude exactly equal to 1, and renumber the constants if
necessary so that |f1 | = 1. Since λ1 belongs to the open unit disc, it follows from the maximum
modulus principle that the only H∞ -function with norm 1 that equals f1 at λ1 is the constant
function φ(z) ≡ f1 ∀z. Hence, the interpolation problem does not have a solution unless fi = f1 ∀i.
In the latter case, the solution is the obvious one, namely φ(z) ≡ f1 . Thus, one may as well assume
at the outset that |fi | < 1 ∀i, because the problem is trivial otherwise. But this reasoning breaks
down in the matrix case, because there exist nonconstant matrices such that (z) ≡ 1 ∀z ∈ D;
for example, let

1 0
(z) = . (6.5.57)
0 z
However, Lemma 6.5.7 below shows that all such functions are essentially of the form (6.5.57). One
can think of Lemma 6.5.7 as providing a matrix version of the maximum modulus principle.
Lemma 6.5.7 Suppose ∈ Hk×l ∞ and that ∞ ≤ 1. Suppose (z0 ) = 1 for some z0 in the
open unit disc, and let r denote the multiplicity of 1 as a singular value of (z0 ) (i.e., as an eigenvalue of
∗ (z0 )(z0 )). Then (z) = 1 ∀z ∈ D. Moreover, there exist unitary matrices U ∈ C k×k , V ∈ C l×l
such that
Ir 0l−r×r
U (z)V = , (6.5.58)
0r×k−r (z)
where (z) < 1 ∀z ∈ D.
The proof of Lemma 6.5.7 requires the following observation on complex matrices.
Lemma 6.5.8 Suppose M ∈ C k×l has norm 1. Then

(i) |mij | ≤ 1 ∀i, j ,
(ii) If |mij | = 1 for some i, j , then all other elements of row i and column j are zero.
Proof. Temporarily let · 2 denote the Euclidean vector norm on both C k and C l . Then
Mx2 ≤ x2 ∀x ∈ C l , (6.5.59)
since M = 1. To prove (i), suppose |mij | > 1 for some i, j , and let x = ej , the j -th elementary
unit vector in C l . Then Mx2 > 1, which is a contradiction. To prove (ii), suppose |mij | = 1 for
some i, j , and again let x = ej .Then Mej 2 = 1 implies that all other elements of the j -th column
of M are zero. The same reasoning applied to M ∗ shows that all other elements of the i-th row of
M are zero. 2
Proof of Lemma 6.5.7. Suppose 1 is an r-fold eigenvalue of ∗ (z
0 )(z0 ), so that 1 is an r-fold
singular value of (z0 ). Thus, there exist unitary matrices U ∈ C , V ∈ C l×l such that
k×k

Ir 0r×l−r
U (z0 )V = , (6.5.60)
0k−r×r M
where M < 1. Define y(z) = (U (z)V )11 , i.e., y is the corner element of U (z)V . Then y
is analytic, satisfies |y(z)| ≤ 1 ∀z ∈ D (by Lemma 6.5.8 (i)), and y(z0 ) = 1 (by (6.5.60)). Hence,
by the maximum modulus theorem, y(z) = 1 ∀z ∈ D, since z0 is in the open unit disc. Now by
Lemma 6.5.8 (ii), all other elements in the first row and first column of U (z)V are identically
zero. The same reasoning applies to all elements of the form (U (z)V )ii where i ≤ r. Hence,
U (z)V is of the form (6.5.58), and (z) = 1 ∀z ∈ D. It only remains to show that (z) <
1 ∀z ∈ D. Suppose to the contrary that (z1 ) = 1 for some z1 ∈ D. Then, by previous reasoning,
(z) = 1 ∀z ∈ D. In particular, M = (z0 ) = 1 , which contradicts the assumption that 1
is an r-fold singular value of (z0 ). Hence, (z) < 1 ∀z ∈ D. 2
Let us now return to the interpolation problem, where one is given points λ1 , · · · , λn , and
matrices F1 , · · · , Fm , with |λi | < 1 and Fi ≤ 1 ∀i. Recall that the objective is to show that one
may assume Fi < 1 ∀i without sacrificing generality. Assume to the contrary that Fi = 1 for
some values of i. Among these select one for which the multiplicity of 1 as a singular value is the
largest, and renumber it as F1 . Select unitary matrices U, V such that

Ir 0
U F1 V = , (6.5.61)
0 G1
where r is the multiplicity of 1 as a singular value of F , and G1 < 1. Now, by Lemma 6.5.7, any
such that ∞ ≤ 1 and (λi ) = Fi must satisfy

Ir 0
U (z)V = ∀z ∈ D . (6.5.62)
0 (z)
In particular, F2 , · · · , Fn must satisfy

I 0
U Fi V = r , i = 2, · · · , n . (6.5.63)
0 Gi
Otherwise the interpolation problem has no solution. On the other hand, suppose (6.5.63) holds.
Then Gi < 1 ∀i; otherwise, if Gi = 1 for some i, then Fi has 1 as a singular value of multiplicity
larger than r, contradicting the manner in which F1 was selected. Now replace the original interpola-
tion problem by the following modified problem: Find a such that ∞ ≤ 1 and (λi ) = Gi ∀i.
Then there is a one-to-one correspondence between solutions of the original problem and of
the modified problem, via (6.5.62). But in the modified problem all the matrices Gi , have norms
strictly less than one.
We are now ready for the main result on the matrix Nevanlinna problem.
Theorem 6.5.9 Suppose λ1 , · · · , λn are distinct complex numbers and F1 , · · · , Fn are complex
matrices with |λi | < 1 and Fi < 1 ∀i. Define a partitioned matrix
⎡ ⎤
P11 · · · P1n
⎢ .. .. ⎥ ,
P =⎣ . . ⎦ (6.5.64)
Pn1 · · · Pnn
where
1
Pij = · (I − Fi∗ Fj ) . (6.5.65)
1 − λ̄i λj
Then there exists a ∈ M(H∞ ) such that ∞ < 1 and (λi ) = Fi ∀i, if and only if the matrix
P is positive definite.
In the current set-up Lemma 6.5.6 leads to the following result: There exists a ∈ M(H∞ )
such that ∞ ≤ 1 and (λi ) = Fi if and only if the matrix P is nonnegative definite.
The proof of Theorem 6.5.9 is rather long and is based on a preliminary lemma. We shall see
that an iterative algorithm for computing will emerge from the proof.
Lemma 6.5.10 Suppose E < 1, and define

A B
L(E) = , (6.5.66)
C D
where
A = (I − EE ∗ )−1/2 , B = −(I − EE ∗ )−1/2 E,

C = −(I − E ∗ E)−1/2 E ∗ , D = (I − E ∗ E)−1/2 . (6.5.67)
(Note that M 1/2 denotes the Hermitian square root of M.) Then for each matrix X with X ≤ 1, the
matrix
TE (X) = (AX + B)(CX + D)−1 , (6.5.68)
is well-defined. Moreover,
TE (X) ≤ 1 (resp. < 1) ⇐⇒ X ≤ 1 (resp. < 1) . (6.5.69)

Proof. First of all, CX + D = (I − E ∗ E)−1/2 (E ∗ X
+ I ). Since E < 1, it follows that, whenever
X ≤ 1, the matrix E X also has norm less than one, so that E ∗ X + I has an inverse. Hence, TE (X)
∗
is well-defined.
Now define
I 0
J = . (6.5.70)
0 −I
Then routine calculations show that L∗ J L = J ; these calculations make use of the matrix identities
(I − MN )−1 M = M(I − NM)−1 , (6.5.71)

−1 −1
(I − MN ) = I + M(I − NM) N. (6.5.72)
Note that for any matrix M, we have that M ≤ 1 (resp. < 1) if and only if M ∗ M − I ≤ 0 (resp.
< 0). Hence, (6.5.69) is equivalent to
T ∗ T − I ≤ 0 (resp. < 0) ⇐⇒ X∗ X − I ≤ 0 (resp. < 0) , (6.5.73)
where T is a shorthand for TE (X). But since CX + D is nonsingular,
T ∗ T − I ≤ 0 (resp. < 0) ⇐⇒ (CX + D)∗ (T ∗ T − I )(CX + D) ≤ 0 (resp. < 0) . (6.5.74)
The latter quantity equals
(AX + B)∗ (AX + B) − (CX + D)∗ (CX + D)

∗ ∗ X
= [X I ]L J L
I

X
= [X∗ I ]J
I
= X∗ X − I . (6.5.75)
Hence, (6.5.69) follows from (6.5.74) and (6.5.75). 2
Proof of Theorem 6.5.9. “if ” The proof is by induction on the number n of points to be interpolated.
If n = 1, the theorem is clearly true: Take (z) = F1 . To establish the inductive step, suppose the
theorem is true for n − 1 points, and suppose that P is positive definite; it is shown below that this
implies the existence of a ∈ M(H∞ ) satisfying ∞ < 1 and (λi ) = Fi ∀i.
Let y be the inner function11
|λ1 |(z − λ1 )
y(z) = , (6.5.76)
λ1 (1 − λ̄1 z)
11 Note that y(z) = z if λ = 0.
1
and define
Ti = TF1 (Fi ), i = 2, · · · , n , (6.5.77)

1
Gi = Ti , i = 2, · · · , n , (6.5.78)
y(λi )
where TF1 (·) is defined in (6.5.68). Now define a partitioned matrix
⎡ ⎤
Q22 · · · Q2n
⎢ .. ⎥ ,
Q = ⎣ ... . ⎦ (6.5.79)
Qn2 · · · Qnn
where
1
Qij = · (I − G∗i Gj ) . (6.5.80)
1 − λ̄i λj
It is claimed that Q is positive definite. To prove the claim, compute Qij from (6.5.80), and make
liberal use of the identities (6.5.71) and (6.5.72). This leads to

1 1 ∗
Qij = I− T Tj
1 − λ̄i λj 1 − ȳ(λi )y(λj ) i

1 1 1
= 1− ·I + (I − Ti∗ Tj ) , (6.5.81)
1 − λ̄i λj 1 − ȳ(λi )y(λj ) 1 − ȳ(λi )y(λj )
where from (6.5.77),
I − Ti∗ Tj = (I − F1∗ F1 )1/2 (I − Fi∗ F1 )−1 (I − Fi∗ Fj )

(I − F1∗ Fj )−1 (I − F1∗ F1 )1/2 . (6.5.82)
For clarity, define
Ri−1 = (I − F1∗ Fi )−1 (I − F1∗ F1 )1/2 , i = 1, · · · , n , (6.5.83)
and note that

1 1 λ̄1 λ1 − 1
· 1− = . (6.5.84)
1 − λ̄i λj 1 − ȳ(λi )y(λj ) (λ̄i − λ̄1 )(λj − λ1 )
Then, (6.5.81) becomes
λ̄1 λ1 − 1 R ∗ −1 Rj−1
Qij = · I + i Pij , (6.5.85)
(λ̄i − λ̄1 )(λj − λ1 ) y(λi ) y(λj )
or, upon rearrangement,

1 − λ̄1 λ1
Pij = Ri∗ I + ȳ(λi )Qij y(λj ) Rj . (6.5.86)
(1 − λ̄i λ1 )(1 − λj λ̄1 )
Now (6.5.86) implies that

∗ c1 I 0
P =R R, (6.5.87)
0 Q
where
1
ci = for i = 1, · · · , n , (6.5.88)
1 − λ̄1 λi
and R is a partitioned matrix of the form
⎡ ⎤
c1 R1 c2 R2 c3 R 3 ··· c n Rn
⎢ 0 y(λ2 )c2 R2 0 ··· 0 ⎥
⎢ ⎥
⎢ ··· ⎥
R=⎢ 0 0 y(λ3 )c3 R3 0 ⎥ . (6.5.89)
⎢ . .. .. .. ⎥
⎣ .. . . . ⎦
0 0 0 · · · y(λn )cn Rn
Since R is nonsingular, c1 > 0 and P is positive definite, it follows from (6.5.87) that Q is also positive
definite. In particular, Qii is positive definite for all i, and Gi < 1 for all i. By the inductive
hypothesis, there exists a function ∈ M(H∞ ) such that ∞ < 1 satisfying (λi ) = Gi for
i = 2, · · · , n. Now define
(z) = (A − y(z)(z)C)−1 (y(z)(z)D − B) , (6.5.90)
where
A = (I − F1 F1∗ )−1/2 , B = (I − F1 F1∗ )−1/2 F1 ,

C = (I − F1∗ F1 )−1/2 F1∗ , D = (I − F1∗ F1 )−1/2 . (6.5.91)
Note that (6.5.90) is the inverse of the transformation TF1 , defined in (6.5.68). Hence,
from (6.5.77), (6.5.78), and (6.5.90),
(λi ) = Fi for i = 2, · · · , n . (6.5.92)
Since y(λ1 ) = 0, (6.5.90) shows that
(λ1 ) = A−1 B = F1 . (6.5.93)
Finally, by Lemma 6.5.10, (z) < 1 for all z implies that (z) < 1 for all z. Hence, is a
solution to the interpolation problem with n points. This completes the proof by induction.
“only if ” One needs to prove the following statement: “Suppose ∈ M(H∞ ) and that
∞ < 1. Let λ1 , · · · , λn be any n distinct points in the open unit disc, let Fi = (λi ), and
define P as in (6.5.64), (6.5.65). Then P is positive definite.”
The proof is again by induction on the number n of points. The statement is clearly true if
n = 1. Suppose it is true for n − 1 points. Given the n points λ1 , · · · , λn and the function , let
Fi = (λi ) and define the function ∈ M(H∞ ) by
1
(z) = (A(z) + B)(C(z) + D)−1 , (6.5.94)
y(z)
where y is as in (6.5.76) and A, B, C, D are as in (6.5.91). Then ∞ < 1 by Lemma 6.5.10.
Moreover, (λi ) = Gi where Gi is defined in (6.5.78). By the inductive hypothesis, the matrix Q
of (6.5.79) is positive definite. Since (6.5.87) holds, P is also positive definite. Hence, the statement
is true for n points. 2
The proof of the “if ” part of the theorem suggests an iterative algorithm for computing . By
the transformation (6.5.77)–(6.5.78) one can replace the given interpolation problem involving n
points by another involving only n − 1 points. Applying the transformation repeatedly reduces the
number of interpolation constraints by one each time, until eventually one is left with a single matrix
Hn such that Hn < 1, and is faced with the problem of finding a matrix
(·) ∈ M(H∞ ) such
that
(λn ) = Hn and
∞ < 1. An obvious solution is
(z) ≡ Hn ∀z. By applying the inverse
transformation (6.5.90) n − 1 times one can find a solution to the original problem. Moreover, there
is a one-to-one correspondence between the set of all solutions
(·) to the above problem and the
set of all solutions to the original interpolation problem.
The fact that the matrix R in (6.5.87) is upper-triangular suggests that (6.5.87) is the first step
in a block Cholesky decomposition of P . This fact can be used to generate Q quickly and reliably.
Suppose now that it is desired to find a function (·) ∈ M(H∞ ) satisfying (λi ) = Fi and
such that ∞ ≤ 1 (as opposed to ∞ < 1). In this case, a necessary and sufficient condition
that such a exist is that the test matrix P of (6.5.64) be nonnegative definite (as opposed to positive
definite as in Theorem 6.5.9). To construct such a , one can use the iterative procedure described
above, but with a few modifications. As mentioned before Theorem 6.5.9, one can assume at the
outset that each matrix Fi has a norm strictly less than one, by applying the reduction procedure of
Lemma 6.5.7 if needed. Then, by applying the transformation (6.5.77)–(6.5.80), one can replace the
original problem involving the n matrices F1 , · · · , Fn by a reduced interpolation problem involving
the n − 1 matrices G2 , · · · , Gn . Moreover, if P is nonnegative definite, so is Q, because of (6.5.87).
However, even if Fi < 1 ∀i, it can happen that Gi = 1 for some i. In such a case, it is necessary
first to reduce the size of the matrices by first using Lemma 6.5.7 before carrying out another step
of the Nevanlinna iteration.
This section concludes with an example that illustrates the use of the matrix Nevanlinna
algorithm.
Example 6.5.11 Suppose it is desired to find a matrix ∈ H1×2

∞ that has norm less than one and
satisfies the interpolation constraints
(0) = [0.2 0.4], (0.5) = [0.3 0.7] .

6.6. SENSITIVITY MINIMIZATION: GENERAL CASE 193
The test matrix P of (6.5.64) is of order 4 × 4 in this case. To save some labor, let us consider the
equivalent problem of finding a ∈ H∞ 2×1
that has norm less than one and satisfies the interpolation
constraints
0.2 0.3
(0) = , (0.5) = .
0.4 0.7
In this case λ1 = 0, λ2 = 0.5, F1 , F2 are self-evident, and

0.8 0.66
P = ,
0.66 0.56
which is positive definite. Hence, the problem has a solution. From (6.5.67),
⎡ ⎤
1.0236 0.0472 −0.2236
L(F1 ) = ⎣ 0.0472 1.0944 −0.4472⎦ .
−0.2236 −0.4472 1.118
In this case y(z) = z, y(λ2 ) = 0.5. Hence, from (6.5.77) and (6.5.78),

0.11652 1 0.15791
T2 = · = ,
0.33304 0.73788 0.45135

1 0.31582
G2 = T2 = .
0.5 0.9027
One can as well choose (z) = G2 ∀z. Then, from (6.5.90),

−1
1.0236 + 0.0706z 0.0472 + 0.1412z 0.2236 + 0.3531z
(z) =
0.0472 + 0.2018z 1.0944 + 0.4037z 0.4472 + 1.0092z

1 0.2236 + 0.3659z
=
1.118 + 0.4743z 0.4472 + 1.0028z

1 0.2 + 0.3273z
= .
1 + 0.4242z 0.4 + 0.8969z
6.6 SENSITIVITY MINIMIZATION: GENERAL CASE

In this section, the sensitivity minimization problem is studied in full generality, i.e., without any
simplifying assumptions on the relative numbers of plant inputs and outputs, and on the dimensions
of the weighting matrices. The filtering problem in the general case is not discussed separately in
the interests of brevity, but reasoning similar to that given here applies to this problem as well. As
shown in earlier sections, the problem is one of minimizing a cost function
J (R) = F − GRH ∞ , (6.6.1)

by a suitable choice of R ∈ M(H∞ ), where F, G, H ∈ M(H∞ ) are given rational matrices. It is
assumed throughout that G(ej θ ), H (ej θ ) have full rank (either row or column, as appropriate) for
all θ ∈ [0, 2π]. If this is not so, then it is necessary to extend Lemma 6.5.1 to the present case. This
extension is straight forward and is left to the reader.
Some preliminary results are presented before we proceed to the main solution algorithm. In
the sequel, the notation A > 0 (resp. A ≥ 0) means that the Hermitian matrix A is positive definite
(resp. nonnegative definite).The notation A > B (resp. A ≥ B) denotes that A and B are Hermitian,
and that A − B > 0 (resp. A − B ≥ 0). The notations A < B and A ≤ B are defined analogously.
Lemma 6.6.1 Suppose A ∈ M(R(z)) and satisfies two conditions:
A (z−1 ) = A(z) ∀z, and (6.6.2)

A(e ) ≥ 0 ∀ θ ∈ [0, 2π ] .
jθ
(6.6.3)
Then there exists an outer matrix U ∈ M(R∞ ) such that
A(z) = U (z−1 )U (z) ∀z . (6.6.4)
Further, if (6.6.3) is strengthened to
A(ej θ ) > 0 ∀θ ∈ [0, 2π ] , (6.6.5)
then U can be chosen to a unit of M(R∞ ). In this case, U is unique to within left multiplication by an
orthogonal matrix.
A matrix U satisfying (6.6.4) is called a spectral factor of A and is denoted by A1/2 .

Proof. See [3, p. 240]. 2
Lemma 6.6.2 Suppose A ∈ R∞n×m

where n > m, and that A is inner. Then there exists an inner matrix
B ∈ R∞
n×(n−m)
such that = [A B] ∈ R∞n×n
is also inner.
The matrix B is sometimes referred to as a complementary inner matrix of A.

Proof. Since A is inner, it has full column rank in M(R∞ ). By permuting rows if necessary, which
does not affect the validity of the lemma, it can be assumed that

D
A= , (6.6.6)
N
where |D| = 0 as an element of R∞ . Note that D ∈ R∞

m×m
, N ∈ R∞
n−m×m
. Now define
P (z) = −[D (z−1 )]−1 N (z−1 ) . (6.6.7)

Then P ∈ (R(z))m×(n−m)and therefore has an r.c.f. (U, V ) over M(R∞ ). Specifically, U ∈
m×(n−m)
R∞ , V ∈ R∞
(n−m)×(n−m)
, and
P (z) = U (z)[V (z)]−1 ∀z . (6.6.8)
Combining (6.6.8) and (6.6.9) shows that
D (z−1 )U (z) + N (z−1 )V (z) = 0 ∀z . (6.6.9)
Now define
U
W = ∈ R∞
n×(n−m)
, (6.6.10)
V
and express (6.6.9) in the form
A (z−1 )W (z) = 0 ∀z . (6.6.11)
Clearly W has full column rank as an element of M(R∞ ). Hence, by Fact 6.2.2, W has an inner-
outer factorization Wi Wo , where Wi ∈ R∞n×(n−m)
and Wo ∈ R∞ (n−m)×(n−m)
. Multiplying (6.6.11) by
−1
[Wo (z)] gives
A (z−1 )Wi (z) = 0 ∀z . (6.6.12)
It is claimed that the choice B = Wi makes = [A B] inner (and of course square). Note that
−1
A (z )A(z) A (z−1 )B(z)
(z−1 )(z) = = I ∀z , (6.6.13)
B (z−1 )A(z) B (z−1 )B(z)
since A, B are both inner and satisfy (6.6.12). 2

Now we come to the main result of this section. To be specific, suppose F ∈ R∞ n×m
,G ∈
n×l
R∞ ,H ∈ R∞ , and that n > l, k < m; the cases n ≤ l and/or k ≥ m are simpler and the requisite
k×m
modifications will be obvious. Recall that the problem is to select R ∈ Rl×k

∞ so as to minimize
J (R) = F − GRH ∞ . (6.6.14)
The first step in the solution procedure is to show that the above problem can be reduced to
that of minimizing
A−Q B
, (6.6.15)
C D ∞
by a suitable choice of Q ∈ Rl×k

∞ , where A, B, C, D are L∞ (not H∞ !) matrices of appropriate
dimensions. For this purpose, first factor G as Gi Go , where Gi ∈ R∞
n×l
is inner and Go ∈ Rl×l
∞ is
outer. Then, using Lemma 6.6.2, construct a square inner matrix ∈ R∞ containing Gi as its first
n×n
l columns. Then,
Go
G= . (6.6.16)
0(n−l)×l
Similar reasoning applied to H ∈ R∞
m×k
shows that there exist an outer matrix Ho ∈ Rk×k
∞ and a
square inner matrix ∈ R∞m×m
such that
H = [Ho 0k×(m−k) ] . (6.6.17)
Hence,

Go
J (R) = F − R[Ho 0] ∞
0

∗ Go

= F − ∗
R[Ho 0]∞ . (6.6.18)
0
Now define
A B
= ∗ F ∗ ∈ M(L∞ ) , (6.6.19)
C D
and let Q denote the “free” parameter Go RHo (recall that both Go and Ho are units of M(H∞ ) by
virtue of the rank conditions on G and H ). Then the cost function becomes

A−Q B
. (6.6.20)
C D ∞
To proceed further, we require another very simple result:
Lemma 6.6.3 Suppose W = [X Y ] ∈ M(L∞ ) is rational. Then W ∞ ≥ Y ∞ . Further, if γ >

Y ∞ , then
W ∞ ≤ γ ⇐⇒ (γ 2 I − Y Y ∗ )−1/2 X∞ ≤ 1 , (6.6.21)
where Y ∗ (z) = Y (z−1 ), and (γ 2 I − Y Y ∗ )−1/2 denotes the spectral factor of γ 2 I − Y Y ∗ .
Proof. It is obvious that W ∞ ≥ Y ∞ . Moreover, if γ > Y ∞ , then (γ 2 I − Y Y ∗ )−1/2 exists

and is a unit of M(R∞ ), by Lemma 6.6.1; hence, (γ 2 I − Y Y ∗ )−1/2 is well-defined.
By definition,
W ∞ = max σ̄ (W (ej θ )) . (6.6.22)
θ∈[0,2π]
Hence,
W ∞ ≤ γ ⇐⇒ σ̄ (W (ej θ )) ≤ γ ∀θ ∈ [0, 2π ]
⇐⇒ (W (ej θ )W ∗ (ej θ ) ≤ γ 2 I ∀θ
⇐⇒ [XX ∗ + Y Y ∗ ](ej θ ) ≤ γ 2 I ∀θ
⇐⇒ (XX ∗ )(ej θ ) ≤ γ 2 I − (Y Y ∗ )(ej θ ) ∀θ . (6.6.23)
This leads immediately to (6.6.21). 2

A parallel result to Lemma 6.6.3 is the following.
Lemma 6.6.4 Suppose

X
W = ∈ M(L∞ ) , (6.6.24)
Y
is rational. Then W ∞ ≥ Y ∞ . Further, if γ > Y ∞ , then
W ∞ ≤ γ ⇐⇒ X(γ 2 I − Y ∗ Y )−1/2 ∞ ≤ 1 . (6.6.25)
Now let us return to the problem of minimizing the cost function in (6.6.20). Let γ be a
current guess of the minimum, and let K denote [C D]. Then, from Lemma 6.6.4,

A−Q B
min ≤γ , (6.6.26)
Q∈M(H∞ ) C D ∞
if and only if
min [A − Q B](γ 2 I − K ∗ K)−1/2 ≤ 1 . (6.6.27)
Q∈M(R∞ ) ∞
Now note that (γ 2 I − K ∗ K)−1/2 is a unit of M(R∞ ). Hence, the product Q(γ 2 I − K ∗ K)−1/2
can be replaced by another free parameter R. If we define
A1 = A(γ 2 I − K ∗ K)−1/2 ∈ M(L∞ ) , (6.6.28)

∗ −1/2
B1 = B(γ I − K K)
2
∈ M(L∞ ) , (6.6.29)
then (6.6.27) is equivalent to
min [A1 − R B1 ]∞ ≤ 1 . (6.6.30)

R∈M(H∞ )
Next, from Lemma 6.6.3, (6.6.30) holds if and only if
min (I − B1 − B1∗ )−1/2 (A1 − R)∞ ≤ 1 . (6.6.31)

R∈M(H∞ )
Once again, it is possible to replace (I − B1 − B1∗ )−1/2 R by another free parameter S ∈ M(H∞ ).
If we define
A2 = (I − B1 − B1∗ )−1/2 A1 ∈ M(L∞ ) , (6.6.32)
then (6.6.31) is equivalent to
min A2 − S∞ ≤ 1 . (6.6.33)
S∈M(H∞ )
The problem of finding the minimum in (6.6.33) is one of finding the closest matrix in
M(H∞ ) to a given matrix in M(L∞ ). This can be reduced to the problem studied in Section 6.5,
namely minimizing a cost function of the form (6.5.22). This can be accomplished as follows: Since
A2 ∈ M(L∞ ) and is rational, it has an l.c.f. (T , U ) over M(R∞ ). Now T is square and full rank in
M(R∞ ), and therefore has a factorization To Ti , where To is a unit of M(R∞ ) and Ti is inner. Hence,
A2 (z) = [T (z)]−1 U (z)

= [Ti (z)]−1 [To (z)]−1 U (z)
= [Ti (z)]−1 V (z) , (6.6.34)
where V = To−1 U ∈ M(R∞ ). Now
A2 − S∞ = Ti−1 V − S∞

= Ti (Ti−1 V − S)∞
= V − Ti S∞
adj
= Ti V − bS∞ , (6.6.35)
where b = |Ti |. This is the type of objective function covered by Theorem 6.5.3.
The preceding discussion can be summarized in the following iterative procedure for com-
puting the minimum of the function in (6.6.20).
Step 1 Choose

B
γ ≥ max {[C D]∞ , ∞ } . (6.6.36)
D
Step 2 Form
L = (γ 2 I − K ∗ K)−1/2 , (6.6.37)
where K = [C D].
Step 3 Form
M = (I − BLL∗ B ∗ )−1/2 . (6.6.38)
lf BL∞ ≥ 1 , increase γ and go to Step 2; otherwise go on to Step 4.

Step 4 Compute
J∗ = min MAL − S∞ . (6.6.39)
S∈M(H∞ )
If J ∗ < 1, decrease γ and go to Step 2. If J ∗ > 1, increase γ and go to Step 2. If γ = 1, go on to

Step 5.
Step 5 Set Q = M −1 SL−1 , where S achieves the minimum in (6.6.39). This choice of Q minimizes
the original objective function in (6.6.20).
6.7. TWO-PARAMETER COMPENSATOR 199
6.7 TWO-PARAMETER COMPENSATOR
The objective of this section is to discuss the relative merits of a two-parameter compensator in
filtering and sensitivity minimization applications, compared to a one-parameter compensator. The
conclusions can be briefly stated as follows: In problems of disturbance rejection, both schemes are
equivalent. In tracking problems involving a stable plant, again both schemes are equivalent. Finally,
in tracking problems with an unstable plant, whether one uses the H2 -norm or the H∞ -norm, there
exist plants for which the optimum achievable with a two-parameter scheme is as small as one wishes
compared to the one-parameter optimum.
To avoid purely technical details caused by the nonexistence of optimal compensators in the
continuous-time case, the discussion below is addressed to the discrete-time case. However, with
the aid of Lemmas 6.3.2, 6.4.1, and 6.5.1, this discussion can be translated to the continuous-time
case.
Accordingly, suppose P is a given multivariable plant with transfer matrix P (z), where z is
the transfer function of a unit delay. Suppose u1 is a reference input to be tracked, while u2 and u3
are sensor noises to be rejected. Let W1 , W2 , W3 , W4 be given weighting matrices in M(H∞ ). It
is assumed that (i) P has no poles on the unit circle, (ii) W1 , W3 are outer, (iii) W2 , W4 are square
outer, and (iv) P (ej θ ), W1 (ej θ ), · · · , W4 (ej θ ) all have full row rank at all θ ∈ [0, 2π ].
For convenience, a two-parameter compensation scheme is reproduced in Figure 6.2. The
u2
+
u1 ~ + z1 ~ y1
Nc1 Dc—1 D —1
– +
u3 z2
+
~ y2
Nc2 N
+
Figure 6.2: Two-Parameter Compensation Scheme.
optimal tracking problem is to minimize either

J2 = W1 (I − Hyu1 )W2 2 , (6.7.1)
or
J∞ = W1 (I − Hyu1 )W2 ∞ , (6.7.2)
by means of a suitable stabilizing compensator. The optimal disturbance rejection problem is to select
a stabilizing compensator that minimizes either
K21 = W3 Hyu2 W4 2 or K22 = W3 Hvu3 W4 2 , (6.7.3)
or
K∞1 = W3 Hyu2 W4 ∞ or K∞2 = W3 Hvu3 W4 ∞ . (6.7.4)
From the discussion in Section 5.6, it is at once clear that, as far as minimizing
K21 , K22 , K∞1 , K∞2 , there is no advantage to using a two-parameter scheme instead of a one-
parameter scheme. This is because the set of achievable stable transfer matrices Hyu2 , Hvu3 are
exactly the same with either scheme. But with the optimal tracking problem, the situation is quite
different. The minimum J2 or J∞ achievable with a two-parameter scheme is no larger than with
a one-parameter scheme, and can be arbitrarily smaller, depending on the plant. Moreover, with
the two-parameter scheme, it is possible simultaneously to minimize J2 and K21 , (or J2 and K22 , or
J∞ and K∞1 or any other combination). This is the overwhelming advantage of the two-parameter
scheme. To justify the latter statement first, recall from (5.6.15) that the set of achievable stable
transfer matrices from (u1 , u2 , u3 ) to y are parametrized by
[NQ N(Y − R Ñ) − N(X + R D̃)] , (6.7.5)
where (N, D), (D̃, Ñ ) are any r.c.f. and l.c.f. of P , X, Y ∈ M(H∞ ) satisfy XN + Y D = I , and
Q, R ∈ M(H∞ ) are arbitrary. Thus, to minimize J∞ and K22 simultaneously, for example, one
chooses Q to minimize
J∞ = W1 (I − N Q)W2 ∞ , (6.7.6)
and R to minimize
K22 = W3 (I − N(X + R D̃))W4 2 . (6.7.7)
The associated compensator
Ñc1 = Q, Ñc2 = X + R D̃, D̃c = Y − R Ñ , (6.7.8)
minimizes the weighted tracking error and optimally rejects the weighted disturbance.
From (5.6.16), the set of stable transfer matrices Hyu1 , that can be achieved using a one-
parameter compensator is {N(X + R D̃) : R ∈ M(H∞ )}. Let us define
∗
J∞ = min W1 (I − N Q)W2 ∞ , (6.7.9)
Q∈M(H∞ )
J2∗ = min W1 (I − N Q)W2 2 , (6.7.10)
Q∈M(H∞ )
h∗∞ = min W1 (I − N(X + R D̃))W2 ∞ , (6.7.11)

R∈M(H∞ )
h∗2 = min W1 (I − N(X + R D̃))W2 2 . (6.7.12)

R∈M(H∞ )
Then, J∞ ∗ , J ∗ (resp. h∗ , h∗ ) are the best weighted tracking errors achievable with a two-parameter
2 ∞ 2
(resp. one-parameter) compensation scheme. Clearly J∞ ∗ ≤ h∗ , J ∗ ≤ h∗ . If P is a stable plant, then
∞ 2 2
one can choose N = P , X = 0, D̃ = I , in which case it is immediate that J2∗ = h∗2 , J∞ ∗ = h∗ . On
∞
∗ ∗ ∗ ∗
the other hand, there exist plants for which the ratios J∞ / h∞ or J2 / h2 are as small as one wishes.
To set up these examples, let us study the computation of the various minima in (6.7.9)–(6.7.12).
Consider first the constants J∞∗ , h∗ and restrict attention to the scalar case for ease of discussion. In
∞
this case we can define a single weighting function w = w1 w2 . Also, factor n as ni no , where ni , no
are respectively inner and outer functions. Then
∗
J∞ = min (1 − nq)w∞ . (6.7.13)
q∈H∞
At this stage it can be assumed without loss of generality that w is outer: If w = wi wo is an

inner-outer factorization of w, then multiplication by wi does not affect the norm. Also, since
w(ej θ ) = 0 ∀θ ∈ [0, 2π ], w is actually a unit of H∞ . Therefore
∗
J∞ = min w − ni no qw∞
q∈H∞
= min w − ni q∞ , (6.7.14)

q∈H∞
after absorbing the units no and w into q. Thus, the problem is to find a function f of minimum
H∞ -norm such that f interpolates the values of w at the zeros of ni (i.e., at the plant zeros inside
the unit disc).
Now consider the computation of h∗∞ . By definition,
h∗∞ = min [1 − n(x + rd)]w∞
r∈H∞
= (y − rn) dw∞ . (6.7.15)
Factor d as di do where di , do are respectively inner and outer, and absorb all units into the free
parameter r. This leads to
h∗∞ = min [ydo w − ni r]di ∞
r∈H∞
= min ydo w − ni r∞ , (6.7.16)
r∈H∞
since multiplication by di preserves norms. Thus, the problem is to find a function g of minimum
H∞ -norm that interpolates the values of ydo w at the zeros of ni . This can be restated in a more
transparent way. Suppose λ is a zero of ni . Then
ni (λ) = 0 ⇒ (nx)(λ) = 0 ⇒ (yd)(λ) = 1
⇒ (ydo )(λ) = 1/di (λ) . (6.7.17)
Thus, g must interpolate the values of w/di at the zeros of ni , while f interpolates the values of w
at the zeros of ni . Since |di (z)| < 1 for all z in the open unit disc, we see that h∗∞ is in general larger
than J∞ ∗ . In fact, if p has a pole-zero pair close together inside the unit disc, then |d (λ)| will be
i
very small at some zero λ of ni , and h∗∞ will be considerably larger than J∞ ∗.
Example 6.7.1 Let w = 1, and let

z − 0.5 − ε
p(z) = .
z − 0.5
Then
z − 0.5 − ε
n = z − 0.5 − ε, ni = ,
1 − (0.5 + ε)z
z − 0.5
d = z − 0.5, di = .
1 − 0.5z
Since ni has only one zero, at z = 0.5 + ε,
∗
J∞ = |w(0.5 + ε)| = 1,
h∗∞ = |(w/di )(0.5 + ε)| = 1/|di (0.5 + ε)|
0.75 − 0.5ε 0.75
= ≈ .
ε ε
∗ / h∗ can be made as small as one desires.
By choosing ε sufficiently small, the ratio J∞ ∞
Now consider the computation of J2∗ and h∗2 . Recall from Section 6.2 that the space L2 [0, 2π ]
can be decomposed as the direct sum of the orthogonal subspaces H+ − +
2 and H2 , where H2 consists of
the boundary values of H2 -functions. Given f ∈ L2 [0, 2π ], let f+ (resp. f− ) denote its projection
onto H+ − − ∗ −1 −1
2 (resp. H2 ). Finally, if f ∈ H2 , then f denotes the function z → z f (z ), while if
F ∈ M(H∞ ), then Fo (resp. Fi ) denotes the outer (resp. inner) part of F .
For ease of discussion, suppose W1 = I , and denote W2 by W . Also, suppose W is a unit of
M(H∞ ), without loss of generality. Then
J2∗ = min (I − N Q)W 2

Q∈M(H∞ )
= min W − Ni Q2 , after absorbing No , W into Q
Q∈M(H∞ )
= min Ni∗ W − Q2
Q∈M(H∞ )
= (Ni∗ W )− 2 , (6.7.18)
corresponding to the choice Q = (Ni∗ W )+ . On the other hand,
h∗2 = min (I − N(X + R D̃))W 2 . (6.7.19)

R∈M(H∞ )
To bring out the connection between h∗2 and J2∗ more clearly, select X̃, Ỹ ∈ M(H∞ ) such that

Y X D −X̃ D −X̃ Y X
= =I . (6.7.20)
−Ñ D̃ N Ỹ N Ỹ −Ñ D̃
This is always possible, by Theorem 4.1.16. If we let (D̃W )o denote the outer part of D̃W , then
h∗2 = min (Ỹ − NR)D̃W 2
R∈M(H∞ )
= min Ỹ (D̃W )o − NR(D̃W )o 2

R∈M(H∞ )
= min Ni∗ Ỹ (D̃W )o − No R(D̃W )o 2

R∈M(H∞ )
= min Ni∗ Ỹ (D̃W )o − R2

R∈M(H∞ )
= [Ni∗ Ỹ (D̃W )o ]− 2 . (6.7.21)

To compare (6.7.18) and (6.7.21), note that
N X + Ỹ D̃ = I , (6.7.22)
Ni∗ W = Ni∗ (N X + Ỹ D̃)W = No XW + Ni∗ Ỹ D̃W . (6.7.23)
However, since No XW ∈ M(H2 ),
(Ni∗ W )− = [Ni∗ Ỹ D̃W ]− . (6.7.24)
So if we define F = Ni∗ Ỹ D̃W ∈ M(L2 ) then
J2∗ = F− 2 , h∗2 = [F (D̃W )∗i ]− 2 . (6.7.25)
This implies that h∗2 , is in general larger than J2∗ .
Example 6.7.2 Consider the scalar case; let w = 1, and let
zm
p(z) = ,
1 − 2z
where m is an integer. Then we can choose
n = zm , d = 1 − 2z,
x=2 , m
y = 1 + 2z + 4z2 + · · · + 2m−1 zm−1 .
In this case n = ni since n is already inner. Hence, we have
ni = zm , n∗i = z−m .
z − 0.5 1 − 0.5z
di = , di∗ = .
1 − 0.5z z − 0.5
f = n∗i yd = n∗i (1 − nx) = n∗i − x = z−m − 2m .
J2∗ = f− 2 = z−m 2 = 1 .
f di∗ = (z−m − 2m ) di∗ ,
h∗2 = (f di∗ )− 2 = (z−m di∗ )− − (2m di∗ )− 2
≥ 2m (di∗ )− 2 − (z−m di∗ )− 2 .
Now expand di (z) in a Fourier series
∞

di (z) = ci z i ,
i=0
where
c0 = −0.5, ci = 1.5(0.5)i for i ≥ 1 .
Then, since di is inner,

∞

1 = di 22 = ci2 .
i=0
Therefore,
∞

ci2 = 0.75 .
i=1
Since,
∞

di∗ (z) = ci z−i ,
i=0
∞

(di∗ )− = ci z−i ,
i=1
we see that
∞

(di∗ )− 22 = ci2 = 0.75 .
i=1
On the other hand, if m ≥ 1, then we have

∞

(z−m di∗ )− = z−m di∗ = ci z−(i+m) ,
i=0
(z−m di∗ )− 2 = di∗ 2 = 1 .
Hence,
√
h∗2 ≥ 2m−1 · 3−1.
By choosing the integer m sufficiently large, the ratio J2∗ / h∗2 can be made as small as one desires.
NOTES AND REFERENCES
Section 6.2 contains only the barest minimum of the theory of Hardy spaces. For excellent treatments
of this topic, see Duren [34], Koosis [59] or Sz. Nagy and Foias [69].
Solving the filtering problem of Section 6.3 was the original motivation of Youla, Jabr and
Bongiorno [111] in parametrizing the set of all compensators that stabilize a given plant. Kucera [60]
gives a solution to the discrete-time filtering problem that is similar in its approach to that of [111].
The treatment given here follows Francis [36].
The sensitivity minimization problem was formulated by Zames [113] and solved in the scalar
case by Zames and Francis [40, 115].The treatment here is similar to that in [36]. In the multivariable
case, the fat plant case was solved by Chang and Pearson [12] and Francis, Helton, and Zames [38].
Our treatment is adapted from the former. The general sensitivity minimization problem was solved
by Doyle [31]. Note that it is possible to carry out all of the steps of Sections 6.4 to 6.6 using
purely state-space methods; see [16, 31, 44]. The paper of Sarason [86] contains some mathematical
results pertinent to the sensitivity minimization problem. The Nevanlinna-Pick algorithm used in
Section 6.5 is taken from the paper by Delsarte, Genin, and Kamp [18], but this paper treats only
the square matrix case. Finally, Potapov [75] contains several useful results on matrices of analytic
functions, such as Lemma 6.5.6.
207
CHAPTER 7
Robustness
This chapter begins with a detailed study of the issue of the robustness of feedback stability in
the presence of plant and/or compensator perturbations. Both the nature and the extent of the
perturbations that maintain feedback stability are discussed. Then the problem of robust regulation
(i.e., stabilization plus tracking and/or disturbance rejection) is examined, for both the one-parameter
as well as the two-parameter compensation schemes. Finally, various generic properties of control
systems are studied.
7.1 INTRODUCTION
Consider the standard feedback system shown in Figure 7.1, and suppose the compensator C stabi-
lizes the plant P , i.e., that the closed-loop transfer matrix

(I + P C)−1 −P (I + CP )−1
H (P , C) = , (7.1.1)
C(I + P C)−1 (I + CP )−1
belongs to M(S). The central question studied in this chapter is: What sort of perturbations can be
u2
+
u1 e1 y1 e2 y2
C P
+ +
–
Figure 7.1: Feedback System.
permitted in P and/or C without destroying feedback stability? This question is very relevant in any
practical application of the controller synthesis theory developed in the previous two chapters. The
first step in designing a compensator for any given plant is to obtain a mathematical model for it. As
this process always involves some modeling errors, simplifying assumptions, etc., the “true” plant is
in general different from the nominal model used in designing a compensator. In the same vein, one
can normally expect some errors or simplifications in implementing the compensator, so that the
“true” compensator is not necessarily the same as the designed compensator. These phenomena can
be captured by supposing that the true plant and compensator are perturbed versions of the nominal
208 7. ROBUSTNESS
plant and compensator respectively. Hence, there is interest in obtaining characterizations of both
the nature and the extent of the permissible perturbations.
Before attempting to characterize permissible perturbations, we should first decide on a rea-
sonable definition of what constitutes a permissible perturbation. Suppose (P0 , C0 ) is a nominal
plant-compensator pair which is stable, so that H (P0 , C0 ) ∈ M(S), and suppose P0 , C0 are per-
turbed to P , C, respectively. In order for (P , C) to be a permissible perturbation of (P0 , C0 ), we
require as a minimum that (P , C) also be a stable pair. But this is not enough. The objective of
compensator design is not merely to stabilize a plant but also to improve its response (this is why
compensation is applied even to stable plants in some instances). Hence (P , C) is deemed to be
a permissible perturbation of (P0 , C0 ) if (P , C) is stable, and in addition the closed-loop transfer
matrix H (P , C) is “close” to H (P0 , C0 ).
There are two aspects to the study of robustness. The first is qualitative, and pertains to the
nature of the permissible perturbations. Mathematically, this is the problem of defining a topology on
the set of plant-compensator pairs.The second is quantitative, and is concerned with the extent of the
permissible perturbations. This is the problem of defining a metric on the set of plant-compensator
pairs. Each has its own significance, which is discussed at the appropriate place.
In the study of robustness, a very helpful feature is that the map from the pair (P , C) to the
closed-loop transfer matrix H (P , C) is one-to-one.That is, given H (P , C) ∈ M(S), one can uniquely
determine P and C, by proceeding as follows: Define

C 0 0 I
G= , F = . (7.1.2)
0 P −I 0
Then it is easy to verify (as in Section 5.1) that H (P , C) = (I + F G)−1 . Solving this relationship
for G in terms of H (P , C) gives

C 0
G= = F [I − (H (P , C))−1 ] , (7.1.3)
0 P
where we use the fact that F −1 = −F . Now suppose that (P0 , C0 ) is a given stable pair corresponding
to the nominal plant and compensator, and let B denote a subset of M(S) containing H (P0 , C0 ),
representing the acceptable perturbations of the closed-loop transfer matrix resulting from changes
in the plant and/or compensator. Now the set
{G ∈ M(R(s)) : G = F (I − H )−1 for some H ∈ B} , (7.1.4)
certainly contains all pairs (P , C) such that H (P , C) ∈ B. But there is a flaw in this argument: If
H is an arbitrary element of B, then G = F (I − H )−1 need not be block-diagonal and therefore,
need not correspond to a plant-compensator pair. Of course, the set of block-diagonal matrices of
the form (7.1.4) is precisely the set of permissible plant and compensator perturbations. However,
this set is very difficult to characterize explicitly. Hence we are forced to seek alternate approaches.
This is what is done in the next few sections.
7.2. GRAPH TOPOLOGY 209
7.2 GRAPH TOPOLOGY
Suppose it is desired to design a stabilizing compensator for a plant that is only specified imprecisely.
Suppose further that the plant uncertainty can be modeled by considering a family of plants Pλ , where
the uncertainty parameter λ assumes values in some topological space . Typically λ represents some
physical parameters of the plant, such as component values, and the function λ → Pλ captures the
manner in which this uncertainty in the physical parameters is reflected in the plant description. The
parameter space may be Rn for some integer n, or a more general normed space, or any topological
space. An example of the latter situation arises in the design of compensators for singularly perturbed
systems, where = [0, ∞). For technical reasons, it is assumed that is a first-countable topological
space (see Section C.1 for the definition). As all metric spaces are first-countable, this poses no
practical restriction. Let Pλ0 be the nominal plant description on which the compensator design
is based. The question is: When is it possible to design a stabilizing compensator for Pλ0 that is
“robust” against the plant uncertainty? To answer this question, it is necessary first to define robust
stabilization. The definition adopted here is that a compensator1 C ∈ S(Pλ0 ) robustly stabilizes the
family of plants Pλ if (i) there exists a neighborhood N of λ in such that C ∈ S(Pλ ) ∀λ ∈ N, and
(ii) the stable closed-loop transfer matrix H (Pλ , C) is continuous in λ at λ0 . The first condition
means that C stabilizes not only the nominal plant Pλ0 but also all “perturbed” plants Pλ for all λ,
sufficiently near λ0 . But this alone is not enough: The objective of compensator design is not merely
to stabilize a plant but also to shape the stable closed-loop response. Thus, the second condition
is imposed, requiring that the perturbed (stable) closed-loop response H (Pλ , C) be close to the
nominal (stable) closed-loop response H (Pλ0 , C) whenever λ0 is near λ. Note that the question as
posed above is purely qualitative, in that one is not concerned about the “size” of the neighborhood
N, but merely about its existence. The quantitative aspects of robustness are studied in the next two
sections.
The question studied (and answered) in this section is: When does there exist a robustly
stabilizing compensator for a given family Pλ of plants? The answer involves the introduction of an
appropriate topology on the set M(R(s)) of possibly unstable plants, called the graph topology. In
terms of the graph topology, the answer to the above question turns out to be exceedingly simple
and satisfactory: Robust stabilization is possible if and only if the function λ → Pλ is continuous in the
graph topology at λ = λ0 . In other words, robust stabilization is possible if and only if any variations
in the parameter λ from its nominal value λ0 affect the plant description in a continuous manner,
where the continuity of Pλ is judged in the context of the graph topology.
In this section, the graph topology is defined, and some of its properties are proved, including
the above-stated result on robust stabilizability. Then several examples are given with a view to
developing some insight into the graph topology. One of these “examples” contains a fairly substantial
application of these ideas to the design of compensators for singularly perturbed systems.
As the name implies, the graph topology has to do with the graph of a plant, and we begin
with this notion.
1 Recall that S(P ) is the set of compensators that stabilize P .
210 7. ROBUSTNESS
Suppose P ∈ M(R(s)). Then P is a possibly unstable plant; that is, not all bounded inputs
applied to P need result in bounded outputs. But undoubtedly some (though not necessarily all)
bounded inputs applied to P will produce bounded outputs. For example, suppose p(s) = 1/(s − 1);
then it is intuitively clear that a bounded input u(·) whose Laplace transform has a zero at s = 1 will
produce a bounded output from p. The graph of a plant P is just the set of bounded input-output
pairs corresponding to P . Precisely, if P ∈ R(s)n×m , then
GS (P ) = {(u, y) ∈ Sn+m : y = P u} , (7.2.1)
is the graph of P with respect to S.
Lemma 7.2.1 Suppose P ∈ R(s)n×m , and let (N, D) be any r.c.f. of P . Then
GS (P ) = {(Dz, N z) : z ∈ Sm } . (7.2.2)
Remarks 7.2.2 This lemma shows that the graph of a plant P can be very simply parametrized in
terms of any r.c.f. (N, D) of P . As the free parameter z varies over Sm , the pair (Dz, N z) generates
the set of all “bounded” input-output pairs corresponding to P .
As in [26, 81], let L∞ denote the set of all measurable essentially bounded functions mapping
[0, ∞) into the real numbers, and let L2 denote the set of all measurable square-integrable functions
mapping [0, ∞) into R. Then one can also define the graph of P with respect to the spaces L∞ and
L2 in a manner entirely analogous with (7.2.1), namely
GL∞ (P ) = {(u, y) ∈ L∞n+m

: ŷ = P û} , (7.2.3)
GL2 (P ) = {(u, y) ∈ L2n+m : ŷ = P û} , (7.2.4)
where the hat denotes Laplace transformation. It can be shown [95] that these graphs can also be
simply characterized in terms of an r.c.f. of P . Specifically, if (N, D) is any r.c.f. of P , then
GL∞ (P ) = {(Dz, N z) : z ∈ Lm
∞} , (7.2.5)
GL2 (P ) = {(Dz, N z) : z ∈ Lm
2}, (7.2.6)
where D and N are now interpreted as bounded operators on L∞ or L2 . But technically Lemma 7.2.1
is easier to prove than (7.2.5) or (7.2.6).
Proof. Suppose u = Dz for some z ∈ Sm .Then P u = ND −1 Dz = Nz ∈ Sn . Hence, for all z ∈ Sm ,
the pair (Dz, Nz) belongs to GS (P ). It remains to show that every point (u, y) ∈ GS (P ) is of the
form (Dz, Nz) for some z ∈ Sm . Since N and D are right-coprime, one can select X, Y ∈ M(S)
such that XN + Y D = 1. Now suppose (u, y) ∈ GS (P ), i.e., suppose u ∈ Sm , y ∈ Sn and y = P u.
Define z = D −1 u
∈ [R(s)]m .
Then clearly (u, y) = (Dz, N z). The proof is complete if it can be
shown that z actually belongs to Sm . Note that
z = D −1 u = (XN + Y D)D −1 u = XND −1 u + Y u
= Xy + Y u ∈ Sm , (7.2.7)
since all quantities in the last expression belong to M(S). 2
Corollary 7.2.3 Suppose P ∈ Sn×m . Then

GS (P ) = {(z, P z) : z ∈ Sm } . (7.2.8)
Proof. Note that (P , I ) is an r.c.f. of P . 2

Now we are ready to define the graph topology. Suppose (N, D) is a right-coprime pair, and
that |D| = 0. Then there exists a constant μ = μ(N, D) such that (N1 , D1 ) is also a right-coprime
pair and |D1 | = 0 whenever [(N1 − N) (D1 − D) ] < μ(N, D). To see this, first select any
X, Y ∈ M(S) such that XN + Y D = I , and let μ1 = 1/[X Y ]. Then

N1 − N
XN1 + Y D1 − I < 1 whenever < μ1 . (7.2.9)
D1 − D
This shows that XN1 + Y D1 ∈ U(S), whence N1 and D1 are right-coprime. Next, since |D| = 0, the
ball B(|D|, r) does not contain 0 whenever r < |D|. Let r < |D|. Since the map D1 → |D1 |
is continuous, the set of matrices D1 such that |D1 | ∈ B(|D|, r) contains a ball B(D, μ2 ) centered
at D. Now let μ = min{μ1 , μ2 }. Then the preceding arguments show that N1 , D1 are coprime and
|D1 | = 0 whenever (N1 , D1 ) ∈ B((N, D), μ). Of course, the bound μ is not uniquely specified by
the above conditions, but this does not matter.
Given a plant P ∈ M(R(s)), a basic neighborhood of P is defined as follows: Let (N, D) be
any r.c.f. of P , and let ε be any positive number less than μ(N, D). Then the set

−1 N1 − N

N(N, D; ε) = {P1 = N1 D1 : < ε} , (7.2.10)
D1 − D
is a basic neighborhood of P . Thus, a basic neighborhood of P merely consists of all plants P1 which
have an r.c.f. (N1 , D1 ) that is “close” to an r.c.f. of P . To put it another way, a basic neighborhood of
P is obtained by starting with some r.c.f. (N, D) of P and making a “small” perturbation of (N, D)
to (N1 , D1 ), and then taking the ratio, N1 D1−1 . Now by varying ε over all positive numbers less than
μ(N, D), by varying (N, D) over all r.c.f.’s of P , and by varying P over all elements of M(R(s)),
we obtain a collection of basic neighborhoods.
Lemma 7.2.4 The collection of basic neighborhoods forms a base for a topology on M(R(s)).
212 7. ROBUSTNESS
Proof. In accordance with the contents of Section C.1, it is necessary to prove that axioms (B1)
and (B2) are satisfied. For convenience, these axioms are recalled below:
(B1) The collection of basic neighborhoods covers M(R(s)).
(B2) If N1 , N2 are basic neighborhoods and P ∈ N1 ∩ N2 , then there is a basic neighborhood
N such that P ∈ N ⊆ (N1 ∩ N2 ).
Of these, it is at once obvious that (B1) holds, since every P ∈ M(R(s)) belongs to some basic
neighborhood. To prove (B2), suppose

−1 N̄ − N1

N1 = {N̄ D̄ : < ε1 } , (7.2.11)
D̄ − D1

N̄ − N2
N2 = {N̄ D̄ −1 :
D̄ − D2 < ε2 } , (7.2.12)
and suppose P ∈ N1 ∩ N2 . Then P has r.c.f.’s (N̄1 , D̄1 ), (N̄2 , D̄2 ) such that

N̄1 − N1
=: r1 < ε1 , N̄2 − N2 =: r2 < ε2 . (7.2.13)
D̄1 − D1 D̄2 − D2
Moreover, there exists a U ∈ U(S) such that N̄2 = N̄1 U, D̄2 = D̄1 U . Now select a positive number
r such that r ≤ min{ε1 − r1 , ε2 − r2 /U }, and define

−1 N − N̄1

N = {ND : < r} . (7.2.14)
D − D̄1
Then N is certainly a basic neighborhood containing P . It only remains show that N ⊆ N1 ∩ N2 .

This follows readily: First,

N − N̄1
< r ⇒ N − N1 < r1 + r < ε1 , (7.2.15)
D − D̄1 D − D1
which shows that N D −1 ∈ N1 . Next,

N − N̄1 − N̄2
< r ⇒ NU < rU
D − D̄1 DU − D̄2

NU − N2
⇒
DU
< r2 + rU < ε2 , (7.2.16)
− D2
which shows that (NU )(DU )−1 = ND −1 ∈ N2 . Hence N ⊆ N1 ∩ N2 . 2

The topology on M(R(s)) defined by the collection of sets of the form (7.2.10) as base is
called the graph topology. In it, two plants P1 and P2 are “close” if they have r.c.f.’s (N1 , D1 ), (N2 , D2 )
such that (N1 , D1 ) − (N2 , D2 ) is “small.” In view of Lemma 7.2.1, this means that P1 and P2 are
“close” if their graphs are “close.” This is the reason for the name “graph topology.”
The central result of this section is that the graph topology on M(R(s)) is the weakest topology
in which feedback stability is a robust property. In order to state this result precisely, it is necessary
first to characterize the convergence of sequences and continuity of functions in the graph topology.
Suppose {Pi } is a sequence in M(R(s)) and that P ∈ M(R(s)). Then by definition (Section C.1),
{Pi } converges to P if and only if every neighborhood of P contains all but a finite number of terms
of the sequence {Pi }. Since every neighborhood of P contains a basic neighborhood of P , it is easy
to see that {Pi } converges to P if and only if every basic neighborhood of P contains all but a finite
number of terms of {Pi }. All this is a ready consequence of the definition of the graph topology
and of the definition of convergence in a topological space. The next lemma gives several equivalent
characterizations of convergence in the graph topology.
Lemma 7.2.5 Suppose {Pi } is a sequence in M(R(s)) and that P ∈ M(R(s)). Then the following
statements are equivalent:
(i) {Pi } converges to P in the graph topology.
(ii) For every r.c.f. (N, D) of P , there exist r.c.f.’s (Ni , Di ) of Pi such that Ni → N, Di → D in
M(S).
(iii) There exist an r.c.f. (N, D) of P and a sequence {(Ni , Di )} of r.c.f.’s of Pi such that Ni →
N, Di → D in M(S).
(iv) For every l.c.f. (D̃, Ñ) of P , there exists a sequence {(D̃i , Ñi )} of l.c.f.’s of Pi such that
Ñi → Ñ , D̃i → D̃ in M(S).
(v) There exist an l.c.f. (D̃, Ñ) of P and a sequence {(D̃i , Ñi )} of l.c.f ’s of Pi such that Ñi →
Ñ , D̃i → D̃ in M(S).
Proof. (i) ⇐⇒ (ii). This is a ready consequence of the definition of the graph topology and of the
discussion in the paragraph preceding the lemma. Suppose {Pi } converges to P . Then every basic
neighborhood of P contains all but a finite number of terms in the sequence {Pi }. In particular, let
(N, D) be any r.c.f. of P and let εi be any sequence of positive numbers converging to zero. (Assume
without loss of generality that εi < μ(N, D) ∀i.) Define

−1 N̄ − N

Ni = {N̄ D̄ : < εi } . (7.2.17)
D̄ − D
Then for each integer k there exists an integer jk such that Pi ∈ Nk ∀i ≥ jk . That is, there exist
r.c.f.’s (Ni , Di ) of Pi such that

Ni − N

Di − D < εk ∀i ≥ jk . (7.2.18)
From this, one can conclude that

Ni − N

Di − D → 0 . (7.2.19)
214 7. ROBUSTNESS
i.e., that Ni → N, Di → D in M(S). This shows that (i) ⇒ (ii). The converse implication follows
just as easily. Suppose (ii) is true and let N be any basic neighborhood of P . Then N is of the
form (7.2.10) for some r.c.f. (N, D) of P and some ε > 0. By (ii), there exists a sequence {(Ni , Di )}
of r.c.f.’s of Pi such that (Ni , Di ) → (N, D) in M(S). Hence [Ni Di ] − [N D ] < ε for large
enough i, i.e., Pi ∈ N for large enough i. Since this is true for all basic neighborhoods of P , it
follows that {Pi } converges to P .
(ii) ⇒ (iii). Obvious.
(iii) ⇒ (ii). Suppose (N, D) is an r.c.f. of P with the property that there exists a sequence
{(Ni , Di )} of r.c.f.’s of Pi such that Ni → N, Di → D. Let (N̄, D̄) be any other r.c.f. of P . It is
shown that there exists a sequence {(N̄i , D̄i )} of r.c.f.’s of Pi such that N̄i → N̄, D̄i → D̄. Since
(N, D), (N̄ D̄) are both r.c.f.’s of P , there exists a U ∈ U(S) such that N̄ = NU, D̄ = DU . Now
define N̄i = Ni U, D̄i = Di U . Then N̄i → N̄, D̄i → D̄. Thus, (ii) holds.
(iv) ⇐⇒ (v). The proof is entirely analogous to that of (ii) ⇐⇒ (iii) above.
Thus far, it has been shown that (i), (ii), (iii) are all equivalent, and that (iv) and (v) are
equivalent. The proof of the lemma is completed by showing that (iii) ⇐⇒ (v).
(iii) ⇒ (v). By hypothesis, there exist an r.c.f. (N, D) of P and a sequence of r.c.f.’s {(Ni , Di )}
of Pi such that Ni → N, Di → D. Now select an l.c.f. (D̃, Ñ) of P and matrices X, Y, X̃, Ỹ ∈ M(S)
such that
Y X D −X̃
=I . (7.2.20)
−Ñ D̃ N Ỹ
This can always be done, in view of Theorem 4.1.16. Since (Ni , Di ) → (N, D) and since U(S) is
an open subset of M(S), it follows that

Di −X̃
Fi = ∈ U(S) for large enough i . (7.2.21)
Ni Ỹ
Now let Gi = Fi−1 ∈ M(S) for large enough i, and partition Gi as

Yi Xi
Gi = . (7.2.22)
−Ñi D̃i
Then Ñi , D̃i are left-coprime (for large enough i), since Ñi X̃ + D̃i Ỹ = I . Further, Ñi Di + D̃i Ni =
0 implies that D̃i−1 Ñi = Ni Di−1 = Pi .2 Hence (D̃i , Ñi ) is an l.c.f. of Pi . Finally, since inversion of
unimodular matrices is continuous, we have that

Y X
Gi → (lim Fi )−1 = . (7.2.23)
i −Ñ D̃
This shows that (D̃i , Ñi ) → (D̃, Ñ), which is an l.c.f. of P . This is (v).
(v) ⇒ (iii). This is entirely analogous to the above. 2
2 It still has to be shown that D̃ is nonsingular for this argument to be valid. However, this is straight-forward; see the proof of
i
Lemma 8.1.17.
The next result characterizes the continuity of functions mapping into M(R(s)). It is a ready
consequence of Lemma 7.2.5.
Corollary 7.2.6 Suppose is a first-countable topological space, and that λ → Pλ is a function mapping
into M(R(s)). Then the following are equivalent:
(i) The function λ → Pλ is continuous at λ = λ0 when M(R(s)) is equipped with the graph
topology.
(ii) For each r.c.f. (N0 , D0 ) of Pλ0 , there exists a family of r.c.f.’s (Nλ , Dλ ) of Pλ such that
(Nλ0 , Dλ0 ) = (N0 , D0 ), and such that Nλ , Dλ are continuous (in M(S)) at λ = λ0 .
(iii) There exists a family (Nλ , Dλ ) of r.c.f.’s of Pλ such that Nλ , Dλ are continuous (in M(S)) at
λ = λ0 .
(iv) For each l.c.f. (D̃0 , Ñ0 ) of Pλ0 , there exists a family (D̃λ , Ñλ ) of l.c.f.’s of Pλ such that
(D̃λ0 , Ñλ0 ) = (D̃0 , Ñ0 ) and such that (D̃λ , Ñλ ) are continuous (in M(S)) at λ = λ0 .
(v) There exists a family (D̃λ , Ñλ ) of l.c.f.’s of Pλ such that D̃λ , Ñλ are continuous (in M(S)) at
λ = λ0 .
Now we come to the main result of this section. It is rather wordy in the interests of precision,
but can be loosely stated as follows: A family of plants {Pλ } can be robustly stabilized by a compensator
C ∈ S(Pλ0 ) if and only if the function λ → Pλ is continuous at λ = λ0 in the graph topology: further,
if λ → Pλ is continuous at λ0 , then every C ∈ S(Pλ0 ) robustly stabilizes the family {Pλ }.
Theorem 7.2.7 Suppose λ → Pλ is a function mapping a first-countable topological space into

M(R(s)).
(i) Suppose the function λ → Pλ is continuous at λ = λ0 when M(R(s)) is equipped with
the graph topology. Let C be any element of S(Pλ0 ). Then there exists a neighborhood N of λ0 such
that C ∈ S(Pλ ) ∀λ ∈ N, and moreover H (Pλ , C) is continuous (in M(S)) as a function of λ at λ0 .
(ii) Conversely, suppose there exists a C ∈ S(Pλ0 ) such that C ∈ S(Pλ ) for all λ in some
neighborhood N of λ0 , and that H (Pλ , C) is continuous (in M(S)) as a function of λ at λ0 . Then
the function λ → Pλ is continuous when M(R(s)) is equipped with the graph topology.
Proof. (i) By hypothesis, the function λ → Pλ is continuous at λ0 . By Corollary 7.2.6 (iii), there
exists a family (Nλ , Dλ ) of r.c.f.’s of Pλ such that Nλ , Dλ are continuous at λ0 . Since C ∈ S(Pλ0 ),
there is an l.c.f. (D̃c , Ñc ) of C such that D̃c Dλ0 + Ñc Nλ0 = I (see Corollary 5.1.8). Moreover,

I − Nλ0 Ñc −Nλ0 D̃c
H (Pλ0 , C) = . (7.2.24)
Dλ0 Ñc Dλ0 D̃c
Now define λ = D̃c Dλ + Ñc Nλ . Then λ0 = I and λ → λ is a continuous function from
into M(S). Since U(S) is an open subset of M(S), there is a neighborhood N of λ0 in such that
216 7. ROBUSTNESS
λ ∈ U(S) ∀λ ∈ N. Hence C ∈ S(Pλ ) ∀λ ∈ N. Moreover, since inversion of unimodular matrices
is continuous, the map λ → −1
λ is continuous. It now follows from

I − Nλ −1 Ñ c N λ −1
D̃ c
H (Pλ , C) = λ λ , (7.2.25)
Dλ −1
λ Ñc Dλ −1λ D̃c
that H (Pλ , C) is continuous at λ = λ0 .

(ii) By hypothesis, there exists a C ∈ S(Pλ0 ) and a neighborhood N of λ0 such that C ∈
S(Pλ ) ∀λ ∈ N. Select any r.c.f. (N0 , D0 ) of Pλ0 . Since C ∈ S(Pλ0 ), by Corollary 5.1.8 there exists
an l.c.f. (D̃c , Ñc ) of C such that D̃c D0 + Ñc N0 = I . Now, since C ∈ S(Pλ ) ∀λ ∈ N, it follows by
symmetry that Pλ ∈ S(C) ∀λ ∈ N. Hence, once again by Corollary 5.1.8, there exists r.c.f.’s (Nλ , Dλ )
of Pλ such that D̃c Dλ + Ñc Nλ = I ∀λ ∈ N. Moreover,

I − N0 Ñc −N0 D̃c
H (Pλ0 , C) = , (7.2.26)
D0 Ñc D0 D̃c

I − Nλ Ñc −Nλ D̃c
H (Pλ , C) = . (7.2.27)
Dλ Ñc Dλ D̃c
By hypothesis H (Pλ , C) is continuous at λ = λ0 . Hence so is the function

Nλ Ñc Nλ D̃c
Fλ = . (7.2.28)
Dλ Ñc Dλ D̃c
Since Nλ0 , Dλ0 are fixed elements of M(S) (i.e., independent of λ), the function

N0 Nλ
Fλ = (since Ñc N0 + D̃c D0 = I ) , (7.2.29)
D0 Dλ
is also continuous at λ0 . Thus, Nλ , Dλ are continuous at λ0 . Finally, since H (Pλ , C) is continuous

at λ0 , it follows that
lim H (Pλ , C) = H (Pλ0 , C) . (7.2.30)
λ→λ0
Then from (7.2.29) it follows that

Nλ N0
lim = . (7.2.31)
λ→λ0 Dλ D0
Now by Corollary 7.2.6 (ii), the map λ → Pλ is continuous at λ0 . 2

In the proof of Theorem 7.2.7, the only properties of M(S) used were (i) the set of units is
open, and (ii) inversion of a unit is continuous. Thus, Theorem 7.2.7 holds under extremely general
conditions. Specifically, one can replace S by any topological ring with the above two properties, plus
the obvious ones (namely S is a commutative domain with identity), and Theorem 7.2.7 remains
valid. The same remark applies to Theorem 7.2.8 and Corollary 7.2.9 below.
Theorem 7.2.7 is addressed to the robustness of feedback stability in the case where the plant
is perturbed but the compensator remains fixed. The next result is addressed to the case where both
the plant and the compensator may be perturbed. Suppose a family of plants {Pλ } and a family
of compensators {Cλ } is specified, and consider the function λ → (Pλ , Cλ ). Theorem 7.2.8 below
relates the behavior of this function to the robustness of feedback stability. Before the presentation
of the theorem, however, a word is needed about the problem formulation. Modeling the plant
and compensator perturbations by a family {Pλ , Cλ } where both P and C are functions of the same
parameter λ gives the misleading impression that plant and compensator perturbations are in some
way “coupled.” But this is not so. Suppose plant uncertainty is modeled by specifying a family of
plants Pα where α belongs to a topological space A, and compensator uncertainty similarly leads
to a family of compensators Cβ where β belongs to another topological space B. In such a case,
define the parameter space to be the cartesian product of A and B equipped with the product
topology, and define P(α,β) = Pα , C(α,β) = Cβ . In this way, “independent” plant and compensator
uncertainty can also be treated by Theorem 7.2.8.
Theorem 7.2.8 Suppose λ → Pλ , λ → Cλ are functions mapping a first-countable topological

space into M(R(s)), and suppose that the pair (Pλ0 , Cλ0 ) is stable.
(i) Suppose the functions λ → Pλ , λ → Cλ are continuous at λ = λ0 in the graph topology.
Then there exists a neighborhood N of λ0 such that (Pλ , Cλ ) is stable for all λ ∈ N, and in addition
H (Pλ , Cλ ) is continuous at λ = λ0 in the norm topology on M(S).
(ii) Conversely, suppose there is a neighborhood N of λ0 such that (Pλ , Cλ ) is stable for all
λ ∈ N, and such that H (Pλ , Cλ ) is continuous at λ = λ0 in the norm topology on M(S). Then the
functions λ → Pλ , λ → Cλ are continuous at λ = λ0 in the graph topology.
The proof of this theorem is deferred to Section 7.3. However, combining Theorems 7.2.8
and 7.2.7 leads to the following result, which is relevant in the context of singularly perturbed
systems.
Corollary 7.2.9 Suppose λ → Pλ is a function mapping a first-countable topological space into

M(R(s)), and suppose a point λ0 ∈ is specified. If there exist a neighborhood N of λ0 and a fam-
ily {Cλ , λ ∈ N} in M(R(s)) of compensators such that (Pλ , Cλ ) is stable for all λ ∈ N and such that
H (Pλ , Cλ ) is continuous at λ = λ0 , then there exists a C ∈ M(R(s)) and a neighborhood B of λ0 such
that (Pλ , C) is stable for all λ ∈ B and H (Pλ , C) is continuous at λ = λ0 .
Remarks 7.2.10 The point of the corollary is this: Suppose a family of plants {Pλ } is specified,
together with a nominal plant Pλ0 . If it is possible to design a family of compensators Cλ such that
Cλ robustly stabilizes the family of plants {Pλ }, then the same can be accomplished using a fixed
(i.e., λ-independent) compensator.
218 7. ROBUSTNESS
Proof. The hypotheses imply, by Theorem 7.2.8 (ii), that both λ → Pλ , λ → Cλ are continuous
in the graph topology at λ = λ0 . Now let C = lim Cλ , or any other element of S(Pλ0 ). Then, by
λ→λ0
Theorem 7.2.7 (i), C robustly stabilizes the family {Pλ }. 2
The preceding results show that the graph topology is of fundamental importance in the study
of the robustness of feedback stability. Hence it is worthwhile to develop a further understanding of
it, with particular emphasis on what continuity in this topology means. The next several propositions
are directed towards this aim.
Proposition 7.2.11 On the set M(S) viewed as a subset of M(R(s)), the norm topology and the graph
topology are the same. M(S) is an open subset of M(R(s)) in the graph topology.
Proof. To prove the first sentence, it is enough to show that convergence in the norm topology is
equivalent to convergence in the graph topology in M(S). Accordingly, suppose {Pi } is a sequence3 in
M(S) and that P ∈ M(S). Suppose first that Pi − P → 0. Then C = 0 stabilizes P as well as Pi ,
since P , Pi ∈ M(S). Further, since Pi − P → 0, it follows from (7.1.1) that H (Pi , 0) → H (P , 0)
in M(S). Hence, by Theorem 7.2.7 (ii), we conclude that {Pi } converges to P in the graph topology.
To prove the converse, and incidentally to prove the second sentence, suppose {Pi } is a sequence
in M(R(s)) (not necessarily in M(S)) converging in the graph topology to P ∈ M(S). Now (P , I ) is
an r.c.f. of P . Hence, by Lemma 7.2.5 (ii) there exist r.c.f.’s (Ni , Di ) of Pi such that Ni → P , Di → I
in M(S). Now, I is a unimodular matrix. Since U(S) is an open subset of M(S) and since the inversion
of unimodular matrices is continuous, this implies that Di is unimodular for large enough i and that
Di−1 → I in M(S). Hence, Pi = Ni Di−1 ∈ M(S) for large enough i; moreover, since Ni − P → 0
and Di−1 − I → 0, it follows that Pi − P = Ni Di−1 − I → 0. As it has been shown during
the preceding argument that every sequence in M(R(s)) that converges to a limit in M(S) must
eventually lie in M(S), we conclude that M(S) is an open subset of M(R(s)). 2
Proposition 7.2.11 shows that, in the case of functions mapping into M(S), continuity in
the graph topology is the same as continuity in the usual norm topology. In the case of unstable
plant sequences, however, several strange things can happen. Before illustrating these difficulties, a
preliminary result is presented that is often helpful in concluding lack of continuity.
Proposition 7.2.12 Suppose λ → Pλ is a function from a first-countable topological space into

M(R(s)), and suppose this function is continuous in the graph topology at λ = λ0 . Let s1 , · · · , sl denote
the poles of Pλ0 in the closed RHP.
(i) Let be any compact subset of the open RHP such that none of the si lies on the boundary of ,
and let γ denote the number of poles of Pλ0 inside , counted according to their McMillan degrees. Then
there is a neighborhood N of λ0 such that Pλ has exactly γ poles inside for all λ ∈ N.
3 Strictly speaking, one should examine the convergence of nets rather than sequences. However, it is shown in Section 7.3 that
the graph topology is metrizable and hence first-countable; so it is enough to examine only sequences.
(ii) Let be any closed subset of the closed RHP that does not contain s1 , · · · , sl ; if Pλ0 has a pole
at infinity, suppose in addition that is bounded. Then Pλ (s) → Pλ0 (s) as λ → λ0 , uniformly for all
s ∈ .
Remarks 7.2.13 In general, the number of closed RHP poles of Pλ and of Pλ0 need not be equal.
For example, let = [0, ∞), λ0 = 0 and Pλ (s) = 1/(s + λ).
Proof. Since the function λ → Pλ is continuous in the graph topology at λ = λ0 , it follows from
Corollary 7.2.6 (iii) that there exists a family (Nλ , Dλ ) of r.c.f.’s of Pλ such that Nλ , Dλ are continuous
at λ0 . To prove (i), observe that |Dλ | is continuous at λ0 ; hence |Dλ |(s) → |Dλ0 |(s) uniformly in s
as λ → λ0 . Since |Dλ | and |Dλ0 | are both analytic in the open RHP, and since |Dλ0 | has no zeros on
the boundary of , it follows by applying the principle of the argument to |Dλ | and to |Dλ0 | that,
for some neighborhood N of λ0 , the number of zeros of |Dλ | inside is the same for all λ ∈ N.
Since the RHP poles of Pλ are in one-to-one correspondence with the zeros of |Dλ |, and since the
McMillan degree of an RHP pole of Pλ is precisely its multiplicity as a zero of |Dλ |, (i) follows.
To prove (ii), observe that since Dλ (s) → Dλ0 (s) uniformly over the closed RHP, it fol-
lows that [Dλ (s)]−1 → [Dλ0 (s)]−1 wherever the latter is well-defined, i.e., wherever |Dλ0 (s)| = 0.
Moreover, if we exclude a neighborhood of every closed RHP zero of |Dλ0 |, the convergence is
uniform with respect to s. Since Pλ (s) = Nλ (s)[Dλ (s)]−1 , the same is true of Pλ . 2
One of the nice features of the norm topology in M(S) is that it is a product topology: That
is, a sequence of matrices {P (i) } in M(S) converges to P ∈ M(S) if and only if each component
(i)
sequence {Pj k } converges to pj k for all j, k. Consequently, a function λ → Pλ mapping into M(S)
is continuous at λ0 if and only if each component of Pλ is continuous at λ0 . Now Proposition 7.2.11
shows that the graph topology and the norm topology coincide on M(S). Hence, when restricted to
M(S), the graph topology is also a product topology. However, in the case where a function λ → Pλ
maps into unstable multivariable plants, the continuity of the matrix-valued map Pλ at λ0 neither
implies nor is implied by the continuity at λ0 of the scalar-valued maps λ → pij λ for all i, j . This is
illustrated through some examples. First a scalar example is presented to set the stage.
Example 7.2.14 Let = R, λ0 = 0, and consider the function
s−1−λ
pλ (s) = . (7.2.32)
s−1
Then at λ = 0 the function pλ has a pole-zero cancellation in the open right half-plane, namely
at s = 1. This causes pλ to be discontinuous at λ = 0, as can be seen by the following reasoning.
Now p0 = 1, which belongs to S. If pλ were to be continuous at λ = 0, it would follow from
Proposition 7.2.11 that pλ ∈ S for all small enough λ. Since clearly pλ ∈
/ S ∀λ = 0, the conclusion
is that p is not continuous at λ = 0.
220 7. ROBUSTNESS
Example 7.2.15 Again let = R, λ0 = 0 and consider the function

−λs s−1−λ
s−1 s−1
P λ(s) = 2s 2 −1 −s 2 +s+1 . (7.2.33)
s 2 −1 s 2 −1
Then, from Example 7.2.14, the (1,2)-component of Pλ is not continuous in the graph topology
at λ = 0. Nevertheless, the function λ → Pλ mapping R into M(R(s)) is continuous in the graph
topology at λ = 0. To see this, define

z+λ z−1 z−1 z
Nλ (s) = , Dλ (s) = , (7.2.34)
z−2 1 z z−1
where z = s/(s + 1) ∈ S.Then it is easy to see that Nλ is continuous at λ = 0, and Dλ is independent

of λ. Moreover, Pλ = Nλ Dλ−1 , as is easily checked. Hence, by Corollary 7.2.6 (iii), the function Pλ is
continuous in the graph topology at λ = 0.This example shows that the continuity of a matrix-valued
function does not imply the continuity of its components.
Example 7.2.16 Once again let = R, λ0 = 0, and consider

P λ(s) = s+1
s−1−λ
s+1
s−1+λ . (7.2.35)
Then,

P0 (s) = s+1
s−1
s+1
s−1 . (7.2.36)
It is immediate from (7.2.35) and (7.2.36) that each component of Pλ is continuous in the graph
topology at λ = 0. Nevertheless, Pλ itself is not continuous at λ = 0. To see why this is so, note that
P0 has a pole at s = 1 of McMillan degree one, and this is the only pole of P0 . On the other hand,
for each nonzero λ ∈ (−1, 1), Pλ , has two poles in the closed RHP, each of McMillan degree one.
Since Proposition 7.2.12 (i) fails to hold, the conclusion is that Pλ is not continuous in the graph
topology at λ = 0.
The next result shows that, while the graph topology may be the “right” topology for study-
ing the robustness of feedback stability, addition and multiplication in R(s) (or M(R(s))) are not
necessarily continuous operations in this topology.
Proposition 7.2.17 R(s) equipped with the graph topology is not a topological ring.
Proof. It is necessary to demonstrate that either addition or multiplication fails to be continuous. In
actuality, both operations are not continuous in general. First, let {εi } be any sequence converging to
zero, and let
s − 3 − εi 1
pi (s) = , qi (s) = ,
s−2 s−2
s−3 1
p(s) = , q(s) = . (7.2.37)
s−2 s−2
Then pi → p, qi → q the graph topology, but
s − 2 − εi
(pi + qi )(s) = , (7.2.38)
s−2
does not approach p + q = 1 in the graph topology (see Example 7.2.14). Hence, addition is not
continuous in general. An example of the noncontinuity of multiplication is provided by
s − 1 − εi s+1
pi (s) = , qi (s) = ,
s+1 s − 1 + εi
s−1 s+1
p(s) = , q(s) = . (7.2.39)
s+1 s−1
The details are left to the reader. 2
Using entirely analogous reasoning, one can show that the matrix ring R(s)n×n equipped with
the graph topology fails to be a topological ring for every integer n ≥ 1.
Up to now we have seen a spate of negative results, including the fact that the continuity
of a matrix-valued function cannot be related in a simple way to the continuity of its components,
and that addition and multiplication need not be continuous. These results are presented primarily
to impress upon the reader the need to proceed with caution, and to demonstrate that “obvious”
statements such as “ lim (Pλ + Qλ ) = lim Pλ + lim Qλ ” need not always hold true. Happily,
λ→λ0 λ→λ0 λ→λ0
there are also several positive statements that one can make about the graph topology.
Proposition 7.2.18 Suppose is a first-countable topological space, and that P , Q : →

M(R(s)), R : → M(S) satisfy Pλ = Qλ + Rλ ∀λ ∈ . Suppose in addition that R is continuous
at λ0 . Then P is continuous at λ0 if and only if Q is continuous at λ0 .
Remarks 7.2.19 As we have seen in the proof of Proposition 7.2.17, the statement
lim Pλ = lim Qλ + lim Rλ , (7.2.40)

λ→λ0 λ→λ0 λ→λ0
is not valid in general. Thus, the point of Proposition 7.2.18 is that (7.2.40) is valid provided one of
the two limits on the right side of (7.2.40) belongs to M(S).
222 7. ROBUSTNESS
Proof. “if ” Since Q is continuous at λ0 , by Corollary 7.2.6 (iii) it follows that there exists a family
(Nλ , Dλ ) of r.c.f.’s of Qλ such that Nλ , Dλ are continuous at λ0 . Since Rλ ∈ M(S), the pair (Nλ +
Rλ Dλ , Dλ ) is an r.c.f. of Pλ . Finally, since M(S) is a topological ring, the functions λ → Nλ + Rλ Dλ
and λ → Dλ are both continuous at λ0 . This implies, by Corollary 7.2.6, that Pλ is continuous at
λ0 .
“only if ” Write Qλ = Pλ − Rλ and apply the above reasoning. 2
The multiplicative analog of Proposition 7.2.18 is stated next.
Proposition 7.2.20 Suppose is a first-countable topological space, and that P , Q : → M(R(s)),

R : → U(S) satisfy Pλ = Qλ Rλ ∀λ ∈ . Suppose in addition that R is continuous at λ0 . Then P is
continuous at λ0 if and only if Q is continuous at λ0 .
It is left to the reader to prove Proposition 7.2.20, by exploiting the fact that Rλ is unimodular
for all λ, and to extend Proposition 7.2.20 to the case where Pλ = Rλ Qλ .
The final result along these lines shows that the graph topology on M(R(s)) is a product
topology under certain circumstances.
Proposition 7.2.21 Suppose P maps a first-countable topological space into M(R(s)), and that P is
block-diagonal of the form P = Block Diag {P1 , · · · , Pn } where the dimensions of the blocks are the same
for all λ. Then P is continuous at λ0 ∈ if and only if each function Pi is continuous at λ0 .
The proof of this proposition is given in Section 7.3. One can generate some more results by
combining the preceding three propositions in various ways.
Consider a function λ → Pλ mapping into M(R(s)). In view of Examples 7.2.15 and 7.2.16,
one can ask whether there is a simple test to determine whether or not Pλ is continuous at λ0 . Such
a test is provided by Theorem 7.2.7, and can be stated as follows: Find any C ∈ S(Pλ0 ); then,
Pλ is continuous at λ0 if and only if C robustly stabilizes the family {Pλ }. Of course, this test
begs the question, since the issue of continuity arises only in conjunction with the existence of a
robustly stabilizing compensator. Thus, the next several propositions are addressed to some very
specific situations of practical interest.
Proposition 7.2.22 Let = R, λ0 = 0, and suppose Pλ (s) is the transfer matrix of the system
ẋ(t) = Aλ x(t) + Bλ u(t), y(t) = Cλ x(t) . (7.2.41)
Suppose the triple (C0 , A0 , B0 ) is stabilizable and detectable, and that (Cλ , Aλ , Bλ ) is continuous in λ at
λ = 0 in the usual norm topology on Euclidean space. Then the map λ → Pλ is continuous in the graph
topology at λ = 0.
Proof. This is a straight-forward application of Lemma 4.2.2. Select a matrix F such that A0 − F C0
is Hurwitz, i.e., such that all eigenvalues of A0 − F C0 have negative real parts. By continuity,
Aλ − F Cλ is also Hurwitz for sufficiently small λ. From Lemma 4.2.2, an l.c.f. of Pλ is given by
D̃λ (s) = I − Cλ (sI − Aλ + F Cλ )−1 F , (7.2.42)

Ñλ (s) = Cλ (sI − Aλ F Cλ )−1 Bλ , (7.2.43)
for sufficiently small λ. Since D̃λ , Ñλ are continuous at λ = 0 in the norm topology on M(S), it
follows from Corollary 7.2.6 (iii) that Pλ is continuous in the graph topology at λ = 0. 2
A consequence of Proposition 7.2.22 is that any compensator C that stabilizes the nominal
plant P0 , which is described by (7.2.41) with λ = 0, also stabilizes Pλ for all sufficiently small λ;
moreover, the closed-loop transfer matrix H (Pλ , C) is continuous at λ = 0.
In Proposition 7.2.22, each of the plants Pλ has the same dynamic order (i.e., the same state-
space dimension). Proposition 7.2.23 below is addressed to the so-called singular perturbation case,
where the plant P0 has a lower dynamic order than the plants Pλ , λ > 0.
Proposition 7.2.23 Suppose = [0, ∞), and for each λ > 0 let Pλ (s) be the transfer matrix of the
system

ẋ(t) A11 A12 x(t) B
= + 1 u(t) , (7.2.44)
λż(t) A21 A22 z(t) B2

x(t)
y(t) = [C1 C2 ] + D1 u(t) . (7.2.45)
z(t)
Suppose A22 is a Hurwitz (and hence, nonsingular) matrix, and let P0 (s) be the transfer matrix of the
system resulting from substituting λ = 0 in (7.2.44) and (7.2.45), namely
ẋ(t) = Ax(t) + Bu(t), y(t) = Cx(t) + D0 u(t) , (7.2.46)
where
A = A11 − A12 A−1 −1

22 A21 , B = B1 − A12 A22 B2 ,
C = C1 − C2 A−1 −1
22 A21 , D0 = D1 − C2 A22 B2 . (7.2.47)
Suppose the triple (C,A,B) is stabilizable and detectable. Then Pλ is continuous at λ = 0 in the graph
topology if C2 (sI − A22 )−1 B2 ≡ 0.
Remarks 7.2.24 Equations of the form (7.2.44)–(7.2.45) are usually used to model systems with
both fast and slow dynamical variables. The state variables x (resp. z) are referred to as the slow (resp.
fast) modes. In this context, P0 represents a simplified model of Pλ obtained by neglecting the fast
modes. A relevant question for such systems is the following: When does a compensator designed
224 7. ROBUSTNESS
on the basis of the simplified system P0 work for the true system Pλ ? More precisely, suppose
one has found a stabilizing compensator C0 for the simplified system P0 . The issue is whether
C0 also stabilizes Pλ for all sufficiently small positive λ, and if so whether H (Pλ , C0 ) approaches
H (P0 , C0 ) as λ → 0+ . On the basis of Theorem 7.2.7, we know that this is so if and only if Pλ is
continuous in the graph topology at λ = 0. Now Proposition 7.2.23 states that continuity holds if
C2 (sI − A22 )−1 B2 ≡ 0.
Proposition 7.2.23 has several interesting consequences. For instance, suppose the simplified
system P0 is stabilized by state feedback.This corresponds to the case y(t) = x(t) (i.e., C1 = I, C2 =
0), and C0 = K, a constant matrix. Since the requisite condition holds in this case, it follows that the
control law u(t) = −Kx(t) also stabilizes Pλ for all sufficiently small positive λ, and that H (Pλ , K)
converges to H (P0 , K) as λ → 0+ . Similarly, if the control input u does not directly affect the last
modes, (i.e., B2 = 0), then it is again immediate that any compensator that stabilizes the reduced
order model also robustly stabilizes the singularly perturbed system for all sufficiently small positive
λ.
Proof. Suppose C2 (sI − A22 )−1 B2 ≡ 0. The continuity of Pλ at λ = 0 is established using Corol-
lary 7.2.6 (iii), i.e., by constructing a family (D̃λ , Ñλ ) of l.c.f.’s of Pλ , λ ≥ 0, such that both D̃λ and,
Ñλ are continuous at λ = 0 in the norm topology on M(S).
Select a matrix F such that A − F C is Hurwitz; this can always be done since (C, A) is
detectable. Define
D̃0 (s) = I − C(sI − A + F C)−1 F , (7.2.48)

Ñ0 (s) = C(sI − A + F C)−1 (B − F D0 ) + D0 . (7.2.49)
Then, by Theorem 4.2.1, (D̃0 , Ñ0 ) is an l.c.f. of P0 . To obtain an l.c.f. of Pλ , let F0 = [F 0] and
define

A11 A12 F
Āλ = − [C1 C2 ]
A21 /λ A22 /λ 0

A11 − F C1 A12 − F C2
= . (7.2.50)
A21 /λ A22 /λ
From [15], the eigenvalues of Āλ are asymptotically equal to those of A22 /λ plus those of
A11 − F C1 − (A12 − F C2 )A−1

22 A21 = A − F C . (7.2.51)
Hence Āλ is Hurwitz for all sufficiently small positive λ. Thus, once again by Theorem 4.2.1, an
l.c.f. of Pλ is given by
D̃λ (s) = I − C0 (sI − Āλ )−1 F0 , (7.2.52)

Ñλ (s) = C0 (sI − Āλ )−1 (Bλ − F0 D1 ) + D1 , (7.2.53)
where
B1
Bλ = , C0 = [C1 C2 ] . (7.2.54)
B2 /λ
The remainder of the proof consists of studying Ñλ and D̃λ in detail and showing that
Ñλ → Ñ0 , D̃λ → D̃0 as λ → 0+ . Recall that the inverse of a partitioned matrix is given by
−1
X Y −1 −−1 Y V −1
= , (7.2.55)
W V −V W −1
−1 −1
V + V −1 W −1 Y V −1
where
= X − Y V −1 W . (7.2.56)
Hence
−1
sI − A11 + F C1 −A12 + F C2 F
D̃λ = I − [C1 C2 ]
−A21 /λ sI − A22 /λ 0
−1 −1 −1
= I − C 1 F + C2 V W F , (7.2.57)
where
= (sI − A11 + F C1 ) − (A12 − F C2 )(sI − A22 /λ)−1 A21 /λ

= (sI − A11 + F C1 ) − (A12 − F C2 )(λsI − A22 )−1 A21 , (7.2.58)
V = sI − A22 /λ , W = −A21 /λ . (7.2.59)
Simplification of (7.2.57) yields

D̃λ = I − M−1 F , (7.2.60)
where
M = C1 − C2 V −1 W = C1 + C2 (λsI − A22 )−1 A21 . (7.2.61)
Now, letting Ā denote the Hurwitz matrix A − F C, we get
D̃λ = I − C(sI − Ā)−1 F

= I − C1 (sI − Ā)−1 F + C2 A−1 −1
22 A21 (sI − Ā) F . (7.2.62)
Hence, it can be concluded that D̃λ → D̃0 if it can be shown that
−1 → (sI − Ā)−1 , (7.2.63)

(λsI − A22 )−1 A21 −1 → −A−1
22 A21 (sI − Ā)
−1
. (7.2.64)
Note that (λsI − A22 )−1 does not approach −A−1 4

22 . Hence, (7.2.64) does not automatically follow
from (7.2.63).
4 Observe that (λsI − A )−1 → 0 as s → ∞ whenever λ > 0. Hence, the norm of the difference between the functions (λsI −
22
A22 )−1 and A−1 −1
22 is at least A22 , no matter how small λ is.
226 7. ROBUSTNESS
To prove (7.2.63), let Ā11 denote A11 − F C1 , and note that
−1 = [sI − Ā11 − (A12 − F C2 )(λsI − A22 )−1 A21 ]−1

= {sI − Ā11 − (A12 − F C2 )A−1
22 A21
− (A12 − F C2 )[(λsI − A22 )−1 − A−1
22 ]A21 }
−1
= (I − R)−1 (sI − Ā)−1 , (7.2.65)
where
R = (A12 − F C2 )[(λsI − A22 )−1 − A−1
22 ](sI − Ā)
−1
. (7.2.66)
Hence,
−1 − (sI − Ā)−1 = [(I − R)−1 − I ](sI − Ā)−1 . (7.2.67)
Now (I − R)(s) approaches I uniformly on every compact subset of C+e as λ → 0, and (sI −
Ā)−1 → 0 as |s| → ∞. Hence, −1 → (sI − Ā)−1 and (7.2.63) is proved. To establish (7.2.64),
observe that (λsI − A22 )−1 approaches −A−1 22 uniformly on all compact subsets of C+e as λ → 0,
and that both −1 and (sI − Ā)−1 are strictly proper. Thus (7.2.64) follows from (7.2.63) plus the
strict properness of the quantities on both sides of (7.2.63). Finally, (7.2.63) and (7.2.64) together
imply that D̃λ → D̃0 as λ → 0.
Next, we examine the “numerator” matrix Ñλ . From (7.2.53)

sI − A11 + F C1 −A12 + F C2 B1 − F D1
Ñλ = [C1 C2 ] + D1 . (7.2.68)
−A21 /λ sI − A22 /λ B2 /λ
Expanding the above equation results in an expression for Ñλ as a sum of five terms the first four
of which arise from the matrix product and the last is D1 . From (7.2.55), the first of these terms is
C1 −1 (B1 − F D1 ). Now (7.2.63) implies that this term converges to C1 (sI − Ā)−1 (B1 − F D1 ).
The second term in the expansion is
− C2 V −1 W −1 (B1 − F D1 )
= C2 (sI − A22 /λ)−1 (A21 /λ)−1 (B1 − F D1 )
= C2 (λsI − A22 )−1 A21 −1 (B1 − F D1 ) . (7.2.69)
From reasoning analogous to that used to establish (7.2.64), it follows that this term converges to
−C2 A−1 −1
22 A21 (sI − A22 ) (B1 − F D1 ). The third term is
− C1 −1 Y V −1 B2 /λ
= C1 −1 (A12 − F C2 )(sI − A22 /λ)−1 B2 /λ
= C1 −1 (A12 − F C2 )(λsI − A22 )−1 B2 . (7.2.70)
As before, this converges to
−C1 (sI − Ā)−1 (A12 − F C2 )A−1

22 B2 . (7.2.71)
However, note that C2 A−1
22 B2 = 0. Hence the above limit simplifies to
−C1 (sI − Ā)−1 A12 A−1

22 B2 . (7.2.72)
The fourth term is the most complicated one; it equals
C2 V −1 (I + W −1 Y V −1 )B2 /λ
−1
B2
= C2 (sI − A22 /λ)−1 · I + Aλ21 −1 (A12 − F C2 ) sI − A22
λ
·
λ
= C2 (λsI − A22 )−1 [I + A22 −1 (A12 − F C2 )(λsI − A22 )−1 ]B2
= C2 (λsI − A22 )−1 B2
+ C2 (λsI − A22 )−1 A21 −1 · (A12 − F C2 )(λsI − A22 )−1 B2 . (7.2.73)
Now the first term on the right side is identically zero by hypothesis, while the second term is strictly
proper. As a result, the second term converges without difficulty to C2 A−1 −1 −1
22 A21 (sI − Ā) A12 A22 B2 .
Of course the fifth and final term is the constant matrix D1 . Combining the five limits, we see that,
Ñλ approaches
(C1 − C2 A−1 −1 −1
22 A21 )(sI − Ā) (B1 − F D1 − A12 A22 B2 ) + D1 . (7.2.74)
Finally, using the relations (7.2.47), and noting that D1 = D0 , since C2 A−1
22 B2 = 0, the above limit
simplifies to
C(sI − Ā)−1 (B − F D0 ) + D0 = Ñ0 . (7.2.75)
From Corollary 7.2.6, it now follows that Pλ converges to P0 in the graph topology. 2
While the result contained in Proposition 7.2.23 is interesting, its use is limited by the fact
that the condition it requires is quite restrictive. The origin of the restrictiveness is the definition
of robust stabilizability. Recall that a controller C ∈ S(P0 ) was not only required to stabilize Pλ for
all sufficiently small positive λ, but the resulting stable family of transfer matrices H (Pλ , C) was
required to converge uniformly over C+e to H (P0 , C) as λ → 0. The next proposition is devoted to
a study of the situation where a weaker form of convergence of the family H (Pλ , C) is acceptable.
Proposition 7.2.25 Consider the family {Pλ } defined in (7.2.44)–(7.2.45). Suppose C ∈ (P0 ) is strictly
proper. Then C ∈ S(Pλ ) for all sufficiently small positive λ. Moreover, [H (Pλ , C)](s) converges to
[H (P0 , C)](s) uniformly over all compact subsets of C+e as λ → 0.
Proof. In the interests of brevity, let us say that a family {Fλ } converges weakly to F ∈ M(S) if
Fλ (s) → F (s) uniformly over all compact subsets of C+e as λ → 0. In contrast, we say simply that
{Fλ } converges to F if the convergence of Fλ (s) to F (s) is uniform over the entire C+e . Also note
that, if {Fλ } converges weakly to F and {Gλ } converges to G, then the product {Gλ Fλ } converges
to GF provided G is strictly proper.
228 7. ROBUSTNESS
Since C ∈ S(P0 ), it has an r.c.f. (Nc , Dc ) such that Ñ0 Nc + D̃0 Dc = I . Moreover, since C is
strictly causal, so is Nc . Now consider the return difference matrix
λ = Ñλ Nc + D̃λ Dc . (7.2.76)
In the proof of Proposition 7.2.23, it was shown that {Ñλ } converges to Ñ0 provided C2 (sI −
A22 )−1 B2 ≡ 0. However, from (7.2.73) one can easily verify that, even if the above condition is
not satisfied, the family {Ñλ } still converges weakly to Ñ0 . Since Nc is strictly proper, it follows
that Ñλ Nc → Ñ0 Nc . Also D̃λ → D̃0 , since the condition was not used in the proof of this result
(see (7.2.63)–(7.2.67)). Hence, λ → Ñ0 Nc + D̃0 Dc = I , which implies that λ is unimodular for
sufficiently small (positive) λ. By Theorem 5.1.6, this means that C ∈ S(Pλ ) for small enough λ.
Next, the twin formula to (5.1.21) shows that

Dc −1 D̃ λ −D c −1
Ñ λ
H (Pλ , C) = λ λ . (7.2.77)
Nc −1
λ D̃λ I − Nc λ Ñλ
−1
All terms except −Dc −1λ Ñλ , converge to their corresponding values when λ = 0. But this term
only converges weakly to −Dc Ñ0 , since neither Dc , nor the limit of −1
λ is strictly proper. However,
this is enough to conclude that {H (Pλ , C)} converges weakly to H (P0 , C). 2
Note that the set of all strictly proper compensators in S(P0 ) can be parametrized using
Proposition 5.2.10.
If C ∈ S(P0 ) is not strictly proper, then it is possible to construct a family {Pλ } such
that (7.2.47) holds, but C ∈
/ S(Pλ ) for all λ > 0. In other words, if P0 is stabilized using a nonstrictly
proper compensator, then there always exists a singular perturbation of P0 that is not stabilized by
C for each λ > 0.
In closing, let us discuss the graph topology in the case where the set of “stable” transfer
matrices is something other than the set S. A careful examination of the various proofs in this
section reveals that the actual nature of the elements of the set S was not used anywhere: Rather,
the pertinent properties of S were (i) the set of units in S is open and (ii) inversion is continuous.
Thus, the graph topology can be defined over the set of fractions associated with an arbitrary ring R,
provided only that R has the two properties mentioned above. In particular, the graph topology for
discrete-time systems is self-evident, as it is in the case where the set of desirable transfer functions
is taken to be the set SD where D is some prespecified region in the complex plane. Finally, note that
in [114] a so-called “gap metric” is defined which is nothing but a metrization of the graph topology
corresponding to a particular choice of “stable” transfer functions, namely the set of bounded linear
operators mapping the Hilbert space H2 into itself. Such a choice of “stable” transfer functions takes
into account the fact that BIBO-stability is a desirable attribute of a system, but does not take into
account the requirement that a stable system should be causal in order to be physically useful.
7.3. GRAPH METRIC 229
7.3 GRAPH METRIC
The previous section discusses the qualitative aspects of the robustness of feedback stability. Specif-
ically, the set M(R(s)) of all (possibly unstable) plants is topologized by means of the graph topology,
and it is shown that continuity in the graph topology is a necessary and sufficient condition for
feedback stability to be robust (see Theorem 7.2.7). In order to obtain quantitative estimates of
robustness, it is essential to metrize the graph topology, i.e., to define a metric on M(R(s)) such
that convergence (and hence, continuity) in this metric is equivalent to convergence in the graph
topology. Such a metric, called the graph metric, is defined in this section. Its applications to deriving
quantitative robustness estimates are given in the next section.
At this stage, it should be noted that the graph metric presented in this section is only one
possible means of metrizing the graph topology. It is undoubtedly possible to define alternate metrics
on M(R(s)) that induce exactly the same topology. Each of these metrics would lead to a different
quantitative estimate for the robustness of feedback stability, but would leave the qualitative result
of Theorem 7.2.7 unchanged. Thus, in some sense the graph topology is more fundamental than
the graph metric, defined below.
The graph metric is defined using the concept of a normalized r.c.f. of a rational matrix, and
this is introduced first. Suppose P ∈ M(R(s)). An r.c.f. (N, D) of P is normalized if
N ∗ (s)N (s) + D ∗ (s)D(s) = I, ∀s , (7.3.1)
where N ∗ (s) = N (−s).
Lemma 7.3.1 Every P ∈ M(R(s)) has a normalized r.c.f. which is unique to within right multiplica-
tion by an orthogonal matrix.
Proof. Let (N1 , D1 ) be any r.c.f. of P , and define
(s) = N1∗ (s)N1 (s) + D1∗ (s)D1 (s) . (7.3.2)
Then ∗ (s) = (s) ∀s. Moreover, by the right-coprimeness of N1 and D1 , there exist positive
constants α and β such that αI ≤ (j ω) ≤ βI ∀ω. Hence, by [3, p. 240], one can do a spectral
factorization of as (s) = R ∗ (s)R(s), where R ∈ U(S). Now let N = N1 R −1 , D = D1 R −1 .
Then (N, D) is also an r.c.f. of P by Theorem 4.1.13, and (7.3.1) holds.
To prove the assertion regarding uniqueness, let (N2 , D2 ) be another normalized r.c.f. of P .
Then by definition
N2∗ (s)N2 (s) + D2∗ (s)D2 (s) = I, ∀s . (7.3.3)
Moreover, by Theorem 4.1.13, there is a unimodular matrix U ∈ U(S) such that N2 = NU, D2 =
DU . Substituting this into (7.3.3) and using (7.3.1) shows that U ∗ (s)U (s) = I . In other words, U
is a spectral factor of the identity matrix. Hence, U (s) must be constant as a function of s and must
equal an orthogonal matrix. 2
230 7. ROBUSTNESS
Definition 7.3.2 Suppose P1 , P2 ∈ M(R(s)) have the same dimensions, and let (Ni , Di ) be a
normalized r.c.f. of Pi for i = 1, 2. Let

Di
Ai = , i = 1, 2 , (7.3.4)
Ni
and define
δ(P1 , P2 ) = inf A1 − A2 U , (7.3.5)

U ∈M(S),U ≤1
d(P1 , P2 ) = max{δ(P1 , P2 ), δ(P2 , P1 )} . (7.3.6)
Then d(P1 , P2 ) is the graph metric on M(R(s)).

It is left to the reader to verify that d(P1 , P2 ) is a well-defined quantity depending only on
P1 and P2 , even though A1 and A2 are only unique to within right multiplication by an orthogonal
matrix. The reason is that multiplication by an orthogonal matrix does not change the norm.
Lemma 7.3.3 d is a metric on M(R(s)) assuming values in the interval [0, 1].
Proof. If U = 0, then A1 − A2 U = A1 = 1. Hence, δ(P1 , P2 ) ≤ 1 and d(P1 , P2 ) ≤ 1 for

all plants P1 , P2 . In proving that d is a metric, the only nonobvious part is the proof of the
triangle inequality. Accordingly, let P1 , P2 , P3 ∈ M(R(s)), and select U, V ∈ M(S) such that
U ≤ 1, V ≤ 1, and
A1 − A2 U ≤ δ(P1 , P2 ) + ε , (7.3.7)

A2 − A3 V ≤ δ(P2 , P3 ) + ε , (7.3.8)
where ε is some positive number. Then V U ∈ M(S) and V U ≤ 1; moreover,
A1 − A3 V U ≤ A1 − A2 U + A2 U − A3 V U

≤ A1 − A2 U + A2 − A3 V U
≤ δ(P1 , P2 ) + δ(P2 , P3 ) + 2ε (7.3.9)
where in the last step we use the fact that U ≤ 1. Now (7.3.9) shows that
δ(P1 , P3 ) ≤ δ(P1 , P2 ) + δ(P2 , P3 ) + 2ε . (7.3.10)
Since (7.3.10) holds for every ε > 0, it follows that
δ(P1 , P3 ) ≤ δ(P1 , P2 ) + δ(P2 , P3 ) . (7.3.11)
By symmetry,
δ(P3 , P1 ) ≤ δ(P3 , P2 ) + δ(P2 , P1 ) . (7.3.12)
The triangle inequality for d now follows from (7.3.6). 2
The main result of this section is Theorem 7.3.5 which states that the topology on M(R(s))
induced by the graph metric is the same as the graph topology. The following technical result is
needed to prove this theorem.
Lemma 7.3.4 Suppose P1 , P2 ∈ M(R(s)) have the same dimensions, and let A1 , A2 be associated
matrices in M(S) obtained from their normalized r.c.f.’s. Suppose U, V ∈ M(S) satisfy U ≤ 1, and
A1 − A2 U + A2 − A1 V < 1. Then U and V are unimodular.
Proof. Note that, because of (7.3.1), we have A1 X = A2 X = X ∀X ∈ M(S) (in other words,
multiplication by Ai preserves norms). Hence,
I − V U = A1 − A1 V U
≤ A1 − A2 U + A2 U − A1 V U
≤ A1 − A2 U + A2 − A1 V < 1 . (7.3.13)
Hence, V U ∈ U(S) by Lemma 7.2.5, and V , U ∈ U(S) as a consequence. 2
Theorem 7.3.5 A sequence {Pi } in M(R(s)) converges to P ∈ M(R(s)) in the graph topology if
and only if d(Pi , P ) → 0.
Proof. “if ” Suppose d(Pi , P ) → 0. Let A, Ai be matrices of the form (7.3.4) obtained from nor-
malized r.c.f.’s of P , Pi , respectively. Then for each ε in the interval (0, 1/4) there exists an integer
n such that d(Pi , P ) ≤ ε ∀i ≥ n. For each i, there exists Ui , Vi in the unit ball of M(S) such that
A − Ai Ui ≤ d(Pi , P ) + ε, Ai − AVi ≤ d(Pi , P ) + ε. Hence, for all i ≥ n, we have
A − Ai Ui + Ai − AVi ≤ 4ε < 1 . (7.3.14)
By Lemma 7.3.4, this implies that Ui , Vi are unimodular.Thus, if A = [D N ] and Ai = [Di Ni ] ,

then (Ni Ui , Di Ui ) is an r.c.f. of Pi and A − Ai Ui ≤ 2ε ∀i ≥ n. It now follows from Lemma 7.2.5
that Pi → P in the graph topology.
“only if ” Suppose Pi → P in the graph topology, and let (N, D) be a normalized r.c.f. of
P . Then by Lemma 7.2.5 there exist r.c.f.’s (Ni , Di ) of Pi such that Ni → N, Di → D. However,
(Ni , Di ) need not be normalized. Let Mi = [Di Ni ] , and suppose Mi = Ai Ri where Ai corre-
sponds to a normalized r.c.f. of Pi and Ri ∈ U(S). Now Mi − A → 0 and Mi = Ai Ri =
Ri since Ai is an isometry. Hence, Ri → 1. Define Ui = Ri /Ri . Then Ui = 1; moreover,
A − Ai Ui ≤ A − Ai Ri + Ai Ri − Ai Ui
= A − Mi + Ri − Ui
= A − Mi + |1 − Ri |
→ 0 as i → ∞ . (7.3.15)
232 7. ROBUSTNESS
This shows that δ(P , Pi ) → 0. To prove that δ(Pi , P ) → 0, it is necessary to estimate Ri−1 . Let
γi = A − Mi . Then for all vectors x ∈ M(L2 ), we have Mi x ≥ Ax − (A − Mi )x ≥ (1 −
γi )x. Since x = Ai x = Mi Ri−1 x ≥ (1 − γi )Ri−1 x, it follows that Ri−1 ≤ 1/(1 − γi )
whenever γi < 1. In particular, Ri−1 → 1 as i → ∞. Let Vi = Ri−1 /Ri−1 . Then, as in (7.3.15),
Vi = 1 and
Ai − AVi = Mi Ri−1 − AVi

≤ Mi Ri−1 − ARi−1 + ARi−1 − AVi
≤ Mi − ARi−1 + Ri−1 − Vi
→ 0 as i → ∞ . (7.3.16)
Thus δ(Pi , P ) → 0. 2
At present, the graph metric distance between two plants cannot be computed exactly, since
the problem of determining the infimum in (7.3.5) is still open. However, it is possible to give upper
and lower bounds.
Lemma 7.3.6 Suppose (N1 , D1 ) is a normalized r.c.f. of P1 , (N2 , D2 ) is a (not necessarily normalized)
r.c.f. of P2 , and let A1 = [D1 N1 ] , M2 = [D2 N2 ] . Suppose A1 − M2 =: γ < 1. Then
2γ
d(P1 , P2 ) ≤ . (7.3.17)
1−γ
Proof. Suppose M2 = A2 R2 where A2 corresponds to a normalized r.c.f. of P2 and R2 ∈ U(S). Now

M2 ≤ A1 + M2 − A1 = 1 + γ . Hence, R2 = M2 ≤ 1 + γ and R2−1 ≤ 1/(1 − γ ), as
in the proof of Theorem 7.3.5. Moreover, (7.3.15) and (7.3.16) imply that
δ(P1 , P2 ) ≤ 2γ , (7.3.18)
δ(P2 , P1 ) ≤ 2γ /(1 − γ ) , (7.3.19)
d(P1 , P2 ) = max{δ(P1 , P2 ), δ(P2 , P1 )} ≤ 2γ /(1 − γ ) , (7.3.20)
To compute a lower bound on the graph metric distance between two plants, note that
δ(P1 , P2 ) ≥ inf A1 − A2 R . (7.3.21)

R∈M(S)
This can be computed using the methods of Section 6.6.

We are now in a position to undertake a proof of Proposition 7.2.21. For ease of reference it
is restated here in an equivalent form.
Proposition 7.3.7 Suppose {Pi } is a sequence in M(R(s)), P ∈ M(R(s)), and suppose Pi , P are
block-diagonal in the form
Pi = Block Diag {Pi1 , · · · , Pil } , (7.3.22)
P = Block Diag {P̄1 , · · · , P̄l } , (7.3.23)
where the partitioning is commensurate. Then d(Pi , P ) → 0 if and only if d(Pij , P̄j ) → 0 as i → ∞,
for each j in {1, · · · , l}.
Remarks 7.3.8 Proposition 7.3.7 is equivalent to Proposition 7.2.21 in view of Theorem 7.3.5.
Proof. It is enough to prove the proposition for the case where l = 2, as the general case will then
follow by induction. Accordingly, let us change notation, and suppose we are given a sequence {Pi }
of block-diagonal matrices and a block-diagonal matrix P having the form

Qi 0 Q 0
Pi = , P = . (7.3.24)
0 Ri 0 R
The claim is that d(Pi , P ) → 0 if and only if d(Qi , Q) → 0 and d(Ri , R) → 0.
“if ” Suppose d(Qi , Q) → 0, d(Ri , R) → 0. Then, by Theorem 7.3.5, {Qi } converges to
Q and {Ri } converges to R in the graph topology. This implies, by Lemma 7.2.5, that there are
r.c.f.’s (NQi , DQi ), (NRi , DRi ), (NQ , DQ ), (NR , DR ) of Qi , Ri , Q, R such that (NQi , DQi ) →
(NQ , DQ ), (NRi , DRi ) → (NR , DR ) Now define

NQi 0 DQi 0
Ni = , Di = . (7.3.25)
0 NR i 0 DR i
Then clearly (Ni , Di ) is an r.c.f. of Pi , (N, D) is an r.c.f. of P , and (Ni , Di ) → (N, D) (because the
norm topology on M(S) is a product topology). Again by Lemma 7.2.5, {Pi } converges to P in the
graph topology, so that d(Pi , P ) → 0 by Theorem 7.3.5.
“only if ” Suppose d(Pi , P ) → 0. Then {Pi } converges to P in the graph topology. Let
(NQ , DQ ), (NR , DR ) be normalized r.c.f.’s of Q, R respectively. Then (N, D) as defined in (7.3.26)
is easily shown to be a normalized r.c.f. of P . Similarly, let (NQi , DQi ), (NRi , DRi ) be normalized
r.c.f.’s of Qi , Ri , respectively. Then (Ni , Di ) as defined in (7.3.25) is a normalized r.c.f. of Pi . By
hypothesis, there is a sequence {Ui } of unimodular matrices such that
⎡ ⎤ ⎡ ⎤
DQi 0 DQ 0

⎢ 0 DR i ⎥ ⎢ ⎥
⎢ ⎥ U1i U2i → ⎢ 0 DR ⎥ , (7.3.26)
⎣ NQ 0 ⎦ U3i U4i ⎣ NQ 0 ⎦
i
0 NR i 0 NR
234 7. ROBUSTNESS
where Ui is partitioned in the obvious way. Since the topology on M(S) is a product topology, each
block in the partitioned matrix on the left side of (7.3.26) converges to the corresponding block on
the right side of (7.3.26). In particular,

D Qi 0 DRi 0
U2i → , U3i → . (7.3.27)
NQi 0 NR i 0
Since (NQi , DQi ), (NRi , DRi ) are both normalized r.c.f.’s, (7.3.27) implies that U2i → 0, U3i → 0.
Let Ai (resp. A) denote the block 4 × 2 matrix on the left (resp. right) side of (7.3.26). Since
U2i → 0, U3i → 0, it follows that

0 U2i
Ai →0. (7.3.28)
U3i 0
Combining (7.3.26) and (7.3.28) gives
Ai Ūi → A , (7.3.29)
where
U1i 0
Ūi = . (7.3.30)
0 U4i
Now (7.3.29) shows that Ūi → 1. Hence, if we define Wi = Ui /Ui , then (7.3.29) shows that
A − Ai Wi → 0. On the other hand, since d(Pi , P ) → 0, there exists a sequence {Vi } of matrices
in the unit ball of M(S) such that Ai − AVi → 0.Thus A − Ai Wi + Ai − AVi < 1 for large
enough i, and Wi = 1. Hence by Lemma 7.3.4, both Wi and Vi are unimodular for large enough
i. In particular, U1i and U4i are both unimodular for large enough i. Hence, (NQi U1i , DQi U1i ) is
an r.c.f. of Qi , and (NRi U4i , DRi U4i ) is an r.c.f. of Ri . Since (7.3.29) implies that

D Qi DQ DRi DR
U1i → , U4i → . (7.3.31)
NQ i NQ NR i NR
It follows from Lemma 7.2.5 that Qi → Q, Ri → R in the graph topology. Now Theorem 7.3.5
shows that d(Qi , Q) → 0, d(Ri , R) → 0. 2
This section is concluded with a proof of Theorem 7.2.8. It is restated here in an equivalent
form for ease of reference.
Proposition 7.3.9 Suppose {Pi }, {Ci } are sequences in M(R(s)), and that P0 , C0 ∈ M(R(s)). Then,
the following statements are equivalent:
(i) d(Pi , P0 ) → 0, d(Ci , C0 ) → 0 as i → ∞.
(ii) d(Hi , H0 ) → 0 as i → ∞, where Hi = H (Pi , Ci ), H0 = H (P0 , C0 ).
Remarks 7.3.10 If the pair (P0 , C0 ) is stable, then the above result reduces to Theorem 7.2.8. In
general, it shows that the map from the plant-compensator pair to the closed-loop transfer matrix is
continuous if all spaces are equipped with the graph topology, even if the pair (P0 , C0 ) is not stable.
Proof. Let

C0 0 Ci 0
G0 = , Gi = . (7.3.32)
0 P0 0 Pi
Then Proposition 7.3.7 shows that (i) is equivalent to the following statement: (iii) d(Gi , G0 ) → 0
as i → ∞. Thus the statement to be proved is: d(Gi , G0 ) → 0 if and only if d(Hi , H0 ) → 0.
Define
0 I
F = . (7.3.33)
−I 0
Then (cf. Section 5.1)
Hi = (I + F Gi )−1 , H0 = (I + F G0 )−1 . (7.3.34)
Suppose (Ni , Di ) is an r.c.f. of Gi . Then Hi = Di (Di + F Ni )−1 , from (7.3.34). Moreover,

(Di , Di + F Ni ) is an r.c.f. of Hi , since

Di + F Ni I F Di
= , (7.3.35)
Di I 0 Ni
and the matrix on the right side of (7.3.35) is unimodular, with the inverse

0 I
. (7.3.36)
−F F
(Note that F −1 = −F .) Conversely, suppose (Bi , Ai ) is an r.c.f. of Hi . Then

Di 0 I Ai
= , (7.3.37)
Ni −F F Bi
is an r.c.f. of Gi . Similar remarks apply to G0 and H0 .
The statement to be proved is that d(Gi , G0 ) → 0 if and only if d(Hi , H0 ) → 0. Suppose
first that d(Gi , G0 ) → 0. Then, by Theorem 7.3.5, {Gi } converges to G0 in the graph topology.
Hence, according to Lemma 7.2.5, there exist r.c.f.’s (Ni , Di ) of Gi and (N0 , D0 ) of G0 such that
Ni → N0 , Di → D0 . Now define

Ai I F Di A0 I F D0
= , = . (7.3.38)
Bi I 0 Ni B0 I 0 N0
Then (Bi , Ai ) is an r.c.f. of Hi , (B0 , A0 ) is an r.c.f. of H0 , and Ai → A0 , Bi → B0 . Hence, {Hi }
converges to H0 in the graph topology and d(Hi , H0 ) → 0. The reverse implication follows by
reversing the above steps. 2
236 7. ROBUSTNESS
We conclude this section with a discussion of the graph metric in the case where the set of
“stable” transfer functions is something other than the set S. First of all, the graph metric extends
almost effortlessly to linear distributed systems. The central step in defining a graph metric is the
existence of a normalized r.c.f. In the case of distributed systems, the existence of such a factorization
is assured by Beurling’s theorem in the scalar case and the Beurling-Lax theorem in the multivariable
case, which states that a closed shift-invariant subspace of Hn2 is the image of an inner operator.
Thus, even in the case of distributed systems, if a plant has an r.c.f., then it has a normalized r.c.f.
Of course, not all distributed systems have r.c.f.’s (see Section 8.1). Thus, the graph metric can only
be defined between pairs of plants when each has an r.c.f. Next, in the case of discrete-time systems,
the arguments given here carry over without any major modifications: One only has to replace the
extended closed right half-plane by the closed unit disc. Finally, if the set of “stable” transfer functions
is a set of the form SD where D is some prespecified region of the complex plane, or its discrete
analog, then the graph metric can be defined by first conformally mapping D into C+e (or the closed
unit disc in the discrete-time case). This of course requires that the region D be simply connected.
7.4 ROBUSTNESS CONDITIONS

In this section, the focus is on designing stabilizing compensators for imprecisely known plants.
Thus, a nominal plant description is available, together with a description of the plant uncertainty,
and the objective is to design a compensator that stabilizes all plants lying within the specified band
of uncertainty. The types of plant uncertainty considered include additive perturbations, multiplica-
tive perturbations, stable-factor perturbations, and graph-metric perturbations. The solution of the
design problem is divided into two steps: The first step is the derivation of necessary and sufficient
conditions that any compensator must satisfy in order to be a solution of the design problem. The
second step is a further examination of these conditions to see when there actually exists a compen-
sator satisfying these conditions.This leads naturally to the notion of optimally robust compensators.
This section concludes with a discussion of the case where there are uncertainties in both the plant
and compensator.
Throughout this section, several models of plant uncertainty are used; these are described
next.
Suppose P0 is a given nominal plant and that r ∈ S is a given function. An additive perturbation
of P0 is only defined in the case where P0 has no poles on the imaginary axis. The class A(P0 , r)
consists of all P ∈ M(R(s)) that have the same number of open RHP poles as P0 (counted according
to McMillan degree), and satisfy
P (j ω) − P0 (j ω) < |r(j ω)| ∀ω ∈ R . (7.4.1)
In particular, (7.4.1) implies that if P ∈ A(P0 , r), then P also has no poles on the j ω-axis. However,
P − P0 need not be analytic in the open RHP, because even though P and P0 have the same
number of open RHP poles, they could be at different locations. Hence, (7.4.1) does not imply that
P (s) − P0 (s) < |r(s)| ∀s ∈ C+e . The assumption that r ∈ S is not too restrictive. Suppose it is
7.4. ROBUSTNESS CONDITIONS 237
known that P has the same number of RHP poles as P0 , and that P (j ω) − P0 (j ω) < |f (j ω)|
where f is some proper rational function. Then it is always possible to find a stable proper rational
function r such that r(j ω) = f (j ω) ∀ω. In fact, one can assume without loss of generality that r is
outer.
The class of multiplicative perturbations of P0 is also defined under the restriction that P0
has no j ω-axis poles. The class M(P0 , r) consists of all P ∈ M(R(s)) that have the same number
of open RHP poles as P0 and satisfy
P (s) = (I + M(s))P0 (s) , M ∈ M(R(s)) , where
M(j ω) < |r(j ω)| ∀ω ∈ R . (7.4.2)
Finally, the class of stable-factor perturbations is specified not just in terms of the nominal plant
description P0 , but in terms of a particular r.c.f. (N0 , D0 ) of P0 and a function r ∈ S. The class
S(N0 , D0 , r) consists of all plants P ∈ M(R(s)) satisfying
P (s) = N(s)[D(s)]−1 , where

N(s) − N0 (s)

D(s) − D0 (s) < |r(s)| ∀s ∈ C+e . (7.4.3)
The rationale behind the above three models of plant uncertainty is fairly clear. In the case of
both A(P0 , r) and M(P0 , r), the perturbed plant P has the same number of open RHP poles as the
nominal plant P0 (though possibly at different locations). In the case of A(P0 , r), |r(j ω)| provides
an upper bound on the absolute magnitude of the plant uncertainty at the frequency ω, while in the
case of M(P0 , r), |r(j ω)| provides an upper bound on the relative magnitude of uncertainty at the
frequency ω. The class of stable-factor perturbations is free from the restructive assumption that
the perturbed plant and the unperturbed plant have the same number of RHP poles. As well, P0
is permitted to have j ω-axis poles, as is the perturbed plant P . Using (7.4.3) in conjunction with
the principle of the argument over various subregions of C+ , one can localize regions in C+ where
a P ∈ S(N0 , D0 , r) can have poles.
Theorems 7.4.1 and 7.4.2 below give the main robustness results in the case of additive and
multiplicative perturbations, respectively.
Theorem 7.4.1 Suppose P ∈ S(N0 , D0 , r), r ∈ M(R(s)) are specified, and suppose C stabilizes
P0 . Then C stabilizes all P in the class A(P0 , r) if and only if5
[C(I + P0 C)−1 ](j ω) · |r(j ω)| ≤ 1 ∀ω ∈ R . (7.4.4)
Theorem 7.4.2 Suppose P0 ∈ M(P0 , r), r ∈ M(R(s)) are specified, and suppose C stabilizes P0 .
Then C stabilizes all P in the class M(P0 , r) if and only if
[P0 C(I + P0 C)−1 ](j ω) · |r(j ω)| ≤ 1 ∀ω ∈ R . (7.4.5)
5 Recall that the norm M of a matrix M equals its largest singular value.
238 7. ROBUSTNESS
The proofs of both theorems make use of a graphical criterion for feedback stability that
generalizes the Nyquist criterion. As it is of independent interest, it is stated separately.
Lemma 7.4.3 Suppose P , C ∈ M(R(s)) have γp , γc poles in the open RHP, respectively, counted
according to McMillan degree, and none on the j ω-axis. Then C stabilizes P if and only if the plot of
|(I + P C)(j ω)| as ω decreases from ∞ to −∞ does not pass through the origin, and encircles the origin
γp + γc times in the clockwise sense.
A little notation and an appeal to the principle of the argument simplifies the proof of
Lemma 7.4.3. Suppose f is a rational function that has neither poles nor zeros on the extended
j ω-axis. Then, as ω decreases from ∞ to −∞, the graph of f (j ω) traces out a closed curve in the
complex plane. Let
1
ind f = lim [arg f (−j ω) − arg f (j ω)] , (7.4.6)
2π ω→∞
denote the index of this closed curve with respect to the origin. In more prosaic terms, ind f is the
number of times that the plot of f (j ω) encircles the origin in the counter-clockwise sense as ω
decreases from ∞ to −∞. The principle of the argument states that
ind f = no. of zeros of f in C+ − no. of poles of f in C+ . (7.4.7)
Since arg ab = arg a + arg b for any two nonzero complex numbers a and b, it follows that ind
fg = ind f + ind g for any two appropriate rational functions f and g.
Proof of Lemma 7.4.3. Let (Np , Dp ), (D̃c , Ñc ) denote any r.c.f. and any l.c.f. of P and C, respec-
tively. Then, by Theorem 5.1.6, C stabilizes P if and only if D̃c Dp + Ñc Np is unimodular, or equiv-
alently, f = |D̃c Dp , +Ñc Np | is a unit of S. Using the principle of the argument, this condition can
be expressed in terms of the behavior of f (j ω), and ultimately, |(I + P C)(j ω)|.
Note that
f = |D̃c Dp + Ñc Np | = |D̃c | · |I + CP | · |Dp |

= |D̃c | · |I + P C| · |Dp | (7.4.8)
since |I + CP | = |I + P C|. If |(I + P C)(j ω)| = 0 for some ω (finite or infinite), it is immediate
that f is not a unit. So suppose that |(I + P C)(j ω)| = 0 ∀ω and at infinity. Then ind f is well-
defined and
ind f = ind |D̃c | + ind |I + P C| + ind |Dp | . (7.4.9)
Now for a function g ∈ S with no j ω-axis zeros, ind g equals its degree δ(g) defined in (2.1.13).
Moreover, g is a unit if and only if δ(g) = 0. Finally, from Theorem 4.3.12 and (7.4.7), it follows
that ind |D̃c | = γc , ind |Dp | = γp . Combining this reasoning with (7.4.9) shows that f is a unit if
and only if
ind |I + P C| = −(γc + γp ) . (7.4.10)
If P and/or C has j ω-axis poles, Lemma 7.4.3 has to be modified by replacing j ω-axis
throughout by indented j ω-axis. This modification is standard (see e.g., [26, Sec.IV.5]).
Proof of Theorem 7.4.1. For the sake of simplicity, it is assumed that C has no j ω-axis poles. The
modifications required to handle the case where C has j ω-axis poles are obvious and left to the
reader.
“only if ” It is shown that if (7.4.4) does not hold, then there exists a P in the class A(P0 , r) that
is not stabilized by C. Accordingly, suppose (7.4.4) is not true, and let F denote C(I + P0 C)−1 .Then
either F (∞) · |r(∞)| > 1, or else F (j ω0 ) · |r(j ω0 )| > 1 for some finite ω0 . Let Re denote the
extended real line, i.e., the set R ∪ {∞}. Then the hypothesis is that F (j ω0 ) · |r(j ω0 )| > 1 for
some ω0 ∈ Re . Let F denote F (j ω0 ), let σ1 , · · · , σs denote the nonzero singular values of F , and
let denote the diagonal matrix = Diag {σ1 , · · · , σs }. Select unitary matrices U, V such that

0
U F (j ω0 )V = , (7.4.11)
0 0
Note that σ1 · |r(j ω0 )| > 1 by assumption. There are now two cases to consider.
Case (i) ω0 = 0 or ∞. In this case F is a real matrix, so that U and V can also be chosen to
be real (and hence orthogonal), and r(j ω0 ) is real. Now define Q ∈ M(S) by

1/σ1 0 r(s)
Q(s) = −V U , (7.4.12)
0 0 r(j ω0 )
and let P = P0 + Q. Then, since Q is stable, P has the same number of (and in fact the same) RHP
poles as P0 . Moreover,
|r(j ω)|
Q(j ω) = < |r(j ω)| ∀ω ∈ R , (7.4.13)
σ1 |r(j ω0 )|
since σ1 |r(j ω0 )| > 1. Hence P belongs to the class A(P0 , r). However, it is claimed that C does not
stabilize P . To prove this, it is enough, in view of Lemma 7.4.3, to show that |(I + P C)(j ω0 )| = 0.
Let v1 denote the first column of V , and note that V v1 = e1 = [1 0 · · · 0] because V is orthogonal.
Hence,
(I + QF )(j ω0 )v1 = v1 − (QF )(j ω0 )v1

1/σ1 0 0
= v1 − V e
0 0 0 0 1
= v1 − v1 = 0 . (7.4.14)
240 7. ROBUSTNESS
Since v1 = 0, this implies that |(I + QF )(j ω0 )| = 0. Now |I + P C| = |I + (Q + P0 C)| = |I +
QF | · |I + P0 C|, from which it follows that |(I + P C)(j ω0 )| = 0.
Case (ii) ω0 is nonzero and finite. In this case the only complication is that the matrices U, V
and the constant r(j ω0 ) may not be real, so that the Q defined in (7.4.12) may not be a real rational
matrix. To handle this possibility, write r(j ω0 ) = |r(j ω0 )| exp(j δ) where δ ∈ [0, 2π ). Let v1 , u1
denote respectively the first column of V and the first row of U . Express each component of v1 , u1
in the form
(v1 )i = v̄i exp(j φi ), (u1 )i = ūi exp(j θi ) , (7.4.15)
where v̄i , ūi are real numbers and φ − δ, θ − δ ∈ (−π, 0] ∀i. Define
r(s)
Q(s) = − v̄(s)ū (s) , (7.4.16)
σ1 |r(j ω0 )|
where
⎡ ⎤
v̄1 (s − α1 )/(s + α1 )
⎢ .. ⎥
v̄(s) = ⎣ . ⎦ ,
v̄ (s − αn )/(s + αn )
⎡ n ⎤
ū1 (s − β1 )/(s + β1 )
⎢ .. ⎥
ū(s) = ⎣ . ⎦ , (7.4.17)
ūn (s − βn )/(s + βn ) ,
and the nonnegative constants αi , βi are adjusted such that6

j ω0 − αi j ω0 − βi
arg = φi − δ , arg = θi − δ . (7.4.18)
j ω0 + α i j ω0 + βi
Then Q is a real rational matrix, Q ∈ M(S), and
Q(j ω0 ) = −v1 u1 /σ1 . (7.4.19)
Moreover, Q satisfies (7.4.13). Hence, P = P0 + Q belongs to the class A(P0 , r). As before, (I +
QF )(j ω0 ) is singular, because it maps v1 into the zero vector. Hence C does not stabilize P .
“if ” Suppose (7.4.4) holds. Let P be any plant in the class A(P0 , r) and let denote P − P0 .
Then, by (7.4.1),
(j ω) < |r(j ω)| ∀ω . (7.4.20)
Combining (7.4.4) with (7.4.20) and letting F denote C(I + P0 C)−1 leads to
(j ω) · F (j ω) < 1 ∀ω . (7.4.21)

6 One can assume without loss of generality that ω > 0, so that such α , β exist.
0 i i
Hence
|I + λ(j ω)F (j ω)| = 0 ∀ω ∈ R, ∀λ ∈ [0, 1] . (7.4.22)
Let X denote the topological space C − 0 (i.e., the “punctured plane”) with the topology inherited
from C. Define two closed curves f (·) and g(·) in X by
f (j ω) = 1 ∀ω , g(j ω) = |I + (j ω)F (j ω)| ∀ω . (7.4.23)
Now (7.4.22) shows that the function
h(λ, j ω) = |I + λ(j ω)F (j ω)| , (7.4.24)
continuously transforms the curve f into the curve g in X. Hence, the curves f and g are homotopic
in X, and as a result they have the same index with respect to 0. Since the index of f is clearly zero,
it follows that
ind g = ind |I + F | = 0 . (7.4.25)
Finally, note that
|I + P C| = |I + F | · |I + P0 C| . (7.4.26)
Since C stabilizes P0 , it follows from Lemma 7.4.3 that
ind |I + P0 C| = −(γ0 + γc ) , (7.4.27)
where γ0 (resp. γc ) is the number of RHP poles of P0 (resp. C) counted according to McMillan
degree. Now (7.4.25)–(7.4.27) imply that
ind |I + P C| = −(γ0 + γc ) . (7.4.28)
Since P also has γ0 RHP poles by assumption, (7.4.28) and Lemma 7.4.3 show that C stabilizes P .
2
Suppose we define Ā(P0 , r) to be the set of all plants P that have the same number of RHP
poles as P0 , and satisfy
P (j ω) − P0 (j ω) ≤ |r(j ω)| ∀ω . (7.4.29)
(The difference between (7.4.1) and (7.4.29) is that the < is replaced by ≤.) Then the proof of
Theorem 7.4.1 can be easily adapted to prove the following result.
Corollary 7.4.4 Suppose C ∈ S(P0 ). Then C stabilizes all plants in the class Ā(P0 , r) if and only if
sup [C(I + P0 C)−1 ](j ω) · |r(j ω)| < 1 . (7.4.30)

ω
242 7. ROBUSTNESS
Sketch of Proof of Theorem 7.4.2. Suppose P = (I + M)P0 . Then
|I + P C| = |I + (I + M)P0 C| = |I + MG| · |I + P0 C| , (7.4.31)
where G = P0 C(I + P0 C)−1 . By assumption, M satisfies M(j ω) < |r(j ω)| ∀ω, and (7.4.5) states
that G(j ω) · |r(j ω)| ≤ 1 ∀ω. 2
To derive a result analogous to Corollary 7.4.4, define M̄(P0 , r) to be the set of all plants P
that have the same number of RHP poles as P0 , and satisfy P = (I + M)P0 where M ∈ M(R(s))
and
M(j ω) ≤ |r(j ω)| ∀ω . (7.4.32)
Corollary 7.4.5 Suppose C ∈ S(P0 ). Then C stabilizes all plants in the class M̄(P0 , r) if and only if
sup [P0 C(I + P0 C)−1 ](j ω) · |r(j ω)| < 1 . (7.4.33)
ω
Finally, consider the case of stable-factor perturbations.
Theorem 7.4.6 Suppose P0 ∈ M(R(s)). Suppose an r.c.f. (N0 , D0 ) of P0 and a function r ∈ S

are specified, and define the class S(N0 , D0 , r) as in (7.4.3). Suppose C ∈ S(P0 ), and select an l.c.f.
(D̃c , Ñc ) of C such that D̃c D0 + Ñc N0 = I . Then C stabilizes all P in the class, S(N0 , D0 , r), if
and only if
[D̃c (j ω) Ñc (j ω)] · |r(j ω)| ≤ 1 ∀ω . (7.4.34)
Remarks 7.4.7 Since D̃c , Ñc ∈ M(S), r ∈ S, the condition (7.4.34) actually implies that
[D̃c (s) Ñc (s)]r(s) ≤ 1 ∀s ∈ C+e .
Proof. For convenience, define

D0 D
A0 = , A1 = , Ãc = [D̃c Ñc ] . (7.4.35)
N0 N
“if ” Suppose (7.4.34) holds, and suppose P is an arbitrary plant in the class S(N0 , D0 , r).
Then P has an r.c.f. (N, D) such that A(s) − A0 (s) < |r(s)| ∀s ∈ C+e . Now consider the return
difference matrix D̃c D + Ñc N = Ãc A. Since Ãc A0 = I , it follows that Ãc A = I + Ãc (A − A0 ).
However, from (7.4.34) we get Ãc (A − A0 )](s) ≤ Ãc (s) · [A − A0 ](s) < 1 ∀s ∈ C+e . This
shows that |(Ãc A)(s)| = |I + [Ãc (A − A0 )](s)| = 0 ∀s ∈ C+e , so that Ãc A is unimodular. Hence,
by Theorem 5.1.6, C stabilizes P .
“only if ” Suppose (7.4.34) is false, and select an ω0 such that Ãc (j ω) · |r(j ω0 )| > 1. Let
σ1 ≥ σ2 ≥ · · · ≥ σm denote the singular values of Ãc (j ω0 ), and note that σ1 |r(j ω0 )| > 1. Let =
Diag {σ1 , · · · σm }, and select unitary matrices U, V such that7
U Ãc (j ω0 )V = [ 0] . (7.4.36)
Since Ãc (j ω0 )A0 (j ω0 ) = I , it follows that A0 (j ω0 ) must be of the form

−1

A0 (j ω0 ) = V U. (7.4.37)
X
By assumption, σ1 |r(j ω0 )| > 1. Now define R = Diag {1/σ1 , 0, · · · , 0}, and

R r(s)
Q(s) = −V U , (7.4.38)
T r(j ω0 )
where T is a constant matrix chosen as follows: If the first column of the matrix X in (7.4.37) is
nonzero, simply choose T = 0. Otherwise, if the first column of X is zero, choose T to be a matrix
whose columns are all zero except for the first column, and whose first column is nonzero and has
sufficiently small norm that Q(j ω0 ) < 1/|r(j ω0 )|. If T is so chosen, then the first column of
X − T is not identically zero. Now let P be the plant ND −1 , where [N D ] = A = A0 + Q. Then
P belongs to the class S(N0 , D0 , r), since Q(s) · |r(s)| ≤ 1 ∀s ∈ C+e .8 However,
−1
−R
A(j ω0 ) = A0 (j ω0 ) + Q(j ω0 ) = V U, (7.4.39)
X−T
and the first column of −1 − R is zero. Hence Ãc (j ω0 )A(j ω0 ) is singular, and Ãc A is not uni-
modular.
The question is: Does the singularity of |D̃c D + Ñc N| at j ω0 imply that C fails to stabilize
P ? We can conclude that this is so, provided it can be shown that N and D are right-coprime,
or at least, that any greatest common right divisor of N and D is nonsingular at j ω0 . To am-
plify this argument, suppose N = N1 B, D = D1 B, where B is a g.c.r.d. of N and D, and N1 and
D1 are right-coprime. Then |D̃c D + Ñc N| = |D̃c D1 + Ñc N1 | · |B|. Hence, if |B(j ω0 )| = 0, then
|D̃c D1 + Ñc N1 | vanishes at j ω0 , and C does not stabilize P . Thus the proof is complete if it can be
shown that |B(j ω0 )| = 0. For this it is enough to show that the matrix [D N ] = A0 − Q has full
column rank at j ω0 . But this last fact is immediate from (7.4.39), since the first column of X − T
is nonzero. 2
There is an interesting anomaly associated with the robustness condition (7.4.34). Comparing
Theorem 7.4.1 with Corollary 7.4.4, or Theorem 7.4.2 with Corollary 7.4.5, we see that it makes
very little difference, in the case of additive and multiplicative perturbations, whether the class of
7 Note that Ã (s) is a “fat” matrix and has full row rank at all s ∈ C .
c +e
8The matrix Q may not be a real rational matrix; but this can be fixed up as in the proof of Theorem 7.4.1.
244 7. ROBUSTNESS
perturbations is defined with a strict or nonstrict inequality. If the class of perturbations is defined
with a strict inequality, then the robustness condition has a nonstrict inequality, and vice versa. But
this is not so in the case of stable-factor perturbations. Define the class S̄(N0 , D0 , r) by

(N − N0 )(s)
S̄(N0 , D0 , r) = {P = ND −1 : (D − D0 )(s)

≤ |r(s)| ∀s ∈ C+e } . (7.4.40)
One might be tempted to conjecture the following result: C ∈ S(P0 ) stabilizes all P in the class
S̄(N0 , D0 , r); if and only if
sup Ãc (j ω) · |r(j ω)| < 1 . (7.4.41)
ω
But this is false: (7.4.41) is certainly sufficient for robust stability, but not necessary, as the next
example shows.
Example 7.4.8 Consider

4 n0 (s)
p0 (s) = =
s−3 d0 (s)
where
4(s + 7) (s − 3)(s + 7)
n0 (s) = , d0 (s) = .
(s + 1)2 (s + 1)2
Let c(s) = 4/(s + 5). Then c stabilizes p0 . Moreover, dc d0 + nc n0 = 1 with
s+5 4
dc (s) = , nc (s) = .
s+7 s+7
Now consider the class S̄(n0 , d0 , r); with r = 1. Since

dc (∞) 1

nc (∞) = 0 = 1 ,
(7.4.41) does not hold. Nevertheless, c stabilizes every p in the class S̄(n0 , d0 , r), which shows
that (7.4.41) is not always necessary for robust stability.
To establish the claim made in the above paragraph, suppose

n(s) n − n0
p(s) = , where (s)
≤ 1 ∀s ∈ C+e .
d(s) d − d0
Let Q = [q1 q2 ] denote [d − d0 n − n0 ] . Then there are two cases to consider, namely:
(i) q1 (∞) + 1 = 0, and (ii) q1 (∞) + 1 = 0. In either case we have [dc (s) nc (s)] < 1 ∀s ∈ C+ ,
so that
(dc d + nc n)(s) = 1 + [dc (s) nc (s)]Q(s) = 0 ∀s ∈ C+ .
On the other hand, [dc (∞) nc (∞)] = [1 0] so that
(dc d + nc n)(∞) = 1 + q1 (∞) .
Hence, if q1 (∞) = −1, then dc d + nc n has no zeros in the extended RHP and is thus a unit of S.
Therefore c stabilizes p.
It only remains to show that c stabilizes p even if q1 (∞) = −1. In this case q2 (∞) = 0
since Q(∞) ≤ 1. Hence, n(∞) = d(∞) = 0, and α = 1/(s + 1) is a common divisor of n and
d. Further, since (dc d + nc n)(s) = 0 ∀s ∈ C+ , we see that α, its powers and associates are the only
possible common divisors of n and d. Now it is claimed that, whatever be q1 , the function d0 + q1
can have only a simple zero at infinity. Let us accept this claim for a moment; then α is a greatest
common divisor of n and d, since d/α does not vanish at infinity. Let n1 = 1/α, d1 = d/α; then
(n1 , d1 ) is a coprime factorization of p. Now
(dc d1 + nc n1 )(∞) = d1 (∞) = 0 ,
since d has only a simple zero at infinity. On the other hand, since it has already been established
that (dc d + nc n)(s) = 0 ∀s ∈ C+ , it follows that
(dc d1 + nc n1 )(s) = 0 ∀s ∈ C+e .
Hence, dc d1 + nc n1 is a unit of S, and c stabilizes p.

Thus, the example is complete if the claim can be established. This is most easily done using
the bilinear transformation z = (s − 1)/(s + 1), which sends the function d0 (s) into

1+z
a(z) = d0 = (4 − 3z)(2z − 1) ,
1−z
and sends q1 (s) into an associated rational function b(z). In this context, the claim is that a − b
has only a simple zero at z = 1 whenever b(1) = −1 and b∞ ≤ 1.9 We prove the contrapositive,
namely, if g is a rational H∞ -function such that g(1) = 1 and a − g has a double zero at z = 1,
then g∞ > 1. Expand g in a power series around z = 1, as
∞

g(z) = gi (z − 1)i .
i=0
If a − g has (at least) a double zero at z = 1, then
g(1) = a(1) = 1 , g (1) = a (1) = −1 .
Hence, for z < 1 and sufficiently close to 1, we have g(z) > 1. Therefore, g∞ > 1. This completes
the example.
9 Here ·
∞ denotes the norm in the space H∞ ; see Section 6.2.
246 7. ROBUSTNESS
Each of Theorems 7.4.1, 7.4.2 and 7.4.6 gives conditions for a particular C ∈ S(P0 ) to stabilize
all plants in a given class of perturbations. But they do not answer the question whether, given a
class of perturbations, there actually exists a C ∈ S(P0 ) that satisfies the robustness conditions.
This question is addressed next. Note that the solutions given below make use of the H∞ -norm
minimization results derived in Sections 6.5 and 6.6.
Theorem 7.4.9 Suppose a nominal plant P0 ∈ M(R(s)), free of poles on the extended j ω-axis,
and a function r ∈ S are specified. Suppose r does not have any zeros on the extended j ω-axis. Let
(N, D), (D̃, Ñ ) be any r.c.f. and any l.c.f. of P0 , and let X̃, Ỹ ∈ M(S) be any particular solutions of
the identity Ñ X̃ + D̃ Ỹ = I . Factor D, D̃ in the form
D = Di Do , D̃ = D̃o D̃i , (7.4.42)
where Di , D̃i , are inner and Do , D̃o are outer, and assume without loss of generality that |Di | =
|D̃i |.10 Finally, factor r as ro ri where ro , ri are outer and inner, respectively. Under these conditions,
there exists a C ∈ S(P0 ) that stabilizes all plants in the class A(P0 , r) if and only if there exists an
S ∈ M(S) such that
adj
Di X̃ D̃o ro + δS∞ ≤ 1 , (7.4.43)
where δ = |Di | = |D̃i |.
Proof. By Theorem 5.2.1, the set of all C ∈ S(P0 ) is given by
S(P0 ) = {(X̃ + DS)(Ỹ − NS)−1 : S ∈ M(S), |Ỹ − NS| = 0} , (7.4.44)
where the nonsingularity constraint is satisfied by an open dense subset of S ∈ M(S) (see
Lemma 5.2.4). Moreover, from (5.2.7), if C = (X̃ + DS)(Ỹ − NS)−1 , then
C(I + P0 C)−1 = (X̃ + DS)D̃ . (7.4.45)
Hence, there exists a C ∈ S(P0 ) satisfying (7.4.4) if and only if there exists an S ∈ M(S) such that
X̃ D̃r + DS D̃r∞ ≤ 1 . (7.4.46)
A few simple manipulations bring (7.4.46) into the form (7.4.43). Note that
X̃ D̃r + DS D̃r∞ = X̃ D̃o D̃i ro ri + Di Do S D̃o D̃i ro ri

adj
= Di X̃ D̃o ro ri δ + δ 2 Do S D̃o ro ri ∞
adj
= δri (Di X̃ D̃o ro + δQ)∞
adj
= Di X̃ D̃o ro + δQ∞ , (7.4.47)
10 By Theorem 4.1.15, |D| and |D̃| are associates; so are |D | and |D̃ |, since both are units. Hence, |D | and |D̃ | are associates,
o o i i
and since both are inner, |Di | = ±|D̃i | in any case.
adj adj
where, in successive steps, we multiply on the left by and on the right by
Di replace the D̃i ,
matrix S by a new “free” parameter Q = Do S D̃o ro , and observe that multiplication by the inner
function δri preserves norms. Hence, (7.4.46) and (7.4.43) are equivalent. 2
If δ has only simple RHP zeros (or equivalently, if P0 has only simple RHP poles), then (7.4.43)
can be stated in a more readily testable form using the results of Section 6.5.
Corollary 7.4.10 Suppose δ has only simple RHP zeros, and let λ1 , · · · , λn denote these zeros. Let
adj
Fi = (Di X̃Do ro )(λj ), j = 1, · · · , n, and define
I + Fj∗ Fk
Qj k = , 1 ≤ j, k ≤ n , (7.4.48)
λ̄ + λk
⎡j ⎤
Q11 · · · Q1n
⎢ .. ⎥ .
Q = ⎣ ... . ⎦ (7.4.49)
Qn1 · · · Qnn
Then there exists a C ∈ S(P0 ) that stabilizes all plants in the class A(P0 , r) if and only if Q is nonnegative
definite.
In (7.4.43), X̃ is any particular solution of the identity Ñ X̃ + D̃ Ỹ = I . Hence, it is nice to

know that the test matrix Q is the same no matter which X̃ is used. To see this, let X̃1 , Ỹ1 be another
set of matrices in M(S) satisfying Ñ X̃1 + D̃ Ỹ1 = I . Then, from the proof of Theorem 5.2.1, it
follows that X̃1 = X̃ + DR for some R ∈ M(S). Hence,
adj adj adj
Di X̃1 D̃o ro = Di (X̃ + DR)D̃o ro = Di X̃ D̃o ro + δDo R D̃o ro , (7.4.50)
since D = Di Do . Now the second term on the right side of (7.4.50) vanishes at all zeros of δ.
Hence F1 , · · · , Fn are independent of which particular solution X̃ is used to compute them. Similar
remarks apply even in the case where δ has repeated zeros.
If the plant P0 is scalar, then the expression for Fj (or fj in the scalar case) is more elegant. In
adj
this case, Di = 1 (since by convention the adjoint of a 1 × 1 matrix is 1) and fj = (xdo ro )(λj ).
Now, in the scalar case δ = di ; hence, it follows from d = di do = δdo and nx + yd = 1 that, at any
zero λj of δ, we have n(λj )x(λj ) = 1. Hence,
do (λj )
x(λj )do (λj ) = . (7.4.51)
n(λj )
With the plant p0 , associate the stable plant q0 = p0 δ = n/do . Then
do (λj ) ro
fj = ro (λj ) = (λj ) . (7.4.52)
n(λj ) q0
248 7. ROBUSTNESS
The treatment of multiplicative perturbations is entirely similar to that of additive perturba-
tions.
Theorem 7.4.11 Suppose a function r ∈ S and a nominal plant P0 ∈ M(R(s)) are specified, and
suppose that P0 has no extended j ω-axis poles. Let (N, D), (D̃, Ñ) be any r.c.f. and any l.c.f. of
P0 , and let X, Y ∈ M(S) be any particular solutions to the identity XN + Y D = I . Under these
conditions, there exists a C ∈ S(P0 ) that stabilizes all plants in the class M(P0 , r) if and only if there
exists an R ∈ M(S) such that
N(X + R D̃)r∞ ≤ 1 . (7.4.53)
Further, if P0 has at least as many inputs as outputs and r has no extended j ω-axis zeros, then (7.4.53)
can be simplified as follows: Factor N, D̃ in the form
N = N i No , D̃ = D̃o D̃i , (7.4.54)
where Ni , D̃i are inner and No , D̃o are outer. Finally. factor r as ri ro where ri is inner and ro is a unit
of S. Under these conditions, there exists a C ∈ S(P0 ) that stabilizes all plants in the class M(P0 , r)
if and only if there exists an S ∈ M(S) such that
adj
No X D̃i ro + δS∞ ≤ 1 , (7.4.55)
where δ = |D̃i |.
Proof. From Theorem 5.2.1, every C ∈ S(P0 ) is of the form (Y − R Ñ)−1 (X + R D̃) for some
R ∈ M(S). Moreover,
P0 C(I + P0 C)−1 = P0 (I + CP0 )−1 C = N(X + R D̃) . (7.4.56)
Hence (7.4.5) holds for some C ∈ S(P0 ) if and only if there exists an R ∈ M(S) such that (7.4.53)
holds. Now suppose the additional assumptions on P0 , r hold. Then, using by now familiar manip-
ulations, we note that
N Xr + NR D̃r∞ = Ni No Xro ri + Ni No R D̃o D̃i ro ri ∞

adj
= Ni ri (No Xro D̃i + δS)∞
adj
= No Xro D̃i + δS)∞ , (7.4.57)
where S = No R D̃o and we use the fact that multiplication by the inner matrix Ni ri reserves norms.
Now (7.4.57) is precisely (7.4.55). 2
adj
If the plant P0 is scalar, one can again obtain a simple expression for the No Xro D̃i evaluated
at the zeros of δ. Suppose nx + dy = 1 and suppose δ(λ) = 0 at some point λ ∈ C+ .Then d(λ) = 0,
adj
nx(λ) = 1, and (No Xro D̃i )(λ) = ro (λ)/ni (λ).
The foregoing results also lead very naturally to the notion of optimally robust compensators.
Suppose a nominal plant P0 is given, together with a function f ∈ S, which represents an uncertainty
profile. Consider now the class A(P0 , r) consisting of all plants satisfying (7.4.1), where r = λf . It
is now reasonable to ask: What is the largest value of the parameter λ for which the class A(P0 , r)
is robustly stabilizable, and what is a corresponding robustly stabilizing compensator? This problem
can be solved very easily using the results derived so far. Factor f as fi fo where fi , fo are respectively
inner and outer, and define
adj
γ = min Di X̃ D̃o fo + δS , (7.4.58)
S∈M(S)
where all symbols are as in Theorem 7.4.9. Now, by applying Theorem 7.4.9 with r = λf , it follows
that there exists a single compensator that stabilizes all the plants in the class A(P0 , r) if and only
if γ λ ≤ 1. Hence, the largest value of λ for which there exists a robustly stabilizing compensator is
given by λ = 1/γ . Further, if S is any matrix which attains the minimum in (7.4.58), then the com-
pensator C = (X̃ + DS)(Ỹ − NS)−1 is an optimally robust compensator. The case of multiplicative
perturbations is entirely similar and is left to the reader.
Next, the case of stable-factor perturbations is studied.
Theorem 7.4.12 Suppose a function r ∈ S and a nominal plant P0 ∈ M(R(s)) are specified,
together with an r.c.f. (N0 , D0 ) of P0 . Let X, Y ∈ M(S) satisfy XN0 + Y D0 = I , and let (D̃0 , Ñ0 )
be any l.c.f. of P0 . Then there exists a C ∈ S(P0 ) that stabilizes all plants in the class S(N0 , D0 , r)
if and only if there exists a matrix R ∈ M(S) such that A − BR ≤ 1, where

X D̃0
A= , B= . (7.4.59)
Y −Ñ0
The proof is obvious and is left to the reader.

This section is concluded with a discussion of the case where there are perturbations in both
the plant and the compensator. The results in this case make use of the graph metric introduced in
Section 7.3. A bit of notation makes the presentation smoother. Given P , C ∈ M(R(s)), define

I 0
T (P , C) = H (P , C) − , (7.4.60)
0 0
where H (P , C) is defined in 7.1.1. Using this last equation, one can derive a few other useful
representations of T (P , C). For instance,

−P C(I + P C)−1 −P (I + CP )−1 −P
T (P , C) = = (I + CP )−1 [C I ] . (7.4.61)
C(I + P C)−1 (I + CP )−1 I
Theorem 7.4.13 Suppose the pair (P0 , C0 ) is stable, and that P0 , C0 are perturbed to P , C,
respectively. Then the pair (P , C) is stable provided
d(P , P0 )T (P0 , C0 ) + d(C, C0 )T (C0 , P0 ) < 1 . (7.4.62)
250 7. ROBUSTNESS
Corollary 7.4.14 Under the hypotheses of Theorem 7.4.13, the pair (P , C0 ) is stable provided
d(P , P0 ) < 1/T (P0 , C0 ).
The significance of Theorem 7.4.13 and Corollary 7.4.14 is as follows: The type of plant
perturbations studied in these two results is the most unstructured one considered in this section, in
that (i) one is permitted to simultaneously perturb both the plant and the compensator. (ii) there is
no restriction on the number of RHP poles of the perturbed and unperturbed plant being the same,
and (iii) the perturbations are not couched in terms of a particular coprime factorization of the plant.
The stability condition (7.4.62) is interesting in that the effects of the perturbations in the plant
and controller enter additively. When only the plant is perturbed, the stability condition given in
Corollary 7.4.14 is reminiscent of the small gain theorem (see e.g., [D1, Chapter VI]). These results
also bring out the utility of the graph metric in obtaining quantitative robustness estimates.
Proof. Let (N0 , D0 ), (X̃0 , Ỹ0 ) be normalized r.c.f.’s of P0 , C0 , respectively. Then from the stability
of the pair (P0 , C0 ), and Problem 5.1.1, the matrix

D0 −X̃0
U0 = , (7.4.63)
N0 Ỹ0
is unimodular. Let V0 = U0−1 , and partition V0 as

Y0 X0
V0 = . (7.4.64)
−Ñ0 D̃0
Then (D̃0 , Ñ0 ), (Y0 , X0 ), are l.c.f.’s of P0 , C0 , respectively.

Now select real numbers δp > d(P , P0 ), δc > d(C, C0 ) such that
δp T (P0 , C0 ) + δc T (C0 , P0 ) < 1 . (7.4.65)
This is possible in view of (7.4.62). Let (Np , Dp ), (Nc , Dc ) be any normalized r.c.f.’s of P , C
respectively. Then, from the definition of the graph metric (see 7.3.2), there exist matrices Wp , Wc , ∈
M(S) with Wp ≤ 1, Wc ≤ 1 such that

D0 − Dp Wp

N0 − Np Wp ≤ δp , (7.4.66)

Ỹ0 − Dc Wc

X̃0 − Nc Wc ≤ δc . (7.4.67)
Define F as in (7.3.33) and observe that F is inner so that multiplication by F is norm-preserving.
Thus (7.4.67) implies that
−(X̃0 − Nc Wc )

Ỹ0 − Dc Wc ≤ δc . (7.4.68)
Next, define

Dp Wp −Nc Wc
U=
N W Dc Wc
p p
Dp −Nc Wp 0
= . (7.4.69)
Np Dc 0 Wc
It is now shown that (7.4.65) implies the unimodularity of U . This will show, a fortiriori, that Wp
and Wc , are both unimodular, and that the pair (P , C) is stable, by Problem 5.1.1. Note that
U − U0 = [A B] , (7.4.70)
where
D0 − Dp Wp −(X̃0 − Nc Wc )
A= , B= . (7.4.71)
N0 − Np Wp Ỹ0 − Dc Wc
Now, if [A B]U0−1 < 1, then (7.4.70) implies that U is unimodular. From (7.4.64),
[A B]U0−1 = [A B]V0 = A[Y0 X0 ] + B[−Ñ0 D̃0 ] . (7.4.72)

[A B]U0−1 ≤ A[Y0 X0 ] + B[−Ñ0 D̃0 ]
≤ δp [Y0 X0 ] + δc [−Ñ0 D̃0 ] (7.4.73)
where the last step follows from (7.4.66) and (7.4.68). The proof is completed by showing that
[Y0 X0 ] = T (P0 , C0 ) , (7.4.74)

[−Ñ0 D̃0 ] = T (C0 , P0 ) . (7.4.75)
Then (7.4.65) and (7.4.73) will imply the unimodularity of U and the stability of (P , C).
To prove (7.4.74), recall that (N0 , D0 ) is a normalized r.c.f. of P0 , and (Y0 , X0 ) is the corre-
sponding l.c.f. of C0 such that Y0 D0 + X0 N0 = I . Hence, from (5.1.23) and (7.4.60),

−N0 X0 −N0 Y0 −N0
T (P0 , C0 ) = = [X0 Y0 ] ; (7.4.76)
D0 X0 D 0 Y0 D0

−N0
T (P0 , C0 ) = [X0 Y0 ] since is an isometry
D0
= [Y0 X0 ] . (7.4.77)
The proof of (7.4.75) follows essentially by symmetry arguments, after noting that [−Ñ0 D̃0 ] =
[Ñ0 D̃0 ]. Recall that (X̃0 , Ỹ0 ) is a normalized r.c.f. of C0 , that P0 stabilizes C0 , and that (D̃0 , Ñ0 )
is the corresponding l.c.f. of P0 such that D̃0 Ỹ0 + Ñ0 X̃0 = I . 2
252 7. ROBUSTNESS
It is not known at present how close the conditions of Theorem 7.4.13 and Corollary 7.4.14
are to being necessary. In particular, it is not known whether the condition T (P0 , C) < r −1 is
necessary for a controller C to stabilize all plants within a distance of r from the plant P0 . Nevertheless,
it is reasonable to seek, among all stabilizing controllers for P0 , an optimally robust controller C0
for which T (P0 , C0 ) is as small as possible. The problem of finding such a C0 can be formulated
as an H∞ -norm minimization problem. As C varies over S(P0 ), the corresponding T (P0 , C) varies
over all matrices of the form

−N0 (X0 + R D̃0 ) −N0 (Y0 − R Ñ0
T1 (R) = , (7.4.78)
D0 (X0 + R D̃0 ) D0 (Y0 − R D̃0
where (N0 , D0 ), (D̃0 , Ñ0 ) are any r.c.f. and l.c.f. of P0 , X0 , Y0 ∈ M(S) satisfy X0 N0 + Y0 D0 = I ,
and R ∈ M(S) is a free parameter (see (5.2.6)). Thus, minimizing T (P0 , C) over all C ∈ S(P0 )
is equivalent to the unconstrained minimization of T1 (R) as R varies over M(S). Now note that
T1 (R) is of the form U − V RW , where

−Ñ0 −Ñ0
U= [X0 Y0 ] , V = , W = [D̃0 − Ñ0 ] . (7.4.79)
D̃0 D̃0
This type of minimization problem is studied in Section 6.6.
Finally, the results contained in this section carry over with obvious modifications to the study
of the robustness of discrete-time systems, and to cases where the set of “stable” transfer functions
is something other than S.
7.5 ROBUST AND OPTIMAL REGULATION

In this section, several previously derived results are brought to bear to provide solutions of two
problems, namely robust regulation and optimal robust regulation. The robust regulation problem
encompasses two aspects, namely robust tracking and robust disturbance rejection. The robust track-
ing problem can be described as follows: Suppose we are given a plant P , together with an unstable
reference signal generator which is factored in the form D̃r−1 Ñr , where D̃r , Ñr are left-coprime.
The objective is to design a compensator such that the plant output tracks the unstable reference
signal, and continues to do so even as the plant is slightly perturbed. This problem is depicted in
Figure 7.2, where, to achieve the greatest possible generality, a two-parameter compensation scheme
is shown. Alternatively, suppose the unstable signal is a disturbance which is injected at the plant
output. In this case, the objective is to design a compensator such that the plant output is insensitive
to the disturbance, and remains so even as the plant is slightly perturbed. This situation is shown
in Figure 7.3, and is referred to as the robust disturbance rejection problem. Note that a feedback,
or one-parameter, configuration is employed in Figure 7.3 since, as shown in Section 5.6, both
configurations are equivalent for the purposes of disturbance rejection.
In more precise terms, the robust disturbance rejection problem can be stated as follows: Find
a compensator such that three conditions are met, namely
7.5. ROBUST AND OPTIMAL REGULATION 253
v ~ ~ r ~ ~ y
Dr—1 Nr N c1 Dc—1 P
+–
~
N c2
Figure 7.2: Robust Tracking Using a Two-Parameter Compensator.
v
~ —1 ~
Dr Nr
+
+ y
+– C P
Figure 7.3: Robust Disturbance Rejection.
(i) The feedback system of Figure 7.2 is stable.

(ii) The transfer matrix from v to the tracking error r − y belongs to M(S).
(iii) There exists a neighborhood N(P ) of P in the graph topology such that conditions (i)
and (ii) hold for all plants in N(P ).
If a one-parameter or feedback compensator is used, the robust tracking problem is mathe-
matically equivalent to the robust disturbance rejection problem, since the transfer matrix from v
to the tracking error e in Figure 7.4 is exactly the same as the transfer matrix from v to the plant
output y in Figure 7.3. On the other hand, if a two-parameter compensation scheme is used, there
is a distinct difference between robust tracking and robust disturbance rejection. As a matter of
fact, there is nothing to be gained by using a two-parameter scheme if the objective is to reject a
disturbance (see Section 5.6). For this reason, throughout this section, attention is focused on the
robust tracking problem.
v ~ ~ r e y
Dr—1 Nr C P
+–
Figure 7.4: Robust Tracking Using a One-Parameter Compensator.

254 7. ROBUSTNESS
Let us now formulate the problem in more mathematical terms. Given the plant P , let
(N, D), (D̃, Ñ ) be any r.c.f. and l.c.f. of P . The objective is to find a compensator C such that three
conditions are satisfied:
(R1) C ∈ S(P ).
(R2) (I + P C)−1 D̃r−1 Ñr ∈ M(S).11
(R3) There exists a δ such that (R1) and (R2) hold whenever P is replaced by P1 = D̃1−1 Ñ1
where
[D̃ − D̃1 Ñ − Ñ1 ] < δ . (7.5.1)
The solution to the robust tracking problem is given next.
Theorem 7.5.1 Let αr denote the largest invariant factor of D̃r .12 Then there exists a C satisfy-
ing (R1)–(R3) if and only if N and αr I are left-coprime. If N and αr I are left-coprime, the set of
all C that satisfy (R1)–(R3) is given by
{C1 /αr : C1 ∈ S(P /αr )} . (7.5.2)
Checking whether or not the problem at hand has a solution is made easier by the next result.
Lemma 7.5.2 The following statements are equivalent:

(i) N and αr I are left-coprime.
(ii) N (s) has full row rank at all C+e -zeros of αr .
(iii) N has full row rank, and the largest invariant factor of N and αr are coprime.
Thus, Lemma 7.5.2 shows that only square or fat plants can be robustly regulated, and reduces
the matrix coprimeness test of Theorem 7.5.1 to a scalar coprimeness criterion. Once one is sure
that the robust tracking problem does indeed have a solution in the case at hand, the formula (7.5.2)
describing all compensators C that achieve robust tracking has a very simple interpretation: In
Figure 7.5, choose C1 to be any compensator that stabilizes the feedback system. Then C = C1 /αr
achieves (R1)–(R3). Moreover, every such C is of the above form. The presence of the 1/αr term
within the compensator is referred to elsewhere as the “internal model principle” [107].
The proof of Theorem 7.5.1 requires two preliminary results. The first of these is the same as
Lemma 5.7.3, and is reproduced here for convenience.
Lemma 7.5.3 Suppose T , U, V ∈ M(S) with |U | = 0 and U, V left-coprime.Then T U −1 V ∈ M(S)

if and only if T U −1 ∈ M(S).
Because of Lemma 7.5.3, condition (R2) can be replaced by

11 Note that (R2) is redundant unless D̃ , is non-unimodular, i.e., the reference signal generator is unstable.
r
12 Note that α is a nonunit since D̃ is not unimodular.
r r
v ~ ~ r
Dr—1 Nr C Ơr P
+–
Figure 7.5: Solution to the Robust Tracking Problem.
(R2’) (I + P C)−1 D̃r−1 ∈ M(S).

The second result is actually a special case of a fact established in the next section (see Propo-
sition 7.6.5). It is stated here without proof, and the reader may wish to provide a specialized proof
that applies to the ring S (see Problem 7.5.1).
Lemma 7.5.4 Suppose A, B ∈ M(S) are square, have the same dimensions, and suppose |A| = 0. Then
for every ε > 0, the ball
B(B, ε) = {B1 ∈ M(S) : B1 − B < ε} , (7.5.3)
contains a matrix B1 such that A, B1 are right-coprime.
The proof of Theorem 7.5.1 is very long. The first step, which takes us most of the way, is to
establish the next result.
Lemma 7.5.5 Suppose (Nc , Dc ) is an r.c.f. of C ∈ S(P ). Then C achieves (R1)–(R3) if and only if
αr−1 Dc ∈ M(S) (i.e., if and only if αr divides every component of Dc ).
Proof. Since the robustness condition (R3) involves perturbing the plant from its nominal description
P , we can assume at the outset that D̃ and D̃r are right-coprime. If this is not so originally, it can be
achieved by an arbitrarily small perturbation of D̃, by Lemma 7.5.4. Also, since the truth or falsity
of the condition αr−1 Dc is unaffected by multiplying Dc by a unimodular matrix, we can as well
choose the particular r.c.f. (Nc , Dc ) of C that satisfies D̃Dc + ÑNc = I . Finally, note that since αr
is the largest invariant factor of D̃r , we have that αr D̃r−1 ∈ M(S) (see Problem 7.5.2).
“if ” Suppose Dc = αr T , where T ∈ M(S). Now (R1) holds by assumption, since C ∈ S(P ).
Next,
(I − P C)−1 D̃r−1 = Dc D̃ D̃r−1 = αr T D̃ D̃r−1 ∈ M(S) , (7.5.4)
so that (R2’) holds. To prove (R3), suppose P = D̃ −1 Ñ is replaced by P1 = D̃1−1 Ñ1 , where D̃1 =
D̃ + D̃, Ñ1 = Ñ + Ñ . Then the corresponding return difference matrix is

Dc
M = D̃1 Dc + Ñ1 Nc = I + [D̃ Ñ] , (7.5.5)
Nc
256 7. ROBUSTNESS
which is unimodular provided
−1
Dc
[D̃ Ñ] < δ =
Nc . (7.5.6)
Hence, (R1) holds robustly. To prove that (R2’) also holds robustly, note that
(I + P1 C)−1 D̃r−1 = Dc M −1 D̃1 D̃r−1 = T M −1 D̃1 αr D̃r−1 ∈ M(S) , (7.5.7)
since M −1 ∈ M(S) due to the unimodularity of M.

“only if ” Suppose (Rl), (R2’), (R3) hold; it is to be shown that αr−1 Dc ∈ M(S). This part of
the proof is divided into several steps.
First, since D̃ and Dr are right-coprime, let (A, B) be an l.c.f. of D̃ D̃r−1 . Then, by Theo-
rem 4.1.15, A and D̃r have the same invariant factors. In particular, αr is the largest invariant factor
of A.
Step 1 It is shown that
Dc A−1 ∈ M(S) . (7.5.8)
By assumption, (R2’) holds, i.e., Dc D̃ D̃r−1 = Dc A−1 B ∈ M(S). An application of Lemma 7.5.3
gives (7.5.8).
Dc QÑNc A−1 ∈ M(S) ∀Q ∈ M(S) . (7.5.9)
Let Q ∈ M(S) be arbitrary, and choose ε in the interval (0, 1/QÑNc ). Then I − εQÑNc is
unimodular. Define M = (I − εQÑNc )−1 . Then
∞

M=I+ (εQÑNc )i = I + Rε ÑNc , (7.5.10)
i=1
for some rational matrix Rε , (note that M is rational). Moreover, Rε → 0 as ε → 0. Hence, Rε <
δ for sufficiently small ε, where δ is the number defined in (7.5.1). Now let P1 = D̃ −1 (I + Rε )Ñ.
By assumption, (I + P1 C)−1 D̃r−1 ∈ M(S), since P1 is a valid perturbation of P . But
(I + P1 C)−1 D̃r−1 = Dc [D̃Dc + (I + Rε )ÑNc ]−1 D̃ D̃r−1 = Dc (I + Rε ÑNc )−1 A−1 B

= Dc (I − εQÑNc )A−1 B ∈ M(S) . (7.5.11)
Applying Lemma 7.5.3 gives
Dc (I − εQÑNc )A−1 ∈ M(S) . (7.5.12)
Subtracting (7.5.8) yields εDc QÑNc A−1 ∈ M(S), and dividing by ε leads to (7.5.9).
Dc QA−1 ∈ M(S) ∀Q ∈ M(S) . (7.5.13)

Indeed, since D̃Dc + Ñ Nc = I ,
Dc QA−1 = Dc Q(D̃Dc + ÑNc )A−1 = (Dc QD̃)Dc A−1 + Dc QÑNc A−1 ∈ M(S) , (7.5.14)
by (7.5.8) and (7.5.9), respectively.

αr−1 Dc ∈ M(S) , (7.5.15)
which is the desired conclusion. Select unimodular U, V ∈ U(S) such that
⎡ ⎤
α1 0
⎢ .. ⎥
U AV = ⎣ . ⎦=S, (7.5.16)
0 αr
is the Smith form of A. By (7.5.8),
Dc QA−1 ∈ M(S) ∀Q ⇐⇒ Dc QV S −1 U ∈ M(S) ∀Q

⇐⇒ Dc QV S −1 ∈ M(S) ∀Q
⇐⇒ Dc Q1 S −1 ∈ M(S) ∀Q1 (7.5.17)
where Q1 denotes QV , and Q, Q1 range over M(S). Now let Q1 = I . Then Dc S −1 ∈ M(S), which
shows that αr divides the last column of Dc . Next, successively let Q1 equal the permutation matrix
that interchanges column j with the last column, and let the index j vary over all possible values.
Then Dc Q1 S −1 ∈ M(S) implies that αr divides every column of Dc . 2
The argument used in the last step is sufficiently useful that it deserves display as a separate
item.
Proposition 7.5.6 Suppose A, B ∈ M(S), have the same number of columns, and in addition A is
square and nonsingular. Then
BQA−1 ∈ M(S) ∀Q ∈ M(S) , (7.5.18)
if and only if the largest invariant factor of A divides every element of B.
Proof of Theorem 7.5.1. It is first shown that the robust regulation problem has a solution if and
only if N and αr I are left-coprime. Select X̃, Ỹ ∈ M(S) such that Ñ X̃ + D̃ Ỹ = I . Then by Theo-
rem 5.2.1,
S(P ) = {(X̃ + DR)(Ỹ − NR)−1 : R ∈ M(S)} . (7.5.19)
Thus, the set of all Dc as C varies over S(P ) is given by {Ỹ − NR : R ∈ M(S)}. By Lemma 7.5.5,
the robust tracking problem has a solution if and only if αr−1 (Ỹ − NR) ∈ M(S) for some R ∈ M(S),
or equivalently, there exist R, W ∈ M(S) such that
Ỹ − NR = αr W . (7.5.20)
258 7. ROBUSTNESS
Now (7.5.20) can be rearranged as
NR + αr W = Ỹ . (7.5.21)
Thus, the proof is complete if it can be shown that the left-coprimeness of N and αr I is necessary and
sufficient for (7.5.21) to have a solution. Accordingly, suppose first that N, αr I are left-coprime, and
select T , U ∈ M(S) such that NT + αr U = I ; then R = T Ỹ , W = U Ỹ is a solution of (7.5.21).
Conversely, suppose N and αr I are not left-coprime, and let V be a nontrivial (i.e., nonunimodular)
common left-divisor of N and αr I . Then NR + αr W is a left-multiple of V for all R, W . On
the other hand, from Corollaries 4.1.18 and 4.1.7, N and Ỹ are left-coprime, so that Ỹ is not a
left-multiple of V . Hence, (7.5.21) does not have a solution.
To complete the proof, it is shown that the set of all C ∈ S(P ) such that αr Dc−1 ∈ M(S)
is precisely {C1 /αr : C1 ∈ S(P /αr )}, provided of course that N and αr I are left-coprime. First,
suppose Dc = αr T for some T ∈ M(S). Then, since
I = D̃Dc + ÑNc = αr D̃T + ÑNc , (7.5.22)
we see that T and Nc are right-coprime, and that C1 = Nc T −1 stabilizes (αr D̃)−1 Ñ = P /αr .
Hence, every solution to the robust tracking problem is of the form C1 /αr , for some C1 ∈ S(P /αr ).
Conversely, suppose C1 ∈ S(P /αr ). Now P /αr = (αr D̃)−1 Ñ; moreover, it can be shown that αr D̃
and Ñ are left-coprime. Hence, (αr D̃, Ñ) is a l.c.f. of P /αr . Since C1 ∈ S(P /αr ), C1 has an r.c.f.
(Nc1 , Dc1 ) satisfying
αr D̃Dc1 + ÑNc1 = I . (7.5.23)
Thus, C = C1 /αr = Nc1 (αDc1 )−1 stabilizes P , and the denominator matrix of C is a multiple of
αr . Hence, C solves the robust tracking problem. 2
Corollary 7.5.7 Suppose N and αr I are left-coprime, and let T , U ∈ M(S) be particular solutions of
NT + αr U = I. (7.5.24)
Let (B, A) be an r.c.f. of N/αr . Then the set of all C ∈ S(P ) that solve the robust tracking problem is
{[X̃ + D(T Ỹ + AS)][αr (U Ỹ − BS)]−1 : S ∈ M(S)} . (7.5.25)
Proof. From the proof of Theorem 7.5.1, we see that C solves the robust tracking problem if and
only if C = (X̃ + DR)(Ỹ − NR)−1 for some R satisfying (7.5.21). Now it is routine to verify that
the set of all solutions of (7.5.21) is
R = T Ỹ + AS , W = U Ỹ − BS , S ∈ M(S) . (7.5.26)
Substituting for R from (7.5.26) and noting that Ỹ − NR = αr W = αr (U Ỹ − BS) leads to (7.5.25).
2
Proof of Lemma 7.5.2. Recall from Section 4.2 that N and αr I are left-coprime if and only if the
matrix F = [N αr I ] has full row rank at all s ∈ C+e . If αr (s) = 0, the rank condition holds;
therefore, it is only necessary to verify the rank condition at the C+e -zeros of αr . If αr (s) = 0, then
F (s) has full row rank if and only if N(s) has full row rank. Hence, we conclude that N and αr I are
left-coprime if and only if N (s) has full row rank at all C+e -zeros of αr . Clearly this is the same as
requiring the largest invariant factor of N and αr to be coprime. 2
Lemma 7.5.2 also holds over any principal ideal domain, and not just over S; but the proof in
the general case is a little more complicated (see Problem 7.5.3).
Now let us consider the problem of robust tracking using a two-parameter compensation
scheme. Recall from Section 5.6 that such a scheme can be depicted as shown in Figure 7.2. In this
scheme, the tracking error between the reference input r and the plant output y is given by
e = [I − N(Ñc2 N + D̃c D)−1 Ñc1 ]r . (7.5.27)
Hence, the transfer matrix from v to e is given by
F (N, D) = [I − NM −1 Ñc1 ]D̃r−1 Ñr , (7.5.28)
where
M = M(N, D) = Ñc2 N + D̃c D . (7.5.29)
As before, the objectives of the robust tracking problem are to determine the compensator parameters
Ñc1 , Ñc2 , D̃c such that the closed-loop system is stable and F (N, D) ∈ M(S) for the nominal plant
as well as for all plants in some neighborhood of the nominal plant.
A partial solution to the problem is given by the next theorem.
Theorem 7.5.8 Consider the system of Figure 7.2, and suppose P has at least as many inputs as
outputs. Let (N, D) be an r.c.f. of the nominal plant, and let (N1 , D1 ) denote perturbations of (N, D).
Let (Nc2 , Dc ) be an r.c.f. of the feedback compensator D̃c−1 Ñc2 .Then the triple (Ñc1 , Ñc2 , D̃c ) solves
the robust tracking problem if and only if
(i) αr−1 Dc ∈ M(S), and
(ii) N1 [M(N1 , D)]−1 (Ñc2 − Ñc1 )D̃r−1 ∈ M(S) ∀(N1 , D1 ).
If P is square, (ii) is equivalent to
(ii’) (Ñc2 − Ñc1 )D̃r−1 ∈ M(S).
If P is nonsquare, (ii’) implies but is not implied by (ii).
Remarks 7.5.9 If we use a one-parameter, or feedback compensator, then Ñc2 = Ñc1 , so that
(ii) above is automatically satisfied. In this case, (i) is a necessary and sufficient condition for robust
tracking. This is precisely the result contained in Lemma 7.5.5.
260 7. ROBUSTNESS
Proof. “if ” Suppose first that (i) and (ii) hold. It is desired to show that
(I − N1 M −1 Ñc1 )D̃r−1 ∈ M(S) ∀(N1 , D1 ) , (7.5.30)
where M = M(N1 , D1 ) is as in (7.5.29). Define

D1 −Nc2
S= . (7.5.31)
N1 Dc
By the robustness of closed-loop stability, S is unimodular for (N1 , D1 ) sufficiently close to (N, D).
Let T = S −1 , and partition T as

A B
T = . (7.5.32)
K L
Now [A B] is equivalent to [I 0], since it is part of a unimodular matrix. This, plus the fact that

−Nc2
[A B] =0, (7.5.33)
Dc
implies that [A B] is an l.c.f. of D̃c−1 Ñc2 , i.e., that
[A B] = U [D̃c Ñc2 ] for some U ∈ U(S) . (7.5.34)
Similarly,
[K L] = [−Ñ1 D̃1 ] , (7.5.35)
where (Ñ1 , D̃1 ) is an l.c.f. of the perturbed plant N1 D1−1 . Now T S = I implies that
U (D̃c D1 + Ñc2 N1 ) = I , (7.5.36)
or U = M −1 . Also, ST = I implies in particular that
N1 B + Dc L = I , (7.5.37)
or
NM −1 Ñc2 + Dc D̃1 = I . (7.5.38)
Therefore
(I − N1 M −1 Ñc1 )D̃r−1 = Dc D̃1 D̃r−1 + N1 M −1 (Ñc2 − Ñc1 )D̃r−1 , (7.5.39)
which belongs to M(S) by (i) and (ii). Hence, robust tracking is achieved.
“only if ” Suppose the triple (Ñc1 , Ñc2 , D̃c ) achieves robust tracking. It is desired to show
that (i) and (ii) hold. The hypothesis is that
{I − N1 [M(N1 , D1 )]−1 Ñc1 }D̃r−1 Ñr ∈ M(S) ∀(N1 , D1 ) , (7.5.40)

where (N1 , D1 ) range over some neighborhood of (N, D). By Lemma 7.5.3, (7.5.40) is equivalent
to
{I − N1 [M(N1 , D1 )]−1 Ñc1 }D̃r−1 ∈ M(S) ∀(N1 , D1 ) . (7.5.41)
As a prelude to the proof, observe first that it can be assumed without loss of generality that the
numerator matrix N of the nominal plant and αr I are left-coprime. If this is not so originally, it
can be achieved by an arbitrarily small perturbation of the plant: It suffices to perturb N to a nearby
matrix which has full row rank at all C+e -zeros of αr (see Lemma 7.5.2). Also, we may as well
assume that the r.c.f. (N, D) of the nominal plant and the triple (Ñc1 , Ñc2 , D̃c ) satisfy
Ñc2 N + D̃c D = I . (7.5.42)
Finally, let (D̃, Ñ ) be an l.c.f. of the nominal plant defined by the equation
−1
D −Nc2 D̃c Ñc2
= . (7.5.43)
N Dc −Ñ D̃
Then, since Ñ and N have the same invariant factors, it follows from Lemma 7.5.2 that Ñ and αr I
are also left-coprime.
The first step of the proof is to observe that the hypotheses imply that Ñc1 and D̃r are
right-coprime. To see this, note that (7.5.41) implies that
W = (I − N Ñc1 )D̃r−1 ∈ M(S) . (7.5.44)
Hence,
N Ñc1 + W D̃r = I , (7.5.45)
which shows that Ñc1 and D̃r are right-coprime. Let (A, B) be an l.c.f. of Ñc1 D̃r−1 , and note that αr
is also the largest invariant factor of A. Next, fix the plant numerator at N and vary its denominator.
Then (7.5.41) states that
{I − N [M(N, D1 )]−1 Ñc1 }D̃r−1 ∈ M(S) ∀D1 . (7.5.46)
Subtracting (7.5.44) from (7.5.46) leads to
N{[M(N, D1 )]−1 − I }Ñc1 D̃r−1 = N{[M(N, D1 )]−1 − I }A−1 B ∈ M(S) ∀D1 . (7.5.47)
Hence, from Lemma 7.5.3,
N{[M(N, D1 )]−1 − I }A−1 ∈ M(S) ∀D1 . (7.5.48)
Now M = M(N, D1 ) = I + D̃c (D1 − D). By an argument analogous to that in the proof of
Lemma 7.5.5 (see especially Step 2), it can be shown that M −1 − I can be made equal ε D̃c Q
for an arbitrary matrix Q ∈ M(S), by suitable choice of D1 . Therefore, (7.5.48) implies that
N D̃c QA−1 ∈ M(S) ∀Q ∈ M(S) . (7.5.49)

262 7. ROBUSTNESS
Now Proposition 7.5.6 leads to the conclusion that αr the largest invariant factor of A, divides each
element of N D̃c . However, from (7.5.43), N D̃c = Dc Ñ. Hence,
Dc αr−1 Ñ ∈ M(S) . (7.5.50)
Finally, since αr I and Ñ are left-coprime, an application of Lemma 7.5.3 to (7.5.50) gives
Dc αr−1 ∈ M(S) . (7.5.51)
This proves (i).

The proof of (ii) is now very easy. From previous arguments, robust tracking is equivalent
to (7.5.39). However, since αr divides every element of Dc , we have that
Dc D̃1 D̃r−1 ∈ M(S) ∀(N1 , D1 ) . (7.5.52)
Subtracting (7.5.52) from (7.5.39) shows that
N1 M −1 (Ñc2 − Ñc1 )D̃r−1 ∈ M(S) ∀(N1 , D1 ) . (7.5.53)
This proves (ii).

To complete the proof, it is necessary to show two things: 1) Condition (ii’) implies (ii);
2) If the plant is square, (ii) implies (ii’). Of these, the first statement is obvious: If Ñc2 − Ñc1 is a
left-multiple of D̃r , then (7.5.53) is clearly satisfied. To prove the second statement, define
Q = (Ñc2 − Ñc1 )D̃r−1 . (7.5.54)
Then (7.5.53) implies in particular that NQ ∈ M(S). Now select unimodular matrices U, V ∈ U(S)
such that U QV is in Smith-McMillan form, and let
S = U −1 QV −1 = Diag {s1 /t1 , · · · , sr /tr } . (7.5.55)
Since N Q ∈ M(S), it follows that ti divides the i-th column of NU , for all i. Now, if Q itself does
not belong to M(S), then t1 is not a unit. Note that t1 is a divisor of αr , because of (7.5.54). Now let
s be any C+e -zero of t1 . Then, since t1 divides the first column of NU, NU (s) is singular, which in
turn implies, by the unimodularity of U , that N(s) is singular. However, since t1 is a divisor of αr ,
this contradicts the left-coprimeness of N and αr I . Hence, Q ∈ M(S). 2
Theorem 7.5.8 shows that (ii) implies (ii’) for square plants. It is now shown by example that
there exist situations involving nonsquare plants where (ii) holds but (ii’) is false.
Example 7.5.10 Suppose the plant

3s 4s + 1
P =
2s − 1 2s − 1
is to be compensated so as to robustly track a step input. In this case, we may take D̃r = αr =
s/(s + 1). For convenience, let z denote s/(s + 1), and note that z ∈ S. Then
P = [3z 3z + 1](3z − 1)−1 ,
and an r.c.f. for P is given by
N = [−z 3z + 1] ,

1 + z −1 − 3z
D= .
−2z 6z − 1
Now choose
1 0 1
D̃c = , Ñc2 = .
z z 1−z
Then it is claimed that robust tracking is achieved provided αr divides the second component of
Ñc2 − Ñc1 , even if αr does not divide the first component of Ñc2 − Ñc1 .This is proved by establishing
that condition (ii) of Theorem 7.5.8 holds. For this purpose, suppose N, D are perturbed to N1 , D1 ,
and partition these as
N1 = [na nb ], D1 = [Da Db ] .
Also, partition Ñc1 , Ñc2 , D̃c as

x y D̃a
Ñc1 = a , Ñc2 = a , D̃c = .
xb yb D̃b
The key point to note here is that D̃b is a multiple of z = αr . Now partition M = Ñc2 N1 + D̃c D1 ,
in the form
maa mab
M= .
mba mbb
Then
N1 M −1 (Ñc2 − Ñc1 )/αr

= |M|−1 [(na mbb − nb mba )(ya − xa )]/αr
+ |M|−1 [(−na mab − nb maa )(yb − xb )]/αr .
Since αr divides yb − xb , the second term belongs to S, provided the plant perturbation is small
enough to maintain the unimodularity of M, i.e., to maintain closed-loop stability. As for the first
term,
mba = yb na + D̃b Da , mbb = yb nb + D̃b Db

na mbb − nb mba = D̃b (na Db − nb Da )
264 7. ROBUSTNESS
is a multiple of αr , since αr divides D̃b . Hence, (ii) holds and robust tracking is achieved, whatever
be xa !
Now let us consider the problem of optimal regulation.The problems studied in Chapter 6 can
be interpreted as minimizing the effect of a stable disturbance on the plant output, or minimizing the
tracking error in response to a stable reference input. With the aid of the results derived thus far in
this section, it is now possible to tackle the same problems in the presence of unstable disturbances or
reference inputs. Specifically, suppose P is a given plant, D̃r−1 Ñr is a given disturbance or reference
input generator, and W1 , W2 are given square weighting matrices. The objective is to minimize the
objective function
Jp = W1 (I + P C)−1 D̃r−1 Ñr W2 p , (7.5.56)
over all C that achieve robust regulation. This can be interpreted as the problem of optimal distur-
bance rejection, or optimal tracking using a one-parameter compensator. Using Theorem 7.5.1 and
Corollary 7.5.7, this problem can be transformed into one of the type solved in Chapter 6. With all
symbols as in Theorem 7.5.1 and Corollary 7.5.7, the set of all C over which Jp is to be minimized
is given by (7.5.25). Substituting from (7.5.25) into (7.5.56) shows that
(I + P C)−1 D̃r−1 = (U Ỹ − BS)D̃L , (7.5.57)

Jp = E − W1 BSF p , (7.5.58)
where
L = αr D̃r−1 ,
E = W1 U Ỹ F ,
F = D̃LÑr W2 . (7.5.59)
Note that the robust regulation problem has a solution if and only if the plant has at least as many
inputs as outputs. Hence, F is square or fat, and as a result the minimization of Jp can be carried
out using the techniques presented in Section 6.5.
Consider now the problem of optimal robust tracking of square plants using a two-parameter
compensation scheme. From Corollary 7.5.7 and Theorem 7.5.8 it follows that the set of all triples
(Ñc1 , Ñc2 , D̃c ) that achieve robust regulation is given by
D̃c−1 Ñc2 = [(X̃ + D)(T Ỹ + AS)][αr (U Ỹ − BS)]−1 ,

Ñc1 = Ñc2 − QD̃r , S, Q ∈ M(S) . (7.5.60)
The above expression is not very convenient, as it does not explicitly specify the compensator pa-
rameters. Such an expression can be obtained using a doubly coprime factorization for the plant.
Suppose

Y X D −X̃
=I . (7.5.61)
−Ñ D̃ N Ỹ
Then by Problem 5.2.2 it follows that
(X̃ + DR)(Ỹ − NR)−1 = (Y − R Ñ)−1 (X + R D̃) ∀R ∈ M(S) . (7.5.62)
Hence, the set of all compensators that achieve robust tracking can be described with the aid of
Corollary 7.5.7 and (7.5.62) as
D̃c = Y − (T Ỹ + AS)Ñ ,
Ñc2 = X + (T Ỹ + AS)D̃ ,
D̃c1 = X + (T Ỹ + AS)D̃ + QD̃r . (7.5.63)
For the class of compensators described above, the tracking error transfer matrix can be computed
from (7.5.39) as
(Dc D̃ D̃r−1 + NQ)Ñr = ((U Ỹ − BS)D̃L + NQ)Ñr . (7.5.64)
Comparing (7.5.64) with (7.5.57), we see that the extra flexibility arising from using a two-parameter
compensation scheme is the presence of the term NQ in (7.5.64).The expression (7.5.64) can actually
be simplified. Since the plant is square, N and αr I are both left- as well as right-coprime. Hence,
(N, αr I ) is a r.c.f. for αr−1 N , or in other words, we may take A = αr I, B = N . Hence, in (7.5.64),
(U Ỹ − BS)D̃L + N Q = U Ỹ D̃L + N(Q − S D̃L) = U Ỹ D̃L + NR , (7.5.65)
where R = Q − S D̃ is a free parameter. To reiterate, the set of stable transfer matrices achievable
using a feedback compensator is
(U Ỹ D̃L − NS D̃L)Ñr , S ∈ M(S) , (7.5.66)
whereas with a two-parameter scheme, the set is
(U Ỹ D̃L + NQ)Ñr , Q ∈ M(S) . (7.5.67)
Now, the problem of minimizing
Jp = W1 (I − N Ñc2 )D̃r−1 Ñr W2 p , (7.5.68)
over all robustly stabilizing two-parameter compensators can be reformulated as that of minimizing
Jp = E − W1 NQÑr W2 p , (7.5.69)
as Q varies over M(S), where E is defined in (7.5.59). It is easy to see that the infimum in (7.5.69)
is no larger than that in (7.5.58), since (7.5.69) contains (7.5.58) as a special case by choosing
Q = S D̃L. Moreover, as in Section 6.6, there always exist infinitely many optimal compensators if a
266 7. ROBUSTNESS
two-parameter scheme is used, and this flexibility can be exploited to achieve both good disturbance
rejection and good tracking simultaneously.
Example 7.5.11 For the sake of variety we consider discrete-time systems in the next two exam-
ples. Thus z denotes the unit delay and its z-transform. Note that the z-transform of a sequence
{f0 , f1 , . . .} is given by
∞

f (z) = fi z i .
i=0
Consider the set-up shown in Figure 7.3, and suppose it is desired to robustly reject a step
input disturbance. Since the z-transform of a sequence of one’s is 1/(1 − z), we may take
nr = 1, dr = αr = 1 − z .
Suppose the plant has two inputs and one output, and has the transfer matrix
P = [z(z − 0.5) z(z + 0.5)](z + 0.25)−1 .
An l.c.f. of this plant is given by
D̃ = z + 0.25 , Ñ = [z(z − 0.5) z(z + 0.5)] ,
and a particular solution of the Bezout identity Ñ X̃ + D̃ Ỹ = I is given by
X̃ = [4 − 4] , Ỹ = 4 .
An r.c.f. of P is given by
N = [z 0] ,

−1 −z − 0.5
D= .
1 z − 0.5
Now N and αr are left-coprime. In fact (7.5.24) is satisfied with
T = [1 0] , U = 1 .
Hence, the robust disturbance rejection problem has a solution. Further, all solutions of this problem
can be parametrized using Corollary 7.5.7. An r.c.f. of N/αr is

z−1 0
B = [z 0] , A = .
0 1
From (7.5.25), the set of all compensators that achieve robust disturbance rejection is given by
C = Nc Dc−1 ,
where
Dc = (4 − zs1 ) ,

−4z + 3 + (1 − z)s1 − (z + 0.5)s2
Nc = (z + 0.25) ,
4z − 3 + (z − 1)s1 + (z − 0.5)s2
s1 , s2 ∈ H∞ are rational .
Now consider the problem of optimal robust disturbance rejection as stated in (7.5.56), with
both weighting functions equal to unity. From (7.5.57) and (7.5.58), the set of achievable transfer
functions from v to y is
f (s1 , s2 ) = (4 − zs1 )(z + 0.25) .
Suppose it is desired to minimize the H∞ -norm of this transfer function. Using the methods
described in Section 6.4, one can verily that the minimum is achieved with s1 = 0, with a minimum
cost of 5. The collection of compensators that achieve the minimum is

−4z + 3 − (z + 0.5)s2
Dc = 4(1 − z) , Nc = , s2 ∈ H∞ .
4z − 3 + (z − 0.5)s2
Example 7.5.12 Consider the set-up shown in Figure 7.2, where the input to be tracked is the
sinusoidal sequence {0, 1, 0, −1, 0, 1, . . .}. Since the z-transform of this sequence is z/(1 + z2 ), we
may take
nr = z , αr = 1 + z2 .
Suppose the plant is scalar, with the transfer function
z
p= .
z + 0.5
Then, dropping the tildes because everything is a scalar, we have
n = z , d = z + 0.5 , x = −2 , y = 2 .
a = αr = 1 + z2 , b = n = z , t = −z , u = 1 .
Hence, the set of r, w that satisfy (7.5.21) is
r = ty + as = −2z + (1 + z2 )s , w = uy − bs = 2 − zs , s ∈ H∞ .
From Corollary 7.5.7, the set of all feedback, or one-parameter compensators that achieve robust
tracking is given by c = nc /dc , where
dc = (2 − zs)(1 + z2 ) , nc = −2 + (z + 0.5)(−2z + (1 + z2 )s) ,

268 7. ROBUSTNESS
and s ∈ H∞ is arbitrary. If it is desired to have optimal tracking by minimizing the objective func-
tion (7.5.56) with unity weighting, then one has to minimize
J∞ = (2 − zs)(z + 0.5)z∞ , s ∈ H∞ .
Using the techniques of Section 6.4, one can verify that the minimum is 3, corresponding to the
choice s = 0. Hence the unique feedback compensator that achieves optimal robust tracking is
−(1 + 0.5z + z2 )
c= .
1 + z2
Now let us see how much better we can do using a two-parameter compensation scheme. The set
of transfer functions from v to the tracking error that can be achieved is
nr ((uy − bs)d + nq) = z[(2 − zs) (z + 0.5) + zq], s , q ∈ H∞ .

= z(1 + zq1 ) ,
where q1 = q − (z + 0.5)s + 2. The H∞ -norm of the above is minimized when q1 = 0, and

equals 1. Hence, not only does the two-parameter scheme result in a reduction in the optimal
tracking error by a factor of 3, but there are infinitely many compensators that achieve this mini-
mum. In this way it is possible to achieve both optimal disturbance rejection as well as good tracking.
For example, a compensator that achieves optimal tracking corresponds to the choice q = −2, s = 0,
which leads to the compensator
dc = 2 + 2z2 ,
nc2 = −2(1 + 0.5 + z2 ) ,
nc1 = nc2 + 2αr = −z .
Note that the feedback compensator is the same as the optimal compensator discussed above. But
the use of a two-parameter scheme reduces the tracking error by a factor of 3.
PROBLEMS
7.5.1. Prove Lemma 7.5.4.
7.5.2. Suppose A ∈ M(S) is square and nonsingular and let a denote its largest invariant factor.
Show that aA−1 ∈ M(S).
7.5.3. Prove Lemma 7.5.2 over a general principal ideal domain R (Hint: Partition all the full-size
minors of the matrix [N αr I ] into several groups, namely: those that do not involve any
columns of αr I , those that involve one column, two columns, etc.).
7.5.4. Suppose D̃ and Ñ are left-coprime, as are Ñ and αr I . Show that αr D̃, Ñ are left-coprime.
7.6. GENERICITY 269
7.5.5. Using the techniques of Section 6.7, construct a sequence of plants Pi and a sequence of
reference inputs D̃r−1
i
Ñri such that the ratio of the optimal tracking error achievable using
a two-parameter compensation scheme to that achievable using a one-parameter scheme
approaches zero as i approaches ∞.
7.6 GENERICITY
In this section, the focus is on the notion of genericity, which is a mathematically precise way of
expressing the intuitive idea that some statement or property is almost always true. The concept of
genericity is defined first, and then it is shown that various useful properties of control systems are
generically true. These include coprimeness, strong stabilizability, and simultaneous stabilizability
(subject of course to suitable restrictions).
We begin with a formal definition of genericity. Let X denote a topological space. Recall that
a binary relation R on X is a subset of the product space X × X. It is symmetric if (x, y) ∈ R ⇐⇒
(y, x) ∈ R. More generally, an n-ary relation is a subset of the product space Xn .
Definition 7.6.1 A binary relation R on X is generic if it is an open dense subset of X. It is strongly

generic if the following is true: For each x ∈ X, define
B(x) = {y ∈ X : (x, y) ∈ R} . (7.6.1)
Then the set of x for which B(x) is an open dense subset of X is itself an open dense subset of X.
One can state the same definition in words: R is generic if two statements are true: (i) If a pair
(x, y) belongs to R, then there is a neighborhood of (x, y) such that every point in the neighborhood
also belongs to R. (ii) If a pair (x, y) does not belong to R, then every neighborhood of this pair
contains a point that belongs to R. In other words, R is generic if it has two features: (i) If a pair
satisfies the relation, then sufficiently small perturbations will not destroy the relation. (ii) If a pair
does not satisfy the relation, then arbitrarily small perturbations in the pair will make it satisfy the
relation. A simple (and pertinent) example is that, in the polynomial ring R[s] topologized as in
Problem C.2.3, the relation R defined by
R = {(x, y) : x and y have no common zeros} , (7.6.2)
is generic. The exact relationship between genericity and strong genericity is brought out in Prob-
lem 7.6.1. But, roughly speaking, the symmetric relation R is strongly generic if, for almost all x,
almost all y satisfy the relation (x, y) ∈ R.
Proposition 7.6.2 Let A, B be open dense subsets of a topological space X. Then C = A ∩ B is also open
and dense.
270 7. ROBUSTNESS
Proof. Clearly C is open, being the intersection of two open sets. Hence, it only remains to show
that it is dense. For this, it is necessary to show the following: Let x ∈ X be arbitrary. Then every
neighborhood of x contains a point of A ∩ B. Accordingly, let x ∈ X be chosen arbitrarily, and let N
be a neighborhood of x. Since A is dense, N contains an element a ∈ A. Moreover, since A is open,
there is a neighborhood N1 ⊆ A containing a. One can assume that N1 is also a subset of N by
replacing N1 by N ∩ N1 if necessary. Next, since B is also dense, the neighborhood N1 contains an
element b ∈ B. Since b also belongs to A, b belongs to A ∩ B and is contained in the neighborhood
N of x. 2
The topological space-under study in this section is the ring S of proper stable rational
functions. Recall that the topology on S is the natural one induced by its norm. Thus, a base for
the topology on S is provided by open balls of various radii centered at various points. Most of the
arguments given below can be extended to an arbitrary topological ring whose set of units is open
and where inversion of a unit is continuous, but at a considerable increase in the level of abstraction.
We begin by deriving some results concerning the genericity of coprimeness.
Proposition 7.6.3 Define a symmetric binary relation R on S by
R = {(f, g) : f and g are coprime} . (7.6.3)
Then R is strongly generic as well as generic.
Proof. It is shown only that R is strongly generic. The proof of genericity is similar and is left as an
exercise (see Problem 7.6.2).
Let f ∈ S, and define
Rf = {g ∈ S : f and g are coprime} . (7.6.4)
It is shown below that Rf is an open dense subset of S whenever f = 0. Since the set of nonzero
elements is open and dense in S, this would prove the strong genericity of R.
To show that Rf is open, suppose that f and g are coprime, and choose a, b ∈ S such that
af + bg = 1. If h − g is less than b−1 , then
af + bh = 1 + b(h − g) , (7.6.5)
is a unit of S, which shows that h and f are also coprime. Thus, if g ∈ Rf , then h ∈ Rf whenever
h belongs to the ball B(g, b−1 ). This shows that Rf is open.
To show that Rf is dense whenever f = 0, let g be an arbitrary element of S. If f and g are
coprime, then nothing further needs to be done. On the other hand, if f and g are not coprime, it
must be shown that every neighborhood of g contains an element which is coprime with f . Towards
7.6. GENERICITY 271
this end, let S denote the set of C+e -zeros of the function f , and note that S is a finite set since f
is nonzero. Now partition the set S into two disjoint subsets S1 and S2 . as follows:
S1 = {s ∈ S : g(s) = 0} , S2 = {s ∈ S : g(s) = 0} . (7.6.6)
Define
r = min |g(s)| , (7.6.7)
s∈S2
and note that r > 0. Now let ε > 0 be arbitrary; it will be shown that the ball B(g, ε) contains an
element h which is coprime with f . Select a positive δ < min(r, ε) and define h = g + δ · 1. Then,
it is claimed that f and h are coprime, since they do not have any common C+e -zeros. To establish
the claim, note that the only possible common C+e -zeros of f and h are the elements of the set S.
Now, if s ∈ S1 , then h(s) = δ = 0, whereas if s ∈ S2 , then |h(s)| > r − δ > 0. Hence, f and h are
coprime. 2
Thus, Proposition 7.6.3 shows that, generically, any two functions in S are coprime. In order
to extend this result to matrices, a preliminary lemma is needed.
Proposition 7.6.4 Suppose m > 1 is some integer. Then the set
Mm = {D ∈ Sm×m : |D| = 0} , (7.6.8)
is an open dense subset of Sm×m .
Proof. The map D → |D| is continuous, and the set of nonzero elements in S is open. Hence, Mm ,
which is the preimage of an open set under a continuous map, is itself open. To show that it is also
dense, suppose D is a matrix in Sm×m with |D| = 0, and let ε be an arbitrary positive number. It
is shown next that the ball B(D, ε) contains a matrix with a nonzero determinant. Select matrices
U, V ∈ U(S) such that
U DV = Diag {d1 , · · · , dr , 0, · · · , 0} , (7.6.9)
is the Smith form of D, and r is the rank of D. Observe now that di divides di+1 for all i. Now
define
S = Diag {d1 , · · · , dr , δdr , · · · , δdr } , (7.6.10)
and let D1 = U −1 SV −1 . Then |D1 | = 0, and D1 can be made to lie in the ball B(D, ε) by choosing
the constant δ sufficiently small. 2
The next proposition generalizes Proposition 7.6.3 to the matrix case.
Proposition 7.6.5 Suppose m < n are positive integers, and define
R(m, n) = {A ∈ Sm×n : The g.c.d. of all minors of A is 1 } . (7.6.11)

272 7. ROBUSTNESS
Then R(m, n) is an open dense subset of Sm×n .
Remarks 7.6.6 1) If R(m, n) is viewed as mn-ary relation on S, then the proposition states that
this relation is generic.
2) Another way of stating the proposition is: Suppose F ∈ Sm×k , G ∈ Sm×l . Then F and G
are generically coprime if k + l > m.
3) The definition (7.6.11) can be equivalently stated as:
R(m, n) = {A ∈ Sm×n : A ∼ [I 0]} . (7.6.12)
Proof. Clearly it is enough to prove the proposition for the case where n = m + 1. The openness
of the set R(m, n) is obvious, and it only remains to show that the set is dense. This is done by
induction on m. Note that the proposition is true when m = 1, by Proposition 7.6.3. Suppose the
proposition is true for matrices with m − 1 rows and m (or more) columns.To establish the statement
for matrices with m rows, suppose A ∈ Sm×n but does not belong to R(m, n). To be shown is that
every neighborhood of A contains a matrix belonging to R(m, n). Let ε > 0 be any specified number,
and partition A in the form
A = [A1 a2 ] , (7.6.13)
where A1 and a2 have dimensions m × m and m × 1, respectively. Now let A11 denote the submatrix
consisting of the first m − 1 rows of A1 . By the inductive hypothesis and Proposition 7.6.4, there
exists an m × m matrix
B11
B1 = , (7.6.14)
B12
such that (i) |B1 | = 0, (ii) [I 0] is a Smith form of B11 , and (iii) B1 − A1 < ε/2. Now let S
denote the set of C+e -zeros of |B1 |, and partition the set S into two disjoint sets S1 and S2 , as
follows:
S1 = {s ∈ S : all minors of [B1 a2 ] are zero} ,

S2 = {s ∈ S : not all minors of [B1 a2 ] are zero} . (7.6.15)
The idea, as in the proof of Proposition 7.6.3, is to perturb a2 in such a way as to destroy all the
common zeros in the set S1 while ensuring that none of the points in S2 becomes a common zero
of all the minors. For this purpose, note that all minors of the matrix [B1 a2 ] belong to S and are
thus, bounded over the closed right half-plane. Let μ be an upper bound on the norms of all these
minors. Now, at each point s ∈ S1 , find an (m − 1) × (m − 1) minor of B1 which is nonzero at s;
such a minor must exist, since [I 0] is a Smith form of B1 . Suppose such a minor is obtained by
7.6. GENERICITY 273
omitting the j -th column of B1 . Let em denote the m-th elementary unit vector. Then the m × m
minor of
B = [B1 a2 + δem ] , (7.6.16)
obtained by omitting the j -th column is nonzero at s, whenever δ = 0. Hence, at all s ∈ S1 , at least
one of the m × m minors of the matrix B in (7.6.16) is nonzero. The next step is to show that,
if δ is sufficiently small, then at least one of the m × m minors of the matrix B is nonzero at all
points in the set S2 as well. This can be shown as follows: In analogy with (7.6.7), choose a real
number r > 0 such that, at each s ∈ S2 , at least one m × m minor of the matrix [B1 a2 ] is nonzero.
Now choose δ < r/μ. Then, at each s ∈ S2 , the corresponding m × m minor of B is also nonzero.
Finally, since S = S1 ∪ S2 , is the zero set of |B1 |, these are the only possible points in the closed
RHP where all m × m minors of the matrix B could vanish. Hence, we conclude that, whenever
δ < min(r/μ, ε), B has the Smith form [I 0]. 2
Proposition 7.6.5 shows that a rectangular matrix generically has the Smith form [I 0]. The
next result is addressed to the Smith form of square matrices.
Proposition 7.6.7 Suppose m is a positive integer, and define
Tm = {A ∈ Sm×m : A ∼ Diag {1, · · · , 1, |A|}} . (7.6.17)
Then Tm is an open dense subset of Sm×m .
Proof. The openness of Tm is left to the reader to prove. To prove that Tm is dense, suppose A does
not belong to Tm , and let ε > 0 be a given number. To construct an element A1 belonging to Tm
lying in the ball B(A, ε), we proceed as follows: First, by Proposition 7.6.4, it can be assumed that
|A| = 0, by perturbing the matrix A slightly if necessary. Now select unimodular matrices U, V such
that
S = U AV = Diag {a1 , · · · , am } , (7.6.18)
is a Smith form of A.Then, for any δ > 0, one can find functions bi ∈ B(ai , δ) such that the functions
b1 , · · · , bm are all pairwise coprime (see Proposition 7.6.3). Hence, the matrix
B = U −1 Diag {b1 , · · · , bm }V −1 , (7.6.19)
has the Smith form Diag {1, · · · , 1, |B|}. By adjusting δ, one can ensure that B ∈ B(A, ε). 2
The first principal result of this section states that a collection of r multivariable plants is
generically simultaneously stabilizable provided the plants have at least r inputs or r outputs. A
useful corollary is that generically a multivariable plant can be strongly stabilized. These results
make use of Proposition 7.6.5 concerning the generic Smith form of rectangular matrices.
Theorem 7.6.8 Suppose l, m are given positive integers, and let M(R(s)) denote the set [R(s)]l×m ,
equipped with the graph topology. Let r be a positive integer, and define the r-ary relation R on
274 7. ROBUSTNESS
M(R(s)) by
R = {(P1 , · · · , Pr ) : S(Pi ) = ∅} . (7.6.20)
i
Then R is an open dense subset of M(R(s)) whenever r ≤ max(l, m).
The proof of the theorem is divided into two parts: First, it is shown that the result is true in
the case where min(l, m) = 1, i.e., the case where all plants are either single-input or single-output.
Then it is shown that the general case can be reduced to this special case.
Lemma 7.6.9 Suppose min(l, m) = 1. Then R defined in (7.6.20) is an open dense subset of M(R(s))
if max(l, m) ≥ r.
Proof. The proof is given for the case where l = 1 and m ≥ r. The other case can be proved by
taking transposes.
(i) (i) (i)
Given the plants P1 , · · · , Pr , each of dimension 1 × m, find l.c.f.’s (dp , [np1 · · · npm ]), for
i = 1, · · · , r. Define the matrix Q ∈ M(R(s))r×(m+1) by
qij = dp(i) if j = 1 ,
qij = n(i)
p(j −1) if j > 1 . (7.6.21)
Let C ∈ M(R(s))m×1 be a compensator, and let ([nc1 , · · · , ncm ] , dc ) be an r.c.f. of C. Then by

Theorem 5.1.6, C stabilizes each of the plants Pi if and only if each of the functions ui are defined
by
⎡ ⎤
⎡ ⎤ dc
u1 ⎢ nc ⎥
⎢ .. ⎥ ⎢ 1⎥
⎣ . ⎦ = Q ⎢ .. ⎥ , (7.6.22)
⎣ . ⎦
ur
ncm
is a unit. Turning this argument around, the set of plants {P1 , · · · , Pr } can be simultaneously sta-
bilized if and only if there exist units ui , i = 1, · · · , r such that (7.6.22) has a solution in M(R(s))
for the unknowns dc and nci , i = 1, · · · , m, where the first component of the solution (correspond-
ing to dc ) is nonzero. Now, if m ≥ r, then the matrix Q has more columns than rows. Hence, by
Proposition 7.6.5, Q is generically right-invertible. That is, if Q is not right-invertible, it can be
made so by an arbitrarily small perturbation in each of its elements, which is precisely the same as
slightly perturbing each of the plants Pi in the graph topology. Thus, by slightly perturbing each of
the plants, we can ensure that (7.6.22) has a solution in M(R(s)), whatever be the left hand side.
In particular, one can choose any arbitrary set of units {u1 , · · · , ur }, and (7.6.22) has a solution in
M(R(s)). The only additional point to be worried about is that the first element of this solution
(corresponding to dc ) should be nonzero. But this too can be guaranteed by perturbing the units
slightly if necessary. The details of this last argument are routine and are left to the reader. 2
7.6. GENERICITY 275
The next step is to reduce the case of an l × m system where l < m to the case of a single-output
system (the case where l > m is handled by transposition).
Lemma 7.6.10 Suppose l, m are given positive integers. Define the relation R on the product space
M(S)(m+l)×m × Rl as the set of all (D, N, v) such that vN, D are right-coprime. Then the relation R is
generic.
Proof. Let n(1) , · · · , n(l) denote the rows of the matrix N . By Proposition 7.6.5, it is generically
true that the Smith form of the matrix

D
Mi = (i) , (7.6.23)
n
is [I 0] for all i. In other words, this means that the g.c.d. of all m × m minors of the matrix

D
M= , (7.6.24)
N
involving exactly one row of N is equal to 1. Now, if v is an l × 1 vector, then the pair (D, vN) is
right-coprime if and only if the g.c.d. of all m × m minors of the matrix

D
Q= , (7.6.25)
vN
is equal to 1. We show that generically this is the case for almost all v ∈ Rl . Now the matrix Q has
dimensions (m + 1) × m, as do the matrices Mi defined in (7.6.23). Moreover, since the determinant
function is multilinear, a minor of Q involving the last row can be expressed as a linear combination
of the corresponding minors of the various Mi . More precisely, let qi denote the minor of Q obtained
j
by omitting the j -th row the matrix D, and define mi analogously. Then

l
j
q =
j
vi mi . (7.6.26)
i=1
By hypothesis, it is generically true that

j
g.c.d.i,j {mi } = 1 . (7.6.27)
j
Hence, if the elements mi satisfy the rank condition of [100], it follows that, for almost all vectors
v,
g.c.d.j {q j } = 1 . (7.6.28)
It is left to the reader to show that the rank condition is generically satisfied.This shows that (D, vN)
are right-coprime and completes the proof that the relation R is generic. 2
276 7. ROBUSTNESS
The significance of Lemma 7.6.10 is in showing that, generically, a multivariable plant can
be stabilized using a rank one compensator. Suppose P is a given multivariable plant with an r.c.f.
(N, D), and suppose there exists a vector v such that vN, D are right-coprime. Then one can find
a pair of elements (a, B) ∈ M(S) such that avN + BD = I . Note that a is a column vector. Now
the compensator C = B −1 av has rank one as an element of M(R(s)) and stabilizes P . Hence, if
one can stabilize P using a rank one compensator if one can find a row vector v such that vN, D
are right-coprime. Lemma 7.6.10 states that generically this is always possible.
With the aid of Lemma 7.6.10, the proof of Theorem 7.6.8 can be completed.
Proof of Theorem 7.6.8. If min(l, m) = 1, then the truth of the theorem is established by
Lemma 7.6.9. If not, suppose without loss of generality that r ≤ m ≥ l. For each plant Pi , it is
generically true that the set of row vectors v such that the m-input-scalar-output plant vPi can be
stabilized by an appropriate compensator is open and dense in Rl . Since the intersection of a finite
number of open and dense subsets of Rl is again open and dense (see Proposition 7.6.2), generi-
cally the set of row vectors v such that each of the plants vPi can be stabilized by an appropriate
compensator is open and dense. Moreover, by Lemma 7.6.10, generically there exists a common sta-
bilizing compensator C1 , since r ≤ m. Hence, C = vC1 simultaneously stabilizes each of the plants
Pi , i = 1, · · · , r. 2
In Section 5.4 it was shown that the problem of simultaneously stabilizing r plants is equivalent
to that of simultaneously stabilizing r − 1 plants using a stable compensator. Now Theorem 5.2.1
shows that r plants of dimension l × m can be generically simultaneously stabilized provided r ≤
max(l, m). Thus, it is natural to conjecture that r plants of dimension l × m can generically be
simultaneously stabilized using a stable compensator provided r < max(l, m). In fact this is true,
and the proof follows very nearly the reasoning above. However, a few minor adjustments are needed.
The reason for this is the following: Given r plants Pi , i = 1, · · · , r, they can be simultaneously
stabilized using a stable compensator if and only if the plants r + 1 plants 0, Pi , i = 1, · · · , r can
be simultaneously stabilized. Now, the reasoning used in Lemma 7.6.9 and Theorem 7.6.8 relies on
being able to perturb all of the plants as needed. However, in the present instance, one can only
perturb all except one plant (namely the zero plant). However, there is a saving feature: In choosing an
l.c.f. of the zero plant, the numerator matrix has to be zero, but the denominator matrix is arbitrary.
Hence, the proof of the above result is complete if it can be shown that the Smith form of an
(r + 1) × (m + 1) matrix of the form

d1 N1
M= , (7.6.29)
d2 0
is generically equal to [I 0] whenever r < m. Comparing the matrix in (7.6.29) to that in Propo-
sition 7.6.5, we see that the only difference is the presence of a row of zeros in the bottom of the
matrix M in (7.6.29). Now, checking back to the proof of Proposition 7.6.5, the ingredients to the
proof by induction were: (i) the fact that the r × m matrix N1 generically had the Smith form [I 0],
and (ii) the ability to adjust the corner element d2 . As both of these features are present in (7.6.29),
7.6. GENERICITY 277
we can conclude that the matrix M is generically right-invertible if r < m. Hence, the following
corollary to Theorem 5.2.1 is proved:
Corollary 7.6.11 Suppose the plants Pi , i = 1, · · · , r have dimension l × m.Then they can generically
be simultaneously stabilized using a stable compensator if r < m.
PROBLEMS
7.6.1. (a) Show that if R is open then Rx is open for all x.
(b) Show that if Rx is a dense subset of X for all x belonging to a dense subset of X, then
R is a dense subset of X × X.
7.6.2. Show that the relation R of (7.6.3) is generic.
7.6.3. Show that the relation Tm defined in (7.6.17) is open.

The graph topology of Section 7.2 and the graph metric of Section 7.3 were introduced in [96]. For
another metric on the set of square rational plants, see [114]. The robustness results in Section 7.4
are due to several researchers, including Doyle and Stein [32], Chen and Desoer [13], Kimura [58],
and Vidyasagar and Kimura [99]. For other results on the robustness of feedback stability, specifically
the gain margin of systems, see [57, 67]. Some of the results on robust regulation are from [39].
The genericity of simultaneous stabilizability was first established for two plants by Vidyasagar and
Viswanadham [102]. The general results given here were first obtained by Ghosh and Byrnes [43]
using state-space methods. The present treatment follows [100].
279
CHAPTER 8
Extensions to General Settings

The previous six chapters concentrated on the problems of controller synthesis in the case where
the underlying ring of “stable” transfer functions is the ring S defined in Section 2.1. As pointed
out in Section 4.1, the only property of S that is used repeatedly in Chapters 2–7 is the fact that S
is a principal ideal domain. Thus, most if not all of the contents of these chapters remain valid if
S is replaced by any p.i.d. This is why the synthesis theory of Chapters 2–7 encompasses, within a
single framework, lumped continuous-time as well as discrete-time systems, and situations where
the stable region in the complex plant is something other than the left half-plane (or the closed unit
disc in the discrete-time case).
The objective of the present chapter is to generalize the earlier synthesis theory by removing
the assumption that the underlying set of “stable” systems is a p.i.d. In effect, the generalization
extends the scope of the theory to include linear distributed systems, 2-D systems and the like.
In contrast with earlier chapters, this chapter, and in particular Section 8.1, presupposes a
mathematical background on the part of the reader well beyond that contained in the appendices.
Several references to the literature are given at appropriate places, but in general, [14, 116] are good
references for concepts from abstract algebra, while [42] is a handy source for results on Banach
algebras.
8.1 COPRIME FACTORIZATIONS OVER A COMMUTATIVE

RING
Throughout this section, R denotes a commutative domain with identity, and F denotes the field
of fractions associated with R. As in the rest of the book, the symbols M(R), M(F), U(R) denote
respectively the set of all matrices with elements in R, the set of matrices with elements in F, and the
set of all unimodular matrices with elements in R. If R is a principal ideal domain, then the contents
of Section 4.1 show that every matrix over F has both right- and left-coprime factorizations over
R. In fact, for this it is enough that R be a Bezout domain, i.e., that every finitely generated ideal in
R be principal. However, if R is not Bezout, then the questions of the existence of r.c.f.’s and l.c.f.’s
and their interrelationships need to be examined anew. It is the purpose of this section to do so.
The concepts of coprimeness and coprime factorizations introduced in Section 4.1 can be
extended in a natural way to a general ring R. Suppose A, B ∈ M(R) have the same number of
columns. Then A and B are right-coprime if there exist X, Y ∈ M(R) such that
XN + Y D = I . (8.1.1)
280 8. EXTENSIONS TO GENERAL SETTINGS
Suppose F ∈ M(F). A pair (N, D) where N, D ∈ M(R) is a right-coprime factorization (r.c.f.) of
F if (i) D is square and |D| = 0, (ii) F = ND −1 , and (iii) N, D are right-coprime. Left-coprime
matrices and l.c.f.’s are defined analogously.
Given an arbitrary F ∈ M(F), the existence of an r.c.f. for F is not automatic unless R is a
Bezout domain (see Corollary 8.1.5 below). However, if F does have an r.c.f., then it is easy enough
to characterize all r.c.f.’s of F . In fact, the relevant part of Theorem 4.1.13 carries over in toto. For
ease of reference, the result is repeated here, but the proof is omitted.
Lemma 8.1.1 Suppose F ∈ M(F) has an r.c.f. (N, D). Then the set of all r.c.f.’s of F consists of the set
of pairs (NU, DU ) as U ranges over all matrices in U(R).
But when does an F ∈ M(F) have an r.c.f.? We begin with the scalar case.
Lemma 8.1.2 Suppose f ∈ F , and express f as a/b where a, b ∈ R.Then f has a coprime factorization
if and only if the ideal generated by a and b in R is principal.
Proof. “if ” Suppose g is a generator of the ideal generated by a and b. Then a = ng, b = dg, g =
xa + yb for appropriate elements n, d, x, y ∈ R. Since b is the denominator of a fraction, it is
nonzero, which in turn implies that both d and g, are nonzero. This, together with earlier relations,
leads to f = n/d, xn + yd = 1. Hence, (n, d) is a coprime factorization of f .
“only if ” Suppose (n, d) is a coprime factorization of f , and choose x, y ∈ R such that
xn + yd = 1. Let r = b/d. Then
r = bd −1 = b(xnd −1 + y)
= b(xab−1 + y) = xa + yb ∈ R . (8.1.2)
Hence, d divides b in R, and dr = b. This implies that a = nr, and that the ideal generated by a, b
is the same as the principal ideal generated by r. 2
The matrix case is essentially similar, if one observes that every matrix F ∈ M(F) can be
expressed in the form AB −1 where A, B ∈ M(R) and |B| = 0. To see this, suppose fij = pij /qij

where pij , qij ∈ R. Define b = qij , and let aij = bpij /qij ∈ R. If A denotes the matrix of the
i j
aij ’s, then F = A(bI )−1 . The result in the matrix case is stated without proof.
Lemma 8.1.3 Suppose F ∈ Fn×m , and express F in the form AB −1 where A, B ∈ M(R) and |B| = 0.
Define
I = {XA + Y B : X ∈ Rm×n , Y ∈ Rm×m } , (8.1.3)
and note that I is a left ideal in Rm×m . Then F has an r.c.f. if and only if I is a left principal ideal.
Lemma 8.1.4 Every f ∈ F has a coprime factorization if and only if R is a Bezout domain.
8.1. COPRIME FACTORIZATIONS OVER A COMMUTATIVE RING 281
Proof. The following statement is easy to prove: R is a Bezout domain if and only if every pair of
elements in R generates a principal ideal. So, if R is not Bezout, then there exist a, b ∈ R such
that the ideal generated by a and b is not principal. Obviously, both a and b are nonzero. Hence,
f = a/b ∈ F, but does not have a coprime factorization, by Lemma 8.1.2. This proves the “only if ”
part. The “if ” part follows readily from Lemma 8.1.2. 2
Corollary 8.1.5 Every F ∈ Fn×m has an r.c.f. for all integers n, m ≥ 1 if and only if R is a Bezout
domain.
Proof. If R is a Bezout domain, the proof of Theorem 4.1.13 applies to show that every F ∈ Fn×m
has an r.c.f. for all n, m ≥ 1. 2
The next several paragraphs are devoted to deriving conditions for a matrix in M(R) to have
a left (or right) inverse in M(R), using the idea of reduction modulo maximal ideals. Since a finite
set of matrices A1 , · · · , An in M(R) are left-coprime if and only if the matrix [A1 , · · · , An ] has a
right inverse, this problem arises naturally in the study of the coprimeness of two (or more) matrices.
In the case where R is a Bezout domain, several equivalent criteria for left invertibility are given in
Section 4.1. It turns out that Corollary 4.1.5 is the result that extends most naturally to the case of
a general ring R (see Theorem 8.1.8 below).
An ideal I in R is maximal if I = R and I is not properly contained in any ideal of R other
than R itself. Equivalently (see [116, p. 150]), I is maximal if and only if the factor ring R/I is a field.
Let max(R) denote the set of maximal ideals of R. Then R/I is a field for each I ∈ max(R), even
though for different I this field could be different. However, if R is in addition a Banach algebra
over the complex field, then R/I is isomorphic to the complex field C for all I ∈ max(R) [42, p. 31].
For x ∈ R and I ∈ max(R), let x̂I denote the coset x + I in the factor ring R/I.
Lemma 8.1.6 Suppose a1 , · · · , an ∈ R. Then there exist x1 , · · · , xn ∈ R such that

n
xi ai = 1 (8.1.4)
i=1
if and only if
rank [â1I · · · ânI ] = 1 ∀I ∈ max(R) . (8.1.5)
Remarks 8.1.7 Note that (8.1.4) is equivalent to the statement that a1 , · · · , an generate R,
while (8.1.5) states that no maximal ideal of R contains all of the ai ’s. Thus, Lemma 8.1.6 is a
natural restatement of a well-known result [42, p. 24] that an element in R is a unit if and only if it
is not contained in any maximal ideal of R.
Proof. “if ” Suppose (8.1.5) holds and let J denote the ideal generated by the ai ’s. If J = R, then J is
contained in a maximal ideal of R [116, p. 151]. Suppose J ⊆ I ∈ max(R). Then I contains all the
ai ’s, in contradiction to (8.1.5). Hence, J = R, i.e., (8.1.4) holds.
“only if ” Suppose (8.1.4) holds. If (8.1.5) is false, then there exists an I ∈ max(R) that contains
all the ai ’s. In view of (8.1.4), this implies that 1 ∈ I , which is absurd. 2
Theorem 8.1.8 Suppose A ∈ Rn×m with n ≥ m. Then A has a left inverse in Rm×n if and only if1
rank ÂI = m ∀I ∈ max(R) . (8.1.6)
An equivalent, and extremely useful, way of stating this theorem is: A ∈ Rn×m has a left
inverse if and only if the set of m × m minors of A together generate R.
Proof. “only if ” Suppose B ∈ Rm×n is a left inverse of A, so that BA = Im . Since the map x → x̂I
from R into R/I is a homomorphism for each I ∈ max(R), it follows, upon taking cosets modulo I,
that
B̂I ÂI = Im ∀I ∈ max(R) . (8.1.7)
Now (8.1.7) implies (8.1.6) .
“if ” This part of the proof is made more intelligible by using the symbol S(m, n) defined
in Section B.1. Recall that S(m, n) denotes the set of all strictly increasing m-tuples taken from
{1, · · · , n}.
The case m = n is very easy: In this case (8.1.6) merely states that |A| is not contained in any
maximal ideal of R. Hence, |A| is a unit of R, and A−1 ∈ Rm×m .
Now suppose m < n. For each m-tuple J ∈ S(m, n), let FJ ∈ Rm×m denote the m × m
submatrix of A corresponding to the rows in J. In other words, if J = {j1 , · · · , jm }, then
⎡ ⎤
aj1 1 ··· a j1 m
⎢ .. .. ⎥ .
FJ = ⎣ . . ⎦ (8.1.8)
a jm 1 · · · a jm m
Let fJ = |Fj | denote the m × m minor of A corresponding to the rows in J . Then (8.1.6) states
that, for each I ∈ max(R), we have fˆJ I = 0 for some J ∈ S(m, n). Hence, by Lemma 8.1.6, the
minors fJ , J ∈ S(m, n) together generate R; i.e., there exist elements xJ , J ∈ S(m, n) such that

xJ fJ = 1 . (8.1.9)
J ∈S(m,n)
1 Note that Â is an n × m matrix, with elements in the field R/I, whose ij -th element is â .
I ij I
Next, for each J ∈ S(m, n), define a matrix BJ ∈ R as follows: Suppose J = {j1 , · · · , jm },
m×n
define FJ as in (8.1.8), and let GJ ∈ Rm×m

denote the adjoint matrix of FJ . Let gJ 1 , · · · , gJ m
denote the columns of GJ . Then
BJ = [0 gJ 1 gJ 2 · · · gJ m 0]. (8.1.10)
In other words, columns j1 , · · · , jm of BJ equal gJ 1 · · · , gJ m , respectively, while all other columns

of BJ are zero. By construction,
BJ A = GJ FJ = fJ Im . (8.1.11)
Now define
B= xJ BJ . (8.1.12)
J ∈S(m,n)
Then BA = Im , from (8.1.11) and (8.1.9).2 2
Corollary 8.1.9 Suppose A, B ∈ M(R) have the same number of columns. Then A and B are right-

coprime if and only if the matrix [ÂI B̂I ] has full column rank for all I ∈ max(R).
Application 8.1.10 [110] Let k be an algebraically closed field (e.g., the complex numbers), and let
R be the ring k[x1 , · · · , xr ] of polynomials in the r indeterminates x1 , · · · , xr with coefficients in k.
This ring arises in the study of multidimensional linear systems. For each r-tuple p = (p1 , · · · , pr ) ∈
k r , the set
Ip = {φ ∈ k[x1 , · · · , xr ] : φ(p) = 0} (8.1.13)
of polynomials that vanish at p is a maximal ideal of R. Moreover, since k is algebraically closed,
every maximal ideal of R is of the form (8.1.13) for some p ∈ k r [4, p. 82]. Now R/Ip is isomorphic
to k for all p ∈ k r . Further, the canonical map φ → φ + Ip takes the polynomial φ into the field
element φ(p).
In this case, Lemma 8.1.6 is the “weak” form of Hilbert’s nullstellensatz, namely: A finite
collection of polynomial φ1 , · · · , φn generates R if and only if, at each point p ∈ k r , at least one of
the φ(p) is nonzero. Corollary 8.1.9 states that two matrices A, B ∈ M(R) are right-coprime if and
only if the matrix [A (p) B (p)] has full column rank for all p ∈ k r .
If R is a Banach algebra over the complex field, then Lemma 8.1.6 and Theorem 8.1.8 continue
to apply, but some further refinements are possible because of the topological structure of the maximal
ideal space of R. As mentioned earlier, in this case R/I is isomorphic to C for all I ∈ max(R). Hence,
in this case one can think of the canonical projection x → x̂I as a homomorphism from R into C.
Thus, with every I ∈ max(R) one can associate a nonzero homomorphism from R to C.The converse
is also true: If φ : R → C is a nonzero homomorphism, then the kernel of φ is a maximal ideal of
2This proof was communicated to the author by N.K. Bose.
R. Let us now switch notation. Let denote the maximal ideal space of R, and for each ω ∈ and
each x ∈ R, let x̂ω denote the image (in C) of x under the homomorphism associated with ω. Let
be equipped with the weakest topology in which the map ω → x̂ω is continuous for each x ∈ R.
Such a topology can be defined in terms of its base, consisting of sets of the form
B(ω; x1 , · · · , xn ) = {θ ∈ : |x̂iθ − x̂iω | < ε ∀i} . (8.1.14)
Then becomes a compact topological space. Let C() denote the Banach algebra of continuous
(complex-valued) functions on , with the supremum norm; i.e.,
f C = max |f (ω)| , (8.1.15)

ω∈
and multiplication defined pointwise. Then, with each element x ∈ R, one can associate an element
x̂ ∈ C() by the rule x̂(ω) = x̂ω ∀ω ∈ . The function x̂ ∈ C() is called the Gel’fand transform of
x. Let R̂ denote the subset of C() consisting of the set of Gel’fand transforms of all x ∈ R. Then
R̂ is a subalgebra of C(), but does not in general equal C(); in fact, R̂ may not even be a closed
subalgebra of C().
The utility of the above theory arises from the fact that, for several Banach algebras of interest,
the maximal ideal space can be identified explicitly.
Examples 8.1.11 Let R = A, the disc algebra defined in Section 2.4. One can think of A as the
set of transfer functions of 2 -stable shift-invariant digital filters whose frequency responses are
continuous. The maximal ideal space of A is the closed unit disc D. For each z ∈ D, the map
f → f (z) (the “evaluation” map) defines a homomorphism from A into C; conversely, all nonzero
homomorphisms from A into C are of this type. Hence, given f ∈ A, the function z → f (z) is the
Gel’fand transform of f . Since the space C() is just the set of continuous functions C(D), in this
case the Gel’fand transform is just the injection map of A into C(D).
Let R = 1 (Z+ ), the set of sequences {fi }∞i=0 that are absolutely summable, with convolution
as the multiplication. As shown in [26, 97], 1 (Z+ ) is precisely the set of unit-pulse responses of
∞ -stable shift-invariant causal digital filters. The maximal ideal space of 1 (Z+ ) is once again the
closed unit disc D. Corresponding to each z ∈ D, one can associate the homomorphism
∞

fˆ(z) = fi zi , (8.1.16)
i=0
which maps 1 (Z+ ) into C. The map z → fˆ(z) which is a function in C(D) is the Gel’fand
transform of f ∈ 1 (Z+ ). In this case, it is just the usual z-transform of the sequence {fi }.
More generally, let R = 1 (Zn+ ) denote the set of all absolutely summable n-indexed sequences
{fi1 ···in } where the each index ranges over the nonnegative integers. The maximal ideal space of R
n
is D , the cartesian product of the unit disc with itself n times, also referred to as the n-polydisc.
Suppose z = (z1 , · · · , zn ) ∈ Dn . Then the function
∞
∞

fˆ(z) = ··· fi1 ···in z1i1 · · · znin (8.1.17)
i1 =0 in =0
defines a homomorphism from 1 (Zn+ ) into C. The map z → fˆ(z) which belongs to C(Dn ) is the
Gel’fand transform of f , which is again the usual z-transform.
Now suppose R = 1 (Z), the set of absolutely summable two-sided sequences {fi }∞ i=−∞ .
In this case the maximal ideal space of R is the unit circle T. For each z ∈ T, the corresponding
homomorphism from 1 (Z) to C is again given by (8.1.16), with the summation now running
from −∞ to ∞. If R = 1 (Zn ), then the maximal ideal space of R is Tn , and (8.1.17) defines a
homomorphism of R into C (with the obvious change of 0 to −∞ everywhere). One can replace
Zn by any locally compact Abelian group and the theory is still complete; see [80].
In some applications, one does not know the maximal ideal space of R explicitly, but does
know a dense subset of . Two such instances are discussed in Examples 8.1.15 and 8.1.16 below.
Theorem 8.1.8 can be easily extended to this case.
Lemma 8.1.12 Suppose R is a Banach algebra over C, with maximal ideal space . Suppose
is a
dense subset of , and suppose a1 , · · · , an ∈ R. Then there exist x1 , · · · , xn ∈ R such that

n
xi ai = 1 (8.1.18)
i=1
if and only if

n
inf |âi (ω)| > 0 . (8.1.19)
ω∈
i=1
Proof. “if ” Suppose (8.1.19) holds; then it is claimed that

n
|âi (ω)| = 0 ∀ω ∈ , (8.1.20)
i=1
or equivalently, that at least one âi (ω) is nonzero at each ω ∈ . The proof of (8.1.20) is by con-
tradiction. Suppose that, for some ω0 ∈ , we have âi (ω0 ) = 0 ∀i. Since
is dense in , there is a
sequence3 {ωj } in
converging to ω0 , and

n
n
|âi (ωj )| → |âi (ω0 )| = 0 . (8.1.21)
i=1 i=1
3 If is not a first countable topological space, this argument has to be modified by replacing all sequences by nets.
But (8.1.21) contradicts (8.1.19). Hence, (8.1.20) is true. The existence of the xi ’s satisfying (8.1.18)
now follows from Lemma 8.1.6.
“only if ” Suppose (8.1.18) holds. Then, by Lemma 8.1.6, (8.1.20) is also true. Now, since
is a compact topological space, (8.1.20) implies that

n
inf |âi (ω)| > 0 . (8.1.22)
ω∈
i=1
Finally, (8.1.22) implies (8.1.19), since

is a subset of . 2
Lemma 8.1.13 Let R, ,

be as in Lemma 8.1.12, and suppose A ∈ Rn×m with n ≥ m. Then A has
m×n
a left inverse in R if and only if
inf σm (Â(ω)) > 0 , (8.1.23)
ω∈
where σm (·) denotes the smallest singular value of a matrix.
Remarks 8.1.14 An equivalent way of stating (8.1.23) is as follows: There exists an α > 0 such
that Â∗ (ω)Â(ω) ≥ αI ∀ω ∈
.
Proof. “if ” Suppose σm (Â(ω)) ≥ δ ∀ω ∈

. Then a simple exercise in numerical analysis shows that
nδ 2
|âJ (ω| ≥ > 0 ∀ω ∈
, (8.1.24)
c
J ∈S(m,n)
where âJ , J ∈ S(m, n) are the various m × m minors of Â, and c = n!/[m!(n − m)!]. From
Lemma 8.1.12, (8.1.24) implies that the minors âJ , J ∈ S(m, n) generate R. The existence of a
left inverse for A now follows as in the proof of Theorem 8.1.8.
“only if ” Suppose A has a left inverse. Then, by Theorem 8.1.8, the minors aJ , J ∈ S(m, n)
of A generate R. Hence, by Lemma 8.1.12,

inf |âj (ω)| > 0 . (8.1.25)
ω∈
J
It is now routine to prove (8.1.23). 2
Application 8.1.15 Let R = H∞ , which can be interpreted as the set of transfer functions of
causal, 2 -stable, shift-invariant digital filters, without the requirement that the frequency response
be continuous. The maximal ideal space of H∞ is not known explicitly. However, an extremely deep
result due to Carleson [34, Chapter 12], called the corona theorem, states that the open unit disc
is a dense subset of the maximal ideal space of H∞ . Hence, Lemma 8.1.12 states, in this case, that
two functions f, g ∈ H∞ are coprime if and only if
inf {|f (z)| + |g(z)|} > 0 . (8.1.26)

|z|<1
A similar interpretation can be given for Lemma 8.1.13.
Application 8.1.16 [8] Let A denote the set of distributions f (·) with support in [0, ∞) of the
form
∞

f (t) = fi δ(t − ti ) + fa (t) , (8.1.27)
i=0
where 0 ≤ t1 ≤ t2 < · · · are real constant, fa (·) is absolutely continuous, and in addition
∞
∞
f A := |fi | + |fa (t)| dt < ∞ . (8.1.28)
i=0 0
One can interpret A as the set of regular measures (without singular part) of bounded variation on
the half-line [0, ∞); see [48, 80]. It is shown in [26] that the set A is a good model for the set of
impulse responses of BlBO-stable linear time-invariant continuous-time systems. Now the maximal
ideal space of A is very messy to describe. However, it is shown in [48, pp. 145 et seq.] that the closed
right half of the complex plane is a dense subset of this maximal ideal space, in the following sense:
For each s ∈ C+ , define a homomorphism φs : A → C by
∞
∞
φs (f ) = fˆ(s) = fi exp(−sti ) + fa (t) exp(−st) dt . (8.1.29)
i=0 0
It is recognized that fˆ is just the Laplace transform of f (·). For each s ∈ C+ , the set of f (·) ∈ A
such that fˆ(s) = 0 is a maximal ideal of A. The result in [48] is that the collection of maximal ideals
corresponding to all s ∈ C+ is dense in the maximal ideal space of A. Hence [8], by Lemma 8.1.12,
two functions f, g ∈ A are coprime if and only if
inf {|fˆ(s)| + |ĝ(s)|} > 0 . (8.1.30)

s∈C+
Now we change direction somewhat. Lemma 8.1.4 shows that if R is not a Bezout domain,
then not every matrix in M(R) has an r.c.f., and not every matrix has an l.c.f. At this stage it is natural
to ask: If a matrix has one, does it have the other? The rest of this section is devoted to answering
this question for various rings of interest.
Suppose A ∈ Rm×n with m < n; then we say that A can be complemented if there exists a
unimodular matrix U ∈ Rn×n containing A as a submatrix. Suppose A can be complemented, and
let U = [A B ] ∈ Rn×n be unimodular. By expanding |U | along the first m rows using Laplace’s
expansion (Fact B.1.4), we see that the m × m minors of A must together generate R, or equivalently
(by Theorem 8.1.8), that A has a right inverse in Rn×m . Thus, only right-invertible matrices can be
complemented. But is right-invertibility sufficient?
We postpone answering this question in order to demonstrate the relevance of comple-
mentability to the issue at hand.
Lemma 8.1.17 Suppose F ∈ M(F) has an l.c.f. (D̃, Ñ). Then F has an r.c.f. if and only if the matrix
A = [D̃ Ñ ] can be complemented.
Proof. “if ” Suppose A can be complemented. Since B = [−Ñ D̃] is obtained from A by elementary
column operations, B can also be complemented.4 Suppose

Y X
U= (8.1.31)
−Ñ D̃
is a unimodular matrix containing B as a submatrix, and partition U −1 ∈ M(R) as

D −X̃
U −1 = . (8.1.32)
N Ỹ
The claim is that (N, D) is an r.c.f. of F . From U U −1 = I , we get
Y D + XN = I , (8.1.33)
−ÑD + D̃N = 0 . (8.1.34)
Thus, D and N are right-coprime from (8.1.33), and the proof is complete if it can be established
that |D| = 0, because (8.1.34) would then imply that ND −1 = D̃ −1 Ñ = F .
To show that D is nonsingular, begin by observing that D̃ is nonsingular, since (D̃, Ñ) is an
l.c.f. of F . Therefore the matrix

I 0 −1 D −X̃
U = (8.1.35)
−Ñ D̃ 0 I
is also nonsingular. It now follows from (8.1.35) that |D| = 0.
“only if ” This part exactly parallels the proof of Theorem 4.1.16. Suppose (N, D) is an r.c.f.
of F , and select X, Y, X̃, Ỹ ∈ M(R) such that XN + Y D = I, Ñ X̃ + D̃ Ỹ = I . Define

Y X D −X̃
U= ,W = . (8.1.36)
−Ñ D̃ N Ỹ
4The change from [D̃ Ñ ] to [−Ñ D̃] is made simply to make U resemble the matrix in Theorem 4.1.16.
Then,

I −Y X̃ + X Ỹ
UW = . (8.1.37)
0 I
Since |U W | = 1, it follows that |U | is a unit and that U is unimodular. This shows that B =
[−Ñ D̃] can be complemented. The same is true of A = [D̃ Ñ], since A can be transformed to B
by elementary column operations. 2
Corollary 8.1.18 Suppose F ∈ M(F) has an r.c.f. (N, D). Then F has an l.c.f. if and only if the matrix
[D N ] can be complemented.
The proof is easy and is left as an exercise.

Suppose A ∈ Rm×n be complemented.Then by definition there exists a matrix B ∈ R(n−m)×m
such that [A B ] is unimodular. The next result parametrizes all such matrices B.
Lemma 8.1.19 Suppose A ∈ Rm×n can be complemented, and suppose [A B0 ] is unimodular. Then
the set of B ∈ M(R(n−m)×m such that [A B ] is unimodular is given by
{B : B = RA + V B0 , where R ∈ M(R), V ∈ U(R)} . (8.1.38)
Proof. First, suppose B = RA + V B0 where V ∈ U(R). Then,

A A I 0 A
= = , (8.1.39)
B RA + V B0 R V B0
which is unimodular since both matrices on the right side are in U(R). Conversely, suppose [A B ] ∈
U(R). Partition [A B0 ] −1 as [F G], and note that AF = I, AG = 0. Now, by assumption,
−1
A A A I 0
= [F G] = (8.1.40)
B B0 B BF BG
is in U(R), which shows that BG ∈ U(R). Let V = BG, R = BF . Then, from (8.1.38),

A I 0 A A
= = , (8.1.41)
B R V B0 RA + V B0
which implies that B = RA + V B0 . 2

Now we return to the question: When can every right-invertible matrix over R be comple-
mented? The answer is provided by the notion of a Hermite ring. A row [a1 , · · · , an ] ∈ R1×n is a
unimodular row if a1 , · · · , an together generate R. A ring R is Hermite if every unimodular row can
be complemented. Lemma B.2.1 shows that every Bezout domain is Hermite. But it is shown below
that several other commonly encountered rings are Hermite, though they are not Bezout. Before
this, a minor technicality is disposed of.
Lemma 8.1.20 The following statements are equivalent:

(i) R is a Hermite ring.
(ii) If A ∈ Rk×n has a right inverse, then A can be complemented.
Remarks 8.1.21 (a) The objective of the lemma is to deal with “fat” matrices, not just row vectors;
(b) compare (ii) with Corollary 4.1.7.
Proof. (ii) ⇒ (i) Obvious: just let m = 1.

(i) ⇒ (ii) Fix n. The proof is by induction on k. The definition of a Hermite ring shows that
(ii) is true when k = 1. Suppose (ii) is true for matrices in Rm×n whenever m ≤ k − 1 < n − 1, and
suppose A ∈ Rk×n has a right inverse, say C. Partition A and C as shown below:
1
A
A = 2 , C = [C1 c2 ] , (8.1.42)
a
where a 2 is the last row of A and c2 is the last column of C. Then A1 C1 = Ik−1 . Hence, by the
inductive hypothesis (and taking transposes), there exists a unimodular matrix U ∈ Rn×n of the
form U = [C1 D]. Now
1
A Ik−1 Z
AU = 2 [C1 D] = =: M . (8.1.43)
a 0 y
Since A has a right inverse, so does M, and hence so does y ∈ R1×(n−k+1) . Let b ∈ R(n−k+1)×1 be
a right inverse of y. Since R is Hermite, there exists a unimodular matrix B ∈ R(n−k+1)×(n−k+l)
containing b as its first column. Thus,

Ik−1 0 Ik−1 ZB Ik−1 × ×
AU = = =: N , (8.1.44)
0 B 0 yB 0 1 ×
since the first element of yB ∈ R1×(n−k+1) is a one. (In (8.1.44), × is a generic symbol denoting an
element or a matrix whose value is unimportant.) Now the matrix N in (8.1.44) can be complemented,
since the determinant of ⎡ ⎤
Ik−1 × ×
⎣ 0 1 × ⎦ (8.1.45)
0 0 In−k
is one. Since N is a right associate of A, it follows that A can also be complemented. 2
Corollary 8.1.22 Suppose R is a Hermite ring and that A ∈ R m×n
with m < n. Then A has a right
inverse if and only if there exist unimodular V ∈ Rm×m
,U ∈ R n×n
such that
V AU = [Im 0] . (8.1.46)
Corollary 8.1.22 shows that in a Hermite ring, even though there is no Smith form in general
for arbitrary matrices, there is one for right-invertible matrices (and of course for left-invertible
matrices).
Theorem 8.1.23 The following three statements are equivalent:
(i) R is a Hermite ring.
(ii) If F ∈ M(F) has an l.c.f., then it has an r.c.f.
(iii) If F ∈ M(F) has an r.c.f., then it has an l.c.f.
Proof. The equivalence of (ii) and (iii) follows readily by taking transposes.
(i) ⇒ (ii) Suppose R is Hermite, and suppose F ∈ M(F) has an l.c.f., say (D̃, Ñ). Then A =
[D̃ Ñ ] has a right inverse. From Lemma 8.1.20, A can be complemented, and from Lemma 8.1.17,
it follows that F has an r.c.f.
(ii) ⇒ (i) Suppose [a1 · · · an ] ∈ R1×n is an arbitrary unimodular row; it must be shown that
this row can be complemented. Since a1 , · · · , an generate R, at least one of the ai must be nonzero.
Suppose a1 = 0 by renumbering if necessary, and defineF ∈ F1×(n−1) by

a2 an
F = · · · . (8.1.47)
a1 a1
Now the unimodularity of the row [a1 · · · an ] shows that F has an l.c.f., namely (a1 , [a2 · · · an ]).
Since (ii) is assumed to hold, F also has an r.c.f. By Lemma 8.1.17, this implies that the row [a1 · · · an ]
can be complemented. Hence, R is Hermite. 2
Theorem 8.1.23 shows that Hermite rings are the next best thing to Bezout domains, at least
as far as controller synthesis theory goes. If R is Bezout, then every matrix in M(F) has both an
r.c.f. and an l.c.f., and all of the synthesis theory of Chapter 5 goes through very nicely. If R is not
Bezout, then not every matrix in M(F) has an r.c.f., and not every matrix has an l.c.f. However, if
R is at least Hermite, then the matrices in M(F) that have an r.c.f. are precisely those that have an
l.c.f. Moreover, this property characterizes Hermite rings.
It is therefore worthwhile to explore whether various commonly encountered rings are Her-
mite. If R is a complex Banach algebra, then there is a very general result available. The proof of this
result is well beyond the scope of this book and can be found in [62]; see [62, Theorem 3].
Theorem 8.1.24 Suppose R is a complex Banach algebra with maximal ideal space . Then a
unimodular row [a1 · · · an ] in R can be complemented if and only if the row [â1 · · · ân ] can be
complemented in C(). In particular, if is contractible, then R is Hermite.
Recall that a topological space is contractible if the identity map on is homotopic to a
constant map. In other words, is contractible if there exists an ω0 ∈ and a continuous function
f : [0, 1] × → such that
f (0, ω) = ω, f (1, ω) = ω0 , ∀ω ∈ . (8.1.48)
Theorem 8.1.24 is similar in spirit to one of the basic results in the Gel’fand theory of Banach
algebras, which states that, given any complex Banach algebra R and any f ∈ R, f can be inverted
in R if and only if fˆ can be inverted in C().
Note that a convex set in C n is contractible: If ⊆ C n is convex, let ω0 ∈ be arbitrary and
define
f (λ, ω) = λω0 + (1 − λ)ω . (8.1.49)
This observation leads to a very useful criterion.
Corollary 8.1.25 Suppose R is a complex Banach algebra whose maximal ideal space is a (compact)
convex subset of Cn for some integer n. Then R is a Hermite ring.
Examples 8.1.26 [6, 54] The disc algebra, 1 (Z+ ), 1 (Zn+ ) are all Hermite.
Theorem 8.1.24 and Corollary 8.1.25 are not valid in general if R is a real Banach algebra.
However, in [6, 54] it is shown that the following real algebras are indeed Hermite rings:
(i) The set As of functions in the disc algebra A whose Fourier coefficients are all real (i.e.,
the set of all symmetric functions in A).
(ii) The set 1r (Zn+ ) consisting of all real sequences in 1 (Zn+ ).
Theorem 8.1.24 suggests a simple example of a ring that is not Hermite.
Example 8.1.27 Let S 2 denote the unit sphere in R3 , and let R = C(S 2 ), the set of all continuous
complex-valued functions on S 2 .5 Then, = S 2 , which is not contractible. It is claimed that R
is not a Hermite ring. To show this, it is enough to demonstrate a unimodular row in R1×3 that
cannot be complemented. Define a = [a1 a2 a3 ] by setting a(x) equal to the unit outward normal
vector at x, for each x ∈ S 2 . Then a(·) is continuous and therefore belongs to R1×3 . Moreover, since
a(x) = 0 ∀x ∈ S 2 , it follows from Lemma 8.1.6 that a is a unimodular row. Now suppose a could be
complemented, and suppose [a b c ] ∈ R3×3 is unimodular. Then for each x ∈ S 2 , the three vec-
tors a(x), b(x), c(x) must span the vector space C 3 . Let T (x) denote the (two-dimensional) tangent
space to S 2 at x. Since span {a(x), b(x), c(x)} = C 3 contains T (x), and since a(x) is orthogonal to
T (x), we see that the projections of b(x), c(x) onto T (x) must both be nonzero and must together
span T (x). Let f(x), g(x) denote the projections of b(x), c(x) onto T (x), respectively. Then f(·)
is a vector field on S 2 that does not vanish anywhere. However, this contradicts the Borsuk-Ulam
5This example was suggested to the author by Ken Davidson.
theorem [46, p. 116], which states that every vector field on S 2 must have a zero. Hence, a(·) cannot
be complemented, and R is not a Hermite ring.
If we define F = [a2 /a1 a3 /a1 ], then F has an l.c.f. but not an r.c.f.
We conclude this section with a theorem that is very important in several branches of control
theory and indeed in mathematics.
Theorem 8.1.28 (Quillen-Suslin) Let k be a field, and let R = k[x1 , · · · , xl ], the ring of poly-
nomials in l indeterminates with coefficients in k. Then R is a Hermite ring.
Remarks 8.1.29 (i) Until it was proven, the above result was known as Serre’s conjecture. See [61]
for an excellent historical introduction to this result, its origins and its significance.
(ii) Quillen [76] shows that the result remains valid if k is replaced by any p.i.d. However, the
proof given below does not apply to this case.
(iii) The ring k[[x1 , · · · , , xl ]] of formal power series in l indeterminates with coefficients in
k is also a Hermite ring; see [63].
(iv) The proof given below is taken from [79, pp. 138–145]. See [112] for another proof that
applies to the special case when k = C.
The elementary proof of Theorem 8.1.28 is rather long, and makes use of a few concepts not
previously introduced. A ring B is local if it contains a unique maximal ideal [4, p. 38]. Suppose A is
a domain, and let P be a prime ideal in A. The set consisting of all fractions of the form a/b where
a ∈ A, b ∈ A \ P, is a local ring, called the localization of A at P, and is denoted by Ap . The unique
maximal ideal of Ap is the set of fractions a/b where a ∈ P, b ∈ A \ P. The pertinent property of
local rings is the following: If B is a local ring and J is its unique maximal ideal, then every b ∈ / J is
a unit. This follows from Lemma 8.1.6.
If B is a ring and b1 , · · · , bn ∈ B, the symbol (b1 , · · · , bn ) denotes the ideal generated by
the bi ’s. If f ∈ B[y], then δ(f ) denotes the degree of the polynomial f . More generally, if f ∈
B[y1 , · · · , ym ], then δ(f ) denotes the total degree of f . That is, if f is of the form

f (y1 , · · · , yl ) = fi y1i1 · · · ylil , (8.1.50)
i
where i is a shorthand for the l-tuple (i1 , · · · , il ) then δ(f ) is the largest of the sums (i1 + · · · + il )
for which the coefficient fi is nonzero.
The proof of Theorem 8.1.28 is basically inductive. Clearly k[x1 ] is Hermite, since it is a p.i.d.
(see Lemma B.2.1). The proof consists of showing that if k[x1 , · · · , xl−1 ] is Hermite, then so is
k[x1 , · · · , xl ]. This proof consists of two independent steps. The first step (which is actually proved
last here, since it is more complicated than the other) is to show that if B is a Hermite ring and
[a1 , · · · , an ] is a unimodular row in B1×n , then it can be complemented, provided one of the ai ’s is a
monic polynomial. The second step (which is proved first) is to show that in the ring k[x1 , · · · , xl ],
it is always possible to effect a change of variables in such a way that every unimodular row contains
at least one monic polynomial in the new set of variables.
Lemma 8.1.30 (Nagata) Suppose a ∈ R, and let m = δ(a) + 1. Define a new set of indeterminates
l−i
yl = xl , yi = xi − xlm . (8.1.51)
Then a(x1 , · · · , xl ) = rb(y1 , . . . , yl ), where r = 0 belongs to k,and b is a monic polynomial in yl (with

coefficients in k[y1 , · · · , yl−1 ]).
Proof. Denote an l-tuple (j1 , · · · , jl ) where each of the ji is nonnegative by the symbol j. Then the
polynomial a can be written as j j
a= rj x1i · · · xl l , (8.1.52)
j
where rj ∈ k. Note that (8.1.51) has the inverse transformation

l−i
xl = yl , xi = yi + ylm . (8.1.53)
Substituting from (8.1.53) into (8.1.52) gives

l−1 j
b= rj (y1 + ylm )j1 · · · (yl−1 + ylm )jl−1 yl l . (8.1.54)
j
Now let m denote the l-tuple (ml−1 , · · · , m, 1). The first point to note is that if i = (i1 , · · · , il ) and
j = (j1 , · · · , jl ) are two distinct l-tuples with 0 ≤ ik , jk ≤ m − 1 ∀k, then i · m and j · m are distinct
integers, where · denotes the usual vector dot product. This follows upon noting that i · m < ml , and
that the l-tuple i is just the representation of the integer i · m with respect to the base m; similarly
for the integer j · m. Returning to (8.1.54), let J = {j : rj = 0}. Then there is a unique l-tuple j0 ∈ J
such that j0 · m > j · m ∀j ∈ J \ j0 . Hence, when expanded, (8.1.54) is of the form
j ·m
b = rj0 yl 0 + terms involving lower powers of yl . (8.1.55)
Hence, b equals rj0 times a monic polynomial in yl of degree j0 · m. 2

Note that if k is an infinite field, then it is possible to achieve the same result as in Lemma 8.1.30
by using only a linear transformation of variables [112]. However, the more complicated transfor-
mation is needed in the case of an arbitrary (and possibly finite) field.
To prove that R = k[x1 , · · · , xl ] is a Hermite ring, suppose [a1 · · · an ] is an arbitrary uni-
modular row in R1×n ; it must be shown that this row can be complemented. In order to do this,
we can first transform the indeterminates x1 , · · · , xl into the set y1 , · · · , yl using Lemma 8.1.30
such that one of the bi ’s is monic in yl , where bi (y1 . · · · , yl ) = a(x1 , · · · , xl )∀i. Since the rings
k[x1 , · · · , xl ] and k[y1 , · · · , yl ] are clearly isomorphic, it is enough to show that the row [b1 · · · bn ]
can be complemented. Thus, the proof of Theorem 8.1.28 is complete once the following lemma is
established. Note that this lemma is of independent interest.
Lemma 8.1.31 Suppose B is a Hermite ring, and let R = B[x]. Suppose [a1 · · · an ] ∈ R1×n is a
unimodular row, and that at least one of the ai ’s is monic. Then this row can be complemented.
The proof of Lemma 8.1.31 requires two preliminary results.
Lemma 8.1.32 (Suslin) Suppose B is a ring, R = B[x], and suppose f, g ∈ R are of the form
f (x) = x r + a1 x r−1 + · · · + ar ,
g(x) = b1 x r−1 + · · · + br . (8.1.56)
Then, for each j ∈ {1, · · · , r} the ideal (f, g) contains a polynomial hj such that δ(hj ) ≤ r − 1 and the
leading coefficient of hj is bj .
Proof. Define I ⊆ B to be the set of leading coefficients of all h in the ideal (f, g) such that
δ(h) ≤ r − 1. Then I is an ideal in B. To see this, suppose p1 , q1 , ∈ I, and let p, q ∈ (f, g) be the
corresponding polynomials with leading coefficients p1 , q1 and degrees less than or equal to r − 1.
Assume without loss of generality that δ(p) ≤ δ(q). Then the polynomial
h(x) = q(x) − x δ(q)−δ(p) p(x) (8.1.57)
has leading coefficient q1 − p1 and degree no larger than r − 1. Hence, I is closed under subtraction.
Finally, it is clear that if p1 ∈ I and s ∈ B, then p1 s ∈ I; just take h(x) = sp(x).
To prove the lemma, it is enough to establish the following statement: If h ∈ (f, g) and
δ(h) ≤ r − 1, then all coefficients of h belong to I. This is established by induction, by proving that:
for each j ∈ {1, · · · , r}, if h ∈ (f, g) and δ(h) ≤ r − 1, then the coefficients of x r−1 , · · · , x r−j
belong to I. Now the statement is true for j = 1, from the definition of I. Suppose the statement is
true up to j − 1, and suppose h is an arbitrary element of the ideal (f, g) with δ(h) ≤ r − 1. Let
δ = δ(h), and write
h(x) = h1 x r−1 + · · · + hr , (8.1.58)
where some of the hi may be zero. By the inductive hypothesis, h1 , · · · , hj −1 belong to I. Now,
xh(x) − h1 f (x) ∈ (f, g) has a degree no larger than r − 1. Moreover,

r
xh(x) − h1 f (x) = (hi+1 − h1 ai )x r−i . (8.1.59)
i=1
The (j − 1)-st coefficient of this polynomial is hj − h1 aj −1 , which belongs to I by the inductive

hypothesis. Since h1 aj −1 ∈ I, it follows that hj ∈ I. This completes the proof of the inductive step.
2
The last preliminary result is the same as Lemma 8.1.31, except that the ring B is restricted
to be a local ring.
Lemma 8.1.33 Let B be a local ring, let R = B[x], and suppose [a1 · · · an ] ∈ R1×n is a unimodular
row with at least one of the ai being monic. Then this row can be complemented.
Proof (Suslin). Assume without loss of generality that n ≥ 3,6 and that a1 is monic. The proof is by
induction on the degree of a1 . If δ(a1 ) = 0, then a1 = 1 and there is nothing further to prove. If
δ(a1 ) =: r > 0, we will show that [a1 · · · an ] can be reduced, through elementary column operations,
to a row [b1 · · · bn ] where b1 is monic and δ(b1 ) < r.
Accordingly, suppose r = δ(a1 ). Since a1 is monic, by performing column operations if nec-
essary we can assume that δ(ai ) < r for i = 2, · · · , n. Let J be the unique maximal ideal of B, and
for b ∈ B let b̂ ∈ B/J denote the coset b + J. For f ∈ R = B[x], define fˆ ∈ [B/J][x] in the obvious

way, namely, if f = fi x i , then fˆ = fˆi x i . Now the row [â1 · · · ân ] is unimodular in [B/J][x].
Since δ(â1 ) = δ(a1 ) > 0, not all of a2 , · · · , an can be zero. By renumbering if necessary, suppose
a2 = 0. Then some coefficient of a2 does not belong to J. Suppose

r
a2 (x) = b2i x r−i , (8.1.60)
i=1
and suppose b2j ∈ / J. Since B is a local ring, this implies that b2j is a unit. Let us now apply
Lemma 8.1.32 with f = a1 , g = a2 , and define I as in Lemma 8.1.32. Then Lemma 8.1.32 implies
that b2j ∈ I, i.e., that the ideal (a1 , a2 ) contains a monic polynomial of degree less than r. So by
elementary column operations, a3 can be replaced by a monic polynomial of degree less than r. Now
apply the inductive hypothesis. 2
Proof of Lemma 8.1.31. The lemma is proved by establishing the following fact: There exists a
unimodular matrix U ∈ U(R) such that aU = b, where b = [b1 · · · bn ] ∈ B1×n is a unimodular
row. In fact, such a transformation is possible (as the following proof will reveal) for any ring B.
Since B is Hermite, b can be complemented, whence a can also be complemented.
Suppose f, g ∈ R = B[x]; then f · g ∈ B[x] can be defined in the obvious way, namely,

if f = fi x i , then (f · g)(x) = fi (g(x))i . Given the row a, define I to be the subset of B
consisting of all b ∈ B which have the property that, for each f, g ∈ R there exists a unimodular
matrix U = U (f, g) such that a(f + bg) = a(f )U , where a(f + bg), a(f ) are row vectors in R1×n
defined by
(a(f + bg))i = ai · (f + bg) ,

(a(f ))i = ai · f . (8.1.61)
6 Note that a unimodular row consisting of exactly two elements can always be complemented, in any ring.
Then I is an ideal in B: Clearly 0 ∈ I (take U = I ∀f, g). If b, c ∈ I, then for each f, g ∈
R there exist unimodular matrices U (f, g), V (f, g) such that a(f + bg) = a(f )U, a(f +
cg) = a(f )V . Hence, we have a(f + (b + c)g) = a(f + bg + cg) = a(f + bg)V (f + bg, g) =
a(f )U (f, g)V (f + bg, g). Since the matrix U (f, g)V (f + bg, g) is unimodular for all f, g ∈ R,
it follows that b + c ∈ I. Finally, if b ∈ I and c ∈ B, then for any f, g ∈ R we have a(f + bcg) =
a(f )U (f, cg), which shows that bc ∈ I.
If it can be shown that I = B, then the proof is complete. If 1 ∈ I, then for each f, g ∈ R
there exists a U (f, g) ∈ U(R) such that a(f + g) = a(f ). In particular, let f = x, g = −x; then
a(0) = a(x)U for some U ∈ U(R). If we now define b = a(0), then the desired statement follows
upon observing that a(0) is a unimodular row in B.
To show that I = B, suppose by way of contradiction that I = B. Then I is contained in
some maximal ideal of B, say J. Let L denote the localization BJ . Then B ⊆ L, and a is a (monic)
unimodular row in L[x]. By Lemma 8.1.33, a can be complemented in L[x]. In other words, there
exists a unimodular matrix U over L[x] such that
a(x) = [1 · · · 0 0]U (x) . (8.1.62)
Now adjoin a new indeterminate y to L[x], and define
V (x, y) = [U (x + y)]−1 U (x) . (8.1.63)
Then V is also a unimodular matrix over L[x, y], and V (x, 0) = In . From (8.1.62), it follows that
a(x + y) = [1 · · · 0 0]U (x + y) , (8.1.64)
so that, from (8.1.64) and (8.1.63),
a(x + y)V (x, y) = [1 · · · 0 0]U (x) = a(x) . (8.1.65)
Next, note that V (x, y) is a matrix over L[x, y], so that each element of V (x, y) is a polynomial
in x and y with coefficients in L. Moreover, since V (x, 0) = In , it follows that each element of V is
of the form
r
Vij (x, y) = cij + dij k (x)y k , (8.1.66)
k=1
where cij = 0 or 1. Loosely speaking, each term in the polynomial expansion of vij is either a
“constant” 0 or 1, or else involves a positive power of y. Recall that L consists of fractions p/q where
p ∈ B, q ∈ B \ J. Let b denote the product of the denominators of all the coefficients occurring in
the matrix V (x, y); thus b is the product of the denominators of the coefficients of dij k (x), for all
i, j, k. Then V (x, by) is actually a unimodular matrix over B[x, y], since replacing y by by wipes
out all the denominators. Now (8.1.65) implies that
a(x + by)V (x, by) = a(x) . (8.1.67)

So for any f, g ∈ B[x], (8.1.67) implies that
a(f + bg)V (f, bg) = a(f ) . (8.1.68)
Therefore b ∈ I, by definition. On the other hand, b is the product of elements in B \ J, and hence
itself belongs to B \ J. In particular, b ∈
/ I ⊆ J. This contradiction shows that I = B. 2
8.2 COPRIME FACTORIZATIONS FOR DISTRIBUTED

SYSTEMS
Suppose R is a commutative domain with identity, and let F denote the associated field of fractions.
The previous section was concerned with the question of when a matrix in M(F) has a right or
left-coprime factorization, and when the existence of one implies the existence of the other. It was
shown that the notion of a Hermite ring plays an important role in answering the above questions.
In particular, if R is a Hermite ring, then a matrix in M(F) has an r.c.f if and only if it has an l.c.f.
However, aside from some general and abstract results, very little was said about the existence of
coprime factorizations. The objective of this section is to study this issue in more detail for the case
of linear distributed systems, both continuous-time as well as discrete-time. This is done by first
proving a general theorem about the existence of coprime factorizations, and then specializing this
theorem to the two special cases mentioned above. As a consequence, we shall obtain two very large
classes of transfer matrices whose elements all have both r.c.f ’s as weil as l.c.f.’s.
The main result of this section is given next.
Theorem 8.2.1 Suppose I is a subset of R satisfying two conditions:

(I1) I is a saturated multiplicative system;
(I2) If a ∈ I and b ∈ R, then the ideal in R generated by a and b is principal.
Under these conditions, every matrix G ∈ M(I−1 R) has both an r.c.f. as well as an l.c.f.
Remarks 8.2.2 1) Recall from Section A.2 that a subset I of R is said to be multiplicative if a, b ∈ I
implies that ab ∈ I. It is saturated if a ∈ I and b ∈ R divides a together imply that b ∈ I.
2) By induction, one can show that (I2) is equivalent to the following statement: (I2’) If a ∈ I
and b1 , . . . , bn ∈ R, then the ideal in R generated by a, b1 , . . . , bn is principal. Let d be a generator
of this principal ideal. Then one can show quite easily that d is a greatest common divisor (g.c.d.)
of the set {a, b1 , . . . , bn }.
3) Recall from Section A.2 that I−1 R denotes the ring consisting of all fractions of the form
a/b, where a ∈ R, and b ∈ I. One may assume without loss of generality that 1 ∈ I. By identifying
a ∈ R with a/1, one can imbed R as a subring of I−1 R.
The proof of Theorem 8.2.1 is facilitated by some preliminary results.
Lemma 8.2.3 Suppose a ∈ I, b1 , . . . , bn ∈ R, and let d be a g.c.d. of the set {a, b1 , . . . , bn }. Then
there exists a matrix U ∈ U(R) such that |U | = d and [a, b1 , . . . , bn ] is the first column of U .
8.2. COPRIME FACTORIZATIONS FOR DISTRIBUTED SYSTEMS 299
The proof of this lemma exactly follows that of Lemma B.2.1, if one notes that every subset
{a, b1 , · · · , bi } has a g.c.d. by condition (I2).
Corollary 8.2.4 Suppose a, b1 , · · · , bn have a g.c.d. of 1. Then there exists a U ∈ U(R) with
[a, b1 , · · · , bn ] as its first column.
Lemma 8.2.5 Suppose a, b1 , · · · , bn ∈ R have a g.c.d. of d. Then there exists a V ∈ U(R) such that
⎡ ⎤ ⎡ ⎤
a d
⎢b ⎥ ⎢ 0 ⎥
⎢ 1⎥ ⎢ ⎥
⎢ ⎥ ⎢ ⎥
⎢ · ⎥ ⎢·⎥
V ⎢ ⎥ = ⎢ ⎥. (8.2.1)
⎢ · ⎥ ⎢·⎥
⎢ ⎥ ⎢ ⎥
⎣ · ⎦ ⎣·⎦
bn 0
Proof. Let c0 = a/d, ci = bi /d, i ≥ 1. Then c0 ∈ I since I is saturated, and 1 is a g.c.d. of all the
ci ’s. By Corollary 8.2.4 above, there exists a U ∈ U(R) such that [c0 · · · cn ] is the first column of
U . Now let V = U −1 . 2
Lemma 8.2.6 Suppose G ∈ M(I−1 R) is of order n × m, and G = BA−1 , where B ∈ Rn×m , A ∈

Rm×m , Then G has an r.c.f. provided there exists a U ∈ U(R) such that

A D
U = . (8.2.2)
B 0n×m
The proof of this lemma is the same as that of Theorem 4.1.3 with a few inessential modifi-
cations, and is left to the reader.
Proof of Theorem 8.2.1. It is shown that every G ∈ M(I−1 R) has an r.c.f. The existence of an l.c.f.
for every such G will then follow by taking transposes.
Suppose G ∈ M(I) is of order n × m. Then a typical element gij of G is of the form cij /aij ,
where cij ∈ R, aij ∈ I ∀i, j . Define

a= aij , bij = gij a , (8.2.3)
i,j
and let B be the matrix whose elements are the bij ’s, A = aI . Then G = BA−1 , and the theorem is
proved if it can be shown that there exists a U ∈ U(R) such that (8.2.2) holds. Such a U is constructed
iteratively. As the first step, let dm be a g.c.d. of the set {a, b1m , · · · , bnm }. Such a dm exists, since
a ∈ I (by the multiplicativity of the set I). Select a unimodular matrix Vm ∈ R(n+1)×(n+1) such that
⎡ ⎤ ⎡ ⎤
a dm
⎢b ⎥ ⎢ 0 ⎥
⎢ 1m ⎥ ⎢ ⎥
⎢ ⎥ ⎢ ⎥
⎢ · ⎥ ⎢ · ⎥
Vm ⎢ ⎥=⎢ ⎥ . (8.2.4)
⎢ · ⎥ ⎢ · ⎥
⎢ ⎥ ⎢ ⎥
⎣ · ⎦ ⎣ · ⎦
bnm 0
Such a Vm exists, by Lemma 8.2.5. Now define

I 0
Um = m−1 . (8.2.5)
0 Vm
Then routine computations show that

A aIm−1 em
Um = , (8.2.6)
B X 0
where
em = [0 · · · 0 dm ] , (8.2.7)
and X ∈ M(R). It is now possible to repeat the procedure by working on the next-to-last column of
the matrix in (8.2.6). After m repetitions of the reduction, one arrives at a matrix of the form shown
in (8.2.2) 2
As an application of the above theory, consider the ring A described in (8.1.16), which is
recalled here for convenience. The ring A consists of all distributions f (·) with support in [0, ∞)
of the form
∞

f (t) = fi δ(t − ti ) + fa (t) , (8.2.8)
i=0
where 0 ≤ t1 < t2 < . . . are real numbers, fi are real, f (·) is measurable, and in addition
∞
∞
f A = |fi | + |fa (t)| dt < ∞ . (8.2.9)
i=0 0
It can be shown that A is precisely the set of impulse responses of BIBO-stable linear time-invariant
systems, and is a ring with pointwise addition and convolution as the product. Now, it can be
shown [101] that A is not a Bezout domain. Hence, not every element in the field of fractions of A
has a coprime factorization. Moreover, it is not known whether A is a Hermite ring or not. Hence, it
is of interest to identify a subset B of the fraction field of A such that every matrix in M(B) has both
8.2. COPRIME FACTORIZATIONS FOR DISTRIBUTED SYSTEMS 301
an r.c.f. and an l.c.f. This is done in [8] and can be described in the current framework as follows: Let
Â denote the set of Laplace transforms of distributions in A. Then Â is isomorphic to A, and has
pointwise addition and multiplication as the ring operations. It is routine to verify that each Laplace
transform in Â converges over the closed right half-plane C+e , and is analytic over the open right
half-plane. Now let Â0 denote the subset of Â defined as follows: Â0 consists of those functions
f ∈ Â which are analytic over the closed right half-plane. More precisely, f ∈ Â0 if there exists
an ε > 0 (which may depend on f ) such that f (·) is analytic over the half-plane {s : Re s > −ε}.
Then Â0 is a subring of Â. We take Â0 as the ring R. The subset I of R is defined as follows: A
function f in Â0 is bounded away from zero et infinity if, whenever {si } is a sequence in C+e , such
that |si | → ∞, we have that lim inf |f (si )| > 0. Now I consists precisely of those f in Â0 that are
bounded away from zero at infinity. It is easy to verify that I is a saturated multiplicative system in R
(see Problem 8.2.1), so that condition (I1) is satisfied. The proof of (I2) is a little more complicated,
but can be found in [101]. Hence, on the basis of Theorem 8.2.1, we conclude that every function
in B = I−1 R has a coprime factorization, and more generally, every G ∈ M(B) has both an r.c.f. as
well as an l.c.f.
In view of the above very useful property, it is worthwhile to understand a little better the class
of functions B. Since the zeros of an analytic function are always isolated, if f ∈ Â then any zero of
f in the open RHP is isolated. Now, if f ∈ Â0 , then every zero of f in the closed RHP is isolated.
Finally, if f ∈ I, then f is also bounded away from zero at infinity, which implies that all zeros of f
in C+e must in fact be contained in a compact subset thereof. Since any such zero is isolated and of
finite multiplicity, we conclude that every f ∈ I has only a finite number of isolated zeros, each of
finite order, in the closed RHP. Hence, every g ∈ B has only a finite number of singularities in the
closed RHP, and each of these is a pole of finite order.
Now let us consider discrete-time systems. In this case, a good choice for the set of transfer
functions of stable systems is the disc algebra A, consisting of functions that are analytic on the open
unit disc and continuous on the closed unit disc. The argument given in [101] shows that A is not a
Bezout domain, but A is a Hermite ring as shown in Example 8.1.26. Hence, every matrix in M(A)
that has an r.c.f. also has an l.c.f. and vice versa. But it is still desirable to have simple sufficient
conditions for the existence of an r.c.f. (or an l.c.f.). For this purpose, let R be the disc algebra A,
and let I consist of those functions f ∈ A that have only a finite number of zeros in the closed unit
disc and none on the unit circle (hence they are all in the open unit disc).
With this definition, it is once again routine to verify condition (I1). To check (I2), suppose
f ∈ A, g ∈ I. Then f and g can have only a finite number of common zeros in the closed unit
disc, and these must all be in the open unit disc. Suppose these common zeros are α1 , . . . , αn , with
multiplicities m1 , . . . , mn . Let φ(z) be the polynomial

n
φ(z) = (z − αi )mi . (8.2.10)
i=1
Then it can be shown that the ideal in A generated by f and g is the same as the principal ideal
generated by φ (see Problem 8.2.2). Hence, (I2) holds, and every matrix whose elements all belong
to I−1 A have both an r.c.f. and an l.c.f.
PROBLEMS
8.2.1. Show that the set I consisting of the functions in Â0 that are bounded away from zero
infinity satisfies condition (I1).
8.2.2. Show that the subset I of the disc algebra defined above satisfies conditions (I1) and (I2).
8.3 CONTROLLER SYNTHESIS OVER A GENERAL RING

The objective of this section is to extend the results contained in Chapter 5, and specifically Sec-
tion 5.2, to the case where the set S is replaced by a general ring R. The main obstacle to doing
so is that the plant and compensator can not be assumed a priori to have right- or left-coprime
factorizations. It is shown below that this difficulty can be mostly circumvented.
Suppose R is a commutative domain with identity, and denote its field of fractions by F. In
the present set-up, both the plant P and the compensator C are assumed to belong to M(F). The
pair (P , C) is stable if

(I + P C)−1 −P (I + CP )−1
H (P , C) = , (8.3.1)
C(I + P C)−1 (I + CP )−1
is well-defined and belongs to M(R). The symbol S(P ) denotes the set of C ∈ M(F) such that
(P , C) is a stable pair. The first objective is to parametrize the set S(P ) corresponding to a given
plant P . This is made easier by the next result.
Lemma 8.3.1 Suppose P has an r.c.f. (resp. an l.c.f.) and that C ∈ S(P ). Then C has an l.c.f. (resp. an
r.c.f.).
Proof. Consider the case where P has an r.c.f.; in case P has an l.c.f. the arguments are entirely
similar. Suppose P = N D −1 where N, D ∈ M(R) are right-coprime, and select X, Y ∈ M(R) such
that XN + Y D = I . Now
H (P , C) = H (ND −1 , C)

I − N(D + CN)−1 C −N(D + CN)−1
= (8.3.2)
D(D + CN)−1 C D(D + CN)−1
belongs to M(R), since C ∈ S(P ). Define S, T ∈ M(F) by
S = (D + CN)−1 , T = (D + CN)−1 C . (8.3.3)

8.3. CONTROLLER SYNTHESIS OVER A GENERAL RING 303
Then,

I − NT −NS
H (P , C) = , (8.3.4)
DT DS
which shows that N S, N T , DS, DT ∈ M(R). This in turn implies that
S = X(NS) + Y (DS) ∈ M(R) ;

T = X(NT ) + Y (DT ) ∈ M(R) . (8.3.5)
Also, from (8.3.3), C = S −1 T . Thus, in order to prove that (S, T ) is an l.c.f. of C, it only remains
to establish the left-coprimeness of S and T . But this is immediate, since
SD + T N = (D + CN)−1 (D + CN) = I . (8.3.6)
Hence, (S, T ) is an l.c.f. of C. 2
Corollary 8.3.2 Suppose P has both an r.c.f. and an l.c.f. and that C ∈ S(P ). Then C has both an r.c.f.
and an l.c.f.
Corollary 8.3.3 Suppose P , C ∈ M(F) and that the pair (P , C) is stable. Then the matrix

C 0
G= (8.3.7)
0 P
has both an r.c.f. and an l.c.f.
Remarks 8.3.4 It is not claimed that P and C individually have both an r.c.f. and an l.c.f. - only
that G has both.
Proof. Define

0 I
F = . (8.3.8)
−I 0
Then (P , C) is stable if and only if (G, F ) is stable. Since F is a constant matrix, it has both an r.c.f.
as well as an l.c.f. The conclusion now follows from Corollary 8.3.2. 2
Thus, if one is interested in designing stabilizing compensators for a plant that has both an
r.c.f. and an l.c.f., the basic result of Chapter 5, namely Theorem 5.2.1, carries over completely. It is
repeated here for emphasis.
Theorem 8.3.5 Suppose P ∈ M(F) has an r.c.f. (N, D) and an l.c.f. (D̃, Ñ). Select X, Y, X̃, Ỹ ∈
M(R) such that XN + Y D = I, Ñ X̃ + D̃ Ỹ = I . Then,
S(P ) = {(Y − R Ñ)−1 (X + R D̃) : R ∈ M(R) and |Y − R Ñ| = 0}

= {(X̃ + DR)(Ỹ − NR)−1 : R ∈ M(R) and |Ỹ − NR| = 0}. (8.3.9)
Once a parametrization of S(P ) is available, Theorem 5.4.1 concerning the simultaneous

stabilizability of two plants P0 and P1 carries over to a general ring R.
Theorem 8.3.6 Suppose P0 and P1 ∈ M(F) have the same dimensions, and that each has both an
r.c.f. and an l.c.f. Define
A0 = X0 N1 + Y0 D1 , B0 = −Ñ0 D1 + D̃0 N1 , (8.3.10)
where the symbols are as in (5.4.1). Then there exists a C ∈ M(F) that stabilizes both P0 and P1 if
and only if there exists an M ∈ M(R) such that A0 + MB0 U(R).
Thus, as before, the simultaneous stabilizability question for two plants can be reduced to a
strong stabilizability question for a single plant. Unfortunately, this latter question is very difficult
to answer, especially for multivariable plants.
The strong stabilizability question for scalar plants is the following: Given coprime elements
n, d ∈ R, when does there exist a c ∈ R such that d + cn is a unit of R? The answer to this question
depends strongly on the specifics of the ring R. In case R = A, the disc algebra, and n has only a
finite number of zeros inside the closed unit disc, the existence or otherwise of a suitable c can be
decided using the parity interlacing test, as in Section 2.4. However, if n has a sequence of zeros
clustering on the unit circle, the methods of Section 2.4 are no longer applicable. For multivariable
plants, the strong stabilization question is: Given a right-coprime pair N, D ∈ M(R), does there
exist a C ∈ M(R) such that D + CN ∈ U(R)? If R is a proper Euclidean domain, the results of
Section 4.4 can be used to transform this matrix problem into a scalar problem. Specifically, such a
C exists if and only if the pair (n, d) can be strongly stabilized, where n is the g.c.d. of all elements
of N and d = |D|. However, such a reduction is not possible over a general ring. As a result, the
strong stabilization problem for multivariable plants over a general ring is still open.
The other contents of Chapter 5, such as the two-parameter compensation scheme for ex-
ample, can also be extended to a general ring without undue difficulty. The details are left to the
reader.
8.3. CONTROLLER SYNTHESIS OVER A GENERAL RING 305
The results of Chapters 6 and 7 can be extended almost without modification to distributed
systems. The reason is that these results are mainly based on the properties of the function spaces
H2 and H∞ , and the rationality of the various functions involved does not play a central role.
This section is concluded with a discussion of the possibility of designing lumped compensators
(i.e., those with rational transfer function matrices) for distributed systems. Such compensators are
desirable because of ease of implementation. Consider first the discrete-time case, and suppose the
ring R is the disc algebra A. Then the polynomials are dense in A. As a consequence, the set of
rational functions (i.e., ratios of polynomials) are dense in the fraction field of A in the graph topology.
Thus, if c is any element (compensator) in the fraction field of A, then every neighborhood of c
contains a rational function. This fact can be exploited to design lumped stabilizing compensators,
as follows: First, use Theorem 8.3.5 together with any other mathematical apparatus to design a
good, or even optimal, compensator C in the form Nc Dc−1 or D̃c−1 Ñc , where the matrices Nc , Dc
or Ñc , D̃c have their elements in the ring A. By approximating the elements of these matrices by
polynomials or rational H∞ -functions to a sufficiently high degree of accuracy, one can come up
with lumped compensators that stabilize the plant under study, or achieve near-optimal performance.
In the continuous-time case, the situation is a little more complicated, because the elements
of S are not dense in the ring Â. For example, the function f (s) = exp(−s) belongs to Â, since it is
the Laplace transform of a delayed impulse distribution. But it cannot be uniformly approximated
by any stable rational function. Hence, it is not always possible to approximate compensators in
the fraction field of Â by rational compensators. However, it is not too difficult to verify that any
function which is the Laplace transform of a distribution of the form
f (t) = f0 δ(t) + fa (t) , (8.3.11)
where the function fa is absolutely integrable (i.e., belongs to the space L1 ), can be approximated
arbitrarily closely by a function in S. Thus, if R denotes the set of Laplace transforms of such
functions, then S is a dense subset of R. Hence, any compensator of the form D̃c−1 Ñc , or Nc Dc−1
where Dc , Nc , Ñc , D̃c ∈ M(R), can be approximated by a lumped compensator.
The situation is more satisfactory in the case of the ring B discussed in the previous section.
Recall that B is the ring of fractions of the form I−1 A, where I and A are defined in Section 8.2. It
can be shown that every function in B is the sum of a rational function and the Laplace transform
of a function belonging to L1 (see [8]). Thus, if F ∈ M(B), then F can be written as F = G + H ,
where G ∈ M(R(s)) and H is the Laplace transform of a matrix consisting of functions from L1 .
Now, the problem of designing a stabilizing compensator for F is essentially that of designing one
for the “unstable part” G of F . Suppose (Ng , Dg ) is an r.c.f. of G over the ring M(S), and select
matrices X, Y ∈ M(S) that satisfy the identity XNg + Y Dg = I .Then, as shown in Problem 4.1.12,
the pair (Ng + H Dg , Dg ) is an r.c.f. of F , this time over the ring Â (which contains S as a subring),
and moreover,
X(Ng + H Dg ) + (Y − XH )Dg = I . (8.3.12)
Thus, the compensator (Y − XH )−1 X stabilizes the plant F . Next, note that Y − XH ∈ M(R),
where R is the set of Laplace transforms of distributions of the form (8.3.11). Thus, there exists
a sequence of rational matrices {Ai } in M(S) that converges Y − XH . By choosing the index
i sufficiently large, we can ensure that the lumped compensator A−1i X stabilizes the plant F . If
|Y − XH | = 0, this presents no problem, since one can always choose the sequence {Ai } to consist
of nonsingular matrices. Thus, we have shown that every distributed plant whose transfer matrix
belongs to the set B can be stabilized by a lumped compensator.

The references for the material in Section 8.1 are given at the appropriate places. The design of
lumped compensators for distributed systems is discussed in [70, 71], and for the special case of time-
delay systems in [55]. See [51] for results relating the internal and external stability of distributed
systems, and [6, 54, 64, 77] for controller synthesis theory in very general settings.
The factorization approach carries over with some limitations to nonlinear systems as well. In
this connection, see [1, 23, 24].
307
APPENDIX A
Algebraic Preliminaries
This appendix contains a brief introduction to some basic properties of rings. It is important to note
that the choice of results from ring theory that are presented here is highly selective; the reader is
referred to [52] or [116] for a more comprehensive treatment.
A.1 RINGS, FIELDS AND IDEALS
Definition A.1.1 A ring is a nonempty set R together with two binary operations + (addition)
and · (multiplication) such that the following axioms are satisfied:
(R1) (R, +) is a commutative group. This means that

a + (b + c) = (a + b) + c, ∀ a, b, c ∈ R.
a + b = b + a, ∀ a, b ∈ R.
There exists an element 0 ∈ R such that
a + 0 = 0 + a = a, ∀ a ∈ R.
For every element a ∈ R, there exists a corresponding element −a ∈ R such that a + (−a) =
0.
(R2) (R, ·) is a semigroup. This means that

a · (b · c) = (a · b) · c, ∀a, b, c ∈ R.
(R3) Multiplication is distributive over addition. This means that

a · (b + c) = a · b + a · c,
(a + b) · c = a · c + b · c, ∀ a, b, c ∈ R.
As is customary, a · b is denoted by ab, and a + (−b) is denoted by a − b.

A ring R is said to be commutative if ab = ba ∀ a, b ∈ R, and is said to have an identity if
there exists an element 1 ∈ R such that 1 · a = a · 1 = a ∀ a ∈ R.
Examples A.1.2 A classic example of a ring is the set of integers, denoted by Z, with the usual
notions of addition and multiplication. Clearly, Z is a commutative ring with identity. The set
E of even integers, together with the usual addition and multiplication is an example of a ring
308 A. ALGEBRAIC PRELIMINARIES
without identity; but E is commutative. The set Z2×2 of 2 × 2 matrices with integer elements is a
noncommutative ring with identity.
A ring R is said to be a domain (or an integral domain) if a ∈ R, b ∈ R, ab = 0 implies that
either a = 0 or b = 0. In other words, R is a domain if the product of every pair of nonzero elements
is nonzero.
Example A.1.3 In Example A.1.2 above, both Z and E are domains, but Z2×2 is not.
Example A.1.4 Consider the ring Z/(6) consisting of the six elements {0, 1, 2, 3, 4, 5}, with
addition and multiplication defined modulo 6. For example, 3 + 5 = 8 mod 6 = 2, and 2.5 = 10
mod 6 = 4.This ring is commutative and has an identity. However, since 2 · 3 = 0, it is not a domain.
Suppose R is a ring with identity. An element x ∈ R is called a unit of R if there is a y ∈ R
such that xy = yx = 1. It can be easily shown that such a y is unique; y is called the inverse of x
and is denoted by x −1 .
Example A.1.5 Let R = C[0, 1], the set of continuous real-valued functions defined over the
interval [0, 1]. If addition and multiplication on R are defined pointwise, i.e.,
(x + y)(t) = x(t) + y(t) ,

(xy)(t) = x(t)y(t), ∀t ∈ [0, 1], ∀x, y ∈ R ,
then R is a commutative ring with identity. However, it is not a domain; for example, let

1 − 2t 0 ≤ t ≤ 0.5
x(t) =
0 0.5 < t ≤ 1

0 0 ≤ t ≤ 0.5
y(t) = .
2t − 1 0.5 < t ≤ 1
Then xy = 0, even though x = 0, y = 0. A function x ∈ R is a unit of R if and only if x(t) =

0 ∀t ∈ [0, 1]. In other words, the units in R are the functions that do not change sign over [0, 1].
Definition A.1.6 A field is a commutative ring F with an identity, satisfying two additional
assumptions:
(F1) F contains at least two elements.
(F2) Every nonzero element of F is a unit.
Examples A.1.7 The rational numbers, real numbers and complex numbers are all well-known
examples of fields. Another example is Z/(p) where p is a prime integer, consisting of the elements
A.1. RINGS, FIELDS AND IDEALS 309
{0, 1, · · · , p − 1}. Addition and multiplication are defined modulo p. This set is clearly a commu-
tative ring with identity, and it contains at least two elements. To show that it satisfies axiom (F2),
suppose a is any nonzero element of Z/(p). Since the greatest common divisor of a and p (in the
usual sense) is 1, there exist integers x and y such that
ax + py = 1 .
Moreover, for any integer q, we have that
a(x − qp) + p(y + qa) = 1 .
So by a suitable choice of q, the integer x − qp can be made to lie between 0 and p − 1. As a result,
it follows that there exist integers b and c, with b lying between 0 and p − 1, such that
ab + pc = 1, or ab = 1 − pc ≡ 1 mod p .
Hence, a is a unit and b is its inverse. For example, in the ring Z/(13), the inverse of 5 is 8, and the
inverse of 2 is 7.
The field Z/(2), consisting of the two elements {0, 1}, is called the binary field.
A subset S of a ring R is a subring of R if it is a ring in its own right, i.e., if 0 ∈ S, and the
sum, difference and product of two elements of S again belong to S.
Definition A.1.8 A subset I in a ring R is said to be a left ideal if (i) I is a subgroup of the additive
group of R, and (iiL) a ∈ I, x ∈ R imply that xa ∈ I. I is a right ideal if (i) I is a subgroup of the
additive group of R and (iiR) a ∈ I, x ∈ R imply that ax ∈ I. I is an ideal if it is both a left ideal
and a right ideal.
In the above definition, condition (i) means the following: First, 0 ∈ I, and second, if x, y ∈ I,
then x ± y ∈ I. Condition (iiL) (resp. (iiR)) means that if and element of I is multiplied on the left
(resp. right) by any element of R, the product is once again in I. Clearly, if R is a commutative ring,
the concepts of a left ideal, right ideal, and ideal all coincide.
Example A.1.9 Let R be the ring Z2×2 of 2 × 2 matrices with integer elements. Let
I1 = {M ∈ Z2×2 : m11 = 0, m21 = 0} .
Then it is easy to verify that I1 is a left ideal in R. Similarly,
I2 = {M ∈ Z2×2 : m21 = 0, m22 = 0}
is a right ideal in R. The set of diagonal matrices
D = {M ∈ Z2×2 : m21 = 0, m12 = 0}

is a subring of R but is neither a left ideal nor a right ideal.
Let a be an integer. Then
M(a) = {M ∈ Z2×2 : mij is divisible by a ∀i, j }
is an ideal in R.
Example A.1.10 Consider the ring C[0, 1] of Example A.1.5, and let t0 ∈ [0, 1]. Then
It0 = {x(·) ∈ C[0, 1] : x(t0 ) = 0}
is an ideal. More generally, let S be any subset of [0, 1]. Then
IS = {x(·) ∈ C[0, 1] : x(t) = 0 ∀t ∈ S}
is an ideal.
Suppose a is some element of a ring R. Then the set of all elements of the form xa where
x ∈ R, i.e., the set of all left multiples of a, is a left ideal. It is called the left principal ideal generated
by a. Similarly, the right principal ideal generated by a is the set of all elements ax where x ∈ R.
An ideal I in a commutative ring R is a prime ideal if a ∈ R, b ∈ R, ab ∈ I implies that
either a ∈ I or b ∈ I. Equivalently, I is a prime ideal if and only if the set R − I is closed under
multiplication (i.e., the product of two elements not belonging to I cannot itself belong to I).
Example A.1.11 Consider the ring of integers Z, and let n be any integer. Then the set of multiples
of n, denoted by (n), is the principal ideal generated by n. It is a prime ideal if and only if n divides
ab implies that n divides a or n divides b, which is true if and only if n is a prime number.
Example A.1.12 Consider the ideal It0 of Example A.1.10. If a product xy belongs to this ideal,
then (xy)(t0 ) = 0, which means that either x(t0 ) = 0 or y(t0 ) = 0. Hence, It0 is a prime ideal.
However, if S contains more than one point, then IS is not a prime ideal.
By convention, the entire ring R is not considered to be a prime ideal. This is in conformity
with the convention that 1 is not considered a prime number (see Example A.1.11). Thus, “I is a
prime ideal in R” automatically implies, by convention, that I is a proper subset of R.
A.1. RINGS, FIELDS AND IDEALS 311
PROBLEMS
A.1.1. (i) Show that the zero element of a ring is unique (Hint: if 01 and 02 are two additive
identities, then 01 = 01 + 02 = 02 ).
(ii) Show that the additive inverse of an element is unique.
A.1.2. (i) Show that a · 0 = 0 for all a in a ring (Hint: ab = a(b + 0)).
(ii) Show that, if a ring R contains an identity and has at least one nonzero element, then
1 = 0 (Hint: let a = 0; then a0 = a = a1).
A.1.3. Show that, if R is a domain, then the cancellation law holds, i.e., ab = ac, a = 0 implies
that b = c.
A.1.4. Let R be a ring with identity and at least one nonzero element.
(i) Show that the set of units of R is a group under multiplication; i.e., show that (a) if u is
a unit, then so is u−1 , and (b) if u, v are units, so are uv and vu.
(ii) Show that 0 can never be a unit (Hint: see Problem A.1.2).
(iii) Show that if x has an inverse, then it is unique.
(iv) Show that if x has a left inverse y and a right inverse z, then y = z (i.e., show that
yx = 1, xz = 1 implies y = z).
A.1.5. Let R be the set of functions mapping the interval [0, 1] into the set of integers Z.
(i) Show that R is a ring under pointwise addition and multiplication.
(ii) What are the units of this ring?
A.1.6. Consider the ring Z/(p) of Example A.1.7. Show that this ring is a field if and only if p
is a prime number.
A.1.7. (i) Consider the ring Z/(9), consisting of {0, 1, · · · , 8}, with addition and multiplication
modulo 9. Determine the units of this ring. (Answer: 1,2,4,5,7,8)
(ii) Consider the ring Z/(n). Show that m is a unit of this ring if and only if the greatest
common divisor of m and n is 1.
A.1.8. Consider the ring Zn×n of n × n matrices with integer elements. Show that a matrix M in
this ring is a unit if and only if its determinant is ±1.
A.1.9. Show that a commutative ring R is a domain if and only if {0} is a prime ideal.
A.2 RINGS AND FIELDS OF FRACTIONS
Throughout this section “ring” means a commutative ring with identity.
Suppose R is a ring. An element a ∈ R is an absolute nondivisor of zero if b ∈ R, ab = 0 implies
that b = 0. In every ring, there are absolute nondivisors of zero; for example, all the units are of this
type. But there may be others as well.
Examples A.2.1 In the ring Zn×n , a matrix M is an absolute nondivisor of zero if and only if
the determinant of M is nonzero. Thus, a nonunit matrix can be an absolute nondivisor of zero
(cf. Problem A.1.8).
Now consider the ring C[0, 1] defined in Example A.1.5. Suppose a function x(·) belonging
to this ring vanishes at only a finite number of points. If y ∈ C[0, 1] and xy = 0, then x(t)y(t) ≡ 0,
which means that y(t) = 0 for all except a finite number of values of t. However, since y(·) is
continuous, this implies that y(t) ≡ 0, or that y is the zero element of the ring. Hence, x is an
absolute nondivisor of zero.
If a is an absolute nondivisor of zero and ab = ac, then b = c; in other words, absolute
nondivisors of zero can be “cancelled.”
Of course, if R is a domain, then every nonzero element of R is an absolute nondivisor of zero.
A set M in a ring R is said to be a multiplicative system if a, b ∈ M implies that ab ∈ M. It is
saturated if a ∈ R, b ∈ R, ab ∈ M implies that a ∈ M, b ∈ M.
Fact A.2.2 The set N of absolute nondivisors of zero in a ring R is a multiplicative system.
Proof. Suppose a, b ∈ N, y ∈ R, and aby = 0. Then, since a, b ∈ N, it follows successively that

aby = 0 ⇒ by = 0 ⇒ y = 0. Hence, ab ∈ N. 2
Suppose R is a ring, M is a multiplicative system in R containing 1, and M is a subset of N

(the set of absolute nondivisors of zero in R). We now begin a construction which will ultimately
result in a ring L which contains R as a subring, and in which every element of M is a unit.
Consider the set R × M, and define a binary relation ∼ on R × M as follows: (a, b) ∼
(c, d) ⇐⇒ ad = bc. The relation ∼ is an equivalence relation: Clearly ∼ is reflexive and symmetric.
To show that it is transitive, suppose (a, b) ∼ (c, d) and (c, d) ∼ (e, f ).Then ad = bc and cf = de.
Multiplying the first equation by f and the second one by b gives adf = bcf = bde. Now d ∈ N
since d ∈ M and M ⊆ N. Hence, d can be cancelled in the above equation to give af = be, i.e.,
(a, b) ∼ (e, f ).
Since ∼ is an equivalence relation, the set R × M can be partitioned into disjoint equivalence
classes under ∼. The set of equivalence classes R × M/ ∼ is denoted by L. The set L therefore
consists of fractions (a, b), or a/b in more familiar terms, where we agree to treat two fractions a/b
A.2. RINGS AND FIELDS OF FRACTIONS 313
and c/d as equivalent if ad = bc. Addition and multiplication of fractions in L are defined in the
familiar way, namely
a c ad + bc
+ = . (A.1)
b d bd
a c ac
· = . (A.2)
b d bd
Note that, if b, d ∈ M, then so does bd. Hence, the right sides of (A.1) and (A.2) are valid fractions.
Actually, (A.1) and (A.2) represent operations on equivalence classes; but the reader can verify
that the sum and product of two fractions (i.e., two equivalence classes) do not depend on which
representatives of the equivalence classes are used.
With addition and multiplication defined by (A.1) and (A.2), L is a ring. Moreover, if every
a ∈ R is identified with the fraction a/1, then R is isomorphic to a subring of L. The element 1/1
serves as an identity for L. Finally, if d ∈ M, then d corresponds to d/1 ∈ L; moreover, d/1 is a unit
in L with the inverse 1/d. Thus, every element of (the isomorphic image of ) M is a unit in L. The
ring L is called the ring of fractions of R with respect to M, and is denoted by M−1 R.
Since M is a subset of N (the set of absolute nondivisors of zero), the ring N−1 R contains the
largest number of units. If in particular R is a domain, then N = R \ 0, i.e., N consists of all nonzero
elements of R. In this case, the ring F = (R \ 0)−1 R has the property that every nonzero element
of R is a unit in F. Moreover, every nonzero element of F is also a unit: If a/b ∈ F and a = 0, then
b/a is the inverse of a/b. Hence, F is a field; it is referred to as the field of fractions or quotient field
associated with the domain R.
There is a particularly important class of fraction rings that is frequently encountered. Suppose
R is a ring, and that I is a prime ideal in R. If M denotes the complement of I, then M is a
multiplicative system (see Problem A.1.4). The corresponding fraction ring M−1 R is called the
localization of R with respect to I.
Examples A.2.3 The set of integers Z is a domain. The field of fractions associated with the
integers is the set of rational numbers.
In the ring Z, let M denote the set of numbers that are not divisible by 3. Then M is a
multiplicative system (since 3 is a prime number), and 1 ∈ M. The ring of fractions M−1 Z is the set
of rational numbers whose denominators (when expressed in reduced form) are not divisible by 3.
This is a subring of the field of rational numbers.
PROBLEMS
A.2.1. Suppose R is a ring, a, b, c ∈ R, and a is an absolute nondivisor of zero. Show that if
ab = ac, then b = c.
A.2.2. Show that the addition and multiplication rules (A.1) and (A.2) are unambiguous, in the
sense that the final answers do not depend on which representatives of the equivalence
classes are used in the computation.
A.2.3. Consider the ring Z/(10), consisting of the integers 0, · · · , 9 with addition and multipli-
cation defined modulo 10.
(i) Determine the units of this ring.
(ii) Show that the set N of absolute nondivisors of zero consists of just the units.
(iii) Using (ii), show that the ring of fractions with numerators in Z/(10) and denominators
in N is again just Z/(10).
A.2.4. Suppose R is a ring and that I is a prime ideal in R. Let M denote the complement of I.
Show that M is a multiplicative system.
A.3 PRINCIPAL IDEAL DOMAINS

Throughout this section, “ring” means a commutative ring with identity.
Definition A.3.1 A ring R is said to be a principal ideal ring if every ideal in R is principal. R is a
principal ideal domain (p.i.d.) if it is a domain as well as a principal ideal ring.
Recall that a principal ideal I consists of all multiples of some element a, i.e., I = {xa : x ∈ R}.
Thus, in a principal ideal ring, every ideal is generated by a single element.
If x and y are elements of a ring R with x = 0, we say that x divides y, and y is a multiple of
x, if there is an element z ∈ R such that y = xz; this is denoted by x | y. If x and y are elements of
a ring R such that not both are zero, a greatest common divisor (GCD) of x, y is any element d ∈ R
such that
(GCD1) d | x and d | y .
(GCD2) c | x, c | y ⇒ c | d .
In the above definition, it is implicit that d = 0, since d divides a nonzero element by virtue
of (GCD1). We say a greatest common divisor because a GCD is not unique. Certainly, if d is
a GCD of x and y, then so is −d. Actually, the following stronger result holds:
Fact A.3.2 Suppose d is a GCD of x and y; then so is du whenever u is a unit. Suppose in addition
that R is a domain; then every GCD d1 of x and y is of the form d1 = du for some unit u.
Proof. The first sentence is obvious. To prove the second sentence, observe that if d1 is another
GCD of x and y, then by (GCD2) d1 | d and d | d1 . Hence, d1 = du for some u. Moreover, du | d
since d1 | d. Since R is a domain and d = 0, this implies that u | 1, i.e., that u is a unit. 2
A.3. PRINCIPAL IDEAL DOMAINS 315
Fact A.3.2 states that if R is a domain, then once we have found one GCD of a given pair of
elements, we can quickly find them all. However, a question that is not answered by Fact A.3.2 is:
Does every pair of elements have a GCD? The answer to this question is provided next.
Theorem A.3.3 Suppose R is a principal ideal ring. Then every pair of elements x, y ∈ R, not
both of which are zero, has a GCD d which can be expressed in the form
d = px + qy (A.1)
for appropriate elements p, q ∈ R. Moreover, if R is a domain, then every GCD of x and y can be
expressed in the form (A.1).
Proof. Given x, y ∈ R, consider the set
I = {ax + by, a, b ∈ R} . (A.2)
In other words, I is the set of all “linear combinations” of x and y. It is easy to verify that I is an ideal
in R.1 Since R is a principal ideal ring, I must be a principal ideal; that is, there exists an element
d ∈ I such that I is the set of all multiples of d. Since x = 1 · x + 0 · y ∈ I, it follows that x is a
multiple of d, i.e., d | x; similarly d | y. Thus, d satisfies the axiom (GCD1). Next, since d ∈ I and
I is the ideal generated by x, y, there exist p, q ∈ R such that (A.1) holds. Now suppose c | x, c | y.
Then c | (px + qy), i.e., c | d. Hence, d also satisfies (GCD2) and is thus a GCD of x and y.
Up to now, we have shown that every pair of elements x and y has a GCD which can be
written in the form (A.1) for a suitable choice of p and q. Now, if in addition R is a domain, then
every GCD d1 of x, y is of the form d1 = du for some unit u, by Fact A.3.2. Thus, every GCD d1
can be written in the form d1 = du = upx + upy. 2
Two elements x, y ∈ R are relatively prime or simply coprime if every GCD of x, y is a unit.
In view of Fact A.3.2, if R is a domain, this is equivalent to saying that x and y are coprime if and
only if 1 is a GCD of x and y. Now suppose R is a principal ideal domain. Then, by Theorem A.3.3,
x, y ∈ R are coprime if and only if the ideal I in (A.2) is the same as the ideal generated by 1. But
the latter is clearly the entire ring R. This can be summarized as follows:
Fact A.3.4 Let R be a principal ideal domain. Then x, y ∈ R are coprime if and only if there exist
p, q ∈ R such that px + qy = 1.
One can also define a GCD of an n-tuple of elements (x1 , · · · , xn ), not all of which are zero.
An element d ∈ R is a GCD of the collection (x1 , · · · , xn ) if
(i) d | xi ∀i , (A.3)
(ii) c | xi ∀i ⇒ c | d . (A.4)
1 I is referred to as the ideal generated by x and y.
As in Fact A.3.2, it follows that a GCD of a given n-tuple is unique to within a unit provided the
ring is a domain. Moreover, Theorem A.3.3 can be generalized as follows:
Theorem A.3.5 Suppose R is a principal ideal ring, and suppose x1 , · · · , xn ∈ R, with at least one
element not equal to zero. Then the n-tuple (x1 , · · · , xn ) has a GCD d which can be expressed in
the form
n
d= pi xi (A.5)
i=1
for appropriate elements p1 , · · · , pn ∈ R. Moreover, if R is a domain, then every GCD of this
n-tuple can be expressed in the form (A.5).
The proof is entirely analogous to that of Theorem A.3.3 and is left as an exercise.
In the field of rational numbers, every ratio a/b of integers has an equivalent “reduced form”
f/g where f and g are coprime. The next result shows that such a statement is also true in the field
of fractions associated with any principal ideal domain.
Fact A.3.6 Let R be a principal ideal domain, and let F be the field of fractions associated with R.
Given any fraction a/b in F, there exists an equivalent fraction f/g such that f and g are coprime.
Proof. If a and b are already coprime, then there is nothing to be done. Otherwise, let d be a GCD
of a, b, and define f = a/d, g = b/d. Then clearly, f/g = a/b, and it only remains to show that
f and g are coprime. From Theorem A.3.3, there exist p, q ∈ R such that
d = pa + qb = pf d + qgd . (A.6)
Cancelling d from both sides of (A.6) gives 1 = pf + qg, which shows that f, g are coprime. 2
Thus, the procedure for “reducing” a fraction is the natural one: namely, we extract a greatest
common divisor from the numerator and the denominator.
Two elements a, b ∈ R are associates (denoted by a ∼ b) if there is a unit u such that a = bu.
One can readily verify that ∼ is an equivalence relation on R. A nonunit, nonzero element p ∈ R
is a prime if the only divisors of p are either units or associates of p. An equivalent definition is the
following: p is a prime if p = ab, a, b ∈ R implies that either a or b is a unit. In the ring of integers,
the primes are precisely the prime numbers.
A useful property of principal ideal domains is stated next, without proof (see [116]).
Fact A.3.7 Every nonunit, nonzero element of a principal ideal domain can be expressed as a
product of primes. Moreover, this factorization is unique in the following sense: If

n
m
x= pi = qi , (A.7)
i=1 i=1
A.3. PRINCIPAL IDEAL DOMAINS 317
where pi , qi are all primes, then n = m, and the qi ’s can be renumbered such that pi ∼ qi ∀i.
Thus, the only nonuniqueness in the prime factorization of an element arises from the pos-
sibility of multiplying some of the prime factors by a unit (for example, 6 = 2 · 3 = (−2) · (−3)).
Using prime factorizations, one can give a simple expression for a GCD of a set of elements.
The proof is left as an exercise (see Problem A.3.7).
Fact A.3.8 Suppose x1 , · · · , xn is a set of elements such that none is a unit nor zero. Express each
element xi in terms of its prime factors in the form

n
α
xi = pj ij (A.8)
i=1
where the pj ’s are distinct (i.e., nonassociative) primes, αij ≥ 0 and αij = 0 if pj is not a divisor of
xi . Then a GCD d of (x1 . · · · , xn ) is given by

n
β
d= pj j (A.9)
i=1
where
βj = min αij . (A.10)
i
Corollary A.3.9 Two elements x, y are coprime if and only if their prime divisors are distinct.
Up to now, we have talked about the greatest common divisor of a set of elements. A parallel
concept is that of the least common multiple. Suppose (x1 , · · · , xn ) is a set of elements, none of
which is zero. We say that y is a least common multiple of this set of elements if
(LCM1) xi | y ∀i ,
(LCM2) xi | z ∀i implies that y | z .
Using prime factorizations, one can give a simple expression for a l.c.m. that parallels Fact A.3.8.
Fact A.3.10 Suppose (x1 , · · · , xn ) is a collection of nonzero, nonunit elements. Express each
element xi in terms of its prime factors as in (A.8). Then a l.c.m. of this collection of elements is
given by
n
γ
y= pj j , (A.11)
i=1
where
γj = max αij . (A.12)
i
The proof is left as an exercise (see Problem A.3.8).
PROBLEMS
A.3.1. Prove that two elements x, y in a p.i.d. R are coprime if and only if there exist p, q ∈ R
such that px + qy is a unit.
A.3.2. Let R be a domain, and let x, y denote a GCD of x, y which is unique to within a unit
factor. Show that if u, v are units, then x, y = ux, vy (in other words, every GCD of
x, y is also a GCD of ux, vy and vice versa).
A.3.3. Let R be a p.i.d., and let x, y, z ∈ R. Show that x, y, z = x, y , z . More generally,
show that if x1 , · · · , xn ∈ R, then x1 , · · · , xn = x1 , · · · , xm , xm+1 , · · · , xn for 1 ≤
m < n.
A.3.4. Let R be a p.i.d., and let x1 , · · · , xn ∈ R. Let d be a GCD of this set of elements and

select p1 , · · · , pn ∈ R such that pi xi = d. Show that 1 is a GCD of the set of elements
p1 , · · · , pn .
A.3.5. Suppose x, y are coprime and z divides y, in a p.i.d. R. Show that x and z are coprime.
A.3.6. Suppose u is a unit and x1 , · · · , xn are nonzero elements. Show that u, x1 , · · · , xn = 1.
A.3.7. Prove Fact A.3.8.
A.3.8. Prove Fact A.3.10.
A.3.9. (i) Find the prime factorizations of 8, 24, 42 in the ring of integers.
(ii) Find a GCD of the above three numbers using Fact A.3.8.
(iii) Find a l.c.m. of the above three numbers using Fact A.3.10.
A.3.10. Show that the following three statements are equivalent:

(i) x divides y.
(ii) x is a GCD of x, y.
(iii) y is a l.c.m. of x, y.
A.3.11. Suppose R is a p.i.d., that x, y, z ∈ R, x and y are coprime, and that x divides yz. Show
that x divides z. (Hint: Use Corollary A.3.9.)
A.4. EUCLIDEAN DOMAINS 319
A.4 EUCLIDEAN DOMAINS
In this section, we study a special type of ring that finds a lot of application in this book. Throughout
this section “domain” means a commutative domain with identity.
Definition A.4.1 A domain R is a Euclidean domain if there is a degree function δ : R \ 0 → Z+

a satisfying the following axioms:2
(ED1) For every x, y ∈ R with y = 0, there exists a q ∈ R such that either r := x − qy is zero, or
else δ(r) < δ(y).
(ED2) If x | y then δ(x) ≤ δ(y).
One can think of q as a quotient, and r as a remainder, after “dividing” x by y.The axiom (ED1)
states that we can always get a remainder that is either zero or else has a smaller degree than the
divisor y. We speak of a quotient and remainder because q and r are not necessarily unique unless
additional conditions are imposed on the degree function δ(·).
The axiom (ED2) implies that δ(1) ≤ δ(x) ∀x = 0, since 1 divides every nonzero element.
Hence, it can be assumed without loss of generality that δ(1) = 0. The same axiom also implies that
if x and y are associates, then they have the same degree (because if x and y are associates, then x | y
and y | x). In particular, δ(u) = 0 whenever u is a unit.
Fact A.4.2 Let R be a Euclidean domain with degree function δ(·). and suppose
δ(x + y) ≤ max{δ(x), δ(y)} , (A.1)

δ(xy) = δ(x) + δ(y) . (A.2)
Then, for every x, y ∈ R with y = 0, there exists a unique q ∈ R such that δ(x − yq) < δ(y), where
the degree of zero is taken as −∞.
Proof. By (ED1), there exists at least one such q. Now suppose δ(x − ay) < δ(y), δ(x − by) < δ(y),
and define r = x − ay, s = x − by.Then x = ay + r = by + s. Rearranging gives (a − b)y = s −
r. If a = b, then δ((a − b)y) = δ(a − b) + δ(y) ≥ δ(y), by (A.2). On the other hand, δ(s − r) ≤
max{δ(r), δ(s)} < δ(y). This contradiction shows that a = b and also r = s. 2
Definition A.4.3 A Euclidean domain R with degree function δ(·) is called a proper Euclidean
domain if R is not a field and δ(·) satisfies (A.2).3
2 Note that Z denotes the set of nonnegative integers.
+
3This is slightly different from the definition in [65, p. 30].
Note that, in a proper Euclidean domain, the division process might still produce nonunique
quotients and remainders, because (A.1) is not assumed to hold. This is the case, for example, in the
ring of proper stable rational functions, which are studied in Chapter 2.
Fact A.4.4 Every Euclidean domain is a principal ideal domain.
Proof. Let R be a Euclidean domain, and let I be an ideal in R. If I = {0}, then I is principal with
0 as the generator. So suppose I contains some nonzero elements, and let x be an element of I such
that δ(x) is minimum over all nonzero elements of I. We claim that I is the ideal generated by x
and is hence principal. To prove this, let y ∈ I be chosen arbitrarily; it is shown that x divides y. By
axiom (ED1), there exists a q ∈ R such that either r := y − qx is zero or else δ(r) < δ(x). If r = 0,
then δ(r) < δ(x) contradict’s the manner in which x was chosen. Hence, r = 0, i.e., x divides y.
2
We now present a very important example of a Euclidean domain, namely the ring of poly-
nomials in one indeterminate with coefficients in a field. To lead up to this example, an abstract
definition of a polynomial is given.
Let R be a ring. Then a polynomial over R is an infinite sequence {a0 , a1 , · · · } such that only
finitely many terms are nonzero. The sum and product of two polynomials a = {ai } and b = {bi }
are defined by
(a + b)i = ai + bi , (A.3)
i
i
(ab)i = ai−j bj = aj bi−j . (A.4)
j =0 j =0
For notational convenience, a polynomial a = {ai } can be represented by a0 + a1 s + a2 s 2 + · · ·

where s is called the “indeterminate.” The highest value of the index i such that ai = 0 is called the
degree of a polynomial a = {a0 , a1 , · · · }.4 Thus, if a is a polynomial of degree m, we can write

m
a(s) = a0 + a1 s + · · · + am s m = ai s i . (A.5)
i=0
The set of polynomials over R is denoted by R[s], and is a commutative ring with identity. Moreover,
if R is a domain, so is R[s].
Fact A.4.5 Suppose F is a field. Then F[s] is a Euclidean domain if the degree of a polynomial in
F[s] is defined as above.
4The degree of the zero polynomial is taken as −∞.

A.4. EUCLIDEAN DOMAINS 321
Proof. To prove axiom (ED1), suppose

n
m
f (s) = fi s i , g(s) = gi s i , gm = 0 . (A.6)
i=0 i=0
It is necessary to show the existence of a q ∈ F[s] such that δ(f − gq) < δ(g), where δ(0) = −∞. If
n < m, take q = 0. If n ≥ m, define q1 (s) = (fn /gm )s n−m ; then δ(f − gq1 ) ≤ n − 1. By repeating
this process if necessary on the polynomial f − gq1 we can ultimately find a q ∈ F[s] such that
δ(f − gq) < δ(g). Thus, (ED1) is satisfied. The proof of (ED2) is straight-forward. 2
Since the degree function δ(·) satisfies both (A.1) and (A.2), the Euclidean division process
yields a unique remainder and quotient r, q corresponding to each pair f, g with g = 0. Moreover,
it is clear that F[s] is not a field (the polynomial s has no inverse). Hence, F[s] is a proper Euclidean
domain.
The field of fractions associated with F[s] is denoted by F(s), and is called the set of rational
functions over F. Note that every element of F(s) is a ratio of two polynomials (hence the name
rational function).
PROBLEMS
A.4.1. Show that the set of integers Z is a Euclidean domain if we define the degree of an integer
to be its absolute value. Does the division process result in unique remainders?
A.4.2. Consider the ring R[s], consisting of polynomials with real coefficients. What are the
primes of this ring?
A.4.3. Suppose R is a proper Euclidean domain and x ∈ R. Show that if δ(x) = 1, then x is a
prime. Is the converse true? (Hint: See Problem A.4.2.)
A.4.4. Let R be a Euclidean domain. Show that x ∈ R is a unit if and only if δ(x) = 0. (Hint:
Use axiom (ED1).)
323
APPENDIX B
Preliminaries on Matrix Rings

The objective of this appendix is to gather some well-known facts on matrices whose elements belong
to a ring or a field, and to state them in as much generality as possible.
B.1 MATRICES AND DETERMINANTS

Let R be a ring, and let Rn×n denote the set of square matrices of order n × n whose elements belong
to R. If the sum and product of two matrices in Rn×n are defined in the familiar way, namely
(A + B)ij = aij + bij , (B.1)

n
(AB)ij = aik bkj , (B.2)
k=1
then Rn×n becomes a ring, usually referred to as a ring of matrices over R. If R contains an identity
and n ≥ 2, then Rn×n is not commutative. For instance, if n = 2, we have

1 0 0 0 0 0 1 0
= . (B.3)
0 0 1 0 1 0 0 0
Similar examples can be constructed if n > 2. Also, if n ≥ 2, then Rn×n is not a domain because
Diag {1, 0, · · · , 0} Diag {0, 1, 0, · · · , 0} = 0n×n . (B.4)
The determinant of a matrix A ∈ Rn×n is denoted by |A| and is defined in the familiar way, namely

n
|A| = sign φ aiφ(i) , (B.5)
φ∈n i=j
where n denotes the set of permutations of the set N = {1, · · · , n} into itself, and sign φ = ± 1
depending on whether φ is an even or odd permutation.1 Most of the usual properties of determinants
hold in the present abstract setting. The required results are stated without proof, and the reader is
referred to [65] for further details.
Define a function : Rn × · · · × Rn → R as follows: For every v1 , · · · , vn ∈ Rn , define
(v1 , · · · , vn ) to be the determinant of the matrix V ∈ Rn×n whose columns are v1 , · · · , vn in that
1Throughout this section, the symbol “1” is used to denote both the integer as well as the identity element of the ring R. It is
usually clear from the context which is meant.
324 B. PRELIMINARIES ON MATRIX RINGS
order. Thus, is just the determinant function viewed as a function of the columns of a matrix.
Similarly, define (v ¯ 1 , · · · , vn ) to be the determinant of the matrix V ∈ Rn×n whose rows are
v1 , · · · , vn , in that order.
Fact B.1.1 We have

¯ 1 , · · · , vn ) ∀ v1 , · · · , vn ∈ Rn .
(v1 , · · · , vn ) = (v (B.6)
The function (and hence )¯ is alternating and multilinear. That is, if two arguments of are
interchanged, then changes sign, and if two arguments of are equal, then equals zero; finally,
(αv1 + βw1 , v2 , · · · , vn ) = α(v1 , v2 , · · · , vn ) + β(w1 , v2 , · · · , vn ) (B.7)
for all possible choices of the arguments.
Fact B.1.2 Let A ∈ Rn×n , n ≥ 2. Then, for any i, j ∈ {1, · · · , n},

n
|A| = (−1)i+j aij mij (A) , (B.8)
j =1

n
|A| = (−1)i+j aij mij (A) , (B.9)
i=1
where mij (A) is the ij -th minor of A, defined as the determinant of the (n − 1) × (n − 1) matrix
obtained from A by deleting its i-th row and j -th column.2
Fact B.1.3 Let A ∈ Rn×n , n ≥ 2. Then

n
|A| if i = k
aij (−1) k+j
mkj (A) = (B.10)
0 if i = k
j =1
n
|A| if j = k
aik (−1) i+j
mij (A) = . (B.11)
0 if j = k
i=1
Now a bit of notation is introduced to make subsequent theorem statements more compact.
Suppose m and n are positive integers, with m ≤ n.Then S(m, n) denotes the collection of all strictly
increasing m-tuples {i1 , · · · , im }, where 1 ≤ i1 < i2 < · · · < im ≤ n. For example,
S(3, 5) = {(1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 4), (1, 3, 5) ,
(1, 4, 5), (2, 3, 4), (2, 3, 5), (2, 4, 5), (3, 4, 5)} . (B.12)
2To be consistent with subsequent notation, one should write a
N\i,N \j instead of mij (A). But the latter notation is more convenient.
B.1. MATRICES AND DETERMINANTS 325
If m = n, then S(m, n) is the singleton set {(1, 2. · · · , n)}, while if m = 0 then S(m, n) is just the
empty set.
Suppose A ∈ Rm×n , and let I ∈ S(l, m), J ∈ S(l, n). Then aI J denotes the l × l minor of A
consisting of the rows from I and the columns from J . In particular, if I and J are singleton sets
of the form I = {i}, J = {j }, then aI J is just the element aij . Note that aI J ∈ R. We use AI J to
denote the l × l matrix consisting of the elements from the rows in I and the columns in J . Thus,
aI J = |AI J |. In some situations it is of interest to examine a submatrix of A consisting of the rows
in I and all columns of A; such a submatrix is denoted by AI. . The notation A.J is similarly defined.
Fact B.1.4 (Laplace’s Expansion of a Determinant) Suppose A ∈ Rn×n , and suppose I ∈ S(m, n).
Then,
|A| = (−1)v(I,J ) aI J aN\I,N\J , (B.13)
J ∈S(m,n)
where
v(I, J ) = i+ j. (B.14)
i∈I j ∈J
Equation (B.13) generalizes (B.8). The corresponding generalization of (B.9) is similar and
is left to the reader.
Fact B.1.5 (Binet-Cauchy Formula) Suppose A ∈ Rn×m , B ∈ Rm×l , and let C = AB ∈ Rn×l .
Let I ∈ S(p, n), J ∈ S(p, l). Then,

cI J = aI K bKJ . (B.15)
K∈S(p,m)
In particular, if n = m = l, then |C| = |A| · |B|.
Note that (B.15) is a natural generalization of (B.2).

Using the multilinearity of the determinant function, one can obtain an expression for the
determinant of the sum of two matrices. For example, if A, B ∈ R2×2 and a1 , a2 , b1 , b2 are the
columns of the two matrices, then
|A + B| = (a1 + b1 , a2 + b2 )
= (a1 , a2 ) + (a1 , b2 ) + (b1 , a2 ) + (b1 , b2 ) . (B.16)
If A, B ∈ Rn×n then the formula for |A + B| will involve the sum of 2n terms. In case one of the
matrices is diagonal, each term in this expansion can be neatly expressed as a product of principal
minors of A and B.
Fact B.1.6 Suppose A, B ∈ Rn×n and that A is diagonal (i.e., aij = 0 for i = j ). Then,

n
|A + B| = aI I bN\I,N\I , (B.17)
l=0 I ∈S(l,n)
where aI I is interpreted as 1 when I is empty.
If A ∈ Rn×n , its adjoint matrix, denoted by Aadj , is defined by
(Aadj )ij = (−1)i+j mj i (A) . (B.18)
In view of (B.10) and (B.11), it is seen that, for any A ∈ Rn×n ,
A · Aadj = Aadj · A = |A|In , (B.19)
where In denotes the n × n identity matrix.

A matrix A ∈ Rn×n ts unimodular if it has an inverse in Rn×n , i.e., it is a unit in the ring
n×n
R .
Fact B.1.7 A ∈ Rn×n is unimodular if and only if |A| is a unit in R.
Proof. “if ” Suppose |A| is a unit in R and let b = |A|−1 ∈ R. Then bAadj ∈ Rn×n and A · bAadj =
bAadj · A = In . Hence, A is unimodular.
“only if ” Suppose A is unimodular, and let B ∈ Rn×n be the inverse of A. Then 1 = |In | =
|A| · |B|, which shows that |A| is a unit in R. 2
Now consider the set Fn×n of matrices with elements in a field F. Since a field is also a ring,
all of the preceding discussion applies. In addition, since every nonzero element of F is a unit, we
see that every A ∈ Fn×n such that |A| = 0 has an inverse in Fn×n . It is customary to call a matrix A
nonsingular if |A| = 0. From Fact B.1.3, we see that if A is nonsingular, then A−1 is given by
(A−1 )ij = (−1)i+j mj i (A)/|A| . (B.20)
The relation (B.20) is a special case of the following result, which gives the relationship between the
minors of A and A−1 . The proof can be found in [41, pp. 21–22].
Fact B.1.8 Suppose A ∈ Fn×n is nonsingular and let B = A−1 . Let N = {1, · · · , n} and suppose
I, J ∈ S(l, n) for some l. Then,
bI J = (−1)v(N\I,N\J ) aN\I,N\J |A|−1 , (B.21)
where the function v is defined in (B.14).

If I and J are singleton sets then (B.21) reduces to (B.20).
Suppose F ∈ Fr×s , and suppose F can be partitioned as

A B
F = , (B.22)
C D
B.1. MATRICES AND DETERMINANTS 327
where A is a nonsingular matrix of order n × n. Then the matrix
G := D − CA−1 B ∈ Fr−n×s−n (B.23)
is called the Schur complement of F with respect to A, and is sometimes denoted by F /A. The next
result relates the minors of G to those of F .
Fact B.1.9 Suppose J ∈ S(t, m), K ∈ S(t, l). Then
gJ K = |A|−1 fN∪({n}+J ),N∪({n}+K) , (B.24)
where {n} + J denotes the set sum (thus if J = {j1 , · · · , jt }, then {n} + J = {n + j1 , · · · , n + jt }).
Proof. Observe that

I 0 A B
F = =: E . (B.25)
−CA−1 I 0 G
Now suppose P ∈ S(n + t, n + m), Q ∈ S(n + t, n + l) and that N is a subset of both P and
Q. Consider the minor eP Q . The rows in P of E are obtained by adding multiples of the first
n rows of F to the rows in P of F . Thus, eP Q = fP Q , since a minor is unchanged by adding
multiples of some rows to others. Observe that the matrix EP Q is block-lower-triangular, so that
eP Q = |A| · eP −N,Q−N . The identity (B.24) now follows by choosing P = N ∪ ({n} + J ), Q =
N ∪ ({n} + K). 2
Fact B.1.10 Suppose A ∈ Fn×n , B ∈ Fm×n , |A| = 0, and let G = BA−1 . Suppose J ∈
S(l, m), K ∈ S(l, n). Then |A| · gJ K equals the determinant of the n × n matrix obtained from
A by replacing the rows in K of A by the rows in J of B.
Remarks B.1.11 A simple example helps to illustrate the statement of the result. Suppose A ∈
F5×5 , B ∈ F3×5 , and let a i , bi denote the i-th rows of A and B, respectively. Each of these is a 1 × 5
row vector. Suppose J = (2, 3), K = (3, 5). Then the above result states that

1
|A| · g(2,3),(3,5) =
b2
. (B.26)

a4

b3
Note that if B is a column vector then the above fact reduces to Cramer’s rule for solving linear
equations.
Proof. Define

A −I
F = . (B.27)
B 0
Then G is the Schur complement of F with respect to A. By Fact B.1.9,
|A| · gJ K = fN ∪ ({n} + J ), N ∪ ({n} + K),

A −I.K

. (B.28)
BJ. 0
Note that each of the last l columns of the minor consists of all zeros except for a single 1. If we
expand this minor about the last l columns using Laplace’s expansion (Fact B.1.4), then the expansion
consists of a single term, since the submatrix consisting of the last l columns has only nonzero l × l
minor. Thus, if K = {k1 , · · · , kl }, then

l
AN\K
|A| · gJ K = (−1) (−1)
v
, (B.29)
BJ.
where

l
l
v= (n + i) + ki . (B.30)
i=1 i=1
Now the minor on the right side of (B.29) is not quite in the form stated in Fact B.1.10, since
the rows of B occur below those of A, rather than substituting for them (as in Example B.26, for
instance). It is a matter of detail to verify that the parity factor (−1)v+l accounts for this difference.
2
Fact B.1.12 Suppose A ∈ Fn×n , B ∈ Fn×m , |A| = 0 and let G = A−1 B. Suppose J ∈
S(l, n), K ∈ S(l, m). Then |A| · gJ K equals the determinant of the n × n matrix obtained from
A by replacing the columns in J of A by the columns in K of B.
The proof is similar to that of Fact B.1.10 and is left to the reader.
B.2 CANONICAL FORMS

Let R be a principal ideal domain, and let F be the field of fractions associated with R. In this section,
we study the sets Rn×m and Fn×m , consisting of n × m matrices whose elements belong to R and
F, respectively. We prove the existence of two canonical forms, namely the Smith form on Rn×m and
the Smith-McMillan form on Fn×m .
A matrix A ∈ Rn×m is a left associate of B ∈ Rn×m (denoted by A =L B) if there is a unimod-
ular matrix U ∈ Rn×n such that A = U B. A is a right associate of B (denoted by A =R B) if there
is a unimodular matrix V ∈ Rm×m such that A = BV . A is equivalent to B (denoted by A ∼ B) if
B.2. CANONICAL FORMS 329
there are unimodular matrices U ∈ R , V ∈ R
n×n m×m
. such that A = U BV . It is left to the reader
to verify that =L , =R , ∼ are all equivalence relations.
We now commence our study of canonical forms.
Lemma B.2.1 Suppose a1 , · · · , an ∈ R, and let dn be a GCD of this set of elements. Then there exists
a matrix Pn ∈ Rn×n whose first row is [a1 · · · an ] and whose determinant is dn .
Proof. The proof is by induction on n. If n = 2, by Theorem A.3.3 there exist p1 , p2 ∈ R such that
p1 a1 + p2 a2 = d2 . Now let

a1 a2
P2 = (B.1)
−p2 p1
For larger values of n, let dn−1 be a GCD of a1 , · · · , an−1 . By the inductive hypothesis, we can
construct a matrix Pn−1 ∈ Rn−1×n−1 whose first row is [a1 · · · an−1 ] and whose determinant is
dn−1 . By Fact B.1.3,

n−1
aj (−1)1+j m1j (Pn−1 ) = dn−1 . (B.2)
j =1
so that

n−1
aj
(−1)1+j m1j (Pn−1 ) = 1 . (B.3)
dn−1
j =1
For convenience, let zj = −aj /dn−1 . Next, by Problem A.3.3, dn is a GCD of dn−1 and an . By
Theorem A.3.3, there exist x, y ∈ R such that xdn−1 + yan = dn . Now define
⎡ ⎤
an
⎢ Pn−1 0⎥
⎢ ⎥
⎢ .. ⎥ .
Pn = ⎢ .⎥ (B.4)
⎢ ⎥
⎣ 0⎦
yz1 ··· yzn−1 x
Expanding |Pn | about the last column and using (B.3) gives |Pn | = xdn−1 + yan = dn . 2
Theorem B.2.2 (Hermite Form) Every A ∈ Rn×n is a left associate of a matrix B that is lower-
triangular (i.e., bij = 0 for j > i).
Proof. Let dn be a GCD of {a1n , · · · , ann }, i.e., the elements of the last column of A. By Theo-
rem A.3.5, there exist elements p1 , · · · , pn ∈ R such that

n
pi ain = dn . (B.5)
i=1
By Problem A.3.4, 1 is a GCD of the set p1 , · · · , pn . Hence, by a slight variation of Lemma B.2.1,
there exists a unimodular matrix U ∈ Rn×n such that its last row is [p1 · · · pn ]. Now (U A)nn = dn .
Also, for i = 1, · · · , n − 1, (U A)in belongs to the ideal generated by a1n , · · · , ann and is thus a
multiple of dn . Let zi = (U A)in /dn for i = 1, · · · , n − 1, and define
⎡ ⎤
1 0 ··· 0 −z1
⎢0 1 · · · 0 ⎥
−z2
⎢ ⎥
⎢ ⎥
Un = ⎢ ... ... ..
.
..
.
..
⎥.
. (B.6)
⎢ ⎥
⎣0 0 · · · 1 −zn−1 ⎦
0 0 ··· 0 1
Then Un is unimodular since its determinant is 1. Moreover, Un U A is of the form

⎡ ⎤
0
⎢ .. ⎥
⎢ .⎥
Un U A = ⎢An−1 ⎥. (B.7)
⎣ 0⎦
a n−1 dn
In other words, the last column of A has been reduced to zero above the diagonal by means of left
multiplication by an appropriate unimodular matrix. One can now repeat the procedure with the
n − 1 × n − 1 matrix An−1 , and eventually arrive at a lower triangular matrix. For clarity, the next
step of the algorithm is briefly outlined: Let dn−1 denote a GCD of the elements of the last column
of An−1 . As above, there exists a unimodular matrix Ūn−1 ∈ Rn−1×n−1 such that
⎡ ⎤
0
⎢ .. ⎥
⎢ . ⎥
Ūn−1 An−1 = ⎢An−2 ⎥. (B.8)
⎣ 0 ⎦
a n−2 dn−1
Now define the n × n unimodular matrix
Un−1 = Block Diag{Ūn−1 , 1} . (B.9)

Then,
⎡ ⎤
0 0
⎢ .. .. ⎥
⎢An−2 . .⎥
⎢ ⎥
Un−1 Un U A = ⎢ 0 0⎥ . (B.10)
⎢ ⎥
⎣ a n−2 dn−1 0 ⎦
a n−1 · dn
The rest of the proof is now obvious. 2
Corollary B.2.3 Every matrix A ∈ Rn×n , is a left associate of an upper triangular matrix C ∈ Rn×n .
Proof. In the proof of Theorem B.2.2, start with the first column of A instead of the last. 2
Corollary B.2.4 Every matrix A ∈ Rn×n is a right associate of a lower (resp. upper) triangular matrix.
Proof. In the proof of Theorem B.2.2, start with the last (resp. first) row of A. 2
For rectangular matrices, the following result holds:
Corollary B.2.5 Suppose A ∈ Rn×m . Then A is a left associate of a matrix of the form

D
if n > m, [D E] if n < m . (B.11)
0
where D can be chosen to be either lower or upper triangular.
It is left to the reader to state and prove the result analogous to Corollary B.2.5 concerning
right associates of rectangular matrices.
Next, we start developing the Smith form. Recall that a matrix A ∈ Rn×m is said to have rank
l if (i) there is an l × l submatrix of A with nonzero determinant, and (ii) every (l + 1) × (l + 1)
minor of A is zero. An elementary row operation on the matrix A consists of one of the following:
(i) interchanging two rows of A, or (ii) adding a multiple of one row to another. An elementary column
operation is similarly defined. It is easy to see that an elementary row (column) operation on a matrix
A can be accomplished by multiplying A on the left (right) by a unimodular matrix. Thus, a matrix
obtained from A by elementary row and column operations is equivalent to A.
Now suppose A ∈ Rn×m is a (left or right) multiple of B ∈ Rn×m , and let bl denote a GCD
of all l × l minors of B. Then it follows from the Binet-Cauchy formula (Fact B.1.5) that bl divides
all l × l minors of A. Thus, if al denotes a GCD of all l × l minors of A, we see that bl divides al .
From this, it follows that if A and B are equivalent matrices, then al and bl are associates, and A
and B have the same rank.
Theorem B.2.6 (Smith Form) Suppose A ∈ Rn×m has rank l. Then, A is equivalent to a matrix
H ∈R n×m
of the form ⎡ ⎤
h1 0 · · · 0 0
⎢ 0 h2 · · · 0 0⎥
⎢ ⎥
⎢ .. .. ⎥ ,
H = ⎢ ... ..
.
..
. . . ⎥ (B.12)
⎢ ⎥
⎣0 0 0 hl 0 ⎦
0 0
where hi divides hi+1 for i = 1, · · · , l − 1. Moreover, h1 · · · hi is a GCD of all i × i minors of A,
and the hi ’s are unique to within multiplication by a unit.
Remarks B.2.7 h1 , · · · , hn are called the invariant factors of A.
Proof. Since A has rank l, it contains an l × l submatrix with nonzero determinant. By elementary
row and column operations, this submatrix can be brought to the upper left-hand corner of A.
As in the proof of Theorem B.2.2, there is a unimodular matrix U ∈ Rn×n such that (U A)11
is a GCD of all elements in the first column of A. By elementary row operations, the first column
of U A can be made to contain all zeros except in the (1, 1)-position. Call the resulting matrix Ā. If
ā11 divides all elements of the first row of Ā, then all elements of the first row (except in the (1, 1)
position) of ĀV can be made to equal zero by a suitable choice of a unimodular matrix V of the
form ⎡ ⎤
1 −v12 −v13 · · · −v1m
⎢· 1 0 · 0 ⎥
⎢ ⎥
⎢· · 1 · 0 ⎥
V =⎢ ⎥. (B.13)
⎢ .. .. .. ⎥
⎣· · . . . ⎦
· · · · 1
Moreover, the first column or ĀV will continue to have zeros except in the (1, 1) position. On the
other hand, if ā11 does not divide all elements in the first row of Ā. we can choose a unimodular
matrix V so that (ĀV )11 is a GCD of the elements of the first row of ĀV . If this is done, the first
column of ĀV may no longer contain zeros. In such a case, we repeat the above row and column
operations. This process cannot continue indefinitely, because the original element ā11 has only a
finite number of prime factors (see Fact A.3.7) and each successive corner element is a proper (i.e.,
nonassociative) divisor of its predecessor. Thus, in a finite number of steps, we arrive at a matrix B
equivalent to A such that b11 divides all elements of the first row as well as first column of B. By
elementary row and column operations, all of the elements of the first row and column of B can be
made equal to zero. Thus,
⎡ ⎤
b11 0 · · · 0
⎢ 0 ⎥
⎢ ⎥
A ∼ B ∼ ⎢ . ⎥. (B.14)
⎣ .. B1 ⎦
0
By proceeding to the first row and column of B1 (i.e., the second row and column of B) and then
repeating the procedure, we will eventually have
⎡ ⎤
d1 · · · 0
⎢ .. .. .. ⎥
⎢ . 0⎥
A ∼ ⎢. . ⎥. (B.15)
⎣ 0 · · · dl ⎦
0 M
Now, di = 0 ∀i, since these are all divisors of the elements of the first l rows and columns of A; and
none of these rows nor columns is identically zero. Next, M must be zero; otherwise A is equivalent
to a matrix of rank at least l + 1, which contradicts the fact that the rank of A is l.
Thus, far we have shown that
⎡ ⎤
d1 0 · · · 0
⎢ 0 d2 · · · 0 0 ⎥
⎢ ⎥
⎢ ⎥
A ∼ ⎢ ... ..
.
..
.
..
. ⎥. (B.16)
⎢ ⎥
⎣ 0 0 · · · dl ⎦
0 0
Let D = Diag {d1 , · · · , dl }. We will show that D ∼ Diag {h1 , · · · , hl }, where hi divides hi+1 for
i = 1, · · · , l − 1. This is enough to prove the theorem. By adding columns 2 to l to the first column
of D, it follows that ⎡ ⎤
d1 0 · · · 0
⎢d2 d2 · · · 0 ⎥
⎢ ⎥
D ∼ ⎢. .. .. .. ⎥ . (B.17)
⎣. . . . .⎦
dl 0 · · · dl
By multiplying the latter matrix on the left by an appropriate unimodular matrix, we get another
matrix E whose (1, 1)-element is a GCD of {d1 , · · · , dl } and whose first column is zero otherwise.
Let h1 := a GCD of {d1 . · · · , dl }. Then,
⎡ ⎤
h1
⎢ 0 Ē ⎥
⎢ ⎥
E=⎢. ⎥, (B.18)
⎣ .. ⎦
0
where every element of Ē is in the ideal generated by d2 , · · · , dl , and is thus a multiple of h1 . Since
the first row of Ē is a multiple of h1 , it follows that
⎡ ⎤
h1 0 ··· 0
⎢0 ⎥
⎢ ⎥
E ∼ ⎢. ⎥, (B.19)
⎣ .. E1 ⎦
0
where every element of E1 is a multiple of h1 . Now, by the preceding paragraph, E1 ∼

Diag {g2 , · · · , gl } where h1 divides gi for all i. Now let h2 be a GCD of g2 , · · · , gl . Then clearly
h1 divides h2 . Moreover,

h2 0
E1 ∼ . (B.20)
0 E2
Repeating this procedure, we finally get D ∼ Diag {h1 , · · · , hl } where hi divides hi+1 for all i. This
completes the proof of (B.12).
The divisibility conditions on the hi imply that the product h1 · · · hi is a GCD of all i × i
minors of H . Since A and H are equivalent, product is also a GCD of all i × i minors of A.
Now suppose A is also equivalent to another matrix
⎡ ⎤
g1 0 ··· 0
⎢0 g2 ··· 0 0⎥
⎢ ⎥
⎢ ⎥
G = ⎢ ... ..
.
..
.
..
. ⎥, (B.21)
⎢ ⎥
⎣0 0 · · · gl ⎦
0 0
where gi divides gi+1 for all i. Then the reasoning of the preceding paragraph shows that the product
h1 · · · hi is an associate of the product g1 · · · gi for all i. Now, if h1 ∼ g1 and h1 h2 ∼ g1 g2 , then
h2 ∼ g2 . Reasoning inductively in this fashion, we conclude that hi ∼ gi for all i. 2
Corollary B.2.8 Two matrices A, B ∈ Rn×m are equivalent if and only if their invariant factors are
associates.
n×m
Let F be the field of fractions associated with R. We now begin a study of the set F of
matrices with elements from F and show the existence of a canonical form known as the Smith-
McMillan form.
Theorem B.2.9 (Smith-McMillan Form) Suppose F ∈ Fn×m has rank l. Then there exist uni-
modular matrices U ∈ Rn×n , V ∈ Rm×m such that
⎡ ⎤
a1 /b1 0 ··· 0 0
⎢ 0 a2 /b2 · · · 0 0⎥
⎢ ⎥
⎢ .. . . . .. ⎥ ,
UFV = ⎢ . .. .. .. .⎥ (B.22)
⎢ ⎥
⎣ 0 0 · · · al /bl ⎦
0 0
where ai , bi are coprime for all i; ai divides ai+1 and bi+1 divides bi for i = 1, · · · , l − 1; and b1 is
a least common multiple of the denominators of all elements of F , expressed in reduced form.3
Proof. Let y denote an l.c.m. of the denominators of all elements of F , expressed in reduced form.
Then yF ∈ Rn×m , and the rank of yF is also l. By Theorem B.2.6, there exist unimodular matrices
U ∈ Rn×n , V ∈ Rm×m such that
⎡ ⎤
h1 · · · 0
⎢ .. .. .. ⎥
⎢. . . 0⎥
UyF V = ⎢⎢ ⎥ (B.23)
.. ⎥ ,
⎣ 0 · · · hl . ⎦
0 0
where hi divides hi+1 for all i. Hence,

⎡ ⎤
h1 /y ··· 0
⎢ .. .. .. ⎥
⎢ . . . 0⎥
UFV = ⎢
⎢
⎥
.. ⎥ . (B.24)
⎣ 0 · · · hl /y .⎦
0 0
Let ai /bi be a reduced form for the fraction hi /y, for i = 1, · · · , l. Since hi | hi+1 , let hi+1 = hi ri
where ri ∈ R. Let [·] denote the reduced form of a matrix. Then
ai+1 hi+1 hi ri ai ri r
i
= = = = ai , (B.25)
bi+1 y y bi bi
where in the last step we used the fact that ai , bi are coprime. Now (B.25) implies that
ai | ai+1 , bi+1 | bi for all i.
3 See Fact A.3.6 for the definition of a reduced form.
Finally, to show that b1 ∼ y, it is enough to show that h1 and y are coprime, since a1 /b1
is a reduced form of h1 /y. This is most easily done using prime factorizations. Let pij /qij be a
reduced form of the element fij , for all i, j . Then y is an l.c.m. of all qij , and h1 is a GCD of all
yfij = ypij /qij for all i, j . Suppose t is a prime factor of y, of multiplicity of α. (By this we mean
that t is a prime, t α divides y but t α+1 does not.) Then, from Fact A.3.10, t must be a prime divisor
of multiplicity α, of some qij . The corresponding pij is not divisible by t, since pij , qij are coprime
(see Corollary A.3.9). As a result, the corresponding term yfij = ypij /qij is also not divisible by
t (since t is a factor of multiplicity α of both ypij and qij ). Thus, h1 , being a GCD of all yfij , is
also not divisible by t (see Fact A.3.8). Since this is true of every prime factor of y, it follows from
Corollary A.3.9 that y and h1 are coprime. 2
337
APPENDIX C
Topological Preliminaries
In this appendix, a few basic concepts from topology are introduced. For greater detail, the reader is
referred to [56] or [89].
C.1 TOPOLOGICAL SPACES

This section contains a brief introduction to topological spaces.
Definition C.1.1 Let S be a set. A collection T of subsets of S is a topology if1

(TOP1) Both S and Ø (the empty set) belong to T .
(TOP2) A finite intersection of sets in T again belongs to T .
(TOP3) An arbitrary union of sets in T again belongs to T .
The ordered pair (S, T ) is called a topological space, and subsets of S belonging to T are said to be
open. A subset of S is closed if its complement in S is open.
Examples C.1.2 Let S be any set, and let T1 consist of just the two sets S and Ø. Then (S, T1 ) is
a topological space. T1 is referred to as the trivial topology on S. Let T2 consist of all subsets of S.
Then (S, T2 ) is also a topological space. T2 is referred to as the discrete topology on S.
Suppose S is a set, and T1 , T2 are topologies on S. Then T1 is weaker than T2 (and T2 is
stronger than T1 ) if T1 is a subset of T2 , i.e., every set that is open in the topology T1 is also open in
the topology T2 . It is obvious that, for any set S, the trivial topology and the discrete topology are,
respectively, the weakest and strongest topologies that can be defined on S.
To give interesting and useful examples of topological spaces, the notion of base is introduced.
To motivate this notion, recall the familiar definition of an open subset of the real line: A subset U of
R is open if and only if, corresponding to every x ∈ U, there is a number δ > 0 such that the interval
(x − δ, x + δ) is also contained in U. The concept of a base is an abstraction of this idea, whereby
intervals of the form (x − δ, x + δ) are replaced by more general sets satisfying appropriate axioms.
Let S be a set, and let B be a collection of subsets of S that satisfies two axioms:
(B1) The sets in B cover S (i.e., the union of the sets in B is S).
(B2) Whenever B1 , B2 are sets in B with a nonempty intersection and x ∈ B1 ∩ B2 , there
exists a B(x) in B such that x ∈ B(x) ⊆ (B1 ∩ B2 ).
1Throughout this section, upper case italic letters denote collections of sets, while bold face letters denote sets.
338 C. TOPOLOGICAL PRELIMINARIES
Using this collection B, another collection T of subsets of S is defined as follows: A subset U
of S belongs to T if and only if, for every x ∈ U, there is a set B(x) such that x ∈ B(x) ⊆ U.
Fact C.1.3 A subset U of S belongs to the collection T if and only if it is a union of sets in B. T
is a topology on S. Moreover, T is the weakest topology on S containing all sets in B.
Remarks C.1.4 B is referred to as a base for the topology T , and T is the topology generated by the
base B.
Example C.1.5 To clarify Fact C.1.3, consider the set Rn , consisting of n-tuples of real numbers.
Let B denote the collection of “balls” B(x, ε) of the form
B(x, ε) = {y ∈ Rn : x − y < ε} (C.1)
as x varies over Rn and ε varies over the positive numbers. Here · denotes the usual Euclidean (or
any other) norm on Rn . It is a straight-forward matter to verify that this collection of balls satisfies
axioms (B1) and (B2). Hence, this collection forms a base for a topology on Rn , in which a set
U ⊆ Rn is open if and only if, for every x ∈ U, there is a ball B(x, ε) ⊆ U. This coincides with the
“usual” definition of open sets on Rn . The one extra bit of information that comes out of Fact C.1.3
is that a set is open if and only if it is a union of balls.
Proof of Fact C.1.3. To prove the first sentence, suppose first that U is a union of sets in B; it is shown

that U belongs to the collection T . Specifically, suppose U = Bi , where I is an index set and Bi
i∈I
is in the collection B for all i ∈ I . Let x be an arbitrary element of U. Then x ∈ Bi for some i ∈ I ,
and for this i we have x ∈ Bi ⊆ U. Hence, by the definition of T , U is in T . Conversely, suppose U
is in T . Then for every x ∈ U there is a B(x) in B such that x ∈ B(x) ⊆ U. It is now claimed that

U= B(x), which would show that U is a union of sets from B. To prove the claim (and thereby
x∈U
complete the proof of the first sentence), note that U ⊇ B(x) since each B(x) ⊆ U; conversely,
x∈U
since every x ∈ U also belongs to the corresponding B(x), it follows that U ⊆ B(x). Hence, the
x∈U
two sets are equal.
The second sentence is proved by verifying that T satisfies the three axioms (TOP1)–(TOP3).
First, Ø belongs to T since Ø vacuously satisfies the defining condition for a set to belong to T .
Next, since S is the union of all sets in B, it is also in T . Hence, T satisfies (TOP1).
To establish (TOP2), it is enough to show that a nonempty intersection of two sets in T again
belongs to T ; it will then follow by induction that every finite intersection of sets in T again belongs to
T . Accordingly, suppose U, V are in T and let x be any element of U ∩ V; we will show the existence
of a B(x) in B such that x ∈ B(x) ⊆ (U ∩ V), which in turn will establish that the intersection
C.1. TOPOLOGICAL SPACES 339
U ∩ V is open. Since x ∈ U and x ∈ V, there exist B1 (x), B2 (x) in B such that x ∈ B1 (x) ⊆ U, x ∈
B2 (x) ⊆ V. Now, by axiom (B2), there is a B(x) in B such that x ∈ B(x) ⊆ (B1 (x) ∩ B2 (x)), which
in turn is contained in U ∩ V. Thus, T satisfies (TOP2).
To establish (TOP3), suppose {Uα } is a family of sets in T . Then each Uα is a union of sets

belonging to B, whence their union is also a union of sets belonging to B. Thus, Uα is also in T .
α
To prove the last sentence, let T1 be another topology on S such that every set in B is contained
in T1 . Since T1 is a topology, arbitrary unions of sets in B again belong to T1 . Since every set in T
can be expressed as a union of sets in B, it follows that every set in T is in T1 , i.e., T is weaker than
T1 . 2
A topological space (S, T ) is first-countable if T has the following property: For every x ∈ S,
there is a countable collection of open sets {Bi (x), i ∈ Z+ }, each containing x, such that every open
set containing x also contains some Bi (x). In view of Fact C.1.3, the collection Bi (x), x ∈ S is a
base for the topology T . Since the set S may be uncountable, the base for the topology may also
be uncountable. However, in a first-countable topology, the collection of open sets containing each
particular point has a countable base.
For example, consider the set Rn together with the topology of Example C.1.5. For a fixed
x ∈ Rn , the collection of balls B(x, 1/m), m an integer ≥ 1 is countable; moreover, every open
set containing x also contains at least one of the balls B(x, 1/m). Hence, the topological space of
Example C.1.5 is first-countable.
A very general class of first-countable topological spaces is that of metric spaces, which are
defined next.
Definition C.1.6 A metric space (S, ρ) is a set S, together with a function ρ : S → R satisfying
the following axioms:
(M1) ρ(y, x) = ρ(x, y) ∀ x, y ∈ S.
(M2) ρ(x, y) ≥ 0 ∀x, y ∈ S; ρ(x, y) = 0 ⇐⇒ x = y.
(M3) ρ(x, z) ≤ ρ(x, y) + ρ(y, z) ∀ x, y, z ∈ S.
If (S, ρ) is a metric space, then there is a natural topology that can be defined on S. Let
B(x, ε) denote the ball
B(x, ε) = {y ∈ S : ρ(x, y) < ε} . (C.2)
Then the collection of sets B(x, ε) as x varies over S and ε varies over all positive numbers, is a base
for a topology T on S. T is referred to as the topology on S induced by the metric ρ. In this topology,
a set U in S is open if and only if, for every x ∈ U, there is a ball B(x, ε) ⊆ U.
Every metric space is first-countable as a topological space: For a fixed x, consider the countable
collection of sets B(x, 1/m), m an integer ≥ 1. Then every open set containing x also contains at
least one of the sets B(x, 1/m).
The question of convergence of sequences in topological spaces is examined next. Suppose

(S, T ) is a topological space. A set N is said to be a neighborhood of x ∈ S if x ∈ N, and N contains
a set in T that contains x. In other words, a set N is a neighborhood of x if it contains an open
set containing x. Note that a neighborhood itself need not be open. A sequence {xi } in S is said
to converge to x ∈ S if every neighborhood of x contains all but a finite number of terms of the
sequence {xi }.
Example C.1.7 Consider the set Rn together with the topology defined in Example C.1.5. Then
a sequence {xi } converges to x if and only if, for every ε > 0, there is a number N such that
xi ∈ B(x, ε) ∀ i ≥ N. This is the familiar notion of convergence in Rn .
Example C.1.8 Consider a set S together with the discrete topology of Example C.1.2. Let {xi }
be a sequence in S converging to x. Since the singleton set {x} is also a neighborhood of x, it follows
that xi = x for all except a finite number of values of i.
Example C.1.9 Consider a set S together with the trivial topology of Example C.1.2. Then S is
the only neighborhood of any x ∈ S. As a result, every sequence {xi } converges to every x ∈ S.
The preceding two examples show that the meaning of convergence is very much dependent
on the particular topology defined on a set. Moreover, in general, a sequence does not necessarily
have a unique limit. To address the latter problem, we define Hausdorff topologies. A topology T
on a set S is Hausdorff if distinct points have disjoint neighborhoods. It is easy to verify that if T is a
Hausdorff topology on a set S, then the limit of a sequence in S is unique if it exists. Also, if (S, ρ)
is a metric space, then the topology induced by the metric ρ is Hausdorff.
Let S be a set, and let T1 , T2 be two topologies on S, with T1 weaker than T2 . Then, whenever
a sequence {xi } in S converges to x in the topology T2 , it also converges to x in the topology
T1 . However, it is possible that a sequence may converge in T1 but not in T2 . This shows that
convergence in T1 is in general a weaker requirement than convergence in T2 (and helps to explain
the terminology).
Recall that a set in S is closed if its complement is open. One of the desirable features of first-
countable topologies is that one can give an alternate, and very useful, characterization of closed sets.
Suppose U is a set in S. An element x ∈ S is said to be a limit point (or a cluster point, accumulation
point) of U if every neighborhood of x contains an element of U other than x. It can be shown that
a set is closed if and only if it contains all of its limit points. Now suppose (S, T ) is a first-countable
topological space. Then, for each x ∈ S, there is a countable collection of open sets {Bi (x), i ∈ Z}
such that every neighborhood of x contains at least one of the Bi (x). As a result, x is a limit point
of a set U if and only if there exists a sequence {xi } in U converging to x such that xi = x ∀i. Based
on this, one can show that U is closed if and only if every convergent sequence {xi } in U has a limit
in U (see Problem C.1.5).
C.1. TOPOLOGICAL SPACES 341
Now we come to the notion of continuity. Suppose (S1 , T1 ), (S2 , T2 ) are topological spaces,
and f is a function mapping S1 into S2 . Given any subset U of S2 , its preimage under f is the subset
of S1 denoted by f −1 (U) and defined by
f −1 (U) = {x ∈ S1 : f (x) ∈ U} . (C.3)
The function f is continuous at x ∈ S1 if, whenever U is an open subset of S2 containing f (x),

its preimage f −1 (U) is an open subset of S1 (containing x). f is continuous if it is continuous at
all x ∈ S1 . Clearly, whether or not f is continuous is very much dependent on the topologies T1
and T2 . For example, if T1 is the discrete topology, then every function f : S1 → S2 is continuous.
Also, one can easily verify the following: If f is continuous at x ∈ S1 and N is a neighborhood of
f (x), then f −1 (N) is a neighborhood of x (see Problem C.1.6). The converse is also true, but more
difficult to prove: If f −1 (N) is a neighborhood of x whenever N is a neighborhood of f (x), then f
is continuous at x. The proof can be found in [56].
Fact C.1.10 Suppose (S1 , T1 ), (S2 , T2 ) are topological spaces, and f : S1 → S2 is continuous at
x ∈ S1 . Then, whenever {xi } is a sequence in S1 converging to x, the sequence {f (xi )} converges to
f (x).
Proof. Let N be any neighborhood of f (x). Since f is continuous at x, f −1 (N) is a neighborhood

of x. Since {xi } converges to x, f −1 (N) contains all but a finite number of terms of the sequence
{xi }. This is the same as saying that f (xi ) ∈ N for all but a finite number of values of i. Since this
is true for every neighborhood of f (x), we conclude that {f (xi )} converges to f (x). 2
If the topologies T1 , T2 are first-countable, then the converse of Fact C.1.10 is also true: If
{f (xi )} converges to f (x) for all sequences {xi } converging to x, then f is continuous at x.
PROBLEMS
C.1.1. Let (S, ρ) be a metric space, and let B(x, ε) be defined by (C.2). Show that the collection
of sets B(x, ε) is a base for a topology on S by verifying that axiom (B1) is satisfied.
C.1.2. Let S be any set, and define ρ : S × S → R by ρ(x, y) = 0 if x = y, 1 if x = y.

(i) Verify that (S, ρ) is a metric space.
(ii) Show that the topology on S induced by ρ is the discrete topology.
C.1.3. Let (S, T ) be a first countable topological space, and let U be a subset of S. Show that U
is closed if and only if the following is true: xi ∈ U, {xi } converges to x implies that x ∈ U
(Hint: either x = xi for some i or else x is a limit point of U).
C.1.4. Suppose (S, ρ) is a metric space, and define
ρ(x, y)
d(x, y) = .
1 + ρ(x, y)
(i) Show that (S, d) is also a metric space.
(ii) Show that ρ and d induce the same topology on S.
C.1.5. Show that, if (S, T ) is a topological space and T is a Hausdorff topology, then a sequence
in S can converge to at most one point.
C.1.6. Suppose (S1 , T1 ), (S2 , T2 ) are topological spaces, and f : S1 → S2 is continuous at x ∈ S1 .
Show that, if N is a neighborhood of f (x), then f −1 (N) is a neighborhood of x.
C.2 TOPOLOGICAL RINGS AND NORMED ALGEBRAS

This section contains a brief introduction to the subject of topological rings. Roughly speaking, a
topological ring is a ring, together with a topology, such that subtraction and multiplication are
continuous operations with respect to the topology. Before talking about topological rings as such,
it is necessary to define the product topology on the cartesian product of topological spaces.
Recall that the cartesian product S1 × S2 of two sets S1 and S2 consists of all ordered pairs
(x, y) where x ∈ S1 and y ∈ S2 . Now suppose T1 , T2 are topologies on S1 , S2 , respectively. The
objective is to define a topology on the product set S = S1 × S2 . This is done in the following way:
Let B denote the collection of subsets of S of the form U × V where U ∈ T1 , V ∈ T2 . In other words,
B is the collection of subsets of S formed by taking cartesian products of an open subset of S1 and
an open subset of S2 . It can be shown that B is a base for a topology on S (see Problem C.2.1),
which is referred to as the product topology on S.
Now suppose Si , i = 1, 2, 3 are topological spaces, with the topologies Ti , i = 1, 2, 3. Let
S = S1 × S2 , and let T denote the product topology on S. Suppose f is a function mapping S into
S3 . By definition, f is continuous if f −1 (W) is an open subset of S whenever W is an open subset of
S3 . Now suppose f is continuous, that (x, y) ∈ S and let f (x, y) =: z ∈ S3 . Since f is continuous,
it follows that, whenever N is a neighborhood of z, f −1 (N) is a neighborhood of (x, y). Recalling
the definition of the product topology on the set S, we see that, whenever N is a neighborhood of
z, there exist neighborhoods N1 of x and N2 of y such that f (x1 , y1 ) ∈ N ∀ x1 ∈ N1 , ∀ y1 ∈ N2 .
In particular, f (x, y1 ) ∈ N∀ y1 ∈ N2 , and f (x1 , y) ∈ N ∀ x1 ∈ N1 . The conclusion is that if f is a
continuous function, then for each x ∈ S1 the function f (x, ·) : S2 → S3 is continuous; similarly,
for each y ∈ S2 , the function f (·, y) : S1 → S3 is continuous.
Now we come to topological rings.
Definition C.2.1 Suppose R is a ring and T is a topology on R. The pair (R, T ) is a topological
ring if the functions (x, y) → x − y and (x, y) → xy are continuous functions from R × R into
R, when R × R is given the product topology.
C.2. TOPOLOGICAL RINGS AND NORMED ALGEBRAS 343
Several facts are immediate consequences of the above definition. First, if the function
(x, y) → x − y is continuous, then for each fixed x , the function y → x − y is a continuous function
from R into R. In particular, taking x = 0, it follows that the function y → −y is continuous. Next,
since compositions of continuous functions are again continuous, the function (x, y) → x + y =
x − (−y) is also continuous. Similarly, the function (x, y) → y − x = −(x − y) is also continuous.
Given a subset U ⊆ R and an element x ∈ R, let x + U denote the set defined by
x + U = {x + y : y ∈ U} . (C.1)
One can think of x + U as the set U “translated” by x. Now suppose U is an open subset of R, and
define the function fx : R → R by fx (y) = y − x. As seen earlier, fx is continuous. Hence, the
set fx−1 (U) is open. But clearly this set equals x + U. Thus, we have shown that if U is open, so is
x + U for all x ∈ R. In other words, the topology on a topological ring is “translation-invariant,” in
the sense that translates of open sets are again open. A consequence of this is that, once we know all
the open sets containing 0, we know all open sets.
We now speak briefly about normed algebras. Suppose A is a linear vector space over a field
F. This means that there is a concept of addition between members of A, and scalar multiplication
between an element of F and an element of A. Now A is an algebra (over F) if, in addition, one
can define a product of two elements of A in such a way that A becomes a ring. Thus, an algebra A
has associated with it a field F, and three operations: addition between elements of A, multiplication
between two elements of A, and multiplication between an element of F and an element of A. A is a
ring with respect to addition and multiplication, and A is a linear vector space with respect to addition
and scalar multiplication. For example, if F is any field, the set of matrices Fn×n is an algebra over F.
Now suppose A is an algebra over R (the real numbers) or C (the complex numbers). Then
(A, · ) is a normed algebra if one can define a norm · on A such that
(NA1) a ≥ 0 ∀ a ∈ A, a = 0 ⇐⇒ a = 0,
(NA2) αa = |α| · a ∀ α ∈ R( or C), ∀ a ∈ A,
(NA3) a + b ≤ a + b ∀ a, b ∈ A,
(NA4) ab ≤ a · b , ∀ a, b ∈ A,
Axioms (NA1)–(NA3) are just the usual requirements for (A, · ) to be a normed space,
and (NA4) is the additional requirement for it to be a normed algebra.
Let (A, · ) be a normed algebra, and let B denote the collection of balls
B(x, ε) = {y ∈ A : x − y < ε} . (C.2)
Then, B is a base for a topology T on A. This is the same as the topology induced by the metric
ρ(x, y) = x − y.
It is now shown that (A, T ) is a topological ring. In order to do this, it is necessary to show that
the functions (x, y) → x − y and (x, y) → xy are both continuous. Let us begin with subtraction.
Suppose x̄, ȳ ∈ A, z̄ = x̄ − ȳ, and let S ⊆ A be an open set containing z̄. We must show that the
set {(x, y) : x − y ∈ S} =: V is an open subset of A × A. This is done by showing that, whenever
(x, y) ∈ V, there are balls B(x, δ), B(y, δ) such that B := B(x, δ) × B(y, δ) is contained in V. Since
S is open and x − y ∈ S, by definition there is an ε > 0 such that B(x − y, ε) ⊆ S. Now let δ = ε/2,
and, suppose x1 ∈ B(x̄, δ), y1 ∈ B(ȳ, δ). Then
(x1 − y1 ) − (x − y) ≤ x1 − x + y1 − y < 2δ < ε . (C.3)
Hence, (x1 , y1 ) ∈ V. This shows that subtraction is continuous. Next, to show that multiplication
is continuous, suppose x̄, ȳ ∈ A, let z̄ = x̄ ȳ, and let S be an open set in A containing z̄. Define
V = {(x, y) : xy ∈ S}. We show that V is open by showing that, whenever (x, y) ∈ V, there exist
balls B(x, δ), B(y, δ) such that B(x, δ) × B(y, δ) is contained in V. Since V is open, by definition
there is an ε > 0 such that B(xy, ε) ⊆ S. Suppose x1 ∈ B(x, δ), y1 ∈ B(y, δ) where δ is yet to be
specified. Then
x1 y1 − xy = [(x1 − x) + x] · [(y1 − y) + y] − xy
= (x1 − x)(y1 − y) + (x1 − x)y + x(y1 − y) . (C.4)
x1 y1 − xy ≤ x1 − x · y1 − y + x1 − x · y
+ x · y1 − y
≤ δ + δ(x + y) .
2
(C.5)
Hence (x1 , y1 ) ∈ V if δ is chosen such that
δ 2 + δ(x + y) ≤ ε . (C.6)
Such a δ can always be found since the left side of (C.6) is continuous in δ and equals zero when δ
is zero. This completes the proof that multiplication is continuous.
Fact C.2.2 Suppose (A, · ) is a normed algebra, and let U denote the set of units of A (i.e., the
set of elements of A that have a multiplicative inverse in A). Then the function f : u → u−1 maps
U into itself continuously.
Remark C.2.3 By the continuity of the “inversion” function f , we mean the following: Given any
u ∈ U and any ε > 0, there exists a δ > 0 such that
u−1 − v −1 < ε whenever v ∈ U and u − v < δ . (C.7)

C.2. TOPOLOGICAL RINGS AND NORMED ALGEBRAS 345
Proof. Let u ∈ U and let ε > 0 be specified. Suppose v ∈ U and v ∈ B(u, δ), where δ is not as yet
specified. Then
u−1 − v −1 = u−1 (v − u)v −1 =
≤ u−1 · v − u · v −1
≤ δu−1 · v −1
≤ δu−1 · (u−1 + u−1 − v −1 ) . (C.8)
Solving for u−1 − v −1 from (C.8) gives
δu−1 2
u−1 − v −1 ≤ . (C.9)
1 − δu−1
Thus, (C.7) is satisfied if δ is chosen such that the right side of (C.9) is less than ε. But such a δ
can always be found, since the right side of (C.9) is continuous in δ and equals zero when δ is zero.
This shows that the function f is continuous. 2
PROBLEMS
C.2.1. Suppose (S1 , T1 ), (S2 , T2 ) are topological spaces.
(i) Let T denote the collection of subsets of S1 × S2 of the form U × V where U ∈ T1 , V ∈
T2 . Show that T is a base for a topology on the set S1 × S2 (Hint: Show that T satisfies the
axiom (B1) of Section B.1).
(ii) Suppose B1 , B2 are respectively bases for the topologies T1 , T2 . Let B denote the col-
lection of subsets of S1 × S2 of the form B1 × B2 where B1 ∈ B1 , B2 ∈ B2 . Show that B
is also a base for a topology on S1 × S2 .
(iii) Show that B and T generate the same topology on S1 × S2 (Hint: Show that every set
in T is a union of sets in B).
C.2.2. Consider the ring Rn×n of n × n matrices with real elements. Let · be any norm on the
vector space Rn , and define the norm of a matrix by
Ax
A = supx=0
x
Show that (Rn×n , · ) is a normed algebra.
C.2.3. Consider the ring R[s] of polynomials in the indeterminate s with real coefficients. Define

the norm of a(s) = ai s i by
i
a(·) = |ai | .
i
Show that (R[s], · ) is a normed algebra.
347
Bibliography
[1] V. Anantharam and C. A. Desoer, “On the stabilization nonlinear systems,” IEEE Trans. on
Auto. Control, AC-29, pp. 569–572, June 1984. DOI: 10.1109/TAC.1984.1103584 Cited on
page(s) 306
[2] B. D. O. Anderson, “A note on the Youla-Bongiorno-Lu condition,” Automatica, 12, pp. 387–
388, July 1976. DOI: 10.1016/0005-1098(76)90060-1 Cited on page(s)
[3] B. D. O. Anderson and J. B. Moore, Optimal Filtering, Prentice-Hall, Englewood Cliffs, New
Jersey, 1979. Cited on page(s) 194, 229
[4] M. F. Atiyah and I. G. MacDonald, Introduction to Commutative Algebra, Addison-Wesley,

Reading, MA., 1969. Cited on page(s) 283, 293
[5] J. A. Ball and J. W. Helton, “A Beurling-Lax theorem for the Lie group U (m, n) which contains
most classical interpolation theory,” J. Operator Theory, 9, pp. l07–142, 1983. Cited on page(s)
[6] C. I. Byrnes, M. W. Spong and T. J. Tarn, “A several complex variables approach to feedback
stabilization of neutral delay-differential systems,” Math. Sys. Thy., 17, pp. 97–134, May 1984.
DOI: 10.1007/BF01744436 Cited on page(s) 292, 306
[7] F. M. Callier and C. A. Desoer,“Open-loop unstable convolution feedback systems with dynam-
ical feedback,” Automatica, 13, pp. 507–518, Dec. 1976. DOI: 10.1016/0005-1098(76)90010-8
Cited on page(s)
[8] F. M. Callier and C. A. Desoer, “An algebra of transfer functions of distributed linear time-
invariant systems,” IEEE Trans. Circ. and Sys., CAS-25, pp. 651–662, Sept. 1978.
DOI: 10.1109/TCS.1978.1084544 Cited on page(s) 287, 301, 305
[9] F. M. Callier and C. A. Desoer, “Simplifications and clarifications on the paper ’An algebra
of transfer functions of distributed linear time-invariant systems,’ ” IEEE Trans. Circ. and Sys.,
CAS-27, pp. 320–323, Apr. 1980. DOI: 10.1109/TCS.1980.1084802 Cited on page(s)
[10] F. M. Callier and C. A. Desoer, “Stabilization, tracking and disturbance rejection in multi-
variable convolution systems,” Annales de la Societé Scientifique de Bruxelles, 94, pp. 7–51, 1980.
Cited on page(s)
[11] F. M. Callier and C. A. Desoer, Multivariable Feedback Systems, Springer-Verlag, New York,
1982. Cited on page(s)
348 BIBLIOGRAPHY
[12] B. C. Chang and J. B. Pearson,“Optimal Disturbance reduction in linear multivariable systems,”
IEEE Trans. on Auto. Control, AC-29, pp. 880–888, Oct. 1984.
DOI: 10.1109/TAC.1984.1103409 Cited on page(s) 205
[13] M. J. Chen and C. A. Desoer, “Necessary and sufficient conditions for robust stability of linear
distributed feedback systems,” Int. J. Control, 35, pp. 255–267, 1982.
DOI: 10.1080/00207178208922617 Cited on page(s) 277
[14] M. J. Chen and C. A. Desoer, “Algebraic theory of robust stability of interconnected systems,”
IEEE Trans. on Auto. Control, AC-29, pp. 511–519, June 1984.
[15] J. H. Chow and P. V. Kokotovic, “Eigenvalue placement in two-time-scale systems,” Proc. IFAC
Symp. on Large Scale Sys., Udine, Italy, pp. 321–326, June 1976. Cited on page(s) 224
[16] C.-C. Chu and J. C. Doyle, “On inner-outer and spectral factorizations,” Proc. IEEE Conf. on
Decision and Control, Las Vegas, Dec. 1984. Cited on page(s) 205
[17] H. T. Colebrooke, Algebra with Arithmetic and Mensuration, John Murray, London, 1817,
reprinted by Dr. Martin Sandig OHg, Wiesbaden, W. Germany, 1973. Cited on page(s) xvii
[18] P. R. Delsarte, Y. Genin and Y. Kamp, “The Nevanlinna-Pick problem for matrix-valued
functions,” SIAM J. Appl. Math, 36, pp. 47–61, Feb. 1979. DOI: 10.1137/0136005 Cited on
page(s) 205
[19] C. A. Desoer and W. S. Chan, “The feedback interconnection of linear time-invariant systems,”
J. Franklin Inst., 300, pp. 335–351, 1975. DOI: 10.1016/0016-0032(75)90161-1 Cited on
page(s)
[20] C. A. Desoer and M. J. Chen, “Design of multivariable feedback systems with stable plant,”
IEEE Trans. on Auto. Control, AC-26, pp. 408–415, April 1981.
DOI: 10.1109/TAC.1981.1102594 Cited on page(s)
[21] C. A. Desoer and C. L. Gustafson, “Design of multivariable feedback systems with simple
unstable plant,” IEEE Trans. on Auto. Control, AC-29, pp. 901–908, Oct. 1984.
[22] C. A. Desoer and C. L. Gustafson, “Algebraic theory of linear multivariable feedback systems,’
IEEE Trans. on Auto. Control, AC-29, pp. 909–917, Oct. 1984.
[23] C. A. Desoer and C. A. Lin,“Two-step compensation of nonlinear systems,” Systems and Control
Letters, 3, pp. 41–46, June 1983. DOI: 10.1016/0167-6911(83)90036-1 Cited on page(s) 306
BIBLIOGRAPHY 349
[24] C. A. Desoer and R. W. Liu, “Global parametrization of feedback systems with nonlinear
plants,” Systems and Control Letters, 1, pp. 249–251, Jan. 1982.
DOI: 10.1016/S0167-6911(82)80006-6 Cited on page(s) 306
[25] C. A. Desoer, R. W. Liu, J. Murray and R. Saeks, “Feedback system design: The fractional
representation approach to analysis and synthesis,” IEEE Trans. Auto. Control, AC-25, pp. 399–
412, June 1980. DOI: 10.1109/TAC.1980.1102374 Cited on page(s)
[26] C. A. Desoer and M. Vidyasagar, Feedback Systems: Input-Output Properties, Academic Press,
New York, 1975. Cited on page(s) 154, 210, 239, 284, 287
[27] J. C. Doyle, “Robustness of multiloop linear feedback systems,” Proc. 17th Conf. Decision and
Control, Ft. Lauderdale, pp. 12–18, 1979. Cited on page(s)
[28] J. C. Doyle, K. Glover, P. P. Khargonekar and B. A. Francis, “State space solutions to standard
H2 and H∞ control problems,” IEEE Transactions on Automatic Control, 34, 831-847, 1989.
Cited on page(s) xiii
[29] J. C. Doyle, B. A. Francis and A. Tannenbaum, Feedback Control Theory, MacMillan, New York,
1991 Cited on page(s) xiii
[30] J. Doyle, “Analysis of feedback systems with structured uncertainties,” IEE Proc., Part D, 129,
pp. 242–250, Nov. 1982. Cited on page(s)
[31] J. C. Doyle, “Synthesis of robust controllers and filters,” Proc. IEEE Conference on Decision and
Control, pp. 109–114, 1983. DOI: 10.1109/CDC.1983.269806 Cited on page(s) 205
[32] J. C. Doyle and G. Stein, “Multivariable feedback design: Concepts for a classical/modern
synthesis,” IEEE Trans. Auto. Control, AC-26, pp. 4–16, Feb. 1981.
[33] J. C. Doyle, J. E. Wall and G. Stein, “Performance and robustness resuits for structured uncer-
tainty,” Proc. IEEE Conf. on Decision and Control, pp. 628–636, 1982. Cited on page(s)
[34] P. L. Duren, The Theory of H p -spaces, Academic Press, New York, 1970. Cited on page(s) 156,
205, 286
[35] B. A. Francis, “The multivariable servomechanism problem from the input-output viewpoint,”
IEEETrans. Auto. Control, AC-22, pp. 322–328, June 1977. DOI: 10.1109/TAC.1977.1101501
Cited on page(s)
[36] B. A. Francis,“On the Wiener-Hopf approach to optimal feedback design,” Systems and Control
Letters, 2, pp. 197–201, Dec. 1982. DOI: 10.1016/0167-6911(82)90001-9 Cited on page(s)
205
350 BIBLIOGRAPHY
[37] B. A. Francis, A Course in H∞ Control Theory, Lecture Notes in Control and Information
Sciences, Volume 88, Springer-Verlag, Heidelberg, 1987. Cited on page(s) xiii
[38] B. A. Francis, J. W. Helton and G. Zames, “H ∞ -optimal feedback controllers for linear mul-
tivariable systems,” IEEE Trans. on Auto. Control, AC-29, pp. 888–900, Oct. 1984.
[39] B. A. Francis and M. Vidyasagar, “Algebraic and topological aspects of the regulator problem
for lumped linear systems,” Automatica, 19, pp. 87–90, Jan. 1983.
DOI: 10.1016/0005-1098(83)90078-X Cited on page(s) 277
[40] B. A. Francis and G. Zames, “On optimal sensitivity theory for SISO feedback systems,” IEEE
Trans. Auto. Control, AC-29, pp. 9–16, Jan. 1984. DOI: 10.1109/TAC.1984.1103357 Cited on
page(s) xiii, 205
[41] F. R. Gantmacher, Theory of Matrices, Chelsea, New York. Cited on page(s) 326
[42] I. M. Gel’fand, D. Raikov and G. Shilov, Commutative normed rings, Chelsea, New York, 1964.
Cited on page(s) 279, 281
[43] B. K. Ghosh and C. I. Byrnes, “Simultaneous stabilization and simultaneous pole-placement

by non-switching dynamic compensation,” IEEE Trans. Auto. Control, AC-28, pp. 735–741,
June 1983. DOI: 10.1109/TAC.1983.1103299 Cited on page(s) 277
[44] K. Glover, “All optimal Hankel-norm approximations of linear multivariable systems and their
L∞ -error bounds,” Int. J. Control, 39, pp. 1115–1193, June 1984.
DOI: 10.1080/00207178408933239 Cited on page(s) 205
[45] M. Green and D. J. N. Limebeer, Linear Robust Control, Prentice-Hall, Englewood Cliffs, New
Jersey, 1995. Cited on page(s) xiii
[46] V. Guillemin and A. Pollack, Differential Topology, Prentice-Hall, Englewood Cliffs, NJ., 1979.
Cited on page(s) 293
[47] C. L. Gustafson and C. A. Desoer, “Controller design for linear multivariable feedback systems
with stable plant,” Int. J. Control, 37, pp. 881–907, 1983. DOI: 10.1080/00207178308933018
Cited on page(s)
[48] E. Hille and R. S. Phillips, Functional Analysis and Semigroups, Amer. Math. Soc., Providence,
RI, 1957. Cited on page(s) 287
[49] K. Hoffman, Banach Spaces of Analytic Functions, Prentice-Hall, Englewood Cliffs, NJ., 1962.
Cited on page(s)
BIBLIOGRAPHY 351
[50] N.T. Hung and B. D. O. Anderson,“Triangularization technique for the design of multivariable
control systems,” IEEE Trans. Auto. Control, AC-24, pp. 455–460, June 1979.
[51] C. A. Jacobson, “Some aspects of the structure and stability of a class of linear distributed
systems,” Robotics and Automation Lab. Rept. No. 31, Renn. Poly. Inst., Dept. of ECSE, May
1984. Cited on page(s) 306
[52] N. Jacobson, Lectures in Abstract Algebra, Vol. I, Van-Nostrand, New York. 1953. Cited on
page(s) 307
[53] T. Kailath, Linear Systems, Prentice-Hall, Englewood Cliffs, NJ., 1980. Cited on page(s)
[54] E. W. Kamen, P. P. Khargonekar and A. Tannenoaum, “A local theory of linear systems with
noncommensurate time delays,” submitted for publication. Cited on page(s) 292, 306
[55] E. W. Kamen, P. P. Khargonekar and A. Tannenbaum, “Stabilization of time-delay systems

using finite-dimensional compensators,” IEEE Trans. on Auto. Control, AC-30, pp. 75–78, Jan.
1985. DOI: 10.1109/TAC.1985.1103789 Cited on page(s) 306
[56] J. L. Kelley, General Topology, Van-Nostrand, New York, 1955. Cited on page(s) 337, 341
[57] P. P. Khargonekar and A. Tannenbaum, “Noneuclidean metrics and the robust stabilization of
systems with parameter uncertainty,” IEEE Trans. Auto. Control, AC-30, pp. 1005-1013, Oct.
[58] H. Kimura, “Robust Stabilizability for a class of transfer functions,” IEEE Trans. Auto. Control,
AC-29, pp. 788–793, Sept. 1984. DOI: 10.1109/TAC.1984.1103663 Cited on page(s) 277
[59] P. Koosis, The Theory of H p spaces, Cambridge University Press, Cambridge, 1980. Cited on
page(s) 156, 205
[60] V. Kučera, Discrete Linear Control: The Polynomial Equation Approach, Wiley, New York, 1979.
Cited on page(s) xvi, 205
[61] T. Y. Lam, Serre’s Conjecture, Lecture notes in Mathematics, No. 635, Springer-Verlag, Berlin,
[62] V. Ya. Lin, “Holomorphic fiberings and multivalued functions of elements of a Banach algebra,”
(English translation) Funct. Anal. Appl., 37, pp. 122–128, 1973. DOI: 10.1007/BF01078884
[63] H. Lindel and W. Lutkebohmert, “Projektive modulen fiber polynomialen erweiterungen von
potenzreihenalgebren,” Archiv der Math, 28, pp. 51–54, 1977. DOI: 10.1007/BF01223888
352 BIBLIOGRAPHY
[64] R. W. Liu and C. H. Sung, “Linear feedback system design,” Circ. Sys. and Sig. Proc., 2, pp. 35–
44, 1983. DOI: 10.1007/BF01598142 Cited on page(s) 306
[65] C. C. MacDuffee, Theory of Matrices, Chelsea, New York, 1946. Cited on page(s) 319, 323
[66] H. Maeda and M. Vidyasagar,“Some results on simultaneous stabilization,” Systems and Control
Letters, 5, pp. 205-208, Sept. 1984. DOI: 10.1016/S0167-6911(84)80104-8 Cited on page(s)
[67] H. Maeda and M. Vidyasagar, “Infinite gain margin problem in multivariable feedback sys-
tems,” Automatica, 22, pp. 131-133, Jan. 1986. DOI: 10.1016/0005-1098(86)90115-9 Cited
on page(s) 277
[68] A. S. Morse, “System invariants under feedback and cascade control,” Proc. Int. Conf. on Math.
Sys. Thy., Udine, ltaly, 1976. Cited on page(s)
[69] B. Sz-Nagy and C. Foias, Harmonic Analysis of Operators on Hilbert Space, Elsevier, New York,
[70] C. N. Nett, “The fractional representation approach to robust linear feedback design: A self-
contained exposition,” Robotics and Automation Lab. Rept. No. 30, Renn. Poly. Inst., Dept.
of ECSE, May 1984. Cited on page(s) 306
[71] C. N. Nett, C. A. Jacobson and M. J. Balas, “Fractional representation theory: Robustness with
applications to finite-dimensional control of a class of linear distributed systems,” Proc. IEEE
Conf. on Decision and Control, pp. 268–280, 1983. DOI: 10.1109/CDC.1983.269841 Cited on
page(s) 306
[72] C. N. Nett, C. A. Jacobson and M. J. Balas, “A connection between state-space and doubly
coprime fractional representations,” IEEE Trans. Auto. Control, AC-29, pp. 831–832, Sept.
1984. DOI: 10.1109/TAC.1984.1103674 Cited on page(s)
[73] G. Newton, L. Gould and J. F. Kaiser, Analytic Design of Linear Feedback Controls, Wiley, New
York, 1957. Cited on page(s)
[74] L. Pernebo, “An algebraic theory for the design of controllers for linear multivariable systems,”
IEEE Trans. Auto. Control, AC-26, pp. 171–194, Feb. 1981. DOI: 10.1109/TAC.1981.1102554
Cited on page(s)
[75] V. P. Potapov, “The Multiplicative structure of J -contractive matrix functions,” Translations of

the Amer. Math. Soc., 15, pp. 131–243. 1960. Cited on page(s) 205
[76] D. Quillen, “Projective modules over polynomial rings,” Invent. Math., 36, pp. 167–171, 1976.
DOI: 10.1007/BF01390008 Cited on page(s) 293
BIBLIOGRAPHY 353
[77] V. R. Raman and R. W. Liu, “A necessary and sulficient condition for feedback stabilization
in a factorial ring,” IEEE Trans. Auto. Control, AC-29, pp. 941–942, Oct. 1984.
[78] H. H. Rosenbrock, State Space and Multivariable Theory, Nelson, London, 1970. Cited on
page(s)
[79] J. J. Rotman, An Introduction to Homological Algebra, Academic Press, New York, 1979. Cited
on page(s) 293
[80] W. Rudin, Fourier Analysis on Groups, John Wiley, New York, 1962. Cited on page(s) 285, 287
[81] W. Rudin, Real and Complex Analysis, McGraw-Hill, New York, 1974. Cited on page(s) 210
[82] R. Saeks and J. Murray, “Feedback System Design: The tracking and disturbance rejection
problems,” IEEE Trans. Auto. Control, AC-26, pp. 203–218, Feb. 1981.
[83] R. Saeks and J. Murray, “Fractional representation, algebraic geometry and the simultaneous
stabilization problem,” IEEE Trans. Auto. Control, AC-27, pp. 895–903, Aug. 1982.
[84] R. Saeks, J. Murray, O. Chua, C. Karmokolias and A. Iyer, “Feedback system design: The single
variate case - Part I,” Circ., Sys. and Sig. Proc., 1, pp. 137–170, 1982. DOI: 10.1007/BF01600050
Cited on page(s)
[85] R. Saeks, J. Murray, O. Chua, C. Karmokolias and A. Iyer, “Feedback system design: The single
variate case - Part II,” Circ., Sys. and Sig. Proc., 2, pp. 3–34, 1983. DOI: 10.1007/BF01598141
Cited on page(s)
[86] D. Sarason, “Generalized interpolation in H ∞ ,” Trans. Amer. Math. Soc., 127, pp. 179–203,
May 1967. DOI: 10.2307/1994641 Cited on page(s) 171, 179, 180, 205
[87] D. D. Siljak, “On reliability of control,” Proc. IEEE Conf. Decision and Control, pp. 687–694,
1978. Cited on page(s)
[88] D. D. Siljak, “Dynamic reliability using multiple control systems,” Int. J. Control, 31, pp. 303–
329, 1980. DOI: 10.1080/00207178008961043 Cited on page(s)
[89] G. Simmons, Introduction to Topology and Modern Analysis, McGraw-Hill, New York, 1966.
[90] D. E. Smith, History of Mathematics. Vol. I, Dover, New York, 1958. Cited on page(s) xvii
[91] A. A. Suslin, “Projective modules over a polynomial ring are free,” Soviet Math. Doklady, 17,
pp. 1160–1164, 1976. Cited on page(s)
354 BIBLIOGRAPHY
[92] M. Vidyasagar, “Input-output stability of a broad class of linear time-invariant multivariable
feedback systems,” SIAM J. Control, 10, pp. 203–209, Feb. 1972. DOI: 10.1137/0310015 Cited
on page(s) xv
[93] M. Vidyasagar, “Coprime factorizations and the stability of muitivariable distributed feedback
systems,” SIAM J. Control, 13, pp. 1144–1155, Nov. 1975. DOI: 10.1137/0313071 Cited on
page(s)
[94] M. Vidyasagar, Nonlinear Systems Analysis, Prentice-Hall, Englewood Cliffs, NJ., 1978. Cited
on page(s) 154
[95] M. Vidyasagar, “On the use of right-coprime factorizations in distributed feedback systems
containing unstable subsystems,” IEEE Trans. Circ. and Sys., CAS-25, pp. 916–921, Nov. 1978.
DOI: 10.1109/TCS.1978.1084408 Cited on page(s) 210
[96] M. Vidyasagar, “The graph metric for unstable plants and robustness estimates for feedback
stability,” IEEE Trans. Auto. Control, AC-29, pp. 403–418, May 1984.
[97] M. Vidyasagar and N. K. Bose, “Input-output stability of linear systems defined over measure
spaces,” Proc. Midwest Symp. Circ. and Sys., Montreal, Canada, pp. 394–397, Aug. 1975. Cited
on page(s) 284
[98] M. Vidyasagar and K. R. Davidson, “A parametrization of all stable stabilizing compensators

for single-input-output systems,” (under preparation). Cited on page(s)
[99] M. Vidyasagar and H. Kimura, “Robust controllers for uncertain linear multivariable systems,”
Automatica, 22, pp. 85-94, Jan. 1986. DOI: 10.1016/0005-1098(86)90107-X Cited on page(s)
277
[100] M. Vidyasagar and B. C. Lévy, “On the genericity of simultaneous stabilizability,” Symp. Math.
Thy. of Netw. and Sys., Stockholm, 1985. Cited on page(s) 275, 277
[101] M. Vidyasagar, H. Schneider and B. A. Francis, “Algebraic and topological aspects of feedback
stabilization,” IEEE Trans. Auto. Control, AC-27, pp. 880–894, Aug. 1982.
DOI: 10.1109/TAC.1982.1103015 Cited on page(s) 300, 301
[102] M. Vidyasagar and N. Viswanadham, “Algebraic design techniques for reliable stabilization,”
IEEE Trans. Auto. Control, AC-27, pp. 1085–1095, Oct. 1982.
[103] M. Vidyasagar and N. Viswanadham, “Reliable stabilization using a multi-controller con-

figutation,” Proc. IEEE Conf. on Decision and Control, pp. 856–859, 1983.
DOI: 10.1016/0005-1098(85)90008-1 Cited on page(s)
BIBLIOGRAPHY 355
[104] B. L. van der Waerden, Geomety and Algebra in Ancient Civilizations, Springer Verlag, New
York, 1984. Cited on page(s) xvii
[105] J. L. Walsh, Interpolation and Approximation by Rational Functions in the Complex Domain,
AMS Colloqium Publ., Providence, RI., 1935. Cited on page(s)
[106] W. Wolovich, Linear Multivariable Control, Springer-Verlag, New York, 1974. Cited on
page(s)
[107] W. M. Wonham, Linear Multivariable Control: A Geometric Approach, 2nd ed., Springer-
Verlag, New York, 1979. Cited on page(s) 254
[108] D. C. Youla, J. J. Bongiorno, Jr. and C. N. Lu, “Single-loop feedback stablization of linear mul-
tivariable plants,” Automatica, 10, pp. 159–173, 1974. DOI: 10.1016/0005-1098(74)90021-1
Cited on page(s)
[109] D. C. Youla, J. J. Bongiorno, Jr. and H. A. Jabr, “Modern Wiener-Hopf design of optimal
controllers, Part I: The single-input case,” IEEE Trans. on Auto. Control, AC-21, pp. 3–14, Feb.
1976. DOI: 10.1109/TAC.1976.1101139 Cited on page(s)
[110] D. C. Youla and G. Gnavi, “Notes on n-dimensional system theory,” IEEE Trans. Circ. and
Sys., CAS-26, pp. 105–111, Feb. 1979. DOI: 10.1109/TCS.1979.1084614 Cited on page(s)
283
[111] D. C. Youla, H. A. Jabr and J. J. Bongiorno, Jr., “Modern Wiener-Hopf design of optimal
controllers, Part II: The multivariable case,” IEEE Trans. Auto. Control, AC-21, pp. 319–338,
June 1976. DOI: 10.1109/TAC.1976.1101223 Cited on page(s) 205
[112] D. C. Youla and R. Pickel, “The Quillen-Suslin theorem and the structure of n-dimensional
elementary polynomial matrices,” IEEE Trans. Circ. and Sys., CAS-31, pp. 513–518, June 1984.
DOI: 10.1109/TCS.1984.1085545 Cited on page(s) 293, 294
[113] G. Zames,“Feedback and optimal sensitivity: Model reference transformations, multiplicative

seminorms and approximate inverses,” IEEE Trans. Auto. Control, AC-26, pp. 301–320, April
[114] G. Zames and A. El-Sakkary, “Unstable systems and feedback: The gap metric,” Proc. Allerton
Conf., pp. 380–385, 1980. Cited on page(s) 228, 277
[115] G. Zames and B. A. Francis, “A new approach to classical frequency methods: feedback and
minimax sensitivity,” IEEE Trans. Auto. Control, AC-28, pp. 585–601, May 1983.
DOI: 10.1109/TAC.1983.1103275 Cited on page(s) xiii, 205
[116] O. Zariski and P. Samuel, Commutative Algebra, Vol. I, Van-Nostrand, New York, 1958. Cited
on page(s) 279, 281, 282, 307, 316
356 BIBLIOGRAPHY
[117] K. Zhou, J. C. Doyle and K. Glover, Robust and Optimal Control, Prentice-Hall, Englewood
Cliffs, New Jersey, 1996. Cited on page(s) xiii
357
Author’s Biography
MATHUKUMALLI VIDYASAGAR
Mathukumalli Vidyasagar was born in Guntur, India in 1947. He received the B.S., M.S., and Ph.D.
degrees in electrical engineering from the University of Wisconsin in Madison, in 1965, 1967, and
1969, respectively. From the next twenty years he taught mostly in Canada, before returning to his
native India in 1989. Over the next twenty years, he built up two research organizations from scratch,
first the Centre for Artificial Intelligence and Robotics under the Ministry of Defence, Government
of India, and later the Advanced Technology Center in Tata Consultancy Services (TCS), India’s
largest software company. After retiring from TCS in 2009, he joined the University of Texas at
Dallas as the Cecil & Ida Green Chair in Systems Biology Science, and he became the Founding
Head of the Bioengineering Department. His current research interests are stochastic processes and
stochastic modeling, and their application to problems in computational biology. He has received a
number of awards in recognition of his research, including the 2008 IEEE Control Systems Award.
359
Index
absolute nondivisor of zero, 312 degree function, 319

accumulation point, 340 determinant, 323
additive perturbation, 236 multilinearity of, 324
associate, 316 domain, 308
left, 328
right, 328 equivalence, 328
Euclidean domain, 319
base for a topology, 338 proper, 319
Bezout domain, 279
relation to coprime factorizations, 280 field, 309
binary relation, 269 of fractions, 313
Binet-Cauchy formula, 325 filtering, 153
Blaschke product, 158 solution, 162, 163
closed set, 337 fraction, 312
cluster point, 340
Gel’fand transform, 284
complementary inner matrix, 194
complementation, 288 generic relation, 269
in a local ring, 296 genericity
relation to coprime factorizations, 288 of comprimeness, 270
comprime factorization of nonsingularity, 271
over a general ring, 279 of simultaneous strong stabilizability, 273,
relation to Bezout domains, 280 277
relation to principal ideals, 279, 280 of Smith forms, 271, 273
comprimeness graph, 209
over a general ring, 281 graph metric, 229
continuity, 341 estimates for, 232
contractible set, 291 relation to graph topology, 230
relation to Hermite rings, 291 graph topology, 212
coprimeness, 315 and singular perturbations, 223
in A, 287 and state space models, 221
corona theorem, 286 continuity in, 215
360 INDEX
convergence in, 213 normalized r.c.f., 228
relation to product topology, 219 normed algebra, 342
relation to robust stabilization, 215 nullstellensatz, 283
greatest common divisor, 314 Nyquist criterion, 238
Hardy spaces, 156 open set, 337

Hausdorff topology, 340 optimal robust regulation, 264
Hermite form, 329 outer function, 158
Hermite ring, 290 outer matrix, 159
relation to coprime factorizations, 291
performance index, 153
ideal, 309 prime element, 316
left, 309 prime factorization, 317
maximal, 281 principal ideal domain, 314
prime, 310 principal ideal ring, 314
principal, 310 product topology, 342
right, 309 proper Euclidean domain, 319
index, 238
Quillen-Suslin theorem, 293
inner function, 158
quotient field, 313
inner matrix, 159
inner-outer factorization, 158, 159 reduced form, 316
internal model principle, 254 relation, 269
invariant factor, 332 binary, 269
generic, 269
Laplace’s expansion, 328
strongly generic, 269
least common multiple, 317
symmetric, 269
limit point, 340
ring, 307
local ring, 293
commutative, 307
localization, 293, 313
of fractions, 313
locally compact Abelian group, 285
with an identity, 307
maximal ideal space, 283 robust disturbance rejection, 252
maximum modulus principle, 186 solution, 252
metric space, 339 robust regulation, 252
multiplicative perturbation, 236 robust stabilizability, 209
multiplicative set, 298, 312 robust stabilization
existence of compensators, 244, 246, 248
Nagata’s lemma, 294 theorems, 237, 242
neighborhood, 339 robust tracking, 253
Nevanlinna-Pick algorithm, 185 using two-parameter compensation, 259
INDEX 361
saturated set, 298, 312 strongly generic relation, 269
Schur complement, 327 subring, 309
sensitivity minimization, 154 symmetric relation, 269
example, 174, 192
for a fat plant, 176 topological ring, 342
for a general plant, 193 topological space, 337
for a scalar plant, 166 topology, 337
Serre’s conjecture, 293 discrete, 337
singular perturbations, 223 first-countable, 339
and strictly proper compensators, 227 Hausdorff topology, 340
Smith form, 332 product, 342
Smith-McMillan form, 335 trivial, 337
spectral factor, 194
spectral factorization, 194 unimodular row, 289
stabilization by lumped compensators, 305
stable-factor perturbation, 236 weighted performance index, 153

Control System Synthesis: A Factorization Approach

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Control System Synthesis: A Factorization Approach

Uploaded by

Copyright:

Available Formats

Control System Synthesis

Control System Synthesis: A Factorization Approach, Part II

Control System Synthesis: A Factorization Approach, Part I

The Reaction Wheel Pendulum

Control System Synthesis: A Factorization Approach, Part II

ISBN: 9781608456611 paperback

A Publication in the Morgan & Claypool Publishers series

SYNTHESIS LECTURES ON CONTROL AND MECHATRONICS #3

May God protect us together

Preface for the Original Edition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv

6 Filtering and Sensitivity Minimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

8 Extensions to General Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279

A Algebraic Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307

B Preliminaries on Matrix Rings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323

C Topological Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337

Author’s Biography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357

Dallas and Hyderabad, June 2011

Filtering and Sensitivity

J = W1 (I + P C)−1 W2 . (6.1.1)

The significance of the problem can be explained in terms of Figure 6.1.

Figure 6.1: Pictorial Representation of the Weighted Performance Measure.

e(j ω)2 = [(I + P C)−1 W2 ](j ω)2 . (6.1.2)

z(j ω)2 = [W1 (I + P C)−1 W2 ](j ω)2 . (6.1.3)

The objective is to minimize the energy of the process z, i.e.,

= sup W1 (I + P C)−1 W2 u2

= sup [W1 (I + P C)−1 W2 ](j ω) , (6.1.5)

S(P ) = {(X̃ + DR)(Ỹ − NR)−1 : R ∈ M(S) and |Ỹ − NR|  = 0} . (6.1.6)

(I + P C)−1 = (Ỹ − NR)D̃ . (6.1.7)

J (R) = W1 (Ỹ − NR)D̃W2 , (6.1.8)

J = W1 P C(I + P C)−1 W2 . (6.1.9)

However, if C ∈ S(P ), then by (5.2.7),

P C(I + P C)−1 = (X̃ + DR)Ñ , (6.1.10)

for some R ∈ M(S), so that

6.2 SOME FACTS ABOUT HARDY SPACES

The set H∞ consists of all analytic functions f on D such that

belongs to H∞ , and in fact

In other words, an application of Tf consists simply of pointwise multiplication by f (·). It can be

where ∗ denotes the conjugate transpose. Corresponding to every F ∈ H∞

(TF g)(z) = F (z)g(z) ∀z ∈ D . (6.2.14)

TF = sup TF g2 = F ∞ = sup σ̄ (F (ej θ )) , (6.2.15)

where σ (·) denotes the largest singular value.

need not equal m). A matrix F ∈ H∞ n×m

Fact 6.2.2 Suppose G ∈ H∞ n×m

Fact 6.2.2 has several interesting consequences. First, suppose G ∈ R∞ n×m

The function f belongs to H∞ (C+ ) if

sup sup |f (σ + j ω)| =: f ∞+ < ∞ . (6.2.17)

g(s) = f [(s − 1)/(s + 1)] , (6.2.20)

To relate (6.2.22) and (6.2.23), let ej θ = (j ω − 1)/(j ω + 1). Then

1 + ej θ e−j θ/2 + ej θ/2

J (R) = W1 (Ỹ − NR)D̃W2 2

Jm (R) = F − GRH 2 , (6.3.3)

and let Jm (R) = ∞ if F − GRH ∈ / M(H2 (C+ )). Factorize G, H as G = Gi Go , H = Ho Hi , where

Go RHo = (Gi ∗ F Hi ∗ )+ . (6.3.5)

(Jm (R))2 = F − GRH 22 = F − Gi Go RHo Hi 22

If Gi ∗ F Hi is decomposed as (Gi ∗ F Hi ∗ )+ + (Gi ∗ F Hi ∗ )− , then by orthogonality,

(Jm (R))2 = (Gi ∗ F Hi ∗ )− 22 + (Gi ∗ F Hi ∗ )+ − Go RHo 22

Lemma 6.3.2 Suppose F, G, H ∈ M(S), and suppose in addition that

for some unit u ∈ S. Hence,

F − GRε H 2 = F − G1 Dg · Dg−1 Bεg R0 Bεh Dh−1 · Dh H1 2

where Ug , Vg , Uh , Vh ∈ U(S). Then

J (R) = W1 Ỹ D̃W2 − W1 NR D̃W2 2

inf J (R) = inf W1 Ỹ D̃W2 − G1 RH1 2 , (6.3.17)

which can be computed using Lemma 6.3.1.

S(P ) = {(X̃ + DR)(Ỹ − NR)−1 : R ∈ M(S) and |Ỹ − NR| = 0} . (6.1.6)

Tf −rb g = [(f − rb)g] = (f g) − (rbg)

Fact 6.4.5 Define a linear map A : N → N to be the restriction of Tf to N. Then

γij = φi , φj . (6.4.33)

aij = vi , Avj = vi , Tf vj = vi , f vj . (6.4.37)

φ1 , f φ1 = (f φ1 )(0) = w̃(0) = 0.5,

TF −bR g = (F − bR)g = (F g) − (bRg) = TF g , (6.5.23)

vij , A∗ vkl = A∗ vkl , vij = vkl , Avij