"This book is a useful compendium of the mathematics of (mostly) finite-dimensionallinear vector spaces (plus two final chapters on infinite-dimensional
spaces), which do find increasing application in many branches of engineering
and science .... The treatment is thorough; the book will certainly serve as a valuable reference."
- A merican Scientist
"The authors present topics in algebra and analysis for students in engineering
and science .... Each chapter is organized to include a brief overview, detailed
topical discussions and references for further study. Notes about the references
guide the student to collateral reading. Theorems, definitions, and corollaries are
illustrated with examples. The student is encouraged to prove some theorems
and corollaries as models for proving others in exercises. In most chapters, the
authors discuss constructs used to illustrate examples of applications. Discussions are tied together by frequent, well written notes. The tables and index are
good. The type faces are nicely chosen. The text should prepare a student well in
mathematical matters."
- S cience Books and iF lms
"This is an intermediate level text, with exercises, whose avowed purpose is to
provide the science and engineering graduate student with an appropriate modern mathematical (analysis and algebra) background in a succinct, but nontrivial,
manner. After some fundamentals, algebraic structures are introduced followed
by linear spaces, matrices, metric spaces, normed and inner product spaces and
linear operators.... While one can quarrel with the choice of specific topics and
the omission of others, the book is quite thorough and can serve as a text, for
self-study or as a reference."
- M athematical Reviews
"The authors designed a typical work from graduate mathematical lectures: formal definitions, theorems, corollaries, proofs, examples, and exercises. It is to
be noted that problems to challenge students' comprehension are interspersed
throughout each chapter rather than at the end."
- C H O ICE
Anthony N. Michel
Charles J. Herget

Birkhäuser
Boston Basel Berlin
Anthony N. Michel
Department of Electrical Engineering
University of Notre Dame
Notre Dame, IN 46556
U.S.A.

Charles J. Herget
Herget Associates
P.O. Box 1425
Alameda, CA 94501
U.S.A.
Mathematics Subject Classification: 15-01, 15A03, 20-01, 26-XX, 46-XX, 46N10, 46N20, 47N20, 47N70, 54B05, 54E35, 54E45
Birkhäuser Boston
www.birkhauser.com
CONTENTS

PREFACE

CHAPTER 1: FUNDAMENTAL CONCEPTS
1.1 Sets
1.2 Functions
1.3 Relations and Equivalence Relations
1.4 Operations on Sets
1.5 Mathematical Systems Considered in This Book
1.6 References and Notes
References

CHAPTER 2: FUNDAMENTAL ALGEBRAIC STRUCTURES
Sections 2.1-2.4

CHAPTER 3: VECTOR SPACES AND LINEAR TRANSFORMATIONS
3.1 Linear Spaces
3.2 Linear Subspaces and Direct Sums
3.3 Linear Independence, Bases, and Dimension
3.4 Linear Transformations
3.5 Linear Functionals
3.6 Bilinear Functionals
3.7 Projections
3.8 Notes and References
References

CHAPTER 4: FINITE-DIMENSIONAL VECTOR SPACES AND MATRICES
Sections 4.1-4.12

CHAPTER 5: METRIC SPACES
Sections 5.1-5.9

CHAPTER 6: NORMED SPACES AND INNER PRODUCT SPACES
Sections 6.1-6.16, including:
6.13 Fourier Series
6.14 The Riesz Representation Theorem
6.15 Some Applications
     A. Approximation of Elements in Hilbert Space (Normal Equations)
     B. Random Variables
     C. Estimation of Random Variables
6.16 Notes and References
References

CHAPTER 7: LINEAR OPERATORS
PREFACE

This book evolved from a one-year sequence of courses offered by the authors at Iowa State University. The audience for this book typically included theoretically oriented first- or second-year graduate students in various engineering or science disciplines. Subsequently, while serving as Chair of the Department of Electrical Engineering, and later, as Dean of the College of Engineering at the University of Notre Dame, the first author continued using this book in courses aimed primarily at graduate students in control systems. Since administrative demands precluded the possibility of regularly scheduled classes, the Socratic method was used in guiding students in self study. This method of course delivery turned out to be very effective and satisfying to student and teacher alike. Feedback from colleagues and students suggests that this book has been used in a similar manner elsewhere.

The original objectives in writing this book were to provide the reader with appropriate mathematical background for graduate study in engineering or science; to provide the reader with appropriate prerequisites for more advanced subjects in mathematics; to allow the student in engineering or science to become familiar with a great deal of pertinent mathematics in a rapid and efficient manner without sacrificing rigor; to give the reader a unified overview of applicable mathematics, thus enabling him or her to choose additional courses in mathematics more intelligently; and to make it possible for the student to understand at an early stage of his or her graduate studies the mathematics used in the current literature.
The prerequisites for this book include the usual background in undergraduate mathematics offered to students in engineering or in the sciences at universities in the United States. Thus, in addition to graduate students, this book is suitable for advanced senior undergraduate students as well, and for self study by practitioners.

Concerning the labeling of items in the book, some comments are in order. Sections are assigned numerals that reflect the chapter and the section numbers. For example, Section 2.3 signifies the third section in the second chapter. Extensive sections are usually divided into subsections identified by upper-case common letters A, B, C, etc. Equations, definitions, theorems, corollaries, lemmas, examples, exercises, figures, and special remarks are assigned monotonically increasing numerals which identify the chapter, section, and item number. For example, Theorem 4.4.7 denotes the seventh identified item in the fourth section of Chapter 4. This theorem is followed by Eq. (4.4.8), the eighth identified item in the same section. Within a given chapter, figures are identified by upper-case letters A, B, C, etc., while outside of the chapter, the same figure is identified by the above numbering scheme. Finally, the end of a proof or of an example is signified by the symbol ■.
A one-semester course

Chapters 1, 3, 4, 5, and Sections 6.1 and 6.11 in Chapter 6 can serve as the basis for a one-semester course, emphasizing basic aspects of Linear Algebra and Analysis in a metric space setting.

The coverage of Chapter 1 should concentrate primarily on functions (Section 1.2) and relations and equivalence relations (Section 1.3), while the material concerning sets (Section 1.1) and operations on sets (Section 1.4) may be covered as reading assignments. On the other hand, Section 1.5 (on mathematical systems) merits formal coverage, since it gives the student a good overview of the book's aims and contents.
The material in this book has been organized so that Chapter 2, which addresses the important algebraic structures encountered in Abstract Algebra, may be omitted without any loss of continuity. In a one-semester course emphasizing Linear Algebra, this chapter may be omitted in its entirety.

In Chapter 3, which addresses general vector spaces and linear transformations, the material concerning linear spaces (Section 3.1), linear subspaces and direct sums (Section 3.2), linear independence and bases (Section 3.3), and linear transformations (Section 3.4) should be covered in its entirety, while selected topics on linear functionals (Section 3.5), bilinear functionals (Section 3.6), and projections (Section 3.7) should be deferred until they are required in Chapter 4.

Chapter 4 addresses finite-dimensional vector spaces and linear transformations (matrices) defined on such spaces. The material on determinants (Section 4.4) and some of the material concerning linear transformations on Euclidean vector spaces (Subsections 4.10D and 4.10E), as well as applications to ordinary differential equations (Section 4.11), may be omitted without any loss of continuity. The emphasis in this chapter should be on coordinate representations of vectors (Section 4.1), the representation of linear transformations by matrices and the properties of matrices (Section 4.2), equivalence and similarity of matrices (Section 4.3), eigenvalues and eigenvectors (Section 4.5), some canonical forms of matrices (Section 4.6), minimal polynomials, nilpotent operators and the Jordan canonical form (Section 4.7), bilinear functionals and congruence (Section 4.8), Euclidean vector spaces (Section 4.9), and linear transformations on Euclidean vector spaces (Subsections 4.10A, 4.10B, and 4.10C).

Chapter 5 addresses metric spaces, which constitute some of the most important topological spaces. In a one-semester course, the emphasis in this chapter should be on the definition of metric space and the presentation of important classes of metric spaces (Sections 5.1 and 5.3), open and closed sets (Section 5.4), complete metric spaces (Section 5.5), compactness (Section 5.6), and continuous functions (Section 5.7). The development of many classes of metric spaces requires important inequalities, including the Hölder and the Minkowski inequalities for finite and infinite sums and for integrals. These are presented in Section 5.2 and need to be included in the course. Sections 5.8 and 5.10 address specific applications and may be omitted without any loss of continuity. However, time permitting, the material in Section 5.9, concerning equivalent and homeomorphic metric spaces and topological spaces, should be considered for inclusion in the course, since it provides the student a glimpse into other areas of mathematics.

To demonstrate mathematical systems endowed with both algebraic and topological structures, the one-semester course should include the material of Sections 6.1 and 6.2 in Chapter 6, concerning normed linear spaces (resp., Banach spaces) and inner product spaces (resp., Hilbert spaces), respectively.
A two-semester course

In addition to the material outlined above for a one-semester course, a two-semester course ...

... involved a formal presentation to the entire class at the end of the semester.

The courses described above were also offered using the Socratic method, following the outlines given above. These courses typically involved half a dozen participants. While most of the material was self taught by the students themselves, the classroom meetings served as a forum for guidance, clarifications, and challenges by the teacher, usually resulting in lively discussions of the subject on hand not only among teacher and students, but also among students themselves.
For the current printing of this book, we have created a supplementary website of additional resources for students and instructors: http://Michel.Herget.net. Available at this website are additional current references concerning the subject matter of the book and a list of several areas of applications (including references). Since the latter reflects mostly the authors' interests, it is by definition rather subjective. Among several additional items, the website also includes some reviews of the present book. In this regard, the authors would like to invite readers to submit reviews of their own for inclusion into the website.

The present publication of Algebra and Analysis for Engineers and Scientists was made possible primarily because of Tom Grasso, Birkhäuser's Computational Sciences and Engineering Editor, whom we would like to thank for his considerations and professionalism.

Anthony N. Michel
Charles J. Herget
Summer 2007
1

FUNDAMENTAL CONCEPTS
1.1. SETS

Virtually every area of modern mathematics is developed by starting from an undefined object called a set. There are several reasons for doing this. One of these is to develop a mathematical discipline in a completely axiomatic and totally abstract manner. Another reason is to present a unified approach to what may seem to be highly diverse topics in mathematics. Our reason is the latter, for our interest is not in abstract mathematics for its own sake. However, by using abstraction, many of the underlying principles of modern mathematics are more clearly understood.

Thus, we begin by assuming that a set is a well-defined collection of objects.
{x ∈ A : P(x) is true}.
For example, let A denote the set of all people who live in Ames, Iowa, and let B denote the set of all males who live in Ames. We can write, then,

B = {x ∈ A : x is a male}.

When it is clear which set x belongs to, we sometimes write {x : P(x) is true} (instead of, say, {x ∈ A : P(x) is true}).
It is also necessary to consider a set which has no members. Since a set is determined by its elements, there is only one such set, which is called the empty set, or the vacuous set, or the null set, or the void set, and which is denoted by ∅. Any set A consisting of one or more elements is said to be non-empty or non-void. If A is non-void we write A ≠ ∅.
If A and B are sets and if every element of B also belongs to A, then we say that B is a subset of A or A includes B, and we write B ⊂ A or A ⊃ B. Furthermore, if B ⊂ A and if there is an x ∈ A such that x ∉ B, then we say that B is a proper subset of A. Some texts make a distinction between proper subset and any subset by using the notation ⊂ and ⊆, respectively. We shall not use the symbol ⊆ in this book. We note that if A is any set, then ∅ ⊂ A. Also, ∅ ⊂ ∅. If B is not a subset of A, we write B ⊄ A or A ⊅ B.
1.1.1. Example. Let R denote the set of all real numbers, let Z denote the set of all integers, let J denote the set of all positive integers, and let Q denote the set of all rational numbers. We could alternately describe the set Z as

Z = {x ∈ R : x is an integer}.

Thus, for every x ∈ R, the statement x is an integer is either true or false. We frequently also specify sets such as J in the following obvious manner,

J = {x ∈ Z : x = 1, 2, ...}.

Similarly,

Q = {x ∈ R : x = p/q, p, q ∈ Z, q ≠ 0}. ■
We emphasize that all definitions are "if and only if" statements. Thus, in the above definition we should actually have said: A and B are equal if and only if A ⊂ B and B ⊂ A. Since this is always understood, hereafter all definitions will imply the "only if" portion. Thus, we simply say: two sets A and B are said to be equal if A ⊂ B and B ⊂ A.

In Definition 1.1.2 we introduced two concepts of equality, one of equality of sets and one of equality of elements. We shall encounter many forms of equality throughout this book.
A⁻ = {x ∈ X : x ∉ A}.   (1.1.3)
In every discussion involving sets, we will always have a given fixed set in mind from which we take elements and subsets. We will call this set the universal set, and we will usually denote this set by X. Throughout the remainder of the present section, X always denotes an arbitrary non-void fixed set.
We now establish some properties of sets.
1.1.4. Theorem. Let A, B, and C be subsets of X. Then
(i) if A ⊂ B and B ⊂ C, then A ⊂ C;
(ii) ...;
(iii) ...;
(iv) A = (A⁻)⁻;
(v) ...; and
(vi) ....

Proof. To prove (i), first assume that A is non-void and let x ∈ A. Since A ⊂ B, x ∈ B, and since B ⊂ C, x ∈ C. Since x is arbitrary, every element of A is also an element of C, and so A ⊂ C. Finally, if A = ∅, then A ⊂ C follows trivially.

The proofs of parts (ii) and (iii) follow immediately from (1.1.3).

To prove (iv), we must show that A ⊂ (A⁻)⁻ and (A⁻)⁻ ⊂ A. If A = ∅, then clearly A ⊂ (A⁻)⁻. Now suppose that A is non-void. We note from (1.1.3) that

(A⁻)⁻ = {x ∈ X : x ∉ A⁻}.   (1.1.5)
1.1.6. Exercise. Prove the remaining parts of Theorem 1.1.4.

The proofs given in parts (i) and (iv) of Theorem 1.1.4 are intentionally quite detailed in order to demonstrate the exact procedure required to prove the inclusion and equality of sets.

A ∪ B = {x ∈ X : x ∈ A or x ∈ B}.
1.1.7. Theorem. Let A, B, and C be subsets of X. Then
(i) A ∩ B = B ∩ A;
(ii) A ∪ B = B ∪ A;
(iii) A ∩ ∅ = ∅;
(iv) A ∪ ∅ = A;
(v) A ∩ X = A;
(vi) A ∪ X = X;
(vii) A ∩ A = A;
(viii) A ∪ A = A;
(ix) A ∪ A⁻ = X;
(x) A ∩ A⁻ = ∅;
(xi) A ∩ B ⊂ A;
(xii) A ∩ B = A if and only if A ⊂ B;
(xiii) A ⊂ A ∪ B;
(xiv) A = A ∪ B if and only if B ⊂ A;
(xv) (A ∩ B) ∩ C = A ∩ (B ∩ C);
(xvi) (A ∪ B) ∪ C = A ∪ (B ∪ C);
(xvii) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C);
(xviii) (A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C);
(xix) (A ∪ B)⁻ = A⁻ ∩ B⁻; and
(xx) (A ∩ B)⁻ = A⁻ ∪ B⁻.

In proving (xviii), one first establishes the inclusion

(A ∩ B) ∪ C ⊂ (A ∪ C) ∩ (B ∪ C).   (1.1.8)
1.1.10. Exercise. Prove parts (i) through (xvii) and parts (xix) and (xx) of Theorem 1.1.7.

The notions of union and intersection extend to any finite number of subsets A₁, A₂, ..., Aₙ of X. We write

⋃_{i=1}^n Aᵢ = A₁ ∪ A₂ ∪ ... ∪ Aₙ = {x ∈ X : x ∈ Aᵢ for some i = 1, ..., n}

and

⋂_{i=1}^n Aᵢ = A₁ ∩ A₂ ∩ ... ∩ Aₙ = {x ∈ X : x ∈ Aᵢ for all i = 1, ..., n}.
That is, for subsets A₁, A₂, ..., Aₙ of X,

[⋃_{i=1}^n Aᵢ]⁻ = ⋂_{i=1}^n Aᵢ⁻   (1.1.12)

and

[⋂_{i=1}^n Aᵢ]⁻ = ⋃_{i=1}^n Aᵢ⁻.   (1.1.13)

1.1.14. Exercise. Prove Eqs. (1.1.12) and (1.1.13).

The results expressed in Eqs. (1.1.12) and (1.1.13) are usually referred to as De Morgan's laws. We will see later in this section that these laws hold under more general conditions.
Next, let A and B be two subsets of X. We define the difference of B and A, denoted (B − A), as the set of elements in B which are not in A, i.e.,

B − A = {x ∈ X : x ∈ B and x ∉ A}.

Note that B − A = B ∩ A⁻.
Now let A and B again be subsets of the set X. The symmetric difference of A and B is denoted by A Δ B and is defined as

A Δ B = (A − B) ∪ (B − A).

The symmetric difference satisfies, among others, the following identities:
(i) A Δ B = B Δ A;
(ii) A Δ B = (A ∪ B) − (A ∩ B);
(iii) A Δ A = ∅;
(iv) A Δ ∅ = A;
(v) A Δ (B Δ C) = (A Δ B) Δ C.

1.1.16. Exercise. Prove the above identities.
In passing, we point out that the use of Venn diagrams is highly useful in
visualizing properties of sets; however, under no circumstances should such
diagrams take the place of a proof. In Figure A we illustrate the concepts of
union, intersection, difference, and symmetric difference of two sets, and the
complement of a set, by making use of Venn diagrams. Here, the shaded
regions represent the indicated sets.
1.1.17. Figure A. Union, intersection, difference, symmetric difference, and complement of sets, illustrated by means of Venn diagrams.
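These operations can also be explored computationally. The following is a small sketch (not part of the original text) using Python's built-in set type, whose operators correspond directly to the operations just defined; the sets A, B, and the universal set X are arbitrary sample choices.

```python
# Sample sets; X plays the role of the universal set for complements.
A = {1, 2, 3, 4}
B = {3, 4, 5}
X = {1, 2, 3, 4, 5, 6}

print(A | B)   # union:                {1, 2, 3, 4, 5}
print(A & B)   # intersection:         {3, 4}
print(A - B)   # difference A - B:     {1, 2}
print(A ^ B)   # symmetric difference: {1, 2, 5}
print(X - A)   # complement of A relative to X

# One of De Morgan's laws on this example: (A U B)^- = A^- intersect B^-
assert X - (A | B) == (X - A) & (X - B)
```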
Next, let A ⊂ X. The collection of all subsets of A is called the power class of A and is denoted by 𝒫(A); i.e.,

𝒫(A) = {B : B ⊂ A}.

1.1.20. Example. The power class of the empty set, 𝒫(∅) = {∅}, i.e., the singleton of ∅. The power class of a singleton, 𝒫({a}) = {∅, {a}}. For the set A = {a, b}, 𝒫(A) = {∅, {a}, {b}, {a, b}}. In general, if A is a finite set with n elements, then 𝒫(A) contains 2ⁿ elements. ■
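A brief computational sketch (not from the text) of the power class, built with itertools; frozensets are used because Python sets cannot contain ordinary sets.

```python
from itertools import combinations

def power_class(A):
    # All subsets of A, of every size r = 0, 1, ..., len(A).
    A = list(A)
    return {frozenset(c) for r in range(len(A) + 1)
            for c in combinations(A, r)}

P = power_class({'a', 'b'})
print(P)                  # the four subsets of {a, b}
assert len(P) == 2 ** 2   # a set with n elements has 2^n subsets
```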
Before proceeding further, it should be pointed out that a free and uncritical use of a set theory can lead to contradictions and that set theory has had a careful development with various devices used to exclude the contradictions. Roughly speaking, contradictions arise when one uses sets which are "too big," such as trying to speak of a set which contains everything. In all of our subsequent discussions we will keep away from these contradictions by always having some set or space X fixed for a given discussion and by considering only sets whose elements are elements of X, or sets (collections) whose elements are subsets of X, or sets (families) whose elements are collections of subsets of X, etc.
Let us next consider ordered sets. Above, we defined set in such a manner that the ordering of the elements is immaterial, and furthermore that each element is distinct. Thus, if a and b are elements of X, then {a, b} = {b, a}; i.e., there is no preference given to a or b. Furthermore, we have {a, a, b} = {a, b}. In this case we sometimes speak of an unordered pair {a, b}. Frequently, we will need to consider the ordered pair (a, b) (a and b need not belong to the same set), where we distinguish between the first element a and the second element b. In this case (a, b) = (u, v) if and only if u = a and v = b. Thus, (a, b) ≠ (b, a) if a ≠ b. Also, we will consider ordered triplets (a, b, c), ordered quadruplets (a, b, c, d), etc., where we need to distinguish between the first element, second element, third element, fourth element, etc. Ordered pairs, ordered triplets, ordered quadruplets, etc., are examples of ordered sets.

We point out here that our characterization of ordered sets is not axiomatic, since we are assuming that the reader knows what is meant by the first element, the second element, and so forth.
Next, let X and Y be non-void sets. The Cartesian product of X and Y, denoted X × Y, is defined by

X × Y = {(x, y) : x ∈ X, y ∈ Y}.   (1.1.21)

More generally, for non-void sets X₁, X₂, ..., Xₙ we define

X₁ × X₂ × ... × Xₙ = {(x₁, x₂, ..., xₙ) : x₁ ∈ X₁, x₂ ∈ X₂, ..., xₙ ∈ Xₙ}.   (1.1.22)
1.1.24. Example. Let A = {0, 1}, and let B = {a, b, c}. Then

A × B = {(0, a), (0, b), (0, c), (1, a), (1, b), (1, c)}

and

B × A = {(a, 0), (a, 1), (b, 0), (b, 1), (c, 0), (c, 1)}. ■
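A quick check (not from the text) of Example 1.1.24 using itertools.product:

```python
from itertools import product

A = [0, 1]
B = ['a', 'b', 'c']

AxB = list(product(A, B))
BxA = list(product(B, A))
print(AxB)   # [(0,'a'), (0,'b'), (0,'c'), (1,'a'), (1,'b'), (1,'c')]
print(BxA)   # [('a',0), ('a',1), ('b',0), ('b',1), ('c',0), ('c',1)]
assert set(AxB) != set(BxA)   # ordered pairs: A x B differs from B x A
```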
is called an indexed set. Here again, we agree to permit the possibility that the elements x_α, α ∈ I, need not be distinct. Clearly, if I is a finite non-void set, then an indexed set is simply an ordered set.

In the next definition, and throughout the remainder of this section, J denotes the set of positive integers.

1.1.25. Definition. A sequence is an indexed set whose index set is J. A sequence of sets is an indexed family of sets whose index set is J.

We usually abbreviate the sequence {xₙ ∈ X : n ∈ J} by {xₙ}, when no possibility for confusion exists. (Even though the same notation is used for the sequence {xₙ} and the singleton of xₙ, the meaning as to which is meant will always be clear from context.) Some authors write {xₙ}_{n=1}^∞ to indicate that the index set of the sequence is J. Also, some authors allow the index set of a sequence to be finite.
We are now in a position to consider the following additional generalizations.
Let {A_α : α ∈ K} be an indexed family of subsets of X. We define

⋃_{α∈K} A_α = {x ∈ X : x ∈ A_α for some α ∈ K}

and

⋂_{α∈K} A_α = {x ∈ X : x ∈ A_α for all α ∈ K}.

If K = ∅, we define

⋃_{α∈∅} A_α = ∅ and ⋂_{α∈∅} A_α = X.
The union and intersection of families of sets which are not necessarily indexed is defined in a similar fashion. Thus, if ℱ is any non-void family of subsets of X, then we define

⋃_{F∈ℱ} F = {x ∈ X : x ∈ F for some F ∈ ℱ}

and

⋂_{F∈ℱ} F = {x ∈ X : x ∈ F for all F ∈ ℱ}.
1.1.28. Example. Let the index set be J. For each n ∈ J, let

Aₙ = {x ∈ R : −n < x < n} and Bₙ = {x ∈ R : 0 < x < (n+1)/n}.

Then

⋃_{n∈J} Aₙ = R, ⋂_{n∈J} Aₙ = {x ∈ R : −1 < x < 1},

⋃_{n∈J} Bₙ = {x ∈ R : 0 < x < 2}, and ⋂_{n∈J} Bₙ = {x ∈ R : 0 < x ≤ 1}. ■
1.1.29. Theorem. Let B be a subset of X, and let {A_α : α ∈ K} be an indexed family of subsets of X. Then
(i) B ∩ [⋃_{α∈K} A_α] = ⋃_{α∈K} (B ∩ A_α);
(ii) B ∪ [⋂_{α∈K} A_α] = ⋂_{α∈K} (B ∪ A_α);
(iii) B − ⋃_{α∈K} A_α = ⋂_{α∈K} (B − A_α);
(iv) B − ⋂_{α∈K} A_α = ⋃_{α∈K} (B − A_α);
(v) [⋃_{α∈K} A_α]⁻ = ⋂_{α∈K} A_α⁻; and
(vi) [⋂_{α∈K} A_α]⁻ = ⋃_{α∈K} A_α⁻.

1.1.30. Exercise. Prove Theorem 1.1.29.

Parts (v) and (vi) of Theorem 1.1.29 are called De Morgan's laws.
We conclude the present section with the following:

1.1.31. Definition. Let ℱ be any family of subsets of X. ℱ is said to be a family of disjoint sets if for all A, B ∈ ℱ such that A ≠ B, A ∩ B = ∅. A sequence of sets {Eₙ} is said to be a sequence of disjoint sets if for every m, n ∈ J such that m ≠ n, Eₘ ∩ Eₙ = ∅.
1.2. FUNCTIONS

We denote the unique element of Y which a function f assigns to x ∈ X by f(x). We sometimes write f: X → Y to denote the function f from X into Y.
The terms mapping, map, operator, transformation, and function are used interchangeably. When using the term mapping, we usually say "a mapping of X into Y." Although the distinction between the words "of X" and "from X" is immaterial, as we shall see, the wording "into Y" becomes important as opposed to the wording "onto Y," which we will encounter later.

Sometimes it is convenient not to insist that the domain of definition of f be all of X; i.e., a function is sometimes defined on a subset of X rather than on all of X. In any case, the domain of definition of f is denoted by 𝔇(f) ⊂ X. Unless specified otherwise, we shall always assume that 𝔇(f) = X.

Intuitively, a function f is a "rule" whereby for each x ∈ X a unique y ∈ Y is assigned to x. When viewed in this manner, the term mapping is quite descriptive. However, defining a function as a "rule" involves usage of yet another undefined term.
Concerning functions, some additional comments are in order.

1. So-called "multivalued functions" are not allowed by the above definition. They will be treated later under the topic of relations (Section 1.3).

2. The set X (or Y) may be the Cartesian product of sets, e.g., X = X₁ × X₂ × ... × Xₙ. In this case we think of f as being a function of n variables. We write f(x₁, ..., xₙ) to denote the value of f at (x₁, ..., xₙ) ∈ X = X₁ × ... × Xₙ.

3. It is important that the distinction between a function and the value of a function be clearly understood. The value of a function, f(x), is an element of Y. The function f is a much larger entity, and it is to be thought of as a single object. Note that f ∈ 𝒫(X × Y) (the power set of X × Y), but not every element of 𝒫(X × Y) is a function. The set of all functions from X into Y is a subset of 𝒫(X × Y) and is sometimes denoted by Y^X.
1.2.2. Example. Let A and B be the sets defined in Example 1.1.24. Let f be the subset of A × B given by f = {(0, a), (1, b)}. Then f is a function from A into B. We see that f(0) = a and f(1) = b. The range of f is the set {a, b}, which is a proper subset of B. ■
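A small sketch (not from the text) of the set-of-pairs view of a function, with a check that each domain element is assigned exactly one value:

```python
def is_function(pairs, domain):
    # Every domain element appears, and no element appears twice.
    firsts = {x for (x, _) in pairs}
    return firsts == set(domain) and len(firsts) == len(pairs)

f = {(0, 'a'), (1, 'b')}                     # the f of Example 1.2.2
print(is_function(f, {0, 1}))                # True
print(is_function(f | {(0, 'b')}, {0, 1}))   # False: 0 gets two values

f_map = dict(f)                   # the same function as a mapping
print(f_map[0], f_map[1])         # a b
print(set(f_map.values()))        # the range {a, b}
```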
Although we have defined a function as being a set, we usually characterize a function according to a rule as shown, for example, in the following.

1.2.3. Example. Let R denote the real numbers, and let f be a function from R into R whose value at each x ∈ R is given by f(x) = sin x. The function f is the sine function. Expressed explicitly as a set, we see that f = {(x, y) : y = sin x, x ∈ R} ⊂ R × R. On the other hand, the set {(x, y) : x = sin y} ⊂ R × R is not a function. ■
1.2.7. Figure B. Four mappings of X into Y: f₁ is into, f₂ is onto, f₃ is one-to-one (1-1), and f₄ is bijective.
1.2.8. Theorem. Let f be a function from X into Y, and let g = {(y, x) ∈ Y × X : y = f(x)}. If g is a function, then f is injective.

Proof. Let x₁, x₂ ∈ X be such that f(x₁) = f(x₂). This implies that (x₁, f(x₁)) and (x₂, f(x₂)) ∈ f, and so (f(x₁), x₁) and (f(x₂), x₂) ∈ g. Since f(x₁) = f(x₂) and g is a function, we must have x₁ = x₂. Therefore, f is injective. ■
The above result motivates the following definition.

1.2.9. Definition. Let f be an injective mapping of X into Y. Then we say that f has an inverse, and we call the mapping g defined in Theorem 1.2.8 the inverse of f. Hereafter, we will denote the inverse of f by f⁻¹.

1.2.10. Theorem. Let f be an injective mapping of X into Y. Then
(i) f⁻¹ is an injective mapping of ℜ(f) onto X;
(ii) (f⁻¹)⁻¹ = f;
(iii) f⁻¹(f(x)) = x for every x ∈ X; and
(iv) f(f⁻¹(y)) = y for every y ∈ ℜ(f).

1.2.11. Exercise. Prove Theorem 1.2.10.
1.2.14. Example. Let f: R → R be given by

f(x) = x/(1 + |x|) for all x ∈ R.

Then ℜ(f) = {y : −1 < y < +1}, and f has an inverse given by

f⁻¹(y) = y/(1 − |y|) for all y ∈ ℜ(f). ■
Next, let X, Y, and Z be non-void sets. Suppose that f: X → Y and g: Y → Z. For each x ∈ X, we have f(x) ∈ Y and g(f(x)) ∈ Z. Since f and g are mappings from X into Y and from Y into Z, respectively, it follows that for each x ∈ X there is one and only one element g(f(x)) ∈ Z. Hence, the set

{(x, z) ∈ X × Z : z = g(f(x)), x ∈ X}   (1.2.15)

is a function from X into Z. We call this function the composite function of g and f and denote it by g ∘ f. The value of g ∘ f at x is given by

(g ∘ f)(x) = g ∘ f(x) ≜ g(f(x)).
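A minimal sketch (not from the text) of the composite g ∘ f as the set (1.2.15), with dictionaries standing in for functions; the particular values are hypothetical:

```python
f = {'x1': 'y1', 'x2': 'y2'}   # f: X -> Y
g = {'y1': 'z1', 'y2': 'z2'}   # g: Y -> Z

def compose(outer, inner):
    # {(x, z) : z = outer(inner(x)), x in the domain of inner}
    return {x: outer[inner[x]] for x in inner}

print(compose(g, f))           # {'x1': 'z1', 'x2': 'z2'}
```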
We also have

1.2.18. Theorem. If f is a (1-1) mapping of a set X onto a set Y, and if g is a (1-1) mapping of the set Y onto a set Z, then g ∘ f is a (1-1) mapping of X onto Z.

1.2.19. Exercise. Prove Theorem 1.2.18.

Next we prove:

1.2.20. Theorem. If f is a (1-1) mapping of a set X onto a set Y, and if g is a (1-1) mapping of the set Y onto a set Z, then (g ∘ f)⁻¹ = (f⁻¹) ∘ (g⁻¹).
To illustrate, let A = {r, s, t, u}, B = {u, v, x, y}, and C = {w, x, y, z}, and let f: A → B be given by

f = (r s t u
     v x u y).

That is, the top row identifies the domain of f and the bottom row contains each unique element in the range of f directly below the appropriate element in the domain. Clearly, this representation can be used for any function defined on a finite set. In a similar fashion, let the function g: B → C be defined as

g = (u v x y
     w x y z).

Clearly, both f and g are bijective. Also, g ∘ f is the (1-1) mapping of A onto C given by

g ∘ f = (r s t u
         x y w z).

Furthermore,

f⁻¹ = (v x u y        g⁻¹ = (w x y z
       r s t u),             u v x y),

and

(g ∘ f)⁻¹ = (x y w z
             r s t u).

Now

f⁻¹ ∘ g⁻¹ = (x y w z
             r s t u) = (g ∘ f)⁻¹.
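A quick computational check (not from the text) of the last identity, with the bijections above encoded as Python dictionaries:

```python
f = dict(zip("rstu", "vxuy"))   # f: A -> B
g = dict(zip("uvxy", "wxyz"))   # g: B -> C

compose = lambda outer, inner: {a: outer[inner[a]] for a in inner}
invert  = lambda h: {v: k for k, v in h.items()}

gof = compose(g, f)
assert invert(gof) == compose(invert(f), invert(g))
print("(g o f)^-1 == f^-1 o g^-1:", invert(gof))
```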
1.2.22. Theorem. Let f: X → Y, g: Y → Z, and h: Z → W. Then h ∘ (g ∘ f) = (h ∘ g) ∘ f; i.e., the composition of mappings is associative.

1.2.23. Exercise. Prove Theorem 1.2.22.

We also have the following result.

Theorem. Let f be a bijective mapping of X onto Y. Then (i) f⁻¹ ∘ f = e_X; and (ii) f ∘ f⁻¹ = e_Y, where e_X and e_Y denote the identity mappings on X and Y, respectively.

Proof. Part (i) follows immediately from parts (iii) and (iv) of Theorem 1.2.10. The proof of part (ii) is left as an exercise. ■
1.2.27. Exercise.

1.2.29. Exercise. Let X = {a, b, c}, and let f: X → X and g: X → X be given by

f = (a b c       g = (a b c
     c b a),          b c a).

Determine g ∘ f and f ∘ g.
Let X₁ ⊂ X, and let f be a mapping of X into Y. The mapping f′ of X₁ into Y defined by

f′(x′) = f(x′)   (1.2.35)

for every x′ ∈ X₁ is called the restriction of f to X₁.

We also have:

1.2.37. Definition. If f is a mapping of X₁ into Y and if X₁ ⊂ X, then any mapping f̃ of X into Y is said to be an extension of f if

f̃(x) = f(x)   (1.2.38)

for every x ∈ X₁.
1.2.39. Example. Let X₁ = {u, v, x} and X = {u, v, x, y, z}, and let Y = {n, p, q, r, s, t}. Clearly X₁ ⊂ X. Define f: X₁ → Y as

f = (u v x
     n p q).

Also, define f̃: X → Y and f̂: X → Y as

f̃ = (u v x y z          f̂ = (u v x y z
      n p q r s),             n p q n t).

Then f̃ and f̂ are two different extensions of f. Moreover, f is the restriction of both f̃ and f̂ to X₁. ■
Next, let f: X → Y, and let A ⊂ X. The image of A under f, denoted f(A), is defined by

f(A) = {y ∈ Y : y = f(x), x ∈ A}.

Similarly, if B ⊂ Y, the inverse image of B under f, denoted f⁻¹(B), is defined by

f⁻¹(B) = {x ∈ X : f(x) ∈ B}.
Note that f⁻¹(B) is always defined for any f: X → Y. That is, there is no implication here that f has an inverse. The notation is somewhat unfortunate in this respect. Note also that the range of f is f(X).

In the next result, some of the important properties of images and inverse images of functions are summarized.

1.2.41. Theorem. Let f be a function from X into Y, let A, A₁, and A₂ be subsets of X, and let B, B₁, and B₂ be subsets of Y. Then a number of relations among images and inverse images hold; in particular, part (iii) states that f(A₁ ∩ A₂) ⊂ f(A₁) ∩ f(A₂).

1.2.42. Exercise. Prove Theorem 1.2.41.

We note that, in general, equality is not attained in parts (iii), (vii), and (viii) of Theorem 1.2.41. However, by considering special types of mappings we can obtain the following results for these cases.
Proof. We will prove only part (i) and leave the proofs of parts (ii) and (iii) as an exercise.

To prove sufficiency, let f be injective and let A₁ and A₂ be subsets of X. In view of part (iii) of Theorem 1.2.41, we need only show that f(A₁) ∩ f(A₂) ⊂ f(A₁ ∩ A₂). In doing so, let y ∈ f(A₁) ∩ f(A₂). Then y ∈ f(A₁) and y ∈ f(A₂). This means there is an x₁ ∈ A₁ and an x₂ ∈ A₂ such that y = f(x₁) = f(x₂). Since f is injective, x₁ = x₂. Hence, x₁ ∈ A₁ ∩ A₂. This implies that y ∈ f(A₁ ∩ A₂); i.e., f(A₁) ∩ f(A₂) ⊂ f(A₁ ∩ A₂).

To prove necessity, assume that f(A₁ ∩ A₂) = f(A₁) ∩ f(A₂) for all subsets A₁ and A₂ of X. For purposes of contradiction, suppose there are x₁, x₂ ∈ X such that x₁ ≠ x₂ and f(x₁) = f(x₂) = y. Let A₁ = {x₁} and A₂ = {x₂}; i.e., A₁ and A₂ are singletons of x₁ and x₂, respectively. Then A₁ ∩ A₂ = ∅, and so f(A₁ ∩ A₂) = ∅. However, f(A₁) = {y} and f(A₂) = {y}, and thus f(A₁) ∩ f(A₂) = {y} ≠ ∅. This contradicts the fact that f(A₁) ∩ f(A₂) = f(A₁ ∩ A₂) for all subsets A₁ and A₂ of X. Thus, f is injective. ■
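A small check (not from the text) of images and inverse images, showing how equality fails in f(A₁ ∩ A₂) = f(A₁) ∩ f(A₂) for a non-injective f; the sample function and sets are arbitrary:

```python
def image(f, A):
    return {f(x) for x in A}

def preimage(f, X, B):
    return {x for x in X if f(x) in B}

f = lambda x: x * x                  # not injective on the set below
X = {-2, -1, 0, 1, 2}
A1, A2 = {-1, -2}, {1, 2}

print(image(f, A1 & A2))             # f(A1 ∩ A2) = f(∅) = set()
print(image(f, A1) & image(f, A2))   # {1, 4}: strictly larger
print(preimage(f, X, {1}))           # f⁻¹({1}) = {-1, 1}
```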
1.2.44. Exercise.

1.2.45. Theorem. Let f be a function from X into Y, let {A_α : α ∈ I} be an indexed family of subsets of X, and let {B_α : α ∈ K} be an indexed family of subsets of Y. Then
(i) f(⋃_{α∈I} A_α) = ⋃_{α∈I} f(A_α);
(ii) f(⋂_{α∈I} A_α) ⊂ ⋂_{α∈I} f(A_α);
(iii) f⁻¹(⋃_{α∈K} B_α) = ⋃_{α∈K} f⁻¹(B_α);
(iv) f⁻¹(⋂_{α∈K} B_α) = ⋂_{α∈K} f⁻¹(B_α); and
(v) if B ⊂ Y, f⁻¹(B⁻) = [f⁻¹(B)]⁻.

Proof. We prove parts (i) and (iii) and leave the proofs of the remaining parts as an exercise.

To prove part (i), let y ∈ f(⋃_{α∈I} A_α). This means that there is an x ∈ ⋃_{α∈I} A_α such that y = f(x). Thus, for some α ∈ I, x ∈ A_α. This implies that f(x) ∈ f(A_α), and so y ∈ f(A_α). Hence, y ∈ ⋃_{α∈I} f(A_α). This shows that f(⋃_{α∈I} A_α) ⊂ ⋃_{α∈I} f(A_α). Conversely, let y ∈ ⋃_{α∈I} f(A_α). Then y ∈ f(A_α) for some α ∈ I, and so there is an x ∈ A_α such that f(x) = y. Now x ∈ ⋃_{α∈I} A_α, and thus y = f(x) ∈ f(⋃_{α∈I} A_α). Therefore, ⋃_{α∈I} f(A_α) ⊂ f(⋃_{α∈I} A_α).

To prove part (iii), let x ∈ f⁻¹(⋃_{α∈K} B_α). Then f(x) ∈ ⋃_{α∈K} B_α, and so f(x) ∈ B_α for some α ∈ K. Thus, x ∈ f⁻¹(B_α), and so x ∈ ⋃_{α∈K} f⁻¹(B_α). Therefore, f⁻¹(⋃_{α∈K} B_α) ⊂ ⋃_{α∈K} f⁻¹(B_α). Conversely, let x ∈ ⋃_{α∈K} f⁻¹(B_α). Then x ∈ f⁻¹(B_α) for some α ∈ K, and so f(x) ∈ B_α ⊂ ⋃_{α∈K} B_α. Thus, x ∈ f⁻¹(⋃_{α∈K} B_α). ■

1.2.46. Exercise. Prove parts (ii), (iv), and (v) of Theorem 1.2.45.
1.2.47. Definition. Let A and B be any two sets. The set A is said to be equivalent to set B if there exists a bijective mapping of A onto B.

Clearly, if A is equivalent to B, then B is equivalent to A.

1.2.48. Definition. Let J be the set of positive integers, and let A be any set. Then A is said to be countably infinite if A is equivalent to J. A set is said to be countable or denumerable if it is either finite or countably infinite. If a set is not countable, it is said to be uncountable.

We have: if A ⊂ B ⊂ X and B is countable, then A is countable.
1.3. RELATIONS AND EQUIVALENCE RELATIONS

A relation ρ from X to Y is a subset of X × Y; i.e., ρ ⊂ X × Y, and if (x, y) ∈ ρ we say that x is related to y by ρ.

As in the case of mappings, it makes sense to speak of the domain and the range of a relation. We have:
1.3.4. Definition. Let ρ ⊂ X × Y be a relation from X to Y. The domain of ρ is the set

𝔇(ρ) = {x ∈ X : (x, y) ∈ ρ for some y ∈ Y},

the range of ρ is the set

ℜ(ρ) = {y ∈ Y : (x, y) ∈ ρ for some x ∈ X},

and the inverse of ρ is the relation

ρ⁻¹ = {(y, x) ∈ Y × X : (x, y) ∈ ρ}.
1.3.7. Example. Let ρ be the relation in 𝒫(X) defined by ρ = {(A, B) : A ⊂ B}. That is, A ρ B if and only if A ⊂ B. Then ρ is reflexive and transitive but not symmetric. ■

In the following, we use the symbol ~ to denote a relation in X. If (x, y) ∈ ~, then we write, as before, x ~ y.
1.3.10. Example. Let R² = R × R, the real plane. Let X be the family of all triangles in R². Then each of the following statements can be used to define an equivalence relation in X: "is similar to," "is congruent to," "has the same area as," and "has the same perimeter as." ■
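A sketch (not from the text) of testing whether a relation on a finite set X, given as a set of ordered pairs, is an equivalence relation; the sample relations are arbitrary:

```python
def is_equivalence(rho, X):
    reflexive  = all((x, x) in rho for x in X)
    symmetric  = all((y, x) in rho for (x, y) in rho)
    transitive = all((x, w) in rho
                     for (x, y) in rho for (z, w) in rho if y == z)
    return reflexive and symmetric and transitive

X = {0, 1, 2, 3}
same_parity = {(a, b) for a in X for b in X if (a - b) % 2 == 0}
print(is_equivalence(same_parity, X))   # True

# Reflexive and transitive but not symmetric (like the subset relation):
rho = {(0, 0), (1, 1), (2, 2), (3, 3), (0, 1)}
print(is_equivalence(rho, X))           # False
```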
1.4. OPERATIONS ON SETS

1.4.1. Definition. An operation α on X is a mapping of X × X into X.
1.4.2. Example. Let R denote the real numbers. Let f: R × R → R be given by f(x, y) = x + y for all x, y ∈ R, where x + y denotes the customary sum of x plus y (i.e., + denotes the usual operation of addition of real numbers). Then f is clearly an operation on R, in the sense of Definition 1.4.1. We could just as well have defined "+" as being the operation on R, i.e., +: R × R → R, where +(x, y) ≜ x + y. Similarly, the ordinary rules of subtraction and multiplication on R, "−" and "·", respectively, are also operations on R. Notice that division, "÷", is not an operation on R, because x ÷ y is not defined for all y ∈ R (i.e., x ÷ y is not defined for y = 0). However, if we let R* = R − {0}, then "÷" is an operation on R*. ■
If A = {a, b}, an operation α on A is completely specified by the four values

α(a, a) = a α a,  α(a, b) = a α b,  α(b, a) = b α a,  α(b, b) = b α b.   (1.4.5)

This information is conveniently arranged in an operation table of the form

  α | a b
  a | a α a   a α b
  b | b α a   b α b

For example, the table

  α | a b
  a | a a
  b | b b

defines the operation x α y = x for all x, y ∈ A.
and ~ on
is said to be commutative if x cz y
Definition. An operation cz on X
E X.
y cz x for all x , y
1.4.7.
Definition. An operation cz on X is said to be associative if (x cz y) cz z
= x cz (y cz )z for x, y, Z E .X
In the case of the real numbers R, the operations of addition and multiplication are both associative and commutative. The operation ofsubtraction is
neither associative nor commutative.
1.4.8. Definition. Let α and β be operations on X. Then
(i) α is said to be left distributive over β if x α (y β z) = (x α y) β (x α z) for every x, y, z ∈ X;
(ii) α is said to be right distributive over β if (x β y) α z = (x α z) β (y α z) for every x, y, z ∈ X; and
(iii) α is said to be distributive over β if it is both left and right distributive over β.

In Example 1.4.2, the operation "·" is distributive over "+".
1.4.9. Definition. If α is an operation on X, and if X₁ is a subset of X, then X₁ is said to be closed relative to α if for every x, y ∈ X₁, x α y ∈ X₁.

Clearly, every set is closed with respect to an operation on it. The set of all integers Z, which is a subset of the real numbers R, is closed with respect to the operations of addition and multiplication defined on R. The even integers are also closed with respect to both of these operations, whereas the odd integers are not a closed set relative to addition.
1.4.10. Definition. Let α be an operation on X, and let X₁ ⊂ X be closed relative to α. The operation α′ on X₁ defined by

α′(x, y) = α(x, y) = x α y

for all x, y ∈ X₁ is called the operation on X₁ induced by α.

1.4.11. Theorem. Let α be an operation on X, let X₁ ⊂ X, where X₁ is closed relative to α, and let α′ be the operation on X₁ induced by α. Then
(i) if α is commutative, then α′ is commutative;
(ii) if α is associative, then α′ is associative; and
(iii) if β is an operation on X and X₁ is closed relative to β, and if α is left (right) distributive over β, then α′ is left (right) distributive over β′, where β′ is the operation on X₁ induced by β.

1.4.12. Exercise. Prove Theorem 1.4.11.
1.4.14. Example. Let X = {a, b, c, d, e} and X₁ = {a, b, c}. Define the operation α on X by means of an operation table whose upper left-hand block over X₁ reads

  α | a b c
  a | a c b
  b | c b a
  c | b a c

Since X₁ is closed relative to α, this block is precisely the table of the operation α′ on X₁ induced by α. ■
1.5. MATHEMATICAL SYSTEMS CONSIDERED IN THIS BOOK
It turns out that in a certain sense all inner product spaces are normed linear spaces, that all normed linear spaces are metric spaces, and, as indicated before, that all metric spaces are topological spaces. Since normed linear spaces and inner product spaces are also vector spaces, it should be clear that, in the case of such spaces, properties of algebraic systems (called algebraic structure) and properties of topological systems (called topological structure) are combined.

A class of normed linear spaces which are very important are Banach spaces, and among the more important inner product spaces are Hilbert spaces. Such spaces will be considered in some detail in Chapter 6. Also, in Chapter 7, linear transformations defined on Banach and Hilbert spaces will be considered.

5. Applications are considered at the ends of Chapters 4, 5, and 7.
1.6. REFERENCES AND NOTES

A classic reference on set theory is the book by Hausdorff [1.5]. The many excellent references on the present topics include the elegant text by Hanneken [1.4], the standard reference by Halmos [1.3], as well as the books by Gleason [1.1] and Goldstein and Rosenbaum [1.2].
REFERENCES

[1.1] Gleason
[1.2] Goldstein and Rosenbaum
[1.3] Halmos
[1.4] Hanneken
[1.5] Hausdorff
2

FUNDAMENTAL ALGEBRAIC STRUCTURES
The subject matter of the previous chapter is concerned with set theoretic
structure. We emphasized essential elements of set theory and introduced
related concepts such as mappings, operations, and relations.
In the present chapter we concern ourselves with algebraic structure.
The material of this chapter falls usually under the heading of abstract
algebra or modern algebra. In the next two chapters we will continue our
investigation of algebraic structure. The topics of those chapters go usually
under the heading of linear algebra.
This chapter is divided into three parts. The first section is concerned
with some basic algebraic structures, including semigroups, groups, rings,
fields, modules, vector spaces, and algebras. In the second section we study
properties of special important mappings on the above structures, including
homomorphisms, isomorphisms, endomorphisms, and automorphisms of
semigroups, groups and rings. Because of their importance in many areas
of mathematics, as well as in applications, polynomials are considered in
the third section. Some appropriate references for further reading are suggested at the end of the chapter.
The subject matter of the present chapter is widely used in pure as well as
in applied mathematics, and it has found applications in diverse areas, such
as modern physics, automata theory, systems engineering, information
theory, graph theory, and the like.
2.1. SOME BASIC STRUCTURES OF ALGEBRA

We begin by developing some of the more important properties of mathematical systems, {X; α}, where α is an operation on a non-void set X.
2.1.1. Definition. Let α be an operation on X. If for all x, y, z ∈ X, x α y = x α z implies that y = z, then we say that {X; α} possesses the left cancellation property. If x α y = z α y implies that x = z, then {X; α} is said to possess the right cancellation property. If {X; α} possesses both the left and right cancellation properties, then we say that the cancellation laws hold in {X; α}.
In the following exercise, some specific cases are given.
2.1.2. Exercise. Let X = {x, y}, and let the operations α, β, γ, and δ on X be defined by the tables

  α | x y     β | x y     γ | x y     δ | x y
  x | x y     x | x x     x | x y     x | x x
  y | y x     y | y x     y | x y     y | y y

Show that (i) {X; β} possesses neither the right nor the left cancellation property; (ii) {X; γ} possesses the left cancellation property but not the right cancellation property; (iii) {X; δ} possesses the right cancellation property but not the left cancellation property; and (iv) {X; α} possesses both the left and the right cancellation property.
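A sketch (not from the text) that checks the cancellation laws of Definition 2.1.1 directly from an operation table; the tables encode the α and γ of the exercise above:

```python
def left_cancel(op, X):
    # x α y = x α z must force y = z.
    return all(op[x][y] != op[x][z]
               for x in X for y in X for z in X if y != z)

def right_cancel(op, X):
    # y α x = z α x must force y = z.
    return all(op[y][x] != op[z][x]
               for x in X for y in X for z in X if y != z)

X = ['x', 'y']
alpha = {'x': {'x': 'x', 'y': 'y'}, 'y': {'x': 'y', 'y': 'x'}}
gamma = {'x': {'x': 'x', 'y': 'y'}, 'y': {'x': 'x', 'y': 'y'}}

print(left_cancel(alpha, X), right_cancel(alpha, X))  # True True
print(left_cancel(gamma, X), right_cancel(gamma, X))  # True False
```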
In an arbitrary mathematical system {X; α} there are sometimes special elements in X which possess important properties relative to the operation α. We have:

2.1.3. Definition. Let α be an operation on X. If there exists an element e_l ∈ X such that e_l α x = x for every x ∈ X, then e_l is called a left identity element of X relative to α. If there exists an element e_r ∈ X such that x α e_r = x for every x ∈ X, then e_r is called a right identity element of X relative to α. If there exists an element e ∈ X such that

e α x = x α e = x for all x ∈ X,

then e is called an identity element of X relative to α.
2.1.5. Exercise. Let X = {0, 1}, and define the operations "+" and "·" on X by the tables

  + | 0 1      · | 0 1
  0 | 0 1      0 | 0 0
  1 | 1 0      1 | 0 1

Determine the identity elements of {X; +} and {X; ·}.
2.1.6. Theorem. Let α be an operation on X. Then
(i) if {X; α} possesses an identity element e, then e is unique;
(ii) if {X; α} possesses a left identity e_l and a right identity e_r, then e_l = e_r; and
(iii) in this case, e_r is an identity element of {X; α}.

Proof. To prove the first part, let e′ and e″ be identity elements of {X; α}. Then e′ α e″ = e′ and e′ α e″ = e″. Hence, e′ = e″.

To prove the second part, note that since e_r is a right identity, e_l α e_r = e_l. Also, since e_l is a left identity, e_l α e_r = e_r. Thus, e_l = e_r.

To prove the last part, note that for all x ∈ X we have x = x α e_r = e_r α x. ■
2.1.7. Definition. Let {X; α} possess an identity element e relative to α. If x ∈ X, an element x′ ∈ X is called a right inverse of x relative to α provided that

x α x′ = e.

An element x″ ∈ X is called a left inverse of x relative to α if

x″ α x = e.

The following exercise shows that some elements may not possess any right or left inverses. Some other elements may possess several inverses of one kind and none of the other, and other elements may possess a number of inverses of both kinds.
2.1.8. Exercise. Let X = {x, y, u, v}, and define α on X by the operation table

  α | x y u v
  x | x y x y
  y | x y y x
  u | x y u v
  v | u u v x

Note that u is an identity element of {X; α}. Verify the assertions of the preceding paragraph for this system.
2.1.9. Definition. Let α be an operation on X. We call {X; α} a semigroup if α is an associative operation on X.

Thus, if {X; α} is a semigroup, then for any x, y, u, v ∈ X,

x α (y α (u α v)) = x α ((y α u) α v) = (x α y) α (u α v) = ((x α y) α u) α v ≜ x α y α u α v.   (2.1.10)

As a generalization of the above we have the so-called generalized associative law, which asserts that if x₁, x₂, ..., xₙ are elements of a semigroup {X; α}, then any two products, each involving these elements in a particular order, are equal. This allows us to simply write x₁ α x₂ α ... α xₙ.
2.1.13. Theorem. Let {X; α} be a monoid with identity element e. If x ∈ X possesses a left inverse x″ and a right inverse x′, then x″ = x′.

Proof. We have x″ = x″ α e = x″ α (x α x′) = (x″ α x) α x′ = e α x′ = x′. ■
2.1.14. Exercise. Let X = {u, v, x, y}, and define α on X by the operation table

  α | u v x y
  u | u v x y
  v | v v u u
  x | x u v x
  y | y u x v

Use this operation table to demonstrate that Theorem 2.1.13 does not, in general, hold if the monoid {X; α} is replaced by a (non-associative) system {X; α} with identity.

By Theorem 2.1.13, any invertible element of a monoid possesses a unique right inverse and a unique left inverse, and moreover these inverses are equal. This gives rise to the following.
Theorem. Let {X; α} be a monoid. If x and y are invertible elements of X, then x α y is invertible, and (x α y)⁻¹ = y⁻¹ α x⁻¹.

Proof. We have

(x α y) α (y⁻¹ α x⁻¹) = x α (y α y⁻¹) α x⁻¹ = x α x⁻¹ = e

and

(y⁻¹ α x⁻¹) α (x α y) = y⁻¹ α (x⁻¹ α x) α y = y⁻¹ α y = e. ■
In the remainder of the present chapter we will often use the symbols "+" and "·" to denote operations in place of α, β, etc. We will call these "addition" and "multiplication." However, we strongly emphasize here that "+" and "·" will, in general, not denote addition and multiplication of real numbers but, instead, arbitrary operations. In cases where there exists an identity element relative to "+", we will denote this element by "0" and call it "zero." If there exists an identity element relative to "·", we will denote this element either by "1" or by e. Our usual notation for representing an identity relative to an arbitrary operation α will still be e. If in a system {X; +} an element x ∈ X possesses an inverse, we will denote this element by −x and we will call it "minus x." For example, if {X; +} is a semigroup, then we denote the inverse of an invertible element x ∈ X by −x, and in this case we have x + (−x) = (−x) + x = 0, and also, −(−x) = x. Furthermore, if x, y ∈ X are invertible elements, then the "sum" x + y is also invertible, and −(x + y) = (−y) + (−x). Note, however, that unless "+" is commutative, −(x + y) ≠ (−x) + (−y). Finally, if x, y ∈ X and if y is an invertible element, then −y ∈ X. In this case we often will simply write x + (−y) = x − y.
2.1.17. Example. Let X = {0, 1, 2, 3}, and let the systems {X; +} and {X; ·} be defined by means of the operation tables

  + | 0 1 2 3        · | 0 1 2 3
  0 | 0 1 2 3        0 | 0 0 0 0
  1 | 1 2 3 0        1 | 0 1 2 3
  2 | 2 3 0 1        2 | 0 2 0 2
  3 | 3 0 1 2        3 | 0 3 2 1

■
2.1.19. Theorem. Let {X; α} be a group, and let x, y ∈ X. Then
(i) if x α x = x, then x = e;
(ii) if z ∈ X and x α y = x α z, then y = z;
(iii) if z ∈ X and x α y = z α y, then x = z;
(iv) there exists a unique w ∈ X such that

w α x = y;   (2.1.20)

and
(v) there exists a unique z ∈ X such that

x α z = y.   (2.1.21)

Proof. To prove the first part, let x α x = x. Then x⁻¹ α (x α x) = x⁻¹ α x, and so (x⁻¹ α x) α x = e. This implies that x = e.

To prove the second part, let x α y = x α z. Then x⁻¹ α (x α y) = x⁻¹ α (x α z), and so (x⁻¹ α x) α y = (x⁻¹ α x) α z. This implies that y = z.

The proof of part (iii) is similar to that of part (ii).
In part (iv) of Theorem 2.1.19 the element w is called the left solution of Eq. (2.1.20), and in part (v) of this theorem the element z is called the right solution of Eq. (2.1.21).

We can classify groups in a variety of ways. Some of these classifications are as follows. Let {X; α} be a group. If the set X possesses a finite number of elements, then we speak of a finite group. If the operation α is commutative, then we have a commutative group, also called an abelian group. If α is not commutative, then we speak of a non-commutative group or a non-abelian group. Also, by the order of a group we understand the order of the set X.

Now let {X; α} be a semigroup, and let X₁ be a non-void subset of X which is closed relative to α. Then by Theorem 1.4.11, the operation α′ on X₁ induced by the associative operation α is also associative, and thus the mathematical system {X₁; α′} is also a semigroup. The system {X₁; α′} is called a subsystem of {X; α}. This gives rise to the following concept.
2.1.22. Definition. Let {X; α} be a semigroup, and let Y be a non-void subset of X which is closed relative to α. The semigroup {Y; α′}, where α′ is the operation on Y induced by α, is called a subsemigroup of {X; α}.

2.1.23. Theorem. Let {X; α} be a semigroup, and let Xᵢ ⊂ X for all i ∈ I, where I is some index set. Let Y = ⋂_{i∈I} Xᵢ be non-void. If {Xᵢ; α} is a subsemigroup of {X; α} for every i ∈ I, then {Y; α} is a subsemigroup of {X; α}.

Proof. Let x, y ∈ Y. Then x, y ∈ Xᵢ for every i ∈ I, and so x α y ∈ Xᵢ for every i ∈ I; i.e., x α y ∈ Y. Thus, Y is closed relative to α, and {Y; α} is a subsemigroup. ■

Now let 𝒴 = {Y : Y ⊂ X and {Y; α} is a subsemigroup of {X; α}}. Then 𝒴 is non-empty, since X ∈ 𝒴.

2.1.24. Theorem. Let {X; α} be a monoid with e its identity element, and let {X₁; α} be a subsemigroup of {X; α}. If e ∈ X₁, then e is an identity element of {X₁; α} and {X₁; α} is a monoid.
2.1.25. Exercise.
+012345
0012345
1104523
2 2 504
3 1
3345012
4431250
5523104
(a) Show that Z
{ 6; +} is a group.
(b) L e t K = O
{ , I}. Show that{ K ;
+} is a subgroup Of{Z6; +}.
(c) Are there any other subgroups Of{Z6; + } ?
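A sketch (not from the text) that verifies the group axioms for the table of Exercise 2.1.25 by brute force, and checks that K = {0, 1} is closed (hence a subgroup):

```python
T = [[0, 1, 2, 3, 4, 5],
     [1, 0, 4, 5, 2, 3],
     [2, 5, 0, 4, 3, 1],
     [3, 4, 5, 0, 1, 2],
     [4, 3, 1, 2, 5, 0],
     [5, 2, 3, 1, 0, 4]]
add = lambda a, b: T[a][b]
G = range(6)

assert all(add(a, add(b, c)) == add(add(a, b), c)
           for a in G for b in G for c in G)           # associativity
assert all(add(0, a) == a == add(a, 0) for a in G)     # 0 is the identity
assert all(any(add(a, b) == 0 for b in G) for a in G)  # inverses exist
assert {add(a, b) for a in (0, 1) for b in (0, 1)} == {0, 1}  # K closed
print("group axioms hold; {0, 1} is a subgroup")
```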
We have seen in Theorem 2.1.24 that if e ∈ X₁ ⊂ X, then it is also an identity of the subsemigroup {X₁; α}. We can state something further.

2.1.28. Theorem. Let {X; α} be a group with identity element e, and let {X₁; α} be a subgroup of {X; α}. Then e₁ is the identity element of {X₁; α} if and only if e₁ = e.
2.1.29. Exercise. Prove Theorem 2.1.28.

2.1.30. Theorem. Let {X; α} be a group, and let X₁ be a non-void subset of X. Then {X₁; α} is a subgroup of {X; α} if and only if
(i) e ∈ X₁;
(ii) for every x ∈ X₁, x⁻¹ ∈ X₁; and
(iii) for every x, y ∈ X₁, x α y ∈ X₁.

Proof. Assume that {X₁; α} is a subgroup. Then (i) follows from Theorem 2.1.28, and (ii) and (iii) follow from the definition of a group.

Conversely, assume that hypotheses (i), (ii), and (iii) hold. Condition (iii) implies that X₁ is closed relative to α, and therefore {X₁; α} is a subsemigroup. Condition (i) along with Theorem 2.1.24 imply that {X₁; α} is a monoid, and condition (ii) implies that {X₁; α} is a group. ■

Analogous to Theorem 2.1.23 we have:

2.1.31. Theorem. Let {X; α} be a group, and let Xᵢ ⊂ X for all i ∈ I, where I is some index set. Let Y = ⋂_{i∈I} Xᵢ. If {Xᵢ; α} is a subgroup of {X; α} for every i ∈ I, then {Y; α} is a subgroup of {X; α}.
Let M(X) denote the set of all mappings of a set X into itself, and define the operation "·" on M(X) by

α · β = α ∘ β,  α, β ∈ M(X).   (2.1.38)

Then

(α · β) · γ = (α ∘ β) ∘ γ = α ∘ (β ∘ γ) = α · (β · γ),

so that "·" is associative and {M(X); ·} is a semigroup. Now let e denote the identity mapping, e(x) = x for all x ∈ X. Then (α · e)(x) = α(e(x)) = α(x) for every x ∈ X, and similarly (e · α)(x) = e(α(x)) = α(x) for all x ∈ X. Hence, e is an identity element of {M(X); ·}, and {M(X); ·} is a monoid.

Next, we prove:
Next, we prove:
2.1.40. Theorem. Let {M(X); ·} be the semigroup of transformations on the set X. An element α ∈ M(X) has an inverse in M(X) if and only if α is a permutation on X. Moreover, the inverse of a unit α is the inverse mapping α⁻¹ determined by the permutation α.

Proof. Suppose α has an inverse α′ in M(X), so that α′ · α = α · α′ = e. If α(x) = α(y), then α′(α(x)) = α′(α(y)); i.e., e(x) = e(y), and so x = y. Therefore, α is one-to-one. Also, for each y ∈ X, α(α′(y)) = e(y) = y, so that α is onto. Hence, α is a permutation on X. Conversely, if α is a permutation on X, then the inverse mapping α⁻¹ exists, and α⁻¹ · α = α · α⁻¹ = e; i.e., α⁻¹ is the inverse of α in M(X). ■

2.1.42. Exercise. Show that the set P(X) of all permutations on X forms a group {P(X); ·}.
2.1.43. Definition. Any subgroup of the group {P(X); ·} is called a permutation group or a transformation group on X, and {P(X); ·} is called the permutation group or the transformation group on X.

Occasionally, we speak of a permutation group on X, say {Y; ·}, without making reference to the set X. In such cases it is assumed that {Y; ·} is a subgroup of the permutation group P(X) for some set X.
2.1.44. Example. Let X = {x, y, z}. There are six permutations on X, namely,

α₁ = (x y z   α₂ = (x y z   α₃ = (x y z
      x y z),       x z y),       y x z),

α₄ = (x y z   α₅ = (x y z   α₆ = (x y z
      z y x),       y z x),       z x y).

Let X₁ = {α₁, α₂} and X₂ = {α₁, α₅, α₆}. Then {X₁; ·} and {X₂; ·} are permutation groups on X. Note that {X₁; ·} is of order 2 and {X₂; ·} is of order 3. ■
.} is of order 2
Thus far we have concerned ourselves with mathematical systems consisting of a set and an operation on the set. Presently we consider mathematical systems consisting of a basic set X with two operations ~ and P defined
on the set, denoted by { X ; ~, Pl. Associated with such systems there are two
mathematical systems (called subsystems) {X;~}
and ;X {
Pl. By insisting that
the systems ;X {
}~ and ;X {
P} possess certain properties and that one of the
operations be distributive over the other, we introduce the important mathematical systems known as rings. We then concern ourselves with special types
of important rings called integral domains, division rings, and fields.
2.1.45. Definition. Let X be a non-empty set, and let α and β be operations on X. The set X together with the operations α and β on X, denoted by {X; α, β}, is called a ring if
(i) {X; α} is an abelian group;
(ii) {X; β} is a semigroup; and
(iii) β is distributive over α.

We refer to {X; α} as the group component of the ring, to {X; β} as the semigroup component of the ring, to α as the group operation of the ring, and to β as the semigroup operation of the ring. For convenience we often denote a ring {X; α, β} by X and simply refer to "ring X". For obvious reasons, we often use the symbols "+" and "·" ("addition" and "multiplication") in place of α and β, respectively. Thus, if X is a ring we may write {X; +, ·} and assume that {X; +} is the group component of X, and {X; ·} is the semigroup component of X. We call {X; +} the additive group of ring X, {X; ·} the multiplicative semigroup of ring X, x + y the sum of x and y, and x · y the product of x and y.

We use 0 ("zero") to denote the identity element of {X; +}. If {X; ·} has an identity element, we denote that identity by e. The inverse of an element x relative to "+" is denoted by −x. If x has an inverse relative to "·", we denote it by x⁻¹. Furthermore, we denote x + (−y) by x − y (the "difference of x and y") and (−x) · y by −x · y. Note that the elements 0, e, −x, and x⁻¹ are unique.

Subsequently, we adopt the convention that when operations "+" and "·" appear mixed without parentheses to clarify the order of operation, the operation should be taken with respect to "·" first and then with respect to "+". For example,

x · y + z = (x · y) + z

and not x · (y + z). Thus, we have

x · (y + z) = (x · y) + (x · z) = x · y + x · z.
2.1.46. Definition. Let {X; +, ·} be a ring. If {X; ·} has an identity element, we say that X is a ring with identity. If the operation "·" is commutative, we say that X is a commutative ring.
The reader can readily verify that the following examples are rings.
2.1.49. Exercise. Letting "+" and "·" denote the usual operations of addition and multiplication, show that {X; +, ·} is a commutative ring with identity if
(i) X = ...;
(ii) X = ...; or
(iii) X = ....
2.1.50. Exercise. Let X = {0, 1}, and define the operations "+" and "·" on X by the tables

  + | 0 1      · | 0 1
  0 | 0 1      0 | 0 0
  1 | 1 0      1 | 0 1

Show that {X; +, ·} is a commutative ring with identity.
For rings we have:

2.1.52. Theorem. If {X; +, ·} is a ring, then for every x, y ∈ X we have
(i) x + 0 = 0 + x = x;
(ii) −(x + y) = (−x) + (−y) = −x − y;
(iii) if x + y = 0, then x = −y;
(iv) −(−x) = x;
(v) x · 0 = 0 · x = 0;
(vi) (−x) · y = −(x · y) = x · (−y); and
(vii) (−x) · (−y) = x · y.

To verify (vii), for instance, note that by (vi),

(−x) · (−y) = −[(−x) · y] = −[−(x · y)] = x · y.
Now let {X; +, ·} denote a ring for which the two operations are equal, i.e., "+" = "·". Then x + y = x · y for all x, y ∈ X. In particular, if y = 0, then x + 0 = x · 0 = 0 for all x ∈ X, and we conclude that 0 is the only element of the set X. This gives rise to:

Definition. A ring consisting of the single element 0, i.e., X = {0}, is called a trivial ring.
We next introduce:
Definition. Let {X; +, ·} be a ring. If x ≠ 0 and y ≠ 0 but x · y = 0, then x and y are called zero divisors. A commutative ring X is called an integral domain if it has no zero divisors; i.e., if x · y = 0 implies that either x = 0 or y = 0.

Theorem. Let {X; +, ·} be a commutative ring, and let x ∈ X, x ≠ 0. Then X is an integral domain if and only if, for all y, z ∈ X, the following statements are equivalent:
(i) y = z;
(ii) x · y = x · z; and
(iii) y · x = z · x.

Proof. Assume that X is an integral domain. Clearly (i) implies (ii) and (iii). To show that (ii) implies (i), let x · y = x · z. Then x · (y − z) = 0. Since x ≠ 0 and X has no zero divisors, y − z = 0 or y = z. Thus, (ii) implies (i). Similarly, it follows that (iii) implies (i). This proves that (i), (ii), and (iii) are equivalent.

Conversely, assume that x ≠ 0 and that (i), (ii), and (iii) are equivalent. Let x · y = 0. Then x · 0 = x · y, and it follows that y must be zero, since (ii) implies (i). Thus, x · y ≠ 0 for y ≠ 0, and X has no zero divisors. ■
Let {X; +, ·} be a ring, and let X# = X − {0}.

Theorem. If {X#; ·} has an identity element e, then {X; +, ·} is a ring with identity.

Proof. Let X# = X − {0}, and suppose {X#; ·} has an identity element e. Let x ∈ X. If x ∈ X#, then e · x = x · e = x. If x ∉ X#, then x = 0 and 0 · e = e · 0 = 0. Therefore, e is an identity element of X. ■

If {X#; ·} is a group, then {X; +, ·} is called a division ring, and if, in addition, the operation "·" is commutative, then {X; +, ·} is called a field.
is called a
Because of the prominence of fields in mathematics as well as in applications, and because we will have occasion to make repeated use of fields, it
may be worthwhile to restate the above definition, by listing all the properties
of fields.
2.1.63. DefinitioD. L e t X be a set containing more than one element, and
let there be two operations
and "." defined on .X Then { X ;
is a
field provided that:
"+"
(i) x
(y
,x y, z
(ii)
z) = (x
X (Le.,
+
"+"
,+ .}
y)
z and x (y )z = (x y) z
and"." are associative operations);
x
y= y
x and x y = y x for all x , y
"." are commutative operations);
(Le.,
for all
"+"
and
*"
51
2.1.64. Example.
Perhaps the most widely known field is the set of real
numbers with the usual rules for addition and multiplication. _
2.1.65. Exercise. Let Z denote the set of all integers, and let "+" and "·" denote the usual operations of addition and multiplication on Z. Show that {Z; +, ·} is an integral domain, but not a division ring, and hence not a field.

The above example and exercise yield:

2.1.66. Definition. Let R denote the set of all real numbers, let Z denote the set of all integers, and let "+" and "·" denote the usual operations of addition and multiplication, respectively. We call {R; +, ·} the field of real numbers and {Z; +, ·} the ring of integers.
Next, let C denote the set of all ordered pairs of real numbers. For x = (a, b) ∈ C and y = (c, d) ∈ C, define

x + y = (a + c, b + d)

and

x · y = (ac − bd, ad + bc).

Exercise. Show that {C; +, ·} is a field.
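A sketch (not from the text): the pair operations above reproduce complex arithmetic, which Python provides natively, so the two can be checked against each other.

```python
def add(x, y):
    (a, b), (c, d) = x, y
    return (a + c, b + d)

def mul(x, y):
    (a, b), (c, d) = x, y
    return (a*c - b*d, a*d + b*c)

x, y = (1.0, 2.0), (3.0, -1.0)
assert mul(x, y) == (5.0, 5.0)   # (1 + 2i)(3 - i) = 5 + 5i
assert complex(*mul(x, y)) == complex(1, 2) * complex(3, -1)
```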
2.1.69. Exercise. Let Q denote the set of rational numbers, let P denote the set of irrational numbers, and let "+" and "·" denote the usual operations of addition and multiplication on P and Q.
(a) Discuss the system {Q; +, ·}.
(b) Discuss the system {P; +, ·}.
52
where a, b, c, d and m, n, p, q
onMby
E R.
"+"
and "."
and
u v=
0[ bJ.
C
[m
nJ = 0[ . m +
c m + d p
pO n +
c.n+ d q
.Jq
(Note that in the preceding the operations + and defined on M are entirely
different from the operations + and for the field R.)
(a)
(b)
(c)
(d)
Show that M
{ ; +}
{ ; +}
Show that M
{ ; ,+
Show that M
{ ; ,+
Show that M
is a monoid.
is an abelian group.
.} is a ring.
.} has divisors of ez ro.
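A sketch (not from the text) for part (d): two nonzero 2 × 2 matrices whose product, under the multiplication rule above, is the zero matrix.

```python
def mat_mul(u, v):
    # u . v per the rule above; matrices are pairs of rows.
    (a, b), (c, d) = u
    (m, n), (p, q) = v
    return ((a*m + b*p, a*n + b*q), (c*m + d*p, c*n + d*q))

u = ((1, 0), (0, 0))
v = ((0, 0), (0, 1))
zero = ((0, 0), (0, 0))
print(mat_mul(u, v) == zero)   # True, although u != 0 and v != 0
```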
2.1.73. Exercise.
Before, we characterized a trivial ring as a ring for which the set X consists only of the 0 element. In the case of subrings we have: a subring {Y; +, ·} of {X; +, ·} is said to be trivial if either Y = {0} or Y = X.

For subdomains we have:

2.1.77. Theorem. Let X be an integral domain, and let Y be a non-trivial subring of X. Then Y is a subdomain of X.

Proof. Let x, y ∈ Y, and let x · y = 0. Since x, y ∈ X, x and y cannot be zero divisors. Thus, Y has no zero divisors. ■
For subfields we have:

Exercise. Let {Yᵢ; +, ·}, i ∈ I, be subfields of {X; +, ·}. Then {Y; +, ·}, where Y = ⋂_{i∈I} Yᵢ, is a subfield of {X; +, ·}.

Now let {X; +, ·} be a ring, and let W ⊂ X. Let 𝒴 = {Y : W ⊂ Y ⊂ X and {Y; +, ·} is a subring of {X; +, ·}}. Then 𝒴 is non-empty, because X ∈ 𝒴. Also, let R = ⋂_{Y∈𝒴} Y. Then W ⊂ R, and {R; +, ·} is a subring of {X; +, ·}. This subring is called the subring generated by W.
2.1.83. Example. Let {X; +, ·} be a ring with identity, and let R be a subring of X with e ∈ R. By defining ρ: R × X → X as ρ(α, x) = α · x, it is clear that X is an R-module. In particular, if R = X, we see that any ring with identity can be made into a module over itself. ■

For modules we have:

2.1.84. Theorem. Let X be an R-module. Then for every α ∈ R and x ∈ X we have
(i) α0 = 0;
(ii) α(−x) = −(αx);
(iii) 0x = 0; and
(iv) (−α)x = −(αx).

Proof. To prove the third part, observe that for 0 ∈ R we have 0 + 0 = 0. Hence, (0 + 0)x = 0x + 0x = 0x, and therefore 0x = 0.

To prove the last part, note that since α + (−α) = 0 it follows that (α + (−α))x = 0x = 0. Therefore, αx + (−α)x = 0, and (−α)x = −(αx).
2.1.85. Definition. Let {F; +, ·} be a field, and let {X; +} be an abelian group. If X is an F-module, then X is called a vector space over F.

The notion of vector space, also called linear space, is among the most important concepts encountered in mathematics. We will devote the next two chapters and a large portion of the remainder of this book to vector spaces and to mappings on such spaces.
Denote elements x, y of Rⁿ by x = (x₁, ..., xₙ) and y = (y₁, ..., yₙ). Define the operation "+" on Rⁿ by

x + y = (x₁ + y₁, ..., xₙ + yₙ),

and define multiplication of x ∈ Rⁿ by a scalar α ∈ R by

αx = (αx₁, ..., αxₙ).

2.1.87. Exercise. Show that Rⁿ is a vector space over R.
We also have:
2.1.88. Theorem. Let {F; +, ·} be a field, and let Fⁿ = F × ... × F be the n-fold Cartesian product of F. Denote the element x ∈ Fⁿ by x = (ξ₁, ξ₂, ..., ξₙ), and define the operation "+" on Fⁿ by

x + y = (ξ₁ + η₁, ..., ξₙ + ηₙ)

for all x, y ∈ Fⁿ, where y = (η₁, ..., ηₙ). Also, define multiplication by a scalar α ∈ F by

αx = (αξ₁, ..., αξₙ)

for all α ∈ F and x ∈ Fⁿ. Then Fⁿ is a vector space over F.

2.1.89. Exercise. Prove Theorem 2.1.88.
x Y
x z
(ot P)(x
x ;z
Y ;z and
y)
for all x, ,Y z E X and for all ot, P E .F Then, X is called an algebra over .F
If, in addition to the above axioms, the binary operation of multiplication is
57
,+ .}
,+ .}
= /[ QX
/Xb.J
tXd
/Xc
for every x , y, Z
y . (z x)
(x y) =
algebra
(2.1.94)
L e t us now consider some specific cases of Lie algebras. Our first exercise
shows that any associative algebra can be made into a iL e algebra.
2.1.95. Exercise.
Let R be an associative algebra over ,F
operation "." on R by
x . y= x y - y x
for all x, y E R (where" " is the operation on the associative algebra R
over F ) . Show that R with "." defined on it is a iL e algebra.
2.1.96. Example.
In Exercise 2.1.70 we showed that the set of 2 x 2
matrices forms a ring but not a field, and in Exercise 2.1.92 we showed that
this set forms an algebra over ,F the field of real numbers. This set can be
made into a Lie algebra by Exercise 2.1.95.
k = (0,0,1)
j = (0,1,0)
2.1.98.
iF gure A. Unit
0
-k
j
k
0
k
-j
-i
x
j
i.e., "X " denotes the usual "cross product," also called "outer product."
encountered in vector analysis. Show that X is a iL e algebra.
eL t us next consider submodules.
2.1.99. Deflnition. eL t {R; ,+ .J be a ring with identity, and let { X ; } +
be an abelian group, where X is an R-module. eL t { Y ; + J be a subgroup of
{X;
.J If Y is an R-module, then Y is called an R-submoclole of .X
be a non-empty
Proof We give the sufficiency part of the proof and leave the necessity part
as an exercise.
Let , pER and let x E .Y Then x, px , (<< + P)x E Y by hypothesis
(ii). Since iY s a group, it follows that x + py E Yand since x E X we have
59
Exercise.
We next introduce the notion of vector subspace, also called linear subspace.
2.1.102. DefinitioD. Let F be a field, and let X be a vector space over .F
Let Y be a subset of .X If Y is an F - s ubmodule of ,X then Y is called a vector
snbspace.
L e t us consider some specific cases.
2.1.103.
for i
E
R}
Example.
L e t R be a ring, let X
is an R-submodule of
.X
be an R- m odule,
given by { x
:X
and let {x
x
ii
(1,{,{x
2.1.104.
Example.
Let F be a field, and let P be the vector space of ntuples over .F Let IX = (1,0, ... ,0) and lX = (0,1,0, ... 0). Then IX '
lX E F~. Let Y = (x E P: X = I X I + lX l , .. 1 E .}F Then iY s a vector
subspace. We see that jf X E ,Y then x is of the form x = ( 1,1. (1,1,0, ... ,0).
We next prove:
2.1.105. Theorem. Let X be an R-module, and let Y' denote a family of
R-submodules of ,X i.e., /Y is a submodule of X for every {Y E ,Y '
where
i E I and I is some index set. Let Y =
.{Y Then Y is an R-submodule of
.X
{el
Proof
cy =
()
rely
{Y:
W c: Y
and Y i s
c: X
an R-submodule of X}.
eL t us next prove:
2.1.107. Theorem. eL t X be an R-module, and let
(Y x I ' . . , ,x ,) denote the subset of X given by
(Y x
l,
Then (Y x
,x,,)
I'
{x
X:
=
x
IX
+ ... +
"x"'
,X"
.X
R}.
lt
eL t
,x ,) is an R-submodule of .X
+ ... +
a(IXlx
+ ... +
IX",X ,)
+ ...
R,
a IXIX
+ ... +
+ ...
l,
.Y
, ,x ,)
belongs to the family cy of Definition 2.1.106
... , ,x ,), in which case n Y , = (Y x l , , ,x ,). This
2.1.108. Deftnition. eL t
I'
IX""X
l ,
r'.E.y
leads to:
(Y x
(Y x
+a
X, and let
IX""X , lX I ' . . ,IX " E R}. Then
generated by x I ' . . . , "x .
... ,X "
61
x
implies that ~I
a basis for .X
D.
+ ... +
~,x,
= PI
for
is called
Overview
We conclude this section with the flow chart of Figure D, which attempts
to put into perspective most of the algebraic systems considered thus far.
Integral domain
Commutative ring
Module
Associative algebra
2.1.111.
Commutative algebra
2.2.
O
H MOMORPHISMS
(X
y)
(2.2.2)
p(x)P p (y)
p
p
x
2.2.3.
{ Y ; fJ.}
Figure C. Homomorphism
or semigroup { X ;
~}
into semigroup
{Y;
62
2.2. oH momorphisms
now assumes the form
p(x
y)
p(x )
(2.2.4)
p(y)
for every ,x y E .X This relation looks very much like the "linearity property"
which will be the central topic of a large portion of the remainder of this
book, and with which the reader is no doubt familiar. oH wever, we emphasize
here that the definition of "linear" will be reserved for a later occasion, and
that the term homomorphism is not to be taken as being synonymous with
linear. Nevertheless, we will see that many of the subsequent results for
homomorphisms will reoccur with appropriate counterparts throughout the
this book.
2.2.5. Example. Let R denote the set of real numbers, and let "+" and
"." denote the usual operations of addition and multiplication on R. Then
R
{ ; + } and {R; .} are semigroups. eL t
f(x ) =
for all x
e"
to {R; .} .
2.2.6. Exercise. eL t ;X {
+}
and { X ; }
denote the semigroups defined
in Example 2.1.17. L e tf: X - + X b e defined as follows:f(O) = 1,/(1) = 3,
f(2) = I, and f(3) = 3. Show that f is a homomorphism from ;X{
+ } into
{ X ; .} .
In order to simplify our notation even further, we will often use the
symbol"." in the remainder of the present chapter to denote operations for
semigroups (or groups), say { X ; ' } , { Y ; ,J and we will often refer to these
simply as semigroup (or group) X and ,Y respectively. In this case, if p
denotes a homomorphism of X into Y we write
p(x y)
p(x ) p(y)
for all x , y E .X
In Chapter I we classified mappings as being into, onto, one-to-one and into,
and one-to-one alld onto. Now if p is a homomorphism of a semigroup X
into a semigroup ,Y we can also classify homomorphisms as being into, onto,
one-to-one and into, and one-to-one and onto. This classification gives rise
to the following concepts.
2.2.7. Definition. eL t
semigroup .Y
p be a homomorphism of a semigroup X
into a
Chapter 2
64
I Algebraic Structures
we say that
We note that since all groups are semigroups, the concepts introduced
in the above definition apply necessarily also to groups.
In connection with isomorphic semigroups (or groups) a very important
observation is in order. We first note that if a semigroup (or group) X is
isomorphic to a semigroup ,Y then there exists a mapping p from X into Y
which is one-to-one and onto. Thus, the inverse of p, p- I, exists and we can
associate with each element of X one and only one element of ,Y and vice
versa. Secondly, we note that p is a homomorphism, eL ., p preserves the
properties of the respective operations associated with semigroup (or group)
X and semigroup (or group) Y o r, to put it another way, under p the (algebraic) properties of semigroups (or groups) X and Y a re preserved. eH nce,
it should be clear that isomorphic semigroups (or groups) are essentially
indistinguishable, the homomorphism (which is one-to-one and onto in this
case) amounting to a mere relabeling of elements of one set by elements of
a second set. We will encounter this type of phenomenon on several other
occasions in this book.
We are now ready to prove several results.
2.2.8. 1beorem. eL t
semigroup .Y Then
Xl
{x
:X
p(x )
I}
is a subsemigroup of .X
Proof. To prove the first part we must show that the subset p(X ) of Y
is closed relative to the operation" .. on .Y Now if x', y' E p( X), then there
exists at least one x E X and at least one y E X such that p(x) = x ' and
p(y) = y'. Since p is a homomorphism, we have
x'
y'
p(x) p(y) =
p(x y),
2.2. oH momorphisms
and since x y E X it follows that x ' y' E p(X ) because p(x y) E P(X ) .
Thus, p(X ) is closed and, hence, is a subsemigroup of .Y
To prove the second part, note that since e E X we have pee) E p(X ) ,
and since for any x ' E p(X ) there exists x E X such that p(x ) = x ' , we have
p(e) x '
pee) p(x ) =
p(e x ) =
p(x )
x'.
Since this is true for every x ' E p(X ) , it follows that p(e) is a left identity
element of p(X ) . Similarly, we can show that x ' pee) = x ' for every x '
E p(X ) . Thus, p(e) is an identity element of the subsemigroup p(X ) of .Y
To prove the third part of the theorem, note that since p is a homomorphism, we have
p(x ) p(x -
and
p(x -
I)
I)
p(x )
p(x
I-X )
p(x -
x)
p(e),
p(e);
i.e., p(e) is an identity element of p(X ) . Also, since p(x - I ) E P(X ) , p(x ) has
an inverse in P(X ) , and P
[ (X)I- ]
= p(x - I).
The proof of parts (iv) and (v) of this theorem are left as an exercise.
_
2.2.9. Exercise.
66
Proof To prove the first part, let e denote the identity element of .X By
part (i) of Theorem 2.2.8, p( X ) is a subsemigroup of ;Y by part (ii) of Theorem
2.2.8, p(e) is an identity element of p(X ) ; and by part (iii) of the same theorem,
it follows that every element of p(X ) has an inverse. Thus, p(X ) is a subgroup
of .Y
The second part of this theorem follows from Theorem 2.1.28 and from
part (ii) of Theorem 2.2.8.
The following result is known as Cayley' s theorem.
2.2.14. Theorem. L e t ;X{
.} be a group, and let { P (X ) ; .} denote the
permutation group on .X Then X is isomorphic to a subgroup of P(X ) .
Proof To prove the first part of the theorem, let x', y' E P(X). Then there
exist uniq u e ,x y E X such that p(x) = x ' and p(y) = y', and p- I (X ' )
= x
and p- I (y' ) = y. Since
p(x y)
we have
p- I (X '
y' )
p(x) P(Y)
x y
p- I (X ' )
x'
.v',
p- I (y' ) .
67
2.2. oH momorphisms
Since this is true for all x ' , y' E p(X ) , it follows that p- I is an isomorphism
of p( X ) with .X
To prove the second part of the theorem we first note that P(X) is a
subsemigroup of Y by Theorem 2.2.8. It follows from Theorem 2.2.13 that
e =
I(e') is an identity element of .X Now let p(k) = e'. Since p(e) = e',
it follows that k = e and that K, = e{ .} We can similarly show that K , .
e{ .} '
2.2.16. Theorem.
itself.
into
(i) If" and IjI are endomorphisms of X , then the composite mapping
IjI 0 " is likewise an endomorphism of .X
(ii) If" and IjI are automorphisms of X , then IjI 0 " is an automorphism
of .X
(iii) If" is an automorphism of ,X then
is also an automorphism
of .X
,,-1
Proof To prove the first part, note that" and IjI are both mappings of
X into X, and thus IjI 0 " is a mapping of X into .X
Also, by definition,
(1jI 0 ' l )(x ) = 1jI('l(x
for every x E X. Now since ' l (x ' y) = 'lex) ' l (y)
and ljI(x y) = ljI(x) IjI(Y) for every x , y E X , we have
IjI 0 ,,(x y)
1jI('l(x y
(1jI
'l(x
1jI('l(x)
(1jI
' l (Y
1jI(' l (x
1jI(' l (x
' l (y .
2.2.17. Exercise.
"+"
y)
y)
p(x )
p(x )
p(y); and
p(y)
into
68
into ,Y
denoted by P(X ) ,
is called the
into a ring .Y
XI = {x
E :X
p(x ) E Y t l
is a subring of .X
(iv) eL t Z be a ring and let f/I be a homomorphism of Y into Z. Then
the composite mapping f/I 0 p is a homomorphism of X into Z.
Proof To prove the first part of the theorem we note that the homomorphic
image p(X ) is clearly the homomorphic image of the group { X ; + } and of
the semigroup { X ; .). Since this homomorphic image is a subgroup of
{ Y ; + ) and subsemigroup of { Y ; .), it follows from Theorem 2.1.72 that
P(X ) is a subring of .Y
The proofs of the remaining parts of this theorem are left as an exercise .
2.2.20. Exercise.
z{
X:
p(z) =
into a ring ,Y
O}
(i) f(u +
(ii) f(rt.u)
1)
f(u)
rt.f(u)
=
69
f(v); and
hold.
In the next chapter we will consider in great detail a special class of
vector spaces and homomorphisms, and for this reason we will not pursue
this subject any further at this time.
2.3. APPLICATION
TO POLYNOMIALS
00
alt
+ ... +
a.t.
we are not looking for a way of defining the value of f(t) for each
t, but instead we seek a definition offin terms of the indexed set 0{ o, ... , an}.
To this end we let the a, belong to some field.
More formally, let F be a field and define a set P as follows. If a E P,
then a denotes an infinite sequence of elements from F in which all except a
finite number are ez ro. Thus, if a E P, then
a = 0{ o, a .. ... ,an' 0, 0, ...}.
We say that a
on P by
"+"
0 such that a/
0+
b=
a{ o
bo, 0 1
b.. ... .J
a b=
where
C"
c = c{ o,
~
CI , .
,J
" a,b,,_,
t:'o
for all k. In this case c" = 0 for all k> m + n, and P is also closed with
respect to the operation" " . Now let us define
Then 0
P and { P ; + }
0= O
{ , 0, ....J
is clearly an abelian group with identity O. Next,
70
define
... .J
e= { I ,O,O,
,+ .
Exercise.
Let us next complete the connection between our abstract characteriz a tion
of polynomials and with the function f(t) we originally introduced. To this
end we let
to= { I ,O,O,
= O{ , I, 0, 0,
t\
t'1. =
O
{ ,O, 1,0,
t3 =
O
{ ,O,O, I,O,
J
J
At this point we still cannot give meaning to a,t', because a, E F and t' E P.
However, if we make the obvious identification a{ " 0,0, ... J E P, and if
we denote this element simply by a, E P, then we have
f(t)
a o to
a\ t\
+ ... +
a t .
ao
a\ t
+ ... +
a"r.
,+ .}
* 0,
of
n
n,
2.3.
71
Application to Polynomials
2.3.5. Theorem. L e tf(t) be a polynomial of order n and let get) be a polynomial of order m. Then f(t)g(t) is a polynomial of order m + n.
Proof
+ ... +
L e tf(t) = f o
fit
= f(t)g(t). Then
go
glt
+ ... +
g.r,
Since It = 0 for i > nand gJ = 0 for j > m, the largest possible value of k
such that hk is non-zero occurs for k = m n; eL .,
hm+n
/"gm'
*-
O.
Proof
where e =
e,
f(t)f- 1 (t) is m
n. Thus, m
havem = n = O.
Conversely, let f(t) = fo =
= fo 1 = { f o , 0, 0, ... J .
f{ o, 0, 0, ... ,J where fo
*-
O. Then f- I (t)
+ ...
Chapter 2
72
I Algebraic Structures
= (q t)g(t) + ret),
(2.3.10)
where either ret) = 0 or deg ret) < deg get).
Proof If f(t) = 0 or if degf(t) < deg get), then Eq. (2.3.10) is satisfied with
q(t) = 0, and ret) = f(t). Ifdegg(t) = 0, Le.,g(t) = c, thenf(t) = c[ - I f(t)]
C, and Eq. (2.3.10) holds with q(t) =
c- I f(t) and ret) = O.
f(t)
Assume now that deg f(t) > deg get) > 1. The proof is by induction on
the degree of the polynomial f(t). Thus, let us assume that Eq. (2.3.10)
holds for deg f(t) = n. We first prove our assertion for n = 1 and then for
n + I.
Assume that deg f(t) = I, eL ., f(t) = a o + alt, where a l O. We need
only consider the case g(t) = b o + bit, where b l O. We readily see that
Eq. (2.3.10) is satisfied withq ( t) = alb. 1 and ret) = a o - alb.lb o'
Now assume that Eq. (2.3.10) holds for degf(t) = k, where k = 1, ... ,
n. We want to show that this implies the validity of Eq. (2.3.10) for
degf(t) = n + I. Let
*"
f(t) =
ao +
alt
+ ... +
*"
a"+lt"+I,
where a,,+ I 1= = O. Let deg get) = m. We may assume that 0 < m < n + I. Let
g(t) = bo + bit + ... + b",t"', where b", O. It is now readily verified that
f(t)
*"
[ f (t) -
(2.3.11)
f(t)
= b[ ;.la"t"+I"'-
s(t)]g(t)
ret).
Thus, Eq. (2.3.10) is satisfied and the proof of the existence of ret) and q(t)
is complete.
The proof of the uniqueness of q(t) and ret) is left as an exercise. _
2.3.12. Exercise.
73
Next. we prove:
2.3.14. Theorem. eL t t[ F ]
denote the ring of polynomials over a field .F
eL t f(t) and g(t) be nonzero polynomials in t[ F .]
Then there exists a unique
monic polynomial. d(t). such that (i) d(t) divides f(t) and g(t). and (ii) if
d'(t) is any polynomial which divides f(t) and g(t), then d'(t) divides d(t).
Let
Proof.
t[ K ]
{x(t)
t[ F :]
x ( t)
m(t)f(t) +
t[ F l}.
m(t)a(t)d'(t)
= m
[ (t)a(t)
n(t)b(t)d'(t)
n(t)b(t)]d(' t).
This implies that d'(t) divides d(t) and completes the proofofthe theorem.
m(t)f(t) +
n(t)g(t).
m(t)f(t) +
n(t)g(t).
74
cl)(t -
C1)' .. (t -
c.),
where c, C l , ,C. E C.
(ij) If F = R, then f(t) can be written uniquely, except for order, as a
product
f(t) = cfl(t)f1(t) . . .f",(t),
where C E R and the fl(t), ... ,/",(t) are monic irreducible polynomials of degree one or two.
2.4.
REFERENCES
AND NOTES
REFERENCES
2[ .1]
2[ .2]
2[ .3]
2[ .4]
2[ .5)
2[ .6)
G. BIRKO
H F
and S.MACLANE, A Survey of Modern Algebra. New York:
The Macmillan Company, 1965.
C. B. A
H NNEKEN, Introduction to Abstract Algebra. Belmont, Calif.: Dickenson Publishing Co., Inc., 1968.
S. T. Hu, Elements ofModern Algebra. San rF ancisco, Calif.: oH lden-Day,
Inc., 1965.
N. A
J COBSON, eL ctures in Abstract Algebra. New York: D. Van Nostrand
Company, Inc., 1951.
S. LIPSCHUTZ,
iL near Algebra. New York: McGraw-iH ll Book Company,
1968.
N. .H McCoY, uF ndamentals of Abstract Algebra. Boston: Allyn & Bacon,
Inc., 1972.
3.1.
IL NEAR
SPACES
"+ ..
75
76
y= y
x for every ,x y EX ;
(i) x
(ii) x
(y
)z = (x + y) + z for every ,x y, Z E X ;
(iii) there is a unique vector in ,X called the ez ro vector or the Dull
vector or the origiD, which is denoted by 0 and which has the property that 0 x = x for all x EX ;
(iv) IX(X
y) = IXX
IXy for all IX E F and for all ,x y E X ;
(v) (IX
p)x = IXX
px for all IX, p E F and for all x E X ;
(vi) (IXP)X = IX(PX) for all IX, p E F and for all x E ;X
(vii) Ox = 0 for all x E X ; and
(viii) Ix = x for all x E .X
The reader may find it instructive to review the axioms of a field which
are summarized in Definition 2.1.63. In (v) the "+" on the left-hand side
on the right-hand side
denotes the operation of addition on F ; the
denotes vector addition. Also, in (vi) IXP I!. IX p, where "." denotes the
operation of mulitplication on .F In (vii) the symbol 0 on the left-hand side is
a scalar; the same symbol on the right-hand side denotes a vector. The I
on the left-hand side of (viii) is the identity element of F r elative to ".".
To indicate the relationship between the set ofvectors X and the underlying
field ,F we sometimes refer to a vector space X over field .F oH wever, usually
we speak of a vector space X without making explicit reference to the field F
and to the operations of vector addition and scalar multiplication. If F is
the field of real numbers we call our vector space a real vector space. Similarly,
if F is the field of complex numbers, we speak of a complex vector space.
Throughout this chapter we will usually use lower case Latin letters (e.g.,
,x y, )z to denote vectors (Le., elements of X ) and lower case Greek letters
(e.g., IX, p, )') to denote scalars (Le., elements of F ) .
If we agree to denote the element (- l )x E X simply by - x , eL ., (- l )x
I!. - x ,
then we have x - x = Ix + (- l )x = (l - l)x = Ox = O. Thus,
if X is a vector space, then for every x E X there is a unique vector, denoted
-x,
such that x - x = O. There are several other elementary properties of
vector spaces which are a direct consequence of the above axioms. Some of
these are summarized below. The reader will have no difficulties in verifying
these.
"+"
77
IX(X
(vi) (IX
(vii) x
y) =
fJ)x
y=
3.1.3. Exercise.
IXX
IXX
IX}';
-
px; and
-
0 implies that x
-yo
Vector x
x
x
x+y
"fY
Vector x + y
3.1.5.
av
.,
.
y
($y
Vector y
Vector av, O< a < l
Vector ($y, fj > 1
Vector "fY, O
<'}
iF gm'e
The reader can readily verify that, for the space described above, all the
axioms of a linear space are satisfied, and hence X is a vector space. _
The purpose of the above example is to provide an intuitive idea of a
linear space. We wiJ) utilize this space occasionally for purposes of motivation
in our development. We must point out however that the terms "plane" and
"arrows" were not formally defined, and thus the space X was not really
properly defined. In the examples which follow, we give a more precise formulation of vector spaces.
78
3.1.6. Example.
3.1.7. Example.
e,
(el'
.0.
and
(Xx
e2" .. ,elt)
(X(el'
"+
.0.
eX l'
e 2"
.. ,elt)'
(3.1.8)
(3.1.9)
"+
3.1.10. Example.
3.1.11. Example.
eL t
numbers of the form
X
let F denote the field of real numbers, let vector addition be defined similarly
as in Eq. (3.1.8), and let scalar multiplication be defined similarly as in Eq.
(3.1.9). It is again an easy matter to show that this space is a vector space.
We point out that this space, which we denote by R- , is simply the collection
79
of all infinite sequences; eL ., there is no req u irement that any type of convergence of the sequence be implied. _
3.1.13. Example.
L e t X = c~ denote the set of all infinite sequences of
complex numbers of the form (3.1.12), let F represent the field of complex
numbers, let vector addition be defined similarly as in Eq. (3.1.8), and let
scalar multiplication be defined similarly as in Eq. (3.1.9). Then C~ is a
vector space. _
L e t X denote the set of all sequences of real numbers
3.1.14.
Example.
having only a finite number of non- z e ro terms. Thus, if x E ,X then
(3.1.15)
for some positive integer I. If we define vector addition similarly as in Eq.
(3.1.8), if we define scalar multiplication similarly as in Eq. (3.1.9), and if we
let F be the field of real numbers, then we can readily show that X is a real
vector space. We call this space the space of finitely non-zero sequences.
If X denotes the set of all sequences of complex numbers of the form
(3.1.15), if vector addition and scalar multiplication are defined similarly as
in eq u ations (3.1.8) and (3.1.9), respectively, then X is again a vector space
(a complex vector space). _
L e t X be the set of infinite sequences of real numbers
3.1.16. Example.
of the form (3.1.12), with the property that Iim~.
= O. If F is the field of real
numbers, if vector addition is defined similarly as in Eq. (3.1.8), and if scalar
multiplication is defined similarly as in Eq. (3.1.9), then X is a vector space.
This is so because the sum of two sequences which converge to z e ro also
converges to zero, and because the scalar multiple of a sequence converging
to z e ro also converges to zero.
_
3.1.17. Example.
L e t X be the set of infinite sequences of real numbers
of the form (3.1.12) which are bounded. If vector addition and scalar multiplication are again defined similarly as in (3.1.8) and (3.1.9), respectively,
and if F denotes the field of real numbers, then X is a vector space. This
space is called the space of bounded real sequences.
There also exists, of course, a complex counterpart to this space, the
_
space of bounded complex sequences.
3.1.18. Example.
00
1= 1
I~II
<
00.
Let
be the field
of real numbers, let vector addition be defined similarly as in (3.1.8), and let
scalar multiplication be defined similarly as in Eq. (3.1.9). Then X is a vector
space. _
Chapter 3
80
and
yX t )
(lIx X t )
x ( t)
lIx{t)
=
(3.1.20)
(3.1.21)
s:
Ix ( t)\
dt
<
00,
81
3.2.
IL NEAR
SUBSPACES
3.2.6. Exercise.
81
Chapter 3
3.1.7.
iF gure B
Y+Z
3.2.9.
3.2.
83
3.2.10. Theorem.
Then their sum, Y
3.2.11. Exercise.
{ ,J
Now let Y and Z be linear subspaces of a vector space .X If Y n Z = O
we say that the spaces Y and Z are disjoint. We emphasize that this terminology is not consistent with that used in connection with sets. We now
have:
3.2.12. Theorem. Let Y and Z be linear subspaces of a vector space .X
Then for every x E Y + Z there exist unique elements Y E Y and Z E Z
Z if and only if Y n Z =
O
{ .J
such that x = Y
2Y
Let
Y a nd
EB ,X .
(y, z),
84
where y E Y a nd
Z E
(<y< ,
(3.2.15)
)z ,
= fey, 0): yE
Y},
Z'
Z} ,
and
({ O, z): z
Then Y ' and Z' are linear subspaces of V and V = Y' EB Z' . By abuse of
notation, we frequently express this simply as V = Y EB Z.
Once more, making use of Example 3.1.4, let Y and Z denote two lines
intersecting at the origin 0, as shown in F i gure D. The direct sum of linear
subspaces Y a nd Z is in this case the "entire plane."
3.2.16.
Figure D
and let
z =
+
X
Do
z{
X:
=
x
8S
.Y Y
}Y
3.2.18.
3.3.
iF gure E
IL NEAR INDEPENDENCE,
AND DIMENSION
BASES,
Throughout the remainder of this and in the following chapter we use the
following notation: ({ IX . .. ,(X,,},
(X, E ,F denotes an indexed set of scalars,
and IX{ '
... ,x ,}. ,X E .X denotes an indexed set of vectors.
(XIIY
(X22Y
+ ... +
(X"y".
(3.3.2)
In Eq. (3.3.2) vector addition has been extended in an obvious way from
the case of two vectors to the case of n vectors. In later chapters we will
consider linear combinations which are not necessarily finite. The represen-
86
tation of x in Eq. (3.3.2) is, of course, not necessarily unique. Thus, in the
case of Example 3.1.10, if X = RZ and if x = (1, 1), then x can be represented
as
or as
x =
PIZI
pzz
2(! , 0)
3(0,! ) ,
2Y
Z2
(0,
X"
(1,1)
M
...-
.......Z, ..
(!'
...... ,Y
= ( 1,0)
0)
3.3.3.
iF gure F
3.3.4.
Y = IXIIY
+ IX.}'Yz
+ ... + IX",Y .. ,
where m may be any positive integer. Then V( Y) is a linear subspace of .X
3.3.5. Exercise.
3.3.8.
87
3.3.9. Exercise.
+ ... +
IX",X",
(3.3.11)
ru,
v
u
o
3.3.12.
tors.
iF gure .H
3.3.13. Exercise.
88
,x (t)
=
x i t) =
I', i=
I, ... ,n.
L e tY
= x{ o, X I "' " x 8 }. Then V( )Y
of degree less than or equal to n.
V( Y )
(c)
eL t oz (t)
1 for all I
I)
Ef> . . Ef>
V(X.).
a[ , b) and let
Zk(t) =
+ ... +
Ik
3.3.14.
Theorem. eL t
"'
If I::
a vector space .X
"',X
X 1 "' "
"' P,x"
= I::
I- '
If,~
"x ,}
Therefore "" =
P,
,~ "'
"x ,}
Zl"' "
.z }
P, for all i =
then "" =
1, 2, ... , m.
P,x, then ,~ "' ("" - P,)x, = O. Since the set .x{ , ... ,
is linearly independent, we have ("', - P,) = 0 for all i = 1, ... ,m.
Proof.
"' "',,x =
I- '
{XI'
oz{ ,
for all i.
,x",}
1,
,x
Proof.
"' . X
"' I X .
+ "',.+ IX +
""- I X ' - I
+ ... + "'",x..
+ ... +
"' , _ . X ' _ I
(- l )x ,
"',.+ ,x .+
in a linear space X
i ~ m, we can find
+ ... +
"'."X ,
(3.3.16)
O.
"'x z
+ ... +
"'",x", = O.
(3.3.17)
Suppose that index i is chosen such that "" 1= = O. Rearranging Eq. (3.3.17) to
"'IX
89
II,X ,
+ ... +
I- I
II' - I X
III+ I X I+ I
+ ... +
II.X " ,
(3.3.18)
PIX I
where P" =
proof. _
P1.X1.
-11"/11,,
= I,
P' _ I X / _
,i -
I, i
PI+ I X / +
+ ... +
3.3.20. Exercise.
3.3.21. Exercise.
L e t Y be a finite set in a linear space .X Show that Y
is linearly independent if and only if there is no proper subset Z of Y such
that V(Z) = V( )Y .
A concept which is of utmost importance in the study of vector spaces is
that of basis of a linear space.
3.3.22. Definition. A set Y
or simply a basis, for X if
in a linear space X
is called a Hamel
itself; eL .,
V( Y )
basis,
.X
Exercise.
{XI'
X 1 .,' "
,x , ,}
(lIX
+ ... +
II"X " .
90
can be expressed as
=
X
and
Then
x
(- x )
lI"X"
... - P.x.)
(lIIX I
lIlX I
(lIl -
PI)X I
lI"X"
+ ... +
+ ... +
(lI" -
lI.X.
lI.X.)
(- P IX
+ ... +
P")x,,
I-
P"x"
P.)x.
(ll. -
O.
Since the vectors x I, "x , ...' ,X. form a basis for ,X it follows that they
are linearly independent, and therefore we must have (lI, - P,) = 0 for
i = 1, ... ,n. From this it follows that III = PI' lI" = P", ... ,lI" = p".
We also have:
3.3.26. Theorem. eL t IX{ ' "X , ... ,x . } be a basis for vector space ,X and
let {YI' ... IY' II} be any linearly independent set of vectors. Then m < n.
Proof. We need to consider only the case m > n and prove that then we
actually have m = n. Consider the set of vectors IY{ ' X I "" ,x.l. Since the
vectors XI' ... ,X . span ,X IY can be expressed as a linear combination of
them. Thus, the set {YI' X I > ' "
,x.l is not linearly independent. Therefore,
there exist scalars PI' lIl> ... , lI., not all ez ro, such that
PIYI
If all the
lI, are
lIlx l
zero, then PI
+ ... +
lI"X.
I
(3.3.27)
O.
*'
*'
"x
(- l l)Y I
+ (~~I)XI
+ ... +
(- : :- I )X . _
I.
X = ~IXI
+ ... + ~.x .
Substituting (3.3.28) into the above expression we note that
(3.3.28)
Since
'IXI
"Y I
"1
Z' X
"tXI
+
z
,.[(I[ - )Yt
+
91
(-::-I)X._
+ ... +
t]
".- I X . _ I '
where" and are defined in an obvious way. In any case, every x E X can
be expressed as a linear combination of the set of vectors y{ t, X I' , X . _
and thus this set must span .X To show that this set is also linearly independent, let us assume that there are scalars
such that
AYI
AIX I
_ (-A
T I)
XI"
+ ... +
+ ... +
= (p~I)XI
(-A
._I)-A
+
PI
a,
X.-
0,
0 X
(3.3.29)
1= = 0, the relation
(-p:-t)x._
t+
(p~.)x.
(3.3.30)
At IX
A._t.X _
+ ... +
0 . .X
= 0
AI
' I .Y .
+ ... +
' I IY I '
But by Theorem 3.3.15 this means the "Y i = 1, ... ,n + 1 are linearly
dependent, a contradiction to the hypothesis of our theorem. F r om this it
now follows that if m > n, then we must have m = n. This concludes the
proof of the theorem. _
As a direct consequence of Theorem 3.3.26 we have:
Proof Let { X I ' ... , .x 1 be a basis for X, and let also { Y I ""
, y.. l be a
basis for .X Then in view of Theorem 3.3.26 we have m < n. Interchanging
the role of the X i and ,Y we also have n < m. Hence, m = n. _
92
+ ... +
II"X"
II H I
(- ...!)L x
11"+1
l -
i.e., { X l
,x ,});
..
,x,,}
.X
= O.
XI'
,X "
are
(~)x"
11,,+ I
is a basis for .X
Therefore, X
n< o
3.3.35. Exercise.
Prove
3.3.34.
Coroll~ry
x =
'tXI
+ ... + ,,,x,,.
,x ,}.
with respect
93
these results in this book, their proofs will be omitted. In the following
theorems, X is an arbitrary vector space (i.e., finite dimensional or infinite
dimensional).
3.3.37. Theorem. If Y is a linearly independent set in a linear space ,X
then there exists a Hamel basis Z for X such that Y c Z.
3.3.38. Theorem. If Y and Z are Hamel
Y and Z have the same cardinal number.
then
The notion of H a mel basis is not the only concept of basis with which we
will deal. Such other concepts (to be specified later) reduce to H a mel basis
on finite-dimensional vector spaces but differ significantly on infinite-dimensional spaces. We will find that on infinite-dimensional spaces the concept
of Hamel basis is not very useful. However, in the case of finite-dimensional
spaces the concept of Hamel basis is most crucial.
In view of the results presented thus far, the reader can readily prove the
following facts.
3.3.39. Theorem.
=n.
Let
Exercise.
Exercise.
3.3.43.
and
But this implies that ~I = ~ = ... = ~ "
= PI= P~ = ... = P.. = O.
Thus, W is a linearly independent set in .X Since X is the direct sum of Y
and Z, it is clear that W generates .X Thus, dim X = m + n. This completes
the proof of the theorem. _
We conclude the present section with the following results.
3.3.4.4
1beorem. eL t X be an n-dimensional vector space, and let y{ I '
... , y",} be a linearly independent set of vectors in ,X where m < n. Then it
is possible to form a basis for X consisting of n vectors x I ' , x"' where
,x = ,Y for i = I, ... , m.
Proof
Let { e l"" ,e,,} be a basis for .X Let SI be the set of vectors IY{ '
... ,Y"" e l , , ell}, where { Y I "' "
Y .. } is a linearly independent set of
vectors in X and where m < n. We note that SI spans X and is linearly
3.4.
iL near Transformations
95
. tJ,,Y
1= '
"*
"
+ E
1= '
p,e, =
O.
3.3.45.
3.3.46.
Exercise.
3.4.
IL NEAR
TRANSFORMATIONS
Among the most important notions which we will encounter are special
types of mappings on vector spaces, called linear transformations.
Deftnition. A mapping T of a linear space X into a linear space ,Y
where X and Y a re vector spaces over the same field ,F is called a linear
transformation or linear operator provided that
3.4.1.
(i) T(x
(ii) T(tJ)x
y)
96
L(X,
" II,T(X
= I-I;
I
,) for all ,X
E X
T(tl IIIXI)
and science this is called the principle of soperposition and is among the most
important concepts in those disciplines.
3.4.2. Example.
Let X = Y denote the space of real-valued continuous
Y
functions on the interval a[ , b] as described in Example 3.1.19. Let T: X - +
be defined by
T
[ (]x t)
f (x s)ds,
<
<
b,
= dx(t) .
T
[ (]x t)
dt
F r om the properties of derivatives it follows that T is a linear transformation
from e"(a, b) to e"- I (a, b).
3.4.4.
Example.
Let X denote the space ofall complexv- alued functions x ( t)
defined on the half-open interval 0[ , 00) such that x ( t) is Riemann integrable
and such that
,--
where k is some positive constant and a is any real number. Defining vector
addition and scalar multiplication as in Eqs. (3.1.20) and (3.1.21), respectively,
it is easily shown that X is a linear space. Now let Y denote the linear space of
complex functions of a complex variable s (s = (1 + ;0>, ; = ,.JT
= ). The
Y defined by
reader can readily verify that the mapping T: X - +
T
[ (] x s)
50-
e- " x ( t)
dt
(3.4.5)
3.4.6. Example.
Let X be the space of real-valued continuous functions
on a[ , b] as described in Example 3.1.19. Let k(s, t) be a real-valued function
3.4.
iL near Transformations
defined for a
integral
<
<
s :::;;: b, a
<
s:
(3.4.7)
k(s, t)x ( t) dt
[Ttx)(s)
the Riemann
s:
y(s)
T1 : X
be
X
-+
(3.4.8)
3.4.9.
Example.
T
[ )xz (s)
y(s)
s:
x(s) -
-+
by
(3.4.10)
= 0
(3.4.12)
k(s, t)x(t)dt.
[ T 3(]x s)
y(s)
J : k(s,
(3.4.13)
t)x ( t) dt
and
[T.x)(s)
y(s)
(x s)
.J :
-
(3.4.14)
= .x
T(x )
=
~x
=
~x
+ y=
.F
<iT(x)
T(x )
then
_
*'
T~ (x).
-+
numbers. If x
X as
T(y). Now if F
C,
= C,
98
Definition. L e t T
L(X,
& ( T)
y{
->
,Y we wiJI
)Y .
{x
X:
:Y y
Tx
= O}
Tx , x
(3.4.17)
}X
(3.4.18)
Theorem.
Let T
L(X,
)Y .
Then
3.4.20.
Exercise.
3.4 . 21.
We assume that R
< (T) =# O
{ J and X =# O
{ ,J for if R
< (T) = O
{ J or X
then dim R
<{ (T)}
= 0, and the theorem is proved. Thus, assume that
n> 0 and let Y l o"" y~+1 E R< (T). Then there exist X I ' . , X~+1
E X
such that Tx/ = y/ for ; = 1, ... , n + 1. Since X is of dimension n, there
exist ~I' . , ~+I
E F such that not all ~/ = 0 and
Proof
=
O
{ ,J
~IXt
+ ... +
~+tX~+t
+ ... +
~+IY-+I
O.
=
Thus,
or
O.
Therefore, by Corollary 3.3.34, R
< (T) is finite dimensional and dim R
< [ (T)]
~IYI
n< o
3.4.22. Example.
Let T: R% - > R"", where R% and R- are defined in Examples 3.1.10 and 3.1.11, respectively. F o r x E R% we write x = ({Io )%{ . Define
3.4.
iL near Transformations
Tby
T(~I'
~Z)
(0, I~ '
=
0, ~Z,
0, 0, ...).
The mapping T is clearly a linear transformation. The vectors (0; 1,0, ...)
< (T) and dim R
<[ (T)] = 2 = dim R
[ Z].
3.4.23.
{YI>'"
,y.J
3.4.24.
Exercise.
Theorem. eL t T
L(X,
)Y .
dim &(T)
dim R
< (T)
= dim .X
(3.4.26)
=
=
7tfl
T(7. e l
tdz
7z e z
71 Te l
7,f, =
7,e,),
7z Tez
+ ... +
7,Te,
"lle l
"Izez
+ ... +
"I,e, =
)' , + J e ,+ J
+ ... +
)'.e .
100
But fel, e", ... ,en} is a basis for .X F r om this it follows that 71 = 7" = ...
= Y r = 7r+ I = ... = Y n = O. eH nce, fltf", ... ,fr are linearly independent
< (T) = r. If s = 0, the preceding proof remains valid if
and therefore dim R
we let fel, ... ,e.} be any basis for X and ignore the remarks about the
vectors e{ r + I ' ,en}' If s = n, then ffi.(T) = .X eH nce, R
< (T) = O
{ J and so
< (T) = O. This concludes the proof of the theorem. _
dim R
Our preceding result gives rise to the next definition.
3.4.27. Definition. The rank p(T) of a linear transformation T of a finitedimensional vector space X into a vector space Y is the dimension of the
range space R
< (T). The nullity v(T) of the linear transformation Tis the dimension of the nullspace ffi.(i').
The reader is now in a position to prove the next result.
)Y . Let X be finite dimensional, and let
3.4.28. Theorem. eL t T E L ( X ,
s = dim ffi.(T). eL t IX { '
... ,x , } be a basis for ffi.(T). Then
if and only if x = lIlX I + ... + lI,X , for some set of scalars { l ilt
... , lI,}. Furthermore, for each x E X such that Tx = 0 is satisfied,
the set of scalars { l ilt ... , II,} is unique;
(ii) if oY is a fixed vector in ,Y then Tx = oY holds for at least one x E X
(called a solutioD of the equation Tx = oY ) if and only if oY E R
< (T);
and
(iii) if oY is any fixed vector in Y a nd if X o is some vector in X such that
Tx o = oY (i.e., X o is a solution of the equation Tx o = oY ), then a
vector x E X satisfies Tx = oY if and only if x = X o + PIX I + ...
+ P,X, for some set of scalars P{ it P", ... ,P,}. Furthermore, for
each x E X such that Tx = oY , the set of scalars P
{ it P1.' ... ,P,}
is unique.
3.4.29.
Exercise.
3.4.
iL near Transformations
101
T- I (Tx )
and
T(T- I y)
x for all x
E X
(3.4.30)
(3.4.31)
Let T E L ( X ,
Theorem.
)Y .
2Y )
Also, for
T- I (Tx l
T- I (Y I )
E F we have
T-I(~YI)
T-I(~Txl)
Tx 2)
T- I T(x
x 2)
IX
X 2
T- I (yz ) .
T-I(T(~xl))
~XI
~T-I(YI)'
dim R
< (T)
= dim .X
101
+ ... +
+ ... +
o.
:Dm =
X
3.4.36.
iF gure J . iL near transformation T from vector space X
vector space .Y
into
3.4.
iL near Transformations
103
(iii) Tx = 0 implies x = 0;
< (T), there is a unique x
(iv) for each y E R
(v) if TXt = Tx 1 , then X t = x 1 ; and
(vi)
if X
*' x
1,
then TXt
*' Tx
X such that Tx
y;
= y.
If X
and Y a re
Let X and Y be
The following are
such that Tx =
y.
T is injective;
T is surjective:
T is bijective; and
T has an inverse.
Exercise.
104
(S
for all x
.X
Also, with /X
by a scalar /X as
T)x
t::.
E
E
Tx
F and T E L ( X ,
(/XT)x
for all x
that /XT
Sx
define multiplication of T
)Y ,
/XTx
t::.
(3.4.24 )
(3.4.34 )
L(X,
Ox
= 0
(3.4.)4
for all x E .X
= -
Tx
(3.4.45)
T=
O.
3.4.64 .
Exercise. eL t X be a finite-dimensional space, and let T E L ( X ,
)Y .
Let e{ l> ... ,e.} be a basis for .X Then Te, = 0 for i = I, ... , n if and only
if T = 0 (i.e., T is the ez ro transformation).
With the above definitions it is now easy to establish the following result.
3.4.74 .
Tbeorem. eL t X and Y be two linear spaces over the same field
of scalars ,F and let L ( ,X Y) denote the set of all linear transformations from
X into .Y Then L ( X ,
Y ) is itself a linear space over ,F called the space of
linear transformations (here, vector addition is defined by Eq. (3.4.24 ) and
multiplication of vectors by scalars is defined by Eq. (3.4.43.
3.4.84 .
Exercise.
x y
= x z
(/XP)(x
z for all x , y, z E X ;
y z for all x , y,
y) for all x , y E X
X ; and
and for all /x, P E .F
Z E
3.4.
105
iL near Transformations
(iv) (x
= x
y) z
(y )z for all x , y, Z
,X
(S +
and
= ST+ SV,
U)
T)V =
SU
(exS)(PT) =
Z).
,F
PE
then it is
(3.4.51)
(3.4.52)
(3.4.53)
TV,
(3.4.54)
(a,P)ST.
= S[(T + )U ]x
)U x]
(ST)x
= S[Tx + ]xU
(SU ) x
=
(ST +
SU ) x
ST*- TS.
There is a special mapping from a linear space X into ,X called the identity
transformation, defined by
(3.4.56)
Ix = x
for all x E .X We note that I is linear, i.e., I E L ( X ,
if X * - O
{ ,J that I is unique, and that
TI =
for all T
L(X,
X).
IT =
(3.4.57)
106
Chapter
(a.I)x
a.lx
(3.4.58)
a.x
Theorem. Let T
L(X,
T- I T=
X).
If T is bijective, then T- I
IT- I =
I,
L(X,
X)
(3.4.61)
Exercise.
3.4.64.
T is invertible;
rank T = dim X ;
T is one-to-one;
T is onto; and
Tx = 0 implies x
Exercise.
O.
L(X,
(i) If ST = S
U = I, then S is bijective and S- I = T = .U
(ii) IfSand Tare bijective, then STis bijective, and (Sn- I =
(iii) If S is bijective, then (S- I )- I = S.
(iv) If S is bijective, then a.S is bijective and (a.S>1F a nd a.
*' O.
X).
Let
T- I S- I .
S- I for all a.
3.4.
107
iL near Transformations
3.4.66.
Exercise.
With the aid of the above concepts and results we can now construct
certain classes of functions of linear transformations. Since relation (3.4.51)
allows us to write the product of three or more linear transformations without
the use of parentheses, we can define T", where T E L ( ,X X ) and n is a positive
integer, as
T"I1T T ... T .
(3.4.67)
n times
11
T- I T- I ... T- t .
mtfmes
m ti'ines
(3.4.68)
n tImes
(T. T .... T)
m + ntimes
=
T"'"+
=
=
(T T . ..
n times
=
T) (T T . .
.
mtimes
T)
(3.4.69)
1'" T"'.
(T"')"
= T"" = T- = (1"')"'
(3.4.70)
(3.4.71)
where m and n are positive integers. Consistent with this notation we also
have
TI = T
(3.4.72)
and
TO = 1.
(3.4.73)
We are now in a position to consider polynomials of linear transformations.
Thus, if f(A) is a polynomial, i.e.,
f(A) =
A
\
+ ... +
" A",
(3.4.74)
, ,1"'.
(3.4.75)
f1, 0 1
f1,tT
+ ... +
The reader is cautioned that the above concept can, in general, not be
108
(~1>
~1.,
,e,,)
Exercise.
the
Theorem 3.4.77 points out the importance ofthe spaces R" and C". Namely,
every n-dimensional vector space over the field of real numbers is isomorphic
to R" and every n-dimensional vector space over the field of complex numbers
is isomorphic to eft (see Example 3.I.lO).
3.5.
IL NEAR
N
UF CTIONALS
s:
II(x ) =
ex s) ds, x
era, b]
(3.5.3)
(x so),
era, b],
So
a[ , b]
(3.5.4)
(x s)xo(s)
(3.5.5)
ds,
where X o is a fixed element of era, b] and where x is any element in era, b],
is also a linear functional on era, b].
3.5.6. Example. eL t X = P, and denote x
The mappingf, defined by
f,(x ) = el
by x
(e
I' .
e.).
(3.5.7)
= :E
,~ e,
I- I
(3.5.8)
3.5.9. Exercise. Show that the mappings (3.5.3), (3.5.4), (3.5.5), (3.5.7),
and (3.5.8) are linear functionals.
Now let X
109
110
f(x )
(x , J )
(3.5.10)
f(x )
(x , J )
or x ' x
is sometimes
(x , x ' ) ,
=
(3.5.11)
(fl
f1.)(x) =
t'x
(x ,
fl(x )
=
and
(<f< )(x)
~)
(x,~)
f1.(x),
(x , x ' )
=
(x , ;X )
(3.5.12)
(x,
x')
f (x ) ,
=
(3.5.13)
f(x l
1X .) =
(XI
IX < '
1X .' x ' )
x')
and also,
f(<)x<
=
,x<
1X< .'
x')
=
and let
x')
,x <
f(x l ) +
=
x')
=
f (x ) .
f(x1.),
x'(x)
(3.5.14)
(3.5.15)
3.5.17. Exercise.
3.5.19. T
' heorem. Let X be a finite-dimensional vector space, and let
e{ l , ,e,,} be a basis for .X IfI { t ... , . } is an arbitrary set of scalars,
then there is a unique linear functional x ' E X ' such that (e" x ' ) = , for
i = 1, ... , n.
Proof
F o r every
X
,X
X =
we have
~1l'1
~1.e1.
+ ... +
~"e .
111
X'
be given by
I'
(x , x ' )
=
1= 1
J'
rt~"
Ifx
In our next result and on several other occasions throughout this book,
we make use of the Kronecker delta.
for i,j =
_ I{
/J -
if i =
0 ifi*- j
(3.5.21)
We now have:
PIe;
Then
o=
pze;
(eJ , ~ . Pie;)= .~
1= 1
+ ... +
P.e:. =
p,(eJ , e;)
1= 1
1= 1
o.
= PJ'
P'~/J
= (e
x').
"
(x , x ' )
Also,
Let x
(' l eI
1= 1
+ ... +
' I (e l , x ' )
+ ... +
(x , e~)
=
= l'< eI>
~ "<e ,, ~)
t:t
x')
' I rt l
=
'J
+
+
+
+
112
,x <
= I ,X<
x')
e;)
= ,x<
I e;
x'
= I e; + ...
X'
.(X,
+ . e.).
e.)
we have
+ . e.,
e.}.
of X '
in Theorem 3.5.22 is
,x <
STy' )
= S< ,x
y' ) ,
x E ,X y' E yl,
x'
(3.5.25)
s
y
lX
3.5.26.
iF gure .K
113
L e t
,x <
ST(Y;
Thus, ST(y;
y~)
,x <
ST(y;)
ST(<y< ;
S
< ,x
(y;
,x <
STY;)
ST(<y< ;)
=
S
T(Y;).
ST(y~).
=
=
eH nce,
S
< ,x
ST h
,x<
).
y;)
,X
S
< ,x
h)
Also,
S
< ,x
Y;)
,x <
sry;) =
Therefore, ST
S
< ,x
,x<
Y;)
s ry~).
L ( yl, X ' ) .
The reader should have now no difficulties in proving the following results.
X).
Then TJ
)Y .
Then OT
3.5.32. Exercise.
3.6.
BILINEAR
N
UF CTIONALS
114
Chapter 3
g(czx
py) =
lig(x )
pg(y)
(3.6.2)
for all ,x y E X and for all cz, p E C, where d denotes the complex conjugate
of cz and denotes the complex conjugate of p.
Ifin Definition 3.6.1 the complex vector space is replaced by a real linear
g(czx
py) =
czg(x)
pg(y)
(3.6.3)
x
X
py, )z =
(a) g(czx
(b) g(x , czy + pz ) =
for all ,x y, z
czg(x,
)z
iig(x , y)
pg(y, z ) ; and
pg(x , z )
pE
C.
is a bilinear functional.
~I;;I
~7.; 7.
3.6.6. Example.
L e t ,x y E Rl, where R7. denotes the linear space of
ordered pairs of real numbers (if ,x y E R7., then x = (~I>
~7.)
and y =
(111) 111' L e t (J denote the angle hetween ,x y E l} 7.. The dot product of two
3.6.
115
Bilinear uF nctionals
vectors, defined by
I'll +
is a bilinear functional. _
g(x , y) =
~z'lz
(~t
~DI/2('It
3.6.7. Example.
is a bilinear functional.
L ( x ) P(y)
3.6.8. Example.
is a bilinear functional.
g(x , y)
*"
g(x , x )
for all x E X, the quadratic fonn induced by g (we frequently omit the phrase
"induced by g").
F o r example, if g(x , y) = ~ 1;;1 + ~2;;2'
as in Example 3.6.5, then g(x )
= ~I~l + ~2~Z = I~dz + l~zI2.
This is a quadratic form as studied in
analytic geometry.
F o r real linear spaces, Definitions 3.6.10 and 3.6.11 are again modified in
an obvious way by ignoring complex conjugates.
116
g, then
~
Proof
[ g (x , y)
g(y,x )
t Y)
t(X
=
te 2 Y ) .
y)
+y - ' - 2 - x+
g (x- 2
=
=
I
4 [ g (x ,
I
4 [ g (x ,
x)
x) -
y)
4 I g(x
=
g(y, x
y)
y)
g(y, x )
g(x, y) -
g(y, x )
g(x, y)
y,x
g(y, y),
and also,
t e 2 Y)
g[ (x,
Thus,
- } [ g (x ,
y)
= te
g(y, ) x
t Y)
-
g(y, y).
t e 2 y).
-
for every ,x y
Proof
t[ ! ( x
y) -
t[ ! ( x
(here i =
E X
y)
it[ ! ( x
iy)
- it[ ! ( x
-
iy)]
- I ).
t Y)
=
~
g(x, y)
g[ (x,
x)
g[ (x,
x) -
g[ (x,
x ) ._ ig(x, y)
g[ (x,
x)
g(y, x )
g(y, y)
and
t e 2 Y)
Also,
it(X
= ~
iY ) =
g(x , y) -
g(y, x )
g(y, y).
ig(y, x )
g(y, y)
and
it(X
- ; iY )
= ~
ig(x, y) -
ig(y, x )
g(y, y).
(3.6.14)
117
3.6.15. Theorem. Let X be a complex vector space. If two bilinear functionals g and h are such that g = h, then g = h.
Exercise.
3.6.16.
is
for all ,x y
.X
g(y, x)
= y, we obtain
g(x) = g(x, x) = g(x, x) =
But this implies that g is real.
Setting x
g(x)
for all x E .X
Conversely, if g(x) is real for all x E ,X then for h(x, y) = g(y, x) we
have h(x) = g(x, x ) = g(x, x ) = g(x). Since h = g, it now follows from
Theorem 3.6.15 that h = g, and thus
g(x, y) =
g(y, )x .
Note that Theorems 3.6.13, 3.6.15, and 3.6.17 hold only for complex
vector spaces. Theorem 3.6.15 implies that a bilinear form is uniquely
determined by its induced quadratic form, and Theorem 3.6.13 gives an
explicit connection between g and g. In the case of real spaces, these conclusions do not follow.
3.6.18. Example.
Let X = R2 with x = (el' e2) E R2 and y
E R2. Define the bilinear functionals g and h by
g(x, y) =
and
h(x, y)
Then g(x)
el171
2'217.
1{4 172
(171,172)
2' 172
F o r the case of real linear spaces, the definition of inner product is identical
to the above definition.
Since in a given discussion the particular bilinear functional g is always
118
(i) (x , x )
(ii) (x , y)
(iii) (Ctx +
>
3.7. Projections
119
Before closing the present section, let us consider a few specific examples.
3.6.23. Example. Let X = R"o F o r x
,' I .) E R , we can readily verify that
o
(~I'
00"
~")
R" and y
(' I I'
(x, y) =
Let
~,'Il
=
X
I~
=
x
C", F o r
(~I'
.. " ~.)
C" and y =
('II>
:E ,~ ; "
(x, y) =
1- 1
f'=
f(t)g(t)dt
n=
0, l ,
f.) = 0 if m
3.7.
1_
defined by
0[ , 1],
*'
*'
PROJECTIONS
In the present section we consider another special class of linear transformations, called prOjectiODS. Such transformations which utilize direct sums
(introduced in Section 3.2) as their natural setting will find wide applications
in later parts of this book.
XI'
120
=
x
3.7.1.
Figure L .
Projection on IX
Xl
2X
along 1'X ..
transformation which maps every point x in the plane X onto the subspace
XI along the subspace 1'X .'
3.7.3. Theorem. eL t X be the direct sum of two linear subspaces X I
1'X ., and let P be the projection on X I along 1'X .' Then
(i) P
L(X,
(ii) R
< (P) =
(iii)
~(P)
X);
X I ; and
X 2
and
Py)
=
=
P(f1.X I
f1.P(x
f1.P(x)
l)
f1.X1' .
PP(YI)
PYI
PY1' .)
f1.P(x I
1'X . and Y =
f1.X I
1'X .)
PYI
PP(YI
YI
1'Y .'
1'Y .)
pP(y),
Theorem. eL t P E L ( X ,
X).
if and only if PP = p'1.= P.
Then P is a projection on R
< (P) along
111
3.7. Projections
p'1. x
P(Px)
PX
XI
Px,
P.
let us assume that p2 = P. Let 1'X . = m(p) and let X I = R
< (P).
Clearly, m(p) and R
< (P) are linear subspaces of .X
We must show that
X = R
< (P) EB m(P) = X I EB X I '
In particular, we must show that R
< (P)
n m(p) = O{ J and that R< (P) and m(p) span .X
Now if y E R
< (P) there exists an x E X such that Px = y. Thus, p'1. x = Py
= Px = y. If y E m(p) then Py = O. Thus, if y is in both m(p) and m(p),
then we must have y = 0; i.e., R
< (P) n m(p) = O
{ .J
Next, let x be an arbitrary element in .X Then we have
C~n> versely,
Px
(I -
P)x.
XI'
Let
L(X,
X).
X =
R
< (P)
EB m(p).
(3.7.8)
*'
121
eL t us now consider:
U.X,
Then
X).
Next we consider:
3.7.12. Definition. eL t X be a linear space which is the direct sum of two
linear subspaces Y and Z; i.e., X = Y EEl Z. If Y a nd Z are both invariant
under a linear transformation T, then T is said to be reduced by Y a nd Z.
We are now in a position to prove the following result.
3.7.13. Theorem. Let Y and Z be two linear subspaces of a vector space
X such that X = Y EEl Z. Let T E L ( X ,
X). Then T is reduced by Y and Z
if and only if PT = TP, where P is the projection on Y along Z.
TPx
(3.7.14)
3.8.
123
for all x
.X
Equations (3.7.14)
TP.
3.8.
The material of the present chapter as well as that of the next chapter is
usually referred to as linear algebra. Thus, these two chapters should be
viewed as one package. F o r this reason, applications (dealing with ordinary
differential equations) are presented at the end of the next chapter.
There are many textbooks and reference works dealing with vector spaces
and linear transformations. Some of these which we have found to be very
useful are cited in the references for this chapter. The reader should consult
these for further study.
REFERENCES
3[ .1]
3[ .2]
3[ .3]
3[ .4]
P. R. A
H M
L OS,
iF nite Dimensional Vector Spaces. Princeton, N.J . : D. Van
Nostrand Company, Inc., 1958.
K. O
H M
F AN
and R. N
U K ZE,
Linear Algebra. Englewood Cliffs, N.J . : PrenticeH a ll, Inc., 1971.
A. W. NAYO
L R
and G. R. SEL,L
Linear Operator Theory in Engineering and
Science. New Y o rk: H o lt, Rinehart and Winston, 1971.
A. E. TAYO
L R, Introduction to u
F nctional Analysis. New Y o rk: J o hn Wiley &
Sons, Inc., 1966.
IF NITE-DIMENSIONAL
VECTOR SPACES AND
MATRICES
.4 1.
COORDINATE REPRESENTATION
OF VECTORS
x =
124
~IXI
+ ... +
~.x..
(4.1.1)
.4 1.
U5
(4.1.2)
or as
(4.1.3)
We call x (or x T) the coordinate representation of the underlying object (vector)
x with respect to the basis { x " ... ,x,,}. We call x a column vector and x T a
row vector. Also, we say that x T is the transpose vector, or simply the transpose
I'
('IX
=
I+
... + ,,,x.)
(<I' < )X
+ ... +
(<e< ")x,,.
(4.1.4)
I'
e l-
z{
x=
e z
(4.1.5)
I'_ t
or
x
Next, let y E ,X
T=
({I'
e_ .
'z,'
"IX I
.. ,e,,).
ez,'
(4.1.6)
where
y
"z X
+ ... +
""X".
(4.1.7)
.. ,x,,}
ChtJpter 4 I iF nite-Dimensional
126
=Y
(4.1.8)
or
(4.1.9)
Now
x +
(~.x.
(~I
=
+ ... +
~RX.)
1' .)x..
(4.1.10)
x+y=
or
x T+
yT
~.] . + .[~I]
.
(~.
1' .
~.
(~
. =
[~I . ~
R~
1' 1]
(4.1.11)
+ 1' R
+ ('11" .. ,'1R)
+ 1' ..... ,~. + 1' R)'
..... , ~.)
(4.1.12)
tlu.
+ ... +
t.u.
=
(4.I.I3)
This enables us to represent the same vector x E X with respect to two different bases in terms of two different but unique sets of coordinates, namely,
[]
~d
[]
(4.1.14)
The next two examples are intended to throw additional light on the above
discussion.
.4 1.15.
Example. eL t X = (~I'"
. ,~.)
E R.
eL t "I = (1,0, ... ,0),
(0, 1,0, ... ,0), ... , U . = (0, ... ,0, I). It is readily shown that the
set u{ l , . , u.l is a basis for RR. We call this basis the natural basis for RR.
Noting that
(4.1.16)
U2
.4 1.
117
(4.1.17)
or
x T=
... , ~.).
(~1'
u. =
... ,
0 ,
z =
, tU I
are
I>
u =
"
(4.1.18)
0
I
,'.+
(4.1.19)
for i = 1, 2, ... , n - 1. Thus, the coorwhere ot" = ,,, and ott = " dinate representation of x relative to {v., ... , v.) is given by
ot.
ot z
ot,,_.
_ ot"
,,,-.
(4.1.20)
"~-
"~
Hence, we have represented the same vector x E R by two different coordinate vectors with respect to two different bases for R".
Example. Let X = e[a, b}, the
functions on the interval a[ , b]. Let Y =
= 1 and x,(t) = I' for all I E a[ , b}, i =
3.3.13, Y is a linearly independent set in X
.4 1.21.
Chapter 4 I iF nite-Dimensional
128
eH nce, for any y E V(Y) there exists a unique set of scalars ('Io, I' I>
such that
y = I' oXo + ... + I' . X .
. , 1' .1
(4.1.22)
y(t) =
1' 0
+ ... +
' l It
'I.t,
t E a[ , b).
(4.1.23)
(4.1.24)
I' .
1X0Z o
IX I Z
+ ... +
IX"Z",
IX
1' 0 I' I -
(4.1.25)
E
V(Y )
'II
1' 2
(4.1.26)
IX._
_ IX"
Thus, two different coordinate vectors were used above in representing the
same vector y E V( )Y with respect to two different bases for V( )Y .
Summarizing, we observe:
1. Every vector X belonging to an n-dimensional linear space X over
a field F can be represented in terms of a coordinate vector x, or its
transpose x T , with respect to a given basis e{ I' , e.l c .X We note
that x T E P (the space P is defined in Example 3.1.7). By convention
we will henceforth also write x E P. To indicate the coordinate representation of x E X by x E P, we write x ~ .x
2. In representing x by x, an "ordering" of the basis e{ l t , ell} c X
is implied.
.4 2.
Matrices
3.
129
.4 2.
MATRICES
.4 2.1.
(i) eL t
set
(el>
e..},
(ii) L e t
e;
= Ae1 , e~ = Ae2 ,
e;, ... , e~J
{e~,
,I" = Ae... If x
=
=
A(e1e l
el~
e2e2
e2e~
e"e,,)
elAe l
e2Ae2
+ ... +
e"Ae"
e"e~.
e.e l
+ ... +
e2e2
e..e".
ele;
= 1,
(el +
,n.
e.. l".
A[(el
+ ... +
e2e~
I' 1)e.
I' 1)e'l
+
+
(e ..
(e ..
Chapter 4 I iF nite-Dimensional
130
A(x)
ell.
A(y) =
= (el
A(x
+ ... +
+ (e~ +
e~e~
111)e~
e"e:.
11~)~
111e~
+ ... +
11~~
(e"
+ ... +
l1"e~
11,,)e:.
y).
lXA(x)
A(lX)X
such that
I . .. n.
We point out that part (i) of Theorem .4 2.1 implies that a linear transformation is completely determined by knowing how it transforms the basis vectors
in its domain. and part (ii) of Theorem .4 2.1 states that this linear transfor-
mation is uniquely determined in this way. We will utilize these facts in the
following.
Now let X be an n-dimensional vector space. and let {el' ez . .. ell} be
a basis for .X L e t Y b e an m-dimensional vector space. and let {fIJ~
... J " ,}
be a basis for .Y L e t A E L ( X .
)Y . and let e; = Ae, for i = I . .. n. Since
{[IJ~
... J " ,} is a basis for .Y there are uniq u e scalars a{ o.} i = I . .. m.
j = I . .. n. such that
Now let x E .X
Ael =
Aez =
I. = allfl
~ = aufl
Ae" =
e:. =
at..!1
elel
Ax =
ffIJ~.
..
azt!~
aufz
az,,[z
+ ... +
a",t!",
a",d",
(4.2.2)
a",..!",.
x =
Since Ax E .Y
+ .,. +
e~ez
e"e"
ele~
+ ... +
(4.2.3)
e"e~.
,fIlII. say.
Ax =
11t!1
l1dz
+ ... +
11",[",.
(4.2.4)
.4 2.
Matrices
131
+ ... +
= el(aldl
e,,(au/l +
e8(a l J I
a",d",)
a",,,/,,,)
a",Jm)'
= (allel
+
However,
have
(a"'lel
a",,,e,, +
... + a"'8en)/",
aue" +
11" = a"lel + aue" +
111
11", =
= allel
amlel
a",,,e,,
alnen,
ah e8'
+ ... +
a",ne8'
we
(4.2.5)
A -- [ a,}] -
ail
a" I
a"'l
au
au
...
a",,,
.. ,
a 18 ]
ah .
(4.2.6)
a"'8
We see that once the bases {el, e", . .. ,e { / h/", ... ,I",} are fixed, we can
represent the linear transformation A by the array of scalars in Eq. (4.2.6)
which are uniquely determined by Eq. (4.2.2).
In view of part (ii) of Theorem .4 2.1, the converse to the preceding also
holds. Specifically, with the bases for X and Y still fixed, the array given in
Eq. (4.2.6) is uniquely associated with the linear transformation A of X into .Y
The above discussion justifies the following important definition.
8 },
.4 2.7. Definition. The array given in Eq. (4.2.6) is called the matrix A of
tbe linear transformation A from linear space X into linear space Y with respect
to the basis e{ 1> , en} of X and the basis { I I' ... ,fIll} of .Y
If, in Definition .4 2.7, X = ,Y and if for both X and Y the same basis
e{ l' ... , e is used, then we simply speak of the matrix A of the linear transformation A with respect to the basis e{ l, ... ,e8 } .
In Eq. (4.2.6), the scalars (all, 0,,,, ... ,0'8) form the ith row of A and the
8 }
Chapter 4 I iF nite-Dimensional
132
scalars (all' 0 2/ , ... , 0"'/) form the jth column of A. The scalar a'l refers to
that element of matrix A which can be found in the ith row and jth column of
A. The array in Eq. (4.2.6) is said to be an (m X n) matrix. Ifm = n, we speak
of a square matrix (i.e., an (n X n) matrix).
In accordance with our discussion of Section .4 1, an (n X 1) matrix
is called a column vector, column matrix, or n-vector, and a (1 x n) matrix
is called a row vector.
We say that two (m X n) matrices A = [ 0 1/] and B = b[ l/] are equal if
and only if 01/ = bl/ for all i = I, ... , m and for allj = I, ... , n.
F r om the preceding discussion it should be clear that the same linear
transformation A from linear space X into linear space Y may be represented
by different matrices, depending on the particular choice of bases in X and .Y
Since it is always clear from context which particular bases are being used,
we usually don' t refer to them explicitly, thus avoiding cumbersome notation.
Now let AT denote the transpose of A E L ( X ,
Y) (refer to Definition
3.5.27). Our next result provides the matrix representation of AT.
.4 2.8. Theorem. Let A E L ( X ,
Y ) and let A denote the matrix of A with
respect to the bases e{ I' ... , e~} in X and { f l' ... ,I.} in .Y Let X I and
yl be the algebraic conjugates of X and Y, respectively. Let AT E L ( Y I , X I )
be the transpose of A. Let {f~,
... ,f~}
and {e~,
... , e:.}, denote the dual
bases of { f l' ... , f",} and e{ u ... , e~}, respectively. If the matrix A is given by
Eq. (4.2.6), then the matrix of AT with respect to {f~,
... ,f~}
of yl and
{e~,
... , e:.} of X ' is given by
all
AT
a21
= [ 01.2.. .~2.2
al~
0"'1]
a2~
"""
~."'2
...
a",~
(4.2.9)
for i =
I, ...
,n, and
e<
"
e~>
=
6,,, and
<I",f~>
=
6,,}.
.4 2.
Matrices
133
Also,
A< el,f>~
Therefore, b,j
e< l, AT/~>
= k=L 1"
bkje~)
(el, tl
=
= bl}'
bklel, e~>
(4.2.9)
Our next result follows trivially from the discussion leading up to Definition .4 2.7.
.4 2.11. Theorem. Let A be a linear transformation of an n-dimensional
vector space X into an m-dimensional vector space ,Y and let y = Ax. Let
the coordinates of x with respect to the basis e{ l , el' ... , e,,} be (e \ J el' ... ,
e.), and let the coordinates of y with respect to the basis { f l,fl' ... ,f..} be
('I I ' 1' 1' ... , 'I.). eL t
all
all
011
ala
au
ala
(4.2.12)
I.}.
allel
auel
a l1 el
a 21 el
or, equivalently,
I' I =
.4 2.15.
Exercise.
jml
+
+
a,je j, i =
alae.
a 1"e. =
I, ... , m.
' I I'
1' 1'
(4.2.13)
(4.2.14)
Using matrix and vector notation, let us agree to express the system of
linear equations given by Eq. (4.2.13) equivalently as
134
Chapter 4
all
au
I iF nite-Dimensional
aU.
aa.
au
a~ h
a. 1 a.2
I'
2'
,- .
"1
"2
".
y,
~
(4.2.16)
(4.2.17)
all
(' I t
or, in short, as
aU.
a. 1
a.2
a21
au
a_
In
ab
T
x AT
yT.
(4 . 2.19)
B. Rank of a Matrix
We begin by proving the following result.
4.2.20. Theorem. L e t A be a linear transformation from X into .Y Then
A has rank r if and only if it is possible to choose a basis e{ l> e2 , , e.}
.4 2.
Matrices
135
for X and a basis { I I' ... ,fIll} for Y such that the matrix A of A with respect
to these bases is of the form
r..
- 100
6
o
010
A=
0 0 0
...
0 0
0-
0 0
1 0 0
...
m=
dim .Y
(4.2.21)
0000000
0000000
....
dim X
n=
Proof. We choose a basis for X of the form e{ l, e2.' ... ,e" e,+I'
. . , e.},
where e{ l+ ' >
... , e.} isa basisfodJt(A). Ifll = Ae l ,f2. = Ae2.' ... ,/, = Ae"
then {l1,f2.," .,/,} is a basis for R
< (A), as we saw in the proof of Theorem
3.4.25. Now choose vectors 1,+1, ... ,fin in Y such that the set of vectors
l{ 1,f2., .. . ,f",} forms a basis for Y (see Theorem 3.3.4)4 . Then
II
12.
Ae l
Ae2
(1)/1
(0)/1
(0)/2.
+
+
(1)12.
(O)/,
(0)/'1+
(0)/,
(0)/'1+
+
+
(O)/In'
(0)/""
..................................................................................................... ,
I, =
o=
Ae,
0=
Ae" =
Ae,+
(0)/1
(0)/2
= (0)/1 + (0)/2. +
(0)/'1+
...................................................................................................... ,
(0)/1
(0)/2.
+ ... +
(0)/,
(0)/'1+
+ ... +
(O)/In'
The necessity is proven by applying Definition 4.2.7 (and also Eq. (4.2.2
to the set of equations (4.2.22); the desired result given by Eq. (4.2.21)
follows.
Sufficiency follows from the fact that the basis for R
< (A) contains r linearly
independent vectors. _
A question of practical significance is the following: if A is the matrix
of a linear transformation A from linear space X into linear space Y with
respect to arbitrary bases e{ l , , e.} for X and { I I' ... , /In} for ,Y what is
< (A) be the subspace of Y generthe rank of A in terms of matrix A? Let R
ated by Ae l , Ae2.' ... , Ae". Then, in view of Eq. (4.2.2), the coordinate representation of Ae/> i = I, ... ,n, in Y with respect to { I I' ... ,fin} is given
by
Chapter 4 I iF nite-Dimensional
136
... ,
Ae,,~
+ ... + "
y=
"_ ...
a_ ... I
a_ ... 2
(4.2.23)
a..."
where" I' , "" are scalars. Since every spanning or generating set of a linear
space contains a basis, we are able to select from among the vectors Ael
Ae 2 ... ,Ae" a basis for R
< (A). Suppose that the set A
{ e l , Ae2 ... Aek}
is this basis. Then the vectors Ae I. Ae 2 .. , Ae k are linearly independent.
and the vectors Aek+I' ... , Ae" are linear combinations of the vectors Ae l
Ae2 ,
Aek F r om this there now follows:
.4 2.24.
Theorem. Let A E L ( X .
)Y , and let A be the matrix of A with
respect to the (arbitrary) basis eel' e2 ... , e,,} for X and with respect to the
(arbitrary) basis { l 1.l2 ... .I...} for .Y Let the coordinate representation of
y = Ax be Y = Ax. Then
(i) the rank of A is the number of vectors in the largest possible linearly
independent set of columns of A; and
(ii) the rank of A is the number of vectors in the smallest possible set of
columns of A which has the property that all columns not in it can
be expressed as linear combinations of the columns in it.
In view of this result we make the following definition.
.4 2.25. Definition. The rank of an m X
of linearly independent columns of A.
c.
Properties of Matrices
.4 2.
Matrices
137
a[ lj]
b[ IJ]
a[ lJ
blj]
e[ IJ]
C.
(4.2.26)
a[ IJ]
a[ lj]
d[ IJ]
D.
(4.2.27)
K }
and
B! J = 1 :bljg/t
1= 1
Now
, "'
= 1=1:1 J=I1:
j= I ,
... ,m.
blj aJkgl'
I' ..
K }
(4.2.28)
for i
I, ... , r andj =
(4.2.29)
138
Chapter 4 I iF nite-Dimensional
n matrix
D= A + B
where
where
for all i
ell =
~all
G= A C,
where
for each i
1, ... , r.
Theorem.
(ii) Let A be an (m
Then
X
A(B
C)
AD
AC.
r) matrix.
(4.2.33)
r) matrices.
(4.2.34)
.4 2.
Matrices
139
pE
,F and let A be an (m
(t +
(v)
(AB)C.
t(A +
B) =
and let
(4.2.35)
n) matrix. Then
X
P)A =
r) matrix,
tA +
(4.2.36)
pA.
n) matrices. Then
tA +
(4.2.37)
tB.
Let A and B be (m
n) matrices. Then
A +B=
(viii)
Let A, B, and C be (m
(A +
(4.2.39)
B+ A .
n) matrices. Then
B) + C =
A+
(B +
C).
(4.2.40)
.4 2.41.
.4 2.43.
I
I is called the n x
.4 2.45.
Exercise.
~ ~
[ : .. ..: ..:.:.:..
:J
(4.2.4)4
n identity matrix.
Prove Theorems 4.2.32,4.2.41,
and .4 2.43.
140
Chapter 4 I iF nite-Dimensional
F o r any (m x
n) matrix A we have
(4.2.46)
A+ O = O + A = A
and for any (n X n) matrix B we have
(4.2.47)
BI= I B= B
-A =
(- I )A =
all
012
ala
021
02 2
02"
0",2
a",,,
(- I )
_ 0 "' 1
- a ll
- 0 12
- a la
- 0 21
-au
- 0 211
- 0 "' 2
- a ",,,
(4.2.48)
=
_ - a "' l
AB*BA,
(4.2.49)
Theorem. eL t A be an (n
(i) rank A = n;
(ii) Ax = 0 implies x
0;
.4 2.
Matrices
141
Exercise.
Ax o;
(4.2.54)
Theorem.
Exercise.
Theorem.
4.2.58.
Exercise.
Now let A be an (n X
Chapter 4 I iF nite-Dimensional
142
= -'!
A . ..
A.
(4.2.59)
m times
m times
As in the case of Eqs. (3.4.69) through (3.4.71). the usual laws of exponents
follow from the above definitions. Specifically. if A is an (n x n) matrix and
if rand s are positive integers. then
A' A'
(A' ) '
A' '
A' '
=
A" =
A" =
A' A'.
=
(A' ) ' .
(4.2.61)
(4.2.62)
(4.2.63)
Al = A
and
AO =
(4.2.64)
I.
(4.2.65)
We are now once more in a position to consider functions of linear transformations. where in the present case the linear transformations are represented
by matrices F o r example. if f(J.) is the polynomial in .J given in Eq. (3.4.74).
and if A is any (n X n) matrix. then by f(A) we mean
f(A) =
/1 0 1
/lIA
+ ... +
/I.A.
(4.2.66)
.4 2.
143
Matrices
.4 1.69.
Exercise.
.4 2.70.
Example.
[~ ~ ~l
A=
Then
A+
If /X =
~ :J
B = :[ 3
3, then
let
Example.
and A - B
=[
- i , then
/XC
Example.
[ l~ :~
~
I'J~ .
18
.4 2.71.
[~ ~ iJ.
=
1 0 -I
Then
If/X =
~ [~ i ~I
/XA =
.4 1.71.
and B
I~
7+ i
-i
3-
i 5+
7;
11
2;]
-4
I- 2- 3;]
-8;
7
[
1- i - 3 i
-6i.
+5i
[:
:]
and H -
[~
J-
-1,
eMpter 4 I iF nite-Dimensional
144
Then
GH
Notice that in this case H G
.4 2.73.
[~ ~J
=
K
Then
.4 2.74.
[~
and L =
10 5J
[ 22 13
KL=
*' LK.
is not defined.
Example.
Clearly, K L
10 13.
5]
[ 10 IS
22
and K L
~J
I[ I 7 12
16J
Example.
Let
[~ ~J
=
[~ ~J.
and N =
Then
[~
MN=
i.e., MN =
.4 2.75.
0, even though M t= =
Example.
~J
=0,
0 and N t= =
o. _
I 2]
2 4
AT=4560._
[
163 I
.4 2.76.
Example.
p~
Then
Let
5
[:
-6
~]
32
24
and
Q=
-45
24
24
- 1 6-
-6
24
27
24
24
-2
24
24
24
.4 2.
145
Matrices
1 0
p Q = Q P =
i.e., Q =
.4 2.77.
Q- l .
p- I or, equivalently, P =
Example.
0 1
~I
6el
2el
2e~
e3
+ 3e2 +
e2
e3
0e3
3e, =
e4 ,
0,
e,
= 0,
(4.2.78)
o.
=
eL t
~ ~] [~:]
:
2 1 0 1
e3
e,
= [
4 2 1 3]
[2
A= 6 314 .
I 0
].
(4.2.79)
(4.2.80)
Chapter 4 I iF nite-Dimensional
146
I~
;[= }
(4.2.81)
=[~}
(4.2.82)
~.
where U
and where
pI as
'II
y=
' I ..
[~_I -L~:J'
AZI
Au
(4.2.83)
r
q =
Allu
A 11 u
+
+
Au"' } .
Az ' "
(4.2.85)
.4 2.
Matrices
147
B= [~I_~LJ'
B21
(4.2.86)
Bu
BA =
We now prove:
L(X,
(4.2.87)
Theorem. Let
0:
1 0
0:,
I
I
I
I
I
I
p=
0 0
-~_.
(4.2.89)
:0
I
I
I
I
I
I
I
n- r
;0
where r =
dim R
< (P).
R
< (P)
EB (~ P).
Chapter 4 I iF nite-Dimensional
148
Theorem.
A E L(X,
X).
Let
[~: -i'~!:J o :A
2Z
where All is a (p x p) matrix and the remaining submatrices are of appropriate dimension.
.4 2.91.
Exercise.
.4 3.
EQUIVALENCE
AND SIMILARITY
e{ ;, ... , e~}
e; =
:t pjlej
j=
i=
1, ... ,n,
where Plj E F for all i,j = I, ... ,n. The set e{ ;, ... ,~}
X if and only if P = [Plj] is non-singular.
and let
(4.3.2)
.4 3.
149
lX I '
,IX "
It follows that
I, ... ,n.
0, i =
=
1':1
Rearranging, we have
or
I::" IX I"
1= I
e;, ... , e.
O.
=
Since
are linearly independent, it follows that IX I = ... = IX" = O.
Thus, the columns ofP are linearly independent. Therefore, P is non-singular.
{ I' . , PIt} be a linearly indeConversely, let P be non-singular, i.e., let P
pendent set of vectors in .X
"
,=I:: lX,e; =
Let
I' . . , IX"
" IX,PI'
... ,e,,} is a linearly independent set, it follows that I::
=
Then
Since e{ l'
for j
I, ...
I- '
I-'
IX
= ... =
.F
.. ,p,,} is a linearly
IX"
0, and therefore
e.}
,e.}
4.3.4.
Theorem. L e t e{ l, ... ,e,,} and e{ ;, . ..
be two bases for ,X
and let P be the matrix of basis e{ ;, ... ,e~}
with respect to basis e{ l' ... , eft}'
Then p- I is the matrix of basis e{ I' ... , eft} with respect to the basis e{ ;,
... , e,.}.
4.3.5.
Exercise.
Chapter 4 I iF nite-Dimensional
150
.4 3.6. Theorem. eL t X be a linear space, and let the sets of vectors e{ l>
... ,eft}' e{ ~
,e..}, and e{ f' , . .. , e':} be bases for .X If P is the matrix
, e'ft} with respect to basis e{ I ' , eft} and if Q is the matrix
of basis e{ ,~
of basis e{ f' ,
, e':} with respect to basis e{ ,~ ... ,e..}, then PQ is the
eft}'
matrix of basis e{ f' , . . , e':} with respect to basis e{ l ,
.4 3.7.
Exercise.
We now prove:
~ , e..} be two bases for a linear
.4 3.8. Theorem. eL t e{ I . . , eft} and e{ ,~
... ,e..} with respect to basis
space .X and let P be the matrix of basis ,~{
e{ lt , eft}' eL t x E X and let x denote the coordinate representation of x
with respect to the basis e{ lt , eft}' eL t x ' denote the coordinate representation of x with respect to the basis e{ ,~ ... ,e..}. Then Px ' = .x
Proof.
eL t x
(~I'
... '~ft)'
(~~,
... ,~~).
Then
and
Thus,
~ ft ~eJ
J-I
Therefore,
~ [ .~
1-'
J~I
ft
P/J~J'
j':1
plJe, ]
~ ft(.
~
P/J~J
t:1 I- I
) e,
I, ... n.
Px /.
.4 3.
151
to the bases e{ ,~
... ,f~}
in .Y Then.
A' = Q AP.
Proof.
We have
A(~ I~ Pklek) =
Ae; =
k~1
k~1
Pkl[l=t1 alk(t
q J d j)]
J=I
k~1
IN
"J ::1
t(f't1 t
J-l
lJ q alkPkl)fj.
k= 1
for i =
I, ... ,m andj
QAP.
A
Px'
x=
" y
"
t
(e;, .. .e~}
A'
x'
.4 3.11.
y=
Ax
u; ..... f;"}
"
y'
Qy
Definition. An (m X n) matrix
n) matrix A if there exists an (m X
Chapter" I iF nite-Dimensional
152
(n X
A' = Q AP.
(4.3.13)
n) matrices. Then
Exercise.
1 0 0 ..
1 0
...
..
0-
...
000 .. 1 0 0
0 0 0 .. 0 0 0
.. 0
.. 0
0 0 0 .. 0 0 0
.. 0
= rank A
(4.3.17)
.4 3.18.
Exercise.
.4 3.
153
.Y
=
We have:
Theorem. L e t A E L ( X ,
X), let (e l , , e.l be a basis for ,X and
let A be the matrix of A with respect to (e l ,
, e.l. L e t
(e~,
... , e"l be
another basis for X whose matrix with respect to (e l , , e.l is P. L e t A'
be the matrix of A with respect to (~, ... , e"l. Then
.4 3.19.
A'
P- I AP.
(4.3.20)
t,
....
A'
__
Ie,.' .. enl
t,,-
e{ ;, ... , e~}
.4 3.22.
= P- I AP.
n)
(4.3.23)
.4 3.24.
eMpter 4 I iF nite-Dimensional
154
Our next result shows that ' " given in Definition 4.3.22 is an equivalence
relation.
4.3.25.
Let A, B, and C be (n x
Theorem.
n) matrices. Then
(i) A is similar to A;
(ii) if A is similar to B, then B is similar to A; and
(iii) if A is similar to B and if B is similar to C, then A is similar to C.
4.3.26.
Exercise.
Theorem.
/%0'
,/%",
.F
Then
f(P- I AP) =
P- l f(A)P.
(4.3.29)
e{ l , , e.}.
(v) L e t A E L ( X , X ) , and letf(l) denote the polynomial ofEq . (4.3.28).
Let A be any matrix of A. Thenf(A) = 0 ifand only iff(A) = O.
4.3.30.
Exercise.
11
0 0
12 0
00
A' =
(4.3.31)
o
o
0
0
0
0
1"_1
1.
.4 .4
ISS
Determinants ofMatrices
Then
MOO
o A~ 0
(A')k
..
0
0
0
0
0
0
I 0
o I
f(A' )
0
0
(10
0 0
0 0
A'1'
0
Al
0 ........ .
Ar .........
o o
o o
0
A2
+ ...
(II
0
0
f(AI )
o
o
0
0
A"_I
0
0
f(A2)
A"
............
. ...........
o
o
0
0
o
f(l.)
"*
.4 .4
DETERMINANTS OF
MATRICES
Chapter 4 I iF nite-Dimensional
156
*-
= j dz
q
n)
...
... j,,'
I 2
( jl jz
q=
= {
sgn (q)
I
q
is even
-I
q is odd
for all q E P(N).
Before giving the definition of the determinant of a matrix, let us consider
a specific example.
4.4.1.
Example. As indicated in the accompanying table, there are six
permutations on N = (I, 2,3). In this table the odd and even permutations
are identified and the function sgn is given.
t1
t1
(jl.h)
(j.. h)
123
132
213
231
312
321
(1,2)
(1,3)
(2, 1)
(2,3)
(3,1)
(3,2)
(1,3)
(1,2)
(2,3)
(2,1)
(3,2)
(3,1)
(jz , h)
sgn t1
even
+1
-1
-1
+1
+1
-1
(2,3)
(3,2)
(1,3)
(3,1)
(1,2)
(2, 1)
odd
odd
even
even
odd
n) matrix
all al2
A=
is
odd or even
a~~
a"l
.. ~
a"z
alrt]
.........
.
~"
a""
We form the product of n elements from A by taking one and only one
element from each row and one and only one element from each column. We
represent this product as
.4 .4
157
Determinants ofMatrices
where tU i]. ... j.) E P(N). It is possible to find n! such products, one for
each u E P(N). We now define the determinant of A, denoted by det (A), by
the sum
det (A) =
where u
I:
"ep(N)
a.}.,
(4..4 2)
det(A)
(4..4 3)
Theorem.
eL t A and B be (n
x n) matrices.
Proof To prove the first part, we note first that each product in the sum
given in Eq. (4..4 2) has as a factor one and only one element from each
column and each row of A. Thus, transposing matrix A will not affect the
n! products appearing in the summation. We now must check to see that
the sign of each term is the same.
F o r U E P(N), the term in det (A) corresponding to 0' is sgn (u)a llta 2}
. a.} . There is a product term in det (AT) of the form a lt'lajo'2" . aN. such
that a 1lt a 2jo . . , a.} . = a} I ' l aN2 ... au . The right-hand side of this equation
is just a rearrangement of the left-hand side. The number of j; > j;+ I for
i = I, ... ,n - I is the same as the number of j/ > j/+ I for i = 1, ... ,
n - 1. Thus, if 0" = ;U j~ . . .j~) then sgn (u' ) = sgn (0'), which means det
(AT) = det (A). Note that this result implies that any property below which
is proved for columns holds equally as well for rows.
To prove the second part, we note from Eq. (4..4 2) that if for some i,
Q/ k =
0 for all k, then det (A) = O. This proves that if every element in a row
of A is ez ro, then det (A) = O. By part (i) it follows that this result holds
also for columns. _
Chapter 4 I iF nite-Dimensional
158
.4 .4 5.
Exercise.
of Theorem .4 .4 .4
3) matrix, then
det (A)
all
a ZI
au
an
a l3
a Z3
a ll
a 31
I, ... , n, and,
det (A)
1, ... ,n.
F o r example, if A is a (2 x
= J=IL "
a,AI'
(4..4 9)
.4 .4
159
Determinants ofMatrices
If A is a (3
det (A)
all
012.
0' 3
02'
au
023
0IlC I ,
0I1CU
0I3 C I3'
Exercise.
O"C"
02,C2'
a 3 ,c 31
Prove Theorem .4 .4 7.
We also have:
.4 .4 11.
Theorem. Ifthe ith row of an (n X n) matrix A consists of elements
of the form 0/1 + 0:" 0' 2 + 0;2' ,a," + 0:.; i.e., if
a.2
then
det(A)
.4 .4 12.
Exercise.
Furthermore, we have:
.4 .4 13.
Theorem. eL t A and B be (n x n) matrices. If B is obtained from
the matrix A by adding a constant tt times any column (or row) to any other
column (or row) of A, then det (B) = det (A).
.4 .4 14.
Exercise.
Chapter 4 I iF nite-Dimensional
160
.4 .4 15.
a,/c ,k
1=1
and
= 0 for j
*' k
(4..4 16a)
(4..4 16b)
.4 .4 17.
Exercise.
a,/c ,k =
1=1
to obtain
det (A)cS/k>
(4..4 18)
j, k =
1, ... , n.
We are now in a position to prove the following important result.
i, k =
.4 .4 20.
Theorem. eL t A and B be (n
Proof
We have
det (AD) =
det(AB)=
~
'.=1
By Theorem .4 .4 11
n) matrices. Then
(4.4.21)
a",.b / 1
and Theorem .4 .4 ,4
a""
a",.
This determinant will vanish whenever two or more of the indices i/,j = 1,
... , n, are identical. Thus, we need to sum only over (f E P(N). We have
det (AB) =
"EP(N)
b"lb,,1" .b ,
.4 .4
Determinants 01 Matrices
161
{I, ... ,
Exercise.
Proof
o
A' =
QAP=
o
This shows that rank A
det (QAP)
0. But
0,
Chapter 4 I iF nite-Dimensional
162
Let us now turn to the problem of finding the inverse A- I of a nonsingular matrix A. In doing so, we need to introduce the classical adjoint of A.
.4 .4 25.
Definition. Let A be an (n X n) matrix, and let c' j be the cofactor
of D/J for i,j = 1, ... ,n. Let C be the matrix formed by the cofactors of A;
The matrix (J is called the classical adjoint of A. We write
i.e., C = c[ /J' ]
adj (A) to denote the classical adjoint of A.
We now have:
.4 .4 26.
Theorem.
Let A be an (n
A[adj (A)]
n) matrix. Then
X
a[ dj (A)]A
[det (A)] I.
Proof The proof follows by direct computation, using Eqs. (4..4 18)
(4..4 19).
Let A be a non-singular (n x
CoroUary.
A -I
.4 .4 29.
Example.
We have det(A)
and
de/(A) adj(A).
(4.4.28)
_~ H
A~[:
-1,
adj (A)
and
=[
-3
-1
-1
1 -1
~],
-2
A- I
=
[
-~
det (8).
.4 5.
.4 .4 32.
Exercise.
163
Prove Theorems .4 .4 30
and .4 .4 31.
.4 .4 33.
Definition. The determinant of a linear transformation A of a
finite-dimensional vector space X into X is the determinant of any matrix
A representing it; i.e., det (A) Do det (A).
The last result of the present section is a consequence of Theorems .4 .4 20
and .4 .4 24.
.4 .4 34.
*"
.4 5.
EIGENVALE
U S
AND EIGENVECTORS
e; =
Ael =
lle l ,
(4.5.1)
i. = Ae. = l.e.,
where 1, E ,F i = 1, ... , n. If this is the case, then the matrix A' of A with
respect to the given basis is
A/ =
Chapter 4 I iF nite-Dimensional
164
.4 5.2. Theorem. eL t A
such that
L ( ,X
Ax
Ax =
(4.5.3)
is a linear subspace of .X In fact, it is the null space of the linear transformation (A - .tI), where I is the identity element of L(X,
)X .
Proof
Since the zero vector satisfies Eq. (4.5.3) for any .t E ,F the set is
non-void. If the zero vector is the only such vector, then we are done, for
O
{ J is a linear subspace of X (of dimension ez ro). In any case, Eq. (4.5.3)
holds if and only if (A - U ) x
= O. Thus, x belongs to the null space of
A - U , and it follows from Theorem 3.4.19 that the set of all x E X sat
isfying Eq. (4.5.3) is a linear subspace of .X
Henceforth
we let
mol = x{
:X (A -
.tl)x
OJ.
(4.5.4)
The preceding result gives rise to several important concepts which we
introduce in the following definition.
E
Ax
.tx
(4.5.7)
then .t is called an eigenvalue of A and x is called an eigenvector of A corresponding to the eigenvalue .t.
Our next result provides the connection between Definitions .4 5.5 and
.4 5.6.
.4 5.8. Theorem. Let A E L ( X ,
X ) , and let A be the matrix of A with respect
to the basis e{ ., ... ,e,,}. Then A. is an eigenvalue of A if and only if.t is an
eigenvalue of A. Also, x E X is an eigenvector of A corresponding to .t if
.4 5.
165
Exercise.
.4 5.10.
.4 5.11.
Exercise.
)X .
Then 1
F is an eigenvalue of A if
(4.5.12)
U)
det (A -
11).
(4.5.13)
-1)
au
at..
det(A - 1 1) =
(4.5.14)
0"1
ad
(a"" -
1)
has any roots in .F There is, however, a special class of fields for which
requirement (b) is automatically satisfied. We have:
.4 5.15.
Pel) =
o.
(4.5.16)
Chapter 4 I iF nite-Dimensional
166
Any 1 which satisfies Eq. (4.5.16) is said to be a root of the polynomial equation (4.5.16).
In particular, the field ofcomplex numbers is algebraically closed, whereas
the field of real numbers is not (e.g., consider the equation ..P + I = 0).
There are other fields besides the field of complex numbers which are
algebraically closed. oH wever, since we will not develop these, we will restrict
ourselves to the field of complex numbers, C, whenever the algebraic closure
property of Definition .4 5.15 is required. When considering results that are
valid for a vector space over an arbitrary field, we will (as before) make usage
of the symbol F or frequently (as before) make no reference to F at all.
We summarize the above discussion in the following theorem.
.4 5.17.
Theorem. eL t A
L(X,
Then
X).
/1 0
/1 0
/Ill
/1ft
= (-
/lz l z
+ ... +
/I)' f t
(4.5.18)
I)");
/II).
+ ... +
/lz)z'
/lft1"
= 0; and
det
(4.5.19)
Definition. eL t A E L ( X ,
det (A -
1I)
X),
= det (A -
).1) =
/1 0
/II).
+ ... +
/I)."
(4.5.21)
det(A - 1 1) =
(4.5.22)
.4 5.23.
A
L(X,
det (A -
).1)
(1 1
-
).)",,().z -
).)"" .
()., -
).)"",
(4.5.24)
.4 5.
167
where AI' i = 1, ... ,p, are the distinct roots of Eq. (4.5.19) (Le., AI 1= = A/
for i 1= = j). In Eq. (4.5.24), ml is called the algebraic multiplicity of the root AI'
ml =
1= 1
n.
det (A -
AI)
O.
~o
+ ... +
~IA
~"A".
Now let B(A) be the classical adjoint of (A ~ AI). Since the elements bli).)
of B(A) are cofactors of the matrix A - ),1, they are polynomials in A of
degree not more than n - 1. Thus,
blJ(A)
Letting Bk
PI/O
PI/IA +
... +
PI/<,,-Il
A,,-I.
Bo
By Theorem .4 .4 26,
(A -
Thus,
.tB I
AI)B(A) =
+ ... +
d[ et (A -
AI)]I.
~"I,
AB,,_I -
B"-1 =
... , AB I - B o =
~"_II,
I I,
ABo
~0I.
Premultiplying the above matrix equations by A", A"-I, ... , A, I, respectively, we have
-A"B"_I
A"B"_I -
~"A",
A1B I -
ABo =
~IA,
A"-IB"_1
ABo
~"_IA"-I,
~ol.
o=
~oI
~IA
+ ... +
~"A"
p(A),
... ,
Chapter 4 I iF nite-Dimensional
168
n) matrix
X
theorem, we have:
A~
(-I)~+I[(loI
f(A) =
Proof
=
Pol
PIA
+ ... +
P~_IA~-I.
(l~
(-I)~.
To prove part (ii), let f(A) be any polynomial in A and let P(1) denote
the characteristic polynomial of A. Then there exist two polynomials g(1)
and r(A) (see Theorem 2.3.9) such that
f(A) =
P(A)g(1)
r(1),
(4.5.27)
0, we have f(A) =
r(A),
The Cayley-aH milton theorem holds also in the case of linear transformations. Specifically, we have the following result.
and let p(l) denote the characteristic
.4 5.28. Theorem. eL t A E L ( X , X ) ,
polynomial of A. Then P(A) = O.
.4 5.29.
Exercise.
Example.
A=G J~
2, we assume that
= (I -
r(l) =
We must determine
Po
and
PI'
1)(2 -
I and 1 2
Po +
sU ing
1)
PI1.
the fact that P(11) =
P(11) =
0, it
.4 6.
169
follows thatfO' I )
Hence,
PI
= 237
Po =
I and
-
A37
2- 2
(2 - 2
=
or,
[I
A37 -
237
37
37
)1
Therefore,
(2 37
I)A,
-
0.J
I 237
-
.4 5.31.
trace A =
all
022
+ ... +
a..
(4.5.32)
(i.e., the trace of a square matrix is the sum of its diagonal elements).
It turns out that if F = C, the field of complex numbers, then there is a
relationship between the trace, determinant, and eigenvalues of an (n X n)
matrix A. We have:
.4 5.33.
A E L(X,
jJ ;
.ti';
t
J=I
mJ
1J
.4 5.34.
.4 6.
Exercise.
SOME CANONICAL
O
F RMS
OF
MATRICES
Chapter 4 I iF nite-Dimensional
170
Proof. The proof is by contradiction. Assume that the set e{ ,~ ... ,e~}
is linearly dependent so that there exist scalars I ' ... , p , not all ez ro, such
that
Ie~
+ ... +
= O.
pe~
We assume that these scalars have been chosen in such a fashion that as
few of them as possible are non-zero. Relabeling, if necessary, we thus have
Ie~
+ ... +
(4.6.2)
0,
,e~
where I ;= C 0, ... , IX, 1= = 0 and where r < p is the smallest number for which
we can get such an expression.
Since ll, ... ,l, are eigenvalues and since e~, . .. ,I, are eigenvectors.
we have
0=
Also,
A(O)
+ ... +
+ ... + (<,< l,)I,.
A(<<le~
(<<Ill)e~
o=
=
o=
.2., 0
(<<Il,)e~
.2.,(<,~
, 1,)
+ ... +
+ ... +
IAe~
, AI,
(4.6.3)
+ ... +
,e~)
(4.6.4)
(<,< .2.,)1,.
I (ll -
.2.,)e~
+ ... +
, (.2., -
l,)I,.
X = n ).
.4 6.
)..f)
171
)..)Oz -
()..l -
)..),
Al
A' =
(4.6.6)
o
A"
Ale;
Ae; = Aze;
Ae;
(4.6.7)
)..l)
IX l ' t
IX
A' =
(4.6.9)
Aft
P- I AP.
(4.6.10)
The matrix P is the matrix of basis (e;, e;, ... , e~} with respect to basis
(e l , ez , ... , ell}, and p- I is the matrix of basis e{ l, ... ,eft} with respect to
Chapter 4 I iF nite-Dimensional
172
basis e{ ,~ ... , e,,}. The matrix P can be constructed by letting its columns
be eigenvectors of A corresponding to AIt , A., respectively. That is,
P=
where x
tt
,x .
[XI'
,x.l,
2,
(4.6.11)
Exercise.
_ 2-[ J4
A-
det(A - 1 I)
det(A - 1 1)
= A2 + A-
6.
e4- 1
+ e4 2 =
~2'
0, I~ - ~2 = O.
Thus, any vector of the form
corre-
.4 6.
173
The diagonal matrix A' given in Eq. (4.6.9) is, in the present case,
= [~I ;J=
A'
~[ l~-
P=
[XI'
2]
Then
[ ..22 -.2.8J
=
p- I
and
P- I AP
A is given by
e~
1=1
[~
=
PIleI =
-~J
el
e2'
[~I
=
e;) c X
e{ ,~
;- J
[~
=
;J.
e;
=
1=1
Pnel =
4e
1 -
e2'
e;
Example.
A=
is
det (A -
AI)
13-2]
0 4
-2
-I
= (I - A)2(2 - A) = 0,
Chapter 4 I iF nite-Dimensional
174
m H[
and
Corresponding to
A~
we have an eigenvector
:[ }
Letting P denote a modal matrix, we have
p=[~
and
1 I]
0[ 1
Oland p- I
010
=
-1
-2
3
!n
A'-P-'AP=[~
In this example, dim moll = 2, which happens to be the same as the algebraic
multiplicity of 11" F o r this reason we were able to diagonalize matrix A.
The next example shows that the multiplicity of an eigenvalue need not
be the same as its algebraic multiplicity. In this case we are not able to
diagonalize the matrix.
.4 6.15.
Example.
is
21-2]
[ 001
=
det(A - ) .I)
0 2
-1
(1- 1 )(2
- 1 )~
rx
H[
*~ O.
.4 6.
175
Setting ~x = (1,0,0), we see that dim mAl = I, and thus we have not been
able to determine a basis for R3, consisting of eigenvectors. Consequently,
we have not been able to diagonalize A.
When a matrix cannot be diagonalized we seek, for practical reasons,
to represent a linear transformation by a matrix which is as nearly diagonal
as possible. Our next result provides the basis of representing linear transformations by such matrices, which we call block diagonal matrices. In the next
section we will consider the "simplest" type of block diagonal matrix, called
the Jordan canonical form.
Theorem. Let X be an n-dimensional vector space, and let A
Let Y and Z be linear subspaces of X such that X = Y EEl Z
and such that A is reduced by Y a nd Z. Then there exists a basis for X such
that the matrix A of A with respect to this basis has the form
4.6.16.
E L(X,
X).
where dim Y
matrix.
4.6.17.
Exercise.
A=l'-~[ *J
r, AI is an (r X
r) matrix and A2 is an (n -
r) X
(n -
r)
We can generalize the preceding result. Suppose that X is the direct sum
of linear subspaces X I ' ... ' X , that are invariant under A E L ( X ,
X).
We can define linear transformations AI E L ( X I , ,X ), i = 1, ... ,p, by
A/x = Ax for x E X,. That is to say, A, is the restriction of A to ,X . We now
can find for each A, a matrix representation A" which will lead us to the
following result.
Theorem. eL t X be a finite-dimensional vector space, and let
A E L(X,
)X . If X is the direct sum of p linear subspaces, X I ' ... , "X which
are invariant under A, then there exists a basis for X such that the matrix
representation for A is in the block diagonal form given by
4.6.18.
A=
AI:
...I : A2
,-- ,
._ -
r- -
: A,
it
Chapter 4 I iF nite-Dimensional
176
i
.4 6.19.
Exercise.
F r om the preceding it is clear that, in order to carry out the block diagonalization of a matrix A, we need to find an appropriate set of invariant
subspaces of X and, furthermore, to find a simple matrix representation on
each of these subspaces.
then
~
= :x{
= OJ,
ll)x
(A -
X)
1, ... ,n,
j =
X=
~I
EB
E B~.
F o r any x E ~J'
we have Ax = 1J,x
and hence AJx = 1Jx for x E ~J'
A basis for ~J is any non-zero x J E ~r Thus, with respect to this basis, AJ
is represented by the matrix 1J (in this case, simply a scalar). With respect to a
, .x ,}
A is represented
basis of n linearly independent eigenvectors, IX{ >
by Eq. (4.6.6).
In addition to the diagonal form and the block diagonal form, there
are many other useful forms for matrices to represent linear transformations
on finite-dimensional vector spaces. One of these canonical forms involves
triangular matrices, which we consider in the last result ofthe present section.
We say that an (n X n) matrix is a triangulu matrix ifit either has the form
all
or the form
012.
0 13
ab
022
023
02.
0
0
(4.6.21)
a._ I ,.
a
all
021
02:1,
(4.6.22)
.4 6.
117
B=
Now let C be the k
bl2
bk+I,z
. .
0....
bl,k+1
~ '.k:.1
bk+I,k+1
x k matrix
0-- :- p=
i
I
I
I
I
...
Q
I
I
0:
0:
...
~-I-
p- I
I
I
.:
I
1
0:
Q- I
178
Chapter 4 I iF nite-Dimensional
and
AI :.
-~_.
P- I BP
o:
I
I
I
I
I
I
I
I
o:
where the .' s denote elements which may be non-ez ro. Letting A = P-IBP,
it follows that A is upper triangular and is similar to B. eH nce, any (k + 1)
x (k + 1) matrix which represents A E L ( X , X ) is similar to the upper
triangular matrix A, by Theorem .4 3.19. This completes the proof of the
theorem. _
Note that if A is in the triangular form of either Eq. (4.6.21) or (4.6.22),
then
det (A - 11) = (a J I - A)(au - A) ... (a - 1).
In this case the diagonal elements of A are the eigenvalues of A.
.4 7.
MINIMAL POLN
Y OMIALS,
OPERATORS, AND THE
CANONICAL O
F RM
NILPOTENT
JORDAN
Minimal Polynomials
[~ o ~ =~].
3
-I
1)Z(2 -
(I -
1),
theorem that
O.
(4.7.1)
.4 7.
179
Minimal Polynomials
m(A) =
A)(2 -
(1 -
m(A)
A) =
2-
3A +
A2
= O.
3A
21 -
AZ
(4.7.2)
Thus, matrix A satisfies Eq. (4.7.2), which is of lower degree than Eq. (4.7.1),
the characteristic eq u ation of A.
Before stating our first result, we recall that an nth- o rder polynomial in
A is said to be monic if the coefficient of An is unity (see Definition 2.3.4).
4.7.3.
Theorem. L e t A be an (n
polynomial m(A) such that
X
(i) m(A) = 0;
(ii) m(A) is monic; and,
(iii) if m'(A) is any other polynomial such that m'(A) = 0, then the degree
of m(A) is less or equal to the degree of m'(A) (Le., m(A) is ofthe lowest
degree such that m(A) = 0).
Proof We know that a polynomial, p(A), exists such that P(A) = 0, namely,
the characteristic polynomial. F u rthermore, the degree of p(A) is n. Thus,
there exists a polynomial, say f(A), of degree m < n such that f(A) = O.
Let us choose m to be the lowest degree for which f(A) = O. Since f(A) is
of degree m, we may divide f(A) by the coefficient of Am, thus obtaining
a monic polynomial, m(A), such that m(A) = O. To show that m(A) is
uniq u e, suppose there is another monic polynomial m' ( A) of degree m
such that m'(A) = O. Then m(l) - m' ( l) is a polynomial of degree less than
m. F u rthermore, m(A) - m'(A) = 0, which contradicts our assumption that
m(A) is the polynomial of lowest degree such that m(A) = O. This completes
the proof. _
4.7.5.
O. Then
Chapter 4 I iF nite-Dimensional
180
Proof. Let 11 denote the degree of mel). Then there exist polynomials q ( l)
and r(l) such that (see Theorem 2.3.9)
I(l)
<
or r(l)
11
= q ( l)m(l)
r(l),
o=
q(A)m(A)
rCA),
and hence rCA) = O. This means r(l) = 0, for otherwise we would have a
contradiction to the fact that mel) is the minimal polynomial of A. Hence,
I(l) = q ( l)m(l) and mel) divides I(l).
.4 7.6. Corollary. The minimal polynomial of A, mel), divides the characteristic polynomial of A, pel).
.4 7.7.
Exercise.
We now prove:
.4 7.8.
q(,t).
mel) =
l'
+ ... +
P.l - '
P.
Now let
Then
(A -
lI)B(l)
l' B o +
A,-tB 1 +
= A'B o + A1- B
[ t
l' I
PtA,- I I
... +
ABo]
+ ... +
AB'I_
A,-l[Bl
P,- t ll
o+
[l'-'AB
+ ... +
AB t]
A[B,-t
P,I =
l - l AB.
+ .,.
AB,_t]
AB,_l]
m(l)I.
AB,_t
.4 7.
MinimolPolynomials
181
).1)] d[ et B().) =
m
[ ()') ft.
.4 7.9.
P().)
where m t ,
).\ , . .
,).p
().t -
).)"',().%
i.e.,
).)"' . .. ().p -
).%), . .
(). -
.4 7.11.
-
).)"",
).p)",
(4.7.10)
().
Vt,
4.7.12.
Theorem.
Let
.4 7.13.
Definition. eL t A E L ( X ,
X ) . The minimal polynomial of A is
the minimal polynomial of any matrix A which represents A.
In order to develop the J o rdan canonical form (for linear transformations
with repeated eigenvalues), we need to establish several additional preliminary
results which are important in their own right.
.4 7.14.
Theorem. Let A E L ( X ,
X ) . and letf().) be any polynomial in )..
Let m, = { x : f(A)x
= OJ. Then m, is an invariant linear subspace of X
under A.
m"
Let
Chapter 4 I iF nite-Dimensional
182
Then
and
{x:
AJT)qX
(A -
OJ.
(4.7.15)
}~
.4 6.20
~J.
Exercise.
Next we prove:
.4 7.18. Theorem. Let X be a vector space over C, and let A E
Let m(l) be the minimal polynomial of A as given in Eq. (4.7.10).
= (A - AI)", let h(A) = (l - A1)" ... (A - Ap )" if p 2 2, let
if p = I. eL t AI be the restriction of A to ~i',
i.e., AI X = Ax for all
Let ml = x { E :X h(A)x = OJ. Then
L(X,
X).
Let g(l)
h(A) = I
x E ~i'.
Proof By Theorem .4 7.14, ml and ~i' are invariant linear subspaces under
A. Since g(l) and h(l) are relatively prime, there exist polynomials (q A) and
r(l) such that (see Exercise 2.3.15)
q ( l)g(l)
r(l)h(l)
1.
.4 7.
eH nce,
183
Minimal Polynomials
(q A)g(A)
Thus, for x
,X
we have
x
Now since
h(A)q(A)g(A)x
r(A)h(A)
(q A)g(A)x
(q A)g(A)h(A)x
I.
(4.7.19)
r(A)h(A)x.
(q A)m(A)x
(q A)Ox
0,
it follows thatq(A)g(A)x
E ml. We can similarly show that r(A)h(A)x Emi' .
Thus, for every x E X we have x = XI + x 2 , where IX E mi' and X z E ml.
Let us now show that this representation of x is unique. Let X = IX
X 2
= x; + x~,
where IX ' ;x E ml ' and 2X ' ~x E ml. Then
r(A)h(A)x
r(A)h(A)x
;x
XI
and
r(A)h(A)x;.
=
we get
r(A)h(A)x l
;X
r(A)h(A)x;.
o.)
ml(l) =
and
m2(A)
(A -
ll)kt
(1 -
A2)lo' ... (1 -
A,)lo,.
m l (A)m 2(A)x
Therefore,f(A) =
i = I, ... ,po
We thus conclude that kl
proof of the theorem. _
VI
for i
<
O.
VI
<
kl'
184
Chapter 4 I m
F ite-Dimensional
p(A.) =
A.)-,'
(4.7.21)
A.,)".
(4.7.22)
polynomial of A be
(A.I -
eL t
,x =
Then
i=
(i) "X
(ii) X =
:x {
(A. -
(A -
A. I ) " . . (A. -
OJ,
A.,I)"x =
i=
I, ... ,po
Et> .. Et>
Xl
X,;
(iii) (A. - A.,)" is the minimal polynomial of A" where A, is the restriction
of A to X,; and,
(iv) dim ,X = m" i = I, ... ,po
Proof The proofs of parts (i), (ii), and (iii) follow from the preceding
theorem by a simple induction argument and are left as an exercise.
To prove the last part ofthe theorem, we first show that the only eigenvalue
of A, E (L "X
,X ) is A." i = I, ... ,po eL t f) E "X
v*" 0, and consider
(A, - A.l)v = O. From part (iii) it follows that
0= (A, - A.,ly"V = (A, - 11I),1- (A , - A.I/)v
= (A, - 1,I),I- (A. - A.,)v = (A. - A.,)(A, - A.,I),.- l (A, - A.,l)v
(A. - l ,)l(A , =
A.,I),,-l v =
...
= (A. - A.,)"v.
det (A -
A./) =
D; det (A, -
A./).
A.)IIII .
(A., -
A.)III, =
(A. l
-
A.)" .. (A., -
A./ must
Exercise.
A.)t"
m i=
"
1, ... ,po
of Theorem .4 7.20.
.4 7.
Nilpotent Operators
185
Exercise.
.4 7.26. Exercise.
.4 6.14 and .4 6.15.
B. Nilpotent Operators
eL t us now proceed to find a representation for each of the A, E L ( X
,X )
" of
in Theorem .4 7.20 so that the block diagonal matrix representation
A E L(X,
X ) (see Theorem .4 6.18) is as simple as possible. To accomplish
this, we first need to define and examine so-called nilpotent operators.
.4 7.27. DefiDition. eL t N E L ( X ,
X). Then N is said to be nilpotent if
there exists an integer q > 0 such that N" = O. A nilpotent operator is said
to be of index q if N" = 0 but N,,- I "* O.
Recall now that Theorem .4 7.20 enables us to write X = X I EB X z EEl
X . Furthermore, the linear transformation (A, - A,l) is nilpotent on ~.
Ifwe let N, = A, - A,I, then A, = All + N,. Now 1,1 is clearly represented
by a diagonal matrix. oH wever, the transformation N, forces the matrix
representation of A, to be in general non-diagonal. So our next task is to
seek a simple representation of the nilpotent operator N,.
In the next few results, which are concerned with properties of nilpotent
operators, we drop for convenience the subscript i.
EB
.4 7.28. T
' heorem. eL t N E L ( V, V), where V is an m-dimensional vector
space. If N is a nilpotent linear transformation of index q and if x . E V is
such that N,- l x
0, then the vectors x , Nx , ... , N,,- I x in V are linearly
independent.
*"
Chapter 4 I iF nite-Dimensional
186
1= 0
= -
= NJ+I[~
NJ x
l{ ,1 Nix
l{ ,J
I=I+ J
Thus,
o.
*- o. Then we can write
l{ ,INI X =
(- ! t )NI- J - I
(l,J
I=I+ J
*- O.
=
X ]
NJ+l
y,
Nf- I X
Nf- J - I NJ x
Nf- J - I NJ + l
Nfy
= O.
0, I, ... ,
the matrix
Proof.
Hence,
Ne l
= 0 et
Ne 2
Ne f
= 0 e
I et
{ el->J
0 e2 +
0 e2 +
0 e2 +
2, ... ,q .
1=
+
... +
0 . ef -
0 ef -
I e
f-
0 ef
0 eq
0 e
.4 7.
187
Nilpotent Operators
Let N
m.
v, where dim
L ( V, V) be nilpotent of index
Proof Assume x E V, N x = 0, N- - I X
0, and v > m. Then, by Theorem
4.7.28, the vectors x , Nx , ... , N- - I x are linearly independent, which contradicts the fact that dim V = m.
= OJ, dim WI =
= {x: N2X = OJ, dim W 2 =
WI
W2
{x:
W.
{x:
Nx
N' x
= OJ, dim
W.
L ( V,
II,
12 ,
I.
Also, for any i such that I < i < v, let { e l' ... , em} be a basis for V such
that e{ lt ... ,ed is a basis for WI' Then
(i) WI C w2 C . C W.; and
(ii) (e u " " e"_,, Ne,.+1> ... ,Ne, .. ,} is a linearly independent set of
vectors in W,.
To prove the first part, let x E WI for any i < v. Then NiX = O.
eH nce, NI+ I X = 0, which implies x E W1+ 1 '
To prove the second part, let r = II- I and let t = 11+ I - II' We note
that if x E WI+ I , then NI(Nx ) = 0, and so Nx E WI' This implies that
Ne J E WI for j = II + I, ... ,11+1'
This means that the set of vectors
{el, ... ,e" NeH> !
... , Ne"..} is in WI' We show that this set is linearly
independent by contradiction. Assume there are scalars (XI" ,(x , and
PI' ... , PI> not all ez ro, such that
Proof
(Xle l
Since e{ l ,
be non-ez ro.
eH nce,
+ ... +
(X,e,
PINe,,+1
+ ... +
p,Ne".,
= O.
Chapter 4 I iF nite-Dimensional
188
Thus,
+ ... +
fl,e, ..> = 0,
W,. If fl.e,,+! + ...
N=:[ '
where
N,=
:],
(4.7.34)
N,
0100
0010
00
00
0000
0000
01
00
(4.7.35)
I. -
I._I
2/, -
1'1+
2/. -
11
(v
1,-.
(i
lI,
and k, is
X v) matrices,
x i) matrices, i = 2, ... ,v -
(I x
I, and
I) matrices.
.4 7.
189
Nilpotent Operators
,/(/y- I v_ . ),y
= e,y and let It. .- 1 = Nlt.., ... ,/(/.- 1 .- . ),.- 1 = NI(/._I . ) ,
By Lemma .4 7.32, it follows that {el>'"
,e,._.,fl . - I ,' " ,I<,.-, . ,) - I } is
a linearly independent subset of W._I> which mayor may not be a basis for
W._ I' If it is not, we adjoin additional elements from W._> \
denoted by
1<,.-, . 1+ 1 . - 1 "" ,/(/. -Iv ) >\- so as to form a basis for W._ I Now let
11 . 2- . = NII - I ,I2. 2- . = NI2.. - I '
,1<, . ,- . ),.-2. = NI<, . ,_ . ). _ I By
Lemma .4 7.32 it follows, as before, that e{ >\ ... , e,. ,/I. 2- .,. .. ,1<, . I- . ). 2- .}
is a linearly independent set in W.-2.' If this set is not a basis we adjoin vectors
from W.-2. so that we do have a basis. We denote the vectors that we adjoin
by 1<".,-1 . 1+ 1 . - 2 ., ,1<,. . - 1 .,.) 2- .' We continue in this manner until we
have formed a basis for V. We express this basis in the manner indicated in
Figure C.
Basis for
f '." - -,
f(/.-I..-,I.
f"._" ,-
f(l.- I ,),V- l , - - .
f(l._ , - / .-
2 ),v- l
f,,2' - - '
f(l.- I ,I,2,- - - - - - - - - ,
f(/2- 1 ,),2
f,." ,-
f(l._ I ._ , ).
f(/2- 1,), 1. - - ,
.4 7.36.
,,- - - - - - - - - ,
f/"I'
NI; =
1./
eH nce, if we let XI =
bottom to top, is
/{ ,./-0,>\
j>
j =
I
I.
C reading
Chapter 4 I iF nite-Dimensional
190
- I. - 1.- 2 ) +
there are a total of(/. - I.- I ) + (2/'1+ (2/ 1 - 12 ) = II columns in the table of Figure C.
This completes the proof of the theorem. _
... +
(2/ 2
-
II -
13 )
C. The oJ rdan
Canonical oF rm
We are finally now in a position to state and prove the result which establishes the Jordan canonical form of matrices.
.4 7.37.
A E L(X,
p(A) =
(AI -
A)m.,
(A -
AI)" ... (A -
A,)",
Then
(i)
(ii)
(iii)
(iv)
Xl>""
X
X,
x{
X:
(A -
A,I)"x
= OJ.
= IX EB .. EB
X,;
where A, is an (m,
=
X
0 ... 0]
~ ... ~.2
o
: : :
~.
'
(4.7.38)
... A,
A, = 1,1 + N,
(4.7.39)
and where N, is the matrix of the nilpotent operator (A, of index V, on ,X given by Eq. (4.7.34) and Eq. (4.7.35).
liT)
.4 7.
oJ rdan Canonical oF rm
191
A=
-1
1)7.
(I -
This implies that 1 1 = I is the only distinct eigenvalue of A. Its algebraic multiplicity is m. = 7. In order to find the minimal polynomial of A.
let
N
),.1,
A-
-2
N= A - I =
-2
o
o o
-1
-I
X).
-I
1 -I
0
-6
-I
-I
1
0
o
o
3 0o0
0
3 0
o
o
0
0
4 0
Chapter 4 I iF nite-Dimensional
192
o
o
NZ
-1
0
I
o
o
o
0
0
0
0 -I
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
3 0
o0
0
0 -3 0
0
3 0
o0
0
0
o0
0
o0
N3 = 0 ,
and so
X =
VI =
5't~.
N' =
1 0iO 0 0 0
001:0000
I
o 0 0:0 0 0 0
r- -j000:01:00
I
I
o 0 010
010 0
- - - r-
o 0 0 0 0:0:0
o 0 0 0 0 0:0
1
,-_
..
NZx..
Nx . .
x..
Nx z , X z , x 3, x ...
We will represent the vectors x .. X z , "x
and x .. by x .. x z , "x
and x ..,
their coordinate representations, respectively, with respect to the natural
basis u{ .. ... , u,} in .X We begin by choosing XI E W3 such that X I 1= Wz ;
i.e., we find an X I such that N 3x I = 0 but NZx I :# O. The vector x f = (0,
.4 7.
193
oJ rdan Canonical oF rm
N
[ xz
l,
Nx l , X I '
Nx z , X z , x
3,
x,]
is the matrix of the new basis with respect to the natural basis (see Exercise
.4 3.9).
where
-I
0
I
P=
-I
0
0 0 -2 I
0
0 I
0 0
0
I 0
2 0 -I
0 0 -2 0 -2
0 0
0 0
0 0
p- l =
I
I
3
I
0
0
0 I
0 0
0
0 0
I
0 -I 0
0
and
0 0
-2
0 0
I
3 -I
0
0 I
0
0
0 -3
0
I -I
0 0 -I
-I -3
-2 -I
I 0
0
0 -I
0
0
0
0
I
0 0
0 0
0
0
0
I
0
A' =
N'
I.
(Recall that the matrix representation for [ i s the same for any basis in .X )
Thus,
Chapter 4 I iF nite-Dimensional
194
1 1 0iO 0 0 0
I
011:0000
001:0000
t- - 00 0: 1 1:0 0
0 0 0 :I 0 1 :I 0 0
o 0 0
0
I
A' =
o- o- T"i-l
'- -i-
0 0 0 0 0 OIl
Again, the reader can show that
A' =
P- I AP.
= AP.
.4 7.42.
Exercise. eL t X = R' , and let u{ t , , u,} denote the natural
X ) be represented by the matrix
basis for .X Let A E L ( X ,
A=
05 -1
I
1
0
3 -I
-1
1
0
0
4
0
0
0
1
1
-1
0
0
0
4 -1
0
0
0
0
1
3
0
0
0
0
1
3
04
A' =
0iO
I
1:000
o 0 4:0 0 0
O-O
- O
- r- i- 4 l0
0_
1_ _
0 0:0 4 : 0
0 0 0 0 i2
~
.4 8.
BILINEAR
N
UF CTIONALS
AND CONGRUENCE
.4 8.
195
and
px 2,Y ) =
f(x , IY
PY2) =
f (x ,
PIr
Rand ' J x
IY r
Definition. eL t e{
X,j
l , ... ,
2Y
YI'
Pf(x , 2Y )
E
.X
tl 1;1 JPd(x
J, Ir"'i;,. PlrYIr)
Pf(x 2,y)
YI)
f(ii x J
f (x l ,y)
As a consequence of
J, IY r)
I, ... , rand k =
I, ... ,s.
and
I, ... , n.
The matrix F = lftJ ] is called the matrix of the bilinear functional f with
respect to e{ l> ... , en)'
Our first result provides us with the representation of bilinear functionals
on finite-dimensional vector spaces.
.4 8.2. Theorem. Let f be a bilinear functional on a vector space ,X and
let fe I' . . , e.l be a basis for .X eL t F be the matrix of the bilinear functional
fwith respect to the basis fel> ... ,e.l. If X and yare arbitrary vectors in X
and if x and yare their coordinate representation with respect to the basis
fel , e2 , , e.l, then
Proof
+ ... +
~
:E fttl' l J '
1'1= I- J
We have x T = (el" .. ,en) and yT = ('II" .. ,'In)' Also,
e.e. and Y = ' I le l + ... + I' .en Therefore,
f(x , y) =
f(x , y) =
:E
:E
I- I I- J
T
x yF
el' l J ( e l, eJ )
= :E
:E ftJel'lJ
1= I = J I
=
(4.8.3)
=
ele l
T
x yF
Chapter 4 I iF nite-Dimensional
196
if f(x , y) =
concept.
4.8.4.
Definition.
skew symmetric if
for all x, y
.X
f(x , y)
.X
(4.8.5)
- f (y,x )
.4 8.7.
Exercise.
Definition. An (n
X
n) matrix F
is said to be
(i) symmetric if F = T
F ; and
(ii) skew symmetric if F = - F T .
The next result is easily verified.
.4 8.9. Theorem. Let f be a bilinear functional on ,X and let ft and f2 be
the symmetric and skew symmetric parts off, respectively. Then
and
for all x , y
4.8.10.
!(x)
f,(x , y) =
t[ f (x ,
y)
feY , )x ]
f2(X, y) =
t[ f (x ,
y) -
feY , )x ]
.X
Exercise.
Now let us recall that the q u adratic form induced by f was defined as
= f(x , x). On a real finite-dimensional vector space X we now have
.4 8.
197
for all x , y
y)
J(Y,
)x ]
J(x,
x) -
! [ g (x ,
y)
g(y, )x ]
.X
y, x -
y) =
y) -
feY , y) -
f(x -
y, x -
y)].
g(y, y) -
g(x -
y, x -
y)]
y)
Now if g(x , x ) =
![J(x,
y)
J(Y,
)x ]
! [ f (x ,
=
x ) , then
J(x,
J(Y,
)x ]
=
=
so that
![J(x,
x)
y) 1
+
! [ g (x ,
x)
! [ g (x ,
y)
(Y, )x ]
+
+
J(Y,
x)
J(x,
J(Y,
y).
g(y, x)],
! [ g (x ,
y)
(4.8.12)
g(y, )x .]
Then, in
.X This
Exercise.
Chapter 4 I iF nite-Dimensional
198
is given by
P"F P .
=
(4.8.16)
Proof
~
F'
Let F '
where, by definition,I:/
~ ~
Hence, F '
~ ~
"-I I-'
t:1
t:' t t:1
Then/(
"-I
P"F P .
p",e".
_
We now have:
4.8.17. Definition. An (n x n) matrix F ' is said to be congruent to an
(n X n) matrix F if there exists a non-singular matrix P such that
F'
=
PTFP.
(4.8.18)
,." .F
Note that congruent matrices are also equivalent matrices. The next
theorem shows that ,." in Definition 4.8.17 is reflexive, symmetric, and
transitive, and as such it is an equivalence relation.
4.8.19.
Theorem.
Let A, B, and C be (n x
n) matrices. Then,
(i) A is congruent to A;
(ii) if A is congruent to B, then B is congruent to A; and
(iii) if A is congruent to Band B is congruent to C, then A is congruent
toC.
Proof Clearly A = ITAI, which proves the first part. To prove the second
part, let A = PTBP, where P is non-singular. Then
= (PT)-IAP-I = (P-I)TA(P-I),
.4 8.
199
+1
p}
+1
-1
n (4.8.21)
-1
0
o
The integers rand p in the above matrix are uniquely determined by the
bilinear form.
Proof. Since the proof of this theorem is somewhat long, it will be carried
out in several steps.
Step 1. We first show that there exists a basis v{ u ... , v.J of X such that
/(v1, vJ) = 0 for i 1= = j. The proof of this step is by induction on the dimension
of .X The statement is trivial if dim X = 1. Suppose that the assertion is
true for dim X = n - l. Let / be a bilinear functional on ,X where dim
X = n. L e t VI E X be such that /(v l , VI) 1= = O. There must be such a VI;
otherwise, by Theorem .4 8.13, / would be skew symmetric, and we would
conclude that/(x , y) = 0 for all x , y. Now let mz = x { E X : f(v l , x) = OJ.
We now show that mz is a linear subspace of .X Let X u 2X E mz so that
f(vl , X I ) = f(v u x 2) = O. Then f(v u X I + x 2 ) = f(vl , X I ) + f(v l , x 2) = 0
0 = O. Similarly, f(vt> X I ) = 0 for all E R. Therefore, mz is a linear
Furthermore, mz 1= = X because VI mz. Hence, dim mz
subspace of .X
:= ;;; n - 1. Now let dim mz = q < n - 1. Since / is a bilinear functional on
mz, it follows by the induction hypothesis that there is a basis for mz consisting
of a set of q vectors v{ 2 , , vf+tl such that f(v1, vJ) = 0 for i 1= = j, 2 < i,
j < q + 1. Also, f(v l , vJ) = 0 for j = 2, ... ,q + I, by definition of mz.
Chapter 4 I iF nite-Dimensional
200
uF rthermore, f(v VI) = f(v l , vJ eH nce, f(v VI) = f(v., v,} = 0 for i = 2,
"
... ,q + l .
It follows
that f(v"vJ } = 0 for" i:# j and I~i,j<q+l.
We now show that {VI"" ,vq+ll is a basis for .X eL t x E X and let x '
= x - (XlVI, where (XI = f(v l , x ) ff(v l , VI)' Then f(v., x'} = f(v" )x
(X t /(v l , VI) = f(v l , VI) - f(v l , VI) =
O. Thus, x ' E mI. Since v{ ,z . . ,
vq + l l
is a basis for mI, there exist (X2" ,(Xq+1
such that x ' = (XV
z z + ...
+ (XI+q V q + l ; i.e., x = (XlVI + ... + (XI+q V q + l , Thus, { V I"' " vq + l } spans .X
To show that the set {VI" , vq + l } is linearly independent, assume that
(XlVI +
... + (XI+q Vq1+
= O. Then 0 = f(v l , 0) = f(V., (XlVI + ... +
(XI+q V
(X t /(v"
VI) =
0, which implies that (x , = O. eH nce, (X2VZ + ...
q + l) =
(XI+q V
O. Since the set { V 2"' " vqt+ l
forms a basis for mI, we
q + 1 =
must have (X2 = ... = (X1+q
= O. Thus, { V I"' "
vqt+ l forms a basis for
,X
and we conclude that q + I = n. This completes the proof of
step l.
Step 2. eL t { V I' , v.l be a basis for X such that f(v" VJ) = 0 for
i:# j and let P, = f(v v,) for i,j = I, ... , n. eL t e, = 1' IV, for i = 1, ... , n,
where 1' , = If..Jj;J l "if P,:# 0 and 1' 1 = 1 if P, = O. Now suppose that
P, = f(v" v,):# O. Then we have f(e" e,l = f(Ylv 1' ,V,) = 1' 1f(v" v,)
= P,f..J7lJ = l. Also, if P, = f(v v,) = 0, then f(e" e,l" = Ylf(v" v,) = O.
"
iF nally, we see that f(e eJ ) = f(y,v
vJ Y J ) = IY 1' fJ (v
VJ) = 0 if i:# j.
"
Thus, we have established a basis for" X such that fl}" = f(e" eJ ) = 0 if
i j and "[ = f(e,. e,l = + 1, - 1 , or O.
Step 3. We now show that the integers p and r in matrix (4.8.21) are
uniquely determined by f eL t { e l, ,e.l and {e~, . .. , e~} be bases for X
and let F and F ' be matrices offwith respect to { e l, , e.l and Ie;, ... , i.},
respectively, where
o
-1
F=
-I
n- p
o
o
.4 8.
201
o
-1
F'=
-1
n- q
To prove that p = q we show that e l , ,e" e;+h ... ,e:. are linearly
independent. F r om this it must follow that p
(n - q ) < n, or p < .q By
the same argument, q < p, and so p = .q L e t
Ylel
y,e, +
where 1' 1 E R, i = 1,
,p and 1' :
above equation we have
Then
+ ... +
"leI
=
l , ... ,
f(x o, x o) =
Y~
R, i
, ~e:.
,Y e"
Ylel
0,
=
+ ... +
-(Y;+le;+1
+ ... +
+ ... + Y~ >
y~e:.)
o'
+ ... +
pY e p)
(r~+I~+I'
... ,y~e~)]
0,
+ ... +
f[-(~+,~+,
= f(y,e,
f(x o, x o)
by choice of{ e
"pep =
+ ... +
;Y l+ e;+1
(- 1 )Z[ -
(,,~+
1)2 -
(,,~+z)Z
y~e:.),
(y~)Z]
<
+ ... +
by choice of{~+I"
.. ,e~+R}'
F r om this we conclude thaty~
'1~ = 0;
i.e., 1' 1 = ... = 1' p = O. Hence, Y~+ I~+ I + ... + y~e~
= O. But the set
{~+I"
.. , e:,} is linearly independent, and thus Y~+I
= ... = , ~ = O. Hence,
... ,e~ are linearly independent, and it follows
the vectors el' ... ,ep , ~+t,
thatp = .q
To prove that r is unique, let r be the number of non-zero elements of F
and let r' be the number of non-zero elements of F ' . By Theorem .4 8.15,
F and F ' are congruent and hence equivalent. Thus, it follows from Theorem
4.3.16 that F and F ' must have the same rank, and therefore r = r'.
This concludes the proof of the theorem. _
201
Chapter 4
I Finite-Dimensional
"*
.4 9.
Exercise.
n; and
EUCIL DEAN
VECTOR SPACES
Among the various linear spaces which we will encounter, the so-called
Euclidean spaces are so important that we devote the next two sections to
them. These spaces will allow us to make many generalizations to facts
established in plane geometry, and they will enable us to consider several
important special types of linear transformations. In order to characterize
these spaces properly, we must make use of two important notions, that of
the norm of a vector and that of the inner product of two vectors (refer to
Section 3.6). In the real plane, these concepts are related to the length of a
vector and to the angle between two vectors, respectively. Before considering
the matter on hand, some preliminary remarks are in order.
To begin with, we would like to point out that from a strictly logical
point of view Euclidean spaces should actually be treated at a later point of
.4 9.
203
[~:J
and y =
:[ :J
(4.9.1)
.4 9.1.
,n
Chapter 4 I iF nite-Dimensional
we say in this case that "the distance from x to y" is equal to {(~I
- I' I)Z
+ (~z - I' Z)Z}1/2, that "the distance from the origin 0 (the null vector) to
x" is equal to (~f + ~DI/Z,
and the like. Using the notation of the present
chapter, we have
(4.9.3)
and
Ix - yl =
,J ( x
y)T(x -
= ,J ( y - )X T(y - )x = Iy - lx .
y)
(4.9.4)
cos 8 =(~
Utilizing
17~+
~z7z)
(4.9.5)
""'~f
+ i ""'I' I + I' z
the notation of the present chapter, we have
,J
cos (J =
T
x x
XT~
(4.9.6)
yTy
Now if we let x
t:.
T
x y.
(4.9.7)
Ix I = ""'(x, x).
(4.9.8)
>
(x, x)
and
(x , x )
0 for all x * - O
0 for x
(4.9.9)
(4.9.10)
O.
(4.9.11)
(y, x)
for all x and y. Moreover, for any vectors ,x y, and z and for any real scalars
and p we have, in view of Eq. (4.9.7), the relations
(x
(x , y
and
y, )z =
)z =
(x, )z
(x, y)
(Y, )z ,
(4.9.12)
(x , )z ,
(4.9.13)
y) =
(x,
y),
(4.9.14)
(x , y ) =
(x,
y).
(4.9.15)
(<,x<
In connection with Eq. (4.9.6) we can make several additional observa1; if x = - y , then cos
tions. First, we note that if x = y, then cos (J =
8 = - 1 ; if x T = (~I> 0) and yT = (0, I' )z , then cos (J = 0; etc. It is easily
.4 9.
verified, using Eq. (4.9.6), that cos (J assumes all values between
1 and
- 1 ; i.e., - 1 < cos (J S
1.
The above formulation agrees, of course, with our notions of length
of a vector, distance between two vectors, and angle between two vectors.
F r om Eqs. (4.9.9}-(4.9.l5)
it is also apparent that relation (4.9.7) satisfies
all the axioms of an inner product (see Section 3.6).
U s ing the above discussion as motivation, let us now begin our treatment
of Euclidean vector spaces.
F i rst, we recall the definition of a real inner product: a bilinear functional
f on a real vector space X is said to be an inner product on X if (i) f is symmetric and (ii) f is strictly positive. We also recall that a real vector space X
on which an inner product is defined is called a real inner product space. We
now have the following important
.4 9.16.
*'
.4 9.17.
if y =
o.
(4.9.9}-(4.9.15)
0 for all x
X if and only
Chapter 4 I iF nite-Dimensional
A E L(X,
.4 9.19.
y E X,
A, B E L ( X ,
Corollary. Let
then A = B.
4.9.20. Corollary.
x , y E R-, then A =
4.9.11.
Exercise.
X).
0 for all ,x y E X
If (x, Ay) =
X).
A be a real (n x n) matrix.
Let
o.
If x T Ay
0 for all
(~\t
. . , ~_)
Definition. F o r each x E ,X
We call Ix
Ixl
I the norm of .x
let
(x ,
X ) 1/2.
Let
,,_ ) .
:E
/~ '
I- I
(x, y) =
(4.9.24)
(x , y)
Ixl =
:E l~
I- I
= Tx y,
)1/2
(X TX)1/2
(4.9.25)
(4.9.26)
.4 9.
207
.4 9.27.
Definition. The vector space R" with the inner product defined
in Eq. (4.9.24) is denoted by P. The norm of x given by Eq. (4.9.26) is called
the Euclidean norm on R".
Relation (4.9.29) of the next result is called the Schwarz inequality.
.4 9.28.
Theorem.
Then
Ix l I Y I ,
l(x,y)1 ~
(4.9.29)
where in Eq. (4.9.29) I(x, y) I denotes the absolute value of a real scalar and
Ix I denotes the norm of .x
F o r any x and y in X and for any real scalar tt we have
Proof
(x
tty, x
tty)
(x, )x
tt(x, y)
Then
(x
tty, x
tty)
(x
(x
,
,
tt(y, )x
tt 2(y, y)
>
O.
- ( x , y).
(y, y)
(x, x )
or
2tt(x, y)
tt 2(y, y)
x) _
2(x, y)(x y)
(y, y ) '
x) _
(x , y)2 >
(y,y) -
(x, x)(y, y)
>
(x , y)2(y y)
(Y, y)2 ,
0
,
(x , y)z.
Taking the square root of both sides, we have the desired inequality
l(x,y)1 < Ix l l yl
.4 9.30.
Exercise.
F o r ,x y
,X
= 0, Iy I = 0,
show that
l(x,y)1 = Ix l ' l yl
.4 9.31.
Theorem.
following hold:
For
all x and y in X
(i) Ix l >
0 unless x = 0, in which case Ixl = 0;
(ii) Ittx I = Itt I . Ix I, where Itt I denotes the absolute value of the scalar
tt; and
(iii)
Ix
IY ~
Ixl
Iyl
Chopter 4 I iF nite-Dimensional
Proof The proof of part (i) follows from the definition of an inner product.
To prove part (ii), we note that
Ilx z =
(<,x<
x)
( ,x
x)
Z
(x,
z.
= lz lx
)x
Taking the square root of both sides we have the desired relation
Ilx
= Illx I
To verify the last part of the theorem we note that
Ix
sU ing
ylZ
(x
= Ixl z +
y, x
y)
2(x , y)
(x, x)
+ Iylz.
2(x , y)
(Y , y)
Ix +
ylZ <
Ix +
y I<
Ix I + Iy I,
x+y
o
.4 9.32.
.4 9.33.
Ix +
Exercise.
Deorem. F o r all ,x y
holds.
.4 9.34.
Iyl
ylZ
+ Ix -
X the equality
ylZ
= 21xl z + 21yI Z
.4 9.
andy of X a s
p(x , y) =
Ix -
(4.9.35)
yl.
F o r all x , y, Z E ,X
Theorem.
(i) p(x , y) =
(ii) p(x , y) ~
(iii) p(x , y) <
p(y, x ) ;
0 and p(x , y)
0 if and only if x
p(z, y).
p(x , )z
y; and
A function p(x , y) having properties (i), (ii), and (iii) of Theorem .4 9.36
is called a metric. Without making use of inner products, we will in Chapter 5
define such functions on non-empty sets (not necessarily linear spaces) and
we will in such cases speak of metric spaces (Euclidean spaces are examples
of metric spaces).
.4 9.37.
Exercise.
B.
Orthogonal Bases
4.9.38.
Theorem.
Proof.
Ix +
Let x , y
Ix +
= Ixl z +
Iylz.
ylZ
(x
y, x
y)
(x, x )
(x , y)
(y,x )
O. Thus,
(Y , y)
= Ixl z
IYl,z
Definition. A vector x
Izi
* 0 and let
11~lyl
=
1~l YI
I=
1,
1.
is
Chapter 4 I iF nite-Dimensional
110
Next, let {fl" .. .J.l be an arbitrary basis for X and let F = [ l t J ] denote
the matrix of the inner product with respect to this basis; i.e., ItJ = (It, f J )
for all; andj. More specifically,F denotes the matrix of the bilinear functional
f that is used in determining the inner product on X with respect to the
indicated basis (see Definition .4 8.1). Let x and y denote the coordinate
representation of x and y, respectively, with respect to { f l' ... ,f.l. Then
we have, by Theorem .4 8.2,
(x , y)
= Tx yF
= J - LI L
= yTFx'
'-I
Ittlh
Now by Theorems .4 8.20 and .4 8.23, since the inner product is symmetric
and strictly positive, there exists a basis e{ l , ,e.l for X such that the
matrix of the inner product with respect to this basis is the (n x n) identity
matrix I, i.e.,
ifi:;ej
(e" eJ
~'J
={
if; = j .
Ixl =
(X TX) I/1
,Je:
+ ... +
e:
(4.9.43)
T
x yF
L Itt,'IJ =
',J-I
In particular, we have
(x, x)
= L
1= '
',J-I
e:.
~'Je,'IJ
= 'L=1 e,'I,.
.4 9.
211
The reader should note that Eqs. (4.9.7) and (4.9.8) introduced at the
beginning of this section are, of course, in agreement with Eqs. (4.9.42)
and (4.9.43). (See also Example .4 9.23.)
Our next result enables us to determine the coordinates of a vector with
respect to a given orthonormal basis.
4.9.44.
Theorem. Let e{ l , ,e,,} be an orthonormal basis for X and let
x be an arbitrary vector. The coordinates of x with respect to {el' ... , ell}
are given by the formulas
~I
Proof
Since x
(x, e l)
(ele l
~Iel
(x, e l ),
+ ... +
+ ... +
,~"
(x, ell)'
we have
~"e",
e"e", e l) =
el(e l , e l)
+ ... +
el )
~"(e,,
el'
= Z
Let X
and y =
Let
,x y E 2 ,
el111 + 2{ 112'
The natural basis for 2 is given by U I = (1,0) and U 2 = (0, I). Since (u l ,
u / ) = J ' I ' it follows that u{ 1, u 2} is an orthonormal basis for 2. F u rthermore,
we have
(x , y)
4.9.46.
by
Example.
Let X
1{ 111 + 4~z11z
(4.9.47)
(The reader may verify that this is indeed an inner product.) L e t u{ I , u z }
denote the natural basis for RZ; i.e., U I = (1,0) and U z = (0,1). The matrix
representation of the bilinear functional, which determines the above inner
product with respect to the basis u{ l , uz} is
(x, y) =
(x, y)
XT[~
~Jy,
Chapter 4 I iF nite-Dimensional
211
e~
(x, e 1 ). If we let
[~]
=
x'
and y'
[:~J
=
(x, y) =
(x)' Ty'.
This illustrates the fact that the norm of a vector must be interpreted with
respect to the inner product used in determining the norm. _
Our next result allows us to represent vectors in X in a convenient way.
.4 9.48.
all x
Theorem. Let e{
X w e have
_
l ,
(x , e,)
x - ( e , e, )e l
l
Proof.
Normalizing e l ,
i..}, where
e: = rile"
x
(x , e~)e~
(x , eft)
(-eft,
- ) e".
e.
I,
Then for
{~,
... ,
we have
(x , e~)e~
=
=
=
~el
(x , e,) e
(el> e,) l
+ ... +
(x , e.) e..
(e., e.)
.4 9.50.
Exercise.
t. (x,
t1
Then
e,)(y, e,).
(e" e,)
.4 9.
Proof
213
F o r arbitrary i
0=
(X/(X"
+ ... +
XIX
+ ... +
(XkkX J
(XkkX
XI)
we have
, (Xk
= I, ... , k, we have
(0, XI) =
=
= O.
(X I (X
XI)
I,
+ ... +
(Xk(X
k , XI)
X I );
(XI
Note that the converse to the above theorem is not true. We leave the
proofs of the next two results as an exercise.
.4 9.52. Corollary. A set, of k non-zero mutually orthogonal vectors is
a basis for X if and only if k = dim X = n.
.4 9.53. Corollary. F o r X there exist not more than n mutually orthonormal
vectors. (In this case we speak of a complete orthonormal set of vectors.)
.4 9.54.
Exercise.
Theorem. eL t
gl =
fl,
g,. =
f,. -
g" 1
="
(/,., el)e u
.- 1
j= 1
(/",ej)ej,
el =
gl/lgII,
e,. =
g,./lg,.l,
elf
.X
Set
= g"/lg,,l
3.3.4.4
Clwpter 4 I iF nite-Dimensional
214
for all
X
IJ,
k}
.X
Proof.
:E (x,
-
,x )x,
1= '
I, ... , k.
(x, ,x ). We have
I< x =
(x -
f.
IJ,,X I2 =
~
(x, x ) -
1J,tJ,
'X
lt
f. IJ,"X
j
ftilJ j lJ
~I
k k
~I
IJ j X
J-I
j)
(I,,tJt,x ,
X j )'
Theorem. Let
y.L
(i) Let
j=
(ii) y.L
(iii) n =
(iv) (Y.)L .L
(v) X =
{II'
I,
Y be a linear subspace of ,X
x{
X: (x, y) =
0 for all Y
Then x
,Ik} span .Y
, k.
and let
.} Y
(4.9.60)
y.L
is a linear subspace of .X
dim X = dim Y + dim y.L.
(vi) Let X , Y
and X
.Y
yEt>
y.L
E
2,
2Y
.X
E
If x
IX
y.L, then
+
X
and
Y
YI
Y2'
where
X I ,Y I
.4 9.
215
and
Ixl
v- "lxti z
=
+ IXlz .z
To prove the first part, note that if x E y.L, then x J . .fl"' "
X J . .fk'
since It E Y for i = 1, ... , k. On the other hand, let x J . ./' , i = 1, ... , k.
Then for any Y E Y there exist scalars I' " i = 1, ... , k such that y = ndl
+ ... + I' ,.fk' eH nce,
Proof
(x, y) =
Thus, x
(x ,
y.L.
t 'I,/')
I- '
t:1
O.
Exercise.
a linear functional on .X
f(x )
for all x E .X
(x , y)
(4.9.64)
Proof. If I(x ) = 0 for all x E X , then y = 0 is the unique vector such that
Eq. (4.9.64) is satisfied for all x E ,X by Theorem 4.9.17. So let us suppose
that I(x )
*-
0 for some x
Then
,X and let
{x
X : /(x )
= OJ.
of~.
(x, y) =
=
(xo'/(Yo)
oY )
f(yo) (x o, oY )
= lXf(yo) = f(x ) ;
oY )
(Yo. oY )
.4 10.
IL NEAR TRANSFORMATIONS
ON EUCIL
DEAN VECTOR SPACES
Orthogonal Transformations
A.
= t. PJleJ'
let e;
1="1
i=
1, ... ,
n, and let P denote the matrix determined by the real scalars P'' J The following
question arises: when is the set {e~, ... , e.l also an orthonormal basis for
X?
To determine the desired properties of P, we consider
(t Pklek, t. PIJel)
(e;, eJ) =
k~1
0 for i
(e;, ej) =
~I
PTP
where, as usual, I denotes the n
.4 10.1.
e;
I-J
Theorem. Let
{ e l' ...
=
~
~ PkIP/J
t.1
I for i
PklPkJ
(e ko e,).
j, we require that
6,J ,
= I,
,e.l
Let
,e.l is
PJleJ'
if and only if pT =
...
P- I .
.4 10.2.
=
p- I p
.4 10.3.
216
.4 10.
217
.4 10.4.
Definition. A linear transformation A from X into X is called an
orthogonal linear transformation if (Ax, Ay) = (x, y) for all x, y E .X
Let us now establish some of the properties of orthogonal transformations.
.4 10.5.
IAx l
Theorem. eL t A
Ixl for all x E .X
L(X,
X).
Also,
IA(x
y)1 2 =
(A(x
yW
= Ix
.X
y), A(x
2(Ax, Ay)
2(Ax, Ay)
+ yl2 = (x +
and therefore
for all x , y
= IAxl 2 +
= Ixl 2 +
y, x
(Ax, Ay) =
_
and IAx
(x, x)
(Ax
Ay, Ax
I = Ilx .
Con-
Ay)
IAyl2
lyl2.
= Ixl 2+
y)
2(x , y)
+ lyl2,
(x, y)
Proof Let Ax =
singular. _
O. Then
into X
Our next result establishes the link between Definitions 4.10.2 and 4.10.4.
Theorem. eL t e{ l' ... ,e.J be an orthonormal basis for .X
Let
X ) , and let A be the matrix of A with respect to this basis. Then
A is orthogonal if and only if A is orthogonal.
.4 10.7.
A
L(X,
Proof. Let x and y be arbitrary vectors in ,X and let x and y denote their
coordinate representation, respectively, with respect to the basis e{ l, ... , e.J.
Then Ax and Ay denote the coordinate representation of Ax and Ay, respectively, with respect to this basis. Now,
(Ax, Ay) =
and
(Ax Y ( Ay)
=
x
ATAy,
Chopter 4 I iF nite-Dimensional
118
= 0; i.e., ATA = I.
ATA -
Corollary.
Let A E L ( X ,
X).
1.
4.10.9.
Corollary. Let A, BE L ( X ,
X ) . If A and B are orthogonal transformations, then AB is also an orthogonal linear transformation.
4.10.10.
Exercise.
Adjoint Transformations
Theorem. L e t G E L ( X ,
X ) and defineg: X x X - + R by g(x , y)
(x, Gy) for all x , y E .X Then g is a bilinear functional on .X Moreover, if
{el>'
.. , e.1 is an orthonormal basis for ,X then the matrix of g with respect
to this basis, denoted by G, is the matrix of G with respect to {el>' , e.l.
Conversely, given an arbitrary bilinear functional g defined on ,X there
X ) such that (x , Gy) = g(x , y)
exists a unique linear transformation G E L ( X ,
for all x , y E .X
Proof.
g(x l
Let G
x z , y)
L(X,
(X I
X),
X Z,
Gy)
(X I '
Gy)
(x , Gy). Then
(x z , Gy)
g(x l ,y)
g(x z , Y ) .
Also,
g(x, YI
yz )
=
(x, G(YI
g(x, Y I )
yz
g(x , yz)
(x, GYI
Gyz)
(x , GYI)
(x , Gyz)
.4 10.
119
Furthermore,
and
g(x, IX)Y
g(tU,
y)
(x, G(IX Y
Gy)
(lX,X
=
IX(,X
(x, IXG(y
Gy)
=
IX(,X
y),
IXg(X,
Gy)
IXg(X,
y),
k=.
g~Jek
for j =
I, ...
,n.
Hence,
(e lt Ge) =
(e k=t.
l,
g~)ek)
=
g;j.
.4 10.13.
Theorem
(i) F o r each G E L ( X ,
X ) , there is a unique G* E L ( X ,
X ) such that
(x, G*y) = (Gx, y) for all ,x y E .X
(ii) Let {e., . .. ,e.} be an orthonormal basis for ,X and let G be the
matrix of the linear transformation G E L ( X ,
X ) with respect to this
basis. Let G* be the matrix of G* with respect to e[ l , , e.}. Then
G* = GT.
Proof The proof of the first part follows from the discussion preceding
the present theorem.
To prove the second part, let e[ l, ... , e.} be an orthonormal basis for
,X and let G* denote the matrix of G* with respect to this basis. Let x and y
be the coordinate representation of x and y, respectively, with respect to this
Chapter 4 I iF nite-Dimensional
basis. Then
(x , G*y) =
T
x G*y
(GX)T y =
(Gx , y) =
GT)y
T
x GT y.
O. eH nce,
G* =
GT.
The above result allows the following equivalent definition of the adjoint
linear transformation.
.4 10.14.
Definition. eL t G
is defined by the formula
for all x, y
L(X,
X).
(x , G*y)
.X
(i) (A*)* = A;
(ii) (A B)* = A*
(iii) (lXA)* = lXA*;
(iv) (AB)* = B*A*;
(v)
(vi)
(vii)
(viii)
.4 10.16.
B*;
Proof
Theorem. eL t A E L ( X ,
A- I .
X).
.4 10.
(Ax , Ay) =
.X
221
Therefore,
(A*Ax , y)
(x , y)
Exercise.
.X
.4 10.23.
Exercise.
Chapter 4 I iF nite-Dimensional
221
.4 10.24.
Corollary. eL t
following are equivalent:
(i) A is skew-symmetric;
(ii) (x, Ax ) = 0 for all x E ;X
(iii) Ax . .l x for all x E .X
Then the
and
Exercise.
and .4 10.25.
Proof
B=
A
[ -
(r
A" -
(r
is)I)[A -
is)I)
is)IA -
(r -
(r -
is)IA
(r
is)(r -
T
x Bx
T
x ([ A
rl)"
s"I)x =
AT -
rl T
T
x (A -
is)1"
O. Also,
rI)"x
s"xT.x
= A - rl.
i.e.,
where y =
(A -
~
,~
,,1 ~
0 and T
x x
= L ,1> 0, because
I- '
.4 10.
* O. Thus, we have
by assumption x
o=
yTy
>
SZ(xT)x
223
sZxT.x
O. Therefore, A =
T,
X ) with
Now let A be the matrix of the linear transformation A E L ( X ,
respect to some basis. If A is symmetric, then all its eigenvalues are real.
In this case A is self-adjoint and all its eigenvalues are also real; in fact, the
eigenvalues of A and A are identical. Thus, there exist uniq u e real scalars
AI' ... , Apt P < n, such that
U)
det (A -
det (A =
(AI -
AI) =
A)""(Az -
A)'"'
... (A, -
A)'".'
(4.10.29)
.4 10.30.
4.10.31.
Corollary. Let
least one eigenvalue.
.4 10.32.
Exercise.
A E L(X,
X).
.4 10.33.
Theorem. Let A E L ( X ,
X ) be a self-adjoint transformation, and
let AI" .. ,Ap , p < n, denote the distinct eigenvalues of A. If ,X is an eigenvector for A, and if XI is an eigenvector for AI' then ,x .1. XI for all i j.
A,(X
Thus,
"
Since A,
x,) =
(A,X
"
)JX
=
Now let A
L(X,
X),
(Ax
"
)JX
=
(XI'
Ax /) =
(x"
AJX /) =
(A, -
AJ)(X"
)JX
IX ) =
AJ"X
where
O.
_
~,
Chapter 4 I iF nite-Dimensional
224
m=
l
x{
= OJ.
:X (A - A Il)x
A,l, i.e.,
(4.10.34)
.4 10.36.
Exercise.
dim X
= n=
dim m\
+ ... +
dim mz
dim m,.
Proof Let dim ml = nl , and let ret, ... , e .l be an orthonormal basis for
mi' Next, let e{ + I > ' "
,e.,H .} be an orthonormal basis for mz . We continue
in this manner, finally letting e{ ., + ... +_. + I> , e + ... .+ ,} be an orthonormal
basis for mp Let n\ + ... + n p = m. Since ml ..1 mj , i *- j, it follows that
the vectors et> ... ,e.., relabeled in an obvious way, are orthonormal in .X
We can conclude, by Corollary .4 9.52, that these vectors are a basis for ,X
if we can prove that m = n.
Let Y be the linear subspace of X generated by the orthonormal vectors
e\ , ... , e... Then e{ l , , e..} is an orthonormal basis for Y a nd dim Y = m.
Since dim Y + dim y1. = dim X = n (see Theorem .4 9.59), we need only
prove that dim Y 1. = O. To this end let x be an arbitrary vector in Y 1.. Then
(x, e\) = 0, ... , (x, e..) = 0; i.e., x . .l e\, ... , x ..1 e.., by Theorem .4 9.59.
So, in particular, again by Theorem .4 9.59, we have x ..1 ml , i = I, ... ,p.
Now let y be in mi' Then
(Ax, y) =
(x, Ay) =
(x, AIY)
Alx , y) =
0,
(A'x, y) =
(AX, y) =
(x, Ay) =
(x, A'y).
Assume now that dim yol> O. Then by Corollary .4 10.31, A' has an
eigenvalue, say Ao, and a corresponding eigenvector X o *- O. Thus, X o *- 0
.4 10.
225
.4 10.38.
Corollary. eL t A
L(X,
)X .
If A is self-adjoint, then
In.
Al
A. 2.
A=
In.
A2.
A,
I
n,
A,
A is
det (A -
AI) =
and, hence, n,
det (A dim /~
=
AI)
(AI -
A)"'(A2. -
multiplicity of A" i
A)"'
1,
(Ap ,p.
A)"',
Chapter 4 I iF nite-Dimensional
Exercise.
.4 10.41.
Corollary. eL t f(x , y) be a symmetric bilinear functional on .X
Then there exists an orthonormal basis for X such that the matrix offwith
respect to this basis is diagonal.
Proof
L( ,X X )
.4 10.42.
Corollary. eL t j(x ) be a quadratic form defined on .X
there exists an orthonormal basis for X such that if x T = (~I' ..
the coordinate representation of x with respect to this basis, then! ( x )
+ ... + lX.e~ for some real scalars lXI' , IX
.4 10.43.
Exercise.
,~.)
Then
is
lXle~
Next, we state and prove the spectral tbeorem for self-adjoint linear
X ) is a
transformations. First, we recall that a transformation P E L ( X ,
projection on a linear subspace of X if and only if p1. = P (see Theorem
3.7.4). Also, for any projection P, X = R
< (P) EEl (~ P),
where R
< (P) is the range
of P and ~(P)
is the null space of P (see Eq. (3.7.8. Furthermore, recall
that a projection P is called an orthogonal projection if R
< (P) ..1 (~ P)
(see
Definition 3.7.16).
.4 10.4.4
Theorem. Let
A E L(X,
X)
AI' ... ,Ap denote the distinct eigenvalues of A, and let ~I be the null space
(ii) PIP)
I, ... ,p;
.4 10.
(iii)
= I, where I E L(X,
PJ
J-I
and
(iv) A =
AJP)"
J=I
To prove the first part, note that X = m:, EB m:;-, i = I, ... ,p,
by Theorem .4 9.59. Thus, by Theorem 3.7.3, R< (P ,) = m:, and m:(P ,) = m:;-,
and hence, P, is an orthogonal projection.
To prove the second part, let i 1= = j and let x E .X Then PJx
I:>. x J E m: J .
Since R< (P ,) = m:, and since m:,1.. m: J , we must have x J E m:(P ,); i.e.,
P,PJx
= 0 for all x E .X
Proof
I- I
= I.
To
do so, we first show that P is a projection. This follows immediately from the
fact that for arbitrary x E ,X p2 X = (PI + ... + P,)(Plx + ... + P,x ) =
PIx + ... + P;x , because P'P J = 0 for i 1= = j. Hence, p2 X = (PI + ...
+ P,)x = Px, and thus P is a projection. Next, we show that dim R<[ (P)] = n.
It is straightforward to show that
dim R
<[ (P)]
1= 1
dim m
[ :,]
dim m
[ :,],
1= 1
= n,
= n.
Since
X = R
< (P) EB m:(P), we conclude that R
< (P) = .X Finally, since P is a projection with range ,X we conclude that Px = x for all x E ,X i.e., P = 1.
To prove the last part of the theorem, let x E .X F r om part (iii) we have
Let ,x =
P,X
Ax =
=
A(x
for i =
i
AIPIX
PIX
P 2x
+ ... +
I, ... , p. Then ,x E
x,) =
A'p,X
m:, and AX
+ ... + Ax ,
= (AIP I + ... +
AX
P,x.
=
Al IX
A,X ,. Hence,
+ ... +
A,x,
A,P,)X,
Some Examples
L(X,
X),
I Finite-Dimensional
Chapter 4
A=
0[ 11
Ou]
021 022
is the matrix of A with respect to the basis e{ l, e2}' eL t x E E2, and let
x T = (' I ' ' 2 ) denote the coordinate representation of x with respect to this
basis. Then Ax is the coordinate representation of Ax with respect to this
basis, and we have
Ax
= y.
A.le~,
Ae; =
A.2e;,
81
111 8 1
.4 10.46.
iF gure F
.4 10.47.
FIgure
.4 10.
Ax =
A(~;e~
~e;)
~;Ae~
~;Ae;
~Ale~
~Aze;;
U n it circle
.4 10.49.
iF gure H
The reader can readily verify that R is indeed a linear transformation. The
matrix of R with respect to this basis is
_ C[ OS
R.-
(J
sin (J
(JJ
sin
cos (J
Chapter 4 I iF nite-Dimensional
R; I = [
c~s
-
(JJ
sin
cos (J ,
(J
sm (J
cos" (J
sin" (J =
1.
R,+~
PlaneZ
900
I
I
I
I
I
Set .Y
I
aJF ure J
.4 10.51.
cos (Jet
- s in (Je
sin (Je"
cos (Je"
0 e3
0 e3
0 e l + 0 e" + I e 3
The reader can readily verify that A is a linear transformation. The matrix
Ae 3
.4 10.
231
COS9
- s in 9
sin9
cos 9
E.
Further
Theorem.
eL t
L(X,
be an orthogonal transformation.
X)
(i) the only possible real eigenvalues of A, if there are any, are + I
and - I ;
(ii) if Y is a linear subspace of X which is invariant under A, then the
restriction A' of A to Y is an orthogonal transformation from Y into
Y; and
(iii) if Y is a linear subspace of X which is invariant under A, then Y l.
is also a linear subspace of X which is invariant under A.
Proof To prove the first part, assume that A has a real eigenvalue, say Ao
(The definition of eigenvalue of A E L ( X ,
X ) excludes the possibility of complex eigenvalues, since X is a vector space over the field R of real numbers.)
Then Ax = .lox for x
0 and
*'
But IAx I = Ix I, because A is by assumption an orthogonal linear transformation. Therefore, lAo I = I, and we have Ao = + I or - 1 .
To prove the second part assume that Y is invariant under A. Then
Ax E Y whenever x E ,Y and thus the restriction A' of A to ,Y defined by
A' x
Ax
IA'lx =
IAx l
= lxi,
Chapter 4 I iF nite-Dimensional
232
since A E L ( X ,
X ) is an orthogonal transformation. Therefore, A' is an
orthogonal transformation from Y into .Y
To prove the last part, let Y be an invariant subspace of X under A. Then
x E y.l if and only if x 1.. y for all y E .Y Suppose then that x E y.l and
consider Ax . Then for each y E Y we have
(Ax , y)
(x, A*y) =
(x, A- I y),
(x, A- I y ) =
y.l whenever x
We also have:
.4 10.53. Theorem. Let A E L ( X ,
X ) be an orthogonal transformation, let
+Y
denote the set of all x E X such that Ax = ,x and let y_ denote the set
of all x E X such that Ax = - x . Then +Y and L are linear subspaces of X
and +Y 1.. L .
(Ax , Ay) =
~(A
E
(x, - y )
y),
-(x,
and
1.. y_.
+Y
sU ing the above theorem we now can prove the following result.
.4 10.54.
Corollary. eL t A, +Y and y_ be defined as in Theorem .4 10.53,
and let Z denote the set of all x E X such that x 1.. +Y and x 1.. y_. Then Z
is a linear subspace of X and dim +Y + dim y_ + dim Z = dim X = 11.
Furthermore, the restriction of A to Z has no (real) eigenvalues.
Proof Let e{ l , , e"J be an orthonormal basis for Y H and let e{ .'1+ >
... , e"I+n.} be an orthonormal basis for ,_ Y
where dim +Y = n 1 and dim y_
= 11". Then the set e{ l ,
,e"I+"}' is orthonormal. Let Y denote the linear
subspace generated by e{ l , , e".+n.}. Then dim Y = 11 1 + 11". By the defand thus Z is a linear
inition of Z and by Theorem .4 9.59 we have Z = .Y ,L
subspace of .X Therefore,
11
= dim X =
= dim
+Y
dim Y
dim y_
dim y.l
= n l +
dimZ,
11"
dimZ
.4 10.
233
Our next result is concerned with orthogonal transformations on twodimensional Euclidean spaces.
.4 10.55. Theorem. eL t
where dim X = 2.
L(X,
X)
be an orthogonal transformation,
_ C[ OS
R,-
(J
sin (J
(JJ
sin
cos (J
(4.10.56)
1[ o - I OJ.
Q=
(4.10.57)
Proof
= [all
a21
= + 1 and choose an
au]
a2 2
denote the matrix of A with respect to this basis. Then, since A is orthogonal,
so is A and we have
(4.10.58)
and
(4.10.59)
det A = I.
Solving Eqs. (4.10.58) and (4.10.59) (we leave the details to the reader) yields
= cos (J, au = - sin (J, a21 = sin (J, and a 22 = cos (J.
To prove the second part assume that A is orthogonal and that det A
= - I . Consider the characteristic polynomial of A,
alI
. p(l)
Since det A =
- 1 we have /x o =
,
AI>
A2 -
12
/X 11
/X o
.v'iif+4
2
'
Chapter 4 I iF nite-Dimensional
which implies that both AI and Az are real and that AI *- Az. rF om Theorem
.4 10.52 these eigenvalues are + 1and - 1 . Therefore, there exists an orthonormal basis such that the matrix of A with respect to this basis is
[
OJ I[
AI
Az =
OJ
0 -1
AI) =
(tX I
PI).
P,).
).Z),
where the tXl' PI> i = I, ... , r are real (i.e., det (A - U ) does not have any
linear factors (AI - ).), with AI real). Solving the first uq adratic factor we have
, _
-PI
,+ P
J :
A-
-PI
.- .Jm2 -
""I -
and
z-
-41
t4 X
l,
where AI and Az are complex. By Theorem .4 5.33, part (iv), if 1(' ) is any
polynomial function, then I(AI) will be an eigenvalue of I(A). In particular,
ifj(A) = tX l + PIA + AZ, we know that one of the eigenvalues of the linear
transform~tion
tXlI + PIA + AZ will be tX l + PIAl + Ai = 0, by choice.
Thus, the linear transformation (tXII + PIA + AZ) has 0 as an eigenvalue.
Therefore, there exists a vector jl *- 0 in X such that
or
(tXII
PIA
AZ)/I
0 II
(4.10.61)
.4 10.
235
=
x
etft
+ e.Jz
Ax
ez A Z/ t .
AZit
Thus,
Ax
ez ( - f 1,tft
- e z f 1,tft
(et -
= etA/]
- f 1,tft -
PtA/t.
PtA/.) =
- e z f 1,tft
(et -
ez p t)A/t
eZpt)fz,
Chapter 4 I iF nite-Dimensional
"Y i =
1, ...
,r,
l) ' ( - 1 -
into ,X
lZ)
det(A -
Al).
Moreover, there exists an orthonormal basis e{ lt ... , e.l of X such that the
matrix of A with respect to this basis is of the form
cos 8 1
- s in 9 1
sin 9 I
cos 9 I
o
:I cos 8, - s in 8, :I
~-,
L~
A=
8
~
J_! ~
-1
-1
1+
2,
.4 10.
.4 10.65.
Exercise.
237
In our next result the canonical form of skew-adjoint linear transformations is established.
.4 10.66. Theorem. Let A be a skew-adjoint linear transformation from
X into .X Then there exists an orthonormal basis fe"~ ... ,e.l such that
the matrix A of A with respect to this basis is of the form
o
A=
!-o-i~
o
where the .J "
ez ro.
.4 10.67.
i=
Exercise.
i -J.,
I
L(X,
X)
is said to be a normal
Theorem. eL t A
L(X,
)X .
Then
Chapter 4
._ - - -
PI
-AI
Finite-Dimensional
AI:I
PI!
o
- - - - - -1
A=
P,
"
:- - - - - - A ,
A, I
P,
I
I
o
V~-2,
The proofs of parts (i)-(iii) follow from the definitions of normal, selfadjoint, skew-adjoint, and orthogonal linear transformations. To prove
part (iv), let A = AI + A2 , where AI = H A + A*) and A2 = t(A - A*),
and note that AI is self-adjoint and A 2 is skew adjoint. This representation
is unique by Corollary .4 10.25. Making use of Theorem .4 10.66 and Corollary
.4 10.38, we obtain the desired result. We leave the details of the proof of
this theorem as an exercise.
.4 10.70.
4.11.
Exercise.
APPLICATIONS
DIFE
F RENTIAL
TO ORDINARY
EQUATIONS
Let R denote the set of real numbers, and let D c R2 be a domain (i.e.,
D is an open and connected subset of R2). We will call R2 the (t, x) plane.
Let / be a real-valued function which is defined and continuous on D, and
4.IJ.
let
I:J.
dx/dt (Le.,
239
x =
f(t, x)
(4.11.1)
;(t) =
for all t
(4.11.2)
f(t, rp(t
x =
x('r)
In Figure K a
f(t, x ) } .
(4.11.4)
slope of line L
fIT, .,(T))
---
t,
.4 11.5.
iF gure .K
We can represent the initial-value problem given in Eq. (4.1 1.4) equivalently by means of the integral equation
rp(t) =
e+
(4.1 1.6)
Here we say that two problems are equivalent if they have the same solution.
To prove this equivalence, let rp be a solution of the initial-value problem
(4.1 1.4). Then rprr) =
and
;(t) =
f(t, rp(t
140
Chapter 4 I iF nite-
to I we have
ds
s: I(s, ,(s
ds.
or
' ( 1)
= ~ +
;tCt) =
= 1, ... , n
(4.11.8)
for all lET, is called a solution of tbe system of ordinary differential equations
(4.11.7).
,.} is
.4 11.9. Definition. Let (f, ~ I> . . , ~.) E D. If the set { ' I "' "
a solution of the system of equations (4.11.7) and if (' I (f), ... , ,.(f = (~I>
... , ~.),
then the set 1
' "'
. ".} is called a solution of the initial-value
problem
IX = /,(t, X I ' . ' , x.),
i = 1, ... , n } .
(4.11.10)
X I (f) = ~I'
I =
I, ... , n
It is convenient to use vector notation to represent Eq. (4.11.10). Let
.4 11.
241
f(/, x )
and define i =
/1(/, X
1.(/,
,X.)]
It.'
/[ I('~
=.
.
1.(/, x)
,x . )
X It . .
)X ]
= f(t, x)
(X T)
=;
}.
(4.11.11)
and
(4.11.14)
A(t)x,
(4.11.15)
i= A x ,
XI'
,x . ) = /,(t, x) =
a'/(t)x
I- I
l,
i=
I, ... ,n,
Chapter 4 I iF nite-Dimensional
242
XI'
,
,x~)
X ( k)
(4.1 1.1 6)
x(~-Il)
Definition. eL t (r,
and if rp(r) =
of the initial value problem
e" ... ,
(4.11.16)
(X )~
rp(~-Il(r)
x ( r)
e~,
,X(~-I
}.
(4.1 1.19)
e~
a,,(/)x(~)
a,,(t)x()~
and
a,.x(~)
a._I(/)x(~-1l
a~_I(t)X(~-1l
a l (t)x(1l
+ ... +
a~_lx(~-1l
al(/)x ( l)
alx ( I)
ao(t)x
ao(t)x
0,
aox
V(/),
(4.11.20)
0,
(4.11.21)
(4.11.22)
where a,,(t), . . ,oo(t) are real continuous functions defined on the interval
T, where a~(/)
:;z:! 0 for all lET, where a~, .
, a o are real constants, where
a" :;z:! 0, and where v(/) is a real function defined and piecewise continuous
on T. We call Eq. (4.11.21) a linear homogeneous ordinary differential equation
oforder n, Eq. (4.1 1.20) a linear non-homogeneous ordinary differential equation
of order n, and Eq. (4.1 I .22) a linear ordinary differential equation of order n
with constant coefficients.
We now show that the theory of nth-order ordinary differential equations
reduces to the theory of a system of n first-order ordinary differential equations. To this end, let in Eq. (4.11.19) X = X I ' and let
IX
x =
x
I_~X
x~
X 2
3
x~
1(/,
X ( 2)
(4.1 1.23)
X(~-I)
XI'
, x~)
x(~)
This system of equations is clearly defined for all (I, X I ' ... ,x~)
E D. Now
assume that the vector p4 T = ('11' ... , rp~) is a solution of Eq. (4.11.23) on an
.4 11.
= ;"
rp3
and since
rp\ft)(t),
it follows that the first component rp, of the vector, is a solution of Eq.
(4.11.16) on the interval T. Conversely, assume that rp, is a solution of Eq.
(4.11.16) on the interval T. Then the vector cpT = (rp, rp(l), ... , rp(ft-ll) is
clearly a solution of the system of eq u ations (4.11.23). Note that if rp,(1') = ~"
... ,rp\ft-I)(1') = ~ft' then the vector, satisfies ,(f) = ; , where
= (~t,
... , ~ft)' The converse is also true.
Thus far we have concerned ourselves with initial-value problems characterized by real ordinary differential equations. It is possible to consider initialvalue problems involving complex ordinary differential equations. F o r example, let t be real and let ZT = (z " ... , Zft) be a complex vector (i.e., Zk is of
the form U k + ivk , where U k and V k are real and i = ,J = } ) .
Let D be a domain
in the (t, z) space, and letf.,
,f,. be n continuous complex-valued functions
defined on D. Let fT = (fl'
,f,.), and let = dz/dl. We call
;T
= C(t, )z
(4.11.24)
= C(t, .<t
(+ t)
for all t
addition, (r,~"
... '~ft)
E D and if (rp,(-r), ... ,rpft(-r = (~I""
then cp is said to be a solution of the initial-value problem
z =
(z r- )
( t, )z } .
'~ft)
If in
~T,
(4.11.25)
Example.
x =
x'/3
(x O) =
O.
We can readily verify that this problem has infinitely many solutions passing
Chapter 4 I iF nite-Dimensional
tpi ) =
0,
{ H(t
p)]3/Z, P <
<
<1<
<
<
p
I
I.
The next example shows that the t interval for which a solution to the
initial-value problem exists may be restricted.
.4 11.27.
Example.
= ,{
x ( t.)
= {[I -
(t -
tl){ ] - l
=t
e-+ '
Systems
4 . lJ .
24S
vector with components vl(t) that are defined and piecewise continuous on T.
In the following we consider matrices and vectors with components which
may be either real- or complex-valued. In the former case the field for the x
space is the field of real numbers, while in the latter case the field for the x
space is the field of complex numbers. Also, let
D
E Rn(or en)}.
(4.11.28)
= A(I)x + V(I),
i = A(I)X,
(4.11.29)
i=
(4.11.31)
(4.11.30)
and
Ax.
In the applications section of the next chapter we will show that, with the
above assumptions, equations (4.11.29)-(4.11.31)
possess unique solutions
for every (r, e) E D which exist over the entire interval T = (II' (2 ) and which
depend continuously on the initial conditions. This is an extremely important
result in applications, where we usually require that T = (- 00, 00).
.4 11.32. Theorem. The set S of all solutions of Eq. (4.11.30) on T forms
an n-dimensional vector space.
Proof
L e t fl and ' 2 be solutions of Eq. (4.11.30), let F denote the field for
the x space, and let 0.: 1, 0.: 2 E .F Since
d
dt [o.:lfl(l)
1'
0.:2' 2 (1)]
o.:IA(t)4pl(t)
A(t)[o.:l'l(t)
1' >
0.:2A(t)4p2(t)
0.:2'2(t)].
n'
Chapter 4 I iF nite-Dimensional
IX"
for all t
But this last equation contradicts the assumption that the ~, are linearly
independent. Thus, the ." i = I, ... ,n, are linearly independent. iF nally,
to show that these solutions span S, let. be any solution of Eq. (4.1'1.30) on
T, such that '<1') = ~. Then there exist unique scalars IX I , , IX" E F such
that
/~ '
i=
1, ...
,n, form
By the uniqueness
is a solution of Eq. (4.11.30) on T such that .(1' ) =~.
results which we will prove in the next chapter (and which we accept here
on faith),
1' '
i=
I, ...
,n,
,,=
[
" II , U
:~
:.1~
.. ..:::.. :.1~
111]
is a fundamental matrix.
[ . II.1!
1 .,,]
.4 11.
Applications to Ordinary
247
Equations
Diff~rential
In our next definition we employ the natural basis for the x space, given by
0-
, 8:&=
81 =
... ,
u =
0
I
.4 11.34.
Definition. A fundamental matrix . (for Eq. (4.11.30
whose
columns are determined by the linearly independent solutions ' I ' i = I,
... ,n, with
'/(1') = 01' i = I, ... ,n,
'f E
with
1 .+ ]
= [ + I I+ : & !
A
[ (t)'II' IA(t>l' :' !&
= A(t)[ . II.z I 1 .] =
A(t)Y .
.. I A(t)' I ' . ]
We also have:
.4 11.37. Theorem. If" is a solution of the matrix equation
on T and if t, ' f E T, then
det "(/) = det "(' f )ef. tf A(.) i., t E T.
Proof
" =
Recall that if C =
~(detY)=
:~
is an (n
n) matrix, then tr C =
;{ I
o[ IAt)]. Then I II =
"u
.. :.:&~
I "d
fl.
[ e ll]
...
lh
.. ::: .. :.z~
.. .
~.2~
(4.11.38)
I-I
CII'
Let
alk(t)"kr Now
flu
fill
(4.11.36)
~.:&~
.. ,
'IIh
.. ::: .. ~:&
fl
flu
,,-'
,,:& .
fld . , . fl
(4.11.39)
Chapter 4 I iF nite-Dimensional
Also,
1' 2' .
...................
' / Inn
The last determinant is unchanged if we subtract from the first row 012 times
the second row plus 1 , times the third row up to 0ln times the nth row.
This yields
0\1 (/)'/1
\I
0 \ I (t}yt u
1\ 2' 1
...
\I
/' 122
(t}'/I
1n
\l'2n
d[ et 1'(t)] =
11 (/)
.4 1J.04 .
T.
+ ... +
for all t
det Y ( t)
det Y(r)ef~
It A(,),,,
Exercise.
We now prove:
.4 1J.14 .
Theorem. A solution Y of the matrix equation (4.11.36) is a
0 for all t E T.
fundamental matrix for Eq. (4.11.30) ifand only if det 1'(t)
*'
or
1' a ,
(4.11.42)
*'
*'
.4 11.
249
"C
4.1l.44.
Exercise.
Now let R(t) = [rit) be an arbitrary matrix such that the scalar valued
functions rl}(t) are Riemann integrable on T. We define integration of R(t)
componentwise, i.e.,
J[
r,/(t)dt}
for all t E T;
(iii) .(t, f) is non-singular for alI t
(iv) for any t, (J E T we have
.(t,1' )
n)
(4.11.46)
.(t, 1'~
T;
= .(t, (J~(J,
f);
(v) [.(t,1')-1
t:. .- I (t, f) =
.(- r , t) for all t E T; and
(vi) the unique solution of Eq. (4.11.29) is given by
cp(t)
= .(t, 1')~
f .(t,
")v(,,)d,,.
(4.11.47)
Proof The first part of the theorem follows from the definition of the state
transition matrix.
Chapter 4 I iF nite-Dimensiotull
(+ t)
= i(t, f~
A(t~(t,
.(t, f~.
Differentiating
= A(t)t<t).
f~
F u rthermore, f{ f ) = .(f, f~ = ~ .
F r om the uniqueness results (to be
presented in the next chapter) it follows that the specified is indeed the
solution of Eq. (4.11.30).
The third part of the theorem is a consequence of Theorem .4 11.41.
To prove the fourth part of the theorem we note that p4 t{ ) = .(t, f~
is the unique solution of Eq. (4.11.30) satisfying f{ f ) = ~ ,
and also that
f{o)' = ~O',
f~,
0' E T. Now consider the solution of Eq. (4.11.30) with
< )' .
Then
initial condition given at 0' in place of f; i.e., p4 t{ ) = .(t,O'.> O
f{ t )
.(t, f~
~t,
O'~O',
f~.
= .(t, O'~(O',
f)
~t,
f).
To prove the fifth part of the theorem we note that .- I (t, f) exists by part
(iii). F r om part (iv) it now follows that
I=
where I denotes the (n x
.(t, f~(f,
t),
.(f, t)
.- I (t, f) =
for all t E T.
In the next chapter we will show that under the present assumptions,
Eq. (4.11.29) possesses a unique solution for every (f, ~) E D, wheret< f ) = ~.
Thus, to prove the last part of the theorem, we must show that the function
(4.11.47) is this solution. Differentiating with respect to t we have
= ,< t , f~
+ ~t,
= A(t~(t,
f~
.(t)
A(t)[~t,
=
Also, f{ f )
=~.
A(t~t)
Therefore,
+
f~
t)Y(t)
(Y t)
f ,<t,
f A(t~(t,
f .(t,
,,)v(,,) d"J
,,)Y(,,) d"
,,)Y(,,) d"
v(t)
(Y t).
is the unique solution of Eq. (4.11.29).
.4 11.
251
eq u ations with constant coefficients given by Eq. (4.1 1.31). We require the
following preliminary result.
4.11.48.
Theorem. L e t A be a constant (n X n) matrix (A may be real or
complex). L e t SN(I) denote the matrix
N tie
SNCt) = I Ie~ k!AIe.
Let
... , n, and k =
where ~'J
is the K r onecker
~'J +
Let m =
max
max
I.J
I,)
la}+ }
00
max
=J
I~I~.
I<
~!
letl a}r
<
(leI tie
a'i k! '
+ 1e~1
QI)
(max
I
p=
wehavemaxla(Ie+I)I=maxl~a
~J
t. we have
Since I
(- t
~J
Ialp I)(max
Ia~~1 I).
P.}
When k=
I.}
I}
I.}
tl), t l
p- I
Ipa(lell
p}
Therefore,
maxlat]l~m.
I.J
and
".
I (kIt"k! I<M
l,
+ L
"-I
ali
M" =
e"'t"
we now have
~
QI)
i-
1e= 1
(kl
al}
tIc
-k'
is an absolutely and uniformly convergent series for each i,j over the interval
I I' I I) by the Weierstrass M-test.
(-
Let
A be a constant (n X
eAt =
for any
-00
<
<
00.
+ "=L 1-
k
t_ A
"
k!
n) matrix.
We define eAt
Chapter 4 I iF nite-Dimensional
252
I.
(- 0 0, 00), let
.4 11.50.
Theorem. Let T =
n) matrix. Then
l' E
.(t, 1')
= eA l' - . )
for all t E T;
(ii) the matrix eA' is non-singular for all t
(iii) eA"eA" = eA1h+ t ,) for all t I> t 2 E T;
(iv) AeA' = eA'A for all t E T; and
(v) (eN)-1 = e- AI for all t E T.
T;
Proof To prove the first part we must show that .(t, 1') satisfies the matrix
equation
' ( t,1' ) = A.(t, 1')
for all t
.(/,1' )
I. Now, by definition,
e AII-. )
:E (t - k! r- )k Ak.
k=1
= A+
.{ [ e AII- . )]
dt
:E (t - k! 1')k Ak+l
k=1
= AeA II - .
= A[ I +
:E (t - k! 1')k Ak]
k-I
l,
t~ ,
A~t,
1')
for all t E T, with .(1' , 1' ) = eA l.- . l = I. Therefore, eAll - d is the state transition matrix for Eq. (4.11.31).
The second part of the theorem is obvious.
To prove the third part of the theorem, we note that for any tl, t 2 E T,
we have
- t 2) = eN', which
Now .(tl> - t 2) = eAII,+ t ,l, .(tl> 0) = eA", and ~O,
yields the desired result.
To prove the fourth part of the theorem we note that for all t E T,
A(I +
:E Ik Ak) =
k= l k!
A+
:E t....Ak+1
k=1 k!
(I + k-tl t....Ak)A.
k!
.4 11.
253
iF nally, to prove the last part of the theorem, note that for all t
t!', . t!(' ,- ) = eA (,- , ) = I.
= e-
A,.
T,
.4 11.51.
(4.11.30)
A(tt)A(t z )
.(t, T)
where B(t, T)
.4 11.52.
= e
A'J (I,)d~
T
eB(.,T)
k= 1
-kIB~{t,
T),
A('1) d'l.
Exercise.
...
Exercise.
l)
A{t)x,
where
~l
= [;
Exercise.
eL t A denote the (n
A= A[ .I.
eA ,
for all t
T=
=
[
n) diagonal matrix
0].
Show that
A.~
ell' .
0 ]
el .,
(- b o, bo).
.4 11.55. Exercise.
eL t t E T = (- bo, be , let T E T, and let ~ E R~
(or en). Let A be the (n x n) matrix for Eq. (4.1I.3I), and let. denote the
Chapter 4 I iF nite-Dimensional
unique solution ofEq . (4.11.31) with ,(of) = ~. Let P be a similarity transformation for A, and let B = p- I AP.
(a) Show that eAt = Pe-rP-1 for all t E T.
(b) Show that the unique solution of Eq. (4.11.31) is given by
,=
P.r,
with
'!(f)
4.1l.56. Exercise.
A(t) = A for all t
P- I ' ( f) =
P-I~.
Ax
In Eq.
v(t).
L e tf E T, and let, denote the unique solution ofEq . (4.11.57) with ' ( f) =~.
Let P be a similarity transformation for A, and let B = p- I AP. Show that
the unique solution of Eq. (4.11.57) is given by
.= p ' ! ' ,
where. is the unique solution of the initial-value problem
By +
with
(f, '! (f
D, t
P- I v(t)
T.
J o:
- - 1- -
J=
where
i_~!:
o
o
.4 11.
where
oo
o
o
J.=
. I
1.. + .
o 0
0 lk+ ..
m = I, ... ,p, and where 11> ... ,1.., lk+.' ... ,lu, denote the (not
necessarily distinct) eigenvalues of A. Show that
ell'
where
.]
and
I"
I, - i
2!
(v. - l)!
1' - "
(VIII 2)!
o
where J . is a
VIII
VIII
matrix and k
e'"
v.
+ ... +
v, = n.
a.(t)x
l l
a.(t)x
l l
a._ . (t)x
a._ . (t)x
c.-
and
a.x
l .)
a._ . x
.-
Il
Il
l. - I )
a.(t)x ( \ l
+ ... +
a.(t)x ( \ l
a.x l l)
ao(t)x =
ao(t)x
aox
v(t),
(4.11.59)
0,
(4.11.60)
O.
(4.11.61)
In Eqs. (4.11.59) and (4.11.60), v(t) and o,(t), i = 0, ... ,n, are functions
which are defined and continuous on a real t interval T, and in Eq. (4.11.61),
Chapter 4 I iF nite-Dimensional
the
01'
0,,(1) F=
i
where
A(I) =
-' oo(t)
_ a,,(I)
o
o
(4.11.62)
A(I)x,
1(1)
- 0 2(1)
0.(1)
a,,(I)
(4.11.63)
- O "- I (t)
a,,(/)
In this case the matrix A(I) is said to be in companion form. Since A(I) is
continuous on T, there exists for all 1 ETa unique solution II to the initialvalue problem
i =
A(I)x
x(t')=;=(~I,,~,,)T
(4.11.64)
'
where T E T and; E R" (or e") (this will be proved in the next chapter).
Moreover, the first component of ,I, PI' is the solution of Eq. (4.11.60)
satisfying
PI(T) =
p(T) =
p(\)(T) =
~I'
... , pl"-II(T) =
~2'
~".
Now let 1' 1' " .. ,' I ' " be solutions of Eq. (4.11.60). Then we can readily
verify that the matrix
=[
;::: ...
1' "\ ' -
;1:' ...:::...;:' ]
I~"-
t)
t)
,,~.-
(4.11.65)
t)
A(I)",
(4.11.66)
where A(I) is defined by Eq. (4.11.63). We call the determinant of" the
Wronskian of Eq. (4.11.60) with respect to the solutions l1>""
I ", and
we denote it by
det" = W(' I ' I > " "
1' ,' ,).
(4.11.67)
Note that for a fixed set of solutions I I" .. , "" (and considering
the Wronskian is a function of I. To indicate this, we write W(" I '
T
,
fixed),
".)(1).
.4 11.
257
,' Y ,)(t)
=
.4 11.69.
tion
Example.
T,
det Y ( t) =
det 'P(r)eJ~trACII'"
Y', )(r)eJ~-[II.-.e")/II.C"lld".
(4.11.68)
tZx CZI
tx
=
x
0,
< t<
(4.11.70)
00.
Then
W(YI' >
)z'Y (t)
=
det P
' (t) =
--,
t>
O.
the notation of Eq. (4.1 1.63), we have in the present case al(t)laz(t)
lit. F r om Eq. (4.1 1.68) we have, for any l' > 0,
Using
)z'Y (t)
W(YI' >
= det "(t) =
-_
which checks.
- e2
ID (Titl _
l'
-, 2
(- II.e"I/II,C"IJ d"
t>
0,
.4 11.71.
.4 11.72.
Exercise.
We call a set ofn solutions ofEq . (4.11.60), 1'Y t .. , "' Y , which is linearly
independent on T a fundamental set for Eq. (4.11.60).
L e t us next turn our attention to the non-homogeneous linear nth- o rder
ordinary differential eq u ation (4.11.59). Without loss of generality, let us
assume that a,,(t) = 1 for all t E T; i.e., let us consider
C
X "1
a"_I(t)xC"-1l
+ ... +
al(t)x(l)
ao(t)x
v(t).
(4.11.73)
The study of this eq u ation reduces to the study of the system of n first-order
I Finite-Dimensional
Chapter 4
158
o
o
A(t) =
A(t)x
i =
where
b(t),
(4.11.74)
000
- o o(t)
- 0 1(/)
- 0 2(/)
...
- 0 ._
b(t) =
1(/)
o
V(/)
(4.11.75)
In the next chapter we will show that for all lET there exists a unique
solution ~ to the initial-value problem
i =
(X T)
A(t)x
=; =
b(t)
(4.11.76)
CI '
= el'
C(tJ(r- )
'2'
is the solution
= , .
... , Clo-(> ! r- )
We now have:
.4 11.77.
lX .>
... , I .}
+ ... +
I.- I>
O._I(t)X
o._I(/)x(-tJ
OI(t)X()J
01(/)X()J
; E
C(/)
= CA(/)
+ to I ,(/)
t:1
I'
W
{ ,(I .. .
W(I h
oo(/)X
oo(t)x =
= O.
v(1),
(4.11.78)
(4.11.79)
T,
(4.11.80)
where CA is the solution of Eq. (4.11.78) with CA(T) = ' I ' and where ~(lI'
... ,I.X/) is obtained from W(lI" .. , I .)(/) by replacing the ith column of
W(lI" .. , I .X/)
by (0,0, ... , l)T.
.4 11.81.
Exercise.
.4 11.82.
tion
Example.
12 x
12>
tx
ltJ
-
b(t), t
>
0,
(4.11.83)
.4 11.
O. This equation
is
(4.11.84)
I
t
-1
t, V'1' .(t) =
l/t,
=--,
tr
ln
:i =
where
Ax,
(4.11.87)
-al
(4.11.88)
- a 3 ..
- a ._ I
We now show that the eigenvalues of matrix A ofEq . (4.11.88) are precisely
the roots of the characteristic equation (4.11.86). First we consider
-,t
-,t
det(A - , tI)
-a2
o
o
-,t
o
o
Chapter 4 I iF nite-Dimensional
160
-1
-1
-1
-01
-0"
-03
= -1
...
-(1
- 0 "_ , ,
.
0,,_ 1 )
100
-1
+
sU ing
(- 1 )"+ 1 (- 0
(- I )"{ l "
, ,_11,,-1
0)
+ ... +
-1
all
oo}.
0.
(4.11.89)
(4.11.91)
1"
where 1 1 , ,1" denote the eigenvalues of matrix A. eL t
Vanclermonde matrix given by
V denote the
I
V=
11
II
1"
l~
1"
l~
A(t)x .
(4.11.92)
Let A*(t) denote the conjugate transpose of A(t). (That is, ifA(t) = o[ (J' t)],
then A*(t) = a[ l}(t)]T = a[ ,J (t)],
where a,it) denotes the complex conjugate
.4 12.
261
T*Y =
C,
where. C is a constant non-singular matrix, and where T* denotes the conjugate transpose of T.
It is also possible to consider adjoint equations for linear nth-order
ordinary differential equations. eL t us for example consider Eq. (4.11.85),
the study of which can be reduced to that of Eq. (4.11.87), with A specified
by Eq. (4.11.88). Now consider the adjoint system to Eq. (4.11.87), given by
y= - A *y,
where
0
0
0
-I
- A *=
(4.11.95)
0
0
0
0 -I
ao
a
a2
1
.....................
o 0
- I a.-
2Y =
.Y
aoY.,
-YI
= - Y , ,- I
(4.1 1.96)
0, ... , n -
I. Equation
(4.1 1.97)
alY . '
a,,-I.Y
Differentiating the last expression in Eq. (4.11.97) (n ... ' Y " - I '
and letting "Y = ,Y we obtain
C
(- I ) y "> + (- I ),,- l a._1Y c..-I> + ... + (- I )Qlit> +
1) times, eliminating
Y"
aoY
O.
(4.11.98)
.4 12.
C1uIpter 4
I Fmite-Dimensional
cations. (In particular, consult the references in .4 [ 10] for a list of diversified
areas of applications.)
ExceUent references on ordinary differential equations include .4 [ 3], .4 [ 5],
and .4 [ 11].
REFERENCES
.4 [ 1]
.4 [ 2]
.4 [ 3]
.4 [ ] 4
.4[ 5]
.4 [ 6]
.4 [ 7)
.4[ 8]
.4[ 9]
.4[ 10]
.4 [ 11]
Inc., 1961.
S. IL PSCHT
U Z,
Linear Algebra. New York: McGraw-iH ll Book Company,
1968.
B. NOBLE, Applied iL near Algebra. Englewood aiit' s , N.J.: Prentice-aH ll,
Inc., 1969.
.L S. PoNTlU A OIN, Ordinary Differential Equations. Reading, Mass.:
1989.
METRIC SPACES
5.1.
DEFINITION
OF
METRIC SPACE
<
>
y;
The function p is called a metric on ,X and the mathematical system consisting of p and ,X {X; p}, is called a metric space.
The set X is often called the underlying set of the metric space, the elements
of X are often called points, and p(x, y) is frequently called the distance
from a point x E X to a point y E .X In view of axiom (i) the distance
between two different points is a unique positive number and is equal to zero
if and only if two points coincide. Axiom (ii) indicates that the distance
between points x and y is equal to the distance between points y and x.
Axiom (iii) represents the well-known triangle inequality encountered, for
example, in plane geometry. Clearly, if p is a metric for X and if IX is any
real positive number, then the function IXp(X, y) is also a metric for .X We
are thus in a position to define infinitely many metries on .X
The above definition of metric was motivated by our notion of distance.
Our next result enables us to define metric in an equivalent (and often convenient) way.
5.1.2. Theorem. eL t p: X
(i) p(x, y) =
(ii) p(y, )z <
X -
p(x , y)
.X
Proof
5.1.
5.1.3. Example.
L e t X be the set of real numbers R, and let the function
P on R x R be defined as
p(x, y)
= Ix
yI
(5.1.4)
5.1.5. Example.
L e t X be the set of all complex numbers C. If z E C,
and where a, b are real numbers. Let
then z = a + ib, where i = . .;= 1 ,
i = a - ib and define p as
p(z
l'
Z2)
([ z
Z2)(Z I
I -
Z2)],12.
(5.1.6)
0 if x =
{ I if x
* y.
p(x, y) =
(5.1.8)
PI( X , y)
1+
p(x , y)
.
p(x , y)
(5.1.11)
R* =
u o+{ o}
{ - o o}
RU
the extended real numbers. In the following exercise, we define a useful metric
on R*. This metric is, of course, not the only metric possible.
5.1.12. Exercise.
Let
1:+ :
~
[(x)
1 {lxi'
:: ~:
E
5.1.13. Theorem. L e t { X ;
elements of .X
for all x , ,Y z E
Proof
Then
.x
Ip(x,
)z -
I<
p(y, )z
(5.1.14)
p(x, y)
<
p(x, )z
and
<
p(y, x)
p(x, z) -
p(y, z)
P(Y, z)
p(x, y)
p(y, )z
(5.1.15)
p(x, )z .
(5.1.16)
p(x, y)
(5.1.17)
P(Y, )z .
(5.1.18)
F r om (5.1.15) we have
<
- p (y, x)
p(x, z) -
<
p(x, z) p(y, z)
I<
P(Y, z)
<
p(x, y).
.X
5.1.
267
t5(Y)
sup p{ (x,
y): ,x y E .} Y
d(Y , Z) =
Let p
inffp(y, )z : y
X and define
d(p, Z)
inffp(p, )z : z
,Y z
E
Z}.
Z}.
p'(x,
then f;Y
y) =
,Y
5.1.23. Exercise.
We call p' the metric induced by p on ,Y and we say that {Y; p'} is a
metric subspace of fX ; p} or simply a subspace of .X Since usually there is
no room for confusion, we drop the prime from p' and simply denote the
5.2.
Let p, q
R such that I
<
for all~,
00
~p :::;; ,~
(ii) (H6Ider's
inequality)
1
1
- + -p= 1 . q
(a)
Let
<
p, q
1. Then
+ fJq ' .
(5.2.2)
<
00,
and
n'
(5.2.3)
R or C. If ~
"{ l
le,l' <
and I'{ I}
00
and ~
<
00,
then
(5.2.4)
(c)
Integrals.
f, g:
la, b]
Let
-+
la, b)
R. If
00
and
I/':U
Ig(t) ~
dt]
II'.
00
(5.2.5)
(b)
269
(c)
:U
Proof
Integrals.
,J g: a[ , b]
a[ , b]
Let
-+
R. If
then
If(tWdt <
:U
= fl.'/p and q 2 =
any choice of fl., P >
ql
and
I
If(tW dtT '
(tile/l,)' / ,
(~I~II,Y!J'
00,
/
Ig(tW dtT ,.
(5.2.8)
(tilell') II' =
q2
0 or if
7= = 0 and
lell
:U
+
s:
P'/.q
0, then inequality
Ig(tW dt <
00
(5.2.7)
in either
eo
(iil'lII,)'/I 7= =
(~'TI Iyl'
< 1- .
1'1/1
(~I~II')
lell'
Hence,
5.1.9.
iF gure A.
+ 1- .
q
(~'TI ')
I'llI'
270
It now follows that
~ I,~ ' I
~ '~,I 'I,I
=
~ (~
I~,I,)'/
(~I' ,~)'/.
< (~I~,I,)'/'(~
~ 1,~I' 1
1'1,1,)'/.'
I' "
+ I'I,IY =
(I~,I
I'I,I)'-II~,I
1[ ,' 1 +
I'I,I]'-I'~,I
(I~,I
+ 1'1,1)'1- 1'1,1
(I~,I
~
=
+
~
[I~,I
+ 1'1,1]1-' 1'1,1
Applying the Halder inequality (5.2.3) to each of the sums on the right side
of the above relation and noting that (p - l)q = p, we now obtain
+ I' ,I)'T/'[~
[t
+ [~(le,1
/
(le,1 + 1'1,1),]1 ' *- 0
'sl
[ ~
Since [ ~
I~,
1] /' < [ ~
[ .~ (I ,~ I +
(1',1 + 1'1,1)'
1' ,1'
1] 1'
I~,I'J/
<
a
We note that in case [ I;
(I~,I
1=1
+ 1'I,l)'T
+ 1'1,1)' 1] /' =
0, inequality
(5.2.6) follows
trivially.
Applying the same reasoning as above, the reader can now prove the
Minkowski inequality for infinite sum!! and for integrals. _
If in (5.2.3), (5.2.4), or (5.2.5) we let p = q = t, then we speak of the
Schwarz inequality for finite sums, infinite sums, and integrals, respectively.
271
5.2.10. Exercise.
Prove H o lder' s
inequality
for integrals (5.2.5),
Minkowski' s inequality for infinite sums (5.2.7), and Minkowski's inequality
for integrals (5.2.8).
5.3.
EXAMPLES
OF
I' '
5.3.1. Example.
Let
I' '
Rn (let X =
[ t1=1
p,(x, y) =
00,
and let
(5.3.2)
p,(a,d)=
:::;;
} 11,
} 11,
~1(l.1-~,I'
{
1~ (l.I-PII'
1.l,~I(l.I-~ PI+PI-~II'
=
~IP,-~,I'
11,
11,
= p (a,b)+ p (b,d),
F o r ,x y
poo(x, y)
It is readily shown that
5.3.5. Ex a mple.
Let
E R~
(for ,x y
max (1'1 -
(R~;
poo}( cn;
<
1 <p
I, =
x{
00,
EX :
E C~),
1111, ...
let
(5.3.4)
R- (or X
t I'll' <
I-I
oo} .
(5.3.6)
172
F o r ,x y
[ I;.. I{,
I" let
pJ X , y)
- 1' ,1'
I- I
1] /, .
(5.3.7)
eL t X
I..
F o r ,x y
R" (or X
= x{
:X sup 1{ ,{ 1l
I.. , define
< oo}.
(5.3.9)
p..(x, y) = sup
, I{ ,{ - I' ,ll
(5.3.10)
i[
Ix(t) -
y(t)IP dt
IJ I'
(5.3.13)
1[
Ix ( t) -
y(t) I' dt
1] /, > o.
<
=
I{ lu(t) I{ Iu(t) b
I{ lu(t) b
p,(u, w) =
p,(u, v)
w(tWdt
v(t) +
} 1/,
v(t) -
v(tWdt
1/,
w(t) I' dt
+ I{ b
} 1/,
Iv(t) -
w(tWdt
1/,
p,(v, w),
the triangle inequality. It now follows that e{ ra, bJ; p,} is a metric space.
It is easy to see that this space is an unbounded metric space. _
273
5.3.14. Example.
eL t
F o r x, y E ~[a,
b], let
b] be defined as
a[~ ,
p_ ( x ,
y)
sup Ix { t )
-
GStS6
In
(5.3.15)
y{ t ) I.
p_ ( x , y)
sup Ix { t )
Ix - y l
o
x =
z{t)
.StS6
z ( t) -
y{ t )
i
'
I
y(t)/
sup Ix ( t) -
GStS6
I
I
I
I
R, pix, yl
= Ix - yl
ev= ( v,.V2)
(x "
X"
.- - - - , -
2x 1
o
P.(x ,
X .. R2, P.(X ,
yl
- 2Y 11
(x tl
- -
era,
bJ , P. (x"
-~
Ib
I
2x 1 = sup { I x l
- --
- -
(t)- 2
x t{ 111
a~tb
~
5.3.16.
174
Chopttr 5
<
sup I{ x ( t) -
)z
p_(,x
=
b]; P-J
I + Iz(t)
z(t)
.S;' 9
Mttr;c Spacts
y(t) I}
-
y).
p_(,z
is a metric space.
In Figure B,. several metrics considered in Section 5.1 and in the present
section are depicted pictorially.
5.3.17. Exercise.
to
p_(,x
"
,-- [ I-'
y)
lim I; I~,
/
- 1' ,1' IJ , .
5.3.18. Exercise.
5.3.19. Theorem. L e t
Z = X x .Y Let ZI =
Define the functions
P,(ZI' Z2)
and
{X;
(XI>
([p(z IX >
P_(ZI'
l> ' +
[ p iY I '
Y2)],}I/"
Z2)
Then Z
{ ; PI} and Z
{ ; P-J
The spaces Z
{ ; P,J and Z
{ ; P-J
1 <p
<
00
2Y )}.
5.3.20. Exercise.
We can extend
We have:
XI
E
X,
... X
"X
" "X
IT
PIJ, ... ,{ X , ,;
t-~
y)
" P,(X
= I;
For
'~I
"
... , IX I)
y,)
,X Y
(YI'
... ,
5.4.
175
and
p"(x , y) =
Then { X ;
p'} and { X ;
5.3.22. Exercise.
5.4.
(I-'
~
p[ ,(x"
)1/~
y,)~]
SETS
R~
as a function of r if the
however, there
276
'.
oX
Sphere S(XO; rl, where X = R
and pIx , yl = Ix - vi
(b -1I212J~
t2
t 02 +
t 02 - r
9! - i
- - +- -,
t 02 +
t 02
1.,-,
I~
tOI - r
- ~ ."
-~.
:
tOI
tOI + r
~ I
I - " I+ I.,
~2
I
I
t1
tOI - r
tOI
112 1
to! + r
era,
bJ
tI
oX - r
y(tl I
a~t~b
5.4.2.
x l tl
. various
.
Figure C Spheres In
metric spaces.
5.4.
277
which contains no point of Y o ther than x itself. The point x is called a limit
point or point of accumulation of set Y if every sphere with center at x contains
an infinite number of points of .Y The set of all limit points of Y is called the
derived set of Y a nd is denoted by .'Y
Our next result shows that adherent points are either limit points or
isolated points.
5.4.6 . Theorem. eL t Y be a subset of X and let x E .X Ifx is an adherent
point of ,Y then x is either a limit point or an isolated point.
We prove the theorem by assuming that x is an adherent point of Y
but not an isolated point. We must then show that x is a limit point of .Y
To do so, consider the family of spheres S(x; lin) for n = 1,2, .... eL t
fX t E S(x; lin) be such that fX t E Y b ut fX t 1= = x for each n. Now suppose there
are only a finite number of distinct such points X ft , say, lX { '
... , x k } . If we
let d = min p(x, IX )' then d > O. But this contradicts the fact that there is
Proof
1:S:I:S:k
5.4.7.
o<
o
5.4.8.
5.4.7.
iF gure D. Set Y =
{x
2
R: 0
<
x
< 1, x
=
2} of Example
278
Chapter 5
I Metric Spaces
(i)
Y c
f;
(ii) f = f;
(iii) if Y c Z, then
(iv)
(v)
(vi)
= f u i;
Y n Z c f n i;
f = Y U Y'.
YUZ
i;
and
To prove the first part, let x E .Y Then x E S(x ; r) for every r > O.
Hence, x E .Y Therefore, Y c f.
To prove the second part, let x E ,Y and let r> O. Then there is an
XI
E Y such that X I E S(x ; r),andhencep(x , X I ) = r l < r. L e tro = r - r l
> O. WenowwishtoshowthatS(x l ; ro) c S(x ; r). Indoingso,lety E S(x l ;
ro)' Then p(y, X I ) < roo By the triangle inequality we have p(x , y) ~ p(x ,
XI) +
p(x l , y) < r l + (r - r l ) = r, and hence y E S(x ; r). Since X I E f,
the sphere S(x l ; ro) containsapointx 2 E .Y Thus, X 2 E S(x ; r). Since S(x ;
r) is an arbitrary spherical neighborhood of x , we have X E .Y This proves
that c .Y Also, in view of part (i), we have Y c
Therefore, it follows
that = Y
.Y
To prove the third part of the theorem, let r > 0 and let X E .Y Then
there is ayE Y such that y E S(x ; r). Since Y c Z, Y E Z and thus X is an
adherent point of Z.
To prove the fourth part, note that Y c Y U Z and Z c Y U Z. F r om
part (iii) it now follows that Y c Y U Z and i c Y
U Z. Thus, f u i
c Y U Z. To show that Y U Z c f u i, let X E Y U Z and suppose
that X :q Y u i. Then there exist spheres S(x ; r l ) and S(x ; r2) such that
S(x ; r l) n Y = 0 and S(x ; ' 2 ) n Z = 0. L e t
r = min {'It :' ' } z
Then
S(x ; r) n [ Y U Z] = 0. But this is impossible since X E Y U
Z. Hence,
X E Y
u i, and thus Y U Z c f u i.
The proof of the remainder of the theorem is left as an exercise.
_
Proof
5.4.11.
r.
Exercise.
5.4.
279
exists a sphere Sex; r) such that sex; r) c .Y The set of all interior points of
set Y is called the interior of Y a nd is denoted by yo. A point x E X is an
ex t erior point of Y if it is an interior point of the complement of .Y The
exterior of Y is the set ofall exterior points of set .Y The set ofall points x E X
such that x E f () (Y - )
is called the frontier of set .Y The boundary of a
set Y is the set of all points in the frontier of Y which belong to .Y
5.4.13. Example.
Let R
{ ; p} be the real line with the usual metric, and
let Y = y{ E R: 0 < :Y 5: I} = (0, I]. The interior of Y is the set (0, I) =
{ y E R: 0 < y < I}. The exterior of Y i s the set (- 0 0, 0) U (I, + 0 0), f =
y{ E R: < Y : 5:
I} = 0[ , I] and Y - = (- 0 0,0] U 1[ , + 0 0). Thus, the
frontier of Y is the set CO, I}, and the boundary of Y is the singleton l{ .}
5.4.15.
(i)
(ii)
Theorem.
and 0 are open sets.
If { .Y } .. eA is an arbitrary family of open subsets of ,X
X
is an open set.
(iii) The intersection of a finite number of open sets of X
then
eA
Y ..
is open.
Proof To prove the first part,. note that for every x E X, any sphere
Sex; r) c .X Hence, every point in X is an interior point. Thus, X is open.
Also, observe that 0 has no points and therefore every point of 0 is an
interior point of 0. Hence, 0 is an open subset of .X
To prove the second part, let .Y{ .} EA
be a family of open sets in ,X and
Y . If Y .. is empty for every tt E A, then Y = 0 is an open
let Y = U
.eA
Y . for some
that sex; r)
Therefore,
If Y 1 () Y 2
0, and let
E Y
T 1) C
5.4.16.
deter-
Theorem.
then r is closed.
then Z- is open.
Proof
The first part of this theorem follows immediately from the definitions of ,X 0, and closed set.
To prove the second part, let Y b e any open subset of .X We may assume
that Y 1= = 0 and Y 1= = .X Let x be any adherent point of Y - . Then x cannot
belong to ,Y for if it did, then there would exist a sphere S(x ; ,) c ,Y which
is impossible. Therefore, every adherent point of Y - belongs to Y - , and thus
Y - is closed if Y is open.
To prove the third part, let Z be any closed subset of .X Again, we may
assume that Z 1= = 0 and Z 1= = .X L e t x E Z- . Then there exists a sphere
S(x ; T) which contains no point of Z. This is so because if every such sphere
would contain a point of Z, then x would be an adherent point of Z and
consequently would belong to Z, since Z is closed. Thus, there is a sphere
S(x ; r) c Z- ; i.e., x is an interiorpointofZ- . Since this holds for arbitrary
x E Z- , Z- is an open set. _
In the next
sets.
5.4.18.
Theorem.
eA
5.4.
281
Proof To prove the first part, let Sex; r) be any open sphere in .X L e t
x . E sex; r), and let p(x, lX ) = r . If we let r o = r - ' . , then according to
the proof of part (ii) of Theorem 5.4.10 we have S(x l ; ro) c Sex; r). Hence,
x . is an interior point of sex; r). Since this is true for any x . E sex; r), it
follows that sex ; r) is an open subset of .X
To prove the second part of the theorem, we first note that if Y = 0,
then Y is open and is the union of an empty family of spheres. So assume
that Y t= = 0 and that Y is open. Then each point X E Y is the center of a
sphere Sex; r) c ,Y and moreover Y is the union of the family of all such
spheres.
The proof of the last part of the theorem is left as an exercise.
5.4.19.
Exercise.
5.4.20.
Let { Y ;
V=
.,el'
S' ( x ; r)
.,el'
.,el'
S(x ; r)n
Sex; r) =
U
Y.
pl.
5.4.21.
Exercise.
The first part of the preceding theorem may be stated in another equivalent
way. L e t 3 and 3' be the topology of ;X {
p} and {Y; pI, respectively, generated
by p. Then 3' = { Y n :U U E 3}.
Let us now consider some specific examples.
n- "Y
,,= \
= x{
R: a
is not an open subset of R. (This. can readily be verified, since every sphere
S(b; r) contains a point greater than b and hence is not in
n- "Y .)
,,= \
In the above example, let Y = (a, b]. We saw that Y is not an open subset
of R; i.e., b is not an interior point of .Y oH wever, if we were to consider
{ Y ; p} as a metric space by itself, then Y is an open set.
5.4.24. Example.
eL t e{ ra, b]; p_} denote the metric space of Example
5.3.14. eL t 1 be an arbitrary finite positive number. Then the s~t of continuous
functions satisfying the condition Ix ( t) I < 1 for all a < t < b is an open
_
subset of the metric space e{ ra, b]; p_.}
Theorems 5.4.15 and 5.4.17 tell us that the sets X and 0 are both open
and closed in any metric space. In some metric spaces there may be proper
subsets of X which are both open and closed, as illustrated in the following
example.
5.4.25. Example. eL t X be the set of real numbers given by X = (- 2 ,
- 1 ) U (+ 1 , + 2 ), and let p(x , y) = Ix - yl for x , y E .X Then { X ; p}
is
clearly a metric space. Let Y = (- 2 , - 1 ) c X and Z = (+ I, + 2 ) c .X
Note that both Y and Z are open subsets of .X oH wever, Y - = Z, Z- = ,Y
and thus Y a nd Z are also closed subsets of .X Therefore, Y and Z are proper
subsets of the metric space ;X {
p} which are both open and closed. (Note that
in the preceding we are not viewing X as a subset of R. As such X would be
open. Considering ;X{
p} as our metric space, X is both open and closed.) _
5.4.26. Exercise. eL t { X ; p} be a metric space with p the discrete metric
defined in Example 5.1.7. Show that every subset of X is both open and
closed.
In our next result we summarize several important properties of closed
sets.
5.4.
5.4.27.
Theorem.
eA
.eA
5.4.28.
Y:
.eA
Y. is a closed subset of .X
(n
.eA
Exercise.
.=1
Y. =
(x
R: 0 <
x
<
a} =
(0, a]
is not a closed subset of the real line, as can readily be verified since
adherent point of (0, a].
is an
5.4.31. Exercise.
The set K ( x o; r) defined in part (ii) of Theorem 5.4.27
is sometimes called a closed sphere. It need not coincide with S(x o; r), i.e.,
the closure of the open sphere S(x o; r).
(i) Show thatS(x o; r) c K(xo;r).
(ii) Let (X ; p} be the discrete metric space defined in Example 5.1.7.
Describe the sets S(x; I), S(x ; I), and K(x;
I) for any x E X and conclude
I) if X contains more than one point.
that, in general, S(x ; I) K ( x ;
*'
*"
Exercise.
5.4.37. Example. Let {R; p,} be the metric space defined in Example
5.3.1 (recall that 1 < p < 00). The set of vectors x = (e I'
,e.) with
rational coordinates (i.e.,
is a rational real number, i = I,
,n) is a
denumerable everywhere dense set in R and, therefore, R
{ ; p,} is a separable
metric space. _
e,
5.4.38. Example. eL t {l,; p,} be the metric space defined in Example 5.3.5
(recall that I < p < 00). We can show that this space is separable in the
following manner. eL t
Y
= .Y{
I,: .Y
= 1, ... ,n} .
5.4.
285
1:
I~kl'
<_. 2
E Y such that
eH nce,
i.e., p,(x,
<
)~Y
By Theorem 5.4.34,
E.
5.4.39.
~/~b
5.4.40.
Ex e rcise.
U s ing
theorem, show
that the metric spaces e{ ra, b]; P,}, defined in Example 5.3.12, and e{ ra, b];
p~,}
defined in Example 5.3.14, are separable.
Exercise. Show that the metric space { X ; p}, where pis the discrete
metric defined in Example 5.1.7, is separable if and only if X is a countable
set.
5.4.41.
5.3.8. Let
Example.
Y c R~
L e t {l~;
p~}
be the metric space defined in Example
denote the set
Y ={y
Clearly then Y c
such that
IX
.~I
E R~:
I~.
'1~H)~,
Y =
0 or I}.
286
5.5.
COMPLETE
METRIC SPACES
The set of real numbers R with the usual metric p defined on it has many
remarkable properties, several of which are attributable to the so-called
"completeness property" of this space. F o r this reason we speak of R
{ ; p}
as being a complete metric space. In the present section we consider general
complete metric spaces.
Throughout this section {X; p} is our underlying metric space, and J denotes
the set of positive integers. Before considering the completeness of metric
spaces we need to consider a few facts about sequences on metric spaces (cf.
Definition 1.1.25).
5.5.1. Definition. A sequence .x { }
in a set Y c: X is a functionf: J Thus, if .x{ }
is a sequence in ,Y thenf(n) = x . for each n E .J
.Y
lim x .
or x . - x as n - 00. If there is no x
then we say that {x.l diverges.
= ,x
The range off in the above definition may consist of a finite number of
points or of an infinite number of points. Specifically, if the range of f
Clearly, all
a{ + ( nl)"}
converges
5.5.6. lbeorem. eL t ,x { ,}
be a sequence in .X
Then
"
= x;
(ii) if ,x { ,}
is convergent, then it is bounded;
(iii) ,x { ,} converges to a point x E X if and only if every sphere about x
"
(vi) if ,x { ,}
converges to x E X and if the sequence y{ ,,} of X converges
to Y E ,X then lim p(x", y,,) = p(x, y); and
0 such
= x and
"
lim "x = y. Then for every f > 0 there are positive integers N" and N)' such
" p(x", x ) < f/2 whenever n > N" and p(x", y) < f/2 whenever n > N
that
r
Proof.
If we let N
Now
288
is less than every positive number is ez ro, it follows that p(x, y) = 0 and
therefore x =
y.
To prove part (iii), assume that lim x . = x and let Sex; f) be any sphere
about .x Then there is a positive integer N such that the only terms of the
sequence { x . } which are possibly not in Sex; f) are the terms X I ' x 2 , , X N - 1
Conversely, assume that every sphere about X contains all but a finite number
of terms from the sequence .x{ .}
With f > 0 specified, let M = max n{ E :J
.x 1= S(x ; f)} . IfwesetN= M +
l,thenx . E S(x ; f)foralln> N ,which
was to be shown.
To prove part (v), we note from Theorem 5.1.13 that
lP(y, )x -
I=
p(x, .x ).
By hypothesis, lim x . =
- p (y, x . )
I<
p(y, x.)
p(y, x) .
iF nally, to prove part (vii), suppose to the contrary that p(x, y) > .Y'
Then 6 = p(x, y) - i' > O. Now'Y - p(x., y) > 0 for all n E ,J and thus
0<
for all n
6<
p(x, y) -
p(x., y)
<
p(x, x . )
X.
.x Thus, p(x, y)
<
y.
c .X
S.S.8. Theorem. eL t
(i) x
.Y { }
(ii) x
.Y{ }
(iii)
Y be a subset of .X
Then
E X
5.5.
Proof
"*
5.5.10. Exercise.
converges in
p} is
Proof
"
<
el2 whenever m, n
Let ,x { ,}
>
N.
p} a Cauchy sequence
0 we can find an
is a bounded
We need to show that there is a constant "I such that 0 < "I < 00 and
such that p(x"" ,x ,) < "I for all m, n E I.
Letting e = I, we can find N such that p(x"" ,x ,) < I whenever m, n ~ N.
Now let l = max p{ (XI>
x z ), p(x l , x 3), ... ,p(x l , x N)). Then, by the triangle
inequality,
p(x l , ,x ,) < P(X l ' x N ) + p(x N , ,x ,) < (l + I)
Proof
I. Thus, p(x"" ,x ,)
<
2(A
<
p(x
I) and ,x { ,}
,x ,)
is a bounded sequence.
We also have:
Exercise.
291
Complete metric spaces are of utmost importance in analysis and applications. We will have occasion to make extensive use of the properties of such
spaces in the remainder of this book.
5.5.17. Example. Let X = (0, 1), and let p(x, y) = |x - y| for all x, y ∈ X.
Let x_n = 1/n for n ∈ J. Then the sequence {x_n} is Cauchy (i.e., it
is a Cauchy sequence), since |x_n - x_m| < 1/N for all n > m > N. Since
there is no x ∈ X to which {x_n} converges, the metric space {X; p} is not
complete.

5.5.18. Example. Let X denote the set of rational numbers, and let
p(x, y) = |x - y| for all x, y ∈ X. Let
x_n = 1 + 1/1! + 1/2! + ... + 1/n!
for n ∈ J. The sequence {x_n} is a Cauchy sequence in X; however, it
converges to the irrational number e, and hence {X; p} is not a complete
metric space.

5.5.22. Exercise. Let {X_1; p_1} and {X_2; p_2} be complete metric spaces,
let X = X_1 × X_2, and for x = (x_1, y_1), y = (x_2, y_2) ∈ X define
p(x, y) = {[p_1(x_1, x_2)]^2 + [p_2(y_1, y_2)]^2}^{1/2}.
Show that {X; p} is a complete metric space.
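To make Example 5.5.17 concrete, the following short Python sketch (an illustration added here, not part of the original text) tabulates the tail distances p(x_n, x_m) = |1/n - 1/m| and shows the terms clustering near 0, a point which lies outside X = (0, 1).

```python
# Example 5.5.17: x_n = 1/n is Cauchy in X = (0, 1) under p(x, y) = |x - y|,
# yet its only candidate limit, 0, is not an element of X.

def p(x, y):
    return abs(x - y)

x = [1.0 / n for n in range(1, 1001)]

# Cauchy behavior: for n, m > N the distances stay below roughly 1/N.
for N in (10, 100, 500):
    tail = max(p(x[n], x[n + 1]) for n in range(N, len(x) - 1))
    print(N, tail)

print(min(x))  # terms approach 0, which is not in (0, 1)
```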
5.5.25. Exercise. Let X = R^n (let X = C^n) denote the set of all real (of
all complex) ordered n-tuples x = (ξ_1, ..., ξ_n). Let y = (η_1, ..., η_n), let
p_p(x, y) = [Σ_{i=1}^{n} |ξ_i - η_i|^p]^{1/p}, 1 ≤ p < ∞,
and let
p_∞(x, y) = max{|ξ_1 - η_1|, ..., |ξ_n - η_n|}, i.e., p = ∞.
Utilizing the completeness of the real line (of the complex plane), show that
{R^n; p_p} = R^n_p ({C^n; p_p} = C^n_p) is a complete metric space for 1 ≤ p ≤ ∞.
In particular, show that if {x_k} is a Cauchy sequence in R^n_p (in C^n_p), where
x_k = (ξ_{1k}, ..., ξ_{nk}), then {ξ_{jk}} is a Cauchy sequence in R (in C) for j = 1,
..., n, and {x_k} converges to x, where x = (ξ_1, ..., ξ_n) and ξ_j = lim_k ξ_{jk}
for j = 1, ..., n.
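A numerical companion to Exercise 5.5.25 (our added sketch, with a hypothetical sequence): a Cauchy sequence in R^n_p is Cauchy in each coordinate, and the coordinatewise limits assemble into the limit of the sequence under every p-metric.

```python
# Exercise 5.5.25, numerically: x_k = (1 + 1/k, 2 - 1/k**2) converges
# coordinatewise to (1, 2), hence converges in R^2_p for p = 1, 2, inf.

def p_metric(x, y, p):
    if p == float("inf"):
        return max(abs(a - b) for a, b in zip(x, y))
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

xs = [(1 + 1.0 / k, 2 - 1.0 / k**2) for k in range(1, 10001)]
limit = (1.0, 2.0)

for p in (1, 2, float("inf")):
    print(p, p_metric(xs[-1], limit, p))  # distance to the limit tends to 0
```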
5.5.26. Example. Let {l_p; p_p} be the metric space defined in Example
5.3.5. We now show that this space is a complete metric space.
Let {x_k} be a Cauchy sequence in l_p, where x_k = (ξ_{1k}, ξ_{2k}, ...) and
p_p(x_k, x_j) = [Σ_{i=1}^{∞} |ξ_{ik} - ξ_{ij}|^p]^{1/p}.
Since every Cauchy sequence is bounded, there is a γ such that
[Σ_{i=1}^{∞} |ξ_{ik}|^p]^{1/p} ≤ γ
for all k ∈ J. Now let n be any positive integer, let p'_n be the metric on R^n
defined in Exercise 5.5.25, and let x_k^(n) = (ξ_{1k}, ..., ξ_{nk}). Then
p'_n(x_k^(n), x_j^(n)) ≤ p_p(x_k, x_j), and thus {x_k^(n)} is a Cauchy sequence in
R^n_p. It also follows that p'_n(0, x_k^(n)) ≤ γ for all k ∈ J. Now by Exercise
5.5.25, {x_k^(n)} converges to x^(n), where x^(n) = (ξ_1, ..., ξ_n) and
ξ_i = lim_k ξ_{ik}. It follows from Theorem 5.5.6, part (vii), that
p'_n(0, x^(n)) ≤ γ; i.e.,
[Σ_{i=1}^{n} |ξ_i|^p]^{1/p} ≤ γ.
Since this holds for every n, it follows that x = (ξ_1, ξ_2, ...) ∈ l_p.
It remains to show that lim x_k = x. Given ε > 0, there is an
integer N such that p_p(x_j, x_k) < ε for all k, j ≥ N. Again, let n be any
positive integer. Then we have p'_n(x_k^(n), x_j^(n)) < ε for all j, k ≥ N. For fixed
n, we conclude from Theorem 5.5.6, part (vii), that p'_n(x^(n), x_k^(n)) ≤ ε for all
k ≥ N; i.e.,
[Σ_{i=1}^{n} |ξ_i - ξ_{ik}|^p]^{1/p} ≤ ε for all k ≥ N, where N depends
only on ε (and not on n). Since this must hold for all n ∈ J, we conclude
that p_p(x, x_k) ≤ ε for all k ≥ N. This implies that lim x_k = x. Hence,
{l_p; p_p} is a complete metric space.

5.5.27. Exercise. Show that the metric space {l_∞; p_∞} is complete.
5.5.28. Example. Let {C[a, b]; p_∞} be the metric space defined in Example
5.3.14. Thus, C[a, b] is the set of all continuous functions on [a, b] and
p_∞(x, y) = sup_{a≤t≤b} |x(t) - y(t)|.
We now show that {C[a, b]; p_∞} is a complete metric space. If {x_n} is a Cauchy
sequence in C[a, b], then for each ε > 0 there is an N such that |x_m(t) - x_n(t)|
< ε whenever m, n ≥ N, for all t ∈ [a, b]. Thus, for fixed t, the sequence
{x_n(t)} converges to, say, x_0(t). Since t is arbitrary, the sequence of functions
{x_n(·)} converges pointwise to a function x_0(·). Also, since N = N(ε)
is independent of t, the sequence {x_n(·)} converges uniformly to x_0(·).
Now from the calculus we know that if a sequence of continuous functions
{x_n(·)} converges uniformly to a function x_0(·), then x_0(·) is continuous.
Therefore, every Cauchy sequence in {C[a, b]; p_∞} converges to an element in
this space in the sense of the metric p_∞. Therefore, the metric space {C[a, b]; p_∞} is complete.
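As a numerical companion to Example 5.5.28 (again an added sketch, not the authors' text), the partial sums of the exponential series form a uniformly Cauchy sequence in {C[0, 1]; p_∞} whose uniform limit is the continuous function exp(t).

```python
# Example 5.5.28, numerically: x_n(t) = sum_{k<=n} t^k / k! is Cauchy in the
# sup metric p_inf on C[0, 1] and converges uniformly to exp(t).
import math

def x(n, t):
    return sum(t**k / math.factorial(k) for k in range(n + 1))

def p_inf(f, g, grid):
    return max(abs(f(t) - g(t)) for t in grid)

grid = [i / 1000.0 for i in range(1001)]
for n in (2, 5, 10):
    print(n, p_inf(lambda t: x(n, t), math.exp, grid))  # sup distance -> 0
```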
5.5.29. Example. Let {C[a, b]; p_2} be the metric space defined in Example
5.3.12; i.e.,
p_2(x, y) = {∫_a^b [x(t) - y(t)]^2 dt}^{1/2}.
We now show that this metric space is not complete. Without loss of generality
let the closed interval be [-1, 1]. In particular, consider the sequence {x_n}
of continuous functions defined by
x_n(t) = 0 for -1 ≤ t ≤ 0, x_n(t) = nt for 0 ≤ t ≤ 1/n, x_n(t) = 1 for 1/n ≤ t ≤ 1.

[5.5.30. Figure J. The sequence {x_n} for n = 1, 2, 3.]

For m > n a direct computation yields
[p_2(x_m, x_n)]^2 = (m - n)^2 ∫_0^{1/m} t^2 dt + ∫_{1/m}^{1/n} (1 - nt)^2 dt,
and it follows that p_2(x_m, x_n) → 0 as m, n → ∞; i.e., {x_n} is a Cauchy
sequence. Now suppose there is an x ∈ C[-1, 1] such that
∫_{-1}^{1} |x_n(t) - x(t)|^2 dt → 0 as n → ∞.
This implies that the above integral with any limits between +1 and -1
also approaches zero as n → ∞. Since x_n(t) = 0 whenever t ∈ [-1, 0], we
have
∫_{-1}^{0} |x_n(t) - x(t)|^2 dt = ∫_{-1}^{0} |x(t)|^2 dt → 0,
and hence x(t) = 0 whenever t ∈ [-1, 0].
Now if 0 < a ≤ 1, then, choosing n > 1/a,
∫_a^1 |x_n(t) - x(t)|^2 dt = ∫_a^1 |1 - x(t)|^2 dt → 0 as n → ∞,
and it follows that x(t) = 1 for t ≥ a. Since a can be chosen arbitrarily close to
zero, we end up with a function x such that
x(t) = 0 for t ∈ [-1, 0], and x(t) = 1 for t ∈ (0, 1].
But this function is discontinuous at t = 0, and hence the Cauchy sequence
{x_n} does not converge to an element of C[-1, 1]; i.e., the metric space
{C[a, b]; p_2} is not complete.
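A short numerical check of the computation in the preceding example (an added illustration, with the integrals approximated on a grid): the p_2-distances between the ramp functions x_n shrink, even though their pointwise limit is the discontinuous step function.

```python
# Example 5.5.29, numerically: {x_n} is Cauchy in p_2 on [-1, 1], but its
# pointwise limit is the discontinuous unit step, which is not in C[-1, 1].

def x(n, t):
    if t <= 0.0:
        return 0.0
    return min(n * t, 1.0)

def p2(f, g, a=-1.0, b=1.0, steps=20000):
    h = (b - a) / steps
    s = sum((f(a + i * h) - g(a + i * h)) ** 2 for i in range(steps))
    return (s * h) ** 0.5

print(p2(lambda t: x(10, t), lambda t: x(20, t)))    # small
print(p2(lambda t: x(100, t), lambda t: x(200, t)))  # smaller still
```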
The sum
Σ_{k=1}^{n} ξ_k μ(E_k)
approximates the area under the graph of f, and it can serve as the definition
of the integral of f between a and b, after an appropriate limiting process
has been performed. Provided that this limit exists, it is called the Lebesgue
integral of f over [a, b], and it is denoted by
∫_{(a,b]} f dμ = ∫_a^b f(x) dx.
There are functions which are Lebesgue integrable but not Riemann integrable
over [a, b]. For example, consider the function f: [a, b] → R defined by
f(x) = 0 if x is rational and f(x) = 1 if x is irrational. This function is so
erratic that the Riemann integral does not exist in this case. However, since
the interval [a, b] = A ∪ B, where A = {x: f(x) = 1} and B = {x: f(x) = 0},
it follows from the preceding characterization of the Lebesgue integral that
∫_{(a,b]} f dμ = 1·μ(A) + 0·μ(B) = b - a.
Let us now consider an important class of complete metric spaces, given
in the next example.
5.5.31. Example. Let p ≥ 1 (p not necessarily an integer), let {R, M, μ}
denote the Lebesgue measure space on the real numbers, and let [a, b] be
a subset of R. Let L_p[a, b] denote the family of functions f: R → R which
are Lebesgue measurable and such that ∫_{[a,b]} |f|^p dμ < ∞. For f, g ∈
L_p[a, b], define
p_p(f, g) = [∫_{[a,b]} |f - g|^p dμ]^{1/p}.
Identifying functions which are equal almost everywhere, we obtain the
metric space {L_p[a, b]; p_p}. It can be shown that this space is complete.

Theorem. Let {X; p} be a metric space, and let Y ⊂ X. If the metric
subspace {Y; p} is complete, then Y is closed.

Proof. Let y ∈ Ȳ. Then there is a sequence {y_n} in Y such that
the sequence {y_n} converges to y. Since {y_n} is a Cauchy sequence in the
complete space {Y; p} we have {y_n} converging to a point y' ∈ Y. But the limit of
a sequence of points in a metric space is unique by Theorem 5.5.6. Therefore,
y' = y; i.e., y ∈ Y and Y is closed.
We leave the proof of the last result of the present section as an exercise.

5.5.35. Theorem. Let {X; p} be a complete metric space, and let {S_n} be
a sequence of closed nested spheres, S_1 ⊃ S_2 ⊃ S_3 ⊃ ..., whose radii tend
to zero. Then the intersection ∩_{n=1}^{∞} S_n consists of exactly one point.

5.6. COMPACTNESS
Consider, for example, the sequence {x_n} in R given by
x_n = (-1)^n/2 + 1/n, n = 1, 2, ....
Then the range of this sequence lies in a bounded set Y and is thus bounded. Hence, the
range has at least one accumulation point. It, in fact, has two.
A theorem from the calculus which is closely related to the Bolzano-Weierstrass
theorem is the Heine-Borel theorem. We need the following
terminology.

5.6.1. Definition. Let Y be a set in a metric space {X; p}, and let A be an
index set. A collection of sets {Y_α: α ∈ A} in {X; p} is called a covering
of Y if Y ⊂ ∪_{α∈A} Y_α. A subcollection {Y_β: β ∈ B} of the covering {Y_α: α ∈ A},
i.e., B ⊂ A, such that Y ⊂ ∪_{β∈B} Y_β, is called a subcovering of {Y_α: α ∈ A}. If
all the members Y_α and Y_β are open sets, then we speak of an open covering
and open subcovering. If A is a finite set, then we speak of a finite covering.
In general, A may be an uncountable set.

We now recall the Heine-Borel theorem as it applies to subsets of the real
line (i.e., of R): let Y be a closed and bounded subset of R. If {Y_α: α ∈ A}
is any family of open sets on the real line which covers Y, then it is possible to
find a finite subcovering of sets from {Y_α: α ∈ A}.
Many important properties of the real line follow from the Bolzano-Weierstrass
theorem and from the Heine-Borel theorem. In general, these
properties cannot be carried over directly to arbitrary metric spaces. The
concept of compactness, to be introduced in the present section, will enable
us to isolate those metric spaces which possess the Heine-Borel and Bolzano-Weierstrass
property.
Because of its close relationship to compactness, we first introduce the
concept of total boundedness.

5.6.2. Definition. Let Y be any set in a metric space {X; p}, and let ε
be an arbitrary positive number. A set S_ε in X is said to be an ε-net for Y
if for any point y ∈ Y there exists at least one point s ∈ S_ε such that p(s, y)
< ε. The ε-net S_ε is said to be finite if S_ε contains a finite number of points.
A subset Y of X is said to be totally bounded if X contains a finite ε-net for Y
for every ε > 0.

Some authors use the terminology ε-dense set for ε-net and precompact
for totally bounded sets.
An obvious equivalent characterization of total boundedness is contained
in the following result.
5.6.3. Theorem. A subset Y ⊂ X is totally bounded if and only if Y can
be covered by a finite number of spheres of radius ε for any ε > 0.
5.6.4. Exercise. Prove Theorem 5.6.3.

[Figure: a finite ε-net for a set Y; S_ε is the finite set consisting of the dots
within the set X.]

5.6.5. Theorem. Let {X; p} be a metric space. Then every totally bounded
subset of X is bounded.

5.6.7. Exercise. Prove Theorem 5.6.5.
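The construction behind Definition 5.6.2 and Theorem 5.6.3 is easy to carry out explicitly. The following is a minimal sketch (ours, not the authors') for the unit square in R^2 with the Euclidean metric: a grid of spacing ε/√2 is a finite ε-net.

```python
# A finite eps-net for Y = [0, 1] x [0, 1] in R^2 with the Euclidean metric:
# grid points spaced eps/sqrt(2) apart leave every point of Y within eps
# of some grid point.
import math

def finite_eps_net(eps):
    step = eps / math.sqrt(2.0)
    m = int(math.ceil(1.0 / step))
    return [(i * step, j * step) for i in range(m + 1) for j in range(m + 1)]

def covered(y, net, eps):
    return any(math.hypot(y[0] - s[0], y[1] - s[1]) < eps for s in net)

net = finite_eps_net(0.1)
sample = [(i / 50.0, j / 50.0) for i in range(51) for j in range(51)]
print(len(net), all(covered(y, net, 0.1) for y in sample))  # finite net covers Y
```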
We note, for example, that all finite sets (including the empty set) are
totally bounded. Whereas all totally bounded sets are also bounded, the
converse does, in general, not hold. We demonstrate this by means of the
following example.
5.6.8. Example. Let {l_2; p_2} be the metric space defined in Example 5.3.5.
Consider the subset Y ⊂ l_2 defined by
Y = {y = (η_1, η_2, ...) ∈ l_2: Σ_{i=1}^{∞} |η_i|^2 ≤ 1}.
Since p_2(0, y) = [Σ_{i=1}^{∞} |η_i|^2]^{1/2} ≤ 1 for all y ∈ Y, the set Y is bounded.
Now consider the set E ⊂ Y of points e_1 = (1, 0, 0, ...), e_2 = (0, 1, 0, ...),
e_3 = (0, 0, 1, 0, ...), etc. Then p_2(e_i, e_j) = √2 for i ≠ j. Now suppose there is a finite
ε-net for Y for, say, ε = 1/2. Let {s_1, ..., s_n} be the net S_ε. Now if e_j is such
that p(e_j, s_i) < 1/2 for some i, then p(e_k, s_i) ≥ p(e_k, e_j) - p(e_j, s_i) > 1/2 for
k ≠ j. Hence, there can be at most one element of the set E in each sphere
S(s_i; 1/2) for i = 1, ..., n. Since there are infinitely many points in E and
only a finite number of spheres S(s_i; 1/2), this contradicts the fact that S_ε is
an ε-net. Hence, there is no finite ε-net for ε = 1/2, and Y is not totally
bounded.
Let us now consider an example of a totally bounded set.
5.6.9. Example. Let {R^n; p_2} = R^n_2, and let
Y = {y ∈ R^n: |η_i| ≤ 1, i = 1, ..., n}.
We show that Y is totally bounded. Given ε > 0, choose N so large that
√n/N ≤ ε, and let S_ε consist of the points
s = (q_1, ..., q_n), q_i = m_i/N,
where -N ≤ m_i ≤ N, i = 1, ..., n. Then S_ε is a finite set, and for any
y ∈ Y there is an s ∈ S_ε with |η_i - q_i| ≤ 1/N, so that
p_2(y, s) ≤ [Σ_{i=1}^{n} (1/N)^2]^{1/2} = √n/N ≤ ε.
Since ε > 0 was arbitrary, Y is totally bounded.

5.6.10. Exercise. Let {l_2; p_2} be the metric space defined in Example
5.3.5, and let Y ⊂ l_2 be the subset defined by
Y = {y ∈ l_2: |η_1| ≤ 1, |η_2| ≤ 1/2, ..., |η_n| ≤ (1/2)^{n-1}, ...}.
Show that Y is totally bounded.
Parts (iii), (iv) and (v) of the above theorem allow us to define a sequentially
compact metric space equivalently as a metric space which is complete and
totally bounded. We now show that a metric space is sequentially compact if
and only if it satisfies the Bolzano-Weierstrass property.
5.6.17. Theorem. A metric space { X ; p} is sequentially compact if and
only if every infinite subset of X has at least one point of accumulation.
We first show that every infinite subset of a compact metric space has a point of accumulation.
Let {X; p} be a compact metric space, and let Y be an infinite subset of
X. For purposes of contradiction, assume that Y has no point of accumulation. Then each x ∈ X is the center of a sphere which contains no point of
Y, except possibly x itself. These spheres form an infinite open covering of
X. But, by hypothesis, {X; p} is compact, and therefore we can choose from
this infinite covering a finite number of spheres which also cover X. Now
each sphere from this finite subcovering contains at most one point of Y, and
therefore Y is finite. But this is contrary to our original assumption, and we
have arrived at a contradiction. Therefore, Y has at least one point of accumulation, and {X; p} is sequentially compact.
Conversely, assume that {X; p} is a sequentially compact metric space,
and let {Y_α: α ∈ A} be an arbitrary infinite open covering of X. From
Lemma 5.6.18 there exists an ε > 0 such that every sphere in X of radius
ε is contained in at least one of the open sets Y_α. Now, by hypothesis, {X; p}
is sequentially compact and is therefore totally bounded by part (iii) of
Theorem 5.6.15. Thus, with arbitrary ε fixed, we can find a finite ε-net,
{x_1, x_2, ..., x_l}, such that X ⊂ ∪_{i=1}^{l} S(x_i; ε). By Lemma 5.6.18, for
each i there is an α_i ∈ A such that S(x_i; ε) ⊂ Y_{α_i}. Hence,
X ⊂ ∪_{i=1}^{l} Y_{α_i},
and X has a finite open subcovering chosen from the infinite open covering
{Y_α: α ∈ A}. Therefore, the metric space {X; p} is compact. This proves the
theorem.
5.6.23. Exercise.
5.6.24. Theorem. In a metric space {X; p} the following statements are
equivalent:
(i) {X; p} is compact;
(ii) {X; p} is sequentially compact;
(iii) {X; p} possesses the Bolzano-Weierstrass property;
(iv) {X; p} is complete and totally bounded; and
(v) every infinite family of closed sets in {X; p} with the finite intersection
property has a nonvoid intersection.
5.6.25. Exercise. Let {X_1; p_1}, {X_2; p_2}, ..., {X_n; p_n} be compact metric
spaces. Let X = X_1 × X_2 × ... × X_n, and let
p(x, y) = p_1(x_1, y_1) + ... + p_n(x_n, y_n)  (5.6.26)
for x = (x_1, ..., x_n), y = (y_1, ..., y_n) ∈ X. Show that {X; p} is a compact
metric space.
Recall that every non-void compact set in the real line R contains its
infimum and its supremum.
In general, it is not an easy task to apply the results of Theorem 5.6.24
to specific spaces in order to establish necessary and sufficient conditions
for compactness. From the point of view of applications, criteria such as
those established in Theorem 5.6.27 are much more desirable.
We now give a condition which tells us when a subset of a metric space is
compact. We have:
5.6.29. Theorem. Let {X; p} be a compact metric space, and let Y ⊂ X.
If Y is closed, then Y is compact.

Proof. Let {Y_α: α ∈ A} be an open covering of Y. Since Y is closed, the
complement of Y is open, and the sets {Y_α: α ∈ A} together with X - Y
form an open covering of X. Since {X; p} is compact, there is a finite subset
B ⊂ A such that
X = (X - Y) ∪ [∪_{α∈B} Y_α].
Since Y ⊂ ∪_{α∈B} Y_α, the finite subcollection {Y_α: α ∈ B} covers Y. Hence,
Y is compact.

5.7. CONTINUOUS FUNCTIONS
5.7.1. Definition. Let {X; p_x} and {Y; p_y} be metric spaces, and let
f: X → Y. The mapping f is said to be continuous at the point x_0 ∈ X if for
every ε > 0 there is a δ > 0 such that
p_y[f(x), f(x_0)] < ε
whenever p_x(x, x_0) < δ. The mapping f is said to be continuous on X or
simply continuous if it is continuous at each point x ∈ X.
308
We denote x
Rn and Y
::: :::
A=
amI
py}
and let { Y ;
a m2
. ...
:::]
.,.
a mn
RT (see Example
Rm by
Rm by
f(x)
Ax
y, oY
[
and
~']
amI
11m
= ~
p[ y(y, OY )]2
Using
R- and
"~ ] e[ ,]
a[ n
=
am_
tL
en
a/j(e J -
eOJ)r
Now let
M=
t{ 1
tJ
yo)2
all}
Ct ah) ~LJ
< [~
1/1
1= =
0 (if
M=
(e
J
e I)2)
O
<
>
5.7.3. Example. Let {X; p_x} = {Y; p_y} = {C[a, b]; p_2}, the metric space
defined in Example 5.3.12, and let us define a function f: X → Y in the following way. For x ∈ X, y = f(x) is given by
y(t) = ∫_a^b k(t, s)x(s) ds, t ∈ [a, b],
where k: R^2 → R is continuous in the usual sense, i.e., with respect to the
metric spaces R^2_2 and R_1. We now show that f is continuous on X. Let x,
x_0 ∈ X and y, y_0 ∈ Y be such that y = f(x) and y_0 = f(x_0). Then
[p_y(y, y_0)]^2 = ∫_a^b {∫_a^b k(t, s)[x(s) - x_0(s)] ds}^2 dt.
Applying the Schwarz inequality to the inner integral, we obtain
[p_y(y, y_0)]^2 ≤ {∫_a^b ∫_a^b k^2(t, s) ds dt} [p_x(x, x_0)]^2 = M^2 [p_x(x, x_0)]^2,
so that
p_y(y, y_0) ≤ M p_x(x, x_0).
Hence, for given ε > 0 we have p_y(f(x), f(x_0)) < ε whenever p_x(x, x_0) < δ,
where δ = ε/M.
5.7.4. Example. Consider the metric space {C[a, b]; p_∞} defined in Example
5.3.14. Let C^1[a, b] be the subset of C[a, b] of all functions having continuous
first derivatives (on (a, b)), and let {X; p_x} be the metric subspace {C^1[a, b];
p_∞}. Let {Y; p_y} = {C[a, b]; p_∞} and define the function f: X → Y as follows.
For x ∈ X, y = f(x) is given by
y(t) = dx(t)/dt.
To show that f is not continuous, we show that for any δ > 0 there is a pair
x, x_0 ∈ X such that p_x(x, x_0) < δ but p_y(f(x), f(x_0)) ≥ 1. Let x_0(t) = 0 for
all t ∈ [a, b], and let x(t) = α sin ωt, α > 0, ω > 0. Then p(x_0, x) ≤ α.
Now if y_0 = f(x_0) and y = f(x), then y_0(t) = 0 for all t ∈ [a, b] and y(t)
= αω cos ωt. Hence, p(y_0, y) = αω, provided that ω is sufficiently large, i.e.,
so that cos ωt = 1 for some t ∈ [a, b]. Now no matter what value of δ
we choose, there is an x ∈ X such that p(x, x_0) < δ if we pick α < δ. However, p(y, y_0) ≥ 1 if we let ω = 1/α. Therefore, f is not continuous on X.
5.7.6. Exercise. Prove Theorem 5.7.5.

Intuitively, Theorem 5.7.5 tells us that f is continuous at x_0 if f(x) is arbitrarily close to f(x_0) when x is sufficiently close to x_0. The concept of continuity is depicted in Figure H for the case where {X; p_x} = {Y; p_y} = R^2_2.

[5.7.7. Figure H. Illustration of continuity.]

5.7.8. Theorem. The function f: X → Y is continuous at x_0 ∈ X if and
only if
lim f(x_n) = f(lim x_n) = f(x_0)
whenever {x_n} is a sequence in X such that lim x_n = x_0.
5.7.12. Theorem. Let {X; p_x} and {Y; p_y} be metric spaces, and let
f: X → Y be continuous on X. Then
(i) if {X; p_x} is compact, then f(X) is a compact subset of {Y; p_y};
(ii) if U is a compact subset of the metric space {X; p_x}, then f(U) is
a compact subset of the metric space {Y; p_y};
(iii) if {X; p_x} is compact and if U is a closed subset of X, then f(U)
is a closed subset of {Y; p_y}; and
(iv) if {X; p_x} is compact, then f is uniformly continuous on X.
Next, let {f_n} be a sequence of functions from X into Y, and let f: X → Y.
The sequence {f_n} converges to f at x ∈ X if for every ε > 0
there is an integer N(ε, x) such that
p_y(f_n(x), f(x)) < ε
whenever n > N(ε, x). In general, N(ε, x) is not necessarily bounded. However, if N(ε, x) is bounded for all x ∈ X, then we say that the sequence
{f_n} converges to f uniformly on X. Let M(ε) = sup_{x∈X} N(ε, x) < ∞. Equivalently,
{f_n} converges to f uniformly on X if p_y(f_n(x), f(x)) < ε for all x ∈ X
whenever n > M(ε).

Theorem. Let {f_n} be a sequence of continuous functions from X into Y
which converges to f: X → Y uniformly on X. Then f is continuous on X.

Proof. Let x_0 ∈ X, and let ε > 0. Choose M = M(ε) so that
p_y(f_M(x), f(x)) < ε for all x ∈ X, and choose δ > 0 such that
p_y(f_M(x), f_M(x_0)) < ε whenever p_x(x, x_0) < δ. Then
p_y(f(x), f(x_0)) ≤ p_y(f(x), f_M(x)) + p_y(f_M(x), f_M(x_0)) + p_y(f_M(x_0), f(x_0)) < 3ε
whenever p_x(x, x_0) < δ. From this it follows that f is continuous at x_0.
Since x_0 was arbitrarily chosen, f is continuous at all x ∈ X. This proves the
theorem.
The reader will recognize in the last result of the present section several
generalizations from the calculus to real-valued functions defined on metric
spaces.

5.7.15. Theorem. Let {X; p_x} be a metric space, and let {R; p} denote the
real line R with the usual metric. Let f: X → R, and let U ⊂ X. If f is continuous on X and if U is a compact subset of {X; p_x}, then
(i) f is uniformly continuous on U;
(ii) f is bounded on U; and
(iii) if U ≠ ∅, f attains its infimum and supremum on U; i.e., there
exist x_0, x_1 ∈ U such that f(x_0) = inf{f(x): x ∈ U} and f(x_1) =
sup{f(x): x ∈ U}.

Proof. Part (i) follows from part (iv) of Theorem 5.7.12. Since U is a compact
subset of X, it follows that f(U) is a compact subset of R. Thus, f(U) is
bounded and closed. From this it follows that f is bounded. To prove part
(iii), note that if U is a non-empty compact subset of {X; p_x}, then f(U) is
a non-empty compact subset of R. This implies that f attains its infimum
and supremum on U.
5.8. SOME IMPORTANT RESULTS IN APPLICATIONS

In this section we present two results which are used widely in applications. The first of these is called the fixed point principle while the second is
known as the Ascoli-Arzela theorem. Both of these results are widely utilized,
for example, in establishing existence and uniqueness of solutions of various
types of equations (ordinary differential equations, integral equations,
algebraic equations, functional differential equations, and the like).
We begin by considering a special class of continuous mappings on metric
spaces, so-called contraction mappings.
The
5.8.1. Definition. eL t { X ; p} be a metric space and let j: X - X .
function / is said to be a contraction mapping if there exists a real number
c such that 0 < c < I and
for all ,x y
s;;: cp(x . y)
p(f(x ) ,j(y
.X
(5.8.2)
5.8.4.
Exercise.
The following result is known as the fixed point principle or the principle
of contraction mappings.

5.8.5. Theorem. Let {X; p} be a complete metric space, and let f be a
contraction mapping of X into X. Then
(i) there exists a unique point x_0 ∈ X such that
f(x_0) = x_0;  (5.8.6)
and
(ii) for any x_1 ∈ X, the sequence {x_n} in X defined by
x_{n+1} = f(x_n), n = 1, 2, ...  (5.8.7)
converges to x_0.
The unique point x_0 satisfying Eq. (5.8.6) is called a fixed point of f. In
this case we say that x_0 is obtained by the method of successive approximations.

Proof. We first show that if there is an x_0 ∈ X satisfying (5.8.6), then it
must be unique. Suppose that x_0 and y_0 satisfy (5.8.6). Then by inequality
(5.8.2), we have p(x_0, y_0) ≤ c p(x_0, y_0). Since 0 < c < 1, it follows that
p(x_0, y_0) = 0 and therefore x_0 = y_0.
Now let x_1 be any point in X. We want to show that the sequence {x_n}
generated by Eq. (5.8.7) is a Cauchy sequence. For any n > 1, we have
p(x_{n+1}, x_n) ≤ c p(x_n, x_{n-1}). By induction we see that p(x_{n+1}, x_n) ≤ c^{n-1} p(x_2, x_1)
for n = 1, 2, .... Thus, for any m > n we have
p(x_m, x_n) ≤ Σ_{k=n}^{m-1} p(x_{k+1}, x_k) ≤ c^{n-1} p(x_2, x_1)[1 + c + ... + c^{m-n-1}]
≤ (c^{n-1}/(1 - c)) p(x_2, x_1).
Since 0 < c < 1, the right-hand side of the above inequality can be made
arbitrarily small by choosing n sufficiently large. Thus, {x_n} is a Cauchy
sequence.
Next, since {X; p} is complete, it follows that {x_n} converges; i.e., lim x_n
exists. Let lim x_n = x. Since f is continuous, we have
f(x) = f(lim x_n) = lim f(x_n) = lim x_{n+1} = x. Thus, f(x) = x and we
have proven the existence of a fixed point of f. Since we have already proven
uniqueness, the proof is complete.
It may turn out that the composite function f^(n) ≜ f ∘ f ∘ ... ∘ f is a
contraction mapping, whereas f is not. The following result shows that such
a mapping still has a unique fixed point.

5.8.8. Corollary. Let {X; p} be a complete metric space, and let f: X → X.
If the composite mapping f^(n) is a contraction mapping for some n, then
there is a unique point x_0 ∈ X such that
f(x_0) = x_0.  (5.8.9)
Moreover, the fixed point can be determined by the method of successive
approximations (see Theorem 5.8.5).
5.8.10. Exercise.
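The method of successive approximations in Theorem 5.8.5 translates directly into code. The sketch below is an added illustration; the choice f(x) = cos x is ours, a contraction on [0, 1] since |f'(x)| = |sin x| ≤ sin 1 < 1 there.

```python
# Theorem 5.8.5 (fixed point principle): successive approximations
# x_{n+1} = f(x_n) for the contraction f(x) = cos(x) on [0, 1].
import math

def fixed_point(f, x1, tol=1e-12, max_iter=1000):
    x = x1
    for _ in range(max_iter):
        x_next = f(x)
        if abs(x_next - x) < tol:   # increments of the Cauchy sequence
            return x_next
        x = x_next
    return x

x0 = fixed_point(math.cos, 1.0)
print(x0, math.cos(x0) - x0)  # f(x0) = x0 up to roundoff
```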
5.8.11. Definition. Let C[a, b] denote the set of all continuous real-valued
functions defined on the interval [a, b] of the real line R. A subset Y of
C[a, b] is said to be equicontinuous on [a, b] if for every ε > 0 there exists a
δ > 0 such that |x(t) - x(t_0)| < ε for all x ∈ Y and all t, t_0 such that
|t - t_0| < δ.

Note that in this definition δ depends only on ε and not on x, t, or t_0.
We now state and prove the Arzela-Ascoli theorem.
5.8.12. Theorem. Let {C[a, b]; p_∞} be the metric space defined in Example
5.3.14. Let Y be a bounded subset of C[a, b]. If Y is equicontinuous on [a, b],
then Y is relatively compact in C[a, b].

Proof. For each positive integer k, let us divide the interval [a, b] into k equal
parts by the set of points V_k = {t_{0k}, t_{1k}, ..., t_{kk}} ⊂ [a, b]. That is, a = t_{0k}
< t_{1k} < ... < t_{kk} = b, where t_{ik} = a + (i/k)(b - a), i = 0, 1, ..., k, and
[a, b] = ∪_{i=1}^{k} [t_{(i-1)k}, t_{ik}].
Since each V_k is a finite set, the set ∪_{k=1}^{∞} V_k is countable, and we denote it
by T = {τ_1, τ_2, ...}. The ordering of this set is immaterial. Next, since Y is
bounded, there is a γ > 0 such that p_∞(x, y) ≤ γ for all x, y ∈ Y. Let x_0 be
held fixed in Y, and let y ∈ Y be arbitrary. Let 0 ∈ C[a, b] be the function
which is zero for all t ∈ [a, b]. Then p_∞(y, 0) ≤ p_∞(y, x_0) + p_∞(x_0, 0). Hence,
p_∞(y, 0) ≤ M for all y ∈ Y, where M = γ + p_∞(x_0, 0). This implies that
sup_{a≤t≤b} |y(t)| ≤ M for all y ∈ Y. Now let {y_n} be an arbitrary sequence in
Y. Since the values y_n(τ_i), n ∈ J, are bounded for each i, a diagonalization
argument allows us to extract a subsequence {x_n} of {y_n} which converges at
every point of T. Using the equicontinuity of Y, given ε > 0 we can pick k so
large that each t ∈ [a, b] lies within (b - a)/k of some τ_i ∈ V_k with
|x_m(t) - x_m(τ_i)| < ε/3 for all m, and then choose N so that |x_m(τ_i) - x_n(τ_i)|
< ε/3 for all τ_i ∈ V_k whenever m, n > N. Hence, for m > N and n > N, we have
|x_m(t) - x_n(t)| ≤ |x_m(t) - x_m(τ_i)| + |x_m(τ_i) - x_n(τ_i)| + |x_n(τ_i) - x_n(t)| < ε.
This implies that p_∞(x_m, x_n) < ε for all m, n > N. Therefore, {x_n} is a Cauchy
sequence in C[a, b]. Since {C[a, b]; p_∞} is a complete metric space (see Example
5.5.28), {x_n} converges to some point in C[a, b]. This implies that {y_n}
has a subsequence which converges to a point in C[a, b] and so, by Theorem
5.6.31, Y is relatively compact in C[a, b]. This completes the proof of the
theorem.
Our next result follows directly from Theorem 5.8.12. It is sometimes
referred to as Ascoli's lemma.

5.8.13. Corollary. Let {φ_n} be a sequence of functions in {C[a, b]; p_∞}.
If {φ_n} is equicontinuous on [a, b] and uniformly bounded on [a, b] (i.e., there
exists an M > 0 such that sup_{a≤t≤b} |φ_n(t)| ≤ M for all n), then there exists a
subsequence of {φ_n} which converges to a function φ ∈ C[a, b]
uniformly on [a, b].

5.8.14. Exercise.
5.9. EQUIVALENT AND HOMEOMORPHIC METRIC SPACES. TOPOLOGICAL SPACES

Distinct metric spaces defined on
the same underlying set (e.g., the metric spaces {X; p_x} and {Y; p_y}, where
X = Y) may have many similar properties of the type mentioned above.
We begin with equivalence of metric spaces defined on the same underlying
set.
5.9.1. Definition. Let {X; p_1} and {X; p_2} be two metric spaces defined on
the same underlying set X. Let T_1 and T_2 be the topology of X determined
by p_1 and p_2, respectively. Then the metrics p_1 and p_2 are said to be equivalent
metrics if T_1 = T_2.
Throughout the present section we use the notation
f: {X; p_1} → {Y; p_2}
to indicate a mapping from X into Y, where the metric p_1 is used on X and
the metric p_2 is used on Y. Also, i denotes the identity mapping on X, so that
i: {X; p_1} → {X; p_2} and i^{-1}: {X; p_2} → {X; p_1}.
Proof. We now show that (ii) implies (iii). Clearly, the mapping i: {X; p_2} →
{X; p_2} is continuous. Now assume the validity of statement (ii), and let
{Y; p_3} = {X; p_2}. Then i: {X; p_1} → {X; p_2} is continuous. Again, it is clear
that i^{-1}: {X; p_1} → {X; p_1} is continuous. Letting {Y; p_3} = {X; p_1} in
statement (ii), it follows that i^{-1}: {X; p_2} → {X; p_1} is continuous.
Next, we show that (iii) implies (iv). Let i: {X; p_1} → {X; p_2} be continuous, and let the sequence {x_n} in metric space {X; p_1} converge to x. By
Theorem 5.7.8, lim i(x_n) = i(x); i.e., lim x_n = x in {X; p_2}. The converse is
proved in the same manner.
5.9.4. Exercise. Let {X; p_1} and {X; p_2} be two metric spaces defined on
the same underlying set X. Show that if there exist positive constants α and
β such that
α p_2(x, y) ≤ p_1(x, y) ≤ β p_2(x, y)
for all x, y ∈ X, then p_1 and p_2 are equivalent metrics. Show also that for
any metric p on X, the metric defined by
p(x, y)/[1 + p(x, y)]
is equivalent to p.
5.9.7. Exercise.

5.9.9. Exercise. Prove Theorem 5.9.8. (Hint: Use Theorem
5.9.2 to prove part (i) of this theorem.)

Our next example shows that i^{-1} need not be continuous, even though
i is continuous.
5.9.10. Example. Let X be any non-empty set, and let p_1 be the discrete
metric on X (see Example 5.1.7). In Exercise 5.4.26 the reader was asked
to show that every subset of X is open in {X; p_1}. Now let {X; p} be an
arbitrary metric space with the same underlying set X. Clearly, i: {X; p_1}
→ {X; p} is continuous. However, i^{-1}: {X; p} → {X; p_1} is not continuous
unless every subset of {X; p} is open. Since this is usually not true, i^{-1} need
not be continuous.
Next, we introduce the concepts of homeomorphism and homeomorphic
metric spaces.

5.9.11. Definition. Two metric spaces {X; p_x} and {Y; p_y} are said to be
homeomorphic if there exists a mapping φ: {X; p_x} → {Y; p_y} such that (i)
φ is a bijective mapping of X onto Y, and (ii) E ⊂ X is open in {X; p_x} if
and only if φ(E) is open in {Y; p_y}. The mapping φ is called a homeomorphism.

We immediately have the following generalization of Theorem 5.9.2.

5.9.12. Theorem. Let {X; p_x} and {Y; p_y} be metric spaces, and
let φ be a bijective mapping of {X; p_x} onto {Y; p_y}. Then the following
statements are equivalent.
(i) φ is a homeomorphism;
(ii) for any mapping f: X → Z, f: {X; p_x} → {Z; p_z} is continuous on
X if and only if f ∘ φ^{-1}: {Y; p_y} → {Z; p_z} is continuous on Y;
(iii) φ: {X; p_x} → {Y; p_y} is continuous and φ^{-1}: {Y; p_y} → {X; p_x} is
continuous; and
(iv) for any sequence {x_n} in X, {x_n} converges to a point x in {X; p_x}
if and only if {φ(x_n)} converges to φ(x) in {Y; p_y}.
5.9.13. Exercise.

It is possible for {X; p_1} and {X; p_2} to be homeomorphic, even though
p_1 and p_2 may not be equivalent.
There are important cases for which the metric relations between the
elements of two distinct metric spaces are the same. In such cases only the
nature of the elements of the metric spaces differs. Since this difference may
be of no importance, such spaces may often be viewed as being essentially
identical. Such metric spaces are said to be isometric. Specifically, we have:
5.9.16. Definition. Let {X; p_x} and {Y; p_y} be two metric spaces, and let
φ: {X; p_x} → {Y; p_y} be a bijective mapping of X onto Y. The mapping φ is
said to be an isometry if
p_x(x, y) = p_y(φ(x), φ(y))
for all x, y ∈ X. If such an isometry exists, the spaces {X; p_x} and {Y; p_y}
are said to be isometric.

5.9.17. Theorem.

5.9.18. Exercise.
In concluding this chapter, we note that a great deal of the development of metric spaces is not
a consequence of the metric but, rather, depends only on the properties of
certain open and closed sets. Taking the notion of open set as basic (instead
of the concept of distance, as in the case of metric spaces) and taking the
aforementioned properties of open sets as postulates, we can form a mathematical structure which is much more general than the metric space.
5.9.19. Definition. Let X be a non-void set of points, and let T be a family
of subsets which we will call open. We call the pair {X; T} a topological space
if the following hold:
(i) X ∈ T, ∅ ∈ T;
(ii) if U_1 ∈ T and U_2 ∈ T, then U_1 ∩ U_2 ∈ T; and
(iii) for any index set A, if α ∈ A and U_α ∈ T, then ∪_{α∈A} U_α ∈ T.

The family T is called the topology for the set X. The complement of an
open set U ∈ T with respect to X is called a closed set.
The reader can readily verify the following results:

5.9.20. Theorem. Let {X; T} be a topological space. Then
(i) ∅ is closed;
(ii) X is closed;
(iii) the union of a finite number of closed sets is closed; and
(iv) the intersection of an arbitrary collection of closed sets is closed.

5.9.21. Exercise. Prove Theorem 5.9.20.
5.9.23. Example. Let X = {x, y}, and let the open sets in X be the void set
∅, the set X itself, and the set {x}. If T is defined in this way, then {X; T}
is a topological space. In this case the closed sets are ∅, X, and {y}.
5.9.24. Example. Although many fundamental concepts carry over from
metric spaces to topological spaces, it turns out that the concept of topological
space is often too general. Therefore, it is convenient to suppose that certain
topological spaces satisfy some additional conditions which are also true
in metric spaces. These conditions, called the separation axioms, are imposed
on topological spaces {X; T} to form the following important special cases:
5.10. APPLICATIONS

In this section we consider selected applications of the material of the present
chapter.

5.10.1. Example. Consider the scalar equation
x = f(x),  (5.10.2)
where f: R → R satisfies the Lipschitz condition
|f(x_2) - f(x_1)| ≤ L|x_2 - x_1|, L > 0.  (5.10.3)
If L < 1, then f is a contraction on the complete metric space {R; p}, and
by Theorem 5.8.5, Eq. (5.10.2) has a unique solution which can be obtained
by the method of successive approximations x_{n+1} = f(x_n).
[5.10.4 and 5.10.5. Figure K. Geometric illustration of the method of
successive approximations for Eq. (5.10.2).]
5.10.6. Example. Consider the system of n linear equations
ξ_i = Σ_{j=1}^{n} a_ij ξ_j + β_i, i = 1, ..., n,  (5.10.7)
in the unknown x = (ξ_1, ..., ξ_n) ∈ R^n, where the a_ij and β_i are given. Let
y = f(x)
denote the mapping determined by the system of linear equations
η_i = Σ_{j=1}^{n} a_ij ξ_j + β_i, i = 1, ..., n,
where y = (η_1, ..., η_n) ∈ R^n.
First we consider the complete space {R^n; p_1} = R^n_1. Let y' = f(x'),
y" = f(x"), x' = (ξ'_1, ..., ξ'_n), and x" = (ξ"_1, ..., ξ"_n). We have
p_1(y', y") = Σ_{i=1}^{n} |Σ_{j=1}^{n} a_ij(ξ'_j - ξ"_j)|
≤ Σ_{i=1}^{n} Σ_{j=1}^{n} |a_ij| |ξ'_j - ξ"_j|
≤ max_j {Σ_{i=1}^{n} |a_ij|} p_1(x', x"),
where in the preceding the Hölder inequality for finite sums was used (see
Theorem 5.2.1). Clearly, f is a contraction if the inequality
Σ_{i=1}^{n} |a_ij| ≤ c < 1, j = 1, ..., n,  (5.10.8)
holds. Thus, Eq. (5.10.7) possesses a unique solution if (5.10.8) holds for all j.
Next, we consider the complete space {R^n; p_2} = R^n_2. We have
[p_2(y', y")]^2 = Σ_{i=1}^{n} [Σ_{j=1}^{n} a_ij(ξ'_j - ξ"_j)]^2
≤ {Σ_{i=1}^{n} Σ_{j=1}^{n} a_ij^2} [p_2(x', x")]^2,
where, in the preceding, the Schwarz inequality for finite sums was employed
(see Theorem 5.2.1). It follows that f is a contraction, provided that the
inequality
Σ_{i=1}^{n} Σ_{j=1}^{n} a_ij^2 ≤ c < 1  (5.10.9)
holds.
Finally, we consider the complete space {R^n; p_∞} = R^n_∞. We have
p_∞(y', y") = max_i |Σ_{j=1}^{n} a_ij(ξ'_j - ξ"_j)| ≤ max_i {Σ_{j=1}^{n} |a_ij|} p_∞(x', x").
Thus, f is a contraction if
max_i Σ_{j=1}^{n} |a_ij| ≤ c < 1.  (5.10.10)
In each of the above cases the unique solution of Eq. (5.10.7) is obtained by
the method of successive approximations
ξ_i^(k) = Σ_{j=1}^{n} a_ij ξ_j^(k-1) + β_i, i = 1, ..., n, k = 1, 2, ...,  (5.10.11)
starting from an arbitrary initial choice (ξ_1^(0), ..., ξ_n^(0)).
5.10.12. Example. Consider the Fredholm integral equation
x(s) = φ(s) + λ ∫_a^b K(s, t)x(t) dt,  (5.10.13)
where φ ∈ C[a, b] and where K is continuous on the square a ≤ s ≤ b,
a ≤ t ≤ b. Define the mapping f on C[a, b] by
y(s) = φ(s) + λ ∫_a^b K(s, t)x(t) dt.
Clearly y ∈ C[a, b]. Now let M = sup_{a≤s≤b, a≤t≤b} |K(s, t)|. Then
p_∞(f(x_1), f(x_2)) ≤ |λ| M(b - a) p_∞(x_1, x_2).
We thus have f: C[a, b] → C[a, b], and f is a contraction whenever
|λ| M(b - a) < 1.  (5.10.14)
In this case Eq. (5.10.13) possesses a unique solution x ∈ C[a, b], which
can be obtained by the successive approximations
x_m(s) = φ(s) + λ ∫_a^b K(s, t)x_{m-1}(t) dt, m = 1, 2, 3, ....  (5.10.15)
5.10.16. Example. Let φ ∈ C[a, b], and let K be continuous
on the triangle a ≤ t ≤ s ≤ b. Consider the Volterra integral equation
x(s) = φ(s) + λ ∫_a^s K(s, t)x(t) dt, a ≤ s ≤ b,  (5.10.17)
where λ is arbitrary. Define the mapping f by
y(s) = φ(s) + λ ∫_a^s K(s, t)x(t) dt.
Since the right-hand side of this expression is continuous, it follows that
f: C[a, b] → C[a, b]. Moreover, since K is continuous, there is an M such
that |K(s, t)| ≤ M. Let y_1 = f(x_1), and let y_2 = f(x_2). As in the preceding
example, we have
p_∞(f(x_1), f(x_2)) = p_∞(y_1, y_2) ≤ |λ| M(b - a) p_∞(x_1, x_2).
Now let f^(n) denote the composite mapping f ∘ f ∘ ... ∘ f (n times), so that
f^(n)(x) = y^(n). A little bit of algebra yields
p_∞(f^(n)(x_1), f^(n)(x_2)) = p_∞(y^(n)_1, y^(n)_2)
≤ (1/n!) |λ|^n M^n (b - a)^n p_∞(x_1, x_2).  (5.10.18)
However, since
(1/n!) |λ|^n M^n (b - a)^n → 0 as n → ∞,
for n sufficiently large we have
(1/n!) |λ|^n M^n (b - a)^n < 1.
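The successive approximations for the Volterra equation can be sketched numerically. In the code below (ours, not the authors'), the kernel K(s, t) = s·t, the choice φ(s) = 1, and λ = 1 are hypothetical, and the integral is approximated by the trapezoidal rule on a grid.

```python
# Successive approximations for the Volterra equation (5.10.17):
# x(s) = phi(s) + lam * integral_a^s K(s, t) x(t) dt,
# discretized with trapezoidal quadrature.

a, b, n = 0.0, 1.0, 200
h = (b - a) / n
s = [a + i * h for i in range(n + 1)]

phi = [1.0 for _ in s]              # phi(s) = 1 (hypothetical)
K = lambda si, tj: si * tj          # K(s, t) = s t (hypothetical)
lam = 1.0

x = phi[:]                          # initial guess x_0 = phi
for _ in range(50):
    x_new = []
    for i in range(n + 1):
        integral = sum(0.5 * h * (K(s[i], s[j]) * x[j] + K(s[i], s[j + 1]) * x[j + 1])
                       for j in range(i))
        x_new.append(phi[i] + lam * integral)
    x = x_new

print(x[-1])  # approximate solution value at s = b
```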
5.10.19. Exercise.

5.10.20. Example. Consider the initial-value problem
x' = f(t, x), x(τ) = ξ,  (5.10.21)
where f is continuous on the strip τ ≤ t ≤ T, -∞ < x < ∞, and satisfies
|f(t, x_1) - f(t, x_2)| ≤ k|x_1 - x_2|
for all t ∈ [τ, T] and for all x_1, x_2 ∈ R. In this case we say that f satisfies
a Lipschitz condition in x and we call k a Lipschitz constant.
As was pointed out in Section 4.11, Eq. (5.10.21) is equivalent to the
integral equation
φ(t) = ξ + ∫_τ^t f(s, φ(s)) ds.
Define the mapping F on C[τ, T] by
F(φ)(t) = ξ + ∫_τ^t f(s, φ(s)) ds.  (5.10.22)
Then clearly F: C[τ, T] → C[τ, T]. Now
p_∞(F(φ_1), F(φ_2)) = sup_{τ≤t≤T} |∫_τ^t [f(s, φ_1(s)) - f(s, φ_2(s))] ds|
≤ sup_{τ≤t≤T} ∫_τ^t k|φ_1(s) - φ_2(s)| ds
≤ k(T - τ) p_∞(φ_1, φ_2).  (5.10.23)
Similarly, for the composite mapping F^(n) we obtain
p_∞(F^(n)(φ_1), F^(n)(φ_2)) ≤ (1/n!) k^n (T - τ)^n p_∞(φ_1, φ_2).
Since
(1/n!) k^n (T - τ)^n → 0 as n → ∞,
it follows from Corollary 5.8.8 that Eq. (5.10.21) possesses a unique solution for [τ, T].
Furthermore, this solution can be obtained by the method of successive
approximations.
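A minimal numerical sketch of the method of successive approximations for (5.10.21), using the classical test problem x' = x, x(0) = 1 (our choice, Lipschitz with k = 1), whose Picard iterates are the partial sums of e^t:

```python
# Picard iteration phi_{k+1}(t) = xi + integral_tau^t f(s, phi_k(s)) ds
# for xdot = f(t, x) = x, x(0) = 1; the iterates approach exp(t) on [0, 1].
import math

tau, T, n = 0.0, 1.0, 1000
h = (T - tau) / n
t = [tau + i * h for i in range(n + 1)]
xi = 1.0

f = lambda s, x: x                  # f(t, x) = x, Lipschitz constant k = 1

phi = [xi for _ in t]               # phi_0(t) = xi
for _ in range(20):
    new = [xi]
    acc = 0.0
    for i in range(n):
        acc += 0.5 * h * (f(t[i], phi[i]) + f(t[i + 1], phi[i + 1]))
        new.append(xi + acc)
    phi = new

print(phi[-1], math.e)              # phi(1) is close to e
```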
5.10.24. Exercise. Extend the results of the preceding example to systems
of n first-order ordinary differential equations of the form
x'_i = f_i(t, x_1, ..., x_n), x_i(τ) = ξ_i, i = 1, ..., n.

Next, we consider the scalar differential equation
x' = f(t, x),  (5.10.25)
along with the initial-value problem
x' = f(t, x), x(τ) = ξ.  (5.10.26)
Here f is assumed to be continuous on the rectangle
D_0 = {(t, x): |t - τ| ≤ a, |x - ξ| ≤ b}.
5.10.27. Definition. A function φ, continuous and piecewise differentiable
on an interval containing τ, is called an ε-approximate solution of Eq.
(5.10.25) if (t, φ(t)) ∈ D_0 on this interval and
|φ'(t) - f(t, φ(t))| ≤ ε
at all t where the derivative φ'(t) exists.

[5.10.29. Figure L. Construction of an ε-approximate solution.]

Proof. Let M = max_{(t,x)∈D_0} |f(t, x)|. Let δ = a if a ≤ b/M, and let δ = b/M
if a > b/M (refer to Figure L). We will show
that an ε-approximate solution exists on the interval [τ, τ + δ]. The proof is
similar for the interval [τ - δ, τ]. In our proof we will construct an ε-approximate
solution starting at (τ, ξ), consisting of a finite number of straight line
segments joined end to end (see Figure L).
Since f is continuous on the compact set D_0, it is uniformly continuous
on D_0 (see Theorem 5.7.12). Hence, given ε > 0, there exists an η = η(ε) > 0
such that |f(t, x) - f(t', x')| ≤ ε whenever (t, x), (t', x') ∈ D_0, |t - t'| ≤ η
and |x - x'| ≤ η. Now let τ = t_0 and τ + δ = t_n. We divide the half-open
interval (t_0, t_n] into n half-open subintervals (t_0, t_1], (t_1, t_2], ..., (t_{n-1}, t_n] in
such a fashion that
max_i |t_i - t_{i-1}| ≤ min(η, η/M).  (5.10.30)
Starting at the point (t_0, ξ_0) = (τ, ξ), we construct the polygonal function φ
defined by φ(t_0) = ξ_0 and
φ(t) = φ(t_{i-1}) + f(t_{i-1}, φ(t_{i-1}))(t - t_{i-1}), t_{i-1} < t ≤ t_i,
i = 1, ..., n.  (5.10.31)
From this construction it follows that
|φ(t) - φ(t')| ≤ M|t - t'|  (5.10.32)
for all t, t' ∈ [τ, τ + δ], and that |φ'(t) - f(t, φ(t))| ≤ ε at all points where
the derivative φ'(t) exists. Hence, φ is an ε-approximate solution.

We are now in a position to establish conditions for the existence of solutions of the initial-value problem (5.10.26).
5.10.33. Theorem. In Eq. (5.10.25), let f be continuous on the rectangle
D_0 = {(t, x): |t - τ| ≤ a, |x - ξ| ≤ b}. Then the initial-value problem
(5.10.26) has a solution on some t interval given by |t - τ| ≤ δ ≤ a.

Proof. Let {ε_n} be a monotone decreasing sequence with ε_n > 0 and
ε_n → 0 as n → ∞, n = 1, 2, .... For each ε_n, let φ_n be an ε_n-approximate
solution on [τ, τ + δ], constructed as above.
By (5.10.32) we have
|φ_n(t) - φ_n(t')| ≤ M|t - t'|
for all n, so that the sequence {φ_n} is equicontinuous and uniformly bounded
on [τ, τ + δ]. By Ascoli's lemma (Corollary 5.8.13) there is a subsequence
{φ_{n_k}} which converges uniformly on [τ, τ + δ] to a continuous function φ.
We now show that φ is a solution; i.e., that
φ(t) = ξ + ∫_τ^t f(s, φ(s)) ds.  (5.10.35)
Since φ_{n_k} is an ε_{n_k}-approximate solution, we have
φ_{n_k}(t) = ξ + ∫_τ^t [f(s, φ_{n_k}(s)) + Δ_{n_k}(s)] ds,  (5.10.36)
where |Δ_{n_k}(s)| ≤ ε_{n_k}. Since f is uniformly continuous on D_0 and since
φ_{n_k} → φ uniformly, f(s, φ_{n_k}(s)) → f(s, φ(s)) uniformly, and therefore
|∫_τ^t [f(s, φ_{n_k}(s)) + Δ_{n_k}(s)] ds - ∫_τ^t f(s, φ(s)) ds| → 0 as k → ∞.
Letting k → ∞ in Eq. (5.10.36), we obtain Eq. (5.10.35), and φ is a solution
of the initial-value problem (5.10.26).

Using Theorem 5.10.33, the reader can readily prove the next result.

5.10.37. Corollary. In Eq. (5.10.25), let f be continuous on a domain D
of the (t, x) plane, and let (τ, ξ) ∈ D. Then the initial-value problem (5.10.26)
has a solution on some t interval containing τ.
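The ε-approximate solution (5.10.31) is exactly the Euler polygon, which is easy to compute. The sketch below is an added illustration with the hypothetical data f(t, x) = t - x, τ = 0, ξ = 1:

```python
# The polygonal epsilon-approximate solution (5.10.31) is Euler's method.
import math

def euler_polygon(f, tau, xi, delta, n):
    # phi(t_i) = phi(t_{i-1}) + f(t_{i-1}, phi(t_{i-1})) * (t_i - t_{i-1})
    h = delta / n
    ts, xs = [tau], [xi]
    for _ in range(n):
        xs.append(xs[-1] + f(ts[-1], xs[-1]) * h)
        ts.append(ts[-1] + h)
    return ts, xs

f = lambda t, x: t - x                           # hypothetical right-hand side
ts, xs = euler_polygon(f, 0.0, 1.0, 1.0, 100)
exact = lambda t: t - 1.0 + 2.0 * math.exp(-t)   # true solution of x' = t - x, x(0) = 1
print(xs[-1], exact(1.0))                        # close; refining n shrinks epsilon
```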
5.10.38. Exercise.
5.10.39. Theorem. Let r and k be real continuous functions on an interval
[a, b], with r(t) ≥ 0 and k(t) ≥ 0 for all t ∈ [a, b], and let δ ≥ 0 be a
constant. If
r(t) ≤ δ + ∫_a^t k(s)r(s) ds  (5.10.40)
for all t ∈ [a, b], then
r(t) ≤ δ e^{∫_a^t k(s) ds}  (5.10.42)
for all t ∈ [a, b].

Proof. Let
R(t) = δ + ∫_a^t k(s)r(s) ds.
Then r(t) ≤ R(t), R(a) = δ, and R'(t) = k(t)r(t) ≤ k(t)R(t). Let
K(t) = e^{-∫_a^t k(s) ds}. Then
K'(t) = -k(t)e^{-∫_a^t k(s) ds} = -K(t)k(t).
Since
R'(t) - k(t)R(t) ≤ 0,
we have
K(t)R'(t) - K(t)k(t)R(t) ≤ 0,
or
d/dt [K(t)R(t)] ≤ 0.
Hence, K(t)R(t) ≤ K(a)R(a) = δ for all t ∈ [a, b], and therefore
r(t) ≤ R(t) ≤ δ e^{∫_a^t k(s) ds},
which was to be shown.
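A numerical sanity check of Theorem 5.10.39 (ours, with hypothetical data): r(t) = 0.5 + 0.5 e^t satisfies the hypothesis (5.10.40) with δ = 1 and k(t) = 1, and the conclusion (5.10.42) then bounds it by e^t.

```python
# Gronwall inequality (Theorem 5.10.39), checked numerically:
# r(t) = 0.5 + 0.5 e^t satisfies r(t) <= 1 + integral_0^t r(s) ds
# (delta = 1, k(t) = 1), so the theorem asserts r(t) <= e^t.
import math

r = lambda t: 0.5 + 0.5 * math.exp(t)
delta, k = 1.0, 1.0

def hypothesis_and_bound(t, steps=10000):
    h = t / steps
    integral = sum(k * r(i * h) * h for i in range(steps))
    return r(t), delta + integral, delta * math.exp(k * t)

for t in (0.5, 1.0, 2.0):
    rt, rhs, bound = hypothesis_and_bound(t)
    print(t, rt <= rhs + 1e-6, rt <= bound + 1e-6)  # both True
```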
We are now in a position to prove the following uniqueness result.

Theorem. Let f be continuous on a domain D of the (t, x) plane, and let f
satisfy a Lipschitz condition with respect to x on D; i.e.,
|f(t, x') - f(t, x")| ≤ k|x' - x"|
for all (t, x'), (t, x") ∈ D. If φ_1 and φ_2 are two solutions
of Eq. (5.10.25) on an interval (a, b), if τ ∈ (a, b), and if φ_1(τ) = φ_2(τ) = ξ,
then φ_1 = φ_2.
Proof. By Corollary 5.10.37, at least one solution exists on some interval
(a, b), τ ∈ (a, b). Now suppose there is more than one solution, say φ_1
and φ_2, to the initial-value problem (5.10.26). Then
φ_i(t) = ξ + ∫_τ^t f(s, φ_i(s)) ds, i = 1, 2,
and
φ_1(t) - φ_2(t) = ∫_τ^t [f(s, φ_1(s)) - f(s, φ_2(s))] ds.
Let r(t) = |φ_1(t) - φ_2(t)|, and let k > 0 denote the Lipschitz constant for
f. In the following we consider the case when t ≥ τ, and we leave the details
of the proof for t < τ as an exercise. We have
r(t) ≤ ∫_τ^t |f(s, φ_1(s)) - f(s, φ_2(s))| ds ≤ ∫_τ^t k|φ_1(s) - φ_2(s)| ds
= ∫_τ^t k r(s) ds;
i.e.,
r(t) ≤ δ + ∫_τ^t k r(s) ds
with δ = 0, for all t ∈ [τ, b). The conditions of Theorem 5.10.39 are clearly satisfied and
we have: r(t) ≤ δ e^{∫_τ^t k ds}. Since in the present
case δ = 0, it follows that
r(t) = |φ_1(t) - φ_2(t)| = 0 for all t ∈ [τ, b);
i.e., φ_1(t) = φ_2(t)
for all t ∈ [τ, b).
Next, we consider the continuation of solutions. Assume that f is continuous
on a domain D of the (t, x) plane and that
|f(t, x)| ≤ M for all (t, x) ∈ D.
Also, assume that τ ∈ (a, b), that (τ, ξ) ∈ D, and that the initial-value
problem (5.10.26) has a solution φ on a t interval (a, b) such that (t, φ(t)) ∈ D
for all t ∈ (a, b). Then the limits
φ(a+) = lim_{t→a+} φ(t) and φ(b-) = lim_{t→b-} φ(t)
exist. To see this, note that
φ(t) = ξ + ∫_τ^t f(s, φ(s)) ds.
If a < t_1 < t_2 < b, then
|φ(t_1) - φ(t_2)| ≤ |∫_{t_1}^{t_2} |f(s, φ(s))| ds| ≤ M|t_2 - t_1|,
and the asserted limits exist by the Cauchy criterion. Now define
φ~(t) = φ(t) for t ∈ (a, b), φ~(b) = φ(b-).
Then
φ~(t) = ξ + ∫_τ^t f(s, φ~(s)) ds
for all t ∈ (a, b]. Thus, the derivative of φ~(t) exists on the interval (a, b],
and the left-hand derivative of φ~(t) at t = b is given by
φ~'(b-) = f(b, φ~(b-)).
Next, suppose that the point (b, φ(b-)) belongs to D. Then, by Theorem
5.10.33, there is a solution ψ of Eq. (5.10.25) through the point (b, φ(b-))
on some interval [b, b + β], β > 0. Define
φ^(t) = φ~(t) for t ∈ (a, b], φ^(t) = ψ(t) for t ∈ [b, b + β].
Then, since
φ^(t) = φ(b-) + ∫_b^t f(s, φ^(s)) ds for t ∈ [b, b + β],
and since
φ~(t) = ξ + ∫_τ^t f(s, φ~(s)) ds,
we have
φ^(t) = ξ + ∫_τ^t f(s, φ^(s)) ds
for all t ∈ (a, b + β].
By the continuity of f(s, φ^(s)) it follows that
φ^'(t) = f(t, φ^(t))
for all t ∈ (a, b + β].
We call φ^ a continuation of the solution φ to the interval (a, b + β]. If
f satisfies a Lipschitz condition on D with respect to x, then φ^ is unique,
and we call φ^ the continuation of φ to the interval (a, b + β].
We can repeat the above procedure of continuing solutions until the
boundary of D is reached.
Now let the domain D be, in particular, a rectangle, as shown in Figure
M. It is important to notice that, in general, we cannot continue solutions
over the entire t interval T shown in this figure.
[5.10.44. Figure M. Continuation of a solution to the boundary of domain
D, where D = {(t, x): t ∈ T = (T_1, T_2), x_1 < x < x_2}.]
The preceding results can be extended to systems of first-order ordinary
differential equations. In doing so, we use the norm
|x| = Σ_{i=1}^{n} |x_i|  (5.10.46)
on R^n, so that
|x - y| = Σ_{i=1}^{n} |x_i - y_i|.
(The reader can readily verify that the function given in Eq. (5.10.46)
satisfies the axioms of a norm (see Theorem 4.9.31).) The definition of ε-approximate
solution for the differential equation x' = f(t, x) is identical
to that given in Definition 5.10.27, save that scalars are replaced by vectors
(e.g., the scalar function φ is replaced by the n-vector valued function φ).
Exercise. Extend Theorem 5.10.33, Corollary 5.10.37, and the preceding
uniqueness result to systems of first-order ordinary differential equations
x' = f(t, x)  (5.10.48)
with the initial-value problem given by
x' = f(t, x), x(τ) = ξ.  (5.10.49)

In the following we let
D = {(t, x): t ∈ [a, b], |x| < ∞} ⊂ R^{n+1}.  (5.10.50)
As a specific case, consider the system of linear equations
x'_i = Σ_{j=1}^{n} a_ij(t)x_j ≜ f_i(t, x), i = 1, ..., n,  (5.10.51)
where the a_ij(t), i, j = 1, ..., n, are assumed to be real and continuous
functions defined on the interval [a, b]. We first show that f(t, x) = [f_1(t, x),
..., f_n(t, x)]^T satisfies a Lipschitz condition on D,
|f(t, x') - f(t, x")| ≤ k|x' - x"|,
where x' = (x'_1, ..., x'_n)^T and x" = (x"_1, ..., x"_n)^T. We have
|f(t, x') - f(t, x")| = Σ_{i=1}^{n} |f_i(t, x') - f_i(t, x")|
= Σ_{i=1}^{n} |Σ_{j=1}^{n} a_ij(t)(x'_j - x"_j)|
≤ Σ_{i=1}^{n} Σ_{j=1}^{n} |a_ij(t)| |x'_j - x"_j|
≤ k|x' - x"|,
where
k = max_{a≤t≤b} max_j Σ_{i=1}^{n} |a_ij(t)|.
5.10.52. Lemma. Let f(t, x) satisfy a Lipschitz condition on D with
Lipschitz constant k. Let φ_1 and φ_2 be solutions of Eq. (5.10.48) on an
interval (a, b) such that (t, φ_1(t)) ∈ D and (t, φ_2(t)) ∈ D for all t ∈ (a, b),
and let φ_1(τ) = ξ_1 and φ_2(τ) = ξ_2, τ ∈ (a, b). Then
|φ_1(t) - φ_2(t)| ≤ |ξ_1 - ξ_2| e^{k|t-τ|}  (5.10.53)
for all t ∈ (a, b).

Proof. We assume that t ≥ τ, and we leave the details of the proof for
t < τ as an exercise. We have
φ_1(t) = ξ_1 + ∫_τ^t f(s, φ_1(s)) ds,
φ_2(t) = ξ_2 + ∫_τ^t f(s, φ_2(s)) ds,
and
|φ_1(t) - φ_2(t)| ≤ |ξ_1 - ξ_2| + ∫_τ^t |f(s, φ_1(s)) - f(s, φ_2(s))| ds
≤ |ξ_1 - ξ_2| + k ∫_τ^t |φ_1(s) - φ_2(s)| ds.  (5.10.54)
Applying Theorem 5.10.39 to (5.10.54), with r(t) = |φ_1(t) - φ_2(t)| and
δ = |ξ_1 - ξ_2|, we obtain inequality (5.10.53).
5.10.55. Theorem. Let the a_ij(t), i, j = 1, ..., n, be real and continuous
on [a, b], and let (τ, ξ) ∈ D, ξ = (ξ_1, ..., ξ_n). Then the initial-value problem
x'_i = Σ_{j=1}^{n} a_ij(t)x_j = f_i(t, x), x_i(τ) = ξ_i, i = 1, ..., n,  (5.10.56)
possesses a unique solution φ which exists on the entire interval [a, b].

Proof. By the preceding results, there is a unique solution φ of the
initial-value problem (5.10.56) through
5.10. Applications
339
(r, ;) over some interval e[ , d] c a[ , b]. We must show that 'I' can be continued
to a unique solution, over the entire interval a[ , b].
Let i be any solution of Eq. (5.10.56) through (r, ;) which exists on some
= i and = 0, we
subinterval of a[ , b]. Applying Lemma 5.10.52 to
have
I'
2'
(5.10.57)
for all t in the domain of definition of i. F o r purposes of contradiction,
suppose that 'I' does not have a continuation to a[ , b] and assume that 'I'
has a continuation i existing up to t' < b and cannot be continued beyond
t'. But inequality (5.10.57) implies that the path (t, i(t remains inside a
closed bounded subset of D. It follows from Theorem 5.10.45, interpreted
for systems of first-order ordinary differential equations, that i may be
continued beyond t'. We thus have arrived at a contradiction, which proves
that a continuation, of", exists on the entire interval a[ , b]. This continuation
is unique because f(t, )x satisfies a Lipschitz
condition with respect to x
on D.
5.10.58. Exercise. In Theorem 5.10.55, let a_ij(t), i, j = 1, ..., n, be
continuous on the open interval (-∞, ∞). Show that the initial-value problem
(5.10.56) possesses unique solutions for every (τ, ξ) ∈ R^{n+1} which can
be extended to the t interval (-∞, ∞).

5.10.59. Exercise. Let D ⊂ R^{n+1} be given by Eq. (5.10.50), and let the real
functions a_ij(t), v_i(t), i, j = 1, ..., n, be continuous on the t interval [a, b].
Show that there exists a unique solution to the initial-value problem
x'_i = Σ_{j=1}^{n} a_ij(t)x_j + v_i(t), x_i(τ) = ξ_i, i = 1, ..., n,  (5.10.60)
with (τ, ξ) ∈ D, ξ = (ξ_1, ..., ξ_n), which exists on the
entire interval [a, b].

It is possible to relax the conditions on v_i(t), i = 1, ..., n, in the above
exercise considerably. For example, it can be shown that if v_i(t) is piecewise
continuous on [a, b], then the assertions of Exercise 5.10.59 still hold.
We now address ourselves to the last item of the present section. Consider
the initial-value problem (5.10.49) which we characterized in Definition
4.11.9. Assume that f(t, x) satisfies a Lipschitz condition on a domain D
⊂ R^{n+1} and that (τ, ξ) ∈ D. Then the initial-value problem possesses a
unique solution φ over some t interval containing τ. To indicate the dependence
of φ on the initial data, we write φ(t; τ, ξ),
where φ(τ; τ, ξ) = ξ. We now ask: What are the effects of different initial
conditions on the solution of Eq. (5.10.48)? Our next result provides the
answer.

5.10.61. Theorem. In Eq. (5.10.49) let f(t, x) satisfy a Lipschitz condition
with respect to x on D ⊂ R^{n+1}. Let (τ, ξ) ∈ D. Then the unique solution
φ(t; τ, ξ) of Eq. (5.10.49), existing on some bounded t interval containing τ,
depends continuously on ξ on any such bounded interval. (This means if
ξ_n → ξ, then φ(t; τ, ξ_n) → φ(t; τ, ξ).)
Proof. We have
φ(t; τ, ξ_n) = ξ_n + ∫_τ^t f(s, φ(s; τ, ξ_n)) ds
and
φ(t; τ, ξ) = ξ + ∫_τ^t f(s, φ(s; τ, ξ)) ds.
Hence, for t ≥ τ,
|φ(t; τ, ξ_n) - φ(t; τ, ξ)| ≤ |ξ_n - ξ| + ∫_τ^t |f(s, φ(s; τ, ξ_n)) - f(s, φ(s; τ, ξ))| ds
≤ |ξ_n - ξ| + k ∫_τ^t |φ(s; τ, ξ_n) - φ(s; τ, ξ)| ds,
where k denotes a Lipschitz constant for f (the case t < τ is left as an exercise).
Using Theorem 5.10.39, we obtain
|φ(t; τ, ξ_n) - φ(t; τ, ξ)| ≤ |ξ_n - ξ| e^{k(t-τ)}.
Thus, if ξ_n → ξ, then φ(t; τ, ξ_n) → φ(t; τ, ξ).

It follows from the proof of the above theorem that the convergence is
uniform with respect to t on any interval [a, b] on which the solutions are
defined.
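For a concrete look at Theorem 5.10.61, the added sketch below uses the hypothetical scalar equation x' = sin x (Lipschitz with constant k = 1) and checks the bound |φ(t; τ, ξ_n) - φ(t; τ, ξ)| ≤ |ξ_n - ξ| e^{k(t-τ)} established in the proof.

```python
# Theorem 5.10.61: |phi(t; tau, xi_n) - phi(t; tau, xi)| <= |xi_n - xi| e^{k(t-tau)}.
# Hypothetical test equation xdot = sin(x), with Lipschitz constant k = 1.
import math

def solve(f, tau, xi, T, n=100000):
    h = (T - tau) / n
    x = xi
    for i in range(n):
        x += h * f(tau + i * h, x)   # Euler steps on a fine mesh
    return x

f = lambda t, x: math.sin(x)
tau, T, k = 0.0, 2.0, 1.0
xi, xi_n = 1.0, 1.001

gap = abs(solve(f, tau, xi_n, T) - solve(f, tau, xi, T))
print(gap, abs(xi_n - xi) * math.exp(k * (T - tau)))  # gap is within the bound
```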
5.10.62. Example. Consider the initial-value problem
x' = 2x, x(τ) = ξ,  (5.10.63)
where -∞ < t < ∞ and -∞ < x < ∞. Its unique solution is given by
φ(t; τ, ξ) = ξ e^{2(t-τ)},
which clearly depends continuously on ξ, in accordance with Theorem
5.10.61.
Thus far, in the present section, we have concerned ourselves with problems characterized by real ordinary differential equations. It is an easy matter
to verify that all the existence, uniqueness, continuation, and dependence
(on initial conditions) results proved in the present section are also valid for
initial-value problems described by complex ordinary differential equations
such as those given, e.g., in Eq. (4.11.25). In this case, the norm of a complex
vector z = (z_1, ..., z_n)^T, z_k = u_k + iv_k, k = 1, ..., n, is given by
|z| = Σ_{k=1}^{n} |z_k|,
where |z_k| = (u_k^2 + v_k^2)^{1/2}. The metric on C^n is then given by
p(z_1, z_2) = |z_1 - z_2|.
5.11. REFERENCES AND NOTES

There are numerous excellent texts on metric spaces. Books which are
especially readable include Copson [5.2], Gleason [5.3], Goldstein and
Rosenbaum [5.4], Kantorovich and Akilov [5.5], Kolmogorov and Fomin
[5.7], Naylor and Sell [5.8], and Royden [5.9]. Reference [5.8] includes some
applications. The book by Kelley [5.6] is a standard reference on topology.
An excellent reference on ordinary differential equations is the book by
Coddington and Levinson [5.1].
REFERENCES

[5.1] E. A. CODDINGTON and N. LEVINSON, Theory of Ordinary Differential Equations. New York: McGraw-Hill Book Company, 1955.
[5.2] E. T. COPSON, Metric Spaces. Cambridge, England: Cambridge University Press, 1968.
[5.3] A. M. GLEASON, Fundamentals of Abstract Analysis. Reading, Mass.: Addison-Wesley Publishing Company, 1966.
[5.4] M. E. GOLDSTEIN and B. M. ROSENBAUM, "Introduction to Abstract Analysis," National Aeronautics and Space Administration, Report No. SP-203, Washington, D.C., 1969.
[5.5] L. V. KANTOROVICH and G. P. AKILOV, Functional Analysis in Normed Spaces. New York: The Macmillan Company, 1964.
[5.6] J. KELLEY, General Topology. Princeton, N.J.: D. Van Nostrand Company, Inc., 1955.
[5.7] A. N. KOLMOGOROV and S. V. FOMIN, Elements of the Theory of Functions and Functional Analysis. Vol. I. Albany, N.Y.: Graylock Press, 1957.
[5.8] A. W. NAYLOR and G. R. SELL, Linear Operator Theory in Engineering and Science. New York: Holt, Rinehart and Winston, 1971.
[5.9] H. L. ROYDEN, Real Analysis. New York: The Macmillan Company, 1965.
[5.10]
spaces are special cases of Banach spaces; Banach spaces are special kinds of
normed linear spaces; and Hilbert spaces are special types of inner product
spaces.) In Section 6.15, we consider two applications. This chapter is concluded with a brief discussion of pertinent references in the last section.
6.1. NORMED LINEAR SPACES

Throughout this chapter, R denotes the field of real numbers, C denotes the
field of complex numbers, F denotes either R or C, and X denotes a vector
space over F.
6.1.1. Definition. Let ||·|| denote a mapping from X into R which satisfies
the following properties for every x, y ∈ X and every α ∈ F:
(i) ||x|| ≥ 0;
(ii) ||x|| = 0 if and only if x = 0;
(iii) ||αx|| = |α| ||x||; and
(iv) ||x + y|| ≤ ||x|| + ||y||.

The function ||·|| is called a norm on X, the mathematical system consisting
of ||·|| and X, {X; ||·||}, is called a normed linear space, and ||x||
is called the norm of x. If F = C we speak of a complex normed linear space,
and if F = R we speak of a real normed linear space.

Different norms defined on the same linear space X yield different normed
linear spaces. If in a given discussion it is clear which particular norm is
being used, we simply write X in place of {X; ||·||} to denote the normed
linear space under consideration. Properties (iii) and (iv) in Definition 6.1.1
are called the homogeneity property and the triangle inequality of a norm,
respectively.
Let {X; ||·||} be a normed linear space and let x_i ∈ X, i = 1, ..., n.
Repeated use of the triangle inequality yields
||x_1 + ... + x_n|| ≤ ||x_1|| + ... + ||x_n||.
The following result shows that every normed linear space has a metric
associated with it, induced by the norm ||·||. Therefore, every normed
linear space is also a metric space.

Theorem. Let {X; ||·||} be a normed linear space, and let p(x, y) =
||x - y|| for all x, y ∈ X. Then {X; p} is a metric space.
This theorem tells us that all of the results in the previous chapter on metric
spaces apply to normed linear spaces as well, provided we let p(x, y) = ||x - y||.
We will adopt the convention that when using the terminology of metric spaces
(e.g., completeness, compactness, convergence, continuity, etc.) in a normed
linear space {X; ||·||}, we mean with respect to the metric space {X; p},
where p(x, y) = ||x - y||. Also, whenever we use metric space properties on F,
i.e., on R or C, we mean with respect to the usual metric on R or C, respectively.
With the foregoing in mind, we now introduce the following important
concept.
6.1.4. Definition. A complete normed linear space is called a Banach
space.

Thus, {X; ||·||} is a Banach space if and only if {X; p} is a complete
metric space, where p(x, y) = ||x - y||.
6.1.5. Example. Let X = R^n, the space of n-tuples of real numbers, or let
X = C^n, the space of n-tuples of complex numbers. From Example 3.1.10 we
see that X is a vector space. For x ∈ X given by x = (ξ_1, ..., ξ_n), and for
p ∈ R such that 1 ≤ p < ∞, define
||x||_p = (Σ_{i=1}^{n} |ξ_i|^p)^{1/p}.
We can readily verify that ||·||_p satisfies the axioms of a norm. Axioms
(i), (ii), (iii) of Definition 6.1.1 follow trivially, while axiom (iv) is a direct
consequence of Minkowski's inequality for finite sums (5.2.6). Letting
p_p(x, y) = ||x - y||_p, then {X; p_p} is the metric space of Exercise 5.5.25.
Since {X; p_p} is complete, it follows that {R^n; ||·||_p} and {C^n; ||·||_p} are
Banach spaces.
We may also define a norm on X by letting
||x||_∞ = max_{1≤i≤n} |ξ_i|.
It can readily be verified that {R^n; ||·||_∞} and {C^n; ||·||_∞} are Banach
spaces (see Exercise 5.5.25).
6.1.6. Example. Let X = l_p (see Example 3.1.13), let 1 ≤ p < ∞; i.e.,
l_p = {x = (ξ_1, ξ_2, ...): Σ_{i=1}^{∞} |ξ_i|^p < ∞}.
Define
||x||_p = (Σ_{i=1}^{∞} |ξ_i|^p)^{1/p}.  (6.1.7)
It is readily verified that ||·||_p is a norm on the linear space l_p. Axioms
(i), (ii), (iii) of Definition 6.1.1 follow trivially, while axiom (iv), the triangle
inequality, follows from Minkowski's inequality for infinite sums. Since the
associated metric space {l_p; p_p} is complete, {l_p; ||·||_p} is a Banach space.
Similarly, let
l_∞ = {x = (ξ_1, ξ_2, ...): sup_i |ξ_i| < ∞},
and define
||x||_∞ = sup_i |ξ_i|.  (6.1.8)
Then {l_∞; ||·||_∞} is also a Banach space.
6.1.9. Example.
(a) Let C[a, b] denote the linear space of real-valued continuous functions on
the interval [a, b]. For x ∈ C[a, b] define
||x||_p = [∫_a^b |x(t)|^p dt]^{1/p}, 1 ≤ p < ∞.
It is easily shown that {C[a, b]; ||·||_p} is a normed linear space. Axioms
(i)-(iii) of Definition 6.1.1 follow trivially, while axiom (iv) follows from the
Minkowski inequality for integrals (5.2.8). Let p_p(x, y) = ||x - y||_p. Then
{C[a, b]; p_p} is a metric space which is not complete (see Example 5.5.29,
where we considered the special case p = 2). It follows that {C[a, b]; ||·||_p}
is not a Banach space.
Next, define on the linear space C[a, b] the function ||·||_∞ by
||x||_∞ = sup_{t∈[a,b]} |x(t)|.
It is readily shown that {C[a, b]; ||·||_∞} is a normed linear space. Let p_∞(x, y)
= ||x - y||_∞. In accordance with Example 5.5.28, {C[a, b]; p_∞} is a complete
metric space, and thus {C[a, b]; ||·||_∞} is a Banach space.
The above discussion can be modified in an obvious way for the case
where C[a, b] consists of complex-valued continuous functions defined on
[a, b]. Here vector addition and multiplication of vectors by scalars are defined
similarly as in Eqs. (3.1.20) and (3.1.21), respectively. Furthermore, it is
easy to show that {C[a, b]; ||·||_p}, 1 ≤ p < ∞, and {C[a, b]; ||·||_∞} are
normed linear spaces with norms defined similarly as above. Once more, the
space {C[a, b]; ||·||_p}, 1 ≤ p < ∞, is not a Banach space, while the space
{C[a, b]; ||·||_∞} is.
(b) The metric space {L_p[a, b]; p_p} was defined in Example 5.5.31. It
can be shown that L_p[a, b] is a vector space over R. If we let
||x||_p = [∫_{[a,b]} |x|^p dμ]^{1/p},
then {L_p[a, b]; ||·||_p} is a Banach space.
6.1.10. Example. Let {X; ||·||_x}, {Y; ||·||_y} be two normed linear spaces
over F, and let X × Y denote the Cartesian product of X and Y. Defining
vector addition on X × Y by
(x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2)
and multiplication of elements of X × Y by scalars as
α(x, y) = (αx, αy),
we can readily show that X × Y is a linear space (see Eqs. (3.2.14), (3.2.15)
and the related discussion). This space can be used to generate a normed
linear space {X × Y; ||·||} by defining the norm ||·|| as
||(x, y)|| = ||x||_x + ||y||_y.
Furthermore, if {X; ||·||_x} and {Y; ||·||_y} are Banach spaces, then it is
easily shown that {X × Y; ||·||} is also a Banach space.

6.1.11. Exercise. Verify the assertions made in Example 6.1.10.
In a normed linear space {X; ||·||} the open sphere with center x_0 and
radius r is given by
S(x_0; r) = {x ∈ X: ||x - x_0|| < r},  (6.1.12)
and we also define the set
K(x_0; r) = {x ∈ X: ||x - x_0|| ≤ r}.  (6.1.14)
In a normed linear space the closure of S(x_0; r) is K(x_0; r); thus, in a
normed linear space we may call S̄(x_0; r) the closed sphere given by Eq.
(6.1.14).
When regarded as a function from X into R, a norm has the following
important property.

6.1.15. Theorem. Let {X; ||·||} be a normed linear space. Then ||·|| is
a continuous mapping of X into R.
In this chapter we will not always require that a particular normed linear
space be a Banach space. Nonetheless, many important results of analysis
require the completeness property. This is also true in applications. For
example, in the solution of various types of equations (such as non-linear
differential equations, integral equations, etc.) or in optimization problems
or in non-linear feedback problems or in approximation theory, as well as
many other areas of applications, we frequently obtain our desired solution
in the form of a sequence generated by means of some iterative scheme. In
such a sequence, each succeeding member is closer to the desired solution
than its predecessor. Now even though the precise solution to which a
sequence of this type may converge is unknown, it is usually imperative that
the sequence converge to an element in that space which happens to be the
setting of the particular problem in question.
6.2. LINEAR SUBSPACES

Let {X; ||·||} be a normed linear space, and let Y be a linear subspace of
X. If we define
||x||_1 = ||x|| for all x ∈ Y,
then it is easy to show that {Y; ||·||_1} is a normed linear space; we call it a
normed linear subspace of {X; ||·||}. A linear subspace of a Banach space
need not itself be a Banach space. For example, in the space of bounded
sequences the subspace Y of finitely non-zero sequences, which contains
elements such as y_1 = (1, 0, 0, ...), is not a closed subspace of X and hence is
not complete.

Next, we prove:

6.2.3. Theorem. Let X be a Banach space, let Y be a linear subspace of
X, and let Ȳ denote the closure of Y. Then Ȳ is a closed linear subspace of X.

Proof. Clearly Ȳ is closed. To show that Ȳ is a linear subspace, let
x, z ∈ Ȳ, and let α, β ∈ F. Then there are sequences {x_n} and {z_n} in Y
such that x_n → x and z_n → z, and hence αx_n + βz_n → αx + βz. Since
αx_n + βz_n ∈ Y for all n, it follows that αx + βz ∈ Ȳ; i.e., Ȳ is a linear
subspace of X.

6.3. INFINITE SERIES
Let {X; ||·||} be a normed linear space, and let {x_n} be a sequence in X.
With {x_n} we associate the sequence {y_n} of partial sums
y_n = x_1 + ... + x_n.
If the sequence {y_n} converges to some y ∈ X, we say that the infinite series
y = Σ_{k=1}^{∞} x_k
converges and that y is its sum; otherwise we say that the series diverges.

Theorem. Let X be a Banach space, and let {x_n} be a sequence in X.
If Σ_{k=1}^{∞} ||x_k|| < ∞, then
(i) Σ_{k=1}^{∞} x_k converges; and
(ii) ||Σ_{k=1}^{∞} x_k|| ≤ Σ_{k=1}^{∞} ||x_k||.

Proof. Let y_n = x_1 + ... + x_n. If n > m, then
||y_n - y_m|| = ||x_{m+1} + ... + x_n|| ≤ ||x_{m+1}|| + ... + ||x_n||.
Since Σ_{k=1}^{∞} ||x_k|| < ∞, the sequence
of partial sums s_m = ||x_1|| + ... + ||x_m|| is Cauchy. Hence, given ε > 0,
there is a positive integer N such that n > m > N implies |s_n - s_m| ≤ ε.
But |s_n - s_m| ≥ ||y_n - y_m||, and so {y_n} is a Cauchy sequence. Since X is
complete, {y_n} is convergent and conclusion (i) follows.
To prove the second part, let y_m = x_1 + ... + x_m, and let y = lim y_m =
Σ_{k=1}^{∞} x_k. Since the norm is continuous, we have
||y|| = lim_{m→∞} ||y_m|| ≤ lim_{m→∞} Σ_{i=1}^{m} ||x_i|| = Σ_{i=1}^{∞} ||x_i||,
which proves conclusion (ii).

6.4. CONVEX SETS
Let x, y ∈ X. The line segment joining x and y is the set
x̄y = {z ∈ X: z = αx + (1 - α)y, α ∈ R such that 0 ≤ α ≤ 1}.
A set Y ⊂ X is said to be convex if, for every pair of points x and y in Y,
the line segment x̄y joining x and y is contained in Y.

[6.4.2. Figure A. A convex set and a non-convex set.]
352
6.4.5.
x{
Example.
E :X
x
Let
= Y,y
Exercise.
+ pX
=
i:+ f J Y
+ i+ " J z
+~-tiJ1i:+
-1 J
- i:+P
- .
,Y
because
"i:+P
Therefore, x
the proof. _
(<<
pY c (<< +
P)Y.
This completes
sets. The
eY e
6.4.8.
Exercise.
We note that the convex hull of Y is the smallest convex set which contains Y. Examples of convex hulls of sets in R^2 are depicted in Figure B.

[6.4.10. Figure B. Examples of convex hulls of sets in R^2.]

A convex combination of the points x_1, ..., x_n is a point of the form
ξ_1 x_1 + ... + ξ_n x_n, where ξ_i ≥ 0, i = 1, ..., n, and Σ_{i=1}^{n} ξ_i = 1.

Exercise. Let Y be a convex set in X.
6.4.14. Theorem. Let {X; ||·||} be a normed linear space, and let
Y = {x ∈ X: ||x|| ≤ 1}.
Then Y is a convex set.

Proof. If x_0, y_0 ∈ Y, then ||x_0|| ≤ 1 and ||y_0|| ≤ 1. Let α ≥ 0 and
β ≥ 0, where α + β = 1. Then
||αx_0 + βy_0|| ≤ ||αx_0|| + ||βy_0|| = α||x_0|| + β||y_0|| ≤ α + β = 1,
and hence αx_0 + βy_0 ∈ Y; i.e., Y is a convex set.
(6.4.16)
the set determined by II x II < 1 results in a set which is not convex. In particular, if p = 2/3, the set determined by II x II < I yields the boundary and the
interior of an asteroid, as shown in F i gure C. The reason for the non- c onvex i ty
of this set can be found in the fact that the function (6.4.16) does not represent
a norm. In particular, it can be shown that (6.4.16) does not satisfy the triangle
inequality.
11'11.
t,
11'11,
6.4.17.
1I'lb13
iF gure C. Unit
355
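The failure of the triangle inequality for (6.4.16) with p = 2/3 is easy to exhibit numerically (our added illustration):

```python
# For p = 2/3, ||x||_p = (|xi_1|^p + |xi_2|^p)^(1/p) violates the triangle
# inequality, so Eq. (6.4.16) is not a norm and its "unit ball" is not convex.
def f(x, p=2.0/3.0):
    return (abs(x[0]) ** p + abs(x[1]) ** p) ** (1.0 / p)

x, y = (1.0, 0.0), (0.0, 1.0)
lhs = f((x[0] + y[0], x[1] + y[1]))  # f(x + y) = 2^(3/2) ~ 2.83
rhs = f(x) + f(y)                    # f(x) + f(y) = 2
print(lhs, rhs, lhs <= rhs)          # False: triangle inequality fails
```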
6.4.18. Exercise.

[6.4.20. Figure D. (a) A cone.]

6.5. LINEAR FUNCTIONALS

Let f be a linear functional on X (cf. Section 3.5). For the value of f at
x ∈ X we write
f(x) = <x, f>.  (6.5.1)
If a linear functional is denoted by x', then
x'(x) = <x, x'>.  (6.5.2)
6.5.3. Theorem. If a linear functional f on X is continuous at some point
x_0 ∈ X, then it is continuous at all x ∈ X.

Proof. If {y_n} is a sequence in X such that y_n → x_0, then f(y_n) → f(x_0), by
Theorem 5.7.8. Now let {x_n} be a sequence in X converging to x ∈ X. Then
the sequence {y_n} in X given by y_n = x_n - x + x_0 converges to x_0. By the
linearity of f, we have
f(x_n) - f(x) = f(y_n) - f(x_0).
Since |f(y_n) - f(x_0)| → 0 as y_n → x_0, we have |f(x_n) - f(x)| → 0 as
x_n → x, and therefore f is continuous at x ∈ X. Since x is arbitrary, the
proof of the theorem is complete.

Definition. A linear functional f on X is said to be bounded if there is a
constant M ≥ 0 such that |f(x)| ≤ M||x|| for all x ∈ X.

Theorem. A linear functional f on X is continuous if and only if it is
bounded.
Proof. Assume that f is bounded, and let M be such that |f(x)| ≤ M||x||
for all x ∈ X. If x_n → 0, then |f(x_n)| ≤ M||x_n|| → 0. Hence, f is continuous
at x = 0. From Theorem 6.5.3 it follows that f is continuous for all x ∈ X.
Conversely, assume that f is continuous at x = 0 and hence at any x ∈ X.
There is a δ > 0 such that |f(x)| ≤ 1 whenever ||x|| ≤ δ. Now for any
x ≠ 0 we have ||(δx)/||x|| || = δ, and thus
|f(x)| = (||x||/δ) |f(δx/||x||)| ≤ (1/δ)||x||.
If we let M = 1/δ, then
|f(x)| ≤ M||x||,
and f is bounded.
We will see later, in Example 6.5.17, that there may exist linear functionals
on a normed linear space which are unbounded. The class of linear functionals
which are bounded has some interesting properties.
357
11/11 =
for IE *X .
I/(x)1
IIxll
sup
.... 0
(6.5.7)
Then
(i) *X
is a linear subspace of XI;
(ii) the function II II defined in Eq. (6.5.7) is a norm on X ;
; II . III is complete.
(iii) the normed space { X
and
Proof. For f, f_1, f_2 ∈ X* and α ∈ F we have
||αf|| = sup_{x≠0} |αf(x)| / ||x|| = |α| ||f||
and
||f_1 + f_2|| = sup_{x≠0} |f_1(x) + f_2(x)| / ||x||
≤ sup_{x≠0} {|f_1(x)| / ||x|| + |f_2(x)| / ||x||}
≤ sup_{x≠0} |f_1(x)| / ||x|| + sup_{x≠0} |f_2(x)| / ||x|| = ||f_1|| + ||f_2||,
from which parts (i) and (ii) follow readily.
To prove part (iii), let {x'_n} be a Cauchy sequence in X*. For each x ∈ X,
{x'_n(x)} is a Cauchy sequence in F, and hence the limit
x'(x) = lim_{n→∞} x'_n(x)
exists. For x, y ∈ X and α, β ∈ F we have
x'(αx + βy) = α lim_{n→∞} x'_n(x) + β lim_{n→∞} x'_n(y) = αx'(x) + βx'(y),
and thus x' is a linear functional. Next we show that x'
is bounded. Since {x'_n} is a Cauchy sequence, for ε > 0 there is an M such that
|x'_n(x) - x'_m(x)| ≤ ε||x|| for all m, n ≥ M and for all x ∈ X. But x'_m(x) → x'(x),
and hence |x'(x) - x'_n(x)| ≤ ε||x|| for all n ≥ M. It now follows that
|x'(x)| = |x'(x) - x'_n(x) + x'_n(x)| ≤ |x'(x) - x'_n(x)| + |x'_n(x)|
≤ (ε + ||x'_n||) ||x||;
i.e., x' is bounded. Finally, since |x'(x) - x'_n(x)| ≤ ε||x|| for all x ∈ X, we
have ||x' - x'_n|| ≤ ε for all n ≥ M, so that x'_n → x' in X*. Hence, {X*; ||·||}
is complete.
6.5.11. Exercise. Show that for f ∈ X*,
||f|| = inf{M: |f(x)| ≤ M||x|| for all x ∈ X};
||f|| = sup_{||x||≤1} {|f(x)|}; and
||f|| = sup_{||x||=1} {|f(x)|}.
6.5.12. Example. The mapping
f(x) = ∫_a^b x(s) ds, x ∈ C[a, b],
is a linear functional on C[a, b] (cf. Example 3.5.2). The norm of this functional equals (b - a), because
|f(x)| = |∫_a^b x(s) ds| ≤ (b - a) max_{a≤s≤b} |x(s)| = (b - a)||x||_∞,
with equality holding for the constant function x(s) = 1.
6.5.13. Example. Consider the space {C[a, b]; ||·||_∞}, let x_0 be a fixed
element of C[a, b], and let x be any element of C[a, b]. The mapping
f(x) = ∫_a^b x(s)x_0(s) ds
is a linear functional on {C[a, b]; ||·||_∞}. This functional is
bounded, because
|f(x)| = |∫_a^b x(s)x_0(s) ds| ≤ (∫_a^b |x_0(s)| ds) ||x||_∞.
It can be shown that
||f|| = ∫_a^b |x_0(s)| ds.

6.5.14. Example. Let a = (η_1, ..., η_n) be a fixed element of R^n_2, and
define the linear functional f on R^n_2 by
f(x) = Σ_{i=1}^{n} η_i ξ_i, x = (ξ_1, ..., ξ_n).
It can be shown that ||f|| = ||a||.
6.5.16. Example. Analogous to the above example, let a = (η_1, η_2, ...)
be a fixed element of the Banach space l_q (see Example 6.1.6), and let x =
(ξ_1, ξ_2, ...) be an arbitrary element of l_p, where 1/p + 1/q = 1. It follows that if
f(x) = Σ_{i=1}^{∞} η_i ξ_i,
then
|f(x)| = |Σ_{i=1}^{∞} η_i ξ_i| ≤ Σ_{i=1}^{∞} |η_i ξ_i| ≤ ||a|| ||x||,
which follows from Hölder's inequality for infinite sums (5.2.4). Thus, f is
bounded and, hence, continuous. In a manner similar to that of Example
6.5.14, we can show that ||f|| = ||a||.

We conclude this section with an example of an unbounded linear
functional.
6.5.17. Example. Let X denote the linear space of finitely non-zero sequences x = (ξ_1, ξ_2, ...) with norm
||x|| = max_i |ξ_i|,
and define f(x) = Σ_i i ξ_i. Then f is a linear functional on X which is not
bounded.

6.5.18. Exercise. Verify the assertions made in Examples 6.5.12, 6.5.13,
6.5.14, 6.5.16, and 6.5.17.
6.6. FINITE-DIMENSIONAL SPACES

6.6.1. Theorem. Let {X; ||·||} be a finite-dimensional normed linear space
with basis {x_1, ..., x_n}, so that each x ∈ X has the unique representation
x = ξ_1 x_1 + ... + ξ_n x_n. Then there exists a constant m > 0 such that
m(|ξ_1| + ... + |ξ_n|) ≤ ||x||
for every x ∈ X.

Proof. Let
S = {a = (α_1, ..., α_n): |α_1| + ... + |α_n| = 1},
and define
g(a) = ||α_1 x_1 + ... + α_n x_n||.
The reader can readily verify that g is a continuous function on S. Now let
m = inf{g(a): a ∈ S}. It follows from Theorem 5.7.15 that there is an
a_0 ∈ S such that g(a_0) = m. Note that m ≠ 0 since {x_1, ..., x_n} is a basis for
X, and also a_0 ≠ 0. Hence m > 0. It now follows that
g(a) ≥ m for all a ∈ S.  (6.6.2)
Next, for arbitrary x ∈ X with coordinates (ξ_1, ..., ξ_n), we let
β = |ξ_1| + ... + |ξ_n|. First, we suppose that β > 0. Then
||x|| = ||ξ_1 x_1 + ... + ξ_n x_n|| = β ||(ξ_1/β)x_1 + ... + (ξ_n/β)x_n||
≥ βm = m(|ξ_1| + ... + |ξ_n|),  (6.6.3)
where inequality (6.6.2) has been used. If β = 0, then x = 0 and (6.6.3)
holds trivially.

Theorem. Every finite-dimensional normed linear space X is complete.

Proof.
Let Ix { >
... , ,x ,} be a basis for ,X let kY { }
be a Cauchy sequence in
and for each k let the coordinates of kY with respect to IX { >
... ,x,,} be
given by (l1kl> ... , ' 7 h)' It follows from Theorem 6.6.1 that there is a constant M such that I11k} 1- 1/J1 < MllYk - IY II forj = I, ... , n and all i, k =
1,2, .... Hence, each sequence 7'{ k}}
is a Cauchy sequence in ,F i.e., in R
or C, and is therefore convergent. Let '70} = lim 7' k} for j = I, ... , n. If we
,X
let
oY =
it follows that kY { }
' 7 0I X I
converges to oY '
+ ... +
7' o"x",
is complete.
6.6.7. Exercise.
".J
".x ..
lIy~.1I
-
Thus, Y~.
0 as m -
<.!. m
00.
Y~.
,,1m II Y k . II ,< ,1
=
"' l k.X I
+ ... +
"~k.X.
+ ... +
y~JI
- 0 as mj-
lI + ... +
I~o
~k",1
I\ .x II
00.
Thus, Y~.J
- y~. Since y~", - 0, it follows that Y~ = O. But this is impossible
because lX { '
... , .x } is a linearly independent set. We conclude that the sum
6.7.
363
Ilx z - ix i = I I ~:
;:11 - ix i
IIYz
x o ll" z Y
-
I'x I
- x ' I I>
_d_
= 1- - ' 1 - ,
d+ ' 1
-d+n
d+ ' 1
where x ' = X o + IIYz - X oII X E VI for all X E VI' Since I' is arbitrary, we
can choose I' so that II X z - X 1 II > t
Now let V2 be the linear subspace generated by {XI' x z .} If VI = ,Y we are
done. If not, we can proceed in the manner used above to select an X 3 rt VZ,
II x 3 11 = I, II IX - x 3 11 > t, and II X z - x 3 11 > t. If we continue this process,
then we either have V({x l ' . . . , ,x ,}) = Y for some n, or or else we obtain
such that Ilx,,1I = t and IIx" - "x ,11 > 1- for all
an infinite sequence ,x { ,}
m *' - n. The second alternative is impossible, since ,x { ,} is a bounded sequence
and as such must contain a convergent subsequence. This completes the
proof. _
>
IIYz
FUNCTIONALS
Throughout this section X denotes a real normed linear space. Before giving
geometric interpretations of linear functionals we introduce the notions of
maximal subspace and hyperplane.
6.7.1. Definition. A linear subspace Y of linear space X is called maximal
if it is not all of X and if there exists no linear subspace Z of X such that
Y , *- Z ,Z,*X and Y c Z.
Recall that if Y is a linear subspace of X and if Z E ,Y then we call the
set Z = z + Y a linear variety (see Definition 3.2.17). In this case we also
say that Z is a translation of .Y
364
z = x _ f (x ) y
f(y) ,
then f(z ) = 0, and thus x has the required form. Now assume that Y I is a
linear subspace of X for which oY c Y I and Y I 1= = oY ' We can choose
y E Y I - oY , and the above argument shows that X c
Y I so that Y I = .X
This shows that oY is maximal and that Y is a hyperplane.
The assertion that Y contains 0 if and only if (X = 0 follows readily.
Consider now the last part of the thorem. If Y is a hyperplane in X,
then Y is the translation of a linear subspace Z in X ; i.e., Y = x o + Z, with
X o fixed. If X o i Z, and if V( Y + x o) denotes the linear subspace generated
x o) =
If for x = (Xx o Z, Z E Z, we
by the set Y + X O' then V(Y
define f(x ) = (x , then Y = (x: f(x ) = I}. On the other hand, if X o E Z, then
we take IX i Z, X = V(Z + IX )'
Y = Z, and define for x = (XIX
+ Z,
f(x ) = (X. Then Y = (x : f(x ) = OJ. This concludes the proof of the
theorem. _
.x
In the proof ofthe above theorem we established also the following result:
6.7.4.
let Z
6.7.
The next result shows that it is possible to establish a unique correspondence between hyperplanes and linear functionals. This result follows readily
from Theorem 6.7.3.
6.7.5. Theorem. eL t Y be a hyperplane in a linear space .X If Y does not
contain the origin, there is a unique linear functional f on X such that
Y = { x : f(x )
= I}.
6.7.6. Exercise.
f associated with
IIx -
,x ,11 =
>
II(e inf
(6- 6 .)eZ
e,,)x o -
II(e -
(z - ,z ,)11
e.)x o - (z -
,z ,)11
=
Ie - e"ld.
1,
and let
Y 2, Y 3,
The set
f(x )
= "I~I
"1~1'
x{ E R1:f(x ) = "I~I
+ "1~1 = O}
is a line through the origin of R1 which is normal to the vector y. IfX I
the hyperplane
oY =
6.7.12.
("1' "1)
iF gure E. H a lf spaces.
oY .
367
6.8.
EXTENSION
OF
IL NEAR
N
UF CTIONALS
In this section we state and prove the Hahn-Banach theorem. This result
is very important in analysis and has important implications in applications.
We would like to point out that the present form of this theorem is not the
most general version of the Hahn-Banach theorem.
Throughout this section X will denote a real normed linear space.
6.8.1. Definition. Let Y be a linear subspace of ,X let Z be a proper linear
subspace of ,Y let I be a bounded linear functional defined on Z, and let]
be a bounded linear functional defined on .Y If lex) = I(x ) whenever x E Z,
then] is called an extension of/from Z to .Y If the spaces ,X ,Y Z are normed
and if 1I/11z = II 1 lin then I is called a norm preserving extension off
We now prove the following version of the Hahn-Banach
theorem.
Y;
and
(ii)
Proof Although this theorem is true for X not separable, we
the proof only for the case where X is separable (see Definition
separability). We assume that Y is a proper linear subspace of X,
wise there is nothing to prove. Let x I E X but x I i ,Y and let us
subset
Y
x{
X:
x =
(XIX
y, (X
R, y
shall give
5.4.33 for
for otherdefine the
.}Y
It is straightforward to verify that Y I is a linear subspace of ,X and furthermore that for each x E Y I there is a unique (X E R and a unique y E Y
such that x = (XIX
+ y. Ifan ex t ension] of/from Y t o Y I exists, then it has
the form
lex)
(X l (x
l)
+ ICY),
and if we let c = l(x l ), then lex) = 1(Y) - c(X. From this it is clear that
the extension is specified by prescribing the constant (Xc. In order that the
368
<
tXcl
IIfll IIy
tXX I II
If(tX)z
<
tXcl
IIflllltXz
or
tXx
II
Ix II
- I Ifllllz
IX II < f (z )
c<
-
Ilfllllz
Ix II
or, equivalently, as
IIfll liz
f(z ) -
Ix II <
<
f(z )
IIfllliz
Ix II
(6.8.3)
for all z E .Y We now must show that such a number c does indeed always
exist. To do this, it suffices to show that for any Y l o)/Z E ,Y we have
C 1 t>.
/(YI) 1- 1/IIIIYI
XI
II s /(Y z )
IIfllllyz
x I II
t>.
Cz
(6.8.4)
f(yz )
j(x )
= f(y) -
/%C,
lY >
E
X
= :x{
=
X
tXIY
;Y Y
,Y tX
= tXzY
;Y Y
and
Y~
= :x{
X
I,
tX
R}
R},
etc.
369
"_00
follows from
I=
IJ ( x )
'-00
lim 1I/IIIIwnii
n- o o
11/1I11xll
=1.
oll,
and so
l~o~ 1
1 for all Y E .Y
= y{
IXx
o'
:X y = lU o,
Then Ilyll=
II/oil = 1.
theorem.
6.8.6. Corollary. Let X o E ,X X O 1= = 0, and let "f > O. Then there exists a
bounded nonzero linear functional/defined on all of X such that II I II = "f
and/(x o) = 1I/11 l Ix o lI
The above corollary guarantees the existence
linear functionals.
6.8.7. Exercise.
In the next
given.
of non-trivial bounded
IS
6.8.8. Example.
Let X o E ,X X o 1= = 0, and let/be a linear functional defined
on X such that/(x o ) = Ilx o II and II/II = I. Let K b e the closed sphere given
by K = x { E :X IIxll < IlxolIJ.
Now if x E ,K
then I(x ) < I/(x ) I <
11/11 l Ix l l < IIx o II, and so x belongs to the half-space x { E X : /(x )
< IIx o ll}
Thus, the hyperplane x { E :X /(x ) = II X oII} is tangent to the closed sphere
(as illustrated in Figure )F .
_
Chapter 6
370
(X :
6.8.9.
iF gure .F
fIx) l= Ixoll}
In closing this section, we mention two of the more important consequences of the Hahn-Banach theorem with significant practical implications.
One of these states that given a convex set Y in X containing an interior
point and given a fixed point not in the interior of ,Y there is a hyperplane
separating the fixed point and the convex set .Y The second of these asserts
that if Y t and Y z are convex sets in X, if Y t has interior points, and if Y z
contains no interior point of Y I , then there is a closed hyperplane which
separates Y t and Y z
6.9.
DU A L
SPACE
R may be expressed as x
= L
I- I
e,e,. If we let
/x,
6.9.
of R'
.
CE
the elements of X *
there is an a
371
15;15;,
=
X
I~,I
IlfII =
lal
l 2
ex n /
ex,~"
,~.)
E
X*,
is
then
E X
+ ... + ex.~ ,
exl~1
so
6.9.3. Exercise.
Let X = R' , and define the norm of x = (~I'
... , ~,) E X
1~,lp)l/p,
where I < p <
(see Example 6.1.5).
by IIxll = (1~llp
Show that if f E x * then there is an a = (ex ... .. , ex,) E R" such that
f(x ) = exl~1
+ ... + ex,~" i.e., X* = R', and show that the norm on X *
00
+ ... +
is given by IlfII =
(I ex I I'
6.9.4.
Let
Exercise.
.1
+ .1q =
p
+ ... +
IIX,I')
1/"
I. If p
I.
+ .1q =
= 1, we take q =
space of 1.1' is I,. Specifically, show that every bounded linear functional on
lp is uniquely representable as
I(x )
= 1=I;
ex,"{
1
IIfll=
1(sup1: IIXIex,Ill,)I/'
I- I
I
if p
001
< P<
if I
I.
x " ),
where x' E X*. If X ' denotes the algebraic conjugate of ,X then the reader
can readily show that even though X * c X ' and X** c (X * )I, in general,
x * * is not a linear subspace of X f f.
Let us define a mapping J of X into x * * by the relation
(x ' , J x )
= (x , x ' ) , x
E ,X
x'
x*
(6.9.5)
372
or, equivalently, by
Jx
x",
x " (x ' )
x'(x).
(6.9.6)
We call this mapping J the canonical mapping of X into X**. The functional
x " defined on X * in this way is linear, because
px ; )
(X * )/.
,x <
x;
px ; )
Since
,x<
x;)
x"(x~)
p<,x
;x >
px " (x ; ),
The space
R~,
<
I <p
<
<
00
is reflexive.
6.9.10. Example.
The spaces Ip , I
6.9.11. Example.
6.9.12. Exercise.
6.9.1 I.
6.10.
WEAK
00,
are reflexive.
_
_
CONVERGENCE
6.10.
373
Weak Convergence
Proof
Assume that
have
1.x< ,
and thus x . -
Ilx. - x l l-
<
x ' ) - (x , ,x )1
x weakly. _
as n -
Ilx'llllx.
00.
x l l- -
0 as n -
6.10.3. Example.
X 2
Consider in
/2
*X
we
00,
0, ...),
X J
the con-
(1,0, ... ,
that { x . }
converges weakly we note that every x ' E /2 = *X
can be represented as the scalar product with some fixed vector y = ('11' 1' 2' ... , 1' ., ...);
i.e., if x = (el' e2' ... ,e., ... ), then
,x <
x')
= I=:E
el'1l
J
~
we now have
374
(ii)
(iii)
Ipft(t)
s: Ipft(t) dt =
a[ , b); and
I.
(x ,
where x
x~>
s: (x t)lpft(t) dt
be defined on
(x ,
x~>
era, b) by
= (x O)
1/.
- l ift
Ip.(t)x(t) dt
(x tft)
lift
- l ift
'ft(t) dt
(x tft)
in applications, it is convenient
to say the sequence l{ pft} converges to the so-called "~ function" which has
this property. We see that the sequence l{ pft} converges to the ~ function in the
sense of weak convergence. _
6.10.8. Theorem. Let X be a separable normed linear space. Every
bounded sequence of linear functionals in X contains a weakly convergent
subsequence.
375
{xU
such that the sequence IX < { '
.~x J>
converges. Again, from the subwe can select another subsequence {x~.J
such that the sequence
sequence .~x{ J
lX < { '
:.~X J>
converges. Continuing this procedure, we obtain the sequences
, x~.,
"~X "~x
X~"
x~.,
, x~.,
X~"
x~.,
, x~,
and X .
6.11.
INNER PRODUCT
SPACES
We recall (see Definition 3.6.19 and the discussion following this definition)
that if X is a complex linear space, a function defined on X X X into C,
which we denote by (x, y) for x, y E ,X is called an inner product if
(i) (x, )x
E
E
376
II . II: X
6.11.4.
Theorem. Let X
-+
R defined by
II . II
IIxll =
E
(x,
(6.11.5)
X ) I/2
C, we have
IIxll ~ 0;
IIxll = 0 if and only if x = 0;
lIlx l = 1lIlxll; and
Ilx + IY I < IIxll + lIyll
Exercise.
(6.11.2)
Using the above results, we can now readily show that the function
defined by IIxll = (x, X ) I/2 is a norm.
II . II: X
-+
6.11.
377
Subsequently, we adopt the convention that when using the properties and
terminology ofa normed linear space in connection with an inner product space
we mean the norm induced by the inner product, as given in Eq. (6.11.5).
We are now in a position to make the following important definition.
6.11.7.
space.
Definition.
Thus, every H i lbert space is also a Banach space (and also a complete
metric space). Some authors insist that H i lbert spaces be infinite dimensional.
We shall not follow that practice. An arbitrary inner product space (not
necessarily complete) is sometimes also called a pre- H i lbert space.
6.11.8. Example.
L e t X be a finite-dimensional (real or complex) inner
product space. It follows from Theorem 6.6.5 that X is a H i lbert space. _
6.11.9. Example.
L e t 12. be the (complex) linear space defined in Ex a mple
6.1.6. L e t x = (el> e2.' ...) E 12.' Y = (111) 112.' ...) E 12.' and define (x, y):
12. X 12. - Cas
(x , y)
= I-I;
elil'
I
s:
x ( t)y(t) dt.
It is readily verified that this space is a pre- H i lbert space. In view of Example
6.1.9 this space is not complete relative to the norm II x II = (x, X)I/2., and hence
it is not a H i lbert space.
(b) We extend the space of real-valued functions, pL a[ , bJ, defined in
Ex a mple 5.5.31 for the case p = 2, to complex-valued functions to be the
set of all functions f: a[ , b] C such that f = u + iv for u, v E 2L .[a, b].
Denoting this space also by 2L .[a, b], we define
(f, g)
= r
G
[ J .bl
fgdp,
378
for f, g
E L~[a,
b]; ( "
{L~[a,
spaces.
Ilxll =
where IIXlIII =
X is a Hilbert
(x, )X I/2
d: IIIX
11f)1/2
I- I
(XI'
space. _
6.11.12. Exercise.
6.11.13.
X E
,X
+-
(i) (z, x . )
(ii) (x . , z) -
(iii)
IIxlIll-+-
(iv) if 1; .Y
~
,._ 1
6.11.14.
;X
X;
is convergent in ,X
then (1; .Y , )z
.X
Exercise.
such that x .
,,= 1
+-
x, where
= n:o::.l
1; (y., )z for all
~
379
IlxllIIYII
F o r all x, y
yW
Ilx (i) Ilx
(ii) if x .J .. y, then IIx
6.11.17. Exercise.
yW =
yW
X we have
211xW
= IlxW
211yW; and
IlyW
Parts (i) and (ii) of Theorem 6.11.16 are referred to as the parallelogram
law and the Pythagorean theorem, respectively (refer to Theorems .4 9.33 and
.4 9.38).
Let x { .. : a E I} be an indexed set of elements in ,X
where I is an arbitrary index set (i.e., I is not necessarily the integers). Then
(x .. : E I} is said to be an orthogonal set ofvectors if x .. ...L x p for all , pEl
such that 1= = p. A vector x E X is called a unit vector if II x II = 1. An
6.11.18. Definition.
Let { X I '
... ,x
II ~
x
n}
W= J~
Then
IIx J llz.
Chapter 6
380
(f, g)
(f(t)g(t) dt.
(6.11.21)
is an orthonormal set in .X
we obtain
(f.,f",) =
Substituting Eq.
e 2a (a- I II)'
Since e
2ak
i.e., if m
'
cos 2nk
2n(n -
II
(fft,f",) =
0, m
* n;
0 e2aCa- I II)"
(6.11.21),
dt
1
m)i
* n, then fa ..L
:J
(fft,fft) =
i.e., if n =
fft(t)f",(t)
dt
(6.11.22)
m, then (fft,fft) =
Il/all =
I and
I;
1.
t I(x,
1='
(ii) (x -
x;)
6.11.25. Exercise.
x,)x,)
... , fX t}
12 < IlxW
:t (x,
1='
If { X I '
for all
..L x
J
X;
for any j
and
then
(6.11.24)
= 1, ... , n.
->
00
then
(6.11.27)
for every x
.X
(1"
6.12.
381
Orthogonal Complements
0 for all
From our discussion thus far it should be clear that not every normed
linear space can be made into an inner product space. The following theorem
gives us sufficient conditions for which a normed linear space is also an
inner product space.
6.11.30. Theorem.
Let
tfll x
for all ,x y
,X
yW where i =
6.11.33. Exercise.
,X
(6.11.31)
6.11.34.
Corollary. If X is a real normed linear space whose norm
satisfies Eq. (6.11.31) for all ,x y E ,X then it is possible to define an inner
product on X by
(x, y)
for all ,x y
= tWx
yW - l lx
-
yW}
.X
6.11.35. Exercise.
6.12.
ORTHOGONAL
COMPLEMENTS
is in a subspace Y of X
and z is orthogonal to .Y
y.
Xl
6.11.3
iF gure G
6.12.5. Exercise.
and Xl. =
O
{ .J
6.12.
Orthogonal Complements
383
which contains ;Y
Proof
y.L
::::J
Z.L.
() .Y l.
O.
Z ::::J ,Y
Z.L and
To prove part (iv) we note that, by part (ii) of this theorem, y.l c yll.L .
On the other hand, since Y c yH , by part (iii) of this theorem, y.L ::::J y.L l l.
Thus, y.L = y.L..L .L
The proof of part (v) is also left as an exercise.
_
6.12.9. Exercise.
In view of part (iv) of the above theorem, we can write y.L = y.LH
=
= ... , and y.l.L = y.l.L..L L = yU H l l.
= ....
Before giving the classical projection theorem, we state and prove the
following preliminary result.
yl..L..L lL .
6=
be a linear subspace of ,X
inf(lIy -
xII: y
.} Y
and let x
be an
384
IIx - W
z
IIx -
y 1l2 =
(x -
oY -
= (x - oY , x - oY ) -
(x -
oY , y ) -
oY -
oY 11 2 - 1 1
I I
y , x - oY (<,Y<
y)
oY )
x -
(<,Y<
y)
lIlIyll2
oY 11 2;
11 1- 1
< Ilx oY II. F r om this it follows that if x oY is not orthog,Y then Ilyo - ix i
o. This completes the first part of
oY
onal to every y E
the proof.
Next, assume that (x - oY ) 1- .Y We must show that oY is a uniq u e vector
such that II x y II > II x - oY II for all y oY . F o r any y E Y we have, in
view of part (ii) of Theorem 6.11.16,
I!x - yW = IIx - oY
oY -
yW =
IIx - oY W
Ilyo -
all y
*oY -
yW
This com-
6.12.11.
fF uJ re
The preceding theorem does not ensure the existence of the vector oY .
However, if we require in Theorem 6.12.10 that Y b e a c/osedlinear subspace
in a H i lbert space ,X then the existence of the unique vector oY is guaranteed.
6.12.
Orthogonal Complements
385
This important result, which we will prove below, is called the classical
projection theorem.
6.12.12. Theorem. eL t X be a Hilbert space, and let Y be a closed linear
subspace of .X Let x be an arbitrary vector in ,X and let
J = inf{lIy - Ix I: Y
.}Y
More-
x)
(x - nY )W
+ II(Y", -
)x =
(x - nY )IJ2
211Y", X
11 2
211x -
nY W
nY W
= 211Y", -
xW
211x -
nY W -
14 1 x - (y",
lIy", -
nY W
<
211Y",
xW
211x -
nY W -
Yn)!r'
we have
(4 P.
Proof. Let x be any vector in Z which is not in Y (there is one such vector
by hypothesis). If we define J as above, i.e., J = inf{lly - Ix I: Y E ,}Y
then there exists by Theorem 6.12.12 a vector oY E Y such that II x - oY II
= .J Now let z = oY - .x Then z 1.. Y by Theorem 6.12.12. _
386
::>
.Y Under
space .X
Then
Proof From part (ii) of Theorem 6.12.8 we have Y c y.u.. Since y.u. is
closed by Theorem 6.12.6, it follows that f c y.u.. F o r purposes of contradiction, let us now assume that f := 1= y.u.. Then Theorem 6.12.13 establishes the existence of a vector z E y.u. such that z := 1= 0 and such that z - ' f.
Thus, .z E fl.. Since Y c f, it follows that Z E yl.. Therefore, we have
Z E yl. n y.u. and Z := 1= 0, which is a contradiction to part (i) of Theorem
6.12.8. eH nce, we must have f = y.u..
We note that if, in particular, Y is a closed linear subspace of X, then
y.u..
In connection with the next result, recall the definition of the sum of two
subsets of X (see Definition 3.2.8).
Y
Z" -
m
z W =
IIY" - m
Y W
liz" - m
z ll 2
Ilu" as n - +
u= Y
(y
+ )z 1I
= IIY" - Y
ZIt -
lz l <
IIY,,-
yll + liz" - lz l
00.
Before proceeding to the next result, we recall from Definition 3.2.13 that
a linear space X is the direct sum of two linear subspaces Y and Z if for
every x E X there is a unique Y E Y and a unique Z E Z such that x =
6.13. oF urier
Series
387
=
Y
space X,
Zu =
(Z1.)1.
= 0{ 1} . =
.X
We call the function P the projection of x onto .Y Note that P(Px ) ~ p2X =
Py = ;Y eL ., p2 = P. We will examine the properties of projections in
greater detail in the next chapter. (Refer also to Definition 3.7.1 and
Theorem 3.7.4.)
6.13.
O
F R
U IER
SERIES
388
finite or infinite number of vectors from an orthonormal set. In this connection we will touch upon the concept of basis in Hilbert space. The property which makes all this possible is, of course, the inner product.
Much of the material in this section is concerned with an abstract approach
to the topic of F o urier series. Since the reader is probably already familiar
with certain facets of F o urier analysis, he or she is now in a position to recognize the power and the beauty of the abstract approach.
Throughout this section ;X {
(0, .)} is a complex inner product space, and
convergence of an infinite series is to be understood in the sense of Definition
6.3.1.
We now consider the representation of a vector Y of a finite-dimensional
linear subspace Y in an inner product space.
6.13.1. Theorem. Let X be an inner product space, let uY{
.. ,Yn} be a
finite orthonormal set in ,X and let Y be the linear subspace of X generated
by { Y I ' . , nY ' } Then the vectors {Yu .. , nY } form a basis for Y a nd, moreover, in the representation of a vector Y E Y by the sum
=
Y
IXIIY
+ ... +
IXnnY '
Exercise.
6.13.2.
4.9.51.)
i= I ,
.. ,n.
and
Let
.I 1X
1= 1
/1
<
00.
t1.,
Proof
A series
i.e.,
,X
if and only if L
in .X
Assume that
(x, ,x ),
2
I:& m + l
t:1
t1.I X
be a countably
is convergent to an
i; I1X / <
1= 1
be a H i lbert
X
00,
1t1. / 12
= I, 2, ....
__
t IX,X
I- I
,. If n
> m, then
6.13. oF urier
Series
389
Ill,1 2- >
1-11I+1
and ~
00
1 ..+ 1
Ill,12 <
-00
0 as n, m - >
00.
From
00.
'-11I+1
s". W
Ill,12 - >
Now assume that} ' Ill,12 < 00, and let x = lim s. We must show that
f:1
- ...
ll, = (x, ,x ). From Theorem 6.13.1 we have ll, = (s., ,x ), i = I, ... ,n.
But s. - > x, and hence by the continuity of the inner product we have
(s., ,x ) - > (x, ,x ) as n - > 00. Therefore, ll, = (x, ,x ), which completes the
proof. _
In the next result we use the concept of closed linear subspace generated by
a set (see Definition 6.12.7).
6.13.4. Theorem. Let ,x { }
be an orthonormal sequence in a Hilbert space
,X and let Y be the closed linear subspace generated by IX { ' }
Corresponding
to each x E X the series
00
converges to an element
1-'
(x, x,)x
(6.13.5)
.Y Moreover, (x -
)X ..L .Y
tU ilize
Theorems 6.11.26,
390
Y is complete;
(ii) if (x, y) =
for all Y
(iii) V(Y) = .X
(i)
,Y then x
0; and
6.13.9. Exercise. Prove Theorem 6.13.8 for the case where Y is an orthonormal sequence ,x { .}
As a specific example of a complete orthonormal set, we consider the
set of elements e l = (1,0, ... ,0, ...), e" = (0, 1,0, ... ,0, ...), e3 =
(0,0, 1,0, ... ,0, ...), ... in the Hilbert space I" (see Example 6.11.9). It is
readily verified that Y = Ie,} is an orthonormal set in I". Now let x = (' t ,
,,,' ... "., . . ) E
Ilx - iX II"
f 1',1",
Ic:t:l
- iX ii
k- -
let
iX
0. Hence,
=
,'1=
,' e,.
V(Y) =
Then
I" and
be a Hilbert
be
(6.13.11)
for every x
liz II" L
is complete.
E X
such that
I(z,
1= =
x,)
I-'
that the sequence
Now assume
6.13.4 and 6.13.8 we have
x
t=1
(x, ,x )x,
,~
, ,x .
6.13. oF urier
Since ,x { J
Series
391
is orthonormal we obtain
IIx I I
1= I
J=
= 1'' f1=
J=
1; (1,/i J (x
"
)J x
1= I
1(1,,1 2
.4 9.55).
X be an inner-product space. eL t ,x { }
be a finite
or a countably infinite sequence of linearly independent vectors. Then there
exists an orthonormal sequence y{ ,J having the same cardinal number as the
and generating the same linear subspace as ,x { ,}
sequence ,x { }
Proof
It is clear that IY
and IX
11;:11'
=
YI
Since
(Z2' Y I )
(x 2 (x 2, Y I )
(x 2, IY )YIo
-
(X2'
= (x
YI)
2, Y I )
= 0,
YI)
(x 2, IY )(YIo
YI)
*"
it follows that Z2 -L Y I ' We now let 2Y = 2z /11 2z 11. Note that Z2 0, because
and IY are linearly independent. Also, IY and 2Y generate the same linear
subspace as IX and X 2, because 2Y is a linear combination of IY and 2Y '
Proceeding in the fashion described above we define Zlo Z2' ... and
Y I ' 2Y ' . . recursively as
2X
Z. =
.X -
and
Y.=
a- I
1= 1
(X., ,Y )Y,
IIz:II'
As before, we can readily verify that z. L- ,Y for all i < n, that z. # . 0, and
that the ,Y { l,
i = I, ... ,n, generate the same linear subspace as the ,x{ ,}
i = I, ... ,n. If the set ,x { }
is finite, the process terminates. Otherwise it is
continued indefinitely by induction.
392
e,,}
The sequence
thus constructed can be put into a one-to-one corTherefore, these sequences have the
respondence with the sequence ,x { ,}
same cardinal number. _
The following result can be established by use of Zorn's lemma.
6.13.13. Theorem. eL t X be an inner product space containing a nonez ro element. Then X contains a complete orthonormal set. If Y is any
orthonormal set in ,X then there is a complete orthonormal set containing
Y a s a subset.
Indeed, it is also possible to prove the following result: if in an inner
product space \Y and Y 1 are two complete orthonormal sets, then Y \ and Y 1
have the same cardinal number, so that a one-to-one mapping of set \ Y onto
set Y 1 can be established. This result, along with Theorem 6.13.13, allows
us to conclude that with each Hilbert space X there is associated in a natural
way a cardinal number ". This, in turn, enables us to consider " as the
dimension of a Hilbert space .X F o r the case of finite-dimensional spaces this
concept and the usual definition of dimension coincide. oH wever, in general,
these two notions are not to be viewed as one and the same concept.
Next, recall that in Chapter 5 we defined a metric space X to be separable
if there is a countable subset everywhere dense in X (see Definition 5.4.33).
Since normed linear spaces and inner product spaces are also metric spaces,
we speak also of separable Banach spaces and separable Hilbert spaces. In
the case of Hilbert spaces, we can characterize separability in the following
equivalent way.
6.13.14. Theorem. A Hilbert space X is separable if and only if it contains
a complete orthonormal sequence.
6.13.15. Exercise.
,x { }
= L
. (x,
1= \
X , )X
I,
6.14.
393
concept, is of very little value in spaces which are not finite dimensional. In
such spaces, orthonormal basis as defined above is much more useful.
We conclude this section with the following result.
6.14.
THE
RIESZ REPRESENTATION
THEOREM
= (x,
f(x )
y)
(6.14.1)
I(x, y)1
<
Ilx l illyll
6.14.2.
Proof
394
X=
ZEfjZ1..
f(x -
f(x ) u) =
f(x ) -
f(x ) f(u) =
f(x ) -
f(x ) =
0,
and thus (x -
(x - f(x ) u, u) =
y=
f(x )
= (x,
y).
To show that the vector y is unique we assume that f(x ) = (x, y' ) and
I(x ) = (x, y") for all x E .X Then (x, y' ) - (x, y") = 0, or (x, y' - y") = 0,
or (y' - y", )x = 0 for all x E .X It now follows from Theorem 6.11.28 that
y' = y". This completes the proof of the theorem. _
6.14.3.
Exercise.
Definition 6.9.8).
space X
is reflexive (refer to
6.14..4
Exercise. Two normed linear spaces over the same field are said
to be congruent if they are isomorphic (see Definition 3.4.76) and isometric
(see Definition 5.9.16). Let X be a H i lbert space. Show that X is congruent
to X*.
6.15.
SOME APPLICATIONS
6.15.
Some Applications
395
variables, while in the third part we concern ourselves with the estimation of
random variables.
A. Approx i mation
(Normal Equations)
of Elements in H i lbert
Space
GT(y1> ... , nY )
IX] \
[
:
=
([ 'X )\Y ]
:
(6.15.1)
'
IXn
(x , nY )
where in Eq. (6.15.1) GT(y\, ... , nY ) is the transpose of the matrix
(Y\,
Y\)
(Y\,
nY )
(Y2,
Y\)
(Y2'
nY )
(6.15.2)
(Yn' Y \ )
(Yn' nY )
The matrix (6.15.2) is called the Gram matrix of Y\, ... ,Yn' The determinant
of (6.15.2) is called the Gram determinant and is denoted by A(YI> ... ,yJ.
The equations (6.15.1) are called the normal equations. It is clear that in a
real Hilbert space G(YI> ... ,Yn) = GT(YI> ... ,Yn), and that in a complex
Hilbert space G(YI> ... ,Yn) = GT(y1' ... ,Yn)'
In order to approximate x E X by oY E Y we only need to solve Eq.
(6.15.1) for the lXI' i = 1, ... ,n. The next result gives conditions under
which Eq. (6.15.1) possesses a unique solution for the IX I
396
*'
+ ... +
IXIIY
IXftYft
= O.
(6.15.4)
Taking the inner product of Eq. (6.15.4) with the vectors IY { '
the n linear equations
... , fY t}
yields
(6.15.5)
IXI(Yft'
YI)
t- -
..
IXft(Yft'
fY t)
= 0
Taking the l{ IX ' ... ,IXft} as unknowns, we see that for a non-trivial solution
(IX I . .. ,IXft) to exist we must have (~ IY '
... 'Yft) = O.
Conversely, assume that ~(yl'
... , fY t) = O. Then a non-trivial solution
(IX I
, IXft) exists
for Eq. (6.15.5). After rewriting Eq. (6.15.5) as
we obtain
f' .'
( I'I-IX IIY .
t IXly,
I- I
l":1
IXllY )
II,IX
=
1= I
IY I12 =
0,
... ,Y . }
is linearly
The next result establishes an expression for the error II x - oY II. The
proof of this result follows directly from the classical projection theorem.
6.15.6. Theorem. Let X be a Hilbert space, let x E ,X let { y l' ... , fY t} be
a set of linearly independent vectors in ,X let Y be the linear subspace of X
generated by { y l' ... , fY t}' and let oY E Y be such that
IIx -
Then
oY II =
min ! I x
7EY
yll =
min IIx -
I%IIY
... -
IXJ.II
397
where
!\(YI'
(Yit ,Y ,)
(Ylt
x)
(Y z ,
(Y z ,
x)
,Y ,)
= det
... ,Y", x )
(Y", ,Y ,)
(x, ,Y ,)
6.15.7. Exercise.
8.
(y.. x )
(x, x )
Random Variables
we have
,,-I
E"
E ~,
and (iii) 0
E ~.
It readily
.- 1
398
defined by (xF )x
= P{ X I < x . , ... , fX t < x ft ,} is called the distribution
function of .X
If X is a random variable and g is a function, g: R - R, such that the
Stieltjes integral
to be E{g(X)}
is a function, g: Rft -
t.
is defined
g(x)dF(x )x .
values of primary interest are E(X), the expected value of ,X E(XZ), the second
moment of ,X and E{ [ X
- E(X)Z
] ,}
the variance of .X
If we let .c z denote the family of random variables defined on a probability
space to, g:, P} such that E(XZ) < 00, then this space is a vector space over
R with the usual definition of addition and multiplication by a scalar. We
say two random variables, IX and X z , are equal almost surely if P{co: IX (co)
(z X co)}
= O. If we let L z denote the family of equivalence classes of all
random variables which are almost surely equal (as in Example 5.5.31),
then L { z ; (,)} is a real Hilbert space where the inner product is defined by
*'
L z
Throughout the remainder of this section, we let to, g:, P} denote our
belong
underlying probability space, and we assume that all random variab~es
to the Hilbert space L z with inner product (X , )Y = E(XY).
(X,
)Y
E(XY)
for ,X
+ ... +
V({Y
399
, IY Il})'
F u rthermore, Eq . (6.15.1) gives us the explicit form for
1, ... ,m. We are now in a position to summariz e the above discussion in the following theorem, which is usually called the orthogonality
principle.
" i =
p . .
6.15.8. Theorem. L e t ,X Y
I IlYIIl
that E{ [ X
Y I ,
, Y IIl belong to L z . L e t
G = ,Y [ j]'
where
i,j = I,
, m, and let V = (PI' ... ,Pill) E Rill, where
,} for i = 1,
, m. If G is non- s ingular, then X = I Y I
is the best linear estimate of X if and only if aT = bTG - I .
6.15.9. Corollary. L e t ,X
't,j =
P, =
l ilY
E{,Y Y
E{XY
1ft
j,}
6.15.10. Exercise.
+ ...
, -
0';-+
z for
mO'" , O'v
._
I -
O';b ,j for
I, ... , m.
The nex t result provides us with a useful means for finding the best linear
estimate of a random variable ,X given a set of random variables { Y p ... ,
Y k } , if we already have the best linear estimate, given {Y p . , Y k - I } .
04 0
Proof By the classical projection theorem (see Theorem 6.12.12), "Y ,(k - I)
.J .. ,Y' .-I
Now for arbitrary Z E ,Y ' ., we must have Z = CIY I + ... +
C,.-I ,Y .-I + C,.Y,. for some (C I' ... ,C,.). We can rewrite this as Z = ZI + Z2'
where ZI = CIY I + ... + C,.-I,Y .-I
+ C,.Y,.(k - I) and Z2 = C,.Y,.(k - I).
and Z2 1- 'Y,.-I'
it follows from Theorem 6.12.12 that ZI
Since ZI E ,Y ' .-I
and Z2 E V({,Y .(k - I)}), the theorem
and Z2 are unique. Since ZI E ,Y' .-I
is proved. _
Exercise.
be random variables in
1, ... ,m, and let B =
G is non-singular, then
if and only if A = BG- I .
i =
E{XYT}[E{YVTWIY
(6.15.16)
= 0
E{ U ( k)}
E{U(k)UT(j)}
Q(k~j"
(6~I5.1
7)
(6.15.18)
6.15.
Some Applications
04 1
E{V(k)}
and
E{V(k)VT(j)}
(6.15.19)
R(k)Ojk
(6.15.20)
m) matrix
with the
(6.15.21)
(6.15.22)
and
E{X(I)VT(k)}
=
=
0,
(6.15.23)
0,
(6.15.24)
(6.15.25)
(X k
Y(k)
1)
A(k)X(k)
C(k)X(k)
B(k)U(k)
(6.15.26)
V(k)
(6.15.27)
Chapter 6
i(k Ik -
and
i(k
where
K ( k)
P(k Ik -
P(k
for k
K ( k)[ Y ( k)
II k)
11 k)
= I[ -
C(k)i(k Ik -
1)],
(6.15.29)
A(k)i(k Ik),
I)CT(k)[C(k)P(k
P(kl k)
and
I)
Ik
-
I)CT(k)
K ( k)C(k)] P (kl
A(k)P(kl k)AT(k)
(6.15.30)
(6.15.31)
R(k)] - l ,
I),
(6.15.32)
B(k)Q(k)BT(k)
(6.15.33)
k -
i(IIO) =
and
P(lIO)
= P(I).
to find i(k Ik) and i(k + 11 k). It follows from Theorem 6.15.13 (extended
to the case of random vectors) that there is a matrix K ( k) such that i(k I k)
= i(kl k - I) + K ( k)f(kl k - 1), where f(kl k - 1) = Y ( k) - t(kl k
- I), and t(k Ik - I) is the best linear estimate of Y ( k), given {Y(l),
... ,
Y ( k - I)}. It follows immediately from Eqs. (6.15.23) and (6.15.27) and the
orthogonality principle that t(k I k - 1) = C(k)i(k I k - I). Thus, we have
shown that Eq. (6.15.29) must be true. In order to determine K ( k), let
X ( kl k - 1) = X ( k) - X ( kl k - I). Then it follows from Eqs. (6.15.26) and
(6.15.29) that
(X kl
k)
X ( kl
k -
I) -
K ( k)[ C (k)X ( kl
k -
I) +
V(k)] .
E{ X ( k
Ik -
l)YT(k)}
K ( k)[ C (k)E{ X ( k
Ik -
I)YT(k)}
E(X ( kl
k -
Ik -
l)Y T (k)}
E{V(k)YT(k)}].
X ( kl
k -
(6.15.34)
l)VT(k)}.
(6.15.35)
04 3
E{ X ( kl
k-
= E{ X ( kl
I)XT(k)}
k-
l)[XT(k)
iT(kl k -
iT(k Ik -
1)' +
I)]}
P(kl
k-
I)
E{X(kl
I)
(6.15.36)
where
t::.
l)iT(klk' - I)} =
I)} .
k-
I)}
I)X T (klk -
... , Y(k -
Now consider
Using
.+
= E{V(k)[TX (k)CT(k)
E{V(k)YT(k)}
= R(k).
VT(k)J}
(6.15.37)
0=
P(kl k -
I)CT(k) -
K(k)[C(k)P(kl
k-
l)CT(k)
.+
R(k)].
(6.15.38)
X ( kl k) =
X ( kl k -
1) -
K(k)[C(k)X(kl
Ik -
= [ I - K(k)C(k)]X(k
k-
1) -
1)
V(k)]
(K k)V(k).
I[ -
(K k)C(k)JP(kl
I[ -
= I[ -
K ( k)C(k)] P (kl
K(k)C(k)]P(k
P
{ (k
k-
Ik -
1)
k-
Ik -
I)CT(k) -
I)CT(k)KT(k)
(K k)R(k)KT(k)
1)
K(k)[C(k)P(k
Ik -
I)CT(k)
.+ R(k)J}
T
K (k).
E{[X(k
+
=
1) -
for j
A(k)i(k Ik)]YT(j)}
EfA(k)[X(k)
1, ... , k.
i(k I k)]YT(j)}
.+
EfB(k)U(k)YT(j)}
=
04 4
Finally, to verify Eq. (6.15.33), we have from Eqs. (6.15.26) and (6.15.30)
X(k
11 k)
A(k)X ( k
Ik) +
B(k)U(k).
6.16.
0 and
The material of the present chapter as well as that of the next chapter
constitutes part of what usually goes under the heading of functional analysis.
Thus, these two chapters should be viewed as a whole rather than two separate
parts.
There are numerous excellent sources dealing with H i lbert and Banach
spaces. We cite a representative sample of these which the reader should
consult for further study. References 6
[ .6]6[- .8],
6[ .10], and 6[ .12] are at an
introductory or intermediate level, whereas references 6
[ .2]6[ - .4]
and 6[ .13]
are at a more advanced level. The books by Dunford and Schwartz and by
Hille and Phillips are standard and encyclopedic references on functional
analysis; the text by Y osida constitutes a concise treatment of this subject,
while the monograph by H a lmos contains a compact exposition on H i lbert
space. The book by Taylor is a standard reference on functional analysis at
the intermediate level. The texts by K a ntorovich and Akilov, by K o lmogorov
and F o min, and by Liusternik and Sobolev are very readable presentations
of this subject. The book by Naylor and Sell, which presents a very nice
introduction to functional analysis, includes some interesting examples. F o r
references with applications of functional analysis to specific areas, including
those in Section 6.15, see, e.g., Byron and F u ller 6[ .1], K a lman et al. 6[ .5],
L u enberger 6[ .9], and Porter 6[ .11].
REFERENCES
6[ .1]
6[ .2]
6[ .3]
6[ .4]
6[ .5]
.F W. BYRON and R. W. EL UF R,
Mathematics of Classical and Quantum
Physics. Vols. I. II. Reading, Mass.: Addison-Wesley Publishing Co., Inc.,
1992.
6.16.
6[ .6)
6[ .7)
6[ .8)
6[ .9)
6[ .10]
6[ .11]
6[ .12]
6[ .13]
and G. P. AKIO
L V,
uF nctional Analysis in Normed
The Macmillan Company, 1964.
A. N. O
K M
L OGOROV and S. V. O
F MIN, Elements of the Theory of uF nctions
and uF nctional Analysis. Vols. t, II. Albany, N.Y.: Graylock Press, 1957
and 1961.
.L A. IL SU TERNIK
and V. .J SoBOLEV, Elements ofFunctional Analysis. New
York: rF ederick Ungar Publishing Company, 1961.
D. G. EUL NBERGER,
Optimization by Vector Space Methods. New York:
J o hn Wiley & Sons, Inc., 1969.
A. W. NAYO
L R and G. R. SEL,L
iL near Operator Theory. New York: Holt,
Rinehart and Winston, 1971.
W. A. PORTER, Modern oF undations of Systems Engineering. New York:
The Macmillan Company, 1966.
A. E. TAYO
L R, Introduction to uF nctional Analysis. New York: John Wiley
& Sons, Inc., 1958.
.K O
Y SIDA, uF nctional Analysis. Berlin: Springer-Verlag, 1965.
IL NEAR
OPERATORS
7.1.
BOUNDED
IL NEAR
TRANSFORMATIONS
Throughout this section X and Y denote vector spaces over the same field
y{
Y:
T(v), v EVe X } .
On the other hand, if W c ,Y then the inverse image ofset Wunder T is the
set
T- I (W) = x { E :X y = T(x) EWe .} Y
We define the range ofT, denoted R
< (T), by
R
< (T) =
y{
:Y
y=
T(x), x EX } ;
i.e., R
< (T) = T(X). Recall that if a transformation T of X into Y is injective,
then the inverse of T, denoted T- I , exists (see Definition 1.2.9). Thus, if
y = T(x) and if T is injective, then x = T- l (y).
In Definition3.4.1 we defined a linear operator (or a linear transformation)
as a mapping of X into Y having the property that
(i) T(x
(ii) T(lX)X
and
.X
X into Y by L ( X ,
)Y .
Tx in place of T(x).
.X
04 7
04 8
Exercise.
whenever II x X
o II
T(x o) II <
< 6.
7.1.6. Exercise.
Let T
L(X,
)Y .
Assume that T is bounded, and let "I be such that II Tx II S "IIIx II for
all x E .X Now consider a sequence x { n ) in X such that x . - > 0 as n - > 00.
Then II TX n II < , 11 .x 11 - > 0 as n - > 00, and hence T is continuous at the point
E .X
F r om Theorem 7.1.5 it follows that T is continuous at all points
x E .X
Conversely, assume that Tis continuous at x = 0, and hence at all x E .X
Since TO = 0 we can find a 6 > 0 such that II Tx II < I whenever II x II S 6.
F o r any x 1= := 0 we have 1I(6x)/llxllll = 6, and hence
Proof
IITxll
Ifwe let"
II T(I I~
<
11)11
i'llxll,
(~)II
T(I ~I)I
ilfl
<
_
Now let S, T E L ( X ,
+ T) by
(S
In Eq.
Y).
operators (S
T)x
(3.4.24 )
Sx
Tx, x
,X
= IX(Tx),
(IXT)x
,X
IX
IX
F as
.F
(ST)x =
S(Tx), x
.X
ST* TS.
In the following, we will use the notation B(X , )Y to denote the set of
all bounded linear transformations from X into Y; i.e.,
B(X , )Y
{T
L(X,
)Y :
T is bounded}.
(7.1.8)
Let
B(X , )Y .
inf{y: II Txll
<
II Til,
is
(7.1.12)
IITxll S IITII' l lx l l
for all x E .X In proving that the function II . II: B(X , )Y - + R satisfies all
the axioms of a norm (see Definition 6.1.1), we need the following result.
14 0
< lY lxll
(i)
II Til =
inf{ y :IITx l l
(ii)
II Til =
sup I{ I Tx l l/llx l l:
(iii)
IITII =
(iv)
II Til =
7.1.14.
","0
Tx l l: x
sup
I{ I
Tx l l: x
EX } .
Exercise.
;} X
and
II . II defined
Proof
be
EX } ;
I{ I
1"'1=\
can equivalently
for all x EX } ;
sup
I",I:S:\
II Til
)Y ,
B(X ,
0;
II(Sltx~)xlI
,X
x t= =
B(X,
X),
then ST
B(X,
IISTII =
completing the proof.
B(X,
X)
and
Til
X we have
X)
7.1.16. Theorem. If S, T
Proof
O.
X).
sup
","0
If x t= =
Tx l l
< IISIIIITIIllxll,
0, then
Then /
7.1.
Bounded iL near
7.1.18. Exercise.
Transformations
14 1
Let
,X
Tx =
6.1.6. F o r
The reader can readily verify that T is a linear operator which is neither
injective nor surjective. We see that
00
00
IITxW =
<
~Ie,lz
le,l z
IIxW
7.1.20. Example. Let X = era, b], and let 111100 be the norm on era, b] defined
in Example 6.1.9. eL t k: a[ , b] X a[ , b] - > R be a real-valued function, continuous on the square a < s < b, a < t < b. Define the operator T: X - > X
by
[ T x ] ( s)
for x
.X
Then T
L(X,
X)
IITx II =
sup
Q~,~b
)10
B(X ,
k(s, t)x ( t) dt
< [Q~rb
=
r
Q
Ik(s, t) Idt]
lIxll
)Y and that
IITil <
[Q~fb
)10'
Ix ( t) I]
It can, in fact, be shown
eL t
T E L(X,
)Y .
If X
Let {XI'
,x n } be a basis for .X
set of scalars
,en} such that x
the linear functionalsj,: X - > F b y j,(x ) =
Proof
reI,
= elx l +
e" i =
I,
there is a unique
If we define
,n, then by Theorem
X,
enxn'
14 2
IIxllp =
[ I ' l lI'
+ ... +
11_ = max, I{
II x
<
1 <p
00
e,l}.
It turns out that different norms on R" give rise to different norms oftransformation A. (In this case we speak of the norm of A induced by the norm
defined on R".) In the present example we derive expressions for the norm
of A in terms of the elements of matrix A when the norm on R" is given
by II III' II liz, and II 11-
IIAxl1 =
m
{ ax
S;lS;" I- I
jo be such that
Then IIAII =
i;latj,l=
1= 1
max
t'lalll.
1-1=
m
{ ax
tla/JI"
I S;lS;" 1= 1
t lau,l
I- '
la,ll} l lx l l
max
)' 0 '
)' 0 '
E
and Ilxoll =
To
= I,
1.
)' 0 '
7.1.
Bounded iL near
Transformations
14 3
Ie.
To prove this we note first that by Theorem .4 10.28 the eigenvalues of ATA
are all real. We show first that they are, in fact, non-negative.
Let {XI' ... , ,x ,} be eigenvectors of ATA corresponding to the eigenvalues
A{ . I, ... , A,,}, respectively. Then for each i = I, ... , k we have ATAx/ = A/X/.
Thus, ;X ATAx/ = A,X;/X .
From this it follows that
A=
,
>
;X ATAx/
x;x/-
O.
IIAxW =
T
x ATAx
'=1
e,
7.1.23. Exercise.
"
Ao I; Ilx,W
/= 1
+ .
+ .
AollxW,
IIAxW =
IIAII =
max ( t
/
J=I
laill).
The proof of
)Y
<
T.xll
IITm-
.-00
T(x
and
y) =
lim T.(x
T(<)x<
y) =
lim T.x
lim T.y
= lim T.x =
Tx.
= Tx + Ty
14 4
II Txll =
II =
lilim Tnx
lim II Tnx
II <
sup 01 Tn IIIIx
II)
(~ T)
x{
:X Tx
OJ.
L(X,
)Y as
(7.1.25)
eL t T
)Y . Then
B(X,
~(T)
T(I;
I- I
L(X,
~IXI)
= I; ~ITxl
1= 1
in .X
The proof of this theorem follows readily from Theorem 5.7.8. We leave
the details as an exercise.
7.1.28. Exercise.
7.2.
INVERSES
Throughout this section X and Y denote vector spaces over the same field
ofT- I .
Proof Assume that there is a constant IX > 0 such that IXII x II < II Tx II for
all x E .X Then Tx = 0 implies x = 0, and T- ' exists by Theorem 3.4.32.
For y E R
< (T) there is an x E X such that y = Tx and T- l y = .x Thus,
or
II =
IXII x
IXII T- I y II
<
II <
IIT-I Y
II Tx II
= II y II,
~ lIyll
The next result, called the Neumann expansion theorem, gives us important
information concerning the existence of the inverse of a certain class of
bounded linear transformations.
7.2.2. Theorem. Let X be a Banach space, let T E B(X, )X , let I E B(X, X)
denote the identity operator, and let II Til < I. Then the range of (1- T) is
,X the inverse of (I - T) exists and is bounded and satisfies the inequality
. Til in B(X,
.-0
1+
)X ;
T+
(7.2.3)
X)
converges uniformly to (J -
i.e.,
T2
+ ... +
T"
+ ....
T)- I
(7.2.4)
14 5
416
Proof
Since
IITil <
II P II < IITil,
then
ST =
I: P+
TS =
T)S =
I,
.=0
T)
S(I -
It now follows from Theorem 3.4.65 that (I F u rthermore, S E B(X , X ) . The inequality
and is left as an exercise.
_
7.2.5. Exercise.
I: T
.so
I:P,
.-0
S=
(I -
X),
and
converges. In
I.
Proof
The proof of this theorem is rather lengthy and requires two preliminary results which we state and prove separately.
where
X.
Proof
that II X
"x
A and
Xl
II < til X
Ilx
A such that
and obtain
1,2, ....
+ ... ,
X.
k}
Ilx and A =
+ ... +
X"
II X. II < 31IxlI/2, n =
The sequence x {
Xl
By construction of
I
.x 1I < 2.lIx
lI.
-
E A,
.X{ l,11
because
-
tl
11--
Xl
0 as n -
.x _
1 E
00.
Hence,
7.2. Inverses
x
= :E x
k- I
417
II. First,
we see that
and, in general,
+ IX -
+ -X
I - ... - ,X ,_I
,x ,11 + Ilx - IX - ... - ,x ,_111
X
II
7.2.8. Proposition.
= U
such that X
GO
_
is any countable collection of subsets of X
,,-I
C .1".
that S(X o; E)
Proof The proof is by contradiction. Without loss of generality, assume
that
AI C A z C A 3 C . . . .
F o r purposes of contradiction assume that for every x E X and every n
there is an E" > 0 such that S(x ; E,,) n A" = 0. Now let IX E X and EI > 0
be such that S(x l ; f l ) n AI = 0. eL t X z E X and f z > 0 be such that
S(x z ; fz ) c S(x l ; f\ ) and S(x z ; fz ) n Az = 0. We see that it is possible to
construct a sequence of closed nested spheres, ,K { ,},
(see Definition 5.5.34)
in such a fashion that the diameter of these spheres, diam (K,,), converges to
ez ro. In view of part (ii) of Theorem 5.5.35,
Then
X
n K" * 0. eL t
..
k- I
Clearly, Y
.
U A
k- I
{ y E :Y
k
II r- I y II <
kllyll},
n "K .
GO
GO
,,= 1
k= 1
A". This
= 1,2, ....
and a set A" such that S(Yo; E) C .1". We may assume that oY E A". Let
be such that 0 < p < E, and let us define the sets Band Bo by
and
= y{
Bo =
S(Yo; f):
{y E Y: y
p < lIy -
= z - oY ,
Z E
oY II}
B}.
14 8
Ax
Let Y
<
B n Aft' Then
<
Now let K
nlly -
oY
11[1
211Yoll
IIYoY- ll
+ 211 po ll
n[1
,Y
i{ :
,Y }
where II fY t
series I; X
k= 1
Y = IY +
II < 311 Y II/2ft . L e tx k =
+ ... +
1Y
T- I yk , k
< 3KllylI.
lk =
I~
so that
T(f
fY t
= 1,
= ~
k- I
TX
:tY k
k- I
~
X
(:' 1 k
y. eH nce,
T-I
3KIIYII
tU ilizing
the principle of contraction mappings (see Theorem 5.8.5),
we now establish results related to inverses which are important in applications. In the setting of normed linear spaces we can restate the definition of a
contraction mapping as being a function T: X X (T is not necessarily
7.3.
14 9
T(y) II <
IIT(x) -
/Xlix
yll
T(x)
= x
Let
X
X),
let l E ,F
E X
x =(iv) if III-
Til <
Proof
(i) F o r any ,x y
i[
l/)x =
y, and
+ ~ + ...
;J
0;
one
and
,X
we have
1I1- Tx - l - t TY I I
= 11- 1 1\ 1 T(x - y)1I < Il-IIIITllllx - yll.
Thus, if II Til < IAI, then A-I T is a contraction mapping. In view of the
principle of contraction mappings there is a unique x E X with l- ' T x
= x,
or Tx = lx . The unique solution has to be x = 0, because TO = O.
I
(ii)
L e tL
t- T.
Then IILII
mil
Til <
7.3.
CONJG
U ATE
AND ADJOINT
OPERATORS
24 0
TTy' )
T
< ,x
,X y'
yf
Ix'(x) I =
I=
ly' ( Tx )
I<T,x
y' ) I<
Ily'lllITxll"<
lIy'III1Tllllxll,
and therefore x ' is a bounded linear functional and x ' E X*. We have thus
assigned to each functional y' E y* a functional x ' E X * ; i.e., we have
established a linear operator which maps y* into X * . This operator is called
the conjugate operator of the operator T and is denoted by T'. We now have
The definition of T': *Y
.x <
*x
-+
T' y ' )
x'
T' y ' .
T
< ,x
tU ilizing
operator notation rather than bracket notation, the definition
of the conjugate operator T' satisfies the equation
x'(x)
y' ( Tx )
(T' y ' ) (x ) ,
E ,X
T' y ' ,
7.3.
24 1
7.3.1.
y.
iF gure A
7.3.3. Exercise.
T' y ' )
= <Tx,
y' ) , x
,X y'
The
*Y .
24 2
(Tx, y) =
(x, T*y), x
,X Y
.Y
We will now show that T*: Y - + X is linear, unique, and bounded. To prove
linearity, let x E ,X lY > 2Y E ,Y let ~, P E ,F and note that
(x, T*(~IY
PY2
=
(Tx, ~YI
(x,
T*YI)
PY2)
(Tx,
IY )
p(x , T*Y2) =
P(Tx, 2Y )
(x, ~T*YI
PY2) =
~T*YI
PT*Y2'
<
IITIIIIT*xllllxll,
(Tx, y) =
(x, T*y), x
,X Y
.Y
is
The reader is cautioned that many authors use the terms conjugate operator
and adjoint operator interchangeably. Also, the symbol T* is used by many
authors to denote both adjoint and conjugate operators.
Some of the important properties of conjugate operators are summarized
in the following result.
7.3.
24 3
(iv)
(v)
(vi)
(vii)
I<,x
<IIy'IIIITxll<
Ily'IIIITllllxll
this it follows that IIT'yl' l < IITlllly'lI,and therefore
IIT'II<IITII
From
T' y ' )
I=
I< T x , y' ) 1
"*
II Tx o II = I<x
o' T'y~>I<
IIT'y~
IITil <
IIT'II
(S +
T)' y ' )
S
S
< ,x
,x<
1- T)x , y' )
y' )
S' y '
<Tx,
y' )
T' y ' )
S
< x
=
,x<
Tx, y' )
,x<
(S'
S' y ' )
,x<
T' y ' )
T' ) y' ) .
T' y ;>
,x<
T'y>~
"*
<Tx,
y; -
"*
y~)"*
24 4
E
X
,X
Tx =
<x,
y, and we have
x')
<T-I
<x,
y, x ' )
<Tx,
T'(T~I)'x').
R
< (T')
(T- I )' .
Proof
or
IITllllx l l
T and T* we obtain
IITxZ
U
or
= I(Tx ,
Tx)1
= I(T*(Tx ) ,
)x 1
IIT*IIIITx l lllx l l,
II Tx II < II T* IIIIx II
this it follows that II Til < IIT*II,and therefore IITII = IIT*II
F r om
The proofs of properties (ii)-(viii) are trivial. To prove part (ix), we first
7.3.
24 5
note that
IITxW =
(Tx, Tx ) =
<
(T*Tx, )x
< IIT*Tllllxllllxll
IIT*Tx l lllx l l
Taking the square root on both sides of the above inequality we obtain
and thus II Til <
~IIT*TII,
7.3.9. Exercise.
IITW
7.3.10. Example.
= E
IY
i=
alJ x J ,
I= J
where IY is the ith component of the vector y E .X Let A* denote the adjoint
of A on the H i lbert space X, and let A* be represented by the n X n matrix
[a~}.
Now if u = (u l ,
,u.) E X , then
(Ax , u)
(y, u)
and
(x , A*u) =
1= 1
Nil =
t U I( 1=f:1 aIJx
1= 1
J ),
t (t
a~uJ)'
J-I
I- I
XI
In order that (Ax , u) = (x , A*u) we must have a~ = iiIJ ; i.e., the matrix of
A* is the transpose of the conjugate of the matrix of A.
(Tx)(t)
b (see Example
k(s, t)x(s)ds, t
6. I I.lO),
a[ , b],
where it is assumed that the kernel function k(s, t) is well enough behaved so
that
s: s:
I k(s, t) 1
dtds
<
00.
24 6
Now if U E L
[,
2a
b], then
(Tx, u)
u)
(Y.
s: k(t, s)u(s)ds;
= (T*uXI) =
Z(I)
12
-/
Exercise.
by
=
Y
12 (see Example
T(el>
Let
'2'
e., )
(0,
y,
is the operator
... )
12,
6.12.1), we have the following important results for bounded linear operators
on H i lbert
spaces.
(~ T*)
(vi) R
< (T)
space
~(T*);
~(T*).L;
R
<{ (T*)}.;L
=
=
~(T).L;
~(TT);
and
R
< (TT*).
Proof We prove (i) and (v) and leave the proofs of (ii}(- iv)
and (vi) as an
exercise.
To prove (i), we first show that R
< (T).l = ~(T*).
Let y E R
< (T).l. Then
(y, Tx) = 0 for all x E ,X and hence (T y , x) = 0 for all x E .X This can be
true only if T*y = 0; i.e., y E ~(T*).
On the other hand, if y E ~(T*),
then (T*y, x) = 0 for all x E .X Thus, (y, Tx) = 0 for every x E ,X which
implies that y E R< (T).l. Now R
< (T) need not be closed. However, by Theorem
6.12.14 R
< (T) = R
< (T)u. Therefore, R<{ (T)}.L
= R< (T)il.L = R< (T).L = ~(T*).
7.4.
eH rmitian
24 7
Operators
To prove (v), let y E m(T*). Then T*y = 0 and TT*y = O. This implies
that m(T*) c m(TT*). Next, let y E m(TT*). Then TT*y = 0 and (y, TT*y)
= O. This implies that (T*y, T*y) = 0 so that T*y = O. Therefore, y E m(T*)
and m(TT*) c m(T*), completing the proof of part (v).
7.3.14.
Exercise.
T(M) =
y{ : y =
is a Hilbert
Tx, x EM} .
0=
7.4.
HERMITIAN
OPERATORS
Exercise.
B(X,
X)
is said
24 8
Some authors call such transformations self-adjoint operators (see Definition .4 10.20).
The next two results allow us to characterize a hermitian operator in an
equivalent manner. The first of these involves symmetric bilinear forms (see
Definition 3.6.10).
7.4..4
Theorem. eL t T E B(X,
bilinear transformation ,(x , y) =
E Xor
7.4.5.
Theorem. Let
T E B(X,
E .X
0,
T)x , y) =
E
X).
.X
Tx for all
Proof
(Tx , )x
(x
y, T(x
(x -
where;
y, T(x -
i(x -
iy, T(x -
y) -
(T(x -
y), x -
i(T(x -
iy), x -
(T(x
y), x
iy
= ..;=1. Also,
y)
iy)
i(x
4(x,
i(T(x
iy, T(x
4 ( Tx ,
iy
(7.4.6)
Ty)
iy), x
y).
Since the left-hand sides of Eqs. (7.4.6) and (7.4.7) are equal,
that (x , Ty) = (Tx , y) for all x , y E ,X and hence T = T*.
iy
(7.4.7)
it follows
II Til =
IITII =
sup I{ (Tx , )x l:
sup { I (Tx , y)l:
7.4.
eH rmitian
7.4.9.
Operators
Exercise.
24 9
S, T E B(X,
X)
(S + T) is a hermitian operator;
exT is a hermitian operator;
if T is bijective, then T- I is hermitian; and
ST is hermitian if and only if ST = TS.
Exercise.
B(X, X )
T- S>
E
if S ~ 0, T~
0, then (S +
if ex > 0, T~
0, then exT~
if S ::; T, T::; ,U then S <
for any V E B(X, X), if
V*V> o.
B(X,
T)
0;
>
X)
0;
U; and
T > 0, then V*TV>
O. In particular,
Proof The proofs of parts (i}(- iii) are obvious. F o r example, if S > 0,
T > 0, then (Sx , )x + (Tx, )x = (Sx + Tx , )x =
+ Dx , x) ;;::: 0 and
(S+ D;;:::O.
To prove part (iv) we note that (V*TVx , x) = (TVx , Vx);;::: 0, since Vx
= y is a vector in X and (Ty, )Y > 0 for all y E .X If we consider, in particular, T = 1= 1*, then v* V ~ O.
The proof of the next result follows by direct verification of the formulas
involved.
34 0
7.4.15.
Theorem. eL t A
where i
= ,j- 1 .
~
=
A
[
E B(X ,
and let
X),
V=
and
A*]
ii
A
[ -
A*],
Then
=
U
and
Exercise.
= L
X
eL t
7.4.19.
Tx
s:
tx ( t)z ( t)
(x , Tz)
Let X =
z =
Show that T*
Tx
y(t)
dt
s:
x ( t)tz ( t)
dt
-+
by
x ( s)ds.
7.4.20. Exercise. eL t X = L
given in Example 7.3.11; i.e.,
Show that T
tx ( t).
[,
2a
L
(T*x , z).
T* and T is hermitian.
Exercise.
we have
E X
(Tx , z )
Thus, T =
b] (see Example
[,
2a
(Tx ) (t)
s: k(s, t)x(s)ds,
operator
t E a[ , b].
k(s, t).
We conclude this section with the following result, which we will subsequently require.
7.5.
34 1
Proof
7.5.
OTHER
LINEAR OPERATORS:
NORMAL OPERATORS, PROJECTIONS,
U N ITARY
OPERATORS,
AND ISOMETRIC OPERATORS
An operator T
B(X ,
E B(X ,
B(X ,
X)
X)
X)
is said to be an isometric
is said to be an unitary opera-
34 1
ClUpJ ter
I iL near Operators
T E B(X, X).
Let ,U V E B(X, X ) be hermitian
U
iV. Then T is normal if and only if U V = VU.
= l lx - y ll
)x
E
.X
(Tx , Tx )
But this
E B(X ,
(i) T is unitary;
(ii) T is unitary;
(iii) T and T are isometric;
)X .
34 3
7.S.H.
Exercise.
*" .X
*" O{ J
(i) P E B(X, X ) ;
(ii) IIPII = I; and
(iii) p* = P.
for all ,x Y E .X
This implies that P
= P*.
= x{
Px
E X:
= Y
= )x
= Y z.
34 4
Proof
Y=
Y
7.5.14.
Since
I
Y~
~Y
Theorem.
,Y
since Y c Y
Let P
L(X,
x{
it follows that
c Y~,
X).
and since Y
I,
Px
E X:
}x
Proof
If x, y E ,Y then Px
P(rx.x
fty) =
x and Py
P(rx.x
rx.Px
ftPy.
rx.x
fty.
Therefore, (rx.x
fty) E Y a nd Y is a linear subspace of .X We must show
that Y is a closed linear subspace. First, however, we show that P is bounded
and therefore continuous. Since
IIPzW
(Pz, Pz)
(P*Pz, )z
(P~z,
)z
(Pz, )z
<
IIPz l lllz l I,
I\ Px~
and let P be
Px =
I- I
(x, ,X )X
, for all x
.X
7.5.
34 5
I, ... , n we have
PX
Hence,
for any x
ft
~
(x
I- I
J ,
,x )x,
(7.5.18)
Ix "
we have
X
" (x,
,=
,X )X
t-1
.Y
Px.
+ ... +
tllXI
To show that
tI"x"
for some { t il' ... ,tift}. It follows from Eq. (7.5.18) that Py = Y and so
y E CR(P).
iF nally, to show that P is an orthogonal projection, we must show that
CR(P) 1- (~ P).
To do so, let x E ~(P)
and let y E CR(P). Then
(x, y)
=
=
(x, Py)
~
I~
(x, ~
(O,y)
" (y, ,X )X
1= 1
O.
"
y)
(~(x, "
1'1=
,)
=
,X )X
"(y,
- - ,x )(x,
1= 1
"
y)
,x )
(Px, y)
L(X,
Note that in view of Theorem 6.12.16, Definitions 3.7.12 and 7.5.19 are
consistent.
The proof of the next theorem is straightforward.
and let T
x{
X:
(1- P)x
.}x
P) is a projection onto lY ..
Exercise.
7.5.
(iv) P(Z)
(v) Q(Y )
;z
=
34 7
)X .
0;
0;
O
{ ;}
= O{ .J
7.5.26. Exercise.
and
Proof Assume that PIP Z = PZP I Then (PIP Z)* = PfN = PZP I = PIP Z;
i.e., if PIP Z = PZP I then (PIP Z)* = (P1P Z) Also, (PJPZP = PIPZPIP Z
= PIPIPZP Z = PIP Z; i.e., if PIP Z = PZP I , then PIP Z is idempotent. Therefore, PIP Z is an orthogonal projection.
Conversely, assume that PJP Z is an orthogonal projection. Then (PJP z )*
= PfN = PZP 1 and also (P1P Z)* = PJP z . Hence, P1P Z = PZP J .
Finally, we must show that the range of PI P z is eq u al to Y J (i Y z . Assume
that x E 6l(P IP z ). Then P1PZx = ,x because P J P z isan orthogonal projection.
Also, PIPZx = PI(PZx) E Y J , because any vector operated on by P J is in Y I '
Similarly, PZPlx = Pz(PJ)x
E Y z . Now, by hypothesis, P1P Z = PZP Io and
therefore PIPZx = PZPJx
= x E Y I (i Y z . Thus, whenever x E 6l(P IP z ),
then x E Y J (i Y z . This implies that 6l(P IP z ) c Y I (i Y z . To show that
6l(P IP z ) ::J Y I ( i Y z , assume that x E Y 1 (i Y z . Then PJPZx
= PJP{ )xz
= PIX = X E 6l(P IP z ). Thus, Y I (i Y z C 6l(P 1P z ). Therefore, 6l(P IP z )
= Y I (i Y z
7.5.28. Theorem. L e t
(i)
(ii)
(iii)
P::;;; Q;
II Px II < II Qxll
Y c: z;
(iv) QP =
(v) PQ =
P; and
P.
for all x
E X;
7. I iL near
ChJpz ter
34 8
Operators
(x , x )
(Px , Px )
IIQxll" ~
IIPxll" ~
IIQllllxll"
II x
II" =
(x , x ) ,
and let
7.5.30. Exercise.
R,
= [c~S
SID
0 - sin OJ
cos 0
=[
-
e"J.
By direct computation we
c~s 0
SID
sin OJ.
9 cos 9
It readily follows that R*R = RR* = I. Therefore, R is a linear transformation which is isometric, unitary, and normal. _
7.5.32. Exercise.
eL t
by y = PTx, where
=
X
y(t) =
34 9
0[ , 00) and define the truneation operator P T
{ X ( t)
R
< (P
T)
x{
E :X
x(t)
Additional examples
Section 7.10.
7.6.
THE
= x{
:X
(x t)
0 for t
> T},
SPECTRUM
OF
AN OPERATOR
.X
(i) R
< (T - AI) is dense in ;X
(ii) (T - .J I)-I exists; and
(iii) (T - .J I)-I is continuous (i.e., bounded)
is called the resolvent set of T and is denoted by p(T). The complement of
p(T) is called the spectrum of T and is denoted by q ( T).
The preceding definitions require some comments. First, note that if .J
is an eigenvalue of T, there is an x * - O such that (T - .J I)x = O. From
Theorem 3.4.32 this is true if and only if (T - AI) does not have an inverse.
eH nce, if .J is an eigenvalue of T, then ,t E (q T). Note, however, that there
04
are other ways that a complex number 1 may fail to be in p(T). These possi.
bilities are enumerated in the following definition.
7.6.3. Definition. The set of all eigenvalues of T is called the point spectrum
of T. The set of alll such that (T - l1)- 1 exists but Gl(T - l l) is not dense
in X is called the residual spectrum of T. The set of all 1 such that (T - 11)-1
exists and such that Gl(T - 11) is dense in X but (T - ll)- I is not continuous
is called the continuous spectrum. We denote these sets by pq ( T), Rq(T),
and Cq(T), respectively.
Clearly, q ( T) = Pq(T) U Cq(T) U Rq(T). Furthermore, when X is finite
dimensional, then q(T) = Pq(T). We summarize the preceding definition in
the following table.
AI)-1 exists and
is
continuous
(T (T -
< R (T- U )
R
< (T
-U)
*X
.11)-1
AI)-1 does
not exist
(T -
.11)-1 is
A e p(D
.Ie Ca(D
A e Pa(D
.Ie "RtT(T)
1 e RtT(T)
1 e PtT(T)
7.6.5. Example.
x = (~I'
~2"
..)
Tx
lx then (~
k:k =
pq ( T) = {
l )~k
=
0,
I. 2 . .. } .
Next, assume that 1 pq(T). so that (T - l1)- 1 exists. and let us inves
tigate the continuity of (T - 1 1)- 1 . We see that if y = (' I I. 1' 2.' ..) E Gl(T
- 11), then (T - l 1)- l y = x is given by
~
-.....!l.L_
k'lk .
..! . ._ l - I - l k
k-
Now if A.
0, then
II (T - A.I)-I y W=
.
k= 1
14
~
and (T -
k"11~
A.I)-I is not
(T -
*" 0, then
and
p(n
= P[ O'(T) u CO' ( nr
=
Y
and
T,x
(0,
6.11.9, let
X and the
p(T,)
p(T,)
= CO'(T,) = A{ .
CO'(T,)
RO'(T,)
PO'(T,)
= A{ .
PO'(T,)
RO'(T,)
C: IA.I >
= A{ . E
= 0.
I),
C: IA.I =
C: IA.I
I),
< I),
We now examine some of the properties of the resolvent set and the
spectrum.
7.6.7. Theorem. Let T E B(X, X). IflAI >
lently, if A E O'(n, then IA.I < II Til.
II Til, then A. E
14
Chapter 7
7.6.8. Exercise.
I iL near Operators
B(X,
X).
Proof. Since σ(T) is the complement of ρ(T), it is closed if and only if ρ(T) is open. Let λ₀ ∈ ρ(T). Then (T − λ₀I) has a continuous inverse. For arbitrary λ we now have
(T − λ₀I)⁻¹(T − λI) = I − (λ − λ₀)(T − λ₀I)⁻¹,
so that
‖I − (T − λ₀I)⁻¹(T − λI)‖ = |λ − λ₀| ‖(T − λ₀I)⁻¹‖ < 1
whenever |λ − λ₀| < 1/‖(T − λ₀I)⁻¹‖. Now in Theorem 7.2.2 we showed that if T ∈ B(X, X), then T has a continuous inverse if ‖I − T‖ < 1. In our case it now follows that (T − λ₀I)⁻¹(T − λI) has a continuous inverse, and therefore (T − λI) has a continuous inverse whenever |λ − λ₀| is sufficiently small. This implies that λ ∈ ρ(T) and ρ(T) is open. Hence, σ(T) is closed. ■
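Theorem 7.6.7 admits a simple numerical check in finite dimensions; the sketch below (NumPy assumed, with randomly generated matrices) compares the spectral radius max{|λ|: λ ∈ σ(T)} with the induced norm ‖T‖.

    import numpy as np

    rng = np.random.default_rng(0)
    for _ in range(5):
        T = rng.standard_normal((6, 6))
        spectral_radius = max(abs(np.linalg.eigvals(T)))   # max |lambda| over sigma(T)
        operator_norm = np.linalg.norm(T, 2)               # induced 2-norm ||T||
        assert spectral_radius <= operator_norm + 1e-12
        print(f"{spectral_radius:.4f} <= {operator_norm:.4f}")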
F o r normal, hermitian, and isometric operators we have the following
result.
7.6.10. Theorem. Let X be a Hilbert space, let T ∈ B(X, X), let λ be an eigenvalue of T, and let Tx = λx, x ≠ 0. Then
(i) if T is normal, λ̄ is an eigenvalue of T* and T*x = λ̄x;
(ii) if T is hermitian, λ is real;
(iii) if T is isometric, |λ| = 1; and
(iv) if T is normal, then eigenvectors corresponding to distinct eigenvalues of T are orthogonal.

Proof. To prove (i), let T be normal. Then (T − λI) is also normal, for
(T − λI)(T − λI)* = (T − λI)(T* − λ̄I) = TT* − λ̄T − λT* + λλ̄I
= T*T − λ̄T − λT* + λλ̄I = (T* − λ̄I)(T − λI) = (T − λI)*(T − λI).
Hence, ‖(T − λI)x‖ = ‖(T − λI)*x‖ = ‖(T* − λ̄I)x‖, and since (T − λI)x = 0 it follows that T*x = λ̄x; i.e., λ̄ is an eigenvalue of T*.
To prove (iv), let Tx = λx and Ty = μy, where λ ≠ μ. Using part (i), we obtain
(λ − μ)(x, y) = λ(x, y) − μ(x, y) = (Tx, y) − (x, T*y) = 0,
since (Tx, y) = (x, T*y) and T*y = μ̄y; i.e., (λ − μ)(x, y) = 0. Since λ − μ ≠ 0, it follows that (x, y) = 0. ■
The next two results indicate what happens to the spectrum of an operator
T when it is subjected to various elementary transformations.
7.6.11. Theorem. Let T ∈ B(X, X), and let p be a polynomial. Then
σ(p(T)) = p(σ(T)) = {p(λ): λ ∈ σ(T)}.

7.6.12. Exercise. Prove Theorem 7.6.11.

7.6.13. Theorem. Let T ∈ B(X, X), and assume that T⁻¹ ∈ B(X, X) exists. Let
[σ(T)]⁻¹ ≜ {1/λ: λ ∈ σ(T)}.
Then σ(T⁻¹) = [σ(T)]⁻¹.
Proof. Since T⁻¹ exists, 0 ∉ σ(T), and so the definition of [σ(T)]⁻¹ makes sense. Now for any λ ≠ 0, consider the identity
(T⁻¹ − λI) = −λT⁻¹(T − (1/λ)I).
It follows that (1/λ) ∈ ρ(T) implies that λ ∈ ρ(T⁻¹); i.e., λ ∈ σ(T⁻¹) implies that (1/λ) ∈ σ(T). In other words, σ(T⁻¹) ⊂ [σ(T)]⁻¹. To prove the reverse inclusion, interchange the roles of T and T⁻¹ in the above argument. ■
Let X be a Hilbert space, and let T ∈ B(X, X). A scalar λ is called an approximate eigenvalue of T if for every ε > 0 there exists an x ∈ X with ‖x‖ = 1 such that
‖(T − λI)x‖ < ε.
The set of all approximate eigenvalues of T is called the approximate point spectrum of T and is denoted by π(T). It can be shown that Pσ(T) ⊂ π(T) ⊂ σ(T) and that σ(T) = π(T) whenever T is normal.

7.6.18. Theorem. Let X be a Hilbert space, and let T ∈ B(X, X) be hermitian. Then
(i) every λ ∈ π(T) is real;
(ii) ‖T‖ = sup{|λ|: λ ∈ σ(T)}; and
(iii) either ‖T‖ or −‖T‖ belongs to π(T).

Proof. To prove (i), note that if T is hermitian it is normal and σ(T) = π(T). Let λ ∈ π(T), and assume that λ is complex, λ ≠ λ̄. Then for any x ≠ 0 we have
0 < |λ − λ̄| ‖x‖² = |((T − λI)x, x) − ((T − λ̄I)x, x)| ≤ 2|((T − λI)x, x)| ≤ 2‖(T − λI)x‖ ‖x‖;
i.e.,
0 < |λ − λ̄| ‖x‖ ≤ 2‖(T − λI)x‖
for all x ≠ 0. But this implies that λ ∉ π(T), contrary to the original assumption. Hence, it must follow that λ = λ̄, which implies that λ is real.
To prove (ii), first note that ‖T‖ ≥ sup{|λ|: λ ∈ σ(T)} for any T ∈ B(X, X) (see Theorem 7.6.7). To show that equality holds when T is hermitian, we first show that ‖T‖² ∈ π(T²) = σ(T²). For all real λ and all x ∈ X we can write
‖T²x − λ²x‖² = (T²x − λ²x, T²x − λ²x) = (T²x, T²x) − (T²x, λ²x) − (λ²x, T²x) + (λ²x, λ²x).
Since (T²x, x) = (Tx, T*x) = (Tx, Tx), it follows that
‖T²x − λ²x‖² = ‖T²x‖² − 2λ²‖Tx‖² + λ⁴‖x‖².   (7.6.19)
Now let λ = ‖T‖, and let {x_n} be a sequence in X with ‖x_n‖ = 1 such that ‖Tx_n‖ → ‖T‖ = λ as n → ∞. Since ‖T²x_n‖ ≤ λ‖Tx_n‖, it follows from Eq. (7.6.19) that
‖T²x_n − λ²x_n‖² ≤ λ²‖Tx_n‖² − 2λ²‖Tx_n‖² + λ⁴ = λ⁴ − λ²‖Tx_n‖² → 0
as n → ∞; i.e., λ² = ‖T‖² ∈ π(T²) = σ(T²). By Theorem 7.6.11, σ(T²) = {μ²: μ ∈ σ(T)}, and so either ‖T‖ ∈ σ(T) or −‖T‖ ∈ σ(T). This proves (ii) and, since σ(T) = π(T), also (iii). ■

7.6.20. Exercise.

7.6.21. Theorem. π(T) is closed.

7.6.22. Exercise.

Next, let T ∈ B(X, X) and λ ∈ C, and we let
𝔑_λ(T) = {x ∈ X: (T − λI)x = 0} = 𝔑(T − λI)   (7.6.23)
denote the null space of T − λI. If λ is an eigenvalue of T, then 𝔑_λ(T) is the eigenspace of T corresponding to λ.
7.6.24. Theorem. Let S, T ∈ B(X, X), and assume that ST = TS. Then 𝔑_λ(T) is invariant under S.

Proof. Let x ∈ 𝔑_λ(T). We want to show that Sx ∈ 𝔑_λ(T). Since x ∈ 𝔑_λ(T), we have Tx = λx. Thus, STx = λSx, and since TS = ST, it follows that T(Sx) = λ(Sx); i.e., Sx ∈ 𝔑_λ(T). ■

7.6.25. Corollary. 𝔑_λ(T) is invariant under T.

Proof. Since TT = TT, the conclusion follows from Theorem 7.6.24. ■

7.6.26. Theorem. Let X be a Hilbert space, let T ∈ B(X, X) be normal, and let λ, μ ∈ C. Then
(i) 𝔑_λ(T) = 𝔑_λ̄(T*);
(ii) 𝔑_λ(T)⊥ is invariant under T;
(iii) 𝔑_λ(T) reduces T; and
(iv) if λ ≠ μ, then 𝔑_λ(T) ⊥ 𝔑_μ(T).

Proof. Exercise.
Before considering the last result of this section, we make the following definition.

7.6.28. Definition. A family of closed linear subspaces in a Hilbert space X is said to be total if the only vector y ∈ X orthogonal to each member of the family is y = 0.
Proof. Since the family is total, 𝔐 = X; i.e., (TS − ST)x = 0 for all x ∈ X. Hence, TS = ST. ■

7.7. COMPLETELY CONTINUOUS OPERATORS
Throughout this section, X is a normed linear space over the field of complex numbers C.
Recall that a set Y ⊂ X is bounded if there is a constant k such that for all x ∈ Y we have ‖x‖ ≤ k. Also, recall that a set Y is relatively compact if each sequence {x_n} of elements chosen from Y contains a convergent subsequence (see Definition 5.6.30 and Theorem 5.6.31). When Y contains
only a finite number of elements then any sequence constructed from Y must
include some elements infinitely many times, and thus Y contains a convergent
subsequence. From this it follows that any set containing a finite number
of elements is relatively compact. Every relatively compact set is contained
in a compact set and hence is bounded. For the finite-dimensional case it is also true that every bounded set is relatively compact (e.g., in Rⁿ the Bolzano-Weierstrass theorem guarantees this). However, in the infinite-dimensional
case it does not follow that every bounded set is also relatively compact.
In analysis and in applications linear operators which transform bounded
sets into relatively compact sets are of great importance. Such operators are
called completely continuous operators or compact operators. We give the
following formal definition.
7.7.1. Definition. A linear transformation T of X into a normed linear space Y is said to be completely continuous (or compact) if for each bounded set A ⊂ X the image T(A) is relatively compact in Y.
7.7.5. Example. Let X = C[a, b], and let ‖·‖_∞ be the norm on C[a, b] as defined in Example 6.1.9. Let k: [a, b] × [a, b] → R be a real-valued function continuous on the square a ≤ s ≤ b, a ≤ t ≤ b. Defining T: X → X by
[Tx](s) = ∫_a^b k(s, t)x(t) dt
for all x ∈ X, we saw in Example 7.1.20 that T is a bounded linear operator. We now show that T is completely continuous.
Let {x_n} be a bounded sequence in X; i.e., there is a K > 0 such that ‖x_n‖_∞ ≤ K for all n. It readily follows that if y_n = Tx_n, then ‖y_n‖ ≤ γ₀‖x_n‖, where
γ₀ = sup_{a≤s≤b} ∫_a^b |k(s, t)| dt.
Since k is uniformly continuous on the square, for every ε > 0 there is a δ > 0 such that |k(s₁, t) − k(s₂, t)| < ε/[K(b − a)] whenever |s₁ − s₂| < δ. Hence,
|y_n(s₁) − y_n(s₂)| ≤ ∫_a^b |k(s₁, t) − k(s₂, t)| |x_n(t)| dt < ε
for all n and all s₁, s₂ such that |s₁ − s₂| < δ. This implies the set {y_n} is equicontinuous, and so by the Arzelà-Ascoli theorem (Theorem 5.8.12), the set {y_n} is relatively compact in C[a, b]; i.e., it has a convergent subsequence. This implies that T is completely continuous. ■
It can be shown that if X = L₂[a, b] and if T is the Fredholm operator defined in Example 7.3.11, then T is also a completely continuous operator.
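A discretization of Example 7.7.5 suggests how complete continuity appears computationally: the singular values of the discretized kernel operator decay quickly, so T is close to operators of finite rank. The sketch below is illustrative only; the kernel, grid, and crude quadrature weight are arbitrary choices, and NumPy is assumed.

    import numpy as np

    # Discretize [Tx](s) = integral over [0,1] of k(s,t) x(t) dt on a uniform grid.
    n = 200
    s = np.linspace(0.0, 1.0, n)
    k = np.exp(-np.abs(s[:, None] - s[None, :]))   # a sample continuous kernel
    T = k / n                                      # crude quadrature weight 1/n

    singular_values = np.linalg.svd(T, compute_uv=False)
    print(singular_values[:8])    # rapid decay: T is nearly of finite rank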
The next result provides us with an example of a continuous linear transformation which is not completely continuous.
7.7.6. Theorem. Let I ∈ B(X, X) denote the identity operator on X. Then I is completely continuous if and only if X is finite dimensional.
7.7.14. Example. A consequence of the above corollary is that if T ∈ B(X, X) is completely continuous and X is infinite dimensional, then T cannot be a bijective mapping of X onto X. For, suppose T were bijective. Then we would have T⁻¹T = I. By the Banach inverse theorem (see Theorem 7.2.6) T⁻¹ would then be continuous, and by the preceding theorem the identity mapping would be completely continuous. However, according to Theorem 7.7.6, this is possible only when X is finite dimensional.
Pursuing this example further, let X = C[a, b] with ‖·‖_∞ as defined in Example 6.1.9. Let T: X → X be defined by
[Tx](t) = ∫_a^t x(τ) dτ
for a ≤ t ≤ b. ■
The proof of the next result utilizes what is called the diagonalization process.
7.7.17. Theorem. Let X and Y be Banach spaces, and let {T_n} be a sequence of completely continuous operators mapping X into Y. If the sequence {T_n} converges in norm to an operator T, then T is completely continuous.
Proof. Let {x_n} be an arbitrary sequence in X with ‖x_n‖ ≤ 1. We must show that the sequence {Tx_n} contains a convergent subsequence.
By assumption, T₁ is a completely continuous operator, and thus we can select a convergent subsequence from the sequence {T₁x_n}. Proceeding in this manner and using the diagonalization process, we extract a subsequence {x_{n_m}} of {x_n} such that each of the operators T₁, T₂, T₃, ..., T_k, ... transforms this sequence into a convergent sequence. To show that T is completely continuous, we must show that T also transforms this sequence into a convergent sequence. Now
‖Tx_{n_m} − Tx_{n_p}‖ ≤ ‖Tx_{n_m} − T_k x_{n_m}‖ + ‖T_k x_{n_m} − T_k x_{n_p}‖ + ‖T_k x_{n_p} − Tx_{n_p}‖
≤ ‖T − T_k‖(‖x_{n_m}‖ + ‖x_{n_p}‖) + ‖T_k x_{n_m} − T_k x_{n_p}‖;
i.e.,
‖Tx_{n_m} − Tx_{n_p}‖ ≤ 2‖T − T_k‖ + ‖T_k x_{n_m} − T_k x_{n_p}‖.
Given ε > 0, first choose k so large that ‖T − T_k‖ < ε/3, and then choose m and p so large that ‖T_k x_{n_m} − T_k x_{n_p}‖ < ε/3. Then ‖Tx_{n_m} − Tx_{n_p}‖ < ε; i.e., {Tx_{n_m}} is a Cauchy sequence. Since Y is complete, {Tx_{n_m}} converges, and so T is completely continuous. ■
7.7.20. Theorem. Let X be a Hilbert space, and let T ∈ B(X, X) be completely continuous. If λ ≠ 0, then
𝔑_λ(T) = {x ∈ X: Tx = λx}
is finite dimensional.
Proof. The proof is by contradiction. Assume that 𝔑_λ(T) is not finite dimensional. Then there is an orthonormal infinite sequence x₁, x₂, ..., x_n, ... in 𝔑_λ(T), and for n ≠ m,
‖Tx_n − Tx_m‖² = ‖λx_n − λx_m‖² = |λ|²(‖x_n‖² + ‖x_m‖²) = 2|λ|² ≠ 0;
i.e., ‖Tx_n − Tx_m‖ = √2 |λ| > 0. Hence, no subsequence of {Tx_n} can be a Cauchy sequence, and hence no subsequence of {Tx_n} can converge. This contradicts the complete continuity of T and completes the proof. ■
In the next result, π(T) denotes the approximate point spectrum of T.
7.7.22. Theorem. Let X be a Hilbert space, and let T ∈ B(X, X) be completely continuous. If λ ∈ π(T) and λ ≠ 0, then λ is an eigenvalue of T.

Proof. Let {x_n} be a sequence in X with ‖x_n‖ = 1 such that ‖Tx_n − λx_n‖ → 0 as n → ∞. Since T is completely continuous, there is a subsequence {x_{n_k}} such that {Tx_{n_k}} converges. Since ‖Tx_{n_k} − λx_{n_k}‖ → 0, the sequence {λx_{n_k}} converges as well; because λ ≠ 0, x_{n_k} → y for some y with ‖y‖ = 1. Then
Ty = T(lim x_{n_k}) = lim Tx_{n_k} = lim λx_{n_k} = λy;
i.e., λ is an eigenvalue of T. ■
7.7.23. Exercise. Let T ∈ B(X, X) be completely continuous. Show that if λ ∈ σ(T) and λ ≠ 0, then λ is an eigenvalue of T.
7.7.24. Theorem. Let X be a Hilbert space, and let T ∈ B(X, X). If T is completely continuous and hermitian, then T has an eigenvalue λ with |λ| = ‖T‖.

Proof. The proof follows directly from part (iii) of Theorem 7.6.18 and Theorem 7.7.22. ■
7.7.25. Theorem. Let X be a Hilbert space, and let T ∈ B(X, X). If T is normal and completely continuous, then T has at least one eigenvalue.

Proof. If T = 0, then λ = 0 clearly satisfies the conclusion of the theorem. So let us assume that T ≠ 0. Also, if T = T*, the conclusion of the theorem follows from Theorem 7.7.24. So let us assume that T ≠ T*. Let U = ½(T + T*) and V = (1/2i)(T − T*). Then U and V are hermitian and completely continuous, UV = VU since T is normal, and T = U + iV. By Theorem 7.7.24, U has an eigenvalue λ₁, and by Theorem 7.6.24 the eigenspace 𝔑_{λ₁}(U) is invariant under V. The restriction of V to 𝔑_{λ₁}(U) is hermitian and completely continuous, and so by Theorem 7.7.24 it has an eigenvector x₀ with Vx₀ = λ₂x₀. Then x₀ is a common eigenvector of U and V, and Tx₀ = (λ₁ + iλ₂)x₀. ■
7.8. THE SPECTRAL THEOREM FOR COMPLETELY CONTINUOUS NORMAL OPERATORS

Throughout this section, X is a complex Hilbert space and T ∈ B(X, X) is completely continuous and normal. For ε > 0, let
A_ε = {λ ∈ C: ε ≤ |λ| ≤ ‖T‖}.

7.8.1. Theorem. For every ε > 0, the annulus A_ε contains at most a finite number of eigenvalues of T.

Proof. To the contrary, let us assume that for some ε > 0 the annulus A_ε contains an infinite number of eigenvalues. By the Bolzano-Weierstrass theorem, there is a point of accumulation λ₀ of the eigenvalues in the annulus A_ε. Let {λ_n} be a sequence of distinct eigenvalues such that λ_n → λ₀ as n → ∞, and let Tx_n = λ_n x_n, ‖x_n‖ = 1. Since T is a completely continuous operator, there is a subsequence {x_{n_k}} of {x_n} for which the sequence {Tx_{n_k}} converges to an element u ∈ X; i.e., Tx_{n_k} → u as n_k → ∞. Thus, since Tx_{n_k} = λ_{n_k} x_{n_k}, we have λ_{n_k} x_{n_k} → u. But 1/λ_{n_k} → 1/λ₀, because λ₀ ≠ 0. Therefore, x_{n_k} → (1/λ₀)u. But the x_{n_k} are distinct eigenvectors corresponding to distinct eigenvalues. By part (iv) of Theorem 7.6.10, {x_{n_k}} is an orthonormal sequence, and so ‖x_{n_k} − x_{n_j}‖² = 2; thus {x_{n_k}} cannot be a Cauchy sequence. Yet, it is convergent by assumption; i.e., we have arrived at a contradiction. Therefore, our initial assumption is false and the theorem is proved. ■
7.8.3. Exercise.

7.8.4. Theorem. Let {λ_n} denote the distinct eigenvalues of T (including, possibly, λ = 0), and let 𝔐_n = 𝔑_{λ_n}(T). Then the family {𝔐_n} is total.

Proof. Let Y denote the union of the 𝔐_n, and let N denote the set of all vectors orthogonal to every 𝔐_n. Suppose that N ≠ {0}. By Theorem 6.12.6, N is a closed linear subspace of X. We will show first that Y is invariant under T*. Let x ∈ Y. Then x ∈ 𝔐_n for some n and Tx = λ_n x. Now λ_n(T*x) = T*(λ_n x) = T*Tx = T(T*x); i.e., T(T*x) = λ_n(T*x), and so T*x ∈ 𝔐_n, which implies T*x ∈ Y. Therefore, Y is invariant under T*. From Theorem 7.3.15 it follows that Y⊥ is invariant under T. Hence, N is an invariant closed linear subspace under T. It follows from Theorems 7.7.8 and 7.5.6 that if T₁ is the restriction of T to N, then T₁ ∈ B(N, N) and T₁ is completely continuous and normal. Now, since N ≠ {0}, by Theorem 7.7.25 there is a non-zero x ∈ N and a λ ∈ C such that T₁x = λx. But if this is so, λ is an eigenvalue of T, and it follows that x ∈ 𝔐_n for some n. Hence, x ∈ N ∩ Y, which is impossible unless x = 0. This contradiction completes the proof. ■
In proving an alternate form of the spectral theorem, we require the following result.
7.8.5. Theorem. Let {N_k} be a sequence of orthogonal closed linear subspaces of X; i.e., N_k ⊥ N_j for all j ≠ k. Then the following statements are equivalent:
(i) {N_k} is a total family;
(ii) X is the smallest closed linear subspace which contains every N_k; and
(iii) every x ∈ X can be represented uniquely in the form
x = Σ_{k=1}^∞ x_k,
where x_k ∈ N_k for each k and Σ_{k=1}^∞ ‖x_k‖² < ∞.

Proof. To prove that (i) implies (ii), let Y be the smallest closed linear subspace which contains every N_k. If Y ≠ X, then there is a non-zero y ∈ Y⊥; but such a y is orthogonal to every N_k, contradicting (i). Hence, Y = X.
To prove that (ii) implies (iii), let x ∈ X, and let x_k denote the orthogonal projection of x onto N_k. Since the N_k are mutually orthogonal, Σ_{k=1}^n ‖x_k‖² ≤ ‖x‖² for every n, so that Σ_{k=1}^∞ ‖x_k‖² < ∞ and the series Σ_{k=1}^∞ x_k converges to an element x₀ ∈ X. For fixed j and arbitrary y ∈ N_j,
(x − x₀, y) = (x, y) − Σ_{k=1}^∞ (x_k, y) = (x, y) − (x_j, y) = 0.
Thus, x − x₀ is orthogonal to every N_k and hence to Y = X, so that x = x₀ = Σ_{k=1}^∞ x_k. To establish uniqueness, suppose that x = Σ_{k=1}^∞ x_k = Σ_{k=1}^∞ x_k′, where x_k, x_k′ ∈ N_k for all k. Since (x_j − x_j′) ⊥ (x_k − x_k′) for j ≠ k, we have
Σ_{k=1}^∞ ‖x_k − x_k′‖² = ‖Σ_{k=1}^∞ (x_k − x_k′)‖² = ‖x − x‖² = 0.
Thus, x_k = x_k′ for each k.
To prove that (iii) implies (i), assume that x ⊥ N_k for every k. By hypothesis, x = Σ_{k=1}^∞ x_k, where x_k ∈ N_k for each k. Then
‖x‖² = (x, x) = Σ_{k=1}^∞ (x, x_k) = 0,
and so {N_k} is a total family. This completes the proof. ■
If every x ∈ X can be represented uniquely as x = Σ_k x_k, where x_k ∈ Y_k for every k, then we say V({Y_k}) is the direct sum of {Y_k}. In this case we write
X = Y₁ ⊕ Y₂ ⊕ ⋯.

7.8.7. Theorem (Spectral Theorem). Let {λ_j} denote the distinct eigenvalues of T, let 𝔐_j = 𝔑_{λ_j}(T), and let P_j denote the orthogonal projection on 𝔐_j. Then
(i) P_j P_k = 0 whenever j ≠ k;
(ii) P_j T = TP_j for every j;
(iii) Σ_j P_j = I; and
(iv) T = Σ_j λ_j P_j.

Proof. The proof of each part follows readily from results already obtained. We simply indicate the principal results needed and leave the details as an exercise.
Part (i) follows from the definition of orthogonal projection. Part (ii) follows from part (ii) of Theorem 7.6.26. Parts (iii) and (iv) follow from Theorems 7.1.27 and 7.8.5. ■

7.8.8. Exercise.
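The content of the spectral theorem above can be exhibited concretely for a hermitian matrix; the following Python sketch (NumPy assumed; the sample matrix is arbitrary) assembles the eigenprojections P_j and verifies the resolution of the identity Σ_j P_j = I and the expansion T = Σ_j λ_j P_j.

    import numpy as np

    T = np.array([[2.0, 1.0, 0.0],
                  [1.0, 2.0, 0.0],
                  [0.0, 0.0, 5.0]])      # hermitian (symmetric) operator on R^3

    lams, V = np.linalg.eigh(T)          # orthonormal eigenvectors in columns of V
    projections = {}
    for lam, v in zip(lams.round(12), V.T):
        P = np.outer(v, v)               # rank-one projection onto span{v}
        projections[lam] = projections.get(lam, 0) + P   # accumulate per eigenspace

    P_sum = sum(projections.values())
    T_rebuilt = sum(lam * P for lam, P in projections.items())
    assert np.allclose(P_sum, np.eye(3))   # resolution of the identity
    assert np.allclose(T_rebuilt, T)       # T = sum_j lambda_j P_j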
7.9. DIFFERENTIATION OF OPERATORS

Throughout this section, X and Y are normed linear spaces over the field F, and f is a function with domain D ⊂ X and range in Y.

7.9.1. Definition. Let x₀ ∈ X be a fixed element. If the limit
δf(x₀, h) = lim_{t→0} (1/t)[f(x₀ + th) − f(x₀)]   (7.9.2)
exists (where t ∈ F) for all h ∈ X, then f is said to be Gateaux differentiable at x₀, and δf(x₀, h) is called the Gateaux differential of f at x₀ with increment h.

The Gateaux differential of f is sometimes also called the weak differential of f or the G-differential of f. If f is Gateaux differentiable at x₀, then δf(x₀, h) need not be linear nor continuous as a function of h ∈ X. However, we shall primarily be concerned with functions f: X → Y which have these properties. This gives rise to the following concept.
7.9.3. Definition. Let x₀ ∈ X be a fixed element, and let f: X → Y. If there exists a bounded linear operator F(x₀) ∈ B(X, Y) such that
lim_{‖h‖→0} ‖f(x₀ + h) − f(x₀) − F(x₀)h‖ / ‖h‖ = 0,
then f is said to be Fréchet differentiable at x₀. In this case we write f′(x₀) = F(x₀), and f′(x₀) is called the Fréchet derivative of f at x₀.
7.9.4. Theorem. Let f: X → Y, and let x₀ ∈ X be a fixed element. If f is Fréchet differentiable at x₀, then f is Gateaux differentiable at x₀, and furthermore the Gateaux differential is given by
δf(x₀, h) = f′(x₀)h.
Proof. Let F(x₀) = f′(x₀), and let ε > 0. Then there is a δ > 0 such that
‖f(x₀ + th) − f(x₀) − F(x₀)th‖ < ε|t| ‖h‖
provided that |t| ‖h‖ < δ. Hence,
‖(1/t)[f(x₀ + th) − f(x₀)] − F(x₀)h‖ < ε‖h‖
provided that |t| < δ/‖h‖. Therefore, f is Gateaux differentiable at x₀ and
δf(x₀, h) = F(x₀)h = f′(x₀)h. ■

If f is Fréchet differentiable at x₀, the quantity f′(x₀)h is also called the Fréchet differential of f at x₀ with increment h.
7.9.5. Example. Let X be a Hilbert space, and let f be a functional defined on X; i.e., f: X → F. If f has a Fréchet derivative at some x₀ ∈ X, then that derivative must be a bounded linear functional on X; i.e., f′(x₀) ∈ X*. In view of Theorem 6.14.2, there is an element y₀ ∈ X such that f′(x₀)h = (h, y₀) for each h ∈ X. Although f′(x₀) ∈ X* and y₀ ∈ X, we know by Exercise 6.14.4 that X and X* are congruent and thus isometric. It is customary to view the corresponding elements of isometric spaces as being one and the same element. With this in mind, we say f′(x₀) = y₀, and we call f′(x₀) the gradient of f at x₀. ■
As a special case of the preceding example, we consider the following specific situation.
7.9.6. Example. Let X = Rⁿ, and let ‖·‖ be any norm on X. By Theorem 6.6.5, X is a Banach space. Now let f be a functional defined on X; i.e., f: X → R. Let x = (ξ₁, ..., ξ_n) ∈ X and h = (h₁, ..., h_n) ∈ X. If f has continuous partial derivatives with respect to ξ_i, i = 1, ..., n, then the Fréchet differential of f is given by
δf(x, h) = (∂f(x)/∂ξ₁)h₁ + ⋯ + (∂f(x)/∂ξ_n)h_n = Σ_{i=1}^n (∂f(x)/∂ξ_i)h_i,
and for fixed x₀ ∈ X the Fréchet derivative f′(x₀) is the bounded linear functional defined by f′(x₀)h = Σ_{i=1}^n (∂f(x₀)/∂ξ_i)h_i for h ∈ X. Hence, the gradient of f is given by
f′(x) = (∂f(x)/∂ξ₁, ..., ∂f(x)/∂ξ_n).   (7.9.7) ■
7.9.8. Example. Let X be a real Hilbert space, let L ∈ B(X, X), let v ∈ X be a fixed element, and define the functional f: X → R by
f(x) = ‖v − Lx‖²
for all x ∈ X. Expanding, we have
f(x) = (v − Lx, v − Lx) = (v, v) − 2(L*v, x) + (x, L*Lx).
For h ∈ X,
f(x + h) − f(x) = 2(h, L*Lx) − 2(h, L*v) + (Lh, Lh) = (h, 2L*Lx − 2L*v) + ‖Lh‖²,
and since ‖Lh‖² ≤ ‖L‖² ‖h‖², it follows that
|f(x + h) − f(x) − (h, 2L*Lx − 2L*v)| / ‖h‖ → 0 as ‖h‖ → 0.
Hence, f is Fréchet differentiable, and its gradient is given by
f′(x) = 2L*Lx − 2L*v. ■
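The gradient formula of this example is easy to validate numerically; in the sketch below (NumPy assumed; L, v, x, and h are random samples), the difference quotient approximating δf(x, h) is compared with (h, f′(x)) for f′(x) = 2L*Lx − 2L*v.

    import numpy as np

    rng = np.random.default_rng(1)
    L = rng.standard_normal((5, 4))
    v = rng.standard_normal(5)
    x = rng.standard_normal(4)

    f = lambda z: np.linalg.norm(v - L @ z) ** 2
    grad = 2 * L.T @ (L @ x) - 2 * L.T @ v     # f'(x) = 2L*Lx - 2L*v

    # Gateaux differential delta f(x, h) = (h, f'(x)) via a difference quotient.
    h = rng.standard_normal(4)
    t = 1e-6
    print((f(x + t * h) - f(x)) / t, h @ grad)  # the two numbers agree closely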
In the next example we consider the Fréchet differential of a function f: Rⁿ → R^m.
7.9.10. Example. Let X = Rⁿ, and let Y = R^m. Since X and Y are finite dimensional, we may assume arbitrary norms on each of these spaces, and they will both be Banach spaces. Let f: X → Y. For x = (ξ₁, ..., ξ_n) ∈ X, let us write
f(x) = (f₁(ξ₁, ..., ξ_n), ..., f_m(ξ₁, ..., ξ_n)).
For x₀ ∈ X, assume that the partial derivatives ∂f_i(x₀)/∂ξ_j exist and are continuous for i = 1, ..., m and j = 1, ..., n. The Fréchet differential of f at x₀ with increment h = (h₁, ..., h_n) ∈ X is given by
δf(x₀, h) = J(x₀)h,
where J(x₀) = [∂f_i(x₀)/∂ξ_j] is the m × n Jacobian matrix of f at x₀; i.e., the Fréchet derivative f′(x₀) is represented by the Jacobian matrix of f at x₀. ■
7.9.11. Example. Let X = C[a, b], the family of real-valued continuous functions defined on [a, b], and let {X; ‖·‖_∞} be the Banach space given in Example 6.1.9. Let k(s, t) be a real-valued function defined and continuous on [a, b] × [a, b], and let g(t, x) be a real-valued function such that g(t, x) and ∂g(t, x)/∂x are defined and continuous for t ∈ [a, b] and x ∈ R. Let f: X → X be defined by
[f(x)](s) = ∫_a^b k(s, t) g(t, x(t)) dt.
For fixed x₀ ∈ X, the Fréchet differential of f is given by
[δf(x₀, h)](s) = ∫_a^b k(s, t) [∂g(t, x₀(t))/∂x] h(t) dt. ■
7.9.12. Exercise. Let f, g: X → Y be Fréchet differentiable at x₀ ∈ X, and let α ∈ F. Show that f + g and αf are Fréchet differentiable at x₀, with (f + g)′(x₀) = f′(x₀) + g′(x₀) and (αf)′(x₀) = αf′(x₀).
The composite function φ = f ∘ g of Fréchet differentiable functions is again Fréchet differentiable, with φ′(x) = f′(g(x))g′(x). To see this, let y = g(x), and let d = g(x + h) − g(x). Then
φ(x + h) − φ(x) − f′(y)g′(x)h = [f(y + d) − f(y) − f′(y)d] + f′(y)[d − g′(x)h].
Thus, given ε > 0 there is a δ > 0 such that ‖d‖ < δ and ‖h‖ < δ imply
‖f(y + d) − f(y) − f′(y)d‖ ≤ ε‖d‖ and ‖d − g′(x)h‖ = ‖g(x + h) − g(x) − g′(x)h‖ ≤ ε‖h‖.
By the continuity of g (see the proof of part (i) of Theorem 7.9.13), it follows that ‖d‖ ≤ M‖h‖ for some constant M. Hence, there is a constant k such that
‖φ(x + h) − φ(x) − f′(y)g′(x)h‖ ≤ kε‖h‖,
which establishes the assertion. ■
In particular, if f: Rⁿ → R^m is the linear function f(x) = Ax, where A = [a_ij] is an m × n matrix, then f is Fréchet differentiable everywhere and f′(x) = A; i.e., A is the matrix representation of the Fréchet derivative of f.
Next, let f: X → Y be Fréchet differentiable, and let x, h ∈ X. By the Hahn-Banach theorem there is a φ ∈ Y* with ‖φ‖ = 1 such that φ(f(x + h) − f(x)) = ‖f(x + h) − f(x)‖. Since
|φ(f(x + h)) − φ(f(x))| ≤ ‖φ‖ sup_{0<t<1} ‖f′(x + th)‖ ‖h‖,
it follows that
‖f(x + h) − f(x)‖ ≤ sup_{0<t<1} ‖f′(x + th)‖ ‖h‖.

7.9.20. Exercise. Show that if, in addition, f′ is Fréchet differentiable on X, then
‖f(x + h) − f(x) − f′(x)h‖ ≤ ½ N ‖h‖²,
where N = sup_{0<t<1} ‖f″(x + th)‖.
We conclude the present section by showing that the Gateaux and Fréchet differentials play a role in maximizing and minimizing functionals which is similar to that of the ordinary derivative of functions of real variables.
Let F = R, and let f be a functional on X; i.e., f: X → R. Clearly, for fixed x₀, h ∈ X, we may define a function g: R → R by the relation g(t) = f(x₀ + th) for all t ∈ R. In this case, if f is Gateaux differentiable at x₀, we see that δf(x₀, h) = g′(t)|_{t=0}, where g′(t) is the usual derivative of g(t). We will need this property in proving our next result, Theorem 7.9.22. First, however, we require the following important concept.

7.9.21. Definition. A functional f: X → R is said to have a local extremum at x₀ ∈ X if there is an open sphere S(x₀; r) such that either f(x) ≤ f(x₀) for all x ∈ S(x₀; r) or f(x) ≥ f(x₀) for all x ∈ S(x₀; r).

7.9.22. Theorem. Let f: X → R be a functional which is Gateaux differentiable at x₀ ∈ X. If f has a local extremum at x₀, then δf(x₀, h) = 0 for all h ∈ X.

Proof. As pointed out in the remark preceding Definition 7.9.21, the real-valued function g(t) = f(x₀ + th) must have an extremum at t = 0. From the ordinary calculus we must have g′(t)|_{t=0} = 0. Hence, δf(x₀, h) = 0 for all h ∈ X. ■
We leave the proof of the next result as an exercise.

7.9.23. Theorem. Let X be a real Hilbert space, let L ∈ B(X, X), let v ∈ X, and let f(x) = ‖v − Lx‖² for x ∈ X. If f has a local extremum at x₀ ∈ X, then
L*Lx₀ = L*v.
7.10. SOME APPLICATIONS

A. Applications to Integral Equations

Let X = L₂[a, b], and let T: X → X be the completely continuous operator defined by
[Tx](s) = ∫_a^b k(s, t)x(t) dt   (7.10.1)
(see Example 7.3.11). We assume that T is normal, and we consider the integral equation
Tx − λx = y,   (7.10.2)
where y ∈ X and λ ≠ 0 are given and x ∈ X is to be found.
7.10.3. Theorem. Let {λ_k} denote the non-zero distinct eigenvalues of T, let P_k denote the orthogonal projection on 𝔑_{λ_k}(T), and let P₀ denote the orthogonal projection on 𝔑(T). If λ ≠ 0 is not an eigenvalue of T, then Eq. (7.10.2) has a unique solution x for every y ∈ X, and this solution is given by
x = −(1/λ)P₀y + Σ_{k=1}^∞ P_k y/(λ_k − λ).   (7.10.4)

Proof. Since λ ≠ 0 is not an eigenvalue of T and λ_k → 0 as k → ∞, there is a d > 0 such that 1/|λ| ≤ d and 1/|λ_k − λ| ≤ d for all k. Hence,
‖−(1/λ)P₀y + Σ_{k=1}^n P_k y/(λ_k − λ)‖² = (1/|λ|²)‖P₀y‖² + Σ_{k=1}^n ‖P_k y‖²/|λ_k − λ|²
≤ d²[‖P₀y‖² + Σ_{k=1}^n ‖P_k y‖²] = d²‖P₀y + Σ_{k=1}^n P_k y‖² ≤ d²‖y‖²,
and so the series in Eq. (7.10.4) converges to an element x ∈ X.
Now let x be given by Eq. (7.10.4). Then P₀x = −(1/λ)P₀y and P_j x = P_j y/(λ_j − λ) for j = 1, 2, .... Thus, P₀y = −λP₀x and P_j y = λ_j P_j x − λP_j x. From the spectral theorem we have
Tx = Σ_{j=1}^∞ λ_j P_j x,
and therefore
y = P₀y + Σ_{j=1}^∞ P_j y = −λP₀x + Σ_{j=1}^∞ λ_j P_j x − λ Σ_{j=1}^∞ P_j x = Tx − λx;
i.e., x given by Eq. (7.10.4) satisfies Eq. (7.10.2).
Finally, to show that x given by Eq. (7.10.4) is unique, let x and z be such that Tx − λx = Tz − λz = y. Then it follows that T(x − z) − λ(x − z) = y − y = 0. Hence, T(x − z) = λ(x − z). Since λ is by assumption not an eigenvalue of T, we must have x − z = 0. This completes the proof. ■

When λ is a non-zero eigenvalue of T, we have the following result.

7.10.5. Theorem. Let λ = λ_J be a non-zero eigenvalue of T. Then Eq. (7.10.2) has a solution x if and only if P_J y = 0, in which case every solution is of the form
x = x₀ − (1/λ)P₀y + Σ_{k≠J} P_k y/(λ_k − λ),   (7.10.6)
where x₀ is an arbitrary element of 𝔐_J = 𝔑_{λ_J}(T).
Proof. We first observe that 𝔐_J reduces T by part (iii) of Theorem 7.6.26. It therefore follows from part (ii) of Theorem 7.5.22 that TP_J = P_J T. Now suppose that y is such that Eq. (7.10.2) is satisfied for some x ∈ X. Then it follows that
P_J y = P_J(Tx − λ_J x) = TP_J x − λ_J P_J x = λ_J P_J x − λ_J P_J x = 0.
In the preceding, we used the fact that Tx = λ_J x for x ∈ 𝔐_J and P_J x ∈ 𝔐_J for all x ∈ X. Hence, P_J y = 0.
Conversely, suppose that P_J y = 0, and let x be given by Eq. (7.10.6). The proof that x satisfies Eq. (7.10.2) follows along the same lines as the proof of Theorem 7.10.3, and the details are left as an exercise. The non-uniqueness of the solution is apparent, since (T − λI)x₀ = 0 for any x₀ ∈ 𝔐_J. ■
7.10.7. Exercise. Show that x given by Eq. (7.10.6) satisfies Eq. (7.10.2).
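In finite dimensions, formula (7.10.4) reduces to an expansion in eigenprojections of a normal matrix; the sketch below (NumPy assumed; the data are random samples) solves Tx − λx = y in this way and verifies the result against the defining equation.

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((4, 4))
    T = A + A.T                    # symmetric, hence normal
    y = rng.standard_normal(4)
    lam = 0.37                     # assumed not to be an eigenvalue of T

    lams, V = np.linalg.eigh(T)
    # x = sum_k (v_k, y) v_k / (lambda_k - lam), the analogue of Eq. (7.10.4)
    x = sum((V[:, k] @ y) / (lams[k] - lam) * V[:, k] for k in range(4))
    assert np.allclose(T @ x - lam * x, y)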
B. An Example from Optimal Control

Consider the system of first-order ordinary differential equations
ẋ(t) = Ax(t) + Bu(t),   (7.10.8)
where x(0) is given. The solution of Eq. (7.10.8) is
x(t) = Φ(t, 0)x(0) + ∫_0^t Φ(t, τ)Bu(τ) dτ,   (7.10.9)
where Φ(t, τ) is the state transition matrix for the system of equations given in Eq. (7.10.8).
Let us now define the class of vector-valued functions L₂^m[0, T] by
L₂^m[0, T] = {u: uᵀ = (u₁, ..., u_m), where u_i ∈ L₂[0, T], i = 1, ..., m}.
If we define the inner product
(u, v) = ∫_0^T uᵀ(t)v(t) dt
for u, v ∈ L₂^m[0, T], then it follows that L₂^m[0, T] is a Hilbert space (see Example 6.11.11). Next, let us define the linear operator L: L₂^m[0, T] → L₂ⁿ[0, T] by
[Lu](t) = ∫_0^t Φ(t, τ)Bu(τ) dτ   (7.10.10)
for all u ∈ L₂^m[0, T]. Since the elements of Φ(t, τ) are continuous functions on [0, T] × [0, T], it follows that L is completely continuous.
Now recall from Exercise 5.10.59 that Eq. (7.10.9) is the unique solution to Eq. (7.10.8) when the elements of the vector u(t) are continuous functions of t. It can be shown that the solution of Eq. (7.10.8) exists in an extended sense if we permit u ∈ L₂^m[0, T]. Allowing for this generalization, we can now consider the following optimal control problem. Let γ ∈ R be such that γ > 0, and let f be the real-valued functional defined on L₂^m[0, T] given by
f(u) = ∫_0^T xᵀ(t)x(t) dt + γ∫_0^T uᵀ(t)u(t) dt,   (7.10.11)
where x is the solution of Eq. (7.10.8) corresponding to the control u. The problem is to find the u ∈ L₂^m[0, T] which minimizes f(u).
Let v(t) = −Φ(t, 0)x(0) for 0 ≤ t ≤ T. Then x = Lu − v, and we may write
f(u) = ‖Lu − v‖² + γ‖u‖².
This problem is a special case of the following general result. Let X and Y be Hilbert spaces, let L: X → Y be a completely continuous operator, and let L* denote the adjoint of L. Let v be a given fixed element in Y, let γ ∈ R, and define the functional f: X → R by
f(u) = ‖Lu − v‖² + γ‖u‖²   (7.10.13)
for u ∈ X. (In Eq. (7.10.13) we use the norm induced by the inner product, and note that ‖u‖ is the norm of u ∈ X, while ‖Lu − v‖ is the norm of (Lu − v) ∈ Y.) If in Eq. (7.10.13), γ > 0, then there exists a unique u₀ ∈ X such that f(u₀) ≤ f(u) for all u ∈ X. Furthermore, u₀ is the solution to the equation
L*Lu₀ + γu₀ = L*v.   (7.10.14)
Proof. Let u₀ satisfy Eq. (7.10.14), and let h ∈ X. Then
f(u₀ + h) = (Lu₀ + Lh − v, Lu₀ + Lh − v) + γ(u₀ + h, u₀ + h)
= (Lu₀ − v, Lu₀ − v) + 2(Lh, Lu₀ − v) + (Lh, Lh) + γ(u₀, u₀) + 2γ(u₀, h) + γ(h, h)
= ‖Lu₀ − v‖² + γ‖u₀‖² + 2(h, L*Lu₀ + γu₀ − L*v) + ‖Lh‖² + γ‖h‖²
= f(u₀) + ‖Lh‖² + γ‖h‖².
Therefore, f(u₀ + h) > f(u₀) for all h ≠ 0; i.e., u₀ is the unique minimizing element. ■
The solution to Eq. (7.10.14) can be obtained from Eq. (7.10.4); however, a more convenient method is available for finding the solution when L is given by Eq. (7.10.10). This is summarized in the following result.
7.10.15. Theorem. The minimizing control u₀ for the functional f given in Eq. (7.10.11) is
u₀(t) = −(1/γ)BᵀP(t)x(t), 0 ≤ t ≤ T,
where P(t) is the real n × n matrix satisfying the (Riccati) equation
Ṗ(t) = −AᵀP(t) − P(t)A + (1/γ)P(t)BBᵀP(t) − I   (7.10.16)
with P(T) = 0.
Proof. From Eq. (7.10.14), u = −(1/γ)L*(Lu − v) = −(1/γ)L*x, where Lu − v = x and where L is given by Eq. (7.10.10). To determine L*, we compute, for arbitrary w ∈ L₂ⁿ[0, T],
(w, Lu) = ∫_0^T wᵀ(s)[∫_0^s Φ(s, t)Bu(t) dt] ds = ∫_0^T uᵀ(t)[∫_t^T BᵀΦᵀ(s, t)w(s) ds] dt,
so that
[L*w](t) = ∫_t^T BᵀΦᵀ(s, t)w(s) ds.
Thus, u must satisfy
u(t) = −(1/γ)∫_t^T BᵀΦᵀ(s, t)x(s) ds = −(1/γ)Bᵀ∫_t^T Φᵀ(s, t)x(s) ds.
We now seek a matrix P(t) such that
P(t)x(t) = ∫_t^T Φᵀ(s, t)x(s) ds.   (7.10.17)
We now find conditions for such a matrix P(t) to exist. First, we see that P(T) = 0. Next, differentiating both sides of Eq. (7.10.17) with respect to t, and noting that (∂/∂t)Φᵀ(s, t) = −AᵀΦᵀ(s, t), we have
Ṗ(t)x(t) + P(t)ẋ(t) = −x(t) − Aᵀ∫_t^T Φᵀ(s, t)x(s) ds = −x(t) − AᵀP(t)x(t).
Therefore,
Ṗ(t)x(t) = −x(t) − AᵀP(t)x(t) − P(t)ẋ(t).
But
ẋ(t) = Ax(t) + Bu(t) and u(t) = −(1/γ)BᵀP(t)x(t),
so that
P(t)ẋ(t) = P(t)Ax(t) − (1/γ)P(t)BBᵀP(t)x(t).
Hence, P(t) must satisfy
Ṗ(t) = −AᵀP(t) − P(t)A + (1/γ)P(t)BBᵀP(t) − I
with P(T) = 0. This completes the proof. ■
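Eq. (7.10.16) can be integrated backward in time from the terminal condition P(T) = 0 with any standard ODE routine; the following sketch (SciPy and NumPy assumed; A, B, γ, the horizon, and x(0) are arbitrary sample data) computes P(t) and the resulting feedback control u(0) = −(1/γ)BᵀP(0)x(0).

    import numpy as np
    from scipy.integrate import solve_ivp

    A = np.array([[0.0, 1.0], [0.0, 0.0]])   # sample double-integrator dynamics
    B = np.array([[0.0], [1.0]])
    gamma, T_final = 1.0, 5.0

    def riccati(t, p):
        P = p.reshape(2, 2)
        dP = -A.T @ P - P @ A + (1.0 / gamma) * P @ B @ B.T @ P - np.eye(2)
        return dP.ravel()

    # Integrate backward in time from the terminal condition P(T) = 0.
    sol = solve_ivp(riccati, [T_final, 0.0], np.zeros(4), dense_output=True)
    P0 = sol.sol(0.0).reshape(2, 2)
    x0 = np.array([1.0, 0.0])
    u0 = -(1.0 / gamma) * B.T @ P0 @ x0      # u(0) = -(1/gamma) B^T P(0) x(0)
    print(P0, u0)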
C. Minimization of Functionals: Method of Steepest Descent

In this part, X is a real Hilbert space, and we consider functionals f given by
f(x) = (x, Mx) − 2(w, x) + β,   (7.10.18)
where w is a fixed vector in X, where β ∈ R, and where M is a linear self-adjoint operator having the property
c₁‖x‖² ≤ (x, Mx) ≤ c₂‖x‖²   (7.10.19)
for all x ∈ X and some constants c₂ > c₁ > 0. The reader can readily verify that the functional given in Eq. (7.10.13) is a special case of f given in Eq. (7.10.18), where we make the association M = L*L + γI (provided γ > 0), w = L*v, and β = (v, v). Under the above conditions, the equation
Mx = w   (7.10.20)
has a unique solution, say x₀, and x₀ minimizes f(x). Iterative methods are based on beginning with an initial guess to the solution of Eq. (7.10.20) and then successively attempting to improve the estimate according to a recursive relationship of the form
x_{n+1} = x_n + α_n r_n,   (7.10.21)
where α_n ∈ R and r_n ∈ X. Different methods of selecting α_n and r_n give rise to various algorithms of minimizing f(x) given in Eq. (7.10.18) or, equivalently, finding the solution to Eq. (7.10.20). In this part we shall in particular consider the method of steepest descent. In doing so we let
r_n = w − Mx_n, n = 1, 2, ....   (7.10.22)
The term r_n defined by Eq. (7.10.22) is called the residual of the approximation x_n. If, in particular, x_n satisfies Eq. (7.10.20), we see that the residual is zero. For f(x) given in Eq. (7.10.18), we see that
f′(x_n) = −2r_n,
where f′(x_n) denotes the gradient of f at x_n. That is, the residual, r_n, is "pointing" into the direction of the negative of the gradient, or in the direction of steepest descent. Equation (7.10.21) indicates that the correction term α_n r_n is to be a scalar multiple of the gradient, and thus the steepest descent method constitutes an example of one of the so-called "gradient methods." With r_n given by Eq. (7.10.22), α_n is chosen so that f(x_n + α_n r_n) is minimum. Substituting x_n + α_n r_n into Eq. (7.10.18), it is readily shown that
α_n = (r_n, r_n)/(r_n, Mr_n).
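A minimal realization of the iteration (7.10.21)-(7.10.22) is sketched below (NumPy assumed; M, w, the initial guess, and the stopping tolerance are arbitrary choices); each step uses the residual r_n = w − Mx_n and the optimal step length α_n = (r_n, r_n)/(r_n, Mr_n).

    import numpy as np

    M = np.array([[4.0, 1.0], [1.0, 3.0]])   # self-adjoint, positive definite (7.10.19)
    w = np.array([1.0, 2.0])

    x = np.zeros(2)                           # initial guess
    for n in range(25):
        r = w - M @ x                         # residual, Eq. (7.10.22)
        if np.linalg.norm(r) < 1e-12:
            break
        alpha = (r @ r) / (r @ (M @ r))       # optimal step length
        x = x + alpha * r                     # update, Eq. (7.10.21)

    print(x, np.linalg.solve(M, w))           # iterate versus exact solution of Mx = w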
7.10.23. Theorem. Let f be given by Eq. (7.10.18), let x₀ denote the solution of Eq. (7.10.20), and let {x_N} be generated by Eqs. (7.10.21) and (7.10.22). Then x_N → x₀ for any initial guess x₁ ∈ X.

Proof. Let F(x) = f(x) − f(x₀) = (x − x₀, M(x − x₀)), and let y_N = x₀ − x_N. Noting that r_N = w − Mx_N = My_N, so that F(x_N) = (y_N, My_N) = (M⁻¹r_N, r_N), we have
F(x_{N+1}) = F(x_N) − 2α_N(r_N, My_N) + α_N²(r_N, Mr_N) = F(x_N) − (r_N, r_N)²/(r_N, Mr_N);
i.e.,
F(x_{N+1}) = (1 − θ_N)F(x_N), where θ_N = (r_N, r_N)²/[(r_N, Mr_N)(M⁻¹r_N, r_N)].
From Eq. (7.10.19) we obtain θ_N ≥ c₁/c₂ > 0, so that 0 ≤ 1 − θ_N ≤ 1 − c₁/c₂ < 1. Hence, F(x_N) → 0 as N → ∞, and since F(x_N) = (y_N, My_N) ≥ c₁‖y_N‖², it follows that x_N → x₀. ■

7.11. REFERENCES AND NOTES
For applications of the type considered in Section 7.10, as well as additional applications, refer to Antosiewicz and Rheinboldt [7.1], Balakrishnan [7.2], Byron and Fuller [7.3], Curtain and Pritchard [7.4], Kantorovich and Akilov [7.6], Lovitt [7.9], and Luenberger [7.10]. Applications to integral equations (see Section 7.10A) are treated in [7.3] and [7.9]. Optimal control problems (see Section 7.10B) in a Banach and Hilbert space setting are presented in [7.2], [7.4], and [7.10]. Methods for minimization of functionals (see Section 7.10C) are developed in [7.1], [7.6], and [7.10].
REFERENCES

[7.1] H. A. Antosiewicz and W. C. Rheinboldt, "Numerical Analysis and Functional Analysis," in J. Todd (ed.), Survey of Numerical Analysis. New York: McGraw-Hill, 1962.
[7.2] A. V. Balakrishnan, Applied Functional Analysis. New York: Springer-Verlag, 1976.
[7.3] F. W. Byron and R. W. Fuller, Mathematics of Classical and Quantum Physics. Reading, Mass.: Addison-Wesley, 1969.
[7.4] R. F. Curtain and A. J. Pritchard, Functional Analysis in Modern Applied Mathematics. London: Academic Press, 1977.
[7.5]
[7.6] L. V. Kantorovich and G. P. Akilov, Functional Analysis in Normed Spaces. New York: Macmillan, 1964.
[7.7]
[7.8]
[7.9] W. V. Lovitt, Linear Integral Equations. New York: Dover, 1950.
[7.10] D. G. Luenberger, Optimization by Vector Space Methods. New York: John Wiley & Sons, 1969.
[7.11]
[7.12]
INDEX
Abelian group, 40
abstract algebra, 33
additive group, 46
adherent point, 275
adjoint system of
ordinary differential
equations, 261
adjoint transformation, 219, 220,422
affine linear subspace, 85
algebra, 30,56,57,104
algebraically closed
field, 165
algebraic conjugate, 110
algebraic multiplicity, 167,223
algebraic structure, 31
algebraic system, 30
algebra with identity, 57,105
aligned, 379
almost everywhere, 295
approximate eigenvalue, 444
approximate point
spectrum, 444
approximation, 395
Arzela-Ascoli theorem, 316
Ascoli's lemma, 317
associative algebra, 56, 105
associative operation, 28
automorphism, 64, 68
autonomous system of
differential equations, 241
Axioms of norm, 207
B

C
C[a,b],80
cancellation laws, 34
canonical mapping, 372
cardinal number, 24
cartesian product, 10
Cauchy-Peano
existence theorem, 332
Cauchy sequence, 290
Cayley-Hamilton theorem, 167
Cayley's theorem, 66
characteristic equation, 166,259
characteristic polynomial, 166
characteristic value, 164
characteristic vector, 164
0 > 79
classical adjoint of a
matrix, 162
closed interval, 283
closed relative to an
operation, 28
covering, 299
cyclic group, 43,44
D
degree of a polynomial, 70
DeMorgan's laws, 7,12
dense-in-itself, 284
denumerable set, 23
derived set, 277-278
determinant of a
linear transformation, 163
determinant of a matrix, 157
diagonalization of a
matrix, 172
diagonalization process, 450
diagonal matrix, 155
diameter of a set, 267
difference of sets, 7
differentiation:
of matrices, 247
of vectors, 241
dimension, 78,92,392
direct product, 10
direct sum of linear subspaces, 83, 457
discrete metric, 265
disjoint sets, 5
disjoint vector spaces, 83
distance, 264
between a point
and a set, 267
between sets, 267
between vectors, 208
distribution function, 397
distributive, 28
diverge, 286, 350
division algorithm, 71
division (of
polynomials), 72
division ring, 46, 50
divisor, 49
divisors of zero, 48
domain of a function, 12
domain of a relation, 25
dot product, 114
dual, 358
dual basis, 112
E
e-approximate solution, 329
e-dense set, 299
e-net, 299
eigenvalue, 164,439
eigenvector, 164,439
element, 2
element of ordered set, 10
empty set, 3
endomorphism, 64, 68
equal by definition, 10
equality of functions, 14
equality of matrices, 132
equality of sets, 3
equals relation, 26
equicontinuous, 316
equivalence relation, 26
equivalent matrices, 151
equivalent metrics, 318
equivalent sets, 23
error vector, 395
estimate, 398
Euclidean metric, 271
Euclidean norm, 207
Euclidean space, 30,124, 205
even permutation, 156
events, 397
everywhere dense, 284
expected value, 398
extended real line, 266
extended real numbers, 266
extension of a function, 20
extension of an
operation, 29
exterior, 279
extremum, 464
F
factor, 72
family of disjoint sets, 12
family of subsets, 8
G
Gateaux differential, 458
generalized associative
law, 36
generated subspace, 383
generators of a set, 60
Gram matrix, 395
Gram-Schmidt process, 213,391
graph of a function, 14
greatest common divisor, 73
Gronwall inequality, 332
group, 30, 39
group component, 46
group operation, 46
H
Hahn-Banach theorem, 367-370
half space, 366
Hamel basis, 89
Hausdorff spaces, 323
Heine-Borel property, 302
Heine-Borel theorem, 299
hermitian operator, 427
Hilbert space, 31, 377
homeomorphism, 320
homogeneous property
of a norm, 208,344
homogeneous system, 241-242
homomorphic image, 62,68
homomorphic rings, 67
homomorphic semigroups, 63
homomorphism, 30, 62
hyperplane, 364
I
idempotent operator, 121
identity:
element, 35
function, 19
matrix, 139
permutation, 19,44
relation, 26
transformation, 105,409
image of a set under f, 21
indeterminate of a
polynomial ring, 70
index:
of a nilpotent
operator, 185
of a symmetric
bilinear functional, 202
set, 10
indexed family of sets, 10
indexed set, 11
induced:
mapping, 20
metric, 267
norm, 349, 412
operation, 29
inequalities, 268-271
infinite-dimensional
vector space, 92
infinite series, 350
infinite set, 8
initial value problem, 238-261, 328-342
injection, 14
injective, 14,100
inner product, 117,205,375
inner product space, 31, 118, 205
inner product subspace, 118
integral domain, 46,49
integration:
of matrices, 249
of vectors, 249
interior, 278
intersection of sets, 5
invariant linear
subspace, 122
inverse:
image 21
of a function, 15, 100
of a matrix, 140
of an element, 38
relation, 25
invertible element, 37
invertible linear
transformation, 100
invertible matrix, 140
irreducible polynomial, 74
irreflexive, 372
isolated point, 275
isometric operator, 431
isometry, 321
isomorphic, 108
isomorphic semigroups, 64
isomorphism, 30, 63, 68,108
J
Jacobian matrix, 461
Jacobi identity, 57
Jordan canonical form, 175,191
K
Kalman's theorem, 401-402
kernel of a homomorphism, 65
Kronecker delta, 111
L
Laplace transform, 96
latent value, 164
leading coefficient of
a polynomial, 70
Lebesgue integral, 296
Lebesgue measurable
function, 296
Lebesgue measurable
sets, 295
Lebesgue measure, 295
left cancellation
property, 34
left distributive, 28
left identity, 35
left inverse, 36
left invertible element, 37
left R-module, 54
left solution, 40
Lie algebra, 57
limit, 286
limit point, 277,288
line segment, 351
linear:
algebra, 33
functional, 109,355-360
manifold, 81
operator, 31,95
quadratic cost
control, 468
space, 30,55,76
subspace, 59,81,348
subspace generated
by a set, 86
transformation, 30, 95,100
variety, 85
linearly dependent, 87
linearly independent, 87
Lipschitz condition, 324, 328
Lipschitz constant, 324, 328
M
map, 13
mapping, 13
mathematical system, 30
matrices, 30
matrix, 132
matrix of:
a bilinear functional, 195
a linear transformation, 131
one basis with respect
to a second basis, 149
maximal linear subspace, 363
metric, 31,209,264
metric space, 31,209, 263-342
metric subspace, 267
minimal polynomial, 179,181
minor of a matrix, 158
modal matrix, 172
modern algebra, 33
module, 30, 54
monic polynomial, 70
monoid, 37
multiplication of a
linear transformation
by a scalar, 104
multiplication of
vectors by scalars, 76,409
multiplicative semigroup, 46
multiplicity of an
eigenvalue, 164
multivalued function, 25
N
natural basis, 126
natural coordinates, 127
n-dimensional complex
coordinate space, 78
n-dimensional real
coordinate space, 78
n-dimensional vector
space, 92
negative definite
matrix, 222
nested sequence
of sets, 298
Neumann expansion
theorem, 415
nilpotent operator, 185
non-abelian group, 40
non-commutative group, 40
non-empty set, 3
non-homogeneous system, 241-242
non-linear
transformation, 95
non-singular linear
transformation, 100
non-singular matrix, 140
non-void set, 3
norm, 206, 344
normal:
equations, 395
linear
transformation, 237
operator, 431
topological space, 323
normalizing a vector, 209
normed conjugate space, 358
normed dual space, 358
normed linear space, 31, 208,344
norm of a bounded
linear transformation, 409
norm preserving, 367
nowhere dense, 284
null:
matrix, 139
set, 3
space, 98,224
vector, 76, 77
nullity of a linear
transformation, 100
n-vector, 132
O
object, 2
observations, 398
odd permutation, 156
one-to-one and onto
mapping, 14,100
one-to-one mapping, 14, 100
onto mapping, 14,100
open:
ball, 275
covering, 299
interval, 282
set, 279
sphere, 275
operation table, 27
operator, 13
optimal control problem, 468
ordered sets, 9
order of a group, 40
order of a polynomial, 70
order of a set, 8
ordinary differential
equations, 238-261
origin, 76, 77
orthogonal:
basis, 210
complement, 215,382
linear transformation, 217, 231-237
matrix, 216,226
projection, 123,433
set of vectors, 379
vectors, 118,209
orthogonality principle, 399
orthonormal set of
vectors, 379
outcomes, 397
P
parallel, 364
parallelogram law, 208, 379
Parseval's formula, 390
Parseval's identity, 212
partial sums, 350
partitioned matrix, 147
permutation group, 44,45
permutation on a set, 19
piecewise continuous
derivatives, 329
point of accumulation, 277
points, 264
R
radius, 275
random variable, 397
range of a function, 12
range of a relation, 25
range space, 98
rank of a linear
transformation, 100
rank of a matrix, 136
rank of a symmetric
bilinear functional, 202
real inner product space, 205
real line, 265
real vector space, 76
reduce, 435
reduced characteristic
function, 179
reduced linear
transformation, 122
reflection, 218
reflexive, 372
reflexive relation, 25
regular topological
space, 323
relation, 25
relatively compact, 307
relatively prime, 73
remainder, 72
repeated eigenvalues, 173
residual, 472
residual spectrum, 440
resolution of the
identity, 226,457
resolvent set, 439
restriction of a mapping, 20
R-homomorphism, 68
Riccati equation, 471
Riemann intergrable, 296
Riesz representation
theorem, 393
right:
cancellation property, 34
distributive, 28
identity, 34
inverse, 35
invertible element, 37
R-module, 54
solution, 40
R, 78
ring, 30,46
ring of integers, 51
ring of polynomials, 70
ring with identity, 47
R-module, 54
Rn, 78
rotation, 218, 230
row of a matrix, 131
row rank of a matrix, 152
row vector, 125,132
R*, 266
R-submodule, 58
R-submodule generated
by a set, 60
S
scalar, 75
scalar multiplication, 76
Schwarz inequality, 207,376
second dual space, 371
secular value, 164
self-adjoint linear
transformation, 221, 224-225
self-adjoint operators, 428
semigroup, 30, 36
semigroup component, 46
semigroup of
transformations, 44
semigroup operation, 46
separable, 284, 300
separates, 366
sequence, 11, 286
sequence of disjoint
sets, 12
sequence of sets, 11
sequentially compact, 301-305
set, 1
set of order zero, 8
shift operator, 441
σ-algebra, 397
σ-field, 397
signature of a symmetric
bilinear functional, 202
similarity transformation, 153
similar matrices, 153
simple eigenvalues, 164
singleton set, 8
singular linear
transformation, 101
singular matrix, 140
skew-adjoint linear
transformation, 221, 237
skew symmetric bilinear
functional, 196
skew symmetric matrix, 196
skew symmetric part of a
linear functional, 196
solution of a differential
equation, 239
solution of an initial
value problem, 239
space of:
bounded complex
sequences, 79
bounded real sequences, 79
finitely non-zero
sequences, 79
linear transformations, 104
real-valued continuous
functions, 80
span, 86
spectral theorem, 226,455,457
spectrum, 164,439
sphere, 275
spherical neighborhood, 275
square matrix, 132
state transition matrix, 247-255
steepest descent, 472
strictly positive, 429
strong convergence, 373
subalgebra, 105
subcovering, 299
subdomain, 52
subfield, 52
subgroup, 41
subgroup generated
by a set, 43
submatrix, 147
subring, 52
subring generated by
a set, 53
subsemigroup, 40
subsemigroup generated
by a set, 41
subsequence, 287
subset, 3
subsystem, 40,46
successive
approximations, 315, 324-328
sum of:
elements, 46
linear operators, 409
linear transformations, 104
matrices, 138
sets, 82
vectors, 76
surjective, 14, 100
Sylvester's theorem, 199
symmetric difference
of sets, 7
symmetric matrix, 196, 226
symmetric part of a
linear functional, 196
symmetric relation, 26
system of differential
equations, 240, 255-260
T
ternary operation, 26
Tj-spaces, 323
topological space, 31
topological structure, 31
topology, 280, 318,322-323
totally bounded, 299
T',421
trace of a matrix, 169
transformation, 13
transformation group, 45
transitive relation, 26
transpose of a linear
transformation, 113,420
transpose of a matrix, 133
transpose of a vector, 125
triangle inequality, 208, 264, 344
triangular matrix, 176
trivial ring, 48
trivial solution, 245
trivial subring, 53
truncation operator, 439
T*,422
Tᵀ, 113
U
unbounded linear
functional, 356
unbounded metric space, 265
uncountable set, 23
uniform convergence, 313
V
vacuous set, 3
Vandermonde matrix, 260
variance, 398
vector, 75
vector addition, 75
vector space, 30,55, 76
vector space of n-tuples
over F, 56
vector space over a field, 76
vector subspace, 59
Venn diagram, 8
void set, 3
Volterra equation, 327
Volterra integral
equation, 97
W
weak convergence, 373
weakly continuous, 375
weak* compact, 375
weak-star convergence, 373
Weierstrass approximation
theorem, 285
Wronskian, 256-259
XYZ
Xf, 357
X*, 357-358
zero:
polynomial, 70
transformation, 104,409
vector, 76, 77
Zorn's lemma, 390