I N T E R S C I E N C E P U B L I S H E R S L T D .

, L O N D O N





































J anuary 1 9 4 8





































































































+ fi) x = + fix





























$ ' 7 t







,










P




























Y
n ) ( n














n
n (































































































=





























I

n



( f( t)

VS a [







vr



















































































b b

b

j ib



J a

























































d.




































fi





















( 2 c;


_ e, 2
e2 2 1 2 e2 2





r











1 / 1 7 / 2 ,
e2
2 n3 ,
¿ 3 = ) 7 3 .










































1 ,
































- F +


- F

- F









4 >






of














































































a


a
. a












































































np
fil+ q t nnfn


, n.









= n, nil+ 0 .
p' .





































































































































































y)














6
6






























+





































































































































































o
o
L o
o























































I r- 1 1 1 st











an_ ,

















































































































C n T in = ( z,










o o
o



























































































( x , y) = B ( x ; y) .





,













+












































































E















































































ß y




Fa
L y J
Fa vl
L ß






r

























1 1



ft. , i. e. ,

w,












= -















































n
n

n












































































k_

sif2






















4 k( ,
i. e. ,

4 7 2 5







































































































L

1 6 8


cp( f)





f2 , f" .




( 6 )












=




















=




































































































e















E n



vn




,





















COPYRIGHT 0 1961 BY INTERSCIENCE PUBLISHERS, INC.

ALL RIGHTS RESERVED
LIBRARY OF CONGRESS CATALOG CARD NUMBER 61-8630 SECOND PRINTING 1963

PRINTED IN THE UNITED STATES OF AMERICA

PREFACE TO THE SECOND EDITION
The second edition differs from the first in two ways. Some of the material was substantially revised and new material was added. The major additions include two appendices at the end of the book dealing with computational methods in linear algebra and the theory of perturbations, a section on extremal properties of eigenvalues, and a section on polynomial matrices ( §§ 17 and 21). As for major revisions, the chapter dealing with the Jordan canonical form of a linear transformation was entirely rewritten and Chapter IV was reworked. Minor
changes and additions were also made. The new text was written in colla-

boration with Z. Ja. Shapiro. I wish to thank A. G. Kurosh for making available his lecture notes
on tensor algebra. I am grateful to S. V. Fomin for a number of valuable comments Finally, my thanks go to M. L. Tzeitlin for assistance in the preparation of the manuscript and for a number of suggestions.
September 1950

I. GELtAND

Translator's note: Professor Gel'fand asked that the two appendices

be left out of the English translation.

.

Turetski of the Byelorussian State University. January 1948 I. S. and to D. GEL'FAND vii . Raikov. Fomin participated to a considerable extent in the writing of this book. E. Without his help this book could not have been written. The material in fine print is not utilized in the main part of the text and may be omitted in a first perfunctory reading. who made available to him notes of the lectures given by the author in 1945. T h e author wishes to thank Assistant Professor A. who carefully read the manuscript and made a number of valuable comments.PREFACE TO THE FIRST EDITION This book is based on a course in linear algebra taught by the author in the department of mechanics and mathematics of the Moscow State University and at the Byelorussian State University. A. V.

.

Simultaneous reduc97 tion of a pair of quadratic forms to a sum of squares 103 Unitary transformations 107 Commutative linear transformations. The Canonical Form of an Arbitrary Linear Transformation The canonical form of a linear transformation Reduction to canonical form Elementary divisors Polynomial matrices . Linear and Bilinear Forms ti-Dimensional vector spaces Euclidean space Orthogonal basis.TABLE OF CONTENTS Page Preface to the second edition Preface to the first edition vii I. Linear Transformations 70 Linear transformations. Operations on linear transformations . III. n-Dimensional Spaces. 132 132 137 142 149 IV. Invariant subspaces. Introduction to Tensors The dual space Tensors 164 164 171 . Isomorphism of Euclidean spaces 14 21 Bilinear and quadratic forms Reduction of a quadratic form to a sum of squares Reduction of a quadratic form by means of a triangular transformation 34 42 The law of inertia Complex n-dimensional space 46 55 60 70 II. Normal transformations Decomposition of a linear transformation into a product of a unitary and self-adjoint transformation 114 Linear transformations on a rea/ Euclidean space 126 Extremal properties of eigenvalues . Eigenvalues and eigenvectors of a linear 81 transformation 90 The adjoint of a linear transformation Self-adjoint (Hermitian) transformations.

.

the set of coefficients of a linear form. e2. In the sequel we shall consider all continuous functions defined on some interval In the examples just given the operations of addition and multiplication by numbers are applied to entirely dissimilar objects.g. . e) (e. z. . By the product of the number A and the n-tuple x = (ei . the diagonal of the parallelogram with sides x and y. x+y= + ij. A set R of elements x.. . As is well known the sum of two vectors x and y is. a-Dimensional vector spaces 1. DEFINITION 1. directed segments. . We frequently come across objects which are added and multiplied by numbers. Definition of a vector space. Thus In geometry objects of this nature are vectors in three dimensional space. e) we mean the n-tuple ix= (241. i. In analysis we define the operations of addition of functions and multiplication of functions by numbers.. The definition of multiplication by (real) numbers is equally well known. by definition. etc. E2. vector space over a field F if: [I] is said to be a . we introduce the concept of a vector space. Linear and Bilinear Forms § 1. To investigate all examples of this nature from a unified point of view fa.e. )42. rows of a matrix. Addition and multiplication of n-tuples by numbers are usually defined as follows: by the sum of the n-tuples . Two directed segments are said to define the same vector if and only if it is possible to translate one of them into the other. In algebra we come across systems of n numbers x = (ei. b].CHAPTER I n-Dimensional Spaces. + n). AE ). It is therefore convenient to measure off all such directed segments beginning with one common point which we shall call the origin. y. ¿) and y =(. ?In) we mean the n-tuple x = (E1. E2.).n2. E2 + n2.

It is not an oversight on our part that we have not specified how elements of R are to be added and multiplied by numbers. x+y=y+x (x + y) 4. 2. 1. The above operations must satisfy the following requirements (axioms): (commutativity) (associativity) R contains an element 0 such that x = x for all x in R. III. Thus (r t) (r t) = 5. For every x in R there exists (in R) an element denoted by I. 4. 1. Ai( is referred to as the product of x by A. We take as the elements of R matrices of order n. The set of all polynomials of degree not exceeding some natural number n constitutes a vector space if addition of polynomials and multiplication of polynomials by numbers are defined in the usual manner. (ßx) = ß(x).2 LECTURES ON LINEAR ALGEBRA With every two elements x and y in R there is associated an element z in R which is called the sum of the elements x and y. Any definitions of these operations are acceptable as long as the axioms listed above are satisfied. The sum of the elements x and y is denoted by x + y. 3 above are indeed examples of vector spaces.z = x (y z) x with the property x ( x) = O. 0 is referred to as the zero element. 2. (cc 1. II. We observe that under the usual operations of addition and multiplication by numbers the set of polynomials of degree n does not form a vector space since the sum of two polynomials of degree n may turn out to be a polynomial of degree smaller than n. Ix= x ct(x + fi)x = y) = + fix cor. Whenever this is the case we are dealing with an instance of a vector space. Let us give a few more examples of vector spaces. With every element x in R and every numeer A belonging tu a c field F there is associated an element Ax in R. As the sum . We leave it to the reader to verify that the examples 1. 2.

in particular. say. i.n-DIMENSIONAL SPACES 3 we take the matrix Hai. not all equal to zero such that cc. z. y. More generally it may be assumed that A. IT be linearly dependent. Then + Ov = 0 GO( + /3y + yz + 0 = O. fty Dividing by a and putting yz Ov. are elements of an arbitrary field K. If are taken from the field of complex numbers. y. y. unequal to zero. z. The geometric considerations associated with this word will help us clarify and even predict a number of results.. . . . v is said to be linearly independent if (1) yz + the equality implies that . a set of vectors x.4. + b1. The dimensionality of a vector space. + Ov = O. the numbers 2. . IT are linearly dependent if there exist numbers 9. 1/Ve now define the notions of linear dependence and independence of vectors which are of fundamental importance in all that follows. it. a. 2. z. z. then the space is referred to as a real vector space. In other words. IT be connected by a relation of the form (1) with at least one of the coefficients. DEFINITION 2. It is natural to call the elements of a vector space vectors. y. y =. However. The fact that this term was used in Example I should not confuse the reader.e. space are real. involved in the definition of a vector If the numbers X. As the product of the number X and the matrix 11 aikl we take the of the matrices I laikl I and matrix 112a1tt It is easy to see that the above set R is now a vector space.11. /3. Let R be a vector space. W e shall say that the vectors x. . let x. 1. Many concepts and theorems dealt with in the sequel and.p Let the vectors x. Vectors which are not linearly dependent are said to be linearly independent. then the space is referred to as a complex vector space. the contents of this section apply to vector spaces over arbitrary fields. Then R is called a vector space over the field K. y. in chapter I we shall ordinarily assume that R is a real vector space. y.

4 LECTURES ON LINEAR ALGEBRA (/57X) = A. y. y is the zero vector then these vectors are linearly dependent. u. and in three-dimensional space coincides with what is called in geometry the dimensionality of the line. Any two vectors on a line are proportional. that if one of a set of vectors is a linear combination of the remaining vectors then the vectors of the set are linearly dependent. . y are linearly dependent then at (2) x pZ + least one of them is a linear combination of the others. respectively. 2. .e. y. z. z. Ve shall now compute the dimensionality of each of the vector spaces considered in the Examples 1. y. . Show that if the vectors x. then R is said to be infinitedimensional. 1. z.. We now introduce the concept of dimension of a vector space. 2. 3. y in the form (2) we say that x is a linear combination of the vectors y. y. are arbitrary vectors then the vectors x. i. and space. plane. dependent. 1 vectors If R is a vector space which contains an arbitrarily large number of linearly independent vectors. y. linearly depend- ent. 4. It is therefore natural to make the following general definition. Whenever a vector x is expressible through vectors y. Show that if one of the vectors x. We leave it to the reader to prove that the converse is also true. DEF/NITION 3. /1. v. In the plane we can find two linearly independent vectors but any three vectors are linearly dependent. Infinite-dimensional spaces will not be studied in this book. . 5. Thus. = tu..e. If R is the set of vectors in three-dimensional space. are linearly dependent and u. A vector space R is said to be n-dimensional if it contains n linearly independent vectors and if any n in R are linearly dependent. in the plane. z. then it is possible to find three linearly independent vectors but any four vectors are linearly dependent. i. As we see the maximal number of linearly independent vectors on a straight line. z. z. EXERCISES. are linearly . y. if the vectors x.y '. (01x) = we have + Of.

0. t"-1 are linearly independent. the vectors xi -= (1. n2n). briefly. Indeed. 0). . Let N be any 1. Then the functions: f1(t) independent vectors (the proof fN(t) = tN-1 form a set of linearly of this statement is left to the reader). n21. Thn 172n n tnn cannot exceed n (the number of columns). Y2 = (V21. Since m > n. Let R denote the space whose elements are n-tuples of real numbers. Hence R is n-dimensional. . . = (0. 0). f2(t) = t. Yin = (nml. tre > n. Consequently R is three-dimensional. any m vectors in R. y2. x. 0. this space the n polynomials 1. . let x= (0. n22. n12. Ynt Thus the dimension of R is n. On the other hand. In n Let R be the space of polynomials of degree . . 717n2. . This space contains n linearly independent vectors For instance. The number of linearly independent rows in the matrix [ nu. But this implies the linear dependence of the vectors y1.n-DIMENSIONAL SPACES 5 As we have already indicated. 1) are easily seen to be linearly independent. nm2. the space R of Example 1 contains three linearly independent vectors and any four vectors in it are linearly dependent. It follows that our space contains an arbitrarily large number of linearly independent functions or. 1. Let R be the space of continuous functions. Yi(nii. ni > n. our ni rows are linearly dependent. . nmn) be ni vectors and let ni > n. 1. are linearly dependent. t. n12. natural number.. R is infinite-dimensional. It can be shown that any ni elements of R. are linearly dependent. n22y 17ml.

e. Thus. Basis and coordinates in n-dimensional space DEFINITION 4. Using (3) we have x= cco 2 e2 22o cc °c m en. «0 This proves that every x e R is indeed a linear combination of the vectors el.e. . i. i. en. for instance. It Proof: Let e1. ex not all zero such that (3) ao X + 1e1 + cte = O. e2. Otherwise (3) would imply the linear dependence of the vectors e1. 3. follows from the definition of an n-dimensional vector space that these vectors are linearly dependent. r2e2 + Subtracting one equation from the other we obtain O = (el $'1)e1 (2 e2)e2 + . it contains a basis. en of an n-dimensional vector space R is called a basis of R.6 LECTURES ON LINEAR ALGEBRA 5.. We leave it to the reader to prove that the space of n x n matrices [a2kH is n2-dimensional. e2. e be a basis in R.. + (E. . Let x be an arbitrary vector in R. e contains n + 1 vectors. . in the case of the space considered in Example 1 any-three vectors which are not coplanar form a basis. . e2. that there exist n I numbers « al. Every vector x belonging to an n-dimensional vector space R can be uniquely represented as a linear combination of basis vectors.e.e. e/n)e. To prove uniqueness of the representation of x in terms of the basis vectors we assume that x = $1e1 and E2e2 + +e + ete. e2. The set x. X = E'. e2. e1. By definition of the term "n-dimensional vector space" such a space contains n linearly independent vectors. THEOREM 1. Obviously «0 O. Any set of n linearly independent vectors e1.

the coordinates of x + y are E.. en are E2. . e2. E are called the coordinates of the vector el. en are linearly independent. Let us choose as basis the vectors . e2.. e2. x Y= then + 2e2 + + /72. + x + Y = (El + ni)ei + (e2 712)e2 + i. 1. . 2E2.e. Let R be the space of n-tuples of numbers. . 2. Thus the coordinates of the sum of two vectors are the sums of the appropriate coordinates of the summands. e2. . E2. e2 772. if are ni. En and the coordinates of y relative to the same basis a. it follows that e'2 = Et'= e2 i.. ni. e of a vector space R every vector X E R has a unique set of coordinates. = en = e'n = = $'7t ea = E12. .tt-DIMENSIONAL SPACES 7 Since e1. + + Ene + + n)e. e form a basis in an n-dimensional space and + x= (4) + e2e2 + then the numbers $1. It is clear that the zero vector is the only vector all of whose coordinates are zero. . . i. and the coordinates of the . If the coordinates of x relative to the basis e1.e. En + Similarly the vector /lx has as coordinates the numbers 141. e2. 2En product of a vector by a scalar are the products of the coordinates of that vector by the scalar in question. e. In the case of three-dimensional space our definition of the coordinates of a vector coincides with the definition of the coordinates of a vector in a (not necessarily Cartesian) coordinate system. n2. DEFINITION 5. EXAMPLES.e. if el. = eta This proves uniqueness of the representation. . x relative to the basis Theorem 1 states that given a basis el.

e. en = (0. numbers et. en) and the E2. ' n2 " S2 -.. n and then compute the coordinates x = (E1. e) relative to the basis el. 52. 1. 0). 0. 0. 1. x = (Ei. 1. o.. By definition x i. .72. 1) . o. 1. 172 = El. let = (1. 1) +Ee. . . = E. Ti (0. nn " $n $fl-1Let us now consider a basis for R in which the connection be- ni " Si. E. o) + = + $(o. = (0. 1) + n2. . tween the coordinates of a vector x = . It follows that in the space R of n-tuples ($1. = (0. En) o) 2(o. Ei(1. 171 . 272(0. . 1) = The numbers (ni. (E1. Then 0. Thus. . $2. 1. 1. E which define the vector is particularly simple. $) the numbers . e. + E2 + Consequently.e. e. $2. . E2. en) = n2e2 ' + nen. n/(1. 1. 1). e. . 172 + + n. n of the vector . 1). . . I.. 1). n) must satisfy the relations 0.$1. . 0.S LECTURES ON LINEAR ALGEBRA et = (1. ni + + + n). 0). = (0.. e2e2+ . 1).

E2. P' (a). e'2 = t a. P (t) = P (a) + P' (a)(t a) + Thus the coordinates of P(t) in this basis are 1)1 . One instance of this type is supplied by the ordinary three-dimensional space R considered in Example / and the space R' whose elements are triples of real numbers. e' = 1. = (1. [PR-1)(a)/ (n P(a). Indeed. e. . EXERCISE. e's = (t a)2. ($1. 1. . Let us now select another basis for R: . 0). ao. . Let R be the vector space of polynomials of degree n A very simple basis in this space is the basis whose elements are . . we can associate with a vector in R a vector in R'. e. = t. en) are linear 17 of a vector x the coordinates n. Show that in an arbitrary basis e. 0. .. It is easy to see that the the vectors el = 1..) . a. e2 = (0. e' = (t a)"--1. When vectors are added their coordinates are added. may be viewed as the coordinates of the vector e) relative to the basis 1). ' . = (all.e. 0. . an_2. i.. a22. e2. coordinates of the polynomial P(t) = a0r-1 a1t"-2 + in this basis are the coefficients a_. /2' combinations of the numbers E1. . This implies a parallelism between the geometric properties of R and appropriate properties of R'. en (a. . Isomorphism of n-dimensional vector spaces. 0). Expanding P (t) in powers of (t a) we find that + [P(nl) (a)I (n-1)!](t a)n_'.. e = t"--1-. We shall now formulate precisely the notion of "sameness" or of "isomorphism" of vector spaces. once a basis has been selected in R we can associate with a vector in R its coordinates relative to that basis. 1. X =(. en = (0. In the examples considered above some of the spaces are identical with others when it comes to the properties we have investigated so far.n-DIMENSIONAL SPACES 9 ei. a. When a vector is multiplied by a scalar all of its coordinates are multiplied by that scalar.). e2 (a. 6122. a".

But then x' is likewise uniquely determined by x. e' be a basis in R'. This is the same as saying that the dimensions of R and R' are the same. are said to be isomorphic if it is possible to establish a one-to-one correspondence X 4-4 x' between the elements x e R and x' e R' such that if x 4> x' and y y'. By the same token every x' e R' determines one and ordy one vector x e R. with the same coefficients as in (5)..10 LECTURES ON LINEAR ALGEBRA DEFINITION 6. THEOREM 2. are vectors in R and x'. Proof: Let R and R' be two n-dimensional vector spaces. y'. . We shall associate with the vector (5) x= e2e2 + + ee the vector + E2e'2 + x' i. are their counterparts in R' then in view of conditions I and 2 of the definition of isomorphism the equation Ax Ax' py' + = 0 is equivalent to the equation = O. then the vector which this correspondence associates with x + y is X' + y'. There arises the question as to which vector spaces are isomorphic and which are not. Two vector spaces R and RI. Let e2. Indeed. If x. Therefore the maximal number of linearly independent vectors in R is the same as the maximal number of linearly independent vectors in R'. Indeed. the vector which this correspondence associates with Ax is Ax'.e. a linear combination of the vectors e'. let us assume that R and R' are isomorphic. This means that the E. are uniquely determined by the vector x. Two vector spaces of different dimensions are certainly not isomorphic. It follows that two spaces of different dimensions cannot be isomorphic. Hence the counterparts in R' of linearly yy + independent vectors in R are also linearly independent and conversely. y. All vector spaces of dimension n are isomorphic. en be a basis in R and let e'2. . every vector x e R has a unique representation of the form (5). . This correspondence is one-to-one.

a set R' of vectors x. En) for which ei = 0 form a subspace. This completes the proof of the isomorphism of the spaces R and R'. Consider any plane in R going through the origin. . n form a subspace of The totality of polynomials of degree the vector space of all continuous functions. In the vector space of n-tuples of numbers all vectors . subspace. In § 3 we shall have another opportunity to explore the concept of isomorphism. Since a subspace of a vector space is a vector space in its own right we can speak of a basis of a subspace as well as of its dimensionality. Let R be the ordinary three-dimensional space. all vectors x (E1. It is clear that every subspace R' of a vector space R must con- tain the zero element of R. subspace of R if x e R'. E2. then x + y 4> x' + y' and 2x 4> Ax'. . y e R' implies x of R. y. of a vector space R is called a subspace of R if it forms a vector space under the operations of addition and scalar multiplication introduced in R.n-DIMENSIONAL SPACES 11 It should now be obvious that if x 4* x' and y e> y'. The null space and the whole space are usually referred to as improper subspaces. form a x = (E1. EXERCISE. Subspaces of a vector space DEFINITION 7. Show that if the dimension of a subspace R' of a vector space R is the same as the dimension of R. 1. The zero or null element of R forms a subspace The whole space R forms a subspace of R. It is clear that the dimension of an arbitrary subspace of a vector space does not exceed the dimension of that vector space. More generally. a2E2 + a. an are arbitrary but fixed numbers. In other words. A subset R'. . E) such that + anen = 0. The totality R' of vectors in that plane form a subspace of R. in R is called a y e R'. EXAMPLES. E2. 5. We now give a few examples of non-trivial subspaces. then R' coincides with R.E1 where al. x e R'. a2.

Consider the set of vectors of the form x xo °Lei. But this implies (cf. g.12 LECTURES ON LINEAR ALGEBRA A general method for constructing subspaces of a vector space R is implied by the observation that if e. Example 2. f. OY are a (finite infinite) set of vectors belonging to R. Thus a one-dimensional v all vectors me1. . the dimension of R'. e2. R' contains k linearly independent vectors (i. 4. e7. This subspace is the smallest subspace of R containing the vectors e. subspace R' of R. i. g. P Elk $21. n. f. $22. . be 1 vectors in R' and let 1 > k. x2. It is natural to call this set of vectors by analogy with threedimensional space a line in the vector space R. e2. . the vectors e1. . Etl. 0 are fixed vectors and ranges over all scalars. where xo and e. $22. f. xi then the I rows in the matrix $12. + etkek. Thus the maximal number of linearly independent vectors in R'.. xi. then the set R' of all (finite) linear combinations of the vectors e. e. subspaces of dimension / 1. g. A basis of such a space is a single vector el O. x2. e2. The subspace R' is referred to as the subspace . . . eh form a basis of R' Indeed. eh).. x. EXERCISE.. is k-dimensional and the vectors e1. On the other hand. + E12e2 + ' + elkek + E2e. $2k E lk must be linearly dependent.e. page 5) the linear dependence of the vec ors x. If we ignore null spaces. then the simplest vector spaces are one- dimensional vector spaces. e2. form a basis in R'. 2. Show that every n-dimensional vector space contains /. + enel. g. let x1. where a is an arbitrary scalar.e. is hand the vectors e. forms a generated by the vectors e. The subspace R' generated by the linearly independent vectors e1.$ 22. If X2 x1= $11e1 + $ 12. . f.

f. X= where xo is a fixed vector. e' be two bases of an n-dimensional vector space. . Transformation of coordinates under change of basis. ' of real numbers the set of vectors satisfying the relation + ane. a211e2 + The determinant of the matrix d in (6) is different from zero e' would be linearly depend(otherwise the vectors e'1. e'2. of dimension n Show that if two subspaces R. Let e2.. The set of vectors of the form xe ße2. g. is called a (two-dimensional) plane. EXERCISES. 1. + an. Then x = Ele x e1e1 $2e2 + $2e2 + $e = E'le'. Show that the dimension of the subspace generated by the vectors is equal to the maximal number of linearly independent vectors e. Further.en. all vectors of the form /el ße2.e1±a21e2+ ±a2e) + aen). en and e'1.. Show that in the vector space of n-tuples (ei. a. . ---a2E. let the connection between them be given 6. Let ei be the coordinates of a vector x in the first basis and its coordinates in the second basis. ae) . a. where el and e2 are fixed linearly independent vectors and a and fi are arbitrary numbers form a two-dimensional vector space. e'o. among the vectors e. and 112 of a vector space R have only the null vector in common then the sum of their dimensions does not exceed the dimension of R. = ae (6) a21e2 + a22 e2 + + ae. ent). . are fixed numbers not all of which are zero) form a subspace 1. ep. by the equations e'. + 6/1E1 (a1. + Replacing the e'. = ame. f. e'2 = tine' + ae. g.n-DIMENSIONAL SPACES 13 Similarly. with the appropriate expressions from (6) we get + ee e'l(ae ae2 + ane2 E'(a1.

the coefficients of the e. We take as our fundamental concept the concept of an inner product of vectors.21. Euclidean space 1. . + binen. dimension. parallelism of lines. many concepts of so-called Euclidean geometry cannot be forniulated in terms of addition and multiplication by scalars.14 LECTURES ON LINEAR ALGEBRA Since the e. Instances of such concepts are: length of a vector. bnnen e'n = b11 b2$2 + where the b are the elements of the inverse of the matrix st. etc.V. Thus. § 2. By means of these operations it is possible to define in a vector space the concepts of line. ei2 = bn + b22$2 + + b2Jn. Using the inner product operation in addition to the operations of addi- . To rephrase our result we solve the system (7) for ¿'i. on both sides of the above equation must be the same. the inner product of vectors. aniVi + a. However. The simplest way of introducing these concepts is the following. Thus the coordinates of the vector x in the first basis are express- ed through its coordinates in the second basis by means of the matrix st which is the transpose of . Hence auri + an (7) + + rn E2 = an VI en a22 E'2 + + a2netn. Definition of Euclidean space. angles between vectors. + + anE'n. We define this concept axiomatically. plane. the coordinates of a vector are transformed by means of a matrix ri which is the inverse of the transpose of the matrix at in (6) which determines the change of basis. In the preceding section a vector space was defined as a collection of elements (vectors) for which there are defined the operations of addition and multiplication by scalars. Then buE1 + 1)12E2 + . are linearly independent.

Without changing the definitions of addition and multiplication by scalars in Example 2 above we shall define the inner product of two vectors in the space of Example 2 in a different and more general manner. of vectors studied in elementary solid geometry (cf. Let x = (et. § 1). Example /. + x2.n-DIMENSIONAL SPACES 15 tion and multiplication by scalars we shall find it possible to develop all of Euclidean geometry. n2. Let us put . En) and Y = (n2.. We leave it to the reader to verify the fact that the operation just defined satisfies conditions 1 through 4 above. .) be in R.. en + n) and multiplication by scalars . A vector space in which an inner product satisfying conditions 1 through 4 has been defined is referred to as a Euclidean space. AE. (x. x) = 0 tt and only if x 0. Aen) with which we are already familiar from Example 2. Thus let taiki I be a real n x n matrix. we define the inner product of x and y as (x. (A real) (2x. y) =] (x1. th. § 1. If with every pair of vectors x. y) = (y. y). y) such that (x. Consider the space R of n-tuples of real numbers. Let us define the inner product of two vectors in this space as the product of their lengths by the cosine of the angle between them. In addition to the definitions of addition EXAMPLES. (25. then we say that an inner product is defined in R. y] + (x2. (x. DEFINITION 1. y) = 2(x. y) as defined. y in a real vector space R there is associated a real number (x. x). Let us consider the (three-dimensional) space R x + Y = (ei Ax nt. Y) = eint + 5m + + it is again easy to check that properties 1 through 4 are satisfied by (x. y]. 23 . 1. ez + n2. x) 0 and (x.

for (x.. EXERCISE. = = E. quadratic form in (3) is said to be positive definite if it takes on non-negative values only and if it vanishes only when all the Ei are zero. be non-negative fore very choice of the n numbers el. i. x) ctikeie. i. if we put a = 1 and a. . y) = I eini and the result is the Euclidean space of Example 2. For Axiom 1 to hold. for (1) to define an inner product the matrix 11(211 must be symmetric and the quadratic form associated with Ila11 must be positive definite. = E.16 LECTURES ON LINEAR ALGEBRA (x. as it is frequently called. e2. = O (i k). that IctO be symmetric. y) deEned by (1) takes the form (x. Y) = a11C1n1 + a12C1n2 + (1) + am el /7 + a 271 egbi a21e2171 + a22E2n2 + + an]. The homogeneous polynomial or.e.. = O. Axiom 4 requires that the expression (x. that is. and that the matrix 1\ (1 1 21 can be used to define an inner product satisfying the axioms I through 4. y) to be symmetric relative to x and y. Show that the matrix (0 1 1) 0 cannot be used to define an inner product (the corresponding quadratic form is not positive definite). k=1 i.. In summary. en and that it vanish only if E. . If we take as the matrix fIctO the unit matrix.e. it is necessary and sufficient that a= a. ennl an2 En n2 + ann ennn We can verify directly the fact that this definition satisfies Axioms 2 and 3 for an inner product regardless of the nature of the real matrix raj/cll. Thus for Axiom 4 to hold the quadratic form (3) must be positive definite. then the inner product (x.

of the angle between two vectors and of the inner product of two vectors imply the usual relation which connects these quantities.n-DIMENSIONAL SPACES 17 In the sequel (§ 6) we shall give simple criteria for a quadratic form to be positive definite. We shall now make use of the concept of an inner product to define the length of a vector and the angle between two vectors. It is quite natural to require that the definitions of length of a vector. (4) We shall denote the length of a vector x by the symbol N. In other words. lx1 y) 1311 i. we put (5) cos 9) (x. Q) = P (t)Q(t) dt. b]. This dictates the following definition of the concept of angle between two vectors. By the length of a vector x in Euclidean space we mean the number (x. g) = fa f(t)g(t) dt. Angle between two vectors..e. It is easy to check that the Axioms 1 through 4 are satisfied. x). By the angle between two vectors x and y we mean the number arc cos (x. Let the elements of a vector space be all the continuous functions on an interval [a. it is natural to require that the inner product of two vectors be equal to the product of the lengths of these vectors times the cosine of the angle between them. n 1. We define the inner product of two such functions as the integral of their product (f. y) 13C1 1371 . Let R be the space of polynomials of degree (P. DEFINITION 3. Length of a vector. We define the inner product of two polynomials as in Example 4 L 2. DEFINITION 2.

In para. lx + 3/12 (X + y. y) 1x1 1Y1 If is to be always computable from this relation we must show I We could have axiomatized the notions of length of a vector and angle that between two vectors rather than the notion of inner product. We shall show that Y12 = 1x12 1Y121 i. y) = O. y. Proof: By definition of length of a vector (x Y. x + y) = (x. x) (x. (x.. x) (Y. x Y). The concepts just introduced permit us to extend a number of theorems of elementary geometry to Euclidean spaces. In view of the distributivity property of inner products (Axiom 3). y) = (y. If x and y are orthogonal vectors.18 LECTURES ON LINEAR ALGEBRA The vectors x and y are said to be orthogonal if (x.7i/ 2.(Y. we defined the angle between two vectors x and y by means of the relation cos 99 (x. However. The Schwarz inequality. Y) 4. This theorem can be easily generalized to read: if x. Y) 13112. 2. Thus 1x + y12 = (x. Since x and y are supposed orthogonal. this course would have resulted in a more complicated system of axioms than that associated with the notion of an inner product. are pairwise orthogonal.e. y). The following is an example of such extension. x) (Y. then it is natural to regard x + y as the diagonal of a rectangle with sides x and y. that the square of the length of the diagonal of a rectangle is equal to the sum of the squares of the lengths of its two non-parallel sides (the theorem of Pythagoras). The angle between two non-zero orthogonal vectors is clearly . z. . x) = O. which is what we set out to prove. then - Ex + y + z + 12 = ixr2 )7I2 1z12 + ' w 3.

In view of Axiom 4 for inner products. x) = (x.. for any t. 1. 1. however that in para. y)2 which is what we wished to prove. In the case of Example 1. 1(x. (x. y) 2t(x. in vector analysis the inner product of two vectors is defined in such a way that the quantity (x. the discriminant of the equation t2(y. x) O. It is now appropriate to interpret this inequality in the various concrete Euclidean spaces in para. 12(y . y). y). of this section there is no need to prove this inequality. before we can correctly define the angle between two vectors by means of the relation (5) we must prove the Schwarz inequality. (cf. EXERCISE. x)(y. (x i. the remark preceding the proof of the Schwarz inequality. that (x. Consequently. Inequality (6) is known as the Schwarz inequality. equivalently. (x. Y) < 1 IXI IYI or. ty. EXAMPLES. y) _CO. i.n-DIMENSIONAL SPACES 19 1 < (X. x)(y. x ty) 0. y)2 (x. Prove that a necessary and sufficient condition for (x. Namely. x)(y. Consequently. We have proved the validity of (6) for an axiomatically defined Euclidean space. This inequality implies that the polynomial cannot have two distinct real roots. 2 To prove the Schwarz inequality we consider the vector x ty where t is any real number. y)1/1x1 IYI 1. y)/1x1 1y1 is the cosine of a previously determined angle between the vectors. Thus.. y) 2t(x. y) + (x. in turn.e. y) + (x. y)2 Ix12 13112 <I which. . is the same as (6) (x. 1. Example /. y) is the linear dependence of the vectors x and y. cannot be positive. inequality (6) tells us nothing new.e.) 2 Note.

Show that if the numbers an satisfy conditions (2) and (3). X) 2=1 E Et2.) fb In Example 4 the inner product was defined by means of the integral 1(1)g (t) dt. Hence (6) takes the form 0. where and (3) E aikeiek >. . i=1 In Example 3 the inner product was defined as (1) Y) i. y) = E i)(8. and /72.20 2 LECTURES ON LINEAR ALGEBRA In Example 2 the inner product was defined as (x. This inequality plays an important role in many problems of analysis. af(t)g(t) dt))2 fba [f(t)J' dt [g(t)p dt. En anakk.n=1 aikeink. If x and y are two vectors in a Euclidean space R then (7) Ix + Yi [xi + 1Yr- . (y. then the following inequality holds: 2 ( E ao. . y) = E t=1 It follows that (X. 71 in the inequality just derived. (Hint: Assign suitable values to the numbers Ei. En. k=1 for any choice of the ¿i. We now give an example of an inequality which is a consequence of the Schwarz inequality. Hence (6) implies that il the numbers an satisfy conditions (2) and (3). 2--1 and inequality (6) becomes i=1 ei MY 5- n ( )( n Ei2 i=1 tif2). 2=1 EXERCISE.$((o) k=1 ( n n E aikeik)( E 6111115112) k=1 i.

Interpret inequality (7) in each of the concrete Euclidean spaces considered in the beginning of this section. x)+21x1 lyi+ (y. x1 i. y). e of an nDEFINITION 1. form an orthonormal basis .y) = (x. if e. the tip of that vector) is defined as the length of the vector x y. then there exists an isomorphic mapping takes the first of these bases into the second.e. § 3. x+y) SI. y) (x. dimensional Euclidean vector space are said to form an orthogonal basis if they are pairwise orthogonal. Briefly. and an orthonormal basis i f . ..n-DIMENSIONAL SPACES 21 Proof: y12 = (x + y. Here there is every reason to prefer so-called orthogonal bases to all other bases. In a vector space there is no reason to prefer one basis to another.. The non-zero vectors el. Isomorphism of Euclidean spaces I.e 3 Careful reading of the proof of the isomorphism of vector spaces given in § 1 will show that in addition to proving the theorem we also showed that it is possible to construct an isomorphism of two n-dimensional vector spaces which takes a specified basis in one of these spaces into a specified basis in are two e and e'5. Y) = fix1+IYI)2. e. e'. 1x±y12 = (x+y. which is the desired conclusion.. e2. the vectors e. each has unit length. In the general case of an n-dimensional Euclidean space we define the distance between x and y by the relation d lx yl. In particular.. x -1.e2. In § 1 we introduced the notion of a basis (coordinate system) of a vector space. x) 2(x. In geometry the distance between two points x and y (note the use of the same symbol to denote a vectordrawn from the origin- and a point. Orthogonal basis. in addition. the other space. Orthogonal basis. y) + (y. of R onto itself which bases in R. Orthogonal bases play the same role in Euclidean spaces which rectangular coord nate systems play in analytic geometry. it follows that 213E1 Since 2(x.. 1x + yl lx EXERCISE. 3 Not so in Euclidean spaces.

Next we put e. e1. Suppose that we have already constructed non-zero pairwise orthogonal vectors el. This proves that el. Proof: By definition of an n-dimensional vector space (§ 1. A2e2 + + Ae = O. The result is 21(e1. e. We put = f1. e1). e2. For this definition to be correct we must prove that the vectors ei. Thus. . =A1e11+ ' + Ak-iet where the Al are determined from the orthogonality conditim . This procedure leads from any basis f.. Every n-dimensional Euclidean space contains orthogonal bases. (e1. the definition of an orthogonal basis implies that ei) 0 0. i. f2. f to an orthogonal basis el.. e2. 2.. THEOREM 1. (f.. = f. e are linearly independent.e. el) + A2(e1. el) = 0. . multiplying (2) by e. para. ". We shall make use of the so-called orthogonalization procedure to prove the existence of orthogonal bases. en) = O.e. ek) = 0 for k é L Hence A = O. f. To construct ek we put e. To this end we multiply both sides of (2) by el (i. 2) such a space contains a basis f1. we find that A2 = 0. . e2) + + 2(e1. We wish to show that (2) implies Ai = 2. e1. Likewise. form the inner product of each side of (2) with ei). Now. etc. e1)/(e1.22 LECTURES ON LINEAR ALGEBRA (ei. This means that (f. ek) = f1 10 if i = k if i k.e.2 = A = O. are linearly independent. . ei) = O. where a is chosen so that (e2.. e. i. e2. let en of the definition actually form a basis.

. el) + 22-1(e1. . e. . = f. basis. e1) (ek. Let R be the three-dimensional space with which we are familiar from elementary geometry. e. In view of the linear independence of the vectors f1. ek. f. e2) = 0. So far we have not made use of the linear independence of the . an orthogonal It is clear that the vectors e'k = ek/lekl (k = 1. (fk. = f. en.. This proves our theorem.ctor e. 2. i.e. et) = 0. e) = O. O. e2-1) 21(ek_1. e2. The vector ek is a linear combination of the vectors ek.. and the vectors e. and fk+. (5) that ek O. and lying in the plane determined by e. f. -I- ' ' + ' + A2e. But e. . e2.n-DIMENSIONAL SPACES 23 (ek.) = (f2 Since the vectors el. ' (f2.. the latter (fk. so . ek. fk we Just as ek. equalities become: (fk. e2_2. n) form an orthonormal basis. . perpendicular to e. Next select a . Let fi. ek_k and fk were used to construct e. 02) A1e2-1 + + e2-1) . . e2. By continuing the process described above we obtain n non-zero. e2. (e2. It follows that 2-1 = (f/c. e2) = (fk = (fk 21e2-1 Aiek. be three linearly independent vectors in R. can be used to construct e. pairwise orthogonal vectors ek. f2. + 2-1 fk. .. e1)/(e1. = 0. Similar statements hold for e22. may conclude on the basis of eq. It follows that ek = alfk ci2f2 + . = (fk. e2. O.-t. e. e2).. 1. can be written as a linear vectors f1. e2. but we shall make use of this fact presently to prove that e. e. Put e. EXAMPLES OF ORTHOGONALIZATION. fa. f2. are pairwise orthogonal. 22-2(e2. etc. e2) 0. el). combination of the vector f_. e2)/(e2.

e. We define the inner product of two vectors in this space by the integral fi P(t)Q (t) dt. Let R be the space of polynomials of degree not exceeding n 1. y 1. e2. R. = 12 + 131 The orthogonality requirements imply ß 0 and y = 1/3. We shall denote the kth element of this basis by Pk(t). Since O (t+ I. = t2 1/3. Multiplying each Legendre polynomial by a suitable constant we obtain an orthonormal basis in R. perpendicular to the previously constructed plane). Let R be the three-dimensional vector space of polynomials of degree not exceeding two.e.. Finally. = t I. We shall now orthogonalize this basis. -.. . As in Example 2 the process of orthogonalization leads to the sequence of polynomials 1. P 1/3 is an orthogonal basis in R.24 LECTURES ON LINEAR ALGEBRA and f2. If . ek) = {01 1ff + ne). +77e0. then 52e. en be an orthonormal basis of a Euclidean space x= y = ?he. Apart from multiplicative constants these polynomials coincide with the Legendre polynomials 1 dk (12 1)k 2k k! dtk The Legendre polynomials form an orthogonal. t.t^-2. Finally we put e.Y)= Since $2e2 + ee. The vectors 1. 12 1/3. We put e. I) = f (t dt = 2a. = 1. e. perpendicular to ei ande. We select as basis the vectors 1. Let e1. = t. i.. Thus 1. Next we put e.e. + n2e2 + + enen. t. t. (i. n2e2 + (x. (3/5)1. By dividing each basis vector by its length we obtain an orthonormal basis for R. but not orthonormal basis in R. t. it follows that a = 0. We define the inner product of two vectors in this space as in the preceding example. 7 kk. i. . 12 form a basis in R. e. choose e.

Show that if in some basis f1. = (x. P. (x. The result just proved may be states as follows: The coordinates of a vector relative to an orthonormal basis are the projections of this vector on the basis vectors. y) Thus. § 2). then (x. n2. This is the exact analog of a statement with which we are familiar from analytic geometry. f'2. . Let x = eie. n. (7) E2e2 ene. then this We shall now find the coordinates of a vector x relative to an orthonormal basis el.. e1) = and. " en and ni. Further. 2. e2. where aik = aki and ei e2. f. el) = . Multiplying both sides of this equation by el we get el) + $2(e2 el) + e. + + Enn. El% + $27)2 + (x. (x. .. similarly. e). . = (x. polynomials of degree 0. let Q (t) be an arbitrary polyno- . P o(t) be the normed Legendre Let Po(t). Show that if f. Example 2. except that there we speak of projections on the coordinate axes rather than on the basis vectors.f. EXAMPLES.n-DIMENSIONAL SPACES 25 it follows that + Enfl. It is natural to call the inner product of a vector x and a vector e of length 1 the projection of x on e.(1). 1. e2).f for every x = basis is orthonormal. Cle=1 f is an arbit ary basis. y) = E nikEink. the inner product of two vectors relative to an orthonormal basis is equal to the sum of the products of the corresponding coordinates of these vectors (cf. + e(en . Thus the kth coordinate of a vector relative to an orthonormal basis is the inner product of this vector and the kth basis vector. 1.. and Y = nifi + + n. Y) = 071 + " ' Ent. e. EXERCISES 1. ?I are the coordinates of x and y respectively.

. cos 2t. We define an inner product in R1 by the usual integral (P. sin nt. cos nt. if k and I. 2 (8) Consider the system of functions 1. rn sin kt cos It dt = 0. sin t + al cos 2t + + 6. It follows from (7) that I Q(t)P. .{0 ldt = 2n.. f. sin nt of these functions is called a trigonometric polynomial of degree n. P(t). Q) = fo P(t)Q(t) dt. cos t + b. A linear combination P(t) = (a012) + a. if it is orthogonal to every vector x e RI.o 2r sin kt sin lt dt = 0. Perpendicular from a point to a subspace. Hence every polynomial Q(t) of degree n can be represented in the forra Q (t) = P(t) + c1P1(1) + c. . The shortest distance from a point to a subspace.) DEFINITION 2. are an orthonormal basis for R1. it follows that the functions (8') 1/1/2a. (l/Vn) cos nt. Let R. on the interval (0.(1) dt. (l/ n) sin t. (This paragraph may be left out in a first reading.26 LECTURES ON LINEAR ALGEBRA rnial of degree n. sin t. To this end we note that all polynomials of degree n form an n-dimensional vector space with orthonormal basis Po(t). (l/ n) cos t. (l/Vn) sin nt 2. cos t. be a subspace of a Euclidean space R. + cP(t). 2n It is easy to see that the system (8) is an orthogonal basis Indeed r 2r Jo cos kt cos It dt = 0 if k 1. We shall say that a vector h e R is orthogonal to the subspace R.f: sin% kt dt = 27. o cos% kt dt n 227 . Since . We shall represent Q (t) as a linear combination of the Legendre polynomials. sin 2t. 2n). The totality of trigonometric polynomials of degree n form a (2n + 1) -dimensional space R.

we note that f (f c2e2 + c. Indeed.e. ej.. . must be orthogonal to III. i. .. just as in Euclidean geometry. e.e. e. orthogonal to any linear combination of these vectors.f1I2 = If . + Hence. the vector fo f1 belongs to R. i.. m) implies that for any numbers 2. Right now we shall show that. as a difference of two vectors in RI. of f on the subspace R1 (i.e. of finding a vector f..f0 + 4 .e. (h. how to drop a perpendicular from f on 141). As a vector in 121.e. 2.. and is therefore orthogonal to h = f f.n-DIMENSIONAL SPACES 27 e then it is also If h is orthogonal to the vectors e.1. In other words. be an m-dimensional subspace of a (finite or infinite dimensional) Euclidean space R and let f be a vector not belonging to lt1. f. We shall now show how one can actually compute the orthogo- nal projection f.f1I2 = If - so that If - > If e..e. ej = (f.) = O. If - > If . We shall see in the sequel that this problem has always a unique solution.. e R1 and f1 f. then. ej = O (k = 1. to a basis of 12. be a basis of R1. Indeed. . 2. Let el. in R1 such that the vector h f f is orthogonal to R1. or. . for a vector h to be orthogonal to an m-dimensional subspace of R it is sufficient that it be orthogonal to ni linearly independent vectors in It1. f. By the theorem of Pythagoras If 412 + 14 . = To find the c. + 2. 21e1 + 22. et) = 0 (1= 1.. in). 1111 is the shortest distance from f to RI. Let R. i. we shall show that if f. 22. (h. We pose the problem of dropping a perpendicular from the point f to 121. (f0. is called the orthogonal projection of f on the subspace R1..41. must be of the form f. fo. The vector f.

. this system has a unique solution. EXAMPLES. Let y be a linear function of x1.. e.). e2) (ex. are the coordinates of to relative to an orthonormal basis of R1 or a non-orthonormal basis of A system of m linear equations in in unknowns can have a unique solution only if its determinant is different from zero.) (era. satisfy the system (11). We first consider the frequent case when the vectors e1. e1) c2(e2.. e2. Indeed. e1) ' (ex.. this vector has uniquely determined coordinates el. by the expression in (9) we obtain a system of m equations for the c. cm with respect to the basis el. in view of the established existence and uniqueness of the vector f0. c... x2. . It follows that the determinant of the system (11) (el. m). 2. c2. e. i. ei) (e (e2. e1) = (f. e1) + + c. . .(ei. In this case the problem can be solved with ease.. e1) (k = 1. let y= + c. Thus.. .. en.(e.28 LECTURES ON LINEAR ALGEBRA Replacing f. e..e.) (e2. The method of least squares. the system (11) must also have a unique solution. e2. e. on the subspace We shall now show that for an arbitrary basis el. Since the c.) must be different from zero. Since it is always possible to select an orthonormal basis in an m-dimensional subspace. are orthonormal. the coordinates c. e. e2. x. in such a basis the system (11) goes over into the system ci = (f. . Indeed. of the orthogonal projection fo of the vector f on the subspace 111 are determined from the system (12) or from the system (11) according as the c.x . (e2. This determinant is known as the Gramm determinant of the vectors e1. e. .) (e e.. 1. we have proved that for every vector f there exists a unique orthogonal projection f. e2.

As a measure of "closeness" we take the so-called mean deviation of the left sides of the equations from the corresponding free terms. Frequently the c. x. the quantity k=1 E (X1kC1 X2nC2 + + XinkCk Ykr The problem of minimizing the mean deviation can be solved directly.(y1. let us consider the n-dimensional Euclidean space of n-tuples and the following vectors: e. c2. . c.. of the vector X22. the system (13) is usually incompatible and can be solved only approximately. and y.. y2.e. = y . x.e. + Xm2 + xmc. Let x. = Y2.n-DIMENSIONAL SPACES 29 where the c. i. Cm so as to minimize the distance from f to numbers e1. Indeed. x). x. x2. . There arises the problem . its solution can be immediately obtained from the results just presented.. e2. If R1 is the subspace spanned by fo = c. y.e.. cm so that the left sides of the equations of determining t1. and + ciel c2e2 + Consequently. denote the results of the kth measurement. = (x. X2n). However. . To this end one carries out a number of measurements of ael. in (13) are as "close" as possible to the corresponding right sides. c2. .2. (14) represents the square of the distance from + c. One could try to determine the coefficients c1. x2. 4 4. Thus. X. e2e2 . = (x11.e. x2c2 + However usually the number n of measurements exceeds the number m of unknowns and the results of the measurements are xincl never free from error. e2 = (X21 f =_. y) in that space.n and the problem of minimizing the f to clel c2e2 + mean deviation is equivalent to the problem of choosing ni . are fixed unknown coefficients. x1). cm from the system of equa- + XnaCm = y1. The right sides of (13) are the components of the vector f and the left sides. tions X21C2 + Xl2C1 + X22 C2 + X11C1 . are determined experimentally.

. . (2. As we have seen (cf.. (13') When the system (13) consists of n equat ons in one unknown xic x2c = y2. e2). 29c = 38.. = (f. e2)c1 (e2. en)ci + (e.). e2. enc = (ee f). (supposed linearly independent). f = (3. el)a. EXERCISE. 4. e2)c2 + ' (e2. ei)c. ex) = I =1 xxiYs. The method of approximate solution of the system (13) which we have just described is known as the method of least squares. In this case the normal system Solution: el consists of the single equation (e1. e. xc the (least squares) solution is c (x. c = 38/29. 4). e1)c2 + (e2. ek) = 1-1 where (f. y) (x. the numbers c1. en)c. e. (e1. which solve this problem are found from the system of equations (e1. 3. (e e2)c. em)c. . then our problem is the problem of finding the projection of f on RI. = (f. x) k=1 xox x. e1).30 LECTURES ON LINEAR ALGEBRA the vectors el. = (f. xoxk. c2.c2 k I . Use the method of least squares to solve the system of equations 2c 3c 3 4 4c = 5. The system of equations (15) is referred to as the system of normal equations. + + (e. formula (11)). (15) (e1. 5).

f(t)g(t) dt.7 ' e. . as usual. (x. 2. k=0 where or ck = (t. that polynomial for which the mean deviation from f (t) is a minimum. the mean deviation (16) is simply the square of the distance from j(t) to P(t). y1).Mich differs from f(t) by as little as possible.13(1)i2 dl. Approximation of functions by means of trigonometric polynomials. Thus. Then the length of a vector f(t) in R is given by = 6atr EN)? dt. (cf. sin t Or sin nt Ahr . cos t b. P(t) = (a012) + a. y). Let (t) be a continuous function on the interval [0. 1. e. and this problem is solved by dropping a perpendicular from f(t) to R1. Let us consider the space R of continuous functions on the interval 10. . We shall measure the proximity of 1(t) and P (t) by means of the integral u(t) . el). 2n]. the required element P(t) of R. g) = 21. Since the functions 1 eo V2. para. by means of the integral (I. sin t + + an cos nt b sin nt. Our problem is to find that vector of R._. cos nt . which is closest to fit). cos t Nhz 1 e. 2:r] in which the inner product is defined. of R of dimension 2n + 1.n-DIMENSIONAL SPACES 31 In this case the geometric significance of c is that of the slope of a line through the origin which is "as close as possible" to the points (x1. 2n form an orthonormal basis in R. we are to find among all trigonometric polynom als of degree n. e. It is frequently necessary to find a trigonometric polynomial P(t) of given degree v. y2). Consequently. (x2. is P(t) = E ce1. Example 2). The trigonometric polynomials (17) form a subspace R.

Thus in § 2. ' 7( /0 1(t) sin kt dt. To be more specific: DEFINITION 2. o b= 5 "Jo 127 f(t) sin kt dt.. 3. cos kt + b sin kt k=1 n 127 5 X o fit) dt. y) = (x'. C2k - ThuS. 27 1 J..e. then it associates with the sum x + y the sum x' y'. If X 4> X' and y 4> y'. 2 * Ea. are said to be isomorphic if it is possible to establish a one-to-one correspondence x 4> x' (x e R. etc. in § 2. e. then Axe> Ax'. i. x' e R') such that I.. the inner products of corresponding pairs of vectors are to have the same value. Isomorphism of Euclidean spaces...1c. We observe that if in some n-dimensional Euclidean space R a theorem stated in terms of addition. Two Euclidean spaces R and R'. =7 P(t) := . i.. "vector" stood for an n-tuple of real numbers. and bk defined above are called the Fourier coefficients of the function fit)... y').2" Vn o f(t) cos kt dt. a =- 1 x 5 2 1(1) cos kt dt. We have investigated a number of examples of n-dimensional Euclidean spaces. 1 tik.e. for the mean deviation of the trigonometric polynomial from f(t) to be a minimum the coefficients a and bk must have the values a. Example 2. then the same . Example 5. scalar multiplication and inner multiplication of vectors has been proved..32 n-DIMENSIONAL SPACES = Vart o -I 27 f(t)dt. If x 4> x' and y 4--> y'. then x + y X' + y'. The question arises which of these spaces are fundamentally different and which of them differ only in externals. a. If x x'. if OW correspondence associates with X E R the vector X' E R' and with y e R the vector y' e R'. The numbers a. then (x. In each of them the word "vector" had a different meaning.. it stood for a polynomial.

then. y') = El% + $2n2 + + $nn. As our standard n-dimensional space R' we shall take the space of Example 2. Y) = $1ni + $2n2 + because of the assumed orthonormality of the e. We shall show that all n-dimensional Euclidean spaces are isomorphic to a selected "standard" Euclidean space of dimension n. . The one-to-one nature of this correspondence is obvious. e2. = Eft?' + $2n2 + + Now let R be any n-dimensional Euclidean space. THEOREM 2.. § 2.e. that the inner products of corresponding pairs of vectors have the same value. The following theorem settles the problem of isomorphism of different Euclidean vector spaces. (x. all arguments would remain unaffected. . Conditions 1 and 2 are also immediately seen to hold..n-DIMENSIONAL SPACES 33 theorem is valid in every Euclidean space R'. 8n) in R'. We associate with the vector x= e2e2 + + ene in R the vector = (81. This will prove our theorem. Let el. 82. Clearly. + Ennn. i. in view of the properties 1. e) and y' = (7? n . isomorphic to the space R. ¿2. nn) is defined to be a (x'. in which a vector is an n-tuple of real numbers and in which the inner product of two vectors x' = (E1. It remains to prove that our correspondence satisfies condition 3 of the definition of isomorphism. On the other hand. We now show that this correspondence is an isomorphism. All Euclidean spaces of dimension n are isomorphic. the definition of inner multiplication in R' states that (x'. 2. if we replaced vectors from R appearing in the statement and in the proof of the theorem by corresponding vectors from R'. Indeed. . en be an orthonormal basis in R (we showed earlier that every Euclidean space contains such a basis). . 3 of the definition of isomorphism.

the vectors in question span a subspace of dimension at most three. Y): i. Bilinear and quadratic forms In this section we shall investigate the simplest real valued functions defined on vector spaces. of the proposition of elementary geometry just mentioned. Again. a geometric theorem about a pair of vectors is true in any vector space because it is true in elementary geometry. b]. inequality (7) of § 2 Ix + yl ixl is stated and proved in every textbook of elementary geometry as the proposition that the length of the diagonal of a parallelogram does not exceed the sum of the lengths of its two non-parallel sides. 4. the inner products of corresponding pairs of vectors have indeed the same value.. which expresses inequality (7). inner multiplication and multiplication of vectors by scalars) pertaining to two or three vectors is true if it is true in elementary geometry of three space. an assertion stated in terms of addition. and it therefore suffices to verify the assertion in the latter space.e. the inequality. Any "geometric" assertion (i. . This subspace is isomorphic to ordinary three space (or a subspace of it). EXERCISE. Indeed. y') = (x. To illustrate. is a direct consequence.34 LECTURES ON LINEAR ALGEBRA Thus (x'. In particular the Schwarz inequality vr (f(t) g(t))2 dt VSa [ f (t)]2 dt bb VSa [g(t)]2 dt. § 4. This completes the proof of our theorem. § 1. via the isomorphism theo- rem.e. and is therefore valid in every Euclidean space.. in the space of continuous functions on [a. para. Prove this theorem by a method analogous to that used in The following is an interesting consequence of the isomorphism theorem. § 2. We thus have a new proof of the Schwarz inequality.

e' Further. e'2. . + anen f(X) = a252 . DEFINITION I. A linear function (linear form) f is said to be defined on a vector space if with every vector x there is associated a number f(x) so that the following conditions hold: _fix f(x) +AY). en be a basis in an n-dimensional vector space. e'n be two bases in R. x a vector whose coordinates in the given basis are E1. the properties of a linear function imply that + ene) eifie.e2 + + 2. then (1) f(x) = aiel a252+ -in amen.n-DIMENSIONAL SPACES 35 1. n). Linear functions. Since every vector x can be represented in the form x= enen. Thus let et. if et. . . What must be remembered. + ac21e2 + acne. en is a basis of an n-dimensional vector space R. The exact nature of this dependence is easily where f(e) = explained. e2. e2.2. . Let et. e2. and f a linear function defined on R. . E. Y) !(Aa) = 1f (x). + 122e2 + e'2 + ocnIenr + (Xn2en. 2. however. is the dependence of the a.) f(x) = f&ie. on the choice of a basis. e2. e and e'1. The definition of a linear function given above coincides with the definition of a linear function familiar from algebra. e by means of the equations e'.). be expressed in terms of the basis vectors et. . = 1. Let the e'. Linear functions are the simplest functions defined on vector spaces. $2. = acne. + E2e2 + Thus. ¿j(e2) + + enf(e. let 2.e.

+ y2) = A (x. cogrediently). if we keep y fixed. In other words. 1. A (x. A (Ax.. A (x.. y) is a linear function of y.e. n2. bt2. y) is a linear function of y. ey. Bilinear forms.) = ctik + c(2k az + This shows that the coefficients of a linear form transform under a change of basis like the basis vectors (or. y). Consider the n-dimensional space of n-tuples of real numbers. y = n2.(E1. 2. . Again. yz). y) is a linear function of x. A (x. . In what follows an important role is played by bilinear and quadratic forms (functions). y). noting the definition of a linear function. n as constants. A (x. yi) + A (x. y) = (x. i. y) is said to be a bilinear function (bilinear form) of the vectors x and y if for any fixed y. en). + + akf(e. en. + a'2E12 + e'. relative to the basis e'1. and define (2) A (x. Indeed. DEFINITION 2.36 LECTURES ON LINEAR ALGEBRA relative to the basis el. A (x. as it is sometimes said. if $1. k1 ae ink depends linearly on the $. and + a'e' f(x) = a'le'.) and a' k = f(e'). i.. E). if we regard ni . y) = + an 27/1 anlennl a12e072 anE27/2 ' ' ' ' + n a2 ne2n an2enn2 + + annennn A (x. for any fixed x. EXAMPLES. y) is a linear function of x =. . Let x = ($1. e2. conditions 1 and 2 above state that A (xi + x2. Since a. $2. E are kept constant. = f(e. y) is a bilinear function. nk). pty) = yA(x. . y). A (x. A (x. it follows that cc2ke2 + + C(nk en) = Xlki(ei) ac22f(e2) ai = f(xikei + anka. y) + A (x2. e'2. y) = A (x.

A (f. e2. Axioms I. e2. ri of coordinates ei. in this case. t) be a (fixed) continuous function of two variables s. /he. y) using the e of x and the coordinates ni. 3. We shall express the bilinear form A (x. g) = f(s)g(t) ds dt = f(s) ds g(t) dt. y) = A (y x) for arbitrary vectors x and y. The matrix of a bilinear form. Thus. y) = A (ei ei $2e2 + In view of the properties 1 and 2 of bilinear forms . bilinear form. t)f(s)g(t) ds dt. Indeed. Conditions 2 have analogous meaning. DEFINITION 3. We defined a bilinear form en be a basis in n-dimensional axiomatically. e2. Indeed. 2. In Example / above the bilinear form A (x. space. the first part of condition 1 of the definition of a bilinear form means. i. y relative to the basis e1. g) = I then A (f. 1. Now let el. g) is the product of the linear functions!: f(s) ds and Jab g(t) dt. g) is a bilinear function of the vectors f and g. y) defined by (2) is symmetric if and only if aik= aid for all i and k. . If we put b sb K(s. A( f. + 71e).n-DIMENSIONAL SPACES 37 2. EXERCISE. that the integral of a sum is the sum of the integrals and the second part of condition 1 that the constant A. symmetric 21 A bilinear function (bill ear form) is called A (x. y) in a Euclidean space is an example of a symmetric bilinear form. Show that if 1(x) and g(y) are linear functions. )72e2 + A (x. n. e... Let K(s. 3 in the definition of an inner product (§ 2) say that the inner product is a symmetric. may be removed from under the integral sign. Let R be the space of continuous functions f(t). + e en. t. The inner product (x. t) b jib A (f.e. then their product 1(x) g(y) is a bilinear function. then If K(s.

1. 6 0 6 6 It follows that if the coordinates of x and y relative to the basis e. EXAMPLE. e the form A (x. y) = 65'oy. Thus given a basis e1. y). y) = I A (ei. 1.91 of the bilinear form A (x. y) = written as ai. EP Let R be the three-dimensional vector space of triples 59) of real numbers. respectively. 1 1 1 1 1 1+2 1 1 1+3 1 1 = 6. 1).. 1). e1) by a. 1). + tine.17' . e. . = A (ei. Let us choose as a basis of R the vectors e. = (1. a = 11 + 2 1 + 3 (-1) (-1) = 6. TA. Nlaking use of (4) we find that: an a = an 1+2 1+3 (-1) = 0. We define a bilinear form in R by means of the equation A (x. 4.e To sum up: Every bilinear form in n-dimensional space can be A (x: Y) = where X -r. e2. + 3.38 LECTURES ON LINEAR ALGEBRA A (x. 5'. n'a + + 2 6E'. y) is determined by its matrix at= Ha11. 4E'. e2. A (x. (El. and n'2. a= a 1 + 2 U (-1) + 3 (-1) = 4.. then A (x. y) = El% + 2eon + 3eana. ek). = (1. y = ?he. are denoted by 5'. 1 + 2 (-1) (-1) + 3 (-1) (I) = 6. en. e2.$1e1 + i. = (1. 1. e.. k=1 or. y) relative to the basis el. and compute the matrix . 1=1 azkink..2 (-1) + 3 (-1)(-1) = 2. + a. a = a= 1 1 1. .e. e. and The matrix a/ = is called the matrix of the bilinear form A (x. if we denote the constants A (ei.. 1 a 1 1 - = 04 4 2 d.

e2. f2. . w [clici2 .1 are. the numbers c. = c' transpose W' of W.n-DIMENSIONAL SPACES 39 4. y = fe.ß=1 . f be two bases of an n-dimensional vector space.. To this end The e'.zbk. e2. Our problem consists in finding relative to the basis f1. c/-1 Using this definition twice one can show that i = d. en and c. e and .. To find this value we make use of (3) where in place of the ei and ni we put the coordinates of fp . and fe relative to the basis e1. It follows that by. ca. e2. Now b becomes 4 4 As is well known. y) relative to the basis e1. We shall now express our result in matrix form. the matrix of that form L. c. is the value of our bilinear form for x f. i.4 = I ibikl 1.. c22e2 + In = crne + c2e2 + which state that the coordinates of the vector fn relat ve to the . C21C22 cn2c2 C2n cn is referred to as the matrix of transition from the basis e. a. + + 1 f2 = cue].. fe) = k=-1 acic. i. the elements of the we put e1. e are cm... e. fe).e. (4)] b = A (f.Vt. . .e. the element c of a matrix 55' which is the product of is defined as two matrices at = 11a0. c22. e and f. c... then = E abafic Aft. . Let si = I laid I be the matrix of a bilinear form A (x.. f. Let el.11 and a = cik E ai. to the basis f1. Transformation of the matrix of a bilinear form under a change of basis. e2. c. Let the connection between these bases be described by the relations = cue' (5) c21e2 d- + ce.. = A (fp. f2. By definition [eq. f2. e2. of course. (6) . the matrix I IbI I given the matrix I kJ. The matrix basis el.. bp.

x). The polar form A (x. f2. x)i . e2. Hence in view of the symmetry of A (x. where W is the matrix of transition from e1.e.40 (7*) LECTURES ON LINEAR ALGEBRA t. Let A (x.. y) be a symmetric bilinear form. Thus. basis el. A (x. x) + A (x. f. x). y) as well as the symmetric bilinear form A . Using matrix notation we can state that (7) =wi sr. . e to f1. . Since the right side of the above equation involves only values of A (x. A(y. f and W' is the transpose of W. x) + A (y. it follows that A (x.:W its matrix relative to the basis f1. THEOREM 1. y) be a symmetric form is justified by the following result which would be invalid if this requirement were dropped. quadratic form. if s is the matrix of a bilinear form A (x. x). (x. 5. y) -- (x + y. . k=1 Icriaikc. To show the essential nature of the symmetry requirement in the above result we need only observe that if A (x. y) + A (y. e and [. . y) = A (y. y) is u iquely determined by i s Proof: The definition of a bilinear form implies that A (x 4-. The requirement of Definition 4 that A (x. x) A (y. y) is indeed uniquely determined by A (x. y) by putting y = x is called a quadratic form. Quadratic forms DEFINITION 4. in view of the equality A (x.y. y) is any (not necessarily symmetric) bilinear form. e2. y) is referred to as the bilinear form polar to the quadratic form A (x. A (x. x) obtained from A (x.. y)]. x + y) the quadratic form A (x.. y). then A (x. x)). y) (i. x + y) = A (x. f2. The function A (x. y) relative to the then PI = W' dW.

y) = A (y. x) > 0 for x O. y) its polar form. y) = A (xl. x). y) can be expressed in terms of the coordinates E. Hence. x) = + $22 + -in $2 is a positive definite quadratic form. x) is called positive definite if for every vector x A (x. y). A quadratic form A (x. au. x).n-DIMENSIONAL SPACES 41 give rise to the same quadratic form A (x. y). x) can be expressed as follows: A (x. y) + A (x2. x) 0 and A (x. x) > O. y) = A (x. such a bilinear form always defines an inner product. Conversely. y) of two vectors is taken as the value A (x. x). of x and nk of y as follows: A (X. Let A (x. This enables us to give the following alternate definition of Euclidean space: A vector space is called Euclidean if there is defined in it a positive definite quadratic form A (x. = k=1 We introduce another important DEFINITION 5. = a. It follows that relative to a given basis every quadratic form A (x. x). EXAMPLE. y) associated with A (x. 3) I atkeznk. + x2. x) be a positive definite quadratic form and A (x. A (ibc.. In such a space the value of the inner product (x. x) = E aikeik. The definitions formulated above imply that A (x. an inner product is a bilinear form corresponding to a positive definite quadratic form. We have already shown that every symmetric bilinear form A (x. y) of the (uniquely determined) bilinear form A (x. These conditions are seen to coincide with the axioms for an inner product stated in § 2. (x. where a. It is clear that A (x. . A (x.

. A (x. . Consider the coordinate transformation defined by = nil + 7/'2 n2 = 7711 nia (k = 3. Reduction of a quadratic form to a sum of squares VVre know by now that the expression for a quadratic form A (x. We now show how to select a basis (coordinate system) in which the quadratic form is represented as a sum of squares. x) in terms of the coordinates of the vector x depends on the choice of basis. it contains one product say. § 1) we may write the formulas for coordinate transformations in place of formulas for basis trans- formations.e. . We now single out all above. para. the coefficient of )7'2. those terms of the form which contain ann12 + 2annIn2 + We shall assume slightly more than we may on the basis of the O. f3. 6. namely. If the form A (x.2) is not zero.. We shall now carry out a succession of basis transformations aimed at eliminating the terms in (2) containing products of coordinates with different indices.f be a basis of our space and let .. .n) nk = n'k Under this transformation 2a12771)72 goes over into 2a12(n1 771). . stays different from zero.42 LECTURES ON LINEAR ALGEBRA § S. If this is not the case it can be brought about by a change of basis consisting in a suitable change of the numbering of the basis elements. Thus let f1. x) in which at least one of the a (a is the coefficient of )7. . )72. x) = Al$12 + 12E22 . Since an = an = 0. that in (2) a n2 + 2ainnin. $n2. nn are the coordinates of the vector x relative to this basis. x) to a sum of squares it is necessary to begin with an expression (2) for A (x. In view of the one-to-one correspondence between coordinate transformations and basis transformations (cf. nn . x) (supposed not identically zero) does not contain any square of the variables n2. X) = a zo in where ni. To reduce the quadratic form A (x. i. . A (x. 2a1277072.

. " 71: = then our quadratic form goes over into A (x. n3*. if necessary. write an/h2 (3) 1 24112971712 (ann. 272** = (122* n2* + a23* n3* + + nn*.hThc. + 2a/nn + a1)2 B. where the dots stand for a sum of terms in the variables t)2' If we put 711* = aniD 1)2* a12n2 . our form becomes A (x. *2 + all n *.. x) = 1 a 11 (a11711 + ' + a1)2 + 4. by auxiliary transformations discussed above) and carry out another change of coordinates defined by ni ** n3** = nn** ni* . k=2 is entirely analogous to the right side of (2) except for the fact that it does not contain the first coordinate. x) n . If we assume that a22* 0 0 (which can be achieved.1=3 . fi ** t. " (Jinn.i. so in (2) the quadratic form under consideration becomes (x. ' ' It is clear that B contains only squares and products of the terms that upon substitution of the right side of (3) al2172.n-DIMENSIONAL SPACES 43 and "complete the square. * si ik ' The expression a ik* n i* n k* i. *.e. x) au th**2 a22* a ** **2 + n2ik.

7122 8%2. . e2. para. en of R relative to which A (x.j* Finally. Thus let A (x.2 +722. 8712. . x) = Again. We may now sum up our conclusions as follows: THEOREM 1. E are the coordinates of x relative to e1. If m < n. E2. x) be a quadratic form in an n-dimensional space R. x) =_ n1.. x) If 271. e2. x) .8 + 27j + 41)/ 2?y. § 1) and to see that each change leads from basis to basis. e.7/2 4flo.s = then A (x. 21E18 ¿2$22 27n em2.j* . 171* + = n. we put 4+1= = An = O. = then A (x.f3. es = e3 = 712. Then there exists a basis el.e.44 LECTURES ON LINEAR ALGEBRA After a finite number of steps of the type just described our expression will finally take the form A (x. by the equation A (x. + 27 s* '73' . x) (cf. We shall now give an example illustrating the above method of reducing a quadratic form to a sum of squares. i. . if SI = rits. if 71' . to n linearly independent vectors. x) = 21E12 + 22E22 where E1. Let A (x. An en2. We leave it as an exercise for the reader to write out the basis transformation corresponding to each of the coordinate transformations utilized in the process of reduction of A (x. x) be a quadratic form in three-dimensional space which is defined. x) has the form A (x.f2. 6. Vi = 77. relative to some basis f1. where m n.2 + 4.

e in terms of711. 7/2. 712*. = e2 = d21f1 6/12f2 + + + d2nf. ni. ¿3= In view of the fact that the matrix of a coordinate transformation is the inverse of the transpose of the matrix of the corresponding basis transformation (cf.% e2 . n** in terms of ni*. § 1) we can express the new basis vectors ei. n take the " form c12n2 + " C22n2 + C2nnn the matrix of the coordinate transformation is a so called triangular matrix. It is easy to check that in this case the matrix i. ' f. e. . 712. . x) assumes the canonical form A (x.. in terms of the old basis vectors f2. 6. we can express el.ìj in the etc.e. 112". 712. d2212 + = dnif d2f2 + + If the form A (x. x) is such that at no stage of the reduction process is there need to "create squares" or to change the numbering of the basis elements (cf. x) = _e. Ej = cu.n-DIMENSIONAL SPACES 45 then A (2c. 712*. form = ciini + Cl2712 E2 = C21711 C22n2 Clnnn C2nnn $n = enini + cn2n2 + Thus in the example just given 1/1 + Cnnn. e2 2n3. e2. e. for nj**. )73. . . . . r ni. e2. para. the beginning of the description of the reduction process in this section). en in terms of ni. E2.2 e22 12e22 If we have the expressions for ni*. . then the expressions for El. n* in terms of n.

. y) and the initial . = difi c/n2f2 + + § 6. However. ek) 0 for i k (i k --= 1.46 LECTURES ON LINEAR ALGEBRA of the corresponding basis transformation is also a triangular matrix: = e2 dfif1 d22f2. = a21 2n 0 am. . In contradistinction to the preceding section we shall express the vectors of the desired basis directly in terms of the vectors of the initial basis. n).. 0 0. e2. 2. an2 requirement that in the method of reducing a quadratic form to a sum of squares described in § 5 the coefficients an . the following determinants are different from zero: ali (1) a11 412 a22 0. It is our aim to define vectors el.4. Now let the quadratic form A (x. We assume that basis f1. Thus let a II be the matrix of the bilinear form A (x. y) relative to the basis f1. = A (fi. f by the equation (It is worth noting that this requirement is equivalent to the A (x. be different from zero. f2. x) be defined relative to the basis f1. a22 a1n di. er. 2-1 where ai. . Reduction of a quadratic form by means of a triangular transformation 1 In this section we shall describe another method of constructing a basis in which the quadratic form becomes a sum of squares. etc. . f2. fn. LI 2= at2 a21 mm. O. fn. i. . fk). . x) = I a1k. this time we shall find it necessary to impose certain restrictions on the form A (x. f2. a22*. en so that (2) A (ei.

= 0 for i < k and therefore. A (ek.k I). such that the vector ek = satisfies the relations A (ek. e1) = A (ek. en is the required basis. (i = 1.x. f1) = 0 for every k and for all i < k. and to obviate the computational difficulties involved we adopt a different approach. also for i > k. then = ocii A (ek. Thus if A (ek. OCk2. Our problem then is to find coefficients . c22f2 + e= + We could now determine the coefficients . However.. fi) oci2A (ek.k from the conditions (2) by substituting for each vector in (2) the expression for that vector in (3). e2.e. in view of the symmetry of the bilinear form.n-DIMENSIONAL SPACES 47 We shall seek these vectors in the form = (3) e2 c(21f1 22f2. f2) + A (e.k I. We claim that conditions (4) and (5) determine the vector ek . We assert that conditions (4) determine the vector ek to within a constant multiplier.. We observe that if for i = 1. Indeed. . e1. f2) = 1. i. this scheme leads to equations of degree two in the cc°. Ctkk atk2f2 + + Mkkfk = 0. 2. by ai1f1 oci2f2 ' then A (ek. if we replace e. ei) = O for . fi). To fix this multiplier we add the condition A (ek.212 + + 2(f1) + aA(ek. + . 2. fi) = 0 then (ek. .

ek). + akkA (f2. we are led to the following linear system for the kg. f1) 12A (f1.. bin = 0 for i k. chkiA (f1. f2) and is by assumption (1) different from zero so that the system (6) has a unique solution. ek) = ockk' The number x11 can be found from the system (6) Namely. f2) + ' 111A (f2' f1) (6) + kkA (f1. f = °. fk) = °.-1 QC/0c = where A1_. As A (ei. e2. 0. f A (fk. f2) + + 11114 fek. fk) 14 The determinant of this system is equal to A (fk. i. fk) A (fk. x) relative to the basis e1. f1) A (f2. is a determinant of order k Ao = 1. fl) oc12A (fk. fk) A (f2. f 1. It therefore remains to compute b11 = A (ek. f2) ac12A (f2. we already know b11 It remains to find the coefficients bt of the quadratic form en just constructed. Now A (ek. Substituting in (4) and (5) the expression for e. Thus conditions (4) and (5) determine ek uniquely. I analogous to (7) and . by Cramer's rule.12A (f1-1. f2) A (f 2. The proof is immediate. oc11f1 = C(11A (e1.48 LECTURES ON LINEAR ALGEBRA uniquely. f2) + + lickA (fk. ek) = for i k. fl) 12f2 + + °C1741) C(12A (elc.e. A 1. f2) + acnA (f1-1. The basis of the ei is characterized by the fact that A (e1.. ek). fk) I A (fk. f2 A (f2. A (x. r which in view of (4) and (5) is the same as A (ek. + akkA (LI. fl) (flt. f1) A (fi. ek) = A (ek. as asserted.

e. 2. .n-DIMENSIONAL SPACES 49 Thus blek = A . f. e2. e2. . 1). = (0. e) = A k-1 Ak To sum up: THEOREM 1.42 an An = an a12 a22 all ' an an a12 aln a2n (inn an2 be all different from zero Then there exists a basis el. =-. 0). In fact. . x) = l<=1 aiknink a ik = A (fi. . Consider the quadratic form 2E1' + 3E1E2 + 4E1E3 + E22 + in three-dimensional space with basis f= (1. are the coordinates of x in the basis el. . f (or if one were simply to permute the vectors f1. if one were to start out with another basis . 0. REMARK: The fact that in the proof of the above theorem we were led to a definite basis el.e A (x. en need not have the form (3). x) be a quadratic form defined relative to some basis f1. known as the method of Jacobi. e2. . fk). x) = 40 AI _AI A2 22 + A I. This method of reducing a quadratic form to a sum of squares is . 1. A Here 4Ç. 0. . en. let the determinants . x) is expressed as a sum of squares. Also. en in which the quadratic form is expressed as a sum of squares does not mean that this basis is unique. fr. EXAMPLE. Let A (x. en. f2. it should be pointed out that the vectors el. by the equation A (x. f2. e2.11=a11. Further. fn) one would be led to another basis el. e2. f2. relative to which A (x. f. 0). .(0.

0). 2a = I.e. 121 = 6.e./3 The determinants A. 0. 12 133 8 -y.f. 0). none of them vanishes. 822.50 LECTURES ON LINEAR ALGEBRA The corresponding bilinear form is A (x. 833 = 1. 8. i.. 2M21 = 0. = 6f1 Finally. our quadratic form becomes A(x. 43. whence and e. = 83113 + 822f3 + ma. f2) 1.&7.) = 0 and A (e2. or o( i and e. i. 839. 1. o r. 12) = 0. 833). =(j 0.. ct.. Or an 8f2 = (6. Relative to the basis e. fi) = 0. 232. 43 are 2. is found from the condition A (e1.x) = C12 + Ai C13 42 AH 43 C32 Cl2 8C22 11-7C32. = if. (823. = (cc. The coefficient cc. + $012 + 2e3m1 + e3 7.' -33 and e3-1871 1 12ea + 117 fa _(S 127.. y) =. = 82313 a22f2 e. e. e. Thus our theorem may be applied to the quadratic form at hand. 21ai 1832 + 2833 = jln + 832 28. + teh. --1-.. C C2 are the coordinates of the vector x in the basis e. 0). whence 831 = 0. e e . = (i. Here C. A (e3. fa) = 1 A (ea. are determined from the equations A (es. Next a and 822 are determined from the equations A (e2. f2) = 1. 0). 117). e2. Let el = ce.2e1n1 pi.f.

in which A (x. Z12. The number of negative coefficients which appear in the canonical form (8) of a quadratic form is equal to the number of changes of sign in the sequence 1. THEOREM 2. A2 " 4 e22 2 An_. If A1> 0. x) = I 21E12 1=1 0 for all x and is equivalent to E1= E2 = = En = O. x) .n-DIMENSIONAL SPACES 2. have opposite signs. A. e7. A. . 51 In proving Theorem I above we not only constructed a basis in which the given quadratic form is expressed as a sum of squares but we also obtained expressions for the coefficients that go with these squares. are positive. A1. so that the quadratic form is 1 . . then the quadrat e form A (x. Hence A (x. Assume that d. A2 > 0. where all the A. (8) It is clear that if A1_1 and A. These coefficients are I A. Then there exists a basis e1. x) takes the form A (x. Hence. Actually all we have shown is how to compute the number of positive and negative squares for a particular mode of reducing a quadratic form to a sum of squares. then of E12 is positive and that if this coefficient is negative. is positive definite. have the same sign then the coefficient and A. > 0. A > O. x) A (x. > 0. A. e2. /12 > 0. In the next section we shall show that the number of positive and negative squares is independent of the method used in reducing the form to a sum of squares. In other words.. x) 4E12 + 22E22 Anen2. .

f.142. . kikfk.f. We first disprove the possibility that A (fi. 2. 1. Let A (x. i. then one of the rows in the above determinant would be a linear combination of the remaining rows. x) be a positive definite quadratic form. p2f2 + -F p. f. f. . We have thus proved 4> 0. = 0. x) = A2E12+ A2E22 A2En2. let A (x. the latter equality is incompatible with the assumed positive definite nature of our form.) A (f fie) A (f.52 LECTURES ON LINEAR ALGEBRA Conversely.a. fi) A (f. f. so that A (yifi p2f2 + ' + p. fi) . For the quadratic form A (x..) 1.. f2) A(f2. y. tl > O.f2) A (f. 0. n) combined with Theorem 1 permits us to conclude that it is possible to express A (x. it follows THEOREM 3. fz) + + yk.A (f2. The fact that A. y) be a symmetric bilinear form and f2.e.. (k 1. This theorem is known as the Sylvester criterion for a quadrat c . n). x) in the -F form A (x. 2. Ak _ A k-1 Since for a positive definite quadratic form all that all A.) = O. > 0 (we recall that /10 = 1). fi) ---. g212 + In view of the fact that pif.) If A.0 (i 1. But then A (pifi p2f2 -F + phf.A (fk.) 0.f. k. 2. . not all zero such that yiA(fi. f2) A (f f. fi) A (f2.. it would be possible to find numbers y1. We shall show that then 4k> 0 (k A (f. a basis of the n-dimensional space R. 42 > 0. form to be positive definite. . x) to be positive definite it is necessary and sufficient that > 0. . k).

x) derivable from inner products.. x) every theorem concerning positive definite quadratic forms is at the same time a theorem about vectors in Euclidean space. Haik11. . we may put (x. y) can be taken as an inner product in R.. e. The Gramm determinant. A2. x). if (x. A.. If A (x. i. is known as the Gramm determinant of these vectors THEOREM 4. Indeed. 3. ek) (ek. ek) (ek. is always >_ O.p. This determinant is zero if and only if the vectors el. express the conditions for positive definiteness of A (x. This implies the following interesting COROLLARY.. x) relative to some basis are positive. x) '(x. y) A (x.ti-DIMENSIONAL SPACES 53 It is clear that we could use an arbitrary basis of R to express the conditions for the positive definiteness of the form A (X. then A (x. . x) is positive definite. y) is a symmetric bilinear form on a vector space R and A (x. . e2. e2) (e2. X). A2. e2. Thus every positive definite quadratic form on R may be identified with an inner product on R considered for pairs of equal vectors only. ek are linearly dependent. ek) (e2.o/ a matrix ilaikllof a quadratic form A (x. x). be k vectors in some Euclidean space. if 41. for quadratic forms A (x. i. x) is positive definite. then A (x. x) is positive definite. x) such that A (x. . I/the principal minors z1. Now let A be a principal minor of jja1111 and let p 1. e2) (ek. e. An would be different principal minors of the matrix the new A1. e2. y) (x. (x. Conversely. then all principal minors ok that matrix are positive. e2) (el.4. are all positive. . One consequence of this correspondence is that A (x.e. y) is a bilinear symmetric form on R such that A (x.. The Gramm determinant of a system of vectors e1. then A (x. . x) relative to the new basis. . Let e1. f2. we see that A > O. y) is an inner product on R. If we permute the original basis k) and vectors so that the pith vector occupies the ith position (i 1.e. The results of this section are valid for quadratic forms A (x. then if we used as another basis the vectors f1. . y).2. be the numbers of the rows and columns of jaj in A. The determinant el) e1) (el. A. In particular f in changed order.

ek coincides with the determinant 4. + yrz.x. z28 + za2 . vectors e1. e. are linearly independent. y. . 3/42 + 3/42 + x32 = Y1 X1 + Y2 x2 T Y3 X3 z. in that case one of the vectors.x. Indeed.y. y. Then the Gramm determinant of Consider the bilinear form A (x. (x. e2. + z. is a linear combination of the others and the determinant must vanish.54 LECTURES ON LINEAR ALGEBRA el. where (x. discussed in this section (cf. has indeed the asserted geometric meaning. We shall show that the Gramm determinant of a system of Ale].Y. y) = (y. 1.. e2. . + z. x) is positive definite it follows from Theorem 3 that 47c >0. Ya 23 where 3/4. z. x) > 0 is synonymous with the Schwarz EXAMPLES.y. Now. Indeed. ek is zero. ek . + zax. linearly dependent vectors e1. 3/423 2. y) is the inner product of X and y. (x. y) Proof. is a linear combination of the others. x1 y1 1. has the following geometric sense: d2 is the square of the area of the parallelogram with sides x and y. on the vectors x. (x. + y.z. y. are the Cartesian coordinates of x. = ixF2 ly12 2. x) = Ix1 lyr cos ry. J. i. x and y As an example consider the Gramm determinant of two vectors = (x.z. + z. Therefore. z is equal to the absolute value of the determinant xi 23 V In three-dimensional Euclidean space the volume of a parallelepiped 23 Ya 23 Ya 2.12 (1 cos2 99) = 1x12 13712 sin' 99. = 1x18 13. Since A (x. In Euclidean three-space (or in the plane) the determinant 4. where y) is the angle between x and y. y) is a symmetric bilinear form such that A (x. This completes the proof. z.y. say e.. e2. + 22e2 ' + It follows that the last row in the Gramm determinant of the e. x) The assertion that inequality.. e2. y). y) (Y. y 2'. Y) (Y. Assume that e1. 1x12 13712 cos' 9.z. (7)). 3/4 Y3 Y12 + Ya' + 1I32 x121 + x222 -.e.

.e. z) (z. The law of inertia T he law of inertia . it is possible to show that the Gramm determinant of k vectors y) x. the determinant (9) is referred to as the volume of the k-dimensional parallelepiped determined by w. y.n-DIMENSIONAL SPACES 55 (x. by vectors proportional to them we obtain a . (It is clear that the space R need not be k-dimensional. be even infinite-dimensional since our considerations involve only the subspace generated by the k vectors x.1. y. For a system of functions to be linearly dependent it is necessary and sufficient that their Gramm determinant vanish. z) Thus the Gramm determinant of three vectors x.2 By replacing those basis vectors (in such a basis) which correspond to the non-zero A. y. z is the square of the volume of the parallelepiped on these vectors.a f (t)dt and the theorem just proved implies that: The Gramm determinant of a system of functions is always 0. x) is a sum of squares. (1) A (x. R may. Y) (x. 3. z) (37. There are different bases relative to 1. w in a k-dimenional space R is the square of the determinant X1 Y1 22 Y2 " " Xfr Yk Wk W1 W2 where the xi are coordinates of x in some orthogonal basis. In the space of functions (Example 4. Similarly. (9) . indeed. which a quadratic form A (x. w. etc. the yi are the coordinates of y in that basis. the vectors x.) By analogy with the three-dimensional case.1 . y) (3'. /1(012(t)dt Pba 122(t)dt 11(1)1k(i)di Pb 12(t)f1(t)dt f " a 12(t)1(t)dt rb a rb Pb tic(1)11(1)dt 1k(t)12(t)dt . y. x) = 1-1 2. § 7. § 2) the Gramm determinant takes the form rb I rb 10 (t)de "b .

or 1. 2. in formula (1) are different from zero and the number of positive coefficients obtained after reduction of A (x. Now.. . and I. . It is natural to ask whether the number of coefficients whose values are respectively 0. z1'2. Then a certain matrix I ja'11 would take the place of I laikl and certain determinants would replace the determinants z11. 2. a2 a. ''2 = all a12 a22 z1n an an an an . are 0. x) which. § 6. To illustrate the nature of the question consider a quadratic form A (x. A. x) by means of a sum of squares in which the A. . and lis dependent on the choice of basis or is solely dependent on the quadratic form A (x. is represented by the matrix where a = A (ei. .. answers the question just raised. A1. I. ZI2. A'. relative to some basis el. If a quadratic form is reduced by two different methods (i. x). . . en... 1. e'2. an are different from zero. z11. zl. ek) and all the determinants 41 = an. . . known as the law of inertia of quadratic torms.. e2. as was shown in para. suppose some other basis e'1. e' were chosen. x) to a sum of squares by the method described in that section is equal to the number of changes of sign in the sequence 1. Then. The following theorem. then the number of positive coefficients as well as the number of negative coefficients is the same in both cases..56 LECTURES ON LINEAR ALGEBRA representation of A (x.e. in two different bases) to a sum of squares. There arises the question of the connection (if any) between the number of changes of sign in the squences 1.(12. all ). THEOREM 1..

±Q. f2. . )72p.. Let R' be the subspace spanned by the vectors el. It remains to show that x O.2e2 + + Atek = !IA Let us put + Akek = Pelf' /42f2 !lift = x. ti are the coordinates of x relative to the basis . which vanish is also an invariant of the form. . n2 ._ . . This means that there exist numbers pi not all zero such that Al. .) Let f. f2. basis of R". it follows that the number of coefficients A. fm. The vectors el. e2. . If x = 0. Hence x O. f. We can now prove Theorem 1. q'.. + + pit = 0. Ale 2. 2.e. e. Al. f2. n2p. Proof: Let e. f.. A2. i. . be a basis of R' and f. . say. e. + 52e2 + + $e -F -Fene. Proof: Let e. are linearly dependent (k 1 > n). . fi. Then there exists a vector x 0 contained in R' n R".e.. in (1) are invariants of the quadratic form. p would all be zero.. + erk. x) X 22 $2. We first prove the following lemma: LEMMA.2 $223+1 E2p+2 $2. Since the total number of the A.. " Ak. Assume ep.. e be a basis in which the quadratic form ei A2e2 + A (x. i. n2. A2. e. that this is false and that p > p'. (Here E. e2. . kt2.) We must show that p = p' and q =-. which is impossible. f2.2e2 + + Akek p2f. + .. is n. x) becomes A (x.. f.. and pi. and let k 1 > n. respectively. Let R' and R" be two subspaces of an n-dimensional space R of dimension k and 1. En are the coordinates of the vector x. 2... . in (1) and the number of negative A. x) = ni2 (Here )7. e2. E2. .n-DIMENSIONAL SPACES 57 Theorem 1 states that the number of positive A.e. f be another basis relative to which the n22 quadratic form becomes A (x. . ' It is clear that x is in R' n R". e2.

0 for all x e R. DEFINITION 2.. y. A (x.58 LECTURES ON LINEAR ALGEBRA R' has dimension p.e. on the other hand. y) we mean the set R. i. . x= and E. while not all the numbers il. A (x. yi y. Since n p>n (we assumed 1) > p').) The resulting contradiction shows that fi = p'. Similarly one can show that q = q'. To this end we shall define the rank of a quadratic form without recourse to its canonical form. . e2. number of non-zero coefficients 2. there exists a vector x 0 in R' n R" (cf. But this means that y.e 0 and its coordinates relative to the basis . e R. The reasonableness of the above definition follows from the law of inertia just proved. are E1. . and 41 e R.-+2 -c 0(Note that it is not possible to replace < in (5) with <. for. f. is a subspace of R. = O. A (x. Substituting these coordinates in (2) and (3) respectively we get. of all vectors y such that A (x. By the null space of a given bilinear form A (x. in one of its canonical forms..) = 0 and A (x. y. +ee X = np fil + + nil-Fa' fil+qt + nnfn The coordinates of the vector x relative to the basis e. By the rank of a quadratic form we mean the 2. We shall now investigate the problem of actually finding the rank of a quadratic form. i. Rank of a quadratic form DEFINITION 1. . . e. are 0. are zero.. The subspace R" spanned by the vectors fil. x) = + $22 > (since not all the E. f. +2 = = nil+0. y) = 0 for every x e R. on the one hand.. x) = . . = n. 0. y. Lemma). . It is easy to see that R. n. it is possible that nil+.eil+1. Q nil+. y2) = 0 for all x e R.e. Indeed.n22. Then A (x..±.) = 0 and A (x. let y. vanish) and. f has dimension n p'. This completes the proof of the law of inertia of quadratic forms. e Et. E2. 0.

We shall now try to get a better insight into the space Ro. the dimension of this subspace is n r. the rank of the matrix in question is n ro. cf. We defined the rank of a quadratic form to be the number of (non-zero) squares in any of its canonical forms. + an% = 0. anini + 12.[1 does depend on the choice of basis. 702 + n2f2 + A (fn.202 + ' + Thus the null space R. )72.n-DIMENSIONAL SPACES 59 If f. y) is independent of the choice of basis in R (although the matrix la .) = O. § 5). where r is the rank of the matrix Ikza We can now argue that The rank of the matrix ra11 of the bilinear form A (x. But relative to a canonical basis the matrix of a quadratic form is diagonal [1... a22n2 = O. 2. f. A (ft.) = Q /02 + + ni. y) 0 to belong to the null space of A (x. As is well known. consists of all vectors y whose coordinates 2h. We shall now connect the rank of the matrix of a quadratic form with the rank of the quadratic form. y) it suffices that n. f is a basis of R.17. nifi + nnf. and the null space is completely independent of the choice of basis. . . where ro is the dimension of the null space. f. then for a vector Y= n2f2 + + nnf.e) = aik. ant). 2.1 0 O Ao 0 An .) = Q + mit. Indeed. are solutions of the above system of linear equations.. Replacing y in (7) by (6) we obtain the following system of equations: A (f1. the above system goes over into + ainnn = 0. If we put A (fi.. 71f1 + n2f2 + A (f2. for i= 1.

. to find the rank of a quadratic form we must compute the rank of its matrix relative to an arbitrary basis. x)i.e. It is therefore reasonable to discuss the contents of the preceding sections with this case in mind.e. We mentioned in § 1 that all of the results presented in that section apply to vector spaces over arbitrary fields and.60 LECTURES ON LINEAR ALGEBRA and its rank r is equal to the number of non-zero coefficients. the rank of the quadratic form. 6 We could have obtained the same result by making use of the wellknown fact that the rank of a matrix is not changed if we multiply it by any non-singular matrix and by noting that the connection between two matrices st and . y) so that the following axioms hold: 1. § 8.4 (6' non-singular. Thus.. Complex vector spaces. The matrices which represent a quadratic form in different coordinate systems all have the same rank r. x) [(y. Many of the results presented so far remain in force for vector spaces over arbitrary fields. the rank of the matrix associated with a quadratic form in any basis is the same as the rank of the quadratic form. Since we have shown that the rank of the matrix of a quadratic form does not depend on the choice of basis. y) = (y. This rank is equal to the number of squares with non-zero multipliers in any canonical form of the quadratic form. to vector spaces over the field of complex numbers. 5 To sum up: THEOREM 2. i. x) denotes the complex conjugate of (Y. Complex Euclidean vector spaces. In addition to vector spaces over the field of real numbers. Complex n-dimensional space In the preceding sections we dealt essentially with vector spaces over the field of real numbers.. in particular. i. (x.41 which represent the same quadratic form relative to two different bases is .---- . By a complex Euclidean vector space we mean a complex vector space in which there is defined an inner product. a function which associates with every pair of vectors x and y a complex number (x. vector spaces over the field of complex numbers will play a particularly important role in the sequel.

(x. y) with y = tx would have different signs thus violating Axiom 4. x) i"(x. Complex Euclidean vector spaces are referred to as unitary spaces.x) x).1(y. If = (E1 E2 En ) and 2. 2. x) would imply (x. + (x. y2). (ix. x) is a non-negative real number which becomes zero (2x. EXAMPLES OF UNITARY SPACES. 2y) = il(x. ). x) = x) + (Y2. yi + Y2) = (Y1 + Y2. y). Y2) Axiom 1 above differs from the corresponding Axiom 1 for a real Euclidean vector space. y). -1. x) --= . In particular. y) (x. x). y. i. the numbers (x. Also. 2.. nn) are two elements of R. Ic--1 . y). Y) = (Y. Indeed. (x. y) only if x = O. y2) = y) (x.x2. This is justified by the fact that in unitary spaces it is not possible to retain Axioms 1. Axioms 1 and 2 imply that (x.n-DIMENSIONAL SPACES 61 2(x. we define (x. y) = (x1.(x. (x. ix) (x. Y) -= $217/2 + We leave to the reader the verification of the fact that with the above definition of inner product R becomes a unitary space. x) and (y. y). Indeed. 2 and 4 for inner products in the form in which they are stated for real Euclidean vector spaces. (x. 2y) = (2y.e. But then (Ax. /. ' ". Let R be the set of n-tuples of complex numbers with the usual definitions of addition and multiplications by (complex) numbers. x) = (x. y) -H (x2. The set R of Example i above can be made into a unitary space by putting y) = aikeifh.

en is an orthonormal basis and x= $2e2 + + $e. As in § 3 we prove that the vectors el. i. e is an orthonormal basis and = + ee. En and takes on the value zero only if el = C2 = en = O. If e.) x = e. e2. e2. y) = O. 0 for every n-tuple el. then (x. Let R be the set of complex valued functions of a real variable t defined and integrable on an interval [a. 3. Axiom 4 implies that the length of a vector is non-negative and is equal to zero only if the vector is the zero vector. not a real number. By the length of a vector x in a unitary space we shall mean the number \/(x. are given complex numbers satisfying the following two conditions: (a) a . 3. we do not introduce the concept of angle between two vectors. .e. g(t)) = f(t)g(t) dt. b]. Orthogonal basis..62 LECTURES ON LINEAR ALGEBRA where at. $2. .. x). Since the inner product of two vectors is. in general. that they form a basis. y = %el n2e2 + are two vectors. It is easy to see that R becomes a unitary space if we put (f(t). $2e2 + $en. Two vectors x and y are said to be orthogonal if (x. n2e2 + + nne. If e. Example / in this section). Y) = $2e2 + e2f/2 + + nnen + E71 (cf. . . The existence of an orthogonal basis in an n-dimensional unitary space is demonstrated by means of a procedure analogous to the orthogonalization procedure described in § 3. en are linearly independent. e2.>. .e. = (Th azkez f. en. Isomorphism of unitary spaces. e2. By an orthogonal basis in an n-dimensional unitary space we mean a set of n pairwise orthogonal non-zero vectors el.

f (2x) = .). f(Ax) = Af (x) . 1. + + b&. are constants. in the case of complex vector spaces there is another and for us more important way of introducing these concepts. a2 + ame. e. Linear functions of the first and second kind. are the coordinates of the vector x relative to the basis el. y). y) = A (xi. With the exception of positive definiteness all the concepts introduced in § 4 retain meaning for vector spaces over arbitrary fields and in particular for complex vector spaces. y) A (2x. = f(e. and a linear function of the second le. (x. A (xl + x2. et) + + e (e et). Bilinear and quadratic forms. a. of the first kind can be written in the form f(x) = a1e. y) = 2. f(x) = DEFINITION 1. Using the method of § 4 one can prove that every linear function 1. A (x2. where $.f(x + y) --f(x) +f(Y). Using the method of § 3 we prove that all un tary spaces of dimension n are isomorphic.(ei. However. W e shall say that A (x. y) is a bilinear form (function) of the vectors x and y if: for any fixed y.4 (x. y) is a linear function of the first kind of x. . 4.) = Et.1. y) is a linear function of the second kind of y. en and a. ez) = SO that e2e2 + + ez) = e. A (x. for any fixed x. A (x.n-DIMENSIONAL SPACES 63 t hen (x. y). e2. In other words. et) + $2 (e2. A complex valued function f defined on a complex space is said to be a linear function of the first kind if f(x + Y) =f(x) ±f(y). and that every linear function of the second kind can be written in the form b2t./(x).nd if 2.

. y) we obtain a function A (x. x) called a quadratic form (in complex space). + 3. y) = aik$. We recall that in the case of real vector spaces an analogous statement § 4). 6 6 holds only for symmetric bilinear forms (cf.. y) = (x. y) considered as a function of the vectors x and y. E2e2 + + $ne. One example of a bilinear form is the inner product in a unitary space A (x. A (x. y) e2e2 + + enen. is called the matrix of the bilinear form A (x. If x and y have the representations Y= + n2e2 + x = 1e1 then A (X. //lei ?2e2 + ed7kA (ei. Y1. e.4 (x. k1 $2e2 ' fle.n. y) be a bilinear form. + nmen Let en e2. . Ay) )7. A (X. en be a basis of an n-dimensional complex space. A (ei. ej. y = n1e1 n2e2 + + linen) = A (elei i. ?he].Fik i. Y2). Another example is the expression A (x. . The matrix IjaH with ai. Let A (x.2) = A (X. If we put y = x in a bilinear form A (x. Yi) + A (X. y).64 LECTURES ON LINEAR ALGEBRA 2. y) relative to the basis . k=1 viewed as a function of the vectors X El. The connection between bilinear and quadratic forms in complex space is summed up in the following theorem: Every bilinear form is uniquely determined by its quadratic form.

1. in particular. e1) d. Conversely. x)iA (x. A bilinear form is called Hermitian if A (x. x). This concept is the analog of a symmetric bilinear form in a real Euclidean vector space. ek) A (ek. x) + A (y. Y) so that. y) is Hermitian. iy) iA (x. i. x) A (y.n-DIMENSIONAL SPACES 65 Proof: Let A (x. if we multiply the 1. i. x iy)}. Namely. y) A (y. respectivly. 1.A (y x). y). x)-HiA (x.. I ankei ---. (III). For a form to be Hermitian it is necessary and sufficient that its matrix laikl I relative to some basis satisfy the condition a Indeed. x)iA (y.3) =M (x. y)± A (y. and add the results it follows easily that A (x. x + y) iA (x iy. x iy) 1{A (x A (y. y)+A(y. y)± A (y. x) + A (x. A (xiy. y). x iy)= A (x. A (xy. x y) iA(x iy. A (x. enable us to compute A (x. x y) + iA (x iy. x iy)}. y. if a = aki. x+y) = A (x. y) = a1kE111.x iy) A (x y. then a = A (ei. i. (III). x) be a quadratic form and let x and y be two arbitrary vectors. xy) = A (x. i. y). y). we obtain similarly. (IV) by 1. if the form A (x. (IV) by 1. x)-HiA (y. x±iy)=A(x. Since the right side of (1) involves only the values of the quadratic form associated with the bilinear form under consideration our assertion is proved. DEFINITION 2. x) A (x y. y). NOTE If the matrix of a bilinear form satisfies the condition 7 Note that A (x. A (x+iy. y) = A (y. equations (I). (II). x + y) + iA (x iy. If we multiply equations (I). x) A (x. respectively. The four identities 7: A (x±y. y). then A (x. y) = ±{A (x y. . (II).

where (x. y) = A (y. x) positive definite when for x 0. A (x iy. x). x + y). The following result holds: For a bilinear form A (x. then the associated quadratic form is also called Hermitian. x) denotes the inner product of x with itself. so that the number A (x.. A (x y. In fact. x) is a Hermitian quadratic form. A (x + y. y) relative to the tt . x).d relative to some basis implies that A (x. x). x) be real for every vector x. The proof is a direct consequence of the fact just proved that for a bilinear form to be Hermitian it is necessary and sufficient that A (x. Proof: Let the form A (x. Then A (x. x).e. x) = (x. then 4-4 = %)* seW . y) to be Hermitian it is necessar y and sufficient that A (x. x) = A (x. . xy). -.66 LECTURES ON LINEAR ALGEBRA a = dkì. as in § 4. Indeed. x) is real. x) is real for al x. e2. j 1. A (x iy. y) relative to the basis e1. y) be Hermitian. then the same must be true for the matrix of this form relative to any other basis. y) is a Hermitian bilinear form so that (x. A quadratic form is Hermitian i f and only i f it is real valued. x iy) are all real and it is easy to see from formulas (1) and (2) that A (x. then a complex Euclidean space can be defined as a complex A (x. en and I the matrix of A (x. f2. basis f1. y) = (y. = coe. If. If a bilinear form is Hermitian. a. x iy). if A (x. n). let A (x. COROLLARY. x) > 0 vector space with a positive definite Hermitian quadratic form. One example of a Hermitian quadratic form is the form A (x. If Al is the matrix of a bilinear form A (x. x) be real for all x. we call a quadratic form A (x. Conversely. but then a--=d relative to any other basis. f2 and if f. then. . i. axioms 1 through 3 for the inner product in a complex Euclidean space say in effect that (x. in particular. y) is a Hermitian bilinear form.

e). basis of R. ek) = 0 implies A (ei. x) = 0 so that 0. Let A (x.e. e1. Reduction of a quadra ic form to a sum of squares THEOREM 1. e. The proof is the same as the proof of the analogous fact in a real space. This can be done for otherWe choose el so that A (el. O. X) = is an arbitrary vector.-DIMENSIONAL SPACES 67 flc[ and tt'* Ilc*011 is the conjugate transpose of Here g' i. e2) for i < k.. A (x. y) A (ei. X) = + A2 EJ2 + + EJ where all the 2's are real. then we choose in it some basis er÷.. ei) are real in view of the Hermitian . . form a er+2. en. el) + E2(2A (e2. x) be a Hermitian quadratic form in a . This process is continued until we reach the O (Mr) may consist of the zero space Itffi in which A (x. On the other hand. en of R complex vector space R. the Hermitian nature of the form A (x. etc. y) vector only). We choose to give a version of the proof which emphasizes the geometry of the situation. Our construction implies A (e2. then A (ei. + enen + EThfA (en. The idea is to select in succession the vectors of the desired basis. Then there is a basis e1. 5. in view of formula (1). e2. It follows tha x= A (x. These vectors and the vectors el. If W. el) O.) 0. c* e51. One can prove the above by imitating the proof in § 5 of the analogous theorem in a real space. e2) + where the numbers A (e.. y) Now we select a vector e2 in the (n 1)-dimensional space Thu consisting of all vectors x for which A (e1. wise A (x.R. x) = 0 for all x and. )=0 + E2e2 + for > k. . relative to which the form in question is given by A (X.

. 42 --= au a12 a21 a2^ An = a22 a. --.a.A. The law of inertia A2. Just as in § 6 we find that for a Hermitian quadratic form to be positive definite it is necessary and sufficient that the determinants A2 .A (ei. . If we denote A (e. These formulas are identical with (3) and (6) of § 6.68 LECTURES ON LINEAR ALGEBRA nature of the quadratic form. basis. e2. that the determinants /11 42. An THEOREM 2. To see this we recall that if a Hermitian . Al.. among others. Prove directly that if the quadratic form A (x. EXERCISE. Reduction of a Hermitian quadratic form to a sum of squares by means of a triangular transformation. I. an where a. are real. . then the coefficients are equal to A (e1. x) = 1$112 A2 1E21' + zl I ler2..ea aln a2n A. then the determinants /1 4. x) is Hermitian. e1) by ). This implies. then A (x. e1) and are thus real. . =-. 4 are real. are all different from zero. The number of negative multipliers of the squares in the canonical form of a Hermitian quadratic form equals the number of changes of sign in the sequence . x) + 22M2 + + 2E& = 41E112 + 221e2I2 + quadratic form in a complex vector space and e. quadratic form is reduced to the canonical form (3). be positive. 7.2 § 6. We assume that the determinants all a12 + 2nleta2 6. x) be a Hermitian .. where A. = 1. a. Then just as in we can write down formulas for finding a basis relative to which the quadratic form is represented by a sum of squares. ek). Relative to such a basis the quadratic form is given by A (x. . If a Hermitian quadratic form has canonical fo . Let A (x.

then the number of positive. negative and zero coefficients is the same in both cases.It-DIMENSIONAL SPACES 69 relative to two bases. . The proof of this theorem is the same as the proof of the corre- sponding theorem in § 7. The concept of rank of a quadratic form introduced in § 7 for real spaces can be extended without change to complex spaces.

both procedures yield the same vector. Let R' be a plane in the space R (of Example 1) passing .). then Ax stands for the vector into which x is taken by this rotation. The right side of 1 is the result of first rotating x. and then through the origin. 70 adding the results. it is necessary to consider functions which associate points of a vector space with points of that same vector space. Consider a rotation of three-dimensional Euclidean space R about an axis through the origin. 1. If with every vector x of a vector space R there is associated a (unique) vector y in R. In many cases. The simplest functions of this type are linear transformations. Fundamental definitions. Operations on linear transformations 1. In the preceding chapter we stud- ied functions which associate numbers with points in an ndimensional vector space. We associate with x in R its projection x' = Ax on the plane R'. It is again easy to see that conditions 1 and 2 hold. Let us check condition 1. Whenever there is no danger of confusion the symbol A (x) is replaced by the symbol Ax. EXAMPLES. however. DEFINITION I. say. A (dlx ) = (x). 2.CHAPTER II Linear Transformations § 9. This transformation is said to be linear if the following two conditions hold: A (x + x2) = A(x1) + A (x. It is easy to see that conditions 1 and 2 hold for this mapping. Linear transformations. then the mapping y = A(x) is called a transformation of the space R. If x is any vector in R. and x. The left side of 1 is the result of first adding x and x. and then rotating the sum. Clearly.

n2. then Af(t) is a continuous function and A is linear.41(t) f2(T)] dr To . n. Let liaikH be a (square) matrix. en) (ni. . Indeed [P1 (t)Pa(t)i' [AP (t)]' a(t) AP' (t).). If we put AP(1) P1(1).fi(r) dr A (. If we put Af(t) = f(r) dr. With the vector x= &. A (fi + /2) = Jo I. k=1 aike k This mapping is another instance of a linear transformation. Consider the space of continuous funct ons f(t) defined on the interval [0.) --= /If (r) dr f2er) tit = Afi 2AI Af2. 1]. Consider the n-dimensional vector space of polynomials of degree n 1. The identity mapping E defined by the equation Ex x for all x. then A is a linear transformation. 4. Consider the vector space of n-tuples of real numbers. 5.LINEAR TRANSFORMATIONS 71 3. where P'(t) is the derivative of P(t).] we associate the vector Y = Ax where e2 . Jo f(r) dr Among linear transformations the following simple transforma- tions play a special role. Indeed. P 2 (t).

(2). e2.. the vector Ax = e. then Ax = A(eie..g. It is easily seen + E2e2 + + E'e. . e2. Now let the coordinates of g relative to the basis el. g there exists a unique linear transformation A such that Ae. so that A is indeed uniquely determined by the Ae. be a1. 2. We first prove that the vectors Ae.e) = EiAel + -Hen Ae. Ae = g. a2. In fact. . Ae determine A x= e2e2 + + ene E2Ae2 is an arbitrary vector in R. Ae2 = g2. uniquely. = g1.. a. which we shall call the matrix of the linear transformation A relative . Let el. Ae2. e. g..g2 that the mapping A is linear.. . every matriz determines a unique linear transformation given by means of the formulas (3). Connection between matrices and linear transformations. e2.e. E2e2 + &. . if . g2. e. Aek = aikei. conversely.e every linear transformation A determines a unique matrix Maji and. since x has a unique representation relative to the basis e1.72 LECTURES ON LINEAR ALGEBRA The null transformation 0 defined by the equation Ox = for all x. (i. To this end we consider the mapping A which associates with x = ele + es.. We shall show that Given n arbitrary vectors g1. e2. e2. . k = 1... n) form a matrix J1 = Haikl! to the basis e1. We have thus shown that relative to a given basis el. e. let A denote a linear transformation on R. This mapping is well defined. (1). i. . It remains to prove the existence of A with the desired properties. . en be a basis of an n-dimensional vector space R and 2. The numbers ao.

i. e2 = t. EXAMPLES. e. e 2! e -= En (n I)! . 0 0 EXERCISE Find the matrix of the above transformation relative to the basis e'. n)..e. = e.LINEAR TRANSFORMATIONS 73 Linear transformations can thus be described by means of matrices and matrices are the analytical tools for the study of linear transformations on vector spaces. i. Ae3 = 0. Let A be the differentiation transformation. = e. e2. e'2. Let R be the space of polynomials of degree n 1. the matrix which represents E relative to any basis is 00 It is easy to see that the null transformation is always represented by the matrix all of whose entries are zero. relative to this basis the mapping A is represented by the matrix [1 0 0 0 1 0 o.. e'3. e'a = e2 ea. 1. We choose as basis vectors of R unit vectors el. AP(t) = P'(t). in R. Let R be the three-dimensional Euclidean space and A the linear transformation which projects every vector on the XY-plane.e. directed along the coordinate axes.e.. 2. Ae. Then Ael = el.. e2.. Then Aei = e (1= 1. where e'. Let E be the identity mapping and e. ei2 = e2. We choose the following basis in R: t2 3 = 1. e any basis i.

Then

    Ae₁ = 1' = 0,  Ae₂ = t' = e₁,  Ae₃ = (t²/2!)' = t = e₂,  ⋯,  Aeₙ = [tⁿ⁻¹/(n − 1)!]' = tⁿ⁻²/(n − 2)! = eₙ₋₁.

Hence, relative to our basis, A is represented by the matrix

    [0 1 0 ⋯ 0]
    [0 0 1 ⋯ 0]
    [⋯⋯⋯⋯⋯]
    [0 0 0 ⋯ 1]
    [0 0 0 ⋯ 0]

Let A be a linear transformation, e₁, e₂, ⋯, eₙ a basis in R and ||a_ik|| the matrix which represents A relative to this basis. Let

(4)    x = ξ₁e₁ + ξ₂e₂ + ⋯ + ξₙeₙ,
(4')   Ax = η₁e₁ + η₂e₂ + ⋯ + ηₙeₙ.

We wish to express the coordinates η_i of Ax by means of the coordinates ξ_i of x. Now

    Ax = A(ξ₁e₁ + ξ₂e₂ + ⋯ + ξₙeₙ)
       = ξ₁(a₁₁e₁ + a₂₁e₂ + ⋯ + aₙ₁eₙ) + ξ₂(a₁₂e₁ + a₂₂e₂ + ⋯ + aₙ₂eₙ) + ⋯ + ξₙ(a₁ₙe₁ + a₂ₙe₂ + ⋯ + aₙₙeₙ)
       = (a₁₁ξ₁ + a₁₂ξ₂ + ⋯ + a₁ₙξₙ)e₁ + (a₂₁ξ₁ + a₂₂ξ₂ + ⋯ + a₂ₙξₙ)e₂ + ⋯ + (aₙ₁ξ₁ + aₙ₂ξ₂ + ⋯ + aₙₙξₙ)eₙ.

Hence, in view of (4'),

    η₁ = a₁₁ξ₁ + a₁₂ξ₂ + ⋯ + a₁ₙξₙ,
    η₂ = a₂₁ξ₁ + a₂₂ξ₂ + ⋯ + a₂ₙξₙ,
    ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯
    ηₙ = aₙ₁ξ₁ + aₙ₂ξ₂ + ⋯ + aₙₙξₙ,

or, briefly,

(5)   η_i = Σ_{k=1}^{n} a_ik ξ_k.
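
Formula (5) is ordinary matrix-times-vector multiplication of the coordinate column of x by the matrix ||a_ik||. The following NumPy sketch is not part of the original text; the matrix and coordinates in it are arbitrary illustrative values.

```python
import numpy as np

# Matrix ||a_ik|| of a transformation A relative to a chosen basis (arbitrary example values).
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 1.0]])

xi = np.array([1.0, -2.0, 4.0])   # coordinates xi_k of the vector x
eta = A @ xi                      # formula (5): eta_i = sum_k a_ik * xi_k
print(eta)                        # coordinates eta_i of the vector Ax
```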

Thus, if ||a_ik|| represents a linear transformation A relative to some basis e₁, e₂, ⋯, eₙ, then transformation of the basis vectors involves the columns of ||a_ik|| [formula (3)] and transformation of the coordinates of an arbitrary vector x involves the rows of ||a_ik|| [formula (5)].

3. Addition and multiplication of linear transformations. We shall now define addition and multiplication for linear transformations.

DEFINITION 2. By the product of two linear transformations A and B we mean the transformation C defined by the equation Cx = A(Bx) for all x. If C is the product of A and B, we write C = AB.

The product of linear transformations is itself linear, i.e., it satisfies conditions 1 and 2 of Definition 1. Indeed,

    C(x₁ + x₂) = A[B(x₁ + x₂)] = A(Bx₁ + Bx₂) = ABx₁ + ABx₂ = Cx₁ + Cx₂.

The first equality follows from the definition of multiplication of transformations, the second from property 1 for B, the third from property 1 for A and the fourth from the definition of multiplication of transformations. That C(λx) = λCx is proved just as easily.

If E is the identity transformation and A is an arbitrary transformation, then it is easy to verify the relations AE = EA = A.

Next we define powers of a transformation A: A² = A · A, A³ = A² · A, etc., and, by analogy with numbers, we define A⁰ = E. Clearly, Aᵐ⁺ⁿ = Aᵐ · Aⁿ.

EXAMPLE. Let R be the space of polynomials of degree ≤ n − 1 and let D be the differentiation operator, DP(t) = P'(t). Then D²P(t) = D(DP(t)) = (P'(t))' = P''(t). Likewise, D³P(t) = P'''(t), etc. Clearly, in this case Dⁿ = O.

EXERCISE. Select in the space R of the above example the basis of Example 3 of para. 2 and find the matrices of D, D², ⋯ relative to this basis.

We know that given a basis e₁, e₂, ⋯, eₙ every linear transformation determines a matrix. If the transformation A determines the matrix ||a_ik|| and B the matrix ||b_ik||, what is the matrix ||c_ik|| determined by the product C of A and B? To answer this question we note that by definition of ||c_ik||

(6)   Ceₖ = Σ_i c_ik e_i.

Further,

(7)   ABeₖ = A(Σ_j b_jk e_j) = Σ_j b_jk Ae_j = Σ_j b_jk Σ_i a_ij e_i = Σ_i (Σ_j a_ij b_jk) e_i.

Comparison of (7) and (6) yields

(8)   c_ik = Σ_j a_ij b_jk.

We see that the element c_ik of the matrix 𝒞 is the sum of the products of the elements of the ith row of the matrix 𝒜 and the corresponding elements of the kth column of the matrix ℬ. The matrix 𝒞 with entries defined by (8) is called the product of the matrices 𝒜 and ℬ in this order. Thus, if the (linear) transformation A is represented by the matrix ||a_ik|| and the (linear) transformation B by the matrix ||b_ik||, then their product is represented by the matrix ||c_ik|| which is the product of the matrices ||a_ik|| and ||b_ik||.

DEFINITION 3. By the sum of two linear transformations A and B we mean the transformation C defined by the equation Cx = Ax + Bx for all x. If C is the sum of A and B we write C = A + B. It is easy to see that C is linear.

Let C be the sum of the transformations A and B. If ||a_ik|| and ||b_ik|| represent A and B respectively (relative to some basis e₁, e₂, ⋯, eₙ) and ||c_ik|| represents the sum C of A and B (relative to the same basis), then, on the one hand,

    Aeₖ = Σ_i a_ik e_i,   Beₖ = Σ_i b_ik e_i,   Ceₖ = Σ_i c_ik e_i,

and, on the other hand,

    Ceₖ = Aeₖ + Beₖ = Σ_i (a_ik + b_ik) e_i,
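
Formula (8) is the usual matrix product, and the defining property Cx = A(Bx) can be checked directly. A small NumPy sketch with arbitrary example matrices (not taken from the text):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])   # matrix of A (arbitrary example)
B = np.array([[0.0, 1.0], [1.0, 1.0]])   # matrix of B (arbitrary example)

C = A @ B                                # formula (8): c_ik = sum_j a_ij * b_jk
x = np.array([5.0, -1.0])
print(np.allclose(C @ x, A @ (B @ x)))   # Cx = A(Bx) holds for this (and every) x -> True
```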

so that

    c_ik = a_ik + b_ik.

The matrix ||a_ik + b_ik|| is called the sum of the matrices ||a_ik|| and ||b_ik||. Thus the matrix of the sum of two linear transformations is the sum of the matrices associated with the summands.

Addition and multiplication of linear transformations have some of the properties usually associated with these operations. Thus

1. A + B = B + A;
2. (A + B) + C = A + (B + C);
3. A(BC) = (AB)C;
4. (A + B)C = AC + BC,  C(A + B) = CA + CB.

We could easily prove these equalities directly but this is unnecessary. We recall that we have established the existence of a one-to-one correspondence between linear transformations and matrices which preserves sums and products. Since properties 1 through 4 are proved for matrices in a course in algebra, the isomorphism between matrices and linear transformations just mentioned allows us to claim the validity of 1 through 4 for linear transformations.

We now define the product of a number λ and a linear transformation A. Thus by λA we mean the transformation which associates with every vector x the vector λ(Ax). It is clear that if A is represented by the matrix ||a_ik||, then λA is represented by the matrix ||λa_ik||.

If P(t) = a₀tᵐ + a₁tᵐ⁻¹ + ⋯ + aₘ is an arbitrary polynomial and A is a transformation, we define the symbol P(A) by the equation

    P(A) = a₀Aᵐ + a₁Aᵐ⁻¹ + ⋯ + aₘE.

EXAMPLE. Consider the space R of functions defined and infinitely differentiable on an interval (a, b). Let D be the linear mapping defined on R by the equation Df(t) = f'(t).

If P(t) is the polynomial P(t) = a₀tᵐ + a₁tᵐ⁻¹ + ⋯ + aₘ, then P(D) is the linear mapping which takes f(t) in R into

    P(D)f(t) = a₀f⁽ᵐ⁾(t) + a₁f⁽ᵐ⁻¹⁾(t) + ⋯ + aₘf(t).

Analogously, with P(t) as above and 𝒜 a matrix, we define a polynomial in a matrix by means of the equation

    P(𝒜) = a₀𝒜ᵐ + a₁𝒜ᵐ⁻¹ + ⋯ + aₘℰ.

EXAMPLE. Let 𝒜 be a diagonal matrix, i.e., a matrix of the form

    𝒜 = [λ₁ 0 ⋯ 0]
        [0 λ₂ ⋯ 0]
        [⋯⋯⋯⋯]
        [0 0 ⋯ λₙ]

We wish to find P(𝒜). Since

    𝒜² = diag(λ₁², λ₂², ⋯, λₙ²),  ⋯,  𝒜ᵐ = diag(λ₁ᵐ, λ₂ᵐ, ⋯, λₙᵐ),

it follows that

    P(𝒜) = diag(P(λ₁), P(λ₂), ⋯, P(λₙ)).

EXERCISE. Find P(𝒜) for

    𝒜 = [0 1 0 ⋯ 0]
        [0 0 1 ⋯ 0]
        [⋯⋯⋯⋯⋯]
        [0 0 0 ⋯ 1]
        [0 0 0 ⋯ 0]

It is possible to give reasonable definitions not only for a polynomial in a matrix 𝒜 but also for any function of a matrix 𝒜, such as exp 𝒜, sin 𝒜, etc.

As was already mentioned in § 1, Example 5, all matrices of order n with the usual definitions of addition and multiplication by a scalar form a vector space of dimension n². Hence any n² + 1 matrices are linearly dependent. Now consider the following set of powers of some matrix 𝒜:
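
A polynomial in a matrix can be evaluated numerically, for instance by Horner's rule. The sketch below (my own illustration, with an arbitrary polynomial and diagonal matrix) also checks the statement of the example, that P applied to a diagonal matrix acts entrywise on the diagonal.

```python
import numpy as np

def poly_of_matrix(coeffs, A):
    """Evaluate P(A) = a0*A^m + a1*A^(m-1) + ... + am*E by Horner's rule."""
    n = A.shape[0]
    result = np.zeros_like(A)
    for a in coeffs:
        result = result @ A + a * np.eye(n)
    return result

coeffs = [1.0, -3.0, 2.0]          # P(t) = t^2 - 3t + 2   (arbitrary example)
D = np.diag([1.0, 2.0, 5.0])       # diagonal matrix with lambda_i = 1, 2, 5
print(poly_of_matrix(coeffs, D))   # diag(P(1), P(2), P(5)) = diag(0, 0, 12)
```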

    ℰ, 𝒜, 𝒜², ⋯, 𝒜ⁿ².

Since the number of matrices is n² + 1, they must be linearly dependent, that is, there exist numbers a₀, a₁, a₂, ⋯, a_{n²} (not all zero) such that

    a₀ℰ + a₁𝒜 + a₂𝒜² + ⋯ + a_{n²}𝒜ⁿ² = 𝒪.

It follows that for every matrix of order n there exists a polynomial P of degree at most n² such that P(𝒜) = 𝒪. This simple proof of the existence of a polynomial P(t) for which P(𝒜) = 𝒪 is deficient in two respects, namely, it does not tell us how to construct P(t) and it suggests that the degree of P(t) may be as high as n². In the sequel we shall prove that for every matrix 𝒜 there exists a polynomial P(t) of degree n derivable in a simple manner from 𝒜 and having the property P(𝒜) = 𝒪.

4. Inverse transformation

DEFINITION 4. The transformation B is said to be the inverse of A if AB = BA = E, where E is the identity mapping.

The definition implies that B(Ax) = x for all x, i.e., if A takes x into Ax, then the inverse B of A takes Ax into x. The inverse of A is usually denoted by A⁻¹. Not every transformation possesses an inverse. Thus it is clear that the projection of vectors in three-dimensional Euclidean space on the XY-plane has no inverse. A transformation which has an inverse is sometimes called non-singular.

There is a close connection between the inverse of a transformation and the inverse of a matrix. As is well known, for every matrix 𝒜 with non-zero determinant there exists a matrix 𝒜⁻¹ such that

(9)   𝒜𝒜⁻¹ = 𝒜⁻¹𝒜 = ℰ.

𝒜⁻¹ is called the inverse of 𝒜. To find 𝒜⁻¹ we must solve a system of linear equations equivalent to the matrix equation (9). The elements of the kth column of 𝒜⁻¹ turn out to be the cofactors of the elements of the kth row of 𝒜 divided by the determinant of 𝒜. It is easy to see that 𝒜⁻¹ as just defined satisfies equation (9).

We know that choice of a basis determines a one-to-one correspondence between linear transformations and matrices which preserves products. It follows that a linear transformation A has an inverse if and only if its matrix relative to any basis has a non-zero determinant, i.e., the matrix has rank n.
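
The cofactor rule for the inverse can be checked directly in NumPy against the library inverse. This is a sketch of my own, with an arbitrary non-singular example matrix; it is not meant as an efficient way to invert matrices.

```python
import numpy as np

def inverse_by_cofactors(A):
    """Entry (i, k) of A^(-1) is the cofactor of a_ki divided by det A."""
    n = A.shape[0]
    det = np.linalg.det(A)
    inv = np.empty_like(A)
    for i in range(n):
        for k in range(n):
            minor = np.delete(np.delete(A, k, axis=0), i, axis=1)  # delete row k, column i
            inv[i, k] = (-1) ** (i + k) * np.linalg.det(minor) / det
    return inv

A = np.array([[2.0, 1.0], [5.0, 3.0]])                           # det = 1, non-singular
print(np.allclose(inverse_by_cofactors(A), np.linalg.inv(A)))    # True
```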

If A is a singular transformation, then its matrix has rank < n. We shall prove that the rank of the matrix of a linear transformation is independent of the choice of basis.

THEOREM. Let A be a linear transformation on a space R. The set of vectors Ax (x varies over R) forms a subspace R' of R. The dimension of R' equals the rank of the matrix of A relative to any basis e₁, e₂, ⋯, eₙ.

Proof: Let y₁ ∈ R' and y₂ ∈ R', i.e., y₁ = Ax₁ and y₂ = Ax₂. Then

    y₁ + y₂ = Ax₁ + Ax₂ = A(x₁ + x₂),

i.e., y₁ + y₂ ∈ R'. Likewise, if y = Ax, then λy = λAx = A(λx), i.e., λy ∈ R'. Hence R' is indeed a subspace of R.

Now any vector x is a linear combination of the vectors e₁, e₂, ⋯, eₙ. Hence every vector Ax, i.e., every vector in R', is a linear combination of the vectors Ae₁, Ae₂, ⋯, Aeₙ. If the maximal number of linearly independent vectors among the Ae_i is k, then the other Ae_i are linear combinations of the k vectors of a maximal set. Since every vector in R' is a linear combination of the vectors Ae₁, Ae₂, ⋯, Aeₙ, it is also a linear combination of the k vectors of a maximal set. Hence the dimension of R' is k. To say that the maximal number of linearly independent Ae_i is k is to say that the maximal number of linearly independent columns of the matrix ||a_ik|| is k, i.e., the dimension of R' is the same as the rank of the matrix ||a_ik||.

5. Connection between the matrices of a linear transformation relative to different bases. The matrices which represent a linear transformation in different bases are usually different. We now show how the matrix of a linear transformation changes under a change of basis.

Let e₁, e₂, ⋯, eₙ and f₁, f₂, ⋯, fₙ be two bases in R. Let 𝒞 be the matrix connecting the two bases. More specifically, let

(10)   f₁ = c₁₁e₁ + c₂₁e₂ + ⋯ + cₙ₁eₙ,
       f₂ = c₁₂e₁ + c₂₂e₂ + ⋯ + cₙ₂eₙ,
       ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯
       fₙ = c₁ₙe₁ + c₂ₙe₂ + ⋯ + cₙₙeₙ.

If C is the linear transformation defined by the equations Ce_i = f_i (i = 1, ⋯, n), then the matrix of C relative to the basis e₁, e₂, ⋯, eₙ is 𝒞 (cf. formulas (2) and (3) of para. 2).

Let 𝒜 = ||a_ik|| be the matrix of A relative to e₁, e₂, ⋯, eₙ and ℬ = ||b_ik|| its matrix relative to f₁, f₂, ⋯, fₙ, i.e.,

(10')    Aeₖ = Σ_i a_ik e_i,
(10'')   Afₖ = Σ_i b_ik f_i.

We wish to express the matrix ℬ of the transformation A relative to the basis f₁, f₂, ⋯, fₙ in terms of the matrices 𝒜 and 𝒞. To this end we rewrite (10'') as

    ACeₖ = Σ_i b_ik Ce_i.

Premultiplying both sides of this equation by C⁻¹ (which exists in view of the linear independence of the f_i), we get

    C⁻¹ACeₖ = Σ_i b_ik e_i.

It follows that the matrix ||b_ik|| represents C⁻¹AC relative to the basis e₁, e₂, ⋯, eₙ. In other words, relative to a given basis,

    matrix (C⁻¹AC) = matrix (C⁻¹) · matrix (A) · matrix (C),

so that

(11)   ℬ = 𝒞⁻¹𝒜𝒞.

To sum up: Formula (11) gives the connection between the matrix ℬ of a transformation A relative to a basis f₁, f₂, ⋯, fₙ and the matrix 𝒜 which represents A relative to the basis e₁, e₂, ⋯, eₙ. The matrix 𝒞 in (11) is the matrix of transition from the basis e₁, e₂, ⋯, eₙ to the basis f₁, f₂, ⋯, fₙ (formula (10)).

§ 10. Invariant subspaces. Eigenvalues and eigenvectors of a linear transformation

1. Invariant subspaces. In the case of a scalar valued function defined on a vector space R but of interest only on a subspace R₁ of R we may, of course, consider the function on the subspace R₁ only. Not so in the case of linear transformations. Here points in R₁ may be mapped on points not in R₁, and in that case it is not possible to restrict ourselves to R₁ alone.
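
Formula (11) is easy to exercise numerically. The following sketch (my own example values; the columns of C hold the coordinates of f₁, f₂ relative to e₁, e₂) computes the matrix of the same transformation in the new basis.

```python
import numpy as np

A = np.array([[1.0, 2.0], [0.0, 3.0]])   # matrix of A relative to the basis e1, e2 (arbitrary)
C = np.array([[1.0, 1.0],                # transition matrix: column k holds the coordinates
              [1.0, 2.0]])               # of f_k in terms of e1, e2, as in formula (10)

B = np.linalg.inv(C) @ A @ C             # formula (11): matrix of A relative to f1, f2
print(B)
```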

DEFINITION 1. Let A be a linear transformation on a space R. A subspace R₁ of R is called invariant under A if x ∈ R₁ implies Ax ∈ R₁.

If a subspace R₁ is invariant under a linear transformation A we may, of course, consider A on R₁ only. Trivial examples of invariant subspaces are the subspace consisting of the zero element only and the whole space.

EXAMPLES. 1. Let R be three-dimensional Euclidean space and A a rotation about an axis through the origin. The invariant subspaces are: the axis of rotation (a one-dimensional invariant subspace) and the plane through the origin perpendicular to the axis of rotation (a two-dimensional invariant subspace).

2. Let R be a plane. Let A be a stretching by a factor λ₁ along the x-axis and by a factor λ₂ along the y-axis, i.e., A is the mapping which takes the vector z = ξ₁e₁ + ξ₂e₂ into the vector Az = λ₁ξ₁e₁ + λ₂ξ₂e₂ (here e₁ and e₂ are unit vectors along the coordinate axes). In this case the coordinate axes are one-dimensional invariant subspaces. If λ₁ = λ₂ = λ, then A is a similarity transformation with coefficient λ, and in this case every line through the origin is an invariant subspace.

EXERCISE. Show that if λ₁ ≠ λ₂, then the coordinate axes are the only invariant one-dimensional subspaces.

3. Let R be the space of polynomials of degree ≤ n − 1 and A the differentiation operator on R, i.e., AP(t) = P'(t). The set of polynomials of degree ≤ k (k ≤ n − 1) is an invariant subspace.

EXERCISE. Show that R in Example 3 contains no other subspaces invariant under A.

4. Let R be any n-dimensional vector space. Let A be a linear transformation on R whose matrix relative to some basis e₁, e₂, ⋯, eₙ is of the form

    [a₁₁  ⋯  a₁ₖ    a₁,ₖ₊₁    ⋯  a₁ₙ  ]
    [⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯]
    [aₖ₁  ⋯  aₖₖ    aₖ,ₖ₊₁    ⋯  aₖₙ  ]
    [0    ⋯  0      aₖ₊₁,ₖ₊₁  ⋯  aₖ₊₁,ₙ]
    [⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯]
    [0    ⋯  0      aₙ,ₖ₊₁    ⋯  aₙₙ  ]

In this case the subspace generated by the vectors e₁, e₂, ⋯, eₖ (1 ≤ k ≤ n) is invariant under A. The proof is left to the reader. If, in addition, the entries in the upper right block were zero, then the subspace generated by the vectors eₖ₊₁, eₖ₊₂, ⋯, eₙ would also be invariant under A.

2. Eigenvectors and eigenvalues. In the sequel one-dimensional invariant subspaces will play a special role.

Let R₁ be a one-dimensional subspace generated by some vector x ≠ 0. Then R₁ consists of all vectors of the form αx. It is clear that for R₁ to be invariant it is necessary and sufficient that the vector Ax be in R₁, i.e., that Ax = λx.

DEFINITION 2. A vector x ≠ 0 satisfying the relation Ax = λx is called an eigenvector of A. The number λ is called an eigenvalue of A.

Thus if x is an eigenvector, then the vectors αx form a one-dimensional invariant subspace. Conversely, all non-zero vectors of a one-dimensional invariant subspace are eigenvectors.

THEOREM 1. If A is a linear transformation on a complex space R, then A has at least one eigenvector. (The proof holds for a vector space over any algebraically closed field, since it makes use only of the fact that equation (2) below has a solution.)

Proof: Let e₁, e₂, ⋯, eₙ be a basis in R. Relative to this basis A is represented by some matrix ||a_ik||. Let x = ξ₁e₁ + ξ₂e₂ + ⋯ + ξₙeₙ be any vector in R. Then the coordinates η₁, η₂, ⋯, ηₙ of the vector Ax are given by

    η₁ = a₁₁ξ₁ + a₁₂ξ₂ + ⋯ + a₁ₙξₙ,
    η₂ = a₂₁ξ₁ + a₂₂ξ₂ + ⋯ + a₂ₙξₙ,
    ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯
    ηₙ = aₙ₁ξ₁ + aₙ₂ξ₂ + ⋯ + aₙₙξₙ

(cf. para. 3 of § 9). The equation Ax = λx, which expresses the condition for x to be an eigenvector, is equivalent to the system of equations

    a₁₁ξ₁ + a₁₂ξ₂ + ⋯ + a₁ₙξₙ = λξ₁,
    a₂₁ξ₁ + a₂₂ξ₂ + ⋯ + a₂ₙξₙ = λξ₂,
    ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯
    aₙ₁ξ₁ + aₙ₂ξ₂ + ⋯ + aₙₙξₙ = λξₙ,

or

(1)   (a₁₁ − λ)ξ₁ + a₁₂ξ₂ + ⋯ + a₁ₙξₙ = 0,
      a₂₁ξ₁ + (a₂₂ − λ)ξ₂ + ⋯ + a₂ₙξₙ = 0,
      ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯
      aₙ₁ξ₁ + aₙ₂ξ₂ + ⋯ + (aₙₙ − λ)ξₙ = 0.

Thus to prove the theorem we must show that there exists a number λ and a set of numbers ξ₁⁽⁰⁾, ξ₂⁽⁰⁾, ⋯, ξₙ⁽⁰⁾, not all zero, satisfying the system (1).

For the system (1) to have a non-trivial solution ξ₁, ξ₂, ⋯, ξₙ it is necessary and sufficient that its determinant vanish, i.e., that

(2)   | a₁₁ − λ   a₁₂       ⋯   a₁ₙ     |
      | a₂₁       a₂₂ − λ   ⋯   a₂ₙ     |   = 0.
      | ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯        |
      | aₙ₁       aₙ₂       ⋯   aₙₙ − λ |

This polynomial equation of degree n in λ has at least one (in general complex) root λ₀. With λ₀ in place of λ, (1) becomes a homogeneous system of linear equations with zero determinant. Such a system has a non-trivial solution ξ₁⁽⁰⁾, ξ₂⁽⁰⁾, ⋯, ξₙ⁽⁰⁾. If we put

    x⁽⁰⁾ = ξ₁⁽⁰⁾e₁ + ξ₂⁽⁰⁾e₂ + ⋯ + ξₙ⁽⁰⁾eₙ,

then Ax⁽⁰⁾ = λ₀x⁽⁰⁾,

i.e., x⁽⁰⁾ is an eigenvector and λ₀ an eigenvalue of A. This completes the proof of the theorem.

NOTE: Since the proof remains valid when A is restricted to any subspace invariant under A, we can claim that every invariant subspace contains at least one eigenvector of A.

The polynomial on the left side of (2) is called the characteristic polynomial of the matrix of A and equation (2) the characteristic equation of that matrix. The proof of our theorem shows that the roots of the characteristic polynomial are eigenvalues of the transformation A and, conversely, the eigenvalues of A are roots of the characteristic polynomial.

Since the eigenvalues of a transformation are defined without reference to a basis, it follows that the roots of the characteristic polynomial do not depend on the choice of basis. In the sequel we shall prove a stronger result, namely, that the characteristic polynomial is itself independent of the choice of basis. (The fact that the roots of the characteristic polynomial do not depend on the choice of basis does not by itself imply that the polynomial itself is independent of the choice of basis; it is a priori conceivable that the multiplicity of the roots varies with the basis.) We may thus speak of the characteristic polynomial of the transformation A rather than the characteristic polynomial of the matrix of the transformation A.

3. Linear transformations with n linearly independent eigenvectors are, in a way, the simplest linear transformations. Let A be such a transformation and e₁, e₂, ⋯, eₙ its linearly independent eigenvectors, i.e., Ae_i = λ_i e_i (i = 1, 2, ⋯, n). Relative to the basis e₁, e₂, ⋯, eₙ the matrix of A is

    [λ₁ 0 ⋯ 0]
    [0 λ₂ ⋯ 0]
    [⋯⋯⋯⋯]
    [0 0 ⋯ λₙ]

Such a matrix is called a diagonal matrix. We thus have

THEOREM 2. If a linear transformation A has n linearly independent eigenvectors, then these vectors form a basis in which A is represented by a diagonal matrix. Conversely, if A is represented in some
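
Numerically, the roots of equation (2) and the associated eigenvectors are what np.linalg.eig returns, and the diagonalization of Theorem 2 can be checked by a change of basis to the eigenvectors. A sketch with an arbitrary matrix that has distinct eigenvalues:

```python
import numpy as np

A = np.array([[4.0, 1.0], [2.0, 3.0]])   # arbitrary matrix; eigenvalues 5 and 2 are distinct
evals, evecs = np.linalg.eig(A)          # roots of det(A - lambda*E) = 0 and eigenvectors

# Relative to the basis of eigenvectors the matrix of A is diagonal (formula (11), C = evecs):
D = np.linalg.inv(evecs) @ A @ evecs
print(np.round(D, 10))                   # diag(lambda_1, lambda_2)
```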

basis by a diagonal matrix, then the vectors of this basis are eigenvectors of A.

NOTE: There is one important case in which a linear transformation is certain to have n linearly independent eigenvectors. We lead up to this case by observing that

If e₁, e₂, ⋯, eₖ are eigenvectors of a transformation A and the corresponding eigenvalues λ₁, λ₂, ⋯, λₖ are distinct, then e₁, e₂, ⋯, eₖ are linearly independent.

For k = 1 this assertion is obviously true. We assume its validity for k − 1 vectors and prove it for the case of k vectors. If our assertion were false in the case of k vectors, then there would exist k numbers α₁, α₂, ⋯, αₖ, with α₁ ≠ 0, say, such that

(3)   α₁e₁ + α₂e₂ + ⋯ + αₖeₖ = 0.

Applying A to both sides of equation (3) we get

    A(α₁e₁ + α₂e₂ + ⋯ + αₖeₖ) = 0,

or

    α₁λ₁e₁ + α₂λ₂e₂ + ⋯ + αₖλₖeₖ = 0.

Subtracting from this equation equation (3) multiplied by λₖ, we are led to the relation

    α₁(λ₁ − λₖ)e₁ + α₂(λ₂ − λₖ)e₂ + ⋯ + αₖ₋₁(λₖ₋₁ − λₖ)eₖ₋₁ = 0,

with α₁(λ₁ − λₖ) ≠ 0 (by assumption λ_i ≠ λₖ for i ≠ k). This contradicts the assumed linear independence of e₁, e₂, ⋯, eₖ₋₁.

The following result is a direct consequence of our observation:

If the characteristic polynomial of a transformation A has n distinct roots, then the matrix of A is diagonable.

Indeed, a root λₖ of the characteristic equation determines at least one eigenvector. Since the λ_i are supposed distinct, it follows by the result just obtained that A has n linearly independent eigenvectors e₁, e₂, ⋯, eₙ, and the matrix of A relative to the basis e₁, e₂, ⋯, eₙ is diagonal.

If the characteristic polynomial has multiple roots, then the number of linearly independent eigenvectors may be less than n. For instance, the transformation A which associates with every polynomial of degree ≤ n − 1 its derivative has only one eigenvalue λ = 0 and (to within a constant multiplier) one eigenvector P(t) = constant. For if P(t) is a polynomial of

degree k > 0, then P'(t) is a polynomial of degree k − 1. Hence P'(t) = λP(t) implies λ = 0 and P(t) = constant. It follows that regardless of the choice of basis the matrix of A is not diagonal.

We shall prove in chapter III that if λ is a root of multiplicity m of the characteristic polynomial of a transformation, then the maximal number of linearly independent eigenvectors corresponding to λ is m.

In the sequel (§§ 12 and 13) we discuss a few classes of diagonable linear transformations (i.e., linear transformations which in some bases can be represented by diagonal matrices). The problem of the "simplest" matrix representation of an arbitrary linear transformation is discussed in chapter III.

4. Characteristic polynomial. In para. 2 we defined the characteristic polynomial of the matrix 𝒜 of a linear transformation A as the determinant of the matrix 𝒜 − λℰ and mentioned the fact that this polynomial is determined by the linear transformation A alone, i.e., it is independent of the choice of basis. In fact, if 𝒜 and ℬ = 𝒞⁻¹𝒜𝒞 represent A relative to two bases, then

    det (𝒞⁻¹𝒜𝒞 − λℰ) = det [𝒞⁻¹(𝒜 − λℰ)𝒞] = det 𝒞⁻¹ · det (𝒜 − λℰ) · det 𝒞 = det (𝒜 − λℰ).

This proves our contention. Hence we can speak of the characteristic polynomial of a linear transformation (rather than the characteristic polynomial of the matrix of a linear transformation).

EXERCISES. 1. Find the characteristic polynomial of the matrix

    [λ₀ 1  0  ⋯ 0]
    [0  λ₀ 1  ⋯ 0]
    [⋯⋯⋯⋯⋯]
    [0  0  0  ⋯ 1]
    [0  0  0  ⋯ λ₀]

2. Find the characteristic polynomial of the matrix

    [a₁ a₂ ⋯ aₙ₋₁ aₙ]
    [1  0  ⋯ 0    0 ]
    [0  1  ⋯ 0    0 ]
    [⋯⋯⋯⋯⋯⋯]
    [0  0  ⋯ 1    0 ]

Solution: (−1)ⁿ(λⁿ − a₁λⁿ⁻¹ − a₂λⁿ⁻² − ⋯ − aₙ).

We shall now find an explicit expression for the characteristic polynomial in terms of the entries in some representation 𝒜 of A.

We begin by computing a more general polynomial, namely, Q(λ) = det (𝒜 − λℬ), where 𝒜 and ℬ are two arbitrary matrices.

(4)   Q(λ) = | a₁₁ − λb₁₁   a₁₂ − λb₁₂   ⋯   a₁ₙ − λb₁ₙ |
             | a₂₁ − λb₂₁   a₂₂ − λb₂₂   ⋯   a₂ₙ − λb₂ₙ |
             | ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯           |
             | aₙ₁ − λbₙ₁   aₙ₂ − λbₙ₂   ⋯   aₙₙ − λbₙₙ |

and can (by the addition theorem on determinants) be written as a sum of determinants. The free term of Q(λ) is the determinant of ||a_ik||. The coefficient of (−λ)ᵏ in the expression for Q(λ) is the sum of determinants obtained by replacing in (4) any k columns of the matrix ||a_ik|| by the corresponding columns of the matrix ||b_ik||.

In the case at hand ℬ = ℰ and the determinants which add up to the coefficient of (−λ)ᵏ are the principal minors of order n − k of the matrix ||a_ik||. Thus, the characteristic polynomial P(λ) of the matrix 𝒜 has the form

    P(λ) = (−1)ⁿ(λⁿ − p₁λⁿ⁻¹ + p₂λⁿ⁻² − ⋯ ± pₙ),

where p₁ is the sum of the diagonal entries of 𝒜, p₂ the sum of the principal minors of order two, etc. Finally, pₙ is the determinant of 𝒜.

We wish to emphasize the fact that the coefficients p₁, p₂, ⋯, pₙ are independent of the particular representation 𝒜 of the transformation A. This is another way of saying that the characteristic polynomial is independent of the particular representation 𝒜 of A.

The coefficients p₁ and pₙ are of particular importance: pₙ is the determinant of the matrix 𝒜 and p₁ is the sum of the diagonal elements of 𝒜. The sum of the diagonal elements of 𝒜 is called its trace. It is clear that the trace of a matrix is the sum of all the roots of its characteristic polynomial, each taken with its proper multiplicity.

To compute the eigenvectors of a linear transformation we must know its eigenvalues, and this necessitates the solution of a polynomial equation of degree n. In one important case the roots of
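
The two statements about p₁ and pₙ, that the trace is the sum of the roots and the determinant is their product, can be checked numerically. A short sketch with an arbitrary example matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 3.0, 1.0],
              [4.0, 0.0, 5.0]])
evals = np.linalg.eigvals(A)                        # roots of the characteristic polynomial

print(np.isclose(np.trace(A), evals.sum()))         # p1 = trace = sum of the roots
print(np.isclose(np.linalg.det(A), evals.prod()))   # pn = det  = product of the roots
```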

the characteristic polynomial can be read off from the matrix representing the transformation, namely:

If the matrix of a transformation A is triangular, i.e., if it has the form

(5)   [a₁₁ a₁₂ a₁₃ ⋯ a₁ₙ]
      [0   a₂₂ a₂₃ ⋯ a₂ₙ]
      [⋯⋯⋯⋯⋯⋯]
      [0   0   0  ⋯ aₙₙ]

then the eigenvalues of A are the numbers a₁₁, a₂₂, ⋯, aₙₙ. The proof is obvious, since the characteristic polynomial of the matrix (5) is

    P(λ) = (a₁₁ − λ)(a₂₂ − λ) ⋯ (aₙₙ − λ)

and its roots are a₁₁, a₂₂, ⋯, aₙₙ.

EXERCISE. Find the eigenvectors corresponding to the eigenvalues a₁₁, a₂₂, ⋯, aₙₙ of the matrix (5).

We conclude with a discussion of an interesting property of the characteristic polynomial. As was pointed out in para. 3 of § 9, for every matrix 𝒜 there exists a polynomial P(t) such that P(𝒜) is the zero matrix. We now show that the characteristic polynomial is just such a polynomial. First we prove the following

LEMMA 1. Let the polynomial P(λ) = a₀λᵐ + a₁λᵐ⁻¹ + ⋯ + aₘ and the matrix 𝒜 be connected by the relation

(6)   P(λ)ℰ = (𝒜 − λℰ)𝒞(λ),

where 𝒞(λ) is a polynomial in λ with matrix coefficients, i.e.,

    𝒞(λ) = 𝒞₀λᵐ⁻¹ + 𝒞₁λᵐ⁻² + ⋯ + 𝒞ₘ₋₁.

Then P(𝒜) = 𝒪. (We note that this lemma is an extension of the theorem of Bezout to polynomials with matrix coefficients.)

Proof: We have

(7)   (𝒜 − λℰ)𝒞(λ) = −𝒞₀λᵐ + (𝒜𝒞₀ − 𝒞₁)λᵐ⁻¹ + (𝒜𝒞₁ − 𝒞₂)λᵐ⁻² + ⋯ + 𝒜𝒞ₘ₋₁.

Now (6) and (7) yield the equations

(8)   −𝒞₀ = a₀ℰ,
      𝒜𝒞₀ − 𝒞₁ = a₁ℰ,
      𝒜𝒞₁ − 𝒞₂ = a₂ℰ,
      ⋯⋯⋯⋯⋯⋯⋯
      𝒜𝒞ₘ₋₁ = aₘℰ.

If we multiply the first of these equations on the left by 𝒜ᵐ, the second by 𝒜ᵐ⁻¹, the third by 𝒜ᵐ⁻², ⋯, the last by ℰ and add the resulting equations, we get 𝒪 on the left and a₀𝒜ᵐ + a₁𝒜ᵐ⁻¹ + ⋯ + aₘℰ = P(𝒜) on the right. Thus P(𝒜) = 𝒪 and our lemma is proved. (In algebra the theorem of Bezout is proved by direct substitution of λ = 𝒜 in (6). Here this is not an admissible procedure, since λ is a number and 𝒜 is a matrix. However, each equation in (8) is obtained by equating the coefficients of a power of λ in (6); subsequent multiplication by the appropriate power of 𝒜 and addition of the resulting equations is tantamount to the substitution of 𝒜 in place of λ.)

THEOREM 3. If P(λ) is the characteristic polynomial of 𝒜, then P(𝒜) = 𝒪.

Proof: Consider the inverse of the matrix 𝒜 − λℰ. As is well known, the inverse matrix can be written in the form

    (𝒜 − λℰ)⁻¹ = (1/P(λ)) 𝒞(λ),

where 𝒞(λ) is the matrix of the cofactors of the elements of 𝒜 − λℰ and P(λ) is the determinant of 𝒜 − λℰ, i.e., the characteristic polynomial of 𝒜. Hence

    (𝒜 − λℰ)𝒞(λ) = P(λ)ℰ.

Since the elements of 𝒞(λ) are polynomials of degree ≤ n − 1 in λ, we conclude on the basis of our lemma that P(𝒜) = 𝒪. This completes the proof.

We note that if the characteristic polynomial of the matrix 𝒜 has no multiple roots, then there exists no polynomial Q(λ) of degree less than n such that Q(𝒜) = 𝒪 (cf. the exercise below).

EXERCISE. Let 𝒜 be a diagonal matrix

    𝒜 = [λ₁ 0 ⋯ 0]
        [0 λ₂ ⋯ 0]
        [⋯⋯⋯⋯]
        [0 0 ⋯ λₙ]

where all the λ_i are distinct. Find a polynomial P(t) of lowest degree for which P(𝒜) = 𝒪 (cf. para. 3, § 9).

§ 11. The adjoint of a linear transformation

1. Connection between transformations and bilinear forms in Euclidean space. We have considered under separate headings linear transformations and bilinear forms on vector spaces. In
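
Theorem 3 (the statement that a matrix annihilates its own characteristic polynomial) is easy to verify numerically. In the sketch below, np.poly applied to a square matrix returns the coefficients of det(λℰ − 𝒜), which differs from the text's P(λ) only by the factor (−1)ⁿ and therefore also annihilates 𝒜; the example matrix is arbitrary.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])
coeffs = np.poly(A)                  # coefficients of det(lambda*E - A), highest power first

P_of_A = np.zeros_like(A)            # evaluate the characteristic polynomial at A (Horner)
for c in coeffs:
    P_of_A = P_of_A @ A + c * np.eye(2)
print(np.allclose(P_of_A, 0))        # True: P(A) is the zero matrix
```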

the case of Euclidean spaces there exists a close connection between bilinear forms and linear transformations.

(Relative to a given basis both linear transformations and bilinear forms are given by matrices. One could therefore try to associate with a given linear transformation the bilinear form determined by the same matrix as the transformation in question. However, such a correspondence would be without significance. In fact, if a linear transformation and a bilinear form are represented relative to some basis by a matrix 𝒜, then, upon change of basis, the linear transformation is represented by 𝒞⁻¹𝒜𝒞 (cf. § 9) and the bilinear form by 𝒞'𝒜𝒞 (cf. § 4), where 𝒞' is the transpose of 𝒞. The careful reader will notice that the correspondence between bilinear forms and linear transformations in Euclidean space considered below associates bilinear forms and linear transformations whose matrices relative to an orthonormal basis are transposes of one another. This correspondence is shown to be independent of the choice of basis.)

Let R be a complex Euclidean space and let A(x; y) be a bilinear form on R. Let e₁, e₂, ⋯, eₙ be an orthonormal basis in R. If x = ξ₁e₁ + ξ₂e₂ + ⋯ + ξₙeₙ and y = η₁e₁ + η₂e₂ + ⋯ + ηₙeₙ, then A(x; y) can be written in the form

(1)   A(x; y) = a₁₁ξ₁η̄₁ + a₁₂ξ₁η̄₂ + ⋯ + a₁ₙξ₁η̄ₙ
              + a₂₁ξ₂η̄₁ + a₂₂ξ₂η̄₂ + ⋯ + a₂ₙξ₂η̄ₙ
              + ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯
              + aₙ₁ξₙη̄₁ + aₙ₂ξₙη̄₂ + ⋯ + aₙₙξₙη̄ₙ.

We shall now try to represent the above expression as an inner product. To this end we rewrite it as follows:

    A(x; y) = (a₁₁ξ₁ + a₂₁ξ₂ + ⋯ + aₙ₁ξₙ)η̄₁ + (a₁₂ξ₁ + a₂₂ξ₂ + ⋯ + aₙ₂ξₙ)η̄₂ + ⋯ + (a₁ₙξ₁ + a₂ₙξ₂ + ⋯ + aₙₙξₙ)η̄ₙ.

Now we introduce the vector z with coordinates

    ζ₁ = a₁₁ξ₁ + a₂₁ξ₂ + ⋯ + aₙ₁ξₙ,
    ζ₂ = a₁₂ξ₁ + a₂₂ξ₂ + ⋯ + aₙ₂ξₙ,
    ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯
    ζₙ = a₁ₙξ₁ + a₂ₙξ₂ + ⋯ + aₙₙξₙ.

It is clear that z is obtained by applying to x a linear transformation whose matrix is the transpose of the matrix ||a_ik|| of the bilinear form A(x; y). We shall denote this linear transformation

by the letter A, i.e., we shall put z = Ax. Then

    A(x; y) = ζ₁η̄₁ + ζ₂η̄₂ + ⋯ + ζₙη̄ₙ = (z, y),

i.e.,

(2)   A(x; y) = (Ax, y).

Thus, a bilinear form A(x; y) on a Euclidean vector space determines a linear transformation A such that A(x; y) = (Ax, y).

The converse of this proposition is also true, namely: A linear transformation A on a Euclidean vector space determines a bilinear form A(x; y) defined by the relation A(x; y) = (Ax, y).

The bilinearity of A(x; y) = (Ax, y) is easily proved:

1. (A(x₁ + x₂), y) = (Ax₁ + Ax₂, y) = (Ax₁, y) + (Ax₂, y);  (A(λx), y) = (λAx, y) = λ(Ax, y).
2. (Ax, y₁ + y₂) = (Ax, y₁) + (Ax, y₂);  (Ax, μy) = μ̄(Ax, y).

We now show that the bilinear form A(x; y) determines the transformation A uniquely. Thus, let

    A(x; y) = (Ax, y)  and  A(x; y) = (Bx, y).

Then (Ax, y) = (Bx, y), i.e., (Ax − Bx, y) = 0 for all y. But this means that Ax − Bx = 0 for all x. Hence Ax = Bx for all x, which is the same as saying that A = B. This proves the uniqueness assertion.

We can now sum up our results in the following

THEOREM 1. The equation

(2)   A(x; y) = (Ax, y)

establishes a one-to-one correspondence between bilinear forms and linear transformations on a Euclidean vector space.

The one-oneness of the correspondence established by eq. (2) implies its independence from choice of basis.

There is another way of establishing a connection between bilinear forms and linear transformations. Namely, every bilinear form can be represented as

    A(x; y) = (x, A*y).

This representation is obtained by rewriting formula (1) above in the following manner:

    A(x; y) = ξ₁(a₁₁η̄₁ + a₁₂η̄₂ + ⋯ + a₁ₙη̄ₙ) + ξ₂(a₂₁η̄₁ + a₂₂η̄₂ + ⋯ + a₂ₙη̄ₙ) + ⋯ + ξₙ(aₙ₁η̄₁ + aₙ₂η̄₂ + ⋯ + aₙₙη̄ₙ) = (x, A*y),

where A*y is the vector whose coordinates are

    ā₁₁η₁ + ā₁₂η₂ + ⋯ + ā₁ₙηₙ,  ā₂₁η₁ + ā₂₂η₂ + ⋯ + ā₂ₙηₙ,  ⋯,  āₙ₁η₁ + āₙ₂η₂ + ⋯ + āₙₙηₙ.

Relative to an orthogonal basis the matrix ||a*_ik|| of A* and the matrix ||a_ik|| of A are connected by the relation

    a*_ik = ā_ki.

For a non-orthogonal basis the connection between the two matrices is more complicated.

2. Transition from A to its adjoint (the operation *)

DEFINITION 1. The transformation A* defined by

    (Ax, y) = (x, A*y)

is called the adjoint of A.

THEOREM 2. In a Euclidean space there is a one-to-one correspondence between linear transformations and their adjoints.

Proof: According to Theorem 1 of this section every linear transformation determines a unique bilinear form A(x; y) = (Ax, y). On the other hand, by the result stated in the conclusion of para. 1, every bilinear form can be uniquely represented as (x, A*y). Hence
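
Relative to an orthonormal basis the relation a*_ik = ā_ki says that the matrix of A* is the conjugate transpose of the matrix of A. The following NumPy sketch (arbitrary complex matrix and vectors, inner product conjugate-linear in the second argument as in the text) checks the defining identity (Ax, y) = (x, A*y).

```python
import numpy as np

A = np.array([[1 + 2j, 3j], [4, 5 - 1j]])       # matrix of A relative to an orthonormal basis
A_star = A.conj().T                              # matrix of A*: a*_ik = conj(a_ki)

x = np.array([1 + 1j, 2.0])
y = np.array([0.5j, 3 - 1j])
inner = lambda u, v: np.sum(u * v.conj())        # (u, v) with conjugation in the second argument

print(np.isclose(inner(A @ x, y), inner(x, A_star @ y)))   # (Ax, y) = (x, A*y) -> True
```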

    (Ax, y) = A(x; y) = (x, A*y).

The connection between the matrices of A and A* relative to an orthogonal basis was discussed above.

Some of the basic properties of the operation * are:

1. (AB)* = B*A*.
2. (A*)* = A.
3. (A + B)* = A* + B*.
4. (λA)* = λ̄A*.
5. E* = E.

We give proofs of properties 1 and 2.

1. By the definition of (AB)*, (ABx, y) = (x, (AB)*y). On the other hand, the definition of A* implies (ABx, y) = (Bx, A*y) = (x, B*A*y). If we compare the right sides of the last two equations and recall that a linear transformation is uniquely determined by the corresponding bilinear form, we conclude that (AB)* = B*A*.

2. By the definition of A*, (Ax, y) = (x, A*y). Denote A* by C. Then (Ax, y) = (x, Cy), whence (y, Ax) = (Cy, x). Interchange of x and y gives (Cx, y) = (x, Ay). But this means that C* = A, i.e., (A*)* = A.

EXERCISES. 1. Prove properties 3 through 5 of the operation *.
2. Prove properties 1 through 5 of the operation * by making use of the connection between the matrices of A and A* relative to an orthogonal basis.

3. Self-adjoint, unitary and normal linear transformations. The operation * is to some extent the analog of the operation of

conjugation which takes a complex number α into the complex number ᾱ. This analogy is not accidental: it is clear that for matrices of order one over the field of complex numbers, i.e., for complex numbers, the two operations are the same.

The real numbers are those complex numbers for which ᾱ = α. The class of linear transformations which are the analogs of the real numbers is of great importance. This class is introduced by

DEFINITION 2. A linear transformation is called self-adjoint (Hermitian) if A* = A.

We now show that for a linear transformation A to be self-adjoint it is necessary and sufficient that the bilinear form (Ax, y) be Hermitian. Indeed, to say that the form (Ax, y) is Hermitian is to say that (Ax, y) is the complex conjugate of (Ay, x); to say that A is self-adjoint is to say that (Ax, y) = (x, Ay). Since (x, Ay) is the complex conjugate of (Ay, x), the two conditions are equivalent.

Every complex number is representable in the form ζ = α + iβ, α, β real. Similarly,

Every linear transformation A can be written as a sum

(3)   A = A₁ + iA₂,

where A₁ and A₂ are self-adjoint transformations.

In fact, let A₁ = (A + A*)/2 and A₂ = (A − A*)/2i. Then A = A₁ + iA₂ and

    A₁* = [(A + A*)/2]* = (A* + A**)/2 = (A* + A)/2 = A₁,
    A₂* = [(A − A*)/2i]* = (A* − A**)/(−2i) = (A − A*)/2i = A₂,

i.e., A₁ and A₂ are self-adjoint. This brings out the analogy between real numbers and self-adjoint transformations.

EXERCISES. 1. Prove the uniqueness of the representation (3) of A.
2. Prove that a linear combination with real coefficients of self-adjoint transformations is again self-adjoint.
3. Prove that if A is an arbitrary linear transformation, then AA* and A*A are self-adjoint.

NOTE: In contradistinction to complex numbers, AA* is, in general, different from A*A.

The product of two self-adjoint transformations is, in general, not self-adjoint. However:

THEOREM 3. For the product AB of two self-adjoint transformations A and B to be self-adjoint it is necessary and sufficient that A and B commute.

Proof: We know that A* = A and B* = B. We wish to find a condition which is necessary and sufficient for

(4)   (AB)* = AB.

Now, (AB)* = B*A* = BA. Hence (4) is equivalent to the equation AB = BA. This proves the theorem.

EXERCISE. Show that if A and B are self-adjoint, then AB + BA and i(AB − BA) are also self-adjoint.

The analog of complex numbers of absolute value one are the unitary transformations.

DEFINITION 3. A linear transformation U is called unitary if UU* = U*U = E. (In n-dimensional spaces UU* = E and U*U = E are equivalent statements; this is not the case in infinite dimensional spaces.) In other words, for a unitary transformation U, U* = U⁻¹.

In § 13 we shall become familiar with a very simple geometric interpretation of unitary transformations.

EXERCISES. 1. Show that the product of two unitary transformations is a unitary transformation.
2. Show that if U is unitary and A self-adjoint, then U⁻¹AU is again self-adjoint.

In the sequel (§ 15) we shall prove that every linear transformation can be written as the product of a self-adjoint transformation and a unitary transformation. This result can be regarded as a generalization of the result on the trigonometric form of a complex number.

DEFINITION 4. A linear transformation A is called normal if AA* = A*A.

There is no need to introduce an analogous concept in the field of complex numbers, since multiplication of complex numbers is commutative. It is easy to see that unitary transformations and self-adjoint transformations are normal.

The subsequent sections of this chapter are devoted to a more detailed study of the various classes of linear transformations just introduced. In the course of this study we shall become familiar with very simple geometric characterizations of these classes of transformations.

§ 12. Self-adjoint (Hermitian) transformations. Simultaneous reduction of a pair of quadratic forms to a sum of squares

1. Self-adjoint transformations. This section is devoted to a more detailed study of self-adjoint transformations on n-dimensional Euclidean space. These transformations are frequently encountered in different applications. (Self-adjoint transformations on infinite dimensional space play an important role in quantum mechanics.)

LEMMA 1. The eigenvalues of a self-adjoint transformation are real.

Proof: Let x be an eigenvector of a self-adjoint transformation A and let λ be the eigenvalue corresponding to x, i.e., Ax = λx, x ≠ 0. Since A* = A,

    (Ax, x) = (x, Ax),

that is,

    (λx, x) = (x, λx),

or

    λ(x, x) = λ̄(x, x).

Since (x, x) ≠ 0, it follows that λ = λ̄, which proves that λ is real.

LEMMA 2. Let A be a self-adjoint transformation on an n-dimensional Euclidean vector space R and let e be an eigenvector of A. The totality R₁ of vectors x orthogonal to e forms an (n − 1)-dimensional subspace invariant under A.

Proof: The totality R₁ of vectors x orthogonal to e forms an (n − 1)-dimensional subspace of R. We show that R₁ is invariant under A. Let x ∈ R₁, i.e., (x, e) = 0. We have to show that Ax ∈ R₁, i.e., (Ax, e) = 0. Indeed,

    (Ax, e) = (x, A*e) = (x, Ae) = (x, λe) = λ(x, e) = 0.

THEOREM 1. Let A be a self-adjoint transformation on an n-dimensional Euclidean space. Then there exist n pairwise orthogonal eigenvectors of A. The corresponding eigenvalues of A are all real.

Proof: According to Theorem 1, § 10, there exists at least one eigenvector e₁ of A. By Lemma 2, the totality of vectors orthogonal to e₁ forms an (n − 1)-dimensional invariant subspace R₁. We now consider our transformation A on R₁ only. In R₁ there exists a vector e₂ which is an eigenvector of A (cf. note to Theorem 1, § 10). The totality of vectors of R₁ orthogonal to e₂ forms an (n − 2)-dimensional invariant subspace R₂. In R₂ there exists an eigenvector e₃ of A, etc. In this manner we obtain n pairwise orthogonal eigenvectors e₁, e₂, ⋯, eₙ. By Lemma 1, the corresponding eigenvalues are real. This proves Theorem 1.

THEOREM 2. Let A be a linear transformation on an n-dimensional Euclidean space R. For A to be self-adjoint it is necessary and sufficient that there exist an orthogonal basis relative to which the matrix of A is diagonal and real.

Necessity: Let A be self-adjoint. Since the product of an eigenvector by any non-zero number is again an eigenvector, we can select the vectors e_i so that each of them is of length one. Select in R a basis consisting of

the n pairwise orthogonal eigenvectors e₁, e₂, ⋯, eₙ of A constructed in the proof of Theorem 1. Since

    Ae₁ = λ₁e₁, Ae₂ = λ₂e₂, ⋯, Aeₙ = λₙeₙ,

it follows that relative to this basis the matrix of the transformation A is of the form

(1)   [λ₁ 0 ⋯ 0]
      [0 λ₂ ⋯ 0]
      [⋯⋯⋯⋯]
      [0 0 ⋯ λₙ]

where the λ_i are real.

Sufficiency: Assume now that the matrix of the transformation A has relative to an orthogonal basis the form (1). The matrix of the adjoint transformation A* relative to an orthonormal basis is obtained by replacing all entries in the transpose of the matrix of A by their conjugates (cf. § 11). In our case this operation has no effect on the matrix in question. Hence the transformations A and A* have the same matrix, i.e., A = A*. This concludes the proof of Theorem 2.

We note the following property of the eigenvectors of a self-adjoint transformation: the eigenvectors corresponding to different eigenvalues are orthogonal. Indeed, let

    Ae₁ = λ₁e₁,  Ae₂ = λ₂e₂,  λ₁ ≠ λ₂.

Then

    (Ae₁, e₂) = (e₁, A*e₂) = (e₁, Ae₂),

i.e.,

    λ₁(e₁, e₂) = λ₂(e₁, e₂),

or

    (λ₁ − λ₂)(e₁, e₂) = 0.

Since λ₁ ≠ λ₂, it follows that (e₁, e₂) = 0.
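
Theorems 1 and 2 are exactly what np.linalg.eigh computes for a Hermitian matrix: real eigenvalues and an orthonormal system of eigenvectors in which the matrix becomes diagonal. A short sketch with an arbitrary Hermitian example:

```python
import numpy as np

A = np.array([[2.0, 1 - 1j], [1 + 1j, 3.0]])   # Hermitian: a_ik = conj(a_ki)
evals, evecs = np.linalg.eigh(A)               # real eigenvalues, orthonormal eigenvectors

print(evals)                                                       # all real
print(np.allclose(evecs.conj().T @ A @ evecs, np.diag(evals)))     # diagonal in the eigenbasis
```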

NOTE: Theorem 2 suggests the following geometric interpretation of a self-adjoint transformation: We select in our space n pairwise orthogonal directions (the directions determined by the eigenvectors) and associate with each a real number λ_i (an eigenvalue). Along each one of these directions we perform a stretching by |λ_i| and, in addition, if λ_i happens to be negative, a reflection in the plane orthogonal to the corresponding direction.

Along with the notion of a self-adjoint transformation we introduce the notion of a Hermitian matrix. The matrix ||a_ik|| is said to be Hermitian if a_ik = ā_ki. Clearly, a necessary and sufficient condition for a linear transformation A to be self-adjoint is that its matrix relative to some orthogonal basis be Hermitian.

EXERCISE. Raise the matrix

    [0   √2]
    [√2   1]

to the 28th power. Hint: Bring the matrix to its diagonal form, raise it to the proper power, and then revert to the original basis.

2. Reduction to principal axes. Simultaneous reduction of a pair of quadratic forms to a sum of squares. We now apply the results obtained in para. 1 to quadratic forms.

We know that we can associate with each Hermitian bilinear form a self-adjoint transformation. Theorem 2 permits us now to state the important

THEOREM 3. Let A(x; y) be a Hermitian bilinear form defined on an n-dimensional Euclidean space R. Then there exists an orthonormal basis in R relative to which the corresponding quadratic form can be written as a sum of squares,

    A(x; x) = λ₁|ξ₁|² + λ₂|ξ₂|² + ⋯ + λₙ|ξₙ|²,

where the λ_i are real and the ξ_i are the coordinates of the vector x. (We have shown in § 8 that in any vector space a Hermitian quadratic form can be written in an appropriate basis as a sum of squares. In the case of a Euclidean space we can assert the existence of an orthonormal basis relative to which a given Hermitian quadratic form can be reduced to a sum of squares.)

The process of finding an orthonormal basis in a Euclidean space relative to which a given quadratic form can be represented as a sum of squares is called reduction to principal axes.

Proof: Let A(x; y) be a Hermitian bilinear form, i.e., let A(x; y) be the complex conjugate of A(y; x);

y).LINEAR TRANSFORMATIONS 101 then there exists (cf. x) and B(x. where B(x.121 212 + + Arisni2. for we get A (x. y) is the bilinear form corresponding to B(x. y) e2Ae2 + 22e2e2 + + en Aen . x) 211$112 . i=k for i k. The process of finding an orthonormal basis in a Euclidean space relative to which a given quadratic form can be represented as a sum of squares is called reduction to principal axes. x) be two Hermitian quadratic forms on an n-dimensional vector space R and assume B(x. This proves the theorem. THEOREM 4. Let Ae2 = 12e2. x = ei Since e2e. § 11) a self-adjoint linear transformation A such that A (x. + +e . y) B(x. y). Theorem 1). x) to be positive definite. By . y I1 0 e. Then Ael = 21e1. Then there exists a basis in R relative to which each form can be written as a sum of squares. e2. y) = = (Ax. Aen An en. Proof: We introduce in R an inner product by putting (x. x) = (Ax.An enen. %el /12e2 + + ?)en) + nnen) n2e2 + = 1E11 + 225 In particular + + fin A (x. + n2 e2 + + nn. x). Let A (x. As our orthonormal basis vectors we select the pairwise orthogonal eigenvectors e1. This can be done since the axioms for an inner product state that (x. y) (Ax. . With the introduction of an inner product our space R becomes a Euclidean vector space. en of the self-adjoint transformation A (cf. n1e1 -I. y) is a Hermitian bilinear form corresponding to a positive definite quadratic form (§ 8).

Theorem 3, R contains a basis e₁, e₂, ⋯, eₙ, orthonormal relative to the inner product (x, y) = B(x; y), in which the form A(x; x) can be written as a sum of squares,

(2)   A(x; x) = λ₁|ξ₁|² + λ₂|ξ₂|² + ⋯ + λₙ|ξₙ|².

Now, with respect to an orthonormal basis an inner product takes the form

    (x, x) = |ξ₁|² + |ξ₂|² + ⋯ + |ξₙ|².

Since B(x; x) = (x, x), it follows that

(3)   B(x; x) = |ξ₁|² + |ξ₂|² + ⋯ + |ξₙ|².

We have thus found a basis relative to which both quadratic forms A(x; x) and B(x; x) are expressible as sums of squares.

We now show how to find the numbers λ₁, λ₂, ⋯, λₙ which appear in (2) above. Relative to the basis e₁, e₂, ⋯, eₙ the matrices of the quadratic forms A and B have the canonical form

    𝒜 = diag(λ₁, λ₂, ⋯, λₙ),   ℬ = ℰ.

Consequently,

(4)   det (𝒜 − λℬ) = (λ₁ − λ)(λ₂ − λ) ⋯ (λₙ − λ).

Under a change of basis the matrices of the Hermitian quadratic forms A and B go over into the matrices 𝒜₁ = 𝒞*𝒜𝒞 and ℬ₁ = 𝒞*ℬ𝒞. Hence, if e'₁, e'₂, ⋯, e'ₙ is an arbitrary basis, then with respect to this basis

    det (𝒜₁ − λℬ₁) = det 𝒞* · det (𝒜 − λℬ) · det 𝒞,

i.e., det (𝒜₁ − λℬ₁) differs from (4) by a multiplicative constant. It follows that the numbers λ₁, λ₂, ⋯, λₙ are the roots of the equation

    | a₁₁ − λb₁₁   a₁₂ − λb₁₂   ⋯   a₁ₙ − λb₁ₙ |
    | a₂₁ − λb₂₁   a₂₂ − λb₂₂   ⋯   a₂ₙ − λb₂ₙ |   = 0,
    | ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯           |
    | aₙ₁ − λbₙ₁   aₙ₂ − λbₙ₂   ⋯   aₙₙ − λbₙₙ |
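
Since ℬ is non-singular (B is positive definite), the roots of det(𝒜 − λℬ) = 0 coincide with the eigenvalues of ℬ⁻¹𝒜; that observation, which is mine and not the text's, gives a simple way to compute the numbers λ_i numerically. A sketch with arbitrary real symmetric example matrices, B positive definite:

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 4.0]])   # matrix of the Hermitian form A(x; x)
B = np.array([[3.0, 1.0], [1.0, 2.0]])   # matrix of the positive definite form B(x; x)

# Roots of det(A - lambda*B) = 0, computed as eigenvalues of B^(-1) A:
lambdas = np.linalg.eigvals(np.linalg.inv(B) @ A)
print(np.sort(lambdas.real))             # the coefficients lambda_1, ..., lambda_n of (2)
```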

where ||a_ik|| and ||b_ik|| are the matrices of the quadratic forms A(x; x) and B(x; x) in some basis e₁, e₂, ⋯, eₙ.

NOTE: The following example illustrates that the requirement that one of the two forms be positive definite is essential. The two quadratic forms

    A(x; x) = |ξ₁|² − |ξ₂|²,   B(x; x) = ξ₁ξ̄₂ + ξ₂ξ̄₁,

neither of which is positive definite, cannot be reduced simultaneously to a sum of squares. Indeed, the matrix of the first form is

    [1  0]
    [0 −1]

and the matrix of the second form is

    [0 1]
    [1 0]

Consider the matrix 𝒜 − λℬ, where λ is a real parameter:

    [1   −λ]
    [−λ  −1]

Its determinant is equal to −(λ² + 1) and has no real roots. Therefore, in accordance with the preceding discussion, the two forms cannot be reduced simultaneously to a sum of squares.

§ 13. Unitary transformations

In § 11 we defined a unitary transformation by the equation

(1)   UU* = U*U = E.

This definition has a simple geometric interpretation, namely:

A unitary transformation U on an n-dimensional Euclidean space R preserves inner products, i.e.,

    (Ux, Uy) = (x, y)

for all x, y ∈ R. Conversely, any linear transformation U which preserves inner products is unitary (i.e., it satisfies condition (1)).

Indeed, assume U*U = E. Then

    (Ux, Uy) = (x, U*Uy) = (x, y).

Conversely, if for any vectors x and y

    (Ux, Uy) = (x, y),

then

    (U*Ux, y) = (x, y),   that is,   (U*Ux, y) = (Ex, y).

Since equality of bilinear forms implies equality of the corresponding transformations, it follows that U*U = E, i.e., U is unitary.

In particular, for x = y we have

    (Ux, Ux) = (x, x),

that is, a unitary transformation preserves the length of a vector.

EXERCISE. Prove that a linear transformation which preserves length is unitary.

We shall now characterize the matrix of a unitary transformation. To do this, we select an orthonormal basis e₁, e₂, ⋯, eₙ. Let

(2)   [a₁₁ a₁₂ ⋯ a₁ₙ]
      [a₂₁ a₂₂ ⋯ a₂ₙ]
      [⋯⋯⋯⋯⋯]
      [aₙ₁ aₙ₂ ⋯ aₙₙ]

be the matrix of the transformation U relative to this basis. Then

(3)   [ā₁₁ ā₂₁ ⋯ āₙ₁]
      [ā₁₂ ā₂₂ ⋯ āₙ₂]
      [⋯⋯⋯⋯⋯]
      [ā₁ₙ ā₂ₙ ⋯ āₙₙ]

is the matrix of the adjoint U* of U. The condition UU* = E implies that the product of the matrices (2) and (3) is equal to the unit matrix, that is,

(4)   Σ_{α=1}^{n} a_{iα} ā_{iα} = 1,   Σ_{α=1}^{n} a_{iα} ā_{kα} = 0  (i ≠ k).

Thus, relative to an orthonormal basis, the matrix of a unitary transformation U has the following properties: the sum of the products of the elements of any row by the conjugates of the corresponding elements of any other row is equal to zero; the sum of the squares of the moduli of the elements of any row is equal to one.

Making use of the condition U*U = E we obtain, in addition,

(5)   Σ_{α=1}^{n} a_{αi} ā_{αi} = 1,   Σ_{α=1}^{n} a_{αi} ā_{αk} = 0  (i ≠ k).

This condition is analogous to the preceding one, but refers to the columns rather than the rows of the matrix of U.

Condition (5) has a simple geometric meaning. Indeed, the inner product of the vectors

    Ue_i = a₁ᵢe₁ + a₂ᵢe₂ + ⋯ + aₙᵢeₙ   and   Ueₖ = a₁ₖe₁ + a₂ₖe₂ + ⋯ + aₙₖeₙ

is equal to Σ_α a_{αi} ā_{αk} (since we assumed e₁, e₂, ⋯, eₙ to be an orthonormal basis). Hence

(6)   (Ue_i, Ueₖ) = 1 for i = k,   (Ue_i, Ueₖ) = 0 for i ≠ k.

It follows that a necessary and sufficient condition for a linear transformation U to be unitary is that it take an orthonormal basis e₁, e₂, ⋯, eₙ into an orthonormal basis Ue₁, Ue₂, ⋯, Ueₙ.

A matrix ||a_ik|| whose elements satisfy condition (4) or, equivalently, condition (5) is called unitary. As we have shown, unitary matrices are matrices of unitary transformations relative to an orthonormal basis. Since a transformation which takes an orthonormal basis into another orthonormal basis is unitary, the matrix of transition from an orthonormal basis to another orthonormal basis is also unitary.

We shall now try to find the simplest form of the matrix of a unitary transformation relative to some suitably chosen basis.

LEMMA 1. The eigenvalues of a unitary transformation are in absolute value equal to one.

Proof: Let x be an eigenvector of a unitary transformation U and let λ be the corresponding eigenvalue, i.e., Ux = λx, x ≠ 0. Then

    (x, x) = (Ux, Ux) = (λx, λx) = λλ̄(x, x),

i.e., λλ̄ = 1, or |λ| = 1.

LEMMA 2. Let U be a unitary transformation on an n-dimensional space R and e its eigenvector, i.e., Ue = λe, e ≠ 0. Then the (n − 1)-dimensional subspace R₁ of R consisting of all vectors x orthogonal to e is invariant under U.
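
Conditions (4) and (5) and Lemma 1 can all be checked at once for a concrete unitary matrix. The sketch below uses a plane rotation, which is a real (orthogonal, hence unitary) example of my own choosing.

```python
import numpy as np

theta = 0.3
U = np.array([[np.cos(theta), -np.sin(theta)],    # rotation of the plane: a unitary
              [np.sin(theta),  np.cos(theta)]])   # (here real, i.e. orthogonal) matrix

print(np.allclose(U @ U.conj().T, np.eye(2)))     # rows orthonormal: condition (4)
print(np.allclose(U.conj().T @ U, np.eye(2)))     # columns orthonormal: condition (5)
print(np.abs(np.linalg.eigvals(U)))               # eigenvalues of absolute value one (Lemma 1)
```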

e) = 0. (7) o o 22 oi. the transformation U as a linear transformation has at least one eigenvector. We claim that the n pairwise orthogonal eigenvectors constructed in the preceding theorem constitute the desired basis. i. Let U be a unitary transformation defined on an n-dimensional Euclidean space R. Denote by R2 the invariant subspace consisting of all vectors of R1 orthogonal to e2. has the form [2. of all vectors of R which are orthogonal to e.106 LECTURES ON LINEAR ALGEBRA i.e. e) = 0. Indeed. . e) Proof: Let x E R. i.O. o The numbers 4. 4. is invariant under U. one. O. = 22e2. e) = 0.e.e. 0 0. By Lemma 1.. = Ue. etc. By Lemma 1 the eigenvalues corresponding to these eigenvectors are in absolute value equal to one. (Ux. The corresponding eigenvalues are in absolute value equal to one. Then there exists an orthonormal basis in R relative to which the matrix of the transformation U is diagonal. Let U be a unitary transformation on an n-dimen- sional Euclidean space R.e.. Hence R. Ue = .. Since Ue = ae. Ux E Thus. R2 contains at least one eigenvector e3 of U. contains at least one eigenvector e2 of U. A are in absolute value equal to Proof: Let U be a unitary transformation. THEOREM 2. Ue. Denote this vector by el. .. the subspace R1 . . e) --. e) = (x. § 10. hence (Ux. Ue) = (U*Ux. Then U has n pairwise orthogonal eigenvectors. Proof: In view of Theorem 1. Proceeding in this manner we obtain n pairwise orthogonal eigenvectors e. Indeed. is indeed invariant under U.. it follows that i(Ux. (x. We shall show that Ux e R1. i. (Ux. THEOREM 1. the (n 1)-dimensional subspace R. By Lemma 2. en of the transformation U.

and, therefore, the matrix of U relative to the basis e₁, e₂, ⋯, eₙ has form (7). By Lemma 1 the numbers λ₁, λ₂, ⋯, λₙ are in absolute value equal to one. This proves the theorem.

EXERCISES. 1. Prove the converse of Theorem 2, i.e., if the matrix of U has form (7) relative to some orthogonal basis, then U is unitary.
2. Prove that if A is a self-adjoint transformation then the transformation (A − iE)⁻¹(A + iE) exists and is unitary.

Since the matrix of transition from one orthonormal basis to another is unitary, we can give the following matrix interpretation to the result obtained in this section.

Let 𝒰 be a unitary matrix. Then there exists a unitary matrix 𝒱 such that

    𝒰 = 𝒱⁻¹𝒟𝒱,

where 𝒟 is a diagonal matrix whose non-zero elements are equal in absolute value to one.

Analogously, the main result of para. 1, § 12, can be given the following matrix interpretation.

Let 𝒜 be a Hermitian matrix. Then 𝒜 can be represented in the form

    𝒜 = 𝒱⁻¹𝒟𝒱,

where 𝒱 is a unitary matrix and 𝒟 a diagonal matrix whose non-zero elements are real.

§ 14. Commutative linear transformations. Normal transformations

1. Commutative transformations. We have shown (§ 12) that for each self-adjoint transformation there exists an orthonormal basis relative to which the matrix of the transformation is diagonal. It may turn out that, given a number of self-adjoint transformations, we can find a basis relative to which all these transformations are represented by diagonal matrices. We shall now discuss conditions for the existence of such a basis. We first consider the case of two transformations.

LEMMA 1. Let A and B be two commutative linear transformations, i.e., let AB = BA.

Then the eigenvectors of A which correspond to a given eigenvalue λ of A form (together with the null vector) a subspace R_λ invariant under the transformation B.

Proof: Let AB = BA and let R_λ be the subspace consisting of all vectors x for which Ax = λx. We have to show that if x ∈ R_λ, then Bx ∈ R_λ. Since AB = BA, we have

    ABx = BAx = Bλx = λBx,

i.e., ABx = λBx, which proves our lemma.

NOTE: If AB = BA we cannot claim that every eigenvector of A is also an eigenvector of B. For instance, if A is the identity transformation E, B a linear transformation other than E, and x a vector which is not an eigenvector of B, then x is an eigenvector of E, EB = BE, and x is not an eigenvector of B.

LEMMA 2. Any two commutative transformations have a common eigenvector.

Proof: Let AB = BA and let R_λ be the subspace consisting of all vectors x for which Ax = λx, where λ is an eigenvalue of A. By Lemma 1, R_λ is invariant under B. Hence R_λ contains a vector x₀ which is an eigenvector of B. x₀ is also an eigenvector of A, since by assumption all the vectors of R_λ are eigenvectors of A.

THEOREM 1. Let A and B be two linear self-adjoint transformations defined on a complex n-dimensional vector space R. A necessary and sufficient condition for the existence of an orthogonal basis in R relative to which the transformations A and B are represented by diagonal matrices is that A and B commute.

Sufficiency: Let AB = BA. Then, by Lemma 2, there exists a vector e₁ which is an eigenvector of both A and B:

    Ae₁ = λ₁e₁,   Be₁ = μ₁e₁.

The (n − 1)-dimensional subspace R₁ orthogonal to e₁ is invariant under A and B (cf. Lemma 2, § 12). Now consider A and B on R₁ only. By Lemma 2, there exists a vector e₂ in R₁ which is an eigenvector of A and B:

    Ae₂ = λ₂e₂,   Be₂ = μ₂e₂.

All vectors of R₁ which are orthogonal to e₂ form an (n − 2)-dimensional subspace invariant under A and B, etc. Proceeding in this way we get n pairwise orthogonal eigenvectors e₁, e₂, ⋯, eₙ of A and B:

    Ae_i = λ_i e_i,   Be_i = μ_i e_i   (i = 1, ⋯, n).

Relative to e₁, e₂, ⋯, eₙ the matrices of A and B are diagonal. This completes the sufficiency part of the proof.

Necessity: Assume that the matrices of A and B are diagonal relative to some orthogonal basis. It follows that these matrices commute. But then the transformations themselves commute.

NOTE: Theorem 1 can be generalized to any set of pairwise commutative self-adjoint transformations. The proof follows that of Theorem 1, but instead of Lemma 2 the following lemma is made use of:

LEMMA 2'. The elements of any set of pairwise commutative transformations on a vector space R have a common eigenvector.

Proof: The proof is by induction on the dimension of the space R. In the case of one-dimensional space (n = 1) the lemma is obvious. We assume that it is true for spaces of dimension < n and prove it for an n-dimensional space.

If every vector of R is an eigenvector of all the transformations A, B, C, ⋯ in our set, our lemma is proved. Assume therefore that there exists a vector in R which is not an eigenvector of the transformation A, say. Let R_λ be the set of all eigenvectors of A corresponding to some eigenvalue λ of A. By Lemma 1, R_λ is invariant under each of the transformations B, C, ⋯ (obviously, R_λ is also invariant under A). Furthermore, R_λ is a subspace different from the null space and the whole space. Since, by assumption, our lemma is true for spaces of dimension < n, R_λ must contain a vector which is an eigenvector of the transformations A, B, C, ⋯. This proves our lemma.

EXERCISE. Let U₁ and U₂ be two commutative unitary transformations. Prove that there exists a basis relative to which the matrices of U₁ and U₂ are diagonal.

2. Normal transformations. In §§ 12 and 13 we considered two classes of linear transformations which are represented in a suitable orthonormal basis by a diagonal matrix. We shall now characterize all transformations with this property.

THEOREM 2. A necessary and sufficient condition for the existence of an orthogonal basis relative to which a transformation A is represented by a diagonal matrix is AA* = A*A (such transformations are said to be normal, cf. § 11).

Necessity: Let the matrix of the transformation A be diagonal relative to some orthonormal basis, i.e., let the matrix be of the form

    [λ₁ 0 ⋯ 0]
    [0 λ₂ ⋯ 0]
    [⋯⋯⋯⋯]
    [0 0 ⋯ λₙ]

Relative to such a basis the matrix of the transformation A* has the form

    [λ̄₁ 0 ⋯ 0]
    [0 λ̄₂ ⋯ 0]
    [⋯⋯⋯⋯]
    [0 0 ⋯ λ̄ₙ]

Since the matrices of A and A* are diagonal, they commute. It follows that A and A* commute.

Sufficiency: Assume that A and A* commute. Then by Lemma 2 there exists a vector e₁ which is an eigenvector of A and A*, i.e.,

    Ae₁ = λ₁e₁,   A*e₁ = μ₁e₁.

(EXERCISE. Prove that μ₁ = λ̄₁.)

The (n − 1)-dimensional subspace R₁ of vectors orthogonal to e₁ is invariant under A as well as under A*. Indeed, let x ∈ R₁, i.e., (x, e₁) = 0. Then

    (Ax, e₁) = (x, A*e₁) = (x, μ₁e₁) = μ̄₁(x, e₁) = 0,

i.e., Ax ∈ R₁. This proves that R₁ is invariant under A. The invariance of R₁ under A* is proved in an analogous manner.

Applying now Lemma 2 to R₁, we can claim that R₁ contains a vector e₂ which is an eigenvector of A and A*. Let R₂ be the (n − 2)-dimensional subspace of vectors from R₁ orthogonal to e₂, etc. Continuing in this manner we construct n pairwise orthogonal vectors e₁, e₂, ⋯, eₙ which are eigenvectors of A and A*.
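
A concrete normal matrix that is neither Hermitian nor unitary illustrates Theorem 2: it commutes with its adjoint, and its eigenvectors can be chosen pairwise orthogonal. The example matrix below is my own (a multiple of a rotation), not taken from the text.

```python
import numpy as np

A = np.array([[1.0, -1.0], [1.0, 1.0]])             # normal: A A* = A* A
print(np.allclose(A @ A.conj().T, A.conj().T @ A))  # True

evals, evecs = np.linalg.eig(A)                     # complex eigenvalues 1 +/- i
# For a normal matrix with distinct eigenvalues the (unit) eigenvectors are orthogonal:
print(np.allclose(evecs.conj().T @ evecs, np.eye(2)))
```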

The vectors e₁, e₂, ..., eₙ form an orthogonal basis relative to which both A and A* are represented by diagonal matrices. This completes the proof of Theorem 2.

Note that if A is a self-adjoint transformation, then AA* = A*A = A², i.e., A is normal. A unitary transformation U is also normal, since UU* = U*U = E. Thus some of the results obtained in §§ 12 and 13 are special cases of Theorem 2.

An alternative sufficiency proof. Let

    A₁ = (A + A*)/2,   A₂ = (A − A*)/2i.

The transformations A₁ and A₂ are self-adjoint and A = A₁ + iA₂. If A and A* commute, then so do A₁ and A₂. By Theorem 1, there exists an orthonormal basis in which A₁ and A₂ are represented by diagonal matrices. But then the same is true of A = A₁ + iA₂.

EXERCISES. 1. Prove that the matrices of a set of normal transformations any two of which commute are simultaneously diagonable.
2. Prove that a normal transformation A can be written in the form A = HU = UH, where H is self-adjoint, U unitary, and where H and U commute. Hint: Select a basis relative to which A and A* are diagonable.
3. Prove that if A = HU, where H and U commute, H is self-adjoint and U unitary, then A is normal.
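The content of Theorem 2 can be checked numerically. The following small sketch is not from the book; it assumes numpy is available and uses an invented matrix of the form UDU* (U unitary, D diagonal), which commutes with its adjoint although it is neither self-adjoint nor unitary.

    import numpy as np

    rng = np.random.default_rng(0)
    M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    U, _ = np.linalg.qr(M)                        # a unitary matrix
    D = np.diag([1 + 2j, -0.5j, 3.0])             # arbitrary complex eigenvalues
    A = U @ D @ U.conj().T

    # AA* = A*A, so A is normal and hence diagonable in an orthonormal basis,
    print(np.allclose(A @ A.conj().T, A.conj().T @ A))
    # although A is neither self-adjoint nor unitary:
    print(np.allclose(A, A.conj().T), np.allclose(A @ A.conj().T, np.eye(3)))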

§ 15. Decomposition of a linear transformation into a product of a unitary and a self-adjoint transformation

Every complex number can be written as a product of a positive number and a number whose absolute value is one (the so-called trigonometric form of a complex number). Unitary transformations are the analog of numbers of absolute value one. The analog of positive numbers are the so-called positive definite linear transformations.

DEFINITION 1. A linear transformation H is called positive definite if it is self-adjoint and if (Hx, x) ≥ 0 for all x.

We shall now derive an analogous result for linear transformations.

THEOREM 1. Every non-singular linear transformation A can be represented in the form

    A = HU   (or A = U₁H₁),

where H (H₁) is a non-singular positive definite transformation and U (U₁) a unitary transformation.

We shall first assume the theorem true and show how to find the necessary H and U. This will suggest a way of proving the theorem. H is easily expressible in terms of A. Indeed, let A = HU. Then

    AA* = HU(HU)* = HUU*H = H²,

so that AA* = H². Thus, in order to find H one has to "extract the square root" of AA*. Having found H, we put U = H⁻¹A.

Before proving Theorem 1 we establish three lemmas.

LEMMA 1. Given any linear transformation A, the transformation AA* is self-adjoint. If A is non-singular, then AA* is positive definite and non-singular.

Proof: AA* is self-adjoint; indeed, (AA*)* = A**A* = AA*. Furthermore,

    (AA*x, x) = (A*x, A*x) ≥ 0   for all x,

so that AA* is positive definite. Now let A be non-singular. Then the determinant of the matrix ||a_ik|| of the transformation A relative to any orthogonal basis is different from zero. The determinant of the matrix ||ā_ki|| of the transformation A* relative to the same basis is the complex conjugate of the determinant of the matrix ||a_ik||. Hence the determinant of the matrix of AA* is different from zero, which means that AA* is non-singular.

LEMMA 2. The eigenvalues of a positive definite transformation B are non-negative. Conversely, if all the eigenvalues of a self-adjoint transformation B are non-negative, then B is positive definite.

Proof: Let B be positive definite and let Be = λe, e ≠ 0. Then (Be, e) = λ(e, e). Since (Be, e) ≥ 0 and (e, e) > 0, it follows that λ ≥ 0.

Conversely, assume that all the eigenvalues λ₁, λ₂, ..., λₙ of a self-adjoint transformation B are non-negative. Let e₁, e₂, ..., eₙ be an orthonormal basis consisting of the eigenvectors of B and let

    x = ξ₁e₁ + ξ₂e₂ + ... + ξₙeₙ

be any vector of R. Then

(1)    (Bx, x) = (ξ₁λ₁e₁ + ξ₂λ₂e₂ + ... + ξₙλₙeₙ, ξ₁e₁ + ξ₂e₂ + ... + ξₙeₙ)
              = λ₁|ξ₁|² + λ₂|ξ₂|² + ... + λₙ|ξₙ|².

Since all the λᵢ are non-negative, it follows that (Bx, x) ≥ 0, i.e., B is positive definite.

NOTE: It is clear from equality (1) that if all the λᵢ are positive, then the transformation B is non-singular, and, conversely, if B is positive definite and non-singular, then the λᵢ are positive.

LEMMA 3. Given any positive definite transformation B, there exists a positive definite transformation H such that H² = B (in this case we write H = B^{1/2}). In addition, if B is non-singular, then H is non-singular.

Proof: We select in R an orthogonal basis relative to which B is of the form

    B = [λ₁  0  ...  0 ]
        [0   λ₂ ...  0 ]
        [..............]
        [0   0  ...  λₙ],

where λ₁, λ₂, ..., λₙ are the eigenvalues of B. By Lemma 2 all λᵢ ≥ 0. Put

    H = [√λ₁  0   ...  0  ]
        [0    √λ₂ ...  0  ]
        [.................]
        [0    0   ...  √λₙ].

Then H² = B, and applying Lemma 2 again we conclude that H is positive definite. Furthermore, if B is non-singular, then (cf. the note to Lemma 2) λᵢ > 0; hence √λᵢ > 0 and H is non-singular.
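As an illustration of Lemma 3 and of the construction H = (AA*)^{1/2}, U = H⁻¹A used in the proof of Theorem 1, here is a small numerical sketch. It is not part of the text; it assumes numpy, and the function name sqrt_positive_definite and the random matrix are invented for the example.

    import numpy as np

    def sqrt_positive_definite(B):
        # B is assumed Hermitian positive definite; eigh returns real
        # eigenvalues and an orthonormal basis of eigenvectors.
        lam, V = np.linalg.eigh(B)
        return V @ np.diag(np.sqrt(lam)) @ V.conj().T

    rng = np.random.default_rng(0)
    A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))  # non-singular with probability 1
    H = sqrt_positive_definite(A @ A.conj().T)                   # H = (AA*)^(1/2)
    U = np.linalg.solve(H, A)                                    # U = H^(-1) A

    print(np.allclose(A, H @ U))                   # A = HU
    print(np.allclose(U @ U.conj().T, np.eye(3)))  # U is unitary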

We now prove Theorem 1. Let

    H = (AA*)^{1/2}.

In view of Lemmas 1 and 3, H is a non-singular positive definite transformation. If

(2)    U = H⁻¹A,

then U is unitary. Indeed,

    UU* = H⁻¹A(H⁻¹A)* = H⁻¹AA*H⁻¹ = H⁻¹H²H⁻¹ = E.

Making use of eq. (2) we get A = HU. This completes the proof of Theorem 1.

The operation of extracting the square root of a transformation can be used to prove the following theorem:

THEOREM. Let A be a non-singular positive definite transformation and let B be a self-adjoint transformation. Then the eigenvalues of the transformation AB are real.

Proof: We know that the transformations X = AB and C⁻¹XC have the same characteristic polynomials and therefore the same eigenvalues. If we can choose C so that C⁻¹XC is self-adjoint, then C⁻¹XC and X = AB will both have real eigenvalues. A suitable choice for C is C = A^{1/2}. Then

    C⁻¹XC = A^{−1/2}ABA^{1/2} = A^{1/2}BA^{1/2},

which is easily seen to be self-adjoint. Indeed,

    (A^{1/2}BA^{1/2})* = (A^{1/2})*B*(A^{1/2})* = A^{1/2}BA^{1/2}.

This completes the proof.

EXERCISE. Prove that if A and B are positive definite transformations, at least one of which is non-singular, then the transformation AB has non-negative eigenvalues.

§ 16. Linear transformations on a real Euclidean space

This section will be devoted to a discussion of linear transformations defined on a real space. For the purpose of this discussion

the reader need only be familiar with the material of §§ 9 through 11 of this chapter.

1. The concepts of invariant subspace, eigenvector, and eigenvalue introduced in § 10 were defined for a vector space over an arbitrary field and are therefore relevant in the case of a real vector space. In § 10 we proved that in a complex vector space every linear transformation has at least one eigenvector (a one-dimensional invariant subspace). This result, which played a fundamental role in the development of the theory of complex vector spaces, does not apply in the case of real spaces. An example is a rotation of the plane about the origin by an angle different from a multiple of π, that is, a linear transformation which does not have any one-dimensional invariant subspace. However, we can state the following

THEOREM 1. Every linear transformation in a real vector space R has a one-dimensional or two-dimensional invariant subspace.

Proof: Let e₁, e₂, ..., eₙ be a basis in R and let ||a_ik|| be the matrix of A relative to this basis. Consider the system of equations

(1)    a₁₁ξ₁ + a₁₂ξ₂ + ... + a₁ₙξₙ = λξ₁,
       a₂₁ξ₁ + a₂₂ξ₂ + ... + a₂ₙξₙ = λξ₂,
       ....................................
       aₙ₁ξ₁ + aₙ₂ξ₂ + ... + aₙₙξₙ = λξₙ.

The system (1) has a non-trivial solution if and only if

    | a₁₁ − λ   a₁₂      ...   a₁ₙ     |
    | a₂₁       a₂₂ − λ  ...   a₂ₙ     |  =  0.
    | ................................ |
    | aₙ₁       aₙ₂      ...   aₙₙ − λ |

This equation is an nth order polynomial equation in λ with real coefficients. Let λ₀ be one of its roots. There arise two possibilities:

a. λ₀ is a real root. Then we can find numbers ξ₁⁰, ξ₂⁰, ..., ξₙ⁰, not all zero, which are a solution of (1). These numbers are the coordinates of some vector x relative to the basis e₁, e₂, ..., eₙ. We can thus rewrite (1) in the form

    Ax = λ₀x,

i.e., the vector x spans a one-dimensional invariant subspace.

b. λ₀ = α + iβ with β ≠ 0. Let ξ₁ + iη₁, ξ₂ + iη₂, ..., ξₙ + iηₙ be a solution of (1). Replacing the ξₖ in (1) by these numbers and separating the real and imaginary parts we get

(2)    a₁₁ξ₁ + a₁₂ξ₂ + ... + a₁ₙξₙ = αξ₁ − βη₁,
       ....................................
       aₙ₁ξ₁ + aₙ₂ξ₂ + ... + aₙₙξₙ = αξₙ − βηₙ,

and

(2')   a₁₁η₁ + a₁₂η₂ + ... + a₁ₙηₙ = βξ₁ + αη₁,
       ....................................
       aₙ₁η₁ + aₙ₂η₂ + ... + aₙₙηₙ = βξₙ + αηₙ.

The numbers ξ₁, ξ₂, ..., ξₙ (η₁, η₂, ..., ηₙ) are the coordinates of some vector x (y) in R. The relations (2) and (2') can thus be rewritten as follows:

(3)    Ax = αx − βy,   Ay = βx + αy.

Equations (3) imply that the two-dimensional subspace spanned by the vectors x and y is invariant under A. In the sequel we shall make use of the fact that in a two-dimensional invariant subspace associated with the root λ = α + iβ the transformation has the form (3).

EXERCISE. Show that in an odd-dimensional space (in particular, a three-dimensional one) every transformation has a one-dimensional invariant subspace.
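The relations (3) are easy to check numerically. The sketch below is not from the book; it assumes numpy, and the matrix is an arbitrary example with a complex root α + iβ. The real and imaginary parts of a corresponding eigenvector play the roles of x and y.

    import numpy as np

    A = np.array([[0.0, -2.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 3.0]])

    lam, V = np.linalg.eig(A)
    k = np.argmax(np.abs(lam.imag))          # pick a genuinely complex root
    alpha, beta = lam[k].real, lam[k].imag
    x, y = V[:, k].real, V[:, k].imag

    print(np.allclose(A @ x, alpha * x - beta * y))   # Ax = αx − βy
    print(np.allclose(A @ y, beta * x + alpha * y))   # Ay = βx + αy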

2. Self-adjoint transformations.

DEFINITION 1. A linear transformation A defined on a real Euclidean space R is said to be self-adjoint if

(4)    (Ax, y) = (x, Ay)

for any vectors x and y.

Let e₁, e₂, ..., eₙ be an orthonormal basis in R and let

    x = ξ₁e₁ + ξ₂e₂ + ... + ξₙeₙ,   y = η₁e₁ + η₂e₂ + ... + ηₙeₙ.

Furthermore, let ζᵢ be the coordinates of the vector z = Ax, i.e.,

    ζᵢ = Σₖ a_ik ξₖ,

where ||a_ik|| is the matrix of A relative to the basis e₁, e₂, ..., eₙ. It follows that

    (Ax, y) = (z, y) = Σᵢ ζᵢηᵢ = Σ_{i,k} a_ik ξₖηᵢ.

Similarly,

    (x, Ay) = Σ_{i,k} a_ik ξᵢηₖ.

Thus condition (4) is equivalent to

    a_ik = a_ki.

To sum up: for a linear transformation to be self-adjoint it is necessary and sufficient that its matrix relative to an orthonormal basis be symmetric.

Relative to an arbitrary basis every symmetric bilinear form A(x, y) is represented by

(5)    A(x, y) = Σ_{i,k} a_ik ξᵢηₖ,   where a_ik = a_ki.

On the other hand, for a self-adjoint transformation A we have, relative to an orthonormal basis,

(6)    (Ax, y) = Σ_{i,k} a_ik ξᵢηₖ,   a_ik = a_ki.

Comparing (5) and (6) we obtain the following result: given a symmetric bilinear form A(x, y), there exists a self-adjoint transformation A such that

    A(x, y) = (Ax, y).

We shall make use of this result in the proof of Theorem 3 of this section.

We shall now show that given a self-adjoint transformation there exists an orthogonal basis relative to which the matrix of the transformation is diagonal. The proof of this statement will be based on the material of para. 1 and is thus independent of the theorem asserting the existence of a root of an algebraic equation. A different proof, which does not depend on the results of para. 1, is given in § 17.

We first prove two lemmas.

LEMMA 1. Every self-adjoint transformation has a one-dimensional invariant subspace.

Proof: According to Theorem 1 of this section, to every real root λ of the characteristic equation there corresponds a one-dimensional invariant subspace and to every complex root λ a two-dimensional invariant subspace. Thus, to prove Lemma 1 we need only show that all the roots of a self-adjoint transformation are real.

Suppose that λ = α + iβ, β ≠ 0. In the proof of Theorem 1 we constructed two vectors x and y such that

    Ax = αx − βy,   Ay = βx + αy.

Then

    (Ax, y) = α(x, y) − β(y, y),   (x, Ay) = β(x, x) + α(x, y).

Subtracting the first equation from the second we get [note that (Ax, y) = (x, Ay)]

    0 = β[(x, x) + (y, y)].

Since (x, x) + (y, y) ≠ 0, it follows that β = 0. Contradiction. Thus all the roots of a self-adjoint transformation are real, and Lemma 1 is proved.

LEMMA 2. Let A be a self-adjoint transformation and e₁ an eigenvector of A. Then the totality R' of vectors orthogonal to e₁ forms an (n − 1)-dimensional invariant subspace.

Proof: It is clear that the totality R' of vectors x, x ∈ R, orthogonal to e₁ forms an (n − 1)-dimensional subspace. We show that R' is invariant under A. Indeed, let x ∈ R', i.e., (x, e₁) = 0. Then

    (Ax, e₁) = (x, Ae₁) = (x, λe₁) = λ(x, e₁) = 0,

i.e., Ax ∈ R'.

THEOREM 2. There exists an orthonormal basis relative to which the matrix of a self-adjoint transformation A is diagonal.

Proof: By Lemma 1, the transformation A has at least one eigenvector e₁. Denote by R' the subspace consisting of vectors orthogonal to e₁; by Lemma 2, R' is an (n − 1)-dimensional subspace invariant under A.

Since R' is invariant under A, it contains (again by Lemma 1) an eigenvector e₂ of A, which is orthogonal to e₁. Proceeding in this manner we obtain n pairwise orthogonal eigenvectors e₁, e₂, ..., eₙ of A. Since Aeᵢ = λᵢeᵢ (i = 1, 2, ..., n), the matrix of A relative to the basis e₁, e₂, ..., eₙ is of the form

    [λ₁  0  ...  0 ]
    [0   λ₂ ...  0 ]
    [..............]
    [0   0  ...  λₙ].

This proves Theorem 2.

3. Reduction of a quadratic form to a sum of squares relative to an orthogonal basis (reduction to principal axes). Let A(x, y) be a symmetric bilinear form on an n-dimensional Euclidean space. We showed earlier that to each symmetric bilinear form A(x, y) there corresponds a linear self-adjoint transformation A such that A(x, y) = (Ax, y). According to Theorem 2 of this section there exists an orthonormal basis e₁, e₂, ..., eₙ consisting of the eigenvectors of the transformation A (i.e., of vectors such that Aeᵢ = λᵢeᵢ). With respect to such a basis

    A(x, y) = (Ax, y)
            = (A(ξ₁e₁ + ξ₂e₂ + ... + ξₙeₙ), η₁e₁ + η₂e₂ + ... + ηₙeₙ)
            = (λ₁ξ₁e₁ + λ₂ξ₂e₂ + ... + λₙξₙeₙ, η₁e₁ + η₂e₂ + ... + ηₙeₙ)
            = λ₁ξ₁η₁ + λ₂ξ₂η₂ + ... + λₙξₙηₙ.

Putting y = x we obtain the following

THEOREM 3. Let A(x, x) be a quadratic form on an n-dimensional Euclidean space. Then there exists an orthonormal basis relative to which the quadratic form can be represented as

    A(x, x) = Σᵢ λᵢξᵢ².

Here the λᵢ are the eigenvalues of the transformation A or, equivalently, the roots of the characteristic equation of the matrix ||a_ik||.
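The reduction to principal axes can be carried out numerically with an eigenvalue routine. The following sketch is illustrative only (it is not from the book, assumes numpy, and the symmetric matrix and the vector are arbitrary examples).

    import numpy as np

    a = np.array([[2.0, 1.0, 0.0],
                  [1.0, 2.0, 0.0],
                  [0.0, 0.0, 5.0]])     # matrix of the quadratic form A(x, x)

    lam, e = np.linalg.eigh(a)          # columns of e form the orthonormal eigenbasis
    x = np.array([1.0, 2.0, 3.0])
    xi = e.T @ x                        # coordinates of x relative to e_1, ..., e_n

    # A(x, x) computed directly and as the sum of squares λ_1 ξ_1² + ... + λ_n ξ_n².
    print(x @ a @ x, np.sum(lam * xi**2))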

For n = 3 the above theorem is a theorem of solid analytic geometry: in this case the equation A(x, x) = 1 is the equation of a central surface of order two. The orthonormal basis discussed in Theorem 3 defines in this case the coordinate system relative to which the surface is in canonical form; the basis vectors e₁, e₂, e₃ are directed along the principal axes of the surface.

4. Simultaneous reduction of a pair of quadratic forms to a sum of squares.

THEOREM 4. Let A(x, x) and B(x, x) be two quadratic forms on an n-dimensional space R, and let B(x, x) be positive definite. Then there exists a basis in R relative to which each form is expressed as a sum of squares.

Proof: Let B(x, y) be the bilinear form corresponding to the quadratic form B(x, x). We define in R an inner product by means of the formula

    (x, y) = B(x, y).

By Theorem 3 of this section there exists an orthonormal basis e₁, e₂, ..., eₙ relative to which the form A(x, x) is expressed as a sum of squares, i.e.,

    A(x, x) = Σᵢ λᵢξᵢ².

Relative to an orthonormal basis an inner product takes the form

    (x, x) = Σᵢ ξᵢ²,   that is,   B(x, x) = Σᵢ ξᵢ².

Thus, relative to the basis e₁, e₂, ..., eₙ each quadratic form is expressed as a sum of squares.

5. Orthogonal transformations.

DEFINITION. A linear transformation A defined on a real n-dimensional Euclidean space is said to be orthogonal if it preserves inner products, i.e., if

(9)    (Ax, Ay) = (x, y)   for all x, y ∈ R.

Putting x = y in (9) we get

(10)    |Ax|² = |x|²,

that is, an orthogonal transformation is length preserving.

EXERCISE. Prove that condition (10) is sufficient for a transformation to be orthogonal.

Since

    cos φ = (x, y) / (|x| · |y|)

and since neither the numerator nor the denominator in the expression above is changed under an orthogonal transformation, it follows that an orthogonal transformation preserves the angle between two vectors.

Let e₁, e₂, ..., eₙ be an orthonormal basis. Since an orthogonal transformation A preserves the angles between vectors and the lengths of vectors, it follows that the vectors Ae₁, Ae₂, ..., Aeₙ likewise form an orthonormal basis, i.e.,

(11)    (Aeᵢ, Aeₖ) = { 1 for i = k,
                       0 for i ≠ k.

Now let ||a_ik|| be the matrix of A relative to the basis e₁, e₂, ..., eₙ. Since the columns of this matrix are the coordinates of the vectors Aeᵢ, conditions (11) can be rewritten as follows:

(12)    Σ_α a_αi a_αk = { 1 for i = k,
                          0 for i ≠ k.

EXERCISE. Show that conditions (11) and, consequently, conditions (12) are sufficient for a transformation to be orthogonal.

Conditions (12) can be written in matrix form. Indeed, the sums Σ_α a_αi a_αk are the elements of the product of the transpose of the matrix of A by the matrix of A. Conditions (12) imply that this product is the unit matrix; in other words, the transpose of the matrix of an orthogonal transformation is its inverse.

Since the determinant of the product of two matrices is equal to the product of the determinants, it follows that the square of the determinant of a matrix of an orthogonal transformation is equal to one, i.e., the determinant of a matrix of an orthogonal transformation is equal to ±1. An orthogonal transformation whose determinant is equal to +1 is called a proper orthogonal transformation, whereas an orthogonal transformation whose determinant is equal to −1 is called improper.

EXERCISE. Show that the product of two proper or two improper orthogonal transformations is a proper orthogonal transformation and the product of a proper by an improper orthogonal transformation is an improper orthogonal transformation.
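A small numerical check of conditions (12) in matrix form, and of the classification by the determinant, may be helpful. The sketch is illustrative only (not from the book, numpy assumed; the rotation and reflection matrices are arbitrary examples).

    import numpy as np

    phi = 0.7
    rotation = np.array([[np.cos(phi), -np.sin(phi)],
                         [np.sin(phi),  np.cos(phi)]])
    reflection = np.array([[1.0,  0.0],
                           [0.0, -1.0]])

    for a in (rotation, reflection):
        print(np.allclose(a.T @ a, np.eye(2)),   # transpose times matrix = unit matrix
              round(np.linalg.det(a)))           # +1: proper, −1: improper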


NOTE: What motivates the division of orthogonal transformations into proper and improper transformations is the fact that any orthogonal transformation which can be obtained by continuous deformation from the identity transformation is necessarily proper. Indeed, let A_t be an orthogonal transformation which depends continuously on the parameter t (this means that the elements of the matrix of the transformation relative to some basis are continuous functions of t) and let A₀ = E. Then the determinant of this transformation is also a continuous function of t. Since a continuous function which assumes the values ±1 only is a constant and since for t = 0 the determinant of A_t is equal to 1, it follows that for every t the determinant of the transformation is equal to 1. Making use of Theorem 5 of this section one can also prove the converse, namely, that every proper orthogonal transformation can be obtained by continuous deformation of the identity transformation.

We now turn to a discussion of orthogonal transformations in one-dimensional and two-dimensional vector spaces. In the sequel we shall show that the study of orthogonal transformations in a space of arbitrary dimension can be reduced to the study of these two simpler cases.

Let e be a vector generating a one-dimensional space and A an orthogonal transformation defined on that space. Then Ae = λe, and since (Ae, Ae) = (e, e), we have λ²(e, e) = (e, e), i.e., λ = ±1. Thus we see that in a one-dimensional vector space there exist two orthogonal transformations only: the transformation Ax = x and the transformation Ax = −x. The first is a proper and the second an improper transformation.

Now consider an orthogonal transformation A on a two-dimensional vector space R. Let e₁, e₂ be an orthonormal basis in R and let

(13)    [α  β]
        [γ  δ]

be the matrix of A relative to that basis.

The orthogonality condition implies that the product of the matrix (13) by its transpose is equal to the unit matrix, i.e., that
(14)
Fa

Ly

)51-1 J

Fa

vl

fit

LINEAR TRANSFORMATIONS

123

Since the determinant of the matrix (13) is equal to one, we have

fi'br --13.1. It follows from (14) and (15) that in this case the matrix of the transformation is

(15)

r
where a2 + ß2 =
1.

Putting x = cos q», ß

sin qi we find that

the matrix of a proper orthogonal transformation on a two dimensional

space relative to an orthogonal basis is of the form

[cos 9)
sin

sin 92-1

cos 9'I

(a rotation of the plane by an angle go). Assume now that A is an improper orthogonal transformation,

that is, that GO ßy =

1.

In this case the characteristic
(a + 6)2

equation of the matrix (13) is A2

1 = O and, thus,

has real roots. This means that the transformation A has an eigenvector e, Ae = /le. Since A is orthogonal it follows that
±e. Furthermore, an orthogonal transformation preserves the angles between vectors and their length. Therefore any vector e, orthogonal to e is transformed by A into a vector orthogonal to Ae ±e, i.e., Ae, +e,. Hence the matrix of A relative to the
Ae

basis e, e, has the form

F±I
L

o

+1j.

Since the determinant of an improper transformation is equal to -- 1, the canonical form of the matrix of an improper orthogonal transformation in two-dimensional space is
HE
L
(

oi
o

Or

1

o +1

01

a reflection in one of the axes). We now find the simplest form of the matrix of an orthogonal

transformation defined on a space of arbitrary dimension.

124

LECTURES ON LINEAR ALGEBRA

Let A be an orthogonal transforma/ion defined on an n-dimensional Euclidean space R. Then there exists an orthonormal basis el, e,, , e of R relative to which the matrix of the transformaTHEOREM 5.

tion is

1
cos
92,

sin

921

sin 921

cos ch.

COS 92k

cos
99,_

sin

92,

where the unspecified entries have value zero.

Proof: According to Theorem 1 of this section R contains a one-or two-dimensional invariant subspace Ru). If there exists a one-dimensional invariant subspace WI) we denote by el a vector
of length one in that space. Otherwise Wu is two dimensional and we choose in it an orthonormal basis e1, e,. Consider A on

In the case when R(') is one-dimensional, A takes the form Ax = x. If Wu is two dimensional A is a proper orthogonal transformation (otherwise R") would contain a one-dimensional invariant subspace) and the matrix of A in Rn) is of the form
rcos
Lsin

sin wi cos (pi

The totality 11 of vectors orthogonal to all the vectors of Rn) forms an invariant subspace.
Indeed, consider the case when Rn) is a two-dimensional space,

say. Let x e ft., i.e.,

(x, y) = 0 for all y ∈ R⁽¹⁾. Since (Ax, Ay) = (x, y), it follows that (Ax, Ay) = 0. As y varies over all of R⁽¹⁾, z = Ay likewise varies over all of R⁽¹⁾. Hence (Ax, z) = 0 for all z ∈ R⁽¹⁾, i.e., Ax ∈ R̃. We reason analogously if R⁽¹⁾ is one-dimensional. If R⁽¹⁾ is of dimension one, R̃ is of dimension n − 1; if R⁽¹⁾ is of dimension two, R̃ is of dimension n − 2. Indeed, in the former case R̃ is the totality of vectors orthogonal to the vector e₁, and in the latter case it is the totality of vectors orthogonal to the vectors e₁ and e₂.

We now find a one-dimensional or two-dimensional invariant subspace of R̃, select a basis in it, etc. In this manner we obtain n pairwise orthogonal vectors of length one which form a basis of R. Relative to this basis the matrix of the transformation is of the form

    [ 1                                                        ]
    [    ⋱                                                     ]
    [       1                                                  ]
    [         −1                                               ]
    [             ⋱                                            ]
    [               −1                                         ]
    [                   cos φ₁  −sin φ₁                        ]
    [                   sin φ₁   cos φ₁                        ]
    [                                     ⋱                    ]
    [                                        cos φₖ  −sin φₖ   ]
    [                                        sin φₖ   cos φₖ   ]

where the ±1 on the principal diagonal correspond to one-dimensional invariant subspaces and the "boxes"

        [cos φᵢ  −sin φᵢ]
        [sin φᵢ   cos φᵢ]

correspond to two-dimensional invariant subspaces. This completes the proof of the theorem.

§ 17. in particular permit us to prove the existence of eigenvalues and eigenvectors without making use of the theorem . Extremal properties of eigenvalues In this section we show that the eigenvalues of a self-adjoint linear transformation defined on an n-dimensional Euclidean space can be obtained by considering a certain minimum problem connected with the corresponding quadratic form (Ax.126 NOTE: LECTURES ON LINEAR ALGEBRA A proper orthogonal transformation which represents a rotation of a two-dimensional plane and which leaves the (n 2)-dimensional subspace orthogonal to that plane fixed is called a simple rotation. Relative to a suitable basis its matrix takes the form 1 1 1 Making use of Theorem 5 one can easily show that every orthogonal transformation can be written as the product of a number of simple rotations and simple reflections. The proof is left to the reader. Relative to a suitable basis its matrix is of the form 1 cos q sin yo sin 9) cos w 1 An improper orthogonal transformation which reverses all vectors of some one-dimensional subspace and leaves all the vectors of the (n 1)dimensional complement fixed is called a simple reflection. This approach win. x).

on the existence of a root of an nth order equation. The extremal properties are also useful in computing eigenvalues.

We shall first consider the case of a real space and then extend our results to the case of a complex space. Let A be a self-adjoint linear transformation. We shall consider the quadratic form (Ax, x) which corresponds to A on the unit sphere, i.e., on the set of vectors x such that (x, x) = 1.

We first prove the following lemma:

LEMMA 1. Let B be a self-adjoint linear transformation on a real space such that the quadratic form (Bx, x) corresponding to B is non-negative, i.e., such that (Bx, x) ≥ 0 for all x. If for some vector x = e

    (Be, e) = 0,

then Be = 0.

Proof: Let x = e + th, where t is an arbitrary number and h an arbitrary vector. We have

    (B(e + th), e + th) = (Be, e) + t(Be, h) + t(Bh, e) + t²(Bh, h) ≥ 0.

Since (Be, e) = 0 and (Bh, e) = (h, Be) = (Be, h), in our case the expression

    2t(Be, h) + t²(Bh, h)

is non-negative for all t. However, if (Be, h) ≠ 0, the function at + bt² with a ≠ 0 changes sign at t = 0. It follows that (Be, h) = 0. Since h was arbitrary, this means that Be = 0. This proves the lemma.

THEOREM 1. Let A be a self-adjoint linear transformation on an n-dimensional real Euclidean space. Then the quadratic form (Ax, x) corresponding to A assumes its minimum λ₁ on the unit sphere. The vector e₁ at which the minimum is assumed is an eigenvector of A, and λ₁ is the corresponding eigenvalue.

Proof: The unit sphere is a closed and bounded set in n-dimensional space. Since (Ax, x) is continuous on that set it must assume its minimum λ₁ at some point e₁. We have

(1)    (Ax, x) ≥ λ₁ = (Ae₁, e₁)   for (x, x) = 1,   where (e₁, e₁) = 1.

Inequality (1) can be rewritten as follows:

(2)    (Ax, x) ≥ λ₁(x, x),   where (x, x) = 1.

This inequality holds for vectors of unit length. Note that if we multiply x by some number α, then both sides of the inequality become multiplied by α². Since any vector can be obtained from a vector of unit length by multiplying it by some number α, it follows that inequality (2) holds for vectors of arbitrary length. We now rewrite (2) in the form

    ((A − λ₁E)x, x) ≥ 0   for all x.

In particular, ((A − λ₁E)e₁, e₁) = 0. This means that the transformation B = A − λ₁E satisfies the conditions of Lemma 1. Hence (A − λ₁E)e₁ = 0, i.e., Ae₁ = λ₁e₁. We have shown that e₁ is an eigenvector of the transformation A corresponding to the eigenvalue λ₁. This proves the theorem.

To find the next eigenvalue of A we consider all vectors of R orthogonal to the eigenvector e₁. As was shown in para. 1 of § 16 (Lemma 2), these vectors form an (n − 1)-dimensional subspace R₁ invariant under A. The required second eigenvalue λ₂ of A is the minimum of (Ax, x) on the unit sphere in R₁. Obviously, λ₂ ≥ λ₁, since the minimum of a function considered on the whole space cannot exceed the minimum of the function on a subspace. The corresponding eigenvector e₂ is the point in R₁ at which the minimum is assumed. We obtain the next eigenvector by solving the same problem in

the (n − 2)-dimensional subspace consisting of vectors orthogonal to both e₁ and e₂, etc. Continuing in this manner we find all the n eigenvalues and the corresponding eigenvectors of A.

2. It is sometimes convenient to determine the second, third, etc., eigenvector of a transformation from an extremum problem without reference to the preceding eigenvectors. Let A be a self-adjoint transformation. Denote by λ₁ ≤ λ₂ ≤ ... ≤ λₙ its eigenvalues and by e₁, e₂, ..., eₙ the corresponding orthonormal eigenvectors. We shall show that if S is the subspace spanned by the first k eigenvectors e₁, e₂, ..., eₖ, then for each x ∈ S the following inequality holds:

    λ₁(x, x) ≤ (Ax, x) ≤ λₖ(x, x).

Indeed, let

    x = ξ₁e₁ + ξ₂e₂ + ... + ξₖeₖ.

Then (x, x) = ξ₁² + ξ₂² + ... + ξₖ² and, since Aeᵢ = λᵢeᵢ, (eᵢ, eₖ) = 0 for i ≠ k and (eₖ, eₖ) = 1, it follows that

    (Ax, x) = (A(ξ₁e₁ + ... + ξₖeₖ), ξ₁e₁ + ... + ξₖeₖ)
            = (λ₁ξ₁e₁ + ... + λₖξₖeₖ, ξ₁e₁ + ... + ξₖeₖ)
            = λ₁ξ₁² + λ₂ξ₂² + ... + λₖξₖ².

Therefore

    (Ax, x) ≤ λₖ(ξ₁² + ξ₂² + ... + ξₖ²) = λₖ(x, x),

and, similarly, (Ax, x) ≥ λ₁(x, x).

Now let Rₖ be a subspace of dimension n − k + 1. In § 7 (Lemma of para. 1) we showed that if the sum of the dimensions of two subspaces of an n-dimensional space is greater than n, then there exists a vector different from zero belonging to both subspaces. Since the sum of the dimensions of Rₖ and S is (n − k + 1) + k = n + 1, it follows that there exists a vector x₀ different from zero common to both Rₖ and S. We can assume that x₀ has unit length, that is, (x₀, x₀) = 1. It follows that

(Ax₀, x₀) ≤ λₖ(x₀, x₀) = λₖ.

We have thus shown that there exists a vector x₀ ∈ Rₖ of unit length such that (Ax₀, x₀) ≤ λₖ. But then the minimum of (Ax, x) for x on the unit sphere in Rₖ must be equal to or less than λₖ. To sum up: if Rₖ is an (n − k + 1)-dimensional subspace and x varies over all vectors in Rₖ for which (x, x) = 1, then

    min (Ax, x) ≤ λₖ,   x ∈ Rₖ, (x, x) = 1.

Note that among all the subspaces of dimension n − k + 1 there exists one for which this minimum is actually equal to λₖ. This is the subspace consisting of all vectors orthogonal to the first k − 1 eigenvectors e₁, e₂, ..., eₖ₋₁: indeed, we showed earlier in this section that the minimum of (Ax, x), (x, x) = 1, taken over all vectors orthogonal to e₁, e₂, ..., eₖ₋₁, is equal to λₖ. We have thus proved the following theorem:

THEOREM. Let A be a self-adjoint transformation and let Rₖ be an (n − k + 1)-dimensional subspace of the space R. Then

    min (Ax, x),   x ∈ Rₖ, (x, x) = 1,

is less than or equal to λₖ, and the subspace Rₖ can be chosen so that this minimum is equal to λₖ. Our theorem can be expressed by the formula

(3)    λₖ = max over Rₖ of [ min (Ax, x), x ∈ Rₖ, (x, x) = 1 ].

In this formula the minimum is taken over all x ∈ Rₖ with (x, x) = 1, and the maximum over all subspaces Rₖ of dimension n − k + 1.

As a consequence of our theorem we have: Let A be a self-adjoint linear transformation and B a positive definite linear transformation. Let λ₁ ≤ λ₂ ≤ ... ≤ λₙ be the eigenvalues of A and μ₁ ≤ μ₂ ≤ ... ≤ μₙ the eigenvalues of A + B. Then λₖ ≤ μₖ. Indeed,

    ((A + B)x, x) = (Ax, x) + (Bx, x) ≥ (Ax, x)   for all x.

Hence for any (n − k + 1)-dimensional subspace Rₖ we have

    min ((A + B)x, x) ≥ min (Ax, x),   x ∈ Rₖ, (x, x) = 1.

It follows that the maximum of the expression on the right side taken over all subspaces Rₖ does not exceed the maximum of the left side. Since, by formula (3), the maximum of the right side is equal to λₖ and the maximum of the left side is equal to μₖ, we have λₖ ≤ μₖ.

We now extend our results to the case of a complex space.

t[(Be.. Then (B (e th). e -r. e)] + t2(Bh. or. It follows from (4) and (5) that (Be. let foy all x. x) corresponding to B be non-negative. e) = 0. i(Bh. h) (Eh. e) = O. (Be. Let B be a self-adjoint transformation on a complex space and let the Hermitian form (Bx. i(Be.LINEAR TRANSFORMATIONS 131 To this end we need only substitute for Lemma I the follovving lemma. h) O. i. e) Since h was arbitrary. This proves the lemma.e. and therefore Be = O. h) (Bh. h) for all t. h) = 0. x) 0 (Be.th) 0. we get. e) = 0. since (Be. then Be = O. It follows that (Bx. . If for some vector e. All the remaining results of this section as well as their proofs can be carried over to complex spaces without change. by putting ih in place of h. LEMMA 2. Proof: Let t be an arbitrary real number and h a vector.

CHAPTER III

THE CANONICAL FORM OF AN ARBITRARY LINEAR TRANSFORMATION

§ 18. The canonical form of a linear transformation

In chapter II we discussed various classes of linear transformations on an n-dimensional vector space which have n linearly independent eigenvectors. We found that relative to the basis consisting of the eigenvectors the matrix of such a transformation had a particularly simple form, namely, the so-called diagonal form.

However, the number of linearly independent eigenvectors of a linear transformation can be less than n. (An example of such a transformation is given in the sequel; cf. also § 10.) Clearly, such a transformation is not diagonable since, as noted above, any basis relative to which the matrix of a transformation is diagonal consists of linearly independent eigenvectors of the transformation. There arises the question of the simplest form of the matrix of such a transformation. In this chapter we shall find for an arbitrary transformation a basis relative to which the matrix of the transformation has a comparatively simple form (the so-called Jordan canonical form). In the case when the number of linearly independent eigenvectors of the transformation is equal to the dimension of the space, the canonical form will coincide with the diagonal form.

We recall that if the characteristic polynomial has n distinct roots, then the transformation has n linearly independent eigenvectors. Hence for the number of linearly independent eigenvectors of a transformation to be less than n it is necessary that the characteristic polynomial have multiple roots. Thus, this case is, in a sense, exceptional.

1. We now formulate the definitive result which we shall prove in § 19.

Let A be an arbitrary linear transformation on a complex n-dimensional space and let A have k (k ≤ n) linearly independent eigenvectors


e. To find out what the elements in each box are it suffices to note how A transforms the vectors of the appropriate set. the matrix of the transformation relative to the basis (1) has k boxes along the main diagonal. Indeed. from the second.O All 0 1 -Al 0 0 0 0 1 0 0 0 0 A. = O. Ae2 = e1 + e._1= c. = 0 and from the remaining equa- cp-14+ 1Cp-1. tions that c..p q. c. We now write down the matrix of the transformation (2).± c3 = Ac2. then it would follow from the last equation that c. has the form (3) . in the next q columns the row indices of possible non zero elements are p + 1. We have Ael = Ale. Ae = Ae ep + Ale. 2. p. . Since the vectors of each set are transformed into linear combinations of vectors of the same set. ._2= = c2= el= O. + A1e. and from the last. it follows that in the first p columns the row indices of possible non-zero elements are 1. Recalling how one constructs the matrix of a transformation relative to a given basis we see that the box corresponding to the set of vectors e1. Substituting this value for A we get from the first equation c2 = 0. therefore._. 0 0 0 . Thus. The elements of the matrix which are outside these boxes are equal to zero. and so on. e2. We first show that A = Al. cAl = Ac. = 0. p + 2. Hence A = A1. if A A1._1.134 LECTURES ON LINEAR ALGEBRA ciAl+ c2rc2A. coincides (to within a multiplicative constant) with the first vector of the corresponding set. This means that the eigenvector is equal to cle and. c.

-. q. Now let P(1) =. It is easy to see that . are square boxes and all othur elements are zero. We show. how to compute a polynomial in the matrix (4). Although a matrix in the canonical form described above seems more complicated than a diagonal matrix. ao + ait + amtm be any polynomial. Then sif2 that is. one can nevertheless perform algebraic operations on it with relative ease. that is. in order to raise the matrix al to some power all one has to do is + raise each one of the boxes to that power. say. it has the form _211 0 0 A. 1 0 0 0 0 0 Here all the elements outside of the boxes are zero.s.CANONICAL FORM OF LINEAR TRANSFORMATION 135 The matrix of A consists of similar boxes of orders p. The matrix (4) has the form = k_ where the a. for instance. 0 1 0 0 21 0 0 221 (4) 0 0221 0 0 0 0 22 0 2k' 0 0 A.

e + 1 where et is the unit matnx of order p and where the matrix f has the form r0 . = et.1) -E 2! P"(À1) + + (tA. where n is the degree of P(t). = ene. . 2! (Al) A1 e)2 P"(11. Substituting for t the matrix sari we get P(di) = P(Mg + (si.. It is now easy to compute P(. . == 0 1 0 [0 if 2-2 - 0 0 0 0 0 0 0 o o o 00 00 and = 0. . ine2=e. ¿e is.1 are most easily computed by observing that fie. Je. . Hence P(di) = P(A1). Hence inei= 0.. A..)" n! P"'' (A1).3. In view of Taylor's formula a polynomial P(t) can be written as P(t)= P(20) (t (t A1)2 )0) -1-v(2. Jae.1) (di n! I)" But sit. say. e)P( (20 -1- (st.).0P-I are of the form 2 [0 0001 0000 0000 fr == JP+. A.1 0 = 0 0 0 I o o 0 0 11 0 0 0 0 We note that the matrices . . .5. -4 0. = 0. J3e1 J3e.(11). .02. First we write the matrix si. in the form st. ey_. .136 LECTURES ON LINEAR ALGEBRA [P(ei1) P(s12) P(s. We now show how to compute P(s1. J3e = Similarly.332e2=0. Pl()105 + P"(20) 2! 2! 52 + Put' (20) n! 2 The powers of the matrix . J'ep = e_. = e.

Petrovsky. Reduction to canonical form In this sect on we prove the following theorem 3: THEOREM 1. . Let A be a linear transformation on a complex n-dimensional space... then to compute P(d) one has to know the . as well as the values of the 1 derivatives at A.e. Lectures on the Theory of Ordinary Differential Equa- tions. .. has order p it suffices to know the value of P(t) and its first p 1 derivatives at the point A. We need the following lemma: LEMMA. In other words. we get P" (Al) 2! P' (A1) 1! PP-'' (AO1) ! P(211) = P(A) I" 1! F'''-'' (A. We prove the theorem by induction. . the first q first s 1 derivatives at A. chapter 6.CANONICAL FORM OF LINEAR TRANSFORMATION 137 Recalling that ifP = JP-1 = P (A1) ' ' = 0. Petrovsky. is the eigenvalue of si. It follows that if the matrix has canonical form (4) with boxes of order p. See I. first p 1 derivatives at A. there exists a basis relative to which A has the form (2) (§ 18).. G. where A. i.. s.)O Thus in order to compute P(d1) where sal. G. we assume that the required basis exists in a space of dimension n and show that such a basis exists in a space of dimension n 1. Proof: Consider the adjoint At of A. Then there exists a basis relative to which the matrix of the linear transformation has canonical form.. A. q. 0O O 2)! ' P(21) value of P(t) at the points t = A.. Let e be an eigenvector of A*. Every linear transformation A on an n-dimensional complex space R has at least one (n 1)-dimensional invariant subspace R'. A2. and the § 19. Ate = We claim that the (n 1)-dimensional subspace R' consisting of 3 The main idea for the proof of this theorem is due to I.

f1.e. Denote this basis by h2. e) = (x. . e2.e. invariant under A._. 2. f2. However. Indeed. by changing the proof slightly 've can show that the Lemma holds for any vector space R. Ax E R'. According to our lemma there exists an n-dimensional subspace R' of R. + 21h. Considered on R'. e) = 0. Applying the transformation A to e we get 4 We assume Itere that R is Euclidean. Ae2 = 11e2. e2. . let x e R'. = h. Aft = Af2 = f. all vectors x for which (x. that an inner product is defined on R. + ¿1e. the transwhere p q+ formation A has relative to this basis the form h Ae. Ah.e. Afq = fq-1 + 12f. . Aev = ev_.h2.. that is. alone. 22f2. i. 2e) = 0. = Ah2 = +2. f2. Let A be a linear transformation on an (n + 1)-dimensional space R. e) = O. .138 LEcruims ON LINEAR ALGEBRA all vectors x orthogonal 4) to e.. f. Then (Ax... that is. hs forms a basis in R. e2. 4112. (x. By the induction assumption we can choose a basis in R' relative to which A is in canonical form. ev. Ate) = (x. h. This proves the invariance of R' under A. i. h2.2f2. We now turn to the proof of Theorem 1.=1el. We now pick a vector e wh ch together with the vectors el. + s = n. el. is invariant under A.. ft.


i. e.f.-121 Xp)ep-i (. so that the linear combination of each set of vectors vanishes.e. 1 etc. In this way the linear combiequal to zero and determine nation of the vectors e1. we can choose xi. The eigenvalue associated with e' is zero (or 2. e.) . Assume this to be feasible. -H 21e2) ' + xpep) ¿lei)) Zp(en-1 (/. e 1. .. We consider first the case when all the eigenvalues are different from zero. 'Y . i. h 1.2 = (al X1A1 Z2)e1 X221 z3)e2 + + X.140 LECTURES ON LINEAR ALGEBRA one eigenvalue..2. (3) vanishes. The coefficients of the other sets of vectors are computed analogously. hs e v. equal to zero and determine x. w.e. We have thus determined e' so that Ae' = O. The sets of the former type can be dealt with as above.- We put the coefficient of e.e. in the (n + 1)-dimensional space R relative to which the transformation is in canonical form. In this case the summands on the right side of (3) are of two types: those corresponding to sets of vectors associated with an eigenvalue different from zero and those associated with an eigenvalue equal to zero. so that the right side of (3) becomes zero. so that the linear combination of the vectors cients xi.. e 2. Consider now the case when some of the eigenvalues of the transformation A on R' are zero. next we put the coefficient of e. if we consider the .. e. A (xi. . Then since the transformation A takes the vectors of each set into a linear combination of vectors of the same set it must be possible to select xl. . The vector e' forms a separate set. These eigenvalues may or may not be all different from zero. transformation A rather than A TE). (0. for such sets we can choose . + = i1e1 + ' . e'. We show how to choose the coeffix. We shall show that in this case we can choose a vector e' so that Ae' = 0. By adding this vector to the basis vectors of R' we obtain a basis . 2.. . in e1. . in (3) vanishes. The terms containing the vectors el. f 1f. . 2. (this can be done since Ai 0). are of the form + + ape.

say. . 0. e. O.. A(zie. Then. three sets of vectors. f. fg. = a. we annihilate all Za= ac2. Therefore the linear combination of the vectors el. Agr Ag2= gi. Ae' = x. es. it becomes necessary to change some of the basis vectors of R'. . .CANONICAL FORM OF LINEAR TRANSFORMATION 141 coefficients so that the appropriate linear combinations of vectors in each set vanish. Thus . e. g. Af2 Ae. e'.f. i. O = e'. e2.e. = f2_. Af.. g2. fq and g. . it follows that el. 111.. g. values are equal to zero. different from zero. forms a separate set and is Y associated with the eigenvalue zero. Let us assume that we are left with. 2. = 0. = Ae'.g. Ae = ep_1. = sc_. y. f2.. . p1f1 + + ß0f0 + 71g1 + xpe) 4- A (Yigi 4- Since Al = 22 = A. hs. f2. we obtain a vector e' such that fl. At. e2. the transformation A is already in canonical form relative to the basis e'. Proceeding in the same manner with the By putting z. vectors except ape.. + A(ktifl+ + Itqfq) + a. appearing on the right side of (4) will be of the form cc1e1 . ' . = 0. We illustrate the procedure by considering the case x. .e arrive at a vector e' such that It might happen that a = = Ae' = In this case we and just as in the first case. ßq..e. The vector e'. = 22 = 23 = O. andfi > q> r. = 0. Assume now that at least one of the coefficients x. el. f1. = Ae'. Ag. e. . in distinction to the previous cases. . Then " Ae' = 1e1 + (4) + yrg. x2e2 + + x2e2 x3e2 sets f. We form a new set of vectors by putting e' = e'._... whose eigenfi. .

If the matrix (. = Aet_7+2 = Gip ep_.142 LECTURES ON LINEAR ALGEBRA e' = e' cc. DEFINITION 1. The matrices sir and . e. We now replace the basis vectors e'. 2 . The results of this section will also imply the (as yet unproved) uniqueness of the canonical form. = 0. e'2. ). and leave the other basis vectors unchanged. y. If the first case. Elementary divisors In this section we shall describe a method for finding the Jordan canonical form of a transformation. ep by the vectors e'1.5:11 =. In this case a separate box of order 1 was added. f.g1. The case when -c coincided with one of the eigenvalues . .. + ßf. + yrgr. we added a new box. let = Then = wsgtir-1 . While constructing the canonical form of A we had to distinguish two cases: The case when the additional eigenvalue r (we assumed t = 0) did not coincide with any of the eigenvalues 2. then a2 is also similar to at._.. where is an arbitrary non-singular matrix are said to be similar. to increase the order = tig y.+1 = e'l = Ae'2 = cc. Then it was necessary. Note that the order of the first box has been increased by one. e1. Indeed.911 is similar to the matrix a2. in general.. 4. + 41.. then just as in of one of the boxes by one. . e2.tr'-isfl. This completes the proof of the theorem. + fg_r.+1 = Aet_.1 .. Relative to the new basis the transformation A is in canonical form. ei. § 20.±.e.

. is similar to Sly Indeed let = 1S114. One such invariant was found in § 10 where we showed that the characteristic polynomial of a matrix se. i.uf and for any matrix similar to S. We denote by D. = i. . and at2 are similar to some matrix d. This will be a complete system of invariants in the sense that if the invariants in question are the same for two matrices then the matrices are similar...e. the determinant of the matrix d At.e. we wish to construct functions of the elements of a matrix which assume the same values for similar matrices. In other words.. 6 We also put The greatest common divisor is determined to within a numerical multiplier. We choose Dk(A) to be a monic polynomial. It is easy to see that if two matrices a.(2) the greatest common divisor of those minors. we get d-= Z'2-1 d2%5..e. We now construct a whole system of invariants which will include the characteristic polynomial. D(1) = is the same for .. is the matrix of transition from this basis to a new basis (§ 9). if the hth order ininors are pairwise coprime we take Di(A) to be I.. we obtain Si2 = i. i. Thus similar matrices represent the same linear trans- formation relative to different bases.2 al 2W 2Z' W2-1a2r. The kth order minors of the matrix sir 24' are certain polynomials in 2.CANONICAL FORM OF LINEAR TRANSFORMATION 143 If we put W-1 r. s4t2 is similar to sit.e.e.2. i. expressions depending on the transformation alone. then V-1. then sit. sdf is similar to Let S be the matrix of a transformation A relative to some basis. If 56. Then r1-1s11r1 Putting W2W1-1 = 46'. Let S be a matrix of order n. In particular. We now wish to obtain invariants of a transformation from its matrix.WW is the matrix which represents A relative to the new basis.

2.144 LECTURES ON LINEAR ALGEBRA Do (A) = 1. 3) for the matrix 0 o 1 ]... D 1(1) is divisible by D. It follows that D(X) is indeed divisible by D_1(2).2da. In particular D(2) is the determinant of the matrix Ae. To prove the converse we apply the same A. This proves that the greatest common divisors of the kth order minors of a . then xe and a'1 are the entries of = i.AS each multiplied by some number. the entries of any row of (si 2e)w. 132(A) = D1(2) LEMMA 1. the definition of D_1(2) implies that all minors of order n 1 are divisible by D . Proof: Let se and sit = W-Isiff be two similar matrices.%c (21 AS) and (I 2e)w are the same. i. (A A0)3. We observe that D_1(1) divides D (2). Indeed. In the sequel we show that all the 13. are linear combinations of the rows of st AC with coefficients from . Find D(2) (k = 1. AS)W If a.(2). If is an arbitrary non-singular matrix then the greatest common divisors of the kth order minors of the matrices AS. Proof: Consider the pair of matrices sí At and (. independent of It follows that every minor of (a 2e)w is the sum of minors of a . Similarly. LEMMA 2. 1.(2) are identical.).2. For similar matrices the polynomials D. etc. If we expand the determinant D(2) by the elements of any row we obtain a sum each of whose summands is a product of an element of the row in question by its cofactor.2 (A).Ae)r .(2) are invariants. Hence every divisor of the kth order minors of alt AS must divide every kth order minor of (st 2e)%'. EXERCISE. By Lemma 1 the greatest common divisor of the kth order minors S ..Ag and (s1 26')%" are the same. Answer: D3(2.e.e.Ae is the same as the corresponding greatest common divisor . o rooA A. reasoning to the pair of matrices (sit AS)W and [(s1 xe)wx-i S . are the entries of st .

Hence the D. Then the greatest common divisor Dk(A) of the kth order minors of the matrix se .(1) = = D1(A) = 1. We shall find it convenient to choose the basis relative to which the matrix of the transformation is in Jordan canonical form.(2) for si and at are identical. (A We observe further that if . 1. An analogous statement holds for the matrices AS) and W-I(S1 . Hence D1(2) =. Theorem 1 tells us that in computing the D.. for one "box" of the canonical form. where at represents the transformation A in some basis.42 are of order n. Let A be a linear transformation. we conclude on the basis of Lemma 2 that THEOREM 1.1. Clearly D(2) = (A A0)n. 1.AS)S = AS. does not depend on the choice of basis. /10)4. then the mth order non-zero .CANONICAL FORM OF LINEAR TRANSFORMATION 145 for (saf Ad)W. If we cross out in (1) the first column and the last row we obtain a matrix sill with ones on the principal diagonal and zeros above it.R is a matrix of the form Q1 0 where . and n. We now compute the polynomials WA) for a given linear trans- formation A.1. If we cross out in sli like numbered rows and columns we find that D . In view of the fact that the matrices which represent a transformation in different bases are similar.te.(2) we may use the matrix which represents A relative to an arbitrarily selected basis. Our task is then to compute the polynomial 1).e.(2) are .. 1. and . We first find the D(2) for an nth order matrix of the form 20 O 1 o 1 o 0 - (1) 0 0 0 0 0 0 1 i. Thus for an individual "box" [matrix (1)] the D.(2) for the matrix si in Jordan canonical form.


minors of the matrix ℬ are of the form

    Δ_m = Δ_{m₁}⁽¹⁾ Δ_{m₂}⁽²⁾,   m₁ + m₂ = m.

Here the Δ_{m₁}⁽¹⁾ are the minors of ℬ₁ of order m₁ and the Δ_{m₂}⁽²⁾ the minors of ℬ₂ of order m₂.⁷ Indeed, if one singles out those of the first n₁ rows which enter into the minor in question and expands it by these rows (using the theorem of Laplace), the result is zero or is of the form Δ_{m₁}⁽¹⁾ Δ_{m₂}⁽²⁾.

⁷ Of course, a non-zero kth order minor of ℬ may have the form Δ = Δ_k⁽²⁾, i.e., it may be entirely made up of elements of ℬ₂. In this case we shall write it formally as Δ = Δ₀⁽¹⁾ Δ_k⁽²⁾, where Δ₀⁽¹⁾ = 1.
We shall now find the polynomials D,(1) for an arbitrary matrix

si which is in Jordan canonical form. We assume that al has p boxes corresponding to the eigenvalue A, q boxes corresponding to the eigenvalue 22, etc. We denote the orders of the boxes corresponding to the eigenvalue Al by n1, n2, , n, (n, n2 > > nv). Let R, denote the ith box in a' = si AC. Then ,42, say, is of the form

A, A
O

1

0
1

O
O

=
O O
O

I
A,

0

0

A_

We first compute D_n(λ), i.e., the determinant of ℬ. This determinant is the product of the determinants of the individual boxes ℬᵢ, i.e.,

    D_n(λ) = (λ − λ₁)^{n₁+n₂+...+n_p} (λ − λ₂)^{m₁+m₂+...+m_q} ⋯ .

now is to compute the degrees of these factors. Specifically, we

compute the degree of A A in D1(2). We observe that any non-zero minor of M = si Ae is of the form

=4

M2)

zlik.),

where t, t2 + + tk = n I and 4) denotes the t,th order minors of the matrix ,2,. Since the sum of the orders of the minors
i.e.,

7 Of course, a non-zero kth order minor of d may have the form 4 k(, it may he entirely made up of elements of a,. In this case we shall 4725 where zlo,22 --- 1. write it formally as z1 =

CANONICAL FORM OF LINEAR TRANSFORMATION

147

1, exactly one of these minors is of order one lower than the order of the corresponding matrix .4,,
, zfik) is n

M,

i.e., it is obtained by crossing out a row and a column in a box of the matrix PI. As we saw (cf. page 145) crossing out an appropriate row and column in a box may yield a minor equal to one. Therefore it is possible to select 47,1 so that some 4 is one and the remaining minors are equal to the determinants oif the appropriate boxes. It follows that in order to obtain a minor of lowest possible degree Al it suffices to cross out a suitable row and column in the in A box of maximal order corresponding to Al. This is the box of order n. Thus the greatest common divisor D 2(A) of minors of order n. A1 raised to the power n2 + n, 1 contains A n Likewise, to obtain a minor 4n-2 of order n 2 with lowest A, it suffices to cross out an appropriate row possible power of A and column in the boxes of order n, and n, corresponding to A,. + n, Thus D2(2) contains A A, to the power n, n, + , D1(A) do not conetc. The polynomials D_(2), D_ 1(2), tain A A, at all. Similar arguments apply in the determination of the degrees of

in WA). A,, 22, We have thus proved the following result.
If the Jordan canonical form of the matrix of a linear transformation A contains p boxes of order n₁, n₂, ..., n_p (n₁ ≥ n₂ ≥ ... ≥ n_p) corresponding to the eigenvalue λ₁, q boxes of order m₁, m₂, ..., m_q (m₁ ≥ m₂ ≥ ... ≥ m_q) corresponding to the eigenvalue λ₂, etc., then

    D_n(λ)     = (λ − λ₁)^{n₁+n₂+n₃+...+n_p} (λ − λ₂)^{m₁+m₂+m₃+...+m_q} ⋯,
    D_{n−1}(λ) = (λ − λ₁)^{n₂+n₃+...+n_p} (λ − λ₂)^{m₂+m₃+...+m_q} ⋯,
    D_{n−2}(λ) = (λ − λ₁)^{n₃+n₄+...+n_p} (λ − λ₂)^{m₃+m₄+...+m_q} ⋯,
    ...............................................................

Beginning with D_{n−p}(λ) the factor (λ − λ₁) is replaced by one; beginning with D_{n−q}(λ) the factor (λ − λ₂) is replaced by one, etc.

In the important special case when there is exactly one box of order n₁ corresponding to the eigenvalue λ₁, exactly one box of order m₁ corresponding to the eigenvalue λ₂, exactly one box of order k₁ corresponding to the eigenvalue λ₃, etc., the D_k(λ) have the following form:


    D_n(λ)     = (λ − λ₁)^{n₁} (λ − λ₂)^{m₁} (λ − λ₃)^{k₁} ⋯,
    D_{n−1}(λ) = 1,
    D_{n−2}(λ) = 1,
    .........................................
The expressions for the D_k(λ) show that in place of the D_k(λ) it is more convenient to consider their ratios

    E_k(λ) = D_k(λ) / D_{k−1}(λ).
The E_k(λ) are called elementary divisors. Thus, if the Jordan canonical form of a matrix 𝒜 contains p boxes of order n₁, n₂, ..., n_p (n₁ ≥ n₂ ≥ ... ≥ n_p) corresponding to the eigenvalue λ₁, q boxes of order m₁, m₂, ..., m_q (m₁ ≥ m₂ ≥ ... ≥ m_q) corresponding to the eigenvalue λ₂, etc., then the elementary divisors E_k(λ) are

    E_n(λ)     = (λ − λ₁)^{n₁} (λ − λ₂)^{m₁} ⋯,
    E_{n−1}(λ) = (λ − λ₁)^{n₂} (λ − λ₂)^{m₂} ⋯,
    E_{n−2}(λ) = (λ − λ₁)^{n₃} (λ − λ₂)^{m₃} ⋯,
    .........................................
Prescribing the elementary divisors E_n(λ), E_{n−1}(λ), ..., determines the Jordan canonical form of the matrix 𝒜 uniquely. The eigenvalues λᵢ are the roots of the equation E_n(λ) = 0. The orders n₁, n₂, ..., n_p of the boxes corresponding to the eigenvalue λ₁ coincide with the powers of (λ − λ₁) in E_n(λ), E_{n−1}(λ), ... .

We can now state necessary and sufficient conditions for the existence of a basis in which the matrix of a linear transformation is diagonal.

THEOREM 1. A necessary and sufficient condition for the existence of a basis in which the matrix of a transformation is diagonal is that the elementary divisors have simple roots only.

Indeed, we saw that the multiplicities of the roots λ₁, λ₂, ⋯ of the elementary divisors determine the orders of the boxes in the Jordan canonical form. Thus the simplicity of the roots of the elementary divisors signifies that all the boxes are of order one, i.e., that the Jordan canonical form of the matrix is diagonal.

THEOREM 2. For two matrices to be similar it is necessary and sufficient that they have the same elementary divisors.


Proof: We showed (Lemma 2) that similar matrices have the same polynomials D_k(λ) and therefore the same elementary divisors E_k(λ) (since the latter are quotients of the D_k(λ)).

Conversely, let two matrices 𝒜 and ℬ have the same elementary divisors. 𝒜 and ℬ are similar to Jordan canonical matrices. Since the elementary divisors of 𝒜 and ℬ are the same, their Jordan canonical forms must also be the same. This means that 𝒜 and ℬ are similar to the same matrix. But this means that 𝒜 and ℬ are similar matrices.
THEOREM 3. The Jordan canonical form of a linear transformation is uniquely determined by the linear transformation.

Proof: The matrices of A relative to different bases are similar. Since similar matrices have the same elementary divisors and these determine uniquely the Jordan canonical form of a matrix, our theorem follows.

We are now in a position to find the Jordan canonical form of a matrix of a linear transformation. For this it suffices to find the elementary divisors of the matrix of the transformation relative to some basis. When these are represented as products of the form (λ - λ₁)^(n₁)(λ - λ₂)^(n₂) ⋯ we have the eigenvalues as well as the orders of the boxes corresponding to each eigenvalue.
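For example, if a matrix of order four has elementary divisors E₄(λ) = (λ - 2)²(λ - 3), E₃(λ) = λ - 2, E₂(λ) = E₁(λ) = 1, then its Jordan canonical form contains, for the eigenvalue 2, one box of order two and one box of order one, and for the eigenvalue 3 a single box of order one.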

§ 21. Polynomial matrices
1. By a polynomial matrix we mean a matrix whose entries are polynomials in some letter λ. By the degree of a polynomial matrix we mean the maximal degree of its entries. It is clear that a polynomial matrix of degree n can be written in the form

    A₀λⁿ + A₁λⁿ⁻¹ + ⋯ + Aₙ,

where the A_i are constant matrices. The matrices A - λE which we considered on a number of occasions are of this type. The results to be derived in this section contain as special cases many of the results obtained in the preceding sections for matrices of the form A - λE. (In this section matrices are denoted by printed Latin capitals.)


Polynomial matrices occur in many areas of mathematics. Thus, for example, in solving a system of first order homogeneous linear differential equations with constant coefficients

(1)    dy_i/dx = Σ_{k=1}^{n} a_{ik} y_k    (i = 1, 2, ⋯, n)

we seek solutions of the form

(2)    y_k = c_k e^(λx),

where λ and the c_k are constants. To determine these constants we substitute the functions (2) in the equations (1) and divide by e^(λx). We are thus led to the following system of linear equations:

    λc_i = Σ_{k=1}^{n} a_{ik} c_k    (i = 1, 2, ⋯, n).

The matrix of this system of equations is A - λE, with A the matrix of coefficients of the system (1). Thus the study of the system of differential equations (1) is closely linked to polynomial matrices of degree one, namely, those of the form A - λE.

Similarly, the study of higher order systems of differential equations leads to polynomial matrices of degree higher than one. Thus the study of the system

    Σ_{k=1}^{n} a_{ik} d²y_k/dx² + Σ_{k=1}^{n} b_{ik} dy_k/dx + Σ_{k=1}^{n} c_{ik} y_k = 0    (i = 1, 2, ⋯, n)

is synonymous with the study of the polynomial matrix Aλ² + Bλ + C, where A = ||a_{ik}||, B = ||b_{ik}||, C = ||c_{ik}||.
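To make the substitution concrete, here is a small sketch (an illustration only; the 2 x 2 system and the use of the Python library sympy are assumptions of the example, not taken from the text). It finds the admissible values of λ and the vectors c from the condition that A - λE be singular, and verifies that each y = c e^(λx) satisfies (1).

    import sympy as sp

    x, lam = sp.symbols('x lam')
    A = sp.Matrix([[0, 1],
                   [-2, -3]])        # the example system y1' = y2, y2' = -2*y1 - 3*y2

    for eigenvalue, multiplicity, vectors in A.eigenvects():
        for c in vectors:
            y = c * sp.exp(eigenvalue * x)      # candidate solution y = c * e^(lam*x)
            # dy/dx - A*y must vanish, i.e. (A - lam*E) c = 0
            assert (y.diff(x) - A * y).applyfunc(sp.simplify) == sp.zeros(2, 1)
            print(eigenvalue, list(c))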

We now consider the problem of the canonical form of polynomial matrices with respect to so-called elementary transformations. The term "elementary" applies to the following classes of transformations:

1. Permutation of two rows or columns.
2. Addition to some row of another row multiplied by some polynomial φ(λ) and, similarly, addition to some column of another column multiplied by some polynomial.
3. Multiplication of some row or column by a non-zero constant.

DEFINITION 1. Two polynomial matrices are called equivalent if it is possible to obtain one from the other by a finite number of elementary transformations.
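As a concrete (and arbitrarily chosen) illustration, the following sketch, again using sympy, applies one transformation of each type to the rows of a 2 x 2 polynomial matrix; the resulting matrix B is then equivalent to A in the sense of Definition 1.

    import sympy as sp

    lam = sp.symbols('lam')
    A = sp.Matrix([[lam, lam + 1],
                   [0,   lam**2]])

    B = A.copy()
    B.row_swap(0, 1)                     # 1. permutation of two rows
    B[0, :] = B[0, :] + lam * B[1, :]    # 2. addition of another row multiplied by a polynomial
    B[1, :] = 3 * B[1, :]                # 3. multiplication of a row by a non-zero constant
    print(B, sp.factor(B.det()))         # det(B) = -3*lam**3, i.e. -3*det(A)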

The inverse of an elementary transformation is again an elementary transformation. This is easily seen for each of the three types


of elementary transformations. Thus, e.g., if the polynomial matrix B(λ) is obtained from the polynomial matrix A(λ) by a permutation of rows, then the inverse permutation takes B(λ) into A(λ). Again, if B(λ) is obtained from A(λ) by adding the ith row multiplied by φ(λ) to the kth row, then A(λ) can be obtained from B(λ) by adding to the kth row of B(λ) the ith row multiplied by -φ(λ).

The above remark implies that if a polynomial matrix K(λ) is equivalent to L(λ), then L(λ) is equivalent to K(λ). Indeed, if L(λ) is the result of applying a sequence of elementary transformations to K(λ), then by applying the inverse transformations in reverse order to L(λ) we obtain K(λ).

If two polynomial matrices K₁(λ) and K₂(λ) are equivalent to a third matrix K(λ), then they must be equivalent to each other. Indeed, by applying to K₁(λ) first the transformations which take it into K(λ) and then the elementary transformations which take K(λ) into K₂(λ), we will have taken K₁(λ) into K₂(λ). Thus K₁(λ) and K₂(λ) are indeed equivalent.

The main result of para. 1 of this section asserts the possibility of

diagonalizing a polynomial matrix by means of elementary transformations. We precede the proof of this result with the following lemma:

LEMMA. If the element a₁₁(λ) of a polynomial matrix A(λ) is not zero and if not all the elements a_{ik}(λ) of A(λ) are divisible by a₁₁(λ), then it is possible to find a polynomial matrix B(λ) equivalent to A(λ) and such that b₁₁(λ) is also different from zero and its degree is less than that of a₁₁(λ).
Proof: Assume that the element of A(λ) which is not divisible by a₁₁(λ) is in the first row. Thus let a_{1k}(λ) not be divisible by a₁₁(λ). Then a_{1k}(λ) is of the form

    a_{1k}(λ) = a₁₁(λ)φ(λ) + b(λ),

where b(λ) is not identically zero and is of degree less than a₁₁(λ). Multiplying the first column by φ(λ) and subtracting the result from the kth column, we obtain a matrix with b(λ) in place of a_{1k}(λ), where the degree of b(λ) is less than that of a₁₁(λ). Permuting the first and kth columns of the new matrix puts b(λ) in the upper left corner and results in a matrix with the desired properties. We can proceed in

1(2) is replaced by zero and a. Thus the first row now contains an element not divisible by a(A) and this is the case dealt with before. Repeating this procedure a finite number of times we obtain a matrix B (A) all of whose elements are divisible by bll(A).(A)(1 T(2)) + (42*(2). we can replace our matrix with an equivalent one in which the element in the upper left corner is of lower degree than a11(A) and still different from zero. then. we can.(2) is replaced by a' . Since b(A).) is divisible by a11(2). nth element of the first column can be replaced with zero.(2) be an element not divisible by an(A). Now let all the elements of the first rovy and column be divisible by a1(2) and let a. by subtracting from the second. it must be of the form a1(A) = (2)a1(2).152 LECTURES ON LINEAR ALGEBRA an analogous manner if the element not divisible by a11(2) is in the first column.(2) = ai. the second. then . Otherwise suitable permuta- tion of rows and columns puts a non-zero element in place of au(A). If not all the elements of our matrix are divisible by ail (A) .1(2). in view of our lemma. third. We may assume that a11(2) O. If all the elements of a polynomial matrix B (A) are divisible by some polynomial E (A). In the sequel we shall make use of the following observation. then all the entries of a matrix equivalent to B (A) are again divisible by E (A). We will reduce this case to the one just considered. Dividing the first row by the leading coefficient of b11(2) replaces b11(2) with a monic polynomial E1(2) but does not affect the zeros in that row. If we subtract from the ith row the first row multiplied by 92(2). This completes the proof of our lemma. . third. Since a11(2. We now add the ith row to the first row.. b(A) are divisible by b11(2).) + a' i. third. This leaves a11(2) unchanged and replaces a(2) with a(2. etc. The new matrix inherits from B (A) the property that all its entries are divisible by b .(2) 92(2)a(2) which again is not divisible by an (2) (this because we assumed that a(2) is divisible by an(A)). Similarly. columns suitable multiples of the first column replace the second. nth element of the first row with zero. . We are now in a position to reduce a polynomial matrix to diagonal form.

Then c22(A) is replaced by a monic polynomial E. (A) .. This form of a polynomial matrix is called its canonical diagonal form.) all of whose elements are divisible by E1(2). (A) and the other c().(2). of course.(2) = E. Every polynomial matrix can be reduced by elemen- tary transformations to time diagonal form E1(2) O E2(A) 0 O (4) 0 E3(2) 0 0 E(2)_ Here lije diagonal elements Ek(A) are monic polynomials and El (X) divides E2(2). (A).CANONICAL FORM OF LINEAR TRANSFORMATION 153 We now have a matrix of the form (2) 0 0 (722(1) c23(2) c33(2) c2(A) (3) O c32(2) c.. an elementary transformation of the matrix of the c. Repetition of this process obviously leads to a diagonal matrix.vo rows and columns are zero and whose first two diagonal elements are monic polynomials E. E. happen that Er+. with E2(A) a multiple of E. (A) divides E3(2. E. etc..) c nn(2)_ Oc2(A) c3(2. This proves THEOREM 1. It may. can be viewed as an elementary transformation of the larger matrix.(. Thus we obtain a matrix whose "off-diagonal" elements in the first b. REMARK: We have brought A(A) to a diagonal form in which every diagonal element is divisible by its predecessor.). If we dispense with the latter requirement the process of diagonalization can be considerably simplified.+2(2) = = for some value of Y. of order n 1 the same proceWe can apply to the matrix dure which we just applied to the matrix of order n. Since the entries of the larger matrix other than E1(2) are zeros. .) in the first row and first column are replaced with zeros.

In the case of elementary transformations of type 1 which permute rows or columns this is obvious.(2) = 1. As can be seen from the proof of the lemma this requires far fewer elementary transformations than reduction to canonical diagonal form. Let there be given an arb trary polynomial matrix. Since given matrix. it is convenient to put Do(A) D. we take D. In this paragraph we prove that the canonical diagonal form of a given matrix is uniquely determined. Reduce the polynomial matrix 21 L 0 _2j O 0 ' 21 to canonical diagonal form.. EXERCISE..). Once the off-diagonal elements of the first row and first column are all zero we repeat the process until we reduce the matrix to diagonal form. the diagonal form of a polynomial matrix is not uniquely determined. On the other hand we will see in the next section that the canonical diagonal form of a polynomial matrix is uniquely determined. we take its leading coefficient to be one.(2.154 LECTURES ON LINEAR ALGEBRA Indeed.e. 2. that equivalent matrices have the same polynomials D. In this way the matrix can be reduced to various diagonal forms. Let D. or change . i. To this end we shall construct a system of polynomials connected with the given polynomial matrix which are invariant under elementary transformations and which determine the canonical diagonal form [01 (A completely. since such transformations either do not affect a particular kth order minor at all.e. to replace the off-diagonal elements of the first row and column with zeros it is sufficient that these elements (and not all the elements of the matrix) be divisible by a(2).(1.) denote the greatest common divisor of all kth order minors of the 1.(2) is determined to within a multiplicative constant. (A) are invariant under elementary transformations. As before. i. In particular. A nswer : 412 A2)]. if the greatest common divisor of the kth order minors is a constant. We shall prove that the polynomials D.

t)E. Since all the polynomials Ek(A) are divisible by E1(2) and all polynomials other than E1(2) are divisible by E2 (A). Thus in this case. Since all E. Likewise.(.(A)E. elementary transformations of type 3 do not change D.(2) since under such transformations the minors are at most multiplied by a constant.(A)E. In all these cases the greatest common divisor of all kth order minors remains unchanged. that is.(A). Now consider elementary transformations of type 2.(A). the product Ei(A)Ei(A)(i < j) is always divisible by the minor E. minors made up of like numbered rows and columns.(2) (i < j < k) is divisible by E1(A)E2(A)E3(2) and so Da(A) = E. the product E . If all kth order minors and. then we put 13.)E. consider addition of the jth column multiplied by T(A) to the ith column. Hence D2(A) = E. etc.).(À)E.(2. Since E2(2) is divisible by E1(2)...(A) = Dk±i(A) = D(2) = O. If some particular kth order minor contains none of these columns or if it contains both of them it is not affected by the transformation in question.(1. If it contains the ith column but not the kth column we can write it as a combination of minors each of which appears in the original matrix. it follows that the greatest common divisor D1(A) of all minors of order one is Ei(A). consequently. too. the minor .)E. We observe that equality of the /k(A) for all equivalent matrices implies that equivalent matrices have the same rank.) E2k(2). E3(2) is divisible by E2(2).CANONICAL FORM OF LINEAR TRANSFORMATION 155 its sign or replace it with another kth order minor. the greatest common divisor of the kth order minors remains unchanged.2(2.(4 other than E1(11) and E2(A) are divisible by E3(2. all minors of order higher than k are zero. We compute the polynomials D.(A). Specifically.(2) for a matrix in canonical form [Ei(i) O 0 I E2(2) (5) E(2) We observe that in the case of a diagonal matrix the only nonzero minors are the principal minors. These minors are of the form (2)E.

In § 20 we defined the elementary divisors of matrices of the form A THEOREM 2. Hence if the matrix A(A) is equivalent to a diagonal matrix (5). The polynomials Ek(2) are called elementary divisors.+2(A) = then = En(2) = 0. then the elements of the canonical D .(2) = D(2) = 0. Since in the case of the matrix (5) we found that D.156 LECTURES ON LINEAR ALGEBRA Thus for the matrix (4) (6) D k(2) = E1(A)E2(A) ' Eh(A) (k = 1. if Dr±i(A) = = D(2) = 0 we must put E. r).1(1) = Ek.(2) = E1(2) Ek(2) (k 1.(2) Ek(2) Dk-1(2) Here. . n). 2. the theorem follows. If D k(2) (k = 1. = E+(A) = = E (2) = O. COROLLARY. 2. .1(2) = E. 2. The canonical diagonal form of a polynomial matrix A (A) is uniquely determined by this matrix. Proof: We showed that the polynomials Dk(2) are invariant under elementary transformations.±1(2) 2E. r n) and that Dr+1(2) = D+2(A) = = D(2) = 0. = = E(2) = O. r) is the greatest common divisor of all kth order minors of A (A) and D. r. then both have the same Dk(A). D1(A) = Dr+2(2) = = Da(A) = 0. .. if beginning vvith some value of r Er. Thus the diagonal entries of a polynomial matrix in canonical diagonal form (5) are given by the quotients D.(2) D k-1 (A) diagonal form (5) are defined by the formulas (k 1. . Clearly. 2. A necessary and sufficient condition for two polyno- .

CANONICAL FORM OF LINEAR TRANSFORMATIoN 157 mial matrices A (A) and E(A) to be equivalent is that the polyno Di(A). then there exist invertible matrices P(A) and Q(A) such that (7) holds. Proof: We first show that if A (A) and B(2) are equivalent. D2(2). so that Da(A) = 1. Indeed. If det P (A) is a constant other than zero. . the elements of the inverse matrix are. Indeed. . apart from sign. the determinant of an invertible matrix is a non-zero constant. It follows that all the elementary divisors E(2) of an invertible matrix are equal to one and the canonical diagonal form of such a matrix is therefore the unit matrix. Then det P (A) det (A) = 1 and a product of two polynomials equals one only if the polynomials in question are if P (A) non-zero constants. Since D(2) is divisible by WM D. All invertible matrices are equivalent to the unit matrix. namely. Indeed. Thus let there be given a polynomial matrix A(2) . We illustrate this for all three types of elementary transforma- tions. n). als Indeed. let [P (2)1-1. Conversely. Two polynomial matrices A (A) and B(2) are equivalent if and only if there exist invertible polynomial matrices P(2) and Q(A) such that. (7) A(A) = P (2)B (2)Q (2).(2) are the same for A(A) and B (A). To this end we observe that every elementary transformation of a polynomial matrix A(2) can be realized by multiplying A(2) on the right or on the left by a suitable invertible polynomial matrix. 2. A polynomial matrix P(2) is said to be invertible if the matrix [P(2)]-1 is also a polynomial matrix. by the matrix of the elementary transformation in question. THEOREM 3. We have thus shown that a polynomial matrix is invertible if and only if its determinant is a non-zero constant. if the polynomials D. then P (A) is invertible. then det P(2) = const O.(2) = 1 (k = 1. is invertible.13(2) be the same for both matrices. 3.= Pi(A). then both of these matrices are equivalent to the same canonical diagonal matrix and are therefore equivalent (to one another). In our case these quotients would be polynomials and [P (2)J-1 would be a polynomial matrix. . the (n 1)st order minors divided by det P(2).

we must multiply it on the right (left) by the matrix 0 1 1 0 0 1 0 0 0 (8) 0 0 0 obtained by permuting the first two columns (or.(2) a2(2) ann(2) To permute the first two columns (rows) of this matrix. Finally.158 LECTURES ON LINEAR ALGEBRA a12(2) a22(2) a2(11. To multiply the second column (row) of the matrix A (A) by some number a we must multiply it on the right (left) by the matrix 10 0 0 a 0 0 0 1 0 (8) 0 0 0 1_ obtained from the unit matrix by multiplying its second column (or. what amounts to the same thing.) ' ain(2) A(A) an(2) Pan(2) a. row) by a. rows) of the unit matrix. to add to the first column of A (A) the second column multiplied by q(A) we must multiply A(A) on the right by the matrix 0 0 1 0 0 T(2) (10) 1 0 1 0 0 0 0 1 0 0 obtained from the unit matrix by just such a process. Likewise to add to the first row of A(A) the second row multiplied by 9/(A) we must multiply A(A) on the left by the matrix (T(2) 0 1 0 0 1 0 0 0 1 (11) 0 0 0 0 0 . what amounts to the same thing.

which is what we wished to prove. it must be possible to obtain A(A) by applying a sequence of elementary transformations to B (A).CANONICAL FORM OF LINEAR TRANSFORMATION 159 obtained from the unit matrix by just such an elementary transformation. where P (A) and 0(A) are invertible matrices. Since the determinant of a product of matrices equals the product of the determinants. every invertible matrix Q (A) is equivalent to the unit matrix and can therefore be written in the form Q(2) = 131(2)EP2(2) where 132(2) and P2(A) are products of matrices of elementary transformations. As we see the matrices of elementary transformations are obtained by applying an elementary transformation to E. Computation of the determinants of the matrices (8) through (11) shows that they are all non-zero constants and the matrices are therefore invertible. . Indeed. Indeed. Since the product of invertible matrices is an invertible matrix. But this means that Q (A) = Pi (A)P. it follows that the product of matrices of elementary transformations is an invertible matrix. in view of our observation. It follows that every invertible matrix is the product of matrices of elementary transformations. let A(A) = P(A)B(A)Q(A). A(A) can be obtained from B (A) by multiplying the latter by some sequence of invertible polynomial matrices on the left and by some sequence of invertible polynomial matrices on the right. the first part of our theorem is proved. Hence A(A) is equivalent to B (A). A(A) is obtained from B(1) by applying to the latter a sequence of elementary transformations. Since we assumed that A (A) and B (A) are equivalent. Consequently. But then. Every elementary transformation can be effected by multiplying B(A) by an invertible polynomial matrix.(2) is itself a product of matrices of elementary transformations. This observation can be used to prove the second half of our theorem. To effect an elementary transformation of the columns of a polynomial matrix A(A) 've must multi- ply it by the matrix of the transformation on the right and to effect an elementary transformation of the rows of A (A) we must multiply it by the appropriate matrix on the left.

there exist matrices 5(2) and R (R constant) such that P (A) = (A AE)S(2) + R. with det A1 O is equivalent to a matrix of the form A ¿E.A.2E) and if we denote A. = A. A constant.e. and A 2E. independent of § 19. if there exists a non-singular constant matrix C such that B C-1AC. + 2A.2A1 = A.9 In this paragraph we shall study polynomial matrices of the form A AE. . i. by A we have A. a new proof of the fact that every matrix is similar to a matrix in Jordan canonical form. w It is easy to see that if A and B are similar. + ¿A.160 LECTURES ON LINEAR ALGEBRA 4. 2E)C.. j. then B 2E = C-1(A Since a non-singular constant matrix is a special case of an invertible polynomial matrix. i.. polynomial matrices A AE and B if B C-i AC. This will yield. -1A.¿E) which implies (Theorem 3) the equivalence of A. Indeed. Later we show the converse of this result. -. J AA. Theorem 3 implies the equivalence of A AE and B 2E. of the fact that every matrix can be reduced to Jordan canonical form. Every polynomial matrix P(2) = P02" + 1312"-1 + + P can be ditiided on the left by a matrix of the form A AE (A any constant matrix). The process of division involved in the proof of the lemma differs from ordinary division only in that our multiplication is noncommutative. among others.e.-. This paragraph may be omitted since it contains an alternate proof. The main problem solved here is that of the equivalence of polynomial matrices A 2E and B AE of degree one. (A . then the AE are equivalent. Indeed. in this case A. to Every polynomial matrix A. that the equivalence of the polynomial matrices A AE and B AE implies the similarity of the matrices A and B. X ( A. namely. We begin by proving the following lemma: LEMMA.

then P (A) = (A or putting S (2) ( 2E) (P0An-1 P'02"-2 + ) R. It is easy to see that the polynomial matrix P(A) + (A AE)P02"-1 1. If R denotes the constant matrix just obtained. The polynomial matrices A AE and B AE are equivalent if and only if the matrices A and B are similar. (A AE)P'0An-2 is of degree not higher than n obtain a polynomial matrix P(2) + (A Continuing this process we P'02"-2 2E) (P02"-1 + .). THEOREM 4. It remains to prove necessity.e. where the P. Proof: The sufficiency part of the proof was given in the beginning of this paragraph. P02" P(2) = (A P'0211-2 + . just as in the ordinary theorem of Bezout.13102"-I P'12"-2 + + P'_. This means that we must show that the equivalence of A 2E and B AE implies the similarity of A and B.-=. then the polynomial matrix P(A) + (A AE)P02"-1 + 2...(2) (A AE) We note that in our case. By Theorem 3 there exist invertible polynomial matrices P(2) and Q(A) such that . can claim that R = R. A similar proof holds for the possibility of division on the right..e.CANONICAL FORM OF LINEAR TRANSFORMATION 161 Let P(A) = P02" P. = P(A). there exist matrices S1(A) and R1 such that P(2) -= S. i. are constant matrices.An-i. independent of X. i. This proves our lemma. is of degree not higher than If P(2) + (A n AE)P.) of degree not higher than zero. AE)S(2) + R.

To this end we divide P(2) on the left by B B AE and Q(2) by AE on the right. If we insert these expressions for P(2) and Q(2) in the formula (12) and carry out the indicated multiplications we obtain B AE = (B +(B 2E) 2E)P1 (2) (A 2E)121(2)(B 2E)Q0 + Po(A 2E)Q1(2)(B 2E)P1 (2) (A + 130(A 2E) 2E)Q0. 2E)Q1(2)(B + Po(A then we get B AE Po(A 2E)Q0 -= K(2). (2) (A 2E)Q1 (2) (B 2E) 2E)Q0 2E)P1(2)(A 2E). the first two summands in K(2) can be written as follows: Since Q1(2)(B + (B 2E)P1(2) (A (B 2E)Q0 AE)1J1 (2) (A = (B 2E) 2E)Q1 (2) (B 2E)P1 (2) (A 2E)Q (2). Q(2) -. i. (2) (A AE)Q (2) (B (A P (A) (A 2E) 2E)Q1(2)(B 2E).e.162 LECTURES ON LINEAR ALGEBRA (12) B 2E = P(2)(A 2E)Q(2). Using these relations we can rewrite K (2) in the following manner . We shall first show that 11 (A) and Q(2) in (12) may be replaced by constant matrices. 2E)Q-/(2). 2E) + Q0 = Q(2). where Po and Q0 are constant matrices. if we put K (2) = (B (B AE)P. 2E)P1(2)(A 2E)Q1(2)(B But in view of (12) AE)Q (A) P(A)(A 2E) = (B P-1(2)(B 2E). (2) + P0. If we transfer the last summand on the right side of the above equation to its left side and denote the sum of the remaining terms by K (2). We now add and subtract from the third summand in K(2) the expression (B 2E)P1 (A) (A 2E)Q1(2)(B 2E) and find K(2) = (B AE)P.Q1(2)(B 2E) + Qo. Then P(2) = (B AE)P..

. 21E)Qi(2)1(B P. are constant matrices.e..e. We have thus found that (17) B 2E =. that A and B are similar. Then it is easy to see that K (2) is of degree m + 2 and since ni 0.) in (12) with constant matrices. . Assume that this polynomial is not zero and is of degree m. and Qo are non-singular and that Po = Qo-1. we may indeed replace P(2) and Q(2. But (15) implies that K (2) is at most of degree one. (2)(A the expression in square brackets is a polynomial in 2. i. and with it K (2). We shall prove this polynomial to be zero. Since equivalence of the matrices A if and only if the matrices A ¿E and B ¿E is synonymous with identity of their elementary divisors it follows from the theorem just proved that two matrices A and B are similar ¿E and B a have the same elementary divisors. We now show that K (1) = O. Hence the expression in the square brackets. Equating the free terms we find that B = PoAQ0 i. We now show that every matriz A is similar to a matrix in Jordan canonical form. This completes the proof of our theorem. but then B is similar to A. is zero. Equating coefficients of 2 in (17) we see that PoQo E. Using these we construct as in § 20 a matrix B in Jordan canonical form. To this end we consider the matrix A ¿E and find its elementary divisors.Po(A 2E)Q0. K (2) is at least of degree two. Of course. can be deduced directly from §§ 19 and 20. Since P(2) and Q (2) are invertible. the contents of this paragraph 2E has the same elementary ¿E. where Po and Q.CANONICAL FORM OF LINEAR TRANSFORMATION 163 K (A) = (B 2E) [P1(2)P-1(2) 12-'(2)121(A) ¿E). which shows that the matrices P. B divisors as A As was indicated on page 160 (footnote) this paragraph gives another proof of the fact that every matrix is similar to a matrix in Jordan canonical form.

CHAPTER IV

Introduction to Tensors
§ 22. The dual space
1. Definition of the dual space. Let R be a vector space. Together with R one frequently considers another space called the dual space which is closely connected with R. The starting point for the definition of a dual space is the notion of a linear function introduced in para. 1, § 4.

We recall that a function f(x), x ∈ R, is called linear if it satisfies the following conditions:

    f(x + y) = f(x) + f(y),
    f(λx) = λf(x).

Let e₁, e₂, ⋯, e_n be a basis in an n-dimensional space R. If

    x = ξ¹e₁ + ξ²e₂ + ⋯ + ξⁿe_n

is a vector in R and f is a linear function on R, then (cf. § 4) we can write

(1)    f(x) = f(ξ¹e₁ + ξ²e₂ + ⋯ + ξⁿe_n) = a₁ξ¹ + a₂ξ² + ⋯ + a_nξⁿ,

where the coefficients a₁, a₂, ⋯, a_n which determine the linear function are given by

(2)    a₁ = f(e₁),  a₂ = f(e₂),  ⋯,  a_n = f(e_n).

It is clear from (1) that given a basis e₁, e₂, ⋯, e_n every n-tuple a₁, a₂, ⋯, a_n determines a unique linear function.

Let f and g be linear functions. By the sum h of f and g we mean the function which associates with a vector x the number f(x) + g(x). By the product of f by a number α we mean the function which associates with a vector x the number αf(x).

Obviously the sum of two linear functions and the product of a function by a number are again linear functions. Also, if f is

determined by the numbers a₁, a₂, ⋯, a_n and g by the numbers b₁, b₂, ⋯, b_n, then f + g is determined by the numbers a₁ + b₁, a₂ + b₂, ⋯, a_n + b_n and αf by the numbers αa₁, αa₂, ⋯, αa_n. Thus the totality of linear functions on R forms a vector space.

DEFINITION 1. Let R be an n-dimensional vector space. By the dual space R̃ of R we mean the vector space whose elements are linear functions defined on R. Addition and scalar multiplication in R̃ follow the rules of addition and scalar multiplication for linear functions.

In view of the fact that relative to a given basis e₁, e₂, ⋯, e_n in R every linear function f is uniquely determined by an n-tuple a₁, a₂, ⋯, a_n and that this correspondence preserves sums and products (of vectors by scalars), it follows that R̃ is isomorphic to the space of n-tuples of numbers. One consequence of this fact is that the dual space R̃ of the n-dimensional space R is likewise n-dimensional.
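A small numerical illustration of this correspondence (the numbers are arbitrary, and the use of Python with numpy is an assumption of the example): relative to a fixed basis a linear function is just its n-tuple of coefficients, and sums and scalar multiples of linear functions go over into the same operations on the n-tuples.

    import numpy as np

    a = np.array([1.0, 0.0, 2.0])        # the triple a1, a2, a3 determining f
    b = np.array([0.0, 3.0, -1.0])       # the triple determining g

    def f(x): return a @ x               # f(x) = a1*x^1 + a2*x^2 + a3*x^3
    def g(x): return b @ x

    x = np.array([2.0, 1.0, 1.0])        # coordinates of a vector x
    assert f(x) + g(x) == (a + b) @ x    # f + g is determined by a + b
    assert 5 * f(x) == (5 * a) @ x       # alpha*f is determined by alpha*a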

The vectors in R are said to be contravariant, those in R̃, covariant. In the sequel x, y, ⋯ will denote elements of R and f, g, ⋯ elements of R̃.

2. Dual bases. In the sequel we shall denote the value of a linear function f at a point x by (f, x). Thus with every pair f ∈ R̃ and x ∈ R there is associated a number (f, x) and the following relations hold:

    1. (f, x₁ + x₂) = (f, x₁) + (f, x₂),
    2. (f, λx) = λ(f, x),
    3. (λf, x) = λ(f, x),
    4. (f₁ + f₂, x) = (f₁, x) + (f₂, x).

The first two of these relations stand for f(x₁ + x₂) = f(x₁) + f(x₂) and f(λx) = λf(x) and so express the linearity of f. The third defines the product of a linear function by a number and the fourth, the sum of two linear functions. The form of the relations 1 through 4 is like that of Axioms 2 and 3 for an inner product (§ 2). However, an inner product is a number associated with a pair of vectors from the same Euclidean space whereas (f, x) is a number associated with a pair of vectors belonging to two different vector spaces R and R̃.


Two vectors x ∈ R and f ∈ R̃ are said to be orthogonal if

    (f, x) = 0.

In the case of a single space R orthogonality is defined for Euclidean spaces only. If R is an arbitrary vector space we can still speak of elements of R being orthogonal to elements of R̃.

DEFINITION 2. Let e₁, e₂, ⋯, e_n be a basis in R and f¹, f², ⋯, fⁿ a basis in R̃. The two bases are said to be dual if

(3)    (fⁱ, e_k) = 1 when i = k, 0 when i ≠ k    (i, k = 1, 2, ⋯, n).

In terms of the symbol δ_kⁱ, defined by

    δ_kⁱ = 1 when i = k, 0 when i ≠ k    (i, k = 1, 2, ⋯, n),

condition (3) can be rewritten as

    (fⁱ, e_k) = δ_kⁱ.

If e₁, e₂, ⋯, e_n is a basis in R, then the numbers (f, e_k) = f(e_k) are the numbers a_k which determine the linear function f ∈ R̃ (cf. formula (2)). This remark implies that

if e₁, e₂, ⋯, e_n is a basis in R, then there exists a unique basis f¹, f², ⋯, fⁿ in R̃ dual to e₁, e₂, ⋯, e_n.

The proof is immediate: The equations

    (f¹, e₁) = 1,  (f¹, e₂) = 0,  ⋯,  (f¹, e_n) = 0

define a unique vector (linear function) f¹ ∈ R̃. The equations

    (f², e₁) = 0,  (f², e₂) = 1,  ⋯,  (f², e_n) = 0

define a unique vector (linear function) f² ∈ R̃, etc. The vectors f¹, f², ⋯, fⁿ are linearly independent since the corresponding n-tuples of numbers are linearly independent. Thus f¹, f², ⋯, fⁿ constitute a unique basis of R̃ dual to the basis e₁, e₂, ⋯, e_n of R.
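In coordinates the dual basis is easy to produce: if a linear function is represented by the row of its values on the basis vectors, and the basis vectors e₁, ⋯, e_n are taken as the columns of a matrix E, then the rows of E⁻¹ represent f¹, ⋯, fⁿ, since their products with the columns of E give exactly δ_kⁱ. A short numerical sketch (an illustration of ours, with an arbitrarily chosen basis of R³, using Python with numpy):

    import numpy as np

    E = np.array([[1.0, 1.0, 0.0],     # columns are the basis vectors e1, e2, e3
                  [0.0, 1.0, 1.0],
                  [0.0, 0.0, 1.0]])

    F = np.linalg.inv(E)               # row i represents the linear function f^i

    print(np.round(F @ E, 10))         # (f^i, e_k) = delta^i_k: the identity matrix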

In the sequel we shall follow a familiar convention of tensor analysis according to which one leaves out summation signs and sums over any index which appears both as a superscript and as a subscript. Thus ξⁱη_i stands for ξ¹η₁ + ξ²η₂ + ⋯ + ξⁿη_n.

Given dual bases e_i and fᵏ one can easily express the coordinates of any vector. Thus, if x ∈ R and

    x = ξ¹e₁ + ξ²e₂ + ⋯ + ξⁿe_n

. . en and Th. /2. . E2. where f is the basis dual to the basis e. We shall express the number (f.e). e. e is a basis in R and f'. NOTE. We wish . andf. x) = a.. . x = El ei Then 52e2 + + e. . Thus let . p. can be computed from the formulas ek (fk. are the coordinates off E R relative to the basis in. ek)ntek To repeat: If el. e1)61k Ek. Similarly. . its dual basis in R then (if x) = niE' + 172E2 + + nnen. e. . (4) respectively (1. h in R and R where $1. x). en and fi. = (f. . vet) ei(fk. are the coordinates of x c R relative to the basis e1. . where a/c. For arbitrary bases el. the coordinates Ek of a vector x in the basis e. . = (fi. We now show that it is possible to interchange the roles of R and R without affecting the theory developed so far. en and P.1cn1ek ++Thifn. rke. 3. . e2. Interchangeability of R and R. ek) and f = ntf' + n2f2 6. R was defined as the totality of linear functions on R. respectively. p.INTRODUCTION TO TENSORS 167 then (fk. e2. f2. x) (fk. e2. x) in terms of the coordinates of the vectors f and x with respect to the bases e1. n . . e2. Hence. if fe k and f= nkfki then Now let e1. fn. e2. be dual bases. (f. f2. e2. (fi. x) = (nip.i.

. If the coordinates off relative to the basis dual by Jr'. en in R and denote its f". and (f. then we can write q)(f) = (tin. . fn of e. Such a definition runs as follows: a pair of dual spaces R and R is a pair of n-dimensional vector spaces and an operation (f. e R and permits us to view R as the space of linear functions on R thus placing the tvvo spaces on the same footing. e2. (f. x) 0 for all f implies x O. as a rule. be the vector alei we saw in para. (6) . e' be a new basis in R whose connection with the basis e1. coordinates of a vector x e R relative to some basis e1. We observe that the only operations used in the simultaneous study of a space and its dual space are the operations of addition of vectors and multiplication of a vector by a scalar in each of the spaces involved and the operation (f. Transformation of coordinates in R and R. x) so that conditions 1 through 4 above hold and. then cp(f) (f. (f. It is there- fore possible to give a definition of a pair of dual spaces R and R which emphasizes the parallel roles played by the two spaces. 4. in addition. . 2 above we showed that for every basis in R there exists a unique dual basis in R. for every basis in R there exists a unique dual basis in R. we specify the coordinates of a vector f E R relative to the dual basis f'. 172. X0) = a2e2 4- + (P + ann and (5) (f. x) which connects the elements of the two spaces. x) NOTE: 0 for all x implies f = 0. x) which associates with f e R and xeR a number (f. 5.168 LECTURES ON LINEAR ALGEBRA to show that if q) is a linear function on R. then. e2.72 4- + (re Then. = ci"ek. . f2. e is given by e'. f2. e2. x) for some fixed vector xo in R. xo). e. . on ft and the vectors x. If we specify the en. Now let e'1. e2. . This formula establishes the desired one-to-one correspondence between the linear functions 9. To this end we choose a basis el. fn are rh..n. as Now let x. In para. e'2. 2. f2. . + a2. In view of the interchangeability of R and R. .

= czknk This is seen by comparing the matrices in (6) and (6'). the coordinates of vectors in k transform like the vectors of the dual basis in R. f2. f2. e'i) = (fk. f'2. the matrix in (6') is the transpose 1 of the transition matrix in (6). x) (bklik. . ni. e'. e'2. Similarly. basis. e'2.e. e2.r fk To this end we compute (7. = bkiek. Then x) --= (I' kek) = (f'1 x) = (f". f'2. f'n be the dual basis of e'1. fOc= to fg. en to e'l .x)=bki(fk.(fk.. i.x) bkiek. We first find its of transition from the basis f'1. e'i) =1= = e'1) cik e'i) u5k (f Hence c1 = u1k. . fn: inverse.e.. p be the dual basis of e1. . .. the matrix (6') . e2. (fk. e'2. e'. to the basis F. f2. i. . . . It follows that the matrix of the transition from fl. f'k is equal to the inverse of the transpose of the matrix which is the matrix of transition from e1. ciece2) = ci. .Eiketk)= e". .. f '2. Thus let ei be the coordinates of x ER relative to a basis e1. Now e't = (f". . e andf'1.INTRODUCTION TO TENSORS 169 Let p. We say that the matrix lime j/ in (6') is the transpose of the transition matrix in (6) because the summation indices in (6) and (6') are different. VVe now discuss the effect of a change of basis on the coordinates e' of vectors in R and k. e and e'i its coordinates in a new basis e'1. We wish to find the matrix 111)7'11 of transition from the fi basis to the f'. so that It follows that the coordinates of vectors in R transform like the vectors of the dual basis in k.. e'i) in two ways: (fk.

Since the sumultaneous study of a vector space and its dual space involves only the usual vector operations and the operation . y). Of the matrices r[cikl and I ib. y. then (x. a. y2). (x. et. x = eiei. The dual of a Euclidean space. e.. But this The converse is obvious. = by. Now let y be the vector with coordinates c11.e. = O. . y1 means that y. LEMMA. en is orthonormal. then f(x) is of the form f(x) = ale + a2e2 + basis e1. y) = (x. 5. Let R be an n-dimensional Euclidean space. The fact the 111). e2.. . x). Proof: Let e1. To prove the uniqueness of y we observe that if f(x) = (x. Thus in the case of a Euclidean space everyf in k can be replaced with the appropriate y in R and instead of writing (f. y). Y). i. For the sake of simplicity we restrict our discussion to the case of real Euclidean spaces. every vector y determines a linear function f such that f(x) = (x.) = 0 for all x. x) we can write (y.k1 involved in these transformations one is the inverse of the transpose of the other.abni = 6/. Since the . y1) and f(x) = (x. y2).k11 is the inverse of the transpose of 11c.. If + ase. y) = ct1e1 a2E2 + + This shows the existence of a vector y such that for all x f(x) = (x. Conversely.k11 is expressed in the relations c. (x. e be an orthonormal basis of R. where y is a fixed vector uniquely determined by the linear function f. y.170 LECTURES ON LINEAR ALGEBRA We summarize our findings in the following rule: when we change from an "old" coordinate system to a "new" one objects with lower case index transform in one way and objects with upper case index transform in a different way. Then every linear function f on R can be expressed in the form f(x) = (x.

Multilinear functions. replace f by y. Solving equation (10) for f' we obtain the required result f = gi2e . If R is Euclidean. where the matrix Hell is the inverse of the matrix I Lgikl I.. (y. Euclidean space. then ek = gkia.e. we may.' is dual to that of the ek. ek) = giabe = gik (ei ek) = Thus if the basis of the J. . When we identify R and its dual rt the concept of orthogonality of a vector x E R and a vector f E k (introduced in para. and (f. x e R.INTRODUCTION TO TENSORS 171 ( f x) which connects elements fe R and x e R. = g f a. But this would have the effect 2 of introducing an inner product in R. x). Now eic) = gj2. We wish to find the coefficients go. A natural If R is an n-dimensional vector space. X) by (y.. e2. 2 above) reduces to that of orthogonality of two vectors of R. y. x).. Let e. we can identify R with R and so in look upon the f as elements of R. Show that gik i. It is natural to try to find expressions for the f in terms of the given e. ek). If we were to identify R and R we would have to write in place of (I. in case of a . its dual basis in R. f2. R by R. (p. flak EXERCISE. then R is also n-dimensional and so R and R are isomorphic. In the first chapter we studied linear and bilinear functions on an n-dimensional vector space. i. Tensors 1.. e be an arbitrary basis in R and f'. Let e].e. . tk) § 23. x). 2 This situation is sometimes described as follows: in Euclidean space one can replace covariant vectors by contravariant vectors. where g ik (ei. we may identify a Euclidean space R with its dual space R.

f'. y. . a vector in R (a covariant vector). y. f. There are three types of multilinear functions of two vectors (bilinear functions): bilinear functions on R (cons dered in § 4). y. g. ) = 1(x'. The simplest multilinear functions are those of type (1. (ß) bilinear functions on R. 1(x. f. q). ) q vectors f. g. . Again. g. . . /(2x. ). 0) and (0. ) . e R and e R (the dual of R) if 1 is linear in each of its Thus. y. . f. 1) defines a vector in R (a contravariant vector). The bilinear function of type (y) . DEFINITION I. f. y. g. functions of one vector in R and one in R. i. ) 1(x" . f. y. g. arguments. A multilinear function of type (1. A multilinear function of p vectors in R (contravariant vectors) ) f".172 LECTURES ON LINEAR ALGEBRA generalization of these concepts is the concept of a multilinear function of an arbitrary number of vectors some of which are elements of R and some of vvhich are elements of R. ). Similarly. 1). . § 22. ) = ul(x. -) = 2/(x. y. g. 1(x. 3.. y. uf. . for example. as was shown in para. g. y. . . g. g. and q vectors in R (covariant vectors) is called a multilinear function of type (p. let y = Ax be a linear transformation on R. A function 1(x. 0) is a linear function of one vector in R. a multilinear function of type (0. g. /(x. is said to be a multilinear function of p vectors x. y. ). f. g. There is a close connection between functions of type (y) and linear transformat ons. Indeed. y. f. y. ). 1(x. g. f' . f". if we fix all vectors but the first then /(x' x".e.

en be a basis in R and fl. en be a basis in R and fl. Expressions for multilinear functions in a given coordinate system. . fe R (a function of type (2. f n its dual basis in R..k /(x. 1)). f.. . f). e'n be a new basis in R and fi. fk. y E R. Coordinate transformations. e2. f) = where the coefficients ail' which determine the function / (x. . Ckfk) = V?? e5. Let Then . f2. e2.fk). Ax) which depends linearly on the vectors x e R and fe R. A similar formula holds for a general multilinear function /(x.fr f3.INTRODUCTION TO TENSORS 173 associated with A is the function (f. y. ". x. i. Let el. y. We now express a multilinear function in terms of the coordinates of its arguments. As in § 11 of chapter II one can prove the converse. y. . If (3) e'a = cate. )= y.. 7 its dual in R. 2. For simplicity we consider the case of a multilinear function 1(x. We now show how the system of numbers which determine a multilinear form changes as a result of a change of basis Thus let el.7: : : = 1(e. fit This shows that the ak depend on the choice of bases in R and R. g. e'2. . f) are given by the relations ai ei. .that one can associate with every bilinear function of type (y) a linear transformation on R. /(Ve. ). eini ht. y... Y.f) Or y = niei.f be its dual in R..e.. . f 2. f '2. e.. where the numbers au::: which define the multilinear function are given by ar. n'e5. Let e'1. x= /(x.

linear functions.. relative to the basis el. Similarly. II is the transpose of the inverse of For a fixed a the numbers c2 in (3) are the coordinates of the vector e'j. 4. def 4. and e' and f'1. b71. 2.715. § 22) f'ß = where the matrix I lb.174 LECTURES ON LINEAR ALGEBRA then (cf..': which define our multilinear function relative to the bases e'. PI. y. p. para. f. . In this way we find that numbers c52.. = ctfixift e and f1. f2. g. para. i. the . .e. ba. . . upon a change of basis. . transform in a manner peculiar to each object and to characterize the object one had to prescribe the . 3. . etc.. e2. fr.5: bibrs Here [c5' [[is the matrix defining the transformation of the e basis and is the matrix defining the transformation of the f basis. for a fixed fi the numbers baft in (4) are the coordinates of f'ß relative to the basis f'.rbts ci-cl To sum up: If define a multilinear function /(x. )relative to a pair of dual bases el. e' and r.l2. We shall now compute the numbers a'.us.) were defined relative to a given basis by an appropriate system of numbers.. This situation can be described briefly by saying that the lower indices of the numbers aj are affected by the matrix I Ic5'11 and the upper by the matrix irb1111 (cf. . and a bilinear function by the n' entries in its matrix. fr. The objects which we have studied in this book (vectors. . e'2. . [2. a linear function by its n coefficients. linear transformations. Thus relative to a given basis a vector was defined by its n coordinates./12. the coordinates of the vectors e' e'5. f2. .f". bilinear functions. fn. then e'1. . Hence to find we must put in (1) in place of Ei.r. We know that f''. a linear transformation by the n2 entries in its matrix. In the case of each of these objects the associated system of numbers would. 4. c/. § 22). e2. bar. Definition of a tensor. e. e'2. ' .). (CT:: = /(e't.

1 and 2 of this section we introduced the concept of a multilinear function. p times covariant and q times contravariant. every tensor determines a unique multilinear function. The numbers a. These transform according to the rule = and so represent a contravariant tensor of rank 1. We now give a few examples of tensors. geometry. Scalar. and algebra. The numbers the components of the tensor. Linear function (covariant vector). The number p are called called the rank (valence) of the tensor. This permits us to deduce properties of tensors and of the operations on tensors using the "model" supplied by multilinear functions. We now define a closely related concept which plays an important role in many branches of physics. If we associate with every coordinate system the same constant a. Conversely. transforms under change of basis in accordance with (6) the multilinear function determines a unique tensor of rank p q. A tensor of rank zero is called a scalar Contravariant vector. then a may be regarded as a tensor of rank zero.INTRODUCTION TO TENSORS 175 values of these numbers relative to some basis as well as their law of transformation under a change of basis. In para. We say that aß times covariant and g times contravariant tensor is defined if with every basis in R there is associated a set of nv+Q numbers a::: (there are p lower indices and q upper indices) which under change of basis defined by some matrix I Ic/II transform according to the rule (6) = cja b acrccrp::: b with q is the transpose of the inverse of I I I. multilinear functions are only one of the possible realiza- tions of tensors. DEFINITION 2. Relative to a definite basis this object is defined by nk numbers (2) which under change of basis transform in accordance with (5). Clearly. Let R be an n-dimensional vector space. Since the system of numbers defining a multilinear function of p vectors in R and q vectors in R. its coordinates relative to this basis. Given a basis in R every vector in R determines n numbers. defining .

ae2 = cia Ae2 = This means that the matrix takes the form e fi = ci2afl bflk e'. With every basis we associate the matrix of the bilinear form relative to this basis. Thus 61k is the simplest tensor of rank two once covariant and once .. once covariant and once contravariant and a bilinear form of vectors f. Let 11(11111 be the matrix of A relative to some basis el. .. Ae. i. = cia. b. i. Let A be a linear transformation on R.176 LECTURES ON LINEAR ALGEBRA a linear function transform according to the rule a'. = Ac. = Then e.e. Similarly. The resulting tensor is of rank two. twice covariant. basis c2ab fik. g e R defines a twice contravariant tensor. and so represent a covariant tensor of rank 1." e' where b1"Cak 6 ik It follows that Ae'. relative to any basis is the unit matrix. We shall show that this matrix is a tensor of rank two.e. With every basis we associate the matrix of A relative to this basis. Define a change of basis by the equations e'. k. = a' jei k. e2. e. a bilinear form of vectors x E R and y e R defines a tensor of rank two. y) be a bilinear form on R. once covariant and once contravariant. once covariant and once contravariant. which proves that the matrix of a linear transformation is indeed a tensor of rank two. = aikek. Bilinear function. the system of numbers 6ik In particular the matrix of the identity transformation E i to if i if i k. Let A (x.. Linear transformation. of A relative to the e'.

y. x) for all x e R. q) whose components relative to some basis take on 79±g prescribed (IT:: be the num values. One interesting feature of this tensor is that its components do not depend on the choice of basis. (This means that if the components of these two tensors relative to some basis are equal. 5 of § 22. Thus. in R and so obtain a multilinear ) of p q vectors in R. both a linear transformation and a bilinear form are defined by a matrix. 0 if i k. These numbers define a multilinear ) as per formula (1) in para. f. We now prove two simple properties of tensors. then. We wish to emphasize that the assumption about the tvvo tensors being of the same type is essential. A sufficient condition for the equality of two tensors of the same type is the equality of their corresponding components relative to some basis. . function l(x. 4. If R is a (real) n-dimensional Euclidean space. as was shown in para. g. Coincidence of the matrices defining these objects in one basis does not imply coincidence of the matrices defining these objects in another basis. T ensors in Euclidean space. The multilinear function. y. . Show dirctly that the system of numbers 6.) For proof we observe that since the two tensors are of the same type they transform in exactly the same way and since their components are the same in some coordinate system they must be the same in every coordinate system. this section. then their components relative to any other basis must be equal. then (f. x) = (y. y. = (I if i = k. . defines a unique tensor satisfying the required conditions. associated with every bais is a tensor. given a basis. it is possible to establish an isomorphism between R and R such that if y E R corresponds under this isomorphism to fe R. Given a multilinear function 1 of fi vectors x. EXERCISE. y. 2 of function /(x. Given p and q it is always possible to construct a tensor of type (p. y. g. sponding vectors u. Thus let prescribed in some basis.INTRODUCTION TO TENSORS 177 contravariant. in turn. The proof is simple. u. in R we can replace the latter by correin R and q vectors f.

The new . We showed in para. e. . fit. 5 of § 22 that in Euclidean space the vectors e. = (e1.) and let b.. . the tensor gz. e. . e5. f. . then this tensor can be used to construct a new tensor kJ. ¿(e1. e . e2. of a basis dual to fi are expressible in terms of the vectors p in the following manner: e. . is called a metric tensor. e.. u. e. v. f. . . ). namely. fie. In view of its connection with the inner product (metric) in our space.e. = grs where gzk = (et._ which is p q times covariant. e.. ek) It follows that rs ) 1(e3. = l(e1. ). v. .. We now propose to express the coefficients of l(x. es. i. The equation r defines the analog of the operation just discussed.178 LECTURES ON LINEAR ALGEBRA in terms of the coefficients of /(x. u. ). be the coefficients of the multilinear function .. .g... l(es. i. ) = l(ei.. In view of the established connection between multilinear functions and tensors we can restate our result for tensors: If au::: is a tensor in Euclidean space p times covariant and q times contravariant. gc. = ggfis = gsrgfis aTf:::. y. . g.) are the coefficients of a bilinear form. y. ) Here g is a twice covariant tensor. ). y..e. ei. Thus let au::: be the coefficients of the multilinear funct on /(x. fs. pc. ) . It is defined by the equation = gccrg fi. fr. e. This is obvious if we observe that the g. This operation is referred to as lowering of indices. .. the inner product relative to the basis e1.

ej. Operations on tensOYS. g. . f. /(x. . )= (x. f3. f. g. 1 is a multilinear function of p' p" vectors in R and q' q" vectors in R. y . y. Let . ) and 1"(z. ) be two multilinear functions of which the first depends on iv vectors in R and q' vectors in R and the second on Jo" vectors in R and q" vectors in R. y. f. 5. -) Clearly this sum is again a multilinear function of the same number of vectors in R and R as the summands l' and 1". g. y. Let l" (x. z. -) . . Show that gm is a twice contravariant tensor. /(x. 5 of § 22. y. h. Addition of tensors. . y. Consequently addition of tensors is defined by means of the formula = Multiplication of tensors. g. Since r(ei. y. y. -. EXERCISE. f. . ). Ve shall now express the components of the tensor correspond- ing to the product of the multilinear functions l' and 1" in terms of the components of the tensors corresponding to l' and 1". g. h. f. f. g. y. be two multilinear functions of the same number of vectors in R and the same number of vectors in R. ) of l' and 1" by means of the formula: f. g. We define their sum ) by the formula /(x. . g. h. Here e has the meaning discussed in para. . . In view of the connection between tensors and multilinear functions it is natural first to define operations on multilinear functions and then express these definitions in the language of tensors relative to some basis. . -. We define the product /(x. l'(x. z.INTRODUCTION TO TENSORS 179 operation is referred to as raising the indices. -)1(z. ) l'(x. . To see this we need only vary in 1 one vector at a time keeping all other vectors fixed. ) l'(x. f.

it follows that att tuk. f. ). f2. 1(e. Since each summand is a multilinear function of y. f" in R and consider the sum . fa) = A (e' a. . ) /(e2.e. y. e' and denote its dual basis by f'2. = r(e5. ) be a multilinear function of p vectors in R (p 1) and q vectors in R(q 1). + 1(e. We recall that if e'. l'(y. g. Specifically we must show that A (e. Jet. . e1. fk). Let /(x. fk Therefore eikr. . i. f'a) = cak A( = A (ek. . . . We use 1 to define a new multilinear function of p 1 vectors in R and q 1 vectors in R. g. e'2. and g. y. . (7) = /(ei.) ) and g. Contraction of tensors. the sum does not.. . the same is true of the sum I'. A (ea. e in R and its dual basis p. fl. f'ce) = A (cak ek. y. then cikek. We now show that whereas each summand depends on the choice of basis. g. f'. remain fixed we need only prove our contention for a bilinear form A (x. f2. irtn.::: a"tkl This formula defines the product of two tensors. g. f. ck f'a) = A (ek. Y. To this end we choose a basis el. Since the vectors y.180 LECTURES ON LINEAR ALGEBRA and = 1" (ek. g. g. f'k). y. y. ). -) . Since coefficients of the form /(x. g. f". f' z) A (e'2. Let us choose a new basis e'1. ) -. e2. -). fu . We now express the coefficients of the form (7) in terms of the . P) is indeed independent of choice of basis.

if one tried to sum over two covariant indices. Another contraction. ) = l(e e . Another example. (repeated as a factor an appropriate num- ber of times). say.k be a tensor of rank three and bt'n ai. With any tensor ai5 of rank two we can associate a sequence of invariants (i. Likewise the raising of indices can be viewed as contraction of the product of some tensor by the tensor g". simply scalars) a:. to a number independent of coordinate systems."' is a tensor rank a tensor of rank two. Let a.. The operation of lowering indices discussed in para. over the indices j and k. We observe that contraction of a tensor of rank two leads to a tensor of rank zero (scalar). say.e. then the tensor cit is the matrix of the product of these linear transformations. f2. However. the resulting system of numbers would no longer form a tensor (for upon change of basis this system of numbers would not transform in accordance with the prescribed law of transformation for tensors). The result of contracting this tensor over the indices i and m.kb. would lead to a tensor of rank one (vector).. jes. ). If the tensors a1 and b ki are looked upon as matrices of linear transformations.e. a/ .::: obtained from a::: as per (8) is called a contraction of the tensor It is clear that the summation in the process of contraction may involve any covariant index and any contravariant index. i. = The tensor a'. 4 of this section can be viewed as contraction of the product of some tensor by the metric tensor g. Let ati and b.INTRODUCTION TO TENSORS 181 and l'(e if follows that (8) . say. Their product ct'z' five.' be two tensors of rank two. numbers independent of choice of basis. would be a tensor of rank three. By multiplication and contraction these yield a new tensor of rank two: cit = aiabat.

For example. of a covariant vector. ) is the multilinear function corresponding If 1(x.. to the tensor ail . For example. as is clear from (9). contraction over all indices). g.. However. Since for a multilinear function to be symmetric with It goes without saying that we have in mind indices in the same (upper or lower) group. . if (9) 1(x. is a tensor of rank two. )= then. etc. if ei are the coordinates of a contravariant vector and n. it can be shown that every tensor can be obtained from vectors (tensors of rank one) using the operations of addition and multiplication. addition. . . f. multiplication by a number and total contraction (i. y. by multiplying vectors we can obtain tensors of arbitrarily high rank.. then Ein.e..e. given set of indices i if its components are invariant under an arbitrary permutation of these indices.182 LECTURES ON LINEAR ALGEBRA The operations on tensors permit us to construct from given tensors new tensors invariantly connected with the given ones. Thus. g.. i. By a rational integral invariant of a given system of tensors we mean a polynomial function of the components of these tensors whose value does not change when one system of components of the tensors in question computed with respect to some basis is replaced by another system computed with respect to some other basis. f. y. symmetry of the tensor with respect to some group of indices is equivalent to symmetry of the corresponding multilinear function with respect to an appropriate set of vectors. In connection with the above concept we quote without proof the following result: Any rational integral invariant of a given system of tensors can be obtained from these tensors by means of the operations of tensor multiplica- tion. We observe that not all tensors can be obtained by multiplying vectors. Symmetric and skew symmetric tensors DEFINITION. if then the tensor is said to be symmetric with respect to the first two (lower) indices. A tensor is said to be symmetric with respect to a 6.

Since for a multilinear function to be symmetric with respect to a certain set of vectors it is sufficient that the corresponding tensor $a_{\cdots}^{\cdots}$ be symmetric with respect to an appropriate set of indices in some basis, it follows that if the components of a tensor are symmetric relative to one coordinate system, then this symmetry is preserved in all coordinate systems.

DEFINITION. A tensor is said to be skew symmetric if it changes sign every time two of its indices are interchanged. Here it is assumed that we are dealing with a tensor all of whose indices are of the same nature, i.e., either all covariant or all contravariant.

The definition of a skew symmetric tensor implies that an even permutation of its indices leaves its components unchanged and an odd permutation multiplies them by $-1$.

The multilinear functions associated with skew symmetric tensors are themselves skew symmetric in the sense of the following definition:

DEFINITION. A multilinear function $l(x, y, \cdots)$ of $p$ vectors in $R$ is said to be skew symmetric if interchanging any pair $x$, $y$ of its vectors changes the sign of the function.

For a multilinear function to be skew symmetric it is sufficient that the components of the associated tensor be skew symmetric relative to some coordinate system. This much is obvious from (9). On the other hand, skew symmetry of a multilinear function implies skew symmetry of the associated tensor (in any coordinate system). In other words, if the components of a tensor are skew symmetric in one coordinate system then they are skew symmetric in all coordinate systems, i.e., the tensor is skew symmetric.

We now count the number of independent components of a skew symmetric tensor. Thus let $a_{ik}$ be a skew symmetric tensor of rank two. Then $a_{ik} = -a_{ki}$, so that the number of different components is $n(n-1)/2$. Similarly, the number of different components of a skew symmetric tensor $a_{ijk}$ is $n(n-1)(n-2)/3!$, since components with repeated indices have the value zero and components which differ from one another only in the order of their indices can be expressed in terms of each other. More generally, the number of independent components of a skew symmetric tensor with $k$ indices ($k \le n$) is $\binom{n}{k}$.
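The component counts can be checked quickly; the following sketch is my own and assumes only Python's standard library.

from itertools import combinations
from math import comb

# A skew symmetric tensor with k indices (all of the same kind) in an
# n-dimensional space has one independent component per strictly increasing
# index set i_1 < i_2 < ... < i_k, hence C(n, k) of them.
n = 4
for k in range(1, n + 2):
    assert len(list(combinations(range(n), k))) == comb(n, k)

assert comb(n, 2) == n * (n - 1) // 2              # rank two: n(n-1)/2
assert comb(n, 3) == n * (n - 1) * (n - 2) // 6    # rank three: n(n-1)(n-2)/3!
assert comb(n, n + 1) == 0                         # none with more than n indices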

(There are no non-zero skew symmetric tensors with more than $n$ indices. This follows from the fact that a component with two or more repeated indices vanishes, and $k > n$ implies that at least two of the indices of each component coincide.)

We consider in greater detail skew symmetric tensors with $n$ indices. Since two sets of $n$ different indices differ from one another in order alone, it follows that such a tensor has only one independent component. Consequently, if $i_1, i_2, \cdots, i_n$ is any permutation of the integers $1, 2, \cdots, n$ and if we put $a_{12\cdots n} = a$, then

(10)  $a_{i_1 i_2 \cdots i_n} = \pm a$,

where the sign is $+$ or $-$ depending on whether the permutation $i_1 i_2 \cdots i_n$ is even ($+$ sign) or odd ($-$ sign).

EXERCISE. Show that as a result of a coordinate transformation the number $a_{12\cdots n} = a$ is multiplied by the determinant of the matrix associated with this coordinate transformation.

In view of formula (10) the multilinear function associated with a skew symmetric tensor with $n$ indices has the form

$$l(x, y, \cdots, z) = a \begin{vmatrix} \xi^{1} & \xi^{2} & \cdots & \xi^{n} \\ \eta^{1} & \eta^{2} & \cdots & \eta^{n} \\ \vdots & \vdots & & \vdots \\ \zeta^{1} & \zeta^{2} & \cdots & \zeta^{n} \end{vmatrix}.$$

This proves the fact that, apart from a multiplicative constant, the only skew symmetric multilinear function of $n$ vectors in an $n$-dimensional vector space is the determinant of the coordinates of these vectors.

7. The operation of symmetrization

Given a tensor one can always construct another tensor symmetric with respect to a preassigned group of indices. This operation is called symmetrization and consists in the following. Let the given tensor be $a_{i_1 i_2 \cdots i_k \cdots}$. To symmetrize it with respect to the first $k$ indices, say, is to construct the tensor

$$a_{(i_1 i_2 \cdots i_k) \cdots} = \frac{1}{k!} \sum a_{j_1 j_2 \cdots j_k \cdots},$$

where the sum is taken over all permutations $j_1, j_2, \cdots, j_k$ of the indices $i_1, i_2, \cdots, i_k$. For example,

$$a_{(i_1 i_2)} = \frac{1}{2!}\,(a_{i_1 i_2} + a_{i_2 i_1}).$$
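Both the determinant formula and the symmetrization rule can be verified numerically. The sketch below is mine, not the book's; it assumes numpy and a small helper for the sign of a permutation, and it fixes $n = 3$ for brevity.

import numpy as np
from itertools import permutations

def parity(p):
    # sign (+1 or -1) of a permutation p of 0, 1, ..., len(p)-1
    p, s = list(p), 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            s = -s
    return s

n, a = 3, 1.7
# A skew symmetric tensor with n indices: one independent component a, the
# rest determined by (10): a_{i1 i2 ... in} = +/- a.
t = np.zeros((n,) * n)
for p in permutations(range(n)):
    t[p] = parity(p) * a

# Its multilinear function of n vectors equals a times the determinant of
# their coordinates.
rng = np.random.default_rng(2)
x, y, z = rng.standard_normal((3, n))
value = np.einsum('ijk,i,j,k->', t, x, y, z)
assert np.isclose(value, a * np.linalg.det(np.array([x, y, z])))

# Symmetrization over the first two indices of an arbitrary rank-three tensor:
# b_(i1 i2) i3 = (1/2!)(b_{i1 i2 i3} + b_{i2 i1 i3}).
b = rng.standard_normal((n, n, n))
b_sym = 0.5 * (b + b.transpose(1, 0, 2))
assert np.allclose(b_sym, b_sym.transpose(1, 0, 2))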

The operation of alternation is analogous to the operation of symmetrization and permits us to construct from a given tensor another tensor skew symmetric with respect to a preassigned group of indices. The operation is defined by the equation

$$a_{[i_1 i_2 \cdots i_k]} = \frac{1}{k!} \sum \pm\, a_{j_1 j_2 \cdots j_k},$$

where the sum is taken over all permutations $j_1, j_2, \cdots, j_k$ of the indices $i_1, i_2, \cdots, i_k$ and the sign depends on the even or odd nature of the permutation involved. For instance,

$$a_{[i_1 i_2]} = \frac{1}{2!}\,(a_{i_1 i_2} - a_{i_2 i_1}).$$

The operation of alternation is indicated by the square bracket symbol $[\ ]$. The brackets contain the indices involved in the operation of alternation.

Given $k$ vectors $\xi^{i}, \eta^{i}, \cdots, \zeta^{i}$ we can construct their product

$$a^{i_1 i_2 \cdots i_k} = \xi^{i_1} \eta^{i_2} \cdots \zeta^{i_k}$$

and then alternate it to get $a^{[i_1 i_2 \cdots i_k]}$. It is easy to see that the components of this tensor are, up to the factor $1/k!$, the $k$th order minors of the following matrix:

$$\begin{pmatrix} \xi^{1} & \xi^{2} & \cdots & \xi^{n} \\ \eta^{1} & \eta^{2} & \cdots & \eta^{n} \\ \vdots & \vdots & & \vdots \\ \zeta^{1} & \zeta^{2} & \cdots & \zeta^{n} \end{pmatrix}.$$

Consider a $k$-dimensional subspace of an $n$-dimensional space $R$. We wish to characterize this subspace by means of a system of numbers, i.e., we wish to coordinatize it. A $k$-dimensional subspace is generated by $k$ linearly independent vectors $\xi^{i}, \eta^{i}, \cdots, \zeta^{i}$. Different systems of $k$ linearly independent vectors may generate the same subspace. However, it is easy to show (the proof is left to the reader) that if two such systems of vectors generate the same subspace, the tensors constructed from each of these systems differ by a non-zero multiplicative constant only. Thus the skew symmetric tensor $a^{[i_1 i_2 \cdots i_k]}$ constructed on the generators $\xi^{i}, \eta^{i}, \cdots, \zeta^{i}$ of the subspace defines this subspace.

The tensor $a^{[i_1 i_2 \cdots i_k]}$ does not change when we add to one of the vectors $\xi, \eta, \cdots, \zeta$ any linear combination of the remaining vectors.
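To see the minors appear, here is a sketch of my own for the case $k = 2$, $n = 4$; it assumes numpy, and the factor $1/2!$ mentioned above shows up explicitly. The last few lines also illustrate that a different pair of generators of the same subspace changes the tensor only by a non-zero constant.

import numpy as np
from itertools import combinations

n = 4
rng = np.random.default_rng(3)
xi, eta = rng.standard_normal((2, n))      # two generators of a 2-dimensional subspace

# Product a^{i1 i2} = xi^{i1} eta^{i2}, then alternation over both indices.
prod = np.outer(xi, eta)
alt = 0.5 * (prod - prod.T)                # a^{[i1 i2]}

# Each component with i1 < i2 is (1/2!) times the corresponding 2nd order
# minor of the matrix whose rows are xi and eta.
M = np.array([xi, eta])
for i, j in combinations(range(n), 2):
    assert np.isclose(alt[i, j], 0.5 * np.linalg.det(M[:, [i, j]]))

# New generators of the same subspace: the alternated tensor changes only by
# a non-zero multiplicative constant.
xi2, eta2 = 2.0 * xi + eta, xi - eta
prod2 = np.outer(xi2, eta2)
alt2 = 0.5 * (prod2 - prod2.T)
ratio = alt2[0, 1] / alt[0, 1]
assert not np.isclose(ratio, 0.0) and np.allclose(alt2, ratio * alt)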
