LECTURES ON LINEAR ALGEBRA

INTERSCIENCE PUBLISHERS LTD., LONDON
COPYRIGHT © 1961 BY INTERSCIENCE PUBLISHERS, INC.

ALL RIGHTS RESERVED
LIBRARY OF CONGRESS CATALOG CARD NUMBER 61-8630
SECOND PRINTING, 1963

PRINTED IN THE UNITED STATES OF AMERICA

PREFACE TO THE SECOND EDITION
The second edition differs from the first in two ways. Some of the material was substantially revised and new material was added. The major additions include two appendices at the end of the book dealing with computational methods in linear algebra and the theory of perturbations, a section on extremal properties of eigenvalues, and a section on polynomial matrices (§§ 17 and 21). As for major revisions, the chapter dealing with the Jordan canonical form of a linear transformation was entirely rewritten and Chapter IV was reworked. Minor changes and additions were also made. The new text was written in collaboration with Z. Ja. Shapiro.

I wish to thank A. G. Kurosh for making available his lecture notes on tensor algebra. I am grateful to S. V. Fomin for a number of valuable comments. Finally, my thanks go to M. L. Tzeitlin for assistance in the preparation of the manuscript and for a number of suggestions.
September 1950

I. GEL'FAND

Translator's note: Professor Gel'fand asked that the two appendices be left out of the English translation.


PREFACE TO THE FIRST EDITION

This book is based on a course in linear algebra taught by the author in the department of mechanics and mathematics of the Moscow State University and at the Byelorussian State University.

S. V. Fomin participated to a considerable extent in the writing of this book. Without his help this book could not have been written.

The author wishes to thank Assistant Professor A. E. Turetski of the Byelorussian State University, who made available to him notes of the lectures given by the author in 1945, and D. A. Raikov, who carefully read the manuscript and made a number of valuable comments.

The material in fine print is not utilized in the main part of the text and may be omitted in a first perfunctory reading.

January 1948
I. GEL'FAND


TABLE OF CONTENTS

Preface to the second edition
Preface to the first edition

I. n-Dimensional Spaces. Linear and Bilinear Forms
   § 1. n-Dimensional vector spaces
   § 2. Euclidean space
   § 3. Orthogonal basis. Isomorphism of Euclidean spaces
   § 4. Bilinear and quadratic forms
   § 5. Reduction of a quadratic form to a sum of squares
   § 6. Reduction of a quadratic form by means of a triangular transformation
   § 7. The law of inertia
   § 8. Complex n-dimensional space

II. Linear Transformations
   § 9. Linear transformations. Operations on linear transformations
   § 10. Invariant subspaces. Eigenvalues and eigenvectors of a linear transformation
   § 11. The adjoint of a linear transformation
   § 12. Self-adjoint (Hermitian) transformations. Simultaneous reduction of a pair of quadratic forms to a sum of squares
   § 13. Unitary transformations
   § 14. Commutative linear transformations. Normal transformations
   § 15. Decomposition of a linear transformation into a product of a unitary and self-adjoint transformation
   § 16. Linear transformations on a real Euclidean space
   § 17. Extremal properties of eigenvalues

III. The Canonical Form of an Arbitrary Linear Transformation
   § 18. The canonical form of a linear transformation
   § 19. Reduction to canonical form
   § 20. Elementary divisors
   § 21. Polynomial matrices

IV. Introduction to Tensors
   § 22. The dual space
   § 23. Tensors


CHAPTER I

n-Dimensional Spaces. Linear and Bilinear Forms

§ 1. n-Dimensional vector spaces

1. Definition of a vector space. We frequently come across objects which are added and multiplied by numbers. Thus

1. In geometry objects of this nature are vectors in three-dimensional space, i.e., directed segments. Two directed segments are said to define the same vector if and only if it is possible to translate one of them into the other. It is therefore convenient to measure off all such directed segments beginning with one common point which we shall call the origin. As is well known, the sum of two vectors x and y is, by definition, the diagonal of the parallelogram with sides x and y. The definition of multiplication by (real) numbers is equally well known.

2. In algebra we come across systems of n numbers x = (ξ₁, ξ₂, ..., ξₙ) (e.g., rows of a matrix, the set of coefficients of a linear form, etc.). Addition and multiplication of n-tuples by numbers are usually defined as follows: by the sum of the n-tuples x = (ξ₁, ξ₂, ..., ξₙ) and y = (η₁, η₂, ..., ηₙ) we mean the n-tuple
x + y = (ξ₁ + η₁, ξ₂ + η₂, ..., ξₙ + ηₙ).
By the product of the number λ and the n-tuple x = (ξ₁, ξ₂, ..., ξₙ) we mean the n-tuple
λx = (λξ₁, λξ₂, ..., λξₙ).

3. In analysis we define the operations of addition of functions and multiplication of functions by numbers. In the sequel we shall consider all continuous functions defined on some interval [a, b].

In the examples just given the operations of addition and multiplication by numbers are applied to entirely dissimilar objects. To investigate all examples of this nature from a unified point of view we introduce the concept of a vector space.

DEFINITION 1. A set R of elements x, y, z, ... is said to be a vector space over a field F if:

I. With every two elements x and y in R there is associated an element z in R which is called the sum of the elements x and y. The sum of the elements x and y is denoted by x + y.

II. With every element x in R and every number λ belonging to a field F there is associated an element λx in R. λx is referred to as the product of x by λ.

III. The above operations must satisfy the following requirements (axioms):

1. x + y = y + x (commutativity)
2. (x + y) + z = x + (y + z) (associativity)
3. R contains an element 0 such that x + 0 = x for all x in R. 0 is referred to as the zero element.
4. For every x in R there exists (in R) an element denoted by −x with the property x + (−x) = 0.

1. 1 · x = x
2. α(βx) = (αβ)x.

1. (α + β)x = αx + βx
2. α(x + y) = αx + αy.

It is not an oversight on our part that we have not specified how elements of R are to be added and multiplied by numbers. Any definitions of these operations are acceptable as long as the axioms listed above are satisfied. Whenever this is the case we are dealing with an instance of a vector space.

We leave it to the reader to verify that the examples 1, 2, 3 above are indeed examples of vector spaces.

Let us give a few more examples of vector spaces.

4. The set of all polynomials of degree not exceeding some natural number n constitutes a vector space if addition of polynomials and multiplication of polynomials by numbers are defined in the usual manner. We observe that under the usual operations of addition and multiplication by numbers the set of polynomials of degree n does not form a vector space since the sum of two polynomials of degree n may turn out to be a polynomial of degree smaller than n. Thus
(tⁿ + t) + (−tⁿ + t) = 2t.

5. We take as the elements of R matrices of order n.

As the sum of two such matrices ||aᵢₖ|| and ||bᵢₖ|| we take the matrix ||aᵢₖ + bᵢₖ||. As the product of the number λ and the matrix ||aᵢₖ|| we take the matrix ||λaᵢₖ||. It is easy to see that the above set R is now a vector space.

If the numbers λ, μ, ... involved in the definition of a vector space are real, then the space is referred to as a real vector space. If the numbers λ, μ, ... are taken from the field of complex numbers, then the space is referred to as a complex vector space. More generally it may be assumed that λ, μ, ... are elements of an arbitrary field K. Then R is called a vector space over the field K. Many concepts and theorems dealt with in the sequel and, in particular, the contents of this section apply to vector spaces over arbitrary fields. However, in chapter I we shall ordinarily assume that R is a real vector space.

It is natural to call the elements of a vector space vectors. The fact that this term was used in Example 1 should not confuse the reader. The geometric considerations associated with this word will help us clarify and even predict a number of results.

2. The dimensionality of a vector space. We now define the notions of linear dependence and independence of vectors which are of fundamental importance in all that follows.

DEFINITION 2. Let R be a vector space. We shall say that the vectors x, y, z, ..., v are linearly dependent if there exist numbers α, β, γ, ..., θ, not all equal to zero, such that
(1) αx + βy + γz + ... + θv = 0.

Vectors which are not linearly dependent are said to be linearly independent. In other words, a set of vectors x, y, z, ..., v is said to be linearly independent if the equality
αx + βy + γz + ... + θv = 0
implies that α = β = γ = ... = θ = 0.

Let the vectors x, y, z, ..., v be linearly dependent, i.e., let x, y, z, ..., v be connected by a relation of the form (1) with at least one of the coefficients, say α, unequal to zero. Then
αx + βy + γz + ... + θv = 0.
Dividing by α and putting

−β/α = λ, −γ/α = μ, ..., −θ/α = ζ,
we have
(2) x = λy + μz + ... + ζv.

Whenever a vector x is expressible through vectors y, z, ..., v in the form (2) we say that x is a linear combination of the vectors y, z, ..., v.

Thus, if the vectors x, y, z, ..., v are linearly dependent then at least one of them is a linear combination of the others. We leave it to the reader to prove that the converse is also true, i.e., that if one of a set of vectors is a linear combination of the remaining vectors then the vectors of the set are linearly dependent.

EXERCISES. 1. Show that if one of the vectors x, y, z, ..., v is the zero vector then these vectors are linearly dependent.
2. Show that if the vectors x, y, z, ... are linearly dependent and u, v, ... are arbitrary vectors then the vectors x, y, z, ..., u, v, ... are linearly dependent.

We now introduce the concept of dimension of a vector space.

Any two vectors on a line are proportional, i.e., linearly dependent. In the plane we can find two linearly independent vectors but any three vectors are linearly dependent. If R is the set of vectors in three-dimensional space, then it is possible to find three linearly independent vectors but any four vectors are linearly dependent.

As we see, the maximal number of linearly independent vectors on a straight line, in the plane, and in three-dimensional space coincides with what is called in geometry the dimensionality of the line, plane, and space, respectively. It is therefore natural to make the following general definition.

DEFINITION 3. A vector space R is said to be n-dimensional if it contains n linearly independent vectors and if any n + 1 vectors in R are linearly dependent.

If R is a vector space which contains an arbitrarily large number of linearly independent vectors, then R is said to be infinite-dimensional. Infinite-dimensional spaces will not be studied in this book.
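To illustrate Definition 2 in the plane regarded as the space of pairs of numbers: the vectors x = (1, 2) and y = (2, 4) are linearly dependent, since
2x − y = (2, 4) − (2, 4) = 0
with the coefficients 2 and −1 not both zero, so that y = 2x is a linear combination of x. On the other hand, the vectors x = (1, 0) and y = (0, 1) are linearly independent, since αx + βy = (α, β) vanishes only when α = β = 0. Thus the plane furnishes two linearly independent vectors, while, as noted above, any three vectors in it are linearly dependent; this is the sense in which the plane is two-dimensional.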

We shall now compute the dimensionality of each of the vector spaces considered in the Examples 1, 2, 3, 4, 5.

1. As we have already indicated, the space R of Example 1 contains three linearly independent vectors and any four vectors in it are linearly dependent. Consequently R is three-dimensional.

2. Let R denote the space whose elements are n-tuples of real numbers. This space contains n linearly independent vectors. For instance, the vectors
x₁ = (1, 0, ..., 0),
x₂ = (0, 1, ..., 0),
............
xₙ = (0, 0, ..., 1)
are easily seen to be linearly independent. On the other hand, any m vectors in R, m > n, are linearly dependent. Indeed, let
y₁ = (η₁₁, η₁₂, ..., η₁ₙ),
y₂ = (η₂₁, η₂₂, ..., η₂ₙ),
............
yₘ = (ηₘ₁, ηₘ₂, ..., ηₘₙ)
be m vectors and let m > n. The number of linearly independent rows in the matrix
| η₁₁ η₁₂ ... η₁ₙ |
| η₂₁ η₂₂ ... η₂ₙ |
| ............... |
| ηₘ₁ ηₘ₂ ... ηₘₙ |
cannot exceed n (the number of columns). Since m > n, our m rows are linearly dependent. But this implies the linear dependence of the vectors y₁, y₂, ..., yₘ. Hence R is n-dimensional.

3. Let R be the space of continuous functions. Let N be any natural number. Then the functions f₁(t) = 1, f₂(t) = t, ..., f_N(t) = tᴺ⁻¹ form a set of linearly independent vectors (the proof of this statement is left to the reader). It follows that our space contains an arbitrarily large number of linearly independent functions or, briefly, R is infinite-dimensional.

4. Let R be the space of polynomials of degree ≤ n − 1. In this space the n polynomials 1, t, ..., tⁿ⁻¹ are linearly independent. It can be shown that any m elements of R, m > n, are linearly dependent. Thus the dimension of R is n.

5. We leave it to the reader to prove that the space of n × n matrices ||aᵢₖ|| is n²-dimensional.

3. Basis and coordinates in n-dimensional space.

DEFINITION 4. Any set of n linearly independent vectors e₁, e₂, ..., eₙ of an n-dimensional vector space R is called a basis of R.

Thus, for instance, in the case of the space considered in Example 1 any three vectors which are not coplanar form a basis.

By definition of the term "n-dimensional vector space" such a space contains n linearly independent vectors, i.e., it contains a basis.

THEOREM 1. Every vector x belonging to an n-dimensional vector space R can be uniquely represented as a linear combination of basis vectors.

Proof: Let e₁, e₂, ..., eₙ be a basis in R. Let x be an arbitrary vector in R. The set x, e₁, e₂, ..., eₙ contains n + 1 vectors. It follows from the definition of an n-dimensional vector space that these vectors are linearly dependent, i.e., that there exist n + 1 numbers α₀, α₁, ..., αₙ, not all zero, such that
(3) α₀x + α₁e₁ + ... + αₙeₙ = 0.
Obviously α₀ ≠ 0. Otherwise (3) would imply the linear dependence of the vectors e₁, e₂, ..., eₙ. Using (3) we have
x = (−α₁/α₀)e₁ + (−α₂/α₀)e₂ + ... + (−αₙ/α₀)eₙ.
This proves that every x ∈ R is indeed a linear combination of the vectors e₁, e₂, ..., eₙ.

To prove uniqueness of the representation of x in terms of the basis vectors we assume that
x = ξ₁e₁ + ξ₂e₂ + ... + ξₙeₙ
and
x = ξ′₁e₁ + ξ′₂e₂ + ... + ξ′ₙeₙ.
Subtracting one equation from the other we obtain
0 = (ξ₁ − ξ′₁)e₁ + (ξ₂ − ξ′₂)e₂ + ... + (ξₙ − ξ′ₙ)eₙ.

Since e₁, e₂, ..., eₙ are linearly independent, it follows that
ξ₁ − ξ′₁ = ξ₂ − ξ′₂ = ... = ξₙ − ξ′ₙ = 0,
i.e.,
ξ₁ = ξ′₁, ξ₂ = ξ′₂, ..., ξₙ = ξ′ₙ.
This proves uniqueness of the representation.

DEFINITION 5. If e₁, e₂, ..., eₙ form a basis in an n-dimensional space and
(4) x = ξ₁e₁ + ξ₂e₂ + ... + ξₙeₙ,
then the numbers ξ₁, ξ₂, ..., ξₙ are called the coordinates of the vector x relative to the basis e₁, e₂, ..., eₙ.

Theorem 1 states that given a basis e₁, e₂, ..., eₙ of a vector space R every vector x ∈ R has a unique set of coordinates.

If the coordinates of x relative to the basis e₁, e₂, ..., eₙ are ξ₁, ξ₂, ..., ξₙ and the coordinates of y relative to the same basis are η₁, η₂, ..., ηₙ, i.e., if
x = ξ₁e₁ + ξ₂e₂ + ... + ξₙeₙ,
y = η₁e₁ + η₂e₂ + ... + ηₙeₙ,
then
x + y = (ξ₁ + η₁)e₁ + (ξ₂ + η₂)e₂ + ... + (ξₙ + ηₙ)eₙ,
i.e., the coordinates of x + y are ξ₁ + η₁, ξ₂ + η₂, ..., ξₙ + ηₙ. Thus the coordinates of the sum of two vectors are the sums of the appropriate coordinates of the summands. Similarly the vector λx has as coordinates the numbers λξ₁, λξ₂, ..., λξₙ, i.e., the coordinates of the product of a vector by a scalar are the products of the coordinates of that vector by the scalar in question.

It is clear that the zero vector is the only vector all of whose coordinates are zero.

EXAMPLES. 1. In the case of three-dimensional space our definition of the coordinates of a vector coincides with the definition of the coordinates of a vector in a (not necessarily Cartesian) coordinate system.

2. Let R be the space of n-tuples of numbers. Let us choose as basis the vectors
e₁ = (1, 0, ..., 0), e₂ = (0, 1, ..., 0), ..., eₙ = (0, 0, ..., 1),
and then compute the coordinates of a vector x = (ξ₁, ξ₂, ..., ξₙ) relative to this basis. By definition of the operations,
x = (ξ₁, ξ₂, ..., ξₙ) = ξ₁(1, 0, ..., 0) + ξ₂(0, 1, ..., 0) + ... + ξₙ(0, 0, ..., 1) = ξ₁e₁ + ξ₂e₂ + ... + ξₙeₙ.
It follows that in the space R of n-tuples (ξ₁, ξ₂, ..., ξₙ) the numbers ξ₁, ξ₂, ..., ξₙ may be viewed as the coordinates of the vector x = (ξ₁, ξ₂, ..., ξₙ) relative to the basis e₁ = (1, 0, ..., 0), e₂ = (0, 1, ..., 0), ..., eₙ = (0, 0, ..., 1); here the connection between the coordinates of a vector and the numbers ξ₁, ξ₂, ..., ξₙ which define the vector is particularly simple.

Let us now consider another basis for R:
e₁ = (1, 1, ..., 1), e₂ = (0, 1, ..., 1), ..., eₙ = (0, 0, ..., 1).
Then the coordinates η₁, η₂, ..., ηₙ of the vector x = (ξ₁, ξ₂, ..., ξₙ) relative to the basis e₁, e₂, ..., eₙ satisfy
x = η₁e₁ + η₂e₂ + ... + ηₙeₙ,
i.e.,
(ξ₁, ξ₂, ..., ξₙ) = η₁(1, 1, ..., 1) + η₂(0, 1, ..., 1) + ... + ηₙ(0, 0, ..., 1),
so that the numbers η₁, η₂, ..., ηₙ must satisfy the relations
η₁ = ξ₁,
η₁ + η₂ = ξ₂,
............
η₁ + η₂ + ... + ηₙ = ξₙ.
Consequently,
η₁ = ξ₁, η₂ = ξ₂ − ξ₁, ..., ηₙ = ξₙ − ξₙ₋₁.

3. Let R be the vector space of polynomials of degree ≤ n − 1. A very simple basis in this space is the basis whose elements are e₁ = 1, e₂ = t, ..., eₙ = tⁿ⁻¹. It is easy to see that the coordinates of the polynomial P(t) = a₀tⁿ⁻¹ + a₁tⁿ⁻² + ... + aₙ₋₁ in this basis are the coefficients aₙ₋₁, aₙ₋₂, ..., a₀.

Let us now select another basis for R:
e′₁ = 1, e′₂ = t − a, e′₃ = (t − a)², ..., e′ₙ = (t − a)ⁿ⁻¹.
Expanding P(t) in powers of (t − a) we find that
P(t) = P(a) + P′(a)(t − a) + ... + [P⁽ⁿ⁻¹⁾(a)/(n − 1)!](t − a)ⁿ⁻¹.
Thus the coordinates of P(t) in this basis are
P(a), P′(a), ..., P⁽ⁿ⁻¹⁾(a)/(n − 1)!.

EXERCISE. Show that in an arbitrary basis
e₁ = (a₁₁, a₁₂, ..., a₁ₙ),
e₂ = (a₂₁, a₂₂, ..., a₂ₙ),
............
eₙ = (aₙ₁, aₙ₂, ..., aₙₙ)
of the space R of n-tuples the coordinates η₁, η₂, ..., ηₙ of a vector x = (ξ₁, ξ₂, ..., ξₙ) are linear combinations of the numbers ξ₁, ξ₂, ..., ξₙ.

4. Isomorphism of n-dimensional vector spaces. In the examples considered above some of the spaces are identical with others when it comes to the properties we have investigated so far. One instance of this type is supplied by the ordinary three-dimensional space R considered in Example 1 and the space R′ whose elements are triples of real numbers. Indeed, once a basis has been selected in R we can associate with a vector in R its coordinates relative to that basis, i.e., we can associate with a vector in R a vector in R′. When vectors are added their coordinates are added. When a vector is multiplied by a scalar all of its coordinates are multiplied by that scalar. This implies a parallelism between the geometric properties of R and appropriate properties of R′.

We shall now formulate precisely the notion of "sameness" or of "isomorphism" of vector spaces.

DEFINITION 6. Two vector spaces R and R′ are said to be isomorphic if it is possible to establish a one-to-one correspondence x ↔ x′ between the elements x ∈ R and x′ ∈ R′ such that if x ↔ x′ and y ↔ y′, then
1. the vector which this correspondence associates with x + y is x′ + y′,
2. the vector which this correspondence associates with λx is λx′.

There arises the question as to which vector spaces are isomorphic and which are not.

Two vector spaces of different dimensions are certainly not isomorphic. Indeed, let us assume that R and R′ are isomorphic. If x, y, ... are vectors in R and x′, y′, ... are their counterparts in R′, then in view of conditions 1 and 2 of the definition of isomorphism the equation λx + μy + ... = 0 is equivalent to the equation λx′ + μy′ + ... = 0. Hence the counterparts in R′ of linearly independent vectors in R are also linearly independent and conversely. Therefore the maximal number of linearly independent vectors in R is the same as the maximal number of linearly independent vectors in R′. This is the same as saying that the dimensions of R and R′ are the same. It follows that two spaces of different dimensions cannot be isomorphic.

THEOREM 2. All vector spaces of dimension n are isomorphic.

Proof: Let R and R′ be two n-dimensional vector spaces. Let e₁, e₂, ..., eₙ be a basis in R and let e′₁, e′₂, ..., e′ₙ be a basis in R′. We shall associate with the vector
(5) x = ξ₁e₁ + ξ₂e₂ + ... + ξₙeₙ
the vector
x′ = ξ₁e′₁ + ξ₂e′₂ + ... + ξₙe′ₙ,
i.e., a linear combination of the vectors e′ᵢ with the same coefficients as in (5).

This correspondence is one-to-one. Indeed, every vector x ∈ R has a unique representation of the form (5). This means that the ξᵢ are uniquely determined by the vector x. But then x′ is likewise uniquely determined by x. By the same token every x′ ∈ R′ determines one and only one vector x ∈ R. It should now be obvious that if x ↔ x′ and y ↔ y′, then x + y ↔ x′ + y′ and λx ↔ λx′. This completes the proof of the isomorphism of the spaces R and R′.
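To illustrate the correspondence used in the proof, take for R the space of polynomials of degree not exceeding n − 1 with the basis 1, t, ..., tⁿ⁻¹, and for R′ the space of n-tuples of numbers with the basis (1, 0, ..., 0), (0, 1, ..., 0), ..., (0, 0, ..., 1). Both spaces are n-dimensional, and the correspondence of Theorem 2 associates with the polynomial a₀ + a₁t + ... + aₙ₋₁tⁿ⁻¹ the n-tuple (a₀, a₁, ..., aₙ₋₁) of its coefficients; sums of polynomials then go over into sums of n-tuples, and scalar multiples into scalar multiples.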

In § 3 we shall have another opportunity to explore the concept of isomorphism.

5. Subspaces of a vector space.

DEFINITION 7. A subset R′ of a vector space R is called a subspace of R if it forms a vector space under the operations of addition and scalar multiplication introduced in R.

In other words, a set R′ of vectors x, y, ... in R is called a subspace of R if x ∈ R′, y ∈ R′ implies x + y ∈ R′, λx ∈ R′.

It is clear that every subspace R′ of a vector space R must contain the zero element of R.

EXAMPLES. 1. The zero or null element of R forms a subspace of R.
2. The whole space R forms a subspace of R.
The null space and the whole space are usually referred to as improper subspaces. We now give a few examples of non-trivial subspaces.
3. Let R be the ordinary three-dimensional space. Consider any plane in R going through the origin. The totality R′ of vectors in that plane form a subspace of R.
4. In the vector space of n-tuples of numbers all vectors x = (ξ₁, ξ₂, ..., ξₙ) for which ξ₁ = 0 form a subspace. More generally, all vectors x = (ξ₁, ξ₂, ..., ξₙ) such that
a₁ξ₁ + a₂ξ₂ + ... + aₙξₙ = 0,
where a₁, a₂, ..., aₙ are arbitrary but fixed numbers, form a subspace.
5. The totality of polynomials of degree ≤ n form a subspace of the vector space of all continuous functions.

Since a subspace of a vector space is a vector space in its own right we can speak of a basis of a subspace as well as of its dimensionality. It is clear that the dimension of an arbitrary subspace of a vector space does not exceed the dimension of that vector space.

EXERCISE. Show that if the dimension of a subspace R′ of a vector space R is the same as the dimension of R, then R′ coincides with R.
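As a check of Definition 7 in the case of Example 4, suppose that
a₁ξ₁ + a₂ξ₂ + ... + aₙξₙ = 0 and a₁η₁ + a₂η₂ + ... + aₙηₙ = 0.
Adding the two equations, and multiplying the first by λ, we get
a₁(ξ₁ + η₁) + ... + aₙ(ξₙ + ηₙ) = 0 and a₁(λξ₁) + ... + aₙ(λξₙ) = 0,
so that the set in question is indeed closed under addition and under multiplication by scalars. In the special case a₁ = 1, a₂ = ... = aₙ = 0 (the subspace of all vectors with ξ₁ = 0) the n − 1 vectors (0, 1, 0, ..., 0), ..., (0, 0, ..., 1) form a basis, so that this subspace has dimension n − 1.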

e2. $22..e. 4. . xi then the I rows in the matrix $12. OY are a (finite infinite) set of vectors belonging to R. x2. where a is an arbitrary scalar. Etl. Thus the maximal number of linearly independent vectors in R'. n. e2. e. i. g. Show that every n-dimensional vector space contains /. g. 0 are fixed vectors and ranges over all scalars. eh form a basis of R' Indeed. f. f. is hand the vectors e. 2. Consider the set of vectors of the form x xo °Lei. . It is natural to call this set of vectors by analogy with threedimensional space a line in the vector space R. then the simplest vector spaces are one- dimensional vector spaces.. On the other hand. form a basis in R'.$ 22. Example 2. e7. eh). The subspace R' is referred to as the subspace . the dimension of R'. The subspace R' generated by the linearly independent vectors e1. x2. be 1 vectors in R' and let 1 > k. f. f. let x1. the vectors e1.. If X2 x1= $11e1 + $ 12. + etkek. forms a generated by the vectors e. P Elk $21. is k-dimensional and the vectors e1. e2. Thus a one-dimensional v all vectors me1. g. xi. then the set R' of all (finite) linear combinations of the vectors e. . . $2k E lk must be linearly dependent. g. + enel. But this implies (cf. .12 LECTURES ON LINEAR ALGEBRA A general method for constructing subspaces of a vector space R is implied by the observation that if e. A basis of such a space is a single vector el O. + E12e2 + ' + elkek + E2e. x. where xo and e. EXERCISE.e. . R' contains k linearly independent vectors (i. subspace R' of R. If we ignore null spaces. subspaces of dimension / 1. . page 5) the linear dependence of the vec ors x. This subspace is the smallest subspace of R containing the vectors e. $22. e2.

are fixed numbers not all of which are zero) form a subspace 1. Show that the dimension of the subspace generated by the vectors is equal to the maximal number of linearly independent vectors e. among the vectors e. with the appropriate expressions from (6) we get + ee e'l(ae ae2 + ane2 E'(a1. g. g. . all vectors of the form /el ße2. The set of vectors of the form xe ße2. . a211e2 + The determinant of the matrix d in (6) is different from zero e' would be linearly depend(otherwise the vectors e'1. e'2. X= where xo is a fixed vector. f. 1. Transformation of coordinates under change of basis. Show that in the vector space of n-tuples (ei. f. a. ep. e'2 = tine' + ae. e' be two bases of an n-dimensional vector space. a. is called a (two-dimensional) plane.. + 6/1E1 (a1. en and e'1. Let ei be the coordinates of a vector x in the first basis and its coordinates in the second basis. ---a2E. ent).e1±a21e2+ ±a2e) + aen). let the connection between them be given 6. where el and e2 are fixed linearly independent vectors and a and fi are arbitrary numbers form a two-dimensional vector space. e'o. + Replacing the e'. EXERCISES. ' of real numbers the set of vectors satisfying the relation + ane.. Let e2. of dimension n Show that if two subspaces R. and 112 of a vector space R have only the null vector in common then the sum of their dimensions does not exceed the dimension of R. ae) . + an. Further.n-DIMENSIONAL SPACES 13 Similarly.en. = ae (6) a21e2 + a22 e2 + + ae. = ame. . by the equations e'. Then x = Ele x e1e1 $2e2 + $2e2 + $e = E'le'.

14 LECTURES ON LINEAR ALGEBRA Since the e. Then buE1 + 1)12E2 + . bnnen e'n = b11 b2$2 + where the b are the elements of the inverse of the matrix st. The simplest way of introducing these concepts is the following.21. plane. ei2 = bn + b22$2 + + b2Jn.V. many concepts of so-called Euclidean geometry cannot be forniulated in terms of addition and multiplication by scalars. . the coordinates of a vector are transformed by means of a matrix ri which is the inverse of the transpose of the matrix at in (6) which determines the change of basis. Thus. By means of these operations it is possible to define in a vector space the concepts of line. To rephrase our result we solve the system (7) for ¿'i. on both sides of the above equation must be the same. + binen. We define this concept axiomatically. etc. Euclidean space 1. parallelism of lines. aniVi + a. the inner product of vectors. However. Definition of Euclidean space. § 2. + + anE'n. Using the inner product operation in addition to the operations of addi- . dimension. are linearly independent. angles between vectors. Hence auri + an (7) + + rn E2 = an VI en a22 E'2 + + a2netn. Thus the coordinates of the vector x in the first basis are express- ed through its coordinates in the second basis by means of the matrix st which is the transpose of . In the preceding section a vector space was defined as a collection of elements (vectors) for which there are defined the operations of addition and multiplication by scalars. Instances of such concepts are: length of a vector. the coefficients of the e. We take as our fundamental concept the concept of an inner product of vectors.
tion and multiplication by scalars we shall find it possible to develop all of Euclidean geometry.

DEFINITION 1. If with every pair of vectors x, y in a real vector space R there is associated a real number (x, y) such that
1. (x, y) = (y, x),
2. (λx, y) = λ(x, y)   (λ real),
3. (x₁ + x₂, y) = (x₁, y) + (x₂, y),
4. (x, x) ≥ 0 and (x, x) = 0 if and only if x = 0,
then we say that an inner product is defined in R.

A vector space in which an inner product satisfying conditions 1 through 4 has been defined is referred to as a Euclidean space.

EXAMPLES. 1. Let us consider the (three-dimensional) space R of vectors studied in elementary solid geometry (cf. Example 1, § 1). Let us define the inner product of two vectors in this space as the product of their lengths by the cosine of the angle between them. We leave it to the reader to verify the fact that the operation just defined satisfies conditions 1 through 4 above.

2. Consider the space R of n-tuples of real numbers. Let x = (ξ₁, ξ₂, ..., ξₙ) and y = (η₁, η₂, ..., ηₙ) be in R. In addition to the definitions of addition
x + y = (ξ₁ + η₁, ξ₂ + η₂, ..., ξₙ + ηₙ)
and multiplication by scalars
λx = (λξ₁, λξ₂, ..., λξₙ)
with which we are already familiar from Example 2, § 1, we define the inner product of x and y as
(x, y) = ξ₁η₁ + ξ₂η₂ + ... + ξₙηₙ.
It is again easy to check that properties 1 through 4 are satisfied by (x, y) as defined.

3. Without changing the definitions of addition and multiplication by scalars in Example 2 above we shall define the inner product of two vectors in the space of Example 2 in a different and more general manner. Thus let ||aᵢₖ|| be a real n × n matrix. Let us put
(1) (x, y) = a₁₁ξ₁η₁ + a₁₂ξ₁η₂ + ... + a₁ₙξ₁ηₙ
           + a₂₁ξ₂η₁ + a₂₂ξ₂η₂ + ... + a₂ₙξ₂ηₙ
           + ...............................
           + aₙ₁ξₙη₁ + aₙ₂ξₙη₂ + ... + aₙₙξₙηₙ,
i.e.,
(x, y) = Σ aᵢₖξᵢηₖ   (i, k = 1, ..., n).

We can verify directly the fact that this definition satisfies Axioms 2 and 3 for an inner product regardless of the nature of the real matrix ||aᵢₖ||. For Axiom 1 to hold, that is, for (x, y) to be symmetric relative to x and y, it is necessary and sufficient that
(2) aᵢₖ = aₖᵢ,
i.e., that ||aᵢₖ|| be symmetric.

Axiom 4 requires that the expression
(3) (x, x) = Σ aᵢₖξᵢξₖ
be non-negative for every choice of the n numbers ξ₁, ξ₂, ..., ξₙ and that it vanish only if ξ₁ = ξ₂ = ... = ξₙ = 0.

The homogeneous polynomial or, as it is frequently called, quadratic form in (3) is said to be positive definite if it takes on non-negative values only and if it vanishes only when all the ξᵢ are zero. Thus for Axiom 4 to hold the quadratic form (3) must be positive definite.

In summary, for (1) to define an inner product the matrix ||aᵢₖ|| must be symmetric and the quadratic form associated with ||aᵢₖ|| must be positive definite.

If we take as the matrix ||aᵢₖ|| the unit matrix, i.e., if we put aᵢᵢ = 1 and aᵢₖ = 0 (i ≠ k), then the inner product (x, y) defined by (1) takes the form
(x, y) = Σ ξᵢηᵢ
and the result is the Euclidean space of Example 2.

EXERCISE. Show that the matrix
(0 1)
(1 0)
cannot be used to define an inner product (the corresponding quadratic form is not positive definite), and that the matrix
(1 1)
(1 2)
can be used to define an inner product satisfying the axioms 1 through 4.
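For instance, the symmetric matrix with entries a₁₁ = 2, a₁₂ = a₂₁ = 1, a₂₂ = 1 yields the quadratic form
(x, x) = 2ξ₁² + 2ξ₁ξ₂ + ξ₂² = ξ₁² + (ξ₁ + ξ₂)²,
which is non-negative and vanishes only when ξ₁ = ξ₂ = 0. The form is therefore positive definite, and this matrix can accordingly be used to define an inner product in the space of pairs of numbers.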

In the sequel (§ 6) we shall give simple criteria for a quadratic form to be positive definite.

4. Let the elements of a vector space be all the continuous functions on an interval [a, b]. We define the inner product of two such functions as the integral of their product
(f, g) = ∫ₐᵇ f(t)g(t) dt.
It is easy to check that the Axioms 1 through 4 are satisfied.

5. Let R be the space of polynomials of degree ≤ n − 1. We define the inner product of two polynomials as in Example 4:
(P, Q) = ∫ₐᵇ P(t)Q(t) dt.

2. Length of a vector. Angle between two vectors. We shall now make use of the concept of an inner product to define the length of a vector and the angle between two vectors.

DEFINITION 2. By the length of a vector x in Euclidean space we mean the number
(4) √(x, x).
We shall denote the length of a vector x by the symbol |x|.

It is quite natural to require that the definitions of length of a vector, of the angle between two vectors and of the inner product of two vectors imply the usual relation which connects these quantities. In other words, it is natural to require that the inner product of two vectors be equal to the product of the lengths of these vectors times the cosine of the angle between them. This dictates the following definition of the concept of angle between two vectors.

DEFINITION 3. By the angle between two vectors x and y we mean the number
φ = arc cos [(x, y) / (|x| |y|)],
i.e., we put
(5) cos φ = (x, y) / (|x| |y|).
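For example, in the space of n-tuples with the inner product of Example 2, the vectors x = (1, 0, 0, ..., 0) and y = (1, 1, 0, ..., 0) have lengths |x| = √(x, x) = 1 and |y| = √(y, y) = √2, while (x, y) = 1; hence cos φ = 1/√2, and the angle between x and y is π/4.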

The vectors x and y are said to be orthogonal if (x, y) = 0. The angle between two non-zero orthogonal vectors is clearly π/2.

The concepts just introduced permit us to extend a number of theorems of elementary geometry to Euclidean spaces. The following is an example of such extension.

If x and y are orthogonal vectors, then it is natural to regard x + y as the diagonal of a rectangle with sides x and y. We shall show that
|x + y|² = |x|² + |y|²,
i.e., that the square of the length of the diagonal of a rectangle is equal to the sum of the squares of the lengths of its two non-parallel sides (the theorem of Pythagoras).

Proof: By definition of length of a vector,
|x + y|² = (x + y, x + y).
In view of the distributivity property of inner products (Axiom 3),
(x + y, x + y) = (x, x) + (x, y) + (y, x) + (y, y).
Since x and y are supposed orthogonal,
(x, y) = (y, x) = 0.
Thus
|x + y|² = (x, x) + (y, y) = |x|² + |y|²,
which is what we set out to prove.

This theorem can be easily generalized to read: if x, y, z, ... are pairwise orthogonal, then
|x + y + z + ...|² = |x|² + |y|² + |z|² + ....

3. The Schwarz inequality. In para. 2 we defined the angle between two vectors x and y by means of the relation
cos φ = (x, y) / (|x| |y|).¹
If φ is to be always computable from this relation we must show that
−1 ≤ (x, y) / (|x| |y|) ≤ 1
or, equivalently,
(x, y)² / (|x|² |y|²) ≤ 1,
which, in turn, is the same as
(6) (x, y)² ≤ (x, x)(y, y).
Inequality (6) is known as the Schwarz inequality.

Thus, before we can correctly define the angle between two vectors by means of the relation (5) we must prove the Schwarz inequality.²

To prove the Schwarz inequality we consider the vector x − ty where t is any real number. In view of Axiom 4 for inner products,
(x − ty, x − ty) ≥ 0,
i.e., for any t,
t²(y, y) − 2t(x, y) + (x, x) ≥ 0.
This inequality implies that the polynomial cannot have two distinct real roots. Consequently, the discriminant of the equation
t²(y, y) − 2t(x, y) + (x, x) = 0
cannot be positive, i.e.,
(x, y)² − (x, x)(y, y) ≤ 0,
which is what we wished to prove.

EXERCISE. Prove that a necessary and sufficient condition for (x, y)² = (x, x)(y, y) is the linear dependence of the vectors x and y.

We have proved the validity of (6) for an axiomatically defined Euclidean space. It is now appropriate to interpret this inequality in the various concrete Euclidean spaces considered in para. 1.

EXAMPLES. 1. In the case of Example 1, inequality (6) tells us nothing new (cf. the remark preceding the proof of the Schwarz inequality).

¹ We could have axiomatized the notions of length of a vector and angle between two vectors rather than the notion of inner product. However, this course would have resulted in a more complicated system of axioms than that associated with the notion of an inner product.

² Note, however, that in para. 1, Example 1, of this section there is no need to prove this inequality. Namely, in vector analysis the inner product of two vectors is defined in such a way that the quantity (x, y)/(|x| |y|) is the cosine of a previously determined angle between the vectors.

X) 2=1 E Et2. where and (3) E aikeiek >. 2--1 and inequality (6) becomes i=1 ei MY 5- n ( )( n Ei2 i=1 tif2).20 2 LECTURES ON LINEAR ALGEBRA In Example 2 the inner product was defined as (x. .n=1 aikeink. If x and y are two vectors in a Euclidean space R then (7) Ix + Yi [xi + 1Yr- . y) = E t=1 It follows that (X. En anakk. . and /72. Hence (6) takes the form 0. i=1 In Example 3 the inner product was defined as (1) Y) i. This inequality plays an important role in many problems of analysis.) fb In Example 4 the inner product was defined by means of the integral 1(1)g (t) dt. We now give an example of an inequality which is a consequence of the Schwarz inequality. then the following inequality holds: 2 ( E ao.$((o) k=1 ( n n E aikeik)( E 6111115112) k=1 i. (y. 71 in the inequality just derived. k=1 for any choice of the ¿i. En. (Hint: Assign suitable values to the numbers Ei. 2=1 EXERCISE. af(t)g(t) dt))2 fba [f(t)J' dt [g(t)p dt. Show that if the numbers an satisfy conditions (2) and (3). Hence (6) implies that il the numbers an satisfy conditions (2) and (3). y) = E i)(8.

y) + (y. Orthogonal bases play the same role in Euclidean spaces which rectangular coord nate systems play in analytic geometry. In a vector space there is no reason to prefer one basis to another. Isomorphism of Euclidean spaces I.. § 3. if e. of R onto itself which bases in R. x)+21x1 lyi+ (y.. . e'. Y) = fix1+IYI)2. Orthogonal basis. e of an nDEFINITION 1.. Interpret inequality (7) in each of the concrete Euclidean spaces considered in the beginning of this section. In § 1 we introduced the notion of a basis (coordinate system) of a vector space. the tip of that vector) is defined as the length of the vector x y. 1x±y12 = (x+y. the other space.e 3 Careful reading of the proof of the isomorphism of vector spaces given in § 1 will show that in addition to proving the theorem we also showed that it is possible to construct an isomorphism of two n-dimensional vector spaces which takes a specified basis in one of these spaces into a specified basis in are two e and e'5. Here there is every reason to prefer so-called orthogonal bases to all other bases. 3 Not so in Euclidean spaces. In geometry the distance between two points x and y (note the use of the same symbol to denote a vectordrawn from the origin- and a point. y).. x1 i. In the general case of an n-dimensional Euclidean space we define the distance between x and y by the relation d lx yl. y) (x. and an orthonormal basis i f . it follows that 213E1 Since 2(x. e..e2.n-DIMENSIONAL SPACES 21 Proof: y12 = (x + y. e2. form an orthonormal basis . In particular.y) = (x. The non-zero vectors el. 1x + yl lx EXERCISE. x) 2(x. which is the desired conclusion. each has unit length. the vectors e. then there exists an isomorphic mapping takes the first of these bases into the second. dimensional Euclidean vector space are said to form an orthogonal basis if they are pairwise orthogonal. in addition.e. x+y) SI. Briefly. x -1. Orthogonal basis.

f to an orthogonal basis el. Every n-dimensional Euclidean space contains orthogonal bases. =A1e11+ ' + Ak-iet where the Al are determined from the orthogonality conditim . = f. e2. e2. . . For this definition to be correct we must prove that the vectors ei. we find that A2 = 0.e. i. ei) = O. e are linearly independent. This procedure leads from any basis f. Likewise. The result is 21(e1.22 LECTURES ON LINEAR ALGEBRA (ei. e2. para. e2) + + 2(e1. multiplying (2) by e. etc. where a is chosen so that (e2.. f2. ". e. We put = f1. the definition of an orthogonal basis implies that ei) 0 0. e1.. i. We shall make use of the so-called orthogonalization procedure to prove the existence of orthogonal bases. This means that (f. f. (f. . ek) = 0 for k é L Hence A = O. are linearly independent. form the inner product of each side of (2) with ei).. let en of the definition actually form a basis.2 = A = O. e1)/(e1. To this end we multiply both sides of (2) by el (i. THEOREM 1. Next we put e. el) = 0. Thus. e1).. This proves that el. Proof: By definition of an n-dimensional vector space (§ 1.e. 2. To construct ek we put e. e1. en) = O. Now. We wish to show that (2) implies Ai = 2. ek) = f1 10 if i = k if i k. . 2) such a space contains a basis f1. e..e. Suppose that we have already constructed non-zero pairwise orthogonal vectors el. (e1. el) + A2(e1. A2e2 + + Ae = O.

and the vectors e. e2)/(e2. = (fk. ek_k and fk were used to construct e. 22-2(e2. e2. e2) = 0. .e. are pairwise orthogonal.) = (f2 Since the vectors el. basis. The vector ek is a linear combination of the vectors ek. e2. but we shall make use of this fact presently to prove that e. So far we have not made use of the linear independence of the .. e. n) form an orthonormal basis. e. f. + 2-1 fk. etc. In view of the linear independence of the vectors f1. By continuing the process described above we obtain n non-zero. e2) = (fk = (fk 21e2-1 Aiek.-t. and fk+. (fk. . el) + 22-1(e1. EXAMPLES OF ORTHOGONALIZATION. e2. e2-1) 21(ek_1. e2_2. i. combination of the vector f_. e2) 0. can be used to construct e. e1)/(e1. so . But e. Similar statements hold for e22. 2.. the latter (fk. It follows that 2-1 = (f/c. O. en.ctor e. perpendicular to e. It follows that ek = alfk ci2f2 + . pairwise orthogonal vectors ek. (e2. fa. -I- ' ' + ' + A2e. an orthogonal It is clear that the vectors e'k = ek/lekl (k = 1. be three linearly independent vectors in R.. f. fk we Just as ek. may conclude on the basis of eq. . ' (f2. Let R be the three-dimensional space with which we are familiar from elementary geometry. .. = f. (5) that ek O. el). e.. and lying in the plane determined by e. f2. This proves our theorem. Next select a . e. 02) A1e2-1 + + e2-1) . 1. . = 0. ek. f2. can be written as a linear vectors f1. ek. Put e. et) = 0.n-DIMENSIONAL SPACES 23 (ek. e) = O. e2. = f. equalities become: (fk. .. e1) (ek. e2). e2. O. e2. Let fi.

choose e. then 52e. n2e2 + (x. t. ek) = {01 1ff + ne). (i. We define the inner product of two vectors in this space by the integral fi P(t)Q (t) dt.24 LECTURES ON LINEAR ALGEBRA and f2. 12 form a basis in R. We shall denote the kth element of this basis by Pk(t).e. Apart from multiplicative constants these polynomials coincide with the Legendre polynomials 1 dk (12 1)k 2k k! dtk The Legendre polynomials form an orthogonal. Finally. R. Let R be the space of polynomials of degree not exceeding n 1. As in Example 2 the process of orthogonalization leads to the sequence of polynomials 1. Let R be the three-dimensional vector space of polynomials of degree not exceeding two. i. By dividing each basis vector by its length we obtain an orthonormal basis for R. Since O (t+ I. Multiplying each Legendre polynomial by a suitable constant we obtain an orthonormal basis in R. = 12 + 131 The orthogonality requirements imply ß 0 and y = 1/3. We put e. 7 kk. . Next we put e. If . P 1/3 is an orthogonal basis in R. it follows that a = 0. We define the inner product of two vectors in this space as in the preceding example. e2. = t I. t. perpendicular to ei ande. en be an orthonormal basis of a Euclidean space x= y = ?he...t^-2. i. but not orthonormal basis in R.Y)= Since $2e2 + ee. I) = f (t dt = 2a. t. Thus 1. +77e0. = t2 1/3. y 1. We select as basis the vectors 1. We shall now orthogonalize this basis. Let e1.. . The vectors 1. + n2e2 + + enen. e. (3/5)1.e. -. t. 12 1/3. e. = 1.e. = t. perpendicular to the previously constructed plane). Finally we put e.

(x. n. Show that if f. similarly. Example 2. f'2. = (x.(1). It is natural to call the inner product of a vector x and a vector e of length 1 the projection of x on e. y) = E nikEink. and Y = nifi + + n. except that there we speak of projections on the coordinate axes rather than on the basis vectors. " en and ni. the inner product of two vectors relative to an orthonormal basis is equal to the sum of the products of the corresponding coordinates of these vectors (cf. el) = . 1. then this We shall now find the coordinates of a vector x relative to an orthonormal basis el. EXAMPLES. This is the exact analog of a statement with which we are familiar from analytic geometry.n-DIMENSIONAL SPACES 25 it follows that + Enfl. + e(en . P o(t) be the normed Legendre Let Po(t). + + Enn. (x. y) Thus. Thus the kth coordinate of a vector relative to an orthonormal basis is the inner product of this vector and the kth basis vector. Y) = 071 + " ' Ent.f. . polynomials of degree 0. Let x = eie. (7) E2e2 ene. Show that if in some basis f1. e. where aik = aki and ei e2. e1) = and. EXERCISES 1. Multiplying both sides of this equation by el we get el) + $2(e2 el) + e. § 2). f. . 1. = (x.. . 2. e). P... let Q (t) be an arbitrary polyno- . Cle=1 f is an arbit ary basis. El% + $27)2 + (x. n2. The result just proved may be states as follows: The coordinates of a vector relative to an orthonormal basis are the projections of this vector on the basis vectors. e2). ?I are the coordinates of x and y respectively. then (x. Further.f for every x = basis is orthonormal. e2.

it follows that the functions (8') 1/1/2a. The shortest distance from a point to a subspace.(1) dt. Let R. (l/Vn) cos nt. Q) = fo P(t)Q(t) dt. 2 (8) Consider the system of functions 1. Since .f: sin% kt dt = 27. Perpendicular from a point to a subspace. if k and I. cos nt. (This paragraph may be left out in a first reading.) DEFINITION 2.26 LECTURES ON LINEAR ALGEBRA rnial of degree n. We shall represent Q (t) as a linear combination of the Legendre polynomials. sin t. f. (l/ n) cos t. sin nt. It follows from (7) that I Q(t)P. (l/Vn) sin nt 2. (l/ n) sin t. cos 2t. if it is orthogonal to every vector x e RI. We define an inner product in R1 by the usual integral (P. cos t + b. We shall say that a vector h e R is orthogonal to the subspace R. 2n). o cos% kt dt n 227 . 2n It is easy to see that the system (8) is an orthogonal basis Indeed r 2r Jo cos kt cos It dt = 0 if k 1. + cP(t). A linear combination P(t) = (a012) + a. P(t).. Hence every polynomial Q(t) of degree n can be represented in the forra Q (t) = P(t) + c1P1(1) + c. To this end we note that all polynomials of degree n form an n-dimensional vector space with orthonormal basis Po(t).{0 ldt = 2n. rn sin kt cos It dt = 0.. are an orthonormal basis for R1.o 2r sin kt sin lt dt = 0. sin t + al cos 2t + + 6. be a subspace of a Euclidean space R. sin nt of these functions is called a trigonometric polynomial of degree n. . . cos t. on the interval (0. sin 2t. The totality of trigonometric polynomials of degree n form a (2n + 1) -dimensional space R.

is called the orthogonal projection of f on the subspace R1.. + Hence. 2.41. In other words. we shall show that if f.) = O.. ej = O (k = 1. orthogonal to any linear combination of these vectors. and is therefore orthogonal to h = f f... as a difference of two vectors in RI. the vector fo f1 belongs to R. We pose the problem of dropping a perpendicular from the point f to 121. we note that f (f c2e2 + c. Indeed. fo.1.. = To find the c. Indeed. must be orthogonal to III.f1I2 = If - so that If - > If e. . et) = 0 (1= 1. 2.e. how to drop a perpendicular from f on 141).e. e.. i. .e.. + 2. in R1 such that the vector h f f is orthogonal to R1. m) implies that for any numbers 2. of finding a vector f.e. ej = (f. be a basis of R1. e. (h.e. of f on the subspace R1 (i. f. If - > If . i.e.f1I2 = If . We shall see in the sequel that this problem has always a unique solution. (f0. be an m-dimensional subspace of a (finite or infinite dimensional) Euclidean space R and let f be a vector not belonging to lt1. As a vector in 121. in). Let el.. By the theorem of Pythagoras If 412 + 14 . must be of the form f.. . or. (h. then.f0 + 4 . We shall now show how one can actually compute the orthogo- nal projection f. The vector f. 1111 is the shortest distance from f to RI. just as in Euclidean geometry. Let R. e R1 and f1 f.n-DIMENSIONAL SPACES 27 e then it is also If h is orthogonal to the vectors e. . Right now we shall show that. f. 22. to a basis of 12. i. ej. 21e1 + 22. for a vector h to be orthogonal to an m-dimensional subspace of R it is sufficient that it be orthogonal to ni linearly independent vectors in It1.

. e1) ' (ex. . c.) must be different from zero. c2. . e. The method of least squares. . Since the c.. in such a basis the system (11) goes over into the system ci = (f. the system (11) must also have a unique solution. this system has a unique solution.. e2.. . e1) + + c.. e. 1. ei) (e (e2. i.) (e2.) (e e. . let y= + c. We first consider the frequent case when the vectors e1. we have proved that for every vector f there exists a unique orthogonal projection f. in view of the established existence and uniqueness of the vector f0. cm with respect to the basis el. Indeed. EXAMPLES. This determinant is known as the Gramm determinant of the vectors e1.). e. of the orthogonal projection fo of the vector f on the subspace 111 are determined from the system (12) or from the system (11) according as the c. are the coordinates of to relative to an orthonormal basis of R1 or a non-orthonormal basis of A system of m linear equations in in unknowns can have a unique solution only if its determinant is different from zero. Indeed. (e2. . e1) (k = 1.. Thus. e. e.x . e2. 2. e1) c2(e2. en. It follows that the determinant of the system (11) (el. Since it is always possible to select an orthonormal basis in an m-dimensional subspace. m). are orthonormal. the coordinates c.(e. x2.28 LECTURES ON LINEAR ALGEBRA Replacing f. on the subspace We shall now show that for an arbitrary basis el. by the expression in (9) we obtain a system of m equations for the c. this vector has uniquely determined coordinates el. Let y be a linear function of x1. e..) (era. e2. e1) = (f.. e2.(ei. In this case the problem can be solved with ease. satisfy the system (11). e2) (ex. x...e..

= (x11. X. x. in (13) are as "close" as possible to the corresponding right sides. x2. let us consider the n-dimensional Euclidean space of n-tuples and the following vectors: e.e. c. Frequently the c.(y1. the system (13) is usually incompatible and can be solved only approximately.n and the problem of minimizing the f to clel c2e2 + mean deviation is equivalent to the problem of choosing ni . X2n). and y.e. = y . 4 4. x2. As a measure of "closeness" we take the so-called mean deviation of the left sides of the equations from the corresponding free terms. x. i. e2. c2. are fixed unknown coefficients. cm from the system of equa- + XnaCm = y1. Indeed. are determined experimentally. and + ciel c2e2 + Consequently.. Cm so as to minimize the distance from f to numbers e1. the quantity k=1 E (X1kC1 X2nC2 + + XinkCk Ykr The problem of minimizing the mean deviation can be solved directly. To this end one carries out a number of measurements of ael. y) in that space. . There arises the problem . + Xm2 + xmc. tions X21C2 + Xl2C1 + X22 C2 + X11C1 . e2 = (X21 f =_. y. of the vector X22.e. its solution can be immediately obtained from the results just presented. Thus. One could try to determine the coefficients c1. .e. x).2. Let x. denote the results of the kth measurement. x2c2 + However usually the number n of measurements exceeds the number m of unknowns and the results of the measurements are xincl never free from error. e2e2 . c2. If R1 is the subspace spanned by fo = c.. x1). The right sides of (13) are the components of the vector f and the left sides. . .. x. = Y2. cm so that the left sides of the equations of determining t1. y2. (14) represents the square of the distance from + c. = (x. However..n-DIMENSIONAL SPACES 29 where the c.

(13') When the system (13) consists of n equat ons in one unknown xic x2c = y2. In this case the normal system Solution: el consists of the single equation (e1. e2. 4. e.). e2)c1 (e2. e2)c2 + ' (e2. e.c2 k I . ek) = 1-1 where (f.. 5). el)a. The method of approximate solution of the system (13) which we have just described is known as the method of least squares. x) k=1 xox x. 3. . (e1. f = (3.30 LECTURES ON LINEAR ALGEBRA the vectors el. (2. y) (x. = (f. (supposed linearly independent). EXERCISE.. = (f. then our problem is the problem of finding the projection of f on RI. The system of equations (15) is referred to as the system of normal equations.. xc the (least squares) solution is c (x. c2. (15) (e1. ei)c. 29c = 38. = (f. en)c. ex) = I =1 xxiYs. c = 38/29. enc = (ee f). e2). e1)c2 + (e2. Use the method of least squares to solve the system of equations 2c 3c 3 4 4c = 5. which solve this problem are found from the system of equations (e1. + + (e. e1). . formula (11)). en)ci + (e. the numbers c1. As we have seen (cf. 4). (e e2)c. em)c. xoxk.

Let us consider the space R of continuous functions on the interval 10. Since the functions 1 eo V2. (x.Mich differs from f(t) by as little as possible. k=0 where or ck = (t. We shall measure the proximity of 1(t) and P (t) by means of the integral u(t) . g) = 21. y1). Thus.7 ' e. that polynomial for which the mean deviation from f (t) is a minimum. para. which is closest to fit). cos t b. cos nt . 2. y). Then the length of a vector f(t) in R is given by = 6atr EN)? dt. sin t + + an cos nt b sin nt. The trigonometric polynomials (17) form a subspace R.n-DIMENSIONAL SPACES 31 In this case the geometric significance of c is that of the slope of a line through the origin which is "as close as possible" to the points (x1. by means of the integral (I. Let (t) be a continuous function on the interval [0. is P(t) = E ce1. e. sin t Or sin nt Ahr . of R of dimension 2n + 1. .f(t)g(t) dt. Approximation of functions by means of trigonometric polynomials.13(1)i2 dl. e. Our problem is to find that vector of R. Example 2). the required element P(t) of R. we are to find among all trigonometric polynom als of degree n. P(t) = (a012) + a. 2:r] in which the inner product is defined. el)._. Consequently. the mean deviation (16) is simply the square of the distance from j(t) to P(t). as usual. (x2. 2n form an orthonormal basis in R. and this problem is solved by dropping a perpendicular from f(t) to R1. (cf. cos t Nhz 1 e. It is frequently necessary to find a trigonometric polynomial P(t) of given degree v. 1. y2). 2n]. .

y').. =7 P(t) := . We observe that if in some n-dimensional Euclidean space R a theorem stated in terms of addition. In each of them the word "vector" had a different meaning. If x x'. the inner products of corresponding pairs of vectors are to have the same value.. 27 1 J. if OW correspondence associates with X E R the vector X' E R' and with y e R the vector y' e R'. a. Two Euclidean spaces R and R'. then Axe> Ax'. cos kt + b sin kt k=1 n 127 5 X o fit) dt. e. The numbers a. 3. x' e R') such that I. scalar multiplication and inner multiplication of vectors has been proved.. "vector" stood for an n-tuple of real numbers.. Isomorphism of Euclidean spaces. then it associates with the sum x + y the sum x' y'. then the same . Thus in § 2. a =- 1 x 5 2 1(1) cos kt dt. o b= 5 "Jo 127 f(t) sin kt dt.32 n-DIMENSIONAL SPACES = Vart o -I 27 f(t)dt. Example 5. If x 4> x' and y 4--> y'. y) = (x'. then (x..1c. for the mean deviation of the trigonometric polynomial from f(t) to be a minimum the coefficients a and bk must have the values a. To be more specific: DEFINITION 2.2" Vn o f(t) cos kt dt. then x + y X' + y'. If X 4> X' and y 4> y'.. etc.e. it stood for a polynomial. 1 tik.. C2k - ThuS.. Example 2. i.e. are said to be isomorphic if it is possible to establish a one-to-one correspondence x 4> x' (x e R. ' 7( /0 1(t) sin kt dt.. in § 2. 2 * Ea. The question arises which of these spaces are fundamentally different and which of them differ only in externals.. We have investigated a number of examples of n-dimensional Euclidean spaces. and bk defined above are called the Fourier coefficients of the function fit). i.

en be an orthonormal basis in R (we showed earlier that every Euclidean space contains such a basis). y') = El% + $2n2 + + $nn. .. 3 of the definition of isomorphism. 2. . Y) = $1ni + $2n2 + because of the assumed orthonormality of the e. All Euclidean spaces of dimension n are isomorphic.. if we replaced vectors from R appearing in the statement and in the proof of the theorem by corresponding vectors from R'. i. (x. then. 82. + Ennn.n-DIMENSIONAL SPACES 33 theorem is valid in every Euclidean space R'. 8n) in R'. that the inner products of corresponding pairs of vectors have the same value. The following theorem settles the problem of isomorphism of different Euclidean vector spaces. It remains to prove that our correspondence satisfies condition 3 of the definition of isomorphism. Indeed. in view of the properties 1. . § 2. The one-to-one nature of this correspondence is obvious. ¿2. Let el. We shall show that all n-dimensional Euclidean spaces are isomorphic to a selected "standard" Euclidean space of dimension n. all arguments would remain unaffected. in which a vector is an n-tuple of real numbers and in which the inner product of two vectors x' = (E1. Conditions 1 and 2 are also immediately seen to hold. This will prove our theorem. . We now show that this correspondence is an isomorphism. isomorphic to the space R. THEOREM 2. nn) is defined to be a (x'. As our standard n-dimensional space R' we shall take the space of Example 2. = Eft?' + $2n2 + + Now let R be any n-dimensional Euclidean space.e. e2. Clearly. We associate with the vector x= e2e2 + + ene in R the vector = (81. e) and y' = (7? n . On the other hand. the definition of inner multiplication in R' states that (x'.

Thus $(x', y') = (x, y)$; i.e., the inner products of corresponding pairs of vectors have indeed the same value. This completes the proof of our theorem.

EXERCISE. Prove this theorem by a method analogous to that used above.

The following is an interesting consequence of the isomorphism theorem. Any "geometric" assertion (i.e., an assertion stated in terms of addition, inner multiplication and multiplication of vectors by scalars) pertaining to two or three vectors is true if it is true in the elementary geometry of three space. Indeed, the vectors in question span a subspace of dimension at most three. This subspace is isomorphic to ordinary three space (or a subspace of it), and it therefore suffices to verify the assertion in the latter space.

To illustrate, inequality (7) of § 2,

$|x + y| \le |x| + |y|$,

is stated and proved in every textbook of elementary geometry as the proposition that the length of the diagonal of a parallelogram does not exceed the sum of the lengths of its two non-parallel sides, and is therefore valid in every Euclidean space. Again, in the space of continuous functions on $[a, b]$ the Schwarz inequality

$\left( \int_a^b f(t) g(t) \, dt \right)^2 \le \int_a^b [f(t)]^2 \, dt \cdot \int_a^b [g(t)]^2 \, dt$

is a direct consequence, via the isomorphism theorem, of the corresponding proposition of elementary geometry. We thus have a new proof of the Schwarz inequality.

§ 4. Bilinear and quadratic forms

In this section we shall investigate the simplest real valued functions defined on vector spaces.

. e'2. + E2e2 + Thus. e2. + ac21e2 + acne. x a vector whose coordinates in the given basis are E1. en be a basis in an n-dimensional vector space. . is the dependence of the a. let 2. . = 1. e2. . Linear functions. e'n be two bases in R. then (1) f(x) = aiel a252+ -in amen. A linear function (linear form) f is said to be defined on a vector space if with every vector x there is associated a number f(x) so that the following conditions hold: _fix f(x) +AY). on the choice of a basis. Linear functions are the simplest functions defined on vector spaces. ¿j(e2) + + enf(e. DEFINITION I. $2. e2.e. What must be remembered.) f(x) = f&ie. e by means of the equations e'. e and e'1. the properties of a linear function imply that + ene) eifie. E. n). Thus let et. however. Let the e'. The definition of a linear function given above coincides with the definition of a linear function familiar from algebra.). Let et. = acne. Y) !(Aa) = 1f (x). en is a basis of an n-dimensional vector space R. Since every vector x can be represented in the form x= enen.e2 + + 2. + anen f(X) = a252 . The exact nature of this dependence is easily where f(e) = explained.2. if et. and f a linear function defined on R. 2. .n-DIMENSIONAL SPACES 35 1. e2. e' Further. be expressed in terms of the basis vectors et. + 122e2 + e'2 + ocnIenr + (Xn2en. .

. n as constants. y) + A (x2. In what follows an important role is played by bilinear and quadratic forms (functions). en. EXAMPLES. Consider the n-dimensional space of n-tuples of real numbers. Since a. yi) + A (x. A (x. + a'2E12 + e'. A (x. ey. 2. y) is said to be a bilinear function (bilinear form) of the vectors x and y if for any fixed y. for any fixed x. A (x. bt2. A (x. y). In other words. y) is a linear function of y.. E).. e2. A (x. y) = A (x. y) = + an 27/1 anlennl a12e072 anE27/2 ' ' ' ' + n a2 ne2n an2enn2 + + annennn A (x. pty) = yA(x. if $1. if we regard ni . y). e'2. + y2) = A (x. DEFINITION 2. Again. + + akf(e. if we keep y fixed. yz). A (x.36 LECTURES ON LINEAR ALGEBRA relative to the basis el. y) is a linear function of y. Let x = ($1. y). and + a'e' f(x) = a'le'. $2. as it is sometimes said. Bilinear forms. i. y = n2. k1 ae ink depends linearly on the $. A (x.(E1. en). nk). 1.) and a' k = f(e'). y) is a linear function of x. y) is a linear function of x =. noting the definition of a linear function. . Indeed. . y) is a bilinear function. A (Ax. E are kept constant. y) = (x.. i. n2.) = ctik + c(2k az + This shows that the coefficients of a linear form transform under a change of basis like the basis vectors (or. it follows that cc2ke2 + + C(nk en) = Xlki(ei) ac22f(e2) ai = f(xikei + anka. = f(e. . conditions 1 and 2 above state that A (xi + x2. cogrediently). relative to the basis e'1.e. and define (2) A (x.

. t. A( f. g) is the product of the linear functions!: f(s) ds and Jab g(t) dt. y) in a Euclidean space is an example of a symmetric bilinear form. g) = f(s)g(t) ds dt = f(s) ds g(t) dt. t)f(s)g(t) ds dt.n-DIMENSIONAL SPACES 37 2. + 71e). y relative to the basis e1. symmetric 21 A bilinear function (bill ear form) is called A (x. then If K(s.. then their product 1(x) g(y) is a bilinear function. n. Let K(s. Show that if 1(x) and g(y) are linear functions. space.e. e. We shall express the bilinear form A (x. t) be a (fixed) continuous function of two variables s. i. Conditions 2 have analogous meaning. Indeed. EXERCISE. ri of coordinates ei. bilinear form. The inner product (x. y) = A (y x) for arbitrary vectors x and y. y) defined by (2) is symmetric if and only if aik= aid for all i and k. + e en. e2. may be removed from under the integral sign. )72e2 + A (x. The matrix of a bilinear form. y) using the e of x and the coordinates ni. 1. In Example / above the bilinear form A (x. e2. in this case. Axioms I. 3 in the definition of an inner product (§ 2) say that the inner product is a symmetric. t) b jib A (f. /he.. Now let el. A (f. Indeed. g) = I then A (f. If we put b sb K(s. Let R be the space of continuous functions f(t). e2. 2. We defined a bilinear form en be a basis in n-dimensional axiomatically. DEFINITION 3. that the integral of a sum is the sum of the integrals and the second part of condition 1 that the constant A. Thus. the first part of condition 1 of the definition of a bilinear form means. g) is a bilinear function of the vectors f and g. y) = A (ei ei $2e2 + In view of the properties 1 and 2 of bilinear forms . 3.

$A(x, y) = A(\xi_1 e_1 + \xi_2 e_2 + \cdots + \xi_n e_n, \ \eta_1 e_1 + \eta_2 e_2 + \cdots + \eta_n e_n) = \sum_{i,k=1}^{n} A(e_i, e_k) \, \xi_i \eta_k$.

If we denote the constants $A(e_i, e_k)$ by $a_{ik}$, the matrix

$\mathscr{A} = \|a_{ik}\|$

is called the matrix of the bilinear form $A(x, y)$ relative to the basis $e_1, e_2, \ldots, e_n$. Thus, given a basis $e_1, e_2, \ldots, e_n$, the form $A(x, y)$ is determined by its matrix $\mathscr{A} = \|a_{ik}\|$.

To sum up: Every bilinear form in $n$-dimensional space can be written as

(3)   $A(x, y) = \sum_{i,k=1}^{n} a_{ik} \, \xi_i \eta_k$,

where $x = \xi_1 e_1 + \cdots + \xi_n e_n$, $y = \eta_1 e_1 + \cdots + \eta_n e_n$, and

(4)   $a_{ik} = A(e_i, e_k)$.

EXAMPLE. Let $R$ be the three-dimensional vector space of triples $(\xi_1, \xi_2, \xi_3)$ of real numbers. We define a bilinear form in $R$ by means of the equation

$A(x, y) = \xi_1 \eta_1 + 2 \xi_2 \eta_2 + 3 \xi_3 \eta_3$.

Let us choose as a basis of $R$ the vectors

$e_1 = (1, 1, 1), \quad e_2 = (1, 1, -1), \quad e_3 = (1, -1, -1)$,

and compute the matrix $\mathscr{A}$ of the bilinear form $A(x, y)$ relative to this basis. Making use of (4) we find that

$a_{11} = 1 + 2 + 3 = 6$, $\quad a_{12} = a_{21} = 1 + 2 - 3 = 0$, $\quad a_{13} = a_{31} = 1 - 2 - 3 = -4$,
$a_{22} = 1 + 2 + 3 = 6$, $\quad a_{23} = a_{32} = 1 - 2 + 3 = 2$, $\quad a_{33} = 1 + 2 + 3 = 6$,

so that

$\mathscr{A} = \begin{pmatrix} 6 & 0 & -4 \\ 0 & 6 & 2 \\ -4 & 2 & 6 \end{pmatrix}$.

It follows that if the coordinates of $x$ and $y$ relative to the basis $e_1, e_2, e_3$ are denoted by $\xi'_1, \xi'_2, \xi'_3$ and $\eta'_1, \eta'_2, \eta'_3$ respectively, then

$A(x, y) = \sum_{i,k=1}^{3} a_{ik} \, \xi'_i \eta'_k = 6\xi'_1\eta'_1 - 4\xi'_1\eta'_3 + 6\xi'_2\eta'_2 + 2\xi'_2\eta'_3 - 4\xi'_3\eta'_1 + 2\xi'_3\eta'_2 + 6\xi'_3\eta'_3$.

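The arithmetic of the example can be checked mechanically: by (4) the matrix is obtained by evaluating the form on every pair of basis vectors. A small sketch, not part of the original text and assuming Python with NumPy, using the form and basis of the example:

```python
import numpy as np

def bilinear_matrix(A, basis):
    """Matrix a_ik = A(e_i, e_k) of a bilinear form relative to the given basis."""
    return np.array([[A(ei, ek) for ek in basis] for ei in basis])

# the form A(x, y) = xi_1 eta_1 + 2 xi_2 eta_2 + 3 xi_3 eta_3 of the example
A = lambda x, y: x[0] * y[0] + 2 * x[1] * y[1] + 3 * x[2] * y[2]
basis = [np.array(v, dtype=float) for v in [(1, 1, 1), (1, 1, -1), (1, -1, -1)]]
print(bilinear_matrix(A, basis))
# [[ 6.  0. -4.]
#  [ 0.  6.  2.]
#  [-4.  2.  6.]]
```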
4. Transformation of the matrix of a bilinear form under a change of basis. Let $e_1, e_2, \ldots, e_n$ and $f_1, f_2, \ldots, f_n$ be two bases of an $n$-dimensional vector space. Let the connection between these bases be described by the relations

(5)   $f_1 = c_{11} e_1 + c_{21} e_2 + \cdots + c_{n1} e_n$,
      $f_2 = c_{12} e_1 + c_{22} e_2 + \cdots + c_{n2} e_n$,
      $\cdots$
      $f_n = c_{1n} e_1 + c_{2n} e_2 + \cdots + c_{nn} e_n$,

which state that the coordinates of the vector $f_k$ relative to the basis $e_1, e_2, \ldots, e_n$ are the numbers $c_{1k}, c_{2k}, \ldots, c_{nk}$. The matrix

$\mathscr{C} = \begin{pmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{pmatrix}$

is referred to as the matrix of transition from the basis $e_1, e_2, \ldots, e_n$ to the basis $f_1, f_2, \ldots, f_n$.

Let $\mathscr{A} = \|a_{ik}\|$ be the matrix of a bilinear form $A(x, y)$ relative to the basis $e_1, e_2, \ldots, e_n$. Our problem consists in finding the matrix $\mathscr{B} = \|b_{pq}\|$ of that form relative to the basis $f_1, f_2, \ldots, f_n$.

By definition [eq. (4)], $b_{pq} = A(f_p, f_q)$; i.e., $b_{pq}$ is the value of our bilinear form for $x = f_p$, $y = f_q$. To find this value we make use of (3), where in place of the $\xi_i$ and $\eta_k$ we put the coordinates of $f_p$ and $f_q$ relative to the basis $e_1, e_2, \ldots, e_n$, i.e., the numbers $c_{ip}$ and $c_{kq}$. It follows that

(6)   $b_{pq} = \sum_{i,k=1}^{n} a_{ik} c_{ip} c_{kq}$.

We shall now express this result in matrix form. As is well known, the element $c_{ik}$ of a matrix $\mathscr{C}$ which is the product of two matrices $\mathscr{A} = \|a_{ik}\|$ and $\mathscr{B} = \|b_{ik}\|$ is defined as

$c_{ik} = \sum_{\alpha=1}^{n} a_{i\alpha} b_{\alpha k}$.

Let us denote by $\mathscr{C}'$ the transpose of $\mathscr{C}$; its elements are the numbers $c'_{pi} = c_{ip}$.

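Formula (6), which is about to be restated in matrix notation as $\mathscr{B} = \mathscr{C}'\mathscr{A}\mathscr{C}$, can be confirmed numerically by evaluating the form directly on the new basis vectors. A sketch, not part of the original text, assuming Python with NumPy and randomly chosen matrices:

```python
import numpy as np
rng = np.random.default_rng(0)

A = rng.normal(size=(3, 3))      # matrix of a bilinear form relative to e_1, e_2, e_3
C = rng.normal(size=(3, 3))      # transition matrix: column q holds the coordinates of f_q

# b_pq = A(f_p, f_q), computed directly from (6)
B_direct = np.array([[C[:, p] @ A @ C[:, q] for q in range(3)] for p in range(3)])
print(np.allclose(B_direct, C.T @ A @ C))    # True: the new matrix is C' A C
```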
(7*)   $b_{pq} = \sum_{i,k=1}^{n} c'_{pi} a_{ik} c_{kq}$.

Using the definition of matrix multiplication twice, we can state this result in matrix notation:

(7)   $\mathscr{B} = \mathscr{C}' \mathscr{A} \mathscr{C}$.

Thus, if $\mathscr{A}$ is the matrix of a bilinear form $A(x, y)$ relative to the basis $e_1, e_2, \ldots, e_n$ and $\mathscr{B}$ its matrix relative to the basis $f_1, f_2, \ldots, f_n$, then $\mathscr{B} = \mathscr{C}' \mathscr{A} \mathscr{C}$, where $\mathscr{C}$ is the matrix of transition from $e_1, e_2, \ldots, e_n$ to $f_1, f_2, \ldots, f_n$ and $\mathscr{C}'$ is the transpose of $\mathscr{C}$.

5. Quadratic forms

DEFINITION 4. Let $A(x, y)$ be a symmetric bilinear form. The function $A(x, x)$ obtained from $A(x, y)$ by putting $y = x$ is called a quadratic form. $A(x, y)$ is referred to as the bilinear form polar to the quadratic form $A(x, x)$.

The requirement of Definition 4 that $A(x, y)$ be a symmetric form is justified by the following result, which would be invalid if this requirement were dropped.

THEOREM 1. The polar form $A(x, y)$ is uniquely determined by its quadratic form $A(x, x)$.

Proof: The definition of a bilinear form implies that

$A(x + y, x + y) = A(x, x) + A(x, y) + A(y, x) + A(y, y)$.

Hence, in view of the symmetry of $A(x, y)$ (i.e., the equality $A(x, y) = A(y, x)$),

$A(x, y) = \tfrac{1}{2} [A(x + y, x + y) - A(x, x) - A(y, y)]$.

Since the right side of the above equation involves only values of the quadratic form $A(x, x)$, it follows that $A(x, y)$ is indeed uniquely determined by $A(x, x)$.

To show the essential nature of the symmetry requirement in the above result we need only observe that if $A(x, y)$ is any (not necessarily symmetric) bilinear form, then $A(x, y)$ as well as the symmetric bilinear form $A_1(x, y) = \tfrac{1}{2} [A(x, y) + A(y, x)]$ give rise to the same quadratic form $A(x, x)$.

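The identity used in the proof can be exercised numerically: the values of the quadratic form alone determine the symmetric bilinear form. A sketch, not part of the original text, assuming Python with NumPy and a randomly chosen symmetric matrix:

```python
import numpy as np
rng = np.random.default_rng(1)

S = rng.normal(size=(4, 4))
S = (S + S.T) / 2                      # a symmetric matrix, so A(x, y) = x' S y is symmetric
A = lambda x, y: x @ S @ y
Q = lambda x: A(x, x)                  # the associated quadratic form

x, y = rng.normal(size=4), rng.normal(size=4)
recovered = (Q(x + y) - Q(x) - Q(y)) / 2   # A(x, y) = (1/2)[A(x+y, x+y) - A(x, x) - A(y, y)]
print(np.isclose(recovered, A(x, y)))      # True
```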
A (ibc. x). x) can be expressed as follows: A (x. y) its polar form. x). y) = A (xl. In such a space the value of the inner product (x. x) is called positive definite if for every vector x A (x. These conditions are seen to coincide with the axioms for an inner product stated in § 2. such a bilinear form always defines an inner product. y) = A (y. an inner product is a bilinear form corresponding to a positive definite quadratic form. A quadratic form A (x. 3) I atkeznk. y) of two vectors is taken as the value A (x. y). y) + A (x2. x). A (x. au. of x and nk of y as follows: A (X. x) = E aikeik. y). y) can be expressed in terms of the coordinates E. x) = + $22 + -in $2 is a positive definite quadratic form. Hence. It is clear that A (x. where a. This enables us to give the following alternate definition of Euclidean space: A vector space is called Euclidean if there is defined in it a positive definite quadratic form A (x. y) of the (uniquely determined) bilinear form A (x. x) 0 and A (x. x).. y) = A (x. We have already shown that every symmetric bilinear form A (x. y) associated with A (x. Let A (x. = a. x) be a positive definite quadratic form and A (x. (x. + x2. . Conversely. EXAMPLE. The definitions formulated above imply that A (x. It follows that relative to a given basis every quadratic form A (x. x) > 0 for x O.n-DIMENSIONAL SPACES 41 give rise to the same quadratic form A (x. x) > O. = k=1 We introduce another important DEFINITION 5.

42 LECTURES ON LINEAR ALGEBRA § S. nn . x) (supposed not identically zero) does not contain any square of the variables n2. those terms of the form which contain ann12 + 2annIn2 + We shall assume slightly more than we may on the basis of the O. We now show how to select a basis (coordinate system) in which the quadratic form is represented as a sum of squares. stays different from zero. If the form A (x. $n2. . para. . To reduce the quadratic form A (x. f3. the coefficient of )7'2. nn are the coordinates of the vector x relative to this basis.2) is not zero. Thus let f1. . A (x. i. that in (2) a n2 + 2ainnin.. If this is not the case it can be brought about by a change of basis consisting in a suitable change of the numbering of the basis elements. x) = Al$12 + 12E22 . Consider the coordinate transformation defined by = nil + 7/'2 n2 = 7711 nia (k = 3. § 1) we may write the formulas for coordinate transformations in place of formulas for basis trans- formations. In view of the one-to-one correspondence between coordinate transformations and basis transformations (cf. We now single out all above.. We shall now carry out a succession of basis transformations aimed at eliminating the terms in (2) containing products of coordinates with different indices. x) to a sum of squares it is necessary to begin with an expression (2) for A (x. . namely. Since an = an = 0..n) nk = n'k Under this transformation 2a12771)72 goes over into 2a12(n1 771). . .e.f be a basis of our space and let . A (x. 6. 2a1277072. x) in which at least one of the a (a is the coefficient of )7. )72. Reduction of a quadratic form to a sum of squares VVre know by now that the expression for a quadratic form A (x. it contains one product say. x) in terms of the coordinates of the vector x depends on the choice of basis. X) = a zo in where ni.

+ 2a/nn + a1)2 B. x) au th**2 a22* a ** **2 + n2ik. 272** = (122* n2* + a23* n3* + + nn*. so in (2) the quadratic form under consideration becomes (x..1=3 . fi ** t. if necessary.i. *2 + all n *.e.hThc. " (Jinn.. by auxiliary transformations discussed above) and carry out another change of coordinates defined by ni ** n3** = nn** ni* . our form becomes A (x. " 71: = then our quadratic form goes over into A (x. write an/h2 (3) 1 24112971712 (ann. n3*. *. where the dots stand for a sum of terms in the variables t)2' If we put 711* = aniD 1)2* a12n2 . x) = 1 a 11 (a11711 + ' + a1)2 + 4. k=2 is entirely analogous to the right side of (2) except for the fact that it does not contain the first coordinate. * si ik ' The expression a ik* n i* n k* i.n-DIMENSIONAL SPACES 43 and "complete the square. If we assume that a22* 0 0 (which can be achieved. ' ' It is clear that B contains only squares and products of the terms that upon substitution of the right side of (3) al2172. x) n .

j* . 6. x) = 21E12 + 22E22 where E1. Thus let A (x.e. 7122 8%2.j* Finally. E are the coordinates of x relative to e1. x) =_ n1. where m n. relative to some basis f1. if 71' . Then there exists a basis el. x) be a quadratic form in three-dimensional space which is defined. e2. to n linearly independent vectors. e. x) If 271. x) . para. if SI = rits. by the equation A (x. We shall now give an example illustrating the above method of reducing a quadratic form to a sum of squares.f3. x) has the form A (x.f2. es = e3 = 712.2 + 4. . . 171* + = n. 21E18 ¿2$22 27n em2. E2.8 + 27j + 41)/ 2?y. . e2. Let A (x. Vi = 77. x) = Again. If m < n. We may now sum up our conclusions as follows: THEOREM 1. we put 4+1= = An = O. x) (cf.44 LECTURES ON LINEAR ALGEBRA After a finite number of steps of the type just described our expression will finally take the form A (x.s = then A (x. An en2. i. We leave it as an exercise for the reader to write out the basis transformation corresponding to each of the coordinate transformations utilized in the process of reduction of A (x. 8712. + 27 s* '73' .7/2 4flo..2 +722. = then A (x. § 1) and to see that each change leads from basis to basis. en of R relative to which A (x. x) be a quadratic form in an n-dimensional space R.

. . x) assumes the canonical form A (x.2 e22 12e22 If we have the expressions for ni*. we can express el. e. e2.% e2 . It is easy to check that in this case the matrix i. for nj**.. the beginning of the description of the reduction process in this section). 7/2. n take the " form c12n2 + " C22n2 + C2nnn the matrix of the coordinate transformation is a so called triangular matrix. 712. . )73. n* in terms of n. n** in terms of ni*. = e2 = d21f1 6/12f2 + + + d2nf. .e. x) is such that at no stage of the reduction process is there need to "create squares" or to change the numbering of the basis elements (cf. e2 2n3. in terms of the old basis vectors f2. Ej = cu. § 1) we can express the new basis vectors ei. ¿3= In view of the fact that the matrix of a coordinate transformation is the inverse of the transpose of the matrix of the corresponding basis transformation (cf. x) = _e. .n-DIMENSIONAL SPACES 45 then A (2c. e2. form = ciini + Cl2712 E2 = C21711 C22n2 Clnnn C2nnn $n = enini + cn2n2 + Thus in the example just given 1/1 + Cnnn. en in terms of ni. . ni. 712.ìj in the etc. 712*. ' f. . 6. E2. e. 112". 712*. para. then the expressions for El. r ni. e in terms of711. d2212 + = dnif d2f2 + + If the form A (x.

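The successive completion of squares described in § 5 can be mechanized. The sketch below is not part of the original text; it assumes Python with NumPy and, as in the text, that the coefficient singled out at each step is different from zero, so that no renumbering of the basis elements or auxiliary creation of squares is required.

```python
import numpy as np

def reduce_to_squares(S):
    """Reduce the quadratic form x' S x (S symmetric) to a sum of squares by
    repeatedly completing the square.  Returns (lam, T) such that
    x' S x = sum_k lam[k] * (T @ x)[k] ** 2, assuming every pivot is nonzero."""
    B = np.array(S, dtype=float)
    n = B.shape[0]
    lam, T = np.zeros(n), np.zeros((n, n))
    for k in range(n):
        pivot = B[k, k]
        if pivot == 0:
            raise ValueError("zero pivot: an auxiliary change of coordinates is needed")
        lam[k] = 1.0 / pivot
        T[k] = B[k]                               # new coordinate a_kk x_k + a_k,k+1 x_{k+1} + ...
        B = B - np.outer(B[k], B[k]) / pivot      # the form that remains in the other variables
    return lam, T

rng = np.random.default_rng(2)
S = rng.normal(size=(4, 4))
S = (S + S.T) / 2
lam, T = reduce_to_squares(S)
x = rng.normal(size=4)
print(np.isclose(x @ S @ x, np.sum(lam * (T @ x) ** 2)))   # True
```

The rows of T form a triangular array, in agreement with the remark above that the corresponding coordinate transformation has a triangular matrix.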
a22*. f2. = a21 2n 0 am. e2. y) and the initial . In contradistinction to the preceding section we shall express the vectors of the desired basis directly in terms of the vectors of the initial basis.. LI 2= at2 a21 mm. this time we shall find it necessary to impose certain restrictions on the form A (x. i. 2-1 where ai. However. the following determinants are different from zero: ali (1) a11 412 a22 0. y) relative to the basis f1. en so that (2) A (ei. fn. . . . We assume that basis f1. . O. fk). be different from zero. a22 a1n di. x) = I a1k. 0 0. Now let the quadratic form A (x.46 LECTURES ON LINEAR ALGEBRA of the corresponding basis transformation is also a triangular matrix: = e2 dfif1 d22f2. f2.4. an2 requirement that in the method of reducing a quadratic form to a sum of squares described in § 5 the coefficients an . x) be defined relative to the basis f1. ek) 0 for i k (i k --= 1. n). . er. fn. f2. = A (fi. It is our aim to define vectors el. . = difi c/n2f2 + + § 6. f by the equation (It is worth noting that this requirement is equivalent to the A (x. Reduction of a quadratic form by means of a triangular transformation 1 In this section we shall describe another method of constructing a basis in which the quadratic form becomes a sum of squares. Thus let a II be the matrix of the bilinear form A (x. etc. 2.

en is the required basis. f1) = 0 for every k and for all i < k.x. e2. We assert that conditions (4) determine the vector ek to within a constant multiplier.n-DIMENSIONAL SPACES 47 We shall seek these vectors in the form = (3) e2 c(21f1 22f2. e1. = 0 for i < k and therefore. fi) oci2A (ek. fi) = 0 then (ek. then = ocii A (ek.k from the conditions (2) by substituting for each vector in (2) the expression for that vector in (3). and to obviate the computational difficulties involved we adopt a different approach.. fi). such that the vector ek = satisfies the relations A (ek. 2.. f2) + A (e. Ctkk atk2f2 + + Mkkfk = 0. by ai1f1 oci2f2 ' then A (ek. f2) = 1.k I. ei) = O for . c22f2 + e= + We could now determine the coefficients . i.212 + + 2(f1) + aA(ek. Thus if A (ek.e. also for i > k. Indeed. if we replace e. Our problem then is to find coefficients . However. We claim that conditions (4) and (5) determine the vector ek . 2. in view of the symmetry of the bilinear form. A (ek. this scheme leads to equations of degree two in the cc°. To fix this multiplier we add the condition A (ek.k I). . (i = 1. + . . We observe that if for i = 1. e1) = A (ek. OCk2.

12A (f1-1. A (x. + akkA (LI. x) relative to the basis e1. fk) 14 The determinant of this system is equal to A (fk. Thus conditions (4) and (5) determine ek uniquely. fk) = °. f2) + + lickA (fk. f1) A (fi. f1) 12A (f1. The proof is immediate. f2) ac12A (f2. A 1. bin = 0 for i k. The basis of the ei is characterized by the fact that A (e1. ek) = A (ek. fl) (flt. ek). e2. is a determinant of order k Ao = 1. f2) + + 11114 fek. fk) A (fk. f1) A (f2. I analogous to (7) and . i. Now A (ek. f A (fk. f2) A (f 2.e. ek) = ockk' The number x11 can be found from the system (6) Namely. It therefore remains to compute b11 = A (ek.. f = °. f2) + ' 111A (f2' f1) (6) + kkA (f1. As A (ei. f2) + acnA (f1-1. fk) A (f2. fk) I A (fk. ek) = for i k.48 LECTURES ON LINEAR ALGEBRA uniquely. by Cramer's rule. we already know b11 It remains to find the coefficients bt of the quadratic form en just constructed. fl) oc12A (fk. r which in view of (4) and (5) is the same as A (ek. f 1. + akkA (f2. f2 A (f2.. chkiA (f1.-1 QC/0c = where A1_. 0. we are led to the following linear system for the kg. fl) 12f2 + + °C1741) C(12A (elc. ek). Substituting in (4) and (5) the expression for e. as asserted. oc11f1 = C(11A (e1. f2) and is by assumption (1) different from zero so that the system (6) has a unique solution.

= (0.n-DIMENSIONAL SPACES 49 Thus blek = A . 0. are the coordinates of x in the basis el. f (or if one were simply to permute the vectors f1. e2. 1. known as the method of Jacobi. e) = A k-1 Ak To sum up: THEOREM 1. if one were to start out with another basis . 2. relative to which A (x. let the determinants .11=a11. Further. en. 0. EXAMPLE. 0). . A Here 4Ç. e2. =-. it should be pointed out that the vectors el. by the equation A (x. e2. . . . x) = 40 AI _AI A2 22 + A I. fk). . f2. fn) one would be led to another basis el. x) is expressed as a sum of squares. x) = l<=1 aiknink a ik = A (fi. This method of reducing a quadratic form to a sum of squares is . REMARK: The fact that in the proof of the above theorem we were led to a definite basis el. f. en.(0. . f2. Also. en need not have the form (3). 0).e A (x. . e2. 1). Consider the quadratic form 2E1' + 3E1E2 + 4E1E3 + E22 + in three-dimensional space with basis f= (1. Let A (x. f. fr. . In fact. x) be a quadratic form defined relative to some basis f1. en in which the quadratic form is expressed as a sum of squares does not mean that this basis is unique.e. e2.42 an An = an a12 a22 all ' an an a12 aln a2n (inn an2 be all different from zero Then there exists a basis el. f2.

.) = 0 and A (e2. 12) = 0.' -33 and e3-1871 1 12ea + 117 fa _(S 127. f2) = 1. Or an 8f2 = (6. e e . or o( i and e. 1. A (e3. Thus our theorem may be applied to the quadratic form at hand. 43. 2a = I. fa) = 1 A (ea. i. = if..f. f2) 1.50 LECTURES ON LINEAR ALGEBRA The corresponding bilinear form is A (x. o r.. our quadratic form becomes A(x. + teh. Here C. whence 831 = 0. 833). C C2 are the coordinates of the vector x in the basis e. 21ai 1832 + 2833 = jln + 832 28. fi) = 0. ct. 2M21 = 0. 8.. = (cc./3 The determinants A. = 82313 a22f2 e. 12 133 8 -y. Relative to the basis e. is found from the condition A (e1. = 6f1 Finally. + $012 + 2e3m1 + e3 7. e2. 117). = 83113 + 822f3 + ma. 833 = 1. y) =. are determined from the equations A (es.e. (823.e. 0). 0).f. =(j 0. 0.. Let el = ce. 121 = 6. none of them vanishes. 0). e. 0).2e1n1 pi. 43 are 2. i. Next a and 822 are determined from the equations A (e2. 232.x) = C12 + Ai C13 42 AH 43 C32 Cl2 8C22 11-7C32. e.&7. --1-. = (i. 839. 822. The coefficient cc. whence and e.

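The coefficients produced by the method of Jacobi can be computed directly from the determinants $\Delta_k$. The sketch below is not part of the original text; it assumes Python with NumPy, and the sample matrix is the matrix of the quadratic form of the example above as far as it can be read, for which the canonical coefficients come out as 1/2, -8 and 1/17.

```python
import numpy as np

def jacobi_coefficients(S):
    """Coefficients Delta_{k-1} / Delta_k of the canonical form obtained by the
    method of Jacobi, assuming all leading minors are different from zero."""
    S = np.asarray(S, dtype=float)
    minors = [1.0] + [np.linalg.det(S[:k, :k]) for k in range(1, S.shape[0] + 1)]
    return [minors[k - 1] / minors[k] for k in range(1, len(minors))]

# matrix of 2 xi_1^2 + 3 xi_1 xi_2 + 4 xi_1 xi_3 + xi_2^2 + xi_3^2
S = np.array([[2.0, 1.5, 2.0],
              [1.5, 1.0, 0.0],
              [2.0, 0.0, 1.0]])
print(jacobi_coefficients(S))     # [0.5, -8.0, 0.0588...], i.e. 1/2, -8, 1/17
```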
is positive definite. Actually all we have shown is how to compute the number of positive and negative squares for a particular mode of reducing a quadratic form to a sum of squares. have the same sign then the coefficient and A. x) takes the form A (x. x) = I 21E12 1=1 0 for all x and is equivalent to E1= E2 = = En = O. x) A (x. in which A (x. A. Hence A (x. > 0. Hence. (8) It is clear that if A1_1 and A. so that the quadratic form is 1 . e2.. have opposite signs. A1. A > O. . In the next section we shall show that the number of positive and negative squares is independent of the method used in reducing the form to a sum of squares. A2 " 4 e22 2 An_. These coefficients are I A.n-DIMENSIONAL SPACES 2. x) 4E12 + 22E22 Anen2. then the quadrat e form A (x. /12 > 0. Assume that d. A. . If A1> 0. . x) . are positive. where all the A. then of E12 is positive and that if this coefficient is negative. > 0. Z12. e7. THEOREM 2. A. The number of negative coefficients which appear in the canonical form (8) of a quadratic form is equal to the number of changes of sign in the sequence 1. In other words. 51 In proving Theorem I above we not only constructed a basis in which the given quadratic form is expressed as a sum of squares but we also obtained expressions for the coefficients that go with these squares. Then there exists a basis e1. A2 > 0.

Now let $A(x, x)$ be a positive definite quadratic form. We shall show that then $\Delta_k > 0$ ($k = 1, 2, \ldots, n$). We first disprove the possibility that

$\Delta_k = \begin{vmatrix} A(f_1, f_1) & A(f_1, f_2) & \cdots & A(f_1, f_k) \\ A(f_2, f_1) & A(f_2, f_2) & \cdots & A(f_2, f_k) \\ \cdots & \cdots & \cdots & \cdots \\ A(f_k, f_1) & A(f_k, f_2) & \cdots & A(f_k, f_k) \end{vmatrix} = 0.$

Indeed, if $\Delta_k = 0$, then one of the rows in the above determinant would be a linear combination of the remaining rows; i.e., it would be possible to find numbers $\mu_1, \mu_2, \ldots, \mu_k$, not all zero, such that

$\mu_1 A(f_1, f_i) + \mu_2 A(f_2, f_i) + \cdots + \mu_k A(f_k, f_i) = 0 \qquad (i = 1, 2, \ldots, k)$,

so that

$A(\mu_1 f_1 + \mu_2 f_2 + \cdots + \mu_k f_k, \ f_i) = 0 \qquad (i = 1, 2, \ldots, k)$

and therefore

$A(\mu_1 f_1 + \mu_2 f_2 + \cdots + \mu_k f_k, \ \mu_1 f_1 + \mu_2 f_2 + \cdots + \mu_k f_k) = 0.$

In view of the fact that $\mu_1 f_1 + \mu_2 f_2 + \cdots + \mu_k f_k \ne 0$, the latter equality is incompatible with the assumed positive definite nature of our form. We have thus proved that $\Delta_k \ne 0$ ($k = 1, 2, \ldots, n$).

The fact that $\Delta_k \ne 0$ ($k = 1, 2, \ldots, n$) combined with Theorem 1 permits us to conclude that it is possible to express $A(x, x)$ in the form

$A(x, x) = \frac{\Delta_0}{\Delta_1} \xi_1^2 + \frac{\Delta_1}{\Delta_2} \xi_2^2 + \cdots + \frac{\Delta_{n-1}}{\Delta_n} \xi_n^2$.

Since for a positive definite quadratic form all the coefficients $\Delta_{k-1}/\Delta_k$ must be positive, it follows that all the $\Delta_k$ are positive (we recall that $\Delta_0 = 1$). We have thus proved

THEOREM 3. Let $A(x, y)$ be a symmetric bilinear form and $f_1, f_2, \ldots, f_n$ a basis of the $n$-dimensional space $R$. For the quadratic form $A(x, x)$ to be positive definite it is necessary and sufficient that

$\Delta_1 > 0, \quad \Delta_2 > 0, \quad \ldots, \quad \Delta_n > 0$.

This theorem is known as the Sylvester criterion for a quadratic form to be positive definite.

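Theorem 3 translates directly into a test for positive definiteness: compute the leading principal minors and check that all of them are positive. A sketch, not part of the original text, assuming Python with NumPy:

```python
import numpy as np

def is_positive_definite(S):
    """Sylvester criterion: x' S x (S symmetric) is positive definite
    iff every leading principal minor Delta_1, ..., Delta_n is positive."""
    S = np.asarray(S, dtype=float)
    return all(np.linalg.det(S[:k, :k]) > 0 for k in range(1, S.shape[0] + 1))

print(is_positive_definite(np.array([[2.0, 1.0], [1.0, 2.0]])))   # True
print(is_positive_definite(np.array([[1.0, 2.0], [2.0, 1.0]])))   # False: Delta_2 = -3
```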
Conversely. e2. e2) (el. e. . ek) (e2. be k vectors in some Euclidean space. ek are linearly dependent. The Gramm determinant of a system of vectors e1. (x.2. if (x. In particular f in changed order.4. then all principal minors ok that matrix are positive. An would be different principal minors of the matrix the new A1.. Let e1. then if we used as another basis the vectors f1. e2. e2) (e2. y). A2.o/ a matrix ilaikllof a quadratic form A (x. express the conditions for positive definiteness of A (x. This determinant is zero if and only if the vectors el. y) A (x.. is always >_ O. we see that A > O. A.ti-DIMENSIONAL SPACES 53 It is clear that we could use an arbitrary basis of R to express the conditions for the positive definiteness of the form A (X. x) such that A (x. ek) (ek. are all positive. A2. x) derivable from inner products. . x) every theorem concerning positive definite quadratic forms is at the same time a theorem about vectors in Euclidean space. y) can be taken as an inner product in R.p. we may put (x. y) is a symmetric bilinear form on a vector space R and A (x. The Gramm determinant.. x). 3. e2) (ek.. . I/the principal minors z1. The determinant el) e1) (el. e2. if 41.e. Now let A be a principal minor of jja1111 and let p 1. e. Haik11. y) is an inner product on R. x) '(x. y) is a bilinear symmetric form on R such that A (x. be the numbers of the rows and columns of jaj in A. X). Thus every positive definite quadratic form on R may be identified with an inner product on R considered for pairs of equal vectors only. . If A (x. x) is positive definite. The results of this section are valid for quadratic forms A (x. x) is positive definite. . is known as the Gramm determinant of these vectors THEOREM 4. . Indeed.. ek) (ek.e. x). y) (x. If we permute the original basis k) and vectors so that the pith vector occupies the ith position (i 1. x) relative to the new basis. then A (x. This implies the following interesting COROLLARY. then A (x. . x) relative to some basis are positive. for quadratic forms A (x. then A (x. i. x) is positive definite. A. . i. f2. One consequence of this correspondence is that A (x.

y) is the inner product of X and y. discussed in this section (cf. where (x. x1 y1 1. ek coincides with the determinant 4..y. In Euclidean three-space (or in the plane) the determinant 4.x. . linearly dependent vectors e1. 3/423 2. x) The assertion that inequality. y) is a symmetric bilinear form such that A (x. y) (Y.54 LECTURES ON LINEAR ALGEBRA el. y) Proof. on the vectors x. y). Since A (x. is a linear combination of the others. = ixF2 ly12 2. in that case one of the vectors. ek . + z. + y. Ya 23 where 3/4. 3/42 + 3/42 + x32 = Y1 X1 + Y2 x2 T Y3 X3 z. Assume that e1. y. y. (x. is a linear combination of the others and the determinant must vanish. We shall show that the Gramm determinant of a system of Ale]. 3/4 Y3 Y12 + Ya' + 1I32 x121 + x222 -.y. e2. = 1x18 13. (x.z. ek is zero. where y) is the angle between x and y. e. This completes the proof. z.y. e2. are linearly independent. + zax.12 (1 cos2 99) = 1x12 13712 sin' 99. has the following geometric sense: d2 is the square of the area of the parallelogram with sides x and y. . x and y As an example consider the Gramm determinant of two vectors = (x. are the Cartesian coordinates of x. e2.. 1x12 13712 cos' 9. Indeed.Y. e2. 1. z is equal to the absolute value of the determinant xi 23 V In three-dimensional Euclidean space the volume of a parallelepiped 23 Ya 23 Ya 2. J. y) = (y. z28 + za2 . (x. + 22e2 ' + It follows that the last row in the Gramm determinant of the e. i.e. + z.z. Now. x) is positive definite it follows from Theorem 3 that 47c >0. Therefore. say e. has indeed the asserted geometric meaning. + z.. Y) (Y. x) > 0 is synonymous with the Schwarz EXAMPLES. Then the Gramm determinant of Consider the bilinear form A (x.x. x) = Ix1 lyr cos ry. vectors e1. (7)).z. y 2'. Indeed. y. z. + yrz.

y. (9) .2 By replacing those basis vectors (in such a basis) which correspond to the non-zero A. z) (37.a f (t)dt and the theorem just proved implies that: The Gramm determinant of a system of functions is always 0.1.e. (1) A (x. w in a k-dimenional space R is the square of the determinant X1 Y1 22 Y2 " " Xfr Yk Wk W1 W2 where the xi are coordinates of x in some orthogonal basis. which a quadratic form A (x. /1(012(t)dt Pba 122(t)dt 11(1)1k(i)di Pb 12(t)f1(t)dt f " a 12(t)1(t)dt rb a rb Pb tic(1)11(1)dt 1k(t)12(t)dt .n-DIMENSIONAL SPACES 55 (x. it is possible to show that the Gramm determinant of k vectors y) x. x) = 1-1 2. There are different bases relative to 1. z is the square of the volume of the parallelepiped on these vectors. (It is clear that the space R need not be k-dimensional. w. the determinant (9) is referred to as the volume of the k-dimensional parallelepiped determined by w. R may.) By analogy with the three-dimensional case. etc. § 2) the Gramm determinant takes the form rb I rb 10 (t)de "b . by vectors proportional to them we obtain a . y. Y) (x.1 . z) Thus the Gramm determinant of three vectors x. the vectors x. be even infinite-dimensional since our considerations involve only the subspace generated by the k vectors x. § 7. y. z) (z. Similarly. indeed. . x) is a sum of squares. y) (3'. For a system of functions to be linearly dependent it is necessary and sufficient that their Gramm determinant vanish. 3. the yi are the coordinates of y in that basis. The law of inertia T he law of inertia . y. In the space of functions (Example 4.

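Both properties of the Gramm determinant (it is never negative, and it vanishes exactly when the vectors are linearly dependent) and its meaning as a squared area can be seen in a small numerical sketch, not part of the original text and assuming Python with NumPy:

```python
import numpy as np

def gram_det(vectors):
    """Gramm determinant: the determinant of the matrix of pairwise inner products."""
    V = np.array(vectors, dtype=float)
    return np.linalg.det(V @ V.T)

x = np.array([3.0, 0.0, 0.0])
y = np.array([1.0, 2.0, 0.0])
print(gram_det([x, y]))              # 36.0, the square of the area 6 of the parallelogram on x, y
print(gram_det([x, y, x - 2 * y]))   # ~0, since the three vectors are linearly dependent
```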
2. § 6.. To illustrate the nature of the question consider a quadratic form A (x. suppose some other basis e'1.. and I. 1. Then a certain matrix I ja'11 would take the place of I laikl and certain determinants would replace the determinants z11. x) by means of a sum of squares in which the A.. or 1. in formula (1) are different from zero and the number of positive coefficients obtained after reduction of A (x. z1'2. THEOREM 1. x) to a sum of squares by the method described in that section is equal to the number of changes of sign in the sequence 1. in two different bases) to a sum of squares. If a quadratic form is reduced by two different methods (i.. is represented by the matrix where a = A (ei. and lis dependent on the choice of basis or is solely dependent on the quadratic form A (x. all ). A1.56 LECTURES ON LINEAR ALGEBRA representation of A (x. Now. There arises the question of the connection (if any) between the number of changes of sign in the squences 1. . z11. A. answers the question just raised. a2 a. . are 0. I. .. relative to some basis el. known as the law of inertia of quadratic torms. Then. ZI2. e' were chosen. e2. . e'2. .(12. an are different from zero. as was shown in para. ''2 = all a12 a22 z1n an an an an .. . It is natural to ask whether the number of coefficients whose values are respectively 0. . zl.. x) which.e. en. x). . A'. 2. The following theorem. ek) and all the determinants 41 = an. then the number of positive coefficients as well as the number of negative coefficients is the same in both cases.

E2. + 52e2 + + $e -F -Fene. f2. e be a basis in which the quadratic form ei A2e2 + A (x. f2...±Q. + + pit = 0.. e2. is n. kt2. x) = ni2 (Here )7. n2p. )72p. which vanish is also an invariant of the form. be a basis of R' and f. If x = 0.2e2 + + Atek = !IA Let us put + Akek = Pelf' /42f2 !lift = x. that this is false and that p > p'. say. n2 . 2. We first prove the following lemma: LEMMA. Let R' be the subspace spanned by the vectors el. Let R' and R" be two subspaces of an n-dimensional space R of dimension k and 1. e2. e2. e..e. f be another basis relative to which the n22 quadratic form becomes A (x.. e2.2e2 + + Akek p2f. . . and pi. x) X 22 $2.. . + erk. f2. We can now prove Theorem 1. f2.. e. .n-DIMENSIONAL SPACES 57 Theorem 1 states that the number of positive A. and let k 1 > n. . . q'. f. .2 $223+1 E2p+2 $2. . 2. are linearly dependent (k 1 > n). . ti are the coordinates of x relative to the basis .e.) Let f.. Ale 2.) We must show that p = p' and q =-. . e. A2.._ . ' It is clear that x is in R' n R". f. This means that there exist numbers pi not all zero such that Al. . x) becomes A (x. i. basis of R". " Ak. + . The vectors el. in (1) and the number of negative A. It remains to show that x O. Assume ep. f. (Here E. A2. p would all be zero. Since the total number of the A. which is impossible. . Al. fi. Proof: Let e. Then there exists a vector x 0 contained in R' n R".. Proof: Let e. Hence x O. respectively.e.. En are the coordinates of the vector x. fm. n2. it follows that the number of coefficients A. i.. in (1) are invariants of the quadratic form.

) The resulting contradiction shows that fi = p'. y.58 LECTURES ON LINEAR ALGEBRA R' has dimension p. e R. .eil+1. Rank of a quadratic form DEFINITION 1. x= and E. x) = + $22 > (since not all the E. Since n p>n (we assumed 1) > p'). and 41 e R. . y. i. This completes the proof of the law of inertia of quadratic forms. e. By the null space of a given bilinear form A (x.e 0 and its coordinates relative to the basis . x) = . are E1.e. It is easy to see that R. E2. A (x. there exists a vector x 0 in R' n R" (cf. Then A (x. DEFINITION 2. e Et. .. are zero.) = 0 and A (x. A (x. y) = 0 for every x e R. To this end we shall define the rank of a quadratic form without recourse to its canonical form.-+2 -c 0(Note that it is not possible to replace < in (5) with <. = n. Indeed. . f. But this means that y.. Substituting these coordinates in (2) and (3) respectively we get.±. in one of its canonical forms. of all vectors y such that A (x. i.. y2) = 0 for all x e R. e2. on the one hand. . it is possible that nil+. 0. on the other hand. yi y. let y. is a subspace of R. Similarly one can show that q = q'. The reasonableness of the above definition follows from the law of inertia just proved. The subspace R" spanned by the vectors fil. n. = O.. By the rank of a quadratic form we mean the 2.e. while not all the numbers il. f. are 0. for. +ee X = np fil + + nil-Fa' fil+qt + nnfn The coordinates of the vector x relative to the basis e.) = 0 and A (x. y. 0.. 0 for all x e R. A (x. Lemma). number of non-zero coefficients 2. vanish) and. We shall now investigate the problem of actually finding the rank of a quadratic form. y) we mean the set R. . +2 = = nil+0. f has dimension n p'.n22. . Q nil+.

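The invariance asserted by the law of inertia can be observed numerically. In the sketch below (not part of the original text, assuming Python with NumPy) the numbers of positive and negative squares are read off from the signs of the eigenvalues, which furnish one particular reduction to a sum of squares; they are compared before and after an arbitrary non-singular change of basis, and the rank is compared as well.

```python
import numpy as np

def signature(S, tol=1e-10):
    """Numbers of positive and negative squares in a canonical form of x' S x,
    read off here from the signs of the eigenvalues of the symmetric matrix S."""
    w = np.linalg.eigvalsh(np.asarray(S, dtype=float))
    return int(np.sum(w > tol)), int(np.sum(w < -tol))

rng = np.random.default_rng(3)
S = rng.normal(size=(5, 5))
S = (S + S.T) / 2
C = rng.normal(size=(5, 5))           # a change of basis, non-singular with probability 1
S2 = C.T @ S @ C                      # matrix of the same quadratic form in the new basis
print(signature(S), signature(S2))                            # the two signatures coincide
print(np.linalg.matrix_rank(S), np.linalg.matrix_rank(S2))    # and so do the ranks
```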
2.e) = aik. But relative to a canonical basis the matrix of a quadratic form is diagonal [1. f is a basis of R. consists of all vectors y whose coordinates 2h.1 0 O Ao 0 An . for i= 1. Replacing y in (7) by (6) we obtain the following system of equations: A (f1.n-DIMENSIONAL SPACES 59 If f.. are solutions of the above system of linear equations. 71f1 + n2f2 + A (f2.. anini + 12. a22n2 = O. f.. f. then for a vector Y= n2f2 + + nnf. the dimension of this subspace is n r. 2. As is well known. nifi + nnf. ant).) = Q + mit. + an% = 0. We defined the rank of a quadratic form to be the number of (non-zero) squares in any of its canonical forms.) = Q /02 + + ni. y) is independent of the choice of basis in R (although the matrix la .202 + ' + Thus the null space R.. A (ft. y) 0 to belong to the null space of A (x. and the null space is completely independent of the choice of basis. We shall now connect the rank of the matrix of a quadratic form with the rank of the quadratic form.[1 does depend on the choice of basis. 702 + n2f2 + A (fn. § 5). )72. y) it suffices that n.) = O.17. cf. . the above system goes over into + ainnn = 0. We shall now try to get a better insight into the space Ro. where r is the rank of the matrix Ikza We can now argue that The rank of the matrix ra11 of the bilinear form A (x. . If we put A (fi. where ro is the dimension of the null space. Indeed. the rank of the matrix in question is n ro.

and its rank $r$ is equal to the number of non-zero coefficients $\lambda_i$, i.e., to the rank of the quadratic form.

Since the rank of the matrix of a quadratic form is thus the same in every basis, we can sum up as follows:

THEOREM 2. The matrices which represent a quadratic form in different coordinate systems all have the same rank $r$. This rank is equal to the number of squares with non-zero multipliers in any canonical form of the quadratic form.

Thus, the rank of the matrix associated with a quadratic form in any basis is the same as the rank of the quadratic form, and to find the rank of a quadratic form we must compute the rank of its matrix relative to an arbitrary basis. (We could have obtained the same result by making use of the well-known fact that the rank of a matrix is not changed if we multiply it by a non-singular matrix, and by noting that the connection between two matrices $\mathscr{A}$ and $\mathscr{A}_1$ which represent the same quadratic form relative to two different bases is $\mathscr{A}_1 = \mathscr{C}' \mathscr{A} \mathscr{C}$ with $\mathscr{C}$ non-singular.)

§ 8. Complex n-dimensional space

1. Complex vector spaces. In the preceding sections we dealt essentially with vector spaces over the field of real numbers. Many of the results presented so far remain in force for vector spaces over arbitrary fields; we mentioned in § 1 that all of the results presented in that section apply, in particular, to vector spaces over the field of complex numbers. In addition to vector spaces over the field of real numbers, vector spaces over the field of complex numbers will play a particularly important role in the sequel. It is therefore reasonable to discuss the contents of the preceding sections with this case in mind.

2. Complex Euclidean vector spaces. By a complex Euclidean vector space we mean a complex vector space in which there is defined an inner product, i.e., a function which associates with every pair of vectors $x$ and $y$ a complex number $(x, y)$ so that the following axioms hold:

1. $(x, y) = \overline{(y, x)}$, where $\overline{(y, x)}$ denotes the complex conjugate of $(y, x)$;

Axioms 1 and 2 imply that (x. x) = (x. The set R of Example i above can be made into a unitary space by putting y) = aikeifh.n-DIMENSIONAL SPACES 61 2(x. nn) are two elements of R. y). x) = x) + (Y2. y). Indeed. ' ". Y2) Axiom 1 above differs from the corresponding Axiom 1 for a real Euclidean vector space.(x. Ic--1 . In particular. y. Complex Euclidean vector spaces are referred to as unitary spaces. 2.1(y. (x.e. Y) -= $217/2 + We leave to the reader the verification of the fact that with the above definition of inner product R becomes a unitary space. yi + Y2) = (Y1 + Y2. Indeed. x). (x. y) -H (x2. (x. Y) = (Y. y). 2y) = (2y. But then (Ax. Also. 2 and 4 for inner products in the form in which they are stated for real Euclidean vector spaces. EXAMPLES OF UNITARY SPACES. y2). + (x. x) i"(x. 2. If = (E1 E2 En ) and 2. x) would imply (x. y2) = y) (x. /. x) is a non-negative real number which becomes zero (2x. ). -1. (x.. (ix.x) x). y) (x. i. ix) (x.x2. the numbers (x. x) --= . y) with y = tx would have different signs thus violating Axiom 4. we define (x. 2y) = il(x. x) and (y. y) = (x1. This is justified by the fact that in unitary spaces it is not possible to retain Axioms 1. y) only if x = O. y). Let R be the set of n-tuples of complex numbers with the usual definitions of addition and multiplications by (complex) numbers.

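The model inner product of Example 1 can be written out and its axioms spot-checked numerically. A sketch, not part of the original text, assuming Python with NumPy and randomly chosen complex vectors:

```python
import numpy as np
rng = np.random.default_rng(5)

def inner(x, y):
    """(x, y) = xi_1 conj(eta_1) + ... + xi_n conj(eta_n)."""
    return np.sum(x * np.conj(y))

x = rng.normal(size=3) + 1j * rng.normal(size=3)
y = rng.normal(size=3) + 1j * rng.normal(size=3)
lam = 2.0 - 1.5j

print(np.isclose(inner(x, y), np.conj(inner(y, x))))         # axiom 1: (x, y) = conj((y, x))
print(np.isclose(inner(lam * x, y), lam * inner(x, y)))      # (lambda x, y) = lambda (x, y)
print(inner(x, x).real > 0, abs(inner(x, x).imag) < 1e-12)   # (x, x) is a positive real number
```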
Isomorphism of unitary spaces. e2.e. . Let R be the set of complex valued functions of a real variable t defined and integrable on an interval [a. en.e. en is an orthonormal basis and x= $2e2 + + $e. we do not introduce the concept of angle between two vectors. Example / in this section). . b]. Two vectors x and y are said to be orthogonal if (x... e2. The existence of an orthogonal basis in an n-dimensional unitary space is demonstrated by means of a procedure analogous to the orthogonalization procedure described in § 3. then (x. Axiom 4 implies that the length of a vector is non-negative and is equal to zero only if the vector is the zero vector. x). . = (Th azkez f. g(t)) = f(t)g(t) dt. . y = %el n2e2 + are two vectors.) x = e. As in § 3 we prove that the vectors el. . en are linearly independent. If e. i. $2. If e. 3. e2. not a real number. En and takes on the value zero only if el = C2 = en = O.62 LECTURES ON LINEAR ALGEBRA where at.>. e2. By the length of a vector x in a unitary space we shall mean the number \/(x. Orthogonal basis. y) = O. n2e2 + + nne. By an orthogonal basis in an n-dimensional unitary space we mean a set of n pairwise orthogonal non-zero vectors el. in general. e is an orthonormal basis and = + ee. 3. $2e2 + $en. Y) = $2e2 + e2f/2 + + nnen + E71 (cf. that they form a basis. 0 for every n-tuple el. Since the inner product of two vectors is. It is easy to see that R becomes a unitary space if we put (f(t). are given complex numbers satisfying the following two conditions: (a) a .

In other words. y) is a linear function of the first kind of x. + + b&./(x). for any fixed x.(ei. However. W e shall say that A (x. are constants.f(x + y) --f(x) +f(Y). y). y) = A (xi. With the exception of positive definiteness all the concepts introduced in § 4 retain meaning for vector spaces over arbitrary fields and in particular for complex vector spaces.4 (x. et) + + e (e et).1. . y) A (2x. f(Ax) = Af (x) . are the coordinates of the vector x relative to the basis el. A (x. en and a. y) is a bilinear form (function) of the vectors x and y if: for any fixed y. A (x. a2 + ame. in the case of complex vector spaces there is another and for us more important way of introducing these concepts. A (x2. A complex valued function f defined on a complex space is said to be a linear function of the first kind if f(x + Y) =f(x) ±f(y). 1. Linear functions of the first and second kind. (x. Using the method of § 3 we prove that all un tary spaces of dimension n are isomorphic. where $. et) + $2 (e2. y) = 2. f(x) = DEFINITION 1. and that every linear function of the second kind can be written in the form b2t. y) is a linear function of the second kind of y. of the first kind can be written in the form f(x) = a1e. e2.) = Et. e.). A (xl + x2.n-DIMENSIONAL SPACES 63 t hen (x. Bilinear and quadratic forms. ez) = SO that e2e2 + + ez) = e. Using the method of § 4 one can prove that every linear function 1. and a linear function of the second le. y). = f(e.nd if 2. f (2x) = . 4. a.

2) = A (X.Fik i. 6 6 holds only for symmetric bilinear forms (cf. k=1 viewed as a function of the vectors X El. Let A (x. A (x. x) called a quadratic form (in complex space). y) relative to the basis .4 (x. Ay) )7. . y = n1e1 n2e2 + + linen) = A (elei i.64 LECTURES ON LINEAR ALGEBRA 2. k1 $2e2 ' fle. Y1. y) = aik$. is called the matrix of the bilinear form A (x. Yi) + A (X. e. y) be a bilinear form. en be a basis of an n-dimensional complex space. One example of a bilinear form is the inner product in a unitary space A (x. We recall that in the case of real vector spaces an analogous statement § 4). If x and y have the representations Y= + n2e2 + x = 1e1 then A (X. The matrix IjaH with ai. y).. ?he]. + 3. y) e2e2 + + enen. A (ei.. . Y2). y) we obtain a function A (x. ej. //lei ?2e2 + ed7kA (ei. A (X. If we put y = x in a bilinear form A (x. y) = (x. + nmen Let en e2. E2e2 + + $ne.n. The connection between bilinear and quadratic forms in complex space is summed up in the following theorem: Every bilinear form is uniquely determined by its quadratic form. y) considered as a function of the vectors x and y. Another example is the expression A (x.

respectivly.3) =M (x.x iy) A (x y. A (x+iy. y)± A (y. I ankei ---. x). Namely. x+y) = A (x. y). x y) iA(x iy. (IV) by 1. i. i. x) A (x y. y). The four identities 7: A (x±y. If we multiply equations (I). (IV) by 1. in particular. y). A (xy. x) + A (x. x y) + iA (x iy. x±iy)=A(x. x iy) 1{A (x A (y. y) = A (y. x + y) iA (x iy. .A (y x). y). (II). then a = A (ei. (III). respectively. if we multiply the 1. x) A (y. x)iA (y. x iy)}. then A (x.n-DIMENSIONAL SPACES 65 Proof: Let A (x. y)± A (y. For a form to be Hermitian it is necessary and sufficient that its matrix laikl I relative to some basis satisfy the condition a Indeed. 1. x iy)}. i. x) be a quadratic form and let x and y be two arbitrary vectors. x + y) + iA (x iy. x) + A (y. and add the results it follows easily that A (x. A (xiy. A (x. Conversely. DEFINITION 2. NOTE If the matrix of a bilinear form satisfies the condition 7 Note that A (x. x)-HiA (y. if the form A (x. x iy)= A (x. y)+A(y. i. y). (II). Since the right side of (1) involves only the values of the quadratic form associated with the bilinear form under consideration our assertion is proved. (III). xy) = A (x. y) is Hermitian. if a = aki.. enable us to compute A (x. y). equations (I). x) A (x. y) = ±{A (x y. Y) so that. y) = a1kE111. e1) d. This concept is the analog of a symmetric bilinear form in a real Euclidean vector space. iy) iA (x. x)-HiA (x. A bilinear form is called Hermitian if A (x. 1. ek) A (ek. x)iA (x. we obtain similarly. y) A (y. y.

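The theorem can be exercised numerically: the four identities combine into $A(x, y) = \tfrac{1}{4} \sum_{k=0}^{3} i^k A(x + i^k y, x + i^k y)$, so the values of the quadratic form determine the bilinear form. The sketch below is not part of the original text; it assumes Python with NumPy and represents a form that is linear in the first argument and of the second kind in the second argument by an arbitrary, not necessarily Hermitian, matrix.

```python
import numpy as np
rng = np.random.default_rng(6)

M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))    # an arbitrary matrix ||a_ik||

def A(x, y):
    """A(x, y) = sum a_ik xi_i conj(eta_k): linear in x, of the second kind in y."""
    return x @ M @ np.conj(y)

x = rng.normal(size=3) + 1j * rng.normal(size=3)
y = rng.normal(size=3) + 1j * rng.normal(size=3)
recovered = sum((1j ** k) * A(x + (1j ** k) * y, x + (1j ** k) * y) for k in range(4)) / 4
print(np.isclose(recovered, A(x, y)))    # True: A(x, y) is recovered from quadratic values alone
```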
e2. The proof is a direct consequence of the fact just proved that for a bilinear form to be Hermitian it is necessary and sufficient that A (x. y) is a Hermitian bilinear form so that (x. axioms 1 through 3 for the inner product in a complex Euclidean space say in effect that (x. The following result holds: For a bilinear form A (x. y) to be Hermitian it is necessar y and sufficient that A (x. y) relative to the basis e1. but then a--=d relative to any other basis. j 1. then 4-4 = %)* seW . Indeed. where (x.d relative to some basis implies that A (x. x) positive definite when for x 0. A quadratic form is Hermitian i f and only i f it is real valued. en and I the matrix of A (x. x + y). y) = A (y. x) is real. x) = (x. in particular. A (x + y. x) > 0 vector space with a positive definite Hermitian quadratic form. x iy).. x). Conversely. i. a. f2 and if f. If Al is the matrix of a bilinear form A (x. basis f1. A (x iy. y) be Hermitian. then. x) be real for all x. x). x). y) = (y. A (x y. A (x iy. then the same must be true for the matrix of this form relative to any other basis. y) is a Hermitian bilinear form. In fact. x iy) are all real and it is easy to see from formulas (1) and (2) that A (x.66 LECTURES ON LINEAR ALGEBRA a = dkì. then the associated quadratic form is also called Hermitian. so that the number A (x. x) is real for al x.e. Then A (x. we call a quadratic form A (x. . xy). Proof: Let the form A (x. COROLLARY. n). x). as in § 4. f2. y) relative to the tt . then a complex Euclidean space can be defined as a complex A (x. x) be real for every vector x. . x) is a Hermitian quadratic form. x) denotes the inner product of x with itself. if A (x. let A (x. -. If a bilinear form is Hermitian. = coe. If. x) = A (x. One example of a Hermitian quadratic form is the form A (x.

basis of R. If W. in view of formula (1). e. The idea is to select in succession the vectors of the desired basis. O. then we choose in it some basis er÷. 5. X) = is an arbitrary vector. Our construction implies A (e2. Reduction of a quadra ic form to a sum of squares THEOREM 1. el) O. c* e51. . One can prove the above by imitating the proof in § 5 of the analogous theorem in a real space. wise A (x. A (x. e). e2) for i < k. x) = 0 so that 0. en of R complex vector space R..R. ei) are real in view of the Hermitian . x) = 0 for all x and..) 0. y) A (ei. We choose to give a version of the proof which emphasizes the geometry of the situation.-DIMENSIONAL SPACES 67 flc[ and tt'* Ilc*011 is the conjugate transpose of Here g' i.. y) vector only). e2. Then there is a basis e1. e2) + where the numbers A (e. This process is continued until we reach the O (Mr) may consist of the zero space Itffi in which A (x.e. then A (ei. . the Hermitian nature of the form A (x. )=0 + E2e2 + for > k. It follows tha x= A (x. The proof is the same as the proof of the analogous fact in a real space. relative to which the form in question is given by A (X. e1. X) = + A2 EJ2 + + EJ where all the 2's are real. This can be done for otherWe choose el so that A (el. y) Now we select a vector e2 in the (n 1)-dimensional space Thu consisting of all vectors x for which A (e1. ek) = 0 implies A (ei. These vectors and the vectors el. Let A (x. etc. form a er+2. x) be a Hermitian quadratic form in a . On the other hand. en. + enen + EThfA (en. el) + E2(2A (e2.

. Just as in § 6 we find that for a Hermitian quadratic form to be positive definite it is necessary and sufficient that the determinants A2 . I.A (ei. an where a. . ek). e1) and are thus real. --. 7. These formulas are identical with (3) and (6) of § 6. The number of negative multipliers of the squares in the canonical form of a Hermitian quadratic form equals the number of changes of sign in the sequence . quadratic form is reduced to the canonical form (3). e1) by ). To see this we recall that if a Hermitian .a. that the determinants /11 42. then A (x. EXERCISE. . x) be a Hermitian . Prove directly that if the quadratic form A (x. x) is Hermitian. x) = 1$112 A2 1E21' + zl I ler2... An THEOREM 2.A. . Al. e2. If a Hermitian quadratic form has canonical fo . If we denote A (e.. basis. Then just as in we can write down formulas for finding a basis relative to which the quadratic form is represented by a sum of squares.68 LECTURES ON LINEAR ALGEBRA nature of the quadratic form.. x) + 22M2 + + 2E& = 41E112 + 221e2I2 + quadratic form in a complex vector space and e. The law of inertia A2. Reduction of a Hermitian quadratic form to a sum of squares by means of a triangular transformation. Let A (x. 4 are real. a. Relative to such a basis the quadratic form is given by A (x. are all different from zero. among others. This implies. are real. We assume that the determinants all a12 + 2nleta2 6. be positive. 42 --= au a12 a21 a2^ An = a22 a.ea aln a2n A. then the coefficients are equal to A (e1. then the determinants /1 4. =-.2 § 6. where A. = 1.

negative and zero coefficients is the same in both cases.

The proof of this theorem is the same as the proof of the corresponding theorem in § 7.

The concept of rank of a quadratic form introduced in § 7 for real spaces can be extended without change to complex spaces.

2. If with every vector x of a vector space R there is associated a (unique) vector y in R. It is again easy to see that conditions 1 and 2 hold. Fundamental definitions. Operations on linear transformations 1. Let R' be a plane in the space R (of Example 1) passing .). The simplest functions of this type are linear transformations. It is easy to see that conditions 1 and 2 hold for this mapping. say. If x is any vector in R. 1. The left side of 1 is the result of first adding x and x. Clearly. Whenever there is no danger of confusion the symbol A (x) is replaced by the symbol Ax. both procedures yield the same vector.CHAPTER II Linear Transformations § 9. The right side of 1 is the result of first rotating x. This transformation is said to be linear if the following two conditions hold: A (x + x2) = A(x1) + A (x. In the preceding chapter we stud- ied functions which associate numbers with points in an ndimensional vector space. A (dlx ) = (x). In many cases. and x. Linear transformations. and then rotating the sum. then the mapping y = A(x) is called a transformation of the space R. and then through the origin. then Ax stands for the vector into which x is taken by this rotation. however. We associate with x in R its projection x' = Ax on the plane R'. Consider a rotation of three-dimensional Euclidean space R about an axis through the origin. 70 adding the results. DEFINITION I. Let us check condition 1. it is necessary to consider functions which associate points of a vector space with points of that same vector space. EXAMPLES.

n. A (fi + /2) = Jo I. If we put AP(1) P1(1). P 2 (t). 4.LINEAR TRANSFORMATIONS 71 3. . Indeed. 1]. Consider the space of continuous funct ons f(t) defined on the interval [0.) --= /If (r) dr f2er) tit = Afi 2AI Af2. then A is a linear transformation.41(t) f2(T)] dr To . n2.). The identity mapping E defined by the equation Ex x for all x. en) (ni.fi(r) dr A (. 5.] we associate the vector Y = Ax where e2 . k=1 aike k This mapping is another instance of a linear transformation. where P'(t) is the derivative of P(t). Consider the vector space of n-tuples of real numbers. If we put Af(t) = f(r) dr. Let liaikH be a (square) matrix. Indeed [P1 (t)Pa(t)i' [AP (t)]' a(t) AP' (t). then Af(t) is a continuous function and A is linear. Jo f(r) dr Among linear transformations the following simple transforma- tions play a special role. Consider the n-dimensional vector space of polynomials of degree n 1. With the vector x= &.

uniquely. every matriz determines a unique linear transformation given by means of the formulas (3).. .. which we shall call the matrix of the linear transformation A relative . e. . In fact. n) form a matrix J1 = Haikl! to the basis e1. Ae2 = g2.g2 that the mapping A is linear. so that A is indeed uniquely determined by the Ae. (2). Ae2. conversely. e2. . . be a1..g. (i. 2. the vector Ax = e. e. To this end we consider the mapping A which associates with x = ele + es. Now let the coordinates of g relative to the basis el. This mapping is well defined. e2. The numbers ao. It is easily seen + E2e2 + + E'e. g2. We shall show that Given n arbitrary vectors g1. = g1. g. a..72 LECTURES ON LINEAR ALGEBRA The null transformation 0 defined by the equation Ox = for all x. since x has a unique representation relative to the basis e1. e2. Aek = aikei. e2. a2.. We first prove that the vectors Ae. It remains to prove the existence of A with the desired properties.e every linear transformation A determines a unique matrix Maji and. en be a basis of an n-dimensional vector space R and 2. Ae determine A x= e2e2 + + ene E2Ae2 is an arbitrary vector in R. then Ax = A(eie. k = 1. . (1). i.. . E2e2 + &.. e2. g there exists a unique linear transformation A such that Ae.. Connection between matrices and linear transformations. . e. Let el. Ae = g. We have thus shown that relative to a given basis el. let A denote a linear transformation on R. if .e) = EiAel + -Hen Ae.e.

Linear transformations can thus be described by means of matrices, and matrices are the analytical tools for the study of linear transformations on vector spaces.

EXAMPLES. 1. Let E be the identity mapping and e1, e2, ⋯, en any basis in R. Then Eei = ei (i = 1, 2, ⋯, n), i.e., the matrix which represents E relative to any basis is the unit matrix

 [1 0 ⋯ 0
  0 1 ⋯ 0
  ⋯⋯⋯⋯
  0 0 ⋯ 1].

It is easy to see that the null transformation is always represented by the matrix all of whose entries are zero.

2. Let R be the three-dimensional Euclidean space and A the linear transformation which projects every vector on the XY-plane. We choose as basis vectors of R unit vectors e1, e2, e3 directed along the coordinate axes. Then
 Ae1 = e1, Ae2 = e2, Ae3 = 0,
i.e., relative to this basis the mapping A is represented by the matrix

 [1 0 0
  0 1 0
  0 0 0].

EXERCISE. Find the matrix of the above transformation relative to the basis e′1, e′2, e′3, where e′1 = e1, e′2 = e2, e′3 = e2 + e3.

3. Let R be the space of polynomials of degree ≤ n − 1 and let A be the differentiation transformation, i.e.,
 AP(t) = P′(t).
We choose the following basis in R:
 e1 = 1, e2 = t, e3 = t²/2!, ⋯, en = t^(n−1)/(n − 1)!.
Then
 Ae1 = 1′ = 0, Ae2 = t′ = 1 = e1, Ae3 = (t²/2!)′ = t = e2, ⋯, Aen = [t^(n−1)/(n − 1)!]′ = t^(n−2)/(n − 2)! = e_(n−1).
Hence relative to our basis, A is represented by the matrix

 [0 1 0 ⋯ 0
  0 0 1 ⋯ 0
  ⋯⋯⋯⋯⋯
  0 0 0 ⋯ 1
  0 0 0 ⋯ 0].

Let A be a linear transformation, e1, e2, ⋯, en a basis in R and ||a_ik|| the matrix which represents A relative to this basis. We wish to express the coordinates η_i of Ax by means of the coordinates ξ_i of x. Let
 (4)  x = ξ1e1 + ξ2e2 + ⋯ + ξnen,
 (4′)  Ax = η1e1 + η2e2 + ⋯ + ηnen.
Now
 Ax = A(ξ1e1 + ξ2e2 + ⋯ + ξnen)
  = ξ1(a11e1 + a21e2 + ⋯ + an1en) + ξ2(a12e1 + a22e2 + ⋯ + an2en) + ⋯ + ξn(a1ne1 + a2ne2 + ⋯ + annen)
  = (a11ξ1 + a12ξ2 + ⋯ + a1nξn)e1 + (a21ξ1 + a22ξ2 + ⋯ + a2nξn)e2 + ⋯ + (an1ξ1 + an2ξ2 + ⋯ + annξn)en.
Hence, in view of (4′),
 η1 = a11ξ1 + a12ξ2 + ⋯ + a1nξn,
 ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯
 ηn = an1ξ1 + an2ξ2 + ⋯ + annξn,
or, briefly,
 (5)  η_i = Σ_{k=1}^{n} a_ik ξ_k.
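The following numerical sketch is not part of the original text; it assumes Python with the numpy library and merely illustrates formula (5) for the differentiation transformation of Example 3.

```python
import numpy as np

# Matrix of the differentiation transformation A P(t) = P'(t) relative to
# the basis e1 = 1, e2 = t, e3 = t^2/2!, e4 = t^3/3!  (ones on the superdiagonal).
n = 4
A = np.eye(n, k=1)

# Coordinates xi of P(t) = 2 + 3t + 5*t^2/2! + 7*t^3/3! in this basis.
xi = np.array([2.0, 3.0, 5.0, 7.0])

# Formula (5): the coordinates of Ax are obtained by applying the matrix to xi.
eta = A @ xi
print(eta)   # [3. 5. 7. 0.] -- the coordinates of P'(t), as expected
```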

Thus, if ||a_ik|| represents a linear transformation A relative to some basis e1, e2, ⋯, en, then transformation of the basis vectors involves the columns of ||a_ik|| [formula (3)] and transformation of the coordinates of an arbitrary vector x involves the rows of ||a_ik|| [formula (5)].

3. Addition and multiplication of linear transformations. We shall now define addition and multiplication for linear transformations.

DEFINITION 2. By the product of two linear transformations A and B we mean the transformation C defined by the equation Cx = A(Bx) for all x. If C is the product of A and B, we write C = AB.

The product of linear transformations is itself linear, i.e., it satisfies conditions 1 and 2 of Definition 1. Indeed,
 C(x1 + x2) = A[B(x1 + x2)] = A(Bx1 + Bx2) = ABx1 + ABx2 = Cx1 + Cx2.
The first equality follows from the definition of multiplication of transformations, the second from property 1 for B, the third from property 1 for A and the fourth from the definition of multiplication of transformations. That C(λx) = λCx is proved just as easily.

If E is the identity transformation and A is an arbitrary transformation, then it is easy to verify the relations
 AE = EA = A.
Next we define powers of a transformation A:
 A² = A·A, A³ = A²·A, etc.,
and, by analogy with numbers, we define A⁰ = E. Clearly, A^(m+n) = A^m·A^n.

EXAMPLE. Let R be the space of polynomials of degree ≤ n − 1 and let D be the differentiation operator, DP(t) = P′(t). Then D²P(t) = D(DP(t)) = (P′(t))′ = P″(t), D³P(t) = P‴(t), etc. Clearly, in this case Dⁿ = O.

EXERCISE. Select in R of the above example the basis of Example 3 and find the matrices of D, D², ⋯ relative to this basis.

We know that given a basis e1, e2, ⋯, en every linear transformation determines a matrix. If the transformation A determines the matrix ||a_ik|| and B the matrix ||b_ik||, what is the matrix ||c_ik|| determined by the product C of A and B? To answer this question we note that by definition of ||c_ik||
 (6)  Ce_k = Σ_i c_ik e_i.
Further,
 (7)  ABe_k = A(Σ_j b_jk e_j) = Σ_j b_jk Ae_j = Σ_i (Σ_j a_ij b_jk) e_i.
Comparison of (7) and (6) yields
 (8)  c_ik = Σ_j a_ij b_jk.
We see that the element c_ik of the matrix 𝒞 is the sum of the products of the elements of the ith row of the matrix 𝒜 and the corresponding elements of the kth column of the matrix ℬ. The matrix 𝒞 with entries defined by (8) is called the product of the matrices 𝒜 and ℬ in this order. Thus, if the (linear) transformation A is represented by the matrix ||a_ik|| and the (linear) transformation B by the matrix ||b_ik||, then their product C is represented by the matrix ||c_ik|| which is the product of the matrices ||a_ik|| and ||b_ik||.
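A short numerical check of formula (8), not part of the original text and assuming Python with numpy:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n))   # matrix of a transformation A
B = rng.standard_normal((n, n))   # matrix of a transformation B
x = rng.standard_normal(n)

# Applying B first and then A to the coordinates of x ...
y1 = A @ (B @ x)
# ... agrees with applying the single matrix C = AB, whose entries are
# c_ik = sum_j a_ij * b_jk  (formula (8)).
C = A @ B
y2 = C @ x

print(np.allclose(y1, y2))   # True
```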

DEFINITION 3. By the sum of two linear transformations A and B we mean the transformation C defined by the equation Cx = Ax + Bx for all x. If C is the sum of A and B we write C = A + B. It is easy to see that C is linear.

Let C be the sum of the transformations A and B. If ||a_ik|| and ||b_ik|| represent A and B respectively (relative to some basis e1, e2, ⋯, en) and ||c_ik|| represents the sum C of A and B (relative to the same basis), then, on the one hand,
 Ae_k = Σ_i a_ik e_i, Be_k = Σ_i b_ik e_i, Ce_k = Σ_i c_ik e_i,
and, on the other hand,
 Ce_k = Ae_k + Be_k = Σ_i (a_ik + b_ik) e_i,
so that
 c_ik = a_ik + b_ik.
The matrix ||a_ik + b_ik|| is called the sum of the matrices ||a_ik|| and ||b_ik||. Thus the matrix of the sum of two linear transformations is the sum of the matrices associated with the summands.

Addition and multiplication of linear transformations have some of the properties usually associated with these operations. Thus
 1. A + B = B + A;
 2. (A + B) + C = A + (B + C);
 3. A(BC) = (AB)C;
 4. (A + B)C = AC + BC, C(A + B) = CA + CB.
We could easily prove these equalities directly but this is unnecessary. We recall that we have established the existence of a one-to-one correspondence between linear transformations and matrices which preserves sums and products. Since properties 1 through 4 are proved for matrices in a course in algebra, the isomorphism between matrices and linear transformations just mentioned allows us to claim the validity of 1 through 4 for linear transformations.

We now define the product of a number λ and a linear transformation A. Thus by λA we mean the transformation which associates with every vector x the vector λ(Ax). It is clear that if A is represented by the matrix ||a_ik||, then λA is represented by the matrix ||λa_ik||.

If
 P(t) = a0t^m + a1t^(m−1) + ⋯ + am
is an arbitrary polynomial and A is a transformation, we define the symbol P(A) by the equation
 P(A) = a0A^m + a1A^(m−1) + ⋯ + amE.

EXAMPLE. Consider the space R of functions defined and infinitely differentiable on an interval (a, b). Let D be the linear mapping defined on R by the equation
 Df(t) = f′(t).
If P(t) is the polynomial P(t) = a0t^m + a1t^(m−1) + ⋯ + am, then P(D) is the linear mapping which takes f(t) in R into
 P(D)f(t) = a0f^(m)(t) + a1f^(m−1)(t) + ⋯ + am f(t).

Analogously, with P(t) as above and 𝒜 a matrix we define a polynomial in a matrix by means of the equation
 P(𝒜) = a0𝒜^m + a1𝒜^(m−1) + ⋯ + amℰ.

EXERCISE. Find P(𝒜) for

 𝒜 = [0 1 0 ⋯ 0
    0 0 1 ⋯ 0
    ⋯⋯⋯⋯⋯
    0 0 0 ⋯ 1
    0 0 0 ⋯ 0].

EXAMPLE. Let 𝒜 be a diagonal matrix, i.e., a matrix of the form

 𝒜 = [λ1 0 ⋯ 0
    0 λ2 ⋯ 0
    ⋯⋯⋯⋯
    0 0 ⋯ λn].

We wish to find P(𝒜). Since
 𝒜² = diag(λ1², λ2², ⋯, λn²), ⋯, 𝒜^m = diag(λ1^m, λ2^m, ⋯, λn^m),
it follows that

 P(𝒜) = [P(λ1) 0 ⋯ 0
     0 P(λ2) ⋯ 0
     ⋯⋯⋯⋯⋯⋯
     0 0 ⋯ P(λn)].

It is possible to give reasonable definitions not only for a polynomial in a matrix 𝒜 but also for any function of a matrix 𝒜 such as exp 𝒜, sin 𝒜, etc.

As was already mentioned in § 1, all matrices of order n with the usual definitions of addition and multiplication by a scalar form a vector space of dimension n². Hence any n² + 1 matrices are linearly dependent.
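A minimal computational sketch of a polynomial in a matrix, not part of the original text; it assumes Python with numpy, and the helper name poly_of_matrix is introduced here only for illustration.

```python
import numpy as np

def poly_of_matrix(coeffs, A):
    """P(A) = a0*A^m + a1*A^(m-1) + ... + am*E for coeffs = [a0, a1, ..., am]."""
    n = A.shape[0]
    result = np.zeros_like(A, dtype=float)
    for a in coeffs:                      # Horner's scheme
        result = result @ A + a * np.eye(n)
    return result

# For a diagonal matrix, P(A) is diagonal with entries P(lambda_i), as in the example above.
A = np.diag([1.0, 2.0, 3.0])
coeffs = [1.0, -4.0, 5.0]                 # P(t) = t^2 - 4t + 5
print(poly_of_matrix(coeffs, A))          # diag(P(1), P(2), P(3)) = diag(2, 1, 2)
```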

Since the number of matrices ℰ, 𝒜, 𝒜², ⋯, 𝒜^(n²) is n² + 1, they must be linearly dependent, i.e., there exist numbers a0, a1, a2, ⋯, a_(n²) (not all zero) such that
 a0ℰ + a1𝒜 + a2𝒜² + ⋯ + a_(n²)𝒜^(n²) = 0.
It follows that for every matrix of order n there exists a polynomial P of degree at most n² such that P(𝒜) = 0.

This simple proof of the existence of a polynomial P(t) for which P(𝒜) = 0 is deficient in two respects, namely, it does not tell us how to construct P(t) and it suggests that the degree of P(t) may be as high as n². In the sequel we shall prove that for every matrix 𝒜 there exists a polynomial P(t) of degree n derivable in a simple manner from 𝒜 and having the property P(𝒜) = 0.

4. Inverse transformation

DEFINITION 4. The transformation B is said to be the inverse of A if
 AB = BA = E,
where E is the identity mapping.

The definition implies that B(Ax) = x for all x, i.e., if A takes x into Ax, then the inverse B of A takes Ax into x. The inverse of A is usually denoted by A⁻¹.

Not every transformation possesses an inverse. Thus it is clear that the projection of vectors in three-dimensional Euclidean space on the XY-plane has no inverse. A transformation which has an inverse is sometimes called non-singular.

There is a close connection between the inverse of a transformation and the inverse of a matrix. As is well known, for every matrix 𝒜 with non-zero determinant there exists a matrix 𝒜⁻¹ such that
 (9)  𝒜𝒜⁻¹ = 𝒜⁻¹𝒜 = ℰ.
𝒜⁻¹ is called the inverse of 𝒜. To find 𝒜⁻¹ we must solve a system of linear equations equivalent to the matrix equation (9). The elements of the kth column of 𝒜⁻¹ turn out to be the cofactors of the elements of the kth row of 𝒜 divided by the determinant of 𝒜. It is easy to see that 𝒜⁻¹ as just defined satisfies equation (9).

We know that choice of a basis determines a one-to-one correspondence between linear transformations and matrices which preserves products. It follows that a linear transformation A has an inverse if and only if its matrix relative to any basis has a non-zero determinant, that is, the matrix has rank n.
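A numerical sketch of the cofactor description of the inverse just stated; it is not part of the original text, assumes numpy, and in practice one would simply call np.linalg.inv.

```python
import numpy as np

def inverse_via_cofactors(A):
    """Entry (i, k) of the inverse is the cofactor of a_ki divided by det A."""
    n = A.shape[0]
    det = np.linalg.det(A)
    inv = np.zeros_like(A, dtype=float)
    for i in range(n):
        for k in range(n):
            minor = np.delete(np.delete(A, k, axis=0), i, axis=1)
            cofactor = (-1) ** (i + k) * np.linalg.det(minor)
            inv[i, k] = cofactor / det
    return inv

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
print(np.allclose(inverse_via_cofactors(A) @ A, np.eye(3)))   # True
```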

If A is a singular transformation, then its matrix has rank < n.

THEOREM. Let A be a linear transformation on a space R. The set of vectors Ax (x varies on R) forms a subspace R′ of R. The dimension of R′ equals the rank of the matrix of A relative to any basis e1, e2, ⋯, en.

Proof: Let y1 ∈ R′ and y2 ∈ R′, i.e., y1 = Ax1 and y2 = Ax2. Then
 y1 + y2 = Ax1 + Ax2 = A(x1 + x2),
i.e., y1 + y2 ∈ R′. Likewise, if y = Ax, then λy = λAx = A(λx), i.e., λy ∈ R′. Hence R′ is indeed a subspace of R.

Now any vector x is a linear combination of the vectors e1, e2, ⋯, en. Hence every vector Ax, i.e., every vector in R′, is a linear combination of the vectors Ae1, Ae2, ⋯, Aen. If the maximal number of linearly independent vectors among the Aei is k, then the other Aei are linear combinations of the k vectors of such a maximal set. Since every vector in R′ is a linear combination of the vectors Ae1, ⋯, Aen, it is also a linear combination of the k vectors of a maximal set. Hence the dimension of R′ is k. To say that the maximal number of linearly independent Aei is k is to say that the maximal number of linearly independent columns of the matrix ||a_ik|| is k, i.e., the dimension of R′ is the same as the rank of the matrix ||a_ik||. Since R′ is defined without reference to a basis, it follows that the rank of the matrix of a linear transformation is independent of the choice of basis.

5. Connection between the matrices of a linear transformation relative to different bases. The matrices which represent a linear transformation in different bases are usually different. We now show how the matrix of a linear transformation changes under a change of basis.

Let e1, e2, ⋯, en and f1, f2, ⋯, fn be two bases in R. Let 𝒞 be the matrix connecting the two bases. More specifically, let
 (10)  f1 = c11e1 + c21e2 + ⋯ + cn1en,
    ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯
    fn = c1ne1 + c2ne2 + ⋯ + cnnen.
If C is the linear transformation defined by the equations Cei = fi (i = 1, 2, ⋯, n), then the matrix of C relative to the basis e1, e2, ⋯, en is 𝒞 (cf. formulas (2) and (3) of para. 2).

Let 𝒜 = ||a_ik|| be the matrix of A relative to the basis e1, e2, ⋯, en and ℬ = ||b_ik|| its matrix relative to f1, f2, ⋯, fn, i.e.,
 (10′)  Ae_k = Σ_i a_ik e_i,
 (10″)  Af_k = Σ_i b_ik f_i.
We wish to express the matrix ℬ in terms of the matrices 𝒜 and 𝒞. To this end we rewrite (10″) as
 ACe_k = Σ_i b_ik Ce_i.
Premultiplying both sides of this equation by C⁻¹ (which exists in view of the linear independence of the fi) we get
 C⁻¹ACe_k = Σ_i b_ik e_i.
It follows that the matrix ||b_ik|| represents C⁻¹AC relative to the basis e1, e2, ⋯, en, i.e.,
 (11)  ℬ = 𝒞⁻¹𝒜𝒞,
or, relative to a given basis,
 matrix (C⁻¹AC) = matrix (C⁻¹) · matrix (A) · matrix (C).

To sum up: Formula (11) gives the connection between the matrix ℬ of a transformation A relative to a basis f1, f2, ⋯, fn and the matrix 𝒜 which represents A relative to the basis e1, e2, ⋯, en. The matrix 𝒞 in (11) is the matrix of transition from the basis e1, e2, ⋯, en to the basis f1, f2, ⋯, fn (formula (10)).
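A brief numerical check of formula (11), not part of the original text and assuming numpy:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
A = rng.standard_normal((n, n))       # matrix of A relative to the basis e1, ..., en
C = rng.standard_normal((n, n))       # columns = coordinates of f1, ..., fn in the e-basis

# Formula (11): the matrix of A relative to the new basis f1, ..., fn.
B = np.linalg.inv(C) @ A @ C

# Sanity check: a vector with coordinates xi in the f-basis has coordinates C @ xi in the
# e-basis; applying A there and converting back must agree with applying B directly.
xi = rng.standard_normal(n)
lhs = np.linalg.inv(C) @ (A @ (C @ xi))
print(np.allclose(lhs, B @ xi))       # True
```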

§ 10. Eigenvalues and eigenvectors of a linear transformation

1. Invariant subspaces. In the case of a scalar valued function defined on a vector space R but of interest only on a subspace R1 of R we may, of course, consider the function on the subspace R1 only. Not so in the case of linear transformations. Here points in R1 may be mapped on points not in R1, and in that case it is not possible to restrict ourselves to R1 alone.

DEFINITION 1. Let A be a linear transformation on a space R. A subspace R1 of R is called invariant under A if x ∈ R1 implies Ax ∈ R1.

If a subspace R1 is invariant under a linear transformation A we may, of course, consider A on R1 only.

Trivial examples of invariant subspaces are the subspace consisting of the zero element only and the whole space.

EXAMPLES. 1. Let R be a plane. Let A be a stretching by a factor λ1 along the x-axis and by a factor λ2 along the y-axis, i.e., A is the mapping which takes the vector z = ξ1e1 + ξ2e2 into the vector
 Az = λ1ξ1e1 + λ2ξ2e2
(here e1 and e2 are unit vectors along the coordinate axes). In this case the coordinate axes are one-dimensional invariant subspaces. If λ1 = λ2 = λ, then A is a similarity transformation with coefficient λ. In this case every line through the origin is an invariant subspace.

EXERCISE. Show that if λ1 ≠ λ2, then the coordinate axes are the only invariant one-dimensional subspaces.

2. Let R be three-dimensional Euclidean space and A a rotation about an axis through the origin. The invariant subspaces are: the axis of rotation (a one-dimensional invariant subspace) and the plane through the origin and perpendicular to the axis of rotation (a two-dimensional invariant subspace).

3. Let R be the space of polynomials of degree ≤ n − 1 and A the differentiation operator on R, i.e.,
 AP(t) = P′(t).
The set of polynomials of degree ≤ k (k ≤ n − 1) is an invariant subspace.

EXERCISE. Show that R in Example 3 contains no other subspaces invariant under A.

4. Let R be any n-dimensional vector space. Let A be a linear transformation on R whose matrix relative to some basis e1, e2, ⋯, en is of the form

 [a11 ⋯ a1k  a1,k+1 ⋯ a1n
  ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯
  ak1 ⋯ akk  ak,k+1 ⋯ akn
  0 ⋯ 0   a(k+1),(k+1) ⋯ a(k+1),n
  ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯
  0 ⋯ 0   an,(k+1) ⋯ ann].

In this case the subspace generated by the vectors e1, e2, ⋯, ek is invariant under A. The proof is left to the reader. If, in addition, a_ij = 0 for i ≤ k and j > k, then the subspace generated by the vectors e(k+1), e(k+2), ⋯, en would also be invariant under A.

2. Eigenvectors and eigenvalues. In the sequel one-dimensional invariant subspaces will play a special role.

Let R1 be a one-dimensional subspace generated by some vector x ≠ 0. Then R1 consists of all vectors of the form αx. It is clear that for R1 to be invariant it is necessary and sufficient that the vector Ax be in R1, i.e., that
 Ax = λx.

DEFINITION 2. A vector x ≠ 0 satisfying the relation Ax = λx is called an eigenvector of A. The number λ is called an eigenvalue of A.

Thus if x is an eigenvector, then the vectors αx form a one-dimensional invariant subspace. Conversely, all non-zero vectors of a one-dimensional invariant subspace are eigenvectors.

THEOREM 1. If A is a linear transformation on a complex space R, then A has at least one eigenvector.¹

¹ The proof holds for a vector space over any algebraically closed field since it makes use only of the fact that equation (2) has a solution.

Proof: Let e1, e2, ⋯, en be a basis in R. Relative to this basis A is represented by some matrix ||a_ik||. Let
 x = ξ1e1 + ξ2e2 + ⋯ + ξnen
be any vector in R.
Then the coordinates η1, η2, ⋯, ηn of the vector Ax are given by
 η1 = a11ξ1 + a12ξ2 + ⋯ + a1nξn,
 ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯
 ηn = an1ξ1 + an2ξ2 + ⋯ + annξn
(cf. para. 3 of § 9). The equation
 Ax = λx,
which expresses the condition for x to be an eigenvector, is equivalent to the system of equations
 a11ξ1 + a12ξ2 + ⋯ + a1nξn = λξ1,
 ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯
 an1ξ1 + an2ξ2 + ⋯ + annξn = λξn,
i.e.,
 (1)  (a11 − λ)ξ1 + a12ξ2 + ⋯ + a1nξn = 0,
    a21ξ1 + (a22 − λ)ξ2 + ⋯ + a2nξn = 0,
    ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯
    an1ξ1 + an2ξ2 + ⋯ + (ann − λ)ξn = 0.
Thus to prove the theorem we must show that there exist a number λ and a set of numbers ξ1, ξ2, ⋯, ξn not all zero satisfying the system (1).

For the system (1) to have a non-trivial solution ξ1, ξ2, ⋯, ξn it is necessary and sufficient that its determinant vanish, i.e., that

 (2)  | a11 − λ  a12   ⋯  a1n  |
    | a21    a22 − λ ⋯  a2n  |  = 0.
    | ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯ |
    | an1    an2   ⋯  ann − λ |

This polynomial equation of degree n in λ has at least one (in general complex) root λ0. With λ0 in place of λ, (1) becomes a homogeneous system of linear equations with zero determinant. Such a system has a non-trivial solution ξ1⁽⁰⁾, ξ2⁽⁰⁾, ⋯, ξn⁽⁰⁾. If we put
 x⁽⁰⁾ = ξ1⁽⁰⁾e1 + ξ2⁽⁰⁾e2 + ⋯ + ξn⁽⁰⁾en,
then
 Ax⁽⁰⁾ = λ0x⁽⁰⁾,
i.e., x⁽⁰⁾ is an eigenvector and λ0 an eigenvalue of A. This completes the proof of the theorem.

NOTE: Since the proof remains valid when A is restricted to any subspace invariant under A, we can claim that every invariant subspace contains at least one eigenvector of A.

The polynomial on the left side of (2) is called the characteristic polynomial of the matrix of A and equation (2) the characteristic equation of that matrix. The proof of our theorem shows that the roots of the characteristic polynomial are eigenvalues of the transformation A and, conversely, the eigenvalues of A are roots of the characteristic polynomial.

Since the eigenvalues of a transformation are defined without reference to a basis, it follows that the roots of the characteristic polynomial do not depend on the choice of basis. In the sequel we shall prove a stronger result², namely, that the characteristic polynomial is itself independent of the choice of basis. We may thus speak of the characteristic polynomial of the transformation A rather than the characteristic polynomial of the matrix of the transformation A.

² The fact that the roots of the characteristic polynomial do not depend on the choice of basis does not by itself imply that the polynomial itself is independent of the choice of basis. It is a priori conceivable that the multiplicity of the roots varies with the basis.

3. Linear transformations with n linearly independent eigenvectors are, in a way, the simplest linear transformations. Let A be such a transformation and e1, e2, ⋯, en its linearly independent eigenvectors, i.e.,
 Aei = λiei (i = 1, 2, ⋯, n).
Relative to the basis e1, e2, ⋯, en the matrix of A is

 [λ1 0 ⋯ 0
  0 λ2 ⋯ 0
  ⋯⋯⋯⋯
  0 0 ⋯ λn].

Such a matrix is called a diagonal matrix. We thus have

THEOREM 2. If a linear transformation A has n linearly independent eigenvectors then these vectors form a basis in which A is represented by a diagonal matrix. Conversely, if A is represented in some basis by a diagonal matrix, then the vectors of this basis are eigenvectors of A.
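The following is a small numerical illustration of the characteristic equation (2); it is not part of the original text and assumes Python with numpy.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# The eigenvalues are the roots of det(A - lambda*E) = 0; here lambda^2 - 7*lambda + 10 = 0.
lam = np.linalg.eigvals(A)
print(np.sort(lam))                    # approximately [2., 5.]

# For each eigenvalue the system (1) has a non-trivial solution, i.e. Ax = lambda*x.
w, V = np.linalg.eig(A)
for k in range(2):
    print(np.allclose(A @ V[:, k], w[k] * V[:, k]))   # True, True
```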
NOTE: There is one important case in which a linear transformation is certain to have n linearly independent eigenvectors. We lead up to this case by observing that

If e1, e2, ⋯, ek are eigenvectors of a transformation A and the corresponding eigenvalues λ1, λ2, ⋯, λk are distinct, then e1, e2, ⋯, ek are linearly independent.

For k = 1 this assertion is obviously true. We assume its validity for k − 1 vectors and prove it for the case of k vectors. If our assertion were false in the case of k vectors, then there would exist k numbers α1, α2, ⋯, αk, with α1 ≠ 0, say, such that
 (3)  α1e1 + α2e2 + ⋯ + αkek = 0.
Applying A to both sides of equation (3) we get
 A(α1e1 + α2e2 + ⋯ + αkek) = 0,
or
 α1λ1e1 + α2λ2e2 + ⋯ + αkλkek = 0.
Subtracting from this equation equation (3) multiplied by λk, we are led to the relation
 α1(λ1 − λk)e1 + α2(λ2 − λk)e2 + ⋯ + α(k−1)(λ(k−1) − λk)e(k−1) = 0
with α1(λ1 − λk) ≠ 0 (by assumption λi ≠ λk for i ≠ k). This contradicts the assumed linear independence of e1, e2, ⋯, e(k−1).

The following result is a direct consequence of our observation:

If the characteristic polynomial of a transformation A has n distinct roots, then the matrix of A is diagonable.

Indeed, a root λk of the characteristic equation determines at least one eigenvector. Since the λk are supposed distinct, it follows by the result just obtained that A has n linearly independent eigenvectors e1, e2, ⋯, en. The matrix of A relative to the basis e1, e2, ⋯, en is diagonal.

If the characteristic polynomial has multiple roots, then the number of linearly independent eigenvectors may be less than n. For instance, the transformation A which associates with every polynomial of degree ≤ n − 1 its derivative has only one eigenvalue λ = 0 and (to within a constant multiplier) one eigenvector P(t) = constant. For if P(t) is a polynomial of degree k > 0, then P′(t) is a polynomial of degree k − 1. Hence P′(t) = λP(t) implies λ = 0 and P(t) = constant, as asserted. It follows that regardless of the choice of basis the matrix of A is not diagonal.

We shall prove in chapter III that if λ is a root of multiplicity m of the characteristic polynomial of a transformation, then the maximal number of linearly independent eigenvectors corresponding to λ is m.

In the sequel (§§ 12 and 13) we discuss a few classes of diagonable linear transformations (i.e., linear transformations which in some bases can be represented by diagonal matrices). The problem of the "simplest" matrix representation of an arbitrary linear transformation is discussed in chapter III.

4. Characteristic polynomial. In para. 2 we defined the characteristic polynomial of the matrix 𝒜 of a linear transformation A as the determinant of the matrix 𝒜 − λℰ and mentioned the fact that this polynomial is determined by the linear transformation A alone, i.e., it is independent of the choice of basis. In fact, if 𝒜 and ℬ = 𝒞⁻¹𝒜𝒞 represent A relative to two bases, then
 |𝒞⁻¹𝒜𝒞 − λℰ| = |𝒞⁻¹(𝒜 − λℰ)𝒞| = |𝒞⁻¹| · |𝒜 − λℰ| · |𝒞| = |𝒜 − λℰ|.
This proves our contention. Hence we can speak of the characteristic polynomial of a linear transformation (rather than the characteristic polynomial of the matrix of a linear transformation).

EXERCISES. 1. Find the characteristic polynomial of the matrix

 [λ0 1 0 ⋯ 0
  0 λ0 1 ⋯ 0
  ⋯⋯⋯⋯⋯⋯
  0 0 0 ⋯ 1
  0 0 0 ⋯ λ0].

2. Find the characteristic polynomial of the matrix

 [a1 a2 ⋯ a(n−1) an
  1  0 ⋯ 0    0
  0  1 ⋯ 0    0
  ⋯⋯⋯⋯⋯⋯⋯⋯
  0  0 ⋯ 1    0].

Solution: (−1)ⁿ(λⁿ − a1λⁿ⁻¹ − a2λⁿ⁻² − ⋯ − an).

We shall now find an explicit expression for the characteristic polynomial in terms of the entries in some representation 𝒜 of A.

We begin by computing a more general polynomial, namely
 Q(λ) = |𝒜 − λℬ|,
where 𝒜 and ℬ are two arbitrary matrices. We have

 Q(λ) = | a11 − λb11  a12 − λb12  ⋯  a1n − λb1n |
     | a21 − λb21  a22 − λb22  ⋯  a2n − λb2n |
     | ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯ |
     | an1 − λbn1  an2 − λbn2  ⋯  ann − λbnn |,

and Q(λ) can (by the addition theorem on determinants) be written as a sum of determinants. The free term of Q(λ) is the determinant |𝒜|. The coefficient of (−λ)^k in the expression for Q(λ) is the sum of determinants obtained by replacing in |𝒜| any k columns of the matrix ||a_ik|| by the corresponding columns of the matrix ||b_ik||.

In the case at hand ℬ = ℰ and the determinants which add up to the coefficient of (−λ)^k are the principal minors of order n − k of the matrix ||a_ik||. Thus, the characteristic polynomial P(λ) of the matrix 𝒜 has the form
 P(λ) = (−1)ⁿ(λⁿ − p1λⁿ⁻¹ + p2λⁿ⁻² − ⋯),
where p1 is the sum of the diagonal entries of 𝒜, p2 the sum of the principal minors of order two, etc. Finally, pn is the determinant of 𝒜.

We wish to emphasize the fact that the coefficients p1, p2, ⋯, pn are independent of the particular representation 𝒜 of the transformation A. This is another way of saying that the characteristic polynomial is independent of the particular representation 𝒜 of A.

The coefficients pn and p1 are of particular importance. pn is the determinant of the matrix 𝒜 and p1 is the sum of the diagonal elements of 𝒜. The sum of the diagonal elements of 𝒜 is called its trace. It is clear that the trace of a matrix is the sum of all the roots of its characteristic polynomial each taken with its proper multiplicity.

To compute the eigenvectors of a linear transformation we must know its eigenvalues and this necessitates the solution of a polynomial equation of degree n. In one important case the roots of the characteristic polynomial can be read off from the matrix representing the transformation, namely:

If the matrix of a transformation A is triangular, i.e., if it has the form

 (5)  [a11 a12 a13 ⋯ a1n
     0  a22 a23 ⋯ a2n
     0  0  a33 ⋯ a3n
     ⋯⋯⋯⋯⋯⋯⋯⋯
     0  0  0  ⋯ ann],

then the eigenvalues of A are the numbers a11, a22, ⋯, ann.

The proof is obvious since the characteristic polynomial of the matrix (5) is
 P(λ) = (a11 − λ)(a22 − λ) ⋯ (ann − λ)
and its roots are a11, a22, ⋯, ann.

EXERCISE. Find the eigenvectors corresponding to the eigenvalues a11, a22, ⋯, ann of the matrix (5).
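A numerical illustration of the coefficients p1, p2 and pn, not part of the original text and assuming Python with numpy:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))
eig = np.linalg.eigvals(A)

# p1 = trace = sum of the roots, pn = determinant = product of the roots.
print(np.isclose(np.trace(A), eig.sum().real))            # True
print(np.isclose(np.linalg.det(A), eig.prod().real))      # True

# p2 = sum of the principal minors of order two = sum of products of pairs of roots.
p2 = sum(np.linalg.det(A[np.ix_(idx, idx)]) for idx in combinations(range(n), 2))
pairs = sum(eig[i] * eig[j] for i, j in combinations(range(n), 2))
print(np.isclose(p2, pairs.real))                          # True
```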

5. We conclude with a discussion of an interesting property of the characteristic polynomial. As was pointed out in para. 3 of § 9, for every matrix 𝒜 there exists a polynomial P(t) such that P(𝒜) is the zero matrix. We now show that the characteristic polynomial is just such a polynomial. First we prove the following

LEMMA 1. Let the polynomial
 P(λ) = a0λ^m + a1λ^(m−1) + ⋯ + am
and the matrix 𝒜 be connected by the relation
 (6)  P(λ)ℰ = (𝒜 − λℰ)𝒞(λ),
where 𝒞(λ) is a polynomial in λ with matrix coefficients, i.e.,
 𝒞(λ) = 𝒞0λ^(m−1) + 𝒞1λ^(m−2) + ⋯ + 𝒞(m−1).
Then P(𝒜) = 0. (We note that this lemma is an extension of the theorem of Bezout to polynomials with matrix coefficients.)

Proof: Carrying out the multiplication on the right side of (6) we obtain
 (7)  (𝒜 − λℰ)𝒞(λ) = −𝒞0λ^m + (𝒜𝒞0 − 𝒞1)λ^(m−1) + (𝒜𝒞1 − 𝒞2)λ^(m−2) + ⋯ + 𝒜𝒞(m−1).
Now (6) and (7) yield the equations
 (8)  −𝒞0 = a0ℰ,
    𝒜𝒞0 − 𝒞1 = a1ℰ,
    𝒜𝒞1 − 𝒞2 = a2ℰ,
    ⋯⋯⋯⋯⋯⋯⋯⋯
    𝒜𝒞(m−1) = amℰ.
If we multiply the first of these equations on the left by 𝒜^m, the second by 𝒜^(m−1), the third by 𝒜^(m−2), ⋯, the last by ℰ, and add the resulting equations, we get 0 on the left and
 a0𝒜^m + a1𝒜^(m−1) + ⋯ + amℰ = P(𝒜)
on the right. Thus P(𝒜) = 0 and our lemma is proved.³

³ In algebra the theorem of Bezout is proved by direct substitution of λ = A in (6). Here this is not an admissible procedure since λ is a number and 𝒜 is a matrix. However, the kth equation in (8) is obtained by equating the coefficients of λ^k in (6). Subsequent multiplication by 𝒜^k and addition of the resulting equations is tantamount to the substitution of 𝒜 in place of λ. In other words, we are doing essentially the same thing.

THEOREM 3. If P(λ) is the characteristic polynomial of 𝒜, then P(𝒜) = 0.

Proof: Consider the inverse of the matrix 𝒜 − λℰ. We have
 (𝒜 − λℰ)(𝒜 − λℰ)⁻¹ = ℰ.
As is well known, the inverse matrix can be written in the form
 (𝒜 − λℰ)⁻¹ = [1/P(λ)]𝒞(λ),
where 𝒞(λ) is the matrix of the cofactors of the elements of 𝒜 − λℰ and P(λ) the determinant of 𝒜 − λℰ, i.e., the characteristic polynomial of 𝒜. Hence
 (𝒜 − λℰ)𝒞(λ) = P(λ)ℰ.
Since the elements of 𝒞(λ) are polynomials of degree ≤ n − 1 in λ, we conclude on the basis of our lemma that P(𝒜) = 0. This completes the proof.

We note that if the characteristic polynomial of the matrix 𝒜 has no multiple roots, then there exists no polynomial Q(λ) of degree less than n such that Q(𝒜) = 0 (cf. the exercise below).

EXERCISE. Let 𝒜 be a diagonal matrix

 𝒜 = [λ1 0 ⋯ 0
    0 λ2 ⋯ 0
    ⋯⋯⋯⋯
    0 0 ⋯ λn],

where all the λi are distinct. Find a polynomial P(t) of lowest degree for which P(𝒜) = 0 (cf. para. 3, § 9).
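A numerical check of Theorem 3 (the matrix annihilates its own characteristic polynomial); this sketch is not part of the original text and assumes Python with numpy. Note that np.poly returns the coefficients of det(λℰ − 𝒜), which differs from det(𝒜 − λℰ) only by the factor (−1)ⁿ, so either polynomial annihilates 𝒜.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
A = rng.standard_normal((n, n))

# Coefficients of the characteristic polynomial, highest power first.
coeffs = np.poly(A)

# Evaluate P(A) = c0*A^n + c1*A^(n-1) + ... + cn*E by Horner's scheme.
P = np.zeros((n, n))
for c in coeffs:
    P = P @ A + c * np.eye(n)

print(np.allclose(P, np.zeros((n, n))))   # True
```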
§ 11. The adjoint of a linear transformation

1. Connection between transformations and bilinear forms in Euclidean space. We have considered under separate headings linear transformations and bilinear forms on vector spaces. In the case of Euclidean spaces there exists a close connection between bilinear forms and linear transformations.⁴

⁴ Relative to a given basis both linear transformations and bilinear forms are given by matrices. One could therefore try to associate with a given linear transformation the bilinear form determined by the same matrix as the transformation in question. However, such correspondence would be without significance. In fact, if a linear transformation and a bilinear form are represented relative to some basis by a matrix 𝒜, then, upon change of basis, the linear transformation is represented by 𝒞⁻¹𝒜𝒞 (cf. § 9) and the bilinear form is represented by 𝒞′𝒜𝒞 (cf. § 4). Here 𝒞′ is the transpose of 𝒞. The careful reader will notice that the correspondence between bilinear forms and linear transformations in Euclidean space considered below associates bilinear forms and linear transformations whose matrices relative to an orthonormal basis are transposes of one another.

Let R be a complex Euclidean space and let A(x; y) be a bilinear form on R. Let e1, e2, ⋯, en be an orthonormal basis in R. If
 x = ξ1e1 + ξ2e2 + ⋯ + ξnen and y = η1e1 + η2e2 + ⋯ + ηnen,
then A(x; y) can be written in the form
 (1)  A(x; y) = a11ξ1η̄1 + a12ξ1η̄2 + ⋯ + a1nξ1η̄n
       + a21ξ2η̄1 + a22ξ2η̄2 + ⋯ + a2nξ2η̄n
       + ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯
       + an1ξnη̄1 + an2ξnη̄2 + ⋯ + annξnη̄n.
We shall now try to represent the above expression as an inner product. To this end we rewrite it as follows:
 A(x; y) = (a11ξ1 + a21ξ2 + ⋯ + an1ξn)η̄1 + (a12ξ1 + a22ξ2 + ⋯ + an2ξn)η̄2 + ⋯ + (a1nξ1 + a2nξ2 + ⋯ + annξn)η̄n.
Now we introduce the vector z with coordinates
 ζ1 = a11ξ1 + a21ξ2 + ⋯ + an1ξn,
 ζ2 = a12ξ1 + a22ξ2 + ⋯ + an2ξn,
 ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯
 ζn = a1nξ1 + a2nξ2 + ⋯ + annξn.
It is clear that z is obtained by applying to x a linear transformation whose matrix is the transpose of the matrix ||a_ik|| of the bilinear form A(x; y). We shall denote this linear transformation by the letter A, i.e., we shall put z = Ax. Then
 A(x; y) = ζ1η̄1 + ζ2η̄2 + ⋯ + ζnη̄n = (z, y) = (Ax, y).
Thus, a bilinear form A(x; y) on a Euclidean vector space determines a linear transformation A such that
 A(x; y) = (Ax, y).
The converse of this proposition is also true, namely: A linear transformation A on a Euclidean vector space determines a bilinear form A(x; y) defined by the relation
 A(x; y) = (Ax, y).
The bilinearity of A(x; y) = (Ax, y) is easily proved:
 (A(x1 + x2), y) = (Ax1 + Ax2, y) = (Ax1, y) + (Ax2, y);  (A(λx), y) = λ(Ax, y);
 (Ax, y1 + y2) = (Ax, y1) + (Ax, y2);  (Ax, μy) = μ̄(Ax, y).
We now show that the bilinear form A(x; y) determines the transformation A uniquely. Thus, let
 A(x; y) = (Ax, y) and A(x; y) = (Bx, y).
Then
 (Ax, y) = (Bx, y), i.e., (Ax − Bx, y) = 0
for all y. But this means that Ax − Bx = 0 for all x. Hence Ax = Bx for all x, which is the same as saying that A = B. This proves the uniqueness assertion.

We can now sum up our results in the following

THEOREM 1. The equation
 (2)  A(x; y) = (Ax, y)
establishes a one-to-one correspondence between bilinear forms and linear transformations on a Euclidean vector space.
The one-oneness of the correspondence established by eq. (2) implies its independence from choice of basis.

There is another way of establishing a connection between bilinear forms and linear transformations. Namely, every bilinear form can be uniquely represented as
 A(x; y) = (x, A*y).
This representation is obtained by rewriting formula (1) above in the following manner:
 A(x; y) = ξ1(a11η̄1 + a12η̄2 + ⋯ + a1nη̄n) + ξ2(a21η̄1 + a22η̄2 + ⋯ + a2nη̄n) + ⋯ + ξn(an1η̄1 + an2η̄2 + ⋯ + annη̄n)
      = (x, A*y),
where A*y is the vector with coordinates
 ā_i1η1 + ā_i2η2 + ⋯ + ā_inηn  (i = 1, 2, ⋯, n).
Relative to an orthogonal basis the matrix ||a*_ik|| of A* and the matrix ||a_ik|| of A are connected by the relation
 a*_ik = ā_ki.
For a non-orthogonal basis the connection between the two matrices is more complicated.

2. Transition from A to its adjoint (the operation *)

DEFINITION 1. Let A be a linear transformation on a complex Euclidean space. The transformation A* defined by
 (Ax, y) = (x, A*y)
is called the adjoint of A.

THEOREM 2. In a Euclidean space there is a one-to-one correspondence between linear transformations and their adjoints.

Proof: According to Theorem 1 of this section every linear transformation determines a unique bilinear form A(x; y) = (Ax, y). On the other hand, by the result stated in the conclusion of para. 1, every bilinear form can be uniquely represented as (x, A*y). Hence
 (Ax, y) = (x, A*y),
and A* is uniquely determined by A.

The connection between the matrices of A and A* relative to an orthogonal basis was discussed above.

Some of the basic properties of the operation * are:
 1. (AB)* = B*A*;
 2. (A*)* = A;
 3. (A + B)* = A* + B*;
 4. (λA)* = λ̄A*;
 5. E* = E.

We give proofs of properties 1 and 2.

1. By the definition of A*,
 (ABx, y) = (Bx, A*y) = (x, B*A*y).
On the other hand, the definition of (AB)* implies
 (ABx, y) = (x, (AB)*y).
If we compare the right sides of the last two equations and recall that a linear transformation is uniquely determined by the corresponding bilinear form we conclude that
 (AB)* = B*A*.

2. Denote A* by C. By the definition of A*,
 (Ax, y) = (x, Cy).
Interchange of x and y gives
 (Ay, x) = (y, Cx),
and taking conjugates we obtain
 (x, Ay) = (Cx, y).
But this means that C* = A, i.e., (A*)* = A.

EXERCISES. 1. Prove properties 3 through 5 of the operation *.
2. Prove properties 1 through 5 of the operation * by making use of the connection between the matrices of A and A* relative to an orthogonal basis.
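The defining relation of the adjoint can be checked numerically; this sketch is not part of the original text and assumes Python with numpy. Relative to an orthonormal basis the adjoint is represented by the conjugate transpose, a*_ik = ā_ki.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A_star = A.conj().T                      # matrix of A* in an orthonormal basis

x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

def inner(u, v):
    # complex inner product (u, v) = sum_i u_i * conj(v_i)
    return np.sum(u * v.conj())

# Defining property of the adjoint: (Ax, y) = (x, A*y).
print(np.isclose(inner(A @ x, y), inner(x, A_star @ y)))   # True
```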
3. Self-adjoint, unitary and normal linear transformations. The operation * is to some extent the analog of the operation of conjugation which takes a complex number α into the complex number ᾱ. Indeed, for matrices of order one over the field of complex numbers the two operations are the same.

The real numbers are those complex numbers for which ᾱ = α. The class of linear transformations which are the analogs of the real numbers is of great importance. This class is introduced by

DEFINITION 2. A linear transformation is called self-adjoint (Hermitian) if A* = A.

We now show that for a linear transformation A to be self-adjoint it is necessary and sufficient that the bilinear form (Ax, y) be Hermitian. Indeed, to say that the form (Ax, y) is Hermitian is to say that
 (a)  (Ax, y) = conjugate of (Ay, x).
Again, to say that A is self-adjoint is to say that
 (b)  (Ax, y) = (x, Ay).
Clearly, equations (a) and (b) are equivalent.

Every complex number is representable in the form
 ζ = α + iβ, α, β real.
Similarly, every linear transformation A can be written as a sum
 (3)  A = A1 + iA2,
where A1 and A2 are self-adjoint transformations. In fact, let
 A1 = (A + A*)/2 and A2 = (A − A*)/2i.
Then A = A1 + iA2 and
 A1* = [(A + A*)/2]* = (A* + A**)/2 = (A* + A)/2 = A1,
 A2* = [(A − A*)/2i]* = (A* − A)/(−2i) = (A − A*)/2i = A2,
i.e., A1 and A2 are self-adjoint. This brings out the analogy between real numbers and self-adjoint transformations. The analogy is not accidental.

EXERCISES. 1. Prove the uniqueness of the representation (3) of A.
2. Prove that a linear combination with real coefficients of self-adjoint transformations is again self-adjoint.
3. Prove that if A is an arbitrary linear transformation then AA* and A*A are self-adjoint.

NOTE: In contradistinction to complex numbers AA* is, in general, different from A*A.

The product of two self-adjoint transformations is, in general, not self-adjoint. However:

THEOREM 3. For the product AB of two self-adjoint transformations A and B to be self-adjoint it is necessary and sufficient that A and B commute.

Proof: We know that A* = A and B* = B. We wish to find a condition which is necessary and sufficient for
 (4)  (AB)* = AB.
Now,
 (AB)* = B*A* = BA.
Hence (4) is equivalent to the equation AB = BA. This proves the theorem.

EXERCISE. Show that if A and B are self-adjoint, then AB + BA and i(AB − BA) are also self-adjoint.

The analog of complex numbers of absolute value one are unitary transformations.

DEFINITION 3. A linear transformation U is called unitary if
 UU* = U*U = E.⁵
In § 13 we shall become familiar with a very simple geometric interpretation of unitary transformations.

⁵ In n-dimensional spaces UU* = E and U*U = E are equivalent statements. This is not the case in infinite dimensional spaces.

EXERCISES. 1. Show that the product of two unitary transformations is a unitary transformation.
2. Show that if U is unitary and A self-adjoint, then U⁻¹AU is again self-adjoint.

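Returning to the decomposition (3), it can be verified numerically; the sketch below is not part of the original text and assumes Python with numpy.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 3
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# The "real and imaginary parts" of A, as in formula (3).
A1 = (A + A.conj().T) / 2
A2 = (A - A.conj().T) / 2j

is_hermitian = lambda M: np.allclose(M, M.conj().T)
print(is_hermitian(A1), is_hermitian(A2))    # True True
print(np.allclose(A, A1 + 1j * A2))          # True
```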
In the sequel (§ 15) we shall prove that every linear transformation can be written as the product of a self-adjoint transformation and a unitary transformation. This result can be regarded as a generalization of the result on the trigonometric form of a complex number.

DEFINITION 4. A linear transformation A is called normal if
 AA* = A*A.
There is no need to introduce an analogous concept in the field of complex numbers since multiplication of complex numbers is commutative. It is easy to see that unitary transformations and self-adjoint transformations are normal.

The subsequent sections of this chapter are devoted to a more detailed study of the various classes of linear transformations just introduced. In the course of this study we shall become familiar with very simple geometric characterizations of these classes of transformations.
§ 12. Self-adjoint (Hermitian) transformations. Simultaneous reduction of a pair of quadratic forms to a sum of squares

1. Self-adjoint transformations. This section is devoted to a more detailed study of self-adjoint transformations on n-dimensional Euclidean space. These transformations are frequently encountered in different applications. (Self-adjoint transformations on infinite dimensional space play an important role in quantum mechanics.)

LEMMA 1. The eigenvalues of a self-adjoint transformation are real.

Proof: Let x be an eigenvector of a self-adjoint transformation A and let λ be the eigenvalue corresponding to x, i.e.,
 Ax = λx, x ≠ 0.
Since A* = A,
 (Ax, x) = (x, Ax),
or,
 (λx, x) = (x, λx),
i.e.,
 λ(x, x) = λ̄(x, x).
Since (x, x) ≠ 0, it follows that λ = λ̄, which proves that λ is real.

LEMMA 2. Let A be a self-adjoint transformation on an n-dimensional Euclidean vector space R and let e be an eigenvector of A. The totality R1 of vectors x orthogonal to e form an (n − 1)-dimensional subspace invariant under A.

Proof: The totality R1 of vectors x orthogonal to e form an (n − 1)-dimensional subspace of R. We show that R1 is invariant under A. Let x ∈ R1, that is, (x, e) = 0. We have to show that Ax ∈ R1, that is, (Ax, e) = 0. Indeed,
 (Ax, e) = (x, A*e) = (x, Ae) = (x, λe) = λ̄(x, e) = 0.

THEOREM 1. Let A be a self-adjoint transformation on an n-dimensional Euclidean space. Then there exist n pairwise orthogonal eigenvectors of A. The corresponding eigenvalues of A are all real.

Proof: According to Theorem 1, § 10, there exists at least one eigenvector e1 of A. By Lemma 2, the totality of vectors orthogonal to e1 form an (n − 1)-dimensional invariant subspace R1. We now consider our transformation A on R1 only. In R1 there exists a vector e2 which is an eigenvector of A (cf. note to Theorem 1, § 10). The totality of vectors of R1 orthogonal to e2 form an (n − 2)-dimensional invariant subspace R2. In R2 there exists an eigenvector e3 of A, etc. In this manner we obtain n pairwise orthogonal eigenvectors e1, e2, ⋯, en. By Lemma 1, the corresponding eigenvalues are real. This proves Theorem 1.

THEOREM 2. Let A be a linear transformation on an n-dimensional Euclidean space R. For A to be self-adjoint it is necessary and sufficient that there exist an orthogonal basis relative to which the matrix of A is diagonal and real.

Necessity: Let A be self-adjoint. Since the product of an eigenvector by any non-zero number is again an eigenvector, we can select the vectors ei so that each of them is of length one. Select in R a basis consisting of the n pairwise orthogonal eigenvectors e1, e2, ⋯, en of A constructed in the proof of Theorem 1. Then
 Ae1 = λ1e1, Ae2 = λ2e2, ⋯, Aen = λnen,
and it follows that relative to this basis the matrix of the transformation A is of the form

 (1)  [λ1 0 ⋯ 0
     0 λ2 ⋯ 0
     ⋯⋯⋯⋯
     0 0 ⋯ λn],

where the λi are real.

Sufficiency: Assume now that the matrix of the transformation A has relative to an orthogonal basis the form (1). The matrix of the adjoint transformation A* relative to an orthonormal basis is obtained by replacing all entries in the transpose of the matrix of A by their conjugates (cf. § 11). In our case this operation has no effect on the matrix in question. Hence the transformations A and A* have the same matrix, i.e., A = A*. This concludes the proof of Theorem 2.

We note the following property of the eigenvectors of a self-adjoint transformation: the eigenvectors corresponding to different eigenvalues are orthogonal. Indeed, let
 Ae1 = λ1e1, Ae2 = λ2e2, λ1 ≠ λ2.
Then
 (Ae1, e2) = (e1, A*e2) = (e1, Ae2),
that is,
 λ1(e1, e2) = λ̄2(e1, e2) = λ2(e1, e2)
(the λi are real). Since λ1 ≠ λ2, it follows that
 (e1, e2) = 0.
NOTE: Theorem 2 suggests the following geometric interpretation of a self-adjoint transformation: We select in our space n pairwise orthogonal directions (the directions determined by the eigenvectors) and associate with each a real number λi (eigenvalue). Along each one of these directions we perform a stretching by |λi| and, if λi happens to be negative, a reflection in the plane orthogonal to the corresponding direction.

Along with the notion of a self-adjoint transformation we introduce the notion of a Hermitian matrix. The matrix ||a_ik|| is said to be Hermitian if a_ik = ā_ki.

Clearly, a necessary and sufficient condition for a linear transformation A to be self-adjoint is that its matrix relative to some orthogonal basis be Hermitian.

EXERCISE. Raise the matrix

 [0  √2
  √2 1]

to the 28th power. Hint: Bring the matrix to its diagonal form, raise it to the proper power, and then revert to the original basis.

2. Reduction to principal axes. Simultaneous reduction of a pair of quadratic forms to a sum of squares. We now apply the results obtained in para. 1 to quadratic forms.

We know that we can associate with each Hermitian bilinear form a self-adjoint transformation. We have shown in § 8 that in any vector space a Hermitian quadratic form can be written in an appropriate basis as a sum of squares. In the case of a Euclidean space we can state a stronger result, namely, we can assert the existence of an orthonormal basis relative to which a given Hermitian quadratic form can be reduced to a sum of squares. Theorem 2 permits us now to state the important

THEOREM 3. Let A(x; y) be a Hermitian bilinear form defined on an n-dimensional Euclidean space R. Then there exists an orthonormal basis in R relative to which the corresponding quadratic form can be written as a sum of squares,
 A(x; x) = λ1|ξ1|² + λ2|ξ2|² + ⋯ + λn|ξn|²,
where the λi are real and the ξi are the coordinates of the vector x.

Proof: Let A(x; y) be a Hermitian bilinear form; then there exists (cf. § 11) a self-adjoint linear transformation A such that
 A(x; y) = (Ax, y).
As our orthonormal basis vectors we select the pairwise orthogonal eigenvectors e1, e2, ⋯, en of the self-adjoint transformation A (cf. Theorem 1). Then
 Ae1 = λ1e1, Ae2 = λ2e2, ⋯, Aen = λnen.
Let
 x = ξ1e1 + ξ2e2 + ⋯ + ξnen, y = η1e1 + η2e2 + ⋯ + ηnen.
Since
 (ei, ek) = 1 for i = k and (ei, ek) = 0 for i ≠ k,
we get
 A(x; y) = (Ax, y) = (ξ1Ae1 + ξ2Ae2 + ⋯ + ξnAen, η1e1 + η2e2 + ⋯ + ηnen)
      = (λ1ξ1e1 + λ2ξ2e2 + ⋯ + λnξnen, η1e1 + η2e2 + ⋯ + ηnen)
      = λ1ξ1η̄1 + λ2ξ2η̄2 + ⋯ + λnξnη̄n.
In particular,
 A(x; x) = λ1|ξ1|² + λ2|ξ2|² + ⋯ + λn|ξn|².
This proves the theorem.

The process of finding an orthonormal basis in a Euclidean space relative to which a given quadratic form can be represented as a sum of squares is called reduction to principal axes.

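Reduction to principal axes is easy to illustrate numerically; the following sketch is not part of the original text and assumes Python with numpy.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 3
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (M + M.conj().T) / 2             # a Hermitian matrix defining a Hermitian quadratic form

# Orthonormal eigenvectors and real eigenvalues (reduction to principal axes).
lam, E = np.linalg.eigh(H)            # columns of E form an orthonormal eigenvector basis

x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
xi = E.conj().T @ x                   # coordinates of x relative to the eigenvector basis

value_original = (x.conj() @ H @ x).real          # value of the quadratic form
value_principal_axes = np.sum(lam * np.abs(xi) ** 2)
print(np.isclose(value_original, value_principal_axes))   # True
```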
3. Simultaneous reduction of a pair of quadratic forms to a sum of squares.

THEOREM 4. Let A(x; x) and B(x; x) be two Hermitian quadratic forms on an n-dimensional vector space R and assume B(x; x) to be positive definite. Then there exists a basis in R relative to which each form can be written as a sum of squares.

Proof: We introduce in R an inner product by putting
 (x, y) = B(x; y),
where B(x; y) is the bilinear form corresponding to B(x; x). This can be done since the axioms for an inner product state that (x, y) is a Hermitian bilinear form corresponding to a positive definite quadratic form (§ 8). With the introduction of an inner product our space R becomes a Euclidean vector space. By Theorem 3, R contains an orthonormal basis e1, e2, ⋯, en relative to which the form A(x; x) can be written as a sum of squares,
 (2)  A(x; x) = λ1|ξ1|² + λ2|ξ2|² + ⋯ + λn|ξn|².
Now, with respect to an orthonormal basis an inner product takes the form
 (x, x) = |ξ1|² + |ξ2|² + ⋯ + |ξn|².
Since B(x; x) = (x, x), it follows that
 (3)  B(x; x) = |ξ1|² + |ξ2|² + ⋯ + |ξn|².
We have thus found a basis relative to which both quadratic forms A(x; x) and B(x; x) are expressible as sums of squares.

We now show how to find the numbers λ1, λ2, ⋯, λn which appear in (2) above. Relative to the basis e1, e2, ⋯, en the matrices of the quadratic forms A and B have the following canonical form:

 𝒜 = [λ1 0 ⋯ 0      ℬ = [1 0 ⋯ 0
    0 λ2 ⋯ 0        0 1 ⋯ 0
    ⋯⋯⋯⋯         ⋯⋯⋯
    0 0 ⋯ λn],        0 0 ⋯ 1].

Consequently,
 (4)  Det (𝒜 − λℬ) = (λ1 − λ)(λ2 − λ) ⋯ (λn − λ).
Under a change of basis the matrices of the Hermitian quadratic forms A and B go over into the matrices 𝒜1 = 𝒞*𝒜𝒞 and ℬ1 = 𝒞*ℬ𝒞. Hence, if e′1, e′2, ⋯, e′n is an arbitrary basis, then with respect to this basis
 Det (𝒜1 − λℬ1) = Det 𝒞* · Det (𝒜 − λℬ) · Det 𝒞,
i.e., Det (𝒜1 − λℬ1) differs from (4) by a multiplicative constant. It follows that the numbers λ1, λ2, ⋯, λn are the roots of the equation

 | a11 − λb11  a12 − λb12  ⋯  a1n − λb1n |
 | a21 − λb21  a22 − λb22  ⋯  a2n − λb2n |  = 0,
 | ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯ |
 | an1 − λbn1  an2 − λbn2  ⋯  ann − λbnn |

where ||a_ik|| and ||b_ik|| are the matrices of the quadratic forms A(x; x) and B(x; x) relative to the basis e′1, e′2, ⋯, e′n.

NOTE: The following example illustrates that the requirement that one of the two forms be positive definite is essential. The two quadratic forms
 A(x; x) = |ξ1|² − |ξ2|², B(x; x) = ξ1ξ̄2 + ξ2ξ̄1,
neither of which is positive definite, cannot be reduced simultaneously to a sum of squares. Indeed, the matrix of the first form is
 [1 0
  0 −1]
and the matrix of the second form is
 [0 1
  1 0].
Consider the matrix
 𝒜 − λℬ = [1  −λ
      −λ −1],
where λ is a real parameter. Its determinant is equal to −(λ² + 1) and has no real roots. Therefore, in accordance with the preceding discussion, the two forms cannot be reduced simultaneously to a sum of squares.

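A numerical sketch of the simultaneous reduction, not part of the original text; it assumes Python with numpy, and it obtains the numbers λ1, ⋯, λn by treating the positive definite form B as the inner product (via its Cholesky factor).

```python
import numpy as np

rng = np.random.default_rng(7)
n = 3
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                    # an arbitrary real symmetric form
B = M @ M.T + n * np.eye(n)          # a positive definite form

# The numbers lambda_1, ..., lambda_n are the roots of det(A - lambda*B) = 0.
L = np.linalg.cholesky(B)            # B = L L^T, i.e. use B as the inner product
Linv = np.linalg.inv(L)
lam = np.linalg.eigvalsh(Linv @ A @ Linv.T)

# Check: each lambda_i makes A - lambda_i * B singular.
print([bool(np.isclose(np.linalg.det(A - l * B), 0.0, atol=1e-8)) for l in lam])
```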
. ann is the matrix of the adjoint U* of U. we select an orthonormal basis el. i. a=1 a=1 a(T = O (i k). = 1. en. . a unitary transformation preserves the length of a vector.. To do this.. Let [all a21 1E12 a22 aa a1 a2 dn dn d12 a an]] dn2 be the matr x of the transformation U relative to this basis. We shall now characterize the matrix of a unitary transformation. in addition. i. e2. EXERCISE. a2d. a=1 a-1 aak = O (i k).. but refers to the columns rather than the rows of the matrix of U. Making use of the condition U*U = E we obtain. for x = y we have (Ux.e. In particular. the sum of the squares of the moduli of the elements of any row is equal to one. it follows that U*LI = E. Thus. This condition is analogous to the preceding one. that is.. Ux) = (x. relative to an orthonormal basis. a2. x). aiti. the matrix of a unitary transformation U has the following properties: the sum of the products of the elements of any YOW by the conjugates of the corresponding elements of any other YOW is equal to zero. Prove that a linear transformation which preserves length is unitary. = 1. The condition UU* = E implies that the product of the matrices (2) and (3) is equal to the unit matrix. Then d22 al.104 LECTURES ON LINEAR ALGEBRA Since equality of bilinear forms implies equality of corresponding transformations.e. U is unitary..

equivalently. Hence f 1 for i = k. A matrix I laall whose elements satisfy condition (4) or. As we have shown unitary matrices are matrices of unitary transformations relative to an orthonormal basis. Then x O. Ue2. LEMMA 1. 0 for i It follows that a necessary and sufficient condition for a linear transformation U to be unitary is that it take an orthonormal basis en into an orthonormal basis Uek . Since a transformation which takes an orthonormal basis into another orthonormal basis is unitary. 2x) = 22(x. condition (5) is called unitary.e. orthonormal basis). e2.LINEAR TRANSFORMATIONS 105 Condition (5) has a simple geometric meaning. Ue = 2e. en to be an is equal to axid (since we assumed el. e1. that is. x). LEMMA 2. e2. the inner product of the vectors +a Uei = ai. of R consisting of all Then the (n vectors x orthogonal to e is invariant under U.. the matrix of transition from an orthonormal basis to another orthonormal basis is also unitary.. (x. . Let U be a unitary transfor ation on an n-di ensional space R and e its eigenvector. . (6) (Uei. i. + a2e2 + and akk a2k e2 + + anke . e O.e. The eigenvalues of a unitary transformation are in absolute value equal to one. 1)-d mensional subspace R. x) = (Ux. Indeed. Ux) = (2x. Uen. Ux = Ax. Uek) 1 k. i. We shall now try to find the simplest form of the matrix of a unitary transformation relative to some suitably chosen basis. Ai = 1 or 121 = 1. Proof: Let x be an eigenvector of a unitary transformation U and let A be the corresponding eigenvalue.
Condition (5) has a simple geometric meaning. Indeed, the inner product of the vectors
 Uei = a1ie1 + a2ie2 + ⋯ + anien and Uek = a1ke1 + a2ke2 + ⋯ + anken
is equal to Σ_α a_αi ā_αk (since we assumed e1, e2, ⋯, en to be an orthonormal basis). Hence
 (6)  (Uei, Uek) = 1 for i = k, and (Uei, Uek) = 0 for i ≠ k.
It follows that a necessary and sufficient condition for a linear transformation U to be unitary is that it take an orthonormal basis e1, e2, ⋯, en into an orthonormal basis Ue1, Ue2, ⋯, Uen.

A matrix ||a_ik|| whose elements satisfy condition (4) or, equivalently, condition (5) is called unitary. As we have shown, unitary matrices are matrices of unitary transformations relative to an orthonormal basis. Since a transformation which takes an orthonormal basis into another orthonormal basis is unitary, the matrix of transition from an orthonormal basis to another orthonormal basis is also unitary.

We shall now try to find the simplest form of the matrix of a unitary transformation relative to some suitably chosen basis.

LEMMA 1. The eigenvalues of a unitary transformation are in absolute value equal to one.

Proof: Let x be an eigenvector of a unitary transformation U and let λ be the corresponding eigenvalue, i.e.,
 Ux = λx, x ≠ 0.
Then
 (x, x) = (Ux, Ux) = (λx, λx) = λλ̄(x, x),
that is, λλ̄ = 1, i.e., |λ| = 1.

LEMMA 2. Let U be a unitary transformation on an n-dimensional space R and e its eigenvector, i.e.,
 Ue = λe, e ≠ 0.
Then the (n − 1)-dimensional subspace R1 of R consisting of all vectors x orthogonal to e is invariant under U.

Proof: Let x ∈ R1, i.e., (x, e) = 0. We shall show that Ux ∈ R1, i.e., (Ux, e) = 0. Indeed,
 (Ux, Ue) = (U*Ux, e) = (x, e) = 0.
Since Ue = λe, it follows that
 (Ux, λe) = 0, i.e., λ̄(Ux, e) = 0.
Since λ ≠ 0, we have
 (Ux, e) = 0,
i.e., Ux ∈ R1. Thus, the subspace R1 is indeed invariant under U.

THEOREM 1. Let U be a unitary transformation defined on an n-dimensional Euclidean space R. Then U has n pairwise orthogonal eigenvectors. The corresponding eigenvalues are in absolute value equal to one.

Proof: In view of Theorem 1, § 10, the transformation U as a linear transformation has at least one eigenvector. Denote this vector by e1. By Lemma 2, the (n − 1)-dimensional subspace R1 of all vectors of R which are orthogonal to e1 is invariant under U. Hence R1 contains at least one eigenvector e2 of U. Denote by R2 the invariant subspace consisting of all vectors of R1 orthogonal to e2. R2 contains at least one eigenvector e3 of U, etc. Proceeding in this manner we obtain n pairwise orthogonal eigenvectors e1, e2, ⋯, en of the transformation U. By Lemma 1 the eigenvalues corresponding to these eigenvectors are in absolute value equal to one.

THEOREM 2. Let U be a unitary transformation on an n-dimensional Euclidean space R. Then there exists an orthonormal basis in R relative to which the matrix of the transformation U is diagonal, i.e., has the form

 (7)  [λ1 0 ⋯ 0
     0 λ2 ⋯ 0
     ⋯⋯⋯⋯
     0 0 ⋯ λn].

The numbers λ1, λ2, ⋯, λn are in absolute value equal to one.

Proof: We claim that the n pairwise orthogonal eigenvectors constructed in the preceding theorem constitute the desired basis. Indeed,
 Ue1 = λ1e1, Ue2 = λ2e2, ⋯, Uen = λnen,
and, therefore, the matrix of U relative to the basis e1, e2, ⋯, en has form (7). By Lemma 1 the numbers λ1, λ2, ⋯, λn are in absolute value equal to one. This proves the theorem.

EXERCISES. 1. Prove the converse of Theorem 2, i.e., if the matrix of U has form (7) relative to some orthogonal basis, where the λi are in absolute value equal to one, then U is unitary.
2. Prove that if A is a self-adjoint transformation then the transformation (A − iE)⁻¹(A + iE) exists and is unitary.

Since the matrix of transition from one orthonormal basis to another is unitary we can give the following matrix interpretation to the result obtained in this section.

Let 𝒰 be a unitary matrix. Then there exists a unitary matrix 𝒱 such that
 𝒰 = 𝒱⁻¹𝒟𝒱,
where 𝒟 is a diagonal matrix whose non-zero elements are equal in absolute value to one.

Analogously, the main result of para. 1, § 12, can be given the following matrix interpretation. Let 𝒜 be a Hermitian matrix. Then 𝒜 can be represented in the form
 𝒜 = 𝒱⁻¹𝒟𝒱,
where 𝒱 is a unitary matrix and 𝒟 a diagonal matrix whose non-zero elements are real.

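These matrix statements can be illustrated numerically; the sketch below is not part of the original text and assumes Python with numpy (a unitary matrix is produced here by a QR factorization merely for convenience).

```python
import numpy as np

rng = np.random.default_rng(8)
n = 3
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
U, _ = np.linalg.qr(M)                   # a unitary matrix

# Rows and columns are orthonormal: U* U = E.
print(np.allclose(U.conj().T @ U, np.eye(n)))        # True

# Its eigenvalues all have absolute value one, and U = V D V^{-1}.
lam, V = np.linalg.eig(U)
print(np.allclose(np.abs(lam), 1.0))                  # True
print(np.allclose(V @ np.diag(lam) @ np.linalg.inv(V), U))   # True
```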
§ 14. Commutative linear transformations. Normal transformations

1. Commutative transformations. We have shown (§ 12) that for each self-adjoint transformation there exists an orthonormal basis relative to which the matrix of the transformation is diagonal. It may turn out that, given a number of self-adjoint transformations, we can find a basis relative to which all these transformations are represented by diagonal matrices. We shall now discuss conditions for the existence of such a basis. We first consider the case of two transformations.

LEMMA 1. Let A and B be two commutative linear transformations, i.e., let
 AB = BA.
Then the eigenvectors of A which correspond to a given eigenvalue λ of A form (together with the null vector) a subspace R_λ invariant under the transformation B.

Proof: We have to show that if x ∈ R_λ, i.e., Ax = λx, then Bx ∈ R_λ, i.e., ABx = λBx. Since AB = BA, we have
 ABx = BAx = Bλx = λBx,
which proves our lemma.

LEMMA 2. Any two commutative transformations have a common eigenvector.

Proof: Let AB = BA and let R_λ be the subspace consisting of all vectors x for which Ax = λx, where λ is an eigenvalue of A. By Lemma 1, R_λ is invariant under B. Hence R_λ contains a vector x0 which is an eigenvector of B (B, considered on R_λ only, has at least one eigenvector there). x0 is also an eigenvector of A, since by assumption all the vectors of R_λ are eigenvectors of A.

NOTE: If AB = BA we cannot claim that every eigenvector of A is also an eigenvector of B. For instance, if A is the identity transformation E, B a linear transformation other than E and x a vector which is not an eigenvector of B, then x is an eigenvector of E, EB = BE, and yet x is not an eigenvector of B.

THEOREM 1. Let A and B be two linear self-adjoint transformations defined on a complex n-dimensional vector space R. A necessary and sufficient condition for the existence of an orthogonal basis in R relative to which the transformations A and B are represented by diagonal matrices is that A and B commute.

Sufficiency: Let AB = BA. Then, by Lemma 2, there exists a vector e1 which is an eigenvector of both A and B:
 Ae1 = λ1e1, Be1 = μ1e1.
The (n − 1)-dimensional subspace R1 orthogonal to e1 is invariant under A and B (cf. Lemma 2, § 12). Now consider A and B on R1 only. By Lemma 2, there exists a vector e2 in R1 which is an eigenvector of A and B:
 Ae2 = λ2e2, Be2 = μ2e2.
All vectors of R1 which are orthogonal to e2 form an (n − 2)-dimensional subspace invariant under A and B, etc. Proceeding in this way we get n pairwise orthogonal eigenvectors e1, e2, ⋯, en of A and B:
 Aei = λiei, Bei = μiei (i = 1, 2, ⋯, n).
Relative to e1, e2, ⋯, en the matrices of A and B are diagonal. This completes the sufficiency part of the proof.

Necessity: Assume that the matrices of A and B are diagonal relative to some orthogonal basis. It follows that these matrices commute. But then the transformations themselves commute.

NOTE: Theorem 1 can be generalized to any set of pairwise commutative self-adjoint transformations. The proof follows that of Theorem 1, but instead of Lemma 2 the following lemma is made use of:

LEMMA 2′. The elements of any set of pairwise commutative transformations on a vector space R have a common eigenvector.

Proof: The proof is by induction on the dimension of the space R. In the case of a one-dimensional space (n = 1) the lemma is obvious. We assume that it is true for spaces of dimension < n and prove it for an n-dimensional space.

If every vector of R is an eigenvector of all the transformations A, B, C, ⋯ in our set S, our lemma is proved. This means that the transformations A, B, C, ⋯ are multiples of the identity transformation. Assume therefore that there exists a vector in R which is not an eigenvector of, say, the transformation A. Let R1 be the set of all eigenvectors of A corresponding to some eigenvalue λ of A. By Lemma 1, R1 is invariant under each of the transformations B, C, ⋯ (obviously, R1 is also invariant under A). Furthermore, R1 is a subspace different from the null space and the whole space. Since, by assumption, our lemma is true for spaces of dimension < n, R1 must contain a vector which is an eigenvector of the transformations A, B, C, ⋯. This proves our lemma.

EXERCISE. Let U1 and U2 be two commutative unitary transformations. Prove that there exists a basis relative to which the matrices of U1 and U2 are diagonal.

2. Normal transformations. In §§ 12 and 13 we considered two classes of linear transformations which are represented in a suitable orthonormal basis by a diagonal matrix. We shall now characterize all transformations with this property.
THEOREM 2. A necessary and sufficient condition for the existence of an orthogonal basis relative to which a transformation A is represented by a diagonal matrix is
 AA* = A*A
(such transformations are said to be normal, cf. § 11).

Necessity: Let the matrix of the transformation A be diagonal relative to some orthonormal basis, i.e., let the matrix be of the form

 [λ1 0 ⋯ 0
  0 λ2 ⋯ 0
  ⋯⋯⋯⋯
  0 0 ⋯ λn].

Relative to such a basis the matrix of the transformation A* has the form

 [λ̄1 0 ⋯ 0
  0 λ̄2 ⋯ 0
  ⋯⋯⋯⋯
  0 0 ⋯ λ̄n].

Since the matrices of A and A* are diagonal they commute. It follows that A and A* commute.

Sufficiency: Assume that A and A* commute. Then by Lemma 2 there exists a vector e1 which is an eigenvector of A and A*, i.e.,
 Ae1 = λ1e1, A*e1 = μ1e1.

EXERCISE. Prove that μ1 = λ̄1.

The (n − 1)-dimensional subspace R1 of vectors orthogonal to e1 is invariant under A as well as under A*. Indeed, let x ∈ R1, i.e., (x, e1) = 0. Then
 (Ax, e1) = (x, A*e1) = (x, μ1e1) = μ̄1(x, e1) = 0,
i.e., Ax ∈ R1. This proves that R1 is invariant under A. The invariance of R1 under A* is proved in an analogous manner.

Applying now Lemma 2 to R1, we can claim that R1 contains a vector e2 which is an eigenvector of A and A*. Let R2 be the (n − 2)-dimensional subspace of vectors from R1 orthogonal to e2, etc. Continuing in this manner we construct n pairwise orthogonal vectors e1, e2, ⋯, en which are eigenvectors of A and A*. The vectors e1, e2, ⋯, en form an orthogonal basis relative to which both A and A* are represented by diagonal matrices.

An alternative sufficiency proof. Let
 A1 = (A + A*)/2, A2 = (A − A*)/2i.
The transformations A1 and A2 are self-adjoint. If A and A* commute, then so do A1 and A2. By Theorem 1, there exists an orthonormal basis in which A1 and A2 are represented by diagonal matrices. But then the same is true of A = A1 + iA2.

A unitary transformation U is also normal since UU* = U*U = E. Note that if A is a self-adjoint transformation then AA* = A*A = A², i.e., A is normal. Thus some of the results obtained in §§ 12 and 13 are special cases of Theorem 2.

EXERCISES. 1. Prove that a normal transformation A can be written in the form
 A = HU = UH,
where H is self-adjoint, U unitary, and H and U commute. Hint: Select a basis relative to which A and A* are diagonable.
2. Prove that if A = HU, where H is self-adjoint, U unitary, and where H and U commute, then A is normal.
3. Prove that the matrices of a set of normal transformations any two of which commute are simultaneously diagonable.

for all x.11 of the transformation A relative to any orthogonal basis is different from zero. Furthermore. Hence the determinant of the matrix of AA* is different from zero. Proof: The transformation AA* is positive definite. We shall first assume the theorem true and show how to find the necessary H and U. The determinant of the matrix I fri.0. so that AA* -= H2. LEMMA 1. in order to find H one has to "extract the square root" of AA*. the transformation AA* is positive definite. This will suggest a way of proving the theorem. H is easily expressible in terms of A. where U is unitary and H is a non-singular positive definite transformation. If A is non-singular then so is AA*. Given any linear transformation A. Thus. If A is non-singular. Consequently.). let A = HU. Conversely. that is. Indeed. Having found H. (AA* x. A*x) 0.211 of the transformation A* relative to the same basis is the complex conjugate of the determinant of the matrix 11(4. Before proving Theorem 1 we establish three lemmas. Indeed. Thus AA* is positive definite. The eigenvalues of a positive definite transformation B are non-negative.112 LECTURES ON LINEAR ALGEBRA represented in the form A = HU (or A = U. which means that AA* is non-singular. (AA*)* = A**A* = AA*. we put U = H-1A. then the determinant of the matrix ilai. .H. LEMMA 2. AA* is self-adjoint. where H(H1) is a non-singular positive definite transformation and U(U1) a unitary transformation. x) = (A*x. if all the eigenvalues of a self-adjoint transformation B are non-negative then B is positive definite.

Given any positive definite transformation B. = (el Bel (E121e1+$222e2+ +$/1. Proof: We select in R an orthogonal basis relative to which B is of the form [Al O 01 B=0 0 A. In addition. are the eigenvalues of B. e be an orthonormal basis consisting of the eigenvectors of B. e). x) NOTE: It iS clear from equality (1) that if all the A.. A. S nce all the 1 are non-negative it follows that (Bx. 0 ' H= O VA2 0 \/2 App y ng Lemma 2 again we conclude that H is positive definite.LINEAR TRANSFORMATIONS 113 Proof.>. E1e1fe2e2+ ±e) 221E2 E2 Be2 + + $Be. 0 2 O where 21. there exists a positive definite transformation H such that H2 = B (in this case we write H = Bi). Since (Be. Conversely. O. e) = 2(e. . .. Let x= be any vector of R. . Then (Bx. e) >. conversely. Ar.en. are positive then the transformation B is non-singular and. Then (Be. Put . An env. e) > 0. 22. e2. Let B be positive definite and let Be = 2e. E2e2 + -Fe) O. it follows that A O. if B is positive definite and non-singular then the are positive. assume that all the eigenvalues of a self-adjoint transformation B are non-negative. By Lemma 2 all [V21. LEMMA 3. Let e1. x) (I) E2e2 + +e. 0 and (e. if B is non-singular then H is non-singular.

which is easily seen to be self-adjoint.114 LECTURES ON LINEAR ALGEBRA Furthermore. H is a non-singular positive definite transformation. If (2) U= then U is unitary. at least one of which is non-singular. This completes the proof of Theorem 1. The operat on of extracting the square root of a transformation can be used to prove the following theorem: THEOREM. (2) we get A = HU. This completes the proof. Let H= (AA*). For the purpose of this discussion . Proof: We know that the transformations X = AB and C-1 XC have the same characteristic polynomials and therefore the same eigenvalues. then (cf. Hence A/2i > 0 and H is non-singular We now prove Theorem 1. § 16. Prove that if A and B are positive definite transformations. Making use of eq. Indeed. Then C-1XC = A1ABA1 Ai BA+. In view of Lemmas 1 and 3. note to Lemma 2) > O. if B is non-singular. UU* = H--1A (H-1A)* = H-1AA* H-' = H-1112H-' = E. then C-1 XC and X = AB will both have real eigenvalues. Let A be a non-singular positive definite transformation and let B be a self-adjoint transformation. Let A be a non-singular linear transformation. EXERCISE. (Ai 132Ai )* = (Ai )* B* (Ai)* = A1 BA'. then the transformation AB has nonnegative eigenvalues. Then the eigenvalues of the transformation AB are real. A suitable choice for C is C Ai. Linear transformations on a real Euclidean space This section will be devoted to a discussion of linear transformations defined on a real space. If we can choose C so that C-i XC is self-adjoint. Indeed.

Proof: Let e1. We can thus rewrite (1) in the form Ax = 2.x. Consider the system of equations ( 6112E2 + 4222 e2 + (1) 4221 + a22E2 T T ainE 2E2. ar1E2 T a2$2 T The system 1) has a non-trivial solution if and only if an 2 0112 al 2 a22 a2 an. a rotation of the plane about the origin by an angle different from hat is a linear transformation which does not have any one-dimensional invariant subspace. There arise two possibilities: a.a2 aA .. Every linear transformation in a real vector space R has a one-dimensional or two-dimensional invariant subspace. These numbers are the coordinates of some vector x relative to the basis e1.LINEAR TRANSFORMATIONS 115 the reader need only be familiar with the material of §§ 9 through 11 of this chapter. .2( be the matrix of A relative to this basis. This equation is an nth order polynomial equation in A with real coefficients. i. en be a basis in R and let I la . This result which played a fundamental role in the development of the theory of complex vector spaces does not apply in the case of real spaces. The concepts of invariant subspace. not all zero which are a solution of (1). However. $20. e2. the vector x spans a one-dimensional invariant subspace. we can state the following THEOREM 1. e2. + annen = 2$7. Then we can find numbers E1°. and eigenvalue introduced in § 10 were defined for a vector space over an arbitrary field and are therefore relevant in the case of a real vector space. 1. Thus.. . Let A be one of its roots. e. . eigenvector.e. + a2$ = 2E2. In § 10 we proved that in a complex vector space every linear transformation has at least one eigenvector (onedimensional invariant subspace). Ao is a real root.

Thus the relations (2) and (2') Ay = + t3x.1 a21n1 a22n2 + ' + alniin = °U71.e. Let el.1 16 LECTURES ON LINEAR ALGEBRA b. n2. In the sequel we shall make use of the fact that in a two-dimen43 the sional invariant subspace associated with the root 2 = oc transformation has form (3). y) = (x.$ = a&2 ' and (2)' an r + a12 n2 -i. e2.f + a2nnii = 15t/12 ' /3E2. + O. in R. The numbers Eib e2 " nates of some vector x Ax acx en (y) fly. Replacing $i. Y = Ghei n2e2 ' ' + e2 e2 + Furthermore. x . threedimensional) every transformation has a one-dimensional invariant subspace. Ay) for any vectors x and y. EXERCISE. E be a solution of (1 ). e be an orthonormal basis in R and let + enen... 4. a2$2 + ' + a7. . let Ci be the coordinates of the vector z Ax. $ in ( 1) by these numbers and separating the real and imaginary parts we get (2) + inn E2 1.ßE1. A2. n) are the coordi(ni. Show that in an odd-dimensional space (in particular. + amen --= ace' + azii en Cte2 Pni. $2. can be rewritten as follows Equations (3) imply that the two dimensional subspace spanned by the vectors x and y is invariant under A.. 2.1/2. + 02072 + annyin = Gobi + ß. Let 1. A linear transformation A defi ed on a real Euclidean space R is said to be self-adjoint if (Ax. cJane. Self-adjoint transformations DEFINITION 1. + a12e2 = anEi + 022E2 + ani$. i..)7.

LINEAR TRANSFORMATIONS 117 =a/M. y). We first prove two lemmas. condition (4) is equivalent to aik aki. 3r) = k=1 aikeink where aik ak. 1. VVe shall make use of this result in the proof of Theorem 3 of this section. Y) = (z. Ay) = k=1 aikeink.i. e2. It follows that (Ax. k=1 where jaiklj is the matrix of A relative to the basis el. Relative to an arbitrary basis every symmetric b 1 near form A (x. y) = (Ax. We shall now show that given a self-adjoint transformation there exists an orthogonal basis relative to which the matrix of the transformation is diagonal. y) is represented by A (X. Y) = E :ini = 1=1 i. I and is thus independent of the theorem asserting the existence of the root of an algebraic equation is given in § 17. Thus. Comparing (5) and (6) we obta n the following result: Given a symmetric bilinear form A (x. To sum up. The proof of this statement will be based on the material of para. for a linear transformation to be self-adjoint it is necessary and sufficient that its matrix relative to an orthonormal basis be symmetric. y) there ex sts a self-adjoint transformation A such that A (x. (x. . A different proof which does not depend on the results of para. en. . k=1 aikk?h Similarly.

x e R. Denote by R' the subspace consisting of vectors orthogonal to e. Subtracting the first equation from the second we get [note that (Ax. Ay = fix + ay. a two-dimensional invariant subspace.. THEOREM 2. Proof: According to Theorem 1 of this section. Ay)] O = 2/3[(x. y). i. el) = 0. Then the totality R' of vectors orthogonal to el forms an (n 1)-dimensional invariant subspace.. Every self-adjoint transformation has a one-di ensional invariant subspace.. x) + (y..118 LECTURES ON LINEAR ALGEBRA LEMMA 1. y) = 0. it follows that 13 = O. let x e R'. Then (Ax. Contradiction. it contains (again. We show that R' is invariant under A. by Lemma 1) . to prove Lemma 1 we need only show that all the roots of a self-adjoint transformation are real. x) (x. to every real root A of the characteristic equation there corresponds a onedimensional invariant subspace and to every complex root A. In the proof of Theorem 1 we constructed two vectors x and y such that Ax = ax fiy. S nce (x. the transformation A has at least one eigenvector e. There exists an orthonormal basis relative to which the matrix of a self-adjoint transformation A is diagonal. Y) y) (x. Proof: It is clear that the totality R' of vectors x. Thus. forms an (n 1)-dimensional subspace.e. Suppose that A = + O. LEMMA 2. 2e1) = 2(x. y) = (x. e1) = O. Thus. Ay) = /3(x. (x. i. Let A be a self-adjoint transformation and el an eigenvector of A. orthogonal to e. x) + (y. Aei) = (x. y)]. = (x. Y) = ix(x. Proof: By Lemma 1.e. Ax E R'. Since R' is invariant under A. But then (Ax.

. y) there corresponds a linear self-adjoint transformation A such that A (x. Reduction of a quadratic form to a sum of squares relative to an orthogonal basis (reduction to principal axes).'SFORMATIONS 119 an eigenvector e. Let A (x. -H 22 ' 2e2 + ri2e2 + nen) the. y) be a symmetric bilinear form on an n-dimensional Euclidean space. etc. /he. n). x) Here the 2. y) = (Ax.LINEAR TRA. equiv- alently.. are the eigenvalues of the transformation A or. in this case the equation A (x. We showed earlier that to each symmetric bilinear form A (x. The orthonorrnal basis . -. In this manner we obta n n pairwise orthogonal eigenvectors e1. Putting y = x we obtain the following 21e17l+ 22E2T2 + THEOREM 3. the matr x of A relative to the e. Let A (x. According to Theorem 2 of this section there exists an orthonormal basis e1.. (i = 1. e consisting of the eigenvectors of the transformation A (i. of vectors such that 2ei). x) 1 is the equation of a central conic of order two. + /2e2 + ' -{--Iien) + 2e22. . o A. y). 2. the roots of the characteristic equation of the matrix Haitl For n 3 the above theorem is a theorem of solid analytic geometry. x) be a quadratic fornt on an n-dimensional Euclidean space. With respect to such a basis Aei A (x. e2. e2. of A. Indeed. y) = (Ax. Since Aei = 2.e. o o - - o1 o - o 3. . is of the form [ 2. y) = = (A($jel $2e2 + En e)..e. T hen there exists an orthonormal basis relative to which the quadratic form can be represented as A (x. e .

x) = B(x. x) be two quadratic forms on an n-dimensional space R. x) and B(x. ea relative to which the form A (x. A (x. e3. y) for all x.120 LECTURES ON LINEAR ALGEBRA discussed in Theorem 3 defines in this case the coordinate system relative to which the surface is in canonicid form. an orthogonal transformation is length preserv EXE RC 'SE. Prove that condition (10) is sufficient for a transformation to be orthogonal. Relative to an orthonormal basis an inner product takes the form (x. y) = B(x. x) be positive definite. 4.e. x) = I E2. that is. i. e each quadratic form can be expressed as a sum of squares. relative to the basis e1. We define in R an inner product by means of the formula (x. y). and let B(x. Proof: Let B(x. .. e2. y E R. i. x). Ay) = (x. By Theorem 3 of this section there exists an orthonormal basis e1. Simultaneous reduction of a pair of quadratic forms to a sum of squares THEORENI 4. if (Ax. Orthogonal transformations DF:FINITION. are directed along the principal axes of the surface. x) = 27=1. Thus. Putting x =. Then there exists a basis in R relative to which each fornt is expressed as a sum of squares. 5. The basis vectors e1. x) is expressed as a sum of squares. y) be the bilinear form corresponding to the quadratic form B(x. Let A (x. e2. A linear transformation A defined on a real n-dimen- sional Euclidean space is said to be orthogonal if it preserves inner products.. e2.y in (9) we get lAx12 IxJ2.e.

it follows that the square of the determinant of a matrix of an orthogonal transformation is equal to one. Indeed.e. I axian are the elements of the product of the transpose of the this product is the unit matrix.. Since the determinant of the product of two matrices is equal to the product of the determinants. e2. a-1 EXERCISE.LINEAR TRANSFORMATIONS 121 Since cos 99 = (x.e. .. . Conditions (12) imply that . the determinant of a matrix of an orthogonal transformation is equal to + 1. . Since an orthogonal transformation A preserves the angles between vectors and the length of vectors. for i k.. An orthogonal transformation whose determinant is equal to + lis called a proper orthogonal transformation. i. e2. Let e1. consequently. en.)0 {1 Now let Ila11 be the matrix of A relative to the basis e1. EXERCISE. matrix of A by the matrix of A. it follows that an orthogonal transformation preserves the angle between two vectors. Ae likewise form an orthonormal basis. i. Show that the product of two proper or two improper orthogonal transformations is a proper orthogonal transformation and the product of a proper by an improper orthogonal transformation is an improper orthogonal transformation. it follows that the vectors Aei. Show that conditions (I1) and. whereas an orthogonal transforMation whose determinant is equal to 1 is called improper. conditions (12) are sufficient for a transformation to be orthogonal. anan = 0 Conditions a=1 (12) can be written in matrix form. en be an orthonormal basis. Ae A. Since the columns of this matrix are the coordinates of the vectors Ae conditions (11) can be rewritten as follows: for i = k for i k. {I for i k (A. y) ix) and since neither the numerator nor the denominator in the expression above is changed under an orthogonal transformation.

122

LECTURES ON LINEAR ALGEBRA

NOTE: What motivates the division of orthogonal transformations into proper and improper transformations is the fact that any orthogonal trans-

formation which can be obtained by continuous deformation from the
identity transformation is necessarily proper. Indeed, let A, be an orthogonal transformation which depends continuously on the parameter t (this means that the elements of the matrix of the transformation relative to some basis are continuous functions of t) and let An = E. Then the determinant of this transformation is also a continuous function of t. Since a continuous

function which assumes the values ± I only is a constant and since for 0 the determinant of A, is equal to 1, it follows that for t 0 the t determinant of the transformation is equal to 1. Making use of Theorem 5 of this section one can also prove the converse, namely, that every proper orthogonal transformation can be obtained by continuous deformation of the identity transformation.

We now turn to a discussion of orthogonal transformat ons in
one-dimensional and tviro-dimensional vector spaces. In the sequel

we shall show that the study of orthogonal transformations in a space of arbitrary dimension can be reduced to the study of these two simpler cases. Let e be a vector generating a one-dimensional space and A an orthogonal transformation defined on that space. Then Ae Ae
and since (Ae, Ae) = (e, e), we have 2.2(e, e) = (e, e), i.e., A = 1. Thus we see that in a one-dimensional vector space there exist x two orthogonal transformations only: the transformation Ax x. The first is a proper and the and the transformation Ax an second an improper transformation.

Now, consider an orthogonal transformation A on a twodimensional vector space R. Let e1, e2 be an orthonormal basis in

R and let

[7/ /
be the matrix of A relative to that basis.
We first study the case when A is a proper orthogonal transformation, i.e., we assume that acó ßy -= 1.

The orthogonality condition implies that the product of the matrix (13) by its transpose is equal to the unit matrix, i.e., that
(14)
Fa

Ly

)51-1 J

Fa

vl

fit

LINEAR TRANSFORMATIONS

123

Since the determinant of the matrix (13) is equal to one, we have

fi'br --13.1. It follows from (14) and (15) that in this case the matrix of the transformation is

(15)

r
where a2 + ß2 =
1.

Putting x = cos q», ß

sin qi we find that

the matrix of a proper orthogonal transformation on a two dimensional

space relative to an orthogonal basis is of the form

[cos 9)
sin

sin 92-1

cos 9'I

(a rotation of the plane by an angle go). Assume now that A is an improper orthogonal transformation,

that is, that GO ßy =

1.

In this case the characteristic
(a + 6)2

equation of the matrix (13) is A2

1 = O and, thus,

has real roots. This means that the transformation A has an eigenvector e, Ae = /le. Since A is orthogonal it follows that
±e. Furthermore, an orthogonal transformation preserves the angles between vectors and their length. Therefore any vector e, orthogonal to e is transformed by A into a vector orthogonal to Ae ±e, i.e., Ae, +e,. Hence the matrix of A relative to the
Ae

basis e, e, has the form

F±I
L

o

+1j.

Since the determinant of an improper transformation is equal to -- 1, the canonical form of the matrix of an improper orthogonal transformation in two-dimensional space is
HE
L
(

oi
o

Or

1

o +1

01

a reflection in one of the axes). We now find the simplest form of the matrix of an orthogonal

transformation defined on a space of arbitrary dimension.

124

LECTURES ON LINEAR ALGEBRA

Let A be an orthogonal transforma/ion defined on an n-dimensional Euclidean space R. Then there exists an orthonormal basis el, e,, , e of R relative to which the matrix of the transformaTHEOREM 5.

tion is

1
cos
92,

sin

921

sin 921

cos ch.

COS 92k

cos
99,_

sin

92,

where the unspecified entries have value zero.

Proof: According to Theorem 1 of this section R contains a one-or two-dimensional invariant subspace Ru). If there exists a one-dimensional invariant subspace WI) we denote by el a vector
of length one in that space. Otherwise Wu is two dimensional and we choose in it an orthonormal basis e1, e,. Consider A on

In the case when R(') is one-dimensional, A takes the form Ax = x. If Wu is two dimensional A is a proper orthogonal transformation (otherwise R") would contain a one-dimensional invariant subspace) and the matrix of A in Rn) is of the form
rcos
Lsin

sin wi cos (pi

The totality 11 of vectors orthogonal to all the vectors of Rn) forms an invariant subspace.
Indeed, consider the case when Rn) is a two-dimensional space,

say. Let x e ft., i.e.,

We reason analogously if Wn is one-dimensional. Since (Ax. it is of dimension n 1. R is the totality of vectors orthogonal to the vectors el and e2. We now find a one-dimensional or two-dimensional invariant subspace of R. z = Ay likewise varies over all of 14(1. sin 921 cos q).] sin T. Ay) = O. Indeed. y) = O for all y e R(1). y). . cos q)k_ sin 92. i. Hence (Ax.LINEAR TRANSFORMATIONS 125 (x. where the +1 on the principal diagonal correspond to one-dimensional invariant subspaces and the "boxes" [ cos Ti sin T. cos qik sin w. it is the totality of vectors orthogonal to the vector el. select a basis in it.. if Wu is of dimension two. Ay) = (x. in the former case.. As y varies over all of W1. Again. Ax e it. If WI) is of dimension one. z) = 0 for all z e ml).e. Relative to this basis the matrix of the transformation is of the form 1 1 1 cos qpi sin go. etc. cos q). correspond to two-dimensional invariant subspaces This completes the proof of the theorem. it follows that (Ax. and in the latter case. it is of dimension n 2. In this manner we obtain n pairwise orthogonal vectors of length one which form a basis of R..

126 NOTE: LECTURES ON LINEAR ALGEBRA A proper orthogonal transformation which represents a rotation of a two-dimensional plane and which leaves the (n 2)-dimensional subspace orthogonal to that plane fixed is called a simple rotation. Relative to a suitable basis its matrix is of the form 1 cos q sin yo sin 9) cos w 1 An improper orthogonal transformation which reverses all vectors of some one-dimensional subspace and leaves all the vectors of the (n 1)dimensional complement fixed is called a simple reflection. Extremal properties of eigenvalues In this section we show that the eigenvalues of a self-adjoint linear transformation defined on an n-dimensional Euclidean space can be obtained by considering a certain minimum problem connected with the corresponding quadratic form (Ax. § 17. The proof is left to the reader. in particular permit us to prove the existence of eigenvalues and eigenvectors without making use of the theorem . x). Relative to a suitable basis its matrix takes the form 1 1 1 Making use of Theorem 5 one can easily show that every orthogonal transformation can be written as the product of a number of simple rotations and simple reflections. This approach win.

Let A be a self-adjoint linear transformation on an n-dimensional real Euclidean space. on the set of vectors x such that (x. We first prove the following lemma: LEMMA 1. However.. h) = O. The extremal properties are also useful in computing eigenvalues. Be = O. i. The vector e. then 2t(Be. x) corresponding to A assumes its minimum on the unit sphere. Since (Bh. e) -= 0. e) + t(Be. x) = 1. THEOREM 1. . But this means that (Be.e. We shall first consider the case of a real space and then extend our results to the case of a complex space. h) t2(Bh. We shall consider the quadratic form (Ax. e) = 0. Be) = (Be. at which the minimum is assumed is an eigenvector of A and A. h) > O. h) t(Bh. We have (B(e th). then Be = O. e) = (h. Indeed. e + th) = (Be. is the corresponding eigen- value. i. This proves the lemma. Then the quadratic form (Ax. h) = O. such that (Bx. Let B be a self-adjoint linear transformation on a real space such that the quadratic form (Bx. If for some vector x = e (Be. the function at + bt2 with a 0 changes sign at t = O. where t is an arb trary number and h a vector. h) t2(Bh. Since h was arbitrary. Proof: Let x = e + th.e. x) is non-negative. It follows that (Be..LINEAR TRANSFORMATIONS 127 on the existence of a root of an nth order equation. x) which corresponds to A on the unit sphere. x) for all X. e) + t2(Bh. h) and (Be. h) 0 for all t. h) is non-negative for all t. in our case the expression 2t(Be. Let A be a selpadjoint linear transformation.

. Note that if we multiply x by some number a. for x el. A.. x) = 1. Ae. § 16 (Lemma 2). We obtain the next eigenvector by solving the same problem in .. This proves the theorem. then both sides of the inequality become multiplied by a2. x) is continuous on that set it must assume its minimum 2. (Ax. This means that the transformation B = A 21E satisfies the conditions of Lemma 1. Hence (A 21E)e1 = 0. 2. where (e1. these vectors form an (n 1)-dimensional subspace R. This inequality holds for vectors of unit length. x) 2.128 LECTURES ON LINEAR ALGEBRA Proof: The unit sphere is a closed and bounded set in n-dimensional space. is the point in R. A. at which the minimum is assumed. We now rewrite (2) in the form (Ax x) O for all x. it follows that inequality (2) holds for vectors of arbitrary length. i.e. we have (Ae. for (x. el) = 2. x) 21(x. As was shown in para. The corresponding eigenvector e. We have shown that el is an eigenvector of the transformation A corresponding to the eigenvalue 2. To find the next eigenvalue of A we consider all vectors of R orthogonal to the eigenvector e. We have (Ax. Obviously.. 21e. Inequality (1) can be rewritten as follows where (x. x) on the unit sphere in It. = 21e1. Since any vector can be obtained from a vector of unit length by multiplying it by some number a. at some point e. In particular. Since (Ax. x). since the minimum of a function considered on the whole space cannot exceed the minimum of the function in a subspace. x) = 1.. The required second eigenvalue A. e) = O. invariant under A. e1) = 1. of A is the minimum of (Ax. and (Aei.

) k.t. and e. Continuing in this manner we find all the n eigenvalues and the corresponding eigenvectors of A. eigenvector of a transformation from the extremum problem without reference to the preceding eigenvectors. then there exists a vector different from zero belonging to both subspaces. . e2. that is. Eke. It is sometimes convenient to determine the second. e (x. x). Let A be a self-adjoint transformation.2 + ' + AkEk2 ek2 ' and therefore (Ax. common to both Roc and S.. x) in that subspace. e. < A. third. x) (Ax. In § 7 (Lemma of para. (e ek) = 1 and (ek. x). . x) 2. -L + Ake ke k) ek = 4E1' + A2E22 + + Ake k2. since e.(x. We can assume that xo has unit length. x) eoeo + Indeed. 1. x) ek are orthonormal. (x. It follows that . 1) we showed that if the sum of the dimensions of two subspaces of an n-dimensional space is greater than n... x). eigenvectors. Denote by A. + + = (Akeke. x) (Ax. Since the sum of the dimensions of Rk and S is (n k + 1) + k it follows that there exists a vector x. it follows that ' (A (Eke/ eze.k(x. x). e1. A 2E 22 + 4($12 82' + -E =- = Adx. e the corresponding orthonormal . We shall show that if S is the subs pace spanned by the first k eigenvectors then for each x e S the lollowing inequality holds: A. x) + ekeo. etc. < 5 An its eigenvalues and by e eo. O for i Since Aek = 2. The third eigenvalue of A is equal to the minimum of (Ax.LINEAR TRANSFORMATIONS 129 the (n 2)-dimensional subspace consisting of vectors orthogonal to both e. + ¿kek). (Ax. exek) + + Ekek) Furthermore. ek (x. x) Now let Rk be a subspace of dimension n k + 1. + eke. 812 + 8. let x = ekek (Ax.(x. x) = Similarly.e..

x). we have 2. xe Rk In this formula the minimum is taken over all x e R.u be the eigenvalues of A -7 B. x) for all x elt. is less than or equal to A. x). x) = 1. Since. Hence for any (n min (x. is equal to . Then A. and the maximum of the right side is equal to 1. Then min (Ax. we showed in this section that min (Ax.. xo) 2. xi=1 k + 1)-dimensional subspace Rk we have min ((A (Ax. Note that among all the subspaces of dimension n k 1 there exists one for which min (Ax. Ro) Ak But then the minimum of (Ax. (x. We now extend our results to the case of a complex space.4. A.0. The subspace Rk can be chosen so that min (Ax. for which (x. it follows that We have thus shown that there exists a vector xo E Rk of unit length such that (AX0. et. e.. .. f Indeed (Ax. Let A. x) for x e S. x) -= 4. by formula (3). e.. As a consequence of our theorem we have: Let A be a sell-adjoint linear transformation and B a postive definite linear A be the eigenvalues of A and lel transformation. To sum up: If Rk is an k 1)-dimensional subspace and x varies over all vectors in R.. We have thus proved the following theorem: THEOREM.) = 1. x) = 1. .. x) for x on the unit sphere in Rk must be equal to or less than Ak.130 LECTURES ON LINEAR ALGEBRA (x. x) ((A + 13)x. Let R be a (n k + 1)-dimensional subspace of the space R. and the maximum over all subspaces Rk of dimension n k + 1. x e 12. is actually equal to Ak. (x. Since (Ax.I. x) (Axo. the maximum of the left side is equal to A. Rk (x. " for all x. et. x) A. x). Indeed. (x. taken over all vectors orthogonal to et. x). x) = 1. x) B)x.... Our theorem can be expressed by the formula (3) max min (Ax. x) = 1. xe Rk xeRk X)=-1 It follows that the maximum of the expression on the left side taken over all subspaces Rk does not exceed the maximum of the right side. . then min (Ax. (x. x) = I. This is the subspace consisting of all vectors orthogonal to the first k eigenvectors et. 4 (x. x. . x) is equal to A.

e) = 0.e. i(Bh. t[(Be. h) (Bh. h) = 0. x) 0 (Be. e -r. by putting ih in place of h. It follows from (4) and (5) that (Be. (Be. . LEMMA 2. we get. e) Since h was arbitrary. e) = O. All the remaining results of this section as well as their proofs can be carried over to complex spaces without change. e)] + t2(Bh.th) 0.. If for some vector e. i(Be. Let B be a self-adjoint transformation on a complex space and let the Hermitian form (Bx. let foy all x. This proves the lemma. e) = 0. h) for all t. h) (Eh. It follows that (Bx. or. Proof: Let t be an arbitrary real number and h a vector. h) O.LINEAR TRANSFORMATIONS 131 To this end we need only substitute for Lemma I the follovving lemma. Then (B (e th). and therefore Be = O. then Be = O. since (Be. i. x) corresponding to B be non-negative.

also § 10. para. Thus. 132 . We now formulate the definitive result which we shall prove in § 19. namely. We found that relative to the basis consisting of the eigenvectors the matrix of such a transformation had a particularly simple form. then the transformation has n linearly independent eigenvectors. In the case when the number of linearly independent eigenvectors of the transformation is equal to the dimension of the space the canonical form will coincide with the diagonal form. 1. the so-called diagonal form. cf. However. such a transformation is not diagonable since.CHAPTER III The Canonical Form of an Arbitrary Linear Transformation § 18. exceptional. as noted above. The canonical form of a linear transformation In chapter II we discussedvarious classes of linear transformations on an n-dimensional vector space which have n linearly independ- ent eigenvectors. this case is. Hence for the number of linearly independent eigenvectors of a transformation to be less than n it is necessary that the characteristic polynomial have multiple roots. in a sense. There arises the question of the simplest form of such a transformation. sional space and let A have k (k vectors Let A be an arbitrary linear transformation on a complex n-dimenn) linearly independent eigen- We recall that if the characteristic polynomial has n distinct roots. comparatively simple form (the so-called Jordan canonical form). i (An example of such a transformation is given in the sequel. the number of linearly independent eigenvectors of a linear transformation can be less than n. In this chapter we shall find for an arbitrary transformation a basis relative to which the matrix of the transformation has a 3). Example Clearly. any basis relative to which the matrix of a transformation is diagonal consists of linearly independent eigenvectors of the transformation.

Af. 12f. 111. Every subspace generated by each one of the k sets of vectors contains an eigenvector. 2 Clearly. Ael = 11e1. Then there exists a . Ah.. + + c... + /lever Equating the coefficients of the basis vectors we get a system of c: equations for the numbers A. e. that is. = f. A(c. .e. 21e2. e. Substituting the appropriate expressions of formula (2) on the left side we obtain cdie. then each set consists of one vector only. relative to which the transformation A has the form: Ae. 22f2.. For instance. 2.e. Indeed. A.. is an eigenvector. consider the subspace generated by the vectors el. = Ah.e. namely an eigenvector. .) = Ac. = Ah. /1h2. c.) = cae.. . It therefore follows that each set of basis vectors gener- ates a subspace invariant under A._1 At. = e. e. Af. fq. some linear combination of the form c1 e1 + + cpe. corresponding to the eigenvalues Xi. . contains the eigenvector el. c2(e1 21e2) + (e_. say.CANONICAL FORM OF LINEAR TRANSFORMATION 133 e. f1. e. c2e2 + where not all the c's are equal to zero. e2. p q ±sn.. . = h. e. . 21118. Ae = e. basis consisting of k sets of vectors 2 . 22. h1. . = 22f1. .h 21e. ix. Assume that some vector of this subspace. the subspace generated by the set e1. f. We see that the linear transformation A described by (2) takes the basis vectors of each set into linear combinations of vectors in the same set.1e. We shall now investigate A more closely. = Akhi.). We show that each subspace contains only one (to within a multiplicative constant) eigenvector. I f k n.

_1= c. . Indeed. = 0. We have Ael = Ale. and from the last. coincides (to within a multiplicative constant) with the first vector of the corresponding set. To find out what the elements in each box are it suffices to note how A transforms the vectors of the appropriate set. Thus. c. therefore. has the form (3) .O All 0 1 -Al 0 0 0 0 1 0 0 0 0 A. tions that c. if A A1. + A1e. We now write down the matrix of the transformation (2). Since the vectors of each set are transformed into linear combinations of vectors of the same set. Hence A = A1. cAl = Ac. Ae2 = e1 + e. Ae = Ae ep + Ale.p q.± c3 = Ac2. p + 2. Recalling how one constructs the matrix of a transformation relative to a given basis we see that the box corresponding to the set of vectors e1. e. then it would follow from the last equation that c._2= = c2= el= O. Substituting this value for A we get from the first equation c2 = 0.. 2. and so on. We first show that A = Al. p. c. it follows that in the first p columns the row indices of possible non-zero elements are 1. from the second. in the next q columns the row indices of possible non zero elements are p + 1. the matrix of the transformation relative to the basis (1) has k boxes along the main diagonal. = O. 0 0 0 . = 0 and from the remaining equa- cp-14+ 1Cp-1._1._.134 LECTURES ON LINEAR ALGEBRA ciAl+ c2rc2A. . This means that the eigenvector is equal to cle and. e2. The elements of the matrix which are outside these boxes are equal to zero.

We show. in order to raise the matrix al to some power all one has to do is + raise each one of the boxes to that power. The matrix (4) has the form = k_ where the a. 1 0 0 0 0 0 Here all the elements outside of the boxes are zero. it has the form _211 0 0 A.s. Now let P(1) =. one can nevertheless perform algebraic operations on it with relative ease. It is easy to see that . Although a matrix in the canonical form described above seems more complicated than a diagonal matrix. 0 1 0 0 21 0 0 221 (4) 0 0221 0 0 0 0 22 0 2k' 0 0 A. q. say. for instance. are square boxes and all othur elements are zero. how to compute a polynomial in the matrix (4). that is. -.CANONICAL FORM OF LINEAR TRANSFORMATION 135 The matrix of A consists of similar boxes of orders p. ao + ait + amtm be any polynomial. Then sif2 that is.

02.0P-I are of the form 2 [0 0001 0000 0000 fr == JP+.. J'ep = e_. in the form st.1) (di n! I)" But sit.)" n! P"'' (A1). A. ¿e is. . ine2=e. Je. ey_. . = et. We now show how to compute P(s1. J3e1 J3e. Hence inei= 0.332e2=0. Pl()105 + P"(20) 2! 2! 52 + Put' (20) n! 2 The powers of the matrix . .1 0 = 0 0 0 I o o 0 0 11 0 0 0 0 We note that the matrices . = 0.. -4 0. = e. First we write the matrix si. A. . == 0 1 0 [0 if 2-2 - 0 0 0 0 0 0 0 o o o 00 00 and = 0.1 are most easily computed by observing that fie. where n is the degree of P(t).3. 2! (Al) A1 e)2 P"(11.e + 1 where et is the unit matnx of order p and where the matrix f has the form r0 . . Substituting for t the matrix sari we get P(di) = P(Mg + (si. In view of Taylor's formula a polynomial P(t) can be written as P(t)= P(20) (t (t A1)2 )0) -1-v(2. It is now easy to compute P(.1) -E 2! P"(À1) + + (tA.136 LECTURES ON LINEAR ALGEBRA [P(ei1) P(s12) P(s..(11).5. . = ene. J3e = Similarly. say. . e)P( (20 -1- (st. . Hence P(di) = P(A1).). Jae.

. first p 1 derivatives at A.)O Thus in order to compute P(d1) where sal. In other words.CANONICAL FORM OF LINEAR TRANSFORMATION 137 Recalling that ifP = JP-1 = P (A1) ' ' = 0.. q. See I. G. Let e be an eigenvector of A*. A. We need the following lemma: LEMMA. Let A be a linear transformation on a complex n-dimensional space. s.. Reduction to canonical form In this sect on we prove the following theorem 3: THEOREM 1. . has order p it suffices to know the value of P(t) and its first p 1 derivatives at the point A. chapter 6.. 0O O 2)! ' P(21) value of P(t) at the points t = A. A2. is the eigenvalue of si.. Proof: Consider the adjoint At of A.. Petrovsky. then to compute P(d) one has to know the . where A. We prove the theorem by induction. Lectures on the Theory of Ordinary Differential Equa- tions. Every linear transformation A on an n-dimensional complex space R has at least one (n 1)-dimensional invariant subspace R'.e. i. Petrovsky. as well as the values of the 1 derivatives at A. we get P" (Al) 2! P' (A1) 1! PP-'' (AO1) ! P(211) = P(A) I" 1! F'''-'' (A. . G. we assume that the required basis exists in a space of dimension n and show that such a basis exists in a space of dimension n 1. It follows that if the matrix has canonical form (4) with boxes of order p.. Ate = We claim that the (n 1)-dimensional subspace R' consisting of 3 The main idea for the proof of this theorem is due to I. and the § 19. there exists a basis relative to which A has the form (2) (§ 18). . Then there exists a basis relative to which the matrix of the linear transformation has canonical form. the first q first s 1 derivatives at A.

. ev. e) = O. Ae2 = 11e2. + s = n. Let A be a linear transformation on an (n + 1)-dimensional space R. . f1. 2e) = 0. invariant under A. e2. + 21h. ft.._. Applying the transformation A to e we get 4 We assume Itere that R is Euclidean. h2. Ah. Afq = fq-1 + 12f. . i. Ax E R'. By the induction assumption we can choose a basis in R' relative to which A is in canonical form. that is. let x e R'. is invariant under A. However. Indeed. This proves the invariance of R' under A. Considered on R'. 4112. all vectors x for which (x. the transwhere p q+ formation A has relative to this basis the form h Ae.. = h. alone. el. f. e) = (x. e2. (x. by changing the proof slightly 've can show that the Lemma holds for any vector space R.e. e) = 0. 2.e.=1el..138 LEcruims ON LINEAR ALGEBRA all vectors x orthogonal 4) to e. 22f2. We now pick a vector e wh ch together with the vectors el. Aev = ev_. h. = Ah2 = +2. hs forms a basis in R. + ¿1e. Ate) = (x. f2.. f2. that an inner product is defined on R. Aft = Af2 = f. We now turn to the proof of Theorem 1. e2.2f2. i. ..h2.e. that is. According to our lemma there exists an n-dimensional subspace R' of R. Then (Ax. Denote this basis by h2.

and T. Hence if r 0 we can consider the transformation A rE instead of A. + + in the form e' e .. h. pti. zp. . and the diagonal (cf.81f. ..ep) + A(wilk + + Mg] + wshs). A(0)1111 + . Since the eigenvalues of a triangular matrix are equal to the entries on . ' . We shall now try to replace the vector e by some vector e' so that the expression for Aei is as simple as possible. A. + . making use of (1) Ae' = i1e1 + + + + ß1f1+ + 8. namely. by the eigenvalue T.. the eigenvalues . Thus.f. 2. 5 We can assume that t = O. dmuji xe) + (3... f2. . This justifies our putting Ae =-- + + ape. + Óih ACtc. We will choose them so that the right side of (3) has as few terms as possible. 2k. para. as a result of the transition from the n-dimensional invariant subspace R' to the (n + 1)-dimensional space R the number of eigenvalues is increased by one. tt. Indeed. fg.f. or... 4) it follows that r are the eigenvalues of A considered on the (n + 1)-dimensional space R. be chosen arbitrarily... e. h1. + + wshs). for instance. e is triangular with the numbers A. .Cie' Xpep + 61111 + 6shs.. + . co. We know that to each set of basis vectors in the n-dimensional space R' relative to which A is in canonical form there corresponds 5 The linear transformation A has in the (n 1)-dimensional space R A.fr coihi '0» cosh. f. § 10. can The coefficients xi.on the principal diagonal.. the matrix of A relative to the basis el. . e. A(x. oh.. h.11. We have Ae' = Ae A(zlei + + x.CANONICAL FORM OF LINEAR TRANSFORMATION 139 Ae = a1ej + /pep + + + + + + 61111 + 6311s + re. 7.e. Indeed. if relative to some basis A is in canonical form then relative to the same basis A rE is also in canonical form and conversely. We shall seek e' Pl.

We show how to choose the coeffix. e 1. i. + = i1e1 + ' . The coefficients of the other sets of vectors are computed analogously. By adding this vector to the basis vectors of R' we obtain a basis . The terms containing the vectors el. We consider first the case when all the eigenvalues are different from zero. in e1. -H 21e2) ' + xpep) ¿lei)) Zp(en-1 (/..e.f.2 = (al X1A1 Z2)e1 X221 z3)e2 + + X. i. A (xi. . We shall show that in this case we can choose a vector e' so that Ae' = 0.) . .-121 Xp)ep-i (. so that the linear combination of each set of vectors vanishes. . next we put the coefficient of e.- We put the coefficient of e. f 1f.e. We have thus determined e' so that Ae' = O. (this can be done since Ai 0).2.. In this way the linear combiequal to zero and determine nation of the vectors e1. we can choose xi. Consider now the case when some of the eigenvalues of the transformation A on R' are zero. in (3) vanishes. The sets of the former type can be dealt with as above. e 2. in the (n + 1)-dimensional space R relative to which the transformation is in canonical form. .140 LECTURES ON LINEAR ALGEBRA one eigenvalue. 2. for such sets we can choose . The eigenvalue associated with e' is zero (or 2. Assume this to be feasible. e. These eigenvalues may or may not be all different from zero.e. equal to zero and determine x. 1 etc. In this case the summands on the right side of (3) are of two types: those corresponding to sets of vectors associated with an eigenvalue different from zero and those associated with an eigenvalue equal to zero.. if we consider the . w. . are of the form + + ape. e. (0. (3) vanishes.. e'. 2. e. The vector e' forms a separate set.. transformation A rather than A TE). 'Y . . so that the linear combination of the vectors cients xi. so that the right side of (3) becomes zero. hs e v. h 1. Then since the transformation A takes the vectors of each set into a linear combination of vectors of the same set it must be possible to select xl.

Proceeding in the same manner with the By putting z. e'. ' . different from zero. .e. andfi > q> r. Assume now that at least one of the coefficients x. it follows that el.. Af2 Ae.CANONICAL FORM OF LINEAR TRANSFORMATION 141 coefficients so that the appropriate linear combinations of vectors in each set vanish. = sc_. 0.. O. vectors except ape. = 22 = 23 = O. say. f2. g2. Ae' = x.. We illustrate the procedure by considering the case x. Therefore the linear combination of the vectors el. g. = f2_.. = Ae'. = 0.. the transformation A is already in canonical form relative to the basis e'. .. A(zie. . 111. we obtain a vector e' such that fl. = 0..f.e. hs. . three sets of vectors. Ae = ep_1. i. Agr Ag2= gi. = a. e. ßq. in distinction to the previous cases. = 0. e. Then " Ae' = 1e1 + (4) + yrg. it becomes necessary to change some of the basis vectors of R'. appearing on the right side of (4) will be of the form cc1e1 .e arrive at a vector e' such that It might happen that a = = Ae' = In this case we and just as in the first case. e2. The vector e'. es. O = e'. Ag. 2. forms a separate set and is Y associated with the eigenvalue zero.g. e. values are equal to zero. . we annihilate all Za= ac2. el. f1. Let us assume that we are left with. f. f2. + A(ktifl+ + Itqfq) + a. . y. . fg. We form a new set of vectors by putting e' = e'. whose eigenfi. Then. . e2. fq and g. p1f1 + + ß0f0 + 71g1 + xpe) 4- A (Yigi 4- Since Al = 22 = A. At. Af._. = Ae'. x2e2 + + x2e2 x3e2 sets f. g. Thus .. .

ep by the vectors e'1. If the matrix (. in general.911 is similar to the matrix a2. We now replace the basis vectors e'. then a2 is also similar to at. where is an arbitrary non-singular matrix are said to be similar. DEFINITION 1. e.e.+1 = e'l = Ae'2 = cc. This completes the proof of the theorem. to increase the order = tig y. e'2. While constructing the canonical form of A we had to distinguish two cases: The case when the additional eigenvalue r (we assumed t = 0) did not coincide with any of the eigenvalues 2. If the first case. y. = 0. f. ).±. + ßf. Indeed... e2. then just as in of one of the boxes by one. and leave the other basis vectors unchanged.5:11 =. let = Then = wsgtir-1 .tr'-isfl.g1. Elementary divisors In this section we shall describe a method for finding the Jordan canonical form of a transformation. ei. we added a new box. + yrgr.1 . 4. Note that the order of the first box has been increased by one. . § 20.142 LECTURES ON LINEAR ALGEBRA e' = e' cc. The results of this section will also imply the (as yet unproved) uniqueness of the canonical form... Then it was necessary. e1. . In this case a separate box of order 1 was added.. The case when -c coincided with one of the eigenvalues . 2 . Relative to the new basis the transformation A is in canonical form.+1 = Aet_. The matrices sir and . + fg_r. + 41. = Aet_7+2 = Gip ep_._.

. i. We choose Dk(A) to be a monic polynomial. s4t2 is similar to sit. then sit...2 al 2W 2Z' W2-1a2r.e. In other words. is similar to Sly Indeed let = 1S114. if the hth order ininors are pairwise coprime we take Di(A) to be I. i. This will be a complete system of invariants in the sense that if the invariants in question are the same for two matrices then the matrices are similar. The kth order minors of the matrix sir 24' are certain polynomials in 2. is the matrix of transition from this basis to a new basis (§ 9).. the determinant of the matrix d At.(2) the greatest common divisor of those minors.. i. then V-1. Let S be a matrix of order n. Thus similar matrices represent the same linear trans- formation relative to different bases.e.e.2. In particular. . We now wish to obtain invariants of a transformation from its matrix. We now construct a whole system of invariants which will include the characteristic polynomial.. Then r1-1s11r1 Putting W2W1-1 = 46'.e.uf and for any matrix similar to S.e. sdf is similar to Let S be the matrix of a transformation A relative to some basis. = i. If 56. expressions depending on the transformation alone.CANONICAL FORM OF LINEAR TRANSFORMATION 143 If we put W-1 r. D(1) = is the same for . 6 We also put The greatest common divisor is determined to within a numerical multiplier. we obtain Si2 = i. we get d-= Z'2-1 d2%5. and at2 are similar to some matrix d.WW is the matrix which represents A relative to the new basis. One such invariant was found in § 10 where we showed that the characteristic polynomial of a matrix se. We denote by D. we wish to construct functions of the elements of a matrix which assume the same values for similar matrices.. It is easy to see that if two matrices a.

D 1(1) is divisible by D.e.144 LECTURES ON LINEAR ALGEBRA Do (A) = 1. EXERCISE. For similar matrices the polynomials D. then xe and a'1 are the entries of = i.. AS)W If a.(2).AS each multiplied by some number. It follows that D(X) is indeed divisible by D_1(2). LEMMA 2. In the sequel we show that all the 13.2 (A). 3) for the matrix 0 o 1 ]. Proof: Consider the pair of matrices sí At and (. i.Ae is the same as the corresponding greatest common divisor . If we expand the determinant D(2) by the elements of any row we obtain a sum each of whose summands is a product of an element of the row in question by its cofactor. 132(A) = D1(2) LEMMA 1.. etc. (A A0)3.Ae)r . By Lemma 1 the greatest common divisor of the kth order minors S . the definition of D_1(2) implies that all minors of order n 1 are divisible by D . are the entries of st . independent of It follows that every minor of (a 2e)w is the sum of minors of a .%c (21 AS) and (I 2e)w are the same. To prove the converse we apply the same A.2da. 1. Similarly.). the entries of any row of (si 2e)w. o rooA A. Find D(2) (k = 1.Ag and (s1 26')%" are the same.(2) are invariants. Indeed. Hence every divisor of the kth order minors of alt AS must divide every kth order minor of (st 2e)%'. Proof: Let se and sit = W-Isiff be two similar matrices. Answer: D3(2. are linear combinations of the rows of st AC with coefficients from .2. We observe that D_1(1) divides D (2). In particular D(2) is the determinant of the matrix Ae.e.(2) are identical.2. reasoning to the pair of matrices (sit AS)W and [(s1 xe)wx-i S .. This proves that the greatest common divisors of the kth order minors of a . If is an arbitrary non-singular matrix then the greatest common divisors of the kth order minors of the matrices AS.

(2) for si and at are identical. does not depend on the choice of basis.R is a matrix of the form Q1 0 where . Clearly D(2) = (A A0)n. Thus for an individual "box" [matrix (1)] the D..e. then the mth order non-zero . for one "box" of the canonical form.(1) = = D1(A) = 1.1. In view of the fact that the matrices which represent a transformation in different bases are similar. We now compute the polynomials WA) for a given linear trans- formation A. We shall find it convenient to choose the basis relative to which the matrix of the transformation is in Jordan canonical form.(2) for the matrix si in Jordan canonical form. Theorem 1 tells us that in computing the D.1. If we cross out in (1) the first column and the last row we obtain a matrix sill with ones on the principal diagonal and zeros above it. 1. If we cross out in sli like numbered rows and columns we find that D . Hence the D. Let A be a linear transformation.42 are of order n. where at represents the transformation A in some basis.(2) are . Then the greatest common divisor Dk(A) of the kth order minors of the matrix se . and n.CANONICAL FORM OF LINEAR TRANSFORMATION 145 for (saf Ad)W. and . We first find the D(2) for an nth order matrix of the form 20 O 1 o 1 o 0 - (1) 0 0 0 0 0 0 1 i.te. 1. Our task is then to compute the polynomial 1). /10)4.(2) we may use the matrix which represents A relative to an arbitrarily selected basis. An analogous statement holds for the matrices AS) and W-I(S1 . (A We observe further that if .AS)S = AS.. 1. Hence D1(2) =. we conclude on the basis of Lemma 2 that THEOREM 1.

146

LECTURES ON LINEAR ALGEBRA

minors of the matrix 94 are of the form

d, =
Here 4(1) are the minors of mi

A(2)

,

"21 + M2 = M.

of order m, and 4(2) the minors of -42

of order m2.7 Indeed, if one singles out those of the first n, rows which enter into the minor in question and expands it by these rows (using the theorem of Laplace), the result is zero or is of the
form A (2) A m(2) .

We shall now find the polynomials D,(1) for an arbitrary matrix

si which is in Jordan canonical form. We assume that al has p boxes corresponding to the eigenvalue A, q boxes corresponding to the eigenvalue 22, etc. We denote the orders of the boxes corresponding to the eigenvalue Al by n1, n2, , n, (n, n2 > > nv). Let R, denote the ith box in a' = si AC. Then ,42, say, is of the form

A, A
O

1

0
1

O
O

=
O O
O

I
A,

0

0

A_

We first compute 1),(2), i.e., the determinant of a. This determi-

nant is the product of the determinants of the i.e.,
D1(2) = (A
)1)1'1+7'2'4-

(1

22)mi±m2+-+mq

We now compute Dn_1(2). Since D0_1(2) is a factor of D(A), it ,A 22, . The problem must be a product of the factors A

now is to compute the degrees of these factors. Specifically, we

compute the degree of A A in D1(2). We observe that any non-zero minor of M = si Ae is of the form

=4

M2)

zlik.),

where t, t2 + + tk = n I and 4) denotes the t,th order minors of the matrix ,2,. Since the sum of the orders of the minors
i.e.,

7 Of course, a non-zero kth order minor of d may have the form 4 k(, it may he entirely made up of elements of a,. In this case we shall 4725 where zlo,22 --- 1. write it formally as z1 =

CANONICAL FORM OF LINEAR TRANSFORMATION

147

1, exactly one of these minors is of order one lower than the order of the corresponding matrix .4,,
, zfik) is n

M,

i.e., it is obtained by crossing out a row and a column in a box of the matrix PI. As we saw (cf. page 145) crossing out an appropriate row and column in a box may yield a minor equal to one. Therefore it is possible to select 47,1 so that some 4 is one and the remaining minors are equal to the determinants oif the appropriate boxes. It follows that in order to obtain a minor of lowest possible degree Al it suffices to cross out a suitable row and column in the in A box of maximal order corresponding to Al. This is the box of order n. Thus the greatest common divisor D 2(A) of minors of order n. A1 raised to the power n2 + n, 1 contains A n Likewise, to obtain a minor 4n-2 of order n 2 with lowest A, it suffices to cross out an appropriate row possible power of A and column in the boxes of order n, and n, corresponding to A,. + n, Thus D2(2) contains A A, to the power n, n, + , D1(A) do not conetc. The polynomials D_(2), D_ 1(2), tain A A, at all. Similar arguments apply in the determination of the degrees of

in WA). A,, 22, We have thus proved the following result.
If the Jordan canonical form of the matrix of a linear transforma./zi,) , n(n2 n, tion A contains p boxes of order n,, n2, corresponding to the eigenvalue A1, q boxes of order ml, m2, , m m,) corresponding to the eigenvalue A2, etc., then m2
Da (A)
(A
(A

A1)n,2+n2+--- +5 (A
Ann2+.3+
-Enp (A

A2r,-Ern2-3-m3+

+mg

D_1(A)

22),n2+.2+- +mg

= (A

Aira+

+"' (A

Az)na'

+ma

Beginning with D_,(2) the factor (A Beginning with Dn_ 2(2) the factor (2
etc.

2,)
A2)

is replaced by one. is replaced by one,

In the important special case when there is exactly one box of order n, corresponding to the eigenvalue A1, exactly one box of order m, corresponding to the eigenvalue A2, exactly one box of order k, corresponding to the eigenvalue A3, etc., the D,(A) have the following form.

148

LECTURES ON LINEAR ALGEBRA

Da(A) = (2 D _1(2) = 1
D _2 (2) =
1

2)'(A

22)m ' (2

23)"'

The expressions for the D1(A) show that in place of the D,(2) it is

more convenient to consider their ratios
E ,(2)
,(2.)
.

D k 19)

The E1(1) are called elementary divisors. Thus if the Jordan
canonical form of a matrix d contains p boxes of order n, n2, , n(ni n, >: n) corresponding to the eigenvalue A, q boxes of order mi., m2, m, (m1 m2_> mg) corresponding to the eigenvalue 22, etc., then the elementary divisors E1(A) are
(2 En(2) En-1(2) =- (A E n-2(2) = (A

21)" (2
Al)"2 (A

22)'
22)m
22)ma

',
*,

Ai)"a(A

Prescribing the elementary divisors E(2), E 2(2)

,

,

deter-

mines the Jordan canonical form of the matrix si uniquely. The eigenvalues 2 are the roots of the equation E(2). The orders n1, n2, n of the boxes corresponding to the eigenvalue A, coincide with the powers of (2 in E(2), E_1(2),
.

We can now state necessary and sufficient conditions for the existence of a basis in which the matrix of a linear transformation
is diagonal. A necessary and sufficient condition for the existence of a basis in
which the matrix of a transformation is diagonal is that the elementary divisors have simple roots only.

Indeed, we saw that the multiplicities of the roots $\lambda_1, \lambda_2, \cdots$ of the elementary divisors determine the orders of the boxes in the Jordan canonical form. Thus the simplicity of the roots of the elementary divisors signifies that all the boxes are of order one,

i.e., that the Jordan canonical form of the matrix is diagonal.
THEOREM 2. For two matrices to be similar it is necessary and sufficient that they have the same elementary divisors.


Proof: We showed (Lemma 2) that similar matrices have the same polynomials $D_k(\lambda)$ and therefore the same elementary divisors $E_k(\lambda)$ (since the latter are quotients of the $D_k(\lambda)$). Conversely, let two matrices $\mathscr{A}$ and $\mathscr{B}$ have the same elementary divisors. $\mathscr{A}$ and $\mathscr{B}$ are similar to Jordan canonical matrices. Since the elementary divisors of $\mathscr{A}$ and $\mathscr{B}$ are the same, their Jordan canonical forms must also be the same. This means that $\mathscr{A}$ and $\mathscr{B}$ are similar to the same matrix. But this means that $\mathscr{A}$ and $\mathscr{B}$ are similar matrices.
THEOREM 3. The Jordan canonical form of a linear transformation

is uniquely determined by the linear transformation. Proof: The matrices of A relative to different bases are similar.

Since similar matrices have the same elementary divisors and these determine uniquely the Jordan canonical form of a matrix, our theorem follows.
We are now in a position to find the Jordan canonical form of a matrix of a linear transformation. For this it suffices to find the elementary divisors of the matrix of the transformation relative
to some basis. When each of these is represented as a product of the form $(\lambda - \lambda_1)^{a}(\lambda - \lambda_2)^{b} \cdots$ we have the eigenvalues as well as the orders of the boxes corresponding to each eigenvalue.
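For instance, for the matrix $A = \begin{pmatrix} 0 & 1 \\ -1 & 2 \end{pmatrix}$ we find for $A - \lambda E$ that $D_1(\lambda) = 1$ and $D_2(\lambda) = \lambda^2 - 2\lambda + 1 = (\lambda - 1)^2$, so that $E_2(\lambda) = (\lambda - 1)^2$ and $E_1(\lambda) = 1$; hence the Jordan canonical form of $A$ consists of a single box of order two with the eigenvalue $1$.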

§ 21. Polynomial matrices
1. By a polynomial matrix we mean a matrix whose entries are polynomials in some letter $\lambda$. By the degree of a polynomial matrix we mean the maximal degree of its entries. It is clear that a polynomial matrix of degree $n$ can be written in the form

$A_n\lambda^n + A_{n-1}\lambda^{n-1} + \cdots + A_1\lambda + A_0$,

where the $A_k$ are constant matrices.⁸ The matrices $A - \lambda E$ which we considered on a number of occasions are of this type. The results to be derived in this section contain as special cases many of the results obtained in the preceding sections for matrices of the form $A - \lambda E$.

⁸ In this section matrices are denoted by printed Latin capitals.


Polynomial matrices occur in many areas of mathematics. Thus, for example, in solving a system of first order homogeneous linear differential equations with constant coefficients

(1)  $\dfrac{dy_i}{dx} = \sum_{k=1}^{n} a_{ik} y_k$   $(i = 1, 2, \cdots, n)$

we seek solutions of the form

(2)  $y_k = c_k e^{\lambda x}$,

where $\lambda$ and the $c_k$ are constants. To determine these constants we substitute the functions (2) in the equations (1) and divide by $e^{\lambda x}$. We are thus led to the following system of linear equations:

$\lambda c_i = \sum_{k=1}^{n} a_{ik} c_k$   $(i = 1, 2, \cdots, n)$.

The matrix of this system of equations is $A - \lambda E$, with $A$ the matrix of coefficients in the system (1). Thus the study of the system of differential equations (1) is closely linked to polynomial matrices of degree one, namely, those of the form $A - \lambda E$.

Similarly, the study of higher order systems of differential equations leads to polynomial matrices of degree higher than one. Thus the study of the system

$\sum_{k=1}^{n} a_{ik}\dfrac{d^2 y_k}{dx^2} + \sum_{k=1}^{n} b_{ik}\dfrac{dy_k}{dx} + \sum_{k=1}^{n} c_{ik} y_k = 0$

is synonymous with the study of the polynomial matrix $A\lambda^2 + B\lambda + C$, where $A = \|a_{ik}\|$, $B = \|b_{ik}\|$, $C = \|c_{ik}\|$.
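For instance, for the system $dy_1/dx = y_1 + y_2$, $dy_2/dx = y_2$ the substitution (2) leads to $(A - \lambda E)c = 0$ with $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$; a non-trivial vector $c$ exists only when $\det(A - \lambda E) = (1 - \lambda)^2 = 0$, i.e., for $\lambda = 1$.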

We now consider the problem of the canonical form of polynomial matrices with respect to so-called elementary transformations. The term "elementary" applies to the following classes of transformations:

1. Permutation of two rows or columns.
2. Addition to some row of another row multiplied by some polynomial $\varphi(\lambda)$ and, similarly, addition to some column of another column multiplied by some polynomial.
3. Multiplication of some row or column by a non-zero constant.

DEFINITION 1. Two polynomial matrices are called equivalent if it is possible to obtain one from the other by a finite number of elementary transformations.

The inverse of an elementary transformation is again an elementary transformation. This is easily seen for each of the three types


of elementary transformations. Thus, e.g., if the polynomial matrix $B(\lambda)$ is obtained from the polynomial matrix $A(\lambda)$ by a permutation of rows, then the inverse permutation takes $B(\lambda)$ into $A(\lambda)$. Again, if $B(\lambda)$ is obtained from $A(\lambda)$ by adding the $i$th row multiplied by $\varphi(\lambda)$ to the $k$th row, then $A(\lambda)$ can be obtained from $B(\lambda)$ by adding to the $k$th row of $B(\lambda)$ the $i$th row multiplied by $-\varphi(\lambda)$.
The above remark implies that if a polynomial matrix $K(\lambda)$ is equivalent to $L(\lambda)$, then $L(\lambda)$ is equivalent to $K(\lambda)$. Indeed, if $L(\lambda)$ is the result of applying a sequence of elementary transformations to $K(\lambda)$, then by applying the inverse transformations in reverse order to $L(\lambda)$ we obtain $K(\lambda)$.

If two polynomial matrices $K_1(\lambda)$ and $K_2(\lambda)$ are equivalent to a third matrix $K(\lambda)$, then they are equivalent to each other. Indeed, by applying to $K_1(\lambda)$ first the transformations which take it into $K(\lambda)$ and then the elementary transformations which take $K(\lambda)$ into $K_2(\lambda)$, we will have taken $K_1(\lambda)$ into $K_2(\lambda)$. Thus $K_1(\lambda)$ and $K_2(\lambda)$ are indeed equivalent.

The main result of para. 1 of this section asserts the possibility of diagonalizing a polynomial matrix by means of elementary transformations. We precede the proof of this result with the following lemma:

LEMMA. If the element $a_{11}(\lambda)$ of a polynomial matrix $A(\lambda)$ is not zero and if not all the elements $a_{ik}(\lambda)$ of $A(\lambda)$ are divisible by $a_{11}(\lambda)$, then it is possible to find a polynomial matrix $B(\lambda)$ equivalent to $A(\lambda)$ and such that $b_{11}(\lambda)$ is also different from zero and its degree is less than that of $a_{11}(\lambda)$.
Proof: Assume that the element of $A(\lambda)$ which is not divisible by $a_{11}(\lambda)$ is in the first row. Thus let $a_{1k}(\lambda)$ not be divisible by $a_{11}(\lambda)$. Then $a_{1k}(\lambda)$ is of the form

$a_{1k}(\lambda) = a_{11}(\lambda)\varphi(\lambda) + b(\lambda)$,

where $b(\lambda) \not\equiv 0$ and of degree less than that of $a_{11}(\lambda)$. Multiplying the first column by $\varphi(\lambda)$ and subtracting the result from the $k$th column, we obtain a matrix with $b(\lambda)$ in place of $a_{1k}(\lambda)$, where the degree of $b(\lambda)$ is less than that of $a_{11}(\lambda)$. Permuting the first and $k$th columns of the new matrix puts $b(\lambda)$ in the upper left corner and results in a matrix with the desired properties. We can proceed in
an analogous manner if the element not divisible by $a_{11}(\lambda)$ is in the first column.

Now let all the elements of the first row and column be divisible by $a_{11}(\lambda)$ and let $a_{ik}(\lambda)$ be an element not divisible by $a_{11}(\lambda)$. We will reduce this case to the one just considered. Since $a_{i1}(\lambda)$ is divisible by $a_{11}(\lambda)$, it must be of the form $a_{i1}(\lambda) = \varphi(\lambda)a_{11}(\lambda)$. If we subtract from the $i$th row the first row multiplied by $\varphi(\lambda)$, then $a_{i1}(\lambda)$ is replaced by zero and $a_{ik}(\lambda)$ is replaced by $a'_{ik}(\lambda) = a_{ik}(\lambda) - \varphi(\lambda)a_{1k}(\lambda)$, which again is not divisible by $a_{11}(\lambda)$ (this because we assumed that $a_{1k}(\lambda)$ is divisible by $a_{11}(\lambda)$). We now add the $i$th row to the first row. This leaves $a_{11}(\lambda)$ unchanged and replaces $a_{1k}(\lambda)$ with $a_{1k}(\lambda)(1 - \varphi(\lambda)) + a_{ik}(\lambda)$. Thus the first row now contains an element not divisible by $a_{11}(\lambda)$, and this is the case dealt with before. This completes the proof of our lemma.

We are now in a position to reduce a polynomial matrix to diagonal form. We may assume that $a_{11}(\lambda) \ne 0$; otherwise a suitable permutation of rows and columns puts a non-zero element in place of $a_{11}(\lambda)$. If not all the elements of our matrix are divisible by $a_{11}(\lambda)$, then, in view of our lemma, we can replace our matrix with an equivalent one in which the element in the upper left corner is of lower degree than $a_{11}(\lambda)$ and still different from zero. Repeating this procedure a finite number of times we obtain a matrix $B(\lambda)$ all of whose elements are divisible by $b_{11}(\lambda)$.

Since $b_{12}(\lambda), \cdots, b_{1n}(\lambda)$ are divisible by $b_{11}(\lambda)$, we can, by subtracting from the second, third, $\cdots$, $n$th columns suitable multiples of the first column, replace the second, third, $\cdots$, $n$th element of the first row with zero. Similarly, the second, third, $\cdots$, $n$th element of the first column can be replaced with zero. The new matrix inherits from $B(\lambda)$ the property that all its entries are divisible by $b_{11}(\lambda)$. Dividing the first row by the leading coefficient of $b_{11}(\lambda)$ replaces $b_{11}(\lambda)$ with a monic polynomial $E_1(\lambda)$ but does not affect the zeros in that row.

In the sequel we shall make use of the following observation. If all the elements of a polynomial matrix $B(\lambda)$ are divisible by some polynomial $E(\lambda)$, then all the entries of a matrix equivalent to $B(\lambda)$ are again divisible by $E(\lambda)$.

We now have a matrix of the form

(3)  $\begin{pmatrix} E_1(\lambda) & 0 & \cdots & 0 \\ 0 & c_{22}(\lambda) & \cdots & c_{2n}(\lambda) \\ \cdots & \cdots & \cdots & \cdots \\ 0 & c_{n2}(\lambda) & \cdots & c_{nn}(\lambda) \end{pmatrix}$

all of whose elements are divisible by $E_1(\lambda)$. We can apply to the matrix $\|c_{ik}(\lambda)\|$ of order $n - 1$ the same procedure which we just applied to the matrix of order $n$. Then $c_{22}(\lambda)$ is replaced by a monic polynomial $E_2(\lambda)$ and the other entries of its first row and first column are replaced with zeros. Since the entries of the larger matrix other than $E_1(\lambda)$ in its first row and column are zeros, an elementary transformation of the matrix of the $c_{ik}(\lambda)$ can be viewed as an elementary transformation of the larger matrix. Thus we obtain a matrix whose "off-diagonal" elements in the first two rows and columns are zero and whose first two diagonal elements are monic polynomials $E_1(\lambda)$, $E_2(\lambda)$, with $E_2(\lambda)$ a multiple of $E_1(\lambda)$. Repetition of this process obviously leads to a diagonal matrix. This proves

THEOREM 1. Every polynomial matrix can be reduced by elementary transformations to the diagonal form

(4)  $\begin{pmatrix} E_1(\lambda) & 0 & 0 & \cdots & 0 \\ 0 & E_2(\lambda) & 0 & \cdots & 0 \\ 0 & 0 & E_3(\lambda) & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & 0 & \cdots & E_n(\lambda) \end{pmatrix}$

Here the diagonal elements $E_k(\lambda)$ are monic polynomials and $E_1(\lambda)$ divides $E_2(\lambda)$, $E_2(\lambda)$ divides $E_3(\lambda)$, etc. It may, of course, happen that $E_{r+1}(\lambda) = E_{r+2}(\lambda) = \cdots = E_n(\lambda) = 0$ for some value of $r$. This form of a polynomial matrix is called its canonical diagonal form.
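To illustrate the reduction, consider, for instance, the matrix $\begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}$. Permuting the two columns gives $\begin{pmatrix} 1 & \lambda \\ \lambda & 0 \end{pmatrix}$; subtracting $\lambda$ times the first row from the second and then $\lambda$ times the first column from the second gives $\begin{pmatrix} 1 & 0 \\ 0 & -\lambda^2 \end{pmatrix}$; multiplying the second row by $-1$ yields the canonical diagonal form $\begin{pmatrix} 1 & 0 \\ 0 & \lambda^2 \end{pmatrix}$, with $E_1(\lambda) = 1$ and $E_2(\lambda) = \lambda^2$.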

REMARK: We have brought $A(\lambda)$ to a diagonal form in which every diagonal element is divisible by its predecessor. If we dispense with the latter requirement, the process of diagonalization can be considerably simplified. Indeed, to replace the off-diagonal elements of the first row and column with zeros it is sufficient that these elements (and not all the elements of the matrix) be divisible by $a_{11}(\lambda)$. As can be seen from the proof of the lemma, this requires far fewer elementary transformations than reduction to canonical diagonal form. Once the off-diagonal elements of the first row and first column are all zero, we repeat the process until we reduce the matrix to diagonal form. In this way the matrix can be reduced to various diagonal forms, i.e., the diagonal form of a polynomial matrix is not uniquely determined. On the other hand, we will see below that the canonical diagonal form of a polynomial matrix is uniquely determined.

EXERCISE. Reduce the polynomial matrix $\begin{pmatrix} \lambda - \lambda_1 & 0 \\ 0 & \lambda - \lambda_2 \end{pmatrix}$ ($\lambda_1 \ne \lambda_2$) to canonical diagonal form. Answer: $\begin{pmatrix} 1 & 0 \\ 0 & (\lambda - \lambda_1)(\lambda - \lambda_2) \end{pmatrix}$.

2. In this paragraph we prove that the canonical diagonal form of a given matrix is uniquely determined by that matrix. To this end we shall construct a system of polynomials connected with the given polynomial matrix which are invariant under elementary transformations and which determine the canonical diagonal form completely.

Let there be given an arbitrary polynomial matrix. Let $D_k(\lambda)$ denote the greatest common divisor of all $k$th order minors of the given matrix. Since $D_k(\lambda)$ is determined to within a multiplicative constant, we take its leading coefficient to be one; in particular, if the greatest common divisor of the $k$th order minors is a constant, we take $D_k(\lambda) = 1$. It is convenient to put $D_0(\lambda) = 1$. If all $k$th order minors and, consequently, all minors of order higher than $k$ are zero, then we put $D_k(\lambda) = D_{k+1}(\lambda) = \cdots = D_n(\lambda) = 0$.

We shall prove that the polynomials $D_k(\lambda)$ are invariant under elementary transformations, i.e., that equivalent matrices have the same polynomials $D_k(\lambda)$. In the case of elementary transformations of type 1, which permute rows or columns, this is obvious, since such transformations either do not affect a particular $k$th order minor at all, or change its sign, or replace it with another $k$th order minor. Likewise, elementary transformations of type 3 do not change $D_k(\lambda)$, since under such transformations the minors are at most multiplied by a non-zero constant. Now consider elementary transformations of type 2. Specifically, consider addition of the $j$th column multiplied by $\varphi(\lambda)$ to the $i$th column. If some particular $k$th order minor contains none of these columns or if it contains both of them, it is not affected by the transformation in question. If it contains the $i$th column but not the $j$th column, we can write it as a combination of minors each of which appears in the original matrix; thus in this case, too, the greatest common divisor of the $k$th order minors remains unchanged. In all these cases the greatest common divisor of all $k$th order minors remains unchanged. We observe that equality of the $D_k(\lambda)$ for all equivalent matrices implies that equivalent matrices have the same rank.

We now compute the polynomials $D_k(\lambda)$ for a matrix in canonical diagonal form

(5)  $\begin{pmatrix} E_1(\lambda) & & & \\ & E_2(\lambda) & & \\ & & \ddots & \\ & & & E_n(\lambda) \end{pmatrix}$

We observe that in the case of a diagonal matrix the only non-zero minors are the principal minors, that is, minors made up of like numbered rows and columns. These minors are of the form $E_{i_1}(\lambda)E_{i_2}(\lambda)\cdots E_{i_k}(\lambda)$, $i_1 < i_2 < \cdots < i_k$. Since all the $E_k(\lambda)$ are divisible by $E_1(\lambda)$, it follows that $D_1(\lambda) = E_1(\lambda)$. Since all the polynomials other than $E_1(\lambda)$ are divisible by $E_2(\lambda)$, the product $E_i(\lambda)E_j(\lambda)$ ($i < j$) is always divisible by $E_1(\lambda)E_2(\lambda)$; hence $D_2(\lambda) = E_1(\lambda)E_2(\lambda)$. Likewise, the product $E_i(\lambda)E_j(\lambda)E_k(\lambda)$ ($i < j < k$) is divisible by $E_1(\lambda)E_2(\lambda)E_3(\lambda)$, and so $D_3(\lambda) = E_1(\lambda)E_2(\lambda)E_3(\lambda)$, etc.

Thus for the matrix (4)

(6)  $D_k(\lambda) = E_1(\lambda)E_2(\lambda)\cdots E_k(\lambda)$   $(k = 1, 2, \cdots, n)$.

Clearly, if beginning with some value of $r$ we have $E_{r+1}(\lambda) = E_{r+2}(\lambda) = \cdots = E_n(\lambda) = 0$, then $D_{r+1}(\lambda) = D_{r+2}(\lambda) = \cdots = D_n(\lambda) = 0$. We have thus proved

THEOREM 2. The canonical diagonal form of a polynomial matrix $A(\lambda)$ is uniquely determined by this matrix. If $D_k(\lambda)$ $(k = 1, 2, \cdots, r)$ is the greatest common divisor of all $k$th order minors of $A(\lambda)$ and $D_{r+1}(\lambda) = \cdots = D_n(\lambda) = 0$, then the elements of the canonical diagonal form (5) are defined by the formulas

$E_k(\lambda) = \dfrac{D_k(\lambda)}{D_{k-1}(\lambda)}$   $(k = 1, 2, \cdots, r)$,   $E_{r+1}(\lambda) = \cdots = E_n(\lambda) = 0$.

Proof: We showed that the polynomials $D_k(\lambda)$ are invariant under elementary transformations. Hence if the matrix $A(\lambda)$ is equivalent to a diagonal matrix (5), then both have the same $D_k(\lambda)$. Since in the case of the matrix (5) we found that $D_k(\lambda) = E_1(\lambda)\cdots E_k(\lambda)$, the theorem follows.

The polynomials $E_k(\lambda)$ are called elementary divisors. In § 20 we defined the elementary divisors of matrices of the form $A - \lambda E$.

COROLLARY. A necessary and sufficient condition for two polynomial matrices $A(\lambda)$ and $B(\lambda)$ to be equivalent is that the polynomials $D_1(\lambda), D_2(\lambda), \cdots, D_n(\lambda)$ be the same for both matrices.

Indeed, if the polynomials $D_k(\lambda)$ are the same for $A(\lambda)$ and $B(\lambda)$, then both of these matrices are equivalent to the same canonical diagonal matrix and are therefore equivalent (to one another). The converse is obvious.

3. A polynomial matrix $P(\lambda)$ is said to be invertible if the matrix $[P(\lambda)]^{-1}$ is also a polynomial matrix. If $\det P(\lambda)$ is a constant other than zero, then $P(\lambda)$ is invertible. Indeed, the elements of the inverse matrix are, apart from sign, the $(n-1)$st order minors divided by $\det P(\lambda)$; in our case these quotients would be polynomials and $[P(\lambda)]^{-1}$ would be a polynomial matrix. Conversely, if $P(\lambda)$ is invertible, then $\det P(\lambda) = \text{const} \ne 0$. Indeed, let $[P(\lambda)]^{-1} = P_1(\lambda)$. Then $\det P(\lambda)\,\det P_1(\lambda) = 1$, and a product of two polynomials equals one only if the polynomials in question are non-zero constants. We have thus shown that a polynomial matrix is invertible if and only if its determinant is a non-zero constant.

All invertible matrices are equivalent to the unit matrix. Indeed, the determinant of an invertible matrix is a non-zero constant, so that $D_n(\lambda) = 1$. Since $D_n(\lambda)$ is divisible by each $D_k(\lambda)$, it follows that $D_k(\lambda) = 1$ $(k = 1, 2, \cdots, n)$. Hence all the elementary divisors $E_k(\lambda)$ of an invertible matrix are equal to one and the canonical diagonal form of such a matrix is therefore the unit matrix.
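For example, the matrix $P(\lambda) = \begin{pmatrix} 1 & \lambda \\ 0 & 1 \end{pmatrix}$ is invertible: $\det P(\lambda) = 1$, and $[P(\lambda)]^{-1} = \begin{pmatrix} 1 & -\lambda \\ 0 & 1 \end{pmatrix}$ is again a polynomial matrix; accordingly its canonical diagonal form is the unit matrix.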

THEOREM 3. Two polynomial matrices $A(\lambda)$ and $B(\lambda)$ are equivalent if and only if there exist invertible polynomial matrices $P(\lambda)$ and $Q(\lambda)$ such that

(7)  $A(\lambda) = P(\lambda)B(\lambda)Q(\lambda)$.

Proof: We first show that if $A(\lambda)$ and $B(\lambda)$ are equivalent, then there exist invertible matrices $P(\lambda)$ and $Q(\lambda)$ such that (7) holds. To this end we observe that every elementary transformation of a polynomial matrix $A(\lambda)$ can be realized by multiplying $A(\lambda)$ on the right or on the left by a suitable invertible polynomial matrix, namely, by the matrix of the elementary transformation in question. We illustrate this for all three types of elementary transformations. Thus let there be given a polynomial matrix

$A(\lambda) = \begin{pmatrix} a_{11}(\lambda) & a_{12}(\lambda) & \cdots & a_{1n}(\lambda) \\ a_{21}(\lambda) & a_{22}(\lambda) & \cdots & a_{2n}(\lambda) \\ \cdots & \cdots & \cdots & \cdots \\ a_{n1}(\lambda) & a_{n2}(\lambda) & \cdots & a_{nn}(\lambda) \end{pmatrix}$.

To permute the first two columns (rows) of this matrix, we must multiply it on the right (left) by the matrix

(8)  $\begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 1 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix}$

obtained by permuting the first two columns (or, what amounts to the same thing, rows) of the unit matrix. To multiply the second column (row) of the matrix $A(\lambda)$ by some number $\alpha$ we must multiply it on the right (left) by the matrix

(9)  $\begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & \alpha & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix}$

obtained from the unit matrix by multiplying its second column (or, what amounts to the same thing, row) by $\alpha$. Finally, to add to the first column of $A(\lambda)$ the second column multiplied by $\varphi(\lambda)$ we must multiply $A(\lambda)$ on the right by the matrix

(10)  $\begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ \varphi(\lambda) & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix}$,

and likewise, to add to the first row of $A(\lambda)$ the second row multiplied by $\varphi(\lambda)$ we must multiply $A(\lambda)$ on the left by the matrix

(11)  $\begin{pmatrix} 1 & \varphi(\lambda) & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix}$,

each obtained from the unit matrix by just such an elementary transformation. As we see, the matrices of elementary transformations are obtained by applying an elementary transformation to $E$. To effect an elementary transformation of the columns of a polynomial matrix $A(\lambda)$ we must multiply it by the matrix of the transformation on the right, and to effect an elementary transformation of the rows of $A(\lambda)$ we must multiply it by the appropriate matrix on the left.

Computation of the determinants of the matrices (8) through (11) shows that they are all non-zero constants and the matrices are therefore invertible. Since we assumed that $A(\lambda)$ and $B(\lambda)$ are equivalent, $A(\lambda)$ is obtained from $B(\lambda)$ by applying to the latter a sequence of elementary transformations. Every elementary transformation can be effected by multiplying $B(\lambda)$ by an invertible polynomial matrix. Consequently, $A(\lambda)$ can be obtained from $B(\lambda)$ by multiplying the latter by some sequence of invertible polynomial matrices on the left and by some sequence of invertible polynomial matrices on the right. Since the determinant of a product of matrices equals the product of the determinants, the product of invertible matrices is again an invertible matrix, and so the first part of our theorem is proved.

It follows from what was just said that every invertible matrix is the product of matrices of elementary transformations. Indeed, every invertible matrix $Q(\lambda)$ is equivalent to the unit matrix and can therefore be written in the form $Q(\lambda) = P_1(\lambda)EP_2(\lambda)$, where $P_1(\lambda)$ and $P_2(\lambda)$ are products of matrices of elementary transformations. But this means that $Q(\lambda) = P_1(\lambda)P_2(\lambda)$ is itself a product of matrices of elementary transformations.

This observation can be used to prove the second half of our theorem. Indeed, let $A(\lambda) = P(\lambda)B(\lambda)Q(\lambda)$, where $P(\lambda)$ and $Q(\lambda)$ are invertible matrices. In view of our observation, it must be possible to obtain $A(\lambda)$ by applying a sequence of elementary transformations to $B(\lambda)$. Hence $A(\lambda)$ is equivalent to $B(\lambda)$, which is what we wished to prove.
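For instance, for $n = 2$, multiplying a matrix $B(\lambda)$ on the left by $\begin{pmatrix} 1 & \varphi(\lambda) \\ 0 & 1 \end{pmatrix}$ adds the second row multiplied by $\varphi(\lambda)$ to the first row, while multiplying $B(\lambda)$ on the right by the same matrix adds the first column multiplied by $\varphi(\lambda)$ to the second column; both factors have determinant one and are therefore invertible.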

4.⁹ In this paragraph we shall study polynomial matrices of the form $A - \lambda E$, $A$ constant. The main problem solved here is that of the equivalence of polynomial matrices $A - \lambda E$ and $B - \lambda E$ of degree one.

It is easy to see that if $A$ and $B$ are similar, i.e., if there exists a non-singular constant matrix $C$ such that $B = C^{-1}AC$, then the polynomial matrices $A - \lambda E$ and $B - \lambda E$ are equivalent. Indeed, if $B = C^{-1}AC$, then $B - \lambda E = C^{-1}(A - \lambda E)C$. Since a non-singular constant matrix is a special case of an invertible polynomial matrix, Theorem 3 implies the equivalence of $A - \lambda E$ and $B - \lambda E$. Later we show the converse of this result, namely, that the equivalence of the polynomial matrices $A - \lambda E$ and $B - \lambda E$ implies the similarity of the matrices $A$ and $B$. This will yield, among others, a new proof of the fact that every matrix is similar to a matrix in Jordan canonical form.

⁹ This paragraph may be omitted since it contains an alternate proof, independent of § 19, of the fact that every matrix can be reduced to Jordan canonical form.

We begin by proving the following lemma:

LEMMA. Every polynomial matrix $P(\lambda) = P_0\lambda^n + P_1\lambda^{n-1} + \cdots + P_n$ can be divided on the left by a matrix of the form $A - \lambda E$ ($A$ any constant matrix), i.e., there exist matrices $S(\lambda)$ and $R$ ($R$ constant) such that

$P(\lambda) = (A - \lambda E)S(\lambda) + R$.

The process of division involved in the proof of the lemma differs from ordinary division only in that our multiplication is non-commutative.

Proof: Let $P(\lambda) = P_0\lambda^n + P_1\lambda^{n-1} + \cdots + P_n$, where the $P_i$ are constant matrices. It is easy to see that the polynomial matrix $P(\lambda) + (A - \lambda E)P_0\lambda^{n-1}$ is of degree not higher than $n - 1$. If

$P(\lambda) + (A - \lambda E)P_0\lambda^{n-1} = P'_0\lambda^{n-1} + P'_1\lambda^{n-2} + \cdots + P'_{n-1}$,

then the polynomial matrix $P(\lambda) + (A - \lambda E)P_0\lambda^{n-1} + (A - \lambda E)P'_0\lambda^{n-2}$ is of degree not higher than $n - 2$. Continuing this process we obtain a polynomial matrix

$P(\lambda) + (A - \lambda E)(P_0\lambda^{n-1} + P'_0\lambda^{n-2} + \cdots)$

of degree not higher than zero, i.e., independent of $\lambda$. If $R$ denotes the constant matrix just obtained, then

$P(\lambda) = -(A - \lambda E)(P_0\lambda^{n-1} + P'_0\lambda^{n-2} + \cdots) + R$,

or, putting $S(\lambda) = -(P_0\lambda^{n-1} + P'_0\lambda^{n-2} + \cdots)$,

$P(\lambda) = (A - \lambda E)S(\lambda) + R$.

This proves our lemma. A similar proof holds for the possibility of division on the right, i.e., there exist matrices $S_1(\lambda)$ and $R_1$ such that $P(\lambda) = S_1(\lambda)(A - \lambda E) + R_1$. We note that in our case, just as in the ordinary theorem of Bezout, the remainder $R$ is obtained by substituting the matrix $A$ for $\lambda$ in $P(\lambda)$; since our multiplication is non-commutative, the powers of $A$ must here be written to the left of the coefficients $P_k$.

THEOREM 4. The polynomial matrices $A - \lambda E$ and $B - \lambda E$ are equivalent if and only if the matrices $A$ and $B$ are similar.

Proof: The sufficiency part of the proof was given in the beginning of this paragraph. It remains to prove necessity, i.e., we must show that the equivalence of $A - \lambda E$ and $B - \lambda E$ implies the similarity of $A$ and $B$. By Theorem 3 there exist invertible polynomial matrices $P(\lambda)$ and $Q(\lambda)$ such that

(12)  $B - \lambda E = P(\lambda)(A - \lambda E)Q(\lambda)$.

We shall first show that $P(\lambda)$ and $Q(\lambda)$ in (12) may be replaced by constant matrices.

To this end we divide $P(\lambda)$ on the left by $B - \lambda E$ and $Q(\lambda)$ by $B - \lambda E$ on the right:

(13)  $P(\lambda) = (B - \lambda E)P_1(\lambda) + P_0$,
(14)  $Q(\lambda) = Q_1(\lambda)(B - \lambda E) + Q_0$,

where $P_0$ and $Q_0$ are constant matrices. If we insert these expressions for $P(\lambda)$ and $Q(\lambda)$ in the formula (12) and carry out the indicated multiplications, we obtain

$B - \lambda E = (B - \lambda E)P_1(\lambda)(A - \lambda E)Q_1(\lambda)(B - \lambda E) + (B - \lambda E)P_1(\lambda)(A - \lambda E)Q_0 + P_0(A - \lambda E)Q_1(\lambda)(B - \lambda E) + P_0(A - \lambda E)Q_0$.

If we transfer the last summand on the right side of the above equation to its left side and denote the sum of the remaining terms by $K(\lambda)$, i.e., if we put

$K(\lambda) = (B - \lambda E)P_1(\lambda)(A - \lambda E)Q_1(\lambda)(B - \lambda E) + (B - \lambda E)P_1(\lambda)(A - \lambda E)Q_0 + P_0(A - \lambda E)Q_1(\lambda)(B - \lambda E)$,

then we get

(15)  $B - \lambda E - P_0(A - \lambda E)Q_0 = K(\lambda)$.

We now show that $K(\lambda) = O$. Since $Q_1(\lambda)(B - \lambda E) + Q_0 = Q(\lambda)$, the first two summands in $K(\lambda)$ can be written as follows:

$(B - \lambda E)P_1(\lambda)(A - \lambda E)Q_1(\lambda)(B - \lambda E) + (B - \lambda E)P_1(\lambda)(A - \lambda E)Q_0 = (B - \lambda E)P_1(\lambda)(A - \lambda E)Q(\lambda)$.

We now add to and subtract from the third summand in $K(\lambda)$ the expression $(B - \lambda E)P_1(\lambda)(A - \lambda E)Q_1(\lambda)(B - \lambda E)$ and find

$K(\lambda) = (B - \lambda E)P_1(\lambda)(A - \lambda E)Q(\lambda) + P(\lambda)(A - \lambda E)Q_1(\lambda)(B - \lambda E) - (B - \lambda E)P_1(\lambda)(A - \lambda E)Q_1(\lambda)(B - \lambda E)$.

But in view of (12),

$(A - \lambda E)Q(\lambda) = P^{-1}(\lambda)(B - \lambda E)$,   $P(\lambda)(A - \lambda E) = (B - \lambda E)Q^{-1}(\lambda)$.

Using these relations we can rewrite $K(\lambda)$ in the following manner:

$K(\lambda) = (B - \lambda E)\bigl[P_1(\lambda)P^{-1}(\lambda) + Q^{-1}(\lambda)Q_1(\lambda) - P_1(\lambda)(A - \lambda E)Q_1(\lambda)\bigr](B - \lambda E)$.

Since $P(\lambda)$ and $Q(\lambda)$ are invertible, the expression in the square brackets is a polynomial in $\lambda$. We shall prove this polynomial to be zero. Assume that this polynomial is not zero and is of degree $m$. Then it is easy to see that $K(\lambda)$ is of degree $m + 2$ and, since $m \ge 0$, $K(\lambda)$ is at least of degree two. But (15) implies that $K(\lambda)$ is at most of degree one. Hence the expression in the square brackets, and with it $K(\lambda)$, is zero. We have thus found that

(17)  $B - \lambda E = P_0(A - \lambda E)Q_0$.

Equating coefficients of $\lambda$ in (17) we see that $P_0Q_0 = E$, which shows that the matrices $P_0$ and $Q_0$ are non-singular and that $P_0 = Q_0^{-1}$. Equating the free terms we find that $B = P_0AQ_0 = Q_0^{-1}AQ_0$, i.e., that $A$ and $B$ are similar. This completes the proof of our theorem.

Since equivalence of the matrices $A - \lambda E$ and $B - \lambda E$ is synonymous with identity of their elementary divisors, it follows from the theorem just proved that two matrices $A$ and $B$ are similar if and only if $A - \lambda E$ and $B - \lambda E$ have the same elementary divisors.

We now show that every matrix $A$ is similar to a matrix in Jordan canonical form. To this end we consider the matrix $A - \lambda E$ and find its elementary divisors. Using these we construct, as in § 20, a matrix $B$ in Jordan canonical form. $B - \lambda E$ has the same elementary divisors as $A - \lambda E$; but then $B$ is similar to $A$. As was indicated in the footnote at the beginning of this paragraph, this gives another proof of the fact that every matrix is similar to a matrix in Jordan canonical form. Of course, the contents of this paragraph can be deduced directly from §§ 19 and 20.
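For example, the matrices $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ are not similar: for $A - \lambda E$ we have $D_1(\lambda) = 1$, $D_2(\lambda) = (\lambda - 1)^2$, so its elementary divisors are $1$ and $(\lambda - 1)^2$, while for $B - \lambda E$ we have $D_1(\lambda) = \lambda - 1$, $D_2(\lambda) = (\lambda - 1)^2$, so its elementary divisors are $\lambda - 1$ and $\lambda - 1$.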

CHAPTER IV

Introduction to Tensors
§ 22. The dual space
1. Definition of the dual space. Let R be a vector space. Together with R one frequently considers another space called the dual space which is closely connected with R. The starting point for the definition of a dual space is the notion of a linear function introduced in para. 1, § 4.

We recall that a function $f(x)$, $x \in R$, is called linear if it satisfies the following conditions:

$f(x + y) = f(x) + f(y)$,   $f(\lambda x) = \lambda f(x)$.

Let $e_1, e_2, \cdots, e_n$ be a basis in an $n$-dimensional space $R$. If

$x = \xi^1 e_1 + \xi^2 e_2 + \cdots + \xi^n e_n$

is a vector in $R$ and $f$ is a linear function on $R$, then (cf. § 4) we can write

(1)  $f(x) = f(\xi^1 e_1 + \xi^2 e_2 + \cdots + \xi^n e_n) = a_1\xi^1 + a_2\xi^2 + \cdots + a_n\xi^n$,

where the coefficients $a_1, a_2, \cdots, a_n$ which determine the linear function are given by

(2)  $a_1 = f(e_1)$,  $a_2 = f(e_2)$,  $\cdots$,  $a_n = f(e_n)$.

It is clear from (1) that given a basis $e_1, e_2, \cdots, e_n$ every $n$-tuple $a_1, a_2, \cdots, a_n$ determines a unique linear function.

Let $f$ and $g$ be linear functions. By the sum $h$ of $f$ and $g$ we mean the function which associates with a vector $x$ the number $f(x) + g(x)$. By the product of $f$ by a number $\alpha$ we mean the function which associates with a vector $x$ the number $\alpha f(x)$. Obviously the sum of two linear functions and the product of a function by a number are again linear functions. Also, if $f$ is

, a and g by the numbers determined by the numbers al, a2, g is determined by the numbers al + b n , then f b1, b2, , a, + b,, , a + bn and xi' by the numbers arz,., a2, , acin. Thus the totality of linear functions on R forms a vector space.
DEFINITION 1. Let $R$ be an $n$-dimensional vector space. By the dual space $\tilde R$ of $R$ we mean the vector space whose elements are linear functions defined on $R$. Addition and scalar multiplication in $\tilde R$ follow the rules of addition and scalar multiplication for linear functions.

In view of the fact that relative to a given basis $e_1, e_2, \cdots, e_n$ in $R$ every linear function $f$ is uniquely determined by an $n$-tuple $a_1, a_2, \cdots, a_n$ and that this correspondence preserves sums and products (of vectors by scalars), it follows that $\tilde R$ is isomorphic to the space of $n$-tuples of numbers. One consequence of this fact is that the dual space $\tilde R$ of the $n$-dimensional space $R$ is likewise $n$-dimensional.

The vectors in $R$ are said to be contravariant, those in $\tilde R$, covariant. In the sequel $x, y, \cdots$ will denote elements of $R$ and $f, g, \cdots$ elements of $\tilde R$.

2. Dual bases. In the sequel we shall denote the value of a linear function $f$ at a point $x$ by $(f, x)$. Thus with every pair $f \in \tilde R$ and $x \in R$ there is associated a number $(f, x)$ and the following relations hold:

1. $(f, x_1 + x_2) = (f, x_1) + (f, x_2)$,
2. $(f, \lambda x) = \lambda(f, x)$,
3. $(\lambda f, x) = \lambda(f, x)$,
4. $(f_1 + f_2, x) = (f_1, x) + (f_2, x)$.

The first two of these relations stand for $f(x_1 + x_2) = f(x_1) + f(x_2)$ and $f(\lambda x) = \lambda f(x)$ and so express the linearity of $f$. The third defines the product of a linear function by a number and the fourth, the sum of two linear functions. The form of the relations 1 through 4 is like that of Axioms 2 and 3 for an inner product (§ 2). However, an inner product is a number associated with a pair of vectors from the same Euclidean space, whereas $(f, x)$ is a number associated with a pair of vectors belonging to two different vector spaces $R$ and $\tilde R$.


Two vectors $x \in R$ and $f \in \tilde R$ are said to be orthogonal if

$(f, x) = 0$.

In the case of a single space $R$ orthogonality is defined for Euclidean spaces only. If $R$ is an arbitrary vector space we can still speak of elements of $\tilde R$ being orthogonal to elements of $R$.

DEFINITION 2. Let $e_1, e_2, \cdots, e_n$ be a basis in $R$ and $f^1, f^2, \cdots, f^n$ a basis in $\tilde R$. The two bases are said to be dual if

(3)  $(f^i, e_k) = \begin{cases} 1 & \text{when } i = k \\ 0 & \text{when } i \ne k \end{cases}$   $(i, k = 1, 2, \cdots, n)$.

In terms of the symbol $\delta_k^{\ i}$, defined by

$\delta_k^{\ i} = \begin{cases} 1 & \text{when } i = k \\ 0 & \text{when } i \ne k \end{cases}$   $(i, k = 1, 2, \cdots, n)$,

condition (3) can be rewritten as

$(f^i, e_k) = \delta_k^{\ i}$.

If $e_1, e_2, \cdots, e_n$ is a basis in $R$, then the numbers $(f, e_k) = f(e_k)$ are the numbers $a_k$ which determine the linear function $f \in \tilde R$ (cf. formula (2)). This remark implies that

if $e_1, e_2, \cdots, e_n$ is a basis in $R$, then there exists a unique basis $f^1, f^2, \cdots, f^n$ in $\tilde R$ dual to $e_1, e_2, \cdots, e_n$.

The proof is immediate: The equations

$(f^1, e_1) = 1$,  $(f^1, e_2) = 0$,  $\cdots$,  $(f^1, e_n) = 0$

define a unique vector (linear function) $f^1 \in \tilde R$. The equations

$(f^2, e_1) = 0$,  $(f^2, e_2) = 1$,  $\cdots$,  $(f^2, e_n) = 0$

define a unique vector (linear function) $f^2 \in \tilde R$, etc. The vectors $f^1, f^2, \cdots, f^n$ are linearly independent since the corresponding $n$-tuples of numbers are linearly independent. Thus $f^1, f^2, \cdots, f^n$ constitute a unique basis of $\tilde R$ dual to the basis $e_1, e_2, \cdots, e_n$ of $R$.
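For instance, if $e_1, e_2$ is a basis of a two-dimensional space and $f$ is the linear function with $f(e_1) = 5$, $f(e_2) = -2$, then relative to the dual basis $f^1, f^2$ we have $f = 5f^1 - 2f^2$, and for $x = \xi^1 e_1 + \xi^2 e_2$ we get $(f, x) = 5\xi^1 - 2\xi^2$.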

In the sequel we shall follow a familiar convention of tensor analysis according to which one leaves out summation signs and sums over any index which appears both as a superscript and as a subscript. Thus $\xi^i\eta_i$ stands for $\xi^1\eta_1 + \xi^2\eta_2 + \cdots + \xi^n\eta_n$.

Given dual bases $e_i$ and $f^k$ one can easily express the coordinates of any vector. Thus, if $x \in R$ and

x). . . x) = (nip. f2. 3. p. We now show that it is possible to interchange the roles of R and R without affecting the theory developed so far. = (f. are the coordinates of x c R relative to the basis e1. en and P. vet) ei(fk. Similarly. e2. e1)61k Ek. x = El ei Then 52e2 + + e. . .INTRODUCTION TO TENSORS 167 then (fk. = (fi. the coordinates Ek of a vector x in the basis e. . R was defined as the totality of linear functions on R. . . /2. Thus let . en and Th. (4) respectively (1. x) (fk. are the coordinates off E R relative to the basis in. its dual basis in R then (if x) = niE' + 172E2 + + nnen. . We wish . where f is the basis dual to the basis e. f2.1cn1ek ++Thifn. andf. x) = a. be dual bases. NOTE. e2. rke. Hence. Interchangeability of R and R. (fi. e2.. (f. . x) in terms of the coordinates of the vectors f and x with respect to the bases e1. ek) and f = ntf' + n2f2 6. h in R and R where $1. E2. e2. . can be computed from the formulas ek (fk. n . e2. For arbitrary bases el. We shall express the number (f. en and fi. fn. e. e2. p. if fe k and f= nkfki then Now let e1. e is a basis in R and f'. . ek)ntek To repeat: If el.e).i. where a/c. respectively. . e.

. then cp(f) (f. as Now let x. e2. on ft and the vectors x. then. x) which connects the elements of the two spaces.72 4- + (re Then. . 2 above we showed that for every basis in R there exists a unique dual basis in R. If we specify the en. . fn are rh. e is given by e'. as a rule. then we can write q)(f) = (tin. If the coordinates off relative to the basis dual by Jr'. f2. f2. x) so that conditions 1 through 4 above hold and. x) 0 for all f implies x O. 5. we specify the coordinates of a vector f E R relative to the dual basis f'.n. In para. . x) for some fixed vector xo in R. . . + a2. (f. = ci"ek. coordinates of a vector x e R relative to some basis e1. We observe that the only operations used in the simultaneous study of a space and its dual space are the operations of addition of vectors and multiplication of a vector by a scalar in each of the spaces involved and the operation (f. e R and permits us to view R as the space of linear functions on R thus placing the tvvo spaces on the same footing. xo). be the vector alei we saw in para. en in R and denote its f". for every basis in R there exists a unique dual basis in R. e'2. e2. Transformation of coordinates in R and R. Such a definition runs as follows: a pair of dual spaces R and R is a pair of n-dimensional vector spaces and an operation (f. x) NOTE: 0 for all x implies f = 0. e' be a new basis in R whose connection with the basis e1. e2. In view of the interchangeability of R and R. in addition. It is there- fore possible to give a definition of a pair of dual spaces R and R which emphasizes the parallel roles played by the two spaces. fn of e. 172. . and (f. 2. 4. .168 LECTURES ON LINEAR ALGEBRA to show that if q) is a linear function on R. To this end we choose a basis el. x) which associates with f e R and xeR a number (f. f2. . e2. This formula establishes the desired one-to-one correspondence between the linear functions 9. e. X0) = a2e2 4- + (P + ann and (5) (f. Now let e'1. (f. (6) .

so that It follows that the coordinates of vectors in R transform like the vectors of the dual basis in k. . e2. f2. the matrix in (6') is the transpose 1 of the transition matrix in (6). (fk. . .x) bkiek. e'2. e'.Eiketk)= e". i. . i.. fn: inverse. = czknk This is seen by comparing the matrices in (6) and (6'). Similarly. en to e'l . ciece2) = ci. Thus let ei be the coordinates of x ER relative to a basis e1.(fk. e'2. f'2. We first find its of transition from the basis f'1.r fk To this end we compute (7. the matrix (6') .e. Now e't = (f". e'2. fOc= to fg. We say that the matrix lime j/ in (6') is the transpose of the transition matrix in (6) because the summation indices in (6) and (6') are different. f'k is equal to the inverse of the transpose of the matrix which is the matrix of transition from e1. the coordinates of vectors in k transform like the vectors of the dual basis in R. ni. .. = bkiek. x) (bklik. We wish to find the matrix 111)7'11 of transition from the fi basis to the f'.e. . e'i) =1= = e'1) cik e'i) u5k (f Hence c1 = u1k. e'i) in two ways: (fk. VVe now discuss the effect of a change of basis on the coordinates e' of vectors in R and k. . e2. . f'n be the dual basis of e'1. to the basis F.INTRODUCTION TO TENSORS 169 Let p. f2.. . e'i) = (fk. e and e'i its coordinates in a new basis e'1. Then x) --= (I' kek) = (f'1 x) = (f". . p be the dual basis of e1. f2. e'. basis.x)=bki(fk. f '2. f'2. e andf'1. It follows that the matrix of the transition from fl...

But this The converse is obvious. (x.. y1 means that y. The fact the 111). Thus in the case of a Euclidean space everyf in k can be replaced with the appropriate y in R and instead of writing (f. y. Conversely. x = eiei. where y is a fixed vector uniquely determined by the linear function f. y) = (x. . 5. y). Now let y be the vector with coordinates c11. y1) and f(x) = (x. Since the . then f(x) is of the form f(x) = ale + a2e2 + basis e1.k11 is the inverse of the transpose of 11c. en is orthonormal. y) = ct1e1 a2E2 + + This shows the existence of a vector y such that for all x f(x) = (x.) = 0 for all x. e. Since the sumultaneous study of a vector space and its dual space involves only the usual vector operations and the operation . For the sake of simplicity we restrict our discussion to the case of real Euclidean spaces..170 LECTURES ON LINEAR ALGEBRA We summarize our findings in the following rule: when we change from an "old" coordinate system to a "new" one objects with lower case index transform in one way and objects with upper case index transform in a different way. Of the matrices r[cikl and I ib. If + ase. e2. (x. The dual of a Euclidean space. Then every linear function f on R can be expressed in the form f(x) = (x. et. LEMMA. a.k11 is expressed in the relations c.k1 involved in these transformations one is the inverse of the transpose of the other. Let R be an n-dimensional Euclidean space. Y). then (x. x) we can write (y. every vector y determines a linear function f such that f(x) = (x. i..abni = 6/. Proof: Let e1. = O.e. x). . To prove the uniqueness of y we observe that if f(x) = (x. = by. e be an orthonormal basis of R. y. y). y2). y2).

Tensors 1. where the matrix Hell is the inverse of the matrix I Lgikl I.. R by R. i. Multilinear functions. f2. in case of a . ek) = giabe = gik (ei ek) = Thus if the basis of the J. x). It is natural to try to find expressions for the f in terms of the given e. Show that gik i.INTRODUCTION TO TENSORS 171 ( f x) which connects elements fe R and x e R. Let e. its dual basis in R. Euclidean space. Let e]. tk) § 23. ek). flak EXERCISE.. If we were to identify R and R we would have to write in place of (I. replace f by y. then ek = gkia. If R is Euclidean...e. x).e. When we identify R and its dual rt the concept of orthogonality of a vector x E R and a vector f E k (introduced in para. (p. x). . Now eic) = gj2.' is dual to that of the ek. A natural If R is an n-dimensional vector space. In the first chapter we studied linear and bilinear functions on an n-dimensional vector space. e be an arbitrary basis in R and f'. (y. We wish to find the coefficients go. e2. x e R. Solving equation (10) for f' we obtain the required result f = gi2e . then R is also n-dimensional and so R and R are isomorphic. = g f a. 2 above) reduces to that of orthogonality of two vectors of R. 2 This situation is sometimes described as follows: in Euclidean space one can replace covariant vectors by contravariant vectors. and (f. . y. X) by (y. we may. we can identify R with R and so in look upon the f as elements of R. But this would have the effect 2 of introducing an inner product in R. where g ik (ei. we may identify a Euclidean space R with its dual space R.

) = ul(x. 1) defines a vector in R (a contravariant vector). 3. . f' .172 LECTURES ON LINEAR ALGEBRA generalization of these concepts is the concept of a multilinear function of an arbitrary number of vectors some of which are elements of R and some of vvhich are elements of R. i. 1(x. ). if we fix all vectors but the first then /(x' x". DEFINITION I. f'. f. g. g. ) = 1(x'. Similarly. f. y. The bilinear function of type (y) . 0) is a linear function of one vector in R. ). . ) 1(x" . y. Again. g. . g. let y = Ax be a linear transformation on R. 0) and (0. (ß) bilinear functions on R. g. e R and e R (the dual of R) if 1 is linear in each of its Thus. § 22. . and q vectors in R (covariant vectors) is called a multilinear function of type (p. . y. f. g. . g. y. a vector in R (a covariant vector). q). for example. g. uf. arguments. . ) . y. -) = 2/(x. g. as was shown in para. ). f. y. 1(x. A multilinear function of p vectors in R (contravariant vectors) ) f". y. y. . A function 1(x. The simplest multilinear functions are those of type (1. ).e. f". f. y. g. a multilinear function of type (0. 1). f. /(2x. A multilinear function of type (1. 1(x. ) q vectors f. Indeed. g. functions of one vector in R and one in R. y. y. y. There is a close connection between functions of type (y) and linear transformat ons. g.. /(x. There are three types of multilinear functions of two vectors (bilinear functions): bilinear functions on R (cons dered in § 4). f. is said to be a multilinear function of p vectors x. .
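For instance, the function $l(x; f) = (f, x)$ is a multilinear function of type (γ), i.e., of type $(1, 1)$; relative to a pair of dual bases its coefficients are $l(e_i; f^k) = (f^k, e_i) = \delta_i^{\ k}$.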

There is a close connection between functions of type (γ) and linear transformations. Indeed, let $y = Ax$ be a linear transformation on $R$. The bilinear function of type (γ) associated with $A$ is the function $(f, Ax)$, which depends linearly on the vectors $x \in R$ and $f \in \tilde R$. As in § 11 of chapter II one can prove the converse, that with every bilinear function of type (γ) one can associate a linear transformation on $R$.

2. Expressions for multilinear functions in a given coordinate system. Coordinate transformations. We now express a multilinear function in terms of the coordinates of its arguments. For simplicity we consider the case of a multilinear function $l(x, y; f)$ of two vectors $x, y \in R$ and one vector $f \in \tilde R$ (a function of type $(2, 1)$). Let $e_1, e_2, \cdots, e_n$ be a basis in $R$ and $f^1, f^2, \cdots, f^n$ its dual in $\tilde R$. Let

$x = \xi^i e_i$,  $y = \eta^j e_j$,  $f = \zeta_k f^k$.

Then

$l(x, y; f) = l(\xi^i e_i, \eta^j e_j; \zeta_k f^k) = \xi^i\eta^j\zeta_k\,l(e_i, e_j; f^k)$,

i.e.,

(1)  $l(x, y; f) = a_{ij}{}^{k}\,\xi^i\eta^j\zeta_k$,

where the coefficients $a_{ij}{}^{k}$ which determine the function $l(x, y; f)$ are given by the relations

(2)  $a_{ij}{}^{k} = l(e_i, e_j; f^k)$.

This shows that the $a_{ij}{}^{k}$ depend on the choice of bases in $R$ and $\tilde R$. A similar formula holds for a general multilinear function $l(x, y, \cdots; f, g, \cdots)$.

We now show how the system of numbers which determine a multilinear function changes as a result of a change of basis. Thus let $e_1, \cdots, e_n$ be a basis in $R$ and $f^1, \cdots, f^n$ its dual basis in $\tilde R$, and let $e'_1, \cdots, e'_n$ be a new basis in $R$ and $f'^1, \cdots, f'^n$ its dual in $\tilde R$. If

(3)  $e'_{\alpha} = c_{\alpha}{}^{i} e_i$,

then (cf. para. 4 of § 22)

(4)  $f'^{\beta} = b_{k}{}^{\beta} f^k$,

where the matrix $\|b_k{}^{\beta}\|$ is the transpose of the inverse of $\|c_{\alpha}{}^{i}\|$. For a fixed $\alpha$ the numbers $c_{\alpha}{}^{i}$ in (3) are the coordinates of the vector $e'_{\alpha}$ relative to the basis $e_1, \cdots, e_n$; similarly, for a fixed $\beta$ the numbers $b_k{}^{\beta}$ in (4) are the coordinates of $f'^{\beta}$ relative to the basis $f^1, \cdots, f^n$. We now compute the numbers $a'_{\alpha\beta}{}^{\gamma}$ which define our multilinear function relative to the bases $e'_i$ and $f'^k$:

$a'_{\alpha\beta}{}^{\gamma} = l(e'_{\alpha}, e'_{\beta}; f'^{\gamma}) = l(c_{\alpha}{}^{i}e_i, c_{\beta}{}^{j}e_j; b_k{}^{\gamma}f^k) = c_{\alpha}{}^{i}c_{\beta}{}^{j}b_k{}^{\gamma}\,l(e_i, e_j; f^k)$,

i.e.,

(5)  $a'_{\alpha\beta}{}^{\gamma} = c_{\alpha}{}^{i}c_{\beta}{}^{j}b_k{}^{\gamma}\,a_{ij}{}^{k}$.

To sum up: if the numbers $a_{ij}{}^{k}$ define a multilinear function $l(x, y; f)$ relative to a pair of dual bases $e_1, \cdots, e_n$ and $f^1, \cdots, f^n$, then the numbers which define it relative to the bases $e'_i$ and $f'^k$ are given by (5). Here $\|c_{\alpha}{}^{i}\|$ is the matrix defining the transformation of the $e$ basis and $\|b_k{}^{\gamma}\|$ is the matrix defining the transformation of the $f$ basis. This situation can be described briefly by saying that the lower indices of the numbers $a_{ij}{}^{k}$ are affected by the matrix $\|c_{\alpha}{}^{i}\|$ and the upper by the matrix $\|b_k{}^{\gamma}\|$ (cf. para. 4 of § 22).

3. Definition of a tensor. The objects which we have studied in this book (vectors, linear functions, linear transformations, bilinear functions, etc.) were defined relative to a given basis by an appropriate system of numbers: a vector by its $n$ coordinates, a linear function by its $n$ coefficients, a linear transformation by the $n^2$ entries in its matrix, a bilinear function by the $n^2$ entries in its matrix. In the case of each of these objects the associated system of numbers would, upon a change of basis, transform in a manner peculiar to each object, and to characterize the object one had to prescribe the values of these numbers relative to some basis as well as their law of transformation under a change of basis.

In para. 1 and 2 of this section we introduced the concept of a multilinear function. Relative to a definite basis this object is defined by $n^{p+q}$ numbers (2), which under change of basis transform in accordance with (5). We now define a closely related concept which plays an important role in many branches of physics, geometry and algebra.

DEFINITION 2. Let $R$ be an $n$-dimensional vector space. We say that a $p$ times covariant and $q$ times contravariant tensor is defined if with every basis in $R$ there is associated a set of $n^{p+q}$ numbers $a_{ij\cdots}{}^{rs\cdots}$ (there are $p$ lower indices and $q$ upper indices) which under the change of basis defined by a matrix $\|c_{\alpha}{}^{i}\|$ transform according to the rule

(6)  $a'_{\alpha\beta\cdots}{}^{\gamma\delta\cdots} = c_{\alpha}{}^{i}c_{\beta}{}^{j}\cdots b_r{}^{\gamma}b_s{}^{\delta}\cdots\,a_{ij\cdots}{}^{rs\cdots}$,

with $\|b_r{}^{\gamma}\|$ the transpose of the inverse of $\|c_{\alpha}{}^{i}\|$. The number $p + q$ is called the rank (valence) of the tensor. The numbers $a_{ij\cdots}{}^{rs\cdots}$ are called the components of the tensor.

Since the system of numbers defining a multilinear function of $p$ vectors in $R$ and $q$ vectors in $\tilde R$ transforms under change of basis in accordance with (6), such a multilinear function determines a unique tensor of rank $p + q$, $p$ times covariant and $q$ times contravariant. Conversely, every tensor determines a unique multilinear function. This permits us to deduce properties of tensors and of the operations on tensors using the "model" supplied by multilinear functions. Clearly, multilinear functions are only one of the possible realizations of tensors.

We now give a few examples of tensors.

Scalar. If we associate with every coordinate system the same constant $a$, then $a$ may be regarded as a tensor of rank zero. A tensor of rank zero is called a scalar.

Contravariant vector. Given a basis in $R$, every vector in $R$ determines $n$ numbers, its coordinates relative to this basis. These transform according to the rule $\xi'^{\alpha} = b_i{}^{\alpha}\xi^i$ and so represent a contravariant tensor of rank 1.

Linear function (covariant vector). The numbers $a_i$ defining a linear function transform according to the rule $a'_{\alpha} = c_{\alpha}{}^{i}a_i$ and so represent a covariant tensor of rank 1.

Linear transformation. Let $A$ be a linear transformation on $R$. With every basis we associate the matrix of $A$ relative to this basis. We shall show that this matrix is a tensor of rank two, once covariant and once contravariant. Let $\|a_i{}^{k}\|$ be the matrix of $A$ relative to the basis $e_1, \cdots, e_n$, i.e.,

$Ae_i = a_i{}^{k} e_k$.

Define a change of basis by the equations $e'_{\alpha} = c_{\alpha}{}^{i}e_i$ and let $b$ be defined by $e_k = b_k{}^{\beta}e'_{\beta}$, so that $c_{\alpha}{}^{i}b_i{}^{\beta} = \delta_{\alpha}{}^{\beta}$. Then

$Ae'_{\alpha} = A(c_{\alpha}{}^{i}e_i) = c_{\alpha}{}^{i}Ae_i = c_{\alpha}{}^{i}a_i{}^{k}e_k = c_{\alpha}{}^{i}a_i{}^{k}b_k{}^{\beta}e'_{\beta}$.

This means that the matrix of $A$ relative to the $e'_{\alpha}$ basis takes the form

$a'_{\alpha}{}^{\beta} = c_{\alpha}{}^{i}b_k{}^{\beta}a_i{}^{k}$,

which proves that the matrix of a linear transformation is indeed a tensor of rank two, once covariant and once contravariant.

Bilinear function. Let $A(x, y)$ be a bilinear form on $R$. With every basis we associate the matrix of the bilinear form relative to this basis. The resulting tensor is of rank two, twice covariant. Similarly, a bilinear form of vectors $f, g \in \tilde R$ defines a twice contravariant tensor, and a bilinear form of vectors $x \in R$ and $y \in \tilde R$ defines a tensor of rank two, once covariant and once contravariant.

In particular, the matrix of the identity transformation $E$ relative to any basis is the unit matrix, i.e., the system of numbers

$\delta_i{}^{k} = \begin{cases} 1 & \text{if } i = k \\ 0 & \text{if } i \ne k. \end{cases}$

Thus $\delta_i{}^{k}$ is the simplest tensor of rank two once covariant and once contravariant. One interesting feature of this tensor is that its components do not depend on the choice of basis.

EXERCISE. Show directly that the system of numbers $\delta_i{}^{k}$, associated with every basis, is a tensor.

We now prove two simple properties of tensors.

A sufficient condition for the equality of two tensors of the same type is the equality of their corresponding components relative to some basis. (This means that if the components of these two tensors relative to some basis are equal, then their components relative to any other basis must be equal.) For proof we observe that since the two tensors are of the same type they transform in exactly the same way, and since their components are the same in some coordinate system they must be the same in every coordinate system. We wish to emphasize that the assumption about the two tensors being of the same type is essential. Thus, both a linear transformation and a bilinear form are defined by a matrix; coincidence of the matrices defining these objects in one basis does not imply coincidence of the matrices defining them in another basis.

Given $p$ and $q$ it is always possible to construct a tensor of type $(p, q)$ whose components relative to some basis take on $n^{p+q}$ prescribed values. Thus let $a_{ij\cdots}{}^{rs\cdots}$ be the numbers prescribed in some basis. These numbers define a multilinear function $l(x, y, \cdots; f, g, \cdots)$ as per formula (1) in para. 2 of this section. The multilinear function, in turn, defines a unique tensor satisfying the required conditions.

4. Tensors in Euclidean space. If $R$ is a (real) $n$-dimensional Euclidean space, then, as was shown in para. 5 of § 22, it is possible to establish an isomorphism between $R$ and $\tilde R$ such that if $y \in R$ corresponds under this isomorphism to $f \in \tilde R$, then $(f, x) = (y, x)$ for all $x \in R$. Given a multilinear function $l$ of $p$ vectors $x, y, \cdots$ in $R$ and $q$ vectors $f, g, \cdots$ in $\tilde R$, we can replace the latter by corresponding vectors $u, v, \cdots$ in $R$ and so obtain a multilinear function $\tilde l(x, y, \cdots; u, v, \cdots)$ of $p + q$ vectors in $R$.

We now propose to express the coefficients of $\tilde l$ in terms of the coefficients of $l$. Thus let

$a_{ij\cdots}{}^{rs\cdots} = l(e_i, e_j, \cdots; f^r, f^s, \cdots)$

be the coefficients of the multilinear function $l(x, y, \cdots; f, g, \cdots)$ and let

$b_{ij\cdots rs\cdots} = \tilde l(e_i, e_j, \cdots; e_r, e_s, \cdots)$

be the coefficients of the multilinear function $\tilde l(x, y, \cdots; u, v, \cdots)$. We showed in para. 5 of § 22 that in Euclidean space the vectors $e_r$ of a basis dual to the $f^r$ are expressible in terms of the vectors $f^{\sigma}$ in the following manner:

$e_r = g_{r\sigma} f^{\sigma}$,  where  $g_{ik} = (e_i, e_k)$.

It follows that

$b_{ij\cdots rs\cdots} = \tilde l(e_i, e_j, \cdots; e_r, e_s, \cdots) = l(e_i, e_j, \cdots; g_{r\alpha}f^{\alpha}, g_{s\beta}f^{\beta}, \cdots) = g_{r\alpha}g_{s\beta}\cdots\,a_{ij\cdots}{}^{\alpha\beta\cdots}$.

In view of the established connection between multilinear functions and tensors we can restate our result for tensors: if $a_{ij\cdots}{}^{rs\cdots}$ is a tensor in Euclidean space, $p$ times covariant and $q$ times contravariant, then this tensor can be used to construct a new tensor $b_{ij\cdots rs\cdots}$ which is $p + q$ times covariant, namely

$b_{ij\cdots rs\cdots} = g_{r\alpha}g_{s\beta}\cdots\,a_{ij\cdots}{}^{\alpha\beta\cdots}$.

This operation is referred to as lowering of indices. Here $g_{ik}$ is a twice covariant tensor. This is obvious if we observe that the $g_{ik} = (e_i, e_k)$ are the coefficients of a bilinear form, namely, the inner product relative to the basis $e_1, \cdots, e_n$. In view of its connection with the inner product (metric) in our space, the tensor $g_{ik}$ is called a metric tensor.

The equation

$a_{ij\cdots}{}^{rs\cdots} = g^{r\alpha}g^{s\beta}\cdots\,b_{ij\cdots\alpha\beta\cdots}$,

where the matrix $\|g^{ik}\|$ is the inverse of the matrix $\|g_{ik}\|$, defines the analog of the operation just discussed. The new operation is referred to as raising the indices.

EXERCISE. Show that $g^{ik}$ is a twice contravariant tensor.
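For instance, in an orthonormal basis $g_{ik} = \delta_{ik}$, and lowering an index changes nothing: $\xi_i = g_{i\alpha}\xi^{\alpha} = \xi^i$. If instead, for $n = 2$, $g_{11} = 1$, $g_{12} = g_{21} = 1$, $g_{22} = 2$, then the covariant coordinates of a vector $\xi^i$ are $\xi_1 = \xi^1 + \xi^2$ and $\xi_2 = \xi^1 + 2\xi^2$.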

5. Operations on tensors. In view of the connection between tensors and multilinear functions it is natural first to define operations on multilinear functions and then express these definitions in the language of tensors relative to some basis.

Addition of tensors. Let $l'(x, y, \cdots; f, g, \cdots)$ and $l''(x, y, \cdots; f, g, \cdots)$ be two multilinear functions of the same number of vectors in $R$ and the same number of vectors in $\tilde R$. We define their sum $l(x, y, \cdots; f, g, \cdots)$ by the formula

$l(x, y, \cdots; f, g, \cdots) = l'(x, y, \cdots; f, g, \cdots) + l''(x, y, \cdots; f, g, \cdots)$.

Clearly this sum is again a multilinear function of the same number of vectors in $R$ and $\tilde R$ as the summands $l'$ and $l''$. Consequently addition of tensors is defined by means of the formula

$a_{ij\cdots}{}^{rs\cdots} = a'_{ij\cdots}{}^{rs\cdots} + a''_{ij\cdots}{}^{rs\cdots}$.

Multiplication of tensors. Let $l'(x, y, \cdots; f, g, \cdots)$ and $l''(z, \cdots; h, \cdots)$ be two multilinear functions of which the first depends on $p'$ vectors in $R$ and $q'$ vectors in $\tilde R$ and the second on $p''$ vectors in $R$ and $q''$ vectors in $\tilde R$. We define the product $l$ of $l'$ and $l''$ by means of the formula

$l(x, y, \cdots, z, \cdots; f, g, \cdots, h, \cdots) = l'(x, y, \cdots; f, g, \cdots)\cdot l''(z, \cdots; h, \cdots)$.

$l$ is a multilinear function of $p' + p''$ vectors in $R$ and $q' + q''$ vectors in $\tilde R$; to see this we need only vary in $l$ one vector at a time keeping all other vectors fixed.

We shall now express the components of the tensor corresponding to the product of the multilinear functions $l'$ and $l''$ in terms of the components of the tensors corresponding to $l'$ and $l''$. Since

$a'_{ij\cdots}{}^{rs\cdots} = l'(e_i, e_j, \cdots; f^r, f^s, \cdots)$  and  $a''_{kl\cdots}{}^{tu\cdots} = l''(e_k, e_l, \cdots; f^t, f^u, \cdots)$,

it follows that

$a_{ij\cdots kl\cdots}{}^{rs\cdots tu\cdots} = a'_{ij\cdots}{}^{rs\cdots}\,a''_{kl\cdots}{}^{tu\cdots}$.

This formula defines the product of two tensors.

Contraction of tensors. Let $l(x, y, \cdots; f, g, \cdots)$ be a multilinear function of $p$ vectors in $R$ ($p \ge 1$) and $q$ vectors in $\tilde R$ ($q \ge 1$). We use $l$ to define a new multilinear function of $p - 1$ vectors in $R$ and $q - 1$ vectors in $\tilde R$. To this end we choose a basis $e_1, \cdots, e_n$ in $R$ and its dual basis $f^1, \cdots, f^n$ in $\tilde R$ and consider the sum

(7)  $l'(y, \cdots; g, \cdots) = l(e_1, y, \cdots; f^1, g, \cdots) + l(e_2, y, \cdots; f^2, g, \cdots) + \cdots + l(e_n, y, \cdots; f^n, g, \cdots)$.

Since each summand is a multilinear function of $y, \cdots$ and $g, \cdots$, the same is true of the sum $l'$. We now show that whereas each summand depends on the choice of basis, the sum does not. Since the vectors $y, \cdots$ and $g, \cdots$ remain fixed, we need only prove our contention for a bilinear function $A(x; f)$. Specifically, we must show that $A(e_i; f^i) = A(e'_i; f'^i)$. We recall that if $e'_{\alpha} = c_{\alpha}{}^{k}e_k$, then $f'^{\alpha} = b_m{}^{\alpha}f^m$, where $c_{\alpha}{}^{k}b_m{}^{\alpha} = \delta_m{}^{k}$ (the matrices $\|c\|$ and $\|b\|$ being transposed inverses of one another). Hence

$A(e'_{\alpha}; f'^{\alpha}) = A(c_{\alpha}{}^{k}e_k; b_m{}^{\alpha}f^m) = c_{\alpha}{}^{k}b_m{}^{\alpha}A(e_k; f^m) = \delta_m{}^{k}A(e_k; f^m) = A(e_k; f^k)$.

Therefore $l'(y, \cdots; g, \cdots)$ is indeed independent of the choice of basis.

We now express the coefficients of the function (7) in terms of the coefficients of $l$. Since

$a'_{j\cdots}{}^{s\cdots} = l'(e_j, \cdots; f^s, \cdots) = \sum_i l(e_i, e_j, \cdots; f^i, f^s, \cdots)$,

it follows that

(8)  $a'_{j\cdots}{}^{s\cdots} = a_{ij\cdots}{}^{is\cdots}$

(summation over $i$). The tensor $a'_{j\cdots}{}^{s\cdots}$ obtained from $a_{ij\cdots}{}^{rs\cdots}$ as per (8) is called a contraction of the tensor $a_{ij\cdots}{}^{rs\cdots}$.

It is clear that the summation in the process of contraction may involve any covariant index and any contravariant index. However, if one tried to sum over two covariant indices, say, the resulting system of numbers would no longer form a tensor (for upon change of basis this system of numbers would not transform in accordance with the prescribed law of transformation for tensors). We observe that contraction of a tensor of rank two leads to a tensor of rank zero (a scalar), i.e., to a number independent of coordinate systems.

The operation of lowering indices discussed in para. 4 of this section can be viewed as contraction of the product of some tensor by the metric tensor $g_{ik}$ (repeated as a factor an appropriate number of times). Likewise the raising of indices can be viewed as contraction of the product of some tensor by the tensor $g^{ik}$.

Let $a_i{}^{k}$ and $b_k{}^{l}$ be two tensors of rank two. By multiplication and contraction these yield a new tensor of rank two:

$c_i{}^{l} = a_i{}^{\alpha}b_{\alpha}{}^{l}$.

If the tensors $a_i{}^{k}$ and $b_k{}^{l}$ are looked upon as matrices of linear transformations, then the tensor $c_i{}^{l}$ is the matrix of the product of these linear transformations.

Another example. Let $a_{ij}{}^{k}$ be a tensor of rank three and $b_l{}^{m}$ a tensor of rank two. Their product is a tensor of rank five. The result of contracting this tensor over the indices $k$ and $l$, say, would be a tensor of rank three; another contraction, over the indices $i$ and $m$, say, would lead to a tensor of rank one (a vector).

With any tensor $a_i{}^{j}$ of rank two we can associate a sequence of invariants (i.e., numbers independent of choice of basis, simply scalars):

$a_{\alpha}{}^{\alpha}$,  $a_{\alpha}{}^{\beta}a_{\beta}{}^{\alpha}$,  $\cdots$.
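For instance, if $a_i{}^{k}$ is the matrix of a linear transformation, its contraction $a_{\alpha}{}^{\alpha} = a_1{}^{1} + a_2{}^{2} + \cdots + a_n{}^{n}$ is the trace of the matrix; the fact that this contraction is a tensor of rank zero expresses the familiar fact that the trace does not depend on the choice of basis. Likewise $a_{\alpha}{}^{\beta}a_{\beta}{}^{\alpha}$ is the trace of the square of the matrix.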

Thus. ) is the multilinear function corresponding If 1(x. if ei are the coordinates of a contravariant vector and n. then Ein. For example. to the tensor ail . addition. given set of indices i if its components are invariant under an arbitrary permutation of these indices. etc. . by multiplying vectors we can obtain tensors of arbitrarily high rank.182 LECTURES ON LINEAR ALGEBRA The operations on tensors permit us to construct from given tensors new tensors invariantly connected with the given ones. However. g. f.e. is a tensor of rank two. For example. y. i.. Since for a multilinear function to be symmetric with It goes without saying that we have in mind indices in the same (upper or lower) group. y.. We observe that not all tensors can be obtained by multiplying vectors. if then the tensor is said to be symmetric with respect to the first two (lower) indices. contraction over all indices).. . as is clear from (9). multiplication by a number and total contraction (i. ... In connection with the above concept we quote without proof the following result: Any rational integral invariant of a given system of tensors can be obtained from these tensors by means of the operations of tensor multiplica- tion. f. of a covariant vector. if (9) 1(x. )= then. A tensor is said to be symmetric with respect to a 6. it can be shown that every tensor can be obtained from vectors (tensors of rank one) using the operations of addition and multiplication. g. Symmetric and skew symmetric tensors DEFINITION.e. By a rational integral invariant of a given system of tensors we mean a polynomial function of the components of these tensors whose value does not change when one system of components of the tensors in question computed with respect to some basis is replaced by another system computed with respect to some other basis. symmetry of the tensor with respect to some group of indices is equivalent to symmetry of the corresponding multilinear function with respect to an appropriate set of vectors.

DEFINITION. A tensor is said to be skew symmetric if it changes sign every time two of its indices are interchanged. Here it is assumed that we are dealing with a tensor all of whose indices are of the same nature, i.e., either all covariant or all contravariant.

The definition of a skew symmetric tensor implies that an even permutation of its indices leaves its components unchanged and an odd permutation multiplies them by −1.

The multilinear functions associated with skew symmetric tensors are themselves skew symmetric in the sense of the following definition:

DEFINITION. A multilinear function l(x, y, …) of p vectors in R is said to be skew symmetric if interchanging any pair x, y of its vectors changes the sign of the function.

For a multilinear function to be skew symmetric it is sufficient that the components of the associated tensor be skew symmetric relative to some coordinate system. This much is obvious from (9). On the other hand, skew symmetry of a multilinear function implies skew symmetry of the associated tensor (in any coordinate system). In other words, if the components of a tensor are skew symmetric in one coordinate system, then they are skew symmetric in all coordinate systems, i.e., the tensor is skew symmetric.

We now count the number of independent components of a skew symmetric tensor. Thus let a_{ik} be a skew symmetric tensor of rank two. Then a_{ik} = −a_{ki}, so that the number of different components is n(n − 1)/2. Similarly, the number of different components of a skew symmetric tensor a_{ijk} is n(n − 1)(n − 2)/3!, since components with repeated indices have the value zero and components which differ from one another only in the order of their indices can be expressed in terms of each other. More generally, the number of independent components of a skew symmetric tensor with k indices (k ≤ n) is the binomial coefficient \binom{n}{k}. (There are no non-zero skew symmetric tensors with more than n indices. This follows from the fact that a component with two or more repeated indices vanishes and that k > n implies that at least two of the indices of each component coincide.)
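The count of independent components may be checked mechanically. The sketch below (plain Python; the particular values chosen are arbitrary illustrations) reconstructs every component of a skew symmetric tensor of rank k from one number per strictly increasing set of indices, so that exactly \binom{n}{k} numbers determine the tensor.

    import itertools
    import math

    def perm_sign(p):
        # Sign of a permutation given as a tuple of distinct values:
        # +1 for an even number of inversions, -1 for an odd number.
        inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
        return -1 if inv % 2 else 1

    def skew_tensor(n, k, values):
        # Build all components a_{i1...ik} from the values prescribed
        # on strictly increasing index sets.
        assert len(values) == math.comb(n, k)
        base = dict(zip(itertools.combinations(range(n), k), values))
        a = {}
        for idx in itertools.product(range(n), repeat=k):
            if len(set(idx)) < k:
                a[idx] = 0.0                               # repeated indices: component vanishes
            else:
                a[idx] = perm_sign(idx) * base[tuple(sorted(idx))]
        return a

    n, k = 4, 3
    print(math.comb(n, k))                                 # 4 independent components
    print(n * (n - 1) * (n - 2) // math.factorial(3))      # the same count, written as in the text
    a = skew_tensor(n, k, [1.0, 2.0, 3.0, 4.0])
    print(a[(0, 1, 2)], a[(1, 0, 2)])                      # 1.0 and -1.0: an odd permutation flips the sign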

We consider in greater detail skew symmetric tensors with n indices. Since two sets of n different indices differ from one another in order alone, it follows that such a tensor has only one independent component. Consequently, if i_1 i_2 ⋯ i_n is any permutation of the integers 1, 2, …, n and if we put

(10)   a_{12⋯n} = a,

then a_{i_1 i_2 ⋯ i_n} = ±a, depending on whether the permutation i_1 i_2 ⋯ i_n is even (+ sign) or odd (− sign).

EXERCISE. Show that as a result of a coordinate transformation the number a_{12⋯n} = a is multiplied by the determinant of the matrix associated with this coordinate transformation.

In view of formula (10) the multilinear function associated with a skew symmetric tensor with n indices has the form

    l(x, y, …, z) = a \begin{vmatrix} \xi^1 & \xi^2 & \cdots & \xi^n \\ \eta^1 & \eta^2 & \cdots & \eta^n \\ \cdots & \cdots & \cdots & \cdots \\ \zeta^1 & \zeta^2 & \cdots & \zeta^n \end{vmatrix}.

This proves the fact that apart from a multiplicative constant the only skew symmetric multilinear function of n vectors in an n-dimensional vector space is the determinant of the coordinates of these vectors.

The operation of symmetrization. Given a tensor one can always construct another tensor symmetric with respect to a preassigned group of indices. This operation is called symmetrization and consists in the following. Let the given tensor be a_{i_1 i_2 ⋯}, say. To symmetrize it with respect to the first k indices, say, is to construct the tensor

    a_{(i_1 i_2 ⋯ i_k)⋯} = (1/k!) Σ a_{j_1 j_2 ⋯ j_k ⋯},

where the sum is taken over all permutations j_1, j_2, …, j_k of the indices i_1, i_2, …, i_k. For example,

    a_{(i_1 i_2)} = (1/2!)(a_{i_1 i_2} + a_{i_2 i_1}).
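Symmetrization over all the indices of a tensor amounts to averaging the corresponding array over all permutations of its axes. The sketch below (Python with NumPy; it symmetrizes over every index, the simplest case of the formula above, rather than over a preassigned subgroup) carries this out and verifies the rank-two example.

    import itertools
    import math
    import numpy as np

    def symmetrize(t):
        # (1/k!) * sum of t over all permutations of its k indices,
        # realized as a sum over axis transpositions of the array.
        k = t.ndim
        return sum(np.transpose(t, p) for p in itertools.permutations(range(k))) / math.factorial(k)

    rng = np.random.default_rng(2)
    a = rng.standard_normal((3, 3, 3))
    s = symmetrize(a)

    # The result is symmetric: every transposition of indices leaves it unchanged.
    assert np.allclose(s, np.transpose(s, (1, 0, 2)))
    assert np.allclose(s, np.transpose(s, (0, 2, 1)))

    # The rank-two case reduces to the example of the text.
    b = rng.standard_normal((3, 3))
    assert np.allclose(symmetrize(b), (b + b.T) / 2)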

The operation of alternation is analogous to the operation of symmetrization and permits us to construct from a given tensor another tensor skew symmetric with respect to a preassigned group of indices. The operation is defined by the equation

    a_{[i_1 i_2 ⋯ i_k]⋯} = (1/k!) Σ ± a_{j_1 j_2 ⋯ j_k ⋯},

where the sum is taken over all permutations j_1, j_2, …, j_k of the indices i_1, i_2, …, i_k and the sign depends on the even or odd nature of the permutation involved. For instance,

    a_{[i_1 i_2]} = (1/2!)(a_{i_1 i_2} − a_{i_2 i_1}).

The operation of alternation is indicated by the square bracket symbol [ ]; the brackets contain the indices involved in the operation of alternation.

Consider a k-dimensional subspace of an n-dimensional space R. We wish to characterize this subspace by means of a system of numbers, i.e., we wish to coordinatize it. A k-dimensional subspace is generated by k linearly independent vectors ξ_1, ξ_2, …, ξ_k. Given these k vectors we can construct their product

    a^{i_1 i_2 ⋯ i_k} = ξ_1^{i_1} ξ_2^{i_2} ⋯ ξ_k^{i_k}

and then alternate it to get a^{[i_1 i_2 ⋯ i_k]}. It is easy to see that the components of this tensor are all kth order minors of the matrix

    \begin{pmatrix} \xi_1^1 & \xi_1^2 & \cdots & \xi_1^n \\ \xi_2^1 & \xi_2^2 & \cdots & \xi_2^n \\ \cdots & \cdots & \cdots & \cdots \\ \xi_k^1 & \xi_k^2 & \cdots & \xi_k^n \end{pmatrix}.

Different systems of k linearly independent vectors may generate the same subspace. However, it is easy to show (the proof is left to the reader) that if two such systems of vectors generate the same subspace, then the tensors constructed from each of these systems differ by a non-zero multiplicative constant only; in particular, the tensor a^{[i_1 ⋯ i_k]} does not change when we add to one of the vectors ξ_i any linear combination of the remaining vectors. Thus the skew symmetric tensor a^{[i_1 i_2 ⋯ i_k]} constructed on the generators ξ_1, ξ_2, …, ξ_k of the subspace defines this subspace.
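The connection with minors can be made concrete. In the sketch below (Python with NumPy; the 1/k! normalization follows the alternation formula above, so each component is the corresponding kth order minor divided by k!) the product of k vectors is alternated and compared with the minors of their coordinate matrix; one also checks that adding to one of the vectors a linear combination of the others leaves the tensor unchanged.

    import itertools
    import math
    import numpy as np

    def perm_sign(p):
        inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
        return -1 if inv % 2 else 1

    def alternate(t):
        # a^{[i1...ik]} = (1/k!) * signed sum over all permutations of the indices.
        k = t.ndim
        total = sum(perm_sign(p) * np.transpose(t, p) for p in itertools.permutations(range(k)))
        return total / math.factorial(k)

    def product_tensor(X):
        # a^{i1 i2 ... ik} = xi_1^{i1} xi_2^{i2} ... xi_k^{ik} for the rows of X.
        t = X[0]
        for row in X[1:]:
            t = np.multiply.outer(t, row)
        return t

    n, k = 5, 3
    rng = np.random.default_rng(3)
    X = rng.standard_normal((k, n))           # k generators of a k-dimensional subspace
    alt = alternate(product_tensor(X))

    # Each component with increasing indices is (1/k!) times a kth order minor.
    for cols in itertools.combinations(range(n), k):
        minor = np.linalg.det(X[:, cols])
        assert np.isclose(math.factorial(k) * alt[cols], minor)

    # Adding to one generator a linear combination of the others leaves the tensor unchanged.
    Y = X.copy()
    Y[0] += 2.0 * Y[1] - 0.5 * Y[2]
    assert np.allclose(alternate(product_tensor(Y)), alt)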
