COPYRIGHT © 1961 BY INTERSCIENCE PUBLISHERS, INC.

ALL RIGHTS RESERVED
LIBRARY OF CONGRESS CATALOG CARD NUMBER 61-8630
SECOND PRINTING 1963

PRINTED IN THE UNITED STATES OF AMERICA

PREFACE TO THE SECOND EDITION
The second edition differs from the first in two ways. Some of the material was substantially revised and new material was added. The major additions include two appendices at the end of the book dealing with computational methods in linear algebra and the theory of perturbations, a section on extremal properties of eigenvalues, and a section on polynomial matrices (§§ 17 and 21). As for major revisions, the chapter dealing with the Jordan canonical form of a linear transformation was entirely rewritten and Chapter IV was reworked. Minor changes and additions were also made. The new text was written in collaboration with Z. Ja. Shapiro.

I wish to thank A. G. Kurosh for making available his lecture notes on tensor algebra. I am grateful to S. V. Fomin for a number of valuable comments. Finally, my thanks go to M. L. Tzeitlin for assistance in the preparation of the manuscript and for a number of suggestions.
September 1950

I. GEL'FAND

Translator's note: Professor Gel'fand asked that the two appendices be left out of the English translation.


PREFACE TO THE FIRST EDITION

This book is based on a course in linear algebra taught by the author in the department of mechanics and mathematics of the Moscow State University and at the Byelorussian State University. S. V. Fomin participated to a considerable extent in the writing of this book. Without his help this book could not have been written. The material in fine print is not utilized in the main part of the text and may be omitted in a first perfunctory reading.

The author wishes to thank Assistant Professor A. E. Turetski of the Byelorussian State University, who made available to him notes of the lectures given by the author in 1945, and D. A. Raikov, who carefully read the manuscript and made a number of valuable comments.

January 1948

I. GEL'FAND


TABLE OF CONTENTS

Preface to the second edition
Preface to the first edition ... vii

I. n-Dimensional Spaces. Linear and Bilinear Forms
   n-Dimensional vector spaces
   Euclidean space ... 14
   Orthogonal basis. Isomorphism of Euclidean spaces ... 21
   Bilinear and quadratic forms ... 34
   Reduction of a quadratic form to a sum of squares ... 42
   Reduction of a quadratic form by means of a triangular transformation ... 46
   The law of inertia ... 55
   Complex n-dimensional space ... 60

II. Linear Transformations ... 70
   Linear transformations. Operations on linear transformations ... 70
   Invariant subspaces. Eigenvalues and eigenvectors of a linear transformation ... 81
   The adjoint of a linear transformation ... 90
   Self-adjoint (Hermitian) transformations. Simultaneous reduction of a pair of quadratic forms to a sum of squares ... 97
   Unitary transformations ... 103
   Commutative linear transformations. Normal transformations ... 107
   Decomposition of a linear transformation into a product of a unitary and self-adjoint transformation
   Linear transformations on a real Euclidean space ... 114
   Extremal properties of eigenvalues ... 126

III. The Canonical Form of an Arbitrary Linear Transformation ... 132
   The canonical form of a linear transformation ... 132
   Reduction to canonical form ... 137
   Elementary divisors ... 142
   Polynomial matrices ... 149

IV. Introduction to Tensors ... 164
   The dual space ... 164
   Tensors ... 171


CHAPTER I

n-Dimensional Spaces. Linear and Bilinear Forms

§ 1. n-Dimensional vector spaces

1. Definition of a vector space. We frequently come across objects which are added and multiplied by numbers. Thus:

1. In geometry objects of this nature are vectors in three-dimensional space, i.e., directed segments. Two directed segments are said to define the same vector if and only if it is possible to translate one of them into the other. It is therefore convenient to measure off all such directed segments beginning with one common point which we shall call the origin. As is well known, the sum of two vectors x and y is, by definition, the diagonal of the parallelogram with sides x and y. The definition of multiplication by (real) numbers is equally well known.

2. In algebra we come across systems of n numbers x = (ξ₁, ξ₂, ..., ξₙ) (e.g., rows of a matrix, the set of coefficients of a linear form, etc.). Addition and multiplication of n-tuples by numbers are usually defined as follows: by the sum of the n-tuples x = (ξ₁, ξ₂, ..., ξₙ) and y = (η₁, η₂, ..., ηₙ) we mean the n-tuple x + y = (ξ₁ + η₁, ξ₂ + η₂, ..., ξₙ + ηₙ); by the product of the number λ and the n-tuple x = (ξ₁, ξ₂, ..., ξₙ) we mean the n-tuple λx = (λξ₁, λξ₂, ..., λξₙ).

3. In analysis we define the operations of addition of functions and multiplication of functions by numbers. In the sequel we shall consider all continuous functions defined on some interval [a, b].

In the examples just given the operations of addition and multiplication by numbers are applied to entirely dissimilar objects. To investigate all examples of this nature from a unified point of view we introduce the concept of a vector space.

DEFINITION 1. A set R of elements x, y, z, ... is said to be a vector space over a field F if:

I. With every two elements x and y in R there is associated an element z in R which is called the sum of the elements x and y. The sum of the elements x and y is denoted by x + y.

II. With every element x in R and every number λ belonging to a field F there is associated an element λx in R; λx is referred to as the product of x by λ.

The above operations must satisfy the following requirements (axioms):

I. 1. x + y = y + x (commutativity).
   2. (x + y) + z = x + (y + z) (associativity).
   3. R contains an element 0 such that x + 0 = x for all x in R; 0 is referred to as the zero element.
   4. For every x in R there exists (in R) an element denoted by −x with the property x + (−x) = 0.

II. 1. 1 · x = x.
    2. α(βx) = (αβ)x.

III. 1. (α + β)x = αx + βx.
     2. α(x + y) = αx + αy.

It is not an oversight on our part that we have not specified how elements of R are to be added and multiplied by numbers. Any definitions of these operations are acceptable as long as the axioms listed above are satisfied. Whenever this is the case we are dealing with an instance of a vector space.

We leave it to the reader to verify that the examples 1, 2, 3 above are indeed examples of vector spaces. Let us give a few more examples of vector spaces.

4. The set of all polynomials of degree not exceeding some natural number n constitutes a vector space if addition of polynomials and multiplication of polynomials by numbers are defined in the usual manner. We observe that under the usual operations of addition and multiplication by numbers the set of polynomials of degree n does not form a vector space, since the sum of two polynomials of degree n may turn out to be a polynomial of degree smaller than n. Thus (tⁿ + t) + (−tⁿ + t) = 2t.

5. We take as the elements of R matrices of order n. As the sum

11. 1. i. . let x. say. the contents of this section apply to vector spaces over arbitrary fields. y =. . . v is said to be linearly independent if (1) yz + the equality implies that .. then the space is referred to as a real vector space.p Let the vectors x. space are real. the numbers 2. Let R be a vector space. are elements of an arbitrary field K. + Ov = O. z. More generally it may be assumed that A. The fact that this term was used in Example I should not confuse the reader. IT be connected by a relation of the form (1) with at least one of the coefficients. . involved in the definition of a vector If the numbers X. 1/Ve now define the notions of linear dependence and independence of vectors which are of fundamental importance in all that follows. The geometric considerations associated with this word will help us clarify and even predict a number of results. z. then the space is referred to as a complex vector space.n-DIMENSIONAL SPACES 3 we take the matrix Hai. y. + b1. in chapter I we shall ordinarily assume that R is a real vector space. a. y. IT are linearly dependent if there exist numbers 9. In other words. fty Dividing by a and putting yz Ov. y. As the product of the number X and the matrix 11 aikl we take the of the matrices I laikl I and matrix 112a1tt It is easy to see that the above set R is now a vector space. it. W e shall say that the vectors x. Then + Ov = 0 GO( + /3y + yz + 0 = O. y. However. z. z. /3. a set of vectors x. It is natural to call the elements of a vector space vectors. y. Vectors which are not linearly dependent are said to be linearly independent. in particular. . Many concepts and theorems dealt with in the sequel and. 2. unequal to zero. y. The dimensionality of a vector space. not all equal to zero such that cc.4. If are taken from the field of complex numbers. IT be linearly dependent. DEFINITION 2. Then R is called a vector space over the field K.e.

and space. linearly depend- ent. y. EXERCISES. v. y. plane. Infinite-dimensional spaces will not be studied in this book. that if one of a set of vectors is a linear combination of the remaining vectors then the vectors of the set are linearly dependent. i. and in three-dimensional space coincides with what is called in geometry the dimensionality of the line. u. y. Whenever a vector x is expressible through vectors y. 3. y. then R is said to be infinitedimensional. y. . y in the form (2) we say that x is a linear combination of the vectors y. in the plane. If R is the set of vectors in three-dimensional space.. z. 1 vectors If R is a vector space which contains an arbitrarily large number of linearly independent vectors.. In the plane we can find two linearly independent vectors but any three vectors are linearly dependent. z. 2. Show that if the vectors x. i. 1. y are linearly dependent then at (2) x pZ + least one of them is a linear combination of the others. z. We leave it to the reader to prove that the converse is also true. y is the zero vector then these vectors are linearly dependent. if the vectors x. We now introduce the concept of dimension of a vector space. Show that if one of the vectors x. respectively. . 2. 5. Ve shall now compute the dimensionality of each of the vector spaces considered in the Examples 1.y '. Thus. z.4 LECTURES ON LINEAR ALGEBRA (/57X) = A. A vector space R is said to be n-dimensional if it contains n linearly independent vectors and if any n in R are linearly dependent. then it is possible to find three linearly independent vectors but any four vectors are linearly dependent. (01x) = we have + Of. are linearly dependent and u.e. z. dependent. are arbitrary vectors then the vectors x. As we see the maximal number of linearly independent vectors on a straight line. y. 4. . DEF/NITION 3. = tu. z. It is therefore natural to make the following general definition.e. /1. . are linearly . Any two vectors on a line are proportional.

n2n). are linearly dependent. . Thn 172n n tnn cannot exceed n (the number of columns). n12. 1) are easily seen to be linearly independent. R is infinite-dimensional. 0. 0). Yi(nii. In n Let R be the space of polynomials of degree . Consequently R is three-dimensional. t"-1 are linearly independent. But this implies the linear dependence of the vectors y1. . f2(t) = t. . natural number. n22. Hence R is n-dimensional. any m vectors in R. Then the functions: f1(t) independent vectors (the proof fN(t) = tN-1 form a set of linearly of this statement is left to the reader). Yin = (nml. n12. 1. our ni rows are linearly dependent. It follows that our space contains an arbitrarily large number of linearly independent functions or. Let R denote the space whose elements are n-tuples of real numbers. = (0. This space contains n linearly independent vectors For instance. tre > n. Y2 = (V21. y2. 1. nm2. On the other hand. 0). ni > n. x. Ynt Thus the dimension of R is n. Indeed. 0. Let N be any 1. let x= (0. briefly. 717n2. n21. Let R be the space of continuous functions. n22y 17ml. . . It can be shown that any ni elements of R.. The number of linearly independent rows in the matrix [ nu. nmn) be ni vectors and let ni > n. are linearly dependent. . t. Since m > n. . the space R of Example 1 contains three linearly independent vectors and any four vectors in it are linearly dependent. the vectors xi -= (1.n-DIMENSIONAL SPACES 5 As we have already indicated. this space the n polynomials 1.
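The counting argument used here — that any m > n vectors in the space of n-tuples are linearly dependent, because the number of linearly independent rows of the matrix formed from them cannot exceed n — is easy to check numerically. The following sketch is only an illustration (it is not part of Gel'fand's text, assumes NumPy, and uses made-up sample vectors): the number of linearly independent vectors in a set equals the rank of the matrix whose rows are those vectors.

import numpy as np

def num_independent(vectors):
    # Rank of the matrix whose rows are the given vectors =
    # maximal number of linearly independent vectors among them.
    return np.linalg.matrix_rank(np.array(vectors, dtype=float))

# Four vectors in three-dimensional space are necessarily dependent:
print(num_independent([(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 2, 3)]))   # prints 3

# The n unit vectors (1,0,...,0), ..., (0,...,0,1) are independent:
print(num_independent(np.eye(4)))                                      # prints 4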

in the case of the space considered in Example 1 any-three vectors which are not coplanar form a basis. i.e. Let x be an arbitrary vector in R. Every vector x belonging to an n-dimensional vector space R can be uniquely represented as a linear combination of basis vectors. Obviously «0 O. ex not all zero such that (3) ao X + 1e1 + cte = O. r2e2 + Subtracting one equation from the other we obtain O = (el $'1)e1 (2 e2)e2 + . e2. «0 This proves that every x e R is indeed a linear combination of the vectors el. .6 LECTURES ON LINEAR ALGEBRA 5. To prove uniqueness of the representation of x in terms of the basis vectors we assume that x = $1e1 and E2e2 + +e + ete. THEOREM 1.. en. i. e2. . The set x. e/n)e. for instance. it contains a basis. e be a basis in R. Any set of n linearly independent vectors e1. It Proof: Let e1. follows from the definition of an n-dimensional vector space that these vectors are linearly dependent.e. X = E'. Basis and coordinates in n-dimensional space DEFINITION 4. e2. Thus.. By definition of the term "n-dimensional vector space" such a space contains n linearly independent vectors. e1. e2. Using (3) we have x= cco 2 e2 22o cc °c m en. e2. We leave it to the reader to prove that the space of n x n matrices [a2kH is n2-dimensional. en of an n-dimensional vector space R is called a basis of R. that there exist n I numbers « al. + (E. e. 3.e. . e contains n + 1 vectors. . Otherwise (3) would imply the linear dependence of the vectors e1.

= en = e'n = = $'7t ea = E12.. e2. n2. = eta This proves uniqueness of the representation. E2.e. . e2 772. it follows that e'2 = Et'= e2 i. .. If the coordinates of x relative to the basis e1. . if are ni.e. e2. + + Ene + + n)e. Let R be the space of n-tuples of numbers. Let us choose as basis the vectors . e2.. 2. e form a basis in an n-dimensional space and + x= (4) + e2e2 + then the numbers $1. Thus the coordinates of the sum of two vectors are the sums of the appropriate coordinates of the summands. . . 2En product of a vector by a scalar are the products of the coordinates of that vector by the scalar in question. e of a vector space R every vector X E R has a unique set of coordinates. En + Similarly the vector /lx has as coordinates the numbers 141.tt-DIMENSIONAL SPACES 7 Since e1. In the case of three-dimensional space our definition of the coordinates of a vector coincides with the definition of the coordinates of a vector in a (not necessarily Cartesian) coordinate system. i. + x + Y = (El + ni)ei + (e2 712)e2 + i. en are linearly independent. e2. . en are E2. E are called the coordinates of the vector el. e2. if el. EXAMPLES. and the coordinates of the . the coordinates of x + y are E. En and the coordinates of y relative to the same basis a. ni. . 2E2. x relative to the basis Theorem 1 states that given a basis el. x Y= then + 2e2 + + /72.e. 1. DEFINITION 5. It is clear that the zero vector is the only vector all of whose coordinates are zero. . e.

172 = El. . . 172 + + n. 272(0. By definition x i. e) relative to the basis el. E which define the vector is particularly simple. . $) the numbers . 0). I. 1) + n2. nn " $n $fl-1Let us now consider a basis for R in which the connection be- ni " Si. 52. $2. 1. . . en) and the E2. 1.S LECTURES ON LINEAR ALGEBRA et = (1. o) + = + $(o. Ei(1. e. . = (0. 1. Thus. E.. . 1) = The numbers (ni. o. e. 1). . n and then compute the coordinates x = (E1. 1).72. 0. + E2 + Consequently. 1) . n) must satisfy the relations 0. 171 . It follows that in the space R of n-tuples ($1. (E1. tween the coordinates of a vector x = . 0). . . e. 1. Ti (0. 1).$1.e. = E. en) = n2e2 ' + nen. = (0. 0. = (0. e. let = (1. $2.. n/(1. ni + + + n).. n of the vector . E2. ' n2 " S2 -. numbers et. Then 0. o. . 1).. 0. 1) +Ee. En) o) 2(o. en = (0. 1. e2e2+ . 1. 1. x = (Ei.
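Finding the coordinates η₁, ..., ηₙ of a vector relative to a basis e₁, ..., eₙ of the space of n-tuples amounts to solving the linear system whose coefficient matrix has the basis vectors as its columns. A small numerical sketch of the second example above, with the basis (1, 1, ..., 1), (0, 1, ..., 1), ..., (0, ..., 0, 1) (the sample vector is made up; NumPy assumed; not part of the original text):

import numpy as np

n = 4
# Columns of E are the basis vectors e_1 = (1,...,1), e_2 = (0,1,...,1), ..., e_n = (0,...,0,1).
E = np.array([[1.0 if i >= k else 0.0 for k in range(n)] for i in range(n)])

x = np.array([3.0, 5.0, 6.0, 2.0])        # the n-tuple (xi_1, ..., xi_n)
eta = np.linalg.solve(E, x)               # coordinates relative to e_1, ..., e_n
print(eta)                                # [ 3.  2.  1. -4.]

The printed coordinates agree with the formulas derived above: η₁ = ξ₁ and ηₖ = ξₖ − ξₖ₋₁ for k > 1.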

= (1.. e' = 1. 0. a22. Expanding P (t) in powers of (t a) we find that + [P(nl) (a)I (n-1)!](t a)n_'. /2' combinations of the numbers E1. 1. coordinates of the polynomial P(t) = a0r-1 a1t"-2 + in this basis are the coefficients a_. e2. e'2 = t a.. ($1. e. Indeed. may be viewed as the coordinates of the vector e) relative to the basis 1). en = (0. i. 0). . e2 = (0. . a. . we can associate with a vector in R a vector in R'.. Let us now select another basis for R: . e2 (a. a. en (a. . It is easy to see that the the vectors el = 1. e = t"--1-. P' (a). In the examples considered above some of the spaces are identical with others when it comes to the properties we have investigated so far. When a vector is multiplied by a scalar all of its coordinates are multiplied by that scalar. This implies a parallelism between the geometric properties of R and appropriate properties of R'. [PR-1)(a)/ (n P(a).n-DIMENSIONAL SPACES 9 ei. . 0). One instance of this type is supplied by the ordinary three-dimensional space R considered in Example / and the space R' whose elements are triples of real numbers. P (t) = P (a) + P' (a)(t a) + Thus the coordinates of P(t) in this basis are 1)1 . 1. X =(. Isomorphism of n-dimensional vector spaces. a". When vectors are added their coordinates are added. = t.e. = (all. EXERCISE. . ao. Show that in an arbitrary basis e. . 0. . e' = (t a)"--1. once a basis has been selected in R we can associate with a vector in R its coordinates relative to that basis. e.) .).. an_2. E2. e's = (t a)2. 6122. en) are linear 17 of a vector x the coordinates n. We shall now formulate precisely the notion of "sameness" or of "isomorphism" of vector spaces. ' . Let R be the vector space of polynomials of degree n A very simple basis in this space is the basis whose elements are .

e' be a basis in R'. Indeed.. There arises the question as to which vector spaces are isomorphic and which are not. the vector which this correspondence associates with Ax is Ax'. If x. This correspondence is one-to-one. . It follows that two spaces of different dimensions cannot be isomorphic. y'. then the vector which this correspondence associates with x + y is X' + y'. y. By the same token every x' e R' determines one and ordy one vector x e R. . But then x' is likewise uniquely determined by x. let us assume that R and R' are isomorphic. . Indeed. THEOREM 2. We shall associate with the vector (5) x= e2e2 + + ee the vector + E2e'2 + x' i. Two vector spaces R and RI. Two vector spaces of different dimensions are certainly not isomorphic. every vector x e R has a unique representation of the form (5). en be a basis in R and let e'2.e. with the same coefficients as in (5). Proof: Let R and R' be two n-dimensional vector spaces. are vectors in R and x'. Let e2. All vector spaces of dimension n are isomorphic. This means that the E. This is the same as saying that the dimensions of R and R' are the same.10 LECTURES ON LINEAR ALGEBRA DEFINITION 6. Therefore the maximal number of linearly independent vectors in R is the same as the maximal number of linearly independent vectors in R'. are said to be isomorphic if it is possible to establish a one-to-one correspondence X 4-4 x' between the elements x e R and x' e R' such that if x 4> x' and y y'. are their counterparts in R' then in view of conditions I and 2 of the definition of isomorphism the equation Ax Ax' py' + = 0 is equivalent to the equation = O. are uniquely determined by the vector x. Hence the counterparts in R' of linearly yy + independent vectors in R are also linearly independent and conversely. a linear combination of the vectors e'.

n-DIMENSIONAL SPACES 11 It should now be obvious that if x 4* x' and y e> y'. We now give a few examples of non-trivial subspaces. Let R be the ordinary three-dimensional space. EXAMPLES. EXERCISE. 1. A subset R'. Since a subspace of a vector space is a vector space in its own right we can speak of a basis of a subspace as well as of its dimensionality. a2E2 + a. y. all vectors x (E1. E2. in R is called a y e R'. form a x = (E1. of a vector space R is called a subspace of R if it forms a vector space under the operations of addition and scalar multiplication introduced in R. an are arbitrary but fixed numbers. It is clear that the dimension of an arbitrary subspace of a vector space does not exceed the dimension of that vector space. . then R' coincides with R. x e R'. then x + y 4> x' + y' and 2x 4> Ax'. The totality R' of vectors in that plane form a subspace of R. subspace of R if x e R'. In other words. The zero or null element of R forms a subspace The whole space R forms a subspace of R. In § 3 we shall have another opportunity to explore the concept of isomorphism. a2. Subspaces of a vector space DEFINITION 7.E1 where al. In the vector space of n-tuples of numbers all vectors . En) for which ei = 0 form a subspace. Consider any plane in R going through the origin. . It is clear that every subspace R' of a vector space R must con- tain the zero element of R. subspace. The null space and the whole space are usually referred to as improper subspaces. . 5. E) such that + anen = 0. a set R' of vectors x. E2. n form a subspace of The totality of polynomials of degree the vector space of all continuous functions. This completes the proof of the isomorphism of the spaces R and R'. y e R' implies x of R. More generally. Show that if the dimension of a subspace R' of a vector space R is the same as the dimension of R.

subspaces of dimension / 1. g. R' contains k linearly independent vectors (i. e2. P Elk $21. g. g.$ 22. x2. the vectors e1. $22. e2. .. OY are a (finite infinite) set of vectors belonging to R. then the set R' of all (finite) linear combinations of the vectors e. g. If X2 x1= $11e1 + $ 12. f. The subspace R' generated by the linearly independent vectors e1. . x. i. Consider the set of vectors of the form x xo °Lei. This subspace is the smallest subspace of R containing the vectors e. $2k E lk must be linearly dependent.e. It is natural to call this set of vectors by analogy with threedimensional space a line in the vector space R. f. .. 4. where a is an arbitrary scalar. f. the dimension of R'. EXERCISE. let x1. e7. If we ignore null spaces. Thus the maximal number of linearly independent vectors in R'. subspace R' of R. e2. n. x2.e. eh form a basis of R' Indeed. The subspace R' is referred to as the subspace . . e2. is hand the vectors e.. e. Show that every n-dimensional vector space contains /. be 1 vectors in R' and let 1 > k. xi then the I rows in the matrix $12.12 LECTURES ON LINEAR ALGEBRA A general method for constructing subspaces of a vector space R is implied by the observation that if e. $22. On the other hand. f. then the simplest vector spaces are one- dimensional vector spaces. Thus a one-dimensional v all vectors me1. page 5) the linear dependence of the vec ors x. Example 2. + enel. + E12e2 + ' + elkek + E2e. 2. . But this implies (cf. Etl. . form a basis in R'. is k-dimensional and the vectors e1. where xo and e. A basis of such a space is a single vector el O. forms a generated by the vectors e. + etkek. eh). 0 are fixed vectors and ranges over all scalars. . xi.

e'o. Then x = Ele x e1e1 $2e2 + $2e2 + $e = E'le'. en and e'1. with the appropriate expressions from (6) we get + ee e'l(ae ae2 + ane2 E'(a1. + Replacing the e'. + 6/1E1 (a1. let the connection between them be given 6. all vectors of the form /el ße2. a. a211e2 + The determinant of the matrix d in (6) is different from zero e' would be linearly depend(otherwise the vectors e'1. Transformation of coordinates under change of basis. ' of real numbers the set of vectors satisfying the relation + ane. Let e2. among the vectors e. ae) . EXERCISES. Show that the dimension of the subspace generated by the vectors is equal to the maximal number of linearly independent vectors e.en. e'2 = tine' + ae. f.. g. is called a (two-dimensional) plane. Further. X= where xo is a fixed vector. and 112 of a vector space R have only the null vector in common then the sum of their dimensions does not exceed the dimension of R. e' be two bases of an n-dimensional vector space. of dimension n Show that if two subspaces R. where el and e2 are fixed linearly independent vectors and a and fi are arbitrary numbers form a two-dimensional vector space. The set of vectors of the form xe ße2. . e'2. g. = ame. . a. are fixed numbers not all of which are zero) form a subspace 1. 1.e1±a21e2+ ±a2e) + aen). ent). f. by the equations e'.. ---a2E. = ae (6) a21e2 + a22 e2 + + ae.n-DIMENSIONAL SPACES 13 Similarly. + an. Show that in the vector space of n-tuples (ei. Let ei be the coordinates of a vector x in the first basis and its coordinates in the second basis. ep. .

V. To rephrase our result we solve the system (7) for ¿'i. aniVi + a. We define this concept axiomatically. many concepts of so-called Euclidean geometry cannot be forniulated in terms of addition and multiplication by scalars. + + anE'n. the coefficients of the e. Thus. Then buE1 + 1)12E2 + . are linearly independent. + binen. We take as our fundamental concept the concept of an inner product of vectors. plane. angles between vectors. bnnen e'n = b11 b2$2 + where the b are the elements of the inverse of the matrix st. Thus the coordinates of the vector x in the first basis are express- ed through its coordinates in the second basis by means of the matrix st which is the transpose of . . Instances of such concepts are: length of a vector. Definition of Euclidean space.21. the inner product of vectors.14 LECTURES ON LINEAR ALGEBRA Since the e. ei2 = bn + b22$2 + + b2Jn. on both sides of the above equation must be the same. dimension. § 2. Hence auri + an (7) + + rn E2 = an VI en a22 E'2 + + a2netn. Using the inner product operation in addition to the operations of addi- . The simplest way of introducing these concepts is the following. parallelism of lines. Euclidean space 1. the coordinates of a vector are transformed by means of a matrix ri which is the inverse of the transpose of the matrix at in (6) which determines the change of basis. However. etc. In the preceding section a vector space was defined as a collection of elements (vectors) for which there are defined the operations of addition and multiplication by scalars. By means of these operations it is possible to define in a vector space the concepts of line.
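The transformation rule just derived — if the new basis vectors are e′ᵢ = Σₖ aᵢₖeₖ, then the old coordinates satisfy ξ = Aᵀξ′, so the new coordinates are obtained from the old by the inverse of the transpose of the change-of-basis matrix — can be verified numerically. A sketch with made-up numbers (NumPy assumed; not part of the original text):

import numpy as np

E = np.eye(3)                              # rows of E: the old basis (taken standard here)
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])            # row i gives e'_i in terms of the old basis
E_new = A @ E                              # rows of E_new: the new basis vectors

xi = np.array([2.0, -1.0, 4.0])            # coordinates of x in the old basis
xi_new = np.linalg.solve(A.T, xi)          # xi = A^T xi'  =>  xi' = (A^T)^(-1) xi

# The same vector, assembled from either basis:
assert np.allclose(xi_new @ E_new, xi @ E)
print(xi_new)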

.n-DIMENSIONAL SPACES 15 tion and multiplication by scalars we shall find it possible to develop all of Euclidean geometry. y) =] (x1. Let us define the inner product of two vectors in this space as the product of their lengths by the cosine of the angle between them. Let us put . y) = 2(x. Y) = eint + 5m + + it is again easy to check that properties 1 through 4 are satisfied by (x. Thus let taiki I be a real n x n matrix. § 1). En) and Y = (n2. Aen) with which we are already familiar from Example 2. of vectors studied in elementary solid geometry (cf. x). y) such that (x. y) as defined. 1. then we say that an inner product is defined in R. n2. y) = (y. (25. th. y]. 23 . Let x = (et. Example /.) be in R. (x. A vector space in which an inner product satisfying conditions 1 through 4 has been defined is referred to as a Euclidean space. (A real) (2x. Consider the space R of n-tuples of real numbers. § 1. ez + n2. + x2. . We leave it to the reader to verify the fact that the operation just defined satisfies conditions 1 through 4 above. en + n) and multiplication by scalars . we define the inner product of x and y as (x. (x. y] + (x2. Let us consider the (three-dimensional) space R x + Y = (ei Ax nt.. y in a real vector space R there is associated a real number (x. If with every pair of vectors x. AE. In addition to the definitions of addition EXAMPLES. Without changing the definitions of addition and multiplication by scalars in Example 2 above we shall define the inner product of two vectors in the space of Example 2 in a different and more general manner. DEFINITION 1. x) = 0 tt and only if x 0. x) 0 and (x. y).

that IctO be symmetric. Thus for Axiom 4 to hold the quadratic form (3) must be positive definite. EXERCISE. then the inner product (x.16 LECTURES ON LINEAR ALGEBRA (x. . for (x. = E.e. e2. that is. i. . If we take as the matrix fIctO the unit matrix.. Y) = a11C1n1 + a12C1n2 + (1) + am el /7 + a 271 egbi a21e2171 + a22E2n2 + + an]. Show that the matrix (0 1 1) 0 cannot be used to define an inner product (the corresponding quadratic form is not positive definite). k=1 i. In summary. i. y) = I eini and the result is the Euclidean space of Example 2. be non-negative fore very choice of the n numbers el. For Axiom 1 to hold. as it is frequently called.e. = O (i k). y) to be symmetric relative to x and y.. x) ctikeie. for (1) to define an inner product the matrix 11(211 must be symmetric and the quadratic form associated with Ila11 must be positive definite. it is necessary and sufficient that a= a. ennl an2 En n2 + ann ennn We can verify directly the fact that this definition satisfies Axioms 2 and 3 for an inner product regardless of the nature of the real matrix raj/cll. if we put a = 1 and a. = = E.. The homogeneous polynomial or. = O. Axiom 4 requires that the expression (x. y) deEned by (1) takes the form (x. quadratic form in (3) is said to be positive definite if it takes on non-negative values only and if it vanishes only when all the Ei are zero. en and that it vanish only if E. and that the matrix 1\ (1 1 21 can be used to define an inner product satisfying the axioms I through 4.
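Whether a given matrix ‖aᵢₖ‖ can be used to define an inner product by formula (1) reduces, as stated above, to two checks: the matrix must be symmetric, and the associated quadratic form must be positive definite (equivalently, all eigenvalues of the symmetric matrix must be positive). A sketch applying this test to the two matrices of the exercise above (NumPy assumed; not part of the original text):

import numpy as np

def defines_inner_product(a, tol=1e-12):
    # Symmetric and positive definite <=> usable as the matrix of an inner product.
    a = np.asarray(a, dtype=float)
    if not np.allclose(a, a.T):
        return False
    return bool(np.all(np.linalg.eigvalsh(a) > tol))

print(defines_inner_product([[0, 1], [1, 0]]))   # False: eigenvalues are +1 and -1
print(defines_inner_product([[1, 1], [1, 2]]))   # True: the quadratic form is positive definite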

We define the inner product of two polynomials as in Example 4 L 2. y) 13C1 1371 . It is easy to check that the Axioms 1 through 4 are satisfied. It is quite natural to require that the definitions of length of a vector. This dictates the following definition of the concept of angle between two vectors. Let the elements of a vector space be all the continuous functions on an interval [a. By the angle between two vectors x and y we mean the number arc cos (x. We shall now make use of the concept of an inner product to define the length of a vector and the angle between two vectors. Angle between two vectors. b]. In other words.n-DIMENSIONAL SPACES 17 In the sequel (§ 6) we shall give simple criteria for a quadratic form to be positive definite. We define the inner product of two such functions as the integral of their product (f. of the angle between two vectors and of the inner product of two vectors imply the usual relation which connects these quantities.e. By the length of a vector x in Euclidean space we mean the number (x. we put (5) cos 9) (x. n 1. x). DEFINITION 3. Let R be the space of polynomials of degree (P. it is natural to require that the inner product of two vectors be equal to the product of the lengths of these vectors times the cosine of the angle between them. (4) We shall denote the length of a vector x by the symbol N. DEFINITION 2. g) = fa f(t)g(t) dt. Q) = P (t)Q(t) dt. Length of a vector.. lx1 y) 1311 i.
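Definitions 2 and 3 make length and angle computable from the inner product alone: |x| = √(x, x) and φ = arc cos (x, y)/(|x| |y|). A small numerical sketch in the space of n-tuples (made-up vectors, NumPy assumed; not part of the original text):

import numpy as np

def length(x):
    return np.sqrt(np.dot(x, x))                    # |x| = sqrt((x, x))

def angle(x, y):
    return np.arccos(np.dot(x, y) / (length(x) * length(y)))

x = np.array([1.0, 0.0, 1.0])
y = np.array([0.0, 1.0, 1.0])
print(length(x))                 # sqrt(2) = 1.4142...
print(np.degrees(angle(x, y)))   # 60 degrees, since (x, y) = 1 and |x| |y| = 2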

Since x and y are supposed orthogonal. In para. However. Proof: By definition of length of a vector (x Y. x) = O.. we defined the angle between two vectors x and y by means of the relation cos 99 (x. If x and y are orthogonal vectors. This theorem can be easily generalized to read: if x. y. are pairwise orthogonal. that the square of the length of the diagonal of a rectangle is equal to the sum of the squares of the lengths of its two non-parallel sides (the theorem of Pythagoras). 2. Y) 4. The angle between two non-zero orthogonal vectors is clearly . y) = O. Thus 1x + y12 = (x. this course would have resulted in a more complicated system of axioms than that associated with the notion of an inner product. . then it is natural to regard x + y as the diagonal of a rectangle with sides x and y. x + y) = (x. then - Ex + y + z + 12 = ixr2 )7I2 1z12 + ' w 3. x) (x. In view of the distributivity property of inner products (Axiom 3). (x. x) (Y.e.18 LECTURES ON LINEAR ALGEBRA The vectors x and y are said to be orthogonal if (x.(Y. z. y).7i/ 2. which is what we set out to prove. y) = (y. The following is an example of such extension. y) 1x1 1Y1 If is to be always computable from this relation we must show I We could have axiomatized the notions of length of a vector and angle that between two vectors rather than the notion of inner product. Y) 13112. x Y). We shall show that Y12 = 1x12 1Y121 i. The concepts just introduced permit us to extend a number of theorems of elementary geometry to Euclidean spaces. The Schwarz inequality. x) (Y. lx + 3/12 (X + y.

y) + (x. 2 To prove the Schwarz inequality we consider the vector x ty where t is any real number. This inequality implies that the polynomial cannot have two distinct real roots. Inequality (6) is known as the Schwarz inequality. Thus. y)/1x1 1y1 is the cosine of a previously determined angle between the vectors. y)2 which is what we wished to prove. y). in turn. (x. y) is the linear dependence of the vectors x and y. for any t. y) 2t(x. 1. y). ty. before we can correctly define the angle between two vectors by means of the relation (5) we must prove the Schwarz inequality. 12(y . (cf.e. We have proved the validity of (6) for an axiomatically defined Euclidean space. cannot be positive. x) = (x. is the same as (6) (x. equivalently. that (x. x)(y. y) _CO..e.) 2 Note. It is now appropriate to interpret this inequality in the various concrete Euclidean spaces in para. Y) < 1 IXI IYI or. the remark preceding the proof of the Schwarz inequality. y) 2t(x. x ty) 0. . 1. Consequently. (x i. x) O.n-DIMENSIONAL SPACES 19 1 < (X. EXERCISE. in vector analysis the inner product of two vectors is defined in such a way that the quantity (x. y)1/1x1 IYI 1. Prove that a necessary and sufficient condition for (x. In view of Axiom 4 for inner products. 1(x.. i. EXAMPLES. of this section there is no need to prove this inequality. Consequently. 1. the discriminant of the equation t2(y. y)2 (x. Example /. y) + (x. x)(y. however that in para. y)2 Ix12 13112 <I which. (x. Namely. inequality (6) tells us nothing new. In the case of Example 1. x)(y.

(Hint: Assign suitable values to the numbers Ei. i=1 In Example 3 the inner product was defined as (1) Y) i.) fb In Example 4 the inner product was defined by means of the integral 1(1)g (t) dt. 71 in the inequality just derived. Hence (6) takes the form 0. We now give an example of an inequality which is a consequence of the Schwarz inequality. X) 2=1 E Et2. af(t)g(t) dt))2 fba [f(t)J' dt [g(t)p dt. If x and y are two vectors in a Euclidean space R then (7) Ix + Yi [xi + 1Yr- . (y. 2=1 EXERCISE. y) = E t=1 It follows that (X. En anakk. En. 2--1 and inequality (6) becomes i=1 ei MY 5- n ( )( n Ei2 i=1 tif2).20 2 LECTURES ON LINEAR ALGEBRA In Example 2 the inner product was defined as (x. This inequality plays an important role in many problems of analysis. and /72. . k=1 for any choice of the ¿i. then the following inequality holds: 2 ( E ao.$((o) k=1 ( n n E aikeik)( E 6111115112) k=1 i. where and (3) E aikeiek >. Hence (6) implies that il the numbers an satisfy conditions (2) and (3). y) = E i)(8. Show that if the numbers an satisfy conditions (2) and (3).n=1 aikeink. .

x -1. Orthogonal basis. in addition. the other space. x1 i. each has unit length. the vectors e. In § 1 we introduced the notion of a basis (coordinate system) of a vector space... which is the desired conclusion. § 3.n-DIMENSIONAL SPACES 21 Proof: y12 = (x + y. then there exists an isomorphic mapping takes the first of these bases into the second. e2. y). it follows that 213E1 Since 2(x. Orthogonal basis. Interpret inequality (7) in each of the concrete Euclidean spaces considered in the beginning of this section. and an orthonormal basis i f ... Here there is every reason to prefer so-called orthogonal bases to all other bases.e 3 Careful reading of the proof of the isomorphism of vector spaces given in § 1 will show that in addition to proving the theorem we also showed that it is possible to construct an isomorphism of two n-dimensional vector spaces which takes a specified basis in one of these spaces into a specified basis in are two e and e'5. Orthogonal bases play the same role in Euclidean spaces which rectangular coord nate systems play in analytic geometry.e. In particular. Y) = fix1+IYI)2.y) = (x. e. x) 2(x. 1x±y12 = (x+y. form an orthonormal basis . Isomorphism of Euclidean spaces I. 1x + yl lx EXERCISE. dimensional Euclidean vector space are said to form an orthogonal basis if they are pairwise orthogonal. The non-zero vectors el. In the general case of an n-dimensional Euclidean space we define the distance between x and y by the relation d lx yl. In a vector space there is no reason to prefer one basis to another. x)+21x1 lyi+ (y. of R onto itself which bases in R. y) + (y. the tip of that vector) is defined as the length of the vector x y. if e. . Briefly. x+y) SI. In geometry the distance between two points x and y (note the use of the same symbol to denote a vectordrawn from the origin- and a point. e'.. e of an nDEFINITION 1. y) (x.e2. 3 Not so in Euclidean spaces.

e. para..e. where a is chosen so that (e2. Likewise. e2. This procedure leads from any basis f. e are linearly independent. ei) = O. To construct ek we put e. 2) such a space contains a basis f1. ek) = 0 for k é L Hence A = O. e2. We wish to show that (2) implies Ai = 2. . =A1e11+ ' + Ak-iet where the Al are determined from the orthogonality conditim . i. (e1. For this definition to be correct we must prove that the vectors ei. 2. This means that (f. Thus. . e2) + + 2(e1. Every n-dimensional Euclidean space contains orthogonal bases. ek) = f1 10 if i = k if i k. f. el) = 0. e1)/(e1. . We shall make use of the so-called orthogonalization procedure to prove the existence of orthogonal bases. ". multiplying (2) by e. the definition of an orthogonal basis implies that ei) 0 0. el) + A2(e1. Proof: By definition of an n-dimensional vector space (§ 1.e. are linearly independent. We put = f1. e1. e1. form the inner product of each side of (2) with ei). Suppose that we have already constructed non-zero pairwise orthogonal vectors el. This proves that el.. .. Next we put e.2 = A = O.. f to an orthogonal basis el. let en of the definition actually form a basis. we find that A2 = 0.22 LECTURES ON LINEAR ALGEBRA (ei. i. etc. en) = O. Now. A2e2 + + Ae = O. To this end we multiply both sides of (2) by el (i.. f2. The result is 21(e1. e2. (f. THEOREM 1. = f. e.e. e1).

and fk+. e2. + 2-1 fk. and lying in the plane determined by e.. 1. fk we Just as ek. e2. Let R be the three-dimensional space with which we are familiar from elementary geometry. .. e2) = (fk = (fk 21e2-1 Aiek. ek_k and fk were used to construct e. equalities become: (fk. e2. and the vectors e. In view of the linear independence of the vectors f1. 22-2(e2.e. are pairwise orthogonal. = (fk.n-DIMENSIONAL SPACES 23 (ek. Similar statements hold for e22. e2). an orthogonal It is clear that the vectors e'k = ek/lekl (k = 1. ' (f2.. combination of the vector f_. fa. = 0.. So far we have not made use of the linear independence of the .) = (f2 Since the vectors el. It follows that 2-1 = (f/c. el). f. But e. Next select a . the latter (fk. may conclude on the basis of eq. (e2. f2. Put e. The vector ek is a linear combination of the vectors ek. perpendicular to e. . ek. e. i. f. etc. f2. O. but we shall make use of this fact presently to prove that e..ctor e. et) = 0. so . e2. pairwise orthogonal vectors ek. (5) that ek O. be three linearly independent vectors in R. 2. e1) (ek. n) form an orthonormal basis. O. el) + 22-1(e1. = f. e2_2. Let fi. e. (fk. . e2. . e. By continuing the process described above we obtain n non-zero. EXAMPLES OF ORTHOGONALIZATION. e2) = 0. e2) 0. e1)/(e1. -I- ' ' + ' + A2e.. = f. can be written as a linear vectors f1. . ek. en. can be used to construct e. It follows that ek = alfk ci2f2 + . e) = O. This proves our theorem. 02) A1e2-1 + + e2-1) .-t. . e2. e. basis. e2)/(e2. e2-1) 21(ek_1.

Multiplying each Legendre polynomial by a suitable constant we obtain an orthonormal basis in R. -. Thus 1. = t2 1/3. (3/5)1. We select as basis the vectors 1. = 12 + 131 The orthogonality requirements imply ß 0 and y = 1/3. P 1/3 is an orthogonal basis in R. 12 form a basis in R. +77e0. then 52e. I) = f (t dt = 2a. We shall denote the kth element of this basis by Pk(t). i. 12 1/3. but not orthonormal basis in R. choose e. Finally we put e. = t. e. We define the inner product of two vectors in this space as in the preceding example. e2. en be an orthonormal basis of a Euclidean space x= y = ?he. Next we put e. Since O (t+ I.t^-2. (i. n2e2 + (x.e. = 1.e. 7 kk. . = t I. t. + n2e2 + + enen.. i. Finally. We put e. We shall now orthogonalize this basis. R.24 LECTURES ON LINEAR ALGEBRA and f2. ek) = {01 1ff + ne). We define the inner product of two vectors in this space by the integral fi P(t)Q (t) dt. t. .. The vectors 1. Apart from multiplicative constants these polynomials coincide with the Legendre polynomials 1 dk (12 1)k 2k k! dtk The Legendre polynomials form an orthogonal. perpendicular to the previously constructed plane). As in Example 2 the process of orthogonalization leads to the sequence of polynomials 1. Let e1. y 1.e.. By dividing each basis vector by its length we obtain an orthonormal basis for R. t. t. e. perpendicular to ei ande. it follows that a = 0. Let R be the space of polynomials of degree not exceeding n 1.Y)= Since $2e2 + ee. If . Let R be the three-dimensional vector space of polynomials of degree not exceeding two.
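The orthogonalization procedure described above is mechanical enough to code directly. A minimal sketch (NumPy assumed; the starting vectors are made up, and the routine is not part of the original text) that carries a linearly independent set f₁, ..., fₖ into an orthonormal set e₁, ..., eₖ by subtracting from each fₖ its projections on the previously constructed vectors and then normalizing:

import numpy as np

def orthogonalize(f):
    # The orthogonalization process used in the proof of Theorem 1 of this section.
    e = []
    for fk in np.asarray(f, dtype=float):
        for ej in e:
            fk = fk - np.dot(fk, ej) * ej   # remove the component along e_j
        e.append(fk / np.linalg.norm(fk))   # normalize to unit length
    return np.array(e)

E = orthogonalize([(1.0, 1.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)])
print(np.round(E @ E.T, 10))                # the identity matrix: the e_k are orthonormal

Applied to the powers 1, t, t², ... with the integral inner product on [−1, 1], the same procedure reproduces, up to constant factors, the Legendre polynomials mentioned above.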

Cle=1 f is an arbit ary basis. and Y = nifi + + n. the inner product of two vectors relative to an orthonormal basis is equal to the sum of the products of the corresponding coordinates of these vectors (cf. + e(en . + + Enn. It is natural to call the inner product of a vector x and a vector e of length 1 the projection of x on e. EXERCISES 1. El% + $27)2 + (x. Thus the kth coordinate of a vector relative to an orthonormal basis is the inner product of this vector and the kth basis vector. Multiplying both sides of this equation by el we get el) + $2(e2 el) + e. Y) = 071 + " ' Ent. f. Let x = eie.(1). " en and ni.. e1) = and. polynomials of degree 0. . § 2). let Q (t) be an arbitrary polyno- . Show that if f. except that there we speak of projections on the coordinate axes rather than on the basis vectors. P o(t) be the normed Legendre Let Po(t). f'2. 2. Further. . e). The result just proved may be states as follows: The coordinates of a vector relative to an orthonormal basis are the projections of this vector on the basis vectors. then (x.n-DIMENSIONAL SPACES 25 it follows that + Enfl. e2). P. y) Thus. el) = . y) = E nikEink. (x. where aik = aki and ei e2. similarly. 1. n..f for every x = basis is orthonormal. EXAMPLES. then this We shall now find the coordinates of a vector x relative to an orthonormal basis el. 1. = (x. e2. (7) E2e2 ene. (x. Example 2. Show that if in some basis f1. = (x. ?I are the coordinates of x and y respectively. This is the exact analog of a statement with which we are familiar from analytic geometry. e.f. n2.. .

2 (8) Consider the system of functions 1. (l/Vn) cos nt. We shall represent Q (t) as a linear combination of the Legendre polynomials. . cos t + b. o cos% kt dt n 227 . rn sin kt cos It dt = 0. if k and I. are an orthonormal basis for R1.. Q) = fo P(t)Q(t) dt. A linear combination P(t) = (a012) + a. Let R. sin nt. on the interval (0. cos nt. Perpendicular from a point to a subspace. (l/ n) sin t.) DEFINITION 2. be a subspace of a Euclidean space R. The shortest distance from a point to a subspace. The totality of trigonometric polynomials of degree n form a (2n + 1) -dimensional space R. Hence every polynomial Q(t) of degree n can be represented in the forra Q (t) = P(t) + c1P1(1) + c. sin t + al cos 2t + + 6.o 2r sin kt sin lt dt = 0. cos t. . 2n It is easy to see that the system (8) is an orthogonal basis Indeed r 2r Jo cos kt cos It dt = 0 if k 1. We define an inner product in R1 by the usual integral (P. sin nt of these functions is called a trigonometric polynomial of degree n. if it is orthogonal to every vector x e RI. To this end we note that all polynomials of degree n form an n-dimensional vector space with orthonormal basis Po(t).26 LECTURES ON LINEAR ALGEBRA rnial of degree n.f: sin% kt dt = 27. cos 2t.{0 ldt = 2n. it follows that the functions (8') 1/1/2a. + cP(t). (l/Vn) sin nt 2. sin t.. (l/ n) cos t. We shall say that a vector h e R is orthogonal to the subspace R. P(t). sin 2t.(1) dt. 2n). (This paragraph may be left out in a first reading. It follows from (7) that I Q(t)P. Since . f.

e. then. fo. of f on the subspace R1 (i.e.. is called the orthogonal projection of f on the subspace R1.. f.. Indeed..1.n-DIMENSIONAL SPACES 27 e then it is also If h is orthogonal to the vectors e. Indeed. in). e. = To find the c. in R1 such that the vector h f f is orthogonal to R1. ej = O (k = 1. + Hence. just as in Euclidean geometry. We shall now show how one can actually compute the orthogo- nal projection f. i. + 2. et) = 0 (1= 1.f1I2 = If - so that If - > If e. Right now we shall show that. 21e1 + 22. ej. ej = (f.. e R1 and f1 f. By the theorem of Pythagoras If 412 + 14 . orthogonal to any linear combination of these vectors.. to a basis of 12. we shall show that if f. We pose the problem of dropping a perpendicular from the point f to 121. be a basis of R1.e. In other words. f. i. The vector f.e. As a vector in 121. the vector fo f1 belongs to R. We shall see in the sequel that this problem has always a unique solution. .41.. Let R.e. 22. must be of the form f. be an m-dimensional subspace of a (finite or infinite dimensional) Euclidean space R and let f be a vector not belonging to lt1. (f0.) = O. m) implies that for any numbers 2.. .f0 + 4 . 2. we note that f (f c2e2 + c. as a difference of two vectors in RI. must be orthogonal to III. and is therefore orthogonal to h = f f. of finding a vector f. e. (h. (h. . 2. i. . how to drop a perpendicular from f on 141).. Let el. or.f1I2 = If . If - > If . for a vector h to be orthogonal to an m-dimensional subspace of R it is sufficient that it be orthogonal to ni linearly independent vectors in It1. 1111 is the shortest distance from f to RI.e.

.). e2. by the expression in (9) we obtain a system of m equations for the c. This determinant is known as the Gramm determinant of the vectors e1. cm with respect to the basis el. x2... en. e.(e. Indeed. . in such a basis the system (11) goes over into the system ci = (f. this vector has uniquely determined coordinates el. Since it is always possible to select an orthonormal basis in an m-dimensional subspace. 2.e. we have proved that for every vector f there exists a unique orthogonal projection f. 1.. e1) c2(e2. m).) must be different from zero.) (era. in view of the established existence and uniqueness of the vector f0. e1) ' (ex. Let y be a linear function of x1. e. e2.. We first consider the frequent case when the vectors e1.x . satisfy the system (11).. . e. are the coordinates of to relative to an orthonormal basis of R1 or a non-orthonormal basis of A system of m linear equations in in unknowns can have a unique solution only if its determinant is different from zero. this system has a unique solution. e2. Indeed. the coordinates c. c. EXAMPLES. let y= + c. Since the c. . Thus. e2) (ex. on the subspace We shall now show that for an arbitrary basis el. e1) + + c. e. It follows that the determinant of the system (11) (el. are orthonormal. .. e2. of the orthogonal projection fo of the vector f on the subspace 111 are determined from the system (12) or from the system (11) according as the c. x... . e. e1) = (f. i. The method of least squares. (e2.. In this case the problem can be solved with ease. . c2.(ei.28 LECTURES ON LINEAR ALGEBRA Replacing f. the system (11) must also have a unique solution.) (e e. ei) (e (e2. e.. e1) (k = 1.) (e2.

of the vector X22. the system (13) is usually incompatible and can be solved only approximately. . e2. e2e2 . c. .e. y) in that space. Cm so as to minimize the distance from f to numbers e1. + Xm2 + xmc.2. x2. x1). x2c2 + However usually the number n of measurements exceeds the number m of unknowns and the results of the measurements are xincl never free from error. = Y2. cm from the system of equa- + XnaCm = y1. the quantity k=1 E (X1kC1 X2nC2 + + XinkCk Ykr The problem of minimizing the mean deviation can be solved directly. If R1 is the subspace spanned by fo = c. denote the results of the kth measurement. = y . To this end one carries out a number of measurements of ael. tions X21C2 + Xl2C1 + X22 C2 + X11C1 . c2. i. 4 4. let us consider the n-dimensional Euclidean space of n-tuples and the following vectors: e. c2. are fixed unknown coefficients. X. and + ciel c2e2 + Consequently. x. and y. its solution can be immediately obtained from the results just presented. X2n).. (14) represents the square of the distance from + c. There arises the problem . x2. x. x. . are determined experimentally. Indeed. One could try to determine the coefficients c1.n and the problem of minimizing the f to clel c2e2 + mean deviation is equivalent to the problem of choosing ni .e. in (13) are as "close" as possible to the corresponding right sides.(y1. .e..n-DIMENSIONAL SPACES 29 where the c.. x). Let x.e. y2. Thus. y. The right sides of (13) are the components of the vector f and the left sides. cm so that the left sides of the equations of determining t1. As a measure of "closeness" we take the so-called mean deviation of the left sides of the equations from the corresponding free terms. e2 = (X21 f =_. = (x11. However. = (x. Frequently the c..

formula (11)). xoxk. e2)c2 + ' (e2. = (f.. the numbers c1. en)ci + (e. 4. e1)c2 + (e2. The system of equations (15) is referred to as the system of normal equations. 5). EXERCISE. ei)c. In this case the normal system Solution: el consists of the single equation (e1. = (f. (supposed linearly independent). . (2. 4). Use the method of least squares to solve the system of equations 2c 3c 3 4 4c = 5. + + (e. which solve this problem are found from the system of equations (e1. = (f. y) (x. . (e e2)c. 29c = 38. then our problem is the problem of finding the projection of f on RI. e2. e1). em)c. The method of approximate solution of the system (13) which we have just described is known as the method of least squares. e. e.. e2). c2.). el)a. xc the (least squares) solution is c (x.. c = 38/29. ek) = 1-1 where (f.30 LECTURES ON LINEAR ALGEBRA the vectors el. enc = (ee f). en)c.c2 k I . 3. ex) = I =1 xxiYs. e2)c1 (e2. As we have seen (cf. x) k=1 xox x. (e1. (15) (e1. (13') When the system (13) consists of n equat ons in one unknown xic x2c = y2. f = (3.

k=0 where or ck = (t. (x2. (x. and this problem is solved by dropping a perpendicular from f(t) to R1. 1. Approximation of functions by means of trigonometric polynomials.n-DIMENSIONAL SPACES 31 In this case the geometric significance of c is that of the slope of a line through the origin which is "as close as possible" to the points (x1. It is frequently necessary to find a trigonometric polynomial P(t) of given degree v.f(t)g(t) dt. cos t b. Consequently. e. which is closest to fit). cos t Nhz 1 e. Since the functions 1 eo V2. g) = 21.Mich differs from f(t) by as little as possible. 2n form an orthonormal basis in R. sin t Or sin nt Ahr . as usual. of R of dimension 2n + 1._. e. by means of the integral (I. Our problem is to find that vector of R. the mean deviation (16) is simply the square of the distance from j(t) to P(t). The trigonometric polynomials (17) form a subspace R.7 ' e. y). P(t) = (a012) + a. (cf. Then the length of a vector f(t) in R is given by = 6atr EN)? dt. Let us consider the space R of continuous functions on the interval 10. 2. . we are to find among all trigonometric polynom als of degree n. para. Thus. y2). the required element P(t) of R. Let (t) be a continuous function on the interval [0. Example 2). cos nt . is P(t) = E ce1. 2:r] in which the inner product is defined. We shall measure the proximity of 1(t) and P (t) by means of the integral u(t) . sin t + + an cos nt b sin nt.13(1)i2 dl. . el). y1). 2n]. that polynomial for which the mean deviation from f (t) is a minimum.

Example 5. it stood for a polynomial.2" Vn o f(t) cos kt dt. The question arises which of these spaces are fundamentally different and which of them differ only in externals. x' e R') such that I.e.. for the mean deviation of the trigonometric polynomial from f(t) to be a minimum the coefficients a and bk must have the values a. a =- 1 x 5 2 1(1) cos kt dt. =7 P(t) := . then the same . the inner products of corresponding pairs of vectors are to have the same value.. y').. C2k - ThuS. If X 4> X' and y 4> y'. If x 4> x' and y 4--> y'. We have investigated a number of examples of n-dimensional Euclidean spaces. Isomorphism of Euclidean spaces. a. The numbers a. etc...e. i. i. To be more specific: DEFINITION 2. If x x'. o b= 5 "Jo 127 f(t) sin kt dt. cos kt + b sin kt k=1 n 127 5 X o fit) dt. Thus in § 2. if OW correspondence associates with X E R the vector X' E R' and with y e R the vector y' e R'. ' 7( /0 1(t) sin kt dt. "vector" stood for an n-tuple of real numbers.. e. Example 2.32 n-DIMENSIONAL SPACES = Vart o -I 27 f(t)dt. scalar multiplication and inner multiplication of vectors has been proved. In each of them the word "vector" had a different meaning.. 1 tik. Two Euclidean spaces R and R'. in § 2. then (x. 2 * Ea.. 27 1 J..1c. then Axe> Ax'. then x + y X' + y'. 3. We observe that if in some n-dimensional Euclidean space R a theorem stated in terms of addition. y) = (x'. then it associates with the sum x + y the sum x' y'. and bk defined above are called the Fourier coefficients of the function fit).. are said to be isomorphic if it is possible to establish a one-to-one correspondence x 4> x' (x e R.

all arguments would remain unaffected.. isomorphic to the space R. The one-to-one nature of this correspondence is obvious. 8n) in R'..e. § 2. if we replaced vectors from R appearing in the statement and in the proof of the theorem by corresponding vectors from R'. y') = El% + $2n2 + + $nn. This will prove our theorem. . We now show that this correspondence is an isomorphism. e) and y' = (7? n . i. then. Clearly. . that the inner products of corresponding pairs of vectors have the same value. . We associate with the vector x= e2e2 + + ene in R the vector = (81. 2. ¿2. + Ennn. en be an orthonormal basis in R (we showed earlier that every Euclidean space contains such a basis). We shall show that all n-dimensional Euclidean spaces are isomorphic to a selected "standard" Euclidean space of dimension n. As our standard n-dimensional space R' we shall take the space of Example 2. Conditions 1 and 2 are also immediately seen to hold. Let el. e2. in which a vector is an n-tuple of real numbers and in which the inner product of two vectors x' = (E1. THEOREM 2. The following theorem settles the problem of isomorphism of different Euclidean vector spaces. Y) = $1ni + $2n2 + because of the assumed orthonormality of the e. in view of the properties 1. . It remains to prove that our correspondence satisfies condition 3 of the definition of isomorphism. Indeed.n-DIMENSIONAL SPACES 33 theorem is valid in every Euclidean space R'. (x. All Euclidean spaces of dimension n are isomorphic. 3 of the definition of isomorphism. the definition of inner multiplication in R' states that (x'. 82. = Eft?' + $2n2 + + Now let R be any n-dimensional Euclidean space. On the other hand. nn) is defined to be a (x'.

Thus

(x', y') = (x, y),

i.e., the inner products of corresponding pairs of vectors have indeed the same value. This completes the proof of our theorem.

EXERCISE. Prove this theorem by a method analogous to that used earlier.

The following is an interesting consequence of the isomorphism theorem. Any "geometric" assertion (i.e., an assertion stated in terms of addition, inner multiplication and multiplication of vectors by scalars) pertaining to two or three vectors is true if it is true in elementary geometry of three space. Indeed, the vectors in question span a subspace of dimension at most three. This subspace is isomorphic to ordinary three space (or a subspace of it), and it therefore suffices to verify the assertion in the latter space.

To illustrate, the inequality (7) of § 2,

|x + y| ≤ |x| + |y|,

is stated and proved in every textbook of elementary geometry as the proposition that the length of the diagonal of a parallelogram does not exceed the sum of the lengths of its two non-parallel sides, and is therefore valid in every Euclidean space; in particular, in the space of continuous functions on [a, b] it expresses the inequality

√( ∫ₐᵇ [f(t) + g(t)]² dt ) ≤ √( ∫ₐᵇ [f(t)]² dt ) + √( ∫ₐᵇ [g(t)]² dt ).

Again, a geometric theorem about a pair of vectors is true in any Euclidean space because it is true in elementary geometry. In particular the Schwarz inequality, which in the space of continuous functions on [a, b] takes the form

( ∫ₐᵇ f(t)g(t) dt )² ≤ ∫ₐᵇ [f(t)]² dt · ∫ₐᵇ [g(t)]² dt,

is a direct consequence, via the isomorphism theorem, of the corresponding proposition of elementary geometry. We thus have a new proof of the Schwarz inequality.

§ 4. Bilinear and quadratic forms

In this section we shall investigate the simplest real valued functions defined on vector spaces.

= 1.) f(x) = f&ie. the properties of a linear function imply that + ene) eifie. $2. + E2e2 + Thus. is the dependence of the a. = acne. ¿j(e2) + + enf(e.e2 + + 2. Thus let et. e' Further. . 2. A linear function (linear form) f is said to be defined on a vector space if with every vector x there is associated a number f(x) so that the following conditions hold: _fix f(x) +AY).). Let et. Y) !(Aa) = 1f (x). en be a basis in an n-dimensional vector space. and f a linear function defined on R. + ac21e2 + acne. .n-DIMENSIONAL SPACES 35 1. . n). Linear functions. on the choice of a basis. The exact nature of this dependence is easily where f(e) = explained. . en is a basis of an n-dimensional vector space R. + 122e2 + e'2 + ocnIenr + (Xn2en. Let the e'. let 2. . e2. if et. e2. . What must be remembered.2. e and e'1. Linear functions are the simplest functions defined on vector spaces. E. e2. The definition of a linear function given above coincides with the definition of a linear function familiar from algebra.e. + anen f(X) = a252 . e2. DEFINITION I. Since every vector x can be represented in the form x= enen. e'n be two bases in R. however. then (1) f(x) = aiel a252+ -in amen. e by means of the equations e'. x a vector whose coordinates in the given basis are E1. e'2. be expressed in terms of the basis vectors et.

en. 2. In other words. y) is a linear function of y. cogrediently).) and a' k = f(e'). A (x. . if we regard ni .e. . n2. conditions 1 and 2 above state that A (xi + x2. A (x. E are kept constant. y) is a linear function of y. e2. A (x. + + akf(e. A (Ax. EXAMPLES. i. Bilinear forms. as it is sometimes said. and + a'e' f(x) = a'le'. 1. relative to the basis e'1. yi) + A (x. en). n as constants. y) = + an 27/1 anlennl a12e072 anE27/2 ' ' ' ' + n a2 ne2n an2enn2 + + annennn A (x.36 LECTURES ON LINEAR ALGEBRA relative to the basis el. y = n2.) = ctik + c(2k az + This shows that the coefficients of a linear form transform under a change of basis like the basis vectors (or. In what follows an important role is played by bilinear and quadratic forms (functions).. Let x = ($1.. Consider the n-dimensional space of n-tuples of real numbers. Again. yz). bt2. it follows that cc2ke2 + + C(nk en) = Xlki(ei) ac22f(e2) ai = f(xikei + anka. DEFINITION 2. y). e'2. A (x. if $1. k1 ae ink depends linearly on the $. y) = A (x.(E1. y) is a bilinear function. if we keep y fixed. . y). Since a. pty) = yA(x. Indeed. $2. y) = (x. + y2) = A (x. + a'2E12 + e'. y). nk). noting the definition of a linear function. E). i. y) is said to be a bilinear function (bilinear form) of the vectors x and y if for any fixed y. y) + A (x2. A (x. A (x. A (x. = f(e. and define (2) A (x.. y) is a linear function of x =. . ey. y) is a linear function of x. for any fixed x.

n. t)f(s)g(t) ds dt. 3 in the definition of an inner product (§ 2) say that the inner product is a symmetric. )72e2 + A (x. The matrix of a bilinear form. Indeed. If we put b sb K(s. In Example / above the bilinear form A (x. symmetric 21 A bilinear function (bill ear form) is called A (x. We shall express the bilinear form A (x. A( f. Let K(s. space. y) = A (ei ei $2e2 + In view of the properties 1 and 2 of bilinear forms . then their product 1(x) g(y) is a bilinear function. 2. e2. e2. y) using the e of x and the coordinates ni. g) = I then A (f. may be removed from under the integral sign. EXERCISE.e. y) defined by (2) is symmetric if and only if aik= aid for all i and k. e. in this case.. g) is a bilinear function of the vectors f and g.. y) in a Euclidean space is an example of a symmetric bilinear form.n-DIMENSIONAL SPACES 37 2. Let R be the space of continuous functions f(t). Conditions 2 have analogous meaning. A (f. /he. ri of coordinates ei. Thus. y relative to the basis e1. the first part of condition 1 of the definition of a bilinear form means. + e en. . t) b jib A (f. that the integral of a sum is the sum of the integrals and the second part of condition 1 that the constant A. 3. We defined a bilinear form en be a basis in n-dimensional axiomatically. + 71e). t. bilinear form. Now let el. DEFINITION 3. then If K(s. The inner product (x. i. Indeed. Show that if 1(x) and g(y) are linear functions. 1. g) = f(s)g(t) ds dt = f(s) ds g(t) dt. g) is the product of the linear functions!: f(s) ds and Jab g(t) dt. Axioms I. y) = A (y x) for arbitrary vectors x and y. t) be a (fixed) continuous function of two variables s. e2.

A(x, y) = Σᵢ,ₖ A(eᵢ, eₖ) ξᵢηₖ,

i.e.,

(3)    A(x, y) = Σᵢ,ₖ aᵢₖ ξᵢηₖ,   where   (4)   aᵢₖ = A(eᵢ, eₖ).

The matrix 𝒜 = ‖aᵢₖ‖ is called the matrix of the bilinear form A(x, y) relative to the basis e₁, e₂, ⋯, eₙ.

To sum up: Every bilinear form in n-dimensional space can be written as (3), where aᵢₖ = A(eᵢ, eₖ). Thus, given a basis e₁, e₂, ⋯, eₙ, the form A(x, y) is determined by its matrix 𝒜 = ‖aᵢₖ‖.

EXAMPLE. Let R be the three-dimensional vector space of triples (ξ₁, ξ₂, ξ₃) of real numbers. We define a bilinear form in R by means of the equation

A(x, y) = ξ₁η₁ + 2ξ₂η₂ + 3ξ₃η₃.

Let us choose as a basis of R the vectors

e₁ = (1, 1, 1),  e₂ = (1, 1, −1),  e₃ = (1, −1, −1),

and compute the matrix 𝒜 of the bilinear form A(x, y). Making use of (4) we find that:

a₁₁ = 1·1 + 2·1·1 + 3·1·1 = 6,
a₁₂ = a₂₁ = 1·1 + 2·1·1 + 3·1·(−1) = 0,
a₁₃ = a₃₁ = 1·1 + 2·1·(−1) + 3·1·(−1) = −4,
a₂₂ = 1·1 + 2·1·1 + 3·(−1)·(−1) = 6,
a₂₃ = a₃₂ = 1·1 + 2·1·(−1) + 3·(−1)·(−1) = 2,
a₃₃ = 1·1 + 2·(−1)·(−1) + 3·(−1)·(−1) = 6,

so that the matrix of our bilinear form relative to this basis is

|  6   0  −4 |
|  0   6   2 |
| −4   2   6 |.

It follows that if the coordinates of x and y relative to the basis e₁, e₂, e₃ are denoted by ξ′₁, ξ′₂, ξ′₃ and η′₁, η′₂, η′₃, respectively, then

A(x, y) = 6ξ′₁η′₁ − 4ξ′₁η′₃ + 6ξ′₂η′₂ + 2ξ′₂η′₃ − 4ξ′₃η′₁ + 2ξ′₃η′₂ + 6ξ′₃η′₃.
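The arithmetic of this example is easy to check numerically. A short Python/NumPy sketch (the coordinates ξ′, η′ chosen at the end are arbitrary test values):

```python
import numpy as np

# The bilinear form and the basis of the example above.
A = lambda x, y: x[0] * y[0] + 2 * x[1] * y[1] + 3 * x[2] * y[2]
e = [np.array([1, 1, 1]), np.array([1, 1, -1]), np.array([1, -1, -1])]

# a_ik = A(e_i, e_k): the matrix of the form relative to this basis.
M = np.array([[A(ei, ek) for ek in e] for ei in e])
print(M)
# expected:
# [[ 6  0 -4]
#  [ 0  6  2]
#  [-4  2  6]]

# Check: for x = sum xi'_i e_i and y = sum eta'_k e_k, A(x, y) = sum a_ik xi'_i eta'_k.
xi, eta = np.array([1.0, 2.0, -1.0]), np.array([0.5, 1.0, 3.0])   # arbitrary coordinates
x = sum(c * v for c, v in zip(xi, e))
y = sum(c * v for c, v in zip(eta, e))
print(np.isclose(A(x, y), xi @ M @ eta))   # True
```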

. e.. f. e2. f2. the matrix I IbI I given the matrix I kJ.1 are. f2. e and . is the value of our bilinear form for x f. e2. e2. It follows that by. f2. (4)] b = A (f. Our problem consists in finding relative to the basis f1. . (6) . . Let si = I laid I be the matrix of a bilinear form A (x.ß=1 . ca. a... e and f. y = fe. fe). To this end The e'.11 and a = cik E ai. Transformation of the matrix of a bilinear form under a change of basis. en and c.Vt. the matrix of that form L.. c22. e2. c. f be two bases of an n-dimensional vector space. c/-1 Using this definition twice one can show that i = d. = A (fp.e. We shall now express our result in matrix form. fe) = k=-1 acic. e are cm. the numbers c. Let el.. the element c of a matrix 55' which is the product of is defined as two matrices at = 11a0. By definition [eq. Now b becomes 4 4 As is well known.. . y) relative to the basis e1.zbk.n-DIMENSIONAL SPACES 39 4. c. then = E abafic Aft. .e. i.4 = I ibikl 1. e2. = c' transpose W' of W. w [clici2 . + + 1 f2 = cue]. The matrix basis el. and fe relative to the basis e1. the elements of the we put e1. Let the connection between these bases be described by the relations = cue' (5) c21e2 d- + ce. C21C22 cn2c2 C2n cn is referred to as the matrix of transition from the basis e. c22e2 + In = crne + c2e2 + which state that the coordinates of the vector fn relat ve to the . bp.. i. of course. to the basis f1. To find this value we make use of (3) where in place of the ei and ni we put the coordinates of fp .... c.

. then A (x. .. f. y).y. Thus.. Let A (x. x)). Since the right side of the above equation involves only values of A (x. e2. y) = A (y. where W is the matrix of transition from e1. Quadratic forms DEFINITION 4. e to f1. x). x). y) is u iquely determined by i s Proof: The definition of a bilinear form implies that A (x 4-. A(y. y) (i. The function A (x. . y) as well as the symmetric bilinear form A . A (x.40 (7*) LECTURES ON LINEAR ALGEBRA t. e and [. x) + A (y. x)i . basis el. f and W' is the transpose of W. (x. The requirement of Definition 4 that A (x. y) relative to the then PI = W' dW. if s is the matrix of a bilinear form A (x. The polar form A (x. k=1 Icriaikc. y) is any (not necessarily symmetric) bilinear form. y) + A (y. To show the essential nature of the symmetry requirement in the above result we need only observe that if A (x. it follows that A (x.:W its matrix relative to the basis f1. x + y) = A (x. y) is referred to as the bilinear form polar to the quadratic form A (x. in view of the equality A (x. f2. x) A (y.e. x + y) the quadratic form A (x. . A (x. Using matrix notation we can state that (7) =wi sr. y) by putting y = x is called a quadratic form. y) is indeed uniquely determined by A (x. y) be a symmetric bilinear form. x) obtained from A (x. quadratic form. y)]. y) be a symmetric form is justified by the following result which would be invalid if this requirement were dropped. e2. x) + A (x. THEOREM 1. f2. y) -- (x + y. Hence in view of the symmetry of A (x.. 5. x).

It is clear that A (x.n-DIMENSIONAL SPACES 41 give rise to the same quadratic form A (x. y) its polar form. where a. y). This enables us to give the following alternate definition of Euclidean space: A vector space is called Euclidean if there is defined in it a positive definite quadratic form A (x. x). such a bilinear form always defines an inner product. In such a space the value of the inner product (x. of x and nk of y as follows: A (X. 3) I atkeznk. These conditions are seen to coincide with the axioms for an inner product stated in § 2. x) can be expressed as follows: A (x. y) = A (xl. A quadratic form A (x. y) can be expressed in terms of the coordinates E. x) be a positive definite quadratic form and A (x. We have already shown that every symmetric bilinear form A (x. x). It follows that relative to a given basis every quadratic form A (x. au. an inner product is a bilinear form corresponding to a positive definite quadratic form. x). y) = A (x. x) = E aikeik. EXAMPLE. . + x2. x) > 0 for x O. x) > O. Let A (x. y) of two vectors is taken as the value A (x. The definitions formulated above imply that A (x. x) 0 and A (x.. y) associated with A (x. = a. = k=1 We introduce another important DEFINITION 5. y) of the (uniquely determined) bilinear form A (x. y). (x. A (ibc. Hence. x) is called positive definite if for every vector x A (x. Conversely. A (x. y) + A (x2. y) = A (y. x). x) = + $22 + -in $2 is a positive definite quadratic form.
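Returning for a moment to para. 4, the transformation rule ℬ = 𝒞′𝒜𝒞 for the matrix of a bilinear form under a change of basis is easy to verify numerically. A minimal sketch in Python/NumPy (the form and the transition matrix below are arbitrary random examples):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A_mat = rng.standard_normal((n, n))          # matrix of a bilinear form A(x, y) = x^T A y
C = rng.standard_normal((n, n))              # transition matrix: f_q = sum_i c_iq e_i

# Matrix of the same form relative to the new basis f_1..f_n:
# b_pq = A(f_p, f_q) = sum_{i,k} c_ip a_ik c_kq, i.e. B = C' A C.
B = np.array([[C[:, p] @ A_mat @ C[:, q] for q in range(n)] for p in range(n)])
print(np.allclose(B, C.T @ A_mat @ C))       # True
```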

the coefficient of )7'2. nn . In view of the one-to-one correspondence between coordinate transformations and basis transformations (cf. We now single out all above. To reduce the quadratic form A (x. x) (supposed not identically zero) does not contain any square of the variables n2.42 LECTURES ON LINEAR ALGEBRA § S. . § 1) we may write the formulas for coordinate transformations in place of formulas for basis trans- formations. A (x. nn are the coordinates of the vector x relative to this basis. stays different from zero. Since an = an = 0. f3. . x) in which at least one of the a (a is the coefficient of )7. x) in terms of the coordinates of the vector x depends on the choice of basis. namely.2) is not zero.e. $n2... . We now show how to select a basis (coordinate system) in which the quadratic form is represented as a sum of squares. that in (2) a n2 + 2ainnin. If this is not the case it can be brought about by a change of basis consisting in a suitable change of the numbering of the basis elements. . those terms of the form which contain ann12 + 2annIn2 + We shall assume slightly more than we may on the basis of the O. Thus let f1. 6. i. x) = Al$12 + 12E22 . 2a1277072. Consider the coordinate transformation defined by = nil + 7/'2 n2 = 7711 nia (k = 3. If the form A (x. x) to a sum of squares it is necessary to begin with an expression (2) for A (x.. X) = a zo in where ni. Reduction of a quadratic form to a sum of squares VVre know by now that the expression for a quadratic form A (x. . A (x.n) nk = n'k Under this transformation 2a12771)72 goes over into 2a12(n1 771). para. )72.f be a basis of our space and let . We shall now carry out a succession of basis transformations aimed at eliminating the terms in (2) containing products of coordinates with different indices. . it contains one product say.

x) n . k=2 is entirely analogous to the right side of (2) except for the fact that it does not contain the first coordinate. 272** = (122* n2* + a23* n3* + + nn*.. If we assume that a22* 0 0 (which can be achieved.. where the dots stand for a sum of terms in the variables t)2' If we put 711* = aniD 1)2* a12n2 .n-DIMENSIONAL SPACES 43 and "complete the square. " (Jinn.hThc. our form becomes A (x. ' ' It is clear that B contains only squares and products of the terms that upon substitution of the right side of (3) al2172.i. if necessary. * si ik ' The expression a ik* n i* n k* i. x) au th**2 a22* a ** **2 + n2ik. write an/h2 (3) 1 24112971712 (ann. by auxiliary transformations discussed above) and carry out another change of coordinates defined by ni ** n3** = nn** ni* . + 2a/nn + a1)2 B. n3*. so in (2) the quadratic form under consideration becomes (x. " 71: = then our quadratic form goes over into A (x. *. *2 + all n *. fi ** t.e. x) = 1 a 11 (a11711 + ' + a1)2 + 4.1=3 .

171* + = n. . 7122 8%2. 8712. x) = Again.s = then A (x.e. relative to some basis f1. x) be a quadratic form in an n-dimensional space R. Let A (x. i.7/2 4flo. We may now sum up our conclusions as follows: THEOREM 1. § 1) and to see that each change leads from basis to basis. If m < n. we put 4+1= = An = O. Then there exists a basis el. Thus let A (x. = then A (x. 21E18 ¿2$22 27n em2. 6. + 27 s* '73' . . An en2. x) .j* Finally.2 +722.f2. e. x) (cf. x) If 271. e2. We shall now give an example illustrating the above method of reducing a quadratic form to a sum of squares.44 LECTURES ON LINEAR ALGEBRA After a finite number of steps of the type just described our expression will finally take the form A (x. We leave it as an exercise for the reader to write out the basis transformation corresponding to each of the coordinate transformations utilized in the process of reduction of A (x.. if SI = rits. if 71' . en of R relative to which A (x. e2.f3. . Vi = 77.2 + 4. to n linearly independent vectors. x) has the form A (x.8 + 27j + 41)/ 2?y. x) be a quadratic form in three-dimensional space which is defined. by the equation A (x.j* . where m n. para. E2. x) =_ n1. x) = 21E12 + 22E22 where E1. es = e3 = 712. E are the coordinates of x relative to e1.

. . x) assumes the canonical form A (x. It is easy to check that in this case the matrix i.e. e. 112". . d2212 + = dnif d2f2 + + If the form A (x. 712*. ' f. 7/2. )73. 712. en in terms of ni. e2. e2. n* in terms of n. para. 712*.. e. e in terms of711. 712. x) is such that at no stage of the reduction process is there need to "create squares" or to change the numbering of the basis elements (cf. r ni. . for nj**. . e2 2n3. . E2. n take the " form c12n2 + " C22n2 + C2nnn the matrix of the coordinate transformation is a so called triangular matrix. Ej = cu.ìj in the etc. § 1) we can express the new basis vectors ei. n** in terms of ni*. we can express el. 6. then the expressions for El. . form = ciini + Cl2712 E2 = C21711 C22n2 Clnnn C2nnn $n = enini + cn2n2 + Thus in the example just given 1/1 + Cnnn. x) = _e. ¿3= In view of the fact that the matrix of a coordinate transformation is the inverse of the transpose of the matrix of the corresponding basis transformation (cf.n-DIMENSIONAL SPACES 45 then A (2c. in terms of the old basis vectors f2. = e2 = d21f1 6/12f2 + + + d2nf. the beginning of the description of the reduction process in this section).% e2 .2 e22 12e22 If we have the expressions for ni*. ni.

en so that (2) A (ei. f2. 2. a22 a1n di. x) be defined relative to the basis f1. f2. . fn. this time we shall find it necessary to impose certain restrictions on the form A (x. LI 2= at2 a21 mm. Now let the quadratic form A (x. Reduction of a quadratic form by means of a triangular transformation 1 In this section we shall describe another method of constructing a basis in which the quadratic form becomes a sum of squares. f by the equation (It is worth noting that this requirement is equivalent to the A (x. y) and the initial .. = A (fi. However. an2 requirement that in the method of reducing a quadratic form to a sum of squares described in § 5 the coefficients an . x) = I a1k. y) relative to the basis f1. 0 0. . . . fk).46 LECTURES ON LINEAR ALGEBRA of the corresponding basis transformation is also a triangular matrix: = e2 dfif1 d22f2. 2-1 where ai. a22*. . e2. ek) 0 for i k (i k --= 1. er. = a21 2n 0 am. n). Thus let a II be the matrix of the bilinear form A (x. be different from zero. = difi c/n2f2 + + § 6. fn. It is our aim to define vectors el.4. i. In contradistinction to the preceding section we shall express the vectors of the desired basis directly in terms of the vectors of the initial basis. . f2. etc. the following determinants are different from zero: ali (1) a11 412 a22 0. O. We assume that basis f1.

f2) = 1. ei) = O for . and to obviate the computational difficulties involved we adopt a different approach. + .k from the conditions (2) by substituting for each vector in (2) the expression for that vector in (3). We assert that conditions (4) determine the vector ek to within a constant multiplier. e2. Our problem then is to find coefficients . To fix this multiplier we add the condition A (ek.. . in view of the symmetry of the bilinear form. c22f2 + e= + We could now determine the coefficients . if we replace e. 2. 2.n-DIMENSIONAL SPACES 47 We shall seek these vectors in the form = (3) e2 c(21f1 22f2.x. f1) = 0 for every k and for all i < k. en is the required basis. We observe that if for i = 1. i. We claim that conditions (4) and (5) determine the vector ek . . Indeed.. fi) = 0 then (ek. Thus if A (ek.k I). Ctkk atk2f2 + + Mkkfk = 0. e1. such that the vector ek = satisfies the relations A (ek. then = ocii A (ek.e. fi) oci2A (ek.k I. A (ek. by ai1f1 oci2f2 ' then A (ek. (i = 1.212 + + 2(f1) + aA(ek. e1) = A (ek. f2) + A (e. However. OCk2. this scheme leads to equations of degree two in the cc°. = 0 for i < k and therefore. also for i > k. fi).

12A (f1-1.48 LECTURES ON LINEAR ALGEBRA uniquely. Substituting in (4) and (5) the expression for e. f1) 12A (f1. f2) A (f 2. Now A (ek. f2) ac12A (f2. we already know b11 It remains to find the coefficients bt of the quadratic form en just constructed.. fk) A (f2. Thus conditions (4) and (5) determine ek uniquely.-1 QC/0c = where A1_. ek).. f2) + + 11114 fek. 0. e2. f2) and is by assumption (1) different from zero so that the system (6) has a unique solution. f2) + ' 111A (f2' f1) (6) + kkA (f1. It therefore remains to compute b11 = A (ek.e. f = °. ek) = for i k. chkiA (f1. f2 A (f2. is a determinant of order k Ao = 1. + akkA (f2. fl) 12f2 + + °C1741) C(12A (elc. fk) A (fk. ek) = ockk' The number x11 can be found from the system (6) Namely. as asserted. fl) oc12A (fk. The basis of the ei is characterized by the fact that A (e1. we are led to the following linear system for the kg. f2) + + lickA (fk. The proof is immediate. f1) A (fi. ek) = A (ek. I analogous to (7) and . + akkA (LI. f A (fk. fk) I A (fk. by Cramer's rule. r which in view of (4) and (5) is the same as A (ek. A (x. ek). bin = 0 for i k. As A (ei. fl) (flt. x) relative to the basis e1. oc11f1 = C(11A (e1. fk) = °. f2) + acnA (f1-1. fk) 14 The determinant of this system is equal to A (fk. A 1. i. f 1. f1) A (f2.

en in which the quadratic form is expressed as a sum of squares does not mean that this basis is unique. 2. e2. Consider the quadratic form 2E1' + 3E1E2 + 4E1E3 + E22 + in three-dimensional space with basis f= (1.42 an An = an a12 a22 all ' an an a12 aln a2n (inn an2 be all different from zero Then there exists a basis el. by the equation A (x. 0). 0. =-.n-DIMENSIONAL SPACES 49 Thus blek = A . . REMARK: The fact that in the proof of the above theorem we were led to a definite basis el. = (0. x) is expressed as a sum of squares. f2. 0. f. f. 1. en. .(0. . e2. en need not have the form (3).e. f (or if one were simply to permute the vectors f1. f2. EXAMPLE. 0). Let A (x. e2. . . let the determinants . are the coordinates of x in the basis el. . fn) one would be led to another basis el. e2. it should be pointed out that the vectors el. e2. x) be a quadratic form defined relative to some basis f1. 1). if one were to start out with another basis . relative to which A (x. . In fact. Further. x) = l<=1 aiknink a ik = A (fi.11=a11. e) = A k-1 Ak To sum up: THEOREM 1. en. Also. A Here 4Ç. fk). f2. x) = 40 AI _AI A2 22 + A I.e A (x. fr. known as the method of Jacobi. This method of reducing a quadratic form to a sum of squares is . .

e. 0).' -33 and e3-1871 1 12ea + 117 fa _(S 127. 2M21 = 0. 2a = I. Or an 8f2 = (6. 12 133 8 -y. = (cc. Thus our theorem may be applied to the quadratic form at hand. 0). = (i. The coefficient cc. o r. 117). 0). (823.f. e.. --1-. 833 = 1.. e2. our quadratic form becomes A(x. 232. C C2 are the coordinates of the vector x in the basis e. = 83113 + 822f3 + ma. = 6f1 Finally. 43 are 2.2e1n1 pi. whence 831 = 0. 12) = 0. or o( i and e. Next a and 822 are determined from the equations A (e2./3 The determinants A.. are determined from the equations A (es. 0). none of them vanishes. 21ai 1832 + 2833 = jln + 832 28. e. is found from the condition A (e1. i. A (e3... = if.) = 0 and A (e2.f. 43. y) =. i. 839. + teh. e e . = 82313 a22f2 e. 833).e. ct. + $012 + 2e3m1 + e3 7.50 LECTURES ON LINEAR ALGEBRA The corresponding bilinear form is A (x. 0. Let el = ce. 822. 8. f2) 1. Here C. =(j 0. fi) = 0. f2) = 1. whence and e.x) = C12 + Ai C13 42 AH 43 C32 Cl2 8C22 11-7C32.&7. 1. Relative to the basis e. fa) = 1 A (ea. 121 = 6.

A. in which A (x. A2 > 0. so that the quadratic form is 1 . have the same sign then the coefficient and A. > 0. . A2 " 4 e22 2 An_. e2. then of E12 is positive and that if this coefficient is negative. A. (8) It is clear that if A1_1 and A. where all the A. A > O. x) 4E12 + 22E22 Anen2. > 0. Z12. In the next section we shall show that the number of positive and negative squares is independent of the method used in reducing the form to a sum of squares. The number of negative coefficients which appear in the canonical form (8) of a quadratic form is equal to the number of changes of sign in the sequence 1. 51 In proving Theorem I above we not only constructed a basis in which the given quadratic form is expressed as a sum of squares but we also obtained expressions for the coefficients that go with these squares. x) = I 21E12 1=1 0 for all x and is equivalent to E1= E2 = = En = O. Hence A (x. THEOREM 2. These coefficients are I A.n-DIMENSIONAL SPACES 2. Hence. is positive definite. In other words. /12 > 0. Actually all we have shown is how to compute the number of positive and negative squares for a particular mode of reducing a quadratic form to a sum of squares.. x) A (x. A. Then there exists a basis e1. . x) . . A1. e7. Assume that d. then the quadrat e form A (x. x) takes the form A (x. are positive. If A1> 0. have opposite signs.

. n). fz) + + yk. Ak _ A k-1 Since for a positive definite quadratic form all that all A. f.f. We shall show that then 4k> 0 (k A (f.a. We first disprove the possibility that A (fi. The fact that A. i. tl > O..) 1. y.) If A.f. x) in the -F form A (x. it follows THEOREM 3..e.) A (f fie) A (f. a basis of the n-dimensional space R. not all zero such that yiA(fi.f2) A (f.142. fi) ---. fi) A (f2. n) combined with Theorem 1 permits us to conclude that it is possible to express A (x. 2. Let A (x. then one of the rows in the above determinant would be a linear combination of the remaining rows. (k 1.) = O. 2. fi) . 1. y) be a symmetric bilinear form and f2. . . let A (x. For the quadratic form A (x. f. g212 + In view of the fact that pif. so that A (yifi p2f2 + ' + p.52 LECTURES ON LINEAR ALGEBRA Conversely. . x) = A2E12+ A2E22 A2En2.) 0. . 0.0 (i 1. 2. f. f. x) be a positive definite quadratic form. the latter equality is incompatible with the assumed positive definite nature of our form. f2) A(f2.A (fk..A (f2. This theorem is known as the Sylvester criterion for a quadrat c . > 0 (we recall that /10 = 1). We have thus proved 4> 0. it would be possible to find numbers y1. form to be positive definite. f2) A (f f. kikfk. p2f2 + -F p. fi) A (f. x) to be positive definite it is necessary and sufficient that > 0. But then A (pifi p2f2 -F + phf. 42 > 0. = 0. k). . k.

y) can be taken as an inner product in R. ek) (ek.p. e2) (el. . then A (x. The Gramm determinant of a system of vectors e1. ek are linearly dependent. e2) (ek. X). A2. y) A (x. e2. Indeed. then if we used as another basis the vectors f1. A2. f2. is known as the Gramm determinant of these vectors THEOREM 4. Now let A be a principal minor of jja1111 and let p 1. y).. x) relative to some basis are positive. e.. x) '(x. The results of this section are valid for quadratic forms A (x.e.e. x) relative to the new basis. .2. The Gramm determinant. Conversely.o/ a matrix ilaikllof a quadratic form A (x. x) is positive definite.. i. . The determinant el) e1) (el. i. Let e1. then all principal minors ok that matrix are positive. y) is a bilinear symmetric form on R such that A (x. (x. then A (x. Haik11.ti-DIMENSIONAL SPACES 53 It is clear that we could use an arbitrary basis of R to express the conditions for the positive definiteness of the form A (X. e2. . This implies the following interesting COROLLARY. e. x) is positive definite. are all positive. An would be different principal minors of the matrix the new A1. One consequence of this correspondence is that A (x. . y) is a symmetric bilinear form on a vector space R and A (x. if (x.. x). . In particular f in changed order. A. we see that A > O. I/the principal minors z1. x) every theorem concerning positive definite quadratic forms is at the same time a theorem about vectors in Euclidean space. be k vectors in some Euclidean space. If we permute the original basis k) and vectors so that the pith vector occupies the ith position (i 1. x). is always >_ O. be the numbers of the rows and columns of jaj in A. for quadratic forms A (x. 3. Thus every positive definite quadratic form on R may be identified with an inner product on R considered for pairs of equal vectors only. we may put (x. if 41. y) is an inner product on R. This determinant is zero if and only if the vectors el. ek) (e2. y) (x. then A (x. . ek) (ek. x) such that A (x. e2. . x) is positive definite. express the conditions for positive definiteness of A (x. e2) (e2. If A (x.4. x) derivable from inner products. A..

(x. are linearly independent. is a linear combination of the others and the determinant must vanish. Indeed. z is equal to the absolute value of the determinant xi 23 V In three-dimensional Euclidean space the volume of a parallelepiped 23 Ya 23 Ya 2. has indeed the asserted geometric meaning. Therefore. y 2'. In Euclidean three-space (or in the plane) the determinant 4. (x.e. where y) is the angle between x and y.z. e2. z. z28 + za2 . . i. y) (Y. (x. x and y As an example consider the Gramm determinant of two vectors = (x. discussed in this section (cf. where (x. Since A (x. x) = Ix1 lyr cos ry.y. is a linear combination of the others.x. say e. Assume that e1. + z. in that case one of the vectors. Then the Gramm determinant of Consider the bilinear form A (x. 1. x1 y1 1. + zax. e2. y). y) is a symmetric bilinear form such that A (x. Now. y.54 LECTURES ON LINEAR ALGEBRA el. y. 3/423 2. We shall show that the Gramm determinant of a system of Ale]. + y. x) > 0 is synonymous with the Schwarz EXAMPLES.. x) The assertion that inequality. ek is zero. = 1x18 13. e2. y) Proof.y.12 (1 cos2 99) = 1x12 13712 sin' 99. + yrz. . This completes the proof. has the following geometric sense: d2 is the square of the area of the parallelogram with sides x and y. 3/42 + 3/42 + x32 = Y1 X1 + Y2 x2 T Y3 X3 z. on the vectors x. z. e2.Y. y. J. y) is the inner product of X and y. (7)). y) = (y. Y) (Y. e. ek coincides with the determinant 4. vectors e1. = ixF2 ly12 2.y. 1x12 13712 cos' 9. + 22e2 ' + It follows that the last row in the Gramm determinant of the e.x.z. 3/4 Y3 Y12 + Ya' + 1I32 x121 + x222 -.. ek . linearly dependent vectors e1.. + z.z. x) is positive definite it follows from Theorem 3 that 47c >0. Indeed. Ya 23 where 3/4. are the Cartesian coordinates of x. + z.

z) (z. the yi are the coordinates of y in that basis. etc.e. (9) . Similarly. /1(012(t)dt Pba 122(t)dt 11(1)1k(i)di Pb 12(t)f1(t)dt f " a 12(t)1(t)dt rb a rb Pb tic(1)11(1)dt 1k(t)12(t)dt .) By analogy with the three-dimensional case. it is possible to show that the Gramm determinant of k vectors y) x. y. x) = 1-1 2. be even infinite-dimensional since our considerations involve only the subspace generated by the k vectors x. y) (3'. which a quadratic form A (x. For a system of functions to be linearly dependent it is necessary and sufficient that their Gramm determinant vanish. y. (It is clear that the space R need not be k-dimensional.1. y. z) (37. z is the square of the volume of the parallelepiped on these vectors. In the space of functions (Example 4.2 By replacing those basis vectors (in such a basis) which correspond to the non-zero A. the determinant (9) is referred to as the volume of the k-dimensional parallelepiped determined by w. . by vectors proportional to them we obtain a . w. § 7. 3. Y) (x.n-DIMENSIONAL SPACES 55 (x.1 . R may. w in a k-dimenional space R is the square of the determinant X1 Y1 22 Y2 " " Xfr Yk Wk W1 W2 where the xi are coordinates of x in some orthogonal basis. There are different bases relative to 1. y. z) Thus the Gramm determinant of three vectors x. indeed. § 2) the Gramm determinant takes the form rb I rb 10 (t)de "b .a f (t)dt and the theorem just proved implies that: The Gramm determinant of a system of functions is always 0. The law of inertia T he law of inertia . x) is a sum of squares. (1) A (x. the vectors x.

. ''2 = all a12 a22 z1n an an an an . e' were chosen.. suppose some other basis e'1. en. A. 2. . x) which. ZI2.(12. a2 a. then the number of positive coefficients as well as the number of negative coefficients is the same in both cases..e. .. as was shown in para. relative to some basis el. If a quadratic form is reduced by two different methods (i. 2. x) by means of a sum of squares in which the A. There arises the question of the connection (if any) between the number of changes of sign in the squences 1. . in two different bases) to a sum of squares. known as the law of inertia of quadratic torms. . are 0. e2. x) to a sum of squares by the method described in that section is equal to the number of changes of sign in the sequence 1. To illustrate the nature of the question consider a quadratic form A (x. z11.. .. 1. I. It is natural to ask whether the number of coefficients whose values are respectively 0. THEOREM 1. an are different from zero. . Then a certain matrix I ja'11 would take the place of I laikl and certain determinants would replace the determinants z11. all ). answers the question just raised. .. and I.56 LECTURES ON LINEAR ALGEBRA representation of A (x. e'2. The following theorem. Then. and lis dependent on the choice of basis or is solely dependent on the quadratic form A (x. ek) and all the determinants 41 = an. is represented by the matrix where a = A (ei. § 6. Now. in formula (1) are different from zero and the number of positive coefficients obtained after reduction of A (x. zl. z1'2. x). or 1. A1.. A'.

.) Let f. It remains to show that x O.e. basis of R".. in (1) are invariants of the quadratic form. . The vectors el. e be a basis in which the quadratic form ei A2e2 + A (x. in (1) and the number of negative A. and let k 1 > n. We can now prove Theorem 1.. e2. Proof: Let e. i. kt2. 2. q'.. " Ak. e.e. x) X 22 $2. ' It is clear that x is in R' n R".n-DIMENSIONAL SPACES 57 Theorem 1 states that the number of positive A. + erk. We first prove the following lemma: LEMMA. . A2. En are the coordinates of the vector x. + . ti are the coordinates of x relative to the basis . . 2. f2.. f2.. fm. i. is n. . . )72p..e. . f be another basis relative to which the n22 quadratic form becomes A (x. which vanish is also an invariant of the form.2e2 + + Akek p2f. . ._ . which is impossible. that this is false and that p > p'..±Q. + 52e2 + + $e -F -Fene. Let R' and R" be two subspaces of an n-dimensional space R of dimension k and 1.. . fi. + + pit = 0. E2. n2p. This means that there exist numbers pi not all zero such that Al.) We must show that p = p' and q =-. .. e. Ale 2. Since the total number of the A. f2. n2 . are linearly dependent (k 1 > n). it follows that the number of coefficients A. Al. Proof: Let e.. p would all be zero.2 $223+1 E2p+2 $2. f2. Assume ep. n2. e2. If x = 0. e2. f. Then there exists a vector x 0 contained in R' n R". x) = ni2 (Here )7. be a basis of R' and f. x) becomes A (x. f. say. Let R' be the subspace spanned by the vectors el. A2. (Here E. . Hence x O.. and pi.. e. e2. f.2e2 + + Atek = !IA Let us put + Akek = Pelf' /42f2 !lift = x. respectively.

number of non-zero coefficients 2. yi y.e. 0 for all x e R. .. But this means that y. i. The reasonableness of the above definition follows from the law of inertia just proved. f has dimension n p'. f. there exists a vector x 0 in R' n R" (cf.. . i. We shall now investigate the problem of actually finding the rank of a quadratic form. E2. e. Q nil+. are zero.. let y. x= and E. are E1. y. of all vectors y such that A (x. for. +2 = = nil+0. x) = . and 41 e R. Since n p>n (we assumed 1) > p'). f.) = 0 and A (x. are 0. y) = 0 for every x e R. Indeed. vanish) and. y) we mean the set R. on the other hand. . . = n.-+2 -c 0(Note that it is not possible to replace < in (5) with <. is a subspace of R. on the one hand. . Then A (x. A (x. e2.e 0 and its coordinates relative to the basis . y. The subspace R" spanned by the vectors fil. 0. x) = + $22 > (since not all the E. . y2) = 0 for all x e R. By the null space of a given bilinear form A (x. . A (x.) The resulting contradiction shows that fi = p'.. Similarly one can show that q = q'. It is easy to see that R. By the rank of a quadratic form we mean the 2. Substituting these coordinates in (2) and (3) respectively we get. DEFINITION 2. e Et. e R. To this end we shall define the rank of a quadratic form without recourse to its canonical form. in one of its canonical forms.n22. This completes the proof of the law of inertia of quadratic forms.. = O. +ee X = np fil + + nil-Fa' fil+qt + nnfn The coordinates of the vector x relative to the basis e.eil+1. Lemma).e. it is possible that nil+. A (x.) = 0 and A (x.±. y. while not all the numbers il. 0. Rank of a quadratic form DEFINITION 1. n.58 LECTURES ON LINEAR ALGEBRA R' has dimension p.

y) 0 to belong to the null space of A (x.[1 does depend on the choice of basis. nifi + nnf. the rank of the matrix in question is n ro. for i= 1. then for a vector Y= n2f2 + + nnf. 702 + n2f2 + A (fn.n-DIMENSIONAL SPACES 59 If f. y) is independent of the choice of basis in R (although the matrix la .. We shall now try to get a better insight into the space Ro.e) = aik. y) it suffices that n.) = Q /02 + + ni. A (ft. f. 71f1 + n2f2 + A (f2.1 0 O Ao 0 An . f.. )72. the above system goes over into + ainnn = 0.17. cf. where r is the rank of the matrix Ikza We can now argue that The rank of the matrix ra11 of the bilinear form A (x. f is a basis of R. are solutions of the above system of linear equations. anini + 12. ant). and the null space is completely independent of the choice of basis. the dimension of this subspace is n r.) = Q + mit.) = O. § 5). 2. a22n2 = O.202 + ' + Thus the null space R. . + an% = 0.. Indeed. As is well known. where ro is the dimension of the null space. . We shall now connect the rank of the matrix of a quadratic form with the rank of the quadratic form. We defined the rank of a quadratic form to be the number of (non-zero) squares in any of its canonical forms. But relative to a canonical basis the matrix of a quadratic form is diagonal [1. 2. Replacing y in (7) by (6) we obtain the following system of equations: A (f1.. consists of all vectors y whose coordinates 2h. If we put A (fi.
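A small Python/NumPy sketch of this invariance (the rank-2 form and the transition matrix below are arbitrary random examples): the rank of the matrix of a quadratic form is unchanged by a change of basis, and the null space has dimension n − r.

```python
import numpy as np

rng = np.random.default_rng(3)

L = rng.standard_normal((4, 2))
A = L @ L.T                                   # a symmetric matrix of rank 2 in 4 dimensions
C = rng.standard_normal((4, 4))               # transition matrix (invertible with probability 1)

r = np.linalg.matrix_rank(A)
print(r, np.linalg.matrix_rank(C.T @ A @ C))  # 2 2: the rank does not depend on the basis

# The null space {y : A(x, y) = 0 for all x} has dimension n - r.
print(A.shape[0] - r)                         # 2
```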

6 We could have obtained the same result by making use of the wellknown fact that the rank of a matrix is not changed if we multiply it by any non-singular matrix and by noting that the connection between two matrices st and . This rank is equal to the number of squares with non-zero multipliers in any canonical form of the quadratic form. We mentioned in § 1 that all of the results presented in that section apply to vector spaces over arbitrary fields and. the rank of the quadratic form. The matrices which represent a quadratic form in different coordinate systems all have the same rank r. Complex Euclidean vector spaces. x)i. (x. to find the rank of a quadratic form we must compute the rank of its matrix relative to an arbitrary basis. In addition to vector spaces over the field of real numbers. Thus. x) denotes the complex conjugate of (Y. i. 5 To sum up: THEOREM 2. y) = (y.e. It is therefore reasonable to discuss the contents of the preceding sections with this case in mind. to vector spaces over the field of complex numbers. the rank of the matrix associated with a quadratic form in any basis is the same as the rank of the quadratic form.60 LECTURES ON LINEAR ALGEBRA and its rank r is equal to the number of non-zero coefficients.41 which represent the same quadratic form relative to two different bases is .4 (6' non-singular.. vector spaces over the field of complex numbers will play a particularly important role in the sequel.e. a function which associates with every pair of vectors x and y a complex number (x. x) [(y. in particular. i.. . Complex n-dimensional space In the preceding sections we dealt essentially with vector spaces over the field of real numbers. Complex vector spaces. y) so that the following axioms hold: 1. Many of the results presented so far remain in force for vector spaces over arbitrary fields. By a complex Euclidean vector space we mean a complex vector space in which there is defined an inner product. § 8.---- . Since we have shown that the rank of the matrix of a quadratic form does not depend on the choice of basis.

x) --= . (ix. -1. y) = (x1. ix) (x. (x. x) i"(x. y2) = y) (x.x) x). Ic--1 . y) only if x = O. yi + Y2) = (Y1 + Y2. Y) = (Y. In particular. Also.e. 2. y).x2. y). EXAMPLES OF UNITARY SPACES. 2. y) (x. Indeed. But then (Ax. y. /.1(y. 2y) = il(x. ' ". x) would imply (x. (x. nn) are two elements of R.. y2). Let R be the set of n-tuples of complex numbers with the usual definitions of addition and multiplications by (complex) numbers. 2 and 4 for inner products in the form in which they are stated for real Euclidean vector spaces. If = (E1 E2 En ) and 2. y). we define (x. y) -H (x2. The set R of Example i above can be made into a unitary space by putting y) = aikeifh. (x.n-DIMENSIONAL SPACES 61 2(x. x) = (x. x). y) with y = tx would have different signs thus violating Axiom 4. Indeed. (x. x) and (y. Complex Euclidean vector spaces are referred to as unitary spaces.(x. x) = x) + (Y2. Y) -= $217/2 + We leave to the reader the verification of the fact that with the above definition of inner product R becomes a unitary space. the numbers (x. ). i. Axioms 1 and 2 imply that (x. + (x. 2y) = (2y. This is justified by the fact that in unitary spaces it is not possible to retain Axioms 1. x) is a non-negative real number which becomes zero (2x. Y2) Axiom 1 above differs from the corresponding Axiom 1 for a real Euclidean vector space. y).

en. If e. en is an orthonormal basis and x= $2e2 + + $e. not a real number. 3. . en are linearly independent. $2. e2. are given complex numbers satisfying the following two conditions: (a) a . y) = O. Since the inner product of two vectors is. $2e2 + $en. e2. 3. . b]. Let R be the set of complex valued functions of a real variable t defined and integrable on an interval [a. By the length of a vector x in a unitary space we shall mean the number \/(x. Axiom 4 implies that the length of a vector is non-negative and is equal to zero only if the vector is the zero vector.e. If e. It is easy to see that R becomes a unitary space if we put (f(t). e2. By an orthogonal basis in an n-dimensional unitary space we mean a set of n pairwise orthogonal non-zero vectors el. g(t)) = f(t)g(t) dt. En and takes on the value zero only if el = C2 = en = O.62 LECTURES ON LINEAR ALGEBRA where at. e2.e. e is an orthonormal basis and = + ee. = (Th azkez f. Two vectors x and y are said to be orthogonal if (x.. Y) = $2e2 + e2f/2 + + nnen + E71 (cf.>.) x = e. 0 for every n-tuple el. Example / in this section). . in general. . Orthogonal basis. x). y = %el n2e2 + are two vectors. . As in § 3 we prove that the vectors el. n2e2 + + nne.. Isomorphism of unitary spaces. we do not introduce the concept of angle between two vectors. i. that they form a basis. The existence of an orthogonal basis in an n-dimensional unitary space is demonstrated by means of a procedure analogous to the orthogonalization procedure described in § 3. then (x.

where $. y) A (2x. Bilinear and quadratic forms. in the case of complex vector spaces there is another and for us more important way of introducing these concepts. e. (x. and a linear function of the second le. and that every linear function of the second kind can be written in the form b2t.4 (x. y). ez) = SO that e2e2 + + ez) = e. are the coordinates of the vector x relative to the basis el. A (x. en and a. A (xl + x2. y) is a linear function of the first kind of x. In other words. et) + + e (e et). y) is a linear function of the second kind of y.nd if 2. A (x. e2. With the exception of positive definiteness all the concepts introduced in § 4 retain meaning for vector spaces over arbitrary fields and in particular for complex vector spaces. of the first kind can be written in the form f(x) = a1e.(ei. f (2x) = . f(Ax) = Af (x) . a2 + ame. = f(e./(x). y). Linear functions of the first and second kind. + + b&. A (x2. 1. Using the method of § 4 one can prove that every linear function 1. . f(x) = DEFINITION 1. y) is a bilinear form (function) of the vectors x and y if: for any fixed y.f(x + y) --f(x) +f(Y). W e shall say that A (x. y) = 2. et) + $2 (e2.).) = Et. for any fixed x. However. y) = A (xi. are constants. a. A complex valued function f defined on a complex space is said to be a linear function of the first kind if f(x + Y) =f(x) ±f(y). Using the method of § 3 we prove that all un tary spaces of dimension n are isomorphic. 4.1.n-DIMENSIONAL SPACES 63 t hen (x.

E2e2 + + $ne. y = n1e1 n2e2 + + linen) = A (elei i. If x and y have the representations Y= + n2e2 + x = 1e1 then A (X. y) relative to the basis . ej. k=1 viewed as a function of the vectors X El. y) = (x. y) considered as a function of the vectors x and y. ?he]. Yi) + A (X. + nmen Let en e2. Let A (x. y) = aik$. The connection between bilinear and quadratic forms in complex space is summed up in the following theorem: Every bilinear form is uniquely determined by its quadratic form. k1 $2e2 ' fle. Another example is the expression A (x.. . Y1. x) called a quadratic form (in complex space). The matrix IjaH with ai. Y2). en be a basis of an n-dimensional complex space. + 3.n. A (x. e. y) e2e2 + + enen.2) = A (X. . y) be a bilinear form.Fik i. A (X. We recall that in the case of real vector spaces an analogous statement § 4). One example of a bilinear form is the inner product in a unitary space A (x.64 LECTURES ON LINEAR ALGEBRA 2.. Ay) )7. y) we obtain a function A (x. If we put y = x in a bilinear form A (x. y). is called the matrix of the bilinear form A (x. //lei ?2e2 + ed7kA (ei.4 (x. 6 6 holds only for symmetric bilinear forms (cf. A (ei.

Y) so that. y) = a1kE111. A (x+iy. (IV) by 1. This concept is the analog of a symmetric bilinear form in a real Euclidean vector space. For a form to be Hermitian it is necessary and sufficient that its matrix laikl I relative to some basis satisfy the condition a Indeed. i. x) A (x y. 1.x iy) A (x y. NOTE If the matrix of a bilinear form satisfies the condition 7 Note that A (x. 1. y) is Hermitian. y)± A (y.. respectively. ek) A (ek. i. x iy)}.3) =M (x. x)-HiA (x. x)-HiA (y.n-DIMENSIONAL SPACES 65 Proof: Let A (x. y). A (xy. enable us to compute A (x. y). then a = A (ei. in particular. y)+A(y. iy) iA (x. (II). (III). if we multiply the 1. x)iA (x. if a = aki. y) = ±{A (x y. If we multiply equations (I). x y) iA(x iy. y) A (y. (IV) by 1. The four identities 7: A (x±y. if the form A (x. xy) = A (x. A bilinear form is called Hermitian if A (x. then A (x. i. and add the results it follows easily that A (x. Namely. y). DEFINITION 2. respectivly. equations (I). x) A (x. x iy)= A (x. x) A (y. y). A (xiy. (II). I ankei ---. y. y). x iy)}. y)± A (y. x)iA (y. . Conversely. x y) + iA (x iy. (III).A (y x). x) + A (x. Since the right side of (1) involves only the values of the quadratic form associated with the bilinear form under consideration our assertion is proved. x+y) = A (x. x iy) 1{A (x A (y. y) = A (y. x + y) iA (x iy. e1) d. we obtain similarly. y). x). i. x) + A (y. x±iy)=A(x. x + y) + iA (x iy. x) be a quadratic form and let x and y be two arbitrary vectors. A (x.

. axioms 1 through 3 for the inner product in a complex Euclidean space say in effect that (x. e2. then. if A (x. in particular. y) to be Hermitian it is necessar y and sufficient that A (x. A (x iy. a. x iy) are all real and it is easy to see from formulas (1) and (2) that A (x. where (x. x) be real for every vector x. y) be Hermitian. A (x iy. i. If. x) is real for al x. x) is a Hermitian quadratic form. then a complex Euclidean space can be defined as a complex A (x. The following result holds: For a bilinear form A (x. then the same must be true for the matrix of this form relative to any other basis. we call a quadratic form A (x. j 1. One example of a Hermitian quadratic form is the form A (x. y) relative to the basis e1. x + y).66 LECTURES ON LINEAR ALGEBRA a = dkì. then 4-4 = %)* seW . Indeed.. f2 and if f. en and I the matrix of A (x. If Al is the matrix of a bilinear form A (x. y) is a Hermitian bilinear form so that (x. = coe. let A (x. basis f1. The proof is a direct consequence of the fact just proved that for a bilinear form to be Hermitian it is necessary and sufficient that A (x. x) denotes the inner product of x with itself. -. If a bilinear form is Hermitian. but then a--=d relative to any other basis. x iy). so that the number A (x. x) = (x. . y) is a Hermitian bilinear form. f2.d relative to some basis implies that A (x. y) relative to the tt .e. A (x y. then the associated quadratic form is also called Hermitian. x) = A (x. Proof: Let the form A (x. as in § 4. x). x) is real. A (x + y. y) = A (y. In fact. Conversely. x). A quadratic form is Hermitian i f and only i f it is real valued. x). COROLLARY. Then A (x. xy). x) positive definite when for x 0. x) be real for all x. n). x) > 0 vector space with a positive definite Hermitian quadratic form. x). y) = (y.

) 0. This can be done for otherWe choose el so that A (el. One can prove the above by imitating the proof in § 5 of the analogous theorem in a real space. el) O.e. in view of formula (1). y) Now we select a vector e2 in the (n 1)-dimensional space Thu consisting of all vectors x for which A (e1. On the other hand. 5. . el) + E2(2A (e2. e. . en of R complex vector space R. then we choose in it some basis er÷. y) vector only). Reduction of a quadra ic form to a sum of squares THEOREM 1. Then there is a basis e1. x) be a Hermitian quadratic form in a . O. wise A (x. X) = is an arbitrary vector. e2) for i < k. ei) are real in view of the Hermitian . x) = 0 for all x and. e2) + where the numbers A (e. y) A (ei. The idea is to select in succession the vectors of the desired basis. It follows tha x= A (x. form a er+2. basis of R. relative to which the form in question is given by A (X. e2. If W. the Hermitian nature of the form A (x.. then A (ei. x) = 0 so that 0. ek) = 0 implies A (ei. en. e). We choose to give a version of the proof which emphasizes the geometry of the situation. This process is continued until we reach the O (Mr) may consist of the zero space Itffi in which A (x. + enen + EThfA (en. Let A (x. The proof is the same as the proof of the analogous fact in a real space. etc.-DIMENSIONAL SPACES 67 flc[ and tt'* Ilc*011 is the conjugate transpose of Here g' i.. Our construction implies A (e2. These vectors and the vectors el. e1.R. X) = + A2 EJ2 + + EJ where all the 2's are real.. A (x. c* e51. )=0 + E2e2 + for > k.

x) = 1$112 A2 1E21' + zl I ler2. . e2. Relative to such a basis the quadratic form is given by A (x. The law of inertia A2. Al. An THEOREM 2. then the determinants /1 4. . where A.. Reduction of a Hermitian quadratic form to a sum of squares by means of a triangular transformation. x) be a Hermitian . --. e1) and are thus real. among others.A (ei. be positive. x) + 22M2 + + 2E& = 41E112 + 221e2I2 + quadratic form in a complex vector space and e. The number of negative multipliers of the squares in the canonical form of a Hermitian quadratic form equals the number of changes of sign in the sequence . = 1. a. I. . =-. x) is Hermitian. . ek). quadratic form is reduced to the canonical form (3).. If a Hermitian quadratic form has canonical fo . are real. 4 are real. an where a..2 § 6. 7. To see this we recall that if a Hermitian .ea aln a2n A.68 LECTURES ON LINEAR ALGEBRA nature of the quadratic form. that the determinants /11 42. We assume that the determinants all a12 + 2nleta2 6. These formulas are identical with (3) and (6) of § 6. Prove directly that if the quadratic form A (x. then A (x. then the coefficients are equal to A (e1. are all different from zero. This implies. EXERCISE. Let A (x.A. e1) by ). basis. 42 --= au a12 a21 a2^ An = a22 a.a. Then just as in we can write down formulas for finding a basis relative to which the quadratic form is represented by a sum of squares. If we denote A (e. Just as in § 6 we find that for a Hermitian quadratic form to be positive definite it is necessary and sufficient that the determinants A2 ..

negative and zero coefficients is the same in both cases. The proof of this theorem is the same as the proof of the corre- sponding theorem in § 7. The concept of rank of a quadratic form introduced in § 7 for real spaces can be extended without change to complex spaces. .It-DIMENSIONAL SPACES 69 relative to two bases. then the number of positive.

say. This transformation is said to be linear if the following two conditions hold: A (x + x2) = A(x1) + A (x. Clearly. In many cases. Let us check condition 1. however. it is necessary to consider functions which associate points of a vector space with points of that same vector space. We associate with x in R its projection x' = Ax on the plane R'. and x. 2. then the mapping y = A(x) is called a transformation of the space R. both procedures yield the same vector.CHAPTER II Linear Transformations § 9. 1. Whenever there is no danger of confusion the symbol A (x) is replaced by the symbol Ax. A (dlx ) = (x). If with every vector x of a vector space R there is associated a (unique) vector y in R. It is again easy to see that conditions 1 and 2 hold. In the preceding chapter we stud- ied functions which associate numbers with points in an ndimensional vector space. DEFINITION I. then Ax stands for the vector into which x is taken by this rotation. and then rotating the sum. Fundamental definitions. Consider a rotation of three-dimensional Euclidean space R about an axis through the origin. If x is any vector in R. and then through the origin. 70 adding the results. Operations on linear transformations 1. EXAMPLES. The left side of 1 is the result of first adding x and x. The simplest functions of this type are linear transformations. It is easy to see that conditions 1 and 2 hold for this mapping. Let R' be a plane in the space R (of Example 1) passing . Linear transformations. The right side of 1 is the result of first rotating x.).

Consider the n-dimensional vector space of polynomials of degree n 1. where P'(t) is the derivative of P(t). Let liaikH be a (square) matrix. en) (ni. n2. then Af(t) is a continuous function and A is linear.). A (fi + /2) = Jo I. P 2 (t).] we associate the vector Y = Ax where e2 .) --= /If (r) dr f2er) tit = Afi 2AI Af2. n. 1]. With the vector x= &. Consider the vector space of n-tuples of real numbers. Jo f(r) dr Among linear transformations the following simple transforma- tions play a special role. If we put Af(t) = f(r) dr.LINEAR TRANSFORMATIONS 71 3. If we put AP(1) P1(1). 4. Consider the space of continuous funct ons f(t) defined on the interval [0. Indeed [P1 (t)Pa(t)i' [AP (t)]' a(t) AP' (t). The identity mapping E defined by the equation Ex x for all x.fi(r) dr A (. . Indeed. k=1 aike k This mapping is another instance of a linear transformation. 5. then A is a linear transformation.41(t) f2(T)] dr To .

every matriz determines a unique linear transformation given by means of the formulas (3).e every linear transformation A determines a unique matrix Maji and. . E2e2 + &. since x has a unique representation relative to the basis e1. g there exists a unique linear transformation A such that Ae. 2.. k = 1. so that A is indeed uniquely determined by the Ae. In fact. . (1)... be a1... . if . e. Connection between matrices and linear transformations. e2. n) form a matrix J1 = Haikl! to the basis e1. (i. The numbers ao. We shall show that Given n arbitrary vectors g1.. (2). . en be a basis of an n-dimensional vector space R and 2. a2. e2. This mapping is well defined. Ae2.72 LECTURES ON LINEAR ALGEBRA The null transformation 0 defined by the equation Ox = for all x. . uniquely. e2. e. Ae = g. To this end we consider the mapping A which associates with x = ele + es. e2. Let el. which we shall call the matrix of the linear transformation A relative . Ae2 = g2. Aek = aikei. It is easily seen + E2e2 + + E'e. It remains to prove the existence of A with the desired properties. let A denote a linear transformation on R.. e2. e. g. a. We have thus shown that relative to a given basis el. then Ax = A(eie.g2 that the mapping A is linear.e. conversely.. . i. Ae determine A x= e2e2 + + ene E2Ae2 is an arbitrary vector in R. the vector Ax = e. Now let the coordinates of g relative to the basis el. .e) = EiAel + -Hen Ae. g2. = g1.g. We first prove that the vectors Ae.

We choose the following basis in R: t2 3 = 1. the matrix which represents E relative to any basis is 00 It is easy to see that the null transformation is always represented by the matrix all of whose entries are zero. Then Ael = el. = e. Then Aei = e (1= 1.. 2. e. Ae. e'3. Let A be the differentiation transformation. i. e'a = e2 ea. EXAMPLES. e2. directed along the coordinate axes. Let R be the three-dimensional Euclidean space and A the linear transformation which projects every vector on the XY-plane. AP(t) = P'(t). Let R be the space of polynomials of degree n 1. relative to this basis the mapping A is represented by the matrix [1 0 0 0 1 0 o.. Let E be the identity mapping and e. e 2! e -= En (n I)! .e. where e'.. We choose as basis vectors of R unit vectors el. 1.. = e. e'2. 0 0 EXERCISE Find the matrix of the above transformation relative to the basis e'. i.e. ei2 = e2. e2. in R.e. n).LINEAR TRANSFORMATIONS 73 Linear transformations can thus be described by means of matrices and matrices are the analytical tools for the study of linear transformations on vector spaces. e any basis i. e2 = t.. Ae3 = 0.

or. Ae2 =e (n 1 Ae3 (2)' 2 t e Ae = tn-2 1) ! (n 2) ! Hence relative to our basis.)e. Now Ax = A (e. = ei(ael $2(a12e1 E2e2 + a21e2 + a22e2 + + Een) + anien) + an2e2) 5(aiei = (a111 a2e2 + a12e2 + ae) + + a$n)e. (5) k=1 ..74 LECTURES ON LINEAR ALGEBRA Then Ael = 1' = 0. + C122E2 + (aie. briefly. aln = arier n2 = aizEi tin --= an1$1 + az. + + nen. Let (4) x = $1e. (4') Ax = 121e1+ n2. $2e2 + + $nen. an22 + a12E2 + a22 E2 + + ct. . e2. el.e. + a ace le + anEn. en a basis in R and MakH the matrix which represents A relative to this basis. A is represented by the matrix 01 0 0 001 0 0 0 0 1 0 0 0 0 Let A be a linear transformation. We wish to express the coordinates ni of Ax by means of the coor- dinates ei of x. en. in v ew of (4'). + a2en)e2 ((file. Hence. a2 n.

An" = Am A". . in this case D" = O.i. D3P(t) P"(t). Cx2 The first equality follows from the definition of multiplication of transformations. e2. DEFINITION 2. the third from property 1 for A and the fourth from the definition of multiplication of transformations. Then D2P(t) = D(DP(t)) = (P'(t))/ P"(t). If E is the identity transformation and A is an arbitrary trans- formation. .)] = A (Bx. IP. we define A° = E. the second from property 1 for B. D P(t) = P' (t). by analogy with numbers. C (x. Let D be the differentiation operator. Clearly. 3 of this section and find the matrices of ll. Ex ERCISE. then it is easy to verify the relations AE = EA = A. it satisfies conditions 1 and 2 of Definition I. ABx. Next we define powers of a transformation A: A2 = A A. If C is the product of A and B. and. Se/ect in R of the above example a basis as in Example 3 of para. EXAMPLE. relative to this basis...e. if J]20represents a linear transformation A relative to some basis e1. etc. Likewise. 3. We shall now define addition and multiplication for linear transformations. Bx.) = ABx. Indeed. Addition and multiplication of linear transformations. Clearly. Let R be the space of polynomials of degree n 1. That C (2x) = . then transformation of the basis vectors involves the columns of lIctocH [formula (3)] and transformation of the coordinates of an arbitrary vector x involves the rows of Haikil [formula (5)]. we write C = AB. By the product of two linear transformations A and B we mean the transformation C defined by the equation Cx = A (Bx) for all x. A3 = A2 A. = Cx. The product of linear transformations is itself linear.ICx is proved just as easily. e. x2) -= A [B (x x.LINEAR TRANSFORMATIONS 75 Thus.

= A( J=1 bike]) == biAei Comparison of (7) and (6) yields cika15 blk. To answer this question We know that given a basis e1. = a ake. on the other hand. j j DEFINITION 3. . then. on the one hand. Ce. The matrix W with entries defined by (8) is called the product of the matrices S and M in this order. e2. A. czkei. e) and I C al! represents the sum C of A and B (relative to the same basis).ke C. By the sum of tzew linear transformations A and B we mean the transformation C defined by the equation Cx Ax Bx for all x. A. what is the matrix jcjj determined by the product C of A and B. It is easy to see that C is linear. = I b. If the transformation A determines the matrix jjaikj! and B the matrix 1lb j]. e every linear transformation determines a matrix. = ?Wei. If C is the sum of A and B we write C = A + B. + B. Thus. We see that the element c of the matrix W is the sum of the pro- ducts of the elements of the ith row of the matrix sit and the corresponding elements of the kth column of the matrix Re?. and. then their product is represented by the matrix j[c1! which is the product of the matrices Hai. If j jazkl j and Ilkkl I represent A and B respectively (relative to some basis e1.j1 and j I b. Further AB. we note that by definition of Hcrj C.76 LECTURES ON LINEAR ALGEBRA . e2. = IckeI. Let C be the sum of the transformations A and B.. if the (linear) transformation A is represented by the matrix I jaikj j and the (linear) transformation B by the matrix jjbjj. = (ac .

BC. we define the symbol P(A) by the P(A) = (Om + a. We could easily prove these equalities directly but this is unnec- essary.11 If P (t) aot'n then 2A is represented by the + + a. Thus by 2A we mean the transformation which associates with every vector x the vector il(Ax). is an arbitrary polynomial and A is a transformation. We now define the product of a number A and a linear transformation A. Thus A+B=B±A. It is clear that if A is represented by the matrix matrix rj2a2. 1 C(A B) = CA + CB. Since properties 1 through 4 are proved for matrices in a course in algebra. We recall that we have established the existence of a one-to-one correspondence between linear transformations and matrices which preserves sums and products.[1 and 111)0. 21. the iso- morphism between matrices and linear transformations just mentioned allows us to claim the validity of 1 through 4 for linear transformations.. Addition and multiplication of linear transformations have some of the properties usually associated vvith these operations. Thus the matrix of the sum of two linear transformations is the sum of the matrices associated with the summands. Consider the space R of functions defined and infinitely differentiable on an interval (a. . equation + a. b). f (A B)C = AC -1. A (BC) = (AB)C.LINEAR TRANSFORMATIONS 77 so that c=a The matrix b.16-1 + EXAMPLE. Let D be the linear mapping defined on R by the equation Df (t) = f(1). + bIl is called the sum of the matrices Ila.E.11. (A B) C = A + (B C).

then P(D) is the linear mapping which takes f (I) in R into P(D)f(t) = aor)(t) + a.. 0 2.' dm = 0 - - 0 0 0 P (0). a matrix of the form [A. EXAMPLE Let a be a diagonal matrix.^ - d2 it follows that = Oi At. As was already mentioned in § 1. We wish to find P(d). Find P(.78 LECTURES ON LINEAR ALGEBRA If P (t) is the polynomial P (t) = cior + airn-1+ + am. sin d. 0 2. i. Example 5.) EXERCISE. . Analogously. Now consider the following set of powers of some matrix sl .am f (t). ) 1 . e . i : 0 0 - P(2. etc. by means of the equation P(d) = arelm + a1stm-1 + + a. Since [AL.. 0 2.2 - 0 ' - 0 . f (m-1) (t) + -1.91) for 01 0 0 010 O 0 1 0 0 0 0 0 si = 0 0 0 0 000 o_ a It is possible to give reasonable definitions not only for polynomial in a matrix at but also for any function of a matrix d such as exp d...22 01 rim .. Hence any n2 + 1 matrices are linearly dependent...2" - 0] 0 . with P (t) as above and al a matrix we define a polynomial in a matrix. all matrices of order n with the usual definitions of addition and multiplication by a scalar form a vector space of dimension n2.e.2/ = 0 0 0 0 ).

This simple proof of the existence of a polynomial P (t) for which P(d ) = 0 is deficient in two respects. The elements of the kth column of sl-1 turn out to be the cofactors of the elements of the kth row of sit divided by the determinant of It is easy to see that d-1 as just defined satisfies equation (9).dn2 = 0.e. the matrix has rank n.a. then the inverse B of A takes Ax into x. In the sequel we shall prove that for every matrix sif there exists a polynomial P(t) of degree n derivable in a simple manner from sit and having the property P(si) = C. a2. i. We know that choice of a basis determines a one-to-one correspondence between linear transformations and matrices which preserves products. they must be linearly dependent. it does not tell us how to construct P (t) and it suggests that the degree of P (t) may be as high as n2. The inverse of A is usually denoted by A-1. The transformation B is said to be the inverse of 4. if A takes x into Ax. The definition implies that B(Ax) = x for all x..e. namely. Thus it is clear that the projection of vectors in three-dimensional Euclidean space on the KV-plane has no inverse. a1. (not all zero) such that + a.LINEAR TRANSFORMATIONS 79 Since the number of matrices is n2 -H 1. There is a close connection between the inverse of a transformation and the inverse of a matrix. where E is the identity mapping. that is. . there exist numbers a.. As is well-known for every matrix st with non-zero determinant there exists a matrix sil-1 such that (9) sisti af_id _ si-1 is called the inverse of sit To find se we must solve a system of linear equations equivalent to the matrix equation (9). Inverse transformation DEFINITION 4. It follows that a linear transformation A has an inverse if and only if its matrix relative to any basis has a nonzero determinant. i. A if AB = BA = E. . clog' Jr a1d + a2ia/2 + It follows that for every matrix of order n there exists a polynomial P of degree at most n2 such that P(s1) = C. Not every transformation possesses an inverse. A transformation which has an inverse is sometimes called non-singular.

. the dimension of R' is the same as the rank of the matrix ra11. i. y. is h is to say that the maximal number of linearly independent columns of the matrix Ila.. We now show how the matrix of a linear transformation changes under a change of basis. Ax.e.1H is h. every vector in R'. if y = Ax. The set of vectors Ax (x varies on R) forms a subspace R' of R. The dimension of R' equals the rank of the nzatrix of A relative to any basis e2. i. e R'. Ae. Ae2. (10) = f cei c22e2 + cei c.A (2x).e. Ay Ax. e2. . Hence R' is indeed a subspace of R. + C21e2 + + c02en. formulas (2) and (3) of para. Hence every vector Ax. Ae. = Ax. Let e1. e2. e R' and y. Likewise. .e... is a linear combination of the vectors Ae. e2. To say that the maximal number of linearly independent Ae.. Let A be a linear transformation on a space R. e... and y. . + If C is the linear transformation defined by the equations Cei =1. . then its matrix has rank < n. e .. then the other Ae.80 LECTURES ON LINEAR ALGEBRA If A is a singular transformation. = Ax. We shall prove that the rank of the matrix of a linear transformation is independent of the choice of basis. are linear combinations of the k vectors of such a maximal set. --F y. e2. . e is W (cf. f be two bases in R. Then y. e R'. THEOREM. Proof: Let y. i. The matrices which represent a linear transformation in different bases are usually different. More specifically. it is also a linear combination of the h vectors of a maximal set. = A (x.. 3). Let W be the matrix connecting the two bases. y. Ay e R'. Since every vector in R' is a linear combination of the vectors Ae. Hence the dimension of R' is h. 2. If the maximal number of linearly independent vectors among the Ae.. then the matrix of C relative to the basis e1.e. Let I represent A relative to the basis e. Ae.e. then 2Ax . . en and f. Connection between the matrices of a linear transformation relative to different bases. y. . i. let f. n). 5. f .e. i. Now any vector x is a linear combination of the vectors el. isk.

. and in that case it is not possible to restrict ourselves to R.4 of a transformation A relative to a basis f. The matrix 462 in (11) is the matrix of transition from the basis e. e and 11). e2. of course.(which exists in view of the linear independence of the fi) we get C-'ACe. 1. e.11 be the matrix of A relative to e1.. e2. = basis e1. e2. only. . . may be mapped on points not in R. We wish to express the matrix . alone. so that = To sum up: Formula (11) gives the connection between the matrix . Eigenvalues and eigenvectors of a linear transformation Invariant subs paces.R in terms of the matrices si and W. of R we may.!1 its matrix relative to f1. . § 10. Aek = tt (10') (10") Afk = i=1 bat. Premultiplying both sides of this equation by C-1. In the case of a scalar valued function defined on a vector space R but of interest only on a subspace 12. Not so in the case of linear transformations. e2. f and the . f2. . f. = Ce. Here points in R. ft. f2. en to the basis f1. ... matrix <91 which represents A relative to the basis e. relative to a given basis matrix (C-1AC) = matrix (C-9 matrix (A) matrix (C). f2. (11) bzker It follows that the matrix jbikl represents C-'AC relative to the . . However.LINEAR TRANSFORMATIONS 81 a Let sit = Ilai. To this end we rewrite (10") as ACe. In other words. consider the function on the subspace R. e. (formula (10)).. Invariant subspaces.

Let R be a plane. Show that R in Example 3 contains no other subspaces invariant under A. e. Show that if A.. In this case the coordinate axes are onedimensional invariant subspaces. and e. of course. = A. The invariant subspaces are: the axis of rotation (a one-dimensional invariant subspace) and the plane through the origin and perpendicular to the axis of rotation (a two-dimensional invariant subspace). The set of polynomials of degree subspace. Let R be three-dimensional Euclidean space and A a rotation about an axis through the origin. Let A be a linear transformation on R whose matrix relative to some basis el. EXAMPLES. only. then A is a similarity transformation with coefficient A. along the y-axis. then the coordinate axes are the only invariant one-dimensional subspaces. implies Ax e R If a subspace R1 is invariant under a linear transformation A we may.e. Let A be a stretching by a factor A1 along the x-axis and by a factor A. A is the mapping which takes the vector z = e. Let A be a linear transformation on a space R. i. If A.2.82 LECTURES ON LINEAR ALGEBRA DEFIN/TION 1. AP (t) --= P' (1). Trivial examples of invariant subspaces are the subspace consisting of the zero element only and the whole space. In this case every line through the origin is an invariant subspace.k. en is of the form . is an invariant EXERCISE. = A. i. Let R be any n-dimensional vector space. 1. of R is called invariant under A if x e R. + A22e2 (here e. Let R be the space of polynomials of degree n I and A the differentiation operator on R. e2. consider A on R. A subs pace R. into the vector Az = Ai ei e. /1.<n-1.. EXERCISE. .e. are unit vectors along the coordinate axes).

en be a basis in R. then the vectors ax form a onedimensional invariant subspace.. Then R. is represented by some matrix IctikrI Let x = elei E2e. THEOREM 1.. vector Ax are given by The proof holds for a vector space over any algebraically closed field since it makes use only of the fact that equation (2) has a solution. Let R1 be a one-dimensional subspace generated by some vector O. 2. If = invariant under A. n2. consists of all vectors of the form ax. . + + Ee .1 a ek is (1 _< In this case the subspace generated by the vectors e . n of the be any vector in R. It is clear x that for R1 to be invariant it is necessary and sufficient that the vector Ax be in R1. then the subspace generated by ek-Flp e1+2 en would also be Eigenvectors and eigenvalues. Thus if x is an eigenvector. A vector x is called an eigenvector of A. all non-zero vectors of a one-dimensional invariant subspace are eigenvectors. i. then A has at least one eigenvector. The proof is left to the reader. that Ax = 2x.e. Conversely. The number A is called an eigenvalue of A. 0 satisfying the relation Ax Ax DEFINITION 2. invariant under A. e2.+1 ' ' a.LINEAR TRANsFormarioNs 83 an ' avc all a17. e2. If A is a linear transformation on a complex i space R. In the sequel one-dimensional invariant subspaces will play a special role. Relative to this basis A Proof: Let e1. ak+in a1+11+1 0 O a1. = k). Then the coordinates ni.

.. . ¿. e not all zero satisfying the system (1). Such a system has a non-trivial solution ¿. ¿n(0).. . + a2¿. For the system (1) to have a non-trivial solution ¿1. . a12 a22 A ai a. . i. If we put xon Elm) $2(0) e2 . para. + ctE which expresses the condition for x to be an eigenvector. .(0). E2(0).e.0. (an Ei A)ei (a22 an$. that A an an. The equation Ax = Ax. in place of A... Eno) en. 0. a2 aA This polynomial equation of degree n in A has at least one (in general complex) root A. = an1e1+ ct.. + a2¿. a2$2 + + (anTh O. + + a1--. .. Thus to prove the theorem we must show that there exists a number A and a set of numbers ¿I). + A)E. With A.. + a2.. = A2 an11 Or an2$2 + + ae--= A¿. is equiv- alent to the system of equations: a111 a2151 a12$2 + a22 + + ainE=-. 3 of § 9).. then Axo) = Aoco). $2.A¿.. ani¿i it is necessary and sufficient that its determinant vanish.84 LECTURES ON LINEAR ALGEBRA = /12 1111E1 + a122 + 4 a22 e2 + = a2111 ' al. (1) becomes a homogeneous system of linear equations with zero determinant.i02+ (Cf.

Relative to the basis e1. We may thus speak of the characteristic polynomial of the transformation A rather than the characteristic polynomial of the matrix of the transformation A. we can claim that every invariant subspace contains at least one eigenvector of A. Ae. e2. en the matrix of A is o o Lo o 22. Let A be such a transformation and e1. This completes the proof of the theorem. . namely. e (i = 1. . it follows that the roots of the characteristic polynomial do not depend on the choice of basis. The proof of our theorem shows that the roots of the characteristic polynomial are eigenvalues of the transformation A and. We thus have THEOREM 2.LINEAR TRANSFORMATIONS 85 i._ Such a matrix is called a diagonal matrix. If a linear transformation A has n linearly independent eigenvectors then these vectors form a basis in which A is represent- ed by a diagonal matrix. NOTE: Since the proof remains valid when A is restricted to any subspace invariant under A. Conversely. It is a priori conceivable that the multiplicity of the roots varies with the basis.. i. the simplest linear transformations. conversely. e2. The polynomial on the left side of (2) is called the character stic polynomial of the matrix of A and equation (2) the characteristic equation of that matrix. Linear transformations with n linearly independent eigenvectors are.. 3. if A is represented in some 2 The fact that the roots of the characteristic polynomial do not depend on the choice of basis does not by itself imply that the polynomial itself is independent of the choice of basis. . In the sequel we shall prove a stronger result 2. .e. in a way. n).e. Since the eigenvalues of a transformation are defined without reference to a basis. 2. 30°) is an eigenvector and 2 an eigenvalue of A. that the characteristic polynomial is itself independent of the choice of basis. en its linearly independent eigenvectors. the eigenvalues of A are roots of the characteristic polynomial.

then the matrix of A is diagonable. (3) Apply ng A to both sides of equation (3) we get A (al ek + x2e2 + Or + a. are linearly independent. We lead up to this case by observing that . For instance. a root Ak of the characteristic equation determines at least one eigenvector. e2. For k = 1 this assertion is obviously true. e. ek are eigenvectors of a transformation A and the corresponding eigenvalues 2. The following result is a direct consequence of our observation: 2k)e1 + 12(22 ' 2k)e2+ ' 1k--1(1ki If the characteristic polynomial of a transformation A has n distinct roots.86 LECTURES ON LINEAR ALGEBRA basis by a diagonal matrix. If e1. en is diagonal. We assume its validity for k 1 vectors and prove it for the case of k vectors. If our assertion were false in the case of k vectors. . the transformation A which associates with every polynomial of degree < n 1 its derivative has only one eigenvalue A = 0 and (to within a constant multiplier) one eigenvector P = constant.) = 0. . then the number of linearly independent eigenvectors may be less than n. If the characteristic polynomial has multiple roots. 0( . e2. Indeed. with al 0 0. The matrix of A relative to the basis e1. A ' . Since the A. e2. For if P (t) is a polynomial of . A. 0 0 (by assumption Ak for i k). This contradicts the assumed linear independence of e1. a. . then e1.. it follows by the result just obtained that A has n linearly independent eigenvectors e1. e2. NOTE: There is one important case in which a linear transforma- tion is certain to have n linearly independent eigenvectors. e. then the vectors of this basis are eigenvalues of A. are supposed distinct. such that ei a2 e2 e. e. then there would exist k numbers ai . 1121e1 1222e2 Subtracting from this equation equation (3) multiplied by A. we are led to the relation Ak)eki = with 21 2. e2. ock2kek = O. are distinct. say. = O.

Find the characteristic polynomial of the matrix an_. Characteristic fiolynomial. In fact.a. 0 I 0 A.In para..LINEAR TRANSFORMATIONS 87 degree k > 0. represent A relative to two bases then But Ati Ir-11 1st Ael This proves our contention. if si and %'-'sn' for some W. 010 1 0 0 0 0 1 0 0 0 Solution: (-1)"(A" a11^-1 a2A^-2 a ). It follows that regardless of the choice of basis the matrix of A is not diagonal. . it is independent of the choice of basis. We shall now find an explicit expression for the characteristic polynomial in terms of the entries in some representation sal of A. The problem of the "simplest" matrix representation of an arbitrary linear trans- formation is discussed in chapter III. Find the characteristic polynomial of the matrix A. EXERCISES. We shall prove in chapter III that if A is a root of multiplicity m of the characteristic polynomial of a transformation then the maximal number of linearly independent eigenvectors correspond- ing to A is m. 2 we defined the characteris- tic polynomial of the matrix si of a linear transformation A as the determinant of the matrix si Ae and mentioned the fact that this polynomial is determined by the linear transformation A alone. i. linear transformations which in some bases can be represented by diagonal matrices). a. Hence we can speak of the characteristic polynomial of a linear transformation (rather than the characteristic polynomial of the matrix of a linear transformation). In the sequel (§§ 12 and 13) we discuss a few classes of diagonable linear transformations (i. 0 0 0 0 0 1Ao 0 0 1 A 2.e. an. a. 4. then P'(t) is a poly-nomial of degree k 1.e. as asserted. Hence P'(t) = AP(t) implies A -= 0 and P(t) = constant.. 1.

a2 1b7. p2. etc. p. In the case at hand a = e and the determinants which add up to the coefficient of (A') are the principal minors of order n k of the matrix Ha. Finally. is the sum of the diagonal elements of sí.11.2 a 2b and can (by the addition theorem on determinants) be written as the sum of determinants. the characteristic polynomial P(2) of the matrix si has the form ( 1)4 (An fi12n-1 P22"' ' where p. Q (A) ¡ . To compute the eigenvectors of a linear transformation we must know its eigenvalues and this necessitates the solution of a poly- nomial equation of degree n. is the sum of the diagonal entries of si p2 the sum of the principal minors of order two. Thus. p are independent of the particular representation a of the transformation A. are of particular importance. an Q(A) = Abu Abn an a22 /1. The coefficients p and p. 21)22 a2 2b2 Ab.1)12 al ' b1. The free term of Q(A) is all an a.. multiplicity. The coefficient of ( A)'' in the expression for Q(A) is the sum of determinants obtained by replacing in (4) any k columns of the matrix by the corresponding columns of the matrix II b1...1 an an a.. is the determinant of si P(A) We wish to emphasize the fact that the coefficients pi.88 LECTURES ON LINEAR ALGEBRA We begin by computing a more general polynomial. a. p is the determinant of the matrix si and pi.11. In one important case the roots of . namely. This is another way of saying that the characteristic polynomial is independent of the particular representation . si of A. where a and are two arbitrary matrices. The sum of the diagonal elements of sal is called its It is clear that the trace of a matrix is the sum of all the roots of its characteristic polynomial each taken with its proper trace..

. We now show that the characteristic polynomial is just such a polynomial. Let the polynomial P(A) = ad. Now (6) and (7) yield the equations w. (We note that this lemma is an extension of the theorem of Bezout to polynomials with matrix coefficients. . If the matrix of a transformation A is triangular. We conclude with a discussion of an interesting property of the characteristic polynomial._. V0 ==(Lot.-2 (d?. 3 of § 9. e ' + CA) = W01?-1 Then P(. As 'vas pointed out in para.' A) then the eigenvalues of A are the numbers an. First we prove the following LEMMA 1.12)%v) where ?(A) is a polynomial in A with matrix coefficie s. 2) fan a.. The proof is obvious since the characteristic polynomial of the matrix (5) is P(2) = and its roots are an. EXERCISE. ann. an.09) = C._2 e.LINEAR TRANSFORMATIONS 89 the characteristic polynomial can be read off from the matrix representing the transformation.e. i. am 0 al a._3= a.-2) + WoAm. for every matrix a/ there exists a polynomial P(t) such that P(d) is the zero matrix. = a. if it has the form an O a12 a22 al. a22. a of the matrix (5). a' if a a. W.) Prool: We have Ae)r(A) sér. ." + + + am and the matrix se be connected by the relation P(A)g = (se --. namely. Find the eigenvectors corresponding to the eigenvalues an. + (sn. a22.-3 am e. (5) 0 0 a 2) (an. e.

Hence (. If we multiply the first of these equations on the left by t. n conclude on the basis of our lemma that P ( 31) = C. 3. In 3 In algebra the theorem of Bezout is proved by direct substitution of A in (6). Proof: Consider the inverse of the matrix d At. I in A. In fact. the characteristic polynomial of S. the kth equation in (8) is obtained by equating the coefficients of A* in (6). § 9). 0 A where all the A.e. then there exists no polynomial Q(A) of degree less than n such that Q(. and P(.e)w(A) = P(A)e. ' 0 on the left. d + Thus P( 31) = 0 and our lemma is proved 3. dm on the right. Find a polynomial P(t) of lowest degree for which P(d) = 0 (cf. P(A) where 5 (A) is the matrix of the cofactors of the elements of a/ At and P(A) the determinant of d ite. we get + a.0) = 0 (cf._. then P(d) = O. the exercise below).90 LECTURES ON LINEAR ALGEBRA the third by Sr. 0 A. Connection between transfornudions and bilinear forms in Euclidean space.i. . are distinct. As is well known.29 .. we are doing essentially the same thing. Subsequent multiplication by St and addition of the resulting equations is tantamount to the substitution of al in place of A. para. ' the last by dm and add the resulting equations. We have A t)(d A t)-1 e. We note that if the characteristic polynomial of the matrix d has no multiple roots. Here this is not an admissible procedure since A is a number and a' is a matrix. VVe have considered under separate headings linear transfornaations and bilinear forrns on vector spaces. we This completes the proof. However.4) = a. the inverse matrix can be written in the form AS)-/ = 1 If P(A) is the characteristic polynomial of Al. § II. EXERCISE. Let d be a diagonal matrix Oi =[A. the second by al. THEOREM 3. t + a. Since the elements of IS(A) are polynomials of degree . i. The adjodnt of a linear hmasforrnation 1.

LINEAR TRANSFORMATIONS 91 the case of Euclidean spaces there exists a close connection between bilinear forms and linear transformations 4. such correspondence would be without significance. upon change of basis. + an2$.. e be an orthonormal basis in R. We shall denote this linear transformation 4 Relative to a given basis both linear transformations and bilinear forms are given by matrices. In fact. Let R be a complex Euclidean space and let A (x. y). the linear transformation is represented by Se-1 are (cf. Here re is the transpose of rer The careful reader will notice that the correspondence between bilinear forms and linear transformations in Euclidean space considered below associates bilinear forms and linear transformations whose matrices relative to an orthonormal basis are transposes of one another. = a2ne2 + a$. This correspondence is shown to be independent of the choice of basis. y) = anEi Th. . If + eandy = n1e1 212e2 + +mien. y) be a bilinear form on R. then. To this end we rewrite it as follows: A (x. (1) a151 î72 + -1. However. One could therefore try to associate with a given linear transformation the bilinear form determined by the same matrix as the transformation in question.. if a linear transformation and a bilinear form are represented relative to some basis by a matrix at. + ¿2e2 + x then A (x.afiin + 42nE217n + 421E2771 + 422E2772 + ani En an2Eni12 + + We shall now try to represent the above expression as an inner product.2 j2 a2e2 + + anE)77. e2. y) can be written in the form A (x. Now we introduce the vector z with coordinates = aide]. -F a21 52 ' C2 = a12e1 + 422E2 + + ani$. § 4). It is clear that z is obtained by applying to x a linear transforma- tion whose matrix is the transpose of the matrix Haikil of the bilinear form A (x. Let el. . § 9) and the bilinear form is represented by raw (cf. y) (6711E1 + an 52 + (171251 + 422E2 + (a25e1 + an' 52)771 -F a.

let A (x. The equation A (x. y) = for all y. a bilinear form A (x. y). y) = (2Ax. We now show that the bilinear form A (x y) determines the transformation A uniquely. y).). y) + (Ax. We can now sum up our results in the following THEonEm (2) 1. The bilinearity of A (x. Y)- Thus. y) (Ax. Y) = (Ax. we shall put z = Ax. y) = 2(Ax. y) establishes a one-to-one correspondence between bilinear forms and linear transformations on a Euclidean vector space. Then (Ax. y) = (Ax. A (y + y2)) = (x. The converse of this proposition is also true. namely: A linear transformation A on a Euclidean vector space determines a bilinear form A (x. y) (Ax. But this means that Ax Ex = 0 for all x Hence Ax = Ex for all x. Ax2. y) (Mx. + AY2) = (x. pAy) = /2(x. Then A (x. y) is easily proved: . (x. (x. Ay. Bx. y). y) = Eh CS72 d- + Cn Tin = (z. y) (A (x..92 LECTURES ON LINEAR ALGEBRA by the letter A. y) defined by the relation A (x. y) = (Ax. y) = (Bx. AY1) (x. Ay). which is the same as saying that A = B. y).). This proves the uniqueness assertion. y) = (Ax. An) = (x. y). Ay.. . y) and A (x. y) = (Ax. y) on Euclidean vector space determines a linear transformation A such that A (x. y) (Ax (Bx.. (Ax. Thus. i.e. x.

This representation is obtained by rewriting formula (1) above in the following manner: A (x. A*y). (2) implies its independence from choice of basis. y). My). The transformation A* defined by (Ax. Namely. Let A be a linear transformation on a complex Euclidean space. y) = 2(4121171 6/12172 + + am /7n) a22 772 + + a2n77) $ n(an1Fn = + $7. y) = (x. A*y).(c1. Proof: According to Theorem 1 of this section every linear transformation determines a unique bilinear form A (x. 2.] of A* and the matrix I laiklt of A are connected by the relation a*.z2n2 + + d nom) = (x. THEOREM 2. = dn. A*y) is called the adjoint of A. There is another way of establishing a connection between bilinear forms and linear transformations. Relative to an orthogonal basis the matrix la*. y) (Ax. every bilinear form can be uniquely represented as (x. Transition from A to its adjoint (the operation *) DEFINITION 1. y) (x. every bilinear form can be represented as A (x. In a Euclidean space there is a one-to-one correspond- ence between linear transformations and their adjoints. by the result stated in the conclusion of para.) + a2272 + d. On the other hand. Hence .LINEAR TRANSFORMATIONS 93 The one-oneness of the correspondence established by eq.21% a2772 + + d12n2 + + + din) + a2nn. For a non-orthogonal basis the connection between the two matrices is more complicated. 1.

B*A*y). (Ax. If we compare the right sides of the last two equations and recall that a linear transformation is uniquely determined by the corresponding bilinear form we conclude that (AB)* . Cy). (A*)* = A. By the definition of A*. (ABx. (A*)* = A. The operation * is to some extent the analog of the operation of . A. Interchange of x and y gives (Cx. Prove properties 3 through 5 of the operation *. Denote A* by C. On the other hand.e. (AB)* y). (2A)* = a*. x). y) -= (x. Ax) = (Cy. A* 3). y) = (x. Prove properties 1 through 5 of the operation * by making use of the connection between the matrices of A and A* relative to an orthogonal 2. Self-adjoint. unitary and normal linear transformations. y) = (Bx. Ay). y) = (x. the definition of (AB)* implies (ABx. Some of the basic properties of the operation * are (AB)* = B*A*. y) = (x. A*y). i.94 LECTURES ON LINEAR ALGEBRA (Ax. y) = A (x. E* = E. basis. whence (y.= B* A*. The connection between the matrices of A and A* relative to an orthogonal matrix was discussed above. y) = (x. (A + B)* = A* + B*. We give proofs of properties 1 and 2. A*y) = (x. But this means that C* EXERC/SES. 1.. Then (Ax.

y) is Hermitian is to say that (Ax. = (A + A*)/ 2 and A2 A = A. Similarly. 7-. let A. Then (A + A*)* 2 (A + A*)* = + (A* + A**) + (A* + A) = A1. Indeed. A linear transformation is called self-adjoint (Hermitian) if A* = A. to say that the form (Ax. Clearly. This class is introduced by DEFINITION 2. are self-adjoint. the two operations are the same. y) = (x. a.. and A. Every complex number is representable in the form = a + iß.e. 2i 1 A* 2 /A AT 2i (A A*)* = A) = A2. In fact. Every linear transformation A can be written as a sum A= iA.. i. We now show that for a linear transformation A to be self-adjoint it is necessary and sufficient that the bilinear form (Ax. x). This analogy is not accidental. Indeed. Again. (3) where Al and A. A.e.. A* A**) = 2i (A* i. for complex numbers.LINEAR TRANSFORMATIONS 95 conjugation which takes a complex number a into the complex number et. and A. y) be Hermitian. to say that A is self-adjoint is to say that (Ax. Ay). are self-adjoint transformations. /3 real. This brings out the analogy between real numbers and selfadjoint transformations. + iA. The real numbers are those complex numbers for which Cc = The class of linear transformations which are the analogs of the real numbers is of great importance. . equations (a) and (b) are equivalent. it is clear that for matrices of order one over the field of complex numbers.* (A A*)/2i. y) = (Ay.

This proves the theorem. not self-adjoint. 5 In other words for a unitary transformations U. Prove that a linear combination with real coefficients of self-adjoint transformations is again self-adjoint. LECTURES ON LINEAR ALGEBRA I. Now. i (AB EXERCISE. A linear transformation U is called unitary if UU* = U*15 = E. The analog of complex numbers of absolute value one are unitary transformations. Hence (4) is equivalent to the equation AB = BA. AU is again statements. 2. = Ul. 5 In n-dimensional spaces TIE* = E and 1:*ti = E are equivalent . (AB)* = B*A* = BA. in general. different from A*A. However: THEOREM 3. Proof: We know that A* = A and B* = B. NOTE: In contradistinction to complex numbers AA* is. Show that if A and B are self-adjoint. Show that if 15 is unitary and A self-adjoint. In § 13 we shall become familiar with a very simple geometric interpretation of unitary transformations. then AB + BA and BA) are also self-adjoint. The product of two self-adjoint transformations is. 1. This is not the case in infinite dimensional spaces. Prove that if A is an arbitrary linear transformation then AA* and A*A are self-adjoint. We wish to find a condition which is necessary and sufficient for (4) (AB)* = AB. EXERCISES. Show that the product of two unitary transformations is a unitary transformation. then self-adjoint.96 EXERCISES. Prove the uniqueness of the representation (3) of A. DEFINITION 3. For the product AB of two self-adjoint transforma- tions A and B to be self-adjoint it is necessary and sufficient that A and B commute. in general.

It is easy to see that unitary transformations and self-adjoint transformations are normal. Self-adjoint (Hermitian) transformations. i. (Ax.e. (Self-adjoint transformations on infinite dimensional space play an important role in quantum mechanics.LINEAR TRANSFORMATIONS 97 In the sequel (§ 15) we shall prove that every linear transformation can be written as the product of a self-adjoint transformation and a unitary transformation. The eigenvalues of a self-adjoint transformation are real.x. Ax Since A* -= A. These transformations are frequently encountered in different applications. x) = (x. Ax). that is. There is no need to introduce an analogous concept in the field of complex numbers since multiplication of complex numbers is commutative.. § 12.) LEMMA 1. This result can be regarded as a generalization of the result on the trigonometric form of a complex number. A linear transformation A is called normal if AA* = A* A. The subsequent sections of this chapter are devoted to a more detailed study of the various classes of linear transformations just introduced. Simultaneous reduction of a pair of quadratic forms to a sum of squares 1. In the course of this study we shall become familiar with very simple geometric characterizations of these classes of transformations. Proof: Let x be an eigenvector of a self-adjoint transformation A and let A be the eigenvalue corresponding to x. (2x. A. Ax). This section is devoted to a more detailed study of self-adjoint transformations on n-dimensional Euclidean space. DEFINITION 4. x) = (x. . x O. Self-adjoint transformations.

By Lemma 1. Ae) THEOREM 1. Proof: According to Theorem 1. en. The totality R. etc. e) = 0. we can select the vectors e. The corresponding eigenvalues of A are all real. We have to show that Ax e R1. In this manner we obtain n pairwise orthogonal eigenvectors e1. there exists an eigenvector e. LEMMA 2. Let A be a linear transformation on an n-dimensional Euclidean space R. which proves that A is real. only. § 10. form an (n 2)-dimensional invariant subspace R2. Since the product of an eigenvector by any non-zero number is again an eigenvector. 0. Necessity: Let A be self-adjoint. x). it follows that A = 1. In R. Let A be a self-adjoint transformation on an n-dimensional Euclidean vector space R and let e be an eigenvector of A. (Ax. orthogonal to e. the corresponding eigenvalues are real. 2e) = 2(x. e) = (x. By Lemma 2. This means that (x. Then there exist n pairwise orthogonal eigenvectors of A. x) Proof: The totality R1 of vectors x orthogonal to e form an (n 1)-dimensional subspace of R. that each of them is of length one. Let x e R. note to Theorem 1. there exists a vector e2 which is an eigenvector of A (cf. e) = O. is invariant under A. . THEOREM 2.98 LECTURES ON LINEAR ALGEBRA Or. Let A be a self-adjoint transformation on an n- dimensional Euclidean space. For A to be self-adjoint it is necessary and sufficient that there exists an orthogonal basis relative to which the matrix of A is diagonal and real. This proves Theorem 1. Since (x. Indeed. In R. x) = 71(x. of A. that is. § 10). e) = 0. (Ax. 2(x. the totality of vectors orthogonal to e. e2. form an (n 1)-dimensional invariant subspace We now consider our transformation A on R. there exists at least one eigenvector el of A. A*e) = (x. Select in R a basis consisting of . (x. of vectors x orthogonal to e form an (n 1)-dimensional subspace invariant under A. We show that R. The totality of vectors of R.

Ael = 22e. it follows that relative to this basis the matrix of the transforma- tion A is of the form [A. A A*.). e2. We note the following property of the eigenvectors of a selfadj oint transformation: the eigenvectors corresponding to different eigenvalues are orthogonal. structed in the proof of Theorem 1. (Ael.. o (1) 0 An 0 0 where the Ai are real. = 2. e2) = O.. e of A con- A..2e2. Ae. o o A. that is ¿1(e1. Indeed. In our case this operation has no effect on the matrix in question. This concludes the proof of Theorem 2. Ae = Anen.e. ez). Since . i. 22) (e1. e2) = (e1. Sufficiency: Assume now that the matrix of the transformation A has relative to an orthogonal basis the form (1). e2) = 22(e1. Since Ai rf 4. A*e2) = (e1. § 11). A. . e2) = O. or (2.LINEAR TRANSFORMATIONS 99 the n pairwise orthogonal eigenvectors e1. let Ael = 22 Then . it follows that (e1. Hence the transformations A and A* have the same matrix. The matrix of the adjoint transformation A* relative to an orthonormal basis is obtained by replacing all entries in the transpose of the matrix of A by their conjugates (cf. 21 22.

a necessary and sufficient condition for a linear transformation A to be self-adjoint is that its matrix relative to some orthogonal basis be Hermitian. X). Simultaneous reduction of a pair of quadratic forms to a sum of squares. Along with the notion of a self-adjoint transformation weintro- duce the notion of a Hermitian matrix. y) = A (y.e. where the Xi are real. . raise it to the proper power. We have shown in § 8 that in any vector space a Hermitian quadratic form can be written in an appropriate basis as a sum of squares. namely. Hint: Bring the matrix to its diagonal form.11 is said to be Hermitian if ai. EXERCISE.1 and. we can assert the existence of an orthonnal basis relative to which a given Hermitian quadratic form can be reduced to a sum of squares.. and the $1 are the coordi ales of the vector Proof: Let A( y) be a Hermitian bilinear form. In the case of a Euclidean space we can state a stronger result.100 LECTURES ON LINEAR ALGEBRA NOTE: Theorem 2 suggests the following geometric interpretation of a self-adjoint transformation: We select in our space n pairwise orthogonal directions (the directions determined by the eigenvectors) and associate with each a real number Ai (eigenvalue). if 2. obtained in para. A (x. i. 6 ili[e ii2. in addition. Clearly. A (x. Let A (x.. Raise the matrix ( 0 A/2 A/2) 1 to the 28th power. x) = x. The matrix Irai. y) be a Hermitian bilinear form defined on an n-dimensional Euclidean space R. Along each one of these directions we perform a stretching by ¡2. Then there exists an orthonormal basis in R relative to which the corresponding quadratic form can be written as a sum of squares. 1 to quadratic forms. happens to be negative. We know that we can associate with each Hermitian bilinear form a self-adjoint transformation. We now apply the results 2. Theorem 2 permits us now to state the important THEOREM 3. Reduction to principal axes. a reflection in the plane orthogonal to the corresponding direction. and then revert to the original basis.

With the introduction of an inner product our space R becomes a Euclidean vector space. x) 211$112 . y I1 0 e.An enen. The process of finding an orthonormal basis in a Euclidean space relative to which a given quadratic form can be represented as a sum of squares is called reduction to principal axes. Aen An en. en of the self-adjoint transformation A (cf. x) = (Ax. § 11) a self-adjoint linear transformation A such that A (x. By . y). This can be done since the axioms for an inner product state that (x. y) e2Ae2 + 22e2e2 + + en Aen . x). %el /12e2 + + ?)en) + nnen) n2e2 + = 1E11 + 225 In particular + + fin A (x. As our orthonormal basis vectors we select the pairwise orthogonal eigenvectors e1. Then there exists a basis in R relative to which each form can be written as a sum of squares. i=k for i k. y).121 212 + + Arisni2.LINEAR TRANSFORMATIONS 101 then there exists (cf. y) is the bilinear form corresponding to B(x. x) and B(x. Then Ael = 21e1. THEOREM 4. Let A (x. x = ei Since e2e. + n2 e2 + + nn. y) = = (Ax. x) to be positive definite. y) (Ax. x) be two Hermitian quadratic forms on an n-dimensional vector space R and assume B(x. e2. y) B(x. Let Ae2 = 12e2. where B(x. for we get A (x. y) is a Hermitian bilinear form corresponding to a positive definite quadratic form (§ 8). n1e1 -I. This proves the theorem. Proof: We introduce in R an inner product by putting (x. Theorem 1). + +e . .

141) differs from (4) by a multiplicative constant. . en is an arbitrary basis. then with respect to this basis Det i.1)2 a. Ar.12. Ab.. e2. it follows that B(x. Al) Det C. if el.e 1215212 + + 41E7. x) = 211E112 . e2. Det (id A) (22 2) (2 A). relative to which the form A (x. x) and B(x. . x) can be written as a sum of A (x. 22. x ) + 1E21' + + le. y) = B(x.e. basis el.. Orthonormal relative to the inner product (x. x). We now show how to find the numbers AI. We have thus found a basis relative to which both quadratic forms A (x. an2 2. x) are expressible as sums of squares. 22....12. . Hence. y)..q = 0 0 A AR) (A1 Consequently. 141) .102 LECTURES ON LINEAR ALGEBRA theorem 3 R contains an orthonormal squares. Det It follows that the numbers A. The matrices of the quadratic forms A and B have the following canonical form: d=0 0 [AI 22 0 0 [1 . Det V* Det (at a Ab a an 2b21 Abni A are the roots of the equation al. with respect to an orthonormal basis an inner product takes the form (x. x) = ei I2 + 1E212 + + [EF2 Since B(x x) (x. Now. which appear in (2) above. Under a change of basis the matrices of the Hermitian quadratic forms A and B go over into the matrices Jill = (t* d%' and = %)* . a2n Abu Ab22 ' /bin 21)2n a22 a.

Uy) = (x. Conversely. it satisfies condition (1)).LINEAR TRANSFORMATIONS 103 and 0i/A are the matrices of the quadratic forms A (x. where Haikl F NOTE: The following example illustrates that the requirement that one of the two forms be positive definite is essential. en. Its determinant is equal to (A2 + 1) and has no real roots. x) = neither of which is positive definite.. e2. y E R.. X) = let12 142. . then (U*Ux. U*Uy) = (x. Uy) = (x. Indeed. where A is a real parameter. . assume U*U = E.e. Indeed. the two forms cannot be reduced simultaneously to a sum of squares. Therefore. Conversely. x) and B(x. This definition has a simple geometric interpretation. (Ux. The two quadratic forms A (X. y) for all x. x) in some basis e. that is (U*Ux. B(x. Uy) (x. any linear transformation U which preserves inner products is unitary (i. if for any vectors x and y (Ux. § 13. Then (Ux.e. namely: A unitary transformation U on an n-dimensional Euclidean space R preserves inner products. y). cannot be reduced simultaneously to a sum of squares. y). the matrix of the first form is [1 0 101 11 and the matrix of the second form is a. y). y) = (Ex. ro Li oJ Consider the matrix a RR. Unitary transformations In § 11 we defined a unitary transformation by the equation (1) UU* U*U E. y) = (x. y). in accordance with the preceding discussion. i.

.. a=1 a-1 aak = O (i k). The condition UU* = E implies that the product of the matrices (2) and (3) is equal to the unit matrix. aiti. We shall now characterize the matrix of a unitary transformation. = 1.. To do this.104 LECTURES ON LINEAR ALGEBRA Since equality of bilinear forms implies equality of corresponding transformations. en.e...e. x). In particular. e2. we select an orthonormal basis el. This condition is analogous to the preceding one. . for x = y we have (Ux. Ux) = (x. Prove that a linear transformation which preserves length is unitary. a2. in addition. U is unitary. a unitary transformation preserves the length of a vector. . relative to an orthonormal basis. EXERCISE. Let [all a21 1E12 a22 aa a1 a2 dn dn d12 a an]] dn2 be the matr x of the transformation U relative to this basis. i. the matrix of a unitary transformation U has the following properties: the sum of the products of the elements of any YOW by the conjugates of the corresponding elements of any other YOW is equal to zero. it follows that U*LI = E. the sum of the squares of the moduli of the elements of any row is equal to one. but refers to the columns rather than the rows of the matrix of U. i. a2d. = 1. that is. a=1 a=1 a(T = O (i k). ann is the matrix of the adjoint U* of U. Making use of the condition U*U = E we obtain. Then d22 al. Thus.

LINEAR TRANSFORMATIONS 105 Condition (5) has a simple geometric meaning. orthonormal basis). e1. Ux = Ax.. condition (5) is called unitary. e2. i. Then x O. e2. i. of R consisting of all Then the (n vectors x orthogonal to e is invariant under U. 0 for i It follows that a necessary and sufficient condition for a linear transformation U to be unitary is that it take an orthonormal basis en into an orthonormal basis Uek . x). Uek) 1 k. . Ai = 1 or 121 = 1. Ue2. Indeed. 2x) = 22(x. Since a transformation which takes an orthonormal basis into another orthonormal basis is unitary. that is. + a2e2 + and akk a2k e2 + + anke . x) = (Ux. (6) (Uei. Uen. Hence f 1 for i = k. LEMMA 1. .e. LEMMA 2. Proof: Let x be an eigenvector of a unitary transformation U and let A be the corresponding eigenvalue. 1)-d mensional subspace R. Ue = 2e.e. Let U be a unitary transfor ation on an n-di ensional space R and e its eigenvector. the matrix of transition from an orthonormal basis to another orthonormal basis is also unitary. equivalently. en to be an is equal to axid (since we assumed el. (x. Ux) = (2x. e O.. the inner product of the vectors +a Uei = ai. As we have shown unitary matrices are matrices of unitary transformations relative to an orthonormal basis. We shall now try to find the simplest form of the matrix of a unitary transformation relative to some suitably chosen basis. A matrix I laall whose elements satisfy condition (4) or. The eigenvalues of a unitary transformation are in absolute value equal to one.

Ue = . O. 4. e) = 0. etc. Ux E Thus. Hence R. By Lemma 1 the eigenvalues corresponding to these eigenvectors are in absolute value equal to one. . We claim that the n pairwise orthogonal eigenvectors constructed in the preceding theorem constitute the desired basis. (Ux. Proof: In view of Theorem 1. i.. one. i.106 LECTURES ON LINEAR ALGEBRA i. Let U be a unitary transformation defined on an n-dimensional Euclidean space R. e) = 0. it follows that i(Ux. i. e) = (x. = 22e2. Then there exists an orthonormal basis in R relative to which the matrix of the transformation U is diagonal. R2 contains at least one eigenvector e3 of U.e. A are in absolute value equal to Proof: Let U be a unitary transformation. e) Proof: Let x E R. of all vectors of R which are orthogonal to e. (7) o o 22 oi.. en of the transformation U. Indeed. § 10. THEOREM 2. THEOREM 1. is indeed invariant under U. Indeed. e) --. Ue. the transformation U as a linear transformation has at least one eigenvector.. Denote by R2 the invariant subspace consisting of all vectors of R1 orthogonal to e2. o The numbers 4. .. (x. Let U be a unitary transformation on an n-dimen- sional Euclidean space R. the (n 1)-dimensional subspace R. the subspace R1 .e. Proceeding in this manner we obtain n pairwise orthogonal eigenvectors e. By Lemma 1. has the form [2. Ue) = (U*Ux.. = Ue. 0 0. Since Ue = ae. (Ux. hence (Ux. Denote this vector by el. .e. Then U has n pairwise orthogonal eigenvectors. e) = 0. We shall show that Ux e R1.e. By Lemma 2. is invariant under U.O. The corresponding eigenvalues are in absolute value equal to one. contains at least one eigenvector e2 of U.

EXERCISES. i. § 12. 1. let AB = BA. i. We shall now discuss conditions for the existence of such a basis.. VVe first consider the case of two transformations. Then there exists a unitary matrix 'V such that Pi= rigr. LEMMA 1. the main result of para. Commutative linear transformations. Normal transformations 1. An are in absolute value equal to one. We have shown (§ 12) that for each self-adjoint transformation there exists an orthonormal basis relative to which the matrix of the transformation is diagonal.e.e. if the matrix of U has form (7) relative to some orthogonal basis then U is unitary. therefore. where is a diagonal matrix whose non-zero elements are equal in absolute value to one. Prove the converse of Theorem 2.LINEAR TRANSFORMATIONS 107 and. the matrix of U relative to the basis e1. Then sat can be represented in the form sit = where ir is a unitary matrix and g a diagonal matrix whose nonzero elements are real. e2. This proves the theorem. 2. Let all be a unitary matrix. By Lemma 1 the numbers Ai. It may turn out that given a number of self-adjoint transformations. has form (7). Prove that if A is a self-adjoint transformation then the transformation (A iE)-1 (A + iE) exists and is unitary. 1. Let sal be a Hermitian matrix. § 14. we can find a basis relative to which all these transformations are represented by diagonal matrices. 22. Commutative transformations. Let A and B be two commutative linear transformations. . . can be given the following matrix interpretation. Since the matrix of transition from one orthonormal basis to another is unitary we can give the following matrix interpretation to the result obtained in this section. . Analogously..

. Proof: Let AB = BA and let RA be the subspace consisting of all vectors x for which Ax ---. Lemma 2. we have ABx = BAx = B2x = 2. Be2 = u2e2. = 22e2. § 12). Then. which proves our lemma. i. Sufficiency: Let AB EA. RA is invariant under B. then Bx e Ra. A necessary and sufficient condition for the existence of an orthogonal basis in R relative to which the transformations A and B are represented by diagonal matrices is that A and B commute. By Lemma 2. if A is the identity trans- formation E. By Lemma 1. NOTE: If AB = BA we cannot claim that every eigenvector of A is also an eigenvector of B.e. by Lemma 2. Proof: We have to show that if x ERA. then x is an eigenvector of E. . which is an eigenvector of both A and B. For instance.13x. ABx = 2Bx. Since AB -= BA.. is invariant under A and B (cf. there ex sts a vector e. THEOREM 1.. i.e. which is an eigen- vector of A and B: Ae. Let A and B be two linear self-adjoint transformations defined on a complex n-dimensional vector space R. B a linear transformation other than E and x a vector which is not an eigenvector of B. Ae. Now consider A and B on R. i. LEMMA 2. Any two commutative transformations have a common eigenvector. since by assumption all the vectors of RA are eigenvectors of A. Hence RA contains a vector x. in R. EB BE and x is not an eigenvector of B. which is an eigenvector of B. = The (n 1)-dimensional subspace R. there exists a vector e.108 LECTURES ON LINEAR ALGEBRA Then the eigenvectors of A which correspond to a given eigenvalue A of A form (together with the null vector) a subspace RA invariant under the transformation B. orthogonal to e.e. xo is also an eigenvector of A. Be. only. Ax = 2x. where A is an eigenvalue of A. = 21e1.2x.

B. Assume therefore that there exists a Let R. A of A. . n). and U. It follows that these matrices Bei = pie. R. of A and B: Aei 2. B. NOTE: Theorem I can be generalized to any set of pairwise commutative self-adjoint transformations. e2. Proof: The proof is by induction on the dimension of the space R. is of n 1. In the I ) the lemma is obvious. is a B. 2. R. C. If every vector of R is an eigenvector of all the transformations A. R. By Lemma 1. The elements of any set of pairwise commutative transformations on a vector space R have a common eigenvector. and U. We shall now characterize all transformations with this property. . in our set Sour lemma is proved. But then the transformations themselves commute. Hence R. . Necessity: Assume that the matrices of A and B are diagonal relative to some orthogonal basis. R1 must contain a vector which is an eigenvector of the This proves our lemma. Let U. be two commutative unitary transformations. Furthermore. B. C. etc. which are orthogonal to e2 form an (n 2)dimensional subspace invariant under A and B. We assume case of one-dimensional space (n that it is true for spaces of dimension < n and prove it for an n-dimensional space. The proof follows that of Theorem but instead of Lemma 2 the following Lemma is made use of : 1 LEMMA 2'. Prove that there exists a basis relative to which the matrices of U. In §§ 12 and 13 we considered two classes of linear transformations which are represented in a suitable orthonormal basis by a diagonal matrix. by assumption.e1 . transformations A. is invariant under each of the transformations (obviously. This completes the sufficiency part of the proof. Normal transformations. e2. (i = 1. are multiples of the . say. be the set of all eigenvectors of A corresponding to some eigenvalue vector in R which is not an eigenvector of the transformation A. Proceeding in this way we get n pairwise orthogonal eigenvectors e1. A necessary and sufficient condition for the existence This means that the transformations A. Since. Relative to e1. commute. is also invariant under A). C. THEOREM 2. our lemma is true for spaces of dimension dimension < n. identity transformation. subspace different from the null space and the whole space. C. EXERCISE.LINEAR TRANSFORMATIONS 109 All vectors of R. e the matrices of A and B are diagonal. are diagonal.

EXERCISE.9 The (n 1)-dimensional subspace R1 of vectors orthogonal to e. . Ate) vector e. Continuing in this manner we construct n pairwise orthogonal vectors e.e. e1) = 0. Ate1=p1e1. we can claim that R1 contains a (x. Necessity: Let the matrix of the transformation A be diagonal relative to some orthonormal basis. el) = 0. The invariance of R.e. Let R2 be the (n 2)-dimensional subspace of vectors from R2 orthogonal to e2. Sufficiency: Assume that A and A* commute. Ax e R. etc. 0 Since the matrices of A and A* are diagonal they commute. It follows that A and A* commute. Then (Ax. Indeed. that is. e.. let the matrix be of the form [2. 0 0 0 0 0 IL. § 11).1.e... i. cf. Prove that pi = 9 . is invariant under A as well as under A*. i. This proves that R. Ae1=21e1.. e1) = (x. Applying now Lemma 2 to R.110 LECTURES ON LINEAR ALGEBRA of an orthogonal basis relative to which a transformation A is represent- ed by a diagonal matrix is AA* = A*A (such transformations are said to be normal. Then by Lemma 2 there exists a vector el which is an eigenvector of A and A*. 22 0 Relative to such a basis the matrix of the transformation A* has the form 0 0 0 [Al i. pled = [71(x. under A* is proved in an analogous manner. let x E 141. e which are eigenvectors of A and A*. i. is invariant under A. (x. which is an eigenvector of A and A*.

Note that if A is a self-adjoint transformation then AA* = A*A = A². A unitary transformation U is also normal since UU* = U*U = E. Thus some of the results obtained in §§ 12 and 13 are special cases of Theorem 2.

EXERCISES. 1. Prove that the matrices of a set of normal transformations any two of which commute are simultaneously diagonable. 2. Prove that a normal transformation A can be written in the form A = HU = UH, where H is self-adjoint, U unitary, and where H and U commute. (Hint: Select a basis relative to which A and A* are diagonable.) 3. Prove that if A = HU, where H is self-adjoint, U unitary and H and U commute, then A is normal.

An alternative sufficiency proof of Theorem 2. Let

    A1 = (A + A*)/2,   A2 = (A − A*)/2i.

The transformations A1 and A2 are self-adjoint. If A and A* commute then so do A1 and A2. By Theorem 1, there exists an orthonormal basis in which A1 and A2 are represented by diagonal matrices. But then the same is true of A = A1 + iA2.

§ 15. Decomposition of a linear transformation into a product of a unitary and self-adjoint transformation

Every complex number can be written as a product of a positive number and a number whose absolute value is one (the so-called trigonometric form of a complex number). Unitary transformations are the analog of numbers of absolute value one. The analog of positive numbers are the so-called positive definite linear transformations.

DEFINITION 1. A linear transformation H is called positive definite if it is self-adjoint and if (Hx, x) ≥ 0 for all x.

THEOREM 1. Every non-singular linear transformation A can be represented in the form

    A = HU   (or A = U1H1),

where H (H1) is a non-singular positive definite transformation and U (U1) a unitary transformation.

We shall first assume the theorem true and show how to find the necessary H and U. This will suggest a way of proving the theorem. Thus, let A = HU, where U is unitary and H is a non-singular positive definite transformation. H is easily expressible in terms of A. Indeed,

    AA* = HU(HU)* = HUU*H = H²,

so that AA* = H². Consequently, in order to find H one has to "extract the square root" of AA*. Having found H, we put U = H⁻¹A.

Before proving Theorem 1 we establish three lemmas.

LEMMA 1. Given any linear transformation A, the transformation AA* is positive definite. If A is non-singular then so is AA*.

Proof: AA* is self-adjoint, that is, (AA*)* = A**A* = AA*. Furthermore, (AA*x, x) = (A*x, A*x) ≥ 0 for all x. Thus AA* is positive definite. If A is non-singular, the determinant of the matrix ||a_ik|| of the transformation A relative to any orthogonal basis is different from zero. The determinant of the matrix of A* relative to the same basis is the complex conjugate of the determinant of the matrix ||a_ik||. Hence the determinant of the matrix of AA* is different from zero, which means that AA* is non-singular.

LEMMA 2. The eigenvalues of a positive definite transformation B are non-negative. Conversely, if all the eigenvalues of a self-adjoint transformation B are non-negative then B is positive definite.

Proof: Let B be positive definite and let Be = λe. Then (Be, e) = λ(e, e). Since (Be, e) ≥ 0 and (e, e) > 0, it follows that λ ≥ 0.

Conversely, assume that all the eigenvalues of a self-adjoint transformation B are non-negative. Let e1, e2, ..., en be an orthonormal basis consisting of the eigenvectors of B, and let

    x = ξ1e1 + ξ2e2 + ... + ξnen

be any vector of R. Then

(1)  (Bx, x) = (B(ξ1e1 + ... + ξnen), ξ1e1 + ... + ξnen)
            = (λ1ξ1e1 + λ2ξ2e2 + ... + λnξnen, ξ1e1 + ... + ξnen)
            = λ1|ξ1|² + λ2|ξ2|² + ... + λn|ξn|².

Since all the λi are non-negative it follows that (Bx, x) ≥ 0.

NOTE: It is clear from equality (1) that if all the λi are positive then the transformation B is non-singular and, conversely, if B is positive definite and non-singular then the λi are positive.

LEMMA 3. Given any positive definite transformation B, there exists a positive definite transformation H such that H² = B (in this case we write H = B^½). In addition, if B is non-singular then H is non-singular.

Proof: We select in R an orthogonal basis relative to which B is of the form

    B = [λ1  0   ...  0 ]
        [0   λ2  ...  0 ]
        [ ...           ]
        [0   0   ...  λn],

where λ1, λ2, ..., λn are the eigenvalues of B. By Lemma 2 all λi ≥ 0. Put

    H = [√λ1  0    ...  0  ]
        [0    √λ2  ...  0  ]
        [ ...              ]
        [0    0    ...  √λn].

Applying Lemma 2 again we conclude that H is positive definite; furthermore, H² = B. If B is non-singular, then (cf. the note to Lemma 2) λi > 0, hence √λi > 0 and H is non-singular.

We now prove Theorem 1. Let A be a non-singular linear transformation. Let

    H = (AA*)^½.

In view of Lemmas 1 and 3, H is a non-singular positive definite transformation. If

(2)  U = H⁻¹A,

then U is unitary. Indeed,

    UU* = H⁻¹A(H⁻¹A)* = H⁻¹AA*H⁻¹ = H⁻¹H²H⁻¹ = E.

Making use of eq. (2) we get A = HU. This completes the proof of Theorem 1.

EXERCISE. Prove that if A and B are positive definite transformations, at least one of which is non-singular, then the transformation AB has non-negative eigenvalues.

The operation of extracting the square root of a transformation can be used to prove the following theorem:

THEOREM. Let A be a non-singular positive definite transformation and let B be a self-adjoint transformation. Then the eigenvalues of the transformation AB are real.

Proof: We know that the transformations X = AB and C⁻¹XC have the same characteristic polynomials and therefore the same eigenvalues. If we can choose C so that C⁻¹XC is self-adjoint, then C⁻¹XC and X = AB will both have real eigenvalues. A suitable choice for C is C = A^½. Then

    C⁻¹XC = A^(−½)ABA^½ = A^½BA^½,

which is easily seen to be self-adjoint. Indeed,

    (A^½BA^½)* = (A^½)*B*(A^½)* = A^½BA^½.
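A numerical sketch of the construction used in the proof (not part of the original text; NumPy and SciPy are assumed, and the matrix A below is only a randomly chosen example, assumed non-singular): H is obtained by extracting the square root of AA*, and U = H⁻¹A then turns out to be unitary.

```python
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))  # example, assumed non-singular

H = sqrtm(A @ A.conj().T)        # positive definite square root of AA*
U = np.linalg.inv(H) @ A         # U = H^{-1} A

print(np.allclose(U @ U.conj().T, np.eye(3)))   # U is unitary
print(np.allclose(H, H.conj().T))               # H is self-adjoint
print(np.allclose(H @ U, A))                    # A = HU
```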

§ 16. Linear transformations on a real Euclidean space

This section will be devoted to a discussion of linear transformations defined on a real space. For the purpose of this discussion the reader need only be familiar with the material of §§ 9 through 11 of this chapter.

1. The concepts of invariant subspace, eigenvector, and eigenvalue introduced in § 10 were defined for a vector space over an arbitrary field and are therefore relevant in the case of a real vector space. In § 10 we proved that in a complex vector space every linear transformation has at least one eigenvector (one-dimensional invariant subspace). This result, which played a fundamental role in the development of the theory of complex vector spaces, does not apply in the case of real spaces. Thus, a rotation of the plane about the origin by an angle different from a multiple of π is a linear transformation which does not have any one-dimensional invariant subspace. However, we can state the following

THEOREM 1. Every linear transformation in a real vector space R has a one-dimensional or two-dimensional invariant subspace.

Proof: Let e1, e2, ..., en be a basis in R and let ||a_ik|| be the matrix of A relative to this basis. Consider the system of equations

(1)  a11ξ1 + a12ξ2 + ... + a1nξn = λξ1,
     a21ξ1 + a22ξ2 + ... + a2nξn = λξ2,
     ........................................
     an1ξ1 + an2ξ2 + ... + annξn = λξn.

The system (1) has a non-trivial solution if and only if

    | a11 − λ   a12      ...   a1n     |
    | a21       a22 − λ  ...   a2n     |  =  0.
    |  ...                             |
    | an1       an2      ...   ann − λ |

This equation is an nth order polynomial equation in λ with real coefficients. Let λ0 be one of its roots. There arise two possibilities:

a. λ0 is a real root. Then we can find numbers ξ1⁰, ξ2⁰, ..., ξn⁰, not all zero, which are a solution of (1). These numbers are the coordinates of some vector x relative to the basis e1, e2, ..., en. We can thus rewrite (1) in the form Ax = λ0x, i.e., the vector x spans a one-dimensional invariant subspace.

b. λ0 = α + iβ, β ≠ 0. Let ξ1 + iη1, ξ2 + iη2, ..., ξn + iηn be a solution of (1). Replacing the ξk in (1) by these numbers and separating the real and imaginary parts we get

(2)   a11ξ1 + a12ξ2 + ... + a1nξn = αξ1 − βη1,
      ........................................
      an1ξ1 + an2ξ2 + ... + annξn = αξn − βηn,

and

(2')  a11η1 + a12η2 + ... + a1nηn = βξ1 + αη1,
      ........................................
      an1η1 + an2η2 + ... + annηn = βξn + αηn.

The numbers ξ1, ξ2, ..., ξn (η1, η2, ..., ηn) are the coordinates of some vector x (y) in R. Thus the relations (2) and (2') can be rewritten as follows:

(3)  Ax = αx − βy,   Ay = βx + αy.

Equations (3) imply that the two-dimensional subspace spanned by the vectors x and y is invariant under A.

In the sequel we shall make use of the fact that in a two-dimensional invariant subspace associated with the root λ = α + iβ the transformation has the form (3).

EXERCISE. Show that in an odd-dimensional space (in particular, a three-dimensional one) every transformation has a one-dimensional invariant subspace.

2. Self-adjoint transformations.

DEFINITION 1. A linear transformation A defined on a real Euclidean space R is said to be self-adjoint if

(4)  (Ax, y) = (x, Ay)

for any vectors x and y.

Let e1, e2, ..., en be an orthonormal basis in R and let

    x = ξ1e1 + ξ2e2 + ... + ξnen,   y = η1e1 + η2e2 + ... + ηnen.

Furthermore, let ζi be the coordinates of the vector z = Ax, i.e.,

    ζi = Σk a_ik ξk,

where ||a_ik|| is the matrix of A relative to the basis e1, e2, ..., en. It follows that

(5)  (Ax, y) = (z, y) = Σi ζi ηi = Σ_{i,k} a_ik ξk ηi.

Similarly,

    (x, Ay) = Σ_{i,k} a_ki ξk ηi.

Thus, condition (4) is equivalent to a_ik = a_ki. To sum up, for a linear transformation to be self-adjoint it is necessary and sufficient that its matrix relative to an orthonormal basis be symmetric.

Relative to an arbitrary basis every symmetric bilinear form A(x, y) is represented by

(6)  A(x, y) = Σ_{i,k} a_ik ξi ηk,   where a_ik = a_ki.

Comparing (5) and (6) we obtain the following result: Given a symmetric bilinear form A(x, y), there exists a self-adjoint transformation A such that

    A(x, y) = (Ax, y).

We shall make use of this result in the proof of Theorem 3 of this section.

We shall now show that given a self-adjoint transformation there exists an orthogonal basis relative to which the matrix of the transformation is diagonal. The proof of this statement will be based on the material of para. 1. A different proof, which does not depend on the results of para. 1 and is thus independent of the theorem asserting the existence of the root of an algebraic equation, is given in § 17.

We first prove two lemmas.

LEMMA 1. Every self-adjoint transformation has a one-dimensional invariant subspace.

Proof: According to Theorem 1 of this section, to every real root λ of the characteristic equation there corresponds a one-dimensional invariant subspace and to every complex root λ, a two-dimensional invariant subspace. Thus, to prove Lemma 1 we need only show that all the roots of a self-adjoint transformation are real.

Suppose that λ = α + iβ, β ≠ 0. In the proof of Theorem 1 we constructed two vectors x and y such that

    Ax = αx − βy,   Ay = βx + αy.

Then

    (Ax, y) = α(x, y) − β(y, y),   (x, Ay) = β(x, x) + α(x, y).

Subtracting the first equation from the second we get [note that (Ax, y) = (x, Ay)]

    0 = β[(x, x) + (y, y)].

Since (x, x) + (y, y) ≠ 0, it follows that β = 0. Contradiction. Thus all the roots of a self-adjoint transformation are real.

LEMMA 2. Let A be a self-adjoint transformation and e1 an eigenvector of A. Then the totality R' of vectors orthogonal to e1 forms an (n − 1)-dimensional invariant subspace.

Proof: It is clear that the totality R' of vectors x, x ∈ R, orthogonal to e1, forms an (n − 1)-dimensional subspace. We show that R' is invariant under A. Indeed, let x ∈ R', i.e., (x, e1) = 0. Then

    (Ax, e1) = (x, Ae1) = (x, λe1) = λ(x, e1) = 0,

i.e., Ax ∈ R'.

THEOREM 2. There exists an orthonormal basis relative to which the matrix of a self-adjoint transformation A is diagonal.

Proof: By Lemma 1, the transformation A has at least one eigenvector e1. Denote by R' the subspace consisting of vectors orthogonal to e1. Since R' is invariant under A, it contains (again, by Lemma 1)

an eigenvector e2 of the transformation A, etc. In this manner we obtain n pairwise orthogonal eigenvectors e1, e2, ..., en. Since Aei = λiei (i = 1, 2, ..., n), the matrix of A relative to the basis e1, e2, ..., en is of the form

    [λ1  0   ...  0 ]
    [0   λ2  ...  0 ]
    [ ...           ]
    [0   0   ...  λn].

3. Reduction of a quadratic form to a sum of squares relative to an orthogonal basis (reduction to principal axes). Let A(x, y) be a symmetric bilinear form on an n-dimensional Euclidean space. We showed earlier that to each symmetric bilinear form A(x, y) there corresponds a linear self-adjoint transformation A such that A(x, y) = (Ax, y). According to Theorem 2 of this section there exists an orthonormal basis e1, e2, ..., en consisting of the eigenvectors of the transformation A (i.e., of vectors such that Aei = λiei). With respect to such a basis

    A(x, y) = (Ax, y)
            = (A(ξ1e1 + ξ2e2 + ... + ξnen), η1e1 + η2e2 + ... + ηnen)
            = (λ1ξ1e1 + λ2ξ2e2 + ... + λnξnen, η1e1 + η2e2 + ... + ηnen)
            = λ1ξ1η1 + λ2ξ2η2 + ... + λnξnηn.

Putting y = x we obtain the following

THEOREM 3. Let A(x, x) be a quadratic form on an n-dimensional Euclidean space. Then there exists an orthonormal basis relative to which the quadratic form can be represented as

    A(x, x) = λ1ξ1² + λ2ξ2² + ... + λnξn².

Here the λi are the eigenvalues of the transformation A or, equivalently, the roots of the characteristic equation of the matrix ||a_ik||.
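As a purely numerical illustration of Theorem 3 (a sketch, not part of the original text; NumPy is assumed and the symmetric matrix below is an arbitrary example), the orthonormal eigenbasis of the matrix of a quadratic form reduces the form to a sum of squares whose coefficients are the eigenvalues.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])          # symmetric matrix of the quadratic form A(x, x)

lam, E = np.linalg.eigh(A)               # columns of E: orthonormal eigenvectors e1, ..., en

x = np.array([1.0, -2.0, 0.5])
xi = E.T @ x                             # coordinates of x in the eigenbasis

# A(x, x) equals the sum of lambda_i * xi_i^2
print(np.isclose(x @ A @ x, np.sum(lam * xi**2)))   # True
```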

For n = 3 the above theorem is a theorem of solid analytic geometry. Indeed, in this case the equation A(x, x) = 1 is the equation of a central surface of order two. The orthonormal basis discussed in Theorem 3 defines in this case the coordinate system relative to which the surface is in canonical form. The basis vectors e1, e2, e3 are directed along the principal axes of the surface.

4. Simultaneous reduction of a pair of quadratic forms to a sum of squares.

THEOREM 4. Let A(x, x) and B(x, x) be two quadratic forms on an n-dimensional space R, and let B(x, x) be positive definite. Then there exists a basis in R relative to which each form is expressed as a sum of squares.

Proof: Let B(x, y) be the bilinear form corresponding to the quadratic form B(x, x). We define in R an inner product by means of the formula

    (x, y) = B(x, y).

By Theorem 3 of this section there exists an orthonormal basis e1, e2, ..., en relative to which the form A(x, x) is expressed as a sum of squares, i.e.,

    A(x, x) = Σ λi ξi².

Relative to an orthonormal basis an inner product takes the form

    (x, x) = B(x, x) = Σ ξi².

Thus, relative to the basis e1, e2, ..., en each quadratic form is expressed as a sum of squares.
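Numerically, the construction in Theorem 4 is the generalized symmetric eigenproblem. The following sketch (not part of the original text; SciPy is assumed, and the matrices are arbitrary examples) produces a basis in which B(x, x) becomes Σ ξi² and A(x, x) becomes Σ λi ξi².

```python
import numpy as np
from scipy.linalg import eigh

A = np.array([[1.0, 2.0], [2.0, 1.0]])      # an arbitrary quadratic form
B = np.array([[2.0, 0.5], [0.5, 1.0]])      # a positive definite form (the new inner product)

lam, V = eigh(A, B)     # columns of V satisfy  V.T @ B @ V = E  and  V.T @ A @ V = diag(lam)

print(np.allclose(V.T @ B @ V, np.eye(2)))
print(np.allclose(V.T @ A @ V, np.diag(lam)))
```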

5. Orthogonal transformations.

DEFINITION. A linear transformation A defined on a real n-dimensional Euclidean space is said to be orthogonal if it preserves inner products, i.e., if

(9)  (Ax, Ay) = (x, y)   for all x, y ∈ R.

Putting x = y in (9) we get

(10)  |Ax|² = |x|²,

that is, an orthogonal transformation is length preserving.

EXERCISE. Prove that condition (10) is sufficient for a transformation to be orthogonal.

Since

    cos φ = (x, y) / (|x| · |y|),

and since neither the numerator nor the denominator in the expression above is changed under an orthogonal transformation, it follows that an orthogonal transformation preserves the angle between two vectors.

Let e1, e2, ..., en be an orthonormal basis. Since an orthogonal transformation A preserves the angles between vectors and the length of vectors, it follows that the vectors Ae1, Ae2, ..., Aen likewise form an orthonormal basis, i.e.,

(11)  (Aei, Aek) = 1 for i = k,   (Aei, Aek) = 0 for i ≠ k.

Now let ||a_ik|| be the matrix of A relative to the basis e1, e2, ..., en. Since the columns of this matrix are the coordinates of the vectors Aei, conditions (11) can be rewritten as follows:

(12)  Σα a_αi a_αk = 1 for i = k,   Σα a_αi a_αk = 0 for i ≠ k.

EXERCISE. Show that conditions (11) and, consequently, conditions (12) are sufficient for a transformation to be orthogonal.

Conditions (12) can be written in matrix form. Indeed, the sums Σα a_αi a_αk are the elements of the product of the transpose of the matrix of A by the matrix of A. Conditions (12) imply that this product is the unit matrix. Since the determinant of the product of two matrices is equal to the product of the determinants, it follows that the square of the determinant of a matrix of an orthogonal transformation is equal to one, i.e., the determinant of a matrix of an orthogonal transformation is equal to ±1. An orthogonal transformation whose determinant is equal to +1 is called a proper orthogonal transformation, whereas an orthogonal transformation whose determinant is equal to −1 is called improper.

EXERCISE. Show that the product of two proper or two improper orthogonal transformations is a proper orthogonal transformation and the product of a proper by an improper orthogonal transformation is an improper orthogonal transformation.
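Conditions (12) say that the transpose of the matrix of A multiplied by the matrix itself is the unit matrix; a quick numerical check (a sketch, not part of the original text; NumPy assumed, with a rotation and a reflection as example matrices):

```python
import numpy as np

phi = 0.7
Q = np.array([[np.cos(phi), -np.sin(phi)],
              [np.sin(phi),  np.cos(phi)]])     # a proper orthogonal matrix

print(np.allclose(Q.T @ Q, np.eye(2)))          # conditions (12): Q^T Q = E
print(np.isclose(np.linalg.det(Q), 1.0))        # proper: determinant +1

R = np.array([[1.0, 0.0], [0.0, -1.0]])         # a reflection
print(np.allclose(R.T @ R, np.eye(2)), np.isclose(np.linalg.det(R), -1.0))  # improper
```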


NOTE: What motivates the division of orthogonal transformations into proper and improper transformations is the fact that any orthogonal transformation which can be obtained by continuous deformation from the identity transformation is necessarily proper. Indeed, let A_t be an orthogonal transformation which depends continuously on the parameter t (this means that the elements of the matrix of the transformation relative to some basis are continuous functions of t) and let A_0 = E. Then the determinant of this transformation is also a continuous function of t. Since a continuous function which assumes the values ±1 only is a constant, and since for t = 0 the determinant of A_t is equal to 1, it follows that for every t the determinant of the transformation is equal to 1. Making use of Theorem 5 of this section one can also prove the converse, namely, that every proper orthogonal transformation can be obtained by continuous deformation of the identity transformation.

We now turn to a discussion of orthogonal transformations in one-dimensional and two-dimensional vector spaces. In the sequel we shall show that the study of orthogonal transformations in a space of arbitrary dimension can be reduced to the study of these two simpler cases.

Let e be a vector generating a one-dimensional space and A an orthogonal transformation defined on that space. Then Ae = λe, and since (Ae, Ae) = (e, e), we have λ²(e, e) = (e, e), i.e., λ = ±1. Thus we see that in a one-dimensional vector space there exist two orthogonal transformations only: the transformation Ax = x and the transformation Ax = −x. The first is a proper and the second an improper transformation.

Now, consider an orthogonal transformation A on a two-dimensional vector space R. Let e1, e2 be an orthonormal basis in R and let

(13)  [α  β]
      [γ  δ]

be the matrix of A relative to that basis.

We first study the case when A is a proper orthogonal transformation, i.e., we assume that αδ − βγ = 1.

The orthogonality condition implies that the product of the matrix (13) by its transpose is equal to the unit matrix, i.e., that

(14)  [α  β]⁻¹   =   [α  γ]
      [γ  δ]         [β  δ].


Since the determinant of the matrix (13) is equal to one, we have

(15)  [α  β]⁻¹   =   [ δ  −β]
      [γ  δ]         [−γ   α].

It follows from (14) and (15) that in this case the matrix of the transformation is

      [ α  β]
      [−β  α],

where α² + β² = 1. Putting α = cos φ, β = sin φ we find that the matrix of a proper orthogonal transformation on a two-dimensional space relative to an orthogonal basis is of the form

      [ cos φ  sin φ]
      [−sin φ  cos φ]

(a rotation of the plane by an angle φ).

Assume now that A is an improper orthogonal transformation, that is, that αδ − βγ = −1.

In this case the characteristic equation of the matrix (13) is λ² − (α + δ)λ − 1 = 0; since its discriminant (α + δ)² + 4 is positive, it has real roots. This means that the transformation A has an eigenvector e, Ae = λe. Since A is orthogonal it follows that Ae = ±e. Furthermore, an orthogonal transformation preserves the angles between vectors and their length. Therefore any vector e1 orthogonal to e is transformed by A into a vector orthogonal to Ae = ±e, i.e., Ae1 = ±e1. Hence the matrix of A relative to the basis e, e1 has the form

      [±1   0]
      [ 0  ±1].

Since the determinant of an improper transformation is equal to −1, the canonical form of the matrix of an improper orthogonal transformation in two-dimensional space is

      [1   0]         [−1  0]
      [0  −1]   or    [ 0  1]

(a reflection in one of the axes).

We now find the simplest form of the matrix of an orthogonal transformation defined on a space of arbitrary dimension.


THEOREM 5. Let A be an orthogonal transformation defined on an n-dimensional Euclidean space R. Then there exists an orthonormal basis e1, e2, ..., en of R relative to which the matrix of the transformation is

    [ 1                                                   ]
    [    .                                                ]
    [      −1                                             ]
    [          cos φ1   sin φ1                            ]
    [         −sin φ1   cos φ1                            ]
    [                            .                        ]
    [                               cos φk   sin φk       ]
    [                              −sin φk   cos φk       ],

where the unspecified entries have value zero.

Proof: According to Theorem 1 of this section R contains a one- or two-dimensional invariant subspace R⁽¹⁾. If there exists a one-dimensional invariant subspace R⁽¹⁾ we denote by e1 a vector of length one in that space. Otherwise R⁽¹⁾ is two-dimensional and we choose in it an orthonormal basis e1, e2. Consider A on R⁽¹⁾. In the case when R⁽¹⁾ is one-dimensional, A takes the form Ax = ±x. If R⁽¹⁾ is two-dimensional, A is a proper orthogonal transformation on it (otherwise R⁽¹⁾ would contain a one-dimensional invariant subspace) and the matrix of A in R⁽¹⁾ is of the form

      [ cos φ1  sin φ1]
      [−sin φ1  cos φ1].

The totality R̃ of vectors orthogonal to all the vectors of R⁽¹⁾ forms an invariant subspace. Indeed, consider the case when R⁽¹⁾ is a two-dimensional space, say. Let x ∈ R̃, i.e., (x, y) = 0 for all y ∈ R⁽¹⁾. As y varies over all of R⁽¹⁾, z = Ay likewise varies over all of R⁽¹⁾. Since (Ax, Ay) = (x, y) = 0, it follows that (Ax, z) = 0 for all z ∈ R⁽¹⁾, i.e., Ax ∈ R̃. We reason analogously if R⁽¹⁾ is one-dimensional. If R⁽¹⁾ is of dimension two, R̃ is the totality of vectors orthogonal to the vectors e1 and e2 and it is of dimension n − 2; if R⁽¹⁾ is of dimension one, R̃ is the totality of vectors orthogonal to the vector e1 and it is of dimension n − 1.

We now find a one-dimensional or two-dimensional invariant subspace of R̃, select a basis in it, etc. In this manner we obtain n pairwise orthogonal vectors of length one which form a basis of R. Relative to this basis the matrix of the transformation is of the form given in the theorem, where the ±1 on the principal diagonal correspond to one-dimensional invariant subspaces and the "boxes"

      [ cos φi  sin φi]
      [−sin φi  cos φi]

correspond to two-dimensional invariant subspaces. This completes the proof of the theorem.
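Theorem 5 can also be seen in the spectrum: the eigenvalues of an orthogonal matrix are ±1 together with pairs cos φ ± i sin φ of modulus one, one pair for each two-dimensional box. A sketch (not part of the original text; NumPy assumed, Q is a random orthogonal matrix obtained from a QR factorization):

```python
import numpy as np

rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))   # a random orthogonal matrix

w = np.linalg.eigvals(Q)
print(np.allclose(np.abs(w), 1.0))                 # all eigenvalues lie on the unit circle

# the rotation angles phi of the two-dimensional boxes come in +/- pairs
angles = np.angle(w[np.abs(w.imag) > 1e-10])
print(np.sort(angles))
```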

NOTE: A proper orthogonal transformation which represents a rotation of a two-dimensional plane and which leaves the (n − 2)-dimensional subspace orthogonal to that plane fixed is called a simple rotation. Relative to a suitable basis its matrix has a single two-rowed box

      [ cos φ  sin φ]
      [−sin φ  cos φ]

on the main diagonal, the remaining diagonal entries being 1 and all other entries zero. An improper orthogonal transformation which reverses all vectors of some one-dimensional subspace and leaves all the vectors of the (n − 1)-dimensional complement fixed is called a simple reflection. Relative to a suitable basis its matrix is diagonal, with one diagonal entry equal to −1 and all the others equal to 1. Making use of Theorem 5 one can easily show that every orthogonal transformation can be written as the product of a number of simple rotations and simple reflections. The proof is left to the reader.

§ 17. Extremal properties of eigenvalues

In this section we show that the eigenvalues of a self-adjoint linear transformation defined on an n-dimensional Euclidean space can be obtained by considering a certain minimum problem connected with the corresponding quadratic form (Ax, x). This approach will, in particular, permit us to prove the existence of eigenvalues and eigenvectors without making use of the theorem

on the existence of a root of an nth order equation. The extremal properties are also useful in computing eigenvalues.

We shall consider the quadratic form (Ax, x) which corresponds to A on the unit sphere, i.e., on the set of vectors x such that (x, x) = 1. We shall first consider the case of a real space and then extend our results to the case of a complex space.

We first prove the following lemma:

LEMMA 1. Let B be a self-adjoint linear transformation on a real space such that the quadratic form (Bx, x) corresponding to it is non-negative, i.e., such that (Bx, x) ≥ 0 for all x. If for some vector x = e

    (Be, e) = 0,

then Be = 0.

Proof: Let x = e + th, where t is an arbitrary number and h a vector. We have

    (B(e + th), e + th) = (Be, e) + t(Be, h) + t(Bh, e) + t²(Bh, h) ≥ 0.

Since (Bh, e) = (h, Be) = (Be, h) and (Be, e) = 0, it follows that

    2t(Be, h) + t²(Bh, h) ≥ 0   for all t.

But this means that (Be, h) = 0. Indeed, the function at + bt² with a ≠ 0 changes sign at t = 0; however, in our case the expression 2t(Be, h) + t²(Bh, h) is non-negative for all t. Hence (Be, h) = 0. Since h was arbitrary, Be = 0. This proves the lemma.

THEOREM 1. Let A be a self-adjoint linear transformation on an n-dimensional real Euclidean space. Then the quadratic form (Ax, x) corresponding to A assumes its minimum λ1 on the unit sphere. The vector e1 at which the minimum is assumed is an eigenvector of A, and λ1 is the corresponding eigenvalue.

Proof: The unit sphere is a closed and bounded set in n-dimensional space. Since (Ax, x) is continuous on that set it must assume its minimum λ1 at some point e1. We have

(1)  (Ax, x) ≥ λ1 for (x, x) = 1,   and   (Ae1, e1) = λ1, (e1, e1) = 1.

Note that if we multiply x by some number a, then both sides of the inequality in (1) become multiplied by a². Since any vector can be obtained from a vector of unit length by multiplying it by some number a, it follows that the inequality

(2)  (Ax, x) ≥ λ1(x, x)

holds for vectors of arbitrary length. We now rewrite (2) in the form

    (Ax − λ1x, x) ≥ 0   for all x.

This means that the transformation B = A − λ1E satisfies the conditions of Lemma 1. Since

    (Ae1 − λ1e1, e1) = (Ae1, e1) − λ1(e1, e1) = λ1 − λ1 = 0,

we have (A − λ1E)e1 = 0, i.e., Ae1 = λ1e1. We have shown that e1 is an eigenvector of the transformation A corresponding to the eigenvalue λ1, and that λ1 is the minimum of (Ax, x) on the unit sphere. This proves the theorem.

To find the next eigenvalue of A we consider all vectors of R orthogonal to the eigenvector e1. As was shown in para. 2 of § 16 (Lemma 2), these vectors form an (n − 1)-dimensional subspace R1 invariant under A. The required second eigenvalue λ2 of A is the minimum of (Ax, x) on the unit sphere in R1. The corresponding eigenvector e2 is the point in R1 at which the minimum is assumed. Obviously, λ2 ≥ λ1, since the minimum of a function considered on the whole space cannot exceed the minimum of the function in a subspace.
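The minimum property of λ1 can be observed directly. The following sketch (not part of the original text; NumPy assumed, with an arbitrary symmetric example matrix) samples random unit vectors: the form (Ax, x) never falls below the smallest eigenvalue, and the bound is attained at the corresponding eigenvector.

```python
import numpy as np

rng = np.random.default_rng(3)
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])          # a self-adjoint (symmetric) matrix

lam, E = np.linalg.eigh(A)               # lam[0] = lambda_1, E[:, 0] = e_1

X = rng.standard_normal((3, 10000))
X /= np.linalg.norm(X, axis=0)           # random points on the unit sphere
rayleigh = np.einsum('ij,ij->j', X, A @ X)

print(rayleigh.min() >= lam[0] - 1e-12)            # (Ax, x) >= lambda_1 on the sphere
print(np.isclose(E[:, 0] @ A @ E[:, 0], lam[0]))   # the minimum is attained at e_1
```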

We obtain the next eigenvector by solving the same problem in the (n − 2)-dimensional subspace consisting of vectors orthogonal to both e1 and e2. The third eigenvalue of A is equal to the minimum of (Ax, x) in that subspace. Continuing in this manner we find all the n eigenvalues and the corresponding eigenvectors of A.

It is sometimes convenient to determine the second, third, etc., eigenvector of a transformation from an extremum problem which makes no reference to the preceding eigenvectors. Let A be a self-adjoint transformation. Denote by λ1 ≤ λ2 ≤ ... ≤ λn its eigenvalues and by e1, e2, ..., en the corresponding orthonormal eigenvectors. We shall show that if S is the subspace spanned by the first k eigenvectors e1, e2, ..., ek, then for each x ∈ S the following inequality holds:

    λ1(x, x) ≤ (Ax, x) ≤ λk(x, x).

Indeed, let

    x = ξ1e1 + ξ2e2 + ... + ξkek.

Since Aei = λiei and (ei, ej) = 0 for i ≠ j, (ei, ei) = 1, it follows that

    (Ax, x) = (A(ξ1e1 + ... + ξkek), ξ1e1 + ... + ξkek)
            = (λ1ξ1e1 + ... + λkξkek, ξ1e1 + ... + ξkek)
            = λ1ξ1² + λ2ξ2² + ... + λkξk².

Similarly,

    (x, x) = ξ1² + ξ2² + ... + ξk².

Hence

    (Ax, x) = λ1ξ1² + ... + λkξk² ≤ λk(ξ1² + ... + ξk²) = λk(x, x),

and, likewise, (Ax, x) ≥ λ1(x, x).

Now let Rk be a subspace of dimension n − k + 1. In § 7 (Lemma of para. 1) we showed that if the sum of the dimensions of two subspaces of an n-dimensional space is greater than n, then there exists a vector different from zero belonging to both subspaces. Since the sum of the dimensions of Rk and S is (n − k + 1) + k = n + 1, it follows that there exists a vector x0 common to both Rk and S. We can assume that x0 has unit length, that is, (x0, x0) = 1. It follows that

(Ax0, x0) ≤ λk. But then the minimum of (Ax, x) for x on the unit sphere in Rk must be equal to or less than λk. To sum up: if Rk is an (n − k + 1)-dimensional subspace and x varies over all vectors in Rk for which (x, x) = 1, then

    min (Ax, x) ≤ λk.

Note that among all the subspaces of dimension n − k + 1 there exists one for which min (Ax, x), (x, x) = 1, x ∈ Rk, is actually equal to λk. This is the subspace consisting of all vectors orthogonal to the first k − 1 eigenvectors e1, e2, ..., e_{k−1}. Indeed, we showed in this section that min (Ax, x), taken over all vectors orthogonal to e1, e2, ..., e_{k−1} and such that (x, x) = 1, is equal to λk.

We have thus proved the following theorem:

THEOREM. Let Rk range over all (n − k + 1)-dimensional subspaces of R. Then min (Ax, x), taken over the vectors x ∈ Rk with (x, x) = 1, never exceeds λk, and for a suitable choice of Rk it is equal to λk. Our theorem can be expressed by the formula

(3)  λk = max min (Ax, x),

where the minimum is taken over all x ∈ Rk with (x, x) = 1, and the maximum over all subspaces Rk of dimension n − k + 1.

As a consequence of our theorem we have: Let A be a self-adjoint linear transformation and B a positive definite linear transformation. Let λ1 ≤ λ2 ≤ ... ≤ λn be the eigenvalues of A and μ1 ≤ μ2 ≤ ... ≤ μn be the eigenvalues of A + B. Then λk ≤ μk. Indeed,

    ((A + B)x, x) = (Ax, x) + (Bx, x) ≥ (Ax, x)   for all x.

Hence for any (n − k + 1)-dimensional subspace Rk we have

    min ((A + B)x, x) ≥ min (Ax, x),   (x, x) = 1, x ∈ Rk.

Since, by formula (3), the maximum of the left side taken over all subspaces Rk is equal to μk and the maximum of the right side is equal to λk, it follows that λk ≤ μk.
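The consequence just stated (adding a positive definite B cannot decrease any eigenvalue) is easy to confirm numerically; a sketch with arbitrary example matrices, NumPy assumed (not part of the original text):

```python
import numpy as np

A = np.array([[0.0, 1.0], [1.0, -2.0]])     # self-adjoint
C = np.array([[1.0, 0.3], [0.0, 1.0]])
B = C.T @ C                                  # positive definite

lam_A = np.linalg.eigvalsh(A)                # lambda_1 <= lambda_2
lam_AB = np.linalg.eigvalsh(A + B)           # mu_1 <= mu_2

print(np.all(lam_AB >= lam_A))               # mu_k >= lambda_k for every k
```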

We now extend our results to the case of a complex space. To this end we need only substitute for Lemma 1 the following lemma.

LEMMA 2. Let B be a self-adjoint transformation on a complex space and let the Hermitian form (Bx, x) corresponding to B be non-negative, i.e., let

    (Bx, x) ≥ 0   for all x.

If for some vector e

    (Be, e) = 0,

then Be = 0.

Proof: Let t be an arbitrary real number and h a vector. Then

    (B(e + th), e + th) ≥ 0,

or, since (Be, e) = 0,

(4)  t[(Be, h) + (Bh, e)] + t²(Bh, h) ≥ 0.

It follows, just as in the proof of Lemma 1, that

(5)  (Be, h) + (Bh, e) = 0.

By putting ih in place of h we get, in the same way,

(5')  (Be, h) − (Bh, e) = 0.

It follows from (5) and (5') that (Be, h) = 0, and since h was arbitrary, Be = 0. This proves the lemma.

All the remaining results of this section as well as their proofs can be carried over to complex spaces without change.

CHAPTER III

The Canonical Form of an Arbitrary Linear Transformation

§ 18. The canonical form of a linear transformation

1. In chapter II we discussed various classes of linear transformations on an n-dimensional vector space which have n linearly independent eigenvectors. We found that relative to the basis consisting of the eigenvectors the matrix of such a transformation had a particularly simple form, namely, the so-called diagonal form.

However, the number of linearly independent eigenvectors of a linear transformation can be less than n (an example of such a transformation is given in the sequel; cf. also § 10). Clearly, such a transformation is not diagonable since, as noted above, any basis relative to which the matrix of a transformation is diagonal consists of linearly independent eigenvectors of the transformation. There arises the question of the simplest form of the matrix of such a transformation. In this chapter we shall find for an arbitrary transformation a basis relative to which the matrix of the transformation has a comparatively simple form (the so-called Jordan canonical form). In the case when the number of linearly independent eigenvectors of the transformation is equal to the dimension of the space the canonical form will coincide with the diagonal form.

We recall that if the characteristic polynomial has n distinct roots, then the transformation has n linearly independent eigenvectors. Hence for the number of linearly independent eigenvectors of a transformation to be less than n it is necessary that the characteristic polynomial have multiple roots. Thus, this case is, in a sense, exceptional.

We now formulate the definitive result which we shall prove in § 19.

Let A be an arbitrary linear transformation on a complex n-dimensional space and let A have k (k ≤ n) linearly independent eigenvectors

e1, f1, ..., h1, corresponding to the eigenvalues λ1, λ2, ..., λk. Then there exists a basis consisting of k sets of vectors

(1)  e1, ..., ep;  f1, ..., fq;  ...;  h1, ..., hs   (p + q + ... + s = n),

relative to which the transformation A has the form:

(2)  Ae1 = λ1e1,  Ae2 = e1 + λ1e2,  ...,  Aep = e_{p−1} + λ1ep;
     Af1 = λ2f1,  Af2 = f1 + λ2f2,  ...,  Afq = f_{q−1} + λ2fq;
     ...........................................................
     Ah1 = λkh1,  Ah2 = h1 + λkh2,  ...,  Ahs = h_{s−1} + λkhs.

If k = n, then each set consists of one vector only, namely an eigenvector.

We see that the linear transformation A described by (2) takes the basis vectors of each set into linear combinations of vectors in the same set. It therefore follows that each set of basis vectors generates a subspace invariant under A. We shall now investigate A more closely.

Every subspace generated by each one of the k sets of vectors contains an eigenvector. For instance, the subspace generated by the set e1, e2, ..., ep contains the eigenvector e1; indeed, Ae1 = λ1e1. We show that each such subspace contains only one (to within a multiplicative constant) eigenvector. Indeed, consider the subspace generated by the vectors e1, e2, ..., ep, say. Assume that some vector of this subspace, i.e., some linear combination of the form

    c1e1 + c2e2 + ... + cpep,

where not all the c's are equal to zero, is an eigenvector, that is,

    A(c1e1 + c2e2 + ... + cpep) = λ(c1e1 + c2e2 + ... + cpep).

Substituting the appropriate expressions of formula (2) on the left side we obtain

    c1λ1e1 + c2(e1 + λ1e2) + ... + cp(e_{p−1} + λ1ep) = λc1e1 + λc2e2 + ... + λcpep.

Equating the coefficients of the basis vectors we get a system of equations for the numbers λ, c1, c2, ..., cp:

    c1λ1 + c2 = λc1,
    c2λ1 + c3 = λc2,
    ..................
    c_{p−1}λ1 + cp = λc_{p−1},
    cpλ1 = λcp.

We first show that λ = λ1. Indeed, if λ ≠ λ1, then it would follow from the last equation that cp = 0, and from the remaining equations that c_{p−1} = c_{p−2} = ... = c2 = c1 = 0. Hence λ = λ1. Substituting this value for λ we get from the first equation c2 = 0, from the second c3 = 0, and so on up to cp = 0. This means that the eigenvector is equal to c1e1 and, therefore, coincides (to within a multiplicative constant) with the first vector of the corresponding set.

We now write down the matrix of the transformation (2). Since the vectors of each set are transformed into linear combinations of vectors of the same set, it follows that in the first p columns the row indices of possible non-zero elements are 1, 2, ..., p; in the next q columns the row indices of possible non-zero elements are p + 1, p + 2, ..., p + q; and so on. Thus, the matrix of the transformation relative to the basis (1) has k boxes along the main diagonal; the elements of the matrix which are outside these boxes are equal to zero. To find out what the elements in each box are it suffices to note how A transforms the vectors of the appropriate set. We have

    Ae1 = λ1e1,  Ae2 = e1 + λ1e2,  ...,  Aep = e_{p−1} + λ1ep.

Recalling how one constructs the matrix of a transformation relative to a given basis, we see that the box corresponding to the set of vectors e1, e2, ..., ep has the form

(3)  [λ1  1   0   ...  0   0 ]
     [0   λ1  1   ...  0   0 ]
     [ ...                   ]
     [0   0   0   ...  λ1  1 ]
     [0   0   0   ...  0   λ1].
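For a concrete matrix the basis (1) and the boxes (3) can be computed symbolically; a sketch using SymPy's jordan_form (not part of the original text; the example matrix is arbitrary):

```python
import sympy as sp

A = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 3]])

P, J = A.jordan_form()     # A = P * J * P**-1, J is built from boxes of the form (3)
sp.pprint(J)               # one box of order 2 for the eigenvalue 2, one of order 1 for 3
```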

The matrix of A consists of similar boxes of orders p, q, ..., s, that is, it has the form

(4)  𝒜 = [𝒜1  0   ...  0 ]
         [0   𝒜2  ...  0 ]
         [ ...           ]
         [0   0   ...  𝒜k],

where each box 𝒜i is of the form (3), with λ1 replaced by the corresponding eigenvalue, and all the elements outside of the boxes are zero.

Although a matrix in the canonical form described above seems more complicated than a diagonal matrix, one can nevertheless perform algebraic operations on it with relative ease. We show, for instance, how to compute a polynomial in the matrix (4). Since 𝒜 is built up of the square boxes 𝒜1, 𝒜2, ..., 𝒜k, we have

    𝒜² = [𝒜1²  0    ...  0  ]
         [0    𝒜2²  ...  0  ]
         [ ...              ]
         [0    0    ...  𝒜k²],

that is, in order to raise the matrix 𝒜 to some power all one has to do is raise each one of the boxes to that power. Now let

    P(t) = a0 + a1t + ... + amt^m

be any polynomial. It is easy to see that

    P(𝒜) = [P(𝒜1)  0      ...  0     ]
           [0      P(𝒜2)  ...  0     ]
           [ ...                     ]
           [0      0      ...  P(𝒜k)].

We now show how to compute P(𝒜1), say. First we write the matrix 𝒜1 in the form

    𝒜1 = λ1ℰ + 𝒥,

where ℰ is the unit matrix of order p and where the matrix 𝒥 has the form

    𝒥 = [0  1  0  ...  0]
        [0  0  1  ...  0]
        [ ...           ]
        [0  0  0  ...  1]
        [0  0  0  ...  0].

We note that the powers 𝒥², 𝒥³, ... are most easily computed by observing that

    𝒥e1 = 0,  𝒥e2 = e1,  ...,  𝒥ep = e_{p−1}.

Hence

    𝒥²e1 = 𝒥²e2 = 0,  𝒥²e3 = e1,  ...,  𝒥²ep = e_{p−2},

and similarly

    𝒥³e1 = 𝒥³e2 = 𝒥³e3 = 0,  𝒥³e4 = e1,  ...,  𝒥³ep = e_{p−3},

etc. Thus 𝒥² has ones along the diagonal lying two steps above the principal diagonal and zeros elsewhere, 𝒥³ has ones three steps above the principal diagonal, and so on; in particular, 𝒥^p = 0.

In view of Taylor's formula a polynomial P(t) can be written as

    P(t) = P(λ1) + (t − λ1)P′(λ1) + (t − λ1)²/2! · P″(λ1) + ... + (t − λ1)^n/n! · P⁽ⁿ⁾(λ1),

where n is the degree of P(t). Substituting for t the matrix 𝒜1 we get

    P(𝒜1) = P(λ1)ℰ + (𝒜1 − λ1ℰ)P′(λ1) + (𝒜1 − λ1ℰ)²/2! · P″(λ1) + ....

But 𝒜1 − λ1ℰ = 𝒥, and the powers of the matrix 𝒥 were computed above. Hence

Recalling that 𝒥^p = 𝒥^{p+1} = ... = 0, we get

    P(𝒜1) = [P(λ1)  P′(λ1)/1!  P″(λ1)/2!  ...  P⁽ᵖ⁻¹⁾(λ1)/(p−1)!]
            [0      P(λ1)      P′(λ1)/1!  ...  P⁽ᵖ⁻²⁾(λ1)/(p−2)!]
            [ ...                                               ]
            [0      0          0          ...  P(λ1)            ].

Thus in order to compute P(𝒜1), where 𝒜1 has order p, it suffices to know the value of P(t) and its first p − 1 derivatives at the point t = λ1, where λ1 is the eigenvalue of 𝒜1. It follows that if the matrix 𝒜 has canonical form (4) with boxes of order p, q, ..., s, then to compute P(𝒜) one has to know the values of P(t) and of its first p − 1 derivatives at λ1, of its first q − 1 derivatives at λ2, ..., and of its first s − 1 derivatives at λk.
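The formula for P(𝒜1) can be checked directly. The following sketch (not part of the original text; NumPy assumed, with an arbitrary eigenvalue, box order and polynomial) compares the entry-wise recipe, in which the kth superdiagonal carries P⁽ᵏ⁾(λ1)/k!, with a straightforward evaluation of the polynomial on the box.

```python
import math
import numpy as np

lam, p = 2.0, 4                                        # eigenvalue and order of the box
A1 = lam * np.eye(p) + np.diag(np.ones(p - 1), 1)      # lambda on the diagonal, ones above it

# P(t) = 1 + 3t + t^3 together with its derivatives P', P'', P'''
derivs = [lambda t: 1 + 3 * t + t ** 3,
          lambda t: 3 + 3 * t ** 2,
          lambda t: 6 * t,
          lambda t: 6.0]

direct = np.eye(p) + 3 * A1 + np.linalg.matrix_power(A1, 3)   # P(A1) computed directly

taylor = np.zeros((p, p))
for k in range(p):                                     # k-th superdiagonal carries P^(k)(lam)/k!
    taylor += np.diag(np.full(p - k, derivs[k](lam) / math.factorial(k)), k)

print(np.allclose(direct, taylor))                     # True
```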

§ 19. Reduction to canonical form

In this section we prove the following theorem³:

THEOREM 1. Let A be a linear transformation on a complex n-dimensional space. Then there exists a basis relative to which the matrix of the linear transformation has canonical form. In other words, there exists a basis relative to which A has the form (2) of § 18.

We prove the theorem by induction: we assume that the required basis exists in a space of dimension n and show that such a basis exists in a space of dimension n + 1. We need the following lemma:

LEMMA. Every linear transformation A on an n-dimensional complex space R has at least one (n − 1)-dimensional invariant subspace R'.

Proof: Consider the adjoint A* of A. Let e be an eigenvector of A*,

    A*e = λe.

We claim that the (n − 1)-dimensional subspace R' consisting of all vectors x orthogonal⁴ to e, that is, of all vectors x for which

    (x, e) = 0,

is invariant under A. Indeed, let x ∈ R', i.e., (x, e) = 0. Then

    (Ax, e) = (x, A*e) = (x, λe) = λ̄(x, e) = 0,

i.e., Ax ∈ R'. This proves the invariance of R' under A.

³ The main idea for the proof of this theorem is due to I. G. Petrovsky. See I. G. Petrovsky, Lectures on the Theory of Ordinary Differential Equations, chapter 6.
⁴ We assume here that R is Euclidean, i.e., that an inner product is defined on R. However, by changing the proof slightly we can show that the Lemma holds for any vector space R.

We now turn to the proof of Theorem 1. Let A be a linear transformation on an (n + 1)-dimensional space R. According to our lemma there exists an n-dimensional subspace R' of R, invariant under A. By the induction assumption we can choose a basis in R' relative to which A, considered on R', is in canonical form. Denote this basis by

    e1, e2, ..., ep;  f1, f2, ..., fq;  ...;  h1, h2, ..., hs,

where p + q + ... + s = n. Considered on R', the transformation A has relative to this basis the form

    Ae1 = λ1e1,  Ae2 = e1 + λ1e2,  ...,  Aep = e_{p−1} + λ1ep;
    Af1 = λ2f1,  Af2 = f1 + λ2f2,  ...,  Afq = f_{q−1} + λ2fq;
    ..........................................................
    Ah1 = λkh1,  Ah2 = h1 + λkh2,  ...,  Ahs = h_{s−1} + λkhs.

We now pick a vector e which together with the vectors e1, ..., ep, f1, ..., fq, ..., h1, ..., hs forms a basis in R. Applying the transformation A to e we get

(1)  Ae = α1e1 + ... + αpep + β1f1 + ... + βqfq + ... + δ1h1 + ... + δshs + τe.⁵

We can assume that τ = 0. Indeed, if relative to some basis A is in canonical form then relative to the same basis A − τE is also in canonical form, and conversely. Hence if τ ≠ 0 we can consider the transformation A − τE instead of A. This justifies our putting

(2)  Ae = α1e1 + ... + αpep + β1f1 + ... + βqfq + ... + δ1h1 + ... + δshs.

We shall now try to replace the vector e by some vector e' so that the expression for Ae' is as simple as possible. We shall seek e' in the form

    e' = e − (χ1e1 + ... + χpep) − (μ1f1 + ... + μqfq) − ... − (ω1h1 + ... + ωshs).

The coefficients χi, μi, ..., ωi can be chosen arbitrarily; we will choose them so that the right side of (3) below has as few terms as possible. We have

    Ae' = Ae − A(χ1e1 + ... + χpep) − A(μ1f1 + ... + μqfq) − ... − A(ω1h1 + ... + ωshs),

or, making use of (2),

(3)  Ae' = α1e1 + ... + αpep + β1f1 + ... + βqfq + ... + δ1h1 + ... + δshs
           − A(χ1e1 + ... + χpep) − A(μ1f1 + ... + μqfq) − ... − A(ω1h1 + ... + ωshs).

⁵ The linear transformation A has in the (n + 1)-dimensional space R the eigenvalues λ1, λ2, ..., λk and τ. Indeed, the matrix of A relative to the basis e1, ..., ep, f1, ..., fq, ..., h1, ..., hs, e is triangular, with the numbers λ1, ..., λ2, ..., λk, ..., τ on the principal diagonal, and the eigenvalues of a triangular matrix are equal to the entries on the diagonal (cf. § 10). Thus, as a result of the transition from the n-dimensional invariant subspace R' to the (n + 1)-dimensional space R the number of eigenvalues is increased by one, namely, by the eigenvalue τ.

We know that to each set of basis vectors in the n-dimensional space R' relative to which A is in canonical form there corresponds one eigenvalue. These eigenvalues may or may not be all different from zero.

We consider first the case when all the eigenvalues are different from zero. We shall show that in this case we can choose a vector e' so that Ae' = 0, i.e., so that the right side of (3) becomes zero. Since the transformation A takes the vectors of each set into a linear combination of vectors of the same set, it suffices to select the coefficients χ1, ..., χp, μ1, ..., μq, ..., ω1, ..., ωs so that the linear combination of each set of vectors in (3) vanishes. We show how to choose the coefficients χ1, χ2, ..., χp so that the linear combination of the vectors e1, ..., ep vanishes. The terms containing the vectors e1, ..., ep are of the form

    α1e1 + ... + αpep − A(χ1e1 + ... + χpep)
      = (α1 − χ1λ1 − χ2)e1 + (α2 − χ2λ1 − χ3)e2 + ... + (α_{p−1} − χ_{p−1}λ1 − χp)e_{p−1} + (αp − χpλ1)ep.

We put the coefficient of ep equal to zero and determine χp (this can be done since λ1 ≠ 0); next we put the coefficient of e_{p−1} equal to zero and determine χ_{p−1}, etc. The coefficients of the other sets of vectors are computed analogously. In this way the linear combination of each set of vectors vanishes and we have determined e' so that

    Ae' = 0.

The eigenvalue associated with e' is zero (or τ, if we consider the transformation A rather than A − τE). The vector e' forms a separate set. By adding this vector to the basis vectors of R' we obtain a basis

    e1, ..., ep;  f1, ..., fq;  ...;  h1, ..., hs;  e'

in the (n + 1)-dimensional space R relative to which the transformation is in canonical form.

Consider now the case when some of the eigenvalues of the transformation A on R' are zero. In this case the summands on the right side of (3) are of two types: those corresponding to sets of vectors associated with an eigenvalue different from zero and those associated with an eigenvalue equal to zero. The sets of the former type can be dealt with as above, i.e., we can choose the

coefficients so that the appropriate linear combinations of vectors in each set vanish. Let us assume that we are left with, say, three sets of vectors

    e1, ..., ep;  f1, ..., fq;  g1, ..., gr,

whose eigenvalues are equal to zero, i.e., λ1 = λ2 = λ3 = 0, and p ≥ q ≥ r. Since λ1 = λ2 = λ3 = 0, we have

    Ae1 = 0, Ae2 = e1, ..., Aep = e_{p−1};   Af1 = 0, Af2 = f1, ..., Afq = f_{q−1};   Ag1 = 0, Ag2 = g1, ..., Agr = g_{r−1}.

The terms of (3) containing the vectors e1, ..., ep are therefore of the form

    (α1 − χ2)e1 + (α2 − χ3)e2 + ... + (α_{p−1} − χp)e_{p−1} + αpep,

so that by putting χ2 = α1, χ3 = α2, ..., χp = α_{p−1}, and choosing the coefficients of the other sets analogously, we annihilate all vectors except αpep, βqfq and γrgr. We thus arrive at a vector e' such that

(4)  Ae' = αpep + βqfq + γrgr.

It might happen that αp = βq = γr = 0. In this case Ae' = 0 and, just as in the first case, the vector e' forms a separate set associated with the eigenvalue zero.

Assume now that at least one of the coefficients αp, βq, γr is different from zero, say αp ≠ 0. Then, in distinction to the previous cases, it becomes necessary to change some of the basis vectors of R'. We form a new set of vectors by putting

    e'_{p+1} = e',  e'_p = Ae'_{p+1},  e'_{p−1} = Ae'_p,  ...,  e'_1 = Ae'_2.

Thus e'_p = αpep + βqfq + γrgr, e'_{p−1} = αpe_{p−1} + βqf_{q−1} + γrg_{r−1}, etc., and, since p ≥ q ≥ r, Ae'_1 = 0. Because αp ≠ 0, the vectors e'_1, e'_2, ..., e'_{p+1} together with the remaining basis vectors are linearly independent. We now replace the basis vectors e1, e2, ..., ep and e' by the vectors e'_1, e'_2, ..., e'_{p+1} and leave the other basis vectors unchanged. Relative to the new basis the transformation A is already in canonical form. Note that the order of the first box has been increased by one. This completes the proof of the theorem.

While constructing the canonical form of A we had to distinguish two cases:

1. The case when the additional eigenvalue τ (we assumed τ = 0) did not coincide with any of the eigenvalues λ1, λ2, ..., λk. In this case a separate box of order 1 was added.

2. The case when τ coincided with one of the eigenvalues λ1, ..., λk. Then it was necessary, in general, to increase the order of one of the boxes by one.

§ 20. Elementary divisors

In this section we shall describe a method for finding the Jordan canonical form of a transformation. The results of this section will also imply the (as yet unproved) uniqueness of the canonical form.

DEFINITION 1. The matrices 𝒜1 and 𝒜2 = 𝒞⁻¹𝒜1𝒞, where 𝒞 is an arbitrary non-singular matrix, are said to be similar.

If the matrix 𝒜1 is similar to the matrix 𝒜2, then 𝒜2 is also similar to 𝒜1. Indeed, let

    𝒜2 = 𝒞⁻¹𝒜1𝒞.

If we put 𝒞⁻¹ = 𝒞1, we obtain

    𝒜1 = 𝒞𝒜2𝒞⁻¹ = 𝒞1⁻¹𝒜2𝒞1,

i.e., 𝒜1 is similar to 𝒜2. It is easy to see that if two matrices 𝒜1 and 𝒜2 are similar to some matrix 𝒜, then 𝒜1 is similar to 𝒜2. Indeed, let 𝒜1 = 𝒞1⁻¹𝒜𝒞1 and 𝒜2 = 𝒞2⁻¹𝒜𝒞2. Then 𝒜 = 𝒞2𝒜2𝒞2⁻¹, and therefore

    𝒜1 = 𝒞1⁻¹𝒞2𝒜2𝒞2⁻¹𝒞1 = (𝒞2⁻¹𝒞1)⁻¹𝒜2(𝒞2⁻¹𝒞1),

i.e., 𝒜1 is similar to 𝒜2.

Let 𝒜 be the matrix of a transformation A relative to some basis. If 𝒞 is the matrix of transition from this basis to a new basis (§ 9), then 𝒞⁻¹𝒜𝒞 is the matrix which represents A relative to the new basis. Thus similar matrices represent the same linear transformation relative to different bases.

We now wish to obtain invariants of a transformation from its matrix, i.e., expressions depending on the transformation alone. In other words, we wish to construct functions of the elements of a matrix which assume the same values for similar matrices. One such invariant was found in § 10, where we showed that the characteristic polynomial of a matrix 𝒜, i.e., the determinant of the matrix 𝒜 − λℰ,

    D(λ) = |𝒜 − λℰ|,

is the same for 𝒜 and for any matrix similar to 𝒜. We now construct a whole system of invariants which will include the characteristic polynomial. This will be a complete system of invariants in the sense that if the invariants in question are the same for two matrices then the matrices are similar.

Let 𝒜 be a matrix of order n. The kth order minors of the matrix 𝒜 − λℰ are certain polynomials in λ. We denote by Dk(λ) the greatest common divisor of those minors.⁶ We also put D0(λ) = 1. We choose Dk(λ) to be a monic polynomial; in particular, if the kth order minors are pairwise coprime we take Dk(λ) to be 1.

⁶ The greatest common divisor is determined to within a numerical multiplier.

We observe that D_{n−1}(λ) divides Dn(λ). Indeed, if we expand the determinant Dn(λ) by the elements of any row we obtain a sum each of whose summands is a product of an element of the row in question by its cofactor. Since, by the definition of D_{n−1}(λ), all minors of order n − 1 are divisible by D_{n−1}(λ), it follows that Dn(λ) is indeed divisible by D_{n−1}(λ). Similarly, D_{n−1}(λ) is divisible by D_{n−2}(λ), etc. In particular, Dn(λ) is the determinant of the matrix 𝒜 − λℰ.

EXERCISE. Find Dk(λ) (k = 1, 2, 3) for the matrix

    [λ0  1   0 ]
    [0   λ0  1 ]
    [0   0   λ0].

Answer: D3(λ) = (λ − λ0)³, D2(λ) = D1(λ) = 1.

LEMMA 1. If 𝒞 is an arbitrary non-singular matrix, then the greatest common divisors of the kth order minors of the matrices 𝒜 − λℰ, 𝒞(𝒜 − λℰ) and (𝒜 − λℰ)𝒞 are the same.

Proof: Consider the pair of matrices 𝒜 − λℰ and (𝒜 − λℰ)𝒞. The entries of any row of (𝒜 − λℰ)𝒞 are linear combinations of the entries of the corresponding row of 𝒜 − λℰ with coefficients from 𝒞, i.e., coefficients independent of λ. It follows that every minor of (𝒜 − λℰ)𝒞 is the sum of minors of 𝒜 − λℰ, each multiplied by some number. Hence every divisor of the kth order minors of 𝒜 − λℰ must divide every kth order minor of (𝒜 − λℰ)𝒞. To prove the converse we apply the same reasoning to the pair of matrices (𝒜 − λℰ)𝒞 and [(𝒜 − λℰ)𝒞]𝒞⁻¹ = 𝒜 − λℰ. This proves that the greatest common divisors of the kth order minors of 𝒜 − λℰ and (𝒜 − λℰ)𝒞 are the same. An analogous statement holds for the matrices 𝒜 − λℰ and 𝒞(𝒜 − λℰ).

LEMMA 2. For similar matrices the polynomials Dk(λ) are identical.

Proof: Let 𝒜 and ℬ = 𝒞⁻¹𝒜𝒞 be two similar matrices. By Lemma 1 the greatest common divisor of the kth order minors of 𝒜 − λℰ is the same as the corresponding greatest common divisor for 𝒞⁻¹(𝒜 − λℰ)𝒞 = ℬ − λℰ. Hence the Dk(λ) for 𝒜 and ℬ are identical.

In view of the fact that the matrices which represent a transformation in different bases are similar, we conclude on the basis of Lemma 2 that

THEOREM 1. Let A be a linear transformation. Then the greatest common divisor Dk(λ) of the kth order minors of the matrix 𝒜 − λℰ, where 𝒜 represents the transformation A in some basis, does not depend on the choice of basis.

Theorem 1 tells us that in computing the Dk(λ) we may use the matrix which represents A relative to an arbitrarily selected basis. We shall find it convenient to choose the basis relative to which the matrix of the transformation is in Jordan canonical form. Our task is then to compute the polynomials Dk(λ) for the matrix 𝒜 in Jordan canonical form.

We first find the Dk(λ) for an nth order matrix of the form

(1)  [λ0  1   0   ...  0 ]
     [0   λ0  1   ...  0 ]
     [ ...              ]
     [0   0   0   ...  1 ]
     [0   0   0   ...  λ0],

i.e., for one "box" of the canonical form. Clearly, Dn(λ) = (λ − λ0)ⁿ. If we cross out in 𝒜 − λℰ, where 𝒜 is the matrix (1), the first column and the last row, we obtain a matrix with ones on the principal diagonal and zeros above it, so that the corresponding minor equals one. Hence D_{n−1}(λ) = 1, and therefore also D_{n−2}(λ) = ... = D1(λ) = 1. Thus for an individual "box" [matrix (1)] the Dk(λ) are

    (λ − λ0)ⁿ, 1, ..., 1.

We observe further that if ℬ is a matrix of the form

    ℬ = [ℬ1  0 ]
        [0   ℬ2],

where ℬ1 and ℬ2 are of orders n1 and n2, then the mth order non-zero


minors of the matrix ℬ are of the form

    Δm = Δ_{m1}⁽¹⁾ · Δ_{m2}⁽²⁾,   m1 + m2 = m.

Here Δ_{m1}⁽¹⁾ are the minors of ℬ1 of order m1 and Δ_{m2}⁽²⁾ the minors of ℬ2 of order m2. Indeed, if one singles out those of the first n1 rows which enter into the minor in question and expands it by these rows (using the theorem of Laplace), the result is zero or is of the form Δ_{m1}⁽¹⁾ Δ_{m2}⁽²⁾.

We shall now find the polynomials Dk(λ) for an arbitrary matrix 𝒜 which is in Jordan canonical form. We assume that 𝒜 has p boxes corresponding to the eigenvalue λ1, q boxes corresponding to the eigenvalue λ2, etc. We denote the orders of the boxes corresponding to the eigenvalue λ1 by n1, n2, ..., np (n1 ≥ n2 ≥ ... ≥ np). Let ℬi denote the ith box in ℬ = 𝒜 − λℰ. Then ℬ1, say, is of the form

    ℬ1 = [λ1 − λ   1        0   ...  0      ]
         [0        λ1 − λ   1   ...  0      ]
         [ ...                              ]
         [0        0        0   ...  1      ]
         [0        0        0   ...  λ1 − λ ].

We first compute Dn(λ), i.e., the determinant of ℬ. This determinant is the product of the determinants of the ℬi, i.e.,

    Dn(λ) = (λ − λ1)^(n1+n2+...+np) (λ − λ2)^(m1+m2+...+mq) ....

We now compute D_{n−1}(λ). Since D_{n−1}(λ) is a factor of Dn(λ), it must be a product of the factors λ − λ1, λ − λ2, .... The problem now is to compute the degrees of these factors. Specifically, we compute the degree of λ − λ1 in D_{n−1}(λ). We observe that any non-zero minor of order n − 1 of ℬ = 𝒜 − λℰ is of the form

    Δ_{n−1} = Δ_{t1}⁽¹⁾ Δ_{t2}⁽²⁾ ... Δ_{tk}⁽ᵏ⁾,

where t1 + t2 + ... + tk = n − 1 and Δ_{ti}⁽ⁱ⁾ denotes a tith order minor of the matrix ℬi.⁷

⁷ Of course, a non-zero minor of ℬ may be entirely made up of elements of one of the boxes ℬi; in this case we shall still write it formally as a product of minors, the remaining factors being minors of order zero, which we put equal to 1.


Since the sum of the orders of the minors Δ_{t1}⁽¹⁾, ..., Δ_{tk}⁽ᵏ⁾ is n − 1, exactly one of these minors is of order one lower than the order of the corresponding matrix ℬi, i.e., it is obtained by crossing out a row and a column in a box of the matrix ℬ. As we saw above, crossing out an appropriate row and column in a box may yield a minor equal to one. Therefore it is possible to select Δ_{n−1} so that some Δ_{ti}⁽ⁱ⁾ is one and the remaining minors are equal to the determinants of the appropriate boxes. It follows that in order to obtain a minor of order n − 1 of lowest possible degree in λ − λ1 it suffices to cross out a suitable row and column in the box of maximal order corresponding to λ1. This is the box of order n1. Thus the greatest common divisor D_{n−1}(λ) of the minors of order n − 1 contains λ − λ1 raised to the power n2 + n3 + ... + np. Likewise, to obtain a minor of order n − 2 with the lowest possible power of λ − λ1 it suffices to cross out an appropriate row and column in the boxes of order n1 and n2 corresponding to λ1. Thus D_{n−2}(λ) contains λ − λ1 to the power n3 + n4 + ... + np, etc. The polynomials D_{n−p}(λ), D_{n−p−1}(λ), ..., D1(λ) do not contain λ − λ1 at all.

Similar arguments apply in the determination of the degrees of λ − λ2, λ − λ3, ... in Dk(λ). We have thus proved the following result.
If the Jordan canonical form of the matrix of a linear transformation A contains p boxes of order n1, n2, ..., np (n1 ≥ n2 ≥ ... ≥ np) corresponding to the eigenvalue λ1, q boxes of order m1, m2, ..., mq (m1 ≥ m2 ≥ ... ≥ mq) corresponding to the eigenvalue λ2, etc., then

    Dn(λ)     = (λ − λ1)^(n1+n2+n3+...+np) (λ − λ2)^(m1+m2+m3+...+mq) ...,
    D_{n−1}(λ) = (λ − λ1)^(n2+n3+...+np)    (λ − λ2)^(m2+m3+...+mq)    ...,
    D_{n−2}(λ) = (λ − λ1)^(n3+n4+...+np)    (λ − λ2)^(m3+m4+...+mq)    ...,
    ............................................................

Beginning with D_{n−p}(λ) the factor (λ − λ1) is replaced by one. Beginning with D_{n−q}(λ) the factor (λ − λ2) is replaced by one, etc.

In the important special case when there is exactly one box of order n1 corresponding to the eigenvalue λ1, exactly one box of order m1 corresponding to the eigenvalue λ2, exactly one box of order k1 corresponding to the eigenvalue λ3, etc., the Dk(λ) have the following form:


    Dn(λ)     = (λ − λ1)^(n1) (λ − λ2)^(m1) (λ − λ3)^(k1) ...,
    D_{n−1}(λ) = 1,
    D_{n−2}(λ) = 1,
    ..................

The expressions for the Dk(λ) show that in place of the Dk(λ) it is more convenient to consider their ratios

    Ek(λ) = Dk(λ) / D_{k−1}(λ).

The Ek(λ) are called elementary divisors. Thus if the Jordan canonical form of a matrix 𝒜 contains p boxes of order n1, n2, ..., np (n1 ≥ n2 ≥ ... ≥ np) corresponding to the eigenvalue λ1, q boxes of order m1, m2, ..., mq (m1 ≥ m2 ≥ ... ≥ mq) corresponding to the eigenvalue λ2, etc., then the elementary divisors Ek(λ) are

    En(λ)     = (λ − λ1)^(n1) (λ − λ2)^(m1) ...,
    E_{n−1}(λ) = (λ − λ1)^(n2) (λ − λ2)^(m2) ...,
    E_{n−2}(λ) = (λ − λ1)^(n3) (λ − λ2)^(m3) ...,
    .............................................

Prescribing the elementary divisors En(λ), E_{n−1}(λ), ... determines the Jordan canonical form of the matrix 𝒜 uniquely. The eigenvalues λi are the roots of the equation En(λ) = 0. The orders n1, n2, ..., np of the boxes corresponding to the eigenvalue λ1 coincide with the powers of (λ − λ1) in En(λ), E_{n−1}(λ), ....

We can now state a necessary and sufficient condition for the existence of a basis in which the matrix of a linear transformation is diagonal.

A necessary and sufficient condition for the existence of a basis in which the matrix of a transformation is diagonal is that the elementary divisors have simple roots only.

Indeed, we saw that the multiplicities of the roots $\lambda_1, \lambda_2, \ldots$ of the elementary divisors determine the orders of the boxes in the Jordan canonical form. Thus the simplicity of the roots of the elementary divisors signifies that all the boxes are of order one, i.e., that the Jordan canonical form of the matrix is diagonal.
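Thus, for example, the matrices

$$\begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix}$$

(chosen here merely as an illustration) have the elementary divisors $E_2(\lambda) = (\lambda - 3)^2$, $E_1(\lambda) = 1$ and $E_2(\lambda) = \lambda - 3$, $E_1(\lambda) = \lambda - 3$, respectively. The elementary divisors of the first matrix have a multiple root, and the corresponding transformation cannot be diagonalized; the roots of the elementary divisors of the second are simple.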
THEOREM 2. For two matrices to be similar it is necessary and sufficient that they have the same elementary divisors.


Proof: We showed (Lemma 2) that similar matrices have the same polynomials $D_k(\lambda)$ and therefore the same elementary divisors $E_k(\lambda)$ (since the latter are quotients of the $D_k(\lambda)$).

Conversely, let two matrices $\mathscr{A}$ and $\mathscr{B}$ have the same elementary divisors. $\mathscr{A}$ and $\mathscr{B}$ are similar to Jordan canonical matrices. Since the elementary divisors of $\mathscr{A}$ and $\mathscr{B}$ are the same, their Jordan canonical forms must also be the same. This means that $\mathscr{A}$ and $\mathscr{B}$ are similar to the same matrix. But this means that $\mathscr{A}$ and $\mathscr{B}$ are similar matrices.

THEOREM 3. The Jordan canonical form of a linear transformation is uniquely determined by the linear transformation.

Proof: The matrices of $A$ relative to different bases are similar. Since similar matrices have the same elementary divisors and these determine uniquely the Jordan canonical form of a matrix, our theorem follows.

We are now in a position to find the Jordan canonical form of a matrix of a linear transformation. For this it suffices to find the elementary divisors of the matrix of the transformation relative to some basis. When these are represented as products of the form $(\lambda - \lambda_0)^{n}(\lambda - \lambda_1)^{m} \cdots$ we have the eigenvalues as well as the orders of the boxes corresponding to each eigenvalue.
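By way of illustration (the matrix is chosen arbitrarily), let

$$A = \begin{pmatrix} 1 & 1 \\ -1 & 3 \end{pmatrix}.$$

The first order minors of $A - \lambda E$ are $1 - \lambda$, $1$, $-1$, $3 - \lambda$, so that $D_1(\lambda) = 1$, while $D_2(\lambda) = \det(A - \lambda E) = (\lambda - 2)^2$. Hence $E_2(\lambda) = (\lambda - 2)^2$ and $E_1(\lambda) = 1$, and the Jordan canonical form of $A$ consists of a single box of order two with the eigenvalue 2, i.e., it is the matrix

$$\begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}.$$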

§ 21. Polynomial matrices
1. By a polynomial matrix we mean a matrix whose entries are polynomials in some letter $\lambda$. By the degree of a polynomial matrix we mean the maximal degree of its entries. It is clear that a polynomial matrix of degree $n$ can be written in the form

$$A_0 \lambda^n + A_1 \lambda^{n-1} + \cdots + A_n,$$

where the $A_i$ are constant matrices.⁸ The matrices $A - \lambda E$ which we considered on a number of occasions are of this type. The results to be derived in this section contain as special cases many of the results obtained in the preceding sections for matrices of the form $A - \lambda E$.

⁸ In this section matrices are denoted by printed Latin capitals.
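For instance (with entries chosen at random), the polynomial matrix

$$\begin{pmatrix} \lambda^2 + 1 & 2\lambda \\ 3 & \lambda^2 - \lambda \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\lambda^2 + \begin{pmatrix} 0 & 2 \\ 0 & -1 \end{pmatrix}\lambda + \begin{pmatrix} 1 & 0 \\ 3 & 0 \end{pmatrix}$$

is of degree two.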


Polynomial matrices occur in many areas of mathematics. Thus, for example, in solving a system of first order homogeneous linear differential equations with constant coefficients

$$\frac{dy_i}{dx} = \sum_{k=1}^{n} a_{ik} y_k \qquad (i = 1, 2, \ldots, n) \tag{1}$$

we seek solutions of the form

$$y_k = c_k e^{\lambda x}, \tag{2}$$

where $\lambda$ and the $c_k$ are constants. To determine these constants we substitute the functions in (2) in the equations (1) and divide by $e^{\lambda x}$. We are thus led to the following system of linear equations:

$$\lambda c_i = \sum_{k=1}^{n} a_{ik} c_k.$$

The matrix of this system of equations is $A - \lambda E$, with $A$ the matrix of coefficients in the system (1). Thus the study of the system of differential equations (1) is closely linked to polynomial matrices of degree one, namely, those of the form $A - \lambda E$.

Similarly, the study of higher order systems of differential equations leads to polynomial matrices of degree higher than one. Thus the study of the system

$$\sum_{k=1}^{n} a_{ik} \frac{d^2 y_k}{dx^2} + \sum_{k=1}^{n} b_{ik} \frac{dy_k}{dx} + \sum_{k=1}^{n} c_{ik} y_k = 0$$

is synonymous with the study of the polynomial matrix $A\lambda^2 + B\lambda + C$, where $A = \|a_{ik}\|$, $B = \|b_{ik}\|$, $C = \|c_{ik}\|$.
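For example, for the system $dy_1/dx = y_2$, $dy_2/dx = 2y_1 + y_2$ (taken here only as an illustration) we have

$$A - \lambda E = \begin{pmatrix} -\lambda & 1 \\ 2 & 1 - \lambda \end{pmatrix},$$

and the system $\lambda c_i = \sum_k a_{ik} c_k$ has a non-trivial solution only when $\det(A - \lambda E) = \lambda^2 - \lambda - 2 = 0$, i.e., for $\lambda = 2$ and $\lambda = -1$. For $\lambda = 2$ one may take $c_1 = 1$, $c_2 = 2$, which yields the solution $y_1 = e^{2x}$, $y_2 = 2e^{2x}$ of the system.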

We now consider the problem of the canonical form of polynomial matrices with respect to so-called elementary transformations. The term "elementary" applies to the following classes of transformations:

1. Permutation of two rows or columns.

2. Addition to some row of another row multiplied by some polynomial $\varphi(\lambda)$ and, similarly, addition to some column of another column multiplied by some polynomial.

3. Multiplication of some row or column by a non-zero constant.

DEFINITION 1. Two polynomial matrices are called equivalent if it is possible to obtain one from the other by a finite number of elementary transformations.
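For instance, the matrices

$$\begin{pmatrix} \lambda & 1 \\ \lambda^2 & \lambda + 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} \lambda & 1 \\ 0 & 1 \end{pmatrix}$$

(written down only by way of example) are equivalent: the second is obtained from the first by adding the first row multiplied by $-\lambda$ to the second row, which is an elementary transformation of type 2.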

The inverse of an elementary transformation is again an elementary transformation. This is easily seen for each of the three types of elementary transformations. Thus, e.g., if the polynomial matrix $B(\lambda)$ is obtained from the polynomial matrix $A(\lambda)$ by a permutation of rows, then the inverse permutation takes $B(\lambda)$ into $A(\lambda)$. Again, if $B(\lambda)$ is obtained from $A(\lambda)$ by adding the $i$th row multiplied by $\varphi(\lambda)$ to the $k$th row, then $A(\lambda)$ can be obtained from $B(\lambda)$ by adding to the $k$th row of $B(\lambda)$ the $i$th row multiplied by $-\varphi(\lambda)$.

The above remark implies that if a polynomial matrix $K(\lambda)$ is equivalent to $L(\lambda)$, then $L(\lambda)$ is equivalent to $K(\lambda)$. Indeed, if $L(\lambda)$ is the result of applying a sequence of elementary transformations to $K(\lambda)$, then by applying the inverse transformations in reverse order to $L(\lambda)$ we obtain $K(\lambda)$.
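For instance, if $B(\lambda)$ is obtained from

$$A(\lambda) = \begin{pmatrix} 1 & \lambda \\ \lambda^2 & \lambda^3 + 1 \end{pmatrix}$$

(a matrix chosen only for illustration) by adding the first row multiplied by $-\lambda^2$ to the second row, so that $B(\lambda) = \begin{pmatrix} 1 & \lambda \\ 0 & 1 \end{pmatrix}$, then adding the first row of $B(\lambda)$ multiplied by $\lambda^2$ to its second row restores $A(\lambda)$.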
If two polynomial matrices $K_1(\lambda)$ and $K_2(\lambda)$ are equivalent to a third matrix $K(\lambda)$, then they must be equivalent to each other. Indeed, by applying to $K_1(\lambda)$ first the elementary transformations which take it into $K(\lambda)$ and then the elementary transformations which take $K(\lambda)$ into $K_2(\lambda)$, we will have taken $K_1(\lambda)$ into $K_2(\lambda)$. Thus $K_1(\lambda)$ and $K_2(\lambda)$ are indeed equivalent.

The main result of para. 1 of this section asserts the possibility of diagonalizing a polynomial matrix by means of elementary transformations. We precede the proof of this result with the following lemma:
LEMMA. If the element $a_{11}(\lambda)$ of a polynomial matrix $A(\lambda)$ is not zero and if not all the elements $a_{ik}(\lambda)$ of $A(\lambda)$ are divisible by $a_{11}(\lambda)$, then it is possible to find a polynomial matrix $B(\lambda)$ equivalent to $A(\lambda)$ and such that $b_{11}(\lambda)$ is also different from zero and its degree is less than that of $a_{11}(\lambda)$.

Proof: Assume that the element of $A(\lambda)$ which is not divisible by $a_{11}(\lambda)$ is in the first row. Thus let $a_{1k}(\lambda)$ not be divisible by $a_{11}(\lambda)$. Then $a_{1k}(\lambda)$ is of the form

$$a_{1k}(\lambda) = a_{11}(\lambda)\varphi(\lambda) + b(\lambda),$$

where $b(\lambda) \not\equiv 0$ and of degree less than $a_{11}(\lambda)$. Multiplying the first column by $\varphi(\lambda)$ and subtracting the result from the $k$th column, we obtain a matrix with $b(\lambda)$ in place of $a_{1k}(\lambda)$, where the degree of $b(\lambda)$ is less than that of $a_{11}(\lambda)$. Permuting the first and $k$th columns of the new matrix puts $b(\lambda)$ in the upper left corner and results in a matrix with the desired properties. We can proceed in an analogous manner if the element not divisible by $a_{11}(\lambda)$ is in the first column.

Now let all the elements of the first row and first column be divisible by $a_{11}(\lambda)$ and let $a_{ik}(\lambda)$ be an element not divisible by $a_{11}(\lambda)$. We will reduce this case to the one just considered. Since $a_{i1}(\lambda)$ is divisible by $a_{11}(\lambda)$, it must be of the form $a_{i1}(\lambda) = \varphi(\lambda) a_{11}(\lambda)$. If we subtract from the $i$th row the first row multiplied by $\varphi(\lambda)$, then $a_{i1}(\lambda)$ is replaced by zero and $a_{ik}(\lambda)$ is replaced by $a'_{ik}(\lambda) = a_{ik}(\lambda) - \varphi(\lambda) a_{1k}(\lambda)$, which again is not divisible by $a_{11}(\lambda)$ (this because we assumed that $a_{1k}(\lambda)$ is divisible by $a_{11}(\lambda)$). We now add the $i$th row to the first row. This leaves $a_{11}(\lambda)$ unchanged and replaces $a_{1k}(\lambda)$ by $a_{1k}(\lambda) + a'_{ik}(\lambda)$, which is not divisible by $a_{11}(\lambda)$. Thus the first row now contains an element not divisible by $a_{11}(\lambda)$ and this is the case dealt with before. This completes the proof of our lemma.

In the sequel we shall make use of the following observation: if all the elements of a polynomial matrix $B(\lambda)$ are divisible by some polynomial $E(\lambda)$, then all the entries of a matrix equivalent to $B(\lambda)$ are again divisible by $E(\lambda)$.

We are now in a position to reduce a polynomial matrix to diagonal form. We may assume that $a_{11}(\lambda) \neq 0$; otherwise a suitable permutation of rows and columns puts a non-zero element in place of $a_{11}(\lambda)$. If not all the elements of our matrix are divisible by $a_{11}(\lambda)$, then, in view of our lemma, we can replace our matrix with an equivalent one in which the element in the upper left corner is of lower degree than $a_{11}(\lambda)$ and still different from zero. Repeating this procedure a finite number of times we obtain a matrix $B(\lambda)$ all of whose elements are divisible by $b_{11}(\lambda)$.

Since $b_{12}(\lambda), \ldots, b_{1n}(\lambda)$ and $b_{21}(\lambda), \ldots, b_{n1}(\lambda)$ are divisible by $b_{11}(\lambda)$, we can, by subtracting from the second, third, $\ldots$, $n$th columns suitable multiples of the first column, replace the second, third, $\ldots$, $n$th elements of the first row with zero. Similarly, the second, third, $\ldots$, $n$th elements of the first column can be replaced with zero. The new matrix inherits from $B(\lambda)$ the property that all its entries are divisible by $b_{11}(\lambda)$. Dividing the first row by the leading coefficient of $b_{11}(\lambda)$ replaces $b_{11}(\lambda)$ with a monic polynomial $E_1(\lambda)$ but does not affect the zeros in that row.

We now have a matrix of the form

$$\begin{pmatrix} E_1(\lambda) & 0 & 0 & \cdots & 0 \\ 0 & c_{22}(\lambda) & c_{23}(\lambda) & \cdots & c_{2n}(\lambda) \\ 0 & c_{32}(\lambda) & c_{33}(\lambda) & \cdots & c_{3n}(\lambda) \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & c_{n2}(\lambda) & c_{n3}(\lambda) & \cdots & c_{nn}(\lambda) \end{pmatrix} \tag{3}$$

all of whose elements are divisible by $E_1(\lambda)$.

We can apply to the matrix of the $c_{ik}(\lambda)$, of order $n-1$, the same procedure which we just applied to the matrix of order $n$. Then $c_{22}(\lambda)$ is replaced by a monic polynomial $E_2(\lambda)$ and the other $c_{ik}(\lambda)$ in its first row and first column are replaced with zeros. Since the entries of the larger matrix other than $E_1(\lambda)$ are zeros, an elementary transformation of the matrix of the $c_{ik}(\lambda)$ can be viewed as an elementary transformation of the larger matrix. Thus we obtain a matrix whose "off-diagonal" elements in the first two rows and columns are zero and whose first two diagonal elements are monic polynomials $E_1(\lambda)$, $E_2(\lambda)$, with $E_2(\lambda)$ a multiple of $E_1(\lambda)$. Repetition of this process obviously leads to a diagonal matrix. This proves

THEOREM 1. Every polynomial matrix can be reduced by elementary transformations to the diagonal form

$$\begin{pmatrix} E_1(\lambda) & 0 & 0 & \cdots & 0 \\ 0 & E_2(\lambda) & 0 & \cdots & 0 \\ 0 & 0 & E_3(\lambda) & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & E_n(\lambda) \end{pmatrix} \tag{4}$$

Here the diagonal elements $E_k(\lambda)$ are monic polynomials and $E_1(\lambda)$ divides $E_2(\lambda)$, $E_2(\lambda)$ divides $E_3(\lambda)$, etc. (It may, of course, happen that $E_{r+1}(\lambda) = E_{r+2}(\lambda) = \cdots = E_n(\lambda) = 0$ for some value of $r$.) This form of a polynomial matrix is called its canonical diagonal form.
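As an illustration (the matrix here is chosen arbitrarily), we reduce

$$A(\lambda) = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}$$

to canonical diagonal form. Permuting the two columns gives $\begin{pmatrix} 1 & \lambda \\ \lambda & 0 \end{pmatrix}$; subtracting the first column multiplied by $\lambda$ from the second column and then the first row multiplied by $\lambda$ from the second row yields $\begin{pmatrix} 1 & 0 \\ 0 & -\lambda^2 \end{pmatrix}$; finally, multiplying the second row by $-1$ we obtain the canonical diagonal form

$$\begin{pmatrix} 1 & 0 \\ 0 & \lambda^2 \end{pmatrix},$$

in which $E_1(\lambda) = 1$ indeed divides $E_2(\lambda) = \lambda^2$.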

REMARK: We have brought $A(\lambda)$ to a diagonal form in which every diagonal element is divisible by its predecessor. If we dispense with the latter requirement the process of diagonalization can be considerably simplified. Indeed, to replace the off-diagonal elements of the first row and column with zeros it is sufficient that these elements (and not all the elements of the matrix) be divisible by $a_{11}(\lambda)$. As can be seen from the proof of the lemma, this requires far fewer elementary transformations than reduction to canonical diagonal form. Once the off-diagonal elements of the first row and first column are all zero we repeat the process until we reduce the matrix to diagonal form. In this way the matrix can be reduced to various diagonal forms, i.e., the diagonal form of a polynomial matrix is not uniquely determined. On the other hand, we will see in the next paragraph that the canonical diagonal form of a polynomial matrix is uniquely determined.

EXERCISE. Reduce the polynomial matrix

$$\begin{pmatrix} \lambda - \lambda_1 & 0 \\ 0 & \lambda - \lambda_2 \end{pmatrix} \qquad (\lambda_1 \neq \lambda_2)$$

to canonical diagonal form. Answer:

$$\begin{pmatrix} 1 & 0 \\ 0 & (\lambda - \lambda_1)(\lambda - \lambda_2) \end{pmatrix}.$$

2. In this paragraph we prove that the canonical diagonal form of a given polynomial matrix is uniquely determined by this matrix. To this end we shall construct a system of polynomials connected with the given polynomial matrix which are invariant under elementary transformations and which determine the canonical diagonal form completely.

Let there be given an arbitrary polynomial matrix. Let $D_k(\lambda)$ denote the greatest common divisor of all $k$th order minors of the given matrix. Since $D_k(\lambda)$ is determined to within a multiplicative constant, we take its leading coefficient to be one; in particular, if the greatest common divisor of the $k$th order minors is a constant, we take $D_k(\lambda) = 1$. If all $k$th order minors and, consequently, all minors of order higher than $k$ are zero, then we put $D_k(\lambda) = D_{k+1}(\lambda) = \cdots = D_n(\lambda) = 0$. As before, it is convenient to put $D_0(\lambda) = 1$.

We shall prove that the polynomials $D_k(\lambda)$ are invariant under elementary transformations, i.e., that equivalent matrices have the same polynomials $D_k(\lambda)$. In the case of elementary transformations of type 1, which permute rows or columns, this is obvious, since such transformations either do not affect a particular $k$th order minor at all, or change its sign, or replace it with another $k$th order minor.

Likewise, elementary transformations of type 3 do not change $D_k(\lambda)$, since under such transformations the minors are at most multiplied by a constant. Now consider elementary transformations of type 2, say, addition of the $j$th column multiplied by $\varphi(\lambda)$ to the $i$th column. If some particular $k$th order minor contains none of these columns, or if it contains both of them, it is not affected by the transformation in question. If it contains the $i$th column but not the $j$th column, we can write it as a combination of minors each of which appears in the original matrix. Thus in this case, too, the greatest common divisor of the $k$th order minors remains unchanged. In all these cases, then, the greatest common divisor of all $k$th order minors remains unchanged. We observe that equality of the $D_k(\lambda)$ for all equivalent matrices implies that equivalent matrices have the same rank.

We now compute the polynomials $D_k(\lambda)$ for a matrix in canonical diagonal form

$$\begin{pmatrix} E_1(\lambda) & 0 & \cdots & 0 \\ 0 & E_2(\lambda) & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & E_n(\lambda) \end{pmatrix} \tag{5}$$

We observe that in the case of a diagonal matrix the only non-zero minors are the principal minors, that is, minors made up of like numbered rows and columns. These minors are of the form

$$E_{i_1}(\lambda) E_{i_2}(\lambda) \cdots E_{i_k}(\lambda) \qquad (i_1 < i_2 < \cdots < i_k).$$

Since all the polynomials $E_k(\lambda)$ are divisible by $E_1(\lambda)$, the greatest common divisor $D_1(\lambda)$ of all minors of order one is $E_1(\lambda)$. Further, since $E_2(\lambda)$ is divisible by $E_1(\lambda)$ and all polynomials other than $E_1(\lambda)$ are divisible by $E_2(\lambda)$, every product $E_i(\lambda)E_j(\lambda)$ $(i < j)$ is divisible by the minor $E_1(\lambda)E_2(\lambda)$; hence $D_2(\lambda) = E_1(\lambda)E_2(\lambda)$. Likewise, every product $E_i(\lambda)E_j(\lambda)E_k(\lambda)$ $(i < j < k)$ is divisible by $E_1(\lambda)E_2(\lambda)E_3(\lambda)$, so that $D_3(\lambda) = E_1(\lambda)E_2(\lambda)E_3(\lambda)$, etc.

Thus for the matrix (5)

$$D_k(\lambda) = E_1(\lambda)E_2(\lambda) \cdots E_k(\lambda) \qquad (k = 1, 2, \ldots, r), \tag{6}$$

and if, beginning with some value of $r$, $E_{r+1}(\lambda) = E_{r+2}(\lambda) = \cdots = E_n(\lambda) = 0$, then also

$$D_{r+1}(\lambda) = D_{r+2}(\lambda) = \cdots = D_n(\lambda) = 0.$$

Thus the diagonal entries of a polynomial matrix in canonical diagonal form (5) are given by the quotients

$$E_k(\lambda) = \frac{D_k(\lambda)}{D_{k-1}(\lambda)} \qquad (k = 1, 2, \ldots, r);$$

if $D_{r+1}(\lambda) = \cdots = D_n(\lambda) = 0$, we must put $E_{r+1}(\lambda) = \cdots = E_n(\lambda) = 0$. The polynomials $E_k(\lambda)$ are called elementary divisors. In § 20 we defined the elementary divisors of matrices of the form $A - \lambda E$.

THEOREM 2. The canonical diagonal form of a polynomial matrix $A(\lambda)$ is uniquely determined by this matrix. Namely, if $D_k(\lambda)$ $(k = 1, 2, \ldots, r)$ is the greatest common divisor of all $k$th order minors of $A(\lambda)$ and $D_{r+1}(\lambda) = D_{r+2}(\lambda) = \cdots = D_n(\lambda) = 0$, then the elements of the canonical diagonal form (5) are defined by the formulas

$$E_k(\lambda) = \frac{D_k(\lambda)}{D_{k-1}(\lambda)} \quad (k = 1, 2, \ldots, r), \qquad E_{r+1}(\lambda) = E_{r+2}(\lambda) = \cdots = E_n(\lambda) = 0.$$

Proof: We showed that the polynomials $D_k(\lambda)$ are invariant under elementary transformations. Hence if the matrix $A(\lambda)$ is equivalent to a diagonal matrix (5), then both have the same polynomials $D_k(\lambda)$. Since in the case of the matrix (5) we found that $D_k(\lambda) = E_1(\lambda) \cdots E_k(\lambda)$ $(k = 1, 2, \ldots, r)$ and $D_{r+1}(\lambda) = \cdots = D_n(\lambda) = 0$, the theorem follows.

COROLLARY. A necessary and sufficient condition for two polynomial

D2(2). In our case these quotients would be polynomials and [P (2)J-1 would be a polynomial matrix. then det P(2) = const O. Proof: We first show that if A (A) and B(2) are equivalent. Two polynomial matrices A (A) and B(2) are equivalent if and only if there exist invertible polynomial matrices P(2) and Q(A) such that. . so that Da(A) = 1. then both of these matrices are equivalent to the same canonical diagonal matrix and are therefore equivalent (to one another). Then det P (A) det (A) = 1 and a product of two polynomials equals one only if the polynomials in question are if P (A) non-zero constants. by the matrix of the elementary transformation in question. All invertible matrices are equivalent to the unit matrix. als Indeed. If det P (A) is a constant other than zero. Indeed. apart from sign. Conversely.CANONICAL FORM OF LINEAR TRANSFORMATIoN 157 mial matrices A (A) and E(A) to be equivalent is that the polyno Di(A). Since D(2) is divisible by WM D. . Indeed. let [P (2)1-1. if the polynomials D. n).(2) = 1 (k = 1. the determinant of an invertible matrix is a non-zero constant. the elements of the inverse matrix are. is invertible. (7) A(A) = P (2)B (2)Q (2). A polynomial matrix P(2) is said to be invertible if the matrix [P(2)]-1 is also a polynomial matrix. 2. . It follows that all the elementary divisors E(2) of an invertible matrix are equal to one and the canonical diagonal form of such a matrix is therefore the unit matrix. 3. then there exist invertible matrices P(A) and Q(A) such that (7) holds. then P (A) is invertible.(2) are the same for A(A) and B (A). THEOREM 3. We have thus shown that a polynomial matrix is invertible if and only if its determinant is a non-zero constant.= Pi(A). To this end we observe that every elementary transformation of a polynomial matrix A(2) can be realized by multiplying A(2) on the right or on the left by a suitable invertible polynomial matrix. Indeed.13(2) be the same for both matrices. namely. the (n 1)st order minors divided by det P(2). Thus let there be given a polynomial matrix A(2) . We illustrate this for all three types of elementary transforma- tions.

Finally. row) by a. To multiply the second column (row) of the matrix A (A) by some number a we must multiply it on the right (left) by the matrix 10 0 0 a 0 0 0 1 0 (8) 0 0 0 1_ obtained from the unit matrix by multiplying its second column (or. what amounts to the same thing. Likewise to add to the first row of A(A) the second row multiplied by 9/(A) we must multiply A(A) on the left by the matrix (T(2) 0 1 0 0 1 0 0 0 1 (11) 0 0 0 0 0 .) ' ain(2) A(A) an(2) Pan(2) a. rows) of the unit matrix.(2) a2(2) ann(2) To permute the first two columns (rows) of this matrix. what amounts to the same thing. to add to the first column of A (A) the second column multiplied by q(A) we must multiply A(A) on the right by the matrix 0 0 1 0 0 T(2) (10) 1 0 1 0 0 0 0 1 0 0 obtained from the unit matrix by just such a process. we must multiply it on the right (left) by the matrix 0 1 1 0 0 1 0 0 0 (8) 0 0 0 obtained by permuting the first two columns (or.158 LECTURES ON LINEAR ALGEBRA a12(2) a22(2) a2(11.

it follows that the product of matrices of elementary transformations is an invertible matrix. let A(A) = P(A)B(A)Q(A). in view of our observation. every invertible matrix Q (A) is equivalent to the unit matrix and can therefore be written in the form Q(2) = 131(2)EP2(2) where 132(2) and P2(A) are products of matrices of elementary transformations. Since we assumed that A (A) and B (A) are equivalent.CANONICAL FORM OF LINEAR TRANSFORMATION 159 obtained from the unit matrix by just such an elementary transformation. Indeed. To effect an elementary transformation of the columns of a polynomial matrix A(A) 've must multi- ply it by the matrix of the transformation on the right and to effect an elementary transformation of the rows of A (A) we must multiply it by the appropriate matrix on the left.(2) is itself a product of matrices of elementary transformations. It follows that every invertible matrix is the product of matrices of elementary transformations. Since the product of invertible matrices is an invertible matrix. Every elementary transformation can be effected by multiplying B(A) by an invertible polynomial matrix. Consequently. where P (A) and 0(A) are invertible matrices. it must be possible to obtain A(A) by applying a sequence of elementary transformations to B (A). As we see the matrices of elementary transformations are obtained by applying an elementary transformation to E. This observation can be used to prove the second half of our theorem. which is what we wished to prove. But this means that Q (A) = Pi (A)P. the first part of our theorem is proved. Computation of the determinants of the matrices (8) through (11) shows that they are all non-zero constants and the matrices are therefore invertible. . Since the determinant of a product of matrices equals the product of the determinants. A(A) is obtained from B(1) by applying to the latter a sequence of elementary transformations. A(A) can be obtained from B (A) by multiplying the latter by some sequence of invertible polynomial matrices on the left and by some sequence of invertible polynomial matrices on the right. Indeed. Hence A(A) is equivalent to B (A). But then.

This will yield. A constant. Later we show the converse of this result. and A 2E.e. The main problem solved here is that of the equivalence of polynomial matrices A 2E and B AE of degree one.9 In this paragraph we shall study polynomial matrices of the form A AE.. J AA. Every polynomial matrix P(2) = P02" + 1312"-1 + + P can be ditiided on the left by a matrix of the form A AE (A any constant matrix). -.2E) and if we denote A. that the equivalence of the polynomial matrices A AE and B AE implies the similarity of the matrices A and B.e. Theorem 3 implies the equivalence of A AE and B 2E.-. among others. w It is easy to see that if A and B are similar. + ¿A. in this case A. + 2A. with det A1 O is equivalent to a matrix of the form A ¿E. = A. Indeed. if there exists a non-singular constant matrix C such that B C-1AC. by A we have A. to Every polynomial matrix A. polynomial matrices A AE and B if B C-i AC. . The process of division involved in the proof of the lemma differs from ordinary division only in that our multiplication is noncommutative.160 LECTURES ON LINEAR ALGEBRA 4. of the fact that every matrix can be reduced to Jordan canonical form. then the AE are equivalent. namely.. (A . 2E)C. X ( A.A. independent of § 19. We begin by proving the following lemma: LEMMA. -1A. j. then B 2E = C-1(A Since a non-singular constant matrix is a special case of an invertible polynomial matrix. This paragraph may be omitted since it contains an alternate proof.2A1 = A. Indeed.¿E) which implies (Theorem 3) the equivalence of A. a new proof of the fact that every matrix is similar to a matrix in Jordan canonical form. there exist matrices 5(2) and R (R constant) such that P (A) = (A AE)S(2) + R. i. i.

= P(A).(2) (A AE) We note that in our case.). i. P02" P(2) = (A P'0211-2 + . It remains to prove necessity. where the P. just as in the ordinary theorem of Bezout. AE)S(2) + R..) of degree not higher than zero. independent of X.e.e.13102"-I P'12"-2 + + P'_. Proof: The sufficiency part of the proof was given in the beginning of this paragraph.An-i. (A AE)P'0An-2 is of degree not higher than n obtain a polynomial matrix P(2) + (A Continuing this process we P'02"-2 2E) (P02"-1 + . are constant matrices. THEOREM 4.. If R denotes the constant matrix just obtained. i.CANONICAL FORM OF LINEAR TRANSFORMATION 161 Let P(A) = P02" P. then P (A) = (A or putting S (2) ( 2E) (P0An-1 P'02"-2 + ) R. The polynomial matrices A AE and B AE are equivalent if and only if the matrices A and B are similar. This means that we must show that the equivalence of A 2E and B AE implies the similarity of A and B.-=. A similar proof holds for the possibility of division on the right.. This proves our lemma. is of degree not higher than If P(2) + (A n AE)P. can claim that R = R. It is easy to see that the polynomial matrix P(A) + (A AE)P02"-1 1. there exist matrices S1(A) and R1 such that P(2) -= S. By Theorem 3 there exist invertible polynomial matrices P(2) and Q(A) such that . then the polynomial matrix P(A) + (A AE)P02"-1 + 2.

where Po and Q0 are constant matrices. If we insert these expressions for P(2) and Q(2) in the formula (12) and carry out the indicated multiplications we obtain B AE = (B +(B 2E) 2E)P1 (2) (A 2E)121(2)(B 2E)Q0 + Po(A 2E)Q1(2)(B 2E)P1 (2) (A + 130(A 2E) 2E)Q0. If we transfer the last summand on the right side of the above equation to its left side and denote the sum of the remaining terms by K (2).Q1(2)(B 2E) + Qo. 2E)Q1(2)(B + Po(A then we get B AE Po(A 2E)Q0 -= K(2). i.e. 2E)P1(2)(A 2E)Q1(2)(B But in view of (12) AE)Q (A) P(A)(A 2E) = (B P-1(2)(B 2E). Then P(2) = (B AE)P. if we put K (2) = (B (B AE)P. (2) (A 2E)Q1 (2) (B 2E) 2E)Q0 2E)P1(2)(A 2E). We now add and subtract from the third summand in K(2) the expression (B 2E)P1 (A) (A 2E)Q1(2)(B 2E) and find K(2) = (B AE)P. Using these relations we can rewrite K (2) in the following manner . To this end we divide P(2) on the left by B B AE and Q(2) by AE on the right. Q(2) -. (2) + P0. (2) (A AE)Q (2) (B (A P (A) (A 2E) 2E)Q1(2)(B 2E). the first two summands in K(2) can be written as follows: Since Q1(2)(B + (B 2E)P1(2) (A (B 2E)Q0 AE)1J1 (2) (A = (B 2E) 2E)Q1 (2) (B 2E)P1 (2) (A 2E)Q (2).162 LECTURES ON LINEAR ALGEBRA (12) B 2E = P(2)(A 2E)Q(2). 2E) + Q0 = Q(2). 2E)Q-/(2).. We shall first show that 11 (A) and Q(2) in (12) may be replaced by constant matrices.

which shows that the matrices P. we may indeed replace P(2) and Q(2. . is zero. Of course. the contents of this paragraph 2E has the same elementary ¿E. This completes the proof of our theorem. We have thus found that (17) B 2E =. that A and B are similar. (2)(A the expression in square brackets is a polynomial in 2. Since equivalence of the matrices A if and only if the matrices A ¿E and B ¿E is synonymous with identity of their elementary divisors it follows from the theorem just proved that two matrices A and B are similar ¿E and B a have the same elementary divisors.e. We now show that every matriz A is similar to a matrix in Jordan canonical form. Assume that this polynomial is not zero and is of degree m.. where Po and Q. Since P(2) and Q (2) are invertible.e. but then B is similar to A.CANONICAL FORM OF LINEAR TRANSFORMATION 163 K (A) = (B 2E) [P1(2)P-1(2) 12-'(2)121(A) ¿E). Equating the free terms we find that B = PoAQ0 i. But (15) implies that K (2) is at most of degree one.) in (12) with constant matrices. and with it K (2). i. To this end we consider the matrix A ¿E and find its elementary divisors. Then it is easy to see that K (2) is of degree m + 2 and since ni 0. K (2) is at least of degree two. B divisors as A As was indicated on page 160 (footnote) this paragraph gives another proof of the fact that every matrix is similar to a matrix in Jordan canonical form. can be deduced directly from §§ 19 and 20. We shall prove this polynomial to be zero.. Using these we construct as in § 20 a matrix B in Jordan canonical form. 21E)Qi(2)1(B P. We now show that K (1) = O. and Qo are non-singular and that Po = Qo-1. are constant matrices.Po(A 2E)Q0. Equating coefficients of 2 in (17) we see that PoQo E. Hence the expression in the square brackets.

CHAPTER IV

Introduction to Tensors
§ 22. The dual space
1. Definition of the dual space. Let R be a vector space. Together with R one frequently considers another space called the dual space which is closely connected with R. The starting point for the definition of a dual space is the notion of a linear function introduced in para. 1, § 4.

We recall that a function $f(x)$, $x \in R$, is called linear if it satisfies the following conditions:

$$f(x + y) = f(x) + f(y), \qquad f(\lambda x) = \lambda f(x).$$

Let $e_1, e_2, \ldots, e_n$ be a basis in an $n$-dimensional space $R$. If

$$x = \xi^1 e_1 + \xi^2 e_2 + \cdots + \xi^n e_n$$

is a vector in $R$ and $f$ is a linear function on $R$, then (cf. § 4) we can write

$$f(x) = f(\xi^1 e_1 + \xi^2 e_2 + \cdots + \xi^n e_n) = a_1 \xi^1 + a_2 \xi^2 + \cdots + a_n \xi^n, \tag{1}$$

where the coefficients $a_1, a_2, \ldots, a_n$ which determine the linear function are given by

$$a_1 = f(e_1), \quad a_2 = f(e_2), \quad \ldots, \quad a_n = f(e_n). \tag{2}$$

It is clear from (1) that given a basis $e_1, e_2, \ldots, e_n$ every $n$-tuple $a_1, a_2, \ldots, a_n$ determines a unique linear function.

Let $f$ and $g$ be linear functions. By the sum $h$ of $f$ and $g$ we mean the function which associates with a vector $x$ the number $f(x) + g(x)$. By the product of $f$ by a number $\alpha$ we mean the function which associates with a vector $x$ the number $\alpha f(x)$. Obviously the sum of two linear functions and the product of a function by a number are again linear functions. Also, if $f$ is determined by the numbers $a_1, a_2, \ldots, a_n$ and $g$ by the numbers $b_1, b_2, \ldots, b_n$, then $f + g$ is determined by the numbers $a_1 + b_1, \ldots, a_n + b_n$ and $\alpha f$ by the numbers $\alpha a_1, \alpha a_2, \ldots, \alpha a_n$. Thus the totality of linear functions on $R$ forms a vector space.
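For example, in a three-dimensional space with basis $e_1, e_2, e_3$ the linear function $f(x) = 2\xi^1 - \xi^3$ is determined by the triple $(a_1, a_2, a_3) = (2, 0, -1)$ and the linear function $g(x) = \xi^1 + \xi^2 + \xi^3$ by the triple $(1, 1, 1)$; accordingly $f + g$ is determined by $(3, 1, 0)$ and $5f$ by $(10, 0, -5)$ (the particular functions are, of course, chosen only by way of illustration).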
DEFINITION 1. Let $R$ be an $n$-dimensional vector space. By the dual space $\tilde{R}$ of $R$ we mean the vector space whose elements are the linear functions defined on $R$. Addition and scalar multiplication in $\tilde{R}$ follow the rules of addition and scalar multiplication for linear functions.

In view of the fact that relative to a given basis $e_1, e_2, \ldots, e_n$ in $R$ every linear function $f$ is uniquely determined by an $n$-tuple $a_1, a_2, \ldots, a_n$ and that this correspondence preserves sums and products (of vectors by scalars), it follows that $\tilde{R}$ is isomorphic to the space of $n$-tuples of numbers. One consequence of this fact is that the dual space $\tilde{R}$ of the $n$-dimensional space $R$ is likewise $n$-dimensional.

The vectors in $R$ are said to be contravariant, those in $\tilde{R}$, covariant. In the sequel $x, y, \ldots$ will denote elements of $R$ and $f, g, \ldots$ elements of $\tilde{R}$.

2. Dual bases. In the sequel we shall denote the value of a linear function $f$ at a point $x$ by $(f, x)$. Thus with every pair $f \in \tilde{R}$ and $x \in R$ there is associated a number $(f, x)$ and the following relations hold:

1. $(f, x_1 + x_2) = (f, x_1) + (f, x_2)$,
2. $(f, \lambda x) = \lambda (f, x)$,
3. $(\lambda f, x) = \lambda (f, x)$,
4. $(f_1 + f_2, x) = (f_1, x) + (f_2, x)$.

The first two of these relations stand for $f(x_1 + x_2) = f(x_1) + f(x_2)$ and $f(\lambda x) = \lambda f(x)$ and so express the linearity of $f$. The third defines the product of a linear function by a number and the fourth, the sum of two linear functions. The form of the relations 1 through 4 is like that of Axioms 2 and 3 for an inner product (§ 2). However, an inner product is a number associated with a pair of vectors from the same Euclidean space, whereas $(f, x)$ is a number associated with a pair of vectors belonging to two different vector spaces $R$ and $\tilde{R}$.


Two vectors $x \in R$ and $f \in \tilde{R}$ are said to be orthogonal if

$$(f, x) = 0.$$

In the case of a single space $R$ orthogonality is defined for Euclidean spaces only. If $R$ is an arbitrary vector space we can still speak of elements of $R$ being orthogonal to elements of $\tilde{R}$.

DEFINITION 2. Let $e_1, e_2, \ldots, e_n$ be a basis in $R$ and $f^1, f^2, \ldots, f^n$ a basis in $\tilde{R}$. The two bases are said to be dual if

$$(f^i, e_k) = \begin{cases} 1 & \text{when } i = k \\ 0 & \text{when } i \neq k \end{cases} \qquad (i, k = 1, 2, \ldots, n). \tag{3}$$

In terms of the symbol $\delta_k^i$ defined by

$$\delta_k^i = \begin{cases} 1 & \text{when } i = k \\ 0 & \text{when } i \neq k \end{cases} \qquad (i, k = 1, 2, \ldots, n),$$

condition (3) can be rewritten as $(f^i, e_k) = \delta_k^i$.

If $e_1, e_2, \ldots, e_n$ is a basis in $R$, then the numbers $(f, e_k) = f(e_k)$ are the numbers $a_k$ which determine the linear function $f \in \tilde{R}$ (cf. formula (2)). This remark implies that if $e_1, e_2, \ldots, e_n$ is a basis in $R$, then there exists a unique basis $f^1, f^2, \ldots, f^n$ in $\tilde{R}$ dual to $e_1, e_2, \ldots, e_n$.

The proof is immediate: The equations

$$(f^1, e_1) = 1, \quad (f^1, e_2) = 0, \quad \ldots, \quad (f^1, e_n) = 0$$

define a unique vector (linear function) $f^1 \in \tilde{R}$. The equations

$$(f^2, e_1) = 0, \quad (f^2, e_2) = 1, \quad \ldots, \quad (f^2, e_n) = 0$$

define a unique vector (linear function) $f^2 \in \tilde{R}$, etc. The vectors $f^1, f^2, \ldots, f^n$ are linearly independent since the corresponding $n$-tuples of numbers are linearly independent. Thus $f^1, f^2, \ldots, f^n$ constitute a unique basis of $\tilde{R}$ dual to the basis $e_1, e_2, \ldots, e_n$ of $R$.

In the sequel we shall follow a familiar convention of tensor analysis according to which one leaves out summation signs and sums over any index which appears both as a superscript and a subscript. Thus $\xi^i \eta_i$ stands for $\xi^1 \eta_1 + \xi^2 \eta_2 + \cdots + \xi^n \eta_n$.

Given dual bases $e_i$ and $f^k$ one can easily express the coordinates of any vector. Thus, if $x \in R$ and
x

. e. ek) and f = ntf' + n2f2 6. rke. = (fi. . e is a basis in R and f'. Interchangeability of R and R. e2. are the coordinates of x c R relative to the basis e1. Similarly. x) = a. . respectively. en and P. . . . e1)61k Ek. 3.i. p. e2.1cn1ek ++Thifn. f2. en and fi. e. x). Hence. = (f. be dual bases. fn. . can be computed from the formulas ek (fk. x = El ei Then 52e2 + + e. e2. For arbitrary bases el. en and Th. ek)ntek To repeat: If el. x) = (nip. e2. f2. where a/c. (fi. NOTE. e2. p. are the coordinates off E R relative to the basis in. where f is the basis dual to the basis e. Thus let . . if fe k and f= nkfki then Now let e1. We wish . E2.INTRODUCTION TO TENSORS 167 then (fk. We now show that it is possible to interchange the roles of R and R without affecting the theory developed so far. x) in terms of the coordinates of the vectors f and x with respect to the bases e1. the coordinates Ek of a vector x in the basis e. e2. its dual basis in R then (if x) = niE' + 172E2 + + nnen. . /2. (4) respectively (1. andf. vet) ei(fk. R was defined as the totality of linear functions on R. h in R and R where $1. x) (fk. . . n .e). (f. We shall express the number (f. . .

x) for some fixed vector xo in R. coordinates of a vector x e R relative to some basis e1. . f2. fn are rh. . . we specify the coordinates of a vector f E R relative to the dual basis f'. . e. . 2 above we showed that for every basis in R there exists a unique dual basis in R. (f.. f2. for every basis in R there exists a unique dual basis in R. as a rule. . . It is there- fore possible to give a definition of a pair of dual spaces R and R which emphasizes the parallel roles played by the two spaces. then we can write q)(f) = (tin. In para. xo). then cp(f) (f. fn of e. x) NOTE: 0 for all x implies f = 0. e2. We observe that the only operations used in the simultaneous study of a space and its dual space are the operations of addition of vectors and multiplication of a vector by a scalar in each of the spaces involved and the operation (f. + a2. x) 0 for all f implies x O. in addition. (f. x) which connects the elements of the two spaces. f2. and (f. x) so that conditions 1 through 4 above hold and. (6) . . This formula establishes the desired one-to-one correspondence between the linear functions 9. en in R and denote its f". Transformation of coordinates in R and R. In view of the interchangeability of R and R. To this end we choose a basis el. 4. 5. e'2. e' be a new basis in R whose connection with the basis e1. X0) = a2e2 4- + (P + ann and (5) (f. Such a definition runs as follows: a pair of dual spaces R and R is a pair of n-dimensional vector spaces and an operation (f. Now let e'1. e2.168 LECTURES ON LINEAR ALGEBRA to show that if q) is a linear function on R. on ft and the vectors x. x) which associates with f e R and xeR a number (f.n. as Now let x. e R and permits us to view R as the space of linear functions on R thus placing the tvvo spaces on the same footing. then. 172.72 4- + (re Then. e2. If the coordinates off relative to the basis dual by Jr'. = ci"ek. be the vector alei we saw in para. 2. e is given by e'. If we specify the en. e2.

= bkiek. to the basis F. . f'n be the dual basis of e'1. x) (bklik. Now e't = (f". . We say that the matrix lime j/ in (6') is the transpose of the transition matrix in (6) because the summation indices in (6) and (6') are different. Thus let ei be the coordinates of x ER relative to a basis e1. ni. e'i) = (fk. e'. e'2. ciece2) = ci. . e2. the matrix in (6') is the transpose 1 of the transition matrix in (6). en to e'l . fn: inverse. . VVe now discuss the effect of a change of basis on the coordinates e' of vectors in R and k. . f'k is equal to the inverse of the transpose of the matrix which is the matrix of transition from e1.. e'i) in two ways: (fk.. . e2.r fk To this end we compute (7. We first find its of transition from the basis f'1. .. e'2. the coordinates of vectors in k transform like the vectors of the dual basis in R..e. i. Then x) --= (I' kek) = (f'1 x) = (f". p be the dual basis of e1. We wish to find the matrix 111)7'11 of transition from the fi basis to the f'.e. f'2. f2. . It follows that the matrix of the transition from fl. the matrix (6') . f '2. e'i) =1= = e'1) cik e'i) u5k (f Hence c1 = u1k.INTRODUCTION TO TENSORS 169 Let p. . . Similarly. = czknk This is seen by comparing the matrices in (6) and (6').(fk. f'2. e andf'1. e'. fOc= to fg. i.Eiketk)= e". f2. e and e'i its coordinates in a new basis e'1.x) bkiek. f2. (fk. so that It follows that the coordinates of vectors in R transform like the vectors of the dual basis in k. e'2..x)=bki(fk. basis.

(x. Of the matrices r[cikl and I ib.k11 is the inverse of the transpose of 11c. y2). e2. To prove the uniqueness of y we observe that if f(x) = (x. en is orthonormal. Let R be an n-dimensional Euclidean space. every vector y determines a linear function f such that f(x) = (x. Thus in the case of a Euclidean space everyf in k can be replaced with the appropriate y in R and instead of writing (f. .abni = 6/.) = 0 for all x. 5. et. y). x) we can write (y. LEMMA. Proof: Let e1. y). If + ase.. For the sake of simplicity we restrict our discussion to the case of real Euclidean spaces.e. Y). where y is a fixed vector uniquely determined by the linear function f. then (x.. The dual of a Euclidean space. Since the . But this The converse is obvious. Then every linear function f on R can be expressed in the form f(x) = (x. x). y. y2). The fact the 111).. . Since the sumultaneous study of a vector space and its dual space involves only the usual vector operations and the operation . = O.k1 involved in these transformations one is the inverse of the transpose of the other. (x. a. e. x = eiei.k11 is expressed in the relations c. y) = ct1e1 a2E2 + + This shows the existence of a vector y such that for all x f(x) = (x. y. y) = (x. = by. e be an orthonormal basis of R. Now let y be the vector with coordinates c11. then f(x) is of the form f(x) = ale + a2e2 + basis e1. y1 means that y.170 LECTURES ON LINEAR ALGEBRA We summarize our findings in the following rule: when we change from an "old" coordinate system to a "new" one objects with lower case index transform in one way and objects with upper case index transform in a different way. y1) and f(x) = (x. i. Conversely.

in case of a .. Euclidean space. where the matrix Hell is the inverse of the matrix I Lgikl I. (y.' is dual to that of the ek. we can identify R with R and so in look upon the f as elements of R. If R is Euclidean. ek). we may. tk) § 23.e.. R by R. then R is also n-dimensional and so R and R are isomorphic. x). Now eic) = gj2. Let e]. and (f. x e R. Tensors 1.. f2. (p. its dual basis in R. We wish to find the coefficients go. e2. X) by (y. . Solving equation (10) for f' we obtain the required result f = gi2e . e be an arbitrary basis in R and f'. then ek = gkia. flak EXERCISE. It is natural to try to find expressions for the f in terms of the given e. where g ik (ei. x). In the first chapter we studied linear and bilinear functions on an n-dimensional vector space. ek) = giabe = gik (ei ek) = Thus if the basis of the J. If we were to identify R and R we would have to write in place of (I. A natural If R is an n-dimensional vector space. Show that gik i. 2 above) reduces to that of orthogonality of two vectors of R. x). i. Multilinear functions. we may identify a Euclidean space R with its dual space R. When we identify R and its dual rt the concept of orthogonality of a vector x E R and a vector f E k (introduced in para. Let e. replace f by y.INTRODUCTION TO TENSORS 171 ( f x) which connects elements fe R and x e R. But this would have the effect 2 of introducing an inner product in R. = g f a. 2 This situation is sometimes described as follows: in Euclidean space one can replace covariant vectors by contravariant vectors..e. y. .

if we fix all vectors but the first then /(x' x". 0) is a linear function of one vector in R. 0) and (0. g. 1(x. /(2x. y. y. g. ). A function 1(x. a vector in R (a covariant vector). g. ). The bilinear function of type (y) . g. and q vectors in R (covariant vectors) is called a multilinear function of type (p. ) q vectors f. y. DEFINITION I. q). f. g. f. 1(x. g. functions of one vector in R and one in R. ). g. . as was shown in para. . is said to be a multilinear function of p vectors x. f. 1(x. a multilinear function of type (0. (ß) bilinear functions on R. ) . for example. g. f. Indeed. A multilinear function of p vectors in R (contravariant vectors) ) f".172 LECTURES ON LINEAR ALGEBRA generalization of these concepts is the concept of a multilinear function of an arbitrary number of vectors some of which are elements of R and some of vvhich are elements of R. let y = Ax be a linear transformation on R. ) 1(x" . There is a close connection between functions of type (y) and linear transformat ons. § 22. The simplest multilinear functions are those of type (1. ) = 1(x'. f' . f'. f. g. y. g. f. arguments. 1). e R and e R (the dual of R) if 1 is linear in each of its Thus. y.. y. y. A multilinear function of type (1. -) = 2/(x. . There are three types of multilinear functions of two vectors (bilinear functions): bilinear functions on R (cons dered in § 4). ) = ul(x. 3. ). . Similarly. . . uf. y. y. i. . f.e. 1) defines a vector in R (a contravariant vector). y. Again. . . g. g. f". /(x. y. y.

x. Expressions for multilinear functions in a given coordinate system. eini ht.. As in § 11 of chapter II one can prove the converse. y. e2. Let Then . .7: : : = 1(e. . en be a basis in R and fl. e. f) are given by the relations ai ei. n'e5.that one can associate with every bilinear function of type (y) a linear transformation on R. y. Let e'1. fe R (a function of type (2.. 2. We now show how the system of numbers which determine a multilinear form changes as a result of a change of basis Thus let el.. f. /(Ve. fit This shows that the ak depend on the choice of bases in R and R.f) Or y = niei. . A similar formula holds for a general multilinear function /(x. y. f) = where the coefficients ail' which determine the function / (x. en be a basis in R and fl. e'2. Y. f 2. ). Ckfk) = V?? e5. If (3) e'a = cate. f2. . e2. 7 its dual in R.fr f3. )= y. We now express a multilinear function in terms of the coordinates of its arguments. Coordinate transformations.. f).. f '2. ". fk. .. Let el. f n its dual basis in R. where the numbers au::: which define the multilinear function are given by ar. For simplicity we consider the case of a multilinear function 1(x.INTRODUCTION TO TENSORS 173 associated with A is the function (f. Ax) which depends linearly on the vectors x e R and fe R.k /(x. 1)).. e'n be a new basis in R and fi. x= /(x. . y.f be its dual in R.. i.fk). . y E R.e. g.

e2. f2..f". the . def 4. f2. § 22) f'ß = where the matrix I lb. ' . 4. a linear transformation by the n2 entries in its matrix. ./12..174 LECTURES ON LINEAR ALGEBRA then (cf.e. then e'1.rbts ci-cl To sum up: If define a multilinear function /(x. fr. )relative to a pair of dual bases el. e'2. II is the transpose of the inverse of For a fixed a the numbers c2 in (3) are the coordinates of the vector e'j... and e' and f'1. In this way we find that numbers c52. Hence to find we must put in (1) in place of Ei. We know that f''. linear functions.r. . [2. . p. .': which define our multilinear function relative to the bases e'. e' and r. § 22). para. ba. 4. .5: bibrs Here [c5' [[is the matrix defining the transformation of the e basis and is the matrix defining the transformation of the f basis. .l2. relative to the basis el. Thus relative to a given basis a vector was defined by its n coordinates. linear transformations. Similarly. In the case of each of these objects the associated system of numbers would. the coordinates of the vectors e' e'5. (CT:: = /(e't. etc.. 2. = ctfixift e and f1.). Definition of a tensor. i. 3. g. . e. The objects which we have studied in this book (vectors. fn.us. e2. for a fixed fi the numbers baft in (4) are the coordinates of f'ß relative to the basis f'. e'2. We shall now compute the numbers a'. bilinear functions. . fr. and a bilinear function by the n' entries in its matrix.) were defined relative to a given basis by an appropriate system of numbers. . y.715. transform in a manner peculiar to each object and to characterize the object one had to prescribe the . f. PI. bar. b71. c/. para. upon a change of basis. . a linear function by its n coefficients. . This situation can be described briefly by saying that the lower indices of the numbers aj are affected by the matrix I Ic5'11 and the upper by the matrix irb1111 (cf.

and algebra. p times covariant and q times contravariant. Clearly. A tensor of rank zero is called a scalar Contravariant vector. If we associate with every coordinate system the same constant a.INTRODUCTION TO TENSORS 175 values of these numbers relative to some basis as well as their law of transformation under a change of basis. its coordinates relative to this basis. Linear function (covariant vector). every tensor determines a unique multilinear function. 1 and 2 of this section we introduced the concept of a multilinear function. This permits us to deduce properties of tensors and of the operations on tensors using the "model" supplied by multilinear functions. The numbers the components of the tensor. We say that aß times covariant and g times contravariant tensor is defined if with every basis in R there is associated a set of nv+Q numbers a::: (there are p lower indices and q upper indices) which under change of basis defined by some matrix I Ic/II transform according to the rule (6) = cja b acrccrp::: b with q is the transpose of the inverse of I I I. DEFINITION 2. Given a basis in R every vector in R determines n numbers. then a may be regarded as a tensor of rank zero. These transform according to the rule = and so represent a contravariant tensor of rank 1. We now define a closely related concept which plays an important role in many branches of physics. multilinear functions are only one of the possible realiza- tions of tensors. Let R be an n-dimensional vector space. Conversely. defining . transforms under change of basis in accordance with (6) the multilinear function determines a unique tensor of rank p q. Relative to a definite basis this object is defined by nk numbers (2) which under change of basis transform in accordance with (5). The number p are called called the rank (valence) of the tensor. geometry. The numbers a. We now give a few examples of tensors. Scalar. Since the system of numbers defining a multilinear function of p vectors in R and q vectors in R. In para.

a bilinear form of vectors x E R and y e R defines a tensor of rank two." e' where b1"Cak 6 ik It follows that Ae'. of A relative to the e'..e. e2. Define a change of basis by the equations e'. e. Let 11(11111 be the matrix of A relative to some basis el. i. = a' jei k. Similarly. which proves that the matrix of a linear transformation is indeed a tensor of rank two. The resulting tensor is of rank two. We shall show that this matrix is a tensor of rank two. g e R defines a twice contravariant tensor.176 LECTURES ON LINEAR ALGEBRA a linear function transform according to the rule a'. once covariant and once contravariant and a bilinear form of vectors f. Bilinear function. Let A be a linear transformation on R. = aikek. once covariant and once contravariant. Linear transformation. With every basis we associate the matrix of the bilinear form relative to this basis. With every basis we associate the matrix of A relative to this basis. = Ac. k. . i. = cia. twice covariant. and so represent a covariant tensor of rank 1. once covariant and once contravariant. relative to any basis is the unit matrix.e.. basis c2ab fik. the system of numbers 6ik In particular the matrix of the identity transformation E i to if i if i k. Ae. y) be a bilinear form on R. Thus 61k is the simplest tensor of rank two once covariant and once .. = Then e.ae2 = cia Ae2 = This means that the matrix takes the form e fi = ci2afl bflk e'. b. Let A (x.

then their components relative to any other basis must be equal. defines a unique tensor satisfying the required conditions. We wish to emphasize that the assumption about the tvvo tensors being of the same type is essential.INTRODUCTION TO TENSORS 177 contravariant. x) for all x e R. g.) For proof we observe that since the two tensors are of the same type they transform in exactly the same way and since their components are the same in some coordinate system they must be the same in every coordinate system. as was shown in para. q) whose components relative to some basis take on 79±g prescribed (IT:: be the num values. sponding vectors u. If R is a (real) n-dimensional Euclidean space. u. y. Thus. EXERCISE. Show dirctly that the system of numbers 6. A sufficient condition for the equality of two tensors of the same type is the equality of their corresponding components relative to some basis. y. = (I if i = k. Coincidence of the matrices defining these objects in one basis does not imply coincidence of the matrices defining these objects in another basis. in turn. We now prove two simple properties of tensors. T ensors in Euclidean space. Given p and q it is always possible to construct a tensor of type (p. Given a multilinear function 1 of fi vectors x. then. The multilinear function. . The proof is simple. in R we can replace the latter by correin R and q vectors f. These numbers define a multilinear ) as per formula (1) in para. y. both a linear transformation and a bilinear form are defined by a matrix. this section. y. 4. . One interesting feature of this tensor is that its components do not depend on the choice of basis. associated with every bais is a tensor. in R and so obtain a multilinear ) of p q vectors in R. 5 of § 22. y. then (f. f. . Thus let prescribed in some basis. 0 if i k. it is possible to establish an isomorphism between R and R such that if y E R corresponds under this isomorphism to fe R. x) = (y. 2 of function /(x. function l(x. given a basis. g. (This means that if the components of these two tensors relative to some basis are equal.

. fs. = grs where gzk = (et. The equation r defines the analog of the operation just discussed. e. . .. y.. . ei. e.g. Thus let au::: be the coefficients of the multilinear funct on /(x. e. . ) . It is defined by the equation = gccrg fi. fit. In view of its connection with the inner product (metric) in our space. f... e. i. The new . y. the tensor gz.) and let b. u. . es. This operation is referred to as lowering of indices. gc. ). = (e1. be the coefficients of the multilinear function . i. ).e. the inner product relative to the basis e1.) are the coefficients of a bilinear form.. f. )._ which is p q times covariant. ) Here g is a twice covariant tensor. In view of the established connection between multilinear functions and tensors we can restate our result for tensors: If au::: is a tensor in Euclidean space p times covariant and q times contravariant. We showed in para.178 LECTURES ON LINEAR ALGEBRA in terms of the coefficients of /(x. .e. . = l(e1. fr.. v. e5. = ggfis = gsrgfis aTf:::. then this tensor can be used to construct a new tensor kJ. . l(es. pc. e. ¿(e1. This is obvious if we observe that the g. .. e2. ek) It follows that rs ) 1(e3. . u. g. e. fie. e .. y. We now propose to express the coefficients of l(x. . ) = l(ei.. v. of a basis dual to fi are expressible in terms of the vectors p in the following manner: e. ). namely. is called a metric tensor. 5 of § 22 that in Euclidean space the vectors e.

Show that gm is a twice contravariant tensor. Since r(ei. Let . h. g. ) be two multilinear functions of which the first depends on iv vectors in R and q' vectors in R and the second on Jo" vectors in R and q" vectors in R. be two multilinear functions of the same number of vectors in R and the same number of vectors in R. . -. . In view of the connection between tensors and multilinear functions it is natural first to define operations on multilinear functions and then express these definitions in the language of tensors relative to some basis. y. y. Operations on tensOYS. f. . f. h. 1 is a multilinear function of p' p" vectors in R and q' q" vectors in R. g. . y . h. ) l'(x. g. Consequently addition of tensors is defined by means of the formula = Multiplication of tensors. g. f3.INTRODUCTION TO TENSORS 179 operation is referred to as raising the indices. )= (x. Addition of tensors. /(x. y. f. To see this we need only vary in 1 one vector at a time keeping all other vectors fixed. EXERCISE. -)1(z. ) and 1"(z. . . g. f. y. g. g. z. 5 of § 22. We define their sum ) by the formula /(x. ) of l' and 1" by means of the formula: f. /(x. y. f. g. ej. . -) . We define the product /(x. y. y. Ve shall now express the components of the tensor correspond- ing to the product of the multilinear functions l' and 1" in terms of the components of the tensors corresponding to l' and 1". 5. l'(x. Let l" (x. z. ) l'(x. Here e has the meaning discussed in para. . . y. -) Clearly this sum is again a multilinear function of the same number of vectors in R and R as the summands l' and 1". f. -. ). . f.

g. To this end we choose a basis el.e. ) -. Since coefficients of the form /(x. = r(e5. . . (7) = /(ei. f'. e in R and its dual basis p. . . fl. e'2. fk). g. f'ce) = A (cak ek. g. f" in R and consider the sum . f' z) A (e'2. . f2. e1. and g. the sum does not. -). -) . remain fixed we need only prove our contention for a bilinear form A (x.180 LECTURES ON LINEAR ALGEBRA and = 1" (ek. We now express the coefficients of the form (7) in terms of the .. y. 1(e. Specifically we must show that A (e. Since each summand is a multilinear function of y. f. g. fa) = A (e' a. y. . irtn. Contraction of tensors. Y. g. + 1(e. . ). g. e' and denote its dual basis by f'2. f2. We recall that if e'. We now show that whereas each summand depends on the choice of basis. then cikek. Jet. g. it follows that att tuk. i.::: a"tkl This formula defines the product of two tensors. Since the vectors y. y. P) is indeed independent of choice of basis. f". y. fk Therefore eikr. the same is true of the sum I'. A (ea. ) /(e2. Let us choose a new basis e'1. e2. f. . ck f'a) = A (ek. l'(y. f'k).) ) and g. ) be a multilinear function of p vectors in R (p 1) and q vectors in R(q 1). ). f'a) = cak A( = A (ek. y. fu . We use 1 to define a new multilinear function of p 1 vectors in R and q 1 vectors in R. Let /(x.

) = l(e e .. = The tensor a'.k be a tensor of rank three and bt'n ai. jes. i. If the tensors a1 and b ki are looked upon as matrices of linear transformations. The operation of lowering indices discussed in para. Let ati and b. 4 of this section can be viewed as contraction of the product of some tensor by the metric tensor g. numbers independent of choice of basis. Another example. say.' be two tensors of rank two.INTRODUCTION TO TENSORS 181 and l'(e if follows that (8) . Let a.. then the tensor cit is the matrix of the product of these linear transformations.kb. We observe that contraction of a tensor of rank two leads to a tensor of rank zero (scalar). With any tensor ai5 of rank two we can associate a sequence of invariants (i. to a number independent of coordinate systems. say. simply scalars) a:. Their product ct'z' five. By multiplication and contraction these yield a new tensor of rank two: cit = aiabat."' is a tensor rank a tensor of rank two. The result of contracting this tensor over the indices i and m. a/ . f2. ). Likewise the raising of indices can be viewed as contraction of the product of some tensor by the tensor g". Another contraction. if one tried to sum over two covariant indices. However.e. would lead to a tensor of rank one (vector). the resulting system of numbers would no longer form a tensor (for upon change of basis this system of numbers would not transform in accordance with the prescribed law of transformation for tensors). (repeated as a factor an appropriate num- ber of times).::: obtained from a::: as per (8) is called a contraction of the tensor It is clear that the summation in the process of contraction may involve any covariant index and any contravariant index.e. over the indices j and k. would be a tensor of rank three. say.

etc.. by multiplying vectors we can obtain tensors of arbitrarily high rank. is a tensor of rank two. However. if ei are the coordinates of a contravariant vector and n. . then Ein. y. if then the tensor is said to be symmetric with respect to the first two (lower) indices. of a covariant vector. Thus. g. it can be shown that every tensor can be obtained from vectors (tensors of rank one) using the operations of addition and multiplication. A tensor is said to be symmetric with respect to a 6. multiplication by a number and total contraction (i. . symmetry of the tensor with respect to some group of indices is equivalent to symmetry of the corresponding multilinear function with respect to an appropriate set of vectors. f. if (9) 1(x.. to the tensor ail .. For example. f. addition. In connection with the above concept we quote without proof the following result: Any rational integral invariant of a given system of tensors can be obtained from these tensors by means of the operations of tensor multiplica- tion.e. g. as is clear from (9).182 LECTURES ON LINEAR ALGEBRA The operations on tensors permit us to construct from given tensors new tensors invariantly connected with the given ones. Since for a multilinear function to be symmetric with It goes without saying that we have in mind indices in the same (upper or lower) group. given set of indices i if its components are invariant under an arbitrary permutation of these indices. By a rational integral invariant of a given system of tensors we mean a polynomial function of the components of these tensors whose value does not change when one system of components of the tensors in question computed with respect to some basis is replaced by another system computed with respect to some other basis.. ) is the multilinear function corresponding If 1(x. . We observe that not all tensors can be obtained by multiplying vectors. Symmetric and skew symmetric tensors DEFINITION.e. i.. contraction over all indices). For example. y. )= then.

For a multilinear function to be symmetric with respect to a certain set of vectors it is sufficient that the corresponding tensor $a^{\,\cdots}_{\,\cdots}$ be symmetric with respect to an appropriate set of indices in some basis. This much is obvious from (9). On the other hand, symmetry of the multilinear function implies symmetry of the associated tensor in every coordinate system; it follows that if the components of a tensor are symmetric relative to one coordinate system, then this symmetry is preserved in all coordinate systems.

DEFINITION. A tensor is said to be skew symmetric if it changes sign every time two of its indices are interchanged. Here it is assumed that we are dealing with a tensor all of whose indices are of the same nature, i.e., either all covariant or all contravariant.

The definition of a skew symmetric tensor implies that an even permutation of its indices leaves its components unchanged and an odd permutation multiplies them by $-1$.

The multilinear functions associated with skew symmetric tensors are themselves skew symmetric in the sense of the following definition:

DEFINITION. A multilinear function $l(x, y, \ldots)$ of $p$ vectors in $R$ is said to be skew symmetric if interchanging any pair of its vectors changes the sign of the function.

For a multilinear function to be skew symmetric it is sufficient that the components of the associated tensor be skew symmetric relative to some coordinate system. Conversely, skew symmetry of a multilinear function implies skew symmetry of the associated tensor (in any coordinate system). Similarly, if the components of a tensor are skew symmetric in one coordinate system, then they are skew symmetric in all coordinate systems, i.e., the tensor is skew symmetric.

We now count the number of independent components of a skew symmetric tensor. Thus let $a_{ik}$ be a skew symmetric tensor of rank two. Then $a_{ik} = -a_{ki}$, so that components with repeated indices have the value zero and components which differ from one another only in the order of their indices can be expressed in terms of each other; hence the number of different components is $n(n-1)/2$. Similarly, the number of different components of a skew symmetric tensor $a_{ijk}$ of rank three is $n(n-1)(n-2)/3!$. More generally, the number of independent components of a skew symmetric tensor with $k$ indices ($k \le n$) is $\binom{n}{k}$.
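For instance, with $n = 3$ (the dimension is chosen purely for illustration), a skew symmetric tensor $a_{ik}$ of rank two has the matrix

$$(a_{ik}) = \begin{pmatrix} 0 & a_{12} & a_{13} \\ -a_{12} & 0 & a_{23} \\ -a_{13} & -a_{23} & 0 \end{pmatrix},$$

so there are $\binom{3}{2} = 3$ independent components $a_{12}, a_{13}, a_{23}$; with three indices there is $\binom{3}{3} = 1$ independent component, and with four or more indices every component vanishes.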

(There are no non-zero skew symmetric tensors with more than $n$ indices. This follows from the fact that a component with two or more repeated indices vanishes, and $k > n$ implies that at least two of the indices of each component coincide.)

We consider in greater detail skew symmetric tensors with $n$ indices. Since two sets of $n$ different indices differ from one another in order alone, it follows that such a tensor has only one independent component. Consequently, if $i_1, i_2, \ldots, i_n$ is any permutation of the integers $1, 2, \ldots, n$ and if we put $a_{12\cdots n} = a$, then

(10)  $a_{i_1 i_2 \cdots i_n} = \pm a,$

depending on whether the permutation $i_1 i_2 \cdots i_n$ is even ($+$ sign) or odd ($-$ sign).

EXERCISE. Show that as a result of a coordinate transformation the number $a_{12\cdots n} = a$ is multiplied by the determinant of the matrix associated with this coordinate transformation.

In view of formula (10) the multilinear function associated with a skew symmetric tensor with $n$ indices has the form

$$l(x, y, \ldots, z) = \sum a_{i_1 i_2 \cdots i_n}\,\xi^{i_1}\eta^{i_2}\cdots\zeta^{i_n}
 = a \begin{vmatrix} \xi^{1} & \xi^{2} & \cdots & \xi^{n} \\ \eta^{1} & \eta^{2} & \cdots & \eta^{n} \\ \vdots & \vdots & & \vdots \\ \zeta^{1} & \zeta^{2} & \cdots & \zeta^{n} \end{vmatrix}.$$

This proves the fact that, apart from a multiplicative constant, the only skew symmetric multilinear function of $n$ vectors in an $n$-dimensional vector space is the determinant of the coordinates of these vectors.

The operation of symmetrization. Given a tensor one can always construct another tensor symmetric with respect to a preassigned group of indices. This operation is called symmetrization and consists in the following. Let the given tensor be $a_{i_1 i_2 \cdots i_n}$, say. To symmetrize it with respect to the first $k$ indices, say, is to construct the tensor

$$a_{(i_1 i_2 \cdots i_k) i_{k+1} \cdots} = \frac{1}{k!} \sum a_{j_1 j_2 \cdots j_k i_{k+1} \cdots},$$

where the sum is taken over all permutations $j_1, j_2, \ldots, j_k$ of the indices $i_1, i_2, \ldots, i_k$. For example,

$$a_{(i_1 i_2)} = \frac{1}{2!}\,(a_{i_1 i_2} + a_{i_2 i_1}).$$
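As a concrete illustration of symmetrization (the numbers are chosen arbitrarily and are not from the text), let $n = 2$ and let the tensor $a_{i_1 i_2}$ have the components

$$(a_{i_1 i_2}) = \begin{pmatrix} 1 & 2 \\ 6 & 4 \end{pmatrix}; \qquad \text{then} \qquad (a_{(i_1 i_2)}) = \begin{pmatrix} 1 & 4 \\ 4 & 4 \end{pmatrix},$$

since $a_{(12)} = a_{(21)} = \tfrac{1}{2}(a_{12} + a_{21}) = \tfrac{1}{2}(2 + 6) = 4$, while the components with equal indices are unchanged.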

The operation of alternation is analogous to the operation of symmetrization and permits us to construct from a given tensor another tensor skew symmetric with respect to a preassigned group of indices. The operation is defined by the equation

$$a_{[i_1 i_2 \cdots i_k]\cdots} = \frac{1}{k!} \sum \pm\, a_{j_1 j_2 \cdots j_k \cdots},$$

where the sum is taken over all permutations $j_1, j_2, \ldots, j_k$ of the indices $i_1, i_2, \ldots, i_k$ and the sign depends on the even or odd nature of the permutation involved. For instance,

$$a_{[i_1 i_2]} = \frac{1}{2!}\,(a_{i_1 i_2} - a_{i_2 i_1}).$$

The operation of alternation is indicated by the square bracket symbol $[\;]$. The brackets contain the indices involved in the operation of alternation.

Given $k$ vectors $\xi^{i}, \eta^{i}, \ldots, \tau^{i}$, we can construct their product $a^{i_1 i_2 \cdots i_k} = \xi^{i_1}\eta^{i_2}\cdots\tau^{i_k}$ and then alternate it to get $a^{[i_1 i_2 \cdots i_k]}$. It is easy to see that the components of this tensor are all $k$th order minors of the following matrix:

$$\begin{pmatrix} \xi^{1} & \xi^{2} & \cdots & \xi^{n} \\ \eta^{1} & \eta^{2} & \cdots & \eta^{n} \\ \vdots & \vdots & & \vdots \\ \tau^{1} & \tau^{2} & \cdots & \tau^{n} \end{pmatrix}.$$

Consider a $k$-dimensional subspace of an $n$-dimensional space $R$. We wish to characterize this subspace by means of a system of numbers, i.e., we wish to coordinatize it. A $k$-dimensional subspace is generated by $k$ linearly independent vectors $\xi^{i}, \eta^{i}, \ldots, \tau^{i}$. Different systems of $k$ linearly independent vectors may generate the same subspace. However, it is easy to show (the proof is left to the reader) that if two such systems of vectors generate the same subspace, the tensors constructed from each of these systems differ by a non-zero multiplicative constant only.

Thus the skew symmetric tensor $a^{[i_1 i_2 \cdots i_k]}$ constructed on the generators $\xi^{i}, \eta^{i}, \ldots, \tau^{i}$ of the subspace defines this subspace. The tensor $a^{[i_1 i_2 \cdots i_k]}$ does not change when we add to one of the vectors any linear combination of the remaining vectors.
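As an illustration (with $k = 2$ and $n = 3$ chosen arbitrarily; this example is not taken from the text), let the subspace be generated by $\xi = (\xi^{1}, \xi^{2}, \xi^{3})$ and $\eta = (\eta^{1}, \eta^{2}, \eta^{3})$. Then

$$a^{[i_1 i_2]} = \tfrac{1}{2}\,(\xi^{i_1}\eta^{i_2} - \xi^{i_2}\eta^{i_1}),$$

whose essentially different components $a^{[12]}, a^{[13]}, a^{[23]}$ are, up to the factor $\tfrac{1}{2}$, the three second order minors of the matrix

$$\begin{pmatrix} \xi^{1} & \xi^{2} & \xi^{3} \\ \eta^{1} & \eta^{2} & \eta^{3} \end{pmatrix}.$$

Replacing $\eta$ by $\eta + c\xi$ leaves every such minor, and hence the tensor, unchanged, while replacing the pair $\xi, \eta$ by another pair spanning the same plane multiplies all the components by one and the same non-zero constant.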
