This action might not be possible to undo. Are you sure you want to continue?
THE THEORY OF
MATRICES
F. R. GANTMACHER
VOLUME ONE
AMS CHELSEA PUBLISHING
American Mathematical Society Providence, Rhode Island
The present work, published in two volumes, is an English translation by K. A. Hirsch, of the Russianlanguage book TEORIYA MATRITS by F. R. Gantmacher (FaarMaxep)
2000 Mathematics Subject Classification. Primary 1502.
Library of Congress Catalog Card Number 5911779 International Standard Book Number 0821813765 (Vol.I)
Copyright © 1959, 1960, 1977 by Chelsea Publishing Company Printed in the United States of America. Reprinted by the American Mathematical Society, 2000 The American Mathematical Society retains all rights except those granted to the United States Government. ® The paper used in this book is acidfree and falls within the guidelines established to ensure permanence and durability.
Visit the AMS home page at URL: http://wv.ams.org/ 1098765432 0403020100
PREFACE
THE MATRIX CALCULUS is widely applied nowadays in various branches of
mathematics, mechanics, theoretical physics, theoretical electrical engineering, etc. However, neither in the Soviet nor the foreign literature is there a book that gives a sufficiently complete account of the problems of matrix theory and of its diverse applications. The present book is an attempt to fill this gap in the mathematical literature. The book is based on lecture courses on the theory of matrices and its applications that the author has given several times in the course of the last seventeen years at the Universities of Moscow and Tiflis and at the Moscow Institute of Physical Technology.
The book is meant not only for mathematicians (undergraduates and research students) but also for specialists in allied fields (physics, engineering) who are interested in mathematics and its applications. Therefore the author has endeavoured to make his account of the material as accessible as possible, assuming only that the reader is acquainted with the theory of determinants and with the usual course of higher mathematics within the programme of higher technical education. Only a few isolated sections in the last chapters of the book require additional mathematical knowledge on
the part of the reader. Moreover, the author has tried to keep the individual chapters as far as possible independent of each other. For example, Chapter V, Functions of Matrices, does not depend on the material contained in Chapters II and III. At those places of Chapter V where fundamental concepts introduced in Chapter IV are being used for the first time, the corresponding references are given. Thus, a reader who is acquainted with the rudiments of the theory of matrices can immediately begin with reading the chapters that interest him. The book consists of two parts, containing fifteen chapters. In Chapters I and III, information about matrices and linear operators is developed ab initio and the connection between operators and matrices is introduced. Chapter II expounds the theoretical basis of Gauss's elimination method
and certain associated effective methods of solving a system of n linear equations, for large n. In this chapter the reader also becomes acquainted with the technique of operating with matrices that are divided into rectangular `blocks.'
iii
iv
PREFACE
In Chapter IV we introduce the extremely important `characteristic'
and `minimal' polynomials of a square matrix, and the `adjoint' and `reduced adjoint' matrices.
In Chapter V, which is devoted to functions of matrices, we give the general definition of f (A) as well as concrete methods of computing itwhere f (A) is a function of a scalar argument A and A is a square matrix. The concept of a function of a matrix is used in §§ 5 and 6 of this chapter for a complete investigation of the solutions of a system of linear differential equations of the first order with constant coefficients. Both the concept of a function of a matrix and this latter investigation of differential equations are based entirely on the concept of the minimal polynomial of a matrix
andin contrast to the usual expositiondo not use the socalled theory of elementary divisors, which is treated in Chapters VI and VII. These five chapters constitute a first course on matrices and their applications. Very important problems in the theory of matrices arise in connection with the reduction of matrices to a normal form. This reduction is carried out on the basis of Weierstrass' theory of elementary divisors. In view of the importance of this theory we give two expositions in this book : an analytic one in Chapter VI and a geometric one in Chapter VII. We draw the reader's attention to §§ 7 and 8 of Chapter VI, where we study effective methods of finding a matrix that transforms a given matrix to normal form. In § 8 of Chapter VII we investigate in detail the method of A. N. Krylov for the practical computation of the coefficients of the
characteristic polynomial.
In Chapter VIII certain types of matrix equations are solved. We also
consider here the problem of determining all the matrices that are permutable
with a given matrix and we study in detail the manyvalued functions of
matrices N/A and 1nA. Chapters IX and X deal with the theory of linear operators in a unitary space and the theory of quadratic and hermitian forms. These chapters do
not depend on Weierstrass' theory of elementary divisors and use, of the preceding material, only the basic information on matrices and linear operators contained in the first three chapters of the book. In § 9 of Chapter X we apply the theory of forms to the study of the principal oscillations of a
system with n degrees of freedom. In § 11 of this chapter we give an account of Frobenius' deep results on the theory of Hankel forms. These results are
used later, in Chapter XV, to study special cases of the RouthHurwitz
problem.
The last five chapters form the second part of the book [the second volume, in the present English translation). In Chapter XI we determine normal forms for complex symmetric, skewsymmetric, and orthogonal mat
PREFACE
V
rices and establish interesting connections of these matrices with real matrices of the same classes and with unitary matrices.
In Chapter XII we expound the general theory of pencils of matrices of the form A + AB, where A and B are arbitrary rectangular matrices of the same dimensions. Just as the study of regular pencils of matrices A + AB is based on Weierstrass' theory of elementary divisors, so the study of singular pencils is built upon Kronecker's theory of minimal indices, which is, as it were, a further development of Weierstrass's theory. By means of Kronecker's theorythe author believes that he has succeeded in simplifying the exposition of this theorywe establish in Chapter XII canonical forms of the pencil of matrices A + AB in the most general case. The results obtained there are applied to the study of systems of linear differential equations with constant coefficients. In Chapter XIII we explain the remarkable spectral properties of matrices with nonnegative elements and consider two important applications of matrices of this class : 1) homogeneous Markov chains in the theory of probability and 2) oscillatory properties of elastic vibrations in mechanics. The matrix method of studying homogeneous Markov chains was developed in the book [25] by V. I. Romanovskii and is based on the fact that the matrix of transition probabilities in a homogeneous Markov chain with a finite number of states is a matrix with nonnegative elements of a special type (a `stochastic' matrix). The oscillatory properties of elastic vibrations are connected with another important class of nonnegative matricesthe `oscillation matrices.' These matrices and their applications were studied by 'Al. G. Krei:n jointly with the author of this book. In Chapter XIII, only certain basic results in this domain are presented. The reader can find a detailed account of the whole material in the monograph [7]. In Chapter XIV we compile the applications of the theory of matrices to systems of differential equations with variable coefficients. The central place (§§ 59) in this chapter belongs to the theory of the multiplicative integral (Produktintegral) and its connection with Volterra's infinitesimal calculus. These problems are almost entirely unknown in Soviet mathematical literature. In the first sections and in § 11, we study reducible systems (in the sense of Lyapunov) in connection with the problem of stability of motion ; we also give certain results of N. P. Erugin. Sections 911 refer to the analytic theory of systems of differential equations. Here we clarify an inaccuracy in Birkhoff's fundamental theorem, which is usually applied to the investigation of the solution of a system of differential equations in the neighborhood of a singular point, and we establish a canonical form of the solution in the case of a regular singular point.
vi
PREFACE
In § 12 of Chapter XIV we give a brief survey of some results of the
fundamental investigations of I. A. LappoDanilevskii on analytic functions of several matrices and their applications to differential systems. The last chapter, Chapter XV, deals with the applications of the theory
of quadratic forms (in particular, of Hankel forms) to the RouthHurwitz problem of determining the number of roots of a polynomial in the right halfplane (Re z > 0). The first sections of the chapter contain the classical treatment of the problem. In § 5 we give the theorem of A. M. Lyapunov in
which a stability criterion is set up which is equivalent to the RouthHurwitz criterion. Together with the stability criterion of RouthHurwitz we give, in § 11 of this chapter, the comparatively little known criterion of Lienard and Chipart in which the number of determinant inequalities is only about
half of that in the RouthHurwitz criterion.
At the end of Chapter XV we exhibit the close connection between stabil
ity problems and two remarkable theorems of A. A. Markov and P. L.
Chebyshev, which were obtained by these celebrated authors on the basis of the expansion of certain continued fractions of special types in series of decreasing powers of the argument. Here we give a matrix proof of these theorems.
This, then, is a brief summary of the contents of this book.
F. R. Gantmaeher
PUBLISHERS' PREFACE
TIIE PUBLISHERS WISH TO thank Professor Gantmaeher for his kindness in
communicating to the translator new versions of several paragraphs of the original Russianlanguage book. The Publishers also take pleasure in thanking the VEB Deutscher Verlag der Wissenschaften, whose many published translations of Russian scientific books into the German language include a counterpart of the present work, for their kind spirit of cooperation in agreeing to the use of their formulas in the preparation of the present work. No material changes have been made in the text in translating the present work from the Russian except for the replacement of several paragraphs by the new versions supplied by Professor Gantmacher. Some changes in the references and in the Bibliography have been made for the benefit of the Englishlanguage reader.
^.. Transformation of coordinates 5.^.. Sylvester's determinant identity . __^. Basic notation . Mechanical interpretation of Gauss's algorithm _. 3. Square matrices 1.. ^ _ ... § . 12 3... The technique of operating with partitioned matrices._.^. 4... MATRICES AND OPERATIONS ON MATRICES... The partition of a matrix into blocks. Matrices.. Vector spaces § 2..^ .»  4.. Gauss's elimination method ... 19 H. THE ALGORITHM OF GAUSS AND SOME OF ITS APPLICATIONS 23 § § § § § 1...CONTENTS PREFACE PUBLISHERS' PREFACE ill vi ..__...... The generalized algo41 rithm of Gauss III..»^^^ I 3 2. A linear operator mapping an ndimensional space into an mdimensional space ^^... .»2. § § § § 55 57 59 61 66 Equivalent matrices...^..._^_ 4. The rank of an operator.. LINEAR OPERATORS IN AN nDIMENSIONAL VECTOR SPACE § 50 50 1.^ 3. The decomposition of a square matrix into triangular fac 23 28 31 33 tors 5.....». Addition and multiplication of linear operators .»» 1 § § § ^. _.._. Compound matrices. Sylvester's inequality Linear operators mapping an ndimensional space into itself vii . _. I.. Minors of the inverse matrix. Addition and multiplication of rectangular matrices_...... 6.
§ § § 134 Invariant polynomials and elementary divisors of a polynomial matrix 139 145 147 149 153 § § Equivalence of linear binomials5..___ .. The characteristic polynomial of a matrix...130 § 1. 3._. 2..r.... coefficients Stability of motion in the case of a linear system. The adjoint 82 matrix The method of Faddeev for the simultaneous computation of the coefficients of the characteristic polynomial and of 87 the ad joint matrix _ »» The minimal polynomial of a matrix. The components of the matrix A_ _ . FUNCTIONS OF MATRICES.. 7.95 95 § 2. § 8. 3. § 6._... § The normal forms of a matrix.. 130 Canonical form of a Amatrix........_._.. EQUIVALENT TRANSFORMATIONS OF POLYNOMIAL MATRICES._... THE CHARACTERISTIC POLYNOMIAL AND THE MINIMAL POLY_ NOMIAL OF A MATRIX.w .. § 5.. Definition of a function of a matrix. The elementary divisors of the matrix f(A) .. Application of a function of a matrix to the integration of § 4.. 116 125 VI... Characteristic values and characteristic vectors of a linear operator Linear operators of simple structure.. ANALYTIC THEORY OF ELEMENTARY DMSORS.w § § 1. 76 76 77 80 § § § 2._. ___ . The LagrangeSylvester interpolation polynomial101 3. § a system of linear differential equations with constant § 6..» 104 Representation of functions of matrices by means of series 110 5. Elementary transformations of a polynomial matrix.  69 72 IV. 6.CONTENTS § 7. _ The generalized Bezout theorem 4._. A criterion for similarity of matrices... Other forms of the definition of f (A).. Addition and multiplication of matrix polynomials___ Right and left division of matrix polynomials.._ . 4. _ _ § 1.._ 89 _ V..
.. 242 242 243 General considerations Metrization of a space.. 164 VII..... 215 . _. The extraction of mtb roots of a nonsingular matrix. § 2.... . Another method of constructing a transforming matrix... r _.. 5.._... 256 Orthonormal bases .... The extraction of mth roots of a singular matrix.. 246 Orthogonal projection 248 The geometrical meaning of the Gramian and some inequalities ..... Elementary divisors_.... Matrix polynomial equations 227 6... Decomposition into invariant subspaces with coprime ____ _ minimal polynomials § § 3.. 200 Krylov's method of transforming the secular equation. . _. THE STRUCTURE OF A LINEAR OPERATOR IN AN nDIMEN175 SIONAL SPACE.. The equation AX .. _. _ . _ .... 3.. 184 The normal form of a matrix 190 § § § 6.. 231 7...._. Decomposition of a space into cyclic invariant subspaces...§ 1.._....... Commuting matrices. A general method of constructing the transforming matrix 159 9..... The special case A = B.... ... The scalar equation f (X) = 0___ .. Factor space .. 8....._. § § § § § 6.. 262 The adjoint operator 265 ... _. »» 215 1. Congruence. . The equation AX = XB . 2.... 234 8.. Gram's criterion for linear dependence of vectors . 202  ____ ...» _ .XB = C ... 220 § § 4. MATRIX EQUATIONS § § § ... .CONTENTS § § ix 8. VIII... 2... The logarithm of a matrix.... 3. The minimal polynomial of a vector and a space (with 175 respect to a given linear operator)... . 8....... . 250 Orthogonalization of a sequence of vectors . LINEAR OPERATORS IN A UNITARY SPACE § § § 1.... . Invariant polynomials... _ 239 ... . § 5. » 177 181 4._ IX... _.193 The Jordan normal form of a matrix....... 7.. _... 4. 225 225 § § § 5..._ _. 7.
..... hermitian. 270 § 11..._ __ ......._. Extremal properties of the characteristic values of a regular pencil of forms .... 338 5. Polar decomposition of an operator and the Cayley formulas in a euclidean space... and unitary operators.. .. w »» § 13.... Polar decomposition of a linear operator in a unitary space. INDEX . _ _ _..... Linear operators in a euclidean space. _ 274 _ .. 326 § 9... _.... Positive quadratic forms 304 2.......x § CONTENTS 268 9......M._.... 294 § § § Reduction of a quadratic form to a sum of squares..»..... _... Hankel forms _ .. Positivesemidefinite and positivedefinite hermitian op_ _ . The spectra of normal. 308 § 6.. Normal operators in a unitary space... Small oscillations of a system with n degrees of freedom... BIBLIOGRAPHY 351 .... 276 280 § 15. QUADRATIC AND HERMITIAN FORMS. § 14.. Hermitian forms 331 § 10. 299 Reduction of a quadratic form to principal axes.. 369 . § § 286 290 294 1. The law of inertia 296 3. ... Commuting normal operators X..._. Cayley's formulas _. The methods of Lagrange and Jacobi of reducing a quadratic form to a sum of squares 4.. Pencils of quadratic 310 § 7. Transformation of the variables in a quadratic form..._ § 10. erators § 12.». 317 § S. _.
Let F be a given number field. . k=1. The numbers that constitute the matrix are called its elements. then we shall write A= 11 ait 111. NOTATION : In the doublesubscript notation for the elements. Matrices.CHAPTER I MATRICES AND OPERATIONS ON MATRICES § I. 2. 1 . multiplication.' DEFINITION 1: A rectangular array of numbers of the field F ary .. equal to n.n a21 (1) I is called a matrix. n).. 2. for example A. When m = n.. subtraction. All the numbers that will occur in the sequel are assumed to belong to the number field given initially.. the first subscript always denotes the row and the second subscript the column containing the given element. and the set of all complex numbers. In the general case the matrix is called rectangular (of dimension m X n). m. . As an alternative to the notation (1) for a matrix we shall also use the abbreviation Ilaikll (i =1. Examples of number fields are: the set of all rational numbers. and division by a nonzero number can always be carried out. is called its order.. the set of all real num bers. (2) Often the matrix (1) will also be denoted by a single letter. the matrix is called square and the number m... Basic Notation 1. If A is a square matrix of order it. The determinant of a square matrix A aik 1171 will be denoted by I aix in or by A number field is defined as an arbitrary collection of numbers within which the four operations of addition. a..
.. . MATRICES AND MATRIX OPERATIONS We introduce a concise notation for determinants formed from elements of the given matrix: at...zn11 are zero is called a row matrix and will be denoted by [zi.. . A rectangular matrix consisting of a single column x1 XS is called a column matrix and will be denoted by (XI. .. .k.. (3') The minors (3') in which i. .p A(k1k2. If r is the rank of a rectangular matrix A of dimension m X n. x2. A rectangular matrix consisting of a single row 1I21.. t1 i$. < k... ajPkP The determinant (3) is called a minor of A of order p....n) I 2..k'..... m .. .. . . In the notation (3) the determinant of a square matrix A=11aikIII can be written as follows : IAI=A (1 2.. 2. < k.. ip = k.. . i2 = k2. .. (3) arr.22.kP) a?..... n) has l') ) minors and '_5 ' 1 t112. are called principal minors. . .n The largest among the orders of the nonzero minors generated by a matrix is called the rank of the matrix.. n). = k...ajk... a.2 1.... .ip k1 k$.. 2.:5n.. then obviously r < min (m..:9 n n'.. a. a1Pk. z2. kP P Sm G 5 k1 < k2 < .. A square matrix in which all the elements outside the main diagonal . provided matrix A of order p A ask <k2 <.k. Arectangular 1.... k = 1...
2.0 0 ... x2i .n by means of the formulas (4) is called a linear transformation. .dnli JJd{8.. are expressed in terms of the quantities x..dJSuppose that m quantities y. in). and multiplication of matrices.... 2. Y.dt. is called a diagonal matrix and is denoted by' or by (dl. The coefficients of this transformation form a rectangular matrix (1) of dimension m X n. y. . Y2 . multiplication of a matrix by a number. .. The linear transformation (4) determines the matrix (1) uniquely.. Y2. . Addition and Multiplication of Rectangular Matrices We shall define the basic operations on matrices : addition of matrices. m) (5) 2 Here 84. x into the quantities Yi. y.. . y2 = a2lxl + a22x2 + ... k). y.0 0 110 d2. + alnxn + a2nxn (4) ym = amlxl + amsxs + . § 2. (4') The transformation of the quantities x... . is the Kronecker symbol : 8rk = 1 (i =k). xn by means of the linear transformation yy = k_1 E a{kxk n (i =1.. have linear and homogeneous expressions in terms of n other quantities x.. ADDITION AND MULTIPLICATION OF MATRICES 3 dl 0 . + amnxn .. 1.§ 2. x2i ......k1J. . and vice versa.... or more concisely. 10 (1* .. In the next section we shall define the basic operations on rectangular matrices using the properties of the linear transformations (4) as our starting point... . Y 2 . = kl aikxk (i =1.... xR : yl = allxl + a12x2 + ... . Suppose that the quantities y...
Let us multiply the quantities y. . . we formulate the following definition.. y... Then ayi k1 (aark) xk (i =1. is the matrix C _ 11 crk 11.. (7) In accordance with this. . z2i . 2. The operation of addition of matrices extends in a natural way to the case of an arbitrary finite number of summands. A+B=B+A.. (A+B) +C=A+ (B+C).. Here A. x2 .. B.. the coefficient matrix of the transforma tion (7) is the sum of the coefficient matrices of the transformations (5) and (6).li llbi+di bE+d$ 63+dall' a3 + c3 According to Definition 2. 2. whose elements are the smuts of the corresponding elements of the given matrices : C=A+B. m). in the transformation (5) by some number a of F. 2. . The operation of forming the sum of given matrices is called addition... m... MATRICES AND MATRIX OPERATIONS and the quantities z. . only rectangular matrices of equal dimension can be added. . both of dimension tit X n. of the same dimension. 2. DEFINITION 2: The suns of two rectangular matrices A= 11 aik II and B = H bik If . 2... in). Example.. (6) yr + zi = E (ark + brk) xk k1 (i =1. bsll+lldi d2 . . From the definition of matrix addition it follows immediately that this operation has the properties of commutativity and associativity : 1. 2. .. a. By virtue of the same definition. y2.. ... a. . and C are arbitrary rectangular matrices all of equal dimension. . ..n in terms of the same quantities xl. In accordance with this. k= 1. m) . where irk=ark+ bik (i =1. xn by means of the transformation n zi = k=1 bikxk Then n (i =1..we formulate the following definition. a3 bl b$ lle. z.4 1. 2. n). ..
. . Example.. k=1. xq by means of the composite transformation : 3 Here the symbols I A I and j a. (i=1. z2i .2. . 2.. where (i=1. . (a+#) A = aA+#A. n).§ 2... . of a1 az b1 b2 11 agal =2 "43 b3 II  b1 bs abs It is easy to see that 1.n) by anumberaof Fis the matrixC=ll cck (i=1.. If A is a square matrix of order n and a a number Of F.k= aa. y2. 2. Here A and B are rectangular matrices of equal dimension and a and P are numbers of F..4 denote the detenuiuants of the matrices A and aA (see p.. ..... y are expressed in terms of the quantities x1i x2.m. 2. k=1.t multiplication of the matrix by the number.. ADDITION AND MULTIPLICATION OF MATRICES ai 5 The product of a matrix A = I aik (i = 1.. ments of A by multiplication by a : C. then 3 3. a(A + B) = aA + aB. ..B of two rectangular matrices of equal dimension is defined by AB=A+ (1)B. m) (8) and that the quantities y1. elek = 1.... . 3. . . 2.. z... The operation of forming the product of a matrix by a number is called c. m. . . n) in (8) we can express z1. =X 1a+yk k.. . xq by the formulas 4 yk= bkixi (k =1. . z2i .. Suppose that the quantities z1. (ao)A = a(#A).. The difference A . . y by the transformation n z. n) whose elements are obtained from the corresponding DEFINITION 3.2... = aA.... . .. in terms of x1. 2..... . 2. . 1 aAJ=an A1. .. 2. . n) (9) Then on substituting these expressions for the yk (k = 1. . z in are expressed in terms of the quantities y1. . 1). 2.. m. . x2. y2..
... is defined n *s the sum of the products of the corresponding numbers: i=t a. m. . b. e. .b{. The product of two sequences of numbers a.('aikbsi)xt 11 k1 k1 i1 (a=1. a2n am2. + b. ex /1 bl Ca ea b.. 2. (10) In accordance with this we formulate the following definition.: B= b21 aml I bnl bn2 .. The product of two rectangular matrices all a12 . b2q A= is the matrix a21 a22 .. Example. m).. . kl (11) The operation of forming the product of given matrices is called matrix multiplication..... aln b11 b12 . Note that the operation of multiplication of two rectangular matrices can only be carried out when the number of columns of the first factor is equal to the number of rows of the second. DEFINITION 4. .. /a aldl + a.... _ alc1 + age.. j=1. cll Ci = C21 c12 .. In particular. blcl + b. 2. bids + beds + b.. ... at the intersection of the ith row and the jth column is the 'product'* of the ith row of the first matrix A into the jth column of the second matrix B : n (i= 1. . Cm2. . + b.ea a1/1 + aJ1 + ads a b. a and b.. MATRICES AND MATRIX OPERATIONS n zi=A'aikEb.. ble b22 . .. Cl dl d..d. clq C22 . b. arn.Cmq in which the element c{. cs d...6 1. multiplication is always possible when both factors are square matrices of one and the same order. .ds + aad.cs a1e1 + ases + aaea bier + bne.c. ./1+bsl2 +b3/3 ll By Definition 4 the coefficient matrix of the transformation (10) is the product of the coefficient matrices of (8) and (9).xt'=.. e2q cm... b2. 2... q). + aac. a..
.§ 2. . (AB)C = A (BC).. (A+B)C=AC+BC... . U=:1j. we can write the linear transformation yl = a11x1 + a12x2 + ... 1 2 2 0 3 4 3 1 H 18 4 8 2 .. then the matrices A and B are called permutable or commuting.A(B+C)=AB+AC.. 3. For example.I0 are permutable. or in abbreviated form. 2. ADDITION AND MULTIPLICATION OF MATRICES 7 The reader should observe that even in this special case the multiplication of matrices does not have the property of commutativity.. The definition of matrix multiplication extends in a natural way to the case of several factors. + alnxa y2 = a21x1 + a22x2 + . xn Ym amt am2.. because and B = 3 2 :u 7 2 AB=I 7 6 6 4 IandBA=II 64I It is very easy to verify the associative property of matrix multiplication and also the distributive property of multiplication with respect to addition: 1. Example. When we make use of the multiplication of rectangular matrices.. + a2nxn ym = amlxl+ am2xt + . but 1 2 3 4 2 4 0 2 If AB = BA.. a2. The matrices A. + amnxn as a single matrix equation y1 Y2 all a12 .. aln aY1 a2. a.
. is a rectangular matrix of dimension m X n. a2nd... 2. bim (12) b 22 ..df (i =1.}. 0 aml am2..d2 . then the columns (rows) of A are multiplied by dl..... dmama Hence: When a rectangular matrix A is multiplied on the right (left) by a diagonal matrix {dx... . . . d1 all a12 .. . aln aml amt Ibl.. amn b. n)... d2a2n amt am2. . from (11) that c{.. cam all ale .. alnd.... x2. all a21 a12 .. y2.8 I.0 alld1 a21d1 a12d2 .. MATRICES AND MATRIX OPERATIONS and y = (yl.. j= 1... 0 O . d2. b2m .n cy =«ai bq (i.. .amn a2.=a. respectively : C11 ...... . 2..... bn. j=1. m. 4... a1n a22 .0 d2.. a2n I dla11 d2a21 dla12 ... y... ..ama dmaml dmam2. . {13) ... amndn Similarly. d2. d 2 . ...1 ..... 2... Cim Cmi .. dlaln d2a22 ... Suppose that a square matrix C = 11 c{q IIi is the product of two rectangular matrices A= II ack II and B = If bk5 II of dimension m X n and n X m. respectively.) are column matrices Here x = (x1. and A= II a{k II Let us treat the special case when in the product C = AB the second Then it follows factor is a square diagonal matrix B = {dl.. aln 0 . m)... ..d amldi am2d2. .... ..
<km n ``1(klk2.. am (so that S When in > n. . By (13) the determinant of C can be represented in the form C11 ..... so that every summand on the righthand side of (15) is zero...1 n E ala.km)B(11 2 ... Hence in this case I C I =0....1 . l ba. In that case the righthand sides of (14) and (14') are to be replaced by zero. . 01 n am=1 I ama+nba... ADDITION AND MULTIPLICATION OF MATRICES 9 We shall establish the important BinetCauchy formula.. bamm .:: n (1 al.2 .m (14) ..ba.l a. Cmm ' I n alk. a2...<k.. a1k.. amambam..1 . then among the numbers al.<. .. ... which expresses the determinant I C I in terms of the minors of A and B : C11 .=..ba.m ala. Derivation of the BinetCauchy formula. clm a..ba.. bk. Cmm X a.m a....... Now let m < n.1 .1 m ba. .m) (14') 1g P. 46 If m > n....ba. aa..... am there are always at least two that are equal. alamba. ..... C1m Cml .. anal L n n alambamm Cml . .. as.. am= 1 a. C(12. ..mm)' According to this formula the determinant of C is the sum of the products of all possible minors of the maximal (mth) orders of A into the corresponding minors of the same order of B... ..§ 2. am are equal... 2 _ E A `al a2 . Then in the sum on the righthand side of (15) all those summands will be zero in which at least two of the subscripts al. amk } bkll .. All the remaining summands of (15) can be split into groups of m ! terms each by combining into one group those summands that differ from each other only in the order of the subscripts a. amk m bm . in the notations of page 2.. the matrices A and B do not have minors of order in. .. k1 k bk mm or. am..
. n).. (17) (a1b1+a$b2+.. + bndn b1 bs ..dnl Ij I an bb 6 Here k.an c= d..+anbn bl+b22+.. 2..... a_) = (1)N .. a2..+anbn)2<(a.... a1c1+ bldg ..... we obtain : aj+of+. + bncn laa aide+azda... < . .. ales + bldn and }. . ..+ancn b1c... ancn }.. MATRICES AND MATRIX OPERATIONS within each such group the subscripts a. ... b . a2.+an) Here the equality sign holds if and only if all the numbers aj are proportional to the corresponding numbers b{ (i=1. b{ = d{ (i = 1. + ...+b'n 1= X Ibi bk I' 1:5 i<k:gn aj ak If as and b{ (i = 1....+. . km ) ball ba. is the normal order of the subscripts a. km } B (11 2 .. Example 2..d.s+ay+.. am have one and the same set of values).+anbn a..+.gi<ks ... . cc dt bi b1 c* 4 (16) Setting ai = ci... . . am) A (k1 k2 =A 1 2 k2 (k. km k2 ..10 I. a.. a1 b1 Ildl. .. .b1+asbs+. bn en Therefore formula (14) yields the socalled Cauchy identity a. mm [ Hence from (15) we obtain (14').. Cl d1 c1+ascs.... . + a... A (k1 k2 . am) ballba.. where N is the number of transpositions of the indices needed to put the permutation a. 2. ..+.. .... + and..... < k. 2.....bndn .. Now within one such group the sum of the corresponding terms ise e (a1.cn ald1 + a02 + .+an albs+aob.bndl ....... into normal order.. . < k... + bndn at a* I l. b1d1 + b. m xe (ale a2.... .. .2 .. +and a1 a2...... ...... am and s ( a1.. a. k) .. + bncn bldg + b + . Example 1...2 .cl + a.+bn)... n) in this identity.. n) are real numbers. + bscs + . we deduce the wellknown inequality (bi+bj+.. a. b1e1 + b et + .. a. y + ..
.. j =1. aaca + bed.. q) C=AB.1 ai.. the determinant of the product of two square matrices is equal to the product of the determinants of the factors.. alc + blda = 0. jp) r1 1 s9h<j:<. m... ..2 ... 2... blf..... we arrive at the wellknown multiplication theorem for determinants : 2... to express the minors of the product of two rectangular matrices in terms of the minors of the factors. a(pa baf. . 2.n)B 1 2. k =1.n_A 12... a.. in the general case also...<ip9q PSmandp5q.. b2M alpl aip8 . ai.. B= IIbkj 1. n) or. + bid1 . 5. (i and C =Iic{f (I 1.. The BinetCauchy formula enables us. bafp .. The matrix formed from the elements of this minor is the product of two rectangular matrices af.... .2.n (18) Thus. . n.. ADDITION AND MULTIPLICATION OF MATRICES 11 Therefore for n > 2 ale.kII.c1 + bade Let us consider the special case where A and B are square matrices of one and the same order n..n \1 2.n in (14').. When we set m =.. ... We consider an arbitrary minor of C : C $1 i=.ip i j:. in another notation. 12. 2... Let A= 1Ia.. ..n 1 2.... blip bQf..
m. A p times (p =1.. 92 .. .. C... then re <min (rA. we obtain :' C (ii i2 _ I S k1 <ks <<.12 1. Square Matrices 1. B.. <kp n A (ui 2 .. For p > 1 formula (19) is a natural generalization of (11). 7 It follows from the BinetCauchy formula that the minors of order p in C for p > n (if minors of such orders exist) are all zero. Then the power of the matrix is defined in the usual way : AP=AA ..) ..2... rc are the ranks of A. ii 92 . MATRICES AND MATRIX OPERATIONS Therefore.. kr ) B (k1 ii ka (19) jp } . by applying the BinetCauchy formula. rn... If C = AR and rA. 9. See footnote 5. be a square matrix.n) E(m)A= AEt" )= A. From the associative property of matrix multiplication it follows that ApA°Ap+g. jr) ki k2 ... In that case the righthand side of (19) is to be replaced by zero. 2.rn) § 3.. Here p and q are arbitrary nonnegative integers. Clearly EW=II&II Let A= II au I1. p... The name `unit matrix' is connected with the following property of E : For every rectangular matrix A=IlataI( we have (i=1. The square matrix of order n in which the main diagonal consists entirely of units and all the other elements are zero is called the unit matrix and is denoted by E'"' or simply by E.2. We mention another consequence of (19). k=1.. A° = R. For p =1 formula (19) goes over into (11).. The rank of the product of two rectangular matrices does not exceed the rank of either factor.. .
. The matrix H'"' will also be denoted simply by H. 01 0 1 H = H(") = Hz = . 1 0 0 0 0.. In this we make use of the multiplication rule for powers: tp . Suppose that f (t) is the product of two polynomials g (t) and h (t) : f(t) = g(t)h(t). Hence. + am.' g(A)h(A) =h(A)g(A)..e.§ 3. (22) Let the sequence of elements aik for which k . it follows from (21) that f(A) =g(A)h(A)..0 0 0 0 1 . We denote by H'"' the square matrix order n in which all the elements of the first superdiagonal are units and all the other elements are zero.k = p) in a rectangular matrix A = II atk li be called the pth superdiagonal (subdiagonal) of the matrix. Then 010...i = p (i .. . however. Examples. It is worth mentioning that the substitution of matrices in an algebraic identity in several variables is not valid. The substitution of matrices that commute with one another. 0 0 0 . SQUARE MATRICES 13 We consider a polynomial (integral rational function) with coefficients in the field F: f (t) = aotm + altm1 + . Then by f (A) we shall mean the matrix /(A)=aoAm+aiAm1+. by virtue of the fact that h(t) g(t) = f (t). 0l Hp=0 (P >n). t9 = tp+q. Since all these operations remain valid when the scalar t is replaced by the matrix A. is allowable in this case.. in particular. two polynomials in one and the same matrix are always permuetable.+amE. 0 S Since each of these products is equal to one and the same f (A). i.... (21) The polynomial f (t) is obtained from g (t) and h (t) by multiplication term by term and collection of similar terms... We define in this way a polynomial in a matrix.
When an arbitrary rectangular matrix. if /(t)=ao+alt+a2teF is a polynomial in t. f(H)=a0E+a1H+ a2HP+. then ao 0 at ao as a. For example. 0 /(Ei)=a0E+a1F+ag +. . A of dimension m X n is multiplied on the left by the matrix H (or F) of order m. then ao a1 0 ae ..... 0 1 0 1 a1 b1 C1 a= bs Ct as b3 C3 b1 b4 C4 C1 b$ bs b4 0 0 0 1 Cs q C4 0 0 0 0 0 0 0 0 1 0 0 al b. ao Similarly. the last (first) column of A disappears. then all the columns of A are shifted to the right (left) by one place.14 1. = 0 as at j 0 0 .. When an arbitrary rectangular matrix A of dimension m X n is multiplied on the right by the matrix H (or F) of order n.. MATRICES AND MATRIX OPERATIONS By these equations.. 0 0 al bl C1 as b. Cp a3 ba Ca a4 b4 C4 0 0 0 a2 as a4 bs ba b4 0 0 2. For example. if F is the square matrix of order n in which all the elements of the first subdiagonal are units and all others are zero. then all the rows of A are shifted upward (or downward) by one place.= 0 al ao We leave it to the reader to verify the following properties of the matrices H and F: 1.. and the last (first) row of the product is filled by zeros. and the first (last) column of the product is filled by zeros. the first (last) row of A disappears. ..
. X.. aft ant  n xa1 ''a!x11 yx (24) an. C4 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 a3 by C2 c1 c2 as bs C3 a.. i+1 .+I .. i1 y1 a. . y2i . and observe that the determinant of the system of equations (23) is. n). then we can express x1i x2.pi Axi . x.. Otherwise A is called nonsingular. a1. k . k =1.. .! 1 0 0 2. 0 0 1 0 0 1 0 0 0 0 1 0 a2 b2 as bs C3 a.x i. . .. . . a1. (25) where Ax{ is the algebraic complement (the cofactor) of the element aki in the determinant I A I (i. b. From (24) it is easy to see that a(1). a21 ... in terms of yl. a2 b2 C2 as b3 C3 a. different from zero. C1 a.. b. A square matrix A is called singular if I A = 0.. . ann We have thus obtained the `inverse' transformation of the transformation (23). 2. b. C3 a3 bs C3 a. 0 00 0 0 0... . . Let us consider yi=x1 'aixxx (i =1. i1 yn an... i+1 . b. b. Cl 15 0 a. Let A = + 1 aix 1+i be a nonsingular matrix ( I A the linear transformation with coefficient matrix A 0).. . 2.. 0 b. 2.. The coefficient matrix of this transformation A1= 11 a{x1) III will be called the inverse matrix of A. n (23) When we regard (23) as equations for x1i x2j .. c.§ 3..1. . a2. b. SQUARE MATRICES a. C. by assumption. n). i1 y2 a2. yn by means of the wellknown formulas : ail ..
"Here we make use of the well known property of determinants that the sum of the products of the elements of an arbitrary column into the cofactors of the elements of that column is equal to the value of the determinant and the sum of the products of the elements of a column into the cofactors of the corresponding element of another column is zero.Aq . .n)... and this is impossible when I A I = 0.. j =1. cation of the matrices A and A'. then we would have by the multiplication theorem of determinants (see formula (18)) that I A X E = 1.' asi i>aki . [A'A].16 1.. in either order... . albsasb2 A' Ibsciblcs JAI b1cQb2c1 b2c3bsc2 ascs a2cs alesascs a2ci a1c2 asbi albs albsa2b1 By forming the composite transformation of the given transformation (23) and the inverse (24). we obtain in both cases the identity transformation (with the unit matrix as coefficient matrix) . then the equations (27) have no solution.'ail It is easy to see that the matrix equations AX=E and IA =E A O) (27) have no solutions other than X = A'. by (25) we have" n (26) The validity of equation (26) can also be established by direct multipliI (I) aikaki l =FA1 IAA']iiSimilarly.. 2. For if one of these equations had a solution X = Ij xik (' . nn k1 au_4ik = 6ii (i. (i. MATRICES AND MATRIX OPERATIONS For example. 10 If A is a singular matrix.2. For by multiplying both sides of the first (second) equation on the left (right) by A' and using the associative property of matrix multiplication we obtain from (26) in both cases :'0 X = A'. In fact.' Akiakj . n).j=1... if a1 a2 b2 C2 as bs C$ A= then bl Cl and I A 10. therefore AA' =A'A =E.
§ 3. § 17.kmI intersection of the ith row and the kth column and all the other elements are zeros. Since in this ring the operation of multiplication by a number of F is defined. XA=B (IAI 0). Modern Algebra. X = A'B and X =BA. . 12 For. and since there exists a basis of n2 linearly independent matrices in terms of which all the matrices of order n can be expressed linearly" 2 the ring of matrices of order n is an algebra. On comparing this with (28). SQUARE MATRICES 17 In the same way it can be shown that each of the matrix equations AX=B. IAlI=I For any two nonsingular matrices we have (AB)1 = B'A1. respectively. for example. i. A I . 12) that rB S rx and rx S rB. (30) 3. moreover. have one and only one solution. From (28) and (29) we deduce (see p. Modern Algebra. an arbitrary matrix A = II atk 11n with elements in F can be represented in n the form A = aikAk. van der Waerden. § 14. so that ra = rB. Note that (26) implies. van der Waerdeu. (28) where X and B are rectangular matrices of equal dimensions and A is a square matrix of appropriate order. for example. 13 See. All the matrices of order n form a ring" with unit element E'"'. (29) The matrices (29) are the `left' and the `right' quotients on `dividing' B by A.e. See. where Elk is the matrix of order n in which there is a 1 at the f." 11 A ring is a collection of elements in which two operations are defined and can always be carried out uniquely: the `addition' of two elements (with the commutative and associative properties) and the 'multiplication' of two elements (with the associative and distributive properties with respect to addition). the rank of the original matrix remains unchanged. I A' I = 1. the addition is reversible. we have: When a rectangular matrix is multiplied on the left or on the right by a nonsingular matrix.
4. 3) for every element e of the set there exists an inverse element a1 (a * a1 = A group is called commutative. 0 a 11 axe . 245ff. a. pp. .. lower triangular) matrices is a diagonal (upper triangular. and a1 * a=e). A= 0 0 A= . MATRICES AND MATRIX OPERATIONS All the square matrices of order v form a commutative group with respect to the operation of addition.. All the nonsingular diagonal matrices form a commutative group under multiplication. All the nonsingular upper (lower) triangular matrices form a (noncommutative) group under multiplication. in particular.. as do all the upper triangular matrices or all the lower triangular matrices. All the diagonal matrices of order n form a commutative group under the operation of addition... lower triangular) matrix and that the inverse of a nonsingular diagonal (upper triangular.. or abelian. . We conclude this section with a further important operation on matrices transposition. a triangular (and. if the group operation has the commutative property. aan a21 a.18 1. Since the determinant of a triangular matrix is equal to the product of its diagonal elements. 0 an$ . Therefore : 1. 0 a22 . ann axi . 2. 2) there exists a unit element e in the set (a * e = e * a = a). Concerning the group concept see. [531. a diagonal) matrix is nonsingular if and only if all its diagonal elements are different from zero... for example. lower triangular) matrix is a matrix of the same type. It is easy to verify that the sum and the product of two diagonal (upper triangular. .. 3. 14A group is a set of objects in which an operation is defined which associates with any two elements a and b of the set a welldefined third element a * b of the same set provided that 1) the operation has the associative property ((a * b) * c = a * (b * o) )." All the nonsingular matrices of order n form a (noncommutative) group with respect to the operation of multiplication. ana A diagonal matrix is a special case both of an upper triangular matrix and a lower triangular matrix. A square matrix A = 11 ate') i is called upper triangular (lower triangular) if all the elements below (above) the main diagonal are zero: ail 0 ...
I kii IIi differs from its transpose by a factor 1 (KT=K). 3. then AT is of dimension n X m. < ip s 11kl<ks<. (aA )T = aAT. Minors of the Inverse Matrix 1.2. In 4. see [357]. n).II ail.. . COMPOUND MATRICES..m.k=1.. is defined as AT=11aTt11.. 15 In formulas 1. then the transpose AT (i=1.where aTN =aik If A is of dimension m X n. 2.. Note that the product of two symmetric matrices is not. A and B are arbitrary rectangular matrices for which the corresponding operations are feasible. .§ 4. it follows that the product of two permutable skewsymmetric matrices is a symmetric matrix. where N = (%) is the number of combina tions of n objects taken p at a time. From 3... symmetric. MINORS 19 If A . for exampleall the N combinations of p indices selected from among the indices 1. In a skewsymmetric matrix any two elements that are symmetrical to the main diagonal differ from each other by a factor 1 and the diagonal elements are zero.K:).) or two skewsymmetric matrices (A = K. . 2.. 16 As regards the representation of a square matrix A in the form of a product of two symmetric matrices (A = 8. .... In order to arrange the minors (31) in a square array. In a symmetric matrix elements that are symmetrically placed with respect to the main diagonal are equal.S. this holds if and only if the two given symmetric matrices are permutable.. k =1. 3.. (A + B)T = AT + BT. 4.<kpSn/. m. we enumerate in some definite orderlexicographic order. A is an arbitrary square nonsingular matrix. Compound Matrices. Let A = II ask IIi be a given matrix. 2.2.. If a square matrix S= 11 sib 11i coincides with its transpose (ST = S). (AB)T = BTAT. kp) iP ( it < $2 < .. We consider all possible minors of A of order p (1 < p < n) : t1 i2 .. then it is called skewsymmetric. 2.. It is easy to verify the following properties :15 1.16 § 4.. (i=1. If a square matrix K =.. n. .. . then it is called symmetric.. in general.. ..n).. A\k1 l ks . (A 1)T = (AT)1.. By 3. (31) The number of these minors is N2.. 2.
The order of enumeration of the combination of indices is fixed once and for all and does not depend on the choice of A. Example. au au a4$ a44 We enumerate all combinations of the indices 1. then the minors (31) will also be denoted as follows : a _ A (kl zl . 2. Let all an a.20 1.. . Then A (1 2) A (1 3) A (1 4) A (2 3) A (12) A(34) A (1 2) A (1 3) A (1 4) A (2 3) A (2 4) A (3 4) A 9fs = (11 (11 2) A 3) A (1 4) A (2 3) A (2.q I IN is called the pth compound matrix of A= II aik lift. ip ks . < i2 < .. Here VI. < kp have the numbers a and 0. 2. 3.) A (3 4) (2 A 1 2) A ( 1 3 2 3 23 23 A ( 1 4) A (2 3 ) A ( 2 4 (21 A (21 (21 24 4) 2) A (3 3) A (3 1 A (2 3) A (3 1 3 4 4) 2) A 3) A 1 A (2 3) (34 ) (2 (24 A 2 4) A 34) 34 (23 A 4) A (34) A 23 .. 4 taken two at a time by arranging them in the following order: (12) (13) (14) (23) (24) (34).. kp i$ By giving to a and j independently all the values from 1 to N.. < iD and k.. p can take the values 1. The square matrix of order N 11 a. Note.. MATRICES AND MATRIX OPERATIONS If the combinations of indices i. = A. ... < k2 < .. we obtain all the minors of A = 11 a{k 11 ft of order p. n. and ?I consists of the single element I A I.3 ai4 A= as. . ass am au an an an a.. .
= SIP $p(p =1. 2. ..<1.. kP/ (32) (1S it <i2 n.. there follows an important formula that expresses the minors of the inverse matrix in terms of the minors of the given matrix : If B = A'.. and A are the numbers of the combinations of indices il<i2<. n). 2...1 (a. From C = AB it follows that ( E . Obviously. k1<k2<. This result follows immediately from the preceding one when we set C = E and bear in mind that 9. equation (32) can be written as follows : N C.... N) (here a. <.AbAp A. =1.. is the unit matrix of order N = (n) P From 2.. by formula (19).. 2. ... 11<IL.2..1 (p = 1. knP) (33) k2 . .. . <kP.... < P P tr + C 4 w.. QL k1 k2 is . Hence (EP=%PzP (p=1...<ip. MINORS 21 We mention some properties of compound matrices : 1.n. n).).. <k. < < iA_........ form a complete system of indices 1.). COMPOUND MATRICES.. From B = A' it follows that Q3P= ¶. lP k1<k2<. .... in terms of the minors of the same order of the factors.. then we have : C (i1 i2 '''p _ kP A 1 z i.. 2. kr <kp Sn) k2 (k'1 B it (ki where it < i2 < 12 . then for arbitrary (i it < i. ..n) ..§ 4. iP) B (ll `\kl 12 .. < iP and ii < i.n G 2.asdo For it follows from AB = E that QIPZP = (EP and ki<k2<. kP) I A 2.. . 2.n).# = Y a.. <. in the notation of this section. <(PS.<kp k2 . <1. For when we express the minors of order p (1 < p :5n) of the matrix product C. .
. equation (33) must hold. 1 P k. ip PP 2i 2z .{ o (yP) P 1 (Y=#)... . _V k..)2 = 0.. vml P (35) 0. ip p p p nA i i 2 . n. l k2 19tl<ta<. A 1).abgebye. k') but rather 2 p 2 1.. < ip k1 < k2 < . On the other hand. ip 1.... ..k..n) Since the elements b fi of the inverse matrix of W. . . (34) Equations (34) can also be written as follows : r 21 Z2 .. < i.... inP L' (j. < kp S n1 .....p `4 (1 2.1 .# not B (k.22 I. .. are uniquely determined by (34)... < k.)2>0.. Com parison of (35) with (34') and (34) shows that the equations (34) are satisfied if we take together with b.Ptr +.'.k... 2.2 k v_4 (k'11 L2. we obtain A 2 jp . 1 2 .. when we apply the wellknown Laplace expansion to the determinant I A 1. as do k1 < k2 < if v1 (j..)2>0 (34') \1 S ?l < j2 C .. n . k } 1 i1 i2 . MATRICES AND MATRIX OPERATIONS or in more explicit form : N La'. . 0. (ii 22 .. if k.)2 . it il.0. llve1 v1 tii{s P P P si . where i1 < is < < ip and i'1 < i2 < indices 1.v + .. if . ti} Btk1 k^2 .. form a complete system of < kp and ki < k2 < .. P 1.? .
. Vol. n). y2. x2. and y = (y1. + a.. the task of computing the elements of the inverse matrix is equivalent to the task of solving the system of equations (1) for arbitrary righthand sides y. Here x = (x.. .CHAPTER II THE ALGORITHM OF GAUSS AND SOME OF ITS APPLICATIONS § 1. y... the actual computation of the elements of A' by these formulas is very tedious for large n.."'0.... . Gauss's Elimination Method 1...Ell yk (i =1.... y.. then we can rewrite this as x=A'y.. + a.. x2i . X. Let altxl + a12x2 + .... = yn be a system of n linear equations in n unknowns x. .' I For a detailed account of these methods. Nauk..... n (2) (2') or in explicit form : x. .. In matrix form this system may be written as Ax=y.. . (1') are columns and . y2. A =11 an 1j' If A is nonsingular. effective methods of computing the elements of an A1 = 11 a''' 11i inverse matrixand hence of solving a system of linear equationsare of great practical value. 23 .. with righthand sides y. 3 (1950).. Y 2 . .. + alnxn = yl a21x1 + as2x2 + .) is the square coefficient matrix. y. Thus.. .. = y2 (1) an1x1 + an2x2 + . 2. we refer the reader to the book by Faddeev [151 and the group of papers that appeared in Uspehi Mat.... .. k1 a. 5. The elements of the inverse matrix are determined by the formulas (25) of Chapter I. Therefore.. However.
..i. they are variants of Gauss's elimination method. + ann)xn = yAl) The coefficients of the unknowns and the constant terms of the last equations are given by the formulas n1 (3) aif = ail . The system (1) has now been replaced by. xx yy .. +a x.. 2. = yl y2 (2) (6) (n1) an. + a(2) = yn xx The new coefficients and the new righthand sides are connected with the preceding ones by the formulas : ail .. THE ALGORITHM OF GAUSS AND SOME APPLICATIONS In the present chapter we expound the theoretical basis of some of these methods.. Suppose that aW 0.. whose acquaintance the reader first made in his algebra course at school... and so on.= y$2) (4) (2) (2) a.. + alxxn = yl a22x2+a2sxa+ . (5) Continuing the algorithm. Then we eliminate x2 in the same way from the last n .. + (2) a2n)x.2 equations of the system (3) and obtain the system allxl + alsx2 + a.24 II. from all the equations beginning with the second by adding to the second equation the first multipled by .. to the third the first all multiplied by ... 9 = 2. (1) ail Yi ) = yt 1)  aail yl (s. + = y2) (3) an2x2 + . Suppose that in the system of equations (1) we have all r 0. = yyl) (2) a1nx. ..ai> y2 22 (2) (1) at' (1) n). we go in n 1 steps from the original system (1) to the triangular recurrent system allxl + a12x2+ a + .a31. +a2. a1 the equivalent system allxl + a12x2 + . E2 (E) (1) ailt (1) y11 = X . + a22x2 + a4s x8 +..x..3x8 + .a(1) a2. .all...a{i .!!'. n) .2x8 + . + a22 x2 + yl a(i1)T.("l) ..=y21) aS8x8 . We eliminate x.
a22 0. ... ..2.. +. . .. .§ 1. are equal : A 1 2 .0 (p) . . .. anl. a3s1. . ... a(pv 1)xp + . . (n2) . .lx. .. . ... p+lxp+1 + aripP. h=1. .. . . . ..k1<k2< ... . a(l) 0 (pSn1). . h\ /1 .. This algorithm of Gauss consists of operations of a simple type such as can easily be carried out by presentday computing machines. <kAS 1( k1 k2 k 2 . = ytl+t ann>xn= . 0 p.n1 turn out to be different from zero. .. . p+1 a0') p+1. (p) _ . a22).... a1.p+1 . . ann The transition from A to G._1 turn out to be different from zero. a33. . (1) . . . (1) a22 x2 . .. . (10) . . . . k. . .. ann . . +1 (p1) apn (9) 0 0 0 0 . a22. p+1 a(p) n. Let us express the coefficients and the righthand sides of the reduced system in terms of the coefficients and the righthand sides of the original system (1). n . . . (7) This enables us (at the pth step of the reduction) to put the original system of equations into the form allxl + a12x2 + . . aln (1) a2. h kn/ _ p (kll G2 .. n) . . 0 0 a22 (1) . . . . .. . GAUSS'S ELIMINATION METHOD 25 This reduction can be carried out if and only if in the process all the numbers all. . . is effected as follows : To every row of A in succession from the second to the nth there are added some preceding rows (from the first p) multiplied by certain factors. . in which the first p of these numbers are different from zero. 4 awl..nxn . .... we consider the general case. a2p 2p app pp a(p1) . We shall not assume here that in the reduction process all the numbers all. . Therefore all the minors of order h contained in the first h rows of A and G.. .. . . + atn 1)xn = ptn1) p rp) + aP+l. al.+1 . . . 3. + alnxn (1) = yl (8) . . + ya ) : We denote the coefficient matrix of this system of equations by Gp all a12 . . ." a11r0. .. .. + a2n xn = y2 ap+1. .
. (2) __ A a33 .26 II.. .. xp consecutively by Gauss's algorithm it is necessary that all the values (16) should be different from zero. THE ALGORITHM OF GAUSS AND SOME APPLICATIONS From these formulas we find. that the inequalities (15) should hold. we obtain the fundamental formulas' arx) 1122 `4(1 2 .e.. i. k=p + 1... 12 . 02. then they also hold for every smaller value of p... (11) 2...pi (P1) (p) ( . the formulas for ate) make sense if only the last of the conditions (15).. aPP . . Hence instead of this formula we can write the equations A l 1) =a. a (1 2)  1 2. . (14) Thus. .p p (1) y alia22 (1) (p1) . p) (1 2.P i _ A 1 2 . . . i. P1)* 1 A (1 2 . by taking into account the structure (9) of Gp. holds. However. n)... A (1 1 1 2)aiias.. .. n). A 1 2. . A (1 . .e. can be written in the form of the following inequalities: 0. p k) .. . ( 12) When we divide the second of these equations by the first. The same holds true of (11).. p. (13) If the conditions (7) hold for a given value of p. i k = p + l .alla22 .. the conditions (7).... Therefore the formulas (13) are valid not only for the given value of p but also for all smaller values of p.. ... 89.ppk p . (i. A (1 2 3)=ai1a22ass. A (1) (1 22)0.....p' (16) In order to eliminate x1.. .... ai ark 12. .. 2 See [181].p 1 2. p 0 (15) From (14) we then find: a11=A (1). ( 1 2 3) 12 1 23 . 12. . p1 ) A \1 2 ..... the necessary and sufficient conditions for the feasibility of the first p steps in Gauss's algorithm.
r equations (18) reduce to the consistency conditions yir) =0 (20) Note that in the elimination algorithm the column of constant terms is subjected to the same transformations as the other columns.. . . + alnxn a(1)xn = =ys) y:'1) yr+l (r) = yn (r) a(....p y n+1 l i (i=1.p l.. k=r+ 1... From these formulas it follows. . .§ 1. (i = r + 1.r).2.at+l.. . (19) Therefore the last n .nxn (r () aar+lxr+1 . . . Therefore. .ri1)x (18) ar+I. ... the consistency conditions (20) reduce to the wellknown equations A l. that aik =0 (i.. GAUSS'S ELIMINATION METHOD 27 4. (17) I1 2_7 This enables us to eliminate x1f x2j .. . Then... . by a suitable permutation of the equations and a renumbering of the unknowns.. Suppose the coefficient matrix of the system of equations (1) to be of rank r. 2. . .... n) . ... 1) 0 (j =1..+ .. p=1.. because the rank of the matrix A =11 as ft' is equal to r. r).n.+' ...r 1 . we can arrange that the following inequalities hold : A (1 2 . of coefficients...+.... . (21) In particular. x.....+lxr+1 + .. ... consecutively and to obtain the system of equations ai1x1 + al2x2 + . n). . a22 x2 . .. . annxn (r) Here the coefficients are determined by the formulas (13). by supplementing the matrix A aik 111 with an (n + 1) th column of the constant terms we obtain : (P)=A(l..2. + (*) . f.. ... r n+1 r+' )=o (22) .. .
(n) of S under the action of forces F1. a rod.') O (9 1.. 2 We assume that the forces and the displacements are parallel to one and the same direction and are determined. . Under the combined action of two systems of forces the corresponding displacements are added together. x2. (n) on it. a string. We shall consider the displacements (sags) y1. (ii is nonsingular. . then all the displacements are multiplied by the same number. a membrane. i. § 2.. . a multispan rod. We consider an arbitrary elastic statical system 8 supported on edges (for example.. n) .. therefore.a_. or a discrete system) and choose n points (1). by their algebraic magnitudes (Fig. Fig.. y2.e.. in succession by means of Gauss's algorithm and reduce the system of equations to the form (6). (2). I ajc Fig. Mechanical Interpretation of Gauss's Algorithm 1. . . (2). . then we can eliminate x1. . y of the points (1). 2. THE ALGORITHM OF GAUSS AND SOME APPLICATIONS If n = r. and A (1 2 . applied at these points. we assume the principle of linear superposition of forces: 1. a lamina... . x.. F. Moreover. .. . When the magnitudes of all the forces are multiplied by one and the same real number. .. .28 II. .. 1). if the matrix A = 11 a.. F2. 2. . .
We denote by $a_{ik}$ the coefficient of influence of the point $(k)$ on the point $(i)$, i.e., the displacement of $(i)$ under the action of a unit force applied at $(k)$ $(i,k=1,2,\dots,n)$. Then under the combined action of the forces $F_1,F_2,\dots,F_n$ the displacements $y_1,y_2,\dots,y_n$ are determined by the formulas

$$y_i=\sum_{k=1}^{n}a_{ik}F_k\qquad(i=1,2,\dots,n).\tag{23}$$

Comparing (23) with the original system (1), we can interpret the task of solving the system of equations (1) as follows: the displacements $y_1,y_2,\dots,y_n$ being given, we are required to find the corresponding forces $F_1,F_2,\dots,F_n$.

We denote by $S_p$ the statical system that is obtained from $S$ by introducing $p$ fixed hinged supports at the points $(1),(2),\dots,(p)$ $(p<n)$. We denote the coefficients of influence for the remaining movable points $(p+1),\dots,(n)$ of the system $S_p$ by $a_{ik}^{(p)}$ $(i,k=p+1,\dots,n)$ (see Fig. 3 for $p=1$).

The coefficient $a_{ik}^{(p)}$ can be regarded as the displacement at the point $(i)$ of $S$ under the action of a unit force at $(k)$ and of the reactions $R_1,R_2,\dots,R_p$ at the fixed points $(1),(2),\dots,(p)$. Therefore

$$a_{ik}^{(p)}=R_1a_{i1}+\dots+R_pa_{ip}+a_{ik}\qquad(i,k=p+1,\dots,n).\tag{24}$$

On the other hand, under the same forces the displacements of the system $S$ at the points $(1),(2),\dots,(p)$ are zero:

$$\begin{aligned}
R_1a_{11}+\dots+R_pa_{1p}+a_{1k}&=0,\\
&\ \ \vdots\\
R_1a_{p1}+\dots+R_pa_{pp}+a_{pk}&=0.
\end{aligned}\tag{25}$$
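The following sketch (ours; the symmetric random matrix stands in for an influence matrix) solves (25) for the reactions, substitutes them in (24), and confirms that the resulting coefficients agree with the block obtained after $p$ steps of Gauss's algorithm, as formula (26) below asserts:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 5, 2
A = rng.standard_normal((n, n))
A = A + A.T + 10 * np.eye(n)          # a symmetric 'influence matrix' a_ik

a_sup = np.empty((n - p, n - p))      # coefficients of influence a_ik^(p) of S_p
for k in range(p, n):
    R = np.linalg.solve(A[:p, :p], -A[:p, k])          # reactions R_1..R_p from (25)
    for i in range(p, n):
        a_sup[i - p, k - p] = A[i, :p] @ R + A[i, k]   # formula (24)

# the same numbers arise after p steps of Gauss's algorithm (formula (26))
schur = A[p:, p:] - A[p:, :p] @ np.linalg.solve(A[:p, :p], A[:p, p:])
print(np.allclose(a_sup, schur))
```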
If we determine $R_1,R_2,\dots,R_p$ from (25) and substitute the expressions so obtained in (24), we obtain formulas that express the coefficients of influence of the `support' system $S_p$ in terms of those of the original system $S$. This elimination of $R_1,R_2,\dots,R_p$ can be carried out as follows. To the system of equations (25) we adjoin (24), written in the form

$$R_1a_{i1}+\dots+R_pa_{ip}+a_{ik}-a_{ik}^{(p)}=0.\tag{24'}$$

Regarding (25) and (24') as a system of $p+1$ homogeneous equations with the non-zero solutions $R_1,R_2,\dots,R_p,\ R_{p+1}=1$, we see that the determinant of the system must be zero:

$$\begin{vmatrix}
a_{11}&\dots&a_{1p}&a_{1k}\\
\vdots& &\vdots&\vdots\\
a_{p1}&\dots&a_{pp}&a_{pk}\\
a_{i1}&\dots&a_{ip}&a_{ik}-a_{ik}^{(p)}
\end{vmatrix}=0.$$

Hence

$$a_{ik}^{(p)}=\frac{A\binom{1\,2\,\dots\,p\,i}{1\,2\,\dots\,p\,k}}{A\binom{1\,2\,\dots\,p}{1\,2\,\dots\,p}}\qquad(i,k=p+1,\dots,n).\tag{26}$$

But formulas (26) coincide with formulas (13) of the preceding section. Therefore for every $p\ (\le n-1)$ the coefficients $a_{ik}^{(p)}$ $(i,k=p+1,\dots,n)$ in the algorithm of Gauss are the coefficients of influence of the support system $S_p$.

The truth of this fundamental proposition can also be ascertained by purely mechanical considerations, without recourse to the algebraic derivation of formulas (13). For this purpose we consider, to begin with, the special case of a single support: $p=1$ (Fig. 3). In this case the coefficients of influence of the system $S_1$ are given by the formulas (we put $p=1$ in (26)):

$$a_{ik}^{(1)}=a_{ik}-\frac{a_{i1}a_{1k}}{a_{11}}\qquad(i,k=2,3,\dots,n).$$

These coincide with the formulas of the first step of Gauss's algorithm; i.e., the coefficients $a_{ik}^{(1)}$ $(i,k=2,\dots,n)$ in Gauss's algorithm are the coefficients of influence of the support system $S_1$. Applying the same reasoning to the system $S_1$ and introducing a second support at the point $(2)$ in this system, we see that the coefficients $a_{ik}^{(2)}$ $(i,k=3,\dots,n)$ in the system of equations (4) are the coefficients of influence of the support system $S_2$ and, in general, for every $p\ (\le n-1)$ the coefficients $a_{ik}^{(p)}$ $(i,k=p+1,\dots,n)$ in Gauss's algorithm are the coefficients of influence of the support system $S_p$. From mechanical considerations it is clear that the successive introduction of $p$ supports is equivalent to the simultaneous introduction of these supports.

Note. We wish to point out that in the mechanical interpretation of the elimination algorithm it was not necessary to assume that the points at which the displacements are investigated coincide with the points at which the forces $F_1,F_2,\dots,F_n$ are applied. We can assume that $y_1,y_2,\dots,y_n$ are the displacements of the points $(1),(2),\dots,(n)$ and that the forces $F_1,F_2,\dots,F_n$ are applied at the points $(1'),(2'),\dots,(n')$. Then $a_{ik}$ is the coefficient of influence of the point $(k')$ on the point $(i)$. In that case we must consider, instead of the support at the point $(j)$, a generalized support at the points $(j)$, $(j')$, under which the displacement at the point $(j)$ is maintained all the time equal to zero at the expense of a suitably chosen auxiliary force $R_j$ at the point $(j')$. The conditions that allow us to introduce $p$ generalized supports at the points $(1),(2),\dots,(p)$; $(1'),(2'),\dots,(p')$, i.e. that allow us to satisfy the conditions $y_1=0,\ y_2=0,\ \dots,\ y_p=0$ for arbitrary $F_{p+1},\dots,F_n$ at the expense of suitable $R_1=F_1,\dots,R_p=F_p$, can be expressed by the inequalities (15).

§ 3. Sylvester's Determinant Identity

1. In § 1, a comparison of the matrices $A$ and $G_p$ led to equations (10) and (11). These equations enable us to give an easy proof of the important determinant identity of Sylvester. For from (10) and (11) we find:

$$A\binom{1\,2\,\dots\,n}{1\,2\,\dots\,n}=A\binom{1\,2\,\dots\,p}{1\,2\,\dots\,p}\begin{vmatrix}a_{p+1,p+1}^{(p)}&\dots&a_{p+1,n}^{(p)}\\\vdots& &\vdots\\a_{n,p+1}^{(p)}&\dots&a_{nn}^{(p)}\end{vmatrix}.\tag{27}$$
2. We introduce borderings of the minor $A\binom{1\,2\,\dots\,p}{1\,2\,\dots\,p}$, the determinants

$$b_{jk}=A\binom{1\,2\,\dots\,p\,j}{1\,2\,\dots\,p\,k}\qquad(j,k=p+1,\dots,n).$$

The matrix formed from these determinants will be denoted by $B=\|b_{jk}\|_{p+1}^{n}$. By formulas (13),

$$b_{jk}=A\binom{1\,2\,\dots\,p}{1\,2\,\dots\,p}\,a_{jk}^{(p)}\qquad(j,k=p+1,\dots,n).$$

Therefore equation (27) can be rewritten as follows:

$$|B|=\left[A\binom{1\,2\,\dots\,p}{1\,2\,\dots\,p}\right]^{n-p-1}A\binom{1\,2\,\dots\,n}{1\,2\,\dots\,n}.\tag{28}$$

This is Sylvester's determinant identity. It expresses the determinant $|B|$ formed from the bordered determinants in terms of the original determinant and the bordered minor.

We have established equation (28) for a matrix $A=\|a_{ik}\|_1^n$ whose elements satisfy the inequalities

$$A\binom{1}{1}\neq0,\quad A\binom{1\,2}{1\,2}\neq0,\quad\dots,\quad A\binom{1\,2\,\dots\,p}{1\,2\,\dots\,p}\neq0.\tag{29}$$

However, we can show by a `continuity argument' that this restriction may be removed and that Sylvester's identity holds for an arbitrary matrix $A=\|a_{ik}\|_1^n$. For suppose that the inequalities (29) do not hold. We introduce the matrix $A_\varepsilon=A+\varepsilon E$. Obviously $\lim_{\varepsilon\to0}A_\varepsilon=A$. The minors

$$A_\varepsilon\binom{1}{1},\quad A_\varepsilon\binom{1\,2}{1\,2},\quad\dots,\quad A_\varepsilon\binom{1\,2\,\dots\,p}{1\,2\,\dots\,p}$$

are $p$ polynomials in $\varepsilon$ that do not vanish identically. Therefore we can choose a sequence $\varepsilon_m\to0$ such that

$$A_{\varepsilon_m}\binom{1}{1}\neq0,\quad\dots,\quad A_{\varepsilon_m}\binom{1\,2\,\dots\,p}{1\,2\,\dots\,p}\neq0\qquad(m=1,2,\dots).$$

We can write down the identity (28) for the matrices $A_{\varepsilon_m}$ $(m=1,2,\dots)$. Taking the limit $m\to\infty$ on both sides of this identity, we obtain Sylvester's identity for the limit matrix³ $A=\lim_{m\to\infty}A_{\varepsilon_m}$.

If we apply the identity (28) to the determinant

$$A\binom{1\,2\,\dots\,p\ i_1\,i_2\,\dots\,i_q}{1\,2\,\dots\,p\ k_1\,k_2\,\dots\,k_q}\qquad(p<i_1<i_2<\dots<i_q\le n,\ \ p<k_1<k_2<\dots<k_q\le n),$$

then we obtain a form of Sylvester's identity particularly convenient for applications:

$$B\binom{i_1\,i_2\,\dots\,i_q}{k_1\,k_2\,\dots\,k_q}=\left[A\binom{1\,2\,\dots\,p}{1\,2\,\dots\,p}\right]^{q-1}A\binom{1\,2\,\dots\,p\ i_1\,\dots\,i_q}{1\,2\,\dots\,p\ k_1\,\dots\,k_q}.\tag{30}$$

³ By the limit (for $p\to\infty$) of a sequence of matrices $X_p=\|x_{ik}^{(p)}\|_1^n$ we mean the matrix $X=\|x_{ik}\|_1^n$, where $x_{ik}=\lim_{p\to\infty}x_{ik}^{(p)}$ $(i,k=1,2,\dots,n)$.
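A quick numerical check of Sylvester's identity (28); this is an illustration of ours, not the book's proof, with a random test matrix:

```python
import numpy as np

def minor(A, rows, cols):
    return np.linalg.det(A[np.ix_(list(rows), list(cols))])

rng = np.random.default_rng(3)
n, p = 5, 2
A = rng.standard_normal((n, n))
# bordered determinants b_jk for j, k = p..n-1 (0-based)
B = np.array([[minor(A, list(range(p)) + [j], list(range(p)) + [k])
               for k in range(p, n)] for j in range(p, n)])
lhs = np.linalg.det(B)
rhs = minor(A, range(p), range(p)) ** (n - p - 1) * np.linalg.det(A)
print(np.isclose(lhs, rhs))          # identity (28)
```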
§ 4. The Decomposition of a Square Matrix into Triangular Factors

1. Let $A=\|a_{ik}\|_1^n$ be a given matrix of rank $r$. We introduce the following notation for the successive principal minors of $A$:

$$D_k=A\binom{1\,2\,\dots\,k}{1\,2\,\dots\,k}\qquad(k=1,2,\dots,n).$$

Let us assume that the conditions for the feasibility of Gauss's algorithm are satisfied:

$$D_k\neq0\qquad(k=1,2,\dots,r).$$

We denote by $G$ the coefficient matrix of the system of equations (18) to which the system

$$\sum_{k=1}^{n}a_{ik}x_k=y_i\qquad(i=1,2,\dots,n)$$

has been reduced by the elimination method of Gauss. The matrix $G$ is of upper triangular form; the elements of its first $r$ rows are determined by the formulas (13), while the elements of the last $n-r$ rows are all equal to zero:⁴

$$G=\begin{pmatrix}
a_{11}&a_{12}&\dots&a_{1r}&\dots&a_{1n}\\
0&a_{22}^{(1)}&\dots&a_{2r}^{(1)}&\dots&a_{2n}^{(1)}\\
\vdots&\vdots&\ddots&\vdots& &\vdots\\
0&0&\dots&a_{rr}^{(r-1)}&\dots&a_{rn}^{(r-1)}\\
0&0&\dots&0&\dots&0\\
\vdots&\vdots& &\vdots& &\vdots\\
0&0&\dots&0&\dots&0
\end{pmatrix}.$$

The transition from $A$ to $G$ is effected by a certain number $N$ of operations of the following type: to the $i$-th row of the matrix we add the $j$-th row $(j<i)$, after a preliminary multiplication by some number $c$. Such an operation is equivalent to the multiplication on the left of the matrix to be transformed by the matrix

$$\begin{pmatrix}
1& & & & \\
 &\ddots& & & \\
 &c&1& & \\
 & & &\ddots& \\
 & & & &1
\end{pmatrix}\qquad\text{(the number $c$ in row $i$, column $j$).}\tag{31}$$

In this matrix the main diagonal consists entirely of units, and all the remaining elements, except $c$, are zero. Thus

$$G=W_N\cdots W_2W_1A,$$

where each of the matrices $W_1,W_2,\dots,W_N$ is of the form (31) and is therefore a lower triangular matrix with diagonal elements equal to 1.

⁴ See formulas (19) (p. 25) for $p=r$.
Let

$$W=W_N\cdots W_2W_1.\tag{32}$$

Then

$$G=WA.\tag{33}$$

From (32) it follows that $W$ is lower triangular with diagonal elements equal to 1. We shall call $W$ the transforming matrix for $A$ in Gauss's elimination method. Both matrices $G$ and $W$ are uniquely determined by $A$. Since $W$ is nonsingular, we obtain from (33):

$$A=W^{-1}G.\tag{33'}$$

We have thus represented $A$ in the form of a product of a lower triangular matrix $W^{-1}$ and an upper triangular matrix $G$. The problem of decomposing a matrix $A$ into factors of this type is completely answered by the following theorem:

THEOREM 1: Every matrix $A=\|a_{ik}\|_1^n$ of rank $r$ in which the first $r$ successive principal minors are different from zero,

$$D_k\neq0\qquad(k=1,2,\dots,r),\tag{34}$$

can be represented in the form of a product of a lower triangular matrix $B$ and an upper triangular matrix $C$:

$$A=BC=\begin{pmatrix}b_{11}&0&\dots&0\\b_{21}&b_{22}&\dots&0\\\vdots&\vdots&\ddots&\vdots\\b_{n1}&b_{n2}&\dots&b_{nn}\end{pmatrix}\begin{pmatrix}c_{11}&c_{12}&\dots&c_{1n}\\0&c_{22}&\dots&c_{2n}\\\vdots&\vdots&\ddots&\vdots\\0&0&\dots&c_{nn}\end{pmatrix}.\tag{35}$$

Here

$$b_{11}c_{11}=D_1,\quad b_{22}c_{22}=\frac{D_2}{D_1},\quad\dots,\quad b_{rr}c_{rr}=\frac{D_r}{D_{r-1}}.\tag{36}$$

The values of the first $r$ diagonal elements of $B$ and $C$ can be chosen arbitrarily subject to the conditions (36). When the first $r$ diagonal elements of $B$ and $C$ are given, then the elements of the first $r$ columns of $B$ and of the first $r$ rows of $C$ are uniquely determined, and are given by the following formulas:

$$b_{gk}=\frac{A\binom{1\,2\,\dots\,k-1\,g}{1\,2\,\dots\,k-1\,k}}{A\binom{1\,2\,\dots\,k-1\,k}{1\,2\,\dots\,k-1\,k}}\,b_{kk},\qquad
c_{kg}=\frac{A\binom{1\,2\,\dots\,k-1\,k}{1\,2\,\dots\,k-1\,g}}{A\binom{1\,2\,\dots\,k-1\,k}{1\,2\,\dots\,k-1\,k}}\,c_{kk}\qquad(g=k,k+1,\dots,n;\ k=1,2,\dots,r).\tag{37}$$

If $r<n$ ($|A|=0$), then all the elements in the last $n-r$ rows of $B$ can be put equal to zero and all the elements of the last $n-r$ columns of $C$ can be chosen arbitrarily; or, conversely, the last $n-r$ rows of $C$ can be filled with zeros and the last $n-r$ columns of $B$ can be chosen arbitrarily.
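Before the proof, a small computational sketch (ours; exact arithmetic and $D_k\neq0$ are assumed, and the function name is invented) of how the transforming matrix $W$ of (32)-(33) can be accumulated while $A$ is reduced to its Gaussian form $G$:

```python
import numpy as np

def gauss_transform(A):
    """Reduce A to Gaussian form G by operations of type (31), accumulating W."""
    n = len(A)
    G = A.astype(float).copy()
    W = np.eye(n)
    for j in range(n - 1):
        for i in range(j + 1, n):
            c = -G[i, j] / G[j, j]     # 'add c times row j to row i'
            G[i] += c * G[j]           # one operation of type (31) on A
            W[i] += c * W[j]           # the same operation applied to E
    return G, W

A = np.array([[2., 1., 1.], [4., 1., 0.], [-2., 2., 1.]])
G, W = gauss_transform(A)
print(np.allclose(W @ A, G))           # G = W A, hence A = W^{-1} G
```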
Proof. That a representation of a matrix satisfying the conditions (34) can be given in the form of a product (35) has been proved above (see (33')). Without violating equation (35), we may multiply the matrix $B$ in that equation on the right by an arbitrary nonsingular diagonal matrix $M=\|\mu_i\delta_{ik}\|_1^n$, while multiplying $C$ at the same time on the left by $M^{-1}=\|\mu_i^{-1}\delta_{ik}\|_1^n$. But this is equivalent to multiplying the columns of $B$ by $\mu_1,\mu_2,\dots,\mu_n$, respectively, and the rows of $C$ by $\mu_1^{-1},\mu_2^{-1},\dots,\mu_n^{-1}$. We may therefore, by the introduction of suitable factors $\mu_1,\mu_2,\dots,\mu_n$, give arbitrary values to the diagonal elements $b_{11},b_{22},\dots,b_{rr}$ and $c_{11},c_{22},\dots,c_{rr}$, provided they satisfy (36).

Now let $B$ and $C$ be arbitrary lower and upper triangular matrices whose product is $A$. Making use of the formulas for the minors of the product of two matrices, we find:

$$A\binom{1\,2\,\dots\,k-1\,g}{1\,2\,\dots\,k-1\,k}=\sum_{\alpha_1<\alpha_2<\dots<\alpha_k}B\binom{1\,2\,\dots\,k-1\,g}{\alpha_1\,\alpha_2\,\dots\,\alpha_k}\,C\binom{\alpha_1\,\alpha_2\,\dots\,\alpha_k}{1\,2\,\dots\,k-1\,k}\qquad(g=k,k+1,\dots,n).\tag{38}$$

Since $C$ is an upper triangular matrix, the first $k$ columns of $C$ contain only one nonvanishing minor of order $k$, namely $C\binom{1\,2\,\dots\,k}{1\,2\,\dots\,k}$. Therefore equation (38) can be written as follows:

$$A\binom{1\,2\,\dots\,k-1\,g}{1\,2\,\dots\,k-1\,k}=b_{11}b_{22}\cdots b_{k-1,k-1}b_{gk}\,c_{11}c_{22}\cdots c_{kk}\qquad(g=k,k+1,\dots,n;\ k=1,2,\dots,r).\tag{39}$$

We put $g=k$ in this equation, obtaining

$$D_k=b_{11}b_{22}\cdots b_{kk}\,c_{11}c_{22}\cdots c_{kk}\qquad(k=1,2,\dots,r),\tag{40}$$

and relations (36) follow. Further, from (39) and (40) we find the first formulas in (37). The second formulas in (37), for the elements of $C$, are established similarly.
If $r<n$, then, as we have shown already, the elements of the last $n-r$ rows of $C$ may be chosen to be zero. We observe that in the multiplication of $B$ and $C$ the elements $b_{ik}$ of the last $n-r$ columns of $B$ are multiplied only by the elements $c_{ik}$ of the last $n-r$ rows of $C$.⁵ Therefore the product of $B$ and $C$ does not change if we choose the elements of the last $n-r$ columns of $B$ arbitrarily; or, conversely, if we choose the last $n-r$ columns of $B$ to be zeros and the elements of the last $n-r$ rows of $C$ arbitrarily. This completes the proof of the theorem.

From this theorem there follow a number of interesting corollaries.

COROLLARY 1: The elements of the first $r$ columns of $B$ and the first $r$ rows of $C$ are connected with the elements of $A$ by the recurrence relations

$$b_{ik}=\frac{a_{ik}-\sum_{t=1}^{k-1}b_{it}c_{tk}}{c_{kk}}\quad(i\ge k),\qquad
c_{ki}=\frac{a_{ki}-\sum_{t=1}^{k-1}b_{kt}c_{ti}}{b_{kk}}\quad(i\ge k;\ k=1,2,\dots,r).\tag{41}$$

The relations (41) follow immediately from the matrix equation (35); they can be used to advantage in the actual computation of the elements of $B$ and $C$.

COROLLARY 2: If $A=\|a_{ik}\|_1^n$ is a nonsingular matrix $(r=n)$ satisfying (34), then the matrices $B$ and $C$ in the representation (35) are uniquely determined as soon as the diagonal elements of these matrices are chosen in accordance with (36).

⁵ This follows from the representation (33').
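The recurrences (41) in code; a sketch of ours for the nonsingular case $r=n$, with the diagonal of $C$ chosen as $c_{kk}=1$ (the so-called Doolittle choice, consistent with (36)):

```python
import numpy as np

def lu_by_recurrence(A):
    """Compute A = B C from the recurrence relations (41), with c_kk = 1."""
    n = len(A)
    B = np.zeros((n, n))
    C = np.eye(n)
    for k in range(n):
        for i in range(k, n):                        # first formula of (41)
            B[i, k] = (A[i, k] - B[i, :k] @ C[:k, k]) / C[k, k]
        for i in range(k + 1, n):                    # second formula of (41)
            C[k, i] = (A[k, i] - B[k, :k] @ C[:k, i]) / B[k, k]
    return B, C

A = np.array([[2., 1., 1.], [4., 1., 0.], [-2., 2., 1.]])
B, C = lu_by_recurrence(A)
print(np.allclose(B @ C, A))
```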
COROLLARY 3: If $S=\|s_{ik}\|_1^n$ is a symmetric matrix of rank $r$ and $D_k\neq0$ $(k=1,2,\dots,r)$, then

$$S=BB',$$

where $B=\|b_{ik}\|_1^n$ is a lower triangular matrix in which

$$b_{gk}=\frac{S\binom{1\,2\,\dots\,k-1\,g}{1\,2\,\dots\,k-1\,k}}{\sqrt{D_{k-1}D_k}}\qquad(g=k,k+1,\dots,n;\ k=1,2,\dots,r;\ D_0=1),\tag{42}$$

and the elements of the last $n-r$ columns are zero.

2. In the representation (35) let the elements of the last $n-r$ columns of $B$ and of the last $n-r$ rows of $C$ be zero. Then we may set

$$B=F\begin{pmatrix}b_{11}& &O\\ &\ddots& \\O& &b_{nn}\end{pmatrix},\qquad C=\begin{pmatrix}c_{11}& &O\\ &\ddots& \\O& &c_{nn}\end{pmatrix}L,\tag{43}$$

where $F$ and $L$ are lower and upper triangular matrices respectively, the first $r$ diagonal elements of $F$ and $L$ are 1, and the elements of the last $n-r$ columns of $F$ and of the last $n-r$ rows of $L$ can be chosen completely arbitrarily. Substituting (43) for $B$ and $C$ in (35) and using (36), we obtain the following theorem:

THEOREM 2: Every matrix $A=\|a_{ik}\|_1^n\neq0$ of rank $r$ in which

$$D_k=A\binom{1\,2\,\dots\,k}{1\,2\,\dots\,k}\neq0\qquad(k=1,2,\dots,r)$$

can be represented in the form of a product of a lower triangular matrix $F$, a diagonal matrix $D$, and an upper triangular matrix $L$:

$$A=FDL,\qquad D=\operatorname{diag}\left(D_1,\ \frac{D_2}{D_1},\ \dots,\ \frac{D_r}{D_{r-1}},\ 0,\ \dots,\ 0\right),\tag{44}$$

where $F$ and $L$ have units along the main diagonal and

$$f_{gk}=\frac{A\binom{1\,2\,\dots\,k-1\,g}{1\,2\,\dots\,k-1\,k}}{D_k},\qquad l_{kg}=\frac{A\binom{1\,2\,\dots\,k-1\,k}{1\,2\,\dots\,k-1\,g}}{D_k}\qquad(g=k,k+1,\dots,n;\ k=1,2,\dots,r),\tag{45}$$

while $f_{gk}$ and $l_{kg}$ are arbitrary for $g=k+1,\dots,n$; $k=r+1,\dots,n$.
3. The elimination method of Gauss, when applied to a matrix $A=\|a_{ik}\|_1^n$ of rank $r$ for which $D_k\neq0$ $(k=1,2,\dots,r)$, therefore yields two matrices: a lower triangular matrix $W$ with diagonal elements 1, and an upper triangular matrix $G$ in which the first $r$ diagonal elements are $D_1,\frac{D_2}{D_1},\dots,\frac{D_r}{D_{r-1}}$ and the last $n-r$ rows consist entirely of zeros. $G$ is the Gaussian form of the matrix $A$; $W$ is the transforming matrix. Here

$$A=W^{-1}G.$$

If $A$ is nonsingular, then $|G|\neq0$ as well. In this case (33) implies that $A^{-1}=G^{-1}W$. Since $G$ and $W$ are determined by means of the algorithm of Gauss, the task of finding the inverse matrix $A^{-1}$ reduces to determining $G^{-1}$ and multiplying $G^{-1}$ by $W$; and there is no difficulty in finding the inverse matrix $G^{-1}$ once the matrix $G$ has been determined, because $G$ is triangular.

For actual computation of the elements of $W$ we recommend the following device. We obtain the matrix $W$ when we apply to the unit matrix $E$ all the transformations (given by $W_1,W_2,\dots,W_N$) that we have performed on $A$ in the algorithm of Gauss (in this case we shall have, instead of the product $WA$, equal to $G$, the product $WE$, equal to $W$). For this purpose we write the unit matrix $E$ on the right of $A$:

$$\begin{pmatrix}a_{11}&\dots&a_{1n}&1& &O\\\vdots& &\vdots& &\ddots& \\a_{n1}&\dots&a_{nn}&O& &1\end{pmatrix}.\tag{46}$$

By applying all the transformations of the algorithm of Gauss to this rectangular matrix we obtain a rectangular matrix consisting of the two square matrices $G$ and $W$:

$$(G,\ W).$$

Thus, the application of Gauss's algorithm to the matrix (46) gives the matrices $G$ and $W$ simultaneously.
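The device (46) in code; a sketch of ours that runs the elimination once over the row-block $(A,\,E)$ and then recovers $A^{-1}=G^{-1}W$:

```python
import numpy as np

A = np.array([[2., 1., 1.], [4., 1., 0.], [-2., 2., 1.]])
n = len(A)
M = np.hstack([A, np.eye(n)])             # the rectangular matrix (A, E) of (46)
for j in range(n - 1):
    for i in range(j + 1, n):
        M[i] -= M[i, j] / M[j, j] * M[j]  # the same row operations as on A alone
G, W = M[:, :n], M[:, n:]                 # (G, W): Gaussian form and transforming matrix
print(np.allclose(W @ A, G))              # G = W A
A_inv = np.linalg.solve(G, W)             # A^{-1} = G^{-1} W (G is triangular)
print(np.allclose(A @ A_inv, np.eye(n)))
```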
Let us compare (33') with (44):

$$A=W^{-1}G,\qquad A=FDL.\tag{47}$$

These equations may be regarded as two distinct decompositions of the form (35); here we take the product $DL$ as the second factor $C$. Since the first $r$ diagonal elements of the first factors are the same (they are equal to 1), and since the last $n-r$ columns of $F$ may be chosen arbitrarily, we chose them such that

$$F=W^{-1}.\tag{48}$$

Let us, for the transposed matrix $A^T$, introduce, together with the matrices $G$ and $W$, similar matrices $G_1$ and $W_1$:

$$A^T=W_1^{-1}G_1.$$

A comparison of this equation with (44), applied to $A^T$, shows that we may also select the arbitrary elements of $L$ in such a way that

$$L=(W_1^T)^{-1}.\tag{49}$$

Replacing $F$ and $L$ in (44) by their expressions (48) and (49), we obtain

$$A=W^{-1}D\,(W_1^T)^{-1}.\tag{50}$$

Comparing this equation with (33') and (47), we find:

$$G=D\,(W_1^T)^{-1},\qquad G_1=D\,(W^T)^{-1},\qquad D=D^T.\tag{51}$$

Now let $A$ be nonsingular ($r=n$). Then $|D|\neq0$, and

$$D^{-1}=\operatorname{diag}\left(\frac{1}{D_1},\ \frac{D_1}{D_2},\ \dots,\ \frac{D_{n-1}}{D_n}\right).\tag{52}$$

It follows from (50) and (51) that

$$A=G_1^T D^{-1}G.\tag{53}$$

Formula (53) shows that the decomposition of $A$ into triangular factors can be obtained by applying the algorithm of Gauss to the matrices $A$ and $A^T$. On the other hand, it follows from (50) that

$$A^{-1}=W_1^T D^{-1}W.\tag{54}$$

This formula yields an effective computation of the inverse matrix $A^{-1}$ by the application of Gauss's algorithm to the rectangular matrices $(A,\,E)$ and $(A^T,\,E)$.
If, in particular, we take as our $A$ a symmetric matrix $S$, then $G_1$ coincides with $G$ and $W_1$ with $W$, and therefore formulas (53) and (54) assume the form

$$S=G^T D^{-1}G,\tag{55}$$

$$S^{-1}=W^T D^{-1}W.\tag{56}$$

§ 5. The Partition of a Matrix into Blocks. The Technique of Operating with Partitioned Matrices. The Generalized Algorithm of Gauss

1. It often becomes necessary to use matrices that are partitioned into rectangular parts, `cells' or `blocks.' In the present section we deal with such partitioned matrices.

Let a rectangular matrix

$$A=\|a_{ik}\|\qquad(i=1,2,\dots,m;\ k=1,2,\dots,n)\tag{57}$$

be given. By means of horizontal and vertical lines we dissect $A$ into rectangular blocks:

$$A=\begin{pmatrix}A_{11}&A_{12}&\dots&A_{1t}\\A_{21}&A_{22}&\dots&A_{2t}\\\vdots&\vdots& &\vdots\\A_{s1}&A_{s2}&\dots&A_{st}\end{pmatrix}\begin{matrix}\}\,m_1\\\}\,m_2\\\vdots\\\}\,m_s\end{matrix}\tag{58}$$

of cells $A_{\alpha\beta}$ of dimensions $m_\alpha\times n_\beta$ $(\alpha=1,2,\dots,s;\ \beta=1,2,\dots,t)$. We shall say of the matrix (58) that it is partitioned into $st$ blocks $A_{\alpha\beta}$, or that it is represented in the form of a partitioned, or blocked, matrix. Instead of (58) we shall simply write

$$A=(A_{\alpha\beta})\qquad(\alpha=1,2,\dots,s;\ \beta=1,2,\dots,t).\tag{59}$$

In the case $s=t$ we shall use the following notation:

$$A=(A_{\alpha\beta})_1^s.\tag{60}$$
Operations on partitioned matrices are performed according to the same formal rules as in the case in which we have numerical elements instead of blocks. For example, let $A$ and $B$ be two rectangular matrices of equal dimensions partitioned into blocks in exactly the same way:

$$A=(A_{\alpha\beta}),\qquad B=(B_{\alpha\beta})\qquad(\alpha=1,2,\dots,s;\ \beta=1,2,\dots,t).\tag{61}$$

It is easy to verify that

$$A+B=(A_{\alpha\beta}+B_{\alpha\beta})\qquad(\alpha=1,2,\dots,s;\ \beta=1,2,\dots,t).\tag{62}$$

We have to consider multiplication of partitioned matrices in more detail. We know (see Chapter I, p. 6) that for the multiplication of two rectangular matrices $A$ and $B$ the length of the rows of the first factor $A$ must be the same as the height of the columns of the second factor $B$. For `block' multiplication of these matrices we require, in addition, that the partitioning into blocks be such that the horizontal dimensions in the first factor are the same as the corresponding vertical dimensions in the second:

$$A=(A_{\alpha\gamma})\quad(\alpha=1,\dots,s;\ \gamma=1,\dots,t),\qquad B=(B_{\gamma\beta})\quad(\gamma=1,\dots,t;\ \beta=1,\dots,u),\tag{63}$$

where the blocks $A_{\alpha\gamma}$ are of dimensions $m_\alpha\times n_\gamma$ and the blocks $B_{\gamma\beta}$ of dimensions $n_\gamma\times p_\beta$. Then it is easy to verify that

$$AB=C=(C_{\alpha\beta}),\qquad\text{where}\quad C_{\alpha\beta}=\sum_{\gamma=1}^{t}A_{\alpha\gamma}B_{\gamma\beta}\qquad(\alpha=1,2,\dots,s;\ \beta=1,2,\dots,u).\tag{64}$$

We mention separately the special case in which one of the factors is a quasi-diagonal matrix. Let $A$ be quasi-diagonal, i.e., let $s=t$ and $A_{\alpha\beta}=0$ for $\alpha\neq\beta$. In this case formula (64) gives

$$C_{\alpha\beta}=A_{\alpha\alpha}B_{\alpha\beta}\qquad(\alpha=1,2,\dots,s;\ \beta=1,2,\dots,u).\tag{65}$$

When a partitioned matrix is multiplied on the left by a quasi-diagonal matrix, then the rows of the matrix are multiplied on the left by the corresponding diagonal blocks of the quasi-diagonal matrix.

Now let $B$ be a quasi-diagonal matrix, i.e., let $t=u$ and $B_{\alpha\beta}=0$ for $\alpha\neq\beta$. Then we obtain from (64):

$$C_{\alpha\beta}=A_{\alpha\beta}B_{\beta\beta}\qquad(\alpha=1,2,\dots,s;\ \beta=1,2,\dots,t).\tag{66}$$
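Block multiplication (64) spelled out numerically; an illustration of ours with an assumed $2\times2$ partition, checking that $C_{\alpha\beta}=\sum_\gamma A_{\alpha\gamma}B_{\gamma\beta}$ agrees with the ordinary product:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 7))
B = rng.standard_normal((7, 4))
m, n, q = [3, 2], [4, 3], [2, 2]        # block row/column dimensions

def blocks(M, rows, cols):
    ri = np.cumsum([0] + rows)
    ci = np.cumsum([0] + cols)
    return [[M[ri[a]:ri[a+1], ci[b]:ci[b+1]] for b in range(len(cols))]
            for a in range(len(rows))]

Ab, Bb = blocks(A, m, n), blocks(B, n, q)
Cb = [[sum(Ab[a][g] @ Bb[g][b] for g in range(len(n))) for b in range(len(q))]
      for a in range(len(m))]            # formula (64)
print(np.allclose(np.block(Cb), A @ B))
```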
When a partitioned matrix is multiplied on the right by a quasi-diagonal matrix, then all the columns of the partitioned matrix are multiplied on the right by the corresponding diagonal cells of the quasi-diagonal matrix.

Note that the multiplication of square partitioned matrices of one and the same order is always feasible if the factors are split into equal quadratic schemes of blocks and there are square matrices in the diagonal places of each factor.

The partitioned matrix (58) is called upper (lower) quasi-triangular if $s=t$ and all $A_{\alpha\beta}=0$ for $\alpha>\beta$ (for $\alpha<\beta$). A quasi-diagonal matrix is a special case of a quasi-triangular matrix. From the formulas (64) it is easy to see that: The product of two upper (lower) quasi-triangular matrices is itself an upper (lower) quasi-triangular matrix;⁶ the diagonal cells of the product are obtained by multiplying the corresponding diagonal cells of the factors. For when we set $s=t=u$ in (64) and $A_{\alpha\gamma}=0$ for $\alpha>\gamma$, $B_{\gamma\beta}=0$ for $\gamma>\beta$, we find

$$C_{\alpha\beta}=0\quad\text{for }\alpha>\beta,\qquad C_{\alpha\alpha}=A_{\alpha\alpha}B_{\alpha\alpha}\qquad(\alpha,\beta=1,2,\dots,s).$$

The case of lower quasi-triangular matrices is treated similarly.

We mention a rule for the calculation of the determinant of a quasi-triangular matrix. If $A$ is a quasi-triangular matrix (in particular, a quasi-diagonal matrix), then the determinant of the matrix is equal to the product of the determinants of the diagonal cells:

$$|A|=|A_{11}|\,|A_{22}|\cdots|A_{ss}|.\tag{67}$$

This rule can be obtained from the Laplace expansion.

⁶ It is assumed here that the block multiplication is feasible.
2. Let a partitioned matrix

$$A=\begin{pmatrix}A_{11}&A_{12}&\dots&A_{1t}\\A_{21}&A_{22}&\dots&A_{2t}\\\vdots&\vdots& &\vdots\\A_{s1}&A_{s2}&\dots&A_{st}\end{pmatrix}\tag{68}$$

be given, with blocks $A_{\alpha\beta}$ of dimensions $m_\alpha\times n_\beta$. To the $\alpha$-th row of submatrices we add the $\beta$-th row, multiplied on the left by a rectangular matrix $X$ of dimension $m_\alpha\times m_\beta$. We obtain a partitioned matrix

$$B=\begin{pmatrix}A_{11}&\dots&A_{1t}\\\vdots& &\vdots\\A_{\alpha1}+XA_{\beta1}&\dots&A_{\alpha t}+XA_{\beta t}\\\vdots& &\vdots\\A_{s1}&\dots&A_{st}\end{pmatrix}.\tag{69}$$

We introduce an auxiliary square matrix $V$, which we give in the form of a square scheme of blocks: in the diagonal blocks of $V$ there are unit matrices of orders $m_1,m_2,\dots,m_s$, respectively, and all the non-diagonal blocks of $V$ are equal to zero except the block $X$, which lies at the intersection of the $\alpha$-th row and the $\beta$-th column. It is easy to see that

$$VA=B.\tag{70}$$

As $V$ is nonsingular, we have⁷ for the ranks of $A$ and $B$:

$$r_A=r_B.\tag{71}$$

In the special case where $A$ is a square matrix, we have from (70):

$$|V|\,|A|=|B|.\tag{72}$$

But the determinant of the quasi-triangular matrix $V$ is 1:

$$|V|=1.\tag{73}$$

Hence

$$|A|=|B|.\tag{74}$$

The same conclusion holds when we add to an arbitrary column of (68) another column, multiplied on the right by a rectangular matrix $X$ of suitable dimensions.\ (75)

⁷ See p. 12.
The results obtained can be formulated as the following theorem:

THEOREM 3: If to the $\alpha$-th row (column) of the blocks of the partitioned matrix $A$ we add the $\beta$-th row (column), multiplied on the left (right) by a rectangular matrix $X$ of the corresponding dimensions, then the rank of $A$ remains unchanged under this transformation and, if $A$ is a square matrix, the determinant of $A$ is also unchanged.

3. We now consider the special case in which the diagonal block $A_{11}$ in (68) is square and nonsingular ($|A_{11}|\neq0$). To the $\alpha$-th row of $A$ we add the first row, multiplied on the left by $-A_{\alpha1}A_{11}^{-1}$ $(\alpha=2,\dots,s)$. We thus obtain

$$B_1=\begin{pmatrix}A_{11}&A_{12}&\dots&A_{1t}\\O&A_{22}^{(1)}&\dots&A_{2t}^{(1)}\\\vdots&\vdots& &\vdots\\O&A_{s2}^{(1)}&\dots&A_{st}^{(1)}\end{pmatrix},\tag{76}$$

where

$$A_{\alpha\beta}^{(1)}=A_{\alpha\beta}-A_{\alpha1}A_{11}^{-1}A_{1\beta}\qquad(\alpha=2,\dots,s;\ \beta=2,\dots,t).\tag{77}$$

If the matrix $A_{22}^{(1)}$ is square and nonsingular, then the process can be continued. In this way we arrive at the generalized algorithm of Gauss.

Let $A$ be a square matrix. Then

$$|A|=|B_1|=|A_{11}|\begin{vmatrix}A_{22}^{(1)}&\dots&A_{2t}^{(1)}\\\vdots& &\vdots\\A_{s2}^{(1)}&\dots&A_{st}^{(1)}\end{vmatrix}.\tag{78}$$

Formula (78) reduces the computation of the determinant $|A|$, consisting of $st$ blocks, to the computation of a determinant of lower order, consisting of $(s-1)(t-1)$ blocks. If this determinant of $(s-1)(t-1)$ blocks can again be subjected to such a transformation,⁸ the reduction can be repeated, etc.

Let us consider a determinant $\Delta$ partitioned into four blocks:

$$\Delta=\begin{vmatrix}A&B\\C&D\end{vmatrix},\tag{79}$$

where $A$ and $D$ are square matrices. Suppose $|A|\neq0$. Then from the second row we subtract the first, multiplied on the left by $CA^{-1}$. We obtain

⁸ That is, if $A_{22}^{(1)}$ is a square matrix and $|A_{22}^{(1)}|\neq0$.
$$\Delta=\begin{vmatrix}A&B\\O&D-CA^{-1}B\end{vmatrix}=|A|\,|D-CA^{-1}B|.\tag{I}$$

Similarly, if $|D|\neq0$, we subtract from the first row in $\Delta$ the second, multiplied on the left by $BD^{-1}$, obtaining

$$\Delta=\begin{vmatrix}A-BD^{-1}C&O\\C&D\end{vmatrix}=|A-BD^{-1}C|\,|D|.\tag{II}$$

In the special case in which all four matrices $A$, $B$, $C$, $D$ are square (of one and the same order $n$), we deduce from (I) and (II) the formulas of Schur, which reduce the computation of a determinant of order $2n$ to the computation of a determinant of order $n$:

$$\Delta=|AD-ACA^{-1}B|\qquad(|A|\neq0),\tag{Ia}$$

$$\Delta=|AD-BD^{-1}CD|\qquad(|D|\neq0).\tag{IIa}$$

If the matrices $A$ and $C$ are permutable, then it follows from (Ia) that

$$\Delta=|AD-CB|\qquad\text{(provided }AC=CA\text{)};\tag{Ib}$$

similarly, if $C$ and $D$ are permutable, then

$$\Delta=|AD-BC|\qquad\text{(provided }CD=DC\text{)}.\tag{IIb}$$

Formula (Ib) was obtained under the assumption $|A|\neq0$, and (IIb) under the assumption $|D|\neq0$. However, these restrictions can be removed by continuity arguments. From formulas (I)-(IIb) we can obtain another six formulas by replacing $A$ and $D$ on the right-hand sides simultaneously by $B$ and $C$.

Example. Let

$$\Delta=\begin{vmatrix}1&0&b_1&b_2\\0&1&b_3&b_4\\c_1&c_2&d_1&d_2\\c_3&c_4&d_3&d_4\end{vmatrix}.$$

Here $A=E$, so that $A$ and $C$ are permutable. By formula (Ib),

$$\Delta=|AD-CB|=\begin{vmatrix}d_1-c_1b_1-c_2b_3&d_2-c_1b_2-c_2b_4\\d_3-c_3b_1-c_4b_3&d_4-c_3b_2-c_4b_4\end{vmatrix}.$$
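A numerical check of formulas (I) and (II); an illustration of ours with random square blocks:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 3
A, B, C, D = (rng.standard_normal((n, n)) for _ in range(4))
M = np.block([[A, B], [C, D]])
d = np.linalg.det
print(np.isclose(d(M), d(A) * d(D - C @ np.linalg.solve(A, B))))   # formula (I)
print(np.isclose(d(M), d(D) * d(A - B @ np.linalg.solve(D, C))))   # formula (II)
```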
4. From Theorem 3 there follows also

THEOREM 4: If a rectangular matrix $R$ is represented in partitioned form

$$R=\begin{pmatrix}A&B\\C&D\end{pmatrix},\tag{80}$$

where $A$ is a square nonsingular matrix of order $n$ ($|A|\neq0$), then the rank of $R$ is equal to $n$ if and only if

$$D=CA^{-1}B.\tag{81}$$

Proof. We subtract from the second row of blocks of $R$ the first, multiplied on the left by $CA^{-1}$. Then we obtain the matrix

$$T=\begin{pmatrix}A&B\\O&D-CA^{-1}B\end{pmatrix}.\tag{82}$$

By Theorem 3, the matrices $R$ and $T$ have the same rank. But the rank of $T$ coincides with the rank of $A$ (namely, $n$) if and only if $D-CA^{-1}B=0$, i.e., when (81) holds. This proves the theorem.

From Theorem 4 there follows an algorithm⁹ for the construction of the inverse matrix $A^{-1}$ and, more generally, of the product $CA^{-1}B$, where $B$ and $C$ are rectangular matrices of dimensions $n\times p$ and $q\times n$, respectively. By means of Gauss's algorithm¹⁰ we reduce the matrix

$$\begin{pmatrix}A&B\\C&O\end{pmatrix}\tag{83}$$

to the form

$$\begin{pmatrix}G&B_1\\O&X\end{pmatrix}.\tag{84}$$

We will show that

$$X=-CA^{-1}B.\tag{85}$$

This can be done if the conditions (15), with $p=n$, are satisfied, where $n$ is the order of the matrix $A$. But if these conditions do not hold, then, since $|A|\neq0$, we may renumber the first $n$ rows (or the first $n$ columns) of the matrix (83) so that the $n$ steps of Gauss's algorithm turn out to be feasible. Such a modified Gaussian algorithm is sometimes applied even when the conditions (15) are satisfied.

⁹ See [181].

¹⁰ We do not apply here the entire algorithm of Gauss to the matrix (83), but only the first $n$ steps of the algorithm.
For, since the rank of the matrix

$$\begin{pmatrix}A&B\\C&CA^{-1}B\end{pmatrix}\tag{86}$$

is $n$ by Theorem 4, the same transformation that was applied to the matrix (83) reduces (86) to the form

$$\begin{pmatrix}G&B_1\\O&X+CA^{-1}B\end{pmatrix}.\tag{87}$$

By Theorem 3, the matrix (87) is also of rank $n$ ($n$ is the order of $A$). Hence $X+CA^{-1}B=0$, i.e., (85) holds.

In particular, if in (83) we set $B=E$ and $C=-E$, then by applying the algorithm of Gauss to the matrix

$$\begin{pmatrix}A&E\\-E&O\end{pmatrix}$$

we obtain

$$\begin{pmatrix}G&W\\O&X\end{pmatrix},\qquad\text{where }X=A^{-1}.$$

Further, if $B=y$, where $y$ is a column matrix, and $C=-E$, then $X=A^{-1}y$; i.e., when we apply Gauss's algorithm to the matrix

$$\begin{pmatrix}A&y\\-E&O\end{pmatrix},$$

we obtain the solution of the system of equations $Ax=y$.
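A sketch (ours; dimensions and the random test data are assumptions) of the block-elimination device (83)-(85): eliminating the first block row in one stroke leaves $-CA^{-1}B$ in the corner,

```python
import numpy as np

rng = np.random.default_rng(6)
n, p, q = 3, 2, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, p))
C = rng.standard_normal((q, n))
M = np.block([[A, B], [C, np.zeros((q, p))]])        # the matrix (83)
M[n:] -= C @ np.linalg.solve(A, M[:n])               # generalized Gauss step
X = M[n:, n:]                                        # the corner block of (84)
print(np.allclose(X, -C @ np.linalg.solve(A, B)))    # X = -C A^{-1} B, as in (85)
```

With $B=E$, $C=-E$ the same reduction delivers $A^{-1}$, as described above.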
Let us illustrate this method by finding $A^{-1}$ in the following example.

Example. Let

$$A=\begin{pmatrix}2&1&1\\1&0&2\\1&3&2\end{pmatrix}.$$

It is required to compute $A^{-1}$. We apply a somewhat modified elimination method¹¹ to the matrix

$$\begin{pmatrix}2&1&1&1&0&0\\1&0&2&0&1&0\\1&3&2&0&0&1\\-1&0&0&0&0&0\\0&-1&0&0&0&0\\0&0&-1&0&0&0\end{pmatrix}.$$

To all the rows we add certain multiples of the second row and we arrange that all the elements of the first column, except the second, become zero. Then we add to all the rows, except the second and third, the third row multiplied by certain factors and see to it that in the second column all the elements, except the second and third, become zero. Then we add to the last three rows the first row with suitable factors and obtain a matrix of the form

$$\begin{pmatrix}*&*&*&*&*&*\\ *&*&*&*&*&*\\ *&*&*&*&*&*\\0&0&0& & & \\0&0&0& &X& \\0&0&0& & & \end{pmatrix}.$$

Therefore

$$A^{-1}=X=\frac{1}{9}\begin{pmatrix}6&-1&-2\\0&-3&3\\-3&5&1\end{pmatrix}.$$

¹¹ See the preceding footnote.
CHAPTER III

LINEAR OPERATORS IN AN n-DIMENSIONAL VECTOR SPACE

Matrices constitute the fundamental analytic apparatus for the study of linear operators in an n-dimensional space. The study of these operators, in turn, enables us to divide all matrices into classes and to exhibit the significant properties that all matrices of one and the same class have in common.

In the present chapter we shall expound the simpler properties of linear operators in an n-dimensional space. The investigation will be continued in Chapters VII and IX.

§ 1. Vector Spaces

1. Let R be a set of arbitrary elements x, y, z, ... in which two operations are defined: the operation of `addition' and the operation of `multiplication by a number of the field F.'¹ We postulate that these operations can always be performed uniquely in R and that the following rules hold for arbitrary elements x, y, z of R and numbers α, β of F:

1. x + y = y + x;

2. (x + y) + z = x + (y + z);

3. There exists an element o in R such that the product of the number 0 with any element x of R is equal to o: 0 · x = o;²

4. 1 · x = x;

5. α(x + y) = αx + αy;

6. (α + β)x = αx + βx;

7. α(βx) = (αβ)x.

¹ These operations will be denoted by the usual signs `+' and `·'; the latter sign will sometimes be omitted.

² It is easy to see that all the usual properties of the operations of addition and of multiplication by a number follow from properties 1.-7. For example, for arbitrary x of R we have x + o = 1·x + 0·x = (1+0)x = x; if we set −x = (−1)x, then x + (−x) = o; etc.
DEFINITION 1: A set R of elements, in which two operations, `addition' of elements and `multiplication of elements of R by a number of F,' can always be performed uniquely and for which the postulates 1.-7. hold, is called a vector space (over the field F), and the elements are called vectors.

DEFINITION 2: The vectors x, y, ..., u of R are called linearly dependent if there exist numbers α, β, ..., δ in F, not all zero, such that

$$\alpha x+\beta y+\dots+\delta u=o.\tag{1}$$

If such a linear dependence does not hold, then the vectors x, y, ..., u are called linearly independent.

If the vectors x, y, ..., u are linearly dependent, then one of the vectors can be represented as a linear combination, with coefficients in F, of the remaining ones. For example, if α ≠ 0 in (1), then

$$x=-\frac{\beta}{\alpha}\,y-\dots-\frac{\delta}{\alpha}\,u.$$

DEFINITION 3: The space R is called finite-dimensional and the number n is called the dimension of the space if there exist n linearly independent vectors in R, while any n + 1 vectors in R are linearly dependent. If the space contains linearly independent systems of an arbitrary number of vectors, then it is called infinite-dimensional. In this book we shall study mainly finite-dimensional spaces.

DEFINITION 4: A system of n linearly independent vectors $e_1,e_2,\dots,e_n$ of an n-dimensional space, given in a definite order, is called a basis of the space.

Example 1. The set of all ordinary vectors (directed geometrical segments) is a three-dimensional vector space. The part of this space that consists of the vectors parallel to some plane is a two-dimensional space, and all the vectors parallel to a given line form a one-dimensional vector space.

Example 2. Let us call a column $x=(x_1,x_2,\dots,x_n)$ of n numbers of F a vector (where n is a fixed number). We define the basic operations as operations on column matrices:
$$(x_1,x_2,\dots,x_n)+(y_1,y_2,\dots,y_n)=(x_1+y_1,\ x_2+y_2,\ \dots,\ x_n+y_n),$$
$$\alpha(x_1,x_2,\dots,x_n)=(\alpha x_1,\ \alpha x_2,\ \dots,\ \alpha x_n).$$

It is easy to verify that all the postulates 1.-7. are satisfied. The null vector is the column $(0,0,\dots,0)$. The vectors so defined form an n-dimensional space. As a basis of the space we can take, for example, the columns of the unit matrix of order n:

$$(1,0,\dots,0),\ (0,1,\dots,0),\ \dots,\ (0,0,\dots,1).$$

The space thus defined is often called the n-dimensional number space.

Example 3. The set of all infinite sequences $(x_1,x_2,\dots,x_n,\dots)$, in which the operations are defined in a natural way, is an infinite-dimensional space.

Example 4. The set of polynomials $a_0+a_1t+\dots+a_{n-1}t^{n-1}$ of degree $<n$ with coefficients in F is an n-dimensional vector space.³ As a basis of this space we can take, say, the system of powers $t^0=1,\ t,\ t^2,\ \dots,\ t^{n-1}$. The set of all such polynomials (without a bound on the degree) forms an infinite-dimensional space.

Example 5. The set of all functions defined on a closed interval $[a,b]$ forms an infinite-dimensional space.

3. Let the vectors $e_1,e_2,\dots,e_n$ form a basis of an n-dimensional vector space R and let x be an arbitrary vector of the space. Then the vectors $x,e_1,e_2,\dots,e_n$ are linearly dependent (because there are n + 1 of them):

$$a_0x+a_1e_1+a_2e_2+\dots+a_ne_n=o,$$

where at least one of the numbers $a_0,a_1,\dots,a_n$ is different from zero. But in this case we must have $a_0\neq0$, since otherwise a linear dependence would hold among the vectors $e_1,e_2,\dots,e_n$, and the vectors of a basis cannot be linearly dependent. Therefore

$$x=x_1e_1+x_2e_2+\dots+x_ne_n,\qquad\text{where}\quad x_i=-\frac{a_i}{a_0}\ (i=1,2,\dots,n).\tag{2}$$

Note that the numbers $x_1,x_2,\dots,x_n$ are uniquely determined when the vector x and the basis $e_1,e_2,\dots,e_n$ are given. For if there is another decomposition of x besides (2),

$$x=x_1'e_1+x_2'e_2+\dots+x_n'e_n,\tag{3}$$

³ The basic operations are taken to be ordinary addition of polynomials and multiplication of a polynomial by a number.
then, by subtracting (2) from (3), we obtain

$$(x_1'-x_1)e_1+(x_2'-x_2)e_2+\dots+(x_n'-x_n)e_n=o,$$

and since the vectors of a basis are linearly independent, it follows that

$$x_1'-x_1=x_2'-x_2=\dots=x_n'-x_n=0,\quad\text{i.e.}\quad x_1'=x_1,\ x_2'=x_2,\ \dots,\ x_n'=x_n.\tag{4}$$

The numbers $x_1,x_2,\dots,x_n$ are called the coordinates of x in the basis $e_1,e_2,\dots,e_n$. If $x=\sum_{i=1}^{n}x_ie_i$ and $y=\sum_{i=1}^{n}y_ie_i$, then

$$x+y=\sum_{i=1}^{n}(x_i+y_i)e_i\qquad\text{and}\qquad \alpha x=\sum_{i=1}^{n}\alpha x_ie_i,$$

i.e., the coordinates of a sum of vectors are obtained by addition of the corresponding coordinates of the summands, and the product of a vector by a number α is obtained by multiplying all the coordinates of the vector by α.

4. Let the vectors $x_1,x_2,\dots,x_m$ be given by their coordinates in some basis:

$$x_k=\sum_{i=1}^{n}x_{ik}e_i\qquad(k=1,2,\dots,m).$$

The vectors $x_1,x_2,\dots,x_m$ are linearly dependent if and only if there exist numbers $c_1,c_2,\dots,c_m$, not all zero, such that

$$c_1x_1+c_2x_2+\dots+c_mx_m=o.\tag{5}$$

If a vector is the null vector, then all its coordinates are zero. Hence the vector equation (5) is equivalent to the following system of scalar equations:

$$\begin{aligned}
c_1x_{11}+c_2x_{12}+\dots+c_mx_{1m}&=0,\\
c_1x_{21}+c_2x_{22}+\dots+c_mx_{2m}&=0,\\
&\ \ \vdots\\
c_1x_{n1}+c_2x_{n2}+\dots+c_mx_{nm}&=0.
\end{aligned}\tag{6}$$

As is well known, this system of homogeneous linear equations for $c_1,c_2,\dots,c_m$ has a nonzero solution if and only if the rank of the coefficient matrix is less than the number of unknowns, i.e., less than m. Therefore, for the linear independence of the vectors $x_1,x_2,\dots,x_m$ it is necessary and sufficient that this rank should be m.
Thus, the following theorem holds:

THEOREM 1: In order that the vectors $x_1,x_2,\dots,x_m$ be linearly independent it is necessary and sufficient that the rank r of the matrix formed from the coordinates of these vectors in an arbitrary basis,

$$\begin{pmatrix}x_{11}&x_{12}&\dots&x_{1m}\\x_{21}&x_{22}&\dots&x_{2m}\\\vdots&\vdots& &\vdots\\x_{n1}&x_{n2}&\dots&x_{nm}\end{pmatrix},\tag{7}$$

be equal to m, i.e., to the number of vectors. Here the k-th column consists of the coordinates of $x_k$ $(k=1,2,\dots,m)$.

The linear independence of the vectors $x_1,x_2,\dots,x_m$ means that the columns of the matrix (7) are linearly independent; and if the columns of a matrix are linearly independent, then the rank of the matrix is equal to the number of columns. Hence it follows that in an arbitrary rectangular matrix the maximal number of linearly independent columns is equal to the rank of the matrix.⁴ Moreover, if we transpose the matrix, i.e., change the rows into columns and the columns into rows, then the rank obviously remains unchanged. Hence in a rectangular matrix the number of linearly independent columns is always equal to the number of linearly independent rows and equal to the rank of the matrix.

5. If in an n-dimensional space a basis $e_1,e_2,\dots,e_n$ has been chosen, then to every vector x there corresponds uniquely the column $x=(x_1,x_2,\dots,x_n)$, where $x_1,x_2,\dots,x_n$ are the coordinates of x in the given basis. Here the sum of vectors in R corresponds to the sum of the corresponding columns, and the analogous correspondence holds for the product of a vector by a number α of F. In other words, the choosing of a basis establishes a one-to-one correspondence, an isomorphism, between the vectors of an arbitrary n-dimensional vector space R and the vectors of the n-dimensional number space R′ considered in Example 2. Therefore an arbitrary n-dimensional vector space is isomorphic to the n-dimensional number space, and all vector spaces of the same number n of dimensions over the same number field F are isomorphic to each other. This means that to within isomorphism there exists only one n-dimensional vector space for a given number field.

The reader may ask why we have introduced an `abstract' n-dimensional space if it coincides to within isomorphism with the n-dimensional number space. Indeed, we could have defined a vector as a system of n numbers given in a definite order and could have introduced the operations on these vectors in the very way it was done in Example 2. But we would then have mixed up properties of vectors that do not depend on the choice of a basis with properties of a particular basis. For example, the fact that all the coordinates of a vector are zero is a property of the vector itself; it does not depend on the choice of basis. But the equality of all its coordinates is not a property of the vector itself, because it disappears under a change of basis. The axiomatic definition of a vector space immediately singles out the properties of vectors that do not depend on the choice of a basis.

§ 2. A Linear Operator Mapping an n-Dimensional Space into an m-Dimensional Space

1. We consider a linear transformation

$$y_i=\sum_{k=1}^{n}a_{ik}x_k\qquad(i=1,2,\dots,m),\tag{8}$$

whose coefficients belong to the number field F, as well as two vector spaces over F: an n-dimensional space R and an m-dimensional space S. We choose a basis $e_1,e_2,\dots,e_n$ in R and a basis $g_1,g_2,\dots,g_m$ in S. Then the transformation (8) associates with every vector $x=\sum_{k=1}^{n}x_ke_k$ of R a certain vector $y=\sum_{i=1}^{m}y_ig_i$ of S; i.e., the transformation (8) determines a certain operator A that sets up a correspondence between the vector x and the vector y:

$$y=Ax.$$

It is easy to see that this operator A has the property of linearity: for arbitrary $x_1,x_2$ of R and α of F,

$$A(x_1+x_2)=Ax_1+Ax_2,\qquad A(\alpha x_1)=\alpha Ax_1.\tag{9}$$

Thus, for a given basis in R and a given basis in S, the transformation (8) determines a linear operator mapping R into S, in the sense of the following definition:

DEFINITION 5: An operator A mapping R into S, i.e., associating with every vector x of R a certain vector y = Ax of S, is called linear if for arbitrary $x_1,x_2$ of R and α of F

$$A(x_1+x_2)=Ax_1+Ax_2,\qquad A(\alpha x_1)=\alpha Ax_1.$$

⁴ This proposition follows from Theorem 1, in the proof of which we have started from the well-known property of a system of linear homogeneous equations: a nonzero solution exists only when the rank of the coefficient matrix is less than the number of unknowns. For a proof of Theorem 1 independent of this property, see § 5.
2. We shall now show the converse: that for an arbitrary linear operator A mapping R into S and arbitrary bases $e_1,e_2,\dots,e_n$ in R and $g_1,g_2,\dots,g_m$ in S, there exists a rectangular matrix with elements in F,

$$A=\begin{pmatrix}a_{11}&a_{12}&\dots&a_{1n}\\a_{21}&a_{22}&\dots&a_{2n}\\\vdots&\vdots& &\vdots\\a_{m1}&a_{m2}&\dots&a_{mn}\end{pmatrix},\tag{10}$$

such that the linear transformation (8) formed by means of this matrix expresses the coordinates of the transformed vector y = Ax in terms of the coordinates of the original vector x.

Let us, in fact, apply the operator A to the basis vector $e_k$, and let the coordinates of the vector $Ae_k$ thus obtained, in the basis $g_1,g_2,\dots,g_m$, be denoted by $a_{1k},a_{2k},\dots,a_{mk}$:

$$Ae_k=\sum_{i=1}^{m}a_{ik}g_i\qquad(k=1,2,\dots,n).\tag{11}$$

Multiplying both sides of (11) by $x_k$ and summing from 1 to n, we obtain

$$y=Ax=\sum_{k=1}^{n}x_kAe_k=\sum_{i=1}^{m}\Bigl(\sum_{k=1}^{n}a_{ik}x_k\Bigr)g_i,$$

hence

$$y_i=\sum_{k=1}^{n}a_{ik}x_k\qquad(i=1,2,\dots,m),$$

and this is what we had to show.

Thus, for given bases of R and S: to every linear operator A mapping R into S there corresponds a rectangular matrix of dimension m × n and, conversely, to every such matrix there corresponds a linear operator mapping R into S. Here, in the matrix A corresponding to the operator A, the k-th column consists of the coordinates of the vector $Ae_k$ $(k=1,2,\dots,n)$.

We denote by $x=(x_1,x_2,\dots,x_n)$ and $y=(y_1,y_2,\dots,y_m)$ the coordinate columns of the vectors x and y. Then the vector equation y = Ax corresponds to the matrix equation $y=Ax$, which is the matrix form of the transformation (8).
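Formula (11) in code; a sketch of ours in which an operator is given as a function on coordinate columns, and its matrix is assembled column by column from the images of the basis vectors:

```python
import numpy as np

def operator_matrix(op, n):
    """Matrix of a linear operator: the k-th column is the image of e_k."""
    E = np.eye(n)
    return np.column_stack([op(E[:, k]) for k in range(n)])

# a sample linear operator on coordinate columns (our invented example)
op = lambda x: np.array([2*x[0] + x[2], x[1] - x[0], 3*x[2]])
A = operator_matrix(op, 3)
x = np.array([1., 2., 3.])
print(np.allclose(A @ x, op(x)))        # y = A x reproduces the operator
```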
Example. We consider the set of all polynomials in t of degree < n with coefficients in F. This set forms an n-dimensional vector space R (see Example 4, p. 52). Similarly, the polynomials in t of degree < n − 1 with coefficients in F form a space $R_1$ of dimension n − 1. The differentiation operator $\frac{d}{dt}$ associates with every polynomial of R a certain polynomial in $R_1$; this operator maps R into $R_1$. The differentiation operator is linear, since

$$\frac{d}{dt}[\varphi(t)+\psi(t)]=\frac{d\varphi(t)}{dt}+\frac{d\psi(t)}{dt},\qquad \frac{d}{dt}[\alpha\varphi(t)]=\alpha\frac{d\varphi(t)}{dt}.$$

In R and $R_1$ we choose bases consisting of powers of t:

$$t^0=1,\ t,\ t^2,\ \dots,\ t^{n-1}\qquad\text{and}\qquad t^0=1,\ t,\ t^2,\ \dots,\ t^{n-2}.$$

Using formulas (11), we construct the rectangular matrix of dimension (n − 1) × n corresponding to the differentiation operator $\frac{d}{dt}$ in these bases:

$$\begin{pmatrix}0&1&0&\dots&0&0\\0&0&2&\dots&0&0\\\vdots&\vdots&\vdots&\ddots&\vdots&\vdots\\0&0&0&\dots&0&n-1\end{pmatrix}.$$
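The example in matrix form; a sketch of ours with n = 4, applying the matrix of $\frac{d}{dt}$ to a coefficient column:

```python
import numpy as np

n = 4
D = np.zeros((n - 1, n))
for k in range(1, n):
    D[k - 1, k] = k                  # d/dt t^k = k t^{k-1}

p = np.array([5., 3., 0., 2.])       # coefficients of 5 + 3t + 2t^3
print(D @ p)                         # [3, 0, 6]: coefficients of 3 + 6t^2
```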
§ 3. Addition and Multiplication of Linear Operators

1. Let A and B be two linear operators mapping R into S and let the corresponding matrices be

$$A=\|a_{ik}\|,\quad B=\|b_{ik}\|\qquad(i=1,2,\dots,m;\ k=1,2,\dots,n).$$

DEFINITION 6: The sum of the operators A and B is the operator C defined by the equations

$$Cx=Ax+Bx\qquad(x\in\mathsf{R}).⁵\tag{12}$$

It is easy to verify that the sum C = A + B of the linear operators A and B is itself a linear operator. Using formulas (11), it follows that the operator C corresponds to the matrix $C=\|c_{ik}\|$ with $c_{ik}=a_{ik}+b_{ik}$ $(i=1,2,\dots,m;\ k=1,2,\dots,n)$; i.e., in this choice of bases the operator C = A + B corresponds to the matrix

$$C=A+B.\tag{13}$$

We would come to the same conclusion starting from the matrix equation

$$Cx=Ax+Bx\tag{14}$$

(x is the coordinate column of the vector x) corresponding to the vector equation (12): since x is an arbitrary column, (13) follows from (14).

2. Let R, S, and T be three vector spaces of dimensions q, n, and m, and let A and B be two linear operators, of which B maps R into S and A maps S into T:

$$\mathsf{R}\xrightarrow{\ B\ }\mathsf{S}\xrightarrow{\ A\ }\mathsf{T}.$$

DEFINITION 7: The product of the operators A and B is the operator C for which

$$Cx=A(Bx)\qquad(x\in\mathsf{R}).\tag{15}$$

The operator C maps R into T: $\mathsf{R}\xrightarrow{\ C=AB\ }\mathsf{T}$. From the linearity of the operators A and B follows the linearity of C.

We choose arbitrary bases in R, S, and T and denote by A, B, and C the matrices corresponding, in this choice of bases, to the operators A, B, and C. Then the vector equations

$$y=Bx,\qquad z=Ay,\qquad z=Cx\tag{16}$$

correspond to the matrix equations

$$y=Bx,\qquad z=Ay,\qquad z=Cx,\tag{17}$$

where x, y, z are the coordinate columns of the vectors x, y, z. Hence $z=A(Bx)=(AB)x$, and, as the column x is arbitrary, $C=AB$. Thus, the product C = AB of the operators A and B corresponds to the matrix C = AB, which is the product of the matrices A and B.

⁵ $x\in\mathsf{R}$ means that the element x belongs to the set R.
We leave it to the reader to show that the operator⁶ C = αA corresponds to the matrix

$$C=\alpha A\qquad(\alpha\in\mathrm{F}).$$

Thus we see that in Chapter I the operations on matrices were so defined that the sum A + B, the product AB, and the product αA correspond to the matrices A + B, AB, and αA, respectively, where A and B are the matrices corresponding to the operators A and B, and α is a number of F.

⁶ I.e., the operator for which $Cx=\alpha Ax$ $(x\in\mathsf{R})$.

§ 4. Transformation of Coordinates

1. In an n-dimensional vector space we consider two bases: $e_1,e_2,\dots,e_n$ (the `old' basis) and $e_1^*,e_2^*,\dots,e_n^*$ (the `new' basis). The mutual disposition of the basis vectors is determined if the coordinates of the vectors of one basis are given relative to the other basis. We set

$$\begin{aligned}
e_1^*&=t_{11}e_1+t_{21}e_2+\dots+t_{n1}e_n,\\
e_2^*&=t_{12}e_1+t_{22}e_2+\dots+t_{n2}e_n,\\
&\ \ \vdots\\
e_n^*&=t_{1n}e_1+t_{2n}e_2+\dots+t_{nn}e_n,
\end{aligned}\tag{18}$$

or, in abbreviated form,

$$e_k^*=\sum_{i=1}^{n}t_{ik}e_i\qquad(k=1,2,\dots,n).\tag{18'}$$

We shall now establish the connection between the coordinates of one and the same vector in the two different bases. Let $x_1,x_2,\dots,x_n$ and $x_1^*,x_2^*,\dots,x_n^*$ be the coordinates of the vector x relative to the `old' and the `new' bases, respectively:

$$x=\sum_{i=1}^{n}x_ie_i=\sum_{k=1}^{n}x_k^*e_k^*.\tag{19}$$

In (19) we substitute for the vectors $e_k^*$ the expressions given for them in (18). We obtain:
$$x=\sum_{i=1}^{n}x_ie_i=\sum_{k=1}^{n}x_k^*\Bigl(\sum_{i=1}^{n}t_{ik}e_i\Bigr)=\sum_{i=1}^{n}\Bigl(\sum_{k=1}^{n}t_{ik}x_k^*\Bigr)e_i.$$

Comparing this with (19) and bearing in mind that the coordinates of a vector are uniquely determined when the vector and the basis are given, we find:

$$x_i=\sum_{k=1}^{n}t_{ik}x_k^*\qquad(i=1,2,\dots,n),\tag{20}$$

or, in explicit form:

$$\begin{aligned}
x_1&=t_{11}x_1^*+t_{12}x_2^*+\dots+t_{1n}x_n^*,\\
x_2&=t_{21}x_1^*+t_{22}x_2^*+\dots+t_{2n}x_n^*,\\
&\ \ \vdots\\
x_n&=t_{n1}x_1^*+t_{n2}x_2^*+\dots+t_{nn}x_n^*.
\end{aligned}\tag{21}$$

Formulas (21) determine the transformation of the coordinates of a vector on transition from one basis to another. They express the `old' coordinates in terms of the `new' ones. The matrix

$$T=\|t_{ik}\|_1^n\tag{22}$$

is called the matrix of the coordinate transformation, or the transforming matrix. Its k-th column consists of the `old' coordinates of the k-th `new' basis vector. This follows from formulas (18), or immediately from (21) if we set in the latter $x_k^*=1$, $x_i^*=0$ for $i\neq k$.

Note that the matrix T is nonsingular. For when we set in (21) $x_1=x_2=\dots=x_n=0$, we obtain a system of n linear homogeneous equations in the n unknowns $x_1^*,x_2^*,\dots,x_n^*$; if this system had a nonzero solution, then (19) would imply a linear dependence among the vectors $e_1^*,e_2^*,\dots,e_n^*$, whereas the columns of T are the `old' coordinates of the linearly independent vectors $e_1^*,e_2^*,\dots,e_n^*$. Hence the system can only have the zero solution $x_1^*=0,\ x_2^*=0,\ \dots,\ x_n^*=0$, and therefore

$$|T|\neq0.\tag{23}⁷$$

We now introduce the column matrices $x=(x_1,x_2,\dots,x_n)$ and $x^*=(x_1^*,x_2^*,\dots,x_n^*)$. Then the formulas (21) for the coordinate transformation can be written in the form of the following matrix equation:

$$x=Tx^*.\tag{24}$$

Multiplying both sides of this equation by $T^{-1}$, we obtain the expression for the inverse transformation:

$$x^*=T^{-1}x.\tag{25}$$

⁷ The inequality (23) also follows from Theorem 1 (p. 54).
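Formulas (24)-(25) in code; a sketch of ours in which the columns of T hold the `old' coordinates of an invented `new' basis:

```python
import numpy as np

e_new = np.array([[1., 1., 0.],      # new basis vectors in old coordinates,
                  [0., 1., 1.],      # one per row here,
                  [1., 0., 1.]]).T   # transposed so each *column* is a basis vector
T = e_new                            # the transforming matrix (22)
x_star = np.array([2., -1., 3.])     # coordinates in the new basis
x = T @ x_star                       # 'old' coordinates, formula (24)
print(np.allclose(np.linalg.solve(T, x), x_star))   # recovers x*, formula (25)
```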
§ 5. Equivalent Matrices. The Rank of an Operator. Sylvester's Inequality

1. Let R and S be two vector spaces of dimensions n and m, respectively, over the number field F, and let A be a linear operator mapping R into S. In the present section we shall make clear how the matrix A corresponding to the given linear operator A changes when the bases in R and S are changed.

We choose arbitrary bases $e_1,e_2,\dots,e_n$ in R and $g_1,g_2,\dots,g_m$ in S. In these bases the operator A corresponds to a matrix $A=\|a_{ik}\|$ $(i=1,2,\dots,m;\ k=1,2,\dots,n)$. To the vector equation

$$y=Ax\tag{26}$$

there corresponds the matrix equation

$$y=Ax,\tag{27}$$

where x and y are the coordinate columns of the vectors x and y in the bases $e_1,e_2,\dots,e_n$ and $g_1,g_2,\dots,g_m$.

We now choose other bases $e_1^*,e_2^*,\dots,e_n^*$ and $g_1^*,g_2^*,\dots,g_m^*$ in R and S. In the new bases we shall have $x^*$, $y^*$, $A^*$ instead of x, y, A. Here

$$y^*=A^*x^*.\tag{28}$$

Let us denote by Q and N the nonsingular square matrices of orders n and m that realize the coordinate transformations in the spaces R and S on transition from the old bases to the new ones (see § 4):

$$x=Qx^*,\qquad y=Ny^*.\tag{29}$$

Then we obtain from (27) and (29):

$$y^*=N^{-1}y=N^{-1}Ax=N^{-1}AQx^*.\tag{30}$$

Setting $P=N^{-1}$, we find from (28) and (30):

$$A^*=PAQ.\tag{31}$$

DEFINITION 8: Two rectangular matrices A and B of the same dimension are called equivalent if there exist two nonsingular matrices P and Q such that⁸

$$B=PAQ.\tag{32}$$

⁸ If the matrices A and B are of dimension m × n, then in (32) the square matrix P is of order m, and Q of order n. If the elements of the equivalent matrices A and B belong to some number field, then P and Q may be chosen such that their elements belong to the same number field.
From (31) it follows that two matrices corresponding to one and the same linear operator A for different choices of bases in R and S are always equivalent. It is easy to see that, conversely, if a matrix A corresponds to the operator A for certain bases in R and S, and if a matrix B is equivalent to A, then B corresponds to the same linear operator for certain other bases in R and S. Thus, to every linear operator mapping R into S there corresponds a class of equivalent matrices with elements in F.

2. The following theorem establishes a criterion for the equivalence of two matrices:

THEOREM 2: Two rectangular matrices of the same dimension are equivalent if and only if they have the same rank.

Proof. The condition is necessary. When a rectangular matrix is multiplied by an arbitrary nonsingular square matrix (on the right or left), then its rank does not change (see Chapter I, p. 17). Therefore it follows from (32) that

$$r_A=r_B.$$

The condition is sufficient. Let A be a rectangular matrix of dimension m × n. It determines a linear operator A mapping the space R, with the basis $e_1,e_2,\dots,e_n$, into the space S, with the basis $g_1,g_2,\dots,g_m$. Let r denote the number of linearly independent vectors among the vectors $Ae_1,Ae_2,\dots,Ae_n$. Without loss of generality we may assume that the vectors $Ae_1,Ae_2,\dots,Ae_r$ are linearly independent⁹ and that the remaining $Ae_{r+1},\dots,Ae_n$ are expressed linearly in terms of them:

$$Ae_j=\sum_{i=1}^{r}c_{ij}Ae_i\qquad(j=r+1,\dots,n).\tag{33}$$

We define a new basis in R as follows:

$$e_i^*=e_i\quad(i=1,2,\dots,r);\qquad e_j^*=e_j-\sum_{i=1}^{r}c_{ij}e_i\quad(j=r+1,\dots,n).\tag{34}$$

Then, by (33),

$$Ae_i^*=Ae_i\quad(i=1,2,\dots,r);\qquad Ae_j^*=o\quad(j=r+1,\dots,n).\tag{35}$$

Next, we set

$$g_i^*=Ae_i^*\qquad(i=1,2,\dots,r).\tag{36}$$

⁹ This can be achieved by a suitable numbering of the basis vectors $e_1,e_2,\dots,e_n$.
The vectors $g_1^*,g_2^*,\dots,g_r^*$ are linearly independent. We supplement them with suitable vectors $g_{r+1}^*,\dots,g_m^*$ to obtain a basis $g_1^*,g_2^*,\dots,g_m^*$ of S. The matrix corresponding to the same operator A in the new bases $e_1^*,\dots,e_n^*$; $g_1^*,\dots,g_m^*$ now has, by (35) and (36), the form

$$I_r=\begin{pmatrix}1&0&\dots&0&\dots&0\\0&1&\dots&0&\dots&0\\\vdots&\vdots&\ddots&\vdots& &\vdots\\0&0&\dots&1&\dots&0\\\vdots&\vdots& &\vdots& &\vdots\\0&0&\dots&0&\dots&0\end{pmatrix}.\tag{37}$$

Along the main diagonal of $I_r$ there are r units, starting at the top; all the remaining elements of $I_r$ are zeros. Since the matrices A and $I_r$ correspond to one and the same operator A, they are equivalent; and since, as we have proved, equivalent matrices have the same rank, the rank of the original matrix A is r.

We have shown that an arbitrary rectangular matrix of rank r is equivalent to the `canonical' matrix $I_r$. But $I_r$ is completely determined by specifying its dimension m × n and the number r. Therefore all rectangular matrices of given dimension m × n and of given rank r are equivalent to one and the same matrix $I_r$ and, consequently, to each other. This completes the proof of the theorem.

3. Let A be a linear operator mapping an n-dimensional space R into an m-dimensional space S. The set of all vectors of the form Ax, where $x\in\mathsf{R}$, forms a vector space,¹⁰ because the sum of two such vectors and the product of such a vector by a number are also vectors of this form; it is part of the space S or, as we shall say, a subspace of S. This space will be denoted by AR.

Together with the subspace AR of S we consider the set of all vectors $x\in\mathsf{R}$ that satisfy the equation

$$Ax=o.\tag{38}$$

These vectors also form a subspace of R, which we shall denote by $N_A$.

¹⁰ The set of vectors of the form Ax $(x\in\mathsf{R})$ satisfies the postulates 1.-7. of § 1.
DEFINITION 9: If a linear operator A maps R into S, then the dimension r of the space AR is called the rank of A,¹¹ and the dimension d of the space $N_A$, consisting of all vectors $x\in\mathsf{R}$ that satisfy the condition (38), is called the defect, or nullity, of A.

Since it follows from $x=\sum_{i=1}^{n}x_ie_i$ that $Ax=\sum_{i=1}^{n}x_iAe_i$, the dimension of AR is equal to the maximal number of linearly independent vectors among $Ae_1,Ae_2,\dots,Ae_n$. From the definition of AR and $N_A$ it follows that, in the notation of the preceding proof, the vectors $g_1^*,g_2^*,\dots,g_r^*$ form a basis of AR and the vectors $e_{r+1}^*,e_{r+2}^*,\dots,e_n^*$ form a basis of $N_A$; for $Ae_1^*=g_1^*,\ \dots,\ Ae_r^*=g_r^*$ and $Ae_{r+1}^*=\dots=Ae_n^*=o$. Hence it follows that r is the rank of the operator A and that

$$d=n-r.\tag{39}$$

Among all the equivalent rectangular matrices that describe a given operator A in distinct bases there occurs the canonical matrix $I_r$ (see (37)). If A is an arbitrary matrix corresponding to A, then it is equivalent to $I_r$ and therefore has the same rank r. Thus, the rank of an operator A coincides with the rank of the rectangular matrix

$$A=\begin{pmatrix}a_{11}&a_{12}&\dots&a_{1n}\\a_{21}&a_{22}&\dots&a_{2n}\\\vdots&\vdots& &\vdots\\a_{m1}&a_{m2}&\dots&a_{mn}\end{pmatrix}$$

determined by A in arbitrary bases $e_1,e_2,\dots,e_n$ and $g_1,g_2,\dots,g_m$. The columns of A are formed by the coordinates of the vectors $Ae_1,Ae_2,\dots,Ae_n$. Thus: The rank of a matrix coincides with the number of linearly independent columns of the matrix.¹² Since under transposition the rows of a matrix become its columns and the rank remains unchanged:

¹¹ The dimension of the space AR never exceeds the dimension of R. This follows from the fact that the equation $x=\sum_{i=1}^{n}x_ie_i$ ($e_1,\dots,e_n$ is a basis of R) implies the equation $Ax=\sum_{i=1}^{n}x_iAe_i$, so that $r\le n$.

¹² In § 1 we reached these conclusions on the basis of other arguments (see p. 54).
The number of linearly independent rows of a matrix is also equal to the rank of the matrix.

4. Let A and B be two linear operators and let C = AB be their product. Suppose that the operator B maps R into S and that the operator A maps S into T. Then the operator C maps R into T:

$$\mathsf{R}\xrightarrow{\ B\ }\mathsf{S}\xrightarrow{\ A\ }\mathsf{T},\qquad \mathsf{R}\xrightarrow{\ C\ }\mathsf{T}.$$

We introduce the matrices A, B, C corresponding to A, B, C in some choice of bases in R, S, and T. Then the matrix equation C = AB will correspond to the operator equation C = AB.

We denote by $r_A$, $r_B$, $r_C$ the ranks of the operators A, B, C or, what is the same, of the matrices A, B, C. These numbers determine the dimensions of the subspaces AS, BR, CR = A(BR). Since $\mathsf{BR}\subseteq\mathsf{S}$,¹³ we have $A(\mathsf{BR})\subseteq A\mathsf{S}$. Therefore

$$r_C\le r_A.$$

Moreover, the dimension of A(BR) cannot exceed the dimension of BR. Therefore

$$r_C\le r_B.$$

Let us regard A as an operator mapping BR into T. The rank of this operator is equal to the dimension of the space A(BR), i.e., to $r_C$. Therefore, by applying (39) we obtain

$$r_C=r_B-d_1,\tag{40}$$

where $d_1$ is the maximal number of linearly independent vectors of BR that satisfy the equation

$$Ax=o.\tag{41}$$

But all the solutions of this equation that belong to S form a subspace of dimension d,¹⁴ so that

$$d_1\le d,\tag{42}$$

where

$$d=n-r_A\tag{43}$$

is the defect of the operator A mapping S into T. From (40), (42), and (43) we find:

$$r_A+r_B-n\le r_C.$$

¹³ $\mathsf{R}\subseteq\mathsf{S}$ means that the set R forms part of the set S.

¹⁴ See Footnote 11.
. Bee Chapter I. . (46) Here I(A)g(A) = g(A)t(A) for any two polynomials f (t) and g (t). .66 III. and in general A' = AA. 2.. . the coordinates of y in the same basis. be a polynomial in a scalar argument t with coefficients in the field F. Then we set : f(A)=a0AA1+alAm1+. LINEAR OPERATORS IN AN nDIMENSIONAL VECTOR SPACE Thus we have obtained Sylvester's inequality for the rank of the product of two rectangular matrices A and B of dimensions m X n and n X q: rA+rBn5rABSmin(ra.. then the powers A2=AA.. . The sum of two linear operators in R and the product of such an operator by a number are also linear operators in R...it + a..A have a meaning. and by yl. (44) § 6. .. .. If A is a linear operator in R. Then it is easy to see that for all nonnegative integers p and q we have APA4 =AP+t. .. the coordinates of the vector x in an arbitrary basis e1.'s This ring has an identity operator. Then rt be =2 attxt xi (i =1.+asr1A+aE.. namely the operator E for which Ex=x For every operator A in R we have (xcR).. ... A3=AAA. and this product is also a linear operator in R. rB). A linear operator mapping the ndimensional vector space R into itself (here R = S and n = m) will be referred to simply as a linear operator in R.. Hence the linear operators in R form a ring. Multiplication of two such operators is always feasible. e2. Y2.yER). y. We set A*= E. (45) EA=AE=A. p. 17. We denote by x1i x2. e. X.. n)* (47) 15 This ring is in fact an algebra.. Linear Operators Mapping an nDimensional Space into Itself 1. me Let 1(t)= aer + alt" + + a. Let y=Ax (x.
. . 61 (namely.* and A*='I ask 11 is the square matrix corresponding to the operator A in this basis. . Then. and a comparison with (49) gives: (50) A*=TIAT. (51') where T is a nonsingular matrix. e. . In this isomorphism the polynomial f (A) corresponds to the matrix f (A)... Thus. es. we can write the transformation form y=Ax. e8.. n). we have y* . e. Then from (48) and (50) we find: y* = TIATx*. Let us consider. y* are the column matrices formed from the coordinates of the vectors x. are called similar. .. . DEFINITION 10: Two matrices A and B connected by the relation B = TIA T T. x2. g. . A*x* (49) where x*. The identity operator E corresponds to the square unit matrix E= II 8a+i.. In this case the spaces R and S coincide. . (51) Formula (51) is a special case of (31) on p. y. e the linear operator A corresponds to a square matrix A= 11 aik 1101. in the same way... the choice of a basis establishes an isomorphism between the ring of linear opera tors in R and the ring of square matrices of order n with elements in F. (47) in matrix and y = (y1. the bases e. .. P = T1 and Q=T). e2.. MAPPING nDIMENSIONAL SPACE INTO ITSELF 67 In the basis e1. ' We remind the reader (see § 2.... . . .. in analogy with (48). ga.e2. y=Ty*. y2. Introducing the coordinate columns x= (x1.."' is See § 2 of this chapter. (48) The sum and product of two operators A and B correspond to the sum and product of the corresponding square matrices A = II a(5 li.) that in the kth column of this matrix are to be found the coordinates of the vector Aek (k =1. .).. .. apart from the basis el.§ 6. The product aA corresponds to the matrix aA. . . another basis ei.e* of R. y in the basis ei.. 2. and B bik Il ..of these spaces are identified. . . . e. . e and gi. . We rewrite in matrix form the formulas for the transformation of coordinates x=Tx*..
and B to C. LINEAR OPERATORS IN AN nDIMENSIONAL VECTOR SPACE Thus.R. In other words. If I A 1= 0 ( 0). we shall give necessary and sufficient conditions for two square matrices of order n to be similar. In accordance with this definition a singular (nonsingular) operator corresponds to a singular (nonsingular) matrix in any basis. i. For it follows from (51') that B=DTI11 AI is a necessary. we are at the same time studying the matrix properties that are common to the whole class of similar matrices.e. 2) AR . In Chapter VI we shall establish a criterion for the similarity of two matrices. the vectors of the form Ax (x a R) fill out the whole space R. We note at once that two similar matrices always have the same determinant. or invariant.68 III. In accordance with (52) we may define the determinant I A I of a linear operator A in R as the determinant of an arbitrary matrix corresponding to the given operator. . then the operator A is called singular (nonsingular). In other words. that is. i. to a linear operator in R there corresponds a whole class of similar matrices . they represent the given operator in various bases. 2) AR is a proper part of R. that remain unchanged. and Transitivity (if A is similar to B. we have shown that two matrices corresponding to one and the same linear operator in R for distinct bases are similar and the matrix T linking these matrices coincides with the matrix of the coordinate transformation in the transition from the first basis to the second (see (50) ). then A is similar to C). It is easy to verify the three properties of similar matrices : Reflexivity (a matrix A is always similar to itself) Symmetry (if A is similar to B.. under transition from a given matrix to a similar one. but not a sufficient condition for the similarity of the matrices A and B. 17 The matrix T can always be chosen such that its elements belong to the same basic number field r as those of A and B.e.. I I For a nonsingular operator : 1) Ax = o implies that x = o . then B is similar to A) . For a singular operator: 1) There always exists a vector x o such that Ax = o. a linear operator in R is singular or nonsingular depending on whether its defect is positive or zero. In studying properties of a linear operator in R.
... + (ann . . ...... ant . Other terms for the latter are: proper value.A) x1 + a12x2 + . e.. . aln a2n a22 .. + a2nxn = 0 (55) anlx1 + an2x2 + .... of the system be zero : all . ann ..t In order to find the characteristic values and characteristic vectors of an operator A we choose an arbitrary basis e1.+ a2nxn = Ax2 .§ 7..2) x a = 0 Since the required vector must not be the null vector. latent vector. . e2... An important role in the study of the structure of a linear operator A in R is played by the vectors x for which Ax=Ax (lei... latent number.. (54) anlx1 + an2x2 + .I . x must be different from zero... eigenvalue.. Characteristic Values and Characteristic Vectors of a Linear Operator 1. CHARACTERISTIC VALUES AND CHARACTERISTIC VECTORS 69 § 7. eigenvector... + alnx = O a21x1 + (a22 _ 2)x2 + ....... + alnxr.lxn. we obtain a system of scalar equations + a12x2 + .... ..A a21 a12 ..... .. characteristic number... = Ax1 a11x1 a21x1 + a22x2 + . at least one of its . Let x { x{e3 and let A= 11 ack 11 71 be the matrix corresponding to A in the basis e1... x2....A t Other terms in use for the former are: proper vector. + annxn = . e2.. xro) (53) Such vectors are called characteristic vectors and the numbers A corresponding to them are called characteristic values or characteristic roots of the operator A (or of the matrix A)... and .. latent root. etc. which can also be written as (all . ..... Then if we equate the corresponding coordinates of the vectors on the lefthand and righthand sides of (53)... latent value. e in R. In order that the system of linear homogeneous equations (55) should have a nonzero solution it is necessary and sufficient that the determinant coordinates x.....
. From what we have shown. if a number I is a root of (56). LINEAR OPERATORS IN AN nDIMENSIONAL VECTOR SPACE The equation (56) is an algebraic equation of degree n in A. mechanics.. x.e. sin_Pf. according to which an algebraic equation (56) in the field of complex numbers always has at least one root. . say. every characteristic value A of a linear operator A is a root of the characteristic equation (56)..1 i (p1... a field that contains the roots of all algebraic equations with coefficients in the field..70 III. and physics and is known as the characteristic equation or the secular equation's of the matrix A = II ask I11 (the lefthand side is called the characteristic polynomial).e. then for this value A the system (55) and hence (54) has a nonzero solution x1i X2. And conversely._PA. S. astronomy. in general. Thus.. then every linear operator in R always has at least one characteristic vector in R corresponding to a characteristic value A.. off1A. 20 The power (1)4P occurs only in those terms of the characteristic determinant (56) that contain precisely n . it follows that every linear operator A in R has not more than n distinct characteristic values.. Equation (56) occurs in various problems of geometry. S.20 In particular. affil . The product of these diagonal elements occurs in the expansion of the determinant (56) . If r is the field of complex numbers.. i. A is similar to A : 18 The name is due to the fact that this equation occurs in the study of secular perturbations of the planets.19 This follows from the fundamental theorem of algebra. .p of the diagonal elements. n) . to this number I there corresponds a characteristic vector x = I xiei of the operator A. Let us write (56) in explicit form (57) It is easy to see that here Si = atr Ss = A (i k (58) and.. 2.A. _ I A 1. i. . .. is the sum of the principal minors of order p of the matrix A = Ij afk . 19 This proposition is valid even in the more general case in which r is an arbitrary algebraically closed field. Its coefficients belong to the same number field F as the elements of the matrix A = 11 a{k lift. We denote by A the matrix corresponding to the same operator A in another basis. .
. n .. This polynomial is sometimes called the characteristic polynomial of the operator A and is denoted by I A . However. j. y. A(ax+PY+Yz+. . The significance of the characteristic vectors and characteristic numbers for the study of linear operators will be illustrated in the next section by the example of operators of simple structure. z. . similar matrices A and j have the same characteristic polynomial. y. of (A)"p the sum of all principal minors of order p in A. are arbitrary numbers of F.. if characteristic vectors of a linear operator A correspond to distinct characteristic values... . jn_p of n .. . . . In other words. CHARACTERISTIC VALUES AND CHARACTERISTIC VECTORS 71 Hence A= T1 AT. ...)=. then the vector ax + fly + ya + . .AEI. . . ip A) `4 i1 i2 .... P. where i. j. . ... . in general. .. i. ip + When we take all possible combinations j. .§ 7... 2..A) .1(ax+fY+Yz+ . Ax=Ax. 2. then a linear combination of these characteristic vectors is not. a characteristic vector of A. is either equal to zero or is also a characteristic vector of A corresponding to the same A. a `characteristic' direction. i. ..... (59) and therefore Thus. .).. we obtain for the coefficient . For from it follows that Ax=Ax... lE=T1(AAE)T 1AAEI =JA .. and a. ip it t= . n.i.. (ai fl is .. Ay =Ay. ip) forms a complete set of indices 1.. each characteristic vector generates a onedimensional subspace..p of the indices 1. with a factor in which the term free of A is the principal minor A 1 iy . together with j. hence in the development of (56) we have A AEI _ (aim A) (aj. linearly independent characteristic vectors corresponding to one and the same characteristic value A form a basis of a `characteristic' subspace each vector of which is a characteristic vector for the same A.. In particular.S. are linearly independent characteristic vectors of an operator A corresponding to one and the same characteristic A...AE If x..
DEFINITION 11: A linear operator A in R is said to be an operator of simple structure if A has n linearly independent characteristic vectors in R. we are led to the following equation: Cfa 2m1) (Am.. a linear operator in R has simple structure if all the roots of the characteristic equation are distinct and belong to F. i1 (61) Applying the operator A to both sides we obtain : XCAX. Proof.= o. i. If the characteristic equation of an operator has n distinct roots and these roots belong to F. Since any of the summands in (61) can be put last.=cam. Thus. A. x n .. At for i. We begin with the following lemma.. .. k. Let Axi=.2. i1 (62) We multiply both sides of (61) by At and subtract (61) from (62) term by term. we have in (61) c1ca=. If we apply the operators A22E.. to (63) term by term..72 III. these condi .. However. where n is the dimension of R.. ..A1) xm= 0 so that c. Linear Operators of Simple Structure 1.e.m) (60) cixi=o.. (A. x2. This proves the lemma. i. characLEMMA: Characteristic vectors belonging to pairwise distinct teristic values are always linearly independent..k=1.=0.42) .=o. LINEAR OPERATORS IN AN nDIMENSIONAL VECTOR SPACE § 8. then by the lemma the characteristic vectors belonging to these roots are linearly independent. Then we obtain in Eci(li11)x. i_2 (63) We can say that (63) is obtained from (61) by termwise application of the operator A21E. = 0.. there is no linear dependence among the vectors x1.lixi and (xi o.
22i ... there corresponds the diagonal matrix A = I A:Bck 117 If we denote by A the matrix corresponding to A in an arbitrary basis e1. .. 2. It is easy to see that to the operator A in a `characteristic' basis g1. Let us consider an arbitrary linear operator A of simple structure. 68) to a diagonal matrix is called a matrix of simple structure. to an operator of simple structure there corresponds in any basis a matrix of simple structure. g2. . .. Thus.. i.. .' xkgk k1 then n n Ax = E xxAgk =. . g. .E Akxkgk k1 k1 The effect of the operator A of simple structure on the vector x = I xkgk k1 n may be put into words as follows : In the ndimensional space R there exist n linearly independent 'directions' along which the operator A of simple structure realizes a 'dilatation' with coefficients A. .. e2.. The matrix T is called the fundamental matrix for A. . 2. We denote by g1. . e2. .. . and vice versa.. n). A. The matrix T in (64) realizes the transition from the basis e1. .e. . .§ 8.. e. then A =I II 2j8& II i 7'. These components are subject to the corresponding 'dilatations' and their sum then gives the vector Ax.. g2. . If x =.. g2.. (64) A matrix that is similar (p.. .. There exist linear operators of simple structure whose characteristic polynomial has multiple roots. g a basis of R consisting of characteristic vectors of the operator... . The kth column of T contains the coordinates of a characteristic vector gk (with respect to e1.. g.. . e# to the basis g1. An arbitrary vector x may be decomposed into components along these characteristic directions. LINEAR OPERATORS OF SIMPLE STRUCTURE 73 tions are not necessary. e. that corresponds to the characteristic value 2k of A (k = 1. .
(66) An arbitrary matrix A = 11 aik I+i may be represented in the form of a sequence of matrices A.. The characteristic values Ar'">.Ak1 A1. has simple structure... we deduce from Theorem 3: . . LINEAR OPERATORS IN AN nDIMENSIONAL VECTOR SPACE We rewrite (64) as follows: A =TLT1 (L = (Al. (1Sk1<ks<. of A. .. . < ip < n) of p of the characteristic values A. kp ip) (1Si1<i2<. Hence limAk7tA 111 100 A =Ak.... A2.. ..1e. § 4) : 2p is a diagonal matrix of order N (N=(P")) along whose main diagonal are all the possible products of Al. taken p at a time. . Moreover. A")) . we obtain (see Chapter I. . lim Ak'" = Ak my00 . then for every p < n the compound matrix 91p also has simple structure. n)..#00 1p = q(p . . A. . (64') On going over to the pth compound matrices (1 < p < n). .. 2.<kp5n)..74 III. . A$"'). Atp the characteristic values of vtp are all the possible products 2 (1 C i. A. (k = 1... COROLLARY: If a characteristic value Ak of a matrix of simple structure A = 11 aik II. A2. (m 4 oo) each of which does not have multiple characteristic values and. and the fundamental matrix of stp is the compound Zp of the fundaA mental matrix T of A.... of A. A comparison of (65) with (64') yields the following theorem: THEOREM 3: If a matrix A = II aik II i has simple structure.<ipSn). T('1 ti$ kl ke . A. (1:5 k. since lim . < i2 < . . 2. n) and if T = II t{k II. A2.. therefore.. < kp < n) of S1p corresponds to the characteristic vector with coordinates tlk..... . t2k. t"k (k = 1... . of the matrix A.. corresponds to a characteristic vector with the coordinates . then the characteristic value 4142 Ak.. converge for m oo to the characteristic values A. .. < k2 < . moreover.
taken p at a time (p =1. then a complete systent of characteristic values of the compound matrix Wp consists of all possible products of the numbers A. ). .. In the present section we have investigated operators and matrices of simple structure. is a complete system of characteristic values of an arbitrary matrix A. n). LINEAR OPERATORS OF SIMPLE STRUCTURE 75 THEOREM 4 (Kronecker) : If Al. . . ... 2. 22i . The study of the structure of operators and matrices of general type will be resumed in Chapters VI and VII.§ S.. A.. 22.... . .
x)fm+aik)2m1 + . In contrast to a matrix polynomial an ordinary polynomial with scalar coefficients will be called a scalar polynomial. § I. These polynomials play an important role in various problems of the theory of matrices.)=A02m+Allm1+ where Ai = Il air '{i . For example. In the present chapter.. 0. the properties of the characteristic polynomial and the minimal polynomial are studied. The polynomial (1) is called regular if i Ao 1 0. A polynomial with matrix coefficients will sometimes be called a matrix polynomial. We consider a square polynomial matrix A(A).CHAPTER IV THE CHARACTERISTIC POLYNOMIAL AND THE MINIMAL POLYNOMIAL OF A MATRIX Two polynomials are associated with every square matrix: the characteristic polynomial and the minimal polynomial. 76 . (3) The number m is called the degree of the polynomial. 1. .e. which we shall introduce in the next chapter. m) .. +Am.. A prerequisite to this investigation is some basic information about polynomials with matrix coefficients and operations on them. The number n is called the order of the polynomial. a square matrix whose elements are polynomials in A (with coefficients in the given number field F) A ('1)=11 ait(2) Ili =11a. i. (1) The matrix A(A) can be represented in the form of a polynomial with matrix coefficients arranged with respect to the powers of A : A(I. the concept of a function of a matrix. ... (2) (9 =0. Addition and Multiplication of Matrix Polynomials 1. will be based entirely on the concept of the minimal polynomial. provided A...
l) be two matrix polynomials of the same order n and of respective degrees m and p : A(2) = A02m + A1RmI + .e. Right and Left Division of Matrix Polynomials 1. in (4) the product AoBo may be the null matrix even though A0 0. Let A (A) and B (.lm+Bllm1+..+Am (A0 O). Bo 0.. i. For.. B(A)=BO.. 0 that AoBo 0. We denote by m the larger of their degrees.. Let two matrix polynomials A (A) and B (A) of the same order be given. If at least one of the two factors is regular. + Bp (I BO 10) . if at least one of the matrices Ao and Bo is nonsingular. These polynomials can be written in the form A(A)=A0Am+Al2m1+. In contrast to the product of scalar polynomials.e. a different polynomial.+Bp (Ao (Bo O) . in general. B(A) = BOAP + B1AP 1 + . However.+Am. +Bm. + ArBp. Let A (A) and B (A) be two matrix polynomials of the same order n. and let B(A) be regular: A(A)=ADAm+AlAm1+. the product (4) of matrix polynomials may have a degree less than m + p. interchange the order of the factors).. B(A)=BOAp+B1ZP1 +. The multiplication of matrix polynomials has a specific property. 0).. § 2. i.+(Am±Bm). . Thus: The product of two matrix polynomials is a polynomial whose degree is less than or equal to the sum of the degrees of the factors..§ 2... then we obtain..e.... Then A(A)±B(1)=(Ao±BO)Am+(A1+B1)2m1+.. If we multiply B(A) by A(2) (i. + A.. then it follows from Ao 0 and B.. RIGHT AND LEFT DIVISION OF MATRIX POLYNOMIALS 77 We shall now consider the fundamental operations on matrix polynomials.. (4) Then A (A) B (A) =AOB01m+P + (AOB1 + A1B0) Am+P1 + .: The sum (difference) of two matrix polynomials of the same order can be represented in the form of a polynomial whose degree does not exceed the larger of the degrees of the given polynomials. 2. then the degree of the product is always equal to the sum of the degrees of the factors. less than the sum of the degrees of the factors.
We `divide' the highest term of the dividend Ao2m by the highest term of the divisor Bo2P.1 Am(1)P B(A) + A(2)(A) . coincide with Q(1) and R(1).. then we repeat the process and obtain : A(l)(2) =A(') B. in general.) is less than that of B(1). . when the right quotient and the right remainder are to be found) in (5) the quotient Q(1) is multiplied by the `divisor' B(1) on the right. we shall call the polynomials. If m < p. Similarly.. provided the divisor is a regular polynomial. Let us consider the right division of A(1) by B(A)... we apply the usual scheme for the division of a polynomial by a polynomial in order to find the quotient Q(1) and the remainder R (A). CHARACTERISTIC AND MINIMAL POLYNOMIAL OF A MATRIX We shall say that the matrix polynomials Q(1) and R(1) are the right quotient and the right remainder. The polynomials Q(1) and R(1) do not.e. 2. respectively. Q(1) and R(1) the left quotient and the left remainder of A (1) on division by B (1) if A (A) = B (A) Q (1) + R (A) (6) and if the degree of R (1) is less than that of B (1). A(2)(A) = A(02) lm(2) + (m(2) < m(l)). of A(1) on division by B(1) if A (A)= Q (1) B (A) + B (A) (5) and if the degree of R(1. (8) If m(')zP.78 IV. (7) The degree m(l) of A(1)(1) is less than m : A(')(A) _ A () Am( + . If m ? p. we can set Q (1) = 0. R (1) = A (A). Thus we find the `first remainder' AM(2): A(A) = AOBB'Am P B(A) + A(1)(A). We obtain the highest term AOBo11 of the required quotient. The reader should note that in the `right' division (i. (Aal) 0. mil) < m) . We multiply this term on the right by the divisor B(1) and subtract the product so obtained from A(1). . We shall now show that both right and left division of matrix polynomials of the same order are always possible and unique. (9) etc. and in the `left' division in (6) the quotient Q(1) is multiplied by the divisor B(1) on the left..
Q * (A) = 0. the left division of A(A) by B(A) is unique. Subtracting (11) from (12) term by term we obtain [Q (A)Q*(A)]B(A)=R*(A)B(A).. then the right division of AT(A) by BT(A) would not be unique. (The regularity of B (A) implies that of BT (A). (12) where the degrees of R(A) and R*(A) are less than that of B(A). less than p. p. R (A) = B* (A) The existence and uniqueness of the left quotient and left remainder is established similarly. . A(2)(A). for if it were not. (13) If we had Q (A) . i. RIGHT AND LEFT DIVISION OF MATRIX POLYNOMIALS 79 arrive at a remainder R(A) whose degree is less than p. i.' 1 Note that the possibility and uniqueness of the left division of A (1) by B(A) follows from that of the right division of the transposed matrices AT(A) and BT(A). and would therefore be at least equal to p. . Q (A) . A(')(A).. Q (A) = Q* (A).Q * (A) * 0. and then it follows from (13) that R(A) R*(A) = 0. where Q(A) =A0Bo1 AP + A(1)B0' A'(')P + . (10) We shall now prove the uniqueness of the right division.R(A).. Then it follows Since the degrees of A(A). Suppose we have simultaneously and A(A) = Q(A) B (A) + R(A) (11) A(A) = Q*(A) B(A) + R*(A).. because Bo 10. at some stage we from (7) and (9) that A (A) = Q (A)B (A) + .e.Q *(A). Thus.) For from AT(A) =Q1(1) BT(A) + RI(A) it follows (see Chapter I. This is impossible. then the degree on the lefthand side of (13) would be the sum of the degrees of B (2) and Q (A) . 19) that A(A) = B (A) QT(A) + RI(A) (61) By the same reasoning.e. since the degree of the polynomial on the righthand side of (13) is less than p.§ 2.. R(A)=RT (A) n . Comparison of (6) and (6') gives Q(A)=QT(A).. decrease.
134lf21111'+111 = B (A) = 111 °lRa+11' oil. CHARACTERISTIC AND MINIMAL POLYNOMIAL OF A MATRIX Example..O). AoBl112 IS + A AoB1B(A)11A=+1 312+121i' 223 + 131 1  2A3 + 12 II A'2A'+ 1 3A 311+1 A3 +A IIIIA'+A 11' 313 +12A II II 211A+ 1 111 .. The Generalized Bezout Theorem 1.13A A(1) (1) = II  0 4111°+ 11111111 + 11 1 0 1(o1)x18(1)112 B (A) = A(') (1) . A(A)IIA32A2+ 1 Ao S 2A' +'1' 3A' + A jI. (14) .80 IV.A(O1)Bo'B (A) 1 2 123 1 At 65111 2 1'+2!1=1121'4 11212 1+1 _3A A2}65 I 11 13A5 32 1311111112134 +5 112+6 51+2 61211 11 2211=112111+1 e(A)=AoBolA+AV>Bo1=112 5111+112 As an exercise. We consider an arbitrary matrix polynomial of order n F(1)=F0A'"+FFR'm1+. (Fo. § 3.+F..1il 511. A'1 2 21'+3 A'+ 211.Bo1'11 1(1'(1) 1I. Bof=1. 11 I_+ 111 211' B. the reader should verify that A(A) =Q(A)B(A) + R(2).
.. in general.A) +FOAm+F1Am1+ .l) by the binomial AE. be distinct. (19) THEOREM 1 (The Generalized Bezout Theorem) : When the matrix polynomial F(R) is divided on the right by the binomial AE . in the `left' value F(A).+Fm F(A)=ArFO+Am1F1+.A... since the powers of A need not be permutable with the matrix coefficients F0.. . However... the remainder is F(A).2 We divide F (..n11 PE . Thus we have found that R`FOAm+F1Am1+ .. Similarly This proves (18) R=F(A). (15) For a scalar A.+pal. _ [FOAM1 + (FOA + F1) Am2 + . + F.... THE GENERALIZED BEZOUT THEOREM 81 This polynomial can also be written as follows: F(A)=2mF0+2m1F1+. then the results of the substitution in (14) and (15) will. +Fm. when it is divided on the left... [Fo2m1+ (FoA+F1) 2m21 (AEA)+(FOA2+F1A+F2)dm2+Fe2m$+. at the left..F1i.. 2In the 'right' value F(A) the powers of A are at the right of the coefficients.Fm. To determine the right remainder we use the usual division scheme : P(A) =FO2m+F12m1 + .. We set and F(A)=FAm+FiAm1+.+Fm.. (16) (17) and call F(A) the right value and F(A) the left value of F(2) on substitution of A for . . + FOAm1 + F1Am2 + .1. +Fm=F(A)..A. the remainder is I`(A).. both ways of writing give the same result. +Fm =F0Am1(AE A) + (FOA + F1) Am1 +F22m2 +.. In this case the right remainder R(A) and left remainder R(A) will not depend on A.§ 3. if we substitute for the scalar argument A a square matrix A of order n.
§ 4. § 7.. where bik (A) is the algebraic complement of the element 18ik . This follows immediately from the generalized Bezout Theorem. 13 . CHARACTERISTIC AND MINIMAL POLYNOMIAL OF A MATRIX 2.a83 A(1)=I2EA I=13(a3i+a82+a33)12+ .11 The matrix B(1) = 11 ba (1) 11*. From this theorem it follows that: A polynomial F(1) is divisible by the binomial AE . We consider a matrix A = II aik 11 77 . The characteristic matrix of A is 1E ..k I'.ass . The Characteristic Polynomial of a Matrix.12 + a21a32 .aik in the determinant d (A) is called the ad joint matrix of A.A on the right (left) without remainder if and only if F(A) = 0 (F(A) = 0). The determinant of the characteristic matrix A(1)= I2EA I =I18aaoli is a scalar polynomial in A and is called the characteristic polynomial of A (see Chapter III.alsI! ate ' .)1 + a2saa9 .(a23 + a. Example. Then F(A) = f(1)Ef(A) is divisible by AE .82 IV.. Let A= II a. § 7). because in this case P(A) =F(A) =0.12 + a2aS1 .A. By way of example.. The Adjoint Matrix 1.a22a3t 8 This polynomial differs by the factor (1)" from the polynomial A(k) introduced in Chapter III. 1EA= 1a23 .a2iasa a.a23ass B(1) = a.A (both on the right and on the left) without remainder. for the matrix all a12 A= we have : a13 a23 as9 a. . and let f (1) be a polynomial in A. a81 a22 a33 a..
.. The polynomial matrix B(A) can also be represented in the form of a polynomial arranged with respect to the powers of A. We wish to find the characteristic values of g (A) . A(A)=0.phE) (A . CHARACTERISTIC POLYNOMIAL OF A MATRIX.nag) . ADJOINT MATRIX 83 These definitions imply the following identities in A : (AE ..ph) (# . B (1) (AE A) =A (A) E. we find (24) Passing to determinants on both sides of (24) and using (22) and (23) .J (A) E. 2 (21) 11 311 ' A('1)= A2 1 _ . (AA»). all the roots of the characteristic polynomial A (A) (each A{ is repeated as often as its multiplicity as a root of A(A) requires).§ 4.. i. Thus we have proved : THEOREM 2 (HamiltonCayley) : Every square matrix A satisfies its characteristic equation. 1 A(A)=A 5A+72= 5 3 51 8I 5 I1 3'I+7I 2.e. (22) Let g(µ) be an arbitrary scalar polynomial. We denote by Al. Example. all the characteristic values of A. By the Generalized Be'zout Theorem.e. A2. ... (20) (20') The righthand sides of these equations can be regarded as polynomials with matrix coefficients (each of these coefficients is the product of a scalar and the unit matrix E).)(AA=) .4u E) ..1_3(==A 5A+7. i. A. For this purpose we split g (y) into linear factors g (µ) = ao (p ..pas) . (p (23) On both sides of this identity we substitute the matrix A for ju : g (A) . Equations (20) and (20') show that A (A)E is divisible on the right and on the left by AE A without remainder. (A .A) B (A) =.. Then A(1)=IARAI =(AA. this is only possible when the remainder A (A) E = A (A) is the null matrix.ao (A . .
) 3. . we find : SAEg(A)I=[Ag(Aj)] [Ag(A$)] ..A). 1.. 9(A.IAµe91 _ (1)"'a"od (Pi) A (4u2) . Since by the HamiltonCayley Theorem A (A) = 0. A2.E.. if A has the characteristic values A.At) = g (Ai) g (As) . 90. A (u ) {1)"t4 _ If in the equation (i k1 rl j1 (µc .. then g(Al). where A is some parameter. A. (27) The difference d (A) ..... (26) This leads to the following theorem...p21"2 .) we replace the polynomial g (u) by A .. d (A) E= a (AE.. .. CHARACTERISTIC AND MINIMAL POLYNOMIAL OF A MATRIX g(A)1=aolAy1EI IA# 9I.....d (µ) is divisible by A .84 IV. A2..u).pp2"1 . (31) . THEOREM 3: If Al.. The identity A(A)d(fu)=6(A. 2. then Ak has the characteristic values Ak.&) (29) will still hold if we replace A and jz by the permutable matrices AE and A. _P. we obtain by virtue of the uniqueness of the quotient the required formula B(A)=6(A. AQ..g (. fu)(A/.. In particular. A) (AE ... A. g (25) 19 (A) I = 9 (Al) 9 (As) . A" are all the characteristic values (with the proper multiplicities) of a matrix A and if g(µ) is a scalar polynomial. . . We shall now derive an effective formula expressing the adjoint matrix B (A) in terms of the characteristic polynomial d(A). (30) Comparing (20') with (30). .. Let A (A) = I" . ..." (k = 0..) are the characteristic values of g(A).. µ) =d (28 ) is a polynomial in A and y..µ without remainder Therefore e (A. A). g(4). .
r' + B112 + B._. Br= Ak . then V. 2.. . If we substitute in (35) the expression for B. but contains this theorem implicitly. there correspond d.. (37) 0 and denote by b an arbitrary nonzero Let us assume that B (Ao) column of this matrix. .do is the rank of a. . .. 1 (36) Let Ao be a characteristic value of A. If A is nonsingular...) the elements of any two columns are proportional. linearly independent characteristic vectors (n . if only one characteristic direction corresponds to X.p2E.= (. . Substituting the value A.. This approach to the require the Generalized Theorem explicitly. then the rank of B (X._1 can be computed in succession.p1E.rs + . . If to the characteristic value X. (34) (35) The relations (34) and (35) follow immediately from (20) if we equate the coefficients of equal powers of A on both sides.p ' The matrices B1...2. we find : (A EA)B(A)=0. § 7.) does not exceed do. we obtain A(A) = 0. given in Theorem does not (33). (38) Therefore every nonzero column of B (Ao) determines a characteristic vector corresponding to the characteristic value Ao. .I.1)111 JA 1:pA0.p1AkI .p2Ax2 . (33) and. Then from (37) we have (A0E .p1A . where (32) B1= A .§ 4. ADJOINT MATRIX 85 Hence by (28) B (A) = E. B. n 1.E . n 1) . . then in B(X. .s Thus : ' From (34) follows (33). B2 = Aa . In particular. .p +. in general.' (k= 1.. + B. (lc=1.. B2. .. and it follows from (35) that A' = Pa B.. . .A ).A) b = 0 or Ab = lob... in (20). so that A (Ao) = 0. 5 See Chapter III. CHARACTERISTIC POLYNOMIAL OF A MATRIX.. Moreover. starting from the recurrence relation Bt = A Bl1. B0 = E) .
A)=A'B+A(A4E)+A'4A+ BE. The first column of the matrix B (+l) gives the characteristic vector (+1. 4(1)=(11)2 (12). If the given matrix A is nonsingular.p)=4()µ. +1. +1. Example.(p)=A'+A B1 0B. +1) corresponding to the characteristic value A= 2. 2 1 A= JA2 0 1 1 0 1 1 1 1 . A'= 2 B:= 2 1 1 1 1 _ 1 2 2 Furthermore. then the inverse matrix A' can be found by formula (36). The first column of the matrix B(+2) gives the characteristic vector (0. But B1=A4B= 2 1 0 3 1 1 3 1'21 1 1 B. 1 A1 4 a(1. . then the nonzero columns of B (10) are characteristic vectors of A for 1=10. then the adjoint matrix can be found by formula (31). CHARACTERISTIC AND MINIMAL POLYNOMIAL OF A MATRIX If the coefficients of the characteristic polynomial are known.86 IV. 1 1 21 1 1 =A'42 +5A2. If 1o is a characteristic value of A.=AB1+5E= 1 0 2 2 3 2 1 1 2 B (A) _ A. B(A)=8(AB.3A+3 A2 A+1 A1 As3A+2 1 1 A+2 A2 1 0 1 3 JA 1=2. 0) for the characteristic value A = 1.
. of computing the coefficients of the characteristic polynomial.p18k_1 .P2A 2 . Vol. .. Textbook of Algebra. . 436ff. .. . . of the ad joint matrix B(A). n). 2. THE METHOD OF FADDEEV 87 § 5. due to A. . ... . .. the traces of certain other matrices A1.. . B2. 1. then the coefficients p1. A2. By the trace tr A of a matrix A = !I as II1 we mean the sum of the diagonal elements of the matrix : N tr A =Za{{. I.pA (39) and of the matrix coefficients B1.. instead of the traces of the powers A.. if trA=p1= (41) A(A)=(AAl)(A. i. 8 See. we shall discuss another effective method. .). are the characteristic values of A.p1A"1 . can be determined from (44). . .. 160. Faddeev has proposed to compute successively. K. p2.2. .... N. Chrystal.e.wehave Since by Theorem 3 Ak has the characteristic values A1. (42) (k=0.. pp.. All.. This is the method of Leverrier for the determination of the coefficients of the characteristic polynomial from the traces of the powers of the matrix. 82... A2. for example. 6 See [14]. (43) The sums sk (k =1. The Method of Faddeev for the Simultaneous Computation of the Coefficients of the Characteristic Polynomial and of the Adjoint Matrix 1. n) of powers of the roots of the polynomial (39) are connected with the coefficients by Newton's formulas8 Jcpk= 8 k .. A2. i1 .. .. .§ 5. . In order to explain the method of Faddeev' we introduce the concept of the trace (or spur) of a matrix.... A.pk_181 (k 1. . 2.'2) . (2A"). . .. p. . p2.). Krylov. . .. A.. D. A2.... All are computed. It is easy to see that i1 (40) if Al. § 8. p" of the characteristic polynomial A (A) = An . An tr Ak= 8k =. (44) If the traces sl. 7 In Chapter VII. s" of the matrices A. 2. . A2. p. 1.. Faddeeve has suggested a method for the simultaneous determination of the scalar coefficients pl. ... G. . A{ +1 " (k= 0. 2.
._ p18k1.. pn and the that are determined successively by (45) are. An_1= ABn2. B2. . B2. the coefficients of A(2) and B(A).. in matrices B1. 2.. B1= A4E= 2 1 0 3 1 1 3 1 1 1 1 2 0 1 2 2 4 0 3 1 4 9 As a check on the computation. p.9 21 0 A= 1 1 1 1 1 1 2 1 0 1 1 1 . B. As a row whose elements are the sums of the elements above it.. CHARACTERISTIC AND MINIMAL POLYNOMIAL OF A MATRIX and so to determine pi.. As. . = A.. p8 ...88 IV. formulas (45) also deterof the matrix polynomial B(A).pzI`'. mine the coefficients B1.pniE. we obtain kpk= 8k. . . In order to convince ourselves that the numbers p1. p2. . p2. .. pn determined by (45) are also the coefficients of A(A). Therefore the numbers pl. But then the second of formulas (46) coincide of the with formulas (33) by which the matrix coefficients B1. (45) tr A z. . Exomple. Therefore.Pki8i But these formulas coincide with Newton's formulas (44) by which the coefficients of the characteristic polynomial d(al) are determined successively. by the following formulas : A1= A. we write under each matrix A. . adjoint matrix B(A) are determined.... A$ = AB1. . B2. p1= trA=4. An=ABn_1. .. . p1= tr A1.. p2. . we note that the following formulas for Ak and Bk (k = 1. 1 1 tr An1. . fact.p1Ak1.. Bn=AnpnE=0 0 may be used to check the computation.. . . .. = n trA.p1E. . B2. The product of this row of 'columnsums' of the first factor into the columns of the second factor must give the elements of the columnsum of the product. . p and B1.. n) follow from (45) : Ak = Ak  p1Ak1. B2 = A$ .pk_lA. Bn_1= An_1. .. ...Pn_1 = n  B1= A1. . Bk =Ak . . The last equation B. .. .pk1A prE (46) Equating the traces on the lefthand and righthand sides of the first of these formulas.
p4 2 1 2 0 2 2 1 2 1 1 Note.=3 trA.=2trA. in A3 only the elements of the first column. B2=A2+2E= 3 1 02 2 0 1 0 2 4 A. However. it is not. it is sufficient to compute in A2 the elements of the first column and only the diagonal elements of the remaining columns. J (1)=1'41'+211+51+2. The Minimal Polynomial of a Matrix 1. .=2.(1)q(A)+r(1). If we wish to determine p1. in general.§ 6. An annihilating polynomial tp(A) of least degree with highest coefficient 1 is called a minimal polynomial of A. § 6.=2. as we shall show below. MINIMAL POLYNOMIAL OF MATRIX 89 3 4 03 122 A. DEFINITION 1: A scalar polynomial f (1) is called an annihilating polynomial of the square matrix A if f(A) =0. A'I=1 B. 5 7 J A. p.=AB.=AB1 = 1 1 4 03 331 1 2 025 p.=5. B2._2. and in A4 only the first two elements of the first column. p2. Let us divide an arbitrary annihilating polynomial f (2) by a minimal polynomial f(1)=y. p4 and only the first columns of B1. Bs=A. B3.= 173 4 p.+5E= 5154 5 2 02 5 17 9 0 05 331 3 2 1 0 427 17 2 4 0 422 5 2 4 0 2 2 A4=ABs= 0 2 0 0 0 0 2 0 0 0 0 2 0 1 1 0 0 0 . a minimal polynomial. By the HamiltonCayley Theorem the characteristic polynomial d (A) is an annihilating polynomial of A. p3.
We shall now derive a formula connecting the minimal polynomial with the characteristic polynomial. Let lpl (A) and lp2 (A) be two minimal polynomials of one and the same matrix.e. of all the elements (see the preceding section). We denote by Ds1 (A) the greatest common divisor of all the minors of order n .. The factor D. (A):" (49) where V(A) is some polynomial.p(A)E=(AEA)C(A). the polynomials differ by a constant factor.t)(lEA).. apart from (50).A. . This constant factor must be 1.e.IL(1) = Y' (A).90 IV.1 of the characteristic matrix AE . But the degree of r(A) is less than that of the minimal polynomial ip(A). (48) Hence it follows that A (A) is divisible without remainder by D. because the highest coefficients in VYI(A) and tp2(A) are 1.(A) in (48) may be cancelled on both sides :12 .A.. CHARACTERISTIC AND MINIMAL POLYNOMIAL OF A MATRIX where the degree of r(A) is less than that of zp(A). 11 We could also verify this immediately by expanding the characteristic determinant d (. the `reduced' ad joint matrix of 1E . also the identity (see (201)) y(d)E=C(. it follows that r(A) = 0.A. 12 In this case we have.A) C (A) D _1(A) .10 Hence: Every annihilating polynomial of a matrix is divisible without remainder by the minimal polynomial. 2. Hence we have: f (A) = +p (A) q (A) + r (A). Then each is divisible without remainder by the other.e. Then of the matrix B (A) = I I b{k (A) II 1 B (A) = D. C(A) is at one and the same time the left quotient and right quotient of lp(A)E on division by AR . Since f (A) = 0 and ip(A) = 0. (47) where C (A) is a certain polynomial matrix.. Therefore r(A) =0. From (20) and (47) we have : A (A) E = (AE . i.IL (A) C (A).l) with respect to the elements of an arbitrary row. i. i. (50) 10 Otherwise there would exist an annihilating polynomial of degree less than that of the minimal polynomial. Thus we have proved the uniqueness of the minimal polynomial of a given matrix A.
e.A without remainder : V* (1) E _ (1E A) C* (1). and this is what we had to prove. 84) : C (A) _ 7(AE.§ 6. A).. the polynomial V(A) defined by (49) is an annihilating polynomial of A. . (A) is divisible by .* (A) X (A) (51) Since 1V* (A) = 0. We denote the minimal polynomial by 1V*(A).*(A). (53) The identities (50) and (53) show that C(A) as well as C*(1)x(1) are left quotients of ip(A) E on division by 1E . MINIMAL POLYNOMIAL OF MATRIX 91 Since tp (A) E is divisible on the left without remainder by AE . For the reduced adjoint matrix C(A) we have a formula analogous to (31) (p.(A) y. Let us show that it is the minimal polynomial. p) is defined by the equation" 13 Formula (55) can be deduced in the same way as (31). Thus.(µ) _ (A p) (2. e) we substitute for A and p the matrices AE and A and compare the matrix equation so obtained with (50). We have established the following formula for the minimal polynomial : (54) 3. it follows by the Generalized Be'zout Theorem that V(A) =0.V* (A) without remainder : V (1) = Y. i.A) C* (1) X (A). Hence it follows that x (2) is a common divisor of all the elements of the polynomial matrix C(A). Since the highest coefficients of TV(A) and iV*(A) are equal. (55) where the polynomial W(A.A. By the uniqueness of division C (1) = C* (A) X (A). Then y. we have in (51) x(A) = 1. (52) From (51) and (52) it follows that +V (A) E = (1E . On both sides of the identity y. the greatest common divisor of all the elements of the reduced adjoint matrix C (A) is equal to 1. ip(A) =y.A. because the matrix was obtained from B (A) by division by D1(A) . Therefore x(1) =const. on the other hand. by the Generalized Bezout Theorem the matrix polynomial p (A) E is divisible on the left by AE . But.
92 IV. 4(A) is divisible without remainder by W(A) and some power of W(A) is divisible without remainder by d (A). Then W(Ao) = 0 and therefore. Example. Ac = Aoc.>0. . A (A) I C (A) I = [+v (A)]". the sets of all the distinct roots of the polynomials d (A) and 9) (A) are equal. a). i. 2.A. (A) = (AA1)+".. (59) then (60) where 0<mk:nk (k = 1. every nonzero column of C(A0) (and such a column always exists) determines a characteristic vector for A = Ao. (61) 4. .AX' Al for i j. by (57). i.)m. (A2$)"a . . We denote by c an arbitrary nonzero column of C(A.e. CHARACTERISTIC AND MINIMAL POLYNOMIAL OF A MATRIX Moreover.Ao.... Then from (62) (4E A) c =o. (56) (57) Going over to determinants on both sides of (57). (62) Note that C (Ao) . j=1. We mention one further property of the matrix C(A). and this is impossible. ... k (AEA)C(A)= o(A)E. we obtain (58) Thus.. (20EA) C (Ao) = 0. 2. 0 always holds. . for otherwise alt the elements of the reduced adjoint matrix C (A) would be divisible without remainder by A . . 8).. . (63) In other words. 3 3 2 A= 1 5 2 1 3 0 23 (A) 1 25 3 2 2 . Let Ao be an arbitrary characteristic value of A = II a{k !I. If (A.). A (A) _ (AA1)"' (AAE)". n. In other words : All the distinct characteristic values of A are roots of W(A). (A. . (A.
D2(A) can only have 2 and 4 as its roots. we find from the first column of the matrix C (4) the characteristic vector (1.§ 6.8 + (18) 1 21 3 3 2 b 1 3 0 0 0 1 + (2 '8.1. To begin with. Cancelling this factor. 1.18 22 . Therefore D2(4) r 0..12 18 . . = 2. setting Ao = 4.1) corresponding to the characteristic value Ao = 4. For A = 4 the second order minor 115I 1 31 =2+2 of d (A) does not vanish. MINIMAL POLYNOMIAL OF MATRIX 8 93 ( . Therefore all the minors of order two in A(A) ..(A) and C(A) could have been determined by a different method. 1.) for A. 3) for the same characteristic value Ao = 2.3. In C(A) we substitute for A the value Ao = 2: C(2)= 11 1 3 3 i1 The first column gives us the characteristic vector (1. let us find D2(A). .+20) 0 1 0 1 0 11 2252 + 6 Il 2A41' 3A+6 2+2 2'32+2 326 2'82+2 2+2 All the elements of the matrix B (A) are divisible by D2 (A) = A .µ ) = d(N)d(2) _ 2+ k (28)+2'82+20. The second column gives us the characteristic vector (. 1 B(2) =A'+(28)A+(2'82+20)E 6 6 12 10 . The third column is a linear combination of the first two.2. Similarly. we have : i 2. 1. For A= 2 the columns of A(A) become proportional.3 C(2)= and 1 21 1 3 3 2 2 1! 26I' 12 d(2) 1 2 2 4II. The reader should note that y.
l .2)2. D2(A) cannot be divisible by (.% 62+8. Since the minor to be computed is of the first degree.94 IV. Therefore D2(2)=12. Hence 2('12)(A4)=. CHARACTERISTIC AND MINIMAL POLYNOMIAL OF A MATRIX vanish for A = 2 : D2(2) = 0. C('1)=V(AE.A)=A+(x6)E= 11 13 3 2 1 11 2 1 3 26 .
. 2.CHAPTER V FUNCTIONS OF MATRICES § 1. i. . 91 (3) .. (Ar) = h0 k1) (Ar) (4) (Ak) = h' (1k). we shall write this as follows : g(A)h(A) (mod o(A)). i. A'+ yl 111 + + yt is a polynomial in A.. are all the distinct characMk teristic values of A). . In this case. we wish to extend the function f (A) to a matrix value of the argument. i See Chapter IV.. we shall obtain a definition of f (A) in the general case. . (k=1.. We denote by (AA. 8)... 8). d'(Ak)=0. is divisible by p(A) without remainder .. We wish to define what is to be meant by f (A). as an annihilating polynomial for A. . . /(A) = yo A' + yl A'1 + .e. Starting from this special case. 95 .e. A2. Hence bf (1) d(Ak)=0..E. Definition of a Function of a Matrix 1. Let A= 11 alk 11 11' be a square matrix and f (A) a function of a scalar argument A. 2. . (2) Then the difference d(A) =g(A) h(A). + y. 9 (Ar) = h (Ak)... d(nk1)(Ar)=0 (k =1..)"'' 1'(A) (1) the minimal polynomial' of A (where A. The degree of this polynomial is m = k_1 Let g(A) and h(A) be two polynomials such that g (A)= h (A). We already know the solution of this problem in the simplest special case where f (A) = y. § 6. A.
l) (Ak) (k= 1..e.. . (Ak) f(mk. i. FUNCTIONS OF MATRICES The m numbers f (1k). Equation (4) shows that the polynomials g(A) and h(2) have the same values on the spectrum of A...96 V. then f (Ad) = g (Ad) where g(A) is an arbitrary polynomial that assumes on the spectrum of A the same values as does f (A) : f(A)=g(A)..8) (5) will be called the values of the function f (A) on the spectrum of the matrix A and the set of all these values will be denoted symbolically by I (AA).. f. We are thus led to the following definition : DEFINITION 1: If the function f(A) is defined on the spectrum of the matrix A. We postulate that the definition of f (A) in the general case be subject to the same principle: The values of the function f(A) on the spectrum of the matrix A must determine f (A) completely. the values of the polynomial g(2) on the spectrum of A determine the matrix g (A) completely.2. have meaning). Among all the polynomials with complex coefficients that assume on the spectrum of A the same values as f (A) there is one and only one polynomial 2 It will be proved in § 2 that such an interpolation polynomial always exists and an algorithm for the computation of the coefficients of the interpolation polynomial of least degree will be given. In symbols : g (Ad) =A (Ad) Our argument is reversible : from (4) follows (3) and therefore (2). all functions f (A) having the same values on the spectrum of A must have the same matrix value f (A). i.e. given a matrix A. Thus. But then it is obvious that for the general definition of f (A) it is sufficient to look for a polynomial2 g(A) that assumes the same values on the spectrum of A as f (A) does and to set : f(A)=g(A)..If for a function f (A) the values (5) exist (i. all polynomials g (A) that assume the same values on the spectrum of A have one and the same matrix value g (A) . then we shall say that the function f (A) is defined on the spectrum of the matrix A.e.
DEFINITION OF FUNCTION OF MATRIX 97 r(A) that is of degree less than m... .. r' (2k)= f" (2k).=m.. . Note.s).. If the minimal polynomial Ip(A) of a matrix A has no multiple meaning it is sufficient that f (A) be defined at the characteristic values .2. 4 In Chapter VI it will be shown that A is a matrix of simple structure (see Chapter III. Am. . r(mk1) (Ak) = j(mk1) (Ak) (k=1. . and the polynomial r (A) is of the form A"1 Therefore 3 This polynomial is obtained from any other polynomial having the same spectral values by taking the remainder on division by p(r) of that polynomial. . then for f(A) to have a Al. roots4 (in (1) m1=m2=.0 Its minimal polynomial is An.. 1314. f(0). Definition 1 can also be formulated as follows : DEFINITION 1': Let f(A) be a function defined on the spectrum of a matrix A and r(A) the corresponding LagrangeSylvester interpolation poly nomial.... then for some characteristic values the derivatives of f (A) up to a certain order (see (6)) must be defined as well..0 0 0 1 . f'(0)... 5 The properties of the matrix H were worked out in the example on pp. Then /(A) = r(A). .. A2.0 H= 0 0 0.§ 1. and this case only. I 0 0 0.=1. Therefore the values of f (A) on the spec trum of H are the numbers f (0). § 8) in this case.3 This polynomial r(A) is uniquely determined by the interpolation conditions : 9 (Ak) = f (Ak). . . (6) The polynomial r(A) is called the LagrangeSylvester interpolation polynomial for f (A) on the spectrum of A.. But if Ip(A) has multiple roots. Example 1: Let us consider the matrix5 n 0 1 0.. s = m)....
The interpolation polynomial r(A) of f (A) is given by the equation r (A) = / ('1o) + f' 1) 1! n (1't0) + .t t . : f (0) Example 2: Let us consider the matrix n 120 1 0 1. 0.. . . + (n1)! H_ f (0) f' (0) 1! 0 . . FUNCTIONS OF MATRICES /' (0) 1(n1) (0) 1(H)=f(0)E+ 1! H+ . 1. We mention two properties of functions of matrices. so that J 10E = H. ..2o)n.. Note that J = 20E + H. B=T'AT. then the matrices f (A) and f (B) are also similar and T transforms f (A) into f (B). If two matrices A and B are similar and T transforms A into B.. + (n t> (1) Hn.98 V. .. + 1! (n 1)1) ('1 A0)n1 Therefore 1(J) = r (J) = / (A0) E + f' A10) 0 i) H + . f (B) = T' f (A) T. .0 . The minimal polynomial of J is clearly (A ..1 . /(Io) 2.0 .A0I J= 0 0 0 0 0. 1. 00 1('10) (A0) 11 0 0 . . .
and vice versa. f(A)= T (f (A. .(i. (8) Example 1: If the matrix A is of simple structure A=T{21. . 2.. r (A4)..) Therefore I(A1) = r (AI).1. A.)} ... 6 From B = T'AT it follows that Bk = T1AkT (k. Then it is easy to see that f (A) .)).).) of A is an annihilating polynomial for each of the matrices A. But then it follows6 from the equation r(B) =T1r(A)T that f(B) =T1f(A)T. Therefore it follows from the equation that f (Ad) = r (Ad) f(A41)= r(AA). . Therefore it follows from g(A) =0 that 9(B) =0. .. then /(A) = {f (A1). . .. . 1.. then . . f (A... f (A2).) =r(AA... f (A2). 0. 1...).. f (A) has meaning if the function f (. . 2.. Therefore there exists an interpolation polynomial r(A) such that f (A) = r(A) and f(B) = r(B). As) . Hence for every polynomial g(X) we have g (B) = Tlg(A)T . .. . . . If A is a quasidiagonal matrix A = {A1. . . ..l) is defined at A1... f(2Y). A2. the minimal polynomial y..6 so that f (A) assumes the same values on the spectrum of A and of B. r (A.= r (A) = {r (A1)... .. f (Au) = r and equation (7) can be written as follows: f (A) = (t (A1).))T'. A2.. f(A4.. . . Let us denote by r(A) the LagrangeSylvester interpolation polynomial of f (A) on the spectrum of A. 22. f (A6)) . f(A. 12.. DEFINITION OF FUNCTION OF MATRIX 99 For two similar matrices have equal minimal polynomials. T1.. (7) On the other hand.
01. as in the matrix J. 0.u(11) .. 1Y o 0. . all the elements in the nondiagonal blocks are also zero. J be a matrix of the following quasidiagonal form t 11 1 0 0 0 11 0. FTJNCTIONS OF MATRICES Example 2: Let. § 7) that an arbitrary matrix A = 11 au II. . . § 6 or Chapter VII.11 0 . f(11) f' (11)1! 0 0 . . 1..0 1Y All the elements in the nondiagonal blocks are zero. II f (1) 1 0 1(11) it . 1213). By (8) (see also the example on pp. 0 o A. f (11) /(J)= f (1Y) f' (f 1Y) /(`u4) (1Y) (vU 1)! 0 0 . fal .1). 0 . ii f (1Y) Here.. 98) we always have f (A) = Tf (J) T1.0 1 . . Therefore (see 1. 0.0 . . J= 'Y 1Y 1 0.100 V.. on p.' 7 It will be established later (Chapter VI.0 1 o o 0.0 . .1 .. .` is always similar to some matrix of the form J : A = TJT1.
we consider the case in which the characteristic equation I AE ...(.. + m. 2..A.§ 2.. and the equation (6) takes the form r (Ak) = f (Ak) (k =1. has only simple roots :8 In this case (as in the preceding one) all the exponents Mk in (1) are equal to 1.. and condition (6) can be written as follows: r(Ak)=/(Ak) the function f (A) at the points Al.(A. (A_Am) ..Ak+ k1 (A(A*`11)"(AkAk1)(Ak`Ak+1)lE) .. In this case...Ak1E) (A . .. An : n Akt) (A 1k±1) . . but that the minimal polynomial. (A ..... 2.A..Al) .. (A (A ` r(A)_. (A .. Let us assume now that the characteristic polynomial has multiple roots. LAGRANGESYLVESTER INTERPOLATION POLYNOMIAL 101 § 2. . We now consider the general case : V (A) = (A . . (. An.. .. The roots of this equationthe characteristic values of the matrix Awill be denoted by A1i A2..11)..A1)m. To begin with. We represent the rational function.. m). (A . .)" . The LagrangeSylvester Interpolation Polynomial 1. A2i (k=1.(AAA). Then tP(A)=IAEAI=(AA1)(AA2).. = m) ...(A (AkAIE) ...A. (') .4_^ Ak1) (AkAk+1E) .)m (m2 + m2 }.(AkZk1)(AkAk+1)..tk. as a sum of partial fractions: 8 See footnote 4....E) . (Ak Ak1E) (A Ak+ 1) .. . n)... (A . (Ak AmE) k1 3.(Ak`in) n (Ak) 2. where the degree of r(A) is less than the degree of y'(A). .A I = 0 has no multiple roots.1n) An)f(2k) By Definition 1' f (A) = r (A) . r(A) is the ordinary Lagrange interpolation polynomial for . which is a divisor of the characteristic polynomial. r(A) is again the ordinary Lagrange interpolation polynomial and f (A) .
.tk + r' (2k) _ Vk(Ak)' . (10) where ek(A) is a rational function.. In order to determine the numerators aki of the partial fractions we multiply both sides of (9) by (2 . we can determine r(A) from the following formula.Ak).. . o I... 8)..e. which is obtained from (9) by multiplying both sides by tp(A) : f r (2) = E (ak i + ak 2 (2 1k) + . 2.. mk.. Ak)mk1+. . on the righthand side of (9) are expressible in terms of the values of the polynomial r(A) on the spectrum of A. equal to the sum of the first Mk terms of the Taylor expansion of f (A) in powers of (A .2. that does not become infinite for 1=At... s) are certain constants. (k =1. 2. FUNCTIONS OF MATRICES tV(1)k. regular for A= Ak.s).. by (13).. + ak mk (I.. . . 2. k =1. Therefore Ilk I  1(1k)_ Pk(1k) 1 1 tVk(1) AAIt (12) 8). (11) Formulas (11) show that the numerators ak.. and these values are known : they are equal to the corresponding values of the function f (A) and its derivatives.+akmk(A2k)"k1+ +(AAk)mkek(A) (k=1.102 V.. .. mk. s)... .11(2 aAk)*+k where ak.ak Lirk (1)1 (1)k_.I 2.... 2. Formulas (12) may be abbreviated as follows : akf _ 1 F _j (1) ll01> (91)! lYk(1)!Ax k (7=.. ..1) Vk (A) k_1 (14) In this formula the expression in brackets that multiplies 1vk(A) is. .. (13) When all the akt have been found.+Ar41 (9) (a ...'k )k and denote by Vk(A) the polynomial Then we obtain : (x1k)mk r 1Vk((1) ak1+ak2(A2k)+.(k=1. ..tk ak 2 = r (2k) Ljk (1) J d f ..° Hence aY1 V k (1) i a .2k)mk. k = 1.
(A . Al (1).A Example : 1P (A) _ (A . 2 ... Let v' (A) _ (A (A . and s can be found from the following formulas : _ Y= 6 = (Al A As)' A .A... ..G mx). At (13 3 (As A1)' (As 1(As) + (As? (A$ 1 A1)2 1I (As) 1" (As).A1)] (A . (lima)) ! Then it is not difficult to show that the required LagrangeSylvester polynomial is determined b 7 the formula r (A) = lim L (A).AIE)'.. The LagrangeSylvester interpolation polynomial can be obtained by a limiting process from the Lagrange interpolation polynomial. . 4'"i). LAGRANGESYLVESTER INTERPOLATION POLYNOMIAL 103 Note. ft (Aa1)). A(e).. (A11)).As)s + [Y +6 (A ..A=E) + E (A .. A1) 1(As) _ (. A(I) .. y.§ 2. A(m... .A1)' (A 1s)' (m = 5). f (Aims)) A .As)'] (A .) .. A(2) a .. a. 1(m.. a=_ 3 As)' 1(A1) + 2 A1)' (A1 1 As)' 1' (A1). 41).A 2 2 A1)3 1' (As) + 2 ..).)m (m = . .A1E)3 (A ... A(1) 2 .As) + e (A . is Then r(A)  a Y e As. Hence r (A) _ [« + Q (A . Acma by . t=1 We denote the Lagrange interpolation polynomial constructed for the m points A(t) A(2)A(m.A1)' and therefore r(A)== LaE + P (A .B. 1 . 2 .) 1 . . 1 ... . ....E)a + (YE + d (A .A. 6.AaE)'] (A .(A.A2)""' . ..
+ f(mk1) (Ak) mk(2).(M. ... . (A) f f (fir) (A) + f' (2k) 9'k2 (A) + .. (j = 1. . (17) where Zkj =Tkj (A) (18) . k1 j1 Let us determine the interpolation polynomial r(A) from the m conditions : r(j1) (Ak) = ckj (j =1. i.. mk .. 8).... s).+.. When we substitute in (14) the expressions (12) for the coefficients a and combine the terms that contain one and the same value of the function f (A) or of one of its derivatives. .. These polynomials are completely determined when p(A) is given and do not depend on the choice of the function f (A). . The functions ggkj(A) represent the LagrangeSylvester interpolation polynomial for the function whose values on the spectrum of A are all equal to zero with the exception of /(F1) (Ak).. 2. k =1.. equal to m (m is the degree of the minimal polynomial y'(A) ). 2.. From (15) we deduce the fundamental formula for f(A) : (A) k1 ff(Ak)Zkl+I'(Ak)Zk2+. by (16) ckj =0 (A) = 0 (j= 1.. k = 1. k =1. Let us return to the formula (14) for r(A). 2. s) are easily computable polynomials in A of degree less than m.. . The Components of the Matrix A 1.. 2... Other Forms of the Definition of I(A). For suppose that ft 0.. mk. 2. .. FUNCTIONS OF MATRICES § 3. 2. .. k =1. (15) =1. we represent r(2) in the form r (A) _ Here qk.. . mk. s) are linearly independent.. k =1. .' I)(Ak)Z kl.. .. t (16) Then by (15) and (16) ink r (A) _j1 X k1 and. mk . 8)... 2. . 2. which is equal to 1. The number of these polynomials is equal to the number of values of the function f (A) on the spectrum of A. . ... mk.e..104 V. All the polynomials qk j (A) (j 1. 2. 2. therefore..
k =1...mk. 2. since the m functions Tkf(A) are linearly independent. The components Zkj are linearly independent. COMPONENTS 105 The matrices Zkf are completely determined when A is given and do not depend on the choice of the function f (1). s) will be called the constituent matrices or components of the given matrix A.8).. and the parameter t enters only into the scalar coefficients of the matrices. . . it follows from (19) that x(a)=0. But then. the components Zkf on the righthand side of (17) do not depend on t. k=1.(2). On the righthand side of (17) the function f(A) is represented only by its values on the spectrum of A. Let us also note that any two components Zkj are permutable among each other and with A. OTHER FORMS OF DEFINITION OR f (A).22)3. we may represent r(A) in the form r(A)=1(11)9'11(.. . ... 2. The matrices Zkf (j =1. The formula (17) for f (A) is particularly convenient to use when it is necessary to deal with several functions of one and the same matrix A. that none of these matrices can be zero. among other things. where . . 2. k1 f1 Then by (18) X (A) =0. For suppose that i mk '. 2. ckiZkj = O .. or when the function f (2) depends not only on A. because they are all scalar polynomials in A. In the example at the end of § 2. Mk.. (20) implies that ckf=0 (j= 1. 2. and this is what we had to prove.1)+I'(11)c'12(A)+1(12)9'21(1)+1'(x:) 91t: (1)+1"(11) q.. . the degree of the minimal polynomial W (2) . where p(2) = (2 .T G Ck/9'kf (2) (20) Since by (20) the degree of x(2) is less than m.A1)2(2 .§ 3. . In the latter case. From the linear independence of the constituent matrices Zkf it follows. where 8 (19) Mk X (A) =. but also on some parameter t. .
A) / (A) = E...ADS Therefore where f (A) f (At) Zit + /' (At) Zts + f (A2) Z21 + f' (A2) Zzs + f (A2) Z23 Z.. . (At 1 As) a (AA...At)s (A I1( 2 (AA2)1 As .p) / (p) =1 coincide on this spectrum.106 V.. we can represent the fundamental formula (17) in the form 10 For 1(p) A 1 1A we have f(A) =(AE.As)a (At .Ak (22) Hence _ Zkf 1 (C (A) l(mkf) (9=1. 2...t =9'11 (A) = Z12 = 9?12 (A) = (At Az)a (A 12E)a {E  3 At (A .A) r (A) = (AE. + (mkZkk (AAk)k (21) where C(1) is the reduced adjoint matrix of 2EA (Chapter IV. .. (91)!(mk1)!`wk(A) JA_xk 2. Then we obtain C(A) 1D(A) (AE A) = 1 = kI AAk+ Zkt (AAk)2 1!Zk2 +.E) (AA2E)a.Aa)a 3(A1 2)2 1 A2A.10 The matrices (j 1) ! Zki are the numerators of the partial fractions in the decomposition (21).. + (AsAt)hj (A  At)s (A2 . When we replace the constituent matrices in (17) by their expressions (22). we can set in the fundamental formula (17) /(u)= 1 is a parameter.At J ' (A) _ (A .As)' 2 (A2 . 8). (mk . When the matrix A is given and its components have actually to be where A found. From the fact that f(p) and r(p) coincide on the spectrum of A it follows that (A. k=1.AE)] .A) . mk.u) r(p) and (A. . 3. § 6).A2 A2 )a ( i 1 3 (A At) A. and by analogy with (9) they may be expressed by the values of C(1) on the spectrum of A by formulas similar to (11) : k'(A) (mk1)!Zkmk= . For f (A) = r(A).At)2(A . Hence (AE.AS L1 ]' A2) 9'12 (Z) = AAt 12 f 9)22 (A) _ 9723 2(A12) (A ` A0(1 .D(Ak) . . FUNCTIONS OF MATRICES 2. where r(EL) is the LagrangeSylvester interpolation polynomial.2) ! Zk mk1  (A) x .
OTHER FORMS OF DEFINITION OR f (A).A is equal to 1. .Z11 + 11 (A1) Z12 $ + Zsi 12 Z21 =C(2).A j = (A 1)2 (A .2). 4 Example 1:" A= 2 1 0 1 1 1 1 1 12 2 1 1 AE.A 0 1 1 11 1 1 1 11 In this case A (A) _ AE . we have D2(A) =1 and.µ)='V (/4)tp{d) =1A2+(14)'4+141+5 and C(1)='P(1E. We now use the above expression for C(A). V (1)=v(1)=(11)2(12)=As412+512. When we multiply the rows of A into the sum column of B we obtain the sum column of AB.p(1) (1EA)1= C(1) . Z21. 'i(1. tute the results obtained in (24) : Z12=C(1).A)=A2+ (14)A+(1241+ 5)E 3 2 2 3 1 2 2 3 +(14) 3 3 1 1 2 1 1 0 1 1 1 1 +(1241+5) 1 1 0 0 0 1 0 0 0 1 The fundamental formula has in this case the form f(A)=f(1)Z11+/'(1)Z12+f(2)Z21. therefore. hence Z31=C(1)C'(1). Z12. COMPONENTS 1 107 r 0(A) (M 1) (23) f(4) _1) 12 . Since the minor of the element in the first row and second column of AE .§ 3. compute Z11. (24) Setting f (µ) =1 1 we find: . and substi 11 The elements of the sum column are printed in italics and are used for checking the computation.
. Substituting in (24'). in (17) by the values of the reduced adjoint matrix C(2) on the spectrum of A. 1 00 0 0 Z.t2).108 1 V.i 1. 00 0000 Za=(AE)1= II1 1 0 0.+Z. we made use of the decomposition (21) and expressed the components Z. we started from the fundamental formula (17) and substituted in succession certain simple polynomials for f (1) .. Again let 2 1 A 0 1 1 1 1 .. we can determine all the Z. 1 1 00 Computing the third equation from the first two term by term.+/(2) f'(I) / (1)+/(2) f (1)I f(1)f(2) I (25) Example 2: Let us show that we can determine f (A) starting only from the fundamental formula. In the second method. 4.. The examples we have analyzed illustrate three methods of practical computation of f (A). W('1)=(11)2(. (24') 1 Then f(A)=f(1)Z1+f'(1)Z2+f(2)Z). . In the third method. In (24') we substitute for f (A) in succession 1.=AE= 1 1 1 1 1 0 0 1 1 1. we found the interpolation polynomial r (1) and put f (A) = r (A). we obtain the expression for f(A). from the linear equations so obtained we determined the constituent matrices Zk. FUNCTIONS OF MATRICES 00 0 f(A)=l(1) I 1 1 1 1 0+/'(I) 1 1 1+/(2) 1 1 I1 0 0! 1!0 I 1 1 111 1 00 1 0!I 011 0 /' (1) /' (1) 11 / (1) F /' (1) 1/(1)+/'(1)/(2) f'(1). In the first method. (A 1) 2: 111 z1+Z3=E = 01oil.
. / (2k) . we obtain the required expression for f (A).. 9. 0 is always satisfied when the degrees of the polynomial gl (A).14 Example : Given the matrix A = 11 4 12 With coefficients not all equal to zero. ...1)2. In conclusion. gi (A. m = 3. g. (A) are 0.. As the factor of f (A) we have here the determinant d = I AY) (2k) 1 (in the ith row of A there are found the values of the polynomial g. t (A. . 2. .) . what is the same. .. 92(x).. OTHER FORMS OF DEFINITION OR f (A).. m 1. 9m (A) 9m (Al) .. + kl g{mk1) (2k) Zk mrl (i= 1. .1.(A): g1 (A) _ [9{ (Ak) Zk1 + 9i (Ak) Zk2 + . by setting X= 0 in (21).13 The condition A 5.e. .... 9m (A. In order to determine f (A) we must have A 0. i.... g(mi1) (2k) . (A) on the spectrum of A . 9s(A) = (1. f(ml') f(ma1) (A..) =0. The minimal polynomial of the matrix is yp(A) = (A . gl(2) = 1.. .. .. . gs(1) = A. m). from the (m + 1) equations (26) and (17) can be written in the form f (A) gg (A) (Al) . is divisible by W (A) .) (A..) . . we mention that high powers of a matrix An can be conveniently computed by formula (17) by setting f (A) equal to A . is In the last example. 9(.) . g1 (A.. . In the general case it can be stated as follows : In (17) we substitute for f(A) successively certain polynomials gl(A).. 14 Formula (17) may also be used to compute the inverse matrix A1. g(I'n`') (2k) . by setting J(x) = A or.3 it is required to compute the II elements of A100.. . i =1.. . COMPONENTS 109 The third method is perhaps the most convenient for practical purposes.§ 3. (26) From the m equations (26) we determine the matrices Zk3 and substitute the expressions so obtained in (17). m). 1.. This will be so if no linear combination 12 of the polynomials vanishes completely on the spectrum of A.m'1) (2a) Expanding this determinant with respect to the elements of the first column. respectively. 2. .. g2(A). The result of eliminating Zk.1)2.) ....
110
V. FUNCTIONS OF MATRICES
The fundamental formula is
f(A)(1)ZI+f'(1)Z,.
Replacing f (1) successively by 1 and A  1, we obtain :
Z,=E, Z2 =AE.
Therefore
f(A)=f(1)E+f'(1)(AE).
Setting f (A) =A"', we find
Aioo=E + 100(AE) =111
1011+100114
4411 =11401400
.g
11.
§ 4. Representation of Functions of Matrices by means of Series
1. Let A= 11 a!k II i be a matrix with the minimal polynomial (1) : (m =1 mk) kI Furthermore, let f (A) be a function and let fI(A), f2(A), ... , fp(A), ... be a sequence of functions defined on the spectrum of A. We shall say that the sequence of functions fp(A) converges for p + 00 to some limit on the spectrum of A if the limits
vY (R)
urn f p (Ak),
p+oo
lim fp (As),
p.4oo
... , "M p.o0
(fir)
(k =1, 2, ... , 8)
exist.
We shall say that the sequence of functions f,, (A) converges for p). o0 to the function f (A) on the spectrum of A, and we shall write
lim fp (AA)=f (Ad)
p.4oo
if
P.00
Jim
fp (Ak) = f (nk), lim fn (1k) = f'(1k), ... , lim f P k I (ilk) =
ppoo
(k=1,2,...,3).
i
The fundamental formula
f (A) the matrix as a vector in a space
f
of of dimension n2, then it follows
from the fundamental formula, by the linear independence of the matrices Zkf, that all the f (A) (for given A) form an subspace of RR'
§ 4. REPRESENTATION OF FUNCTIONS OF MATRICES BY SERIES
111
with basis Zk; (j =1, 2, ... , Mk; k =1, 2.... , s). In this basis the `vector' f (A) has as its coordinates the m values of the function f(A) on the spectrum of A. These considerations make the following theorem perfectly obvious :
THEOREM 1: A sequence of matrices fp (A) converges for p  oo to some limit if and only if the sequence fp(A) converges for p * oo on the spectrum of A to a limit, i.e., the limits
lim fp (A)
p+oo
and
lim fp (AA)
always exist simultaneously. Moreover, the equation
P. 00
lim fp (AA) = f (AA)
lim fp (A) = f (A)
FCo
(27) (28)
implies that
and conversely. Proof. 1) If the values of f(1) converge on the spectrum of A for p > oo to limit values, then from the formulas
fp (A) =G [fp (Ak) Zk1 + fp (4)
fpmkM
(At)
(29)
there follows the existence of the limit V+00 fp(A). On the basis of this lim
formula and of (17) we deduce (28) from (27). 2) Suppose, conversely, that lim fp(A) exists. Since the m constituent
p. 00
matrices Z are linearly independent, we can express, by (29), the m values of fp(A) on the spectrum of A (as a linear form) by the m elements of the matrix fp(A). Hence the existence of the limit lim /P(AA) follows, and (27) P.. co holds in the presence of (28). According to this theorem, if a sequence of polynomials gp(A) (p =1, 2, 3, ...) converges to the function f (A) on the spectrum of A, then
lira gp (A) = f (A).
2. This formula underlines the naturalness and generality of our definition of f (A). f (A) is always obtained from the gp (A) by passing to the limit p + oo, provided only that the sequence of polynomials g, (A) converges to f (A) on the spectrum of A. The latter condition is necessary for the existence of the limit pao gp(A). lim
112
V. FUNCTIONS OF MATRICES
00
We shall say that the series Y u,(A) converges on the spectrum of A
P0
to the function f (2) and we shall write
0
f (AA) 4.'uP(AA),
P=0
(30)
if all the functions occurring here are defined on the spectrum of A and the following equations hold :
f (2k) Lr U11 (Ad),
P0
P=0
up (Ak) .
..., f (mkl) (Ak) _, uymk 1)
P=0
(k =1, 2,
... ,
s),
where the series on the righthand sides of these equations converge. In other words, if we set
sP
y=0
uy (1)
(p = 0, 1, 2, ...),
then (30) is equivalent to
f (AA) 1im sP (AA).
(31)
It is obvious that the theorem just proved can be stated in the following
equivalent form :
THEOREM V: The series X u ,,(A) converges to a matrix if and only if 0*
00
P=0
the series
,
p=0
' u,(2) converges on the spectrum of A. Moreover, the equation
ero
f (A.!) implies that
P0
uP (AA)
0
f (A) =E uP (A),
P0
and conversely.
3. Suppose a power series is given with the circle of convergence I A 20 I < R and the sum f (A) :
00
f (A)=Z a,(AAo)'
(I AAol < R).
(32)
§ 4. REPRESENTATION OF FUNCTIONS OF MATRICES BY SERIES
113
Since a power series may be differentiated term by term any number of times within the circle of convergence, (32) converges on the spectrum of any matrix whose characteristic values lie within the circle of convergence.
Thus we have :
If the function f (2) can be expanded in a power series in the circle I 2  Ao I < r,
THEOREM 2:
00
f (A) = Z ap (A  Ao)p,
(33)
then this expansion remains valid when the scalar argument A is replaced
fall on the circumference of the circle of convergence; but we must then postulate in addition that the series (33), differentiated m7, I times term by term, should converge at the point I = Ak. It is well known that this already implies the convergence of the j times differentiated series (33)
at the point 2k to f°)(2k) for j = 0, 1, ... , mk1. The theorem just proved leads, for example, to the following expansions :'S
00
by a matrix A whose characteristic values lie within the circle of convergence. Note. In this theorem we may allow a characteristic value 2k of A to
ed =
po
' AP P1
,

c os
°° A= E (_ 1)1' A2p , P_() (2p) l
sin A =G
°0
P0
X (1)p 2p + ill,
A2p+1
cosh A Z (2p) t'
0o A2p
sinhA
= o (2p+1)t'
A2p+1
(EA)1=IA'
p0
00
(11,tI<1;k=1,2,...,8),
hi A =.X
P1
p
(A E) p
(IAk1I<1; k=1,2, ...,F)
(by In A we mean here the socalled principal value of the manyvalued function Ln ,I, i.e., that branch for which Ln 1= 0). Let G (u1, u2i . . . , ui) be a polynomial in u1, u2i ... , uI ; let f1(2), /2(2), f i (2) be functions of A defined on the spectrum of the matrix A, and let
g (A) = o U, (A), fe (A), ..., fl (2)].
Then from
g (Ad) =0
the re follows :
(34)
(35)
GUi(A),f2(A), ...,f:(A)]=o.
15 The expansions in the first two rows hold for an arbitrary matrix A.
114
V. FUNCTIONS OF MATRICES
For let us denote by /I (A), f 2 (A), ... , f j (A) the LagrangeSylvester inter
,rs(1), and let us set: r2(A), polation polynomials Q (f, (A), fz (A), ..., f: (A)l =G Lr1(A), r2 (A), ..., rl (A)l = h (A) = 0, Then (34) implies
Hence it follows that
h (A) = G [r1 (A), ?'s (A), ..., r! (A)],
...
h(AA)=O.
(36)
and this is what we had to show. This result allows us to extend identities between functions of a scalar variable to matrix values of the argument. For example, from cos2A+sin2 A=1 we obtain for an arbitrary matrix A toss A + sin2A =B
(in this case 0 (u1, u2) = u= + u'  1, f1 (A) = cos A, and f, (A) = sin A ). Similarly, for every matrix A
i.e.,
eAe7' = E ,
e
A = (ea)'
Further, for every matrix A
e`'=cosA+isinA
Let A be a nonsingular matrix (I A J 0). We denote by ft the singlevalued branch of the manyvalued function VA that is defined in a domain not containing the origin and containing all the characteristic values of A. A = 0 it now follows that Then }"A has a meaning. From
(FA
A.
Let f (A) = and let A = II aik II1 be a nonsingular matrix. Then f (A) x is defined as the spectrum of A, and in the equation Af(A)=1 we can therefore replace A by A :
i.e.,16
A f (A) =E, f(A)=A1. Denoting by r(2) the interpolation polynomial for the function ill we
may represent the inverse matrix A1 in the form of a polynomial in A :
Is We have already made use of this on p. 109. See footnote 10.
§ 4. REPRESENTATION OF FUNCTIONS OF MATRICES BY SERIES
115
A1=r(A).
where g(A) and h(A) are Let us consider a rational function o(2)= coprime polynomials in A. This function is defined on the spectrum of A if and only if the characteristic values of A are not roots of h(A), i.e.,'7 if 0. Under this assumption we may replace A by A in the identity h(A)
e(1)h(.)=g(A),
obtaining:
(A)h(A)=g(A)Hence
e(A)=g(A) [h(A)11=[h(A)]1g(A).
(37)
Notes. 1) If A is a linear operator in an ndimensional space R, then f (A) is defined exactly like f (A) :
f (A) = r(A),
where r (A) is the LagrangeSylvester interpolation polynomial for f (A) on
the spectrum of the operator A (the spectrum of A is determined by the
minimal annihilating polynomial ip (A) of A). According to this definition, if the matrix A= II ask 111 corresponds to the operator A in some basis of the space, then in the same basis the matrix f (A) corresponds to the operator f (A). All the statements of this chapter
in which there occurs a matrix A remain valid after replacement of the matrix A by the operator A. 2) We can also define18 a function of a matrix f (A) starting from the
characteristic polynomial
d (A)  n (A _ 2k)Ilk
ke1
instead of the minimal polynomial
(A) = U (I  2k)mk
k1
17 Bee (25) on p. 84. 18 See, for example, MacMillan, W. D., Dynamics of Rigid Bodies (New York, 1936).
116
V. FUNCTIONS OF MATRICES
We have then to set f (A) = g (A), where g (A) is an interpolation polynomial of degree less than n modulo d (A) of the function f (A)."' The formulas (17), (21), and (23) are to be replaced by the following20
f (A) = L1 [f (Ak) Zk1 + f'(4) Z. + ... i ,Q'k 1) (1k) Zknk]
k.=1
(1EA)1
(17')
= B(2)
4(2)
4.1
k1 2Zk
1
+
1!Zk2
(Zjk)2
htk1)
1)12knk +... (nk nk1 
JII
.
(21')
(23')
B (A)
where
dk (1) =
(A
d
(Ak)"k
(k1,2, ... , s).
(/'(k),
.. , f(ak1) (At) occur only fictitiously, because a comparison of (21) with (21') yields:
However, in (17') the values f (mk) (,lk),
41=Zkl, ..., Zkmk, 4mk+1=... = Zknk=0.
§ 5. Application of a Function of a Matrix to the Integration of a System
of Linear Differential Equations with Constant Coefficients
1. We begin by considering a system of homogeneous linear differential equations of the first order with constant coefficients :

dt
d
allxl + a12x2 + ... + alnxn
+ ... + a2nxn
dt2 = a21x1 + a22x2
...............
it anlxl F an2x2 + ... + a ,,xn,
(38)
where t is the independent variable, x1i x2, ... , xn are unknown functions of t, and a{k (i, k = 1, 2, ... , n) are complex numbers.
We introduce the square matrix A= aik 111 of the coefficients and the
column matrix X = (x1, x2, ... , xn) . Then the system (38) can be written in the form of a single matrix differential equation
19 The polynomial g(1) is not uniquely determined by the equation f (A) =g(A) and the condition `degree less than n.' 20 The special case of (23') in which J(2)= in is sometimes called Perron's formula (see [40], pp. 2527).
d3is =A at.to= dt .AeAt.. Therefore di is the column matrix with the elements dx dxl dx. we obtain : x0=Ax0. a dtx0 (43) By direct substitution in (39) we see21 that (43) is a solution of the differential equation (39).. .).o or. Let us set f (2) = ext in (17)... (41) Then by successive differentiations we find from (39) : dsx A dx =A 8x..)=A+A21+ 214 . Then eet = I I qik (t) I I i =. APPLICATIONS TO SYSTEM OF LINEAR DIFFERENTIAL EQUATIONS 117 dx dt =Ax.. + Zk. 20=A2x0.1t0 $ two z0=s= dt :so . Setting t=0 in (43). 1 xn lt_0 =x+.k rk') elk' (44) 21 (eat)_d$(E+At+ 221 +. . =A 3x (42) Substituting the value t = 0 in (39) and (42).' dt We shall seek a solution of the system of differential equations satisfying the following initial conditions: x10=x10.. .. we find: xIt_o=xo. (39) Here... Thus. X21$O= xe0 . and in what follows. .§ 5.E (Zki + Zk2t + .. Now the series (41) can be written as follows : x=x0+tAxo+21A$x0+... the formula (43) gives us the solution of the given system of differ ential equations satisfying the initial conditions (40). W) at' . briefly... dx (. we mean by the derivative of a matrix that matrix which is obtained from the given one by replacing all its elements by their derivatives. x ItO = x0 ... (40) Let us expand the unknown column x into a MacLaurin series in powers of t: x=x0+ t0E+±0 2!1 + ..
FUNCTIONS OF MATRICES The solution (43) may then be written in the following form: x1= q11 (t) x10 + q12 (t) x20 + .118 V... ... The greatest common divisor of the minors of order 2 is D2(A) =1.a where x10. The fundamental formula is /(A)=/(1)Z1+/(2)ZE+/'(2)Z. are constants equal to the initial values of the unknown functions x1.. The coefficient matrix is A = We form the characteristic determinant 32 2 1 1 1 I A 1 21 I =(I1)(. If t = t0 is taken as the initial value of the argument.x2 + x3 .. then (43) is to be replaced by the formula x=eA(`40)xa. x2.21$.i2)'. (46) Example... + q2n (t) X.. (t) x.2) 2..... x20. Therefore v (A) =d (A) _ ('1. x... A . x.0 ...... For f (2) we choose in succession 1.(t) x. . .... dXa2xi x dx. dt x1 . Thus. the integration of the given system of differential equations reduces to the computation of the elements of the matrix eAt.2. + q1..I) (1... .. (2 .+q.. d dt1 3x1.......a x2 = q21 (t) x1o + q22 (t) x20 + ....x2 + 2 x*. (45) xA=qn1 (t) x10+ga2 (t) x20 +.. We obtain : .
o2.§5.et + (1 + t) e2t et 1e2t te2t 010 e2t et+e2t Pte2t Thus +t)e21C2te2t+Csk2t x1=C1(1 x2 = C1 [.. Cs=x.621) + C}e2t where Ct=x1o. (et . APPLICATIONS TO SYSTEM OF LINEAR DIFFERENTIAL EQUATIONS 119 1 0 0 1 Z1+Z. we obtain : eAt= et 110 1 10 000 001 + te2t 11 (1 + t) e2t + e2t I001 1 . 00 Hence we determine Z1j Z2i Z. + a2Ax>< + f2 (t) dx dt . + a1 z .et + e2t) + C. 1 1 0 0 Z1= (A 29)1  1 1 1 1 0 0 0 0 0 0..te2t) + C3te't . xs = C1(. and substitute in the fundamental formula I(A)=I(I) 1 1 0 +f(2) 1 1 1 1 1 0 1 1 0 0 0 1 00 0 0 + f'(2) 1 1 1 1 1 1 0 00 If we now replace f (1) by eat.=A2E= 2 2 1 r..=E= 0 0 1 0 0 1 1 1 r Z1+Z.at + (1 + t) e2t) + Cs (et .. + / (t) (47) dxi = a21x1 + a22x2 + . We now consider a system of inhomogeneous )}clear differential equations with constant coefficients : dx1 Wt = a11x1 + a12x2 + . C2=x1q. .
.. FUNCTIONS OF MATRICES where f. . 2. t (53) 22 See footnote 21.. k=1.. . f r. S r 5 t. way: b{k(t) =1.(t) (i=1. 23 If a matrix function of a scalar argument is given. we write the system (47) as f. When we give to the argument tin (52) the value t.. 2.. we find c=e. .2. .. k = 1. n . n) are continuous functions in the interval to < t < t. 2.2.. so that (52) can be written in the following form : I x = e' (tto)xa+ f e4(tT)/(r) dr. att. tp (52) tp where c is a column with arbitrary constant elements.. . connected with x by the relation x= a dtz (49) Differentiating (49) term by term and substituting the expression for dt in (48) we find" At dt =t (t) (50) Hence2s z(t)=c+ to I feATf (r)dr (51) and so by (49) 8 9 x =edt [c + f eAT/ (r) dr] = eAtc + f ea (tT) f (r) dr . f2 (t).. Denoting by f ( t) the Column matrix with the elements f.. (48) We replace x by a new column z of unknown functions.n). .. t. (t)...Itoxo.120 V. (i=1.. !I".m. (t) and again setting A follows : =Ax+f(t).. then the integral j B (t) dr is defined in the natural f B(t)dt= f btk(r)dt t. B(t) t..). in.
g (vo = vLo ) Integrating term by term. § 30.. Therefore the differential equation of motion of the point has the form2' =g2w x v.t0) x10 + . + qnn (t . we determine the radius vector of the motion of the point : ! t T r = ro + f eAT drvo + f f CA' da dr g. Lectures on Theoretical Physics.r) fn (v)] dt ... It is known24 that in this case the acceleration of the point relative to the earth is determined by the constant force of gravity mg and the inertial Coriolis force .to) x1o + . taking the motion of the earth. As an example we consider the motion of a heavy material point in a vacuum near the surface of the earth.. I (Mechanics). where 00 (58) ro = r'e_o and vo = vl. we easily find from (53) : t (57) v = eArvo + f eAAT dr 0 .2mco X v (v is the velocity of the point relative to the earth.§ 5._ to) xn0 + jc [g11 (t . + q1 (t . into account. f0 3..._o. + qnn (t .. co the constant angular velocity of the earth). We define a linear operator A in threedimensional euclidean space by the equation Ax=2w x x and write instead of (55) (56) dl =Av+g. we can write the solution (53) in expanded form : x1 =q11 (t . APPLICATIONS TO SYSTEM OF LINEAR DIFFERENTIAL EQUATIONS 121 Setting eA° = 11 q(t) 117. Sommerfeld.z) fn (T)] dT to (54) xn = qn1 (t . Comparing (57) with (48). 25 Here the symbol x denotes the vector product.. Vol.r) f 1 (r) + . + q1n (t ..to) xn0 + I + f [goil (t . 24 See A.r) /1(T) ±. .
sin 2wt g) 4w' + g )J .w X I vote + 3 gta) + w X I W X (3 vota + 8 gtl)J {.. Atx = .2w x A'x = 8wt (w x x). A2 are linearly independent and that As+4w'A=O. For we find from (56) Atx = 4w x (w X x) = 4 (wx) w ..122 V. we neglect the terms containing the second and higher powers of co. As a preliminary we establish that the minimal polynomial of the operator A has the form V (1) = 1(As + 4m'). FUNCTIONS OF MATRICES Substituting for eAr the series E+A _1f+ A22 . for the additional displacement of the point due to the motion of the earth we then obtain the approximate formula d=wx(vott+3gts).w X ( I .coo 2wt At 2w 40 Substituting this expression for eA= in (58) and replacing the operator A by its expression from (56). let us compute eA°.sin 2wt s V0 + 1 + 2w't'+` cos 2wt (59) ... Returning to the exact solution (58). and replacing A by its expression from (56). The minimal polynomial V(A) has the simple roots 0.cos 2wt 2w' + w X Iw X v0 + 2wt .4wtx . The Lagrange interpolation formula for e4 has the form 1 + sin 2wt I + 1. .f. Hence and from (56) it follows that the operators E. we find t' rro + vot + g 2 . J (2wt . Considering that the angular velocity co is small (for the earth.cos 2wt Then 2w 4ws It eA' = E + sin 2wt A + 1.2coi. 2coi.. we have: r = ro + vot } gt1.. .. cv M 7. A.3 X 105 see').
the general solution of the equation (60) can be written in the form x = cos (}/A t) xo + sin (VA t) xa (61) where xa= x 1. and the last term on the righthand side of the last formula gives the displacement in the meridian plane perpendicular to. to begin with.. If n = 1.. and the square matrix A = II air.. we rewrite (60) in matrix form d2X i +Ax=0.. .. 114)... + axxxx= 0.... + a2nxx= 0 . exists when .. 26 By vA we mean a matrix whose square is equal to A.. .. (60') We consider.... _o and zo = dx 1 dt e_o' By direct verification we see that (61) is a solution of (60) for arbitrary n. n) are constant coefficients... k =1. the case in which I A .2wt40 + 2w'ts (g sin q co .. APPLICATIONS TO SYSTEM OF LINEAR DIFFERENTIAL EQUATIONS 123 Let its consider the special case v = o. the earth's axis...sin 2wt 4w3 (g X w) represents the eastward displacement perpendicular to the plane of the meridian...wg) .. d'xx (60) dit + axixi + ax2x2 + . if z and A are scalars and A # 0. 2. and away from.e.. X 2 ). 0. i.. The term 2wt . we know. . When we expand the triple vector product we obtain: r = r0 + g 2 + t2 2wt .. sx where the ask (i.. 4.. CA .26 Here we use the formulas CAI#0 (see p. Suppose now that the following system of linear differential equations of the second order is given : d921 +aiixi+a18x2+. II' .+al8xx=0 3tj2 + a2ixi + a22x9 + ...§ 5. Introducing again the column x = (x1. where q) is the geographical latitude of the point whose motion we are con sidering.sin 2wt 40 ° cos 1 (g X w) + . where x is a column and A a nonsingular square matrix.
.p2E I 0). We leave it to the reader to verify that the general solution of the inhonmogeneous system dzas + Ax = / (t) (63) satisfying the initial conditions x Jt_o = x0 and dt It.. Therefore (61) is the general solution of the given system of A differential equations also when I A J = 0.r.to)). and r may be chosen arbitrarily. where c and d are columns with arbitrary constant elements.o in the form x= cos (4'A t) xo + sin (j!A t) xo + xo can be written +(}'A)n sin [}'A (tz)]/ (r)dT. which are part of this expression. . Jl (62) (VA)`Isin(y'At)=Et. . provided only that the functions (IA)_l sin (}/At).1 At3+ 3! Formula (61) comprises all solutions of the system (60) or (60'). The righthand sides of the formulas (62) have a meaning even when 0. and f by f. as the initial values .. 0 to In the special case / (t) = h sin (pt + a) (h is a constant column. are intercos (rt) and preted as the righthand sides of the formulas (62). then in (61) and (64) cos (3/At) and sin (3/At) must be replaced by cos (3/A (t . and p and a are numbers).124 V. FUNCTIONS OF MATRICES cos ()"A t) = E .to)) and sin (3/A (t .2 Ate + t A2t4 . This formula has meaning when p2 is not a characteristic value of the matrix A (I A. d (64) If t = t0 is taken as the initial time. (64) can be replaced by : x=cos(VAt)c+(V4)I sin(jAt)d+(Ap2E)lhsin(pt+a).
. 2. x2i .. .. if for every e > 0 we can find a 6 > 0 such that from x10. 13.. . t) are continuous functions of the variables x1. pp. . x. n). = 0. We now introduce the definition of stability of motion according to Lyapunov. .. 11121. . and the righthand sides f{ (x1. n (68) where the pfk (t) are continuous functions for t ? to (i. Therefore in the mathematical treatment of the problem we speak of the 'stability' of the zero solution of the system (65) of differential equations. = 0) for all t to (to is the initial time)...§ 6. x remain of moduli less than a for the whole time of the motion (t ? i.27 and suppose that these parameters satisfy a system of differential equations of the first order: dx. .. . [9]. . k = 1. x = 0. X. (t) I < e (67) If. Let x1. = 0.. = 0.... for some 6 > 0 we always have lim xi (t)=0 (i = 1. xo I < a (i =1. xn in some domain containing the point x. (65) the independent variable t in these equations is the time.. (66) it follows that I x.. xno (for t = to) with moduli less than 6 the parameters x1. then the motion is called asymptotically stable.e.. i. . n).. in addition. . STABILITY OF MOTION IN THE CASE OF LINEAR SYSTEM 125 § 6..2s The motion to be investigated is called stable if for every e > 0 we can find a 6 > 0 such that for arbitrary initial values of the parameters x20. _ 1. . that special case when (65) is a system of linear homogeneous differential equations d x{ de =y p. In matrix form the system (68) can be written as follows : 27 In these parameters.. . n ) as long as I xto I < 6 (i = 1. x2i . .. Stability of Motion in the Case of a Linear System `perturbed' motion of a given mechanical system from an original motion.. .... p.. the motion to be studied is characterized by constant zero values x. We now consider a linear system. See also [3].. . n) (t ? t0) . 2.. 2. 1011. x2 = 0. x be parameters that characterize the displacement of (XI I x x t) (i 12 n) .r (t) xt.. x2.e. . 2. xn. pp. . 28 See [14].. or [36].
II pvv(t) (68') where x is the column matrix with the elements x1. n) (60) n linearly independent solutions of (68). . where c is the column matrix whose elements are arbitrary constants ci. . . in the choice of n linearly independent solutions of (69) we shall start from the following special initial conditions: 30 g4(to)=8{5= { 1 (i.. ..1 (i= 1. (70) or in matrix form.?=1.. x.2. and therefore formula (70) assumes the form x = Q (t) xo (72) or.q{5 (t) xio f. in expanded form.. x=Q(t)c. (71) in other words.126 V.. 2. : Then setting t = tp in (70). x2. ... FUNCTIONS OF MATRICES P (t) x. . . x and P(t) _ is the coefficient matrix. we find from (71) x0=c. . 30 Arbitrary initial conditions determine uniquely a certain solution of a given system.. c2i We now choose the special integral matrix for which Q (to) = E. We denote by 11 71 q15 (t).. . g21 (t) . . Every solution of the system of linear homogeneous differential eqi ations is obtained as a linear combination of n linearly independent solui:ioiis with constant coefficients : ri x{ 11 c?4r1(t) (t =1. qn5 (t) (9 1.. n) . CO. . (72') 29 Here the second subscript j denotes the number of the solution. 2. . 2... n)...29 The matrix Q(t) = 1140 lit whose columns are these solutions is called an integral matrix of the Aya= tem (68). n).
. + oo). .. i. lim Q(t) =O.o = 0.. xko # 0. t P. 2. (t) is unbounded. (It is sufficient to take 6 G nM in (66) and (67)..o = 0.) The motion characterized by the zero solution x. say qhk (t) . 3. (AA. +00 is stable. . STABILITY OF MOTION IN THE CASE OF LINEAR SYSTEM 127 We consider three cases : 1. We take the initiaLeonditions x. The motion is unstable. xk_l.0 = 0. 2. it follows from (72) that lim x (t)= 0. the minimal polynomial of the coefficient matrix P. n). is not bounded in the interval. However small in modulus xk may be.. we find that in this case Q (t) = ep(tt. We now consider the special case where the coefficients in the system (68) are constants : P (t) = P = cont. + oo). t. .+Do for every xo. = 0. Q(t) is an unbounded matrix in the interval (to. (73) (74) Comparing (74) with (72). Q (t) is a bounded matrix in the interval (to. . The condition (67) is not satisfied for any 6. the motion is stable. xk+i. This means that at least one of the functions qty (t) . there exists a number M such that Igt. the function x.. In this case the matrix Q(t) is bounded in the interval (to.) xo. .(t)I SM (t? to. .. (75) We denote by lp(A)_(AAlyn.) .e. i.. We have then (see § 5) x =epttt. 2. In this case it follows from (72') that Ixj(t)1SnMmaxlxfo The condition of stability is satisfied..o = O .. .). xno = 0.. x2=0. 9= 1.(AR2)m. x...§ 6. The motion is asymptotically stable. + oo) and therefore. Then xh (t) = qhk (t) xko X. Moreover.. as we have already explained..
.to'"rl eak(tto) . s) . and in the third case it is unstable.. For some k we have Re Ak > 0.) maximal value m0=m. in fact.+ oo.t0)moI L Z_?n r etdy(tto) + (*)] I.. 2. FUNCTIONS OF MATRICES For the investigation of the integral matrix (75) we apply formula (17) on p. .. The expression (76) can be represented in the form ep(t.to)f eak(tt. are distinct real numbers and (*) denotes a matrix that tends to zero as t . or Re AI..t (t . and in the third ease the matrix eP(tt0) is not bounded in the interval Therefore in the first case the motion (x1 = 0.e. x2 = 0. .. because the matrix . with maximal Re ak = ao and (to..2.s). pure imaginary characteristic values are simple roots of the minimal polynomial).+00 From the formula (76) it follows that in the first case the matrix Q(t) eP(tto) is bounded in the interval (ta. But. #z..) . 2. We can see this by showing that ICI where cj are complex numbers and f3. in the second case lim eP(tto) = 0. it follows from lim f (t) = 0 that r I and therefore 1cf12 = lim 1JfT T. can converge to zero for t * + oo only when f (t) = 0. in the second case it is asymptotically stable. i. + oo). f.. = 0.++oo T 0 C1= C2 ' Ctt=O. x..31 (for the given Re Xk=a. In this case I(A)= e' (tt0) (t is regarded as a parameter). fal r o where #1. + 7x. From this representation it follows that the matrix eP(tto) is not bounded for ao + w ro 1 > 0.. Zkfm e{Af(tt0) cannot converge for f.to) + . for all Ak with Re Ak = 0 the corresponding mk = 1 (i. 104. .to) = e(to(tto) (t .. 91 Special consideration is only required in the case when in (76) for eP(tto) there occur several terms of maximal growth (for t * + oo ). Re Zk < 0 (k =1. k1 We consider three cases : (76) 1.R1 t * + oo.. /0) (ilk) = (t .. .. and moreover. = 0) is stable. real and distinct numbers. + oo)..... but Mk > 1. 3. (t .128 V. t.Y [Zk1 + Z.... ReAk<0 (k=1.e. . Formula (17) yields eP (t t0) _ . .
. a. i.e.e. + oo).. The zero solution of the linear system (68) is asymptotically stable if and only if all the characteristic values of P have negative real parts.. 2) is violated. STABILITY OF MOTION IN THE CASE OF LINEAR SYSTEM 129 The results of the investigation may be formulated in the form of the following theorem :32 THEOREM 3: The zero solution of the linear system (68) for P = const.+oo 2) Z. It is easy to see that Z.§ 6. are simple roots of the minimal polynomial of P. and it is unstable if at least one of the conditions 1). is always representable in the form ep (t _to) = Z_ (t) + Zo + Z+ (t). Z0(t)._ (t). Proof. is either constant or is a bounded matrix in the interval (t. 3) of the theorem. for which Re 2k = 0. with Re Ak < 0. of nonlinear systems that become linear after neglecting the nonlinear terms). The considerations above enable us to make a statement about the nature of the integral matrix eP(''ol in the general case of arbitrary characteristic values of the constant matrix P. 2). 2) those characteristic values whose real part is zero. § 3. 32 On the question of sharpening the criteria of stability and instability for quasilinear systems (i. We denote by Zo the sum of all those matrices Zk. and Z+ (t) have the properties 1). is stable in the sense of Lyapunov if 1) the real parts of all the characteristic values of P are negative or zero. We denote by Z+ (t) the sum of all the remaining terms. On the righthand side of (76) we divide all the summands into three groups. . + oo ) that does not have a limit for t>+oo. THEOREM 4: The integral matrix 6P(t19) of the linear system (68) for P = const. 3) Z+ (t) = 0 or Z+ (t) is an unbounded matrix in the interval (t. see further Chapter XIV... the pure imaginary characteristic values (if any such exist). We denote by Z (t) the sum of all the terms containing the factors exk(to). where 1) lim Z_(t) =0.
130 . m. k 1. + Al1A + A1.e.. here l is the largest of the degrees of the polynomials aik(A). by a number c 9& 0. 2..j (i=1. DEFINITION 1: A polynomial matrix. § 1...m.2. i.. . k=1. 2.... the theory of the reduction of a constant (nonpolynomial) square matrix A to a normal form l (A = TAT')...CHAPTER VI EQUIVALENT TRANSFORMATIONS OF POLYNOMIAL MATRICES.V + A1A 1 + . Setting Ai=JIa{x II (i=1. in the form of a polynomial in 1 with matrix coefficients : A (d) =A0...+a. Elementary Transformations of a Polynomial Matrix 1. In the last two sections of the chapter two methods for the construction of the transforming matrix T will be given.1. Multiplication of any row. in the next three sections... is a rectangular matrix A(A) whose elements are polynomials in A: A (A)=Jjaa(2)II=I!a°2i+ a{t)X1+... ANALYTIC THEORY OF ELEMENTARY DIVISORS The first three sections of this chapter deal with the theory of equivalent polynomial matrices. j=0. the analytical theory of elementary divisors..l). i.. we shall develop..n.. or 2matrix.2. we may represent the polynomial matrix A(A) in the form of a matrix polynomial in A.e.. On the basis of this. for example the ith. We introduce the following elementary operations on a polynomial mat rix A(1): 1.
b(2). 2 See footnote 1.... 3. and S"' A (A).. but on the columns) ..0 S.. = .. . Interchange of any two rows... respectively:' 1. (i) (i) 1. (7) . 1 . for example the jth. . 3. respectively. .. S" A (A) .. 0 1 in other words. ....1 (i) 1 Ho ...oll (7) 1 . 2. the matrix A (I) is transformed into S'. 1 (1) 0 S"i 1..0. .. The operations of type 1. are equivalent to a multiplication of the polynomial matrix A(1) on the left by the following square matrices of order m. are therefore called left elementary operations... ..e S"= 0. . of any other row..1..A (A). We leave it to the reader to verify that the operations 1. 3.. 131 Addition to any row.. In the same way we define the right elementary operations on a polynomial matrix (these are performed not on the rows.. for example the ith... .. . ELEMENTARY TRANSFORMATIONS OF A POLYNOMIAL MATRIX 2.. . multiplied by any arbitrary polynomial b (A)... 2. 2. 3. as the result of applying the operations 1..2 the matrices (of order n) corresponding to them are : 1 In the matrices (1) all the elements that are not shown are 1 on the main diagonal and 0 elsewhere. for example the ith and the jth..
... 4 From the definition it follows that only matrices of the same dimensions can be leftequivalent. S"' (or. EQUIVALENT TRANSFORMATIONS OF POLYNOMIAL MATRICES (i) T'= (z) T" = (?) 0. The left elementary operations form a group. .I The result of applying a right elementary operation is equivalent to multiplying the matrix A(2) on the right by the corresponding matrix T. Therefore each left (right) elementary operation has an inverse operation which is also a left (right) elementary operation. left and right) elementary operations.. coincide with S' and S"' and that T" coincides with S" when the indices i and j are interchanged in these matrices. 2) rightequivalent. 3) left and right elementary operations.3 DEFINITION 2: Two polynomial matrices A(A) and B(A) are called 1) leftequivalent.132 VI. T"') will be called elementary matrices.. S".. as do the right elementary operations. .. . respectively... then A(X) can. .. . be obtained from B(X) by means of elementary operations of the same type. rightequivalent. T". Note that T' and T.4 3 It follows from this that if a matrix B(X) is obtained from A(a) by means of left (right. . . what is the same. conversely. 2) right elementary.1 I . The matrices of type S'.. 3) equivalent if one of them can be obtained from the other by means of 1) leftelementary..0 (i) 1 ...The determinant of every elementary matrix does not depend on A and is different from zero. T'.. . .. or simply equivalent.. .0 (9) 0..
. We consider a system of m linear homogeneous differential equations of order l with constant coefficients... S.. . . we write (2) in the form (3) B (A) = P (A) A (A). = 0 (4) aml (D) x1 5 I. where P(A) and Q (A) are polynomial square matrices with constant nonzero determinants. (D) x. independent of A. Therefore (3) is equivalent to (2) and signifies left equivalence of the matrices A (A) and B (A). where P(A). (2) Denoting the product S. DEFINITION 2': Two rectangular Amatrices A (A) and B(A) are called 1) leftequivalent. Here again. +am2(D)x2+.. where x1. ELEMENTARY TRANSFORMATIONS OF A POLYNOMIAL MATRIX 133 Let B (A) be obtained from A(A) by means of the left elementary operations corresponding to S. . x2. Definition 2 can be replaced by an equivalent definition.. has a constants nonzero determinant.Sp_1 . (3") Thus. 2. In the case of right equivalence of the polynomial matrices A(A) and B (A) we shall have instead of (3) the equation B (A) = A (A) Q (A) (3') and in the case of (twosided) equivalence the equation B(A)=P(A)A(A)Q(A). Sl by P(A). Then B (A) = SpSp_1 . S2. Si.. x are n unknown functions of the independent variable t : all (D) x1 + a12 (D) x2 + .. 3) B(A) =P(A)A(A)Q(A). In the next section we shall prove that every square Amatrix P(A) with a constant nonzero determinant can be represented in the form of a product of elementary matrices.. + a2R. ... 3) equivalent if 1) B(A) =P(A)A(A). .§ 1... +amn(D) x. 2) rightequivalent. S1A (A)... 2) B(A) =A(A)Q(A)..e. respectively. 82i . P(A) and Q(A) are matrices with nonzero determinants...=0. like each of the matrices S. independent of X.. All the concepts introduced above are illustrated in the following important example. + aln (D) x = 0 a21 (D) x1 + a22 (D) x2 + ..
. k =1.. the two systems of equations are equivalent . signifies an interchange of the ith and jth equation. The left elementary operation 2. The matrix of operator coefficients A (D) = II a. we denote quotient and remainder by Q4. (A) by all (. . Canonical Form of a 1Matrix 1. Among them we choose a polynomial of least degree and by a permutation of the rows we make it into the element a (1). 2. m. conversely. 2. To begin with. 2.m): 6 Here it is assumed that the unknown functions x.6 It is not difficult in this example to interpret the right elementary operations as well. The left elementary operation 3. Clearly. § 2. x are such that their derivatives of all orders. .. Let us assume that the first column of A(A) contains elements not identically equal to zero. Since. .134 VI.l) .) (i =1.... xj = x4).. signifies the termbyterm addition to the ith equation of the jth equation which has previously been subjected to the differential operator b (D).. (i. n is a polynomial matrix. we shall examine what comparatively simple form we can obtain for a rectangular polynomial matrix A (A) by means of left elementary operations only.. (A) and r4i(A) (i =2.. by the same reasoning. the third signifies the interchange of the terms in the equations that contain x4 and x. as far as they occur in the transformations.. . k =1. Then we divide at.e. we obtain a deduced system of equations... n) D= is a polynomial in D with constant coefficients.. .. x{ = xi. m. two systems of equations with leftequivalent matrices B(D) and B(D) have the same solutions. 2.. EQUIVALENT TRANSFORMATIONS OF POLYNOMIAL MATRICES here a4x (D) = aik)D1 + a . exist. . the left elementary operation 1. The first of them signifies the introduction of a new unknown function x{ = x4 for the unknown function x4i the second signifies the introduction of a new unknown function xi = xj + b (D) x{ (instead of xj) . x. on the matrix A(D) signifies termbyterm multiplication of the ith differential equation of the system by the number c 0. . With this restriction. Thus..... if we replace in (4) the matrix A(D) of operator coefficients by a leftequivalent matrix B(D). . d is the differential operator.. (D) II (i=1. or Dmatrix. the original system is a consequence of the new system.
.e. where the polynomials blk(A). 3 (A). 0 (k = 2. achieving a32 (A) _ .. 0 0 bn(A). . the degree of the polynomial a (A) is reduced. b.. at this stage all the elements a2... 0 0 bm (A) 0 . then by left elementary operations of the second type we make the degrees of the elements b. 3. etc.k(A). (A) .. b..... We have established the following theorem : THEOREM 1: An arbitrary rectangular polynomial matrix of dimension m X n can always be brought into the form (5) by means of left elementary operations. we finally reduce the matrix A (A) to the following form : b11 (A) b12 (A) b11 (A) b12 (A) . . .. (A).. . .. . ...b2m(A). and are all identically equal to zero if bkk(A) = eonst. (A) (mSn) 0 0 .. we prove THEOREM 2: An arbitrary rectangular polynomial matrix of dimension m X n can always be brought into the form .. Similarly..§ 2.. . Next we take the element a22 (A) and apply the same procedure to the rows numbered 2.. 0 (5) b.. . As the result of all these operations. . b23 (. ... (A) 0 .. min (m. this must come to an end at some stagei. . Now we subtract from the ith row the first row multiplied by qil (A) (i = 2. m). then b12(1) becomes identically equal to zero).. .. blm (A) .b2(A) o . . (A). . . n) ). . bt (A) b2n (A) . m. . bk_1. If not all the remainders ri... Now we repeat this process.. 0 II (M ?n) If the polynomial b22(A) is not identically equal to zero... . a3. then we choose one of them that is not equal to zero and is of least degree and put it into the place of all (2) by a permutation of the rows.l) less than the degree of b (A) without changing the elements b12(A). (1) turn out to be identically equal to zero. (A) are identically equal to zero.. In the same way. bl..k (d) are of degree less than that of bkk(2)... =a. .. . b22 (A) ... provided bkk(A) 0. CANONICAL FORM OF A AMATRIX ail (A) = a11(A) qti (A) + r1 (A) 135 (i=2.. . .. then by applying a left elementary operation of the second type we can make the degree of the element b12(A) less than the degree of b22(A) (if b22(A) is of degree zero. a. if bss (A) .2(2)=O. 3.Continuing still further. 0.. m)... Since the degree of the polynomial all (A) is finite..
n)). and all are identically equal to zero if ck. Ck2(A). But then. C. provided 0 Ckk(A) 0. . .e. 0 0 . 0 0_0 0 . . From Theorems 1 and 2 we deduce the corollary : COROLLARY : If the determinant of a square polynomial matrix P(A) does not depend on 1.... (m <_ n) Cml (A) Cm2 (A) . . does not depend on A and is different from 0.. Since in the application of elementary operations to a square polynomial matrix the determinant of the matrix is only multiplied by constant nonzero factors. (A) (m ? n) by means of right elementary operations. b. .. S2. b11(1) b22 (A) . like that of P(1).. . . ...136 VI. . conversely. by left elementary operations. .+n W== const. b2n (A) 0 0 .. But then.. .. .. bin (A) (7) 0 b_2 (A) . can (A) (6) Cm2 (A) .. n) . . where the polynomials Ckl (A).. .. # 0.. .. also by Theorem 1... ce. . .. S1E = SPSP_1 .(. . Hence bkk (A) = const.... 2... 0 .min (m. 0 0 C22 (A) . . 2.. . where n is the order of P(A). Sl .. the determinant of the matrix (7)... (k=2. .. and is different from zero.. . . . 0 (k =1. . Cnl (A) Cml (A) Cn2 (A) .. bnn (A).k_1(2) are of degree less than that of Ckk(M).. i. Cmm (A) . . the matrix (7) has the diagonal form l bAk Iii and can therefore be reduced to the unit matrix E by means of left elementary operations of type 1... . EQUIVALENT TRANSFORMATIONS OF POLYNOMIAL MATRICES C31 0 . . .l) =eonst. .... then the matrix can be represented in the form of a product of a finite number of elementary matrices. 0 ell C21 0 . . . So. . . C21 (A) C22 (A) ... the unit matrix E can be transformed into P(2) by means of the left elementary operations whose matrices are S1.. For by Theorem 1 the matrix P (A) can be brought into the form b11 (A) b12 (A) .3.. Therefore P (A) = SPSP1 .
§ 2.(A) are identically equal to zero.. ... .. If at least one of the remainders rs1(A).. (4') (D) x.m.+1 . + bu (D) x.. .b2n (D) xu . amt 0 a22 (A) . for example r1k(A).. n).... however.+1(D) x.... r1.. Let us return to our example of the system of differential equations (4). m) and from the kth column the first multiplied by g1k(A) (k = 2.. Among all the elements aik(A) of A(A) that are not identically equal to zero we choose one which has the least degree in A and by suitable permutations of the rows and columns we make this element into a (A) .. .. We apply Theorem 1 to the matrix 11 aik(D) 11 of operator coefficients. rfnI(2) ....+l . .3. . m.. In this system we may choose the functions x. (A) .. a2u (A) 0 . r12 (A).. .. x1 can be determined successively .+1 (D) x.. ... ... =... from this corollary there follows the equivalence of the two Definitions 2 and 2' of equivalence of polynomial matrices. 135. at each stage of this process only one differen tial equation with one unknown function has to be integrated._1.b2.... r1k(A) (i2... n).. CANONICAL FORM OF A AMATRIX 137 As we pointed out on p. is not identically equal to zero. We now pass on to establishing the `canonical' form into which a rectangular matrix A (A) can be brought by applying to it both left and right elementary operations..b. a1k (A) =a11(A) q1k (2) + rlk (2) (i=2... n). = . which is of smaller degree than a11(A). k = 2.. . . b.. (D) x. the system (4) is then replaced by an equivalent system bu (D) x1 + b12 (D) x2 + .. .. But if all the remainders r21 (A). (1) I .3. .. 3... 4. x arbitrarily.b.+1. we reduce our polynomial matrix to the form a11(A) II 0 0 ... . (D) xu where s =min (m.... .n). .+1.. k=2. b(D) x2 + .. _ . As we have shown on p. .. . then by subtracting from the kth column the first column multiplied by qlk(A).. x.b1.+1 (D) x. _ bin (D) xu ... we replace aak(A) by the remainder r1k(A). a. Then we can again reduce the degree of the element in the top left corner of the matrix by putting in its place an element of smaller degree in 2.. + b1. then by subtracting from the ith row the first multiplied by g11(A) (i = 2. Then we find the quotients and remainders of the polynomials all (A) and a1 k (A) on division by a11(A) : a1 (A) =a11(A) q:1(A) + r:1(A) . after which the functions x. 133.
3(2) . Since the original element aI I (A) had a definite degree and since the process of reducing this degree cannot be continued indefinitely. a.... a2 (A). 0 .011 0 where the polynomials aI (A). a2(2)....... (1) where a2 (A) is divisible without remainder by al (A) and all the polynomials c{k(A) are divisible without remainder by a2(A). k = 2. a2 (A) .... . Continuing the process further. then continuing the same reduction process on the rows numbered 2... (A) are equal to 1. . (A) 0 . a. ...I (A) by a polynomial of smaller degree. By multiplying the first s rows by suitable nonzero numerical factors.. .. Co. ... m.. a.. we can arrange that the highest coefficients of the polynomials aI (A). . . we finally arrive at a matrix of the form ai (.. 0 0 0 0 0 0 033 (A) ... .. 0 0 II . m and the columns 2. n) is not divisible without remainder by a. after a finite number of elementary operations. ..... n) ) are not identically equal to zero and each is divisible by the preceding one. 0 11 0 0 b22 (1) . (A) in which all the elements bsk (A) are divisible without remainder by aI (A). If among these elements b{k(A) there is one not identically equal to zero.I (A). . we reduce the matrix (8) to the form a.. n. . . then by adding to the first column that column which contains such an element we arrive at the preceding case and can therefore again replace the element a.. EQUIVALENT TRANSFORMATIONS OF POLYNOMIAL MATRICES If at least one of the elements aak(A) (i = 2.138 VI. . 0 0 0 0 0 0 0 . (A) c.. . .. ... b2+... . we must.0 0... obtain a matrix of the form 11 ai (A) 0 ... (1) (8) bm2 (1) b. (A) 0 0 a2 (A) 0 0 . 0 0 ..t) 0 0 ..... e.(A) (s < min (m. 0 0 0.
a. .(. Dr_j (A).(A) defined by (10) are called the invariant polynomials of the rectangular matrix A(A). . we have proved that : An arbitrary rectangular polynomial matrix A(A) is equivalent to a canonical diagonal matrix. 3. .._. Then they are obtained from one another by means of elementary operations. and hence D...§ 3. . 2. . But an easy verification shows immediately that the elementary operations We take the highest coefficient in D. Invariant Polynomials and Elementary Divisors of a Polynomial Matrix 1. . .. . (a). it is assumed that the highest coefficients of all the polynomials al(l). a2(1). i.. (A). § 3. (a) to be I (j =1.. INVARIANT POLYNOMIALS AND ELEMENTARY DIVISORS 139 DEFINITION 3: A rectangular polynomial matrix is called a canonical diagonal matrix if it is of the form (9). a2(A).. i2(A). Thus.. but all the minors of order greater than r are identically equal to zero in A. a. We introduce the concept of invariant polynomials of a 1matrix A(A)..(A) is divisible by the preceding. r).(A) are not identically equal to zero and 2) each of the polynomials a2(A). . i2 (2). 2. the matrix has minors of order r not identically equal to zero.. . ir(A)Do(A) =D1(i1). a.. where 1) the polynomials ai(l). ..e. then every term in the decomposition is divisible by D. d.._.. Let A(A) be a polynomial matrix of rank r. and we shall set up formulas that connect these polynomials with the elements of A (A) .. .. r).. The term `invariant polynomial' is explained by the following arguments.. .... . Moreover. (10) DEFINITION 4: The polynomials i.. We denote by DM(A) the greatest common divisor of all the minors of order j in A (A) (j = 1. . . D0(A)=1 each polynomial is divisible by the preceding ones The corresponding quotients will be denoted by it (A). Let A(A) and B(A) be two equivalent polynomial matrices.. (a) (j = 2. a2(1).(a) . i. r). therefore every minor of order j. . s If we apply the B4zout decomposition with respect to the elements of any row to an arbitrary minor of order j. D1(2). In the next section we shall prove that: The polynomials al(l)..' Then it is easy to see that in the series Dr (A). .) are uniquely determined by the given matrix A(A) .) are equal to 1.. (A) : $l(A)=Drr1(2)' i2(A) Dr_1(2)' . is divisible by D. a. ..().
we obtain for an arbitrary minor of B (A) the expression B 71 72... (2) and Dp (2) (p = 1.. 2. r) are 1... i2(A)..(A) defined by (10) unchanged. . <.. D2(A). P. i.. . . Moreover.. Thus.. Di (A) =D1(A). (A).. is (A) = ar_1(A). ar (A) .. Dr(A).. by (10). Hence it follows that all the minors of order r or greater of the matrix B(A) are zero. ir(A) remain invariant on transition from one matrix to another equivalent one. Hence9 r = r'. Therefore r:5 r* and D9(A) is divisible by D P (A) (p = 1. fp' Q #1 Y2 .. But the matrices A(A) and B(A) can exchange roles. it (A) = a1(A) . then it is easy to see that for this matrix D1(A) = at (A).(2).. .. ..140 VI.. . 2. n) ).. . ip a 1. . . D2 (A). . i2(A). it follows from the same formula that D. the polynomials i1(A). . For when we apply to the identity (3") the formula that expresses a minor of a product of matrices by the minors of the factors (see p. 9 The highest coefficients in D.. min (m. . because it is equivalent to (9). Dr (A) =Dr (A) . <ppsm 1 as < «2< . they also leave the polynomials it (A). i2(A).D2(2)a1(2)a2(2).. EQUIVALENT TRANSFORMATIONS OF POLYNOMIAL MATRICES change neither the rank of A(A) nor the polynomials D1(2). . ar(A) coincide with the invariant polynomials i1 (A) =a..... is divisible by Dp (A) (p = 1. jr k2 k1 . . <d.. . . 2. a2(A).... 2. . . n)). kp P i1 92 . min (m.. (11) Here ii (A).. . . ir(A) are at the same time the invariant polynomials of the original matrix A (A). But then.. 12).. ... < ap (al a2 . ap Q f l P2 . so that we have for the rank r* of B (A) : r* < r. D2 (A) =D2 (A). . . Since elementary operations do not change the polynomials D1 (A). Dr(j)=a1(A)a2(A) .. n) ).(A)... min (m. the diagonal polynomials in (9) a1(A)... fl k1 k 2 kp (p=1.. ap lA a1 a2 ... The results obtained can be stated in the form of the following theorem.. .. . D. the greatest common divisor of all the minors of order p of B (A). If the polynomial matrix has the canonical diagonal form (9).
It does follow from the fact that the Dolynomials it (A).. . COROLLARY 2: In the sequence of invariant polynomials ii (R)= D._i (A) . THEOREM 4: If in a quasidiagonal rectangular matrix A (2) 0 0 B(2) every invariant polynomial of A (A) divides every invariant polynomial of B(1). INVARIANT POLYNOMIALS AND ELEMENTARY DIVISORS 141 THEOREM 3: The rectangular polynomial matrix A (A) is always equivalent to a canonical diagonal matrix i.0 (12) . i2(A). i2(A). then the set of invariant polynomials of C(1) is the union of the invariant polynomials of A (A) and B(A).. This statement does not follow immediately from (13). ....(2) coincide with the polynomials a... 0 0 0 0 0 0...... The necessity follows from the fact that two polynomial matrices having the same invariant polynomials are equivalent to one and the same canonical diagonal matrix and... . r must here be the rank of A(A) and ii(1).. to each other. . COROLLARY 1: Two rectangular matrices of the same dimension A(A) and B(1) are equivalent if and only if they have the same invariant polynomials.0 0 ... therefore.. (1). . 2.. ai (A) of the canonical diagonal matrix (9). 0 0 .§ 3..0 Moreover.. ii (2) 0 .. . 0 0 ._1(1). iT(A) the invariant polynomials of A(A) defined by (10).. We now indicate a method of computing the invariant polynomials of a quasidiagonal Amatrix if the invariant polynomials of the matrices in the diagonal blocks are known.._2 ). ... Thus: The invariant polynomials form a complete system of invariants of a Amatrix. i.... The sufficiency of the condition was explained above. Du (A) (Do (2)=1) (13) every polynomial from the second onwards divides the preceding one.. a... 0 0...(2) 0 0 0 0 0 . 'e...ri (A) D.
We now indicate a method of computing the invariant polynomials of a quasi-diagonal λ-matrix if the invariant polynomials of the matrices in the diagonal blocks are known.

THEOREM 4: If in the quasi-diagonal rectangular matrix

C(λ) = [ A(λ)  0 ]
       [ 0   B(λ) ]

every invariant polynomial of A(λ) divides every invariant polynomial of B(λ), then the set of invariant polynomials of C(λ) is the union of the invariant polynomials of A(λ) and B(λ).

Proof. We denote by i′1(λ), i′2(λ), …, i′p(λ) and i″1(λ), i″2(λ), …, i″q(λ), respectively, the invariant polynomials of the λ-matrices A(λ) and B(λ). Then10

A(λ) ~ {i′1(λ), …, i′p(λ), 0, …, 0}, B(λ) ~ {i″1(λ), …, i″q(λ), 0, …, 0},

and therefore

C(λ) ~ {i″1(λ), …, i″q(λ), i′1(λ), …, i′p(λ), 0, …, 0}.  (14)

By the divisibility assumption, the λ-matrix on the right-hand side of this relation is of canonical diagonal form. By Theorem 3, the diagonal elements of this matrix that are not identically equal to zero then form a complete system of invariant polynomials of the polynomial matrix C(λ). This proves the theorem.

In order to determine the invariant polynomials of C(λ) in the general case of arbitrary invariant polynomials of A(λ) and B(λ), we make use of the important concept of elementary divisors. We decompose the invariant polynomials i1(λ), i2(λ), …, ir(λ) of a λ-matrix A(λ) into irreducible factors over the given number field F:

i1(λ) = [φ1(λ)]^{c1}[φ2(λ)]^{c2} … [φs(λ)]^{cs},
i2(λ) = [φ1(λ)]^{d1}[φ2(λ)]^{d2} … [φs(λ)]^{ds},
… … …
ir(λ) = [φ1(λ)]^{l1}[φ2(λ)]^{l2} … [φs(λ)]^{ls}.  (15)

Here φ1(λ), φ2(λ), …, φs(λ) are all the distinct factors irreducible over F (and with highest coefficient 1) that occur in i1(λ).11

DEFINITION 5: All the powers [φ1(λ)]^{c1}, …, [φs(λ)]^{cs}, [φ1(λ)]^{d1}, …, among (15), as far as they are distinct from 1, are called the elementary divisors of the matrix A(λ) in the field F.12

THEOREM 5: The set of elementary divisors of the rectangular quasi-diagonal matrix

C(λ) = [ A(λ)  0 ]
       [ 0   B(λ) ]

is always obtained by combining the elementary divisors of A(λ) with those of B(λ).

10 The symbol ~ denotes here the equivalence of matrices.
11 Some of the exponents c_k, d_k, …, l_k (k = 1, 2, …, s) may be equal to zero.
12 The formulas (15) enable us to define not only the elementary divisors of A(λ) in the field F in terms of the invariant polynomials, but also, conversely, the invariant polynomials in terms of the elementary divisors.
Proof. We decompose the invariant polynomials of A(λ) and B(λ) into irreducible factors over F:13

i′1(λ) = [φ1(λ)]^{c1}(*), i′2(λ) = [φ1(λ)]^{c2}(*), …;
i″1(λ) = [φ1(λ)]^{k1}(*), i″2(λ) = [φ1(λ)]^{k2}(*), …,

where (*) denotes polynomials prime to φ1(λ). We denote by

l1 ≥ l2 ≥ …  (16)

all the non-zero numbers among c1, c2, …, k1, k2, …, arranged in decreasing order. Then the matrix C(λ) is equivalent to the matrix (14), and by a permutation of rows and of columns the latter can be brought into 'diagonal' form

{[φ1(λ)]^{l1}(*), [φ1(λ)]^{l2}(*), …, (**), …, (**)},  (17)

where we have denoted by (*) polynomials that are prime to φ1(λ) and by (**) polynomials that are either prime to φ1(λ) or identically equal to zero. From the form of the matrix (17) we deduce immediately the following decomposition of the polynomials D1(λ), D2(λ), … of the matrix C(λ): the exponent of φ1(λ) in the greatest common divisor of the minors of a given order is the sum of the smallest of the numbers (16), so that the successive quotients carry the exponents l1, l2, … in turn. Hence it follows that

[φ1(λ)]^{l1}, [φ1(λ)]^{l2}, …,

as far as they are distinct from 1, are elementary divisors of C(λ), and that these are all the elementary divisors of C(λ) that are powers of φ1(λ). The elementary divisors of C(λ) that are powers of φ2(λ) are determined similarly, etc. This completes the proof of the theorem.

Note. The theory of equivalence for integral matrices (i.e., matrices whose elements are integers) can be constructed along similar lines. Here, in the elementary operations, b(λ) is to be replaced by an integer and c by ±1, and in (3), (3′), (3″), in place of P(λ) and Q(λ), there are integral matrices with determinants equal to ±1 (see pp. 130-131).

13 If any irreducible polynomial φ_k(λ) occurs as a factor in some invariant polynomials, but not in others, then in the latter we write φ_k(λ) with a zero exponent.
3. Suppose now that A = ‖a_ik‖₁ⁿ is a matrix with elements in the field F. We form its characteristic matrix

λE − A =
[ λ − a11   −a12   …   −a1n ]
[ −a21   λ − a22   …   −a2n ]
[ …        …       …   …    ]
[ −an1    −an2    …  λ − ann ].  (18)

The characteristic matrix is a λ-matrix of rank n. Its invariant polynomials

i1(λ) = Dn(λ)/D_{n−1}(λ), i2(λ) = D_{n−1}(λ)/D_{n−2}(λ), …, i_n(λ) = D1(λ)/D0(λ)  (D0(λ) = 1)  (19)

are called the invariant polynomials of the matrix A, and the corresponding elementary divisors in F are called the elementary divisors of the matrix A in the field F.

A knowledge of the invariant polynomials (and, hence, of the elementary divisors) of A enables us to investigate its structure. The formulas (19) give an algorithm for computing these polynomials, but for large n this algorithm is very cumbrous. Therefore practical methods of computing the invariant polynomials of a matrix are of interest. Theorem 3 gives another method of computing invariant polynomials, based on the reduction of the characteristic matrix (18) to canonical diagonal form by means of elementary operations.

Example. For a numerical matrix A of order 4 the reduction runs as follows. In the characteristic matrix λE − A we add to the fourth row the third multiplied by λ; then we add to the first three columns the fourth, multiplied by suitable polynomials; next we add to the first column the second multiplied by λ − 3; continuing with elementary operations of the three types (and multiplying two of the rows by −1), we carry λE − A into the diagonal matrix

{1, 1, λ² − 2λ + 1, λ² − 2λ + 1}.

After permuting some rows and columns we obtain the canonical diagonal matrix

{1, 1, (λ − 1)², (λ − 1)²}.

The matrix has the two elementary divisors (λ − 1)² and (λ − 1)².
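A computer-algebra system removes the drudgery from such reductions. The following self-contained sketch (the example matrix and the code are ours, with sympy assumed) applies formula (19) to the characteristic matrix of a 2×2 numerical matrix: D1 is the monic gcd of all entries of λE − A and D2 is its determinant.

```python
# A small illustration of formula (19), assuming sympy.
from functools import reduce
import sympy as sp

lam = sp.symbols('lambda')
A = sp.Matrix([[1, 1], [0, 1]])
M = lam * sp.eye(2) - A

D1 = sp.monic(reduce(sp.gcd, [e for e in M if e != 0]), lam)  # gcd of 1x1 minors
D2 = sp.monic(sp.expand(M.det()), lam)                        # the only 2x2 minor
i1, i2 = sp.quo(D2, D1, lam), D1
print(sp.factor(i1), i2)        # (lambda - 1)**2  1
```

The single elementary divisor (λ − 1)² shows that this A is a 2×2 Jordan block rather than a matrix of simple structure.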
§4. Equivalence of Linear Binomials

1. In the preceding sections we have considered rectangular λ-matrices. In the present section we consider two square λ-matrices A(λ) and B(λ) of order n in which all the elements are of degree not higher than 1 in λ. These polynomial matrices may be represented in the form of matrix binomials:

A(λ) = A0λ + A1, B(λ) = B0λ + B1.

We shall assume that these binomials are of degree 1 and regular, i.e., that |A0| ≠ 0 and |B0| ≠ 0. The following theorem gives a criterion for the equivalence of such binomials:

THEOREM 6: If two regular binomials of the first degree A0λ + A1 and B0λ + B1 are equivalent, then they are strictly equivalent, i.e., in the identity

B0λ + B1 = P(λ)(A0λ + A1)Q(λ)  (20)

the matrices P(λ) and Q(λ), with constant non-zero determinants, can be replaced by constant non-singular matrices P and Q:14

B0λ + B1 = P(A0λ + A1)Q.  (21)

14 The identity (21) is equivalent to the two matrix equations: B0 = PA0Q and B1 = PA1Q.
Proof. Since the determinant of P(λ) does not depend on λ and is different from zero,15 the inverse matrix M(λ) = P⁻¹(λ) is also a polynomial matrix. With the help of this matrix we write (20) in the form

M(λ)(B0λ + B1) = (A0λ + A1)Q(λ).  (22)

Regarding M(λ) and Q(λ) as matrix polynomials, we divide M(λ) on the left by A0λ + A1 and Q(λ) on the right by B0λ + B1:

M(λ) = (A0λ + A1)S(λ) + M,  (23)
Q(λ) = T(λ)(B0λ + B1) + Q;  (24)

here M and Q are constant square matrices (independent of λ) of order n. We substitute these expressions for M(λ) and Q(λ) in (22); after a few small transformations, we obtain

(A0λ + A1)[T(λ) − S(λ)](B0λ + B1) = M(B0λ + B1) − (A0λ + A1)Q.  (25)

The difference in the brackets must be identically equal to zero, for otherwise the product on the left-hand side of (25) would be of degree ≥ 2, while the polynomial on the right-hand side of the equation is of degree not higher than 1. Therefore

S(λ) = T(λ),  (26)

and then we obtain from (25):

M(B0λ + B1) = (A0λ + A1)Q.  (27)

We shall now show that M is a non-singular matrix. For this purpose we divide P(λ) on the left by B0λ + B1:

P(λ) = (B0λ + B1)U(λ) + P.  (28)

From (22), (23), and (28) we deduce:

E = M(λ)P(λ) = M(λ)(B0λ + B1)U(λ) + M(λ)P
  = (A0λ + A1)Q(λ)U(λ) + (A0λ + A1)S(λ)P + MP
  = (A0λ + A1)[Q(λ)U(λ) + S(λ)P] + MP.  (29)

15 The equivalence of the binomials A0λ + A1 and B0λ + B1 means that an identity (20) exists in which |P(λ)| = const ≠ 0 and |Q(λ)| = const ≠ 0. However, in this case the last relations follow from (20) itself. For the determinants of regular binomials of the first degree are of degree n:

|B0λ + B1| = |P(λ)| |A0λ + A1| |Q(λ)|, |B0λ + B1| = |B0|λⁿ + …, |A0λ + A1| = |A0|λⁿ + …,

so that from |A0| ≠ 0 and |B0| ≠ 0 it follows that |P(λ)| = const ≠ 0 and |Q(λ)| = const ≠ 0.
Since the last term of this chain of equations must be of degree zero in λ (because it is equal to E), the expression in brackets must be identically equal to zero. But then from (29), MP = E, so that M is non-singular and M⁻¹ = P. Multiplying both sides of (27) on the left by P, we obtain:

B0λ + B1 = P(A0λ + A1)Q.  (30)

The fact that P is non-singular follows from (30), since this identity implies B0 = PA0Q and therefore |P| |A0| |Q| = |B0| ≠ 0. That Q is non-singular also follows directly from (21). This completes the proof of the theorem.

Note. From the proof it follows (see (24) and (28)) that the constant matrices P and Q by which we have replaced the λ-matrices P(λ) and Q(λ) in (20) can be taken as the left and right remainders, respectively, of P(λ) and Q(λ) on division by B0λ + B1.

§5. A Criterion for Similarity of Matrices

1. Let A = ‖a_ik‖₁ⁿ be a matrix with numerical elements from the field F. Its characteristic matrix λE − A is a λ-matrix of rank n and therefore has n invariant polynomials (see §3)

i1(λ), i2(λ), …, i_n(λ).

The following theorem shows that these invariant polynomials determine the original matrix A to within similarity transformations.

THEOREM 7: Two matrices A = ‖a_ik‖₁ⁿ and B = ‖b_ik‖₁ⁿ are similar (B = T⁻¹AT) if and only if they have the same invariant polynomials or, what is the same, the same elementary divisors in the field F.

Proof. The condition is necessary. For if the matrices A and B are similar, then there exists a non-singular matrix T such that

B = T⁻¹AT.

Hence

λE − B = T⁻¹(λE − A)T.

This equation shows that the characteristic matrices λE − A and λE − B are equivalent and therefore have the same invariant polynomials.
B. in consequence. where P(A) and Q(A) are polynomial matrices in the identity AE . P and Q may be taken (see the Note on p.e. we obtain: B = PAQ. E = PQ. Note.B. 147) as the left remainder and the right remainder. . in (35) Q(B) denotes the right value of the matrix polynomial Q(A). If A = II a. respectively. and P(B) the left value of P(2). We have incidentally established the following result. Suppose that the characteristic matrices AE . we may replace in (31) the Amatrices P (A) and Q (A) by constant matrices AEB=P(AEA)Q. B=T'AT. 16 We recall that P(B) is the left value of the polynomial P(X) and Q(B) the right value of Q(a). Then these Amatrices are equivalent (see Corollary 1 to Theorem 3) and there exist. i. when the argument is replaced by B. by the Generalized Bezout Theorem. i. 2. (32) moreover..k II 1 and B = II bik II i are two (34) B = T1 AT.e.A) Q (A) (35) which connects the equivalent characteristic matrices 2E .A and 1E . similar matrices. of P(1) and Q(A) on division by AE . then we can choose as the transforming matrix T the matrix T = Q (B) _ [P (B)]1 . two polynomial matrices P(A) and Q(A) such that AEB=P(A)(AEA)Q(A).. which we state separately : SUPPLEMENT TO THEOREM 7.B have the same invariant polynomials. T=Q=P1 where This proves the theorem. we may set :'s P = P (B). 81). when a is replaced by B (see p. Pand Q: (31) Applying Theorem 6 to the matrix binomials AE .A and AE . EQUIVALENT TRANSFORMATIONS OF POLYNOMIAL MATRICES The condition is sufficient.A and AE . Q =Q (B) (33) Equating coefficients of the powers of A on both sides of (32).B.B = P(A) (AE .148 VI.
§6. The Normal Forms of a Matrix

1. Let

g(λ) = λ^m + a1λ^{m−1} + … + a_{m−1}λ + a_m

be a polynomial with coefficients in F. We consider the square matrix of order m

L =
[ 0    1    0    …    0  ]
[ 0    0    1    …    0  ]
[ …    …    …    …    …  ]
[ 0    0    0    …    1  ]
[ −a_m −a_{m−1} −a_{m−2} … −a1 ].  (36)

We shall call L the companion matrix of the polynomial g(λ). It is not difficult to verify that g(λ) is the characteristic polynomial of L:

|λE − L| =
| λ    −1    0    …    0    |
| 0     λ   −1    …    0    |
| …     …    …    …    …    |
| a_m  a_{m−1}  a_{m−2}  …  λ + a1 | = g(λ).

On the other hand, the minor of the element a_m in the characteristic determinant is equal to ±1. Therefore D_{m−1}(λ) = 1, and hence

i1(λ) = D_m(λ)/D_{m−1}(λ) = D_m(λ) = g(λ), i2(λ) = … = i_m(λ) = 1,

i.e., L has a single invariant polynomial different from 1, namely g(λ).
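As a quick sanity check of (36), one can build the companion matrix of a given monic polynomial and confirm that its characteristic polynomial reproduces g(λ). The helper below is our own and sympy is assumed.

```python
# A minimal sketch, assuming sympy: the companion matrix (36), with
# superdiagonal 1's and last row (-a_m, ..., -a_1).
import sympy as sp

lam = sp.symbols('lambda')

def companion(coeffs):
    """coeffs = [a1, ..., am] for g = lam**m + a1*lam**(m-1) + ... + am."""
    m = len(coeffs)
    L = sp.zeros(m, m)
    for i in range(m - 1):
        L[i, i + 1] = 1                      # superdiagonal 1's
    for j, a in enumerate(reversed(coeffs)):
        L[m - 1, j] = -a                     # last row: -a_m, ..., -a_1
    return L

L = companion([0, -3, -2])                   # g = lam**3 - 3*lam - 2
print(sp.factor(L.charpoly(lam).as_expr()))  # (lambda - 2)*(lambda + 1)**2
```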
2. Let A = ‖a_ik‖₁ⁿ be a matrix with the invariant polynomials

i1(λ), i2(λ), …, i_t(λ), i_{t+1}(λ) = … = i_n(λ) = 1.  (37)

Here the polynomials i1(λ), …, i_t(λ) have positive degrees and, from the second onwards, each divides the preceding one. We denote the companion matrices of these polynomials by L1, L2, …, L_t. Then the quasi-diagonal matrix of order n

L_I = {L1, L2, …, L_t}  (38)

has the polynomials (37) as its invariant polynomials (see Theorem 4 on p. 141). Since the matrices A and L_I have the same invariant polynomials, they are similar, i.e., there always exists a non-singular matrix U (|U| ≠ 0) such that

A = U L_I U⁻¹.  (I)

The matrix L_I is called the first natural normal form of the matrix A. This normal form is characterized by: 1) the quasi-diagonal form (38); 2) the special structure of the diagonal blocks (36); and 3) the additional condition: in the sequence of characteristic polynomials of the diagonal blocks every polynomial from the second onwards divides the preceding one.17

2. We now denote by

χ1(λ), χ2(λ), …, χ_u(λ)  (39)

the elementary divisors of A = ‖a_ik‖₁ⁿ in the number field F. The corresponding companion matrices will be denoted by L^{(1)}, L^{(2)}, …, L^{(u)}. Since χ_j(λ) is the only elementary divisor of L^{(j)} (j = 1, 2, …, u),18 the quasi-diagonal matrix

L_II = {L^{(1)}, L^{(2)}, …, L^{(u)}}  (40)

has, by Theorem 5, the polynomials (39) as its elementary divisors. The matrices A and L_II have the same elementary divisors in F. Therefore the matrices are similar, i.e., there always exists a non-singular matrix V (|V| ≠ 0) such that

A = V L_II V⁻¹.  (II)

The matrix L_II is called the second natural normal form of the matrix A. This normal form is characterized by: 1) the quasi-diagonal form (40); 2) the special structure of the diagonal blocks (36); and 3) the additional condition: the characteristic polynomial of each diagonal block is a power of a polynomial irreducible over F.

Note. The elementary divisors of a matrix A, in contrast to the invariant polynomials, are essentially connected with the given number field F. If we choose instead of the original field F another number field (which also contains the elements of the given matrix A), then the elementary divisors may change. Together with the elementary divisors, the second natural normal form of a matrix also changes.

17 From the conditions 1), 2), and 3) it follows automatically that the characteristic polynomials of the diagonal blocks in L_I are the invariant polynomials of the matrix L_I and, hence, of A.
18 χ_j(λ) is the only invariant polynomial of L^{(j)} and is at the same time a power of a polynomial irreducible over F.
Suppose, for example, that A = ‖a_ik‖₁ⁿ is a matrix with real elements. The characteristic polynomial of the matrix then has real coefficients. But this polynomial may have complex roots. If F is the field of complex numbers, then every elementary divisor has the form (λ − λ0)^p. If F is the field of real numbers, then among the elementary divisors there may also be powers of irreducible quadratic trinomials with real coefficients.

3. Let us assume now that the number field F contains not only the elements of A, but also the characteristic values of the matrix.19 Then the elementary divisors of A have the form20

(λ − λ1)^{p1}, (λ − λ2)^{p2}, …, (λ − λu)^{pu}  (p1 + p2 + … + pu = n).  (41)

We consider one of these elementary divisors,

(λ − λ0)^p,

and associate with it the following matrix of order p:

[ λ0   1    0   …   0  ]
[ 0   λ0    1   …   0  ]
[ …    …    …   …   …  ]
[ 0    0    0   …   1  ]
[ 0    0    0   …  λ0 ]  = λ0E^{(p)} + H^{(p)},  (42)

where H^{(p)} denotes the matrix of order p with 1's along the first superdiagonal and 0's elsewhere. It is easy to verify that this matrix has only the one elementary divisor (λ − λ0)^p. The matrix (42) will be called the Jordan block corresponding to the elementary divisor (λ − λ0)^p. The Jordan blocks corresponding to the elementary divisors (41) will be denoted by J1, J2, …, Ju. Then the quasi-diagonal matrix

J = {J1, J2, …, Ju}

has the powers (41) as its elementary divisors. The matrix J can also be written in the form

J = {λ1E1 + H1, λ2E2 + H2, …, λuEu + Hu},

where E_k = E^{(p_k)}, H_k = H^{(p_k)} (k = 1, 2, …, u).

19 This always holds for an arbitrary matrix A if F is the field of complex numbers.
20 Among the numbers λ1, λ2, …, λu there may be some that are equal.
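The assertion that (42) has the single elementary divisor (λ − λ0)^p can be verified along the lines of the companion-matrix argument: the minor of λE − J obtained by deleting the last row and the first column is ±1, so D_{p−1}(λ) = 1. The following check (our own construction, sympy assumed) carries this out for p = 4.

```python
# A small check, assuming sympy: the Jordan block (42) of order 4 has
# characteristic determinant (lambda - lambda0)**4, while one minor of
# order 3 of lambda*E - J equals -1, so D_3(lambda) = 1.
import sympy as sp

lam, lam0 = sp.symbols('lambda lambda0')
p = 4
H = sp.Matrix(p, p, lambda i, j: 1 if j == i + 1 else 0)   # H^(p)
J = lam0 * sp.eye(p) + H                                   # Jordan block (42)
charM = lam * sp.eye(p) - J

print(sp.factor(charM.det()))      # (lambda - lambda0)**4
print(charM[:-1, 1:].det())        # -1: delete last row and first column
```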
Since the matrices A and J have the same elementary divisors, they are similar, i.e., there exists a non-singular matrix T (|T| ≠ 0) such that

A = TJT⁻¹ = T{λ1E1 + H1, λ2E2 + H2, …, λuEu + Hu}T⁻¹.  (III)

The matrix J is called the Jordan normal form or simply the Jordan form of A.22 The Jordan normal form is characterized by its quasi-diagonal form and by the special structure (42) of the diagonal blocks. The following scheme describes the Jordan matrix J for the elementary divisors (λ − λ1)², (λ − λ2)³, λ − λ3, (λ − λ4)²:

J = { [ λ1  1 ]   [ λ2  1   0 ]            [ λ4  1 ] }
    { [ 0  λ1 ],  [ 0   λ2  1 ],  [ λ3 ],  [ 0  λ4 ] }.  (43)
    {             [ 0   0  λ2 ]                      }

If (and only if) all the elementary divisors of a matrix A are of the first degree,21 the Jordan form is a diagonal matrix, and in this case we have:

A = T{λ1, λ2, …, λn}T⁻¹.

A matrix A has simple structure (see Chapter III, §8) if and only if all its elementary divisors are of the first degree.

Instead of the Jordan block (42), sometimes the 'lower' Jordan block of order p is used:

[ λ0   0   …   0    0  ]
[ 1   λ0   …   0    0  ]
[ …    …   …   …    …  ]
[ 0    0   …   1   λ0 ]  = λ0E^{(p)} + F^{(p)},

where F^{(p)} has 1's along the first subdiagonal and 0's elsewhere. This matrix also has the single elementary divisor (λ − λ0)^p. To the elementary divisors (41) there corresponds the lower Jordan matrix

21 The elementary divisors of degree 1 are often called 'linear' or 'simple' elementary divisors.
22 The matrix J is often called the upper Jordan matrix, in contrast to the lower Jordan matrix J^{(l)}.
J^{(l)} = {λ1E1 + F1, λ2E2 + F2, …, λuEu + Fu}  (E_k = E^{(p_k)}, F_k = F^{(p_k)}, k = 1, 2, …, u).

An arbitrary matrix A having the elementary divisors (41) is always similar to J^{(l)}:

A = T2{λ1E1 + F1, λ2E2 + F2, …, λuEu + Fu}T2⁻¹.  (IV)

We also note that for λ0 ≠ 0 each of the two matrices

λ0(E^{(p)} + H^{(p)}), λ0(E^{(p)} + F^{(p)})

has only the single elementary divisor (λ − λ0)^p. Therefore, for a non-singular matrix A having the elementary divisors (41), we have, apart from (III) and (IV), the representations

A = T3{λ1(E1 + H1), λ2(E2 + H2), …, λu(Eu + Hu)}T3⁻¹,  (V)
A = T4{λ1(E1 + F1), λ2(E2 + F2), …, λu(Eu + Fu)}T4⁻¹.  (VI)

§7. The Elementary Divisors of the Matrix f(A)

1. In this section we consider the following problem: Given the elementary divisors (in the field of complex numbers) of a matrix A = ‖a_ik‖₁ⁿ and given a function f(λ) defined on the spectrum of A, to determine the elementary divisors (in the field of complex numbers) of the matrix f(A).

The matrix f(A) does not alter if we replace the function f(λ) by a polynomial that assumes on the spectrum of A the same values as f(λ) (see Chapter V, §1). Without loss of generality we may therefore assume in what follows that f(λ) is a polynomial.

We denote by (λ − λ1)^{p1}, (λ − λ2)^{p2}, …, (λ − λu)^{pu} the elementary divisors of A.23 Thus A is similar to the Jordan matrix J:

A = TJT⁻¹,

and so

f(A) = T f(J) T⁻¹;

23 Among the numbers λ1, λ2, …, λu there may be some that are equal.
we shall from now on consider f(J) instead of f(A). Here

J = {J1, J2, …, Ju}, J_i = λ_iE^{(p_i)} + H^{(p_i)}  (i = 1, 2, …, u),  (45)
f(J) = {f(J1), f(J2), …, f(Ju)},  (46)

where (see Example 2 on p. 100)

f(J_i) = f(λ_iE^{(p_i)} + H^{(p_i)}) =
[ f(λ_i)  f′(λ_i)/1!  …  f^{(p_i−1)}(λ_i)/(p_i−1)! ]
[ 0       f(λ_i)      …  f^{(p_i−2)}(λ_i)/(p_i−2)! ]
[ …       …           …  …                          ]
[ 0       0           …  f(λ_i)                     ].

Since the similar matrices f(A) and f(J) have the same elementary divisors, the problem reduces to the quasi-diagonal matrix f(J).

2. Let us determine the defect24 d of f(A) or, what is the same, of f(J). The defect of a quasi-diagonal matrix is equal to the sum of the defects of the various diagonal blocks, and the defect of f(J_i) (see (46)) is equal to the smaller of the numbers k_i and p_i, where k_i is the multiplicity of λ_i as a root of f(λ),25 so that

f(λ_i) = f′(λ_i) = … = f^{(k_i−1)}(λ_i) = 0, f^{(k_i)}(λ_i) ≠ 0.

We have thus arrived at the following theorem:

THEOREM 8: The defect of the matrix f(A), where A has the elementary divisors

(λ − λ1)^{p1}, (λ − λ2)^{p2}, …, (λ − λu)^{pu},  (47)

is given by the formula

d = Σ_{i=1}^{u} min(k_i, p_i),  (48)

where k_i is the multiplicity of λ_i as a root of f(λ) (i = 1, 2, …, u).26

24 d = n − r, where r is the rank of f(A).
25 k_i may be equal to zero; in that case f(λ_i) ≠ 0.
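Theorem 8 is easy to test numerically. In the following sketch (the example is our own, and sympy is assumed) A is taken directly in Jordan form with the elementary divisors (λ − 1)³ and (λ − 2)², and f(λ) = (λ − 1)²(λ − 2); then k1 = 2, k2 = 1, and formula (48) predicts d = min(2, 3) + min(1, 2) = 3.

```python
# A numerical check of formula (48), assuming sympy; helper names ours.
import sympy as sp

def jordan_block(eig, size):
    return sp.Matrix(size, size, lambda i, j: eig if i == j
                     else (1 if j == i + 1 else 0))

A = sp.diag(jordan_block(1, 3), jordan_block(2, 2))
f = lambda M: (M - sp.eye(5))**2 * (M - 2 * sp.eye(5))
d = 5 - f(A).rank()                # defect = n - rank
print(d)                           # 3, as Theorem 8 predicts
```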
If the elementary divisors of a matrix are known, then the defect of the matrix is determined as the number of elementary divisors of the form λ^p, i.e., as the number of elementary divisors corresponding to the characteristic value 0.

As an application of this theorem we shall determine all the elementary divisors of an arbitrary matrix A = ‖a_ik‖₁ⁿ that correspond to a given characteristic value λ0. Suppose that, among these, g_j are of degree j (j = 1, 2, …, m), where g_i ≥ 0 (i = 1, 2, …, m − 1) and g_m > 0. In order to determine the numbers g1, g2, …, g_m, we compute the defects d1, d2, …, d_m of the matrices

(A − λ0E), (A − λ0E)², …, (A − λ0E)^m.

In order to determine the defect of (A − λ0E)^j we have, by Theorem 8, to set k_i = j in (48) for the elementary divisors corresponding to the characteristic value λ0 and k_i = 0 for all the other terms (j = 1, 2, …, m). Thus we obtain the formulas

d1 = g1 +  g2 +  g3 + … +  g_m,
d2 = g1 + 2g2 + 2g3 + … + 2g_m,
d3 = g1 + 2g2 + 3g3 + … + 3g_m,
… … …
d_m = g1 + 2g2 + 3g3 + … + m·g_m.  (49)

Hence27

g_j = 2d_j − d_{j−1} − d_{j+1}  (j = 1, 2, …, m; d0 = 0, d_{m+1} = d_m).  (50)

3. Let us return to the basic problem of determining the elementary divisors of f(A). As we have mentioned above, the elementary divisors of f(A) coincide with those of f(J), and the elementary divisors of a quasi-diagonal matrix coincide with those of the diagonal blocks (see Theorem 5). Therefore the problem reduces to finding the elementary divisors of a matrix C of regular triangular form:

26 In the general case, where f(λ) is not a polynomial, min(k_i, p_i) in (48) has to be interpreted as the number p_i if f(λ_i) = f′(λ_i) = … = f^{(p_i−1)}(λ_i) = 0, and as the number k_i if f(λ_i) = … = f^{(k_i−1)}(λ_i) = 0, f^{(k_i)}(λ_i) ≠ 0 (k_i < p_i).
27 The number m is characterized by the fact that d_{m−1} < d_m = d_{m+1} = d_{m+2} = … .
C =
[ a0  a1  a2  …  a_{p−1} ]
[ 0   a0  a1  …  a_{p−2} ]
[ …   …   …   …  …       ]
[ 0   0   0   …  a0      ],  (51)

i.e.,

C = a0E + a1H + a2H² + … + a_{p−1}H^{p−1}  (H = H^{(p)}).

The characteristic polynomial of C is obviously equal to

D_p(λ) = (λ − a0)^p.

Since D_{p−1}(λ) divides D_p(λ) without remainder, D_{p−1}(λ) must be a power of λ − a0:

D_{p−1}(λ) = (λ − a0)^g  (g ≤ p).

Here D_{p−1}(λ) denotes the greatest common divisor of the minors of order p − 1 in the characteristic matrix

λE − C =
[ λ − a0  −a1     …  −a_{p−1} ]
[ 0       λ − a0  …  −a_{p−2} ]
[ …       …       …  …        ]
[ 0       0       …  λ − a0   ].

We consider separately two cases:

1. a1 ≠ 0. We take the minor of the zero element in the lower left-hand corner, i.e., the minor formed by deleting the last row and the first column. It is easy to see that when this minor is expanded, every term contains at least one factor λ − a0, except the product of the elements on the main diagonal of the minor, which is ±(a1)^{p−1} and is therefore in our case different from zero. Hence this minor is not divisible by λ − a0, so that g = 0 and D_{p−1}(λ) = 1. But then it follows from D_p(λ) = (λ − a0)^p and D_{p−1}(λ) = 1 that C has only the one elementary divisor (λ − a0)^p.

2. a1 = … = a_{k−1} = 0, a_k ≠ 0. In this case

C = a0E + a_kH^k + … + a_{p−1}H^{p−1},

and for every positive integer j

(C − a0E)^j = a_k^j H^{jk} + (terms in higher powers of H).
Therefore, for the positive integer j, the defect of the matrix (C − a0E)^j is given by

d_j = jk, when jk ≤ p; d_j = p, when jk > p.

We set

p = qk + h  (0 ≤ h < k).  (52)

Then28

d1 = k, d2 = 2k, …, d_q = qk, d_{q+1} = p.  (53)

Therefore we have by (50):

g1 = … = g_{q−1} = 0, g_q = k − h, g_{q+1} = h.  (54)

Thus, the matrix C has the elementary divisors

(λ − a0)^{q+1} taken h times and (λ − a0)^q taken k − h times,

where the integers q > 0 and h ≥ 0 are determined by (52).

4. Now we are in a position to ascertain what elementary divisors the matrix f(J) has (see (45) and (46)). To each elementary divisor (λ − λ0)^p of A there corresponds in f(J) the diagonal cell

f(λ0E^{(p)} + H^{(p)}) = Σ_{i=0}^{p−1} (f^{(i)}(λ0)/i!) H^i =
[ f(λ0)  f′(λ0)/1!  …  f^{(p−1)}(λ0)/(p−1)! ]
[ 0      f(λ0)      …  …                     ]
[ …      …          …  …                     ]
[ 0      0          …  f(λ0)                 ].  (55)

Clearly, the problem reduces to finding the elementary divisors of a cell of the form (55). But the matrix (55) is of the regular triangular form (51), with

a0 = f(λ0), a1 = f′(λ0)/1!, a2 = f″(λ0)/2!, etc.

Thus we arrive at the theorem:

28 In this case the number q + 1 plays the role of m in (49) and (50).
THEOREM 9: The elementary divisors of the matrix f(A) are obtained from those of A in the following way: To an elementary divisor

(λ − λ0)^p  (56)

of A for p = 1, or for p > 1 and f′(λ0) ≠ 0, there corresponds a single elementary divisor

(λ − f(λ0))^p  (57)

of f(A); for p > 1 and

f′(λ0) = … = f^{(k−1)}(λ0) = 0, f^{(k)}(λ0) ≠ 0  (k < p),

to the elementary divisor (56) of A there correspond the following elementary divisors of f(A):

(λ − f(λ0))^{q+1} taken h times and (λ − f(λ0))^q taken k − h times,  (58)

where

p = qk + h, 0 ≤ h < k, 0 < q;

finally, for p > 1 and

f′(λ0) = … = f^{(p−1)}(λ0) = 0,

i.e., for k = p or k > p, to the elementary divisor (56) there correspond p elementary divisors of the first degree of f(A):29

λ − f(λ0), λ − f(λ0), …, λ − f(λ0).  (59)

We note the following special cases of this theorem:

1. If λ1, λ2, …, λn are the characteristic values of A, then f(λ1), f(λ2), …, f(λn) are the characteristic values of f(A). (In both sequences each characteristic value is repeated as often as its multiplicity as a root of the characteristic equation indicates.)30

2. If the derivative f′(λ) is not zero on the spectrum of A,31 then in going from A to f(A) the elementary divisors are not 'split up', i.e., if A has the elementary divisors

(λ − λ1)^{p1}, (λ − λ2)^{p2}, …, (λ − λu)^{pu},

then f(A) has the elementary divisors

(λ − f(λ1))^{p1}, (λ − f(λ2))^{p2}, …, (λ − f(λu))^{pu}.

29 (57) is obtained from (58) by setting k = 1; (59) is obtained from (58) by setting k = p.
30 Statement 1. was established separately in Chapter IV, p. 84.
31 I.e., f′(λ_i) ≠ 0 for those λ_i that are multiple roots of the minimal polynomial.
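The 'splitting' case (58) can be observed directly. In the sketch below (the example is ours, sympy assumed) A is a single nilpotent Jordan block with the elementary divisor λ⁴ and f(λ) = λ², so that λ0 = 0, k = 2, and p = 4 = qk + h with q = 2, h = 0; by (58), f(A) must have k − h = 2 elementary divisors λ².

```python
# f(A) = A**2 for the 4x4 Jordan block with eigenvalue 0 (sympy assumed):
# Theorem 9 predicts two elementary divisors lambda**2.
import sympy as sp

H = sp.Matrix(4, 4, lambda i, j: 1 if j == i + 1 else 0)  # divisor lambda**4
P, J = (H**2).jordan_form()
print(J)     # quasi-diagonal: two 2x2 nilpotent Jordan blocks
```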
§8. A General Method of Constructing the Transforming Matrix

In many problems in the theory of matrices and its applications it is sufficient to know the normal form into which a given matrix A = ‖a_ik‖₁ⁿ can be carried by similarity transformations. The normal form is completely determined by the invariant polynomials of the characteristic matrix λE − A. To find the latter, we can use the defining formulas (see (10) on p. 139) or the reduction of the characteristic matrix λE − A to canonical diagonal form by elementary transformations. In some problems, however, it is necessary to know not only the normal form Ã of the given matrix A, but also a non-singular transforming matrix T.

Note that whereas the normal form is uniquely determined by the matrix A,33 for the transforming matrix T we always have an innumerable set of values, given by

T = UT1,  (60)

where T1 is one of the transforming matrices and U is an arbitrary non-singular matrix that is permutable with A.34

1. An immediate method of determining T consists in the following. The equation

A = TÃT⁻¹

can be written as:

AT − TÃ = 0.

This matrix equation in T is equivalent to a system of n² linear homogeneous equations in the n² unknown coefficients of T. The determination of a transforming matrix reduces to the solution of this system of n² equations; moreover, we have to choose from the set of all solutions one for which |T| ≠ 0. The existence of such a solution is certain, since A and Ã have the same invariant polynomials.32

32 From this fact follows the similarity of A and Ã.
33 This statement is unconditionally true as regards the first natural normal form. As far as the second normal form and the Jordan normal form are concerned, they are uniquely determined to within the order of the diagonal blocks.
34 The formula (60) may be replaced by T = T1V, where V is an arbitrary non-singular matrix permutable with Ã.
The method proposed above for determining a transforming matrix T is simple enough in concept but of little use in practice, since it requires a great many computations (even for n = 4 we have to solve 16 linear equations). We proceed to explain a more efficient method of constructing the transforming matrix T.

2. This method is based on the Supplement to Theorem 7 (p. 148). According to this, we can choose as the transforming matrix

T = Q(Ã),

provided

λE − Ã = P(λ)(λE − A)Q(λ),  (61)

where P(λ) and Q(λ) are polynomial matrices with constant non-zero determinants. The latter equation expresses the equivalence of the characteristic matrices λE − A and λE − Ã.

For the actual process of finding Q(λ) we reduce the two λ-matrices λE − A and λE − Ã to canonical diagonal form by means of the corresponding elementary transformations:

P1(λ)(λE − A)Q1(λ) = {i1(λ), i2(λ), …, i_n(λ)},  (62)
P2(λ)(λE − Ã)Q2(λ) = {i1(λ), i2(λ), …, i_n(λ)},  (63)

where

Q1(λ) = T1T2 … T_p, Q2(λ) = T̃1T̃2 … T̃q,  (64)

and T1, T2, …, T_p and T̃1, T̃2, …, T̃q are the elementary matrices corresponding to the elementary operations on the columns of the λ-matrices λE − A and λE − Ã. From (62), (63), and (64) it follows that

λE − Ã = P(λ)(λE − A)Q(λ), where Q(λ) = Q1(λ)Q2⁻¹(λ) = T1T2 … T_p T̃q⁻¹ … T̃2⁻¹T̃1⁻¹.  (65)

We can compute the matrix Q(λ) by applying successively to the columns of the unit matrix E the elementary operations with the matrices T1, …, T_p, T̃q⁻¹, …, T̃1⁻¹. After this (in accordance with (61)) we replace the argument λ in Q(λ) by the matrix Ã.

Example. We carry the method through for a numerical matrix A of order 4.
Let us introduce a symbolic notation for the right elementary operations and the corresponding matrices (see pp. 130-131): [(c)i] denotes the multiplication of the i-th column by the number c ≠ 0; [i + (b(λ))j] denotes the addition to the i-th column of the j-th column multiplied by the polynomial b(λ); [i j] denotes the interchange of the i-th and j-th columns.

In transforming the characteristic matrix λE − A into normal diagonal form we shall at the same time keep a record of the right elementary operations to be performed, i.e., of the operations on the columns. Starting from λE − A, we clear out the rows and columns one after another by operations of the types [i + (b(λ))j] and [i j] (left operations on the rows may be interspersed freely, since only the column operations enter into Q1(λ)). For the matrix A of our example the reduction terminates in the normal diagonal form

{1, 1, λ + 1, (λ + 1)³},

and Q1(λ) is the recorded product of the column operations, a product of matrices of the three types above.
We have found the invariant polynomials

(λ + 1)³, λ + 1, 1, 1

of the matrix A. The matrix has two elementary divisors, (λ + 1)³ and λ + 1. Therefore the Jordan normal form is

J =
[ −1   1   0   0 ]
[  0  −1   1   0 ]
[  0   0  −1   0 ]
[  0   0   0  −1 ].

By elementary operations we bring the matrix λE − J into the same normal diagonal form:

λE − J =
[ λ+1  −1    0    0  ]
[  0   λ+1  −1    0  ]
[  0    0   λ+1   0  ]
[  0    0    0   λ+1 ]  →  …  →  {1, 1, λ + 1, (λ + 1)³}.

Here

Q2(λ) = [2 + (λ+1)3][1 + (λ+1)2][1 2][2 3][3 4].

Therefore

Q(λ) = Q1(λ)Q2⁻¹(λ).

We apply the elementary operations occurring in Q1(λ), followed by the inverses of those in Q2(λ) taken in the reverse order, successively to the columns of the unit matrix E.
This yields the polynomial matrix Q(λ), a matrix of order 4 whose elements are polynomials in λ of degree at most 2:

Q(λ) = Q0λ² + Q1λ + Q2,

with constant coefficient matrices Q0, Q1, Q2. Observing that

J² =
[ 1  −2   1   0 ]
[ 0   1  −2   0 ]
[ 0   0   1   0 ]
[ 0   0   0   1 ],

we form the right value of Q(λ) for λ = J:

T = Q(J) = Q0J² + Q1J + Q2.

Check: a direct multiplication confirms that AT = TJ and |T| ≠ 0, so that A = TJT⁻¹.
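The chain of column operations above is laborious to carry out by hand. As an independent cross-check (not the book's method), one can let a computer-algebra system produce some transforming matrix directly. The sketch below assumes sympy, whose jordan_form returns a pair (T, J) with A = TJT⁻¹, and uses a stand-in 4×4 matrix of our own having the same elementary divisors (λ + 1)³ and λ + 1.

```python
# A stand-in matrix with elementary divisors (lambda+1)**3 and lambda+1,
# built by conjugating a known Jordan matrix with a fixed unimodular S.
import sympy as sp

J0 = sp.diag(sp.Matrix([[-1, 1, 0], [0, -1, 1], [0, 0, -1]]),
             sp.Matrix([[-1]]))
S = sp.Matrix([[1, 1, 0, 2], [0, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1]])
A = S * J0 * S.inv()

T, J = A.jordan_form()                  # A = T*J*T**-1
print(J)                                # blocks of sizes 3 and 1 at -1
print(sp.simplify(A - T * J * T.inv()) == sp.zeros(4, 4))   # True
```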
§9. Another Method of Constructing a Transforming Matrix

1. We shall now explain another method of constructing a transforming matrix which often leads to fewer computations than the method of the preceding section. However, we shall apply this second method only when the Jordan normal form and the elementary divisors

(λ − λ1)^{p1}, (λ − λ2)^{p2}, …, (λ − λu)^{pu}  (66)

of the given matrix A are known. Let

A = TJT⁻¹, J = {λ1E1 + H1, λ2E2 + H2, …, λuEu + Hu},

where E_k = E^{(p_k)}, H_k = H^{(p_k)}. Then, denoting the k-th column of T by t_k (k = 1, 2, …, n), we replace the matrix equation

AT = TJ
by the equivalent system of equations

A t1 = λ1t1, A t2 = λ1t2 + t1, …, A t_{p1} = λ1t_{p1} + t_{p1−1};
A t_{p1+1} = λ2t_{p1+1}, A t_{p1+2} = λ2t_{p1+2} + t_{p1+1}, …, A t_{p1+p2} = λ2t_{p1+p2} + t_{p1+p2−1};
etc.,  (67)

which we rewrite as follows:

(A − λ1E)t1 = 0, (A − λ1E)t2 = t1, …, (A − λ1E)t_{p1} = t_{p1−1};  (67′)
(A − λ2E)t_{p1+1} = 0, (A − λ2E)t_{p1+2} = t_{p1+1}, …, (A − λ2E)t_{p1+p2} = t_{p1+p2−1};  (68)
etc.  (68′)

Thus, all the columns of T are split into 'Jordan chains' of columns:

[t1, t2, …, t_{p1}], [t_{p1+1}, …, t_{p1+p2}], … .

To every Jordan block of J (or, what is the same, to every elementary divisor (66)) there corresponds its Jordan chain of columns. The task of finding a transforming matrix T reduces to that of finding Jordan chains that would give in all n linearly independent columns.

We shall show that these Jordan chains of columns can be determined by means of the reduced adjoint matrix C(λ) (see Chapter IV, §6). For the matrix C(λ) we have the identity

(λE − A)C(λ) = ψ(λ)E,  (69)

where ψ(λ) is the minimal polynomial of A. Let

ψ(λ) = (λ − λ0)^m χ(λ)  (χ(λ0) ≠ 0).

We differentiate the identity (69) term by term m − 1 times:

(λE − A)C′(λ) + C(λ) = ψ′(λ)E,
(λE − A)C″(λ) + 2C′(λ) = ψ″(λ)E,
… … …
(λE − A)C^{(m−1)}(λ) + (m − 1)C^{(m−2)}(λ) = ψ^{(m−1)}(λ)E.  (70)

Substituting λ0 for λ in (69) and (70) and observing that the right-hand sides are then zero, we obtain:

(A − λ0E)C = 0, (A − λ0E)D = C, (A − λ0E)F = D, …, (A − λ0E)K = G,  (71)

where

C = C(λ0), D = C′(λ0), F = (1/2!)C″(λ0), …, G = (1/(m−2)!)C^{(m−2)}(λ0), K = (1/(m−1)!)C^{(m−1)}(λ0).  (72)
In (71) we replace the matrices (72) by their k-th columns (k = 1, 2, …, n). We obtain:

(A − λ0E)C_k = 0, (A − λ0E)D_k = C_k, (A − λ0E)F_k = D_k, …, (A − λ0E)K_k = G_k  (73)
(k = 1, 2, …, n).

Since C = C(λ0) ≠ 0,35 we can find a k (≤ n) such that

C_k ≠ 0.  (74)

Then the m columns

C_k, D_k, F_k, …, G_k, K_k  (75)

are linearly independent. For let

γC_k + δD_k + … + κK_k = 0.  (76)

Multiplying both sides of (76) on the left by A − λ0E and using (73), we obtain

δC_k + … + κG_k = 0.  (77)

Repeating this multiplication and using (74), we find successively from (76), (77), … that

γ = δ = … = κ = 0.

Since the linearly independent columns (75) satisfy the system of equations (73), they form a Jordan chain of m vectors corresponding to the elementary divisor (λ − λ0)^m (compare (73) with (67′)). If C_k = 0 for some k, but D_k ≠ 0, then the columns D_k, F_k, …, K_k form a Jordan chain of m − 1 vectors, etc.

35 From C(λ0) = 0 it would follow that all the elements of C(λ) have a common divisor of positive degree, in contradiction to the definition of C(λ).
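When the reduced adjoint matrix is not at hand, a single Jordan chain for a known characteristic value can also be produced by solving the singular systems (73) step by step. The sketch below is our own (sympy assumed): gauss_jordan_solve handles the singular coefficient matrix A − λ0E, and the free parameters of the solution are set to zero.

```python
# One Jordan chain for lambda0 = 2, by direct solution of (73);
# example matrix ours, sympy assumed.
import sympy as sp

A = sp.Matrix([[2, 1, 0], [0, 2, 0], [0, 0, 3]])
lam0 = 2
M = A - lam0 * sp.eye(3)

c = M.nullspace()[0]                 # chain head: (A - lambda0*E)c = 0
# To extend the chain, c must lie in the column space of M (it does here).
d, params = M.gauss_jordan_solve(c)  # next member: (A - lambda0*E)d = c
d = d.subs({p: 0 for p in params})   # fix the free parameters
print(c.T, d.T)                      # the chain (c, d) for (lambda - 2)**2
```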
2. We shall now show first of all how to construct a transforming matrix T in the case where the elementary divisors of A are pairwise co-prime:

(λ − λ1)^{m1}, (λ − λ2)^{m2}, …, (λ − λs)^{ms}  (λ_i ≠ λ_j for i ≠ j; m1 + m2 + … + ms = n).

With the elementary divisor (λ − λ_j)^{m_j} we associate, as indicated above, the Jordan chain of columns C^{(j)}, D^{(j)}, …, K^{(j)}:

(A − λ_jE)C^{(j)} = 0, (A − λ_jE)D^{(j)} = C^{(j)}, …, (A − λ_jE)K^{(j)} = G^{(j)}.  (78)

When we give to j the values 1, 2, …, s, we obtain s Jordan chains containing n columns in all. These columns are linearly independent. For, suppose that

Σ_{j=1}^{s} [α_jC^{(j)} + β_jD^{(j)} + … + κ_jK^{(j)}] = 0.  (79)

We multiply both sides of (79) on the left by

(A − λ1E)^{m1−1}(A − λ2E)^{m2} … (A − λsE)^{ms}.  (80)

The factors (A − λ_jE)^{m_j} (j = 2, …, s) annihilate every column of the j-th chain, while (A − λ1E)^{m1−1} annihilates every column of the first chain except K^{(1)}, which it carries into C^{(1)}. Hence κ1C^{(1)} = 0 and therefore κ1 = 0. Replacing the exponent m1 − 1 in (80) successively by m1 − 2, …, 1, 0, we find in the same way that all the remaining coefficients of the first chain vanish: α1 = β1 = … = 0. The same argument applies to every other chain, and this is what we had to prove. We define the matrix T by the formula

T = (C^{(1)}, D^{(1)}, …, K^{(1)}, C^{(2)}, …, K^{(2)}, …, C^{(s)}, …, K^{(s)}).  (81)

Example. For a numerical matrix A of order 4 with pairwise co-prime elementary divisors, one computes the minimal polynomial ψ(λ) and the polynomial

Ψ(λ, μ) = (ψ(λ) − ψ(μ))/(λ − μ),

and forms the reduced adjoint matrix C(λ) = Ψ(λE, A) as a polynomial in A with coefficients depending on λ. In the computation of the columns of A² and of C(λ) it is convenient to write underneath the rows of A the columns into which they are multiplied and to carry along a row of column-sums as a check.36 Evaluating a suitable non-zero column of C(λ) and of its derivatives at each characteristic value, we obtain the Jordan chains, which, placed side by side as in (81), give a transforming matrix T.

36 The columns into which we multiply the rows are written underneath the rows of A. The elements of the row of column-sums are set up in italics, for checking.
Computing in this way the required columns at the characteristic values of A, cancelling common numerical factors in the columns where convenient,37 and assembling the resulting chains, one obtains the matrix T and verifies that AT = TJ and |T| ≠ 0.

3. Coming now to the general case, we shall investigate the Jordan chains of vectors corresponding to a characteristic value λ0 for which there are p elementary divisors (λ − λ0)^m, q elementary divisors (λ − λ0)^{m−1}, r elementary divisors (λ − λ0)^{m−2}, etc. As a preliminary to this, we establish some properties of the matrices

C = C(λ0), D = C′(λ0), F = (1/2!)C″(λ0), …, K = (1/(m−1)!)C^{(m−1)}(λ0).  (82)

1. The matrices (82) can be represented in the form of polynomials in A:

C = h1(A), D = h2(A), F = h3(A), …, K = h_m(A),  (83)

where the h_i(μ) (i = 1, 2, …, m) are polynomials. For

C(λ) = Ψ(λE, A), where Ψ(λ, μ) = (ψ(λ) − ψ(μ))/(λ − μ),  (84)

and therefore

37 A Jordan chain remains a Jordan chain when all its columns are multiplied by a number c ≠ 0.
(1/k!)C^{(k)}(λ0) = h_{k+1}(A), where h_{k+1}(μ) = (1/k!)[∂^kΨ(λ, μ)/∂λ^k]_{λ=λ0}  (k = 0, 1, …, m − 1).  (85)

2. In the sequence of matrices (82), every column of each matrix is a linear combination of the columns of every following matrix. Let us take two matrices h_i(A) and h_k(A) in (82) with i < k. Then it follows from (84) that

h_i(λ) ≡ h_k(λ)(λ − λ0)^{k−i}  (mod ψ(λ)),

and therefore

h_i(A) = h_k(A)(A − λ0E)^{k−i}.  (86)

Hence the j-th column y_j (j = 1, 2, …, n) of h_i(A) is expressed linearly by the columns z1, z2, …, z_n of h_k(A):

y_j = Σ_{ν=1}^{n} a_{νj}z_ν,

where a_{νj} are the elements of the j-th column of (A − λ0E)^{k−i}.

3. The matrices (82) have the ranks p, 2p + q, 3p + 2q + r, …, respectively. This follows from Theorem 8 (§7) if we equate the rank to n − d and use formula (48) for the defect of a function of A (p. 154): the function h_i(λ) vanishes at λ0 to the order m − i, and at every other characteristic value to an order not less than the degrees of the corresponding elementary divisors.

4. Without changing the basic formulas (71), we may replace any column in C by an arbitrary linear combination of all the columns, provided we make the corresponding replacements in D, F, …, K.

Using the properties 2., 3., and 4., we transform the matrix C into the form

C̃ = (C1, C2, …, C_p, 0, …, 0),  (87)
where the columns C1, C2, …, C_p are linearly independent; this is possible because, by 3., the matrix C is of rank p. Observing (see (73)) that

(A − λ0E)C_i = 0  (i = 1, 2, …, p),

we turn to the matrix D. By 2., every column C_i (1 ≤ i ≤ p) is a linear combination of the columns D1, D2, …, D_n of D:

C_i = α1D1 + α2D2 + … + α_nD_n.  (88)

Multiplying (88) on the left by A − λ0E, using (A − λ0E)D_k = C_k and (87), we see that the columns C1, …, C_p are linearly independent combinations of the columns of D. Hence, without changing the matrix C̃ and preserving the relations (71), we can carry D by the admissible transformations 4. into the form

D̃ = (D1, …, D_p, C1, …, C_p, D_{2p+1}, …, D_{2p+q}, 0, …, 0),  (89)

in which (A − λ0E)D_i = C_i (i = 1, 2, …, p) and the 2p + q non-zero columns are linearly independent (by 3., D is of rank 2p + q). In the same way we can represent the next matrix F in the form

F̃ = (F1, …, F_p, *, …, *, F_{2p+1}, …, F_{2p+q}, *, …, *, F_{3p+2q+1}, …, F_{3p+2q+r}, 0, …, 0),  (90)

with (A − λ0E)F_i = D_i, and so on down to K̃. Formulas (73) now give us the Jordan chains

(C1, D1, …, K1), …, (C_p, D_p, …, K_p)  (p chains of m columns each);
(D_{2p+1}, F_{2p+1}, …, K_{2p+1}), …, (D_{2p+q}, F_{2p+q}, …, K_{2p+q})  (q chains of m − 1 columns each);
(F_{3p+2q+1}, …), …  (r chains of m − 2 columns each); etc.  (91)

These Jordan chains are linearly independent. All the columns C_i in (91) are linearly independent, because they form p linearly independent columns of C̃; the C-columns and D-columns in (91) together are independent, because they form 2p + q independent columns in D̃; continuing in this way, finally, all the columns in (91)
are independent, because they form n0 = mp + (m − 1)q + (m − 2)r + … independent columns in K̃.

The number of columns in (91) is thus equal to the sum n0 of the exponents of the elementary divisors corresponding to the given characteristic value λ0.

Suppose now that the matrix A = ‖a_ik‖₁ⁿ has s distinct characteristic values λ_j (j = 1, 2, …, s), so that

Δ(λ) = (λ − λ1)^{n1}(λ − λ2)^{n2} … (λ − λs)^{ns}.

For each characteristic value λ_j we form its system of independent Jordan chains (91). All the chains so obtained contain n = n1 + n2 + … + ns columns. These n columns are linearly independent and form one of the required transforming matrices T.

The proof of the linear independence of these n columns proceeds as follows. Every linear combination of these n columns can be represented in the form

H1 + H2 + … + Hs = 0,  (92)

where H_j is a linear combination of the columns of the Jordan chains (91) corresponding to the characteristic value λ_j. Every column in a Jordan chain corresponding to λ_j satisfies the equation (A − λ_jE)^{m_j}x = 0, where

ψ(λ) = (λ − λ1)^{m1}(λ − λ2)^{m2} … (λ − λs)^{ms}

is the minimal polynomial of A; therefore

(A − λ_jE)^{m_j}H_j = 0  (j = 1, 2, …, s).  (93)

We take a fixed number j (1 ≤ j ≤ s) and construct the Lagrange-Sylvester interpolation polynomial r(λ) (see Chapter V, §§1, 2) with the following values on the spectrum of the matrix:

r(λ_i) = r′(λ_i) = … = r^{(m_i−1)}(λ_i) = 0 for i ≠ j; r(λ_j) = 1, r′(λ_j) = … = r^{(m_j−1)}(λ_j) = 0.  (94)

Then, for every i ≠ j, r(λ) is divisible by (λ − λ_i)^{m_i} without remainder; therefore, by (93),

r(A)H_i = 0  (i ≠ j).  (95)
In exactly the same way, the difference r(λ) − 1 is divisible by (λ − λ_j)^{m_j} without remainder; therefore

r(A)H_j = H_j.

Multiplying both sides of (92) by r(A), we obtain from (94) and (95): H_j = 0. This is valid for every j = 1, 2, …, s. But H_j is a linear combination of independent columns corresponding to one and the same characteristic value λ_j. Therefore all the coefficients in the linear combination H_j (j = 1, 2, …, s), and hence all the coefficients in (92), are equal to zero.

Note. Let us point out some transformations on the columns of the matrix T under which it is transformed into the same Jordan form (with the same arrangement of the Jordan diagonal blocks):

1. Multiplication of all the columns of an arbitrary Jordan chain by a non-zero number.
2. Addition to each column (beginning with the second) of a Jordan chain of the preceding column of the same chain, multiplied by one and the same arbitrary number.
3. Addition to all the columns of a Jordan chain of the corresponding columns of another chain containing the same or a larger number of columns and corresponding to the same characteristic value.

Example 1. Let A be a numerical matrix of order 5 with

Δ(λ) = (λ − 1)⁴(λ + 1), ψ(λ) = (λ − 1)²(λ + 1).

Elementary divisors of the matrix A: (λ − 1)², (λ − 1)², λ + 1. Therefore

J = { [ 1  1 ]   [ 1  1 ]        }
    { [ 0  1 ],  [ 0  1 ],  [−1] }.

Here

Ψ(λ, μ) = μ² + (λ − 1)μ + (λ² − λ − 1),

so that

C(λ) = Ψ(λE, A) = A² + (λ − 1)A + (λ² − λ − 1)E.

We must obtain two linearly independent columns of C(1), with the corresponding columns of C′(1), and one non-zero column of C(−1).
Let us compute successively the columns of A² and the corresponding columns of C(λ); evaluating them at λ = 1 and λ = −1, we obtain C(1), C′(1), and C(−1). Among the columns of C(1) we select two linearly independent ones, take with them the corresponding columns of C′(1), and adjoin a non-zero column of C(−1); in this way38

T = (C_{k1}(1), C′_{k1}(1), C_{k2}(1), C′_{k2}(1), C_k(−1)).

The matrix T can be simplified a little by the transformations 1.-3. of the Note: for example, dividing columns by common numerical factors (the fifth column by 4, the first and second by 2), adding the first column to the third and the second to the fourth, subtracting the third column from the fourth, and subtracting a multiple of the first column from the second. The result is a matrix T1 with small integral elements, and we leave it to the reader to verify that AT1 = T1J and |T1| ≠ 0.

Example 2.

Δ(λ) = (λ + 1)⁴, ψ(λ) = (λ + 1)³.

Elementary divisors: (λ + 1)³, λ + 1; the Jordan form J is the one found in the example of §8.

38 Here the subscript denotes the number of the column.
We form the polynomials

h1(λ) = (λ + 1)², h2(λ) = λ + 1, h3(λ) = 1

and the matrices

C = h1(A) = (A + E)², D = h2(A) = A + E, F = h3(A) = E.

Here λ0 = −1, m = 3, p = 1, q = 1; by property 3., C has rank p = 1 and D has rank 2p + q = 3. In the matrices C, D, F we carry out the admissible simplifying column operations (the corresponding operations being performed in all three matrices simultaneously). For the first three columns of T we take the third column of these matrices, a chain of length 3:

T = (C3, D3, F3, *);

for the last column of T we take, after the transformations, the first column of F, a chain of length 1 corresponding to the elementary divisor λ + 1:

T = (C3, D3, F3, F1).

As a check, we can verify that AT = TJ and that |T| ≠ 0.
CHAPTER VII

THE STRUCTURE OF A LINEAR OPERATOR IN AN n-DIMENSIONAL SPACE
(Geometrical Theory of Elementary Divisors)

The analytic theory of elementary divisors expounded in the preceding chapter has enabled us to determine for every square matrix a similar matrix having 'normal' or 'canonical' form. On the other hand, we have seen in Chapter III that the behaviour of a linear operator in an n-dimensional space with respect to various bases is given by means of a class of similar matrices. The existence of a matrix of normal form in such a class is closely connected with important and deep properties of a linear operator in an n-dimensional space. The study of these properties is the object of the present chapter. The investigation of the structure of a linear operator will lead us, independently of the contents of the preceding chapter, to the theory of transformations of a matrix to a normal form. Therefore the contents of this chapter may be called the geometrical theory of elementary divisors.1

§1. The Minimal Polynomial of a Vector and a Space (with Respect to a Given Linear Operator)

1. We consider an n-dimensional vector space R over the field F and a linear operator A in this space. Let x be an arbitrary vector of R. We form the sequence of vectors

x, Ax, A²x, … .  (1)

Since the space is finite-dimensional, there is an integer p (0 ≤ p ≤ n) such that the vectors x, Ax, …, A^{p−1}x are linearly independent, while A^p x is a linear combination of these vectors with coefficients in F:

1 The account of the geometrical theory of elementary divisors to be given here is based on our paper [167]. For other geometrical constructions of the theory of elementary divisors, see [22], §§96-99, and also [53].
. (4) The polynomial v(A) is called an annihilating polynomial for the whole space R. the phrase `with respect to the given operator A' is tacitly understood. For the sake of brevity. e$. . e and by V (A) the least common multiple of these polynomials (v' (A) is taken with highest coefficient 1).. 2. 97*(A) the minimal polynomials of the basis vectors e1.yi(A)e. .. . For let (A) = q' (2) x (A) + e (1)..ypx. + Yp_lA + yp (A monic polynomial is a polynomial in which the coefficient of the highest power of the variable is unity.. .2 But it is easy to see that of all the monic annihilating polynomials of x the one we have constructed is of least degree... because throughout this entire chapter we shall deal with a single operator A. Then p(A) is an annihilating polynomial for all the basis vectors e1. in particular... eA. o(A) are quotient and remainder on dividing Then (A) by 4p(A)x=x(A) p(A)x+e(A)x= e(A)x and theref ore p (A) x = o. p2 (A). Then w (A) is an annihilating polynomial for the basis vectors 2 Of course. Note that every annihilating polynomial (A) of x is divisible by the minimal polynomial T(A).e. (3) Every polynomial q: (A) for which (3) holds will be called an annihilating polynomial for the vector x.. that every vector x has only one minimal polynomial.+x. Hence e(A) 0. . we have the form x = xl el + xs e2 + v(A)x=xiV (A)e1+x2'V (A)e2+. e. es. But the degree of o (A) is less than that of the minimal polynomial p(A). . . in R. .176 VII. where x(A). e$. From what we have proved it follows. (2) We form the monic polynomial p(A) = AP + yl AP1 + . .=os i. We denote by q1(2). This polynomial will be called the minimal annihilating polynomial of x or simply the minimal polynomial of x.) Then (2) can be written: q. . Let W (2) be an arbitrary annihilating polynomial for the whole space R. Since every vector x e R is representable in + x. . this circumstance is not mentioned in the definition. STRUCTURE OF LINEAR OPERATOR IN nDIMENSIONAL SPACE APx =Y1Aplx  y2Ap2x . ip (A) = O. .(A)x=o.. We choose a basis el. . e. .
... R' and R" have no vector in common except the null vector.. . of all the annihilating polynomials for the whole space R. .4) is the annihilating or minimal polynomial of the matrix A. (5) then we shall say that the space R is decomposed into the two subspaces R' and R" and shall write: R=R'+R" (6) Note that the condition 1. § 2.. subtracting (7) from (5) term by term. implies the uniqueness of the representation (5). Finally we mention that the minimal polynomial of the space R annihilates every vector x of R so that the minimal polynomial of the space is divisible by the minimal polynomial of every vector in the space. e2.. e a matrix A = II ask II? then the annihilating or minimal polynomial of the space R (with respect to . DECOMPOSITION INTO INVARIANT SUBSPACES 177 e1. every vector x of R can be represented in the form of a sum x=x'+x" (x'ER'.. the one we have constructed. Although the construction of the minimal polynomial y(A) was associated with a definite basis e1. gis(l). e . Decomposition into Invariant Subspaces with CoPrime Minimal Polynomials 1. e2. . en . . eR. This polynomial is uniquely determined by the space R and the operator A and is called the minimal polynomial of the space R. Hence it follows that. . . If some collection of vectors R' forming part of R has the property that the sum of any two vectors of R' and the product of any vector of R' by a number a e F always belongs to R'. then that manifold R' is itself a vector space. and vice versa. the polynomial pp(A) itself does not depend on the choice of this basis (this follows from the uniqueness of the minimal polynomial for the space R). a subspace of R. yp(A).§ 2. . . For if for a certain vector x we had two distinct representations in the form of a sum of terms from R' and R". and 2.3 The uniqueness of the minimal polynomial of the space R follows from the statement proved above: every annihilating polynomial +p(A) of the space R is divisible by the minimal polynomial y'(A).. If two subspaces R' and R" of R are given and if it is known that 1. 99(A) of these vectors and must therefore be divisible without remainder by their least common multiple V(1). Therefore Y (A) must be a common multiple of the minimal polynomials q'r (A). Compare with Chapter IV. . we would obtain: 3 If in some basis e. (5) and x = u' + ac" (ac'E R' z" a R") (7) then. § 6. has the least degree and it is monic.x'"ER")..
"'. Then R=R'+R". n' = 2. i. e'. where R is the set of all vectors of our space.. and R" the set of all vectors parallel to the given line. are given. the definition of decomposition immediately extends to an arbitrary number of subspaces. respec tively. so that a basis of the whole space is formed from bases of the subspaces. R" to the second. . uniquely. be bases of R' and R". 2 I f e. e'2. STRUCTURE OF LINEAR OPERATOR IN nDIMENSIONAL SPACE x'z'=z"x" i.. is impossible. 2. Thus. Let R=R'+R" and let e't. The decomposition reduces the study of the behavior of an operator in the whole space to the study of its behavior in the various component subspaces. which. In this example..e. It follows. R' the set of all vectors parallel to the first direction.. In other words. n=n'+n". not parallel to one and the same plane.178 VII. and e" e" r . that Example 1. n = 3. into components in these three directions. n=3 and n'=n"=n'1'=1. In this form. in particular. We shall now prove the following theorem : . Since every vector in the space can be split. R' the set of all vectors parallel to the given plane. Suppose that in a threedimensional space a plane and a line intersecting the plane are given. if x e R' implies Ax a R'. may be replaced by the requirement that the representation (5) be unique. equality of the nonnull vectors x'z a R' and z"x" a R". the operator A carries a vector of an invariant subspace into a vector of the same subspace. In this case. we have R=R'+R"+R"'. Suppose that in a threedimensional space three directions. Example 2.. A subspace R'CR is called invariant with respect to the operator A if AR' C R'. n" =1. by 1. and R"' to the third.. Then the reader can easily prove that all these n' + n" vectors are a linearly independent and form a basis of R. In what follows we shall carry out a decomposition of the whole space into subspaces invariant with respect to A. condition 1.e. where R is the set of all the vectors of one space.
THEOREM 1 (First Theorem on the Decomposition of a Space into Invariant Subspaces): If for a given operator A the minimal polynomial ψ(λ) of the space is represented over F in the form of a product of two co-prime polynomials ψ1(λ) and ψ2(λ) (with highest coefficients 1),

ψ(λ) = ψ1(λ)ψ2(λ),  (8)

then the whole space R splits into two invariant subspaces I1 and I2,

R = I1 + I2,  (9)

whose minimal polynomials are ψ1(λ) and ψ2(λ), respectively.

Proof. We denote by I1 the set of all vectors x ∈ R satisfying the equation ψ1(A)x = 0; I2 is similarly defined by the equation ψ2(A)x = 0. I1 and I2 so defined are subspaces of R.

Since ψ1(λ) and ψ2(λ) are co-prime, it follows that there exist polynomials χ1(λ) and χ2(λ) (with coefficients in F) such that

1 = ψ1(λ)χ1(λ) + ψ2(λ)χ2(λ).  (10)

Now let x be an arbitrary vector of R. In (10) we replace λ by A and apply both sides of the operator equation so obtained to the vector x:

x = ψ1(A)χ1(A)x + ψ2(A)χ2(A)x,  (11)

i.e.,

x = x′ + x″,  (12)

where

x′ = ψ2(A)χ2(A)x, x″ = ψ1(A)χ1(A)x.  (13)

Furthermore,

ψ1(A)x′ = ψ(A)χ2(A)x = 0, ψ2(A)x″ = ψ(A)χ1(A)x = 0,

i.e., x′ ∈ I1 and x″ ∈ I2. Moreover, I1 and I2 have only the null vector in common. For if x0 ∈ I1 and x0 ∈ I2, i.e., ψ1(A)x0 = 0 and ψ2(A)x0 = 0, then by (11)

x0 = χ1(A)ψ1(A)x0 + χ2(A)ψ2(A)x0 = 0.

Thus we have proved that

R = I1 + I2.

Now suppose that x ∈ I1, i.e., ψ1(A)x = 0. Multiplying both sides of this equation by A and reversing the order of A and ψ1(A), we obtain

ψ1(A)Ax = 0,

i.e., Ax ∈ I1. This proves that the subspace I1 is invariant with respect to A. The invariance of the subspace I2 is proved similarly.

We shall now show that ψ1(λ) is the minimal polynomial of I1. Let ψ̂1(λ) be an arbitrary annihilating polynomial for I1, and x an arbitrary vector of R. Using the decomposition (12) already established, we write:

ψ̂1(A)ψ2(A)x = ψ2(A)ψ̂1(A)x′ + ψ̂1(A)ψ2(A)x″ = 0.

Since x is an arbitrary vector of R, it follows that the product ψ̂1(λ)ψ2(λ) is an annihilating polynomial for R and is therefore divisible by ψ(λ) = ψ1(λ)ψ2(λ). Hence ψ̂1(λ) is divisible by ψ1(λ). But ψ̂1(λ) was an arbitrary annihilating polynomial of I1, and ψ1(λ) is a particular one of the annihilating polynomials (by the definition of I1). Therefore ψ1(λ) is the minimal polynomial of I1. In exactly the same way it is shown that ψ2(λ) is the minimal polynomial of I2. This completes the proof of the theorem.
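The proof is constructive, and the components (13) can be computed explicitly. The sketch below (our own example, sympy assumed) takes ψ1(λ) = (λ − 1)² and ψ2(λ) = λ − 3 for a matrix whose minimal polynomial is their product; gcdex supplies the polynomials χ1(λ), χ2(λ) of (10), and the components x′, x″ of (12)-(13) are checked to lie in I1 and I2.

```python
# The decomposition (12)-(13) for psi = (lam-1)**2 * (lam-3); sympy assumed.
import sympy as sp

lam = sp.symbols('lambda')
A = sp.diag(sp.Matrix([[1, 1], [0, 1]]), sp.Matrix([[3]]))
psi1, psi2 = (lam - 1)**2, lam - 3
chi1, chi2, g = sp.gcdex(psi1, psi2, lam)   # chi1*psi1 + chi2*psi2 = 1
assert g == 1

def at_A(p):
    """Evaluate the scalar polynomial p(lambda) at the matrix A."""
    coeffs = sp.Poly(p, lam).all_coeffs()[::-1]        # ascending order
    return sum((c * A**k for k, c in enumerate(coeffs)), sp.zeros(3, 3))

x = sp.Matrix([1, 1, 1])
x1 = at_A(psi2 * chi2) * x      # component in I_1
x2 = at_A(psi1 * chi1) * x      # component in I_2
assert x1 + x2 == x
assert at_A(psi1) * x1 == sp.zeros(3, 1) and at_A(psi2) * x2 == sp.zeros(3, 1)
print(x1.T, x2.T)               # (1, 1, 0) and (0, 0, 1)
```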
Now suppose that $x \in I_1$, i.e. $\psi_1(A)x = o$. Multiplying both sides of this equation by A and reversing the order of A and $\psi_1(A)$, we obtain $\psi_1(A)Ax = o$, i.e., $Ax \in I_1$. This proves that the subspace $I_1$ is invariant with respect to A. The invariance of the subspace $I_2$ is proved similarly.

We shall now show that $\psi_1(\lambda)$ is the minimal polynomial of $I_1$. Let $\bar\psi_1(\lambda)$ be an arbitrary annihilating polynomial for $I_1$. Using the decomposition (12) already established, we write, for an arbitrary vector $x$ of R:

$$\bar\psi_1(A)\psi_2(A)x = \psi_2(A)\,\bar\psi_1(A)x' + \bar\psi_1(A)\,\psi_2(A)x'' = o.$$

Since $x$ is an arbitrary vector of R, it follows that the product $\bar\psi_1(\lambda)\psi_2(\lambda)$ is an annihilating polynomial for R and is therefore divisible by $\psi(\lambda) = \psi_1(\lambda)\psi_2(\lambda)$; hence $\bar\psi_1(\lambda)$ is divisible by $\psi_1(\lambda)$. But $\bar\psi_1(\lambda)$ is an arbitrary annihilating polynomial for $I_1$, and $\psi_1(\lambda)$ is a particular one of the annihilating polynomials (by the definition of $I_1$). Hence $\psi_1(\lambda)$ is the minimal polynomial of $I_1$. In exactly the same way it is shown that $\psi_2(\lambda)$ is the minimal polynomial of $I_2$. This completes the proof of the theorem.

Let us decompose $\psi(\lambda)$ into irreducible factors over F:

$$\psi(\lambda) = [\varphi_1(\lambda)]^{c_1}[\varphi_2(\lambda)]^{c_2}\cdots[\varphi_s(\lambda)]^{c_s} \qquad (14)$$

(here $\varphi_1(\lambda), \varphi_2(\lambda), \ldots, \varphi_s(\lambda)$ are distinct irreducible polynomials over F with highest coefficient 1). Then by the theorem we have

$$R = I_1 + I_2 + \cdots + I_s, \qquad (15)$$

where $I_k$ is an invariant subspace with the minimal polynomial $[\varphi_k(\lambda)]^{c_k}$ $(k = 1, 2, \ldots, s)$. Thus, the theorem reduces the study of the behaviour of a linear operator in an arbitrary space to the study of the behaviour of this operator in a space where the minimal polynomial is a power of an irreducible polynomial over F.
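For co-prime factors, the identity (10) is produced by the extended Euclidean algorithm, after which the two components of (12) are computed mechanically. A minimal sketch in sympy (the matrix is an assumed example whose minimal polynomial is $(\lambda^2+1)(\lambda-2)$):

```python
import sympy as sp

lam = sp.symbols('lambda')

def evalm(p, A):
    # evaluate the polynomial p(lam) at the matrix A (Horner scheme)
    out = sp.zeros(A.rows)
    for c in sp.Poly(p, lam).all_coeffs():
        out = out * A + c * sp.eye(A.rows)
    return out

A = sp.Matrix([[0, -1, 0],
               [1,  0, 0],
               [0,  0, 2]])
psi1, psi2 = lam**2 + 1, lam - 2             # co-prime factors, as in (8)
chi1, chi2, one = sp.gcdex(psi1, psi2, lam)  # Bezout identity (10)
assert one == 1

P1 = evalm(sp.expand(psi2 * chi2), A)        # x -> x'  of (13), lands in I1
P2 = evalm(sp.expand(psi1 * chi1), A)        # x -> x'' of (13), lands in I2
assert P1 + P2 == sp.eye(3)                  # the decomposition (12)
assert evalm(psi1, A) * P1 == sp.zeros(3)    # psi1 annihilates I1
assert evalm(psi2, A) * P2 == sp.zeros(3)    # psi2 annihilates I2
```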
We shall take advantage of this to prove the following important theorem:

THEOREM 2: In a vector space there always exists a vector whose minimal polynomial coincides with the minimal polynomial of the whole space.

We consider first the special case where the minimal polynomial of the space R is a power of an irreducible polynomial $\varphi(\lambda)$:

$$\psi(\lambda) = [\varphi(\lambda)]^c.$$

But the minimal polynomial of the space is the least common multiple of the minimal polynomials of the basis vectors $e_1, e_2, \ldots, e_n$. The minimal polynomial of $e_i$ is a divisor of $\psi(\lambda)$ and is therefore representable in the form $[\varphi(\lambda)]^{l_i}$, where $l_i \le c$ $(i = 1, 2, \ldots, n)$; the least common multiple of these powers is the largest among them. Hence $\psi(\lambda)$ coincides with the minimal polynomial of one of the basis vectors $e_1, e_2, \ldots, e_n$.

Turning now to the general case, we prove the following preliminary lemma:

LEMMA: If the minimal polynomials of the vectors e' and e'' are co-prime, then the minimal polynomial of the sum vector e' + e'' is equal to the product of the minimal polynomials of the constituent vectors.

Proof. Let $\chi_1(\lambda)$ and $\chi_2(\lambda)$ be the minimal polynomials of the vectors e' and e'', respectively; by assumption, $\chi_1(\lambda)$ and $\chi_2(\lambda)$ are co-prime. Let $\chi(\lambda)$ be an arbitrary annihilating polynomial of the vector e = e' + e''. Then

$$\chi_2(A)\chi(A)e' = \chi_2(A)\chi(A)e - \chi(A)\chi_2(A)e'' = o,$$

i.e., $\chi_2(\lambda)\chi(\lambda)$ is an annihilating polynomial of e'. Therefore $\chi_2(\lambda)\chi(\lambda)$ is divisible by $\chi_1(\lambda)$, and since $\chi_1(\lambda)$ and $\chi_2(\lambda)$ are co-prime, $\chi(\lambda)$ is divisible by $\chi_1(\lambda)$. It is proved similarly that $\chi(\lambda)$ is divisible by $\chi_2(\lambda)$. Therefore $\chi(\lambda)$ is divisible by the product $\chi_1(\lambda)\chi_2(\lambda)$. Thus every annihilating polynomial of e is divisible by $\chi_1(\lambda)\chi_2(\lambda)$, and $\chi_1(\lambda)\chi_2(\lambda)$ is itself annihilating for e; so it is the minimal polynomial of the vector e = e' + e''.

We now return to Theorem 2. For the proof in the general case we use the decomposition (15). Since the minimal polynomials $[\varphi_1(\lambda)]^{c_1}, \ldots, [\varphi_s(\lambda)]^{c_s}$ of the subspaces $I_1, I_2, \ldots, I_s$ are powers of distinct irreducible polynomials, our assertion is already proved for these subspaces: there exist vectors $e' \in I_1,\ e'' \in I_2,\ \ldots,\ e^{(s)} \in I_s$ whose minimal polynomials are $[\varphi_1(\lambda)]^{c_1}, \ldots, [\varphi_s(\lambda)]^{c_s}$. By the lemma, the minimal polynomial of the vector $e = e' + e'' + \cdots + e^{(s)}$ is equal to the product $[\varphi_1(\lambda)]^{c_1}[\varphi_2(\lambda)]^{c_2}\cdots[\varphi_s(\lambda)]^{c_s}$, i.e., to the minimal polynomial of the space R.
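Theorem 2 guarantees a generating vector but is not itself constructive. In practice, the minimal polynomial of a given vector is read off from the first linear dependence among $x, Ax, A^2x, \ldots$, a device taken up again in § 8. A sketch with an assumed example matrix:

```python
import sympy as sp

def vector_minpoly(A, x, lam):
    # extend x, Ax, A^2 x, ... until the first linear dependence appears;
    # its coefficients (normalized) are those of the minimal polynomial of x
    cols = [x]
    while True:
        nxt = A * cols[-1]
        ns = sp.Matrix.hstack(*cols, nxt).nullspace()
        if ns:
            c = ns[0] / ns[0][-1]   # make the highest coefficient equal to 1
            return sp.expand(sum(c[k] * lam**k for k in range(len(c))))
        cols.append(nxt)

lam = sp.symbols('lambda')
A = sp.Matrix([[1, 1], [0, 1]])
print(vector_minpoly(A, sp.Matrix([1, 0]), lam))  # lambda - 1 (an eigenvector)
print(vector_minpoly(A, sp.Matrix([0, 1]), lam))  # lambda**2 - 2*lambda + 1
```

Here the second vector already generates the whole two-dimensional space, as Theorem 2 promises for some vector.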
§ 3. Congruence. Factor Space

1. Suppose given a subspace $I \subseteq R$. We shall say that two vectors $x, y$ of R are congruent modulo I, and shall write $x \equiv y \pmod I$, if and only if $y - x \in I$. It is easy to verify that the concept of congruence so introduced has the following properties: for all $x, y, z \in R$,

1. $x \equiv x \pmod I$ (reflexivity of congruence);
2. from $x \equiv y \pmod I$ it follows that $y \equiv x \pmod I$ (symmetry of congruence);
3. from $x \equiv y \pmod I$ and $y \equiv z \pmod I$ it follows that $x \equiv z \pmod I$ (transitivity of congruence).

The presence of these three properties enables us to make use of congruence to divide all the vectors of the space into classes, by assigning vectors that are pairwise congruent (mod I) to the same class (vectors of distinct classes are incongruent (mod I)). The subspace I is one of these classes, namely the class containing o. The class containing the vector $x$ will be denoted by $\bar x$.5

It is elementary to prove that congruences may be added term by term and multiplied by a number of F:

1. From $x \equiv x' \pmod I$ and $y \equiv y' \pmod I$ it follows that $x + y \equiv x' + y' \pmod I$;
2. From $x \equiv x' \pmod I$ it follows that $\alpha x \equiv \alpha x' \pmod I$ $(\alpha \in F)$.

These properties of congruence show that the operations of addition and multiplication by a number of F do not 'break up' the classes. If we take two classes $\bar x$ and $\bar y$ and add elements $x, x', \ldots$ of the first class to arbitrary elements $y, y', \ldots$ of the second class, then all the sums so obtained belong to one and the same class, which we call the sum of the classes $\bar x$ and $\bar y$ and denote by $\bar x + \bar y$. Similarly, if all the vectors $x, x', \ldots$ of the class $\bar x$ are multiplied by a number $\alpha \in F$, then the products belong to one class, which we denote by $\alpha\bar x$. Note that to every congruence $x \equiv y \pmod I$ there corresponds the equality6 of the associated classes: $\bar x = \bar y$.

Thus, in the manifold $\bar R$ of all classes $\bar x, \bar y, \ldots$, two operations are introduced: 'addition' and 'multiplication by a number of F.' It is easy to verify that these operations have the properties set forth in the definition of a vector space (Chapter III, § 1). Therefore $\bar R$, as well as R, is a vector space over the field F.

5 Since each class contains an infinite set of vectors, there is, by this condition, an infinite number of ways of designating the class.
6 That is, identity.
We shall say that $\bar R$ is a factor space of R. If $n$, $m$, $\bar n$ are the dimensions of the spaces R, I, $\bar R$, respectively, then $\bar n = n - m$.

Now let A be a linear operator in R, and let us assume that I is an invariant subspace with respect to A. The reader will easily prove that from $x \equiv x' \pmod I$ it follows that $Ax \equiv Ax' \pmod I$, so that the operator A can be applied to both sides of a congruence. Thus, if the operator A is applied to all vectors $x, x', \ldots$ of a class $\bar x$, then the vectors $Ax, Ax', \ldots$ also belong to one class. The linear operator A carries classes into classes and is, thus, a linear operator in $\bar R$, which we denote by $\bar A$.

Example. Let R be the set of all vectors of a three-dimensional space and F the field of real numbers. For greater clarity, we shall represent vectors in the form of directed segments beginning at a point O. Let I be a straight line passing through O (more accurately: the set of vectors that lie along some line passing through O; Fig. 1). The congruence $x \equiv x' \pmod I$ signifies that the vectors $x$ and $x'$ differ by a vector of I, i.e., the segment joining the endpoints of $x$ and $x'$ is parallel to I. Therefore the class $\bar x$ is represented by the line passing through the endpoint of $x$ and parallel to I (more accurately: by the 'bundle' of vectors starting from O whose endpoints lie on that line). 'Bundles' may be added and multiplied by a real number (by adding and multiplying the vectors that occur in the bundles). These 'bundles' are the elements of the factor space $\bar R$. In this example, $n = 3$, $m = 1$, $\bar n = 2$. We obtain another example by taking for I a plane passing through O (Fig. 2); in that case $n = 3$, $m = 2$, $\bar n = 1$.

2. We shall say that the vectors $x_1, x_2, \ldots, x_p$ are linearly dependent modulo I if there exist numbers $\alpha_1, \alpha_2, \ldots, \alpha_p$ in F, not all equal to zero, such that

$$\alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_p x_p \equiv o \pmod I. \qquad (16)$$
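Concretely, one computes in a factor space by picking one representative from each class. The sketch below is our illustration, not the book's construction: it uses the orthogonal complement of I as a set of representatives, although any complementary subspace would serve equally well.

```python
import numpy as np

u = np.array([0., 0., 1.])      # I = span(u): the line through O, m = 1

def rep(x):
    # canonical representative of the class of x modulo I: the component
    # of x transversal to I (here: orthogonal to u -- an arbitrary choice)
    return x - np.dot(x, u) * u

x1 = np.array([1., 2., 5.])
x2 = np.array([1., 2., -3.])    # x2 - x1 lies in I: x1 and x2 are congruent
print(np.allclose(rep(x1), rep(x2)))                 # True: same class
print(np.allclose(rep(x1 + x2), rep(x1) + rep(x2)))  # class addition is well defined
# dimension of the factor space: n - m = 3 - 1 = 2 (the plane of representatives)
```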
Note that not only the concept of linear dependence of vectors, but also all the concepts, statements, and reasonings in the preceding sections of this chapter can be repeated word for word with the symbol '=' replaced throughout by the symbol '$\equiv \pmod I$,' where I is some fixed subspace invariant with respect to A. All these concepts will be called 'relative,' in contrast to the 'absolute' concepts that were introduced earlier (and that hold for the symbol '='). Side by side with the 'absolute' statements of the preceding sections we have 'relative' statements. For example, we have the statement: 'In every space there always exists a vector whose relative minimal polynomial coincides with the relative minimal polynomial of the whole space.' The truth of all 'relative' statements depends on the fact that by operating with congruences modulo I we deal essentially with equalities, however not in the space R, but in the factor space $\bar R$.

The reader should observe that the relative minimal polynomial (of a vector or a space) is a divisor of the absolute one. For example, let $\sigma_1(\lambda)$ be the relative minimal polynomial of a vector $x$ and $\sigma(\lambda)$ the corresponding absolute minimal polynomial. Then $\sigma(A)x = o$, and hence it follows that also $\sigma(A)x \equiv o \pmod I$. Therefore $\sigma(\lambda)$ is a relative annihilating polynomial of $x$ and as such is divisible by the relative minimal polynomial $\sigma_1(\lambda)$.

§ 4. Decomposition of a Space into Cyclic Invariant Subspaces

1. Let

$$\varphi(\lambda) = \lambda^p + \alpha_1\lambda^{p-1} + \cdots + \alpha_p$$

be the minimal polynomial of a vector e. Then the vectors

$$e,\ Ae,\ \ldots,\ A^{p-1}e \qquad (17)$$

are linearly independent, and

$$A^p e = -\alpha_p e - \alpha_{p-1}Ae - \cdots - \alpha_1 A^{p-1}e. \qquad (18)$$
The vectors (17) form a basis of a p-dimensional subspace I. Every vector $x \in I$ is representable in the form of a linear combination of the basis vectors (17), i.e., in the form

$$x = \chi(A)e, \qquad (19)$$

where $\chi(\lambda)$ is a polynomial in $\lambda$ of degree $\le p - 1$ with coefficients in F. By forming all possible polynomials $\chi(\lambda)$ of degree $\le p - 1$ with coefficients in F we obtain all the vectors of I, each once only, for only one polynomial $\chi(\lambda)$ corresponds to each vector.

The operator A carries the first vector of (17) into the second, the second into the third, etc.; the last basis vector is carried by A into a linear combination of the basis vectors in accordance with (18). Thus, A carries every basis vector into a vector of I and hence an arbitrary vector of I into another vector of I; i.e., a cyclic subspace is always invariant with respect to A. We shall call this subspace cyclic in view of the special character of the basis (17) and of (18).7 In view of the basis (17) or the formula (19) we shall say that the vector e generates the subspace. Note that the minimal polynomial of the generating vector e is also the minimal polynomial of the whole subspace I.

2. We are now ready to establish the fundamental proposition of the whole theory, according to which the space R splits into cyclic subspaces. Let

$$\psi_1(\lambda) = \psi(\lambda) = \lambda^m + a_1\lambda^{m-1} + \cdots + a_m \qquad (20)$$

be the minimal polynomial of the space R. Then there exists a vector e in the space for which this polynomial is minimal (Theorem 2, p. 180). Let $I_1$ denote the cyclic subspace with the basis $e, Ae, \ldots, A^{m-1}e$. If $n = m$, then $R = I_1$.

Suppose that $n > m$ and that

$$\psi_2(\lambda) = \lambda^p + \beta_1\lambda^{p-1} + \cdots + \beta_p$$

is the minimal polynomial of R (mod $I_1$). By the remark at the end of § 3, $\psi_2(\lambda)$ is a divisor of $\psi_1(\lambda)$, i.e., there exists a polynomial $\chi(\lambda)$ such that

$$\psi_1(\lambda) = \psi_2(\lambda)\,\chi(\lambda). \qquad (21)$$

7 It would be more accurate to call this subspace cyclic with respect to the linear operator A. But since the whole theory is built up with reference to a single operator A, the words 'with respect to the linear operator A' are omitted for the sake of brevity (see the similar remark in footnote 2, p. 176).
By the 'relative' analogue of Theorem 2, in R there exists a vector g* whose relative minimal polynomial (mod $I_1$) is $\psi_2(\lambda)$:

$$\psi_2(A)g^* \equiv o \pmod{I_1}, \qquad (22)$$

i.e., $\psi_2(A)g^* \in I_1$. Hence, by (19), there exists a polynomial $\tilde\chi(\lambda)$ of degree $\le m - 1$ such that

$$\psi_2(A)g^* = \tilde\chi(A)e. \qquad (23)$$

We apply the operator $\chi(A)$ of (21) to both sides of this equation. Then by (21) we obtain on the left $\psi_1(A)g^* = o$, because $\psi_1(\lambda)$ is the absolute minimal polynomial of the space; therefore $\chi(A)\tilde\chi(A)e = o$. This equation shows that the product $\chi(\lambda)\tilde\chi(\lambda)$ is an annihilating polynomial of the vector e and is therefore divisible by the minimal polynomial $\psi_1(\lambda) = \chi(\lambda)\psi_2(\lambda)$, so that $\tilde\chi(\lambda)$ is divisible by $\psi_2(\lambda)$:

$$\tilde\chi(\lambda) = \chi_1(\lambda)\,\psi_2(\lambda), \qquad (24)$$

where $\chi_1(\lambda)$ is a polynomial. Using this decomposition of $\tilde\chi(\lambda)$, we may rewrite (23) as follows:

$$\psi_2(A)\,[g^* - \chi_1(A)e] = o. \qquad (25)$$

We now introduce the vector

$$g = g^* - \chi_1(A)e. \qquad (26)$$

Then (25) can be written as follows:

$$\psi_2(A)g = o. \qquad (27)$$

On the other hand, since $\chi_1(A)e \in I_1$, we have from (26):

$$g \equiv g^* \pmod{I_1}. \qquad (28)$$

The equation (27) shows that $\psi_2(\lambda)$ is an absolute annihilating polynomial of the vector g and is therefore divisible by the absolute minimal polynomial of g. On the other hand, by (28), $\psi_2(\lambda)$, being the relative minimal polynomial of g*, is the same for g as well, and the relative minimal polynomial divides the absolute one. Comparing the last two statements, we deduce that $\psi_2(\lambda)$ is simultaneously the relative and the absolute minimal polynomial of g. From the fact that $\psi_2(\lambda)$ is the absolute minimal polynomial of g it follows that the subspace $I_2$ with the basis

$$g,\ Ag,\ \ldots,\ A^{p-1}g \qquad (29)$$

is cyclic.
From the fact that $\psi_2(\lambda)$ is the relative minimal polynomial of g (mod $I_1$) it follows that the vectors (29) are linearly independent (mod $I_1$), i.e., no linear combination of them with coefficients not all zero can be equal to a linear combination of the vectors (20). Since the latter are themselves linearly independent, our last statement asserts the linear independence of the $m + p$ vectors

$$e,\ Ae,\ \ldots,\ A^{m-1}e;\quad g,\ Ag,\ \ldots,\ A^{p-1}g. \qquad (30)$$

The vectors (30) form a basis of the invariant subspace $I_1 + I_2$ of dimension $m + p$. If $n = m + p$, then $R = I_1 + I_2$. If $n > m + p$, we consider R (mod $I_1 + I_2$) and continue our process of separating cyclic subspaces. Since the whole space R is of finite dimension n, this process must come to an end with some subspace $I_t$, where $t \le n$. We have arrived at the following theorem:

THEOREM 3 (Second Theorem on the Decomposition of a Space into Invariant Subspaces): Relative to a given linear operator A the space can always be split into cyclic subspaces $I_1, I_2, \ldots, I_t$ with minimal polynomials $\psi_1(\lambda), \psi_2(\lambda), \ldots, \psi_t(\lambda)$,

$$R = I_1 + I_2 + \cdots + I_t, \qquad (31)$$

such that $\psi_1(\lambda)$ coincides with the minimal polynomial $\psi(\lambda)$ of the whole space and each $\psi_i(\lambda)$ is a divisor of $\psi_{i-1}(\lambda)$ $(i = 2, \ldots, t)$.

3. We now mention some properties of cyclic spaces. Let R be a cyclic n-dimensional space and $\psi(\lambda)$, of degree m, its minimal polynomial. Then it follows from the definition of a cyclic space that $m = n$. Conversely, suppose that R is an arbitrary space and that it is known that $m = n$. Applying the second decomposition theorem, we represent R in the form (31). But the dimension of the cyclic subspace $I_1$ is m, because its minimal polynomial coincides with the minimal polynomial of the whole space. Since $m = n$ by assumption, we have $R = I_1$, i.e., R is a cyclic space. Thus we have established the following criterion for cyclicity of a space:

THEOREM 4: A space is cyclic if and only if its dimension is equal to the degree of its minimal polynomial.
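Theorem 4 can be tested mechanically: the degree of the minimal polynomial is the first power of A that depends linearly on the lower powers. A sketch with assumed illustrative matrices (a single Jordan-type block is cyclic; a repeated eigenvalue in diagonal form is not):

```python
import sympy as sp

def minpoly_degree(A):
    # smallest m such that A**m depends linearly on E, A, ..., A**(m-1);
    # the powers of A are compared as flattened n*n-dimensional vectors
    n = A.rows
    cols, P = [], sp.eye(n)
    while True:
        v = P.reshape(n * n, 1)
        if sp.Matrix.hstack(*cols, v).rank() < len(cols) + 1:
            return len(cols)
        cols.append(v)
        P = P * A

J = sp.Matrix([[1, 1], [0, 1]])     # minimal polynomial (lam - 1)**2
D = sp.eye(2)                       # minimal polynomial  lam - 1
print(minpoly_degree(J) == J.rows)  # True:  this space is cyclic
print(minpoly_degree(D) == D.rows)  # False: this space is not cyclic
```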
Next, suppose that we have a decomposition of a cyclic space R into two invariant subspaces $I_1$ and $I_2$:

$$R = I_1 + I_2. \qquad (32)$$

We denote the dimensions of R, $I_1$, and $I_2$ by n, $n_1$, and $n_2$, their minimal polynomials by $\psi(\lambda)$, $\psi_1(\lambda)$, and $\psi_2(\lambda)$, and the degrees of these minimal polynomials by m, $m_1$, and $m_2$, respectively. Since the degree of the minimal polynomial of a space cannot exceed the dimension of the space,

$$m_1 \le n_1, \qquad m_2 \le n_2. \qquad (33)$$

We add these inequalities term by term:

$$m_1 + m_2 \le n_1 + n_2. \qquad (34)$$

Since $\psi(\lambda)$ is the least common multiple of $\psi_1(\lambda)$ and $\psi_2(\lambda)$, we have

$$m \le m_1 + m_2. \qquad (35)$$

Moreover, it follows from (32) that

$$n = n_1 + n_2. \qquad (36)$$

(34), (35), and (36) give us a chain of relations

$$m \le m_1 + m_2 \le n_1 + n_2 = n. \qquad (37)$$

But since the space R is cyclic, the extreme numbers of this chain, m and n, are equal. Therefore we have equality in the middle terms:

$$m = m_1 + m_2 = n_1 + n_2.$$

Bearing (33) in mind, we find from $m_1 + m_2 = n_1 + n_2$ that

$$m_1 = n_1, \qquad m_2 = n_2. \qquad (38)$$

These equations mean that the subspaces $I_1$ and $I_2$ are cyclic. From the fact that $m = m_1 + m_2$, i.e. that the degree of the least common multiple of $\psi_1(\lambda)$ and $\psi_2(\lambda)$ equals the degree of their product, we deduce that $\psi_1(\lambda)$ and $\psi_2(\lambda)$ are co-prime. Thus we have arrived at the following proposition:

THEOREM 5: A cyclic space can only split into invariant subspaces that 1. are cyclic and 2. have co-prime minimal polynomials.

The same arguments (in the opposite order) show that Theorem 5 has a converse:

THEOREM 6: If a space is split into invariant subspaces that 1. are cyclic and 2. have co-prime minimal polynomials, then the space itself is cyclic.

Suppose now that R is a cyclic space and that its minimal polynomial is a power of an irreducible polynomial over F: $\psi(\lambda) = [\varphi(\lambda)]^c$. In this case, the minimal polynomial of every invariant subspace of R must also be a power of this irreducible polynomial $\varphi(\lambda)$. Therefore the minimal polynomials of any two invariant subspaces cannot be co-prime, and, by what we have proved, R cannot split into invariant subspaces.
Suppose, conversely, that some space R is known not to split into invariant subspaces. Then, by the second decomposition theorem, R is a cyclic space, for otherwise it could be split into cyclic subspaces. Moreover, the minimal polynomial of R must be a power of an irreducible polynomial, because otherwise, by the first decomposition theorem, R could be split into invariant subspaces. Thus we have reached the following conclusion:

THEOREM 7: A space does not split into invariant subspaces if and only if 1. it is cyclic and 2. its minimal polynomial is a power of an irreducible polynomial over F.

We now return to the decomposition (31) and split the minimal polynomials $\psi_1(\lambda), \psi_2(\lambda), \ldots, \psi_t(\lambda)$ of the cyclic subspaces $I_1, I_2, \ldots, I_t$ into irreducible factors over F:

$$\psi_1(\lambda) = [\varphi_1(\lambda)]^{c_1}[\varphi_2(\lambda)]^{c_2}\cdots[\varphi_s(\lambda)]^{c_s},$$
$$\psi_2(\lambda) = [\varphi_1(\lambda)]^{d_1}[\varphi_2(\lambda)]^{d_2}\cdots[\varphi_s(\lambda)]^{d_s},$$
$$\cdots\cdots\cdots\cdots\cdots$$
$$\psi_t(\lambda) = [\varphi_1(\lambda)]^{l_1}[\varphi_2(\lambda)]^{l_2}\cdots[\varphi_s(\lambda)]^{l_s}$$
$$(c_k \ge d_k \ge \cdots \ge l_k \ge 0;\ k = 1, 2, \ldots, s).^8 \qquad (39)$$

To $I_1$ we apply the first decomposition theorem. Then we obtain

$$I_1 = I_1' + I_1'' + \cdots + I_1^{(s)},$$

where $I_1', I_1'', \ldots, I_1^{(s)}$ are cyclic subspaces with the minimal polynomials $[\varphi_1(\lambda)]^{c_1}, [\varphi_2(\lambda)]^{c_2}, \ldots, [\varphi_s(\lambda)]^{c_s}$. Similarly we decompose the spaces $I_2, \ldots, I_t$. In this way we obtain a decomposition of the whole space R into cyclic subspaces with the minimal polynomials

$$[\varphi_k(\lambda)]^{c_k},\ [\varphi_k(\lambda)]^{d_k},\ \ldots,\ [\varphi_k(\lambda)]^{l_k} \qquad (k = 1, 2, \ldots, s).$$

(Here we neglect the powers whose exponents are zero.) From Theorem 7 it follows that these cyclic subspaces are indecomposable (into invariant subspaces). We have thus arrived at the following theorem:

THEOREM 8 (Third Theorem on the Decomposition of a Space into Invariant Subspaces): A space can always be split into cyclic invariant subspaces

$$R = I' + I'' + \cdots + I^{(u)}, \qquad (40)$$

such that the minimal polynomial of each of these cyclic subspaces is a power of an irreducible polynomial over F.

8 Some of the exponents $d_k, \ldots, l_k$ may be equal to zero.
Note that Theorem 8 (the third decomposition theorem) has been proved by applying the first two decomposition theorems. But it can also be obtained by other means, namely as an immediate (and almost trivial) corollary of Theorem 7. For if the space R splits at all, then it can always be split into indecomposable invariant subspaces (40), and by Theorem 7 each of the constituent subspaces is cyclic and has as its minimal polynomial a power of an irreducible polynomial over F.

§ 5. The Normal Form of a Matrix

1. Let $I_1$ be an m-dimensional invariant subspace of R. In $I_1$ we take an arbitrary basis $e_1, e_2, \ldots, e_m$ and complement it to a basis $e_1, e_2, \ldots, e_m, e_{m+1}, \ldots, e_n$ of R. Let us see what the matrix A of the operator A looks like in this basis. We remind the reader that the k-th column of A consists of the coordinates of the vector $Ae_k$ $(k = 1, \ldots, n)$. For $k \le m$ the vector $Ae_k \in I_1$ (by the invariance of $I_1$) and the last $n - m$ coordinates of $Ae_k$ are zero. Therefore A has the following form:

$$A = \begin{pmatrix} A_1 & A_3 \\ 0 & A_2 \end{pmatrix}\ \begin{matrix}\}\,m\\ \}\,n-m\end{matrix} \qquad (41)$$

where $A_1$ and $A_2$ are square matrices of orders m and $n - m$, respectively, and $A_3$ is a rectangular matrix. The fact that the fourth 'block' is zero expresses the invariance of the subspace $I_1$. The matrix $A_1$ gives the operator A in $I_1$ (with respect to the basis $e_1, \ldots, e_m$).

Let us assume now that $e_{m+1}, \ldots, e_n$ is the basis of some invariant subspace $I_2$, so that $R = I_1 + I_2$ and a basis of the whole space is formed from the two parts that are the bases of the invariant subspaces $I_1$ and $I_2$. Then obviously the block $A_3$ in (41) is also equal to zero and the matrix A has the quasi-diagonal form
$$A = \{A_1,\ A_2\}, \qquad (42)$$

where $A_1$ and $A_2$ are square matrices of orders m and $n - m$ which give the operator in the subspaces $I_1$ and $I_2$ (with respect to the bases $e_1, \ldots, e_m$ and $e_{m+1}, \ldots, e_n$). It is not difficult to see that, conversely, to a quasi-diagonal form of the matrix there always corresponds a decomposition of the space into invariant subspaces (and the basis of the whole space is formed from the bases of these subspaces).

2. By the second decomposition theorem, we can split the whole space R into cyclic subspaces $I_1, I_2, \ldots, I_t$:

$$R = I_1 + I_2 + \cdots + I_t. \qquad (43)$$

In the sequence of minimal polynomials of these subspaces, $\psi_1(\lambda), \psi_2(\lambda), \ldots, \psi_t(\lambda)$, each factor is a divisor of the preceding one (from which it follows automatically that the first polynomial is the minimal polynomial of the whole space). Let

$$\psi_1(\lambda) = \lambda^m + \alpha_1\lambda^{m-1} + \cdots + \alpha_m,\ \ldots,\ \psi_t(\lambda) = \lambda^q + \varepsilon_1\lambda^{q-1} + \cdots + \varepsilon_q \qquad (m \ge p \ge \cdots \ge q). \qquad (44)$$

We denote by $e, g, \ldots, l$ generating vectors of the subspaces $I_1, I_2, \ldots, I_t$, and we form a basis of the whole space from the following bases of the cyclic subspaces:

$$e,\ Ae,\ \ldots,\ A^{m-1}e;\quad g,\ Ag,\ \ldots,\ A^{p-1}g;\quad \ldots;\quad l,\ Al,\ \ldots,\ A^{q-1}l. \qquad (45)$$

The matrix corresponding to A in this basis must have quasi-diagonal form

$$L_I = \{L_1,\ L_2,\ \ldots,\ L_t\}. \qquad (46)$$

The matrix $L_1$ corresponds to the operator A in $I_1$ with respect to the basis $e_1 = e,\ e_2 = Ae,\ \ldots,\ e_m = A^{m-1}e$. By applying the rule for the formation of the matrix for a given operator in a given basis (Chapter III, p. 67), we find:
$$L_1 = \begin{pmatrix} 0 & 0 & \dots & 0 & -\alpha_m \\ 1 & 0 & \dots & 0 & -\alpha_{m-1} \\ 0 & 1 & \dots & 0 & -\alpha_{m-2} \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \dots & 1 & -\alpha_1 \end{pmatrix}, \qquad (47)$$

and similarly

$$L_t = \begin{pmatrix} 0 & 0 & \dots & 0 & -\varepsilon_q \\ 1 & 0 & \dots & 0 & -\varepsilon_{q-1} \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \dots & 1 & -\varepsilon_1 \end{pmatrix}, \qquad (48)$$

with the analogous structure for the intermediate blocks $L_2, \ldots, L_{t-1}$. Computing the characteristic polynomials of the matrices $L_1, L_2, \ldots, L_t$, we find:

$$|\lambda E - L_1| = \psi_1(\lambda),\quad |\lambda E - L_2| = \psi_2(\lambda),\quad \ldots,\quad |\lambda E - L_t| = \psi_t(\lambda)$$

(for cyclic subspaces the characteristic polynomial of an operator A coincides with the minimal polynomial of the subspace relative to this operator).

The matrix $L_I$ corresponds to the operator A in the 'canonical' basis (45). If A is the matrix corresponding to A in an arbitrary basis, then A is similar to $L_I$, i.e., there exists a non-singular matrix T such that

$$A = T L_I T^{-1}. \qquad (49)$$

Of the matrix $L_I$ we shall say that it has the first natural normal form. It is characterized by:

1) the quasi-diagonal form $L_I = \{L_1, L_2, \ldots, L_t\}$;
2) the special structure of the diagonal blocks (47), (48);
3) the additional condition: the characteristic polynomial of each diagonal block is divisible by the characteristic polynomial of the following block.

If we start not from the second, but from the third decomposition theorem, then in exactly the same way we would obtain a matrix $L_{II}$ corresponding to the operator A in the appropriate basis, a matrix having the second natural normal form.
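The blocks (47), (48) are companion matrices of the polynomials $\psi_i(\lambda)$. A small sketch (the helper and its test polynomial are ours) that builds such a block and confirms that its characteristic polynomial reproduces the polynomial itself:

```python
import sympy as sp

lam = sp.symbols('lambda')

def companion(coeffs):
    # coeffs = [a1, a2, ..., am] of psi = lam**m + a1*lam**(m-1) + ... + am;
    # 1's along the subdiagonal, -am, ..., -a1 down the last column as in (47)
    m = len(coeffs)
    L = sp.zeros(m)
    for i in range(1, m):
        L[i, i - 1] = 1
    for i in range(m):
        L[i, m - 1] = -coeffs[m - 1 - i]
    return L

L1 = companion([0, -2, 0, 1])        # psi1 = lam**4 - 2*lam**2 + 1
print(sp.factor(L1.charpoly(lam).as_expr()))   # (lambda - 1)**2*(lambda + 1)**2
```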
This form is characterized by:

1) the quasi-diagonal form $L_{II} = \{L^{(1)}, L^{(2)}, \ldots, L^{(u)}\}$;
2) the special structure of the diagonal blocks (47), (48);
3) the additional condition: the characteristic polynomial of each block is a power of an irreducible polynomial over F.

There may be many canonical bases of the form (45), but to all of them there corresponds one and the same matrix $L_I$.9 In the following section we shall show that in the class of similar matrices corresponding to one and the same operator there is one and only one matrix having the first normal form, and one and only one10 having the second normal form.

§ 6. Invariant Polynomials. Elementary Divisors

1. In subsection 1. of the present section we repeat for the characteristic matrix the basic concepts of Chapter VI, § 3 that were there established for an arbitrary polynomial matrix.11

We denote by $D_p(\lambda)$ the greatest common divisor of all the minors of order p of the characteristic matrix $\lambda E - A$ $(p = 1, 2, \ldots, n)$.12 Since in the sequence $D_n(\lambda), D_{n-1}(\lambda), \ldots, D_1(\lambda), D_0(\lambda) \equiv 1$ each polynomial is divisible by the following, the formulas

$$i_1(\lambda) = \frac{D_n(\lambda)}{D_{n-1}(\lambda)},\quad i_2(\lambda) = \frac{D_{n-1}(\lambda)}{D_{n-2}(\lambda)},\quad \ldots,\quad i_n(\lambda) = \frac{D_1(\lambda)}{D_0(\lambda)} \qquad (50)$$

define n polynomials whose product is equal to the characteristic polynomial $\Delta(\lambda) = |\lambda E - A|$. We split the polynomials $i_p(\lambda)$ $(p = 1, 2, \ldots, n)$ into irreducible factors over F:

$$i_p(\lambda) = [\varphi_1(\lambda)]^{\gamma_p}[\varphi_2(\lambda)]^{\delta_p}\cdots \qquad (p = 1, 2, \ldots, n), \qquad (51)$$

where $\varphi_1(\lambda), \varphi_2(\lambda), \ldots$ are distinct irreducible polynomials over F with highest coefficient 1.

9 This does not mean that there exists only one canonical basis of the form (45).
10 To within the order of the diagonal blocks.
11 That chapter develops these concepts for an arbitrary polynomial matrix; here they are applied to the characteristic matrix $\lambda E - A$.
12 We always take the highest coefficient of the greatest common divisor as 1.
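For small matrices the definitions (50) can be applied literally, if inefficiently: compute every minor of order p of $\lambda E - A$, take the monic gcd, and divide successive D's. A sketch with an assumed 3 by 3 example:

```python
import sympy as sp
from functools import reduce
from itertools import combinations

lam = sp.symbols('lambda')
A = sp.Matrix([[1, 1, 0],
               [0, 1, 0],
               [0, 0, 1]])
n = 3
C = lam * sp.eye(n) - A

def D(p):
    # gcd (with highest coefficient 1) of all minors of order p of lam*E - A
    if p == 0:
        return sp.Integer(1)
    minors = [C[list(r), list(c)].det()
              for r in combinations(range(n), p)
              for c in combinations(range(n), p)]
    return sp.Poly(reduce(sp.gcd, minors), lam).monic().as_expr()

for p in range(1, n + 1):        # invariant polynomials by the formulas (50)
    print(sp.factor(sp.cancel(D(n - p + 1) / D(n - p))))
# prints (lambda - 1)**2, lambda - 1, 1: the elementary divisors of this A
# over the rationals are (lambda - 1)**2 and lambda - 1
```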
The polynomials $i_1(\lambda), i_2(\lambda), \ldots, i_n(\lambda)$ are called the invariant polynomials, and all the non-constant powers among $[\varphi_1(\lambda)]^{\gamma_p}, [\varphi_2(\lambda)]^{\delta_p}, \ldots$ $(p = 1, 2, \ldots, n)$ are called the elementary divisors, of the characteristic matrix $\lambda E - A$ or, simply, of the matrix A. The product of all the elementary divisors, like the product of all the invariant polynomials, is equal to the characteristic polynomial $\Delta(\lambda)$.

Since all the matrices representing a given operator A in various bases are similar and, as we shall see, therefore have the same invariant polynomials and the same elementary divisors, we can speak of the invariant polynomials and the elementary divisors of an operator A. The name 'invariant polynomial' is justified by the fact that two similar matrices A and $\tilde A$ always have identical invariant polynomials,

$$\tilde i_p(\lambda) = i_p(\lambda) \qquad (p = 1, 2, \ldots, n). \qquad (53)$$

For it follows from

$$\tilde A = T^{-1}AT \qquad (54)$$

that

$$\tilde A_\lambda = \lambda E - \tilde A = T^{-1}(\lambda E - A)T = T^{-1}A_\lambda T. \qquad (55)$$

Hence (see Chapter I, § 2) we obtain a relation between the minors of the similar matrices $\tilde A_\lambda$ and $A_\lambda$:

$$\tilde A_\lambda\begin{pmatrix} i_1 & i_2 & \dots & i_p \\ k_1 & k_2 & \dots & k_p \end{pmatrix} = \sum T^{-1}\begin{pmatrix} i_1 & \dots & i_p \\ \alpha_1 & \dots & \alpha_p \end{pmatrix} A_\lambda\begin{pmatrix} \alpha_1 & \dots & \alpha_p \\ \beta_1 & \dots & \beta_p \end{pmatrix} T\begin{pmatrix} \beta_1 & \dots & \beta_p \\ k_1 & \dots & k_p \end{pmatrix}, \qquad (56)$$

the sum being taken over all index sets $\alpha_1 < \cdots < \alpha_p$, $\beta_1 < \cdots < \beta_p$. This equation shows that every common divisor of all the minors of order p of $A_\lambda$ is a common divisor of all the minors of order p of $\tilde A_\lambda$, and vice versa (since A and $\tilde A$ can interchange places). Hence it follows that $\tilde D_p(\lambda) = D_p(\lambda)$ $(p = 1, 2, \ldots, n)$ and that (53) holds.

2. We choose now for $\tilde A$ the matrix $L_I$ having the first natural normal form and we compute the invariant polynomials of A, starting from the form of the matrix $\tilde L_\lambda = \lambda E - L_I$; in (57) this matrix is written out for a case of three diagonal blocks:
$$\lambda E - L_I = \{\lambda E - L_1,\ \lambda E - L_2,\ \lambda E - L_3\}, \qquad (57)$$

in which each diagonal block has $\lambda$ along the main diagonal, $-1$ along the first subdiagonal, and the coefficients of the corresponding $\psi_i(\lambda)$ in the last column, the corner element being $\alpha_1 + \lambda$ (respectively $\beta_1 + \lambda, \ldots, \varepsilon_1 + \lambda$); all the other elements are zero.

Using Laplace's Theorem, we find:

$$D_n(\lambda) = |\lambda E - L_I| = |\lambda E - L_1|\,|\lambda E - L_2|\,\cdots\,|\lambda E - L_t| = \psi_1(\lambda)\psi_2(\lambda)\cdots\psi_t(\lambda). \qquad (58)$$

Now let us find $D_{n-1}(\lambda)$. We consider the minor of order $n - 1$ obtained by suppressing in (57) the row and the column that intersect in the upper right-hand corner element of the first diagonal block. What remains of the first block after this suppression is a triangular matrix with $-1$ along the diagonal, so that, apart from a factor $\pm 1$, this minor is equal to

$$|\lambda E - L_2|\cdots|\lambda E - L_t| = \psi_2(\lambda)\cdots\psi_t(\lambda). \qquad (59)$$

We shall show that this minor of order $n - 1$ is a divisor of all the other minors of order $n - 1$, so that

$$D_{n-1}(\lambda) = \psi_2(\lambda)\cdots\psi_t(\lambda). \qquad (60)$$

For this purpose we first take the minor of an element outside the diagonal blocks and show that it vanishes. Suppose, for example, that in the j-th diagonal block one of the rows is crossed out, while the column crossed out intersects a different diagonal block; the lines crossed out thus intersect two distinct diagonal blocks. In the minor we take that vertical strip which contains the j-th diagonal block (the strip has s columns, where s denotes the order of this block). In this strip, all the rows except $s - 1$ rows consist entirely of zeros. Expanding the determinant of order $n - 1$ by Laplace's Theorem with respect to the minors of order s in this strip, we see that it is equal to zero, since every such minor contains a row of zeros.
Now we take the minor of an element inside one of the diagonal blocks, say the j-th. In this case the lines crossed out 'mutilate' only one of the diagonal blocks, and the matrix of the minor is again quasi-diagonal. Therefore the minor is equal to

$$\psi_1(\lambda)\cdots\psi_{j-1}(\lambda)\ \chi(\lambda)\ \psi_{j+1}(\lambda)\cdots\psi_t(\lambda), \qquad (61)$$

where $\chi(\lambda)$ is the determinant of the 'mutilated' j-th diagonal block. Since $\psi_{i+1}(\lambda)$ is a divisor of $\psi_i(\lambda)$ $(i = 1, \ldots, t-1)$, the product $\psi_1(\lambda)\cdots\psi_{j-1}(\lambda)\psi_{j+1}(\lambda)\cdots\psi_t(\lambda)$ is divisible by $\psi_2(\lambda)\cdots\psi_t(\lambda)$; hence every minor (61) is divisible by (59), and equation (60) can be regarded as proved.

By similar arguments we obtain:

$$D_{n-2}(\lambda) = \psi_3(\lambda)\cdots\psi_t(\lambda),\ \ldots,\ D_{n-t+1}(\lambda) = \psi_t(\lambda),\ D_{n-t}(\lambda) = \cdots = D_1(\lambda) = 1. \qquad (62)$$

From (58), (60), and (62) we find:

$$\psi_1(\lambda) = \frac{D_n(\lambda)}{D_{n-1}(\lambda)} = i_1(\lambda),\quad \psi_2(\lambda) = \frac{D_{n-1}(\lambda)}{D_{n-2}(\lambda)} = i_2(\lambda),\quad \ldots,\quad \psi_t(\lambda) = i_t(\lambda);\quad i_{t+1}(\lambda) = \cdots = i_n(\lambda) = 1. \qquad (63)$$

The formulas (63) show that the polynomials $\psi_1(\lambda), \psi_2(\lambda), \ldots, \psi_t(\lambda)$ are uniquely determined: they coincide with the invariant polynomials, other than 1, of the operator A (or the corresponding matrix A).

Let us give three equivalent formulations of the results obtained:

THEOREM 9 (More precise form of the Second Decomposition Theorem): If A is a linear operator in R, then the space R can be decomposed into cyclic subspaces such that in the sequence of minimal polynomials $\psi_1(\lambda), \psi_2(\lambda), \ldots, \psi_t(\lambda)$ each is divisible by the following; the polynomials $\psi_1(\lambda), \psi_2(\lambda), \ldots, \psi_t(\lambda)$ coincide with the invariant polynomials, other than 1, of the operator A.

THEOREM 9': For every linear operator A in R there exists a basis in which the matrix $L_I$ that gives the operator is of the first natural normal form. This matrix is uniquely determined when the operator A is given: the characteristic polynomials of the diagonal blocks of $L_I$ are the invariant polynomials of A.
THEOREM 9'': In every class of similar matrices (with elements in F) there exists one and only one matrix $L_I$ having the first natural normal form. The characteristic polynomials of the diagonal blocks of $L_I$ coincide with the invariant polynomials (other than 1) of every matrix of that class.

The characteristic polynomial $\Delta(\lambda)$ of the operator A coincides with $D_n(\lambda)$, and hence with the product of all the invariant polynomials:

$$\Delta(\lambda) = \psi_1(\lambda)\,\psi_2(\lambda)\cdots\psi_t(\lambda). \qquad (64)$$

But $\psi_1(\lambda)$ is the minimal polynomial of the whole space with respect to A; hence $\psi_1(A) = 0$ and by (64) $\Delta(A) = 0$. Thus we have incidentally obtained the Hamilton-Cayley Theorem (see Chapter IV, § 4): Every linear operator (every square matrix) satisfies its characteristic equation.

3. On p. 194 we established that two similar matrices have the same invariant polynomials. Now suppose, conversely, that two matrices A and B with elements in F are known to have the same invariant polynomials. Since the matrix $L_I$ is uniquely determined when these polynomials are given, the two matrices A and B are similar to one and the same matrix $L_I$ and, therefore, to each other. We thus arrive at the following proposition:

THEOREM 10: Two matrices with elements in F are similar if and only if they have the same invariant polynomials.13

In § 4, by splitting the polynomials $\psi_1(\lambda), \psi_2(\lambda), \ldots, \psi_t(\lambda)$ into irreducible factors over F:

$$\psi_1(\lambda) = [\varphi_1(\lambda)]^{c_1}[\varphi_2(\lambda)]^{c_2}\cdots[\varphi_s(\lambda)]^{c_s},\quad \psi_2(\lambda) = [\varphi_1(\lambda)]^{d_1}[\varphi_2(\lambda)]^{d_2}\cdots[\varphi_s(\lambda)]^{d_s},\ \ldots$$
$$(c_k \ge d_k \ge \cdots \ge l_k \ge 0;\ k = 1, 2, \ldots, s), \qquad (66)$$

we were led to the third decomposition theorem. By (63), all the powers, other than 1, among $[\varphi_k(\lambda)]^{c_k}, [\varphi_k(\lambda)]^{d_k}, \ldots, [\varphi_k(\lambda)]^{l_k}$ $(k = 1, 2, \ldots, s)$ are the elementary divisors of A (or A) in the field F (see p. 194). To each power with non-zero exponent on the right-hand sides of (66) there corresponds an invariant subspace in this decomposition. Thus we arrive at the following more precise statement of the third decomposition theorem:

13 Or (what is the same) the same elementary divisors in the field F.
THEOREM 11: If A is a linear operator in a vector space R over a field F, then R can be split into cyclic subspaces whose minimal polynomials are the elementary divisors of A in F.

We have arrived at another formulation of Theorem 11:

THEOREM 11': For every linear operator A in R (over the field F) there exists a basis in which the matrix $L_{II}$ giving the operator is of the second natural normal form; the characteristic polynomials of the diagonal blocks are the elementary divisors of A in F.

This theorem also admits a formulation in terms of matrices:

THEOREM 11'': A matrix A with elements in the field F is always similar to a matrix $L_{II}$ having the second natural normal form, in which the characteristic polynomials of the diagonal blocks are the elementary divisors of A.

Theorem 11 and the associated Theorems 11' and 11'' have, in a certain sense, a converse. Let

$$R = I' + I'' + \cdots + I^{(u)} \qquad (67)$$

be an arbitrary decomposition of the space R into indecomposable invariant subspaces. Then by Theorem 7 the subspaces $I', I'', \ldots, I^{(u)}$ are cyclic and their minimal polynomials are powers of irreducible polynomials over F. We denote by $e', e'', \ldots, e^{(u)}$ generating vectors of the subspaces $I', I'', \ldots, I^{(u)}$, and from the 'cyclic' bases of these subspaces we form a basis of the whole space:

$$e',\ Ae',\ \ldots;\quad e'',\ Ae'',\ \ldots;\quad \ldots;\quad e^{(u)},\ Ae^{(u)},\ \ldots. \qquad (68)$$

It is easy to see that the matrix $L_{II}$ corresponding to the operator A in the basis (68) has quasi-diagonal form, like $L_I$:

$$L_{II} = \{L^{(1)},\ L^{(2)},\ \ldots,\ L^{(u)}\}. \qquad (69)$$

The diagonal blocks $L^{(1)}, L^{(2)}, \ldots, L^{(u)}$ are of the same structure as the blocks (47) and (48) of $L_I$. However, the characteristic polynomials of these diagonal blocks are not the invariant polynomials, but the minimal polynomials of the subspaces in (67), powers of irreducible polynomials over F. We may write these powers, after adding powers with zero exponent if necessary, in the form14

14 At least one of the exponents in each column of (70) is positive.
$$[\varphi_1(\lambda)]^{c_1},\ [\varphi_2(\lambda)]^{c_2},\ \ldots,\ [\varphi_s(\lambda)]^{c_s};$$
$$[\varphi_1(\lambda)]^{d_1},\ [\varphi_2(\lambda)]^{d_2},\ \ldots,\ [\varphi_s(\lambda)]^{d_s};$$
$$\cdots\cdots\cdots\cdots\cdots$$
$$[\varphi_1(\lambda)]^{l_1},\ [\varphi_2(\lambda)]^{l_2},\ \ldots,\ [\varphi_s(\lambda)]^{l_s}$$
$$(c_k \ge d_k \ge \cdots \ge l_k \ge 0;\ k = 1, 2, \ldots, s). \qquad (70)$$

We denote the sum of the subspaces whose minimal polynomials are in the first row by $I_1$; similarly, we introduce $I_2, \ldots, I_t$ (t is the number of rows in (70)). By Theorem 6, the subspaces $I_1, I_2, \ldots, I_t$ are cyclic and their minimal polynomials $\psi_1(\lambda), \psi_2(\lambda), \ldots, \psi_t(\lambda)$ are determined by the formulas (66). Here in the sequence $\psi_1(\lambda), \psi_2(\lambda), \ldots, \psi_t(\lambda)$ each polynomial is divisible by the following. But then Theorem 9 is immediately applicable to the decomposition $R = I_1 + I_2 + \cdots + I_t$, and therefore, by (66), all the powers (70) with non-zero exponent are the elementary divisors of A in the field F. Thus we have the following theorem:

THEOREM 12: If the vector space R (over the field F) is split in any way into indecomposable invariant subspaces (with respect to an operator A), then the minimal polynomials of these subspaces are all the elementary divisors of A in F.

There is an equivalent formulation in terms of matrices:

THEOREM 12': In each class of similar matrices (with elements in F) there exists only one matrix (to within the order of the diagonal blocks) having the second normal form $L_{II}$; the characteristic polynomials of its diagonal blocks are the elementary divisors of every matrix of the given class.

Suppose now that the space R is split into two invariant subspaces (with respect to an operator A),

$$R = I_1 + I_2.$$

When we split $I_1$ and $I_2$ into indecomposable subspaces, we obtain at the same time a decomposition of the whole space R into indecomposable subspaces. Hence, bearing Theorem 12 in mind, we obtain:

THEOREM 13: If the space R is split into invariant subspaces with respect to an operator A, then the elementary divisors of A in each of these invariant subspaces, taken in their totality, form a complete system of elementary divisors of A in R.

This theorem has the following matrix form:
THEOREM 13': A complete system of elementary divisors in F of a quasi-diagonal matrix is obtained as the union of the elementary divisors of the diagonal blocks.

Theorem 13' is often used for the actual process of finding the elementary divisors of a matrix.

§ 7. The Jordan Normal Form of a Matrix

1. Suppose that all the roots of the characteristic polynomial $\Delta(\lambda)$ of an operator A belong to the field F. This will hold true, in particular, if F is the field of all complex numbers. In this case, the decomposition of the invariant polynomials into elementary divisors in F will look as follows:

$$i_1(\lambda) = (\lambda - \lambda_1)^{c_1}(\lambda - \lambda_2)^{c_2}\cdots(\lambda - \lambda_s)^{c_s},$$
$$i_2(\lambda) = (\lambda - \lambda_1)^{d_1}(\lambda - \lambda_2)^{d_2}\cdots(\lambda - \lambda_s)^{d_s},\ \ldots$$
$$(c_k \ge d_k \ge \cdots \ge l_k \ge 0,\ c_k > 0;\ k = 1, 2, \ldots, s); \qquad (71)$$

here $\lambda_1, \lambda_2, \ldots, \lambda_s$ are all the distinct roots of $\Delta(\lambda)$.

We take an arbitrary elementary divisor

$$(\lambda - \lambda_0)^p; \qquad (72)$$

here $\lambda_0$ is one of the numbers $\lambda_1, \lambda_2, \ldots, \lambda_s$ and p is one of the (non-zero) exponents $c_k, d_k, \ldots, l_k$. To this elementary divisor there corresponds in (67) a definite cyclic subspace I, generated by a vector which we denote by e; for this vector $(\lambda - \lambda_0)^p$ is the minimal polynomial. We consider the vectors

$$e_1 = (A - \lambda_0 E)^{p-1}e,\quad e_2 = (A - \lambda_0 E)^{p-2}e,\quad \ldots,\quad e_p = e. \qquad (73)$$

The vectors $e_1, e_2, \ldots, e_p$ are linearly independent, since otherwise there would be an annihilating polynomial for e of degree less than p, which is impossible. Now we note that

$$(A - \lambda_0 E)e_1 = o,\quad (A - \lambda_0 E)e_2 = e_1,\quad \ldots,\quad (A - \lambda_0 E)e_p = e_{p-1}, \qquad (74)$$

or

$$Ae_1 = \lambda_0 e_1,\quad Ae_2 = \lambda_0 e_2 + e_1,\quad \ldots,\quad Ae_p = \lambda_0 e_p + e_{p-1}. \qquad (75)$$
With the help of (75) we can easily write down the matrix corresponding to A in I for the basis (73). This matrix looks as follows:

$$\begin{pmatrix} \lambda_0 & 1 & 0 & \dots & 0 \\ 0 & \lambda_0 & 1 & \dots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_0 & 1 \\ 0 & 0 & \dots & 0 & \lambda_0 \end{pmatrix} = \lambda_0 E^{(p)} + H^{(p)}, \qquad (76)$$

where $E^{(p)}$ is the unit matrix of order p and $H^{(p)}$ the matrix of order p which has 1's along the first superdiagonal and 0's everywhere else.

Linearly independent vectors $e_1, e_2, \ldots, e_p$ for which (75) holds form a so-called Jordan chain of vectors in I. From Jordan chains connected with each subspace $I', I'', \ldots, I^{(u)}$ of (67) we form a Jordan basis of R. Let

$$(\lambda - \lambda_1)^{p_1},\ (\lambda - \lambda_2)^{p_2},\ \ldots,\ (\lambda - \lambda_u)^{p_u} \qquad (77)$$

be the elementary divisors of A (the numbers $\lambda_1, \lambda_2, \ldots, \lambda_u$ need not all be distinct). Then the matrix J corresponding to A in a Jordan basis has the following quasi-diagonal form:

$$J = \{\lambda_1 E^{(p_1)} + H^{(p_1)},\ \lambda_2 E^{(p_2)} + H^{(p_2)},\ \ldots,\ \lambda_u E^{(p_u)} + H^{(p_u)}\}.$$

We shall say of the matrix J that it is of Jordan normal form or simply Jordan form. The matrix J can be written down at once when the elementary divisors of A in the field F containing all the characteristic roots of the equation $\Delta(\lambda) = 0$ are known. Every matrix A is similar to a matrix J of Jordan normal form, i.e., for an arbitrary matrix A there always exists a non-singular matrix T ($|T| \ne 0$) such that

$$A = TJT^{-1}. \qquad (78)$$

If all the elementary divisors of A are of the first degree (and in that case only), the Jordan form is a diagonal matrix, and we have:

$$A = T\{\lambda_1,\ \lambda_2,\ \ldots,\ \lambda_n\}T^{-1}.$$

Thus: A linear operator A has simple structure (see Chapter III, § 8) if and only if all the elementary divisors of A are linear.
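Computer algebra systems construct J and a transforming matrix T directly; a brief sympy illustration on an assumed 2 by 2 matrix with the single elementary divisor $(\lambda - 2)^2$:

```python
import sympy as sp

A = sp.Matrix([[3, -1],
               [1,  1]])    # characteristic polynomial (lam - 2)**2
T, J = A.jordan_form()      # sympy returns (T, J) with A = T*J*T**(-1)
print(J)                    # Matrix([[2, 1], [0, 2]]): one block 2E + H
assert sp.simplify(T * J * T.inv() - A) == sp.zeros(2)
```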
Let us number the vectors $e_1, e_2, \ldots, e_p$ defined by (73) in the reverse order:

$$g_1 = e_p = e,\quad g_2 = e_{p-1} = (A - \lambda_0 E)e,\quad \ldots,\quad g_p = e_1 = (A - \lambda_0 E)^{p-1}e. \qquad (79)$$

Then

$$(A - \lambda_0 E)g_1 = g_2,\quad (A - \lambda_0 E)g_2 = g_3,\quad \ldots,\quad (A - \lambda_0 E)g_p = o,$$

hence

$$Ag_1 = \lambda_0 g_1 + g_2,\quad Ag_2 = \lambda_0 g_2 + g_3,\quad \ldots,\quad Ag_p = \lambda_0 g_p.$$

We shall say of the vectors (79) that they form a lower Jordan chain of vectors. The vectors (79) form a basis in the cyclic invariant subspace I that corresponds in (67) to the elementary divisor $(\lambda - \lambda_0)^p$. In this basis, as is easy to see, to the operator A there corresponds the matrix

$$\begin{pmatrix} \lambda_0 & 0 & \dots & 0 \\ 1 & \lambda_0 & \dots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \dots & 1 & \lambda_0 \end{pmatrix} = \lambda_0 E^{(p)} + F^{(p)}, \qquad (80)$$

where $F^{(p)}$ is the matrix of order p which has 1's along the first subdiagonal and 0's everywhere else. If we take a lower Jordan chain of vectors in each subspace $I', I'', \ldots, I^{(u)}$ of (67), we can form from these chains a lower Jordan basis, in which to the operator A there corresponds the quasi-diagonal matrix

$$J_1 = \{\lambda_1 E^{(p_1)} + F^{(p_1)},\ \lambda_2 E^{(p_2)} + F^{(p_2)},\ \ldots,\ \lambda_u E^{(p_u)} + F^{(p_u)}\}.$$

We shall say of the matrix $J_1$ that it is of lower Jordan form; in contrast, we shall sometimes call (78) an upper Jordan matrix. Thus: Every matrix A is similar to an upper and to a lower Jordan matrix.

§ 8. Krylov's Method of Transforming the Secular Equation

1. When a matrix $A = \|a_{ik}\|_1^n$ is given, its characteristic (secular) equation can be written in the form

$$\begin{vmatrix} a_{11} - \lambda & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} - \lambda & \dots & a_{2n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} - \lambda \end{vmatrix} = 0. \qquad (81)$$

On the left-hand side of this equation stands, apart from the factor $(-1)^n$, the characteristic polynomial $\Delta(\lambda)$ of degree n. For the direct computation of the coefficients of this polynomial it is necessary to expand the characteristic determinant, and for large n this involves very cumbersome computational work, because $\lambda$ occurs in the diagonal elements of the determinant.15
In 1937, A. N. Krylov [251] proposed a transformation of the characteristic determinant as a result of which $\lambda$ occurs only in the elements of one column (or row).16 In this section we shall give an algebraic method of transforming the characteristic equation which differs somewhat from Krylov's own method.17

2. We consider an n-dimensional vector space R with basis $e_1, e_2, \ldots, e_n$ and the linear operator A in R determined by the given matrix $A = \|a_{ik}\|_1^n$ in this basis. We take an arbitrary vector $x \ne o$ in R and form the sequence of vectors

$$x,\ Ax,\ A^2x,\ \ldots,\ A^px,\ A^{p+1}x,\ \ldots. \qquad (82)$$

Suppose that the first p vectors $x, Ax, \ldots, A^{p-1}x$ of this sequence are linearly independent and that the (p + 1)-st vector $A^px$ is a linear combination of these p vectors:

$$A^px = -\alpha_p x - \alpha_{p-1}Ax - \cdots - \alpha_1 A^{p-1}x, \qquad (83)$$

i.e.

$$\varphi(A)x = o, \qquad (84)$$

where

$$\varphi(\lambda) = \lambda^p + \alpha_1\lambda^{p-1} + \cdots + \alpha_p. \qquad (85)$$

All the further vectors in (82) can then also be expressed linearly in terms of the first p vectors of the sequence.18 Thus, in (82) there are p linearly independent vectors, and this maximal number of linearly independent vectors in (82) is always realized by the first p vectors.

15 We recall that the coefficient of $\lambda^k$ in $\Delta(\lambda)$ is equal (apart from the sign) to the sum of all the principal minors of order $n - k$ $(k = 1, 2, \ldots, n)$. Thus, even for n = 6, the direct determination of the coefficient of $\lambda$ in $\Delta(\lambda)$ would require the computation of six determinants of order 5, that of $\lambda^2$ would require fifteen determinants of order 4, etc.
16 The algebraic analysis of Krylov's method of transforming the secular equation is contained in a number of papers [268], [269], [168], [211], and [149]. Krylov's approach in algebraic form can be found in [268] and [168] and in § 21 of the book [25].
17 Krylov arrived at his method of transformation by starting from a system of n linear differential equations with constant coefficients.
18 When we apply the operator A to both sides of (83) we express $A^{p+1}x$ linearly in terms of $Ax, \ldots, A^px$; but $A^px$ is expressed, by (83), linearly in terms of $x, Ax, \ldots, A^{p-1}x$. Hence we obtain a similar expression for $A^{p+1}x$. By applying the operator A to the expression thus obtained for $A^{p+1}x$, we express $A^{p+2}x$ in terms of $x, Ax, \ldots, A^{p-1}x$, etc.
The polynomial $\varphi(\lambda)$ is the minimal (annihilating) polynomial of the vector $x$ with respect to the operator A (see § 1). The polynomial $\varphi(\lambda)$ is a divisor of the minimal polynomial $\psi(\lambda)$ of the whole space R,19 and $\psi(\lambda)$ in turn is a divisor of the characteristic polynomial $\Delta(\lambda)$. The method of Krylov consists in an effective determination of the minimal polynomial $\varphi(\lambda)$ of $x$. We consider separately two cases: the regular case, where p = n, and the singular case, where p < n.

2. Regular case: p = n. In this case, the vectors $x, Ax, \ldots, A^{n-1}x$ are linearly independent, and the equations (83), (84), (85) assume the form

$$A^nx = -\alpha_n x - \alpha_{n-1}Ax - \cdots - \alpha_1 A^{n-1}x, \qquad (86)$$

$$\Delta(A)x = o, \qquad (87)$$

where

$$\Delta(\lambda) = \lambda^n + \alpha_1\lambda^{n-1} + \cdots + \alpha_n. \qquad (88)$$

For in the regular case $\varphi(\lambda)$ and $\Delta(\lambda)$ are of the same degree and, since their highest coefficients are equal, they coincide:

$$\Delta(\lambda) = \psi(\lambda) = \varphi(\lambda).$$

Therefore in the regular case Krylov's method is a method of computing the coefficients of the characteristic polynomial $\Delta(\lambda)$. In the singular case, as we shall see later, Krylov's method does not enable us to determine $\Delta(\lambda)$; in this case it only determines the divisor $\varphi(\lambda)$ of $\Delta(\lambda)$.

In explaining Krylov's transformation, we shall denote the coordinates of $x$ in the given basis $e_1, e_2, \ldots, e_n$ by $a, b, \ldots, l$, and the coordinates of the vector $A^kx$ by $a_k, b_k, \ldots, l_k$ $(k = 1, 2, \ldots, n)$. The condition of linear independence of the vectors $x, Ax, \ldots, A^{n-1}x$ may be written analytically as follows (see Chapter III, § 1):

$$M = \begin{vmatrix} a & b & \dots & l \\ a_1 & b_1 & \dots & l_1 \\ \vdots & & & \vdots \\ a_{n-1} & b_{n-1} & \dots & l_{n-1} \end{vmatrix} \ne 0. \qquad (89)$$

We consider the matrix formed from the coordinates of the vectors $x, Ax, \ldots, A^nx$:

19 $\psi(\lambda)$ is the minimal polynomial of the matrix A.
$$\begin{pmatrix} a & b & \dots & l \\ a_1 & b_1 & \dots & l_1 \\ \vdots & & & \vdots \\ a_n & b_n & \dots & l_n \end{pmatrix}. \qquad (90)$$

In the regular case the rank of this matrix is n: the first n rows of the matrix are linearly independent, and the last, (n + 1)-st, row is a linear combination of the preceding n. We obtain the dependence between the rows of (90) when we replace the vector equation (86) by the equivalent system of n scalar equations:

$$a_n = -\alpha_n a - \alpha_{n-1}a_1 - \cdots - \alpha_1 a_{n-1},$$
$$b_n = -\alpha_n b - \alpha_{n-1}b_1 - \cdots - \alpha_1 b_{n-1},$$
$$\cdots\cdots\cdots\cdots\cdots$$
$$l_n = -\alpha_n l - \alpha_{n-1}l_1 - \cdots - \alpha_1 l_{n-1}. \qquad (91)$$

From this system of n linear equations we may determine the unknown coefficients $\alpha_1, \alpha_2, \ldots, \alpha_n$ uniquely20 and substitute their values in (88). This elimination of $\alpha_1, \alpha_2, \ldots, \alpha_n$ from (88) and (91) can be performed symmetrically. For this purpose we rewrite (88) and (91) as follows:

$$a\alpha_n + a_1\alpha_{n-1} + \cdots + a_{n-1}\alpha_1 + a_n\alpha_0 = 0,$$
$$b\alpha_n + b_1\alpha_{n-1} + \cdots + b_{n-1}\alpha_1 + b_n\alpha_0 = 0,$$
$$\cdots\cdots\cdots\cdots\cdots$$
$$l\alpha_n + l_1\alpha_{n-1} + \cdots + l_{n-1}\alpha_1 + l_n\alpha_0 = 0,$$
$$1\cdot\alpha_n + \lambda\alpha_{n-1} + \cdots + \lambda^{n-1}\alpha_1 + [\lambda^n - \Delta(\lambda)]\alpha_0 = 0.$$

Since this system of n + 1 equations in the n + 1 unknowns $\alpha_0, \alpha_1, \ldots, \alpha_n$ has a non-zero solution ($\alpha_0 = 1$), its determinant must vanish:

$$\begin{vmatrix} a & a_1 & \dots & a_{n-1} & a_n \\ b & b_1 & \dots & b_{n-1} & b_n \\ \vdots & & & & \vdots \\ l & l_1 & \dots & l_{n-1} & l_n \\ 1 & \lambda & \dots & \lambda^{n-1} & \lambda^n - \Delta(\lambda) \end{vmatrix} = 0. \qquad (92)$$

Hence we determine $\Delta(\lambda)$ after a preliminary transposition of the determinant (92) with respect to the main diagonal:

20 By (89), the determinant of this system is different from zero.
$$M\cdot\Delta(\lambda) = \begin{vmatrix} a & b & \dots & l & 1 \\ a_1 & b_1 & \dots & l_1 & \lambda \\ \vdots & & & & \vdots \\ a_{n-1} & b_{n-1} & \dots & l_{n-1} & \lambda^{n-1} \\ a_n & b_n & \dots & l_n & \lambda^n \end{vmatrix}, \qquad (93)$$

where the constant factor M is determined by (89) and differs from zero. The identity (93) represents Krylov's transformation. In Krylov's determinant on the right-hand side of the identity, $\lambda$ occurs only in the elements of the last column; the remaining elements of the determinant do not depend on $\lambda$.

Note. In the regular case, the whole space R is cyclic (with respect to A). If we choose the vectors $x, Ax, \ldots, A^{n-1}x$ as a basis, then in this basis the operator A corresponds to a matrix $\tilde A$ having the natural normal form

$$\tilde A = \begin{pmatrix} 0 & 0 & \dots & 0 & -\alpha_n \\ 1 & 0 & \dots & 0 & -\alpha_{n-1} \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \dots & 1 & -\alpha_1 \end{pmatrix}. \qquad (94)$$

The transition from the original basis $e_1, \ldots, e_n$ to the basis $x, Ax, \ldots, A^{n-1}x$ is accomplished by means of the non-singular transforming matrix

$$T = \begin{pmatrix} a & a_1 & \dots & a_{n-1} \\ b & b_1 & \dots & b_{n-1} \\ \vdots & & & \vdots \\ l & l_1 & \dots & l_{n-1} \end{pmatrix}, \qquad (95)$$

whose k-th column consists of the coordinates of $A^{k-1}x$; and then

$$A = T\tilde A T^{-1}. \qquad (96)$$
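In floating point, the regular case amounts to generating the rows of (90) by the recurrence of the scheme described later in this section and then solving the n by n system (91). A sketch (the matrix and the initial vector are assumptions):

```python
import numpy as np

A = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [2., 1., -2.]])
n = A.shape[0]
x = np.array([1., 0., 0.])         # initial parameters a, b, ..., l

rows = [x]
for _ in range(n):
    rows.append(A @ rows[-1])      # coordinates of x, Ax, ..., A^n x
K = np.vstack(rows)                # the matrix (90)

# (91): K[n] = -alpha_n*K[0] - ... - alpha_1*K[n-1]; solve for the alphas
coef = np.linalg.solve(K[:n].T, K[n])
alphas = -coef[::-1]               # alpha_1, ..., alpha_n of (88)
print(np.concatenate(([1.], alphas)))  # coefficients of Delta(lambda)
print(np.poly(A))                      # check against numpy's result
```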
3. Singular case: p < n. In this case, the vectors $x, Ax, \ldots, A^{n-1}x$ are linearly dependent, so that M = 0. Now (93) had been deduced under the assumption $M \ne 0$. But both sides of this equation are rational integral functions of $\lambda$ and of the initial parameters $a, b, \ldots, l$.21 Therefore it follows by a 'continuity' argument that (93) also holds for M = 0. But then, when Krylov's determinant is expanded, all the coefficients turn out to be zero, and in the singular case (p < n) the formula (93) goes over into the trivial identity 0 = 0.

Let us consider the matrix formed from the coordinates of the vectors $x, Ax, \ldots, A^px$:

$$\begin{pmatrix} a & b & \dots & l \\ a_1 & b_1 & \dots & l_1 \\ \vdots & & & \vdots \\ a_p & b_p & \dots & l_p \end{pmatrix}. \qquad (97)$$

This matrix is of rank p, and the first p rows are linearly independent; the last, (p + 1)-st, row is a linear combination of the first p rows with the coefficients $-\alpha_p, -\alpha_{p-1}, \ldots, -\alpha_1$ (see (83)). From the n coordinates $a, b, \ldots, l$ we may choose p coordinates $c, f, \ldots, h$ such that the determinant formed from the corresponding coordinates of the vectors $x, Ax, \ldots, A^{p-1}x$ is different from zero:

$$M^* = \begin{vmatrix} c & f & \dots & h \\ c_1 & f_1 & \dots & h_1 \\ \vdots & & & \vdots \\ c_{p-1} & f_{p-1} & \dots & h_{p-1} \end{vmatrix} \ne 0. \qquad (98)$$

In exact analogy with the regular case (however, with the value n replaced by p and the letters $a, b, \ldots, l$ by $c, f, \ldots, h$), it follows from (83) that:

$$c_p = -\alpha_p c - \alpha_{p-1}c_1 - \cdots - \alpha_1 c_{p-1},$$
$$f_p = -\alpha_p f - \alpha_{p-1}f_1 - \cdots - \alpha_1 f_{p-1},$$
$$\cdots\cdots\cdots\cdots\cdots$$
$$h_p = -\alpha_p h - \alpha_{p-1}h_1 - \cdots - \alpha_1 h_{p-1}. \qquad (99)$$

From this system of equations the coefficients $\alpha_1, \alpha_2, \ldots, \alpha_p$ of the polynomial $\varphi(\lambda)$ (the minimal polynomial of $x$) are uniquely determined. But then we may eliminate $\alpha_1, \alpha_2, \ldots, \alpha_p$ from (85) and (99) and obtain the following formula for $\varphi(\lambda)$:

21 The coordinates $a_i, b_i, \ldots, l_i$ of $A^ix$ are linear forms in $a, b, \ldots, l$ whose coefficients are the elements of the matrix $A^i$ $(i = 1, 2, \ldots, n)$.
$$M^*\cdot\varphi(\lambda) = \begin{vmatrix} c & f & \dots & h & 1 \\ c_1 & f_1 & \dots & h_1 & \lambda \\ \vdots & & & & \vdots \\ c_{p-1} & f_{p-1} & \dots & h_{p-1} & \lambda^{p-1} \\ c_p & f_p & \dots & h_p & \lambda^p \end{vmatrix}. \qquad (100)$$

4. We have seen that in the regular case

$$\Delta(\lambda) = \psi(\lambda) = \varphi(\lambda).$$

The fact that the characteristic polynomial $\Delta(\lambda)$ coincides with the minimal polynomial $\psi(\lambda)$ means that in the matrix $A = \|a_{ik}\|_1^n$ there are no two elementary divisors with one and the same characteristic value, i.e., all the elementary divisors are co-prime in pairs. In the case where A is a matrix of simple structure, this requirement is equivalent to the condition that the characteristic equation of A have no multiple roots. The fact that the polynomials $\psi(\lambda)$ and $\varphi(\lambda)$ coincide means that for $x$ we have chosen a vector that generates (by means of A) the whole space R; such a vector always exists, by Theorem 2 of § 2. In analytical form, this condition means that the columns $x, Ax, \ldots, A^{n-1}x$, where $x = (a, b, \ldots, l)$, are linearly independent. By varying the vector $x$ we may obtain for $\varphi(\lambda)$ every divisor of $\psi(\lambda)$.22

The results we have reached can be stated in the form of the following theorem:

THEOREM 14: Krylov's transformation gives an expression for the characteristic polynomial $\Delta(\lambda)$ of the matrix $A = \|a_{ik}\|_1^n$ in the form of the determinant (93) if and only if two conditions are satisfied: 1. the elementary divisors of A are co-prime in pairs; 2. the initial parameters $a, b, \ldots, l$ are the coordinates of a vector $x$ that generates the whole n-dimensional space (by means of the operator A corresponding to the matrix A).

But if the condition $\Delta(\lambda) = \psi(\lambda)$ is not satisfied, then however we choose the vector $x \ne o$, i.e., the initial parameters $a, b, \ldots, l$, we do not obtain $\Delta(\lambda)$, since the polynomial $\varphi(\lambda)$ obtained by Krylov's method is a divisor of $\psi(\lambda)$, which in this case does not coincide with $\Delta(\lambda)$ but is only a factor of it.

22 See, for example, [168], p. 48.
In general, the Krylov transformation leads to some divisor $\varphi(\lambda)$ of the characteristic polynomial $\Delta(\lambda)$; this divisor is the minimal polynomial of the vector $x$ with the coordinates $a, b, \ldots, l$ (where $a, b, \ldots, l$ are the initial parameters in the Krylov transformation).

5. Let us show how to find the coordinates of a characteristic vector y for an arbitrary characteristic value $\lambda_0$ which is a root of the polynomial $\varphi(\lambda)$ obtained by Krylov's method.24 We shall seek a vector $y \ne o$ in the form

$$y = \xi_1 x + \xi_2 Ax + \cdots + \xi_p A^{p-1}x. \qquad (101)$$

Substituting this expression for y in the vector equation $Ay = \lambda_0 y$ and using (83), we obtain, on equating the coefficients of the linearly independent vectors $x, Ax, \ldots, A^{p-1}x$:

$$-\alpha_p\xi_p = \lambda_0\xi_1,\quad \xi_1 - \alpha_{p-1}\xi_p = \lambda_0\xi_2,\quad \ldots,\quad \xi_{p-1} - \alpha_1\xi_p = \lambda_0\xi_p. \qquad (102)$$

In what follows we set $\xi_p = 1$ ($\xi_p = 0$ is impossible, because the equation $\xi_p = 0$ would yield by (102) a linear dependence among the vectors $x, Ax, \ldots, A^{p-1}x$). Then the equations (102), taken in the reverse order, determine for us in succession the values

$$\xi_p = 1,\quad \xi_{p-1} = \lambda_0 + \alpha_1,\quad \xi_{p-2} = \lambda_0\xi_{p-1} + \alpha_2,\quad \ldots,\quad \xi_1 = \lambda_0\xi_2 + \alpha_{p-1}; \qquad (103)$$

the remaining (first) equation is a consequence of the preceding ones and of the relation

$$\varphi(\lambda_0) = \lambda_0^p + \alpha_1\lambda_0^{p-1} + \cdots + \alpha_p = 0.$$

The coordinates $a', b', \ldots, l'$ of the vector y in the original basis may be found from the following formulas, which follow from (101):

$$a' = \xi_1 a + \xi_2 a_1 + \cdots + \xi_p a_{p-1},$$
$$b' = \xi_1 b + \xi_2 b_1 + \cdots + \xi_p b_{p-1},$$
$$\cdots\cdots\cdots\cdots\cdots$$
$$l' = \xi_1 l + \xi_2 l_1 + \cdots + \xi_p l_{p-1}. \qquad (104)$$

24 The following arguments hold both in the regular case p = n and in the singular case p < n.
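The recursion (103) is Horner's scheme applied to $\varphi$ at $\lambda_0$, and (104) is the change back to the original coordinates. A sketch continuing the assumed example used above:

```python
import numpy as np

def krylov_eigvec(A, x, alphas, lam0):
    # alphas = [alpha_1, ..., alpha_p] of phi; xi's by the recursion (103),
    # then the characteristic vector y assembled as in (101)/(104)
    p = len(alphas)
    xi = np.empty(p)
    xi[p - 1] = 1.0
    for k in range(1, p):
        xi[p - 1 - k] = lam0 * xi[p - k] + alphas[k - 1]
    rows = [x]
    for _ in range(p - 1):
        rows.append(A @ rows[-1])        # x, Ax, ..., A^{p-1} x
    return sum(x_k * row for x_k, row in zip(xi, rows))

A = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [2., 1., -2.]])
alphas = [2., -1., -2.]                  # Delta = lam^3 + 2lam^2 - lam - 2
y = krylov_eigvec(A, np.array([1., 0., 0.]), alphas, lam0=1.0)
print(np.allclose(A @ y, 1.0 * y))       # True: y is a characteristic vector
```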
We recommend to the reader the following scheme of computations. Under the given matrix A we write the row of the coordinates of $x$: $a, b, \ldots, l$. These numbers are given arbitrarily (with only one condition: at least one is different from zero). Under the row $a, b, \ldots, l$ we write the row $a_1, b_1, \ldots, l_1$, the coordinates of the vector $Ax$. The numbers $a_1, b_1, \ldots, l_1$ are obtained by multiplying the row $a, b, \ldots, l$ successively into the rows of the given matrix A: for example, $a_1 = a_{11}a + a_{12}b + \cdots$, $b_1 = a_{21}a + a_{22}b + \cdots$, etc. Each of the following rows, beginning with the second, is determined by multiplying the preceding row successively into the rows of the given matrix. Above the given matrix we write the sum row as a check.

Example 1. [Computational tableau for a matrix A of order 4 with the initial vector $x = e_1 + e_2$: under A are written the rows of the coordinates of $x, Ax, A^2x, A^3x, A^4x$, together with the check column.] The given case is regular, because the determinant M of the first four rows equals 16 and is different from zero. Krylov's determinant (93) then takes the form

$$16\,\Delta(\lambda) = \begin{vmatrix} \ast & \ast & \ast & \ast & 1 \\ \ast & \ast & \ast & \ast & \lambda \\ \ast & \ast & \ast & \ast & \lambda^2 \\ \ast & \ast & \ast & \ast & \lambda^3 \\ \ast & \ast & \ast & \ast & \lambda^4 \end{vmatrix},$$

where the first four columns are the computed coordinate rows. Expanding this determinant and cancelling 16, we find:

$$\Delta(\lambda) = \lambda^4 - 2\lambda^2 + 1 = (\lambda - 1)^2(\lambda + 1)^2.$$
We denote by $y = \xi_1 x + \xi_2 Ax + \xi_3 A^2x + \xi_4 A^3x$ a characteristic vector of A corresponding to the characteristic value $\lambda_0 = 1$. Here $\alpha_1 = 0$, $\alpha_2 = -2$, $\alpha_3 = 0$, $\alpha_4 = 1$, and we find the numbers $\xi_4, \xi_3, \xi_2, \xi_1$ by the formulas (103):

$$\xi_4 = 1,\quad \xi_3 = 1 + 0 = 1,\quad \xi_2 = 1 - 2 = -1,\quad \xi_1 = -1 + 0 = -1.$$

The control equation $\lambda_0\xi_1 + \alpha_4 = -1 + 1 = 0$ is, of course, satisfied. We place the numbers $\xi_1, \xi_2, \xi_3, \xi_4$ in a vertical column parallel to the columns of the scheme. Multiplying the column of the $\xi_i$ into the columns $a, a_1, a_2, a_3$; $b, b_1, b_2, b_3$; etc., we obtain, by (104), the first coordinate $a'$ of the vector y in the original basis $e_1, e_2, e_3, e_4$, and similarly $b', c', d'$; the coordinates of y are then simplified by cancelling the common factor 4.

Furthermore, by (94) and (95), $A = T\tilde A T^{-1}$, where

$$\tilde A = \begin{pmatrix} 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 0 \end{pmatrix}$$

(the last column consists of $-\alpha_4, -\alpha_3, -\alpha_2, -\alpha_1$) and the columns of T are the coordinates of $x, Ax, A^2x, A^3x$.

Example 2. We consider the same matrix A, but as initial parameters we take the numbers $a = 1$, $b = 0$, $c = 0$, $d = 0$, i.e., $x = e_1$. [Tableau: the rows of the coordinates of $x, Ax, A^2x, A^3x$.] But in this case the determinant (89) vanishes, M = 0, and p = 3. We have a singular case to deal with.
.. of order n . a2.212 VII.. Hence we find three characteristic values : A. Z$ (105) it is necessary to watch the rank of the matrix obtained so that we stop after the first row (the (p + 1)st from above) that is a linear combination of the preceding ones. We shall then discover at once a row of the matrix (105) . after obtaining Krylov's determinant in the form (93) or (100). STRUCTURE OF LINEAR OPERATOR IN nDIMENSIONAL SPACE Taking the first three coordinates of the vectors x. A3x. A3 = 1... Ax. we write the Krylov determinant in the form 1 00 2 6 1 3 1 2 A 40 2 12 As 3 Expanding this determinant and cancelling ..8. Instead of expanding Krylov's determinant we can determine the coeffi cients a.l. These examples show that in applying Krylov's method.. l 1 (108) by using it in parallel with the computation of the corresponding rows by Krylov's method. But tr A = 0. . A asb2 . . the elimination method.1).1 (in the regular case. 11 ... .Aa b . l .. Hence A.. when we write down successively the rows of the matrix a b al bi a2 b2 . = 1. directly from the system of equations (91) (or (99)) by applying any efficient method of solution to the systemfor example. Moreover. A2x. = 1.. The determination of the rank is connected with the computation of certain determinants. in order to expand it with respect to the elements of the last column we have to compute a certain number of determinants of order p . The fourth characteristic value can be obtained from the condition that the sum of all the characteristic values must be equal to the trace of the matrix. This method can be applied immediately to the matrix a ai b1 .. we obtain : P (A)= AS A2. A2 = 1.A+ 1 =(I 1)2 (A + 1).
. We recommend the following simplification... Next we take an element f 1* 0 in the second row and by means of c and f 1* we make the elements c2 and f2 into zero.1. 0. (107) i. .r W. 27 The simplification consists in the fact that in the row of (108) to be transformed k .. by subtracting from the second row the first row multiplied by cl/c.(A) is the required polynomial 97(A) : gp(1) 9)(A)... hp1 AP1 AP fP . KRYLOV'S METHOD OF TRANSFORMING SECULAR EQUATION 213 that depends on the preceding ones.. Our transformation does not change the value of the Krylov determinant c f .. After obtaining the kth transformed row of (106) ak1. we obtain : 25 The elements c. . bk_1.. . ..0.. bk .... CP A M*.. In the first row of (106) we take an arbitrary element c 0 and we use it to make the element cl under it into zero.26 g. gq(2). lk .(X) are 1. . Let us explain this in some detail.2 .. have the form 0. 28 We recall that the highest coefficients of p(X) and g. the element in the last column of (106) is replaced by a polynomial of degree k. lx1 ) into the rows of the given matrix. bk1.2S As a result of such a transformation.gp(1).. without computing any determinant.1 elements are equal to zero..e.. (108) one should obtain the following (k + 1)st row by multiplying at_.`.§ 8. gk (A) = Ak + (k=0..1 ... after the transformation. must not belong to the last column containing the powers of X..27 Then we find the (k + 1)st row in the form at . bk_1. lk_1 (and not the original ak_1.. the (p + 1)st row of the matrix must. ) Since under our transformation the rank of the matrix formed from the first k rows for any k and the first n columns of (106) does not change. h 1 hi CP_1 f. . P Therefore M*'T (A) = cfl.. 'k1) 9k1(") .. f.. . Therefore it is simple to multiply such a row into the rows of A.... etc. Agk1 (1) and after subtracting the preceding rows.
214
VII. STRUCTURE OF LINEAR OPERATOR IN nDIMENSIONAL SPACE
a.
i'
9k A
The slight modification of Krylov's method that we have recommended (its combination with the elimination method) enables us to find at once the polynomial cp(2) that we are interested in (in the regular case, d (A) ) without computing any determinants or solving any auxiliary system of
equations.28
Example.
0 1 1 2 1 0 A= 1 2 3 1 0 2 1 1 2 2 1 1 3 0
1 1
4
4
1
5
1
0
1
1
0 0 0
0
1
0
0
1
1
0 2
 5  7
2
0 1 0:1 3 4 2 22
3
0 0
0 2'41+2
[241]
[5 + 71]
5
5
5
0
5
7  5
18  41' + 21
0
10 10
20
0 5
0
0 1841'+91+5 0 15 14418+912+51 [155(1941+2) 2(1841'+92+5)]
0
0
5
5 15 5
0
0
1'618+ 1228+715 26624+1223+71852 [552+(28 428X92+5)2(14623+121'+725)]
28  82' + 2518  211'  152 + 10
A (1)
0
0
23 A.,..4 o_..... U_
_e
,

ak a bk,
lk, A9k1 (A)
and after subtracting the preceding rows, we obtain :
25 The elements c, fl*.... must not belong to the last column containing the powers of X.
26 We recall that the highest coefficients of T(X) and g,(X) are 1. 27 The simplification consists in the fact that in the row of (108) to be transformed k  1 elements are equal to zero. Therefore it is simple to multiply such a row into the rows of A.
CHAPTER VIII
MATRIX EQUATIONS
In this chapter we consider certain types of matrix equations that occur in various problems in the theory of matrices and its applications.
§ 1. The Equation AX = XB
1. Suppose that the equation
AX = XB
(1)
is given, where A and B are square matrices (in general of different orders)
B=11bk Ill and where X is an unknown rectangular matrix of dimension m X n :
A= Ilaf lli,
X=1lxtxll
complex numbers) :
(A)
(9=1,2, ...,m; k=1,2, ...,n).
We write down the elementary divisors of A and B (in the field of
(B)
1292)°i
(pi + p2 + ... + pu = m)
,
...
(q, + q2 + .
+ q, n) .
In accordance with these elementary divisors we reduce A and B to
Jordan normal form
A=UAU1,
B=VBV',
(2)
where U and V are square nonsingular matrices of orders m and n, respectively, and A and B are the Jordan matrices :
A = I' 21E(P,) + H(p),
{ ju E(ql) + H(Q.),
12E(r,) + H(r,),
Iu2E(") + H(4:) ,
215
... , 2,,E(rx) + H(pk) } ... , 1 vE(9°) + H(9v) }
(3 )
216
VIII. MATRIX EQUATIONS
Replacing A and B in (1) by their expressions given in (2), we obtain:
UAU1X=XVBV1.
We multiply both sides of this equation on the left by UI and on the right by V: AU1XV = U1XVB. (4) When we introduce in place of X a new unknown matrix X (of the same dimension m X n) 1
X=UgV,
A% =±h.
(5)
we can write equation (4) as follows:
(6)
We have thus replaced the matrix equation (1) by the equation (6), of
the same form, in which the given matrices have Jordan normal form. We partition k into blocks corresponding to the quasidiagonal form of the matrices A and B :
X
(gap)
(a =1, 2, ..., u;
=1, 2, ..., v)
(here Xas is a rectangular matrix of dimension pa X qp (a =1, 2, ... , u ;
6=1,2,...,v)).
Using the rule for multiplying a partitioned matrix by a quasidiagonal one (see p. 42), we carry out the multiplication of the matrices on the lefthand and righthand sides of (6). Then this equation breaks up into uv matrix equations pakPa) + H°Pal gad = gap [ gyp" (4P) + H(4S)]
(a=1, 2, ..., u; #=1, 2, ..., v),
(a=1, 2, ..., u;
which we rewrite as follows :
(1Ap_';a)Xa0 =Hags,gap0p
f
1, 2, ..., v);
(7)
we have used here the abbreviations
Ha=
1. Za
H(Pa)
Go = HcQf)
(a = I, 2, ... , u; fl =1, 2, ... , v).
(8)
Let us take one of the equations (7). Two cases can occur:
,µp. We iterate equation (7) r  1 times:'
we replace (µp 2a) Sap by Flalap ZapOp.
I We multiply both sides of (7) by lip Am and in each term of the righthand side This process is repeated r 1 times.
§ 1. THE EQUATION AX = XB
(gyp  a)rXap
1)z (t) H:XafG;.
217
(9)
a+Tr
(10)
Note that, by (8),
Haa=Gfp=0.
righthand side of (9) at least one of the relations v pa, t:> qp
If in (9) we take r ? pa + qp 1, then in each term of the sum on the
is satisfied, so that by (10) either Ha=0 or GG=0. Moreover, since in this µp, we find from (9) case A.
2.
,la= pup
.
Xap =0. In this case equation (7) assumes the form
HaXap=XapGp.
(12)
In the matrices H. and Go the elements of the first superdiagonal are equal to 1, and all the remaining elements are zero. Taking this specific structure of H. and Gp into account and setting
11
(i=1, 2, ..., pa; k=1, 2, ..., 4p),
we replace the matrix equation (12) by the following equivalent system
of scalar equations :2
t st+1.kEt.k1(4WtPa+1,k=0; i=1, 2, ..., pa; k=1, 2, ..., qp). (13)
The equations (13) have this meaning :
1) In the matrix Xap the elements of every line parallel to the main
diagonal are equal ;
2)
tt =4.2= ...=SPa.4p1=0
Let pa = qp. Then Xap is a square matrix. From 1) and 2) it follows that in Xap all the elements below the main diagonal are zero, all the elements
in the main diagonal are equal to a certain number cp, all the elements of the first superdiagonal are equal to a number cap, etc. ; i.e.,
2 From the structure of the matrices Ha and Gp it follows that the product HaXap is obtained from Xgp by shifting all the rows one place upwards and filling the last row with zeros; similarly, XgfGp is obtained from Xap by shifting all the columns one place to the right and filling the first column with zeros (see Chapter I, p. 14). To simplify the notation we do not write the additional indices a, fl in ,k.
218
VIII. MATRIX EQUATIONS
Cap Cap Cap
CMPa1)
0
Xap =
0
.
=TPa
;
(14)
.
.
0
Cap
(Pa=q )
here cap, cap
are arbitrary parameters (the equations (12) do
not impose any restrictions on the values of these parameters). It is easy to see that for pa < qp
9pPa
and for pa > qp
Xap = ( 0
,
T,m)
(15)
Xap(Tzp))Pasp
0
(16)
We shall say of the matrices (14), (15), and (16) that they have regular upper triangular form. The number of arbitrary parameters in Xap is
equal to the smaller of the numbers pa and qp. The scheme below shows the structure of the matrices Xap for xa = Kp (the arbitrary parameters are here denoted by a, b, c, and d) :
gap _
a 0
0
b
c
d
c
,
a b
0 0
c
i
a
b
c
0
,
a
0
a 0
b
a
0
b
Sap=
0 0
0 0
0 a
0 0
b
a
Xap = 0 0
0 0
b
0
0
0
a
a
0
0
(pa=qp=4)
(pa=3, qp=5)
(pa5, qp=3)
In order to subsume case 1 also in the count of arbitrary parameters in X, we denote by dap (1) the greatest common divisor of the elementary divisors (I 2a)P" and (A ,up)4p and by Sap the degree of the polynomial dap (A) (a=1,2,.. ., u ; fl=1,2,. . ., v) . In case 1, we have 60=0 ; in case 2, aap= min (pa, qp). Thus, in both cases the number of arbitrary parameters in Xap is equal to Sap. The number of arbitrary parameters in X is determined by the formula.
N =U1 p1 a,,$. , .
In what follows it will be convenient to denote the general solution of (6) by XAh (so far we have denoted it by X).
u
e
§ 1. THE EQUATION AX = XB
219
The results obtained in this section can be stated in the form of the following theorem :
THEOREM 1:
The general solution of the matrix equation
where
AX =XB
H(r,), ... , A,,E(r") + H(PU)) Ui A = I I ai,t l !"` = UA U1= U {21E(P1) + 1= Il b;k I Iu = V BV1 = V {,u1E(Ql) + H(q,), . , y A;(qo) + H(qn)) V1
is given by the formula
X = UXAB V1 ,
(17)
Here Xjj is the general solution of the equation AX =XB
and has the following structure : X;B is decomposed into blocks
q'8
X: B = (Xa,) } P.
if 2u
(a=1,2, ..., U; f=1,2, ..., v);
pup, then the null matrix stands in the place Xap, but if A. = pp, then an arbitrary regular upper triangular matrix stands in the place XaB, and therefore also X, depends linearly on N arbitrary parameters c1, c2,...,cN
X.
N
X = G CtX f,
j1
(18)
where N is determined by the formula
N= 8ap
(A  µ'8)4p).
(19)
(here Bap denotes the degree of the greatest common divisor of (A 2,)P' and
Note that the matrices X1i X2, ... , XN that occur in (18) are solutions of the original equation (1) (X; is obtained from X by giving to the parameter c, the value 1 and to the remaining parameters the value 0; j = 1, ,2, .... N). These solutions are linearly independent, since otherwise for certain values of the parameters c1i c2, ..., cv, not all zero, the matrix X, and therefore Xd8 , would be the null matrix, which is impossible. Thus (18) shows that every solution of the original equation is a linear combination of N linearly independent solutions.
220
VIII. MATRIX EQUATIONS
If the matrices A and B do not have common characteristic values (if the characteristic polynomials I AE  A I and I AE  B I are coprime), then o u N= X 2;6.0= 0, and so X= 0, i.e., in this case the equation (1) has only a_la_1
the trivial solution X = 0. Note. Suppose that the elements of A and B belong to some number field F. Then we cannot say that the elements of U, V, and Xdk that occur in (17) also belong to F. The elements of these matrices may be taken in an extension field F1 which is obtained from F by adjoining the roots of the characteristic equations I AE  A I = 0 and 11E  B I = 0. We always have to deal with such an extension of the ground field when we use the reduction of given matrices to Jordan normal form. However, the matrix equation (1) is equivalent to a system of mn linear homogeneous equations, where the unknown are the elements Xjk (j =1, 2, 3, ... , tin ; k =1, 2, ... , n) of the required matrix X :
in
n
.E aijxik =.Z xtnbhk
J_1
A_1
(i=1, 2, ..., m; k =1, 2, ..., n).
(20)
What we have shown is that this system has N linearly independent solutions, where N is determined by (19). But it is well known that fundamental linearly independent solutions can be chosen in the ground field F to which the coefficients of (20) belong. Thus, in (18) the matrices X1i X2, ... , Xr can be so chosen that their elements lie in F. If we then give to the arbitrary
parameters in (18) all possible values in F, we obtain all the matrices X with elements in F that satisfy the equation (1).3
§ 2. The Special Case A = B. Commuting Matrices
1. Let us consider the special ease of the equation (1)
AX =XA,
(21)
where A= II ask II TI is a given matrix and X = 11 x{k II; an unknown matrix.
We have come to a problem of Frobenius: to determine all the matrices X that commute with a given matrix A. We reduce A to Jordan normal form :
A = UAU'1= U {21E('") + H(P,>,
...,
AUE,(ru> + Hcp">} u_i.
(22)
3 The matrices a = 44 ail III' and B = II bk:I1 determine a linear operator F(X) _ AXXB in the space of rectangular matrices X of dimension m X n. A treatment of operators of this type is contained in the paper [1791.
.= 1. ...2! ..A.. X2 is split into u2 blocks X2 = (X"")1 corresponding to the splitting of the Jordan matrix A into blocks .. is+1(A) _ .. where N = (A .. z are arbitrary parameters).. the formula for N can be written as follows: .. . i... (A.... XQp is either the null matrix or an arbitrary regular upper triangular matrix.). 0 00 0 0 0 00 0:0 0r s 00 r 0 0 0 0 0 0 wz 8"p . > n.. l am p q0 00 0 0 f 0 00 00 M10 0:0 t 0 (a. As we have explained in the preceding section. ..12)2. we write down the elements of X2 in the case where A has the following elementary divisors: (2). is (A). We denote the degrees of these polynomials by n1 > n2.Ap)1''. depending on whether 2 Ap or Aa =A. As an example.f" and (I . The number of parameters in Xa is equal to N. (23) where Xa denotes an arbitrary matrix permutable with A.. COMMUTING MATRICES 221 Then when we set in (17) V = U. we obtain all solutions of (21). here a denotes the degree of the greatest common divisor of the polynomials Let us bring the invariant polynomials of A into the discussion: il(A). i2(2). THE SPECIAL CASE A = B. (A11)3. = 0..e. .. A A2 In this case X2 has the following form : (A17` A2). Since each invariant polynomial is a product of certain coprime elementary divisors. b...§ 2.. > nj+1= .. in the following form : X = UXd U1.. B = A and denote X ja simply by Xa. a 0 0 0 b c b d:e c:0 b f e 8 0 a 0 0 a 0 0 0 0 0 h 0 0 0 k 0 h k0 m p0 00 h0 0 0 0 e0 00 a 0 0 00 00 . ... all matrices that commute with A.
e.(. < n .. see the remark at the end of the preceding section).). j =1. We have arrived at the following theorem: THEOREM 2: The number of linearly independent matrices that commute with the matrix A= OR . (A) is one of these polynomials and therefore xj = min (n. 2. i. are the degrees of the nonconstant invariant polynomials it (A). i. Note that n=n.e. Then g(A) is permutable with A. From (25) and (26) it follows that (26) N ? P. COROLLARY 1 TO THEOREM 2: All the matrices that are permutable with A can be expressed as polynomials in A if and only if it... . . 2.. . where n1. 712. i. n. (27) where the equality sign holds if and only if t =1. . on comparing this with (27). A"1. . if all the elementary divisors of A are coprime in pairs.. .. i2(A)...222 VIII.+ (2t1)nt. But the greatest common divisor of iD (A) and i. = n..(A) of A. we obtain : N = n1= n.. There arises the converse question : when can every matrix that is permutable with A be expressed as a polynomial in A? Every matrix that commutes with A would then be a linear combination of the linearly independent matrices E.. . MATRIX EQUATIONS 9 N = E mep 941 (24) where xf is the degree of the greatest common divisor of i9(A) and i. A. t). II TI is given by the formula (25) N=n1+3n2+. if all the elementary divisors of A are coprime in pairs.. Let g(el) be an arbitrary polynomial in A... Hence N = it.l) (g. A2. .+n2+. . it.. Hence we obtain: N is the number of linearly independent matrices that commute with A (we may assume that the elements of these matrices belong to the ground field F containing the elements of A . .+n.
and B. = n. From the relation AB = BA we obtain four matrix equations: 1. . 4. We split B into blocks corresponding to the quasidiagonal form d. C. (28) where the matrices A. AIB1 =B. In this case all the matrices that are permutable with A can be represented in the form of polynomials in A. A2) . (30) As we explained in § 1 (p. and of A2 and B2.e. and A2 have no characteristic values in common. COMMUTING MATRICES 223 3. A2Y=YA1. 2.A1. every matrix that commutes with C must be expressible linearly by the matrices E. (29) (28) : Proof.. We raise the question : when can all the matrices that commute with A be expressed in the form of polynomials in one and the same matrix C4 Let us consider the case in which they can be so expressed. the second and third of the equations in (30) only have the solutions X = 0. We mention a very important property of permutable matrices. we find that N = n. 4. has quasidiagonal form :. C"1. A2B2=B2A2. since A. Y = 0. THE SPECIAL CASE A = B. a. Si A = (A1. Then since by the HamiltonCayley Theorem the matrix C satisfies its characteristic equation. Therefore in this case N:5 n. COROLLARY 2 TO THEOREM 2: All the matrices that are permutable with A can be expressed in the form of polynomials in one and the same matrix C if and only if n. B=( YI B2). The first and fourth of the equations in (30) express the permutability of A. C2.§ 2. i. if and only if all the elementary divisors of AE . then the other matrix also has the same quasidiagonal form dl $j B = { B17B2). . .A are coprime. This proves our statement. Comparing this with (27). 220). A1X =XA2. . and A2 do not have characteristic values in common. The polynomials in q matrix that commutes with A also commute with A. Hence from (25) and (26) we also have n1= n. THEOREM 3: If two matrices A = II a{k III and B = I I b<k II i are permutable and if one of them. 3. say A.
then the whole space R can be split into subspaces invariant with respect to all the operators A. § 8). The invariance of 12 with respect to B is proved similarly. L are pairwise permutable and all the characteristic values of these operators belong to the ground field. The perrnutability of A and B implies that of Vp1(A) and B. .. . . As a special case of this we obtain: COROLLARY 2: If the linear operators A. . 179). MATRIX EQUATIONS In geometrical language. B. . . .. then a basis of the space can be formed from common characteristic vectors of these operators.. Then V.. B.n such that the minimal polynomial of each of these subspaces with respect to any one of the operators A.. . this theorem runs as follows : THEOREM 3': If R =I..1.. We also give the matrix form of the last statement : Permutable matrices of simple structure can be brought into diagonal form simultaneously by a similarity transformation. From the fact that they are coprime it follows that all the vectors of R that satisfy the equation 1p1 (A) x = o belong to 11 and all the vectors that satisfy VJ2(A)x = o belong to 12.. Finally. then the whole space R can be split into subspaces . L R=11+12+.+1.' Let x1 a I1. . . B. B. L has equal characteristic values in each of them.. = o. then 11 and 12 are invariant with respect to any linear operator B that commutes with A.e. This theorem leads to a number of corollaries : COROLLARY 1: If the linear operators A. .224 VIII. B. L is a power of an irreducible polynomial. we mention a further special case of this statement : COROLLARY 3 : If A. We denote by ip1(1) and V'2(A) the minimal polynomials of 11 and 12 with respect to A. Bx1 a 11. . 4 See Theorem 1 of Chapter VII (p. . B. . so that V. L are pairwise permutable. Let us also give a geometrical proof of this statement. BV IL i.. L are pairwise permutable operators of simple structure (see Chapter III... invariant with respect to all the operators such that each operator A..1(A)x. + 12 is a decomposition of the whole space R into invariant subspaces 11 and 12 with respect to an operator A and if the minimal polynomials of these subspaces (with respect to A) are coprime. .
m. k=I. 9 (A)= (A Al)°1 (A A$)°+ . Therefore. (A Ah). But we have established in § 1 that the only solution of (32) is the trivial one if and only if A and B do not have common characteristic values. The equation (31) is equivalent to a system of mn scalar equations in the elements of X : a. n). m. Suppose that the matrix equation AXXB=C (31) is given. n). let us consider the equation where 9(X) =0. The Scalar Equation J (X) = 0 1. where X. of dimension m X n...§ 4. then (31) has a unique solution. but if the matrices A and B have characteristic values in common.. if (32) only has the trivial solution X = 0. To begin with.xttxubtr=cik fi m n m n (i =1. II and X = I Xjk 11 are a given and an unknown rectangular matrix.2. is a fixed particular solution of (31) and X. The Equation AX . are given square matrices of order m and n and where C = I c..2. THE SCALAR EQUATION f (X) = 0 225 § 3. or it has an infinite number of solutions given by the formula X=%+%.h (33) . . respectively. (32) Thus. where A= 11 ay I1T and B = 11 bkc I'. if the matrices A and B do not have characteristic values in common.. then two cases may arise depending on the `constant' term C: either the equation (31) is contradictory.XB = C 1. . § 4.2. the general solution of the homogeneous equation (32) (the structure of X.. can be written in matrix form as follows : AXXB=O. ... then (31) has a unique solution. was described in § 1).... .. k=1..2. (31') The corresponding homogeneous system of equations (i_=1...
e. there may be some that are equal.=n (36) Example 2. H(Iv)) T1 Pil P21 . the elementary divisors of X must have the following form : 'I j2.. Since the minimal polynomial of X. . reducible to diagonal form) with characteristic values 0 or 1..E(P" + H(Pj=). The formula comprising all the idempotent matrices of a given order n has the form X=T(1.1.+P. The least exponent for which the power of the matrix is the null matrix is called the index of nilpotency. . the solutions of (35) are all the nilpotent matrices with an index of nilpotency p < m...1x. (37) A matrix satisfying this equation is called idempotent..e. .... The formula that comprises all the solutions of a given order n looks as follows (T is an arbitrary nonsingular matrix): X = T (H(P. n (38) where T is an arbitrary nonsingular matrix of order n. j.. py _ in. . pl..)..1.. then the matrix is called nilpotent. Aj B(pj.. Let the equation X2 = X be given..) + HOPQ) T1... (A)... +. Obviously.=n (among the indices jj....+P. S ai. (35) If a certain power of a matrix is the null matrix.. n is the given order of the unknown matrix X). the first invariant polynomial i. Therefore an idempotent matrix can be described as a matrix of simple structure (i. (34) where T is an arbitrary nonsingular matrix of order n. by formula (34). . H(Pt).+p.2. j2. The elementary divisors of an idempotent matrix can only be A or A . . into a finite number of classes of similar matrices. pig S aj. We represent X in the form X = T {Aj.. must be a divisor of g(A). MATRIX EQUATIONS is a given polynomial in the variable A and X is an unknown square matrix of order n. S al...=1. Let the equation X'n =0 be given.1......j. i.. ..226 VIII.. i. . (p i+p2+.0. Example 1. . The set of solutions of the equation (33) with a given order of the unknown matrix splits....0)T'.
Y are unknown square matrices of the same order.. . . Matrix Polynomial Equations 1. Let us now consider the more general equation f (X) = 0.. Let us consider the equations A0Xm+ AIX'"i + . § 5. 2. are given square matrices of order n and X. form As in the preceding case. The equation (33) investigated in the preceding section is a very specialone could almost say. Multiplicities: al. S air.Sa{).§ 5. . .. (42) where Ao. 2.. trivialcase of (41) and (42) and is obtained by setting A4 = aLE.... (40) (j1.+A. m. p. .. (T is an arbitrary nonsingular matrix). . js. pi.. pit S ait.. (41) YmA0+ Y'"'Al+. .. .. A.=0. (39) where f (A) is a regular function of A in some domain 0 of the complex plane. and (33)..... We shall require of the unknown solution X = II xik II i that its characteristic values belong to G and that their multiplicities be as follows : Zeros : Al. a2. (42). =1.. . The following theorem establishes a connection between (41). A. .E°'Q + H(Pid. . ... A..E°'Q + H(PQ) T_... =0.. S ai. and therefore X = T {Ai.. . j. A2. where aj is a number and i =1. + A. every elementary divisor of X must have the (A2)7' (P.. MATRIX POLYNOMIAL EQUATIONS 227 2... .
if X and Y are solutions of these equations.. By the HamiltonCayley Theorem (Chapter IV. where A(A)=IAEAI.. Hence g (A) =IF(A) I= IQ (A) I A (A) = I Q .+. satisfies the equation AEA=0.X I and z1 1(A) = I AE . Therefore (45) implies that g(X)=9(1')=0. A(Y)=0.(A) A. MATRIX EQUATIONS Every solution of the matrix equation A0Xm + AiXm1 + . (A).+ Am= 0.Y I are the characteristic polynomials of X and Y. A(A) =0. § 3).+Aml (43) (44) The same scalar equation is satisfied by every solution Y of the matrix equation YmA0+ Ym'A... + A. and the theorem is proved.Y : F (A) = Q (A) (AEX) = (AE. (A) I Al (A) (45) where d (A) = I AE . the matrix polynomial F(A) is divisible on the right by AE . A(X)=0. when substituted for A.228 THEOREM 4: VIII.+Am. For every square matrix A. F(Y)=O...Y) Q. By the generalized Bezont Theorem (Chapter IV.. =O satisfies the scalar equation g (X) = 0. Note that the HamiltonCayley Theorem is a special case of this theorem. Then the equations (41) and (42) can be written as follows (see p. where 9(A)=JAoAm+AlAm1+ . 81) F(X)=O. Proof. We denote by F(A) the matrix polynomial . by the theorem just proved.. Therefore.X and on the left by AE . . § 4).Am+AlAm'+.
e1.A0X0 + AiIj + .. %1f 5 See [318). k = 1. %) = 0. 6m among each other.n1 are certain constant matrices of order n.. $1. .. . $1 . n) of F is a homogeneous polynomial in o. .. . with the matrix coefficients A.. n).. Theorem 4 can be generalized as follows : THEOREM 5:5 If %o. . (48) the adjoint matrix of F (f{k is the algebraic complement of fkj in the determinant Then every elementf{k(i. 2.. .. (47) where 9($0. where From the definition of F there follows the identity . 1. . Em) JJ....k= JF( o. Em are scalar variables. m) are linear forms in to. F= ' Ff... X.+AmEmJ. We denote by F (Eo.....1. %1. . X. Em) E . 51. . . 1.fi.§ 5. .+AmEm)E Ei.... 6 The / i k E1. satisfy the scalar equation g (X0.. are pairwise permutable square matrices of order n that satisfy the matrix equation (46) .. fm = 9 (Eo. . . E1... .. . .m)E. .+ +f.+. ..$... S1.....fm(AoEo+A1s1+. .. .. .1. : .. Em)=JAoEo+A.. 2. . and fm. + A. A1.. (49) The transition from the lefthand side of (49) to the righthand side is accomplished by removing the parentheses and collecting similar terms. In this process we have to permute the variables Eo.f. ... We seta F(E0. . . Em)=JJfik(EO.. (i....... gm= O (Ao. are given square matrices of order n). 1. .. but we do not have to permute the variables $o.. Proof.. We write this in the following form : F1.. E1.. .... Em) = JJ frk (Eo. %1.. .m of degree m . L1. Therefore the equation (49) is not violated when we substitute for the variables the pairwise permutable matrices %o. X.im o 1 in . . . %1. . S. so that F can be represented in the form h+f.. MATRIX POLYNOMIAL EQUATIONS 229 2.. then the same matrices %o. . A. E1. fi n ..
Note 1.+gmAm=O. gm) But... n)..... For we can apply Theorem 5 to the equation (51) A. by assumption.+fm°n_1 2 (AOX0 + AiXi + . .. X. E. X. X1. X1. 2. We have shown that every solution of (41) satisfies the scalar equation (of degree < mn ) g(A) =0... Therefore all the solutions of (41) have to be looked for among the matrices of the form TTDjT1 1 (52) (here D{ are welldefined matrices...... . .230 VIII. . T. n). (53) A natural method of finding solutions T..... MATRIx EQUATIONS fo+f.. =0 and then go over term by term to the transposed matrices. 2. gfm =g(%o. Note 2... In (41) we substitute for X the matrix (52) and choose T{ such that the equation (41) is satisfied. and this is what we had to prove.. +A. X1.%m= 0. . i =1. (50) AoXo+AIX1 +.. if we wish. Theorem 4 is obtained as a special case of Theorem 5.Xo+AiXi+.. . Therefore we find from (50) 9 (10. + Amgm) X X . when we take for Xo....+. But the set of matrix solutions of this equation with a given order it splits into a finite number of classes of similar matrices (see § 4).%'"1.TT = 0 (i =1. For each Tt we obtain a linear equation A0T{Dr + A1T1DD 1 + + A... %m. are arbitrary nonsingular matrices of order n .... Theorem 5 remains valid if (46) is replaced by %0Ao+X1A1+.+AX. 3. of (53) is to replace the matrix equation by a system of linear homogeneous scalar equations in the elements . %m) = 0. we may assume that the D1 have Jordan normal form. . .
158) the elementary divisors of X do not `decompose' when X is raised to the mth power.. (A . 2.. EXTRACTION OF 1YlTH ROOTS OF NONSINGULAR MATRIX 231 of the required matrix T. it follows that the elementary divisors of X are : (2... In this circle we have m distinct branches of the function 'VA. . In the following two sections we shall consider special cases of (41) connected with the extraction of mth roots of a matrix.E.u).e. In this section we consider the case J A J 0 (A is nonsingular). These branches can be distinguished from one another by the value they assume at the center 2j of the circle. all the characteristic values of X are also different from zero. From what we have said.does not vanish on these characteristic values. when raised to the mth power. In this section and the following. give the characteristic values of A. . We now AjEj+Hj in the following way. The Extraction of mth Roots of a NonSingular Matrix 1. and starting from this branch we define m the matrix function 2j jHj by means of the series Here Ej=E(P1) and Hi=HSr1) (jaI.. (56) Since the characteristic values of the unknown matrix X. Each nonsingular solution TL of (53). when substituted in (52). + Hj U1. Therefore the derivative of f (A) _ A.$)P=. yields a solution of the given equation (41).. In the Aplane we take a circle. (R. j=1.§ 6. We denote by (111)p! . We denote by 'PA that branch whose value at Aj coincides with the characteristic value j of the unknown matrix X. we deal with the equation Xm=A.) ..°. But then (see Chapter VI.... (2)'. . I. (54) where A is a given matrix and X an unknown matrix (both of order n) and m is a given positive integer.. with center Aj..u) (57) where _ AP i.. All the characteristic values of A are different from zero in this case (since A J is the product of these characteristic values).Ax)ru (55) the elementary divisors of A and reduce A to Jordan normal form :7 A= UAU1= U {11E1 + H1. § 6. p.. determine' . j is one of the mth roots of Aj Q. Similar arguments may be applied to the equation (42).. not containing the origin.
u) in place of A. . Now from (54) and (59) it follows that A = T {A1E1 + H1. . Hes } has the elementary divisors (57). we note that if on both sides of the identity (m )m=A we substitute the matrix A... + Ht (j =1.f) where f =mV! (here j =1. 2. MATRIX EQUATIONS inAiEf+Hf=AiEf+Af 1 i Hf+. ... the matrix (58 ) has only one elementary divisor (A . j' AuEu + Hu } P1.. . . 3......232 1 VIII. .l. ml 22E2 H2 . m'AEu + Hu } X A U.. u). . the continuous character arises from the arbitrary parameters contained in Xd . AuEu + Hu} T1. 2. (62) The multivalence of the righthand side of this formula has a discrete as well as a continuous character: the discrete (in this case finite) character arises from the choice of the distinct branches of the function ya in the various blocks of the quasidiagonal matrix (for .. = Ak the branches of m}"A in the jth and kth diagonal blocks may even be distinct) . When we substitute in (59) for T the expression UXd we obtain a formula that comprises all the solutions of the equation (54) X = UX a AIE + Hl.)m=AtE1+Ht (j= 1. the same elementary divisors as the unknown matrix X. (I T J 0) such that X=T {MVAlEI Therefore there exists a nonsingular matrix T + HI . u) .»a(m1}f Hi+.E. .. . .. 2. . Since the derivative of the function mil at t is not zero. we obtain : (mVAtEi+H. (61) where Xa is an arbitrary nonsingular matrix permutable with A (the structure of Xa is described in detail in § 2).. . .. H2 . (58) which breaks off.. Hence it follows that the quasidiagonal matrix {'nfAIEI + Hl .. (59) In order to determine T. A2E2 + H2. H2 ..e. (60) Comparing (56) and (60) we find: T=UXZ.. i.
Therefore in (62) we can set A = A.. then in the formula for X=' 1 only a discrete multivalence occurs. d.e. Example. The matrix X4 in this case looks as follows : X= a b c 0 a 0 0d e where a. not a function of the matrix A (i. XI). X2. if the numbers A. In this case every value of VA can be represented as a polynomial in A. ... 2.. which gives all t'he required solutions X.Ht (j =1.+Hw) U1 Thus. We point out that 1A.. A. b.. If all the elementary divisors of A are coprime in pairs. In this case A has already the Jordan normal form.is. . then the matrix Xd has quasidiagonal form Xd ={X1. =1). U = E. ... "t. u). if the elementary divisors of A are coprime in pairs.§ 6. 00 X'=d.U{ mVASEs+H'. with AfEf }..e. . and e are arbitrary parameters. _I a b c 0 a 0 0 d 6 (e'=. . Note.. Therefore in this case (62) assumes the form ' X. i. in particular. . are all distinct... The formula (62).. where Xi is permutable with 2fEf + Hi and therefore permutable with every function of A1Ef + By and. is not representable in the form of a polynomial in A). c. 22. 2.. (63) .. Suppose it is required to find all square roots of 111 A= 0 i.e. all solutions of the equation 1 1 0 0 1 . now assumes the following form : a b c X= 0 a 0 0d e 1 a2 0e 0 0 0 0 . in general. EXTRACTION OF mTH ROOTS OF NONSINGULAR MATRIX 233 All solutions of (54) will be called mth roots of A and will be denoted by the manyvalued symbol }A_.
Age the elementary divisors with characteristic value zero. 24s. We solve this system of equations with respect to x. Hence we find: a b e 1 a1 0 0 s a X1= A cd ..cd) y8 .21)P+. The Extraction of mth Roots of a Singular Matrix 1.234 VIII.). y8 = dx8 + a2x. Then we obtain the transformation with the inverse matrix Xz 1: x. x8 = a1y9. As in the first case.1) vw+ 2 e 0 0 0 '1 (e )) w (v=a8c. § 7. (A the elementary divisors of A that correspond to nonzero characteristic values. . We pass on to the discussion of the case where J A I =0 (A is a singular matrix). . (65) here we have denoted by (A .. .. . x2.. x8=. and hence e = a 2.) da1 e) v 'l II (e . 2 i'(Pa) + H(Pa) ... we reduce A to the Jordan normal form : A = U {A1E(Pi) + H(P.acy8.. X.. .).. and by A4i..ady8+a'y1... .). yx = axe. H(4. The solution X depends on two arbitrary parameters u and w and two arbitrary signs a and . . H(4i)) U1 . MATRIX EQUATIONS Without changing X we may multiply Xa in (62) by a scalar so that X a I = 1.. x.a$b ac a1 0 a 0 0 a1 0 d a8 ad aec(71e) 0 The formula (63) yields : e (st))acd+ X= 0 0 e (e .(a2b . w=a1d). Then this leads to the equation a 2e = 1. Let us compute the elements of X a For this purpose we write down the linear transformation with the matrix coefficients of Xd : y. = a1 y. H(43).=ax1 +bx8+cx. ..
qt) is the index of nilpotency of A2. q. A2} and also commute. EXTRACTION OF mTH ROOTS OF SINGULAR MATRIX 235 (66) Then A = U {A1.)). (70) (71) Since I Al 1 0. from the permutability of the matrices (68) and the fact that Al and A2 do not have characteristic values in common. The original equation (54) implies that A commutes with the unknown matrix X and therefore the similar matrices U1 A U= {A1. X2).. . it follows that the second matrix in (68) has a corresponding quasidiagonal form U1 XU = (X1. H(a.. (AIE(P.).).) + H(P... A2} U1.. . XQ = A2.. is a nonsingular matrix (I Al I 0) and A2 a nilpotent matrix with index of nilpotency y = max (q1.. by the formula (62) : X1 = Xd.)... H(9. From A2 = 0 and (71) we find x"=0. we replace (54) by two equations: Xi = A1. .). .We reduce X2 to the Jordan form : . . . Thus it remains to consider the equation (71)... the results of the preceding section are applicable to (72) (70).e.. q2.. U1 X U (68) As we have shown in § 2 (Theorem 3).. A2} and (X1. which already has the Jordan normal form A2 = {H(Q'). .) + H(P. . Therefore we find X. where m (µ . q2. .) (A'2 = 0).§ 7. The last equation shows that the required matrix X2 is also nilpotent with an index of nilpotency v. (69) When we replace the matrices A and X in (54) by the similar matrices {A1.. to find all mth roots of the nilpotent matrix A2... where A1= {i11E(P.1) < v :!g m. A2= {H(4. E(Q.u. i.. " /2 E(Pu) + H(v )) Xd. (67) Note that A. A E(Pu) + H(P )). H(9t))i (73) µ = max (q1. X2).).
)]m. (74) Now we raise both sides of (74) to the mth power.. We express v in the form v=km+r e1. . ..... em.. 2. Let us now elarify the question of what elementary divisors the matrix [H(v)]m has.. He2 = ell . e2. We write (76) as follows : (76) e form a Jordan chain Hei=ef1 Obviously. ... . v2.. ...... for H. ekm.. each column form a Jordan chain with respect to the operator Hm. . ..)]".... .. the remaining ones k vectors. . e2... e(k1) m+2.. because we have to find not only the elementary divisor of the matrix [H(9)].8 We denote by H the linear operator given by H(`) in a vdimensional vector space with the basis el. ek m. . . (75) 2.. v.... This table has m columns : the first r columns contain k + 1 vectors each. [H(U. . The equation (77) shows that the vectors of . 158). He. Here we are compelled to use another method of investigating the problem. but also a matrix p.236 VIII.... = ew_1... If instead g This question is answered by Theorem 9 of Chapter VI (p. e. v . ...+l. H(°s)) T1 (v1. ... where k and r are nonnegative integers. .. .). 2. ....... ekm+.. We arrange the basis vectors eg. ... ev in the following way : ell (r<m). v. em+2. e2. S v) . eu = e_1=. transforming [H(°)]" into Jordan form...)]m) T1. (78) em+1. eu = o).. corresponding to the elementary divisor Av.. e2.. (9 =1... em+1=o). . These equations show that the vectors e3. MATRIX EQUATIONS 12 = T (H(t.. (77) H"'e! = ei_m (9 =1.. H(°=). Then from the form of the matrix H(°) (in H(v) all the elements of the first superdiagonal are equal to I and all the remaining elements are 0) it follows that He1= o... [H(ti. e(k1) m+1.. We obtain : A2 =X = T {[HO.
0 1 .0 II The matrix Hc°> has the single elementary divisor Av. [H(°)]" has the elementary divisors : jk+l.. When HO) is raised to the mth power.... . we obtain a new basis in which the matrix of the operator Hm has the following Jordan normal form :s H(k+1)... H(k). we set : vi=k*m+rr (05r{<m. equation (75) can be written as follows: .m (describing the transition from the one basis to the other) has the following form (see Chapter III.). i=1. . A2 = %s = TP {H(k1+1).... (81) Then.0 0 1. H(kz+1). H(k... r mr (79) where the matrix P...} P1 T'. the blocks 11(k). § 4) : m 1 0 .. e)... r... and the matrix has the form r . . this elementary divisor `falls apart..+1)..' As (79) shows. (80) Pn. by (79). Hck. EXTRACTION OF mTH ROOTS OF SINGULAR MATRIX 237 of numbering the vectors (78) by rows we number them by columns. mr H(kl}. . H(k. 0 0 .....'U . . H(k.l . and therefore [Httil]m _ Ps. r H(k). Pv rn. 0 0 . where P= {Pry.....m {H(k+1).. .... . (82) ...+1).. H(k+1).. kk?0. HHk>} P1 V. P. m = 0 0. H«>... . . H(k) are absent.. r mr Ik Turning now to (75).. . } e In the case k = 0..2.§ 7...
{%d. H(a). H(Pu..A"+ admissible for X2 if after raising of the matrix to the mth power these elementary divisors split and generate the given system of elementary divisors of A2: 29.. we obtain from (82) As = TPQ1AZQPIT1... . Therefore Q can be regarded as known. MATRIX EQUATIONS Comparing (82) with (73).+1).238 VIII.+1)... H(k. H(v. } U1. A.). .. H(ki)... (72). . (88) From (69)..... we see that the blocks H(k. H("'). Let us call a system of elementary divisors..+1)... (86) The matrix Q describes the permutation of the blocks in the quasidiagonal matrix that brings about the proper renumbering of the basis vectors.) H(4. QP1. (89) . we have X' ='A. .. PQ1%a.). 2 . The number of admissible systems of elementary divisors is always finite. In this case there exists a transforming matrix Q such that (H(k'+I). Hence TPQ1= X A. Substituting (87) for T in (74). . and (88) we obtain a general formula which comprises all the solutions : X = U { %a... .. QP1 (H("..).. . apart from the order....).QP'} {'")/11E(h) + H(P. )=QIA2Q.... v1+v2+... H(k. . (83) must coincide. Using (86). H(k').. Let us show that for each admissible system of elementary divisors A"'. . kg') (84) 3..4'1). ...) ) . or T = %e. . H(k.. (n2 is the order of A2). . P"'. At... 19t...".)Smµ.=n2 In every concrete case the admissible systems of elementary divisors for X2 can easily be determined by a finite number of trials. . %d.. (87) where %A2 is an arbitrary matrix that commutes with A2. H(k++1).q3. because max(VII v2. . H("a) } PQ1$a' . $(k.. . v. H(k'). d"" form a corresponding solution of (71) and let us determine all these solutions... . .+v. with the blocks H(v. .
0 d a2 a1 0 0 cd a2b ac a1 ad 0a 0 0 a$ From this formula we obtain % =X==XA P'H "PIQ = 0 0 ft1 p 0 0 where a = ca1. t=2. In this case. The matrix X can only have the one elementary divisor 23. Example. q1= 2.a2d and l4 = a3 are arbitrary parameters. 0 0 0 0 0 0 i. and q2 =1. that the equation gm = H(P) has no solution for m > 1. A = A2.§ 8. r1=1 and (see (80)) 1 P= Ps. in (88) we may set : a b c XA.s = 0 0 0 1 0 0 1 11 P1. p > 1. It is easy to see. k1=1. § 8.. v1=3. Suppose it is required to extract the square root of A= 0 1 0 . We consider the matrix equation ex =A. Q= E. . to find all the solutions of the equation 1' = A. The Logarithm of a Matrix 1.e. m=2. 0 Moreover. THE LOGARITHM OF A MATRIX 239 We draw the reader's attention to the fact that the mth root of a singular matrix does not always exist. 1=12.= 0 a 0 X1 = A. for example. Therefore s = 1. Its existence is bound up with the existence of a system of admissible elementary divisors for X2. (90) All the solutions of this equation are called (natural) logarithms of A and are denoted by In A. as in the example on page 233.
. we know (see Chapter VI.. the matrix (94) has only the one elementary divisor (A .)7 (2122 . (A 6a)P'.) + H(P. and with radius less than I 1.. u). then all the characteristic values of A are different from zero. . Pi+P2++P. . We write down the elementary divisors of A : (AAl)P'.) + H(P. of X (j =1.t!)Pi. . =n).. Au 0. AZE(P. that I A .E(Pi) + AX'H(P. In the plane of the complex variable A we draw a circle with center at . .u). .. if the equation (90) has a solution.sl)P`. 158) that in the transition from X to A = ex the elementary divisors do not split. After this. so that X has the elementary divisors (A . u). In (22E(P.). I and we denote by f1(A) =1n A that branch of the function In A in this circle which at Af assumes the value equal to the characteristic value E.. the condition I A 1 0 is necessary and A is nonsingular (I A I for the existence of solutions of the equation (90). Therefore the quasidiagonal matrix { In (A1E(P.. 2. . (AA2)P'. AuE(Pu) + H(Pu) } U1. 2.. (j =1.. 0). Therefore there exists a matrix T (I T I 0) such that ... (93) where eF! .) + H(P*)). then. of X by the formula A. Below. 2.)). MATRIX EQUATIONS The characteristic values A.e. therefore. . (91) Corresponding to these elementary divisors we reduce A to the Jordan normal form : A = UA U1 = U { AIE(P') + H(P...240 VIII.bu)P" . (A .. Suppose.. = eel . i....). . (11A..) + .. of A are connected with the characteristic values . we set: In (A1E(ri) + H(PI)) = t! (A5E(P/) + H(P)) = In A.1. Thus. . . E1 is one of the values of in A (j =1. . In (AE(Pu) + H(Pi)) } (95) has the same elementary divisors as the unknown matrix X.2j 3. 0. p. . . we shall see that this condition is also sufficient. (94) Since the derivative of In A vanishes nowhere (in the finite part of the Aplane). (92) Since the derivative of the function eE is different from zero for all values of .
. (In (21E(P.§ 8.. . If all the elementary divisors of A are coprime. In (2. and Xa can be omitted (see a similar remark on p. we obtain a general formula that comprises all the logarithms of the matrix : X = UX.2E(Pa) + H('*)). A.. . THE LOGARITHM OF A MATRIX 241 (96) X = T { In (d1E(rl) + H(P*)). . we note that A = ex = T { A1E(P..E(Pu) + f(rt)) ) T1. then on the righthand side of (99) the factors X. (97) Comparing (97) and (92). In (2 E(Ps) + H(ra)) ) XA U1.E(Pu) + H(Pu)) T1.. we find: T=U12. . In order to determine T. Substituting the expression for T from (98) into (96).)..) + H("))..) + H(P. . (98) where Xd is an arbitrary matrix that commutes with A. (99) Note... 233). In (..
242 . an orthogonal transformation). Thus. The matrices corresponding to one and the same operator in the various bases are similar. The transition from one orthonormal basis to another in a unitary space is brought about by means of a specialnamely. unitarytransformation (in a euclidean space. In the present chapter we shall study the properties of linear operators that are connected with the metric of the space. By means of the scalar product we shall define the `length' of a vector and the cosine of the `angle' between two vectors. unitary. All the bases of such a space are of equal standing. However. the study of linear operators in an ndimen sional vector space enables us to bring out those properties of matrices that are inherent in an entire class of similar matrices. orthogonal matrices). this does hold true of all orthonormal bases.CHAPTER IX LINEAR OPERATORS IN A UNITARY SPACE § 1. skewsymmetric. All the bases of the space are by no means of equal standing with respect to the metric. General Considerations In Chapters III and VII we studied linear operators in an arbitrary ndimensional vector space. This metrization leads to a unitary space if the ground field F is the field of all complex numbers and to a euclidean space if F is the field of all real numbers. the `scalar product' of the two vectors. by studying linear operators in an ndimensional metrized space we study the properties of matrices that remain invariant under transition from a given matrix to a unitarilyor orthogonallysimilar one. At the beginning of this chapter we shall introduce a metric into an ndimensional space by assigning in a special way to each pair of vectors a certain number. To a given linear operator there corresponds in each basis a certain matrix. Thus. symmetric. hermitian. Therefore all the matrices that correspond to one and the same linear operator in two distinct bases of a unitary (euclidean) space are unitarily (orthogonally) similar. This will lead in a natural way to the investigation of properties of special classes of matrices (normal.
. denoted by (xy) or (x. of the vectors. over. V denotes the nonnegative (arithmetical) value of the root. 2 The study of ndimensional vector spaces with an arbitrary (not positivedefinite) metric is taken up in the paper [319].' By the length of the vector x we mean' + _N x = + (x x) = I x I. (x + Y. all the arguments remain valid for infinitedimensional spaces. have the following consequences for arbitrary x. let' 1. and 5. the socalled scalar product.x). 2. Nx = (x x) ? 0. 3'. (2) And if. ay) = a (xy ). Y) =a (xY) . We consider a vector space R over the field of complex numbers. From 1. y + x)= (xy) + (xx). METRIZATION OR A SPACE 243 § 2. z of R and an arbitrary complex number a.§ 2.. y). then the hermitian metric is called positive semi. DEFINITION 1 : A vector space R with a positivedefinite hermitian metric will be called a unitary space. more (3) 5. This number is called the norm of x and is denoted by N x Nx=(x.. 4 The symbol . (x.2 In this chapter we shall consider finitedimensional unitary spaces. y. If for every vector x of R 4. To every pair of vectors x and y of R given in a definite order let a certain complex number be assigned. Nx = (x x) > 0 for x. Note that 1. o. Metrization of a Space 1. (ax . From 2. Z) =(xx) + (yx) . or inner product.definite. wherever it is not expressly stated that the space is finitedimensional. (1) 3. Suppose further that the `scalar multiplication' has the following properties : For arbitrary vectors x. it follows that every vector other than the null vector has a positive I A number with a bar over it denotes the complex conjugate of the number. z in R: 2'. (x. y. and 3. 2. then the hermitian metric is called positive definite. 3 In §§ 27 of this chapter. Then we shall say that a hermitian metric is introduced in R. we deduce that for every vector x the scalar product (x x) is a real number. (X Y) _ (Yx) .
xa and y. Nx = (xx) {. x2. . x+y)= (xx) + (yy)=Nx+Ny. . n 'x{ei. coordinates of the vectors x and y in this basis : 2. . t1 (4) n (xy) _ 2' htkX44 f.. the norm of a vector. We consider an arbitrary basis el..' The form on the righthand side of (6) is. is a hermitian form in its coordinates... and 3'.k1 (5) In particular. 2.e.244 IX. . 2. e2. nonnegative: {. By the additional condition 5.. A form . . n). (the theorem of Pythagoras !) . . the expression on the righthand side of (4) is called a hermitian bilinear form (in x. . i. k =1. .xzk ? 0 (8) . .. k1 where he= (eiek) (s. .. .. 2.. 3. {'k1 ' ha xizk ... . n) is called her mitian..... 2.. LINEAR OPERATORS IN A UNITARY SPACE length and that the null vector has length 0. and 3'. i. the equality sign in (8) only holds when all the xj are zero (i = 1..k1 ± h. y.. x2. Hence the name `hermitian metric. In this case it follows from 1..e. e of R... N(x+y)= (x + y. i. Ix+yI2=1x12+Iyj2 (x±y) Let R be a unitary space of finite dimension n. 5In accordance with this. and (5) we deduce it hu=hj (i.. for all values of the variables x1. by 4... two vectors x and y are called orthogonal (in symbols : x 1 y) if (xy) = 0.5 Thus. 2. Let us denote by xi and y{ (i = 1. where hkf = Nk (i.. 3.. A vector x is called normalized (or is said to be a unit vector) if I x I = 1. . . k=1. y).. .. k = 1.. .. that ICI= 1XI. h Axk . n) the x= Then by 2. To normalize an arbitrary vector A for which x o it is sufficient to multiply it by any complex number By analogy with the ordinary threedimensional vector spaces. n). r1 yy4et. 2'. (6) (7) From 1.. . n). the form is in fact positive definite.. the square of its length.e.
. these metrics do not all give essentially different unitary ndimensional spaces. Then by (4). . x'.§ 2. and conversely. (5). . We determine orthonormal bases in R with respect to these metrices : er and e{ (i = 1. Let x{ and yr (i=1. ea of an ndimensional euclidean space. m) (9) When m = n. DEFINITION 3: A vector space R over the field of real numbers with a positive euclidean metric is called a euclidean space. .. then a metric satisfying the postulates 1.e. . 2. Let the vector x in R be mapped onto the vector x' in R. .... For let us take two such metrics with the respective scalar products (xy) and (xy)'. k =1... then 6 I. . . .. 3. where x' is the vector whose coordinates in the basis er' are the same as the coordinates of x in the basis e{ (i =1. and 5. In this basis every metrization of the space is connected with a certain positive definite hermitian form CkI h{kxdxk. 4. (xY) = (x'Y')' Therefore: To within an of fine transformation of the space all positive definite hermitian metrizations of an ndimensional vector space coincide. e2. In § 7 we shall prove that every ndimensional space has an orthonormal basis. .. e2.. However. n) are the coordinates of the vectors x and yin some basis el. 2. and (9) (xY) xryr :1 ±Ixdl2. where n is the dimension of the space. 2. If xr and yr (i 1. . 1. by (10). for ilk.. . 2. . e. is called euclidean. ..6 Moreover.. 2. 2. the operator A that maps the vector x of R onto the vector x' of R' is linear and nonsingular. for i =k (i. If the field F is the field of real numbers. n) be the coordinates of x and yin an orthonormal basis. n). (10) Nx=(xx) = Let us take an arbitrary fixed basis in an ndimensional space R.. is called orthonormal if (eiek) =art =1 0.) This mapping is of f ine. . we obtain an orthonormal basis of the space.. ... . METRIZATION OR A SPACE 245 DEFINITION 2: A system of vectors e1. (x ). by (4). n). every such form determines a certain positivedefinite hermitian metric in R.
. ... are real numbers. . i.. § 3.. if i1 i. k = 1. Gram's Criterion for Linear Dependence of Vectors 1. .. LINEAR OPERATORS IN A UNITARY SPACE rA (xY) = Li 8Ekx4Vk . c. 8 In the case of a euclidean space. .. n) are real numbers.E i_1 A 4. we obtain (x1x1) C1 + (x1x2) C2 + . k_1 Nx = I x 12 = .. . Nx = I x I2 =... ... x. i. x... on both sides of this equation. i.... is positive definite.....k = (e.. + (x2xm) Cm =0 .' The expression n ... + Cmxm = 0. + (xlxm) C.. in succession .. c. From the fact that the metric is positive definite it follows that the quadratic form which gives this metric analytically.. . .. x of a unitary or of a euclidean space R are linearly dependent..k_1 y4>0n n In an orthonormal basis (xy) xiyi...246 IX.. 2... ... that there exist numbers8 c1i c2.. (12) When we perform the scalar multiplication by x1i x2.... 2. ... (13) (x..e. ... .. Suppose that the vectors x1.es) (i. Sikxixk > 0 i. . Here Sik = Ski Ck_1 (i. . such that CI11 + 02x2 + . n).'Sikxixk is called a quadratic form in x1i x2.ax1)C1+(xmx2)0 +.k_1 ' Sikxixk. c. 'Sikxixk . = 0 (x1x1) Ci + (x2x2) c2 + .. c. c2.e. (11) For n = 3 we obtain the wellknown formulas for the scalar product of two vectors and for the square of the length of a vector in a threedimensional euclidean space. .. as a nonzero solution of the system (13) of linear homogeneous equations with the determinant T s..+(xmxm)Cm=0 Regarding c1. k =1. C. .. x2.. not all zero. ..
. C1x1 + C2x2 + .. it follows that these vectors are linearly dependent and then the whole system of vectors is dependent. . ... GRAM'S CRITERION FOR LINEAR DEPENDENCE (x1x1) (x2x1) 247 (xlx2) . . For a principal minor is the Gramian of part of the vectors.§ 3..e. and since the metric is positive definite Cixi+C2x2+. + Cmxm) = 0 ... we introduce a positivedefinite metric into the space of functions sectionally continuous in [a. X)=0' G (xi. . be written as follows : (x1. . . x2. c2f . Cixl + C2x2 + . f2 (t).. x2i . and then adding.Cmxm) = 0 (x2. It is required to determine conditions under which they are linearly dependent. f. For this purpose.+Cmxm=0. . cixi+c2x2+.. x2... 2.... (t) be n complex functions of a real argument t.. sectionally continuous in the closed interval [a. When this principal minor vanishes. xm are linearly independent if and only if their Gramian is not equal to zero. 0].+Cmx.. +. . . (xixm) (x2x2) (x2xm) ( 14) G (xl.. x2. xm) = (xmx) xmx2xmxmwe conclude that this determinant must vanish : G (xi.. . (13' ) Multiplying these equations by c1. . we obtain : N(cixi+c2x2+. Example. . that the Gramian (14) is zero.)=0. xm.. Let fI(t).. the vectors x1.. conversely.. cm respectively.. Thus we have proved : THEOREM 1: The vectors x1i x2. PI by setting ... .. i. xm are linearly dependent.. We note the following property of the Gramian : If any principal minor of the Gramian is zero... Then the system of Equations (13) can equations (13) has a nonzero solution e `ca. x2. . ..+cmxm)=0 (xm. Suppose. then the Gramian is zero.. . xm) is called the Gramian of the vectors x1.
x. Let all vectors originate at a fixed point 0.. To establish the decomposition (15).. ... .. We shall show that x can be represented (and moreover. c... XN the projecting vector... 5 x8 =C1Xl+C2XZ+.248 IX. are complex numbers. g) = f f (a) j_(9) dt._(9) at § 4. .. (16) where cl. we represent the required xS in the form Fig. orthogonality to a subspace means orthogonality to every vector of the subspace) . where (15) xs e S and xN 1 S (the symbol 1 denotes orthogonality of vectors.. Let x be an arbitrary vector in a unitary or euclidean space R and S an mdimensional subspace with a basis x1. c2j . are real numbers... Example. x2. . f fA (t) f (t) dt a f fn (t) J. xN is the perpendicular dropped from the endpoint of x onto the plane S (Fig.. Let R be a threedimensional euclidean vector space and m = 2.. LINEAR OPERATORS IN A UNITARY SPACE p (f.0 +Cmxm. xs is the orthogonal projection of x onto S. .. . represented uniquely) in the form x=xS+XN.. es. c. c.9 9 In the case of a euclidean space. Orthogonal Projection 1. 5) . p a p f fL (t) fn (t) dt _ = 0. xs is the orthogonal projection of x onto the plane S.. Then S is a plane pass ing through 0. a Then Gram's criterion (Theorem 1) applied to the given function yields the required condition : p f f (t) fl (t) dt a p . and h = I xN I is the distance of the endpoint of x from S.
.... xS by their ith coordinates.. + (xmxm) C. it is sufficient to replace the vectors x. . ORTHOGONAL PROJECTION 249 To determine these numbers we shall start from the relations (xx3. the coordinates are taken in an arbitrary basis... .. To justify the transition from (18) to (19). (19) When we separate from this determinant the term containing xs. we obtain: (x1x1) c1 + .. x2i . ... xm) is the Gramian of the vectors x1. xs =o. coordinates (i = 1.. G 0).. . x1 a xn =xXS = (xx1) .. From (15) and (20).}. . (xlxm) x1 (xmxl) ... n) .. . we obtain (in a readily understandable notation) : X1 a xm XS = (xx1) . 2.. we find: where G = G (x1. xm (in virtue of the linear independence of these vectors.§ 4. .... + (xx1) (1) = 0 (xlxm) cl + . . (17) When we substitute in (17) for xs its expression (16). we equate the determinant of the system to zero and obtain (after transposition with respect to the main diagonal) :lo (x1x1) .. (Xmxl) C.... cm. ... x... xk)=0 (k=1.1... m). (xxm) xm x a (21) 10 The determinant on the lefthand side of (19) is a vector whose ith coordinate is xg in the last column by their itb obtained by replacing all the vectors x...1) = 0 (18) xc x1 c xmcm+ Regarding this as a system of linear homogeneous equations with the nonzero solution c1i c2. .. +(xxm) (. . x2. (xmxm) (xx1) . (xxm) 0 (20) . . ... (xxm) X..
1 (xix) I Q h2 = (xNxN) _ (xNx) = (xx1) . We draw attention to another important formula. x2. Let y be an arbitrary vector of S and x an arbitrary vector of R. among all vectors y e S the vector xs deviates the least from the given vector x e R. . by (15) and (21). see (1]. we have :12  h=jxxs ( I xYI (with equality only for y = xs) . We denote by h the length of the vector XN.xs) is the meansquare error in the approximation x xs.250 IX.. x) G (x..y)= N (XN + XS Y)= NxN + N(xs y) ? N(xN) = h'13 As regards the application of metrized functional spaces to problems of approximation of functions. . xm. to begin with. . Let us assume.. In this case the Gramian formed from any of these vectors is different from zero. The quantity h = j'N (x . ... that they are linearly independent.. We consider arbitrary vectors x. when we set.. xm.'a § S. LINEAR OPERATORS IN A UNITARY SPACE The formulas (20) and (21) express the projection xs of x onto the subspace S and the projecting vector xN in terms of the given vector x and the basis of S. The Geometrical Meaning of the Gramian and Some Inequalities 1.. . (xxm) G (xmx) (x x) 'bis ^ G (x1.. xm.. 12 N (x . x. x issue from a single point and construct on these vectors as edges an (m + 1)dimensional parallelepiped.xs I are equal to the value of the slant height and the height respectively from the endpoint of x to the hyperplane S. 2... If all vectors start from the origin of coordinates of an ndimensional point space.. It Bee the example on p.. xm) (22) The quantity h can also be interpreted in the following way : Let the vectors x1.. Then. Then h is the height of this parallelepiped measured from the end of the edge x to the base S that passes through the edges x1i X2. x2.y j and I x . x. when we set down that the height is shorter than the slant height. then I x . . Then. x9. in accordance with (22).. Thus. 248.....11 Therefore.
m) in an orthonormal basis of R and set B=1Ixjkll . (25) It is natural to call V. m). Gm=IXTXI and therefore (see formula (25) ). x2. . P . .. x. spanned by the vectors x1.2 ... =VIh. 2. x3. .. in consequence of (10).. x2.§ 5. xP+1) G(xi) = (xixi) > 0. . xim i 2 mod x421 (26) ximl 14 Formula (25) gives an inductive definition of the volume of an mdimensional parallelepiped.. Let us use the abbreviation G. Then. V. xilm xi.= G(xl.=Ix. NX = V2h2 = V3.. the volume of the mdimensional parallelepiped We denote by xlk. . . ..2 . ...) (p =1.. . xiym xim. . ..'._. where V2 is the area of the parallelogram spanned by x. GEOMETRICAL MEANING OF THE GRAMIAN 251 (23) =h>0 (p=1. 2.. Negative Gramians do not exist. 2. x2. from (23) and (24). 2. .. Then..1=Gm = xil. and x2. xm) > 0 Thus: The Gramian of linearly independent vectors is positive.=V2. (i=1. . x. Further. XU. xp) and multiply these inequalities and the inequality G G(x xs. . \/Gm=Vm_1h. .. (24) we obtain G (xi.. that of linearly dependent vectors is zero.. where V3 is the volume of the parallelepiped spanned by x1.....k the coordinates of Xk (k = 1.... in general. k=1 .. Continu ing further. n . we have \/G. we find : G4 = V3h3 = V4 and.=Vm. . xill V. x2.
Hadamard's inequality can be put into its usual form by setting m = n in (29) and introducing the determinant d formed from the coordinates X1k. .x. 2. In particular.. Let us return to the decomposition (15).. . This has the immediate consequence: (xx)=(x3+XN.=. x) S G (x1.) ... xm) S G (x1) G (x2) . . x2. n) in some orthonor*nal basis : x11 .. which.. 1n2 (27) "71 ... x. ... x1n x2 V. and (27) solve a number of fundamental metrical problems of ndimensional unitary and ndimensional euclidean analytical geometry. (29) where the equality sign holds if and only if the vectors x1i x2.252 IX. The inequality (29) expresses the following fact. (28) the equality sign holds if and only if x is orthogonal to x1. . . ... The formulas (20).1 . it follows from (26) that x11 x12 . LINEAR OPERATORS IN A UNITARY SPACE This equation has the following geometric meaning : The square of the volume of a parallelepiped is equal to the sum of the squares of the volumes of its projections on all the mdimensional coordinate subspaces. gives an inequality (for arbitrary vectors G (x1.mod x21 x92 . (26). . . for m = n. ... From this we easily obtain the socalled Hadamard inequality G (x1.. which is geometrically obvious : The volume of a parallelepiped does not exceed the product of the lengths of its edges and is equal to it only when the parallelepiped is rectangular.. xs+XN)=(xs. (21). G (x. xm.. xm) G (X). XN)?(XNXN)=h2. xin d=I xnl ... xm. X)+(XN.. in conjunction with (22). x2. ...... 2. . x2k. . x2.. . of the vectors Xk (k = 1.. (22). xm are pairwise orthogonal.. x2.
.. x3.. . xm) =0.. The first step (m =1) is trivial and yields the inequality a (xis) G (xi). . S) = 0.. x2S. . . the inequality (30) expresses the following geometric fact. . . of course. . . 5 on page 248). X. xm)r 0. .. xm) (31) If we now go over on the lefthand side of (31) from the vectors x{ to their projections xis (i = 1. by a simple geometric argument. x') and by squaring both sides. x2. . S V G (xi. x2s.. . . . xm_1) . We write the volume /G (x1. m).e.§ 5. A = }'G (x1. 3).. . We prove (30) by induction on rn. .. . xms) SG(xi. The volume of the orthogonal projection of a parallelepipep onto a subspace S does not exceed the volume of the given parallelepiped.. . . i.. i_1 Ix... If G(x1j x2.. Our condition for the equality sign to hold follows immediately from the proof. then (30) implies. then the first factor cannot increase. xm) of our parallelepiped as the product of the `base' y G (x1. xas.. Nauk. . But the product so obtained is the volume / G (x1s. x2. x2..212 {_1 3. . x2.112. no. I xis 1 :5 1 xi I (see Fig. ...vol.15 We now turn to the inequality G(xis.. these volumes are equal if and only if the projecting parallelepiped lies in S or has zero volume. xm) (30) If G(x1i x2. then the equality sign holds in (30) if and only if x{n =0 (i=1. .. we obtain (30). xms) of the parallelepiped projected onto the subspace S. In virtue of (25).. 2. 2.. m). GEOMETRICAL MEANING OF THE GRAMIAN 253 Then it follows from (27) and (29) that n 1 n X n ... . .. nor the second. '' Subsections 3 and 4 have been modified in accordance with a correction published by the author in 1954 (Uspehi Mat.. Hence G(xis. .. . by the induction hypothesis. . .. xm_1) by the distance h of the vertex of xm from the base: G (xi. . x2s. x2. that G (x1s. 9.
. xx) G (x7. . . x7.. . xms) = G (x1.. ... . and then R = T + S. Let us prove the inequality (32). . Now we shall establish a generalization of fladamard's inequality which comprises both the inequalities (28) and (29) : G (x1. xfn) = G (xl. x7. is orthogonal to each of the vectors xp+1... then (32) holds with the equality sign.... . whose square represents a certain volume. Let p < m.. The same arguments show that the Gramian on the righthand side of this equation can be split : G (x1. .+1i .. x7... .. .... (the case G(x1.+1S. This completes the proof. . x. i.. .. . .. x. xms) S G (x1. m). The equality sign holds in two cases: 1.. . . If G (x1. x2.+1..x2) vanishes. xm). When G(x.) G (xp+1. x7.) G (xP+1S .. x7. .. xm) . .+15. . .) = 0 has been considered at the beginning of the proof). vans ) If we now go back from the projections to the original vectors and use (30). x7. x7. ...) 0.. . and 2.. x. belong to S or.. what is the same..254 IX. . xfns) . . x7.. see § 8 of this Chapter).. to their projections xms onto the subspace S : G (xl.. x2. in the Gramian G(x1i X2. . 3.+1. x7.. . .. x7. from the vectors x7. x. .... .. is orthogonal to every vector x1i X2. . . for details. x7.. Since every vector of S is orthogonal to every vector of T..) = 0. (i =1.. x7... ..+1. we can go over. We denote it by S... LINEAR OPERATORS IN A UNITARY SPACE 4... . Let G(x1i x2.. By combining the last three relations we obtain the generalized Hadamard inequality (32) and the conditions for the equality sign to hold. .) G (x7.)...... .. The inequality (32) has the following geometric meaning: The volume of a parallepiped does not exceed the product of the volumes of two complementary `faces' and is equal to this product if and only if these faces are orthogonal or at least one of them has volume zero.2.. for then it is obvious that G(xp+1s..s = x. each vector x7. .... ..e. x2i ...... x7. x.. . The set of all vectors y of R that are orthogonal to T are easily seen also to form a subspace of R (the socalled orthogonal complement of T. (32) where the equality sign holds if and only if each vector x1. then we obtain G (x1.. . .+1.+1.+1S .. 2... .. . Then the p vectors x1i x2.. when the vectors x7.. x9 are linearly independent and form a basis of a pdimensional subspace T of R.. When x.. xm or one of the determinants G(x1. x. . xm) 5 G (x1.5) = 0. . =0....
we take ± h{kxcxk as the fundamental 4k1 metric form of R (see p. e2. k = 1. ... k1 x in an ndimensional space R.. n) (P < n). The inequality (33) holds for the coefficient matrix H = II h{k 111 of an arbitrary positivedefinite hermitian form.. x2. in a basis e1..5 G (el.... GEOMETRICAL MEANING OF THE GRAMIAN 255 5. ej. hjkxtxk be an arbitrary positive definite hermitian form.. Then R becomes a unitary space.5 H (1 2 . ep) G (ep+i.. (34) and the equality sign holds only if the vectors x and y differ only by a scalar factor The validity of Schwarz's inequality follows easily from the inequality established above G (x.. .§ 5. e2.. p)g lp F. . By analogy with the scalar product of vectors in a threedimensional euclidean space.... . y e R (xy) 2 S NxNy . (33) k= p+1..n).. as the coordinates. . this is known as Bunyakovskii's inequality.... n).. 244). . of a vector Let {. (33) holds if H is the real coefficient matrix of a positivedefinite quadratic form r.1 . We apply the generalized Hadamard inequality to the basis vectors e1.. . . en: O (el. By regarding x1.nn) . The generalized Hadamard inequality (32) can also be put into analytic form. e. § S.. x. . . . . we can rewrite the latter inequality as follows : H (1 2 . We remind the reader of Schwarz's inequality :t For arbitrary vectors x. 2. t In the Russian literature. In particular.xizk 6.. we can introduce in an ndimensional unitary space the 16 An analytical appronch to the generalized Hadaniard inequality can be found in the book [17J. w Here the equality sign holds if and only if hik = hki = 0 (i=1... y) = (xx) (xy) (yx) (yy) I ?  .. . Setting H = II hik II i and noting that (e... p .. e$. . k1 At h.. . 2.ek) = hik (i.
§ 6. The orthogonalizing process leads to vectors that are uniquely deterTBEOREM 2 : mined to within scalar multiples. xE. finite or infinite. xp] = [y1. 17 In the case of a euclidean space._. . .) . . 2. A sequence of vectors is called orthogonal if any two vectors of the sequence are orthogonal. Y2 . Orthogonalization of a Sequence of Vectors 1. . x2....+ cp xp of the vectors x1.. the subspace is pdimensional. Y: yi. x$. . I Y p] (p = 1.. The smallest subspace containing the vectors x1. containing an equal number of vectors.. cp are complex numbers. then they form a basis of [x1. cg. x2. .256 IX... A sequence of vectors X: x1. LINEAR OPERATORS IN A UNITARY SPACE `angle' B between the vectors x and y by defining"' cos26=N NI' Y From Schwarz's inequality it follows that 0 is real. the angle 8 between the vectors x and y is defined by the formula cas 8 = (xY) IxllY[ "s In the case of a euclidean space.. . . will be called equivalent if for all p [xi..)16 If x1. In that case.. x2. xpJ. .. .. . . x2. . ca.. x9.. . . xp are linearly independent. will be called nondegenerate if for every p the vectors x1. yg. . This subspace consists of all possible linear combinations c1 x2 + c2 x2 + . xpJ. By orthogonalization of a sequence of vectors we mean a process of replacing the sequence by an equivalent orthogonal sequence. .. xp are linearly independent.. . Every nondegenerate sequence of vectors can be orthogonalized. Two sequences of vectors X : x1... xp will be denoted by [x1. . . . xp(c1.. x2. these numbers are real.. x. .. . . ..
ORTHOOONALIZATION OF SEQUENCE OF VECTORS 257 Proof... (X).. z2. .) . . y2.. .1. cpp such that (p=1.. = o and x1N = x. y2. (Y) and z2.. . 2.. Q1.. . ._1= 0.. L [z1. Z1. Suppose that two orthogonalizing sequences yl.. 2. . (p =1.= G (x1.) are arbitrary nonzero numbers.1 (xpx1). ..= [x1..) x1. We set y1.. . (X) is given by the following construction. ):19 S1.... Then Y and Z are equivalent to each other. x1. By (21) x1 0 x1... x1N=XI) where A1. 1) Let us prove the second part of the theorem first. .. .. + c1. x2. x1. .... x2. G0=1). 2.2. ..PyP When we form the scalar products of both sides of this equation by yl.. x2..=CPPyP (p=1...2 . 19 For p = 1 we set x18. .] . x2. NO x1. 2. y1.. orthogonally onto the subspace Sp_1 (p=1.. xp_lJ=[y1) y2v . 2.31...(xpxp1) x1. .'_ CpIY1 + Cp2y2 + .. yp_lJ. )bpxpN (p=1.. . Therefore for every p there exist numbers c. z2. This proves Theorem 2.= XPSp_1 +x1. we obtain ep1= Cp2 = = cp..) . .). XPN 1 SPI (p =1. y2.. 2) A concrete form of the orthogonalizing process for an arbitrary nondegenerate sequence of vectors x1... (Z) are equivalent to one and the same nondegenerate sequence x1. We project the vector x._1 6 SP2..2.1 and take account of the orthogonality of Y and of the relation z.. .). .. is an orthogonal sequence equivalent to X..§ 6... cp2.. PlyPi + c.. .. Then it is easily seen that Y: y1. op_I (p=1.. and therefore z1.. Let (p=1. ..
.. 0 3 0 0 5 0 (m =1.. . 2.._I (p = 1. In the space of real functions that are sectionally continuous in the interval (1.... P... (37) we obtain an orthogonal sequence Z equivalent to the given sequence X. 77ff.. .. ... 20 See [12].258 IX. . . (p =1.. apart from constant factors. Example.. ... (xrlxr1) xr1 (xpx1) .. setting Sip= O.1). yp (p=1. p. Nyp= ap_1xxp.. in a different metric .... x... (x) = 2"'m! I do s_ 1)M dm'" (m 1. The same sequence of powers 1..... By (22). we define the scalar product +1 (f. x'" These orthogonal polynomials coincide....).. 3 .. + 11...) .p 1=Gp_1Gp (36) Therefore.l0.... x2. 1 We consider the nondegenerate sequence of `vectors' 1.. 2.. = O... . (xpxp_1) x.). (xlxp1) xi (xrixi) ..1 . We orthogonalize this sequence by the formulas (35) 10 yo=1. we obtain the following formulas for the vectors of the orthogonalized sequence: Yi= X10 Ys xx (1 1) x1 (xe:L) x: (xlxl) ....x°.. 2. LINEAR OPERATORS IN A UNITARY SPACE Setting A... g) = ff(x)g(x)dx.. 2. x 0 .. 2... Go. yr= 0 1 1 0 6 0... with the wellknown Legendre polynomials :20 Po(x) =1.v =aP1 G. x... Co = 1)... . x2...
x2i sented in the form (see (20) ) xs. Let us form the series 21 For further details see [12].. Epxp . 2. : gyp= (xxp) (p =1.).p = E1x1 + .X 14 Nx=1x12. b = + oo and t (x) = ex' we obtain the hermitian polynomials. x$..oo. g) = af f(x)g(x)t(x)dx (where t(x) ? 0 for a < x polynomials.21 2. the projection of x onto z. b) gives another sequence of orthogonal For example. s9] can be repre (p =1.. .. (Z). For a = . then we obtain the (n are cos x). .). = [xl. if a=1. . § 9.+. .+ I fr 1$ S Nx. Chapter II. We shall now take note of the socalled Bessel inequality for an orthonormal sequence of vectors xl... But Nxs = I S1 I'+ I 2 I' + . We denote by . ORTH00ONALIZATION OF SEQUENCE OF VECTORS a 259 (f.. Let x be an arbitrary vector.. P e112+1 4 P1'sNx (38) This is Bessel's inequality. In the case of a space of finite dimension n. Therefore...2x2 + . this inequality has a completely obvious geometrical meaning. it follows from (38) that the series x I ek 12 converges and that 00 1r_1 . for every p.. . 2.. For p=n it goes over into the theorem of Pythagoras IE11 +1 co =1x12 In the case of an infinitedimensional space and an infinite sequence Z. etc.§ 6. b =1 and t(x) _ Tchebyshev (Chebyshev) polynomials: 2 11 cos rl IxA . Then the projection of x onto the subspace S. .
and Ny. . 11z1..el Skzk . . for N (x + y).. . x . Let us calculate the corresponding meansquaredeviation b. of x onto the subspace . LINEAR OPERATORS IN A UNITARY SPACE 00 k.. is called complete.1 .oo ki If lim 6P= 0.. In this case.. In this case we have an equality for the vector x in R ( the theorem of Pythagoras in an infinitedimensional space!): co Nx=I X12=I1412k1 00 (39) If for every vector x of R the series F k xk converges in the mean to x. . zo] and is therefore the best approximation to the vector x in this subspace : Sv . k1 ki 00 lim 8P = Nx . 2. .1 .s. k =1.+Spk'p.X kzk) = (x ki Hence  k.ff Lz1. then the orthonormal sequence of vectors al. z2. when we replace x in (39) by x + y and use (39) three times.+ 2x2+. c2i .. P+Go 00 then we say that the series ' k xk converges in the mean (or converges with k_1 respect to the norm) to the vector x. .: N(x±Skzk) SN(x P ECkzk).G kzk) = Nx GPI k 12 . (40) . k1 krl where c1.260 IX. then we easily obtain : 00 k+1 (xY) = E 4 k r l k ki [ k = (xxk). 2k = (yzk). &p = N (x . Nx.i' I k L2 p. cp are arbitrary complex numbers.. kZk For every p the pth partial sum of this series.. Z2. is the projection x.
f1.. . ± 2. we have the formula 2n (f. f 2. because 271 f 0 ei14te'. .. In the theory of Fourier series it is proved that the system of functions e'kt (k = 0..1 These functions form an orthogonal sequence. . then fo is real.1. .) are called the Fourier coefficients of f (t). for 0 The series 00 k. g) = f f (t) 9 (t) dt 0 for the scalar product of two functions f (t) and g(t). Chapter II.. Setting Sn fk= f (t) 'ikedt = 2 (a. Let us define the norm of f (t) by Nf= f If (J) 12dt. We take the infinite sequence of functions ekt I ffx (k0. If f (t) is a real function..t dt = ( ee 0. 2a]. We consider the space of all complex functions f(t) (t is a real variable) that are sectionally continuous in the closed interval [0. 2. f2. [121.§ 6. and f k and f k are conjugate complex numbers. 0 2n Correspondingly. converges in the mean to f (t) in the interval [0. fketkt (/k=2hj?I(tetdt. for example.Foa 2n 2n f f (t) 0 e{kt dt f g (t) etkt dt .. . ± 2. ) 0 {k= 0> f 1. 2n].22 The condition of completeness gives Parseval's equality (see (40) ) 2n . ORTHOGONALIZATION OF SEQUENCE OF VECTORS 261 Example. . .) is complete. + ibk) 0 J 22 See... ± 1. This series is called the Fourier series of f (t) and the coefficients fk (k = 0.).
.. we have f keskt + f _te. ... . k_1 (41) .. x2. 0 § 7.) . Therefore. xz. x and xi.. for a real function f (t) the Fourier series assumes the form 2x 00 /ak=_ff(t)costtu .. . xn be the coordinates of one and the same vector x in two different orthonormal bases e1. . 2.. be an orthonormal basis of R.. The formulas for the coordinate transformation have the form ...= at cos kt + bt sin kt (k =1.2. x. n) . e2. Thus : Every finitedimensional subspace S (and. e2. Let e1... we easily find xk = (xek) : (k= 1.. in particular. . x3. LINEAR OPERATORS IN A UNITARY SPACE where sx 2x ak= I f f 0 (t) cos kt dt . the whole space R if it is finitedimensional) has an orthonormal basis.e. the coordinates of an arbitrary vector x in this basis : xxkek. .. en and Let x1.. k_1 Multiplying both sides of this equation on the right by ek and taking into account that the basis is orthonormal. e. . 1. . e' of a unitary space R.. . 0 2 + E (ak cos kt + bk sin kt) sn bk = f f (t) sin kt dt.. .... ..) 0 . x2i ei.. Orthonormal Bases 1. bk = a f f (t) sin kt dt (lc=0. 2. i.. We denote by x1.262 IX. A basis of any finitedimensional subspace S in a unitary or a euclidean space R is a nondegenerate sequence of vectors and thereforeby Theorem 2 of the preceding sectioncan be orthogonalized and normalized. e. . 2. in an orthonormal basis the coordinates of a vector are equal to its projections onto the corresponding basis vectors : x = E (xek) ek ..
.. for k for k=l. The orthonormal basis of R so obtained we shall denote by u1.. vikx'k ki (i = 1... (45) Such a coordinate transformation is called orthogonal and the corresponding matrix V is called an orthogonal matrix. 2. (43) A transformation (42) in which the coefficients satisfy the conditions (43) is called unitary and the corresponding matrix U is called a unitary matrix... ... . n) (44) whose coefficients are connected by the relation i1 vaou = Bki (k . eK to be orthonormal in terms of coordinates (see (10) )...... Therefore. We consider a unitary space R with an orthonormal basis el. We note an interesting matrix method of writing the orthogonalizing process.. e... lctk i are easily seen to be the coordinates of the vector ek in the basis e1. Here the coefficients u1 .2. . . u.. . . 2. . 1. Let A = II aik 117 be an arbitrary nonsingular matrix (I A 0) with complex elements.... ORTiioNORMAL BASES 263 (42) uikx' (i = 1. 1 uikuil = 41 0.n).. . .L..n). an by the equations akaikei i_1 (k n). . . Let us perform the orthogonalizing process on the vectors a1..'uikef i1 n (k=1. . . a2. Let R be an ndimensional euclidean space.§ 7.. The transition from one orthonormal basis of R to another is effected by a coordinate transformation xi = . 2. e2. e'.. a2.. . e and define the linearly independent vectors ax. . a. we obtain the relations 1. .. u2.. n) . . Suppose we have ui= .. .. l =1. when we write down the condition for the basis e'1. Thus: In an ndimensional unitary space the transition from one orthonormal basis to another is effected by a unitary coordinate transformation.. u k that form the kth column of the matrix U = I. u2k. 2. e2.
U is an orthogonal matrix. . where A.)...+.. a1= eliul a2 = C12U1 + C22U2 . I = 1..+Cnaua.. ... n).. . Since the orthogonalizing process determines the vectors u1. the factors U and C in (*) are uniquely determined apart from a diagonal factor M = (Cl. . .. This can also be shown directly. When we select from the sequence {Um) a convergent subsequence U. n . 0 I (m=1. A= UC (*) According to this formula : Every nonsingular matrix A = +I aik II can be represented in the form of a product of a unitary matrix U and an upper triangular matrix C. 1.P= U. i< k) are certain complex numbers. [a1.. then we obtain from the equation A. n). ea) : U= U1M... . Note 1.u2.p for p + oo the required decomposition A = UC.u2. If A is a real matrix.as.. .k=1. 2.. en ( I e.. Setting c{k = 0 for i > k. The formula (*) also remains valid for a singular matrix A (I A = 0).. apart from scalar multipliers s.. This can be seen by setting A = lim A.. ). u2.. .2.. LINEAR OPERATORS IN A UNITARY SPACE Then i..e. M000 ... . as=Cnul+ where the C.. in the case ` A I = 0 the factors U and C are no longer uniquely determined to within a diagonal factor M.264 IX.2. . When we go over to coordinates and introduce the upper triangular matrix C = II cik 111 and the unitary matrix U = I usk II i . i =1.. C2.....a2.... k = 1. Note 2.]=[u1..= UrC(m = 1. Then A.. 2... C. we have: ak=2 cpkup P1 a (k=1.... 2. 2i ..k (i.p "`y (lim U.) . .. u uniquely. ). . ..2... In this case. E2. the factors U and C in (*) can be chosen to be real. we obtain aik or 4U{pCpk (i.2... C=M1C1. However.p p4e0 n'y U) and proceed to the limit..up] (p=1.
..I (A *y. Moreover. ..§ 8. It is easy to verify that the operator A* so defined is linear and satisfies (46) for arbitrary vectors x and y of R. e. where D is a lower triangular matrix and W a unitary matrix... we obtain {_1 aik = (Aek. DEFINITION 4: A linear operator A* is called ad joint to the operator A if and only if for any two vectors x. (48) 23 From the fact that U is unitary it follows that UT is unitary. THE ADJOINT OPERATOR 265 Note 3. . we take an orthonormal basis el.. by apply ing the formula (41) to the vector Aek = 2' a ej . Let A be a linear operator in a unitary space and let A = II act 11 11 be the corresponding matrix in an orthonormal basis el. written in matrix form UT U = E . . To prove this. y of R (46) (Ax. k_1 (47) We now take (47) as the definition of an operator A*. Then (see (41)) the required operator A* and an arbitrary vector y of R must satisfy the equation n A*y =. Thus the existence and uniqueness of the adjoint operator A* is established.. since the condition (43).23 § 8. e. es) ek . 2. e2. We shall show that for every linear operator A there exists one and only one adjoint operator A*. y) _ (x. in R. (47) determines the operator A* uniquely. e2. . The Adjoint Operator 1. Then. k_1 By (46) this can be rewritten as follows : A*y = E (y. Instead of (*) we can also obtain a formula A=DW.. n). Aek) ek. k =1. D= CT. For when we apply the formula (*) that was established above to the transposed matrix AT AT=UC and then set W = UT. . implies that UUT = E. a{) (i. we obtain (**).. Let A be a linear operator in an ndimensional unitary space. A*y)..
Obviously. . (A*)* = A. 2. (This is not to be confused with the adjoint of a matrix as defined on p. It is easy to see that T is a subspace of R and that every vector x of R can be represented uniquely in the form of a sum x = xs + XT. (A + B)*= A* + B*. T is called the orthogonal complement of S. . XT a T. The matrix A* is the complex conjugate of the transpose of A. so that we have the resolution R=S+T. LINEAR OPERATORS IN A UNITARY SPACE Now let A* = II a{k basis. We shall now introduce an important concept. 3. (aA)*= aA* (a a scalar).. then the orthogonal complement T of the subspace is invariant with respect to A*. 2. ei) (i. T. S is the orthogonal complement of T. where xs c S. From (48) and (49) it follows by (46) that w aH (i. Let S be an arbitrary subspace of R. 4. We write S 1.. SiT.. k =1. n). k =1. Now we can formulate the fundamental property of the adjoint operator : 5.. (AB)* = B*A*. We denote by T the set of all vectors y of R that are orthogonal to S. Then.. We obtain this resolution by applying the decomposition (15) to the arbitrary vector x of R. 2. be the matrix corresponding to A* in the same (49) a _ (A*et. A* =AT. If a subspace S is invariant with respect to A. .) Thus: In an orthonormal basis ad joint matrices correspond to ad joint operators.266 IX. This matrix will be called the adjoint of A. The following properties of the adjoint operator follow from its definition : 1. 82. meaning by this that each vector of S is orthogonal to every vector of T.. 2. by (48). n).
.. then the corresponding characteristic values are complex conjugates. Now we shall prove the following proposition : If A is a linear operator of simple structure.. Yt o (k =1. 2. x2... A*y T.. . .. o). . it follows that the vectors of each system are linearly independent.. n) by suitable numerical factors we obtain (x:Yk)=6 (i. n).§ 8. Then Tk is invariant with respect to A*: A*Yt = µtYt... Then. m). k =1.. ... and yl. From the biorthogonality of the systems x1i x2. Consider the onedimensional orthogonal complement Tk = [yk] to the (n 1) dimensional subspace Sk (k =1. x. x)= µ(x. . k =1. . x2. x2. n).. y2.. . From Sk 1 yt it follows that (xkyk) 0. . y2. and this is what we had to prove. . y... 2. n). . 2.2. 2. . . Sk . we have 2(x.. . . setting y = x in (46). y2. . . .. . xl. For let Ax = 2x and A*x = ax (x. . . and complete systems of characteristic vectors 6. (xxYt) = 6 (1.. Ayi = p y. For let x1. . yk (k =1.x. Then it follows from Ax E S that (Ax. x be a complete system of characteristic vectors of A. (k =1. ... . Multiplying xk..... . We mention one further proposition : 7. n). are called biorthogonal if (xtYk) = b (i. 2. xt+p. y e T. If the operators A and A* have a common characteristic vector..k=1. Y. . then the ad joint operator A* is also of simple structure. 2. xti. x and yl. .. . because otherwise the vector yk would have to be the null vector. We use the notation = [x1.. Since x is an arbitrary vector of S...... y) = 0 and hence by (46) that (x.... . A*y) = 0. . . of A and A* can be chosen such that they are biorthogonal: Ax i = A. (50) where bik is the Kronecker symbol. x and 7i.. THE ADJOINT OPERATOR 267 For let x e S. y.. n). We introduce the following definition : DEFINITION 5: Two systems of vectors x1. x) and hence 2 =µ.
. For suppose that for arbitrary vectors x and y of R (Ux. the inverse of a unitary operator is also unitary. The operator A is normal if and only if its hermitian components Hl and H2 are permutable. 24 See footnote 13 on p. or U* = U1. 2.Y) =(x. (52) DEFINITION 8. UY) = (x.e. since y is arbitrary. i.e. DEFINITION 6. Normal Operators in a Unitary Space 1.24 This is called the unitary group. The hermitian components are uniquely determined by A. the unit operator E is unitary.268 IX. U*Ux=x. the product of two unitary operators is itself a unitary operator. as an operator preserving the metric. A linear operator U is called unitary if it is inverse to its adjoint : (53) UU*=E Note that a unitary operator can be regarded as an isometric operator in a hermitian space. . Hermitian operators and unitary operators are special cases of a normal operator. LINEAR OPERATORS IN A ITNITARY SPACE § 9. 2.Y) and therefore. (55) where H1 and H2 are hermitian operators (the' hermitian components' of A). U*U = E. We have TuEon 3: Every linear operator A can be represented in the form A=H1+iH2. Conversely. i. A linear operator A is called normal if it commutes with its ad joint : (51) AA* = A*A. and 3. Therefore the set of all unitary operators is a group.. A linear operator H is called hermitian if it is equal to its ad joint : H = H. Y) Then by (46) (54) (U*Ux. DEFINITION 7. (53) implies (54). From (53) and (54) it follows that 1. 18.
H22i (AA*). and finally as unitary if it is inverse to its adjoint. and U.. characterized by the following relations among its elements : A f . a hermitian matrix is always the coefficient matrix of some hermitian form (see § 1). . Suppose that (55) holds. Then: In an orthonormal basis a normal (hermitian. unitary) operator corresponds to a normal (hermitian. by (59). urquki = 6 (i. Then A* =H1iH2. n).. H. from H1H2= H2H1 it follows by (55) and (56) that AA* = A*A. .. A hermitian matrix H= II hik IIi is.. H. From (55) and (56) we have: H1= (56) (A+A*). Conversely. Suppose that in some orthonormal basis the operators A. characterized by the following relation among its elements : h1. H*=H. (59) Therefore we define a matrix as normal if it commutes with its adjoint.k=1. 2. Then it follows from (57) that H1H2 = H2H1.. NORMAL OPERATORS IN A UNITARY SPACE 269 Proof.. This completes the proof. A unitary matrix U = I I u4k III is. ). (60) . where x1 and x2 are real.. UU* = E. as hermitian if it is equal to its adjoint. i. (57) Conversely. unitary) matrix. UU* =E correspond to the matrix equations (58) AA* = A*A. H*=H. by (59). k =1.2. The representation of an arbitrary linear operator A in the form (55) is an analogue to the representation of a complex number z in the form x1 + ix2. Now let A be a normal operator : AA* = A*A. the formulas (57) define hermitian operators H1 and H2 connected with A by (55). Then the operator equations AA*=A*A..§ 9.e. and U cor respond to the matrices A.=h{t (i.
. Then S = [x. since A and B are permutable. Bx. we establish a property of permutable operators in the form of a lemma.. x .... common characteristic vector. orthonormality of the columns of the matrix U is a consequence of the orthonormality of the rows.n). while the (p + 1)th vector Bpx is a linear combination of the preceding ones.) 25 Thus. and Unitary Operators 1. Let A be an arbitrary normal operator in an ndimensional hermitian space R.'uju. from (60) there follow the equivalent relations : (61) . o. The Spectra of Normal. § 10. WIx are characteristic vectors of A corresponding to one and the same characteristic value A. B2x. Thus we have proved the existence of a common characteristic vector of the operators A and B. 2.). As a preliminary.2. and vice versa. i1 Equation (60) expresses the `orthonormality' of the rows and equation (61) that of the columns of the matrix I1= Il I'm II...21 n A unitary matrix is the coefficient matrix of some unitary transformation (see § 7). LINEAR OPERATORS IN A UNITARY SPACE Since UU* = E implies that U*U = E. BP1x) is a subspace invariant with respect to B. Then (see § 8. Proof. .. 1. LEMMA 1: Permutable operators A and B (AB=BA) always have a Then..270 IX. Bx. Hermitian.. (62) shows that the vectors x. and in particular y..k=1. (62) Suppose that in the sequence of vectors x. the first p are linearly independent.. so that in this subspace S there exists a characteristic vector y of B: By =juy. AB'fx=2Bkx (k=0.. is a characteristic vector of A corresponding to 2. y =A o . Therefore every linear combination of these vectors.. .. . Let x be a characteristic vector of A : Ax = Ax. In that case A and A* are permutable and therefore have a common characteristic vector x1. . Bx. . On the other hand. . 7.E=8jk (i.
. . S21 T2..Ak (i.. xn can be normalized without violating (63). S11 T1. is invariant with respect to A and A*. x2. I (63) The vectors x1i x2. 2. 5. .11xt. then A and A* have the same characteristic vectors. the orthogonal complement of S. in R : R = S1 + T1.. for i. If A is a normal operator. xi 1 x2. we establish in a similar way the existence of a common characteristic vector x3 of A and A* in T3. A*x. Ax1= Aixi .§ 10. 26 Here. i.. . SPECTRA OF NORMAL. A *x1= Aixi We denote by S. x21 and R = S2 + T2. k =1.. by Lemma 1.)A. A*xk = 2kxk (xk o) . that a linear operator A has a complete orthonormal system of characteristic vectors : Axk = Akxk. (xixk)=0. n). n) . . .. (xxxk) = 80 (i. .). xn ofAandA*: Axk= Akxk. every characteristic vector of A is a characteristic vector of the ad joint operator A*. . and in what follows. Suppose now. .)ski =0 Hence it follows that (k.. We shall show that A is then a normal operator. HERMITIAN.. the onedimensional subspace containing the vector x1 (S1= [x1)) and by T. Since S. where n is the dimension of the space. the permutable operators A and A* have a common characteristic vector x2 in Tt : Ax2 = A2x2. 2. .26 Since Ak = A always implies that Ik = 4 it follows from (63) that : 1.. For let us set : Then Yr=A*x. A*x2 = A2x2 (x2& o) . Setting S2 = [x1. xt)At(xkxx) =(AkA.. conversely. AND UNITARY OPERATORS 271 (xi o) .e. we obtain n pairwise orthogonal common characteristic vectors x1. (x*y) =(xk..(xkxt)=(Axk. Obviously. if A is a normal operator. Therefore. Thus we have proved that a normal operator always has a complete orthonormal system of characteristic vectors. k =1..1 x3 and x21 x3. T1 is also invariant with respect to these operators (see § 8. we mean by a complete orthonormal system of vectors an orthonormal system of n vectors. Obviously x. i=1. Continuing this process. n) . 2.
In particular. Since a hermitian operator H is a special form of a normal operator.. 2.. A*=p(A).. (64) If A is a normal operator.: 2. Let us now discuss the spectrum of a hermitian operator. . 2. 2. that (63) holds. n) . AA* =A*A. 5. i.e. LINEAR OPERATORS IN A UNITARY SPACE yj=_4*xiAjxj=o (1=1.. then each of the operators A and A* can be represented as a polynomial in the other.. by what we have proved it has a complete orthonormal system of characteristic vectors : Hxk =Arxk... normal operator A and T is the orthogonal complement of S. these two polynomials are determined by the characteristic values of A. 2.. Thus : 3. the subspace T is invariant with respect to A*. n) ... But A = q (A*). . Therefore T is also invariant with respect to A. Then by (63) q(Ak) =Ak (k=1. 2. then T is also an invariant subspace for A. Let A be a normal operator with the characteristic values 1. n). Using the Lagrange interpolation formula. . where q (1) is a polynomial. (xkxj) =bkj (k. (66) .272 IX. A=q(A*)... Thus we have obtained the following `internal' (spectral) characterization of a normal operator A (apart from the `external' one : AA* = A*A) : THEOREM 4: A linear operator is normal if and only if it has a complete orthonormal system of characteristic values. n).. (65) From H" = H it follows that Ak=Ak (k=1.. 1=1.. S 1 T. we define two polynomials p(2) and q (A) by the conditions p(Ak)=Ak.. i. .. . But then AA*xk = Ak kxk and A*Axk= AhAkxk or (k =1. A . we have shown that a normal operator is always of simple structure. If S is an invariant subspace with respect to a.e. .. 2. n).. Then by § 8... Let S be an invariant subspace of R for a normal operator A and let R = S + T.
. unitary) matrix corresponds to a normal (hermitian..1r =1. .. . from (67). 2. Since a unitary operator U is normal. n). . From UU* =Ewe find : A.. AND UNITARY OPERATORS 273 i. H* =H. among the normal operators a unitary operator is distinguished by the fact that all its characteristic values have modulus 1. Since in an orthonormal basis a normal (hermitian. It is not difficult to see that. For from (65). all the characteristic values of a hermitian operator H are real. SPECTRA OF NORMAL. n).. it has a complete orthonormal system of characteristic vectors Uxs= Atxk. n) it follows that i. unitary) operator. conversely. HERMITIAN.§ 10. 2. (70) . 2. H*xt=Hxr (k=1. a normal operator with real characteristic values is always hermitian. We have obtained the following `internal' characterization of a hermitian operator (apart from the `external' one : H* = H) : THEOREM 5: A linear operator H is hermitian if and only if it has a complete orthonormal system of characteristic vectors with real characteristic values. we obtain the following propositions : THEOREM 4': A matrix A is normal if and only if it is unitarily similar to a diagonal matrix : A = U jj A1Ba jjnU1 (U*= U1) . (66).. and H*xr = Akxk (k =1. . n).e. where (xrxl)= ak1 (k. . Let us now discuss the spectrum of a unitary operator.. 2. (68).. Thus.. (69) Conversely.. (67) (68) U*x* = AkXk (k =1.. We have thus obtained the following `internal' characterization of a unitary operator (apart from the `external' one: UU*=E) : THEOREM 6: A linear operator is unitary if and only if it has a complete orthonormal system of characteristic vectors with characteristic values of modulus 1.. and (69) it follows that UU* = E.e.. 1=1.
.. . 2 =2. n)..... n). SskII"IU11 (U1 =U11. 2.. l =I. then (Hx. X) ?0.. x2. x). i=1. x of characteristic vectors of H : Hxk = Akxk . . as is easy to see. is a hermitian form in the variables x1.. x)>0. i=1. . If a vector x is given by its coordinates x1. . and to a positivesemidefinite (positivedefinite) operator there corresponds a positivesemidefinite (positivedefinite) hermitian form (see § 1).. and positive definite if for every vector x e o of R (Hx. . . x2. . we have (Hx. (72) PositiveSemidefinite and PositiveDefinite Hermitian Operators 1. n). (71) THEOREM 6': A matrix U is unitary if and only if it is unitarily similar to a diagonal matrix with diagonal elements of modulus 1: U=U1(I § 11. Hence we easily deduce the `internal' characterizations of positivesemidefinite and positivedefinite operators : THEOREM 7: A hermitian operator is positive semidefinite (positive definite) if and only if all its characteristic values are nonnegative (positive)... We introduce the following definition : DEFINITION 9: A hermitian operator H is called positive semidefinite if for every vector x of R (Hx. IA I =1.. (73) Then.. n).. LINEAR OPERATORS IN A UNITARY SPACE THEOREM 5': A matrix H is hermitian if and only if it is unitarily similar to a diagonal matrix with real diagonal elements : H=UII18kII1UI (U*=U1.. 2. x. 2.274 IX. setting x = Z tkxk.... We choose an orthonormal basis x1i x2. 2. . . k_1 (xkxl) = 6k! (k. x) X (k=1. . x in an arbitrary orthonormal basis...
The equation (73) holds for H with Ak ? 0 (k = 1. POSITIVESEMIDEFINITE & POSITIVEDEFINITE HERMITIAN OPERATORS 275 From what we have shown. (74). .x) (Ax. .§ 11.. 2.. 2. right moduli. Let H be a positivesemidefinite hermitian operator. and (76) it follows that : F=g(H). (AA*x. Examples of positivesemidefinite hermitian operators are AA* and A*A. (76) Then from (73).. see (168).. n). . (A*Ax. x) =(A*x.. then F is also positive definite. }'AA* and A*A are called the left modulus and right modulus of A. We define the Lagrange interpolation polynomial g (A) by the equations 9 (Ak) = ek (_ }) (k =1.. then AA* and A*A are positivedefinite hermitian operators. n). n) and define a linear operator F by the equation Fxk = ekxk (k = 1.. (77) The latter equation shows that }'H is a polynomial in H and is uniquely determined when the positivesemidefinite hermitian operator H is given (the coefficients of g(2) depend on the characteristic values of H). (74) Then F is also a positivesemidefinite operator and F2=H. . Indeed. A*x) ? 0. 3. We set ek = I Ak a 0 (k = 1. . 2. If A is nonsingular. n). . The operators AA* and A*A are sometimes called the left norm and right norm of A. are equal. If H is positive definite. it follows that a positivedefinite hermitian operator is nonsingular and positive semidefinite. Ax) .. for an arbitrary vector x. .. In this paper necessary and sufficient conditions for the product of two normal operators to be normal are established.. . where A is an arbitrary linear operator in the given space. and hence the left and 27 For a detailed study of normal operators.t 0.27 For a normal operator the left and right norms. (75) We shall call the positivesemidefinite hermitian operator F connected with H by (75) the arithmetical square root of H and shall denote it by F= FH. 2. 2.
Axi) = (A*Axt. We now consider the general case where A may be singular. since by applying this decomposition to A* we obtain A* = HU and hence A=U1H. . Note that it is sufficient to establish (78). Qt ? 0.. First of all we observe that a complete orthonormal system of characteristic vectors of the right norm of A is always transformed by A into an orthogonal system of vectors. (k=1&1). the decomposition (79) for A.) = er (xtxl) = 0 28 See (168]. Polar Decomposition of a Linear Operator in a Unitary Space. Cayley's Formulas 1. Weset: H=1/AA* (here I H Ia = i A 12 0). (79) A = U1H1. For AA* =HUU*H =IP.. but also the second factor U is uniquely determined by the nonsingular operator A. From (78) and (79) it follows that H and H1 are the left and right moduli.1=1. We begin by establishing (78) in the special case where A is nonsingular (JAJ0). A is normal if and only if in (78) (or (79)) the factors H and U (H1 and Ul) are permutable. Proof. k. of A. U =H1A and verify that U is unitary : UU* = H1 AA*H1= H11PH1=E. respectively. (Axe. Note that in this case not only the first factor H in (78). i. x. A*A =H1UI U1H1= H1. We shall prove the following theorem :28 THEOREM 8: Every linear operator A in a unitary space can be represented in the forms (78) A = HU. For let Then A*Axt = e. U. . n] . . 77.e. LINEAR OPERATORS IN A UNITARY SPACE § 12. p. are unitary operators. H1 are positivesemidefinite hermitian operators and U. where H. 2.276 IX. .xk [(xtxi) = a .
.. then it follows from (82) that (82) H2 U= UHa.. X. respectively) and that the unitary factors U and Ut are uniquely determined only when A is nonsingular. . We define linear operators H and U by the equations Uxk=zk.. . e. . From (80) and (81) we find : (81) A=HU.§ 12. if H and U commute. with nonnegative characteristic values pt.. . POLAR DECOMPOSITION IN A UNITARY SPACE. IAil ++I4ISet++e.2.. and U is a unitary operator. k. . into the orthonormal system st.. Axk) = ex (k=1. . . Hzk=etxk. . then it follows from (82) that A is normal. because it has a complete orthonormal system of characteristic vectors st. . el. e* of the linear operator A and its left modulus H=1/AA* (by (82) ei. CAYLEY 'S FORMULAS 277 Here jAxk J2 = (Axk. e* are also the characteristic values of the right modulus Ht = A* A) are so numbered that then (see [379]. . a positivesemidefinite hermitian operator. .. (83) Since H =)'H2 = g (Hs) (see § 11). Ptl+Izslset+es . 4 and ei.. 2.29 29 If the characteristic values X. This completes the proof of the theorem.. En. Here H is. 22.. (83) shows that U and H commute.. X2. ea . A*A = U1H$U. x. . by (81). . . . . t =1. . . . because it carries the orthonormal system of vectors x1.. that the hermitian factors H and Hl are always uniquely determined by x (they are the left and right moduli of A. . .. e= . ... Conversely. z2. s such that (80) Axk = eksr (ssxr) = 8kt .. . Therefore there exists an orthonormal system of vectors act. .. . x2. Thus we can take it as proved that an arbitrary linear operator A has decompositions (78) and (79). or [1531 and [296]) the following inequality of Weyl holds: I2tlse. nJ .. z2. n) . If A is a normal operator (AA* = A*A). From (78) we find easily : AA* =H2. .
. 2. where F is a hermitian operator... We define a hermitian operator F by the equations Fxk = fkxk (k =1.278 IX. with H and H. 1:5 n). n). LINEAR OPERATORS IN A UNITARY SPACE It is hardly necessary to mention that together with the operator equations (78) and (79) the corresponding matrix equations hold. l = 1. . . fn . . n). . The decompositions (78) and (79) together with (86) give the following equations : A = He{F. The decompositions (78) and (79) are analogues to the representation of a complex number z in the form z = rat. positive semidefinite.. where r = I z I and I u I = 1. are hermitian operators. the operator F is not uniquely determined by U. if F is a hermitian operator. f2. and F.. (89) 30 OF= r(F). . .. 2. (84) where the fk (k = 1. Then Uxk = e'/kxk . In (86). 2. H. (1 < k. By choosing these multiples of 2n suitably we can assume that e'fk = eifi always implies that f k = f. For F is defined by means of the numbers f k (k =1. ..H1 (88) where H. . . . . F. x2. (87) A = t'F. x be a complete orthonormal system of characteristic vectors of the arbitrary unitary operator U. 2. (86) Thus. where r ? 0 and q. n) and we can add to each of these numbers an arbitrary multiple of 2n without changing the original equations (84).. then U= OF is unitary. 2.. The decompositions (87) and (88) are analogues to the representation of a complex number z in the form z = rein. Now let x. n). (xkxi) = 8k: (k. a unitary operator U is always representable in the form (86). Note.. Then we can determine the interpolation polynomial g(A) by the equations g (e'fk) = fk (k =1. 2. Conversely. n) are real numbers... .. ... From (84) and (85) it follows that :30 U=e'F. where r(A) is the Lagrange interpolation polynomial for the function eix at the places f... are real numbers. .
For this purpose. . F=i(EU) (E+ U)1. we find : Iµ (94) Repeating the arguments which have led us to the formula (86). The formulas (94) and (95) can be modified correspondingly. H1 and F1) are permutable. we obtain from (93) and (94) the pair of inverse formulas : U =(E + iF) (E . . Similarly we can normalize the choice of F1 so that (90) F1= h (U1) =h where h (A) is a polynomial. (85).F).. and vice versa.. here the point at which carries the real axis f = T into the circle infinity on the real axis goes over into the point u = 1. we have to take instead of (93) a fractionallinear function mapping the real axis f = f onto the circle I p I = 1 and carrying the point f = ac into p = po.§ 12. and vice versa. in (88). by Theorem 8..31 31 The exceptional value . . CAYLEY'S FORMULAS 279 From (84). A is normal if and only if in (87) H and F (or. f2. POLAR DECOMPOSITION IN A UNITARY SPACE.1. . the permutability of H and U (H1 and U1) implies that of H and F (HI and F1) . These formulas establish a onetoone correspondence between arbitrary hermitian operators F and those unitary operators U that do not have the characteristic value .iF)1. Therefore. and (89) it follows that F = g (U) = g (e. provided the characteristic values of F (or F1) are suitably normalized. (91) By (90) and (91). u2 . µ on the unit circle I u I = 1. . f on the real axis into certain numbers µ1. From (93). The formula (86) is based on the fact that the functional dependence 1u =e{1 (92) carries n arbitrary numbers f 1. (95) We have thus obtained Cayley's formulas.I can be replaced by any number po (I po 1) = 1). The transcendental dependence (92) can be replaced by the rational dependence I + it 1it (93) =1..
(A+B)T=AT+BT. (88). LINEAR OPERATORS IN A UNITARY SPACE The formulas (86). Linear Operators in a Euclidean Space 1. The linear operator AT is called the transposed operator of A (or the transpose of A) if for any two vectors x and y of R: (Ax. Let A be a linear operator in R. DEFINITION 15: A linear operator K is called skewsymmetric if KT=K. and (95) are obviously valid when we replace all the operators by the corresponding matrices. y) = (x. . DEFINITION 10.x) > 0. 3. DEFINITION 14: A symmetric operator S is called positive definite if for every vector x o of R (Sx. x) > 0. DEFINITION 13: A symmetric operator S is called positive semidefinite if for every vector x of R (Sx. We consider an ndimensional euclidean space R. DEFINITION 11: A linear operator A is called normal if AAT = ATA. § 13. 2. (96) The existence and uniqueness of the transposed operator is established in exactly the same way as was done in § 8 for the adjoint operator in a unitary space. (87). DEFINITION 12: A linear operator S is called symmetric if ST=S.280 IX. (AB)T = BT AT T. AT y) . We introduce a number of definitions. 1. (aA)' = aAT (a a real number). The transposed operator has the following properties : (AT)T = A. 4.
QY) = (x.32 From (101) it follows that : Q 2 1. i. equation (100) can be written as: (x. (101) Conversely. (101) implies (100) (for arbitrary vectors X. K= 2 (AAT). . (97) where S is symmetric and K is skewsymmetric. DEI'INITION 16: An operator Q is called orthogonal if it preserves the metric of the space. From (97) and (98) we have : (98) S=. . (99) Conversely.e.. y of R (Qx. For it follows from (97) that AT = S . the socalled orthogonal group. . and orthogonal operators are special forms of a normal operator. Hence QTQ=E. IQI=±1. (99) defines a symmetric operator S and a skewsymmetric operator K for which (97) holds. y). skewsymmetric. y). k =1.2 (A+AT).. n) . LINEAR OPERATORS IN A EUCLIDEAN SPACE 281 An arbitrary linear operator A can always be represented uniquely in the form A = S + K.§ 13. Y) (100) By (96). We consider an arbitrary orthonormal basis in the given euclidean space.. Hence it follows that in an orthonormal basis a normal operator A corresponds to a normal 32 The orthogonal operators in a euclidean space form a group.K. We shall call Q an orthogonal operator of the first kind (or proper) if I Q j =1 and of the second kind (or improper) if I Q I =1. 2. Symmetric. Suppose that in this basis A corresponds to the matrix A = II ark II1 (here all the ask are real numbers). QTQy) =(x.. i.e. The reader will have no difficulty in showing that the transposed operator AT corresponds in this basis to the transposed matrix AT = ! l aT 1 1i . where aj = akj (i. if for any two vectors x. S and K are called respectively the symmetric component and the skewsymmetric component of A.
are called proper and improper according as I Q I=+ 1 or I Q I=1.iv. [262a). (170b) are devoted to the study of the structure of orthogonal matrices. finally. 2.. 2. a skewsymmetric operator K to a skewsymmetric (KT = . This extension is made in the following way: 1. where x and y are real.iy and iv = u . 3. 4. a basis of R.e. We introduce `complex' vectors u = x + iy. .. Among all the linear operators of R those that are obtainable as the result of such an extension of operators of R can be characterized by the fact that they carry R into R (AR C R) . Orthogonal matrices. we have (i w) _ (xw).u. like orthogonal operators. then it will be the set of all vectors with complex coordinates and R the set of all vectors with real coordinates in this basis.w=u+iv(x. 33 The papers [138). The vectors of R are called `real' vectors. i. i. y c R. we extend R to a unitary space R.33 Just as was done in § 8 for the adjoint operator.veR). an orthogonal matrix Q = qek II i (QQT = E). Every linear operator A in R extends uniquely to a linear operator in R : A(x+iy) =Ax+iAy. The reader can easily verify that the required hermitian metric is given in the following way : If x=x+iy.K) and. For the study of linear operators in a euclidean space R. The operations of addition of complex vectors and of multiplication by a complex number are defined in the natural way. 3. a symmetric operator S to a symmetric matrix S = 11 s:x !I i (ST = S). then the orthogonal complement T of S in R is invariant with respect to AT. then (xw) = (xu) + (Yv) + i [(Yu) . If we choose a real basis.(xv)I Setting ar = x .e.282 IX. we can here make the following statement for the transposed operator: If a subspace S of R is invariant with respect to a linear operator A. x c R.y. LINEAR OPERATORS IN A UNITARY SPACE matrix A (AAT=ATA). These operators are called real. In R we introduce a hermitian metric such that in R it coincides with the existing euclidean metric. an orthogonal operator Q to matrix K ='I k{. Then the set of all complex vectors forms an ndimensional vector space R over the field of complex numbers which contains R as a subspace.
q..ux . Then it is easy to see that Ax =.§ 13. . of the real operator A there correspond the linearly independent characteristic vectors z1. . xl = Z1 (102) (k = 1. i.. 2.. so that when it has a root 1 of multiplicity p it also has the root 1 with the multiplicity p.. A real operator A carries conjugate complex vectors x = x + iy ... 2.. We consider a real operator A of simple structure with the characteristic values : A2k1=Yk+avk.. y$.. ... We shall call the plane in R spanned by this basis an invariant plane of A corresponding to the pair of characteristic values A.Az=AxiAy (Ax. .Ayc%).. 2. JAI are real and vk 0 (k =1. Let A = p + iv. y=2i(xat).. y11 x2. to conjugate characteristic values there correspond conjugate characteristic vectors. x. vk. a. A1= ut (k =l... E = x . i. Ask=Jukivk. q. . ... s. LINEAR OPERATORS IN A EUCLIDEAN SPACE 283 In a real basis real operators are determined by real matrices. matrices with real elements. . a ] has a real basis : x=2(z+z)..e. s. X9.. ... The twodimensional space [z. . Ay=vx+ uy.. n).... xX (103) form a basis of the euclidean space R.. x2. q) .. . From Ax = Ax it follows that As = Au.IYk.. The secular equation of a real operator has real coefficients.. corresponding to these characteristic values can be chosen such that x2t1 = xk + iYk. 1. . The vectors x1.iy (x..e. where Yk. ye. l =2q+ 1. n).. then to the characteristic value X there correspond the linearly independent characteristic vectors i. .vy. x2k = xk . Then the characteristic vectors xl.... y c R) into conjugate complex vectors : Ax=Ax+iAy. 1. Here sa If to the characteristic value ).. l = 2q + 1. . . xgq}1) .
fin } T1 (T =T).. Therefore : Normal. LINEAR OPERATORS IN A UNITARY SPACE Axk= ukxk . All the characteristic values of a symmetric operator S in a euclidean space are real.... hermitiam. and unitary real operators in R. It is easy to show that for a normal operator A in a euclidean space a canonical basis can be chosen as an orthonormal basis (103) for which (104) holds. (106) The transposed operator AT of A in R upon extension becomes the adjoint operator A* of A in R.. skewsymmetric. n]. 2. . (108) A symmetric operator S in a euclidean space always has an othonormal system of characteristic vectors wih real characteristic values. q (l =2q+ 1.. l =1. For a symmetric operator S we must set q = 0 in (104). hermitian multiplied by i.. ... since after the extension the operator becomes hermitian." Therefore a real normal matrix is always realsimilar and orthogonallysimilar to a matrix of the form (105) : A=Q{'I v1 fk1 V1 #'F Vq }Q1 III (107) q Pir (Q _ QT_1 =Q). . Then we obtain : Sxt = Atxt [(xkx:) = akt. Gln (105) Thus: For every operator A of simple structure in a euclidean space there exists a basis in which A corresponds to a matrix of the form (105). .284 IX. 2.vky .. k. n) (104) Pq Vq I ' A2q+1' . symmetric.. Hence it follows that : A real matrix of simple structure is realsimilar to a canonical matrix of the form (105) : i Pi V1 Aug Vq vl Jul Vq Aug /42q+1) . 36 The symmetric operator S is positive semidefinite if in (108) all µt? 0 and positive definite if all Iut> 0.V1 lul lu1 V1 Vq k =1.311 Therefore : 35 The orthonormality of the basis (102) in the hermitian metric implies the orthonormality of the basis (103) in the corresponding euclidean metric.. Ayk =vkxk + Akyk' Axt = µixl In the basis (103) there corresponds to the operator A the real quasidiagonal matrix { .. and orthogonal operators in R after the extension become normal. ..
2q + 1. Kyt = vtxt. Thus : Every real skewsymmetric matrix is realsimilar and orthogonallysimilar to a canonical skewsymmetric matrix : 0 v1 .1=2q+1. I.. q... For a skewsymmetric operator we must set in (104) : F1= u2 = .sin q'i cos rpi .. =FAA = 0 then the formulas assume the form Kxt=vkyt.§ 13.... .. Therefore in the case of an orthogonal operator we must set in (104) : . _ ILq =. For this basis (103) can be assumed to be orthonormal.n). 2.. n). the basis (103) can be assumed to be orthonormal. it follows that : Every real orthogonal matrix is realsimilar and orthogonallysimilar to a canonical orthogonal matrix : cos 97i sin Pi ..4+vk=1.2. n) (112) From what we have shown...vl 0 Ti '0''0}Q1 (Q=Q=Q) Vt 0 I 111 0vQ i (111) All the characteristic values of an orthogonal operator Q in a euclidean space are of modulus 1 (upon extension the operator becomes unitary).. 4a = ± 1 (k=1. . Qxt = ± xt k=1..sin q?.. LINEAR OPERATORS IN A EUCLIDEAN SPACE 285 A real symmetric matrix is always realsimilar and orthogonallysimilar to a diagonal matrix : (Q=Q'1=6 (109) All the characteristic values of a skewsymmetric operator K in a euclidean space are pure imaginary (after the extension the operator is i times a hermitian operator)... I cosg9 sing. =Q11 =Q1) ! 3 .. The formulas (104) can be represented in the form Qxk = xt cos % ..q.. (k =1..u=q+1= . (110) Kxi=0 Since K is a normal operator. ... .yt sin 99t ..q. i1 (113) COS Pill (Q1 . .2. Qyt=xtsinp.+ytcos1=2q+ 1.
286
IX. LINEAR OPERATORS IN A UNITARY SPACE
Example. We consider an arbitrary finite rotation around the point 0 in a threedimensional space. It carries a directed segment OA into a
directed segment OB and can therefore be regarded as an operator Q in a threedimensional vector space (formed by all possible segments OA). This operator is linear and orthogonal. Its determinant is + 1, since Q does not change the orientation of the space. Thus, Q is a proper orthogonal operator. For this operator the formulas (112) look as follows :
Qx1=x1cosp ylsin9, Qyi=x,sin9; +yicos9),
Qx. = f X2.
From the equation Q I =1 it follows that Qx2 = x2. This means that all the points on the line through 0 in the direction of x2 remain fixed. Thus we have obtained the Theorem of EulerD'Alembert: Every finite rotation of a rigid body around a fixed point can be obtained as a finite rotation by an angle q, around some fixed axis passing through that point.
§ 14. Polar Decomposition of an Operator and the Cayley Formulas
in a Euclidean Space
1. In § 12 we established the polar decomposition of a linear operator in
a unitary space. In exactly the same way we obtain the polar decomposition of a linear operator in a euclidean space.
THEOREM 9.
Every linear operator A is representable in the form of a
product "
A=SQ
A = Q1S1
(114)
(115)
where S, S, are positivesemidefinite symmetric and Q, Q, are orthogonal operators; here S = V4AT= g (AAT ), S1= ATA h (ATA), where g (A) and h(A) are real polynomials. A is a normal operator if and only if S and Q (S1 and Q,) are permutable. Similar statements hold for matrices.
37 As in Theorem 8, the operators S and Si are uniquely determined by A. If A is nonsingular, then the orthogonal factors Q and Q, are also uniquely determined.
§ 14. POLAR DECOMPOSITION IN A EUCLIDEAN SPACE. CAYLEY'S FORMULAS 287
Let us point out the geometrical content of the formulas (114) and (115). We let the vectors of an ndimensional euclidean point space issue from the origin of the coordinate system. Then every vector is the radius vector of
some point of the space. The orthogonal transformation realized by the operator Q (or Qi) is a `rotation' in this space, because it preserves the euclidean metric and leaves the origin of the coordinate system fixed.38 The symmetric operator S (or S1) represents a `dilatation' of the ndimensional space (i.e., a `stretching' along n mutually perpendicular directions with stretching factors Qoi, 02, ... , Pn that are, in general, distinct ware arbitrary nonnegative numbers) ). According to the formulas (114) and (115), every linear homogeneous transformation of an ndimensional euclidean space can be obtained by carrying out in succession some rotation and some dilatation (in any order). 2. Just as was done in the preceding section for a unitary operator, we now consider some representations of an orthogonal operator in a euclidean space R. Let K be an arbitrary skewsymmetric operator (KT =  K) and let
Q= ex. Then Q is a proper orthogonal operator. For
QT = eKT = eK = Q1
and
(116)
I Q I = 1.39
Let us show that every proper orthogonal operator is representable in the form (116). For this purpose we take the corresponding orthogonal
matrix Q. Since I Q I =1, we have, by (113),40
38 For I Q I = I this is a proper rotation; but for I Q 1 it is a combination of a rotation and a reflection in a coordinate plane. 3e If k,, k,, ... , kn are the characteristic values of K, then p1= eke, ju, = ekt , ... , Nn = ei*n are the characteristic values of Q = eX ; moreover
n
since
f1 40 Among the characteristic values of a proper orthogonal matrix Q there is an even 1011 can be written in the form number equal to  1. The diagonal matrix cos91 sing) for p=n. sin 91 cos T
I
=eii =1, ,Yki=0.
'V ki
IQI=,U,P2...ft.
n
288
IX. LINEAR OPERATORS IN A UNITARY SPACE
cos 991 sin 99]
cos 97P sin Tj
,
,+1,...,+1}Q,l(117)
 sin 9,1 cos 9'i
sin q?P cos 9'Q
(Q3)1
(Q1=
= 01)
We define the skewsymmetric matrix K by the equation
oFl
0 92q I
,
9'i 0
Since
0
9,
 Tg 0
,0,...,o}Q1_L 1.
(118)
l
cos 9'
sin p
cos p
92
0
1=  sin q,
it follows from (117) and (118) that
Q = eK.
(119)
The matrix equation (119) implies the operator equation (116). In order to represent an improper orthogonal operator we introduce a special operator W which is defined in an orthonormal basis e1, e2, ... , e by the equations
Wet =e1, ... , Wen1 = en_1, Wen =en.
(120)
W is an improper orthogonal operator. If Q is an arbitrary improper orthogonal operator then W 1 Q and QW' are proper and therefore representable in the form eK and eKl, where K and K1 are skewsymmetric operators. Hence we obtain the formulas for an improper orthogonal operator
Q=WeK=eK,W.
The basis e1, e2,
(121)
the basis xk,yk,x, (k=1,2,...,q;l=2q+1,...,n)in (110) and (112).
The operator W so defined is permutable with K ; therefore the two formulas (121) merge into one
... , e in (120) can be chosen such that it coincides with
Q=WeK (W=WT=W1; KT=K, WK=KW).
(122)
Let us now turn to the Cayley formulas, which establish a connection between orthogonal and skewsymmetric operators in a euclidean space. The formula
§ 14. POLAR DECOMPOSITION IN A EUCLIDEAN SPACE. CAYLEY'S FORMULAS 289
Q =(EK) (E+ K)' ,
(123)
as is easily verified, carries the skewsymmetric operator K into the orthogonal operator Q. (123) enables us to express K in terms of Q :
K = (E  Q) (E + Q)' .
(124)
The formulas (123) and (124) establish a onetoone correspondence between the skewsymmetric operators and those orthogonal operators that
do not have the characteristic value 1. Instead of (123) and (124) we
can take the formulas
Q= (EK)(E+K)',
K = (E + Q) (E  Q)1.
(125) (126)
In this case the number + 1 plays the role of the exceptional value. 3. The polar decomposition of a real matrix in accordance with Theorem 9
enables us to obtain the fundamental formulas (107), (109), (111), and
(113) without embedding the euclidean space in a unitary space, as was done
above. This second approach to the fundamental formulas is based on the
following theorem :
THEOREM 10: If two real normal matrices are similar,
B=T'AT (AAT= ATA, BBT =BTB, A=A, B=B),
then they are realsimilar and orthogonallysimilar :
(127)
B= Q'AQ (Q =Q =QT1)
(128)
Proof : Since the normal matrices A and B have the same characteristic values, there exists a polynomial g(2) (see 2. on p. 272) such that
Therefore the equation
AT =g(A),BT=g(B)
g(B) =T'g(A)T,
which is a consequence of (127), can be written as follows:
BT=T'ATT.
(129)
When we go over to the transposed matrices in this equation, we obtain : B = TTATT1. (130)
A comparison of (127) with (130) shows that
TTTA= ATTT.
(131)
290
IX. LINEAR OPERATORS IN A UNITARY SPACE
Now we make use of the polar decomposition of T : T = SQ,
(132)
where S = TTT = h (TTT) (h (A) a polynomial) is symmetric and Q is real and orthogonal. Since A, by (131), is permutable with TTT, it is also permutable with S = h (TTT) . Therefore, when we substitute the expression for T from (132) in (127), we have:
B'= Q1S'ASQ =Q1AQ.
This completes the proof. Let us consider the real canonical matrix
!u1 vi 4
S
Vl Pill
..., '
1uq V9
1'q 1uq
,
92q+i, ..., Yn).
(133)
The matrix (133) is normal and has the characteristic values µl + iv1, ... ,
µq ± ivq, iu2q+1, ... , ii,,. Since normal matrices are of simple structure, every
normal matrix having the same characteristic values is similar (and by Theorem 10 realsimilar and orthogonallysimilar) to the matrix (133). Thus we arrive at the formula (107). The formulas (109), (111), and (11.3) are obtained in exactly the same
way.
§ 15. Commuting Normal Operators In § 10 we have shown that two commuting operators A and B in an
ndimensional unitary space R always have a common characteristic vector. By mathematical induction we can show that this statement is true not only
for two, but for any finite number, of commuting operators. For given m pairwise commuting operators Al, A2,. . . , A. the first m  I of which have a common characteristic vector x, by repeating verbatim the argument of Lemma I (p. 270) (for A we take any A{ (i=1, 2, ... , m1) and for B we take Am), we obtain a vector y which is a common characteristic vector of
A,, 42,...,A1.
This statement is even true for an infinite set of commuting operators, because such a set can only contain a finite number (< n2) of linearly inde
pendent operators, and a common characteristic value of the latter is a
common characteristic value of all the operators of the given set.
2. Now suppose that an arbitrary finite or infinite set of pairwise commuting normal operators A, B, C, ... is given. They all have a common characteristic vector x1. We denote by Ti the (n1)dimensional sub
§ 15. COMMUTING NORMAL OPERATORS
291
space consisting of all vectors of. R that are orthogonal to xI. By § 10, 3. (p. 272), the subspace TI is invariant with respect to A, B, C, ... . Therefore all these operators have a common characteristic vector x2 in T1. We
consider the orthogonal complement T2 of the plane [XI, x21 and select in it a vector x3j etc. Thus we obtain an orthogonal system xI, x2i ... , x of common characteristic vectors of A, B, C, .... These vectors can be normalized. Hence we have proved :
THEOREM 11:
operators A, B, C,
XI, Z2, ... , a,,:
... in a unitary space R is given, then all these operators
If a finite or infinite set of pairwise commuting normal
have a complete orthonormal system of common characteristic vectors
Axi =,2izi, Bxi = 2izi, Czi = A; zi, ... [(zizk) =bjk; i, k =1, 2, ..., n]. (134)
In matrix form, this theorem reads as follows : THEOREM 11': If a finite or infinite set of pairwise commuting normal matrices A, B, C, ... is given, then all these matrices can be carried by one and the same unitary transformation into diagonal form, i.e., there exists a unitary matrix U such that
A=U{A1, ..., Rri}UI, B=U(A', ...,
A' ) U1,
(135)
C=UP1, ..., A }U_1,... (U=U*).
Now suppose that commuting normal operators in a euclidean space R are given. We denote by A, B, C, ... the linearly independent ones among them (their number is finite). We embed R (under preservation of the metric) in a unitary space A, as was done in § 13. Then by Theorem 11, the operators A, B, C, ... have a complete orthonormal system of common characteristic vectors zI, z2, ... , z, in R, i.e., (134) is satisfied. We consider an arbitrary linear combination of A, B, C, ... .
P=aA+pB±yC +
For arbitrary real values a, / , y, .
inland
. .
P is a real (PR c R) normal operator
(136)
.
Px, = Afz,, Af = aA; + t + yet + .. .
[( xfxk)
= 6 A.;
9,
k = 1, 2, ..., n).
The characteristic values A, (j =1, 2, ... , n) of P are linear forms in a, fl, y, .... Since P is real, these forms can be split into pairs of complex
vectors, we have
conjugates and real ones; with a suitable numbering of the characteristic
292
IX. LINEAR OPERATORS IN A UNITARY SPACE
A2k=MkiNk, Al=M1 A_k_I =JIk+iNk, l = 2q + 1, ..., n), (k = 1, 2, ..., q;
(137)
where Mk, Nk, and M, are real linear forms in a, B, y, ... .
We may assume that in (136) the corresponding vectors Z2k_1 and %2k are complex conjugates, and the z1 real :
Z2_1= xk + iyk , z 2k = xk  iyk , z1= x1
(138)
(k=1, 2, ..., q; 1=2q+1, ...,n).
But then, as is easy to see, the real vectors
xk, yk, x1
(k=1, 2, ..., q; 1=2q+ 1, ..., n)
k =1, 2, ..
(139)
form an orthonormal basis of R. In this canonical basis we have:"
Pxk = Mkxk  NkYk
Pyk = Nkxk + MkYk Px1 = M1x,
,
q
(l = 2q + 1, ... , n
(140)
Since all the operators of the given set are obtained from P for special values of a, f4, y, ... the basis (139), which does not depend on these parameters, is a common canonical basis for all the operators. Thus we have proved :
THEOREM 12: If an arbitrary set of commuting normal linear operators
in a euclidean space R is given, then all these operators have a common
orthonormal canonical basis Xk, yk, x1:
Axk = flkxk  vkYk
Ayk = VkXk + Nkyk
Bxk = /kxk  ykyk , ... ,
By,
vkxk + f4kyk ,
1
... ,
(141)
Ax1 = µ1z1;
Bxt =
1,
...
.
We give the matrix form of Theorem 12: THEOREM 12'. Every set of commuting normal real matrices A, B, C,... can be carried by one and the same real orthogonal transformation Q into canonical form
AQIII11
pQ/I III
}Q(142)
1
 ve /L2
Ii, P2o+1, ... , /in IV ,
+' The equation (140) follows from (136), (137), and (138).
l=2q+1.§ 15.. If one of the operators A. q.. C... COMMUTING NORMAL OPERATORS 293 Note. ... (matrices A...)say A (A)is symmetric. n). p.. 2. then in the corresponding formulas (141) ((142)) all the v are zero. In the case of skewsymmetry. In the case where A is an orthogonal operator (A an orthogonal matrix). . ..= f 1 (k=1... all the a are zero. B. B. we have µk COB TkYvt=Bin Tk. C.
. (2) If A = II aik II i is a real symmetric matrix. y=(yi.... yn)) (4) I The sign T denotes transposition. Ch d1. . A(x.. .CHAPTER X QUADRATIC AND HERMITIAN FORMS § 1. {.. y1. xn).. x) _ ' a4x{xk. x2. 294 . . i. 2. then by the bilinearity of A (x. Transformation of the Variables in a Quadratic Form 1. x2...k1 Z aikx$yk or (3) . .. .xixk (ack= al. x) = xTAx . k =1. ym are column matrices and c1i c2. y) (see (4)) . .. . y2... x. The form is called singular if its discriminant is zero. The determinant I A I = I aik 11 is called the discriminant of the quadratic form A(x. In (2) the quadratic form is represented as a product of three matrices: the row xT. In this chapter we shall mainly be concerned with real quadratic forms.. x). . the square matrix A.. n) where A= II aik 1171 is a symmetric matrix. dm are scalars.. ...k1 a..) by x and denote the quadratic form by R A(x. A quadratic form is a homogeneous polynomial of the second degree in n variables x1. d2. x1.. x.. x2i . If we denote the column matrix (x1. then the form (1) is called real. To every quadratic form there corresponds a bilinear form A (x. (x=(xl.k1 (1) then we can write :1 A(x. . . . y) _{. . y)=xTA1 If x1. and the column x. A quadratic form always has a representation n (. .
in (Ax.. y) and (x.n). 1:0 and T is the transforming matrix: T= t{k II i . y) _ (Ax. x2i .. . Substituting the expression for x in (2). A(x. .. then for arbitrary vectors n n x =E x{e{..ek) (i. xn) and = (S1. this transformation looks as follows : (6) (6') are column matrices: x = (XI... x) where A A=TTAT.{k Ii 1 of the d..2. (7) The formula (7) expresses the coefficient matrix A = I a. yi) {+1 I1 If A is an operator in an ndimensional euclidean space and if in some orthonormal basis e1.ko1 X 26{4 in terms of the coefficient matrix of the original form A = II a{k II i and the transformation matrix T= II t{k II z It follows from (7) that under a transformation the discriminant of the form is multiplied by the square "of the determinant of the transformation : s In A (x. x) _ (Ax. td1 y `{a1 y{e{ we have the identity2 A(x.§ 1. we obtain from (6') : Here x. A (x. Ax). k1 In matrix notation. e2. In particular.2.. the parentheses form part of the notation. ... en this symmetric operator corresponds to the matrix A = II a{k 11' 1'. . 2. 2. .. Ay).. Ay). . where a{k= (Ae{. Let us see how the coefficient matrix of the form changes under a transformation of the variables : n X4 =X t{k4k (i=1... transformed form A (. they denote the scalar product.k=1. TRANSFORMATION OF VARIABLES IN QUADRATIC FORM I 295 (5) m ddyt) m A 2'c{ddA (x{. y).n). x) = (x. y) = (x.
. . . (j = 1. Thus. x by the formulas' {=Xt Then.. Reduction of a Quadratic Form to a Sum of Squares. . all these matrices have one and the same rank.296 X. such linear forms . x) can be represented in an infinite number of ways in the form AaX ri . x2i ... ... § 2.. the rank of the coefficient matrix remains unchanged (the rank of A is the same as that of A). 3 see p. .. $2. at{xb (i=1. ..9 connected as in formula (7). 2.. are called congruent. . a second invariant is the socalled `signature' of the quadratic form. .2.. . In the real case. r) are linearly independent real linear forms in the variables x1. A real quadratic form A (x. r) and r_i X that r < n)... QUADRATIC AND HERMITIAN FORMS IAI=IAIITI2 (8) In what follows we shall make use exclusively of nonsingular transformations of the variables (I T I 0).. x (so Let us consider a nonsingular transformation of the variables under which the first r of the new variables $1.. (9) where a{ 9& 0 (i =1. n) are linearly independent and then setting =X1 (j =1. with I T 10.. %. (i=1. . 2.Y. 17. DEFINITION 1: Two symmetric matrices A and .. . X 2 . .2. X. n). 2. a whole class of congruent symmetric matrices is associated with every quadratic form.i. . are connected with x1. . r) * We obtain the necessary transformation by adjoining to the system of linear forms 8.. that the forms I.l. We shall now proceed to introduce this concept. As mentioned above.. .. . Under such transformations.. . The rank is an invariant for the given class of matrices. as is clear from (7).. The Law of Inertia 1. . .3 The rank of the coefficient matrix is usually called the rank of the quadratic form. in the new variables.. the rank of the form.
. . h. 0. ..<0.. ) a{ { and therefore . x) as a sum of independent squares'. but also so is the number of positives (and. bl>0.. x) in the form of a sum of independent squares A (x.... %. Proof.. x2i . 0). .. . THEOREM 1 (The Law of Inertia for Quadratic Forms) : In a representation of a real quadratic form A (x. a2. x) = I biY{ ia1 and that a1>0. REDUCTION TO SUM OF SQUARES.Y. x) in the form (9). . Hence : The number of squares in the representation (9) is always equal to the rank of the form. xn values that satisfy the system of r .aa+1 <0. LAW OF INERTIA 297 A (x. say g < h..bh+l <0.=0. %. a... .. i1 (9) the number of positive and the number of negative squares are independent of the choice of the representation.<0.+1=0.a2>0. Then in the identity a..b2>0.x)=JaiXi. another representation of A(x..a.... Suppose that g ...bh>0. A(x...'X9'=0' Y.... X2=0.. Let us assume that we have..X b{Yi (10) we give to the variables x1. 2.aa>0..... 6 By a sum of independent squares we mean a sum of the form (9) in which all at ¢ 0 and the forms %1. (11) 5 By the number of positive (negative) squares in (9) we mean the number of positive (or negative) a.(h .A = (a1. in addition to (9).. .. the number of negative) squares. But the rank of A is r.§ 2.b..8{ =. hence. x) = A (E.g) equations %1=0.. are linearly independent. We shall show that not only is the total number of squares invariant in the various representations of A(x. .
.. a{ I can be absorbed into the form A(x.. .x)I). or 0: A=TT(+1.. 1. X. x) to the canonical form A (13) Hence we deduce from Theorem 1 that : Every real symmetric matrix A is congruent to a diagonal matrix in which the diagonal elements are +1... 0 1 b (14) In the next section we shall give a rule for determining the signature from the coefficients of the quadratic form.X < 0. x). Note that in (9) the positive factor Xs (i=1. the assumption g is proved.. since otherwise the equations X9+1= 0..(h ig) equations (11). . 8 See footnote 4.0)T.x).. . ' Such values exist.298 X. DEFINITION 2: h has led to a contradiction.... x) is called the signature of the form A (x. .+I.+1. are independent. since r=v+v.=Xi+X2+._... 2..Xr. QUADRATIC AND HERMITIAN FORMS and for which at least one of the forms X.. . . we reduce A(x.. = 0 would be consequences of the r .... The rank r and the signature o. 8:.... and the theorem The difference a between the number n of positive squares and the number v of negative squares in the representation of A(x. .1.1..+XnXn+1. = 0 ..' For these values of the variables the lefthand side of the identity is X a.. . = 0 and hence all the equations X1= 0..X k1 h bkY't2 0. and the righthand side is . = o[A(x.. X. v=2v.0. r). because the linear forms X. This is impossible. does not vanish. r).. Then (9) assumes the form (12) Settings 4{=X{ (i=1. . .. . determine the numbers a and v uniquely. X. 2. X. X. Thus.. (Notation: o.
in (16). and the second contains x. METHODS OF LAGRANGE AND JACOBI 299 § 3. but a. x) does not contain these variables. .§ 3. Then we set g < n) the diagonal coefficient aD. We shall describe here two reduction methods : that of Lagrange and that of Jacobi. whereas A2(x. of the independent linear forms (17) ). x) does not contain the variable x.. The Methods of Lagrange and Jacobi of Reducing a Quadratic Form to a Sum of Squares It follows from the preceding section that in order to determine the rank and the signature of a form it is sufficient to reduce it in any way to a sum of independent squares.h A (x.k + ank) xk]' ' 2ahy [?(a. since the first contains xh but not x.L. 1. x) k1 (15) and convince ourselves by direct verification that the quadratic form Al (x. are linearly independent. We consider two cases 1) For some g (1 zero. 2) a = 0 and ahh = 0. (16) k. [ The forms n 1 0. the forms within the brackets are linearly independent (as sum and difference. but not xh. respectively. x) = a (} akxk)s+ Al (x. This method of separating out a square form in a quadratic form is always applicable when there is a nonzero diagonal element in the matrix A = II aik 1111'. Therefore. is not equal to A (x.. Lagrange's Method. x). as is easy to verify.ahk)xk} n +A2 (x. Let a quadratic form A(x.. and xh.x)akx{xk be given. Therefore we have separated out two independent squares in A (x.1 apkxk k akkxk (17) Each of these squares contains x. x) ` 2ah. 1 Then we set : " _ n £ (a. x).
A(x.300 X. k 1 2 .x. Note that the basic formulas (15) and (16) can be written as follows A1.. Jacobi's Method. We denote the rank of A(x.. We apply formula (16') with g = 2 and h = 3: Al (x.xzx3+zjs+ 2 r=4. x) = 2 x13 .x).L (X. (x2x22z4)'+As(x. Q=2.. We apply formula (15') with g = 1: 1 A(x.x)=4xi+xi+x1+x24x1x24x1x2+4x1x4+4x2x34x=x4.. since at each stage the square that is separated out contains an unknown that does not occur in the subsequent squares. k) 12 0 (k=1.x)16(8x14x24x3+4x4)'+A. 2 A2 (x. p (15) (16') ±A)2 QAadxA/ + A2(z x).(x.[(4 Example.2.x). z) = 2 xZ . Then the symmetric matrix A = 11 aqk 1I i can be reduced to the form . A(x. (X2+xs)'. QUADRATIC AND HERMITIAN FORMS By successive application of a combination of the methods 1) and 2).1 (2x32x24z4)2 +A3(z..(z. Finally..kI askxlxk by r and D4=A . r). x) by means of rational operations to a sum of squares. we can always reduce the form A(x. x)= 8 (2x2+2x3)2. Moreover. x) = where 2 (x3+x3)2. 2 2.. .x)=4a_(a a) +A1(x.x) =(2x1x2xs+x4)'+A. the squares so obtained are linearly independent. A(x. where Al (x.x)=(2x1.x).2 x2x4 + 2 23x4 .. 2x4)1+ 2xt. x) assume that (.
.0.* . r). . .§ 3. g. 1 1 g211... p1) 1 (q = p... (19) In particular. 2.. 0 . 9 =DpPt (p=1. p1 (1 2 . 2: (22) Without infringing (21) we may replace some of the zeros in the last n . § 2. p1 p\ t1 2 ...0 *. (20) In Chapter II... .... By such a replacement we can make G into a nonsingular upper triangular matrix 912 . 0 0 11 ..... 2.. § 4 (formula (55) on page 41) we have shown that 41=GTDG..... Do= 1).. 0 ..... gm (18) . . ..._i 911 "' 0 _ 1 1 . gin 92n 0 922 .... (23) .... * 9 See Chapter II... METHODS OF LAGRANGE AND JACOBI 911 301 912 . gm (1T1 0). where D is the diagonal matrix : (21) D= I D1 t D .0. p.gin 922 . . .. ..gn 0 0:.1.. . . by Gauss's elimination algorithm (see Chapter II. ..... .0 g"..r rows of G by arbitrary elements.. § 1).. r D. p =1. r. n.. .. p +..... 92n T= 0 0 0 0.I q 2 . .. 0 . G= 0 0 110 0 0 0 . . ... The elements of 0 are expressed in terms of the elements of A by the wellknown formulas9 9P9 _ A(1 2.
. . Do= 1) r (27) are intros Iced. (k= 1. 9kk = Dkk 1 T10 and there fore Xk contains the variable xk. k+ixk+1 + ' ± 9knxn (k =1. ..9kkxk + 9k.. 4344. x) under the transformation =Tx. . QUADRATIC AND HERMITIAN FORMS The equation (21) can then be rewritten : A = VDT. X. 2. .. the linearly independent forms Yk=Dk_1Xk (k=1. .. 11 Another approach to Jacobi's formula. Then Jacobi's formula (26) can be written as: v Xks Dk_1Dk. . r) (29) where 70 We regard D(E... which does not occur in theforms Xk+1. in [171. X. X.. skn. r).. (26) k1 This formula gives a representation of A (x.. 2.. pp.. . X_. Stq.Cknxn (k =1. 2. are linearly independent forms... . .. A (x.... r. From this equation it follows that the quadratic form10 D r !?k:I! DL (24) Eg ` " 4k k kel gkk (=(1. But we can also convince ourselves directly of the independence of the forms X. ..n) Do=1) goes over into the form A (x.. according to (20). For.l2 Jacobi's formula is often given in another form. r).. x) = (28) Here Yk = okkxk + Ck. which does not depend on (21). x) in the form of a sum of r independent squares. Instead of Xk (k = 1. Hence X.. . x) is of rank r. Since sk = Xk.. . r). ... ) as a quadratic form in the n variables 1... 3. 1. .. (25) we have Jacobi's Formul"a" A(x' X2 r Dk_1 s x)=YDAXki 9kk k (Do =1) . .. can be found.302 X. 2. for example. Xk . . X.4.. 12 The independence of the squares in Jacobi's formula follows from the fact that the form A (x. k+lxki1 + . . . .
922 = 1.2 x1x4 .. v = V (1.k1 q k =1. Jacobi's formula (26) yields: A(x. A (x. D. . (32) i. Dr. k 1 k 1 2. n = P(1.r)... Jacobi's formula (28) yields the following theorem: THEOREM 2 (Jacobi). D2t .. x)=(xl2x2+xsx4)2(xsxs+2x4)s. Example.. respectively. x) coincide. D1. If for the quadratic form n A (x. with the number P of permanences of sign and the number V of variations of sign in the sequence 1. Dr)..D2..§ 3. .D1. x) _ F aixxxx c. k . 911 = 1. D1i D2. We reduce the matrix 1 2 1 1 3 3 4 A= 2 3 0 1 1 4 1 3 1 to the Gaussian form a= 0 1 1 o 2 0 0 0 0 0 o 0 Hence r = 2.6 x2xs + 8 xx4 + 2 xsx4 . (33) . .. D2. 2... . METHODS OF LAGRANGE AND JACOBI 303 (30) ekg= A 1 2 . r) ...2 + 3 x= .. and the signature v=r2V(1. k)' 0 (k= (31) holds..3 xZ .1 of rank r the inequality Dk=A(1 2 ...4 xlx2 + 2 xlxs . x) = x. D1. ..Dr)..e. then the number n of positive squares and the number v of negative squares of A (x. ...
. when k±! > 0 if Dk=Dk+1=0. in both cases. but not three determined by the use of the formula in succession. 17 § 4. when a2> 0. In this case. when Dk+i <0... As a corroborating example. D4 = .=D2=D. Here D. Dr_. we can take the form ax + axe + bx2 + 2 axlx2 + 2 ax2xe + 2 axlxg = a (x1 + x2 + x3)2 + (b .D.. x) = 2 a. D1. (34) 2.a. This is shown by the following example: A (x..Dr) omitting the zero Dk provided Dk_1Dk+1 V (DA. class of positive quadratic forms. When three consecutive zeros occur in D1. Positive Quadratic Forms 1.<0..2V(1. and setting 1. Dr a=r. xlx4 + a2x2 + asx2 (a1a3as 0) .. do not determine the signature of the form. D4 < 0. a3 > 01 0 But 13. Note 3.D1.km1 Z aikxixk is called n positive (negative) semidefinite if for arbitrary real values of the variables : A.304 X. (35) 13 This rule was found in the case of a single zero Dk by Gundenfinger and for two . (<0). x) i. Dy_. 0.=O.."t Note 2. Dk+2) = 0.. .=O. then the signs ofD1iD2.. If D. If in the sequence 1. D2. QUADRATIC AND HERMITIAN FORMS 0 there are zeros.. DEFINITION 3: A real quadratic form A (x. but D. We state this rule without proof. ...a) x2 . then the signature of the quadratic form cannot be immediately determined by Jacobi's Theorem._1 0. In this section we deal with the special.(x.D2. Dk+l. the signs of the nonzero Dk do not determine the signature of the form. Dk. . then the signature can be Note 1.. but important. when a2<0.aia2a3 v= J 1.x)?0 successive zeros Dk by Frobenius [1621... .
(36) The class of positive (negative) definite forms is part of the class of Let A(x. a2i . But then A(x. v=0). In other words: A positivesemidefinite form is positive definite if and only if it is not singular.§ 4. not all zero. x) would have a negative value for these values of the variables. then A(x. such that all the %i would be zero. .. 0. (38) For if any ai were negative. r) are positive. x. . . conversely. 2. (< 0). . The following theorem gives a criterion for positive definiteness in the form of inequalities which the coefficients of the form must satisfy. tions a=r (a=r. We shall use the notation of the preceding section for the sequence of the principal minors of A : Now let A(x.=X. But then by (37) A (x. . (x r0) A(x. x) > 0 positive (negative) semidefinite forms.. o. we could find values of x1. x) be a positivedefinite form. .. X. Thus. 2.ks l positive (negative) definite if for arbitrary values of the variables. . . and by assumption this is impossible. all the squares must be positive ai > 0 : (i = 1. x for which Xl_. if in (37) r =n and all the a1. (37) In this representation... POSITIVE QUADRATIC FORMS 305 DEFINITION 4: A real quadratic form A(x.. It is easy to see that. x) f_1 a1X? . conversely. From the positive definiteness it follows that r = n. x) is positive semidcfinite.. 2. not all zero.. For if r < n. x) is also positive semidefinite. x) _ aikxixk is called i. then we could select values of x1i x2. x) = 0 for x .. Therefore it is representable in the form (37).. . . an are positive. x) be a positivesemidefinite form. r). x2.. where all the ai (i = 1... it follows from (37) and (38) that the form A(x. It is clear that.xi1=xi+1=.=0. We represent it in the form of a sum of linearly independent squares : A (x. Then A(x.. x) is a positivedefinite form. a positive semidefinite quadratic form is characterized by the equa and this contradicts (36)..
>0.306 X. QUADRATIC AND HERMITIAN FORMS I all a12 . 1 0 (p=1. D1 _0.. We are now in a position to apply Jacobi's formula (28) (for r= n). Hence the inequality (39) follows.2.e. all the remaining principal minors are then also positive.2.Da?0..kelaikxixk is positive definite. . D2= all a21 a12 a22 a21 Ia1 a".n).. x) is obtained from A(x. ii iNote. we have Di>0..2.=. when the successive principal minors of a real symmetric matrix are positive. Since every principal minor of A can be brought into the top left corner by a suitable numbering of the variables..n).. .. we have the COROLLARY : In a positivedefinite quadratic form A (x.. x) = .)>0 ilii(1Si1<i2<. x) if we set in the latter xP+.<iPSn.D2?0. . a22 . From the fact that A(x...p=1. a.k1 (p =1. If the successive principal minors are nonnegative...... 14 The form A. it follows that the `restricted' forms14 p Proof... (39) Jacobi's formula (28)... n) are also positive definite.D. Dn= I A.. 15 Thus.D2>0.. and the theorem is proved.. D1D2>0.. i... The necessity of (39) is established as follows.Eaikxixk. x) _Y akx{xk i. But then all these forms must be non singular. (40) (p=1.. a2n D1=a11. (x. t....... The sufficiency of the conditions (39) follows immediately from AP (x. I THEOREM 3: A quadratic form is positive definite if and only if D1>0. D2D$>0. 2..n). kl all the principal minors of the coefficient matrix are positive:" A(::::. . Since all the squares on the righthand side of the formula must be positive. x) i...
. We introduce the auxiliary form A. Proceeding to the limit for e+0. (x.n).<i n. Suppose.. we have the following theorem.. x). <ipSn. For. x) =A (x. The conditions for a form to be negative semidefinite and negative def inite are obtained from (39) and (41)..n).2.. ... ip / n (41) Proof. ' \ it i2 'p)?0 (15i1<i2<. i1 (e<0). n THEOREM 4: A quadratic form A(x. x). (x.. conversely. . n). A. respectively. x) is positive definite. x) _ . This completes the proof. (x. x) ?0.. so that we have the inequality (cf. ip ?ep>0 A.. However. POSITIVE QUADRATIC FORMS 307 it does not follow that A(x.k.p=1. we obtain (41).A(x.. x) is positive definite Proceeding to the limit for e + 0 we obtain : A (x. 2. But then (by Theorem 3).i alkxixk is positive semiti. but is not positive semidefinite. x) > 0 (x # o) . Corollary to Theorem 3) : As(il till2 i2 ip iP)>0 (1 Si1<i2<..I definite if and only if all the principal minors of its coefficient matrix are nonnegative : A(sly2.0 The fact that A (x. x) is positive semidefinite implies that A. the form a11x1 + 2a12x1x2 + a22x2 Q s in which a11= a12 = 0.. Obviously lim A. is.§ 4.. (x. when these inequalities are applied to .... that (41) holds. s. Then we have As(ili2.p=1. x) is positive semidefinite.. (x. )=Ep+ Sl i2 . .. a22 < 0 satisfies (40).
. in greater detail.. (42) THEOREM 6: A quadratic form A (x... x) is negative semidefinite if and only if the following inequalities hold: (.. there exists a real orthogonal matrix Q such that A =QIAQ (A = 11 Ai8ik jji... x) is negative definite if and only if the following inequalities hold: Dl<0. We consider an arbitrary real quadratic form A (x. i.e. n (QQT = E) (45) xi = k_1 gik k 4 gigk.308 X. i. iP I ? 0 11 22 (1 S i1 < i2 < . k =1.(1)"Dn>0. § 13) it is orthogonally similar to a real diagonal matrix A. Therefore (see Chapter IX. (45 ) the form A(x. (43) ip JJ § S. x) i. k_1 aikxixk . x) goes over into n (46) . are the characteristic values of A. Reduction of a Quadratic Form to Principal Axes 1. QUADRATIC AND HERMITIAN FORMS THEOREM 5: A quadratic form A(x. . .. .D2>0.D3<0. . Since for an orthogonal matrix QI = QT. .1)rA (i1 2 . 2. (44) Here Ai.. .. n) . p =1... < a n . = 8a . 2. A. £2. it follows from (43) that under the orthogonal transformation of the variables x=Q or. Its coefficient matrix A= 11 aik I!i is real and symmetric.... QQT =E) .
x. But then at some intermediate stage this characteristic value must pass through zero. Ai a{ c e.. .. The reason for this name is that the equation of a central hypersurface of the second order n i. x) to the canonical form (46) is called reduction to principal axes. It follows from (46) that the rank r of A(x.. to If IQ I=1. The signature can only change when some characteristic value changes sign. i=1. we have the following proposition : If under a continuous change of the coefficients of a quadratic form the rank remains unchanged. . ... n1). where A1.n (48) If we regard x1. then the signature also remains unchanged. 2. A. . 1R=$.. 0) (47) under the orthogonal transformation (45') of the variables assumes the canonical form fi (=. .. and the `rotation"' of the axes is brought about by the orthogonal transformation (45). REDUCTION TO PRINCIPAL AXES 309 THEOREM 7: Every real quadratic form A(x.. the reduction to principal axes can always be effected by a proper orthogonal matrix (I Q I = 1).n. x) is equal to the number of nonzero characteristic values of A and the signature o is equal to the difference between the number of positive and the number of negative characteristic values of A. Here we have started from the fact that a continuous change of the coefficients produces a continuous change of the characteristic values. as coordinates in an orthonormal basis in an ndimensional euclidean space. This follows from the fact that.xk = c (c = const.. 2. A2. . x2. Hence. are the coordinates in a new orthonormal basis of the same space... The new coordinate axes are axes of symmetry of the central surface (47) and are usually called its principal axes. x) . 287). then (45) is a combination of a rotation with a reflection (see p. k1 Z aikx.. 2.. and this results in a change of the rank of the form. are the characteristic values of A= II ak The reduction of the quadratic form A(x. in particular. then S1. we can perform the additional transformation {=Ei (i=1. However.§ 5. . without changing the canonical form..= l... 2i . .klatkxixk can be reduced to the canonical form (46) by an orthogonal transformation.
..n): Azk=AkBzk (k=1.2. Since the matrix A  is singular..n).AB(x.x)AB(x.2. ..1 n determine the pencil of forms A (x. the pencil A(x.x). and the other the kinetic energy. x) always has n real roots Ak with the corresponding principal vectors zk = (z... x) . The study of a system of two such forms is the object of this section.1 afkxixk and B (x.. (50) .. there exists a column z = (z1. (i. . znk) (k=l. The second form is always positive 1.. The equation I AAB I =0 is called the characteristic equation of the pencil of forms A(x. If the form B (x.kxixk i. x) . x) . QUADRATIC AND HERMITIAN FORMS § 6.. We denote by A some root of this equation. The following theorem holds: THEOREM 8: The characteristic equation The number A will be called a characteristic value of the pencil I AAB I =0 of a regular pencil of forms A (x. z2i . x) (A is a parameter). x) is then called regular. x) and z a corresponding principal column or `principal vector' of the pencil. Two real quadratic forms A (x. of the system.AB(x. x) is positive definite......k.310 X. ko.k=1. x) i. z2k. k+. Az=A0Bz (z:71 o such that A(x..2B (x.2. definite.zk)=bik are satisfied..AB(x. Pencils of Quadratic Forms In the theory of small oscillations it is necessary to consider simultaneously two quadratic forms one of which gives the potential. x) = ' b. x) .n) (49) These principal vectors zk can be chosen such that the relations B(zi.
n).. since D = BIA and DT=AB1. PENCILS OF QUADRATIC FORMS 311 Proof. and D = B1A there correspond in this basis linear operators in R : A. e" is. as a product of two symmetric matrices. then the properties 1. .. . ) y"). has real characteristic values. has characteristic columns (vectors) z'.. e_. x) : (xx) = B (x. therefore. 18 Since the basis e1. . B... z" corresponding to these characteristic values and satisfying the relations (50). . 284). i1 y =i1' yiei . but the original basis el. e is not orthonormal. the operators A and B to which..5. In this space we fix a basis el. y2. 2. our theorem states that the matrix (51) D = B1A 1. However. We have obtained an ndimensional euclidean space R. by means of the positivedefinite bilinear form B(x. and introduce a scalar product of two arbitrary vectors x = 2' x{et.§ 6. y = (y1. not orthonormal. in general. We observe that (49) can be written as : BtAzk=Akzk (k=1. and 2. e.. . D. euclidean. 243) and is.. we introduce an ndimensional vector space R over the field of real numbers... x) = xT Bx . 2. the symmetric matrices A and B correspond. (53') where x and y are columns x = (x. . It is easy to verify that the metric so introduced satisfies the postulates 1. . .k1 bjtxcyk = xT By (53) and hence the square of the length of a vector x by means of the form B (x.. (p. e2. x2. and 3. and D=B1A. . would follow immediately from properties of a symmetric operator (Chapter IX. y) : n (xy) = B (x. are not necessarily symmetric themselves.. z2. . . x"). ... .. (52) has simple structure.. To the matrices A. is not necessarily itself symmetric. y) _i. B. Thus.. e2." In order to prove these three statements. p.18 17 If D were a symmetric matrix. in this basis.
. For suppose that n 'r C. z" are linearly independent. This completes the proof.. .. . (54) .. z2.. x). Dy) .AB (x. and a complete orthonormal system of characteristic vectors zl. . x) . n)... . k1 (55) Then foreveryi (15i<n). . n Then all the cr (i = 1.* = 0.. 2. k= 1... an (seep.. A square matrix formed from principal columns z1. (x. y) = (Dx)TBy= xTDTBy= xTAB1By= xTAy and x= (x1. z")=IIzrk1fi will be called a principal matrix for the pencil of forms A (x. (Dx. . %2. (1. z". x2. zk)=cc.'A has real characteristic values Al. .'s Indeed. . QUADRATIC AND HERMITIAN FORMS We shall show that D is a symmetric operator in R (see Chapter IX. z2k. The symmetric operator D = B.. . 284.. § 13).. ... n k1 cxzk 1 cB(zr. n) in (55) are zero and there is no linear dependence among the columns z1.. en. yn) we have. ... . for arbitrary vectors x and y with the coordinate columns (Dx.by (50). z2. A2. by (53). . z" satisfying the relations (50) Z=(z1. e2. z2. Note that it follows from (50) that the columns z1. Then the equations (54) can be written in the form (51) or (49) and the relations (54'). 2. 2.. n) in the basis e1. .. xn) and y= (yl.. as3... (54') Let zk = (zlk. .. 0=B(z'. .. n)..312 X. .. 19 Hence D is similar to some symmetric matrix.. Dy) = xTBDy = xTBB'Ay = xTAy. yield the equation (50). A. y) = (x. . y2. . .. Chapter IX) : A3f BlAzk = Akzk (zrzk) = ark (k = 1. . znk) be the coordinate column of sk (k = 1. by (52) and (53). 2.. ... . z2.
n). k = 1.ABzk) = 0 .§ 6. (58) The formulas (58) show that the nonsingular transformation x=Z n (59) reduces the quadratic forms A(x. 2. n) .e. PENCILS OF QUADRATIC FORMS 313 The principal matrix Z is nonsingular (I Z 1 =.. we can represent (56) and (57) in the form IL I ZTBZ = C . (57) By introducing the principal matrix Z = (zi. The system of equations (61) can be contracted into the single equation ZT (Azk .AkBzk) = o (i = 1. when we multiply both sides of (49) on the left by the row matrix T Z' . Then (58) holds. . x) simultaneously to the canonical forms (60). we obtain : z'TAzk = IkziTBzk = Akbik (1.4 0). for every k (49) holds. (61) where k has an arbitrary fixed value (1 < k < n). Thus we have proved the following theorem : .i andk n k.' 0). ..=1 (60) This property of (59) characterizes a principal matrix Z... .. because its columns are linearly independent. x) simultaneously to sums of squares: k... and hence (56) and (57) holds for Z. since ZT is nonsingular. Therefore Z is a principal matrix.. z"). . x) and B(x. n) . 2. We rewrite (57) as follows : ziT (Azk . For suppose that the transformation (59) reduces the forms A(x. The equation (50) can be written as follows : z{TBzk = 6. AzkAkBzk=O. . (i. hence. x) and B(x. (56) Moreover. z2.. 2. (58) implies that Z is nonsingular (I Z . k =1. i. . . .
pp. x) . R (x. Afterwards it turns out (as we have shown on p. 313) that the columns z1. 21 An orthogonal transformation does not alter a sum of squares of the variables. 20 See [ 171. . A. .. the characteristic equation of the pencil A(x.AB(x. A(x. ..314 THEOREM 9: X. . y) is reduced to the form 27 Ak E by an orthogonal trans k1 formation y = QE (reduction to principal axes!). .AB(x. then Z='I zfk 11 is a principal matrix of the regular pencil of forms A(x.21 n Thus the transformation x = ZE. . x) to yk (which is always possible. x) and B(x. and the principal vectors of the pencil are characteristic vectors of A. x) _ k1 xl .11 is a principal matrix of a regular pencil of forms A(x. since B (x.l k_1 two given forms to (63). . x). k = 1. Now the form Al (y. are the characteristic values of the Sometimes the characteristic property of the transformation (62) formulated in Theorem 9 is used for the construction of a principal matrix and the proof of Theorem 8... y). x)AB(x. zn of Z satisfy the relations (49) and (50). obviously.x) simultaneously to sums of squares (63) L1 km1 pencil of Z.x) AB(x. x) is the unit form.20 For this purpose. so that B = E. because (Qx)TQ5=xTx. x). . Then. In this case the relations (50) can be written as follows : z`TZk=h1k (i.. i. x) coincides with the characteristic equation of A. In the special case where B (x. x) corresponding to the columns z1.. z2. then the transformation x=ZE (62) reduces the forms A(x.. z" Conversely.. reduces the 27yk k.. .57. zn. A2. we first of all carry out a transformation of variables x = Ty that reduces the form B(x.x) and B(x.. Then A (x.x). . x) to the form (63). z2. x) is carried into a certain form A1(y. . z2. where Z = TQ. if some transformation (62) simultaneously reduces A (x. x) is the `unit' sum of squares k1 positive definite). QUADRATIC AND HERMITIAN FORMS If Z = Ii zik 11.e. where Al.. 2. n) and they express the orthonormality of the columns z1. 56.
x. 2. k1 (65) Therefore the principal vectors z1.. Given the equation of a surface of the second order 2x22y23z210yz+2xz4=0 is (66) in a general skew coordinate system in which the equation of the unit sphere 2x2+3y2+2z2+2xz=1. A..). In R we consider a central hypersurface of the second order whose equation is n A (x. x) =1. .AB (x. .AB(x..'k=c. it is required to reduce equation (66) to principal axes. z" whose coordinates in the old basis form the columns of Z.§ 6. x) . k Thus. . x) askxixk = c .. provided the equation of the hypersurface is given in a general skew coordinate system22 in which the `unit sphere' has the equation B (x.e.. e and the fundamental metric form B(x.e. (k= 1. (64) Z = 11 zik 11 71 is a prinAfter the coordinate transformation x = cipal matrix of the pencil A(x.. e2.. . z" of the pencil coincide in direction with the principal axes of the hypersurface (64). the new basis vectors are the vectors z1.along the axes. x) just as was done for the proof of Theorem 8. PENCILS OF QUADRATIC FORMS 315 2. In this case (67) 2 0 2 5 1 5 3 0 1 BH 0 f 2 0 1 3 0 1 0 2 22 I. the principal vectors of the pencil. i. the task of determining the characteristic values and the principal vectors of a regular pencil of forms A(x. z2.. .' Ak6. a skew coordinate system with distinct units of lengths. . x) . These vectors form an orthonormal basis in which the equation of the hypersurface (64) has the form n . n)... .. Theorems 8 and 9 admit of an intuitive geometric interpretation... A2.. . Example. and the characteristic values A. . z2. of the pencil determine the lengths of the semiaxes : Ak = f . We introduce a euclidean space R with the basis e1. x) is equivalent to the task of reducing the equation (64) of a central hypersurface of the second order to principal axes. .
The coordinates of the first can be chosen arbitrarily. We denote the coordinates of a principal vector corresponding to the characteristic value 1 by n. v'. w' = . _ . Similarly. provided they satisfy the relation v + w = 0. A. by setting A _ . Thus.316 X. To the characteristic value A = 1 there must correspond two orthonormal principal vectors. v. the coordinates of the second principal vector are at. v. w are determined from the system of homogeneous equations whose coefficients are the elements of the determinant (68) for A =1: 5w 5w In fact we have only one relation v+w=0. The values of u. v.4. 22 =1. QUADRATIC AND HERMITIAN FORMS The characteristic equation of the pencil I A .AB = 0 has the form 221 0 232 5 12 5 322 0 1A This equation has three roots : Al =1.4 in the characteristic determinant. z2) = 0) : 2uzt' + 3vv' + 2ww' + uw' + zn'w = 0. w.v' and write down the condition for orthogonality (B(z'. w=v.' = 5v'. we find for the corresponding principal vector : u" v" =u" w"=2u". Hence we find : u' = W. We take the coordinates of the second principal vector in the form u' v' w' = .v'. We set U=0. .
. x) =1). i. Suppose that two quadratic forms are given A (x. 5 A$ S . x) i. i. § 7. 23 In the exposition of this section. we follow the book 1171. i+Cs+Hs=1 The first equation can also be written as follows : 4!+ 4'. i. The coordinates of the endpoints of the other two orthogonal axes are given by the first and second columns. x) _ Y bikxixk. Extremal Properties of the Characteristic Values of a Regular Pencil of Forms23 1.e. S A. 1 Therefore the principal matrix has the form 0 0 3 1 1 3 1 Z= 1 5 3V6 1 1 3 2 and the corresponding coordinate transformation (x = Z$) reduces the equations (66) and (67) to the canonical form i+$i4H4=0..x) in nondescending order: A. (67). 2/3.. v=3. v'. and u" are determined from the condition that the coordinates of a principal vector must satisfy the equation of the unit sphere (B (x. (69) . We number the characteristic values of the regular pencil of forms A(x. § 10. 1/3.§ 7. and an imaginary one equal to 1. x) AB(x.1/3.1l=1.e. Hence we find : v= 1 .k1 n of which B(x.x) is positive definite..k=1 aikxixk and B (x. EXTREMAL PROPERTIES OF CHARACTERISTIC VALUES 317 The values of v. The coordinates of the endpoint of the axis of rotation is determined by the third column of Z. This is the equation of a onesheet hyperboloid of rotation with real semiaxes equal to 2.
x) t + A2 n i = 1.l + . . by zl..25 Therefore all these columns correspond to the characteristic value A. L.. 6._ n=0. . 2. . QUADRATIC AND HERMITIAN FORMS The principal vectors24 corresponding to these characteristic values are denoted. For this purpose it is convenient to go over to new variabes l.. we often call a column.. In this case the corresponding x is a linear combination of the principal columns z1. we group together the equal characteristic values in (69) A1= .. = Apt < Ap1+1= . ignore the second part of the inequality and investigate when the equality sign holds in the first part. for the time being.. A2. x) considering all possible values of the variables. . .x) .AB(x. We ascribe to these respectively. < . zriz) (k =1. .. as before.. .. x) .. .. the quotient `4(x'x) is the coordinate of the center of these masses.. points nonnegative masses m1= m2 = 2. by (70). zn : zk = (zlk. + (70) On the real axis we take the n points A.... . z2. when 41+1=.. xl (x x=Z (x{= k1 where Z = II Ztk 11I A(x.= ZE it follows that x='tkzk. = API+p.. .. In the new variables the ratio of the forms is represented (see (63)) by A..   . zQ.. Then.. z2. n). Let us determine the least value (minimum) of the ratio of the forms o).. 310). k1 n . having the geometric interpretation in mind. i. Therefore B(x. a vector.. by means of the transformation A(x.318 X. z21. n) is a principal matrix of the pencil A(x.. not all equal to zero (x. For this purpose. 24 Here we use the term 'principal vector' in the sense of a principal column of the pencil (see p. 2.. Let us. . 25 Froni x..e. 2. ... x). . Throughout this section. x) A SA(x'x)SA B(x. so that x is also a principal column (vector) for A = A.. (71) The center of mass can coincide with the least value Al only if all the masses are zero except at this point.
x) x.xk . y that the equation B(x. i. in which the square of the length (the norm) is given by the positivedefinite form t. Therefore. (72) and this minimum is only assumed for principal vectors of the characteristic 2.x)_A2Ej+.r. . EXTREMAL PROPERTIES OF CHARACTERISTIC VALUES 319 We have proved : The smallest characteristic value of the regular pencil A(x. . B(x. This is in complete agreement with the geometric interpretation given in the preceding section..+Ma' and therefore minA(x..§ 7. x) mend THEOREM 10: B(x.+El A(x. z'1 form an orthonormal X basis. as the coordinates of a vector x in some basis of a euclidean space b. B(. ..e... x) . in (69) is the minimum of the ratio of the forms AP = min A (x.x)' value A. X) I = min A (x' x) B(x. In order to give an analogous `minimal' characteristic for the next chAracteristic value A2.r. to those that satisfy the equation26 B(z'. x) is the minimum of the ratio of the forms A(x. if the vector x= 1 Ekzk is orthogonal to one of the Zk.. z2.x)=0).x)=0.x) Ei+. we shall mean by the orthogonality of two vectors (columns) x..AB(x. we eventually obtain the following theorem : THEOREM 11: For every p (1 p < n) the pth characteristic value A. Proceeding to the subsequent characteristic values.. and iu what follows.. For these vectors. Here the equality sign holds only for those vectors orthogonal to z' that are principal vectors for the characteristic value A2. . y) = 0 holds. we restrict ourselves to all the vectors orthogonal to z'. We shall regard the quantities . z.kx.x)' (73) provided that the variable vector x is orthogonal to the first p 1 orthonormal principal vectors z'. zp' : 26 Here. then the cor k1 responding Ek = 0.kI In this metric the vectors z'.x)__A2 B(x.. ..x) (B(zl. x) B(x..
n). n). L2. x) =0 . . 2.. We shall say that the variables x1. n). .2' (75) Furthermore. x) = 0... 2.. x) (k =1... (73) is written as follows : dp=µ (B.+ ments of the row matrix zkTB (k = 1.. xn Suppose that linear forms in the variables x1i x2. (x) = 0.kx" (lc= 1. if only such values of the variables are considered that satisfy the system of equations Lk(x)=O (k=1. xn are given : (74') Lk(x) = llkxl +12kx2++l. 2. L. .. In this notation. z2.. L2. x" or (what is the same) the vector x is subject to h constraints L1. .h). .. . (74") Preserving the notation (74') for arbitrary linear forms we introduce a specialized notation for the `scalar product' of x with the principal vectors . x2. 27 (78) where llk. zQ1 and can therefore be used only when these vectors are known... we introduce the concept of constraint imposed on the variables x1..... 2.2. L1. . B (01. the minimum is assumed only for those vectors that satisfy the condition (74) and are at the same time principal vectors for the charac teristic value A. 3.. Lk (x) = zkT Bx = dkx.. there is a certain arbitrariness in the choice of these vectors. z1.. Lk) . + 12kx2 + ...... . . L.. Moreover. A ... . (76) We consider the constraints Ll(x)=0. ... .ZV L2. Z": Lk(x) = B(zk. Lp_1) (p=1.. lnk are the ele .. . Moreover. z2. ... . l1k.. .. . The characterization of A given in Theorem 11 has the disadvantage that it is connected with the preceding principal vectors z'.. . ... 2.  . ... x) as follows: B(x... .. . when the variable vector is subject to the constraints (74") x) we shall denote min A (x. . Lp_1(x)=0 and (77) Lr+1 (x) = 0. In order to give a characterization of lp (p = 1. .. x2. fi (B.. h)..320 X. . n) free from these defects. QUADRATIC AND HERMITIAN FORMS (74) B (zl. .
Lz.. Ll. Lh) = max B(z... L2. Lh) = v (B. the characteristic values of the pencil . the corresponding coordinates of x(l) are 4+1 = .. L2... £1. Therefore. . .. . in contrast to the `minimal' characterization which we discussed in Theorem 11. L1. . Thus we have proved : THEOREM 12: If we consider the minimum of the ratio of the two forms A(z. . . L2. = " = 0.u remains less than or equal to A and becomes A. x) for p . x) are A"SA"1<.. . n)... .z) write : (80) when the variable vector is subject to the constraints L1. L1. and 12 to the ratio we obtain instead of (72). a vector x11 I B (x(1).. L2.. x). x) .x) B(x... Ll.. x) x)' . L9then B. Lp_1 are taken. we can fu and (. 4. ..AB (x.. z". . if the specialized constraints L1. .. by (70). x) the form A (x. . EXTREMAL PROPERTIES OF CHARACTERISTIC VALUES 321 Since the number of constraints (77) and (78) is less than n.. .B. Lh) _ . but variable. (76). Thus. .. . L4. Note that when in the pencil A (x. L2f . all the characteristic values of the pencil change sign. Since the constraints (78) express the orthogonality of x to the principal vectors zvfl. A.1 arbitrary. . x(1y) < p B This inequality in conjunction with (76) shows that for variable constraints L1. ... x(1)) S A (x(1)._1 the value of .. by applying Theorems 10. by using the notation e (B. . and (79) the formulas ). 22.A(x. +g P But then A \ (B. but the corresponding principal vectors remain unchanged. LF_1) (x(1). Lr1) (p =1. L1. the maximum of these minima is equal to A.. L8..B. x) is replaced by . . L. L$. . L)' Therefore. 11... . Ll.mine (B. Ls.. there exists o satisfying all these constraints.§ 7. x) AB(x. L2.. .  A (x..A(x. Lh) max s (_.. (79) Theorem 12 gives a 'maximalminimal' characterization of Al.. x(1)) ^ 4i . £2... ..5 Moreover. constraints L1.: AV = max µ A .
2) The characteristic value pth from the end 2"_n+i (2:5 p:5 n) is the maximum of the same ratio of the forms A * P+ = max A x.z)' (81) and this maximum is assumed only for principal vectors of the pencil corresponding to the characteristic value A. L2.. . x) AB(x. A . ) (83) (84) ing to the characteristic value this maximum is assumed only for principal vectors of the pencil correspond and satisfying the constraints (83). 28 In a euclidean apace with a metric form B(x..n). L*_1. Bee footnote 26.x) (82) provided that the variable vector x is subject to the constraints :28 B (z*. 22. the condition (83) expresses the fact that the vector x is orthogonal to the principal vectors z"P+2 . x) = 0. 5 A2 5 A.. . A. . of A. . %*=max(x'x) B(x. QUADRATIC AND HERMITIAN FORMS =maxA(x. x) = 0.. L*r+2) (p=2. .x) v (g... L1. x) B(x.. x) there correspond the linearly independent principal vectors of the pencil z'.... These formulas establish the `maximal' and the 'minimalmaximal' properties. X".x) * B(x. B (z*1.. z". A . is the maximum of the ratio of the forms B (z.. which we formulate in the following theorem : THEOREM 13: Suppose that to the characteristic values a. LVA. x) = 0. of the regular pencil of forms A(x... r+I =min v (B. L*.. x) ' A (x. B (z*P+2. respectively... z2. . .322 X. . Then : 1) The largest characteristic value A. . x) .. x).. . ..
... ..u (B° .h variables). L1. 0 AP = max . (86) be h independent constraints.h independent ones v1. the characteristic values (87) are independent of this arbitrariness and have completely definite values. v) = µ (B . L1. . EXTREMAL PROPERTIES OF CHARACTERISTIC VALUES 323 3) If in the maximum of the ratio of the forms 4!!L!) with the constraints B(x.. ._1 (x) = 0 (2:!. x2. Lp_1) where in (89) only the constraints L1. ... L.. for example. 29 The constraints (86) are independent when the linear forms Lo (x). . . vri_k in various ways. L9.. ... . in general. L2. . x) .. LA) (88) and..... from the maximalminimal property of the characteristic values A1= Mill B° (v. = max is (B .. x) goes over into the pencil A° (v.AB (x.. x by the remaining variables. . L1. v2.29 Then we can express h of the variables x1._1 are allowed to vary. L2.. which we denote by v1f v2. L. The regular pencil so obtained has n .2B° (v. L1..§ 7. LA (x) on the lefthand sides of (86) are independent. .. Therefore. LA. . S Ao (87) Subject to the constraints (86) we can express all the variables in terms of n .. This follows.. then the least value (minimum) of this maximum is equal to Anp+1= min v 0 .. v) . However. . Lh W= 0 . . Lr_1 . . .. LP1) . ..x) L1 (x) = 0.. Ll. L2 (x) = 0. p S n) (2 < p < n) the constraints are varied. v) is again a positive definite form (only in n .h real characteristic values Ai A2 S . L2. L2 (x). . v). the regular pencil of forms A (x. ... when the constraints (86) are imposed.. (89) . (85) Lo (x) = 0. where B° (v. . .
. Lp_i increases or remains the same.. . L°._I) (B . . . LP1) Lo Here not only are L1. La1 varied... L.. nh). L p_I h. .. x) . ... ..324 X. Lv1 S A0 ° = max (A L° B. on the righthand side.. x) . .. L° LI .. Hence P . L1.. <1n are the characteristic values of the If Al < 2 regular pencil of forms A(x. . .. . This completes the proof. .. LI. A (x.. . Lh . .. . The second part of the inequality (90) holds in view of the relations Apmaxµ(B. . LpI) S max µ (A ...... (90) Proof.. LA . x) and 11 0 S AQ S . . .h) follows easily from (79) and (89). but Lp.. S . .°. .x) .. . then THEOREM 14: AP S AP s Ap+h (p=1. For when new constraints are added.. QUADRATIC AND HERMITIAN FORMS The following theorem holds : . L°. L1.. x)  AB (x. .x) B(x.... .B(x. on the lefthand side the latter are replaced by the fixed constraints Li.. The inequality 1p S 1p (p = 1. LI. . 1 . LI. A(x. 6.. Therefore (B LI. x) (91) are given and that for every x o.AB(x.2.x) Then obviously. Suppose that two regular pencils of forms A (x.... the value of the minimum µ (B .AB (x.... .. Lp+.. x).x) SA(x. L. n ..1 are the characteristic values of the same pencil subject to h independent constraints. Lp_i) = max µ (AB ._1 also. 2. LI.
2:5:. . %2(x)=0. . this leads to the following theorem : . and 11 < 22 < . x). then the identical relation A (x. then we have : ASAP (p=1...(x)=0 are imposed.. . Thus. x) and A(x.. x) coincide. x) . x) = B (x. x) . if we denote by Al < A2 < . Lr11 (p=1..x)+2'[X (x)]2. A.u (B . the difference A(x.. (p=1..z) (92) implies that ArSAp (p=1.z) B(x.u (=J . i1 Then. x) . n).x) A(x.nr). x) AB(x.. (93) Let us consider the special case where.. .F. .2. EXTREMAL PROPERTIES OF CHARACTERISTIC VALUES 325 max. L1. L1. L2. L2..A(x....x) and A (x. and A..... . the forms A(x.§ 7. ... < 2.x)..2. %. x) . In conjunction with the inequality (93). we have: APSAp5AP. . and the pencils A('x.. Therefore.AB(x. < A2 < . LPI1) S max. B(x. x) with the characteristic values A.. n).x)=A(x.AB(x.. the characteristic values of the pencils (91). x) .. x) and (x. we have proved the following theorem : THEOREM 15: If two regular pencils of forms A(x. when the r independent constraints %1 (x)=0..... x) and A(x.AB(x. in (92)..2.AB (x. < .x) B(x. < A..AB(x. n). x) . are given. x) have the same characteristic values Applying Theorem 14 to both pencils A (x.2. .. In this case. respectively... x) is a positivesemidefinite quadratic form and can therefore be expressed as a sum of independent positive squares: A(x.:5 2.
. x) ... where the form T3 (x. QUADRATIC AND HERMITIAN FORMS THEOREM 16: If Al < A2 < . . provided of course that r 0.. . 1. . . i. 7173. 32 See [17]. . 4. = 0.kl ... . pp. 31 The first parts of the inequalities hold for p > r.. < Ap. . .: 33 n T = X be (q1. q = 0. where A(x. and X.. The position of equilibrium itself corresponds to zero values of these coordinates : q. q2.AB (x... We consider the free oscillations of a conservative mechanical system with n degrees of freedom near a stable position of equilibrium.. i.AB(x. . q. x) . are the characteristic values of two regular pencils of forms A(x.. x) and A(x. We shall give the deviation of the system from the position of equilibrium by means of independent generalized coordinates q1.. (94) In exactly the same way the following theorem is proved : THEOREM 17 : If Al < A2 !5.. 2p (p=1... < A..2. x).. !5.. z) and A(x. and 1. Small Oscillations of a System with n Degrees of Freedom The results of the two preceding sections have important applications in the theory of small oscillations of a mechanical system with n degrees of freedom. 12 < . § S.x)=A(x... 42. 2p+r (p =1. respectively Ap < 2. (95) In Theorems 16 and 17 we can claim that for some p we have. . < A. 2. < 1. qA) 449E 30 The second parts of these inequalities hold for p < n . x) by adding r positive squares... x) AB(x.. r) are independent linear forms. then the following inequalities hold :30 A. n) .AB (x. 2. . < !n are the char acteristic values of the regular pencil of forms A(x. . .x)+[I (x)]2.r only. then the following inequalities hold:" 1. x) is obtained from B(x.1 and X4 (x) (i = 1. and 11 <_ A2 < . x). 33 A dot denotes the derivative with respect to time. q2 = 0..326 X. x) . q2. n)... . Then the kinetic energy of the system is represented as a quadratic form in the generalized velocities q..$2 Note.
and is zero only for zero velocities q1= q2 = . i. Then. 2.Y i=1 stationary value. i.. qn) as power series in ql. qn) . qn and keeping only the constant terms b4k. .. q2..)=bik+. i. = qn = 0. n). n) . we obtain : n P=. We now write down the differential equations of motion in the form of Lagrange's equations of the second kind : d 8T eT 8P at a4{ . K. .. k1 (aik =aki. i. .. we then have : n T = E bikMk i. since the deviations q1. k. for example. Without loss of generality. the potential energy P and the kinetic energy T are determined by two quadratic forms : n n P =.. .i aM+aikggk+.. (i....aQi (i =1.xgigk. . q2. ... 2. . Buslow (Buslov)... The potential energy of the system is a function of the coordinates : P (ql.aq _ . § 191.. n) .ka1 (bik = bk{ . k =1. qn. Therefore i. q2.... k =1. G. n).. (96) the second of which is positive definite. Theoreti8che Mechanik.Ybikgigk i.0)=0...0. The kinetic energy is always positive. . n)..k=1. we have n P =Z aikgigk i. SMALL OSCILLATIONS OF SYSTEM WITH n DEGREES OF FREEDOM 327 Expanding the coefficients bik(gl...k1 T =. Thus. we can take P0=P(0.§ 8.. .' a. q.. are small. we have a.2. .. q2i .ke1 Since in a position of equilibrium the potential energy always has a dq Io=0 (i=1. . .. g. 2. .. expanding the potential energy as a power series in q1. .. qn bik(g1..kI . q 2 .. . ..lbikgigk is a positivedefinite form. ...... Keeping only the terms of the second order in q1. 2. .g2. (97) 34 See. q2..
2. V2k. vn) is the constantamplitude column (constantamplitude `vector'). By Theorem 8. 2.35 Then. A. (98') and cancelling But this equation is the same as (49). n) .. .. Therefore the required amplitude vector is a principal vector. . 36 See G. the regular pencil of forms A(x.. § 210..AB (x.328 X. QUADRATIC AND HERMITIAN FORMS When we substitute for T and P their expressions from (96). and the square of the frequency A = cot is the corresponding characteristic value of the regular pencil of forms A(x. . . x) . . Suslow (Suslov).1=w2)... . . that the value of Po in the position of equilibrium is less than all other values of the function in some neighborhood of the position of equilibrium.k1l1 and the column matrix q = (q1. in matrix notation : q = vsin (wt + a). We shall seek solutions of (98) in the form of harmonic oscillations q1= v1 sin (cot + a). we obtain: X btkgk + X aikgk = 0 k=1 k1 (i =1. A2. x) has real characteristic values Al.= v2 sin (wt + a). v2. q) is also positive definite.. .. Substituting the expression (99) sin (wt + a). co is the frequency. our assumption means that the quadratic form P = A(q. .. . (98) We introduce the real symmetric matrices A=llaskII andB=llb. q. (99) Here v = (v1. Theoretische Mechanik. q2.k) . and n corresponding principal characteristic vectors v1. q2i . On the other hand. k = 1. K. v2. by a theorem of Dirichlet. ... . .. . we obtain : Av = ..e.) in a position of equilibrium shall have a strict minimum. n) satisfying the condition 35 I. V.. qn = v sin (cut + a). We subject the potential energy to an additional restriction by postulating that the function P(q1. x) .AB (x. . .36 the position of equilibrium is stable..1Bv for q in (. .. q.. x). and a is the initial phase of the oscillation.. vn (vk = (vlk. . qn) and write the system of equations (98) in the following matrix form : (98') Bq + Aq = o. .
k. . . V2k.. But then there exist n harmonic oscillations3s vk sin (cot + ak) (101) whose amplitude vectors vk = (v1k. n).. . . vk) = ' bp.. SMALL OSCILLATIONS OF SYSTEM WITH n DEGREES OF FREEDOM 329 B (vi. x) is positive definite it follows that all the characteristic values of the pencil A(x. are uniquely determined from (103).. 2. .vµivrk = Vik k. . k =1.n)... n) satisfy the conditions of 'orthonormality' (100). from the representation (63).. Since the equation (98') is linear. the values Ak sin ak and cvk cos ak (k = 1. the expression (102) is a solution of (98). . 2.. . . n) are arbitrary constants. vn are always linearly independent. 2.2..§ 8. . For. 4o=£WkAkcosakvk. (100) From the fact that A(x.. k =1. whatever the values of these constants.'Aksinakvk.. n). On the other hand. ((o%= .. v2. 2. . . 2... 2. x) . every oscillation can be obtained by a superposition of the harmonic oscillations (101) : n q PAksin ((okt+ak)vk k1 (102) where Ak and ak are arbitrary constants.. v1 (i. n). 4jc_o=q0 For from (102) we find : n n (103) q0=. . . the arbitrary constants can be made to satisfy the following initial conditions : q(eo=qo. The solution (102) of our system of differential equations can be written more conveniently : n qr X Ak sin ((ot + ak) va k1 (104) Note that we could also derive the formulas (102) and (104) starting from Theorem 9.. n) . 38 Here the initial phases a. for example.AB(x. x) are positive :3' Ak>0 (k=1. k1 k1 Since the principal columns v1. . vk) (k =1.. (k = 1. . For let us consider a nonsingular transformation of the 90 This follows. . and hence the constants Ak and ak (k = 1... .
. 2. . 2. w. of the given mechanical system in nondescending order : 0<w1Sco2 ' SWn. QUADRATIC AND HERMITIAN FORMS variables with the matrix V = II v. . . The disposition of the corresponding characteristic values 2k= wk (k = 1.330 X. . by Theorem 9. all the numbers A.. q) is positive definite. n). Setting n qj=Zv8. We number the frequencies W1.. n) in both methods are the same. 2. we again obtain the formulas (104) and therefore (102). n) of the pencil A(x.E 9E k1 (107) Bn in which the potential and kinetic energies have a representation as in (107) are called principal coordinates... because the matrix V = II v.AB(x. x) . x) .. (k = 1. _.k Ili that reduces the two forms A (x. . a principal matrix of the regular pencil of forms A(x.. (108) Since A(q.. . we find: Ok = At sin (wkt + ak) (110) When we substitute these expressions for 9k in (105).. we have : n n (106) P=A (q. . ..AB (x.. q) ='A Or. We now make use of Lagrange's equations of the second kind (98) and substitute the expressions (107) for P and T. .k 11 i in (106) is... 2.. x) is then also determined : . more briefly. x) simultaneously to the canonical form (63). .. are positive and can be represented in the form Ak=wk (wk>0... We obtain: 9k + Ak9k = 0 (k =1. x) and B(x. x). 2. . We also mention a mechanical interpretation of Theorems 14 and 15.. 2. (109) From (108) and (109). k=1 . 2. w2..2 . (i=1. i_1 The coordinates 01. A2. n) (105) q=VO and observing that q = V6. T =B(4. k =1. k1 or.. n) . The values vi p (i. . . . 92i . 3.
qn : L. subject to the relations AO = (012 (j = 1. q) ).4) for the kinetic energy (without a change in A (q.. 41 In the preceding sections. . Chapter III. .. q2.. with an increase of the form B(4. by the pencil A(x. We recall42 that a hermitian form is an expression $9 A finite stationary constraint is expressed by an equation f (ql.. The frequencies of the system... where f (qi. q2. x). we can assert on the basis of Theorem 15 that: With increasing rigidity of the system.It degrees of freedom..40 § 9. . All the results of §§ 17 of this chapter that were established for quad ratic forms can be extended to hermitian forms. 40 The reader can find an account of the oscillatory properties of elastic oscillations of a system with n degrees of freedom in [17]. but the value of the new jth frequency 10.. § 2. i.h).. L2.. 0. q2. L2(q) =0.e. .§ 9. the frequencies can only decrease. qn are connections can be assumed to be linear in q1.e. q) for the potential energy (without a change in B(q.. 5 A°_h of the are connected with the characteristic values AI A°s S constraints L1... i. 2.(q) =0. 2.. cannot exceed the value of the previous (j + h) th frequency w*h . and with increasing inertia of the system.(q. all the numbers and variables were real.. qs. HERMITIAN FORMS 331 the given We impose h independent finite stationary constraints39 on supposed to be small.. Therefore Theorem 14 immediately implies that 0)j W! S wj+h (j =1. q) ). .nh). In this section. . our system has n .. 42 See Chapter IX. Since the deviations ql. Lh. After the constraints are imposed. . x) . Hermitian Forms41 1. the frequencies can only increase. Thus : When h constraints are imposed. . .) is some function of the generalized coordinates... . .AB (x. . q. with an increase of the form 4.. these system... In exactly the same way. n . the frequencies of a system can only increase. . the numbers are complex and the variables assume complex values. Lh(q) =0. Theorems 16 and 17 lead to an additional sharpening of this proposition.
y) moreover. xn). then m p H(x..x) =xTHx. x) in the form of a product of three matrices. 2.. . . . (112) (113) H(y. the sign T denotes transposition. in)..e. x) _ I hikxixk i. 2.. the hermitian form H (x. (111) To the hermitian form (111) there corresponds the following bilinear hermitian form : H(x.. xn). and a column matrix :44 H(x. vk are column matrices and cj. H = H. . p)..43 By means of the matrix H = II hik 11 71 we can represent H (x. 21. . i. y) =xTH7.. (116) We subject the variables x1.. X2. n) (117) 43 A matrix symbol followed by an asterisk * denotes the matrix that is obtained from the given one by transposition and replacement of all the elements by their complex conjugates (H* = gT). in particular. . The coefficient matrix H = 1 hik II. .. H(x.k= hki. 3.. i. k =1. 44 Here x = (x1.. vk).. If (114) X =1 cui. t = !2.332 X. i. in particular.y) and.. X2 .... k =1. H(x.. . k=1 hikxiyk.. H(x. n) ..ke1 (h.x) =H(x. t _ yv . .. (x1. y) and.e. . y) i1 km1 £cidkH(ui.. 2. x) assumes real values only. dk are complex numbers (i = 1. x to the linear transformation xi = di tiksk ki (i =1. of the hermitian form is hermitian.x) =H(x. a row. . .x) (113') i. a square. . 2. QITADRATIC AND HERMITIAN FORMS H (x. . . m . y = ' dkvk k_1 m p (115) where u'. yn)..
.. x is replaced by we can rewrite (118) as follows : T II = WHW. (T= 11 tik11").l where the new coefficient matrix H = II htk 111 is connected with the old coefficient matrix H = II ha. . x)..§ 9. 0 (i = 1. a singular form remains singular under any transformation of the variables (117). . in the second of the formulas (114). A hermitian form H(x. xq. From (118) we obtain the formula for the transformation of the discriminant on transition to new variables : IHI=IHIITiiTI A hermitian form is called singular if its discriminant is zero. . .'S 45 Therefore r < n. x). in matrix notation.. r) are real numbers and n X` = ki are independent complex linear forms in the variables x1. Z. x) can be represented in infinitely many ways in the form H(x. .x)ajXjX{. 2. Obviously. HERMITIAN FORMS 333 or.. x) assumes the form H(. II by the formula hT = TTHT. 2.' ackxk (i =1. H (x. . (117') After the transformation. x2. The rank of H is called the rank of the form H(x. (118) This is immediately clear when. The determinant i H I is called the discriminant of H (x.k. (119) From the formula (118) it follows that H and H have the same rank provided the transformation (117) is nonsingular (I T i z 0). {a1 (120) where a{ . . x=Tl. i n i. r) .
and an upper triangular matrix L. (%i%i= I Z i ). (hkf k i + Jjo ) xk 2 2) XI. (122) Let us proceed to establish Jacobi's formula for a hermitian form n H(z. QUADRATIC AND HERMITIAN FORMS We shall call the righthand side of (120) a sum of linearly independent squares46 and every term in the sum a positive or a negative square accord ing as a{ > 0 or < 0. (123) This inequality enables us to use Theorem 2 of Chapter II (p.x)= 1 i' hkaxk hop k1 2 + HI (XP x) . Here.%. The proof is completely analogous to the proof of Theorem 1 (p. {. the number r in (120) is equal to the rank of the form H(x. 2. We apply this theorem to the matrix H = 11 hiR I1. = 0.(x. a diagonal matrix D. is the square of the modulus 41 The formula (121) is applicable when h. The difference a between the number n of positive squares and the number v of negative squares in (120) is called the signature of the bermitian form H(x.x)aXX i1 . 38) on the representation of an arbitrary square matrix in the form of a product of three matrices : a lower triangular matrix F. r). THEOREM 18 (The Law of Inertia for Hermitian Forms) : In the representation of a hermitian form H(x.. 299 must then be replaced by the formulas49 A H(x. x) as a sum of linearly independent squares. x) hx{zk of rank r.. No ¢. and (122). H(x. 46 This terminology is connected with the fact that R. Lagrange's method of reduction of quadratic forms to sums of squares can also be used for hermitian forms. h. of Z. x). . k) 0 (k =1.Hp.x): o. when hff. * 0. only the fundamental formulas (15) and (16) on p.. =nv. h (121) I.k1 we assume that Dk = H (1 2 . ... as in the case of a quadratic form.0. Just as for quadratic forms.334 X. 297). the number of positive squares and the number of negative squares do not depend on the choice of the representation. (hkf z)..
(126) Since H= II h4k 11 7 is a hermitian matrix. Dr'1. k=1. .1 J (127) Since all the elements in the last n . k1 k)' 1k DkH(1 .... k=1. k=1... 0. .. Dl. k1 j) (125) (j=k. (131) A comparison of this formula with (118) shows that the hermitian form .2..2..§ 9. r.k=1.. O )L...yDkx xk L n. 0. i.... ... n 2. ja=1. 2. (128) and 2) IF{=ILIO. and DkH .0}L.. because the last n .Dr .r columns of F and the last n .. i.j=0 tions that frk= ltt (i <k.48 we choose these elements such that 1) the relations (127) hold for all i.r).. in fact...k+1. HERMITIAN FORMS 335 and obtain H=F{Di... it follows from these equa i? k... 0) T n (jTj O)..r rows of L can be chosen arbitrarily...r diagonal elements of D are zero.. L . 2. ..... k ffk (124) where F = II f 4k II 1 .II lik II i . k f4k=14k (i. ... . n. . drop out of the righthand side of (124).. k=1. 2......n) F = L". .....n... (Do1) (132) under the transformation of the variables ^" These elements.. . Setting (129) (130) we write (129) as follows : H=TT {D. Dl. Then and (124) assumes the form H=L* DI) D2. (i <k.. D' 0. n).
. 2.. are independent. x) is equal to the number of variations of sign in the sequence 1. . . in place of Xl. we can write Jacobi's formula (133) in the form H (x..336 X. r) ... .Dr). that Jacobi's formula holds: H (x.. x.. Xr. automatically carry over to hermitian forms.k1 k (135) The linear forms X1..D2. so that the signature o of H(x.. x) is determined by the formula o=r2V(1. x) X hjkxtxk is called posii.Dr).x) >0 (<0). QUADRATIC AND HERMITIAN FORMS ti `. D2. X2i .. n) goes over into H(x. . x). . . Dr v=V(1. k+lxk+l + +tkx (k =1.. .. not all equal to zero. i.. X.' =k1tikxh X r (i=1.. D... k1 tine (negative) semidefinite if for arbitrary values of the variables x1. (133) x k + tk... the linearly independent forms (136) Yk=DkXk (k=1.. x) = EJ XkXk where Xk and (Do= 1)... made for quadratic forms (§ 3). X.D1. . 2. since Xk contains the variable xk which does not occur in the subsequent forms Xk+1i . .. When we introduce. H(x.. .. k1 7 DkH 1 2. DEFINITION 5: A hermitian form H(x.D2. . x2i X3. . X2.. x) ' r Y_ (D0=1). . .DI.. the number of negative squares in the representation of H(x...e. (138) All the remarks about the special cases that may occur. .. r) (134) t_ 1 1 2 .. 2.. (137) According to Jacobi's formula (137)..
.. . are nonnegative : H (i1 22 .' hikxixk can be rek'1 duced by a unitary transformation of the variables X= Uj (UU*=E) (141) THEOREM 21: to the canonical form A t_ (142) A where Al.2. x) . 274). We shall study the pencil of hermitian forms H (x..n)..k}>0 (k= 1.2. 22. x) = .§ 9.pn). a (139) THEOREM 20: A hermitian form H(x.i2. hikxixk is positive semi{.. x) _ 2. A...tip=1.k.. x) .2.. The proofs of Theorems 19 and 20 are completely analogous to the proofs of Theorems 3 and 4 for quadratic forms. x) _ 2Thikxixk is called positive i. x) = Y hikx{xk is positive defii.. The conditions for a hermitian form H(x.1 mite if and only if the following inequalities hold: Dk=H(1 2. sip) ?0 (140) (i1.lG (x.n. x) > 0 (< 0).. x2. not all equal to zero. n (143) ri i. x) to be negative definite or semidefinite are obtained by applying (139) and (140) to the form H (x. From Theorem 5' of Chapter IX (p.1 definite if and only if all the principal minors of H = II hik II.... Theorem 21 follows from the formula H= U 11 Abu 11 U1= TT II A{a{t I I T (UT= U'=T).. are the characteristic values of the matrix H = li hik i. tt rr . . Let H(x. HERMITIAN FORMS 337 a DEFINITION 6: A hermitian form H(x.. x).. x) =X hikx{xk and G(x.. A THEOREM 19: A hermitian form H(x.. H(x....' gikxixk be two hermitian ilk1 forms.. x. x) =Z.k. k+l (negative) definite if for arbitrary values of the variables x1. .. we obtain the Theorem on the reduction of a hermitian form to principal axes : Every hermitian form H(x..
. x) has n real roots Al. .. z") o such that Hz = Aoz. The proof is completely analogous to the proof of Theorem 8. 2. z2i . x) . z2. z" satisfying the conditions of 'orthonormality': G(z'.x) corresponding to the characteristic value Ao.k II... § 10.x_o (144) This is called a Hankel form. is a characteristic value of the pencil.. Its roots are called the characteristic values of the pencil. This equation is called the characteristic equation of the pencil of hermitian forms. 81... then there exists a column z = (z.. . x) is positive definite. Y) _ `..' The proofs of the theorems are then unchanged. . by means of these numbers.k=1.' 8i+xxtxx t.zk) =8ik (i. . and G = II gik II "I we form the equation HAG I =0. .. A.x) AG(x.. Let so. We shall call the column z a principal column or principal vector of the pencil H(x. Then the following theorem holds : THEOREM 22: The characteristic equation of a regular pencil of hermitian forms H(x.AG(x. QUADRATIC AND HERMITIAN FORMS (A is a real parameter). All extremal properties of the characteristic values of a regular pencil of quadratic forms remain valid for hermitian forms... .338 X. 32n_2 be a sequence of numbers. By means of the hermitian matrices H = II h. n). It has the form .. . If A. We form. This pencil is called regular if G(x.. Hankel Forms 1. A2. To these roots there correspond n principal vectors z'. a quadratic form in n variables "1 A9 (0. The matrix 8= II sl+k II o 1 corresponding to this form is called a Hankel matrix. Theorems 1017 remain valid if the term `quadratic form' is replaced throughout by the term `hermitian form..
8h+n2 This matrix is of rank h. RA+1. Dh&0. R2.2. But since the rank of (146) is h. Harixi. In this section we shall derive the fundamental results of Frobenius about the rank and signature of real Hankel forms.§ 10.. i. 82n2 iI We denote the sequence of principal minors of S by D1.. .. 8n (146) 8h1 8h 8h+1 . h+n1)... . On the other hand. Rh. Rh 81 82 82 88 .. R2. . . . these first h columns of (146) must then be linearly independent.. then Dh 0.. This proves the lemma.... LEMMA 1: If the first h rows of the Hanukel matrix S= 11 s{+k 110 are linearly independent.. 49 pee [162]. n)... D2. 8n1 .1 On 8n+1 .. Rh are linearly independent and Rh+1 is expressed linearly in terms of them : h Rh+1= 'E afRhf+1 i1 or 8g h of Q1 (4'h.. of S: 80 81 . 8n+1 11 8.. We denote the first h + 1 rows of S by R1.. By assumption. . Proof... (145) We write down the matrix formed from the first h rows R1. 8n 83 84 S= 82 8s .. R1... . R2..... 8n1 ..e..h+ 1. . by (145) every column of the matrix can be expressed linearly in terms of the preceding h columns and hence in the terms of the first h columns.49 We begin by proving two lemmas. but the first h + 1 rows linearly dependent. .. Dn: Dr=Ie{+tIo 1 (p=1. FORMS 80 81 339 81 82 82 .
=0.340 LEMMA 2: X. . Proof.. (p < n .h h+k+1)= 1 G 1 . .=tnA2=0)..) is a Hankel matrix and that tik = 0 for i + k < p ... since too toll tlo tll tot= t1o (because S is symmetric) and too = DA+1= 0 DA Let us assume that our assertion is true for the matrices T. 1.. to=tl=. We shall show that every Tp (p =1. 1. h DA Dh 8211+k1 (148) BA+i .2. for a certain h (< n).. The proof is by induction with respect to p. we shall show that it is also true for Tp+1= II t1k Ilo .. . it is obvious..e.21.. n h). t2.1 82h+i+k (i.hh+i+1 S. for T2.h.. i..h. (149) On the other hand. tp.. 2... For the matrix T1i our assertion is trivial . and tit DA+1=. QUADRATIC AND HERMITIAN FORMS If in the matrix S= II si+k iIo 1.nh1) then the matrix T = II tik lion41 is also a Hankel matrix and all its elements above the second diagonal are zero... nh1... t2p_2 such that with to=.2. 82h+. (p=1. n . . .. From the assumption it follows that there exist numbers tp_1.k. using Sylvester's determinant identity (see (28) on page 32). ..2 such that tik = ti+k (i.. I Tp I = t1. DAO. we find : Tp I =DA=0. .=tp_2=0 Here Tp II ti+k IIU 1.. .h) . We introduce the matrices TV = 11 tilt llu1 In this notation T = T.. there exist numbers tn_A_1..=Dn=O BA+k (147) 1..k=0. (150) .. ...
. Furthermore from (148) 8h+k (151) t = 82h+i+k Dh DA BA+i .g < 2p . and for p < i + k < 2p 1 the value of tik. we shall have BA+kp h ti = 82h+i+k + p1 DA Dh 8h+i .+1 is a Hankel matrix. 82h+i] 0 82h+k1 (152) By the preceding lemma. h+n1).82h+i+kp) . p1 (154) By the induction hypothesis (151) holds. HANKEL FORMS 341 Comparing (149) with (150)._1=0.'apsg_p v1 (q=h. Without loss of generality. t... Then. .. we have ti. by (154). we assume that i < p. the second diagonal are zero. Using Lemma 2. the last column of the determinant of the righthand side of (152) and use the relations (152) again. 82h+i1 h = 82h+i+k +X ap (t_9 ._. k:5 p < i + k:5 2p 1.. Then one of the numbers i or k is less than p. and since in (154) i < p. (153) Let i. above ... tj.§ 10. depends on i + k only. and all its elements to.h+1. . T. Thus. we shall prove the following theorem : . by (153). k and i + k . it follows from (147) that the (h + 1)th row of the matrix S= Its{+k IIo 1 is linearly dependent on the first h rows : h 8Q= . kp = ti+kp g<p Therefore. This proves the lemma. .2.. we obtain t. for i + k < p all the tik = 0. when we expand.
. S(l ...k=0. (i. .51 Therefore for r < n . and the matrix T has the form 0 ' 0 uAA1 T= U2 0 unh1 ...1 in the matrix T the elements u... h nr+h+ 1 nr+h+2 . By the preceding lemma..S° I T I =Dh = 0.. On the other hand. 51 From Sylvester's identity it follows that all the minors of T in which the order exceeds r . h) hh+i+1 .h rows and columns of S is not zero: D(*) = S (1 1 . . 8 contains some nonvanishing minors of order r bordering D5.h of T is different from zero. DA+1=.....342 X.. h nr+h+l nr+h+2 . n n 0.h.. 1. nh+ 1) is a Hankel matrix in which all the elements above the second diagonal are zero.=Dr=O. . and 60 By Sylvester's determinant identity (see (28) on p. On the other hand. 32).=u. Proof.. Therefore T t"h' o. Hence it follows that the corresponding minor of order r .. _h_1= 0. QUADRATIC AND HERMITIAN FORMS THEOREM 23: If the Hankel matrix S= II Si+k has rank r and if for some h (< r) DA O._h+1=. Therefore to. .. then the principal minor of order r formed from the first h and the last r ._A+1=0.. the matrix T = 11 tit l0nA1 t . h h+k+1). u2 "u1 The rank of T must be r ..h are zero.A_h1.
Dr) . DI. Dl..ko squares and the signature of the form : a+v=r... DO) = DhT ( n?+1. Dr) v = V (1. as Frobenius has shown.. .. 0 . 0 ur_h . x) = I ei+k xi xk ni of rank r the values of n.. D1. and o.. .2 V (1...nh = Dhu. . However.§ 10.nh nr+ l . ... D2. .. Dr). .. V (1.. respectively. ._h 0. . by Sylvester's identity (see page 32). . Dr) .rl We denote by n.. and this is what we had to prove. 110* . 303) these values can be determined from the signs of the successive minors Do= 1. . i. HANKEL FORMS 343 II 0 . v. . a =p (1. and o can be determined by the formulas (156) provided that 52 In the preceding Lemmas I and 2 and in Theorem 23. for Hankel forms there is a rule that enables us to use the formulas (156) in the general case: THEOREM 24 (Frobenius) : For a real Hankel form S(x. Dr_l. a=nv=r2v.. the field of complex or of real numbers. x)= Si+k xi xk of rank r. the number of positive and of negative Let us consider a real52 Hankel form S(x. . the ground field call be taken tj 0 as an arbitrary number fieldin particular. Dl.. v.. D1.0 T= urh (urh 0) .. ul II But then. . Dr by the formulas (155) n = P (1.. Dr) =r . .. . (156) These formulas become inapplicable when the last term in (155) or any three consecutive terms are zero (see § 3). . .. Dl. By the theorem of Jacobi (p.
Dh+1. QUADRATIC AND HERMITIAN FORMS Dh 0.xk have not only the forms S (x.344 1) for X. x) and Z1 go over.V corresponding to the group (158) are then:"' p odd p even Ph..V (Dh.h 1 and interpret Dh+p+1 not as D.. x) = Eof+kx.. and P .. . hnr+h+l. .. . but we have to set . where p = r . .==O (h <r) (157) Dr is replaced by D(r). r) .. S.r0) (158) a sign is attributed to the zero determinants according to the formula f U1) Sign DA+i = (1) 2 sign DA.Z{.. To begin with we consider the case where Dr r1 n1 s. 2. P = P (Dh.. .1 (i = 1. Dh+p+1) Vh. =D.. Then the Proof. where D(r)=S(1:. x) = :1 the Z. ..' .p a I)! sign Dh+p+1 DA 0. (x. D)i+p+1) p+1 2 P+1+8 2 P+1 2 0 P+1c 2 8 (160) Ph. r) .e.. r_o {. V. . 2. hnr+h+1 . and Sr(:r. n `1 . (x. ko r same rank r. into Sr(x. For let S (x. but as D(r) 96 0.... respectively.. We set xr+1 = = xn_1= 0. Dh+1. DA+P =0 (Dh+P+1 .+kx. DA+I=. Then the forms S(x. . (159) The values of P.'n1 di 2) in any group of p consecutive zero determinants (Dh 0) Dh+l = DA+2 = .0. . x) e.. P=. x) _ t.xk and S. x) and Z{ (i=1. PVh. but also the same signature a. are real linear forms and at _ {.Z. x) has ci 53 The formulas (159) and (160) are also applicable to (157). i.
. its signature also remains unchanged (see p. k=0. r) are different from zero and that in the process of variation none of the nonzero determinants (155) vanishes . 0). Di. (161) If Ds 0 for some i.. Dh+i. 32r_2 continuously in such a way that for the new parameter values 8o... signature of Sr(x. then sign D*= sign D..56 Since the rank of Sr(x.' . . Therefore the whole problem reduces to determining the variations in sign among those D* that correspond to Di = 0. because in the space of the parameters s. 56 Such a variation of the parameter is always possible.. .. the matrix T = !I 4k l+o is a Hankel matrix and all its elements above the second diagonal are zero. s22 an equation of the form D4 = 0 determines a certain algebraic hypersurface. x) f1 is of rank r (D.. si .. (i.). .. then it can always be approximated by arbitrarily close points that do not lie in these hypersurfaces. . sI.. We now vary the parameters so. Dn+r+i) For this purpose we set : 85+k Dh 825+k1 85+i .§ 10. for every group of the form (158) we have to determine P (Dn. a. .. Z5. 55 In this section.. 885+i+k 885+i1 By Lemma 2. Zr are linearly independent. so that T has the form s4 The linear forms Z1.. 1.)V (1. because the quadratic r form 8(z. 309).. x) is u. x) does not change during the variation. .... . Dh+r+i) .. the asterisk * does not indicate the adjoint matrix.... ssr2 all the terms of the sequenee55 1. HANKEL FORMS 345 X). . . . Therefore a_P(1.D. . . Dn+L.. p). 2. Dl*.... Dh4p...54 Thus the the same number of positive and negative squares as S(x. D. . More accurately. .V (De.. Dr (DQ ' St+k o 4 =1. If a point lies in some such hypersurfaces.. . D3.. DE.
+ xpxo) . p+ 1). p+ 1). . QUADRATIC AND HERMITIAN FORMS T= (162) Iit * . Dl*. k0 . Together with T. ... . (q=1... p) * .. Dp+1) = a*.. 1. 82A+11 82A+1+k and the corresponding determinants DQ*=I t* I o1 (q=1. where 8h*+k D* h 82b+t1 8A+1 (s. ... DA+p+1) * P (1. 1. * * . DA+1. . we consider the forms T (x. Dl*.346 X. . .... p + 1). 2. DQJttkIu1 * We denote the successive minors of T by D1. Dh+Q=DaD9 Therefore * * P (DA. Dr+1 (q=1. .. .. DA+p+1) Dp+1) . x). By Sylvester's determinant identity..  V (DA. . x)=4' tikxlxk . k = 0.. we consider the matrix T*=IItrkilo...V (1. . . . .. i. 2..k0 p Together with T*(x. D2... DA+1.. . x) = tp (xoxp + xlxp_1 + . x) = ± tl+kxjxk and T** (x. 2.. (163) where a* is the signature of the form T*(x.
. Since T*(x. We denote the signatures of T (x. . x) are obtained from T (x. .. x) by or and v**. t e for even p . it follows that: P (DA. Then for some h < r 0. (165). + xx_lx4) tP [2 (x0x2k + + x*_1x41) + xk] for even p Since every product of the form xaxp with a L fi can be replaced by a difxP)a(x" ference of squares (xa 2 zP)$. we can obtain a decomposition of T** (x. (xoxzx_1 + . Dh+1= . sign tP for even p. (167) where Dh+P+1 e_(1)a signDh . x) must also h+P 1 D be equal : (164) But for odd p.. x) and T**(x. DA+z) . from (162).. x) and T**(x. . (165) = 17' I = (1Y P (P+l) p+l . and T** (x. D P+1) f0 for oddp. = 0. . and (166).. T* (x. Dh+P+1) = p + 1. x) by variations of the coefficients during which D DA 1 n 0. Dh+p+1) + V (DD+1... Now let Dr = 0. Dh .§ 10. Dh+1. x). (168) the table (160) can be deduced from (167) and (168).F (Dh' Dh+l. T** (x. x) into independent real squares and we have 2 ** _ On the other hand. Dh*+P+1) . =D. Dh+P+1 Dh 0 for odd p. x). the signatures of T (x. (166) From (163). T* the rank of the form does not change (I T** I = T 0). x) _ J 2t. (164).. HANKEL FORMS 347 The matrix T** is obtained from T (see (162)) when we replace in the latter all the elements above the second diagonal by zeros. .... . Since p P (Dn*+v De+s) ..
. xA ' xnr+M . i. . D. we obtain Frobenius' rule (see page 304) from (160) for p=2. . xr1 = xii_1. D2. by P11. xn1 = xnr+A_1.. = XA.. QUADRATIC AND HERMITIAN FORMS In this case..348 X. by Theorem 25.. This completes the proof of the theorem. (7=1.. 2. In exactly the same way.. It follows from (166) that for odd p (p is the number of zero determinants in the group (158) ) sign DA ±1 = (__ 1) A p+I 2 In particular.. . . x) = '' ss+k xixk. Dr). In this case.h nr+h+1 . .n (1.. xA1 = xA1. . .. element A. We leave it to the reader to verify that the table (160) corresponds to the attribution of signs to the zero determinants given by (159). x.. DO =8 1.. Note. we can omit Dn+1 in computing V(1.. we find that the sequence Dn is obtained from l.. Then S(x..h nr+h+1. D I .. ..1 we have DADh+2 < 0.n ni k0 0 The case to be considered reduces to the preceding case by renumbering the variables in the quadratic form S(x... ..ke0 1 Starting from the structure of the matrix T on page 346 and using the relations Dt=DDhi . x) = X Si+kX$Xk We set: x0 = x0.. by replacing the single 1. thus obtaining G4undenfinger's rule.. . nh) obtained from Sylvester's determinant identity. for p:. D1..
BIBLIOGRAPHY
BIBLIOGRAPHY
Items in the Russian language are indicated by *
PART A. Textbooks, Monographs, and Surveys
[1]
AcHIESER (Akhieser), N. J., Theory of Approximation. New York : Ungar, 1956.
[Translated from the Russian.]
[2]
[3)
AITKEN, A. C., Determinants and matrices. 9th ed., Edinburgh: Oliver and Boyd,
1956.
BELLMAN, R., Stability Theory of Differential Equations. New York: McGrawHill, 1953. BERNSTEIN, S. N., Theory of Probability. 4th ed., Moscow: Gostekhizdat, 1946.
BODEWIO, E., Matrix Calculus. 2nd ed., Amsterdam: North Holland, 1959.
[4]
[5] [6] *[7]
CAHEN, G., Elements du calcul matriciel. Paris: Dunod, 1955.
CHEBOTAREV, N. G., and MEIMAN, N. N., The problem of RouthHurwitz for poly
nomials and integral functions. Trudy Mat. Inst. Steklov., vol. 26 (1949).
'[8] CHEBYSHEV, P. L., Complete collected works. vol. III. Moscow: Izd. AN SSSR,
1948.
"[9]
[10] [11]
CHETAEV, N. G., Stability of motion. Moscow: Gostekhizdat, 1946. COLLATZ, L., Eigenwertaufgaben mit technischen Anwendungen. Leipzig: Akad. Velags., 1949.
Eigenwertprobleme and ihre numerische Behandlung.
[12]
"[13]
Trans. and revised from the German original. New York: Interscience, 1953. ERuoIN, N. it, The method of LappoDanitevskil in the theory of linear differential equations. Leningrad : Leningrad University, 1956.
Chelsea, 1948. COURANT, R. and HILBERT, D., Methods of Mathematical Physics, vol. I.
New York:
"[14] FADDEEV, D. K. and SOMINSH]r, I. S., Problems in higher algebra. 2nd ed., Moscow, 1949; 5th ed. Moscow: Gostekhizdat, 1954. [15] FADDEEVA, V. N., Computational methods of linear algebra. New York: Dover Publications, 1959. [Translated from the Russian.]
[16]
FRAZER, R. A., DUNCAN, W. J., and COLLAR, A., Elementary Matrices and Some
Applications to Dynamics and Differential Equations. Cambridge: Cambridge University Press, 1938. `[17] GANTMACHER (Gantmakher), F. R. and KREIN, M. G., Oscillation matrices and kernels and small vibrations of dynamical systems. 2nd ed., Moscow: Gostekhizdat, 1950. [A German translation is in preparation.]
[18] [19]
GRSBNER, W., Matrizenrechnung. Munich: Oldenburg, 1956. HAHN, W., Theorie and 4nwendung der direkten Methode von Lyapunov (Ergebnisse der Mathematik, Neue Folge, Heft 22). Berlin: Springer, 1959. [Contains an extensive bibliography.]
351
352
BIBLIOGRAPHY
[20] [21] [22] [23]
*[241
INCE, E. L., Ordinary Differential Equations. New York: Dover, 1948. JUNG, H., Matrizen and Determinanten. Eine Einfiihrung. Leipzig, 1953.
KLEIN, F., Vorlesungen caber hbhere Geometric. 3rd ed., New York: Chelsea, 1949. KOWALEWSKI, G., Einfiihrung in die Determinantentheorie. 3rd ed., New York: Chelsea, 1949.
KREIN, M. G., Fundamental propositions in the theory of Azone stability of a
canonical system of linear differential equations with periodic coefficients.
*[25]
*[26] *[27]
Moscow : Moscow Academy, 1955. KREIN, M. G. and NAIMARK, M. A., The method of symmetric and hermitian forms
in the theory of separation of roots of algebraic equations. Kharkov: GNTI,
1936.
KREIN, M. G. and RUTMAN, M. A., Linear operators leaving a cone in a Banach space invariant. Uspehi Mat. Nauk, vol. 3 no. 1, (1948). KUDRYAVCHEV, L. D., On some mathematical problems in the theory of electrical
networks. Uspehi Mat. Nauk, vol. 3 no. 4 (1948).
*[28]
[29]
[30]
LAPPODANILEVSKII, I. A., Theory of functions of matrices and systems of linear
differential equations. Moscow, 1934. Memoires sur la thdorie den systemes des equations differentielles lineaires. 3 vols., Trudy Mat. Inst. Steklov. vols. 68 (19341936). New York:
Chelsea, 1953.
LEFSCHETZ, S., Differential Equations: Geometric Theory. New York: Interscience, 1957. LICHNEROWICZ, A., Algtbre et analyse lineaires.
[31] [32] [33] [34] *[35]
[36] [37]
2nd ed., Paris: Masson, 1956. LYAPUNOV (Liapounoff), A. M., Le Problpme general de la stability du mouvePress, 1949. MACDUFFEE, C. C., The Theory of Matrices. New York: Chelsea, 1946. Vectors and matrices. La Salle: Open Court, 1943. MALKIN, I. G., The method of Lyapunov and Poincard in the theory of nonlinear oscillations. Moscow: Gostekhizdat, 1949. Theory of stability of motion. Moscow: Gostekhizdat, 1952. [A German
ment (Annals of Mathematics Studies, No. 17). Princeton: Princeton Univ.
translation is in preparation.] MARDEN, M., The geometry of the zeros of a polynomial in a complex variable (Mathematical Surveys, No. 3). New York: Amer. Math. Society, 1949.
MARKOV, A. A., Collected works. Moscow, 1948. MEIMAN, N. N., Some problems in the disposition of roots of polynomials. Uspehi
*[38] *[39]
[40]
Mat. Nauk, vol. 4 (1949). MIRSKY, L., An Introduction to Linear Algebra. Oxford: Oxford University Press, 1955. *[41] NAIMARK, Y. I., Stability of linearized systems. Leningrad : Leningrad Aeronautical Engineering Academy, 1949. [42] PARODI, M., Sur quelques proprietds des valeurs caraetdristiques des matrices carrdes (Memorial des Sciences Matht matiques, vol. 118), Paris: GauthiersVillars,
1952.
[43] [44]
*[45]
PERLIS, S., Theory of Matrices. Cambridge. (Mass.): AddisonWesley, 1952. PICKERT, G., Normalformen von Matrizen (Enz. Math. Wiss., Band I, Teil B.
Heft 3, Teil I). Leipzig: Teubner, 1953.
PoTAPOV, V. P., The multiplicative structure of Jinextensible matrix functions. Trudy Moscow Mat. Soc., vol. 4 (1955).
BIBLIOGRAPHY
353
`[46]
[47]
Moscow: Gostekhizdat, 1948. ROUTH, E. J., A treatise on the stability of a given state of motion. London: Macmillan, 1877.
ROMANOVSKII, V. I., Discrete Markov chains.
[48]
[49] [50]
SCHLESINGER, L., Vorlesungen iiber lineare Differentialgieichungen. Berlin, 1908.
6th ed., London: Macmillan, 1905; repr., New York: Dover, 1959.
The advanced part of a Treatise on the Dynamics of a Rigid Body.
Einfuhru.ng in die Theorie der getviihnlichen. Dif ferentialgleichungen auf funktionentheoretischer Grundlage. Berlin, 1922. [51] SCHMEIDLER, W., Vortrage fiber Determinanten and Matrizen mit Anwendungen in Physik and Technik. Berlin: AkademieVerlag, 1949. [52] SCHREIER, 0. and SPERNER, E., Vorlesungen fiber Matrizen. Leipzig: Teubner, 1932. [A slightly revised version of this book appears as Chapter V of [53].] Introduction to Modern Algebra and Matrix Theory. New York: Chelsea, [53]
1958.
[54] [55]
SCHWERDTFEGER, H., Introduction to Linear Algebra and the Theory of Matrices.
Groningen: Noordhoff, 1950. SHORAT, J. A. and TAMARKIN, J. D., The problem of moments (Mathematical Surveys, No. 1). New York : Amer. Math. Society, 1943. [56] SMIRNOW, W. I. (Smirnov, V. I.), Lehrgang der hi heren Mathematik, Vol. III. Berlin, 1956. [This is a translation of the 13th Russian edition.] [57] SPECHT, W., Algebraische Gleichungen mit reellen oder komplexen Koeffizienten (Enz. Math. Wiss., Band I, Teil B, Heft 3, Teil II). Stuttgart: Teubner, 1958. STIELTJES, T. J., Oeuvres Completes. 2 vols., Groningen: Noordhoff. [58] [59] STOLL, R. R., Linear Algebra and Matrix Theory. New York: McGrawHill, 1952. [60] THRALL, R. M. and TORNHEIM, L., Vector spaces and matrices. New York:
Wiley, 1957.
[61]
TURNBULL, H. W., The Theory of Determinants, Matrices and Invariants. London: Blackie, 1950. [62] TURNBULL, H. W. and AITxEN, A. C., An Introduction to the Theory of Canonical Matrices. London : Blackie, 1932. (63] VOLTERRA, V. et HosTINsxy, B., Operations infinitesimales lineaires. Paris: GauthiersVillars, 1938. [64] WEDDERBURN, J. H. M., Lectures on matrices. New York: Amer. Math. Society,
1934.
WEYL, H., Mathematische Analyse des Raumproblems. Berlin, 1923. [A reprint is in preparation: Chelsea, 1960.] [66] WINTNER, A., Spektraltheorie der unendlichen Matrizen. Leipzig, 1929. [67] ZUEMiiHL, R., Matrizen. Berlin, 1950.
[65]
PART B. Papers
[101] ArRIAT, S., Composite matrices, Quart. J. Math. vol. 5, pp. 8189 (1954). `[102] AIZERMAN (Aisermann), M. A., On the computation of nonlinear functions of several variables in the investigation of the stability of an automatic regulating system, Avtomat. i Tolemeh. vol. 8 (1947).
[103]
AISERMANN, M. A. and F. R. GANTMACHER, Determination of stability by linear
approximation of a periodic solution of a system of differential equations with discontinuous righthand sides, Quart. J. Mech. Appl. Math. vol. 11, pp. 38598 (1958).
354
[104]
BIBLIOGRAPHY
AITKEN, A. C., Studies in practical mathematics. The evaluation, with applica
tions, of a certain triple product matrix. Proc. Roy. Soc. Edinburgh vol. 57,
[105]
(193637). AMIR Mo z ALI, R., Extreme properties of eigenvalues of a hermitian transformation and singular values of the sum and product of linear transformations, Duke Math. J. vol. 23, pp. 46376 (1956).
'[106] ARTASHENKOV, P. V., Determination of the arbitrariness in the choice of a matrix reducing a system of linear differential equations to a system with constant coefficients. Vestnik Leningrad. Univ., Ser. Mat., Phys. i Chim., vol. 2,
107]
pp. 1729 (1953). ARZHANYCH, I. S., Extension of Krylov's method to polynomial matrices, Dokl. Akad. Nauk SSSR, Vol. 81, pp. 74952 (1951).
*[108]
[109]
74 (1952). BAKER, H. F., On the integration of linear differential equations, Proc. London Math. Soc., vol. 35, pp. 33378 (1903). [110] BARANKIN, E. W., Bounds for characteristic roots of a matrix, Bull. Amer. Math. Soc., vol. 51, pp. 76770 (1945).
[111]
[112]
AZBELEV, N. and R. VINOORAD, The process of successive approximations for the computation of eigenvalues and eigenvectors, Dokl. Akad. Nauk., vol. 83, pp. 173
BARTSCH, H., Abschatzungen fur die Kleinste charakteristische Zahl einer positivdefiniten hermitschen Matrix, Z. Angew. Math. Mech., vol. 34, pp. 7274 (1954). BELLMAN, R., Notes on matrix theory, Amer. Math. Monthly, vol. 60, pp. 17375,
(1953); vol. 62, pp. 17273, 57172, 64748 (1955); vol. 64, pp. 18991 (1957).
[113]
[114]
BELLMAN, R. and A. HOFFMAN, On a theorem of Ostrowski, Arch. Math., vol. 5,
pp. 12327 (1954).
[115] [116] [117] [118]
[119]
BENDAT, J. and S. SILVERMAN, Monotone and convex operator functions, Trans.
Amer. Math. Soc., vol. 79, pp. 58.71 (1955). BERGE, C., Sur une propriet6 des matrices doublement stochastiques, C. R. Acad. Sci. Paris, vol. 241, pp. 26971 (1955). BIRKHOFF, G., On product integration, J. Math. Phys., vol. 16, pp. 10432 (1937).
equations, Math. Ann., vol. 74, pp. 13439 (1913). BoTT, R. and R. DUFFIN, On the algebra of networks, Trans. Amer. Math. Soc., vol. 74, pp. 99109 (1953). BRAUER, A., Limits for the characteristic roots of a matrix, Duke Math. J., vol. 13, pp. 38795 (1946) ; vol. 14, pp. 2126 (1947) ; vol. 15, pp. 87177 (1948) ; vol. 19, pp. 7391, 55362 (1952) ; vol. 22, pp. 38795 (1955).
BIRKHOFF, G. D., Equivalent singular points of ordinary linear differential
[120]
Angew. Math., vol. 192, pp. 11316 (1953).
[121]
Ober die Lage der charakteristischen Wurzeln einer Matrix, J. Reine
Bounds for the ratios of the coordinates of the characteristic vectors of a matrix, Proc. Nat. Acad. Sci. U.S.A., vol. 41, pp. 16264 (1955). [122] The theorems of Ledermann and Ostrowski on positive matrices, Duke Math. J., vol. 24, pp. 26574 (1957). [123] BRENNER, J., Bounds for determinants, Proc. Nat. Acad. Sci. U.S.A., vol. 40, pp. 45254 (1954) ; Proc. Amer. Math. Soc., vol. 5, pp. 63134 (1954) ; vol. 8, pp. 53234 (1957); C. R. Acad. Sci. Paris, vol. 238, pp. 55556 (1954). [124] BRUIJN, N., Inequalities concerning minors and eigenvalues, Nieuw Arch. Wisk., vol. 4, pp. 1835 (1956). [125] BRUIJN, N. and G. SZEKERES, On some exponential and polar representatives of matrices, Nieuw Arch. Wisk., vol. 3, pp. 2032 (1955).
BIBLIOWIAPIIY
355
*[126]
11271 11281
BuLOAKOV, B. V., The splitting of rectangular matrices, Dokl. Akad. Nauk SSSR,
[1291
[1301
*1131]
[1321
vol. 85, pp. 2124 (1952). CAYLEY, A., A memoir on the theory of matrices, Phil. Trans. London, vol. 148, pp. 1737 (1857) ; Coll. Works, vol. 2, pp. 47596. COLLATZ, L., Einschliessungssato fur die charakteristischen Zahlen von Matrizen, Math. Z., vol. 48, pp. 22126 (1942). Ober monotone systeme linearen Ungleichungen, J. Reine Angew. Math., vol. 194, pp. 19394 (1955). CREMER, L., Die Verringerung der Zahl der Stabilitatskriterien bei Yoraussetzu.ng positiven koeffizienten der charakteristischen Gleichung, Z. Angew. Math. Mech., vol. 33, pp. 22227 (1953). DANILEvsxIi, A. M., On the numerical solution of the secular equation, Mat. Sb., vol. 2, pp. 16972 (1937). DILTBERTO, S., On systems of ordinary differential operations. In: Contributions
to the Theory of Nonlinear Oscillations, vol. I, edited by S. Lefschetz (Annals of Mathematics Studies, No. 20). Princeton: Princeton Univ. Press (1950),
pp. 138.
DMITRIEV, N. A. and E. B. DYNKIN, On the characteristic roots of stochastic matrices, Dokl. Akad. Nauk SSSR, vol. 49, pp. 15962 (1945). Characteristic roots of Stochastic Matrices, Izv. Akad. Nauk, Ser. Fiz*[133a] Mat., vol. 10, pp. 16794 (1946). [134) DoBSCH, 0., Matrixfunktionen beschrankter Schwankung, Math. Z., vol. 43, pp. 35388 (1937).
*[1331
*1135]
DONSKAYA, L. I., Construction of the solution of a linear system in the neighbor
*[136]
*[137]
* [ 138]
hood of a regular singularity in special cases, Vestnik Leningrad. Univ., vol. 6 (1952). On the structure of the solution of a system of linear differential equations in the neighbourhood of a regular singularity, Vestuik Leningrad. Univ.,
vol. 8, pp. 5564 (1954). DuBNOV, Y. S., On simultaneous invariants of a system of affinors, Trans. Math. Congress in Moscow 1927, pp. 23637. On doubly symmetric orthogonal matrices, Bull. Ass. Inst. Univ. Moscow, pp. 3335 (1927). On Dirac's matrices, U. zap. Univ. Moscow, vol. 2, pp. 2, 4348 (1934). DUBNOV, Y. S. and V. K. IvANOV, On the reduction of the degree of affinor polynomials, Dokl. Akad. Nauk SSSR, vol. 41, pp. 99102 (1943). DUNCAN, W., Reciprocation of triplypartitioned matrices, J. Roy. Aero. Soc., Vol. 60, pp. 131.32 (1956). EOERVARY, E., On a lemma of Stieltjes on matrices, Acta. Sci. Math., vol. 15, pp. 99103 (1954). On hypermatrices whose blocks are commutable in pairs and their application in latticedynamics, Aeta Sci. Math., vol. 15, pp. 21122 (1954). EPSTEIN, M. and H. FLANDERS, On the reduction of a matrix to diagonal form, Amer. Math. Monthly, vol. 82, pp. 16871 (1955).
*[139]
1401 11411
[1421
[143]
[144]
*[1451
[1461
ERSHov, A. P., On a method of inverting matrices, Dokl. Akad. Nauk SSSR,
vol. 100, pp. 20911 (1955). ERUaIN, N. P., Sur la substitution. exposante pour quelques systCmes irreguliers, Mat. Sb., vol. 42, pp. 74553 (1935).
*[1471
Exponential substitutions of an irregular system of linear differential equations, Dokl. Akad. Nauk SSSR, vol. 17, pp. 23536 (1935).
INDEX .
[Numbers in italics refer to Volume Two]

Characteristic direction, 73
skew-symmetric, of matrix, 268
Adjoint matrix, hermitian, 57
Components, 165
Bunyakovskii's inequality, 183
Coordinate transformation, 53
BASIS(ES), 309
Danilevskii, 286
Cayley, inequality of, 216
Characteristic matrix, 338
Compound matrix, 4
Constraint, 182
Birkhoff, 9
Cauchy index, of space, 10, 6
Axes, 242, 333
semidefinite, 306, 83
restricted, 309, 302
hermitian, 205
real, 296
singular, 298
bilinear, 331, 332
quadratic, canonical form of, 333
positive definite, 337
positive semidefinite, 338
negative definite, 337
negative semidefinite, 338
rank of, 334, 336
signature of, 333, 304
discriminant of, 305
pencil of, see pencil
Hankel form, 294
Hankel matrix, 233, 294
infinite, index of, 334
HADAMARD INEQUALITY, 252
Hamilton-Cayley theorem, 83, 286
Hermite, 210
Hermite-Biehler theorem, 228
Hurwitz, 190
Hurwitz matrix, 190
Hyperlogarithm, 169
IDENTITY OPERATOR, 66
Imprimitivity, index of, 80
Ince, 122
Inertia, law of, 297, 304
Integral, multiplicative, 132, 134
Invariant plane, 87
JACOBI, 303
Jacobi matrix, 99
Jordan basis, 201, 202
Jordan block, 151
Jordan chains of columns, 201
Jordan form of matrix, 152, 165
Jordan matrix, 152, 201
KARPELEVICH, 87
Kernel of λ-matrix, 130
Kolmogorov, 39, 92
Kotelyanskii, 71
Krein, 250
Kronecker, 40
Krull, 202
Krylov, 203
LAGRANGE, 101
Lagrange interpolation polynomial, 101
Lagrange-Sylvester interpolation polynomial, 97
Divisors, elementary, 142, 144, 147
Dmitriev, 87
Domain of stability, 232
Dynkin, 87
EIGENVALUE, 69
Elements of matrix, 1
Elimination method of Gauss, 23ff.
Equivalence, of matrices, 132, 133
of pencils, 25
Ergodic theorem for Markov chains, 95
Erugin, 124
Euler-D'Alembert, formula of, 122
FACTOR SPACE, 183
Faddeev, 87
Field, 1
Forces, linear superposition of, 28
Form, bilinear, 294
geometrical theory of, 238
Fourier series, 261
Frobenius, theorem of, 53
Function, entire, 172, 175
Gauss, algorithm of, 23ff.
Gaussian form of matrix, 39
GANTMACHER, 103
Golubchikov, 81
Governors, 232
Gram, 247
Gramian, 246, 247
Group, 18
unitary, 268
Gundelfinger, 303
theorem of, 300
admissible, 194
strict, 132
generalized, 169
singular, 25
method of, 299ff.
criterion of, 304
left value of, 81
Ince, identity of, 251
Left value, 171
equivalence in the sense of, 120
equivalent, 133, 136, 242
with same real part of spectrum, 89
uniqueness of, 88
homogeneous, 118
theorem of, 50
(im)primitive, 80
Lappo-Danilevskii, 117
Legendre polynomials, 258
Liénard, 221
Liénard-Chipart stability criterion, 221
Limit of sequence of matrices, 51, 3
Linear (in)dependence of vectors, 52
Linear transformation, see operator (linear)
Logarithm of matrix, 239
Lyapunov, 117, 125
theorem of, 118
Lyapunov matrix, 117
Lyapunov transformation, 117
MACMILLAN, 115
Mapping, affine, 96
Markov, 242, 245
Markov chain, 88
acyclic, 96
(ir)reducible, 88
fully regular, 88
regular, 88
cyclic, 96
Markov parameters, 233, 240
Matricant, 127
Matrices, addition of, 4
commuting, 5
congruence of, 296
difference of, 5
equivalence of, 132, 133
left-equivalence of, 130
multiplication of, 5
product of, 6
similarity of, 67
unitary similarity of, 242
Matrix, adjoint, 82
annihilating polynomial of, 89
applications to differential equations, 116ff.
blocks of, 41
canonical form of, 135, 139, 152
cells of, 41
characteristic, 82
characteristic polynomial of, 82
column, 2
companion, 149
completely reducible, 50
complex, 1
compound, 19ff.
computation of power of, 109
constituent, 105
cyclic form of, 50
decomposition into triangular factors, 33
derivative of, 114
determinant of, 1
diagonal, 3
diagonal form of, 61ff.
dimension of, 1
elementary, 132
elementary divisors of, 142, 144
function of, 95ff.
fundamental, 73
Gaussian form of, 39
idempotent, 226
infinite, 63
integral, 113
invariant polynomials of, 139, 141
inverse of, 15
irreducible, 50
Jordan form of, 152, 201
minimal polynomial of, 89
minor of, 2
multiplication on left by H, 17
multiplication by number, 4
normal form of, 60, 173
normalized, 90
of coordinate transformation, 60
of simple structure, 73
order of, 1
orthogonal, 18
partitioned, 41
period of, 96
rank of product, 11
rank of, 2
reduced, 90
representation of as product, 17
row, 2
similarity of, 67
skew-symmetric, 18
spectrum of, 105
square, 1
symmetric, 18
trace of, 87
transpose of, 19
triangular, 18
unit, 12
unitary, 242
group property, 264, 266
normal form of, 126
quotient of, 185
dimension of, 56
logarithm of, 239
components of, 105
characteristic polynomial of, 82
and linear operator, 82
by matrix, 5
oscillatory, 263
positive definite, 268
fully regular, 243
hermitian, 244, 245
normal, 269
nonnegative, 50
nonsingular, 15
(im)proper, 280, 281
orthogonal, 263, 280
of first kind, 281
of second kind, 281
permutable, 7
polynomial, see polynomial matrix
positive, 50
positive semidefinite, 269, 274
power of, 12
computation of, 109
power series in, 113
principal minor of, 2
quasi-triangular, 42, 43
reducible, 50
regular, 88
root of nonsingular, 233
root of singular, 234ff.
row, 2
semidefinite, 274
singular, 2
of simple structure, 73
skew-symmetric, 18, 19
spectra of, 104
spur of, 87
square, 1
square root of, 233
stochastic, 83
fully regular, 88
subdiagonal of, 13
superdiagonal of, 13
symmetric, 19
totally nonnegative, 98
totally positive, 98
trace of, 87
transforming, 60
transpose of, 19
triangular, 18
upper triangular, 18
upper quasi-triangular, 43
unit, 12
unitary, 242, 275
with nonnegative elements, 50
Matrix addition, properties of, 4
Matrix equations, 215ff.
uniqueness of solution, 218
Matrix multiplication, 5
Matrix polynomials, 76
left quotient of, 78
multiplication of, 77
Maxwell, 172
Mean, of series, 102
Metric, euclidean, 242
hermitian, 243
positive definite, 244
positive semidefinite, 243
Minimal indices for columns, 38
Minor, 2
almost principal, 103
principal, 2
of zero density, 98
Modulus, of vector, 243
Moments, problem of, 236, 237
Motion, of mechanical system, 125
asymptotic, 125
Naimark, 250
Nilpotency, index of, 226
Norm, of vector, 52
Normal form of matrix, 60, 173
Null vector, 52
Nullity of vector space, 64
Number space, n-dimensional, 51
OPERATIONS, elementary, 134
Operator (linear), 55, 66
adjoint, 265
normal, 268, 280
hermitian, 274, 275
positive definite, 274
positive semidefinite, 274, 276
orthogonal, 280
projective, 248
unitary, 269, 281
semidefinite, 282
skew-symmetric, 281, 286
symmetric, 281, 283
of simple structure, 243
nilpotent, 226
identity, 66
invariant plane of, 87
matrix corresponding to, 272
polar decomposition of, 276, 286
representation of as product, 280
real, 280
transposed, 280
absolute, 248
dilatation of, 310
characteristic equation of, 338
characteristic value of, 310
characteristic values of, 338
coefficient, 245
extension of, 268
spectrum of, 310
principal column of, 310
principal matrix of, 312
principal vector of, 310, 338
Operators, addition of, 176
multiplication of, 177
Orlando, formula of, 196
Orthogonal complement, 266
Orthogonalization, 256
of sequences, 258
Oscillations, small, 326
PARAMETERS, of Markov chain, 89
Parseval, formulas of, 261
Peano, 127
Pencil(s) of matrices, 24
canonical form of, 26
elementary divisors of, 29
rank of, 25
regular, 26
singular, 29
strict equivalence of, 24
Pencil of hermitian forms, 338
Pencil of quadratic forms, 310
characteristic equation of, 310
positive pair of, 176
Period, of Markov chain, 96
Permutation of matrix, 50
Perron, formula of, 101, 103
theorem of, 53
Petrovskii, 116
Polynomial(s), annihilating, 89, 176
characteristic, 71, 82
interpolation, 97, 101
invariant, 139, 144
minimal, 89, 176, 177
monic, 76
of Chebyshev, 259
of Legendre, 258
of Markov, 242
scalar, 76
Polynomial matrix, 76, 130
elementary operations on, 130, 131
order of, 76
rank of, 133
regular, 76
Power of matrix, 12
Probability, limiting, 93
mean limiting, 94
transition, 96
Product, inner, of vectors, 243
of matrices, 6
of operators, 177
of sequences, 58
scalar, 242, 244
Pythagoras, theorem of, 255
QUASI-ERGODIC THEOREM, 103
Quasi-triangular matrix, 43
Quotients of matrices, 17
RANK, of matrix, 2
of pencil, 25
of vector space, 64
Relative concepts, 183
Right value, 81
Ring, 17
Romanovskii, 89, 96
Root of matrix, 233, 234ff.
Rotation of space, 280
Routh, 172
Routh matrix, 194
Routh scheme, 191
Routh-Hurwitz, criterion of, 194
Row matrix, 2
SCHLESINGER, 127
Schur, 46
Schwarz, inequality of, 255
Sequence of vectors, 256, 260
convergence of, 260
Series, infinite, of matrices, 95
fundamental, 260
Signature of quadratic form, 296, 298
Similarity of matrices, 67
Singularity, of system, 171
Smirnov, 143
Space, euclidean, 242
unitary, 243
factor, 183
infinite-dimensional, 244, 242
Vector space, 51
finite-dimensional, 51
infinite-dimensional, 51
basis of, 51
defect of, 64
dimension of, 51
nullity of, 64
rank of, 64
Spectrum, of matrix, 105
characteristic, 338
Spur, 87
Square(s), independent, 297
unit sum of, 297
Stability, of matrix, 121, 125
of motion, 232
of solution of linear system, 118, 125
domain of, 232
Stieltjes, theorem of, 334
Stodola, 172
Sturm, 173
theorem of, 175
Sturm chain, 175
generalized, 176
Subdiagonal, 13
Subspace, 51
cyclic, 185
generated by vector, 185
invariant, 69
Substitution, integral, 129
Suleimanova, 87
Superdiagonal, 13
Sushkevich, 25
Sylvester, 33
identity of, 32, 33
inequality of, 66
Systems of differential equations, 116ff.
application of matrices to, 116ff.
reducible, 118
regular, 129
written as matrix equation, 117
Systems of vectors, biorthogonal, 267
orthogonal, 267
orthonormal, 267
TRACE, 19, 87
Transformation, coordinate, 60
linear, 55
Lyapunov, 117
of coordinates, 60
Transforming matrix, 60
Transpose, 19
Transposition, 19
UNIT SPHERE, 315
Unit vector, 314
VALUE(S), characteristic, 69, 310, 317
latent, 69
proper, 69
left and right, 81
maximal, 53
extremal properties of, 317, 318
Vector(s), 51
angle between, 242, 248
bundle of, 242
complex, 244
coordinates of, 53
inner product of, 243
Jordan chain of, 201, 202
latent, 69
length of, 243
linear dependence of, 52
linear independence of, 52
modulus of, 243
norm of, 52
normalized, 243, 248
null, 52
orthogonal, 248
orthogonalization of sequence, 256
positive, 63
principal, 310, 338
projecting, 248
projection of, 248
proper, 69
real, 243
test for, 51
unit, 314
Vector space, see space
Volterra, 125, 129
Vyshnegradskii, 172, 232
WEIERSTRASS, 25
theorem of, 146, 147
modulo I, 251
equivalent, 133, 143
stability of solution, 118
essential, 92
nonessential, 92
limiting, 92
states, 92, 96
of function, 172
characteristic, 71
biorthogonal, 267