You are on page 1of 443

CARL-ERIK FROBERG

Institute for Computer Sciences
University of Lund, Sweden

Introduction to
NUMERICAL ANALYSIS
SECOND EDITION

A
VV

ADDISON- WESLEY PUBLISHING COMPANY
Reading, Massachusetts - Menlo Park, California - London - Don Mills, Ontario

This volume is an English translation of Larobok i numerisk analys by Carl-Erik Froberg,
published and sold by permission of Svenska Bokforlaget Bonniers, the owner of all rights to
publish and sell the same.

Copyright © 1965, 1969 by Addison-Wesley Publishing Company, Inc.
Philippines copyright 1969 by Addison-Wesley Publishing Company, Inc.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system,
or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or
otherwise, without the prior written permission of the publisher. Printed in the United States of
America. Published simultaneously in Canada. Library of Congress Catalog Card No. 73-79592.
ISBN 0.201420863
CDEFGHIJKL-HA 787654

Preface
Numerical analysis is a science
-computation is an art.

The present text in numerical analysis was written primarily to meet the demand
of elementary education in this field at universities and technical institutes. But
it is also believed that the book will be useful as a handbook in connection with
numerical work within natural and technical sciences. Compared with the first
edition a thorough revision and modernization has been performed, and quite
a few new topics have been introduced. Two completely new chapters, integral
equations and special functions, have been appended. Major changes and addi-
tions have been made, especially in Chapters 6, 14, and 15, while minor changes
appear in practically all the remaining chapters. In this way the volume of the
book has increased by about 30%.
An introductory text must have a two-fold purpose: to describe different
numerical methods technically, as a rule in connection with the derivation of the
methods, but also to analyze the properties of the methods, particularly with
respect to convergence and error propagation. It is certainly not an easy task to
achieve a reasonable balance on this point. On one hand, the book must not
degenerate into a collection of recipes; on the other, the description should not
be flooded by complete mathematical analyses of every trivial method. Actually,
the text is not written in a hard-boiled mathematical style; however, this does
not necessarily imply a lack of stringency. For really important and typical
methods a careful analysis of errors and of error propagation has been performed.
This is true especially for the chapters on linear systems of equations, eigenvalue
problems, and ordinary and partial differential equations. Any person who has
worked on numerical problems within these sectors is familiar with the feeling
of uncertainty as to the correctness of the result which often prevails in such
cases, and realistic and reliable error estimations are naturally highly desirable.
In many situations, however, there might also be other facts available indicating
how trustworthy the results are.
Much effort has been spent in order to present methods which are efficient
for use on a computer, and several older methods not meeting this requirement
have been left out. As to the mathematical background only knowledge of ele-
mentary linear algebra, differential and integral calculus, and differential equations
is presupposed. Exercises are provided at the end of each chapter. For those

vi PREFACE

involving numerical computation access to a desk calculator and a suitable table
(e.g. Chambers' shorter six-figure mathematical tables) is assumed.
Many suggestions for improvements have been presented, both in reviews
and otherwise, and they have also been taken into account as far as possible.
This helpful interest is hereby gratefully acknowledged.

Lund, December 1968 C-E. F.

Contents

Chapter 1 Numerical Computations
1.0 Representation of numbers . . . . . . . . . . . . . . . . . . 1
1.1 Conversion . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 On errors . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Numerical cancellation . . . . . . . . . . . . . . . . . . . . 9
1.4 Computation of functions . . . . . . . . . . . . . . . . . . . 10
1.5 Computation of functions, using binary representation . . . . . . 12
1.6 Asymptotic series . . . . . . . . . . . . . . . . . . . . . . . 14

Chapter 2 Equations
2.0 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.1 Cubic and quartic equations. Horner's scheme . . . . . . . . . . 19
2.2 Newton-Raphson's method . . . . . . . . . . . . . . . . . . 21
2.3 Bairstow's method . . . . . . . . . . . . . . . . . . . . . . 28
2.4 Graeffe's root-squaring method . . . . . . . . . . . . . . . . . 32
2.5 Iterative methods . . . . . . . . . . . . . . . . . . . . . . . 33
2.6 The Q.-D.-method . . . . . . . . . . . . . . . . . . . . . . 39
2.7 Equations with several unknowns . . . . . . . . . . . . . . . . 42

Chapter 3 Matrices
3.0 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.1 Matrix operations . . . . . . . . . . . . . . . . . . . . . . 47
3.2 Fundamental computing laws for matrices . . . . . . . . . . . . 51
3.3 Eigenvalues and eigenvectors . . . . . . . . . . . . . . . . . . 52
3.4 Special matrices . . . . . . . . . . . . . . . . . . . . . . . 58
3.5 Matrix functions . . . . . . . . . . . . . . . . . . . . . . . 62
3.6 Norms of vectors and matrices . . . . . . . . . . . . . . . . . 67
3.7 Transformation of vectors and matrices . . . . . . . . . . . . . 73
3.8 Limits of matrices . . . . . . . . . . . . . . . . . . . . . . 76

Chapter 4 Linear Systems of Equations
4.0 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.1 The elimination method by Gauss . . . . . . . . . . . . . . . 82
vii

viii CONTENTS

4.2 Triangularization . . . . . . . . . . . . . . . . . . . . . . . 84
4.3 Crout's method . . . . . . . . . . . . . . . . . . . . . . . 86
4.4 Error estimates . . . . . . . . . . . . . . . . . . . . . . . . 89
4.5 Iterative methods . . . . . . . . . . . . . . . . . . . . . . . 93
4.6 Complex systems of equations . . . . . . . . . . . . . . . . . 100

Chapter 5 Matrix Inversion
5.0 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.1 Jordan's method . . . . . . . . . . . . . . . . . . . . . . . 103
5.2 Triangularization . . . . . . . . . . . . . . . . . . . . . . . 104
5.3 Choleski's method . . . . . . . . . . . . . . . . . . . . . . 106
5.4 The escalator method . . . . . . . . . . . . . . . . . . . . . 107
5.5 Complex matrices . . . . . . . . . . . . . . . . . . . . . . 108
5.6 Iterative methods . . . . . . . . . . . . . . . . . . . . . . . 109

Chapter 6 Algebraic Eigenvalue-Problems
6.0 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 113
6.1 The power method . . . . . . . . . . . . . . . . . . . . . . 115
6.2 Jacobi's method . . . . . . . . . . . . . . . . . . . . . . . 120
6.3 Given's method . . . . . . . . . . . . . . . . .
. . . . . . 125
6.4 Householder's method . . . . . . . . . . . . . . . . . . . . 130
6.5 Lanczos' method . . . . . . . . . . . . . . . . . . . . . . . 132
6.6 Other methods . . . . . . . . . . . . . . . . . . . . . . . . 134
6.7 Complex matrices . . . . . . . . . . . . . . . . . . . . . . 134
6.8 Hyman's method . . . . . . . . . . . . . . . . . . . . . . . 137
6.9 The QR-method by Francis . . . . . . . . . . . . . . . . . . 142

Chapter 7 Linear Operators
7.0 General properties . . . . . . . . . . . . . . . . . . . . . . 147
7.1 Special operators . . . . . . . . . . . . . . . . . . . . . . . 148
7.2 Power series expansions with respect to 5 . . . . . . . . . . . . 151
7.3 Factorials . . . . . . . . . . . . . . . . . . . . . . . . . . 158

Chapter 8 Interpolation

8.0 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 164
8.1 Lagrange's interpolation formula . . . . . . . . . . . . . . . . 164
8.2 Hermite's interpolation formula . . . . . . . . . . . . . . . . 168
8.3 Divided differences . . . . . . . . . . . . . . . . . . . . . . 170
8.4 Difference schemes . . . . . . . . . . . . . . . . . . . . . . 173
8.5 Interpolation formulas by use of differences . . . . . . . . . . . 174
8.6 Throw-back . . . . . . . . . . . . . . . . . . . . . . . . . 181
8.7 Subtabulation . . . . . . . . . . . . . . . . . . . . . . . . 183
8.8 Special and inverse interpolation . . . . . . . . . . . . . . . . 183
8.9 Extrapolation . . . . . . . . . . . . . . . . . . . . . . . . 185

Chapter 9 Numerical Differentiation
Chapter 10 Numerical Quadrature
10.0 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 195
10.1 Cote's formulas . . . . . . . . . . . . . . . . . . . . . . . 198
10.2 Gauss's quadrature formulas . . . . . . . . . . . . . . . . . . 205
10.3 Chebyshev's formulas . . . . . . . . . . . . . . . . . . . . . 213
10.4 Numerical quadrature by aid of differences . . . . . . . . . . . 217

Chapter 11 Summation
11.0 Bernoulli numbers and Bernoulli polynomials . . . . . . . . . . 224
11.1 Sums of factorials and powers . . . . . . . . . . . . . . . . . 228
11.2 Series with positive terms . . . . . . . . . . . . . . . . . . . 231
11.3 Slowly converging series. Euler's transformation . . . . . . . . . 235
11.4 Alternating series . . . . . . . . . . . . . . . . . . . . . . . 239

Chapter 12 Multiple Integrals

Chapter 13 Difference Equations
13.0 Notations and definitions . . . . . . . . . . . . . . . . . . . 250
13.1 The solution concept of a difference equation . . . . . . . . . . 251

X CONTENTS

13.2 Linear homogeneous difference equations . . . . . . . . . . . . 252
13.3 Differences of elementary functions . . . . . . . . . . . . . . . 254
13.4 Bernoulli's method for algebraic equations . . . . . . . . . . . . 256
13.5 Partial difference equations . . . . . . . . . . . . . . . . . . 257

Chapter 14 Ordinary Differential Equations
14.0 Existence of solutions . . . . . . . . . . . . . . . . . . . . . 260
14.1 Classification of solution methods . . . . . . . . . . . . . . . 264
14.2 Different types of errors . . . . . . . . . . . . . . . . . . . . 265
14.3 Single-step methods . . . . . . . . . . . . . . . . . . . . . . 265
14.4 Taylor series expansion . . . . . . . . . . . . . . . . . . . . 266
14.5 Runge-Kutta's method . . . . . . . . . . . . . . . . . . . . 268
14.6 Multi-step methods . . . . . . . . . . . . . . . . . . . . . . 271
14.7 Milne-Simpson's method . . . . . . . . . . . . . . . . . . . 279
14.8 Methods based on numerical integration . . . . . . . . . . . . . 279
14.9 Systems of first-order linear differential equations . . . . . . . . . 284
14.10 Boundary value problems . . . . . . . . . . . . . . . . . . . 286
14.11 Eigenvalue problems . . . . . . . . . . . . . . . . . . . . . 288

Chapter 15 Partial Differential Equations
15.0 Classification . . . . . . . . . . . . . . . . . . . . . . . . 294
15.1 Hyperbolic equations . . . . . . . . . . . . . . . . . . . . . 296
15.2 Parabolic equations . . . . . . . . . . . . . . . . . . . . . . 300
15.3 Elliptic equations . . . . . . . . . . . . . . . . . . . . . . . 309
15.4 Eigenvalue problems . . . . . . . . . . . . . . . . . . . . . 315

Chapter 16 Integral Equations
16.0 Definition and classification 320
16.1 Fredholm's inhomogeneous equation of the second kind . . . . . . 321
16.2 Fredholm's homogeneous equation of the second kind . . . . . . 326

. . . . . . . . . . 349 17. . . . . . . . . . . . . . . . . . . . . 384 19. . . . . .5 Approximation with Chebyshev polynomials . . . . . . . . . . . . . . . . . . . . . . . . . 368 18. . 335 17. . .2 Random walks . 347 17.2 Polynomial approximation by use of orthogonal polynomials . . . . . .4 Functions defined through linear homogeneous differential equations 370 18. . . . . 377 18. . . . . . . . . . . . . 336 17. . . 328 16. . . . . . . . .1 Random numbers . . . . . 354 17. . .3 Computation of definite integrals . . .3 Approximation with trigonometric functions . . . . . . . . 378 Chapter 19 The Monte Carlo Method 19. . . . . . . . . . 391 . .6 Approximation with continued fractions . . . . . . . . . . . . . .2 The beta-function . . . . . . . . . . . . 359 18. . . . . . . . . . . . .0 Introduction . .7 Approximation with rational functions . . 371 18. . . . . . . . . . .4 Approximation with exponential functions . . . .1 The gamma-function . 388 19. . . . . . . . . . . 384 19.0 Introduction'. . . . . . .4 Volterra's equations . . . . . . . . . . . .5 Bessel functions . . .0 Different types of approximation . . . . . . . . . . . . . . . .4 Simulation .7 Spherical harmonics . . . 359 18. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3 Fredholm's equation of the first kind . . . . .1 Least square polynomial approximation . . . . . . . . . . . . . . . . . . . . . . . . 367 18. . . . 344 17. . . . . . . . . . . . . . . 355 Chapter 18 Special Functions 18. . . . 390 19. . . . . . .5 Connection between differential and integral equations . . . .3 Some functions defined by definite integrals . 329 16. . . . . . . . 340 17. . . . . . . . . . . . 332 Chapter 17 Approximation 17. .6 Modified Bessel functions . . . . . . . . . .

. . . . . 394 20. . . . .1 The simplex method . . .2 The transport problem . . . . . . . 408 Answers to Exercises 419 Index 429 . . . . . . . . . . . . . . . . . . . . . 394 20. . . . . . . . . . 403 20.3 Quadratic. . . . . .0 Introduction . . .Chapter 20 Linear Programming 20. . . . . . . . and dynamic programming . . integer. . . . . . . . . . . .

N different symbols 0. .. The number 10. + Here 0 S a. is more satisfactory. for sup- pose that we have a = a. . the word sedecimal. (N .. we have assumed that a can be represented by a finite number of digits.1)(N*-1 + . an arbitrary real number can always be approximated in this manner.. Representation of numbers As is well known. which plays a fundamental role." . with a pure Latin origin. In the general case.. + a_*N-* < (N .Chapter 1 Numerical computations A mathematician knows how to solve a problem-but he can't do it. The choice of 10 as the base of the system has evidently been made for historic reasons. W. However. A nonnegative number a can now be written in the form a = a.. N'" + a.N-1 + ..Nn-l + . 1. The representation is unique.-. our usual number system is the decimal or decadic system... + N-*) <(N .. * Previously. E. I . and as a matter of fact any positive integer N > I could have been chosen instead. 1. 2. MILNE."-.. further. .. (a. Then we find + . since hex is Greek and decimal is Latin.*N" + b. < N and. is called the base of the sys- tem.1) are necessary to represent an arbitrary number. by 0) .N + ao + a_. N"-' + .. Those numbers N which have been discussed more seri- ously are displayed in the following table: Base N Number system 2 Binary 3 Ternary 8 Octal 10 Decimal 12 Duodecimal 16 Sedecimal* When the base is given.. the word hexadecimal was used in this case.1) . 1N-1' =N^'. + a.Nn + b.0." # 0 .

Nq (q positive or negative integer or zero) . of course. Both systems have been widely used. we get R. and 15 are usually represented by A._ . C. and dividing b by N..b. we have directly IS. We split a into an integer part b and a fractional part c. = x0. by N. should be determined. . x..". The com- putations must be performed in the M-representation and N itself must also be given in this representation (in the N-representation N = 10). < N. The construction of high-speed electronic computers has greatly promoted the use of the binary number system." < N and . This representation.2 NUMERICAL COMPUTATIONS SEC. ._.. x x2. Next. a. . < N^'."N"' = R."-. Very large or very small numbers are often expressed in a so-called floating representation using mantissa and exponent packed together. . ."M" + x. x. D. Conversion We will briefly treat the problem of translating a number from one represen- tation to another."N" = R.. A certain number a can then be written a = p . Now we can write (a. In the binary case (N = 2). using suitable notations. Putting a . since 8 and 16 are powers of 2.1. 1. 13. is not unique." = b and so on.. we have. must be expressed as 1-digit sym- bols in the N-representation._. 12._. Then we have b = x..)N" + (a. treating b first. From the preceding proof. B.N' ' + .b.. Thus we have the equation: a = a. Suppose that the number a is given in the M-representation and that we want to find it in the N-representation." . .". we find the digits of the fractional part as the consecutive integer parts when c is repeatedly multiplied by N and the integer parts removed. or a < N". . The octal and sedecimal systems are al- most trivial variants.)N'"-' + = 0 or..-IN-1 + = S. = x and it is obvious that in general x0. Note that x. < N. In an analogous way. we get the quotient Q2 and the remainder R. that is a. E. we get a quotient Q. The signs should be chosen in such a way that 0 < a.. First we show that m = p. 11. in the latter case the numbers 10. are known and the coefficients x. . for example. respectively. p N° = (pN2) N°-2 . for example.N < /3. p is often chosen in such a way that .. For sup- pose.. which is absurd.1 . dividing Q. 0 < x.Nr + x. are the consecutive remainders when b is repeatedly divided by N.N + x0. where the coefficients a.. that p < m.. 14.. 1." = 0. and a remainder R. and F. then we would get a = b._._.I < N^' and a.N? + x.I < p < - or j < p < 1.Np + < Np+' < N".a. the corresponding number is then said to be normalized.

Very often round-off ton deci- mals is made. remainder 1. Analogously. it has to be decided from the context whether they are significant or not.2.x.3-'+ 1 1. On errors In this section we shall discuss different kinds of errors. 6 . If a number is formed as a result of a physical measurement or of a numerical computation.SEC. Independently of the nature of the error.. as a rule.1 is the relative error. Then the absolute error is defined by e = x . 1. in this case the digits beyond position n (which are made equal to zero) are simply left out.. 1.5250 rounded to 2d is 8. 3. last quotient .. remainder 1. and the nature of their growth. ON ERRORS 3 EXAMPLE Convert the decimal number 176. 58 = 3 = 19. it is left unchanged. Here we will briefly comment upon the concept of significant digits.3 = 58.3' + 1 .524=0. A number is rounded to position n by making all digits to the right of this position zero. When a number has been rounded to include only significant digits. the digit in position n is left unchanged or increased by one unit according as the truncated part is less or greater than half a unit in position n.31 + 1 31+2.14159.1415926 rounded to 5 decimals (5d) is 3. be the exact number and x an approxi- mation. Let x. It is obvious that a digit in a place farther to the left carries a larger amount of information than a digit to the right.6750 rounded to 2d is 1. otherwise. remainder 0. it ought to be given with so many significant figures that the maximal . We can say very roughly that the significant figures in a number are those which carry real information as to the size of the number apart from the exponential portion.524 to ternary form. If the number is an integer ending with one or several zeros.3 = 2.30. one can define an absolute and a relative error. . = 1 3-1+ 1 3-'+ 2. they are significant by definition.68. 8.52.2. ends with the last nonzero digit. If the fractional part ends with one or several zeros. while lefxo` = Jxfx. remainder 2. 0.2. EXAMPLES 8632574 rounded to 4 significant figures (4s) is 8633000. the digit in position n is increased by one unit if it is odd. these form a group which starts with the first nonzero digit and. If it is exactly equal to half a unit. 19 3 = 6. We find: 176 . their sources. Thus 176 = 20112 = 2.112010222.

but in this special case the associative law is fulfilled. Since the product of two n-digit numbers in general contains 2n digits and the quotient an infinite number. and so we can strengthen the inequality replacing JN-" by N. Then addition and sub- traction will give no round-off errors (we suppose that the results do not grow out of range). the difference is zero trivially. On the other hand. The distributive law is no longer satisfied exactly: lax(b+c)-axb. this figure should be rounded to 83 miles. Our basic inequalities then take the follow- ing form: laxb . The last condition is satisfied in many high-speed electronic computers.axcl laxb . however. the difference must be a multiple of N-*. and so we get: Jax(bxc)-(axb)xcl<N-". and further that only such numbers x are allowed for which -1 < x < 1. For example. .abl S N ". la=b-a/b15 N-%. it is rather meaningless to give the distance between two cities as 82. error does not exceed a few units in the last significant digit. We will now give an account of the round-off error bounds for the elementary operations. However.763 miles. In this case. Thus we have proved that (a-=b)xb=a. -+axc - N-*. Finally we consider 1(a+b)xb-al «N-"(l + Ibl) The difference must be a multiple of N-" and is thus zero except when JbJ = 1. 1. We take this fact into account by introducing pseudo operations denoted by x for multiplication and + for division.4 NUMERICAL COMPUTATIONS SEC. Hence the commutative law for multiplication is satisfied. the initial expression contains only quantities rounded to n digits. It is evident that a x b = b x a. Let us suppose that we are working in a number system with base N. using n digits.2. The associative law for multiplication is also modified: lax(bxc)-(axb)xcl JN *(2 + lal + Icl) - The expression in the last parentheses becomes 4 only when lal = lcl = 1. following von Neumann and Goldstine [2]. both results have to be rounded.

0. the computed sum of x and y is the exact sum of two modified numbers x' and y' which differ from x and y. 4x. We find directly that df = _ dx. we have (a x b) _ b .54 = 0.101 = 0.56 x (0. + 2 dx .65) x 0. we obtain other results.. 0.000053 106 = 0.al . which is a bad result when JbI is small. (0. 1. 10' = 0. ON ERRORS 5 On the other hand. as is shown in the following example. (0. t denoting the number of binary places in the mantissa. 4x.76 x 0.SEC. .Y) = (x .. we have the basic inequalities: fl(x + Y) = (x + Y)(1 + 6). Further details in this matter can be obtained from Wilkinson's book. fl(xxy)=xy(1+6). (0.243875 106 .65 = 0. .0. Thus for example. .0. 0.20 . fl(x x y).243875 .10" + 0. (0. EXAMPLES For n = 2 and N = 10.412648 . 0. When a floating representation is used instead. x1.10' = 0. .490000 101 + 0. ax. I ax.65 x 0. when the individual errors 4x.83 . fl(x .N-"(1 + Ibh-1) . by at most one part in 2'. + of dx1 + .y) for the results of the corresponding floating-point operations on x and y. fl(x _ Y) = (x/Y)(1 + 6)..Al + 6).243826 106) + 0... of the variables are known.06) -. with jej 5 2-'. .000049 105 + 0.56 x 0.531265 .56 x 0.35.36 .19 .243826 101 = 0.06 = 0..243826 106 = 0. A detailed examination of the errors involved in floating-point operations has been performed by Wilkinson [3]. we have: 1.54 = 0..0.412648 .412648 10') . .530000 10. 10'.06 = 0...412648 . respectively. In this case not even the associative law for addition is valid.65 x 0.243879 108 . In many cases one wants to estimate the error in a functionf(x. and fl(x .05 . 2.54) = 0.2.0. With the notations fl(x ± y).

20. all of them rounded to four decimal places.'xY' . while the growth takes place dynamically.2. and z = x/y. and hence we have z' = z + (11y)3 . When one tries to classify the errors in numerical computations it might be fruitful to study the sources of the errors and the growth of the individual errors. One such step means that from two approximations x' and y' of the exact numbers x and y we form the approximation z' of z by use of one of the four simple rules of arithmetic.. ax. 1. it is obvious that this case is extremely improbable. Local round-off errors.l . while for f = x. where terms of second and higher orders have been neglected. The error in z' is thus built up of propagated errors . 2. Round-off errors finally depend on the fact that practically each number in a numerical computation must be rounded to a certain number of digits. we have instead 5 Im.. and in practical computations.+ Idx*l . A simple example is rendered when data are obtained from a physical or chemical apparatus.1 x xi x It is only fair to point out that in general the maximal error bounds are rather pessimistic. I + Imxl . The maximum error is given by / l uxll + . we have (df)max < Idxd + Idx.005. + 1 I2 I I dx. Initial errors. Instead we compute z' = (x'/y')rouadad = (x + S)/(y + 77) + s. We shall here refrain from treating the "gross" errors in spite of the fact that they often play an important role in numerical computations and certainly do not lack interesting features. for example. a numerical computation proceeds in many steps. In the special case f = x. Well-known examples are computation of a definite integral through approximation with a sum.4x-. Local truncation errors. xTM*.. I + .. However. the maximum error is I 10-4 20.+ xn. + xz + .(x/y2)rI + s.000 = 1. 3. y' = y + ii. Truncation errors arise when an infinite process (in some sense) is replaced by a finite one. The initial errors are errors in initial data. Normally. Let us suppose that x' = x + 3. + imnI 1.l +. If. Then essentially three error sources remain: 1. From a statistical point of view one would expect that in about 99% of all cases the total error will not exceed 0. the errors have a tendency to cancel. )mex I I x.. The sources of the errors are essentially static. are added together.. or integration of an ordinary or partial differential equation by some difference method.000 numbers.6 NUMERICAL COMPUTATIONS SEC..

It can be of some interest to illustrate the different types of error a little more explicitly.. The dif- ference x' .f. where s < r. For this reason. A prob- lem with this property is said to be ill-conditioned. The total error is e = fe(x') .0000465196.SEC. Examples will be given in Chapters 2 and 4.3333(1 . To start with.0000333. Starting with the initial errors it might well be the case that they have a fatal effect on the solution. and the propagated error becomes E1 = e°'3333 _e1/3 = e°. under special circumstances the round-off errors can grow like a rolling snowball. The result is that instead of f. = f. In order to get a more detailed knowledge of error propagation we must iso- late the different types of errors.. = f2(x') . However. = -0.2. The difference e.f(x') is then the truncation error.(x') . As a rule. This phenomenon is known as instability and will be treated in considerable detail in Chapters 14 and 15. ON ERRORS 7 from x and y and further a new round-off error.f(x) = S. error estimates based on maximum errors in many cases are far too pessimistic. The difference e.x constitutes the initial error while the difference E. and this has the effect that the errors compensate each other to a large extent. + E2 + E3 . In normal cases the round-off errors are accumulated completely at random. In many cases f is such a function that it must be replaced by a simpler function f. = f(x') . The truncation errors can be made arbitrarily small by choosing N sufficiently large or h sufficiently small. such that N_ oo implies that the "approximate" solution approaches the "right" one. 1.(x') could be termed the propagated error from the roundings. The truncation errors usually depend on a certain parameter.Q°. In particular this may happen when such an error can be understood as a small component of a parasitic solution which one really wants to suppress. however. In practical com- putations the number x must be approximated by a rational number x' since no computer can store numbers with an infinite number of decimals. in differential equations N often has the form (b . are not exact but pseudo- operations of a type that has just been discussed.f(x) is the corresponding propagated error. N say. we try to compute e°-1"3 instead of e'"3. (often a truncated power series expansion off). . This means that small changes in initial data may produce large changes in the final results. Suppose that we want to dete mine ell' and that all calculations are performed with 4 decimals. We now choose the following specific example.a)lh where h is the interval length. (x') which is then a wrongly computed value of a wrong function of a wrong argument. the local truncation error is O(h') and the total trun- cation error is O(h'). The calculations performed by the computer.(x') we get another value f. Suppose that we want to computef(x) where x is a real number and f is a real function which we so far do not specify any closer. For example. This "compound-interest- effect" is very typical and plays a fundamental role in every error analysis.

we do not compute e= but instead + i!+2'+3 +4 for x .( 0. With forward analysis one follows the development of the errors from the initial values to the final result. In error estimations one can speak about a -priori estimations and a-posteriori estimations. e.0005 = 1.e. In the corresponding chapters we shall return to these problems in more explicit formulations.0. One might be content with showing that the method is convergent.0. of course. When different numerical methods are compared one usually considers the truncation errors first..0.3333° + )J = -0.. an asymp- totic formula for the error.3333 + 0.g.2. 6! Finally. + e2 + Ee = -0.8 NUMERICAL COMPUTATIONS SEC. i. In backward analysis one starts from a supposed error in the results tracing it backward to see between which limits the initial values must lie to produce such an error.3333° 5! + 0. The error analysis can now be performed on several different ambition levels. In the latter case the obtained results are used in the error analysis. s(h)_0 when h. Next.0555 + 0.e'13 = 6. In this case p(h) is an upper limit for the . is not known.0000296304 and the total error is s = 1. one may also derive actual error estimates of the type le(h)l S g(h) for all h < h°. I when h -. Examples of such problems are in first-hand linear systems of equations. Further the notions forward and backward analysis should be mentioned. Suppose that we consider the error e as a function of h where it is assumed that h . the summation of the truncated series is done with rounded values giving the result I + 0. however.(h). 1. Thus eg = -0. 0. Then one investigates how the errors depend on some suitable parameter which in the ideal case tends toward 0 or oo. and ordinary and partial differential equa- tions. the truncation error is ey .39552 96304 obtained with 10 decimals.3955 . instead of 1. eigenvalue computations.e. One might also derive results with respect to the convergence speed. who used it with great success for error analysis of linear systems of equations.0001124250. Hence.3955. Finally. i. This technique was introduced by Wilkinson. It might also happen that one can prove e(h)/q(h) --.3333. particularly important in con- nection with iterative processes and computations where each value depends on its predecessors. Ie(h)l S Cq. the first case is concerned with estimations performed without knowledge of the results to be computed. As can be understood from the name.00003 62750 .. Investigations of error propagation are. where C is a constant whose value.0062 + 0.

for example. Finally.5) dx + x dx+(1 .3. The Bessel functions J (x) are solutions of the differential equation (see Sec- tion 18. If a > 0 and a is small compared with a. Obviously. if we write E X = we obtain the root as approximately s/2a without loss of significance. SAMUEL JOHNSON. NUMERICAL CANCELLATION 9 error giving us a possibility to guarantee a certain accuracy. with _ (.tax + e = 0 has the two solutions x1=a+i/a:-r and x. Against this background it is not surprising that expressions which are completely equivalent from a mathe- matical point of view may turn out to be quite different numerically. Round numbers are always false. and a considerable amount of significance is lost.3. is expressed as the difference between two almost equal numbers. in [1].az-e. 1 _ 1 cosh x + sinhx a+b gives a very accurate result. 1. 1. Next. we know that cosh x = a. we present an example to show that one has to be careful when using mathematical formulas numerically. leading to a dangerous cancellation. and that we want to compute a-z. the root x.(x) k=o l k! (n+ k)! . sinh x = b. A more detailed discussion of these problems can be found.7I)y-0. suppose that for a fairly large value x.1)k(x/2)w+ Yk J.=a . On the other hand. Instead. We will restrict ourselves to a few examples on this matter.SEC. Numerical cancellation In the previous section it has been shown how several fundamental arithmetical laws must be modified in numerical applications. e-z_cosh x-sinhx=a-b. The second-degree equation xz .

J=(1) = 76830k.10 NUMERICAL COMPUTATIONS SEC.019563). It is obvious that this formula cannot be used here in this way. We start with the following values.(1) = 13081k. On the other hand.(x) - X is valid.000250).019565 (0. = 1 .. but rather will point to some general methods which have proved efficient.. and applying the same for- mula in the other direction.(1) = 0. J. and finally the computations are easy to check. A detailed discussion of this pheno- menon will be given in Chapter 14.114904 (0. we try to compute J (1) for higher values of n and obtain the following results (the correct figures are given in parentheses): J3(1) = 0. J4(1) ..4. J.765198 . We find that k = 1/668648. 1. 1. Computation of functions We are not going to treat this extensive chapter systematically. from which we obtain the correct values with an error of at most one digit in the sixth place. Suppose that the function y = exp (-x + arctan x= + 1) has to be tabulated from x = 0 to . J. J.114903). Using the recursion formula. First.000021).(1) = 14k . putting J.008605 (0. The constant k can be obtained from the identity J.(1) = k. J. J4(1) = 0. It is easy to show that the recursion formula Jp 1(x) = 2n.002486 (0.(1) = 0. 1.002477).440051 .(1) = 0.(1) = 294239k.(1) = 0. The explanation is that the former procedure is unstable but the latter is not. we point out the importance of making use of computational schemes whenever possible. We will restrict ourselves to real functions of one variable.(l) 0.(x) + 2J:(x) + 2J4(x) + 2J.(1) = 167k. J. we get: J.(1) = 0.4.1656k. J. The arrangement becomes clear. 1. correctly rounded to six decimal places: J.000744 (0.(1) = 511648k.(x) + . J1(1) = 0. even if a generalization to complex functions is straightforward in many cases.000002). the computational work is facilitated.000323 (0. J.

oo.4142 0..1)a.7854 0.2620 1.1138 1.2)1]. has introduced considerable advantages. + u. all square roots are computed at the same time.5952 1.8225 0. x z= x2+ 1 u=arctanz v=u-x y=e° 0 1 0.e. The expansion is conveniently trun- cated when iukl is less than some given tolerance..4225 1. We put po = I P.1078 1. The function in this example was defined by an explicit formula. + a. In the usual mathematical language. we take the Bessel function of zero order: (_ I )kx2k Jo(x) = 1 22k(k! )2 k=0 Putting uo = so = 1.1933 0.-.0198 0.. 1. COMPUTATION OF FUNCTIONS 11 x = 1 in steps of 0. i.2 [this is conveniently written x = 0(0. which at the same time is a language for describing computational procedures and an autocode system. The advent of ALGOL. = P.1. we see that the partial sums sk tend toward . and so on.x*-2 + ... We will give a few examples of this recursion technique and start with the polynomial P(x) = x* + a.1662 0.7854 2. so far as this is pos- sible. uk = -uk_. the function can be computed with arbitrary accu- racy.9563 Note that it is advisable to perform the operations vertically.0447 0.lo(x) when k --. .SEC.1).6 1. Po=0 Pr+' = Prx + P. next all arctangents are looked up in the table. + a.2995 0.. It has be- come more and more common to give a function by a recursive process.9553 -0.7952 0. As an example.. such formulas would often be clumsy..8 1. P'(x) = nx"-` + (n .x + a.8134 0.9078 0.4.4 1. It is then easy to prove that p* = P(x) and pA = P'(x).x*-' + .. at least in principle.2 1. In the domain of convergence. The following outline is then suitable. the remainder is then less than the absolute value of the first neglected term. Also for such expansions a recursive technique is often suitable.x214k2..0770 0. and its derivative.+' and r=0.2806 0.0 1. These formulas can easily be generalized to the mth derivative: P:+i = /np:'"-" Many functions are easily obtained from their power series expansions.(n. Sk = Sk_.5258 0..8620 0.

Putting y = y.and. we have x = 2Y. Last.1)ak). x and suppose that 1 < x < 2. Computation of functions using binary representation In some cases it is possible to use the special properties of a function to compute one binary digit at a time. where 0 < a < 1.I .). 2-2 +y.. 1.. = x.2-' +.x = y = y.. Suppose that a is written in binary form. = 2x. if xk_. -. + . cos 2z = 2 cos' z . which leads to Xk = COS (Yk+1 . 2-1 +y. x...<2.. In order to formulate the general method. .1. Yk = I and Xk = jx. x' = Consequently.'t_. Xk = (xk-1)". and n 2-' + . The method is not too fast but can easily be programmed for a computer. for example. are defined. we also consider the function y = (2/ir) arccos x (cf.. and finally we obtain log. we have 0 < (2/ir) arccos xk < I and.2-1+y2 -2-2+. while y. As our second example.8k_.. = 0 if x' < 2. .yk_. 2-' + . consequently.. yk = 0. a = 0. = x. but the computations will not be so simple any longer. after squaring.5. Fig. . It is not hard to infer that x°.. ) XA: = COS 2 (Yk + Yk+1 with x.(l + (xk . 1. An analogous technique can be defined also for bases other than 2. 2-1 + Yk+: 2-. we obtain xk+.. = x.). Now put 6a = a. We will demonstrate some of these procedures and start with the function y = x°. 2-1 + y. Two different cases can be distinguished: xk > 0 and xk < 0. From this we obtain the following recursion: Starting with x.12 NUMERICAL COMPUTATIONS SEC.. If.101100111. )J L 2 Using the formula. and yo = 1. Nk = 2(3k-1 . we choose y = log. = I if x' Z 2. 1.2-2+.. We use the notations +y. In the first case.. .5.. we first define the integral part entier (z) of a number z as the largest integers z.5.. In this way y y ..ak. then and the problem has been reduced to computing a series of square roots..Z2. we set Yk = 0 and Xk = 4-1 if xk-. and form ak = entier (2. we get y. and yk .

= 2x2 . 1.2r) cos (2 arccos xk) = 1 . lxk+. y==0 2 y= 2 +7 = _arccos_ . y. 3 .2x2 . = 1 . consequently. The function arcsin x can then be obtained from ? aresin x = 1 .=.5. = COS [ 2 (yk+: + yk+: 2-' + )] = cos (2 arccos xk .1 and yk = 0 if xk > 0 . USING BINARY REPRESENTATION 13 Ay 1 Figure 1.5 In the second case we have y. Starting with x.2x2 and yk = I if xk < 0 . 2-' Yk+1 + yk+.sec. we have the recursion: (xk+. 2-2 + = 2 Iarccos xk - Hence 2 xk+. i R EXAMPLE x= xo yo0 x. 2 arccos x . = x.t = I and. I X.

Suppose that s"(z) _ Ek=oakzk and further that Jim If(z) Izln .s"(z)I =0 i-.En=oa"(z . In many casesf(z) has no asymp- totic series expansion..6. z t"+i dt.0. and as a matter of fact it is in general divergent.+(n-1)!+R" x" 1 e-8 R"=(-1)" n! . The following no- tation is used: f(z) E*=oaz " when z -.6.zol" for a fixed value of n when s"(z) = F. It is obvious that g(z) and h(z) must be independent of n. but it might be possible to obtain such an expansion if a suitable function is subtracted or if we divide by an appropriate function.a"z" when z . the series expansion represents the 0 function f(z) in such a way that the nth partial sum approximates f(z) better than IzI" approximates 0.S"(z)1 =0 f-r0 Iz . then the following notation is used: f(z) - E.0 for an arbitrary fixed value of n.etdt t3 =e-II- with XL I x x= x' +?!-3!+. Asymptotic expansions at infinity are of special interest. is written in the form f(z) .J= t2 x e= X2 + 2!=. In this case we say that the series is a semiconvergent expansion of the function f(z). The following notation should present no difficulties: f(z) g(z) + h(z) ' a"z-" . 1. They are defined in the following way.S"(z)I = 0 for a fixed value of n when s"(z) = E k a akzk.z0)". The remarkable fact is that we do not claim that the series is convergent. oo if lim IzI"If(z) .z0)k. Asymptotic series Important tools for computation of functions are the asymptotic series. 1. .14 NUMERICAL COMPUTATIONS SEC. with the meaning Jim If(z) -. In a certain sense.k_oak(z . EXAMPLE f()x = e9dt=e_ t= etdt=e- x . An asymptotic series expansion around the point z..

1 1000000 -84367 11 915891 .ez.0915633 . Math. New York. 10T. [5] de Bruijn: Asymptotic Methods in Analysis (Amsterdam. 53. X x x x In general. since x has a fixed value. Am. Convert the sedecimal number ABCDEF to decimal form.31416 to octal form. but if it is truncated after n terms where n is chosen so that e. we obtain the following table: n 10?. London.266 4 914000 + 1633 14 915276 + 357 5 916400 . EXERCISES 1. [4] Rainville: Special Functions (MacMillan.el-R.515 6 915200 + 433 16 914840 + 793 7 915920 . Thusf(x) is expressed as a series expansion with a remainder term.. (11).e=. 1958).S. n 10T.4367 13 915899 . . Soc.S. 2. and by definition. In our special example we denote the partial sums with S. giving a maximum error less than e. 00.ex. 1962). and for x = 10. REFERENCES [I] Henrici: Discrete Variable Methods in Ordinary Differential Equations (Wiley. [3] Wilkinson: Rounding Errors in Algebraic Processes(Her Majesty's Stationery Office. 186 19 919778 -4145 10 915456 + 177 20 907614 + 8019 The correct value is e-1 e10 f"10 t dt 0. the truncated series can nevertheless be used. the error is of the same order of magnitude as the first neglected term. 1960). In general.186 2 900000 + 15633 12 915420 + 213 3 920000 .. 1038.) < e-'n!/x"+'. 1963). 767 15 916148 . 287 17 916933 -1300 8 915416 + 217 18 913376 +2257 9 915819 . Convert the decimal fraction 0.R. 15 EXERCISES Thus we have JR. Hence the series is divergent. [2] von Neumann-Goldstine: Bull. New York. 10?. the absolute value of which first decreases but later on increases to infinity from a certain value of n. this expression first decreases but diverges when n -.

v2.7. The maximum relative errors of v1.4). respectively. Find the coefficients in the /asymptotic formula -( sin t dt = cos x 1 X + z2 + 1 + sin x 1 . 0.75. is calculated from the relation p. Determine the maximum relative error when p.16 NUMERICAL COMPUTATIONS 3.' = p2v. and p2 are 0.2xy. Obtain an asymptotic series expansion for the function y = e-s2 Sa e2dt by first proving that y satisfies the differential equation y' = I . + L2 + . 4.v. 6. One wants to determine 1/(2 + 3)' having access to an approximate value of V3. \ 1 X (` . and 1%. Compare the relative errors on direct computation and on using the equivalent expression 97 - 5. (r = 1.

FALSTAFF.0. but this can hardly be avoided.. 3. Further.k) > 0. If no special assumptions are given for the function f(x). we shall in this chapter suppose that f(x) is continuously differentiable of sufficiently high order to make the operations performed legitimate. a is a root if f(a) = 0.<s. elimination of 6 by x. 2. FAKIR. For this reason it seems appropriate to point out some difficulties arising in connection with this problem.<s. Another consequence is that the equa- tions f(x) = 0 and M f(x) = 0. If now 1 < r. Everyone His Own Professor. However. . The inequality defines an interval instead of a point. Therefore. f(s. x... In some suitable way. and the quotation above comes from the book. and so on. In the case of simple roots. in numerical applications it must be understood that the equation usually cannot be satisfied exactly. < tk. influenced. we assume that s. substituting 6.. 17 . where s is a given tolerance. f(t. we could think of the condition I f(a)I < 6. His real name was Axel Wallengren.* 2. Only the case of f(0) = 0 has to be treated separately. due to round- off errors and limited capacity.<. for one. where M is a constant. where a is a given tolerance. k = 1. By definition. We will first consider the conception of a root of an equation f(x) = 0.) f(t.. by Mark Twain. we want to modify the mathematical definition of a root. and to start with. * Famous Swedish humorist-author (1865-1896). do not any longer have the same roots. f(s. the following procedure might be conceivable. Introduction Solutions of equations and systems of equations represent important tasks in numerical analysis. Chapter 2 Equations When it is known that x is the same as 6 (which by the way is understood from the pronunciation) all algebraic equations with I or 19 unknowns are easily solved by in- serting. we construct two sequences of numbers s s s ..) f(sk) > 0. and s.)f(tk) < 0 for i. then J(s + tom) is defined as a root. The numbers in both sequences are supposed to be successive approximations to an exact root of the equation.

for example.1) aak aak r Hence ar/aak = . By using such "range operations. . . for example. apart from those corresponding to the wanted accuracy.1010. Equations where small changes in the coefficients cause large changes in the roots are said to be ill-conditioned. a1+a. occur when r is large and f (r) is small. which means that the coefficient 210 is changed to 210. A number is replaced by an interval in such a way that the limits of the interval represent the upper and lower limits of the number in question. 8ak (2.. Large values of Ak have the effect that small changes in the coefficient ak cause large changes in the root r.<x. here higher terms are of decisive importance. Consider the polynomial f(z) = z* a z*-1 a z*-Z a and let r be a root of the equation f(z) = 0.0.1.+x. (2. Differentiating we get: = f (r) ar + r*-k = 0.= 210.0000001192. . and this relation is written in the following form: ijr _ ak.73 ± 2. Another difficulty occurs when one or several roots of an algebraic equation are extremely sensitive to changes in the coefficients.1628 _3. Large values of A. since in (2.0... Now we get more complicated calculation rules which will always give new intervals as results. . The limits are always rational numbers chosen to fit the computer used.14 ± 2.0.85.2..(x-!-20)=0 or x20+210x19+ +20!=0. when some of the roots lie close together. For example.r*-k-.8 are shifted only slightly.18 EQUATIONS SEC. one has to work with a suitable number of guard digits.5i. For r = -16 we obtain A. = 2-1'.16. If one wants to determine the roots of such an equation. 15! 4! This result indicates that we must have 10 guard digits.. It may be difficult to tell offhand whether a given equation is ill-conditioned or not.+b.<b. and we can summarize as follows.2. 2..81 i. We choose k = I and 8a. the latter is the case. If the computations are performed on an automatic computer." we can keep numerical errors under control .20.r*-k/ f (r). Then the roots . Now put Ak = yak r*-k-'lf (r)j. among the remaining roots we find. and .0. A well-known example has been given by Wilkinson [1]: (x+1)(x+2). the following process could be used.2) we have given only the first order terms. if a. It should also be noted that the value of Ak cannot be used for computation of the exact root. .2) f (r) a.

Putting 2 = max Jai]. Eliminating I and m and putting k2 = z. 2. be solved numerically by some of the methods which will be explained later. In most cases this is a very crude estimate. CUBIC AND QUARTIC EQUATIONS 19 all the time.oo and 0. When the root has been obtained with sufficient accuracy. we obtain z' + az2 + 8z + r = 0. subtraction. estimates of maximal errors are often far too pessimistic. the lower limit is moved upward or the upper limit downward. the problem is ill-conditioned. lm=s.z"-2 + . The upper and lower limits can be identified with our previous sequences sk and tk.1) = r. the equation may have 4. . Since the coefficients are real. where q = b . it is not too easy to design a general technique in such a way that no root possibly can escape detection. As mentioned before. Horner's scheme As early as in the sixteenth century.a/4. however. and fourth degree could be solved by square and cube roots. Next we consider an equation of the fourth degree. or no real roots. 2. second. multiplica- tion. it is removed.3a'/256. but the following procedures seem to be satisfactory from this point of view. it can be proved that for any root z of the equation f(z) = 0.3a2/8. but here we have at least one real root localized in a finite interval..SEC. If the final result is represented by a large interval. third. In any case there exists a partition into two quadratic factors: (y2+2ky+1)(y2--2ky+m)=y4+9y2 +ry+s with real k. Comparing the coefficients we get: I+m-4k2=q.1. of course. By repeated interval halvings. If c > 0. In 1824 Abel proved that the roots of algebraic equations of fifth and higher degrees could not in general be represented as algebraic expressions containing the coefficients of the equation. where only addition. First. and we are left with a second-degree equation. Equations of third and fourth degree can. There exist general theorems regarding the localization of the roots of a given equation. Suppose that f(z) = z" + a1z""1 + a. which gives the equation y' + qy2 + ry + s .0. We will now examine cubic and quartic equations a little more closely. 2. and m. However.. r = c . and root extraction were allowed. consider the equation x3 + axe + bx + c = 0.1. x' + ax' + bx2 + cx + d = 0. division. Cubic and quartic equations. it was known that algebraic equations of first. 2k(m . we have (zJ 5 I + 2. there is one between 0 and oo. there is a root between .ab/2 + &/8. + a.ac/4 + alb/16 . and if c < 0. 1. The cubic term is removed through the transformation x = y . s = d .

.z"-' + . 2.a)(z"-' + b..1. and we will return to it in Section 3.a + .. .+ a" = f(a) = 0. EXAMPLE The equation x' . In this way the problem is reduced to the solution of two second- degree equations. Conveniently. can be computed recursively from b. We obtain the following scheme. and the coefficients b. It is interesting to note that this problem is best treated with matrix-theoretical methods. with b. and both should be removed. The method just described is originally due to Descartes.4 = 0 has one root x = 2 and another x = .1. the following scheme is used: 1 a. a. We will now give a brief account of how roots can be removed from algebraic equations by means of so-called synthetic division (Horner's scheme). and hence the scheme can also be used for computation of the numerical value of a poly- nomial for a given value of the variable. + b"_. there is always at least one positive root z. 1 -3 4 2 -10 -4 2 2 -2 4 12 4 1 -1 2 6 2 0 -1 -1 2 -4 -2 1 -2 4 2 0 After the first division.ab.. = a. b" Every term in the second row is obtained by multiplication with a of the preceding term in the third row._.. 20 EQUATIONS SEC. Since T G 0.z"-' +. + a. we find that b.2x' + 4x + 2. + b. Comparing the coefficients of the z"-'-term.+arz"-*+. b" = a" + a.z"-r + brz"-r-' + ...a+a'.+a a.z"-' + .. and r = -r'/64.) z"+a.. = 1.. = at + a. a" l a a a.+a".-. Without difficulty we obtain b. and after the second.a+a' 1 a. with x' . and m can be determined one after another.. = 0 such that (z .. we get b" = f(a).a + a'.+a..3x' + 4x' + 2x' .. 1. Then - there exists a polynomial z"-' + b.. and hence k. Suppose that the equation f(z) = z" + a. and every term in the third row is obtained by addition of the terms above. b.+ a" = 0 has a root z = a.-. = a.4s)/16..z"-L + ._. = a. ... with a = q/2. /3 = (q' . + ab. we are left with x4 .x' + 2x' + 6x + 2. If a is not a root. .IOx . We touched upon the problem of determining upper and lower bounds for the roots of an algebraic equation..+ b"_.3.

One can prescribe either a fixed direction.2. where f(x) can be an algebraic or transcendental function. there are no difficulties in generalizing this technique to division with polynomials of higher degree. Division by z' + az + 9 leads to the fol- lowing scheme. The basic idea is now to replace the curve by a suitable straight line whose intersection with the x-axis can easily be computed. We intend to start with a trial value and then construct better and better approximations. or to the tangent . NEWTON-RAPHSON'S METHOD 21 Obviously. y ).. the problem can be so formulated that we are looking for the intersection between the curve and the x-axis.. e. _Q Cl C2 C3 2.a -c. yo) and (x y). but its direction can be chosen in many different ways. Newton-Raphson's method We are now going to establish general methods for computing a root of the equation f(x) = 0. a3 a3 . a -a -c.a -.SEC. 1 a. 2...g.2... -a -R . parallel to the secant through the first points. (x0. If we represent the function y = f(x) graphically. The line passes through the last approximation point (x.CA.

. = A ..lk . .. will be considerably less than s. 3. = E + En .. yn). f(e + f( + 2f (c) If the remaining terms are neglected. f() # 0.) (variable secant). or with the tangent through (x.. Hence e + s. k = (y..[f($) + Ilk and [I - We see that s. we get e.. In all cases we obtain an iteration formula of the following form: xE = x . 2. = x. and for the first two cases we get y = 0..-I Y. ej" = K K.Y. From this we get 1+ 1m =m and m=1± 2 . Further we set x _ + E.. if k does not deviate too much from In this case the convergence is said to be geometric. that is..K K..-I Y.)/k .).x0) (fixed secant).yo)/(x. .. = AE EA_.Y. .. We shall now examine the convergence of the different methods.. . .. . y. ..22 EQUATIONS SEC.f(x. In case 3 we find X. we have the following four possibilities: 1. through the first point. = KEn_.'".2..) and (x. x. We will now try to determine a number m such that Ew t. Let st bethe exact value of the simple root we are looking for: f(t) = 0. Then also z. . k = (y. 2.. k = f (xo) (fixed tangent).-I and + E.. E K-I'"' ... k=f (variable tangent). . f"(E+) + .Y._. and hence s. x = x.Y. 4.eef(C + En-1) __ EA-1En . x. e ..f(£ + en)/k = + En . .. If the slope of the line is denoted by k.) . or such a variable direction that the line simply coincides with the secant through yA-. E.

.+ = ) E..2) E.) + 2 f"(x.) + .f(x.2.. . Thus we have found s. at least if the factor is not too large. * See also Exercise 19.2. however. and the convergence is said to be quadratic. = x -- exactly as in (2.f( + b. and x.f (E2 /2)f"(e) f + E.) + h f'(x. we get h = . = K K..+I = X. If the root is of multiplicity p> I the convergence speed is given by [(p I)/p]&..3).f'($) + Ewf'($) + . .. The iteration formula has the form x. where x is an approximation of the root.3) As has already been mentioned. and it also works when coefficients or roots are complex. where the plus sign should be chosen.)lf'(x.2. If all terms of second and higher degree in h are neglected. The derivation presented here assumes that the desired root is simple. In case 4 we find: +E.) . + h) = f(x.1) The method with variable secant which has just been examined is known as Regula falsi.1(e) (2.).+1=S +6% f(E+ or En+= -0. It should be noted. The method can be used for both algebraic and transcendental equations.. the geometrical interpretation defines as the intersection between the tangent in (x. that in the case of an algebraic equation with real coefficients. Newton-Raphson's formula can also be obtained analytically from the con- dition f(x.) Hence the leading term is .aie (2.2...) E. (2.f (E + E. NEWTON-RAPHSON'S METHOD 23 SEC.f(s) ... .2. .E. EA. 2.. a com- plex root cannot be reached with a real starting value. = 0 .* The process which has just been described is known as the Newton-Raphson method. y*) and the x-axis. This fact also means that the num- ber of correct decimals is approximately doubled at every iteration.

a < b. . If the multiplicity is not known we can instead search for a zero of the function f(x)/f(x).24 EQUATIONS SEC. where again we have quadratic convergence. . 0 0 0 < 1.) = Bs.M. The modified formula x. (linear convergence.. Hence s _ 0e. We also suppose that x is so close to that p(x) = f(x)lf(x) varies monotonically in the interval between is and x. We suppose that x _ E + e is an approximation to the root i. b] provided that f (x) # 0 and f'(x) has the same sign everywhere in the interval. we must know x.).) + i S. we could as well compute and we would then be interested in an estimate of the error expressed in known quantities. Putting h = f(x. and further max {If(a)lf(a)I If(b)/f (b)I} < b .).E.5). we have supposed that IhI < 1/K. where f(x) is analytic. Assuming thatf(x) is finite every- where this function has the same zeros as f(x) with the only difference that all zeros are simple.1 < 1 IhKIhI (2.) = -(x.q(x. Hence.fRf*/UnY .s.4). Finally. .2. and f (x. it is easy to show that Newton-Raphson's method converges from an arbitrary starting value in the interval [a. if we use Newton-Raphson's method.4) where. Let be the exact root.f (x. of the equation f(x) = 0. . Section 2. For the error estimate (2. we give an error estimate which can be used for any method. we get f(E) = fix..a (global convergence).)lf (x. cf. Expanding in Taylor series and using Lagrange's remainder term.f'(x. Iff(x) is twice continuously differentiable and f(a)f(b) < 0.)lf(x. we find 16. .f'(x. Since 0.[U/f) U"lf)]=.S. 2.+1 = X.) = f(x%) .-8r* Putting K = sup If 'If I and h = f(x.2..) and Q . f(x.").) . of course. . as before.2.) where 0 < 0 < 1. by using the mean value theorem we get -(x.)lf(xw) restores quadratic convergence.) and (f/f7=% 1 .) .6e. and further that f (x) *_ 0 in this same interval. The geometrical interpretation of the conditions are left to the reader.).. This leads to the formula xn+1 = xw .)If (x.Pf(x..

0 we have a If we neglect the term aKlhl in the denominator. = 1. = -1.2. z. z.5) KKIhI however. 2.) We put K = sup IQl and suppose that we are close enough to the roots to be sure that Klhl < s < J. = 1. = -0.819358 + 1. Choosing zo = 3i. settingf(x) = x' .293133 + 2.SEC. and when s --. x. = E* .2. f(z) = z° + (7 . = -1.000000i.2e + Qe* = 0 or s* = 1+i/1-2Qh (The other root is discarded since we are looking for a value of E. = 1.997344 + 1.855585. .h and hence 2 aKh2 .x . where a Z (1 . we are essentially back at (2. = -0. we get -e-xw-10 x" 4x. x.+. By Newton-Raphson's formula. .10. 2. z. I -II<Ihl\1-aKlhl-. we must assume that Ihl < (2K)-' For s = j we get E <1 (2.1/1 . we find z. z. .902395i. .12i)z3 + (20 . .000002 + 1.871 x. NEWTON-RAPHSON'S METHOD 25 we find 2h 2h .038654 + 1. = 1.x . close to h.131282i.10.85558452522. But e.1 4x.505945i.28i)z2 + (19 .548506 + 2.2s)/2s. _ -0.2.000000+ 2.2i)z' + (20 . in normal cases s is much smaller.85578. 1 1) 1-aKlhl.12i)z + (13 . Naturally. za = -0. x' .26i) 0.999548i. EXAMPLES 1. z.999993i. = -1.2).1 xo=2.9656261. x.

2.4.(x) satisfying the differential equation xy" + y' + xy = 0. = a.4) = -0. we have c..5885.[a + Aa.4) = 0.10a... = 0. x i. y"1(2.. This formula is particularly useful if one is looking for zeros of a function defined by a differential equation (usually of second order).21423 61785./2 .20651 12692 .34061 02514.h f + 2 f' .a3)a3 + 2a(15a. c4 = 514 (15a.. . = 0. y'(2. . From the differential equation. Now we write h as a power-series expansion in a. .a3 h3 6 + a4 h4 24 _ . Let be the exact root and xo the starting value with xo = e + h. 37r. which will now be derived. °'y(2. f"/f = a3.4) = 0.=x*+ e-.00250 76833.. The zero is close to x = 2. c3 = 1(3al.a3) . f`°/f = a4.n ... 3. where we have y(2. 2. x.4) = 0. .. Hence z =. 2ir.10aza3 + a4)a4 +. = 0 .26 EQUATIONS SEC. .a$+c3a3+cal + and inserting this value in the preceding equation. EXAMPLES Find the least positive zero of the Bessel function J. e`= = sin x. x. Thus we find = xa . . the equation has an infinite number of roots lying close to it. .sin x a =s+cosxx xo=0... h2 h + ae 2 .az + c(3ai .h) = f .52018 52682 . This converges toward the smallest root.4) = -0. we easily get y"(2. h=a+c. . .58853274 . = 0 or after division by f (xo). Further we suppose that f'(xo) * 0 and put f/f = a. Newton-Raphson's formula is the first approximation of a more general ex- pression.6.f" If = az.a3 + a4) .f(xo .

4118459 .9) and for 1/1/ a x%+.10) . + a]/ft. for example. = jx%(3 .a 1/x.ax%) .2. showing the quadratic convergence clearly.00482 55577 and ¢ = x0 .2.a.)'.a = 0 to obtain x%+1 = 2 (x% + x) (2.8) Especially for 1% a we have X.ax.. a = .0.3969956 .a)/NxA-' = [(N . cube root).00000 01120 .6547864 .) ." . = x%(2 . a3 = . a. a' = 0. From y=x" .2.00002 32396 .ax%)'.7) This formula is used almost universally for automatic computation of square roots. +I = 3 (2x% + (2. _ (1 . The quantity a-' can be interpreted as a root of the equation l/x . Hence h = -0. and there is no real advantage. (2.SEC. (2.6) This relation can also be written 1 . 47856 217 -0. a2 = 0.0. we find x%+1 = x% . 2.2.(x* . From this we obtain the simple recursion formula 1/x% . (2. It is easy to construct formulas which converge still faster.00482 07503 . or x%+.a = 0.2. NEWTON-RAPHSON'S METHOD 27 and further a. 0. Older computers sometimes did not have built-in division.ax% = (1 .2. = x%(3 . = .1)x. + a2xn).40482 55577.ax%+.ax. a3 = -0.00482 07503 . but the improved convergence has to be bought at the price of a more complicated formula.ax.h = 2. A corresponding formula can easily be deduced for Nth roots. square root. or x%+.00000 00005. If one wants to compute v. The Newton-Raphson technique is widely used on automatic computers for calculation of various simple functions (inverse. one starts from the equation x2 . and this operation had to be programmed. I .0.

S(p._.3. Expanding in Taylor series and truncating after the first-order terms. V _a can then be obtained through multiplication by a.1 when x° is an approximation of V.3.28 EQUATIONS SEC. we obtain a quotient x"_y + b. .1' 1/a = x0\1 + + 9 + 81 + 243 + 729 s s2 5s' lOs' 22s6 x°(1 + 3 9 + 81 243 + 729 2. q) = 0 .xola and s = a/xo . so that R(p+dp. (2. all complex roots appear in pairs a ± ib. we get: OR R(P. the procedure is repeated with the corrected values for p and q.(l+s)'I'=xo+ 2 . S(p+dp. and when the system has been solved. 2.x"-6 + + b.. so that R(p. Our problem is then to find p and q. However. We regard this as a linear system of equations for dp and 4q. Bairstow's method As mentioned before.8 + 16 s3 128 + 256 Analogously for the cube root with r = I . Let the given poly- nomial be f(x) = x° + a. and a remainder Rx + S. Newton-Raphson's method can also be used for com- plex roots.xa/a and s = a/xo .q+dq)=0.x"-' + + a". If we divide by xz + px + q. if we are working with algebraic equations with real coefficients. where x° is an approximation of 1% a : r 2r' l4r6 35r' 91r6 +. we get x°0(1 r 3r= 5r6 35r' 63r° 1/a (1 -r) x + 2 + 8 + 16 + 128 + 256+ and 5s' 11a =x. these relations are not satisfied in general. Each such pair corresponds to a quadratic factor x2 + px + q with real coefficients. The last formula is remarkable because it does not make use of division (apart from the factor 7J).1) S(Rq)+ p 4p+ aq dq=0.. q) = 0 . An alternative method used to compute square roots is the following: Putting r = = 1 .3. and we try to find corrections dp and 4q. For arbitrary values p and q.q+dq)=0.1.q)+ ap 4p+ dq=0.

-b. = S . Conve- niently. = S + qb.2) b"_.=b.._._.+pb.+pd/k_1 + q abk-.-. +b. . = 0 . ._. + q abk_. ._._._.=0 ap aq Differentiating (2.-qbk_. (ap +paap.2.+pb. and b and define: bk=ak-pbk_.3.3..3)... a..3. + pb.+b.+p. and S.pb. . we obtain aap._. IR These values are inserted into (2. from which we obtain: a. abo ab_. we find abk = Uk_._. . (k= 1.4) S = b.. BAIRSTOW'S METHOD 29 In order to compute the coefficients b b3. (2. + qb"-.3) with b.._.)4p +ab.pb. Hence b"_1' (2. .._ R. a (2. + qb.+b.-.+q. dq+b.3.pb. . + qbk_. 4p+ a g. = 1 and b_. The quantities b b . +1'_. we use the identity x"+a. 2. ._.. b.+a..6) l_=bk_. + pb"_.=(x2+px+q)(x'-. .)+Rx+S._.qb.+pb.. a. a. (2. ap ap ap ap ap ' (2. aba = ab_.. .+qb ak = bk + pbk_. = a. 0.. dq+b. = 0._. If the first of these equations is multiplied by p and subtracted from the second.' R.. R and b. + p abk_.=0.5) (ab.1) to give aaP JP + aag dq + b.=b._.. = R + pb.3..x%-1+.SEC.3._ a. and S can be found recursively. Then we get b._.3.3.=0. = 0.n).qb.x"-a+..)dp+(aq +paa9dy+b. we introduce two other quantities b._. = a. dq aq aq aq aq .

pCk_y .3.n._. . . 2. qo = .1) is omitted.5) for determination of 4p and 4q can now be written 4p +4q = bn_.3.PCk-. Introducing u and v instead of 4p and 4q.. and we shall try to find two quadratic factors. n). 1 4 -3 3 -30 5 -5 (ck) 1 -1 10 .2. further. Starting with po -: 3. we obtain the following scheme: (ak) 1 5 3 -5 -3 -6 -6 -93 -3 5 10 10 5 ( b k) 1 2 2 . 2.) ..8) (c. it is easily understood that the convergence of Bairstow's method is quadratic.. v= -1.. . Ck_2 = bk_2 . 4p + c. From equation (2. CO = 1 ... that convergence always occurs if the starting value is not too bad. = 0 s (k _= 1. b.. C_.qck_. (k = 1.-2 4q = b. bw Note that in the last addition of the computation of ck.._3 cw-2 c... the term (in this example. and. (2.-.1). =_c k _2 aq ap Equations (2. we obtain by induction abk ._1 .qck-2 ..-.6) then pass into the following equations from which the quan- tities c can be determined: Ck_. abk_. (2. {-35u+ 10v= 4.3.1).7) Hence ck is computed from bk in exactly the same way as bk is from ak. EXAMPLES The equation x` + 5x' + 3x2 .gCk_k These can be comprised in one single equation: Ck = bk .. Equa- tions (2... . Putting abk/ap = -ck_. = bk_. we obtain the system: lOu.3.. . .30 EQUATIONS SEC.9 = 0 is to be solved.35 1 I I I I c.-.5. This gives directly b.5x .pCk_2 .3.3.b..

= 2. and (p. Next.000009 -0.917736 1 2. 30.08407v = . 1q.00745 .00000202.82964 0.90255.13v = 0.343383 9.917736 .28 9. BAIRSTOW'S METHOD 31 which gives {u= -0.917736 10.05 4.92 10.000026 -2.90255 2.097047 1.000043 . (p.91 2.821 9.0.92.339507 -26.07 0.01355 .82v = -0.= -4.2.91 -6.09.00378 . 30. ( u = 0. 1 q3 = .30.087350u . = 2.84 -0.00000081. 2.0.91759 1 2.32612 9.30.3.80510v = 0.36697 4.57 4.2.99742 4.08. 1 5 3 -5 -9 .000043 .07.000009. q.08407 .312724 8. .0. u = 0.805906 9.4.= 2.08795 -5.830107 0.91759 -3.35 0.087629 .31062 -0 01097 .0.30.08 -5. (p. u = .902954.31440 8.087350 . {p.90255 4.90255 -6.902953 -6.380150 4. = 2.917736 1 5 3 -5 -9 -2.08407u .312715 .33684 -26. lv -0.902953 4. .91.09745 1.92 -4.60u .9. itv= 0.13 -30.91759 10. the computation is repeated with the new values for p and q: 1 5 3 -5 -9 -2.0. 1v = -0.01355 .000403.25 -2. -2.37 -26.0.32612u + 9.60 9. 1 .00378.902953 2. V = 0.0.13u .5.-4.25.SEC.805906v = 0.09 1.92 1 2.00241 1q2 = .087350v = -0.000146.91 4.95915 1 -0.20 -2.80510 9.03 -0.343383u + 9.3.999983 4.917738 .2.902953.91759 .963233 1 -0.4.

after m squarings.830110) . suppose that IP..32 EQUATIONS SEC. .3) * The sign ' means "much larger than.x"-4 + . + 2a.q. p". .4.917738)(x2 + 2..902954x . .." ..097046x + 1.ak+...= II A2 = q:qk = q. + b" = 0 with b. >> Hence A.1)kbk = ak . )2 Putting x2 = y.I a. Suppose that.a3+2a4.g2g3 .q2 .+ a" = 0..4. . . are probably less than one unit in the last place. > IP and Iq. compared with the quadratic terms on formation of new coefficients.2aa.I > IPI > . we get (x" + a2x"-2 + a. 2. . .. _ - .4.1 >> . or (.. (2. + 2ak-2ak+2 . x"-6 I .gkq. and q.4. + A. . Collecting all even terms on one side and all odd terms on the other.. .. = p2j-. we suppose all roots to be real and different from each other. 2. . ) 2 2 . + 2a.(a1x"-' + a3x"-3 . .2) A3 q.2ak-.4. b3 = . One more synthetic division finally gives the approximate factorization: x1 +5x3+3x2-5x-9 (x2 + 2._ . while the original equation has the roots p p21 . a7 b2=a. Further.-' + . (2.a37 + 2aYa. q". For the sake of simplicity.. and consequently. The errors in p.Y"-' + + . . 2.l >> 1g. (2...1) The procedure can then be repeated and is finally interrupted when the double products can be neglected. Graeffe's root-squaring method Consider the equation x" + ax*-' + . we obtain the new equation b2Y"_2 y" + b. i = 1. = 0 with the roots q q5. Then q.. .2 -2a. n.. we have obtained the equa- tion x" + Ax.4.

642z2 + 10641 z . 2.390882u8 + 100390881u .10" = 0 .. We can then find a parabola y = ax 2 + bx + c passing through these three points. Iterative methods Strictly speaking. EXAMPLE x'-8x2+17x-10=0. f. (xt_1. 2.108 = 0 . The exact roots are 5. we get p. by m successive square-root extractions of q. It is also characteristic for Newton-Raphson's method that the derivative must be computed at each step. in particular this seems to affect the complex roots.5.-+an=0. f.. The latter can only be used in a special case._1).5. ITERATIVE METHODS 33 Finally. or.00081 e IP31 = 108/100390881 = 0. (xx + 17x)2 _ (8x8 + 10)2. f. Hence IRI = 39 8882 = 5._3. The equation of the parabola can be written . be the equation to be solved.) are three points on the curve. and u' . A more detailed description of the method can be found in [2]. could be considered as iterative methods.999512 ._=). and 1.. we find zs . Newton-Raphson's and Bairstow's methods.00041 . The first of these methods is due to Muller [4]. the case of complex roots has been treated particularly by Brodetsky and Smeal [3]. As has been shown by Wilkinson [9] a well-conditioned polynomial may become ill-conditioned after a root-squaring procedure. Suppose that (x. the sign has to be determined by insertion of the root into the equation. y'-30y=+129y-100=0.SEC. The general principle can be described as follows: Let 1(x)=ax%+a1x"-1+. and (x. Squaring twice again. which have been treated above. and it is primarily supposed to be useful for solution of algebraic equations of high degree with complex coefficients. We will now consider some methods which do not make use of the derivative. while the former is general. e 1P21 = 100390881/390882 = 2. putting x2 = y. 2.

-1)h f.5.1'.) f. (2.-1)(h + h. + s.-1(.1. = . This corresponds to the approximation f = a + a.] + f. .-. and we next solve for 1/2.-1)h.h.S.)h (h + h.2. = 1 + 2. `]Ij-.x + a.h.x.-1 (-h) + (h + h.x..4 Since 2 = h/h..4) as large as possible by choosing the sign accordingly.84 is the largest root of the equation m3 = m' + m + 1). + f.. + 2.).-z .2) (h.x. 1 = x.-3)(x. . + 301 + f.h. Putting h=x-x.)(x . ± Vg : .x. also Section 8. (2.-1) This equation is obviously of the second degree and..1).5.8. With g.-1 . + h.a. but so far it seems to have been tried only on algebraic equations of high degree.x) + (x .) (x. + h. + h. a+ for f for f and h. 0. . s.. (x . and x.x.5.+1 = 2f3. = f. The experiences must be considered as quite good./h. as is easily found.) h.-.f.-1 .x.S.-._1. Y -_ (h + h. Hence the result of the extrapolation is x.h.5) The process is conveniently started by making x. . .-1vi + f.-.-1 + (2 + 1)(2 + 1 + 2*`)2.h. . 1.-1 + .f. . _ (x .. (2.b.2._2)(xi-1 . x.1.5. we get: Y = 1 [2(2 + 1) 2.2.x.) A .. a small value for 2 will give a value of x close to x. = x.x.x{-1) f. For this reason we should try to make the denominator of (2. + hi+1 .x.-:2.f.5.) f.x. (x.1) (x.34 EQUATIONS SEC. down directly: Y.] (2) . (2. it is satisfied by the coordinates of the three points (cf.')2. 8.xj)/(x.5.+. = h.-l' (where 1.2' . + f.-1 f.4f._.1 = 2' b. Muller's method can be applied in principle to arbitrary equations. introducing 2 = h/h.-. -f. + f.[._ and S.f. 2._.-:)(x . = .(2. Further. 2.-1 + (x .-2 .-x.. xti-1)(x.-.)(x .2(2 + I + 2._. and further by using a .=x.(2.5. The speed of convergence is given by or approximately K K.xl close to the origin. we get .3) This expression is equated to zero and divided by f122._1).

ITERATIVE METHODS 35 We now take up a method designed for equations of the type x = g(x) . . has a limit Q.6) Starting with a suitable value x0.5.. we form x.) for i = 0..5.SEC. 1.6).5 (d) .. . 2.g(x) x xo (c) Figure 2. = g(x.5. it is obvious that a is a root of (2. Any Y-g(x) (a) (b) Y A i V y. If the sequence x. 2. (2.

x* (x . = -16.x*I + Ix* . if Ig'(x)j < 1 in the interval of interest.).5.273 1. 1-x f(x )) = x* .5).2544 x. Analo- gously x . using x . in fact. = (x . and hence we have convergence if m < 1.259921 was completely lost in (a) but rapidly at- tained in (b).)I S in.). equation f(x) = 0 can always be brought to this form. Multiplying. We will now examine the convergence analytically.x*-1I IgV*-01 Hence Ix .x*-1I 5 Ix . 2. we con- sider the equation x3 .x. = g(x) .2 = 0.x%_1. we can use m .f x)l f (x) (Newton-Raphson's method) we find: X.x*I S I m Ix* . Here x is the wanted root of the equation x = g(x)...5. 2.x*-11 1-m and Ix .. < $0 < x. Starting with x.2596 x. with x. Ix .x.x*-11 S Ix .-1I In particular. we get the results: (a) (b) X. Then we get Ix . Ix*+1 .x0)g'(en)gV1) .x*_. It is rewritten on one hand as (a) x = x3 + x ..x. we obtain: x . we find that x .g(x*_. choosing g(x) = x .2 and on the other as (b) x = (2 + Sx . Ix .x*_. (2.x")/5.x*-ll .x*I = (x .x01.2.25992 The correct value 1% 2 = 1.928 1.. = g(xo). _ -0.x*I S m* Ix .x. . in an infinite number of ways.xo)g'($0). that is. = 0. In order to show that this choice is not insignificant. It is also easy to give an error estimate in a more useful form.g(xo) _ (x . = x .293 1.349 1. = -2.x*-1I 5 Ix* .x* = g(x) ..7) If in practical computation it is difficult to estimate m from the derivative. we get.x*j Ix* . Starting with x0 = 1. From x .36 EQUATIONS SEC.x* + x* .h and gl(x) = f'(x) 1'( ) .)g'(e. Now suppose that Ig'(e.)g'(¢*_1). We attain the same result by simple geometric considerations (see Fig.2599 x..

b] and that the function values belong to the same interval. For m we can choose the value m = cos 1. But Ix . S Jhi1(1 . = 1. We now assume that the function p is defined on a closed interval [a.074. = 1. gives the following values: X1 = 1. and hence Ix .47. where K = sup J f'(x)/f (x)J taken over the interval in question. for example.4971 . x.497300. In this case it is obvious that x > x.000015.I < (0. J (e.497305.. and this leads to the estimate 1.SEC. with xo = 1. lies between x and x. The correct value to six decimal places is x = 1. We shall now discuss these questions to some extent and also treat the problems of convergence speed and computation volume.M) where M= and E. If P is regarded as a mapping. An equation f(x) = 0 can be brought to the form x = q(x).4971 .) I f Further.+ = q(x. x. EXAMPLE x=j+sinx. = 1.34.8) If the relation _ is satisfied.).x.) = IhI f(L.49729 . ITERATIVE METHODS 37 Using the same arguments as before we get Ix .5. x.926) 0. Mix .M) < KJhI/(1 ..5. = 1.+J < 1 . and hence MJhM Ix . x. by defining p(x) = x . is said to be a fix point of q.KjhI in accordance with previous results. The iteration sin x%.00019 = 0. (2.X.+11 < I If we now assume that f(x)/f (x) is monotonic we have If('-) < f(x. Previously we have worked under the tacit assumption that a solution exists and that it is unique.497290 < x < 1. Hence we have M/(1 .f(x)g(x). For solving the equation x = q(x) we consider the iteration x. it is possible to prove the existence and uniqueness of a fix-point under various assumptions. We also suppose the . 2.495.074/0.x.) I < K.0.KJhi) and we finally obtain Kh2 ix .

->a. Method s(Horners) p E Fixed secant 1 1 1 Regula falsi 1 1.5. while the reverse is not true (consider. 1]). We now turn to the problem of convergence speed and related to this the question of effectivity for different methods..84 Newton-Raphson 2 2 1. converges to E. 0 <L<I. . satisfying the condition IT(s) . the convergence is linear if p = I (in this case C must be < 1) and quadratic if p = 2.5. N. We present a survey in the following table. and consequently it must vanish in some point in the interval.38 EQUATIONS SEC.$.5.44 Multiplicity-indep. (2. Usually one neglects the fact that sometimes several extra evaluations are needed to start the process. First we show the existence of a fix-point.5.62 1..10) Ix.$I and x = E since L" -. <0 for x = b.tl.8).I leading to a contradiction. .EI' the convergence is said to be of order p while C is called the asymptotic error constant.T(E2)I <_ LIE..-R. The effectivity index E is defined through E = p'"' . for example.E2I < IE1 .62 Muller 1 1. Obviously. Ix . .$I . The continuous function p(x) . From the conditions above we have q(a) . It is now easy to show that the sequence defined by (2.x is then >O for x = a. (2. In particular.84 1.41 Chebyshev 3 3 1. i.11) and the method is better the larger E is (cf.26 . 2. function g' to be continuous in the Lipschitz sense. . (2.5. p(x) = 1/ in [0.p(t)I <LIs .e. 3 2 1. q(b) < b. 0. Then we would have IE.$I < L"Ix.e. The effectivity of a method naturally depends on not only how fast the convergence is but also how many new evaluations s of the function f and its derivatives are needed for each step (the unit of s is often called Horner). i.E21 = . If we can find constants p > I and C > 0 such that lim -m Ix^''-EI =C. and E.9) for arbitrary s and t. For EI = p(E)I <_ LIx .. this implies usual continuity. [8]). To show the unique- ness we suppose that there are two solutions e.

it is important that the starting values are not too far out. . In this last case the method works as follows.. The Q-D method This is a very general method. THE Q-D METHOD 39 Chebyshev's formula has the form xn+1 _ xn . and it can be used for de- termination of eigenvalues of matrices and for finding roots of polynomials.6.SEC.f(xn)f (xn)/[f (xn)3 . In order to attain reasonably fast convergence. the sum of two elements is equal to the sum of the others. x.6.. < Ixnj.. . Formulas q(n+1) n+1) qk+)1 ekn) = k ek . 2. as indicated in the scheme. xn and 0 < Ixj < Ix21 <. ./ (xn)7 ls ( ( 7 1(xn)/[2{ (xn)8J while the multiplicity-independent formula is xn+1= X.1) qkn) + ekn) = g(kn+l) + e k_1 k-1 We will give at least some background to the Q-D method when used for alge- braic equations.2) P(x) x-x3 x . (2. as indicated in the scheme.f(xn)f"(xn)] It seems fair to emphasize methods that do not require the derivatives. Then I/P(x) can be expressed as a sum of partial fractions: I __ Al + A3 x-x1 . and among these Regulafalsi and Muller's methods should be mentioned first.i. and Henrici.6.k): ql ) e1)° w (0) ql e1(l) q3 e(o) t) )o) + q(12) ' e(13)/q3 eau/q3 \e3 ) 3) (2) u) ql e'S q9 e(( q3 q.. Its full name is the Quotient-Difference Algorithm. the product of two elements is equal to the product of the others... .f(xn)/J (xn) .6. .(0) e(4o) e3') Rules In a rhomb with an e-element on top.. Rutishauser. In a rhomb with a q-element on top. One constructs a rhombic pattern of numbers e..kI and q. constructed by Stiefel.+ An (2.xn . 2.[. We take a polynomial P(x) of degree n with the simple roots x1.

/A. Now put a. we obtain: e(i+ x' lim e(i) = (2.. ) . .. .lx') _ 11+ (A2/A. (2._.... . + x.lx.e(i) 1 q(i+') = x -. x..)(x. (x.6.4) x. we find: 4(i+ lim ( /x. 3. (i) .) + (A9/x.) +. q(.q(i).6. k = 2.)' But since Ix.i) = 1/x.6.40 EQUATIONS SEC. .6. = 1 A. we introduce: (i+. .5) ('/ .6../xi+') + (A.(1-x..)-q(.i). +.q2 Hence lim.-x.lx. + (A.4) gives the first root x while the second root x...) x. In the same way it is easily shown that lim (/x)x q A2 (2.. Further. X. from (2.)i+' X./xk I < 1.. _ (A. A.lx. lim. 1 ../x3)' Introducing e(i) = q(i+') .3) P(x) i=o We compute the quotient between two consecutive coefficients: q(.6. we have for i --. X . (2. .)i+.I + (A. -\x.. . + (A.. A./x.) .6) x Equation (2. is obtained by eliminating x. and so forth. + x. .)` x.( X.6. (x. 2. + (A%/A')(x. 1 =_A. F. + a:-. n.+' to obtain: 1 (2.)(x. x.) _(1 1 A.X..) = a./A... . .lx..m q i) = 1/x. +. The terms on the right-hand side can be expanded in a geometric series: A.)' + ..) el (i+U q..x/x. oo : lim qfi) = I . Replacing i by i + 1.)(x.lx. (A..6.6): lim e(i+n i-.7) s It is now convenient to change notations so that the quantities q and e get a lower index 1: q.lx. and subtracting: lim q(i+.. A./xi+') + ./ A.

a new e-quantity is introduced et:) .000018 _1.. 0 0 au a.1. a* 0 0 a.013350 0 000075 .814554 _ 1.q2(i) + e. :-.j > .. > fix.13x'-22x+3=0.499924 0 7.000891 0 0 7.872983 .. it is shown that e('+u lira _ X2 .500001 0 7. eo e.500016 0 7.r are supposed to be nonzero.872907 0 0. 0.000092 -1.828672 0.692308 0 -0.`+') . we have xk = lim qk*) EXAMPLE 2x'.010169 0 _ 0.136364 7.000484 _1.873391 0. q' e.377754 . 0 0 All coefficients a._ e. THE Q-D METHOD 41 Further.j > Ix.440749 0. _.. 5 0 0 1. q. When we are looking for the roots of the equation aox" + a.6. .500408 0 000006 .069646 0. e.1.511286 0. 0 7.126195 0 0. a.127011 0. a*-' q* e* a.500000 .") x 9 Next q( ') is calculated. . and as before. If all roots are real and simple.870850 -'1. then with jx.499998 0 7.884200 . The initial values for solution of an algebraic equation by the Q-D method are established by using a fairly complicated argument which cannot be reproduced here.002541 _1. a.000003 _ 1. 0. 1.qJi+') . and it is now clear that the so-called rhomb rules have been attained. 2. q' qJ eJ qJ 6 .000001 0 7.. qJ e.872981 0.SEC.136364 0 0 8.x*-' + ...127086 0. = 0. 0 0 7.127017 .872999 . 0 .497861 _ 0. the scheme ought to start in the following way: e. a.872984 -0. + a.192308 _ 0._.

42 EQUATIONS SEC. 2.7.

The method can be made to work also for complex and multiple roots.
Details can be found, for example, in [5].
It may be observed that multiple roots of algebraic equations will give rise
to difficulties with any of the methods we have discussed here. As a matter of
fact, all classical methods require a large amount of skill and good judgment
for isolation and separation of the roots. It has turned out to be unexpectedly
difficult to anticipate all more or less peculiar phenomena which can appear
in such special cases. Against this background, Lehmer [6] has constructed a
method which seems to be highly insensitive to clustering among the roots.
The method is founded on the well-known fact that the integral
I
P(z) dz '
2iri J c P(z)
where P(z) is a polynomial and C a closed curve (e.g., the unit circle), gives
the number of roots within C. Since we know that the value of the integral
is an integer, we can use fairly coarse methods for this evaluation. The method
converges geometrically with a quotient -r%, and rather long computing times
are necessary. On the other hand, this is the price one has to pay to get rid
of all complications.

2.7. Equations with several unknowns

We distinguish between linear systems of equations, which will be treated sepa-
rately in Chapter 4, and nonlinear systems, which may even be transcendental.
This latter case can sometimes be treated by a series-expansion technique. We
will here limit ourselves to two unknowns. Let the equations be
F(x, y) = 0 ,
G(x,y) = 0.
Suppose that (x0, yo) is an approximate solution and let h and k be corrections
which we shall determine:
F(xo+h,yo+k)=0,
G(xo + h, yo + k) = 0.
Expanding in Taylor series and truncating after the first-order terms, we get:

F(xo,yo)+h( ax)o+k(a )
1
yax =0,

G(xo 'Yo) + h (
/o + k (ay ) = 0.
This linear system in h and k gives the next approximation. In practical work,

EXERCISES 43

the derivatives can be replaced with the corresponding difference quotients,
and only the values of the functions have to be computed.
Another way which is sometimes practicable is the following. Let the system
be
Form the function f(x,, x . . ., F, + F, + + F (where Fr is supposed
to be real). Then we have f(x x ... , 0. A solution of the system
obviously corresponds to a minimum off with the function value = 0. Starting
from some initial point (x;, x;, ... , x°n), we compute a number of function values
along the straight line x, = x;; x, = &,; ... ; x = x°,,, that is, only the first
variable x, is changed. On this line we search for a minimum (at least one
minimum is known to exist, since f is bounded downwards). The correspond-
ing value x, = x; is fixed, and then we repeat the whole procedure, varying
x, alone. After a sufficient number of steps, we reach a minimum point, and
if the function value is 0, it represents a solution. Alternatively, one can also
consider
f(x1, x,, ..., x,.) = IF3I + ... +
In both cases f = c represents a family of "equipotential" surfaces, and the
convergence will be particularly fast if the movements can be performed or-
thogonally to the surfaces. If this is the case, the procedure is called "method
of steepest descent."

REFERENCES
[1] Wilkinson: "The Evaluation of the Zeros of Ill-conditioned Polynomials," Num. Math.,
1, 150-180(1959).
[21 Zurmilhl: Praktische Mathematik (Springer Verlag, Berlin, 1961).
(3] Brodetsky-Smeal: "On Graeffe's Method for Complex Roots," Proc. Camb. Phil. Soc., 22,
83 (1924).
[41 Muller: "A Method for Solving Algebraic Equations Using an Automatic Computer,"
MTAC, 208-215 (1956).
[51 National Bureau of Standards: App!. Math. Series, No. 49 (Washington, 1956).
[6] D. H. Lehmer: "A Machine Method for Solving Polynomial Equations," Jour. ACM,
151-162 (April 1961).
(7] Stiefel: Einfuhrung in die Numerische Mathematik (Stuttgart, 1961).
[8) Traub: Iterative Methods for the Solution of Equations, (Prentice-Hall, Englewood Cliffs,
N.J., 1964).
(9] Wilkinson: Rounding Errors in Algebraic Processes (Her Majesty's Stationery Office,
London 1963).

EXERCISES
1. The equation x3 = x` + x3 + I has one root between I and 2. Find this root to
six decimals.
2. The equation x' - 5x3 - 12x2 + 76x - 79 = 0 has two roots close to x = 2. Find
these roots to four decimals.

44 EQUATIONS

3. Solve the equation log x = cos x to five correct decimals.
4. Find the smallest positive root of the equation tan x + tanh x = 0 to five correct
decimals.
5. Find the abscissa of the inflection point of the curvey = e : log x to five correct
decimals.
6. What positive value of x makes the function y = tan x/.%' a minimum?
7. The mean-value theorem states that for a differentiable function,
f(x) - f(a) = (x - a) f [a + p(x - a)]
Find a positive value x such that p = } when f(x) = arctan x and a = 0 (five decimals).
8. When a > 1 and 0 < b < J, the equation 1/(es"` - 1) - a/(e' - 1) - (a - 1)b = 0
has one positive root. Calculate this root to four decimals when a = 5, b = }.
9. The function
1.00158 - 0.402222 -e
y= I + 0.636257x

is examined for 0 < x s-, 1. Find the maxima and minima (also including the end points)
and sketch the corresponding curve (five decimals).
10. Find the smallest value of a such that a-%I-x Z sin x for all positive values of x
(five decimals).
11. Determine to five decimals the constant K such that the x-axis is a tangent of the
curve y = ICe=l10 - log x (x > 0).
12. Find the smallest value a such that e'°` < 1/(1 + x') for all x > 0 (six decimals).
13. Calculate the area between the two curves y = cos x and y = e-' (0 < x < 2/2;
five decimals).
14. One wants to compute the positive root of the equation x = a - bx' (a and b
positive) by using the iterative method xk+1 = a - bxt. What is the condition for
convergence?
15. The iterative method X% 1 = f(x) of x
an error which decreases approximately geometrically. Make use of this
fact for constructing a much better approximation from three consecutive values x_,,
x,,, and x+1.
16. Find the smallest positive root of the equation
x' _ x' + x`
1 _ x + (2!)a _.. = 0
(3!): (4-!')'2

correct to four decimals.
17. Find the intersection between the curves
!y=e.-2,
ly=log(x+2),
to four decimals (x > 0).

EXERCISES 45

18. For values of k slightly less than 1, the equation sin x = kx has two roots x = ± xo
close to 0. These roots can be computed by expanding sin x in a power series and putting
6(i - k) = s; x2 = y. Then s is obtained as a power series in y. Next, a new series
expansion is attempted: y = s + a,s2 + a,s' + a's` + , with unknown coefficients a,,
a2, a2, ... Find the first three coefficients, and use the result for solving the equation
sin x = 0.95x (six decimals).
19. The k-fold zero t; of the equation f(x) = 0 shall be determined. If k = 1, Newton-
Raphson's method gives a quadratic convergence, while k > 1 makes the convergence
rate geometrical. Show that the modified formula

x, =x. - k (x,.)
f (x.)
will always make the convergence quadratic in the neighborhood of Q. Use the formula
for finding a double root close to zero of the equation x' - 7x' + l Ox' + I Ox' - 7x + 1 = 0.
20. Solve the equation x' - 5x' - 17x + 20 = 0 by using Graeffe's method (squaring
three times).
21. Find a constant c such that the curves y = 2 sin x and y = log x - c touch each
other in the neighborhood of x = 8. Also calculate the coordinates of the contact point
to five places.
22. Determine the largest value a, of a so that x'l° Z log x for all positive values of
x. If a > a the inequality is not satisfied in a certain interval of x. Find the value
of a for which the length of the interval is 100 (two places).
23. If one attempts to solve the equation x = 1.4 cos x by using the formula x,,,, _
1.4 cos x,,, then in the limit, x will oscillate between two well-defined values a and b.
Find these and the correct solution to four decimal places.
24. The equation I - x + x'/2! - x'/3! +... - x"/n! = 0 (n an odd integer) has one
real solution e which is positive and, of course, depends upon n. Evaluate
by using Stirling's asymptotic formula (n - 1)! - e"n' 112.
25. The curves 3x2 - 2xy + 5y2 - 7x + 6y - 10=0 and 2x2 + 3xy - y2 - 4 = 0
have one intersection point in the first quadrant. Find its coordinates to four places.
26. The equation tan x = x has an infinite number of solutions drawing closer and
closer to a = (n + J1),r, it = 1, 2, 3, ... Put x = a - z, where z is supposed to be a small
number, expand tan z in powers of z, and put z = A/a + B/a' + C/a' + Determine
the constants A, B, and C, and with this approximation compute the root corresponding
to it = 6 to six decimals.
27. The equation sin x = 1/x has infinitely many solutions in the neighborhood of the
points x = n,r. For sufficiently large even values of it, the roots $. can be written as
fir + a + Aa' + Ba' + Ca' + , where A, B, C, ... are constants and a = 1/nn. Find
A and B.
28. Solve the equation x' - 8x' + 39x2 - 62x + 50 = 0, using Bairstow's method.
Start with p = q = 0.
29. The equation sin (xy) = y - x defines y as a function of x. When x is close to
1, the function has a maximum. Find its coordinates to three places.

46 EQUATIONS

30. The equation f(x) = 0 is given. One starts from x0 = t; + e, where £ is the un-
known root. Further, yo = f(x,), x, = xo - (xo), and y, = Ax,) are computed.
Last a straight line is drawn through the points

(x,, y') and (xo+x1
2 ' ,2.
2

If E is sufficiently small, the intersection between the line and the x-axis gives a good
approximation of $. To what power of a is the error term proportional?
31. Let {x;} be a sequence defined through z; = x; - (dx;)2/d2x;, where dx; = x;,, - xi,
42x,=x2- 2x.+, +x;. Compute lim e(z,-a)/(xj-a) when x. r, -a=(K+aj)(x, -a)
with IKJ < 1 and a; -, 0.

Chapter 3

Matrices
A good notation has a subtlety and
suggestiveness which at times make it seem
almost like a live teacher.
BERTRAND RUSSELL.

3.0. Definitions

Matrix calculus is a most important tool in pure mathematics as well as in a
series of applications representing fields as far apart from one another as
theoretical physics, elasticity, and economics. Matrices are usually intro-
duced in connection with the theory of coordinate transformations, but here
we will, in essence, treat them independently. The determinant conception is
supposed to be known although the fundamental rules will be discussed briefly.
Unless specified otherwise, the numbers involved are supposed to be complex.
A matrix is now defined as an ordered rectangular array of numbers. We denote
the number of rows by m and the number of columns by n. A matrix of this
kind will be characterized by the type symbol (m, n). The element in the ith
row and the kth column of the matrix A is denoted by (A);, = a;,1. The whole
matrix is usually written
an a,., ... a,,,

A= a9, a2, ... am.

aw,2 ... am,.

or sometimes in the more compact form
A = (a,,,)
In the latter case, however, it must be agreed upon what type of matrix it is.
If m = n, the matrix is quadratic; when m = 1, we have a row vector, and for
n = 1, a column vector. If m = n = 1, the matrix is usually identified with
the number represented by this single element. In this and following chapters
small letters printed in boldface indicate vectors, while capital letters in bold-
face indicate matrices.

3.1. Matrix operations

Addition and subtraction are defined only for matrices of the same type, that is,
for matrices with the same number of rows and the same number of columns.

47

48 MATRICES SEC. 3.1.

If A and B are matrices of type (m, n), then C = A ± B is defined by c:k =
a,k ± be,,. Further, if 2 is an ordinary number, 2A is the matrix with the
general element equal to la,,. If all as,, = 0, the matrix is said to be a null
matrix.
The definition of matrix multiplication is closely connected with the theory
of coordinate transformations. We start with the linear transformation

z: = a:ryr (i = 1, 2, ...,m)
r=
If the quantities y,, y ... , y,, can be expressed by means of the quantities
x x ..., x, through a linear transformation
P
yr = brkxk (r = 1, 2, ... , n),
k=1
we find by substitution

z; = r a:r k-1 brkxk = Lk=t1 r-I airbrk} xk .
=1

In this way we obtain z;, expressed directly by means of xk through a composed
transformation, and we can write

Zi = cikxk with Cik = airbrk
k=1 r=1

In view of this, the matrix multiplication should be defined as follows. Let A
be of type (m, n) and B of type (n, p). Then C - AB if

q, = t airbrk , (3.1.1)
r-1
where C will be of type (m, p).
It should be observed that the commutative law of multiplication is not
satisfied. Even if BA is defined (which is the case if m = p), and even if BA
is of the same type as AB (which is the case if m = n = p), we have in general
BA$AB.
Only for some rather special matrices is the relation AB = BA valid.
Transposition means exchange of rows and columns. We will denote the
transposed matrix of A by AT. Thus (AT),, = (A)ki. Conjugation of a matrix
means that all elements are conjugated; the usual notation is A*. If a matrix A
is transposed and conjugated at the same time, the following symbol is often
used:
(A*)T = (AT)* = AH .
Here we will treat only quadratic matrices (n, n), row vectors (1, n), and
column vectors (n, 1). First of all, we define the unit matrix or the identity
matrix I:
(I):k = Sik ,

SEC. 3.1. MATRIX OPERATIONS 49

where &,k is the Kronecker symbol

Sik=
0 if ilk,
1 if i=k.
Thus we have
1 0 0...
0 1 0...0
0 0 ... 1
0
In a quadratic matrix, the elements a,, form a diagonal, the so-called main
diagonal. Hence the unit matrix is characterized by ones in the main diago-
nal, zeros otherwise. Now form Al = B:

bik = : air&rk = aik
r=1
since 8, = I only when r = k.
Analogously, we form IA = C with
w

cik = r=1
E 3irark _ aik
Hence
AI=IA=A.
The sum of the diagonal elements is called the trace of the matrix:

Tr A = Ea,,.
:=1

A real matrix A is said to be positive definite if x"Ax) 0 for all vectors x # 0.
If, instead, x"Ax Z 0, A is said to be positive semideftnite.
The determinant of a matrix A is a pure number, denoted by det A, which
can be computed from A by use of certain rules. Generally, it is the sum of
all possible products of elements where exactly one element has been taken from
every row and every column, with a plus or a minus sign appended according
as the permutation of indices is even or odd.
det A = E ± a1,,1a,,,, .. a.,,..
The formula can be rewritten in a more convenient way if we introduce the
concept of algebraic complement. It is defined in the following way. The alge-
braic complement A,k of an element a,k is
A,k = (_ 1)i+k det a,k ,

where a,k is the matrix which is obtained when the tth row and the kth column
are deleted from the original matrix. Thus we have

det A E a{,kA:k =E aikAik
:=1 k=1

Instead. a right inverse and a left inverse exist and are equal. If in the ith row or kth column only the element aik # 0. Further. which are completely sufficient. When formulated algebraically.I )i+kaik det aik = aikAik . A is regular. that det A = 0 if two rows or two columns are proportional or. 3. c From these rules it follows. Last. if one row (column) can be written as a linear combination of some of the other rows (columns). If det A = 0. In this way we have proved that if A is regular.0. the problem leads to n systems of linear equations. Now let A be given with det A :?-. and hence X = Y. the following rules. i=1 The general rule is impossible to use in practical work because of the large number of terms (n!). We will try to find a matrix X such that AX = I. In a similar way. if AB = C. t aikAik = 0 k=1 if i #j. we form a new matrix C with the elements cik = Aki. if det A # 0.l is added to another row (column). are used: 1. we have det A det B = det C. If A _ (a bl -\ C d/ ' then detA=la bl _ad . A determinant remains unchanged if one row (column) multiplied by a number . we get more than 3 million terms. The inverse is denoted by A-': AA-'=A-1A=I.Aik = 0 if j :t. YAX=IX=X. The matrix C is .50 MATRICES SEC. Since A is regular. already for n = 10.bc. for example. 3. we see that the system YA = I also has a unique solution Y.1. all with the same coefficient matrix but with different right-hand sides. more generally. Premultiplication of AX = I with Y and postmulti- plication of YA = I with X give the result YAX=YI=Y. and t ai. every system has exactly one solution (a column vector). and these n column vectors together form a uniquely de- termined matrix X. A is said to be singular. then det A = (. 2.k . Further.

the matrix A is said to be of rank r.E akrbrti = Fa Crkdir dirCrk = (DC)i4 ' Hence GT = DC or (AB)T = BTAT. From the definition of the inverse. If the relation C.p rows and n . For transposing. we have the laws (A+B)T=AT+BT.u .2) 3. we prove (AB)T = BTAT. Suppose that we haven vectors u. and comparing. n . Put AT = C.2. we get AA = AA = (det A)1. A(B + C) = AB + AC. + cwuw = 0 . each with n components. and for inversion. and further (G T)ik =p $ki . we obtain a minor or subdeterminant of order p. If the matrix A has the property that all minors of order r + I are zero.. (3. we find (AT)-' = (A-')T. From the ex- pansion theorem for determinants. we have AA-' = I and (AT)-'AT _ I. Transposing the first of these equations.. If in a determinant. or A-' = (det A)-4. BT = D.p columns are removed. (AB)T = BTAT . A+B=B+A. Fundamental computing laws for matrices As is understood directly. and AB = G. while there is at least one minor of order r which is not zero. Further. the associative and commutative laws are valid for addition: (A+B)+C=A+(B+C).SEC.1.. (AB)-' = B-'A-' ..2. It follows directly from the definition that (AB)-' = B-1A-1. Then Cik = aki+ dik = bki. (A + B)C = AC + BC. 3.. u (row vectors or column vectors). . we get (A-')TAT = I. since (AB)(B-'A-') = (B-'A-)(AB) = I. the associative and distributive laws for multiplication are fulfilled: (AB)C = A(BC) . FUNDAMENTAL COMPUTING LAWS FOR MATRICES 51 called the adjoint matrix of A and is sometimes denoted by A.ul + C2S 3 + . First.

a*.2) and that all other terms contain 2 to at most the (n . has the only solution c. = 8. -2 aR. a*. a..2)(a.. it has the following form: a a. non- trivial solutions exist only if det (A . The equation can also be written (A-21)u=0. 3..--+ (-1)* det A = 0.. (3. = c.2) j=1 j=1 Equation (3. Thus E 2. Thus we obtain an algebraic equation of degree n in 2..'u. 3. If a matrix is of rank r.21) = 0.1) a*.2 I We notice at once that one term in this equation is (a .3. and every n-dimensional vector can be written as a linear combination of the base vectors. L72% = 0 .k is said to be orthonormal. A system of n vectors u. Thus a square matrix of order n has always n eigenvalues.3. . = Tr A . Eigenvalues and eigenvectors If for a given matrix A one can find a number 2 and a vector u such that the equation Au = In is satisfied. In this case they form a basis. 11 2.1) is called the characteristic equation. since otherwise we would have r + 1 linearly independent vectors and the rank would be at least r + 1.--+ a**)2*-' +.52 MATRICES SEC. in n dimensions such that u.. an arbitrary row vector or column vector can be written as a linear combination of at most r vectors of the remaining ones. 2 is said to be an eigenvalue or characteristic value and u an eigenvector or characteristic vector of the matrix A. An orthonormal system can always be constructed from n linearly independent vectors through suitable linear combinations.3. . . Explicitly..* a:. +.2)-power. in special cases some or all of them may coincide.. As is well known. the roots are the n eigenvalues of the matrix A.a .2) .3. This is a linear homogeneous system of equations. a** . Hence we can also define the rank as the num- ber of linearly independent row vectors or column vectors. = . = c* = 0..(a + a. Hence the equation can be written: 2* . a. (a** .... (3.3. the vectors are said to be linearly independent. = det A .

Now we suppose that the characteristic equation of A has the form f(x)=x"+c... and obtain S.S..l .)x. This method has been given by Leverrier.21)S] = det S-' det (A .+c...-.+nc. These are the well-known Newton identities. Putting S. TrA'.. analogously...l . TrA".. and the eigenvectors are the same..S. where S is a regular matrix.) and + . From this we see that if A has the eigenvalues . S. A is said to undergo a similarity transformation. The coefficients of the charac- teristic equation can be computed by solving recursively a linear system of equations containing TrA. Then we also have f(x) (x . It is easy to prove that the characteristic equation is invariant during such a transfor- mation. since det (B ..2) is easily obtained: J(X) = x"-' + (A + + C.+c. = E. of course..)x"`.x"_.) .A. + . + f(X) f'(x) = x) + x-2... EIGENVALUES AND EIGENVECTORS 53 If the equation Au = Au is premultiplied by A. x-1 The quotient f(x)/(x .SEC..-..2.-.+. .=0. . .+c. we have directly Tr A = S.=0.) + ft 2. we get A'u = . det S-' det S = det (S-IS) = 1._3 + . . (We have.)x"-. then A. we form f(A. + (1' and summing these equations for 2 = 2. in about 1840 (together with Adams). x) x-2.+nc.+c. we find f'(x)=nx"-'+(S.. 2.+2c.+C.-._.. 2. If a matrix B is formed from a given matrix A by B = S-'AS. computed the position of Neptune from observed irregularities in the orbit of Uranus.3. 3..+ Identifying with f(x) = nx"-' + (n . + (n .l ..+nc. A. S"_1 + cS.-.21) = det [S-'(A .x-'+.+C..S.. 1)c.1)c.2.A + C')x.1Au = 1'u and. A"`u = 2"u. = 0 . x-1 c._.)(x . . (x . ..21) . + c.+(S.S...) ..=0.21) det S = det (A .) + + 0. who.=.-.=0.. we get S.. A.S.. + . Last. A..has the eigenvalues 2-.

29. The two groups of vectors u. ui ATvk = 2ku. . u and v v . Since the eigenvectors are determined. The second equation is multiplied from the left by u. A r1Jk = 2kVk The first equation is transposed and multiplied from the right by vk. be- cause u. . ATvk = 2. consequently. VT = U``. are said to be biorthogonal. as a rule. But B = S-'AS. we get U = V and UT = U-'. . = 1 . u. vk . # 0. that is UT _ V-'. v. = 0 would imply uTw = 0 and u.u. . and consequently. U`'AU = V-'ATV = D.. .v. From the column vectors u u . . that the eigenvectors of A and AT are u u .. Subtracting. Then we have directly UTV = VTU= I. and v v ... We will now suppose that all eigenvalues of the matrix A are different. This is an example of the transformation of a matrix to diagonal form. we have Au=2u. and we have u=Sv.. .3. S-'ASv = 2v or A(Sv) = 2(Sv). since 2i # 2k. and further from the eigenvalues 21.. If u is an eigenvector of A. we see that u. Because the determinant remains constant when rows and columns are inter- changed. By=2v. v..u. Further.54 MATRICES SEC. Thus we have Aui = 2. Hence Sv is an eigenvector of A corresponding to the eigenvalue 2. ..2k)u.. Uand V. In the special case when A is symmetric. Using the fact that an arbitrary vector w can be repesented as w = E c.. Usually this concept is obtained by defining a vector transformation x=Ou. 3. are different. u... except for a constant factor. 2.. v. Hence A and B have the same eigenvalues. we form two matrices. we have AU = UD and ATV = VD and. vk = 0. respectively. A matrix with this property is said to be orthogonal. vk = 0 or u. and further.. we find (2. both with the same eigenvalue 2. it is obvious that they can be chosen to satisfy the condition u. and v is an eigenvector of B. the eigenvalues of A and AT are the same. a diagonal matrix D. y=Ov .0. while the eigenvectors... and v. vk . . v. .

Ix. M is not known and instead we have to accept the some- what weaker estimate 121 < max t laikl (3.E l aM. Choose M in such a way that IxMI = max.3. Then we have directly 2xM = r=1 E au. Suppose that 2 is a characteristic value of A with the corre- sponding eigenvector x.. which gives OTO = I.4) k i=1 and the better of these estimates is. 3. Later on we will treat special matrices and their properties in more detail.l <. and 121 IxMI = J E aM.affil < Pi (3. EIGENVALUES AND EIGENVECTORS 55 and requiring that the scalar product be invariant.x.l.SEC. x..l Ix.I .xM 12-aMMI <_EIa. r=1 However.VMl<_EIaM.x.5) and also inside at least one of the circles 12 . rtM we find in an analogous way 12-a.3.3.3) k-1 The transposed matrix has the same set of eigenvalues and hence we also have the estimate 121 < max t laikl (3.I . Writing (2 . Thus xr'y = (Ou)TOv = UTOTOD = MTV.3. Any characteristic value 2 must lie inside at least one of the circles 12 . The components of x are denoted by x1. (3..I > from which we obtain the estimate 121 5 IaM. of course. in general. preferred.MI Introducing the notations Pi = Ekti laikl and Qi = Ek*i lakil' we can formulate the inequalities as follows. We will now take up the problem of giving estimates for the eigenvalues of a matrix A.. x .3. .l < IxMI E IaM..6) .aMM)xM aMrx.aril < Q.

and all eigenvalues lie within the intersection of these two unions. and x be an approximate eigenvalue and eigenvector. Let A be a given matrix.1)/2 Cassini ovals lz . where every Sk induces a "small" change of A. 12156. 3 2 7' A= (-1 0 5 .Q. which means that the circles have degenerated to points. we find trivially 12 . at least so far as the domains are disjoint. Brauer has shown that the two unions of n circles each can be replaced by two unions each consisting of n(n . From the beginning the circles are disjoint and every circle contains exactly one eigenvalue.+32)y=0. + 32)xTy. 4 -3 6 We find that 2 must lie inside the union of the circles 12-3159.l lz . Using the previous inequalities on the matrix D.3. Hence xTA = . who found that Pi and Q.-° (0 5 a 5 1). Finally. A = 2a + 32 and y be the exact eigenvalue and eigenvector. Then we can gradually transform D back to A.akkl 5 Q. The points will then become circles which grow more and more. 3.lax" + 6T and xTAy = 2axTy + try.a.a. When a = 0 or a = 1.Ail 5 0.56 MATRICES SEC.llz . Combining the obtained inequalities. then exactly two eigenvalues are included. j = l(1)n-. 1z-6157 . respectively.Qk. Here we will also show how an approximate eigenvalue can be improved if we also know an approximate eigenvector. Suppose that we have a matrix A which can be transformed into diagonal form: S-1AS = D. xr'Ay = (2..akkl SP. On the other hand.pk (k>j. and comparing we get: x?y XTX EXAMPLES 1. Ay-(2. Now assume that S = S1S. 2. Then Ax-tax=6. In this way one can often get more precise information on the distribution of the eigen- values. If two circles overlap partially. could be replaced by R.. one can easily understand that all eigenvalues must lie inside the intersection of the union of the first group of circles and the union of the second group of circles. = P. and so on. but they have been improved by Ostrowski. SN.1) and lz . the inequali- ties of Gershgorin appear again. The estimates discussed above are usually attributed to Gershgorin.

-2 -a. with the characteristic equation det (A . Consider the equation J(z) =z*+a.28 ± 3.288. 0 0... 30...31 S 9.. =0.53 7 5 6 5 0..0 A= 0 1. 3.Al) = 0 or -A 0. the first two circles are subsets of the third.55 7 5 9 10 0.. This equation can be understood as the characteristic equation of a certain matrix A: j0 0 .288685).38 A = 1. Since max. 121 5 6 is a subset of 12 ... 1 -a. Ia. Zk_..EI = 18.EJ = 13 and max.3..56 and .z*-'+.. 0 -a* 1 -2. EIGENVALUES AND EIGENVECTORS 57 and also inside the union of the circles 12-3155.0 0 0. This is the so-called companion matrix.+a =0. we also have 121 S 13. 0 0. The treated case concerning localization of characteristic values of a matrix can easily be generalized to arbitrary algebraic equations. -2 -a.. The eigenvalues are approximately 9. X21<5. X 8 6 10 9 0.. In the second group..0. 11-61x12. Ja. 2. 0 -a*-. Hence the final domain is the union of12-31 61 57. Z%.501 and clearly lie inside the domain which has just been defined.. but this condition gives no improvement. Hence A ..288 (the exact eigenvalue is I = 30.. = 30.. In the first group..52 We find that and 82 = 0. 1 0. 10 7 8 7 0.. 0 -a.SEC. 1 .

58 MATRICES SEC. Special matrices In many applications real symmetric matrices play an important role.+-a. (3.l} and Iz + a. f Ia. Naturally.I . + la. Iz + ail = E Ia. Let 2 be an eigenvalue and u the corresponding eigenvector: Hu=Au. and Q.I + . characterized through the property HH = H.) and r"=la'zy'+. and obtain the result: Every root of the equation f(z) lies inside one of the two circles IzI = 1 .4. and so on.4. we find that za = -(a. Now we use the estimates with P. we get H*u* = HTU* _ Au* .2) by u gives u"Hu = 2*u"u = IOU. The condition H" = H can also be written Hr = H*. this is easy to prove directly.. Again we return to (3.l) < r"-' which is absurd.. la.l).4.I + .4.. The estimate III S max. 3.1) We get by transposing and conjugating: u"HH = uHH = A*uH.l = 1 .4.-a Every root also lies inside one of the two circles IzI = max (max (1 + la. they are a special case of the so-called Hermitian matrices. and. 3. We multiply the last line by 2 and add to the preceding one. Supposing that f(z) = 0 has a root zo whose absolute value is r > 1. 1 real. and since On # 0.kl gives the sufficient condition Ial < 1 that all roots of the equation f(z) = 0 lie inside the unit circle. we get I* = 2. multiply this line by 2. Ia. (3.2) Premultiplication of (3.l < r"-'(l all + la.. Es_.. and in this way we find f(A) = 0. this is due to the fact that all the eigenvalues of any Hermitian matrix are real.4... . to a large extent.I r"_Z S r"-' + la. How- ever.+ a.4. and add to the preceding one.1).zo-' +.. Such matrices are extremely important in modern physics.4...1) by uH and postmultiplication of (3. that is.1). Conjugating (3.

k = 0 .k = 0. Transposing and conjugat- ing. for if UAU" = D..3. 4. 2)-element: t ItZkI = ItflI' k=t (since t = 0) and hence t. where the eigenvectors were composed of matrices U and V with UT V = I. 3. It might be of interest to investigate if such transformations are always possible also for larger groups of matrices than Hermitian. We have seen that Hermitian matrices can be diagonalized by unitary matrices. Postmultiplication with Ux = Ax gives us x"U-'Ux = A*. and hence we have proved that a Hermitian matrix can be transformed to a diagonal matrix by use of a suitable unitary matrix. we have for Hermitian matrices V = U* and UT U* = I. and since x"x > 0. .7) an arbitrary matrix A can be factorized in U"TU where T is. . Consider Ux = Ax. (1) The condition is necessary. . for example. k = 3.. As has just been proved. n. k=1 Next we compare the (2.lx"x. upper triangular. or after conjugation. A"A = AA". The condition AA" = A"A im- plies U"TUU"T"U = U"T"UU"TU.. U" = U'. 1)-element on both sides and find: E It.SEC.4. we find that x" U" = x"U-' = 2*x". TT" = T"T. because according to Schur's lemma (which is proved in Section 3.. while the eigenvalue is the same. This condition is trivially satisfied if A is Hermitian. k = 2. We now compare the (1. 3. (3. We then repeat the argument of Section 3.3) Matrices with this property are said to be unitary. that is. we get A = U"DU and AA" = U"DUU"D"U = U"DD"U = U"D"DU = U"D"UU"DU = A"A (2) The condition is sufficient. . Previously we have shown that a real symmetric matrix can be diagonalized by a suitable orthogonal matrix. t.kI4 = that is. n . SPECIAL MATRICES 59 which means that a* is an eigenvector of HT if is is an eigenvector of H. The special case is well known from analytic geometry (transformation of conic sections to normal form). .4. Consequently T is not only triangular but also diagonal. we get 121 = 1. which is obvi- ously a special case of this more general property. We shall then prove that a necessary and sufficient condition that A can be diagonal ized with a unitary matrix is that A is normal. where U is a unitary matrix. that is.

Let u. a real orthogonal matrix of odd order must have at least one eigenvalue + I or .A)a:u: . and U. and O.1... A* = A A purely imaginary is complex antisymmetric (or 0) (except when .. are orthogonal.. Rewriting r we find r = E (A: .J2 . the characteristic equation is real. For a real orthogonal matrix. are unitary.6) such that r'r = e'. Analogous calculations can be performed also for other types of matrices. and hence complex roots are conjugate imaginaries.)'= U'. and the result is presented in the table below. we point out that if U..0. we can show that if O. Then x= aim. Since the absolute value is 1.1. they must be of the form ei'.Ax. 1 . Let the residual vector be r = Hx ..l = 0) Orthogonal AT = A-' If A * ± 1. Type of matrix Definition Eigenvalues Eigenvectors Hermitian AM=A 2 real is complex Real.2 S e.4. Then we shall prove that for at least one eigenvalue we have 12. Similarly. then also U = UU2 is unitary: U' = U2UH= U:'U1'=(U. symmetric AT = AS = A A real is real Anti-Hermitian A" = --A d purely imaginary is complex Antisymmetric AT = -A If 2 * 0. . 3.. and further t ia:I' = 1 ... . be an orthonormal set of eigenvectors of the Hermitian matrix H corresponding to the real eigen- values 21..U.and we shall here give a brief account of his result for Hermitian matrices defined through the property HX = H. then the same is true for 0 = 0. Further. and therefore 61 =EIA. A* =A III= 1 is complex Unitary Am = A-' Al x= 1 is complex Finally. is.Ia. then uTU = 0 Real. AT = -A. let 1 be an approximate eigenvalue and x the corresponding approximate eigenvector assumed to be normalized.. with the Euclidean norm e (cf. further. e`". then uTD = 0 Real. orthogonal AT = A-'.-21. Wilkinson [101 has obtained rigorous error bounds for computed eigensys- tems.60 MATRICES SEC. . u . 3.

Thus we get IHX-ARXI'lail2IA .1. we obtain the residual vector r = Hx .III at) a/ \1 .Aiui) _ E Ai JaiJ2 and Hx . but it is possible to find a much better result.. 3.i 2i Jail%.SRI t 12i . we have not used the fact that H is Hermitian.)1I <. and that IZR . Hence we get p = xHHx.2.4. where rHr = e2. cf.12 Se and hence a fortiori EZ Jail' ARI a' E Jail' .2. of course. So far. 1 1 and hence we conclude that 12. For. which is the so-called Rayleigh quotient. we get instead xHHx/x"x. SPECIAL MATRICES 61 Now. then we would get lei .fex) = x"Htx .fix. we further obtain 2R t Jail. . IAi . We know. 3. From 2R = £.ARx with the norm e. if IAi . Now we seek a real value ft which minimizes the Euclidean norm ofHx .2 fix"Hx + p-. Using the notation 2R for ft.Ail Z a for i = 2..ARx = E ai(Ai .5)].frX)H(Hx . Thus 1a111= 1 .pxI' = (Hx . I HX .at .> I .1 S e.21 S e for at least one value of i.SEC.ZR)ui .21' jail' > s' t Jail! = e.Al > E for all i.. We suppose that ZR is close to 2. by definition.2R1 Jail' = L/ r 1 Iz+ . 2R = XHHX = (E a. Jail. n. . that I2R . [If x is not normalized. (6._ 1 1 zi Jail' or Hence la21' (AR .2Rl' Jail' S Q ZRJ_ Jail_ <a and (1/ IzR . AR) la11' IAR .A) jail'.

where z is a variable and a. Replacing z by A.. Matrix functions Let us start with a polynomial f(z) = z* + a. + a. The sum of the series will be denoted by f(A).5. we get a matrix polynomial . + a. Premultiplication with A gives the result A2u = . EXAMPLE 10 7 8 7 0. . Obviously.2885 (the correct eigenvalue is 2 = 30. the series (I .. In this derivation a few additional computations gave an improved eigenvalue together with an error bound for this new value. In the same way we find that f(2) is an eigenvalue of the matrix f(A).2Au = 21u.5.. and j1 = 30. We suppose that the series is convergent in the domain Izi < R.. Here f(A) is a square matrix of the same order as A. Generalization .A + a. for example. 3. . If a matrix A undergoes a similarity transformation with a matrix S. then the series aol + a.53 7 5 6 5 0. that is 22 is an eigenvalue of A'..z + a.55 7 5 9 10 0.1) then the same is true for A': S-'A'S = (S-'AS)(S-'AS) = B2. a2.38 H X = 8 6 10 9 ' 0.9982.A + a. a* are real or complex constants.. (3.288685). Then it can be shown (see Section 3. the eigenvector is unchanged..5. + a.-...`(A) = A* + a.z*-' + . this can be directly generalized to polynomials: S-'f(A)S = f(B).52 We then find x"Hx = 30. for example. S-'AS = B.A)-' = I + A + A' + converges only if all eigenvalues of A are less than 1 in absolute value.62 MATRICES SEC.zl + . 3. Some series of this kind con- verge for all finite matrices. Suppose that 2 is a characteristic value and u the corresponding characteristic vector of A: Au = Au.A2 + is convergent.8) that if A is a square matrix with all eigenvalues less than R in absolute value.A*-' + . We will now investigate eigenvalues and eigenvectors of f(A).2340. x"x = 0.1.. It is natural to generalize to an infinite power series: f(z) = a. Az AZ 2! 3! while. . . where A is a square matrix.

l)*f(A) = det (21 ..SEC. . + C*) det (II ... Hence the case when some eigenvalues are equal can be considered as a limiting case.. since S`'A-'S = (S-'AS)-' . + C.. or as a poly- nomial with constant matrices as coefficients. .A) I = (. but we do not want to go into more detail right now.. All elements of C are polynomials of 2 of degree not more than n . (3. The division has a good meaning.. are certain constant matrices. . + c* . . 0 = 0... + c*)I.5.l*-' + . C.. Thus (. Cayley-Hamilton's theorem: Any square matrix A satisfies its own charac- teristic equation. For this.l° + c. f(2*) Premultiplication with S and postmultiplication with S-' gives f(A) = 0 . We also conclude directly that S-' (f(A)1 s = f(B) \g(A) // g(B) where f and g are polynomials... 3... by which we mean a matrix whose elements are polynomials in a variable.5.. S-'V(A)S = J(D) 0 0 . and further f(21) 0 .2*-' + C31*-' + . Now we consider the adjoint matrix C of the matrix Al . (21 .. Thus .l*-' + .A) -_ 2* + c. we give a proof in the special case when all eigenvalues are different and hence the eigenvectors form a regular matrix S. Hence C can be written C = C. First. and in fact. we need the concept of polynomial matrix. since we have full commutativity. Such a matrix can be written as a sum of matrices. MATRIX FUNCTIONS 63 to the case when f also contains negative powers is immediate. 0 0 f(in) .A. Let D be a diagonal matrix with d. with small modifications the given proof is still valid. each one containing just one power of the variable. Let f(2) be the characteristic polynomial.B-' .2) A general proof can be obtained from this by observing that the coef- ficients of the characteristic polynomial are continuous functions of the matrix elements.A)(C. where C C . we will also give another proof due to Perron. = 1. However. Then S-'AS = D.A*-' + C'2-1 + ..1. Under certain conditions the results obtained can be further generalized.

Hence Cjr(2)(ajk . A shorter and more direct proof is the following..AC.23jk in the matrix A . 3. Identifying on both sides. =c31.72 + 10)1. we get: Cl = 1.k (ajk .1+ 10)1.5. -A'= -7A + 10I or A2 . Replacing 2 by the matrix AT.4)(2 . EXAMPLE A = (4 22 1I-A=(I-14 ).AC* = c.A'')ek = 0 . Premultiplication of the first equation by A*.l)I = (22 .21.A)C = ((2 .2 0 0 (1-4)(A-3))-2 ) _ (-1)2f(2)I .k . = LJ ajkek or E (a.AT 3jk)ek = 0 . k j This relation is valid for all r. C1=I. [A'+(C.2jk) =J 2)ork i and this /equation is valid identically in 2. the second by A*-'..A)(11 + C. and hence we have f(AT) = 0 or f(A) = 0. 2 23)' C=(213 224)21+( 1 -4I' (21 .1. .7A + 101=0.A*-' + + c*I = 0. Denote by ek a column vector with 1 as the kth element and 0 elsewhere. -A= -71 -AC. and hence f(A) = 0.]I-(2'-7.3. AIC. we find J (AT )er = EJ(AT )Srkrk = L c.(AT) r. k k Let cjk(. C. .AC.64 MATRICES SEC. Then we have ATe. C.-A).1-AC.) = f(. .l) be the algebraic complement of ajk .3) . gives A* + c.1. followed by addition. . C'=( 3 1 2 -4) ' (21 . =c. and so on. = 101.

Then eA and aA + bl have the eigenvalues e". It is easy to prove that m(2) is a factor of f(2). since f(A) = 0 and m(A) = 0.2). we have eA=aA+bl.-A1' Z. EXAMPLE Consider e4 where A is a two-by-two-matrix. Dividing F(x) by fix). and a2. and 2. MATRIX FUNCTIONS 65 Cayley-Hamilton's theorem plays an important role in the theory of ma- trices as well as in many different applications. 3. Replacing x by A.A. (3. From the preceding discussion. + b. A. we get F(x) = f(x)Q(x) + R(x) .2.e'. we get f(A) = m(A)q(A) + r(A) or r(A) = 0 .SEC.:f.1.5. Suppose that the eigenvalues of A are 2. However. Replacing d by A.. It should be mentioned here that for every square matrix A. we find F(A) = f(A)Q(A) + R(A) = R(A) . since we can write f(2) = m(d)q(2) + r(. In most cases we have m(A) = f(A). this is in contradiction to our supposi- tion that m(2) is the minimum polynomial. Hence eat=a. e2 =aA. + aNl (N Z n) .3) Without proof we mention that this formula is also valid for functions other than polynomials.). b= Al 21 Il 1I' . + b.5.. there exists a unique polynomial m(A) of lowest degree with the leading coefficient I such that m(A) = 0. Now consider a matrix polynomial in A: F(A) = AN + a. where r(2) is of lower degree than m(2).AN-l + .e'1-l. or written by use of determinants: e2' e'2 a= je2i e2. (2.?. respectively. e2 . From this system we solve for a and b: a=e''-e'1 b=.+b.+b.2. a2. where R(x) is of degree S n .

F(2 23) F(21) 0 0 0 0 0 0 0 0 0 0 0+0 F(.. 0 0 0 0 0 0 0 0 F(Z Multiplying by S from the left and by S-' from the right.5. 3. In the general case we try to determine the coefficients a. 0 0 0 23.) 0 0 0 F(A3) Premultiplying the right-hand side of (3.0 2.i3 . 2 in D by F(A1).66 MATRICES SEC.?3 0 +. ..5. We will first treat a three-by-three-matrix A with the eigenvalues A A and 2.2. we get (D . An alternative expression for F(A) is due to Sylvester. 0 0 0 23-. . .28)(2.Al) (13 23) Let S be the matrix formed by the eigenvectors of A.. is obtained by replacing the elements A.(2) + (A .231) (3. in F(A) = a. Then F(A) should be a polynomial of second degree. .. .231) F(A l ) + . Generalization to matrices of arbitrary order is trivial.. we regain (3.23) + (A .. . 0 0 S-'AS = D .111)(A .5. 0 0 . (supposed to be different). .231)(D .)._. (3. 0 F(2. 0 0 Il 0 0. 0 0 0 ('ll .6) by S-' and postmultiplying by S. and F(21) 0 0 S-'F(A)S = F(D) ..) 0+0 0 0 F(D) . F(1. A with D= .5.231) (A))- F(21) AZ)(A . (Al .23) d.5. .23)(A1 .1) F. 1 1e .2..l..A"-' + + a1A + ao1.6). Then 2.l.6) F(~3) (23 .231)(A . . (3.1 .. ..5) D D. . and we will show that it can be written F(A) _ 2j) (A .5.4) and in this case we find I 1 . ..

1 we will restrict ourselves to the special matrix A112 corresponding to the values 2 and i.. The most common norms are: (a) The maximum norm IxI = maxi Ixil. = 5 and 2. (b) The absolute norm lxi = Ei Ixil. Given A = (5 2 -22 find A"'.30543 0.6. . Compute A-' when A = We find 2. Ixil2)'r2. 3. In this example with 2. (c) The Euclidean norm IxI = (E. We obtain directly: log A = 1 (log 50 log 6. = 2.5 log 20 I . 3 5' 3 2' . = . find log A. ±1/!. then X becomes real only if all eigenvalues of A are real and positive.(0. . i -6+3i 2 -1/ 5 ( 2 -6/ -5 5 4 .047 -0. Norms of vectors and matrices The norm of a vector x is defined as a pure number Ixl fulfilling the following conditions: I. Since the new characteristic values can be combined with different signs (± .25) = (1.30401 0. 3..SEC. Given A -\1 2). Hence Al/2 = /6 -3) 2 (1 3 i_ 1 (12. NORMS OF VECTORS AND MATRICES 67 EXAMPLES 1.61086)3 (log 2.039 0.) we generally obtain 2" solutions of the matrix equation X2 = A.6. Icxl = Ic! Ixl for an arbitrary complex number c . If A is real..086/ 2. . 2.\-0.078). that is.99858/ 3. the matrix X for which ex = A. Hence 2 2 -I 2 A-2 \ 1 -2/ 1 _ ( 0. Ixl>0forx#0and101=0.2i -2 + 6i/ 3. Ix +Ylslxl+lyl.4. 2.

Iaiki'TI= ilk (c) The Hilbert or spectral norm I IAI I2 = V. In most applications matrices appear together with vectors. p = 1. Iaikl . respectively. IIA+BII <_IIAII+IIBII: 4.6. we will use it for certain estimates. We note that with this definition we always have 11111 = 1.*o IxI Hence the condition of compatibility can be written IAxI S IIAII IxI A matrix norm defined in this way is said to be subordinate to the vector norm under discussion. for p = oo. we define a com- patible matrix norm by IIAI I = sup I AxI . where S is a regular matrix. 3. the relations (3. In particular. 2. we observe that N(1) = n'12.-norm. is the largest eigen- value of H = A"A. defined by Ixl = {E Ixil'}icy . since it is so easily computed. In this case the inequality by Cauchy-Schwarz is fulfilled: Ix"YI S IxI IYI Analogously. To a given vector norm. IIABII s IIAII IIBII The most common matrix norms are: (a) The maximum norms IIAII = IIAII. i respectively. (3.o = max Ek Iaikl i and IIAI I = IIAII1 = max k r. .2) 3.68 MATRICES SEC. For this reason the relations (3. a matrix norm is a pure number IIAI) such that: 1.6. . we can con- struct another norm IIABII defined by IIASlI = IIS-'ASII. As is easily verified. IIcAII = Icl IIAII for arbitrary complex c . We will mostly use the Euclidean norm. Here we point out that from an arbitrary matrix norm IIAII.2) are valid for IIAII5 if they are satisfied for IIAII. (b) The Euclidean norm N(A) = [Tr (A"A)]'i2 = [Tr (AA")J'/2 = [F. and p = 2. denoting the Euclidean matrix norm by N(A). where 2. Hence this norm is not subordinate to any vector norm. All these norms are special cases of the more general 1.6.6.2) should be augmented with a supplementary condition con- necting vector and matrix norms. Never- theless. IIAII>Oif A#Oand 1I0II=0.

. The third con- dition follows if we take a vector x with norm 1 for which I(A + B)x1 takes its maximum value.kxk <_ in L k=1 Iaikl and further iylm = max Iy. Then we have Lull = IAxll = E IV. Then 11A + B11 = I (A + B)xl = lAx + Bxl < IAAxi + IBxI <_ IIAII IxI + IIBII IxI = IIAII + IIBII .. max. E. the compatible matrix norm satisfies (3.6. that is. = in while all other com- ponents are equal to zero. ixii = m and form y = Ax. n. Then we have for the vector y = Ax: It lyil = It a. Let p be an index such that E Ia.kl.6. that is. Again let x be a vector with norm 1 but now such that IABxi takes its maximum value. = m maxi Ek laikl Hence we have xl I sup I S °° = max E laikl The proof that the second maximum norm is compatible with the absolute norm is similar. We then choose the vector x in such a way that x. NORMS OF VECTORS AND MATRICES 69 I We will now show that for a given vector norm. the fourth condition follows in a similar way. . aikxkl = mi=1 t laial = m max E faiki i=1 k k i=1 . Last. Then follows: lAxl = IF.. Ia.kl = max E laikl k i k and further suppose lxkl = m with sign (xk) = sign (a. lxii = m. Choose a vector x such that lxl. iyl . We shall now prove that equality can occur and suppose that E.=1 Iaikl takes its greatest value for k = q. 2.l < m Max E laid i i k We shall then prove that equality can occur.2). Hence IIABII = IABxI < IIAII IBxl 5 IIAII IIBII IxI = IIAII IIBII Put IIAII = maxi Ek laikl and choose a vector x such that IxI.SEC.kl. = m Ek a.aikxkl < Z F. = in. laikl 14 < E lxkl E laikl < }max k ]t Iaikl} E I xkl ki-I i=1 m max t laikl k i-1 Hence we have I AxIII xl 5 max. 3. The first two conditions are practically trivial. This implies y. = E.k) for k = 1.

. (3..+ Since lxl = 1. + . and choosing x = u we find directly IIAII - In the following discussion we will use the notation IjAlj for the Hilbert norm of a matrix.12 5 z.'x Clock Abk = ck .. be the ith row vector of A..' = N(A) Ixj2)": IAxI = \E Ia.70 MATRICES SEC.: x = au. Z 2. we get N(A) < n"2 IIAII. But IJAII = maxis. .6. c..6. and we assume that it has the eigenvalues 2. k i However. (3. and hence xI2).6. Thus we have sup IAxI. we then obtain N(AB) < N(A) N(B) . . we have N(A + B) < N(A) + N(B) obtained simply as the usual triangle inequality in n2 dimensions.3)....3) If B = I.. <_ \E Ia:I2 Ixl From this we see that IAxl/IxI 5 N(A) and also that IIAII < N(A). The vector x is now written as a linear combination of the eigenvectors u u . Hence 1412 <_ IIAI12lbk12 Summing over k and taking the square root we find N(AB) < IIA II N(B) . let a. . (3. Z .6. 3. x. since Ickl/lbkl corresponds to a special choice of x..l2 = I and further. respectively. Further. x"Hx = 2. Then bk).. Using (3.(Ia... b and c c.5) n-1j2N(A) <_ IIAII < N(A) .l2 + ja212 + + la. we have the estimates IIA II < N(A) < n1j2 IIA II . la112 + 22 1a212 + .. - we have x"x = Ia.6. = max E la:kl s:o Ixl..l Z 0. u... .. 12) _ 2. For the Euclidean matrix norm... in the following we will concentrate on the Euclidean vector norm and the Hilbert matrix norm. N(B) = \E by and N(AB) OP \4 1 But IIAII = sup :so IAxi I z Ibkl . we have IAxl2 = (Ax)"Ax = x"A"Ax.. Then the vector Ax has the components a. The matrix H = A"A is Hermitian. and show first that they are compatible. . Further.. + au. IAxI. + 1 Ia. let C = AB and denote the column vectors of B and C by b b . + la.j2 + .4) Summing up. Starting from a vector x with norm 1.l2 + la.

However. Then we have 11C11 = sup I U"HUxl IzI=1 sup (x"U"H"UU"HUx)'"2 = sup (y"H"Hy)'"2 = IIHII .6) :. NORMS OF VECTORS AND MATRICES 71 SEC. If the elements of a matrix A are bounded through laikl < c. where C is diagonal. while the right-hand side is = IIA"AII Hence IIA"All Z IIA112 On the other hand.v*o 1XI IYI For a moment we consider the matrix A"A and obtain l y"A"Axl sup Ix"A"AxI < sup 2*0 IX12 =.6. Putting H = A"A. If Uis unitary. 3. we have. then we also have ITr(A)l <nc. Now we choose two vectors. 1XI IYI IAxI lxl 1XI that is. Thus the norm remains invariant under a unitary transfor- . IIAII <nc. the equality sign can be valid in the inequality. using Cauchy-Schwarz' inequality. Iy"Axl < IAxI IYI _ IAxi IXIIYI 1XI IYI lxi and putting y = Ax.7) From this relation we can compute l IAII in an alternative way. 121=1 i.v*o 1XI IYl On the other hand. N(A) <nc. both nonzero. we have Ix"A"yl IIA"11 = sup = sup IY"Axl = IIAII IYI lX! =. x and y.6. Observing that y"Ax = (x"A"y)* is a pure number. the inequality IIBCII < IIAII IIAII gives IIA"AII < IIA"11 IIAII = IIA112 Hence we finally get IIA"AII = IIA112 = IIA1112 .i=1 where y = Ux. the left-hand side is sup.*o IAx12/1X12 = IIA112. Hence we also have IIAII = sup lY"AxI (3. we obtain further lY"Axl _ IAxH HAxi __ IAxI . we see that H is Hermitian and can be diagonalized with the aid of a unitary matrix U: U"HU = C.6.v*o 1XI IYI since the left-hand side is a special case of the right-hand side. we have (X"x)112 I UXi = (x"U"Ux)112 = = lxi . (3.

A)-III 5 1/(1 . that is. For a moment introduce v = (E Ih.A II (I' 1 1 (3.72 MATRICES SEC.IIAII)Ixl. Since 11 -All = IIAII we can write FT ]IIAII <_ II(I + A)-'II <_ T_. Let 2 and u satisfy A"Au = Au. Then we get from (3. From I = (I . since I(I-A)xl=lx-Axl.A)-' .kl2)'12.A)x # 0 for an arbitrary x 0. Next we are going to deduce some estimates of norms in connection with inverses of matrices. which means that I . But the norm of C is trivial: IICII = A.A)-' = (I .k) is a Hermitian matrix. . mation. 2 = {(Au)"Au}/u"u > 0. are the eigenvalues of H. (3. and hence (I .lxI-IIAIIIxi=(1 .A)-' .A(I . Hence IIAII=IIA"II= VT.8) If H = (h. is the largest eigen- value of H = A'tA.A)-'II z IIA(I .6. we have IIAII II(I .A is regular. If IIAII < 1.IIAII) Analogously. we find that II(I + A)-'II > 1/(1 + IIAII). while an arbitrary Hermitian matrix may also have negative eigenvalues.A exists. This matrix is positive semidefinite.9) 1 1+I.All _<II(I-A)-'II< 1-IIAII Both inequalities are valid under the assumption that IIAII < 1. It should be observed that all eigenvalues of the special Hermitian matrix A"A are positive. Then WHA"Au = Au"u. Inn where 2. Then all the characteristic roots 2. of Hare Z0. 3. is the numerically largest eigenvalue of H. .6.6.A)-'.III zII(I-A)-`II ..1 and consequently II(I . since for arbitrary x we have xHHx = xHA"Ax = (Ax)"Ax > 0.A)(I . then we have 7 1)2 N(H) _ [Tr (H"H)1'12 = [Tr (H2)I'I2 = (E .6.> Ixl-IAxi. where 2. if 1. the inverse of I .5) V SIAM<v.) .A)-'II = II(I . by treating I = (I + A)(I + A)-' in a similar way.

6. This quantity can also take negative values. Hence II(A+B)-'-A-'jl< IIA-'II (3.1 where 2.(A + B)-'Il = IIA-' . because there exist a num- ber 2 and a vector x such that Ax = . then we have p(A) = IIAIJ In the theory of differential equations. p(A) = IAxI /I xI 5 sup I AyI /IYI == IIAII If specially A is Hermitian.(I + A-'B)-'A-'Ii <_ IIA-'II III .10) i.lx and 121 = p(A). is the largest eigenvalue of A"A. For further details see.7. 3.A-'II = IIA-' .a ' where a = I IA-'BI I is assumed to be < 1. that is.6. TRANSFORMATION OF VECTORS AND MATRICES 73 Last.I)II < IIA-111 IIA-'BII II(I + A-'B)-'II <_ IIA-'II 1 a. for example. 1 a We have previously defined the spectral norm and shown that IIAII = where 2. It is defined in the following way: jt(A)= lim III+sAII . Denoting .7. We are now going to show how a system of orthogonal vectors can be constructed from them. Hence IAxI = p(A) I xI .12) where 2.6. (3. Dahlquist 151. . For each matrix norm corresponding to a vector norm p(A) < I IAI 1. we will show one more inequality which is of interest in connection with the inversion of matrices.SEC.12.or column-vectors. We have II(A + B)-' ..6.(I + A-'B)-'II = IIA-'II I I(I + A-'B)-'(I + A-1B ..8): p(A) = 2. Transformation of vectors and matrices An arbitrary real nonsingular square matrix can be interpreted as a system of n linearly independent row. 1 (3. are the eigenvalues of A.11) Neglecting quadratic and higher order terms in E. we have from (3. is the largest eigenvalue of the matrix H = J(A + A"). and hence it is no matrix norm. We now define also the spectral radius p(A) = maxis:.7): III +EAII=III +E(A+A")II"2 and further from (3. a measure p (A) is sometimes used.6. 3.

1 "k =I' [anT wk/wkT wk] wk J/ It is obvious that w{ wk = 0 if i k. T T wi = Qi .. These equations can be written in matrix form WB =A. we form w..k = w.... = a. . First.).: u = E ciui . . . and d. w.74 MATRICES SEC. that is. ... .\ 0 1 b.wi.. the initial vectors by a a . where W = (w w3.7. . b. = a.. .). A = (a a . B_ 0 0 1 . we get Au = Au = A E ciui = E ciAiui .LJ [airwk/wk wkJI wk ... k=t w.. Previously we showed that an arbitrary Hermitian matrix can be brought to diagonal form by a-unitary transformation U"HU= D. 0 0 0 II with b. b. Now suppose that the eigenvector u belonging to 2 is a linear combination of the eigenvectors ui belonging to A. we obtain a linear relation between the ai contrary to the supposition. a. We also noticed that if the n eigenvectors of an arbitrary matrix A are linearly independent. This technique is known as Gram-Schmidt's orthogo- nalization process.b3. and 1 b b13 . Operating with A on both sides.. then the eigenvectors are linearly independent. . a..ak/w. 3. . It is easy to show that if all eigenvalues are different. they form a regular matrix S such that S-'AS = D._ 0 cannot belong to two different values 2. it is trivial that the eigenvector u :7.

0 a22 a23 . submatrices of the form 2i 1 0 0 2i 1 0 0 2i 1 0 0 2i' 2)..-'A U.2 a.. The representation (3. .1) is called Jordan's normal form. in the main diagonal.. Here we will restrict ourselves to stating an important theorem by Jordan without proof: Every matrix A can be transformed with a regular matrix S as follows: S-'AS=J. are so-called Jordan boxes. a.2 a13 . A. so that U.7.. A matrix with this property is said to be defective. 0 .1) 0 0 J. J becomes a diagonal matrix. A. TRANSFORMATION OF VECTORS AND MATRICES 75 If we also have multiple eigenvalues. If an eigenvalue is represented in a Jordan-box with at least 2 rows.. If one or more eigenvalues are multiple but in such a way that the corresponding Jordan-boxes contain just one element (that is. where J.li In one box we have the same eigenvalue 2. ./ The J. 2 . we obtain a lower number of linearly independent eigenvectors than the order of the matrix. one can find a unitary matrix U such that T . II [3].. for example. 3. all Ji are different.3 . A matrix with this property is said to be derogatory. see. a.. in particular. 2 and determine U.U-'AU is triangular. 02..7. we have another degenerate case characterized by the fact that the minimum polynomial is of lower degree than the characteristic polynomial. Both these degenerate cases can appear separately or combined. but it can ap- pear also in other boxes. Gantmacher [1] or Schreier- Sperner. For proof and closer details. that is. the situation is considerably more complicated. . For later use we will now prove a lemma due to Schur.SEC. 0 0 0 2i 1 "0i 0 2i 0 0 0 .. where A. ... has the following form: ( 2. no ones are present above the main diagonal).. If. a.7. Given an arbitrary matrix A.. 0 (3. 03 0' a. Let the eigenvalues of A be J. The occurrence of multiple eigenvalues causes degeneration of different kinds. 0 a32 a33 .

We mention here that in many applications almost diagonal matrices (band matrices) as well as almost triangular matrices (Hessenberg matrices) are quite frequent..A11-0.6. A sequence of matrices A A . con- verges toward the matrix A if every element of AN converges to the corre- sponding element of A. and vice versa... a.8. of course. essentially this corresponds to the well-known fact that in an n-dimensional space. .. and the other elements can be made arbitrarily small by choosing s sufficiently small. 3.x.. by removing the first com- ponent. Then we form B = D-'TD. are removed has.. s2. . we will show that. As is easily found.. since the product of two unitary matrices is also unitary. it is always possible to choose n mutually orthogonal unit vectors.. The most interesting case corresponds to p = 1 (tridiagonal matrtices).8. the first column of U. any matrix A can be transformed to a triangular matrix with the off-diagonal elements arbitrarily small.. Analogously.. Here B is triangular with off-diagonal elements arbitrarily small and the eigenvalues in the main diagonal.. can be chosen arbitrarily.. a. which transforms the elements a. where D is a diagonal matrix with the nonzero elements s. . s*... becomes unitary. The choice can be made in such a way that U.. with S = UD. the diagonal ele- ments are unchanged.. x. has all elements in the first row and the first column equal to zero except the corner element.0.ksk (k Z i). Limits of matrices A matrix A is said to approach the limit 0 if all elements converge to zero. ..1 rotations of this kind. The consecutive powers A" of a matrix A converge to zero if and only if all eigenvalues of A lie inside the unit circle. by use of a similarity transformation. to zero. which is 1. It is obvious that after n . 2.. then A . where x.76 MATRICES SEC. Hence b.A. We start by transforming A to triangular form: T = U-'AU. The (n . we deter- mine a unitary matrix U. must be a multiple of x where Ax. This follows directly if A can . and the corresponding eigen- vectors x.. = 2.k = t..1)-dimensional matrix obtained when the first row and the first column of A..jI -. the eigenvalues 2. In the first case elements not equal to zero are present only in the main diagonal and the p closest diagonals on both sides (p < n 1). Otherwise. a triangular matrix is formed and further that the total transformation matrix U = U1U2 is unitary. 3. A necessary and sufficient condition for convergence is that IIA .. x. 2 ..0. is obtained from x. Last.. The matrix B has been obtained from A through B = S-'AS. In the second case all elements are zero below (over) the main diagonal except in the diago- nal closest to the main diagonal (upper and lower Hessenberg matrix).5) we get directly that if 11. the elements of U. that is. This matrix U. . From (3. A simple generalization is the following..

..2 + a.. in the main diagonal.. we get. elements of the form a.1)! 0 f(2) fc'-z. we can perform the transformation to almost triangular form just described in Section 3. . + a. and consider a Jordan box 1: 2 1 0 0 (p rows and columns). where 2. . + a.D . f. p is the largest order for those Jordan boxes which belong to this eigenvalue A.a-"(2)l( /'(2)/1! . where D is diagonal with the diagonal elements A..' . LIMITS OF MATRICES 77 be transformed to diagonal form: S-1AS .. ..2N. Hence we get the condition A.. If A cannot be transformed to diagonal form..2)!' f(1) = 0 0 . 3. (P N 0 0 AN and more generally f(2) . n.A+a... 2.0 for i = 1. 2. P . Thus it is obvious that the matrix series converges if 12. Then we have AN = SDNS-1.. f(2) Hence. is the absolutely largest eigenvalue and R is the convergence radius for the series E N Alternatively. .SEC.AN is transformed in the same way. (P N 2N-P+I `1 / JN=I0 2' .1 2" converges for every eigenvalue.A2 + .22 + .A2+..a. for convergence it is necessary and sufficient that the series N EN aN (p ..A i..ti.1 < R. + a. 22.8. we can use Jordan's normal form...7: B = S-'AS. .(2)/(P . 0 0 By induction it is easy to show that AN (N) 2N-1 . If the series a0! + a. We demonstrate the technique on the series f(A)=apl+a.

. Oxford. [11) Fox: An Introduction to Numerical Linear Algebra (Clarendon Press. Englewood Cliffs. From this we see directly that the matrix series converges for such matrices A that l2. where c is a constant (depending upon v). [71 Bodewig: Matrix Calculus (Amsterdam. 1958). Care must be exercised.8. (d) that v is an eigenvector of A (the eigenvalue to be computed). New York. visa column vector with n elements. 1958). if A = B.. 1961). We give the following examples: d (AB) = A dB dA B. 11 (Berlin. From this vector we form a matrix A = vvT. Show (a) that A is symmetric. 1935). Uppsala. Show that an antisymmetric matrix of odd order is singular. New York. then u is an eigenvector of A corre- sponding to the eigenvalue 0. 1961). [8] National Bureau of Standards: "Basic Theorems in Matrix Theory" Appl. 2. [4] MacDuffee: The Theory of Matrices (Chelsea. in the theory of ordinary differential equations (see Chapter 14). [9] Varga: Matrix Iterative Analysis (Prentice-Hall. [10] Wilkinson: "Rigorous Error Bounds for Computed Eigensystems. [5] Dahlquist: Stability and Error Bounds in the Numerical Integration of Ordinary Differ- ential Equations (Dissertation.78 MATRICES SEC. 3. 1962). [6] Zurmuhl: Matrizen (Berlin. 1960). REFERENCES [1] Gantmacher: Matrizenrechnung. above all. dt dt dt dt Problems of this kind appear. (b) that A2 cA. (c) that if u is a vector such that urv = 0.(Oct. No. [2] Faddeeva: Computational Methods of Linear Algebra (translated from the Russian) (Dover. 1959). + dt dt di In particular. I (translated from the Russian) (Berlin. 1964). we get A dA-' + d A A-' = 0 and dA-' _ _ A-. 1956). Series. Derivatives and integrals of matrices whose elements are functions of real or complex variables are defined by performing the operations on the individual elements.i < R for all i. [3] Schreier-Sperner: Analytische Geometrie. we have dAA. . 57 (Washington. 4. d (A2)=AdA dt dt + dt Differentiating the identity AA-' = I." Comp. dA A-' . N. J. because the matrices need not commutate. however. Jour. 1959). Math. EXERCISES 1.

Show that det(I .2) 0. A is a matrix of order (n.BA = I cannot be valid. u.9 0. Then cos A can be written in the form aA + M.AB ± BA has the same property. U-'AU-A'_(a b"). are symmetric with respect to both diagonals.k = 0 for i < k.=( 3). where all a. Two quadratic matrices of the same order. Find lim. 3)-matrix which has the eigenvalues A. A and B are n x n-matrices. Show that (I . where we choose U=rP `\pa -pa#) P with p real. given that the corresponding eigenvectors are 2 9 44). and . Find a (3.xyT) = I . -1 4. c. 2)-matrix a bl A - c d with complex elements is given.xTy.. where I is the unit matrix. d. and 23 = -1.EXERCISES 79 3. b. for example. 5. Find the constants a and b.3 0. 10. 6. Find p and a expressed in a. 9. by proving that the elements of successive matrices A" form partial sums in convergent geometric series. Then p and a are determined from the conditions that U is unitary and that A' is an upper triangular matrix (that is. Prove that the relation AB . A is transformed with a unitary matrix U. 8. . A (2. Show that B = eA is an orthogonal matrix if A is antisymmetric.l2 = 2. The matrix A=(0.4 is given. The matrix is given. Show that the matrix C . Find B if A= 7. Both x and y are vectors with n rows. . U 2=( 5) u3=( 2 4. c' 0). n). and use the result for computing a' and d' if 105) A=(-2 1 1. A".A)-' _ I + A + A' + + A"-'. = 6. A and B.

in this order.-2)and or=(0. A is a nonquadratic matrix such that one of the matrices AA T and ATA is regular.) . Show that P. If A is normal it can be trans- formed by a unitary matrix U to a diagonal matrix B: B = U-'AU. = 0. n)-matrix P. a.0.Ar)-'A. Find U and B when a and b real. Finally. 13. such that Pa. a. 15.4. or.. -1. for r=2when or=(1. A matrix A. . n) is formed by combining the row vectors or.3. A= (0 - 1 3 1 1 1 -3 -1 0 (A matrix X with this property is called a pseudo-inverse of the matrix A. = P. are given vectors with n components (r < n) which are linearly independent. compute P.. where a a2. One wants to construct an (n. Show that a permutation matrix is orthogonal and that all eigenvalues are I in absolute value.80 MATRICES 12. 3). is called a projection matrix) and that P. has the desired property (P. =I .a2 = = Pa. A matrix A is said to be normal if A H A = AAH. 14. A quadratic matrix which has exactly one element equal to I in every row and every column and all other elements equal to zero is called a permutation matrix. (A. Then find at least one matrix X if 2 0 -1). is singular when r >_ 1.. of type (r.A. Use this fact for determining one matrix X such that AXA = A. .

turning to Alice and sighing. certainly" said Alice. "but why do you call it sad?" L. Introduction We shall now examine a linear system of m equations with n unknowns. Cramer's rule. and det A tr 0. As is well known. Last. which is obtained when the rth column in D is re- placed by y. beyond reach even of the fastest computers. x the solution vector. (4.Chapter 4 Linear systems of equations "Mine is a long and sad tale" said the Mouse. and this enormous numeri. of course. as a rule the equations cannot be satisfied. then there exists a finite solution only in exceptional cases. or more? How does one evaluate a determinant of this magnitude? Using the definition directly.1) where D. "It is a long tail. one will find the number of necessary operations to be of the order n!. If m < n. as is well known. cal task is. apart from the trivial solution x = 0. Already n = 50 would give rise to more than 10°4 operations. Let A be the coefficient matrix. we have only the trivial solution x = 0. Ify = 0.Ewls CARROLL. the system Ax = y has the solution x. After the solution (4. = D # 0 and y # 0. if det A = 0 and y = 0. Even if the 81 . where A-' is taken from Formula (3.0. Of course.0. This is true from a purely theoretical but certainly not from a numerical point of view./D. of course. In this chapter we will treat only the case m = n. In a'subsequent chapter we will discuss how "solutions" are explored which agree with the given equations as well as possible. If m > n. and. we have then exactly one solution x. In the normal case.1) had been obtained. this expression increases very rapidly with n. is the determinant. the system usually has an infinite number of solu- tions. If det A = 0 but y 0. we have y 0 and det A # 0. is identical with the formula x = A-'y. 100.1. Now suppose that det A.2). but what happens if n = 50. and y the known right- hand-side vector.=D. According to Cramer's rule. we also have an infinity of nontrivial solutions. the whole problem was con- sidered to be solved once and forever.0. Cramer's rule is satisfactory when n = 3 or n = 4. looking down with wonder at the Mouse's tail. 4.

.. Among the iterative methods. They are treated exactly like the other coefficients. and finally xn. + a1Yx..1.. . the diagonal elements c.. third. . 4.x.... As a matter of fact. determinants are evaluated in the best possible way. x.. In normal cases the method is satisfactory...: of the variable x= (this element is called the pivot element). We start by dividing the first equation by a (if a = 0. YI. nth equation. but some difficulties may arise.. and then the triangularization method. from the second. + . and then we subtract this equation multiplied by a. + a.. Next we divide the second equation by the coefficient a.. Explicitly the system has the following form: a11x. it may occur that the pivot element. are obtained by back-substitution. .. is eliminated in a similar way from the third. . One tries to get around this difficulty by suitable permutations. + .. is very small and gives rise to large errors. The new coefficient matrix is an upper triangular matrix. +Clnxn = ZI . an. c22X2 + + c.82 LINEAR SYSTEMS OF EQUATIONS SEC. 4. a31.. but so far no completely satisfactory strategy seems to have been invented.. and in this way we get an easy check throughout the compu- tation by comparing the row sums with the numbers in this extra column. .. x.. The reason is that the small coefficient usually has been formed as the difference between two almost equal numbers. + . assuming det A 0 and y : 0.1. the number of operations is proportional to n'. fourth. Among the direct methods.. In practical work it is recommended that an extra column be carried. the equations are permu- tated in a suitable way).nxn = Z= Cnnxn = Z. . The elimination method by Gauss We are now going to examine the system Ax = y. while the methods we are going to discuss need only com- putational work of the order n'... + a:nxn = Y2 a.1x. a:1x1 + a. + a. are usually equal to 1. .. When the elimination is completed. even if it is different from zero.x. the system has assumed the following form: Ctlxl + Ctxx. and then x.. = Yn .. we will give greatest emphasis to the Gauss- Seidel method and the successive over-relaxation method by Young. we will first mention the elimination method by Gauss. with modifications by Crout and Jordan.2x2 + . We will distinguish mainly between direct and iterative methods.. This procedure is continued as far as possible.... with the sum of all the coefficients in one row.x. + a. nth equation.-.

y is eliminated: -0. .36364.12727z + 10.63636z .1.0.02248. 10.37079u = 10.61818z + 0. Jordan's modification means that the elimination is performed not only in the equations below but also in the equations above. Since the numerically largest y-coefficient belongs to the fourth equation.37078.34545u 2.09091. In the example just treated.3) 3x+ y+4z+llu=2.25843u = 3.63636z .7y + 3z + 5u = 6. THE ELIMINATION METHOD BY GAUSS 83 SEC. which gives x-0.02247u = -7.0. z = .6 .1.12727z + 10.27273u = -0.1.45455. (4.0.36364.4) 3.1y+3. Now z is also eliminated. u=8.1.02247.32582.1. 3.5y .8) z .6) z .5z + 1. we finally obtain a diagonal (even unit) coefficient matrix.5u= 0. -6x+8y. The last step in the elimination is trivial and gives the same result as before. (4. In this way.7y+ 0.8y+0. 4.2. y = 4.5) .6. First.0.1. we get + 0.5.7) .1.5u=0.0.34 545u = 2. (4.3z+ 0. we permutate the second and fourth equations.z. .74156. (4.63636z .03 636u = 11. using the first equation: . and x . (4.4) is ob- tained unchanged.30909u = 0. (4.6. EXAMPLE lOx .5u .1. 10. The final solution is now easily obtained: u = 1. we eliminate x.5u = 0. After that.37079.0. the system (4.1.6.61818z + 0.3z+ 0.1. y + 0.74545z + 0. y .4u=5.7y+ 0. 1.03636u 11.02447u = -7. In the next two steps. 1.0.5. and we have the solution with- out further computation. 10.32584u = 5.3.27 273u = -0.7y + 0. + 0.27273u = -0.0.72727.3z + 0.1z+9. y + 0.8z.7.45455. 5x-9y-2z+ 4u=7.72727. y + 0.37078u .5u = 4 .72727.

z=I. EXAMPLE 2x+ y+4z= 12. Tso81a' -A 37 4 11 -1 33 0 0 1 1 which means that the system has been reduced to: x+jy+2z=6. . and hence the former method should usually be preferred. 4x+lly. the Gaussian elimination leads to an upper triangu- lar matrix where all diagonal elements are 1. 1 r1z r13 1n 0 au = 0 1 r. in three dimensions. Finally.2. 4.= a 0 0 1 In this way we get 9 equations with 9 unknowns (61-elements and 3 r-elements). 4.2. Triangularization As we have already noted. an upper triangular matrix remains to be inverted.3). We now perform the following operation: 0 0 2 1 4 12 1 j 2 6 1 4 7 -7 0 8 -3 2 20 = 0 1 2 4. Hence.3y+2z=20.z=33.3 1:1 1D1 1.2 I a23 a32 a22 a. We shall now show that the elimination can be interpreted as a multiplication of the original matrix A (augmented by the vector y) by a suitable lower triangular matrix.84 LINEAR SYSTEMS OF EQUATIONS SEC. y+2z=4. 8x. We will now compare the methods of Gauss and Jordan with respect to the number of operations. we put 111 0 0 an a12 a1.1)(n + 1) n(n + 1) 2 2 2 Thus the number of operations is essentially n3/3 for Gauss' method and n3/2 for Jordan's method. A computational scheme is given in Section (4.1)(2n + 5) n(n + 1) 6 6 2 Jordan n(n . An elementary calculation gives the following results (note that in Gauss' method the operations during the back-substitution must be included): Method Addition and Multiplication Division subtraction Gauss n(n . giving a similar matrix as the result.1)(2n + 5) n(n .1)(n + 1) n(n .

x x1. In this way the system Ax = y is resolved into two simpler systems: Lz=y. -72 X.1 1.. + 3x. 14 z.. Both these systems can easily be solved by back-substitution. as before. Conveniently.=20. + 2x. = 14. we have LR = A and obtain n2 equations in n(n + 1) unknowns. = 1. = 1 or r. 1 . we have LA = R or A = L-'R. we can choose I. 14 12 1 0 Z. Since L-' is also a lower triangular matrix. Rx=z.. we can find a factorization of A as the product of one L-matrix and one R-matrix.3 .. 2x. we can produce a particulary simple triangularization in the form A = LLT. Replacing L-' by L. 3x. 3 The former system is solved in the order z z z and the latter in the order x. respec- tively.. EXAMPLE x. TRIANGULARIZATION 85 SEC. _ -10 and x4 = 2 0 0 -24 x. 0 1. 20 z0 -72 1 2 3 x. 111 0 .18. 4.1-2x. Hence 1 0 0 z.. where. 14 1 0 1 -4 x. If A is symmetric and positive definite.2.. _ -10 3 -5 1 z...+5x.+5... 0 L= 101 190 .. The inverse of this coefficient matrix is 11 -I -1 01 -2 f 0 0 1 which multiplied with the vector gives the solution If the lower and upper triangular matrices are denoted by L and R. x. = 18 and z.+ x..

1) of multiples of the (i .. and hence zTz > 0.1.a1T)x2 -.. Crout's method In practical work the Gaussian elimination is often performed according to a modification suggested by Crout. 4.1) first equations in the latter system.1).. .3. and addition of the results: a71x1 + (a.t are determined from 1. then A -. 4..(azla13 + a71a13)x3 . some terms become purely imaginary. since xTAx = xTLLTx = ZTZ with z = LTx . + a12x2 + a13X.86 LINEAR SYSTEMS OF EQUATIONS SEC. LTx-z.z1 .1. . it may occur that Obviously.1).3..: ai1x1 + a. This method is known as the square-root method (Banachiewicz and Dwyer) and. The system Ax = y can now be solved in two steps: Lz=y.1) by az1 and a. we elimi- nate in the usual way and obtain the following system (it is assumed that no permutations have been necessary): .3. x. . then L must also be real. On the other hand.3.. The elements l. .1. But z is real. if A is positive definite. A certain equation (i) in this system is the result of the subtraction from the corresponding equation in (4.. .ZT (4. + aitainxn = a.1aT -{.. Hence the equation (i) in (4.ai2xl + .1) by a certain constant a.1. + alnxn = Z. and then the coefficients are identified..3. Starting from the system (4.1.1) is obtained through multiplication of the first and second equation in (4. .3. + a33x3 + . (4. For if L is real.2a) The second equation of (4. (4. it assumes that A is symmetric. + a2nx. a.1) is obtained by multiplication of the corresponding equation in (4.. + a71aYn)xn = a1z1 + a22Z2 . + (a91a.3. + ... . as has already been pointed out.2b) The remaining equations are formed in a similar way..n. The first equation of (4. . = a. a.. . 1 1 + 1T21 + 1127 = aIT TI 8l 1IT13T = a13+ .LLT is positive definite.1. can be determined...1) xn = Zn .1) can be written as a linear combination of the first i equations in (4.3. We can easily obtain the condition that all elements are real.. but this does not imply any special complications. respectively. and all constants al. .

pk = 10 for i<k. and hence . In this way we get P(Q + I)X = AX and hence P(Q + I) = A. not including the di- agonal elements in Q. 9`k 1 a. (4. and the matrix A with the new column y. CROUT'S METHOD 87 Especially. if i Z k. and (4. (0 for iZk..3) and (4. we get the following equations for the right-hand sides: al.1). The systems (4..3. 4.4) Pz=y. the summation will run from 1 to i.1).3) a'.3.z.5) Now we have to perform only the matrix multiplication on the left-hand side and observe that the formation of products is usually interrupted before we reach the end: either we meet a zero in the ith row of P (this happens if i < k). we can write P(Q+IIz)(AIy) (4.SEC.. = YJ a. .1). Hence (ask for iZk.3. this case is conveniently brought together with the case i > k.3. The second equation is premultiplied with P and gives P(Q+I)x=Pz=y=Ax .3) now take the form: Ax=y (Q+I)x=z. With different right members we get other solutions x.k for i<k or P + Q = A'.3. A:-1 LI airark + aik = aik I Z k. or we find a zero in the kth column (this happens if i > k). + aeez.3. In both cases we take the last term in the sum separately. it will run from 1 to k. we introduce the more compact notations P and Q for the coefficient matrices of (4.3.3. the row and the column are of equal length. + Y2 (4. If i < k.6) airzr + aiizi = yi . and from n linearly inde- pendent x-vectors we can form a nonsingular matrix X. This corresponds to a triangularization of A with all diagonal elements in the upper triangular matrix equal to 1. = Yw For convenience.zl + a. respectively. (4.=.3. i < k. If i = k.2z2 + . Augmenting the matrix Q + I with the new column z. (4.1.i_1 L r=1 airark + aiiaik = aik .

°rk i<k. i < k.3.7) _ 1 . Hence PT = DLT.3x. These equations are conveniently rewritten in the following way: k-1 °ik = 4a.5=12.+8x9=24.88 LINEAR SYSTEMS OF EQUATIONS SEC. and D is a diagonal matrix with d.x.x.3.=a.x. the rest of the first row (including z. .x..=1.9) EXAMPLE 2x. 4.8) and the components of x are computed from the bottom upward through back- substitution. 2 -6 8 24 (A. 3x. + F.5=-fo. In the special case when A is symmetric. + 4x. (y. (4. where L is a lower triangular matrix with all diagonal elements equal to 1.I = R: PR=A. The ith row in this system has the form x.=3. we have with Q -{. a. the rest of the second row (including z. y) = 5 4 -3 2 3 1 2 16 2 -3 4 12 (A'. The computation is performed in the following order: The first column (un- changed).. we have R = LT = D-'PT.+ x.-a:. z) = 5 3 19 10 -r -$ 5 if Hencex. °:.=5.z.+2x9= 16. RTPT=AT=A.-fl. = a.3..k .(a.).-6x. 5x. and finally PT = DR or ak.3.. (4. .-3. Last. *=1 °:.-' z. = 2.3+4.. and from this we get LDR = RTDLT. Since the partition is unique.x...). The . Now we put P = LD. the vector x is computed from the equation (Q+I)x=z. (4. °. the rest of the second column. = z. and soon.

k) and ysf) are less than 1... When a result is obtained with more than s digits. The subsequent analysis is due to Wilkinson [ 1 ].3) 4' = Y(3) + m48J33) .. [2].4. ERROR ESTIMATES 89 relation P(Q + 1) = A in this case has the following form: 2 0 0' 1 -3 4 2 -6 8' 5 19 0 0 1 -zj = 5 4 -31 10 4_ 0 0 3 2. if this is not the case.(3> (4. Error estimates We start from a linear system of equations Ax = y and suppose that the elements in A and y are less than 1 in absolute value. Then x3 is eliminated by muliplying the first equation in (4. Further. The first three of these equations are found also in the third system.SEC.4. we will first consider the case n = 4. these initial values will be considered to be exact. The perturbations are selected to reproduce exactly the recorded multipliers mi. In order to simplify the discussion. otherwise. the equations are scaled in a suitable way..2) by a constant m43. round-off must be introduced. this can be achieved by suitable scaling.2) + 13 44 4 Now we assume that the absolute values of all a. 3 r$ 1 1 4. Dur- ing the elimination we have four different systems of equations A(r1 x = y(11. (4) w a44 x4 = Y4 The sign = means that the equation is satisfied exactly.4. we assume that all computations are done with s digits and that a.a+s'/a33' (3) a44(4) = a44(3) + m43a34 > (4. its second equation unchanged from the second system.4.. where N is the base of the number system.IN-. where the last two equations read (3) (3) (3) a33 x3 + a34 x4 = Y3 a(3)x3 a(3)x y. The last system has its first equa- tion unchanged from the first system. The corresponding maximal error is e. and adding to the second equation. and yi are given with this precision.4. 4. We find directly: m43 = ..1) (4 a")xa + aai'x4 = Y13) . and its third equation from the third system: (" allu'X1 + a12 x. The general idea is to search for a perturbed system (A + SA)x = y + 3y whose exact solution coin- cides with the approximate solution we obtain when we solve Ax = y by use of Gaussian elimination. where r = 1 corresponds to the initial system Ax = y.4. + a13 x3 + a14(1> xi cu (1) Yl r ' a2Y)x2 + a'23'x3 + a24'x4 = y22) (4. usually N = 2 or N = 10.

We have. IE44'I S E . but what has to be reproduced is the quantity y 3) + a42) This means that in the previous perturbation matrix. We certainly have..4. the fourth equation in the second system must be constructed in such a way that the perturbed fourth equation in the third system is reproduced exactly. and y42' in order that the third system.4) 1 1 We have now reached the following stage.. and E43' are the perturbations which have to be added to a413'.1). IE43)I S E Here a4. Y4" y4" + m43J33) + E43) .4) + m43aV34 + 644 .'. for example. 4./ 4(2) + m49 y(2) + E4(2) IE(2)I a .' + m39a.4. a44) Q. a4141. a33 ' = a.1. 2 . Thus we get 0 0 0 0 0 0 0 0 0 0 (8A(2) I8y(2)) < a 0 1 1 1 1 0 1 2 2 2 In a similar way.3) . we find 0 (SA(1)1 By(e)) 5a 1 (4.' + E. partly reproduced by (4..2).4. rounded.'. The exact values are: . 1 must be added to all numbers that are not zero. 171431 < J N-1 = E . should become equivalent to the fourth system (4. of course.32'. and F.4. On the other hand. The system (Au) + SA(3))x = y(l) + Sy(j) (4.90 LINEAR SYSTEMS OF EQUATIONS sEC. 4 . -m43a33) = a43) + E43' IE43'1 = Ia33'71431 S 1)711 5 E Analogously.'. IE331I S E Similar estimates are valid for E3. Hence 0 0 0 0 0 0 0 0 0 0 (SA(3) I Sy(3)) G E 0 0 0 0 0 0 0 1 1 1 Now we consider the transition between the second and the third system.m43 (a43'/a33') + 143 . These values are. E3. and further that the rectangle of nonzero numbers should be bordered with ones..5) . Y4 . a4'. for example.4.

11721 a. However. in a double- length accumulator).m.m32&za + SZ3 -m118z1 .j = lr7. a12 However. and suppose that we get a computed solution vector For example. awl 5 17721 < s. 91 SEC. It should be observed that the absolute value of x.4. ERROR ESTIMATES on repeated elimination gives the following systems: (A(2) + 8A(2))x = Y(z) + ay(a) (A(s) + SA(3))x = y(s) + &y(s) A")x = Y") Further. We shall now discuss the solution of the final (triangular) set A")x = y(') written in full in (4. is the round-off error from the division. . the factors used during the elimination are exactly identical with those computed during the approximate reduction. m198z3 + 8z.4. to s fractional places. and replacing (4.a(a)x u ' (If the division were exact!) .2 where 17. we get (2) X2 _ }'' .1).. It is now easy to see how the system (4.m318z1 . acs)x 73 a. need not be less than 1. 4.4. and for this reason we must compute x. . a24)S4 + 172 E.a. which might mean more than s significant digits. Hence (a) (a) (2) aaa i + a73 9 + a24 (a)c41 = Y2 + za where 13z. from the second equation in the system..18z.5) should be modified to give the system A")x = y") + 8z on elimination. we shall ignore this complication here. The numerator is assumed to be computed with a negligible rounding error (for example.5) by (A(1) + 6A(1))x = y(1) + Sy(1) + by(0) we obtain 3z1 By (0) = . + 3z. we actually compute[ [ yap) . Hence .4.I < 6 . Hence it is clear that E is the exact solution of the modified system A(.)x with = y(4) + 8z I bz.

...A-1}yI + I(A + 8A)-18yl 5 IIA-111 1 a a IYI + II(A + 8A)-'II I6yl But II(A + 8A)-111 = II(A + 8A)-' . (4.6) 1 2 3 4.A-' + A-111 5 II(A + 8A)-1 ..4. In practical computations. as has also been observed in numerical examples.A-111 + IIA-111 5 1 a + IIA-111 a IIA-111 . Using the triangular inequality and equation (3. 1 2 1 3 5 (4. that we have worked throughout with max- imal errors.(n-2) (n-2) 2n-3 1 2 3 4....xI S I{(A + 8A)-1 .(n-1) (n-1) 2n-1/ with If the computations are performed with s significant digits instead of s fractional places. More detailed accounts of these and related questions can be found in the papers by Wilkinson (also consult [3]).7) (A+8A)x'=y+By . we get the following result: The solution obtained by Gaussian elimination of the system Ax = y with round-off to s fractional places is an exact solution to another system (A + 8A)x = y + 8y. with a = 11-4-18.4. Generalizing ton equations. From this we get x'-x=(A+SA)-'(Y+Sy)-A-'y = {(A + 8A)-1 .. It should be observed. Hence x and x' are exact solutions of the systems Ax=y.4. Ix' .92 LINEAR SYSTEMS OF EQUATIONS SEC. IIA-'ll 1-a' . we find.10). however.6. where 0 0 0 0. we will also discuss briefly what can be said with regard to the errors in the solution itself.A-'}y + (A + SA)-' By. Last. 0 0 1 (SAIBy) 1 1 1 2 222 1 1 . one naturally obtains slightly worse estimates.411. the errors are usually much smaller. Suppose that x is the exact solution and x' the approxi- mate solution. 4.

and w = 0. we obtain the left-hand sides equal to 32. A well-known example has been presented by T. we obtain 32. Large condition numbers.01.99. 32. and 31. SEC.1.I. S. 22. Also the constant a strongly depends on this norm.4. and w = -0. which. In general.5. 22. Putting x = xa + eo.. we have P = IIAII IIA-111. However. Putting x = 6. N = n-'N(A)N(A-') .89.xI S 11LI (aIYI + ISYI) (4.8) First we note that the norm of the inverse of A plays a decisive role for the magnitude of the errors in the solution. Section 3. z = 2. especially in the solution of linear systems of equations and in the inversion of matrices. 8x+6y+10z+ 9w=33. (4. we get Ax = Ax.1. and further 2 is the largest and fp the smallest abso- lute value of the characteristic roots of A (cf. as a rule. the numbers M. la.5. setting instead x = 1. we obtain M = 2720.18. 7x + 5y + 6z + Sw = 23. A meas- ure of the numerical difficulties associated with a certain matrix is given by the so-called condition numbers M = nM(A)M(A-'). z = 1. of course. and we are inclined to believe that we are near the exact solution. 32.. N 752.50. Taking Wilson's matrix (see below) as an example.99..2. respectively.9. 7x+5y+ 9z+lOw=31. 4.1. N.19. but in fact we are still far from the correct solution x = y = z = w = 1. y = . indicate numerical diffi- culties. since a = IIA-' 3AI 15 I IA-'I 1 I I8Al l But we can say practically nothing of IIA-'l1 without computing the inverse. y = 0. and 31. and P become large at the same time. In particular. and P = 2984. + Ago = y. is closely connected with the solution of the system.10) P = 2/fU where M(A) = max.9. if A is Hermitian. ITERATIVE METHODS 93 Hence we finally have Ix' .7. 4. Wilson: lOx+7y+ 8z+ 7w=32. .01. Systems of equations with the property that small changes in the coefficients give rise to large changes in the solution are said to be ill-conditioned.4.9.6). Iterative methods We again start from the system Ax = y and assume that we have obtained an approximate solution xo.

.5._. = c. = B(21 ._. Hence we also have xa = By .ABr"_2 or r" = ABr"-1 = (I .. the relation is correct for n = 0. The relation x = Brw_.. for example. we have r. = x.94 LINEAR SYSTEMS OF EQUATIONS SEC.AB)"+'y .. + Br"_'. = x0 + Br.AB)r"_.BAB)y (compare with the iteration formula x"+. ..AB)"y. First we shall discuss an iterative method constructed by Jacobi.1: (I .. x. To begin with. x" = x"_. Thus we get x. x"=B(y+ro+r.AB)"+'y.AB.._.Ax. Next we suppose that it is correct for n . it is. as before. x'=(I+B)c. = -D-'Cx"+D-'y= Bx"+c.Ax" = y . for n = 1. of course. we form successively x x ..+E")y.Axa. Starting with xo = 0.. If the computations are done with s decimals.Ax.axe for 1/a). We write the coefficient matrix A in the form D + C where D contains all diagonal ele- ments of A and zeros elsewhere.. Now assume that 11E11 = E < 1.AB)y = (2B . = (I . x3=(I+B+B')c. where E = I .. then IIEZII 5 I IEII 11E11 et and analogously. First we shall prove that r" = (I . x. . 4. and Aso = y . = 2x" . + Br.. Obviously. and after addition..Ax"_. The procedure just described can be considered as the first step in an iteration chain. IIE"II < E" This suffices to secure the convergence of the series I + E + E2 + to (I . and hence lim x" = B(I . and x = Br.+..+r"-1)=B(I+E+ E2+. it is in general suitable to scale ro with the factor 10' before we solve the system Aso = r. is called the residual vector. = y . The vector ro = y . Now we shall assume that we know an approximate inverse B A. is pre- multiplied with A and subtracted from y: y . by use of the formula x"+.E)-'y = B(AB)-'y = A-'y . 0 for the exact soultion.E)`'. and now we take these relations as definitions of r and x". "-w In particular we have for n = 1: x.

A2X(. Putting A. = x3 = ..Ia:. 0 A. When p -. 0 0 a a13 a1w A1 = a.+A. and solve for x.B)}-'y = (D + C)-y = A-y In general.5.. We also form p.. it is important that the matrix has a clear diagonal dominance. Then the method converges if all p. = x.=A.2. we obtain lim xw = (I B)-'c = (I . We put x. = . ITERATIVE METHODS 95 Assuming I IBI I < 1. = x.=S..... we get for p = 0. we find by induction that x(p) _ (I . 'A..1./(Ia.B)-'D-y = {D(I . must be absolutely less than 1. we conclude that convergence is obtained if the matrix A has a pronounced diagonal dominance... 4. . + (. for p = 1. a. In this way we obtain a vector xu> (X... > 0..)A1 1y. 'A.Ai'A. p.n..A1 'A. For practical purposes the following criterion can be used: We have convergence if for i= 1. Further..".) and put p = max. oo the series within parentheses converges if IIEII < 1.".. We shall now discuss the method of Gauss-Seidel. we construct a series of approximate solutions in the following way The first vector x(0' is chosen to be 0..la.. Essentially.. we have x11' = A'(y . . and so on. 1.Al 'A3 + Al 'A.1). and the whole procedure can be repeated. 0 0 0 .. a proof for this is given at the end of this section. = x = 0 and further x. The condition for convergence of Gauss-Seidel's method is that the absolute- ly largest eigenvalue of A. without giving precise conditions. In the second equation. we can formulate the Gauss-Seidel iteration method: y ..". 0 0 0 a23 .where S. = E...E + Es . . a.Ai'y) (I .m) which ought to be a better approximation than x101.1 a 0 .I>S.=E.)Aj'y and analogously for p = 2. we have x'3' = (I .I-S.) Choosing x10' = 0. = x = 0 in the first equation. a. and it converges more rapidly the smaller p is. An empirical rule is that p should be < 2 in order to produce a sufficiently fast convergence. .1)p-'EP-')AI 'y. and the final value is x = (I + E)-'Ai'y = {A1(I + E)}-'y = (A1 + A3)-'Y = A-'y. Introducing the matrices a 0 0 . Starting from the system (4.SEC... = x..A.w a%1 a. . we put x3 = x.. On the whole we can say that Gauss-Seidel's method converges twice as fast as Jacobi's.. and then we solve for x. x1" = A-'y..

w)x(") + cv{Lx("+i) + Rx'"' + c} . For example. For systems of equations with coefficient matrices of this kind.9969 0.2 2 2. Young has developed a special technique.000065 1.000 3 3. Thus all diagonal ele- ments of B are equal to zero. y.. This property is defined in the following way.988 2. D. As before we split A into two parts A = D + C. The coefficient matrices are often sparse and moreover they usually possess a special characteristic which we shall call property A. can be written in the form /D. 6x+ 3y+ 12z= 36. > Ekli laikl for all values of i..1 1. i x.00020 5 2.+i) = (I . or i e T.3y+ 2z= 20. SOR. 4}. E ' F D. 3. where B = . The equation (D + C)x = y is now rewritten in the form (I + D''C)x = D-'y or x = Bx + c. Let n be the order of the matrix and W the set {1. k c T. and we can split B into one upper and one lower triangular matrix: B = R + L.0 implies i = k or i e S. . . each followed by the corresponding column permutation. the matrix 2 -1 0 0 -1 2 -1 0 0 -1 2 -1 0 0 -1 2 has property A. M.0086 1. 3}.96 LINEAR SYSTEMS OF EQUATIONS SEC. are diagonal matrices.. Then there must exist two disjoint subsets S and T such that S U T = W and further a.D-'C and c = D-'y. Then the SOR-method is defined through x(. 4x+ lly. where m is the so-called relaxation factor. where D. We shall further assume A symmetric and a. and D.5. When 0 < w < I we have under- . This means that the matrix after suitable row-permutations. zi 1 2. 2.z= 33. 4.999871 2. D containing the diagonal ele- ments. T = {2. > 0.023 1.k :. k e S.. .5 2.99979 1. a. Successive Over-Relaxation..000048 Systems of equations with a large number of unknowns appear essentially when one wants to solve ordinary or partial differential equations by differ- ence technique. and we can choose S = { 1.99971 1. n}. EXAMPLE 8x.9965 4 2.

and Young's methods can be re- presented in the form x("+" = Mx'"' + d according to the following: Jacobi M= -D-'C = B d = D-'y = c Gauss-Seidel M = (I .wL)-1wc .+1) = (I . since S is diagonal and Csymmetric (because A was supposed to be symmetric).r)/2.. ITERATIVE METHODS 97 relaxation.L)-1R d = (I .wL)-'wc. Hence.SEC. Now B need not be symmetrical. The second conclusion which can be made is the following. We notice.B is of the following type: 0 0 0 0 # # # 0 fu 0 0 0 # # # 0 0 fc 0 0 # # # o 0 0 0 # # # I\ 2 I # # # p 0 0 # # # ' 0 p 0 # # # # # 0 0 u From well-known determinantal rules it is clear that the same number of ele- ments from E and from F must enter an arbitrary term in the determinant. for example. we have (SCS)T = STCTST = SCS..coL)-11(1 . the value of the determinant is unchanged. ±u1. we are back with Gauss-Seidel's method. B and B' have the same eigenvalues which must be real: 0.D112(D-'C)D-112 = .. .cv)I + wR}x'"1 + (I . Two conclusions can be drawn from this. If all elements of E are multiplied by a factor k # 0 and all elements of F are divided by the same factor. 4. It is then clear that Jacobi's.L)-' SOR M = (I . It is easy to see that the eigenvalues of A do not change during the permu- tations which are performed for creating B. putting for a moment S = D-112. when w > I we have over-relaxation. but B' = D112BD-112 = . where s = (n .5. . In fact.D-212CD-112 is symmetric.w)I + wR} d = (I . The convergence speed will essentially depend upon the properties of the matrix M. For the SOR-method we encounter the important ques- tion of how w should be chosen to make p(M) as small as possible. 0.wL)-'{(1 . First: the characteristic equation in the example above must be p" + a1 p° + ate + a3p2 = 0 and in general p'P1(p2) = 0. convergence being obtained only if p(M) < 1. Solving for x("+'). that [CI . If w = 1. where p(M) is the spectral radius of M. we get x(... Gauss-Seidel's. ±1121 .

1)1 .1 .i2 . In the real case only the greater solution Z. = 0 and uk* Ib. = w. det([R+AL]- \ d+w. When w. conveniently we put z = I'll and obtain z=iwft± lwxpz -w+I. the derivative goes toward .fl )/. from the left. = Aw fl + iwzua -w+1 is of interest.w)I + wR} and an eigenvalue u of B. We can now express A in terms of u. from the right.(I .1i)_0 J w As has just been discussed we move a factor 21/2 from L to R which does not change the value of the determinant.98 LINEAR SYSTEMS OF EQUATIONS SEC. Iz!2 = w . > Ekti la. w. and then we divide all elements by 2111: det([R+L]-'2 0) 1I)=0./dw < 0 for co < w while dz. we must have U = w2'/` Since b. First it is obvious that the optimal value of the relaxation-factor is w.1 . The limit between these two possibilities is determined by the equation Jtsu)t . = 2(1 +'l .wL)-'{(1 .w + 1 = 0 with the two roots: w.p2)/p2 Real solutions z are obtained if co < (o. The equa- tion det (M . < w < w we have z = iwp ± i w .wL}-'[&)R .a).oo. We now derive an important relationship between an eigenvalue 2 of the matrix M = (I . we get the result as shown in Fig./dw > 0 for w > w.1. A simple calculation shows that dz. If 121 = Iz12 is represented as a function of w.wsu that is.5. that is.5. < co < cv..(wL)2]1 = 0 . For certain values of w we get real solutions. that is. or co Z w and complex solutions z if c. 2(1 . complex solu- tions.). and when w .. When (o -. we have according to Gershgorin p' < 1. Using this figure. the derivative will approach + oo. 4. for other values. we can draw all the conclusions we need. 4.2I) = 0 can be written det {(I .(a) .kI < 1 (note that a...p2) p1 .I).. -. = 2(1 .co. But R + L = B and if p is an eigenvalue of B.

. T Ph 3 2 I I I 2 3 4 Figure 4. = 2 53/209 = 0. =3 167/209 = 0. 0 ft 0 .41/209 = .1 -q2)/p2 = 1.SEC. A correct choice ofa) may imply considerable savings in com- putation effort. It turns out that lower approximations become best for larger values of w.5. It is also easy to understand that a value co which is a little too big is far less disastrous than a value which is a little too small since the left derivative is infinite. = 1 Exact solution: . The largest positive root is p = (V3 + 1)/8 giving cy = 2(1 .4 in the fifth approximation.0.1 = 0. + x. The coefficient matrix obviously fulfills all conditions posed before. Now p is determined from the equation 1 0 4 4 0 It 0 -4 . For convergence we must have 121 = 1. 4x. defined as e = E fix. + 4x.253589 xl + 4x.04464 and 2 = w .04464.0 . is actually smallest . + x. -4 0 p that is.3p2 + Ta = 0. 16p' . but we have SOR only when 1 < cv < 2.985646. + x. = 4 206/209 = 0.e.799043 X. We shall now illustrate the method on a simple example.196172 4x. ITERATIVE METHODS 99 where It is the spectral radius of B. and there are realistic examples which indicate that the work may be reduced by a factor of 10-100 compared to Gauss-Seidel's method.5 In more complicated cases when p is not known one must resort to guesses or experiments. In the table below we compare the fifth approximation in different methods. and the error. and hence 0 < co < 2. + x. 4.

050 when we get an accuracy of exactly 6 decimals.100 LINEAR SYSTEMS OF EQUATIONS SEC. = -log 0. Ay+Bx=v.799042 0.04464) -0.985646 - The convergence speed for the iteration method x'"+" = Mx'"' + c is defined through R = . we find (B-'A + A-'B)x B-'u + A-'v. and v are real column vectors with n rows. where 2 is the spectral radius for Gauss-Seidel's matrix. = .2 log 1/58+ 1 _ 1. 4. is.1) is equivalent to {Ax-By=u. .1 for SOR. In the example above we find: R. for w = 1.6.000705 SOR (w = 1. X.253589 0.2) (B-'A + A-'B)y = B-'v . (4. the matrix of Jacobi's method.253780 0. Then the system (A + iB)(x + iy) = is + iv (4. Eliminating.253594 0.985644 0.195862 0. and the system is solved by using Gaussian elimination. .985520 0. since the coefficient matrix is the same.000018 Exact -0.798965 0. The relationship pt = (2 + (o . where p(M) is the spectral radius for M. R. Hence we have the general result that the convergence for Gauss-Seidel's method is twice as fast as for Jacobi's method. x2 x3 x4 e Jacobi -0.6. = -log X38+ 1 = 0. Complex systems of equations Suppose that A and B are real.019265 Gauss-Seidel -0.A`'u.260742 0. y.184570 0.196172 0.8 for Gauss-Seidel's method.196163 0. and p the spectral radius for B. the total number of multiplications will be approximately 7n3/3. that is.798828 0. Both these systems can be solved simultaneously as one system with two different right-hand sides. If A and Bare inverted by Jordan's method. nonsingular matrices of type (n.6. 4. Alternatively. R.6.04464 = 3.985352 0.9 for Jacobi's method.799043 0. n) and that x.log p(M).1)/(21"1(o) used for co = 1 gives 2 = fp2.

Using Crout's method. 3x . [3] National Physical Laboratory: Modern Computing Methods (London. -x + y . ACM.. Math. Conf. -3x+ 7y+ 9z+5v=11. solve the system x + 2y + 3z + 4v = 20. solve the system x+ 2y-12z+8v=27. No. 1953). Simultaneous Linear Equations and the Determination of Eigenvalues (Washington. 76. 6x . [7] Bargmann. Proc. 8. von Neumann: Solution of Linear Systems of High Order. Using Crout's method. 4x + 18y + 4u+ 14v= -2. 1961). Int.19z + 25u + 36v = 23 .: Appl. Series. [8] Wilkinson: Rounding Errors in Algebraic Processes (Her Majesty's Stationery Office. [6] Id.12y. Math. -3x . 5x + 26y . Am.8z + 3v = 49. Contributions to the Solution of Systems of Linear Equa- tions and the Determination of Eigenvalues (1954). UNESCO. 92-111 (1954)." Proc.6z+ 4u+ 8v= 8.: "Error Analysis of Direct Methods of Matrix Inversion. 1963)." Jour.EXERCISES 101 we can also write (a) (B A) (y) - which can be solved by (2n)3/3 = 8n3/3 multiplications. No. 44-53 (1959).9z+ 6u+ 3v= 3. REFERENCES [11 Wilkinson: "Rounding Errors in Algebraic Processes. EXERCISES 1. Soc. Math. 1946). 2z+ y-4z+7v= 10. Using Crout's method..2v = 4. . solve the system 2x + 10y." Trans.34z + 15u + 18v = 29. 4x+2y-8z-4v= 2. [2] Id. 39. Series. (Insti- tute for Advanced Study. 3. Princeton. Lon- don.2y + 8z + 4v = 26. 2. Montgomery. 29. 281-330 (1961). Inf. 5x + 4y + 7z .12y . [5] National Bureau of Standards: Appl. [4) Young: "Iterative Methods for Solving Partial Difference Equations of Elliptic Type.

Certain linear systems of equations can be treated conveniently by Gauss-Seidel's iterative technique. The system ax+by+c=0. with x0 = yo = 0 (six iterations.ly2+0.lxy=1.11 =0..05z2=0. x6 = 0 . x3 + 10x4 .7.. = 5 . dx+ey+f=0. 8.lxz =0.. Find the optimal relaxation factor for the system 2x. After a simple generalization.z= 1 . is equivalent to forming the symmetric system ATAx = ATy. the minimum point is called xk..) 6. y+0. Show that the method in the previous example. 10x. when applied on the system Ax = y. the minimum point is called yk. = 10 . con- stant and vary y. .y =7. -x+2y. Using an iterative method. Use this method to solve the system (5x+2y. x6 = 5 . Next we keep x = xk.3x2-0. (Accuracy desired: three decimals.9=0. can be solved by minimizing the function z = (ax + by + c)2 + (dx + ey + f)2 We start from an approximate solution (xk. 7.2. Find in this way a solution of the system x-O. six decimals).5.102 LINEAR SYSTEMS OF EQUATIONS 4. + x2 . 5. Then perform 4 iterations with w rounded to I decimal. 2x. solve the following system (three correct decimals): x1 +10x2+ x3 =10.4y2+0. x-3y..2x6 + 20x. -y + 2z = I . yk) and construct a better one by first keeping y = yk constant and varying x. z+0. + 20x3 + x. The procedure is then repeated. 2X4 .. the method can be used also on some nonlinear systems. 3x2 + 30x6 + 3x6 = 0 .

and so on.1. Jordan's method Since the coefficient matrix is the same for all systems. and little fleas have lesser fleas. (r = 1.. from all equations but the second one. . nth column of X.. is eliminated from all equations but the first one. n). This is due to the fact that although Gauss' method is faster. second. As is easily understood.1.. and so ad infinitum. 5. Then x. and greater still. and so on. I. . 104. x* the vectors formed by the first. . Introduction We suppose that A is a square nonsingular matrix of type (n.. Jordan's modification demands less memory space for storing intermediate results. The great fleas themselves in turn have greater fleas to go on. We denote by x x.0. as a rule. since det A *-0. Jordan's modification is preferred.. only n' nontrivial elements need to be stored during the whole process.. Every step in the elimination gives one column in the unit matrix to the left. 5. and at the end we have I to the left and A-1 to the right. The following formula is used for the pth reduction of the pth row: apk p a. we start by writing A and I side by side. .' vk ap-1 PP For the other rows we have ap' a? 1k = a?-1 )k . 2. 12. All these systems have unique solutions.... .. The solution can be performed by Gauss' elimination method. After p steps we have the scheme shown at the top of p. all with the same coefficient matrix: Ax. .2) ap-1 pp 103 .. but. white these again have greater still.Chapter 5 Matrix inversion Great fleas have little fleas upon their back to bite 'em. Hence the equation AX= I can be replaced by n linear systems of equations. . and analogously we define the unit vectors I1.I.. . Our problem is to determine the matrix X which satisfies the relation AX = I. x. n).ap-1 7p pk (5.

1 0 0 x1.2. al. is the pivot element for the next step. .. the inverse is of the same form as the original matrix.. . ap+1...5 2 -0.. .72 0..2. a...2.16 -4.. I 1 . Put.02 0 0 25 54 20 0 1 0-0 0...w aP+.1 aPP.0 1 0 95 .1 .. the matrix A is written as the product of two triangular matrices denoted by L and R... printed in boldface type.. 1 ..t.04 -0.. aww awl aw..68 1 1 0 0 -186 129 196 -. aP. EXAMPLE 50 107 36 A = 25 54 20 31 66 21 50 107 36 1 0 0 1 2. 0 (p+ 1) 0 0 0 I aP+... 0 -0 1 0 1u 13.... aPt1..tl. The element aPt1.14 0 .. (n) (1) (2) ... a. (p) (p+ 1)...5 1 0 31 66 21 0 0 1 0 -0.. 0 a1. Triangularization As in Section 4..... app 0 .32 -0..34 . 1 aPP.. and after inversion we get A` = R-'L-1. 0 (P) 0 0 .Ptl . 5.P+1 . 0 a% P+1 .P+1 a....x ail ai' . for example..62 0 1 1 0 -7.n as. 0 (2) 0 1 . (1) (2)..P 0 .. 0 a. Hence A = LR.. 137 X31 X12 X33 0 0 1 . Inversion of a triangular matrix can be performed directly by use of the definition.28 0 -0 1 4 -1 2 0 0 0 0.1 1 0 xu X.(n) (1) 1 0 . After n steps we are through and have obtained the inverse A-'. a..66 -100 0 0 1 . 0 (n) 0 0 ...24 17 25 5..104 MATRIX INVERSION SEC... aP. ..84 2..(P) (p+ . a.. 0 0 1 0 0 1..96 0. only the n2 elements within the frame need to be stored during the computation.P 0 .

.2 10 21.y. the partition is not unique..68 0.2. r0 y r1.1 + 13. We easily find 0.8 3.2 As mentioned before.4 7. rr.1y. 0.68 0. r.3x = 1 .1 -0. Yl.1Y11 = 1 . the diagonal elements in one of the matrices are at our disposal.1 = 0 1. 0 0 y. 1.. L=( 2. L'' = -0. 0 0 5 -4. 1. 105 SEC. 121x11 + 1»x =0.5 1 0 .66 . = 0 .2 R= 0 0. we choose the diagonal elements of L equal to 1...x31 = 0 A nal ogousl y.100 .5 1 01 . r11y1.2 0 0 R-1 = 0 2 -20 . 5. 0. 1 0 0 r. r.68 .x2. = 0 1 0 0 0 r. instead.1 -4. The method described here is sometimes called the unsymmetrical Choleski method.5 2 0 0 0. r. TRIANGULARIZATION Then the unknowns can bedeterminedsuccessively from the following equations: 111x11 =I 1.2 0. + r. + 13.1 0 r. EXAMPLE 50 107 36 5 0 0 A = 25 54 20 .186 129 196' R-1L-1 = 95 .y = 1 .4 5 and . Y11 Y12 y.. we obtain 1 0 0 L = 0..5 0 31 66 21 3. + r12Y23 + r13y = 0. 10 0 1 gives the following system: 1 .28 39.x.62 -0...1x11 + 132x. If. -24 17 25 in accordance with earlier results.

5 0 0 0 0. The remaining matrix can be split into a product of a diagonal matrix D and an upper triangular matrix R. The inverse is obtained from A' _ (LT)-'L-' = (L-')TL-' .02 0 0 1 0 0)1 A-' = 0 1 -4 0 2 0 -0. (5.68 1 / 186 129 196 -I 95 . 5.3.96 0. Hence 50 0 0 D= 0 0. EXAMPLE 1 2 6 1 0 0 A= 2 5 I5 . some elements of L might become imaginary. but this will cause no extra trouble.66 . Choleski's method Now we assume that the matrix A is real symmetric and positive definite.2 1 0 . L= 2 1 0 . where Lisa lower triangular matrix. In our special example we get 1 -2. with all diagonal elements equal to 1.5 1 0I 0 0 1 0 0 25 -0. 6 15 46 6 3 1 1 0 0 5 -2 0 L-' = .1) and we have to perform only one inversion of a triangular matrix and one matrix multiplication. Then we can write A = LLT . -24 17 25 SincedetL=detR= 1. (L-')' L-' _.3.14 -7. .84 0. the inverse becomes A-' = R-'D-'L-1.04 and (1 2.106 MATRIX INVERSION SEC. the partition A = LDR is now unique.100 .2 10 -3 0 -3 1 0 -3 1 * In some special cases such a partition does not exist.72 R = 0 1 4 0 0 1 As is easily understood. we also 5. * As pointed out before.14 0.3.

4.) X. and X. (5.X. + .-' = 9'4 .x. from (a).1.1. 5. .1. Next we get from (a): X.' X.A.x = 0 . of type (n. 5. we obtain X.4. gives (a . where one extra row has been added at the bottom and one extra column to the right. 9 5 16 94 0 -188 (9 /4 A.'AI)X = -A. + A.4. gives (a .b) A3X.+aX.'A.' row vectors. (5. which is also called the square-root method. _ . 16).c) A.d) From (b) we get X. n) is known. The escalator method This method makes use of the fact that if the inverse of a matrix A. A. A.X.. IX.A-. A. are column vectors. Choleski's method. and X. seems to be quite advantageous. In this way we can increase the dimension number successively. A.-' is assumed to be known. Al 'A. and the inverse A-' has been computed. Hence x and then also X. then the inverse of the matrix A. The following equa- tions are obtained: A.X.= 8 -1 13). EXAMPLE 13 14 6 8 -1 13 13 14 6' A= 6 7 3 A.A3A. \6 1 7 3.' is known. can easily be obtained. can be determined. inserted in (c). Further.X'') .54 -3 121 . which.4. (5. (5.1. + ax =I .)x = 1.4.' and hence X. and a and x ordinary numbers.4.a) A. = -62 7 125 %2) A. = A1-'(I . = 1.4. 5. = (9. Finally. We put A =(A1 A2" A' I a and A' =(X.T I x where A. a= 11.A. which.SEC. inserted in (d). THE ESCALATOR METHOD 107 When the computations are done by hand. =0.

CA-'B)-' . AY+ BV = 0. X''=(-416. V = (D . X. and we obtain AX+BZ=I.108 MATRIX INVERSION SEC. for example. we find 0 x = -94.913).CA-'B)-' and CX + DZ = 0.A-'B(D . not necessarily of the same order. the two expressions for X and Y are.= -1 . Put A DB) (C V)=(0 I) where A and D are quadratic matrices.BD`'C)-' . Complex matrices It is easy to compute the inverse of the matrix A + iB if at least one of the matrices A and B is nonsingular.5.BD-'C)-1 . Z = .D-'C(A . If A and B are both regular. 5. Y = -A`'B(A + BA-1B)-1 . 65 1 0 -2 X. Y = . and if B-' exists: X = B-'A(AB-'A + B)-' .97. for inverting matrices of such a large order that all elements cannot be accommodated in the memory of a computer at one time. X = (A . Using the scheme just described. if A-' exists: X = (A + BA-'B)-' . .= -5 1 11 287 -67 -630 and finally 1 0 -2 0 -5 1 11 -1 287 -67 -630 65 -416 97 913 -94 The escalator method can be interpreted as a special case of a more general partition method which can be used. of course. CY+DV=I.(AB-'A + B)-' .5. We put (A + iB)(X + iY) = I and obtain. Y = . 5.

Put Ca= land C. for example.E) = B(21 . and if f(r) = 0 for all r.A)-'.I = E.2.(I . where r is a real number.. and we want to compute (I . we obtain B*+1 = B*(2I . ITERATIVE METHODS 109 identical. = B.21 E' = (00. as before. Obviously.454) while A-'= TT A I9 T .A)-' when n oo. (AB-'A + B)-' = {(AB-'A + B)(A-'B)}-' = (A + BA-1B)-1.ir)(A + iB). = I . Iterative methods A matrix A with IIAII < 1 is given. we get (A + iB)-' = (1 . In practical computation a strong improvement is usually obtained already for n = 1.1 A .181 = 10. 1=I+AC*.1 00.01 00. F becomes regular. B(I . f(r) is a polynomial of degree n.SEC.6.}.3 . we would also get f(i) = 0 against our assumption that A + iB is regular.E + E`) = . we get A-' = B(I + E)-' = B(I .E + E' .AB). We can then proceed as follows. we can write A-' . The condition for convergence is IIEII < 1. + (-1)Z*-'E="-'). we can easily prove that B. With AB* = I + E*.E. B-'A(AB-1A + B)-1 = (A-'B)-' . Forming AB .0.AB*) in complete analogy to Formula (2.1. where E = E. 5. Consider the matrices F = A + rB and G = B .6). . it is clear that C*=I+A As we have shown previously. or putting B. we find AB. put det F = det (A + rB) = f(r). = B. by use of induction. for example.4) (0 J \\ -0. since F is regular. and hence convergence is quadratic. From F + iG = (I .0. EXAMPLE (00.273 B(1 .(3 -1)' B= -0.1 91 . C. Then there must exist a number r such that. In order to prove this.6. 5. converges to (I .rA.4)' E (00. Moreover. ) if the series converges. Here (F + iG)-' can be computed as before. we have. Somewhat more complicated is the case when A and B are both singular but A + iB is regular. In this case.E + E' .02) = ' (00. = AB . Now suppose instead that we want to compute A-' and that we know an approximate inverse B.ir)(F + iG)-'.

and use the result for computation of A-' = R-'D-'L-1. 8. A is an (n. 7. 2 6 10 11 19 4 11 19 38 3. Find the inverse of the matrix 1 A= I i I* i) + .5 (i -2 7.110 MATRIX INVERSION EXERCISES 1. n)-matrix with all elements equal to 1. Factorize the matrix 2 4 8 A . 2. Find a number a such that I + aA becomes the inverse of ! .A. \* + b by using Jordan's method. Using Choleski's method. Find the inverse of the matrix 3 7 8 15 2 5 6 A.25 7.-4 18 -16 -6 2 -20) in the form LDR. find the inverse of the matrix 1 2 0.5 27 6. Compute the condition numbers M and N for the matrix 1 2 3 4 6 12 20 A_ 2 24 120 6 60 24 120 360 840 . 5.5 1 A_ 2 5 0 -2 05 0 2. Find the inverse of Pascal's matrix 1 1 1 1 1' 1 2 3 4 5 A= 1 3 6 10 15 1 4 10 20 35 1 5 15 35 70. Find the inverse of the matrix I 7 10 3 A. 2 0 19 12 27 17 8 4 5 2 6 3 4.

ck.. Ibikl<e2.. 111 EXERCISES 9. The inverse of a matrix A is known: A-' = B.kI > a. We have the estimates. Compute the inverse of _ 5+ i 4+2i A-(10-F3i 8+6i) 10.. -5 1 11 -1 and 8 -1 13 9 287 -67 -630 65 6 7 3 2 ( -416 97 913 -94 ) 9 5 16 II where the element 913 is changed to 913. Eik is a matrix with all elements zero except the element in the ith row and kth column which is 1.. = l fdk. Consider a band matrix A with diagonal elements a a2..aikl<E1.. Find the inverse of A + oEik. Use this method on the matrix 1 1 0 0 1 2 1 0 A= 0 1 3 1 0 0 0 1 4 I 0 0 0 1 13.01. Using E = AB . 0 0 0 and d. dk.. Find the inverse of A4 3 -I 10 2 _ 5 1 20 3 9 7 39 4 l -2 2 1 by repeated use of the escalator method. Then use the result on the matrices 1 0 -2 0 a= 13 14 6 4 A . 11. 14. .. 0 and D= 0 0 d5.E2. = ak. a and the elements next to the main diagonal equal to 1(a = 1). where 1 0 0. (we suppose that dk # 0). A and Baregiven matrices of type (n. . n).. 12.I < E. Show that we can write A = LDLT. Find an error estimate for thepth approximation when leikJ < c and Ibikl < a. A is a given matrix and B an approximate inverse (0th approximation).. we construct a series of improved approximations.. . 0 Id. where a is a small number (a2 and higher terms can be neglected). = a. Form C = A + B and D = AB and show that Ici.. A matrix with the property aik = 0 for Ii .. is called a band matrix. 0 L= 0 c. 1. where a is an integer > 1...I. + E2: IdikI < ns. 0 0 .

Si_l. C. Compute the second approximation of A"' and estimate the error according to Exercise 14.. + a. 16. . (i # r) and D. C2.. The matrix B is obtained from A by replacing AT by aT.k .. D. ..-17 8 -72 -22 -31 42 54 45 77 -33 -16 -63 and -19 -2 12 23 B = 10_3 -3 21 34 19 12 -8 2 -4 ' -25 -11 -3 2 where B is an approximate inverse of A. is not equal to zero and compute the columns D D2.. The is x is nonsingular matrix A is given.max (i. ... The rows are denoted by AT. . Show that B is nonsingular if 2 = aTC. A is an is x is matrix with elements aik = 25. if r#s. ..k. =C.C.k= n+l 17. Let 18 -12 23 -42 A..8i}1. in the inverse B-' by trying D. while the columns of A-' are denoted by C1. where S"-{0 1 if r=s. Show that A-' = B. A? . = a.. k)[n + 1 .k .C. where min (i. k)] b.. AT.112 MATRIX INVERSION 15.

and further. 6. in some cases we have recursive techniques which differ somewhat in principle from the other methods. as a rule. Theoretically. and transformation matrix. the following questions should be answered: (a) Are both eigenvalues and eigenvectors asked for. Thus a complex matrix can be trans- formed to a normal matrix following Eberlein. The table on p. As a rule. and better methods must be applied. In practical computation. Further it turns out that practically all methods depend on transforming the initial matrix one way or other without affecting the eigenvalues. or are eigenvalues alone sufficient? (b) Are only the absolutely largest eigenvalue(s) of interest? (c) Does the matrix have special properties (real symmetric. a particularly simple technique can be used. in some cases the computation is performed in two steps. this is indicated by a horizontal line. if only the largest eigenvalue is wanted. the transformation matrix is built up successively. Incidentally. the problem has been reduced to finding the roots of an algebraic equation and to solving n linear homogeneous systems of equations. It is obvious that such a compact table can give only a superficial picture. this method is unsuitable. and if so.0. methods which are important in principle will be treated carefully 113 . and so on)? If the eigenvectors are not needed less memory space is necessary. Except for a few special cases a direct method for computation of the eigenvalues from the equation det (2I . Further. while a normal matrix can be diagonalized following Goldstine-Horwitz. It is not possible to give here a complete description of all these methods because of the great number of special cases which often give rise to difficulties.Chapter 6 Algebraic eigenvalue problems Das also war des Pudels Kern! GOETHE. type of transfor- mation. both these procedures can be performed simultaneously giving a unified method as a result. but the resulting matrix need not have any simple proper- ties. When there is a choice between different methods. 114 presents a survey of the most important methods giving initial matrix.A) = 0 is never used. moreover. Hermitian. However. Introduction Determination of eigenvalues and eigenvectors of matrices is one of the most important problems of numerical analysis.

symmetric Tridiagonalization Orthogonal (rotations) Givens Real. Wilkinson. originator Real Iteration and deflation Power method Real. Wielandt . Greenstadt Complex Reduction to Hessenberg form Unitary (rotations) Givens Complex Reduction to Hessenberg form Unitary (reflections) Householder Complex Reduction to normal matrix Patricia Eberlein Tridiagonal Sturm sequence and interpolation Givens. Wielandt Hessenberg Recursive evaluation and interpolation Hyman. Initial matrix Technique Transformation matrix Method. symmetric Tridiagonalization Orthogonal (reflections) Householder Real Tridiagonalization Lanczos Real Tridiagonalization La Budde -- Real Triangularization Ruthishauser (LR) Hessenberg Triangularization Unitary Francis (QR) Complex Triangularization Unitary Lotkin. symmetric Diagonalization Orthogonal Jacobi Hermitian Diagonalization Unitary Generalized Jacobi Normal Diagonalization Unitary Goldstine-Horwitz Real.

In the other two cases a recursive technique is easily estab- lished which will work without difficulties in nondegenerate cases.1.. To a certain amount we shall discuss the determination of eigenvectors.1. 1 . 3.+c. + C. Triangularization. (6. (6. r = 1. 6.( )Pv. 1 )9 } .1) Then we have Av = c.1. (civ1 + c3 v..2) (L. 2. _ 2. the eigenvector of I. + c..l. Wilkinson's technique which tries to avoid a dangerous error accumulation. Now we let A operate repeatedly on a vector v.SEC.. Also Wielandt's method. THE POWER METHOD 115 and in other cases at least the main features will be discussed. which we express as a linear combination of the cigenvectors V = C1V1 + C2 V2 -i.......n. On the whole we can distinguish four principal groups with respect to the kind of transfor- mation used initially: 1. will be treated. The power method We assume that the eigenvalues of A are 2. aiming at an improved determination of approximate eigenvectors.. + c.4v..V "v) P_m (APv).1... 2.1(Z")9 V. For large values of p. The rate of convergence is determined by the quotient 2. convergence is faster the .. 4./2. Almost diagonalization (tridiagonalization). + C... + ... Almost triangularization (reduction to Hessenberg form).1... APD 1C1v1 + C. + . = Jim (. + . .2.Av. Diagonalization.Av. .1. The determination of the eigenvectors is trivial in the first case and almost trivial in the third case.. the vector CA+c. where 1211 > 1221 Z >_ 12.. 1 i V. + (6.+. 6.3) where the index r signifies the rth component in the corresponding vector.v that is. will converge toward c. v and through iteration we obtain \P 2. for example. The eigenvalue is ob- tained as .

0667 . Given a vector yk.3000 2.V. 0.. Yk+i _ zk+llak+t . We suppose that all v. and z = 0.). = 2 30 If the matrix A is Hermitian and all eigenvalues are different. and zk+i zk+1 = AYk .1.3007 y.0999 y = 0.. we form two other vectors.v. = 0. + CS lava + . 6.3276 Y.I The initial vector y. _ C'V' f Esc= + . For numerical purposes the algorithm just described can be formulated in the following way. = 7 and v.0661 1 1 0. the eigenvectors.1(zk+.1 is. as shown before./2. we get 9 2. (6. should be chosen in a convenient way. ( )9 V. often one tries a vector with all components equal to 1. Yk+.1.116 ALGEBRAIC EIGENVALUE PROBLEMS SEC.. + S.. are orthogonal.4) ak+l = max.4668 1 7. smaller Iz. Let x be the vector obtained after p iterations: X = e. EXAMPLE 1 -3 2 A= 4 4 -1 6 3 5 Starting from 1 Yo = 1 1 we find that 0.0001 After round-off. are normalized: H S Vi rok = Sik .0597 . = 0. + C.

+ .25987._.1. 30.SEC.. A has an eigenvalue 2. c.2. Hence the first row of A.28868.)..da + 6.q)-' is an eigenvalue of (A .12 + + When p increases.1.). respectively.12 + . tend to zero. Further.v. and with x = const AP(r. We will now discuss how the absolutely next largest eigenvalue can be calcu- lated if we know the largest eigenvalue 2. since (2 . is supposed to be normalized in such a way that the first component is 1. be an eigenvalue and the corresponding eigenvector with the first component of x. compared with the correct value 30.. + Further. Then . and 3.287. we can produce the two outermost eigenvalues. we know that 2-' is an eigenvalue of A-' and analogously that (2 . (x. is zero.j= + ls. x"x EXAMPLE With 110 7 8 7 1 _ 7 6 5 A. 6. (6.1.68892) 1 0.q. 2. THE POWER METHOD 117 Then we have Ax = c. and x.v. Now let 2. (6. Using this principle.5) . x"x = 1c.288662. 8 5 6 10 9 and xo 1 I 7 5 9 10 1 we obtain for p = 1. we get Rayleigh's quotient xNx . for example. 2. Let aT be the first row vector of A and form A.95761 (0. all e. The power method can easily be modified in such a way that certain other eigenvalues can also be computed. we can concentrate on that.=A-x./(x.94379 The quotients of the individual vector components give much slower conver- gence.aT.. and the corresponding eigenvector x.qI has an eigenvalue 2 .. and 30.6) Here x. equal to 1. = 30. = 29. The corresponding eigenvector is 0. then A .2.qI)-1.q)-' becomes large as soon as 2 is close to q.IE. If. for example.). If we know that an eigenvalue is close to q. + 62v and x"Ax = 2.75.

n . (note that the first component of x. which is obtained when the first row and first column of A are removed. we get a vector z. 6.49 . _ . (-32 66). from the relation x.x. = 2. as well as of x. . _ 2. = Without difficulty we find 0 0 0 A. . Since x.+cz. is an eigenvalue and x. 104 67 -147 . the first column of A. = 6 and the corresponding eigenvector 2 -1 or normalized. Then we obtain x. -\-15 ' and we find the eigenvalues 22 =.2. is 1).1)-matrix. = aTx. can be repeated.x.32 66 -23 -15 31 Now we need consider only B. When 2. We determine an eigenvector of this matrix.118 ALGEBRAIC EIGENVALUE PROBLEMS SEC. + caTz. and x.x2) since aTx. is irrelevant.l.)/aTz. and hence c = (2.= x. which are also eigenvalues of . Multiplying with aT we find aTx. EXAMPLE The matrix -306 -198 426 A .. -176 -114 244 has an eigenvalue . . x. . has the first component equal to 0. and in fact we need consider only the (n .2 and 2. = 1. and aTx. which is called deflation.1. Thus .x.x9) = A(x. is an eigenvector of A. . have been determined.) . we have A.aT (x. and by adding a zero as first component.1.(x.l.x. . .x2) = 21(x. the process.

33984 -0. 6.21XIX.288686 . = 0 and A. (6. Now suppose that x. we get c = and 1 3 x.x.99979 A.62044 -0.. or -5 2 and all eigenvalues and eigenvectors are known.?. = Ax.91234 -0. Then we can again use the power method on the matrix A. EXAMPLE 10 7 8 7 0. .x. and form A.77228 0..x. . x.91234 0. = / \1 and aTz = 30.x. we find 1\ A x.* and 1 6 X3= .7.x.l.83654 -0..SEC. = 1. _ .29097 1.+cz= . = -0.1.A.'.528561 7 5 6 5 0.35714 -0. x. = A .380255 A = 21 = 30.=x. Hence c = . we have x. has the same eigenvalues and eigen- vectors as A except A which has been replaced by zero. 2 is f11) `s and hence 1 0 x.1.33984 0.551959 7 5 9 10 0.29097 -1.x. = 0 when 2. THE POWER METHOD 119 the original matrix A.'x2 = .x'x.'x. 8 6 10 9 0.7) It is easily understood that the matrix A. +c (15 Since aTz = -48. we have A..83654 -1. The two-dimensional eigenvector belonging to 2. = 1.i.= or 4 4 With A. 520 93 3 1. If A is Hermitian.78053 .35714 0.53804 0.x and so on. In fact.x. = Ax.99979 0. . # A.

v..P.p.cost ((m + 1)q + a)] = .-... = cesB and the initial component of x. is the eigenvector belonging to 2e `lp.+. and 3. 6.8) . (6. In Chapter 3 we proved that for a real symmetric matrix A.10) .2A2-+2[cos (mp + a) cos ((m + 2)q + a) . a certain component of is denoted by p.P2.1 with real 2.2. 6.-2-. then 2 . It is also clear that if x.9) _. e`r. Then p. e -iv must also be an eigenvalue.. where c.x. 2AP.P.546.120 ALGEBRAIC EIGENVALUE PROBLEMS SEC. if q = it.Then we form A'^x. Then we easily find lim A2P.P. Among the off-diagonal elements .+1 sine q .. (6. = c2'"¢(e^"P+:e+:O + e c. = 3.Pw. all eigenvalues are real.+j In particular.+ = 4c2. Hence p.1. corresponding to p is e'O. Jacobi's method In many applications we meet the problem of diagonalizing real. Now we form p. and that there exists a real orthogonal matrix 0 such that O-'AO is diagonal.4c'.+o = cos (P .2..:.P.858057. .). If the numerically largest eigenvalue of a real matrix A is complex.1.-. Now suppose that we use the power method with a real starting vector x: x = c.a+.+. with m so large that the contributions from all the other eigenvectors can be neglected. then we have the simpler formula lim P. if the numerically largest eigenvalues are of the form ±.+2 = A2 .1. . This problem is particularly important in quantum mechanics. Further..85774 compared with the correct value 3. + 4 4 + .P:. is the eigen- vector belonging to 2e`% then x. With the starting vector we find the following values for Rayleigh's quotient: 2. symmetric matrices.+1 = 22.+2 . Hence Jim P. where we have put 0 + ib = a.1. We shall now try to produce the desired orthogonal matrix as a product of very special orthogonal matrices. 2 . = 2c2'" cos (mg) + a) . + P. 3. (6..8516. that is.

k and or = -1 if a. dkk = a. + dkk = aii + akk and diidkk = a series of such two-dimensional rotations.. since the angle 29 must belong to the first quadrant if tan 2q' > 0 and to the fourth quadrant if tan 29 < 0..2.akk) sin p cos q + a. that is. Hence we have for the angle p: j arctan (2a. i).1) D = O-'AO _ \-sin ° cos 9/ a.i.akk) + 4a.2a.(=aik). 6.k < 0 where the value of the arctan-function is chosen between . R = V (a. and in order to get as small rotations as possible we claim . i). we obtain: sin 29 = 2Qaik/R . We shall show that with the notation P. OIO..k/(a..ir/4 Sop S it/4. = PI-'AP. cos' q + 2aik sin q) cos 9 + as. . JACOBI'S METHOD 121 SEC. Putting 1 if aii Z akk .* akk .. We put (cos q -sin Pl in p cos P and get cos c sin ql (a.k sin q) cos 9 + akk cos= rp .i <akk . Each time we choose such values i and k that JaikI = max..ir/2 and n/2. sinz 4' . sin= 9 . ak.kl(a . . we choose the numerically largest element: jaikI = max. for increasing r will approach a diagonal .i aik cos q -sin 9l (6.2.k = dk. dik = dk. and (k. 2r/4 when a. = ..k(cos! p . > 0 if a. (k.sin= 9) .(a.n/4 when a.akk)) if a. .aR) .. This equation gives 4 different values of p. aik. tan 2p = 2a. (i.akk)/R. I cos 29 = a(aii . The elements a. 0. = 0.. the transformation matrices have the form given above in the elements (i.k form a (2. k). . and a. k) and are identical with the unit matrix elsewhere.2. = akk . (6. 2)-submatrix which can easily be transformed to diagonal form. After a few simple calculations we get finally: dkk = 4(aii + akk .akk).k akk/ \sin p cos gyp/ ' dii = ai. Now choose the angle i such that d.2) (Note that d. the matrix B.

be the kth column vector of 0 and Jk the kth diagonal element of D. sin a.. 1. we find for r # i and r # k relations of the form a:.2. ak. sin a + ak. of course. = a..k = N2(A) . The orthogonal transformation affects only the ith row and column and the kth row and column. that is. + a.. r(A') < r(A) exp (. rt(A') = r2 (A) .k . We put rt(A) = E. . r(A) has decreased with at least the factor exp (. performing a rotation as soon as la:kl > E. and if the process has been brought sufficiently far. matrix D with the eigenvalues of A along the main diagonal. a. In a slightly different modification. n(n .-. = a. ). Since a.. Here e is a prescribed tolerance which. cos tp ..2a. Let x. and hence a?. Thus rt(A) will be changed only through the cancellation of the elements a.2. and for a sufficiently large N we come arbitrarily close to the diagonal matrix containing the eigenvalues.1) / \ n(n .1) and 2 rt(A') 5 r'(A) (1 ..1)).P.1} Hence we get the final estimate.Jl < s.. exp (.a. every circle defined in this way contains exactly one eigenvalue.N/n(n . + at. If Ekx: ackl is denoted by E. since we have O-'AO = D. AO = OD. has to be changed each time the whole matrix has been passed. was the absolutely largest of all n(n . 2 < rz(A) .k and ak. This modification seems to be more powerful than the preceding one.1) off-diagonal elements. Thus it is easy to see when sufficient accuracy has been attained and the procedure can be discontinued. and.3) \ n(n After N iterations.. 122 ALGEBRAIC EIGENVALUE PROBLEMS SEC. Z rl(A) n(n .. as before. B = O-'AO. Ekx: a.. cos a + ak. we know from Gershgorin's theorem that la. Taking only off-diagonal elements into account. we have as.". = . . that is. (6. for some value of i. The method was first suggested by Jacobi. we go through the matrix row by row. Then it is obvious that we get the eigenvectors as the corresponding columns of 0 = lim. Then we have Axk = 2kxk . . 6.. It has proved very efficient for diagonalization of real symmetric matrices on automatic computers. The convergence of the method has been examined by von Neumann and Goldstine in the following way. 1)) .

000656 0.001570 -0.28868533.ic) b +ic d J . while the remaining elements are equal to 0 to 8 decimals accuracy. JACOBI'S METHOD 123 EXAMPLE 10 7 8 7 A_ 7 5 6 5 . After the second rotation we have 2.352'+ 1462'.000396 0. 0.1002 + 1 = 0.543214 5 10.010150 0.288685 0. 6.001723 _ 0 0.84310715. a two-dimensional unitary matrix has the form U cos 9' .SEC. 10.10) = oo and p = 45°.707107 A' . _ 7 5 11/> -1/1/2 = 151V T 11/1T 19 0 -1/1T -1/1/2 0 1 Here we take i = 1.589611 -0. which are very important in modern physics.349806 26.85805745. we have 10 7 15/1/2 -1/1/2 A.01015005. Generalization to Hermitian matrices. 0.19) and 33°. and 0.390331 1 11 and after 10 rotations we have 3.99999999 and the product 1. k = 3.sin q eie cos 9 / A two-dimensional Hermitian matrix H=( a b . 30. we obtain.3903 -0. is quite natural.2. and obtain tan 2q' = 151/2/(10 .021719 -0.978281 1.001723 0.589611 _ 1.707107 -0. 5051. 8 6 10 9 7 5 9 10 Choosing i = 3. Apart from trivial factors.00000015 in good agreement with the exact characteristic equation: 2'.843108 After 17 rotations the diagonal elements are 3.sin e'i° . The sum of the diagonal elements is 35.000656 -0. k = 4. tan 29 = 18/(10 .000026 0.000396 30.349806 -0.543214 0 -0. After the first rotation.001570 0.858056 0 -0.000026 A30 -0. to a given Hermitian matrix H we can find a unitary matrix U such that U-'HU becomes a diagonal matrix. As has been proved before.

it/2 and .it/2 if c < 0 .2.d)t + ct). As usual the value of the arctan-function must be chosen between . .7r/2 < 0 < r/2 and hence sin 0 = ca/r . Since b cos 0 + c sin 8 = ar the remaining equation has the solution sin 2q> = 2arr/R. where d = a cost qq + d Sint q) + 2(b cos 0 + c sin 8) sin p cos p .2(b cos 8 + c sin 0) cos 2rp = 0 .ic) cost.it/4 if b < 0 .4) (a . (6. d = a sin' ' + d cost p .124 ALGEBRAIC EIGENVALUE PROBLEMS SEC.*.ir/4 < < 7r/4 in order to get as small a rotation as possible which implies for a>d.2.1/(a . a-d=0: it/4 if bZ0. Now we want to choose ip according to . 6. first by cos 8 and sin 8. we get bsin8 . In principle we obtain 0 from the first equation and then p can be solved from the second. where a= when bZ0. .e-'6 .d) sin 2q' . it cos 2p = r(a . cos p d = d. {-1 when b < 0.5) a-d*0jarctan(2ar/(a-d)).d)t + 4rt = (a -+4(b2 .p . with r = ± 1 and R .pe tiB + (b .ccos8 = 0. The following explicit solution is now obtained (note that b and c cannot both be equal to 0 because then H would already be diagonal): b 0: 0 =arctan (c/b) . Putting d = 0. r= {-1 for a<d.a) sin p cos q. and r= bt + ct . b=0: 0= rr/2 if c>0. we separate the real and imaginary parts and then multiply the resulting equations. and finally add them together. Rather arbitrarily we demand . (6.d)/R .2.(b + ic) sint. then by -sin 0 and cos 0. is transformed to diagonal form by U-'HU = D.2(b cos 0 + c sin 0) sin q. = (d . cos 0 = ba/r . Using well-known trigonometric formulas.

and it is shown that it forms a Sturm sequence... a. 6.. . If c = 0 we get ©= 0 and recover the result in Jacobi's method. The elements a. In practice. and a define a two-dimensional subspace.. the quantity q defining the orthogonal matrix 0 . is a new unitary matrix approaching U when k is increased. (i.(cos g. the eigen- vectors are computed. However. The reduction of an arbi- trary complex matrix to normal form can be accomplished through a method given by Patricia Eberlein [ 101. We have a.. the last member of which is the characteristic polynomial.3. and we start by performing a rotation in this subspace.3. we can directly state how many roots larger than the inserted value x the characteristic equation has. k). cos g. GIVENS' METHOD 125 7r/2. we introduce the elements of our two- dimensional matrix. Finally we mention that a normal matrix (defined through AHA = AA") can always be diagonalized with a unitary matrix. as in Jacobi's method. The process can be performed following a technique suggested by Goldstine and Horwitz [8) which is similar to the method just described for Hermitian matrices. In the places (i. where the unitary matrices differ from the unit matrix only in four places.. 6. The element d can n ow be written J(a+ d)+ J(a-d)cos2p+o'rsin2p and consequently: 6. is now determined from the condition a. = 0 and not. This procedure can be used repeatedly on larger Hermitian matrices. (k. By testing for a number of suitable values x. This rotation affects all elements in the second and third rows and in the second and third columns. = The next rotation is performed in the (2.SEC. In Givens' method we can distinguish among three different phases. giving as result a band matrix with unchanged characteristic equation. The orthogonal transformations are performed in the following order. = a. 1).. we can obtain all the roots. The first phase is concerned with J(n .. i). and (k...1)(n .6) J(a+d-TR). by a. a32. Givens' method Again we assume that the matrix A is real and symmetric. During the third phase. U. -sin al `sing. U .. The product of the special matrices U. = a34 0. In the second phase a sequence of functions fi(x) is generated. = . k).a sin p + a cos a = 0 and tan g.2. 4)-plane with the new . both these processes are performed simultaneously. With the aid of the sign changes in this sequence.2) orthogonal transformations.

. Thus the problem has been reduced to the computation of eigenvalues and eigenvectors of the band matrix B. 0 B has been obtained from A by a series of orthogonal transformations.. Now we form the following sequence of functions: f.sin4' + a.3. cos p = 0 and tan p = a.. that is..... a...fin-.(2) = I and 8.1) .f.2) with f.'.3. and they are all set to zero by rotations in the planes (3. O.. a. . that is. . we make the elements a.. By now the procedure should be clear.0 0 0 B = 0 R9 a9 (6.. 4). if u is an eigenvector of A and v an eigen- vector of B (both with the same eigenvalue). with s = J(n .._. 5)-. determined from a.B.(2) = 2 . and it should be particularly observed that the element a. a. During the first of these rota- tions. the elements in the third and fourth rows and in the third and fourth columns are changed. . n).3. In this special case we have p = 1. and we must examine what happens to the elements a'.sin9. R.. ... (2. equal to zero by rotations in the (2.. = 02A109 = (0.4(0..)-'A(O. we get a. We find aia = a. = 0 if ji .21) could be split into two determinants of lower order]..2).a. such a matrix that at." a' sin p + a. . a.-. 5). = a'..) = 0-1A0. and it is easily understood that we finally obtain a band matrix. We find at once that f. We can suppose that all 0 [otherwise det (B .-'A. (6. O.0.. Now we pass to the elements a. 6. .)-1.0.. A. Now we put a. then we have a = Ov. B = A. _ -a.9cosp + a'.a which can be interpreted as the determinant of the (1. Further. 1)-element in the matrix 21 . 2)- element was changed during the preceding rotation.cosp = 0.. . (3. = (0.O.(2) . In the same way./a.-9(2) . = 0 is not affected. (3. a9 0 R9 .] Now all elements in the second and fourth rows and in the second and fourth columns are changed. = O... tan p = [Note that the (1. A./.126 ALGEBRAIC EIGENVALUE PROBLEMS SEC.(2) = (2 . . = 0. = O1 'A01 .. 8. = 0. and a... . which were made equal to zero earlier..O. n)-planes...).ki > p. In Chapter 3 it was proved that A and B have the same eigenvalues and further that.09 . = 0..1)(n .)f:-.

oo. Suppose.oo) > 0.(p) and f+. the number of sign changes does not change when we pass through a root of f(. ff (P). . 6.oo < P.. In general. and clearly this is also true in an interval p .-. GIVENS' METHOD 127 Analogously.oo) = n and V(oo) = 0.(2) it follows for 2 = p that f.-. 1 < i < n. (2 . +00 sign 1f3(2)] .aj+1)f.a. f.) .n.(+ 00) > 0. + We see at once that the equation f.(p) < 0. < + oo For i = 3 we will use a method which can easily be generalized to an induction proof.2): ff(2)=(2-a3)(A-a)(2-a. < a. a..._ and + oo. with... for example. that f. By induction.3. the situation is different.(2) 0 in the closed interval a < 2 < b.s < 2 < p + e. i = 1. which is the (1. Suppose that a and b are two such real numbers that f. .3. . Hence f(A) = 0 has i real roots. + .(A) .R._ then f(2) = (A .a.<+oo. has a root pin the interval.p:-. .)(2 . it is an easy matter to prove that f. < a. Then obviously V(a) = V(b).1) = 0 if i < n.(2) = 0 the roots p p .) . f.P2) .<p3<a2<.B..++ Hence.< a.) Now it suffices to examine the sign of f3(2) in a few suitable points: 2 -00 a...) < 0.(2) has the roots or. where -oo<p.R:-1(A . When i .SEC.. We are now going to study the number of sign changes V(p) in the sequence fo(p). p. 2..a.)(1 .+ .) .l) = 0 has three real roots p p and p..(2) has different signs in two arbitrary consecutive points.<P. and f.a.oo < a..-.. if f i-.(p) Hence f _.. we have f. . 2)-minor of At .(2) is the charac- teristic polynomial.l . then we may have the following combination of signs: f f+l f-1 .a.+.. we find that f. ..81.a. a.aYA .(a. f (P) It is evident that V(. Then we write f.pl)(2 . n. separated by the roots off -.-. .)-Rz(A-a.<pf-.(2) (A .. .. From f+. however. (.<a...) and obtain from (6.(A) _ ( .-2<a.(p) have different signs. p.) .. By successively putting A = .(2) = (2 . p p . such that . and a.(. for example.f:_. Next we shall examine the roots of the equation f(A) = 0..a. For i = 1 we have the only root I = a. < ps < a. For i = 2 we observe that f (. Hence we have two real roots a._.)(2 . First we examine what happens if the equation f (A) = 0. a . < P3 < + oo.(A) = 0.

I equations used for elimination but gives an error 3 when inserted into the ith equation. H. . Let us assume that the ith equation is excluded. we infer that the number of sign changes decreases by one unit each time a root p..+6. that n is odd. + e) = 0.. computation of the eigenvectors.. The described technique makes it possible to compute all eigenvalues in a given interval ("telescope method"). Since this is a homogeneous system in n variables. f. f.. .+ . For the third phase.. but. .. + Q. for example.V(p. be an exact eigenvalue of B.-1 f. Hence V(o. that the result to a large extent depends on which equation was excluded from the beginning. where we find the following scheme: 2 . . . E.e) . .I.e) . + e) = 1.(A) .E) . + A + E . we have 2 fo f. . Proceeding in the same way through all the roots p. A .V(p. - and hence V(p. the remaining equation must then be automatically satisfied.. or. a.-2 p..2. Let 2.+E + + +. thus round-off errors achieve a dominant influence.I) = 0..3. . we can say that the serious errors which appear on an unsuitable choice of equation to be excluded depend on numerical compensations. + +. + e) = 1.3) The sequence f.x. Essentially.s + . . f. 6. Hence we have proved that if q(A) is the number of eigenvalues of the matrix B which are larger than 2. In practi- cal work it turns out... - p. where we have ff-= f.. Denoting the roots of 0 by p P ...-.. ..-. Suppose.(2) is called a Sturm sequence. Now we let 2 increase until it reaches the neighborhood of or. - Then we see that V(p. f. this does not affect V) until we reach the neighborhood of p.3. . and since det (B . even for quite well-behaved matrices. as shown before. we can obtain a nontrivial solution by choosing n . p and the roots of 0 by or. or._.128 ALGEBRAIC EIGENVALUE PROBLEMS SEC. Wilkinson in [2].V(v. is passed. f.1 equations and determine the components of x (apart from a constant factor). then V(P) _ g. we shall follow J. Thus we search for a vector x such that Bx = . may appear. while the others are solved by elimination.. (6. Then we let I increase again (now a sign change off. The solution (supposed to be exact) satisfies the n .

. The resulting system is written: P11xl + P13x3 + P13x8 Pbx3 + P23x3 + pux4 (6. + /9ixi 0 (6. which we had at our disposal.2!)x = b. .v.5) we get x = E c. I 2. deliberately..3. .5) where e.3..8) Under the assumption that c.). most of the coefficients p13.. e.5) be replaced by (B . .. This system is solved by Gaussian elimination. p21. 3q :f 0. is a column vector with the ith component equal to I and the others equal to 0. is almost orthogonal to v.. However. (6. .) Since constant factors may be omitted. .l = 2. we have solved the system Qi-ix. this vector ej can be expressed as a linear combination. GIVENS' METHOD 129 Actually. where it should be observed that the equations are permutated properly to make the pivot element as large as possible.3. _ c.2I)-1v. . If the eigenvectors of .c=11 v1 e Lc.. # 0. . .7) Now let . have been obtained from the b. that is.3. It seems to be a reasonable choice to take all c.SEC.3. (6. + e. _ E C:. + j9/x{}1 = fJ 1 pp .. v..3. v.. 6. and under such circumstances it is clear that the vector x in (6.. Wilkinson suggests that (6.4) Ni-1xi-1 + (a. the vector e..8) cannot be a good approximation of v.0 (apart from trivial factors). and we obtain x = .2)x.2)x.3. (6.3.. (We had to use an approximation 2 instead of the exact eigenvalue 2.3.. as s .(B . our solution x approaches v. is of the same order of magnitude as a (that is..B are v v2. we could as well choose the constants c. this system can be written in a simpler way: (B . + (CIj .6) j=1 and from (6.9) where we have the vector b at our disposal.-21-E v.3. are zero.2!)x = e.10) As a rule. it may well happen that c. (6. (6.3.-. Since the c.

130 ALGEBRAIC EIGENVALUE PROBLEMS SEC.2) It is evident that P is symmetric. (6. P is also orthogonal. no eigenvector should then be disregarded. we have PrP = (I . Thus we choose c= :te. symmetric matrices.. and the mapping P means a reflection in a plane perpendicular to w.2WWT)X = x . We shall essentially follow the presentation given by Wilkinson [4].4. will be de- noted by P.4. also. with the general structure PEI-2wwT. . Householder's method This method.4 . The matrix P acting as an operator can be given a simple geometric inter- pretation. Figure 6.1) Here w is a column vector such that wTw = 1 . equal to 1.11) The system is solved.4.2(wrx)w . by back-substitution. Further. The orthogonal matrices..4wwT + 4wwTwwT = I . In Fig. has been designed for real. The distance from the endpoint of x to L is jxj cos (x. w) = WTx.3. 6.4.=1 (6. Let P operate on a vector x from the left: Px = (I . Even on rather pathological matrices. as usual. This is done by orthogonal transformations representing reflections.4 the line L is perpendicular to the unit vector w in a plane defined by w and x. 6.2wwT) = I . (6. that is. The first step consists of reducing the given matrix A to a band matrix.2wwr)(I . the vector x is normalized. 6. good results have been ob- tained by Givens' method. and last.

.2x. c.. + x.x. . ... by (6.x we find that the first row of AP. = 1 .2p. HOUSEHOLDER'S METHOD 131 SEC. 0. . The final result will become a band matrix as in Givens' method..4.2xx.4. (6.) . or w. + X3.r equations for the n . b. . 1 .2x. We carry through one step in the computation in an example: a.-2px.4. Further..x. -. b. x. 4).x.x. c. + d.4. Ob- viously. d. .x..2x.=0. (1. .'PIX4 Now we claim that c1-2Rx3=0.41. Since we are performing an orthogonal transformation. 3). X x. w. + X. x. which must be reduced to zero by transformation with P. d.3) r = 2.4) d. .x. x. (6. we get zeros in the positions (1. d. and further we have the condition that the sum of the squares must be 1. Now put A = A.x.x3 . this gives n . . and form successively A. At the first transformation. the (1. The transformation PAP. contains n .1.r elements in the row (r .. . The matrix A. c. X. n) and in the corresponding places in the first column. d.x). . must now produce zeros instead of c.. d._. (1. Putting p.1). c. the sum of the squares . 6. Those vectors w which will be used are constructed with the first (r ...+I + . the matrix P.2x.2x. = P. Since in the first row of P.2w. 2x. d.. . _ (0.w''. has the following form: 1 0 0 0 P_ 0 1 .1) components zero.2x.. 3. d. . 0. for example.A. 0 . 0 . Xr. can become zero only if the corresponding element is zero already in AP. only the first element is not zero.._.".2) we have Xt + X.2PIX2 . .. . b. = b. 3)-element of PAP.. = I . n . 1 -.r ± 1 elements x x. + c. . With this choice we form P.P. has the following elements: a. b. A= c3 d3 C.2x. . . and d. . .+ . = (0.

/2Sx2. .5. Hence we obtain for x.2p.4).4. and from (6. an eigenvector x of the band matrix has to be multiplied by the matrix P P .)112. is large. and further x.4. x and y. of the elements in a row is invariant.+b. which are biorthogonal. = :-d. and (6. Hence p1 = +Sx. and x4: = c1 sign b.4. b1/S). + x:) = ±Sx. one for S and one for x.4.0. x1.)2=a. . .x . x. .x.2p. (6.5. we have x7yk . Thus the quantities ought to be defined as follows: = 1 b sign b1) X 2 (l + 1 S / (6. 6. that is. . .7) The sign for this square root is irrelevant and we choose plus.4. The sum of the first three terms is p.. = ±S . 2Sx. y . sign b1 (6. two square roots have to be evaluated. Lanczos' method The reduction of real symmetric matrices to tridiagonal form can be accom- plished through methods devised by Givens and Householder. in the denominator.+c.. + xg + x.... 6.4. Since we have x. + c. In the general case. this should be done by iteration: xw-1 = P's-IX.5) Multiplying (6. and x.x..(xi + x. x .4. + d. + c. The end result is a band matrix whose eigenvalues and eigenvectors are com- puted exactly as in Givens' method.6) Inserting this into (6.x..4.. and x we get b1x.=P. In this method two systems of vectors are constructed.5) by x.2+di Putting S = (b. x` = d. = +c1/2Sx. we obtain b1 . + d. we obtain the best accuracy if x.4) by x. = J(1 -t.8) X3 2Sx.4. For arbitrary matrices a similar reduction can be performed by a technique suggested by Lanczos. (6.x.2p. for j # k. = 1. we find that x.5). and hence a'. (6. + (b1 .132 ALGEBRAIC EIGENVALUE PROBLEMS SEC. y. In order to get an eigenvector v of A.9) Vx. This is accomplished by choosing a suitable sign for the square- root extraction for S..

NA"Yj)* _ {xk (Yi+1 =1 b.-l... 0 0 J= . x are considered as columns in a matrix X and if further a tridiagonal matrix J is formed from the coefficients ak_l. 0 0 ....(bk-l. 2.1Arryk bjk = xjNyj Let us now consider the numerator in the expression for a. 0 .. Analogously x....(ak-1.2: Yi Axk = (x.. The new vectors are formed according to the rules xk+1 = Axk .kY. .ajky"xj If 0.. In this way the fol- lowing simpler formulas are obtained: xk+1 = Axk .. k.-1 + bkkYk) If the vectors x1. = Yi'Axk . a.'Axk Yj x. 1 a.-1.k = 0 for j f.kxk-1 + akkxk) Yk+1 = A"yk . 0 0 0 .k and a.SEC. .. 0 0 0 1 a a.5.. 0 0 1 a a.. = 0 under the same condition. k bjky. because of the biorthogonality. and for j = 1.k .. .. . x . x are linearly independent J = X-1AX. Hence we have a. with one's in the remaining diagonal: a..2. s=1 a. 0 0 0 . we get ajk ..)}* 0 . 0 then we can simply write AX = XJ... and provided the vectors x.Y.. a..kyHx. and similarly we also have b.....LJ ajkxj r k j=1 k Yk+1 = A"Yk ._. LANCZOS'S METHOD 133 The initial vectors x1 and yl can be chosen arbitrarily though in such a way that xf'y1 # 0... x . we form: k 0 = Yi'xk+i = Yj'Axk . =1 The coefficients are determined from the biorthogonality condition.jy. 6.. when j < k ..

# 0. real nonsymmetric matrices fall naturally into this group).Al).L(L-'A(LT)-' .(det L)2 det (C .A.. Then one chooses just one sequence of vectors which must form an orthogonal system..7. The simplest way out is to choose other initial vectors even if it is sometimes possible to get around the difficulties by modifying the formulas themselves. 6. further. y.R. Even the case in which complex eigenvalues are present can be treated without serious complications. Then we can split B according to B == LLT.-. where the method is described by its inventor. 6. with 1. but it can also happen that x.6. Certain complications may arise. If similar matrices are formed from the vectors y y .0 even if x. the problem has been reduced to the usual type treated before. we shall first discuss a triangularization method suggested by Lotkin [6] and Greenstadt [7]..AB = A .2I)LT . bkkl we get K = Y-'A BY. = R.. 6.. If the eigenvalues are real. in the same way as A and so on. Lanczos' method can be used also with real symmetric or Hermi- tian matrices. and instead we refer to the original paper [11]. Rutishauser. and from the coefficients b.. a detailed discussion of the degenerate cases is given by Causey and Gregory [9].. which in general converges toward an upper triangular matrix. Here we also mention still one method for tridiagonalization of arbitrary real matrices. . they will appear in the main diagonal. Obviously.134 ALGEBRAIC EIGENVALUE PROBLEMS SEC.L. has the same eigenvalues as A. Since CT = C. Lanczos' paper [ 1 ] should be consulted.. where A and B are symmetric and. . B is positive definite. Here we shall also examine the more general eigenvalue problem.. Hence A . . for example. . # 0 and y. = R. where C = L-'A(L-')T.ALLT . Space limitations prevent us from a closer discussion. where Lisa lower triangular matrix. det(A-2B)-0.7..or y-vector may become zero. = 1. the new matrix A. particularly concerning the deter- mination of the eigenvectors.. Complex matrices For computing eigenvalues and eigenvectors of arbitrary complex matrices (also. For closer details. Then we treat A.AB) . first given by La Budde.Rl 1. Closer details are given in [5]. obtaining a sequence of matrices A A A. Other methods Among other interesting methods we mention the LR-method. Starting from a matrix A = A we split it into two triangular matrices A. that some x."y. . and then we form A. = L. H. Since A. and det (A .

we get p2 + Igll = 1. We start by examining the two-dimensional case and put A = (a b) . d' =a+(d-a)p2-p(bq+cq*). 6.4) q=ap. It is possible to give examples for which the method does not converge (the sum of the squares of the absolute values of the subdiagonal elements is not monotonically decreasing. Clearly we have a' + d' . Hence a is obtained directly from the elements a. a2. Claiming c' = 0. V = bp2 .2. In practical computation one tries to find U as a product of essentially two-dimensional unitary matrices. and this can be done by the formula VA+iB= ±(1/l/T)( C+A+isignB --A). with p = cos 0 and q = sin 0 e"'.7.cq*2 + (d . we find with q = ap. and d. n). we get p and q from p = (I + 1«12)-112 > (6.7.7. we get Jal = tan 0.SEC.7). we must take the square root of a complex number. A' b'\ = c' d'J and obtain a'=d+(a-d)p2+p(bq+cq*). When a has been determined. b. using a procedure similar to that described for Hermitian matrices in Section 6.a)pq* . cf. where (a. Now we pass to the main problem and assume that A is an arbitrary complex matrix (n. U = (p -q*) (p real) . [ 13]).1) \ / q p From U"U = Y.7.7. c. Normally. (6.3) Here we conveniently choose the sign that makes Ial as small as possible. Further. where C = 1/A2 + B2. We choose that element below the main diagonal which is largest .2) c'=cp2+(d-a)pq-bq2. where T is a (lower or upper) triangular matrix (see Section 3.a + d. we suppose that A' = U-1AU. (6. but in practice convergence is obtained in many cases. COMPLEX MATRICES 135 The method depends on the lemma by Schur stating that for each square matrix A there exists a unitary matrix U such that U-'AU = T. (6.daa= b b and a= I (d-a±l/(d-a)2+4bc).

0 0 0 A= 0 0 . First we put ykk = 1 and then we use (6. . we must reserve two memory places for each element (even if A is real). Y23 y.. as a rule... r = 1... and obviously TY = YA... This procedure is repeated until i>k lai. in absolute value.7.- T= 0 0 t23 t3.. 0 0 0 t. The equations become fairly complicated but can be solved numerically and. 0 . An alter- native technique can be constructed aiming at minimization of rE by choosing 0 and p conveniently. 1. 1 y3. The quantities yik. which.7. and further we form a diagonal matrix A with the diagonal elements trr...2. and we shall restrict ourselves to the case when all eigenvalues are different.12 is less than a given tolerance.1.7) When the method is used on a computer. The condition is t11 y + t = from which we can determine y (we suppose that t # We proceed in the same way and collect all eigenvectors to a triangular matrix Y..... does not guarantee that r2 = Ei>k Iaiklz decreases. (k . . and perform an essentially two-dimensional unitary trans- formation which makes this element zero.7.. Denoting the triangular matrix by T. where U U . k > i..2.6) for i = k . t.tit (6. are the in- dividual essentially two-dimensional unitary matrices.. Last.1).Yrk yik = r=i+1 tkk . Next we see that we can determine such a value y that the vector with first com- ponenty and second component 1 becomes an eigenvector. Only in special cases are all results real. . however.. . 2.. U. We see directly that the vector whose first component is 1 and whose other components are 0 is an eigenvector belonging to the first eigenvalue t... n. 2.. the minimum point need not be established .. the eigenvalues aretrr.5) t 0t . . . and in this way the eigenvector yk belonging to the eigenvalue tkk can be determined.r= 1. 1 (6. The method described here depends on annihilation of a subdiagonal element.136 ALGEBRAIC EIGENVALUE PROBLEMS SEC..6) valid for i = 1. we obtain the eigenvectors xk of the original matrix A from xk = UYk (6. Clearly. we have U-1A U = T with U = U1U2 UN... we start with the triangular matrix. .. are computed recursively from the relation k tir.7.. In order to compute the eigenvectors.7... 6. k .n. Then we have t11 t12 t13 ' t1w 0 t22 t93 .

-q*) and U = (p Ck..1. Then we get Ck. to A where C. In the following we shall choose to work with the upper Hessenberg form. Introduce the notations Ci. An arbitrary complex matrix can quite easily be reduced to Hessenberg form which will now be demonstrated._. 6.2)/2 steps. After that we rotate in the (3. = 0. we shall describe one step in the reduction leading from A. k)-plane under the condition c'k _.i-1 + PCk..SEC..r r-1 n-r . q p with p = cos 0. B. . [ 13}). In the real case we have b = d = 0 giving a = 0 and tan 0 = c/a.bsin0cosp Squaring and adding we get (c2 + d2) cos2 0 = (a2 + b2) sin2 0. 6.i > 1. We shall also briefly show how the reduction can be made by use of reflec- tions following Householder (also cf. 4)-plane with c. and a trivial elimination also gives the angle a: tan 0 = V(C2 + d2)/(a2 + b2) I tan a = (ad .t-1 = 0 and splitting into real and imaginary parts: ccos0 = asin0cosg . Hyman's method A matrix is said to be of upper Hessenberg form if atk = 0 for i ._.1)(n . Starting from an arbitrary complex matrix C = (ctk) we perform a two-dimen- sional rotation in the (i. and cs. HYMAN'S METHOD 137 to a very high degree of accuracy. but for real matrices we can use real (orthogonal) transformations which is a great advantage. .8.. The first rotation occurs in the (2.k > 1 and of lower Hessenberg form if atk = 0 for k ._. {dcos0 = a sin©sina .i-1 = .. i + 2. n). and so on.t-I = a + bi . 3)-plane with the condition c31 = 0. In this way all elements in the first column except c. next in the (2. q = sin ©et+' making U unitary.8. 4)-plane with c.bc)/(ac + bd) . Essentially the reduction goes as in Givens' method demanding (n . k = i + 1. = 0 (i = 2. and so on.qC.bsinesinq . = 0. }n .. n . Putting A = A0._. . The value of the method is still difficult to estimate as it has not yet been tried sufficiently in practice.. In the general case the transformations are unitary. are annihilated. .) jr I\ 0 I b. 3.i-1 = c + di .

_. will then be produced through A._. and a.r n-r with Q.. w.2w._. must be annihilated except for the first com- ponent. b. I (we suppose Euclidean vector norm). = P. Further b._.(2w" br-.)"'r a. that is. is a vector with the first component = I and all other components = 0. =arg (e.b._. and *1 H. since Q. leaving H. = b. = Q"a.jw. _ (I .-1I and A. If this procedure is repeated we finally reach the matrix which is of upper Hessenberg form. Then form the . = 0 0 0 Hence Q. In this way a new Hessenberg matrix H. is completely determined. Thus the argument of a.P. ._._. we get W. . . -. .I1is real._.8.A.._. Now we choose 0 = Pr \ 0 Q.138 ALGEBRAIC EIGENVALUE PROBLEMS SEC._.wh)b._. b. = a. arg a. The matrix A. = b._. as well as the null-matrix of dimension (n .b._. Finally. = I .._._.-.B.._.'Q.! a. is equal to the argument of the top element of the vector b.).l = 1b. a._.1w.-1Qr ) l with a. After having produced a Hessenberg matrix A = U-1CUwith the same eigen- values as C we now turn to the computation of these. Ia. / in -. A simple computation gives a.IE)a and since 1 ._._. b._.2w.'_b.H'-' 0 I ar -1 Q . 6.b. = 1. being a column vector with (n .w. Ib.Q.A. _ (I . P. Let x be a vector with unknown components and 2 an arbitrary (complex) number. Here e. and e.1) unchanged while the vector b.r) elements.'a._. _ *1 is of upper Hessenberg form.r) x (r . of dimension (r + 1) x (r + 1) is formed by moving one new row and one new column to H. = b.a._.' and w"w. _ eiQ.

" cf. is obtained as a second-degree polynomial in . L-lk=i+1 aikxk ai. .uk=i+1 a ikzk Zn=0. Then we can solve x_1 as a first-degree polynomial from the last equation. we get as result the characteristic polynomial f(A) - det (2I . .aikYk Yi-1 = Yw=0.x.xn-1 + 27x. a21x1 + a22x2 + . + 2xn ..a. The method described here is such that the values obtained are exact eigen- values of another matrix differing only slightly from A ("reverse error com- putation. Then Newton-Raphson's method can be used (possibly a variant independent . this means that the eigen- value problem per se is ill-conditioned and we can expect numerical difficulties irrespective of the method chosen..2) + a. p. . Y. From the next to last equation.. It is now possible to compute the eigenvalues from these results by interpolation.i-1 ( [rmix Zi-I = (A . ai. and as the system is homogeneous we can choose._1 # 0. HYMAN'S METHOD 139 following linear system of equations: allxl + a12x2 + . 6.. first linear and then quadratic. It is quite possible that the errors in the eigenvalues may become large.I in A. x are inserted into the expression (A - a11x2 . x . = )x. + 2yj .. is determined from the second equation as a polynomial of degree n .A) apart from a constant factor. .Lk*=i+. + a2wx.-.. Section 4.8.w-lx.)Z. If all these values of xl. 1. We suppose that all elements a.. and so on. and finally x.8. 147). . for example.. _. .SEC. .a. 1 = 2.:-1 V (2-ar)yi+xi...) and putting p. = Ax2 > (6. . The interpolation can be improved by computing the first and possibly also the second derivative of the characteristic polynomial by iteration... Denoting the characteristic polynomial by f(A) (=As + .-1 aw. f'(A) = p oo . 3.i-. + alnx. n. we have X. ai.l..4. [18]. and Wilkinson..aii)xi . The follow- ing formulas are used: (A . xa = 1.

.2.i+l L. .374). + .i+. xn-I = P..k = Pi. of multiplicity): x xoYo or Yo Yo .i-iPi-i.3 (correct value 5.-iairP+k (6..xoZo xample 6 3 -4 2 A_ 2 1 5 -3 0 3 7 1 0 0 2 5 Starting with 20 = 5. 1) . .k2n-k .= 0).w-12 + X.....-1...T ..k+I .ainpxr ai. xi = Pii2"-i + Pi..o= 1.. . we find k ai.a T2 Hence 2.E. + Po.xi+l Pik2 -ai. 6. = X.4) (1 = 1. Pi+l.8. R"2°-' PR22w-2 + .n.8. we get xi-i = (2 .+.aii)xi . Pi.5.140 ALGEBRAIC EIGENVALUE PROBLEMS SEC. x which is also extended by an extra variable associated with the first equation of the system: (xo = Pool" + Poi2r-. + Pi. Using Ax = 2x. we compute xi and yi: i xi Yi 4 1 0 3 2 0 -i 1 H -2 0 .i+12n-i-l + . n.. Let us now once again write down expressions as polynomials in 2 for the set x x .i-7 Comparing powers 24-k on both sides.. . = 5 + . + = + A. a..ai. k=i..

6.A0)-. P11 P12 Pl3 P14 0 0 P23 P23 Pa.a32a13)-1. 0 0 0 0 1 0 0 0 0 p4.8. 0 Pit P12 P13 P14 0 0 a32 a33 a34 0 0 P22 P23 P24 0 0 0 a13 a.8. u1 = sa E a.v. = soa1(21 .SEC.(A .2 where 2.v3 and we have the same effect as when the power method is applied. that is.2) cannot be recommended since instability will occur. A=(i El/ \s3 1 .. = so E a. which leads to (A . Successive iterations are now computed from (A . The characteristic equation is now Pool" + Por2"-1 +. A0)_'v.(A. . However. . of course.5) 0 0 0 V44 0 0 0 0 0 1 Taking determinants on both sides we find in particular p. this technique. following Gauss or Crout.a a13 a. for n = 4 we get 1 a. . u0 = Z a. for example. = 0 . as a rule is unsuitable for stability reasons except possibly if the computation of coefficients can be performed within the rational field.8.lo be an approximate eigenvalue (2.A0I)ut+1 = sin. which can easily be generalized to arbitrary n. a. 0 0 0 pu pp 0 .20I)u1 = sa E a.: Poo Pot P02 Paa Poo 0 all a23 aa3 aa. will not help. + Po. Now suppose. This method of generating the characteristic equation and then the eigenvalues by some standard technique. for example. Further we suppose that the exact eigenvectors are v v .v.. = s. v". for example. . . is a suitable scale factor compensating for the fact that A . (6.. Let . -. Consider. HYMAN'S METHOD 141 This relation can be written in matrix form. The solution of the system (A . in principle inverse interpolation. an approximate eigenvector. where s. Computation of the eigenvectors from (6. = (a2. P..t)_'v. 0 0 0 p. is performed by standard means.I j is almost singular which will cause no trouble otherwise. is an exact eigenvalue) and u. If the eigenvector problem is ill-conditioned in itself.u. 20. it is possible to use an idea by Wielandt.

1. 6.. qr . The following theorem is essential in this connection: Every regular square matrix A can be written as A = QR where Q is unitary and R upper triangular. are known... k . Suppose that A is regular and put QR=A. The eigenvectors and on the other hand are ill-conditioned since small errors in a and e.e. is uniquely determined. with the well-conditioned eigenvalues 1 ± l/s. hence r = 1a11 (apart from a trivial factor eiei)... = q.q. But since Q is unitary. = a. we also have gt{gk = 3. and both are uniquely determined. The technique is com- pletely general but becomes too complicated for arbitrary matrices. we multiply from the left with q(i and obtain r. we find k E r. The method has the advantage that the computations can be performed in the real or complex domain depending on the special circumstances. Thus. then also q... . for j = 1. because otherwise we would have linear dependence between the vectors a1. q. 6.L rikgi i-1 The right-hand side is certainly not equal to 0. The column vectors of Q and A are denoted by qi and ai. Further we also have k--+1 rkkgk = ak . Hence we find k-1 rkk = lak -: {=1 rikgil .9.. can give rise to very large errors in the eigenvectors.142 ALGEBRAIC EIGENVALUE PROBLEMS SEC. 2. Multiplying all rows of Q by the kth column of R. If we now multiply all the rows in Q by the first column of R we get (since only the first element in the R-column is not equal to 0) r q. = ak - i=1 Next.-. The existence of such a partition can be proved by a method resembling Gram-Schmidt's orthogonalization. The QR-method by Francis The main principle of the QR-method is to produce a (for example upper) tri- angular matrix by the aid of unitary transformations. Now we assume that the vectors q1. For this reason the matrix is usually first transformed to Hessenberg form as described in the preceding section. Hyman's method will produce eigenvalues and eigenvectors to general complex matrices provided they are first transformed to Hessenberg form.9.. ak.

= A = XD'Y._.U.which gives P.+ = Q. and in principle they could be computed from A.P. = Ai-'Q1R.U.. = P. Q. is formed from A. Then we get P... We now assert that the matrix A. = All. = Q1Q8 . This means that first A.R. Q. = The decisive point is that as is easily shown. .R. where Q. = P.-.R. is partitioned in a product of a unitary and an upper triangular matrix.Q. . = PA. Obviously.Q. = A. by use of the following algorithm: A. A.P. P.. and so we obtain A. Further we assume d. R.R. = Ai 'P. In this way we would also obtain A.+. and then A. 6. 3. and the vector q. RX and R.j > 12. Now we start from A = A._. = Q1Q2 .-i = .. .-'.. Here P. = R.U.f.Q. = U. . uniqueness is secured apart from factors e'Bk in the diagonal elements of R.. = Q. The following steps are needed for the proof: 1. in subdiagonal elements will contain quotients . But A. = A.U. . Q. and R.+.._. for increasing s more and more will approach an upper triangular matrix. is computed as R. by partition in a product of a unitary and an upper triangular matrix. through a similarity transfor- mation: A. Then we also have PU. X is partitioned as X = QXRX and Y as Y = L. is unitary._. Q.Q.R. We do not give a complete proof but restrict our- selves to the main points.. and hence the condition rkk > 0 will determine Q and R completely. is unitary and U.R. It also means that A.Q...+..U. = Q.-'AP. Then we form P._.U. P. Q:'A. through a similarity transformation with a unitary matrix._.+.P. .. = 2i with 12.k is also determined.A. lower triangular with ones in the main diagonal (both partitions are unique).A.. R.Q. we put P. Next. R. THE QR-METHOD BY FRANCIS 143 SEC. is written A. For the latter partition a permutation might be necessary.A..9. = XDX-' = XDY which is always possible if all eigenvalues are different (this restriction can be removed afterward).1 > 2. and L. upper triangular... = Q. upper triangular. . and form sequences of matrices A Q and R. A = A..R. and consequently P.Q. 4. ..+. further details can be found in [121.A.

265-271 and 332-345..144 ALGEBRAIC EIGENVALUE PROBLEMS SEC. A.9. Appl. = P. are upper triangular. = P. . [9] Causey. 4. [31 Givens: Numerical Computation of the Characteristic Values of a Real Symmetric Matrix. Oct.. SIAM Review Vol. 176-195. Assoc. Series." The Com- puter Journal. Comp. 1. [2] Wilkinson: "The Calculation of the Eigenvectors of Codiagonal Matrices.'XDX-'P. Oak Ridge National Laboratory. 49 (Washington. 3. [71 Greenstadt: A Method for Finding Roots of Arbitrary Matrices. 4 (1961) pp. Q. [4] Wilkinson: "Householder's Method for the Solution of the Algebraic Eigenproblem. (2. [12] Francis: The QR Transformation-A Unitary Analogue to the LR Transformation. 6.RxDRX' (since X = QxR1 and RX = QXX). No. [13] Froberg: On Triangularization of Complex Matrices by Two-Dimensional Unitary Trans- formations... . If so. March 1962. 23--27 (April 1960). Comm. BIT 5: 4 (1965) p. Comp. 1958). Math. 1. A." J." The Computer Journal. we are left with P. [6] Lotkin: "Characteristic Values for Arbitrary Matrices. Comp. [15] Businger: Eigenvalues of a real symmetric matrix by the QR method. and Qx are unitary while U.U1 ^_ Q. the QR-method will become too laborious for arbi- trary matrices. As already mentioned. [141 Householder. [10] Eberlein: A Jacobi-like Method for the Automatic Computation of Eigenvalues and F. Several algorithms in ALGOL treating this method have already been published 115. 47-52 (1955). 255 (1950).)' so that this matrix actually will approach the identity matrix. 6. = Qx and U. Horwitz: A Procedure for the Diagonalization of Normal Matrices. Math. 45. 14. Mach. [5] National Bureau of Standards: Appl. = RXD'R. Num. and instead it is used on special matrices. p.. 16]. preferably Hessenberg or symmetric band-matrices. Algorithm 253. -. The method has good stability properties and seems to be one of the most promising at present. Math./A." Quart.!XDX-'Qx -. MTAC. [81 Goldstine. 29. 10. 217.+. [11] La Budde: The Reduction of an Arbitrary Real Square Matrix to Tridiagonal Form Using Similarity Transformations. But P. Jour. 433. 90-96. J. SIAM Vol. we can draw the conclusion that lim P. No. Gregory: On Lanczos' Algorithm for Tridiagonalizing Matrices.. The matrix RxDRX' is an upper triangular matrix with the same eigenvalues as A. 9.P.XRXD'R.igen- vectors of an Arbitrary Matrix. Math 1 (1959) p. Jour. 17 (1963) p. 230. 1961. 1959 pp. of Res. No. Bauer: On Certain Methods for Expanding the Characteristic Polynomial. 267-275 (1956). ORNL-1574 (1954). ACM 8: 4. 6. REFERENCES [1] Lanczos: "An Iteration Method for the Solution of the Eigenvalue Problem of Linear Differential and Integral Operators. and RXD'R. 5. Since the partition is unique.

Press 1965. if d is an eigenvalue of A. [19] Wilkinson: Convergence of the LR. p.a is an eigenvalue of A . Find the largest eigenvalue of the matrix 25 -41 10 -6 A_ -41 68 -17 10 10 -17 5 -3 -6 10 -3 2 correct to three places. Ass. Find the absolutely smallest eigenvalue and the corresponding eigenvectorof the matrix 1 2 -2 4 A. Mach. 350.21 5i 3 0 . The matrix 14 7 6 9 A_ 7 9 4 6 6 4 9 7 9 6 7 15 . Comp. Comp.2 12 3 5 3 13 0 7 ' 2 11 2 2 using the fact that 2-' is an eigenvalue of A` if 2 is an eigenvalue of A. find the highest and the lowest eigenvalue of the matrix 9 10 -8) A 10 5 1 8 -1 3/ Choose a = 12 and the starting vectors (1) 1l 1and 1 1 11 respectively. 8 (1966). (Desired accuracy: two decimal places. [ 17] Hyman: Eigenvalues and Eigenvectors of General Matrices. Oxford Univ. Using the fact that 2 .a!. Find the largest eigenvalue and the corresponding eigenvector of the Hermitian matrix 8 -5i 3 . Jour. (June 1957). EXERCISES 145 [16] Ruhe: Eigenvalues of a complex matrix by the QR method. QR and Related Algorithms. 118] Wilkinson: The Algebraic Eigenvalue Problem. 3+2i 0 2) 4. BIT 6: 4 (1966). 3. p. 77. 2.) 5. EXERCISES 1.

Using the LR-method..22). 6.C'. while all the others are b.C' + a.75 -0.1) (A . Use the result for computing to for the matrix 2. and A. ao a.(A .211)(A .24 -2. 4+1!)..15 0.1)-fold] and the eigenvectors of A. > 0).(%k . and one is (n . A is a matrix with eigenvalues A . a2 a. All the other eigenvalues are con- siderably less in absolute value than these two.37 A= ( . when 46 A ) = (3 2.15 -7.99 7. (A is called circulant.) 9. can be accurately determined from three such quotients R1. Also find the eigenvalues and the eigenvectors of A.01 . (all different). 10.A1)(2k . 1 1 . and both are greater than zero.(2k .2.Ak_ll)(A .2. 7. Use this relation for finding eigenvalues [one is simple. and R. A .-12 5 1 6 6 43 86 -27 1 11 -11 4 16 -27 106 4 6 -11 22 8. a2 a3 ao can be written aol + a1C + a... 146 ALGEBRAIC EIGENVALUE PROBLEMS has an eigenvalue close to 4. Show that aa. Find such numbers p and q that A' . Generalize the power method so that it can be used in this case. .. 12. n) all diagonal elements are a. ao a1 a. where C is a constant matrix. A real. A is a matrix with one eigenvalue Ao and another -2o (2 real..Ak 1). where a and v are eigenvectors corresponding to the eigenvalues 21 and 22.87) (two decimals).(A . From these values a series of consecutive Rayleigh quotients is constructed..11 and the next largest equal to 22.21). Find the largest eigenvalue of the modified eigenvalue problem Ax = ABx when 1 6 6 4 1 2 -1 4 A_ 6 37 43 16 and B.2n) Show that Bk = Bk.. ('-. find A. using the matrix B=(A-4I)-1. In the matrix A of type (n.(A . symmetric matrix A has the largest eigenvalue equal to .. all remaining eigenvalues are such that 121 < 2o. Put Bk .pA + ql = 0. a.87 -1. By use of the power method one has obtained a vector of the form u + Ev.37 -0. Compute this eigenvalue to six places. R2. a...4_1)(2k .. A.. . Show how 2.

-having the properties that f e F. we have P(f(x) + w(x)) = Pf(x) = g(x). however. In order to demonstrate this we suppose that w(x) is a function which is annihilated by P. f for all functions f e F. since it can occur that P-' is not uniquely determined. then the following laws are supposed to be valid: P. then Q is said to be an inverse of P. quadrature. where a and je are arbi- trary constants.P2 is defined as the operator which transforms f into P. usually one chooses F* = F. the product P. numerical differen- tiation..1) P1(P2 +P3) = P. but in general they do not commutate. f = P. g. where w is an arbitrary function annihilated by P. as the operator which transforms the function f into P. Two operators P. Hence we can write P-'Pf = f + w. 7.)PA (7. A linear operator is then a mapping of F on a linear function space F*. In fact.) + P P1(P2P3) _ (P.0. The operators which we are going to discuss fulfill the associative and distribu- tive laws. Usually the deduction is also considerably simplified. and further that f(x) represents a possible result of P-'g(x). f. are said to be equal if P. however. An operator P being linear means that P(af + /3g) = aPf -i 9Pg for all f and g e F with a and /3 arbitrary constants.Chapter 7 Linear operators "I could have done it in a much more complicated way" said the red Queen. Such a space is defined as a set F of functions. f + P. LEWIS CARROLL. the operator technique proves to be a most useful tool.P. are in general not obtained in this way..0. and P. Here we define the sum P of two operators P. Here we must be careful. If Q is such an operator that for all f we have Qg = jr if Pf = g. denote operators. Pw(x) = 0. General properties When one wishes to construct formulas for interpolation. We also define the inverse P-' of an operator P. . + (P! + PO _ (P. 147 . It must be understood. .. f ). and P.P..(P.. If we let P P P. The operators are supposed to operate on functions belonging to a linear function space. including remainder terms. that is. + P. and summation. immensely proud. g e F implies of + 6f e F. f. that complete formulas. + PAPA . Then f(x) + w(x) is another possible result. One of the greatest advantages is that one can sketch the type of formula desired in advance and then proceed directly toward the goal.

8mf(x) = f(x + h) . the interval length It enters the formula in an important way.148 LINEAR OPERATORS SEC. instead of E. f(x) fit) dt f(X) = f =_ f(t) dt Definite integration operators Jof(x) = S z(A/SI it)dt We observe that in all cases. by using the notation E. Thus P-' is a right inverse but not a left inverse.fix . Then we say briefly that P-1 is the inverse of P.1. Special operators Now we introduce the following 13 operators. This will be discussed in some detail later. 2 )1 The mean-value operator Sfx+h)=Sfx)+fix) Tf(x) = Tf(x . for example.1. and if one wants to use other intervals. Further. including their defining equations: Ef(x) = fix + h) The shifting operator df(x) = f(x + h) .1) afx+ Df(x) = f (x) The differentiation operator f(t) dt The indefinite integration operator Jf(x) = (a arbitrary constant) Ja J. this can easily be denoted. 7. If the equation Pw(x) = 0 has only the trivial solution w(x) = 0.h) . .. no misunderstandings need to be feared.h) + f(x) Summation operators 2) = of x . then P-' also becomes a left inverse: PP-1 = P-'P = 1 .1.f (x .f(x) The forward-difference operator f'f x) = fix) . 2) The central-difference operator '"f(x) = 2 Xx + h + f(x . this ought to have been marked in some way. for example.?) + f(x) (7. 7. we note that the summation operators are distinguished from the others by the fact that they are defined recursively.f(x . and in general this will give rise to an indeterminacy. However. except for D and J. Actually.h) The backward-difference operator / 8f (x) = f (z + L) .

we get as result a polynomial of lower degree... This can be expressed by saying that the operators are degrading. Similar relations hold for the operators T and a in relation to the difference operators V and S. In the same way as the integration introduces an indefinite constant. if the differences are formed first and the summation is performed afterward. To begin with we observe that the operator E-' ought to be interpreted by E-'f(x) = fix . The difference operator d has the same relation to the sum- mation operator S as the differentiation operator D to the integration operator J. whether they contain h or not. In the rest of this chapter. we can say that the operators J. f(x + h) = f(x) + hf (x) + hf '(x) + . and 8. We can build up a difference scheme and continue to the right in a unique way but not to the left. SPECIAL OPERATORS 149 In the following we will need relations between different operators. we take some simple relations and formulas.. . It is easy to see that if one first forms the sums starting from a certain initial value and then computes the differences. but JDJ(x) = f (t) dt = f(x) . for example D. 7. Hence we have: d Sf(x) = f(x) . we can write Ef(x) _ (1 + hD + h2D2 + . and in such cases the reverse operation must give an indeterminacy. T. Summing up. F. First. The first group is easily attainable in numerical computation. The first eight depend on the interval length h and have a discrete character.)/y\x) = e l l . We have clearly DJf(x) = D f(t) dt = fl (x) x) . an indeterminacy must appear. Sdf(x) = f(x) + c .f ( x ) ..h). This fact is closely related to a property of these latter operators. However. Hence EE-' = E-'E = 1. operates on a polynomial. while the last five. In order to link the discrete operators to the others.SEC. our main task will be to express the operators of one group in terms of the operators of the other. This is also reflected by the fact that the equations Df(x) = 0 and df(x) = 0 possess nontrivial solutions (:0). are connected with operations associated with a passage to the limit. J. S. this must also be the case for the summation operator since the summation has to start at some initial value. we make use of Taylor's formula.f(a) . the last group in analytical. we regain the old values. and or are right inverses but not left inverses to the operators D. If one of them. By means of operator notations.1.

. differentiation operators E d V S U z E E d+1 (1-V)' 1+jd2+b 1+ e° z d E-1 d (1 162+o 1+ eU-1 P 1-E-' 1-(1+d)' P 1-e° 8 E1/2 .V)-1/2 b 2 sinh i(E.E-'/2 d(1 + d)-1'2 V(1 .?)(1 . j1 + 4 cosh 2 U log E log (1 + d) log 1 1 p 2 sinh-' U 2 trl . Shift and difference operators vs.i2 17)-1/2 K + E-1/2) 41+ 2)(1 + d)-1j2 (1 .

both of these formulas will be used frequently. 4J=J DJ_. log (1 + 4) U' J-' h17 _ hP (7.1) The resulting expression can then be written jib P(8).=J. and such function values are usually not known. Then it is clear that Taylor's formula can be written in the compact form E=e"D. apart from the func- tions themselves. one ought to use the central difference operator b.2.D = V.1. . 150 we present the relations between shift and difference opera- tors and the differentiation operator.D=4. POWER SERIES EXPANSIONS WITH RESPECT TO 8 151 SEC. In particular we observe that 8 = 2 sinh U/2 and jt = 1 -+31/4-. (7. odd powers are obtained. = J_. since odd powers refer to points between the mesh points.1. If.2. and hence odd powers have been eliminated. in spite of all precautions.2) In the table on p. DJ = 1 . as a rule. Further. Further. To obtain the most symmetrical formulas. For this reason it is natural that we investigate power series expansions in terms of b a little more in detail. where P is an even power series. For the integration operators. PJ = J_ (7. we have .E-'). This formula can be used only for analytical functions. ought to contain only even powers of 3. we find the following formulas: DJ. (7.3) DJ°=JD=B.2.1.P) U' N _ hS J0 2 sinh-' (3/2) U 7. it should be observed that a"D is defined by its power series expansion. It should be observed that these expansions. Power series expansions with respect to a The quantities which are available in numerical work are. we can multiply with I . fiJ=J°. we also have h4 _ h4 J.4) log (1 . 7. for which we now prefer the form U= hD. in the following this technique will be used frequently.ud = J(E . As to the first factor. differences of all orders.

152 LINEAR OPERATORS SEC. 7.2.

Our main problem is now expressing F(U) as a power series in 8 by use of
the relation 8 = 2sinh (U/2). Denoting the inverse function of sinh by sinh-1,
we have U = 2 sinh-' (8/2), and in principle we merely have to insert this into
F(U). We shall now discuss how this is done in practice.
The quantities 8, F, and U will be denoted by x, y, and z, respectively, and
we have the relations
y = y(z)
(7.2.2)
x = 2 sinh (z/2) .
Differentiating, (we get
T)-1/2
dx=cosh (2)dz=(1 + J dz or dX= +
The general idea is to differentiate y with respect to x and then compute
(d%y/dx%).=o; from this we easily get the coefficients of the expansion. We
consider the following ten functions:
1. Y = z 6. y = coshpz/cosh (z/2)
2. y = z/cosh (z/2) 7. y = 1/cosh (z/2)
3. Y = z' 8. y = sinh z/z
4. y = cosh pz 9. y = 2 tanh (z/2)/z
5. y = sinh pz/sinh z 10. y = 2(cosh z - 1)/z' .
Then we find the following expansions:
1.

dy dz - 1 - I + x'
dx dx cosh (z/2) ( 4/
.
n)2=*
)(-n) 22,.( - n)
2'".n! 2+*,(n! )'
Integrating this relation, we get:
(2n)! x:w+1
Y = Z -,, 2" (n! )' (2n n+ 1)

X
x' 3x' 5x' 35x'
24 + 640 7168 + 94912
2.

z
cosh (z/2) '
dy - cosh (z/2) - (z/2) sinh (z/2) dz 1 - xy/4
dx cosh' (z/2) dx - 1 + x'/4

SEC. 7.2. POWER SERIES EXPANSIONS WITH RESPECT TO 8 153

Hence (x2 + 4)y' + xy = 4 (y' means dy/dx). For x = 0 we get y = 0, and
from the equation we get y' = 1.
Next we differentiate m times and obtain
1(x2 + 4)y" + 3xy' + y = 0,
(x2 + 4)y"'
+ 5xy + 4/ = 0,
(x2 + 4)y1° + 7xy"' + 9y = 0,

(x2 + 4)y(-+,) + (2m + 1)xy(m) + m2y('"-,) = 0

Putting x = 0 and m = 2n, we obtain y(2"+1>(0) =
/ -n2y(2"-1,(0), and hence
y(2" F1)(0) _ (-1)"(n?)2. Thus we get

E (- x2n+1
1,(n!
Y = "=o (2n + 1) t
.x3 x2 x' x" x13
x + 30 140 + 630 2772 + 12012
3.

- ZZ'
y
_ 2z
dx dx
2
cosh z/2)
(z
2 (- 1)" (2n + 2 1) (
x2"+,

and after integration,
y = 2 . [(n - 1)!12 x2"
".1 (2n)!
X4
-x2
X6 X, 10 12

I2+ 0 0+3150 16 332+
4.

y = cosh pz ;
Y, = dy _ p sinh pz
dx cosh (z/2) '

Y
= p2 cosh pz - p sinh pz (1/2) sink (z/2) = p2y - xy'/4
cosh2 (z/2) cosh (z/2) cosh2 (z/2) I + x2/4
Hence (x2 + 4)y" + xy' - 4p2y = 0 with the initial conditions x = 0, y = 1,
and y' = 0. Using the same technique as above, we obtain without difficulty
y(2"+2)(0) = (p2 - n2)y12">(0)
,
and hence
P2 2
+ + P2(P2 - 1)x' + p2(p2 - l)(P2 - 4)x°
y + .. .
2' 4! 6!

154 LINEAR OPERATORS SEC. 7.2.

5. In the preceding formula, we have computed y = coshpz. Differenti-
ating with respect to x, we find
1)x' + p2(pl - 1)(p2 - 4)x° + .. .
p sinh pz I

cosh (z/2)
= pox +
p2(P2

-
3! 5!
that is,
sinh pz = p C1 + (p2 - 1)x2 (p2 - 1)(p2 - 4)x' + ...1
sinh z 3! 5!

6. y = cosh pz/cosh (z/2). The following equation is easily obtained:
(x2+4)y"+ 3xy'+(1 -4p2)y=0.
For x = 0 we have y = 1 and y' = 0; further, we find
(x2 + 4)y("+,) + (2n + 1)xyc.) + (n2 _ 4p2)yc"->> = 0 .
Hence
y"(O)=P2-a,
y"(O) = (p2 4)(p 4)
yV,(O) = (p2
1) (p2 - 4)(Y A)

and

y 1 +(P 2 )x2+(P 4 2x'+(P 6 )x°+...,
7. As was shown in (1), we have
_ ((
1 xzl u2 _ (- 1)"(2n)!
x2,
cosh (z/2) - \1 + 4 / 2,"(n!)2

-1- x2 + 3x' - 5x8 + 35x8 -
8 1228 1024 327 88
8. We start from

Y
- sinh z - 2 sinh (z/2) cosh (z/2) = 2 sinh (z/2) x
z z W z/cosh (z/2) z/cosh (z/2)
Since y is an even function, we try with y a° + a,x2 ±- a2x' + and obtain,
using the expansion of (2),

(a°+a,x2+a.x'+...) x- 140...1- x
6 + 30
Identifying the coefficients, we get

Y
x2 _ x' x° 23x8 263x'°
+ 6 180 + 1512 - 226800 + 14968800

SEC. 7.2. POWER SERIES EXPANSIONS WITH RESPECT TO 8 155

9. Putting v = z cosh (z/2), we easily find the equation
(x2+4)v'-xv=x2+4,
and
v(2n+1)(0) = -n(n - I)vc2n-1)(0) (n Z 2) ; V'(0) = I , v"'(0) = i .
Hence
vx+12X, X5
1 0+840
7 9

5040+27720
11

Putting
= tanh (z/2) + a1x2 + a2x' + ..
Y
(z/2)
and using vy = 2 sinh (z/2) = x, we find:
Y _1- x2 11x' 191x° 2497x8 14797x10
12 + 720 - 60480 + 3628800 - 95800320 +
10. We obtain directly

Y
2(cosh z - 1) _ 4 sinh2 (z/2)
z2 z2

(2 sinh (z/2) cosh (z/2))(2 sinh (z/2)/cosh (z/2))

sinh z 2 tanh (z/2)
z z
and hence get y = x2/z2 as the product of the expansions (8) and (9):
x2 _ x' 31x° _ 289x8 317x10
y + 12 240 + 60480 3628800 + 22809600
Summing up, we have the following results:

U=8- 83
+
380 _ 587 3559 - 636"
{7 .2.3)
24 640 7 668 + 294912 2883584 +
82 8' 610 812
U = 1
86 8°
6
-6 + 30 140 + 630 27 22 + 12012
(7.2.4)
v° S°
UZ - SZ - X14 i 0`1° (!812
(7.2.5)
U 12 + 90 5160 + 3150 _ 16632

cosh pU = I + PY82 + PZ(PZ - I)8 + PZ(PZ - I)(PZ - 4)86 + ... , (7.2.6)
2! 4! 6!

156 LINEAR OPERATORS SEC. 7.2.

sinh P 1)52+( P 5 2)5++(P? 3\58+..., (7.2.7)
sin h U P +(P 3

1) z + ( P l 5' + (P + z)f 5e + ... , (7.2.8)
coshh(Ul2) 1 + (P 2 /l 4 // 6
52

8
+ 331
128
556 + 3558 63510 + 231512
(7 . 2 9)
.

1024 32768 - 262 444 4194304 '

EcS _ 1
52 _ 54 56 _ 2358 263510
7 . 2 . 10
U + 6 180 + 1512 22 8800 + 14968800 ' ( )

5 62 + 115' 19156 + 249758 _ 14797510
+
pU 12 720 - 60480 3628800 95800320
(7.2.11)

52 + 52 _ 5' + 3156 _ 28958 + 317510
Uz 12 240 60480 3628800 22809600
(7.2.12)

Here we obtained formula (7.2.4) by multiplying the expansions for
U/cosh (U/2) and p = cosh (U/2) .
The formulas (7.2.3) through (7.2.12) may essentially be regarded as relations
between U, on one hand, and 5, on the other, and all of them ultimately depend
upon Taylor's formula E = e°. From this point of view, it seems appropriate
to discuss the meaning and the validity of this formula a little more in detail.
We assume that f(x) is an analytic function, and write the formula as follows:
Ef(x) = e'f(x) or fix + h) = e'f(x) .
We observe that the formula is correct if f(x) = x The right-hand side
reduces to
(1 + hD + h2D2 + ... + h%D.\
x*
2! n! /J

since the function is annihilated by higher powers of D. But for r 5 n, we
have
h'D'x" = h*n(n - 1) ... (n - r + 1) x"-r _ (n l hrx* * ,
r! r! r/
and hence the whole expression is
hrDr
x° h)''
_0 r! _ r.e (n)
r h'x' = (x + .

SEC. 7.2. POWER SERIES EXPANSIONS WITH RESPECT TO 3 157

Clearly, the formula is also valid for polynomials, and a reasonable inter-
pretation of the formula E = e° is the following: The operators E and
1 + hD + h2D2/2! + + h"D^/n! are equivalent when used on a polynomial
P,,(x) of degree n, where n is an arbitrary positive integer. As second example
we take the formula E9 = (1 + 4)P. If p is an integer, (1 + 4)' is a polynomial,
and we get an identity. But if the formula is used on polynomials, it is also
valid for all real p. For fix) = x' we have
4x3=(x+h)3-x3=3xth+3xh2+h',
42x3 = 3h[(x + h)2 - x2] + 3h2[(x + h) - x] = 6h2(x + h) ,
4'x3 = 6h3 ,
s
(1 + 4)'x3 = t (P) 4'x3 = x3 + p(3xth + 3xh2 + h')
-o r
+ P(P 2 1) 6h2(x + h) + P(P - 6)(P - 2) 6h3

= x3 + 3x2ph + 3xp2h2 + p'1' _ (x + ph)3 .
Taking insteadf(x) = es, we get
4e==es+h-es-aes,

where a = eh - 1. Hence
(I + 4)Pe- (Pr) d.es

=(P)ales=(l
r
+a)'es,
provided that the series
(P) a'

converges. As is well known, this series is absolutely convergent if jal < 1,
that is, if h < log 2. In this case we get
(1 + 4)Pe` = (I + a)'es = e'he` = es+Ph ,

and hence the formula E' = (1 + 4)P is valid.
Summing up, we note that the operator formulas need not be universally
valid, but we see that they can be used on special functions, in particular on
polynomials. In practice they are often used on considerably more general
functions, and then, as a rule, the series expansions have to be truncated after a
few terms. A rigorous treatment, however, falls outside the scope of this book.
Infinite power series in degrading operators in general should be understood as
asymptotic series. If such a series is truncated after a certain term, the remainder
term will behave asymptotically as the first neglected term as h -' 0.

158 LINEAR OPERATORS SEC. 7.3.

7.3. Factorials
The factorial concept actually falls outside the frame of this chapter, but with
regard to later applications we will nevertheless treat it here. The factorial
polynomial of degree n (n positive integer) is defined by the formula
pcn)=P(P- 1)(P-2)...(p-n+1)= p! (7.3.1)
(p - n)!
Among the important properties of such expressions, we note the following:
(P+ 1))w) _ p(n)=(P+ 1)P(P- 1)...(p-n+2)
-P(P-1)...(p-n+1)
=P(P- 1)(P-2)...(p-n+2)IP+ l -(p-n+ 1)I
=np(P - 1)...(p-n+2)
or
dp(n) = np(n-1) (7.3.2)

Here the d-symbol operates on the variable p with h = 1.
We generalize directly to
42p(n) = n(n - l)p(n-2) = n)2)p(n-2)
and
dkpcn) = n(n - 1)(n - 2) (n - k + 1)p)n-k)
= n(k)p(n-k) . (7.3.3)
If k = n, we get dnp1n) = n!
Until now we have defined factorials only for positive values of n. When
n > 1, we have p(n) = (p - n + l)p1n-1), and requiring that this formula also
hold for n = I and n = 0, we get p(O) = 1; p'-1) = l/(p + 1). Using the formula
repeatedly for n = - 1, - 2, ... , we obtain:
n) = 1 _ 1
P (7.3.4)
(p + 1)(P + 2) ... (p + n) - (p + n)ln)
With this definition, the formula p'n) = (p - n + 1)p)n-1), as well as (7.3.2),
holds also for negative values of n.
We shall now derive formulas for expressing a factorial as a sum of powers,
and a power as a sum of factorials. Putting
Z(n) _ ak(n) Zk (7.3.5)
k-1

and using the identity z(n+1) = (z - n)z(n), we obtain, on comparing the coef-
ficients of zk,
al-') = a)kn-)1 - na'")
k , (7.3.6)
with cell) = 1 and aon) = 0. These numbers are usually called Stirling's numbers

SEC. 7.3. FACTORIALS 159

of the first kind; they are presented in the table below.

1 2 3 4 5 6 7 8 9 10
n
1 1

2 -1 1

3 2 -3 1

4 -6 11 -6 1

5 24 - 50 35 -10 1

6 -120 274 -225 85 -15 1

7 720 -1764 1624 -735 175 -21 1

8 - 5040 13068 - 13132 6769 - 1960 322 - 28 1

9 40320 -109584 118124 -67284 22449 -4536 546 -36 1

10 -362880 1026576 -1172700 723680 -269325 63273 -9450 870 -45 1

For example, z11) = 24z -- 50z2 + 35z' - l0z' + z°.
To obtain a power in terms of factorials, we observe that
z , z(k) = (z - k + k)z(k)
= z(k+1) + kz(k)
Putting
Z.
E G(*)z(k) ,
= k=1 (7.3.7)
we obtain
n-1
Z" = z Zn-1 = Z E81-11Z(k)
k=1
n-1
9 -I)(Z(k+1) + kz(k))
k=1

Since no constant term is present on the right-hand side of (7.3.7), we have
(ion) = 0, and we can let the summation run from k = 0 instead of k = 1:
Z. n-I
E pp(%-I)(z(k+1)
= k=o + kz(k))
n n -I

k=1
Nkn i1)Z(k) +
E kNk
k=1
,p(n-I)Z(k)

/atkn)z(k)

k-1

Identifying both sides, we get
j( /.?kn)
= k-1 I) 1kRkn-1) , k= 1,2,...,n- I
(7.3.8)
x n-1
The numbers ,Q are called Stirling's numbers of the second kind; they are displayed

in the table below: 6 9 10 nk 1 2 3 4 5 7 8 1 1 2 1 1 3 1 3 1 4 1 7 6 1 5 1 15 25 10 1 6 1 31 90 65 15 1 7 1 63 301 350 140 21 1 8 1 127 966 1701 1050 266 28 1 9 1 255 3025 7770 6951 2646 462 36 1 10 1 511 9330 34105 42525 22827 5880 750 45 1 For example.3). This ensures that sums of factorials .3. 127.9) The Stirling numbers ak"' and R1%) are special cases of the so-called Bernoulli numbers B.3. but otherwise these numbers are beyond the scope of this book. p. we have zd = Z' + 15z':' + 25z(2) + 10zu' + z(s) Both these tables. respectively. let u and v be the following column vectors: z zM Uand v= zs Z'2 z" z(") Then we have v=Au.160 LINEAR OPERATORS SEC.2) and (7.") of order n. u=Bv.3. we see that the factorials have very attractive properties with respect to the operator d. 7 .3. Further.10) (e _c vl Later we shall deal with the case n = 1.3. 3 . counted with n rows and n columns. Closer details are given in [11. they are defined by the expansion t" t" (7.1l (7. AB=BA=I.". We only mention the following relations: a. = n \k. form triangular matrices which we denote by A and B. (7.11) k "k From Formulas (7.

) = (Q + n+ Here P and Q need not be integers. In order to demonstrate this.3._ft tfi where f is a function such that 4F = f. and instead we refer to [2]. (p . 13) P_P (n + 1)h We shall return to these formulas in Chapter 11. If.3. on the other hand.12) to be 1.15) +2 1 In both cases we have analogous difference formulas: Sp'"' = np'"-" ..16) Naturally. + (F"+ .F. 54 and 568.nh i h). 1933).. However. . Spt"l = npt"-') . the interval is supposed (7. = (7 .+.. pp. London. the factorial is defined by pc"1=p(p-h). we refrain from doing this. the formula is slightly modified: Q p i" (Q + h)c. we regard E" " i=w f(xo + ih) = . [21 Buckingham: Numerical Methods (Pitman.. 1949)..F") =F.+ _ pa+. London. 3 . one can construct expansions of factorials in powers.. 14) . instead.. also in these cases.3.) + (F. and obtain " tfi = (F. Princeton.+. 161 REFERENCES are easily computed. we also mention that one can define central factorials: pW = p + n 2 1 J (7 3 .+f ..+. (7.-u =P (7. REFERENCES [11 Milne-Thomson: The Calculus of Finite Differences (Mac Millan.. we get t a-P p(. [3] Milne: Numerical Calculus (Princeton University Press.) + . and central mean factorials: pt"] = pp["' = 2 Rp + 2 / + (P -2 l/p n \ . and conversely. In passing. Since d(p'"+"/(n + 1)) = p("'. 1957). ..F.F.

) .13 ksgk (a) p(fkgk) = pfkpgk + 1afk8gk . 7. By also taking the point midway between the first points. (a) Show that where . Find its value expressed in known central differences. and the corresponding differences are denoted by 4J. The expression 5yo cannot usually be computed directly from a difference scheme. = 2-r[4' . = 0. Prove that: -1 (a) -1 d2fk = if. 10.-1 + a. A Fibonacci series is characterized by a = a.. Show that V-4= 4V and that 4+V=4/V-V/4. >0). Show that 4. d2f...adr+1 + bdr+2 -. x.162 EXERCISES EXERCISES 1. (gk/ fk _ pfkppgk . we obtain x0. d. Prove the relations.. 9. . where ao and a1 are given numbers. (b) 52f2k+1 = tanh (U/2)(f2 . the same holds for 6. 52(x2).=(-1)" {(n+ 1)+(n+k1)}.. 2. . Use the result for determining a par- ticular solution of the equation pf(x) = 2x` + 4x2 + 3x2 + 3x + A. and 02(x`) when h = 1. where xk = xo + kh.dfo . It.-2. (b) 5. d2f. x1/2. A function f(x) is given in equidistant points x0... x2/2... x1. . The sequence y" is formed according to the rule y.d operates on n. f di f. x1. ..fo) k=0 k=0 4. (b) 1/ gk-1/29k+1/2 (C) 5(fkgk) = pfk5gk + pg0fk (d) 5 (fk) = pgkOfk .. = 22. Prove that: (a) 4v= dk k + fk+l ) (p 2 1)(%) pi2.Ufkagk 1\ gk/1 9k-1/29k+1/2 3.. Find 52(x2). Prove that d"y.. . and the corresponding differences of different orders are denoted by df... (2)* (n integer. 8. Find cos pz as a power series of x when x = 2 sin (z/2). and hence that (n)-(e+11)-(i+1 ) I N .. Show that if yn are terms in a Fibonacci series. and compute the constants a and b.

.h) .. Show that ar .. + (..EXERCISES 163 (b) Find the coefficients c.2f(x) + f(x + h) and a'sf = f(x . 2.nh) . in the expansion (nl nb -Ec. We define a2f = f(x . is given. A Fibonacci series a. 14.2. Express z0fA/2 Joyo = S y(r)dt J sO-A/9 in central differences of y. 1.\/a and compute the smallest value N such that n& > 1010.1)"ar = k a new Fibonacci series. and state the first terms in this series. Obtain 6'2 as a power series in a2 (the three first terms).. . 12.. n = 0.2f(x) + f(x + nh). 13.

. other kinds of functions (trigonometric. 8. + 164 .. x .. 8. n. Our problem is then to find the function value y corresponding to a given value x. In the following we shall almost exclusively treat this case. this problem has an infinite number of solutions. i = 1. . n are known.. certain points P. ..1. This is.. f is usually a "smooth" function.. however... of course. Lagrange's interpolation formula We assume that for a function f(x) which is continuously differentiable n times. ). In practical cases.. y. n points (x y. that is. Find the value of the function at the point P(e.).. Usually f(x) is replaced by a polynomial P(x) of degree n . y. ). (x y. First... we shall restrict ourselves to functions of one variable. i = 1 . and usually we shall be concerned with polynomial approxi- mation. it varies in a rather regular manner. Are the given points equidistant? Has adifference scheme been constructed? Is a table of interpolation coefficients available? Should interpolation be performed at the beginning or at the end of a table? Is extremely high accuracy desired? We are now going to describe a number of methods. Clearly.(x.1. . 2. 17.. .. taking the values y y .. We suppose that the function can be approximated by a certain type of function.. we should first an- swer the following questions.). x. (x. z. in each case some kind of advice is given as to the circumstances under which the method should be applied.. It is obvious that this problem is strongly underdeter- mined.. . For a function f(x. for x = x1.. . When we have to choose a suitable interpolation method. From now on we consider a function y = f(x) with known values y... a somewhat vague condition which ought to be put in a more precise form. since one can prescribe an arbitrary value to this point.Chapter 8 Interpolation Interpolation-that is the art of reading between the lines in a table. 2. y.) are known.. Putting y = ao + a. In special cases.f(x.. z.0.). ... C..x + . y. Introduction The kind of problem we are going to treat in this chapter can be briefly de- scribed in the following way. ). . exponential) may occur... .

Obviously.. (X - + .X1)(x .x2)(x.xk)F0(x) and F(x) _ (x . Thus we can also write P(x) F.x.. +. (x2 ..x k_1)(x . x x2.x3) . .(x) = F(x) yk .2) rok Obviously.(x) = II (x _ x.X . and R = f'°'(¢)/n!. all terms in the sum vanish except the rth. which takes the value y = y..X.xw) For x = x.xk-1)(xk . where k=1 (8.1. (8....x2) . Then it is convenient to use the following functions: F(x) _ II-1 (x . (I < r < n).x3) . .. where R is a constant which is deter- mined so that G(x. and further that xo # xk. It can. .. .) . (8. t and hence F(xk) = Fk(xk). since P(x) is of degree n .xk+1) .xe) Lk(X) _ ( xk . we have G(x) = 0 for x = x0.. + (x .. Hence. (x . .3) k -1 Fk(xk) yk k-1 (x . ... replacing xo by x.. f(x) = P(x) + n! or written explicitly. We define the function G(x) = f(x) . . But G'°'(E) = f'"'(i.x3) .xk+1) .(x) + Fk(x). . (x A: .SEC.. . . yn (X. 2. The determinant (Vandermonde's determinant) is rj:>k (x..x1)(Xw . .1) (x .1.x.x3) . . (x ..xk)F(Xk) We suppose that the point x.) (x2 . (x .. . and by using Rolle's theorem repeatedly.) = 0.P(x) ..x2) .RF(x). be differentiated if one wants to compute the derivative in an arbitrary point. a .x.X.1. where c. (8.Xl)(x2 . 8. we obtain (note that E is a function of x) ft. .X2)(x ...R n!.x2) . (x . We shall now examine the difference between the given functionf(x) and the polynomial P(x) for an arbitrary value xo of x.1. . .1. .x2) . k = 1 .) . . (x .1. it must be remem- .. Thus there is a unique solution and it is obvious that the desired polynomial can be written P(x) = y = E Lk(x)yk . we conclude that 0. (X.)(xk .X.<-1) + ftwt' ) (x .4) n! This is Lagrange's interpolation formula. . we have F(x) = (x . . .. of course. . x. .(s) F(x) . lies in the closed interval I bounded by the extreme points of (x x .X2) .xk)F.x1)(x .I.) .) f(x) Yi Y2 (X1 . F.)(X . . (xk ..xk) and hence is not equal to 0.X1)(x . (x . LAGRANGE'S INTERPOLATION FORMULA 165 we get a linear system of equations in the coefficients a0.(x . (x1 . n.

we get F(x) = 0. by logarithmical dif- ferentiation F(x) = F(x) E I k X-Xk and F'(x)=F(x)E k 1 .F(x) Y.x. it is possible to obtain a comparatively simple expression. n 2 We put xk = xo -{.F(x) yk dx (x . . 1.-2+2.xk)F~(xk) yk dP = E (x .5) fix.xk).))y.xk)F(x) . and we can obtain nonzero contributions only when we have the factor (x .-Xk We now pass to the case when the x-coordinates are equidistant. and start by renum- bering the points. . X ..xk) i<k If we put x = x. that is.) . and distinguish between two cases. LJr+F(Xk) yk (8..Xk kr' X. where p is a fraction which preferably . ..) in the denominator. n . by l'Hospital's rule and becomes (F'(xr)/2F(x. k = 1.1. 8.. For the derivative in one of the given points (x yr).Xk and F( X. Hence F'(xr) = lim 2F(x) E I = 2F(xr) F. 166 bered... This is done because we try to arrange them in such a way that the point x lies in the middle of the interval. INTERPOLATION SEC. I a-z.xk)F(xk) . we get.k (x .F(x)E (x_xk)p k 1 X 1 1 F(x) [(1: x . We have ` F(x) P(x) = k=1 (x . for example. that the factor f °'(£) is also a function of x. (x _ xk)2 ] = 2F(x) i.x. 2. Xk+1 = xk + h. we find dP _ F(x.)ZF(x.)F(x) .kh and x = xa + ph.x k1 . yk + lim (x (x C dx z'zr (x.) The last term can be computed. namely n even and n odd. 2.) 7. .1. kar Xr . -1. however.)(x .xk .x. From F(x) = 11k_1(x .1. 0.xk)2P(xk) and for x = x.X...x. First we treat the case when n is even. --. The index k now takes the values -2+1.

we see that Ek Ak^'p .1. (Note that A_1 and A.1. + (. we have the remainder term R(p+ 2 .. using Lagrangian five-point interpolation.0242. Further.Xk-1 =h x.xk+.1 h fx = (k+ 2 -2)h =(p+ 2 -2)h xk .k) +2 In a similar way. -(2 . EXAMPLE For a function y = y(x). and our original x becomes x.sEc. Our original x1 now becomes xo + (. . A. (p xk . 1) h x. 4. are negative.n/2 + I )h. I 1 and for dif- ferent values of p [I]. The reader . Find y for x = 1. = _h \ 2) h In Xk .k)! k) x J(p+n 2 1 . (8.7) Interpolating the functionf(x) = 1. We have p == 0. becomes x.n/2 + 2)h.1. )(p+ 2).6) .(p / n°($) r 2 =(P+n/2. and from the table we obtain the coefficients A_ . + (nl2)h.420. LAGRANGE'S INTERPOLATION FORMULA 167 should lie between 0 and 1. Let Ak(p) be the coefficient ofyk in Lagrange's interpolation formula: Ak(P) = (n/2 + k (P t) ... we get for odd values of n: A* (- k(p) ((1)/2 + k)! ((1)/2 . (8.I)h n The coefficients AZ(p) have been tabulated for n = 3. x1_/2 = (p + 2 . our original x. the sign is indicated in the heading of the table Ill.k) h. .. .t). Hence Xk .. (k + 2 -.1. 5 points are given according to the table below.I)! (n/2 k)i (p . 8.. .

. In its general form Lagrange's formula is used only on rare occasions in prac- tical computation. Vk(xi) = 0 f (8. for example.) = Sik .6875481 0.8)(y .64) 1 + (y .6723287 0.6787329 .1) Here Uk(x) and Vk(x) are supposed to be polynomials of degree 2n .1. P(x) andf(x). as well as P'(x) and f (x). Vk(xi) = aik As is easily inferred.1)(y .02 4.. (8.00 4. This example shows clearly that a higher-order formula does not necessarily give a better result than a lower-order formula.63) 8 7 . + As = 1.02277254 1. that we start from the points (0.6570095 -0.2.38006584 1. (2. 27).I such that in the points x x ..78727924 A1yi = 4. 8). x.2. 3 + (y .2.168 INTERPOLATION SEC. 1.26.19)(.27) .0.8)(y . . Our requirements are fulfilled if we claim that Uk(x. Linear interpolation gives 2.03487946 Hartrec and others have emphasized that the Lagrangian technique must be used with discrimination. . 0).56.-1.63. + Ao + A. Thus we form P(x) = k=1 E Uk(x)f(xk) -+-k=1t V.) x y A 1.03 4. we get x . should check that A_2 + A_.64) .7144. we may choose Uk(x) = Wk(x)Lk(x)2 (8.37 With y .7026694 .26)(. Suppose. (1. it is extremely valuable in theoretical work within different branches of numerical analysis.27)(y .(x)f(xk) . Hermite's interpolation formula The Hermitian interpolation is rather similar to the Lagrangian..1)(y .37) 64. (3.7)(.15523816 1.27)(y . On the other hand.2.64) 2 L 1 (.6415888 0. 8. a deviation of only 3%. (..01 4. 4 27. 64) on the curve y = x' and that we want to compute 1%20 by inverse Lagrange interpolation.8)(y .2) Uk'(xi) = 0. We find directly that x = y r(y .3) Vk(x) = Zk(x)Lk(x)2 . and (4. coincide. 8.3139 instead of the correct value 2. 19 (.20.2. 1).56) + (y . that is.63.1)(y . The difference is that we now seek a polynomial P(x) of degree 2n .04 4.

2. HERMITE'S INTERPOLATION FORMULA 169 where Lk(x) is defined in (8. that is. we have Lk(x. 2. By using the conditions (8.2. .9) Here we also mention a technique to determine a curve through given points in such a way that the resulting curve becomes as "smooth" as possible.2). where we must assume that .7) and (8. (8.2. . In prac- . we know that G'(x) vanishes in n intermediate points in the in- terval I. and so on.8) From (8. .2.) = 8..xk))Lk(x)z f(xk) k=3 + E (X . clearly. .k.1. (8... .5) k=1 Interpolation of this kind is sometimes called osculating interpolation. 8. k = 1. Hence G"(x) vanishes 2n . we get Wk(xk) = 1. . x .4) / Zk(xk) = 1 . Also. and finally G(2'1(x) vanishes in at least one point 6 E I. x1.2.P(x) . and hence Wk and Zk must be linear functions. Now Lk(x) is of degree n . x x1. .). . 2. (2n)! = 0 . that is. (8. and we find the following expression for the complete inter- polation formula: f(x) = P(x) + [F(x)j' .. T":(XI-) = -2L (xk) . let I be the closed interval bounded by the extreme points in (x0.Ax) is continuously differentiable at least 2n times.SEC. k = 1..1. Since G(x) vanishes in n + 1 different points. G"'(x) vanishes 2n . We shall now estimate the magnitude of the error and construct the following function: G(x) = f(x) .1).2. . x.1 times in I.S F(x)4.2. Thus we have G12"($) = f*)() - S S.xk)Lk(x)'f(xk) . n.2.8) it follows that f(xo) = P(xo) + F(xo)' .2 times in I. S = f(xo) . n. x. . We determine a constant S such that G(x) vanishes in an additional point x0. Hence xo can be replaced by x. Thus Hermite's interpolation formula takes the form P(x) _ t { 1 . (8.2.2.6) Hence G(xk) = G'(xk) = 0.7) F(xo)' As in the Lagrangian case.. x%.. . G'(x) vanishes in the points x1.P(xo) (8. in 2n points in all.2LY(xk)(x .. (2n)! This relation is trivially correct also if xo = xk. . . x. Zk(xk) = 0 (8.

. x be n + 1 given points.3.. x1) f(xo. x.f(x0) = f(X1. x1. . x0) . aircraft industry) this is done by a flexible ruler.3. z.3. Then we define the first divided dif- ference of f(x) between x0 and x.3.X1) . If two arguments are equal. (8. x2) _ f(Xo.. (8. .. Xk) k f(xP) = P=0 (XP . 170 INTERPOLATION SEC. . and then one demands continuity for the function and its first and second derivatives in all the given points (nodes)..3) By induction it is easy to prove that f(x0...1) X1 . .xP_1)(xP . .. 8. .. x0) = Jim f(x) . x . In this respect the divided differences offer better possibilities..3. x . we can still attribute a meaning to the difference..xP+1) . . (XP . Let x0. one for each interval.2) o and in a similar way the nth divided difference: f( xo. f(x0.X0 and analogously fc____ 0 ) = - r+1 arguments r! .. spline.. (8. . Divided differences If one wants to interpolate by use of function values which are given for non- equidistant points. From this we see that f(x0..4) For equidistant arguments we have f(x0.) = ..)= 1 df0. . x ) = f(x x . --f(x0. h* n! where h = xk+.x...x0)(xP . x X..f(x0) = f (xo) s_:o X ..x0 Analogously.. the second divided difference is defined by f(x1.we have..) = f(x) . .: f(xo. . for example. x*)X. the Lagrangian scheme is impractical and requires much labor.Xk) (8. (x. Xo x .3. . 8..xk.x. tical applications (shipbuilding. Numerically spline interpolation is usually performed by computing a series of third-degree polynomials... x . xk. xk) is a symmetric function of the arguments x0.

3.). and adding we find f(x) = f(xo) + (x . Xo.. 1.x. + (x . f(xo.x2) . xo. xo.. This is Newton's interpolation formula with divided differences.. and so on. Yx- From the defining equations we have f(x) = f(xa) + (x .f(x.x. .xo). For a moment we put f(x) = P(x) + R.. EXAMPLE Find a polynomial satisfied by (-4.. x3) + (x . x1. x. X1. fix. . x1. x0.x.3. we also get f(X. x1. x .x1). x1) + (x .(x . .) . X1.) ..28 -14 0 5 10 3 2 13 2 9 88 442 5 1335 . x. .5) where R . x) f(x. (x . . we have f(xk) = P(xk) for k = 0.. (. x0. 9).6) (n + 1)! . Xo. x. .xo)(x . xo) f(x.x. Since P(x) is a polynomial of de- gree n and.3. .. X1. x1. and (5. . xa) jI"_o(x .) .. . R.-1) . X.. 2.. X1) + (x .. . f(x. fix. .x0) -f(xo. x0. further.f(x. x y -4 1245 -404 -1 33 94 . X. x0.. Xo) = f(xo.=o We also find that f ("+1)(c) (n+l)!.). n.) .. x1. (2. 8.. x2) + . . (8... (x . f(xo.xo)(x . R vanishes for x = xo. .Xo) . x) = f(xo.x. DIVIDED DIFFERENCES 171 SEC. 33).1. 1335). . (8..xo)(x . X1.) .) -f(xo. . and clearly P(x) must be identical with the Lagrangian interpolation polynomial.. Hence R = f %+1)(S) 11 (x .) . Finally..x.. Multiplying the second equation by (x .. d f(x.x1) -f(x.) . and finally the last equation by (x ..xo)(x .. (0. . 1245).x. . .x. 5). . the third by (x .

.660886 1.Y0(X1 . .t2) is to be computed for x = 0.622528 1..) = y and 10. 8.645508 -142 0.652416 1..) = y.2) = 3x' ...X) .. .X Xo .X212)(1 .1(x) x. .X1 14....30 1. 1 . . .648276 1.X) . Hence nth degree interpolation can be performed by n(n + 1)/2 linear interpolations.644537 1. Next we form 10. .172 INTERPOLATION SEC.5x3 1 6x1 .640000 1.2.9(X) x3 Y3 I-.(x2) = y..1.2(X0) = x... X.608049 -1142 0.X) _ x1-xo .1(x.645567 358 0. k = 0.) = yo and 10.Xw-1 we have I o .641119 -642 0. and first we form the linear expression Ioa(X) .X Obviously.1.x.3. n. The practical computation is best done by a technique developed by Aitken The different interpolation polynomials are denoted by I(x). In general.w\X) - X) .3(X) 10.1(X)(X2 .) f(x) = 1245 .404(x + 4) + 94(x + 4)(x + 1) .Yo(X1 .(xk) = yk.40 1. this is done by aid of the scheme below. (Note that this scheme contains divided differences. it is easy to prove that if IO.14(x + 4)(x + 1)x + 3(x + 4)(x + 1)x(x . 1 Yo x1-xo Y1 X.4142.645714 1.645571 1.1.(x) x2 Y2 Io..1.Xo) = Yo I.45 1. Conveniently.I d.Xo) .2(x)(X1 . . ..35 1.14x + 5 .685750 1.50 1.-X1 Io..x and observe that Yo(X2 .x) = 1 10. From a table the following values are obtained: x y = K(x) 0. Io.3(X) 10..1o.1W x2 .3(x) EXAMPLE K(x) = 11 dt J o l/(1 .645563 858 .645954 1. Y1 I0 . 2.(x(. X) X .x X.. xo Yo X.Y1(Xo .(x.2(X) 10.

day. it is much simpler computationally. For example. On the other hand. On one hand. d`yo . = E32y. since d = EP.=PY. and so on.4. DIFFERENCE SCHEMES 173 This interpolation is identical with the Lagrangian. 8. for example. = Fly. we observe a very characteristic kind of error propagation which we shall now illustrate.4. Difference schemes A difference scheme is constructed as shown in the following example..SEC.=b'Y5I. we also have d = E1"28 and hence. Note that the difference scheme is exactly the same and that it is only a question of notations what the differences are called. but it has two essential ad- vantages. d2y. We obtain the following dif- . When working with difference schemes. we have d'Y. where it is s. Yam We see directly that the quantities dkyo lie on a straight line sloping down to the right. Y dYo dy. we have the following picture: Yo- dYa. and so on. d'yo Y2 dzY d`Y0 dY: d'Yk data Ya d2Y' d`Y1 4y3 J. it gives a good idea of the accuracy obtained..... 4y. j 2Y3 Y4 4y. Y d d2 JS d' 0 1 1 14 15 36 16 50 24 65 60 81 110 24 175 84 256 194 24 369 108 625 302 671 1296 In general.Y. = Fly. on the other.81y. = 8y. Consider a function which is zero in all grid points except one. Finally. we have. In this way we find that the quantities 82ky lie on a horizontal line. = 17ya. and we infer that the quantities Fky lie on a straight line sloping upward to the right. 8. for example. d2y.

. 8. xo. in the worst possible case. where we have assumed as large variations as possible between two consecutive values. 6 -4s 166 -2s 86 -6 46 -16s 26 -8s 6 -4e 166 -2s 86 -6 46 -16s 26 .. 0 0 0 / 0 0 0 6 0 0 e / 0 0 6 _6z 0 6 ... ference scheme. Y. The propagation of round-off errors is clearly demonstrated in the scheme below..2h. we can obtain a doubling of the error for every new difference introduced.. xo . xo + h.5. and in such cases the picture above should be kept in mind. xo . y.56 0 6 -4s 156 6 . apart from the sign we recognize the binomial coefficients in the different columns. In higher- order differences the usual round-off errors appear as tangible irregular fluctu- ations.36 106 6 . Y_ Y_ Yo. of a function for x = . Interpolation formulas by use of differences We suppose that we know the values .h.166 Hence. .86 6 -4e 166 . and want to compute the .26 66 -206 36 -106 0 6 -46 156 0 6 56 0 0 0 0 0 0 6 0 \ \6 6 -66 0 0 0 The error propagates in a triangular pattern and grows quickly. . 8.2e 86 -6 46 .5.174 INTERPOLATION SEC.. Gross errors are easily revealed by use of a difference scheme. xo + 2h.

. An alternative derivation can be made by use of factorials..1).+(2)dpyo+ Pn)'j.n) _ (n "y(- (n P 1)) +f An analogous formula in G can easily be obtained: yn = (E-1)-'yo = (I ..5. been discussed in Section 7. where in general .p.+.. In the sequel we shall often use the notation q = 1 . its validity has. G)-Pyo =yo+pl7yo+ P(P2 (8.5.)+azptz>+. we have yn=E'yo=(I +d)°yo=yo+(Pi )dyo+(2)4y. to some extent. The two formulas by Newton are used only occasionally and almost exclu- sively at the beginning or at the end of a table..1) This is Newton's forward-d/erence formula. INTERPOLATION FORMULAS BY USE OF DIFFERENCES 175 function value y. ' (since k I+dk Ek p(n) = y« Consequently. q(p) is identical with the Lagrangian interpolation polynomial and the remainder term is the same as in this case: ys=Yo +(P)dy. Putting y=ao+a.)0 = dkyo = ak.5. 8.2. and operating with dk on both sides... (8. . For a moment we put P(P) = Y.2) This is Newton's backward-difference formula.. for x = x0 + ph. Symbolically. and a whole series of such formulas with slightly different properties can be constructed.SEC. we get for p = 0: (40y. k! and hence we again attain formula (8. + (n)l d 2 *y0 ' and we see directly that .Pt. + P dyo + P(P 1) dYyo + .5..P(O) = Y.. (p . n+ 1) .I < p < 1. More important are formulas which make use of central differences. PO) = Y.

5 Let p and q (with or without indices) denote even and odd functions of S. INTERPOLATION SEC.2 + 01(8)y1 (Steffensen) . = Yo + Slia(8)y_I. 176 Gauss I a Gauss 2 Stirling 0 BwM Everett Steffensen Figure 8. + p1(8)y1 (Everett) 6.5. Y. = f p(b)Yu2 + 0(8)Yy1. = p. = 9(3)Yo + &(8)Y1r. Y. respectively. y. The best-known formulas have the following structure: 1. y.(S)y. (Gauss) 3. y. Y. 8. = q(8)Y0 + p(p(3)Y0 (Stirling) 4. = q(S)Y0 + 0(S)Y-112 (Gauss) 2. (Bessel) 5.

y.y... the Gaussian interpolation formulas can be written in the following form: Y.4) In the third case we get 3 4 ePu = q + c cosh 2 e-Pr'=cosh U. = yo + () Sym + (i) S=yo + (p I) S'yu= + (p I) b`yo + . whence q = cosh pU and = sinh pU/cosh (U/2).. An analogous formula is obtained in the second case..3) Y.8)]. then we also have to change the sign of U. = yo + (p) ay-u' + (p + 1) 2yo +(p 3 S.u =P+(p .. 4 [cf.. we formally change the sign of 8.7)]. in order to use the parity properties of q) and ci. = Epyo = If. 8. (7.e-uft.2.l. Hence in case 1 we get e.5.5. All func- tions can be obtained by essentially the same technique. q' and (p stand for different functions in different cases. Hence. (8.5.uyo and 8 = 2 sinh 2 . (7.. Thus we obtain Stirling's . + ..+(p 42)b4yo+. INTERPOLATION FORMULAS BY USE OF DIFFERENCES 177 Naturally.2.1 5 3 [cf.SEC. and sinh pU = 2 sink U sinhpU cosh (U/2) 2 sinh U = 8 [p + (p 1) S= + (p 2) . {e-pu=p - whence cosh (p + 1/2) U a) _ cosh (U/2) = 1 + (p 2 1) 8= + (p 2) b' + . (8.. . We shall use e.

where q = 1 . since we would get an even series in 6 operating on y. (8.6'(p' .. we finally get sinhpU=p8+ (P? 4)P38 +(P 4 1156 +. Here we can use formula (7...2.ev .178 INTERPOLATION SEC.6) and (7. in order to find p. + P=(P$ 4! 4.p.i) ay..2. we cannot use (7.E-') to obtain c. S'Y0 + (P +3 1) P 33y..4)U. Stirling's formula can also be obtained by adding (8..2.3) and (8. interpolation formula. 8. YP = Y.e8 = 4(E .5) 5 Here we have used (7.4)U/cosh (U/2) and ¢ = sinh (p .1..4) 86Yo + .. and y_1..8) by p and integrate in 6: S p cosh pU S d8 = sinhpU. we obtain Bessel's formula: YP = I-Y'12 + (P .7) 3 . More- over. + (i) llb'Yl/= + (2) P 3 (P +4 l) 4 86Yii. + PpBYo + t 2. with p replaced by p . S'yo p 2(j72 + (P 2) P86Yo + ..5.2.5.8). Iee"' PU = q.5. J oshsh pU d8= Also integrating the right-hand side of (7. = sinhpU/sinh U.6 + q)le-u whence quo = sinh qU/sinh U and q?. If we change p to p .6) 4 In the fifth case we find = To + T.7) multiplied by 8 = 2 sinh (U/2)... Instead. we multiply (7.8).4. This is the important Everett formula: YP=qYo+(q 3 8'Yo+(q 5 2)81Yo+ + PYl + 1) 8'Y.2. + (p 5 2) 8'Yl + .2. On the other hand..4). In the fourth case we obtain the following system: e(P-112)U = q cosh U +01 2 -(P-1/:)U = q) cosh 2 ..c/i whence T = cosh (p . tt84Y1I2 + (P l)P 5 + . (8.5...5. (8..7) multiplied by sinh U = .5.

p.. + (1 6 Sbyq2 + 2 . The two Newton formulas are also included (Fig. . for example. for example. INTERPOLATION FORMULAS BY USE OF DIFFERENCES 179 SEC. then we have the first differ- ences. 15 p S . we shall restrict ourselves to Everett's formula. As a . not the least because the coefficients have been tabulated (this is true also for Bessel's formula). + qy0. taking into account only first. Putting j(P + i) S2iy1 + (q + i) S2iyol .S p < 4. When doing this. and further because even dif- ferences often are tabulated together with the function values. p0(p) = py. The two Gaussian interpolation formulas are of interest almost exclusively from a theoretical standpoint. q0(1) = y. and so on.. e-n' = 1 eu/2 .5.8) The structure of all these formulas can easily be demonstrated by sketching a difference scheme. Finally we have the sixth case: ePu = I + c.(2 4 P) a.cosh (p -1)U 00 sinh U and 01 _ cosh (p + J) U .and second-order differences.. 24S2y (8. . and hence q0(0) = y0. Everett's formula is perhaps the one which is most generally useful. e-UI2 . 176). analogous conditions prevail in the other cases._° 2i + 1 2i + 1 J we have for n = 1. Stirling's formula is suitable for small values of p. from Everett's formula.8). 8. we shall also consider the remainder term.2.n .cosh (U/2) sinh U Observing that sinh U = 2 sinh (U/2) cosh (U/2) = S cosh (U/2) and using (7.5. In many cases one wants just a simple and fast formula. Hence y. Such a formula is easily obtained.. for example. and Bessel's formula is suitable for values of p not too far from J. we obtain Stefensen's formula: P) (2 4 P) S'yli2 + (3 P) yn = Y. (8.(3 6 P) Sby-1I2 . whence _ cosh (U/2) .(1 2 P) y-.5. where the different quantities are represented by points. Steffensen's formula might compete if the corresponding conditions were fulfilled.Y-1/2 . . The column to the left stands for the function values.°e-a12 + 3e'/2 .5.9) Last.Sly and neglecting higher-order terms. by putting 32y0 = S'y. 8.

1 which takes the values Y-"+1.. n. 1.S. for example. yo. 0..-1(n + 1) = k=-. n. H.1)r (2n)! _ E. n. . 1)(P+n-2). and the remainder term is also the same.-1(P). (8. .-(n. Hence it must be equal to the Lagrangian interpolation polynomial (-1)"-kyk (p + n ..1)! (n .(i) = y. Y1.k)! P-k By use of direct insertion. .. .... .n and i = n + 1. . and hence we need consider only the values i = ... this is fulfilled for n = 1.5. . ... we find without difficulty that y. basis for induction.+1 (k + n .180 INTERPOLATION SEC. for i = .'+1 rO r = r! (2n .n + 1. we suppose that p"_. We shall now prove that p"(i) = y... .10) .n + 1.1)hSeSxo+nh.(p) is a polynomial in p of degree 2n . as has just been shown. i = -n. . y" in the 2n points . 1. . Thus the truncated series is identical with the Lagrangian interpolation poly- nomial.. k=-.1)! (n . 0. For symmetry reasons it is sufficient to examine. n + 1.. .I .. .. and the first one has the value Sz"y1 = 8:11y = E-"(E . . 0. 1. 0....(P) = 99-1(P) + \2n + 1) 82wy1 + + 1) Sxryo 99.(p-n) P. the second one is zero. . 1)"_k(n -k+ 1)"-k(2n)! yk k=-"+1 (n + k ..1)!(n .k + 1)! " (-1)r-1(2n)! r=o r! (2n . Hence we get p"(n + (n) i Yo"-"+1 = Y"+1 (2n)! 0! which concludes the proof. Then we have 2n(2n k)! ( ) p.+1 (k + n . . Hence we obtain the complete Everett formula: I 0 2r + 1 r=o 2r + 1 s2ry1+(P+n- 2n (x.. i = -n + 1. First we observe that (2n P.r)! Of the two remaining terms in q"(p).1)2"EYo 2n + 1) I) rf (2n1 r (. 1.r)! Yr-"+1 The two sums are of different signs and cancel except for the term r = 2n from the latter sum. . . the case i = n + 1.

. The modified interpolation formula then takes the form 1) 3. Y. thereby reducing the computational work on interpolation.6. 8. &Y CJ1 or after simplification: [ E(P) = P(124 P) (2 . .y: Lyo=s'Ya-CB4yo. Here we shall consider only Everett's formula. we find.p) . do not differ appreciably.Yo If the error is denoted by e. We now introduce so-called modified differences 82.. yv = py1 + qyo + (P + (q 3 1) B. When p varies from 0 to 1. between 0.15 and 0. we find 3 2) E(P) = [(q 5 + C (q 3 1)J S. Putting a = 12C .p)(p . Throwback By throwback we mean a general technique of comprising higher-order differ- ences into lower-order differences.p2) 8`y .6. and we also restrict our- selves to throwing back the fourth difference on the second.12C + p . 2 = PY + Apt 1) (8y .20. 32 8"y. . The relevant terms in the formula are of the type PY+(P 3 1)Szy+(P 5 2)8`y+.. taking into account that q .a) _ (p2 . which still remains to be chosen.2.1 .SEC. THROWBACK 181 8.Yo +[(P 2)+C(P3 (8. we examine the function T(P) = Al .P)2 + a(p2 . Supposing that 81y varies so slowly that 81yo and 31y. the factor (4 .p: E(P)= ( (l -p)(2-p)[4-(1 -p)2-C 6 20 + P(1 .. Differences modified in this way are frequently used in modern tables.-C84y1.=b2y.6)(1 + P) 4 20 P2 . and we will replace this factor by a constant.4 20 soy) + .6. namely.1) 5 Here terms of sixth and higher order have been neglected.p2)/20 varies only slightly.p2 .

6.00 where M = max (IS'YoI' If the fourth difference is absolutely less than 400.6.1). this function has one maximum and two minima in the interval 0 < p < 1. We now want to choose a in such a way that the maximum deviation is minimized. The throwback technique was introduced by Comrie. 8. as was done when C was determined. Figure 8.182 INTERPOLATION SEC.6 For sufficiently small values of a. gives the same value for C.1)/2 and C = (3 + 1/2)124 = 0. the error will be less than half a unit in the last place. as is easily inferred.2) The error due to this approximation is obtained by use of (8. and the modified second differences are defined by the formula 3'y=81y-0. 8. . larger values can be accepted.21/T) Ii'YI 384 and IElmax < if Is'YI < 1119. in [2]. this occurs when all maximum deviations are equal apart from the sign. If. More details on this matter can be found.18393y. for example. incidentally. At first he worked with Bessel's formula.6.6.1839. The error curve then gets the shape shown in Fig. we can suppose that the fourth differences are equal. which. and a straight- forward calculation shows that it is less in absolute value than 0. A simple calculation gives the result a = (1/2 . since IF'Imnx=(3. (8. He also examined how the method works for higher differences. and.

Subtabulation In many cases a function is tabulated with intervals h.8. It is quite clear that y is not interpolable for small values of x.2) +(r . for a = 0. (8. a very accurate formula was often employed for obtaining a small number of key values. the function z = xy is very well behaved and offers no difficulties for interpolation. In order to perform the interpolation. 3r(81r2 + 441r + 580) 4.2) + 3(r . . Another example is presented by the function y ettdt. or .1.. the corresponding dif- ference operator. 4.: r(a24 1) [4(a ... ))(a-2)d.. we want to subtabulate the function.e-.SEC.I)2]d'+.+. although we need values at closer intervals ah. SPECIAL AND INVERSE INTERPOLATION 183 8. the function y = (es .1 Leasy 20 16000 It is now to calculate d.1)(a . and d. = X II 9r d+ 3r(278r00 49) dz . + . afterward one has only to divide by x. we find d. in the last case.1)(r-2)(a. that is.. Let E. These computations were easy to check.6. On the other hand. Special and inverse interpolation Certain functions vary too rapidly for interpolation along the usual lines to be possible..7.1)(a ...)/sine x. d. Then we have E.. be the shift operator associated with the interval ah.1) Usually we have a = 4. we need different powers of d.1)(a . . and with these values we construct a new difference scheme from which the desired function values can be obtained without difficulty. Consider.3) + 4(r .}. for example. subtabulation was mostly used for computation of tables by use of a desk calculator. since the key values necessarily had to be reproduced.8.1)] ds + r(a 2 1) d + + r(a48 ') (2(a .. Then. d. . 8. which formed the basis for subtabu- lation in one or several stages. = Ea and dl=(1+4)rt_ 1=ad+a(a2 I) 41+a(a. that is. In the past.7.2)(a . Nowadays the method is mainly of theoretical interest. 8..

. the function z = y + log x can be interpolated without any complications for positive values of x. which becomes infinite for x = 0. First we determine a value p. 8. We are now trying to find such a value x between x0 and x.10-6. Hence e.p. a similar technique can often be used. we obtain the equation Y=PY. and in . Hence log (x + h) = log [x(1 + h)1JJ = logx + log (I + h \ LL x \ x log x + x 2x' + The procedure for inverse interpolation is best demonstrated by an example. 184 INTERPOLATION SEC. (b) y = sin x and y = cos x are usually tabulated together. the formula sin (x + h) = sin x + h cos x will give a maximum error of one unit in the ninth place. 2' In larger tables the interval is usually 10-1.10-". Sometimes the special properties of a function can be used for simplifying the interpolation. we will obtain about 13 correct figures.8. This value is inserted into the 8!y-terms. + (I .10-" < h < 5.xo)/h by p. and hence 1h3/3! I < 2. = x.y.). (c) y = log x. sin (x + h) = sin x cos h + cos x sin h 2! \ 3! / If the interval is 10`1.y"+(3 5P)8'Y0 This is a fifth-degree equation in p. where y is a given value. Let a function y = f(x) be tabulated together with 8Ey and 81y. A few examples will be presented. we denote (x . As is easily inferred. As before.. We have. 3 +(2 3P)3.+ 5 P)Y. In such cases the regularity of the differences of different orders is used as a test that the method is fairly realistic. and we will solve it by an iteration technique. we can choose h such that . Even if one has a function whose analytical character is unknown. + 8=y.)yo y. and using Everett's formula. (a) y = e*.1. for example. while the 81y-terms are neglected. from the equation p. and if the series is truncated after the h=-term.+k= +h+ + . + h that f(x) = y.5. As a matter of fact.

2y" + Y"+. Y . . We assume the convergence to be geometric. -.+(1 . 0. that is.9. from PRY.Yo-(P'S 2) &Y.-(3 5PI)34Yo If necessary. the procedure is repeated until the values do not change any more.Y"+.Y.Y. from the equation I8-y. The procedure described here can obviously be carried over to other interpo- lation formulas. Y.Y3. Subtracting we get Y .SEC.Y"+. 8. and YY" =h+o(h). EXTRAPOLATION 185 this way we obtain a value p.Y.) = o (hl-) .(P' 3 1)3'Y. 8. y-Y"=ah"+e"h where s. converging toward a value y. There are.9. Y"-. This is often expressed through y-Y"=ah"+o(h"). _ Y . .Y(Y"-1 . Suppose that we have a sequence Yo. Extrapolation In general it is obvious that extrapolation is a far more delicate process than interpolation. _(2 3P2)'..=Y-(P'3 (' 3P')bYY0 Next we obtain p.+(I -PN)Y. however. ..Y" = o (h) Y . one could point to the simple fact that a func- tion might possess a very weak singularity which is hardly apparent in the given values but nevertheless might have a fatal effect in the extrapolation point. y y . For example. First we shall discuss Aitken-extrapolation. and hence Y"-. a few cases when a fairly safe extrapolation can be performed implying considerable gains in computing time and accuracy.Y"-. 0 when h -.- P. =h+o(h) Y . From this we obtain Y-Y"+.PN)Yo=Y .

144600 .. The procedure can then be repeated with the new series y:. [2] Interpolation and Allied Tables (London.(dEy. The table below contains an error which should be located and corrected.9..h2+a.+.. [3] Jordan: Calculus of Finite Differences (Chelsea. ..2) Using the expression for F(ph) we can eliminate the quadratic term. Thus we form G(h) = F(h) F(2h) F(h) = F(O) 4a.y. [4] Steffensen: Interpolation (Chelsea.9. (8. 1947).3) -i - The procedure can then be repeated: H(h) = G(h) .4) Richardson's technique is used. .128350 3.160788 3.. or 3 y = y. .9.+. .176908 3. an integral.h)2 = O(h°') we get y= y.168857 3. 1944).152702 3. for example.68 0. Assume that we have F(h)=F(O)+a..h'+agh°+. (8.. by the aid of approximations obtained with different interval lengths.9. New York. + y. 1950).) + o (h. . New York. 1956)..65 0.62 0.64 0.y' + 0(h-+') y. = y. x f(x) x f(x) 3.2y% I y.67 0.) .136462 3.1) a yn-.-.66 0. in connection with numerical quadrature (Romberg-integration) and for solving differential equations.. for example.2y. (8. We shall also very briefly discuss Richardson extrapolation. as a rule we then take p = 2. G(2h) 24 _ G(h) I = F(O) + 64aah° + .120204 3.61 0.+..-.h* 20a8h8 .60 0. = -ah*-'(1 . .186 INTERPOLATION Since y.. Usually one wants to compute in a finite process a certain quantity.112046 3.+.63 0..+..-. REFERENCES (1] National Bureau of Standards: Tables of Lagrangian Interpolation Coefficients (Washington. (8.. EXERCISES 1.

5. Find the constants A.8 f 0.0 3.001 0.74) when .83059 0.00806 64890 -872716 -383 6.57351 08675 0. (9. + Bfi + Chf-1 + Dhf.. + Cp 1) 2) z s assuming that the series converges. (8. From the beginning of a table.00158 05620 -888096 -396 0. 7.13515 0. + 2x2 + x3 _ f(xI . .00180 as accurately as possible.46 -0. The equation x' . 8. Use the formula to obtain f(2. f(x. and D in the interpolation formula f(x0 + ph) = Af-. and (10. and x. x x2. The function y = x! has a minimum between 0 and 1. The function y = f(x) is given in the points (7. 3. x2.log (x!) a2 x 84 0. Find the value of y for x = 9.55438 29800 0. Show that _ x. using inverse interpolation (for example.54483 33726 0. in the vicinity of an ex- treme point x0.EXERCISES 187 2.15x + 4 = 0 has a root close to 0.003 0. Find the abscissa from the data below.26253 5. with Bessel's interpolation formula). The function y = expx is tabulated for x = 0(0. x. x3) x0 4 4f(x1. A function f(x) is known in three points. 1).+p = y.-. Find the maximal error on linear interpolation. 1). + R . using Lagrange's interpolation formula.56394 21418 0.47 +0. 3). B. 4.) Use this formula to find x° when the following values are known: x . + ph) is denoted by Show that y. C. + p dy.6 3.01)1. x2) + f(x2. 9).3.58308 91667 Find the function value for x = 0.005 0.002 0. Obtain this root with 6 decimal places. the following values are reproduced: x f(x) 0. as well as the order of the error term R.004 0. 9. 3.

1785. The function has an inflection point in the interval 0 < x < 0.6) 9. Use the formula to compute A(1.17946 (6. f(0.82948 (7. 11.4542 Find y for x = 0. A function f(x. Find y for x = 0.05 1 0.561136 -0.04 I 0.02 0.3.02 48.04 23.4342 0.5) 14.9591 2. 6) 15.61294 (8. Find an interpolation formula which makes use ofyo.80018 Find the function value for x = 7.243031 0.106126.81019 (7.6813 2. f(0. .0168553. It is known that the function y = Ai(x) satisfies the differential equation y" = xy. b.4 < x < 0.4 1.01 98. becomes correct to the highest possible order.'. y = 5.=ayo+byt+h2(eY0'+dyi').03 0. 14.1950.4492 0. 15. y. (up to sixth-order terms).5.135292 and Ai(1.03 31. 13.33222 (8.05 18.5 1. 6) 12.5) 9.4679 2. yo'.1) when Ai(1. Find its coordinates to three places..0341.00 0.2953 .5) 11.0) = 0.8) = 0. The following data are given for a certain function: x y y' 0. The function f is defined by f(x) = (e-'It) dt ..1378.2) = 0.". Determine the constants a. x y 0.4) 8.0218502 and f(2. and d in such a way that the formula y. 10. x 0. y) takes the following values in nine adjacent mesh points: (6.089618 In the interval 0.4) 14. . Find its coordinates as accurately as possible.01 0. y.6) = 0.3547 2.188 INTERPOLATION f(2.4) 11.0378. y has a maximum. [(0. The function y is given in the table below.1) = -3. c.31982 (7.98981 (6.6.3) = 0.2. 12.06 y oo 4.7775 0.4392 0.1706. For a function y = f(x) the following data are known: f(O) = 1.2) = -3.0379 3.yo°. 16.97257 (8. The function eee Y dt =J=t is given in the table below. and y.554284 0.

..x. .) + (x .)(x . 21.99 and extrapolation to x = 1.xo xo)(x ... A function Wp(x) is defined by f(x) . E < X'- 18..xo)z(x . The function y = f(x) is supposed to be differentiable three times.98. In a table the function values have been rounded to a certain number of decimals. According to Steffensen.)f"(F): X. . x. for is = 3. . Determine C . One wants to compute an approximate value of Yr by considering regular n-sided polygons with corners on the unit circle. x. In order to facilitate linear interpolation one also wants to give the first differences. Determine F.. and 0.-lim L log (1 . .. xa)s R(x) > where R(x) = J(x .x) I log g2 +'f(x) > by explicit computations for x = 0. the following formula is valid: (P(-X0. x1... The function f(x) = x + x2 + x' + xe + x" + .) .is defined for -1 < x < 1.) = f(xo. 0. can easily be computed..2xo + x1) f(xo) + (x . or to give rounded values of the exact first differences.g(xr.=o Prove this formula in the case n = 2. and 6 and extrapolate to oo taking into account that the error is proportional to 1/n2.g(x). . (x1 . x.x.95. 20. for such values of n that the perimeter P.xo)s f(x. Prove the relation (x .) - .EXERCISES 189 17. Determine which is best: to use the differences of the rounded values.. x1. 4.xl.. 19.x1) f xo) f(x) _ . < x.

. I am going to believe him.1) 2 In general we have D' = h-*[log (1 + d)]{.2 = 4. either in a grid point or in an interior point. P. The function fix) is supposed to be analytic and tabulated in equidistant points.. where the following expansion can be used: ak [log (Ik!+ x)]k -_'k - n! x" The coefficients a`. From the relation d = ell .l (9. The first case can be considered as a pure differentiation problem. P. WALDENSTROM.1.t"' again are Stirling's numbers of the first kind... The relation follows from [log(I+x)]k=[dk(1+# dt a-o dk tiR> x dtk n! o-o x° dk a(w)tP -k n! * Famous Swedish religious leader of the nineteenth century. the second as a combined interpolation and differentiation problem.. we obtain formally hD=log(1+d)=d-J4Z+ d'-. Our problem is then to compute the derivative. 190 .Chapter 9 Numerical differentiation If even the devil states that 2. or fi(x) h (dix) JZf(x) + d'f(x) ..

. (9. d* r + l d + (r + 1) (r 2) 4P + ..5): y= 1P E 2(.. (a* + . .2..1 .4).6 (SPY.) . Ca'r + a.Y (9. Dy h f. This formula. 2h [Y.4) gives instead y...+1fX) dflx) r+ I + (r d.3) or (7. . we prefer expressions containing central differences.6) The second derivative can be obtained by squaring (9. (-1)" (2n + 1)! Szn+Iy h [[ay PS3Y + 30 IetY 140 f1 y f 630 1-18. (9.Y-i .+Pf(x) . however.2.5) 24 + 640 7 68 S'Y + .1)n (n! )P SPn+Py h "=o (2n + 2)! 3-Y . Hence we obtain +u D' = h-. contains only odd powers of $ and is useless except for the computation of the derivative in the middle between two grid points. (9.+a. +u / d.3'y_. or P11(x) _ h_.5) or directly from (7.+u fc. + + (r p...2. we find .3) + a)(r + 2) Analogously... Then we can apply (7.. .7) hP I SzY 12 S+y + 90 Sey 560 sgy + 3150 .] ..l . .a'r+ 1 V' }'f(X) c..+Pftx) + . Formula (7. (9 .(x) = h...y h "_a 21n(n!)2(2n + 1) (ay 39y 3."P'fx) .y . the first formula gives y' = Dy = I (-1)"(2n)! aPn{. 4) +)(r 2) As before....(a.SPY-1) + 30 (S'y.NUMERICAL DIFFERENTIATION 191 (since we get no contribution if n < k).>+P.2. .

(q2 . It is easy to understand that.9) a4Yo 140 ady.7). should be considered as a "difficult" operation.Y. (p= .4)(1 .p): (P(P23! (g(g23! Y. In particular we have d (p(p2 . . significant figures are lost in numerical differentiation.n2)) .. which is differentiated with respect to p(dx = hdp.6) and (9.1)! (n + 1)! (2n + 1)! From this we get still another formula for the first derivative: ()!) n a2*yo) Yo = h [. (9.. q = 1 .192 NUMERICAL DIFFERENTIATION Combining (9. .4)) a yo + . 6 a2Yi - 3 a2yo + 30 a'Yi +1 . _ dq 1)) a2Y° + dp (P(P2 - (P2 . we find (_ 1)k (2k +!)1)! h2k+1 hy' W (2n(+)1)! p Zn+1y + We shall now also seek the derivative in an interior point. ...1)(q 2 .. = h [Y . (2n + 1)! (n ..Yo + L ((2n)+ n 1 [ 1 1 h Yi . The main term in the computation off (x) is.f(x . 2h and the two numbers within brackets must be of the same order of magnitude Hence.. numerical differentiation. Later we shall see that the reverse is true for integration.Yo + dP 1))a2Y.8) For higher derivatives we can continue differentiating with respect to p or q. 2(1 .1 . contrary to analytic differentiation. 105 aBYo + .1) . We start from Everett's formula. as a rule. of course.vi .9) .4)\ (q(q2 ...h)J .f'(x) = [fix + h) .n2)\ (n!)2 dp (2n + 1)! a=o (2n + 1)! d (q(q` . (1 n2) dq (2n + 1)! 9=. we can obtain desired expressions for higher deriva- tives.1) ... For example. 5 dq (9.

007827) + 30(0. A function is given according to the table below.000079) + ] = 0.0 0.487968 0.540302.54030.86672 0.6 1.505942 0.0.1 0.717356 -7167 65971 -660 0.2 0.2 2.932039 -9313 31519 1.0 1.40 1. 1 Table for Ex.3 0.55 1.35 1. REFERENCES [11 Milne-Thomson: The Calculus of Finite Differences (MacMillan.891207 .8 0. EXERCISES 1. London. Find the x-coordinate of the minimum point.50 1.10022 0.521525 0. 1933). 2.0.7 0.4.65 1.2 1. 2 x y x y 0.88737 0. Find the derivative for x = 0.2 < x < 1. x y 6 8' 8' 8' 0.388686 1.000087 .95063 .85937 0.90940 0.98730 0.841471 -8408 85 49736 -496 1. Table for Ex.418083 1.0.8 1. Find the derivative in the point x 1.644218 73138 0.9 0.963558 y'(1) = 012 x0.783327 -7827 79 58144 -581 1.783327 + 6 (0. Tabulated value: cos I = 0.5.EXERCISES 193 EXAMPLE The function y = sin x is tabulated in the scheme below.467462 0. The function y .4 1.4 1.f(x) has a minimum in the interval 0.45 1.444243 1.60 1.008904 .891207 -8904 87 40832 -409 1.

02 0.17967 59168 1.397132 0.35 +0.03384 90002 85. S.2 2.96 0.94 0.98 0.2 3.45 +0. A function y = y(x) is given according to the table below.18256 20761 0.05 -0.4 2. Find the constants a and b from the table below.n2)y = 0. The function is a solution of the equation x2y" + xy' + (x2 .8 3.50 +0.03461 98696 85.896384 0.06 0.02 0. 4 Table for Ex.17553 40257 1.65563 6.164936 0.770736 0. DIFFERENTIATION 3.40 +0. y is a function of x satisfying the differential equation xy" + ay' + (x .03307 53467 85.8 1.6 2. where a and b are known to be integers.02549 3.20 -0.b)y = 0.73036 2. where n is a positive integer.01 0.194 NUMERICAL. A function y = y(x) is given in the table below.00 0.19756 3.25 -0.00 -1.10 -0.17689 06327 1.15 -0.03538 78892 85.601987 0. Table for Ex.338744 0.95532 2.084255 0.73309 3.17827 10306 1. x y 0. x y 85.30 -0.811898 The function satisfies the differential equation (1 -x2)y" -2xy'. 5 x y x y 2.973836 0.18110 60149 1. 4. Find the second derivative for x = 3.0 3.585777 0.00 0. A function y = y(x) is given in the table below. Find the parameters a and b which are both known to be integers.17420 05379 2.45693 3.03 0.000000 0. Find n.0 1.03229 89750 . ax2y+by=0.04 0.33334 2.04 0.

. cosxdx. also taking into account the case when the integration interval is infinite. 1 + 1 . it turns out that we obtain a very poor accuracy unless we take very small intervals close to the origin. = 1. The only ex- ception is when the integrand contains a so-called weight function as factor. First we consider I = So (cos x/V) dx. I + 1 . Introduction When a definite integral has to be computed by numerical methods.8090484759. We shall here give a few examples of the treatment of such "improper" integrals. 77 X In the last integral the integrand is regular in the whole interval [for x = 0 we get 1im.-.Chapter 10 Numerical quadrature There was an old fellow of Trinity who solved the square root of infinity but it gave him such fidgets to count up the digits that he chucked math and took up divinity.cos x)/V) -. in this case a reasonable singularity can be allowed for the weight function. and hence the inte- gral has a finite value.Jfd-dx+-+ 0 xx _ X3IY 2! X 7IY 4! X11IY 6! [2X1/4 L . ((1 .. The reason is that higher derivatives are still singular in the origin. and hence we ought 195 . 10. and at first glance we would not expect any numerical difficulties.0. but the singularity is of the form x_n with p < 1. However. (a) Direct series expansion gives I.2! 5/2 + 4! JC3/Y X99/2 /4 . There are several possible ways to compute this value.6! 13/2 xl3/Y + Jo 1 = 2 . In this case we can write I= ' dx S11-cosxd=2. The integrand is singular in the origin.01. 5 108 4680 342720 (b) The singularity is subtracted and the remaining regular part is computed by aid of a suitable numerical method. it is essential that the integrand have no singularities in the domain of interest.

The last integral can be computed numerically or by series expansion.809048476. =---- direct computation by use of a suitable numerical method will give no difficulties.k 1 1)1.x-/2 dx + cos x ... We can also employ series expansion: I 2 2r1 i 0 1 t 2! 10+216 + to 4! . The integrand has a loga- rithmic singularity at the origin which can easily be neutralized. (c) The substitution x = t' gives 1 = 2 S' cos t= dt. (2) the numerical accuracy is improved considerably. (k 1 1 we easily obtain _ 1 k+: + . we find I.. // = 1. (k + 1)2k! .0. Hence.1 + x2/2 dx Vx -___77 J1 = 1 8 + (I cos X + X*/2 a x . 17.C' 4 + 18 96 . 3! + 13. in the latter case we obtain I=2cos1+4\ \ 5 . The resulting integral has no singularity left in the integrand.7! I +.J' 11 dt which is exactly the same series as before. As our second example we take I = Su e_z log x dx.. and even all derivatives are regular. 0 (1-x+ xp-- 2! 3! xdx. For example.196 NUMERICAL QUADRATURE SEC..5! I .. From Jo xk log x dx = (-k k+11 (log z. 1! 1 . 1 9 .1 . )0 yX We gain two advantages in this way: (1) the integrand becomes much smaller. tx: 6! + 9360+. Expanding e-9 in a power series. (d) Partial integration gives I = [21/-x cos x]o + 2v/-x sin x dx .. to employ a more refined subtraction procedure. we can write instead 1 . 10.

0.E. In Chapter 1 we encountered an important method. After some simple calculations. As our third example we choose I= f' dx Jo 1/x(1 . We divide the interval into two equal parts. ex- pansion in an asymptotic series. INTRODUCTION 197 The sum of this infinite series can be found by aid of a purely numerical technique (see Chapter 11). I ° e'dz So ze'2' + 1 Through the substitution a-' = t. however. using partial integration.7965996 . Another obvious method consists in making the integration interval finite by a suitable transformation. and make the sub- stitution x = t2 in the left and 1 . we get 1F dt + I_2 ( 1 1 Jo 1 -t2 1+t= 72--t- 21/20 2 VT dz 1 1 l 2-z2( 2+z2 + 4-z$/ Now the integrand is regular in the whole interval. namely.(x) ett dt . for example. where r is Euler's constant (see p.t2 in the right subinterval.(1) = -0. we get I = -r .ttlog r where the integrand is regular and the interval is finite.x) when x is close to 1. it is possible to get rid of both singularities in one step.x . 10. we will briefly touch upon the case where the integration interval is infinite.SEC. Take. In this case. . Putting 2u2 X= 1 + 1l0 we obtain I=21(1 du to 0 1T Finally. 233) and E. the integral is transformed into I = ' dt 0 1 .x2) The integrand is singular as 1/1/ x at the origin and as 1/12(1 . Alternatively.

(x . S:... P(x) dx.)(x . As is easily proved.. 198 NUMERICAL QUADRATURE SEC. In prin- ciple we meet the same problem as in interpolation.xk_.X. However. In general. and clearly we must replace P x) by a suitable function P(x). k=0 where (x .) .X0/)(X . 10.. we shall generally treat integrals with regular integrand and mostly with finite integration interval. the reader should compute the interpolation poly- nomial explicitly and then perform the integration directly.1. and from now on we shall preferably consider this case.x.Xk+1) .: Lk ds . Hence k(k.1).X. Later on we shall examine what can be said about the difference between the desired value. If the given x-values are not equidistant..D" n k=o JO Finally. the function values are available in equidistant points..(k_n) and we get 1 P(x) dx = nh Y. . (xk .Xk_. Likewise. S=o f(x) dx.. obtaining dx = hds. Here we have indicated a series of methods for transforming integrals with sin- gular integrand or with an infinite integration interval to integrals with regular integrand or finite interval. where f(x) is a function whose numerical value is known in certain points. (10.(1)(_1). 10. Cote's formulas We shall now treat the problem of computing SQ f(x) dx numerically.) . we shall not discuss the technique of differentiating or integrating with respect to a parameter. Throughout the remainder of this chapter.x0) Ck yk . putting (I/n) Sa Lk ds = Ck. .)(Xk ... we can write the integration formula * P(x) dx = nh t Ck yk = (x . they satisfy Ck = Cn_k and Ek=u Ck _ 1.X..1) JsO k=0 k=0 The numbers Ck (0 < k n) are called Cote's numbers. and the numerically computed value.xk+t) . For completeness we mention that perhaps the most effective method for computation of complicated definite integrals is the residue method.) Lk(X) = / (Xk . it rests upon the theory of analytic functions and must be ignored here.1. (xk . Thus we assume a constant interval length h and put P(X) _ Lk(x)yk..XO)(Xk . and further we put x = xo + sh.1.) Here xk = x0 + kh. (X . which is usually a polynomial.

1. 'Jo f=2. Error estimates We shall now compute the error in the Simpson formula. For n = I we find Co = C.2) =o T -61 Ys 1 This is the well-known Simpson formula.1.xo) (6 Yo + 4 Y1 -} . (10.1 C. that is. We suppose that f1v(x) exists and is continuous in the whole integration interval. 6 J22f(x) dx (x2 .l(-h)] + 4 f(0) + 3 [.f"(-h)] . Then we construct JAkf(x) F(h) = 3 [f(-h) + 4f(0) +f(h)] ./'(h) -f(-h)] We note that F'(0) = 0.3 [f(h) -f(-h)] .SEC. s(s . the trapezoidal rule. F111(0) =0. . F"(h) = 6 [fl(h) + fl(-h)] . Differentiating with respect to h several times.3 [f(h) +. we can use the mean-value theorem of differential cal- culus on F"'(h): F. It should be used only in some special cases.(h) = 3 fa Iv(oh) 1<0<1 . F111(h) = 6 [f"(h) . dx. Since f1° is continuous...1) ds = 1 . 10. COTE'S FORMULAS 199 EXAMPLE n = 2 c=0 1 (s (s-1)(s-2ds=(-1)(-2) C1 = 1 s((-1)) ds = 4 S. we get F'(h) = [f(-h) + 4f(0) + fh)] 3 + 3 [f(h) +f(-h)] _ . Further. F"(0) = 0 . = 1.

These conditions are fulfilled in (10.1.t)2 3 and hence t)= F(h) = g(O'h) So (h z 3 t= dt = h' 90 g(0'h) with 0 < O' < 1. Now we use the first mean-value theorem of integral calculus.200 NUMERICAL QUADRATURE SEC.it)g(t) a dt = SbJ(I) di a (a < < b).j1°(e) = 3. + .0f/v(e).a). and since f'° is continuous.. 0 .1.6) Similar error estimates can be made for the other Cote formulas.) + f19($. and then we can define a continuous function g(h) such that F111(h) = 3 g(h) (10. Integrating (10. = O f1°($.a.. -1 < 0 < 1.1.4) implies F(0) = F'(0) = F"(0) = 0 and further the validity of (10. (10. . Thus F(h) = 90 -h < ¢ < h . jb. where.1.) + . provided that f and g are continuous and that f is of the same sign in the interval.3) directly. we have with m < M trivially m < S < M.4) Formula (10. (10.1.3). a<E<b. (10. 10.5) over n intervals..1. we get F(h) = Sa (h 2 t)y 3 tzg(t) dt .1. and we obtain finally F=(b-a) .5 10-4(b . of course.a)e Using (10. But 2nh = b . The coef- ficients and the error terms appear in the table below. we get a)s F = (b2880 .1.4) with I(t) = t=(h . we also have S = f11'(g) with a < < b.5) Putting h = j(b . g(h) = f "(0h).=1 f S. + F.3) Obviously.1.0(h) is a function of h..1. + n Putting for a moment (1 /n) E.. we obtain F1 + F.

"1 NC. The error must be less than 6.(-) NC. The method de- scribed here is usually called Richardson extrapolation (or deferred approach to the limit). The remainder term R' is fairly complicated. Instead. J0 Choosing y = el and h = 0. 1.10-' (b . We observe that a Cote formula of order 2n + I has an error term which is only slightly less than the error term of the formula of order 2n. In practice.a)' f" (i: ) 5 288 19 75 50 50 75 19 3.0.2. I + e = 0.. but on the other hand. Using two intervals to compute S. Denoting the exact value by I. we get the result 0. + Y. especially for hand computation. then the error will become about 210+2 times larger. dx/x. the error is estimated directly only on rare occasions. it must be used with caution and discrimination. The relation is known under the name Weddle's rule and has been widely used.69325. . + 216y. + 41ye1.6.a)1f°111(¢) For example. the coef- ficiepts are very simple.") NC.a)/840)8ey3.69325.1 = 2.5 10-' (b . we obtain: s°+eA s0 y dx = 20 [(YO + Ye) + 5(YI + Ye) + (y.69317 (compared with log 2 = 0.1.'*) NC. + 272y. we use the fact that if the interval length is doubled in a formula of order 2n.69315). . with one interval we get 0. 201 SEC.69444. COTE'S FORMULAS Table of Cote's numbers nN NCO(w) NCI(-) NC.a)ef" (e) 2 6 1 4 1 3. one will obtain an error which is about 16 times larger.3201169227. 10-7 (b . .a)Of1v (¢) 4 90 7 32 12 32 7 5.2.29 el. we have ee y dx 8 [4ly. For this reason the even formulas predominate. + 216y. + 27y. + 27y. If we take Cote's sixth-order formula and add ((b .a)'f91 6 840 41 216 27 272 27 216 41 6.") Remainder term 1 2 1 1 8.69444 . we have approximately I + 166 = 0.10-7 (b .3 10_2 (b . = 11 10-9 compared with the actual error 5 10-0.) + 6Y3] + R' .4. we get from this formula 2. com- pared with the exact value e s .4 10-11. by doubling the interval in Simpson's formula.32011 6928.a)°fly (E) 3 8 1 3 3 1 1. and hence I = 0. obviously. 10. 10-10(b .. Hence.

b 16 c= 1 15 15 15 If this formula is used for several adjacent intervals. a-2c= a-4c= whence a= 7 . the following modification of Simpson's formula.Ys. 10.1.. . In the latter case the error is 211 times less than in the former... 7y. Brun. and we are left with the final formula y dx h (7yo + 16y..63233368 (Simpson).-Yo) ro+a+b L=. EXAMPLE i foe Z dx (1 + 4e-ml2 + e') = 0..a +6 b (Yo+4y. The result is the same as though Euler- Maclaurin's summation formula were truncated (see Section 11. ..cyl) .+Y. using three ordinates in nonequidistant points..10-6. If we want to compute an integral. may be used: ydx. yl. A similar correction can be per- formed for the trapezoidal formula.63211955 (Simpson with end correction). So e= dx 30 (7 + 16e'"° + 7e') + GO (-1 + e-1) = 0. + 14y. . a + 2nh = b. and expanding in series we find 2a+ b=2. where b . 10-*. error: .a = 2nh and y6. .*) 15 4 + 15 (Yo . suggested by V.01 . y. the y'-terms cancel in all interior points. f . + byo + ay1) + h2(cy'_ .1.)+ b -3 a (Y.. error: 2. a + h.2). Simpson's formula with end correction We shall now try to improve the usual Simpson formula by also allowing derivatives in the end points. Putting +A y dx = h(ay_..202 NUMERICAL QUADRATURE SEC.. denote the ordinates in the points a.) + 0(h6) . -F 16y.13.

a)' 2 3 2 -1 2 3.1.14 11 1. Suppose that we start with the trapezoidal rule for evaluating So f(x) dx.. Table of coefficients of open-type quadrature formulas n N NCi') NC.9 10`6.. is the same result as obtained from Simpson's formula.") NC. is identical with the Cote formula for n = 4. we obtain 0.1 10-'(b . C.^' 1 m>2. compared with the actual error 2..A.-. The error is now proportional to h8.. Since the error is proportional to h. 10."' Remainder term 1 2 1 1 2. Now we make use of the fact that the error in B. B. we stop when two suc- cessive values are sufficiently close to each other. 0. compared with the exact value 1 . The remainder term is 3. but the formula is not of the Cote type any longer. Further details of the method can be found in [5] and [6]. + B. the coefficients can be computed as above. incidentally. Romberg's method In this method we shall make use of Richardson extrapolation in a systematic way. .6988037.. The same process can be repeated again.. 203 SEC. Using the interval h = (b . + 26ya . 63_' '" m>3.4 10-7 (b .a)' fv'(e) For example. Since the error is proportional to h. 10-' (b . we obtain an improved value if we form B = A + A..1 10-0 (b . 10-2 (b .14yi + 11y3) -3A y = 20 If Sot e-= dx is computed in this way. we get open-type formulas.a)' f v(C) 4 20 11 -14 26 . is proportional to h.8. = B."' NCs"' NC. . .2.a)' f v'(e) 5 1440 611 -453 562 562 -453 611 7. If this is not the case.14y_. and form C.a) 2-°'..6988058. we denote the result with A...1 10-6. This formula. we form D. since the end-point ordinates also enter the formulas.a)' f `v(e) 3 24 11 1 1 11 2. B'. we have approximately: r+ 3A dx (1 l y-. As a matter of fact."' NC. COTE'S FORMULAS The Cote formulas just described are of closed type.

D. B.204 NUMERICAL QUADRATURE SEC... i.1 Cos tx2. _ r fi-1 sin tx. 1 IA.69314 74777 4 0.. _ Efw-. A special quadrature formula of the Cote type for this case has been obtained by Filon: .f{x) cos tx dx = h [a(sh){ f. cos tx.. .9(th)S. + 4 th*C.cos 0) e3 e= J .75 1 0.. and one such case is ob- tained if the integrand is an oscillating function. The functions a..] .1..69325 39682 0... a = 1. E" f(x) sin tx dx = h Ca(sh){ fo cos Ix.sin 201 O= 0' (B) = 4 (sin B ."-. Filon's formula As mentioned above.69314 79015 0.."-." + fo sin txo] i=o 2 " S. We find the following values: m A.R. sin ix. If. b = 2.70833 33333 0.69317 46031 3 0.. 201 19(0) = 2 (1 + cosy 0 . However.69444 44444 2 0. E 0 0._.-..R". C.. sin tx..69314 71805.t J=1 5. + 4 th'S'"-...69702 38095 0.. numerical quadrature is "easy" compared with numerical differentiation. ._1 .69314 71943 0. cos tx.-..] .. C'"-1 = Ef1i.. _ E f. 10. f(x) = llx.69314 71831 0. sin tx. and r are defined through 1 !!ILO 2 sin: 9 eq. sin tx. ." + fo cos txo] .". i=1 i=1 2nh=x.i-.-. . + r(th)S. cos tx.69339 12022 0. .8..fo sin txo} + $(th)C:_ + r(th)C..f.-._1 .} to + .69315 45307 0. COS tx.69412 18504 0.. t-o 2 C.69314 76528 0. EXAMPLE.. S."-x."-1 = f:i. there are exceptional cases.. . Here C..69314 71819 The value of the integral is log 2 = 0.

out any special conditions and make the following attempt: Jb w(x)f(x)dx = A. Gauss' quadrature formulas In deducing the Cote formulas. xb < < x2r 10. and the weights were determined in such a way that the formula ob- tained as high an accuracy as possible. Lk(x) .. x are still at our disposal.1). namely. Hence we can write f(x) = P(x) + F(x)(ao + a.. as well as the abscissas x.x' dx. . are at our disposal.xk FF(xk) Now we form the Lagrangian interpolation polynomial P(x): P(x) = itk=1Lk(x)f(xk) (10. 10.2) and findf(x)-P(x)=0for x..f(x1) + A.SEC. but as soon as they are given.. the weight constants Ak are also determined.x + ae + which inserted into (10.x2 + } dx _ Ek=1Akfixk) + R.1) Here the weights A.2. (10. . We shall now choose abscissas with. = 9U nhb flv(E) + 0(th') . Further. finally. we only suppose that w(x) Z 0. 1 -0 Here the abscissas x x. we made use of the ordinates in equidistant points.2. + R. (10. . FF(x) k=1 Fk(x) = xF(x) .) +. Identifying. F(x) _ fl (x . .. so far. gives Sb w(x)f(x) dx a k C1a w(x)Lk(x) dx) f(xk) + Ja w(x)F(x){a. + a. w(x) is a weight function.x + a. which will be specialized later on. We shall use the same notations as in Lagrange's and Hermite's interpolation formulas. GAUSS' QUADRATURE FORMULAS 205 The remainder term.f(x..2. we obtain Ak = w(x)Lk(x) dx .2.xk) .2. ..3) R= w(x)F(x) E a. has the form R.2.

We take as example w(x) = 1.. we get n equations which can be solved without any special difficulties. we can use the first . at least for small values of n.xi + sqx .s.2Lk (xk)(x .206 NUMERICAL QUADRATURE SEC. 1 (2n)I First we will consider the error term.S.xk)[Lk(x)]` dX .5) a k-1 k=1 where Bk = w(x)[ 1 . = S° w(x)(x .2. a = 0. and we choose the following: K r=0. Since F(x) = jj (x . J a C1. (10. and since w(x) Z 0. (10.2.2. Integrating Hermite's interpolation formula (8. (10. = 3.. and s3 The abscissas are obtained from the equation x3. x.8873. putting (x .1.2.. 1.xk)] [Lk(x)]E dx .2.5.9) we get w(x) f(x) dx = E Bk f(xk) + t Ck f (xk) + E .5+1/0..2. b = 1.xk). and we find x1=0.6) a E = Ia w(x) f! T [F(x)]2 dx .2 x2+ 5 x-20=0.15=0. and n =/ 3. In order to get a theoretically more complete and more general discussion we are now going to treat the problem from another point of view.5) and (8.x.5. For determination of the abscissas we can formulate n more conditions to be fulfilled.=0.2.4) K J6w(x)F(x)x'dx=0.n. xz)(x . s. with the solution s.15=0. 10.X3 .)(x _.0.x3) . x3=0.1127. The equations are: S3 = 0.

)L. the abscissas x x .(x) of degree n .. but now we shall ex- amine whether they can be chosen in such a way that the n constants C.4)...1. For determination of the abscissas. (10.2. Thus R. Further we see that Ak in (10.2.1. It is also easily understood that the n powers 1. 77 < b and further is a function of x... Rewriting B..x2) .4) is satisfied.7) E = (2 > 7) Here a < Q. = 0 which are equivalent to (10. w(x)L..1 must be orthogonal to F(x) with the weight function w(x).. .5) transforms to rn w(x)f(x) dx = E B.. [see (8. vanish. ..2.2. but this is necessary if we want explicit formulas..2L' . . . a We can say that the n polynomials L.2. treated already by Gauss.(x) dx = J6 w(x)[Lk(x)]2 dx . xe-' must also be orthogonal to F(x) in the same way.2.. (10..2.)C. .k = w(x)[Lk(x)]` dx ... Again.. J t a The abscissas x x2..1..x.x.. in (10. are >0.ix.x. F n! dx° ..)(x . we get directly Jn B. but the calculation is done more conveniently by use of the Legendre polynomials.. Then (10.) + w(x)[F(x)]2 dx .2.2. 10.2. x2. First we transform C. So far..7) is that the relation (10.(x)]L.e. we have the equation r+1F(x)xldx=0 E r=0.3)]: C. = Sa w(x)[(x ..SEC.3) can be identified with B. GAUSS' QUADRATURE FORMULAS 207 mean-value theorem of integral calculus: Ja w(x)[F(x)]' dx . this equation allows the direct determination of F(x) = (x .n..2.(x.. These are defined by Po(x) _ 1 d* (X2 _ 1)n Po(x) = I .6). xw are now determined in principle through the conditions C.9) If n is not too large. .: A. So far we have not specialized the weight function w(x).1. (x .(x) dx = 0 . = J6 w(x)[L. vanishes if f(x) is a polynomial of degree <2n.(x)]2 dX . xA are not restricted.). x. (10. We start with the simple case w(x) = 1.. the condition for the validity of (10. The natural interval turns out to be -1 < x < 1.(x) dx = F(xk) w(x)F(x)L.8) a a An important conclusion can be drawn at once from this relation: all weights A. a Thus we have the following double formula for A.

2.. dx'-' If r = n. r=0.11) 2+ 5[P(x)1'dx= 2n 1 (10 .2) . 10. (10. Then S±..2.10).R(x) is a polynomial of degree m. which vanishes n times in x = -1 and n times in x = + 1. . because +1 X.1) times in x = -1 and (n ..2. for example. m < n.+1(n!)3 (2n + 1)! Now we can compute +1 [P `+ d' (x))= dx = ( 1 (2n)! X% + . .11) follows from (10.1)'dx=0 -.2. 2 ..1.1)' is a polynomial of degree 2n. (10.1. . since if.208 NUMERICAL QUADRATURE SEC. Differentiating once.10) P (x)P (x) dx = 0 m # n..n.2. 2 (2n + 1)(2n -.1)' dx 2'n! n! / 2'n! dx" (2n)! 21.1)"dx = rx' rx' (x1 .1)' dx 1X- de-1 J 1 dx'-' r! d* (x`.22'(n!)3 (2n + 1)! 2n + 1 since powers below x' give no contributions to the integral. we obtain a polynomial of degree 2n . Hence we have the following relations: 5xP%(x)dx=0..1. Further. 12) The relation (10. x'P (x) dx = 0 if r is such an integer that 0 5 r < n. once in a point .2.) 1 (xs .1) .1) times in x = + 1. P. we see that (x= .+1(n!)3 2 . .. and according to Rolle's theorem. 3 2z.f o /! = 2n! cos!'+' T dT = 2n' 2n(2n . where every power separately is annihilated by integration with P (x).. which vanishes (n .. we obtain: -x2)'dx i .

if we can assume that f"*'($) varies slowly.7) and (10. after somewhat complicated calculations. however. As before.15) [P*(xk)]' I . = f`:*!L C' `+' 2'*+'(n!)' [P.(x) k+1 From (10. we can apply Richardson's ex- trapolation to increase the accuracy. P. P3(x) = 2 (5x' .xk Finer details are given. Hence it is inferred that the polynomial P*(x) of degree n has exactly n zeros in the interval (-1. for example. p. P... transform Ak to the form Ak = 1 2 ..xk Using the properties of the Legendre polynomials. Hence A. that is C = 2*(n!)2/(2n)!.. .2.2. (10. (10. 321. .13) The constant C is determined in such a way that the coefficient of the x*-term is 1. and changing to s.9) by putting F(x) = CP*(x). . in [4]. P. x* from the equation P*(x) = 0 ..k+1 k p . we obtain the error term R. we can.2. Since R.(x) = 1 . .2.70x3 + 15x) .2.14) P. Higher polynomials can be obtained form the recursion formula Pk+.(x) = 2k + 1 xPk(x) .1) . + 1). The first Legendre polynomials are P.(x) = x ._. Using (10. A similar argument can be repeated.3x) P. the number of "interior" zeros increases by 1.2.8) we obtain for the weights. and for each differentiation.(xk) J-1 x .= 1 (+' P*(x) dx.2.30x' + 3) .(x) = 8 (35x' .(x) = 8 (63x' .(x) = 2 (3x' .2.16) (2n)! J -' (2n + 1)[(2n)! ]3 In this form. 10. = a* f halving the interval with n unchanged will give rise to a factor 2'*. Ak =+1 Lk(x) dx =+ [Lk(x)]= dx .12). the error term is of limited use. Hence we obtain x x2. where C is a constant. Thus it is clear that we can satisfy equation (10. (10.(x)]2dx= J (10.2. GAUSS' QUADRATURE FORMULAS 209 in the interval between -1 and + 1. SEC.

17132 44923 79170 ±0.77459 66692 41483 0.00000 00000 00000 0. the error term will decrease by a factor 22%.k 2 ±0..xk Hence the formula z flx) dx o e = Ai k-1 Akf(xk) becomes exact for polynomials of degree up to 2n .2.1.a)/2]1°.(n. a suitable linear transformation must be performed: If(x)dx=b-aIt k A f(b-ax +b+al+R 2 R. 2 k-l 2 k Hence f zs'(E) will obtain a factor [(b .55555 55555 55556 0.210 NUMERICAL QUADRATURE SEC.2. and in the special case when the interval is halved.. We also note that Richardson ex- trapolation can be performed as usual. If an integral has to be computed between other limits.2 Abscissas and weights in the Gauss-Legendre quadrature formula n xk A.86113 63115 94053 0.30 x .2. we obtain the equations F(x)e-zx'dx=0 for r=0.90617 98459 38664 0..23861 91860 83197 0.93246 95142 03152 0. Using the same technique as before.65214 51548 62546 5 ±0.47862 86704 99366 0.23692 68850 56189 ±0. while the error term (10. Davids.2.56888 88888 88889 6 ±0.46791 39345 72691 The table reproduced above has been taken from a paper by Lowan. The first equation in (10.00000 00000 00000 0..88888 88888 88889 4 +0. Next we consider the case where the weight function w(x) is =e z.66120 93864 66265 0. 10.16) is mainly of theoretical interest.33998 10435 84856 0.17) _ 1 F(x)e-0 dx F'(xk).1.57735 02691 89626 1.1) (10.2.00000 00000 00000 3 ±0.36076 15730 48139 ±0.53846 93101 05683 0.17) fits the Laguerre polynomials L (x) = ex ds (e zx%) .18) . For prac- tical use this rule seems to be the most realistic one. (10. and Levenson [11 in which values are given up to n = 16.34785 48451 37454 ±0.

39507 09123 01 0.28994 50829 37 0.(xk)J= (2n)! A table of xk and Ak is given below.(n.53662 02969 21 0.64080 08442 76 0.07594 24496 817 7.20) xk[L'.71109 30099 29 2.19) S0- The abscissas x x .1.. .2. where values are given up ton = 15.2. 2-+1 .ez2 (10.03888 79085 150 9.39866 681 10 83 3.08581 00058 59 0.32254 76896 19 0.. n! t// [ [H-(x012 (10. defined by n (e_z) H.22) R.41421 35623 73 0.00053 92947 05561 5 0.35741 86924 38 4.1).58578 64376 27 0.59642 57710 41 0. Then we obtain the abscissas as roots. Abscissas and weights in the Gauss-Laguerre quadrature formula n xk Ak 2 0. . (10.14644 66094 07 3 0. and further it can be shown that the weightsand the remainder term can be written Ak = R (n!)4 = (n!)z J (10.26356 03197 18 0..41577 45567 83 0.SEC..27851 77335 69 6. 10.(x) 1) .2.74576 11011 58 0.01038 92565 016 4 0.60315 41043 42 1..00002 33699 723858 The table above has been taken from Salzer-Zucker [21. = n! 1/ -1r 2°(2n)! . x are obtained from the equation 0..29428 03602 79 0.00361 17586 7992 12. GAUSS' QUADRATURE FORMULAS 211 which satisfy the equation for r=0.21) .. The weights and the remainder term are A.2. Last we consider the case w = e-x'.2. of the Hermitian polynomials.41340 30591 07 0.52175 56105 83 1.2.85355 33905 93 3.

x .. The values listed at the top of p. 2.2. x. . .X1 fits the Chebyshev polynomials T . . all weights are equal. the weights are not integers. The weights take the amazingly simple form Ak = z . as a rule. we shall discuss the case w(x) = I/i/I -. Summing up. 7-17 xY where -1 <E < 1... S+m e-==f(x) dx = 2*+1 n! I/T J lxk) + n! V k=1 [Hw(xk)]Y 2*(2n)! x1.24) . the interest in Gaussian methods has grown considerably in later years.(x) .(n. k = 1. (10. x .xk)[P*lxk)1z (2n + 1)\\[(21n)! 19 x1.x= in the interval (-1.1. at least if the integrand has to be computed at each step.cos (n arccos x)... x . The reasons are essentially twofold: first. and hence a time- consuming multiplication has to be performed. 213 have been taken from Salzer. As a matter of fact. and Capuano [3].x2 n k=1 2n 2Y*(2n)! We will devote a few words to the usefulness of the Gaussian formulas.1 1 I ....212 NUMERICAL QUADRATURE SEC.23) n that is..(E) o xk [L. we have the following Gauss formulas: +1/(x) + 2Yn+l(n!)4 dx = 2 fixk) 1YY*'(E) k=1 (1 . 1). roots of L*(x) = 0. .2. 10. Second. since from T*(x) = 0 we get xk = cos ((2k . x* roots of P*(x) = 0 . (x) J+1 fix) dx = n t f (cos 2k .*(2n)! (10. . x* roots of H.. e-slp J x) dx( = n. Here we can compute the abscissas explicitly.. ..2. n. If an automatic computer is available.1)ir/2n). neces- sitates interpolation.1 r) + 27r . Finally. which. these reasons are no longer decisive. The condition r+1 1 r=0. Hence the final formula reads -' f( x) dx = nkf (cos 2k2n I n) + 2:. which will be treated in some detail in Chapter 16. It is true that for hand computation formulas of the Cote type are preferred in spite of the larger error terms. Zucker.2..(xk)] (2n)! x1.1) -1 1/1 .) * Y fxk) + (n!Y fcY*. the abscissas in the Gaussian formulas are not equidistant.

01995 32420 591 6 ±0. equidistant abscissas are given in advance. we have essentially the same error term. we expand f(x1).33584 90740 14 0. . CHEBYSHEV'S FORMULAS 213 Abscissas and weights in the Gauss-Hermite quadrature formula n xk Ak 2 ±0.65068 01238 86 0. .95857 24646 14 0.80491 40900 06 ± 1. 10.52464 76232 75 0. Let w(x) be a weight function and put +1 w(x)f(x) E_11 dx = k[f(x1) + f(xa) + .29540 89751 51 4 ±0. we obtain V+ 1 w(x)f(x) dx 1 f`'i(0) dx + w(x)x' 1 f +` w(x)f`n+u(c)xn+' dx 1)! (note that £ is a function of x). We suppose that f(x) can be expanded in a Maclaurin series: fix) = f(0) + xf'(0) + 2. + f(xn)) + R. one of them (in the middle of the interval) happens to coincide with the Gaussian value.72462 95952 24 ± 1. The reason is that in the first case.. and this is reflected as an improvement in the error term. f '(O) + .43607 74119 28 0.. The corresponding phenomenon is repeated for all even values of n which involve an odd number of abscissas.94530 87204 83 ±0.3. The corre- sponding formulas were obtained by Chebyshev.39361 93231 52 ±2.02018 28704 56 0.70710 67811 87 0. We will now specialize in such a way that all weights are equal.+ n* (0) + (n x` +i) fCn+1)(f) Inserting. On the other hand. the Cote formulas constitute a special case of the Gaussian formulas: namely.22474 48713 92 0. For n = 2 (Simpson's formula) and n = 3.3.15706 73203 23 ±2.88622 69254 53 3 0 1.35060 49736 74 0. when we have three abscissas.. Chebyshev's formulas As is easily understood.18163 59006 04 +1.00453 00099 0551 10.08131 28354 473 5 0 0. Now we can also explain a peculiarity in the Cote formula's error terms.SEC..

r(r x+1 + 1)zr +1 _1 =-2 1 Y. .3. f(x«)] = k [nf(o) + f'(0) xr + 2 f '(0) E x' + .. 3z1 + 4 .1) +1 kEx. we obtain the following system: kn = w(x) dx ..2) we introduce the function F(z) and obtain: F(z) _ fl (z .3. Putting S.t tz°) +1 l z" exp ( . Our final result then takes the form F(z) (10. +1 Jk E x.xr = z" exp 4-1 log (i - r-1 \ z ) zr/ _ z* exp .J+:J xr dx 1 z) -r rzr . (10.x. t ll tz°// = z* exp (. = E x... If in particular w(x) = 1. .... = x*w(x)dx. = xw(x) dx .2) . . + (n + Identifying.214 NUMERICAL QUADRATURE SEC. we get \\ / l / +1 log / I .). 5z4 1 + .3. = k x-w(x) dx = n + x-w(x) w(x) dx. we get i S. 10.'.1. As in (8."(0) E x.E o_1 x'] tz° J // = z* exp (n +1 w(x) log (1 . f(x*) in the Maclaurin series: k[ flx1) + .) = z* fi (1 .X/ dx/r w(x) dxl .f t17 n f x° w(x) dx/J+1 w(x) dxl dx/J+1 1 z* exp (n w(x) [ w(x) dx) \ 1 L . f(x. x) dx . + n f.

we find for different values of n: n= 1 F(z)=z n= 2 F(z) = z' n= 3 F(z)=z(z2. For n = 9. that is. Bernstein. we meet the peculiar phenomenon that only two roots are real. 2. The reason for this apparent contradiction is that we have substituted (Ilk) S'. all roots are real. but another Russian mathematician... 43 3 9 567 1701 56133 For n = 1. which is the case only exceptionally.n zn-z + n5!(3 5n . z° + 0 427 Z4 _ 57 z2 53 2 560 + 22400) n=10 F(z)=z10_ 528+ 8ze-10024+ 17 zz. even for t > n.42n + 1201 z*-0 + . 10. we have assumed our integration formula to be exact. has proved that this is not the case so long as we use the weight function w(x) = 1. 7! 9 / Explicitly.3. .SEC. 119 149 z4 + z2 6 360 64 0) 4 22 148 z 43 3 45 2835 42525 3 n=9 F(z) = z (ze . all roots are real. but for n = 10. complex roots appear again. . xlw(x) dx for S.2) I n= 4 F(z)=z4.. and hence. Do any values n > 9 exist for which the corresponding Chebyshev quadrature polynomial has only real roots? This question was left open by Chebyshev. 7. we may include only nonnegative powers. Expanding in power series we obtain F(z) = z" ..3 zz+45 n=5 F(z) = z (z4 _ 5 z2 + 7) 6 72 n= 6 F(z)=za-z4+ 5 zz 105 n=7 F(z) = z (z° . CHEBYSHEV'S FORMULAS 215 The left-hand side is a polynomial of degree n in z. but for n = 8. on the right-hand side. 6 z%-4 3! n r 35n.

we have for w(x) = 1: 2 A /y S}1f(x)dx = -Eftxr) n r=1 + RA.7947 or.57735 02691 6 -±-0.37454 14096 ±0. while F(z) has the form F(z) = 2-"+'T"(z) = 2-"+' cos (n arccos z).x2.. This means that the special nth-order Chebyshev formula above has the Gaussian precision 2n . The real roots are collected in the table below: n X.. (z) = cos (n arccos z) are called Chebyshev polynomials and will be treated at some length in Chapter 16.32391 18105 ±0.42251 86538 ±0. The polynomials T. the formula r f 1 1 dx xZ = S. Summing up.60101 86554 ±0.30 p. approximately.52876 17831 +0. 2n Hence all roots are real.m. we choose a Chebyshev formula with n = 4 and find the times 12 ± 12 . as a matter of fact.m.79465 44723 9 0 ±0..216 NUMERICAL QUADRATURE SEC.1876 and 12 ± 12 0.18759 24741 ±0. 2 ±0. R. 10.15 p.91158 93077 EXAMPLE The mean temperature shall be determined for a 24-hour period running from midnight to midnight.16790 61842 5 0 ±0.3.30 a.m.o f(cos 0) d0 n j(cos 2r 2n 1 is also of Gaussian type. by aid of four readings.26663 54015 ±0. the weight k = 7r/n. n X. 9. = cos 2r .45 a.0.86624 68181 3 0 ±0. In this case we have.70710 67812 7 0 ±0. and.52965 67753 4 ±0. dx . and 9. . E x+i f(A+1V') ] (n 1)! n -1«r<x.1. 2. Another important weight function is w(x) = l/ 1 .83249 74870 ±0. What times should be chosen? Since we do not want to give one reading more weight than another.88386 17008 ±0. 2.m.1 7r . as has been pointed out before. with the roots x.

-i=0.4.5._ dp kL=o k rar ck) =hf k1 + 1dk yo- Thus we get 1 k a (k) bk (10.3. j b. (10.-j =0. (10.1) and (7.1au bs = Jr . Hence we find with x = x° + ph: s' y dx = °+ (1 + 4)1Yo dx = h j 1 f. (p> d 0yo dp J =o =o o k=o k) = dkYo `'P(k)dp k=o k! o =h dkYo `' k((k)pr k! `Io .SEC.+jb. bio = -a $Sa.+ j=0.4)].jb.x0 [cf.. Using the table of Stirling's numbers of the first kind in Section 7. bs = -sa"aia .Z b= vajr'sa . we obtain bo = I . Numerical quadrature by aid of differences The problem of determining S=oy dx by use of usual forward differences can be solved by operator technique.2) bs-'1b.4.5). NUMERICAL QUADRATURE BY AID OF DIFFERENCES 217 10. S=° log (1 + d) where h = x. _ 7 T$.1) we obtain the system of equations b.1. b. 10. b. Here we need the power-series expansion of the function z/log (1 + z).4. bs = Th b9 = 936 b. I log (1 + z) I . we can also get the desired result by integrating Newton's interpolation formula.s1Jsa3a b. = 3 . = . .zs/4 + = 1 +b.3) where we have used (8. Alternatively. (7. bs = .z+bsz2+..3. Putting z . We have 'ydx hd = JJyo Yo..4.z/2 + z1/3 .4.. The coefficients are easily obtained recursively.7YO .

.-i+ 1 2Y. Written explicitly. 720 The first term in each of these expansions must be treated separately..dYo) y" - 1 .-3 + 4Yyo) 19 I60 Ply.720 d'Yo + 19 1 00 d6Y0 . E..4.+Y.+Y2+. Yo G 4 I -.-.. Jsydx=h foE'Yodp 0 h log E Yo = h I.. + 42Yo) 19 720(17'Y. 1 42+ 1 d'..log (1-V) 1 I2 F 1F2 12 24 IFS 19P*-.d) d` 12 24 720 / 1 ... is obtained in the following way.-dYo) 24 (P'Y. (10.-dY°)13 (vY"+dY°)-._i + 12(4y.+Y.+Y.E-' E -1 E.1._3 . .. log(1 _+.4) An analogous formula can be deduced for backward differences.24 (d°Y.218 NUMERICAL QUADRATURE SEC. the integration formula has the form J:oY dx = h [Yo + dyo 12 dxyo + 24 dsYO . _ yo = E"Yo . r1 1 1 hL2 yo +y1 +y3 +.. con- veniently we bring them together and obtain Y.17) log(1 + d) But 1 __ I (1 + I2 d .1 Hence we find r 1 ydx=h[2Yo+Y.4..+1 Y°Yo+Yt+Ya+.1 .log(l .. A better result.19 44+. however._ ..d&YO) - 720 This is Gregory's formula..- 1 12(VY. 10.. + 4'yo) (d'Y.

1.4.01527 77778 Ce = . and for h = 0.00001 02160 0. for y = n-'"ze-=' between .+y.6 1.+ Y. We shall now consider quadrature formulas containing central differences: so y dx = J°y.9 1.7 1. Co = 1 = 1 C. we find h S h S 0.5) The constants c.00010 34464 0.p..s formula.00000 04014 2 1. is not used for numerical quadrature but sometimes for summation.00015 44567 ci.00003 53403 An alternative method is the following..5.11). Denoting the sum by S. as a rule.) (10. we get 1. NUMERICAL QUADRATURE BY AID OF DIFFERENCES 219 If this formula is used. 10. We start from Everett's formula and integrate in p from 0 to 1.8'+.8 1.16971 We also consider S +.08333 33333 C. independently of h.SEC.81280 49541 10954.02489 0.-.4. for example. and hence.00315 80688 C. using (7. From (7.2.) = h9'(6)py113 .oo and + oo. Fork . all differences will vanish in the limits.81280 49540.x') dx = ?jI'(}). 1 exactly. _ 71*1b 0.00000 00035 1. 1. we have 8 8= 118' _ 1918° (8) .62+c.5 1. the integration .00068 81063 -g ! C10 = -"VVV753U _ -0. Gregory. while jf'(1) = 1.5 1. Of course. = -T3 _ -0.. = 2497a 0. and we would obtain ydx=h[2yo+yi+y9+..ftU 1 12 + 720 60480 + Thus we get the following formula: h Ss'ydx= O T (co+c.81362.00000 00000 02481 1 1.2.8blib _ -0.exp(. are displayed in the table below. and the reason is that the remainder term does not tend to zero.1: = 2 c(3)(y° + y. term by term (since q = I .0.)(yo+y. this cannot be true.00000 00000 00000 0. = 2 er3s7''386aa = 0.4) we have J° = h8/2 sinh-' (8/2) = h8/ U.

(7. we get f (p + k l 2kY= t (p + k)(2k+1) 2ky1 k-o l2k + 1/ '-S (2k + 1)! k=O S _ 8'kYl 'r. here we have further assumed that h = 1. If we truncate after the third. Especially if we truncate the expansion after the second term.2..T_ 8 h h 'u 2 sinh-1(8/2) operating on y = fx). However.. we can write the result: z°+k _ l =o k y dx = 2h (Yo + ti 8 Yo . and on the right-hand side (h) S =±A y dt.fro 2k+1 l + ko 2k+1 y Treating the first term only.220 NUMERICAL QUADRATURE SEC.7) This formula can be extended to larger intervals in a simple way. it is possible to obtain a formula in which the coefficients decrease much faster.5). . / . a. we get :o+k 1 82Y. we do not obtain formulas of the Cote type any longer.2k+>>(P + k)' k=o (2k + 1)! r=1 Now the integration can be performed: 1-kr+l (p+k)'dp= (k+1)+ 1 and we obtain finally 2 2k fl (k + 1)r+l . (p+k)alkYf(q+k' 2k Y ' . Using (7.1)] and operating on both sides with p8 . term.4.6) r+l where a factor 2 has been added to compensate for the factor h/2 in (10. fourth.4..10).] 2 y dx = 2h [Yo + = 2h [Yo + 1 Yl - 1 =o-k 6 6 y° + 6 Y-1J 2h = 6 [Y-l+4Yo+Yl] that is.1. E .(2k+1) (10.E-1 h 2h we get on the left-hand side _ 8J--h J 1 1 h8 ..4. Simpson's formula. in q gives the same result): _f. (10.180 8'Yo + 1512 8°yo . .kr+1 elk = (2k + 1)! r-l a. Starting from J f ( x ) = S a f(t) dt [cf.. 10.4.

Calculate S'I' exp (sin x) dx correct to four places. I. 6. Soc. Math. .33 43. Krylov: Approximate Calculation of Integrals (Macmillan. 10. [61 Stiefel-Ruthishauser: "Remarques concernant l'integration numerique. Find Sa exp (-exp (-x)) dx correct to five places. 739 (1942). 1004 (1949). 890." CR (Paris). Proc. Find S' log x cos xdx correct to five places. Levenson: Bull. Forh. [3] Salzer.. N. 7. 1962). Zucker: Bull. Norske Vid. [5] W. 48.69 50. Filon.25 sine x dx correct to five places. 2. G.0. Res.29 46. Edinburgh 49. is a certain constant. where C.. Find Sot' (cos x/(1 + x)) dx correct to four places. [41 Hildebrand: Introduction to Numerical Analysis (McGraw-Hill. Its acceleration is registered during the first 80 seconds and is given in the table below: t (sec) 0 10 20 30 40 50 60 70 80 a (m/sec') 30. Soc.63 33. and compare with the exact value. Am. Calculate S 0'/'-/l . 8. Compute C2 to 3 decimals using the fact that the number of twin primes less than 220 is 8535. Use this for a = 100. Selsk.00 31. 48.. New York. 1 1 (1952). 4. dx/(log x)2. Two prime numbers p and q such that lp . Find 1 dx 3o e ++x . 17] V. 55." Det Kgl. EXERCISES 1. 12.75 40. 30-36 (1955). Am. [21 Salzer. Roy.. Romberg. Math.67 Find the velocity and the height of the rocket at the time t = 80. Soc. 5.47 37. 11. A rocket is launched from the ground.qJ = 2 are said to form a twin prime. [8] L.I correct to four places. Functions (1964) p. 3. 28. 9. 1899-1900(1961). 1956). Find 11(1-ez)112 dx 0 X correct to four places. Calculate So ((sin x)'12/x2) dx correct to four places. Zucker: Capuano: Jour.EXERCISES 221 REFERENCES [1] Lowan. New York and London. b = 200. Calculate Sax` dx correct to four places. The prime number theorem states that the number of primes in the interval a < x < b is approximately S a dx/log x. For large values of n it is known that the number of twin primes less than n is asymp- totically equal to 2C2 S.44 35. Davids. "Vereinfachte numerische Integration. 38-47 (1928) [9) NBS Handbook of Math. 252.

. Then test the formula on the func- tion f(x) = [(5x + 13)/2]`x' with a = -0.. f(-1) + kzf(l) + k3f(a) . 0 2 Calculate the constant a and find the order of the remainder term R.)+3f(x5)] 17. where a is a given number such that -1 < a < 1.' f(x) dx = a. k. 23. 19. 15. Find So e-' log x dx correct to five places. and so on. 18. and k. Find So e(:+' dx correct to four places. Find S. Calculate the total arc length of the ellipse x2 + y2/4 = 1. Calculate So (e--'l(l + X2)) dx correct to four places. y is defined as a function of x by the relation y=. . . Calculate numerically So (x dx/(es .Yi) + R . Find x.f(x. = y(h). Determine abscissas and weights in the Gauss formula 1..1. fh. A function y = y(x) is known in equidistant points 0. Find an approximate value of S21.222 NUMERICAL QUADRATURE SEC.1)) and S u (x dx/(e + 1)). 24. One wants to construct a quadrature formula of the type y dx = (Yo + y.. _+x) correct to three places. such that the following approximate formula becomes exact to the highest possible order: f(x) 3 [f(-1)+2f(x. y. ±2h. y dx which contains only y_ yo. 13. The result should be accurate to six places. in the approximate integration formula f(x) dx = k. log (2 sin i } dt . and x. In the interval 0 < x < ir. 22.f(xa) 26. Determine abscissas and weights in the Gauss formula Jof(x)log z dx = A. = y(2h).) + asf(xz) 0 25. 20.4. 21. 16. J // Find the maximum value to five places. 10. and y.) + A. 14.) + ah'(yo . Find an integration formula of the type r+A where y.f(x. Calculate the coefficients k.

The function Si(x) is defined through Si(x) = Sa sin t/t dt. 29. can be solved by forming a sequence of functions yo. 1 xo where h=x1-xo. C.4398 0. and the remainder term R. c. yz. and the weight function w(x) = I/ v.4594 0.25)1. and D. in the follow- ing integration formula: 01 (x-xo)f(x)dx=h2(Afo+Bf1)+h3(Cf +Df)+R. when f(x) is as given in the table below.4794 0. Start with yo = I and use Simpson's formula and Bessel's interpolation formula.EXERCISES 223 27.. 28. . Calculate sin x Xa dx o correct to four places. according to y. The integral equation y(x) = I + So ft)y(t) dt. B.4207 . y1. x 0 0. Find a Chebyshev quadrature formula with three abscissas for the interval 0 < x < 1. J 0 Find the first five functions for x = 0(0.50 0.75 1 f II 0. Find a. One wants to construct a quadrature formula of so-called Lobatto type: +1 f x) dx = a[f -1) + f(1)] + b[f(-a) + ft a)] + cf(0) + R .+1(x) = 1 + f(t)y*(t) dt . Determine the constants A. where f is a given function. 30.5000 0. and a such that R becomes of the highest possible order. 31. b..25 0.

Bernoulli numbers and poly- nomials of the first order.4) Here we have Ak = B.(0).10) the Bernoulli numbers of order n were defined by the expansion t" tk Bk. kok! Ak e"t I k et .1) (et . t k=O k! r=1 r! ! -1 224 .0. .. First we shall compute these latter numbers. could tell where it would rise to in a few days. In (7. For the general definition it is convenient to make use of a so-called generating function. For brevity.0. At this rate of increase few. The reason was that the Stirling numbers could be considered as special cases of the Bernoulli numbers. (11.kk=oo k! The Bernoulli polynomials of order n and degree k are defined by t* tk e=t = B.3) we get t = ' A.0. STEPHEN LEACOCK. that is. if any. we shall call them Bernoulli numbers and Bernoulli polynomials: t tk (11 0 3) et-1 .0.1)° k=o k! If in (11.0. From (11." (X) .m. the well marked 1/2 inch of water. at nightfall 3/4 and at daybreak 7/8 of an inch. 11. since we want to reserve the notation Bk for other numbers closely associated with the numbers A. we use the notation Ak. Bernoulli numbers and Bernoulli polynomials In Chapter 7 we mentioned the Bernoulli numbers in connection with the fac- torials. (11. By noon of the next day there was 16/18 and on the next night 91/32 of an inch of water in the hold.Chapter 11 Summation At 6 p. The situation was desperate.3. Subsequently we shall exclusively treat the case n = 1.1)* . .2) we put x = 0.1 = k=o E tk!Bk(x) (11.2) (et .0. we obtain directly B1*1(0) = Bk*'.

SEC. 11.0. BERNOULLI NUMBERS AND POLYNOMIALS 225

where
A
7=0 V (S -
Hence we have
s!a=r s
:=o i! (s - i)!
A.=E(s)A;.
:-o I

But since the left-hand side was t, we have a, = 1 and a. = 0, s > 1. Hence
we obtain the following equation for determining A,,:

Ao = 1; (s)A;=0,
i=o i
s=2,3,4,... (11.0.5)

Written out explicitly, this is a linear system of equations which can be solved
recursively without difficulty:
A0 = 1,
Ao+2A,=0,
Ao + 3A, + 3A, = 0, (11.0.6)
Ao+4A,+6A,+4A,=0,

whence A, _ - i; A, = i}; Ao = 0; A, = - _51a; A, = 0; A. = a'j; ... Using a
symbolic notation, we can write the system (11.0.5) in a very elegant way,
namely,
(A+ 1)'-A'=0, s=2,3,4,...
with the convention that all powers A' should be replaced by A;.
Now we treat (11.0.4) in a similar way and obtain
x,',.t tr
Ak B,.(x)
,=o r. k=o k! .=o n.
Identifying we get

B,.(x) =.=ot (n)
r
Symbolically, we can write B (x) = (A + x) with the same convention A' A;
as before. Summing up, we have the following simple rules for the formation
of Bernoulli numbers and polynomials:
A'-I,
(A + 1) - A" = 0 with Ak A,, , (11.0.7)
B (x) = (A + x)' with Ak --. Ak .

226 SUMMATION SEC. 11.0.

The first Bernoulli polynomials are
Bo(x) = 1 ,

B,(x) = x - 1/2 ,
B,(x) = xt - x + 1/6,
B,(x) = x' - 3x2/2 + x/2,
B,(x) = x4 - 2x3 + xt - 1/30,
B,(x) = x' - 5x'/2 + 5x'/3 - x16,
Bo(x) = x° - 3x" + 5x1/2 - x2/2 + 1/42.
Equation (11.0.4) can be written
"
ester t 1 = 1 +EB"(x)
"=1 n.

and differentiating with respect to x, we obtain
(x) t" 1

est t t"-1 = B"-1(x)
et - 1 "=1 n (n - 1)! "=1 (n - 1)!
Comparing the coefficients, we get
B'',(x) = nB"_1(x) . (11.0.8)
The same relation is obtained by differentiating the symbolic equation
B"(x) = (A + x)" .
Now let x be an arbitrary complex number. From (A + 1)i = A` we also
have
it=f(n)t (A +
1)'x"_,

= 72t (n)i A'x"-1,
or
(A -E- I - x)* -- n(A f- 1)x"-1 = (A 1 x)" -- nAx"-'.
Hence we find the following formula, which we shall use later:
(A + 1 + x)" - (A + x)" = nx"-1 . (11.0.9)
It can also be written 4(A + x)" = Dx", or, since (A + x)" = B"(x),
d B"(x) = Dx" . (11.0.10)
Conversely, if we want to find a polynomial B"(x) fulfilling (11.0.10), we have
for h = 1:
( '0' X. (n ) Akx"-k=(A + x)"
B(x) n D 1 x"=
Ak
e- k= o
k !. /) k=o k
in agreement with (11.0.7).

SEC. 11.0. BERNOULLI NUMBERS AND POLYNOMIALS 227

Writing (11.0.8) in the form B (t) = B;}1(t)/(n + 1) and integrating, we obtain

n+l -
:+1B 1) - Bx+1(x) = dB,,+1(x) = Dx*+1 X.
t dt = n+l
°() n+1
where we have also used (11.0.10). Putting x = 0, we find for n = 1, 2, 3,.. . ,

0.
The relations
Ba(t) = 1 ,
B' (t) =
0, (n = 1, 2, 3, ...),

give a simple and direct method for recursive determination of B (t), and hence
also of A. =
From (11.0.6) we obtained A, = A, = 0, and we shall now prove that A,k+1 = 0
for k = 1, 2, 3, ... Forming
t
f(t)=e'-l t_ t e'+ t t
2coth2,
+ 2 2 e'-1
we see at once that f(- t) = f(t), which concludes the proof. Hence, there are
only even powers oft left, and we introduce the common notation
Bk=(-1)k+1A,k; k = 1,2,3,.... (11.0.11)
The first Bernoulli numbers B. are
B, = 1/6 B, = 7/6 B1, = 8553103/6
B, = 1/30 B, = 3617/510 B14 = 23749461029/870
B, = 1/42 B9 = 43867/798 B16 = 8615841276005/14322
B, = 1/30 B10 = 174611/330 B19 = 7709321041217/510
B, = 5/66 B11 = 854513/138 B17 = 2577687858367/6
B. = 691/2730 B1f = 236364091/2730.
The expansion (11.0.3) now takes the form

e't =1-2+BI2 -B, 4'
s +B,6!e

1 (11.0.12)

Above we discussed the even function
t
e°t
1+2t =2coth2.

228 SUMMATION SEC. 11.1.

Putting t/2 = x, we obtain the expansion
(2z). (2x)6
x cosh x =I+ B, (2x) - B, - B6 _' ' (11.0.13)
2! 4! 6!
Replacing x by ix, we get
x cot x = 1 - B, (2x) - B, (2x)' - B. (2x)°
2! 4! 6!
or
22P
1 - x cot x = f, BPx2P . (11.0.14)
P=1 (2p)!
Using the identity tan x = cot x - 2 cot 2x, we find
2=P 2!P - 1)
x= ( ) Bx=P-3 (11.0.15)
P=1 (2p).

Written out explicitly, formula (11.0.12) and the last two formulas take the
following form:
4 10
X6 X8
e= x 1 1 2+ 12 0 30240
+ T-20-9600 + 4790160
tanx=x+ x'
+
2x6+ 17 x,+ 62 x'+ 1382 x,1+...,
3 15 2835
315 155925

cot x =
1 - x - x° - 2x6 - xT _ 2x°
x 3 45 945 47 5 93555
(11.0.16)
The first series converges for jxj < 27r, the second for jxj < 7r/2, and the last
for jxj < in. A brief convergence discussion will be given later.

11.1. Sums of factorials and powers

In Chapter 7 we mentioned the fundamental properties of the factorials in the
formation of differences, and according to (7.3.2) we have dp'"' = np(%-1'. From
this we obtained an analogous summation formula (7.3.12),
t P`., = (Q + 1)(*+1) - pc+.+1, (11.1.1)
P=r n+1
valid for all integers of n except n = - 1. As an example we first calculate

r,p'6,
P=3
=P=3rp(p - l)(p - 2)
=7u,-3(4) 7.6.5.4-3.2.1.0=210,
4 4

SEC. 11.1. SUMS OF FACTORIALS AND POWERS 229

which is easily verified by direct computation. We can also compute, for
example,
4 4

p=1
P-' =p=1
r (p + 1)(P + 2)(p + 3) 1

5'-2> - V-2) _ 1 1 1 l_ 1

-2 2 (2. 3 .7/ 14'
which is also easily verified. The case n = -1 can be treated as a limit case;
then we have to write p("> = p!/(p - n)! = 1'(p + 1)/1'(p - n + 1), giving a
definition of the factorial also for values of n other than integers.
In Chapter 7 we showed how a power could be expressed as a linear com-
bination of factorials by use of Stirling's numbers of the second kind. Thus
we have directly a method for computing sums of powers, and we demonstrate
the technique on an example.
N N
[p`1> + 7p(2 + 6pc3> + piu]
EP'
p=0
_ p=0
E
(N + 1)ca> + 7(N + 1)13) + 6(N + 1)l11 + (N + 1)(5)
2 3 4 5
= N(N + 1) [15 + 70(N - 1)
30
+ 45(N - 1)(N - 2) + 6(N - 1)(N - 2)(N - 3)]
_ N(N + 1)(2N + 1)(3N' + 3N - 1)
30

For brevity we put S"(p) = 1" + 2" + + p". The expressions S"(p) have
been tabulated as functions of p for n = 0(1)15 in [4], pp. 188-189. By repeated
use of the relation dB"}1(x)/(n + 1) = x", we obtain

S"(P) = B"+1(P + 1) i B"+1(0) , (11.1.2)

which is an alternative formula for computation of power sums.
Here we shall also indicate how power sums with even negative exponents
can be expressed by use of Bernoulli numbers. We need the following theorem
from the theory of analytic functions:

7rCot2rx=-+ 1

X

it is given here without proof (see, for example, [1], p. 32, or (21, p. 122).
We consider the series expansion for small values of x on both sides of the
equation:
7rx cot 7rx = 1 + x ( 1 + 1
x+n/

230 SUMMATION SEC. 11.1.

For the left-hand side, we use equation (11.0. 14) and find that the coefficient
of x2P is - (2ir)2'B,/(2p)! For the right-hand side we have, apart from the
constant term,
X.` (_ 1
n 1- x/n
1 1 1
l
+ n 1+ x/n
+ x +
X2
+ x' X +. XY 'r'
n'
+.../
_i n n nY n' n nY

Y
+i+-+...).
M nY n' n°

Hence the coefficient of x2P is - 2 E *_, n -2p , and we get
1

,._, nYP
-2 (2>t)'P
(2p)!!
B. (11.1.3)

For p - 1, 2, 3, 4, and 5, we obtain
T I - K',
._1 _
1 7t2 . 1
- 7t0 .

n 6 7. W 90 ' n° 945 '
1 _ 1te 1 _ n10

--Ii ne 9450 ' n10 93555
On the other hand, we have no simple expressions for the sums E*.,n-YP-1;
the formulas turn out to be fairly complicated. The sums E;=,n-P have been
tabulated in [3], p. 224, for p = 2(1)100.
From equation (11.1.3) several interesting conclusions can be drawn. First
of all, we see that B, > 0 for p = 1, 2, 3, ... Further, we can easily obtain
estimates of B. for large values of p, since the left-hand side approaches 1 very
rapidly, and we have asymptotically
B 2(2p)! (11.1.4)
(27t )'P

This can be transformed by use of Stirling's formula:

B, 4V (Y'
e (11.1.5)

Using (11.1.4) we can investigate the convergence of the series (11.0.12), where
the general term for large values of p can be written 2(- 1)P+'(t/2it)P Hence
we have convergence for f t! < 2it.
In an analogous way, we can examine the convergence for the expansions
(11.0.14) and (11.0.15). The general term behaves like 2(x/ir)" and 2(2x/it)1P,
and we have convergence for Ixj < rt and Ixj < 7c/2, respectively. On the other
hand, this is quite natural, since x cot x and tan x are regular in exactly these
domains.

SEC. 11.2. SERIES WITH POSITIVE TERMS 231

11.2. Series with positive terms

Suppose that f(x) is an analytic function and consider
S = f(xxoo) + f(x° + h) + f(x° + 2h) + ... + f(xo + (n - 1)h)

-0E')f(xo) = E*
_ (E E-1 1 f(xo)

The main idea is to consider the sum as a trapezoidal approximation of a definite
integral; hence we are actually looking for the correction terms. As usual, they
can be expressed in forward or central differences, or in derivatives. In the first
two cases, we get the formulas by Laplace and Gauss, and in the last case we
obtain the summation formula by Euler-Maclaurin.
In the Laplace case, we have

f(x) dx (1 + E + E2 + ... + E*-')W1 f(xo)
T S

h
*-1 J
E - 1 hl f(x°)
and
s-h Sa f(x) dx = E*d 1 (I - h')f(xo)
E* I
d (1 +
d))f(xo)

log

(E* - 1)(b, + b,4 + bad' + ) f(xo)
Here we have used (7.1.4) and (10.4.1). Hence we obtain
f(xo) + f(xl) + ... + f(x*-)
= h J=°f(x)dx - Ebk(dk-'f(x*) - dk-'f(xo)) + R,, . (11.2.1)

We give the remainder term without proof:
R = nh°,+,b,*+i f c,*+1,(E) ; X. <S <x*+n..
In the Gaussian case we form
S-1 *-k"f(x)dx
S
=
E*
E --1 (I
1
- :°)f(xo)
h

= EIIsBI (I - U)f(xo)
The expression S/U is easily obtained by multiplication of (7.2.9) by (7.2.10):
8 3' 178' 36750 _ 278598'
U I + 24 - 5760 + 967680 464486400

232 SUMMATION SEC. 11.2.

and hence we have the formula

f(x°) + f(XI) + ... + f(x.-1) = h

Alternatively, we consider

2 [f(x° + f(x1)] - Qf(x)dx
h

E112 (f - h°
= E1/2 Y' (1 - 8
)fPU°

= E112 (12
62 _ 1la' 19186
720 + 60480 - ) f0
l
1162 + 1918'
tea (2 _
1
_ .. (E_ 1)f,
720 60480

By summing such relations to i[ f(x._1) + f(x.)] and further adding the term
fit[ f(x°) + f(x.)], we obtain
1 f= 1
f(x°) + f(xI) + ... + f(x.) = h J=° f(x) A + [f(x°) + f(x.)]

+ Lu6f(x.) - ,uaf(x°)] lt6'f(x.) -1-i5'f(x°)l
2 720
191
11a6f(x°)] - .. . (11.2.3)
+ 60480
Both (11.2.2) and (11.2.3) are due to Gauss.
Finally, we again form

f(x) + f(x1) + ... + f(x-1) L:fxdx
E-1 (1- U)f(xo)
_ (E. - 1) (e° 1 1
U)f(x°)

= (- 12 + B. U - B2 4!-+ B2 U° _ ...) (f(x.) - f(x°))
2! 6!

. this remainder term is of little use.2)!. and we get S9 = 0. which can also be written n 2 r=1+(1 . 190.05175 53591 J f(10) = -0.2.) .) . 11.4) This is Euler-Maclaurin's well-known summation formula.57721 56650 The exact value of Euler's constant is 1 0. The sum up to n = 9 is computed directly.>(e) nB.. Instead it can be proved that the absolute value of the error is strictly less than twice the absolute value of the first neglected term... for example.). p. SERIES WITH POSITIVE TERMS 233 Adding to both sides. however.00009 25926 a#zs f"(10) = 0. ..) .) h 'o f(x) dx + 1 [f(xo) + f(x. x0 < e < x.57721 56649 015. [4].ht" . (See...f"(xo)] + 30240 [f°(x. + f(x. .63174 36767 S = -0._ n+logn-1 n We have S_(I .f'(xa)] .)] + i2 [f(x. EXAMPLE Euler's constant is defined by r = lim(1 + + 3 +. Further. the error has the same sign as the first neglected term.. If the series is trun- cated after the term B. .720 VAX.-. U2--'/(2m . provided that f:. x.05175 53591 ..+ .logn).2.. (11.log x+log(x-1))dx= -1 +9log10-9log9 io x / _ -0.. we obtain the following equality: f(xo) + f(x0 + .) If n tends to infinity.00000 00015 S = 0.00000 01993 -sa4aaf°(I0) = -0.SEC.f(xo)] .>(x) has the same sign throughout the interval (x0.00268 02578 -i-1 f(10) = -0. the remainder term is R = (2m)! f:.

we make use of the relation V = E-'4. 102+ 3.(N-))P)y0 + Y.)(19 720p PZ) (d2Y.P. + + y. = E° .P2) Replacing x by V and d.. y.1 + Y. 103+ ) Already from 8 terms we get the value correctly to 10 decimals. In the example just discussed we have J1(x 1 + log (1 .. A natural generalization arises in the case where one wishes to perform the summation at closer intervals than those used at the tabulation. we have y/d = yo + y..PZ (4Y. are known function values belonging to the x-values x.1 (1 +4)P. 24p (p2Y.-...-2 + 42Ya) . Let us suppose that yo.4.) . + .4Yo) S = I (Yo + Y. Then we get S = (1 + EP + . P V d 2 2p 112p2(17y. It may well be the case that even when the integration can be performed explicit- ly.234 SUMMATION SEC. respectively. . _ Yo (1 -p)-P-.2. . + Y.2..-s .1 EP-1Y"+Y.31.. We easily find the following series expansion: (1+x)P-1 p( +(1-P) 2 -(1-P)12+(1-PZ)24 . ._Yo)+ 1 (Yo-Y..(1 .. 11.. we find . where N is an integer > 1.. y .). denote the desired sum by S..)+Y.1 (Yo+Y. = d'y.-. it is easier to use series expansion.P2)(9 .. Suppose that the operators E. Y. S= 1 (Y. = xo + ih. As was shown in Section 10..-dyo).P2)(19 .. from which we get pry.+d2Yo)-.) .10 +2.1 2p (Yo + Y.. Further.) dx .5) 480p . d..112p P - 24p P2 (42Y. Hence we finally obtain the formula _P . and p refer to the interval h.a (2x2 + 3x3 + 4x< + -(1 2.(1 ..x-')) dx = .41. (11.) .P2) --8a . + (1 . and put 1 IN = p.. + E.d2Yo) _(1 -PZ)(9-P2)+44Yo)-. Now we want to sum they-values with intervals h/N in the independent variable.

We observe that 1impS= P-a - h 1 12-00 ydx. In some cases the following method.=Ek_' k=1 . We have s. one performs Lagrangian extrapolation to 1/n = 0.1p (yo+Y3+.3. If we now form pS and let p -. EXAMPLE S=Ek-Y.12pp(yo+y. For analytic functions we could equally well use Euler-Maclaurin's formula directly.3. 1(Y. We can. SLOWLY CONVERGING SERIES 235 This is Lubbock's formula. of course. in the latter case the derivation is performed in the following way... Enler's transformation Summation of slowly converging series is a nasty problem.v -Yo) +.. I -p P y.) E* - EP_IY0+Y. can be used.).112p2h(Y. Yo) p Y.yo') ... 11. 0. Forming a suitable number of partial sums S. -Yo) .1))(Y.pY. the formula passes into Gregory's formula. -Yo) p + 1720p` h3(y.0.6) This is Woolhouse's formula. and hence S= (yo+Y1+.LU-+BP!U6 4! 6! -. 11. Slowly converging series.SEC. the formula passes into Euler-Maclaurin's formula. k=1 S.2. and hence for p . _ p(eU1.. (11. work with central differences or with derivatives. but Lubbock's is quite useful for empirical functions. CePt'1 1 1 U B1 + Bi gs U° + P-+ B1 PYUz pU { 2 2! 4! 6! 2 2! _B.' . suggested by Salzer..). and regarding them as functions of 1/n.. 30240ph'(Y. . and it is difficult to devise any general methods.+Y.- I E1 Yo.+y..

However.(x/(l .3.05) 1.(x/(1 x))(64/2) + (x/(1 .0. u0= uo 1 -Ex 1 I -x-xd 1 U0 1 -x I -(x/(I -x))4 and hence S_ 1 I.Ex I . I . Lubbock's and Woolhouse's methods may also be useful in such cases. consists of subtracting a known series whose con- vergence properties are similar to those of the given series. + R we find in the that S'. S(0.1) A corresponding derivation can be made by use of central differences: S= u0 = UO 1 .1) = 1. Symbolically we have S=(1+Ex+E4x4+. S(0.2)1 = 1.04) = 1.(x/(l .(x/(1 .60572341. = S(1/n). Putting S...x I . = A/n' + B/n' + With S = S.. is a good approximation of S. Then we obtain from Lagrange's formula S(0) = T2[125 S(0.54976773. we shall now consider a more systematic transformation technique. and we consider only such values of x for which the series converges.x [I .x))1j8 1 1 .3. originally due to Euler.64493357.1) .46361111 S(0.59616325. we obtain S(0.x))Eu484 u0 1 1 .z(84/2) + zp.x2 ..04) .x o\ x x/ .z(1 + z)84 u0' .1. We start from the power series S = u0 + u.x I .644934066.x))(84/2) .x(1 + 84/2 + p3) 1 u0 1 .x))(84/2)14 .(x/(1 . (11.)u0.=0 1 /.d'uo. 6 special case when R. analogous to the technique for computing certain integrals described in Section 10. Another method.x))113 I .128 S(0. 11..2) . The exact value is S(0) = = 1.S(0.05) + 16 S(0. = 1(4S.236 SUMMATION SEC..x + u. .

(11. Then we obtain with the same z as before: S= 1 1 x [uo + Q1(z)huo + Q2(z)h2uo + .2 + zfc8(1 + 2(1 + z)Bz + Simplifying. .3. I-x 2 1 -x(1 -X)= (1 -X)e (1 -X)e ) +2 (1 x ((U1 ..SEC. we finally obtain S= uo + 1 1+x( x 32uo + X2 S'uo + X' 3euo . (.I )i+r (l) ks k=1 Vc EXAMPLE Consider S =..u_1) + (1 x (V N3 .. we can use derivatives instead of differences..3.3) where we have Q1(z) = z z Q'(z) = z2 + 2 Q3(z)=z3+z2+ 6. Z3 (11.2) Naturally.2u-1) X)2 x)2 x2 + (l 8'u-1) + .. SLOWLY CONVERGING SERIES 237 where we have denoted x/(l .(z) = 1! i=1 r.3. 11. S=) (l f z(1 + z)32 + z2(1 + z)284 +.) S= 1 I x {(1 .x) by z.) .. Hence.4) +24 72z Q2(z)=ze+2z4+ 4Z' +T 13 31 z' Qe(z) = ze + ? ze + 6 + 4 ze + 3 z2 + 720 The general form of these polynomials is Q.o2n +)1 (-1)"n + 2 . zi r.3.. ] a (11.

= .. Already 10 terms.0002 4609 19 1. where the constants a..0009 9458 17 1.0000 3059 22 1.3.6449 3407 9 1. corrected with a corresponding remainder term. In order to obtain this accuracy by use of the original series. The remainder term is practically equal to the last term included. The convergence is often very slow but can be improved considerably if we write instead S = f(1) + a2[C(2) . give the result 0. deviating from the correct value ir/4 with only one unit in the last place. that is.0369 2776 12 1.1 ] + For facilitating such computations we give a brief table of the C-function when s = 2(1)22.0000 0191 6 1.. 2 3(n .4)].J)") [cf.3. we would have to compute about 5000 terms. = 2(n d'u. Here we also mention a technique which is often quite useful.0000 1528 3 1. If one now wants to compute S = E. Hence 4u.0001 2271 20 1.and the sth term is (s + 1)/(2s + 3)...0000 0382 5 1. are assumed to have such properties as to make the series convergent. Riemann's C-function can be defined through C(s) =k=1E Ir and this function has been tabulated accurately.=1 f(n). s C(s) S C(s) S C(s) 2 1.00000095 7 1.1) for x = -1 and u = 1 /(n + J) _ (n .2020 5690 10 1.0000 0024 .0040 7736 15 1. the convergence is now slightly better than in a geometric series with quo- tient J. d'u..0083 4928 14 1.C(2) + a2C(3) + a4C(4) + .0004 9419 18 1.0000 0764 4 1.0823 2323 11 1.2 through 7.1 ] + a2[C(3) . .(n .+ O's + + . and from this we get (s!)2228+1 4'280 = (-1) (2s+1)! and the final result 821-1(s!)2 S_ (2s 1)! . For s > 1. . Consider f(n) = nz . (7.7855. etc.0173 4306 13 1.3. we immediately find the result S = a.238 SUMMATION SEC._o + The quotient between the (s + I).j)'-11 . We use (11. 11.00000048 8 1.0000 6125 21 1.3.0020 0839 16 1.

1 1-[ C ( 6 ).2'.. T. 2. .u_1) + 1 (8`u1 .. 3 11..11 . 1+E-eu+1uo.n' 1 2 + .1 Us -. . 2' (11.3. = 1.. we prefer to treat the case separately.(81u.326324 (5 terms)..4. L -1+ U .2) If we prefer to use derivatives..8'u_. and (11. [C(4) ._2 n' n6 + n° J = 2 + [C(3) . f sin.. ALTERNATING SERIES 239 EXAMPLES k. it is more convenient to make a special calcu- lation as follows: S u..1n = sins 1 + 1f. .1 U+B. (11.1 J .I (u1 .=0 2. . 2 2! 4! .. .4. 11.lua 2 2! 4! JJ 1 _B12 .4. = 1 U 2U u U eu-1 e!O-1 ° = Ur1 2 +B12 B.4..2).cos? = sin' I -{ A (\ n' _ 1 1 3n' 2- + 45n° 1 315n8 + = sin' 1 + [C(2) . Using ordinary differences.. 2 T. (11.(1 2 -. as a matter of fact.1]+[C(9).1).4 +. we find the formula (-1)' . uo.686503 (7 terms). we get instead S= 1 uo .SEC.J .2 duo + d2u. Alternating series This special case. Nevertheless..B.+i [u0 .8'u-1) .3.) + ..1) If we use central differences.1] + 45 [C(6) . 4U1 + B216U1 -. can be treated by putting x = -1 in the formulas (11..3).3.11 . 0.

19/+(21 +.78539 81633 974. .3 + 5 _.. compared with the exact value 7r/4 = 0...+ 2u. and put v. v..10 .. Here (2x . Hence S.1 1)' with 2x . etc.78539 81634.+u.. _ (11.+ 2u. u . Let S=u.=u.. where e > 0... + The conditions for the validity of this transformation are quite mild.+4u. = 0.v.8. for example.6 I u. + S..4.6.1)(2x -2. who trans- forms a series with positive terms to an alternating series. +. 218 and I L1 1 2 16 272 7936 S' 21+217 2 21'+21" 218+ 21" that is. _ _2. + 4u4k + 8u.1 = 21..4. 31h° UOIX + 691 h" uo.4.. ... 11. + v.240 SUMMATION SEC. +8u..6 ' u 21' 21' U'--2.+u.76045 990472.v. it suffices that the terms uk decrease as k-'-'.... . Hence we find the formula S Luo 2 uo + uo' uo 2 24 240 17h' uo .3) + 159667200 40320 725 660 EXAMPLE S= (I . An interesting technique has been suggested by van Wijngaarden..02493 82586 8. Then it is easily shown that S = v.k +. S.4.)=S'+Ss- 23 1 By direct evaluation we get S. = 0.+.. = 0..+4u8 v8= Vk=uk + 2u.4. Thus _ 1 u 2 -21=. 2 2x ..=u..

. Calculate S= 13.I)-. Find to five places: S _ I I 1 log 2 . Calculate Catalan's constant . [3] Davis.=o (. 7. One wants to compute S = i=1(1/n3). and assuming that RN = 1/2N2 + a/N3.. Compute to five places S= (2./ _3 . [21 Hurwitz-Courant: Funktionentheorie (Berlin. 1931)..sin + + to five places. 241 EXERCISES EXAMPLE S=t I k=1 )1 ( 2 4 8 . EXERCISES 1. [4] Milne-Thompson: The Calculus of Finite Differences (Macmillan. Aided by these values. one computes SN explicitly for N= 10 and N = 20 (six places).1)+(41-4 -1)-(*T.log 3 + log 4 - 4.sin i + sin # . 2.I )k/(2k + 1)2 to ten places. 8.e 'E+e "s . 1 =4] vk . Waterman Institute for Scientific Research. find S. II (Berlin. 1929). If (Indiana University. Harold T. Calculate S=I - %/T VT 77 T correct to four places. Finde' . 1935).e a+ to four places. London.15+ 113 537 +9511 correct to five places. Bloom- ington. 5. 3. The Principia Press. Contributions. 6.. 1933). Find the sum S = sin 1 .: Tables of the Higher Mathematical Functions. Putting SN = EN 1(1/n3) and S = SN + RN.1)-(-.1)+(V6 . 1+ 23 + 4' 8' J 3 Ic Hence we have S= 3 REFERENCES [1] Knopp: Funktionentheorie..

when A.f(x+1)=u.11 + 13. f '(x). (1 + k-'1') to five places. 15. and the total integral appears as an alternating series.+ 1 2 +.f(x+2)=us.. A0 = 0. = B0 = B..5 + 7.u..._. calculate sin .+(n-1)B. The series S=uo+u... for n = 0(1)12._s. .... 1 3 1 to four places. Using this method. 11..2+i + 1. -u. is supposed to be convergent.x o log (I +x) to three decimal places. P n' -P for p (four places). 16.242 SUMMATION SEC. 17. = 1..A /B.15 + 17.. Further. 18. Compute (n'/n!) by use of factorials.-.1)A. A. and use the values for computing lim. Calculate explicitly A. Find S (1/n) arctan (1/n) to five decimal places.-s 1$=B.4...1 3.3 + 5. = A.. 11.. Compute 1 1+i + 21. 10. Compute k-6'' correct to six places. (four decimal places).7 979 1 1 _ 2 I + 11. Integrals of oscillating functions can be computed in such a way that each domain above and below the x-axis is taken separately. Show that the series _ 1 _ 2 1 1 _ 2 S 1./B. 14.3+i +. fx)=uo. f'(x). 17 _ 2 S 3 = (2n -+i Then compute S correct to four decimals. 12. up to terms of the fifth order. + (n . Find jjk_.+u. Compute 1 1 P2 + 41 8+ 9 1 P2 +. 13. Find S expressed in f(x). .+ u6 . 9. 13 15. f jx) is a function which can be differentiated an infinite number of times.

24.%. 20..' where a. 3. . Using series expansion compute So (x/sinh x) dx and So (x/sinh x)$ dx.EXERCISES 243 19. 1 /a.. xdx/(e' .-_./3 + z4/4 +. are the positive roots of the equation tan x=x taken in increasing order. where r is Euler's constant.. 22.1) and S. a2. Putting z = E. 21. prove that z4/2 + z. Using series expansion compute S.9. k''. . .. a3.q')'1 for q = 0._ I . Compute jj*_1(lOn . Find . xdx/(e' + 1).9)/(Ion ..r.1)(tOn . n= 2.5)2.. Compute IT =1(t . 4. 23.

yo + vk) has to be computed.1 For performing differentiation and integration (which. one encounters essentially greater difficulties than in the case of just one variable.Chapter 12 Multiple integrals The Good Lord said to the animals: Go out and multiply! But the snake answered: How could I? ! am an adder! Working with functions of several variables. As in the case of one vari- able.. Usually one prefers to perform first a series of simple interpolations in one variable and then to interpolate the other variable by means of the new values. ±2. is sometimes called cubature). ±1. . If. From 244 . Then the desired value is obtained by one more interpolation in x. for example. one can keep y. Already interpolation in two variables is a troublesome and laborious procedure. 10 6 2 5 h 11 3 O 1 9 7 4 8 12 Figure 12. we conveniently use an operator technique.. one can use the operator technique and deduce interpolation coefficients for the points in a square grid. (interpolation in the y-direction with different "integer" x-values). f(xo + nh. + vk constant and compute f(xa + nh. ya + vk) for n = 0. in two dimensions.

MULTIPLE INTEGRALS 245 now on we restrict ourselves to two independent variables and put h-ax . = 2(cosh 2C + cosh 272) fo =4fo+4(C'+i1')f. (20f.)f . (12 5) .fo S.1) With these notations we have. + f8 S. we obtain 4S. ay (12. f . (12...20f. = eE+'f . where we number a few points near the origin (see Fig. . with the error proportional to h'. for example. = e ' f o .8S. = fi + A + f3 + f4 S. . = e f.1) Thus we are working with a square grid. 12.(Si-4fo) (12. f1 = eEfo. we shall consider the following three sums: S. = fe + fe + f. rJ=h a . (1651 ..S.60f ) .. f . and so on. . = (6h3V2 + Jh'G' +. we find without difficulty: 1 V'fo= h. Analogously. f .+2S.fo+f30+fll+fu S. and we find F%'=h. = 2(cosh e + cosh rj)fo S..+44(x'+17')fo+.4) with the main disturbing term invariant under rotation. + S. and P=fo = 1 1 . . We shall here use the symbols G and D in the following sense: + 722 = h: CaY + a' = h-17 ax= aye =hD' . = 4cosh Ecosh17. The most important differential expression is j7'f. = eF f o ..+S3) with the error proportional to h2. axs y Further. .3) Further.

Identifying... and e2772 R. to . y) = f(hr. f (x. we obtain V-4h'[afe+4851+r'SZ I-c(Q+2R)]-I-. F4 + r74 = Q. that is. =J -w Then e+e+..he P4fp + .246 MULTIPLE INTEGRALS We now pass to the integration (cubature).ofe dx=hdr. and adding the condition that the error term shall be proportional to v4. . fo Setting for a moment e2 + >7t = P.. y) dx dy . 45 45 On the other hand.. we get _ 22 a+4/3+4r=1 a 45 R+Zr 6 45 1 whence 7 12 +r +c= 6 12 o ' 180 ' 1 1 r + 2c c = I 180 Thus we find the formula V = h2 (88fo + 16S1 + 7S2) . and try to compute ``+A r+A V Jf(x.. and hence +1 +1 V = h2 e'F+'Ife dr ds elf+1 4h2 sink sinkht Jo L ]:-1 [i-L_1 fo U = 4h2 L1 + 31 W + 772) + 5 Ce4 + £'77' + 7741 + . Simpson's formula in two directions gives ' V= 9 but the error term is more complicated.. .)2)2 = Q + 2R (thus invariant under rotation). hs) = dy = h ds.

dxdy 9 dy ' A J} -A hY -A -A Figure 12.3. 12. we obtain the open-type formulas shown in Fig.2 The two integration formulas are valid for integration over a square with the side = 2h. Integrating over a larger domain. This standard configuration is then repeated in both directions. 247 MULTIPLE INTEGRALS All these formulas can conveniently be illustrated by so-called computation molecules. 12. while the coefficients in the midpoints of the sides get contributions from two squares. 12k2 V2f0 S+A r+A f.2 should need no extra explanation. Taking out the factor 4. . we see that the coefficients at the corners receive contributions from four squares.dx 45 +A +A f. The examples in Fig.

but they are extremely clumsy and awkward. Of course. One possible way will be sketched in a later chapter. Acad. Y. EXERCISES 1. ed. 1964). and the demand for accuracy has to be weighed against available computational resources.5). For computation of S S Df(x.. 23 (Jan. but this Monte Carlo method has serious disadvantages due to its small accuracy. N. Hammer. REFERENCES [1] G. [5) Stroud. Sci. 2). Proc. (2. of a Symposium on Numerical Approximation. in principle at least. where D is a square with the sides parallel to the axes. MTAC. C. 2.. of Math. 2). + bS). Madison. 393 (1953). Although some pro- gress has been made in recent years. (2.248 EXERCISES 9 45 4h' Sfdxdy 4h'Jffdxdy Figure 12. [3] Thacher. and compare with the exact value. 189 (1957).3 The integration in higher dimensions will give rise to great difficulties. 11. W. [41 Thacher. y) dx dy. On practical work with multiple integrals the most common way is perhaps first to perform as many exact integrations as possible and then to integrate each variable by itself. Langer. 1958. one wants to use the approximation h'(af. Compute numerically Sa e -431+0's dx dy (h = 0. no completely satisfactory method has appeared so far. where D is a square with comers (1. This may create a need for huge amounts of computing time. 3. 7. 1). (21 P. Jour. 3. Here h is the side of the . Tyler: Can. 776 (1960). to construct formulas corresponding to the Cote and Gauss formulas. Compute numerically S S D (dx dy/(x' + y')). Wisc. 1). Ann. ACM. E. Comm. it is possible. 86. and (1.

by using the approximating expression (nag/4)(f.EXERCISES 249 square. y) dx dy. together with a regular hexagon with corners P P2. and S is the sum of the function values at the corners. is denoted by u. A function u = u(x. and compute approximately +: +i a (:2+r2 ) dx d y -1 -1 by using 13 points in all.... One wants to compute S S o f(x. f2. f. 5.6u0 and J = a2/ax2 + a`/dye. + f2 + fs + f4). Find a and b.. show that S= ih2dM + h442u + 0(h6). 4. where C is a circle with radius a. . y) sufficiently differentiable is given. . The function value in the point P. . is the function value at the midpoint. Here fi. f2i and f4 are the function values at the corners of a square with the same center as the circle has. center PO and side h.. P. . Putting S = ul + u2 + + ue . Find the side of the square when the formula is correct to the highest possible order. where du and 42u are taken in P.

which moves the origin to the point x.. dly. y(x + 1). + p"(x)Y(x) _: g(x) .E"-' + . Of course. Notations and definitions We will now work with a function y = y(x).2) If g(x) = 0. ... I'm feeling fine said Granny.0. y(x).0.0. but we shall continue with the name difference equation. Then we shall also use the notation y" for y(n). we infer that the difference equation can be trivially transformed to a relation of the type F(x.Chapter 13 Difference equations My goodness. .0. the equation is said to be homogeneous. We will restrict ourselves mainly to this latter case. (13. which can be defined for all real values of x. 250 .. In the latter case we will assume that the points are equidistant (grid points). waving with her hand. + c"Yo = (E" + c. a dependent variable y. As a matter of fact. .(x)Y(x + n . We define an ordinary equation as an equation which contains an independent variable x.3) * This is a free.1) + . that F has the form po(x)Y(x + n) + p.. y(x + n)) = 0 . (13. After a suitable translation. + c1Y"-I + . (13. d"y. an equation of this kind should be called a recurrence equation. The treatment of general difference equations is extremely complicated.. it is not an essential specialization to assume for- ward differences. such an equation can always be written Y. Next birthday I'll be ninety-nine with just one year left to thousand. though not very poetic.1. If we use the formula d = E . that is. and the grid points correspond to integer values of x. First of all we assume that the equation is linear.. and further we suppose that the vari- able x has been transformed in such a way that the interval length is 1.. ELIAS SEHLSTEDT* 13. and here we can discuss only some simple special cases.. and further we can spe- cialize to the case when all pi(x) are constant. + c")Yo p(E)yo = 0 . or only in certain points.1) where n is called the order of the equation. and one or several differences dy. translation of a humorous Swedish verse.

However. and so on. further. In the remainder of this chapter. Hence the general solution of the equation which we have discussed can be written y(x) = w(x) 3' . satisfies the equation. in general. The solution concept of a difference equation At first glance one might expect that several details from the theory of differen- tial equations could be transferred directly to the theory of difference equations.. w(x)) = 0 . lim=_. -1 -2 -3. but it is easy to show that this is not the case. The solution y = 3= . A fundamental question is. Now we easily find that y(x) = c .1. we consider the following example. (13. One might believe y = c 3z . w(x) must be continuous._o w(x) = w(0). it is called discrete.. y(x) = 3=(c + J cos 2irx + J cos 67rx . where c is a constant. and we see at once that the difference equation is satisfied again. The equation y*+. This is a new difference equation which. w(x). 13. and further. We find n 0 1 2 3 4 5 6.. w(x)) = 0 . . On the other hand. just discussed.1. y(x + 1). Since the solution appears only in certain points. we get F(x . but the corresponding solution y does not.x to be the general solution. since it depends on a special value ofyo. = 3y + 2n .1. in the latter case we meet difficulties of quite unexpected character and without correspondence in the former case.1) Operating with E. We can choose w = w(x)com- pletely arbitrarily for 0 < x < 1. will stand for peri- odic functions of x with period 1.. y(x + 1).3' . consider a relation containing the independent variable x. w.(x). is obtained if we take c = 1. In order to clarify this point. w(x + 1)) = F.x. suitable conditions must be imposed on w(x).I is a first-order inhomogeneous difference equation.x .. I9 8T . The discrete solution.. If we claim such properties.x and find w(x + 1) = w(x). For continuity. 13. can be interpreted directly by saying that w must be a periodic function with period 1. and a periodic function w(x): F(x. Assuming that yo = 1.I sins 37rx) .x is defined for all values of x and is called a continuous particular solution. w.1. y(x). Consider. . differentiability.. y 1 2 7 24 77 238 723 .(x. however..(x). THE SOLUTION CONCEPT OF A DIFFERENCE EQUATION 251 SEC. is uniquely determined for all integer values n. for example. obtain any of the properties of continuity.. what is meant by a "solution of a difference equation"? In order to indicate that the question is legitimate. we see that y. we put y(x) = w(x)3x . the dependent variable y.x. it is a particular solution.

the equation would be of lower order. y*-.e-i-11=Acos +Bsin 2 . n . can be determined in such a way that for x = 0. Hence y=A.2.. are given. 22 = 0 . For multiple roots we can use a limit procedure. We assume c # 0. the equation y(x + 2) + y(x) = 0 has the solution y=a.0. (13.. we infer that. otherwise. + . So far.. provided u and v are solutions and a and b constants. (13. Eliminating w(x) from these two equations. and then the whole procedure can be repeated with n replaced by n + 1. 13. 2. y y .2. y(x).2.3) in somewhat greater detail. As in a previous special example.. Linear.2) which is a first-order difference equation. First we consider only the discrete particular solutions appearing when the function values yo.2.e""11+b.2. ..0.21+A....1.1). . Since our equation is linear and homogeneous.3)... Thus we have constructed a continuous particular solution which coincides with the discrete solution in the grid points.. Thus y* can be determined from the difference equation.+A*2n is a solution of (13.=q(2)=0.. 2. putting y = 21. are the roots of (13. . On the other hand. yt..2. The n parameters A A . we get Ely = 2' 2' and (2* + c12*-1 + . 2.+c. . we conclude that the general solution can be written Y... For example..+. A.. 1 . Clearly. (13. we have considered only the case when the characteristic equation has 2 simple roots. Hence y = 21 is a solution of the difference equation if 2 satisfies the algebraic equation 2*+c12* +. all discrete solutions are uniquely determined from the n given parameter values. 13.. . Thus we suppose that r and . analogous to the well-known process for differential equations. y(x + 1)) = 0 . 2 . homogeneous difference equations We shall now discuss equation (13. = w1(x) . .1. y..2) If the characteristic equation has complex roots. then au + by also is a solution. + w2(x) . this does not give rise to any special difficulties.is+b(-i)z=a.0. the variable y takes the assigned values yo.1) This equation is called the characteristic equation of the difference equation (13. + w*(x) 2: . we obtain G(x. 2* (all assumed to be different from one another). + c*)2" = q1(2) .252 DIFFERENCE EQUATIONS SEC.3) if 2.

. y(l) = 1.qN ifp=q= . LINEAR.2 .). The characteristic equation is 2 = p22 + q.1) with p + q = 1. the general solution of the equation (E -( r)°y(x) = 0. EXAMPLES 1.r = xr=-' is also a solution.3=. 13. + ( l 1)5TM+ 2. and y(O) = 1. 34. we get y 1 -x/N..( y(x+n)-(7)ry(x+n. yN = 0. . If p # q. that r= and (r + e)= are solutions of the difference equation. with the roots 1 and q/p. and identifying. Putting y = (ax + b) . y(x) = py(x + 1) + qy(x . The characteristic equation is 21 . 5. we find the solution y(x) = pN-=q= . . Due to the linearity. 1. we conclude that we have the two solutions r= and xr=. The initial conditions give a+b=0 and a= -b=75. Since multiplication with r is trivially allowed.qN pN . we get dy = (2ax + 3a + 2b) 3=.. 8. is y = [wo(x) + xw. ..(x) + . 13.1 = 0 with the roots 1(] ± 1/5). + (-1)"r"y(x) = 0 . The technique is best demonstrated in a few examples. The Fibonacci numbers are defined by y(O) = 0. l 1)+()rzy(x+n-2)-. 3. 2. .. that is. Qy=(4x-2). HOMOGENEOUS DIFFERENCE EQUATIONS 253 r + e are roots of the characteristic equation. then also xzr= is a solution. Hence.-. y(x + 2) = y(x + 1) + y(x) (hence 0.4) . that is. The "general" solution is y(x) = a (1 21/)= + b (1 2]/5). 1.. lira r + e .2.(x)]r= EXAMPLES 1. 21. 1 (a . We shall also indicate briefly how inhomogeneous linear difference equations can be handled. we find the solution y = (2x . If the characteristic equation has a triple root r. 3= + w(x). + x°-=w.b)1/S = 2 Hence r( (l (l (l y(x)= 21' L\1)+\3/5+\5I 52+\7) 53+.SEC.3=.

. v. we easily find a = 3. First. 2. we shall now look at the problem from a point of view which corresponds to the well-known technique for performing integration by recognizing differentiation formulas.) .-1) 4a= _ (a . 4sinpx=2sin(p/2)cos(px+p/2).. . The solution of the homogeneous equation is y = w1(x) . Differences of elementary functions The equation 4f(x) = g(x). was discussed in Chapter 11 in the special case when g(x) is a factorial.(x) .254 DIFFERENCE EQUATIONS SEC.) = du.(u. 13.3.+1 . and then we shall give explicit formulas for the differ- ences of some elementary functions. 2/(27r)tn [cf. ± v.. ± 4v. nx(... 13... where g(x) is a given function. we get f(x) = w(x) + G(x) 2 g(x) + 6 g'(x) 30 g 4!x) _I g6! v(x) _ _ 1 g vu (x) + 42 30 8! + The coefficient of g'Z'-"(x) for large p behaves like .. e e° 1 and hence. 4(u..v. u. ` v. + u. 4y. 4(u. (13.1.u.. 4(u.3.. A formal solution can be obtained in the following way. = . and this fact gives some information concerning the convergence of the series.1) 4 ( u _ v.+1 . . adding a periodic function w(x).U. 2. Putting y = a x 2s. we find (h = 1): f(x) = g(x)1 = D G(x) . u%) = uw+luw+l . + w.) = u. Setting D-'g(x) = Sa g(t) dt = G(x).+1 + ... + v. (4(u.3.1)al 4 log x = log (1 + I /x) . 4ui = u. 4u. y(x + 2) . Hence the general solution is y = (w1(x) + 3x) 22 + w.(x). 4vr + v:+1 4u.+1 dx'. However. + u.u.u. (13. we indicate a few general rules for calculating differences.2) 4 cos px = -2 sin (p/2) sin (px + p12) .4)].) = u:+1 4v.3.+1 ...3y(x + 1) + 2y(x) = 6 2=..v. (11.

Thus we have the following difference formulas: dr(x) = (x .1)r(x) 1 (13.3. and takes the value (x . SEC. /3-a As a matter of fact.I/ 12x2 + .a). positive x (see further Chapter 18). where 0S a5 .3. We restrict ourselves to the case when the linear factors of the denominator are different. . we will now show how a solution can be obtained when g(x) is a rational function. we find: N--1 S Jim -aN---k=0 1 .a N.Sp(a .c'(N + 6) . Hence the problem has been reduced to solving the equation a dl x) -.3.P(a) + cb(R)} _ 00) . 8 .4) X4' EXAMPLE Compute S = Ek_o 1/((k + a)(k + i3)). we get db'(x) xY ' dib"(x) = X3 2 (13. Differentiating. for large x we have 1(x) .(N + a) . we get r'(x + 1) = xr'(x) + r(x) and hence r'(x + 1) = r'(x) 1 r(x + 1) r(x) + x The function r'(x)/r(x) is called the digamma-function and is denoted by b(x).8. Without difficulty. X Differentiating the second formula. DIFFERENCE OF ELEMENTARY FUNCTIONS 255 Returning to the equation d(x) = g(x).3) JO(x) = . x .a Now there exists a function r(x) which is continuous for x > 0.log x .. 13. then g(x) can be written as a sum of terms of the form a/(x .1/2x . 1 l _ lim {c.1)! for integer. The fol- lowing fundamental functional relation is fulfilled: r(x + 1) = xr(x) .

This equation can be interpreted as the characteristic equation of the matrix C. and we have shown before that limk_. .Ax) of df (x) = g(x) can be obtained by trying a series expansion in factorials. y. = 5083768999. y.(N + a) .o = 56661781./y.. But at the same time . has been tabu- lated by Davis (see [3] in Chapter 11)..1 1 0 .. A = 7041.1). is the numeri- cally largest root of the equation (13. = 5981922. the solution of a linear.256 DIFFERENCE EQUATIONS SEC.0. = Cv since y.. y.y....4..{c.4. we can start with the algebraic equation written in the form (13.2. is the numerically largest eigenvalue of C (2.y.3) to the usual power method for determining eigenvalues of matrices. = 0. 13.152 + 10 = 0 can be interpreted as the characteristic equation of the difference equation y. we find expansions in factorials more suitable for difference equations..473. y._11 /k2 rr2/6. The T-function.k+3 8Yk+: + 15yk+.2. power-series expansions are most conven- ient. = 9. 0 C= L 0 0 . = 79. The discussion of the case when the equation has several roots with the same absolute value was given in Section 6. homogeneous dif- ference equation with constant coefficients is closely related to a purely algebraic equation. -c.. V. = and V. Further details can be found in [1].-. . Putting Y. EXAMPLE The equation 2' .3).0. = 8. Hence we have reduced the point by point formation of the solution of (13. Further we have y. possibly multiplied by some other suitable functions.8)} = 0. 742. -c.. . E.2..l. On the other hand. 3. = 66668. y.l °yk. We arbitra- rily set y. In some cases a solution . . y. (yk/y.82' .) = 2 where 2.1). ". = 1.1 0 and at the same time as the characteristic equation of the difference equation (13. where r-c. we find simply that 11(k + a)' _ '(a) and. in particular. = 536707688. . assumed to be real).0. Although for differential equations. = LYo J we have v.p(N + . c./y. = 9. and obtain successively y. = 631539.1.-1 Y. Bernoulli's method for algebraic equations As was pointed out in Section 13. y.. and from this we get lim. as well as the functions ep.49. . . = -cyo [see (13. 13. y.3)].'. If a = k8.

We shall discuss only linear equations with constant coefficients. Hence we get u = Elf(y) = fix + y) [a was replaced by E.u. and first we shall demonstrate a direct technique. and hence we can write E. the dependent variable is denoted by u. 9. y) = u(x. 5 .u(x. so instead. y) is periodic in x and y with period 1. Obviously. 0) = 0 for x > 0. The general solution is u = w(x.472136.. where w(x. Here we try to obtain a generating func- tion whose coefficients should be the values of the desired function at the grid points' on a straight line parallel to one of the axes.527864 and 5 + l/2-0 = 9. We shall here restrict ourselves to two independent variables.1) with the boundary conditions u(x.y'1 = -18.1 > 12. be the roots in question. We take the equation u(x. Let 2. E. we get E1u(x. y) = u(x + 1' y) .SEC. Partial difference equations A partial difference equation contains several independent variables.u = au is u = cax.47217.Yk+1 = 2122." Yk k1OD Yk-1yk+1 .472137.9441554 and 22 -1. y) .u=E. Then . 13.999987 . which will be denoted by x and y. y) f(x + y).Ylo As a rule.1. and is usually written as a recurrence equation. it is essential that one of the roots be numerically much larger than any of the others. for example. Jim YkYk+f . Previously we have found that a formal solution of E.1. 0) = 1 ]. We demonstrate the tech- nique by an example. Another method is due to Laplace.u(x.77 Jim Yk+1 = Z1 . the Newton-Raphson method.2. Introducing . YYY11 . y . y) = qv [hence u(0. with 12. We can also find the next largest root with this method. *-. here c should be a function of y only. 13. y + 1) = 0 .Yk In our example we obtain YioY12 . y + I) . Consider the equation u(x + 1.5. u(0. y) + qu(x. y) = pu(x . and c by fly)].26 = 0. PARTIAL DIFFERENCE EQUATIONS 257 y1o/y9 = 9. Bernoulli's method converges slowly. one ought to use. and 2. Introducing shift operators in both directions. compared with the exact roots .5.

REFERENCES [I] Milne-Thomson: The Calculus of Finite Differences (Macmillan.1W . partial difference equations play an important role in connection with the numerical treatment of partial differential equations. the generating function q (Q) = u(0. The computation is performed with complex integration in a way similar to that used for solving ordinary differential equations by means of Laplace transformations.5. For details. and for this purpose. y)E2 + . at our disposal for satisfying also the boundary condition.. p=qY . we can multiply with an arbitrary function q(a) and integrate between arbitrary limits. we use the same example. London.. y . y) + gY(EE) Here we have used the initial equation... y . New York. Ordinary difference equations are of great importance for the numerical solution of ordinary differential equations. x . 1937). .. [21 Uspensky: Introduction to Mathematical Probability (McGraw-Hill. We try to find a solution in the form u = a. y) is now obtained as the coefficient of the t--term in the power-series expansion of g (Q). see [1] and [2]. y)6` = qu(0. if we make use of the boundary values as well. pa-1)-Y with arbitrary a is a solution. y . as well as the integration limits. Further. We obtain directly: -PE)_Y=gYC1 91Y(0=q'0 21)PT+. N(PY(e) _ s=1 Pu(x . The desired func- tion u(x.1) + z=1 E qu(x. y) = a'q"(l . y) = 1 2 x x Still another elegant method is due to Lagrange. y)f + u(2.W = (1 t) ..258 DIFFERENCE EQUATIONS SEC. Obviously. and hence Y(Y + 1) .1. (y +.611 and get the condition a/3 = p(3 + qa. u(x. These matters will be discussed more thor- oughly in Chapters 14 and 15. and again we obtain new solutions of the equation. we find qu(0. y) + u(1.Y)V Adding together. 13.1) p-qy = x + y .1) . and analogously. Here we shall only sketch the method very briefly..1) + E u(x. we get q ). 1933). y . This completes the computation of the generating function.u(0. Hence u(x. Now we have -P(ct). we get qqY-'(c) + qu(0.

.-3) for n = 3.a.._. . a and a2 are given real numbers. + a. + y.--. we have uk.+xB.= 0 has nontrivial solutions such that yo = yN = 0. show that UkUk12 . where n is an integer _> 0. b.+y... Bo=1. The integral X. Find all values ofkforwhich thedifference equation y. fo = 1.+. are computed directly../B. + Puk.wehave Find}+}+ to six places. when (a) f uo = I . (n < 12). and b.(1/ak). Find A.. Show that the characteristic equation f. =cosx. Solve the difference equation a2 f(x) = a f(x). B = lim.1._. using Bernoulli's method.2 cos x u..(2) = 0 can be obtained from fk = Afk_. with xo=3.. can be written in the form a . = B.Yo=2.-2y. . where a. uo. = I/(n + 1). . Given a quadratic band-matrix A of order n with the elements for Ii-kl=1 a.Uk+l = 0- 7. S. 2 = auk+.`.-i n=1. C = lim..17x' .+. + 1. + 10y..uk+. and use this relation for computing y. IB. f. and C.59x2 + 17x + 61 = 0. I y. . e '. Consider particularly the cases a = -4 sine # and a = 4 sinh2 1.=A. yn=Jox2+x+Idx is given. Ao=0. with ao = bo = 1. = A. as a function of n and x. are defined through A.2._ C.k = ..+2 + y*. we put A = lim. are given numbers. Show that S = E k-... and write a. The integral Sox'e 'dx.a2. are integers. Putting Uk = ukuk. = A.(A) and the eigenvalues of A.. Further. Prove that for n > 1.. u.-I + xA. a.2 . Also give the form of the solutions. Also show that with R.2.ao + B. = 4y..b. 3. and u. Solve the equation x' ._.. 5.=I. = = 0. = a2 = 1)._. _ j(a. Also find a value N such that yN = 0. .. = nb. a and b.. 12. Solve the system of difference equations x.04.fk-2.3.. Fork . yo and y. Find an explicit expression for f. Solve the difference equation u. 8. +(k2/N2)y. A. Ek-. We form a. = 7x.. 2. 6. B. 5. 11. 9. (1/ak) is convergent. 10. 4. ao. u. + C.. where the numbers ak form a Fibonacci series (a.. (b) Jua = 0. = na.0.-. EXERCISES 259 EXERCISES 1. and B. and find 4.-..._2 + a. A._ B. Show that y.0 1 otherwise. Determine A. satisfy the dif- ference equations a.. + u.+...

y) are of decisive importance for the character of the solution. In this way it is possible to give a picture of the directions associated with the equation. It is obvious that the properties of the function f(x. as well as linear equations with constant coefficients. We are now going to consider two important cases for f(x.1) by aid of Taylor's formula. y) is analytic (note that x and y may be two different complex variables). only the first of these cases will be treated here. GALILE[. in essence.1) For this reason we shall pay a great deal of attention to this equation. The discussion of a system of first-order differential equations can. If f(x. then it is an easy matter to obtain a solution of (14. 14. The definition of linear and homogeneous equations.0. is trivial and should need no comment. then it is possible to replace the equation by a system of n first-order equations by use of a simple substitution technique. no one of the two variables need enter the equation ex- plicitly. second that it fulfills the Lipschitz condition. The differential equation can be differentiated an arbitrary number 260 .0.Chapter 14 Ordinary differential equations Eppur si muove.0. If the equation is of such a form that the highest (nth) derivative can be expressed as a function of lower derivatives and the two variables. By equation (14. We will first point out a simple geometrical fact. and we can obtain a qualitative idea of the nature of the solutions. Existence of solutions An ordinary differential equation is an equation containing one independent and one dependent variable and at least one of its derivatives with respect to the independent variable.0. be reduced to an examination of the equation y' = f(x. y): first that the function is analytic.1) every point in the domain of definition for f is assigned one or several directions ac- cording as f is one-valued or not. y) (14.

f / I -.//l ///. -.y2). . Direction field for the differential equation y' = (x2 .---. /t -. / ! i /r I Figure 14.0. Except the coordinate axes the lines x = } 1 and y = tx are drawn.'-* l I'* -'.. .I Xx2 .-. . --i .

0.0. (14. with the initial condition y = yo for x . y). and putting X ..Y1(x) + f(E. we obtain IY9(x) . dE =o SIYi(x)-Yol+LMdE <2M hN. y)I < M. the series converges and gives a unique solution of (14.0.Y.f(x.2) where L is a constant.0.3) Thus the differential equation has been transformed to an integral equation.262 ORDINARY DIFFERENTIAL EQUATIONS SEC. 14.y:-A(E)I dd .. We integrate (14. (14. Y. .(x) which satisfies the conditions I y. 3. We now obtain IY. we get y=Yo +(x-xo)Yo+ I (x-xo)2yo +. (14. Then we form a sequence of functions y. and provided that Ix .(xo) = yo. 4.(x). =o We suppose xo S x S X. Now we suppose that this condition is fulfilled.zI .0. Y:+i(x) = Y.4) This equation defines Picard's method which on rare occasions is used also in practical work. Further we assume the initial condition y = yo for x = xo.1) between xo and x and obtain y = Yo + J zof(E.(x)I <_ S 00 I -f(E. but in general it should be possible to find a better starting solution.(x) = yo.(E))IdE < L sz Y.(E)) dE . y) .. y) is bounded and continuous within the domain under consideration: I f(x. A widely used extra condition is the so-called Lipschitz condition If(x.+Ax) -Y.(E) . . and further that f(x.. Now we choose an initial approximation y = y. i = 2. and hence we can obtain as many derivatives as we want. we could choose y. . If we require that f(x. y) be bounded and continuous. z)I < LIy .xo.(x)I < M and y.1). For example. .xo = h. Often one does not want to demand so much as analyticity from the function f(x. With the initial value y = yo for x = xo. this turns out not to be enough to guarantee a unique solution. y) dE .0. of times.YAx)I = ly..xol is sufficiently small. + S z0 f(E.

y. we obtain z = Yo + f(E.1)!/ . It is easy to show that the solution is unique. 14.5) is absolutely and uni- formly convergent toward a continuous limit function y(x).0.0. (n-1)!* But Y.(x)I 5 L N L"-'(E . (14. 2 f(E.(x) ..(E)) dE 90 =o .20 Iy (E) .-1 C 2! (n . + L*-'h.0. we find successively: I y.+. Now we form 1Y(x) . dd 1 S IY(x) .+.(x) + (y.-.1)..(x)) .Y..y$(x)I < L=o N dy = NL(x .Y. y.(x) . .(x) .y.h.6) =o Again we point out that y(x) = lim..+.(x) . we get back (14..2)! = N L"-'(x L. every term in this series is less in absolute value than the corresponding term in the series N l + Lh + L h' +.+. y (x).xo)"-2 dd =o (n ..4)..+. EXISTENCE OF SOLUTIONS 263 Further. z(e)) dE . Thus the series (14. y(E)} . = Yo + f(E. + (y. Y(E)) dE = 1Y(x) .(E)I dE 0. Hence we find Y(x) = Yo + js f(E.6). Iy(x) .(x) .5) and apart from y. L*(x21 xo)2 < N L2h' .(x)) + . Differentiating (14. N (n-1)! .0.xo) dE = N =o IY.Y3(x)I S L NL(E .y.(x)I + L 1. since y(x) n oo. which converges toward NeL".. The continuity follows from the fact that every partial sum has been obtained by integration of a bounded function.Y(E)) dE .SEC.=o { f(E.0.0.(x).Y.0. where the formation law is given by (14. Assuming that z = z(x) is another solution.y.-.xo) < NLh .f(E.y. (14.

If(£> z) .f(E. the values x and h will also be present.y. the methods will often be illustrated on simple equa- tions whose exact solution is known. 14. in the second case a multi-step method. In the first case we have a single- step method. is an explicit multi-step formula.. . yx+. further. + h [f(x*. may then appear either as a function of just one y-value y. we shall consider the differential equa- tion y' = f(x.xo)4 N L2h2 Ys 2 2! 2 2! and finally N 0. y. y). However.. and yk = the value of y obtained from the method [to be distinguished from the exact value y(x.l < (N/2)L(x .. Iz Y.1. In the latter case an iterative technique is often used to determine the value of yw+. On first hand. = y%_1 + 2hf(x...+1. y») + f(x. formula. Hence we have z(x) = y(x). or as a function of several values yn.)I dd <_ L Iz . y.+. yn_ ..yoi < Mh = N/2... 14. and hence Iz .)]. N L'(x .1.. The computation of yn+.264 ORDINARY DIFFERENTIAL EQUATIONS SEC.._. y. and we are going to use the notations xk = xo + kh. Classification of solution methods In the literature one can find many examples of equations which can be solved explicitly.)] is an implicit single-step formula. some- times an implicit. and hence Iz . as a rule equidistant in order to facilitate interpolation.y. For example.xo) < (N/2)Lh .oo. is sometimes performed through an explicit. as a rule.+11 < )I (n -+I 2 when n . while Y. The value y*+.. If an explicit solution cannot be constructed one is usually satisfied by com- puting a solution in certain discrete points.. In spite of this fact good arguments can be given that these equa- tions constitute a negligible minority. We shall restrict ourselves to such methods as are of interest for equations which cannot be solved explicitly in closed form. We can easily give examples of equations where the Lipschitz condition is not fulfilled and where more than one solution exists.I d$ o =o But 1z . The equation y' = 1/ y with y(O) = 0 has the two solutions y = 0 and y = x2/4 for x Z 0..

The errors have two different sources. viz. assuming that the solution is twice con- .3. Then the total truncation error e is defined through e. for example.-Y. in the second case this will have no effect except in some very special cases. the error resulting on use of the method in just one step. while the numerical method only resorts to a finite number of steps. that is. Second. how an error introduced by. The rounding error e'.) Yo = a .I=IY. = Y. the form of the truncation error. When one wants to investigate the truncation errors one should first study the local truncation error. is defined through the relation e...Y.3.-Y(x.Y. + hf(x. Single-step methods. First we shall derive the local truncation error e.+. (14. = Y. y) it is defined through y..)+(Y. the exact value which would result from the numerical algorithm.Y(x. First. all operations are performed in finite precision. y) in its starting point.))ISIe. .. However.) (14.3.. . Different types of errors For obvious reasons error estimations play an essential role in connection with numerical methods for solution of differential equations.. and hence rounding errors can never be avoided.2. = y. SINGLE-STEP METHODS.1) Due to the finite precision computation we actually calculate another value y.I+Ik. even the simplest numerical method implies introduc- tion of a truncation or discretization error since the exact solutions as a rule are transcendental and need an infinite number of operations. EULER'S METHOD 265 14..2) For the total error r we then obtain Ir.)I =I(Y. Let be the exact solution in the point x. Euler's method In this section we shall to some extent discuss a method. suggested already by Euler. and the nature of the error propagation. and y. but almost never used in practical work.-y(x.. 14. SEC. that is. the method is of great interest because of its simplicity.2. (14. In the first case the situation can be improved by making h smaller.2.l- There are two main problems associated with these error estimations. Applied on the differential equation y' = f(x. 14. round-off is propagated during later stages of the process.1) Geometrically the method has a very simple meaning: the wanted function curve is approximated by a polygon train where the direction of each part is determined as the function value f(x..

.+. m=0. y"=a + yy'=ax+fay 2 y axf'+2faxay y! a ay +f(ay ..) = hf(x..266 ORDINARY DIFFERENTIAL EQUATIONS SEC..l S 2 hN L It is possible to prove the asymptotic formula: e . Taylor series expansion Assuming sufficient differentiability properties of the function f(x.2.) e.4. . + h[f(x. + W__ I But e. 2.) . we shall also estimate the total truncation error e = s(x) with assuming the Lipschitz condition I f(x.l s A"leal B.y.+. and yo = y(xo). where xo < E < x.) . 1. . is the error after one step. 0 converges toward a function y(x) satisfying the differential equation. = e... 14.I + B (m = 0.1) and by induction we easily prove for A # 1: le. < E. y.1..n.. we have e.1. Hence the local truncation error is 0(h'). . = 0 and A" = (1 + hL)* < e*AL = etc:-:o&. Since e. = y. ..+. tinuously differentiable. n .. y).y(xo + h) = .y(x.. existence and uniqueness proofs can be con- structed by showing that the sequence of functions which is obtained when h -.h' + .c. Subtracting and putting e.I 5 (1 + hL)Ie.jh`y"(E) .) . we obtain Jh2y (E.. < x. y. jh=y"(E. We prefer to write this equation in the form le. which gives 1 1 Ie.. .)I 5 Ll y.. where the coefficients c c c .h + c.+..4.+..))] - Hence Ie. 14..+hf(x.I + jh'N.) = yo + hya . -y. y.l S AIE. y(x.=y.l and Iy"(E)I SN: y. depend upon x and the function f(x.h' + c.f(x. Starting from Euler's method... = y.. y) .). y) we can compute higher-order derivatives directly from the differential equation: y = f(x. Further. x.+. y(x.f(x.

. y = ao = 0.. We demon- strate the method in an example. . = h'y" /2.SEC. a.2)y .2x. 2xy .2xy. 14. = h/. = 375 a7 = 36 as = 0 Al-1) = 0.1).) .2y = (4x' ..2(n + Putting a. -2xy(*+..53801 95069 (taken from a table). a.+ 4x)y + 4x' . = -0. but it should be observed that it is better not to construct these explicit expres- sions.+a.53807 95069 a.2xy) .00461 92049 a.4.4 + 8xy = (-8x' + 12x)y + 4(x' . = y. = 0. we obtain n and y(x+h)=ao+a.1 . -0.52620 66800 A great advantage of this method is that it can be checked in a simple and effec- tive manner. It is possible to compute y" = -2x(1 . y"' = (-8x. Differentiating repeatedly. _ -16413 a. TAYLOR SERIES EXPANSION 267 After this we can compute y(x + h) = y(x) + hy'(x) + ih'y"(x) + . etc. the series being truncated in a convenient way depending on the value of h.52620 66801 Tabulated value: 0. y =1. NUMERICAL EXAMPLE x = 1..00035 87197 a4 51600 a.00761 59014 a.4y. We have and this value can be compared with a previous value.. h = 0.+. we get y"= -2xy'-2y. In this example we find . ao 0.

the following system: a+b+c+d = 1.9) = 0.5. p-1. It should be mentioned that there are nowa- days formula-manipulating programming languages in existence and thus also this part of the work can be taken over by the computer. depend on round-off errors.268 ORDINARY DIFFERENTIAL EQUATIONS sEc. cmr+d(nt+ms)=*.5.). be made empirically. yo + mk. = hf(xo + mh.54072 43187. which seems rather natural.) .yo+sk. Then we put k.5. We have eight equations in ten unknowns. The first reason hardly has any significance at all with modern computers while the difficulties in computing higher derivatives seem to have been overestimated.. 14.+dk. one reason being storage problems for the higher derivatives.1) k. The Taylor series method has a reputation of being unsuitable for numerical work. k. the method does not offer any easy checking possibilities.yo+rk. Let the equation be y' = f(x.5. 14. The differences. of course. The constants should be determined in such a way that yo + k becomes as good an approximation of y(xo + h) as possible. generali- zation to equations of arbitrary order is almost trivial.+tk3+(p-s-t)k. yo) . = hf(xo. and hence we can choose two quan- tities arbitrarily.+bk. If we assume that m = n. with starting point (xo. By use of series expansion.=hf(xo+nh. as a rule. The greatest disadvantage seems to be that it is rather difficult to estimate the error. (14. rt . and moreover. a=d=. one ob- tains. k. cmnr + dp(nt + ms) = A . Very often the values a vary in a highly regular way.. b+c=.) k =ak. bm + cn + dp = .j. y(0. s+t=1. Runge-Katta's method This is one of the most widely used methods.2) bm= + cnt + dpz = . The error estimates in this method must. (14. . and it is particularly suitable in cases when the computation of higher derivatives is complicated. cm2r + d(n=t + mzs) = T3 bm' + cn3 + dp' = a .+ck.=hf(xo+ph.54072 43189 compared with the tabulated value 0. after rather complicated calculations. dmrt = Ta . yo) and interval length h. y).+(n-r)k. First of all we shall discuss the solution of a first-order equation. larger errors would be disclosed in the back-integration checking. cr=*. It can be used for equations of arbitrary order by means of a transformation to a system of first-order equations. another difficulties in computing them. we find m=n=j. and further.

= hF(x. k.2. k.h' + .z+J1. z' = G(x. j y' = z ' (= y. = hG(x. y.yo+ks) If f is independent of y. = hf(xo. 1.)k1).Y= . Taking. (= G(x. a=d= . I =J(11+21. yo) .). the formula passes into Simpson's rule. 1.) k =A(k1+2k.+6) The new values are (x + h. which is integrated through: k. and it can also be shown that the local truncation error is O(h').z+J1. y. y').). b=c= .y2. 14. s=0.+2k. for example. + jh. (14. The total error has the asymptotic form e(x) . Runge-Kutta's method can be applied directly to differential equations of higher order.+2k. Z) This is a special case of y' = F(x. k. The explicit formula for the error term. however.SEC. the equation y" = f(x. Y. z) . z= xz_ .3) k8 = hf(x. y. z + 1).yo+. y + k. z)) h = 0.+2k.+k. = hG(x + jh. we get b = c = and hence m=n= . P= t= 1. Thus we have the final formula system: k. r= .5. y + jk. we put y' = z and obtain the following system of first-order equations: Y'=z.+k.+21. z) .hf(xo+')h. y.c. we can also obtain an improvement by aid of the usual Richardson extrapolation.) .5. Y" = x y " . RUNGE-KUTTA'S METHOD 269 If we further choose b = c.=hf(xo+h.). y. k=j(k. is rather complicated.h' + ch' + c. z)) . y. and in practice one keeps track of the errors by repeating the computation with 2h instead of h and comparing the results. = hF(x + jh. y + jk1. z' = f(x. yo + jk. NUMERICAL EXAMPLE AND COMPUTATION SCHEME F(x. z) . z) .

0469(=11e e-1). z= 11 a-= 5e-2'z.3133 1.0999 -0.1870 2.1 -0.6 1. The equations y' 12y + 9z . On the other hand. as pointed out above.999 -0.8360 -1.1640 0. 14.0000 1. Initial conditions: x y z(= F) xz2 .979102 -0.6505 2. First we take an example.0000 1.1998 0.2 0.98002 -0. and h = 0. It should be mentioned here that there is a whole family of Runge-Kutta methods of varying orders.8174 2.1632 1.039164 -0.0469 1.2270 1.2.6 2.0018 2. z' = l ly . Note that the interval length can be changed without restrictions.3504 -2.0134 2.1958204 0. however.9527709 -0.0 3.7109 3.2 1.01998 -0.8 1.99 -0.4 2. we obtain x y z lly/9z 1.2 0. For x Z 1 . we have a 21' < 10-9. the classical formula has reached a dominating position by its great simplicity.3111(^-9e 1).y2(= G) k 1 o 1 0 -1 0 -0.8169 1.4892 1. deserves to be mentioned.5668 .02 -0. A special phenomenon.196966 The values below the line are used to initiate the next step in the integration. However.2207 1. starting with x = 1. it is somewhat more difficult to get an idea of the accuracy.3111 4.4798 1.2 2.1 0.1 1 -0. y = 3.4 1. z = 4.2 0.0000 1. which can appear when Runge-Kutta's method is applied.1064 2. For practical use.1905542 0.loz.0530 1. and one would hardly expect any difficulties.1958204 -0.2195 2.0003 1. have the particular solution y = 9e1 + 5e-21z .7127 1.0 1.5.270 ORDINARY DIFFERENTIAL EQUATIONS SEC.980146 -0.

Denoting the characteristic values of A by 1 we have stability for -2. which is acceptable when a = 1 and h = 0.. < 0 if the eigenvalues are real and for 0 < IhA. .1326. y).1 gives perfect results..t < 2v/ _2 if the eigenvalues are purely imaginary.Y. 2. later on also y" = f(x. 14. When we use Runge-Kutta's method.YI.SEC.. = aflayk.) . The explanation is quite simple. 21. The method is explicit if the value can be found directly..*a'h' + ..a'hl.ah + Ja'h' . As a matter of fact. we approximate e -*h by I . out. Multi-step methods As has been mentioned previously.785 is the real root not equal to 0 of the equation 1 + x + x'/2! + x'/3! j..(x. and if in particular k = 1. In the example above.2374 compared with e-'-' = 0. for example.785. MULTI-STEP METHODS 271 Round-off has been performed strictly according to the principles given in Chapter 1. mainly situated in the left half-plane (also cf. must belong to a closed domain in the complex plane.. 2. . A reduction of the interval length (in this case to about 0. the phenomenon observed here is usually called partial instability.Y . The value -2.1. [101). we are back with the single-step methods which have just been discussed.+k as a function of several preceding values y. and implicit if the formula contains the wanted value also on the right-hand side. we had A _ 12 9 11 -10 that is.fk-9. Explicit calculations show. a multi-step method defines the wanted value Y. Primarily we shall treat the equation y' = f(x. 14. yk) will be used..785 < h2. If the system of differential equations is written in the form Y: =f. In the case of com- plex eigenvalues hA. that h = 0. In this case we have a k-step method. Further it is also obvious that a special technique is neces- sary for calculation of a sufficient number of initial values. however.+k_i+ Y. Consequently h must be chosen so that 21h < 2.6..x'/4! = 1. the stability properties are associated with a certain matrix A having the ele- ments a.6. . y).. there are several matrices A since the derivatives are considered in different points. A slight change in these principles is sufficient to produce a com- pletely different table. For brevity the notation fk = f(xk. but if we assume sufficiently small intervals we can neglect this. i .1) would give satisfactory results. it is a bad approximation if a = 21 (6.014996). and we observe immediately that a change of the interval length h will be difficult to achieve. n.. 1 and 2. In spite of the fact that proper instabilities appear only in connection with multi-step methods. .2 (the error is <3 10-8). or h < 0.. We shall assume equidistant abscissas through.

+ a0y* = h(Rk + Rk-1 + . This condition is known as the stability condition.6.1) Two polynomials will be introduced...3) and (14.l > 1.1)hak_. Inserting this into (14. Assume one root z. = 1. oo in such a way that nh becomes finite. + + 80f*] ' n = 0. Note that n --.. p(z) = akzk + ak_. we have (14. 14.l S 1.1) when h = 0.. oo). It is easy to show that we do not get convergence if z.j _ . (the right-hand side of the difference equation is equal to 0 since f(x.y*+k-1 + .. (14..j > 1.3) into account. of p(z) are such that Iz. Let us first consider the differential equation y' = 0 with the initial value y(O) = I and exact solution y(x) = 1.272 ORDINARY DIFFERENTIAL EQUATIONS SEC.5) . Since f(x.. see Henrici [7]. (14.6. and further that all zeros on the unit circle are simple.4) The conditions (14.1) /we find aky*+k + ak-.zk-1 + . we get kak+(k . Then a necessary condition for convergence is that the zeros z. + R0) Since y. + R0 Rk-.6. The method is now said to be convergent if limk0 y* = y(x*).5) 1 = Q(1) . 0 (n -. Then y* will contain a term with the absolute value (x/n)r* which tends to infinity when h -. Methods for which this condition is not fulfilled are said to be strongly unstable. y) = 0). Taking (14. = rh. By discussing other simple equations we can obtain further conditions which must be satisfied in order to secure convergence of the method. 1... .6. and this must be valid even for initial values which are close to the right ones and converge to these when h -. 0. The general linear k-step method is then defined through the formula aky*+k + ak-1y*+k-1 + + a0y* = h[/Skf*+k + Rk-1f*+k-. we obtain (n+k)hak+ (n + k . y) = 0 and all y.6. . namely. then the solution of (14.zk_.3) Next we also consider the differential equation y' = 1 with the initial value y(O) = 0 and exact solution y(x) = x. The polynomial p(z) at the same time is the characteristic polynomial associated with the difference equation (14.4) can be written in a more compact form: p(1) = 0. To prove this we consider the equation y' = 0 with the initial value y(O) = 0 and exact solu- tion y(x) = 0.1) contains a term Az.6. 2. (14. Q(z) = Rkzk + + . + a0 .6. Further assume an initial value such that A = h and consider y* in the point x = nh.6.6.6. For com- plete proof. such that 1z..6.

SEC. 14.6. MULTI-STEP METHODS 273

These relations are usually called the consistency conditions, and it is obvious
from the derivation that they are necessary for convergence. Somewhat vaguely
we can express this by saying that a method which does not satisfy the con-
sistency conditions is mathematically wrong. On the other hand, it might well
be the case that a "correct" method is strongly unstable (cf. Todd's example
in a subsequent section of this chapter).
If the local truncation error is O(hy+'), the order of the method is said to be
p. One is clearly interested in methods which, for a given step-number k, are
of highest possible order, but at the same time stability must be maintained.
Dahlquist [4] has proved that the highest possible order which can be attained
is 2k, but if we also claim stability we cannot get more than k + 1 if k is odd,
and k + 2 if k is even.
We shall now discuss a simple example in order to illustrate the effect of a
multi-step method on the error propagation. We start from the equation
Y'=ay, Y(0)= 1,
and shall apply a method suggested by Milne:

Y-+2 = Y. i- 3 [y'' -1- Yx+3]

that is, Simpson's formula. Also using the differential equation we obtain the
following difference equation:
_ ahl 4ah (1 ahl
ll T _3
The characteristic equation is
(1 _ ah) Z, - 4ah 2 - (1
with the roots (2ah/3 1 + a'h=/3)/(1 - ah/3) or after series expansion,
e 4 4
10
1 + ah + 2 + a6 + + ab2 + ... - a°A + c,h° ,
24
ah a=hi a'h' ( aA'3 +

where c, = a'/180 and c, = 2a3/81. The general solution of the difference equa-
tion is y = A1; + B2,,, and since we are looking for the solution y = e°= of
the differential equation we ought to choose A = 1, B = 0. We now compute
approximate values of 2 and 2, choosing n and h in such a way that nh = x
where x is a given value. Then we find
(e" + c,h5)' =
e -,',(l + c,h°e 1k)* = e6i(l + nc,h6e nA + - .. )
e°s(1 + c,xh') = _°s(1 + ax),

274 ORDINARY DIFFERENTIAL EQUATIONS SEC. 14.6.

where a = c,h' = abh'/180. In a similar way we also find
2, _ (- 1)"e-1111(i + $x) ,
where ,9 = c,h2 = 2ah2/81. We are now interested in the solution 2,", but in
a numerical calculation we can never avoid errors which will then be understood
as an admixture of the other solution. However, the effect of this will be quite
different in the two cases a > 0 and a < 0. For simplicity, we first assume
a = 1. Further we assume that the parasitic solution from the beginning is
represented with the fraction s and the "correct" solution with the fraction
1 - E. Thus, the generated solution is
y. _ (1 - s)e°(1 + ax) + E(-1)Re1ra(i + ,6x)
and the error is essentially
E,=axe, -ee, +s(-1)"e-,11 .
Since x is kept at a fixed value the first term depends only on the interval length
(a = h'/180) and can be made arbitrarily small. The second term depends on
s, that is, the admixture of the parasitic solution. Even if this fraction is small
from the beginning, it can grow up successively because of accumulation of
rounding errors. The last term finally represents an alternating and at the same
time decreasing error which cannot cause any trouble.
We shall now give a numerical illustration and choose h = 0.3. Further we
shall treat two alternatives for the initial value y, which we choose first accord-
ing to the difference equation (y, = 2,), second according to the dtfferential equa-
tion (y, = exp (0.3)). We find
A'-42/9= 11/9
and 2, = 1.349877; 2, = -0.905432. The difference equation is
1 ly and the results are given in the table below.

x y, Error 101 E;" 100 y1I Error - 100 E;"' 108

0 1 1 --
0.3 1.349877 18 18 1.349859 - 0
0.6 1.822167 48 49 1.822159 40 41
0.9 2.459702 99 100 2.459676 73 74
1.2 3.320294 177 179 3.320273 156 158
1.5 4.481988 299 303 4.481948 259 262
1.8 6.050132 485 490 6.050088 441 446
2.1 8.166934 764 772 8.166864 694 703
2.4 11.024354 1178 1191 11.024270 1094 1106
2.7 14.881521 1789 1808 14.881398 1666 1686
3.0 20.088220 2683 2712 20.088062 2525 2554

SEC. 14.6. MULTI-STEP METHODS 275

For y, we have with good approximation s = 0 and hence E' ; axe' with
a = h'/180 = 0.000045. For y,, we easily find a = 0.000008, and from this,
E;' is computed. The difference between actual and theoretical values is due
to the fact that the term ax is only an approximation of a whole series and
also to some extent to rounding errors.
We now pass to the second case a = - 1, that is, the differential equation
V = -y, y(O) = 1. In the same way as before we get
y" = (1 - s)e =(1 + ax) + a(-1)"e:ls(1 + $x) ,
where the signs of a and /3 have now been changed. Hence the error is
E, = (ax - e)e = + e(-1)"e-'s(1 + 9x).
Again we illustrate by a numerical example. With the same choice of interval
(h = 0.3), the characteristic equation is 22 + 4,1/11 - 9/11 = 0 with the roots
2, = 0.74080832 and 2, = -1.10444469. Also in this case the solution is di-
rected first according to the difference equation, that is, y, = 2 and second
according to the differential equation, that is, y, = exp (- 0.3). The difference
equation has the form l ly"+, _ -4y"+, + 9y", and the results are presented
in the table on p. 276.
As can be seen in the table there is good agreement between the actual and
the theoretical error for the solution y, up to about x = 8. Here the discrepancy
between y, and 2; begins to be perceptible which indicates that a is now not
equal to 0. We also notice the characteristic alternating error which is fully
developed and completely dominating for x = 12. Still the computed values
y, are fairly meaningful. If an error of this type appears, the method is said
to be weakly stable. In particular it should be observed that for a given value
x we can obtain any accuracy by making a sufficiently small, and in this case,
s being equal to 0 initially, we have only to perform the calculations with suf-
ficient accuracy. The other solution shows strong oscillations already from the
beginning due to the fact that the initial value exp (-0.3) is interpreted as an
admixture of the unwanted solution 2, to the desired solution 2; . We find
_ (2, - e°k)(1 - ah/3)
2V F+ ash2/3
for a = 1 and h = 0.3, we get s = - 536 . 10-8. Since s = 0(h5) the error can
be brought down to a safe level by decreasing h; already a factor 4 gives a -
5 . 10-1 which brings us back to the first casey,. Again we stress the importance
of steering after the difference equation in cases when weak stability can occur.
This case has been discussed in considerable detail because it illustrates prac-
tically all phenomena which are of interest in this connection, and the discussion
of more general cases will be correspondingly facilitated. Again we choose the
equation y' = ay, y(O) = 1 but now a more general method will be applied:
aky"+k + ak-iy"+k-i + ... + aoy" = h(,Sk,f"+k + Qk-i.f"+k-1 + ... + Af.)

276 ORDINARY DIFFERENTIAL EQUATIONS SEC. 14.6.

x Error 108 E . 108 Error.108

o
0.3
1 0000 0000
7408 0832
-
-990 -1000
1 0000 0000
7408 1822
--
0.6 5487 9697 -1467 -1482 5487 9337 -1826 -1842
0.9 4065 5336 -1630 -1647 4065 6277 -689 -707
1.2 3011 7812 -1609 -1626 3011 7175 -2246 - 2263
1.5 2231 1525 -1491 -1506 2231 2527 -489 -505
1.8 1652 8564 -1325 -1339 1652 7679 -2210 -2223
2.1 1224 4497 -1146 -1157 1224 5638 -4 -16
2.4 907 0826 -969 -980 906 9687 -2108 -2118
2.7 671 9743 -808 -816 672 1091 +539 +531
3.0 497 8042 -665 -672 497 6620 -2087 -2093
5.4 45 1551 -107 -110 44 8346 -3313 -3314
5.7 33 4510 - 87 -86 33 8055 + 3458 +3456
6.0 24 7811 - 64 - 67 24 3899 -3976 -3976
6.3 18 3577 -53 -52 18 7900 +4269 +4267
6.6 13 5999 -38 -40 13 1227 -4810 -4808
8.1 3 0340 -14 -11 3 8183 +7829 +7826
8.4 22482 -5 -9 1 3820 -8667 -8664
8.7 1 6648 -11 -7 26215 +9557 +9552
9.0 1 2341 0 -5 1774 -10567 -10562
9.3 9133 -9 -4 20804 +11661 +11655
9.6 6776 +3 -3 -6114 -12886 -12879
12.0 628 +14
12.3 439 -16
12.6 354 +17
12.9 230 - 20
13.2 206 +21
13.5 113 -24
13.8 127 +26
14.1 46 - 29
14.4 83 +27
14.7 7 - 34
15.0 65 +35

Putting f = ay we get a difference equation with the characteristic equation
p(z) - aha(z) = 0 .
If h is sufficiently small the roots of this equation will be close to the roots z;
of the equation p(z) = 0. We assume the stability and consistency conditions
to be fulfilled, and hence it is known that 1z;j 5 1, p(1)- 0, and p'(1) = a(1).

SEC. 14.6. MULTI-STEP METHODS 277

Roots whose absolute values are < 1 can never cause any difficulties, and we
restrict ourselves to roots on the unit circle, all of which are known to be
simple; among them is also z, = 1. Applying Newton-Raphson's method to
the equation p(z) - aha(z) = 0 we find the root
z' = z + aha(z,) + 0(h2) .

We now introduce the notation 2, = a(z,)/(z,p'(z,)); for reasons which will
soon become clear these quantities are called growth parameters. We find
(z;)* = z,*(I + ai,h + 0(h2))* = z,*(exp(aA,h + c,h' + 0(h3))*
= z; exp(al,nh)(1 + c,h2exp(-aA,h) + O(h3))*
= z*, exp (2,ax)(1 + c,nh2) = z,* exp (2,ax)(1 + a,x) ,
where a, = c,h and x is supposed to be a fixed value. In particular we have
Q(1)
2,= = 1,
01)
and the desired solution is obtained in the form e'l(1 + ax). Concerning the
parasitic solutions we see at once that such roots z,, for which Jz,J < 1, will
normally not cause any trouble (however, see the discussion below). If on the
other hand Jz,J = I and Re (al,) > 0, we are confronted with exactly the same
difficulties as have just been described (weak stability). A method which cannot
give rise to weak stability is said to be strongly stable.
Here it should also be mentioned that the same complication as was dis-
cussed in connection with Runge-Kutta's method may appear with the present
methods. For suppose that a < 0 and that h is chosen large enough to make
Iz'I > 1 in spite of the fact that Iz,I < 1; then the term z',* will initiate a fast-
growing error which can be interpreted as an unwanted parasitic solution. As
has been mentioned earlier this phenomenon is called partial instability since it
can be eliminated by choosing a smaller value of h.
At last, we shall briefly mention one more difficulty. Suppose that we are
looking for a decreasing solution of a differential equation which has also an
increasing solution that should consequently be suppressed. Every error intro-
duced will be interpreted as an admixture of the unwanted solution, and it will
grow quickly and independently of the difference method we are using. This
phenomenon is called mathematical or inherent instability.
Finally we quote an example of strong instability which has been given by
Todd. He considers the equation y" = -y, and attempts a solution by My" =
82y - TA-281y + 1j53°y - . Truncating the series after two terms, he obtains
with h = 0.1:
0.01y = y*+, - 2y* f- y*-, - T'2(Y*+2 - 4Y*+, + 6y* - 4Y*-, + Y*-9)
= -0.oly* ,

278 ORDINARY DIFFERENTIAL EQUATIONS SEC. 14.6.

or
29.88y + 16y.- - Y.-,
As initial values, he takes 0, sin 0.1, sin 0.2, and sin 0.3, rounded to 10 and 5
decimals, respectively.

X sin x Error Y Error
(] 0 decimals) (5 decimals)

0 0 0 0
0.1 0.09983 0.09983
34166 34166 0.09983
0.2 0.19866 0.19866
93308 93308 0.19867
0.3 0.29552 0.29552
02067 02067 0.29552
0.4 0.38941 0.38941
83423 83685 262 0.38934 -8
0.5 0.47942 0.47942
55386 59960 4574 0.47819 -124
0.6 0.56464 0.56464
24734 90616 65882 0.54721 -1743
0.7 0.64421 76872 0.64430 99144 9 22272 0.40096 -24326
0.8 0.71735 609090.71864 22373 128 61464 -2.67357 -3.39093
0.9 0.78332 69096 0.80125 45441 1792 76345
1.0 0.84147 09848 1.09135 22239 24988 12391
1.1 0.89120 73601 4.37411 56871 3.48290 83270

The explanation is quite simple. The characteristic equation of the difference
equation is
r4 - 16r' + 29.88r2 - 16r + 1 =0,
with two real roots r, = 13.938247 and r, = 11r, = 0.07174504. The complex
roots can be written r3 = e'" and r, = e-'", where sin 6 = 0.09983347. Thus
the difference equation has the solution
y(n) = Ar; + Bri " + C cos nO + D sin nO ,
and for large values of n the term Ar; predominates. The desired solution in
our case is obtained if we put A = B = C - 0, D = 1; it becomes y(n) =
sin (n 0.1000000556) instead of sin (n 0.1). In practical computation round-
off errors can never be avoided, and such errors will be interpreted as ad-
mixtures of the three suppressed solutions; of these Arl represents a rapidly
increasing error. The quotient between the errors in y(1. I) and y(1.0) is
13.93825, which is very close to r,.
Hence the method used is very unstable, and the reason is that the differential
equation has been approximated by a difference equation in an unsuitable way.
Here it should be observed that a strong instability cannot be defeated by
making the interval length shorter, and as a rule such a step only makes the
situation still worse.

SEC. 14.8. METHODS BASED ON NUMERICAL INTEGRATION 279

14.7. Milne-Simpson's method
The general idea behind this method for solving the equation y' = f(x, y) is
first to perform an extrapolation by a coarse method, "predictor," to obtain an
approximate value and then to improve this value by a better formula,
"corrector." The predictor can be obtained from an open formula of Cotes type
(cf. 10.1) and we find
43 14
Y.+1 - Y.-a = (2Y;,- - Y;- + 2Y:) + 45
l4 yV
(14.7.1)

Neglecting the error term we get an approximate value of y.+, with which we
can compute y;,+, = f(x.+,, y.+1). Then an improved value of yr+, is determined
from the corrector formula
y.+, = Y.-I + 3 [Yn, + 4y; + Y;+J (14.7.2)

with the local truncation error - 1,hlyv(.,). If necessary, one can iterate several
times with the corrector formula.
As before we put y. = s. from which we get

Y: = f(x., Y.) = AX., Y(x. + E.)) = AX., Y(x.)) + s. a = Y'(x.) + S. ay .
Assuming that the derivative of/ay varies slowly we can approximately replace
it by a constant K. The corrector formula then gives the difference equation
Kh
+
3
(s, + 4s. + s.+,), (14.7.3)

which describes the error propagation of the method. As has been shown
previously, we have weak stability if of/ay < 0. If instead we use the predic-
tor formula

and then the corrector formula just once, it is easy to prove that we have strong
stability.

14.8. Methods based on numerical integration

By formally integrating the differential equation y' = f(x, y), we can transform
it to an integral equation (cf. Chapter 16)

Y(x) - y(a) = f(t, y(t)) dt .

The integrand contains the unknown function y(t), and, as a rule, the integration
cannot be performed. However, At, y(t)) can be replaced by an interpolation
polynomial P(t) taking the values f. = f(x,,, y.) for t = x. (we suppose that

280 ORDINARY DIFFERENTIAL EQUATIONS SEC. 14.8.

these values have already been computed). By choosing the limits in suitable
lattice points and prescribing that the graph of P(t) must pass through certain
points one can derive a whole series of interpolation formulas. It is then con-
venient to represent the interpolation polynomial by use of backward differences
(cf. Newton's backward formula), and we consider the polynomial of degree
gins:
P(s)=fp+Svf,+s(1..21)P'fp ...+s(s+ 1)qi(s+q-1)P°.fp
Evidently c(0) = f0; p( - 1) = fp_,; .... a(- q) - fp_Q. But s -- 0 corresponds
tot = xp, s = - 1 tot = xp_ and so on, and hence we must have s = (t - xp)/h.
As an example we derive Adams-Bashforth's method, suggested as early as
1883. Then we must put
f p+, p(t) dt h f' cp(s) ds = h
y,, - y0 = zp
=
0
E
,
c.V'.f,
0

where c, _ (-1) Sa ds. The first coefficients become

co=1 c,= 1)ds-
;
c2=S0s(s2 2'
' s(s + 1)(s + 2) ds _ 3 = 251
C3 = 0 6 8 c` 720

= 95 _ 19087 36799
c° 288 ' c6 60480 '
C7 = 12 9960 '
It is obvious that a special starting procedure is needed in this case.
If the integration is performed between xp_, and xp, we get Adams-Moulton's
method; the limits x,_, and xp,., gives Nystrom's method, while xp_, and xp give
Mine-Simpson's method which has already been treated in considerable detail.
It is easy to see that Nystrom's method also may give rise to weak stability.
Cowell-Numerov's method. Equations of the form y" = fix, y), that is, not
containing the first derivative, are accessible for a special technique. It has
been devised independently by several authors: Cowell, Crommelin, Numerov,
Sti rmer, Milne, Manning, and Millman. We start from the operator formula
(7.2.12):
32
UY
_
_1+ &
12
- 240
8'
+
... .

Hence we have

+b2- 40+...lh-yn=h2(1+52240+...)f,
or

y.+, - 2y,. + y,.-1 = 2 (f.+1 + l0f. ,, _f _1) yV' + 0(h°) ,
240

SEC. U' = h'D4 and y" Neglecting terms of order h6 and higher. (14.h' + 0(h")] . \\ 1 . In general. (14.c.806 + 0(h7))3 = exp (. + .. = exp(anh)(1 + 480exp(-ah) + 480exp(-ah) +. = exp (.8. = i2(T+ + 10f.exp(ah) + a6h5 a"h" 480 + 480 + (14. 14.8. y) = y g(x) we get the following explicit formula: R.-iy. METHODS BASED ON NUMERICAL INTEGRATION 281 since S' .)* = exp (ax) (1 + 480 + 0(h7))~ = exp (ax)[1 + a.h" + .(h1/12)g and 6 = 2 + (5h2/6)g.h' + 0(h6)] .3) a"h" a"h" 2.-.a2h2/12).8.h' + c. where a.+1 .2y. Let us discuss the equation y" = aty leading to the following difference equation for y: ( a'h2l 10a4hYl ( azhY l 2 \1 . 12 12 12 ° with the characteristic equation 2(1+5a'hY/12)2+1=0. and after series expansion we find _ ' . +y.. This means that the total truncation error can be written s(x) ..ax)[ 1 .y./. = a1x/480 and nh = x. In a similar way we find exp (.ah) - 480 + 480 Hence we have 2.+1 where a = 1 . An error analysis in the general case gives the same result.a..ax) (1 . . In a special example we shall investigate the total error of the method.1) with a local truncation error 0(h6). .) .8. the formula is implicit.2) a.a.a2h2/ 12 The roots are {1 + 5a2h2/12 ± ah(1 + a2h2/6)'1Y}/(1 . but in the special case when f(x. . we get the formula Y /' y.

We start the computation with a special technique (for example. + hzfn. g. the following scheme can be applied as before: a 1 . 12 Zn+1 .8. . 3. usually by iteration. 14.8. For initial value problems a summed form of the method should be preferred. .4) s fn+1 12 . . Z.-Zo S. series expansion) for obtaining yl.(h2/l2)fn and s for the expression within brackets we obtain: Y yn n. = Zn_1 + Sn-.12 fn+ = yn - 12 fn + {(yn ._1 + fn Y Zn+1 Z. as well as the first two values of y and ay. Yn+1 . 12 12f"-' + With the notation zn = yn . In the special case f(x. .S. .12 g. (c) Y/ Then (b) and (c) express the method while (d) is the definition of sn.RS. . (a) Zn . (14. + S.. n = 1. and S are known. (b) Z. First.. One step in the computation comprises the following calculations: PQ . Adding (c) and (d) we find sn = sn_. R=2+ Sh-g.. S V. 2. a. is solved.. + S.282 ORDINARY DIFFERENTIAL EQUATIONS SEC.z*+1 From the last relation yn+. the formula is rewritten as follows: h2 h= _ h fn Y . T . . all values of x. y) = y g(x). and then we get zo=yo ----12 z1=y1-h12'+ So=Z. x y g a $ ay R P Q V T S When we start.

(-1)^'`1(1-t)}\-t)+(t)}dt..k) . for example.+1 . m m One finds a.5) Instead of f(z) we then introduce a suitable interpolation polynomial of degree q through the points x. putting f(x.-1 + y.= 1. however. + y.y(x) _ -ky'(x) + f _k (x . If we choose x = x...a.z)[f(z) +f(2x .a4= !.. Using the formula for n 2 and perform- ing the transformation x + ki = z. x. .k becomes y(x . Lagrange's remainder term. Cowell-Numerov's method can be interpreted as one of a whole family of methods designed for the equation y" = f(x. .SEC.-1 = h2TM-o E where a.z)y"(z) dz .x. be taken into account: (1) The first step needs special attention.=0. We start from the following Taylor expansion: ycm'(x) Y1A1(x } kt) dt Y(x + k) = y(x) + E m! Jo k (n . 1) -1 . y). we get x+k y(x + k) ..2y. we get Stormer's method: 0 y. y(x)) = f(x). k.=4e.2y.a.o"f. and q we are now able to derive a whole family of methods.. METHODS BASED ON NUMERICAL INTEGRATION 283 As is easily inferred... The following disadvantages must. and by different choices of x. . x.I)'" Jot{(') + (2m t)} dt . _ (. series expansion. If instead we choose x = xn_.j (x + k .z.+l.%.. 14. Just as was the case with Milne-Simpson's method for the equation y' = f(x.8.k .z)] dz .2y(x) + y(x . and adding the relations for k and -k we obtain z+k y(x + k) . that is. The same relation with k replaced by . y).z)y"(z) dz . where b.a. with q > 2 we again find Cowell-Numerov's method: Q Y._. .a.=T12 .k) = (x + k .._ .y(x) = ky'(x) -+. Replacing z by 2x .. and x + k = x..-.. and x + k .8. the method is extremely fast.. = h2-ot b. The first mean value theorem of integral calculus transforms the integral to k*yl"'(x + Ok) Jo ((n . (14.=T1z.t1)r 7 dt = nI y( (x + bk) .. (3) The derivative is not obtained during the calculation. (2) Interval changes are somewhat difficult.

. for example. we shall also briefly discuss the stability problems for equations of the form y" = f(x.1) dt Here A(t) is a square matrix whose elements are functions of t. b. In Todd's example we have p(z) = z' . b. 14. that Cowell-Numerov's method as just discussed is of order 4.7) with the simple meaning that if the integration formula is developed in powers of h we must have identity in terms of the orders h°. In a similar way we derive the consistency conditions p(l) = 0. Again. and h2. Our method can now be written p(E)y. we define the order of the method: if the local truncation error is O(hp' 2).9. then the order of the method is said to be p. 1. The proof is conducted by discussing the problem y" = 0. a(z) = T'2(zl + 10z + 1) and hence both the stability and the consistency conditions are satisfied. Systems of first-order linear differential equations We denote the independent variable with t. and 7 . and hence the whole system can be written in the compact form dx = A(t)x . (14. p"(1) = 2a(l). (14. Henrici [7]).J 5 1. Now we assume . y). we use the same notations p(z) and a(z) as for the equation y' = f(x. 7 + 4 VT. The following values are obtained: b° = 1. -_ -2Sb. and further for the roots on the unit circle the multiplicity is not greater than 2...8. = h2a(E)f. . for example.. . y(O) = y'(0) = 0.. and since 7 + 4VT> I the stability condition is not fulfilled. First. For Cowell- Numerov's method we have Ip(z)=z2-2z+ I.9. y). 14. with the exact solution y(x) = 0 (for closer details. = 0.284 ORDINARY DIFFERENTIAL EQUATIONS SEC. .16z 1 and a(z) = -12. x. h'.4VT. and the n dependent variables with x1.8. They will be considered as components of a vector x.9. 1. the consistency con- ditions are satisfied since p(1) = p'(1) = 0 and p"(1) -24 = 2a(1) . 21-V b. p'(1) = 0. A necessary condition for the convergence of the multistep method defined by (14. . On the other hand. b. x .6) The stability condition can then be formulated as follows. = T121 b. This means.6) is that all zeros z. f The roots of p(z) = 0 are 1.16z' + 30z2 . of p(z) are such that (z. Finally. (14..8. see..

. +k = 0. In the remainder of this section we shall concentrate on a very important special case. First we choose a value of t. of course. . since its terms are less than the corresponding terms in the exponential series.xk) converges uniformly. we have to work with norms instead of absolute values: IXk+1 . Now let M = sup05rST IIA(r)II. 1.xkl converges uniformly.0. Thus the series Ek-o Ixk+.Xkl M Ixk . Ixk . 14.SEC. 0 (14. and we get lXk+I . xk x when k oo. FIRST-ORDER LINEAR DIFFERENTIAL EQUATIONS 285 that A(t) is continuous for 0 < t < T. which on differentiation gives dx/dt = A(t)x. 0 Since xk are vectors. (14. Then it is also clear that the series E.. . Then the system has the simple form dx=Ax. The proof is essentially the same as that given in Section 14. the case when A is constant.x. and hence lxk+1-Xkl <Icl ((k+)kl)I .. {Xkl=C±A(V)XkdV.xk_1) dT . c is.9.9. a constant vector. (14. Forming xo=c.=o (xk+1 . .Xk = A(r)(Xk .xol <_ So IIA(r)II . 1.xk-. x(0)=xo. 2.9.2. .Xkl < 0 IIA(r)II . For the limit vector x. k = 0. which is so small that etq can easily be computed.9. kZ1. we have x=c+ A(r)xdv.l dr . that is.9. Then we use the properties of the exponential function to compute e^'tA = eTA. The uniqueness is proved in complete analogy to the one-dimensional case. namely.1) with the initial condition x(0) = c has a unique solution in the interval 0 < t < T. 1x01 dr Mlclt.4) Surprisingly enough.2) we get directly Xk+1 . . Then we shall prove that the system (14. this form is well suited for numerical work.3) dt We can write down the solution directly: x = elAx0 .. where ..-II dr But Ix.

with a. a. If the problem is changed in such a way that A is singular. where x x . and we have directly one sufficient condition: for all eigenvalues we must have Re(21)<0. (dx. y) has. T = mt is the desired interval length. either A = 0 or Re (2) < 0.. A= a.9. 4a..5) Hence for every eigenvalue. 14... that is. Hence a'A = Se°DS-'.ldt) = 0.6).. we have still a stationary solution which is not 0. After this has been done. one step in the computation consists of multiplying the solution vector by the matrix eTA. as. . The integral . . and very soon they will predominate completely. It is of special interest to examine under what conditions the solution is bounded also for t oo. Every disturbance can be understood as an admixture of the components which correspond to eigenvalues with positive real parts. as before.10. as solution.. a function of the form F(x. This implies that E". but it is unstable. 14. for example. = constant (for example 1).10. Problems of this kind are common in biology. we have . Often the matrix has the following form: a as . which means that the sum of the elements in each column of A is zero..x. the contribution from those eigenvalues which have a negative real part will vanish. and at least one eigenvalue is zero. ds/dt = 0 or Ax = 0. Boundary value problems A first-order differential equation y' = f(x. When t -+ oo.=.. and hence F. a.. where C is an arbitrary constant. This can also be inferred from the fact that the final state must be stationary. In this case there exists a regular matrix S such that S-'AS = D.. and the limiting value will be the eigenvector belonging to the eigenvalue 2 = 0.an ... in general.. C) = 0. all . where D is a diagonal matrix.286 ORDINARY DIFFERENTIAL EQUATIONS sEc.s * tsi or 12+a551 Sa55. Thus the matrix A is singular. We then restrict ourselves to the case when all eigen- values are different. x (the components of the vector x) represent. From the estimate (3.. but at least some eigenvalue has a positive real part. + a51 E t=t a. a.y. the amounts of a certain substance at different places in the body._.. (14.3. f Z 0 and a.

14. which is nothing but linear interpolation. From y= -x+C(x+ 12+ 12x42 and x = y = 1. we obtain z - x + x' + 12 x' 12. Often the boundary conditions have the form (ay..x.10.42 + x1° 12-42-90 + x.. Usually we specialize by assigning a function value y = y. Later we shall consider only equations of the second order. the methods can be generalized directly to equations of higher order. C 1 + 1/12 + 1/504 + .8. and observe the error at the other boundary point. Using Picard's method. (14.5. 0) and (1.1. If this is not the case.y. which should correspond to the abscissa x = x0.. and we are solving what is called an initial-value problem.3 12. where z = 0 at the origin. (14. For equations of second and higher order.. and in this way we obtain a boundary value problem.k'=b. conveniently. . BOUNDARY VALUE PROBLEMS 287 curves form a one-parameter family. The procedure is repeated.+. SEC. we can use a trial-and-error technique. we get _ 2 = 1. we use Regula falsi on the error at the right boundary point.8x" + 2.+ by.6'y. In our case n = 2..5x? 7! + 2.. we can specialize by con- ditions at several points. Then the inte- gration can begin at this point.2) Even under these circumstances the described technique. we guess a value of the derivative at the left boundary point. and accordingly we put y = Cz . Find a solution of the equation y" = xy + x3 passing through the points (0.10.. 1) is also fulfilled.' =c..1. 1). can be used so long as the equation is linear.90. as will be demon- strated with an example. For linear equations a direct technique is often successful.84274.. where a special curve corresponds to a special choice of the constant C.1 In nth-order equations such conditions can be imposed in n points: w-1 k=O a. 156 + =x.11x73 10! 13! + The boundary condition in the origin is satisfied. i=0.42.5.1) ay.n. We see that y = -x is a particular solu- tion. perform the integra- tion. and in this way we can improve the initial value. and we have only to choose C in such a way that the boundary condition in point (1.=r .10. + 2x' 4! + 2.

that is. Approxi- mating y" by a'y/h= and choosing h = 0. -Ys + 2.096 . The system becomes 2. with the boundary conditions y = 0 for x = 0 and y = 2 for x = 1. y(0. chosen . Ya = 0. and y(0. We can also approximate the derivatives of the equation by suitable difference expressions. -y. = 0. we consider the equation y"-(1+2/x)Y+x+2=0. = 1. The differential equation has the solution y = A cos ax + Bsinax. sin a = 0.04 + 0. = 0. then B can be.0022).1733y.44y.=0.04(x + 2) = 0 .11. we obtain (2. We find y. If sin a 0 we have B = 0. y. It is characteristic that the resulting matrix is a band matrix. y. -Yi + 2.2) = y.104.0030 ( 1. y. The complete system now has the form Ay=b . This results in a linear system of equations with four unknowns which are de- noted as follows: y(0. Y(0)=Y(l)=0. . . . = 1. + y*-.6195)..2899) . y.24y. = 0. and hence we obtain a linear system of equations. also including a few points outside the interval. Finally.288 ORDINARY DIFFERENTIAL EQUATIONS SEC. As an example.. 14. in good agreement with the exact solution y = x(e'-' + 1).2902 (exact 0. From y(0) = 0 we get A = 0. y. the only possible solution is the trivial one y(x) 0.11.112 . that is.088.8) = y. The method describ- ed here can be improved by use of better approximations for the derivatives.4550).4) = y y(O. the obtained values can be refined by the usual iteration technique.Cy. = 2. 14.6202 ( 0. since we have to compute higher differences.6) = y. where C contains higher differences. on the other hand.08/x)y.4556 ( 1. + 2.14y. while y(1) = 0 gives B sin a = 0. If. + 0. In the beginning we neglect Cy and com- pute a first approximation. Elgenvalue problems Consider the following boundary value problem: Y"+a2y=0. if a = nir where n is an integer.2.

But there is an interesting method by which we can advance via functions fulfilling both boundary conditions and simultaneously try to improve the fit of the differential equation. only the lowest eigenvalues can be obtained in this way. R0Y(b) + fl1y'(b) = 0 . and p are real functions of x. we obtain two integration constants which can be deter- mined from the boundary conditions. 14. If the interval between a and b is divided into equal parts. and hence our eigenvalue problem has been transformed into an algebraic eigenvalue problem.1) where p.+. . (for example.11. The eigenvalue problems play an important role in modern physics.-. that is. we have only the trivial solution zero. However. we have an infinity of nontrivial solutions. If the coefficient matrix is regular. = 0 .+j . 2% is obtained from the condition S'a y. A differential equation with boundary values corresponds exactly to a linear system of equations. where A is a band matrix and y is a column vector. further. the eigenvalues represent quantities which can be measured experimen- tally. we obtain 9 (y*-1 . Nontrivial solutions exist if det (A . By the trial-and-error technique discussed above. q. and the situation discussed here corresponds to the case in which the right-hand side is equal to 0. (14._. or y(a) = y(b). These special values a° = n=ir2 are called eigenvalues and the corre- sponding solutions eigenfunctions. EIGENVALUE PROBLEMS 289 arbitrarily. and as a rule. ( case we compute y* iteratively from In the Sturm-Liouviille 2%p(x)y.2y. and the derivatives are approximated by difference expressions. This can obviously be written in the form (A-AI)y=0.qY. The problem of solving this equation with regard to the boundary conditions a0Y(a) + 0.SEC. + Ap.Y. Usually the differential equation in question is written dx(pdx)-qy+2pY=o.Y.) . or from some similar condition.) + 2h (y. if the determinant is equal to 0. but if the matrix is singular. + y. we approach the solution via functions fulfilling the differential equation and one of the boundary con- ditions.' dx 1. p(a)y(a) = p(b)y'(b) is called Sturm-Liouville's problem. (14.11.11.21) = 0. energies).2) dx \p(x) ddx When integrating.

5625 4 - 1 0.8x7 + x8)/1385 4 (79360x . 2.507656 3. 3.5008895 3. = k(8x . 14. n Y.50000847 0.1422 2. Find the lowest value of a' and the corresponding solution.459 3 0. Then we form y y3.4.(2) = 0 gives the eigenvalues a = mir/2.' = . 3).506173 0.4664 4 0.32640x3 + 4032x° .. m = 1. m = 1.1475 2.20 2.11.4x3 + x')/5 2 (96x . Hence we obtain y. 2y'(0) al 0 0. the following relations are valid: 0.. y.896x3 + 112x' .5. = and a2 = 4 = 2.40x3 + 6x8 -.4x3 + x'). The analytic solution is.(1) = 1..x18)/50521 n ly.. 2y'(0) = it . and we therefore get y.5. y = ai (d + cx 33 + 2) y. y = sin ax.x°)/61 3 (2176x .50010058 3.(2x . [y(2)]2 y\31=0.4674011. 0 2x-x= 1 (8x .(1) = I gives ai = 2. and y. y(o)=y(2)=0.. = 2x .x2 fulfilling ya(0) = yo(2) = 0. y.(O) = 0 gives d = 0. We choosey.240x' + 10x9 .x2) ..(D]. of course.500686 0. For the exact solution corresponding to m = 1. y. ac- cording to The integration constants are obtained from the boundary conditions and a from the condition y (1) = 1.5000762 0.40 2 0. The result is given in the following table.14166 2.46729 . and the condition y.(2) = 0 gives c = J. . evi- dently..500011248 3.290 ORDINARY DIFFERENTIAL EQUATIONS SEC. . In our case we have. The procedure is then repeated. 05x52.55556 0. EXAMPLE y"+aly =0.a.

Initial values: (2. when h = 7r/6 and k = 1. 3. (Nov.2)1.xy for x = 0. Dahlquist: Stability and Error Bounds in the Numerical Integration of Ordinary Differ- ential Equations (Diss. Solve the differential equation y' = x . [3] R.2(0. Initial value: x = 0. Ed. 0.5) when y(0) = I andy"(0) = 2 (4 decimals). Solve the differential equation y' = 1l(x + y) for x = 0. 1. Find an algebraic expression for A(h). When x increases from 0 to .y)l(l +y2) for x = 2. The quantity E is to be determined to . 1957). A. If Runge-Kutta's method is applied to the differential equation y' = -y. 0. Initial values: x = 0. The differential equation y" = -y with initial conditionsy(0) = 0 andy(h) = k is solved by Numerov's method. EXERCISES 1. Karim: Comp. 5. y= 1. Prager. [10] A. [6] L. Solve the differential equation xy" . 1962). Solve the differential equation y" . y' = 1. 1965).2)3 by use of Cowell-Numerov's method.5 and x = 1 (h = 0. 0.1. 6.1. Gottingen.5)2 by using Runge- Kutta's method. Collatz: Numerische Behandlungvon Differentialgleichungen (Berlin. Milne: Numerical Solution of Differential Equations (Wiley. we obtain after n steps the value y" = y(nh) = [A(h)]" as an approximation of the exact solution e". [8] P. Heidel- berg. 1951).e-** to four significant figures for n = 100 and h = 0.5. E.: Numerical Solution of Ordinary and Partial Differential Equations (Oxford. 8.yy' = 0 by Runge-Kutta's method for x = 0.y2 by series expansion for x = 0. Solve the differential equation y" _ (x . Find the explicit solution in the simplest possible form. 2. Sketch the function graphically and read off the minimum point. y = 1. [7] L.8). 1961). and 5. I. [2] L. Then compute y.4(0. 1953). with y(O) = 1 and the interval h. Fox: The Numerical Solution of Two-Point Boundary Problems in Ordinary Differential Equations (Oxford. Fox.EXERCISES 291 REFERENCES [1] W. 4. [5] Modern Computing Methods (London. 1953). New York. 2.e -A with less than 1% relative error for h = 0. Babuska. New York. cx2+d 7. 308. y increases from 0 to oo. and E. compute y" . Henrici: Discrete Variable Methods in Ordinary Differential Equations (New York 1962). y = 0. [4] G. 1966) p.5 and x = I by use of series expansion and Runge-Kutta's method. Also find the coordinates of the minimum point graphically. [9] J. Initial value: x = 0.5(0. Uppsala. Bellman: Stability Theory of Differential Equations (McGraw-Hill. Then compare with the exact solution which is of the form axe + b y. The differential equation y' = 1 + x2y2 with y = 0 for x = 0 is given. 1958). Further.2. Jour.2. 1) and (2. and compute A(h) and A(h) . M. Vithsek: Numerical Processes in Differential Equations (Prag.

25. 1 with boundary values y(0) = 0.3 by use of. Show that y2 + z2 = l/. Runge-Kutta's method. 13. Find z(0) and y(j) to 3 decimals. and compute z. N' = 8. y'(0) = 1. we extrapolate to z = 0. and the differential equation for z is integrated with h = 0. Show that pq = I and find p and q to five decimals. Investigate the stability properties of the method y"+2 . (b) by performing two steps in the iteration y'+. What is the largest value of the interval length h leaving all solutions of the Cowell-Numerov difference equation corresponding toy" + y = 0 finite when x --> «. Given the system of differential equations xy'+Z +ky=0. Find the solution of the differential equation y" + x2y = 0 with the boundary conditions y(0) = 0. iN . and Q. y1.y2)/(x2 + y2) is given.y" + 8 (y' + 3y' . Find the largest possible interval length h such that no instability will occur when the equation is solved by use of the formulas y"+. 17. Show that limN. Then we introduce z = l/y. It has solutions which asymptotically approach a certain line y = px which is also a solution of the equation.8).+2 + y./r+ xi and find y(j) and z(1) to five decimals. Also show that y2 + z2 + 2y'z' is independent of x. = yr + by%' . The distance between the Nth and the (N + 1)-zero is denoted by zN. The following system is given Y"-2xy'+z=0. when applied to the equation y' = -y. z = 1. are polynomials in x.+2). 0. yo = X. First we computey(0. +. using Picard's method. 9. and z '(J) = -1. The differential equation y'=axy+b is given. z(j) = 0. lxi -y'+p = 0. The differential equation y" + xy = 0 is given. 11. = y" + h y w . The differential equation y' = (x2 . 15.P"+. .r2/125. 12. using a series expansion. For large values of x the solution y behaves like a damped oscillation with decreasing wavelength. When z is sufficiently small. with the initial conditions x = 0. 14. Show that z.? 16. . Another line y = qx is perpendicular to all integral curves.50. y(l) = l at the points 0.Q" is a constant which depends on n.. where P. 10. Then we can write y'"'=P"y+Q". + 3y.. 292 ORDINARY DIFFERENTIAL EQUATIONS three places in the following way. = P"Q". and 0. y = 0.. for example. = -x2y". The differential equation y" + ay' + by = 0 is given with 0 < a < 2/T.75: (a) by approximating the differential equation with a second-order difference equation. lz"+2xi +y = 0.

(e). The equation is approximated by a system of difference equations. A certain eigenvalue 2 of the differential equation y" . Find the smallest eigenvalue of the differential equation xy" + y' + Axy = 0.. Find the smallest eigenvalue 2 by approximating the second derivative with the second difference for x = 1.. then with h = 1. Determine the lowest eigenvalue by assuming y=a(x2.1)+b(x2.2)x". The smallest eigenvalue. y(l) = 0. Chapter 18).(e)Jo(2) = 0. 23. by Cowell-Numerov's method and. The differential equation (I + x)y" + y' + 2(l + x)y = 0 is given together with the boundary conditions y'(0) = 0. . The differential equation y" + 2xy = 0 is given. say. It is known that Jo(e) _ -J.x2Xa + bx2 + cx') which satisfies the boundary conditions. Is = -1 (since only the zeros are of interest we can choosey(j) arbitrary. Determine a. 22. is associated with the eigenfunction y(x) = E:-o a (. The differential equation y" + x2(y + 1) = 0 is given together with the boundary values y(l) = y(-1) = 0. = A-'a + A-'Dz. 1.1)+c(x'. Use h = and solve the linear system Az = a + Dz iteratively by the formula zw+.Si+.(e) and Y0(£) _ -Y. The independent variable is transformed by x = at and the equation takes the form y" + t2y = 0. is a matrix B with elements bik = n + I .(d) and also give the lowest eigenvalue with two correct decimals.k otherwise. x = 1. Then show that the eigenvalues A can be obtained from 2 = 62 where £ is a root of the equation Ji(OYo(4) .max (i.EXERCISES 293 18. aik = 2aik .. 26. 24. together with the boundary con- ditions y(O) = 0.. y(l) = 0. and x = 1. Compute 2 rounded to the nearest integer by determining the first zero.k . Use the points x = 0. and x = 1 (3 significant figures).Y. 19. The differential equation y" + (1 /x)y' + 2y = 0 is given together with the bound- ary conditions y'(0) = 0. first with h = 1. for example. for instance equal to 1).9 of the independent variable. Find the smallest positive eigenvalue of the differential equation y°(x) = Ay(x) with the boundary conditions y(O) = y'(0) = y"(1) = y"'(1) = 0. y(0) = y(l) = 0.3i-.2xy' + 22y = 0. y(l) = 0. x = . The inverse of an n x n-matrix A with elements a = 1. and 1. Find approximate values of y(0) and y(j) by assuming y = (1 . with the boundary conditions y(0) = y(l) = 0..1) which satisfies the boundary conditions. the equation can be brought to a Bessel equation of order 0 (cf. 20. of the differential equation y" + 2x2y = 0 with bound- ary conditions y(O) = y(l) = 0 can be determined as follows.. This equation can be solved. and finally Richardson extrap- olation is performed to Is = 0 (the error is proportional to h2). Show that by a suitable transformation x = at + . Determine y(0) of Exercise 18 approximating by a difference equation of second order. k). 25. 21. Use the points x = 0.

dp = a (au)dx+ (auldy-rdx+sdy. a. a2u . VOLTAIRE.u P axay ax q ay ax2 ayZ Then we have the relations: du= dx+Lydy=pdx+qdy.0. the theory of electromagnetism (Maxwell's equations).Chapter 1 5 Partial differential equations Le secret d'ennuyer eat celui de tout dire. ax (ay ay lay With the notations introduced. which dominate in the applications. which may be of quite a different character. When treating ordinary differential equations numerically it is customary to start in a certain point where also a number of derivatives are known. Here we will restrict ourselves to second-order partial differential equations. 15. (15. of the form z aau +2baxay au +cau axe ayz =e. We shall assume that the equations are linear in the second derivatives.1) where a. au/ax. in hydrodynamics. We introduce the conventional notations: _ au du . ox ax ay ax) dq = a (au)dx + a (au)dy = sdx + tdy. t. For a 294 . the differential equation itself can be written ar+2bs+ct-e. b. and e are functions of x.0. the theory o elasticity. Classification Partial differential equations and systems of such equations appear in the de- scription of physical processes. c. s. and au/ay. u. Such equations are often obtained when systems of the kind described above are specialized and simplified in different ways. for example. y. r. and quantum mechanics. The solutions of the equations describe possible physical reactions that have to be fixed through boundary conditions. that is. d2u .

0. and q) on a characteristic. This equation has a solution consisting of two directions y'/x' = dy/dx or rather a field of directions assigning two directions to every point in the plane. Since u. supersonic at others.2) ar+2bs+ct =e. the flow can be subsonic at some places. and introducing the notations x' = dx/dr and so on. 295 SEC. With this background we first encounter the problem of determining r.2) unless one of the following three relations is fulfilled: p' x' y' p' x' 0 p' y' 0 q' 0 x' =0. p = p(r). at least for "open" problems for which the boundary conditions do not prevent this. y.ac .0. CLASSIFICATION second-order equation it is usually sufficient to know yo = y(xo) and y.ac < 0 (elliptic equation). In the hyperbolic case we have an open domain bounded by the x-axis between x 0 and x = a. For a second-order partial differential equation it would be natural to conceive a similar procedure.0. higher derivatives are then obtained from the differential equation by successive dif- ferentiation.3) It is rather surprising that the most interesting situation appears in the excep- tional case D = 0. If D = 0 there is no solution of (15. These directions then define two families of curves which are called characteristics.2bx'y' + cx'2. (15. Then it would be reason- able to replace the starting point by an initial curve along which we assume the values of u. We sup- pose that the equation of the curve is given in parameter form: y = y(r) X = x(T) . q' x' y' -0. 0 y' . They are real if b2 . s. we can simply write u = u(r). e a 2b e a c e 2b c (Incidentally. We find directly D = ay'2 . 15. p. and q are known along the curve. (15. s.ac > 0 (hyperbolic equation) or if b2 . provided that the coefficient determinant D # 0. u. The following description gives the typical features of equations belonging to these three kinds. we find x'r + y's p' x's + y't = q' . p. an equation can be elliptic in one domain. q. they are equivalent if D = 0.0.) This means that one cannot pre- scribe arbitrary initial values (x. From this system we can determine r. As is easily understood. A well-known example is gas flow at high velocities. hyperbolic in another. p. and t in an arbitrary point of the curve. and q given.0 (parabolic equation) but imaginary if b2 . as well as additional conditions. and . and t as functions of r..0. and q = q(r)..

u(0.296 PARTIAL DIFFERENTIAL EQUATIONS SEC. are connected with oscillating systems (example: the wave equation). 0) = f(x) . Finally.2) The solution contains two arbitrary functions jr and g. Often the initial conditions have the form u(x. which describes a standing wave with u(0. y) and u(a. together with the equation. 0) is given as an initial condition. the phenomenon can be realized by a vibrating string stretched between the points x = 0 and x = 1.3) 2 2c =-a g(e) de Also for the two. Physically. Normally. which appears in quantum mechanics. for example. (15.1. On the portion of the x-axis between 0 and a. Elliptic equations usually are associated with equilibrium states and especially with potential problems of all kinds.u 1 a2u (15.1. setting the propagation velocity equal to c: a. and parabolic equations are connected with some kind of diffusion.and three-dimensional wave equation. these conditions determine u in all interior points.ct) + I =+ee u(x. in particular. we choose f(z) = -I cos irz and g(z) = j cos 7rz.1. If. we get u = sin trx sin trct. (15.1) ax2 c2 at2 In this case we can easily write down the general solution: u=f(x+ct)+g(x-ct). explicit solutions can be given (see. . On each of the vertical lines a boundary condition of the form au + $(au/ax) = 7 is given. Hyperbolic equations We shall first give an explicit example. as a rule. the lines x = 0 and x = a for y > 0. In the parabolic case we have the same open domain.1. in the elliptic case we have a closed curve on which u or the normal derivative au/an (or a linear combination of both) is given.1. 15. and we choose the wave equation in one dimension. but on the por- tion of the x-axis only u(x. at (x. 0) = 0. 0) = g(x) and then we easily find the general solution t) = fix + ct) + fix . the functions u(x. t) = u(1. 0) and au/ay are given as initial conditions. 15. Its physical meaning is quite simple: u can be interpreted as the superposition of two waves traveling in opposite directions. [3]). y) are also given as boundary conditions. Hyperbolic equations. t) = 0 and u(x. An interesting special case is the Schrodinger equa- tion.

in general. where the equation is hyperbolic.SEC. and q). the characteristics are obtained from the equation ay'2 -2bx'y'+cx'2=0 or dy -b± b-ac W X_ these two values will be denoted by f and g (they are functions of x. Further. consider an arc AB which is not characteristic and on which u. y = constant.1). 15. The characteristics have the property that if we know u along two intersecting characteristics. Then we have -p(x) + sb(yo) = F(x) q(xo) + SG(y) = G(y) where both F(x) and G(y) are known functions. impos- sible to give explicit solutions. and then we shall also treat the corresponding stability problems. QR one of g-type.y+ 1)+u(x. Now suppose that u is known on the line x = xo. we get the equation a2u/ax2 .1 q (xa) + (p(yo) = j[F(xo) + G(ya)] . it is. . However. This procedure will be exemplified later. First. if the coef- ficients of the equation do not have this simple form. u.y. Let PR be a characteristic off-type.4) has exactly the same solution.1.y)=u(x. p. y. we take a simple example. we obtain Figure 15. that is. HYPERBOLIC EQUATIONS 297 Setting y = ct in (15. Adding and putting x = xo and y = yo. This equation is obtained if we approximate the differential equation by second-order differences. then we also know u in the whole domain.l. and instead we must turn to numerical methods. where F(xo) = G(yo).1.1. A very general method is to replace the differential equation with a difference equation. and q are given.alu/ay' = 0 with the explicit solution u = f(x + y) + g(x . p. In this special case we have the general solution u = q(x) + (P(y). as well as on the line y = yo.y).y)+u(x. The equation a2u/axay = 0 gives the equa- tion for the characteristics dxdy = 0. and - u = p(x) + cb(y) F(x) + G(y) j[F(xo) + G(ya)] In general. x = constant. It is interesting to observe that the partial difference equation u(x+ l.1) (15. At present we shall consider a method which is special for hyperbolic equations and which makes use of the properties of the characteristics.

Then it is possible to compute u3 from u3=u. = f. au (x. ay= with the initial conditions u(x. It is characteristic that we can only obtain values within a domain ABC of triangular shape and with AB as "base. is not a characteristic. Along PR this relation is approximated by e. dx is replaced by x3 .Y.1.x.). Further. Y3 .)f.(q3-q2)=0.P. p3 and q.2.)' g.)(q. Hence we get u3 =u. 15." We shall now in an example demonstrate how it is possible to apply an ordinary series expansion technique. Y (Note that the straight line y = 0.(qs . further.x. by J(g2 + g3). + q3).a.). and along QR by e. . 0.+P3)+J(y3-Y.)g3-a2(P3-p.Y.ap'y' . Q = (x y. can easily be computed. and y3 are now approximately known.(P3 .)f. and so on. R = (x3. When the value in R has been obtained.q1) = 0.(x3 . and u3 can be obtained in the same way as before.(x3 . a.)g.cq'x' = 0 or eaXdx-aaxdp-cdq=0. let P = (x y.+q3) From the known approximations of u3.x. y.)(P. and dy by y3 . Since x. q3.). replaced by (f1 + f.) . where the initial conditions are given. can be solved from this system. 0) = x1 .(x3-x.+J(x3-x. In first approximation we have Y3 . + a3).+audx+audy.-c. 4 p3) and au/ay by J(q. Then improved values for x3.x3) From this system x3 and y. The second of the three equivalent determinantal equations can be written ex'y' . but with f. we can compute f3 and g3. Again we start from the wave equation a=u azu ax. by J(a. ay where we approximate au/ax by J(p.1) starting from . we can proceed to S.c.. y. and from R and S we can reach T. 0) = e_a. and q3.298 PARTIAL DIFFERENTIAL EQUATIONS SEC. .) We are then going to compute u(0.y. p. = g3(x3 . p.

.o. HYPERBOLIC EQUATIONS 299 the origin.axay= . it is of course also possible to approximate a hyperbolic differential equation with a difference equation. We then finally obtain the desired value through u(x. = 0. + yDl. 0) e-z .004 + 0.y)` + 1 e_1 dt = x2 + y2 + e.4 .2ur.02) 2 + 6 (3 .1.1-1 = Ur+. Differentiating the initial conditions. 0.SEC.1.)u(0.1 gives the value u = 0. . (15.04+0.0002) + . The exact solution is =+v u (x + y)=2. + Ur. 0) =0.+. 0) = 2 P and then successively: agu a3u ax3 . axay= ay3 (x. y = 0. and so on. .2u.0008 ... y) = exp (xD.0.04-0. As mentioned before.0. we obtain agu (x.001) + 24 (.4 . 0) = 0 .13201. a3u 0) = e_z ..1 + (2.2. axay a3u (x. 0.5) k2 h= .# . 0) = e- axay From the differential equation we get a (x.sinh y . From the wave equation a3u agu = 01 axay= we get the difference equation ur.. a3" axay - a3u =0 43 and hence a3u (x.9 + ur_1. 15.(x . 2 Ss-r which for x = 0.13200 .

u . . . . (15. 15.. and the boundary conditions u(0.. = rh and nh = 1.300 PARTIAL DIFFERENTIAL EQUATIONS SEC.2... f(2h)..2u1 + u. of course. toward increasing values of s. + u3 . it .. 0 01 U= V= A = °J 0 0 0. we obtain 1hY dtl = uo .. but a difference is introduced through the initial conditions which define an integration direction.. = dt I hY (u. 0) = fix). We shall first solve the problem by approximating 82u/axY. Writing out the equations for r = 1 . 2u.. U. = sb(t) . and Lewy in a famous paper [5) proved that the difference equation is stable only if k S h.._1 .. for t = 0 are f(h).u.. but not au/at.1 ordinary differential equations in ul.du. As before. Then we have approximately au.. 1 -2 ..1. . Hence we get a system of it . viz. The values of u u . In 1928. known functions of t: uo=P(t)... Courant.2) Y du'-. u(1. t) = 0(t).2. This is the heat equation in one dimension. 15. the indices r and s are completely equivalent. The portion between x = 0 and x = 1 is divided into n equal parts in such a way that x.2.-. 2. t) _ q(t).+1) .1) axY=at with the initial values u(x. + u.2. the quantities uo and u are. 0 < x < 1.1)h)..0 0 uY -1 -2 1 . Parabolic equations The simplest nontrivial parabolic equation has the form a2u au (15. In following sections we shall give much attention to stability problems. f((n . h2 dt = u1 . at .2u. Friedrichs. by a difference expression... Putting ua 01 2 1 0.

F = ea' = e1°/ t. = 16(. we obtain the particular solution u. and u. it- du. t) = t.2. t) = 0 and u(1. .) .2u. homogeneous rod of uniform thickness. Further we assume u(0.4: du.1 + 8t) 16 u.)t + C e-le(Y} J'f )t + (-5 + 32t) 128 U.u 1 0 -21-2-rt 1 1 0 =0. 15. = 16(u.= -2. e C>//_2_ e (. The roots are ft. + u. = BV T . 128 -7+96t The solution of the homogeneous system is best obtained by putting u. + t) .. = 16(u2 . which at time t = 0 has temperature u = 0.) . As an example. and we find with h . dt Using the indefinite-coefficient method.= 1 (-7+ 96tt ) .= -2+i. . PARABOLIC EQUATIONS 301 we can write by dU = AU+ V. (). This gives a homogeneous system of equations whose coefficient determinant must be zero: -2 . for x =.SEC. we solve the heat equation for a thin. ps= -2-V_2 . u. We will then consider the temperature u. for x = 4. dt du. and the corresponding eigenvectors are 1 0 -1 .2u. e 32t + B .2u. e)a:. 1 1 -lam 11 1J Hence the solution is ul = A . ft. + u. for x = 4. 1 -5+32t U2 _ -8+64t u.

u(x....2' 128 ' 128 ' C 128 For large values of t we have ul(t) .2. (15.(0) = u.. = sk. 15. here we restrict ourselves to computing the truncation error.2.. We get u(x. that is.2. B. As before.h. u3(t) ~ 4t . we take x..2u.. (15..4) Later on we shall examine for what values of a the method is stable...(0) = u. t) -. Then we find F=..2a)u.302 PARTIAL DIFFERENTIAL EQUATIONS SEC. but now we also choose t. t + k) . asu 24 Ca 6 / at' + 60) at Obviously.at + 2!k a'u at k' Mu 3! at3 + u(x .. u. t) = au k .(t) ~ 2 . we can also approximate au/at by a difference expression. + au. A. we also have.. Then we obtain the following partial difference equation: Ur-1.h. h4 aas u 540 at3 +. However.+1 . = rh. = u... 0) = 0 . we get u. a'u a3u a°u at ax' at3 ax° Hence we obtain the truncation error F= h2 ( 2 a au _ at 1 M a'u 1 6 ax') + 6 ` ( &U _ at3 1 o°u 60 ax° h' (a..3) P k With a = k/h'.. t) h' a°u M ax' + 12 ax' + 36 0 ax° + Since u must satisfy au/at = a'u/ax'. Now we solve the heat equation again with the initial condition u(x.+1...2u(x. we get a truncation error of highest possible order when we choose a = g. t) + u(x = a'u h' a'u 1. under the assumption of sufficient differentiability.. and C can be obtained from u.u.. a'u .. . + (1 .. A _ 1 B _ 3 + 21' _ 3 . 4 . Then it often turns out to be necessary to use smaller intervals in the 1-direction. + u.(0) = 0.+1 = aur_. k = h1/6.

t) = I (x' . and. and with the initial condition u(x.00541. corresponding to a = 0. these are then supposed to be known.h. sin m1rx e--'T't . and the boundary conditions u(0. t) = t.x + 6xt) + 6 'E 2 -1 ( n' sin nnx which gives u. From our previous solution. . = 0. Since the time step k has the form ah'. For this reason the method is said to be explicit. u. If f(x) is continuous we can write down an analytic solution u= a.05240.2. 0 S x S 1.. = 0.. u(1.k) + u(x + h.3 or 15.00615. 15. with a = 8 gives an error which is essentially less than what is obtained if we use ordinary derivatives in the t-direction. = 0. = 0. we obtain instead u. We shall now try to produce an analytic solution also for the difference equation which we now write in the form ur.k)) ._1. 0) _ E a.2. Let us again consider the same equation a'u au ax' at but now with the boundary values u(0. = a(u. t) = u(1. sin m7rx .. it is interesting to know how large values of a we can choose and still have convergence. t) = 0. t . = 0.3): a.u. 0) = f(x).. The exact solution of the problem is given by u(x.125. Hence the method . and u. choosing h = I and k = $1S corresponding to a = s. u.. = 0. t . The method defined by 15. u.4 gives the function values at time t + k as a linear combination of three function values at time t. t) = 0 when t Z 0..01878. u. t) = (u(x . Thus we obtain (cf.01877. = 0. = 0.05331 . = 0.5) M=1 where the coefficients are determined from the initial condition which demands f(x) = u(x. u.. Section 17.2Ur. t . = 2 Kf(x)sinmtrxdx. + ur+1. PARABOLIC EQUATIONS 303 SEC. = 0.2. u. 6 After 12 steps we have reached t = 0.05240. of course. The recursion formula takes the following simple form: u(x.2.1) .125 and the u-values. (15.01994. and u..k) + 4u(x.00540.+1 .

0 one cannot guarantee that the solution of the difference equation is close to the solution of the differential equation unless k is chosen such that a < I- A numerical method is usually said to be stable if an error which has been introduced in one way or other keeps finite all the time.1)h) _ p T(sk) X(rh) (Since the left-hand side is a function of s and the right-hand side a function of r. = e-M'x"k = e"29 that is.304 PARTIAL DIFFERENTIAL EQUATIONS SEC. It should be pointed out. In order to separate the variables we try with the expression u. Therefore. both must be constant. 2 /J (15.. X(rh) .4a sin' (mirh/2). it is clear that h and k must be chosen such that k < h'/2. the factor will deviate more and more from the right value.2.1)h) = .4a sin' (mirh/2)}'.2. section 17. which gives 2 M-1 bM . (Cf.2X(rh) + X((r . the same factor as in the solution of the differential equation.0 = flrh).6) Here Mh = 1 and further the coefficients bM should be determined in such a way that u. = 1 .2z2h2.2X(rh) + X((r . In this way we finally get u. \ - 4a sin' mtchl. Now it is obvious that the errors satisfy the same difference equation..) We notice that the function X(rh) = sin mirrh satisfies the equation X((r + 1)h) . a if we choose p' = 4a sin' (mirh/2).) The time dependence of the solution is expressed through the factor {1 .. and hence we have proved that the stability condition is exactly a < J.T(sk) = a X((r + 1)h) . If a increases. and we observe that the situation becomes dangerous if a > I since the factor then may become absolutely > 1. = X(rh)T(sk) and obtain T((s + 1)k) . that the dividing line between convergence on one side and stability on . Further we assume T(sk) = e q. Even if h -. E f(nh) sin mirnh . -p2 is called the separation constant. M _.4a sin' m>rh 2 =e If this is raised to the power s we get e-. 15.k and obtain e-QA. and for small values of a we have I . -E bM sin mtrrh \1 M =. how- ever.3.

) and the denominators 2'. + $ur+l. The followin g scheme is obtained: s=0 s=1 s=2 s=3 s=4 r=-4 0 0 0 0 is r= -3 0 0 0 1 0 r. 15. convergence means that the solution of the difference equation when h --.. PARABOLIC EQUATIONS 305 the other is rather vague.2u... we get with a=k/h'=J: U"#+1 = ur. 0 approaches the solution of the differential equation.2.3). Fl .urs-1 h' 2k which has considerably smaller truncation error than (15.s s= -1 s=0 s= 1 s=2 s= 3 s=4 r= -4 0 0 0 0 0 1 r= -3 0 0 0 0 -8 r=-2 1 0 0 0 1 -6 31 r = -1 0 0 1 -4 17 -68 r= 0 0 1 -2 7 -24 89 r= 1 0 0 1 -4 17 -68 r= 2 0 0 0 1 -6 31 r= 3 0 0 0 0 1 -8 r= 4 0 0 0 0 0 1 . + ur+1..-2 0 0 1 0 r4g r= -1 0 1 0 1 0 r= 0 1 0 0 i8 r= 1 0 0 0 r= 2 0 0 0 T8 r= 3 0 0 0 0 r= 4 0 0 0 0 Tls The numerators contain the binomial coefficients (. .2.SEC. In general.2. Before we enter upon a more systematic error investigation we shall demon- strate a technique which has been used before to illustrate the error propaga- tion in a difference scheme.. -_ ur.2ur. but examples have been given of convergence with an unstable method. Normally these two properties go together. the method is obviously stable. + ur+1. 0 and k -.4) with a = 1: iur-1.. while stability implies that an introduced error does not grow up during the procedure.1 .. We choose the heat conduction equation which we approximate according to (15. If the same equation is approximated with the difference equation ur-1..1-1 + ur-1.

. . . This can be interpreted as a difference equation with the characteristic equation p'+(2-2)fi+I=0...2.... 15. 306 PARTIAL DIFFERENTIAL EQUATIONS SEC. It is easy to give a sufficient condition for stability... . + au.4) is stable if a < J.. for a vector with the components u. Since T is symmetric.... the method is very unstable and useless in practical computation.+.21 < 2. that is.2..8+1 = au. Evidently.. We shall now treat the problem in a way which shows more clearly how the error propagation occurs.2a)u. = eo and we have stability if e stays finite when n -. 01 T= 0 0 . 2.+. M . we find u.. = x = 0.. = Au. 0 0 -1 2 -1 .2. 0 < 2 < 4. + (1 . Suppose that the term to be computed is given as a linear combination of known terms with coefficients c c c3. where r 2 -1 . . Hence we have the following roots of u: s 2 -' 4 . Then the method is stable if w EIc. .. Supposing that T has an eigenvector x we obtain the following system of equations: with n = 0. We split the matrix A into two parts: A = I .1).... . c. that (15. and we start again from the same system of equations: u. where 1 -2a a 0 a 1-2a a A= 0 a 1-2a L 0 0 0 a 1-2aJ If we also introduce an error vector e. oo.. -1 2 all matrices having the dimension (M .. and x. 1. 2 is real.aT.-...1) x (M .. . u. j=1 We see. Introducing the notation u.. we get trivially e.l<1. and Gershgorin's theorem gives 12 . for example.

+ u.1. a S [2 sin' (mir/2M)]-1.4 . = x.2.+i.. In practical computation the difficulties are not very large since the coefficient matrix is tridiagonal. It coincides with the forward-difference method. .. and x.. As before we put u. m = 1..SEC.. .. The necessity to work with such small time-steps has caused people to try to develop other methods which allow greater steps without giving rise to in- stability. viz. I (u.. + u*-1.3.+1 .21/4) = 1.. .+ and u.2u. M. however with the distinction that a'u/ax' is approximated with the mean value of the second difference quo- tients taken at times t and t + k. 2 singe= 2._1. and It et'".2'/4 < 1.1 when the expression within brackets is very close to 2.. and further we have (1 . = X(rh)T(sk) ..+1) k 2h' (15.+I. The best known of these methods is that of Crank-Nicolson.. Thus the method can be described through u..+.. . 15. 2 = 2(1 .4a sin' (m7r/2M).cos p) = 4 sin' 2 M . Hence x = Ae'"" + Be '°'. This shows that we can put cosg. + 2u.I < 1 . This shows that the eigenvalues of A are I . and we find the stability condition: that is. u.. The worst case corresponds to m = M .. (A sinMq=0. = 0 gives: A+B=0. and hence the method is implicit.ur..2..-.= I . PARABOLIC EQUATIONS 307 But 0 < 2 < 4 implies .2/2)' + (2 . + u. and we conclude that the method is stable if we choose a<j. But A 0 and consequently we have Mgp=m7r.7) We have now three unknowns. We are now going to investigate the convergence properties of the method..2.2/2 < I and 0 < 2 ..

+. = Cu.2. k -. u.2a sin' (myth/2)1' . (15. and putting T(sk) .2 T) M. and .. 15. We note that the system can be written (21 + aT)u.8) con- verges toward (15.3.8) \ 1 + 2a sin2 (m7rh/2) f where b.308 PARTIAL DIFFERENTIAL EQUATIONS SEC. and obtain T [(s + 1)k] .. that is. has the same form as before. and from this we get the eigenvalues of C (note that Mh = 1): 1 .T(sk) _a X [(r + 1)h] ." = (I + 2 T) (I ..2. A similar relation holds for the error vector: e. We have already computed the eigenvalues of T- =sin'ZM. We shall now also compute the truncation error of the method which is done most easily by operator technique. sin myrrh (1 .5).11 -+2a2asin2 yk (m7rh/2) sin' (m7rh/2) We can now write down the solution of the Crank-Nicolson difference equation: ur. . Let L' be the difference operator. then (15. . 0.e P'k we find 1 .M.2.2..e fik = 2a sin2 (mnh ) 1 + e-Ok 2 or finally e .aT)u.2. It is also easy to see that when h. _ E b. It is now obvious that the eigenvalues of Care of decisive importance for the stability properties of the method.1.. In order to examine the stability properties we again turn to matrix technique. =C'ep.1)h] = -p2 T[(s -F 1)k] + T(sk) 2 X(rh) Trying with X(rh) = sin myrrh we get p2 = 2asin2(m2h).2X(rh) -F X [(r . = (21 . 4 m = 1.2a sin' (myr/2M) I + 2a sin2 (myr/2M) Hence Crank-Nicolson's method is stable for all positive values of a. The shape of the solution immediately tells us that it can never grow with time irrespective of what positive values we assign to a..

Di (observe that D'u = a2u/ax'.. . As examples we shall take the Laplace equation ax + = 0..12+kD. the situation is completely different. not (au/ax)').2) ax' + aye In the remainder of this section.) + k2 D.3.3. In many cases it is relatively simple to find the general solution. If such a solution has been found. it is usually a simple matter to adapt the parameter values to the given boundary conditions. This result snows that without any risk we can choose k of the same order of magnitude as h.l De+D: = k D. Elliptic equations When one is working with ordinary differential equations. and it will be briefly described in the next section. (15.D. + 2 Di + 6sDi +.L = e {eA)i -2 } e-111-)(1 + ekD`) . ELLIPTIC EQUATIONS 309 SEC. 15. that is.h' D= + . For partial differential equations. The Laplace equation has the general solution u=f(x+ly)+g(x-iy). k = ch. we will give much attention to these two most important equations. 2 6 12 0(h) + 0(k2).D.+ ZsDi+.k2 D. We get kDt L' . since D.3. D'. . . originally sug- gested by Peaceman and Rachford. but usually it is a difficult problem to adapt it to the boundary conditions.(D.1) and the Poisson equation s F(x. which is a substantial improvement compared with k = ah'. Crank-Nicolson's method can be generalized to equations of the form au au + aye a2u ' at axe but usually one prefers another still more economic method..u = Dsu if u is a solution of the differential equation.. + D..' /.3. .2 k 1 _ 2h2 = D. -\2 D:+24 D:+.. y) (15. 15. L = D. the main problem is to find a solution depending on the same number of parameters as the order of the equation. The same technique can be used in the elliptic case.

and u = .4). y .4669 1. y) + u(x. 1.2233 0.0285 -0.8. 1).3779 +0.2068 0.1 on the horizontal side.6).3528 -0.log x . Choosing the starting value 0 at all these points.0248 -0. 2).3778 +0.log 2y .0301 -0.2).3778 . y + h) + u(x .3.4.4677 1.4676 1.0131 -0. Ac- cording to (12.2327 0. on the sides of a rectangle or a triangle.h.h.4). The problem of specializing f and g to the boundary conditions (Dirichlet's problem) is extremely complicated and has been the subject of extensive investigations. 2).2065 0.3559 -0.4539 1.3556 -0.3487 -0.3726 -0.0296 -0.8824 -0. 1. 172u(x. (1. Now we get a partial difference equation which can be solved approximately by Liebmann's iteration method (15.6.4u(x. y)] .h) .0573 0.0006 0. Here index n indicates a certain iteration: y . y)] . y) = 0.u(x . moving upward. y) .3558 -0.3785 -0. y) = h.0014 0.3777 +0.0207 0.3777 +0. and (2.3777 +0.(x + h.310 PARTIAL DIFFERENTIAL EQUATIONS SEC.3765 -0.3.2064 0.2063 0.2).3541 0.3) we have approximately.0299 -0.3). we can take this into account by modifying (15.y2 on the vertical side.2063 0. y) + u(x. we obtain A B C D E F --0.4 F(x.3549 -0. If the boundary is of a more complex shape.3795 -0. u = 4 .3557 -0.3. and F(1.4679 1.0301 --0.0015 0.h) + 4 u y + h) + u.2.0012 0.2 log x on the oblique side. 1. 1. the Laplace equation is obtained as a special case if we put F(x.2).4). ]. 1. C(1.4644 1.4679 1.3793 1. for example.8. With h = 0.0020 0.0012 0. E(1.0300 --0. B(1.2116 0.8. and equation 32 u ax2 u1 repeat the process until no further changes occur. We shall now consider the numerical treatment of the Poisson equation. (15.3) The equation >72u = F is then combined with boundary conditions in such a way that u is given.3. (15. we obtain six interior grid points: A(1.3. and with the boundary values u = x2 .3559 -0.4) We start with guessed values and iterate row by row.1155 0. As an example we treat the + 3l'9 x2 + ) in the interior of a triangle with vertices at the points (1.2079 0. D(1. 1. 15.6.

this equation describes an explicit method. Gauss-Seidel. This is called block iteration and evidently defines an implicit method. K. O E. (15. of the form U(%+"=Mu'"'±c.3. Using block-matrices we get the system: ID. ELLIPTIC EQUATIONS 311 The exact solution is u = x. . we have an ordinary linear system of equations of high but finite order. U. 15. All these methods (Jacobi. In general.. . D.log xy.5) Obviously. that is. A natural generalization is obtained if a whole group of interrelated values (for example. SOR) are point-iterative. We give an example with the Laplace equation in a rectangular domain 4 x 3: '9 '10 '11 '12 '5 '6 '7 '8 '1 2 '3 '4 The coefficient matrix becomes 4 -1 -1 -1 4 -1 -1 -1 4 -1 -1 -1 4 -1 -1 4 -1 -1 -1 -1 4 -1 -1 -1 -1 4 -1 -1 -1 -1 4 -1 -1 4 -1 -1 -1 4 -1 -1 -1 4 -1 -1 -1 4 where all empty spaces should be filled with zeros. (U. O E. F. F. in the same row) are computed simultaneously. when a discretization of the differential equation has been performed. from which the values in the last line have been computed.y' . . D.3. The variant of Liebmann's method which has been discussed here is closely related to Gauss-Seidel's method. For this reason we can follow up on the discussion of methods for such systems given in Chapter 4. this is the case for the overrelaxation method.) == K. in particular. SEC.

that we get a series of systems of equations for each iteration and not the unknown quantities directly. y) + c(x. y + h) . and further we put Al . y)u(x. y) ax A ( x+ h2Y) u(x + h.h. or SOR. First we discuss application of the method on an equation of the form ax [A (x. (PI-V. y) h2 . y)u(x.y+ 2)' T =C(x'Y- 2 2R=a+r. U.S.S(x. y)u(x + h.2b(x. The equation is then identically rewritten in the following two forms by use of matrix notations: (pl .y). and define Hou(x. 15. is a vector with components u ua.. and u.7) Vou(x. C. however. y) (15. Here. y) and analogously for the y-derivative. y)u(x.2. y) = a(x. (15.2)u+k. The derivatives are approximated as follows: aul ax [A (x. The differential equation will then be approximated by the following difference equation (Ho+Vo-Fha)u+Gha=O. y) . y) .6) dx where A. y) + 7(x.+ 2)u-(pl+Ho 2)u+k. 2b=a+c.u(x.3.312 PARTIAL DIFFERENTIAL EQUATIONS SEC.Fu + G = 0 . y)u(x . y) ay . we introduce the notations a=A(x+2y). Gauss-Seidel. a=C(x.A (x h y) u(x. y . y) = a(x. Supposing boundary conditions according to Dirichlet we construct a vector k from the term Gh2 and the boundary values. c=A(x . y) . Finally. Further. for example.h) . u3. y)u(x. we shall also treat alternating direction implicit (= ADI) methods.h. y) + y [c(x.2.3. This equa- tion can now be solved following Jacobi. and Fare Z0. and we then restrict ourselves to the Peaceman-Rachford method.u(x .3. . It must then be observed.Ho+ S )u=(pl+Vo.

5:xu + 2.)/(p + 2. .V)(pl + V)-'} . V = .V) . The convergence speed depends on the spectral radius 2(T). we now define Peaceman-Rachford's ADI-method through (P.I + H)uc"+1/2) _ (p. It is easily understood that large values of J(p .H)(pI + H)-'II 11(pl .2. so far at our disposal. If the eigenvalues of H and V are 2. The intermediate result u'"+'"" can then be eliminated and we find: uc"+>> = Tu'"1 + g .H)(pI + H)-'} 2{(pl ..i = (p"I - where p" are so-called iteration parameters. p = p.8) H)u("+'la) + k .2 IsisM I p + 2. that is.p)/(b + p). We now assume that H and V are symmetric and positive definite. First we restrict ourselves to the case when all p" are equal (=p). ELLIPTIC EQUATIONS 313 where p is an arbitrary parameter. (P. 15iSN p+p As H is positive definite.SEC.H)(pt + H)-'k + (pl + V)-'k . > 0 and we can assume 0 < a 5 2i 5 b.V)(pI + V)-'ll = 2{(pI . for the other factor we ought to choose p = p. Putting H = -Ho + S/2.V)uc") + k. Let us first try to make max p . as small as possible.3. S R.3. = V where 0 < a < ". and hence our critical choice of p should be such that (p .a)/(p + a) = (b -.)I are obtained if p is far away from 2.Vo + S/2.I + V)u'"+.H)(pf + H)-'(pI . Instead we have split a majorant for 2(T) into two factors which have been minimized separately since the general problem is considerably more difficult. From the beginning we want to choose p such that 2(T) is minimized.1 . (15.IiImaxlp .. W = {(pI .V)(pI + V)-'} = maxIPP .H)(pI + H)-'} {(pI . = ab.P:I. In the same way. but when we try to estimate this it is suitable to construct another matrix W by a similarity transformation: W = (p1 + V)T(pl + V)-' that is. all 2.. and we find 2(T) = 2(W) 5 II(pl . If these two values are not too far apart we can expect that the "best" of them will give a fair solution to the whole problem. 15. where T = (p1 + V)-'(pl . g = (p1 + V)-'(pI .

1 2 -1 -1 2 and 2 -1 S= .1) .1 2 .+a N MN . 314 PARTIAL DIFFERENTIAL EQUATIONS SEC. M = 6. N = 4. In the special case when we are treating the usual Laplace-equation inside a rectangular domain A x B.1 points along a given horizontal line form a closed system with the same matrix R as was denoted by T in the section on parabolic equations. R-4cos'2N and consequently p.a p. -a 1 7r(M2 +N') p...1 R. S R/ S sJ where 2 -1 . With.=2sin p. . In a similar way we obtain the eigenvalues of S: n7r 4 sin' 2N .+a p.3. -1 2 -1 . Now we can compute p. we get the following shape for H and V.. Hence we get a=4sin' L .M. m.1 2. 15. The M . for example. H is operating only horizontally and V only vertically.1 2 . Hence the eigenvalues of R have the form 4sin'2M. .2. where A = Mh and B = Nh. b= 4sin2 HIM-1I-4cos' L 2M 2M 2M ' a=4sin'ZN..1. rs /R S H= R V .=2sin N.

9) Wachspress suggests instead pi =d\dl nZ2.max (M.2. that is.1 ir(M2+N') p2+a p2+a M MN Finally we choose the value of p which gives the smallest limit.1) . j. as a rule. and then it is about enough that the method converges. one need not.3..SEC.4. 0). Note that P = M produces the result containing 11N. We here restrict ourselves to demonstrat- ing in an explicit example how such problems can be attacked with numerical methods. We put c = min (a.-a= 1 ..1... (15. N). For the general case when different parameters are used in cyclical order.4.... a) and d = max (b.4. but nevertheless there are a few results which are useful in practical computation. where P . In the case when one wants to use n parameters. while a similar situation is not present for hyperbolic or parabolic equations. (15. p=2sinP. this means that one solves a system of equations with tridiagonal coefficient matrices which can be done by usual Gaussian elimination. by force will keep the inner function values under control.3. (15. be worried about stability. 15.10) More careful investigations have shown that convergence with Wachspress parameters is about twice as fast as with Peaceman-Rachford parameters. EIGENVALUE PROBLEMS 315 and analogously p2-a p. Since all methods for numerical solution of elliptic equations are based upon the solution of finite (but large) linear systems of equations. We consider the equation z z ax + ays +2u=0. there is no complete theory. Since the domain is closed the boundary value:. 15..n.n. and they play a most important role in many applications.2.=d\d/ (11-1JH* j= 1. Peaceman-Rachford suggest the following choice: p. Eigenvalue problems Problems which contain a parameter but which can be solved only for certain values of this parameter are called eigenvalue problems. Almost exclusively an iterative technique is used. so to speak. When the iterations are performed in practical work.

083.4) 9 The lowest eigenvalue is obtained for m = n = 1: 211= =2.a -1 = 0. h = 1 and h = For reasons of sym- 4.u . with the smallest root je=4-V =1. while in the second case only three Figure 15.5625. then the equation describes a vibrating membrane.4 (a) (b) different values need be considered. We will work with two mesh sizes.p -2 0 -2 4 . (15. (3.3). and (0.2.4V+pV=0.083). (3.l = 2.4. metry all values are equal in the first case. Richardson extrapolation to h2 = 0 gives I = 2. 2) and (0. 2U + W . 3). (15.32 = 0. The exact solution u = sin (m7rx/3) sin (n7ry/3) gives A =(m'+n') 7r'. 316 PARTIAL DIFFERENTIAL EQUATIONS SEC. The condition for a nontrivial solution is 4 .1716. we have in the first case U+U+0+0-4U+1U=0 and d .4U + pU = 0. 0).3. 3).190.193.4. and hence . 15. inside a square with corners (0. 9 . (15. In the second case we get with It = 1962: 2V .12pe + 40. Now we know that the error is proportional to h'.4.4W+pW-0.4. and hence we have for h' and 2 the two pairs of values (1. 0). 2. On the sides of the square we shall have u = 0.2) 4V . Starting from equation (15.3) 0 -4 4-p The result is JU' .

1965).75 1 0 1 0 y 0 0 0 0 0 0. Friedrichs. 32 (1928). REFERENCES [t] Sommerfeld: Partial Differential Equations in Physics. The function f(x. Press. Higher eigenvalues can be determined numerically. S.5 0. (21 Petrovsky: Lectures on Partial Differential Equations (London. SIAM.5 0.. 50-67 (1947). Rachford. 151 Courant. 1956). 8. (Academic Press. New York." J.: "The Numerical Solution of Parabolic and Ellip- tic Differential Equations.25 0. Peaceman-H. Phil. John Todd). Ann. SIAM. ed. [121 G." Trans. Jr. 6.25 0. [41 Richtmyer: Difference Methods for Initial-Value Problems [Interscience (Wiley) New York. y) is given in the following points: x 0 0.EXERCISES 317 The error is only 0.. [81 D. Habetler: "An Alternating Direction Implicit Iteration Tech- nique. Lewy: Math. Crank-P. but then a finer mesh is necessary. Birkhoff-R. 403-424 (1960). London. Soc.5 f -1 -9 -A _} 0 -14 3 -i x 1 0 1 0 0. New York.25 0." Proc. L. [3] Beckenbach: Modern Mathematics for the Engineer (McGraw-Hill. We will also obtain considerably lower accuracy. Camb.5 0. [10] D. 1957]. 100. 3.75 1 y 0. 1954). EXERCISES 1. H." J. [7] J. 1960).75 0. 43. [6] Forsythe and Wasow: Finite-Difference Methods for Partial Differential Equations (Wiley New York.25 0. Wachspress-G. Nicolson: "A Practical Method for Numerical Evaluation of Solutions of Partial Differential Equations of the Heat Conduction Type. Am. Math. Soc. (9] E.75 1 1 1 1 1 f 7T -25 89 0 4 1t 85 1 . W. 1949).. 13-24 (1959). 92. y) satisfies the Laplace equation az + 3Y=0 y This function f(x. 28-41 (1955). J. Young: Elliptic and Parabolic Partial Differential Equations (in Survey of Numerical Analysis.150. [I]] G. D. Varga: "Implicit Alternating Direction Methods. Smith: Numerical Solution of Partial Differential Equations (Oxford Univ. too.

n-i + 0 gives a difference approximation of the heat equation au a2u at .4500 -1. t) = 1. t) = 0. 0). 4.. find the value of the function in the nine mesh points: x = 0. Find an approximate value of u in the origin when u satisfies the equation 2 z axe +2 au 1 y and further u = 0 on the sides of a square with corners (1.n where h and k are the mesh sizes of a rectangular grid. h (um+l.5 u -1. nk) = u.um. The function u(x. 1) we have u(x.75 and y = 0.5 u -0. 5. ± 1).7500 3. u(1.5 1 1.3757 1.1625 2. 2).at is given.0833 x 2. 1).n . 0.25.3000 1. t) denote a sufficiently differentiable function of x and t and put u(rnh.5. The equation a2u I au au ae+Xax . and (1.5 1 1 1 y 2 2 2 1.6500 -0. Find the value of u(x. 0).75. 6.. Let u(x.5 2 2.n+I .. Show that the equation z (2k) -'(u.318 PARTIAL DIFFERENTIAL EQUATIONS Using Liebmann's method.um.5 1 0. The boundary values are: x 1.5 3 3 3 y 0 0 0 0.0564 -1. (3.5 2 1.. and explain how a more general solution fulfilling the same conditions can be constructed. Using separation determine a solution satisfying u(0. 2) (h = 1).. axe On the sides of a square in the xy-plane with corners (0. 0. to three places.25. and (1.5.. y) in the four inner lattice points by approximating the differential equation with a difference equation so that the error becomes 0(h2).y) = x + y. Using Liebmann's method. (3. solve the differential equation a2u a2u y 3x2 + oy2 10 in a square with corners (1. 0. 2.3292 0. 0). 0.4775 -0. The square is split into smaller parts with side equal to 1.8077 -0. (0.. (1.y) satisfies the differential equation a2U + ( + x2 + y2) ay2 + 2y u = 0 . 1.9500 1. 0).

Then find an analytic solution trying u = cos ax sin jSy and compute an exact value of 2. 0). 10. V. and extrapolate to ha = 0. Find the smallest eigenvalue in these two cases. which is considered in a square grid with mesh size h. Compute the truncation error! . Choose h = 1. 1). Compute the smallest eigenvalue of the equation aau aau + 2u = 0 F a + Ta by replacing it with a difference equation. and extrapolate to ha = 0. and W. in the first case only two function values. Two cases are treated.1. (0. U and V.Y for a triangular domain with corners (. namely h = S and h = 5. and in the second case. where we have u = 0 on the boundary. and x + y = 1. and (1. Use h = j and h = 3. Compute the two smallest eigenvalues of the differential equation a+au=. 11. three function values. U. ax'+2ax2 y y Boundary values are u = 0 and du/an = 0 (the derivative along the normal) for a square with side 4. It is known that u = 0 on the boundary.)? + ya)u = 0 . and with two sides along the positive x- and y-axes. Find the smallest eigenvalue of the equation a.I and h extrapolating to ha = 0. valid on all 4 sides. Owing to the symmetry.tu. y) = u(h. Choose h . 0). 7. but a difference approximation of the equation au aau aau =0 ar+ -5-IT axa if k=hand h-0. This latter condition is conveniently replaced by u(-h. 8. y = 0.y). and au/ax = 0 on the remaining side along the y-axis.(1 + adl)d ]u. have to be considered.u 8 axa + a} + A(. Boundary values are u = 0 on two horizontal and one vertical side of a square with side 1. 9.EXERCISES 319 if k = ha and h -> 0. The differential equation is replaced by a second-order difference equation. The equation s a axa +ayE +2u=0 is considered in a triangular domain bounded by the lines x = 0. In the wave equation aau aau axa IT = 0 the left-hand side is approximated by the expression h-a[62 .

Definition and classification An equation containing an integral where an unknown function to be determined enters under the integral sign is called an integral equation. the function K(x.Chapter 16 Integral Equations In England the drain pipes are placed outside the houses in order to simplify repair service. inhomogeneous. kind. or if one or both of the limits is infinite. homogeneous. 2. if the function is also present outside the integral. t)y(t) dt . If the function to be determined only occurs linearly. Fredholm. If the equation has a term which does not contain the wanted function y(x). inhomogeneous. If the limits in the integral are constant. 1. Volterra. a y(x) = f(x) + 2 a K(x. the equa- tion is said to be linear. In the following the unknown function will be denoted by y(x). t) is singular in the considered domain. and we shall restrict ourselves to such equations. and it may then happen that the equation can be solved only for certain values of 2 (eigenvalues). t)y(t) dt . In the examples below. While the function y(x) as a rule has good regularity properties (continuity or differenti- ability).0. Ja f(x) = K(x. t)y(t) dt . otherwise homogeneous. inhomogeneous. if the upper limit is variable we have an equation of Volterra's type. Je f(x) = K(x. Volterra. t) has some discontinuity or at least discontinuous derivatives. the integral equation is said to be singular. the equation is said to be of the second kind. the equation is said to be of Fredholm's type. t)y(t) di . kind. inhomogeneous. Fredholm. t)y(t) dt . t) is called the kernel of the equation. Repairs are necessary because the pipes have been placed outside the houses. the terminology just described is illustrated. 320 . kind. The equation is still said to be of Fredholm's type if the limits are constant and the kernel quadratically integrable over the domain. the equation is said to be inhomogeneous. We then suppose thaty(t) is integrated with another function K(x. or that there is no solution at all. Fredholm. 2. it is rather common that the kernel K(x. kind. the equation is said to be of the first kind. y(x) = f(X) + A K K(x. t) (with t as integration variable). 16. 2. 1. If K(x. Often a parameter 2 is placed as a factor in front of the integral. If the desired func- tion is present only in the integral. y(x) = 2 f " K(x. kind.

1. (16.r 1 .oo)..2) Y1_ [ fl Y1 f_ fa y In Fredholm himself gave this system careful study. The reason that just these have been chosen is their strong dominance in the applications. We introduce the notations x. f(x..1. The fundamental description of the integral equations was performed by Fredholm and Volterra. FREDHOLM'S INHOMOGENEOUS EQUATION 321 Again it is pointed out that the types referred to here do not in any way cover all possibilities.1.1.) = K The integral is then replaced by a finite sum essentially according to the trape- zoidal rule. each of length h.. it is only possible to give a very rhapsodic presentation. J (16. Applying . y(x.) = y. .2hK . In order not to make the description unnecessarily heavy we shall assume that all func- tions used have such properties as to make the performed operations legitimate.2hK .. An extensive literature exists within this field. .1hK. Further.. degenerate) and methods related to them. Pri- marily we shall discuss direct methods based on linear systems of equations. 16.1hK.SEC. 16. IJ . t)y(t) dt . This limiting value is called Fredholm's determinant and is usually denoted by D(2). and then the following system of equations appears: Ay =f.1) since the treatment of other types in many respects can be related and traced back to this one. but a rather small amount has been published on numerical methods.. but important contributions to the theory have been made by Hilbert and Schmidt.AK. A. .. Fredholm's lnhomogeneous equation of the second kind Here we shall primarily study Fredholm's inhomogeneous equation of the second kind: y(x) = f(x) ± 2 KK(x..2hKK. but also iterative methods will be treated. .. where . 1 . = a + rh .2hK 1 . K(x t.) = ff .. Of special interest is the limiting value of d = det A when h --.1hK . The section between a and b is divided into n equal parts.AK. For obvious reasons. 0 (and n . special kernels (symmetric. will be mentioned briefly.

however. J 2! . where. 2) is usually called the resolvent of the integral equation. t.) K(x.0. t = a + rh.3) in some simple cases can be used also in practice. 2) . 1. On the other hand.: K 2! .. I K. t. Consider the equation Y(x) = f(x) + 2 (x + 1)Y(t) dt .!2 . 22h2 Z. 2) = 2 -. ob- tained by making f(x) = 0 in (16. t. 2) noticing that all determinants of higher order than the second vanish: 2 D(2) .) K(t 13) Q dt. 13) + 2! JJa dt. t) K(x.1 . 2) (x 1.dt. t) dt + t.. Expanding d in powers of 2 we find K. x. then 2 must be a root of the equation D(2) = 0 in order to secure a nontrivial solution. t. 13) K(131 13) A similar expansion can be made for d and after division by h and passage to the limit we get D(x. 2) = D(2).) K(t t. . 2)/D(2) we can write the solution in the form y(x) = f(x) + H(x. 1) + 2 (xt + L\ 1 3 )J .. D(x. t.2 . KKj and after passage to the limit h -. We compute D(2) and D(x. 2)f(t) dt . Ki. as usual.3) The function H(x. in this case we have an eigenvalue problem.2 K(t. and we give an example of this. dt. t. t. Hence. Introducing the notation H(x. 2) = D(x. p) dp a K(p. t) K(x. K(111 t) K(t t.I . dt2 dt3 + K(131 tl) K(13. t) .1. 0: 24 D(2) = 1 -. if we are considering the homogeneous Fredholm equation of second kind. The technique which led to the solution (16.-.1.2K(x. ti) K(ti.. 322 INTEGRAL EQUATIONS SEC.) K(t t3) K(t t3) dt.1. 16.) K(ti.) K(121 t2) where x = a + sh.4 is the algebraic complement of the (r. t) K(p. s)-element in the mat- rix A. as a rule 2 must have such a value that D(2) :. Cramer's rule we can write down the solution in explicit form according to Y. D(x. (16. p) I K(x.1 a K(t t.22 K(x.1). t2) K(ti. Obviously. 13) 21 JJJ6 K(t t.

t)f(t) dt .. J f(t) dt or Y(x) = f(x) + ia H(x. t) = b K(x.. z)K(z.(t) dt = K3(x. K1(x. t)ya(t) dt . the method is equivalent to a series expansion in 2 (Neumann series): Y = f(x) -I.2p1(x) + 22P2(x) -{. y . FREDHOLM'S INHOMOGENEOUS EQUATION 323 Thus we find the solution: Y(x) = . .(x) = 6 K(x. t) + . t)q 1(t) dt = J6 a K. t)f(t) dt .2 r( l l1 /Jf(t) dt 2 22/12 So L\1 2) (x I t) + 2 (xt + 3 which can obviously be written y(x) = f(x) + Ax + B. y1.6 ± 41/3.f(x) + 1 .(x. a a Here K. 16. Picard's method for ordinary differential equations).. 2)f(t) dt . that is. . The solution exists unless 22 + 122 = 12.(x. t) = K. K3(x. We shall now study an iterative technique and again start from (16. this is also a solution of the integral equation. Thus we obtain the final result: Y(x) = f(x) + ia [ 2K1(x. where A and B are constants depending on 2 and the function f(t). t)f(t) dt . It can be shown that the series converges for 2 < IIKII-1.SEC. If this sequence of functions converges toward a limiting function. are called iterated kernels. (P.(x.1) con- structing a series of functions ya.1.1. 2 = . t) a The functions K K . As is easily understood.(x.. t) dz ... Js (PAX) = a K(x.. according to f(x) + 2 S6 K(x. t)q.(x. t) dz and more generally 6 K (x. Inserting this and identifying coefficients for different powers of 2 we get: -P1(x) = 8 K(x. t) = K(x. where IIKII = S K2xydxdy. y0(x) = 0 (cf.. 1) + 23K.. t) + 22K. t.

t).t) and consequently 1 K. with arbitrary C. t) = exp (x .Ce.(t) .. 1) = el-=e" dz = e4-1 = K(x.. This last method obviously depends on the fact that the kernel es-1 could be written ele 1 making it possible to move the factor e= outside the integral sign. Consider the equation Y(x) + f(x) + 2 1 exp J0 (x . 2) = 2K. If the kernel decomposes in this way so that K(x.t)Y(t) dt . If 2 = 1 no solution exists unless Soe-l f(t) dt = 0. 1) -F.(x. t) = F(x)G(t) or.(x.1. t. t) + 2'K. if1ll<1. o In this special case we can attain a solution in a still simpler way. 16. t) 0 and all iterated kernels are equal to K(x. so that K(x. as before stands for the resolvent. if so.. Y= fl X) + 1 2 Z ez f'e-lftf) dt . where H(x. The described method is well suited for numerical treatment on a computer. We shall work an example analytically with this technique. Thus we have K(x. giving C = S'e-1 f(t) dt + C2 0 and we get back the same solution as before.(x)G.324 INTEGRAL EQUATIONS SEC. Hence we obtain Y = f(x) + S1 e=-lftt)('1 + 22 + 21 + .(x.. t) _ F. t) + )'K. we find f(x) + CAe= = f(x) + 2 S1 es-1[f(t) + Cdel] dt . However.) dt or. From the equation follows y(x) = f(x) + 2e. more gene- rally. it is now sufficient that 2 :# 1..(x. there are an infinity of solutions y = f(x) -. Inserting this solution again for determining C. J'0 e-ty(t) dt f(x) + C2es .

(t)f(t) dt = b. This system has a unique solu- tion if D(2) = det (I . 0 .. Let us start from Y(x) _= f(x) + F. we define D(x.(x) -F. where A = (a.=1c. FREDHOLM'S INHOMOGENEOUS EQUATION 325 the kernel is said to be degenerate. as a rule the equation can be treated in an elementary way.(t)Fk(t) dt = a.. Ja L Ja These last integrals are all constants: G.SEC.(x) The constants c.. -2. 2)j(t) dt . 2) as a determinant of the (n + l )th order: 0 -F. 16. .k) and b and c are column vectors.la I . t . . . 2)f(t) dt = f(x) + 2 a H(x.k and a G.(x) G. . t .Za. In this case.1. t. .(t) -. Further.Zai1 .(t) I .2a11 .F. t..2a22 . .2a1n D(x.(t) f(t) + 2 ckFk(t)] dt With the notations b G. Then we can also write Y(x) = f(x) + -J J'f D(x.(x) G:(t)Y(t) dt . with the resolvent as before represented in the form H(x. t .(t)] y(t) dt = f(x) + F. 2) = G.2A) # 0.(t)y(t) dt = and so we get: Y(x) = ix) + 2 Li. are determined by insertion into the equation: c. K we get a linear system of equations for the c... t. 2) D(2) We now treat a previous example by this technique: Y(x) = f(x) + 1 J1(x + t)Y(t) dt .2a12 .(x)G. G (t) .: (I-ZA)c=b. = JaG. 2) = D(x.

For we have y(x) _ 2 Sa K(x.326 INTEGRAL EQUATIONS SEC. This implies that 2 also must be real.lsJOLx \1 2) + . Solving for A and B we get A C(1 -J2)+D2 B=C213+D(1-J2) J d where we have supposed d = 1 . First we prove that if the kernel is real and symmetric. when we have a homogeneous equation. from the equation D(. or. while A and B have to be com- puted. that is. So f(t) dt = C. in the usual way from a determinantal condition. Hence we get as before Y(x) = fix) + 1 _ ZI T'7' .2 » f(t) dt . 6 a6 The left-hand integral is obviously real and so is the right-hand integral since its conjugate value is cb y(x)K(x. that is. t)y(t) dx dt .lt + BA which inserted into the equation gives A = C + 2 A2+BA. Fredhohn's homogeneous equation of the second kind As mentioned earlier the case f(x) .0. but here we shall only touch upon a few details which have exact counterparts in the theory of Hermitian or real symmetric matrices. B=D+ A2 + 2 B.1. Multiplying by y*(x) and integrating from a to b we get y*(x)y(x) dx = 2 5 5y*(x)K(x. Now. 16. t)y(t) dt. The eigenvalues are determined. then the eigenvalues are real.2= 0.i) = 0. There exists an extensive theory for these eigenvalue problems. .2. K(x.BA and y(t) = f(t) + A.ltx -f +t0 . x). Spty(t) dt _ B . gives rise to an eigenvalue problem.2. Putting So y(t) dt = A . t) = K(t. t)y*(t) dx dt = H y(x)K(t.2 . for example. x)y*(t) dx di which goes over into the original integral if the integration variables change places.?. if we are approximating by a finite linear system of equations. y(x) =J(x) + AAx -{. 16. `Otf(t) dt -D the constants C and D are known in principle.

t . we get finally D(x. 16. For Yi(x) = 2i K(x. 2) . Sb yi(x)K(x. Hence we have (1 . z)D(2) = 2 " K(x.li # 2k we get Yi(x)Yk(x) dx = 0 .2. t)y(t) dt . xj)D(2) .2 S K(t. and as . the sum of these products. t)Y4(t) dt . (16.. z)D(x. subtract and integrate: b (2k . (16.2.1hKijdtik = 0 . Ja O Yk(x) = Ak ° K(x.2.2hKkjdkk .)D(xk. To do this we start from the matrix A in (16.(t) dx dt LJ . equals zero. z.1) a We shall also try to construct a solution of the homogeneous equation Y(x) = 2 ° K(x. t) = K(t. the two double integrals are equal. t)Yk(t) dt Now multiply the first equation by Akyk(x) and the second by . If we choose the elements of column j and multiply by the algebraic complements of column k. t)Y. . x. 2) .liyi(x). by z. .2). 2) . 0: D(xk.2.AK(x. t)D(t. z. 2) dt . Dividing by h (note that djk and dik as distinguished from dkk contain an extra factor h) we get after passage to the limit h -. t)yk(t) dx dtl J Since K(x. x) and the variables x and t can change places.2d Jay. FREDHOLM'S HOMOGENEOUS EQUATION 327 SEC.1.3) These two relations are known as Fredholm's first and second relation. as is well known.(x) dx = 2i'lk Jayk(x)K(x.(x)y. z.1K(xk.2) Expanding by rows instead of columns we find in a similar way D(x.2hKjj)djk . (16. 2) dt . a Replacing xk by x and x. Next we shall prove that two eigenfunctions yi(x) and yk(x) corresponding to two different eigenvalues 2i and 2k are orthogonal. z)D(2) = 2 K(t. where the products (j) and (k) have been written out separately. 2) dt = 0 . xj .AK(x. t ..

t)D(t. In general. 2) dt. A = . 0 . as is easily understood.4) EXAMPLE Y(x) _ 2 Jo(x + t)Y(t) dt .6 ± 41/'3 . and under the as- sumption that D(x.2)x+3] Taking d = 6 we have apart from irrelevant constant factors y=x>/ + 1. we have the desired solution apart from a trivial constant factor: y(x) = D(x. t)y(t) dt .2. One solution is consequently Y(x)=2[(1.1) A parameter 2 can of course be brought together with the kernel.3. zo . Hence we can choose z = zo arbitrarily. 16. independently of z.328 INTEGRAL EQUATIONS SEC. The eigenvalues are obtained from D(A) = 0. D(2o) = 0. z.JA)(x + t) + 2(xt + j)]. (16. (= cep) . z. 16. the same result.3.3. that is. Fredholm's equation of the first kind The equation which we shall study very superficially in this section has the form fix) = lb K(x. As before we have D(x. If we choose other values for t we get. Fredholm's second relation then takes the form: D(x. Zo) . that is. Another method can also be used. z. t = 0. 2o) _ Ao SaK(x. this equation has no solution as can be realized from the following example: sin x = f1 e--0y(t) dt . 2) A[(1 . 22 + 122 = 12. The equation shows that the solution must have the form y(x) = Ax + B. A) is not identically zero. t. and we can choose for example. Now suppose that 2 = 20 is an eigenvalue. and this implies Ax+B=2 (x+t)(At+B)dt which gives the same result. (16.

If the summation is truncated at the value n we have the system Ac = b (16. However.3. t)y(t) dt .-1 aikui(x)uk(t) r-1 c.3.4.l S K(x.3.3. 16.2) Y(t) _ {=1 Lciui(t) The functions ui(x) being orthonormalized over the interval (a.4) with well-known properties regarding existence of solutions.4. K(x.Y. = (16.(t) dt = i. t).3. and y(t) can be expanded in a series of orthonormalized functions ui: fix) _ biui(x) . VOLTERRA'S EQUATIONS 329 However.k=1 aikckui(x) =1 Identifying we get an infinite linear system of equations ail ck = bi 1 k=1 where the coefficients Ck should be determined from the known quantities aik and b1. in about the same way as for .3) After insertion into (16. b) means that D ui(x)uk(x) dx = Sik (16. Volterra's equations First we are going to discuss Volterra's equation of the second kind y(x) = f(x) + .1) directly by a suitable quadra- ture formula giving rise to a linear system of equations: w Ea. i=1 K(x. (16. 16.1) we obtain: D biui(x) = S o i.u. there are some advantages in discussing the case separately.3.5) =o However. let us suppose that fix).KK.4. (16.4. t) = 0 when x < t < b. there is an apparent risk that one might not observe that no solution exists because of the approximations which have been made.(x )uk(t) .SEC. t ) = aik u .1) Obviously. Another possibility is to approximate (16. it can be considered as a special case of Fredholm's equation if we define the kernel so that K(x.

sin h= 6 +. and it is easy to show that we obtain the whole series expansion for sin x.2) y. successively.. . there is one important difference: if K(x.x)Y(t) dt .JK.oyo + JK. Y2=x+\o(t-x)(t.lh[ JK.4. t) and Ax) are real and continuous.=2h. As is the case for Fredholm's equation we can also construct a Neumann series with iterated kernels..sin 3h = 6s + Already Simpson's formula supplemented by the J-rule gives a substantial improvement.sin 2h = 6 3 + y. .] .4.=3h-4h'+h° y.. This method is very well adapted for use on automatic computers.6 -I 12 .f + . The technique is exactly the same. Using successive approximations we get Yo=x. 2. = fx + 2h[. y y2. and decide to use the trapezoidal rule (more sophisticated methods can also easily be applied). (16. =I°. triangular systems of equations compared with the general case.1Y. EXAMPLE Y(x) = x + (t . 6 . 16. y = . = h YL . and we can determine the desired values yo.330 INTEGRAL EQUATIONS SEC. further the interval length h.h' Ys . Then we obtain Y.=x+ (t-x)tdt=x. For simplicity we assume a = 0.6)dt=x.oyo + K11Y1 + K22Y2] .. However. y. So 1. y. then the series converges for all values of A. The trapezoidal rule with interval length = h gives Yo = 0 h' Y. and there is no reason to discuss it again.

y"(x) = -y(x) with the solution y = A cos x + B sin x.t)-'y(t) dt (0 < a < 1) . and in (x. the first function value being determined from the relation above: . dt = sin Ira f(z) dz Soy(t) 7r o (x . the procedure can be repeated. It is also possible to use the same difference method as before.t)rt If z is a vertical and t a horizontal axis.SEC.1 the result is y = sin x.t)s and goes over into s-°(I 0. x) # 0 in the considered interval we can divide by K(x. and integrating in z between the limits 0 and x. z)1-a Jo (x .a) _ E sin Ira that is. Very briefly we shall also comment upon Volterra's equation of the first kind fix) = K(x. x).t)0(x .4. 16. dividing by (x .t = (x .1'(a)P(l . and reversing the order of integration we find dz Az) dz . .4. t)y(t) dt . z)'-rt .Sz y(t) dt 1 S o (x .y(t) dt .ty y(t) dt. s)"-`d . x)y(x) + Sa t) y(t) dt .4) Jo Writing insteadfiz) = Sa (z . Differentiating the equation. VOLTERRA'S EQUATIONS 331 3. x). we obtain f(z) dz = f= dz f= y(t) dt Jo J= (x . and assuming K(x.z)'-a. the integration is performed over a tri- angle with corners in the origin. I' (x) = K(x. we find y'(x) = I .Z)1- The last integral is transformed through z . x) obtaining a Volterra equation of the second kind. (16. (16. y'(0) = 1 .a K(a. in (0. But since y(O) = 0 and y'(0) .Z)'-" o : (z . z)'-« Jo (z .4.3) J Differentiation with respect to x gives 8K(x. If necessary. a) A special equation of Volterra type is the Abelian integral equation f(x) = (x .

(n . D-2y = D-'(D-'y) = o (x . + c.332 INTEGRAL EQUATIONS SEC.5.5) it dx o (x . curves having the property that the time which a particle needed to slide along the curve to a given end point is a given function of the position of the starting point. . Jo (x .5.2)! These values are now inserted into (16.(x) d"-lu + . 1) and the following equation results: y(x) = f(x) + fo K(x. .1)! (n . Let us start from the equation d"u + a. Differentiating this relation we obtain the desired solution y(x) = sin air _ z) dz ..4. Putting d"u/dx" . (16. 16.y we find D-'y = y(t) dt (defining the operator D-') .1) dx" x"-' where the coefficients are continuous and obey the initial conditions u(0) = co u'(0) = c.t)"-'y(t) dt .t)y(t) dt (by partial integration) . . D-"y = D-'(D-"+y) _ I). (O) = c"_1 .t)y(t) dt. Connections between differential and integral equations There exists a fundamental relationship between linear differential equations and integral equations of Volterra's type. 16. x"-2 u = c"-1 X"-' + cn_2 + .5. 1 (n By integrating the equation d"u/dx" = y successively and taking the initial con- ditions into account. (16. we get d"-'u dx"-' d"-2u dx"_2 = c"-lx + c"-2 + D-'y .z)'-" Abel's initial problem was concerned with the case a = A and arose when he tried to determine tautochronous curves.. (16... + a"(x)u = F(x) .2) ..x + co + D-"y .5. that is..5.

1957). u'(0) = 0.ap(x) (ci_1 v (n . Compute D(A) and D(x. Hence y(x) _ -1 + S (x + t)y(t) dt . u" . t) = xt. Wiirzburg. Using the result in Exercise 1 solve the integral equation y(x) _ + So xty(t) dt . 2) when K(x..5. u(0) = 1. t) -[a1(x) + az(x)(x .x)y(t) dt .. [4] Hilbert: Grundzuge einer allgemeinen Theorie der linearen Integralgleiehungen (Teubner. ..4) (n . New York. EXAMPLES 1. Find the solution of the Fredholm integral equation y(x) = x2 + 2 f (x2 + t2)y(t) dt . + C1X + co) (16.5. 2. (Physica-Verlag.2)r + . [2] Mikhlin: Integral Equations (Pergamon.2xu' + u = 0. [31 Tricomi: Integral Equations (Interscience Publishers. New York.1)! + (n .1).. u(0) = 0. One finds y(x) = -x + (t .EXERCISES 333 where fix) = F(x) . 1924). 0 REFERENCES [1] Lovitt: Linear Integral Equations (McGraw-Hill.t) + a3(x) (x 2! 1)! 1 ] + a (x) ((n (16. 3. K(x. Thus we can say that the integral equation describes the differential equation together with its initial conditions. b = 1. 0 . t. u" + u = 0. a = 0. . Jo with the solution y(x) = -sin x (note that y = u"). New York..3) and r _ t)z +. 1912). 2. Leipzig. EXERCISES 1. 1957). 1953). u'(0) = 1. X-1 x°_2 . [5] Petrovsky: Vorlesungen uber die Theorie der Integralgleichungen. -F cw-2) .

9.t) = t(1 . Determine eigenvalues and eigenfunctions to the problemy(x) = 2 S' K(x. u(0) = u'(0) = 1. 12. Solve the integral equation y(x) = x + So ty(t) dt.x + So K(x. 10.334 INTEGRAL EQUATIONS 4. Find an integral equation corresponding to t. Find an approximate solution of the integral equation P1 y(x) = x + sin (xt)y(t) di . Show that the integral equation y(x) = I .3u" . 11. Find the eigenvalues of the integral equation y(x) = d S0"2cos (x + t)y(t) dt. Using a numerical technique solve the integral equation y(t) dt Y(x) = I + Jz ox+t+1 for x = 0(0. . 7.2)1. t)y(t)dt where K(x. Apply the trapezoidal rule.t) for t>x is equivalent to the boundary value problem d-y/dx2 + y = 0.x) for t<x {x(1 . 14.u = 0. t)y(t) dt where K(x. Determine the eigenvalues of the homogeneous integral equation y(x) = z (x + t + xt)y(t) di . 8. Find an integral equation corresponding to u"' . 13. 6. y(O) = 1: y(l) = 0. u(0) _ U'(0) = U'(0) = I. . 0 by replacing sin (xt) by a degenerate kernel consisting of the first three terms in the Maclaurin expansion.6u' + 8u = 0. Find the solution of the integral equation 3x2 + 5x3 = Sa (x + t)y(t) dt. Solve the integral equation y(x) = x + So xt2y(t) dt. t) = min (x. t). 0 5.

Which norm we should choose must be decided indi- vidually by aid of qualitative arguments..6).. 335 . The most natural measure is some of the vector norms (see Section 3. and sometimes also for prognostic purposes.w. and v v . CHARLES BABBAGE (in a letter to lord TENNYSON) 17. we choose the maximum norm. Putting v. u be the given values. Gen- erally.u. . this cannot be true and I suggest that in the next edition you have it read "Every moment dies a man every moment l'/16 is born. v the values obtained from the approximation. especially on automatic computers. If we choose the Euclidean norm. but it has also a practical aspect in connection with the computation of functions. and social sciences. Let u." Even this value is slightly in error but should be sufficiently accurate for poetry. The latter case is of great theoretical interest.Chapter 17 Approximation Sir. . u . biology. which is very popular with many people. . physics. we get the Chebyshev approxi- mation technique. to follow changes which might occur.. Different types of approximation In the following discussion we shall consider the case where we want to represent empirical data with a certain type of function and the case where an exact func- tion whose values are uniquely determined in all points is to be represented by a finite or an infinite number of functions. the representation obtained is used to get a deeper insight into the interrelations. =.. every moment one is born. we obtain the least-square method. and in this way we obtain the principle that the approximation should be chosen in such a way that the norm of the error vector (taken in some of the vector norms) is minimized. we want to find an approximation that makes the components of W small. chemistry." Obviously. for example. as components of a vector W. If. The former case occurs frequently in several applied sciences. In both cases we must define in some way how the deviations between actual and approximate values should be measured. instead.0. In your otherwise beautiful poem (The Vision of Sin) there is a verse which reads "Every moment dies a man. which has aroused a great deal of interest in recent years. medicine. we can regard the errors w...

.ky.x".n = vTM These equations are usually called normal equations..+a.P(x)l < 8 for all x e [a. = v. If m = n.. for example. This famous theorem states the following.1.y With the notations sk ..x... + . 17..(x)=ao+a. [ 11 ] or 112].b]. m aa.. One might expect that the usual Lagrangian interpolation polynomials with suitably chosen inter- polation points (nodes) would be satisfactory. slap + sa. where the experimental errors are associated with the y-values only. then to each e > 0 there exists a polynomial P(x) such that lf(x) . With this background Weierstrass' theorem is indeed remarkable. and vk = E j x. 2.5.Z. we shall determine the coefficients by minimizing S(ao+a. k = 0...+ sTM+... including the end points. (17. we can measure the deviation in the whole interval. It is hardly surprising that polynomial approximation plays a very essential role when one wants to approximate a given function f(x).. + s.a.a. we get the system soap + s. + .a.+. yo). x. For proof see. + S7. Runge has shown that the Lagrangian interpolation polynomials constructed for the function 1/(x' + 1) in the interval [ .. if m < n.+lal + .. .Yi )x = 0 . As a matter of fact. the polynomial is identical with the Lagrangian interpolation polynomial.1. If we are dealing with the approximation of functions which are defined in all points.. Then we seek a polynomial y=y. and the Euclidean norm should be interpreted as the square root of the integral of the square of the deviations.. If fix) is a continuous function in the interval [a.1.+a... . . 17.. . Least-square polynomial approximation We suppose that we have an empirical material in the form of (n + 1) pairs of values (xo.xi j=0 as =2 (ao + a1x. 1. + . (x. j=o or aoEX + x... 5] with uniformly distributed nodes give rise to arbitrary large deviations for increasing degree n. + aTMxi . b]. 336 APPROXIMATION SEC. but this turns out to be a dangerous way. fitting the given points as well as possible. = vo . For the maximum norm we measure the deviations in all local extrema. (z y.s. ..na... .x+.).2) smao -+.. ..

P-'v) + yTy . where n > m and further P = ATA.aTATy . x%.x. . LEAST-SQUARE POLYNOMIAL APPROXIMATION 337 The coefficient matrix P is symmetric. that is. m + 1). This is possible only if all coefficients ao.." vanish.yTAa + yTy = aTATAa ..y) = aTATAa . We put X. and we shall show that the system has a solution which makes S a minimum. we can write S = (a .vTP-1v = [A(a . here we observe that yTAa is a pure number. xo xo vo Yo 1 x. which means that P-1 exists. Now we have to choose a so that S is minimized which is achieved if A(a-P-'v)=0. . a = 0."(x) = ao + ax + + a. that is.P-'v)TP(a .vTP-'v . This is equivalent to stating that the homogeneous system of equations Pa = 0 has the only solu- tion a = 0.. + ax* = 0.. first by AT. A = I 1 a = v= y= Y1 L1 x" xw.+awx0" =0 9 jao + axf + . We have then proved that Pa = 0 implies a = 0. Putting P = ATA we shall first prove that P is nonsingular. .. which means that the polynomial y. this is obvious since Pa = 0 implies aTPa = aTATAa = (Aa)TAa = 0. and hence we would have Aa = 0. a .. This equhtion is multiplied from the left. then by P-' which gives a=P-1v.p-'v)]T [A(a _' P-'v)] + yTy . This last system written out in full has the following form: ao+alxu +. x."x" vanishes for n + I (n > m) different x-values xo. On the other hand.1. We shall minimize S = 1Aa-yl'=(Aa-y)T(Aa . Putting v = ATy.SEC. x"I V. 17.J LV. and hence yTAa = (yTAa)T = aTATy.. P cannot be singular. x .2aTATy + yTy . The problem is conveniently solved by use of matrices..._ LY" Here A is of type (n + 1. .. a..

32 1 10 100 1000 10000 1.66 229.52 696.00 431600.+ 116590368a. = 1322336 . 17.96 79. For large values of m. v. + 17064a.76 . 296ao + 170 64a.x + a.20 1584.00 8 100 10000 1000000 100000000 43.16 j x.919. xi Y.44.= 6479. + 1322336a.00 296 17064 1322336 116590368 94.22 12.001739.96 478.56 1293. s.1.76. = 537940. = 0.32 4 20 400 8000 160000 3.68 236.96 1 7.20 122. Later we will show how these difficulties can be overcome to some extent. the system is highly ill-conditioned and gives rise to considerable difficulties. = 116590368 and the system of equations becomes 9ao + 2 96a. x. v.56 43.6 4 2.44.72 43. s. . = 6479.72 3.00 2 12 144 1728 20736 1.64 19.76 6479. 0 8 64 512 4096 0.22 1.= 537940.00 5 30 900 27000 810000 7. a.04 56.16 3 16 256 4096 65536 2.16 4316.80 Hence we get so=9.66 11.2782.00 7 60 3600 216000 12960000 21.2) can be written Pa = v. After some computations we find that ao = -1.60 77616. x iyj xjY.44 537940. EXAMPLE A quadratic function y = ao + a.80 . 17 064ao + 13223 36a. x. As is easily seen (17. a2 = 0.x2 should be fitted with the following data: x 8 10 12 16 20 I 30 40 60 100 Y 0 . vo = 94. = 94. s.80 6894.00 6 40 1600 64000 2560000 11. s.40 19136. and if we know that P is nonsingular we find the same solution.88 1.80.88 7. 338 APPROXIMATION SEC.1. = 17064.96 21. = 296 .

99 21. LEAST-SQUARE POLYNOMIAL APPROXIMATION 339 SEC.+I-y:)' of ak With the notations : Ex.)' .nxoyo.1. Thus 1 +k' as = 0 gives. Further.30 A quite common special case arises when data have to be represented by straight lines.nxo. and then it is natural to minimize the sum of the squares of the vertical deviations.+I- i y. : x. as before I = ya . k vsa. = SR .04 1. In the first case we suppose that the errors in the x-values can be neglected compared with the errors in the y-values. With these coefficients we get the following function values: 10 12 16 20 30 40 60 100 x 8 y 11 0.99 11. s. 1. . Here we suppose that the x-values as well as the y-values are subject to errors of about the same order of magnitude.+I. and 1 = yo -A . 17. i i : A .=kE(kx. and then it is more natural to minimize the sum of the squares of the perpendicular distances to the line.s..kxo.k+nlxo-v.nnxyo .67 2. o A xo 2.-nyo. E xiyti = vl + Yai = t. y. We number the points from 1 to n: S=(kx.)x.=nxo=s Ey:=nyo=t1.04 43. we obtain I = yo .=0. C=t. that is.24 7.kxo . which means that the center of gravity lies on the desired line.98 4. B = v.42 1. . and we shall consider two such cases separately. al aS=0 ak gives (I+k') i ti (kx.

we get k2(v. 17.).b). systems where the number of equations is larger than the number of unknowns.nx0Yo) + k(s. then we minimize S = (Ax . yo). (17. (17.2.2. . solving the normal equations gives rise to considerable numerical difficulties.[Y<-Q(xi)J2=min. Here we also mention that overdetermined systems. Forsythe [1] expresses this fact in the following striking way: "When m Z 7 or 8. however. 17. as well as partial. Suppose that the n + I points (x.1 =0." Approximation with orthogonal polynomials might improve the situation although there is evidence of numeri- cal difficulties even in this case.2.3) i-o The problem is now to determine the polynomials P.. y. even for moderate values of the degree m of the approxi- mation polynomial.(x) is a polynomial of degree j fulfilling the relations x r (/.(v. From this equation we obtain two directions at right angles. Polynomial approximation by use of orthogonal polynomials As has been mentioned.. The method is not limited to this case but can be used for many other purposes.. . We seek a polynomial y = Q(x) of degree m (m < n) such that S=t..(x). one begins to hear strange grumblings of discontent in the computing laboratory. for example. differential equations [3]. n) with m > n. that is. . . In this way we are back at the problem just treated.k . .1) The polynomial Q(x) shall be expressed in the form Q(x) _ E a1P. where A is of type (m.2) i=o where P.nxoyo) = 0 or k'+A BCk . (x.b)T(Ax .(x) as well as the unknown .2.340 APPROXIMATION SEC. can be "solved" by aid of the least-squares method.) are given.2.. i-0 (17. After simplification. and in practice there is no difficulty deciding which one gives a minimum. Let the system be Ax = b.nxo . (x y. for numerical quadrature and for solving ordinary.t2 + nyo) . .

(xi) + 2 E P.(xi) A w ft 2 y3P. + a1 k i=0 k=0 or .kPk(x) = a2..(xi) . a.6) we further obtain P. We differentiate S with respect to a..2.r or a.2.(xi) + 2 k=0 E ak P.2.(x). putting x = xi. POLYNOMIAL APPROXIMATION 341 coefficients a.P1(x) + E a.2.2. . the powers xi can be written as linear combinations of the polynomials: I 1-1 x' _ E a. (17.6) by Pr(x).-i (17.kPk(x) . (17. we obtain xi-' _.2.SEC.2 aa.2.5) k=0 Conversely.4) i=o Then we have to determine the polynomials P. putting x = xi. E akPk(xi) i=0 i=0 k=0 tR w _ -2 E yi P. i=0 (yi . (17. and get as = .6) k=0 k-0 Squaring.(xi)Pk(xi) i=0 i=0 w 2a.kPk(xi)Pr(xi) = a.F. 17.2. we get ti=0x:Pr(Xi) = Ei=0Ek=0 a. = E yiP1(xi) . (17.7) i=o k=o Multiplying (17. and summing over i.2.kPk(x)}/as.Q(xi))P. .(x) _ a. a.(x) = {x' .2 EyiPf(xi) i=0 The condition aS/aa. = 0 gives the desired coefficients: a.kxk.9) k=o .k = i=o Ex:Pk(xi) (17. and summing over i.8) From (17. which we express in the following way: i P.

S1 Naturally. = vo . y. 0 z alo = Et xiP0(xi) /S. a.s.S2 . and v.' j=0 (17.).'.. Po=_..Y-) - Putting s.(xi) = .2.p a. i-0 -Ea. and by now we also know Q(x).. . . (x. For j = 0 we get Po(x) = 1/a.. SO S2 .2.silso v Vi . from (17. instead m can be increased successively. + (s0v.P. a. The derivations are completely analogous and will not be carried through here. P2 .2. which is a great advantage. from (17.yo)x Q(x) = S./so all v s2 . .8). The next steps come in the following order: .R. .4). YtiP.2. V So So Pl x-al0Po x .2 E Yi w x tw u . in the usual way.9).k-0 and by use of(17.s2/s0 Q(x) = aoP0(x) + a. we have .o..s. and P.vo/so ...(xi) + r. P.0. aakPj(xi)Pk(x:) . a_0. i-0 i=0 j=0 i=0 j. from (17.4).rt s = EY: a. a. Then we determine a0.. ajj from (17. sl2 o/so (x .s' . azy.s..s.P1(x) = '-'o + v. .7). .S.2. x..10) EXAMPLE Find a straight line passing as close as possible to the points (x.3) and (17. (x.. . We obtain ajk./So \ so) or S2y0 . this is exactly what is obtained if we solve the system (soap + s. a. . y. Closer details are given in [2]. x.V.. azl. ao = Yi P0(x:) = _ a. .342 APPROXIMATION SEC. we do not have to fix the degree m of the polynomial Q(x) from the beginning.. Obviously. s0 S2 . we obtain aoo=V. and a.a. j > k.. = s2 . all.2. S=EY. Then we can follow how S decreases. yo). 17.2. as is easily found. An obvious generalization is the case when the points should be counted with certain preassigned weights.s. P. all . = n + 1. j` slap + seal = v.. a. .2.

we obtain.13) Analogous to (17. n! d x* Thus co(x) .(x).I d [(x' . = 0..2. 17.(x) = jn + - 2 2^ .1. a. = y'dx =0 a the function system is said to be complete (Parseval's relation). 10).T.(x) dx (17.p. (17. dx = min. this case will be treated separately.11) The best-known example is the trigonometrical functions in the interval (0.tn= fby'dx .q..2.(x).p.SEC. we can use the Legendre polynomials P (x) multiplied by the normalization factor : P. form an orthonormal system over the interval a < x < b: b T.11) and (10.2. 1).Fla.F.(x)I dx . rp. EXAMPLE We consider the interval (. by use of (17.cp.2. Adding a weight function w(x) is an almost trivial complication.a. a. Using . a.=0 a< y'dx Ja known as Bessel's inequality.q 2(x).2..I )^ J .12) Differentiating with respect to a.1)V_j .2 a [y . we get J as . Now suppose that we want to approximate y = x* with aoggo(x) .(x) dx (17.2.(x)] l . where the functions T.2. as Putting aS/aa. 22r). If a.14) Ja .I/ v/_2 . POLYNOMIAL APPROXIMATION 343 Another natural generalization is the approximation of a function y = y(x) with E. T 2(X) : = (3x2. from the condition S= a [y . We now want to determine the coefficients a.12). (17.(x) = X11 T.(x) + a. .=0 Since S Z 0 we have b .11). =Jy p. we also find that S.2.2. and according to (10.

3. 17. (17. the proof of the relations is trivial.k.13). e . n) ± $(j .T.k. Now we can deduce corresponding relations in trigonometric form: w-I _I a E cos rja cos rka -aE {cos r(j + k)a + cos r(j k)a} r-0 2 r=0 a Re a 1 [er(.3) .2) 0 Here we have put 2n/n = a. a. n) .2.3.]} 2 r-o 7r[ 3(j + k. (17. n) = I0 otherwise . and b = 1.k . we found our discussion on the formulas tr=0erj.2.k: dx = 21r8. we get the system a+ 3 5 a b _ 1 3 + 5 7 Hence a = -'.2.tj 32x"dx=0 a. (17.(x) = 6x' 7 . n)] . 17. Approximation with trigonometric functions We shall now study the case when a function is to be approximated by trigo- nometric functions relating to (17.344 APPROXIMATION SEC.3.35 3 If we put conventionally S (x'-a-bx)'dx. b(m.3. (17.' e-'ka' = nb(j .3.=S}lx'(3x'-1)V8dx= 8 2 a0p0(x) + a1gi(x) + a. we get a0 = +i J _i 7_2 x' dx = A/ T 5.1) or (17. Defining I if m is divisible by n .1) e'is .12).= f_1. k) + er(j-k)n. and seek the minimum of S.

sinrja.4) E. 2%_. sin rx) .3. 17.3. _ 7r. which in the points x. y. we obtain (j.3. The coefficients are obtained from 2*-. (17. = 2 (a0 + (. if the series converges to the initial function.3. = a. (17. (17.=0 If in these relations we let a 0. (17.. First we shall try to generate a series in cos rja and sin rja.3..10) b.. + E (a. cos rja + b. instead we refer to [4]. cos rx + b. n)] . 1 u f(x) cos rx dx . (17. (17. n) . and we find 1 (%-1)1.3. = ja takes the values y. while the orthogonality at integration plays a decisive role in the theory of Fourier series. `o Here we cannot treat questions relating to the convergence of the series. sin rja) . y. or special phenomena which may appear (Gibbs' phenomenon). k * 0): 0 o cos jx cos kx dx = sin jx sin kx dx = z8.6) 0 The orthogonality at summation is the basis for the harmonic analysis. n odd . sin rja cos rka = 0 .8(j + k.7) + (a.5) 52 sin ix coskxdx = 0.=-Ey. [5].x a. a. APPROXIMATION WITH TRIGONOMETRIC FUNCTIONS 345 Other relations are proved in a similar way. . (17. Thus we have sin rja sin rka = 7r[8(j . absolutely integrable functions by a Fourier series: J(x) = 1 ao + E (a. sin rja) .3. or [6]. cos rja + b.9) 2 .=o n =o Similarly we can represent all periodic.SEC. = "fix) sin rx dx . b..3.cosrja._1 In this case we get the coefficients from . n even . .8) n .=-Ey.k.

4 2. 7r I 7r 'n b = I7rsin nx dx . Expand the function y = f(x) in a Fourier series. where 0<x<7r. 7r x 0<x<7r. We have a= R x 37r +X l cos nx dx 1 7r Jo( 4 2) cos nx dx + 7r 4 2I 0 n even. By passing to the integral form. 4 2' fix) X + x37r 7r<x<27r. n odd. this means that only certain discrete frequencies have been used. fx)-J+1 We find a.346 APPROXIMATION SEC. Further. we easily find b = 0.=I1 r cosnxdx . 4 2 7r 3' 5' For x = 0 we obtain 1 8 .I + 33+ 5'+.3.5cosnxdx=0. 7rn Thus we obtain = sin x + sin33x + sin55x + 0 < x < 7r . 17.x _ 2 (cos x + cos 3x + cos Sx + 0 S x < 7r .. we actually use all frequencies. _ 27rn'n odd ... o I 7r rat z sin nx dx = 2 7rn [1 and hence 0 n even . When a function is represented as a trigonometric series. and hence it . EXAMPLES 1. The principal point is Fourier's inte .

. are obtained as roots of the algebraic equation 2" + A..x dx . F(v) will be com- plex: F(v) = G(v) . If the computation is performed for a sufficient number of v-values. 17. 2.4.... a . 0 Jm 0 with g(x) = .fix)=a.... (17.3.. 2 221 .f(-x).el3x+. instead. and hence we must require that fix) and F(v) vanish with a suitable strength at infinity. + A. Last. and conveniently we compute the absolute value q.. one can quite easily get a general idea of which frequencies v are dominating in the material f(x). Suppose that we have a function f(x) which we try to approximate by ..+a.. . Approximation with exponential functions The problem of analyzing a sum of exponential functions apparently is very close to the problem just treated.SEC. a. which states a complete reciprocity between the amplitude and frequency functions: F(v) _ (-mtx)e-2x:. In practical work these integrals are evaluated numerically. and we get the coefficients a.2"-' + + A. then F(v) can be computed by numerical quadrature: F(v) = g(x) cos 2nvx dx . a. (17. the convolution integral will contain the same oscillations but with amplitude a'. if there is a pronounced peri- odicity in the material. y = 0 with presently unknown coefficients A1. It can easily be shown that if f(x) contains oscillations with amplitude a. then for certain values of t the factors fl x) and fix . and 2.t) will both be large. and hence the integral will also become large. We observe that f(x) satisfies a certain differential equation: y(") + Alycw-1> + . In this case one forms con- volution integrals of the form S f(x)f(x . A A We compute y' y" y(") numerically in n different points and obtain a linear system of equations from which the coefficients can be obtained.i h(x) sin 27rvx dx . .1) We assume that n is known and shall try to compute a a . If f(x) is known at a sufficient number of equidistant points. In general... h(x) = f(x) .(v) = (G(v)' + H(v)2)'1' . 2.Ax) +J(-x).t) dx.. e''=+a. 17..4. APPROXIMATION WITH EXPONENTIAL FUNCTIONS 347 gral theorem...iH(v). by use of the least-squares method.4.-e2"s.. Another method is autoanalysis. 2. = 0... . .11) f(x) = F(v)e29"x dv .

.3) P(v) . If this is repeated.. For example.s1 + Y. so = 1.. .e'2. four significant digits.. Thus we have n -m + 1 equations in m unknowns.(x) = 0.. We will approximate Ax) by a.348 APPROXIMATION SEC.Y. (v-v. Since the data..Y0 + s.V. + s. .-. we obtain for r = 0.4.2 to two places. = Yi . + Yes.)(v-v.... + C.305 e '-68s + 2. At first glance this method seems to be clear and without complications. (x.4.. .)v"'+s. m is.0951 e : + 0.) = 0./h. 1.4) IYw-Is. + Yos. = s. . (Xi. which very often come from biological.-rs3 + ..)C2 + .. + y.+s... e 4. since q1(v.... f.8607 e 32 + 1. of course.. + .. + `. We assume that the function y = f(x) is given in equidistant points with the coordinates (xo.. Forming (v-v. s.-1Yl + . 1. V.. but in practical work we meet tremendous difficulties. s . physical. + C.+l (17. can be determined by means of the least-squares method.. e-0-5x + 0.. at most._'s.vw = Y... = -Y. 2. Lanczos has shown that the three functions fi(x) = 0. A better method.3) and 2. m the equations: Cl + C. Finally.202 . Then we get v v . Normally. = e"'. + Y.YI). multiplying in turn by s. approximate the same data for 0 S x S 1. from (17. originating with Prony.m + 1 >m.. . and adding._1S1 + Y.5576 e-3i . . + . -y.. (17.v'"-'+. II Y..e'rso and v.-1 + soy.9Es.43z' f. + . 17.. (17. are usually given to.e'ls + a. where x. = log v.v...4. y. or chemical experiments. and normally n . = 0 . and further it is clear that we get a new equation if we shift the origin a distance h to the right. e-4.(x) = 0. we obtain p(vt)CI + q(v. s.... + Putting c. The main reason is the numerical differentiation which deteriorates the accuracy and thus leads to uncertain results. hence s1. = -Y.-2S2 + . = a..041.. _ xo + rh..s. r = 1. substantially smaller than n. 2..vr + c.. = Yo ' CIv1 + C2V2 + . (v. is the follow- ing. m.) . + . it is easily seen that an approximation with three exponential functions must already become rather questionable. which has also been discussed by Lanczos [4].yo).). + C..4.68 . we get the following system: Y.79 .)C.4.. s.2) C.. we get ..

1 U. (17. For m = 1 we get in particular 2xT (x) . we get y = T. for example. the other fundamental solution of (17. =off"..(x) = 8x3 . (17. (17. APPROXIMATION WITH CHEBYSHEV POLYNOMIALS 349 C.5) 1/1 -x' 2 k7r . dy n sin no dx y sin 0 and y = -n' cos no + n sin no cot 0 n'y + sin'B 1 -x' 1xy'-x' . m = n = 0. the last equation can be left out). we can successively compute all T.1 T.5.4) The first polynomials are TO(x) = 1 U0(x) = 0 T. v.U.SEC. 17.-.(x). (17..2): 2xU (x) . 17. the polynomials T (x) are orthogonal with the weight function -x'.3) is the function S (x) = sin (n arccos x).1) where it is natural to put T -.. from (17.3x U. we have in the same way as for (17.5.(x) = 16x° .(x) = x.(x).5. that is.(x) = 2x2 .5.5.(x) = x U.5.3.' c .3) As is inferred directly.(x) = 2x T. In particular we have So(x) = 0 and S.5.TA_.4.2) Since T0(x) = 1 and T. . for a moment.(x) = 16x' . .(x) = 4x' .(x) = 1 T. Approximation with Chebyshev polynomials The Chebyshev polynomials are defined by the relation T (x) = cos (n arccos x) . +t dx = m=n 0. .x'.5.(x) = T (x).12x' + 1 From (17.2) (since there are m + I equations but only m un- knowns. Well-known trigonometric formulas give at once (x) = 2T (x)T (x). Thus the polynomials T (x) satisfy the differential equation (I -x')y"-xy'+n'y=0. Putting. further.(x) = 4x3 . With U (x) = Sa(x)/1/1 --x2. x = cos 0.(x) = cos no and. (17. and a.(x) = 8x4 .(x) = 1/1 . = c.5) we have directly u min.5.20x3 + 5x U.(x) .4x T.8x' + I U.

all coefficients change if we alter the degree n...6). If we use (17.. k < n and at least one of them different from zero.) cos (17.5) or (17.5.5. Suppose that we want to approximate a function f(x) with a Chebyshev series: f(x) = Jc0 + c1T1(x) + c. = c"T"(x) + c". . which is possible only if f"_1(x) = 0.(E.p.1) would have an alternating sign in (n + 1) points x0..k.(x) = 2-)"-')T (x) . we have "-1 n E T.) = 2 ail: The general formula is A-1 Tk(s`.1T"+1(x) + . Hence f"_. x . which means that we cannot just add another term if the expansion has to be improved. . x.. can be expressed as R.8) In particular we observe that ck depends upon n.. we consider all polynomials of degree n and with the coefficient of x' equal to 1.. that is.2.1 < x < 1.. Now suppose that lp"(x)l < 2-1"-" everywhere in the interval -1 < x < 1. To determine the coefficients ck in (17.(x)I.3)..f(s..5.s. that is. + R"(x) .. r = 0. we can use either (17.)Tk(E. and that T"(x) _ (. 1. (17. we seek the polynomial p"(x) for which a is as small as possible.5. Putting a = sup_.350 APPROXIMATION SEC. i 2-c--"T."S(j ..1. 2n) + (-1)(j-k)/.=cos (2r+ 1)7c 2n r=0..1. The latter might be preferable. For .(x1) < 0 .5. 2..7).) = 2 [(-1)''+k'''"b(j + k. since the integration is often some- what difficult: 2 n-1 2 " -1 (2r 2n1) 7 Ck = n n .(x1) . when x=E.-..) . 2-(1-1)T"(x).. Then we have 2-c. = cos r7r/n. if j. _n. oo. p"(x) . .5). The desired poly- nomial is then 2-""-"T"(x). The remainder term R. we obtain the limiting values of ck when it --. as will become clear from the following discussion. 2n)] .5..(x) would have n roots in the interval.5. Chebyshev discovered the following remarkable property of the polynomials T"(x).T. First we observe that T"(x) = cos (n arccos x) = 0.1)* for x = x.-1)T"(x.3. 17.n. the polynomial f.p"(x) of degree (n ..p"(xo) > 0.6) In analogy to (17.(x) + . 1 Jp.

n takes the values 0. 3. SEC. The general coefficient in the expansion xk = E c"T" becomes C. For the other coefficients we find c. we have R. Since f(x) is an odd function. If the function f is such that we know a power-series expansion which con- verges in the interval.).+5T.e-if)k a""0' + e-"P 2k 2 eki + (k)1 e2 (-)i + (k)2 e(k-4)i + .+7T. . EXAMPLES 1. Clearly. x1 =s(3To+4T. and we are left with c" = 2-k+1 k l (17. and then the terms can be rearranged to a polynomial. + eki'P a"iy + e. we get 1 also for the term e(1r-k'i p .k+.+TO).+T4). = 2. . ..T"(x) dx = ? JA cosk q' cos ncp dq 7r .` +' arcsin x cos ((2k ± 1) arccos x) dx 7r-1 t/1 x'x' TO .+T7). x1=614(35T1+21T. 17. since co = 1. APPROXIMATION WITH CHEBYSHEV POLYNOMIALS 351 and if the convergence is sufficiently fast.. e-"if with the same r.k = 0. x°=-5 (10To+15T. all other terms vanish. In this case the error oscillates between -c" and +c".+6Ti+TT). a power-series expansion can be replaced with an expansion in Chebyshev polynomials.5.. _ 2.. and analogously. k. . a direct method can be recommended. 5.9) \(n + k)/2/ If k = 0 we have c" = 0 except when n = 0. J+1 x. = c"T"(x). the coefficients in this polynomial differ from the corresponding coefficients in the original expansion. We have 1 =To. If k is odd. 4. we get c. x2= z(To+T. 2.x' 7r o . When we integrate. f(x) = arcsin x. n takes the values 1. and if k is even.5. x3= (3Ti+T9). x5 =_A(10T. By use of these formulas.. k. -"iv 2k 2 The term e'k-'"w e"i'P becomes 1 if r = (n + k)/2. VI . The integrand can be written (ei' . This latter expansion is truncated in such a way that our requirements with respect to accuracy are fulfilled. x =T1. .

000000 00000I T This expansion is valid only for . and again we get the formula 7r 8 + 9 + 125 + .000542 926312T. + 0.-term and rearrange in powers of x.0.0. .130318 207984T. .000022x + 0.0.5.005474 240442T.000000 199212T8 .000045 . we obtain e-1 = 1.499199x2 .001615.* For small * For convenience.008333x6. If. The error is essentially equal to 0. Expansion of other elementary functions in Chebyshev series gives rise to rather complicated integrals.`P _ 4 (2k + 1 I)= ir Thus T. = Te T6 = = 1.000000 011037T9 + 0.(x) + T2(5) + arcsin x = 4 [T. we obtain Sn ( .000000 000550T10 .5x' .166667x3 + 0.0.5. 17.000045.0. we find e : = 1. + 0.008687x6 .352 APPROXIMATION SEC.266065 877752T9 . instead. whose largest absolute value is 0.000044 977322T6 . Putting x = cos p.000000 000025T1.I < x < 1. + 0.x + 0. 1 . The error curves are represented in Fig.cos (2k + I)p dcp c9k+1 = n o 2 . we get e. and we will not treat this problem here (cf. where the error is essentially x6/720..000045T5.000003 198436T. we truncate the usual power series after the x°-term.0. If we truncate the series after the T.(x) + 7r L= 9 For x = 1 we have T.271495 339533T.0.043794x4 . [71). . 17.0.1.041667x' .166488x3 + 0. 2.1. + 0.044336 849849T3 + 0. the latter curve is reversed. that is. . 36 times as large as in the former case. If we expand e_T in power series and then express the powers in Chebyshev polynomials.0. the maximum error is obtained for x I and amounts to 1/720 + 1/5040 + = 0.

10) where aA+. On the other hand. When Chebyshev series are to be used on a computer. Insertion in f(x) gives fix) = E (ak .2xT0) . . We form a sequence of numbers a.2xTk+1 + Tk+2)ak+$ + a07'0 + a. superior.5.. the Chebyshev series has the power of distributing the error over the whole interval. Such a technique has been devised by Clenshaw [9] for the com- putation of fix) _ k=0 CkTk(x) where the coefficients ck as well as the value x are assumed to be known. Figure 17.5 values of x the truncated power series is. ao by means of the relation ak . (1 7..(x) + k=0 _ k=o (Tk . of course.Ck .2xak+1 + ak+2 .5.5. a recursive technique is desirable. .11) . 17. Using (17. (17. = 0.(T.2xT0) or fix) =a0-a.x.5. APPROXIMATION WITH CHEBYSHEV POLYNOMIALS 353 SEC.2Xak+1 + ak+a)Tk(x) + 2xa )T _.2) we find fix) = a0T0 + a1(T.

and B. A convenient notation for this expression is the following: (17. = b.+. 1.2) A. = + (17. r. As is easily found.3) B. The formula is proved by induc- tion. + a.-.B. 161 Ix+9 Details on the theory and application of continued fractions can be found.+. 11 Ix+3 . is replaced by b. e=+ZI-ZI+ZI-?I+ZI-?1+ZI-ZI+ZI-. b. in [10] and [I I].. Continued fractions containing a variable z are often quite useful for repre- sentation of functions. in particular. for example._. + a. is obtained from A. Here A_.1) a. taking into account that A. 11 12 13 12 15 12 17 12 19 e. and b s = 0...._-a . the following recursion formulas are valid: b.1. Approximations with continued fractions A finite continued fraction is an expression of the form bo + a.6. and so we restrict ourselves to a few examples.. such expansions often have much better convergence properties than. = 0. Ao = bo.IIIz + 11+ lI+ 2I+ 21+ 1+ I+. 17. if b. are polynomials in the elements a. a.B./B. B. 17..6.6./B.A. and Bo = 1. the power-series expansions. . Ix+5 41 ._. Generalization to infinite continued fractions is obvious.354 APPROXIMATION SEC.6.-..A...1/b.. such a fraction is said to be convergent if exists. . + a. JA. oz+u II Iz I1 Iz I1 Iz ez ee-- u du= Ix II { I . Space limitations make it impossible to give a description of the relevant theory. Ix+7 91 . = 1. for example. b+ ba a3 ' + (17.du .6.

It is inter- esting to compare the errors of the different types of approximations.X2- (17.. One could also demand conditions of the form Ifx) I < M IxI° .x+. (17. Chebyshev polynomial. b].. are polynomials in z. Here we encounter the problem of finding rational approximations of the form P(x)/Q(x) to a given function f(x) in an interval [a. . naturally the type depends on the function to be approximated." Then we have. The second formula should be used if fix) is an even or an odd function.. and B. 17. continued fraction. should be chosen so as to make v as large as possible. Nearest to hand are the following two representations: J(x)_as+a. D>C>>B>A.. and hence A.. . B.P(x)/Q(x) until all maximal deviations are numerically equal.7.7. However.7. then one still does not know very much. Exercise 4). Approximations with rational functions A finite continued fraction containing a variable z can be written where A. The main idea is to adjust the zeros of the function fix) . = % = 1. and rational function. In typical cases this implies that the Taylor expansions of f(x) and P.2) i3 + The coefficients are determined by a semiempirical but systematic trial- and-error technique. C. under quite general condi- tions there exists a "best" approximation characterized by the property that all maximal deviations are equal but with alternating sign (cf. and D stand for power series.xl fix) ` ao + a. This type of approximation was introduced by Pade (cf. 1B. Without specializing we can put b(.1) bo+b. Then it is first necessary to prescribe how the deviation between the function and the approximation should be measured.(x)/Q (x) should coincide as far as possible.. [13] and [14]).x=61X2 + . on comparing approximations with the same number of parameters. are Chebyshev approximations. however.. + 6.. and Q.. and in this respect there are several theoretical results. respectively. and let the signs > and >> stand for "better" and "much better.. If this is to be done in the least squares sense. Let A. APPROXIMATION WITH RATIONAL FUNCTIONS 355 17. is also a rational function. For example.7.SEC. + aw.x.x'- + . Most common. an overall qualitative rule is the following. but this relation seems to be the typical one.(x) where the coefficients of P.+b... Q.x+. in the latter case it should first be multiplied by a suitable (positive or negative) odd power of x. There are exceptions to this general rule. In the examples we shall specialize to the same degree in numerator and denominator.+a.

1960). (10) 0. arctan x x 0.<z<1.02051 21130x' cos x - 1 + 0. 1966). APPROXIMATION SEC. 1948). 1961). 1961). 1957). 118 (1955). (3] F. Cheney: Introduction to Approximation Theory (McGraw-Hill.52406 42207x + 0. 1929). a=3. 1955).00656 58 1 01x8 e-1 1 + 0. 5 (June.34496 44663x + 0.80 10-10 sin x 1 .45477 29177x' + 0. K. Wall: Analytic Theory of Continued Fractions (Van Nostrand. 10-11 . 1.99999 99992 . and if nothing else is mentioned. [5] N. W. (141 E. [2] P. Indust.0677412133x+0. [9] Clenshaw: MTAC (9). New York. [11] H. . The errors were computed before the coefficients were rounded to ten places.34.0956558162x9 logz^- 1 + 1. [4] C. Soc.05067 70959x° E = 7. S. Bari: Trigonometriceskije Rjady (Moscow. 0.0884921370x' . [131 N.0.52975 01 38 5x'+0. Princeton. .00089 96261x4 e=7. 356 Last. [12]. the interval is 0 < x < 1. 17.0.00328 11761x4 x 1 + 0.29. L.0. Alt: Advances in Computers 2 (Academic Press. Also cf. Math. 1957). [81 Hastings: Approximations for Digital Computers (Princeton University Press.0.7..55. REFERENCES [1] George E. G.03310 27317x2 + 0. E is the maximum error. Guest: Numerical Methods of Curve Fitting (Cambridge. Perron: Die Lehre von den Keuenbrachen (Leipzig and Berlin. 1. Lanczos: Applied Analysis (London. New York.04410 77396x' + 0. 1956).00894 72229x° I + 1. New York. 4 (1961). we give a few numerical examples of rational approximation. App!. [6] Hardy-Rogosinski: Fourier Series (Cambridge.13037 54276x' + 0. [12] Froberg: BIT.02868 18192x' x=2z.00046 49838x' ' E = 4.1. I. [7] Lance: Numerical Methods for High-Speed Computers (London.11255 48636x' + 0.99999 99992 + 1. 1944).57490 98994x'+0.10-10.10-°.46370 86496x'+0.47593 58618x + 0. all of them constructed in the Chebyshev sense." J.01063 37905x3 s=7. 1961).0000000007 .6931471773+0. New York. Forsythe: "Generation and Use of Orthogonal Polynomials for Data-fitting with a Digital Computer.13356 39326x' + 0.67. Achieser: Theory of Approximation (Ungar.10-10 -0.45589 22221x° + 0.28700 44785x' + 0.

2 0.03362 0.07561 0..56513 1.6 0.6 0. 7. assuming (a) that the errors in the x-values can be neglected.6 0..12301 0. Find a and b.x2) . 3. The following data x and y are subject to errors of the same order of magnitude: x 1 2 3 4 5 6 7 8 4 6 6 7 y 3 3 5 5 Find a straight-line approximation.8 0. (9.cos x)2 dx is minimized. b.37854 1.02877 1. For IxJ < 1.4 0.21651 1. and (I1.00 0. Determine the constants a. b.18417 0. The radiation intensity from a radioactive source is given by the formula I = I.56 8.08551 0. 4.5 0.05537 0.2 1. . Approximate a-z for 0 < x < 1 by a .4 0.2 0. with f.1 y 2.34 1.and y-values are of the same order of magnitude. t 0. and c.74 0.4 1.0 1.4 0. Perform this. Determine the constants a and Io using the data below. 0. A.5 0.15398 0.38 1.7 0. Find a. y is a function of x.e °`. and B. b. It should be represented in the form Ae' + Be-b'. .. For small values of x.95289 9.bx in the Chebyshev sense.0 1. x 0.16156 0. 9).. 6.7 0. (b) that the errors in the x. 2). a6. there exists an expansion of the form e-z = (I + a.8 1. 5.16 2. and c. (6. The points (2.01909 .31604 2. 2. a2. 4).9 1. The function y of x given by the data below should be represented in the form y = e°= sin bx. Determine the constants al. 6).EXERCISES 357 EXERCISES 1. we want to approximate the function y = e2'.xX l + a2x2X l + a. 10) should be approximated by a straight line. using the least-squares method.6 0. the function cosx is to be approximated by a second-degree polynomial y = axe + bx + c such that S0'I2 (y .8 1 3. given by the data below.(x)-1+ax+bx2+cx3 1 -ax+bx2-cx2' Find a.75 1.3 0.78030 1. For 0 < x < s/2. (5.

005. P'(O) =f'(0).358 APPROXIMATION 10.3 0. P(1) =f(l).5 y(t) 80.fx)l to 3 significant figures. Compute B/A when A= (g(x) . 17. Approximate x/(e' . Find a third-degree polynomial P(x) which approximates the function f(x) cos irx in 0 < x < I so that P(0) =f0). The function F is defined through F(X) exp(.2 0. The constants a and b are to be chosen so that this accuracy is attained for an interval [0.2)d1.f(x)I d.ax . All five maximum deviations are numerically equal. Each line intersects with the curve in two points. Then calculate maxos. Find a.1) by a polynomial of lowest possible degree so that the absolute value of the error is < 5 x 10-' when x e [0. 12.1 0.P(x)I < 10-' forlxj<1. Using the method of least squares. Expand the function cos x in a series of Chebyshev polynomials. and c to three decimals.9 36.si IP(x) . 15. and the two lines intersect in x = a. The parabola y = f(x) = x' is to be approximated by a straight line y = g(x) in the Chebyshev sense in a certain interval I so that the deviations in the two endpoints and the maximum deviations in I all are numerically -e.f(x)) dx and B= Ig(x) . o Determine a polynomial P(x) of lowest possible degree so that IF(x) . P'(1) =f'(1). The function y(t) is a solution of the differential equation y' + ay = 0. 14. 16.2 16. The function e-' should be approximated by ax + b so that Ie ' .bl < 0.1 24. b.4 53. find a and b if the following values are known: t 0.4 0. Expand Isin xl in a Fourier series! 18.. . J 13. c] as large as possible. 1]. One line is used in I < x < a and another in a< x < 2. y(O) = b. Find a and the maximum deviation.2 11. The function y = 1/x should be approximated in the interval I < x < 2 by two different straight lines.

Chapter 18

Special Functions

I am so glad I am a Beta, the Alphas work so hard. And we are much better than the Gammas and Deltas.
ALDOUS HUXLEY

18.0. Introduction

In this chapter we shall treat a few of the most common special functions. Primarily we shall choose such functions as are of great importance from a theoretical point of view, or in physical and technical sciences. Among these functions there are two main groups: one associated with simple linear homogeneous differential equations, usually of second order, and one with more complicated functions. In the latter case a rather thorough knowledge of the theory of analytic functions, for example, complex integration and residue calculus, is indispensable for a comprehensive and logical treatment. However, it does not seem to be reasonable to assume such a knowledge, and for this reason it has been necessary to keep the discussion at a more elementary level, omitting some proofs and other rather vital parts. This is especially the case with the gamma function. Since the gamma and beta functions are of such great importance in many applications, a rather careful description of their properties is desirable even if a complete foundation for the reasons mentioned cannot be given.

18.1. The Gamma Function

For a long time a function called the factorial function has been known; it is defined for positive integer values of n through n! = 1 · 2 · 3 ··· n. Now we have trivially

$$n! = n\,(n-1)!, \qquad n = 2, 3, 4, \ldots,$$

and if we prescribe the validity of this relation also for n = 1 we find 0! = 1. With this starting point, one has tried to find a more general function which coincides with the factorial function for integer values of the argument; such a generalization is of interest not only in its own right but also for other functions with certain formulas depending on n!. For different (mostly historical) reasons, one prefers to modify the functional equation slightly, and instead we shall consider functions g(z) satisfying g(z + 1) = zg(z) for arbitrary complex z, and the condition g(n + 1) = n! for n = 0, 1, 2, ...
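As a quick illustration of this normalization, here is a minimal Python check (math.gamma is a library implementation of the function Γ that will be constructed below); the values Γ(n + 1) reproduce n! for small integers:

```python
import math

# The normalization g(n + 1) = n!: compare the library gamma function
# against the factorial for the first few integers.
for n in range(6):
    print(n, math.factorial(n), math.gamma(n + 1))
```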

Taking logarithms and differentiating twice we get

$$\frac{d^2}{dz^2}\log g(z+1) = -\frac{1}{z^2} + \frac{d^2}{dz^2}\log g(z).$$

From g(z + n + 1) = (z + n)(z + n − 1) ··· z · g(z) we find in a similar way

$$\frac{d^2}{dz^2}\log g(z) = \sum_{k=0}^{n}\frac{1}{(z+k)^2} + \frac{d^2}{dz^2}\log g(z+n+1).$$

Now we put

$$f(z) = \sum_{k=0}^{\infty}\frac{1}{(z+k)^2}$$

and find immediately

$$f(z) = \sum_{k=0}^{n}\frac{1}{(z+k)^2} + f(z+n+1).$$

It now seems natural indeed to identify f(z) with d²/dz² log g(z), and in this way define a function Γ(z) through

$$\frac{d^2}{dz^2}\log\Gamma(z) = \sum_{n=0}^{\infty}\frac{1}{(z+n)^2} \qquad (18.1.1)$$

with

$$\Gamma(z+1) = z\,\Gamma(z), \qquad \Gamma(1) = 1. \qquad (18.1.2)$$

Integrating (18.1.1) from 1 to z + 1, we find

$$\frac{d}{dz}\log\Gamma(z+1) = \Gamma'(1) + \sum_{n=1}^{\infty}\left(\frac{1}{n} - \frac{1}{z+n}\right).$$

One more integration from 0 to z gives

$$\log\Gamma(z+1) = z\,\Gamma'(1) + \sum_{n=1}^{\infty}\left\{\frac{z}{n} - \log\left(1+\frac{z}{n}\right)\right\}.$$

Putting z = 1 we get, because of Γ(2) = 1, Γ(1) = 1,

$$-\Gamma'(1) = \sum_{n=1}^{\infty}\left\{\frac{1}{n} - \log\left(1+\frac{1}{n}\right)\right\} = \lim_{n\to\infty}\left(1 + \frac12 + \cdots + \frac1n - \log n\right) = \gamma.$$

Here, as usual, γ is Euler's constant. Hence we obtain the representation

$$\Gamma(z) = \frac{1}{z}\,e^{-\gamma z}\prod_{n=1}^{\infty}\frac{e^{z/n}}{1 + z/n}. \qquad (18.1.3)$$
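The product (18.1.3) can be used directly for numerical work, although it converges slowly (the truncated tail contributes a relative error of roughly z²/2N). A minimal sketch in Python; the truncation point is an arbitrary choice of this sketch:

```python
import math

def gamma_product(z, n_terms=100000):
    """Gamma(z) from the product (18.1.3), truncated after n_terms factors."""
    euler_gamma = 0.5772156649015329
    log_g = -euler_gamma * z - math.log(z)   # the factor e^(-gamma z) / z
    for n in range(1, n_terms + 1):
        log_g += z / n - math.log(1.0 + z / n)
    return math.exp(log_g)

# Gamma(1/2) = sqrt(pi) = 1.7724539...; the truncated product
# agrees to about six digits.
print(gamma_product(0.5), math.sqrt(math.pi))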

If we use lim_{N→∞} Σ_{n=1}^{N} {1/n − log (1 + 1/n)} instead of Euler's constant we get the equivalent formula

$$\Gamma(z) = \frac{1}{z}\prod_{n=1}^{\infty}\frac{(1+1/n)^z}{1+z/n}. \qquad (18.1.4)$$

From this formula we see that the Γ-function has poles (simple infinities) for z = 0, −1, −2, −3, .... Another representation which is frequently used is the following:

$$\Gamma(z) = \int_0^{\infty} e^{-t}\,t^{z-1}\,dt, \qquad \mathrm{Re}\,z > 0. \qquad (18.1.5)$$

By the aid of partial integration it is easily verified that (18.1.2) is satisfied.

We are now going to derive some important formulas for the gamma function. First we consider Γ(1 − z):

$$\Gamma(1-z) = -z\,\Gamma(-z) = e^{\gamma z}\prod_{n=1}^{\infty}\left(1-\frac{z}{n}\right)^{-1}e^{-z/n}.$$

Multiplying with Γ(z) we get

$$\Gamma(z)\,\Gamma(1-z) = \frac{1}{z}\prod_{n=1}^{\infty}\left(1-\frac{z^2}{n^2}\right)^{-1}.$$

Let us now perform a very superficial investigation of the function

$$z\prod_{n=1}^{\infty}\left(1-\frac{z^2}{n^2}\right) = z\,(1-z^2)\left(1-\frac{z^2}{4}\right)\left(1-\frac{z^2}{9}\right)\cdots$$

Then we find

$$z\prod_{n=1}^{\infty}\left(1-\frac{z^2}{n^2}\right) = z - z^3\left(1+\frac14+\frac19+\cdots\right) + z^5\left(\frac{1}{1\cdot4}+\frac{1}{1\cdot9}+\cdots+\frac{1}{4\cdot9}+\cdots\right) - \cdots$$

The first coefficient is π²/6, while the second can be written

$$\frac12\left[\left(1+\frac14+\frac19+\cdots\right)^2 - \left(1+\frac{1}{16}+\frac{1}{81}+\cdots\right)\right] = \frac12\left(\frac{\pi^4}{36}-\frac{\pi^4}{90}\right) = \frac{\pi^4}{120}.$$

Hence the leading three terms in the series are

$$z - \frac{\pi^2 z^3}{6} + \frac{\pi^4 z^5}{120},$$

coinciding with the three leading terms in the series expansion of sin πz/π. In the theory of analytic functions it is strictly proved that

$$\sin \pi z = \pi z\prod_{n=1}^{\infty}\left(1-\frac{z^2}{n^2}\right).$$

(Note that we have the same zeros on both sides!) This gives the important result

$$\Gamma(z)\,\Gamma(1-z) = \frac{\pi}{\sin \pi z}. \qquad (18.1.6)$$

In particular we find for z = 1/2 that [Γ(1/2)]² = π and Γ(1/2) = √π, since Γ(x) > 0 when x > 0. If the formula is used for purely imaginary argument z = iy, we get

$$\Gamma(iy)\,\Gamma(1-iy) = -iy\,\Gamma(iy)\,\Gamma(-iy) = \frac{\pi}{\sin \pi i y} = \frac{\pi}{i\sinh \pi y},$$

and hence

$$|\Gamma(iy)|^2 = \frac{\pi}{y\sinh \pi y}, \qquad (18.1.7)$$

since |Γ(iy)| = |Γ(−iy)|.

We shall now construct a formula which can be used for computation of the Γ-function in points close to the real axis. By the aid of (18.1.3) we can also compute the argument, using the well-known relations arg (z₁z₂) = arg z₁ + arg z₂, arg (x + iy) = arctan (y/x), and arg e^{iα} = α. In this way the following formula is obtained:

$$\arg\Gamma(iy) = \sum_{n=1}^{\infty}\left\{\frac{y}{n} - \arctan\frac{y}{n}\right\} - \gamma y - \frac{\pi}{2}\,\mathrm{sign}\,(y). \qquad (18.1.8)$$

By use of the functional relation we can then derive formulas for the absolute value and the argument of Γ(N + iy) when N is an integer.
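Relations (18.1.6) and (18.1.7) make convenient numerical test cases. The sketch below evaluates Γ(iy) from the product (18.1.3), as in the previous sketch but now in complex arithmetic, and compares with the right-hand side of (18.1.7); again the truncation point is an arbitrary choice:

```python
import cmath, math

def log_gamma(z, n_terms=200000):
    """log Gamma(z) from the product (18.1.3); complex z off the poles."""
    euler_gamma = 0.5772156649015329
    s = -euler_gamma * z - cmath.log(z)
    for n in range(1, n_terms + 1):
        s += z / n - cmath.log(1 + z / n)
    return s

y = 1.0
lhs = abs(cmath.exp(log_gamma(1j * y))) ** 2     # |Gamma(iy)|^2
rhs = math.pi / (y * math.sinh(math.pi * y))     # formula (18.1.7)
print(lhs, rhs)   # both ~ 0.2720
```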

We start from (18.1.4) in the form

$$z(z+1)\,\Gamma(z) = \Gamma(z+2) = (1+z)\exp\left\{\sum_{n=1}^{\infty}\left[z\log\left(1+\frac1n\right) - \log\left(1+\frac{z}{n}\right)\right]\right\}.$$

Collecting the terms which are linear in z (the factor 1 + z contributes the term z from log (1 + z)), we find for the first N factors

$$z\left\{\log 2 + \log\frac32 + \cdots + \log\frac{N+1}{N} - 1 - \frac12 - \cdots - \frac1N\right\} + z = z\left\{\log\,(N+1) - \sum_{n=1}^{N}\frac1n\right\} + z.$$

When N → ∞ this expression tends to z(1 − γ), and we get

$$\Gamma(z+2) = \exp\left\{z(1-\gamma) + \sum_{n=1}^{\infty}\sum_{k=2}^{\infty}\frac{(-1)^k}{k}\left(\frac{z}{n}\right)^k - \sum_{k=2}^{\infty}\frac{(-1)^k}{k}\,z^k\right\}.$$

Now the summation order in n and k is reversed (which is allowed if |z| < 2), and using the notation

$$\zeta(k) = \sum_{n=1}^{\infty} n^{-k}, \qquad k > 1,$$

we obtain finally

$$\Gamma(z+2) = \exp\left\{z(1-\gamma) + \sum_{k=2}^{\infty}\frac{(-1)^k}{k}\left[\zeta(k)-1\right]z^k\right\}. \qquad (18.1.9)$$

Formula (18.1.9) should only be used when |z| is small, preferably for real values x such that −1 ≤ x ≤ 1 (also cf. Ex. 19 of Chapter 11). For z = x + N (N a positive integer) the functional relation is then applied N times.

If one wants to compute the value of the Γ-function for an arbitrary complex argument z, an asymptotic formula constructed by Stirling should be used. For, say, 10-digit accuracy it is suitable to require |z| > 15 and Re z > 0. Other values of z can be mastered if Stirling's formula is used on z + n for a suitable positive integer n, the functional relation then being applied n times.
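The series (18.1.9) is easy to program. In the sketch below the values ζ(k) − 1 are generated by direct summation with a midpoint integral estimate for the tail; that device is an expedient of this sketch, not a prescription from the text:

```python
import math

def zeta_minus_one(k, n_max=1000):
    """zeta(k) - 1 = sum over n >= 2 of n^(-k), tail estimated by an integral."""
    s = sum(n ** -k for n in range(2, n_max + 1))
    return s + (n_max + 0.5) ** (1 - k) / (k - 1)

def gamma_via_series(z, k_max=40):
    """Gamma(z + 2) from (18.1.9); requires |z| < 2, best for |z| <= 1."""
    euler_gamma = 0.5772156649015329
    s = z * (1.0 - euler_gamma)
    for k in range(2, k_max + 1):
        s += (-1) ** k * zeta_minus_one(k) * z ** k / k
    return math.exp(s)

# Gamma(1.5) = sqrt(pi)/2 = 0.8862269...
print(gamma_via_series(-0.5), math.gamma(1.5))
```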

We shall now derive Stirling's formula starting from the Euler-Maclaurin summation formula. Suppose f(x) = log x and h = 1, and let M and N be large positive integers, N > M. Since

$$f^{(2r+1)}(x) = (2r)!\,x^{-(2r+1)},$$

we obtain the following asymptotic relation:

$$\sum_{n=M}^{N}\log n = \int_M^N\log x\,dx + \frac12\,[\log M + \log N] + \frac1{12}\left[\frac1N - \frac1M\right] - \frac1{360}\left[\frac1{N^3} - \frac1{M^3}\right] + \frac1{1260}\left[\frac1{N^5} - \frac1{M^5}\right] - \cdots + \frac{B_{2r+2}}{(2r+1)(2r+2)}\left[\frac1{N^{2r+1}} - \frac1{M^{2r+1}}\right] + \cdots$$

Applying Σ_{n=M}^{N} log n = log N! − log (M − 1)! and ∫_M^N log x dx = N log N − N − M log M + M, and collecting terms in M and N, we find

$$\log M! - \left(M+\frac12\right)\log M + M - \frac1{12M} + \frac1{360M^3} - \cdots = \log N! - \left(N+\frac12\right)\log N + N - \frac1{12N} + \frac1{360N^3} - \cdots,$$

where both sides must be considered as asymptotic expansions. Now let M and N tend to infinity and truncate the two series just before the terms 1/12M and 1/12N. Then we see at once that the remainder must be a constant K:

$$K = \lim_{n\to\infty}\left\{\log n! - \left(n+\frac12\right)\log n + n\right\}.$$

Hence we obtain the asymptotic expansion

$$\log n! = K + \left(n+\frac12\right)\log n - n + \frac1{12n} - \frac1{360n^3} + \cdots$$

However, we have still to compute the value of K. To do this we again apply the same expansion, but now for 2n:

$$\log\,(2n)! = K + \left(2n+\frac12\right)\log\,(2n) - 2n + \frac1{24n} - \frac1{2880n^3} + \cdots,$$

or

$$\log\,(2n)! - 2\log n! = -K + \left(2n+\frac12\right)\log 2 - \frac12\log n - \frac1{8n} + \frac1{192n^3} - \cdots$$

The left-hand side can be estimated by considering the integral

$$I_m = \int_0^{\pi/2}\sin^m x\,dx$$

for non-negative integer values of m. By partial integration we find, when m ≥ 2,

$$I_m = \frac{m-1}{m}\,I_{m-2},$$

with I₀ = π/2 and I₁ = 1. Hence we have

$$I_{2n} = \frac{2n-1}{2n}\cdot\frac{2n-3}{2n-2}\cdots\frac12\cdot\frac{\pi}{2} = \frac{(2n)!}{2^{2n}(n!)^2}\cdot\frac{\pi}{2} \qquad\text{and}\qquad I_{2n+1} = \frac{2n}{2n+1}\cdot\frac{2n-2}{2n-1}\cdots\frac23\cdot 1 = \frac{2^{2n}(n!)^2}{(2n+1)!}.$$
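The recursion for I_m and the closed forms above are easily verified numerically; a small sketch using a midpoint rule, with the step count chosen arbitrarily:

```python
import math

def I(m, n_steps=20000):
    """I_m = integral of sin^m x over [0, pi/2], by the midpoint rule."""
    h = (math.pi / 2) / n_steps
    return h * sum(math.sin((k + 0.5) * h) ** m for k in range(n_steps))

# I_6 = (5/6)(3/4)(1/2) * pi/2 and I_7 = (6/7)(4/5)(2/3) * 1:
print(I(6), 5 / 6 * 3 / 4 * 1 / 2 * math.pi / 2)   # ~0.49087
print(I(7), 6 / 7 * 4 / 5 * 2 / 3)                 # ~0.45714
```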

Further we have for n ≥ 1

$$\int_0^{\pi/2}\sin^{2n+1}x\,dx < \int_0^{\pi/2}\sin^{2n}x\,dx < \int_0^{\pi/2}\sin^{2n-1}x\,dx,$$

that is, I_{2n+1} < I_{2n} < I_{2n−1}. From this follows

$$1 < \frac{I_{2n}}{I_{2n+1}} < \frac{I_{2n-1}}{I_{2n+1}} = \frac{2n+1}{2n},$$

or

$$1 < \frac{\pi\,(2n+1)}{2}\left[\frac{(2n)!}{2^{2n}(n!)^2}\right]^2 < 1 + \frac1{2n}.$$

Hence we finally get

$$\lim_{n\to\infty}\frac{2^{2n}(n!)^2}{(2n)!\,\sqrt n} = \sqrt\pi,$$

which is the well-known formula by Wallis. For large values of n we therefore have

$$\log\,(2n)! - 2\log n! = 2n\log 2 - \frac12\log\,(n\pi) + \cdots,$$

and comparing with our previous expression we get, when n → ∞,

$$K = \frac12\log 2\pi.$$

We now present Stirling's formula in the following shape (log Γ(n) = log n! − log n):

$$\log\Gamma(z) = \frac12\log 2\pi + \left(z-\frac12\right)\log z - z + \frac1{12z} - \frac1{360z^3} + \frac1{1260z^5} - \cdots + \frac{B_{2n}}{(2n-1)\,2n\,z^{2n-1}} + R_{n+1}, \qquad (18.1.10)$$

where

$$|R_{n+1}| \le \frac{|B_{2n+2}|}{(2n+1)(2n+2)\,|z|^{2n+1}\cos^{2n+2}(\theta/2)}$$

and θ = arg z. The derivation has been performed for positive integers, but the formula can be continued to arbitrary complex values z sufficiently far away from the singular points z = 0, −1, −2, .... The formulas can be used only if |arg z| < π, but it is advisable to keep to the right half-plane and apply the functional relation when necessary. For the Γ-function itself we have

$$\Gamma(z) = \sqrt{2\pi}\,z^{z-1/2}\,e^{-z}\left[1 + \frac1{12z} + \frac1{288z^2} - \frac{139}{51840z^3} - \frac{571}{2488320z^4} + \cdots\right].$$

Often the logarithmic derivative of the Γ-function is defined as a function of its own, the so-called ψ-function or the digamma function:

$$\psi(z) = \frac{d}{dz}\log\Gamma(z) = \frac{\Gamma'(z)}{\Gamma(z)}.$$

From the functional relation for the Γ-function we have directly ψ(z + 1) − ψ(z) = 1/z. Further, from (18.1.3),

$$\psi(z) = -\gamma + \sum_{n=0}^{\infty}\left(\frac1{n+1} - \frac1{z+n}\right). \qquad (18.1.11)$$
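In code, (18.1.10) is typically combined with the functional relation exactly as recommended above: push the argument to the right until the asymptotic series is accurate, then subtract the accumulated logarithms. A sketch for real z > 0; the cutoff 15 and the use of just three series terms are choices of this sketch:

```python
import math

def log_gamma_stirling(z, cutoff=15.0):
    """log Gamma(z), real z > 0, from Stirling's series (18.1.10)."""
    correction = 0.0
    while z < cutoff:
        correction -= math.log(z)   # log Gamma(z) = log Gamma(z + 1) - log z
        z += 1.0
    s = 0.5 * math.log(2.0 * math.pi) + (z - 0.5) * math.log(z) - z
    s += 1.0 / (12.0 * z) - 1.0 / (360.0 * z ** 3) + 1.0 / (1260.0 * z ** 5)
    return s + correction

# Compare with the library value of log Gamma(0.3):
print(log_gamma_stirling(0.3), math.lgamma(0.3))
```

With three terms and |z| > 15 the first neglected term is of order 1/(1680 z⁷), roughly 10⁻¹², in line with the remainder bound above.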

The function values for positive integer arguments can easily be computed:

ψ(1) = −γ,  ψ(2) = 1 − γ,  ψ(3) = 1 + 1/2 − γ,  ψ(4) = 1 + 1/2 + 1/3 − γ,

and so on. An asymptotic expansion can be obtained from (18.1.10):

$$\psi(z) \sim \log z - \frac1{2z} - \frac1{12z^2} + \frac1{120z^4} - \frac1{252z^6} + \cdots$$

Due to the relation Δψ(z) = 1/z, the ψ-function is of some importance in the theory of difference equations (cf. Section 13.3). As is the case for the Γ-function, the ψ-function also has poles for z = −n, n = 0, 1, 2, .... The derivative of the Γ-function can also be computed from (18.1.5), which allows differentiation under the integral sign:

$$\Gamma'(z) = \int_0^{\infty} e^{-t}\,t^{z-1}\log t\,dt.$$

For z = 1 and z = 2 we find in particular

$$\Gamma'(1) = -\gamma = \int_0^{\infty} e^{-t}\log t\,dt, \qquad \Gamma'(2) = 1 - \gamma = \int_0^{\infty} t\,e^{-t}\log t\,dt.$$

As an example we shall now show how the properties of the Γ-function can be used for computation of infinite products of the form ∏_{n=1}^∞ u_n, where u_n is a rational function of n. First we can split u_n in factors in such a way that the product can be written

$$P = A\prod_{n=1}^{\infty}\frac{(n-a_1)(n-a_2)\cdots(n-a_r)}{(n-b_1)(n-b_2)\cdots(n-b_s)}.$$

A necessary condition for convergence is lim_{n→∞} u_n = 1, which implies A = 1 and r = s. The general factor then gets the form

$$\frac{(1-a_1/n)(1-a_2/n)\cdots(1-a_r/n)}{(1-b_1/n)(1-b_2/n)\cdots(1-b_r/n)} = 1 + O(n^{-2})?$$

It is immediately recognized that for large values of n each factor is 1 + cn⁻¹ + O(n⁻²) with c = Σᵢbᵢ − Σᵢaᵢ. But a product of the form ∏_{n=1}^∞ (1 − a/n) is convergent if and only if a = 0 (if a > 0 the product tends to 0, if a < 0 to ∞). Hence we must have

$$\sum_i a_i = \sum_i b_i,$$

and we may then insert the factor exp {n⁻¹(Σᵢ aᵢ − Σᵢ bᵢ)} = 1 in each term, that is, attach a factor e^{aᵢ/n} to each (1 − aᵢ/n) and a factor e^{bᵢ/n} to each (1 − bᵢ/n).
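The recurrence ψ(z + 1) − ψ(z) = 1/z and the asymptotic expansion above combine into the standard computational scheme for ψ; a sketch for real arguments, with the shift threshold an arbitrary choice of this sketch:

```python
import math

def digamma(x, cutoff=10.0):
    """psi(x) for real x > 0: shift with psi(x) = psi(x + 1) - 1/x,
    then apply the asymptotic expansion given above."""
    s = 0.0
    while x < cutoff:
        s -= 1.0 / x
        x += 1.0
    s += math.log(x) - 1.0 / (2.0 * x) - 1.0 / (12.0 * x ** 2)
    s += 1.0 / (120.0 * x ** 4) - 1.0 / (252.0 * x ** 6)
    return s

# psi(1) = -gamma = -0.5772157..., psi(2) = 1 - gamma
print(digamma(1.0), digamma(2.0))
```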

1) we perform the transformation t = v/(1 + v) obtaining B(x.n' 1 ea.r(1 .t)v-'dt . First it will be shown that the beta function can be ex- pressed in terms of gamma functions.Y) = 1: -'(1 .12) M r(1 . For convergence of the integral we must claim Re x > 0. J 0 which follows through the transformation (1 + v)¢ = z.a. THE BETA FUNTION 367 and we may add the factor exp {n-'(J ai ..1) 0 (Unfortunately. Re y > 0.1- .1-'e e-6.2.y) = A .2) Multiplying by r(x + y) and observing that dd = (1 + v)-:-vl'(x W e + Y) . Y) = JO0 v'-'(1 + v)-=-v dv .2. y) which.1'(x)1'(y) J0 .SEC.l*. (18. 0 J 0_ The inner integral has the value e-=r(x) and this gives r(x + y)B(x. v:-2 S e-(1 d J0 0 _W vz-'e-"E dv . is a func- tion of two variables x and y. In/ this way we get P= ( 1 . Starting from (18.) 18. 18.) This relation defines the beta function B(x.2.2. The Beta Function Many integrals can be brought to the form B(x.{-zr(-z)l .z) er-el- and from this we get the final result: P=fl1'(