
Numerical Methods in Science and Engineering

Thomas R. Bewley UC San Diego


Contents
Preface

1  A short review of linear algebra
   1.1  Notation
   1.2  Determinants
   1.3  Eigenvalues and Eigenvectors
   1.4  Matrix norms
   1.5  Condition number

2  Solving linear equations
   2.1  Introduction to the solution of Ax = b
   2.2  Gaussian elimination algorithm
   2.3  Thomas algorithm
   2.4  Review

3  Solving nonlinear equations
   3.1  The Newton-Raphson method for nonlinear root finding
   3.2  Bracketing approaches for scalar root finding

4  Interpolation
   4.1  Lagrange interpolation
   4.2  Cubic spline interpolation

5  Minimization
   5.1  The Newton-Raphson method for nonquadratic minimization
   5.2  Bracketing approaches for minimization of scalar functions
   5.3  Gradient-based approaches for minimization of multivariable functions

6  Differentiation

7  Integration
   7.1  Basic quadrature formulae
   7.2  Error Analysis of Integration Rules
   7.3  Romberg integration
   7.4  Adaptive Quadrature

8  Ordinary differential equations
   8.1  Taylor-series methods
   8.2  The trapezoidal method
   8.3  A model problem
   8.4  Stability
   8.5  Accuracy
   8.6  Runge-Kutta methods

A  Getting started with Matlab
   A.1  What is Matlab?
   A.2  Where to find Matlab
   A.3  How to start Matlab
   A.4  How to run Matlab: the basics
   A.5  Commands for matrix factoring and decomposition
   A.6  Commands used in plotting
   A.7  Other Matlab commands
   A.8  Hardcopies
   A.9  Matlab programming procedures: m-files
   A.10 Sample m-file

Preface

The present text provides a brief (one quarter) introduction to efficient and effective numerical methods for solving typical problems in scientific and engineering applications. It is intended to provide a succinct and modern guide for a senior or first-quarter masters level course on this subject, assuming only a prior exposure to linear algebra, a knowledge of complex variables, and rudimentary skills in computer programming. The present text was prepared as supplemental material for MAE107 and MAE290a at UC San Diego.

I am indebted to several sources for the material compiled in this text, which draws from class notes from ME200 and ME214 by Prof. Parviz Moin and Prof. Harv Lomax at Stanford University, and from material presented in Numerical Recipes by Press et al. (1988-1992) and Matrix Computations by Golub & van Loan (1989). We do not attempt to duplicate these excellent texts, but rather attempt to build up to and introduce the subjects discussed at greater length in these more exhaustive texts at a metered pace. The latter two textbooks are highly recommended as supplemental texts to the present notes.


Chapter 1
A short review of linear algebra

Linear algebra forms the foundation upon which efficient numerical methods may be built to solve both linear and nonlinear systems of equations. Consequently, it is useful to review briefly some relevant concepts from linear algebra.

1.1 Notation

Over the years, a fairly standard notation has evolved for problems in linear algebra. For clarity, this notation is now reviewed. For simplicity, the elements of all vectors and matrices in these notes will be assumed to be real unless indicated otherwise. However, all of the numerical tools we will develop extend to complex systems in a straightforward manner.

1.1.1 Vectors

A vector is defined as an ordered collection of numbers or algebraic variables:

$$c = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{pmatrix}.$$

All vectors in the present notes will be assumed to be arranged in a column unless indicated otherwise. Vectors are represented with lower-case letters and denoted in writing (i.e., on the blackboard) with an arrow above the letter ($\vec{c}$) and in print (as in these notes) with boldface (c). The vector c shown above is n-dimensional, and its i'th element is referred to as $c_i$.

1.1.2 Vector addition

Two vectors of the same size are added by adding their individual elements:

$$c + d = \begin{pmatrix} c_1 + d_1 \\ c_2 + d_2 \\ \vdots \\ c_n + d_n \end{pmatrix}.$$


1.1.3 Vector multiplication
In order to multiply a vector with a scalar, operations are performed on each element:

$$\alpha\, c = \begin{pmatrix} \alpha c_1 \\ \alpha c_2 \\ \vdots \\ \alpha c_n \end{pmatrix}.$$

The inner product of two real vectors of the same size, also known as the dot product, is defined as the sum of the products of the corresponding elements:

$$(u, v) = u \cdot v = \sum_{i=1}^{n} u_i v_i = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n.$$

The 2-norm of a vector, also known as the Euclidean norm or the vector length, is defined by the square root of the inner product of the vector with itself:

$$\|u\| = \sqrt{(u, u)} = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}.$$

The angle between two vectors may be defined using the inner product such that

$$\cos \angle(u, v) = \frac{(u, v)}{\|u\|\,\|v\|}.$$

In summation notation, any term in an equation with lower-case English letter indices repeated twice implies summation over all values of that index. Using this notation, the inner product is written simply as $u_i v_i$. To avoid implying summation notation, Greek indices are usually used; thus, $u_\alpha v_\alpha$ does not imply summation over $\alpha$.
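As a quick illustration of these definitions, the following Matlab fragment (an illustrative sketch only; the example vectors are arbitrary choices, not taken from the text) computes the inner product, the 2-norm, and the angle between two vectors, and compares the first two against the built-in dot and norm commands.

% vector_ops_demo.m  (illustrative sketch, not part of the original notes)
u = [1; 2; 3];                        % arbitrary example column vectors
v = [4; 0; -1];
ip    = sum(u .* v);                  % inner product (u,v) = sum_i u_i v_i
nrm_u = sqrt(sum(u .* u));            % 2-norm of u
nrm_v = sqrt(sum(v .* v));            % 2-norm of v
theta = acos(ip / (nrm_u * nrm_v));   % angle between u and v, in radians
disp([ip    dot(u, v)])               % agrees with the built-in inner product
disp([nrm_u norm(u)])                 % agrees with the built-in 2-norm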

1.1.4 Matrices
A matrix is de ned as a two-dimensional ordered array of numbers or algebraic variables:

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}.$$

The matrix above has m rows and n columns, and is referred to as an m x n matrix. Matrices are represented with upper-case letters, with their elements represented with lower-case letters. The element of the matrix A in the i'th row and the j'th column is referred to as $a_{ij}$.

1.1.5 Matrix addition

Two matrices of the same size are added by adding their individual elements. Thus, if C = A + B , then cij = aij + bij .


1.1.6 Matrix/vector multiplication

The product of a matrix A with a vector x, which results in another vector b, is denoted Ax = b. It may be defined in index notation as

$$b_i = \sum_{j=1}^{n} a_{ij}\, x_j.$$

In summation notation, it is written

$$b_i = a_{ij}\, x_j.$$
Recall that, as the j index is repeated in the above expression, summation over all values of j is implied without being explicitly stated. The first few elements of the vector b are given by

$$b_1 = a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n, \qquad b_2 = a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n,$$

etc. The vector b may be written

$$b = x_1 \begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{pmatrix} + x_2 \begin{pmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{pmatrix} + \cdots + x_n \begin{pmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{pmatrix}.$$

Thus, b is simply a linear combination of the columns of A with the elements of x as weights.
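This interpretation is easy to verify numerically; the following Matlab fragment (an illustrative sketch with an arbitrary 3 x 2 example, not from the original text) checks that Ax equals the corresponding linear combination of the columns of A.

% column_combination_demo.m  (illustrative sketch)
A = [1 2; 3 4; 5 6];                     % an arbitrary 3 x 2 matrix
x = [10; -1];                            % an arbitrary 2-vector of weights
b1 = A * x;                              % matrix/vector product
b2 = x(1) * A(:,1) + x(2) * A(:,2);      % linear combination of the columns of A
disp(max(abs(b1 - b2)))                  % zero (to round-off): the two agree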

1.1.7 Matrix multiplication

Given two matrices A and B, where the number of columns of A is the same as the number of rows of B, the product C = AB is defined in summation notation, for the (i, j)'th element of the matrix C, as

$$c_{ij} = a_{ik}\, b_{kj}.$$

Again, as the index k is repeated in this expression, summation is implied over the index k. In other words, $c_{ij}$ is just the inner product of row i of A with column j of B. For example, if we write

$$\underbrace{\begin{pmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ c_{m1} & c_{m2} & \cdots & c_{mn} \end{pmatrix}}_{C} = \underbrace{\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1l} \\ a_{21} & a_{22} & \cdots & a_{2l} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{ml} \end{pmatrix}}_{A} \underbrace{\begin{pmatrix} b_{11} & b_{12} & \cdots & b_{1n} \\ b_{21} & b_{22} & \cdots & b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ b_{l1} & b_{l2} & \cdots & b_{ln} \end{pmatrix}}_{B},$$

then we can see that $c_{12}$ is the inner product of row 1 of A with column 2 of B. Note that usually $AB \neq BA$; matrix multiplication usually does not commute.
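A two-line Matlab check (again an illustrative sketch with arbitrary matrices, not an example from the text) makes the lack of commutativity concrete.

% noncommute_demo.m  (illustrative sketch)
A = [1 2; 3 4];
B = [0 1; 1 0];
disp(A * B)     % = [2 1; 4 3]
disp(B * A)     % = [3 4; 1 2], which differs from A*B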


1.1.8 Identity matrix

The identity matrix is a square matrix with ones on the diagonal and zeros off the diagonal:

$$I = \begin{pmatrix} 1 & & & 0 \\ & 1 & & \\ & & \ddots & \\ 0 & & & 1 \end{pmatrix}, \qquad Ix = x, \qquad IA = AI = A.$$

In the notation for I used at left, in which there are several blank spots in the matrix, the zeros are assumed to act like "paint" and fill up all unmarked entries. Note that a matrix or a vector is not changed when multiplied by I. The elements of the identity matrix are equal to the Kronecker delta:

$$\delta_{ij} = \begin{cases} 1 & i = j \\ 0 & \text{otherwise.} \end{cases}$$

1.1.9 Inverse of a square matrix
If BA = I, we may refer to B as $A^{-1}$. (Note, however, that for a given square matrix A, it is not always possible to compute its inverse; when such a computation is possible, we refer to the matrix A as being "nonsingular" or "invertible".) If we take Ax = b, we may multiply this equation from the left by $A^{-1}$, which results in

$$A^{-1}\big[Ax = b\big] \quad\Rightarrow\quad x = A^{-1} b.$$

Computation of the inverse of a matrix thus leads to one method for determining x given A and b; unfortunately, this method is extremely inefficient. Note that, since matrix multiplication does not commute, one always has to be careful when multiplying an equation by a matrix to multiply out all terms consistently (either from the left, as illustrated above, or from the right). If we take AB = I and CA = I, then we may premultiply the former equation by C, leading to

$$C\big[AB = I\big] \quad\Rightarrow\quad \underbrace{CA}_{I}\, B = C \quad\Rightarrow\quad B = C.$$

Thus, the left and right inverses are identical. If we take AX = I and AY = I (noting by the above argument that YA = I), it follows that

$$Y\big[AX = I\big] \quad\Rightarrow\quad \underbrace{YA}_{I}\, X = Y \quad\Rightarrow\quad X = Y.$$

Thus, the inverse is unique.
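To emphasize the point about inefficiency made above, the following Matlab sketch (the system size and matrix are arbitrary choices, not from the text) contrasts solving Ax = b with the backslash operator, which performs the Gaussian elimination developed in Chapter 2, against forming the inverse explicitly with inv(A). Both give the same answer, but the former is the preferred approach.

% inverse_vs_solve_demo.m  (illustrative sketch)
n = 500;                   % a moderate, arbitrary system size
A = rand(n) + n * eye(n);  % a random, comfortably nonsingular matrix
b = rand(n, 1);
x1 = A \ b;                % solve directly (Gaussian elimination; preferred)
x2 = inv(A) * b;           % form the inverse explicitly (more expensive)
disp(norm(x1 - x2))        % the two answers agree to round-off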

1.1.10 Other de nitions

The transpose of a matrix A, denoted $A^T$, is found by swapping the rows and the columns:

$$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix} \quad\Rightarrow\quad A^T = \begin{pmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{pmatrix}.$$

In index notation, we say that $b_{ij} = a_{ji}$, where $B = A^T$.

The adjoint of a matrix A, denoted $A^*$, is found by taking the conjugate transpose of A:

$$A = \begin{pmatrix} 1 + 3i & 2i \\ 1 & 0 \end{pmatrix} \quad\Rightarrow\quad A^* = \begin{pmatrix} 1 - 3i & 1 \\ -2i & 0 \end{pmatrix}.$$

The main diagonal of a matrix A is the collection of elements along the line from $a_{11}$ to $a_{nn}$. The first subdiagonal is immediately below the main diagonal (from $a_{21}$ to $a_{n,n-1}$), the second subdiagonal is immediately below the first subdiagonal, etc.; the first superdiagonal is immediately above the main diagonal (from $a_{12}$ to $a_{n-1,n}$), the second superdiagonal is immediately above the first superdiagonal, etc.

An upper triangular matrix is one for which all subdiagonals are zero, and a lower triangular matrix is one for which all superdiagonals are zero. A diagonal matrix is one in which only the main diagonal of the matrix is nonzero. A banded matrix has nonzero elements only near the main diagonal. (Generically, such matrices have their nonzero elements confined to the upper triangle, the lower triangle, or a band about the main diagonal, respectively; the schematic figures are omitted here.) Such matrices arise in discretization of differential equations; a tridiagonal matrix is one in which only the main diagonal and the first subdiagonal and superdiagonal are nonzero.

A block banded matrix is a banded matrix in which the nonzero elements themselves are naturally grouped into smaller submatrices. Such matrices arise when discretizing systems of partial differential equations in more than one direction. For example, the following is the block tridiagonal matrix that arises when discretizing the Laplacian operator in two dimensions on a uniform grid:

$$M = \begin{pmatrix} B & C & & 0 \\ A & B & C & \\ & \ddots & \ddots & \ddots \\ 0 & & A & B \end{pmatrix}, \qquad B = \begin{pmatrix} -4 & 1 & & 0 \\ 1 & -4 & 1 & \\ & \ddots & \ddots & \ddots \\ 0 & & 1 & -4 \end{pmatrix}, \qquad A = C = I.$$

We have, so far, reviewed some of the notation of linear algebra that will be essential in the development of numerical methods. In homework #1, we analyze the static forces in a truss subject to some significant simplifying assumptions, as shown in class. With this machinery, and a bit of analysis, we will see in the first homework that we are already able to analyze important systems of engineering interest with a reasonable degree of accuracy. In order to further our understanding of the statics and dynamics of phenomena important in physical systems, we need to review a few more elements from linear algebra: determinants, eigenvalues and eigenvectors, matrix norms, and the condition number.

In Chapter 2, we will discuss various methods of solution of nonsingular square systems of the form Ax = b for the unknown vector x. We will need to solve systems of this type repeatedly in the numerical algorithms we will develop, so we will devote a lot of attention to this problem. As we will show, the narrower the width of the band of nonzero elements, the easier it is to solve the problem Ax = b with an efficient numerical algorithm.
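As a concrete illustration of banded and block banded structure, the following Matlab sketch (not from the original notes; the grid size is an arbitrary choice) assembles a tridiagonal second-difference matrix and the corresponding block tridiagonal two-dimensional Laplacian using Kronecker products, and plots the sparsity pattern of the latter.

% banded_demo.m  (illustrative sketch)
n = 5;                                 % arbitrary small grid size in each direction
e = ones(n, 1);
T = spdiags([e -2*e e], -1:1, n, n);   % 1-D second-difference matrix (tridiagonal)
I = speye(n);
M = kron(I, T) + kron(T, I);           % block tridiagonal 2-D Laplacian stencil
spy(M)                                 % visualize the banded/block structure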

1.2 Determinants

An extremely useful method to characterize a square matrix is by making use of a scalar quantity called the determinant, denoted $|A|$.

1.2.1 Definition of the determinant

The determinant may be defined by induction as follows:

1) The determinant of a 1 x 1 matrix $A = [a_{11}]$ is just $|A| = a_{11}$.

2) The determinant of a 2 x 2 matrix $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ is $|A| = a_{11} a_{22} - a_{12} a_{21}$.

n) The determinant of an n x n matrix is defined as a function of the determinant of several (n-1) x (n-1) matrices as follows: the determinant of A is a linear combination of the elements of row $\kappa$ (for any $\kappa$ such that $1 \le \kappa \le n$) and their corresponding cofactors:

$$|A| = a_{\kappa 1} A_{\kappa 1} + a_{\kappa 2} A_{\kappa 2} + \cdots + a_{\kappa n} A_{\kappa n},$$

where the cofactor $A_{\kappa j}$ is defined as the determinant of the minor $M_{\kappa j}$ with the correct sign:

$$A_{\kappa j} = (-1)^{\kappa + j}\, |M_{\kappa j}|,$$

where the minor $M_{\kappa j}$ is the matrix formed by deleting row $\kappa$ and column j of the matrix A. The command det(A) is used to compute the determinant in Matlab.

1.2.2 Properties of the determinant

When defined in this manner, the determinant has several important properties:

1. Adding a multiple of one row of the matrix to another row leaves the determinant unchanged; for example,
$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = \begin{vmatrix} a & b \\ c + \gamma a & d + \gamma b \end{vmatrix}.$$

2. Exchanging two rows of the matrix flips the sign of the determinant:
$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = -\begin{vmatrix} c & d \\ a & b \end{vmatrix}.$$

3. If A is triangular (or diagonal), then $|A|$ is the product $a_{11} a_{22} \cdots a_{nn}$ of the elements on the main diagonal. In particular, the determinant of the identity matrix is $|I| = 1$.

4. If A is singular (i.e., if Ax = b does not have a unique solution), then $|A| = 0$. If A is nonsingular (i.e., if Ax = b has a unique solution), then $|A| \neq 0$.

By property #4, we see that whether or not the determinant of a matrix is zero is the litmus test for whether or not that matrix is singular.

1.2.3 Computing the determinant

For a large matrix A, the determinant is most easily computed by performing the row operations mentioned in properties #1 and 2 discussed in the previous section to reduce A to an upper triangular matrix U. (In fact, this is the heart of the Gaussian elimination procedure, which will be described in detail in Chapter 2.) Taking properties #1, 2, and 3 together, it follows that

$$|A| = (-1)^r\, |U| = (-1)^r\, u_{11} u_{22} \cdots u_{nn},$$

where r is the number of row exchanges performed, and the $u_{ii}$ are the elements of U which are on the main diagonal.
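This procedure can be checked numerically. The sketch below is illustrative only: it uses Matlab's built-in lu command to perform the row reduction (with the permutation matrix P recording the row exchanges) and compares the resulting product of diagonal entries against the built-in det command; the test matrix is arbitrary.

% det_via_lu_demo.m  (illustrative sketch)
A = rand(5);                   % an arbitrary nonsingular matrix
[L, U, P] = lu(A);             % P*A = L*U, with L unit lower triangular
d = det(P) * prod(diag(U));    % det(P) = (-1)^r accounts for the row exchanges
disp([d det(A)])               % agrees with the built-in determinant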

1.3 Eigenvalues and Eigenvectors

Consider the equation

$$A\,\xi = \lambda\,\xi.$$

We want to solve for both a scalar $\lambda$ and some corresponding vector $\xi$ (other than the trivial solution $\xi = 0$) such that, when $\xi$ is multiplied from the left by A, it is equivalent to simply scaling $\xi$ by the factor $\lambda$. Such a situation has the important physical interpretation as a natural mode of a system when A represents the "system matrix" for a given dynamical system, as will be illustrated in section 1.3.1. Note that the $\xi$ in this equation may be determined only up to an arbitrary constant; in other words, if $\xi$ is an eigenvector corresponding to a particular eigenvalue $\lambda$, then $c\,\xi$ is also an eigenvector for any scalar c.

The easiest way to determine for which $\lambda$ it is possible to solve the equation $A\xi = \lambda\xi$ for $\xi \neq 0$ is to rewrite this equation as

$$(A - \lambda I)\,\xi = 0.$$

If $(A - \lambda I)$ is a nonsingular matrix, then this equation has a unique solution, and since the right-hand side is zero, that solution must be $\xi = 0$. However, for those values of $\lambda$ for which $(A - \lambda I)$ is singular, this equation admits other solutions with $\xi \neq 0$. The values of $\lambda$ for which $(A - \lambda I)$ is singular are called the eigenvalues of the matrix A, and the corresponding vectors $\xi$ are called the eigenvectors. Making use of property #4 of the determinant, we see that the eigenvalues must therefore be exactly those values of $\lambda$ for which

$$|A - \lambda I| = 0.$$

This expression, when multiplied out, turns out to be a polynomial in $\lambda$ of degree n for an n x n matrix A; this is referred to as the characteristic polynomial of A. By the fundamental theorem of algebra, there are exactly n roots to this equation, though these roots need not be distinct. The eigenvector corresponding to a given eigenvalue cannot be determined from $(A - \lambda I)\xi = 0$ alone up to more than an arbitrary scaling, because $(A - \lambda I)$ is singular. Once the eigenvalues $\lambda$ are found by finding the roots of the characteristic polynomial of A, the eigenvectors may be found by solving the equation $(A - \lambda I)\,\xi = 0$. Note also that, if all of the eigenvalues of A are distinct (different), all of the eigenvectors of A are linearly independent (i.e., $\angle(\xi_i, \xi_j) \neq 0$ for $i \neq j$). The command [V,D] = eig(A) is used to compute eigenvalues and eigenvectors in Matlab.

1.3.1 Physical motivation for eigenvalues and eigenvectors

In order to realize the significance of eigenvalues and eigenvectors for characterizing physical systems, and to foreshadow some of the developments in later chapters, it is enlightening at this point to diverge for a bit and discuss the time evolution of a taut wire which has just been struck (as with a piano wire) or plucked (as with the wire of a guitar or a harp). Neglecting damping, the deflection of the wire, f(x, t), obeys the linear partial differential equation (PDE)

$$\frac{\partial^2 f}{\partial t^2} = \sigma^2 \frac{\partial^2 f}{\partial x^2} \tag{1.1}$$

subject to boundary conditions

$$f = 0 \text{ at } x = 0, \qquad f = 0 \text{ at } x = L,$$

and initial conditions

$$f = c(x) \text{ at } t = 0, \qquad \frac{\partial f}{\partial t} = d(x) \text{ at } t = 0.$$

Inserting (1. pronounced \iota". ) . it follows for most ! that B = 0 as well. which results in: 9 ZL ! x dx = c L 2 Z L c(x) sin ! x dx > > c(x) sin ^ 2 c =L ^ = 0 0 for = 1 2 3 : : : ZL ZL ! x dx = d ! L 2 ! x dx> > ^ ^ d(x) sin d = d(x) sin 2 2 ) ^ =L. We now attempt to form a superposition of the nontrivial f that solves the initial conditions ^ given for f .2) (No summation is implied over the Greek index . it must follow that A = 0. With this approach.8 CHAPTER 1. for certain speci c values of ! (speci cally. Due to the boundary condition at x = L. for ! L= = for integer values of ).1) which also satis es the initial conditions as a superposition of these modes. De ning c = B C and d = B D . we take ^ 1 1 i X Xh ^ f= f = c sin ! x cos(! t) + d sin ! x sin(! t) ^ 8 =1 =1 where ! .2) into (1. !2 2 X T 00 = 2 X 00 T X T 00 = !2 T where the constant ! must be independent of both x and t due to the center equation combined with the facts that X = X (x) and T = T (t). we will be able to reconstruct a solution of (1. for . X satis es the homogeneous boundary condition at x = L even for nonzero values of B .1). integers: ZL 0 sin x L sin x L dx = = L=2 otherwise: 0 ( . The coe cients c and d may be determined by enforcing the initial conditions: ^ 1 1 X @f (x t = 0) = d(x) = X d ! sin ! x : ^ f (x t = 0) = c(x) = c sin ! x ^ @t i=1 i=1 ^ The c and d are referred to as the discrete sine transforms of c(x) and d(x) on the interval x ^ 1 0 ) 0 2 0 L]. A SHORT REVIEW OF LINEAR ALGEBRA We will solve this system using the separation of variables (SOV) approach. we multiply both of the above equations by sin(! x= ) and integrate over the domain x 0 L].1) which t this form. T = C cos(! t) + D sin(! t) Due to the boundary condition at x = 0. Noting the orthogonality of the sine functions1 .) If we can nd enough nontrivial (nonzero) solutions of (1. we seek \modes" of the solution. f . However. which satisfy the boundary conditions on f and which decouple into the form f = X (x)T (t): (1. This orthogonality principle states that. The two systems at right are solved with: X = A cos ! x + B sin ! x ) T 00 = T . and thus f (x t) = 0 x t. we nd that 2 X 00 = !2 X X 00 . .


Thus, the solutions of (1.1) which satisfy the boundary conditions and initial conditions may be found as a linear combination of modes of the simple decoupled form given in (1.2), which may be determined analytically. But what if, for example, $\sigma$ is a function of x? Then we can no longer represent the mode shapes analytically with sines and cosines. In such cases, we can still seek decoupled modes of the form $f_\kappa = X_\kappa(x)\, T_\kappa(t)$, but we now must determine the $X_\kappa(x)$ numerically. Consider again the equation of the form

$$\sigma^2 X'' = -\omega^2 X \qquad \text{with} \qquad X = 0 \text{ at } x = 0 \text{ and } x = L.$$

Consider now the values of X only at N + 1 discrete locations ("grid points") located at $x_j = j\,\Delta x$ for $j = 0, \ldots, N$, where $\Delta x = L/N$. Note that, at these grid points, the second derivative may be approximated by

$$\frac{\partial^2 X}{\partial x^2}\bigg|_{x_j} \approx \frac{\dfrac{X_{j+1} - X_j}{\Delta x} - \dfrac{X_j - X_{j-1}}{\Delta x}}{\Delta x} = \frac{X_{j+1} - 2X_j + X_{j-1}}{(\Delta x)^2},$$

where, for clarity, we have switched to the notation $X_j \triangleq X(x_j)$. By the boundary conditions, $X_0 = X_N = 0$. The differential equation at each of the N - 1 grid points on the interior may be approximated by the relation

$$\sigma_j^2\, \frac{X_{j+1} - 2X_j + X_{j-1}}{(\Delta x)^2} = -\omega^2 X_j,$$

which may be written in the matrix form

$$\frac{1}{(\Delta x)^2} \begin{pmatrix} -2\sigma_1^2 & \sigma_1^2 & & & \\ \sigma_2^2 & -2\sigma_2^2 & \sigma_2^2 & & \\ & \ddots & \ddots & \ddots & \\ & & \sigma_{N-2}^2 & -2\sigma_{N-2}^2 & \sigma_{N-2}^2 \\ & & & \sigma_{N-1}^2 & -2\sigma_{N-1}^2 \end{pmatrix} \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_{N-2} \\ X_{N-1} \end{pmatrix} = -\omega^2 \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_{N-2} \\ X_{N-1} \end{pmatrix}$$

or, more simply, as $A\,\xi = \lambda\,\xi$, where $\lambda \triangleq -\omega^2$. This is exactly the matrix eigenvalue problem discussed at the beginning of this section, and can be solved in Matlab for the eigenvalues $\lambda_i$ and the corresponding mode shapes $\xi_i$ using the eig command, as illustrated in the code wire.m provided at the class web site. Note that, for constant $\sigma$ and a sufficiently large number of gridpoints, the first several eigenvalues returned by wire.m closely match the analytic solution $\omega_\kappa = \kappa\pi\sigma/L$, and the first several eigenvectors are of the same shape as the analytic mode shapes $\sin(\omega_\kappa x / \sigma)$.
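The following sketch sets up the same discrete eigenvalue problem for constant wave speed and compares the first few computed frequencies with the analytic values. It is written in the spirit of the wire.m code mentioned above, but it is not that code, and the wire length, wave speed, and grid size below are arbitrary choices.

% wire_sketch.m  (illustrative sketch, in the spirit of wire.m)
Lw = 1;  sigma = 1;  N = 100;                   % arbitrary wire length, wave speed, grid
dx = Lw / N;
e  = ones(N-1, 1);
A  = sigma^2 / dx^2 * spdiags([e -2*e e], -1:1, N-1, N-1);   % interior grid points only
lam   = sort(eig(full(A)), 'descend');          % eigenvalues lambda = -omega^2
omega = sqrt(-lam(1:5));                        % first few natural frequencies
disp([omega, (1:5)' * pi * sigma / Lw])         % compare with analytic omega_kappa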

1.3.2 Eigenvector decomposition
If all of the eigenvectors $\xi_i$ of a given matrix A are linearly independent, then any vector x may be uniquely decomposed in terms of contributions parallel to each eigenvector such that

$$x = S\,\eta, \qquad \text{where} \qquad S = \begin{pmatrix} | & | & | & \\ \xi_1 & \xi_2 & \xi_3 & \cdots \\ | & | & | & \end{pmatrix}.$$

Such a change of variables often simplifies a dynamical equation significantly. For example, if a given dynamical system may be written in the form $\dot{x} = Ax$ where the eigenvalues of A are distinct, then (by substitution of $x = S\eta$ and multiplication from the left by $S^{-1}$) we may write

$$\dot{\eta} = \Lambda\,\eta, \qquad \text{where} \qquad \Lambda = S^{-1} A S = \begin{pmatrix} \lambda_1 & & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_n \end{pmatrix}.$$

In this representation, as $\Lambda$ is diagonal, the dynamical evolution of each mode of the system is completely decoupled (i.e., $\dot{\eta}_1 = \lambda_1 \eta_1$, $\dot{\eta}_2 = \lambda_2 \eta_2$, etc.).
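This decoupling is easy to check with the eig command; the sketch below uses an arbitrary example matrix with distinct eigenvalues (it is not an example from the text).

% decouple_demo.m  (illustrative sketch)
A = [0 1; -2 -3];        % arbitrary matrix with distinct eigenvalues (-1 and -2)
[S, Lambda] = eig(A);    % columns of S are the eigenvectors
disp(S \ A * S)          % recovers Lambda: the transformed system is diagonal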

1.4 Matrix norms
The norm of A, denoted $\|A\|$, is defined by

$$\|A\| = \max_{x \neq 0} \frac{\|Ax\|}{\|x\|},$$

where $\|x\|$ is the Euclidean norm of the vector x. In other words, $\|A\|$ is an upper bound on the amount the matrix A can "amplify" the vector x:

$$\|Ax\| \le \|A\|\,\|x\| \qquad \forall\, x.$$

In order to compute the norm of A, we square both sides of the expression for $\|A\|$, which results in

$$\|A\|^2 = \max_{x \neq 0} \frac{\|Ax\|^2}{\|x\|^2} = \max_{x \neq 0} \frac{x^T (A^T A)\, x}{x^T x}.$$

The peak value of the expression on the right is attained when $(A^T A)\,x = \lambda_{\max}\, x$. In other words, the matrix norm may be computed by taking the square root of the maximum eigenvalue of the matrix $(A^T A)$. In order to compute a matrix norm in Matlab, one may type in sqrt(max(eig(A' * A))), or, more simply, just call the command norm(A).

1.5 Condition number

Let Ax = b and consider a small perturbation to the right-hand side. The perturbed system is written

$$A(x + \delta x) = (b + \delta b) \quad\Rightarrow\quad A\,\delta x = \delta b.$$

We are interested in bounding the change $\delta x$ in the solution x resulting from the change $\delta b$ to the right-hand side b. Note by the definition of the matrix norm that

$$\|x\| \ge \frac{\|b\|}{\|A\|} \qquad \text{and} \qquad \|\delta x\| \le \|A^{-1}\|\,\|\delta b\|.$$

Dividing the equation on the right by the equation on the left, we see that the relative change in x is bounded by the relative change in b according to the following:

$$\frac{\|\delta x\|}{\|x\|} \le c\, \frac{\|\delta b\|}{\|b\|},$$

where $c = \|A\|\,\|A^{-1}\|$ is known as the condition number of the matrix A. If the condition number is small (say, $O(10^3)$), then the matrix is referred to as well conditioned, meaning that small errors in the values of b on the right-hand side will result in "small" errors in the computed values of x. However, if the condition number is large (say, $> O(10^3)$), then the matrix is poorly conditioned, and the solution x computed for the problem Ax = b is often unreliable. In order to compute the condition number of a matrix in Matlab, one may type in norm(A) * norm(inv(A)) or, more simply, just call the command cond(A).
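The effect of poor conditioning can be observed numerically. The sketch below is illustrative only: it uses the Hilbert matrix merely as a convenient ill-conditioned example (it is not discussed in these notes), perturbs the right-hand side slightly, and compares the resulting relative change in x with the bound derived above.

% condition_demo.m  (illustrative sketch)
A  = hilb(8);                      % a classically ill-conditioned matrix
x  = ones(8, 1);  b = A * x;       % manufacture a problem with known solution
db = 1e-10 * randn(8, 1);          % a tiny perturbation of the right-hand side
dx = A \ (b + db) - x;             % resulting change in the computed solution
disp(norm(dx) / norm(x))           % relative change in x
disp(cond(A) * norm(db) / norm(b)) % the bound c * ||db|| / ||b||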


the solution for x1 may then be found: 3 x1 + 4 |{z} + 5 |{z} = 1 x2 x3 x1 = 12=3 = 4: ) .Chapter 2 Solving linear equations Systems of linear algebraic equations may be represented e ciently in the form Ax = b. one often needs to solve such a system for x. the solution for x2 may then be found: 6 x2 + 7 |{z} = 19 x3 x2 = 12=6 = 2: ) ) Finally. substituting the resulting values for x2 and x3 into the equation implied by the rst row. . the problem is almost as easy.1 Introduction to the solution of Ax = b If A is diagonal. . Systems of this form need to be solved frequently. . For example: 02 3 41 0 u 1 0 0 1 2u + 3v 4w = 0 @1 0 2A @ v A = @ 7 A : u 2w = 7 w u + v + w = 12 | 1 1{z 1 } | {z } | 12 } {z . Consider the following: 03 4 51 0x 1 0 1 1 @0 6 7A @x1 A = @19A 2 0 0 8 x3 8 The solution for x3 may be found by by inspection: x3 = 8=8 = 1. A x b 2. the solution may be found by inspection: 02 0 01 0x 1 051 x1 = 5=2 @0 3 0A @x1A = @6A x2 = 2 2 0 0 4 x3 7 x3 = 7=4 If A is upper triangular. . ) Given an A and b. 1 2 1 11 . Substituting this result into the equation implied by the second row. so these notes will devote substantial attention to numerical methods which solve this type of problem e ciently.

2. one reaches the truism 0 = 0. .12 CHAPTER 2. from which the solution may be found by the marching procedure illustrated above. Multiply the rst row by 2 and subtract from the last row: 01 1 1 1 0x 1 0 6 1 @0 4 1A @x1 A = @ 5 A 2 0 4 1 x3 11 3. or there are in nitely many solutions (if. we continue to combine rows until the matrix becomes the identity this is referred to as the Gauss-Jordan process: . Add second row to third: 01 1 1 1 0x 1 0 6 1 @0 4 1A @x1 A = @ 5 A 2 0 0 2 x3 6 . 00 @1 10 1 0 1 This is an upper triangular matrix. . we can perform the following manipulations: 1. Such a reduction to triangular form is called Gaussian elimination. .1 Example of solution approach Consider the problem 4 1 x1 5 1 1 A @x2 A = @6A 2 2 1 x3 1 Considering this matrix equation as a collection of rows. . There are either: zero solutions (if. . . in which case the corresponding element xi can take any value). we would like to reduce the problem to a triangular form. then present the general algorithm. Interchange the rst two rows: 01 1 1 1 0x 1 061 @0 4 1A @x1 A = @5A 2 2 2 1 x3 1 2. when solving the i'th equation. . we can perform simple linear combinations of the rows and still have the same system. To solve a general nonsingular matrix problem Ax = b. Alternatively (and equivalently). lower triangular matrices naturally lend themselves to solution via a march down from the top row. so we can solve this by inspection (as discussed earlier). Similarly. which cannot be made true for any value of xi ). SOLVING LINEAR EQUATIONS Thus.1. . For example. . When studying science and engineering problems on a computer. we are in trouble. when solving the i'th equation. . generally one should rst identify nonsingular problems before attempting to solve them numerically. The matrix A is called singular in such cases. upper triangular matrices naturally lend themselves to solution via a march up from the bottom row. each representing a separate equation. We rst illustrate the Gaussian elimination procedure by example. Note that if there is a zero in the i'th element on the main diagonal when attempting to solve a triangular system. one reaches an equation like 1 = 0.

and thus X = A. x2 . 1 ) 01 @0 0 1 0 0 1 2 1 .1. . Example: construct three vectors x1 . . ) ) | . 2 {z X . 1 . {z A } |{z} b .1 . Subtract second and third rows from the rst: 01 0 01 0x 1 011 011 1 @0 1 0A @x2 A = @2A x = @2A 0 0 1 x3 3 3 The letters x1 . INTRODUCTION TO THE SOLUTION OF Ax = b 13 4. 2 0 1 0 0A 1 2 1 ) 01 @0 0 0 0 1 1 0 2 1 0 0 1 2 . the procedure is generalized so that we can teach the computer to do the work for us. 2 1 1 1 1 1 2 . A particular case of interest is the several columns that make up the identity matrix. then add the result to the second row: 01 1 11 0x 1 061 @0 4 0A @x1 A = @8A 2 0 0 1 x3 3 5. x2 . | {z } | {z } A B 01 0 @0 1 0 0 j j 2 1 0 0 1 1 1 0A 1 0 0 1 . but is just a sequence of mechanical steps. we have AX = I by construction. . so we may devise a shorthand augmented matrix in which we can conduct the same series of operations without the extraneous symbols: 0 0 4 1 51 0 1 1 1 61 0 1 0 0 11 @ 1 1 1 6A @ 0 4 1 5A @ 0 1 0 2A ::: 2 2 1 1 2 2 1 1 0 0 1 3 ) . In the following section. Divide the last row by -2. 0 0 1 0A 1 1 1 ) .2. and x3 clutter this process. Divide the second row by 4: 01 1 11 0x 1 061 @0 1 0A @x1 A = @2A 2 0 0 1 x3 3 6. 2 1 1 1 2 1 . |{z} x bi comprising a right-hand-side matrix B. and x3 such that An advantage of this notation is that we can solve it simultaneously for several right hand sides 011 Ax1 = @0A 0 001 Ax2 = @1A 0 001 Ax3 = @0A : 1 This problem is solved as follows: 0 1 0 2 1 0 01 01 0 @ 1 1 1 0 1 0A @0 1 0 1 1 0 0 1 0 1 ) . j j j j | 1 2 1 2 11 A } The above procedure is time consuming. ) . 1 De 0 1 ning X = @x1 x2 x3 A.

0a a 11 B 0 a12 B . @ a1n b1 a2n b2 C C 1 2. etc.. . update the other bi as follows: bn bn =ann : bi bi . where A and b are given. . SOLVING LINEAR EQUATIONS 2. Initiate with: Starting from i = n 1 and working back to i = 1.. Such a procedure is referred to as \partial pivoting". Eliminate everything below a11 (the rst \pivot") in the rst column: Let m21 = a21 =a11 .. etc. C . Once nished. n X where summation notation is not implied. . . Repeat step 1 for the new (smaller) augmented matrix (highlighted by the dashed box in the last equation). The process of back substitution is straightforward.. i. ... . k=i+1 aik bk =aii . We can always complete the Gaussian elimination procedure with partial pivoting if the matrix we are solving is nonsingular.... 22 B .e.2 Gaussian elimination algorithm A x = b.1 Forward sweep . If it is not. . The pivot for the second column is a22 .2. . B . . 1. The modi ed augmented matrix eventually takes the form .. . . . if the problem we are solving has a unique solution. a1n b1 a2n b2 C C . The following notation is used for the augmented matrix: 0a a ::: a b 1 1n 1 B a11 a12 a2n b2 C 21 22 (A b) = B . so it is pivotal that the pivot is nonzero..14 CHAPTER 2. . .2. 22 B . .A 0 0 ann bn Note that at each stage we need to divide by the \pivot".. Multiply the rst row by m31 and add to the third row.....A 1 0a a 11 B 0 a12 B . . C : C @ A an1 an2 ann bn j This section discusses the Gaussian elimination algorithm to nd the solution x of the system 2. 2.. The modi ed augmented matrix soon has the form 0 an 2 ann bn where all elements except those in the rst row have been changed. Multiply the rst row by m21 and add to the second row.. exchange the row with the zero pivot with one of the lower rows that has a nonzero element in the pivot column. the vector b contains the solution x of the original system A x = b.2 Back substitution . Let m31 = a31 =a11 .. . @ ... . .. C . .

Thus. . . (n k + 1)(n k) k = n(n2+ 1) and n X k=1 k2 = n(n + 1)(2n + 1) 6 n(n 1)=2 (n3 n)=3 (n3 n)=3 . we see that: The total number of divisions is: The total number of multiplications is: The total number of additions is: ) For large n. Operation count for the back substitution: The total number of divisions is: The total number of multiplications is: The total number of additions is: ) n n. the total number of ops for the back substitution is thus O(n2 ). .etc. . Operation count for the forward sweep: To eliminate a21 : To eliminate entire rst column: To eliminate a32 : To eliminate entire second column: . (n k) = n(n 1)=2 . . . . the total number of ops for the forward sweep is thus O(2n3 =3).3 Operation count Let's now determine how expensive the Gaussian elimination algorithm is..1 X k=1 n. + The total number of divisions is thus: The total number of multiplications is: The total number of additions is: Two useful identities here are n X k=1 n..1 X k=1 n.1 X k=1 (n k) .1 X k=1 (n k) = n(n 1)=2 . GAUSSIAN ELIMINATION ALGORITHM 15 2. n n(n 1) (n 1) (n 1)(n 2) . . . . 1 (n 1) 1 (n 2) . both of which may be veri ed by induction. . For large n. . . .2. . we see that the forward sweep is much more expensive than the back substitution for large n.2.2. n n(n 1) (n 1) (n 1)(n 2) . (n k + 1)(n k) . Applying these identities.1 X k=1 n.

j) * A(j. Thus.j+1:n) = A(i.n) for i = n-1:-1:1. A(i.A(i. and the vector b is replaced by the solution x of the original system. and are left as an exercise for the motivated reader.j) % Add m_ij times the upper triangular part of the j'th row of % the augmented matrix to the i'th row of the augmented matrix. for i=j+1:n.2.j).j) A(i. % % % % Compute m_ij.i+1:n) * b(i+1:n) ) / A(i.A(i.j) / A(j. though the accepted convention is to use lowercase letters for the elements of matrices. unfortunately.j) * b(j) end end % -----------. The matrix A is replaced by the m_ij and U on exit.4 Matlab implementation The following code is an e cient Matlab implementation of Gaussian elimination. as a_ij is set to zero by construction during this iteration. the following algorithm may fail even on nonsingular problems if pivoting is required. Matlab refers to the elements of A as A(i. accounting for all values of x already determined: b(i) = ( b(i) .16 CHAPTER 2. % loop through the elements a_ij below the pivot a_jj. % % % % % gauss. for j = 1:n-1.BACK SUBSTITUTION -----------% Initialize the backwards march b(n) = b(n) / A(n. SOLVING LINEAR EQUATIONS 2.j+1:n) + A(i. % Note that an inner product is performed at the multiplication % sign here. Note that.m Solves the system Ax=b for x using Gaussian elimination without pivoting. -------------. Note that we can store m_ij in the location (below the diagonal!) that a_ij used to sit without disrupting the rest of the algorithm. = .i) end % end gauss.FORWARD SWEEP -------------% For each column j<n. The \partial pivoting" checks necessary to insure success of the approach have been omitted for simplicity.m .j+1:n) b(i) = b(i) + A(i.

1 and noting that EA = U .. each row operation (which is simply the multiplication of one row by a number and adding the result to another row) may also be denoted by the premultiplication of A by a simple transformation matrix Eij . m32 . which. 1 1 C C C A then E21 A means simply to multiply the rst row of A by m21 and add it to the second row.. which is exactly the rst step of the Gaussian elimination process. which we will call U . Furthermore. 1 0 1 C C: C A The forward sweep of Gaussian elimination (without pivoting) involves simply the premultiplication of A by several such matrices: ( E ) |En n. 1 0 1 . it is easily veri ed.2) ({z n2 E42 E32 )(En1 E31 E21} A = U: E To \undo" the e ect of this whole string of multiplications. it follows at once that A = LU .. so that . . we may simply multiply by the inverse of E . . 1 C C C: C C A De ning L = E . mn1 . @ ..1 = B m31 E B . we simply multiply the rst row of the resulting matrix by m21 and add it to the second row. 0 1 B . It turns out that the transformation matrix which does the trick at each step is simply an identity matrix with the (i j )'th component replaced by mij . Through several row operations. . GAUSSIAN ELIMINATION ALGORITHM 17 2.1 1 .2.2. . . if we de ne 01 Bm21 E21 = B B @ 0 1 0 .2 En.1 )(En n. is given by 0 1 B m21 B . 1 0 .5 LU decomposition We now show that the forward sweep of the Gaussian elimination algorithm inherently constructs an LU decomposition of A.....1 n. mn2 mn n. For example.2.. . To \undo" the multiplication of a matrix by E21 .. B m21 E211 = B @ . B . the matrix A is transformed by the Gaussian elimination procedure into an upper triangular form.

It is for demonstration purposes only. once we run the full Gaussian elimination procedure once).m is not e cient with memory. Once we have the LU decomposition of A (e. but at a signi cantly lower cost (O(2n2 ) ops instead of O(2n3 =3) ops) because we were able to leverage the LU decomposition of the matrix A. we can solve a system with a new right hand side A x = b0 with a very inexpensive algorithm. Leaving the nontrivial components of L and U in a single array A. L=eye(n) for j=1:n-1. though it makes the code a bit di cult to interpret.j) end end % U is simply the upper-triangular part of the modified A. Thus. this system can also be solved inexpensively (O(n2 ) ops). for i=j+1:n.m As opposed to the careful implementation of the Gaussian elimination procedure in gauss. for j=i:n. this is probably not a good idea. in which the entire operation is done \in place" in memory. it is a very good idea to reuse the LU decomposition of A rather than repeatedly running the Gaussian elimination routine from scratch. U=zeros(n) for i=1:n. constructing the LU decomposition of A. is an e ective method of saving memory space.g. We note that we may rst solve an intermediate problem L y = b0 for the vector y. if you are going to get several right-hand-side vectors b0 with A remaining xed. Note that this routine does not make efficient use of memory. construct L with 1's on the diagonal and the negative of the % factors m_ij used during the Gaussian elimination below the diagonal. U(i. % First.m.. It takes the information stored in the array A and spreads it out over two arrays L and U. SOLVING LINEAR EQUATIONS We thus see that both L and U may be extracted from the matrix that has replaced A after the forward sweep of the Gaussian elimination procedure. As U is (upper) triangular.18 CHAPTER 2. this system can be solved inexpensively (O(n2 ) ops).m: % % % % extract_LU.j)=A(i.m. Substituting the second equation into the rst.j) end end % end extract_LU. Once y is found. L(i. . As L is (lower) triangular. we see that what we have solved by this two-step process is equivalent to solving the desired problem A x = b0 . In codes for which memory storage is a limiting factor. the code extract LU.m Extract the LU decomposition of A from the modified version of A returned by gauss. we may then solve the system Ux=y for the vector x.j)=-A(i. The following Matlab code constructs these two matrices from the value of A returned by gauss. and noting that A = LU .
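The two-step solve described here can be written in a few lines of Matlab. The sketch below is illustrative only: it builds the factors with Matlab's built-in lu command (with the permutation matrix P) rather than with gauss.m, and the matrix, its size, and the new right-hand side bprime are arbitrary choices made for the example.

% reuse_LU_sketch.m  (illustrative sketch; not code from these notes)
n = 6;  A = rand(n) + n * eye(n);          % an arbitrary, comfortably nonsingular matrix
[L, U, P] = lu(A);                         % factor A once: P*A = L*U, O(n^3) work
bprime = rand(n, 1);                       % a new right-hand side
pb = P * bprime;
y = zeros(n, 1);  x = zeros(n, 1);
for i = 1:n                                % forward substitution: solve L*y = P*bprime
  y(i) = pb(i) - L(i, 1:i-1) * y(1:i-1);   % diagonal of L is 1, so no division needed
end
for i = n:-1:1                             % back substitution: solve U*x = y
  x(i) = ( y(i) - U(i, i+1:n) * x(i+1:n) ) / U(i, i);
end
disp(norm(A * x - bprime))                 % small: x solves the new system at O(n^2) cost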

% Recall that the solution x is returned in b by gauss.2. Now let's see how good L and lower triangular with 1's on triangular. LU=L*U. % Let's hang on to them here. pause % Recall that A and b are destroyed in gauss.m. and the value of the value of A. If we did well. On a computer. U should be upper L*U should be about the same as not done efficiently below. pause % Now let's see how good x is.m. clear. in which all numbers are represented with only nite precision.2. pause U are. create a random A and b. U.6 Testing the Gaussian elimination code The following code tests gauss.m echo on % This code tests the Gaussian elimination and LU decomposition % routines. echo off. Note that the product L*U is L and U have structure which L. echo on. extract_LU. L should be the diagonal. This can sometimes lead to numerical inaccuracies. b=rand(n.m with random A and b. and extract the LU decomposition of A.m and extract LU.2. Recall that partial pivoting involves simply swapping rows whenever a zero pivot is encountered. any nonsingular system may be solved by the Gaussian elimination procedure if partial pivoting is implemented. % test_gauss. If we did well.2. A=Asave % end test_gauss. the value of A*x % should be about the same as the value of b. taking the di erence of two numbers 2. x=b. b=bsave. as small (but nonzero) pivots may be encountered by this algorithm.7 Pivoting . GAUSSIAN ELIMINATION ALGORITHM 19 2.1). I will shatter this dream for you with homework #1.m on several random matrices A. gauss. This can lead to subsequent row combinations which involve the di erence of two large numbers which are almost equal. % % % % % % Ax = Asave*x. First. as both is not being leveraged. n=4 A=rand(n). Asave=A bsave=b % Run the Gaussian elimination code to find the solution x of % Ax=b.m As you run the code test gauss. Just because a routine works well on several random matrices does not mean it will work well in general! As mentioned earlier. you may be lulled into a false sense of security that pivoting isn't all that important.

we can develop a procedure which swaps columns. Repeat step 1 for the new (smaller) augmented matrix... B B @ an. . .1 Forward sweep . B B @ 0 bn. at each stage it is pivotal that the pivot is nonzero..1 cn. 1 Diagonal dominance means that the magnitude of the element on the main diagonal in each row is larger than the sum of the magnitudes of the other elements in that row. Eliminate everything below b1 (the rst pivot) in the rst column: Let m2 = a2 =b1... The following notation is used for the augmented matrix: 0b c 0 1 B a1 b2 c2 2 B a b c B 3 3 3 (A g) = B B .1 cn. . C C gn. in addition to swapping the rows. . Multiply the rst row by m2 and add to the second row.3 Thomas algorithm This section discusses the Thomas algorithm that nds the solution x of the system A x = g.20 CHAPTER 2..3. C C gn.1 j 0 an bn 1 C C C C ... 2. 1.1 A g1 g2 g3 gn Again.1 0 0 bn 1 C C C C . A good numerical discretization of a di erential equation will result in matrices A which are diagonally dominant. where A and g are given with A assumed to be tridiagonal and diagonally dominant1. The algorithm is based on the Gaussian elimination algorithm of the previous section but capitalizes on the structure of A.. To alleviate this situation. The pivot for the second column is b2 . to maximize the size of the pivot at each step. as in the Gaussian elimination procedure. One has to bookkeep carefully when swapping columns because the elements of the solution vector are also swapped. 2.1 bn. : : : Continue iterating to eliminate the ai until the modi ed augmented matrix takes the form 0b c 0 B 01 b1 c2 2 B 0 b c B 3 3 B B .1 A g1 g2 g3 gn 2.. . C .. in which case the pivots are always nonzero and we may proceed without worrying about the tedious (and numerically expensive) chore of pivoting. C : . SOLVING LINEAR EQUATIONS which are almost equal can lead to signi cant magni cation of round-o error....

Once nished.3.2 Back substitution As before. 2 2(n 1) . initiate the back substitution with: gn gn =bn: Starting from i = n 1 and working back to i = 1. Operation count for the forward sweep: To eliminate a2 : To eliminate entire subdiagonal: ) 1 (n 1) . + 2 2(n 1) . Operation count for the back substitution: The total number of divisions is: The total number of multiplications is: The total number of additions is: ) n (n 1) (n 1) . This is a lot cheaper than Gaussian elimination! . For large n. the total number of ops for the back substitution is thus O(3n). .2. For large n. 2. the total number of ops for the forward sweep is thus O(5n).3. gi gi ci gi+1 =bi . the vector g contains the solution x of the original system A x = g. update the other gi as follows: .3. where summation notation is not implied.3 Operation count Let's now determine how expensive the Thomas algorithm is. THOMAS ALGORITHM 21 2.

as a_(j+1) is set to zero by construction during this iteration. SOLVING LINEAR EQUATIONS 2. Note that most of the elements of both L and U are zero. % % % % Compute m_(j+1). where a is the subdiagonal. -------------.c. The cost of e ciently solving L y = g0 2.5 LU decomposition . Once we have the LU decomposition of A.FORWARD SWEEP -------------% For each column j<n. = . The vectors (a.a(j+1) / b(j) a(j+1) % Add m_(j+1) times the upper triangular part of the j'th row of % the augmented matrix to the (j+1)'th row of the augmented % matrix. we can solve a system with a new right hand side A x = g0 with a two-step procedure as before.c(i) * g(i+1) ) / b(i) end % end thomas.g) are previously-defined vectors of length n.3.m The forward sweep of the Thomas algorithm again inherently constructs an LU decomposition of A. for j = 1:n-1. assuming A is tridiagonal and diagonally dominant. g(i) = ( g(i) . and c is the superdiagonal of the matrix A.22 CHAPTER 2.b.4 Matlab implementation % % % % % % % % % The following code is an e cient Matlab implementation of the Thomas algorithm. This decomposition may (ine ciently) be constructed with the code extract LU. thomas.m Solves the system Ax=g for x using the Thomas algorithm.3. and the vector g is replaced by the solution x of the original system.c) are replaced by the m_i and U on exit. b is the main diagonal. The matrix A is assumed to be diagonally dominant so that the need for pivoting is obviated. b(j+1) = b(j+1) + a(j+1) * c(j) g(j+1) = g(j+1) + a(j+1) * g(j) end % -----------. Note that we can put m_(j+1) in the location (below the diagonal!) that a_(j+1) used to sit without disrupting the rest of the algorithm.b. It is assumed that (a.BACK SUBSTITUTION ------------ g(n) = g(n) / b(n) for i = n-1:-1:1.m used for the Gaussian elimination procedure.

Note that this is an insanely inefficient way of storing the nonzero elements of A.m echo on % This code tests the implementation of the Thomas algorithm.1) extract_LU. THOMAS ALGORITHM 23 for the vector y is O(2n) ops (similar to the cost of the back substitution in the Thomas algorithm. and the value of L*U should be about the same as the value of A. % Recall that the solution x is returned in g by thomas. % test_thomas. If we did well. and is done for demonstration purposes only.3. the value of A*x % should be about the same as the value of g. Thus. thomas A = diag(a(2:n).1) % Hang on to A and g for later use. 2. L should be lower triangular with 1's on the diagonal. LU=L*U. If we did well. but noting that the divisions are not required because the diagonal elements are unity). L. = diag(a(2:n).3. % put the diagonals back into matrix form. This is a measurable savings which should not be discounted. b.0) + diag(c(1:n-1). n=4 a=rand(n.-1) + diag(b. solving A x = g0 by reusing the LU decomposition of A costs O(5n) ops. U should be upper triangular. pause % Now let's see how good x is. echo on.1) % % % A Construct A. create a random a. U. Asave=A gsave=g % Run the Thomas algorithm to find the solution x of Ax=g. and the cost of e ciently solving Ux=y for the vector x is O(3n) ops (the same as the cost of back substitution in the Thomas algorithm). echo off. and g. Note the structure of L and U.0) + diag(c(1:n-1).2. c. % First. whereas solving it by the Thomas algorithm costs O(8n) ops. A=Asave % end test_thomas. x=g.1) b=rand(n.m .1) g=rand(n.m.m for random A and g. pause % % % % Now let's see how good L and U are.m with extract LU.-1) + diag(b. g=gsave. Ax = Asave*x. clear.6 Testing the Thomas code The following code tests thomas.1) c=rand(n. and extract L and U.

which is O(8n).3. however. will not be full. widely used. and that partial pivoting (exchanging rows) is sometimes required to make it work. with which more e cient solution algorithms may be developed if several systems of the form A x = b must be solved where A is xed and several values of b will be encountered. . thus. When the equations and unknowns of the system are enumerated in such a manner that the nonzero elements lie only near the main diagonal.7 Parallelization 2. Many modern computers achieve their speed by nding many things to do at the same time. In many problems. which are amenable to very e cient solution via the Thomas algorithm. each step of the Thomas algorithm depends upon the previous step.4 Review We have seen that Gaussian elimination is a general tool that may be used whenever A is nonsingular (regardless of the structure of A). This is referred to as parallelization of an algorithm. Most of the systems we will encounter in our numerical algorithms. Gaussian elimination is also a means of obtaining an LU decomposition of A.24 The Thomas algorithm. is very e cient and. However. 2. A particular example of interest is tridiagonal systems. Thus. iteration j=7 of the forward sweep must be complete before iteration j=8 can begin. resulting in what we call a banded matrix. the Thomas algorithm does not parallelize. For example. more e cient solution techniques are available. One must always use great caution when parallelizing a code to not accidentally reference a variable before it is actually calculated. one can nd several di erent systems of the form A x = g and work on them all simultaneously to achieve the proper \load balancing" of a well-parallelized code. which is unfortunate. however.

in principle. which can be easily obtained (when f 0 (x(0) ) = 0) by taking f (x) = 0 and solving (3. Suppose the initial guess is x = x(0) . unlike linear systems where.Chapter 3 Solving nonlinear equations Many problems in engineering require solution of nonlinear algebraic equations. 3. In what follows. or several values of x which satisfy f (x) = 0. they can be obtained exactly with Gaussian elimination. . That is. In general. one. the next approximation is x(1) .1 Scalar case 25 .1). Let x(1) be this root.2) x(j+1) = x(j) f 0((x (j))) for j = 0 1 2 : : : x . the approximations can be improved by increasing the number of iterations. 3. only approximate solutions are obtained. However. we settle for the root of the approximate polynomial consisting of the rst two terms on the right-hand side of (3. successive approximations are Thus. there are no systematic methods to determine when a nonlinear equation f (x) = 0 will have zero. 1 6 . Linearize f (x) about x(0) using the Taylor series expansion f (x) = f (x(0) ) + (x x(0) )f 0 (x(0) ) + : : : (3. with nonlinear equations. Solution of nonlinear equations always requires iterations. starting with x obtained from f (j ) (3. x = xopt . which results in f (0) x(1) = x(0) f 0((x (0))) : x (0) . which leads to repeatedly solving linear systems of the form Ax = b. x. we will show that the most popular numerical methods for solving such equations involve linearization. Unfortunately. A single nonlinear equation may be written in the form f (x) = 0.1 The Newton-Raphson method for nonlinear root nding Iterative techniques start from an initial \guess" for the solution and seek successive improvements to this guess.1. In terms of a geometrical interpretation. if solutions exist. where f (x) crosses the x-axis in a plot of f vs. The objective is to nd the value(s) of x for which f is zero.1) for x. we want to nd the crossing point(s).1) Instead of nding the roots of the resulting polynomial of degree .

Successive approximations are computed in this fashion until (hopefully) we obtain a solution that is sufficiently accurate.  Let x^{(j)} indicate the solution at the j'th iteration.  The geometrical interpretation of the method is shown below: the function f at point x^{(0)} is approximated by a straight line which is tangent to f with slope f'(x^{(0)}).  The intersection of this line with the x-axis gives x^{(1)}.  Similarly, the function at x^{(1)} is approximated by a tangent straight line which intersects the x-axis at x^{(2)}, etc.

[Figure: geometrical interpretation of the Newton-Raphson method, showing the tangent-line iterates x^{(0)}, x^{(1)}, x^{(2)} on a plot of f(x) versus x.]

3.1.2 Quadratic convergence

We now show that, once the solution is near the exact value x = x_{opt}, the Newton-Raphson method converges quadratically.  Consider the Taylor series expansion

  0 = f(x_{opt}) = f(x^{(j)}) + (x_{opt} - x^{(j)}) f'(x^{(j)}) + \frac{(x_{opt} - x^{(j)})^2}{2} f''(x^{(j)}) + \cdots

If f'(x^{(j)}) \ne 0, divide by f'(x^{(j)}), which gives

  x^{(j)} - x_{opt} - \frac{f(x^{(j)})}{f'(x^{(j)})} \approx \frac{(x^{(j)} - x_{opt})^2}{2} \, \frac{f''(x^{(j)})}{f'(x^{(j)})}.          (3.3)

Combining this with the Newton-Raphson formula (3.2) leads to

  x^{(j+1)} - x_{opt} \approx \frac{(x^{(j)} - x_{opt})^2}{2} \, \frac{f''(x^{(j)})}{f'(x^{(j)})}.

Defining the error at iteration j as \epsilon^{(j)} = x^{(j)} - x_{opt}, the error at the iteration j+1 is related to the error at the iteration j by

  \epsilon^{(j+1)} \approx \frac{1}{2} \, \frac{f''(x^{(j)})}{f'(x^{(j)})} \, \big(\epsilon^{(j)}\big)^2.

That is, convergence is quadratic.  Note that convergence is guaranteed only if the initial guess is fairly close to the exact root; otherwise, the neglected higher-order terms dominate the above expression and we may encounter divergence.  If f' is near zero at the root, we will also have problems.
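The quadratic convergence is easy to observe numerically.  The following minimal sketch (my own illustration, not one of the course codes) applies formula (3.2) to f(x) = x^2 - 2, whose positive root is sqrt(2), and prints the error at each iteration; the number of correct digits roughly doubles from one iteration to the next.

% newton_sqrt2.m  (illustrative sketch)
x = 1.5;                             % initial guess, fairly close to sqrt(2)
for j = 1:5
  x = x - (x^2 - 2) / (2*x);         % Newton-Raphson update (3.2)
  fprintf('j=%d   x=%17.15f   error=%9.2e\n', j, x, abs(x - sqrt(2)))
end
% end newton_sqrt2.m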

3.1.3 Multivariable case: systems of nonlinear equations

A system of n nonlinear equations in n unknowns can be written in the general form

  f_i(x_1, x_2, \ldots, x_n) = 0   for   i = 1, 2, \ldots, n,

or, more compactly, as f(x) = 0.  As usual, subscripts denote vector components and boldface denotes whole vectors; as in the previous section, superscripts denote iteration number.  Thus, for example, the components of the initial guess vector x^{(0)} are x_i^{(0)} for i = 1, 2, \ldots, n.  Generalization of the Newton-Raphson method to such systems is achieved by using the multi-dimensional Taylor series expansion

  f_i(x_1, \ldots, x_n) = f_i(x_1^{(0)}, \ldots, x_n^{(0)}) + \sum_{j=1}^{n} (x_j - x_j^{(0)}) \left.\frac{\partial f_i}{\partial x_j}\right|_{x=x^{(0)}} + \cdots   for   i = 1, 2, \ldots, n.

Let x^{(1)} be the solution of the above equation with only the linear terms retained and with f_i(x_1, x_2, \ldots, x_n) set to zero, as desired.  Then

  \sum_{j=1}^{n} a_{ij}^{(0)} h_j^{(1)} = -f_i(x^{(0)}),   where   a_{ij}^{(0)} = \left.\frac{\partial f_i}{\partial x_j}\right|_{x=x^{(0)}}   and   h^{(1)} = x^{(1)} - x^{(0)}.          (3.4)

Equation (3.4) constitutes n linear equations for the n components of h^{(1)} = x^{(1)} - x^{(0)}, which may be considered as the desired update to x^{(0)}.  In matrix notation,

  A^{(0)} h^{(1)} = -f(x^{(0)}).

Note that the elements of the Jacobian matrix A^{(0)} are evaluated at x = x^{(0)}.  The solution at the first iteration is obtained from x^{(1)} = x^{(0)} + h^{(1)}.  Successive approximations are obtained from

  A^{(k)} h^{(k+1)} = -f(x^{(k)}),   x^{(k+1)} = x^{(k)} + h^{(k+1)},   where   a_{ij}^{(k)} = \left.\frac{\partial f_i}{\partial x_j}\right|_{x=x^{(k)}}.

Note that the elements of the Jacobian matrix A^{(k)} are evaluated at x = x^{(k)}.  As an example, consider the nonlinear system of equations

  f(x) = \begin{pmatrix} x_1^2 + 3\cos x_2 - 1 \\ x_2 + 2\sin x_1 - 2 \end{pmatrix} = 0.

The Jacobian matrix is given by

  A^{(k)} = \begin{pmatrix} \partial f_1/\partial x_1 & \partial f_1/\partial x_2 \\ \partial f_2/\partial x_1 & \partial f_2/\partial x_2 \end{pmatrix}_{x=x^{(k)}} = \begin{pmatrix} 2 x_1^{(k)} & -3\sin x_2^{(k)} \\ 2\cos x_1^{(k)} & 1 \end{pmatrix}.
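A single Newton-Raphson step for this example can be written out directly before turning to the general implementation below; the initial guess (1, 1) in this sketch is an arbitrary choice of mine, used only for illustration.

% One Newton-Raphson step for the example system (illustrative sketch).
x = [1; 1];                              % an arbitrary initial guess x^(0)
f = [x(1)^2 + 3*cos(x(2)) - 1;           % f(x^(0))
     x(2) + 2*sin(x(1)) - 2];
A = [2*x(1),      -3*sin(x(2));          % Jacobian A^(0)
     2*cos(x(1)),  1          ];
h = -A\f;                                % solve A^(0) h^(1) = -f(x^(0))
x = x + h                                % x^(1) = x^(0) + h^(1)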

4 Matlab implementation The following code is a modular Matlab implementation of the Newton-Raphson method. res=1 i=1 while (res>1e-10) f=compute_f(x) A=compute_A(x) % Compute function and Jacobian res=norm(f) % Compute residual x_save(i. Such handshaking between subprograms eases debugging considerably when your programs become large. and we solve A(1) h(2) = f (x(1) ) x(2) = x(1) + h(2): . % newt. By using modular programming style. We then update x according to x(1) = x(0) + h(1): The function f and Jacobian A are then evaluated at x = x(1) .m to compute a function and the corresponding % Jacobian. solve a nonlinear system using Newton-Raphson. and the residual f_save(i.28 Let the initial guess be: CHAPTER 3. f.:)=x' % Save x.m and compute_A. and all variables are passed in as arguments and pass back out via the function calls. A numerical solution to this example nonlinear system is found in the following sections.m % Given an initial guess for the solution x and the auxiliary functions % compute_f.1)=res x=x-(A\f) % Solve system for next x i=i+1 % Increment index end evals=i-1 % end newt. . we are most easily able to adapt segments of code written here for later use on di erent nonlinear systems. the following system of equations is solved for h(1) A(0) h(1) = f (x(0) ): .:)=f' res_save(i.m The auxiliary functions (listed below) are written as functions rather than as simple Matlab scripts. SOLVING NONLINEAR EQUATIONS 0 (0) 1 x x(0) = @ 1 A : (0) x2 The function f and Jacobian A are rst evaluated at x = x(0) . Note that f may % be a scalar function or a system of nonlinear functions of any dimension. Next. The process continues in an iterative fashion until convergence.1. 3.

m 29 % Evaluate the function f(x). the nal root to which the system converges is dependent on the choice of initial conditions. provide a sufficiently accurate initial guess for the root clear format long x=init_x % Find the root with Newton-Raphson and print the convergence newt.0.m function x] = init_x x= 0 1] % end function init_x. % First. in this example.m % Initial guess for x As before. It is seen that a \good" (lucky?) choice of initial conditions will converge very rapidly to the solution with this scheme.m % Tests Newton's algorithm for nonlinear root finding. x_save. and may not converge at all|this is a major drawback to the Newton-Raphson approach.m function A] = compute_A(x) A= 2*x(1) -3*sin(x(2)) 2*cos(x(1)) 1 ] % end function compute_A. and must be saved as compute f.0 and compute A. the solution is not unique. A poor choice will not converge smoothly. % test_newt.m before the test code in the following section will run.1.m. THE NEWTON-RAPHSON METHOD FOR NONLINEAR ROOT FINDING function f] = compute_f(x) f= x(1)*x(1)+3*cos(x(2))-1 x(2)+2*sin(x(1))-2 ] % end function compute_f.5 Dependence of Newton-Raphson on a good initial guess The following code tests our implementation of the Newton-Raphson algorithm.0. % Evaluate the Jacobian A % of the function in compute_f To prevent confusion with other cases presented in these notes. these case-speci c functions are stored at the class web site as compute f. res_save % end test_newt.m.m before the test code will run. It is apparent in the results shown that. 3.m and compute A.m. Typical results when applying the Newton-Raphson approach to solve a nonlinear system of equations are shown below.1.3. . the case-speci c function to compute the initial guess is stored at the class web site as and must be saved as init x. init x. As nonlinear systems do not necessarily have unique solutions.

0000 0.2 Bracketing approaches for scalar root nding It was shown in the previous section that the Newton-Raphson technique is very e cient for nding the solution of both scalar nonlinear equations of the form f (x) = 0 and multivariable nonlinear systems of equations of the form f (x) = 0 when good initial guesses are available.97 iteration 1 2 3 4 5 0. They also have the added bene t that they are based on function evaluations alone (i. The techniques we will present in this section.00043489424427 0.42 -2. you don't need to write compute A.. and b) an initial bracketing pair may be found. . When they are not. For example. i.00 7 553.87 443.2460 1.3689 0. we want to nd an x(lower) and an x(upper) such that f (x(lower) ) and f (x(upper) ) have opposite signs.e.60 6 -0. This may often be done by hand with a minor amount of trial and error. Unfortunately.00000000250477 0.58 34...74 3..18 4 2.e. 3. however. these techniques are based on a bracketing principle that does not extend readily to multi-dimensional functions.96 25 -1. it is desirable to seek the roots of such equations by more pedestrian means which guarantee success.10116915143065 0.75 -65.2.39 24 -1.01 0.2787 1.29 .82 4. for functions which have opposite sign for su ciently large and small arguments.97 26 -1. .3690 x1 A good initial guess 1.08 10 72. it is convenient to have an automatic procedure to nd such a bracketing pair.30 CHAPTER 3. Unfortunately. which makes them simple to implement.74 3. .79 3.2788 1.55 -1105. good initial guesses are not always readily available.57 -0.01 8 279. . are guaranteed to converge to a solution of a scalar nonlinear equation so long as: a) the function is continuous.11 -0.3770 0. a simple approach is to start with an initial guess for the bracket and then geometrically increase the distance between these points until a bracketing pair is found. though they can not achieve quadratic convergence.00 2 1.2787 x2 residual 1.00 -1.77 11 36. ..82 5 1. Our rst task is to nd a pair of values for x which bracket the minimum.m). SOLVING NONLINEAR EQUATIONS A poor initial guess iteration x1 x2 1 0. This may be implemented with the following codes (or variants thereof): .3690 0.16 -261. At times.0000 1.62 -1.00000000000000 3. 23 -1.25 3 -0.17708342963828 0.99 9 143.1 Bracketing a root The class of scalar nonlinear functions we will consider are continuous.

m.x_upper] = find_brack x_lower. % end function init_brack.'ko') if f_lower*f < 0 x_upper = x f_upper = f else x_lower = x f_lower = f end interval=interval/2 i=i+1 pause end x = (x_upper + x_lower)/2 % end bisection.2 Re ning the bracket . The bisection algorithm may be implemented with the following code: % bisection.m f_lower=compute_f(x_lower) f_upper=compute_f(x_upper) interval=x_upper-x_lower i=1 while interval > x_tol x = (x_upper + x_lower)/2 f = compute_f(x) evals=evals+1 res_save(i. To prevent confusion with other cases presented in these notes.1 and compute f.2.m.2.m 31 The auxiliary functions used by this code may be written as function x_lower.bisection Once a bracketing pair is found. the bounds on the solution are reduced by exactly a factor of 2. 3. The most straightforward approach is to repeatedly chop the interval in half.m % Evaluate the function f(x).20*x + 50 % end function compute_f.1.m evals=2 .5*interval x_lower =x_lower-0. these case-speci c functions are stored at the class web site as init brack.1)=norm(f) plot(x. the task that remains is simply to re ne this bracket until a desired degree of precision is obtained. The convergence of such an algorithm is linear at each iteration.m function f] = compute_f(x) f=x^3 + x^2 .3. BRACKETING APPROACHES FOR SCALAR ROOT FINDING function x_lower.5*interval end % end function find_brack.x_upper] = init_brack while compute_f(x_lower) * compute_f(x_upper) > 0 interval=x_upper-x_lower x_upper =x_upper+0.f.x_upper] = init_brack x_lower=-1 x_upper=1 % Initialize a guess for the bracket. keeping those two values of x that bracket the root.

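Because each pass of the bisection routine above halves the bracket, the number of iterations required to reach a given tolerance can be predicted in advance; for instance (with bracket and tolerance values chosen here purely for illustration):

% Number of bisection iterations needed to shrink a bracket of width
% (x_upper - x_lower) below x_tol (illustrative values of my own).
x_lower = -1;  x_upper = 4;  x_tol = 0.0001;
n_iter = ceil( log2( (x_upper - x_lower) / x_tol ) )   % about 16 iterations here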
2) such that (lower) x(new) = x(lower) f (xf= x ) . so it is useful to put in an ad hoc check to keep the progress moving.false position A technique that is usually slightly faster than the bisection technique. tol1 = interval/10 if ((x-x_lower) < tol1 | (x_upper-x) < tol1) x = (x_lower+x_upper)/2 end f = compute_f(x) evals = evals+1 res_save(i.f_lower / fprime evals=2 % Ad hoc check: stick with bisection technique if updates % provided by false position technique are too small.f.x. (3. f_lower f_upper].5) where the quantity f= x is a simple di erence approximation to the slope of the function f f = f (x(upper) ) f (x(lower) ) : x x(upper) x(lower) .32 CHAPTER 3. .'k--'. this approach sometimes stalls.'ko') if f_lower*f < 0 x_upper = x f_upper = f else x_lower = x f_lower = f end interval=x_upper-x_lower i=i+1 pause end x = (x_upper + x_lower)/2 % end false_pos.1)=norm(f) plot( x_lower x_upper].m f_lower=compute_f(x_lower) f_upper=compute_f(x_upper) interval=x_upper-x_lower i=1 while interval > x_tol fprime = (f_upper-f_lower)/interval x = x_lower . but that retains the safety of maintaining and re ning a bracketing pair. As shown in class. % false_pos. is to compute each new point by a numerical approximation to the Newton-Raphson formula (3. SOLVING NONLINEAR EQUATIONS 3.2.m .3 Re ning the bracket .

% First. an extra bit of plotting stuff to show what happens: x_lower_t=min(x_save) x_upper_t=max(x_save) xx= x_lower_t : (x_upper_t-x_lower_t)/100 : x_upper_t] for i=1:101.m.x_upper] = find_brack % Prepare to make some plots of the function on this interval xx= x_lower : (x_upper-x_lower)/100 : x_upper] for i=1:101.yy.x_save(1).\n') figure(1) clf plot(xx.y1.\n') figure(2) clf plot(xx.4 Testing the bracketing algorithms The following code tests the bracketing algorithms for scalar nonlinear root nding and compares with the Newton-Raphson algorithm. Convergence of the false position algorithm is usually found to be faster than bisection.1 and compute A.'b-') hold on title('Convergence of the bisection routine') bisection evals. Don't forget to update init x. yy(i)=compute_f(xx(i)) end x1= x_lower x_upper] y1= 0 0] x_tol = 0.'k-'.y1.m and compute A.m % Tests the bisection.'b-') hold on grid title('Convergence of false position routine') x_lower=x_lower_save x_upper=x_upper_save false_pos evals.x1.3. and both are safer (and slower) than Newton-Raphson.m appropriately (i.f_save(1).yy.2.'k-') hold on grid title('Convergence of the Newton-Raphson routine') x=init_x newt % Finally.\n') figure(3) clf plot(xx.1) in order to get Newton-Raphson to work.'k-'. compute a bracket of the root.m in order to obtain convergence of the Newton-Raphson algorithm. BRACKETING APPROACHES FOR SCALAR ROOT FINDING 33 3.e. yy(i)=compute_f(xx(i)) end plot(xx.'k-'.yy. false position. clear x_lower. x_upper_save=x_upper fprintf('\n Now testing the bisection algorithm.'ko') pause for i=1:size(x_save)-1 .yy.0001 x_lower_save=x_lower % Set tolerance desired for x. hold off fprintf(' Now testing the Newton-Raphson algorithm. with init x.x1.m. and Newton routines for % finding the root of a scalar nonlinear function. hold off grid fprintf(' Now testing the false position algorithm. You might have to ddle with init x.. % test_brack.2.

m .'k--'. .f_save(i+1).'ko') pause end evals. f_save(i) 0].. hold off % end test_brack. x_save(i+1)..34 plot( x_save(i) x_save(i+1)].

it is best to use a least-squares technique to t a low-order curve in the general vicinity of several data points. yi = P (xi ) = ao + a1 xi + a2 x2 + : : : + an xn i i 35 for i = 0 1 2 : : : n: . Such an approach generally produces a much smoother curve (and a more physically-meaningful result) when the measured data is noisy. There are two ways of accomplishing this: solving a system of n +1 simultaneous equations for the n + 1 coe cients of this polynomial. i.e. The process of Lagrange interpolation involves simply tting an n'th degree polynomial (i. this polynomial must take the value yi . Note that the process of interpolation passes a curve exactly through each data point. We will present both of these techniques.. and constructing the polynomial directly in factored form.Chapter 4 Interpolation One often encounters the problem of constructing a smooth curve which passes through discrete data points. if the data is from an experiment and has any appreciable amount of uncertainty.1.. However. This situation arises when developing di erentiation and integration strategies.e. as well as when simply estimating the value of a function between known values.1 Solving n + 1 equations for the n + 1 coe cients Consider the polynomial P (x) = ao + a1 x + a2 x2 + : : : + an xn : At each point xi . a polynomial with n + 1 degrees of freedom) exactly through this data. 4. we will present two techniques for this procedure: polynomial (Lagrange) interpolation and cubic spline interpolation. This technique minimizes a weighted sum of the square distance from each data points to this curve without forcing the curve to pass through each data point individually. f g 4.1 Lagrange interpolation Suppose we have a set of n + 1 data points xi yi . This is sometimes exactly what is desired. as we will discuss in the following chapters. In this handout.
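These n + 1 equations may be assembled and solved directly.  Before writing the system in matrix form below, here is a small self-contained sketch of that approach; the sample data and the name vander_interp are mine, for illustration only, and, as discussed below, this route is poorly conditioned for more than a few data points.

% vander_interp.m  (illustrative sketch of the approach of this subsection)
x_data = [-2 -1 0 1 2]';  y_data = [4 1 0 1 4]';   % sample data (my choice)
n = length(x_data) - 1;
V = zeros(n+1);
for j = 0:n
  V(:, j+1) = x_data.^j;          % columns are 1, x_i, x_i^2, ..., x_i^n
end
a = V \ y_data;                   % coefficients a_0 ... a_n of P(x)
x = 0.5;  P = 0;
for j = 0:n
  P = P + a(j+1) * x^j;           % evaluate P(x) = a_0 + a_1 x + ... + a_n x^n
end
P                                 % should come out near 0.25 for this parabolic data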

36 In matrix form.1 . by construction. n Y i=0 i6= (x x i ) . C B . i=0 i6= which results in the very useful relationship L (xi ) = i = 1 i = 0 i= : 6 ( Scaling this result.1. a linear combination of n + 1 of these polynomials P (x) = n X k=0 yk Lk (x) provides an n'th degree polynomial which exactly passes through all of the data points.. C = B . xn a0 0 xn C B a1 C 1C B C xn n a 1 0 1 0y 1 B y0 C 1 . Unfortunately. Lagrange interpolation should be thought of as dangerous for anything more than a few data points and avoided in favor of other techniques.1. and thus this technique of nding an interpolating polynomial is unreliable at best. Vandermonde's matrix is usually quite poorly conditioned. A code which implements this constructive technique to determine the interpolating polynomial is given in the following subsection.C @ A . we may write this system as CHAPTER 4. Consider the n'th degree polynomial given by the factored expression (x x .1 n Y = 6 (x xi )7 4 5 .1 @ 2 n | 1 xn x{z V .2 B .2 Constructing the polynomial directly L (x) = (x x0 )(x x1 ) . . such as spline interpolation. as shown in Figure 4.1 )(x x .. +1 ) (x xn ) = . Unfortunately. the polynomial y L (x) (no summation implied) passes through zero at every data point x = xi except at x = x .. Thus. Choosing to normalize this polynomial at x = x . 4. where is a constant yet to be determined. C : A@.. In other words. . } | an } | yn } {z {z y This system is of the form V a = y. . Finally.. L (xi ) = 0 for i = .. . . where V is commonly referred to as Vandermonde's matrix. ... This expression (by construction) is equal to zero if x is equal to any of the data points except x . and may be solved for the vector a containing the coe cients ai of the desired polynomial. high-order polynomials tend to wander wildly between the data points even if the data appears to be fairly regular. where it has the value y . we de ne 6 2 3. INTERPOLATION 01 x x2 B1 x0 x0 B .A B.

2 −10 −5 0 5 10 Figure 4.2 Cubic spline interpolation Instead of forcing a high-order polynomial through the entire dataset (which often has the spurious e ect shown in Figure 4.3 Matlab implementation function p] = lagrange(x.1. n=size(x_data.6 0.m % Add L_k's contribution to p(x) 4.4 0. smooth.1).1: The interpolation problem: a continuous curve representing the function of interest several points on this curve the Lagrange interpolation of these points.2 0 −0.2.1) p=0 for k=1:n % For each data point {x_data(k).y_data(k)} L=1 for i=1:k-1 L=L*(x-x_data(i))/(x_data(k)-x_data(i)) % Compute L_k end for i=k+1:n L=L*(x-x_data(i))/(x_data(k)-x_data(i)) end p = p + y_data(k) * L end % end lagrange. 4.8 0.4.x_data.y_data) % Computes the Lagrange polynomial p(x) that passes through the given % data {x_data.y_data}. piecewise cubic function . CUBIC SPLINE INTERPOLATION 37 1 0. we may instead construct a continuous.

i. fi0. it follows that 00 2 00 2 fi0 (x) = f (xi ) (xi+1 x) + f (xi+1 ) (x xi ) + C1 : 2 2 i i 00 (xi ) (xi+1 x)3 f 00 (xi+1 ) (x xi )3 fi (x) = f 6 + 6 + C1 x + C2 : f g f g .1) with G being a linear combination of delta functions. ..1 (xi ) = fi0 (xi ) = f 0 (xi ) and (c) continuity of the second derivative f 00 .e. such a spline takes an approximately piecewise cubic shape between the data points. . By construction. The elasticity equation governing the deformation f of the spline is f 0000 = G (4. fi00 1 (xi ) = fi00 (xi ) = f 00 (xi ): . . f 00 (xi ) and f 00 (xi+1 ). uniquely determine a piecewise cubic function through the data which is usually reasonably smooth we will call this function the cubic spline interpolant. we have: f 0000 = 0 f 000 = C1 f 00 = C1 x + C2 f 0 = C1 x2 + C2 x + C3 2 and f = C1 x3 + C2 x2 + C3 x + C4 : 6 2 4. INTERPOLATION through the data. we may write a linear equation for fi00 (x) as a function of its value at the endpoints. by (4. and which satis es condition (b) by setting up and solving the appropriate system of equations for the value of f 00 at each data point xi . i. with a delta function) which is su cient to pass the spline through the data. At each data point. .2. fi. which are (as yet) undetermined. x fi00 (x) = f 00 (xi ) x xi+1 + f 00 (xi+1 ) x x xix : i xi+1 i+1 i . we have: (a) continuity of the function f . We now describe a procedure to determine an f which satis es conditions (a) and (c) by construction.. Integrating this equation twice and de ning i = xi+1 xi . Note that this rst degree polynomial is in fact just a Lagrange interpolation of the two data points xi f 00 (xi ) and xi+1 f 00 (xi+1 ) . De ning the interpolant in this manner is akin to deforming a thin piece of wood or metal to pass over all of the data points plotted on a large block of wood and marked with thin nails. the rst de nition of the word \spline" in Webster's dictionary is \a thin wood or metal strip used in building construction"|this is where the method takes its name. As G is nonzero only in the immediate vicinity of each nail.1. ..e..2. In fact. fi00 varies linearly with x between each data point.38 CHAPTER 4. condition (c) is satis ed. These conditions.e.1 Constructing the cubic spline interpolant Let fi (x) denote the cubic in the interval xi x xi+1 and let f (x) denote the collection of all the cubics for the entire range x0 x xn . together with the appropriate conditions at each end. .e. To begin the constructive procedure for determining f .1) where G is a force localized near each nail (i. The following form (which is linear in x) ts the bill by construction: x .1 (xi ) = fi (xi ) = f (xi ) = yi (b) continuity of the rst derivative f 0 . i. in a manner analogous to the construction of the Lagrange interpolant in 4. Thus. We will construct this function to be smooth in the sense of having continuous rst and second derivatives at each data point. note that on each interval xi x xi+1 for i = 0 1 : : : n 1. between the data points. . . As noted above. i i .

1 (xi ) 3 6 i i. Parabolic run-out: f 00 (x0 ) = f 00 (x1 ) f 00 (xn ) = f 00 (xn.2. leads to i. This is a diagonally-dominant tridiagonal system of n 1 equations for the n + 1 unknowns f 00 (x0 ). yi+1 i + i . . (4. .2). which thereby ensures that condition (b) is satis ed.4) Free run-out (also known as natural splines): (4. : : : .1 for i = 1 2 : : : n 1. .1 (4. condition (a) is satis ed.5). noting that i = xi+1 xi .4). .6) (making the appropriate choice for the end conditions depending upon the problem at hand) to give us n +1 equations for the n + 1 unknowns f 00 (xi ). f 00 (xi ). . . .2) By construction. or (4. the cubic spline interpolant follows immediately from equation (4. 2 00 2 3 (xi+1 x) + i + f (xi+1 ) 3 (x xi ) 6 i i . which gives The second derivative of f at each node. f 00 (xn ). an expression for fi0 (x) may now be found by di erentiating this expression for fi (x). 6 . yi : i i = 1 2 : : : n 1: Substituting appropriately from the above expression for fi0 (x). .1 i i+1 for . . Once this system is solved for the f 00 (xi ). .6) Equation (4.5) Periodic end-conditions: f 00 (x0 ) = f 00 (xn.4.3) i. i (x . We will consider three types of end conditions: . which is achieved by setting 00 fi0 (x) = f (xi ) 6 . . We nd the two remaining equations by prescribing conditions on the interpolating function at each end. . This set of equations is then solved for the f 00 (xi ). f 00 (x1 ). is still undetermined.1 f 00 (x ) + i. CUBIC SPLINE INTERPOLATION The undetermined constants of integration are obtained by matching the end conditions 39 fi (xi ) = yi and fi (xi+1 ) = yi+1 : A convenient way of constructing the linear and constant terms in the expression for fi (x) in such a way that the desired end conditions are met is by writing fi (x) in the form 00 3 f 00 (xi+1 ) (x xi )3 fi (x) = f (xi ) (xi+1 x) i (xi+1 x) + 6 6 i i (xi+1 x) + y (x xi ) where xi x xi+1 : + yi i+1 i i .1 ) f 00 (x1 ) = f 00 (xn ) (4. A system of equations from which the f 00 (xi ) may be found is obtained by imposing condition (b).1 + i f 00 (x ) + i f 00 (x ) = yi+1 yi yi yi. Finally.1 ) f 00 (x0 ) = 0 f 00 (xn ) = 0 (4. fi0(xi ) = fi0. xi ) (4. .3) may be taken together with equation (4.

. The parabolic run-out simply places a parabola between x0 and x1 .4 0. A code which solves these systems to determine the cubic spline interpolation with any of the above three end-conditions is given in the following subsection. When equation (4.2 −10 −5 0 5 10 Figure 4. Note that.8 0.1: Data points from which we want to reproduce (via interpolation) a continuous function. . 10 0:05 Table 4. All three end conditions on the cubic spline do reasonably well. . Both end conditions provide reasonably smooth interpolations. Note that applying periodic end conditions on the spline in this case (which is not periodic) leads to a non-physical \wiggle" in the interpolated curve near the left end in general. 4 0:19 . a tridiagonal system results which can be solved e ciently with the Thomas algorithm.2 0 −0. 6 0:05 . the periodic end conditions should be reserved for those cases for which the function being interpolated is indeed periodic. . . INTERPOLATION xi yi 12 0:05 10 8 0:05 0:12 . . . 6 8 0:05 0:12 . . whereas the free run-out tapers the curvature down to zero near x0 . 1 0. The results of applying the cubic spline interpolation formula to the set of data given in Table 4. but the Lagrange interpolation is again spurious. . when equation (4.3) is taken together periodic end conditions. a tridiagonal circulant system (which is not diagonally dominant) results.40 CHAPTER 4.3) is taken together with parabolic or free run-out at the ends.2.2: Various interpolations of the data of Table 1: data cubic spline (parabolic run-out) cubic spline (free run-out) Lagrange interpolation cubic spline (periodic).6 0. These data points all lie near the familiar curve generated by the function sin(x)=x.1 is shown in Figure 4. 4 2 0 2 0:19 0:46 1 0:46 .

solve system elseif end_conditions==3 % Periodic end conditions a(1)=-1 b(1)=0 c(1)=1 g(1)=0 a(n)=-1 b(n)=0 c(n)=1 g(n)=0 % Note: the following is an inefficient way to solve this circulant % (but NOT diagonally dominant!) system via full-blown Gaussian % elimination.2.2.n)=a(1) A(n.m ----------------------------------------------------------------- .4. % Compute the size of the problem. for i=2:n-1 a(i)=delta(i-1)/6 b(i)=(delta(i-1)+delta(i))/3 c(i)=delta(i)/6 g(i)=(y_data(i+1)-y_data(i))/delta(i) -. which was developed earlier in the course.1)=c(1) g=A\g' % <--.0) + diag(c(1:n-1).1) % Set up the delta_i = x_(i+1)-x_i for i=1:n-1.m Determines the quantity g=f'' for constructing a cubic spline interpolation of a function.2 Matlab implementation % % % % % spline_setup.solve system elseif end_conditions==2 % Free run-out ("natural" spline) b(1)=1 c(1)=0 g(1)=0 b(n)=1 a(n)=0 g(n)=0 thomas % <--. CUBIC SPLINE INTERPOLATION 41 4. delta(1:n-1)=x_data(2:n)-x_data(1:n-1) % Set up and solve a tridiagonal system for the g at each data point.. and assumes the x_data is already sorted in ascending order.-1) + diag(b. (y_data(i)-y_data(i-1))/delta(i-1) end if end_conditions==1 % Parabolic run-out b(1)=1 c(1)=-1 g(1)=0 b(n)=1 a(n)=-1 g(n)=0 thomas % <--. Note that this algorithm calls thomas. n=size(x_data. A = diag(a(2:n).solve system end % end spline_setup.1) A(1..m.

x_data.x_data(i)))+.1]) hold on pause x_lower=min(x_data) x_upper=max(x_data) xx= x_lower : (x_upper-x_lower)/200 : x_upper] for i=1:201 ylagrange(i)=lagrange(xx(i).y_data.'b:') pause for end_conditions=1:3 % Set the end conditions you want to try here.ylagrange.m ----------------------------------------------------------------function x_data. INTERPOLATION function f] = spline_interp(x.y_spline) pause end % end test_interp.g.g. f=g(i) /6 *((x_data(i+1)-x)^3/delta(i)-delta(i)*(x_data(i+1)-x))+. (y_data(i)*(x_data(i+1)-x) + y_data(i+1)*(x-x_data(i))) / delta(i) % end spline_interp..m ./(x_data+eps) % end input_data. spline_setup for i=1:201 y_spline(i)=spline_interp(xx(i).y_data.y_data) end plot(xx.delta) n=size(x_data.1) % Find that value of i such that x_data(i) <= x <= x_data(i+1) for i=1:n-1 if x_data(i+1) > x.42 CHAPTER 4. g(i+1)/6 *((x .x_data(i))^3/delta(i)-delta(i)*(x .y_data] = input_data x_data= -12:2:10]' y_data=sin(x_data+eps).'ko') axis( -13 11 -0..y_data.x_data. y_data]=input_data plot(x_data.m ----------------------------------------------------------------% test_interp...delta) end plot(xx.m clear clf x_data. end end % compute the cubic spline approximation of the function.3 1.x_data. break.
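As a quick sanity check, the interpolant produced by these routines can be compared against Matlab's built-in spline command, which implements a cubic spline with a different treatment of the ends (the so-called not-a-knot condition).  The comparison sketch below is my own and is not part of the course codes; it reuses input_data.m, spline_setup.m, and spline_interp.m from above.

% compare_spline.m  (illustrative sketch)
[x_data, y_data] = input_data;
end_conditions = 2;                       % free run-out ("natural" spline)
spline_setup                              % computes g = f'' and delta
xx = x_data(1) : 0.25 : x_data(end);
for i = 1:length(xx)
  y_ours(i) = spline_interp(xx(i), x_data, y_data, g, delta);
end
y_builtin = spline(x_data, y_data, xx);   % built-in, not-a-knot ends
plot(xx, y_ours, 'k-', xx, y_builtin, 'b--', x_data, y_data, 'ko')
% The two curves should differ appreciably only near the ends.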

Proceeding with a constructive process to satisfy condition (a) as discussed in 4. we assemble the exponential terms in the solution of this ODE for f in a constructive fi00 (x) . we assemble the linear and constant terms of f 00 2 f such that manner such that condition (c) is satis ed. .4. 2y i x xi+1 00 xi xi+1 + fi (xi+1 ) . .1 f 00 (xi.2 f 00 (xi ) 2 yi xi+1 x + f 00 (xi+1 ) 2 yi+1 x xi f 00 (xi ) sinhsinh i+1 x) i i i sinh (x xi ) o where x x x : (4. x + C4 e x : x . i . cubic splines aren't even smooth enough. sinh + 1 i. i Di erentiating once and appling condition (b) leads to the tridiagonal system: i.4 B-splines It is interesting to note that we may write the cubic spline interpolant in a similar form to how we constructed the Lagrange interpolant. 2 f (x) i = fi00 (xi ) .1 ) 2 . 2 f ]0 = C 1 f 00 .1 : (4.8) 2 1 . 2y i+1 . even though the coe cients have a slightly more complicated form.2. .3 Tension splines f 0000 f 00 2 f ]00 . This leads to the following relationships between the data points: . In the limit of large tension.7). . .2.2 (C1 x + C2 ) + C3 e. . 2f ] = C 1 x + C2 : Solving the ODE on the right leads to an equation of the form . CUBIC SPLINE INTERPOLATION 43 For some special cases. that is.1 The tridiagonal system (4. .1. . . the desired solution may be written n (x fi (x) = . Rewriting the exponentials as sinh functions. 2 f 00 =G where is the tension of the spline. 4. Tensioned splines obey the di erential equation: 4.1 sinh i i.2. .2. . . In such cases.7) f 00 (xi+1 ) sinh i i+1 . =0 f 00 f= . i. . i .1 i f 00 (xi+1 ) = yi+1 yi yi yi. The tensioned-spline interpolant is then given by (4. it is helpful to put \tension" on the spline to straighten out the curvature between the \nails" (datapoints).8) can be set up and solved exactly as was done with (4.1 + 1 cosh i f 00 (xi ) 2 sinh i. . x xi xi+1 xi : Similarly. .3). sinh i cosh i.1 1 . f (x) = n X =0 y b (x) . the interpolant becomes almost piecewise linear.

we may con ne each of the basis functions to have compact support (in other words. we can set b (x) = 0 exactly for x x > R for some R). x ! j . and are found to have localized support (in other words. j j . By relaxing some of the continuity constraints.44 where the b (x) correspond to the various cubic spline interpolations of the Kronecker deltas such that b (xi ) = i . it is easier both to compute the interpolations themselves and to project the interpolated function onto a di erent mesh of gridpoints. The industry of computer animation makes heavy use of e cient algorithms for this type of interpolation extended to three-dimensional problems.2 for the functions L (x). as discussed in 4. With such functions. b (x) 0 for large x x ).1. The b (x) in this representation are referred to as the basis functions. j .
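The basis functions just described can be visualized with the machinery of section 4.2.2.  The sketch below is my own (the node index nu is an arbitrary choice); it interpolates a Kronecker delta through the data sites and plots the resulting cubic-spline basis function.

% basis_demo.m  (illustrative sketch; uses spline_setup.m and spline_interp.m)
x_data = (-12:2:10)';                    % same mesh as input_data.m
n  = size(x_data, 1);
nu = 6;                                  % which basis function to construct (my choice)
y_data = zeros(n, 1);  y_data(nu) = 1;   % Kronecker delta centered on node nu
end_conditions = 2;                      % free run-out ("natural" spline)
spline_setup                             % computes g = f'' at each node
xx = x_data(1) : 0.1 : x_data(n);
for i = 1:length(xx)
  bb(i) = spline_interp(xx(i), x_data, y_data, g, delta);
end
plot(xx, bb, 'k-', x_data, y_data, 'ko')
% Note that bb decays rapidly, but is not exactly zero, away from x_nu.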

or r J = Ax b = 0: .0 Motivation It is often the case that one needs to minimize a scalar function of one or several variables.Chapter 5 Minimization 5. If A is symmetric positive de nite (i. J ! 1 j j ! 1 J J . it is sometimes useful to construct a function (x) such that. 5. Three cases of particular interest which motivate the present chapter are described here. Consider the n n problem Ax = b. if akj = ajk and all eigenvalues of A are positive) then we can accomplish this by de ning (x) = 1 xT Ax bT x = 1 xi aij xj bi xi : 2 2 Requiring that A be symmetric positive de nite insures that for x in all directions and thus that a minimum point indeed exists.1 Solution of large linear systems of equations Thus. . once (approximately) minimized via an iterative technique..e.2 Solution of nonlinear systems of equations x Recall from 3. where n is so large that an O(n3 ) algorithm such as Gaussian elimination is out of the question. J 5.) Di erentiating with respect to an arbitrary component of x. the desired condition Ax = b is (approximately) satis ed. (We will extend this approach to more general problems at the end of this chapter. and the problem can not be put into such a form that A has a banded structure. J @ @xk = akj xj bk = 0 J .1 that the Newton-Raphson method was an e ective technique to nd the root (when one exists) of a nonlinear system of equations f (x) = 0 when a su ciently-accurate initial guess is 45 .0.0. . we nd that 1 @ @xk = 2 ( ik aij xj + xi aij jk ) bi ik = akj xj bk : The unique minimum of (x) is characterized by J J . In such cases. solution of large linear systems of the form Ax = b (where A is symmetric positive de nite) may be found by minimization of (x).

MINIMIZATION available. given by function f (x) = r . computation and storage of the Hessian matrix. which has N 2 elements. variants of this method are really about the best one can do when one doesn't have a good initial guess for the solution.4) that the Newton-Raphson method requires both evaluation of the function f (x). minimization of (x) can be accomplished simply by applying the Newton-Raphson method developed previously to the gradient of (x).3 Optimization and control of dynamic systems In control and optimization problems. 5. there are quite likely many minimum points of (x) (at which r = 0). Root nding in systems of nonlinear equations is very di cult|though this method has signi cant drawbacks. However. and converges quite rapidly. only some of which (if any!) will correspond to f (x) = 0. Recall from equation (3. minimization of this cost function with respect to u results in the determination of an e ective set of control parameters. seeking a minimum of this (x) with respect to x might result in an x such that f (x) = 0. both of these terms are small. an alternative technique is to examine the square of the norm of the vector f : J (x) = f (x)]T f (x) J Note that this quantity is never negative.46 CHAPTER 5. in this case taken to be the gradient of (x). in turn. . The second term in the above expression measures the amount of control used and the rst term is an observation of the property of interest of the dynamic system which.1 The Newton-Raphson method for nonquadratic minimization If a good initial guess is available (near the desired minimum point).0. When the control is both e cient (small in magnitude) and has the desired e ect on the dynamic system. Unfortunately. given in this case by J J J J @f a(k) = @xi ij j x=x( @ = @x @x k ) 2J i j x=x(k) : J This matrix of second derivatives is referred to as the Hessian of the function . so any point x for which f (x) = 0 minimizes (x). and the Jacobian of f (x). Such cost functions may often be put in the form J (u) = xT Qx + uT Ru where x = x(u) is the state of the system and u is the control. Thus. for very large N . one can often represent the desired objective mathematically as a \cost function" to be minimized. J J J 5. When such a guess is not available. in order to nd the solution to the equation f (x) = 0. This works just as well for scalar or vector x. its minimization with respect to the control parameters results in an e ectively-controlled dynamic system. When the cost function is formulated properly. is a function of the applied control. Thus. can be prohibitively expensive.
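To make the Newton-Raphson approach to minimization concrete, a single step may be sketched as follows; the particular function, gradient, and Hessian used here are placeholders of my own choosing, not from the text.

% newton_min_step.m  (illustrative sketch)
% One Newton-Raphson step toward a critical point of J(x): solve
% H h = -grad(J) and update x, i.e. apply the root-finding method of
% section 3.1 to the gradient of J.
x = [1; 1];                                    % current guess
g = [4*x(1)^3 - 2*x(2);  2*x(2) - 2*x(1)];     % gradient of J = x1^4 - 2 x1 x2 + x2^2
H = [12*x(1)^2, -2;  -2, 2];                   % Hessian of the same J
x = x - H\g                                    % Newton update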

'ko') % end compute_J. . % plot a circle at the data point. we begin with a \bracketing" approach analogous to that which we used for nding the root of a nonlinear scalar equation. bracketing a minimum means nding a triplet x1 x2 x3 with x2 between x1 and x3 and for which (x2 ) < (x1 ) and (x2 ) < (x3 ) (so a minimum must exist between x1 and x3 if the function is continuous and bounded).m Case-speci c auxiliary functions are given below. function J] = compute_J(x) J=3*cos(x)-sin(2*x) + 0.1*x. x1. BRACKETING APPROACHES FOR MINIMIZATION OF SCALAR FUNCTIONS 47 5.2 Bracketing approaches for minimization of scalar functions We now seek a reliable but pedestrian approach to a minimize scalar function (x) when a good initial guess for the minimum is not available. a simple approach is to start with an initial guess for the bracket and then geometrically scale out each end until a bracketing triplet is found.x2. for functions which are large and positive for su ciently large x .2.2 that bracketing a root means nding a pair x(lower) x(upper) for which f (x(lower) ) and f (x(upper) ) have opposite signs (so a root must exist between x(lower) and x(upper) if the function is continuous and bounded). J x f g f g J J J J j j 5.x3] = init_triplet J1=compute_J(x1) J2=compute_J(x2) J3=compute_J(x3) while (J2>J1) % Compute a new point x4 to the left of the triplet x4=x1-2. Such an initial bracketing triplet may often be found by a minor amount of trial and error.J.0*(x3-x2) J4=compute_J(x4) % Center new triplet on x3 x1=x2 J1=J2 x2=x3 J2=J3 x3=x4 J3=J4 end % end find_triplet. it is convenient to have an automatic procedure to nd such a bracketing triplet. however. Analogously.2. % Should work if J -> inf as |x| -> inf.0*(x2-x1) J4=compute_J(x4) % Center new triplet on x1 x3=x2 J3=J2 x2=x1 J2=J1 x1=x4 J1=J4 end while (J2>J3) % Compute new point x4 to the right of the triplet x4=x3+2.m % compute function you are trying to minimize. For example. At times. Recall from 3.1 Bracketing a minimum Matlab implementation % find_triplet.m % Initialize and expand a triplet until the minimum is bracketed.5. To do this.^2 plot(x.

1.x3] = init_triplet x1=-5 x2=0 x3=5 % Initialize guess for the bracketing triplet % end init_triplet. then a self-similar situation develops in which the ratio W is constant from one iteration to the next. let W be de ned as the ratio of the smaller interval to the width of the bracketing triplet x1 x2 x3 such that f g x x x4 = x2 + Z (x3 x1 ): We now pick a new trial point x4 and de ne Z = x4 x2 3 1 There are two possibilities: if (x4 ) > (x2 ). Note that.48 CHAPTER 5. analogous to the bisection technique developed for scalar root nding. W (k) = W (k+1) . where x(k) is between x(k) and x(k) and x(k) is assumed 2 1 3 2 to be closer to x(k) than it is to x(k) . in terms of the quantities at iteration k.1: Naming of points used in golden section search procedure: x(k) x(k) x(k) is referred 1 2 3 to as the bracketing triplet at iteration k. then x2 x4 x3 is a new bracketing triplet. function x1. . ) . we have either . x x W = x2 x1 3 1 .m 5. MINIMIZATION W (k) 1−W(k) 1−W(k+1) W (k+1) J(x) Z(k) J(x) Z(k+1) ) x4 x1(k) x2(k) (k) x3(k) k'th iteration (k+1) x2(k+1) x3(k+1) x4 (k + 1)'th iteration f g x1(k+1) Figure 5. i. we should take the width of both of these triplets as identical.1) If the same algorithm is used for the re nement at each iteration k. The reduction of the interval continues at the following iterations in a self-similar fashion. . A new guess is made at point x(k) and the bracket is re ned 1 3 4 by retaining those three of the four points which maintain the tightest bracket.x2. ) 1 W = x3 x2 : x3 x1 .the golden section search Once the minimum of the non-quadratic function is bracketed.2. . ) . so that W +Z =1 W Z = 1 2W (5. Minimizing the width of this new (re ned) bracketing triplet in the worst case. all that remains to be done is to re ne these brackets. As illustrated in Figure 5. . and if (x4 ) < (x2 ). ..1).e. is called the golden section search. A simple algorithm to accomplish this. then x4 x2 x1 is a new bracketing triplet (as in Figure 5.2 Re ning the bracket . J J f g J J f g .

Note that convergence is attained linearly: each bound on the minimum is 0. This is slightly slower than the convergence of the bisection algorithm for nonlinear root.m % Input assumes {x1. f g (5. repeated application of this algorithm quickly brings the triplet into this ratio as it is re ned. Z=sqrt(5)-2 % initialize the golden section ratio if (abs(x2-x1) > abs(x2-x3)) % insure proper ordering swap=x1 x1=x3 x3=swap swap=J1 J1=J3 J3=swap end pause while (abs(x3-x1) > x_tol) x4 = x2 + Z*(x3-x1) % compute and plot new point J4 = compute_J(x4) evals = evals+1 % Note that a couple of lines are commented out below because some % of the data is already in the right place! if (J4>J2) x3=x1 J3=J1 % Center new triplet on x2 % x2=x2 J2=J2 x1=x4 J1=J4 else .5.2. 2 4 3 Dropping the superscripts on W and Z . On output. or 4 2 1 W k+1 = Z (k) =(1 W (k) ) if x(k) x(k) x(k) is the new bracketing triplet. 4 2 1 2 3 2 4 3 The process continues on the new (re ned) bracketing triplet in an iterative fashion until the desired tolerance is reached such that x3 x1 < . 1 2 3 computes a new data point at x(0) = x(0) + Z (x(0) x(0) ) where Z = 0:236068.618034 times the previous bound. the golden section algorithm takes an initial bracketing triplet x(0) x(0) x(0) . which we assume to be independent of k. and then: 4 2 3 1 if (x(0) ) > (x(0) ). BRACKETING APPROACHES FOR MINIMIZATION OF SCALAR FUNCTIONS f g 49 W k+1 = Z (k) =(W (k) + Z (k) ) if x(k) x(k) x(k) is the new bracketing triplet. .1).J2. ) . both of these conditions reduce to the relation . p .x3} are a bracketing triple with function % values {J1. or 4 2 1 2 3 4 2 1 if (x(0) ) < (x(0) ). Even if the initial bracketing triplet is not in the ratio of the golden section.nding. and inserting .J3}. J J J J f f g g f f g g j . f g .x2. j p p Matlab implementation % golden. . the new triplet is x(1) x(1) x(1) = x(0) x(0) x(0) . the new triplet is x(1) x(1) x(1) = x(0) x(0) x(0) . x2 is the best guess of the minimum. W2 3W + 1 = 0 which (because 0 < W < 1) implies that W = 3 2 5 0:381966 1 W = 52 1 0:618034 and Z = 5 2 0:236068: These proportions are referred to as the golden section. To summarize.

Setting dP (x)=dx = 0 to nd the critical point of this quadratic yields 0 = y0 (x 2 x x x1x x2x ) + y1 (x 2 x x x0x x2x ) + y2 (x 2 x x x0x x1x ) : 0 1 )( 0 2 1 0 )( 1 2 2 0 )( 2 1 .3 that. . golden hold off x2. .'ko') plot(x2. and x2 y2 . . the minimum point of the function may be found via an e cient technique based on function evaluations alone.1. the false position method is an e cient technique to nd the root of the function based on function evaluations alone.inverse parabolic interpolation x Recall from 3. . .J1. the quadratic interpolant is given by the Lagrange interpolation formula J J f g f g f g ( x )(x 2 ( )(x 2 ( 1 P (x) = y0 (xx x1 )(x xx) ) + y1 (xx x0 )(x xx) ) + y2 (xx x0 )(x xx) ) 0 1 0 2 1 x0 1 2 2 x0 )(x2 1 as described in 4.0001 % Set desired tolerance for x.yy. evals % end test_golden.m % Tests the golden search routine clear find_triplet % Initialize a bracket of the minimum. .J3.2. J2. . % Prepare to make some plots of the function over the width of the triplet xx= x1 : (x3-x1)/100 : x3] for i=1:101.'ko') x_tol = 0. . MINIMIZATION The implementation of the golden section search algorithm may be tested as follows: % test_golden. . . .m J1=J2 J2=J4 J3=J3 % Center new triplet on x4 CHAPTER 5. For example. evals=0 % Reset a counter for tracking function evaluations. Analogously. . At the heart of this technique is the construction of successive quadratic interpolations based on recent function evaluations. . taking each new estimate of the minimum of (x) as that value of x for which the value of the quadratic interpolant is minimum. . . given data points x0 y0 . x .'k-') hold on grid title('Convergence of the golden section search') plot(x1. taking each new estimate of the root of f (x) as that value of x for which the value of the linear interpolant is zero. .50 x1=x2 x2=x4 % x3=x3 end pause end % end golden. yy(i)=compute_J(xx(i)) end figure(1) clf plot(xx. x1 y1 . . .2. when a function (x) is \locally quadratic" (meaning that its shape is wellapproximated by a quadratic function). .'ko') plot(x3. The false position method is based on the construction of successive linear interpolations of recent function evaluations. .m 5. . . when a function f (x) is \locally linear" (meaning that its shape is wellapproximated by a linear function). .3 Re ning the bracket .J2.

res=1 i=1 J1=compute_J(x1) J2=compute_J(x2) J3=compute_J(x3) while (abs(x3-x1) > x_tol) x4 = 0.'r*') % of the Lagrange interpolant.2.m with a call to .m) to test the e is easily modi ed (by replacing the call to ciency of the inverse parabolic algorithm. J1 J2 J3]) % plot a * at the critical point plot(x4. but will always maintain the bracketing triplet. . . x_lower=min( x1 x2 x3]) x_upper=max( x1 x2 x3]) xx= x_lower : (x_upper-x_lower)/200 : x_upper] for i=1:201 J_lagrange(i)=lagrange(xx(i). BRACKETING APPROACHES FOR MINIMIZATION OF SCALAR FUNCTIONS . pause J4 = compute_J(x4) evals = evals+1 % Compute function J at new point and the proceed as with % the golden section search. ..'b-') % plot a curve through the data Jinterp=lagrange(x4.m code inv quad.5*(J1*(x2+x3)*(x2-x3)+ J2*(x3+x1)*(x3-x1)+ J3*(x1+x2)*(x1-x2))/. (J1*(x2-x3)+ J2*(x3-x1)+ J3*(x1-x2)) % compute the critical point % The following plotting stuff is done for demonstration purposes only. x1 x2 x3]. .m The test golden.Jinterp. golden.m % Finds the minimum or maximum of a function by repeatedly moving to the % critical point of a quadratic interpolant of three recently-computed data points % in an attempt to home in on the critical point of a nonquadratic function. . x1 x2 x3]. .. % Note that a couple of lines are commented out below because some % of the data is already in the right place! if (J4>J2) x3=x1 J3=J1 % Center new triplet on x2 % x2=x2 J2=J2 x1=x4 J1=J4 else x1=x2 J1=J2 % Center new triplet on x4 x2=x4 J2=J4 % x3=x3 J3=J3 end end % end inv_quad. J1 J2 J3]) end plot(xx. .5. 51 Multiplying by (x0 x1 )(x1 x2 )(x2 x0 ) and then solving for x gives the desired value of x which is a critical point of the interpolating quadratic: 1 x x x + 1 x = 2 y0 (x1 + x2 )(x1y (x 2 ) + y) (x0y +x 2 )(x2 ) + 0 ) (x y2 (x0 + x1 )(x0 x1 ) : x2 + 1 ( 2 x0 y2 0 x1 ) 0 1 Note that the critical point found may either be a minimum point or a maximum point depending on the relative orientation of the data.J_lagrange. . Matlab implementation % inv_quad.

the inverse parabolic technique can also stall for a variety of scalar functions (x) one might attempt to minimize. the direction opposite to the gradient vector r (x). As we will show.) Of the methods discussed in these notes. A particular choice of the momentum term results in a remarkable orthogonality property amongst the set of various descent directions (namely. so exact convergence in n iterations usually can not be obtained.4 Re ning the bracket . for large n. p(k) TA p(j) = 0 for j = k) and the set of descent directions are referred to as a conjugate set. this approach should converge to one of the minima of (x) (if such a minimum exists). A descent direction p(k) chosen to be a linear combination of the direction of steepest descent r(k) and the step taken at the previous iteration p(k. if (x) has multiple minima.m. It is functionally equivalent to golden. where it is discussed in further detail. and c) does not require computation and storage of the Hessian matrix. i. it is often quite slow. Switching in a reliable fashion from one technique to the other without the possibility for stalling requires a certain degree of heuristics and results in a rather long code which is not very elegant looking. and requires a minimum number of function evaluations. referred to as Brent's method. In the simplest such algorithm (referred to as the steepest descent or simple gradient algorithm). Searching in a series of mutually conjugate directions leads to exact convergence of the iterative algorithm in n iterations. MINIMIZATION 5. combines the reliable convergence bene ts of the golden section search with the ability of the inverse parabolic interpolation technique to \home in" quickly on the solution when the minimum point is approached. A hybrid technique. b) is e cient for high-dimensional systems. Though the above approach is simple. As the iteration k . . The algorithm was developed by Brent in 1973 and implemented in Fortran and C by the authors of Numerical Recipes (1998). a phenomenon often encountered when momentum is lacking. The \momentum" carried by such an approach allows the iteration to turn more directly down narrow valleys without oscillating between one descent direction and another. however. assuming a quadratic cost function and no numerical round-o errors1. Don't let this dissuade you. J 5. J J J J J ! 1 J J J 6 1 Note that. A straightforward approach to minimizing a scalar function of an n-dimensional vector x is to update the vector x iteratively. and the one it nds (a \local" minimum) might not necessarily be the one with the smallest value of (x) (the \global" minimum).e.m. it is not always the best idea to proceed in the direction of steepest descent of the cost function. some very good techniques are available to solve this problem.3 Gradient-based approaches for minimization of multivariable functions We now seek a reliable technique to minimize a multivariable function (x) which a) does not require a good initial guess.Brent's method As with the false position technique for accelerated bracket re nement for the problem of scalar root nding. the algorithm (when called correctly) is reliable. and can be called in the same manner. discussed above. this technique will only nd one of the minimum points. As opposed to the problem of root nding in the multivariable case.2. (A Matlab implementation of the algorithm is included at the class web site. Note that. with the name brent. 
the direction p is taken to be the direction r of maximum decrease of the function (x).m and inv quad. the accumulating round-o error due to the nite-precision arithmetic of the calculations is signi cant. Brent's method is the best \all purpose" scalar minimimization tool for a wide variety of applications.. proceeding at each step in a downhill direction p a distance which minimizes (x) in this direction.1) is often much more e ective. fast.52 CHAPTER 5.

1) and conjugate gradient ( 5.2) approaches for quadratic functions rst. Though the quadratic and nonquadratic cases are handled essentially identically in most regards.4.3). then discuss their extension to nonquadratic functions ( 5. Brent's method) for nonquadratic functions. neglecting numerical round-o errors) is possible only in the quadratic case.5.4.. Exact convergence in n iterations (again. the line minimizations required by the algorithms may be done directly for quadratic functions. GRADIENT-BASED APPROACHES FOR MINIMIZATION OF MULTIVARIABLE FUNCTIONS53 In the following. but should be accomplished by a more reliable bracketing procedure (e. though nonquadratic functions may also be minimized quite e ectively with the conjugate gradient algorithm when the appropriate modi cations are made.g.3. we will develop the steepest descent ( 5.4. x x x .

5.3.1 Steepest descent for quadratic functions

Noting the derivation of \nabla J in section 5.0.1, the parameter of descent \alpha which minimizes J along a descent direction r works out to be

  \alpha = \frac{r^T r}{r^T A r},

as derived below

and shown in Figure 5. Note that in \easy" cases for which the condition number is approximately unity. and is calculated in the following table. as we did for the method of steepest descent. by slight modi cation of the steepest descent algorithm.2 Conjugate gradient for quadratic functions . O(2n2 ) O(2n) O(2n2 ) O(2n) O(4n2 ) ops A cheaper technique for computing such an algorithm is discussed at the end of the next section. we arrive at the vastly improved conjugate gradient algorithm. the steepest descent algorithm must tack back and forth 90 at each turn. J J 5. the level surfaces of are approximately circular.iter)=x if (res<epsilon | iter==itmax).b^T x % using the steepest descent method clear res_save x_save epsilon=1e-6 x=zeros(size(b)) for iter=1:itmax r=b-A*x res=r'*r res_save(iter)=res x_save(:. GRADIENT-BASED APPROACHES FOR MINIMIZATION OF MULTIVARIABLE FUNCTIONS55 Thus. This improved algorithm retains the correct amount of momentum from one iteration to the next to successfully negotiate functions (x) with narrow valleys.3.3.5. Due to the successive line minimizations and the lack of momentum from one iteration to the next. Instead of minimizing in a single search direction at each iteration. which we will denote p(0) .3. We now show that. we can determine explicitly both the direction of descent r(k) and the parameter (k) which minimizes . now consider searching simultaneously in m directions. In poorly conditioned problems. By so doing.m % Minimize a quadratic function J(x) = (1/2) x^T A x . from the value of x(k) at each iteration. and convergence with either the steepest descent or the conjugate gradient algorithm will be quite rapid.m end % % % % % % determine gradient compute residual save some stuff exit yet? compute alpha update x Operation count The operation count for each iteration of the steepest descent algorithm may be determined by inspection of the above code. the path of the algorithm can be very jagged. the level surfaces become highly elongated ellipses. Operation To compute Ax b: To compute rT r: To compute rT Ar: To compute x + r: TOTAL: . As discussed earlier. J Matlab implementation % sd_quad. proceeding in the direction of steepest descent at each iteration is not necessarily the most e cient strategy. and the zig-zag behavior is ampli ed. break alpha=res/(r'*A*r) x=x+alpha*r end % end sd_quad.

We seek a technique to select all of the p^(j) in such a way that they are orthogonal through A, or "conjugate", such that

   p^(k)T A p^(j) = 0   for j ≠ k.

[Figure 5.3: Convergence of the simple gradient and the conjugate gradient algorithms when applied to find minima of two test functions of two scalar control variables x1 and x2 (horizontal and vertical axes): a) quadratic, b) non-quadratic. The starting point and the minimum point are marked in each frame. Contours illustrate the level surfaces of the test functions; contours corresponding to the smallest isovalues are solid, those corresponding to higher isovalues are dotted.]

Take

   x^(m) = x^(0) + sum_{j=0}^{m-1} α^(j) p^(j),

and note that

   J(x^(m)) = (1/2) [ x^(0) + sum_{j=0}^{m-1} α^(j) p^(j) ]^T A [ x^(0) + sum_{j=0}^{m-1} α^(j) p^(j) ] - b^T [ x^(0) + sum_{j=0}^{m-1} α^(j) p^(j) ].

Taking the derivative of this expression with respect to α^(k), we obtain

   ∂J(x^(m))/∂α^(k) = α^(k) p^(k)T A p^(k) + p^(k)T A x^(0) - p^(k)T b + sum_{j=0, j≠k}^{m-1} α^(j) p^(k)T A p^(j).

IF we can find such a sequence of p^(j), then each of the terms in the final sum above vanishes, and we obtain

   ∂J(x^(m))/∂α^(k) = α^(k) p^(k)T A p^(k) + p^(k)T (A x^(0) - b)
                    = α^(k) p^(k)T A p^(k) - p^(k)T r^(0),

and thus setting ∂J/∂α^(k) = 0 results in

   α^(k) = p^(k)T r^(0) / (p^(k)T A p^(k)).

The remarkable thing about this result is that it is independent of p^(j) for j ≠ k! Thus, so long as we can find a way to construct a sequence of p^(k) which are all conjugate, then each of these minimizations may be done separately.

The conjugate gradient technique is simply an efficient technique to construct a sequence of p^(k) which are conjugate. It entails just redefining the descent direction p^(k) at each iteration after the first to be a linear combination of the direction of steepest descent, r^(k), and the descent direction at the previous iteration, p^(k-1), such that

   p^(k) = r^(k) + β p^(k-1)    and    x^(k+1) = x^(k) + α p^(k),

where α and β are given by

   α = r^(k)T r^(k) / (p^(k)T A p^(k))    and    β = r^(k)T r^(k) / (r^(k-1)T r^(k-1)).

Verification that this choice of β results in conjugate directions, and that this choice of α is equivalent to the one mentioned previously (minimizing J in the direction p^(k) from the point x^(k)), involves a straightforward (but lengthy) proof by induction. The interested reader may find this proof in Golub and van Loan (1989).

Matlab implementation

% cg_quad.m
% Minimize a quadratic function J=(1/2) x^T A x - b^T x
% using the conjugate gradient method
clear res_save x_save
epsilon=1e-6
x=zeros(size(b))
for iter=1:itmax
  r=b-A*x                                      % determine gradient
  res=r'*r                                     % compute residual
  res_save(iter)=res                           % save some stuff
  x_save(:,iter)=x
  if (res<epsilon | iter==itmax), break, end   % exit yet?
  if (iter==1)                                 % compute update direction
    p = r                                      % set up a S.D. step
  else
    beta = res / res_save(iter-1)              % compute momentum term
    p = r + beta * p                           % set up a C.G. step
  end
  alpha=res/(p'*A*p)                           % compute alpha
  x=x+alpha*p                                  % update x
end
% end cg_quad.m

As seen by comparison of cg_quad.m to sd_quad.m, implementation of the conjugate gradient algorithm involves only a slight modification of the steepest descent algorithm. The operation count is virtually the same (cg_quad.m requires 2n more flops per iteration), and the storage requirements are slightly increased (cg_quad.m defines an extra n-dimensional vector p).

Matlab implementation - cheaper version

A cheaper version of cg_quad may be written by leveraging the fact that an iterative procedure may be developed for the computation of r by combining the following two equations:

   x^(k) = x^(k-1) + α^(k-1) p^(k-1),
   r^(k) = b - A x^(k).

Putting these two equations together leads to an update equation for r at each iteration:

   r^(k) = b - A (x^(k-1) + α^(k-1) p^(k-1)) = r^(k-1) - α^(k-1) A p^(k-1).

The nice thing about this update equation for r, versus the direct calculation of r, is that it depends on the matrix/vector product A p^(k-1), which needs to be computed anyway during the computation of α. Thus, an implementation which costs only O(2n^2) per iteration is possible, as given by the following code:

% cg_cheap_quad.m
% Minimize a quadratic function J=(1/2) x^T A x - b^T x
% using the conjugate gradient method
clear res_save x_save
epsilon=1e-6
x=zeros(size(b))
for iter=1:itmax
  if (iter==1)
    r=b-A*x                                    % determine r directly
  else
    r=r-alpha*d                                % determine r by the iterative formula
  end
  res=r'*r                                     % compute residual
  res_save(iter)=res                           % save some stuff
  x_save(:,iter)=x
  if (res<epsilon | iter==itmax), break, end   % exit yet?
  if (iter==1)                                 % compute update direction
    p = r                                      % set up a S.D. step
  else
    beta = res / res_save(iter-1)              % compute momentum term
    p = r + beta * p                           % set up a C.G. step
  end
  d=A*p                % perform the (expensive) matrix/vector product
  alpha=res/(p'*d)     % compute alpha
  x=x+alpha*p          % update x
end
% end cg_cheap_quad.m

A test code named test_sd_cg.m, which is available at the class web site, may be used to check that these steepest descent and conjugate gradient algorithms work as advertised. These codes also produce some nice plots to give a graphical interpretation of the operation of the algorithms.

5.3.3 Preconditioned conjugate gradient

Assuming exact arithmetic, the conjugate gradient algorithm converges in exactly N iterations for an N-dimensional quadratic minimization problem. For large N, however, we often can not afford to perform N iterations; we often seek to perform approximate minimization of an N-dimensional problem with a total number of iterations m << N. Unfortunately, convergence of the conjugate gradient algorithm to the minimum of J, though monotonic, is often highly nonuniform, so large reductions in J might not occur until iterations well after the iteration m at which we would like to truncate the iteration sequence. The uniformity of the convergence is governed by the condition number c of the matrix A, which (for symmetric positive-definite A) is just equal to the ratio of its maximum and minimum eigenvalues, c = λ_max / λ_min. For small c, convergence of the conjugate gradient algorithm is quite rapid, even for high-dimensional problems with N >> 1.

We therefore seek to solve a better-conditioned but equivalent problem Ã x̃ = b̃ which, once solved, will allow us to easily extract the solution of the original problem A x = b for a poorly-conditioned symmetric positive-definite A. To accomplish this, premultiply A x = b by P^{-1} for some symmetric preconditioning matrix P:

   P^{-1} A x = P^{-1} b    ⇒    (P^{-1} A P^{-1}) (P x) = (P^{-1} b),   i.e.   Ã x̃ = b̃,

where Ã = P^{-1} A P^{-1}, x̃ = P x, and b̃ = P^{-1} b. Note that the matrix Ã = P^{-1} A P^{-1} is symmetric positive definite. We will defer discussion of the construction of an appropriate P to the end of the section; suffice it to say for the moment that, if P^2 is somehow "close" to A, then the problem Ã x̃ = b̃ is a well-conditioned problem (because Ã = P^{-1} A P^{-1} ≈ I) and can be solved rapidly (with a small number of iterations) using the conjugate gradient approach.

The computation of Ã might be prohibitively expensive, however, and might destroy any sparsity structure of A. We now show that it is not actually necessary to compute Ã and b̃ in order to solve the original problem A x = b in a well-conditioned manner. To begin, we write the conjugate gradient algorithm for the well-conditioned problem Ã x̃ = b̃. For simplicity, we use a short-hand ("pseudo-code") notation:

   for i = 1 : m
      r̃ ← b̃ - Ã x̃  (i = 1),       r̃ ← r̃ - α d̃  (i > 1)
      res_old = res,   res = r̃^T r̃
      p̃ ← r̃  (i = 1),              p̃ ← r̃ + β p̃  where β = res/res_old  (i > 1)
      α = res / (p̃^T d̃),  where d̃ = Ã p̃
      x̃ ← x̃ + α p̃
   end

For clarity of notation, in converting the poorly-conditioned problem A x = b to the well-conditioned problem Ã x̃ = b̃, we have introduced a tilde over each vector involved in this optimization. Recall the definitions Ã = P^{-1} A P^{-1}, x̃ = P x, and b̃ = P^{-1} b. Define now some new intermediate variables r̃ = P^{-1} r, p̃ = P p, and d̃ = P^{-1} d. With these definitions, we now rewrite exactly the above algorithm for solving the well-conditioned problem Ã x̃ = b̃, but substitute in the non-tilde variables:

   for i = 1 : m
      P^{-1} r ← P^{-1} b - (P^{-1} A P^{-1}) P x  (i = 1),      P^{-1} r ← P^{-1} r - α P^{-1} d  (i > 1)
      res_old = res,   res = (P^{-1} r)^T (P^{-1} r)
      P p ← P^{-1} r  (i = 1),      P p ← P^{-1} r + β P p  where β = res/res_old  (i > 1)
      α = res / [ (P p)^T (P^{-1} d) ],  where P^{-1} d = (P^{-1} A P^{-1}) P p
      P x ← P x + α P p
   end

Now define M = P^2 and simplify:

   for i = 1 : m
      r ← b - A x  (i = 1),      r ← r - α d  (i > 1)
      res_old = res,   res = r^T s,  where s = M^{-1} r
      p ← s  (i = 1),            p ← s + β p  where β = res/res_old  (i > 1)
      α = res / (p^T d),  where d = A p
      x ← x + α p
   end

This is practically identical to the original conjugate gradient algorithm for solving the problem A x = b. The new variable s = M^{-1} r may be found by solution of the system M s = r.

A few comments are in order regarding the choice of M when implementing this method. Recall that if M = P^2 is somehow "close" to A, then the problem here (which is actually the solution of A x = b via the standard conjugate gradient approach, with the extra solve M s = r inserted at each iteration) is well conditioned and converges in a small number of iterations. At the same time, since the system M s = r must be solved once per iteration, we seek an M for which we can solve this system quickly (e.g., an M which is the product of sparse triangular matrices).

There are a variety of heuristic techniques to construct an appropriate M. One of the most popular is incomplete Cholesky factorization, which constructs a lower-triangular H with H H^T = M ≈ A with the following strategy:

   H = A
   for k = 1 : n
      H(k,k) = sqrt( H(k,k) )
      for i = k+1 : n
         if H(i,k) ≠ 0 then H(i,k) = H(i,k) / H(k,k)
      end
      for j = k+1 : n
         for i = j : n
            if H(i,j) ≠ 0 then H(i,j) = H(i,j) - H(i,k) H(j,k)
         end
      end
   end

Once H is obtained with this approach such that H H^T = M ≈ A, solving the system M s = r for s is similar to solving a linear system by Gaussian elimination while leveraging an LU decomposition: one first solves the triangular system H f = r for the intermediate variable f, then solves the triangular system H^T s = f for the desired quantity s. Note that the triangular factors H and H^T are zero everywhere A is zero. Thus, if A is sparse, the above algorithm can be rewritten in a manner that leverages the sparsity structure of H (akin to the back substitution in the Thomas algorithm). Though it sometimes takes a bit of effort to write an algorithm that efficiently leverages such sparsity structure, as this usually must be done on a case-by-case basis, the benefits of preconditioning are often quite significant and well worth the coding effort which it necessitates. Motivation and further discussion of incomplete Cholesky factorization, as well as other preconditioning techniques, is deferred to Golub and van Loan (1989).
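As a rough illustration, the following is a direct Matlab transcription of the incomplete Cholesky strategy above, applied to a small symmetric positive-definite test matrix built here for demonstration purposes (the dense loops are only a sketch; in practice one would exploit the sparsity of A rather than loop over full matrices):

% ichol_sketch.m  (illustrative)
% Incomplete Cholesky factorization following the strategy above:
% H is lower triangular and H*H' = M, an approximation of A.
n = 8;
A = diag(2*ones(n,1)) + diag(-ones(n-1,1),1) + diag(-ones(n-1,1),-1);  % sample SPD matrix
H = tril(A);                          % work on the lower triangle of A
for k = 1:n
  H(k,k) = sqrt(H(k,k));
  for i = k+1:n
    if H(i,k) ~= 0, H(i,k) = H(i,k)/H(k,k); end
  end
  for j = k+1:n
    for i = j:n
      if H(i,j) ~= 0, H(i,j) = H(i,j) - H(i,k)*H(j,k); end
    end
  end
end
% With M = H*H', the preconditioner solve M s = r is two triangular solves:
r = ones(n,1);
f = H  \ r;                           % forward solve  H f = r
s = H' \ f;                           % back solve     H' s = f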

5.3.4 Extension to non-quadratic functions

At each iteration of the conjugate gradient method, there are five things to be done:

1. determine the (negative of the) gradient direction, r = -∇J,
2. compute the residual r^T r,
3. determine the necessary momentum β and the corresponding update direction p ← r + β p,
4. determine the (scalar) parameter of descent α which minimizes J(x + α p), and
5. update x ← x + α p.

In the homework, you will extend the codes developed here for quadratic problems to nonquadratic problems, creating two new routines, sd_nq.m and cg_nq.m. Essentially, the algorithm is the same, but J(x) now lacks the special quadratic structure we assumed in the previous sections. Generalizing the results of the previous sections to nonquadratic problems entails only a few modifications:

1'. Replace the line which determines the gradient direction with a call to a function, which we will call compute_grad.m, which calculates the gradient r of the nonquadratic function J.

3'. When the function is not quadratic, the momentum term in the conjugate gradient algorithm is computed using a local quadratic approximation of J. In this setting, it is often advantageous to compute the momentum term according to the formula

      β^(k) = (r^(k) - r^(k-1))^T r^(k) / [ (r^(k-1))^T r^(k-1) ].

   The "correction" term -(r^(k-1))^T r^(k) in the numerator is zero when the function is quadratic and the conjugate gradient approach is used; for nonquadratic functions, this additional term often serves to nudge the descent direction towards that of a steepest descent step in regions of the function which are highly nonquadratic. This approach is referred to as the Polak-Ribiere variant of the conjugate gradient method for nonquadratic functions.

4'. Replace the direct computation of α with a call to an (appropriately modified) version of Brent's method, which determines α based on a series of function evaluations.

Finally, when working with nonquadratic functions, the momentum sometimes builds up in the wrong direction. Thus, the momentum should be reset to zero (i.e., take β = 0) every R iterations in cg_nq.m (R = 20 is often a good choice).
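As a rough sketch of how these modifications fit together (this is not the homework code), the loop below assembles a Polak-Ribiere conjugate gradient iteration for an arbitrary example J defined inline. The names compute_J and compute_grad mirror the routines described above, but here are hypothetical anonymous functions, and a crude backtracking line search is substituted for the Brent-based line minimization of step 4':

% cg_nq_sketch.m  (illustrative only)
% Polak-Ribiere conjugate gradient for a nonquadratic J(x), with restarts.
compute_J    = @(x) (1-x(1))^2 + 5*(x(2)-x(1)^2)^2;           % example nonquadratic J
compute_grad = @(x) [ -2*(1-x(1)) - 20*x(1)*(x(2)-x(1)^2) ;
                       10*(x(2)-x(1)^2) ];
x = [-1; 1];  epsilon=1e-8;  itmax=500;  R=20;
for iter=1:itmax
  r = -compute_grad(x);                    % 1': negative gradient
  res = r'*r;                              % 2 : residual
  if (res<epsilon), break, end
  if (iter==1 | mod(iter,R)==0)            % restart with a steepest descent step
    p = r;
  else
    beta = ((r-r_old)'*r)/(r_old'*r_old);  % 3': Polak-Ribiere momentum
    p = r + beta*p;
  end
  alpha = 1;                               % 4': crude backtracking line search
  while compute_J(x+alpha*p) > compute_J(x) - 1e-4*alpha*(r'*p) & alpha > 1e-12
    alpha = alpha/2;
  end
  x = x + alpha*p;                         % 5 : update x
  r_old = r;
end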

Chapter 6

Differentiation

6.1 Derivation of finite difference formulae

In the simulation of physical systems, one often needs to compute the derivative of a function f(x) which is known only at a discrete set of grid points x_0, x_1, ..., x_N. Effective formulae for computing such approximations, known as finite difference formulae, may be derived by combination of one or more Taylor series expansions. For example, defining f_j = f(x_j), the Taylor series expansion for f at the point x_{j+1} in terms of f and its derivatives at the point x_j is given by

   f_{j+1} = f_j + (x_{j+1} - x_j) f'_j + [(x_{j+1} - x_j)^2 / 2!] f''_j + [(x_{j+1} - x_j)^3 / 3!] f'''_j + ...

Defining h_j = x_{j+1} - x_j, we may write the above equation as

   f_{j+1} = f_j + h_j f'_j + (h_j^2 / 2) f''_j + ...

In these notes, we will indicate a uniform mesh by denoting its (constant) grid spacing as h (without a subscript), and we will denote a nonuniform ("stretched") mesh by denoting the grid spacing as h_j (with a subscript). We will assume the mesh is uniform unless indicated otherwise.

For a uniform mesh, rearrangement of the above equation leads to

   f'_j = (f_{j+1} - f_j) / h + O(h),

where O(h) denotes the contribution from all terms which have a power of h greater than or equal to one. Neglecting these "higher-order terms" for sufficiently small h, we can approximate the derivative as

   (δf/δx)|_j = (f_{j+1} - f_j) / h,

which is referred to as the first-order forward difference formula for the first derivative. The neglected term with the highest power of h (in this case, -h f''_j / 2) is referred to as the leading-order error. The exponent of h in the leading-order error is the order of accuracy of the method, in this case indicating "first-order" behavior: for a sufficiently fine initial grid, if we refine the grid spacing further by a factor of two, the truncation error of this method is also reduced by approximately a factor of two.

Similarly, by expanding f_{j-1} about the point x_j and rearranging, we obtain

   (δf/δx)|_j = (f_j - f_{j-1}) / h,

which is referred to as the first-order backward difference formula for the first derivative.

Higher-order (more accurate) schemes can be derived by combining Taylor series of the function f at various points near the point x_j. For example, the widely used second-order central difference formula for the first derivative can be obtained by subtraction of the two Taylor series expansions

   f_{j+1} = f_j + h f'_j + (h^2/2) f''_j + (h^3/6) f'''_j + ...
   f_{j-1} = f_j - h f'_j + (h^2/2) f''_j - (h^3/6) f'''_j + ...

leading to

   f_{j+1} - f_{j-1} = 2 h f'_j + (h^3/3) f'''_j + ...,

and thus

   f'_j = (f_{j+1} - f_{j-1}) / (2h) - (h^2/6) f'''_j + ...    ⇒    (δf/δx)|_j = (f_{j+1} - f_{j-1}) / (2h).

Similarly, by adding the above Taylor series expansions instead of subtracting them, we obtain the second-order central difference formula for the second derivative, given by

   (δ^2 f/δx^2)|_j = (f_{j+1} - 2 f_j + f_{j-1}) / h^2,    with    f''_j = (f_{j+1} - 2 f_j + f_{j-1}) / h^2 - (h^2/12) f_j^(iv) + ...

In general, we can obtain higher accuracy if we include more points. By appropriate linear combination of four different Taylor series expansions, a fourth-order central difference formula for the first derivative may be found, which takes the form

   (δf/δx)|_j = (f_{j-2} - 8 f_{j-1} + 8 f_{j+1} - f_{j+2}) / (12 h).

The main difficulty with higher-order formulae occurs near the boundaries of the domain: they require functional values at points outside the domain, which are not available. Near boundaries one usually resorts to lower-order formulae.
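A quick numerical check (not part of the original notes) of the orders of accuracy quoted above may be performed as follows, using sin(x) as an arbitrary smooth test function; halving h should reduce the forward-difference error by about a factor of 2 and the central-difference error by about a factor of 4:

% fd_order_check.m  (illustrative)
% Verify the order of accuracy of the first-order forward difference and
% the second-order central difference by halving h repeatedly.
f  = @(x) sin(x);  fp = @(x) cos(x);   % test function and its exact derivative
x0 = 1.0;
for k = 1:8
  h = 2^(-k);
  err_fwd(k)  = abs( (f(x0+h)-f(x0))/h       - fp(x0) );
  err_cent(k) = abs( (f(x0+h)-f(x0-h))/(2*h) - fp(x0) );
end
ratio_fwd  = err_fwd(1:end-1)  ./ err_fwd(2:end)    % expect ratios near 2
ratio_cent = err_cent(1:end-1) ./ err_cent(2:end)   % expect ratios near 4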

6.2 Taylor Tables

There is a general technique for constructing finite difference formulae using a tool known as a Taylor Table. This technique is best illustrated by example.

Suppose we want to construct the most accurate finite difference scheme for the first derivative that involves the function values f_j, f_{j+1}, and f_{j+2}. That is, we seek coefficients a_i such that

   (δf/δx)|_j = sum_{i=0}^{2} a_i f_{j+i},                                                        (6.1)

where the a_i are the coefficients of the finite difference formula sought. Given this restriction on the information used, we seek the highest order of accuracy that can be achieved. It is convenient to organize the Taylor series of the terms appearing in the error of (6.1) using a "Taylor Table" of the form

                       f_j       f'_j         f''_j            f'''_j
   f'_j                 0          1             0                0
   -a_0 f_j           -a_0         0             0                0
   -a_1 f_{j+1}       -a_1      -a_1 h       -a_1 h^2/2       -a_1 h^3/6
   -a_2 f_{j+2}       -a_2      -a_2 (2h)    -a_2 (2h)^2/2    -a_2 (2h)^3/6

The leftmost column of this table contains all of the terms in the error of (6.1), i.e. in f'_j - sum_i a_i f_{j+i}. The elements to the right, when multiplied by the corresponding terms at the top of each column and summed, yield the Taylor series expansion of each of the terms to the left. Summing up all of these terms, we get the error expanded in terms of powers of the grid spacing h. By proper choice of the available degrees of freedom a_0, a_1, a_2, we can set several of the coefficients of this polynomial equal to zero, thereby making the error proportional to as high a power of h as possible (and thus it will vanish rapidly upon refinement of the grid). For small h, this is a good way of making the error small.

In the present case, we have three free coefficients, and can set the sums of the first three columns to zero:

   a_0 + a_1 + a_2            = 0
   a_1 h + a_2 (2h)           = 1         ⇒      a_0 = -3/(2h),   a_1 = 2/h,   a_2 = -1/(2h).
   a_1 h^2/2 + a_2 (2h)^2/2   = 0

This simple linear system may be solved either by hand or with Matlab. The resulting second-order forward difference formula for the first derivative is

   (δf/δx)|_j = (-3 f_j + 4 f_{j+1} - f_{j+2}) / (2h),

and the leading-order error, which can be determined by multiplying the first non-zero column sum by the term at the top of the corresponding column, is

   -(a_1 h^3/6 + a_2 (2h)^3/6) f'''_j = (h^2/3) f'''_j,

revealing that this scheme is second-order accurate.
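The text notes that the small linear system above may be solved with Matlab; a minimal sketch of doing so (assuming an arbitrary value of h for the check) is:

% taylor_table_coeffs.m  (illustrative)
% Solve the Taylor-table system above for the coefficients a_0, a_1, a_2
% of the one-sided scheme (6.1).
h = 0.1;                               % arbitrary grid spacing for this check
M = [ 1      1          1        ;     % coefficient of f_j   must vanish
      0      h          2*h      ;     % coefficient of f'_j  must equal 1
      0      h^2/2     (2*h)^2/2 ];    % coefficient of f''_j must vanish
rhs = [0; 1; 0];
a = M \ rhs                            % expect a = [-3/(2h); 2/h; -1/(2h)]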

6.3 Pade Approximations

By including both nearby function evaluations and nearby derivative approximations on the left-hand side of an expression like (6.1), we can derive a banded system of equations which can easily be solved to determine a numerical approximation of the f'_j. This approach is referred to as Pade approximation. Again illustrating by example, consider the equation

   b_{-1} f'_{j-1} + f'_j + b_1 f'_{j+1} + sum_{i=-1}^{1} a_i f_{j+i} = error.                    (6.2)

We again use the available degrees of freedom b_{-1}, b_1, a_{-1}, a_0, a_1 to obtain the highest possible accuracy in (6.2). Leveraging the Taylor series expansions

   f_{j+1}  = f_j  + h f'_j  + (h^2/2) f''_j  + (h^3/6) f'''_j  + (h^4/24) f_j^(iv) + (h^5/120) f_j^(v) + ...
   f'_{j+1} = f'_j + h f''_j + (h^2/2) f'''_j + (h^3/6) f_j^(iv) + (h^4/24) f_j^(v) + ...

(and the analogous expansions for f_{j-1} and f'_{j-1}), the corresponding Taylor table has one row for each of the six terms on the left-hand side of (6.2) and one column for each of f_j, f'_j, ..., f_j^(v). Setting the sums of the first five columns equal to zero leads to the linear system

   f_j :       a_{-1} + a_0 + a_1                               = 0
   f'_j :      b_{-1} + 1 + b_1 + h (a_1 - a_{-1})              = 0
   f''_j :     h (b_1 - b_{-1}) + (h^2/2)(a_1 + a_{-1})         = 0
   f'''_j :    (h^2/2)(b_1 + b_{-1}) + (h^3/6)(a_1 - a_{-1})    = 0
   f_j^(iv) :  (h^3/6)(b_1 - b_{-1}) + (h^4/24)(a_1 + a_{-1})   = 0

whose solution is

   b_{-1} = b_1 = 1/4,     a_0 = 0,     a_{-1} = -a_1 = 3/(4h).

Thus, the system to be solved to determine the numerical approximation of the derivative at each grid point has a typical row given by

   (1/4) (δf/δx)|_{j-1} + (δf/δx)|_j + (1/4) (δf/δx)|_{j+1} = (3/(4h)) (f_{j+1} - f_{j-1}),

and has a leading-order error proportional to h^4 f_j^(v), and thus is fourth-order accurate. Writing out this equation for all values of j on the interior leads to a tridiagonal, diagonally-dominant system of equations, in which each interior row has the entries (1/4, 1, 1/4) multiplying the vector of unknowns (δf/δx)|_j and the right-hand-side entry (3/(4h))(f_{j+1} - f_{j-1}).

This system can then be solved efficiently for the numerical approximation of the derivative at all of the gridpoints using the Thomas algorithm. At the endpoints, a different treatment is needed: a central expression (Pade or otherwise) cannot be used at x_0, for example, because x_{-1} is outside of our available grid of data. One commonly settles on a lower-order forward (backward) expression at the left (right) boundary in order to close this set of equations in a nonsingular manner.
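A minimal sketch of assembling and solving this system is given below. It closes the ends with the second-order one-sided formula derived in the previous section (an assumption consistent with, but not dictated by, the text), and uses Matlab's backslash on a sparse tridiagonal matrix in place of the Thomas algorithm of Chapter 2; the test function sin(x) is arbitrary:

% pade_deriv.m  (illustrative)
% Fourth-order Pade approximation of f'(x) on a uniform grid.
N = 64;  L = 2*pi;  h = L/N;
x = (0:N)'*h;  f = sin(x);             % arbitrary smooth test function
n = N+1;
B = spalloc(n,n,3*n);  g = zeros(n,1);
for j = 2:n-1                           % interior rows: Pade scheme
  B(j,j-1)=1/4;  B(j,j)=1;  B(j,j+1)=1/4;
  g(j) = 3/(4*h)*(f(j+1)-f(j-1));
end
B(1,1)=1;  g(1) = (-3*f(1)+4*f(2)-f(3))/(2*h);        % one-sided at left end
B(n,n)=1;  g(n) = ( 3*f(n)-4*f(n-1)+f(n-2))/(2*h);    % one-sided at right end
fp = B\g;                               % numerical approximation of f'(x)
max_error = max(abs(fp - cos(x)))       % compare with the exact derivative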

6.4 Modified wavenumber analysis

The order of accuracy is usually the primary indicator of the accuracy of finite-difference formulae: it tells us how mesh refinement improves the accuracy. For example, mesh refinement by a factor of two improves the accuracy of a second-order finite-difference scheme by a factor of four, and improves the accuracy of a fourth-order scheme by a factor of sixteen. Another method for quantifying the accuracy of a finite difference formula, which yields further information, is called the modified wavenumber approach.

To illustrate this approach, consider the harmonic function given by

   f(x) = e^{i k x}.                                                                              (6.3)

(Alternatively, we can also do this derivation with sines and cosines, but complex exponentials tend to make the algebra easier.) The exact derivative of this function is

   f' = i k e^{i k x} = i k f.                                                                    (6.4)

Let us discretize the x axis with a uniform mesh, x_j = h j, where j = 0, 1, 2, ..., N and h = L/N. We now ask how accurately the second-order central finite-difference scheme computes the derivative of f. The finite-difference approximation for the derivative which we consider here is

   (δf/δx)|_j = (f_{j+1} - f_{j-1}) / (2h).

Substituting f_j = e^{i k x_j}, and noting that x_{j+1} = x_j + h and x_{j-1} = x_j - h, we obtain

   (δf/δx)|_j = [ e^{i k (x_j + h)} - e^{i k (x_j - h)} ] / (2h) = [ (e^{i k h} - e^{-i k h}) / (2h) ] f_j = i [ sin(h k)/h ] f_j ≡ i k' f_j,        (6.5)

where

   k' = sin(h k) / h.

By analogy with (6.4), k' is called the modified wavenumber for this second-order finite difference scheme. A measure of the accuracy of the finite-difference scheme is provided by comparing the modified wavenumber k', which appears in the numerical approximation of the derivative (6.5), with the actual wavenumber k, which appears in the exact expression for the derivative (6.4). For small wavenumbers, the numerical approximation of the derivative on our discrete grid is usually pretty good (k' ≈ k), but for large wavenumbers the numerical approximation is degraded; as k → π/h, the numerical approximation of the derivative is completely off. In an analogous manner, one can derive modified wavenumbers for any finite difference formula.

As another example, consider the modified wavenumber for the fourth-order expression for the first derivative given by

   (δf/δx)|_j = (f_{j-2} - 8 f_{j-1} + 8 f_{j+1} - f_{j+2}) / (12h) = (2/(3h)) (f_{j+1} - f_{j-1}) - (1/(12h)) (f_{j+2} - f_{j-2}).

Inserting (6.3) and manipulating as before, we obtain

   i k' f_j = (2/(3h)) (e^{i k h} - e^{-i k h}) f_j - (1/(12h)) (e^{i 2 k h} - e^{-i 2 k h}) f_j = (i/h) [ (4/3) sin(h k) - (1/6) sin(2 h k) ] f_j
   ⇒   h k' = (4/3) sin(h k) - (1/6) sin(2 h k).

Consider now our fourth-order Pade approximation,

   (1/4) (δf/δx)|_{j+1} + (δf/δx)|_j + (1/4) (δf/δx)|_{j-1} = (3/(4h)) (f_{j+1} - f_{j-1}).

Approximating the derivatives at the points x_{j+1} and x_{j-1} with their corresponding numerical approximations, (δf/δx)|_{j±1} = i k' e^{i k x_{j±1}} = i k' e^{±i k h} f_j, inserting (6.3), and manipulating as before, we obtain

   i k' [ 1 + (1/2) cos(h k) ] f_j = i (3/(2h)) sin(h k) f_j    ⇒    h k' = (3/2) sin(h k) / [ 1 + (1/2) cos(h k) ].

6.5 Alternative derivation of differentiation formulae

Consider now the Lagrange interpolation of the three points {x_{i-1}, f_{i-1}}, {x_i, f_i}, and {x_{i+1}, f_{i+1}}, given by

   f(x) = [ (x - x_i)(x - x_{i+1}) / ((x_{i-1} - x_i)(x_{i-1} - x_{i+1})) ] f_{i-1}
        + [ (x - x_{i-1})(x - x_{i+1}) / ((x_i - x_{i-1})(x_i - x_{i+1})) ] f_i
        + [ (x - x_{i-1})(x - x_i) / ((x_{i+1} - x_{i-1})(x_{i+1} - x_i)) ] f_{i+1}.

Differentiating this expression with respect to x and then evaluating at x = x_i gives (on a uniform mesh)

   f'(x_i) = [ (-h) / ((-h)(-2h)) ] f_{i-1} + 0 + [ h / ((2h)(h)) ] f_{i+1} = (f_{i+1} - f_{i-1}) / (2h),

which is the same as what we get with the Taylor Table.
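A short script (not from the original notes) plots the three modified wavenumbers derived in section 6.4 against the exact relation k'h = kh, which makes the loss of accuracy at high wavenumbers easy to see:

% modified_wavenumber.m  (illustrative)
% Compare the modified wavenumbers of section 6.4 with the exact relation.
kh  = linspace(0,pi,200);
kp2 = sin(kh);                                   % 2nd-order central
kp4 = (4/3)*sin(kh) - (1/6)*sin(2*kh);           % 4th-order central
kpP = (3/2)*sin(kh) ./ (1 + cos(kh)/2);          % 4th-order Pade
plot(kh,kh,'k--', kh,kp2, kh,kp4, kh,kpP)
xlabel('hk'), ylabel('hk'''), legend('exact','2nd-order central','4th-order central','Pade')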

Chapter 7

Integration

Differentiation and integration are two essential tools of calculus which we need to solve engineering problems. The previous chapter discussed methods to approximate derivatives numerically; we now turn to the problem of numerical integration, in which we approximate the integral of a given function over a specified domain. In the setting we discuss in the present chapter, this procedure is usually referred to as numerical quadrature.

7.1 Basic quadrature formulae

Consider the problem of integrating a function f on the interval [a, c] when the function is evaluated only at a limited number of discrete gridpoints.

7.1.1 Techniques based on Lagrange interpolation

One approach to approximating the integral of f is to integrate the lowest-order polynomial that passes through a specified number of function evaluations, using the formulae of Lagrange interpolation. For example, if the function is evaluated at the midpoint b = (a + c)/2, then (defining h = c - a) we can integrate a constant approximation of the function over the interval [a, c], leading to the midpoint rule:

   ∫_a^c f(x) dx ≈ ∫_a^c f(b) dx = h f(b) ≡ M(f).                                                 (7.1)

If the function is evaluated at the two endpoints a and c, we can integrate a linear approximation of the function over the interval [a, c], leading to the trapezoidal rule:

   ∫_a^c f(x) dx ≈ ∫_a^c [ ((x - c)/(a - c)) f(a) + ((x - a)/(c - a)) f(c) ] dx = h [ f(a) + f(c) ] / 2 ≡ T(f).         (7.2)

If the function is known at all three points a, b, and c, we can integrate a quadratic approximation of the function over the interval [a, c], leading to Simpson's rule:

   ∫_a^c f(x) dx ≈ ∫_a^c [ ((x-b)(x-c)/((a-b)(a-c))) f(a) + ((x-a)(x-c)/((b-a)(b-c))) f(b) + ((x-a)(x-b)/((c-a)(c-b))) f(c) ] dx
                 = ... = h [ f(a) + 4 f(b) + f(c) ] / 6 ≡ S(f).                                    (7.3)

Don't forget that symbolic manipulation packages like Maple (which is included with Matlab) and Mathematica are well suited for this sort of algebraic manipulation.

7.1.2 Extension to several gridpoints

Recall that Lagrange interpolations are ill-behaved near the endpoints when the number of gridpoints is large. Thus, it is generally ill-advised to continue the approach of the previous section to function approximations of higher order than quadratic. Instead, better results are usually obtained by applying the formulae of 7.1.1 repeatedly over several smaller subintervals.

Defining a numerical grid of points x_0, x_1, ..., x_n distributed over the interval [L, R], the intermediate gridpoints x_{i-1/2} = (x_{i-1} + x_i)/2, the grid spacing h_i = x_i - x_{i-1}, and the function evaluations f_i = f(x_i), the numerical approximation of the integral of f(x) over the interval [L, R] via the midpoint rule takes the form

   ∫_L^R f(x) dx ≈ sum_{i=1}^{n} h_i f_{i-1/2} = h sum_{i=1}^{n} f_{i-1/2},                       (7.4)

the numerical approximation of the integral via the trapezoidal rule takes the form

   ∫_L^R f(x) dx ≈ sum_{i=1}^{n} h_i (f_{i-1} + f_i)/2 = h [ (f_0 + f_n)/2 + sum_{i=1}^{n-1} f_i ],          (7.5)

and the numerical approximation of the integral via Simpson's rule takes the form

   ∫_L^R f(x) dx ≈ sum_{i=1}^{n} h_i (f_{i-1} + 4 f_{i-1/2} + f_i)/6
                 = h [ (f_0 + f_n)/6 + (2/3) sum_{i=1}^{n} f_{i-1/2} + (1/3) sum_{i=1}^{n-1} f_i ],          (7.6)

where the rightmost expressions assume a uniform grid in which the grid spacing h is constant.

7.2 Error Analysis of Integration Rules

In order to quantify the accuracy of the integration rules we have proposed so far, we may again turn to Taylor series analysis. For example, if the integral is approximated by the midpoint rule (7.1), replacing f(x) with its Taylor series approximation about b and integrating, we obtain

   ∫_a^c f(x) dx = ∫_a^c [ f(b) + (x - b) f'(b) + (1/2)(x - b)^2 f''(b) + (1/6)(x - b)^3 f'''(b) + ... ] dx
                 = h f(b) + [ (x - b)^2 / 2 ]_a^c f'(b) + [ (x - b)^3 / 6 ]_a^c f''(b) + ...
                 = h f(b) + (h^3/24) f''(b) + (h^5/1920) f^(iv)(b) + ...                          (7.7)

Thus, the leading-order error is proportional to h^3, and the approximation of the integral over this single interval is third-order accurate. Note also that if all even derivatives of f happen to be zero (for example, if f(x) is linear in x), the midpoint rule integrates the function exactly.

The question of the accuracy of a particular integration rule over a single interval is often not of much interest, however. A more relevant measure is the rate of convergence of the integration rule (7.4) when applied over several gridpoints on a given interval [L, R] as the numerical grid is refined.

As h ∝ 1/n (the width of each subinterval is inversely proportional to the number of subintervals), the error over the entire interval [L, R] of the numerical integration will be proportional to n h^3 ∝ h^2. Thus, for approximations of the integral on a given interval [L, R] as the computational grid is refined, the midpoint rule is second-order accurate.

Consider now the Taylor series approximations of f(a) and f(c) about b:

   f(a) = f(b) - (h/2) f'(b) + (1/2)(h/2)^2 f''(b) - (1/6)(h/2)^3 f'''(b) + ...
   f(c) = f(b) + (h/2) f'(b) + (1/2)(h/2)^2 f''(b) + (1/6)(h/2)^3 f'''(b) + ...

Combining these expressions gives

   [ f(a) + f(c) ] / 2 = f(b) + (h^2/8) f''(b) + (h^4/384) f^(iv)(b) + ...

Solving for f(b) and substituting into (7.7) gives

   ∫_a^c f(x) dx = h [ f(a) + f(c) ] / 2 - (h^3/12) f''(b) - (h^5/480) f^(iv)(b) + ...            (7.8)

As with the midpoint rule, the leading-order error of the trapezoidal rule (7.2) is proportional to h^3, and thus the trapezoidal approximation of the integral over this single interval is third-order accurate. Again, the most relevant measure is the rate of convergence of the integration rule (7.5) when applied over several gridpoints on the given interval [L, R] as the number of gridpoints is increased; in such a setting, the trapezoidal rule is second-order accurate.

Simpson's rule can be constructed simply by determining the specific linear combination of the midpoint rule and the trapezoidal rule for which the leading-order error term vanishes. Note from (7.1)-(7.3) that S(f) = (2/3) M(f) + (1/3) T(f). Adding 2/3 times equation (7.7) plus 1/3 times equation (7.8) gives, if you do all of the appropriate substitutions,

   ∫_a^c f(x) dx = h [ f(a) + 4 f(b) + f(c) ] / 6 - (h^5/2880) f^(iv)(b) + ...

The leading-order error of Simpson's rule (7.3) is therefore proportional to h^5, and thus the approximation of the integral over this single interval using this rule is fifth-order accurate. Note that if all even derivatives of f higher than three happen to be zero (e.g., if f(x) is cubic in x), then Simpson's rule integrates the function exactly. Again, as with the midpoint rule, the most relevant measure is the rate of convergence of the integration rule (7.6) when applied over several gridpoints on the given interval [L, R] as the number of gridpoints is increased; in such a setting, Simpson's rule is fourth-order accurate.
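These convergence rates are easy to confirm numerically. The following sketch (not from the original notes) applies the composite rules (7.4)-(7.6) to the arbitrary test function e^x on [0, 1], using the relation S = (2M + T)/3 noted above; halving h should reduce the midpoint and trapezoidal errors by about a factor of 4 and the Simpson error by about a factor of 16:

% quad_order_check.m  (illustrative)
f = @(x) exp(x);  L = 0;  R = 1;  Iexact = exp(1)-1;
for k = 1:6
  n = 2^k;  h = (R-L)/n;
  xe = L + (0:n)*h;              % endpoints x_0 ... x_n
  xm = L + ((1:n)-0.5)*h;        % midpoints x_{i-1/2}
  M = h*sum(f(xm));                                          % (7.4) midpoint
  T = h*( (f(xe(1))+f(xe(end)))/2 + sum(f(xe(2:end-1))) );   % (7.5) trapezoidal
  S = (2*M + T)/3;                                           % (7.6) Simpson
  err(k,:) = abs([M T S] - Iexact);
end
ratios = err(1:end-1,:)./err(2:end,:)    % expect roughly [4 4 16] in each row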

7.3 Romberg integration

In the previous derivation, it became evident that keeping track of the leading-order error of a particular numerical formula can be a useful thing to do. We now pursue further such a constructive procedure: with a technique known as Richardson extrapolation, we can determine even higher-order approximations of the integral on the interval [L, R] using linear combinations of several trapezoidal approximations of the integral computed on a series of successively finer grids. The approach we will present is commonly referred to as Romberg integration.

Recall first that the error of the trapezoidal approximation of the integral (7.5) on the given interval [L, R] may be written

   I ≡ ∫_L^R f(x) dx = h [ (f_0 + f_n)/2 + sum_{i=1}^{n-1} f_i ] + c_1 h^2 + c_2 h^4 + c_3 h^6 + ...

Let us start with n_1 = 2 and h_1 = (R - L)/n_1 and iteratively refine the grid. Define the trapezoidal approximation of the integral on a numerical grid with n = n_l = 2^l (and, correspondingly, h = h_l = (R - L)/n_l) as

   I_{l,1} = h_l [ (f_0 + f_{n_l})/2 + sum_{i=1}^{n_l - 1} f_i ].

We now examine the truncation error (in terms of h_1) as the grid is refined. Assuming the coefficients c_i (which are proportional to the various derivatives of f) vary only slowly in space, note that at the first level we have

   I_{1,1} = I - c_1 h_1^2 - c_2 h_1^4 - c_3 h_1^6 - ...,

whereas at the second level we have

   I_{2,1} = I - c_1 h_2^2 - c_2 h_2^4 - c_3 h_2^6 - ...
           = I - c_1 h_1^2/4 - c_2 h_1^4/16 - c_3 h_1^6/64 - ...

Thus, we can eliminate the error proportional to h^2 by taking a linear combination of I_{1,1} and I_{2,1} to obtain

   I_{2,2} = (4 I_{2,1} - I_{1,1}) / 3 = I + (1/4) c_2 h_1^4 + (5/16) c_3 h_1^6 + ...

(This results in Simpson's rule, if you do all of the appropriate substitutions.) Continuing to the third level of grid refinement, the trapezoidal approximation of the integral satisfies

   I_{3,1} = I - c_1 h_1^2/16 - c_2 h_1^4/256 - c_3 h_1^6/4096 - ...

First, eliminate the terms proportional to h^2 by linear combination with I_{2,1}:

   I_{3,2} = (4 I_{3,1} - I_{2,1}) / 3 = I + (1/64) c_2 h_1^4 + (5/1024) c_3 h_1^6 + ...

Then, eliminate the terms proportional to h^4 by linear combination with I_{2,2}:

   I_{3,3} = (16 I_{3,2} - I_{2,2}) / 15 = I - (1/64) c_3 h_1^6 + ...

This process may be repeated to provide increasingly higher-order approximations to the integral I. The structure of the refinements is:

   Gridpoints         2nd-order       4th-order      6th-order      8th-order
                      approximation   correction     correction     correction
   n_1 = 2^1 = 2      I_{1,1}
   n_2 = 2^2 = 4      I_{2,1}    ->   I_{2,2}
   n_3 = 2^3 = 8      I_{3,1}    ->   I_{3,2}    ->  I_{3,3}
   n_4 = 2^4 = 16     I_{4,1}    ->   I_{4,2}    ->  I_{4,3}    ->  I_{4,4}

The general form for the correction term (for k >= 2) is

   I_{l,k} = [ 4^(k-1) I_{l,k-1} - I_{l-1,k-1} ] / [ 4^(k-1) - 1 ].

Matlab implementation

A Matlab implementation of Romberg integration is given below. A straightforward test code is provided at the class web site.

function [int,evals] = int_romb(L,R,refinements)
% Integrate the function defined in compute_f.m from x=L to x=R using
% Romberg integration to provide the maximal order of accuracy with a
% given number of grid refinements.
evals=0
toplevel=refinements+1
for level=1:toplevel
  % Approximate the integral with the trapezoidal rule on 2^level subintervals
  n=2^level
  [I(level,1),evals_temp]=int_trap(L,R,n)
  evals=evals+evals_temp
  % Perform several corrections based on I at the previous level.
  for k=2:level
    I(level,k) = (4^(k-1)*I(level,k-1) - I(level-1,k-1))/(4^(k-1) -1)
  end
end
int=I(toplevel,toplevel)
% end int_romb.m

A simple function to perform the trapezoidal integrations is given by:

function [int,evals] = int_trap(L,R,n)
% Integrate the function defined in compute_f.m from x=L to x=R on
% n equal subintervals using the trapezoidal rule.
h=(R-L)/n
int=0.5*(compute_f(L) + compute_f(R))
for i=1:n-1
  x=L+h*i
  int=int+compute_f(x)
end
int=h*int
evals=2+(n-1)
% end int_trap

7.4 Adaptive Quadrature

Often, it is wasteful to use the same grid spacing h everywhere in the interval of integration [L, R]. Ideally, one would like to use a fine grid in the regions where the integrand varies quickly and a coarse grid where the integrand varies slowly. As we now show, adaptive quadrature techniques automatically adjust the grid spacing in just such a manner. To demonstrate this technique, we will use Simpson's rule as the base method; similar schemes may also be pursued for the other base integration schemes, such as the trapezoidal rule.

Suppose we seek a numerical approximation Ĩ of the integral I such that

   | Ĩ - I | ≤ ε,

where ε is the error tolerance provided by the user. The idea of adaptive quadrature is to spread out this error in our approximation of the integral proportionally across the subintervals spanning [L, R]. First, divide the interval [L, R] into several subintervals with the numerical grid x_0, x_1, ..., x_n. Let I_i denote the exact integral on [x_{i-1}, x_i]. Evaluating the integral on a particular subinterval [x_{i-1}, x_i] with Simpson's rule yields

   S_i = (h_i/6) [ f(x_i - h_i) + 4 f(x_i - h_i/2) + f(x_i) ].

Dividing this particular subinterval in half and summing Simpson's approximations of the integral on each of these smaller subintervals yields

   S_i^(2) = (h_i/12) [ f(x_i - h_i) + 4 f(x_i - 3h_i/4) + 2 f(x_i - h_i/2) + 4 f(x_i - h_i/4) + f(x_i) ].

The essential idea is to compare the two approximations S_i and S_i^(2) to obtain an estimate for the accuracy of S_i^(2). If the accuracy is acceptable, we will use S_i^(2) for the approximation of the integral on this interval; otherwise, the adaptive procedure further subdivides the interval and the process is repeated.

From our error analysis, we know that

   I_i - S_i = c h_i^5 f^(iv)(x_i - h_i/2) + ...,                                                 (7.9)

and, summing the corresponding errors on the two half-subintervals,

   I_i - S_i^(2) = c (h_i/2)^5 [ f^(iv)(x_i - 3h_i/4) + f^(iv)(x_i - h_i/4) ] + ...

Each of the terms in the bracket in the last expression can be expanded in a Taylor series about the point x_i - h_i/2:

   f^(iv)(x_i - 3h_i/4) = f^(iv)(x_i - h_i/2) - (h_i/4) f^(v)(x_i - h_i/2) + ...
   f^(iv)(x_i -  h_i/4) = f^(iv)(x_i - h_i/2) + (h_i/4) f^(v)(x_i - h_i/2) + ...

Thus,

   I_i - S_i^(2) = 2 c (h_i/2)^5 f^(iv)(x_i - h_i/2) + ...                                        (7.10)

Subtracting (7.10) from (7.9), we obtain

   S_i^(2) - S_i = (15/16) c h_i^5 f^(iv)(x_i - h_i/2) + ...,

and substituting this into the RHS of (7.10) reveals that

   I_i - S_i^(2) ≈ (1/15) (S_i^(2) - S_i).

Thus, the error in S_i^(2) is, to leading order, about 1/15 of the difference between S_i and S_i^(2). The good news is that this difference can easily be computed, even if the exact solution is unknown. If

   (1/15) | S_i^(2) - S_i | ≤ ε h_i / (R - L),

then S_i^(2) is sufficiently accurate for the subinterval [x_{i-1}, x_i], and we move on to the next subinterval. If this condition is not satisfied, the subinterval [x_{i-1}, x_i] is subdivided further. The essential idea of adaptive quadrature is thus to spread evenly the error of the numerical approximation of the integral (or, at least, an approximation of this error) over the entire interval [L, R] by selective refinements of the numerical grid. As with Richardson extrapolation, knowledge of the truncation error can be leveraged to estimate the accuracy of the numerical solution without knowing the exact solution.

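A minimal recursive sketch of the adaptive strategy just described is given below. It assumes the same compute_f.m used elsewhere in this chapter, and for simplicity re-evaluates the integrand at shared points rather than re-using evaluations; it is only an illustration of the refinement logic, not the class code. It would be called as, e.g., int_adapt(L,R,1e-6,R-L).

% int_adapt.m  (illustrative sketch)
% Recursively refine a subinterval until (1/15)|S2-S1| <= eps*h/(R-L).
function int = int_adapt(a,b,eps_total,total_width)
h  = b-a;  m = (a+b)/2;
S1 = h/6 *(compute_f(a) + 4*compute_f(m) + compute_f(b));
S2 = h/12*(compute_f(a) + 4*compute_f((a+m)/2) + 2*compute_f(m) ...
         + 4*compute_f((m+b)/2) + compute_f(b));
if abs(S2-S1)/15 <= eps_total*h/total_width
  int = S2;                                        % accept the refined estimate
else
  int = int_adapt(a,m,eps_total,total_width) ...   % otherwise subdivide
      + int_adapt(m,b,eps_total,total_width);
end
% end int_adapt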

Chapter 8

Ordinary differential equations

Consider first a scalar, first-order ordinary differential equation (ODE) of the form

   dy/dt = f(y, t)   with   y(t_0) = y_0.                                                         (8.1)

The problem we address now is the advancement of such a system in time by integration of this differential equation. Note that ODEs with higher-order derivatives and systems of ODEs present a straightforward generalization of the present discussion, as will be shown in due course. Note also that we refer to the independent variable in this chapter as time, t, but this is done without loss of generality, and other interpretations of the independent variable are also possible.

The ODE given above may be "solved" numerically by marching it forward in time, step by step. In other words, we seek to approximate the solution y to (8.1) at timestep t_{n+1} = t_n + h_n, given the solution at the initial time t_0 and the solution at the previously-computed timesteps t_1 to t_n. For simplicity of notation, we will focus our discussion initially on the case with constant stepsize h; generalization to the case with nonconstant h_n is straightforward. Note that the problem of integration of an ODE is fundamentally different than the problem of numerical quadrature discussed in Chapter 7, in which the function being integrated was given: here, the quantity being integrated, f, is itself a function of the result of the integration, y.

8.1 Taylor-series methods

One of the simplest approaches to march the ODE (8.1) forward in time is to appeal to a Taylor series expansion in time, such as

   y(t_{n+1}) = y(t_n) + h y'(t_n) + (h^2/2) y''(t_n) + (h^3/6) y'''(t_n) + ...                   (8.2)

From our ODE, we have

   y'   = dy/dt = f,
   y''  = dy'/dt = df/dt = ∂f/∂t + (∂f/∂y)(dy/dt) = f_t + f f_y,
   y''' = dy''/dt = d(f_t + f f_y)/dt = ∂(f_t + f f_y)/∂t + [ ∂(f_t + f f_y)/∂y ] (dy/dt)
        = f_tt + f_t f_y + 2 f f_yt + f_y^2 f + f^2 f_yy,

etc.

Denoting the numerical approximation of y(t_n) as y_n, the time integration method based on the first two terms of (8.2) is

   y_{n+1} = y_n + h f(y_n, t_n).                                                                 (8.3)

This is referred to as the explicit Euler method, and is the simplest of all time integration schemes. Note that this method neglects terms which are proportional to h^2, and thus is "second-order" accurate over a single time step. As with the problem of numerical quadrature, however, a more relevant measure is the accuracy achieved when marching the ODE over a given time interval (t_0, t_0 + T) as the timesteps h are made smaller. In such a setting, we lose one in the order of accuracy (as in the quadrature problem discussed in Chapter 7), and thus, over a specified time interval (t_0, t_0 + T), explicit Euler is first-order accurate.

We can also base a time integration scheme on the first three terms of (8.2):

   y_{n+1} = y_n + h f(y_n, t_n) + (h^2/2) [ f_t(y_n, t_n) + f(y_n, t_n) f_y(y_n, t_n) ].

Even higher-order Taylor series methods are also possible. We do not pursue such high-order Taylor series approaches in the present text, however, as their computational expense is relatively high (due to all of the cross derivatives required) and their stability and accuracy is not as good as some of the other methods which we will develop.

Note that a Taylor series expansion in time may also be written around t_{n+1}:

   y(t_n) = y(t_{n+1}) - h y'(t_{n+1}) + (h^2/2) y''(t_{n+1}) - (h^3/6) y'''(t_{n+1}) + ...

The time integration method based on the first two terms of this Taylor series is given by

   y_{n+1} = y_n + h f(y_{n+1}, t_{n+1}).                                                         (8.4)

This is referred to as the implicit Euler method. It also neglects terms which are proportional to h^2, and thus, as with explicit Euler, implicit Euler is first-order accurate over a specified time interval (t_0, t_0 + T). On the other hand, implicit methods such as the implicit Euler method given above are more difficult to use, because knowledge of y_{n+1} is needed (before it is computed!) in order to evaluate f when advancing from y_n to y_{n+1}. If f is linear in y, implicit strategies such as (8.4) are easily solved for y_{n+1}; if f is nonlinear in y, however, such problems must be approximated by some type of linearization or iteration, as will be discussed further in class.

8.2 The trapezoidal method

The formal solution of the ODE (8.1) over the interval [t_n, t_{n+1}] is given by

   y_{n+1} = y_n + ∫_{t_n}^{t_{n+1}} f(y, t) dt.

Approximating this integral with the trapezoidal rule from 7.1 gives

   y_{n+1} = y_n + (h/2) [ f(y_n, t_n) + f(y_{n+1}, t_{n+1}) ].                                   (8.5)

This is referred to as the trapezoidal or Crank-Nicholson method. We defer further discussion of the accuracy and stability of these methods until after we first examine an illustrative model problem.
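As a minimal sketch (not from the original notes) of how a scheme such as (8.3) is marched in practice, the loop below advances an arbitrary example right-hand side with explicit Euler; an implicit scheme such as (8.4) or (8.5) would additionally require solving for y_{n+1} inside the loop:

% euler_march.m  (illustrative)
% March the ODE (8.1) from t0 to t0+T with the explicit Euler method (8.3).
f  = @(y,t) -y + sin(t);        % example right-hand side (an assumption)
t0 = 0;  T = 10;  h = 0.01;  y0 = 1;
N  = round(T/h);
t  = t0 + (0:N)*h;  y = zeros(1,N+1);  y(1) = y0;
for n = 1:N
  y(n+1) = y(n) + h*f(y(n),t(n));      % explicit Euler update (8.3)
end
plot(t,y)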

8.3 A model problem

A scalar model problem which is very useful for characterizing various time integration strategies is

   y' = λ y   with   y(t_0) = y_0,                                                                (8.6)

where λ is, in general, allowed to be complex. The utility of this model problem is that the exact solution is available, so we can compare the numerical approximation produced by a particular numerical method to the exact solution in order to quantify the pros and cons of the numerical method. The insight we gain by studying the application of the numerical method we choose to this simple model problem allows us to predict how this method will work on more difficult problems for which the exact solution is not available.

The exact solution of this problem is y = y_0 e^{λ (t - t_0)}. Note that, for Re(λ) > 0, the magnitude of the exact solution grows without bound. We thus refer to the exact solution as being unstable if Re(λ) > 0 and stable if Re(λ) ≤ 0. Graphically, we denote the region of stability of the exact solution in the complex plane λ by the shaded region shown in Figure 8.1.

[Figure 8.1: Stability of the exact solution to the model problem y' = λ y in the complex plane λ: the stable region is the left half-plane.]

8.3.1 Simulation of an exponentially-decaying system

Consider now the model problem (8.6) with λ = -1. The exact solution of this system is simply a decaying exponential. In Figure 8.2, we show the application of the explicit Euler method, the implicit Euler method, and the trapezoidal method to this problem. Note that the explicit Euler method appears to be unstable for the large values of h. Note also that all three methods are more accurate as h is refined, with the trapezoidal method appearing to be the most accurate.

8.3.2 Simulation of an undamped oscillating system

Consider now the second-order ODE for a simple mass/spring system given by

   y'' = -ω^2 y   with   y(t_0) = y_0,   y'(t_0) = 0,                                             (8.7)

where ω = 1. The exact solution is y = y_0 cos[ω (t - t_0)] = (y_0/2) [ e^{iω(t - t_0)} + e^{-iω(t - t_0)} ].

[Figure 8.2: Simulation of the model problem y' = λ y with λ = -1 using the explicit Euler method (top), the implicit Euler method (middle), and the trapezoidal method (bottom). Each frame shows the numerical solutions for h = 2.1 and h = 0.6, together with the exact solution.]

We may easily write this second-order ODE as a first-order system of ODEs by defining y_1 = y and y_2 = y', and writing

   ( y_1 )'     (   0     1 ) ( y_1 )
   ( y_2 )   =  ( -ω^2    0 ) ( y_2 )        ⇔        y' = A y.                                   (8.8)

The eigenvalues of A are λ = ± i ω. Note that the eigenvalues are pure imaginary; if we had started with the equation for a damped oscillator, the eigenvalues would have a negative real part as well. Note also that A may be diagonalized by its matrix of eigenvectors,

   A = S Λ S^{-1}    where    Λ = diag( iω, -iω ).

Thus, we have

   y' = S Λ S^{-1} y     ⇒     S^{-1} y' = Λ (S^{-1} y).

[Figure 8.3: Simulation of the oscillatory system y'' = -ω^2 y with ω = 1 using the explicit Euler method (top), the implicit Euler method (middle), and the trapezoidal method (bottom). Each frame shows the numerical solutions for h = 0.6 and h = 0.1, together with the exact solution.]

Premultiplying (8.8) by S^{-1} therefore gives z' = Λ z, where we have defined z = S^{-1} y. In terms of the components of z, we have decoupled the dynamics of the system:

   z_1' = iω z_1,       z_2' = -iω z_2.

Each of these equations is of exactly the same form as our scalar model problem (8.6), with complex (in this case, pure imaginary) values for λ. Thus, our original second-order system, as re-expressed in the first-order form (8.8), will be stable iff there are no eigenvalues of A with Re(λ) > 0; more generally, eigenmode decompositions of physical systems (like mass/spring systems) motivate us to look at the scalar model problem (8.6) over the entire complex plane λ.

In Figure 8.3, we show the application of the explicit Euler method, the implicit Euler method, and the trapezoidal method to the first-order system of equations (8.8). Note that the explicit Euler method appears to be unstable for both large and small values of h. Note also that all three methods are more accurate as h is refined, with the trapezoidal method appearing to be the most accurate. We see that some numerical methods for time integration of ODEs are more accurate than others, and that some numerical techniques are sometimes unstable, even for ODEs with stable exact solutions.

In the next two sections, we develop techniques to quantify both the stability and the accuracy of numerical methods for time integration of ODEs by application of these numerical methods to the model problem (8.6).

8.4 Stability

For stability of a numerical method for time integration of an ODE, we want to ensure that, if the exact solution is bounded, the numerical solution is also bounded. We often need to restrict the timestep h in order to ensure this. To make this discussion concrete, consider a system whose exact solution is bounded, and define:

1) a stable numerical scheme: one which does not blow up for any h;
2) an unstable numerical scheme: one which blows up for any h; and
3) a conditionally stable numerical scheme: one which blows up for some h.

8.4.1 Stability of the explicit Euler method

Applying the explicit Euler method (8.3) to the model problem (8.6), we see that

   y_{n+1} = y_n + h λ y_n = (1 + λ h) y_n.

Thus, assuming constant h, the solution at time step n is

   y_n = (1 + λ h)^n y_0,    with    y_{n+1} / y_n = 1 + λ h.

For large n, the numerical solution remains stable iff | 1 + λ h | ≤ 1, i.e. iff

   (1 + λ_R h)^2 + (λ_I h)^2 ≤ 1.

The region of the complex plane λ h which satisfies this stability constraint is shown in Figure 8.4.

[Figure 8.4: Stability of the numerical solution to y' = λ y in the complex plane λ h using the explicit Euler method.]

Note that this region of stability in the complex plane λ h is consistent with the numerical simulations shown in Figures 8.2a and 8.3a: for real, negative λ, this numerical method is conditionally stable (i.e., it is stable for sufficiently small h), whereas for pure imaginary λ, this numerical method is unstable for any h, though the instability is mild for small h.

8.4.2 Stability of the implicit Euler method

Applying the implicit Euler method (8.4) to the model problem (8.6), we see that

   y_{n+1} = y_n + h λ y_{n+1}     ⇒     y_{n+1} = (1 - λ h)^{-1} y_n.

Thus, assuming constant h, the solution at time step n is

   y_n = [ 1 / (1 - λ h) ]^n y_0,    with    y_{n+1} / y_n = 1 / (1 - λ h).

For large n, the numerical solution remains stable iff

   | 1 / (1 - λ h) | ≤ 1     ⇒     (1 - λ_R h)^2 + (λ_I h)^2 ≥ 1.

The region of the complex plane λ h which satisfies this stability constraint is shown in Figure 8.5.

[Figure 8.5: Stability of the numerical solution to y' = λ y in the complex plane λ h using the implicit Euler method.]

Note that this region of stability in the complex plane λ h is consistent with the numerical simulations shown in Figures 8.2b and 8.3b: this method is stable for any stable ODE for any h, and is even stable for some cases in which the ODE itself is unstable.

8.4.3 Stability of the trapezoidal method

Applying the trapezoidal method (8.5) to the model problem (8.6), we see that

   y_{n+1} = y_n + (h/2) (λ y_n + λ y_{n+1})     ⇒     y_{n+1} = [ (1 + λh/2) / (1 - λh/2) ] y_n.

Thus, assuming constant h, the solution at time step n is

   y_n = [ (1 + λh/2) / (1 - λh/2) ]^n y_0.

For large n, the numerical solution remains stable iff

   | (1 + λh/2) / (1 - λh/2) | ≤ 1     ⇒     Re(λ h) ≤ 0.

The region of the complex plane λ h which satisfies this stability constraint coincides exactly with the region of stability of the exact solution, as shown in Figure 8.6: the trapezoidal method is stable for systems with Re(λ) < 0 and marginally stable for systems with Re(λ) = 0. Note that this is consistent with the numerical simulations shown in Figures 8.2c and 8.3c.

[Figure 8.6: Stability of the numerical solution to y' = λ y in the complex plane λ h using the trapezoidal method.]

8.5 Accuracy

Revisiting the model problem y' = λ y, the exact solution (assuming t_0 = 0 and h = constant) is

   y(t_n) = e^{λ t_n} y_0 = (e^{λ h})^n y_0 = ( 1 + λh + λ^2 h^2/2 + λ^3 h^3/6 + ... )^n y_0.

On the other hand, solving the model problem with explicit Euler led to

   y_n = (1 + λ h)^n y_0,

solving the model problem with implicit Euler led to

   y_n = [ 1 / (1 - λ h) ]^n y_0 = ( 1 + λh + λ^2 h^2 + λ^3 h^3 + ... )^n y_0,

and solving the model problem with trapezoidal led to

   y_n = [ (1 + λh/2) / (1 - λh/2) ]^n y_0 = ( 1 + λh + λ^2 h^2/2 + λ^3 h^3/4 + ... )^n y_0.

To quantify the accuracy of these three methods, we can compare the amplification factor in each of the numerical approximations to the exact value e^{λ h}. The leading-order errors of the explicit Euler and implicit Euler methods are seen to be proportional to h^2, and the leading-order error of the trapezoidal method is proportional to h^3. Thus, over a specified time interval (t_0, t_0 + T), as noted in 8.1, explicit Euler and implicit Euler are first-order accurate and trapezoidal is second-order accurate. The higher order of accuracy of the trapezoidal method implies an improved rate of convergence of this scheme to the exact solution as the timestep h is refined, as observed in Figures 8.2 and 8.3.
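The stability boundaries of Figures 8.4-8.6 are simply the curves along which the magnitude of each amplification factor equals one, and can be traced numerically with a few lines of Matlab (a sketch, not part of the original notes; the plotting ranges are arbitrary):

% stability_regions.m  (illustrative)
% Plot the |amplification factor| = 1 contours in the complex lambda*h plane.
[x,y] = meshgrid(-4:0.02:2, -3:0.02:3);
lh = x + 1i*y;                               % lambda*h
sig_ee = abs(1 + lh);                        % explicit Euler
sig_ie = abs(1./(1 - lh));                   % implicit Euler
sig_tr = abs((1 + lh/2)./(1 - lh/2));        % trapezoidal
figure, hold on
contour(x,y,sig_ee,[1 1],'b')
contour(x,y,sig_ie,[1 1],'r')
contour(x,y,sig_tr,[1 1],'k')
axis equal, grid on, xlabel('Re(\lambda h)'), ylabel('Im(\lambda h)')
legend('explicit Euler','implicit Euler','trapezoidal')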

8.6 Runge-Kutta methods

An important class of explicit methods, called Runge-Kutta methods, is given by the general form

   k_1 = f( y_n,  t_n )
   k_2 = f( y_n + β_1 h k_1,  t_n + α_1 h )
   k_3 = f( y_n + β_2 h k_1 + β_3 h k_2,  t_n + α_2 h )                                           (8.9)
   ...
   y_{n+1} = y_n + γ_1 h k_1 + γ_2 h k_2 + γ_3 h k_3 + ...,

where the constants α_i, β_i, and γ_i are selected to match as many terms as possible of the exact solution

   y(t_{n+1}) = y(t_n) + h y'(t_n) + (h^2/2) y''(t_n) + (h^3/6) y'''(t_n) + ...,

where

   y'   = f,
   y''  = f_t + f f_y,
   y''' = f_tt + f_t f_y + 2 f f_yt + f_y^2 f + f^2 f_yy,

etc. Runge-Kutta methods are explicit and "self starting", as they don't require any information about the numerical approximation of the solution before time t_n; this typically makes them quite easy to use. As the number of intermediate steps k_i in the Runge-Kutta method is increased, the order of accuracy of the method can also be increased. The stability properties of higher-order Runge-Kutta methods are also generally quite favorable, as will be shown.

8.6.1 The class of second-order Runge-Kutta methods (RK2)

Consider first the family of two-step schemes of the form (8.9):

   k_1 = f(y_n, t_n)
   k_2 = f(y_n + β_1 h k_1, t_n + α_1 h) ≈ f(y_n, t_n) + f_y(y_n, t_n) β_1 h f(y_n, t_n) + f_t(y_n, t_n) α_1 h
   y_{n+1} = y_n + γ_1 h k_1 + γ_2 h k_2
           ≈ y_n + γ_1 h f(y_n, t_n) + γ_2 h [ f(y_n, t_n) + β_1 h f_y(y_n, t_n) f(y_n, t_n) + α_1 h f_t(y_n, t_n) ]
           = y_n + (γ_1 + γ_2) h f(y_n, t_n) + γ_2 β_1 h^2 f_y(y_n, t_n) f(y_n, t_n) + γ_2 α_1 h^2 f_t(y_n, t_n).

Note that the approximations given above are exact if f is linear in y and t, as it is in our model problem. The exact solution we seek to match with this scheme is given by

   y(t_{n+1}) = y(t_n) + h f(y_n, t_n) + (h^2/2) [ f_t(y_n, t_n) + f(y_n, t_n) f_y(y_n, t_n) ] + ...

Matching coefficients to as high an order as possible, we require that

   γ_1 + γ_2   = 1
   γ_2 β_1     = 1/2      ⇒      taking α_1 = β_1 ≡ β as a free parameter:   γ_2 = 1/(2β),   γ_1 = 1 - 1/(2β).
   γ_2 α_1     = 1/2

Thus, the general form of the two-step second-order Runge-Kutta method (RK2) is

   k_1 = f( y_n, t_n )
   k_2 = f( y_n + β h k_1, t_n + β h )                                                            (8.10)
   y_{n+1} = y_n + [ 1 - 1/(2β) ] h k_1 + [ 1/(2β) ] h k_2,

where β is a free parameter. A popular choice is β = 1/2, which is known as the midpoint method and has a clear geometric interpretation of approximating a central difference formula in the integration of the ODE from t_n to t_{n+1}. Another popular choice is β = 1, which is equivalent to perhaps the most common so-called "predictor-corrector" scheme, and may be computed in the following order:

   predictor:   y*_{n+1} = y_n + h f(y_n, t_n)
   corrector:   y_{n+1}  = y_n + (h/2) [ f(y_n, t_n) + f(y*_{n+1}, t_{n+1}) ].

The "predictor" (which is simply an explicit Euler estimate of y_{n+1}) is only "stepwise 2nd-order accurate". However, calculation of the "corrector" (which looks roughly like a recalculation of y_{n+1} with a trapezoidal rule) results in a value for y_{n+1} which is "stepwise 3rd-order accurate" (and thus the scheme is globally 2nd-order accurate), as we show below.

Applying an RK2 method (for some value of the free parameter β) to the model problem y' = λ y yields

   y_{n+1} = y_n + [ 1 - 1/(2β) ] h λ y_n + [ 1/(2β) ] h λ (1 + β λ h) y_n = ( 1 + λ h + λ^2 h^2/2 ) y_n.

The amplification factor is seen to be a truncation of the Taylor series of the exact value e^{λ h} = 1 + λ h + λ^2 h^2/2 + λ^3 h^3/6 + ... We thus see that the leading-order error of this method (for any value of β) is proportional to h^3 and, over a specified time interval (t_0, t_0 + T), an RK2 method is second-order accurate. Over a large number of timesteps, the method is stable iff | 1 + λ h + λ^2 h^2/2 | ≤ 1; the domain of stability of this method is illustrated in Figure 8.7.
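A minimal Matlab sketch (not from the original notes) of the midpoint variant of (8.10), applied to an arbitrary example right-hand side, is:

% rk2_midpoint.m  (illustrative)
% March dy/dt = f(y,t) with the RK2 "midpoint" method (beta = 1/2 in (8.10)).
f  = @(y,t) -y + sin(t);        % arbitrary example right-hand side
t0 = 0;  T = 10;  h = 0.1;  y0 = 1;
N  = round(T/h);  t = t0 + (0:N)*h;  y = zeros(1,N+1);  y(1) = y0;
for n = 1:N
  k1 = f(y(n), t(n));
  k2 = f(y(n) + h/2*k1, t(n) + h/2);
  y(n+1) = y(n) + h*k2;          % gamma_1 = 0, gamma_2 = 1 for beta = 1/2
end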

[Figure 8.7: Stability of the numerical solution to y' = λ y in the complex plane λ h using RK2.]

8.6.2 A popular fourth-order Runge-Kutta method (RK4)

The most popular fourth-order Runge-Kutta method is

   k_1 = f( y_n,              t_n       )
   k_2 = f( y_n + (h/2) k_1,  t_{n+1/2} )
   k_3 = f( y_n + (h/2) k_2,  t_{n+1/2} )                                                         (8.11)
   k_4 = f( y_n + h k_3,      t_{n+1}   )
   y_{n+1} = y_n + (h/6) k_1 + (h/3) (k_2 + k_3) + (h/6) k_4.

This scheme usually performs very well, and is the workhorse of many ODE solvers. This particular RK4 scheme also has a reasonably clear geometric interpretation, as discussed further in class. A derivation similar to that in the previous section confirms that the constants chosen in (8.11) indeed provide fourth-order accuracy, with the amplification factor for the model problem again given by a truncated Taylor series of the exact value:

   y_{n+1} / y_n = 1 + λ h + λ^2 h^2/2 + λ^3 h^3/6 + λ^4 h^4/24.

Over a large number of timesteps, the method is stable iff | 1 + λ h + λ^2 h^2/2 + λ^3 h^3/6 + λ^4 h^4/24 | ≤ 1; the domain of stability of this method is illustrated in Figure 8.8.
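A minimal Matlab sketch (not from the original notes) of the RK4 scheme (8.11), applied to an arbitrary example right-hand side, is:

% rk4_march.m  (illustrative)
% March dy/dt = f(y,t) with the fourth-order Runge-Kutta scheme (8.11).
f  = @(y,t) -y + sin(t);        % arbitrary example right-hand side
t0 = 0;  T = 10;  h = 0.1;  y0 = 1;
N  = round(T/h);  t = t0 + (0:N)*h;  y = zeros(1,N+1);  y(1) = y0;
for n = 1:N
  k1 = f(y(n),          t(n)      );
  k2 = f(y(n) + h/2*k1, t(n) + h/2);
  k3 = f(y(n) + h/2*k2, t(n) + h/2);
  k4 = f(y(n) + h*k3,   t(n) + h  );
  y(n+1) = y(n) + h/6*k1 + h/3*(k2 + k3) + h/6*k4;
end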

8.6.3 An adaptive Runge-Kutta method (RKM4)

Another popular fourth-order scheme, known as the Runge-Kutta-Merson method, is

    k_1 = f(y_n, t_n)
    k_2 = f(y_n + (h/3) k_1, t_{n+1/3})
    k_3 = f(y_n + (h/6) (k_1 + k_2), t_{n+1/3})
    k_4 = f(y_n + (h/8) (k_1 + 3 k_3), t_{n+1/2})
    ỹ_{n+1} = y_n + (h/2) k_1 - (3h/2) k_3 + 2h k_4
    k_5 = f(ỹ_{n+1}, t_{n+1})
    y_{n+1} = y_n + (h/6) k_1 + (2h/3) k_4 + (h/6) k_5.

Note that one extra computation of f is required in this method as compared with the method given in (8.11).
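A single Merson step is again straightforward to transcribe; the following Matlab sketch (not from the original text; the function name rkm4step is hypothetical) returns both the update y_{n+1} and the difference ỹ_{n+1} - y_{n+1}, which is used below to estimate the error:

% rkm4step.m  (illustrative sketch of one step of the Runge-Kutta-Merson scheme)
function [ynp1, dy] = rkm4step(f, yn, tn, h)
k1 = f(yn,                   tn);
k2 = f(yn + h/3*k1,          tn + h/3);
k3 = f(yn + h/6*(k1 + k2),   tn + h/3);
k4 = f(yn + h/8*(k1 + 3*k3), tn + h/2);
ytilde = yn + h/2*k1 - 3*h/2*k3 + 2*h*k4;
k5 = f(ytilde,               tn + h);
ynp1 = yn + h/6*k1 + 2*h/3*k4 + h/6*k5;
dy = ytilde - ynp1;          % proportional to the leading-order error in ynp1 (see below)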

With the same sort of analysis as we did for RK2, it may be shown that both y_{n+1} and ỹ_{n+1} are "stepwise 5th-order accurate", meaning that using either to advance in time over a given interval gives global 4th-order accuracy. In fact, if y(t) is the exact solution to the ODE and y_n takes this exact value y(t_n) at t = t_n, then it follows after a bit of analysis that the errors in ỹ_{n+1} and y_{n+1} are

    ỹ_{n+1} - y(t_{n+1}) = (h^5/120) y^(v) + O(h^6),                           (8.12)
    y_{n+1} - y(t_{n+1}) = (h^5/720) y^(v) + O(h^6).                           (8.13)

Subtracting (8.13) from (8.12) gives

    ỹ_{n+1} - y_{n+1} = (h^5/144) y^(v) + O(h^6),                              (8.14)

which may be solved for h^5 y^(v) and substituted on the RHS of (8.13) to give

    y_{n+1} - y(t_{n+1}) = (1/5) (ỹ_{n+1} - y_{n+1}) + O(h^6).                 (8.15)

The quantity on the LHS of (8.15) is the error of our current "best guess" y_{n+1}; the first term on the RHS is something we can compute. Thus, even if we don't know the exact solution y(t), we can still estimate the error of our best guess y_{n+1} using quantities which we have already computed. We may use this estimate to decide whether to refine or coarsen the stepsize h to attain a desired degree of accuracy over the entire interval. As with the procedure of adaptive quadrature, it is straightforward to determine whether or not the error on any particular step is small enough such that, when the entire (global) error is added up, it will be within a predefined acceptable level of tolerance.
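A simple driver built on the rkm4step sketch given above might adapt the stepsize as follows for a scalar ODE (again a sketch, not from the original text; the halving/doubling strategy and the coarsening threshold are ad hoc illustrative choices):

% adaptive_rkm4.m  (illustrative sketch of stepsize control based on (8.15))
function [t, y] = adaptive_rkm4(f, y0, t0, tf, h0, tol)
t = t0;   y = y0;   h = h0;   n = 1;
while t(n) < tf
   h = min(h, tf - t(n));                       % do not step past the end of the interval
   [ynew, dy] = rkm4step(f, y(n), t(n), h);
   err = abs(dy)/5;                             % error estimate for ynew, from (8.15)
   if err > tol
      h = h/2;                                  % step rejected: retry with a smaller h
   else
      n = n + 1;   t(n) = t(n-1) + h;   y(n) = ynew;
      if err < tol/32, h = 2*h; end             % error well below tolerance: coarsen
   end
end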

8.6.4 A low-storage Runge-Kutta method (RKW3)

Amongst people doing very large simulations with specialized solvers, a third-order scheme which is rapidly gaining popularity, known as the Runge-Kutta-Wray method, is

    k_1 = f(y_n, t_n)
    k_2 = f(y_n + β_1 h k_1, t_n + α_1 h)
    k_3 = f(y_n + β_2 h k_1 + β_3 h k_2, t_n + α_2 h)                          (8.16)
    y_{n+1} = y_n + γ_1 h k_1 + γ_2 h k_2 + γ_3 h k_3

where

    β_1 = 8/15,    β_2 = 1/4,    β_3 = 5/12,
    α_1 = 8/15,    α_2 = 2/3,
    γ_1 = 1/4,     γ_2 = 0,      γ_3 = 3/4.

Figure 8.9: Stability of the numerical solution to y' = λ y in the complex plane λh using third-order Runge-Kutta.

A derivation similar to that in 8.6.1 confirms that the constants chosen in (8.16) indeed provide third-order accuracy, with the amplification factor for the model problem again given by a truncated Taylor series of the exact value:

    σ = 1 + λh + (λh)^2/2 + (λh)^3/6.

Over a large number of timesteps, the method is stable iff |σ| ≤ 1; the domain of stability of this method is illustrated in Figure 8.9.

Amazingly, though this scheme is third-order accurate, it requires no more memory than explicit Euler. This scheme is particularly useful because it may be computed in the following order:

    k_1 = f(y_n, t_n)
    y* = y_n + β_1 h k_1                  (overwrite y_n)
    y** = y* + ζ_1 h k_1                  (overwrite k_1)
    k_2 = f(y*, t_n + α_1 h)              (overwrite y*)
    y** = y** + β_3 h k_2                 (update y**)
    y_{n+1} = y** + ζ_2 h k_2             (overwrite k_2)
    k_3 = f(y**, t_n + α_2 h)             (overwrite y**)
    y_{n+1} = y_{n+1} + γ_3 h k_3         (update y_{n+1})

with ζ_1 = -17/60 and ζ_2 = -5/12. Verification that these two forms of the RKW3 scheme are identical is easily obtained by substitution.

The above ordering of the RKW3 scheme is useful when the dimension N of the vector y is huge. In the above scheme, we first compute the vector k_1 (also of dimension N) as we do, for example, for explicit Euler. Every operation that follows, however, either updates an existing vector or may overwrite the memory of an existing vector. (Pointers are very useful to set up multiple variables at the same memory location.) Thus, we only need enough room in the computer for two vectors of length N, not the four that one might expect are needed upon examination of (8.16). Note that one has to be very careful when computing an operation like f(y*) in place (i.e., in a manner which immediately overwrites y*) when the function f is a complicated nonlinear function.

Low-storage schemes such as this are essential in both computational mechanics (e.g., the finite element modeling of the stresses and temperatures of an engine block) and computational fluid dynamics (e.g., the finite-difference simulation of the turbulent flow in a jet engine). In these and many other problems of engineering interest, accurate discretization of the governing PDE necessitates big state vectors and efficient numerical methods. It is my hope that, in the past 10 weeks, you have begun to get a flavor for how such numerical techniques can be derived and implemented.
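As a closing illustration, the two-register ordering above can be transcribed into Matlab as follows (a sketch, not from the original text; the function name rkw3step is hypothetical, and note that Matlab itself does not truly operate on arrays in place, so the memory savings are only realized in a compiled language using pointers):

% rkw3step.m  (illustrative sketch of the two-register RKW3 ordering)
% Only two arrays of the dimension of y are carried: "y" and "k".
function y = rkw3step(f, y, tn, h)
k = f(y, tn);                 % k holds k1
y = y + 8/15*h*k;             % y holds the argument of k2            (overwrites yn)
k = y - 17/60*h*k;            % k holds the partial sum yn + h*k1/4   (overwrites k1)
y = f(y, tn + 8/15*h);        % y holds k2                            (overwrites its argument)
k = k + 5/12*h*y;             % k holds the argument of k3            (update)
y = k - 5/12*h*y;             % y holds yn + h*k1/4 again             (overwrites k2)
k = f(k, tn + 2/3*h);         % k holds k3                            (overwrites its argument)
y = y + 3/4*h*k;              % y holds yn+1                          (update)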

Appendix A

Getting started with Matlab

A.1 What is Matlab?

Matlab, short for Matrix Laboratory, is a high-level language excellent for, among many other things, linear algebra, data analysis, two- and three-dimensional graphics, and numerical computation of small-scale systems. In addition, extensive "toolboxes" are available which contain useful routines for a wide variety of disciplines, including control design, signal processing, optimization, system identification, and adaptive filters. Because a lot of problems encountered in the engineering world are fairly small, Matlab is an important tool for the engineer to have in his or her arsenal. Matlab is a very rapid way to solve problems which don't require intensive computations, or to create plots of functions or data sets. Matlab is also useful as a simple programming language in which one can experiment (on small problems) with efficient numerical algorithms (designed for big problems) in a user-friendly environment. Problems which do require intensive computations, such as those often encountered in industry, are more efficiently solved in other languages, such as Fortran 90. In short, it is highly recommended that you get your own (legal) copy of Matlab to take with you when you leave UCSD.

A.2 Where to find Matlab

For those who own a personal computer, the "student edition" of Matlab is available at the bookstore for just over what it costs to obtain the manual, which is included with the program itself. This is a bargain price (less than $100) for excellent software, limited only in its printing capabilities and the maximum matrix size allowed.¹ For those who don't own a personal computer, or if you choose not to make this investment, you will find that Matlab is available on a wide variety of platforms around campus. The most stable versions are on various Unix machines around campus, including the ACS Unix boxes in the student labs (accessible by undergrads) and most of the Unix machines in the research groups of the MAE department. There are also several Macs and PCs in the department (including those run by ACS) with Matlab already loaded. Unix machines running Matlab can be remotely accessed with any computer running X-windows on campus. Note that the use of m-files (described at the end of this chapter) allows you, after debugging your code on your PC, to rerun finished Matlab codes at high resolution on the larger Unix machines to get report-quality printouts, if so desired.

¹ The student edition is actually an almost full-fledged version of the latest release of the professional version of Matlab.

If you purchase the student edition, full instructions on how to install the program will be included in the manual. Instructions on how to open an account on one of the ACS Unix machines will be discussed in class.

A.3 How to start Matlab

Running Matlab is the same as running any software on the respective platform. On a Mac or PC, just find the Matlab icon and double click on it. To run Matlab once logged in to one of the Unix machines, just type matlab. The PATH variable should be set correctly so it will start on the first try; if it doesn't, that machine probably doesn't have Matlab. If running Matlab remotely, the DISPLAY environmental variable must be set to the local computer before starting Matlab for the subsequent plots to appear on your screen. For example, if you are sitting at a machine named turbulence, you need to:

A) log in to turbulence,
B) open a window and telnet to an ACS machine on which you have an account: telnet iacs5
C) once logged in, set the DISPLAY environmental variable with setenv DISPLAY turbulence:0.0
D) run Matlab.

A.4 How to run Matlab: the basics

Now that you have found Matlab and got it running, you are probably on a roll and will stop reading the manual and this handout (you are, after all, an engineer!). The following section is designed to give you just enough knowledge of the language to be dangerous. From there, you can: A) read the manual, available with the student edition or on its own at the bookstore, for a fairly thorough description of how any particular command works, or B) use the online help, by typing help <command name>. Don't be timid to use this approach, as the online help in Matlab is very extensive.

To begin with, Matlab can function as an ordinary (but expensive!) calculator. At the >> prompt, try typing

>> 1+1

Matlab will reassure you that the universe is still in good order:

ans =
     2

To enter a matrix, type

>> A = [1 2 3; 4 5 6; 7 8 0]

Matlab responds with

A =
     1     2     3
     4     5     6
     7     8     0

Matrix elements are separated by spaces or commas, and a semicolon indicates the end of a row. By default, Matlab operates in an echo mode; that is, the elements of a matrix or vector will be printed out as it is created. This may become tedious for large operations. To suppress it, type a semicolon after entering commands, as in:

>> A = [1 2 3; 4 5 6; 7 8 0];

When typing in a long expression, three periods in a row means that the present command is continued on the following line; this can greatly improve the legibility of what you type. Elements of a matrix can also be arithmetic expressions, such as 3*pi, 5*sqrt(3), etc. A column vector may be constructed with

>> y = [7 8 9]'

resulting in

y =
     7
     8
     9

To automatically build a row vector, a statement such as

>> z = [0:2:10]

results in

z =
     0     2     4     6     8    10

Vector y can be premultiplied by matrix A and the result stored in vector z with

>> z = A*y

Multiplication of a vector by a scalar may be accomplished with

>> z = 3.0*y

Matrix A may be transposed as

>> C = A'

The inverse of a matrix is obtained by typing

>> D = inv(A)

A 5 x 5 identity matrix may be constructed with

>> E = eye(5)

Tridiagonal matrices may be constructed by the following command and variations thereof:

>> 1*diag(ones(m-1,1),-1) + 4*diag(ones(m,1),0) + 1*diag(ones(m-1,1),1)

There are two "matrix division" symbols in Matlab, \ and /. If A is a nonsingular square matrix, then A\B and B/A correspond formally to left and right multiplication of B (which must be of the appropriate size) by the inverse of A, that is, inv(A)*B and B*inv(A), but the result is obtained directly (via Gaussian elimination with partial pivoting) without the computation of the inverse, which is a very expensive computation. Thus, to solve a system A*x = b for the unknown vector x, type

>> A = [1 2 3; 4 5 6; 7 8 0]
>> b = [5 8 -7]'
>> x = A\b

which results in

x =
    -1
     0
     2

To check this result, just type

>> A*x

which verifies that

ans =
     5
     8
    -7

Note that the matrix sizes must be correct for the requested operations to be defined, or an error will result.

The usual precedence rules are observed by Matlab. Starting with the innermost group of operations nested in parentheses and working outward, all the exponentials are calculated first. Then, all the multiplications and divisions are calculated. Finally, all the additions and subtractions are calculated. In each of these three categories, the calculation proceeds from left to right through the expression. Thus

>> 5/5*3
ans =
     3

and

>> 5/(5*3)
ans =
    0.3333

It is best to make frequent use of parentheses to ensure the order of operations is as you intend.

Suppose we have two vectors x and y and we wish to perform the componentwise operations z_i = x_i * y_i for i = 1, ..., n. The Matlab command to execute this is

>> x = [1:5]'
>> y = [6:10]'
>> z = x.*y

however.U. else j = 0. an upper triangular matrix U. Note in this example that commas are used to include several commands on a single line. so z = x'*y is successful. x(i) = i end. and a permutation matrix such that P*X = L*U L = 1. elseif n < 0. HOW TO RUN MATLAB|THE BASICS 5 Note that z = x*y will result in an error. These are summarized in the following section|don't be concerned if most of these functions are unfamiliar to you now.5714 U = 7. Try it!) The period distinguishes matrix operations (* ^ and /) from component-wise operations (. One command which we will encounter early in the quarter is LU decomposition: >> A= 1 2 3 4 5 6 >> L. Matlab also has control ow statements.5000 . m Matlab has several advanced matrix functions which will become useful to you as your knowledge of linear algebra grows.1429 0. x = 1 2 3 4 5 6 Each for must be matched by an end. but exits at the control of a logical condition: >> m = 0 >> while m < 7. and the trailing x is used to write the nal value of x to the screen. An if statement may be used as follows: >> n = 7 >> if n > 0.0000 0.8571 0 0 3.0000 0. One of the most common bugs when starting out with Matlab is de ning and using row vectors where column vectors are in fact needed.^ and . m = m+2 end. not a column vector. Note also that this for loop builds a row vector. (A row vector times a column vector. j = 1. since this implies matrix multiplication and is unde ned in this case.0000 4.4. similar to many programming languages: >> for i = 1:6.0000 0.0000 8.* . j = -1. end The format of a while statement is similar to that of the for.5000 0 0 1./).0000 0 0 P = 0 1 0 0 0 1 1 0 0 0 1. x creates.A.P]=lu(A) P 7 8 0] This results in a lower triangular matrix L. is a well de ned operation.

Checking this with

>> P\L*U

confirms that the original matrix A is recovered.

A.5 Commands for matrix factoring and decomposition

The following commands are but a small subset of what Matlab has to offer:

>> R = chol(X) produces an upper triangular R so that R'*R = X. If X is not positive definite, an error message is printed.

>> [L,U,P] = lu(X) produces a lower triangular matrix L, an upper triangular matrix U, and a permutation matrix P so that P*X = L*U.

>> [V,D] = eig(X) produces a diagonal matrix D of eigenvalues and a full matrix V whose columns are the corresponding eigenvectors so that X*V = V*D.

>> [P,H] = hess(A) produces a unitary matrix P and a Hessenberg matrix H so that A = P*H*P' and P'*P = eye(size(P)).

>> p = poly(A) If A is an N by N matrix, poly(A) is a row vector with N+1 elements which are the coefficients of the characteristic polynomial det(lambda*eye(size(A)) - A). If A is a vector, poly(A) is a vector whose elements are the coefficients of the polynomial whose roots are the elements of A.

>> Q = orth(A) produces an orthonormal basis for the range of A. Note that Q'*Q = I, the columns of Q span the same space as the columns of A, and the number of columns of Q is the rank of A.

>> [Q,R] = qr(X) produces an upper triangular matrix R of the same dimension as X and a unitary matrix Q so that X = Q*R.

>> [U,S,V] = svd(X) produces a diagonal matrix S, of the same dimension as X and with nonnegative diagonal elements in decreasing order, and unitary matrices U and V so that X = U*S*V'.
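For example (an illustrative usage, not from the original text), the factors returned by lu may be reused to solve A*x = b with two inexpensive triangular solves rather than recomputing the factorization:

>> A = [1 2 3; 4 5 6; 7 8 0];  b = [5 8 -7]';
>> [L,U,P] = lu(A);
>> x = U \ (L \ (P*b))

The same L, U, and P can then be reused for any additional right-hand sides.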

A.6 Commands used in plotting

For a two dimensional plot, try for example:

>> x = [0:.1:10];
>> y1 = sin(x);
>> y2 = cos(x);
>> plot(x,y1,'-',x,y2,'x')

This results in a plot of sin(x) (as a solid line) and cos(x) (as 'x' markers) versus x.

Three dimensional plots are also possible:

>> [x,y] = meshgrid(-8:.5:8, -8:.5:8);
>> R = sqrt(x.^2 + y.^2) + eps;
>> Z = sin(R)./R;
>> mesh(Z)

This results in a three-dimensional mesh plot of the surface Z.

Axis rescaling and labelling can be controlled with loglog, semilogx, semilogy, title, xlabel, and ylabel; see the help pages for more information.
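For instance (an illustrative combination of the commands just listed, not from the original text), a labelled semilog plot might be produced with:

>> x = [0:.1:10];
>> semilogy(x, exp(-x), '-')
>> title('Exponential decay'), xlabel('x'), ylabel('exp(-x)')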

A.7 Other Matlab commands

Typing who lists all the variables created up to that point; typing whos gives detailed information showing the sizes of the arrays and vectors. Some additional useful functions, which are for the most part self-explanatory (check the help page if not), are: abs, conj, sin, cos, tan, asin, acos, atan, sinh, cosh, tanh, exp, log, log10, and eps. Matlab also has many special functions built in to aid in linear problem-solving; an index of these, along with their usage, is in the Matlab manual. In order to save the entire variable set computed in a session, type save session prior to quitting. All variables will be saved in a file named session.mat. At a later time, the session may be resumed by typing load session. Save and load commands are menu-driven on the Mac and PC.

A.8 Hardcopies

Hardcopies of graphs on the Unix stations are best achieved using the print command. This creates postscript files which may then be sent directly to a postscript printer using the local print command (usually lp or lpr) at the Unix level. If using the -deps option of the print command in Matlab, then the encapsulated postscript file (e.g., figure1.eps) may be included in TEX documents, as done here. Hardcopies of graphs on a Mac may be obtained from the options available under the 'File' menu. On the student edition for the PC, the prtscr command may be used for a screen dump. Text hardcopy on all platforms is best achieved by copying the text in the Matlab window, pasting it into the editor of your choosing, and then printing from there.

A.9 Matlab programming procedures: m-files

In addition to the interactive, or "command", mode, you can execute a series of Matlab commands that are stored in 'm-files', which have the file extension '.m'. It is important to note that these should be plain ASCII text files; type in the sequence of commands exactly as you would if running interactively. Any text editor may be used to generate m-files. To execute the commands in an m-file called sample.m, simply type sample at the Matlab >> prompt. When working on more complicated problems, it is a very good idea to work from m-files so that set-up commands don't need to be retyped repeatedly. The % symbol is used in these files to indicate that the rest of that line is a comment; comment all m-files clearly, succinctly, and sufficiently, so that you can come back to the code later and understand how it works.

Note that there are two types of m-files: scripts and functions. A script is just a list of Matlab commands which run just as if you had typed them in one at a time (though it sounds pedestrian, this simple method is a useful and straightforward mode of operation in Matlab). A function is a set of commands that is "called" (as in a real programming language), only inheriting those variables in the argument list with which it is called and only returning the variables specified in the function declaration. These two types of m-files will be illustrated thoroughly by example as the course proceeds; an example m-file is shown in the following section. Note that an m-file may call other m-files, including m-files stored in subdirectories. On Unix machines, Matlab should be started from the directory containing your m-files so that Matlab can find these files. I recommend making a new directory for each new problem you work on, to keep things organized and to keep from overwriting previously-written m-files. Note that many of the 'built-in' functions are just prewritten m-files
and can easily be opened and accessed by the user for examination.le is shown in the following section.mat. sin. A.
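As a brief illustration of a function m-file (a hypothetical example, not from the original text), the file addtwo.m below returns only the variables named in its declaration:

% addtwo.m   (hypothetical example of a function m-file)
% Returns the sum and difference of its two input arguments.
function [s, d] = addtwo(a, b)
s = a + b;      % only s and d are passed back to the caller
d = a - b;      % any other variables created here remain local to the function

It would be called from the >> prompt (or from another m-file) as, e.g., >> [s,d] = addtwo(3,5).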

On the Mac and the PC, the search path must be updated using the path command after starting Matlab so that Matlab can find your m-files (see help page for details).

A.10 Sample m-file

% sample.m
echo on, clc
% This code uses MATLAB's eig program to find eigenvalues of
% a few random matrices which we will construct with special
% structure.
%
% press any key to begin
pause
R = rand(4)
eig(R)
% As R has random entries, it may have real or complex eigenvalues.
% press any key
pause
eig(R + R')
% Notice that R + R' is symmetric, with real eigenvalues.
% press any key
pause
eig(R - R')
% The matrix R - R' is skew-symmetric, with imaginary eigenvalues.
% press any key
pause
[V,D] = eig(R'*R)
% This matrix R'*R is symmetric, with real POSITIVE eigenvalues.
% press any key
pause
[V,D] = eig(R*R')
% R*R' has the same eigenvalues as R'*R. But different eigenvectors!
% press any key
pause
% You can create matrices with desired eigenvalues 1,2,3,4 from any
% invertible S times lambda times S inverse.
S = rand(4)
lambda = diag([1 2 3 4])
A = S * lambda * inv(S)
eig(A)
echo off
% end sample.m