
**Beginning Scientific Computing**

J. Nathan Kutz

January 5, 2005

Abstract

This course is a survey of basic numerical methods and algorithms used for linear algebra, ordinary and partial differential equations, data manipulation and visualization. Emphasis will be on the implementation of numerical schemes for practical problems in the engineering and physical sciences. Full use will be made of MATLAB and its programming functionality.

These notes are intended as the primary source of information for AMATH 301. The notes are incomplete and may contain errors. For any use aside from classroom purposes and personal research, please contact me at kutz@amath.washington.edu. © J. N. Kutz, Autumn 2003 (Version 1.0)

Department of Applied Mathematics, Box 352420, University of Washington, Seattle, WA 98195-2420 (kutz@amath.washington.edu).


AMATH 301 (© J. N. Kutz)

Contents

1 MATLAB Introduction
  1.1 Vectors and Matrices
  1.2 Logic, Loops and Iterations
  1.3 Plotting and Importing/Exporting Data

2 Linear Systems
  2.1 Direct solution methods for Ax=b
  2.2 Iterative solution methods for Ax=b
  2.3 Eigenvalues, Eigenvectors, and Solvability

3 Curve Fitting
  3.1 Least-Square Fitting Methods
  3.2 Polynomial Fits and Splines
  3.3 Data Fitting with MATLAB

4 Numerical Differentiation and Integration
  4.1 Numerical Differentiation
  4.2 Numerical Integration
  4.3 Implementation of Differentiation and Integration

5 Differential Equations
  5.1 Initial value problems: Euler, Runge-Kutta and Adams methods
  5.2 Error analysis for time-stepping routines
  5.3 Boundary value problems: the shooting method
  5.4 Implementation of the shooting method
  5.5 Boundary value problems: direct solve and relaxation
  5.6 Implementing the direct solve method

6 Fourier Transforms
  6.1 Basics of the Fourier Transform
  6.2 Spectral Analysis
  6.3 Applications of the FFT

7 Partial Differential Equations
  7.1 Basic time- and space-stepping schemes
  7.2 Time-stepping schemes: explicit and implicit methods
  7.3 Implementing Time- and Space-Stepping Schemes


1 MATLAB Introduction

The first three sections of these notes deal with the preliminaries necessary for manipulating data, constructing logical statements and plotting data. The remainder of the course relies heavily on proficiency with these basic skills.

1.1 Vectors and Matrices

The manipulation of matrices and vectors is of fundamental importance for MATLAB proficiency. This section deals with the construction and manipulation of basic matrices and vectors. We begin by considering the construction of row and column vectors. Vectors are simply a subclass of matrices which have only a single column or row. A row vector can be created by executing the command structure

>>x=[1 3 2]

which creates the row vector

x = (1 3 2) .   (1.1.1)

Row vectors are oriented in a horizontal fashion. In contrast, column vectors are oriented in a vertical fashion and are created with either of the two following command structures:

>>x=[1; 3; 2]

where the semicolons indicate advancement to the next row. Otherwise a return can be used in MATLAB to initiate the next row. Thus the following command structure is equivalent:

>>x=[1
3
2]

Either one of these creates the column vector

x = \begin{pmatrix} 1 \\ 3 \\ 2 \end{pmatrix} .   (1.1.2)

All row and column vectors can be created in this way.

Vectors can also be created by use of the colon. Thus the command line

>>x=0:1:10

creates a row vector which goes from 0 to 10 in steps of 1, so that

x = (0 1 2 3 4 5 6 7 8 9 10) .   (1.1.3)

Note that the command line

>>x=0:2:10

creates a row vector which goes from 0 to 10 in steps of 2, so that

x = (0 2 4 6 8 10) .   (1.1.4)

Steps of integer size need not be used. This allows

>>x=0:0.2:1

to create a row vector which goes from 0 to 1 in steps of 0.2, so that

x = (0 0.2 0.4 0.6 0.8 1) .   (1.1.5)

Matrices are just as simple to generate. Now there are a specified number of rows (N) and columns (M). Such a matrix is referred to as an N×M matrix. The 3×3 matrix

A = \begin{pmatrix} 1 & 3 & 2 \\ 5 & 6 & 7 \\ 8 & 3 & 1 \end{pmatrix}   (1.1.6)

can be created by use of the semicolons

>>A=[1 3 2; 5 6 7; 8 3 1]

or by using the return key so that

>>A=[1 3 2
5 6 7
8 3 1]

In this case, the matrix is square with an equal number of rows and columns (N = M).

Accessing components of a given matrix or vector is a task that relies on knowing the row and column of a certain piece of data. The coordinate of any data in a matrix is found from its row and column location, so that the elements of a matrix A are given by

A(i, j)   (1.1.7)

where i denotes the row and j denotes the column. To access the element in the second row and third column of (1.1.6), which takes the value of 7, use the command


>>x=A(2,3)

This will return the value x = 7. To access an entire row, the use of the colon is required:

>>x=A(2,:)

This will access the entire second row and return the row vector

x = (5 6 7) .   (1.1.8)

Columns can similarly be extracted. Thus

>>x=A(:,3)

will access the entire third column to produce

x = \begin{pmatrix} 2 \\ 7 \\ 1 \end{pmatrix} .   (1.1.9)

More sophisticated data extraction and manipulation can be performed with the aid of colons. To show examples of these techniques, we consider the 4×4 matrix

B = \begin{pmatrix} 1 & 7 & 9 & 2 \\ 2 & 3 & 3 & 4 \\ 5 & 0 & 2 & 6 \\ 6 & 1 & 5 & 5 \end{pmatrix} .   (1.1.10)

The command

>>x=B(2:3,2)

extracts the second through third rows of column 2. This produces the column vector

x = \begin{pmatrix} 3 \\ 0 \end{pmatrix} .   (1.1.11)

The command

>>x=B(4,2:end)

extracts the second through last columns of row 4. This produces the row vector

x = (1 5 5) .   (1.1.12)

We can also extract a specified number of rows and columns. The command

>>C=B(1:end-1,2:4)


extracts the first row through the next-to-last row along with the second through fourth columns. This then produces the matrix

C = \begin{pmatrix} 7 & 9 & 2 \\ 3 & 3 & 4 \\ 0 & 2 & 6 \end{pmatrix} .   (1.1.13)

As a last example, we make use of the transpose symbol, which turns row vectors into column vectors and vice-versa. In this example, the command

>>D=[B(1,2:4); B(1:3,3).']

makes the first row of D the second through fourth columns of the first row of B. The second row of D, which is initiated with the semicolon, is made from the transpose (.') of the first three rows of the third column of B. This produces the matrix

D = \begin{pmatrix} 7 & 9 & 2 \\ 9 & 3 & 2 \end{pmatrix} .   (1.1.14)

An important comment about the transpose function is in order. In particular, when transposing a vector with complex numbers, the period must be put in before the ' symbol. Specifically, when considering the transpose of the column vector

x = \begin{pmatrix} 3+2i \\ 1 \\ 8 \end{pmatrix} ,   (1.1.15)

where i is the complex (imaginary) number i = \sqrt{-1}, the command

>>y=x.'

produces the row vector

y = (3+2i  1  8) ,   (1.1.16)

whereas the command

>>y=x'

produces the row vector

y = (3-2i  1  8) .   (1.1.17)

Thus the use of the ' symbol alone also conjugates the vector, i.e. it changes the sign of the imaginary part.


1.2 Logic, Loops and Iterations

The basic building blocks of any MATLAB program are for loops and if statements. They form the background for carrying out the complicated manipulations required of sophisticated and simple codes alike. This lecture will focus on the use of these two ubiquitous logic structures in programming.

To illustrate the use of the for loop structure, we consider some very basic programs which revolve around its implementation. We begin by constructing a loop which will recursively sum a series of numbers. The basic format for such a loop is the following:

sum=0
for j=1:5
  sum=sum+j
end

This program begins with the variable sum being zero. It then proceeds to go through the for loop five times, i.e. the counter j takes the values one, two, three, four and five in succession. In each pass through the loop, the value of sum is updated by adding the current value of j. Thus starting with an initial value of sum equal to zero, we find that sum is equal to 1 (j = 1), 3 (j = 2), 6 (j = 3), 10 (j = 4), and 15 (j = 5).

The default incremental increase in the counter j is one. However, the increment can be specified as desired. The program

sum=0
for j=1:2:5
  sum=sum+j
end

is similar to the previous program, but in this case the incremental steps in j are specified to be two. Thus starting with an initial value of sum equal to zero, we find that sum is equal to 1 (j = 1), 4 (j = 3), and 9 (j = 5). Even more generally, the for loop counter can simply be given by a row vector. As an example, the program

sum=0
for j=[1 5 4]
  sum=sum+j
end

will go through the loop three times with the values of j being 1, 5, and 4 successively. Thus starting with an initial value of sum equal to zero, we find that sum is equal to 1 (j = 1), 6 (j = 5), and 10 (j = 4).

The if statement is similarly implemented. However, unlike the for loop, a series of logic statements is usually considered. The basic format for this logic is as follows:


if (logical statement)
  (expressions to execute)
elseif (logical statement)
  (expressions to execute)
elseif (logical statement)
  (expressions to execute)
else
  (expressions to execute)
end

In this logic structure, the last set of expressions is executed if the three previous logical statements do not hold.

In practice, the logical if may only be required to execute a command if something is true, in which case there is no need for the else structure. Such a logic format might be as follows:

if (logical statement)
  (expressions to execute)
elseif (logical statement)
  (expressions to execute)
end

In such a structure, no commands are executed if the logical statements do not hold.

To make a practical example of the use of for and if statements, we consider the bisection method for finding zeros of a function. In particular, we will consider the transcendental function

exp(x) − tan(x) = 0   (1.2.1)

for which the values of x which make this true must be found computationally. We can begin to get an idea of where the relevant values of x are by plotting this function. The following MATLAB script will plot the function over the interval x ∈ [−10, 10].

clear all % clear all variables
close all % close all figures
x=-10:0.1:10; % define plotting range
y=exp(x)-tan(x); % define function to consider
plot(x,y) % plot the function

It should be noted that tan(x) takes on the value of ±∞ at π/2 + nπ, where n = ..., −2, −1, 0, 1, 2, .... By zooming in to smaller values of the function, one can find that there are a large number (infinite) of roots to this equation. In particular, there is a root located in the interval x ∈ [−4, −2.8]. At x = −4, the function exp(x) − tan(x) > 0, while at x = −2.8 the function exp(x) − tan(x) < 0. Thus, in between these points lies a root.

The bisection method simply cuts the given interval in half and determines if the root is on the left or right side of the cut. Once this is established, a new interval is chosen with the midpoint now becoming the left or right end of the new domain, depending of course on the location of the root. This process is repeated until the interval has become sufficiently small and the function considered has come to within some tolerance of zero. The following algorithm uses an outer for loop to continue cutting the interval in half while the embedded if statement determines the new interval of interest. A second if statement is used to ensure that once a certain tolerance has been achieved, i.e. the absolute value of the function exp(x) − tan(x) is less than 10^{−5}, the iteration process is stopped.

bisection.m

clear all % clear all variables
xr=-2.8; % initial right boundary
xl=-4; % initial left boundary
for j=1:1000 % j cuts the interval
  xc=(xl+xr)/2; % calculate the midpoint
  fc=exp(xc)-tan(xc); % calculate function
  if fc>0
    xl=xc; % move left boundary
  else
    xr=xc; % move right boundary
  end
  if abs(fc)<10^(-5)
    break % quit the loop
  end
end
xc % print value of root
fc % print value of function

Note that the break command ejects you from the current loop; in this case, that is the j loop. This effectively stops the iteration procedure for cutting the intervals in half. Further, extensive use has been made of the semicolons at the end of each line. The semicolon simply suppresses output to the computer screen, saving valuable time and clutter.

1.3 Plotting and Importing/Exporting Data

The graphical representation of data is a fundamental aspect of technical scientific communication. MATLAB provides an easy and powerful interface for plotting and representing a large variety of data formats. Like all MATLAB structures, plotting is dependent on the effective manipulation of vectors and matrices.

To begin the consideration of graphical routines, we first construct a set of functions to plot and manipulate. To define a function, the plotting domain must first be specified. This can be done by using a routine which creates a row vector:

x1=-10:0.1:10;
y1=sin(x1);

Here the row vector x1 spans the interval x ∈ [−10, 10] in steps of ∆x = 0.1. The second command creates a row vector y1 which gives the values of the sine function evaluated at the corresponding values of x1. A basic plot of this function can then be generated by the command

plot(x1,y1)

Note that this graph lacks a title and axis labels. These are important in generating high-quality graphics.

The preceding example considers only equally spaced points. However, MATLAB will generate function values at any specified coordinate values. For instance, the two lines

x2=[-5 sqrt(3) pi 4];
y2=sin(x2);

will generate the values of the sine function at x = −5, \sqrt{3}, π and 4. The linspace command is also helpful in generating a domain for defining and evaluating functions. The basic procedure for this function is as follows:

x3=linspace(-10,10,64);
y3=x3.*sin(x3);

This will generate a row vector x3 which goes from −10 to 10 in 64 equally spaced points. This can often be a helpful function when considering intervals where the number of discretized steps gives a complicated ∆x. In this example, we are considering the function x sin x over the interval x ∈ [−10, 10]. To compute it, the period must be included before the multiplication sign; the .* operator performs a component-by-component multiplication, thus creating a row vector with the values of x sin x.

To plot all of the above data sets in a single plot, we can use either one of the following routines.

figure(1)
plot(x1,y1), hold on
plot(x2,y2)
plot(x3,y3)

In this case, the hold on command is necessary after the first plot so that the second plot command will not overwrite the first data plotted. This will plot all three data sets on the same graph with a default blue line. This graph will be figure 1. Alternatively, all the data sets can be plotted on the same graph with

figure(2)
plot(x1,y1,x2,y2,x3,y3)

For this case, the three pairs of vectors are prescribed within a single plot command. The figure generated will be figure 2. An advantage of this method is that the three data sets will be of different colors, which is better than having them all in the default color of blue.

This is only the beginning of the plotting and visualization process. Many aspects of the graph must be augmented with specialty commands in order to more accurately relay the information. Of significance is the ability to change the line colors and styles of the plotted data. By using the help plot command, a list of options for customizing data representation is given. In the following, a new figure is created which is customized as follows:

figure(3)
plot(x1,y1,x2,y2,'g*',x3,y3,'mO:')

This will create the same plot as in figure 2, but now the second data set is represented by green * markers and the third data set is represented by a magenta dotted line with the actual data points given by magenta hollow circles. This kind of customization greatly helps distinguish the various data sets and their individual behaviors.

Labeling the axes and placing a title on the figure is also of fundamental importance. This can be easily accomplished with the commands

xlabel('x values')
ylabel('y values')
title('Example Graph')

The strings given within the ' signs are then printed in a centered location along the x-axis, the y-axis, and the title location, respectively.

A legend is then important to distinguish and label the various data sets which are plotted together on the graph. Within the legend command, the position of the legend can be easily specified. The command help legend gives a summary of the possible legend placement positions, one of which is to the right of the graph, where it does not interfere with the graph data. To place a legend in the above graph in an optimal location, the following command is used.

legend('Data set 1','Data set 2','Data set 3',0)

Here the strings correspond to the three plotted data sets in the order they were plotted. The zero at the end of the legend command is the option setting for placement of the legend box. In this case, the option zero tries to pick the best possible location which does not interfere with any of the data.

subplots

Another possible method for representing data is with the subplot command. This allows for multiple graphs to be arranged in a row and column format. In this example, we have three data sets which are under consideration. To plot each data set in an individual subplot requires three plot frames. Thus we construct a plot containing three rows and a single column. The format for this command structure is as follows:

figure(4)
subplot(3,1,1), plot(x1,y1), axis([-10 10 -10 10])
subplot(3,1,2), plot(x2,y2,'*'), axis([-10 10 -10 10])
subplot(3,1,3), plot(x3,y3,'mO:')

Note that the subplot command is of the structure subplot(row,column,graph number), where the graph number refers to the current graph being considered. In this example, the axis command has also been used to make all the graphs have the same x and y values. The command structure for the axis command is axis([xmin xmax ymin ymax]). For each subplot, the legend, xlabel, ylabel, and title commands can be used.

Remark 1: All the graph customization techniques discussed can be performed directly within the MATLAB graphics window. Simply go to the edit button and choose to edit axis properties. This will give full control of all the axis properties discussed so far and more. However, once MATLAB is shut down or the graph is closed, all the customization properties are lost and you must start again from scratch. This gives an advantage to developing a nice set of commands in a .m file to customize your graphics.


Remark 2: To put a grid on a graph, simply use the grid on command. To take it off, use grid off. The number of grid lines can be controlled from the editing options in the graphics window. This feature can aid in illustrating certain phenomena more clearly and should be considered when making plots.

Remark 3: To put a variable value in a string, the num2str (number to string) command can be used. For instance, the code

rho=1.5;
title(['value of rho=' num2str(rho)])

creates a title which reads "value of rho=1.5". Thus this command converts variable values into useful string configurations.

load, save and print

Once a graph has been made or data generated, it may be useful to save the data or print the graphs for inclusion in a write-up or presentation. The save and print commands are the appropriate commands for performing such actions. The save command will write any data out to a file for later use in a separate program or for later use in MATLAB. To save workspace variables to a file, the following command structure is used:

save filename

This will save all the current workspace variables to a binary file named filename.mat. In the preceding example, this will save all the vectors created along with any graphics settings. This is an extremely powerful and easy command to use. However, you can only access the data again through MATLAB. To recall this data at a future time, simply load it back into MATLAB with the command

load filename

This will reload into the workspace all the variables saved in filename.mat. This command is ideal when closing down operations in MATLAB and resuming at a future time.

Alternatively, it may be advantageous to save data for use in a different software package. In this case, data needs to be saved in ASCII format in order to be read by other software engines. The save command can then be modified for this purpose. The command

save x1.dat x1 -ascii

saves the row vector x1 generated previously to the file x1.dat in ASCII format (note that the filename comes first, followed by the variable to be saved). This can then be loaded back into MATLAB with the command


load x1.dat

This saving option is advantageous when considering the use of other software packages to manipulate or analyze data generated in MATLAB.

If you desire to print a figure to a file, the print command needs to be utilized. There are a large number of graphics formats which can be used. By typing help print, a list of possible options will be given. A common graphics format would involve the command

print -djpeg fig.jpg

which will print the current figure as a jpeg file named fig.jpg. Note that you can also print or save from the figure window by pressing the file button and following the links to the print or export option respectively.

2 Linear Systems

The solution of linear systems is one of the most basic aspects of computational science. In many applications, the solution technique often gives rise to a system of linear equations which needs to be solved as efficiently as possible. In addition to Gaussian elimination, there are a host of other techniques which can be used to solve a given problem. This section offers an overview of these methods and techniques.

2.1 Direct solution methods for Ax=b

A central concern in almost any computational strategy is a fast and efficient computational method for achieving a solution of a large system of equations Ax = b. In trying to render a computation tractable, it is crucial to minimize the operations it takes in solving such a system. There are a variety of direct methods for solving Ax = b: Gaussian elimination, LU decomposition, and inverting the matrix A. In addition to these direct methods, iterative schemes can also provide efficient solution techniques. Some basic iterative schemes will be discussed in what follows.

The standard beginning to discussions of solution techniques for Ax = b involves Gaussian elimination. We will consider a very simple example of a 3×3 system in order to understand the operation count and numerical procedure involved in this technique. Thus consider Ax = b with

A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \end{pmatrix} , \quad b = \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix} .   (2.1.1)


The Gaussian elimination procedure begins with the construction of the augmented matrix

[A|b] = \left[\begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 1 & 2 & 4 & -1 \\ 1 & 3 & 9 & 1 \end{array}\right]
\rightarrow \left[\begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & 1 & 3 & -2 \\ 0 & 2 & 8 & 0 \end{array}\right]
\rightarrow \left[\begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & 1 & 3 & -2 \\ 0 & 1 & 4 & 0 \end{array}\right]
\rightarrow \left[\begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & 1 & 3 & -2 \\ 0 & 0 & 1 & 2 \end{array}\right]   (2.1.2)

where at each stage the pivot of the augmented matrix is used to eliminate the entries below it. Back-substituting then gives the solution

x_3 = 2 \;\rightarrow\; x_3 = 2   (2.1.3a)
x_2 + 3x_3 = -2 \;\rightarrow\; x_2 = -8   (2.1.3b)
x_1 + x_2 + x_3 = 1 \;\rightarrow\; x_1 = 7 .   (2.1.3c)

This procedure can be carried out for any matrix A which is nonsingular, i.e. det A ≠ 0. In this algorithm, we simply need to avoid singular matrices and occasionally shift the rows around to avoid a zero pivot. Provided we do this, the method will always yield an answer.

The fact that this algorithm works is secondary to the concern of the time required in generating a solution in scientific computing. The operation count for Gaussian elimination can easily be estimated from the algorithmic procedure for an N×N matrix:

1. Movement down the N pivots.

2. For each pivot, perform N additions/subtractions across a given row.

3. For each pivot, perform the addition/subtraction down the N rows.

In total, this results in a scheme whose operation count is O(N^3). The back substitution algorithm can similarly be calculated to give an O(N^2) scheme.
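As a minimal sketch of these steps (with no row pivoting, so it assumes no zero pivots are encountered), the elimination and back substitution for the system (2.1.1) can be written in MATLAB as:

```matlab
A=[1 1 1; 1 2 4; 1 3 9]; b=[1; -1; 1];
N=3;
for k=1:N-1                    % move down the N pivots
  for i=k+1:N
    m=A(i,k)/A(k,k);           % multiplier for row i
    A(i,:)=A(i,:)-m*A(k,:);    % eliminate below the pivot
    b(i)=b(i)-m*b(k);
  end
end
x=zeros(N,1);
for i=N:-1:1                   % back substitution
  x(i)=(b(i)-A(i,i+1:N)*x(i+1:N))/A(i,i);
end
x                              % gives x=(7,-8,2) as in (2.1.3)
```

The nested loops make the O(N^3) cost of elimination visible, while the single back-substitution loop is O(N^2).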

LU Decomposition

Each Gaussian elimination operation costs O(N^3) operations. This can be computationally prohibitive for large matrices when repeated solutions of Ax = b must be found. When working with the same matrix A, however, the operation count can easily be brought down to O(N^2) using LU factorization, which splits the matrix A into a lower triangular matrix L and an upper triangular matrix U. For a 3×3 matrix, the LU factorization scheme splits A as follows:

A = LU \;\rightarrow\;
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 \\ m_{21} & 1 & 0 \\ m_{31} & m_{32} & 1 \end{pmatrix}
\begin{pmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{pmatrix} .   (2.1.4)

Thus the L matrix is lower triangular and the U matrix is upper triangular. This then gives

Ax = b \;\rightarrow\; LUx = b ,   (2.1.5)

where by letting y = Ux we find the coupled system

Ly = b \quad \text{and} \quad Ux = y ,   (2.1.6)

where the system Ly = b, i.e.

y_1 = b_1   (2.1.7a)
m_{21} y_1 + y_2 = b_2   (2.1.7b)
m_{31} y_1 + m_{32} y_2 + y_3 = b_3 ,   (2.1.7c)

can be solved by O(N^2) forward substitution, and the system Ux = y, i.e.

u_{11} x_1 + u_{12} x_2 + u_{13} x_3 = y_1   (2.1.8a)
u_{22} x_2 + u_{23} x_3 = y_2   (2.1.8b)
u_{33} x_3 = y_3 ,   (2.1.8c)

can be solved by O(N^2) back substitution. Thus once the factorization is accomplished, LU decomposition results in an O(N^2) scheme for arriving at the solution. The factorization itself is O(N^3), but you only have to do this once. Note: you should always use LU decomposition if possible. Otherwise, you are doing far more work than necessary in achieving a solution.

As an example of the application of the LU factorization algorithm, we consider the 3×3 matrix

A = \begin{pmatrix} 4 & 3 & -1 \\ -2 & -4 & 5 \\ 1 & 2 & 6 \end{pmatrix} .   (2.1.9)

The factorization starts from the matrix multiplication of the matrix A and the identity matrix I:

A = IA = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 4 & 3 & -1 \\ -2 & -4 & 5 \\ 1 & 2 & 6 \end{pmatrix} .   (2.1.10)


The factorization begins with the pivot element. To use Gaussian elimination, we would multiply the pivot by −1/2 to eliminate the first-column element in the second row. Similarly, we would multiply the pivot by 1/4 to eliminate the first-column element in the third row. These multiplicative factors are now part of the first matrix above:

A = \begin{pmatrix} 1 & 0 & 0 \\ -1/2 & 1 & 0 \\ 1/4 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 4 & 3 & -1 \\ 0 & -2.5 & 4.5 \\ 0 & 1.25 & 6.25 \end{pmatrix} .   (2.1.11)

To eliminate in the third row, we use the next pivot. This requires that we multiply by −1/2 in order to eliminate the second-column element in the third row. Thus we find

A = \begin{pmatrix} 1 & 0 & 0 \\ -1/2 & 1 & 0 \\ 1/4 & -1/2 & 1 \end{pmatrix}
\begin{pmatrix} 4 & 3 & -1 \\ 0 & -2.5 & 4.5 \\ 0 & 0 & 8.5 \end{pmatrix} .   (2.1.12)

Thus we find that

L = \begin{pmatrix} 1 & 0 & 0 \\ -1/2 & 1 & 0 \\ 1/4 & -1/2 & 1 \end{pmatrix}
\quad \text{and} \quad
U = \begin{pmatrix} 4 & 3 & -1 \\ 0 & -2.5 & 4.5 \\ 0 & 0 & 8.5 \end{pmatrix} .   (2.1.13)

It is easy to verify by direct substitution that indeed A = LU. Just like Gaussian elimination, the cost of factorization is O(N^3). However, once L and U are known, finding the solution is an O(N^2) operation.
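The direct-substitution check on the factors in (2.1.13) can be carried out in MATLAB:

```matlab
A=[4 3 -1; -2 -4 5; 1 2 6];
L=[1 0 0; -1/2 1 0; 1/4 -1/2 1];
U=[4 3 -1; 0 -2.5 4.5; 0 0 8.5];
norm(A-L*U)   % returns 0, confirming A=LU
```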

The Permutation Matrix

As will often happen with Gaussian elimination, following the above algorithm will at times result in a zero pivot. This is easily handled in Gaussian elimination by shifting rows in order to find a non-zero pivot. However, in LU decomposition, we must keep track of this row shift since it will affect the right-hand-side vector b. We can keep track of row shifts with a row permutation matrix P. Thus if we need to permute two rows, we find

Ax = b \;\rightarrow\; PAx = Pb \;\rightarrow\; LUx = Pb ,   (2.1.14)

so that PA = LU. To shift rows one and two, for instance, we would have

P = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} .   (2.1.15)

If permutation is necessary, MATLAB can supply the permutation matrix associated with the LU decomposition.


MATLAB: A\b

Given the alternatives for solving the linear system Ax = b, it is important to know how the MATLAB command structure for A\b works. The following is an outline of the algorithm performed.

1. It first checks to see if A is triangular, or some permutation thereof. If it is, then all that is needed is a simple O(N^2) substitution routine.

2. It then checks if A is symmetric, i.e. Hermitian or self-adjoint. If so, a Cholesky factorization is attempted. If A is positive definite, the Cholesky algorithm is always successful and takes half the run time of LU factorization.

3. It then checks if A is Hessenberg. If so, it can be reduced to an upper triangular matrix and solved by a substitution routine.

4. If all the above methods fail, then LU factorization is used and the forward and backward substitution routines generate a solution.

5. If A is not square, a QR (Householder) routine is used to solve the system.

6. If A is not square and sparse, a least-squares solution using QR factorization is performed.

Note that solving by x = A^{−1}b is the slowest of all methods, taking 2.5 times longer or more than A\b. It is not recommended. However, just like LU factorization, once the inverse is known it need not be calculated again. Care must also be taken when det A ≈ 0, i.e. the matrix is ill-conditioned.

MATLAB commands

The commands for executing the linear system solve are as follows:

• A\b: Solve the system in the order above.

• [L,U] = lu(A): Generate the L and U matrices.

• [L,U,P] = lu(A): Generate the L and U factorization matrices along with the permutation matrix P.
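As a sketch of the factor-once strategy, the L, U, and P factors from lu can be reused for several right-hand sides; the matrix here is the worked example (2.1.9), while the vectors b1 and b2 are arbitrary choices for illustration:

```matlab
A=[4 3 -1; -2 -4 5; 1 2 6];
b1=[1; 0; 0]; b2=[0; 1; 0];    % two different right-hand sides
[L,U,P]=lu(A);                 % O(N^3) factorization, done once
x1=U\(L\(P*b1));               % each solve is only O(N^2):
x2=U\(L\(P*b2));               % a forward then a backward substitution
```

Since L and U are triangular, MATLAB recognizes this (step 1 of the algorithm above) and uses substitution routines for each backslash.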

2.2 Iterative solution methods for Ax=b

In addition to the standard techniques of Gaussian elimination or LU decomposition for solving Ax = b, a wide range of iterative techniques are available. These iterative techniques often go under the name of Krylov space methods [6]. The idea is to start with an initial guess for the solution and develop an iterative procedure that will converge to the solution. The simplest example of this method is known as the Jacobi iteration scheme. The implementation of this scheme is best illustrated with an example. We consider the linear system

4x − y + z = 7   (2.2.1a)
4x − 8y + z = −21   (2.2.1b)
−2x + y + 5z = 15 .   (2.2.1c)

We can rewrite each equation as follows:

x = \frac{7 + y - z}{4}   (2.2.2a)
y = \frac{21 + 4x + z}{8}   (2.2.2b)
z = \frac{15 + 2x - y}{5} .   (2.2.2c)

To solve the system iteratively, we can define the following Jacobi iteration scheme based on the above:

x^{k+1} = \frac{7 + y^k - z^k}{4}   (2.2.3a)
y^{k+1} = \frac{21 + 4x^k + z^k}{8}   (2.2.3b)
z^{k+1} = \frac{15 + 2x^k - y^k}{5} .   (2.2.3c)

An algorithm is then easily implemented computationally. In particular, we would follow the structure:

1. Guess initial values: (x_0, y_0, z_0).

2. Iterate the Jacobi scheme: x_{k+1} = Ax_k.

3. Check for convergence: |x_{k+1} − x_k| < tolerance.
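For the system (2.2.1), the three-step structure above can be sketched in MATLAB as follows (the tolerance and maximum iteration count are arbitrary choices):

```matlab
x = 1; y = 2; z = 2;                 % step 1: initial guess (x0, y0, z0)
tol = 1e-8;
for k = 1:100                        % step 2: iterate the Jacobi scheme (2.2.3)
    xn = (7 + y - z)/4;
    yn = (21 + 4*x + z)/8;
    zn = (15 + 2*x - y)/5;
    dx = max(abs([xn-x, yn-y, zn-z]));
    x = xn; y = yn; z = zn;
    if dx < tol                      % step 3: check for convergence
        break
    end
end
[x y z]                              % approaches the solution (2, 4, 3)
```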

Note that the choice of an initial guess is often critical in determining the con-

vergence to the solution. Thus the more that is known about what the solution

is supposed to look like, the higher the chance of successful implementation of

the iterative scheme. Table 1 shows the convergence of this scheme for this

simple example.

Given the success of this example, it is easy to conjecture that such a scheme

will always be eﬀective. However, we can reconsider the original system by

interchanging the ﬁrst and last set of equations. This gives the system

−2x + y + 5z = 15    (2.2.4a)
4x − 8y + z = −21    (2.2.4b)
4x − y + z = 7 .     (2.2.4c)

k     x_k          y_k          z_k
0     1.0          2.0          2.0
1     1.75         3.375        3.0
2     1.84375      3.875        3.025
...   ...          ...          ...
15    1.99999993   3.99999985   2.9999993
...   ...          ...          ...
19    2.0          4.0          3.0

Table 1: Convergence of the Jacobi iteration scheme to the solution value (x, y, z) = (2, 4, 3) from the initial guess (x_0, y_0, z_0) = (1, 2, 2).

k     x_k       y_k        z_k
0     1.0       2.0        2.0
1     −1.5      3.375      5.0
2     6.6875    2.5        16.375
3     34.6875   8.015625   −17.25
...   ...       ...        ...
      ±∞        ±∞         ±∞

Table 2: Divergence of the Jacobi iteration scheme from the initial guess (x_0, y_0, z_0) = (1, 2, 2).

To solve the system iteratively, we can define the following Jacobi iteration scheme based on this rearranged set of equations:

x_{k+1} = (y_k + 5z_k − 15)/2    (2.2.5a)
y_{k+1} = (21 + 4x_k + z_k)/8    (2.2.5b)
z_{k+1} = y_k − 4x_k + 7 .       (2.2.5c)

Of course, the solution should be exactly as before. However, Table 2 shows

that applying the iteration scheme leads to a set of values which grow to inﬁnity.

Thus the iteration scheme quickly fails.

Strict Diagonal Dominance

The difference between the two Jacobi schemes above lies in whether the iteration procedure is strictly diagonally dominant. We begin with the definition of strict diagonal dominance. A matrix A is strictly diagonally dominant if, for each row, the sum of the absolute values of the off-diagonal terms is less than the absolute value of the diagonal term:

|a_kk| > Σ_{j=1, j≠k}^{N} |a_kj| .   (2.2.6)

Strict diagonal dominance has the following consequence: given a strictly diagonally dominant matrix A, the system Ax = b has a unique solution x = p, and Jacobi iteration produces a sequence p_k that converges to p for any initial guess p_0. For the two examples considered here, this property is crucial. For the first example (2.2.1), we have

A = [4 −1 1; 4 −8 1; −2 1 5]  →
row 1: |4| > |−1| + |1| = 2
row 2: |−8| > |4| + |1| = 5
row 3: |5| > |−2| + |1| = 3 ,   (2.2.7)

which shows the system to be strictly diagonally dominant and guaranteed to converge. In contrast, the second system (2.2.4) is not strictly diagonally dominant, as can be seen from

A = [−2 1 5; 4 −8 1; 4 −1 1]  →
row 1: |−2| < |1| + |5| = 6
row 2: |−8| > |4| + |1| = 5
row 3: |1| < |4| + |−1| = 5 .   (2.2.8)

Thus this scheme is not guaranteed to converge. Indeed, it diverges to inﬁnity.
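The row-by-row checks in (2.2.7) and (2.2.8) are easily automated; a small sketch:

```matlab
% Check strict diagonal dominance row by row
A = [-2 1 5; 4 -8 1; 4 -1 1];     % the rearranged system (2.2.4)
d = abs(diag(A));                  % |a_kk|
offdiag = sum(abs(A),2) - d;       % sum of |a_kj| over j ~= k
dominant = all(d > offdiag)        % 0 (false): not strictly diagonally dominant
```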

Modification and Enhancements: Gauss-Seidel

It is sometimes possible to enhance the convergence of a scheme by applying modifications to the basic Jacobi scheme. For instance, the Jacobi scheme given by (2.2.3) can be enhanced by the following modifications:

x_{k+1} = (7 + y_k − z_k)/4             (2.2.9a)
y_{k+1} = (21 + 4x_{k+1} + z_k)/8       (2.2.9b)
z_{k+1} = (15 + 2x_{k+1} − y_{k+1})/5 . (2.2.9c)

Here use is made of the supposedly improved value x_{k+1} in the second equation, and of x_{k+1} and y_{k+1} in the third equation. This is known as the Gauss-Seidel scheme. Table 3 shows that the Gauss-Seidel procedure converges to the solution in half the number of iterations used by the Jacobi scheme.
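A Gauss-Seidel loop differs from the Jacobi loop only in that the newest values are used immediately; a sketch for (2.2.9):

```matlab
x = 1; y = 2; z = 2;               % initial guess
tol = 1e-8;
for k = 1:100
    xold = x; yold = y; zold = z;
    x = (7 + y - z)/4;             % new x is used immediately below
    y = (21 + 4*x + z)/8;
    z = (15 + 2*x - y)/5;
    if max(abs([x-xold, y-yold, z-zold])) < tol
        break
    end
end
[x y z]                            % approaches the solution (2, 4, 3)
```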

Like the Jacobi scheme, the Gauss-Seidel method is guaranteed to converge for strictly diagonally dominant matrices; for general matrices, however, there are systems for which one scheme converges while the other diverges. Further, the Gauss-Seidel modification is only one of a large number of possible changes to the iteration

k     x_k     y_k       z_k
0     1.0     2.0       2.0
1     1.75    3.75      2.95
2     1.95    3.96875   2.98625
...   ...     ...       ...
10    2.0     4.0       3.0

Table 3: Convergence of the Gauss-Seidel iteration scheme to the solution value (x, y, z) = (2, 4, 3) from the initial guess (x_0, y_0, z_0) = (1, 2, 2).

scheme which can be implemented in an effort to enhance convergence. It is also possible to use several previous iterations to achieve convergence. Krylov space methods [6] are high-end iterative techniques especially developed for rapid convergence. Included in these iteration schemes are conjugate gradient methods and generalized minimum residual (GMRES) methods, which we will discuss and implement [6].

2.3 Eigenvalues, Eigenvectors, and Solvability

Another class of linear systems of fundamental importance is the class of eigenvalue problems. Unlike the system Ax = b, which has the single unknown vector x, eigenvalue problems are of the form

Ax = λx   (2.3.1)

which has the unknowns x and λ. The values of λ are known as the eigenvalues, and the corresponding x are the eigenvectors.

Eigenvalue problems often arise from differential equations. Specifically, we consider the example of a linear set of coupled differential equations

dy/dt = Ay .   (2.3.2)

By attempting a solution of the form

y = x exp(λt) ,   (2.3.3)

where all the time-dependence is captured in the exponent, the resulting equation for x is

Ax = λx   (2.3.4)

which is just the eigenvalue problem. Once the full set of eigenvalues and eigenvectors of this equation is found, the solution of the differential equation is written as

y = c_1 x_1 exp(λ_1 t) + c_2 x_2 exp(λ_2 t) + ··· + c_N x_N exp(λ_N t)   (2.3.5)

where N is the number of linearly independent solutions to the eigenvalue problem for the matrix A, which is of size N × N. Thus solving a linear system of differential equations relies on the solution of an associated eigenvalue problem.

The question remains: how are the eigenvalues and eigenvectors found? To consider this problem, we rewrite the eigenvalue problem as

Ax = λIx   (2.3.6)

where a multiplication by unity has been performed, i.e. Ix = x. Moving the right-hand side to the left side of the equation gives

Ax − λIx = 0 .   (2.3.7)

Factoring out the vector x then gives the desired result

(A − λI)x = 0 .   (2.3.8)

Two possibilities now exist.

Option I: The determinant of the matrix (A − λI) is not zero. If this is true, the matrix is nonsingular and its inverse, (A − λI)^{-1}, can be found. The solution to the eigenvalue problem (2.3.8) is then

x = (A − λI)^{-1} 0   (2.3.9)

which implies that

x = 0 .   (2.3.10)

This trivial solution could have been guessed from (2.3.8). However, it is not relevant as we require nontrivial solutions for x.

Option II: The determinant of the matrix (A − λI) is zero. If this is true, the matrix is singular and its inverse, (A − λI)^{-1}, cannot be found. Although there is no longer a guarantee that a solution exists, this is the only scenario which allows for the possibility of x ≠ 0. It is this condition which allows for the construction of eigenvalues and eigenvectors. Indeed, we choose the eigenvalues λ so that this condition holds and the matrix is singular.

To illustrate how the eigenvalues and eigenvectors are computed, an example is shown. Consider the matrix

A = [1 3; −1 5] .   (2.3.11)

This gives the eigenvalue problem

[1 3; −1 5] x = λx   (2.3.12)

which, when manipulated to the form (A − λI)x = 0, gives

( [1 3; −1 5] − λ [1 0; 0 1] ) x = [1−λ 3; −1 5−λ] x = 0 .   (2.3.13)

We now require that the determinant is zero:

det [1−λ 3; −1 5−λ] = (1 − λ)(5 − λ) + 3 = λ^2 − 6λ + 8 = (λ − 2)(λ − 4) = 0   (2.3.14)

which gives the two eigenvalues

λ = 2, 4 .   (2.3.15)

The eigenvectors are then found from (2.3.13) as follows:

λ = 2:   [1−2 3; −1 5−2] x = [−1 3; −1 3] x = 0 .   (2.3.16)

Given that x = (x_1 x_2)^T, this leads to the single equation

−x_1 + 3x_2 = 0 .   (2.3.17)

This is an underdetermined system of equations. Thus we have freedom in choosing one of the values. Choosing x_2 = 1 gives x_1 = 3, so that the first eigenvector is

x_1 = (3 1)^T .   (2.3.18)

The second eigenvector comes from (2.3.13) as follows:

λ = 4:   [1−4 3; −1 5−4] x = [−3 3; −1 1] x = 0 .   (2.3.19)

Given that x = (x_1 x_2)^T, this leads to the single equation

−x_1 + x_2 = 0 .   (2.3.20)

This is an underdetermined system of equations. Thus we have freedom in choosing one of the values. Choosing x_2 = 1 gives x_1 = 1, so that the second eigenvector is

x_2 = (1 1)^T .   (2.3.21)

These results can be found from MATLAB by using the eig command. Specifically, the command structure

[V,D]=eig(A)

gives the matrix V containing the eigenvectors as columns and the matrix D whose diagonal elements are the corresponding eigenvalues.
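For the example matrix (2.3.11) this gives the following (a sketch; note that eig scales the columns of V to unit length, and the column ordering is not guaranteed):

```matlab
A = [1 3; -1 5];
[V,D] = eig(A);
diag(D)            % the eigenvalues 2 and 4
% Rescaling each column of V so that its second entry is 1 reproduces the
% hand-computed eigenvectors (3 1)^T and (1 1)^T:
V(:,1)/V(2,1)
V(:,2)/V(2,2)
```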


Matrix Powers

Another important operation which can be performed with eigenvalues and eigenvectors is the evaluation of

A^M   (2.3.22)

where M is a large integer. For large matrices A, this operation is computationally expensive. However, knowing the eigenvalues and eigenvectors of A allows for a significant reduction in computational expense. Assuming we have all the eigenvalues and eigenvectors of A, then

Ax_1 = λ_1 x_1
Ax_2 = λ_2 x_2
  ⋮
Ax_n = λ_n x_n .

This collection of eigenvalues and eigenvectors gives the matrix system

AS = SΛ   (2.3.23)

where the columns of the matrix S are the eigenvectors of A,

S = (x_1 x_2 ··· x_n) ,   (2.3.24)

and Λ is the diagonal matrix of the corresponding eigenvalues,

Λ = diag(λ_1, λ_2, ..., λ_n) .   (2.3.25)

By multiplying (2.3.23) on the right by S^{-1}, the matrix A can then be rewritten as

A = SΛS^{-1} .   (2.3.26)

The final observation comes from

A^2 = (SΛS^{-1})(SΛS^{-1}) = SΛ^2 S^{-1} .   (2.3.27)

This then generalizes to

A^M = SΛ^M S^{-1}   (2.3.28)

where the matrix Λ^M is easily calculated as

Λ^M = diag(λ_1^M, λ_2^M, ..., λ_n^M) .   (2.3.29)

Since raising the diagonal terms to the Mth power is easily accomplished, the matrix A^M can then be easily calculated by multiplying the three matrices in (2.3.28).
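A sketch of (2.3.28) in MATLAB for the example matrix (2.3.11) (M = 50 is an arbitrary choice):

```matlab
A = [1 3; -1 5];
M = 50;
[S,Lam] = eig(A);
AM = S * Lam.^M / S;    % S * Lambda^M * inv(S); Lam.^M raises the
                        % diagonal eigenvalues to the M-th power
% Compare with the direct computation A^M:
norm(AM - A^M)          % small relative to the size of the entries
```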

Solvability and the Fredholm-Alternative Theorem

It is natural to ask under what conditions the system

Ax = b   (2.3.30)

can be solved. When det A ≠ 0 a unique solution always exists; more generally, there is a solvability condition on b. Consider the adjoint problem

A†y = 0   (2.3.31)

where A† = A^{*T} is the adjoint, i.e. the transpose and complex conjugate of the matrix A. The definition of the adjoint is such that

y · Ax = A†y · x .   (2.3.32)

Since Ax = b, the left side of the equation reduces to y · b while the right side reduces to 0 since A†y = 0. This then gives the condition

y · b = 0   (2.3.33)

which is known as the Fredholm-Alternative theorem, or a solvability condition. In words, the equation states that in order for the system Ax = b to be solvable, the right-hand side forcing b must be orthogonal to the null-space of the adjoint operator A†.

3 Curve Fitting

Analyzing data is fundamental to every aspect of science. Often data are noisy in nature and only the trends in the data are sought. A variety of curve fitting schemes can be generated to provide simplified descriptions of data and its behavior. The least-squares fit method is explored along with polynomial fits and splines.

3.1 Least-Square Fitting Methods

One of the fundamental tools for data analysis and recognizing trends in physical systems is curve fitting. The concept of curve fitting is fairly simple: use a simple function to describe a trend by minimizing the error between the selected fitting function and the data. The mathematical aspects of this are laid out in this section.


Suppose we are given a set of n data points

(x_1, y_1), (x_2, y_2), (x_3, y_3), ..., (x_n, y_n) .   (3.1.1)

Further, assume that we would like to fit a best-fit line through these points. Then we can approximate the line by the function

f(x) = Ax + B   (3.1.2)

where the constants A and B are chosen to minimize some error. Thus the function gives an approximation to the true value which is off by some error, so that

f(x_k) = y_k + E_k   (3.1.3)

where y_k is the true value and E_k is the error from the true value.

Various error measurements can be minimized when approximating with a given function f(x). Three standard possibilities are given as follows:

I. Maximum Error:       E_∞(f) = max_{1≤k≤n} |f(x_k) − y_k|                       (3.1.4a)

II. Average Error:      E_1(f) = (1/n) Σ_{k=1}^{n} |f(x_k) − y_k|                 (3.1.4b)

III. Root-Mean Square:  E_2(f) = ( (1/n) Σ_{k=1}^{n} |f(x_k) − y_k|^2 )^{1/2} .   (3.1.4c)

In practice, the root-mean square error is the most widely used and accepted. Thus when fitting a curve to a set of data, the root-mean square error is chosen to be minimized. This is called a least-squares fit.
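Given vectors of data values and the fit evaluated at the data points, the three measures in (3.1.4) take only a line each (a sketch with arbitrary illustrative values):

```matlab
fx = [1.0 2.1 2.9];  y = [1.1 2.0 3.0];   % f(x_k) and the data y_k
n = length(y);
Einf = max(abs(fx - y));                  % maximum error (3.1.4a)
E1   = sum(abs(fx - y))/n;                % average error (3.1.4b)
E2   = sqrt(sum(abs(fx - y).^2)/n);       % root-mean square error (3.1.4c)
```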

Least-Squares Line

To apply the least-squares fit criteria, consider the data points {(x_k, y_k)}, where k = 1, 2, 3, ..., n. To fit the curve

f(x) = Ax + B   (3.1.5)

to this data, E_2 is found by minimizing the sum

E_2(f) = Σ_{k=1}^{n} |f(x_k) − y_k|^2 = Σ_{k=1}^{n} (Ax_k + B − y_k)^2 .   (3.1.6)

Minimizing this sum requires differentiation. Specifically, the constants A and B are chosen so that a minimum occurs. Thus we require ∂E_2/∂A = 0 and ∂E_2/∂B = 0. This gives:

∂E_2/∂A = 0:   Σ_{k=1}^{n} 2(Ax_k + B − y_k) x_k = 0   (3.1.7a)

∂E_2/∂B = 0:   Σ_{k=1}^{n} 2(Ax_k + B − y_k) = 0 .     (3.1.7b)

Upon rearranging, a 2 × 2 system of equations is found for A and B:

[ Σ_{k=1}^{n} x_k^2   Σ_{k=1}^{n} x_k ] [ A ]     [ Σ_{k=1}^{n} x_k y_k ]
[ Σ_{k=1}^{n} x_k     n               ] [ B ]  =  [ Σ_{k=1}^{n} y_k     ] .   (3.1.8)

This equation can be easily solved using the backslash command in MATLAB.
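A sketch of building and solving (3.1.8) with the backslash command (the data vectors are arbitrary illustrative values):

```matlab
% Solve the 2x2 system (3.1.8) for the line fit f(x) = A*x + B
x = [0 1 2 3 4]';  y = [1.1 2.9 5.2 6.8 9.1]';
M = [sum(x.^2) sum(x); sum(x) length(x)];
r = [sum(x.*y); sum(y)];
c = M\r;      % c(1) = A (slope), c(2) = B (intercept)
```

The command polyfit(x,y,1) should return the same coefficients up to rounding.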

This method can be easily generalized to higher-degree polynomial fits. In particular, a parabolic fit to a set of data requires the fitting function

f(x) = Ax^2 + Bx + C   (3.1.9)

where now the three constants A, B, and C must be found. These can be found from the 3 × 3 system which results from minimizing the error E_2(A, B, C) via

∂E_2/∂A = 0    (3.1.10a)
∂E_2/∂B = 0    (3.1.10b)
∂E_2/∂C = 0 .  (3.1.10c)

Data Linearization

Although a powerful method, the minimization procedure can result in equations which are nontrivial to solve. Specifically, consider fitting data to the exponential function

f(x) = C exp(Ax) .   (3.1.11)

The error to be minimized is

E_2(A, C) = Σ_{k=1}^{n} (C exp(Ax_k) − y_k)^2 .   (3.1.12)

Applying the minimizing conditions leads to

∂E_2/∂A = 0:   Σ_{k=1}^{n} 2(C exp(Ax_k) − y_k) C x_k exp(Ax_k) = 0   (3.1.13a)

∂E_2/∂C = 0:   Σ_{k=1}^{n} 2(C exp(Ax_k) − y_k) exp(Ax_k) = 0 .       (3.1.13b)

This in turn leads to the 2 × 2 system

C Σ_{k=1}^{n} x_k exp(2Ax_k) − Σ_{k=1}^{n} x_k y_k exp(Ax_k) = 0   (3.1.14a)

C Σ_{k=1}^{n} exp(2Ax_k) − Σ_{k=1}^{n} y_k exp(Ax_k) = 0 .         (3.1.14b)

This system of equations is nonlinear and cannot be solved in a straightforward fashion. Indeed, a solution may not even exist, or many solutions may exist.

To avoid the difficulty of solving this nonlinear system, the exponential fit can be linearized by the transformation

Y = ln(y)    (3.1.15a)
X = x        (3.1.15b)
B = ln C .   (3.1.15c)

Then the fit function

f(x) = y = C exp(Ax)   (3.1.16)

can be linearized by taking the natural log of both sides, so that

ln y = ln(C exp(Ax)) = ln C + ln(exp(Ax)) = B + Ax  →  Y = AX + B .   (3.1.17)

So by fitting to the natural log of the y-data,

(x_i, y_i) → (x_i, ln y_i) = (X_i, Y_i) ,   (3.1.18)

the curve fit for the exponential function becomes a linear fitting problem which is easily handled.
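A sketch of the linearized fit using polyfit on the logged data (the data values are arbitrary and must be positive for the logarithm):

```matlab
% Fit y = C*exp(A*x) via the line fit Y = A*X + B with Y = log(y), B = log(C)
x = [0 0.5 1.0 1.5 2.0];
y = [1.1 1.8 2.6 4.4 7.5];
p = polyfit(x, log(y), 1);
A = p(1);
C = exp(p(2));     % the fitted curve is C*exp(A*x)
```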

General Fitting

Given the preceding examples, a theory can be developed for a general fitting procedure. The key idea is to assume a form of the fitting function

f(x) = f(x, C_1, C_2, C_3, ..., C_M)   (3.1.19)

where the C_i are constants used to minimize the error and M < n. The root-mean square error is then

E_2(C_1, C_2, C_3, ..., C_M) = Σ_{k=1}^{n} (f(x_k, C_1, C_2, C_3, ..., C_M) − y_k)^2   (3.1.20)

which can be minimized by considering the M × M system generated from

∂E_2/∂C_j = 0 ,   j = 1, 2, ..., M .   (3.1.21)

In general, this gives the nonlinear set of equations

Σ_{k=1}^{n} (f(x_k, C_1, C_2, C_3, ..., C_M) − y_k) ∂f/∂C_j = 0 ,   j = 1, 2, 3, ..., M .   (3.1.22)

Solving this set of equations can be quite difficult. Most attempts at solving nonlinear systems are based upon iterative schemes which require good initial guesses to converge to the solution. Regardless, the general fitting procedure is straightforward and allows for the construction of a best-fit curve to match the data.

3.2 Polynomial Fits and Splines

One of the primary reasons for generating data fits from polynomials, splines, or least-squares methods is to interpolate or extrapolate data values. In practice, we consider only a finite number of data points

(x_0, y_0), (x_1, y_1), ..., (x_n, y_n) ,

so the value of the curve at points other than the x_i is unknown. Interpolation uses the data points to predict values of y(x) at locations where x ≠ x_i and x ∈ [x_0, x_n]. Extrapolation is similar, but it predicts values of y(x) for x > x_n or x < x_0, i.e. outside the range of the data points.

Interpolation and extrapolation are easy to do given a least-squares fit. Once the fitting curve is found, it can be evaluated for any value of x, thus giving an interpolated or extrapolated value. Polynomial fitting is another method for getting these values. With polynomial fitting, a polynomial is chosen to go through all data points. For the n + 1 data points given above, an nth-degree polynomial is chosen:

p_n(x) = a_n x^n + a_{n−1} x^{n−1} + ··· + a_1 x + a_0   (3.2.1)

where the coefficients a_j are chosen so that the polynomial passes through each data point. Thus we have the resulting system

(x_0, y_0):  y_0 = a_n x_0^n + a_{n−1} x_0^{n−1} + ··· + a_1 x_0 + a_0
(x_1, y_1):  y_1 = a_n x_1^n + a_{n−1} x_1^{n−1} + ··· + a_1 x_1 + a_0
  ⋮
(x_n, y_n):  y_n = a_n x_n^n + a_{n−1} x_n^{n−1} + ··· + a_1 x_n + a_0 .

This system of equations is nothing more than a linear system Ax = b which can be solved by using the backslash command in MATLAB.

Although this polynomial fit method will generate a curve which passes through all the data, it is an expensive computation since we must first set up the system and then perform an O(n^3) operation to generate the coefficients a_j. A more direct method for generating the relevant polynomial is the Lagrange polynomial method. Consider first the idea of constructing a line between the two points (x_0, y_0) and (x_1, y_1). The general form for a line is y = mx + b, which gives

y = y_0 + (y_1 − y_0) (x − x_0)/(x_1 − x_0) .   (3.2.2)

Although valid, this form is hard to generalize to fitting higher-order polynomials through a larger number of points. Lagrange developed a method which expresses the line through the two points as

p_1(x) = y_0 (x − x_1)/(x_0 − x_1) + y_1 (x − x_0)/(x_1 − x_0) ,   (3.2.3)

which can be easily verified to work. In a more compact and general way, this first-degree polynomial can be expressed as

p_1(x) = Σ_{k=0}^{1} y_k L_{1,k}(x) = y_0 L_{1,0}(x) + y_1 L_{1,1}(x)   (3.2.4)

where the Lagrange coefficients are given by

L_{1,0}(x) = (x − x_1)/(x_0 − x_1)    (3.2.5a)
L_{1,1}(x) = (x − x_0)/(x_1 − x_0) .  (3.2.5b)

The power of this method is that it can be easily generalized to consider the n + 1 points of our original data set. In particular, we fit an nth-degree polynomial through the given data set of the form

p_n(x) = Σ_{k=0}^{n} y_k L_{n,k}(x)   (3.2.6)

where the Lagrange coefficient is

L_{n,k}(x) = [(x − x_0)(x − x_1) ··· (x − x_{k−1})(x − x_{k+1}) ··· (x − x_n)] /
             [(x_k − x_0)(x_k − x_1) ··· (x_k − x_{k−1})(x_k − x_{k+1}) ··· (x_k − x_n)] .   (3.2.7)

Thus there is no need to solve a linear system to generate the desired polynomial. This is a preferred method for generating a polynomial fit to a given set of data and is at the core of many polynomial interpolation routines.
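A direct sketch of evaluating (3.2.6)-(3.2.7) at a single point xp, with no linear solve required (the data values are arbitrary illustrative choices):

```matlab
% Evaluate the Lagrange interpolating polynomial p_n at the point xp
x = [0 1 2 3];  y = [1 2 0 5];
xp = 1.5;  p = 0;
n1 = length(x);                            % n+1 data points
for k = 1:n1
    L = 1;
    for j = [1:k-1, k+1:n1]
        L = L*(xp - x(j))/(x(k) - x(j));   % Lagrange coefficient L_{n,k}(xp)
    end
    p = p + y(k)*L;                        % accumulate y_k * L_{n,k}(xp)
end
p
```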


Splines

Although MATLAB makes it a trivial matter to fit polynomials or least-squares fits through data, a fundamental problem can arise as a result of a polynomial fit: polynomial wiggle. Polynomial wiggle is generated by the fact that an nth-degree polynomial has, in general, n − 1 turning points from up to down or vice versa. One way to overcome this is to use a piecewise polynomial interpolation scheme. Essentially, this simply draws a line between neighboring data points and uses this line to give interpolated values. This technique is rather simple-minded, but it does alleviate the problem generated by polynomial wiggle. However, the interpolating function is now only a piecewise function. Therefore, when considering interpolating values between the points (x_0, y_0) and (x_n, y_n), there will be n linear functions, each valid only between two neighboring points.

The data generated by a piecewise linear fit can be rather crude and tends to be choppy in appearance. Splines provide a better way to represent data by constructing cubic functions between points so that the first and second derivatives are continuous at the data points. This gives a smooth-looking function without polynomial wiggle problems. The basic assumption of the spline method is to construct a cubic function between data points:

S_k(x) = S_{k,0} + S_{k,1}(x − x_k) + S_{k,2}(x − x_k)^2 + S_{k,3}(x − x_k)^3   (3.2.8)

where x ∈ [x_k, x_{k+1}] and the coefficients S_{k,j} are to be determined from various constraint conditions. Four constraint conditions are imposed:

S_k(x_k) = y_k                        (3.2.9a)
S_k(x_{k+1}) = S_{k+1}(x_{k+1})       (3.2.9b)
S'_k(x_{k+1}) = S'_{k+1}(x_{k+1})     (3.2.9c)
S''_k(x_{k+1}) = S''_{k+1}(x_{k+1}) . (3.2.9d)

This allows for a smooth fit to the data since the four constraints correspond to fitting the data, continuity of the function, continuity of the first derivative, and continuity of the second derivative, respectively.

To solve for these quantities, a large system of equations Ax = b is constructed. The number of equations and unknowns must first be counted:

• S_k(x_k) = y_k → Solution fit: n + 1 equations

• S_k = S_{k+1} → Continuity: n − 1 equations

• S'_k = S'_{k+1} → Smoothness: n − 1 equations

• S''_k = S''_{k+1} → Smoothness: n − 1 equations

This gives a total of 4n − 2 equations. For each of the n intervals, there are four parameters, which gives a total of 4n unknowns. Thus two extra constraint conditions must be placed on the system to achieve a solution. There is a large variety of options for the assumptions which can be made at the edges of the spline. It is usually a good idea to use the default unless the application involved requires a specific form. The spline problem is then reduced to a simple solution of an Ax = b problem which can be solved with the backslash command.

As a final remark, splines are heavily used in computer graphics and animation. The primary reasons are their relative ease of calculation and their smoothness properties. There is an entire spline toolbox available for MATLAB, which attests to the importance of this technique for such applications. Further, splines are also commonly used for smoothing data before differentiating. This will be considered further in upcoming lectures.

3.3 Data Fitting with MATLAB

This section discusses the practical implementation of the curve fitting schemes presented in the preceding two sections. The schemes to be explored are least-squares fits, polynomial fits, linear interpolation, and spline interpolation. Additionally, a non-polynomial least-squares fit will be considered which results in a nonlinear system of equations. This nonlinear system requires additional insight into the problem and sophistication in its solution technique.

To begin the data fit process, we first import a relevant data set into the MATLAB environment. To do so, the load command is used. The file linefit.dat is a collection of x and y data values put into a two-column format separated by spaces. The file is the following:

lineﬁt.dat

0.0 1.1

0.5 1.6

1.1 2.4

1.7 3.8

2.1 4.3

2.5 4.7

2.9 4.8

3.3 5.5

3.7 6.1

4.2 6.3

4.9 7.1

5.3 7.1

6.0 8.2

6.7 6.9

7.0 5.3

This data is just an example of what you may want to import into MATLAB. The command structure to read this data is as follows:


load linefit.dat

x=linefit(:,1);

y=linefit(:,2);

figure(1), plot(x,y,'o:')

After reading in the data, the two vectors x and y are created from the first and second columns of the data, respectively. It is this set of data which will be explored with line fitting techniques. The code will also generate a plot of the data in figure 1 of MATLAB.

Least-Squares Fitting

The least-squares fit technique is considered first. The polyfit and polyval commands are essential to this method. Specifically, the polyfit command is used to generate the n + 1 coefficients a_j of the nth-degree polynomial

p_n(x) = a_n x^n + a_{n−1} x^{n−1} + ··· + a_1 x + a_0   (3.3.1)

used for fitting the data. The basic structure requires that the vectors x and y be submitted to polyfit along with the desired degree of polynomial fit n. To fit a line (n = 1) through the data, use the command

pcoeff=polyfit(x,y,1);

The output of this function call is a vector pcoeff which contains the coefficients a_1 and a_0 of the line fit p_1(x) = a_1 x + a_0. To evaluate and plot this line, values of x must be chosen. For this example, the line will be plotted for x ∈ [0, 7] in steps of ∆x = 0.1.

xp=0:0.1:7;
yp=polyval(pcoeff,xp);
figure(2), plot(x,y,'o',xp,yp,'m')

The polyval command uses the coefficients generated by polyfit to generate the y-values of the polynomial fit at the desired values of x given by xp. Figure 2 in MATLAB depicts both the data and the best line fit in the least-squares sense.

To fit a parabolic profile through the data, a second-degree polynomial is used. This is generated with

pcoeff2=polyfit(x,y,2);
yp2=polyval(pcoeff2,xp);
figure(3), plot(x,y,'o',xp,yp2,'m')

Here the vector yp2 contains the parabolic fit to the data evaluated at the x-values xp. These results are plotted in MATLAB figure 3. To find the least-squares error, the sum of the squares of the differences between the parabolic fit and the actual data must be evaluated. Specifically, the quantity

E_2(f) = ( (1/n) Σ_{k=1}^{n} |f(x_k) − y_k|^2 )^{1/2}   (3.3.2)

is calculated. For the parabolic fit considered in the last example, the polynomial fit must be evaluated at the x-values of the given data linefit.dat. The error is then calculated with

n=length(x);
yp3=polyval(pcoeff2,x);
E2=sqrt( sum( ( abs(yp3-y) ).^2 )/n )

This is a quick and easy calculation which allows for an evaluation of the fitting procedure. In general, the error will continue to drop as the degree of the polynomial is increased, because every extra degree of freedom allows for a better least-squares fit to the data.

Interpolation

In addition to least-squares fitting, interpolation techniques can be developed which go through all the given data points. The error at the data points is then zero, but each interpolation scheme must be evaluated for how accurately it represents the data. The first interpolation scheme is a polynomial fit to the data. Given n + 1 points, an nth-degree polynomial is chosen. The polyfit command is again used for the calculation:

n=length(x)-1;
pcoeffn=polyfit(x,y,n);
ypn=polyval(pcoeffn,xp);
figure(4), plot(x,y,'o',xp,ypn,'m')

The MATLAB script will produce an nth-degree polynomial through the data points. But, as is always the danger with polynomial interpolation, polynomial wiggle can dominate the behavior. This is indeed the case, as illustrated in figure 4. The strong oscillatory phenomena at the edges are a common feature of this type of interpolation.

In contrast to a polynomial fit, a piecewise linear fit gives a simple-minded connect-the-dots interpolation of the data. The interp1 command gives the piecewise linear fit algorithm:

yint=interp1(x,y,xp);
figure(5), plot(x,y,'o',xp,yint,'m')

The linear interpolation is illustrated in figure 5 of MATLAB. There are a few options available with the interp1 command, including the nearest option and the spline option. The nearest option gives the nearest value of the data to the interpolated value, while the spline option gives a cubic spline interpolation of the data. The two options can be compared with the default linear option.

yint=interp1(x,y,xp);
yint2=interp1(x,y,xp,'nearest');
yint3=interp1(x,y,xp,'spline');
figure(5), plot(x,y,'o',xp,yint,'m',xp,yint2,'k',xp,yint3,'r')

Note that the spline option is equivalent to using the spline algorithm supplied by MATLAB. Thus a smooth fit can be achieved with either spline or interp1. The spline command is used by giving the x and y data along with a vector xp for which we desire to generate the corresponding y-values.

yspline=spline(x,y,xp);
figure(6), plot(x,y,'o',xp,yspline,'k')

The generated spline is depicted in figure 6. This is the same as that generated by interp1 with the spline option. Note that the curve is smooth, as expected from the enforcement of continuous first and second derivatives.

Nonlinear Least-Squares Fitting

To consider more sophisticated least-squares fitting routines, consider the following data set, which looks like it could be nicely fitted with a Gaussian profile.

gaussﬁt.dat

-3.0 -0.2

-2.2 0.1

-1.7 0.05

-1.5 0.2

-1.3 0.4

-1.0 1.0

-0.7 1.2

-0.4 1.4

-0.25 1.8

-0.05 2.2

0.07 2.1

0.15 1.6

0.3 1.5

0.65 1.1

1.1 0.8


1.25 0.3

1.8 -0.1

2.5 0.2

To fit the data to a Gaussian, we assume a Gaussian function of the form

f(x) = A exp(−Bx^2) .   (3.3.3)

Following the procedure to minimize the least-squares error leads to a set of nonlinear equations for the coefficients A and B. In general, solving a nonlinear set of equations can be a difficult task. A solution is not guaranteed to exist, and in fact there may be many solutions to the problem. Thus the nonlinear problem should be handled with care.

To generate a solution, the least-squares error must be minimized. In particular, the sum

E_2 = Σ_{k=0}^{n} |f(x_k) − y_k|^2   (3.3.4)

must be minimized. The command fmins in MATLAB minimizes a function of several variables. In this case, we minimize (3.3.4) with respect to the variables A and B. Thus the minimum of

E_2 = Σ_{k=0}^{n} |A exp(−Bx_k^2) − y_k|^2   (3.3.5)

must be found with respect to A and B. The fmins algorithm requires a function call to the quantity to be minimized along with an initial guess for the parameters which are being used for minimization, i.e. A and B.

coeff=fmins(’gafit’,[1 1]);

This command structure uses as initial guesses A = 1 and B = 1 when calling the file gafit.m. After minimizing, it returns the vector coeff which contains the appropriate values of A and B which minimize the least-square error. The function called is then constructed as follows:

gafit.m

function E=gafit(x0)

load gaussfit.dat

x=gaussfit(:,1);

y=gaussfit(:,2);

E=sum( ( x0(1)*exp(-x0(2)*x.^2)-y ).^2 )


The vector x0 accepts the initial guesses for A and B and is updated as A and B are modified through an iteration procedure to converge to the least-square values. Note that the sum is simply a statement of the quantity to be minimized, namely (3.3.5). The results of this calculation can be illustrated with MATLAB figure 7:

xga=-3:0.1:3;
a=coeff(1); b=coeff(2);
yga=a*exp(-b*xga.^2);
figure(7), plot(x2,y2,'O',xga,yga,'m')

Note that for this case, the initial guess is extremely important. For any given problem where this technique is used, an educated guess for the values of parameters like A and B can determine if the technique will work at all. The results should also be checked carefully since there is no guarantee that a minimization near the desired fit can be found.
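For readers who want to experiment outside MATLAB, the following sketch in Python (NumPy) builds the least-square error (3.3.5) for synthetic Gaussian data and minimizes it with a brute-force grid search standing in for the simplex search that fmins performs. The data values and search ranges here are illustrative assumptions, not the contents of gaussfit.dat.

```python
import numpy as np

# synthetic data drawn from a known Gaussian profile (A=2, B=1.5); illustrative only
x = np.linspace(-3, 3, 61)
y = 2.0 * np.exp(-1.5 * x**2)

def E2(A, B):
    # least-square error (3.3.5): sum of squared residuals of A*exp(-B*x^2)
    return np.sum((A * np.exp(-B * x**2) - y) ** 2)

# crude grid search over (A, B) as a stand-in for a simplex search like fmins
As = np.linspace(0.5, 3.0, 251)
Bs = np.linspace(0.5, 3.0, 251)
errs = np.array([[E2(A, B) for B in Bs] for A in As])
i, j = np.unravel_index(np.argmin(errs), errs.shape)
print(As[i], Bs[j])  # best (A, B) found on the grid
```

With noisy data the minimum only lands near the true parameters, which is one reason a good starting guess matters for the simplex search as well.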

4 Numerical Diﬀerentiation and Integration

Differentiation and integration form the backbone of the mathematical techniques required to describe and analyze physical systems. These two mathematical concepts describe how certain quantities of interest change with respect to space, time, or both. Understanding how to evaluate these quantities numerically is essential to understanding systems beyond the scope of analytic methods.

4.1 Numerical Diﬀerentiation

Given a set of data or a function, it may be useful to differentiate the quantity considered in order to determine a physically relevant property. For instance, given a set of data which represents the position of a particle as a function of time, the derivative and second derivative give the velocity and acceleration respectively. From calculus, the definition of the derivative is given by

df(t)/dt = lim_{Δt→0} [f(t+Δt) − f(t)]/Δt   (4.1.1)

Since the derivative is the slope, the formula on the right is nothing more than a rise-over-run formula for the slope. The general idea of calculus is that as Δt → 0, the rise-over-run gives the instantaneous slope. Numerically, this means that if we take Δt sufficiently small, then the approximation should be fairly accurate. To quantify and control the error associated with approximating the derivative, we make use of Taylor series expansions.


To see how the Taylor expansions are useful, consider the following two Taylor series:

f(t+Δt) = f(t) + Δt df(t)/dt + (Δt²/2!) d²f(t)/dt² + (Δt³/3!) d³f(c₁)/dt³   (4.1.2a)

f(t−Δt) = f(t) − Δt df(t)/dt + (Δt²/2!) d²f(t)/dt² − (Δt³/3!) d³f(c₂)/dt³   (4.1.2b)

where cᵢ ∈ [a, b]. Subtracting these two expressions gives

f(t+Δt) − f(t−Δt) = 2Δt df(t)/dt + (Δt³/3!) [d³f(c₁)/dt³ + d³f(c₂)/dt³] .   (4.1.3)

By using the mean-value theorem of calculus, we find f'''(c) = (f'''(c₁) + f'''(c₂))/2. Upon dividing the above expression by 2Δt and rearranging, we find the following expression for the first derivative:

df(t)/dt = [f(t+Δt) − f(t−Δt)]/2Δt − (Δt²/6) d³f(c)/dt³   (4.1.4)

where the last term is the truncation error associated with the approximation of the first derivative using this particular Taylor series generated expression. Note that the truncation error in this case is O(Δt²).

We could improve on this by continuing our Taylor expansion and truncating it at higher orders in Δt. This would lead to higher accuracy schemes. Specifically, by truncating at O(Δt⁵), we would have

f(t+Δt) = f(t) + Δt df(t)/dt + (Δt²/2!) d²f(t)/dt² + (Δt³/3!) d³f(t)/dt³ + (Δt⁴/4!) d⁴f(t)/dt⁴ + (Δt⁵/5!) d⁵f(c₁)/dt⁵   (4.1.5a)

f(t−Δt) = f(t) − Δt df(t)/dt + (Δt²/2!) d²f(t)/dt² − (Δt³/3!) d³f(t)/dt³ + (Δt⁴/4!) d⁴f(t)/dt⁴ − (Δt⁵/5!) d⁵f(c₂)/dt⁵   (4.1.5b)

where cᵢ ∈ [a, b]. Again subtracting these two expressions gives

f(t+Δt) − f(t−Δt) = 2Δt df(t)/dt + (2Δt³/3!) d³f(t)/dt³ + (Δt⁵/5!) [d⁵f(c₁)/dt⁵ + d⁵f(c₂)/dt⁵] .   (4.1.6)

In this approximation, there is a third derivative term left over which needs to be removed. By using two additional points to approximate the derivative, this term can be removed. Thus we use the two additional points f(t+2Δt) and f(t−2Δt). Upon replacing Δt by 2Δt in (4.1.6), we find

f(t+2Δt) − f(t−2Δt) = 4Δt df(t)/dt + (16Δt³/3!) d³f(t)/dt³ + (32Δt⁵/5!) [d⁵f(c₃)/dt⁵ + d⁵f(c₄)/dt⁵] .   (4.1.7)


O(Δt²) center-difference schemes

f'(t) = [f(t+Δt) − f(t−Δt)]/2Δt
f''(t) = [f(t+Δt) − 2f(t) + f(t−Δt)]/Δt²
f'''(t) = [f(t+2Δt) − 2f(t+Δt) + 2f(t−Δt) − f(t−2Δt)]/2Δt³
f''''(t) = [f(t+2Δt) − 4f(t+Δt) + 6f(t) − 4f(t−Δt) + f(t−2Δt)]/Δt⁴

Table 4: Second-order accurate center-difference formulas.

By multiplying (4.1.6) by eight and subtracting (4.1.7) and using the mean-value

theorem on the truncation terms twice, we ﬁnd the expression:

df(t)/dt = [−f(t+2Δt) + 8f(t+Δt) − 8f(t−Δt) + f(t−2Δt)]/12Δt + (Δt⁴/30) f⁽⁵⁾(c)   (4.1.8)

where f⁽⁵⁾ is the fifth derivative and the truncation is of O(Δt⁴).

Approximating higher derivatives works in a similar fashion. By starting

with the pair of equations (4.1.2) and adding, this gives the result

f(t+Δt) + f(t−Δt) = 2f(t) + Δt² d²f(t)/dt² + (Δt⁴/4!) [d⁴f(c₁)/dt⁴ + d⁴f(c₂)/dt⁴] .   (4.1.9)

By rearranging and solving for the second derivative, the O(Δt²) accurate expression is derived:

d²f(t)/dt² = [f(t+Δt) − 2f(t) + f(t−Δt)]/Δt² + O(Δt²)   (4.1.10)

where the truncation error is of O(Δt²) and is found again by the mean-value theorem to be −(Δt²/12) f''''(c). This process can be continued to find any

arbitrary derivative. Thus, we could also approximate the third, fourth, and

higher derivatives using this technique. It is also possible to generate backward

and forward diﬀerence schemes by using points only behind or in front of the

current point respectively. Tables 4-6 summarize the second-order and fourth-

order central diﬀerence schemes along with the forward- and backward-diﬀerence

formulas which are accurate to second-order.

A final remark is in order concerning these differentiation schemes. The central difference schemes are an excellent method for generating the values of the derivative in the interior points of a data set. However, at the end points, forward- and backward-difference methods must be used since they do not have neighboring points to the left and right respectively. Thus special care must be taken at the end points of any computational domain.

O(Δt⁴) center-difference schemes

f'(t) = [−f(t+2Δt) + 8f(t+Δt) − 8f(t−Δt) + f(t−2Δt)]/12Δt
f''(t) = [−f(t+2Δt) + 16f(t+Δt) − 30f(t) + 16f(t−Δt) − f(t−2Δt)]/12Δt²
f'''(t) = [−f(t+3Δt) + 8f(t+2Δt) − 13f(t+Δt) + 13f(t−Δt) − 8f(t−2Δt) + f(t−3Δt)]/8Δt³
f''''(t) = [−f(t+3Δt) + 12f(t+2Δt) − 39f(t+Δt) + 56f(t) − 39f(t−Δt) + 12f(t−2Δt) − f(t−3Δt)]/6Δt⁴

Table 5: Fourth-order accurate center-difference formulas.

O(Δt²) forward- and backward-difference schemes

f'(t) = [−3f(t) + 4f(t+Δt) − f(t+2Δt)]/2Δt
f'(t) = [3f(t) − 4f(t−Δt) + f(t−2Δt)]/2Δt
f''(t) = [2f(t) − 5f(t+Δt) + 4f(t+2Δt) − f(t+3Δt)]/Δt²
f''(t) = [2f(t) − 5f(t−Δt) + 4f(t−2Δt) − f(t−3Δt)]/Δt²

Table 6: Second-order accurate forward- and backward-difference formulas.
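The table entries are easy to sanity-check numerically. The following sketch, in Python with NumPy rather than the MATLAB used elsewhere in these notes, applies the second- and fourth-order center-difference first-derivative formulas from Tables 4 and 5 to f(t) = sin(t); the test function and step size are arbitrary choices for illustration.

```python
import numpy as np

f, t, dt = np.sin, 1.0, 0.1
exact = np.cos(t)  # exact derivative of sin

# O(dt^2) center difference (Table 4)
d2 = (f(t + dt) - f(t - dt)) / (2 * dt)
# O(dt^4) center difference (Table 5)
d4 = (-f(t + 2*dt) + 8*f(t + dt) - 8*f(t - dt) + f(t - 2*dt)) / (12 * dt)

print(abs(d2 - exact), abs(d4 - exact))
```

At this step size the fourth-order formula is roughly two to three orders of magnitude more accurate, as the truncation terms predict.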

It may be tempting to deduce from the diﬀerence formulas that as ∆t →0,

the accuracy only improves in these computational methods. However, this

line of reasoning completely neglects the second source of error in evaluating

derivatives: numerical round-oﬀ.

Round-oﬀ and optimal step-size

An unavoidable consequence of working with numerical computations is round-

oﬀ error. When working with most computations, double precision numbers

are used. This allows for 16-digit accuracy in the representation of a given

number. This round-oﬀ has signiﬁcant impact upon numerical computations


and the issue of time-stepping.

As an example of the impact of round-off, we consider the approximation to the derivative

dy/dt ≈ [y(t+Δt) − y(t)]/Δt + ε(y(t), Δt)   (4.1.11)

where ε(y(t), Δt) measures the truncation error. Upon evaluating this expression

in the computer, round-oﬀ error occurs so that

y(t) = Y (t) +e(t) , (4.1.12)

where Y(t) is the approximated value given by the computer and e(t) measures the error from the true value y(t). Thus the combined error between the round-off and truncation gives the following expression for the derivative:

dy/dt = [y(t+Δt) − y(t)]/Δt + E(y(t), Δt)   (4.1.13)

where the total error, E, is the combination of round-off and truncation such that

E = E_round + E_trunc = [e(t+Δt) − e(t)]/Δt − (Δt²/2) d²y(c)/dt² .   (4.1.14)

We now determine the maximum size of the error. In particular, we can bound the maximum value of round-off and the derivative to be

|e(t+Δt)| ≤ e_r   (4.1.15a)
|−e(t)| ≤ e_r   (4.1.15b)
M = max_{c∈[t_n, t_{n+1}]} { |d²y(c)/dt²| } .   (4.1.15c)

This then gives the maximum error to be

|E| ≤ (e_r + e_r)/Δt + (Δt²/2) M = 2e_r/Δt + Δt²M/2 .   (4.1.16)

Note that as ∆t gets large, the error grows quadratically due to the truncation

error. However, as ∆t decreases to zero, the error is dominated by round-oﬀ

which grows like 1/∆t.

To minimize the error, we require that ∂|E|/∂(Δt) = 0. Calculating this derivative gives

∂|E|/∂(Δt) = −2e_r/Δt² + MΔt = 0 ,   (4.1.17)

so that

Δt = (2e_r/M)^{1/3} .   (4.1.18)


This gives the step size resulting in a minimum error. Thus the smallest step-size is not necessarily the most accurate. Rather, a balance between round-off error and truncation error is achieved to obtain the optimal step-size. For e_r ≈ 10⁻¹⁶, the optimal Δt ≈ 10⁻⁵. Below this value of Δt, numerical round-off begins to dominate the error.
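This trade-off is easy to observe in floating-point arithmetic. The following sketch, in Python with NumPy rather than MATLAB, evaluates the forward-difference approximation (4.1.11) for f(t) = sin(t) over a range of step sizes; past a certain point, shrinking Δt makes the answer worse, not better. The test function is an arbitrary choice for illustration.

```python
import numpy as np

def forward_diff(f, t, dt):
    # first-order forward-difference approximation of df/dt
    return (f(t + dt) - f(t)) / dt

t, exact = 1.0, np.cos(1.0)  # exact derivative of sin at t = 1
errors = {dt: abs(forward_diff(np.sin, t, dt) - exact)
          for dt in (1e-1, 1e-5, 1e-8, 1e-12)}
for dt, err in errors.items():
    print(f"dt = {dt:.0e}   error = {err:.1e}")
```

The error first falls as Δt shrinks (truncation dominates), then rises again for very small Δt (round-off dominates), exactly the behavior predicted by (4.1.16).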

A similar procedure can be carried out for evaluating the optimal step size associated with the O(Δt⁴) accurate scheme for the first derivative. In this case

dy/dt = [−f(t+2Δt) + 8f(t+Δt) − 8f(t−Δt) + f(t−2Δt)]/12Δt + E(y(t), Δt)   (4.1.19)

where the total error, E, is the combination of round-off and truncation such that

E = [−e(t+2Δt) + 8e(t+Δt) − 8e(t−Δt) + e(t−2Δt)]/12Δt + (Δt⁴/30) d⁵y(c)/dt⁵ .   (4.1.20)

We now determine the maximum size of the error. In particular, we can bound the maximum value of round-off by e_r as before and set M = max { |y⁽⁵⁾(c)| } .

This then gives the maximum error to be

|E| = 3e_r/(2Δt) + Δt⁴M/30 .   (4.1.21)

Note that as Δt gets large, the error grows like a quartic due to the truncation error. However, as Δt decreases to zero, the error is again dominated by round-off, which grows like 1/Δt.

To minimize the error, we require that ∂|E|/∂(Δt) = 0. Calculating this derivative gives

Δt = (45e_r/4M)^{1/5} .   (4.1.22)

Thus in this case, the optimal step Δt ≈ 10⁻³. This shows that the error can be quickly dominated by numerical round-off if one is not careful to take this significant effect into account.

4.2 Numerical Integration

Numerical integration simply calculates the area under a given curve. The basic

ideas for performing such an operation come from the deﬁnition of integration

∫_a^b f(x) dx = lim_{h→0} Σ_{j=0}^{N} f(x_j) h   (4.2.1)

where b − a = Nh. Thus the area under the curve, from the calculus standpoint, is thought of as a limiting process of summing up an ever-increasing number of rectangles. This process is known as numerical quadrature. Specifically, any sum can be represented as follows

Q[f] = Σ_{k=0}^{N} w_k f(x_k) = w_0 f(x_0) + w_1 f(x_1) + ··· + w_N f(x_N)   (4.2.2)

where a = x_0 < x_1 < x_2 < ··· < x_N = b. Thus the integral is evaluated as

∫_a^b f(x) dx = Q[f] + E[f]   (4.2.3)

where the term E[f] is the error in approximating the integral by the quadrature sum (4.2.2). Typically, the error E[f] is due to truncation error. To integrate, use will be made of polynomial fits to the y-values f(x_j). Thus we assume the function f(x) can be approximated by a polynomial

P_n(x) = a_n xⁿ + a_{n−1} x^{n−1} + ··· + a_1 x + a_0   (4.2.4)

where the truncation error in this case is proportional to the (n+1)-th derivative, E[f] = A f^{(n+1)}(c), with A a constant. This process of polynomial fitting the data gives the Newton-Cotes formulas.

Newton-Cotes Formulas

The following integration approximations result from using a polynomial fit through the data to be integrated. It is assumed that

x_k = x_0 + hk ,   f_k = f(x_k) .   (4.2.5)

This gives the following integration algorithms:

Trapezoid Rule

∫_{x_0}^{x_1} f(x) dx = (h/2)(f_0 + f_1) − (h³/12) f''(c)   (4.2.6a)

Simpson's Rule

∫_{x_0}^{x_2} f(x) dx = (h/3)(f_0 + 4f_1 + f_2) − (h⁵/90) f''''(c)   (4.2.6b)

Simpson's 3/8 Rule

∫_{x_0}^{x_3} f(x) dx = (3h/8)(f_0 + 3f_1 + 3f_2 + f_3) − (3h⁵/80) f''''(c)   (4.2.6c)

Boole's Rule

∫_{x_0}^{x_4} f(x) dx = (2h/45)(7f_0 + 32f_1 + 12f_2 + 32f_3 + 7f_4) − (8h⁷/945) f⁽⁶⁾(c)   (4.2.6d)

These algorithms have varying degrees of accuracy. Specifically, they are O(h²), O(h⁴), O(h⁴) and O(h⁶) accurate schemes respectively. The accuracy condition is determined from the truncation terms of the polynomial fit. Note that the Trapezoid rule uses a sum of simple trapezoids to approximate the integral.


Simpson's rule fits a quadratic curve through three points and calculates the area under the quadratic curve. Simpson's 3/8 rule uses four points and a cubic polynomial to evaluate the area, while Boole's rule uses five points and a quartic polynomial fit to generate an evaluation of the integral.

The derivation of these integration rules follows from simple polynomial fits through a specified number of data points. To derive Simpson's rule, consider a second-degree polynomial through the three points (x_0, f_0), (x_1, f_1), and (x_2, f_2):

p_2(x) = f_0 (x−x_1)(x−x_2)/[(x_0−x_1)(x_0−x_2)] + f_1 (x−x_0)(x−x_2)/[(x_1−x_0)(x_1−x_2)] + f_2 (x−x_0)(x−x_1)/[(x_2−x_0)(x_2−x_1)] .   (4.2.7)

This quadratic fit is derived by using Lagrange coefficients. The truncation error could also be included, but we neglect it for the present purposes. Plugging (4.2.7) into the integral gives

∫_{x_0}^{x_2} f(x) dx ≈ ∫_{x_0}^{x_2} p_2(x) dx = (h/3)(f_0 + 4f_1 + f_2) .   (4.2.8)

The integral calculation is easily performed since it only involves integrating powers of x² or less. Evaluating at the limits then causes many terms to cancel and drop out. Thus Simpson's rule is recovered. The trapezoid rule, Simpson's 3/8 rule, and Boole's rule are all derived in a similar fashion. To make connection with the quadrature rule (4.2.2), Q = w_0 f_0 + w_1 f_1 + w_2 f_2, so that Simpson's rule gives w_0 = h/3, w_1 = 4h/3, and w_2 = h/3 as weighting factors.
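Because Simpson's rule comes from an exact quadratic fit, it must integrate any quadratic exactly, which gives a quick check on the weights h/3, 4h/3, h/3. The sketch below, in Python rather than MATLAB, verifies this on an arbitrary quadratic chosen purely for illustration.

```python
# Simpson's rule on one panel [x0, x0 + 2h]: weights h/3, 4h/3, h/3
def simpson(f, x0, h):
    return (h / 3) * (f(x0) + 4 * f(x0 + h) + f(x0 + 2 * h))

f = lambda x: 3 * x**2 - 2 * x + 1   # arbitrary quadratic
F = lambda x: x**3 - x**2 + x        # its antiderivative
approx = simpson(f, 0.0, 1.0)        # integrate over [0, 2] with h = 1
exact = F(2.0) - F(0.0)
print(approx, exact)
```

For cubics the rule is, perhaps surprisingly, still exact; the h⁵ f''''(c) truncation term only activates at degree four and above.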

Composite Rules

The integration methods (4.2.6) give values for the integrals over only a small part of the integration domain. The trapezoid rule, for instance, only gives a value for x ∈ [x_0, x_1]. However, our fundamental aim is to evaluate the integral over the entire domain x ∈ [a, b]. Assuming once again that our interval is divided as a = x_0 < x_1 < x_2 < ··· < x_N = b, then the trapezoid rule applied over the interval gives the total integral

∫_a^b f(x) dx ≈ Q[f] = Σ_{j=0}^{N−1} (h/2)(f_j + f_{j+1}) .   (4.2.9)

Writing out this sum gives

Σ_{j=0}^{N−1} (h/2)(f_j + f_{j+1}) = (h/2)(f_0 + f_1) + (h/2)(f_1 + f_2) + ··· + (h/2)(f_{N−1} + f_N)
   = (h/2)(f_0 + 2f_1 + 2f_2 + ··· + 2f_{N−1} + f_N)   (4.2.10)
   = (h/2) [ f_0 + f_N + 2 Σ_{j=1}^{N−1} f_j ] .

The final expression no longer double counts the values of the points between f_0 and f_N. Instead, the final sum only counts the intermediate values once, thus making the algorithm about twice as fast as the previous sum expression. These are computational savings which should always be exploited if possible.
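The composite sum (4.2.10) is straightforward to code. The following sketch, in Python with NumPy rather than MATLAB, implements it and checks the result on ∫₀¹ x² dx = 1/3; the integrand and grid sizes are arbitrary choices for illustration.

```python
import numpy as np

def composite_trapezoid(f, a, b, N):
    # composite trapezoid rule (4.2.10): (h/2)[f0 + fN + 2*(sum of interior values)]
    x = np.linspace(a, b, N + 1)
    h = (b - a) / N
    fx = f(x)
    return (h / 2) * (fx[0] + fx[-1] + 2 * np.sum(fx[1:-1]))

approx = composite_trapezoid(lambda x: x**2, 0.0, 1.0, 100)
print(approx)  # close to 1/3
```

Doubling N should cut the error by roughly a factor of four, consistent with the O(h²) truncation of the trapezoid rule.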

Recursive Improvement of Accuracy

Given an integration procedure and a value of h, a function or data set can be

integrated to a prescribed accuracy. However, it may be desirable to improve the

accuracy without having to disregard previous approximations to the integral.

To see how this might work, consider the trapezoidal rule for a step size of 2h.

Thus, the even data points are the only ones of interest and we have the basic

one-step integral

∫_{x_0}^{x_2} f(x) dx = (2h/2)(f_0 + f_2) = h(f_0 + f_2) .   (4.2.11)

The composite rule associated with this is then

∫_a^b f(x) dx ≈ Q[f] = Σ_{j=0}^{N/2−1} h(f_{2j} + f_{2j+2}) .   (4.2.12)

Writing out this sum gives

Σ_{j=0}^{N/2−1} h(f_{2j} + f_{2j+2}) = h(f_0 + f_2) + h(f_2 + f_4) + ··· + h(f_{N−2} + f_N)
   = h(f_0 + 2f_2 + 2f_4 + ··· + 2f_{N−2} + f_N)   (4.2.13)
   = h [ f_0 + f_N + 2 Σ_{j=1}^{N/2−1} f_{2j} ] .

Comparing the middle expressions in (4.2.10) and (4.2.13) gives a great deal

of insight into how recursive schemes for improving accuracy work. Speciﬁcally,

we note that the more accurate scheme with step size h contains all the terms

in the integral approximation using step size 2h. Quantifying this gives

Q_h = (1/2) Q_{2h} + h(f_1 + f_3 + ··· + f_{N−1}) ,   (4.2.14)

where Q_h and Q_{2h} are the quadrature approximations to the integral with step size h and 2h respectively. This then allows us to halve the value of h and improve accuracy without jettisoning the work required to approximate the solution with the accuracy given by a step size of 2h. This recursive procedure can be continued so as to give higher accuracy results. Further, this type of procedure holds for Simpson's rule as well as any of the integration schemes developed here. This recursive routine is used in MATLAB in order to generate results to a prescribed accuracy.
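Relation (4.2.14) is easy to verify numerically. The sketch below, in Python with NumPy rather than MATLAB, refines a coarse trapezoid value Q_{2h} into Q_h by adding only the new odd-indexed points, and checks the result against the trapezoid rule computed from scratch; the integrand is an arbitrary choice for illustration.

```python
import numpy as np

def trapezoid(fx, h):
    # composite trapezoid rule for equally spaced samples fx
    return (h / 2) * (fx[0] + fx[-1] + 2 * np.sum(fx[1:-1]))

a, b, N = 0.0, np.pi, 8             # N even so both step sizes fit the grid
x = np.linspace(a, b, N + 1)
fx = np.sin(x)
h = (b - a) / N

Q2h = trapezoid(fx[::2], 2 * h)                  # coarse value, step 2h
Qh_recursive = 0.5 * Q2h + h * np.sum(fx[1::2])  # refinement via (4.2.14)
Qh_direct = trapezoid(fx, h)
print(Qh_recursive, Qh_direct)
```

The refinement reuses every function evaluation already made at step 2h, which is the whole point of the recursive scheme.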

4.3 Implementation of Diﬀerentiation and Integration

This lecture focuses on the implementation of differentiation and integration methods. Since they form the backbone of calculus, accurate methods to approximate these calculations are essential. To begin, we consider a specific function to be differentiated, namely a hyperbolic secant. Part of the reason for considering this function is that the exact value of its derivative is known. This allows us to make a comparison of our approximate methods with the actual solution. Thus, we consider

u = sech(x)   (4.3.1)

whose derivative is

du/dx = −sech(x) tanh(x) .   (4.3.2)

Diﬀerentiation

To begin the calculations, we define a spatial interval. For this example we take the interval x ∈ [−10, 10]. In MATLAB, the spatial discretization, Δx, must also be defined. This gives

dx=0.1; % spatial discretization
x=-10:dx:10; % spatial domain

Once the spatial domain has been deﬁned, the function to be diﬀerentiated must

be evaluated

u=sech(x);
ux_exact=-sech(x).*tanh(x);
figure(1), plot(x,u,x,ux_exact)

MATLAB figure 1 shows the function and its derivative.

To calculate the derivative numerically, we use the center-, forward-, and

backward-diﬀerence formulas derived for diﬀerentiation. Speciﬁcally, we will

make use of the following four ﬁrst-derivative approximations:

center-difference O(h²):  (y_{n+1} − y_{n−1})/2h   (4.3.3a)

center-difference O(h⁴):  (−y_{n+2} + 8y_{n+1} − 8y_{n−1} + y_{n−2})/12h   (4.3.3b)

forward-difference O(h²):  (−3y_n + 4y_{n+1} − y_{n+2})/2h   (4.3.3c)

backward-difference O(h²):  (3y_n − 4y_{n−1} + y_{n−2})/2h .   (4.3.3d)

Here we have used h = Δx and y_n = y(x_n).

Second-order accurate derivative

To calculate the second-order accurate derivative, we use the first, third and fourth equations of (4.3.3). In the interior of the domain, we use the center-difference scheme. However, at the left and right boundaries, there are no left and right neighboring points respectively to calculate the derivative with. Thus we require the use of forward- and backward-difference schemes respectively. This gives the basic algorithm

n=length(x);

% 2nd order accurate

ux(1)=(-3*u(1)+4*u(2)-u(3))/(2*dx);

for j=2:n-1

ux(j)=(u(j+1)-u(j-1))/(2*dx);

end

ux(n)=(3*u(n)-4*u(n-1)+u(n-2))/(2*dx);

The values of ux(1) and ux(n) are evaluated with the forward- and backward-difference schemes.

Fourth-order accurate derivative

A higher degree of accuracy can be achieved by using a fourth-order scheme. A fourth-order center-difference scheme such as the second equation of (4.3.3) relies on two neighboring points on each side. Thus the first two and last two points of the computational domain must be handled separately. In what follows, we use a second-order scheme at the boundary points, and a fourth-order scheme for the interior. This gives the algorithm

% 4th order accurate

ux2(1)=(-3*u(1)+4*u(2)-u(3))/(2*dx);

ux2(2)=(-3*u(2)+4*u(3)-u(4))/(2*dx);

for j=3:n-2

ux2(j)=(-u(j+2)+8*u(j+1)-8*u(j-1)+u(j-2))/(12*dx);

end

ux2(n-1)=(3*u(n-1)-4*u(n-2)+u(n-3))/(2*dx);

ux2(n)=(3*u(n)-4*u(n-1)+u(n-2))/(2*dx);


For the dx = 0.1 considered here, the second-order and fourth-order schemes result in errors of 10⁻² and 10⁻⁴ respectively. To view the accuracy, the results

are plotted together with the analytic solution for the derivative

figure(2), plot(x,ux_exact,'o',x,ux,'c',x,ux2,'m')

To the naked eye, these results all appear to lie exactly on the exact solution. However, by zooming in on a particular point, it quickly becomes clear that the errors for the two schemes are indeed 10⁻² and 10⁻⁴ respectively. The failure of accuracy can also be observed by taking large values of dx. For instance, dx = 0.5 and dx = 1 illustrate how the derivatives fail to be properly determined with large dx values.

As a final note, if you are given a set of data to differentiate, it is always recommended that you first run a spline through the data with an appropriately small dx and then differentiate the spline. This will help to give smooth differentiated data. Otherwise, the differentiated data will tend to be highly inaccurate and choppy.

Integration

There are a large number of integration routines built into MATLAB. So unlike the differentiation routines presented here, we will simply make use of the built-in MATLAB functions. The most straightforward integration routine is the trapezoidal rule. Given a set of data, the trapz command can be used to implement this integration rule. For the x and u data defined previously for the hyperbolic secant, the command structure gives

int_sech=trapz(x,u.^2);

where we have integrated sech²(x). The value of this integral is exactly two. The trapz command gives a value that is within 10⁻⁷ of the true value. To generate the cumulative values, the cumtrapz command is used.

int_sech2=cumtrapz(x,u.^2);

figure(3), plot(x,int_sech2)

MATLAB ﬁgure 3 gives the value of the integral as a function of x.

Alternatively, a function can be specified for integration over a given domain. The quad command is implemented by specifying a function and the range of integration. This will give a value of the integral using a recursive Simpson's rule that is accurate to within 10⁻⁶. The command structure for this evaluation is as follows

int_quad=quad(inline('sech(x).^2'),-10,10)


Here the inline command allows us to circumvent a traditional function call. The quad command can, however, be used with a function call. This command executes the integration of sech²(x) over x ∈ [−10, 10].

Double and triple integrals over two-dimensional rectangles and three-dimensional boxes can be performed with the dblquad and triplequad commands. Note that no version of the quad command exists that produces cumulative integration values. However, this can be easily handled in a for loop.

5 Diﬀerential Equations

Our ultimate goal is to solve very general nonlinear partial differential equations of elliptic, hyperbolic, parabolic, or mixed type. However, a variety of basic techniques are required from the solutions of ordinary differential equations. By understanding the basic ideas for computationally solving initial and boundary value problems for differential equations, we can solve more complicated partial differential equations. The development of numerical solution techniques for initial and boundary value problems originates from the simple concept of the Taylor expansion. Thus the building blocks for scientific computing are rooted in concepts from freshman calculus. Implementation, however, often requires ingenuity, insight, and clever application of the basic principles. In some sense, our numerical solution techniques reverse our understanding of calculus. Whereas calculus teaches us to take a limit in order to define a derivative or integral, in numerical computations we take the derivative or integral of the governing equation and go backwards to define it as the difference.

5.1 Initial value problems: Euler, Runge-Kutta and Adams

methods

The solutions of general partial differential equations rely heavily on the techniques developed for ordinary differential equations. Thus we begin by considering systems of differential equations of the form

dy/dt = f(y, t)   (5.1.1)

where y represents a vector and the initial conditions are given by

y(0) = y_0   (5.1.2)

with t ∈ [0, T]. Although very simple in appearance, this equation cannot be

solved analytically in general. Of course, there are certain cases for which the

problem can be solved analytically, but it will generally be important to rely

on numerical solutions for insight. For an overview of analytic techniques, see

Boyce and DiPrima [1].


The simplest algorithm for solving this system of differential equations is known as the Euler method. The Euler method is derived by making use of the definition of the derivative:

dy/dt = lim_{Δt→0} Δy/Δt .   (5.1.3)

Thus over a time span Δt = t_{n+1} − t_n we can approximate the original differential equation by

dy/dt = f(y, t)  ⇒  (y_{n+1} − y_n)/Δt ≈ f(y_n, t_n) .   (5.1.4)

The approximation can easily be rearranged to give

y_{n+1} = y_n + Δt f(y_n, t_n) .   (5.1.5)

Thus the Euler method gives an iterative scheme by which the future values of the solution can be determined. Generally, the algorithm structure is of the form

y(t_{n+1}) = F(y(t_n))   (5.1.6)

where F(y(t_n)) = y(t_n) + Δt f(y(t_n), t_n). The graphical representation of this iterative process is illustrated in Fig. 1, where the slope (derivative) of the function is responsible for generating each subsequent approximation to the solution y(t). Note that the Euler method is exact as the step size decreases to zero: Δt → 0.

The Euler method can be generalized to the following iterative scheme:

y_{n+1} = y_n + Δt φ ,   (5.1.7)

where the function φ is chosen to reduce the error over a single time step Δt and y_n = y(t_n). The function φ is no longer constrained, as in the Euler scheme,

to make use of the derivative at the left end point of the computational step.

Rather, the derivative at the mid-point of the time-step and at the right end of

the time-step may also be used to possibly improve accuracy. In particular, by

generalizing to include the slope at the left and right ends of the time-step ∆t,

we can generate an iteration scheme of the following form:

y(t + Δt) = y(t) + Δt [A f(t, y(t)) + B f(t + PΔt, y(t) + QΔt f(t, y(t)))]   (5.1.8)

where A, B, P and Q are arbitrary constants. Upon Taylor expanding the last term, we find

f(t + PΔt, y(t) + QΔt f(t, y(t))) = f(t, y(t)) + PΔt f_t(t, y(t)) + QΔt f_y(t, y(t)) f(t, y(t)) + O(Δt²)   (5.1.9)

where f_t and f_y denote differentiation with respect to t and y respectively, use has been made of (5.1.1), and O(Δt²) denotes all terms that are of size Δt² and smaller.

Figure 1: Graphical description of the iteration process used in the Euler method. Note that each subsequent approximation is generated from the slope of the previous point. This graphical illustration suggests that smaller steps Δt should be more accurate.

Plugging this last result into the original iteration scheme (5.1.8) results in the following:

y(t + Δt) = y(t) + Δt(A + B) f(t, y(t)) + PBΔt² f_t(t, y(t)) + BQΔt² f_y(t, y(t)) f(t, y(t)) + O(Δt³)   (5.1.10)

which is valid up to O(Δt²).

To proceed further, we simply note that the Taylor expansion for y(t + Δt) gives:

y(t + Δt) = y(t) + Δt f(t, y(t)) + (1/2)Δt² f_t(t, y(t)) + (1/2)Δt² f_y(t, y(t)) f(t, y(t)) + O(Δt³) .   (5.1.11)

Comparing this Taylor expansion with (5.1.10) gives the following relations:

A + B = 1   (5.1.12a)
PB = 1/2   (5.1.12b)
BQ = 1/2   (5.1.12c)


Figure 2: Graphical description of the initial, intermediate, and final slopes used in the 4th-order Runge-Kutta iteration scheme over a time Δt.

which yields three equations for the four unknowns A, B, P and Q. Thus one degree of freedom is granted, and a wide variety of schemes can be implemented. Two of the more commonly used schemes are known as Heun's method and the Modified Euler-Cauchy method (second-order Runge-Kutta). These schemes assume A = 1/2 and A = 0 respectively, and are given by:

y(t + Δt) = y(t) + (Δt/2) [f(t, y(t)) + f(t + Δt, y(t) + Δt f(t, y(t)))]   (5.1.13a)

y(t + Δt) = y(t) + Δt f(t + Δt/2, y(t) + (Δt/2) f(t, y(t))) .   (5.1.13b)

Generally speaking, these methods for iterating forward in time given a single

initial point are known as Runge-Kutta methods. By generalizing the assumption

(5.1.8), we can construct stepping schemes which have arbitrary accuracy. Of

course, the level of algebraic diﬃculty in deriving these higher accuracy schemes

also increases signiﬁcantly from Heun’s method and Modiﬁed Euler-Cauchy.
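A quick way to see that Heun's method (5.1.13a) really is second order is to halve the step size and watch the error drop by roughly a factor of four. The sketch below, in Python with NumPy rather than MATLAB, does this for the test problem dy/dt = −y, y(0) = 1, an arbitrary choice for illustration.

```python
import numpy as np

def heun_solve(f, y0, T, dt):
    # Heun's method (5.1.13a): average of left-end slope and predicted right-end slope
    n = int(round(T / dt))
    y, t = y0, 0.0
    for _ in range(n):
        s1 = f(t, y)
        s2 = f(t + dt, y + dt * s1)
        y += (dt / 2) * (s1 + s2)
        t += dt
    return y

f = lambda t, y: -y
exact = np.exp(-1.0)                 # y(1) for dy/dt = -y, y(0) = 1
err1 = abs(heun_solve(f, 1.0, 1.0, 0.1) - exact)
err2 = abs(heun_solve(f, 1.0, 1.0, 0.05) - exact)
print(err1 / err2)  # near 4 for a second-order scheme
```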

4th-order Runge-Kutta

Perhaps the most popular general stepping scheme used in practice is known as the 4th-order Runge-Kutta method. The term "4th order" refers to the fact that the Taylor series local truncation error is pushed to O(Δt⁵). The total cumulative (global) error is then O(Δt⁴) and is responsible for the scheme name of "4th order". The scheme is as follows:

y_{n+1} = y_n + (Δt/6) [f_1 + 2f_2 + 2f_3 + f_4]   (5.1.14)

where

f_1 = f(t_n, y_n)   (5.1.15a)
f_2 = f(t_n + Δt/2, y_n + (Δt/2) f_1)   (5.1.15b)
f_3 = f(t_n + Δt/2, y_n + (Δt/2) f_2)   (5.1.15c)
f_4 = f(t_n + Δt, y_n + Δt f_3) .   (5.1.15d)

This scheme gives a local truncation error which is O(Δt⁵). The cumulative (global) error in this case is fourth order, so that for t ∼ O(1) the error is O(Δt⁴). The key to this method, as well as any of the other Runge-Kutta schemes, is the use of intermediate time-steps to improve accuracy. For the 4th-order scheme presented here, a graphical representation of this derivative sampling at intermediate time-steps is shown in Fig. 2.
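To see the payoff of the intermediate slopes, the following sketch, in Python with NumPy rather than MATLAB, advances dy/dt = −y with both the Euler scheme (5.1.5) and the 4th-order Runge-Kutta scheme (5.1.14)-(5.1.15) and compares errors at t = 1; the test problem and step size are arbitrary choices for illustration.

```python
import numpy as np

f = lambda t, y: -y                 # test problem dy/dt = -y, y(0) = 1
dt, steps = 0.1, 10
y_euler, y_rk4, t = 1.0, 1.0, 0.0

for _ in range(steps):
    # Euler step (5.1.5)
    y_euler += dt * f(t, y_euler)
    # 4th-order Runge-Kutta step (5.1.14)-(5.1.15)
    f1 = f(t, y_rk4)
    f2 = f(t + dt/2, y_rk4 + dt/2 * f1)
    f3 = f(t + dt/2, y_rk4 + dt/2 * f2)
    f4 = f(t + dt, y_rk4 + dt * f3)
    y_rk4 += (dt / 6) * (f1 + 2*f2 + 2*f3 + f4)
    t += dt

exact = np.exp(-1.0)
print(abs(y_euler - exact), abs(y_rk4 - exact))
```

For the same number of steps the Runge-Kutta error is smaller by several orders of magnitude, at the cost of four slope evaluations per step instead of one.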

Adams method: multi-stepping techniques

The development of the Runge-Kutta schemes relies on the definition of the
derivative and Taylor expansions. Another approach to solving (5.1.1) is to start
with the fundamental theorem of calculus [2]. Thus the differential equation can
be integrated over a time-step ∆t to give

dy/dt = f(t, y)  ⇒  y(t + ∆t) − y(t) = ∫_t^{t+∆t} f(t, y) dt .   (5.1.16)

And once again using our iteration notation we ﬁnd

y_{n+1} = y_n + ∫_{t_n}^{t_{n+1}} f(t, y) dt .   (5.1.17)

This iteration relation is simply a restatement of (5.1.7) with ∆tφ = ∫_{t_n}^{t_{n+1}} f(t, y) dt.

However, at this point, no approximations have been made and (5.1.17) is exact.

The numerical solution will be found by approximating f(t, y) ≈ p(t, y) where

p(t, y) is a polynomial. Thus the iteration scheme in this instance will be given

by

y_{n+1} ≈ y_n + ∫_{t_n}^{t_{n+1}} p(t, y) dt .   (5.1.18)

It only remains to determine the form of the polynomial to be used in the

approximation.

The Adams-Bashforth suite of computational methods uses the current point

and a determined number of past points to evaluate the future solution. As

with the Runge-Kutta schemes, the order of accuracy is determined by the

choice of φ. In the Adams-Bashforth case, this relates directly to the choice

of the polynomial approximation p(t, y). A ﬁrst-order scheme can easily be

constructed by allowing

p_1(t) = constant = f(t_n, y_n) ,   (5.1.19)


where the present point and no past points are used to determine the value of

the polynomial. Inserting this ﬁrst-order approximation into (5.1.18) results in

the previously found Euler scheme

y_{n+1} = y_n + ∆t f(t_n, y_n) .   (5.1.20)

Alternatively, we could assume that the polynomial used both the current point

and the previous point so that a second-order scheme resulted. The linear

polynomial which passes through these two points is given by

p_2(t) = f_{n−1} + ((f_n − f_{n−1})/∆t) (t − t_{n−1}) .   (5.1.21)

When inserted into (5.1.18), this linear polynomial yields

y_{n+1} = y_n + ∫_{t_n}^{t_{n+1}} [ f_{n−1} + ((f_n − f_{n−1})/∆t) (t − t_{n−1}) ] dt .   (5.1.22)

Upon integration and evaluation at the upper and lower limits, we ﬁnd the

following 2nd order Adams-Bashforth scheme

y_{n+1} = y_n + (∆t/2) [3f(t_n, y_n) − f(t_{n−1}, y_{n−1})] .   (5.1.23)

In contrast to the Runge-Kutta method, this is a two-step algorithm which re-

quires two initial conditions. This technique can be easily generalized to include

more past points and thus higher accuracy. However, as accuracy is increased,

so are the number of initial conditions required to step forward one time-step

∆t. Aside from the first-order accurate scheme, any implementation of Adams-
Bashforth will require a bootstrap to generate a second “initial condition” for
the solution iteration process.
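To make the bootstrap issue concrete, the following sketch (in Python; the function name ab2 and the test equation dy/dt = y are illustrative choices) implements the second-order scheme (5.1.23), generating the needed second starting value with a single Euler step. The one-time O(∆t^2) bootstrap error does not degrade the overall second-order accuracy.

```python
import math

def ab2(f, t0, y0, dt, steps):
    """2nd-order Adams-Bashforth (5.1.23); bootstraps y_1 with one Euler step."""
    t, y = [t0, t0 + dt], [y0, y0 + dt * f(t0, y0)]   # Euler step supplies y_1
    for n in range(1, steps):
        y.append(y[n] + dt/2 * (3*f(t[n], y[n]) - f(t[n-1], y[n-1])))
        t.append(t[n] + dt)
    return t, y

f = lambda t, y: y                      # dy/dt = y, exact solution exp(t)
_, y = ab2(f, 0.0, 1.0, 0.01, 100)      # advance to t = 1
print(abs(y[-1] - math.exp(1.0)))       # global error, expected O(dt^2)
```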

The Adams-Bashforth scheme uses current and past points to approximate
the polynomial p(t, y) in (5.1.18). If instead a future point, the present, and the
past are used, then the scheme is known as an Adams-Moulton method. As before,
a first-order scheme can easily be constructed by allowing

p_1(t) = constant = f(t_{n+1}, y_{n+1}) ,   (5.1.24)

where the future point and no past and present points are used to determine the

value of the polynomial. Inserting this ﬁrst-order approximation into (5.1.18)

results in the backward Euler scheme

y_{n+1} = y_n + ∆t f(t_{n+1}, y_{n+1}) .   (5.1.25)

Alternatively, we could assume that the polynomial used both the future point

and the current point so that a second-order scheme resulted. The linear poly-

nomial which passes through these two points is given by

p_2(t) = f_n + ((f_{n+1} − f_n)/∆t) (t − t_n) .   (5.1.26)


Inserted into (5.1.18), this linear polynomial yields

y_{n+1} = y_n + ∫_{t_n}^{t_{n+1}} [ f_n + ((f_{n+1} − f_n)/∆t) (t − t_n) ] dt .   (5.1.27)

Upon integration and evaluation at the upper and lower limits, we ﬁnd the

following 2nd order Adams-Moulton scheme

y_{n+1} = y_n + (∆t/2) [f(t_{n+1}, y_{n+1}) + f(t_n, y_n)] .   (5.1.28)

Once again this is a two-step algorithm. However, it is categorically different
from the Adams-Bashforth methods since it results in an implicit scheme,
i.e. the unknown value y_{n+1} is specified through a nonlinear equation (5.1.28).

The solution of this nonlinear system can be very diﬃcult, thus making ex-

plicit schemes such as Runge-Kutta and Adams-Bashforth, which are simple

iterations, more easily handled. However, implicit schemes can have advantages

when considering stability issues related to time-stepping. This is explored fur-

ther in the notes.

One way to circumvent the difficulties of the implicit stepping method while
still making use of its power is to use a Predictor-Corrector method. This scheme
draws on the power of both the Adams-Bashforth and Adams-Moulton schemes.
In particular, the second-order implicit scheme given by (5.1.28) requires the
value of f(t_{n+1}, y_{n+1}) in the right hand side. If we can predict (approximate)
this value, then we can use this predicted value to solve (5.1.28) explicitly. Thus
we begin with a predictor step to estimate y_{n+1} so that f(t_{n+1}, y_{n+1}) can be
evaluated. We then insert this value into the right hand side of (5.1.28) and
explicitly find the corrected value of y_{n+1}. The second-order predictor-corrector
steps are then as follows:

predictor (Adams-Bashforth):  y^P_{n+1} = y_n + (∆t/2) [3f_n − f_{n−1}]   (5.1.29a)
corrector (Adams-Moulton):    y_{n+1} = y_n + (∆t/2) [f(t_{n+1}, y^P_{n+1}) + f(t_n, y_n)] .   (5.1.29b)

Thus the scheme utilizes both explicit and implicit time-stepping schemes with-

out having to solve a system of nonlinear equations.
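A minimal sketch of the predictor-corrector pair (5.1.29) follows (in Python; the function name pc2_step and the test problem dy/dt = −2y are our own illustrative choices):

```python
import math

def pc2_step(f, t, y_prev, y, dt):
    """One predictor-corrector step (5.1.29): AB2 predicts, AM2 corrects."""
    y_pred = y + dt/2 * (3*f(t, y) - f(t - dt, y_prev))   # predictor (5.1.29a)
    return y + dt/2 * (f(t + dt, y_pred) + f(t, y))       # corrector (5.1.29b)

f = lambda t, y: -2*y                     # dy/dt = -2y, exact solution exp(-2t)
dt = 0.01
y_prev, y = 1.0, math.exp(-2*dt)          # seed the two needed starting values
for n in range(1, 100):                   # advance to t = 1
    y_prev, y = y, pc2_step(f, n*dt, y_prev, y, dt)

print(abs(y - math.exp(-2.0)))            # second-order accurate
```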

Higher order diﬀerential equations

Thus far, we have considered systems of ﬁrst order equations. Higher order dif-

ferential equations can be put into this form and the methods outlined here can

be applied. For example, consider the third-order, nonhomogeneous, diﬀerential

equation

d^3u/dt^3 + u^2 du/dt + (cos t) u = g(t) .   (5.1.30)


By deﬁning

y_1 = u   (5.1.31a)
y_2 = du/dt   (5.1.31b)
y_3 = d^2u/dt^2 ,   (5.1.31c)

we find that dy_3/dt = d^3u/dt^3. Using the original equation along with the
definitions of y_i we find that

dy_1/dt = y_2   (5.1.32a)
dy_2/dt = y_3   (5.1.32b)
dy_3/dt = d^3u/dt^3 = −u^2 du/dt − (cos t) u + g(t) = −y_1^2 y_2 − (cos t) y_1 + g(t)   (5.1.32c)

which results in a first-order system of the form (5.1.1) considered previously

dy/dt = d/dt [y_1; y_2; y_3] = [y_2; y_3; −y_1^2 y_2 − (cos t) y_1 + g(t)] = f(y, t) .   (5.1.33)

At this point, all the time-stepping techniques developed thus far can be applied

to the problem. It is imperative to write any diﬀerential equation as a ﬁrst-order

system before solving it numerically with the time-stepping schemes developed

here.
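For instance, the right-hand side of (5.1.33) translates directly into a vector-valued function that any of the time-steppers can call. The sketch below is in Python, and the choice g(t) = sin t is purely illustrative:

```python
import math

def g(t):
    return math.sin(t)          # illustrative choice of the forcing g(t)

def rhs(t, y):
    """Right-hand side of the first-order system (5.1.33); y = (y1, y2, y3)."""
    y1, y2, y3 = y
    return [y2,
            y3,
            -y1**2 * y2 - math.cos(t)*y1 + g(t)]

print(rhs(0.0, [1.0, 0.0, 0.0]))   # -> [0.0, 0.0, -1.0]
```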

MATLAB commands

The time-stepping schemes considered here are all available in the MATLAB

suite of diﬀerential equation solvers. The following are a few of the most common

solvers:

• ode23: second-order Runge-Kutta routine

• ode45: fourth-order Runge-Kutta routine

• ode113: variable order predictor-corrector routine

• ode15s: variable order Gear method for stiﬀ problems [3, 4]

5.2 Error analysis for time-stepping routines

Accuracy and stability are fundamental to numerical analysis and are the key fac-

tors in evaluating any numerical integration technique. Therefore, it is essential


to evaluate the accuracy and stability of the time-stepping schemes developed.

Rarely does it occur that both accuracy and stability work in concert. In fact,

they often are oﬀsetting and work directly against each other. Thus a highly

accurate scheme may compromise stability, whereas a low accuracy scheme may

have excellent stability properties.

We begin by exploring accuracy. In the context of time-stepping schemes,

the natural place to begin is with Taylor expansions. Thus we consider the

expansion

y(t + ∆t) = y(t) + ∆t dy(t)/dt + (∆t^2/2) d^2y(c)/dt^2   (5.2.1)

where c ∈ [t, t+∆t]. Since we are considering dy/dt = f(t, y), the above formula

reduces to the Euler iteration scheme

y_{n+1} = y_n + ∆t f(t_n, y_n) + O(∆t^2) .   (5.2.2)

It is clear from this that the truncation error is O(∆t^2). Specifically, the
truncation error is given by (∆t^2/2) d^2y(c)/dt^2.

Of importance is how this truncation error contributes to the overall error

in the numerical solution. Two types of error are important to identify: local

and global error. Each is signiﬁcant in its own right. However, in practice we

are only concerned with the global (cumulative) error. The global discretization

error is given by

E_k = y(t_k) − y_k   (5.2.3)

where y(t_k) is the exact solution and y_k is the numerical solution. The local
discretization error is given by

ϵ_{k+1} = y(t_{k+1}) − (y(t_k) + ∆t φ)   (5.2.4)

where y(t_{k+1}) is the exact solution and y(t_k) + ∆t φ is a one-step approximation
over the time interval t ∈ [t_k, t_{k+1}].

For the Euler method, we can calculate both the local and global error.

Given a time-step ∆t and a speciﬁed time interval t ∈ [a, b], we have after K

steps that ∆t K = b −a. Thus we ﬁnd

local:   ϵ_k = (∆t^2/2) d^2y(c_k)/dt^2 ∼ O(∆t^2)   (5.2.5a)

global:  E_k = Σ_{j=1}^{K} (∆t^2/2) d^2y(c_j)/dt^2 ≈ (∆t^2/2) d^2y(c)/dt^2 · K
             = (∆t^2/2) d^2y(c)/dt^2 · (b − a)/∆t = ((b − a)/2) ∆t d^2y(c)/dt^2 ∼ O(∆t)   (5.2.5b)

which gives a local error for the Euler scheme which is O(∆t^2) and a global
error which is O(∆t). Thus the cumulative error is poor for the Euler scheme,
i.e. it is not very accurate.


scheme                       local error ϵ_k    global error E_k
Euler                        O(∆t^2)            O(∆t)
2nd order Runge-Kutta        O(∆t^3)            O(∆t^2)
4th order Runge-Kutta        O(∆t^5)            O(∆t^4)
2nd order Adams-Bashforth    O(∆t^3)            O(∆t^2)

Table 7: Local and global discretization errors associated with various time-
stepping schemes.

A similar procedure can be carried out for all the schemes discussed thus

far, including the multi-step Adams schemes. Table 7 illustrates various schemes

and their associated local and global errors. The error analysis suggests that the

error will always decrease in some power of ∆t. Thus it is tempting to conclude

that higher accuracy is easily achieved by taking smaller time steps ∆t. This

would be true if not for round-oﬀ error in the computer.

Round-oﬀ and step-size

An unavoidable consequence of working with numerical computations is round-

oﬀ error. When working with most computations, double precision numbers

are used. This allows for 16-digit accuracy in the representation of a given

number. This round-oﬀ has signiﬁcant impact upon numerical computations

and the issue of time-stepping.

As an example of the impact of round-oﬀ, we consider the Euler approxima-

tion to the derivative

dy/dt ≈ (y_{n+1} − y_n)/∆t + ϵ(y_n, ∆t)   (5.2.6)

where ϵ(y_n, ∆t) measures the truncation error. Upon evaluating this expression
in the computer, round-off error occurs so that

y_{n+1} = Y_{n+1} + e_{n+1} .   (5.2.7)

Thus the combined error between the round-oﬀ and truncation gives the follow-

ing expression for the derivative:

dy/dt = (Y_{n+1} − Y_n)/∆t + E_n(y_n, ∆t)   (5.2.8)

where the total error, E_n, is the combination of round-off and truncation such
that

E_n = E_round + E_trunc = (e_{n+1} − e_n)/∆t − (∆t^2/2) d^2y(c)/dt^2 .   (5.2.9)


We now determine the maximum size of the error. In particular, we can bound
the maximum value of the round-off and the derivative by

|e_{n+1}| ≤ e_r   (5.2.10a)
|−e_n| ≤ e_r   (5.2.10b)
M = max_{c∈[t_n, t_{n+1}]} |d^2y(c)/dt^2| .   (5.2.10c)

This then gives the maximum error to be

|E_n| ≤ (e_r + e_r)/∆t + (∆t^2/2) M = 2e_r/∆t + (∆t^2/2) M .   (5.2.11)

To minimize the error, we require that ∂|E_n|/∂(∆t) = 0. Calculating this
derivative gives

∂|E_n|/∂(∆t) = −2e_r/∆t^2 + M∆t = 0 ,   (5.2.12)

so that

∆t = (2e_r/M)^{1/3} .   (5.2.13)

This gives the step size resulting in a minimum error. Thus the smallest step-

size is not necessarily the most accurate. Rather, a balance between round-oﬀ

error and truncation error is achieved to obtain the optimal step-size.
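This balance is easy to observe numerically. The Python sketch below (our own illustration, not from the notes) measures the error of the forward-difference approximation to the derivative of sin t at t = 1: the error first decreases as ∆t shrinks (truncation dominates), then grows again once round-off dominates.

```python
import math

def fd_error(dt):
    """Error of the forward difference (sin(1+dt)-sin(1))/dt versus cos(1)."""
    approx = (math.sin(1.0 + dt) - math.sin(1.0)) / dt
    return abs(approx - math.cos(1.0))

# Truncation error shrinks with dt; round-off error grows like e_r/dt.
for dt in [1e-1, 1e-4, 1e-8, 1e-12, 1e-15]:
    print(dt, fd_error(dt))
```

The smallest step size 1e-15 gives a far worse error than the intermediate one, exactly as the analysis above predicts.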

Stability

The accuracy of any scheme is certainly important. However, it is meaningless
if the scheme is not stable numerically. The essence of a stable scheme is that
the numerical solution does not blow up to infinity. As an example, consider the
simple differential equation

dy/dt = λy   (5.2.14)

with

y(0) = y_0 .   (5.2.15)

The analytic solution is easily calculated to be y(t) = y_0 exp(λt). However, if
we solve this problem numerically with a forward Euler method we find

y_{n+1} = y_n + ∆t λ y_n = (1 + λ∆t) y_n .   (5.2.16)

After N steps, we ﬁnd this iteration scheme yields

y_N = (1 + λ∆t)^N y_0 .   (5.2.17)


Given that we have a certain amount of round oﬀ error, the numerical solution

would then be given by

y_N = (1 + λ∆t)^N (y_0 + e) .   (5.2.18)

The error then associated with this scheme is given by

E = (1 + λ∆t)^N e .   (5.2.19)

At this point, the following observations can be made. For λ > 0, the solution
y_N → ∞ in Eq. (5.2.18) as N → ∞. So although the error also grows, it may
not be significant in comparison to the size of the numerical solution.

In contrast, Eq. (5.2.18) for λ < 0 is markedly different. For this case,
y_N → 0 in Eq. (5.2.18) as N → ∞. The error, however, can dominate in
this case. In particular, we have the two following cases for the error given by
(5.2.19):

(5.2.19):

I: [1 +λ∆t[ < 1 then E →0 (5.2.20a)

II: [1 +λ∆t[ > 1 then E →∞. (5.2.20b)

In case I, the scheme would be considered stable. However, case II holds and is

unstable provided ∆t > −2/λ.

A general theory of stability can be developed for any one-step time-stepping
scheme. Consider the one-step recursion relation for an M × M system

y_{n+1} = Ay_n .   (5.2.21)

After N steps, the algorithm yields the solution

y_N = A^N y_0 ,   (5.2.22)

where y_0 is the initial vector. A well-known result from linear algebra is that

A^N = S Λ^N S^{−1}   (5.2.23)

where S is the matrix whose columns are the eigenvectors of A, and

Λ = diag(λ_1, λ_2, ..., λ_M)   →   Λ^N = diag(λ_1^N, λ_2^N, ..., λ_M^N)   (5.2.24)

is a diagonal matrix whose entries are the eigenvalues of A. Thus upon calcu-
lating Λ^N, we are only concerned with the eigenvalues. In particular, instability
occurs if |λ_i| > 1 for any i = 1, 2, ..., M. This method can be easily generalized
to two-step schemes (Adams methods) by considering y_{n+1} = Ay_n + By_{n−1}.


Figure 3: Regions for stable stepping for the forward Euler and backward Euler

schemes.

Lending further signiﬁcance to this stability analysis is its connection with

practical implementation. We contrast the diﬀerence in stability between the

forward and backward Euler schemes. The forward Euler scheme has already

been considered in (5.2.16)-(5.2.19). The backward Euler displays signiﬁcant

diﬀerences in stability. If we again consider (5.2.14) with (5.2.15), the backward

Euler method gives the iteration scheme

y_{n+1} = y_n + ∆t λ y_{n+1} ,   (5.2.25)

which after N steps leads to

y_N = (1/(1 − λ∆t))^N y_0 .   (5.2.26)

The round-oﬀ error associated with this scheme is given by

E = (1/(1 − λ∆t))^N e .   (5.2.27)

By letting z = λ∆t be a complex number, we ﬁnd the following criteria to yield

unstable behavior based upon (5.2.19) and (5.2.27)

forward Euler:   |1 + z| > 1   (5.2.28a)
backward Euler:  |1/(1 − z)| > 1 .   (5.2.28b)

Figure 3 shows the regions of stable and unstable behavior as a function of z.

It is observed that the forward Euler scheme has a very small range of stability

whereas the backward Euler scheme has a large range of stability. This large

stability region is part of what makes implicit methods so attractive. Thus

stability regions can be calculated. However, control of the accuracy is also

essential.
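The contrast is easy to verify numerically. For (5.2.14) with λ = −10, forward Euler violates |1 + λ∆t| < 1 once ∆t > 2/|λ| = 0.2, while backward Euler is stable for any ∆t > 0 (a Python sketch, our own illustration):

```python
lam, dt, N = -10.0, 0.3, 50     # dt = 0.3 exceeds the forward Euler limit 0.2
yf = yb = 1.0
for n in range(N):
    yf = (1 + lam*dt) * yf      # forward Euler:  |1 + lam*dt| = 2 > 1, unstable
    yb = yb / (1 - lam*dt)      # backward Euler: |1/(1 - lam*dt)| = 1/4 < 1, stable
print(abs(yf), abs(yb))         # yf has blown up; yb has decayed toward zero
```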


5.3 Boundary value problems: the shooting method

To this point, we have only considered the solutions of diﬀerential equations for

which the initial conditions are known. However, many physical applications

do not have speciﬁed initial conditions, but rather some given boundary (con-

straint) conditions. A simple example of such a problem is the second-order

boundary value problem

d^2y/dt^2 = f(t, y, dy/dt)   (5.3.1)

on t ∈ [a, b] with the general boundary conditions

α_1 y(a) + β_1 dy(a)/dt = γ_1   (5.3.2a)
α_2 y(b) + β_2 dy(b)/dt = γ_2 .   (5.3.2b)

Thus the solution is deﬁned over a speciﬁc interval and must satisfy the relations

(5.3.2) at the end points of the interval. Figure 4 gives a graphical representa-

tion of a generic boundary value problem solution. We discuss the algorithm

necessary to make use of the time-stepping schemes in order to solve such a

problem.

The Shooting Method

The boundary value problems constructed here require information at the present

time (t = a) and a future time (t = b). However, the time-stepping schemes de-

veloped previously only require information about the starting time t = a. Some

eﬀort is then needed to reconcile the time-stepping schemes with the boundary

value problems presented here.

We begin by reconsidering the generic boundary value problem

d

2

y

dt

2

= f

t, y,

dy

dt

(5.3.3)

on t ∈ [a, b] with the boundary conditions

y(a) = α (5.3.4a)

y(b) = β . (5.3.4b)

The stepping schemes considered thus far for second-order differential equations
involve a choice of the initial conditions y(a) and y′(a). We can still approach

the boundary value problem from this framework by choosing the “initial” con-

ditions

y(a) = α   (5.3.5a)
dy(a)/dt = A ,   (5.3.5b)



Figure 4: Graphical depiction of the structure of a typical solution to a boundary

value problem with constraints at t = a and t = b.

where the constant A is chosen so that as we advance the solution to t = b we
find y(b) = β. The shooting method gives an iterative procedure with which we
can determine this constant A. Figure 5 illustrates the solution of the boundary
value problem given two distinct values of A. In this case, the value A = A_1
gives a value for the initial slope which is too low to satisfy the boundary
conditions (5.3.4), whereas the value A = A_2 is too large to satisfy (5.3.4).

Computational Algorithm

The above example demonstrates that adjusting the value of A in (5.3.5b) can

lead to a solution which satisﬁes (5.3.4b). We can solve this using a self-

consistent algorithm to search for the appropriate value of A which satisﬁes

the original problem. The basic algorithm is as follows:

1. Solve the differential equation using a time-stepping scheme with the ini-
tial conditions y(a) = α and y′(a) = A.

2. Evaluate the solution y(b) at t = b and compare this value with the target

value of y(b) = β.

3. Adjust the value of A (either bigger or smaller) until a desired level of

tolerance and accuracy is achieved. A bisection method for determining

values of A, for instance, may be appropriate.

4. Once the speciﬁed accuracy has been achieved, the numerical solution

is complete and is accurate to the level of the tolerance chosen and the

discretization scheme used in the time-stepping.
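The four steps above can be sketched compactly. The Python code below (our own illustration) applies the shooting idea to the problem y″ = −y on [0, π/2] with y(0) = 0 and y(π/2) = 1, whose exact solution y = sin t has true slope A = y′(0) = 1; it uses RK4 for step 1 and bisection for step 3:

```python
import math

def integrate(A, a=0.0, b=math.pi/2, steps=200):
    """Advance y'' = -y from t=a with y(a)=0, y'(a)=A via RK4; return y(b)."""
    dt = (b - a) / steps
    y, v = 0.0, A                      # state (y, y') with y'(a) = A
    f = lambda y, v: (v, -y)           # first-order system for y'' = -y
    for n in range(steps):
        k1 = f(y, v)
        k2 = f(y + dt/2*k1[0], v + dt/2*k1[1])
        k3 = f(y + dt/2*k2[0], v + dt/2*k2[1])
        k4 = f(y + dt*k3[0], v + dt*k3[1])
        y += dt/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0])
        v += dt/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1])
    return y

beta = 1.0                             # target boundary value y(b) = beta
lo, hi = 0.0, 2.0                      # bracket for the unknown slope A
for _ in range(50):                    # bisection on y(b; A) - beta
    A = (lo + hi) / 2
    if integrate(A) < beta:
        lo = A                         # undershoot: slope too small
    else:
        hi = A                         # overshoot: slope too large
print(A)                               # converges to the true slope y'(0) = 1
```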



Figure 5: Solutions to the boundary value problem with y(a) = α and y′(a) = A.
Here, two values of A are used to illustrate the solution behavior and its lack
of matching the correct boundary value y(b) = β. However, the two solutions
suggest that a bisection scheme could be used to find the correct solution and
value of A.

We illustrate graphically a bisection process in Fig. 6 and show the con-
vergence of the method to the numerical solution which satisfies the original
boundary conditions y(a) = α and y(b) = β. This process can occur quickly, so
that convergence is achieved in a relatively small number of iterations provided
the differential equation is well behaved.

Implementing MATLAB time-stepping schemes

Up to this point in our discussions of diﬀerential equations, our solution tech-

niques have relied on solving initial value problems with some appropriate it-

eration procedure. To implement one of these solution schemes in MATLAB

requires information about the equation, the time or spatial range to solve for,

and the initial conditions.

To build on a speciﬁc example, consider the third-order, nonhomogeneous,

diﬀerential equation

d^3u/dt^3 + u^2 du/dt + (cos t) u = A sin^2 t .   (5.3.6)



Figure 6: Graphical illustration of the shooting process which uses a bisection

scheme to converge to the appropriate value of A for which y(b) = β.

By deﬁning

y_1 = u   (5.3.7a)
y_2 = du/dt   (5.3.7b)
y_3 = d^2u/dt^2 ,   (5.3.7c)

we find that dy_3/dt = d^3u/dt^3. Using the original equation along with the
definitions of y_i we find that

dy_1/dt = y_2   (5.3.8a)
dy_2/dt = y_3   (5.3.8b)
dy_3/dt = d^3u/dt^3 = −u^2 du/dt − (cos t) u + A sin^2 t = −y_1^2 y_2 − (cos t) y_1 + A sin^2 t   (5.3.8c)

which results in the ﬁrst-order system (5.1.1) considered previously

dy/dt = d/dt [y_1; y_2; y_3] = [y_2; y_3; −y_1^2 y_2 − (cos t) y_1 + A sin^2 t] = f(y, t) .   (5.3.9)


At this point, all the time-stepping techniques developed thus far can be applied

to the problem. It is imperative to write any diﬀerential equation as a ﬁrst-order

system before solving it numerically with the time-stepping schemes developed

here.

Before solving the equation, the initial conditions and time span must be
defined. The initial conditions in this case specify y_1(t_0), y_2(t_0) and y_3(t_0), or
equivalently u(t_0), u_t(t_0) and u_tt(t_0). The time span specifies the beginning and
end times for solving the equations. Thus t ∈ [t_0, t_f], where t_0 is the initial time
and t_f the final time. For this example, let's consider solving the equations
with initial conditions y_1(0) = 1, y_2(0) = 0 and y_3(0) = 0 over the interval
t ∈ [t_0, t_f] = [0, 30] with A = 2. In MATLAB this would be accomplished with

y0=[1 0 0]; % initial conditions

tspan=[0 30]; % time interval

A=2; % parameter value

These can now be used in calling the diﬀerential equation solver.

The basic command structure and function call to solve this equation is given

as follows:

[t,y]=ode45(’rhs’,tspan,y0,[],A)

This piece of code calls a function rhs.m which contains information about the

diﬀerential equations. It sends to this function the time span for the solution,

the initial conditions, and the parameter value of A. The brackets are placed in a

position for specifying accuracy and tolerance options. By placing the brackets

with nothing inside, the default accuracy will be used by ode45. To access
other methods, simply replace ode45 with ode23, ode113, or ode15s as needed.

The function rhs.m contains all the information about the diﬀerential equa-

tion. Speciﬁcally, the right hand side of (5.3.9) is of primary importance. This

right hand side function looks as follows:

function rhs=rhs(t,y,dummy,A)  % function call setup
rhs=[y(2);
     y(3);
     -(y(1)^2)*y(2)-cos(t)*y(1)+A*sin(t)^2];  % rhs vector of (5.3.9)

This function imports the variable t which takes on values between those given

by tspan. Thus t starts at t = 0 and ends at t = 30 in this case. In between

these points, t gives the current value of time. The variable y stores the solu-

tion. Initially, it takes on the value of y0. The dummy slot is for sending in

information about the tolerances and the last slot with A allows you to send in

any parameters necessary in the right hand side.

Note that the output of ode45 is the pair of vectors t and y. The vector t gives all
the points between tspan for which the solution was calculated. The solution
at these time points is given as the columns of the matrix y. The first column
corresponds to the solution y_1, the second column contains y_2 as a function of
time, and the third column gives y_3. To plot the solution u = y_1 as a function
of time, we would plot the vector t versus the first column of y:

figure(1), plot(t,y(:,1)) % plot solution u

figure(2), plot(t,y(:,2)) % plot derivative du/dt

A couple of important points and options: the inline command could also
be used in place of the function call. Also, it may be desirable to have the solution
evaluated at specified time points. For instance, if we wanted the solution
evaluated from t = 0 to t = 30 in steps of 0.5, then we would choose

tspan=0:0.5:30 % evaluate the solution every dt=0.5

Note that MATLAB still determines the step sizes necessary to maintain a given

tolerance. It simply interpolates to generate the values at the desired locations.

5.4 Implementation of the shooting method

The implementation of the shooting scheme relies on the eﬀective use of a time-

stepping algorithm along with a root ﬁnding method for choosing the appro-

priate initial conditions which solve the boundary value problem. Boundary

value problems often arise as eigenvalue systems for which the eigenvalue and

eigenfunction must both be determined. As an example of such a problem, we

consider the Schrödinger equation from quantum mechanics, which is a second-
order differential equation

d^2ψ_n/dx^2 + [n(x) − β_n] ψ_n = 0   (5.4.1)

with the boundary conditions ψ(±1) = 0.

Rewriting the diﬀerential equation into a coupled ﬁrst order system gives

x′ = [ 0  1 ; β_n − n(x)  0 ] x   (5.4.2)

where x = (x_1  x_2)^T = (ψ_n  dψ_n/dx)^T. The boundary conditions are

x = 1:   ψ_n(1) = x_1(1) = 0   (5.4.3a)
x = −1:  ψ_n(−1) = x_1(−1) = 0 .   (5.4.3b)

At this stage, we will also assume that n(x) = n_0 for simplicity.

With the problem thus deﬁned, we turn our attention to the key aspects in

the computational implementation of the boundary value problem solver. These

are


• FOR loops

• IF statements

• time-stepping algorithms: ode23, ode45, ode113, ode15s

• step-size control

• code development and ﬂow

Every code will be controlled by a set of FOR loops and IF statements. It is

imperative to have proper placement of these control statements in order for

the code to operate successfully.

Flow Control

In order to begin coding, it is always prudent to construct the basic structure

of the algorithm. In particular, it is good to determine the number of FOR

loops and IF statements which may be required for the computations. What

is especially important is determining the hierarchy structure for the loops. To

solve the boundary value problem proposed here, we require two FOR loops

and one IF statement block. The outermost FOR loop of the code should

determine the number of eigenvalues and eigenmodes to be searched for. Within

this FOR loop exists a second FOR loop which iterates the shooting method

so that the solution converges to the correct boundary value solution. This

second FOR loop has a logical IF statement which needs to check whether

the solution has indeed converged to the boundary value solution, or whether

adjustment of the value of β_n is necessary and the iteration procedure needs

to be continued. Figure 7 illustrates the backbone of the numerical code for

solving the boundary value problem. It includes the two FOR loops and logical

IF statement block as the core of its algorithmic structure. For a nonlinear

problem, a third FOR loop would be required for A in order to achieve the

normalization of the eigenfunctions to unity.

The various pieces of the code are constructed here using the MATLAB pro-

gramming language. We begin with the initialization of the parameters.

Initialization

clear all; % clear all previously defined variables

close all; % clear all previously defined figures

tol=10^(-4); % define a tolerance level to be achieved

% by the shooting algorithm

col=[’r’,’b’,’g’,’c’,’m’,’k’]; % eigenfunction colors

n0=100; % define the parameter n0



Figure 7: Basic algorithm structure for solving the boundary value problem.
Two FOR loops are required to step through the values of β_n and A along with
a single IF statement block to check for convergence of the solution.

A=1; % define the initial slope at x=-1

x0=[0 A]; % initial conditions: x1(-1)=0, x1’(-1)=A

xp=[-1 1]; % define the span of the computational domain

Upon completion of the initialization process for the parameters which are not
involved in the main loop of the code, we move into the main FOR loop which
searches out a specified number of eigenmodes. Embedded in this FOR loop
is a second FOR loop which attempts different values of β_n until the correct
eigenvalue is found. An IF statement is used to check the convergence of values
of β_n to the appropriate value.

Main Program

beta_start=n0; % beginning value of beta

for modes=1:5 % begin mode loop


beta=beta_start; % initial value of eigenvalue beta

dbeta=n0/100; % default step size in beta

for j=1:1000 % begin convergence loop for beta

[t,y]=ode45(’shoot2’,xp,x0,[],n0,beta); % solve ODEs

if abs(y(end,1)-0) < tol % check for convergence

beta % write out eigenvalue

break % get out of convergence loop

end

if (-1)^(modes+1)*y(end,1)>0 % this IF statement block

beta=beta-dbeta; % checks to see if beta

else % needs to be higher or lower

beta=beta+dbeta/2; % and uses bisection to

dbeta=dbeta/2; % converge to the solution

end %

end % end convergence loop

beta_start=beta-0.1; % after finding eigenvalue, pick

% new starting value for next mode

norm=trapz(t,y(:,1).*y(:,1)) % calculate the normalization

plot(t,y(:,1)/sqrt(norm),col(modes)); hold on % plot modes

end % end mode loop

The code uses ode45, which is a fourth-order Runge-Kutta method, to solve

the diﬀerential equation and advance the solution. The function shoot2.m is

called in this routine. For the diﬀerential equation considered here, the function

shoot2.m would be the following:

shoot2.m

function rhs=shoot2(xspan,x,dummy,n0,beta)

rhs=[ x(2)

(beta-n0)*x(1) ];

This code will ﬁnd the ﬁrst ﬁve eigenvalues and plot their corresponding

normalized eigenfunctions. The bisection method implemented to adjust the

values of β_n to find the boundary value solution is based upon observations of

the structure of the even and odd eigenmodes. In general, it is always a good

idea to ﬁrst explore the behavior of the solutions of the boundary value problem

before writing the shooting routine. This will give important insights into the

behavior of the solutions and will allow for a proper construction of an accurate

and eﬃcient bisection method. Figure 8 illustrates several characteristic features

of this boundary value problem. In Fig. 8(a) and 8(b), the behavior of the
solution near the first even and first odd solution is exhibited. From Fig. 8(a)
it is seen that for the even modes increasing values of β bring the solution
from ψ_n(1) > 0 to ψ_n(1) < 0. In contrast, odd modes go from ψ_n(1) < 0 to
ψ_n(1) > 0 as β is increased. This observation forms the basis for the bisection

AMATH 301 ( c (J. N. Kutz) 72

[Figure 8 plots ψ(x) on x ∈ [−1, 1]: panel (a) near the first even mode (β = 95 and β = 98), panel (b) near the first odd mode (β = 85 and β = 95), and panel (c) the first four normalized eigenmodes with β₁ = 97.53, β₂ = 90.13, β₃ = 77.80, β₄ = 60.53.]

Figure 8: In (a) and (b) the behavior of the solution near the first even and
first odd solution is depicted. Note that for the even modes increasing values of
β bring the solution from ψ_n(1) > 0 to ψ_n(1) < 0. In contrast, odd modes go
from ψ_n(1) < 0 to ψ_n(1) > 0 as β is increased. In (c) the first four normalized
eigenmodes along with their corresponding eigenvalues are illustrated for n₀ = 100.

method developed in the code. Figure 8(c) illustrates the ﬁrst four normalized

eigenmodes along with their corresponding eigenvalues.
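For readers working outside MATLAB, the same shoot-and-bisect logic can be sketched in Python with SciPy's solve_ivp. This is a hypothetical translation of the routine above (n₀ = 100, first even mode only), not the notes' code; the bracket [95, 98] comes from the sign change of ψ(1) seen in Fig. 8(a).

```python
import numpy as np
from scipy.integrate import solve_ivp

n0 = 100.0

def rhs(t, x, beta):
    # same right-hand side as shoot2.m: x1' = x2, x2' = (beta - n0)*x1
    return [x[1], (beta - n0) * x[0]]

def psi_right(beta):
    # shoot from psi(-1) = 0 with arbitrary slope psi'(-1) = 1
    sol = solve_ivp(rhs, [-1.0, 1.0], [0.0, 1.0], args=(beta,),
                    rtol=1e-10, atol=1e-10)
    return sol.y[0, -1]

# psi(1) changes sign as beta crosses the first eigenvalue (cf. Fig. 8(a)),
# so the bracket [95, 98] around beta_1 can be refined by bisection.
lo, hi = 95.0, 98.0
f_lo = psi_right(lo)
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if psi_right(mid) * f_lo > 0:
        lo = mid          # mid is on the same side of the root as lo
    else:
        hi = mid
beta1 = 0.5 * (lo + hi)
# for constant n(x) = n0 the exact eigenvalue is beta_1 = n0 - (pi/2)^2
```

The sign-agnostic bracket update makes the bisection independent of the slope convention used when launching the shot.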

5.5 Boundary value problems: direct solve and relaxation

The shooting method is not the only method for solving boundary value prob-

lems. The direct method of solution relies on Taylor expanding the diﬀerential

equation itself. For linear problems, this results in a matrix problem of the

form Ax = b. For nonlinear problems, a nonlinear system of equations must be

solved using a relaxation scheme, i.e. a Newton or secant method. The prototypical example of such a problem is the second-order boundary value problem

\[ \frac{d^2 y}{dt^2} = f\!\left(t, y, \frac{dy}{dt}\right) \qquad (5.5.1) \]

on t ∈ [a, b] with the general boundary conditions

\[ \alpha_1 y(a) + \beta_1 \frac{dy(a)}{dt} = \gamma_1 \qquad (5.5.2a) \]
\[ \alpha_2 y(b) + \beta_2 \frac{dy(b)}{dt} = \gamma_2 \, . \qquad (5.5.2b) \]

Thus the solution is deﬁned over a speciﬁc interval and must satisfy the relations

(5.5.2) at the end points of the interval.

Before considering the general case, we simplify the method by considering

the linear boundary value problem

\[ \frac{d^2 y}{dt^2} = p(t)\frac{dy}{dt} + q(t)\,y + r(t) \qquad (5.5.3) \]

on t ∈ [a, b] with the simpliﬁed boundary conditions

\[ y(a) = \alpha \qquad (5.5.4a) \]
\[ y(b) = \beta \, . \qquad (5.5.4b) \]

Taylor expanding the diﬀerential equation and boundary conditions will gen-

erate the linear system of equations which solve the boundary value problem.

Tables 4-6 (Sec. 4.1) summarize the second-order and fourth-order central diﬀer-

ence schemes along with the forward- and backward-diﬀerence formulas which

are accurate to second-order.

To solve the simplified linear boundary value problem above, accurate to second
order, we use Table 4 for the second and first derivatives. The

boundary value problem then becomes

\[ \frac{y(t+\Delta t) - 2y(t) + y(t-\Delta t)}{\Delta t^2} = p(t)\,\frac{y(t+\Delta t) - y(t-\Delta t)}{2\Delta t} + q(t)\,y(t) + r(t) \qquad (5.5.5) \]

with the boundary conditions y(a) = α and y(b) = β. We can rearrange this

expression to read

\[ \left[1 - \frac{\Delta t}{2}p(t)\right] y(t+\Delta t) - \left[2 + \Delta t^2 q(t)\right] y(t) + \left[1 + \frac{\Delta t}{2}p(t)\right] y(t-\Delta t) = \Delta t^2\, r(t) \, . \qquad (5.5.6) \]

We discretize the computational domain and denote t₀ = a to be the left boundary
point and t_N = b to be the right boundary point. This gives the boundary
conditions
\[ y(t_0) = y(a) = \alpha \qquad (5.5.7a) \]
\[ y(t_N) = y(b) = \beta \, . \qquad (5.5.7b) \]


The remaining N −1 points can be recast as a matrix problem Ax = b where

\[
A = \begin{bmatrix}
2+\Delta t^2 q(t_1) & -1+\frac{\Delta t}{2}p(t_1) & 0 & \cdots & 0 \\
-1-\frac{\Delta t}{2}p(t_2) & 2+\Delta t^2 q(t_2) & -1+\frac{\Delta t}{2}p(t_2) & 0 & \cdots \\
0 & \ddots & \ddots & \ddots & \vdots \\
\vdots & & \ddots & \ddots & -1+\frac{\Delta t}{2}p(t_{N-2}) \\
0 & \cdots & 0 & -1-\frac{\Delta t}{2}p(t_{N-1}) & 2+\Delta t^2 q(t_{N-1})
\end{bmatrix} \qquad (5.5.8)
\]
and
\[
x = \begin{bmatrix} y(t_1) \\ y(t_2) \\ \vdots \\ y(t_{N-2}) \\ y(t_{N-1}) \end{bmatrix}
\qquad
b = \begin{bmatrix}
-\Delta t^2 r(t_1) + \left(1 + \frac{\Delta t}{2}p(t_1)\right) y(t_0) \\
-\Delta t^2 r(t_2) \\
\vdots \\
-\Delta t^2 r(t_{N-2}) \\
-\Delta t^2 r(t_{N-1}) + \left(1 - \frac{\Delta t}{2}p(t_{N-1})\right) y(t_N)
\end{bmatrix} . \qquad (5.5.9)
\]

Thus the solution can be found by a direct solve of the linear system of equations.
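As a concrete check of (5.5.6)–(5.5.9), the sketch below (in Python/NumPy rather than the notes' MATLAB) solves the test problem y'' = 2 on [0, 1] with y(0) = 0, y(1) = 1, i.e. p = q = 0 and r = 2; this test problem is an illustrative choice, and since its exact solution y = t² is a quadratic, the second-order scheme reproduces it to round-off.

```python
import numpy as np

N = 50
t = np.linspace(0.0, 1.0, N + 1)
dt = t[1] - t[0]
alpha, beta = 0.0, 1.0              # boundary values y(a) and y(b)
r = 2.0 * np.ones(N - 1)            # r(t_j) at the N-1 interior points

# tridiagonal A of (5.5.8) with p = q = 0: 2 on the diagonal, -1 off it
A = 2.0 * np.eye(N - 1) - np.eye(N - 1, k=1) - np.eye(N - 1, k=-1)

# right-hand side of (5.5.9): boundary values enter the first and last rows
b = -dt**2 * r
b[0] += alpha
b[-1] += beta

y = np.concatenate(([alpha], np.linalg.solve(A, b), [beta]))
err = np.max(np.abs(y - t**2))      # exact solution is y = t^2
```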

Nonlinear Systems

A similar solution procedure can be carried out for nonlinear systems. However,

diﬃculties arise from solving the resulting set of nonlinear algebraic equations.

We can once again consider the general diﬀerential equation and expand with

second-order accurate schemes:

\[ y'' = f(t, y, y') \;\rightarrow\; \frac{y(t+\Delta t) - 2y(t) + y(t-\Delta t)}{\Delta t^2} = f\!\left(t, y(t), \frac{y(t+\Delta t) - y(t-\Delta t)}{2\Delta t}\right) . \qquad (5.5.10) \]

We discretize the computational domain and denote t₀ = a to be the left boundary
point and t_N = b to be the right boundary point. Considering again the
simplified boundary conditions y(t₀) = y(a) = α and y(t_N) = y(b) = β gives
the following nonlinear system for the remaining N − 1 points:

\[
\begin{aligned}
2y_1 - y_2 - \alpha + \Delta t^2\, f\big(t_1, y_1, (y_2-\alpha)/2\Delta t\big) &= 0 \\
-y_1 + 2y_2 - y_3 + \Delta t^2\, f\big(t_2, y_2, (y_3-y_1)/2\Delta t\big) &= 0 \\
&\;\,\vdots \\
-y_{N-3} + 2y_{N-2} - y_{N-1} + \Delta t^2\, f\big(t_{N-2}, y_{N-2}, (y_{N-1}-y_{N-3})/2\Delta t\big) &= 0 \\
-y_{N-2} + 2y_{N-1} - \beta + \Delta t^2\, f\big(t_{N-1}, y_{N-1}, (\beta-y_{N-2})/2\Delta t\big) &= 0 \, .
\end{aligned}
\]

This (N − 1) × (N − 1) nonlinear system of equations can be very difficult to
solve and imposes a severe constraint on the usefulness of the scheme. However,


there may be no other way of solving the problem, and a solution to this system
of equations must be computed. Further complicating the issue is the fact that
for nonlinear systems such as these, there are no guarantees about the existence
or uniqueness of solutions. The best approach is to use a relaxation scheme
which is based upon Newton or secant method iterations.

Solving Nonlinear Systems: Newton-Raphson Iteration

The only built-in MATLAB command which solves nonlinear systems of equations
is FSOLVE. However, this command is packaged within the optimization
toolbox. Most users of MATLAB do not have access to this toolbox
and alternatives must be sought. We therefore develop the basic ideas of
Newton-Raphson iteration, commonly known as Newton's method.

We begin by considering a single nonlinear equation
\[ f(x_r) = 0 \qquad (5.5.12) \]
where x_r is the root of the equation and the value being sought. We would like to
develop a scheme which systematically determines the value of x_r. The Newton-Raphson
method is an iterative scheme which relies on an initial guess, x₀, for
the value of the root. From this guess, subsequent guesses are determined until
the scheme either converges to the root x_r or the scheme diverges and another
initial guess is used. The sequence of guesses (x₀, x₁, x₂, ...) is generated from
the slope of the function f(x). The graphical procedure is illustrated in Fig. 9.

In essence, everything relies on the slope formula as illustrated in Fig. 9(a):
\[ \text{slope} = \frac{df(x_n)}{dx} = \frac{\text{rise}}{\text{run}} = \frac{0 - f(x_n)}{x_{n+1} - x_n} \, . \qquad (5.5.13) \]

Rearranging this gives the Newton-Raphson iterative relation
\[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \, . \qquad (5.5.14) \]

A graphical example of how the iteration procedure works is given in Fig. 9(b)
where a sequence of iterations is demonstrated. Note that the scheme fails if
f'(x_n) = 0 since then the slope line never intersects y = 0. Further, for certain
guesses the iterations may diverge. Provided the initial guess is sufficiently close,
the scheme usually will converge. Conditions for convergence can be found in
Burden and Faires [5].
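A minimal Python sketch of iteration (5.5.14), applied to the illustrative equation f(x) = x² − 2 (an example chosen here, not from the notes), converges rapidly to √2 from the initial guess x₀ = 1:

```python
def newton(f, fprime, x0, tol=1e-12, maxit=50):
    """Newton-Raphson iteration x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(maxit):
        fx = f(x)
        if abs(fx) < tol:          # converged
            break
        x = x - fx / fprime(x)     # the Newton-Raphson update (5.5.14)
    return x

root = newton(lambda x: x**2 - 2.0, lambda x: 2.0 * x, x0=1.0)
```

Because the iteration is quadratically convergent near a simple root, only a handful of iterations are needed here.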

The Newton method can be generalized for systems of nonlinear equations.
The details will not be discussed here, but the Newton iteration scheme is similar
to that developed for the single function case. Given a system
\[
F(\mathbf{x}_n) = \begin{bmatrix} f_1(x_1, x_2, x_3, \ldots, x_N) \\ f_2(x_1, x_2, x_3, \ldots, x_N) \\ \vdots \\ f_N(x_1, x_2, x_3, \ldots, x_N) \end{bmatrix} = 0 \, , \qquad (5.5.15)
\]



Figure 9: Construction and implementation of the Newton-Raphson iteration

formula. In (a), the slope is the determining factor in deriving the Newton-

Raphson formula. In (b), a graphical representation of the iteration scheme is

given.

the iteration scheme is
\[ \mathbf{x}_{n+1} = \mathbf{x}_n + \Delta\mathbf{x}_n \qquad (5.5.16) \]
where
\[ J(\mathbf{x}_n)\,\Delta\mathbf{x}_n = -F(\mathbf{x}_n) \qquad (5.5.17) \]
and J(\mathbf{x}_n) is the Jacobian matrix
\[
J(\mathbf{x}_n) = \begin{bmatrix}
f_{1x_1} & f_{1x_2} & \cdots & f_{1x_N} \\
f_{2x_1} & f_{2x_2} & \cdots & f_{2x_N} \\
\vdots & \vdots & \ddots & \vdots \\
f_{Nx_1} & f_{Nx_2} & \cdots & f_{Nx_N}
\end{bmatrix} . \qquad (5.5.18)
\]

This algorithm relies on initially guessing values for x₁, x₂, ..., x_N. As before,
there is no guarantee that the algorithm will converge. Thus a good initial guess
is critical to its success. Further, the determinant of the Jacobian cannot equal
zero, det J(\mathbf{x}_n) ≠ 0, in order for the algorithm to work.
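The system iteration (5.5.16)–(5.5.17) can be sketched in a few lines of Python/NumPy; the 2×2 system below (x² + y² = 4, xy = 1) is an illustrative example with a hand-coded Jacobian, not a problem from the notes.

```python
import numpy as np

def F(v):
    x, y = v
    return np.array([x**2 + y**2 - 4.0, x * y - 1.0])

def J(v):
    # Jacobian of F, as in (5.5.18)
    x, y = v
    return np.array([[2.0 * x, 2.0 * y],
                     [y, x]])

v = np.array([2.0, 0.5])                  # initial guess
for _ in range(30):
    dv = np.linalg.solve(J(v), -F(v))     # solve J(x_n) dx_n = -F(x_n)
    v = v + dv                            # x_{n+1} = x_n + dx_n
    if np.linalg.norm(F(v)) < 1e-12:      # converged
        break
```

Note that each iteration requires a linear solve with the Jacobian, which is why a nearly singular Jacobian (det J ≈ 0) causes the method to fail.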

5.6 Implementing the direct solve method

To see how the ﬁnite diﬀerence discretization technique is applied, we consider

the same example problem considered with the shooting method. Speciﬁcally,

the boundary value problem gives rise to an eigenvalue system for which the

eigenvalue and eigenfunction must both be determined. The example considered

is the Schrödinger equation from quantum mechanics, which is the second-order
differential equation

\[ \frac{d^2\psi}{dx^2} + \left[n(x) - \beta\right]\psi = 0 \qquad (5.6.1) \]

with the boundary conditions ψ(±1) = 0. As before, we will consider n(x) =
n₀ = 100. Unlike the shooting method, there is no need to rewrite the equation
as a coupled system of first order equations.

To discretize, we ﬁrst divide the spatial domain x ∈ [−1, 1] into a speciﬁed

number of equally spaced points. At each point, let
\[ \psi(x_n) = \psi_n \, . \qquad (5.6.2) \]

Upon discretizing, the governing equation becomes
\[ \frac{\psi_{n+1} - 2\psi_n + \psi_{n-1}}{\Delta x^2} + (n_0 - \beta)\,\psi_n = 0 \qquad n = 1, 2, 3, \ldots, N-1 \, . \qquad (5.6.3) \]

Multiplying through by Δx² and collecting terms gives
\[ \psi_{n+1} + \left(-2 + \Delta x^2 n_0\right)\psi_n + \psi_{n-1} = \Delta x^2 \beta\, \psi_n \qquad n = 1, 2, 3, \ldots, N-1 \, . \qquad (5.6.4) \]

Note that the term on the right hand side is placed there because the values of
β are unknown. Thus part of the problem is to determine what these values are
as well as all the values of ψ_n.

Defining λ = Δx²β gives the eigenvalue problem
\[ A\boldsymbol{\psi} = \lambda\boldsymbol{\psi} \qquad (5.6.5) \]
where

\[
A = \begin{bmatrix}
-2+\Delta x^2 n_0 & 1 & 0 & \cdots & 0 \\
1 & -2+\Delta x^2 n_0 & 1 & 0 & \cdots \\
0 & \ddots & \ddots & \ddots & \vdots \\
\vdots & & 1 & -2+\Delta x^2 n_0 & 1 \\
0 & \cdots & 0 & 1 & -2+\Delta x^2 n_0
\end{bmatrix} \qquad (5.6.6)
\]

and
\[ \boldsymbol{\psi} = \begin{bmatrix} \psi(x_1) \\ \psi(x_2) \\ \vdots \\ \psi(x_{N-2}) \\ \psi(x_{N-1}) \end{bmatrix} . \qquad (5.6.7) \]

Thus the solution can be found by an eigenvalue solve of the linear system of
equations. Note that we already know the boundary conditions ψ(−1) = ψ₀ = 0
and ψ(1) = ψ_N = 0. Thus we are left with an (N−1) × (N−1) eigenvalue problem
for which we can use the eig command.


Numerical Implementation

To implement this solution, we begin with the preliminaries of the problem

clear all; close all;

n=200; % number of points

x=linspace(-1,1,n); % linear space of points

dx=x(2)-x(1); % delta x values

With these deﬁned, we turn to building the matrix A. It is a sparse matrix

which only has three diagonals of consequence.

n0=100; % value of n_0

m=n-2; % size of A matrix (mXm)

% DIAGONALS

for j=1:m

A(j,j)=-2+dx^2*n0;

end

% OFF DIAGONALS

for j=1:m-1

A(j,j+1)=1;

A(j+1,j)=1;

end

Note that the size of the matrix is m = n−2 since we already know the boundary
conditions ψ(−1) = ψ₀ = 0 and ψ(1) = ψ_N = 0.

To solve this problem, we simply use the eig command

[V,D]=eig(A) % get the eigenvectors (V), eigenvalues (D)

The matrix V is an m × m matrix which contains as columns the eigenvectors
of the matrix A. The matrix D is an m × m matrix whose diagonal elements
are the eigenvalues corresponding to the eigenvectors in V. For the symmetric
matrix A considered here, eig returns the eigenvalues ordered from smallest to
biggest as you move across the columns of V. In this particular case, we are actually
looking for the largest five values of β, thus we need to look at the last five
columns of V for our eigenvectors and the last five diagonal entries of D for the eigenvalues.

col=['r','b','g','c','m','k']; % eigenfunction colors
for j=1:5
ef(1)=0; % left boundary condition (ef, not eig,
ef(2:m+1)=V(:,end+1-j); % to avoid shadowing the built-in)
ef(n)=0; % right boundary condition
norm=trapz(x,ef.^2); % normalization
plot(x,ef/sqrt(norm),col(j)); hold on % plot
beta=D(end+1-j,end+1-j)/dx/dx % write out eigenvalues
end

This plots out the normalized eigenvector solutions and their eigenvalues.
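The same direct solve is easily sketched in Python/NumPy (a translation for comparison, not the notes' code); since A is symmetric, eigh returns the eigenvalues in ascending order, and the largest β can be checked against the analytic value β₁ = n₀ − (π/2)² for constant n(x):

```python
import numpy as np

n = 200                                  # number of grid points
x = np.linspace(-1.0, 1.0, n)
dx = x[1] - x[0]
n0 = 100.0
m = n - 2                                # interior points only

# tridiagonal A of (5.6.6)
A = np.diag((-2.0 + dx**2 * n0) * np.ones(m)) \
    + np.diag(np.ones(m - 1), k=1) + np.diag(np.ones(m - 1), k=-1)

lam, V = np.linalg.eigh(A)               # symmetric: eigenvalues ascending
beta = lam / dx**2                       # undo the scaling lambda = dx^2*beta
beta1 = beta[-1]                         # largest eigenvalue, ~97.53 here
```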


6 Fourier Transforms

Fourier transforms are one of the most powerful and eﬃcient techniques for

solving a wide variety of problems. The key idea of the Fourier transform

is to represent functions and their derivatives as sums of cosines and sines.

This operation can be done with the Fast-Fourier transform (FFT) which is

an O(N log N) operation. Thus the FFT is faster than most linear solvers of

O(N²). The basic properties and implementation of the FFT will be considered

here.

6.1 Basics of the Fourier Transform

Other techniques exist for solving many computational problems which are not

based upon the standard Taylor series discretization. An alternative to standard

discretization is to use the Fast-Fourier Transform (FFT). The FFT is an inte-

gral transform deﬁned over the entire line x ∈ [−∞, ∞]. Given computational

practicalities, however, we transform over a ﬁnite domain x ∈ [−L, L] and as-

sume periodic boundary conditions due to the oscillatory behavior of the kernel

of the Fourier transform. The Fourier transform and its inverse are deﬁned as

\[ F(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ikx} f(x)\,dx \qquad (6.1.1a) \]
\[ f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx} F(k)\,dk \, . \qquad (6.1.1b) \]

There are other equivalent deﬁnitions. However, this deﬁnition will serve to

illustrate the power and functionality of the Fourier transform method. We

again note that formally, the transform is over the entire real line x ∈ [−∞, ∞]

whereas our computational domain is only over a ﬁnite domain x ∈ [−L, L].

Further, the Kernel of the transform, exp(±ikx), describes oscillatory behavior.

Thus the Fourier transform is essentially an eigenfunction expansion over all

continuous wavenumbers k. And once we are on a ﬁnite domain x ∈ [−L, L],

the continuous eigenfunction expansion becomes a discrete sum of eigenfunctions

and associated wavenumbers (eigenvalues).

Derivative Relations

The critical property in the usage of Fourier transforms concerns derivative
relations. To see how these properties are generated, we begin by considering
the Fourier transform of f'(x). We denote the Fourier transform of f(x) as
\hat{f}(k). Thus we find
\[ \widehat{f'}(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ikx} f'(x)\,dx = \left. \frac{f(x)\,e^{ikx}}{\sqrt{2\pi}} \right|_{-\infty}^{\infty} - \frac{ik}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ikx} f(x)\,dx \, . \qquad (6.1.2) \]


Assuming that f(x) → 0 as x → ±∞ results in
\[ \widehat{f'}(k) = -\frac{ik}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ikx} f(x)\,dx = -ik\,\hat{f}(k) \, . \qquad (6.1.3) \]

Thus the basic relation \widehat{f'} = -ik\hat{f} is established. It is easy to generalize this
argument to an arbitrary number of derivatives. The final result is the following
relation between Fourier transforms of the derivative and the Fourier transform
itself:
\[ \widehat{f^{(n)}} = (-ik)^n\, \hat{f} \, . \qquad (6.1.4) \]

This property is what makes Fourier transforms so useful and practical.

As an example of the Fourier transform, consider the following differential
equation:
\[ y'' - \omega^2 y = -f(x) \qquad x \in [-\infty, \infty] \, . \qquad (6.1.5) \]

We can solve this by applying the Fourier transform to both sides. This gives
the following reduction:
\[ \widehat{y''} - \omega^2 \hat{y} = -\hat{f} \;\Rightarrow\; -k^2\hat{y} - \omega^2\hat{y} = -\hat{f} \;\Rightarrow\; (k^2+\omega^2)\,\hat{y} = \hat{f} \;\Rightarrow\; \hat{y} = \frac{\hat{f}}{k^2+\omega^2} \, . \qquad (6.1.6) \]

To find the solution y(x), we invert the last expression above to yield
\[ y(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx} \frac{\hat{f}}{k^2+\omega^2}\,dk \, . \qquad (6.1.7) \]

This gives the solution in terms of an integral which can be evaluated analytically
or numerically.
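On a finite periodic grid the same computation can be carried out numerically. The sketch below (Python/NumPy; the domain size and ω are illustrative choices) solves y'' − ω²y = −f for a Gaussian f that decays well before the boundaries, and verifies the residual spectrally:

```python
import numpy as np

L, n = 40.0, 256
x = np.linspace(-L / 2, L / 2, n + 1)[:-1]           # periodic grid
k = (2 * np.pi / L) * np.fft.fftfreq(n, d=1.0 / n)   # wavenumbers
w = 2.0
f = np.exp(-x**2)                                    # forcing, decays fast

fhat = np.fft.fft(f)
yhat = fhat / (k**2 + w**2)        # from (6.1.6): (k^2 + w^2) yhat = fhat
y = np.real(np.fft.ifft(yhat))

# residual of y'' - w^2 y + f, with y'' computed spectrally as -k^2 yhat
ypp = np.real(np.fft.ifft(-k**2 * np.fft.fft(y)))
residual = np.max(np.abs(ypp - w**2 * y + f))
```

The residual vanishes to round-off because division by k² + ω² inverts the spectral operator exactly on the grid.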

The Fast Fourier Transform

The Fast Fourier transform routine was developed speciﬁcally to perform the

forward and backward Fourier transforms. In the mid 1960s, Cooley and Tukey

developed what is now commonly known as the FFT algorithm [7]. Their al-

gorithm was named one of the top ten algorithms of the 20th century for one

reason: the operation count for solving a system dropped to O(N log N). For

N large, this operation count grows almost linearly like N. Thus it represents a

great leap forward from Gaussian elimination and LU decomposition. The key

features of the FFT routine are as follows:

1. It has a low operation count: O(N log N).


2. It finds the transform on an interval x ∈ [−L, L]. Since the integration
kernel exp(ikx) is oscillatory, it implies that the solutions on this finite
interval have periodic boundary conditions.

3. The key to lowering the operation count to O(N log N) is in discretizing
the range x ∈ [−L, L] into 2^n points, i.e. the number of points should be
2, 4, 8, 16, 32, 64, 128, 256, ....

4. The FFT has excellent accuracy properties, typically well beyond that of
standard discretization schemes.

We will consider the underlying FFT algorithm in detail at a later time. For

more information at the present, see [7] for a broader overview.

The practical implementation of the mathematical tools available in MAT-

LAB is crucial. This lecture will focus on the use of some of the more sophis-

ticated routines in MATLAB which are cornerstones to scientiﬁc computing.

Included in this section will be a discussion of the Fast Fourier Transform
routines (fft, ifft, fftshift, ifftshift, fft2, ifft2), sparse matrix construction (spdiags,
spy), and high-end iterative techniques for solving Ax = b (bicgstab, gmres).

These routines should be studied carefully since they are the building blocks of

any serious scientiﬁc computing code.

Fast Fourier Transform: FFT, IFFT, FFTSHIFT, IFFTSHIFT

The Fast Fourier Transform will be the first subject discussed. Its implementation
is straightforward. Given a function which has been discretized with 2^n
points and represented by a vector x, the FFT is found with the command

ﬀt(x). Aside from transforming the function, the algorithm associated with the

FFT does three major things: it shifts the data so that x ∈ [0, L] → [−L, 0]

and x ∈ [−L, 0] →[0, L], additionally it multiplies every other mode by −1, and

it assumes you are working on a 2π periodic domain. These properties are a

consequence of the FFT algorithm discussed in detail at a later time.

To see the practical implications of the FFT, we consider the transform of

a Gaussian function. The transform can be calculated analytically so that we

have the exact relations:

\[ f(x) = \exp(-\alpha x^2) \;\rightarrow\; \hat{f}(k) = \frac{1}{\sqrt{2\alpha}} \exp\!\left(-\frac{k^2}{4\alpha}\right) . \qquad (6.1.8) \]

A simple MATLAB code to verify this with α = 1 is as follows

clear all; close all; % clear all variables and figures

L=20; % define the computational domain [-L/2,L/2]

n=128; % define the number of Fourier modes 2^n


x2=linspace(-L/2,L/2,n+1); % define the domain discretization

x=x2(1:n); % consider only the first n points: periodicity

u=exp(-x.*x); % function to take a derivative of

ut=fft(u); % FFT the function

utshift=fftshift(ut); % shift FFT

figure(1), plot(x,u) % plot initial gaussian

figure(2), plot(abs(ut)) % plot unshifted transform

figure(3), plot(abs(utshift)) % plot shifted transform

The second ﬁgure generated by this script shows how the pulse is shifted. By

using the command ﬀtshift, we can shift the transformed function back to its

mathematically correct positions as shown in the third ﬁgure generated. How-

ever, before inverting the transformation, it is crucial that the transform is

shifted back to the form of the second ﬁgure. The command iﬀtshift does this.

In general, unless you need to plot the spectrum, it is better not to deal with

the ﬀtshift and iﬀtshift commands. A graphical representation of the ﬀt pro-

cedure and its shifting properties is illustrated in Fig. 10 where a Gaussian is

transformed and shifted by the ﬀt routine.

To take a derivative, we need to calculate the k values associated with the
transformation. The following example does this. Recall that the FFT assumes
a 2π periodic domain which gets shifted. Thus the calculation of the k values
needs to shift and rescale to the 2π domain. The following example differentiates
the function f(x) = sech(x) three times. The first derivative is compared with
the analytic value of f'(x) = −sech(x) tanh(x).

clear all; close all; % clear all variables and figures

L=20; % define the computational domain [-L/2,L/2]

n=128; % define the number of Fourier modes 2^n

x2=linspace(-L/2,L/2,n+1); % define the domain discretization

x=x2(1:n); % consider only the first n points: periodicity

u=sech(x); % function to take a derivative of

ut=fft(u); % FFT the function

k=(2*pi/L)*[0:(n/2-1) (-n/2):-1]; % k rescaled to 2pi domain

ut1=i*k.*ut; % first derivative

ut2=-k.*k.*ut; % second derivative

ut3=-i*k.*k.*k.*ut; % third derivative

u1=ifft(ut1); u2=ifft(ut2); u3=ifft(ut3); % inverse transform

u1exact=-sech(x).*tanh(x); % analytic first derivative


Figure 10: Fast Fourier Transform of Gaussian data illustrating the shifting

properties of the FFT routine. Note that the ﬀtshift command restores the

transform to its mathematically correct, unshifted state.

figure(1)

plot(x,u,’r’,x,u1,’g’,x,u1exact,’go’,x,u2,’b’,x,u3,’c’) % plot

The routine accounts for the periodic boundaries, the correct k values, and

diﬀerentiation. Note that no shifting was necessary since we constructed the k

values in the shifted space.
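For comparison, the same spectral differentiation can be sketched in Python/NumPy, where fftfreq produces the k values already in the shifted ordering (NumPy's FFT conventions make multiplication by ik the first derivative, as in the MATLAB code above); the error against the analytic derivative of sech(x) is spectrally small:

```python
import numpy as np

L, n = 20.0, 128
x = np.linspace(-L / 2, L / 2, n + 1)[:-1]           # first n points only
k = (2 * np.pi / L) * np.fft.fftfreq(n, d=1.0 / n)   # shifted k values

u = 1.0 / np.cosh(x)                                 # sech(x)
ut = np.fft.fft(u)
u1 = np.real(np.fft.ifft(1j * k * ut))               # multiply by ik

err = np.max(np.abs(u1 - (-np.tanh(x) / np.cosh(x))))  # vs analytic f'(x)
```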

For transforming in higher dimensions, a couple of choices in MATLAB are
possible. For 2D transformations, it is recommended to use the commands
fft2 and ifft2. These will transform a matrix A, which represents data in the
x and y directions respectively, along the rows and then the columns. For higher
dimensions, the fft command can be given as fft(x,[],dim), where dim is the
dimension along which to transform.


6.2 Spectral Analysis

Although our primary use of the FFT will be related to diﬀerentiation and

solving ordinary and partial diﬀerential equations, it should be understood that

the impact of the FFT is far greater than this. In particular, FFTs and other

related frequency transforms have revolutionized the ﬁeld of digital signal pro-

cessing and imaging. The key concept in any of these applications is to use the

FFT to analyze and manipulate data in the frequency domain.

This lecture will discuss a very basic concept and manipulation procedure to

be performed in the frequency domain: noise attenuation via frequency (band-

pass) ﬁltering. This ﬁltering process is fairly common in electronics and signal

detection. To begin we consider an ideal signal generated in the time domain.

clear all; close all;

T=20; % Total time slot to transform

n=128; % number of Fourier modes 2^7

t2=linspace(-T/2,T/2,n+1); t=t2(1:n); % time values

k=(2*pi/T)*[0:(n/2-1) -n/2:-1]; % frequency components of FFT

u=sech(t); % ideal signal in the time domain

figure(1), plot(t,u,’k’), hold on

This will generate an ideal hyperbolic secant shape in the time domain.

In most applications, the signals as generated above are not ideal. Rather,

they have a large amount of noise on top of them. Usually this noise is what is

called white noise, i.e. a noise which affects all frequencies the same. We can

add white noise to this signal by considering the pulse in the frequency domain.

noise=10;
ut=fft(u);
utn=ut+noise*(randn(1,n)+i*randn(1,n));

These three lines of code generate the Fourier transform of the function along
with the vector utn, which is the spectrum with a complex and Gaussian distributed
(mean zero, unit variance) noise source term added in.
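A Python/NumPy sketch of the same setup, with a simple Gaussian low-pass filter added to show the attenuation idea (the filter width is an illustrative choice, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)                       # fixed seed
T, n = 20.0, 128
t = np.linspace(-T / 2, T / 2, n + 1)[:-1]
k = (2 * np.pi / T) * np.fft.fftfreq(n, d=1.0 / n)

u = 1.0 / np.cosh(t)                                 # ideal signal
ut = np.fft.fft(u)
noise = 10.0
utn = ut + noise * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

filt = np.exp(-0.1 * k**2)                           # Gaussian low-pass
un_noisy = np.real(np.fft.ifft(utn))                 # noisy signal
un_filt = np.real(np.fft.ifft(filt * utn))           # filtered signal

err_noisy = np.linalg.norm(un_noisy - u)
err_filt = np.linalg.norm(un_filt - u)               # smaller than err_noisy
```

Since the sech pulse lives at low frequencies while white noise spreads over all of them, attenuating high |k| removes most of the noise at the cost of slightly smoothing the signal.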

6.3 Applications of the FFT

Spectral methods are one of the most powerful solution techniques for ordinary

and partial diﬀerential equations. The best known example of a spectral method

is the Fourier transform. We have already made use of the Fourier transform

using FFT routines. Other spectral techniques exist which render a variety of

problems easily tractable and often at signiﬁcant computational savings.


Discrete Cosine and Sine Transforms

When considering a computational domain, the solution can only be found on

a ﬁnite length domain. Thus the deﬁnition of the Fourier transform needs to be

modiﬁed in order to account for the ﬁnite sized computational domain. Instead

of expanding in terms of a continuous integral for values of wavenumber k and

cosines and sines (exp(ikx)), we expand in a Fourier series

\[ F(k) = \sum_{n=1}^{N} f(n) \exp\!\left(-i\,\frac{2\pi(k-1)}{N}(n-1)\right) \qquad 1 \le k \le N \qquad (6.3.1a) \]
\[ f(n) = \frac{1}{N} \sum_{k=1}^{N} F(k) \exp\!\left(i\,\frac{2\pi(k-1)}{N}(n-1)\right) \qquad 1 \le n \le N \, . \qquad (6.3.1b) \]
Thus the Fourier transform is nothing but an expansion in a basis of cosine and
sine functions. If we define the fundamental oscillatory piece as
\[ w_{nk} = \exp\!\left(\frac{2i\pi(k-1)(n-1)}{N}\right) = \cos\!\left(\frac{2\pi(k-1)(n-1)}{N}\right) + i\sin\!\left(\frac{2\pi(k-1)(n-1)}{N}\right) , \qquad (6.3.2) \]

then the Fourier transform results in the expression
\[ F_n = \sum_{k=0}^{N-1} w_{nk}\, f_k \qquad 0 \le n \le N-1 \, . \qquad (6.3.3) \]

Thus the calculation of the Fourier transform involves a double sum and an
O(N²) operation. Thus at first it would appear that the Fourier transform

method is the same operation count as LU decomposition. The basis functions

used for the Fourier transform, sine transform and cosine transform are depicted

in Fig. 11. The process of solving a diﬀerential or partial diﬀerential equation

involves evaluating the coeﬃcient of each of the modes. Note that this expan-

sion, unlike the ﬁnite diﬀerence method, is a global expansion in that every basis

function is evaluated on the entire domain.

The Fourier, sine, and cosine transforms behave very diﬀerently at the

boundaries. Speciﬁcally, the Fourier transform assumes periodic boundary con-

ditions whereas the sine and cosine transforms assume pinned and no-ﬂux bound-

aries respectively. The cosine and sine transform are often chosen for their

boundary properties. Thus for a given problem, an evaluation must be made

of the type of transform to be used based upon the boundary conditions need-

ing to be satisﬁed. Table 8 illustrates the three diﬀerent expansions and their

associated boundary conditions. The appropriate MATLAB command is also

given.
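SciPy exposes the discrete sine and cosine transforms directly; as a small check (assuming the notes' dst corresponds to SciPy's type-1 DST, which enforces pinned endpoints), a single sine mode excites exactly one coefficient:

```python
import numpy as np
from scipy.fft import dst

N = 64                             # interior points; f(0) = f(L) = 0 implicit
j = np.arange(1, N + 1)
m = 3                              # mode number of the input
f = np.sin(np.pi * m * j / (N + 1))

F = dst(f, type=1)                 # DST-I: pinned boundary conditions
peak = int(np.argmax(np.abs(F)))   # all the energy sits in coefficient m-1
```

This one-mode/one-coefficient correspondence is exactly why expanding in the transform basis diagonalizes constant-coefficient problems with matching boundary conditions.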



Figure 11: Basis functions used for a Fourier mode expansion (top), a sine

expansion (middle), and a cosine expansion (bottom).

Fast-Poisson Solvers

The FFT algorithm provides a fast and eﬃcient method for solving the Poisson

equation

\[ \nabla^2 \psi = \omega \qquad (6.3.4) \]

given the function ω. In two dimensions, this equation is equivalent to
\[ \frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} = \omega \, . \qquad (6.3.5) \]

By discretizing in both the x and y directions, this forces us to solve an
associated linear problem Ax = b. At best, we can use a factorization scheme
to solve this problem in O(N²) operations. Although iteration schemes have


command | expansion | boundary conditions
fft | F_k = \sum_{j=0}^{2N-1} f_j \exp(i\pi jk/N) | periodic: f(0) = f(L)
dst | F_k = \sum_{j=1}^{N-1} f_j \sin(\pi jk/N) | pinned: f(0) = f(L) = 0
dct | F_k = \sum_{j=0}^{N-2} f_j \cos(\pi jk/2N) | no-flux: f'(0) = f'(L) = 0

Table 8: MATLAB functions for Fourier, sine, and cosine transforms and their
associated boundary conditions. To invert the expansions, the MATLAB commands
are ifft, idst, and idct respectively.

the possibility of outperforming this, it is not guaranteed. Fourier transform

methods allow us to solve the problem in O(N log N).

Denoting the Fourier transform in x by a hat and the Fourier transform in y
by a tilde, we transform the equation. We begin by transforming in x:
\[ \widehat{\frac{\partial^2 \psi}{\partial x^2}} + \widehat{\frac{\partial^2 \psi}{\partial y^2}} = \hat{\omega} \;\rightarrow\; -k_x^2\, \hat{\psi} + \frac{\partial^2 \hat{\psi}}{\partial y^2} = \hat{\omega} \, , \qquad (6.3.6) \]
where k_x are the wavenumbers in the x direction. Transforming now in the y
direction gives
\[ -k_x^2\, \tilde{\hat{\psi}} + \widetilde{\frac{\partial^2 \hat{\psi}}{\partial y^2}} = \tilde{\hat{\omega}} \;\rightarrow\; -k_x^2\, \tilde{\hat{\psi}} - k_y^2\, \tilde{\hat{\psi}} = \tilde{\hat{\omega}} \, . \qquad (6.3.7) \]
This can be rearranged to obtain the final result
\[ \tilde{\hat{\psi}} = -\frac{\tilde{\hat{\omega}}}{k_x^2 + k_y^2} \, . \qquad (6.3.8) \]

The remaining step is to inverse transform in x and y to get back to the solution

ψ(x, y).

The algorithm to solve this is surprisingly simple. Before generating the

solution, the spatial domain, Fourier wavenumber vectors, and two-dimensional

function ω(x, y) are deﬁned.

Lx=20; Ly=20; nx=128; ny=128; % domains and Fourier modes
x2=linspace(-Lx/2,Lx/2,nx+1); x=x2(1:nx); % x-values
y2=linspace(-Ly/2,Ly/2,ny+1); y=y2(1:ny); % y-values

[X,Y]=meshgrid(x,y); % generate 2-D grid

omega=exp(-X.^2-Y.^2); % generate 2-D Gaussian

After deﬁning the preliminaries, the wavenumbers are generated and the equa-

tion is solved


kx=(2*pi/Lx)*[0:nx/2-1 -nx/2:-1]; % x-wavenumbers

ky=(2*pi/Ly)*[0:ny/2-1 -ny/2:-1]; % y-wavenumbers

kx(1)=10^(-6); ky(1)=10^(-6); % fix divide by zero

[KX,KY]=meshgrid(kx,ky); % generate 2-D wavenumbers

psi=real( ifft2( -fft2(omega)./(KX.^2+KY.^2) ) ); % solution

Note that the real part of the inverse 2-D FFT is taken since numerical round-oﬀ

generates imaginary parts to the solution ψ(x, y).
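The solver can be checked with a manufactured solution: pick a periodic ψ, compute ω = ∇²ψ analytically, and confirm the FFT solve recovers ψ up to the arbitrary additive constant. Below is a Python/NumPy sketch of that check (the notes' MATLAB code follows the same steps; the particular ψ is an illustrative choice).

```python
import numpy as np

Lx = Ly = 20.0
nx = ny = 128
x = np.linspace(-Lx / 2, Lx / 2, nx + 1)[:-1]
y = np.linspace(-Ly / 2, Ly / 2, ny + 1)[:-1]
X, Y = np.meshgrid(x, y)

kx = (2 * np.pi / Lx) * np.fft.fftfreq(nx, d=1.0 / nx)
ky = (2 * np.pi / Ly) * np.fft.fftfreq(ny, d=1.0 / ny)
kx[0] = ky[0] = 1e-6                 # avoid divide by zero at the zero mode
KX, KY = np.meshgrid(kx, ky)

# manufactured periodic solution and its analytic Laplacian
k1, k2 = 2 * np.pi / Lx, 4 * np.pi / Ly
psi_true = np.sin(k1 * X) * np.cos(k2 * Y)
omega = -(k1**2 + k2**2) * psi_true

psi = np.real(np.fft.ifft2(-np.fft.fft2(omega) / (KX**2 + KY**2)))
psi -= psi.mean()                    # fix the arbitrary constant
err = np.max(np.abs(psi - psi_true))
```

Subtracting the mean plays the same role as pinning ψ at a point: it removes the constant that the zero mode leaves undetermined.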

An observation concerning (6.3.8) is that there will be a divide by zero when
k_x = k_y = 0 at the zero mode. Two options are commonly used to overcome
this problem. The first is to modify (6.3.8) so that
\[ \tilde{\hat{\psi}} = -\frac{\tilde{\hat{\omega}}}{k_x^2 + k_y^2 + \mathrm{eps}} \qquad (6.3.9) \]

where eps is the command for generating a machine precision number which is
on the order of O(10⁻¹⁵). It essentially adds a round-off to the denominator
which removes the divide by zero problem. A second option, which is more
highly recommended, is to redefine the k_x and k_y vectors associated with the
wavenumbers in the x and y directions. Specifically, after defining the k_x and
k_y, we could simply add the command lines

kx(1)=10^(-6);

ky(1)=10^(-6);

The values kx(1) and ky(1) are zero by default. This would make the values small

but ﬁnite so that the divide by zero problem is eﬀectively removed with only a

small amount of error added to the problem.

There is a second mathematical difficulty which must be addressed. The
function ψ(x, y) with periodic boundary conditions does not have a unique
solution. Thus if ψ₀(x, y) is a solution, so is ψ₀(x, y) + c where c is an arbitrary

constant. When solving this problem with FFTs, the FFT will arbitrarily add

a constant to the solution. Fundamentally, we are only interested in deriva-

tives of the streamfunction. Therefore, this constant is inconsequential. When

solving with direct methods for Ax = b, the non-uniqueness gives a singular

matrix A. Thus solving with Gaussian elimination, LU decomposition or it-

erative methods is problematic. Often in applications the arbitrary constant

does not matter. Speciﬁcally, most applications only are concerned with the

derivative of the function ψ(x, y). Thus we can simply pin the value of ψ(x, y)

to some prescribed value on our computational domain. This will ﬁx the con-

stant c and give a unique solution to Ax = b. For instance, we could impose

the following constraint condition ψ(−L, −L, t) = 0. Such a condition pins the

value of ψ(x, y, t) at the left hand corner of the computational domain and ﬁxes

c.


7 Partial Diﬀerential Equations

With the Fourier transform and discretization in hand, we can turn towards the

solution of partial diﬀerential equations whose solutions need to be advanced

forward in time. Thus, in addition to spatial discretization, we will need to

discretize in time in a self-consistent way so as to advance the solution to a

desired future time. Issues of stability and accuracy are, of course, at the heart

of a discussion on time and space stepping schemes.

7.1 Basic time- and space-stepping schemes

Basic time-stepping schemes involve discretization in both space and time. Is-

sues of numerical stability as the solution is propagated in time and accuracy

from the space-time discretization are of greatest concern for any implementa-

tion. To begin this study, we will consider very simple and well known examples

from partial diﬀerential equations. The ﬁrst equation we consider is the heat

(diﬀusion) equation

    ∂u/∂t = κ ∂²u/∂x²    (7.1.1)

with the periodic boundary conditions u(−L) = u(L). We discretize the spatial

derivative with a second-order scheme (see Table 4) so that

    ∂u/∂t = (κ/∆x²) [u(x + ∆x) − 2u(x) + u(x − ∆x)] .    (7.1.2)

This approximation reduces the partial diﬀerential equation to a system of or-

dinary diﬀerential equations. We have already considered a variety of time-

stepping schemes for diﬀerential equations and we can apply them directly to

this resulting system.

The ODE System

To deﬁne the system of ODEs, we discretize and deﬁne the values of the vector

u in the following way.

    u(−L) = u_1
    u(−L + ∆x) = u_2
        ⋮
    u(L − 2∆x) = u_{n−1}
    u(L − ∆x) = u_n
    u(L) = u_{n+1} .


Recall from periodicity that u_1 = u_{n+1}. Thus our system of differential equa-
tions solves for the vector

    u = [u_1, u_2, …, u_n]ᵀ .    (7.1.3)

The governing equation (7.1.2) is then reformulated as the diﬀerential equations

system

    du/dt = (κ/∆x²) A u ,    (7.1.4)

where A is given by the sparse matrix

    A = ⎡ −2   1   0   ⋯   0   1 ⎤
        ⎢  1  −2   1   ⋯   0   0 ⎥
        ⎢  0   ⋱   ⋱   ⋱   ⋱   ⋮ ⎥
        ⎢  ⋮   ⋱   ⋱   ⋱   ⋱   0 ⎥
        ⎢  0   ⋯   0   1  −2   1 ⎥
        ⎣  1   0   ⋯   0   1  −2 ⎦ ,    (7.1.5)

and the values of one on the upper right and lower left of the matrix result from

the periodic boundary conditions.

MATLAB implementation

The system of diﬀerential equations can now be easily solved with a standard

time-stepping algorithm such as ode23 or ode45. The basic algorithm would be

as follows

1. Build the sparse matrix A.

e1=ones(n,1); % build a vector of ones

A=spdiags([e1 -2*e1 e1],[-1 0 1],n,n); % diagonals

A(1,n)=1; A(n,1)=1; % periodic boundaries

2. Generate the desired initial condition vector u = u_0.

3. Call an ODE solver from the MATLAB suite. The matrix A, the diﬀusion

constant κ and spatial step ∆x need to be passed into this routine.

[t,y]=ode45(’rhs’,tspan,u0,[],k,dx,A);


The function rhs.m should be of the following form

function rhs=rhs(tspan,u,dummy,k,dx,A)

rhs=(k/dx^2)*A*u;

4. Plot the results as a function of time and space.

The algorithm is thus fairly routine and requires very little effort in programming

since we can make use of the standard time-stepping algorithms already available

in MATLAB.
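The same four steps translate directly to other environments. As a cross-check, here is a hedged Python/SciPy sketch of the method-of-lines solve (the domain, grid size, and diffusion constant are illustrative choices, not values from the notes):

```python
import numpy as np
from scipy.integrate import solve_ivp

L, n, kappa = 20.0, 128, 1.0                  # illustrative domain and diffusion constant
x = np.linspace(-L / 2, L / 2, n, endpoint=False)
dx = x[1] - x[0]

# step 1: second-difference matrix with periodic corner entries (dense for brevity)
A = -2 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
A[0, -1] = A[-1, 0] = 1.0

# step 2: Gaussian initial condition
u0 = np.exp(-x**2)

# step 3: hand the right-hand side (kappa/dx^2) A u to a standard ODE solver
sol = solve_ivp(lambda t, u: (kappa / dx**2) * (A @ u), (0.0, 2.0), u0)
u_final = sol.y[:, -1]
```

Because the column sums of A vanish, the total ∑u is conserved by the time-stepper, while the Gaussian peak diffuses away, which gives two easy sanity checks on the result.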

2D MATLAB implementation

In the case of two dimensions, the calculation becomes slightly more diﬃcult

since the 2D data is represented by a matrix and the ODE solvers require a

vector input for the initial data. For this case, the governing equation is

    ∂u/∂t = κ (∂²u/∂x² + ∂²u/∂y²)    (7.1.6)

where we discretize in x and y. Provided ∆x = ∆y = δ are the same, the system

can be reduced to the linear system

    du/dt = (κ/δ²) B u ,    (7.1.7)

where we have arranged the vector u so that

    u = [u_11, u_12, …, u_1n, u_21, u_22, …, u_n(n−1), u_nn]ᵀ ,    (7.1.8)

where we have defined u_jk = u(x_j, y_k). The matrix B is a nine-diagonal matrix.

The kron command can be used to generate it from its 1D version.

Again, the system of diﬀerential equations can now be easily solved with a

standard time-stepping algorithm such as ode23 or ode45. The basic algorithm

follows the same course as the 1D case, but extra care is taken in arranging the

2D data into a vector.


1. Build the sparse matrix B. Here the matrix B can be built by using the

fact that we know the diﬀerentiation matrix in one dimension, i.e. the

matrix A of the previous example.

n=100; % number of discretization points in x and y

I=eye(n); % identity matrix of size nXn

B=kron(I,A)+kron(A,I); % 2-D differentiation matrix

2. Generate the desired initial condition matrix U = U_0 and reshape it to a
vector u = u_0. This example considers the case of a simple Gaussian as

the initial condition. The reshape and meshgrid commands are important

for computational implementation.

Lx=20; Ly=20; % spatial domain of x and y

nx=100; ny=100; % discretization points in x and y

N=nx*ny; % elements in reshaped initial condition

x2=linspace(-Lx/2,Lx/2,nx+1); % account for periodicity

x=x2(1:nx); % x-vector

y2=linspace(-Ly/2,Ly/2,ny+1); % account for periodicity

y=y2(1:ny); % y-vector

[X,Y]=meshgrid(x,y); % set up for 2D initial conditions

U=exp(-X.^2-Y.^2); % generate a Gaussian matrix

u=reshape(U,N,1); % reshape into a vector

3. Call an ODE solver from the MATLAB suite. The matrix B, the diffusion
constant κ and spatial step ∆x = ∆y = dx need to be passed into this
routine.

[t,y]=ode45('rhs',tspan,u0,[],k,dx,B);

The function rhs.m should be of the following form

function rhs=rhs(tspan,u,dummy,k,dx,B)
rhs=(k/dx^2)*B*u;

4. Plot the results as a function of time and space.

The algorithm is again fairly routine and requires very little eﬀort in program-

ming since we can make use of the standard time-stepping algorithms.
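As a sanity check on the kron construction, the following Python/NumPy sketch (not from the notes) verifies that B = kron(I,A) + kron(A,I) applied to a column-major reshaped grid reproduces the five-point periodic Laplacian stencil; `order='F'` mirrors MATLAB's column-major reshape convention:

```python
import numpy as np

n = 16
# 1D periodic second-difference matrix, as in the previous example
A = -2 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
A[0, -1] = A[-1, 0] = 1.0
B = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))    # 2D differentiation matrix

rng = np.random.default_rng(0)
U = rng.standard_normal((n, n))                      # arbitrary grid function

# five-point periodic Laplacian applied directly on the grid
lap = (np.roll(U, 1, axis=0) + np.roll(U, -1, axis=0) - 2 * U
       + np.roll(U, 1, axis=1) + np.roll(U, -1, axis=1) - 2 * U)
```

Each row of B also sums to zero, inherited from A, which is what makes the 2D diffusion step conservative.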



Figure 12: Graphical representation of the progression of the numerical solu-

tion using the method of lines. Here a second order discretization scheme is

considered which couples each spatial point with its nearest neighbor.

Method of Lines

Fundamentally, these methods use the data at a single slice of time to generate

a solution ∆t in the future. This is then used to generate a solution 2∆t into

the future. The process continues until the desired future time is achieved.

This process of solving a partial diﬀerential equation is known as the method of

lines. Each line is the value of the solution at a given time slice. The lines are

used to update the solution to a new timeline and progressively generate future

solutions. Figure 12 depicts the process involved in the method of lines for the

1D case.

7.2 Time-stepping schemes: explicit and implicit methods

Now that several technical and computational details have been addressed, we

continue to develop methods for time-stepping the solution into the future.

Some of the more common schemes will be considered along with a graphical



Figure 13: Four-point stencil for second-order spatial discretization and Euler
time-stepping of the one-way wave equation.

representation of the scheme. Every scheme eventually leads to an iteration

procedure which the computer can use to advance the solution in time.

We will begin by considering the most simple partial diﬀerential equations.

Often, it is diﬃcult to do much analysis with complicated equations. Therefore,

considering simple equations is not merely an exercise; rather, they are
typically the only equations with which we can make analytical progress.

As an example, we consider the one-way wave equation

    ∂u/∂t = c ∂u/∂x .    (7.2.1)

The simplest discretization of this equation is to ﬁrst central diﬀerence in the x

direction. This yields

    ∂u/∂t = (c/2∆x) (u_{n+1} − u_{n−1}) ,    (7.2.2)

where u_n = u(x_n, t). We can then step forward with an Euler time-stepping
method. Denoting u_n^(m) = u(x_n, t_m), and applying the method of lines iteration
scheme gives

    u_n^(m+1) = u_n^(m) + (c∆t/2∆x) (u_{n+1}^(m) − u_{n−1}^(m)) .    (7.2.3)

This simple scheme has the four-point stencil shown in Fig. 13. To illustrate

more clearly the iteration procedure, we rewrite the discretized equation in the

form

    u_n^(m+1) = u_n^(m) + (λ/2) (u_{n+1}^(m) − u_{n−1}^(m))    (7.2.4)

where

    λ = c∆t/∆x    (7.2.5)

is known as the CFL (Courant, Friedrichs, and Lewy) number. The iteration

procedure assumes that the solution does not change significantly from one time-
step to the next, i.e. u_n^(m) ≈ u_n^(m+1). The accuracy and stability of this scheme


is controlled almost exclusively by the CFL number λ. This parameter relates

the spatial and time discretization schemes in (7.2.5). Note that decreasing ∆x

without decreasing ∆t leads to an increase in λ which can result in instabilities.

Smaller values of ∆t suggest smaller values of λ and improved numerical stability

properties. In practice, you want to take ∆x and ∆t as large as possible for

computational speed and eﬃciency without generating instabilities.

There are practical considerations to keep in mind relating to the CFL num-

ber. First, given a spatial discretization step-size ∆x, you should choose the

time discretization so that the CFL number is kept in check. Often a given

scheme will only work for CFL conditions below a certain value, thus the im-

portance of choosing a small enough time-step. Second, if indeed you choose to

work with very small ∆t, then although stability properties are improved with

a lower CFL number, the code will also slow down accordingly. Thus achieving

good stability results is often counter-productive to fast numerical solutions.
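The dependence of the growth rate on λ is easy to see numerically. The following hedged Python/NumPy sketch (illustrative grid and step counts, not from the notes) advances scheme (7.2.4) at two CFL numbers; the scheme is unstable in both cases, but at small λ the growth over a few hundred steps is negligible while at λ = 1 the solution blows up:

```python
import numpy as np

n, steps = 128, 300
x = np.linspace(-10, 10, n, endpoint=False)
d = lambda u: np.roll(u, -1) - np.roll(u, 1)     # periodic u_{n+1} - u_{n-1}

def euler_wave(lam):
    """Advance the Euler scheme (7.2.4) and return the final amplitude."""
    u = np.exp(-x**2)
    for _ in range(steps):
        u = u + 0.5 * lam * d(u)
    return np.max(np.abs(u))

amp_small, amp_large = euler_wave(0.2), euler_wave(1.0)
```

The amplification factor per step for a Fourier mode is √(1 + λ² sin²(k∆x)), so halving ∆t (and hence λ) at fixed ∆x sharply reduces the growth rate even though it never removes it.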

Central Diﬀerencing in Time

We can discretize the time-step in a similar fashion to the spatial discretiza-

tion. Instead of the Euler time-stepping scheme used above, we could central

diﬀerence in time using Table 4. Thus after spatial discretization we have

    ∂u/∂t = (c/2∆x) (u_{n+1} − u_{n−1}) ,    (7.2.6)

as before. And using a central diﬀerence scheme in time now yields

    [u_n^(m+1) − u_n^(m−1)] / 2∆t = (c/2∆x) (u_{n+1}^(m) − u_{n−1}^(m)) .    (7.2.7)

This last expression can be rearranged to give

    u_n^(m+1) = u_n^(m−1) + λ (u_{n+1}^(m) − u_{n−1}^(m)) .    (7.2.8)

This iterative scheme is called leap-frog (2,2) since it is O(∆t²) accurate in time
and O(∆x²) accurate in space. It uses a four-point stencil as shown in Fig. 14.

Note that the solution utilizes two time slices to leap-frog to the next time slice.

Thus the scheme is not self-starting since only one time slice (initial condition)

is given.

Improved Accuracy

We can improve the accuracy of any of the above schemes by using higher order

central diﬀerencing methods. The fourth-order accurate scheme from Table 5

gives

    ∂u/∂x = [−u(x + 2∆x) + 8u(x + ∆x) − 8u(x − ∆x) + u(x − 2∆x)] / 12∆x .    (7.2.9)



Figure 14: Four-point stencil for second-order spatial discretization and central-
difference time-stepping of the one-way wave equation.

Combining this fourth-order spatial scheme with second-order central diﬀerenc-

ing in time gives the iterative scheme

    u_n^(m+1) = u_n^(m−1) + λ [ (4/3)(u_{n+1}^(m) − u_{n−1}^(m)) − (1/6)(u_{n+2}^(m) − u_{n−2}^(m)) ] .    (7.2.10)

This scheme, which is based upon a six point stencil, is called leap frog (2,4). It

is typical that for (2,4) schemes, the maximum CFL number for stable compu-

tations is reduced from the basic (2,2) scheme.

Lax-Wendroﬀ

Another alternative to discretizing in time and space involves a clever use of the

Taylor expansion

    u(x, t + ∆t) = u(x, t) + ∆t ∂u(x, t)/∂t + (∆t²/2!) ∂²u(x, t)/∂t² + O(∆t³) .    (7.2.11)

But we note from the governing one-way wave equation

    ∂u/∂t = c ∂u/∂x ≈ (c/2∆x) (u_{n+1} − u_{n−1}) .    (7.2.12)

Taking the derivative of the equation results in the relation

    ∂²u/∂t² = c² ∂²u/∂x² ≈ (c²/∆x²) (u_{n+1} − 2u_n + u_{n−1}) .    (7.2.13)

These two expressions for ∂u/∂t and ∂²u/∂t² can be substituted into the Taylor

series expression to yield the iterative scheme

    u_n^(m+1) = u_n^(m) + (λ/2)(u_{n+1}^(m) − u_{n−1}^(m)) + (λ²/2)(u_{n+1}^(m) − 2u_n^(m) + u_{n−1}^(m)) .    (7.2.14)


    Scheme            Stability
    ---------------   ---------------------
    Forward Euler     unstable for all λ
    Backward Euler    stable for all λ
    Leap Frog (2,2)   stable for λ ≤ 1
    Leap Frog (2,4)   stable for λ ≤ 0.707

Table 9: Stability of time-stepping schemes as a function of the CFL number.

This iterative scheme is similar to the Euler method. However, it introduces an

important stabilizing diffusion term which is proportional to λ². This is known

as the Lax-Wendroﬀ scheme. Although useful for this example, it is diﬃcult to

implement in practice for variable coeﬃcient problems. It illustrates, however,

the variety and creativity in developing iteration schemes for advancing the

solution in time and space.
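Scheme (7.2.14) is only a few lines in any array language. A hedged Python/NumPy sketch (illustrative grid and CFL number, not values from the notes) is:

```python
import numpy as np

n, lam, steps = 128, 0.5, 400        # lam is the CFL number c*dt/dx
x = np.linspace(-10, 10, n, endpoint=False)
u = np.exp(-x**2)
mass0, l2_0 = u.sum(), np.linalg.norm(u)

for _ in range(steps):
    d1 = np.roll(u, -1) - np.roll(u, 1)              # u_{n+1} - u_{n-1}
    d2 = np.roll(u, -1) - 2 * u + np.roll(u, 1)      # u_{n+1} - 2u_n + u_{n-1}
    u = u + 0.5 * lam * d1 + 0.5 * lam**2 * d2       # Lax-Wendroff update (7.2.14)
```

Because both difference stencils sum to zero around the periodic ring, the total ∑u is conserved exactly, and for λ ≤ 1 the amplification factor satisfies |g| ≤ 1, so the 2-norm of the solution cannot grow.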

Backward Euler: Implicit Scheme

The backward Euler method uses the future time for discretizing the spatial

domain. Thus upon discretizing in space and time we arrive at the iteration

scheme

    u_n^(m+1) = u_n^(m) + (λ/2) (u_{n+1}^(m+1) − u_{n−1}^(m+1)) .    (7.2.15)

This gives the tridiagonal system

    u_n^(m) = −(λ/2) u_{n+1}^(m+1) + u_n^(m+1) + (λ/2) u_{n−1}^(m+1) ,    (7.2.16)

which can be written in matrix form as

    A u^(m+1) = u^(m)    (7.2.17)

where

    A = (1/2) ⎡ 2  −λ          0 ⎤
              ⎢ λ   2  −λ        ⎥
              ⎢     ⋱   ⋱    ⋱   ⎥
              ⎣ 0        λ    2  ⎦ .    (7.2.18)

Thus before stepping forward in time, we must solve a matrix problem. This

can severely aﬀect the computational time of a given scheme. The only thing

which may make this method viable is if the CFL condition is such that much

larger time-steps are allowed, thus overcoming the limitations imposed by the

matrix solve.
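A hedged Python/SciPy sketch of the implicit update (with periodic corner entries included, an illustrative grid, and a deliberately large CFL number) shows the main selling point: the LU factorization is computed once outside the time loop, and the scheme stays stable even for λ well above the explicit limit:

```python
import numpy as np
from scipy.sparse import identity, diags
from scipy.sparse.linalg import splu

n, lam, steps = 128, 5.0, 50                 # deliberately large CFL number
x = np.linspace(-10, 10, n, endpoint=False)
u = np.exp(-x**2)
l2_0 = np.linalg.norm(u)

# C u gives u_{n+1} - u_{n-1} with periodic wrap-around
C = diags([1.0, -1.0], [1, -1], shape=(n, n)).tolil()
C[0, n - 1], C[n - 1, 0] = -1.0, 1.0
A = (identity(n) - 0.5 * lam * C).tocsc()    # matrix of (7.2.16)-(7.2.17)

solve = splu(A).solve                        # factorize once, reuse every step
for _ in range(steps):
    u = solve(u)                             # A u^{m+1} = u^{m}
```

Each Fourier mode is damped by |g| = 1/|1 − iλ sin(k∆x)| ≤ 1, so the 2-norm of the solution is non-increasing regardless of λ.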


MacCormack Scheme

In the MacCormack Scheme, the variable coeﬃcient problem of the Lax-Wendroﬀ

scheme and the matrix solve associated with the backward Euler are circum-

vented by using a predictor-corrector method. The computation thus occurs in

two pieces:

    u_n^(P) = u_n^(m) + λ (u_{n+1}^(m) − u_{n−1}^(m))    (7.2.19a)
    u_n^(m+1) = (1/2) [ u_n^(m) + u_n^(P) + λ (u_{n+1}^(P) − u_{n−1}^(P)) ] .    (7.2.19b)

This method essentially combines forward and backward Euler schemes so that

we avoid the matrix solve and the variable coeﬃcient problem.
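A runnable sketch helps make the predictor-corrector structure concrete. Note that the classical presentation of MacCormack uses one-sided forward and backward differences in the predictor and corrector respectively; the hedged Python/NumPy sketch below implements that classical form on an illustrative grid:

```python
import numpy as np

n, lam, steps = 128, 0.5, 300          # lam = c*dt/dx, illustrative values
x = np.linspace(-10, 10, n, endpoint=False)
u = np.exp(-x**2)
mass0, l2_0 = u.sum(), np.linalg.norm(u)

for _ in range(steps):
    up = u + lam * (np.roll(u, -1) - u)                   # predictor: forward difference
    u = 0.5 * (u + up + lam * (up - np.roll(up, 1)))      # corrector: backward difference
```

For the one-way wave equation this forward/backward combination reproduces the Lax-Wendroff amplification factor exactly, so it inherits the stability bound λ ≤ 1 without ever forming or solving a matrix.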

The CFL condition will be discussed in detail in the next section. For now,

the basic stability properties of many of the schemes considered here are given

in Table 9. The stability of the various schemes hold only for the one-way

wave equation considered here as a prototypical example. Each partial diﬀer-

ential equation needs to be considered and classiﬁed individually with regards

to stability.

7.3 Implementing Time- and Space-Stepping Schemes

Computational performance is always crucial in choosing a numerical scheme.

Speed, accuracy and stability all play key roles in determining the appropriate

choice of method for solving. We will consider three prototypical equations:

    ∂u/∂t = ∂u/∂x        one-way wave equation    (7.3.1a)
    ∂u/∂t = ∂²u/∂x²      diffusion equation    (7.3.1b)
    i ∂u/∂t = (1/2) ∂²u/∂x² + |u|² u    nonlinear Schrödinger equation.    (7.3.1c)

Periodic boundary conditions will be assumed in each case. The purpose of this

section is to build a numerical routine which will solve these problems using the

iteration procedures outlined in the previous two sections. Of speciﬁc interest

will be the setting of the CFL number and the consequences of violating the

stability criteria associated with it.

The equations are considered in one dimension such that the ﬁrst and second


derivative are given by

    ∂u/∂x → (1/2∆x) ⎡  0   1   0   ⋯   0  −1 ⎤ ⎡ u_1 ⎤
                    ⎢ −1   0   1   ⋯   0   0 ⎥ ⎢ u_2 ⎥
                    ⎢  0   ⋱   ⋱   ⋱   ⋱   ⋮ ⎥ ⎢  ⋮  ⎥
                    ⎢  0   ⋯   0  −1   0   1 ⎥ ⎢     ⎥
                    ⎣  1   0   ⋯   0  −1   0 ⎦ ⎣ u_n ⎦    (7.3.2)

and

    ∂²u/∂x² → (1/∆x²) ⎡ −2   1   0   ⋯   0   1 ⎤ ⎡ u_1 ⎤
                      ⎢  1  −2   1   ⋯   0   0 ⎥ ⎢ u_2 ⎥
                      ⎢  0   ⋱   ⋱   ⋱   ⋱   ⋮ ⎥ ⎢  ⋮  ⎥
                      ⎢  0   ⋯   0   1  −2   1 ⎥ ⎢     ⎥
                      ⎣  1   0   ⋯   0   1  −2 ⎦ ⎣ u_n ⎦ .    (7.3.3)

From the previous lectures, we have the following discretization schemes for

the one-way wave equation (7.3.1):

    Euler (unstable):                     u_n^(m+1) = u_n^(m) + (λ/2)(u_{n+1}^(m) − u_{n−1}^(m))    (7.3.4a)
    leap-frog (2,2) (stable for λ ≤ 1):   u_n^(m+1) = u_n^(m−1) + λ(u_{n+1}^(m) − u_{n−1}^(m))    (7.3.4b)

where the CFL number is given by λ = ∆t/∆x. Similarly for the diﬀusion

equation (7.3.1)

    Euler (stable for λ ≤ 1/2):   u_n^(m+1) = u_n^(m) + λ(u_{n+1}^(m) − 2u_n^(m) + u_{n−1}^(m))    (7.3.5a)
    leap-frog (2,2) (unstable):   u_n^(m+1) = u_n^(m−1) + 2λ(u_{n+1}^(m) − 2u_n^(m) + u_{n−1}^(m))    (7.3.5b)

where now the CFL number is given by λ = ∆t/∆x². The nonlinear Schrödinger
equation discretizes to the following form:

    ∂u_n^(m)/∂t = −(i/2∆x²)(u_{n+1}^(m) − 2u_n^(m) + u_{n−1}^(m)) − i|u_n^(m)|² u_n^(m) .    (7.3.6)

The Euler and leap-frog (2,2) time-stepping schemes will be explored with this

equation.


Figure 15: Evolution of the one-way wave equation with the leap-frog (2,2)

scheme and with CFL=0.5. The stable traveling wave solution is propagated in

this case to the left.

One-Way Wave Equation

We ﬁrst consider the leap-frog (2,2) scheme applied to the one-way wave equa-

tion. Figure 15 depicts the evolution of an initial Gaussian pulse. For this case,

the CFL=0.5 so that stable evolution is analytically predicted. The solution

propagates to the left as expected from the exact solution. The leap-frog (2,2)

scheme becomes unstable for λ ≥ 1 and the system is always unstable for the

Euler time-stepping scheme. Figure 16 depicts the unstable evolution of the

leap-frog (2,2) scheme with CFL=2 and the Euler time-stepping scheme. The

initial conditions used are identical to that in Fig. 15. Since we have predicted

that the leap-frog numerical scheme is only stable provided λ < 1, it is not sur-

prising that the ﬁgure on the left goes unstable. Likewise, the ﬁgure on the right

shows the numerical instability generated in the Euler scheme. Note that both

of these unstable evolutions develop high frequency oscillations which eventually

blow up. The MATLAB code used to generate the leap-frog and Euler iterative

solutions is given by

clear all; close all; % clear previous figures and values


Figure 16: Evolution of the one-way wave equation using the leap-frog (2,2)

scheme with CFL=2 (left) along with the Euler time-stepping scheme (right).

The analysis predicts stable evolution of leap-frog provided the CFL≤ 1. Thus

the onset of numerical instability near t ≈ 3 for the CFL=2 case is not surprising.

Likewise, the Euler scheme is expected to be unstable for all CFL.

% initialize grid size, time, and CFL number

Time=4;

L=20;

n=200;

x2=linspace(-L/2,L/2,n+1);

x=x2(1:n);

dx=x(2)-x(1);

dt=0.2;

CFL=dt/dx

time_steps=Time/dt;

t=0:dt:Time;

% initial conditions

u0=exp(-x.^2)’;

u1=exp(-(x+dt).^2)’;

usol(:,1)=u0;

usol(:,2)=u1;

% sparse matrix for derivative term

e1=ones(n,1);

A=spdiags([-e1 e1],[-1 1],n,n);

A(1,n)=-1; A(n,1)=1;

% leap frog (2,2) or euler iteration scheme

for j=1:time_steps-1


% u2 = u0 + CFL*A*u1; % leap frog (2,2)

% u0 = u1; u1 = u2; % leap frog (2,2)

u2 = u1 + 0.5*CFL*A*u1; % euler

u1 = u2; % euler

usol(:,j+2)=u2;

end

% plot the data

waterfall(x,t,usol’);

map=[0 0 0];

colormap(map);

% set x and y limits and fontsize

set(gca,’Xlim’,[-L/2 L/2],’Xtick’,[-L/2 0 L/2],’FontSize’,[20]);

set(gca,’Ylim’,[0 Time],’ytick’,[0 Time/2 Time],’FontSize’,[20]);

view(25,40)

% set axis labels and fonts

xl=xlabel(’x’); yl=ylabel(’t’); zl=zlabel(’u’);

set(xl,’FontSize’,[20]);set(yl,’FontSize’,[20]);set(zl,’FontSize’,[20]);

print -djpeg -r0 fig.jpg % print jpeg at screen resolution

Heat Equation

In a similar fashion, we investigate the evolution of the diﬀusion equation when

the space-time discretization is given by the leap-frog (2,2) scheme or Euler

stepping. Figure 17 shows the expected diﬀusion behavior for the stable Euler

scheme (λ ≤ 0.5). In contrast, Fig. 18 shows the numerical instabilities which are

generated from violating the CFL constraint for the Euler scheme or using the

always unstable leap-frog (2,2) scheme for the diﬀusion equation. The numerical

code used to generate these solutions follows that given previously for the one-

way wave equation. However, the sparse matrix is now given by

% sparse matrix for second derivative term

e1=ones(n,1);

A=spdiags([e1 -2*e1 e1],[-1 0 1],n,n);

A(1,n)=1; A(n,1)=1;

Further, the iterative process is now

% leap frog (2,2) or euler iteration scheme

for j=1:time_steps-1

u2 = u0 + 2*CFL*A*u1; % leap frog (2,2)


Figure 17: Stable evolution of the heat equation with the Euler scheme with

CFL=0.5. The initial Gaussian is diﬀused in this case.

u0 = u1; u1 = u2; % leap frog (2,2)

% u2 = u1 + CFL*A*u1; % euler

% u1 = u2; % euler

usol(:,j+2)=u2;

end

where we recall that the CFL condition is now given by λ = ∆t/∆x², i.e.

CFL=dt/dx/dx

This solves the one-dimensional heat equation with periodic boundary condi-

tions.
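The λ ≤ 1/2 Euler bound for the heat equation is easy to see numerically. A hedged Python/NumPy sketch (illustrative grid, not from the notes) runs the same update at a stable and an unstable CFL number:

```python
import numpy as np

n = 128
x = np.linspace(-10, 10, n, endpoint=False)
lap = lambda u: np.roll(u, -1) - 2 * u + np.roll(u, 1)   # periodic second difference

def euler_heat(lam, steps=150):
    """Forward-Euler heat equation update with CFL number lam."""
    u = np.exp(-x**2)
    for _ in range(steps):
        u = u + lam * lap(u)
    return np.max(np.abs(u))

stable_amp = euler_heat(0.25)    # lam <= 1/2: diffusive decay
unstable_amp = euler_heat(1.0)   # lam = 1: violates the bound, sawtooth mode grows
```

At λ = 1/4 each new value is a convex combination of its neighbors, so the maximum cannot grow; at λ = 1 the k∆x = π mode is amplified by a factor of 3 every step.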

Nonlinear Schrödinger Equation

The nonlinear Schrödinger equation can easily be discretized by the above tech-

niques. However, as with most nonlinear equations, it is a bit more diﬃcult to

perform a von Neumann analysis. Therefore, we explore the behavior for this

system for two diﬀerent discretization schemes: Euler and leap-frog (2,2). The

CFL number will be the same with both schemes (λ = 0.05) and the stability


Figure 18: Evolution of the heat equation with the Euler time-stepping scheme

(left) and leap-frog (2,2) scheme (right) with CFL=1. The analysis predicts

that both these schemes are unstable. Thus the onset of numerical instability

is observed.

Figure 19: Evolution of the nonlinear Schrödinger equation with the Euler time-

stepping scheme (left) and leap-frog (2,2) scheme (right) with CFL=0.05.

will be investigated through numerical computations. Figure 19 shows the evo-

lution of the exact one-soliton solution of the nonlinear Schrödinger equation

(u(x, 0) = sech(x)) over six units of time. The Euler scheme is observed to

lead to numerical instability whereas the leap-frog (2,2) scheme is stable. In

general, the leap-frog schemes work well for wave propagation problems while

Euler methods are better for problems of a diﬀusive nature.

The MATLAB code modifications necessary to solve the nonlinear Schrödinger

equation are trivial. Speciﬁcally, the iteration scheme requires change. For the

stable leap-frog scheme, the following command structure is required

u2 = u0 + -i*CFL*A*u1- i*2*dt*(conj(u1).*u1).*u1;

u0 = u1; u1 = u2;


Note that i is automatically defined in MATLAB as i = √−1. Thus it is

imperative that you do not use the variable i as a counter in your FOR loops.

You will solve a very diﬀerent equation if not careful with this deﬁnition.
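The full leap-frog iteration for (7.3.6) is compact in any array language. The hedged Python/NumPy sketch below (illustrative grid, CFL number 0.05 as in the text) propagates the one-soliton initial condition, using the exact soliton u(x, t) = sech(x) e^{−it/2} at t = ∆t for the second starting slice that leap-frog requires:

```python
import numpy as np

n, L = 256, 40.0
x = np.linspace(-L / 2, L / 2, n, endpoint=False)
dx = x[1] - x[0]
dt = 0.05 * dx**2                      # CFL number dt/dx^2 = 0.05, as in the text
steps = 800

lap = lambda u: np.roll(u, -1) - 2 * u + np.roll(u, 1)
rhs = lambda u: -1j / (2 * dx**2) * lap(u) - 1j * np.abs(u)**2 * u   # (7.3.6)

sech = 1 / np.cosh(x)
um = sech.astype(complex)              # slice at t = 0
uc = sech * np.exp(-1j * dt / 2)       # second slice from the exact soliton
for _ in range(steps - 1):
    um, uc = uc, um + 2 * dt * rhs(uc)
```

Since the exact solution satisfies |u(x, t)| = sech(x) for all t, the quality of the stable leap-frog evolution can be judged by how closely |u| tracks the sech profile.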


AMATH 301 ( c J. N. Kutz)

2

Contents

1 MATLAB Introduction 1.1 Vectors and Matrices . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 Logic, Loops and Iterations . . . . . . . . . . . . . . . . . . . . . 1.3 Plotting and Importing/Exporting Data . . . . . . . . . . . . . . 3 3 7 10

2 Linear Systems 14 2.1 Direct solution methods for Ax=b . . . . . . . . . . . . . . . . . 14 2.2 Iterative solution methods for Ax=b . . . . . . . . . . . . . . . . 18 2.3 Eigenvalues, Eigenvectors, and Solvability . . . . . . . . . . . . . 22 3 Curve Fitting 26 3.1 Least-Square Fitting Methods . . . . . . . . . . . . . . . . . . . . 26 3.2 Polynomial Fits and Splines . . . . . . . . . . . . . . . . . . . . . 30 3.3 Data Fitting with MATLAB . . . . . . . . . . . . . . . . . . . . . 33 4 Numerical Diﬀerentiation and Integration 38 4.1 Numerical Diﬀerentiation . . . . . . . . . . . . . . . . . . . . . . 38 4.2 Numerical Integration . . . . . . . . . . . . . . . . . . . . . . . . 43 4.3 Implementation of Diﬀerentiation and Integration . . . . . . . . . 47 5 Diﬀerential Equations 50 5.1 Initial value problems: Euler, Runge-Kutta and Adams methods 50 5.2 Error analysis for time-stepping routines . . . . . . . . . . . . . . 57 5.3 Boundary value problems: the shooting method . . . . . . . . . . 63 5.4 Implementation of the shooting method . . . . . . . . . . . . . . 68 5.5 Boundary value problems: direct solve and relaxation . . . . . . 72 5.6 Implementing the direct solve method . . . . . . . . . . . . . . . 76 6 Fourier Transforms 79 6.1 Basics of the Fourier Transform . . . . . . . . . . . . . . . . . . . 79 6.2 Spectral Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 6.3 Applications of the FFT . . . . . . . . . . . . . . . . . . . . . . . 84 7 Partial Diﬀerential Equations 89 7.1 Basic time- and space-stepping schemes . . . . . . . . . . . . . . 89 7.2 Time-stepping schemes: explicit and implicit methods . . . . . . 93 7.3 Implementing Time- and Space-Stepping Schemes . . . . . . . . 98

AMATH 301 ( c J. N. Kutz)

3

1

MATLAB Introduction

The ﬁrst three sections of these notes deals with the preliminaries necessary for manipulating data, constructing logical statements and plotting data. The remainder of the course relies heavily on proﬁciency with these basic skills.

1.1

Vectors and Matrices

The manipulation of matrices and vectors is of fundamental importance in MATLAB proﬁciency. This section deals with the construction and manipulation of basic matrices and vectors. We begin by considering the construction of row and column vectors. Vectors are simply a subclass of matrices which have only a single column or row. A row vector can be created by executing the command struture >>x=[1 3 2] which creates the row vector x = (1 2 3) . (1.1.1) Row vectors are oriented in a horizontal fashion. In contrast, column vectors are oriented in a vertical fashion and are created with either of the two following command structures: >>x=[1; 3; 2] where the semicolons indicate advancement to the next row. Otherwise a return can be used in MATLAB to initiate the next row. Thus the following command structure is equivalent >>x=[1 3 2] Either one of these creates the column vector 1 x= 3 . 2

(1.1.2)

All row and column vectors can be created in this way. Vectors can also be created by use of the colon. Thus the following command line >>x=0:1:10

The coordinate of any data in a matrix is found from its row and column location so that the elements of a matrix A are given by A(i.7) where i denotes the row and j denotes the column.2 0. of rows (N ) and columns (M ).2 so that x = (0 0. use the command . The 3 × 3 matrix 1 3 A= 5 6 8 3 can be created by use of the semicolons >>A=[1 3 2.6) In this case.3) (1.1. Matrices are just as simple to generate.1.4) (1. Kutz) creates a row vector which goes from 0 to 10 in steps of 1 so that x = (0 1 2 3 4 5 6 7 8 9 10) .6 0.1.1. This matrix matrix.4 0.1. Accessing components of a given matrix of vector is a task that relies on knowing the row and column of a certain piece of data.2:1 to create a row vector which goes from 0 to 1 in steps of 0.6). j) (1. Note that the command line >>x=0:2:10 creates a row vector which goes from 0 to 10 in steps of 2 so that x = (0 2 4 6 8 10) .5) Now there are a speciﬁed number would be referred to as an N × M 2 7 1 (1. This allows >>x=0:0. To access the second row and third column of (1.8 1) . 8 3 1] or by using the return key so that >>A=[1 3 2 5 6 7 8 3 1] 4 (1. 5 6 7.1. the matrix is square with an equal number of rows and columns (N = M ). which takes the value of 7. Steps of integer size need not be used. N.AMATH 301 ( c J.

This produces the column vector 3 .3) will access the entire third column to produce 2 x= 7 .1. the use of the colon is required >>x=A(2.11) x= 0 The command >>x=B(4.10) 5 0 2 6 .1. 6 1 5 5 The command >>x=B(2:3.9) More sophisticated data extraction and manipulation can be performed with the aid of colons. This produces the row vector x = (1 5 5) . Kutz) 5 >>x=A(2.:) This will access the entire second row and return the row vector x = (5 6 7) .2:end) removes the second through last columns of row 4.8) (1. To show examples of these techniques we consider the 4 × 4 matrix 1 7 9 2 2 3 3 4 B= (1. To access an entire row.1. (1. Columns can be similarly be extracted.1. (1. The command >>C=B(1:end-1.3) This will return the value x = 7. N. 1 (1.AMATH 301 ( c J.12) We can also remove a speciﬁed number of rows and columns.1. Thus >>x=A(:.2:4) .2) removes the second through third row of column 2.

AMATH 301 ( c J.1. The second row of D. when transposing a vector with complex numbers. 1 (1. it changes the sign of the imaginary part. the command >>D=[B(1. is made from the transpose (. which is initiated with the semicolon. Speciﬁcally. N.1. we make use of the transpose symbol which turns row vectors into column vectors and vice-versa. This then produces the matrix 7 9 2 C= 3 3 4 .e.’) of the ﬁrst three rows of the third column of B.13) 0 2 6 As a last example. B(1:3. when considering the transpose of the column vector 3 + 2i .14) D= 9 3 2 An important comment about the transpose function is in order.2:4). the command >>y=x. (1. Kutz) 6 removes the ﬁrst row through the next to last row along with the second through fourth columns. (1.1. In this example.’ produces the row vector y = (3 + 2i 1 8) . (1.16) . In particular. i.3).17) Thus the use of the ’ symbol alone also conjugates vector.15) x= 8 √ where i is the complex (imaginary) number i = −1.1. the period must be put in before the ’ symbol. This produces the matrix 7 9 2 . (1. whereas the command >>y=x’ produces the row vector y = (3 − 2i 1 8) .1.’] makes the ﬁrst row of D the second through fourth columns of the ﬁrst row of B.

To illustrate the use of the for loop structure. This lecture will focus on the use of these two ubiquitous logic structures in programming. Kutz) 7 1. Thus starting with an initial value of sum equal to zero. The basic format for this logic is as follows: . and 4 successively. The program sum=0 for j=1:2:5 sum=sum+j end is similar to the previous program. As an example. unlike the for loop. The default incremental increase in the counter j is one. Thus starting with an initial value of sum equal to zero. 5. and 15 (j = 5). and 10 (j = 4).2 Logic. In the loop. 10 (j = 4). the increment can be speciﬁed as desired. the program sum=0 for j=[1 5 4] sum=sum+j end will go through the loop three times with the values of j being 1. However. We begin by constructing a loop which will recursively sum a series of numbers. N. a series of logic statements are usually considered. 6 (j = 5). and 9 (j = 5). we ﬁnd that sum is equal to 1 (j = 1). Thus starting with an initial value of sum equal to zero. i. Loops and Iterations The basic building blocks of any MATLAB program are for loops and if statements. we consider some very basic programs which revolve around its implementation.e.AMATH 301 ( c J. two. But for this case the incremental steps in j are speciﬁed to be two. However. the value of sum is updated by adding the current value of j. It then proceeds to go through the for loop ﬁve times. we ﬁnd that sum is equal to 1 (j = 1). And even more generally. They form the background for carrying out the complicated manipulations required of sophisticated and simple codes alike. the for loop counter can be simply given by a row vector. we ﬁnd that sum is equal to 1 (j = 1). The basic format for such a loop is the following: sum=0 for j=1:5 sum=sum+j end This program begins with the variable sum being zero. 4 (j = 3). The if statement is similarly implemented. 3 (j = 2). four and ﬁve in succession. 6 (j = 3). the counter j takes the value of one. three.

if (logical statement)
    (expressions to execute)
elseif (logical statement)
    (expressions to execute)
elseif (logical statement)
    (expressions to execute)
else
    (expressions to execute)
end

In this logic structure, the last set of expressions is executed if the three previous logical statements do not hold. In practice, the logical if may only be required to execute a command if something is true, so there would be no need for the else logic structure. Such a logic format might be as follows:

if (logical statement)
    (expressions to execute)
elseif (logical statement)
    (expressions to execute)
end

In such a structure, no commands would be executed if the logical statements do not hold.

To make a practical example of the use of for and if statements, we consider the bisection method for finding zeros of a function. In particular, we will consider the transcendental function

    exp(x) − tan(x) = 0,   (1.2.1)

for which the values of x which make this true must be found computationally. We can begin to get an idea of where the relevant values of x are by plotting this function. The following MATLAB script will plot the function over the interval x ∈ [−10, 10]:

clear all    % clear all variables
close all    % close all figures

x=-10:0.1:10;       % define plotting range
y=exp(x)-tan(x);    % define function to consider
plot(x,y)           % plot the function

It should be noted that tan(x) takes on the value of ±∞ at π/2 + nπ where n = ···, −2, −1, 0, 1, 2, ···. By zooming in to smaller values of the function, one can find that there are a large number (infinite) of roots to this equation. In

particular, there is a root located in x ∈ [−4, −2.8]. At x = −4 the function exp(x) − tan(x) > 0, while at x = −2.8 the function exp(x) − tan(x) < 0. Thus, in between these points lies a root. The bisection method simply cuts the given interval in half and determines if the root is on the left or right side of the cut. Once this is established, a new interval is chosen with the midpoint now becoming the left or right end of the new domain, depending of course on the location of the root. This method is repeated until the interval has become so small that the function considered has come to within some tolerance of zero. The following algorithm uses an outside for loop to continue cutting the interval in half while the embedded if statement determines the new interval of interest. A second if statement is used to ensure that once a certain tolerance has been achieved, i.e. the absolute value of the function exp(x) − tan(x) is less than 10^(−5), the iteration process is stopped.

bisection.m

clear all     % clear all variables

xr=-2.8;      % initial right boundary
xl=-4;        % initial left boundary

for j=1:1000  % j cuts the interval

    xc=(xl+xr)/2;          % calculate the midpoint
    fc=exp(xc)-tan(xc);    % calculate function

    if fc>0
        xl=xc;             % move left boundary
    else
        xr=xc;             % move right boundary
    end

    if abs(fc)<10^(-5)
        break              % quit the loop
    end

end

xc    % print value of root
fc    % print value of function

Note that the break command ejects you from the current loop, that is, the j loop. This effectively stops the iteration procedure for cutting the intervals in half.
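For readers outside MATLAB, the same bisection algorithm can be sketched in Python (a line-by-line translation of the bisection.m script above; the bracket [−4, −2.8] and the 10^(−5) tolerance follow the text):

```python
import math

def f(x):
    # the transcendental function from (1.2.1): exp(x) - tan(x)
    return math.exp(x) - math.tan(x)

xl, xr = -4.0, -2.8        # initial left/right boundaries

for j in range(1000):      # j cuts the interval
    xc = (xl + xr) / 2     # calculate the midpoint
    fc = f(xc)             # calculate function
    if fc > 0:
        xl = xc            # root lies to the right: move left boundary
    else:
        xr = xc            # root lies to the left: move right boundary
    if abs(fc) < 1e-5:
        break              # tolerance achieved: quit the loop

print(xc, fc)              # root near x = -3.1, |f| below 1e-5
```

As in the MATLAB version, the sign test fc > 0 relies on f being positive at the left boundary and negative at the right boundary of the initial bracket.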

Further, extensive use has been made of semicolons at the end of each line. The semicolon simply represses output to the computer screen, saving valuable time and clutter.

1.3 Plotting and Importing/Exporting Data

The graphical representation of data is a fundamental aspect of any technical scientific communication. MATLAB provides an easy and powerful interface for plotting and representing a large variety of data formats. Like all MATLAB structures, plotting is dependent on the effective manipulation of vectors and matrices.

To begin the consideration of graphical routines, we first construct a set of functions to plot and manipulate. To define a function, the plotting domain must first be specified. This can be done by using a routine which creates a row vector

x1=-10:0.1:10;

Here the row vector x1 spans the interval x ∈ [−10, 10] in steps of Δx = 0.1. The second command creates a row vector y1 which gives the values of the sine function evaluated at the corresponding values of x1:

y1=sin(x1);

A basic plot of this function can then be generated by the command

plot(x1,y1)

Note that this graph lacks a title and axis labels. These are important in generating high quality graphics.

The preceding example considers only equally spaced points. However, MATLAB will generate function values for any specified coordinate values. For instance, the two lines

x2=[-5 sqrt(3) pi 4];
y2=sin(x2);

will generate the values of the sine function at x = −5, √3, π and 4.

The linspace command is also helpful in generating a domain for defining and evaluating functions. The basic procedure for this function is as follows:

x3=linspace(-10,10,64);
y3=x3.*sin(x3);

The first command generates a row vector x3 which goes from −10 to 10 in 64 steps. This can often be a helpful function when considering intervals where the number of discretized steps gives a complicated Δx. In this example, we are considering the function x sin x over the interval x ∈ [−10, 10]. To generate this function, the period must be

included before the multiplication sign. This will then perform a component-by-component multiplication, thus creating a row vector with the values of x sin x.

To plot all of the above data sets in a single plot, we can do either one of the following routines. In the first,

figure(1)
plot(x1,y1)
hold on
plot(x2,y2)
plot(x3,y3)

the hold on command is necessary after the first plot so that each subsequent plot command will not overwrite the data already plotted. This will plot all three data sets on the same graph with a default blue line. This graph will be figure 1. Alternatively, all the data sets can be plotted on the same graph with

figure(2)
plot(x1,y1,x2,y2,x3,y3)

In this case, the three pairs of vectors are prescribed within a single plot command. The figure generated will be figure 2. An advantage to this method is that the three data sets will be of different colors, which is better than having them all the default color of blue.

This is only the beginning of the plotting and visualization process. Many aspects of the graph must be augmented with specialty commands in order to more accurately relay the information. Of significance is the ability to change the line colors and styles of the plotted data. By using the help plot command, a list of options for customizing data representation is given. In the following, a new figure is created which is customized as follows:

figure(3)
plot(x1,y1,x2,y2,'g*',x3,y3,'mO:')

This will create the same plot as in figure 2, but now the second data set is represented by green * and the third data set is represented by a magenta dotted line with the actual data points given by a magenta hollow circle. This kind of customization greatly helps distinguish the various data sets and their individual behaviors.

Labeling the axes and placing a title on the figure is also of fundamental importance. This can be easily accomplished with the commands

xlabel('x values')
ylabel('y values')
title('Example Graph')

The strings given within the ' signs are now printed along the x-axis, y-axis and title location respectively.

A legend is then important to distinguish and label the various data sets which are plotted together on the graph. In this example, we have three data sets under consideration. To place a legend in the above graph in an optimal location, the following command is used:

legend('Data set 1','Data set 2','Data set 3',0)

Here the strings correspond to the three plotted data sets in the order they were plotted. The zero at the end of the legend command is the option setting for placement of the legend box. In this case, the option zero tries to pick the best possible location which does not interfere with any of the data. Within the legend command, the position of the legend can be easily specified. The command help legend gives a summary of the possible legend placement positions, one of which is to the right of the graph and does not interfere with the graph data.

Remark 1: All the graph customization techniques discussed can be performed directly within the MATLAB graphics window. Simply go to the edit button and choose to edit axis properties. This will give full control of all the axis properties discussed so far and more. However, once MATLAB is shut down or the graph is closed, all the customization properties are lost and you must start again from scratch. This gives an advantage to developing a nice set of commands in a .m file to customize your graphics.

subplots

Another possible method for representing data is with the subplot command. This allows for multiple graphs to be arranged in a row and column format. In this example, we have three data sets which are under consideration. To plot each data set in an individual subplot requires three plot frames. Thus we construct a plot containing three rows and a single column. The format for this command structure is as follows:

figure(4)
subplot(3,1,1), plot(x1,y1)
axis([-10 10 -10 10])
subplot(3,1,2), plot(x2,y2,'*')
axis([-10 10 -10 10])
subplot(3,1,3), plot(x3,y3,'mO:')

Note that the subplot command is of the structure subplot(row,column,graph number), where the graph number refers to the current graph being considered. In this case, the axis command has also been used to make all the graphs have the same x and y values. The command structure for the axis command is

axis([xmin xmax ymin ymax])

For each subplot, the xlabel, ylabel, title and legend commands can be used.

This can then be loaded back into MATLAB with the command . title([’value of rho=’ num2str(rho)]) creates a title which reads ”value of rho=1. However. To recall this data at a future time. The save and print commands are the appropriate commands for performing such actions. save and print Once a graph has been made or data generated. N. it may be advantageous to save data for use in a diﬀerent software package.mat. In this case. simply load it back into MATLAB with the command load filename This will reload into the workspace all the variables saved in ﬁlename. The save command will write any data out to a ﬁle for later use in a separate program or for later use in MATLAB. data needs to be saved in an ASCII format in order to be read by other software engines. To save workspace variables to a ﬁle.mat. This command is ideal when closing down operations on MATLAB and resuming at a future time. The save command can then be modiﬁed for this purpose. Remark 3: To put a variable value in a string. To take it oﬀ. This feature can aid in illustrating more clearly certain phenomena and should be considered when making plots. The command save x1 x1. the code rho=1. use grid oﬀ.5. this will save all the vectors created along with any graphics settings. This is an extremely powerful and easy command to use.dat in ASCII format.AMATH 301 ( c J. Kutz) 13 Remark 2: To put a grid on the a graph. it may be useful to save the data or print the graphs for inclusion in a write up or presentation. In the preceding example.dat -ASCII saves the row vector x1 generated previously to the ﬁle x1. you can only access the data again through MATLAB. load. the following command structure is used save filename this will save all the current workspace variables to a binary ﬁle named ﬁlename. Thus this command converts variable values into useful string conﬁgurations. For instance. Alternatively.5”. simply use the grid on command. the num2str (number to string) command can be used. 
Remark 4: The number of grid lines can be controlled from the editing options in the graphics window.

This saving option is advantageous when considering the use of other software packages to manipulate or analyze data generated in MATLAB.

If you desire to print a figure to a file, the print command needs to be utilized. There are a large number of graphics formats which can be used. By typing help print, a list of possible options will be listed. A common graphics format would involve the command

print -djpeg fig.jpg

which will print the current figure as a jpeg file named fig.jpg. Note that you can also print or save from the figure window by pressing the file button and following the links to the print or export option respectively.

2 Linear Systems

The solution of linear systems is one of the most basic aspects of computational science. In many applications, the solution technique often gives rise to a system of linear equations which needs to be solved as efficiently as possible. In addition to Gaussian elimination, there are a host of other techniques which can be used to solve a given problem. This section offers an overview of these methods and techniques.

2.1 Direct solution methods for Ax=b

A central concern in almost any computational strategy is a fast and efficient computational method for achieving a solution of a large system of equations Ax = b. In trying to render a computation tractable, it is crucial to minimize the operations it takes in solving such a system. There are a variety of direct methods for solving Ax = b: Gaussian elimination, LU decomposition, and inverting the matrix A. In addition to these direct methods, iterative schemes can also provide efficient solution techniques. Some basic iterative schemes will be discussed in what follows.

The standard beginning to discussions of solution techniques for Ax = b involves Gaussian elimination. We will consider a very simple example of a 3 × 3 system in order to understand the operation count and numerical procedure involved in this technique. Thus consider Ax = b with

    A = [1 1 1; 1 2 4; 1 3 9],   b = [1; −1; 1].   (2.1.1)

The Gaussian elimination procedure begins with the construction of the augmented matrix

    [A|b] = [1 1 1 | 1; 1 2 4 | −1; 1 3 9 | 1]
          = [1 1 1 | 1; 0 1 3 | −2; 0 2 8 | 0]
          = [1 1 1 | 1; 0 1 3 | −2; 0 0 2 | 4]
          = [1 1 1 | 1; 0 1 3 | −2; 0 0 1 | 2],   (2.1.2)

where at each stage the pivot of the augmented matrix (the leading entry of the row used for elimination) is used to zero out the entries below it. Back substituting then gives the solution

    x3 = 2                              (2.1.3a)
    x2 + 3x3 = −2  →  x2 = −8           (2.1.3b)
    x1 + x2 + x3 = 1  →  x1 = 7         (2.1.3c)

This procedure can be carried out for any matrix A which is nonsingular, i.e. det A ≠ 0; we simply need to avoid singular matrices and occasionally shift the rows around to avoid a zero pivot. Provided we do this, it will always yield an answer.

The fact that this algorithm works is secondary to the concern of the time required in generating a solution in scientific computing. The operation count for Gaussian elimination can easily be estimated from the algorithmic procedure for an N × N matrix:

1. Movement down the N pivots.

2. For each pivot, perform N additions/subtractions across a given row.

3. For each pivot, perform the addition/subtraction down the N rows.

In total, this results in a scheme whose operation count is O(N^3). The back substitution algorithm can similarly be calculated to give an O(N^2) scheme.

LU Decomposition

Each Gaussian elimination costs O(N^3) operations. This can be computationally prohibitive for large matrices when repeated solutions of Ax = b

1. A = LU → a21 a22 a23 = m21 0 0 u33 m31 m32 1 a31 a32 a33 (2.8a) (2. A = −2 −4 (2.1.1. we consider the 3 × 3 matrix 4 3 −1 5.1.1. As an example of the application of the LU factorization algorithm.7c) (2. Kutz) 16 must be found. you are doing far more work than necessary in achieving a solution. the operation count can easily be brought down to O(N 2 ) using LU factorization which splits the matrix A into a lower triangular matrix L. This then gives Ax = b → LUx = b (2.7a) (2. and an upper triangular matrix U.1.4) Thus the L matrix is lower triangular and the U matrix is upper triangular. Thus once the factorization is accomplished. Otherwise.6) can be solved by O(N 2 ) back substitution. the LU results in an O(N 2 ) scheme for arriving at the solution. Note.1.1.AMATH 301 ( c J.5) where by letting y = Ux we ﬁnd the coupled system Ly = b and Ux = y where the system Ly = b y1 = b1 m21 y1 + y2 = b2 m31 y1 + m32 y2 + y3 = b3 can be solved by O(N 2 ) forward substitution and the system Ux = y u11 x1 + u12 x2 + x3 = y1 u22 x2 + u23 x3 = y2 u33 x3 = y3 (2.1.10) . but you only have to do this once.7b) (2.1. the LU factorization scheme splits A as follows: u11 u12 u13 1 0 0 a11 a12 a13 1 0 0 u22 u23 . For a 3 × 3 matrix. you should always use LU decomposition if possible.1.8b) (2.8c) (2. The factorization itself is O(N 3 ). When working with the same matrix A however. N.9) 1 2 6 The factorization starts from the identity matrix I 1 A = IA = 0 0 matrix multiplication of the matrix A and the 0 1 0 0 4 3 0 −2 −4 1 1 2 −1 5 . 6 (2.

The factorization begins with the first pivot element, 4. We multiply the pivot by −1/2 to eliminate the first column element in the second row. Similarly, we multiply the pivot by 1/4 to eliminate the first column element in the third row. These multiplicative factors are now part of the first matrix above:

    A = [1 0 0; −1/2 1 0; 1/4 0 1][4 3 −1; 0 −2.5 4.5; 0 1.25 6.25].   (2.1.11)

To eliminate on the third row, we use the next pivot, −2.5. This requires that we multiply by −1/2 in order to eliminate the second column element. Thus we find

    A = [1 0 0; −1/2 1 0; 1/4 −1/2 1][4 3 −1; 0 −2.5 4.5; 0 0 8.5].   (2.1.12)

Thus we find that

    L = [1 0 0; −1/2 1 0; 1/4 −1/2 1]  and  U = [4 3 −1; 0 −2.5 4.5; 0 0 8.5].   (2.1.13)

It is easy to verify by direct substitution that indeed A = LU. Just like Gaussian elimination, the cost of factorization is O(N^3). However, once L and U are known, finding the solution is an O(N^2) operation.

The Permutation Matrix

As will often happen with Gaussian elimination, following the above algorithm will at times result in a zero pivot. This is easily handled in Gaussian elimination by shifting rows in order to find a non-zero pivot. However, in LU decomposition we must keep track of any row shift, since it will affect the right-hand side vector b. We can keep track of row shifts with a row permutation matrix P. To shift rows one and two, for instance, we would use

    P = [0 1 0 ···; 1 0 0 ···; 0 0 1 ···; ⋮].   (2.1.14)

If permutation is necessary, we find

    Ax = b  →  PAx = Pb  →  PLUx = Pb,   (2.1.15)

where PA = LU. MATLAB can supply the permutation matrix associated with the LU decomposition.
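The hand factorization above can be reproduced with a short Python sketch (a generic Doolittle-style routine without pivoting, written for this illustration; it recovers the L and U of (2.1.13)):

```python
def lu_factor(A):
    """Factor A = LU with L unit lower triangular and U upper triangular.
    No row permutations are performed, so pivots are assumed nonzero."""
    n = len(A)
    L = [[float(i == j) for j in range(n)] for i in range(n)]  # start from I
    U = [row[:] for row in A]                                  # copy of A
    for k in range(n):                 # move down the pivots
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]      # multiplier that zeros U[i][k]
            L[i][k] = m                # store the multiplier in L
            for j in range(k, n):
                U[i][j] -= m * U[k][j]
    return L, U

A = [[4, 3, -1], [-2, -4, 5], [1, 2, 6]]
L, U = lu_factor(A)
print(L)   # multipliers -1/2, 1/4, -1/2 below the unit diagonal
print(U)   # upper triangle [4 3 -1; 0 -2.5 4.5; 0 0 8.5]
```

Multiplying L and U back together reproduces A entrywise, confirming (2.1.13).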

MATLAB: A \ b

Given the alternatives for solving the linear system Ax = b, it is important to know how the MATLAB command structure for A \ b works. The following is an outline of the algorithm performed.

1. It first checks to see if A is triangular, or some permutation thereof. If it is, then all that is needed is a simple O(N^2) substitution routine.

2. It then checks if A is symmetric, i.e. Hermitian or self-adjoint. If so, a Cholesky factorization is attempted. If A is positive definite, the Cholesky algorithm is always successful and takes half the run time of LU factorization.

3. It then checks if A is Hessenberg. If so, it can be written as an upper triangular matrix and solved by a substitution routine.

4. If all the above methods fail, then LU factorization is used and the forward and backward substitution routines generate a solution.

5. If A is not square, a QR (Householder) routine is used to solve the system.

6. If A is not square and sparse, a least squares solution using QR factorization is performed.

Note that solving by x = inv(A)*b is the slowest of all methods, taking 2.5 times longer or more than A \ b. It is not recommended. However, once the inverse is known it need not be calculated again. Care must also be taken when det A ≈ 0, i.e. when the matrix is ill-conditioned.

MATLAB commands

The commands for executing the linear system solve are as follows:

• A \ b: Solve the system in the order above.

• [L, U] = lu(A): Generate the L and U matrices.

• [L, U, P] = lu(A): Generate the L and U factorization matrices along with the permutation matrix P.

2.2 Iterative solution methods for Ax=b

In addition to the standard techniques of Gaussian elimination or LU decomposition for solving Ax = b, a wide range of iterative techniques are available. These iterative techniques often go under the name of Krylov space methods [6]. The idea is to start with an initial guess for the solution and develop an iterative procedure that will converge to the solution. The simplest example

of this method is known as a Jacobi iteration scheme. The implementation of this scheme is best illustrated with an example. We consider the linear system

    4x − y + z = 7        (2.2.1a)
    4x − 8y + z = −21     (2.2.1b)
    −2x + y + 5z = 15     (2.2.1c)

We can rewrite each equation as follows:

    x = (7 + y − z)/4          (2.2.2a)
    y = (21 + 4x + z)/8        (2.2.2b)
    z = (15 + 2x − y)/5        (2.2.2c)

To solve the system iteratively, we can define the following Jacobi iteration scheme based on the above:

    x_{k+1} = (7 + y_k − z_k)/4        (2.2.3a)
    y_{k+1} = (21 + 4x_k + z_k)/8      (2.2.3b)
    z_{k+1} = (15 + 2x_k − y_k)/5      (2.2.3c)

An algorithm is then easily implemented computationally. In particular, we would follow the structure:

1. Guess initial values: (x0, y0, z0).

2. Iterate the Jacobi scheme (2.2.3).

3. Check for convergence: ‖x_{k+1} − x_k‖ < tolerance.

Note that the choice of an initial guess is often critical in determining the convergence to the solution. Thus the more that is known about what the solution is supposed to look like, the higher the chance of successful implementation of the iterative scheme. Table 1 shows the convergence of this scheme for this simple example.

Given the success of this example, it is easy to conjecture that such a scheme will always be effective. However, we can reconsider the original system by interchanging the first and last equations. This gives the system

    −2x + y + 5z = 15     (2.2.4a)
    4x − 8y + z = −21     (2.2.4b)
    4x − y + z = 7        (2.2.4c)
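The three-step structure above can be sketched in Python for this system, with the initial guess (1, 2, 2) used in the text (the Gauss-Seidel variant, discussed shortly, is included for comparison of iteration counts):

```python
def jacobi(x, y, z, tol=1e-8, maxit=200):
    """Jacobi iteration (2.2.3): every update uses only the previous iterate."""
    for k in range(maxit):
        xn = (7 + y - z) / 4
        yn = (21 + 4 * x + z) / 8
        zn = (15 + 2 * x - y) / 5
        if max(abs(xn - x), abs(yn - y), abs(zn - z)) < tol:  # convergence check
            return xn, yn, zn, k
        x, y, z = xn, yn, zn
    return x, y, z, maxit

def gauss_seidel(x, y, z, tol=1e-8, maxit=200):
    """Gauss-Seidel variant: each update uses the freshest available values."""
    for k in range(maxit):
        xn = (7 + y - z) / 4
        yn = (21 + 4 * xn + z) / 8     # uses the new x immediately
        zn = (15 + 2 * xn - yn) / 5    # uses the new x and y
        if max(abs(xn - x), abs(yn - y), abs(zn - z)) < tol:
            return xn, yn, zn, k
        x, y, z = xn, yn, zn
    return x, y, z, maxit

xj, yj, zj, kj = jacobi(1.0, 2.0, 2.0)
xg, yg, zg, kg = gauss_seidel(1.0, 2.0, 2.0)
print(xj, yj, zj, kj)   # approaches the true solution (2, 4, 3)
print(xg, yg, zg, kg)   # same solution, fewer iterations
```

Both runs converge because this ordering of the equations is strictly diagonal dominant, a point developed below.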

To solve this system iteratively, we can define the following Jacobi iteration scheme based on this rearranged set of equations:

    x_{k+1} = (y_k + 5z_k − 15)/2      (2.2.5a)
    y_{k+1} = (21 + 4x_k + z_k)/8      (2.2.5b)
    z_{k+1} = y_k − 4x_k + 7           (2.2.5c)

Of course, the solution should be exactly as before. However, Table 2 shows that applying this iteration scheme leads to a set of values which grow to infinity. Thus the iteration scheme quickly fails.

[Table 1: Convergence of the Jacobi iteration scheme (2.2.3) to the solution value (x, y, z) = (2, 4, 3) from the initial guess (x0, y0, z0) = (1, 2, 2); the iterates settle to within roughly 10^−7 of the solution after about 20 iterations.]

[Table 2: Divergence of the Jacobi iteration scheme (2.2.5) from the initial guess (x0, y0, z0) = (1, 2, 2); the iterates grow toward ±∞.]

Strictly Diagonal Dominant

The difference between the two Jacobi schemes above involves whether the iteration procedure is strictly diagonal dominant. We begin with the definition of strict diagonal

dominance. A matrix A is strictly diagonal dominant if, for each row, the sum of the absolute values of the off-diagonal terms is less than the absolute value of the diagonal term:

    |a_kk| > Σ_{j=1, j≠k}^{N} |a_kj|.   (2.2.6)

Strict diagonal dominance has the following consequence: given a strictly diagonal dominant matrix A, then Ax = b has a unique solution x = p, and Jacobi iteration produces a sequence p_k that will converge to p for any initial guess p_0. For the two examples considered here, this property is crucial. For the first example (2.2.1), we have

    A = [4 −1 1; 4 −8 1; −2 1 5]  →  row 1: |4| > |−1| + |1| = 2
                                      row 2: |−8| > |4| + |1| = 5
                                      row 3: |5| > |−2| + |1| = 3,   (2.2.7)

which shows the system to be strictly diagonal dominant and guaranteed to converge. In contrast, the second system (2.2.4) is not strictly diagonal dominant, as can be seen from

    A = [−2 1 5; 4 −8 1; 4 −1 1]  →  row 1: |−2| < |1| + |5| = 6
                                      row 2: |−8| > |4| + |1| = 5
                                      row 3: |1| < |4| + |−1| = 5.   (2.2.8)

Thus this scheme is not guaranteed to converge.

Modifications and Enhancements: Gauss-Seidel

It is sometimes possible to enhance the convergence of a scheme by applying modifications to the basic Jacobi scheme. For instance, the Jacobi scheme given by (2.2.3) can be enhanced by the following modifications:

    x_{k+1} = (7 + y_k − z_k)/4             (2.2.9a)
    y_{k+1} = (21 + 4x_{k+1} + z_k)/8       (2.2.9b)
    z_{k+1} = (15 + 2x_{k+1} − y_{k+1})/5   (2.2.9c)

Here use is made of the supposedly improved value x_{k+1} in the second equation, and of x_{k+1} and y_{k+1} in the third equation. This is known as the Gauss-Seidel scheme. Table 3 shows that the Gauss-Seidel procedure converges to the solution in half the number of iterations used by the Jacobi scheme. However, unlike the Jacobi scheme, the Gauss-Seidel method is not guaranteed to converge even in the case of strict diagonal dominance. Indeed, the Gauss-Seidel modification is only one of a large number of possible changes to the iteration

scheme which can be implemented in an effort to enhance convergence. Included in these iteration schemes are conjugate gradient methods and generalized minimum residual methods, which we will discuss and implement [6]. Krylov space methods [6] are often high-end iterative techniques especially developed for rapid convergence. It is also possible to use several previous iterations to achieve convergence.

[Table 3: Convergence of the Gauss-Seidel iteration scheme (2.2.9) to the solution value (x, y, z) = (2, 4, 3) from the initial guess (x0, y0, z0) = (1, 2, 2) in roughly 10 iterations.]

2.3 Eigenvalues, Eigenvectors, and Solvability

Another class of linear systems of equations which are of fundamental importance are known as eigenvalue problems. Unlike the system Ax = b, which has the single unknown vector x, eigenvalue problems are of the form

    Ax = λx,   (2.3.1)

which has the two unknowns x and λ. The values of λ are known as the eigenvalues and the corresponding vectors x are the eigenvectors.

Eigenvalue problems often arise from differential equations. Specifically, we consider the example of a linear set of coupled differential equations

    dy/dt = Ay.   (2.3.2)

By attempting a solution of the form

    y = x exp(λt),   (2.3.3)

where all the time-dependence is captured in the exponent, the resulting equation for x is

    Ax = λx,   (2.3.4)

which is just the eigenvalue problem. Once the full set of eigenvalues and eigenvectors of this equation is found, the solution of the differential equation is written as

    y = c1 x1 exp(λ1 t) + c2 x2 exp(λ2 t) + ··· + cN xN exp(λN t),   (2.3.5)

3.e. an example is shown. Ix = x.3. Indeed.3.8). Option I: The determinant of the matrix (A−λI) is not zero. It is this condition which allows for the construction of eigenvalues and eigenvectors. it is not relevant as we require nontrivial solutions for x. To illustrate how the eigenvalues and eigenvectors are computed. The solution to the eigenvalue problem (2.3. it is the only scenario which allows for the possibility of x = 0. i. N.8) (2. Thus solving a linear system of diﬀerential equations relies on the solution of an associated eigenvalue problem.8) is then x = (A − λI)−1 0 which implies that x = 0. the matrix is singular and its inverse. Factoring out the vector x then gives the desired result (A − λI)x = 0 . can be found.7) .9) (2. Moving the right hand side to the left side of the equation gives Ax − λIx = 0.AMATH 301 ( c J.3.6) where a multiplication by unity has been performed.10) This trivial solution could have been guessed from (2.12) 1 3 −1 5 (2.3. we choose the eigenvalues λ so that this condition holds and the matrix is singular.11) (2. (A − λI)−1 . Kutz) 23 where N is the number of linearly independent solutions to the eigenvalue problem for the matrix A which is of size N × N . However. (2. Two possibilities now exist.3. (A − λI)−1 . If this is true. we rewrite the eigenvalue problem as Ax = λIx (2. The questions remains: how are the eigenvalues and eigenvectors found? To consider this problem. Option II: The determinant of the matrix (A − λI) is zero. Although there is no longer a guarantee that there is a solution. cannot be found. the matrix is nonsingular and its inverse. Consider the matrix A= This gives the eigenvalue problem A= 1 −1 3 5 x = λx (2.3. If this is true.3.

which when manipulated to the form (A − λI)x = 0 gives

    ([1 3; −1 5] − λ[1 0; 0 1]) x = [1−λ 3; −1 5−λ] x = 0.   (2.3.13)

We now require that the determinant is zero:

    det(A − λI) = (1 − λ)(5 − λ) + 3 = λ² − 6λ + 8 = (λ − 2)(λ − 4) = 0,   (2.3.14)

which gives the two eigenvalues

    λ = 2, 4.   (2.3.15)

The eigenvectors are then found from (2.3.13) as follows. For λ = 2:

    [1−2 3; −1 5−2] x = [−1 3; −1 3] x = 0.   (2.3.16)

Given that x = (x1 x2)^T, this leads to the single equation

    −x1 + 3x2 = 0.   (2.3.17)

This is an underdetermined system of equations; thus we have freedom in choosing one of the values. Choosing x2 = 1 gives x1 = 3 and

    x1 = (3; 1).   (2.3.18)

The second eigenvector comes from (2.3.13) with λ = 4:

    [1−4 3; −1 5−4] x = [−3 3; −1 1] x = 0.   (2.3.19)

Given that x = (x1 x2)^T, this leads to the single equation

    −x1 + x2 = 0.   (2.3.20)

Again this is an underdetermined system, and we have freedom in choosing one of the values. Choosing x2 = 1 gives x1 = 1 and

    x2 = (1; 1).   (2.3.21)

These results can be found from MATLAB by using the eig command. Specifically, the command structure

[V,D]=eig(A)

gives the matrix V containing the eigenvectors as columns and the matrix D whose diagonal elements are the corresponding eigenvalues.
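The hand computation can be verified numerically. The sketch below (plain Python, written for this illustration rather than MATLAB's eig) solves the characteristic polynomial (2.3.14) via the quadratic formula and checks Ax = λx for the eigenvectors found above:

```python
import math

A = [[1, 3], [-1, 5]]

# det(A - lam*I) = lam^2 - (trace A) lam + det A for a 2x2 matrix
tr = A[0][0] + A[1][1]                       # trace = 6
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]  # determinant = 8
disc = math.sqrt(tr * tr - 4 * det)
eigvals = sorted([(tr - disc) / 2, (tr + disc) / 2])
print(eigvals)   # → [2.0, 4.0]

# verify A x = lam x for the eigenvectors (3,1) and (1,1) from the text
for lam, x in [(2.0, [3, 1]), (4.0, [1, 1])]:
    Ax = [A[0][0] * x[0] + A[0][1] * x[1],
          A[1][0] * x[0] + A[1][1] * x[1]]
    assert Ax == [lam * x[0], lam * x[1]]
```

Note that λ² − (tr A)λ + det A is exactly the expansion (1 − λ)(5 − λ) + 3 of (2.3.14).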

Matrix Powers

Another important operation which can be performed with eigenvalues and eigenvectors is the evaluation of

    A^M,   (2.3.22)

where M is a large integer. For large matrices A, this operation is computationally expensive. However, knowing the eigenvalues and eigenvectors of A allows for a significant reduction in computational expense. Assuming we have all the eigenvalues and eigenvectors of A, then

    Ax1 = λ1 x1        (2.3.23a)
    Ax2 = λ2 x2        (2.3.23b)
        ⋮
    Axn = λn xn        (2.3.23c)

This collection of eigenvalues and eigenvectors gives the matrix system

    AS = SΛ,   (2.3.24)

where the columns of the matrix S are the eigenvectors of A,

    S = (x1 x2 ··· xn),   (2.3.25)

and Λ is a matrix whose diagonals are the corresponding eigenvalues:

    Λ = [λ1 0 ··· 0; 0 λ2 ··· 0; ⋮ ⋱; 0 ··· 0 λn].   (2.3.26)

By multiplying (2.3.24) on the right by S^{−1}, the matrix A can then be rewritten as

    A = SΛS^{−1}.   (2.3.27)

The final observation comes from

    A² = (SΛS^{−1})(SΛS^{−1}) = SΛ²S^{−1}.

This then generalizes to

    A^M = SΛ^M S^{−1},   (2.3.28)

where the matrix Λ^M is easily calculated as

    Λ^M = [λ1^M 0 ··· 0; 0 λ2^M ··· 0; ⋮ ⋱; 0 ··· 0 λn^M].   (2.3.29)


Since raising the diagonal terms to the Mth power is easily accomplished, the matrix A^M can then be easily calculated by multiplying the three matrices in (2.3.28).
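For the 2×2 example above, with S built from the eigenvectors (3, 1) and (1, 1) and Λ = diag(2, 4), the identity A^M = SΛ^M S^{−1} can be checked directly in Python (a hand-coded 2×2 sketch for illustration; S^{−1} is entered explicitly since det S = 2):

```python
def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 3], [-1, 5]]
S = [[3, 1], [1, 1]]               # eigenvectors of A as columns
Sinv = [[0.5, -0.5], [-0.5, 1.5]]  # inverse of S (det S = 2)
M = 5
LamM = [[2**M, 0], [0, 4**M]]      # Lambda^M: diagonals raised to the Mth power

# A^M via diagonalization: S * Lambda^M * S^(-1)
A_M = matmul(matmul(S, LamM), Sinv)

# A^M by repeated multiplication, for comparison
P = [[1, 0], [0, 1]]
for _ in range(M):
    P = matmul(P, A)

print(A_M)
print(P)   # the two results agree entry by entry
```

For large M the diagonalized route replaces M matrix products with two, plus scalar powers of the eigenvalues.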

**Solvability and the Fredholm-Alternative Theorem**

It is easy to ask under what conditions the system

    Ax = b   (2.3.30)

can be solved. Aside from requiring det A ≠ 0 for a unique solution, we also have a solvability condition on b when A is singular. Consider the adjoint problem

    A†y = 0,   (2.3.31)

where A† = A*ᵀ is the adjoint, i.e. the transpose and complex conjugate of the matrix A. The definition of the adjoint is such that

    y · Ax = A†y · x.   (2.3.32)

Since Ax = b, the left side of the equation reduces to y · b, while the right side reduces to 0 since A†y = 0. This then gives the condition

    y · b = 0,   (2.3.33)

which is known as the Fredholm-Alternative theorem, or a solvability condition. In words, the equation states that in order for the system Ax = b to be solvable, the right-hand side forcing b must be orthogonal to the null-space of the adjoint operator A†.
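A small numerical illustration of the solvability condition (the singular matrix and vectors here are hypothetical, chosen only for this sketch): for the real symmetric matrix A = [1 2; 2 4], the adjoint null-space is spanned by y = (2, −1), so a right-hand side b is admissible only if y · b = 0:

```python
# a singular (det = 0) real symmetric matrix, so its adjoint is itself
A = [[1, 2], [2, 4]]
y = [2, -1]   # spans the null-space of the adjoint: A'y = 0

# check that y really annihilates the adjoint (columns dotted with y)
assert A[0][0] * y[0] + A[1][0] * y[1] == 0
assert A[0][1] * y[0] + A[1][1] * y[1] == 0

def solvable(b):
    """Fredholm condition: Ax = b can be solvable only if y . b = 0."""
    return y[0] * b[0] + y[1] * b[1] == 0

print(solvable([1, 2]))   # → True  (for instance x = (1, 0) works)
print(solvable([1, 1]))   # → False (no solution exists)
```

The admissible b = (1, 2) is a multiple of the columns of A, exactly as the theorem predicts.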

3  Curve Fitting

Analyzing data is fundamental to any aspect of science. Often data is noisy in nature and only the trends in the data are sought. A variety of curve fitting schemes can be generated to provide simplified descriptions of the data and its behavior. The least-squares fit method is explored, along with polynomial fits and splines.

3.1  Least-Square Fitting Methods

One of the fundamental tools for data analysis and for recognizing trends in physical systems is curve fitting. The concept of curve fitting is fairly simple: use a simple function to describe a trend by minimizing the error between the selected fitting function and the data. The mathematical aspects of this are laid out in this section.

Suppose we are given a set of n data points

    (x1, y1), (x2, y2), (x3, y3), ··· , (xn, yn) .           (3.1.1)

Further, assume that we would like to fit a best-fit line through these points. Then we can approximate the line by the function

    f(x) = Ax + B                                            (3.1.2)

where the constants A and B are chosen to minimize some error. Thus the function gives an approximation to the true value which is off by some error so that

    f(x_k) = y_k + E_k                                       (3.1.3)

where y_k is the true value and E_k is the error from the true value. Various error measurements can be minimized when approximating with a given function f(x). Three standard possibilities are given as follows:

    I.   Maximum Error:    E∞(f) = max_{1≤k≤n} |f(x_k) − y_k|                    (3.1.4a)
    II.  Average Error:    E1(f) = (1/n) Σ_{k=1}^{n} |f(x_k) − y_k|              (3.1.4b)
    III. Root-mean Square: E2(f) = [ (1/n) Σ_{k=1}^{n} |f(x_k) − y_k|² ]^(1/2) . (3.1.4c)

In practice, the root-mean square error is most widely used and accepted. Thus when ﬁtting a curve to a set of data, the root-mean square error is chosen to be minimized. This is called a least-squares ﬁt.
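A minimal Python sketch (with an invented data set and candidate line) computing the three error measures:

```python
import math

# Illustrative data and a candidate line f(x) = A*x + B
# (values invented for this example).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 1.1, 1.9, 3.2]
A, B = 1.0, 0.0

resid = [abs(A * x + B - y) for x, y in zip(xs, ys)]
n = len(xs)

E_inf = max(resid)                                   # maximum error
E_1   = sum(resid) / n                               # average error
E_2   = math.sqrt(sum(r * r for r in resid) / n)     # root-mean-square error

print(E_inf, E_1, E_2)
```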

**Least-Squares Line**

To apply the least-square fit criteria, consider the data points (x_k, y_k), where k = 1, 2, 3, ··· , n. To fit the curve f(x) = Ax + B to this data, the error E2 is found by minimizing the sum

    E2(f) = Σ_{k=1}^{n} |f(x_k) − y_k|² = Σ_{k=1}^{n} (A x_k + B − y_k)² .   (3.1.6)

Minimizing this sum requires differentiation. Specifically, the constants A and B are chosen so that a minimum occurs; thus we require ∂E2/∂A = 0 and ∂E2/∂B = 0. This gives:

    ∂E2/∂A = 0:   Σ_{k=1}^{n} 2(A x_k + B − y_k) x_k = 0     (3.1.7a)
    ∂E2/∂B = 0:   Σ_{k=1}^{n} 2(A x_k + B − y_k) = 0 .       (3.1.7b)

Upon rearranging, a 2 × 2 system of equations is found for A and B:

    [ Σ x_k²   Σ x_k ] [ A ]   [ Σ x_k y_k ]
    [ Σ x_k      n   ] [ B ] = [ Σ y_k     ]                 (3.1.8)

where all sums run over k = 1 to n. This equation can be easily solved using the backslash command in MATLAB. The method can be easily generalized to higher-degree polynomial fits. In particular, a parabolic fit to a set of data requires the fitting function

    f(x) = Ax² + Bx + C                                      (3.1.9)

where now the three constants A, B, and C must be found. These can be found from the 3 × 3 system which results from minimizing the error E2(A, B, C) by requiring

    ∂E2/∂A = 0                                               (3.1.10a)
    ∂E2/∂B = 0                                               (3.1.10b)
    ∂E2/∂C = 0 .                                             (3.1.10c)
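A Python sketch (not from the notes; data invented) of solving the 2 × 2 system (3.1.8) for the least-squares line, here by Cramer's rule rather than the backslash command:

```python
# Sketch: the normal equations [Sxx Sx; Sx n][A; B] = [Sxy; Sy]
# for the least-squares line f(x) = A*x + B.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

n   = len(xs)
Sx  = sum(xs)
Sxx = sum(x * x for x in xs)
Sy  = sum(ys)
Sxy = sum(x * y for x, y in zip(xs, ys))

# Cramer's rule on the 2x2 system.
det = Sxx * n - Sx * Sx
A = (Sxy * n - Sx * Sy) / det
B = (Sxx * Sy - Sx * Sxy) / det

print(A, B)  # slope and intercept of the best-fit line
```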

**Data Linearization**

Although a powerful method, the minimization procedure can result in equations which are nontrivial to solve. Specifically, consider fitting data to the exponential function

    f(x) = C exp(Ax) .                                       (3.1.11)

The error to be minimized is

    E2(A, C) = Σ_{k=1}^{n} (C exp(A x_k) − y_k)² .           (3.1.12)

Applying the minimizing conditions leads to

    ∂E2/∂A = 0:   Σ_{k=1}^{n} 2(C exp(A x_k) − y_k) C x_k exp(A x_k) = 0   (3.1.13a)
    ∂E2/∂C = 0:   Σ_{k=1}^{n} 2(C exp(A x_k) − y_k) exp(A x_k) = 0 .       (3.1.13b)
This in turn leads to the 2 × 2 system

    C Σ_{k=1}^{n} x_k exp(2A x_k) − Σ_{k=1}^{n} x_k y_k exp(A x_k) = 0     (3.1.14a)
    C Σ_{k=1}^{n} exp(2A x_k) − Σ_{k=1}^{n} y_k exp(A x_k) = 0 .           (3.1.14b)

This system of equations is nonlinear and cannot be solved in a straightforward fashion. Indeed, a solution may not even exist, or many solutions may exist. To avoid the difficulty of solving this nonlinear system, the exponential fit can be linearized by the transformation

    X = x                                                    (3.1.15a)
    Y = ln(y)                                                (3.1.15b)
    B = ln C .                                               (3.1.15c)

Then the fit function f(x) = y = C exp(Ax) can be linearized by taking the natural log of both sides, so that

    ln y = ln(C exp(Ax)) = ln C + ln(exp(Ax)) = B + Ax  →  Y = AX + B .    (3.1.16)

So by fitting to the natural log of the y-data,

    (x_i, y_i) → (x_i, ln y_i) = (X_i, Y_i) ,                (3.1.17)

the curve fit for the exponential function becomes a linear fitting problem which is easily handled.

**General Fitting**

Given the preceding examples, a theory can be developed for a general fitting procedure. The key idea is to assume a form of the fitting function

    f(x) = f(x, C1, C2, C3, ··· , CM)                        (3.1.19)

where the C_i are constants used to minimize the error and M < n. The root-mean square error is then

    E2(C1, C2, C3, ··· , CM) = Σ_{k=1}^{n} (f(x_k, C1, C2, C3, ··· , CM) − y_k)²   (3.1.20)

which can be minimized by considering the M × M system generated from

    ∂E2/∂C_j = 0 ,    j = 1, 2, ··· , M .                    (3.1.21)
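The linearization can be sketched in Python (data generated from an exact exponential for illustration), reusing the normal equations for the line Y = AX + B:

```python
import math

# Sketch: fitting f(x) = C*exp(A*x) by the transformation
# Y = ln(y) = A*x + ln(C), then solving the 2x2 normal equations
# for the line; the data is an exact exponential with C=2, A=0.8.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(0.8 * x) for x in xs]

Xs = xs
Ys = [math.log(y) for y in ys]     # fit to the log of the y-data

n   = len(Xs)
Sx  = sum(Xs)
Sxx = sum(x * x for x in Xs)
Sy  = sum(Ys)
Sxy = sum(x * y for x, y in zip(Xs, Ys))

det = Sxx * n - Sx * Sx
A = (Sxy * n - Sx * Sy) / det      # growth rate
B = (Sxx * Sy - Sx * Sxy) / det    # B = ln(C)
C = math.exp(B)

print(A, C)  # recovers A = 0.8, C = 2 up to round-off
```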

In general, this gives the nonlinear set of equations

    Σ_{k=1}^{n} (f(x_k, C1, C2, C3, ··· , CM) − y_k) ∂f/∂C_j = 0 ,   j = 1, 2, ··· , M .   (3.1.22)

Solving this set of equations can be quite difficult. Most attempts at solving nonlinear systems are based upon iterative schemes which require good initial guesses to converge to the solution. Regardless, the general fitting procedure is straightforward and allows for the construction of a best-fit curve to match the data.

3.2  Polynomial Fits and Splines

One of the primary reasons for generating data fits from polynomials, splines, or least-square methods is to interpolate or extrapolate data values. In practice, when considering only a finite number of data points

    (x0, y0), (x1, y1), ··· , (xn, yn) ,

the value of the curve at points other than the x_i is unknown. Interpolation uses the data points to predict values of y(x) at locations where x ≠ x_i and x ∈ [x0, xn]. Extrapolation is similar, but it predicts values of y(x) for x > xn or x < x0, i.e. outside the range of the data points. Interpolation and extrapolation are easy to do given a least-squares fit: once the fitting curve is found, it can be evaluated for any value of x, thus giving an interpolated or extrapolated value.

Polynomial fitting is another method for obtaining these values. With polynomial fitting, a polynomial is chosen to go through all data points. For the n + 1 data points given above, an nth-degree polynomial is chosen:

    p_n(x) = a_n xⁿ + a_{n−1} x^(n−1) + ··· + a_1 x + a_0    (3.2.1)

where the coefficients a_j are chosen so that the polynomial passes through each data point. Thus we have the resulting system

    (x0, y0):   y0 = a_n x0ⁿ + a_{n−1} x0^(n−1) + ··· + a_1 x0 + a_0
    (x1, y1):   y1 = a_n x1ⁿ + a_{n−1} x1^(n−1) + ··· + a_1 x1 + a_0
        :
    (xn, yn):   yn = a_n xnⁿ + a_{n−1} xn^(n−1) + ··· + a_1 xn + a_0 .

This system of equations is nothing more than a linear system Ax = b, which can be solved by using the backslash command in MATLAB. Although this polynomial fit method will generate a curve which passes through all the data, it is an expensive computation since we have to first set up the system and then perform an O(n³) operation to generate the coefficients a_j. A more direct method for generating the relevant polynomial is the method of Lagrange polynomials.

Consider first the idea of constructing a line between the two points (x0, y0) and (x1, y1). The general form for a line is y = mx + b, which gives

    y = y0 + (y1 − y0) (x − x0)/(x1 − x0) .                  (3.2.2)

Although valid, it is hard to continue to generalize this technique to fitting higher-order polynomials through a larger number of points. Lagrange developed a method which expresses the line through the two points as

    p1(x) = y0 (x − x1)/(x0 − x1) + y1 (x − x0)/(x1 − x0) ,  (3.2.3)

which can be easily verified to work. In a more compact and general way, this first-degree polynomial can be expressed as

    p1(x) = Σ_{k=0}^{1} y_k L_{1,k}(x) = y0 L_{1,0}(x) + y1 L_{1,1}(x)   (3.2.4)

where the Lagrange coefficients are given by

    L_{1,0}(x) = (x − x1)/(x0 − x1)                          (3.2.5a)
    L_{1,1}(x) = (x − x0)/(x1 − x0) .                        (3.2.5b)

The power of this method is that it can be easily generalized to the n + 1 points of our original data set. In particular, we fit an nth-degree polynomial through the given data set of the form

    p_n(x) = Σ_{k=0}^{n} y_k L_{n,k}(x)                      (3.2.6)

where the Lagrange coefficient is

    L_{n,k}(x) = [ (x − x0)(x − x1) ··· (x − x_{k−1})(x − x_{k+1}) ··· (x − x_n) ]
               / [ (x_k − x0)(x_k − x1) ··· (x_k − x_{k−1})(x_k − x_{k+1}) ··· (x_k − x_n) ] .   (3.2.7)

Thus there is no need to solve a linear system to generate the desired polynomial. This is the preferred method for generating a polynomial fit to a given set of data, and it is the core algorithm employed by most commercial packages such as MATLAB for polynomial fitting.
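A Python sketch (not from the notes) of evaluating (3.2.6)–(3.2.7) directly; the data samples a parabola for illustration:

```python
# Sketch: evaluating the Lagrange form p_n(x) = sum_k y_k L_{n,k}(x)
# directly, without solving a linear system.

def lagrange(xs, ys, x):
    """Evaluate the interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for k, (xk, yk) in enumerate(zip(xs, ys)):
        Lk = 1.0
        for j, xj in enumerate(xs):
            if j != k:
                Lk *= (x - xj) / (xk - xj)   # Lagrange coefficient L_{n,k}
        total += yk * Lk
    return total

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.0, 5.0, 10.0]   # samples of y = x^2 + 1

print(lagrange(xs, ys, 1.5))  # interpolate between data points
```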

**Splines**

Although MATLAB makes it a trivial matter to fit polynomials or least-square curves through data, a fundamental problem can arise as a result of a polynomial fit: polynomial wiggle. Polynomial wiggle is generated by the fact that an nth-degree polynomial has, in general, n − 1 turning points from up to down or vice-versa. One way to overcome this is to use a piecewise polynomial interpolation scheme. Essentially, this simply draws a line between neighboring data points and uses this line to give interpolated values. Therefore, when considering interpolating values between the points (x0, y0) and (xn, yn), there will be n linear functions, each valid only between two neighboring points. This technique is rather simple-minded, but it does alleviate the problem generated by polynomial wiggle. However, the interpolating function is now only a piecewise function, and the data generated by a piecewise linear fit can be rather crude and tends to be choppy in appearance.

Splines provide a better way to represent data by constructing cubic functions between points so that the first and second derivatives are continuous at the data points. This gives a smooth-looking function without polynomial wiggle problems. The basic assumption of the spline method is to construct a cubic function between data points:

    S_k(x) = S_{k,0} + S_{k,1}(x − x_k) + S_{k,2}(x − x_k)² + S_{k,3}(x − x_k)³   (3.2.8)

where x ∈ [x_k, x_{k+1}] and the coefficients S_{k,j} are to be determined from various constraint conditions. Four constraint conditions are imposed:

    S_k(x_k)       = y_k                                     (3.2.9a)
    S_k(x_{k+1})   = S_{k+1}(x_{k+1})                        (3.2.9b)
    S′_k(x_{k+1})  = S′_{k+1}(x_{k+1})                       (3.2.9c)
    S″_k(x_{k+1})  = S″_{k+1}(x_{k+1}) .                     (3.2.9d)

This allows for a smooth fit to the data, since the four constraints correspond to fitting the data, continuity of the function, continuity of the first derivative, and continuity of the second derivative, respectively. To solve for these quantities, a large system of equations Ax = b is constructed. The number of equations and unknowns must first be counted. For each of the n intervals, there are 4 parameters, which gives a total of 4n unknowns. The constraint conditions give the following counts:
    • S_k(x_k) = y_k     → Solution fit:  n + 1 equations
    • S_k = S_{k+1}      → Continuity:    n − 1 equations
    • S′_k = S′_{k+1}    → Smoothness:    n − 1 equations
    • S″_k = S″_{k+1}    → Smoothness:    n − 1 equations

This gives a total of 4n − 2 equations.
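As an illustrative sketch (not from the notes), the two missing equations can be supplied by the "natural" end conditions S″(x0) = S″(xn) = 0. The implementation below solves the resulting tridiagonal system for the second derivatives M_i at the nodes and then evaluates the piecewise cubic; the data is invented.

```python
# Sketch: a natural cubic spline.  The interior rows enforce
# continuity of S' and S''; the first and last rows are the two
# extra conditions M_0 = M_n = 0.

def natural_spline(xs, ys):
    n = len(xs) - 1                      # number of intervals
    h = [xs[i + 1] - xs[i] for i in range(n)]
    a = [0.0] * (n + 1)   # sub-diagonal
    b = [1.0] * (n + 1)   # diagonal (rows 0 and n give M_0 = M_n = 0)
    c = [0.0] * (n + 1)   # super-diagonal
    d = [0.0] * (n + 1)   # right-hand side
    for i in range(1, n):
        a[i] = h[i - 1]
        b[i] = 2.0 * (h[i - 1] + h[i])
        c[i] = h[i]
        d[i] = 6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
    # Thomas algorithm: forward sweep, then back substitution.
    for i in range(1, n + 1):
        w = a[i] / b[i - 1]
        b[i] -= w * c[i - 1]
        d[i] -= w * d[i - 1]
    M = [0.0] * (n + 1)
    M[n] = d[n] / b[n]
    for i in range(n - 1, -1, -1):
        M[i] = (d[i] - c[i] * M[i + 1]) / b[i]

    def S(x):
        i = next(j for j in range(n) if x <= xs[j + 1])
        t0, t1 = xs[i + 1] - x, x - xs[i]
        return (M[i] * t0 ** 3 + M[i + 1] * t1 ** 3) / (6 * h[i]) \
            + (ys[i] / h[i] - M[i] * h[i] / 6) * t0 \
            + (ys[i + 1] / h[i] - M[i + 1] * h[i] / 6) * t1
    return S

S = natural_spline([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 0.0, 2.0])
print(S(0.5), S(1.0), S(2.5))   # smooth interpolant through the data
```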

Thus two extra constraint conditions must be placed on the system to achieve a solution. There is a large variety of options for assumptions which can be made at the edges of the spline. It is usually a good idea to use the default unless the application involved requires a specific form. The spline problem is then reduced to a simple solution of an Ax = b problem, which can be solved with the backslash command.

As a final remark on splines, they are also commonly used for smoothing data before differentiating. This will be considered further in upcoming lectures. Additionally, splines are heavily used in computer graphics and animation. The primary reasons are their relative ease of calculation and their smoothness properties. There is an entire spline toolbox available for MATLAB, which attests to the importance of this technique for such applications.

3.3  Data Fitting with MATLAB

This section will discuss the practical implementation of the curve fitting schemes presented in the preceding two sections. The schemes to be explored are least-square fits, polynomial fits, line interpolation, and spline interpolation. Further, a non-polynomial least-square fit will be considered, which results in a nonlinear system of equations. This nonlinear system requires additional insight into the problem and sophistication in its solution technique.

To begin the data fit process, we first import a relevant data set into the MATLAB environment. To do so, the load command is used. The file linefit.dat is a collection of x and y data values put into a two-column format separated by spaces:

linefit.dat

    (two columns of x and y values)

This data is just an example of what you may want to import into MATLAB. The command structure to read this data is as follows.

    load linefit.dat
    x=linefit(:,1);
    y=linefit(:,2);
    figure(1), plot(x,y,'O:')

After reading in the data, the two vectors x and y are created from the first and second columns of the data respectively. The code will also generate a plot of the data in figure 1 of MATLAB. It is this set of data which will be explored with line fitting techniques.

**Least-Squares Fitting**

The least-squares fit technique is considered first. The polyfit and polyval commands are essential to this method. Specifically, the polyfit command is used to generate the n + 1 coefficients a_j of the nth-degree polynomial

    p_n(x) = a_n xⁿ + a_{n−1} x^(n−1) + ··· + a_1 x + a_0    (3.3.1)

used for fitting the data. The basic structure requires that the vectors x and y be submitted to polyfit along with the desired degree of polynomial fit n. To fit a line (n = 1) through the data, use the command

    pcoeff=polyfit(x,y,1);

The output of this function call is a vector pcoeff which includes the coefficients a_1 and a_0 of the line fit p_1(x) = a_1 x + a_0. To evaluate and plot this line, values of x must be chosen. For this example, the line will be plotted for x ∈ [0, 7] in steps of ∆x = 0.1:

    xp=0:0.1:7;
    yp=polyval(pcoeff,xp);
    figure(2), plot(x,y,'O',xp,yp,'m')

The polyval command uses the coefficients generated from polyfit to generate the y-values of the polynomial fit at the desired values of x given by xp. Figure 2 in MATLAB depicts both the data and the best line fit in the least-square sense. To fit a parabolic profile through the data, a second-degree polynomial is used. This is generated with

    pcoeff2=polyfit(x,y,2);
    yp2=polyval(pcoeff2,xp);
    figure(3), plot(x,y,'O',xp,yp2,'m')

Here the vector yp2 contains the parabolic fit to the data evaluated at the x-values xp. These results are plotted in MATLAB figure 3. To find the least-square error, the quantity

    E2(f) = [ (1/n) Σ_{k=1}^{n} |f(x_k) − y_k|² ]^(1/2)      (3.3.2)

is calculated. For the parabolic fit considered in the last example, the polynomial fit must be evaluated at the x-values for the given data linefit.dat, and the sum of the squares of the differences between the parabolic fit and the actual data must be evaluated. The error is then calculated as follows:

    yp3=polyval(pcoeff2,x);
    E2=sqrt( sum( ( abs(yp3-y) ).^2 )/n )

This is a quick and easy calculation which allows for an evaluation of the fitting procedure. In general, the error will continue to drop as the degree of polynomial is increased; this is because every extra degree of freedom allows for a better least-squares fit to the data.

**Interpolation**

In addition to least-square fitting, interpolation techniques can be developed which go through all the given data points. The error in this case is zero, but each interpolation scheme must be evaluated for its accurate representation of the data. The first interpolation scheme is a polynomial fit to the data. Given n + 1 points, an nth-degree polynomial is chosen. The polyfit command is again used for the calculation:

    n=length(x)-1;
    pcoeffn=polyfit(x,y,n);
    ypn=polyval(pcoeffn,xp);
    figure(4), plot(x,y,'O',xp,ypn,'m')

The MATLAB script will produce an nth-degree polynomial through the data points. But as always is the danger with polynomial interpolation, polynomial wiggle can dominate the behavior. This is indeed the case, as illustrated in figure 4. The strong oscillatory phenomena at the edges is a common feature of this type of interpolation. In contrast to a polynomial fit, a piecewise linear fit gives a simple-minded connect-the-dot interpolation to the data. The interp1 command gives the piecewise linear fit algorithm:

    yint=interp1(x,y,xp);
    figure(5), plot(x,y,'O',xp,yint,'m')
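A Python sketch (not from the notes) of the connect-the-dot interpolation underlying interp1's default behavior; the data is invented:

```python
# Sketch: piecewise linear interpolation on sorted nodes.

def interp_linear(xs, ys, x):
    """Connect-the-dots interpolation for sorted xs."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])   # rise-over-run weight
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x outside the data range")

xs = [0.0, 1.0, 2.0, 4.0]
ys = [0.0, 2.0, 3.0, 3.0]

print(interp_linear(xs, ys, 0.5))   # halfway between 0.0 and 2.0
print(interp_linear(xs, ys, 3.0))   # on the flat segment
```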

The linear interpolation is illustrated in figure 5 of MATLAB. There are a few options available with the interp1 command, including the nearest option and the spline option. The nearest option gives the nearest value of the data to the interpolated value, while the spline option gives a cubic spline fit interpolation to the data. The two options can be compared with the default linear option:

    yint=interp1(x,y,xp);
    yint2=interp1(x,y,xp,'nearest');
    yint3=interp1(x,y,xp,'spline');
    figure(5), plot(x,y,'O',xp,yint,'m',xp,yint2,'k',xp,yint3,'r')

Note that the spline option is equivalent to using the spline algorithm supplied by MATLAB. The spline command is used by giving the x and y data along with a vector xp for which we desire to generate corresponding y-values:

    yspline=spline(x,y,xp);
    figure(6), plot(x,y,'O',xp,yspline,'k')

The generated spline is depicted in figure 6. Note that the data is smooth, as expected from the enforcement of continuous and smooth derivatives. This is the same as that obtained by using interp1 with the spline option. Thus a smooth fit can be achieved with either spline or interp1.

**Nonlinear Least-Square Fitting**

To consider more sophisticated least-square fitting routines, consider the following data set, which looks like it could be nicely fitted with a Gaussian profile:

gaussfit.dat

    (two columns of x and y values)

To fit the data to a Gaussian, we indeed assume a Gaussian function of the form

    f(x) = A exp(−Bx²) .                                     (3.3.3)

Following the procedure to minimize the least-square error leads to a set of nonlinear equations for the coefficients A and B. In general, solving a nonlinear set of equations can be a difficult task: a solution is not guaranteed to exist, and in fact there may be many solutions to the problem. Thus the nonlinear problem should be handled with care. To generate a solution, the least-square error must be minimized. In particular, the sum

    E2 = Σ_{k=0}^{n} |f(x_k) − y_k|²                         (3.3.4)

must be minimized. Specifically, the minimum of

    E2 = Σ_{k=0}^{n} |A exp(−B x_k²) − y_k|²                 (3.3.5)

must be found with respect to A and B. To minimize (3.3.5) with respect to the variables A and B, the command fmins in MATLAB is used; it minimizes a function of several variables. The fmins algorithm requires a function call to the quantity to be minimized, along with an initial guess for the parameters which are being used for minimization:

    coeff=fmins('gafit',[1 1]);

This command structure uses as initial guesses A = 1 and B = 1 when calling the file gafit.m. After minimizing, it returns the vector coeff, which contains the appropriate values of A and B which minimize the least-square error. The function called is then constructed as follows:

gafit.m

    function E=gafit(x0)
    load gaussfit.dat
    x=gaussfit(:,1);
    y=gaussfit(:,2);
    E=sum( ( x0(1)*exp(-x0(2)*x.^2)-y ).^2 );
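Since fmins is MATLAB-specific, here is a hedged Python stand-in: the same least-square error E2(A, B), minimized by a crude shrinking-grid search rather than fmins' algorithm. The data is an exact Gaussian so the answer is known.

```python
import math

# Sketch: minimizing E2(A, B) = sum (A*exp(-B*x^2) - y)^2 by
# repeatedly searching a shrinking grid around the current best
# point, starting from the initial guess A = 1, B = 1.
xs = [-2.0 + 0.5 * i for i in range(9)]          # x = -2, -1.5, ..., 2
ys = [1.5 * math.exp(-0.7 * x * x) for x in xs]  # true A = 1.5, B = 0.7

def E2(A, B):
    return sum((A * math.exp(-B * x * x) - y) ** 2 for x, y in zip(xs, ys))

A, B, step = 1.0, 1.0, 0.5
for _ in range(60):
    candidates = [(A + i * step, B + j * step)
                  for i in (-1, 0, 1) for j in (-1, 0, 1)]
    A, B = min(candidates, key=lambda p: E2(p[0], p[1]))
    step *= 0.8                                   # refine the search

print(A, B)  # converges near A = 1.5, B = 0.7
```

As with fmins, the quality of the result depends strongly on the initial guess; a poor starting point can land the search in the wrong place.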

The vector x0 accepts the initial guesses for A and B and is updated as A and B are modified through an iteration procedure to converge to the least-square values. Note that the sum is simply a statement of the quantity to be minimized, namely (3.3.5). The results of this calculation can be illustrated with MATLAB figure 7:

    a=coeff(1);
    b=coeff(2);
    xga=-3:0.1:3;
    yga=a*exp(-b*xga.^2);
    figure(7), plot(x2,y2,'O',xga,yga,'m')

Note that for this case, the initial guess is extremely important. For any given problem where this technique is used, an educated guess for the values of parameters like A and B can determine if the technique will work at all. The results should also be checked carefully, since there is no guarantee that a minimization near the desired fit can be found.

4  Numerical Differentiation and Integration

Differentiation and integration form the backbone of the mathematical techniques required to describe and analyze physical systems. These two mathematical concepts describe how certain quantities of interest change with respect to either space or time or both. Understanding how to evaluate these quantities numerically is essential to understanding systems beyond the scope of analytic methods.

4.1  Numerical Differentiation

Given a set of data or a function, it may be useful to differentiate the quantity considered in order to determine a physically relevant property. For instance, given a set of data which represents the position of a particle as a function of time, the derivative and second derivative give the velocity and acceleration respectively. From calculus, the definition of the derivative is given by

    df(t)/dt = lim_{∆t→0} [ f(t + ∆t) − f(t) ] / ∆t .        (4.1.1)

Since the derivative is the slope, the formula on the right is nothing more than a rise-over-run formula for the slope. The general idea of calculus is that as ∆t → 0, the rise-over-run gives the instantaneous slope. Numerically, this means that if we take ∆t sufficiently small, then the approximation should be fairly accurate. To quantify and control the error associated with approximating the derivative, we make use of Taylor series expansions.
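Before developing the Taylor-series machinery, the point can be previewed numerically in Python: the centered rise-over-run is far more accurate than the one-sided one (f(t) = sin t is used for illustration, and is not from the notes):

```python
import math

# Sketch: comparing the one-sided slope [f(t+dt) - f(t)]/dt of
# (4.1.1) with the centered slope [f(t+dt) - f(t-dt)]/(2*dt)
# on f(t) = sin(t), whose derivative is cos(t).

t, dt = 1.0, 0.01
exact = math.cos(t)

fwd = (math.sin(t + dt) - math.sin(t)) / dt
ctr = (math.sin(t + dt) - math.sin(t - dt)) / (2 * dt)

print(abs(fwd - exact))   # error ~ (dt/2)   * |f''|
print(abs(ctr - exact))   # error ~ (dt^2/6) * |f'''|, much smaller
```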

To see how the Taylor expansions are useful, consider the following two Taylor series:

    f(t + ∆t) = f(t) + ∆t f′(t) + (∆t²/2!) f″(t) + (∆t³/3!) f‴(c1)     (4.1.2a)
    f(t − ∆t) = f(t) − ∆t f′(t) + (∆t²/2!) f″(t) − (∆t³/3!) f‴(c2)     (4.1.2b)

where c_i ∈ [a, b]. Subtracting these two expressions gives

    f(t + ∆t) − f(t − ∆t) = 2∆t f′(t) + (∆t³/3!) [ f‴(c1) + f‴(c2) ] . (4.1.3)

By using the mean-value theorem of calculus, we find f‴(c) = [f‴(c1) + f‴(c2)]/2. Upon dividing the above expression by 2∆t and rearranging, we find the following expression for the first derivative:

    df(t)/dt = [ f(t + ∆t) − f(t − ∆t) ] / 2∆t − (∆t²/6) f‴(c)         (4.1.4)

where the last term is the truncation error associated with the approximation of the first derivative using this particular Taylor series generated expression. Note that the truncation error in this case is O(∆t²).

We could improve on this by continuing our Taylor expansion and truncating it at higher orders in ∆t. This would lead to higher-accuracy schemes. Specifically, by truncating at O(∆t⁵), we would have

    f(t + ∆t) = f(t) + ∆t f′(t) + (∆t²/2!) f″(t) + (∆t³/3!) f‴(t)
                + (∆t⁴/4!) f⁗(t) + (∆t⁵/5!) f⁽⁵⁾(c1)                   (4.1.5a)
    f(t − ∆t) = f(t) − ∆t f′(t) + (∆t²/2!) f″(t) − (∆t³/3!) f‴(t)
                + (∆t⁴/4!) f⁗(t) − (∆t⁵/5!) f⁽⁵⁾(c2)                   (4.1.5b)

where c_i ∈ [a, b]. Again subtracting these two expressions gives

    f(t + ∆t) − f(t − ∆t) = 2∆t f′(t) + (2∆t³/3!) f‴(t)
                + (∆t⁵/5!) [ f⁽⁵⁾(c1) + f⁽⁵⁾(c2) ] .                    (4.1.6)

In this approximation, there is a third-derivative term left over which needs to be removed. By using two additional points to approximate the derivative, this term can be removed. Thus we use the two additional points f(t + 2∆t) and f(t − 2∆t). Upon replacing ∆t by 2∆t in (4.1.6), we find

    f(t + 2∆t) − f(t − 2∆t) = 4∆t f′(t) + (16∆t³/3!) f‴(t)
                + (32∆t⁵/5!) [ f⁽⁵⁾(c3) + f⁽⁵⁾(c4) ] .                  (4.1.7)

By multiplying (4.1.6) by eight, subtracting (4.1.7), and using the mean-value theorem on the truncation terms twice, we find the expression

    df(t)/dt = [ −f(t + 2∆t) + 8f(t + ∆t) − 8f(t − ∆t) + f(t − 2∆t) ] / 12∆t
               + (∆t⁴/30) f⁽⁵⁾(c)                                      (4.1.8)

where f⁽⁵⁾ is the fifth derivative and the truncation is of O(∆t⁴).

Approximating higher derivatives works in a similar fashion. By starting with the pair of equations (4.1.2) and adding, this gives the result

    f(t + ∆t) + f(t − ∆t) = 2f(t) + ∆t² f″(t)
                + (∆t⁴/4!) [ f⁗(c1) + f⁗(c2) ] .                        (4.1.9)

By rearranging and solving for the second derivative, the O(∆t²) accurate expression is derived:

    d²f(t)/dt² = [ f(t + ∆t) − 2f(t) + f(t − ∆t) ] / ∆t² + O(∆t²)      (4.1.10)

where the truncation error is of O(∆t²) and is found again by the mean-value theorem to be −(∆t²/12) f⁗(c). This process can be continued to find any arbitrary derivative; thus we could also approximate the third, fourth, and higher derivatives using this technique.

It is also possible to generate backward and forward difference schemes by using points only behind or in front of the current point respectively. Tables 4–6 summarize the second-order and fourth-order central difference schemes, along with the forward- and backward-difference formulas which are accurate to second order.

    O(∆t²) center-difference schemes
    f′(t) = [ f(t + ∆t) − f(t − ∆t) ] / 2∆t
    f″(t) = [ f(t + ∆t) − 2f(t) + f(t − ∆t) ] / ∆t²
    f‴(t) = [ f(t + 2∆t) − 2f(t + ∆t) + 2f(t − ∆t) − f(t − 2∆t) ] / 2∆t³
    f⁗(t) = [ f(t + 2∆t) − 4f(t + ∆t) + 6f(t) − 4f(t − ∆t) + f(t − 2∆t) ] / ∆t⁴

    Table 4: Second-order accurate center-difference formulas.

A final remark is in order concerning these differentiation schemes. The central difference schemes are an excellent method for generating the values of the derivative at the interior points of a data set. However, at the end points, forward and backward difference methods must be used since they do not have

neighboring points to the left and right, respectively. Thus special care must be taken at the end points of any computational domain.

    O(∆t⁴) center-difference schemes
    f′(t) = [ −f(t + 2∆t) + 8f(t + ∆t) − 8f(t − ∆t) + f(t − 2∆t) ] / 12∆t
    f″(t) = [ −f(t + 2∆t) + 16f(t + ∆t) − 30f(t) + 16f(t − ∆t) − f(t − 2∆t) ] / 12∆t²
    f‴(t) = [ −f(t + 3∆t) + 8f(t + 2∆t) − 13f(t + ∆t) + 13f(t − ∆t) − 8f(t − 2∆t) + f(t − 3∆t) ] / 8∆t³
    f⁗(t) = [ −f(t + 3∆t) + 12f(t + 2∆t) − 39f(t + ∆t) + 56f(t) − 39f(t − ∆t) + 12f(t − 2∆t) − f(t − 3∆t) ] / 6∆t⁴

    Table 5: Fourth-order accurate center-difference formulas.

    O(∆t²) forward- and backward-difference schemes
    f′(t) = [ −3f(t) + 4f(t + ∆t) − f(t + 2∆t) ] / 2∆t
    f′(t) = [ 3f(t) − 4f(t − ∆t) + f(t − 2∆t) ] / 2∆t
    f″(t) = [ 2f(t) − 5f(t + ∆t) + 4f(t + 2∆t) − f(t + 3∆t) ] / ∆t²
    f″(t) = [ 2f(t) − 5f(t − ∆t) + 4f(t − 2∆t) − f(t − 3∆t) ] / ∆t²

    Table 6: Second-order accurate forward- and backward-difference formulas.

**Round-off and optimal step-size**

An unavoidable consequence of working with numerical computations is round-off error. It may be tempting to deduce from the difference formulas that as ∆t → 0, the accuracy of these computational methods only improves. However, this line of reasoning completely neglects the second source of error in evaluating derivatives: numerical round-off. When working with most computations, double precision numbers are used. This allows for 16-digit accuracy in the representation of a given number. This round-off has significant impact upon numerical computations

and the issue of time-stepping.

As an example of the impact of round-off, we consider the approximation to the derivative

    dy/dt ≈ [ y(t + ∆t) − y(t) ] / ∆t + ε(y(t), ∆t)          (4.1.11)

where ε(y(t), ∆t) measures the truncation error. Upon evaluating this expression in the computer, round-off error occurs so that

    y(t) = Y(t) + e(t)                                       (4.1.12)

where Y(t) is the approximated value given by the computer and e(t) measures the error from the true value y(t). Thus the combined error between the round-off and truncation gives the following expression for the derivative:

    dy/dt = [ y(t + ∆t) − y(t) ] / ∆t + E(y(t), ∆t)          (4.1.13)

where the total error, E, is the combination of round-off and truncation such that

    E = E_round + E_trunc = [ e(t + ∆t) − e(t) ] / ∆t − (∆t²/2) d²y(c)/dt² .   (4.1.14)

We now determine the maximum size of the error. In particular, we can bound the maximum value of the round-off and the derivative to be

    |e(t + ∆t)| ≤ e_r                                        (4.1.15a)
    |−e(t)| ≤ e_r                                            (4.1.15b)
    M = max_{c∈[t_n, t_{n+1}]} { |d²y(c)/dt²| } .            (4.1.15c)

This then gives the maximum error to be

    |E| ≤ e_r/∆t + e_r/∆t + (∆t²/2) M = 2e_r/∆t + (∆t²/2) M .   (4.1.16)

Note that as ∆t gets large, the error grows quadratically due to the truncation error. However, as ∆t decreases to zero, the error is dominated by round-off, which grows like 1/∆t. To minimize the error, we require that ∂|E|/∂(∆t) = 0. Calculating this derivative gives

    ∂|E|/∂(∆t) = −2e_r/∆t² + M ∆t = 0                        (4.1.17)

so that

    ∆t = ( 2e_r / M )^(1/3) .                                (4.1.18)
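The trade-off can be observed directly in Python (using the one-sided difference and f(t) = sin t for illustration; neither is from the notes):

```python
import math

# Sketch: the round-off/truncation trade-off for the one-sided
# difference [f(t+dt) - f(t)]/dt.  Shrinking dt does not keep
# improving the answer forever.

def fwd_diff(f, t, dt):
    return (f(t + dt) - f(t)) / dt

t = 1.0
exact = math.cos(t)
errs = {}
for dt in (1e-1, 1e-5, 1e-8, 1e-13):
    errs[dt] = abs(fwd_diff(math.sin, t, dt) - exact)
    print(dt, errs[dt])
# The error decreases down to an optimal dt and then round-off,
# which behaves like e_r/dt, takes over for very small dt.
```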

This gives the step size resulting in a minimum error. Thus the smallest step-size is not necessarily the most accurate; rather, a balance between round-off error and truncation error is achieved to obtain the optimal step-size. For e_r ≈ 10⁻¹⁶, the optimal ∆t ≈ 10⁻⁵. Below this value of ∆t, numerical round-off begins to dominate the error. This shows that the error can be quickly dominated by numerical round-off if one is not careful to take this significant effect into account.

A similar procedure can be carried out for evaluating the optimal step size associated with the O(∆t⁴) accurate scheme for the first derivative. In this case

    dy/dt = [ −y(t + 2∆t) + 8y(t + ∆t) − 8y(t − ∆t) + y(t − 2∆t) ] / 12∆t + E(y(t), ∆t)   (4.1.19)

where the total error, E, is the combination of round-off and truncation such that

    E = [ −e(t + 2∆t) + 8e(t + ∆t) − 8e(t − ∆t) + e(t − 2∆t) ] / 12∆t + (∆t⁴/30) d⁵y(c)/dt⁵ .   (4.1.20)

We now determine the maximum size of the error. In particular, we can bound the maximum value of round-off to e_r as before and set M = max{ |y⁽⁵⁾(c)| }. This then gives the maximum error to be

    |E| ≤ 3e_r/2∆t + (∆t⁴/30) M .                            (4.1.21)

Note that as ∆t gets large, the error grows like a quartic due to the truncation error. However, as ∆t decreases to zero, the error is again dominated by round-off, which grows like 1/∆t. To minimize the error, we require that ∂|E|/∂(∆t) = 0. Calculating this derivative gives

    ∆t = ( 45 e_r / 4M )^(1/5) .                             (4.1.22)

Thus in this case, for e_r ≈ 10⁻¹⁶, the optimal step ∆t ≈ 10⁻³.

4.2  Numerical Integration

Numerical integration simply calculates the area under a given curve. The basic ideas for performing such an operation come from the definition of integration

    ∫_a^b f(x) dx = lim_{h→0} Σ_{j=0}^{N} f(x_j) h           (4.2.1)

where b − a = Nh. Thus the area under the curve, from the calculus standpoint, is thought of as a limiting process of summing up an ever-increasing number

of rectangles. Specifically, any sum can be represented as follows:

    Q[f] = Σ_{j=0}^{N} w_j f(x_j) = w_0 f(x_0) + w_1 f(x_1) + ··· + w_N f(x_N)   (4.2.2)

where a = x_0 < x_1 < x_2 < ··· < x_N = b. Thus the integral is evaluated as

    ∫_a^b f(x) dx = Q[f] + E[f]                              (4.2.3)

where the term E[f] is the error in approximating the integral by the quadrature sum (4.2.2). Typically, the error E[f] is due to truncation error. This process is known as numerical quadrature.

**Newton-Cotes Formulas**

The following integration approximations result from using a polynomial fit through the data to be integrated. To integrate, use will be made of polynomial fits to the y-values f(x_j). Thus we assume the function f(x) can be approximated by a polynomial

    P_n(x) = a_n xⁿ + a_{n−1} x^(n−1) + ··· + a_1 x + a_0    (4.2.4)

where the truncation error in this case is proportional to the (n + 1)th derivative, E[f] = A f^(n+1)(c), with A a constant. This process of fitting a polynomial to the data gives the Newton-Cotes formulas. It is assumed that

    x_k = x_0 + hk ,    f_k = f(x_k) .                       (4.2.5)

This gives the following integration algorithms:

    Trapezoid Rule:     ∫_{x0}^{x1} f(x) dx = (h/2)(f_0 + f_1) − (h³/12) f″(c)                         (4.2.6a)
    Simpson's Rule:     ∫_{x0}^{x2} f(x) dx = (h/3)(f_0 + 4f_1 + f_2) − (h⁵/90) f⁗(c)                  (4.2.6b)
    Simpson's 3/8 Rule: ∫_{x0}^{x3} f(x) dx = (3h/8)(f_0 + 3f_1 + 3f_2 + f_3) − (3h⁵/80) f⁗(c)         (4.2.6c)
    Boole's Rule:       ∫_{x0}^{x4} f(x) dx = (2h/45)(7f_0 + 32f_1 + 12f_2 + 32f_3 + 7f_4) − (8h⁷/945) f⁽⁶⁾(c) .   (4.2.6d)

These algorithms have varying degrees of accuracy. Specifically, they are O(h²), O(h⁴), O(h⁴), and O(h⁶) accurate schemes, respectively. The accuracy condition is determined from the truncation terms of the polynomial fit. Note that the trapezoid rule uses a sum of simple trapezoids to approximate the integral.
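A Python sketch (not from the notes) of the single-panel trapezoid and Simpson rules, checked on an integral known in closed form:

```python
# Sketch: the trapezoid and Simpson rules on a single panel.

def trapezoid(f, x0, h):
    """(h/2)(f0 + f1) over [x0, x0 + h]."""
    return h / 2.0 * (f(x0) + f(x0 + h))

def simpson(f, x0, h):
    """(h/3)(f0 + 4 f1 + f2) over [x0, x0 + 2h]."""
    return h / 3.0 * (f(x0) + 4.0 * f(x0 + h) + f(x0 + 2 * h))

f = lambda x: x ** 3
exact = 0.25                 # integral of x^3 over [0, 1]

print(trapezoid(f, 0.0, 1.0))   # crude: the O(h^2) rule on one panel
print(simpson(f, 0.0, 0.5))     # Simpson is exact for cubics
                                # (up to round-off), as its h^5 f''''
                                # truncation term suggests
```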

Simpson's rule fits a quadratic curve through three points and calculates the area under the quadratic curve. Simpson's 3/8 rule uses four points and a cubic polynomial to evaluate the area, while Boole's rule uses five points and a quartic polynomial fit to generate an evaluation of the integral.

The derivation of these integration rules follows from simple polynomial fits through a specified number of data points. To derive Simpson's rule, consider a second degree polynomial through the three points (x_0, f_0), (x_1, f_1), and (x_2, f_2):

p_2(x) = f_0 (x−x_1)(x−x_2)/[(x_0−x_1)(x_0−x_2)] + f_1 (x−x_0)(x−x_2)/[(x_1−x_0)(x_1−x_2)] + f_2 (x−x_0)(x−x_1)/[(x_2−x_0)(x_2−x_1)] .   (4.2.7)

This quadratic fit is derived by using Lagrange coefficients. Plugging (4.2.7) into the integral gives

∫_{x_0}^{x_2} f(x) dx ≈ ∫_{x_0}^{x_2} p_2(x) dx = (h/3)(f_0 + 4f_1 + f_2) .   (4.2.8)

The integral calculation is easily performed since it only involves integrating powers of x^2 or less, and evaluating at the limits then causes many terms to cancel and drop out. Thus Simpson's rule is recovered. To make the connection with the quadrature rule (4.2.2), Q = w_0 f_0 + w_1 f_1 + w_2 f_2, Simpson's rule gives w_0 = h/3, w_1 = 4h/3, and w_2 = h/3 as weighting factors. The truncation error could also be included, but we neglect it for the present purposes. The Trapezoid rule, Simpson's 3/8 rule, and Boole's rule are all derived in a similar fashion.

Composite Rules

The integration methods (4.2.6) give values for the integrals over only a small part of the integration domain. The trapezoid rule, for instance, only gives a value for x ∈ [x_0, x_1]. However, our fundamental aim is to evaluate the integral over the entire domain x ∈ [a, b]. Assuming once again that our interval is divided as a = x_0 < x_1 < x_2 < ··· < x_N = b, the trapezoid rule applied over the interval gives the total integral

∫_a^b f(x) dx ≈ Q[f] = Σ_{j=0}^{N−1} (h/2)(f_j + f_{j+1}) .   (4.2.9)

Writing out this sum gives

Σ_{j=0}^{N−1} (h/2)(f_j + f_{j+1}) = (h/2)(f_0 + f_1) + (h/2)(f_1 + f_2) + ··· + (h/2)(f_{N−1} + f_N)
  = (h/2)(f_0 + 2f_1 + 2f_2 + ··· + 2f_{N−1} + f_N)
  = (h/2)( f_0 + f_N + 2 Σ_{j=1}^{N−1} f_j ) .   (4.2.10)

The final expression no longer double counts the values of the points between f_0 and f_N. Instead, the final sum only counts the intermediate values once, thus making the algorithm about twice as fast as the previous sum expression. These are computational savings which should always be exploited if possible.

Recursive Improvement of Accuracy

Given an integration procedure and a value of h, a function or data set can be integrated to a prescribed accuracy. However, it may be desirable to improve the accuracy without having to disregard previous approximations to the integral. To see how this might work, consider the trapezoidal rule for a step size of 2h. In this case, the even data points are the only ones of interest and we have the basic one-step integral

∫_{x_0}^{x_2} f(x) dx = (2h/2)(f_0 + f_2) = h(f_0 + f_2) .   (4.2.11)

The composite rule associated with this is then

∫_a^b f(x) dx ≈ Q[f] = Σ_{j=0}^{N/2−1} h(f_{2j} + f_{2j+2}) .   (4.2.12)

Writing out this sum gives

Σ_{j=0}^{N/2−1} h(f_{2j} + f_{2j+2}) = h(f_0 + f_2) + h(f_2 + f_4) + ··· + h(f_{N−2} + f_N)
  = h(f_0 + 2f_2 + 2f_4 + ··· + 2f_{N−2} + f_N)
  = h( f_0 + f_N + 2 Σ_{j=1}^{N/2−1} f_{2j} ) .   (4.2.13)

Comparing the middle expressions in (4.2.10) and (4.2.13) gives a great deal of insight into how recursive schemes for improving accuracy work. Specifically, we note that the more accurate scheme with step size h contains all the terms in the integral approximation using step size 2h. Quantifying this gives

Q_h = (1/2) Q_{2h} + h(f_1 + f_3 + ··· + f_{N−1})   (4.2.14)

where Q_h and Q_{2h} are the quadrature approximations to the integral with step size h and 2h respectively. This then allows us to halve the value of h and improve accuracy without jettisoning the work required to approximate the solution with the accuracy given by a step size of 2h.
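The halving relation Q_h = Q_{2h}/2 + h(f_1 + f_3 + ··· + f_{N−1}) is easy to verify numerically. The following NumPy sketch (mine, not from the notes) computes a composite trapezoid approximation with step 2h and refines it to step h by adding only the new odd-indexed points:

```python
import numpy as np

def trap(f, a, b, n):
    # composite trapezoid rule with n subintervals of width h = (b - a)/n
    x = np.linspace(a, b, n + 1)
    fx = f(x)
    h = (b - a) / n
    return h * (fx[0] / 2 + fx[1:-1].sum() + fx[-1] / 2)

f = np.exp
a, b, n = 0.0, 1.0, 8

Q2h = trap(f, a, b, n)                     # coarse approximation, step 2h
h = (b - a) / (2 * n)
odd = a + h * np.arange(1, 2 * n, 2)       # the newly added points f1, f3, ...
Qh_recursive = Q2h / 2 + h * f(odd).sum()  # refine without redoing old work
Qh_direct = trap(f, a, b, 2 * n)           # same quantity computed from scratch

print(abs(Qh_recursive - Qh_direct))       # agree to round-off
```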

This recursive procedure can be continued so as to give higher accuracy results. Further, this type of procedure holds for Simpson's rule as well as any of the integration schemes developed here. This recursive routine is used in MATLAB in order to generate results to a prescribed accuracy.

4.3 Implementation of Differentiation and Integration

This lecture focuses on the implementation of differentiation and integration methods. Since they form the backbone of calculus, accurate methods to approximate these calculations are essential.

Differentiation

To begin the calculations, we consider a specific function to be differentiated, namely a hyperbolic secant. Specifically, we consider

u = sech(x)   (4.3.1)

whose derivative is

du/dx = −sech(x) tanh(x) .   (4.3.2)

Part of the reason for considering this function is that the exact value of its derivative is known. This allows us to make a comparison of our approximate methods with the actual solution. To begin, we define a spatial interval; for this example we take x ∈ [−10, 10]. Further, the spatial discretization ∆x must also be defined. This gives

  dx=0.1;       % spatial discretization
  x=-10:dx:10;  % spatial domain

Once the spatial domain has been defined, the function to be differentiated and its exact derivative are evaluated (note the element-wise multiplication .*):

  u=sech(x);
  ux_exact=-sech(x).*tanh(x);

  figure(1)
  plot(x,u,x,ux_exact)

MATLAB figure 1 shows the function and its derivative. To calculate the derivative numerically, we use the center-, forward-, and backward-difference formulas derived for differentiation. Specifically, we will make use of the following four first-derivative approximations:

center-difference O(h^2):  (y_{n+1} − y_{n−1})/(2h)   (4.3.3a)
center-difference O(h^4):  (−y_{n+2} + 8y_{n+1} − 8y_{n−1} + y_{n−2})/(12h)   (4.3.3b)

forward-difference O(h^2):   (−3y_n + 4y_{n+1} − y_{n+2})/(2h)   (4.3.3c)
backward-difference O(h^2):  (3y_n − 4y_{n−1} + y_{n−2})/(2h) .   (4.3.3d)

Here we have used h = ∆x and y_n = y(x_n).

Second-order accurate derivative

To calculate the second-order accurate derivative, we use the center-difference scheme (4.3.3a) in the interior of the domain. However, at the left and right boundaries there are no left and right neighboring points respectively with which to calculate the derivative. Thus we require the use of the forward- and backward-difference schemes (4.3.3c) and (4.3.3d). This gives the basic algorithm

  n=length(x);
  % 2nd order accurate
  ux(1)=(-3*u(1)+4*u(2)-u(3))/(2*dx);
  for j=2:n-1
    ux(j)=(u(j+1)-u(j-1))/(2*dx);
  end
  ux(n)=(3*u(n)-4*u(n-1)+u(n-2))/(2*dx);

The values of ux(1) and ux(n) are evaluated with the forward- and backward-difference schemes respectively.

Fourth-order accurate derivative

A higher degree of accuracy can be achieved by using a fourth-order scheme. A fourth-order center-difference scheme such as the second equation of (4.3.3) relies on two neighboring points on each side of a given point. Thus the first two and last two points of the computational domain must be handled separately, since they lack the needed neighbors. Here we use a second-order scheme at those boundary points and the fourth-order scheme in the interior. This gives the algorithm

  % 4th order accurate
  ux2(1)=(-3*u(1)+4*u(2)-u(3))/(2*dx);
  ux2(2)=(-3*u(2)+4*u(3)-u(4))/(2*dx);
  for j=3:n-2
    ux2(j)=(-u(j+2)+8*u(j+1)-8*u(j-1)+u(j-2))/(12*dx);
  end
  ux2(n-1)=(3*u(n-1)-4*u(n-2)+u(n-3))/(2*dx);
  ux2(n)=(3*u(n)-4*u(n-1)+u(n-2))/(2*dx);
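The same two algorithms translate directly into other array languages. This NumPy sketch (not from the notes, which use MATLAB throughout) vectorizes the loops with slicing and checks that the fourth-order scheme is markedly more accurate:

```python
import numpy as np

x = np.linspace(-10, 10, 201)          # dx = 0.1
dx = x[1] - x[0]
u = 1 / np.cosh(x)                     # sech(x)
ux_exact = -np.tanh(x) / np.cosh(x)    # -sech(x) tanh(x)

# second-order accurate: center differences inside, one-sided at the ends
ux = np.empty_like(u)
ux[1:-1] = (u[2:] - u[:-2]) / (2 * dx)
ux[0] = (-3 * u[0] + 4 * u[1] - u[2]) / (2 * dx)
ux[-1] = (3 * u[-1] - 4 * u[-2] + u[-3]) / (2 * dx)

# fourth-order accurate interior with second-order one-sided points at the ends
ux2 = np.empty_like(u)
ux2[2:-2] = (-u[4:] + 8 * u[3:-1] - 8 * u[1:-3] + u[:-4]) / (12 * dx)
ux2[0] = (-3 * u[0] + 4 * u[1] - u[2]) / (2 * dx)
ux2[1] = (-3 * u[1] + 4 * u[2] - u[3]) / (2 * dx)
ux2[-2] = (3 * u[-2] - 4 * u[-3] + u[-4]) / (2 * dx)
ux2[-1] = (3 * u[-1] - 4 * u[-2] + u[-3]) / (2 * dx)

err2 = np.max(np.abs(ux - ux_exact))
err4 = np.max(np.abs(ux2 - ux_exact))
print(err2, err4)   # fourth-order error is far smaller
```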

For the dx = 0.1 considered here, the second-order and fourth-order schemes result in errors of 10^−2 and 10^−4 respectively. To view the accuracy, the results are plotted together with the analytic solution for the derivative:

  figure(2)
  plot(x,ux_exact,'o',x,ux,'c',x,ux2,'m')

To the naked eye, these results all look to be exactly on the exact solution. However, by zooming in on a particular point, it quickly becomes clear that the errors for the two schemes are indeed 10^−2 and 10^−4 respectively. The failure of accuracy can also be observed by taking large values of dx. For instance, dx = 0.5 and dx = 1 illustrate how the derivatives fail to be properly determined with large dx values.

As a final note, if you are given a set of data to differentiate, the data will tend to be highly inaccurate and choppy, and differentiating it directly amplifies that noise. It is always recommended that you first run a spline through the data with an appropriately small dx and then differentiate the spline. This will help to give smooth differentiated data.

Integration

There are a large number of integration routines built into MATLAB. So unlike the differentiation routines presented here, we will simply make use of the built-in MATLAB functions. The most straightforward integration routine is the trapezoidal rule. Given a set of data, the trapz command can be used to implement this integration rule. For the x and u data defined previously for the hyperbolic secant, the command structure gives

  int_sech=trapz(x,u.^2)

where we have integrated sech^2(x). The value of this integral is exactly two, and the trapz command gives a value that is within 10^−7 of the true value. To generate the cumulative values of the integral, the cumtrapz command is used:

  int_sech2=cumtrapz(x,u.^2);

  figure(3)
  plot(x,int_sech2)

MATLAB figure 3 gives the value of the integral as a function of x. Alternatively, a function can be specified for integration over a given domain with the quad command. The quad command is implemented by specifying a function and the range of integration. It gives a value of the integral using a recursive Simpson's rule that is accurate to within 10^−6. The command structure for this evaluation is as follows:

  int_quad=quad(inline('sech(x).^2'),-10,10)
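Outside MATLAB, the trapz/cumtrapz behavior can be imitated with a short NumPy sketch (my construction, not from the notes); since the exact antiderivative of sech^2(x) is tanh(x), the check is easy:

```python
import numpy as np

x = np.linspace(-10, 10, 201)
y = 1 / np.cosh(x)                 # sech(x)

dx = x[1] - x[0]
panels = dx * (y[:-1] ** 2 + y[1:] ** 2) / 2          # one trapezoid per interval
int_total = panels.sum()                              # analogue of trapz(x,u.^2)
int_cum = np.concatenate(([0.0], np.cumsum(panels)))  # analogue of cumtrapz

exact_cum = np.tanh(x) - np.tanh(-10)                 # exact running integral
print(abs(int_total - 2.0), np.max(np.abs(int_cum - exact_cum)))
```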

Here the inline command allows us to circumvent a traditional function call; this command executes the integration of sech^2(x) over x ∈ [−10, 10]. The quad command can, however, also be used with a traditional function call. Note that no version of the quad command exists that produces cumulative integration values; however, this can be easily handled in a for loop. Double and triple integrals over two-dimensional rectangles and three-dimensional boxes can be performed with the dblquad and triplequad commands.

5 Differential Equations

Our ultimate goal is to solve very general nonlinear partial differential equations of elliptic, hyperbolic, parabolic, and mixed type. By understanding the basic ideas for computationally solving initial and boundary value problems for differential equations, we can solve more complicated partial differential equations. The development of numerical solution techniques for initial and boundary value problems originates from the simple concept of the Taylor expansion. Thus the building blocks for scientific computing are rooted in concepts from freshman calculus. Implementation, however, often requires ingenuity, insight, and clever application of the basic principles. In some sense, our numerical solution techniques reverse our understanding of calculus: whereas calculus teaches us to take a limit in order to define a derivative or integral, in numerical computations we take the derivative or integral of the governing equation and go backwards to define it as the difference.

5.1 Initial value problems: Euler, Runge-Kutta and Adams methods

The solutions of general partial differential equations rely heavily on the techniques developed for ordinary differential equations. Thus we begin by considering systems of differential equations of the form

dy/dt = f(y, t)   (5.1.1)

where y represents a vector and the initial conditions are given by

y(0) = y_0   (5.1.2)

with t ∈ [0, T]. Although very simple in appearance, this equation cannot be solved analytically in general. Of course, there are certain cases for which the problem can be solved analytically, but it will generally be important to rely on numerical solutions for insight. For an overview of analytic techniques, see Boyce and DiPrima [1].

The simplest algorithm for solving this system of differential equations is known as the Euler method. The Euler method is derived by making use of the definition of the derivative:

dy/dt = lim_{∆t→0} ∆y/∆t .   (5.1.3)

Thus over a time span ∆t = t_{n+1} − t_n we can approximate the original differential equation by

dy/dt = f(y, t)  ⇒  (y_{n+1} − y_n)/∆t ≈ f(y_n, t_n) .   (5.1.4)

The approximation can easily be rearranged to give

y_{n+1} = y_n + ∆t · f(y_n, t_n) .   (5.1.5)

Thus the Euler method gives an iterative scheme by which the future values of the solution can be determined. Generally, the algorithm structure is of the form

y(t_{n+1}) = F(y(t_n))   (5.1.6)

where F(y(t_n)) = y(t_n) + ∆t · f(y(t_n), t_n). The graphical representation of this iterative process is illustrated in Fig. 1, where the slope (derivative) of the function is responsible for generating each subsequent approximation to the solution y(t). Note that the Euler method is exact only as the step size decreases to zero: ∆t → 0.

The Euler method can be generalized to the following iterative scheme:

y_{n+1} = y_n + ∆t · φ   (5.1.7)

where the function φ is chosen to reduce the error over a single time step ∆t and y_n = y(t_n). The function φ is no longer constrained, as in the Euler scheme, to make use of the derivative at the left end point of the computational step. Rather, the derivative at the mid-point of the time-step and at the right end of the time-step may also be used to possibly improve accuracy. In particular, by generalizing to include the slope at the left and right ends of the time-step ∆t, we can generate an iteration scheme of the following form:

y(t+∆t) = y(t) + ∆t [ A·f(t, y(t)) + B·f(t + P∆t, y(t) + Q∆t·f(t, y(t))) ]   (5.1.8)

where A, B, P and Q are arbitrary constants. Upon Taylor expanding the last term, we find

f(t + P∆t, y(t) + Q∆t·f(t, y(t))) = f(t, y(t)) + P∆t·f_t(t, y(t)) + Q∆t·f_y(t, y(t))·f(t, y(t)) + O(∆t^2)   (5.1.9)

where f_t and f_y denote differentiation with respect to t and y respectively, and O(∆t^2) denotes all terms that are of size ∆t^2 and smaller.
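The Euler iteration (5.1.5) and its behavior under step-size refinement are easy to see in a few lines of Python (a sketch for illustration; the test equation dy/dt = −y is my choice, not from the notes):

```python
import numpy as np

def euler(f, y0, t0, T, dt):
    # y_{n+1} = y_n + dt * f(y_n, t_n)
    y, t = y0, t0
    for _ in range(int(round((T - t0) / dt))):
        y = y + dt * f(y, t)
        t += dt
    return y

f = lambda y, t: -y                     # exact solution: y(T) = y0 * exp(-T)
err_coarse = abs(euler(f, 1.0, 0.0, 1.0, 0.1) - np.exp(-1))
err_fine = abs(euler(f, 1.0, 0.0, 1.0, 0.05) - np.exp(-1))
print(err_coarse / err_fine)            # close to 2: halving dt halves the error
```

The ratio near 2 is the signature of a first-order (global error O(∆t)) scheme.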

Figure 1: Graphical description of the iteration process used in the Euler method. Note that each subsequent approximation is generated from the slope of the previous point. This graphical illustration suggests that smaller steps ∆t should be more accurate.

Plugging this last result into the original iteration scheme (5.1.8) results in the following:

y(t+∆t) = y(t) + ∆t(A+B)·f(t, y(t)) + PB∆t^2·f_t(t, y(t)) + BQ∆t^2·f_y(t, y(t))·f(t, y(t)) + O(∆t^3)   (5.1.10)

which is valid up to O(∆t^2). To proceed further, we simply note that the Taylor expansion for y(t+∆t) gives:

y(t+∆t) = y(t) + ∆t·f(t, y(t)) + (∆t^2/2)·f_t(t, y(t)) + (∆t^2/2)·f_y(t, y(t))·f(t, y(t)) + O(∆t^3) ,   (5.1.11)

where use has been made of (5.1.1). Comparing this Taylor expansion with (5.1.10) gives the following relations:

A + B = 1   (5.1.12a)
PB = 1/2   (5.1.12b)
BQ = 1/2   (5.1.12c)

which yields three equations for the four unknowns A, B, P and Q. Thus one degree of freedom is granted, and a wide variety of schemes can be implemented. Generally speaking, these methods for iterating forward in time given a single initial point are known as Runge-Kutta methods. By generalizing the assumption (5.1.8), we can construct stepping schemes which have arbitrary accuracy. Of course, the level of algebraic difficulty in deriving these higher accuracy schemes also increases significantly.

Two of the more commonly used schemes are known as Heun's method and the Modified Euler-Cauchy method (second order Runge-Kutta). These schemes assume A = 1/2 and A = 0 respectively, and are given by:

y(t+∆t) = y(t) + (∆t/2) [ f(t, y(t)) + f(t+∆t, y(t) + ∆t·f(t, y(t))) ]   (5.1.13a)
y(t+∆t) = y(t) + ∆t · f( t + ∆t/2, y(t) + (∆t/2)·f(t, y(t)) ) .   (5.1.13b)

4th-order Runge-Kutta

Perhaps the most popular general stepping scheme used in practice is known as the 4th order Runge-Kutta method. The scheme is as follows:

y_{n+1} = y_n + (∆t/6) [ f_1 + 2f_2 + 2f_3 + f_4 ]   (5.1.14)

where

f_1 = f(t_n, y_n)   (5.1.15a)
f_2 = f(t_n + ∆t/2, y_n + (∆t/2)·f_1)   (5.1.15b)
f_3 = f(t_n + ∆t/2, y_n + (∆t/2)·f_2)   (5.1.15c)
f_4 = f(t_n + ∆t, y_n + ∆t·f_3) .   (5.1.15d)

The term "4th order" refers to the fact that the Taylor series local truncation error is pushed to O(∆t^5). The cumulative (global) error is then O(∆t^4), so that for t ∼ O(1) the error is O(∆t^4); this is responsible for the scheme name of "4th order". The key to this method, as well as any of the other Runge-Kutta schemes, is the use of intermediate time-steps to improve accuracy. For the 4th order scheme presented here, a graphical representation of this derivative sampling at intermediate time-steps is shown in Fig. 2.

Figure 2: Graphical description of the initial, intermediate, and final slopes used in the 4th order Runge-Kutta iteration scheme over a time step ∆t.
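A minimal implementation of (5.1.14)-(5.1.15) in Python (a sketch; the test problem is my choice) confirms the fourth-order global error — halving ∆t shrinks the error by roughly 2^4 = 16:

```python
import numpy as np

def rk4_solve(f, y0, T, dt):
    # classical 4th order Runge-Kutta applied over [0, T]
    y, t = y0, 0.0
    for _ in range(int(round(T / dt))):
        f1 = f(t, y)
        f2 = f(t + dt / 2, y + dt / 2 * f1)
        f3 = f(t + dt / 2, y + dt / 2 * f2)
        f4 = f(t + dt, y + dt * f3)
        y = y + dt / 6 * (f1 + 2 * f2 + 2 * f3 + f4)
        t += dt
    return y

f = lambda t, y: -y                     # exact solution: exp(-T)
err_coarse = abs(rk4_solve(f, 1.0, 1.0, 0.1) - np.exp(-1))
err_fine = abs(rk4_solve(f, 1.0, 1.0, 0.05) - np.exp(-1))
print(err_coarse / err_fine)            # close to 16
```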

Adams methods: multi-stepping techniques

The development of the Runge-Kutta schemes relies on the definition of the derivative and Taylor expansions. Another approach to solving (5.1.1) is to start with the fundamental theorem of calculus [2]. Thus the differential equation can be integrated over a time-step ∆t to give

dy/dt = f(t, y)  ⇒  y(t+∆t) − y(t) = ∫_t^{t+∆t} f(t, y) dt .   (5.1.16)

And once again using our iteration notation we find

y_{n+1} = y_n + ∫_{t_n}^{t_{n+1}} f(t, y) dt .   (5.1.17)

This iteration relation is simply a restatement of (5.1.7) with ∆t·φ = ∫_{t_n}^{t_{n+1}} f(t, y) dt. However, at this point, no approximations have been made and (5.1.17) is exact. The numerical solution will be found by approximating f(t, y) ≈ p(t, y) where p(t, y) is a polynomial. Thus the iteration scheme in this instance will be given by

y_{n+1} ≈ y_n + ∫_{t_n}^{t_{n+1}} p(t, y) dt .   (5.1.18)

It only remains to determine the form of the polynomial to be used in the approximation. As with the Runge-Kutta schemes, the order of accuracy is determined by the choice of φ; here, this relates directly to the choice of the polynomial approximation p(t, y). The Adams-Bashforth suite of computational methods uses the current point and a determined number of past points to approximate the polynomial p(t, y) in (5.1.18) and thus evaluate the future solution. A first-order scheme can easily be constructed by allowing

p_1(t) = constant = f(t_n, y_n) ,   (5.1.19)

where the present point and no past points are used to determine the value of the polynomial.
1.25) Alternatively. we ﬁnd the following 2nd order Adams-Bashforth scheme yn+1 = yn + ∆t [3f (tn .1. we could assume that the polynomial used both the future point and the current point so that a second-order scheme resulted.21) When inserted into (5. ∆t (5. this linear polynomial yields tn+1 yn+1 = yn + tn fn−1 + fn − fn−1 (t − tn ) dt .1. This technique can be easily generalized to include more past points and thus higher accuracy. the present. The linear polynomial which passes through these two points is given by p2 (t) = fn + fn+1 − fn (t − tn ) . as accuracy is increased.1. yn+1 ) .1. yn+1 ) . Inserting this ﬁrst-order approximation into (5.18). The Adams-Bashforth scheme uses current and past points to approximate the polymial p(t. yn ) . y) in (5.1.1.18) results in the backward Euler scheme yn+1 = yn + ∆t · f (tn+1 .AMATH 301 ( c J.22) Upon integration and evaluation at the upper and lower limits.26) .1. Inserting this ﬁrst-order approximation into (5. The linear polynomial which passes through these two points is given by p2 (t) = fn−1 + fn − fn−1 (t − tn ) . ∆t (5. (5. (5. and the past is used. However.1. N.18). any implementation of AdamsBashforth will require a boot strap to generate a second “initial condition” for the solution iteration process. Aside from the ﬁrst-order accurate scheme. a ﬁrst-order scheme can easily be constructed by allowing p1 (t) = constant = f (tn+1 . we could assume that the polynomial used both the current point and the previous point so that a second-order scheme resulted. If instead a future point. yn−1 )] .20) Alternatively. yn ) − f (tn−1 . (5.18) results in the previously found Euler scheme yn+1 = yn + ∆t · f (tn . this is a two-step algorithm which requires two initial conditions. ∆t (5. then the scheme is known as an Adams-Moulton method.1.23) In contrast to the Runge-Kutta method.1.24) where the future point and no past and present points are used to determine the value of the polynomial. 
Inserting this first-order approximation into (5.1.18) results in the previously found Euler scheme

y_{n+1} = y_n + ∆t · f(t_n, y_n) .   (5.1.20)

Alternatively, we could assume that the polynomial used both the current point and the previous point so that a second-order scheme resulted. The linear polynomial which passes through these two points is given by

p_2(t) = f_n + [(f_n − f_{n−1})/∆t] (t − t_n) .   (5.1.21)

When inserted into (5.1.18), this linear polynomial yields

y_{n+1} = y_n + ∫_{t_n}^{t_{n+1}} { f_n + [(f_n − f_{n−1})/∆t] (t − t_n) } dt .   (5.1.22)

Upon integration and evaluation at the upper and lower limits, we find the following 2nd order Adams-Bashforth scheme

y_{n+1} = y_n + (∆t/2) [ 3f(t_n, y_n) − f(t_{n−1}, y_{n−1}) ] .   (5.1.23)

In contrast to the Runge-Kutta method, this is a two-step algorithm which requires two initial conditions. Thus, any implementation of Adams-Bashforth will require a boot strap to generate a second "initial condition" for the solution iteration process. This technique can be easily generalized to include more past points and thus higher accuracy. However, as accuracy is increased, so are the number of initial conditions required to step forward one time-step ∆t.

If instead a future point is used along with the present and the past, then the scheme is known as an Adams-Moulton method. As before, a first-order scheme can easily be constructed by allowing

p_1(t) = constant = f(t_{n+1}, y_{n+1}) ,   (5.1.24)

where the future point and no past and present points are used to determine the value of the polynomial. Inserting this first-order approximation into (5.1.18) results in the backward Euler scheme

y_{n+1} = y_n + ∆t · f(t_{n+1}, y_{n+1}) .   (5.1.25)

Alternatively, we could assume that the polynomial used both the future point and the current point so that a second-order scheme resulted. The linear polynomial which passes through these two points is given by

p_2(t) = f_n + [(f_{n+1} − f_n)/∆t] (t − t_n) .   (5.1.26)

When inserted into (5.1.18), this linear polynomial yields

y_{n+1} = y_n + ∫_{t_n}^{t_{n+1}} { f_n + [(f_{n+1} − f_n)/∆t] (t − t_n) } dt .   (5.1.27)

Upon integration and evaluation at the upper and lower limits, we find the following 2nd order Adams-Moulton scheme

y_{n+1} = y_n + (∆t/2) [ f(t_{n+1}, y_{n+1}) + f(t_n, y_n) ] .   (5.1.28)

Once again this is a two-step algorithm. However, it is categorically different than the Adams-Bashforth methods since it results in an implicit scheme, i.e. the unknown value y_{n+1} is specified through a nonlinear equation (5.1.28). The solution of this nonlinear system can be very difficult, thus making explicit schemes such as Runge-Kutta and Adams-Bashforth, which are simple iterations, more easily handled. However, implicit schemes can have advantages when considering stability issues related to time-stepping. This is explored further in the notes.

One way to circumvent the difficulties of the implicit stepping method while still making use of its power is to use a Predictor-Corrector method. This scheme draws on the power of both the Adams-Bashforth and Adams-Moulton schemes. In particular, the second order implicit scheme given by (5.1.28) requires the value of f(t_{n+1}, y_{n+1}) in the right hand side. If we can predict (approximate) this value, then we can use this predicted value to solve (5.1.28) explicitly. Thus we begin with a predictor step to estimate y_{n+1} so that f(t_{n+1}, y_{n+1}) can be evaluated. We then insert this value into the right hand side of (5.1.28) and explicitly find the corrected value of y_{n+1}. The second-order predictor-corrector steps are then as follows:

predictor (Adams-Bashforth):  y^P_{n+1} = y_n + (∆t/2) [ 3f_n − f_{n−1} ]   (5.1.29a)
corrector (Adams-Moulton):    y_{n+1} = y_n + (∆t/2) [ f(t_{n+1}, y^P_{n+1}) + f(t_n, y_n) ] .   (5.1.29b)

Thus the scheme utilizes both explicit and implicit time-stepping schemes without having to solve a system of nonlinear equations.
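The predictor-corrector pair (5.1.29) can be sketched in Python as follows (my code; one Euler step is used as the boot strap for the second initial value, which is an implementation choice, not from the notes):

```python
import numpy as np

def pc2(f, y0, T, dt):
    # 2nd order Adams-Bashforth predictor / Adams-Moulton corrector
    t, y = 0.0, y0
    f_old = f(t, y)                   # f_{n-1}
    y = y + dt * f_old                # Euler boot strap -> y_1
    t += dt
    for _ in range(int(round(T / dt)) - 1):
        f_now = f(t, y)
        y_pred = y + dt / 2 * (3 * f_now - f_old)           # predictor
        y = y + dt / 2 * (f(t + dt, y_pred) + f_now)        # corrector
        f_old = f_now
        t += dt
    return y

f = lambda t, y: -y
err = abs(pc2(f, 1.0, 1.0, 0.01) - np.exp(-1))
print(err)   # second-order accurate, with no nonlinear solve required
```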

Higher order differential equations

Thus far, we have considered systems of first order equations. Higher order differential equations can be put into this form, and the methods outlined here can then be applied. For example, consider the third-order, nonhomogeneous, differential equation

d^3u/dt^3 + u^2 du/dt + cos t · u = g(t) .   (5.1.30)

By defining

y_1 = u ,   y_2 = du/dt ,   y_3 = d^2u/dt^2 ,   (5.1.31)

we find that dy_3/dt = d^3u/dt^3. Using the original equation along with the definitions of the y_i we find that

dy_1/dt = y_2   (5.1.32a)
dy_2/dt = y_3   (5.1.32b)
dy_3/dt = d^3u/dt^3 = −u^2 du/dt − cos t · u + g(t) = −y_1^2 y_2 − cos t · y_1 + g(t)   (5.1.32c)

which results in the form of the original differential equation (5.1.1) considered previously:

dy/dt = d/dt [y_1, y_2, y_3]^T = [y_2, y_3, −y_1^2 y_2 − cos t · y_1 + g(t)]^T = f(y, t) .   (5.1.33)

At this point, all the time-stepping techniques developed thus far can be applied to the problem. It is imperative to write any differential equation as a first-order system before solving it numerically with the time-stepping schemes developed here.

MATLAB commands

The time-stepping schemes considered here are all available in the MATLAB suite of differential equation solvers. The following are a few of the most common solvers:

• ode23: second-order Runge-Kutta routine
• ode45: fourth-order Runge-Kutta routine
• ode113: variable order predictor-corrector routine
• ode15s: variable order Gear method for stiff problems [3, 4]
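In code, the reduction (5.1.31)-(5.1.33) is just a function that returns the vector f(y, t). A Python sketch (the zero forcing g(t) is an arbitrary choice of mine for the spot check):

```python
import numpy as np

def g(t):
    return 0.0          # hypothetical forcing term; any g(t) may be substituted

def rhs(t, y):
    # y = [u, u', u''] for u''' + u^2 u' + cos(t) u = g(t)
    y1, y2, y3 = y
    return np.array([y2, y3, -y1**2 * y2 - np.cos(t) * y1 + g(t)])

# spot check at t = 0 with u = 1, u' = 2, u'' = 3:
# dy3/dt = -(1^2)(2) - cos(0)(1) + 0 = -3
print(rhs(0.0, np.array([1.0, 2.0, 3.0])))
```

A function of this shape is exactly what solvers such as ode45 (or their counterparts in other languages) expect.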

5.2 Error analysis for time-stepping routines

Accuracy and stability are fundamental to numerical analysis and are the key factors in evaluating any numerical integration technique. Therefore, it is essential to evaluate the accuracy and stability of the time-stepping schemes developed. Each is significant in its own right. Rarely does it occur that both accuracy and stability work in concert; in fact, they often are offsetting and work directly against each other. Thus a highly accurate scheme may compromise stability, whereas a low accuracy scheme may have excellent stability properties.

We begin by exploring accuracy. In the context of time-stepping schemes, the natural place to begin is with Taylor expansions. Thus we consider the expansion

y(t+∆t) = y(t) + ∆t · dy(t)/dt + (∆t^2/2) · d^2y(c)/dt^2   (5.2.1)

where c ∈ [t, t+∆t]. Since we are considering dy/dt = f(t, y), the above formula reduces to the Euler iteration scheme

y_{n+1} = y_n + ∆t · f(t_n, y_n) + O(∆t^2) .   (5.2.2)

It is clear from this that the truncation error is O(∆t^2); specifically, the truncation error is given by (∆t^2/2) · d^2y(c)/dt^2. Of importance is how this truncation error contributes to the overall error in the numerical solution.

Two types of error are important to identify: local and global error. The global discretization error is given by

E_k = y(t_k) − y_k   (5.2.3)

where y(t_k) is the exact solution and y_k is the numerical solution. The local discretization error is given by

ε_{k+1} = y(t_{k+1}) − ( y(t_k) + ∆t · φ )   (5.2.4)

where y(t_{k+1}) is the exact solution and y(t_k) + ∆t·φ is a one-step approximation over the time interval t ∈ [t_k, t_{k+1}].

For the Euler method, we can calculate both the local and global error. Given a time-step ∆t and a specified time interval t ∈ [a, b], we have after K steps that ∆t · K = b − a. Thus we find

local:  ε_k = (∆t^2/2) d^2y(c_k)/dt^2 ∼ O(∆t^2)   (5.2.5a)

global: E_k ≈ Σ_{j=1}^{K} (∆t^2/2) d^2y(c_j)/dt^2 ≈ K · (∆t^2/2) d^2y(c)/dt^2 = [(b−a)/∆t] · (∆t^2/2) d^2y(c)/dt^2 = ∆t · [(b−a)/2] d^2y(c)/dt^2 ∼ O(∆t)   (5.2.5b)

which gives a local error for the Euler scheme which is O(∆t^2) and a global error which is O(∆t). Thus the cumulative error is poor for the Euler scheme, i.e. it is not very accurate. In practice, we are primarily concerned with the global (cumulative) error.

A similar procedure can be carried out for all the schemes discussed thus far, including the multi-step Adams schemes. Table 7 illustrates various schemes and their associated local and global errors.

scheme                      local error ε_k   global error E_k
Euler                       O(∆t^2)           O(∆t)
2nd order Runge-Kutta       O(∆t^3)           O(∆t^2)
4th order Runge-Kutta       O(∆t^5)           O(∆t^4)
2nd order Adams-Bashforth   O(∆t^3)           O(∆t^2)

Table 7: Local and global discretization errors associated with various time-stepping schemes.

The error analysis suggests that the error will always decrease in some power of ∆t. Thus it is tempting to conclude that higher accuracy is easily achieved by taking smaller time steps ∆t. This would be true if not for round-off error in the computer.

Round-off and step-size

An unavoidable consequence of working with numerical computations is round-off error. When working with most computations, double precision numbers are used. This allows for 16-digit accuracy in the representation of a given number. This round-off has significant impact upon numerical computations and the issue of time-stepping.

As an example of the impact of round-off, we consider the Euler approximation to the derivative

dy/dt ≈ (y_{n+1} − y_n)/∆t + ε(y_n, ∆t)   (5.2.6)

where ε(y_n, ∆t) measures the truncation error. Upon evaluating this expression in the computer, round-off error occurs so that

y_{n+1} = Y_{n+1} + e_{n+1} ,   (5.2.7)

where Y_{n+1} is the computed value and e_{n+1} the associated round-off error. Thus the combined error between the round-off and truncation gives the following expression for the derivative:

dy/dt = (Y_{n+1} − Y_n)/∆t + E_n(y_n, ∆t)   (5.2.8)

where the total error, E_n, is the combination of round-off and truncation such that

E_n = E_round + E_trunc = (e_{n+1} − e_n)/∆t − (∆t^2/2) · d^2y(c)/dt^2 .   (5.2.9)

We now determine the maximum size of this error. In particular, we can bound the maximum value of the round-off and the derivative to be

|e_{n+1}| ≤ e_r   (5.2.10a)
|−e_n| ≤ e_r   (5.2.10b)
M = max_{c ∈ [t_n, t_{n+1}]} { |d^2y(c)/dt^2| } .   (5.2.10c)

This then gives the maximum error to be

|E_n| ≤ e_r/∆t + e_r/∆t + (∆t^2/2) M = 2e_r/∆t + (∆t^2/2) M .   (5.2.11)

To minimize the error, we require that ∂|E_n|/∂(∆t) = 0. Calculating this derivative gives

∂|E_n|/∂(∆t) = −2e_r/∆t^2 + M∆t = 0 ,   (5.2.12)

so that

∆t = (2e_r/M)^(1/3) .   (5.2.13)

This gives the step size resulting in a minimum error. Thus the smallest step-size is not necessarily the most accurate. Rather, a balance between round-off error and truncation error is achieved to obtain the optimal step-size.
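The competition between the 1/∆t round-off term and the truncation term is easy to observe directly. A Python sketch (mine, not from the notes) differentiates exp(x) at x = 1 with a simple forward difference:

```python
import numpy as np

exact = np.exp(1.0)

def fd_error(h):
    # forward difference (f(x+h) - f(x))/h: truncation ~ h, round-off ~ eps/h
    return abs((np.exp(1.0 + h) - np.exp(1.0)) / h - exact)

errs = {h: fd_error(h) for h in (1e-1, 1e-8, 1e-13)}
print(errs)   # the intermediate step size wins
```

Neither the largest nor the smallest step is best: the minimum error sits near the balance point, exactly as the analysis above predicts.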

Stability

The accuracy of any scheme is certainly important. However, the accuracy is meaningless if the scheme is not stable numerically. The essence of a stable scheme is simple: the numerical solutions must not blow up to infinity. As an example, consider the simple differential equation

dy/dt = λy   (5.2.14)

with

y(0) = y_0 .   (5.2.15)

The analytic solution is easily calculated to be y(t) = y_0 exp(λt). However, if we solve this problem numerically with a forward Euler method we find

y_{n+1} = y_n + ∆t · λ y_n = (1 + λ∆t) y_n .   (5.2.16)

After N steps, we find this iteration scheme yields

y_N = (1 + λ∆t)^N y_0 .   (5.2.17)

Given that we have a certain amount of round-off error, the numerical solution would then be given by

y_N = (1 + λ∆t)^N (y_0 + e) .   (5.2.18)

The error then associated with this scheme is given by

E = (1 + λ∆t)^N e .   (5.2.19)

At this point, the following observations can be made. For λ > 0, the solution y_N → ∞ in Eq. (5.2.18) as N → ∞, so although the error also grows, it may not be significant in comparison to the size of the numerical solution. In contrast, Eq. (5.2.18) for λ < 0 is markedly different: the exact solution decays, and the error can dominate in this case. In particular, we have the two following cases for the error given by (5.2.19):

I:  |1 + λ∆t| < 1  then  E → 0   (5.2.20a)
II: |1 + λ∆t| > 1  then  E → ∞ .   (5.2.20b)

In case I, the scheme would be considered stable. For λ < 0, however, case II holds — and the scheme is unstable — provided ∆t > −2/λ.

A general theory of stability can be developed for any one-step time-stepping scheme. Consider the one-step recursion relation for an M × M system

y_{n+1} = A y_n .   (5.2.21)

After N steps, the algorithm yields the solution

y_N = A^N y_0 ,   (5.2.22)

where y_0 is the initial vector. A well known result from linear algebra is that A^N = S Λ^N S^{−1}, where S is the matrix whose columns are the eigenvectors of A, and

Λ = diag(λ_1, λ_2, ..., λ_M)   (5.2.23)

is a diagonal matrix whose entries are the eigenvalues of A. Since Λ^N = diag(λ_1^N, λ_2^N, ..., λ_M^N), we are only concerned with the eigenvalues when calculating A^N. In particular, instability occurs if |λ_i| > 1 for any i = 1, 2, ..., M. This method can be easily generalized to two-step schemes (Adams methods) by considering

y_{n+1} = A y_n + B y_{n−1} .   (5.2.24)

Figure 3: Regions for stable stepping for the forward Euler and backward Euler schemes.

Lending further significance to this stability analysis is its connection with practical implementation. We contrast the difference in stability between the forward and backward Euler schemes. The forward Euler scheme has already been considered in (5.2.16)-(5.2.20). If we again consider (5.2.14) with (5.2.15), the backward Euler method gives the iteration scheme

y_{n+1} = y_n + ∆t · λ y_{n+1} ,   (5.2.25)

which after N steps leads to

y_N = ( 1/(1 − λ∆t) )^N y_0 .   (5.2.26)

The round-off error associated with this scheme is given by

E = ( 1/(1 − λ∆t) )^N e .   (5.2.27)

By letting z = λ∆t be a complex number, we find the following criteria to yield unstable behavior based upon (5.2.19) and (5.2.27):

forward Euler:   |1 + z| > 1   (5.2.28a)
backward Euler:  |1/(1 − z)| > 1 .   (5.2.28b)

Figure 3 shows the regions of stable and unstable behavior as a function of z. It is observed that the forward Euler scheme has a very small range of stability, whereas the backward Euler scheme has a large range of stability. The backward Euler scheme thus displays significant differences in stability; this large stability region is part of what makes implicit methods so attractive. However, control of the accuracy is also essential.
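The contrast is stark on a stiff example. With λ = −100 and ∆t = 0.1, |1 + λ∆t| = 9 > 1, so forward Euler blows up while backward Euler decays (a Python sketch of the two amplification factors, not from the notes):

```python
y_fe, y_be = 1.0, 1.0
lam, dt = -100.0, 0.1
for _ in range(50):
    y_fe = (1 + lam * dt) * y_fe        # forward Euler: amplification factor -9
    y_be = y_be / (1 - lam * dt)        # backward Euler: amplification factor 1/11
print(abs(y_fe), abs(y_be))             # explosion vs. decay toward zero
```

The exact solution exp(−100 t) decays to essentially zero; only the implicit scheme reproduces that behavior at this step size.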

5.3 Boundary value problems: the shooting method

To this point, we have only considered the solutions of differential equations for which the initial conditions are known. However, many physical applications do not have specified initial conditions, but rather some given boundary (constraint) conditions. A simple example of such a problem is the second-order boundary value problem

d²y/dt² = f( t, y, dy/dt )  (5.3.1)

on t ∈ [a, b] with the general boundary conditions

α1 y(a) + β1 dy(a)/dt = γ1    (5.3.2a)
α2 y(b) + β2 dy(b)/dt = γ2 .  (5.3.2b)

Thus the solution is defined over a specific interval and must satisfy the relations (5.3.2) at the end points of the interval. Figure 4 gives a graphical representation of a generic boundary value problem solution. We discuss the algorithm necessary to make use of the time-stepping schemes in order to solve such a problem.

The Shooting Method

The boundary value problems constructed here require information at the present time (t = a) and a future time (t = b). However, the time-stepping schemes developed previously only require information about the starting time t = a. Some effort is then needed to reconcile the time-stepping schemes with the boundary value problems presented here.

We begin by reconsidering the generic boundary value problem

d²y/dt² = f( t, y, dy/dt )  (5.3.3)

on t ∈ [a, b] with the boundary conditions

y(a) = α    (5.3.4a)
y(b) = β .  (5.3.4b)

The stepping schemes considered thus far for second-order differential equations involve a choice of the initial conditions y(a) and y'(a). We can still approach the boundary value problem from this framework by choosing the "initial" conditions

y(a) = α       (5.3.5a)
dy(a)/dt = A,  (5.3.5b)

where the constant A is chosen so that as we advance the solution to t = b we find y(b) = β. The shooting method gives an iterative procedure with which we can determine this constant A. Figure 5 illustrates the solution of the boundary value problem given two distinct values of A. In this case, the value A = A1 gives an initial slope which is too low to satisfy the boundary conditions (5.3.4), whereas the value A = A2 is too large to satisfy (5.3.4).

Figure 4: Graphical depiction of the structure of a typical solution to a boundary value problem with constraints at t = a and t = b.

Computational Algorithm

The above example demonstrates that adjusting the value of A in (5.3.5b) can lead to a solution which satisfies (5.3.4b). We can solve this using a self-consistent algorithm to search for the appropriate value of A which satisfies the original problem. The basic algorithm is as follows:

1. Solve the differential equation using a time-stepping scheme with the initial conditions y(a) = α and y'(a) = A.

2. Evaluate the solution y(b) at t = b and compare this value with the target value y(b) = β.

3. Adjust the value of A (either bigger or smaller) until a desired level of tolerance and accuracy is achieved. A bisection method for determining values of A, for instance, may be appropriate.

4. Once the specified accuracy has been achieved, the numerical solution is complete and is accurate to the level of the tolerance chosen and the discretization scheme used in the time-stepping.
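The four steps above can be sketched end to end. A minimal Python illustration (the notes use MATLAB) for the linear test problem y'' = −y on [0, π/2] with y(0) = 0 and y(π/2) = 1, whose exact initial slope is A = 1, uses RK4 time-stepping plus bisection on A:

```python
import math

def integrate(A, a=0.0, b=math.pi/2, steps=200):
    """Step 1: RK4 time-stepping for y'' = -y written as (y, y')' = (y', -y)."""
    h = (b - a) / steps
    y, yp = 0.0, A                      # initial conditions y(a)=0, y'(a)=A
    f = lambda y, yp: (yp, -y)
    for _ in range(steps):
        k1 = f(y, yp)
        k2 = f(y + h/2*k1[0], yp + h/2*k1[1])
        k3 = f(y + h/2*k2[0], yp + h/2*k2[1])
        k4 = f(y + h*k3[0], yp + h*k3[1])
        y  += h/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0])
        yp += h/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1])
    return y                            # y(b) for this guess of the slope A

beta = 1.0                              # target boundary value y(b) = beta
lo, hi = 0.0, 5.0                       # bracket for the unknown slope A
for _ in range(60):                     # steps 2-3: compare y(b) with beta, bisect A
    mid = (lo + hi) / 2
    if integrate(mid) < beta:
        lo = mid
    else:
        hi = mid
A = (lo + hi) / 2                       # step 4: converged slope
```

For this linear problem y(b; A) = A·sin(π/2), so the bisection converges to A = 1 up to the RK4 discretization error.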

However, the two solutions suggest that a bisection scheme could be used to find the correct solution and value of A. We illustrate graphically a bisection process in Fig. 6 and show the convergence of the method to the numerical solution which satisfies the original boundary conditions y(a) = α and y(b) = β. This process can occur quickly, so that convergence is achieved in a relatively small number of iterations provided the differential equation is well behaved.

Figure 5: Solutions to the boundary value problem with y(a) = α and y'(a) = A. Here, two values of A are used to illustrate the solution behavior and its failure to match the correct boundary value y(b) = β: A = A1 gives y(b) < β while A = A2 gives y(b) > β.

Implementing MATLAB time-stepping schemes

Up to this point in our discussions of differential equations, our solution techniques have relied on solving initial value problems with some appropriate iteration procedure. To implement one of these solution schemes in MATLAB requires information about the equation, the time or spatial range to solve for, and the initial conditions. To build on a specific example, consider the third-order, nonhomogeneous differential equation

d³u/dt³ + u² du/dt + cos t · u = A sin² t .  (5.3.6)

Figure 6: Graphical illustration of the shooting process which uses a bisection scheme to converge to the appropriate value of A for which y(b) = β. Successive guesses satisfy A1 < A3 < A2, A1 < A4 < A3, A4 < A5 < A3.

By defining

y1 = u,  y2 = du/dt,  y3 = d²u/dt² ,  (5.3.7a–c)

we find that dy3/dt = d³u/dt³. Using the original equation along with the definitions of the yi we find that

dy1/dt = y2  (5.3.8a)
dy2/dt = y3  (5.3.8b)
dy3/dt = d³u/dt³ = −u² du/dt − cos t · u + A sin² t = −y1² y2 − cos t · y1 + A sin² t ,  (5.3.8c)

which results in a first-order system of the form considered previously,

dy/dt = d/dt ( y1, y2, y3 )^T = ( y2, y3, −y1² y2 − cos t · y1 + A sin² t )^T = f(y, t) .  (5.3.9)
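The reduction (5.3.9) is mechanical to code. A small Python sketch of the right-hand side (mirroring the MATLAB rhs.m below; the particular initial data are illustrative):

```python
import math

def rhs(t, y, A):
    """First-order system for u''' + u^2 u' + cos(t) u = A sin^2(t),
    with y = (y1, y2, y3) = (u, u', u'')."""
    y1, y2, y3 = y
    return [y2,
            y3,
            -(y1**2)*y2 - math.cos(t)*y1 + A*math.sin(t)**2]

# At t=0 with (u, u', u'') = (1, 0, 0) and A=2, (5.3.8c) gives u''' = -cos(0)*1 = -1
val = rhs(0.0, [1.0, 0.0, 0.0], 2.0)
```

Any standard ODE integrator can then be handed this function, exactly as ode45 is handed rhs.m below.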

At this point, all the time-stepping techniques developed thus far can be applied to the problem. It is imperative to write any differential equation as a first-order system before solving it numerically with the time-stepping schemes developed here.

Before solving the equation, the initial conditions and time span must be defined. For this example, let's consider solving the equations with initial conditions y1(0) = 1, y2(0) = 0 and y3(0) = 0 over the interval t ∈ [t0, tf] = [0, 30] with A = 2, where t0 is the initial time and tf the final time. The initial conditions specify y1(t0), y2(t0) and y3(t0), or equivalently u(t0), ut(t0) and utt(t0). In MATLAB this would be accomplished with

y0=[1 0 0];    % initial conditions
tspan=[0 30];  % time interval
A=2;           % parameter value

These can now be used in calling the differential equation solver. The basic command structure and function call to solve this equation is given as follows:

[t,y]=ode45('rhs',tspan,y0,[],A)

This piece of code calls a function rhs.m which contains information about the differential equations. It sends to this function the time span for the solution, the initial conditions, and the parameter value of A. The brackets are placed in a position for specifying accuracy and tolerance options; by placing the brackets with nothing inside, the default accuracy will be used by ode45. The dummy slot is for sending in information about the tolerances, and the last slot with A allows you to send in any parameters necessary in the right-hand side. To access other methods, simply replace ode45 with ode23, ode113, or ode15s as needed.

The function rhs.m contains all the information about the differential equation. Specifically, the right-hand side of (5.3.9) is of primary importance. This right-hand side function looks as follows:

function rhs=rhs(t,y,dummy,A)  % function call setup
rhs=[y(2); y(3); -(y(1)^2)*y(2)-cos(t)*y(1)+A*sin(t)^2];  % rhs vector

This function imports the variable t, which takes on values between those given by tspan. Thus t starts at t = 0 and ends at t = 30 in this case. The variable y stores the solution; initially, it takes on the value of y0.

Note that the output of ode45 are the vectors t and y. The vector t gives all the points between tspan for which the solution was calculated; in between these points, the solver chooses its own step sizes. The solution at these time points is given as the columns of the matrix y: the first column corresponds to the solution y1, the second column contains y2 as a function of time, and the third column gives y3. To plot the solution u = y1 as a function of time, we would plot the vector t versus the first column of y:

figure(1), plot(t,y(:,1))  % plot solution u
figure(2), plot(t,y(:,2))  % plot derivative du/dt

A couple of important points and options: the inline command could also be used in place of the function call. Also, it may be desirable to have the solution evaluated at specified time points. For instance, if we wanted the solution evaluated from t = 0 to t = 30 in steps of 0.5, then choose

tspan=0:0.5:30  % evaluate the solution every dt=0.5

Note that MATLAB still determines the step sizes necessary to maintain a given tolerance. It simply interpolates to generate the values at the desired locations.
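The interpolation idea above can be sketched in a few lines of Python. This is only a rough stand-in: ode45 actually uses a higher-order polynomial interpolant on each internal step, whereas the sketch below (with illustrative, hypothetical solver nodes) uses simple linear interpolation:

```python
def interp_solution(ts, ys, t_query):
    """Linearly interpolate a solution stored at solver nodes ts onto t_query."""
    out = []
    for t in t_query:
        # find the bracketing interval [ts[i], ts[i+1]]
        i = max(j for j in range(len(ts) - 1) if ts[j] <= t)
        w = (t - ts[i]) / (ts[i+1] - ts[i])
        out.append((1 - w)*ys[i] + w*ys[i+1])
    return out

# internal solver nodes (illustrative) and values of y = t^2 at those nodes
ts = [0.0, 0.7, 1.3, 2.0]
ys = [t*t for t in ts]
vals = interp_solution(ts, ys, [0.5, 1.0, 2.0])
```

The key point survives the simplification: the requested output times do not change which internal steps the solver takes.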

5.4 Implementation of the shooting method

The implementation of the shooting scheme relies on the effective use of a time-stepping algorithm along with a root-finding method for choosing the appropriate initial conditions which solve the boundary value problem. Boundary value problems often arise as eigenvalue systems for which the eigenvalue and eigenfunction must both be determined. As an example of such a problem, we consider the Schrödinger equation from quantum mechanics, which is a second-order differential equation

d²ψn/dx² + [n(x) − βn] ψn = 0  (5.4.1)

with the boundary conditions ψ(±1) = 0. Rewriting the differential equation as a coupled first-order system gives

x' = [ 0           1 ] x   (5.4.2)
     [ βn − n(x)   0 ]

where x = (x1  x2)^T = (ψn  dψn/dx)^T. The boundary conditions are

x = 1:    ψn(1) = x1(1) = 0      (5.4.3a)
x = −1:   ψn(−1) = x1(−1) = 0 .  (5.4.3b)

At this stage, we will also assume that n(x) = n0 for simplicity. With the problem thus deﬁned, we turn our attention to the key aspects in the computational implementation of the boundary value problem solver. These are

• FOR loops
• IF statements
• time-stepping algorithms: ode23, ode45, ode113, ode15s
• step-size control
• code development and flow

Every code will be controlled by a set of FOR loops and IF statements. It is imperative to have proper placement of these control statements in order for the code to operate successfully.

Flow Control

In order to begin coding, it is always prudent to construct the basic structure of the algorithm. In particular, it is good to determine the number of FOR loops and IF statements which may be required for the computations. What is especially important is determining the hierarchy structure for the loops. To solve the boundary value problem proposed here, we require two FOR loops and one IF statement block. The outermost FOR loop of the code should determine the number of eigenvalues and eigenmodes to be searched for. Within this FOR loop exists a second FOR loop which iterates the shooting method so that the solution converges to the correct boundary value solution. This second FOR loop has a logical IF statement which needs to check whether the solution has indeed converged to the boundary value solution, or whether adjustment of the value of βn is necessary and the iteration procedure needs to be continued.

Figure 7 illustrates the backbone of the numerical code for solving the boundary value problem. It includes the two FOR loops and logical IF statement block as the core of its algorithmic structure. For a nonlinear problem, a third FOR loop would be required for A in order to achieve the normalization of the eigenfunctions to unity.

The various pieces of the code are constructed here using the MATLAB programming language. We begin with the initialization of the parameters.

Initialization

clear all;  % clear all previously defined variables
close all;  % clear all previously defined figures

tol=10^(-4);  % define a tolerance level to be achieved
              % by the shooting algorithm
col=['r','b','g','c','m','k'];  % eigenfunction colors
n0=100;       % define the parameter n0

Figure 7: Basic algorithm structure for solving the boundary value problem. The flow is: initialize parameters; for each mode, choose βn (and A); solve the ODEs; use an IF statement to root-solve for βn; if not converged, choose a new value of βn; if converged, plot and save the data, then normalize and choose a new mode. Two FOR loops are required to step through the values of βn and A along with a single IF statement block to check for convergence of the solution.

A=1;        % define the initial slope at x=-1
x0=[0 A];   % initial conditions: x1(-1)=0, x1'(-1)=A
xp=[-1 1];  % define the span of the computational domain

Upon completion of the initialization process for the parameters which are not involved in the main loop of the code, we move into the main FOR loop which searches out a specified number of eigenmodes. Embedded in this FOR loop is a second FOR loop which attempts different values of βn until the correct eigenvalue is found. An IF statement is used to check the convergence of the values of βn to the appropriate value.

Main Program

beta_start=n0;  % beginning value of beta
for modes=1:5   % begin mode loop

  beta=beta_start;  % initial value of eigenvalue beta
  dbeta=n0/100;     % default step size in beta
  for j=1:1000      % begin convergence loop for beta
    [t,y]=ode45('shoot2',xp,x0,[],n0,beta);  % solve ODEs
    if abs(y(end,1)-0) < tol    % check for convergence
      beta                      % write out eigenvalue
      break                     % get out of convergence loop
    end
    if (-1)^(modes+1)*y(end,1)>0    % this IF statement block
      beta=beta-dbeta;              % checks to see if beta
    else                            % needs to be higher or lower
      beta=beta+dbeta/2;            % and uses bisection to
      dbeta=dbeta/2;                % converge to the solution
    end
  end  % end convergence loop
  beta_start=beta-0.1;  % after finding eigenvalue, pick
                        % new starting value for next mode
  norm=trapz(t,y(:,1).*y(:,1));  % calculate the normalization
  plot(t,y(:,1)/sqrt(norm),col(modes)); hold on  % plot modes
end  % end mode loop

This code will find the first five eigenvalues and plot their corresponding normalized eigenfunctions. The code uses ode45, which is a fourth-order Runge-Kutta method, to solve the differential equation and advance the solution. For the differential equation considered here, the function shoot2.m is called in this routine. The function shoot2.m would be the following:

shoot2.m

function rhs=shoot2(xspan,x,dummy,n0,beta)
rhs=[ x(2); (beta-n0)*x(1) ];

In general, it is always a good idea to first explore the behavior of the solutions of the boundary value problem before writing the shooting routine. This will give important insights into the behavior of the solutions and will allow for a proper construction of an accurate and efficient bisection method. The bisection method implemented to adjust the values of βn to find the boundary value solution is based upon observations of the structure of the even and odd eigenmodes. Figure 8 illustrates several characteristic features of this boundary value problem. In Figs. 8(a) and 8(b), the behavior of the solution near the first even and first odd solution is exhibited. From Fig. 8(a) it is seen that for the even modes, increasing values of β bring the solution from ψn(1) > 0 to ψn(1) < 0. In contrast, the odd modes go from ψn(1) < 0 to ψn(1) > 0 as β is increased. This observation forms the basis for the bisection scheme developed in the code.
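For n(x) = n0 the eigenvalues found by this shooting procedure can be checked against the exact values of ψ'' + (n0 − β)ψ = 0 with ψ(±1) = 0, namely βm = n0 − (mπ/2)². A compact Python sketch of the same bisection-on-β idea (RK4 standing in for ode45; the bracket [96, 99] for the first mode is an assumption read off from plots like Fig. 8):

```python
import math

n0 = 100.0

def psi_end(beta, steps=400):
    """Integrate psi'' = (beta - n0)*psi from x=-1 to x=1 with RK4,
    starting from psi(-1)=0, psi'(-1)=1, and return psi(1)."""
    h = 2.0 / steps
    p, dp = 0.0, 1.0
    f = lambda p, dp: (dp, (beta - n0) * p)
    for _ in range(steps):
        k1 = f(p, dp)
        k2 = f(p + h/2*k1[0], dp + h/2*k1[1])
        k3 = f(p + h/2*k2[0], dp + h/2*k2[1])
        k4 = f(p + h*k3[0], dp + h*k3[1])
        p  += h/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0])
        dp += h/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1])
    return p

# First (even) mode: psi(1; beta) changes sign between beta=96 and beta=99
lo, hi = 96.0, 99.0
for _ in range(60):                    # bisection on beta
    mid = (lo + hi) / 2
    if psi_end(lo) * psi_end(mid) <= 0:
        hi = mid
    else:
        lo = mid
beta1 = (lo + hi) / 2
exact = n0 - (math.pi / 2)**2          # = 97.5326...
```

The computed β1 agrees with n0 − (π/2)² ≈ 97.53 to the accuracy of the integrator, which is exactly the check one would run before trusting the full MATLAB routine.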

Figure 8: In (a) and (b) the behavior of the solution near the first even and first odd mode is depicted (curves for β = 85, 95 and 98 are shown). Note that for the even modes increasing values of β bring the solution from ψ(1) > 0 to ψ(1) < 0; in contrast, odd modes go from ψ(1) < 0 to ψ(1) > 0 as β is increased. In (c) the first four normalized eigenmodes ψ(x) along with their corresponding eigenvalues β1 = 97.53, β2 = 90.13, β3 = 77.80 and β4 = 60.53 are illustrated for n0 = 100.

5.5 Boundary value problems: direct solve and relaxation

The shooting method is not the only method for solving boundary value problems. The direct method of solution relies on Taylor expanding the differential equation itself. For linear problems, this results in a matrix problem of the form Ax = b. For nonlinear problems, a nonlinear system of equations must be solved using a relaxation scheme, i.e. a Newton or Secant method. The prototypical example of such a problem is the second-order boundary value problem

d²y/dt² = f( t, y, dy/dt )  (5.5.1)

on t ∈ [a, b] with the general boundary conditions

α1 y(a) + β1 dy(a)/dt = γ1    (5.5.2a)
α2 y(b) + β2 dy(b)/dt = γ2 .  (5.5.2b)

Thus the solution is defined over a specific interval and must satisfy the relations (5.5.2) at the end points of the interval.

Before considering the general case, we simplify the method by considering the linear boundary value problem

d²y/dt² = p(t) dy/dt + q(t) y + r(t)  (5.5.3)

on t ∈ [a, b] with the simplified boundary conditions

y(a) = α    (5.5.4a)
y(b) = β .  (5.5.4b)

Taylor expanding the differential equation and boundary conditions will generate the linear system of equations which solve the boundary value problem. Tables 4–6 (Sec. 4.1) summarize the second-order and fourth-order central difference schemes along with the forward- and backward-difference formulas which are accurate to second order.

To solve the simplified linear boundary value problem above so that it is accurate to second order, we use table 4 for the second and first derivatives. The boundary value problem then becomes

[y(t+∆t) − 2y(t) + y(t−∆t)]/∆t² = p(t) [y(t+∆t) − y(t−∆t)]/(2∆t) + q(t) y(t) + r(t)  (5.5.5)

with the boundary conditions y(a) = α and y(b) = β. We can rearrange this expression to read

[1 − (∆t/2) p(t)] y(t+∆t) − [2 + ∆t² q(t)] y(t) + [1 + (∆t/2) p(t)] y(t−∆t) = ∆t² r(t) .  (5.5.6)

We discretize the computational domain and denote t0 = a to be the left boundary point and tN = b to be the right boundary point. This gives the boundary conditions

y(t0) = y(a) = α    (5.5.7a)
y(tN) = y(b) = β .  (5.5.7b)

The remaining N − 1 points can be recast as a matrix problem Ax = b, where A is the tridiagonal matrix with, in the row corresponding to tn,

subdiagonal:    −[1 + (∆t/2) p(tn)]
diagonal:        2 + ∆t² q(tn)
superdiagonal:  −[1 − (∆t/2) p(tn)]   (5.5.8)

and

x = ( y(t1), y(t2), ..., y(tN−2), y(tN−1) )^T
b = ( −∆t² r(t1) + [1 + ∆t p(t1)/2] y(t0), −∆t² r(t2), ..., −∆t² r(tN−2), −∆t² r(tN−1) + [1 − ∆t p(tN−1)/2] y(tN) )^T .  (5.5.9)

Thus the solution can be found by a direct solve of the linear system of equations.

Nonlinear Systems

A similar solution procedure can be carried out for nonlinear systems. However, difficulties arise from solving the resulting set of nonlinear algebraic equations. We can once again consider the general differential equation and expand with second-order accurate schemes:

y'' = f(t, y, y')  →  [y(t+∆t) − 2y(t) + y(t−∆t)]/∆t² = f( t, y(t), [y(t+∆t) − y(t−∆t)]/(2∆t) ) .  (5.5.10)

We again discretize the computational domain and denote t0 = a to be the left boundary point and tN = b to be the right boundary point. Considering again the simplified boundary conditions y(t0) = y(a) = α and y(tN) = y(b) = β gives the following nonlinear system for the remaining N − 1 points:

2y1 − y2 − α + ∆t² f(t1, y1, (y2 − α)/2∆t) = 0
−y1 + 2y2 − y3 + ∆t² f(t2, y2, (y3 − y1)/2∆t) = 0
···
−yN−3 + 2yN−2 − yN−1 + ∆t² f(tN−2, yN−2, (yN−1 − yN−3)/2∆t) = 0
−yN−2 + 2yN−1 − β + ∆t² f(tN−1, yN−1, (β − yN−2)/2∆t) = 0 .  (5.5.11)

This (N − 1) × (N − 1) nonlinear system of equations can be very difficult to solve and imposes a severe constraint on the usefulness of the scheme.
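In the linear case, the tridiagonal system (5.5.8)–(5.5.9) can be solved in O(N) operations with forward elimination and back substitution (the Thomas algorithm). A Python sketch for the test problem y'' = y (so p = 0, q = 1, r = 0) on [0, 1] with y(0) = 0, y(1) = 1, whose exact solution is sinh(t)/sinh(1):

```python
import math

N = 100                       # number of subintervals
dt = 1.0 / N
alpha, beta = 0.0, 1.0        # boundary values y(0), y(1)

# tridiagonal entries from (5.5.8) with p=0, q=1, r=0
sub = [-1.0] * (N - 1)              # -(1 + dt*p/2)
diag = [2.0 + dt*dt] * (N - 1)      #  2 + dt^2*q
sup = [-1.0] * (N - 1)              # -(1 - dt*p/2)
b = [0.0] * (N - 1)                 # -dt^2*r = 0 at interior points
b[0] += alpha                       # boundary contribution from y(t0)
b[-1] += beta                       # boundary contribution from y(tN)

# Thomas algorithm: forward elimination then back substitution
for i in range(1, N - 1):
    m = sub[i] / diag[i - 1]
    diag[i] -= m * sup[i - 1]
    b[i] -= m * b[i - 1]
y = [0.0] * (N - 1)
y[-1] = b[-1] / diag[-1]
for i in range(N - 3, -1, -1):
    y[i] = (b[i] - sup[i] * y[i + 1]) / diag[i]

mid = y[N // 2 - 1]                        # approximation to y(0.5)
exact = math.sinh(0.5) / math.sinh(1.0)    # = 0.44340...
```

The midpoint value agrees with the exact solution to the expected second-order accuracy, O(∆t²).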

However, there may be no other way of solving the problem, and a solution to these systems of equations must be computed. Further complicating the issue is the fact that for nonlinear systems such as these, there are no guarantees about the existence or uniqueness of solutions. The best approach is to use a relaxation scheme which is based upon Newton or Secant method iterations.

Solving Nonlinear Systems: Newton-Raphson Iteration

The only built-in MATLAB command which solves nonlinear systems of equations is FSOLVE. However, this command is now packaged within the optimization toolbox. Most users of MATLAB do not have access to this toolbox, and alternatives must be sought. We therefore develop the basic ideas of the Newton-Raphson iteration method, commonly known as Newton's method.

We begin by considering a single nonlinear equation

f(xr) = 0  (5.5.12)

where xr is the root to the equation and the value being sought. We would like to develop a scheme which systematically determines the value of xr. The Newton-Raphson method is an iterative scheme which relies on an initial guess, x0, for the value of the root. From this guess, subsequent guesses are determined until the scheme either converges to the root xr or the scheme diverges and another initial guess is used. The sequence of guesses (x0, x1, x2, x3, ...) is generated from the slope of the function f(x). The graphical procedure is illustrated in Fig. 9(a). In essence, everything relies on the slope formula:

slope = df(xn)/dx = rise/run = (0 − f(xn)) / (xn+1 − xn) .  (5.5.13)

Rearranging this gives the Newton-Raphson iterative relation

xn+1 = xn − f(xn)/f'(xn) .  (5.5.14)

A graphical example of how the iteration procedure works is given in Fig. 9(b), where a sequence of iterations is demonstrated. Note that the scheme fails if f'(xn) = 0, since then the slope line never intersects y = 0. Further, for certain guesses the iterations may diverge. Provided the initial guess is sufficiently close, however, the scheme usually will converge. Conditions for convergence can be found in Burden and Faires [5].

The Newton method can be generalized to systems of nonlinear equations,

F(x) = ( f1(x1, x2, ..., xN), f2(x1, x2, ..., xN), ..., fN(x1, x2, ..., xN) )^T = 0 .  (5.5.15)

The details will not be discussed here, but the Newton iteration scheme is similar to that developed for the single-function case.
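The scalar iteration (5.5.14) takes only a few lines. A minimal Python sketch, using the illustrative test function f(x) = x² − 2 whose positive root is √2:

```python
def newton(f, fprime, x0, iters=20):
    """Newton-Raphson: x_{n+1} = x_n - f(x_n)/f'(x_n), per (5.5.14)."""
    x = x0
    for _ in range(iters):
        x = x - f(x) / fprime(x)
    return x

# f(x) = x^2 - 2, f'(x) = 2x, initial guess x0 = 1
root = newton(lambda x: x*x - 2.0, lambda x: 2.0*x, x0=1.0)
```

Starting from x0 = 1 the iterates converge quadratically; starting from x0 = 0 the very first step would divide by f'(0) = 0, which is exactly the failure mode noted above.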

For the system (5.5.15), the iteration scheme is

xn+1 = xn + ∆xn  (5.5.16)

where

J(xn) ∆xn = −F(xn)  (5.5.17)

and J(xn) is the Jacobian matrix

J(xn) = [ f1x1  f1x2  ···  f1xN
          f2x1  f2x2  ···  f2xN
          ···
          fNx1  fNx2  ···  fNxN ] .  (5.5.18)

This algorithm relies on initially guessing values for x1, x2, ..., xN. As before, there is no guarantee that the algorithm will converge; thus a good initial guess is critical to its success. Further, the determinant of the Jacobian cannot equal zero, det J(xn) ≠ 0, in order for the algorithm to work.

Figure 9: Construction and implementation of the Newton-Raphson iteration formula. In (a), the slope is the determining factor in deriving the Newton-Raphson formula: slope = rise/run = (0 − f(xn))/(xn+1 − xn). In (b), a graphical representation of the iteration scheme is given.

5.6 Implementing the direct solve method

To see how the finite difference discretization technique is applied, we consider the same example problem considered with the shooting method. As before, the boundary value problem gives rise to an eigenvalue system for which the eigenvalue and eigenfunction must both be determined. The example considered is the Schrödinger equation from quantum mechanics, which is a second-order

differential equation

d²ψ/dx² + [n(x) − β] ψ = 0  (5.6.1)

with the boundary conditions ψ(±1) = 0. As before, we will consider n(x) = n0 = 100. Unlike the shooting method, there is no need to rewrite the equation as a coupled system of first-order equations.

To discretize, we first divide the spatial domain x ∈ [−1, 1] into a specified number of equally spaced points. At each point, let ψ(xn) = ψn. Upon discretizing, the governing equation becomes

(ψn+1 − 2ψn + ψn−1)/∆x² + (n0 − β) ψn = 0 ,  n = 1, 2, 3, ..., N−1 .  (5.6.2)

Multiplying through by ∆x² and collecting terms gives

ψn+1 + (−2 + ∆x² n0) ψn + ψn−1 = ∆x² β ψn ,  n = 1, 2, 3, ..., N−1 .  (5.6.3)

Note that the term on the right-hand side is placed there because the values of β are unknown; part of the problem is to determine these values as well as all the values of ψn. Note also that we already know the boundary conditions ψ(−1) = ψ0 = 0 and ψ(1) = ψN = 0. Thus we are left with an (N−1) × (N−1) eigenvalue problem for which we can use the eig command. Defining

λ = ∆x² β  (5.6.4)

gives the eigenvalue problem

Aψ = λψ  (5.6.5)

where A is the tridiagonal matrix

A = [ −2+∆x²n0   1          0         ···        0
      1          −2+∆x²n0   1         ···        0
      ···                   ···                  ···
      0          ···        0         1          −2+∆x²n0 ]  (5.6.6)

and

ψ = ( ψ(x1), ψ(x2), ..., ψ(xN−2), ψ(xN−1) )^T .  (5.6.7)

Thus the solution can be found by an eigenvalue solve of the linear system of equations.

numerical implementation

To implement this solution, we begin with the preliminaries of the problem:

clear all; close all;
n=200;               % number of points
x=linspace(-1,1,n);  % linear space of points
dx=x(2)-x(1);        % delta x values
n0=100;              % value of n_0
col=['r','b','g','c','m','k'];  % eigenfunction colors

With these defined, we turn to building the matrix A. It is a sparse matrix which only has three diagonals of consequence.

m=n-2;  % size of A matrix (mXm)
% DIAGONALS
for j=1:m
  A(j,j)=-2+dx^2*n0;
end
% OFF DIAGONALS
for j=1:m-1
  A(j,j+1)=1;
  A(j+1,j)=1;
end

Note that the size of the matrix is m = n−2 since we already know the boundary conditions ψ(−1) = ψ0 = 0 and ψ(1) = ψN = 0. To solve this problem, we simply use the eig command:

[V,D]=eig(A)  % get the eigenvectors (V), eigenvalues (D)

The matrix V is an m × m matrix which contains as columns the eigenvectors of the matrix A. The matrix D is an m × m matrix whose diagonal elements are the eigenvalues corresponding to the eigenvectors in V. The eig command automatically sorts the eigenvalues from smallest to biggest as you move across the columns of V. In this particular case, we are actually looking for the largest five values of β; thus we need to look at the last five columns of V for our eigenvectors and the last five diagonal entries of D for the eigenvalues.

for j=1:5
  psi(1)=0;                 % left boundary condition
  psi(2:m+1)=V(:,end+1-j);  % interior points
  psi(n)=0;                 % right boundary condition
  norm=trapz(x,psi.^2);     % normalization
  plot(x,psi/sqrt(norm),col(j)); hold on  % plot
  beta=D(end+1-j,end+1-j)/dx/dx  % write out eigenvalues
end

(The eigenvector is stored here in a variable named psi rather than eig, so as not to shadow the built-in eig command.) This plots out the normalized eigenvector solutions and their eigenvalues.
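The matrix built above is a symmetric tridiagonal Toeplitz matrix, whose eigenvalues are known in closed form: λj = d + 2cos(jπ/(m+1)) with d = −2 + ∆x² n0. That gives a quick cross-check, sketched here in Python, that the largest discrete eigenvalues β = λ/∆x² approach the exact continuous values n0 − (jπ/2)²:

```python
import math

# grid matching the MATLAB code above: n=200 points on [-1,1], m=n-2 interior
n = 200
m = n - 2
dx = 2.0 / (n - 1)
n0 = 100.0
d = -2.0 + dx*dx*n0   # diagonal entry of A

# eigenvalues of the m x m matrix tridiag(1, d, 1): lambda_j = d + 2*cos(j*pi/(m+1))
lams = [d + 2.0*math.cos(j*math.pi/(m + 1)) for j in range(1, m + 1)]
betas = sorted((lam / (dx*dx) for lam in lams), reverse=True)  # beta = lambda/dx^2

exact1 = n0 - (math.pi/2)**2   # exact first eigenvalue, 97.5326...
```

The largest β agrees with n0 − (π/2)² ≈ 97.53 to within the O(∆x²) discretization error, matching the value found by the shooting method.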

6 Fourier Transforms

Fourier transforms are one of the most powerful and efficient techniques for solving a wide variety of problems. The key idea of the Fourier transform is to represent functions and their derivatives as sums of cosines and sines. This operation can be done with the Fast Fourier Transform (FFT), which is an O(N log N) operation. Thus the FFT is faster than most linear solvers, which are O(N²). The basic properties and implementation of the FFT will be considered here.

6.1 Basics of the Fourier Transform

Other techniques exist for solving many computational problems which are not based upon the standard Taylor series discretization. An alternative to standard discretization is to use the Fast Fourier Transform (FFT). The FFT is an integral transform defined over the entire line x ∈ [−∞, ∞]. The Fourier transform and its inverse are defined as

F(k) = (1/√(2π)) ∫ e^(ikx) f(x) dx ,  x from −∞ to ∞   (6.1.1a)
f(x) = (1/√(2π)) ∫ e^(−ikx) F(k) dk ,  k from −∞ to ∞ .  (6.1.1b)

There are other equivalent definitions; however, this definition will serve to illustrate the power and functionality of the Fourier transform method. We again note that formally the transform is over the entire real line x ∈ [−∞, ∞], whereas our computational domain is only over a finite domain x ∈ [−L, L]. Further, the kernel of the transform, exp(±ikx), describes oscillatory behavior. Thus the Fourier transform is essentially an eigenfunction expansion over all continuous wavenumbers k. Given computational practicalities, we transform over a finite domain x ∈ [−L, L] and assume periodic boundary conditions due to the oscillatory behavior of the kernel of the Fourier transform. And once we are on a finite domain x ∈ [−L, L], the continuous eigenfunction expansion becomes a discrete sum of eigenfunctions and associated wavenumbers (eigenvalues).

Derivative Relations

The critical property in the usage of Fourier transforms concerns derivative relations. To see how these properties are generated, we begin by considering the Fourier transform of f'(x). We denote the Fourier transform of f(x) as f̂(k). Integrating by parts, we find

f̂'(k) = (1/√(2π)) ∫ e^(ikx) f'(x) dx = f(x)e^(ikx) |(from −∞ to ∞) − (ik/√(2π)) ∫ e^(ikx) f(x) dx .  (6.1.2)

Assuming that f(x) → 0 as x → ±∞ results in

f̂'(k) = −(ik/√(2π)) ∫ e^(ikx) f(x) dx = −ik f̂(k) .  (6.1.3)

Thus the basic relation f̂' = −ik f̂ is established. It is easy to generalize this argument to an arbitrary number of derivatives. The final result is the following relation between Fourier transforms of the derivative and the Fourier transform itself:

f̂^(n) = (−ik)^n f̂ .  (6.1.4)

This property is what makes Fourier transforms so useful and practical. As an example of the Fourier transform, consider the following differential equation

y'' − ω² y = −f(x) ,  x ∈ [−∞, ∞] .  (6.1.5)

We can solve this by applying the Fourier transform to both sides. This gives the following reduction:

y'' − ω² y = −f
−k² ŷ − ω² ŷ = −f̂
(k² + ω²) ŷ = f̂
ŷ = f̂ / (k² + ω²) .  (6.1.6)

To find the solution y(x), we invert the last expression above to yield

y(x) = (1/√(2π)) ∫ e^(−ikx) f̂ / (k² + ω²) dk .  (6.1.7)

This gives the solution in terms of an integral which can be evaluated analytically or numerically.

The Fast Fourier Transform

The Fast Fourier Transform routine was developed specifically to perform the forward and backward Fourier transforms. In the mid 1960s, Cooley and Tukey developed what is now commonly known as the FFT algorithm [7]. Their algorithm was named one of the top ten algorithms of the 20th century for one reason: the operation count for solving a system dropped to O(N log N). Thus it represents a great leap forward from Gaussian elimination and LU decomposition. The key features of the FFT routine are as follows:

1. It has a low operation count: O(N log N). For N large, this operation count grows almost linearly like N.

2. The FFT has excellent accuracy properties, typically well beyond that of standard discretization schemes.

3. The number of points should be 2^n, i.e. 2, 4, 8, 16, 32, 64, 128, 256, ···.

4. It finds the transform on an interval x ∈ [−L, L]. Since the integration kernel exp(ikx) is oscillatory, it implies that the solutions on this finite interval have periodic boundary conditions.

The key to lowering the operation count to O(N log N) is in discretizing the range x ∈ [−L, L] into 2^n points. We will consider the underlying FFT algorithm in detail at a later time; for more information at present, see [7] for a broader overview.

The practical implementation of the mathematical tools available in MATLAB is crucial. This lecture will focus on the use of some of the more sophisticated routines in MATLAB which are cornerstones to scientific computing. Included in this section will be a discussion of the Fast Fourier Transform routines (fft, ifft, fftshift, ifftshift, fft2, ifft2), sparse matrix construction (spdiags, spy), and high-end iterative techniques for solving Ax = b (bicgstab, gmres). These routines should be studied carefully since they are the building blocks of any serious scientific computing code.

Fast Fourier Transform: FFT, IFFT, FFTSHIFT, IFFTSHIFT

The Fast Fourier Transform will be the first subject discussed. Its implementation is straightforward. Given a function which has been discretized with 2^n points and represented by a vector x, the FFT is found with the command fft(x). Aside from transforming the function, the algorithm associated with the FFT does three major things: it shifts the data so that x ∈ [0, L] → [−L, 0] and x ∈ [−L, 0] → [0, L]; additionally it multiplies every other mode by −1; and it assumes you are working on a 2π periodic domain. These properties are a consequence of the FFT algorithm discussed in detail at a later time.

To see the practical implications of the FFT, we consider the transform of a Gaussian function. The transform can be calculated analytically so that we have the exact relations:

f(x) = exp(−αx²)  →  f̂(k) = (1/√(2α)) exp( −k²/(4α) ) .  (6.1.8)

A simple MATLAB code to verify this with α = 1 is as follows:

clear all; close all;  % clear all variables and figures
L=20;   % define the computational domain [-L/2,L/2]
n=128;  % define the number of Fourier modes 2^n

% function to take a derivative of ut=fft(u). % second derivative ut3=-i*k.*ut. The following example does this. % analytic first derivative .*ut. By using the command ﬀtshift. % shift FFT figure(1). % define the domain discretization x=x2(1:n). it is crucial that the transform is shifted back to the form of the second ﬁgure. u2=ifft(ut2).*k. The ﬁrst derivative is compared with the analytic value of f (x) = −sech(x) tanh(x).*x). % FFT the function utshift=fftshift(ut). % clear all variables and figures % define the computational domain [-L/2. Kutz) 82 x2=linspace(-L/2.L/2] % define the number of Fourier modes 2^n x2=linspace(-L/2. N. plot(abs(utshift)) % plot shifted transform The second ﬁgure generated by this script shows how the pulse is shifted. % define the domain discretization x=x2(1:n). plot(abs(ut)) % plot unshifted transform figure(3). we can shift the transformed function back to its mathematically correct positions as shown in the third ﬁgure generated. it is better not to deal with the ﬀtshift and iﬀtshift commands.*k. The command iﬀtshift does this. % consider only the first n points: periodicity u=exp(-x.L/2. we need to calculate the k values associated with the transformation. before inverting the transformation. L=20. % k rescaled to 2pi domain ut1=i*k.*k. Recall that the FFT assumes a 2π periodic domain which gets shifted. % third derivative u1=ifft(ut1).n+1). % inverse transform u1exact=-sech(x). However. % consider only the first n points: periodicity u=sech(x). n=128. plot(x. % FFT the function k=(2*pi/L)*[0:(n/2-1) (-n/2):-1]. % function to take a derivative of ut=fft(u). 10 where a Gaussian is transformed and shifted by the ﬀt routine. clear all.u) % plot initial gaussian figure(2). u3=ifft(ut3). The following example diﬀerentiates the function f (x) = sech(x) three times. % first derivative ut2=-k.L/2. To take a derivative. In general. Thus the calculation of the k values needs to shift and rescale to the 2π domain.*tanh(x). 
unless you need to plot the spectrum.*ut. close all.n+1). A graphical representation of the ﬀt procedure and its shifting properties is illustrated in Fig.AMATH 301 ( c J.

figure(1)
plot(x,u,'r',x,u1,'g',x,u1exact,'go',x,u2,'b',x,u3,'c')   % plot

The routine accounts for the periodic boundaries, the correct k values, and the differentiation. Note that no shifting was necessary since we constructed the k values in the shifted space.

Figure 10: Fast Fourier Transform of Gaussian data illustrating the shifting properties of the FFT routine. Note that the fftshift command restores the transform to its mathematically correct, unshifted state.

For transforming in higher dimensions, a couple of choices in MATLAB are possible. For 2D transformations, it is recommended to use the commands fft2 and ifft2. These will transform a matrix A, which represents data in the x and y directions respectively, along the rows and then the columns. For higher dimensions, the fft command can be modified to fft(x,[],dim), where dim specifies the dimension along which the transform is taken.
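The same spectral differentiation can be sketched in Python/NumPy, mirroring the MATLAB example above (L = 20, n = 128, u = sech(x)); numpy.fft.fftfreq produces the integer wavenumbers already in the FFT's shifted order:

```python
import numpy as np

n, L = 128, 20.0
x = np.linspace(-L/2, L/2, n, endpoint=False)
u = 1/np.cosh(x)                             # sech(x)
k = 2*np.pi/L * np.fft.fftfreq(n, d=1.0/n)   # same k values as the MATLAB line
u1 = np.real(np.fft.ifft(1j*k*np.fft.fft(u)))   # spectral first derivative
u1_exact = -np.tanh(x)/np.cosh(x)            # -sech(x)tanh(x)
print(np.max(np.abs(u1 - u1_exact)))
```

The maximum error against −sech(x)tanh(x) is small and is dominated by the e^(−L/2) tail of sech at the periodic boundary, not by the differentiation itself.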

6.2 Spectral Analysis

Although our primary use of the FFT will be related to differentiation and solving ordinary and partial differential equations, it should be understood that the impact of the FFT is far greater than this. FFTs and other related frequency transforms have revolutionized the field of digital signal processing and imaging. The key concept in any of these applications is to use the FFT to analyze and manipulate data in the frequency domain. This lecture will discuss a very basic concept and manipulation procedure performed in the frequency domain: noise attenuation via frequency (bandpass) filtering. This filtering process is fairly common in electronics and signal detection. To begin, we consider an ideal signal generated in the time domain:

clear all; close all;
T=20;                              % Total time slot to transform
n=128;                             % number of Fourier modes 2^7
t2=linspace(-T/2,T/2,n+1);         % time values
t=t2(1:n);
k=(2*pi/T)*[0:(n/2-1) -n/2:-1];    % frequency components of FFT
u=sech(t);                         % ideal signal in the time domain
figure(1), plot(t,u,'k'), hold on

This will generate an ideal hyperbolic secant shape in the time domain. In most applications, the signals as generated above are not ideal; rather, they have a large amount of noise on top of them. Usually this noise is what is called white noise, i.e. a noise which affects all frequencies the same. We can add white noise to this signal by considering the pulse in the frequency domain:

noise=10;
ut=fft(u);
utn=ut+noise*(randn(1,n)+i*randn(1,n));

These three lines of code generate the Fourier transform of the function along with the vector utn, which is the spectrum with a complex, Gaussian distributed (mean zero, unit variance) noise source term added in. Note that randn, not rand, produces the Gaussian distributed values described here.
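A minimal sketch of the noise-attenuation idea in Python/NumPy: white noise is added in the frequency domain as above and then attenuated with a Gaussian filter centered at k = 0. The filter width and the random seed are illustrative choices, not values from the notes:

```python
import numpy as np

rng = np.random.default_rng(0)               # fixed seed, illustrative
T, n = 20.0, 128
t = np.linspace(-T/2, T/2, n, endpoint=False)
k = 2*np.pi/T * np.fft.fftfreq(n, d=1.0/n)
u = 1/np.cosh(t)                             # ideal signal
ut = np.fft.fft(u)
noise = 10.0
utn = ut + noise*(rng.standard_normal(n) + 1j*rng.standard_normal(n))
filt = np.exp(-0.2*k**2)                     # Gaussian filter about k = 0 (illustrative width)
u_noisy = np.real(np.fft.ifft(utn))
u_filtered = np.real(np.fft.ifft(filt*utn))
print(np.linalg.norm(u_filtered - u) < np.linalg.norm(u_noisy - u))
```

Because the sech pulse concentrates near k = 0 while white noise is spread over all modes, the filtered signal is substantially closer to the ideal one than the noisy signal is.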

6.3 Applications of the FFT

Spectral methods are one of the most powerful solution techniques for ordinary and partial differential equations. The best known example of a spectral method is the Fourier transform; we have already made use of the Fourier transform via the FFT routines. Other spectral techniques exist which render a variety of problems easily tractable and often at significant computational savings.

Discrete Cosine and Sine Transforms

When considering a computational domain, the solution can only be found on a finite length domain. Thus the definition of the Fourier transform needs to be modified in order to account for the finite sized computational domain. Instead of expanding in terms of a continuous integral for values of wavenumber k and cosines and sines (exp(ikx)), we expand in a Fourier series

  F(k) = Σ_{n=1}^{N} f(n) exp(−i 2π(k−1)(n−1)/N) ,   1 ≤ k ≤ N ,   (6.3.1a)

  f(n) = (1/N) Σ_{k=1}^{N} F(k) exp(i 2π(k−1)(n−1)/N) ,   1 ≤ n ≤ N .   (6.3.1b)

Thus the Fourier transform is nothing but an expansion in a basis of cosine and sine functions. If we define the fundamental oscillatory piece as

  w^{nk} = exp(2iπ(k−1)(n−1)/N) = cos(2π(k−1)(n−1)/N) + i sin(2π(k−1)(n−1)/N) ,   (6.3.2)

then the Fourier transform results in the expression

  F_n = Σ_{k=0}^{N−1} w^{nk} f_k ,   0 ≤ n ≤ N−1 .   (6.3.3)

Thus the calculation of the Fourier transform involves a double sum and an O(N²) operation; at first it would appear that the Fourier transform method has the same operation count as LU decomposition. The process of solving a differential or partial differential equation involves evaluating the coefficient of each of the modes. Note that this expansion, unlike the finite difference method, is a global expansion in that every basis function is evaluated on the entire domain.

The basis functions used for the Fourier transform, sine transform, and cosine transform are depicted in Fig. 11. The Fourier, sine, and cosine transforms behave very differently at the boundaries: the Fourier transform assumes periodic boundary conditions, whereas the sine and cosine transforms assume pinned and no-flux boundaries respectively. Thus for a given problem, an evaluation must be made of the type of transform to be used based upon the boundary conditions needing to be satisfied; the cosine and sine transforms are often chosen for their boundary properties. Table 8 illustrates the three different expansions and their associated boundary conditions. The appropriate MATLAB command is also given.
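The O(N²) double sum (6.3.3) can be checked directly against the O(N log N) FFT. The sketch below (Python/NumPy, with illustrative N and data) builds the matrix of oscillatory pieces w^{nk} explicitly, using the exp(−2πi(k−1)(n−1)/N) sign convention that MATLAB's fft and NumPy's fft share:

```python
import numpy as np

N = 64
f = np.exp(-np.linspace(-8, 8, N)**2)        # illustrative data
idx = np.arange(N)
W = np.exp(-2j*np.pi*np.outer(idx, idx)/N)   # w^{nk} with the fft sign convention
F_direct = W @ f                             # O(N^2) double sum
F_fft = np.fft.fft(f)                        # O(N log N)
print(np.allclose(F_direct, F_fft))
```

Both evaluations agree to machine precision; the FFT simply reorganizes the same double sum into a recursive computation.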

Figure 11: Basis functions used for a Fourier mode expansion (top), a sine expansion (middle), and a cosine expansion (bottom).

  command   expansion                                boundary conditions
  fft       F_k = Σ_{j=0}^{2N−1} f_j exp(iπjk/N)     periodic: f(0) = f(L)
  dst       F_k = Σ_{j=1}^{N−1} f_j sin(πjk/N)       pinned: f(0) = f(L) = 0
  dct       F_k = Σ_{j=0}^{N−2} f_j cos(πjk/2N)      no-flux: f'(0) = f'(L) = 0

Table 8: MATLAB functions for Fourier, sine, and cosine transforms and their associated boundary conditions. To invert the expansions, the MATLAB commands are ifft, idst, and idct respectively.

Fast Poisson Solvers

The FFT algorithm provides a fast and efficient method for solving the Poisson equation

  ∇²ψ = ω   (6.3.4)

given the function ω. In two dimensions, this equation is equivalent to

  ∂²ψ/∂x² + ∂²ψ/∂y² = ω .   (6.3.5)

By discretizing in both the x and y directions, this forces us to solve an associated linear problem Ax = b. At best, we can use a factorization scheme to solve this problem in O(N²) operations. Although iteration schemes have the possibility of outperforming this, it is not guaranteed. Fourier transform methods, by contrast, allow us to solve the problem in O(N log N).

The algorithm to solve this is surprisingly simple. Denoting the Fourier transform in x with a hat and the Fourier transform in y with a tilde, we transform the equation. We begin by transforming in x:

  ∂²ψ/∂x² + ∂²ψ/∂y² = ω  →  −k_x² ψ̂ + ∂²ψ̂/∂y² = ω̂ ,   (6.3.6)

where k_x are the wavenumbers in the x direction. Transforming now in the y direction gives

  −k_x² ψ̃̂ − k_y² ψ̃̂ = ω̃̂ .   (6.3.7)

This can be rearranged to obtain the final result

  ψ̃̂ = − ω̃̂ / (k_x² + k_y²) .   (6.3.8)

The remaining step is to inverse transform in x and y to get back to the solution ψ(x,y). Before generating the solution, the spatial domain, Fourier wavenumber vectors, and the two-dimensional function ω(x,y) are defined:

clear all; close all;
Lx=20; Ly=20;                      % spatial domain of x and y
nx=128; ny=128;                    % number of Fourier modes
x2=linspace(-Lx/2,Lx/2,nx+1);      % x-values
x=x2(1:nx);
y2=linspace(-Ly/2,Ly/2,ny+1);      % y-values
y=y2(1:ny);
[X,Y]=meshgrid(x,y);               % generate 2-D grid
omega=exp(-X.^2-Y.^2);             % generate 2-D Gaussian

After defining the preliminaries, we transform the equation; the wavenumbers are generated and the equation is solved.

kx=(2*pi/Lx)*[0:nx/2-1 -nx/2:-1];  % x-wavenumbers
ky=(2*pi/Ly)*[0:ny/2-1 -ny/2:-1];  % y-wavenumbers
kx(1)=10^(-6); ky(1)=10^(-6);      % fix divide by zero
[KX,KY]=meshgrid(kx,ky);           % generate 2-D wavenumbers
psi=real( ifft2( -fft2(omega)./(KX.^2+KY.^2) ) );   % solution

Note that the real part of the inverse 2-D FFT is taken since numerical round-off generates imaginary parts to the solution ψ(x,y).

An observation concerning (6.3.8) is that there will be a divide by zero when k_x = k_y = 0, i.e. at the zero mode. Two options are commonly used to overcome this problem. The first is to modify (6.3.8) so that

  ψ̃̂ = − ω̃̂ / (k_x² + k_y² + eps) ,   (6.3.9)

where eps is the command for generating a machine precision number, which is on the order of O(10^−15). It essentially adds a round-off to the denominator which removes the divide by zero problem. A second option, which is more highly recommended, is to redefine the kx and ky vectors associated with the wavenumbers in the x and y directions. Specifically, after defining the kx and ky vectors, we could simply add the command lines

kx(1)=10^(-6);
ky(1)=10^(-6);

The values of kx(1) = ky(1) = 0 by default. This would make the values small but finite, so that the divide by zero problem is effectively removed with only a small amount of error added to the problem.

There is a second mathematical difficulty which must be addressed. The function ψ(x,y) with periodic boundary conditions does not have a unique solution: if ψ0(x,y) is a solution, so is ψ0(x,y) + c, where c is an arbitrary constant. When solving this problem with FFTs, the FFT will arbitrarily add a constant to the solution. Fundamentally, the non-uniqueness gives a singular matrix A, so solving with Gaussian elimination, LU decomposition or iterative methods is problematic. Often in applications the arbitrary constant does not matter; most applications are only concerned with derivatives of the function ψ(x,y). For instance, in fluid dynamics we are only interested in derivatives of the streamfunction, and this constant is inconsequential. When solving with direct methods for Ax = b, we could instead impose the constraint condition ψ(−L, −L, t) = 0. Such a condition pins the value of ψ(x,y,t) at the left hand corner of the computational domain and fixes c. Thus we can simply pin the value of ψ(x,y) to some prescribed value on our computational domain; this will fix the constant c and give a unique solution to Ax = b.
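The whole fast Poisson solver fits in a few lines of Python/NumPy. This sketch zeroes the k = 0 mode outright rather than using the 10^(−6) redefinition above; both choices address the same divide-by-zero and arbitrary-constant issues, and zeroing the mode makes the result easy to verify by applying the spectral Laplacian:

```python
import numpy as np

Lx = Ly = 20.0
nx = ny = 128
x = np.linspace(-Lx/2, Lx/2, nx, endpoint=False)
y = np.linspace(-Ly/2, Ly/2, ny, endpoint=False)
X, Y = np.meshgrid(x, y)
omega = np.exp(-X**2 - Y**2)                 # 2-D Gaussian source

kx = 2*np.pi/Lx * np.fft.fftfreq(nx, d=1.0/nx)
ky = 2*np.pi/Ly * np.fft.fftfreq(ny, d=1.0/ny)
KX, KY = np.meshgrid(kx, ky)
K2 = KX**2 + KY**2

what = np.fft.fft2(omega)
what[0, 0] = 0.0                  # zero-mean source: fixes the arbitrary constant
K2s = K2.copy(); K2s[0, 0] = 1.0  # dummy value at k = 0 to avoid 0/0
psi = np.real(np.fft.ifft2(-what/K2s))       # solution of (6.3.8)

# verify: the spectral Laplacian of psi reproduces omega up to its mean
lap = np.real(np.fft.ifft2(-K2*np.fft.fft2(psi)))
print(np.max(np.abs(lap - (omega - omega.mean()))))
```

The mean of ω is removed because on a periodic domain the Poisson equation is only solvable for a zero-mean source; the residual printed is at the round-off level.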

7 Partial Differential Equations

With the Fourier transform and discretization in hand, we can turn towards the solution of partial differential equations whose solutions need to be advanced forward in time. Thus, in addition to spatial discretization, we will need to discretize in time in a self-consistent way so as to advance the solution to a desired future time. Issues of numerical stability as the solution is propagated in time, and of accuracy from the space-time discretization, are of greatest concern for any implementation.

7.1 Basic time- and space-stepping schemes

Basic time-stepping schemes involve discretization in both space and time. Issues of stability and accuracy are, of course, at the heart of a discussion on time- and space-stepping schemes. To begin this study, we will consider very simple and well known examples from partial differential equations. The first equation we consider is the heat (diffusion) equation

  ∂u/∂t = κ ∂²u/∂x²   (7.1.1)

with the periodic boundary conditions u(−L) = u(L). We discretize the spatial derivative with a second-order scheme (see Table 4) so that

  ∂u/∂t = (κ/∆x²) [u(x+∆x) − 2u(x) + u(x−∆x)] .   (7.1.2)

This approximation reduces the partial differential equation to a system of ordinary differential equations. We have already considered a variety of time-stepping schemes for differential equations, and we can apply them directly to this resulting system.

The ODE System

To define the system of ODEs, we discretize and define the values of the vector u in the following way:

  u(−L) = u_1
  u(−L+∆x) = u_2
    ...
  u(L−2∆x) = u_{n−1}
  u(L−∆x) = u_n
  u(L) = u_{n+1} .

Recall from periodicity that u_1 = u_{n+1}. Thus our system of differential equations solves for the vector

  u = (u_1, u_2, ..., u_n)^T .   (7.1.3)

The governing equation (7.1.2) is then reformulated as the differential equations system

  du/dt = (κ/∆x²) A u ,   (7.1.4)

where A is given by the sparse matrix

  A = [ −2  1  0  ···  0  1
         1 −2  1  0  ···  0
           ···
         0  ···  0  1 −2  1
         1  0  ···  0  1 −2 ] ,   (7.1.5)

and the values of one on the upper right and lower left of the matrix result from the periodic boundary conditions.

MATLAB implementation

The system of differential equations can now be easily solved with a standard time-stepping algorithm such as ode23 or ode45. The basic algorithm would be as follows:

1. Build the sparse matrix A.

   e1=ones(n,1);                            % build a vector of ones
   A=spdiags([e1 -2*e1 e1],[-1 0 1],n,n);   % diagonals
   A(1,n)=1; A(n,1)=1;                      % periodic boundaries

2. Generate the desired initial condition vector u = u0.

3. Call an ODE solver from the MATLAB suite. The matrix A, the diffusion constant κ and the spatial step ∆x need to be passed into this routine.

   [t,y]=ode45('rhs',tspan,u0,[],k,dx,A);

The function rhs.m should be of the following form:

   function rhs=rhs(tspan,u,dummy,k,dx,A)
   rhs=(k/dx^2)*A*u;

4. Plot the results as a function of time and space.

The algorithm is thus fairly routine and requires very little effort in programming, since we can make use of the standard time-stepping algorithms already available in MATLAB.

2D MATLAB implementation

In the case of two dimensions, the calculation becomes slightly more difficult since the 2D data is represented by a matrix while the ODE solvers require a vector input for the initial data. For this case, the governing equation is

  ∂u/∂t = κ (∂²u/∂x² + ∂²u/∂y²) ,   (7.1.6)

where we discretize in x and y. Provided ∆x = ∆y = δ are the same, the system can be reduced to the linear system

  du/dt = (κ/δ²) B u ,   (7.1.7)

where we have arranged the vector u so that

  u = (u_11, u_12, ..., u_1n, u_21, u_22, ..., u_n(n−1), u_nn)^T ,   (7.1.8)

with u_jk = u(x_j, y_k). The matrix B is a nine-diagonal matrix; the kron command can be used to generate it from its 1D version. Again, the system of differential equations can now be easily solved with a standard time-stepping algorithm such as ode23 or ode45. The basic algorithm follows the same course as the 1D case, but extra care is taken in arranging the 2D data into a vector.

1. Build the sparse matrix B. Here the matrix B can be built by using the fact that we know the differentiation matrix in one dimension, i.e. the matrix A of the previous example, together with the kron command:

   Lx=20; Ly=20;                   % spatial domain of x and y
   nx=100; ny=100;                 % discretization points in x and y
   N=nx*ny;                        % elements in reshaped initial condition
   x2=linspace(-Lx/2,Lx/2,nx+1);   % x-vector
   x=x2(1:nx);                     % account for periodicity
   y2=linspace(-Ly/2,Ly/2,ny+1);   % y-vector
   y=y2(1:ny);                     % account for periodicity
   [X,Y]=meshgrid(x,y);            % set up for 2D initial conditions
   n=100;                          % number of discretization points in x and y
   I=eye(n);                       % identity matrix of size nxn
   B=kron(I,A)+kron(A,I);          % 2-D differentiation matrix

2. Generate the desired initial condition matrix U = U0 and reshape it to a vector u = u0. This example considers the case of a simple Gaussian as the initial condition:

   U=exp(-X.^2-Y.^2);              % generate a Gaussian matrix
   u=reshape(U,N,1);               % reshape into a vector

3. Call an ODE solver from the MATLAB suite. The matrix B, the diffusion constant κ and the spatial step ∆x = ∆y = dx need to be passed into this routine:

   [t,y]=ode45('rhs',tspan,u0,[],k,dx,B);

   The function rhs.m should now be of the following form:

   function rhs=rhs(tspan,u,dummy,k,dx,B)
   rhs=(k/dx^2)*B*u;

4. Plot the results as a function of time and space. The reshape and meshgrid commands are important for the computational implementation.

The algorithm is again fairly routine and requires very little effort in programming, since we can make use of the standard time-stepping algorithms.
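The kron construction of step 1 can be sketched in Python/NumPy (with a small illustrative n, and dense arrays standing in for MATLAB's sparse storage):

```python
import numpy as np

n = 8                                         # small n, illustrative
A = np.diag(-2*np.ones(n)) + np.diag(np.ones(n-1), 1) + np.diag(np.ones(n-1), -1)
A[0, -1] = A[-1, 0] = 1.0                     # periodic boundaries
I = np.eye(n)
B = np.kron(I, A) + np.kron(A, I)             # 2-D differentiation matrix
print(B.shape, bool((B.sum(axis=1) == 0).all()))   # (64, 64), rows sum to zero
```

The zero row sums confirm that B annihilates constants, exactly as a discrete Laplacian with periodic boundaries should.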

Figure 12: Graphical representation of the progression of the numerical solution using the method of lines. Each line is the value of the solution at a given time slice; the lines are used to update the solution to a new timeline and progressively generate future solutions.

Method of Lines

Fundamentally, these methods use the data at a single slice of time to generate a solution ∆t in the future. This is then used to generate a solution 2∆t into the future, and the process continues until the desired future time is achieved. This process of solving a partial differential equation is known as the method of lines. Figure 12 depicts the process involved in the method of lines for the 1D case. Here a second-order discretization scheme is considered which couples each spatial point with its nearest neighbors.

7.2 Time-stepping schemes: explicit and implicit methods

Now that several technical and computational details have been addressed, we continue to develop methods for time-stepping the solution into the future. Some of the more common schemes will be considered along with a graphical representation of each scheme. Every scheme eventually leads to an iteration procedure which the computer can use to advance the solution in time.

We will begin by considering the most simple partial differential equations. Often, it is difficult to do much analysis with complicated equations; therefore, considering simple equations is not merely an exercise, but rather these are typically the only equations for which we can make analytical progress. As an example, we consider the one-way wave equation

  ∂u/∂t = c ∂u/∂x .   (7.2.1)

The simplest discretization of this equation is to first central difference in the x direction. This yields

  ∂u/∂t = (c/2∆x) (u_{n+1} − u_{n−1}) ,   (7.2.2)

where u_n = u(x_n, t). We can then step forward with an Euler time-stepping method. Denoting u_n^(m) = u(x_n, t_m) and applying the method of lines iteration scheme gives

  u_n^(m+1) = u_n^(m) + (c∆t/2∆x) (u_{n+1}^(m) − u_{n−1}^(m)) .   (7.2.3)

This simple scheme has the four-point stencil shown in Fig. 13.

Figure 13: Four-point stencil for second-order spatial discretization and Euler time-stepping of the one-way wave equation.

To illustrate more clearly the iteration procedure, we rewrite the discretized equation in the form

  u_n^(m+1) = u_n^(m) + (λ/2) (u_{n+1}^(m) − u_{n−1}^(m)) ,   (7.2.4)

where

  λ = c∆t/∆x   (7.2.5)

is known as the CFL (Courant-Friedrichs-Lewy) number, and the stability requirement on it is the CFL condition. The iteration procedure assumes that the solution does not change significantly from one time-step to the next, i.e. u_n^(m) ≈ u_n^(m+1). The accuracy and stability of this scheme is controlled almost exclusively by the CFL number λ.

This parameter relates the spatial and time discretization schemes in (7.2.5). Note that decreasing ∆x without decreasing ∆t leads to an increase in λ, which can result in instabilities; smaller values of ∆t suggest smaller values of λ and improved numerical stability properties. In practice, you want to take ∆x and ∆t as large as possible for computational speed and efficiency without generating instabilities.

There are practical considerations to keep in mind relating to the CFL number. First, given a spatial discretization step-size ∆x, you should choose the time discretization so that the CFL number is kept in check. Often a given scheme will only work for CFL numbers below a certain value, thus the importance of choosing a small enough time-step. Second, if indeed you choose to work with very small ∆t, then although stability properties are improved with a lower CFL number, the code will also slow down accordingly. Thus achieving good stability results is often counter-productive to fast numerical solutions.

Central Differencing in Time

We can discretize the time-step in a similar fashion to the spatial discretization. Instead of the Euler time-stepping scheme used above, we could central difference in time using Table 4. Thus after spatial discretization we have

  ∂u/∂t = (c/2∆x) (u_{n+1} − u_{n−1}) ,   (7.2.6)

as before. Using a central difference scheme in time now yields

  (u_n^(m+1) − u_n^(m−1)) / 2∆t = (c/2∆x) (u_{n+1}^(m) − u_{n−1}^(m)) .   (7.2.7)

This last expression can be rearranged to give

  u_n^(m+1) = u_n^(m−1) + λ (u_{n+1}^(m) − u_{n−1}^(m)) .   (7.2.8)

This iterative scheme is called leap-frog (2,2) since it is O(∆t²) accurate in time and O(∆x²) accurate in space. It uses the four-point stencil shown in Fig. 14. Note that the scheme utilizes two time slices to leap-frog to the next time slice; thus it is not self-starting, since only one time slice (the initial condition) is given.

Improved Accuracy

We can improve the accuracy of any of the above schemes by using higher-order central differencing methods. The fourth-order accurate scheme from Table 5 gives

  ∂u/∂x = [−u(x+2∆x) + 8u(x+∆x) − 8u(x−∆x) + u(x−2∆x)] / 12∆x .   (7.2.9)

Combining this fourth-order spatial scheme with second-order central differencing in time gives the iterative scheme

  u_n^(m+1) = u_n^(m−1) + λ [ (4/3)(u_{n+1}^(m) − u_{n−1}^(m)) − (1/6)(u_{n+2}^(m) − u_{n−2}^(m)) ] .   (7.2.10)

Figure 14: Four-point stencil for second-order spatial discretization and central-difference time-stepping of the one-way wave equation.

This scheme, which is based upon a six-point stencil, is called leap frog (2,4). It is typical that for (2,4) schemes, the maximum CFL number for stable computations is reduced from that of the basic (2,2) scheme.

Lax-Wendroff

Another alternative to discretizing in time and space involves a clever use of the Taylor expansion

  u(x, t+∆t) = u(x, t) + ∆t ∂u(x,t)/∂t + (∆t²/2!) ∂²u(x,t)/∂t² + O(∆t³) .   (7.2.11)

But we note from the governing one-way wave equation that

  ∂u/∂t = c ∂u/∂x ≈ (c/2∆x) (u_{n+1} − u_{n−1}) .   (7.2.12)

Taking the time derivative of the equation results in the relation

  ∂²u/∂t² = c² ∂²u/∂x² ≈ (c²/∆x²) (u_{n+1} − 2u_n + u_{n−1}) .   (7.2.13)

These two expressions for ∂u/∂t and ∂²u/∂t² can be substituted into the Taylor series expression to yield the iterative scheme

  u_n^(m+1) = u_n^(m) + (λ/2) (u_{n+1}^(m) − u_{n−1}^(m)) + (λ²/2) (u_{n+1}^(m) − 2u_n^(m) + u_{n−1}^(m)) .   (7.2.14)

This is known as the Lax-Wendroff scheme. This iterative scheme is similar to the Euler method; however, it introduces an important stabilizing diffusion term which is proportional to λ². Although useful for this example, it is difficult to implement in practice for variable coefficient problems. It illustrates, however, the variety and creativity in developing iteration schemes for advancing the solution in time and space.

Backward Euler: Implicit Scheme

The backward Euler method uses the future time for discretizing the spatial domain. Thus upon discretizing in space and time, we arrive at the iteration scheme

  u_n^(m+1) = u_n^(m) + (λ/2) (u_{n+1}^(m+1) − u_{n−1}^(m+1)) .   (7.2.15)

This gives the tridiagonal system

  −(λ/2) u_{n+1}^(m+1) + u_n^(m+1) + (λ/2) u_{n−1}^(m+1) = u_n^(m) ,   (7.2.16)

which can be written in matrix form as

  A u^(m+1) = u^(m) ,   (7.2.17)

where

  A = (1/2) [ 2  −λ   0  ···  0
              λ   2  −λ   0  ···
                ···
              0  ···   λ   2  −λ
              0  ···   0   λ   2 ] .   (7.2.18)

Thus before stepping forward in time, we must solve a matrix problem. This can severely affect the computational time of a given scheme; the only thing which may make this method viable is if the CFL condition is such that much larger time-steps are allowed.

  Scheme            Stability
  Forward Euler     unstable for all λ
  Backward Euler    stable for all λ
  Leap Frog (2,2)   stable for λ ≤ 1
  Leap Frog (2,4)   stable for λ ≤ 0.707

Table 9: Stability of time-stepping schemes as a function of the CFL number.
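The first and third rows of Table 9 can be probed numerically. The sketch below (Python/NumPy; the grid size, CFL number, and step count are illustrative) advances the same Gaussian initial data with leap frog (2,2) at λ = 0.5 and with forward Euler, using the central-difference matrix for c = 1:

```python
import numpy as np

n, L = 200, 20.0
dx = L/n
lam = 0.5                                   # CFL number (c = 1), under the leap-frog limit
dt = lam*dx
x = np.linspace(-L/2, L/2, n, endpoint=False)
A = np.diag(np.ones(n-1), 1) - np.diag(np.ones(n-1), -1)   # u_{n+1} - u_{n-1}
A[0, -1], A[-1, 0] = -1.0, 1.0              # periodic boundaries

u0 = np.exp(-x**2)
u1 = np.exp(-(x + dt)**2)                   # second starting slice for leap frog
ue = u0.copy()
for _ in range(2000):
    u0, u1 = u1, u0 + lam*(A @ u1)          # leap frog (2,2): stays bounded
    ue = ue + 0.5*lam*(A @ ue)              # forward Euler: grows without bound
print(np.max(np.abs(u1)), np.max(np.abs(ue)))
```

After 2000 steps the leap-frog amplitude remains of order one while the Euler amplitude has grown by many orders of magnitude, in line with the table.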

MacCormack Scheme

In the MacCormack scheme, the variable coefficient problem of the Lax-Wendroff scheme and the matrix solve associated with the backward Euler are circumvented by using a predictor-corrector method. The computation thus occurs in two pieces:

  u_n^(P) = u_n^(m) + λ (u_{n+1}^(m) − u_{n−1}^(m)) ,   (7.2.19a)
  u_n^(m+1) = (1/2) [ u_n^(m) + u_n^(P) + λ (u_{n+1}^(P) − u_{n−1}^(P)) ] .   (7.2.19b)

This method essentially combines forward and backward Euler schemes so that we avoid the matrix solve and the variable coefficient problem.

The CFL condition will be discussed in detail in the next section. For now, the basic stability properties of many of the schemes considered here are given in Table 9. The stability results for the various schemes hold only for the one-way wave equation considered here as a prototypical example; each partial differential equation needs to be considered and classified individually with regards to stability.

7.3 Implementing Time- and Space-Stepping Schemes

Computational performance is always crucial in choosing a numerical scheme. Speed, accuracy and stability all play key roles in determining the appropriate choice of method for solving a given problem. We will consider three prototypical equations:

  ∂u/∂t = ∂u/∂x                          one-way wave equation   (7.3.1a)
  ∂u/∂t = ∂²u/∂x²                        diffusion equation   (7.3.1b)
  i ∂u/∂t = (1/2) ∂²u/∂x² + |u|² u       nonlinear Schrödinger equation   (7.3.1c)

Periodic boundary conditions will be assumed in each case. The purpose of this section is to build numerical routines which will solve these problems using the iteration procedures outlined in the previous two sections. Of specific interest will be the setting of the CFL number and the consequences of violating the stability criteria associated with it. The equations are considered in one dimension, such that the first and second derivatives take the matrix forms given below.

  ∂u/∂x → (1/2∆x) [  0  1  0  ···  0  −1
                    −1  0  1  0  ···  0
                       ···
                     0  ···  0  −1  0  1
                     1  0  ···  0  −1  0 ] (u_1, u_2, ..., u_n)^T   (7.3.2)

and

  ∂²u/∂x² → (1/∆x²) [ −2  1  0  ···  0  1
                       1 −2  1  0  ···  0
                          ···
                       0  ···  0  1 −2  1
                       1  0  ···  0  1 −2 ] (u_1, u_2, ..., u_n)^T .   (7.3.3)

From the previous lectures, we have the following discretization schemes for the one-way wave equation (7.3.1a):

  Euler (unstable):
    u_n^(m+1) = u_n^(m) + (λ/2) (u_{n+1}^(m) − u_{n−1}^(m)) ,   (7.3.4a)

  leap-frog (2,2) (stable for λ ≤ 1):
    u_n^(m+1) = u_n^(m−1) + λ (u_{n+1}^(m) − u_{n−1}^(m)) ,   (7.3.4b)

where the CFL number is given by λ = ∆t/∆x. Similarly for the diffusion equation (7.3.1b):

  Euler (stable for λ ≤ 1/2):
    u_n^(m+1) = u_n^(m) + λ (u_{n+1}^(m) − 2u_n^(m) + u_{n−1}^(m)) ,   (7.3.5a)

  leap-frog (2,2) (unstable):
    u_n^(m+1) = u_n^(m−1) + 2λ (u_{n+1}^(m) − 2u_n^(m) + u_{n−1}^(m)) ,   (7.3.5b)

where now the CFL number is given by λ = ∆t/∆x². The nonlinear Schrödinger equation discretizes to the following form:

  ∂u_n/∂t = −(i/2∆x²) (u_{n+1} − 2u_n + u_{n−1}) − i |u_n|² u_n .   (7.3.6)

The Euler and leap-frog (2,2) time-stepping schemes will be explored with this equation.

Figure 15: Evolution of the one-way wave equation with the leap-frog (2,2) scheme and with CFL=0.5. The stable traveling wave solution is propagated in this case to the left.

One-Way Wave Equation

We first consider the leap-frog (2,2) scheme applied to the one-way wave equation. For this case, CFL=0.5, so that stable evolution is analytically predicted. Figure 15 depicts the evolution of an initial Gaussian pulse; the solution propagates to the left, as expected from the exact solution. Figure 16 depicts the unstable evolution of the leap-frog (2,2) scheme with CFL=2 and of the Euler time-stepping scheme. The initial conditions used are identical to those in Fig. 15. Since we have predicted that the leap-frog numerical scheme is only stable provided λ < 1, it is not surprising that the figure on the left goes unstable. Likewise, the figure on the right shows the numerical instability generated in the Euler scheme. Note that both of these unstable evolutions develop high frequency oscillations which eventually blow up. The MATLAB code used to generate the leap-frog and Euler iterative solutions is given by

clear all; close all;              % clear previous figures and values

Figure 16: Evolution of the one-way wave equation using the leap-frog (2,2) scheme with CFL=2 (left) along with the Euler time-stepping scheme (right). The analysis predicts stable evolution of leap-frog provided the CFL ≤ 1; thus the onset of numerical instability near t ≈ 3 for the CFL=2 case is not surprising. Likewise, the Euler scheme is expected to be unstable for all CFL.

% initialize grid size, time, and CFL number
Time=4; L=20; n=200;
x2=linspace(-L/2,L/2,n+1);
x=x2(1:n);
dx=x(2)-x(1);
dt=0.2;
CFL=dt/dx
time_steps=Time/dt;
t=0:dt:Time;

% initial conditions
u0=exp(-x.^2)';
u1=exp(-(x+dt).^2)';
usol(:,1)=u0;
usol(:,2)=u1;

% sparse matrix for derivative term
e1=ones(n,1);
A=spdiags([-e1 e1],[-1 1],n,n);
A(1,n)=-1; A(n,1)=1;               % periodic boundaries

% leap frog (2,2) or euler iteration scheme
for j=1:time_steps-1

    u2 = u0 + CFL*A*u1;            % leap frog (2,2)
    u0 = u1; u1 = u2;              % leap frog (2,2)
  % u2 = u1 + 0.5*CFL*A*u1;        % euler
  % u1 = u2;                       % euler
    usol(:,j+2)=u2;
end

% plot the data
waterfall(x,t,usol');
map=[0 0 0]; colormap(map);
view(25,40)

% set x and y limits and fontsize
set(gca,'Xlim',[-L/2 L/2],'Xtick',[-L/2 0 L/2],'FontSize',[20]);
set(gca,'Ylim',[0 Time],'ytick',[0 Time/2 Time],'FontSize',[20]);

% set axis labels and fonts
xl=xlabel('x'); yl=ylabel('t'); zl=zlabel('u');
set(xl,'FontSize',[20]); set(yl,'FontSize',[20]); set(zl,'FontSize',[20]);
print -djpeg -r0 fig.jpg           % print jpeg at screen resolution

Heat Equation

In a similar fashion, we investigate the evolution of the diffusion equation when the space-time discretization is given by the leap-frog (2,2) scheme or Euler stepping. The numerical code used to generate these solutions follows that given previously for the one-way wave equation. However, the sparse matrix is now given by

% sparse matrix for second derivative term
e1=ones(n,1);
A=spdiags([e1 -2*e1 e1],[-1 0 1],n,n);
A(1,n)=1; A(n,1)=1;                % periodic boundaries

Further, the iterative process is now

% leap frog (2,2) or euler iteration scheme
for j=1:time_steps-1
    u2 = u0 + 2*CFL*A*u1;          % leap frog (2,2)
    u0 = u1; u1 = u2;              % leap frog (2,2)
  % u2 = u1 + CFL*A*u1;            % euler
  % u1 = u2;                       % euler
    usol(:,j+2)=u2;
end

where we recall that the CFL condition is now given by λ = ∆t/∆x²:

CFL=dt/dx/dx

This solves the one-dimensional heat equation with periodic boundary conditions. Figure 17 shows the expected diffusive behavior for the stable Euler scheme (λ ≤ 0.5). In contrast, Fig. 18 shows the numerical instabilities which are generated from violating the CFL constraint for the Euler scheme, or from using the always unstable leap-frog (2,2) scheme for the diffusion equation.
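The stable Euler diffusion run of Fig. 17 can be sketched in Python/NumPy (a dense matrix stands in for spdiags, and the grid and step count are illustrative); note that the diffusive CFL number is kept below 1/2, and that the periodic scheme conserves the total heat exactly:

```python
import numpy as np

n, L = 200, 20.0
dx = L/n
lam = 0.4                                   # diffusive CFL = dt/dx^2, under 1/2
x = np.linspace(-L/2, L/2, n, endpoint=False)
A = np.diag(-2*np.ones(n)) + np.diag(np.ones(n-1), 1) + np.diag(np.ones(n-1), -1)
A[0, -1] = A[-1, 0] = 1.0                   # periodic boundaries
u = np.exp(-x**2)
mass0 = u.sum()*dx                          # total heat, conserved by periodic diffusion
for _ in range(2000):
    u = u + lam*(A @ u)                     # Euler step, eq. (7.3.5a)
print(float(u.max()), float(u.sum()*dx))
```

The peak decays as the Gaussian spreads, while the total heat stays fixed to round-off because each row and column of A sums to zero.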

where we recall that the CFL condition is now given by λ = ∆t/∆x², i.e. CFL=dt/dx/dx. This solves the one-dimensional heat equation with periodic boundary conditions. We explore the behavior of this system for the two different discretization schemes: Euler and leap-frog (2,2). Figure 17 shows the expected diffusion behavior for the stable Euler scheme (λ ≤ 0.5): the initial Gaussian is diffused in this case. In contrast, Fig. 18 shows the numerical instabilities which are generated from violating the CFL constraint for the Euler scheme or using the always unstable leap-frog (2,2) scheme for the diffusion equation. The CFL number will be the same with both schemes, and the analysis predicts that both these schemes are unstable. Thus the onset of numerical instability is observed.

Figure 17: Stable evolution of the heat equation with the Euler scheme with CFL=0.05.

Nonlinear Schrödinger Equation

The nonlinear Schrödinger equation can easily be discretized by the above techniques. However, as with most nonlinear equations, it is a bit more difficult to perform a von Neumann analysis. Therefore, the stability of the discretization schemes will be investigated through numerical computations.
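For the linear heat equation, by contrast, the stability claims above follow directly from the von Neumann amplification factors, and they are easy to evaluate numerically. In a Python sketch (the helper name is made up for this illustration): each leap-frog Fourier mode satisfies g² + 4λ(1−cos θ)g − 1 = 0, whose roots have product −1, so for θ ≠ 0 one root always has modulus greater than one; for Euler, g = 1 − 2λ(1−cos θ), which stays in [−1,1] exactly when λ ≤ 1/2.

```python
import numpy as np

def growth_factors(lam, theta):
    """Return (leap_gmax, euler_g): the largest |g| for leap-frog (2,2)
    and the amplification factor for Euler, for u_t = u_xx, mode angle theta."""
    b = 4.0 * lam * (1.0 - np.cos(theta))
    leap = np.abs(np.roots([1.0, b, -1.0])).max()   # g^2 + b*g - 1 = 0
    euler = abs(1.0 - 0.5 * b)                      # g = 1 - 2*lam*(1-cos theta)
    return leap, euler

# leap-frog: unstable for every lam > 0 (worst mode theta = pi)
for lam in (0.05, 0.25, 0.5):
    print(lam, growth_factors(lam, np.pi)[0])       # always > 1

# Euler: stable for lam <= 1/2, unstable beyond
print(growth_factors(0.25, np.pi)[1])               # 0.0 -> stable
print(growth_factors(0.75, np.pi)[1])               # 2.0 -> unstable
```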

Figure 18: Evolution of the heat equation with the Euler time-stepping scheme (left) and leap-frog (2,2) scheme (right) with CFL=1.

The MATLAB code modifications necessary to solve the nonlinear Schrödinger equation are trivial. Specifically, only the iteration scheme requires change. For the stable leap-frog scheme, the following command structure is required

% leap frog (2,2) iteration scheme
u2 = u0 - i*CFL*A*u1 - i*2*dt*(conj(u1).*u1).*u1;
u0 = u1; u1 = u2;

Figure 19 shows the evolution of the exact one-soliton solution of the nonlinear Schrödinger equation (u(x,0) = sech(x)) over six units of time. The CFL number will be the same with both schemes (λ = 0.05). The Euler scheme is observed to lead to numerical instability whereas the leap-frog (2,2) scheme is stable. In general, the leap-frog schemes work well for wave propagation problems while Euler methods are better for problems of a diffusive nature.

Figure 19: Evolution of the nonlinear Schrödinger equation with the Euler time-stepping scheme (left) and leap-frog (2,2) scheme (right) with CFL=0.05.
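The stability contrast between the two schemes can be reproduced with a Python/NumPy sketch (not the original MATLAB; the function name, the early-exit threshold, and the sign convention i·u_t + ½u_xx + |u|²u = 0, for which sech(x)·e^{it/2} is an exact soliton, are assumptions made for this illustration):

```python
import numpy as np

def nls_run(scheme, lam=0.05, n=200, L=20.0, T=6.0):
    """Evolve i*u_t + 0.5*u_xx + |u|^2*u = 0 with periodic boundaries.
    lam = dt/dx^2; scheme is 'leapfrog' or 'euler'. Returns max |u| seen."""
    x = np.linspace(-L/2, L/2, n, endpoint=False)
    dx = x[1] - x[0]
    dt = lam * dx**2
    steps = int(round(T / dt))

    def rhs(u):                     # u_t = i*(0.5*u_xx + |u|^2 * u)
        uxx = (np.roll(u, -1) - 2.0*u + np.roll(u, 1)) / dx**2
        return 1j * (0.5 * uxx + (np.abs(u)**2) * u)

    u0 = 1.0/np.cosh(x) + 0j        # soliton at t = 0
    u1 = u0 * np.exp(0.5j * dt)     # exact soliton at t = dt
    umax = 1.0
    for _ in range(steps):
        if scheme == 'leapfrog':
            u2 = u0 + 2.0*dt*rhs(u1)
            u0, u1 = u1, u2
        else:                       # forward Euler
            u1 = u1 + dt*rhs(u1)
        umax = max(umax, float(np.abs(u1).max()))
        if umax > 1e6:              # stop once the run is clearly unstable
            break
    return umax

print(nls_run('leapfrog'))   # stays bounded: soliton propagates stably
print(nls_run('euler'))      # grows without bound
```

The Euler amplification factor for a purely dispersive mode is √(1+(∆t·ω)², which exceeds one for every mode, so the growth is slow but inevitable; the leap-frog factors sit on the unit circle for this λ.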

Note that i is automatically defined in MATLAB as i = √−1. Thus it is imperative that you do not use the variable i as a counter in your FOR loops. You will solve a very different equation if not careful with this definition.
