Ordinary Differential Equations with Scilab

WATS Lectures
Provisional notes
Université de Saint-Louis, 2004

G. Sallet
Université de Metz
INRIA Lorraine
2004
Contents

1 Introduction
2 ODEs. Basic results
2.1 General definitions
2.2 Existence and uniqueness theorems
2.3 A sufficient condition for a function to be Lipschitz
2.4 A basic inequality
2.5 Continuity and initial conditions
2.6 Continuous dependence on parameters
2.7 Semi-continuity of the definition interval
2.8 Differentiability of solutions
2.9 Maximal solution
3 Getting started
3.1 Standard form
3.2 Why solve ODEs numerically?
3.2.1 ẋ = x^2 + t
3.2.2 ẋ = x^2 + t^2
3.2.3 ẋ = x^3 + t
4 Theory
4.1 Generalities
4.2 Backward error analysis
4.3 One-step methods
4.3.1 Analysis of one-step methods
4.3.2 Conditions for a one-step method to be of order p
4.3.3 Runge-Kutta method
4.3.4 Order of RK formulas
4.3.5 RK formulas are Lipschitz
4.3.6 Local error estimation and control of stepsize
4.4 Experimentation
4.4.1 Coding the formulas
4.4.2 Testing the methods
4.4.3 Testing roundoff errors
4.5 Methods with memory
4.5.1 Linear Multistep Methods (LMM)
4.5.2 Adams-Bashforth methods (AB)
4.5.3 Adams-Moulton methods
4.5.4 BDF methods
4.5.5 Implicit methods and PECE
4.5.6 Stability region of a method
4.5.7 Implicit methods, stiff equations, implementation
5 SCILAB solvers
5.1 Using Scilab solvers, elementary use
5.2 More on Scilab solvers
5.2.1 Syntax
5.3 Options for ODE
5.3.1 itask
5.3.2 tcrit
5.3.3 h0
5.3.4 hmax
5.3.5 hmin
5.3.6 jactype
5.3.7 mxstep
5.3.8 maxordn
5.3.9 maxords
5.3.10 ixpr
5.3.11 ml, mu
5.4 Experimentation
5.4.1 Two body problem
5.4.2 Robertson problem: stiffness
5.4.3 A problem in an ozone kinetics model
5.4.4 Large systems and the method of lines
5.5 Passing parameters to functions
5.5.1 Lotka-Volterra equations, passing parameters
5.5.2 Variables in Scilab and passing parameters
5.6 Discontinuities
5.6.1 Pharmacokinetics
5.7 Event locations
5.7.1 Syntax of "roots"
5.7.2 Description
5.7.3 Precautions to be taken in using "root"
5.7.4 The bouncing ball problem
5.7.5 The bouncing ball on a ramp
5.7.6 Coulomb's law of friction
5.8 In the plane
5.8.1 Plotting vector fields
5.8.2 Using the mouse
5.9 Test functions for ODE
5.9.1–5.9.8 (untitled test problems)
5.9.9 Arenstorf orbit
5.9.10 Arenstorf orbit
5.9.11 The knee problem
5.9.12 Problem not smooth
5.9.13 The Oregonator
The content of these notes is the subject of the lectures given in August 2004, at the University Gaston Berger of Saint-Louis, in the context of the West African Training School (WATS), supported by the ICTP and CIMPA.
1 Introduction
Ordinary differential equations (ODEs) are used throughout physics, engineering, mathematics and biology to describe how quantities change with time. The lectures given by Professors Lobry and Sari last year introduced the basic concepts of ODEs. This lecture is concerned with solving ODEs numerically.
Scilab is used to solve the problems presented and also to make mathematical experiments. Scilab is a very convenient problem solving environment (PSE) with quality solvers for ODEs. Scilab is a high-level programming language. Programs in Scilab are short, making it practical to list complete programs.
Scilab was developed at INRIA for system control and signal processing applications. It is freely distributed in source code format.
Scilab is made of three distinct parts: an interpreter, libraries of functions (Scilab procedures) and libraries of Fortran and C routines. These routines (which, strictly speaking, do not belong to Scilab but are interactively called by the interpreter) are of independent interest and most of them are available through Netlib. A few of them have been slightly modified for better compatibility with Scilab's interpreter.
Scilab is similar to MATLAB, which is not freely distributed, and shares many of its features. Scilab has an inherent ability to handle matrices (basic matrix manipulation, concatenation, transpose, inverse, etc.). Scilab has an open programming environment where the creation of functions and libraries of functions is completely in the hands of the user.
The objective of this lecture is to give the reader a tool, Scilab, to make mathematical experiments. We apply this to the study of ODEs; this can be viewed as an example, since many other areas of mathematics can benefit from this technique.
These notes are intended to be used directly with a computer. You just need a computer on which Scilab is installed, and some notions about the use of a computer: for example how to use a text editor, or how to save a file.
Scilab can be downloaded freely from http://www.scilab.org. On this Web site you will also find references and manuals. A very good introduction to Scilab is "Une introduction à Scilab" by Bruno Pinçon, which can be downloaded from this site.
We advise the reader to read these notes, and to check all the computations presented. We use freely the techniques of Scilab; we refer to the lectures by A. Sene associated with this course. All the routines are either given in the text or given in the appendix. All the routines have been executed and proved to be working. Except for possible errors made in "cutting and pasting", the routines in these lectures are the original ones.
For the convenience of the reader, we recall the classical results for ODEs and take this opportunity to define the notations used throughout these lectures.
2 ODEs. Basic results
2.1 General Definitions
Definition 2.1 :
A first order differential equation is an equation

ẋ = f(t, x)   (1)

In this definition t represents time. The variable x is an unknown function from R with values in R^n. The notation ẋ traditionally denotes the derivative with respect to time t. The function f is a map, defined on an open set Ω of R × R^n, with values in R^n. The variable x is called the state of the system.
Definition 2.2 :
A solution of (1) is a differentiable function x, defined on an interval I of R,

x : I ⊂ R → R^n

which satisfies, on I,

(d/dt) x(t) = f(t, x(t))

A solution is thus a time-parametrized curve, evolving in R^n, such that the tangent at the point x(t) of this curve is equal to the vector f(t, x(t)).
2.2 Existence and uniqueness Theorems
From the title of this section you might imagine that this is just another example of mathematicians being fussy. But it is not: it is about whether you will be able to solve a problem at all and, if you can, how well. In this lecture we'll see a good many examples of physical problems that do not have solutions for certain values of parameters. We'll also see physical problems that have more than one solution. Clearly we'll have trouble computing a solution that does not exist, and if there is more than one solution then we'll have trouble computing the "right" one. Although there are mathematical results that guarantee that a problem has one and only one solution, there is no substitute for an understanding of the phenomena being modeled.
Definition 2.3 (Lipschitz function) :
A function f defined on an open set U of R × R^n with values in R^n is said to be Lipschitz with respect to x, uniformly relatively to t, on U if there exists a constant L such that for any pair (t, x) ∈ U, (t, y) ∈ U we have

|f(t, x) - f(t, y)| ≤ L |x - y|
Definition 2.4 (Locally Lipschitz functions) :
A function defined on an open set Ω of R × R^n with values in R^n is said to be locally Lipschitz with respect to x, uniformly relatively to t, on Ω if for any pair (t_0, x_0) ∈ Ω there exist an open set U ⊂ Ω containing (t_0, x_0) and a constant L_U such that f is Lipschitz with constant L_U in U.
Remark 2.1 :
Since all the norms on R^n are equivalent, it is straightforward to see that this definition does not depend on the norm. Only the constant L depends on the norm.
Theorem 2.1 (Cauchy-Lipschitz) :
Let f be a function defined on an open set Ω of R × R^n, continuous on Ω, with values in R^n, and locally Lipschitz with respect to x, uniformly relatively to t, on Ω. Then for any point (t_0, x_0) of Ω there exists a solution of the ODE

ẋ = f(t, x)

defined on an open interval I of R containing t_0 and satisfying the initial condition

x(t_0) = x_0

Moreover if y(t) denotes another solution, defined on an open interval J containing t_0, then

x(t) = y(t) on I ∩ J
Definition 2.5 :
An equation composed of a differential equation with an initial condition

ẋ = f(t, x),  x(t_0) = x_0   (2)

is called a Cauchy problem.
A theorem of Peano gives existence, but cannot ensure uniqueness:

Theorem 2.2 (Peano 1890) :
If f is continuous, any point (t_0, x_0) of Ω is contained in at least one solution of the Cauchy problem (2).
Definition 2.6 A solution x(t) of the Cauchy problem, defined on an open interval I, is said to be maximal if, for any other solution y(t) defined on J such that x(t) = y(t) for at least one t ∈ I ∩ J, we have J ⊂ I.
We can now give

Theorem 2.3 :
If f is locally Lipschitz in x on Ω, then any point (t_0, x_0) is contained in a unique maximal solution.
2.3 A sufficient condition for a function to be Lipschitz
If a function is sufficiently smooth, then it is Lipschitz. In other words, if the RHS (right hand side) of the ODE is sufficiently regular, we have existence and uniqueness of the solution. More precisely we have
Proposition 2.1 :
If the function f of the ODE (1) is continuous with respect to (t, x) and is
continuously differentiable with respect to x then f is locally Lipschitz.
2.4 A basic inequality
Definition 2.7 A differentiable function z(t) is said to be an ε–approximate solution of the ODE (1) if for any t where z(t) is defined we have

|ż(t) - f(t, z(t))| ≤ ε
Theorem 2.4 :
Let z_1(t) and z_2(t) be two approximate solutions of (1), where f is Lipschitz with constant L. If the solutions are respectively ε_1 and ε_2 approximate, we have the inequality

|z_1(t) - z_2(t)| ≤ |z_1(t_0) - z_2(t_0)| e^{L|t-t_0|} + ((ε_1 + ε_2)/L) (e^{L|t-t_0|} - 1)   (3)
2.5 Continuity and initial conditions
Definition 2.8 :
If f is locally Lipschitz we have existence and uniqueness of solutions. We shall denote, when there is existence and uniqueness of the solution, by

x(t, x_0, t_0)

the unique solution of

ẋ = f(t, x),  x(t_0) = x_0
Remark 2.2 :
The notation x(t, x_0, t_0) used here is somewhat different from the notation often used in the literature, where usually the I.V. is written (t_0, x_0). We reverse the order deliberately, since the syntax of the solvers in Scilab is

--> ode(x0,t0,t,f)

Since this course is devoted to ODEs and Scilab, we choose, for convenience, to respect the order of Scilab.
The syntax in Matlab is
>> ode(@f, [t0,t],x0)
Proposition 2.2 :
With the hypotheses of the Cauchy-Lipschitz theorem the solution x(t, x_0, t_0) is continuous with respect to x_0. The solution is locally Lipschitz with respect to the initial starting point:

|x(t, x, t_0) - x(t, x_0, t_0)| ≤ e^{L|t-t_0|} |x - x_0|
2.6 Continuous dependence on parameters
We consider the following problem

ẋ = f(t, x, λ),  x(t_0) = x_0

where f is supposed continuous with respect to (t, x, λ) and Lipschitz in x uniformly relatively to (t, λ).

Proposition 2.3 With the preceding hypotheses the solution x(t, x_0, t_0, λ) is continuous with respect to (t, λ).
2.7 Semi-continuity of the definition interval
We have seen, in the Lipschitz case, that x(t, x_0, t_0) is defined on a maximal open interval ]α(x_0), ω(x_0)[ containing t_0.

Theorem 2.5 : Let (t_0, x_0) be a point of the open set Ω and I a compact interval on which the solution is defined. Then there exists an open set U containing x_0 such that for any x in U the solution x(t, x, t_0) is defined on I.

This implies, for example, that ω(x) is lower semi-continuous.
2.8 Differentiability of solutions
Theorem 2.6 We consider the system

ẋ = f(t, x),  x(t_0) = x_0

We suppose that f is continuous on Ω and such that the partial derivative (∂f/∂x)(t, x) is continuous with respect to (t, x). Then the solution x(t, x_0, t_0) is differentiable with respect to the initial condition x_0, and this derivative satisfies the linear differential equation

(d/dt) (∂/∂x_0) x(t, x_0, t_0) = (∂f/∂x)(t, x(t, x_0, t_0)) · (∂/∂x_0) x(t, x_0, t_0)
2.9 Maximal solution
We have seen that to any point (t_0, x_0) is associated a maximal open interval: ]α(t_0, x_0), ω(t_0, x_0)[. What happens when t → ω(t_0, x_0)?

Theorem 2.7 For any compact set K of Ω containing the point (t_0, x_0), (t, x(t, x_0, t_0)) does not stay in K when t → ω(t_0, x_0).

In particular, when ω(t_0, x_0) < ∞ we see that x(t, x_0, t_0) cannot stay in any compact set.
3 Getting started
3.1 Standard form
The equation

ẋ = f(t, x)

is called the standard form for an ODE. It is not only convenient for the theory, it is also important in practice. In order to solve an ODE problem numerically, you must write it in a form acceptable to your code. The most common form accepted by ODE solvers is the standard form.
Scilab solvers also accept ODEs in the more general form

A(t, x) ẋ = f(t, x)

where A(t, x) is a non-singular mass matrix, and in the implicit form

g(t, x, ẋ) = 0

In either form the ODEs must be formulated as a system of first-order equations. The classical way to do this, for a differential equation of order n, is to introduce new variables:

x_1 = ẋ,  x_2 = ẍ,  ...,  x_{n-1} = x^{(n-1)}   (4)
Example 3.1 :
We consider the classical differential equation of the mathematical pendulum:

ẍ = -sin(x),  x(0) = a,  ẋ(0) = b   (5)

This ODE is equivalent to

ẋ_1 = x_2,  ẋ_2 = -sin(x_1),  x_1(0) = a,  x_2(0) = b   (6)
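System (6) can be integrated directly with the Scilab solver; here is a minimal sketch (the function name pendulum and the chosen values of a and b are ours):

-->deff('xdot=pendulum(t,x)','xdot=[x(2);-sin(x(1))]')

-->a=1; b=0; x0=[a;b]; t0=0;

-->t=linspace(0,20,400);

-->x=ode(x0,t0,t,pendulum); // column i of x is the state at time t(i)

-->plot2d(t',x(1,:)') // the angle x1(t) versus time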
3.2 Why solve ODEs numerically?
The ODEs for which you can get an analytical expression of the solution are the exception rather than the rule. Indeed most ODEs encountered in practice have no analytical solution. Even simple differential equations can have complicated solutions.
The solution of the pendulum ODE is given by means of the Jacobi elliptic functions sn and cn, and the solution of the ODE of the pendulum with friction cannot be expressed by known analytical functions.
Let us consider some simple ODEs on R.
3.2.1 ẋ = x^2 + t
The term in x^2 prevents the use of the method of variation of constants. Given to a computer algebra system (Mathematica, Maple, MuPAD, ...), this ODE

DSolve[x'[t] == x[t]^2 + t, x[t], t]

has for solution x(t) = N/D, where

N = -C J_{-1/3}((2/3) t^{3/2}) + t^{3/2} [ J_{-2/3}((2/3) t^{3/2}) - C J_{-4/3}((2/3) t^{3/2}) + C J_{2/3}((2/3) t^{3/2}) ]

and

D = 2t [ J_{1/3}((2/3) t^{3/2}) + C J_{-1/3}((2/3) t^{3/2}) ]
where the functions J_α(x) are Bessel functions. The Bessel functions J_n(t) and Y_n(t) are linearly independent solutions of the differential equation t^2 ẍ + t ẋ + (t^2 - n^2) x = 0. J_n(t) is often called the Bessel function of the first kind, or simply the Bessel function.
3.2.2 ẋ = x^2 + t^2
Given to a computer algebra system, this ODE

DSolve[x'[t] == x[t]^2 + t^2, x[t], t]

has for proposed solution x(t) = N/D, where

N = -C J_{-1/4}(t^2/2) - t^2 [ 2 J_{-3/4}(t^2/2) + C J_{-5/4}(t^2/2) - C J_{3/4}(t^2/2) ]

and

D = 2t [ J_{1/4}(t^2/2) + C J_{-1/4}(t^2/2) ]
3.2.3 ẋ = x^3 + t
This time Mathematica has no answer:

In[3] := DSolve[x'[t] == x[t]^3 + t, x[t], t]

Out[3] = DSolve[x'[t] == x[t]^3 + t, x[t], t]
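Even so, a numerical solution is immediate in Scilab. A minimal sketch (the I.V. x(0) = 0 and the interval [0, 1] are ours, chosen to stay away from the blow-up of the solution):

-->deff('xdot=fcube(t,x)','xdot=x^3+t')

-->t=linspace(0,1,101);

-->x=ode(0,0,t,fcube); // I.V. x(0)=0

-->plot2d(t',x')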
4 Theory
For a Cauchy problem with existence and uniqueness,

ẋ = f(t, x),  x(t_0) = x_0   (7)

a discrete variable method starts with the given value x_0 and produces an approximation x_1 of the solution x(t_1, x_0, t_0) evaluated at time t_1. This is described as advancing the integration by a step of size h_0 = t_1 - t_0. The process is then repeated, producing approximations x_j of the solution evaluated on a mesh

t_0 < t_1 < ... < t_i < ... < t_f

where t_f is the final point of the integration.
Usually the contemporary good quality codes choose their steps to control the error; the error tolerance is set by the user or has a default value.
Moreover some codes are supplemented with methods for approximating the solution between mesh points. The mesh points are chosen by the solver to satisfy the tolerance requirements, and by polynomial interpolation a function is generated that approximates the solution everywhere on the interval of integration. This is called continuous output. Generally the continuous approximation has the same error tolerance as the discrete method. It is important to know how a code produces answers at specific points, so as to select an appropriate method for solving the problem with the required accuracy.
4.1 Generalities
Definition 4.1 :
We define the local error, at time t_{n+1} of the mesh, using the notation of definition (2.8), by

e_n = x_{n+1} - x(t_{n+1}, x_n, t_n)

and the size of the local error by

le_n = |x_{n+1} - x(t_{n+1}, x_n, t_n)|
The method gives an approximation x_n at time t_n. The next step starts from (t_n, x_n) to produce x_{n+1}. The local error measures the error produced by the step, and ignores the error possibly already accumulated in x_n. A solver tries to control the local error. This controls the global error

|x_{n+1} - x(t_{n+1}, x_0, t_0)|

only indirectly, since the propagation of error can be upper bounded by

|x(t_{n+1}, x_0, t_0) - x_{n+1}| ≤ |x(t_{n+1}, x_n, t_n) - x_{n+1}| + |x(t_{n+1}, x_0, t_0) - x(t_{n+1}, x_n, t_n)|

The first term in the right hand side of the inequality is the local error, controlled by the solver; the second is the difference of two solutions evaluated at the same time, but with different I.V. The concept of stability of the solutions of an ODE is thus essential: two solutions starting from nearby initial values must stay close. The basic inequality (3) is of interest for this issue. This inequality tells us that if L|t - t_0| is small, the ODE has some kind of stability. The Lipschitz constant on the integration interval plays a critical role. A big L implies a small step size to obtain some accuracy in the result. If the ODE is unstable the numerical problem can be ill-posed.
We also speak of a well conditioned problem. We shall give two examples of what can happen. Before that, we give a definition.

Definition 4.2 (convergent method) :
A discrete method is said to be convergent if on any interval [t_0, t_f], for any solution x(t, x_0, t_0) of the Cauchy problem

ẋ = f(t, x),  x(t_0) = x_0

if we denote by n the number of steps, i.e. t_f = t_n, and by x̃_0 the first approximation of the method to the starting value x_0, then

|x_n - x(t_f)| → 0

when max_i h_i → 0 and x̃_0 → x_0.

It would be equivalent to introduce max_i |x_i - x(t_i)|. Our definition considers the last step; the alternative is to consider the maximum error over all the steps. Since the definition holds for any integration interval, the two are equivalent.
We now give two examples of unstable ODE problems.
Example 4.1 :
We consider the Cauchy problem

ẋ_1 = x_2,  ẋ_2 = 100 x_1,  x_1(0) = 1,  x_2(0) = -10   (8)

The analytic solution is x(t) = e^{-10t}; the I.V. have been chosen for that purpose. We now try to get numerical solutions with various solvers implemented in SCILAB. Some commands will be explained later on. In particular, for the default method in SCILAB, we must augment the number of steps allowed; this is done by ODEOPTIONS, which we shall see in the corresponding section.

-->;exec("/Users/sallet/Documents/Scilab/ch4ex1.sci");

-->x0=[1;-10];

-->t0=0;
-->%ODEOPTIONS=[1,0,0,%inf,0,2,3000,12,5,0,-1,-1];
-->ode(x0,t0,3,ch4ex1)
ans =
! 0.0004145 !
! 0.0041445 !
-->ode('rk',x0,t0,3,ch4ex1)
ans =
! 0.0024417 !
! 0.0244166 !
-->ode('rkf',x0,t0,3,ch4ex1)
ans =
! - 0.0000452 !
! - 0.0004887 !
-->ode('adams',x0,t0,3,ch4ex1)
ans =
! - 0.0045630 !
! - 0.0456302 !
-->exp(-30)
ans =
9.358E-14
All the results obtained are terrible, with LSODA, RK, RKF and ADAMS. With RKF and ADAMS the value is even negative! All the codes are of the highest quality and all fail to give a sound result. The reason is not in the code, but in the ODE. The system is a linear system ẋ = Ax with

A = [ 0    1 ;
      100  0 ]

The eigenvalues of A are 10 and -10. As seen in last year's lectures, the solutions are linear combinations of e^{10t} and e^{-10t}. The picture is a saddle (or mountain pass). The solution starting at (1, -10) is effectively e^{-10t}. This is one of the two solutions going to the equilibrium (the pass of the saddle), but any other solution goes toward infinity. The ODE is unstable. Then when a solver gives an approximation which is not on the trajectory, the trajectories go apart. The problem is ill conditioned. Moreover, it is simple to check that the Lipschitz constant is L = 100 for the Euclidean norm (compute norm(A) in Scilab...). The problem is doubly ill conditioned.
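This can be checked directly in Scilab:

-->A=[0 1;100 0];

-->spec(A) // the eigenvalues: -10 and 10

-->norm(A) // the Euclidean operator norm: 100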
Example 4.2 :
We consider the Cauchy problem

ẋ_1 = x_2,  ẋ_2 = t x_1,  x_1(0) = 1/(3^{2/3} Γ(2/3)),  x_2(0) = -1/(3^{1/3} Γ(1/3))   (9)

This simple equation has for solution the special Airy function Ai(t). With the help of Mathematica:

DSolve[{x'[t] == y[t], y'[t] == t x[t]}, {x[t], y[t]}, t]

which gives

x(t) = C_1 Ai(t) + C_2 Bi(t)
y(t) = C_1 Ai'(t) + C_2 Bi'(t)

The functions Ai and Bi are two independent solutions of the general equation ẍ = t x. The Airy functions often appear as the solutions to boundary value problems in electromagnetic theory and quantum mechanics.
The Airy functions are related to the modified Bessel functions of order 1/3. It is known that Ai(t) tends to zero for large positive t, while Bi(t) increases unboundedly. One checks that the I.V. are such that Ai(t) is the solution of the Cauchy problem.
Now the situation is similar to the preceding example; simply, this example requires references to special functions, so it is not as immediate as the preceding example.
This equation is a linear nonautonomous equation ẋ = A(t) x, with

A(t) = [ 0  1 ;
         t  0 ]

The matrix norm associated with the Euclidean norm is max(t, 1) (for the same reason as in the preceding example). The Lipschitz constant on [0, 1] is therefore L = 1, which from inequality (3) gives an upper bound of order e; this is quite reasonable. On the interval [0, 10] the Lipschitz constant is 10, and we obtain an upper bound of e^{100}. We can anticipate numerical problems.
We have the formulas (see Abramowitz and Stegun)

Ai(t) = (1/π) √(t/3) K_{1/3}((2/3) t^{3/2})

and

Bi(t) = √(t/3) [ I_{-1/3}((2/3) t^{3/2}) + I_{1/3}((2/3) t^{3/2}) ]

where I_n(t) and K_n(t) are respectively the first and second kind modified Bessel functions. The Bessel functions K and I are built-in in Scilab (type help bessel in the command window). So we can now look at the solutions.
We first look at the solution for t_f = 1, integrating on [0, 1]. In view of the upper bound on the error for the Airy ODE, the results should be correct.
-->deff('xdot=airy(t,x)','xdot=[x(2);t*x(1)]')

-->t0=0;

-->x0(1)=1/((3^(2/3))*gamma(2/3));

-->x0(2)=-1/((3^(1/3))*gamma(1/3));
-->X1=ode('rk',x0,t0,1,airy)
X1 =
! 0.1352924 !
! - 0.1591474 !
-->X2=ode(x0,t0,1,airy)
X2 =
! 0.1352974 !
! - 0.1591226 !
-->X3=ode('adams',x0,t0,1,airy)
X3 =
! 0.1352983 !
! - 0.1591163 !
-->trueval=(1/%pi)*sqrt(1/3)*besselk(1/3,2/3)
trueval =
0.1352924
The results are satisfactory. The best result is given by the 'rk' method. Now we shall look at the results for an integration interval of [0, 10].
-->X1=ode('rk',x0,t0,10,airy)
X1 =
1.0E+08 *
! 2.6306454 !
! 8.2516987 !
-->X2=ode(x0,t0,10,airy)
X2 =
1.0E+08 *
! 2.6281018 !
! 8.2437119 !
-->X3=ode('adams',x0,t0,10,airy)
X3 =
1.0E+08 *
! 2.628452 !
! 8.2448118 !
-->trueval=(1/%pi)*sqrt(10/3)*besselk(1/3,(2/3)*10^(3/2))
trueval =
1.105E-10
This time the result is terrible! The three solvers agree on 3 digits, around 2.63×10^8, but the real value is 1.105×10^{-10}. The failure was predictable. The analysis is the same as for the preceding example.
4.2 Backward error analysis
A classical view of the error of a numerical method is called backward error analysis. A method for solving an ODE with I.V. (an IVP) approximates the solution x(t_n, x_0, t_0) of the Cauchy problem

ẋ = f(t, x),  x(t_0) = x_0

We can choose a function z(t) which interpolates the numerical solution. For example we can choose a polynomial function which goes through the points (t_i, x_i) with derivative f(t_i, x_i) at these points. This can be done by a Hermite interpolating polynomial of degree 3. The function z then has a continuous derivative. We can now look at

r(t) = ż(t) - f(t, z(t))

In other words the numerical solution z is the solution of the modified problem

ẋ = f(t, x) + r(t)

Backward error analysis considers the size of this perturbation. Well conditioned problems give small perturbations.
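As an illustration, here is a sketch of such a cubic Hermite interpolant on one mesh interval [t1, t2]: z matches x1, x2 and the slopes s1 = f(t1, x1), s2 = f(t2, x2), and zdot allows the residual r(t) = ż(t) - f(t, z(t)) to be evaluated. The function name hermite is ours, and the coefficients are the standard Hermite basis polynomials in the normalized time u:

function [z,zdot]=hermite(t,t1,x1,s1,t2,x2,s2)
// cubic Hermite interpolation of a scalar solution on [t1,t2]
h=t2-t1
u=(t-t1)/h // normalized time in [0,1]
z=(1+2*u)*(1-u)^2*x1+u*(1-u)^2*h*s1+u^2*(3-2*u)*x2+u^2*(u-1)*h*s2
zdot=(6*u^2-6*u)*x1/h+(3*u^2-4*u+1)*s1+(6*u-6*u^2)*x2/h+(3*u^2-2*u)*s2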
4.3 One–step methods
One-step methods are methods that use only information gathered at the current mesh point t_n. They are expressed in the form

x_{n+1} = x_n + h_n Φ(t_n, x_n, h_n)

where the function Φ depends on the mesh point x_n at time t_n and on the next time step h_n = t_{n+1} - t_n.
In the sequel we restrict ourselves, for simplicity of the exposition, to one-step methods with constant step size. The results we shall give are also true for variable step size, but the proofs are more complicated.
Example 4.3 (Euler method) :
If we choose

Φ(t_n, x_n, h_n) = f(t_n, x_n)

we have the extremely famous Euler method.

Example 4.4 (Improved Euler method) :
If we choose

Φ(t_n, x_n, h_n) = (1/2) [ f(t_n, x_n) + f(t_n + h_n, x_n + h_n f(t_n, x_n)) ]

This method is known as the improved Euler method or also as Heun's method. Here Φ genuinely depends on (t_n, x_n, h_n).
4.3.1 Analysis of one-step methods
We shall analyze the error and the propagation of error.

Definition 4.3 (Order of a method) :
A method is said to be of order p if there exists a constant C such that at any point of the approximation mesh we have

le_n ≤ C h^{p+1}

That means that the local error at any approximation point is O(h^{p+1}).
For a one-step method

le_n = |x(t_{n+1}, x_n, t_n) - x_n - h Φ(t_n, x_n, h)|

In practice the computation of x_{n+1} is corrupted by roundoff errors; in other words the method restarts with x̃_{n+1} in place of x_{n+1}. In view of these errors we shall establish a lemma.
Lemma 4.1 (propagation of errors) :
We consider a one-step method with constant step. We suppose that the function Φ(t, x, h) is Lipschitz with constant Λ relatively to x, uniformly with respect to h and t. We consider two sequences x_n and x̃_n defined by

x_{n+1} = x_n + h Φ(t_n, x_n, h)
x̃_{n+1} = x̃_n + h Φ(t_n, x̃_n, h) + ε_n

We suppose that |ε_n| ≤ ε. Then we have the inequality

|x̃_{n+1} - x_{n+1}| ≤ e^{Λ|t_{n+1}-t_0|} ( |x̃_0 - x_0| + (n+1) ε )

Proof
We have, using that Φ is Lipschitz with constant Λ:

|x̃_{n+1} - x_{n+1}| ≤ (1 + hΛ) |x̃_n - x_n| + ε

A simple induction gives

|x̃_{n+1} - x_{n+1}| ≤ (1 + hΛ)^{n+1} |x̃_0 - x_0| + ε ((1 + hΛ)^{n+1} - 1)/(hΛ) ≤ (1 + hΛ)^{n+1} |x̃_0 - x_0| + ε (e^{(n+1)hΛ} - 1)/(hΛ)

Now using (1 + x) ≤ e^x and t_{n+1} - t_0 = (n+1) h, setting for shortness T = t_{n+1} - t_0, we have:

|x̃_{n+1} - x_{n+1}| ≤ e^{ΛT} ( |x̃_0 - x_0| + (n+1) ε (1 - e^{-ΛT})/(ΛT) )

Remarking that (1 - e^{-x})/x ≤ 1 gives the result.
Now let us consider any solution of the Cauchy problem, for example x(t, x̃_0, t_0). Set z(t) = x(t, x̃_0, t_0). We have

z(t_{n+1}) = z(t_n) + h Φ(t_n, z(t_n), h) + e_n

where e_n is evidently

e_n = z(t_{n+1}) - z(t_n) - h Φ(t_n, z(t_n), h)

By definition le_n = |e_n|. We see that for any solution of the Cauchy problem we can define x̃_n = x(t_n, x̃_0, t_0), since this sequence satisfies the conditions of the lemma.
With this lemma we can give an upper bound for the global error, since we can set x̃_n = x(t_n, x_0, t_0) (we choose directly the good I.V.). Without roundoff error, we know that

|e_n| = le_n ≤ C h^{p+1}

Then we see, using the lemma, that the final error x(t_f, x_0, t_0) - x_{n+1} is upper bounded by e^{ΛT} (n+1) C h^{p+1}; but since by definition T = (n+1) h, t_{n+1} - t_0 = T and t_f = t_{n+1}, we obtain

|x(t_f, x_0, t_0) - x_{n+1}| ≤ e^{ΛT} C T h^p

The accumulation of the local errors gives an error of order p at the end of the integration interval. This justifies the name of order p methods. We have also proved that any method of order p ≥ 1 is convergent. This is evidently theoretical, since with a computer the sizes of the steps are necessarily finite. But this is the minimum we can ask of a method.
If we add the rounding errors to the theoretical error e_n, and if we suppose that the roundings are upper bounded by ε (in fact, it is the relative error, but the principle is the same), we have

x̃_{n+1} = x̃_n + h (Φ(t_n, x̃_n, h) + α_n) + β_n = x̃_n + h Φ(t_n, x̃_n, h) + h α_n + β_n

We have ε_n = h α_n + β_n + e_n, where α_n is the roundoff error made in computing Φ and β_n the roundoff error in the computation of x̃_{n+1}. If we suppose that we know upper bounds M_1 and M_2 for α_n and β_n, then, taking into consideration the theoretical error and using the lemma, a simple computation shows that the error E(h) has an expression of the kind

E(h) = e^{ΛT} ( |x̃_0 - x_0| + C T h^p + T M_1 + M_2 T / h )
So we can write

E(h) = K_1 + e^{ΛT} T ( C h^p + M_2/h )

This shows that the error has a minimum for a step size of

h_opt = ( M_2 / (p C) )^{1/(p+1)}

which gives an optimal number of steps for an order p method:

N_opt = T ( p C / M_2 )^{1/(p+1)}   (10)

In other words finite precision arithmetic ruins accuracy when too many steps are taken: beyond a certain point the accuracy diminishes. We shall verify this fact experimentally.
To be precise, the error is relative, i.e. in a computer the result x̄ of the computation of a quantity x satisfies

|x̄ - x| / |x| ≤ u

where for IEEE arithmetic u = 1.2×10^{-16}. Since we are integrating on compact intervals, the solutions x(t) involved are bounded, and since Φ is continuous, Φ(t, x, h) is also bounded. This means that we must have an idea of the size of x(t) and of the corresponding Φ. We shall meet this kind of problem in the sequel, when we examine absolute error tolerance and relative error tolerance.
4.3.2 Conditions for a one-step method to be of order p
We shall give a necessary and sufficient condition for a method to be of order p. Since the local error is

le_n = |x_{n+1} - x(t_{n+1}, x_n, t_n)|

we set

e_n = x(t_{n+1}, x_n, t_n) - x_n - h Φ(t_n, x_n, h)

If we suppose that Φ is of class C^p, the Taylor formula gives
Φ(t, x, h) = Σ_{i=0}^{p} (h^i / i!) (∂^i Φ/∂h^i)(t, x, 0) + o(h^p)

If we suppose f of class C^p,

x(t_{n+1}, x_n, t_n) - x_n = x(t_{n+1}, x_n, t_n) - x(t_n, x_n, t_n)

Once again by the Taylor formula, denoting x(t) = x(t, x_n, t_n),

x(t_{n+1}, x_n, t_n) - x_n = Σ_{j=1}^{p+1} (h^j / j!) (d^j x/dt^j)(t_n) + o(h^{p+1})
Now since x(t) is a solution of the differential equation we have

(dx/dt)(t_n) = f(t_n, x_n)

(d^2x/dt^2)(t_n) = (∂f/∂t)(t_n, x_n) + (∂f/∂x)(t_n, x_n) f(t_n, x_n)
To continue, we denote by f^{[j]} the successive total derivatives of f along the solution:

f^{[j]}(t, x(t)) = (d^j/dt^j) [ f(t, x(t)) ]

Then

f^{[2]}(t, x) = (∂^2f/∂t^2)(t, x) + 2 (∂^2f/∂t∂x)(t, x) f(t, x) + (∂^2f/∂x^2)(t, x) (f(t, x), f(t, x)) + (∂f/∂x)(t, x) f^{[1]}(t, x)

The computation rapidly becomes involved, and special techniques, using graph theory, must be developed. But the computation can be made (at least theoretically). Then
x(t_{n+1}, x_n, t_n) - x_n = Σ_{j=1}^{p+1} (h^j / j!) f^{[j-1]}(t_n, x_n) + o(h^{p+1})

Hence we deduce immediately (beware of the shift of indexes)

e_n = Σ_{k=1}^{p+1} [ (h^k / k!) f^{[k-1]}(t_n, x_n) - (h^k / (k-1)!) (∂^{k-1}Φ/∂h^{k-1})(t_n, x_n, 0) ] + o(h^{p+1})
We have proven

Proposition 4.1 (order of a method) :
We consider a method of class C^p. The method is of order at least p+1 (beware of the shift of indexes) iff Φ satisfies, for k = 1 : p+1,

(1/k) f^{[k-1]}(t, x) = (∂^{k-1}Φ/∂h^{k-1})(t, x, 0)

Corollary 4.1 (order 1 method) :
A method is of order at least 1 iff

f(t, x) = Φ(t, x, 0)

The Euler method is then of order 1.

Corollary 4.2 (order 2 method) :
A method is of order at least 2 iff

f(t, x) = Φ(t, x, 0)

and

(1/2) [ (∂f/∂t)(t, x) + (∂f/∂x)(t, x) f(t, x) ] = (∂Φ/∂h)(t, x, 0)

Check that Heun's method is of order 2; a quick verification follows.
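For Heun's method,

Φ(t, x, h) = (1/2) [ f(t, x) + f(t + h, x + h f(t, x)) ]

so Φ(t, x, 0) = f(t, x), and differentiating with respect to h at h = 0 (chain rule on the second term):

(∂Φ/∂h)(t, x, 0) = (1/2) [ (∂f/∂t)(t, x) + (∂f/∂x)(t, x) f(t, x) ]

which is exactly the condition of corollary (4.2): Heun's method is of order at least 2.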
4.3.3 Runge-Kutta method
The Runge-Kutta methods are one-step methods. The principle of RK methods is to collect information around the last approximation, to define the next step. Two RK methods are implemented in Scilab: "rk" and "rkf".
To describe a RK method we define Butcher arrays.

Definition 4.4 (Butcher Array) :
A Butcher array of size (k+1) × (k+1) is an array

c_1 |  0
c_2 | a_{2,1}   0
c_3 | a_{3,1}  a_{3,2}   0
 :  |    :        :            :
c_k | a_{k,1}  a_{k,2}  ...  a_{k,k-1}   0
----+---------------------------------------
    |  b_1      b_2     ...   b_{k-1}    b_k

the c_i coefficients satisfying

c_i = Σ_{j=1}^{i-1} a_{i,j}

A Butcher array is thus composed of k(k-1)/2 + k data: the a_{i,j} and the b_i, the c_i being determined by the a_{i,j}.
We can now describe a k-stage Runge-Kutta formula.

Definition 4.5 (k-stage RK formula) :
A k-stage RK formula is defined by a (k+1) × (k+1) Butcher array. If an approximation (t_n, x_n) has been obtained, the next step (t_{n+1}, x_{n+1}) is given by the following algorithm.
First we define k intermediate stages, by finite induction, which gives points x_{n,i} and slopes s_i:

x_{n,1} = x_n                                    s_1 = f(t_n, x_{n,1})
x_{n,2} = x_n + h a_{2,1} s_1                    s_2 = f(t_n + c_2 h, x_{n,2})
x_{n,3} = x_n + h (a_{3,1} s_1 + a_{3,2} s_2)    s_3 = f(t_n + c_3 h, x_{n,3})
...
x_{n,k} = x_n + h Σ_{i=1}^{k-1} a_{k,i} s_i      s_k = f(t_n + c_k h, x_{n,k})

The next step is then defined by

x_{n+1} = x_n + h Σ_{i=1}^{k} b_i s_i

The intermediate points x_{n,i} are associated with the times t_n + c_i h.
We shall give a fairly complete set of examples, since these RK formulas are used by high quality codes and can be found in the routines.
Example 4.5 (Euler RK1) :
The Euler method is a RK formula with associated Butcher array

0 | 0
--+---
  | 1

Example 4.6 (Heun RK2) :
The Heun method (or improved Euler) is a RK formula with associated Butcher array

0 |
1 | 1
--+----------
  | 1/2  1/2
Example 4.7 (Midpoint RK2) :
The midpoint method, also named the modified Euler formula, is a RK formula with associated Butcher array

0   |
1/2 | 1/2
----+---------
    |  0   1
Example 4.8 (Classical RK4) :
The classical RK4 formula has associated Butcher array

0   |
1/2 | 1/2
1/2 |  0   1/2
1   |  0    0    1
----+---------------------
    | 1/6  2/6  2/6  1/6
Example 4.9 (Bogacki-Shampine pair, BS(2,3)) :
This is the first example of an embedded RK formula. We shall come back to this matter later.

0   |
1/2 | 1/2
3/4 |  0    3/4
1   | 2/9   1/3   4/9
----+-----------------------
    | 7/24  1/4   1/3   1/8
Example 4.10 (Dormand-Prince pair DOPRI5) :
This is the second example of an embedded RK formula.

0    |
1/5  | 1/5
3/10 | 3/40        9/40
4/5  | 44/45      -56/15       32/9
8/9  | 19372/6561 -25360/2187  64448/6561  -212/729
1    | 9017/3168  -355/33      46732/5247   49/176   -5103/18656
1    | 35/384      0           500/1113    125/192   -2187/6784   11/84
-----+-------------------------------------------------------------------------
     | 5179/57600  0           7571/16695  393/640   -92097/339200  187/2100  1/40
Example 4.11 (an ode23 formula) :
This example is the embedded RK formula used in MATLAB up to version 4, for the solver ODE23.

0     |
1/4   | 1/4
27/40 | -189/800  729/800
1     | 214/891    1/33    650/891
------+--------------------------------------
      | 41/162     0       800/1053   -1/78
Example 4.12 (RKF, Runge-Kutta-Fehlberg 45) :
This formula uses the Runge-Kutta-Fehlberg coefficients, used in the Scilab solver "rkf". It is again an embedded RK pair; the first row of weights below gives the order 4 formula, the second the order 5 formula:

0     |
1/4   | 1/4
3/8   | 3/32        9/32
12/13 | 1932/2197  -7200/2197   7296/2197
1     | 439/216    -8           3680/513    -845/4104
1/2   | -8/27       2          -3544/2565   1859/4104  -11/40
------+----------------------------------------------------------------
      | 25/216      0           1408/2565   2197/4104  -1/5      0
      | 16/135      0           6656/12825  28561/56430 -9/50    2/55
4.3.4 Order of RK formulas
We shall derive conditions under which RK formulas are of order p. We have said that for a RK formula the coefficients c_i are imposed. We shall justify this assumption.
Any ODE can be rendered autonomous, or more exactly replaced by an equivalent autonomous system. The trick is to build a "clock". Take the classical system

ẋ = f(t, x),  x(t_0) = x_0   (11)

This system is clearly equivalent to the autonomous system

ẏ = 1,  ż = f(y, z),  y(t_0) = t_0,  z(t_0) = x_0   (12)

Actually some codes require the equations to be presented in autonomous form.
It is expected that a solver will have the same behavior on the two systems.
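As a concrete illustration, the test equation ẋ = x + t used later in section 4.4.2 can be autonomized with a clock variable (a sketch; the names f and faut are ours):

-->deff('xdot=f(t,x)','xdot=x+t') // original, nonautonomous form

-->deff('wdot=faut(t,w)','wdot=[1;w(1)+w(2)]') // w=[clock;x]

-->x=ode(1,0,1,f)

-->w=ode([0;1],0,1,faut) // w(2) should match x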
We shall compare the steps taken by a RK formula on the two systems. Denote by S̄_i the slopes for the second system in R × R^n, necessarily of the form

S̄_i = (1, s̄_i)

and by W̄_{n,i} the intermediate points obtained. Since the first coordinate of the autonomous system is a clock starting at time t_0, and since the sequence of the original system is given by the intermediate points x_{n,i} evaluated at the times t_n + c_i h, the code should give, for the intermediate points of the autonomous system,

W̄_{n,i} = (t_n + c_i h, x_{n,i})

We have, using the notations of RK formulas,

s_1 = f(t_n, x_n)                                s̄_1 = (1, f(t_n, x_n))

x_{n,2} = x_n + h a_{2,1} s_1                    W̄_{n,2} = (t_n + h a_{2,1}, x_{n,2})

s_2 = f(t_n + c_2 h, x_{n,2})                    S̄_2 = (1, s̄_2)  with  s̄_2 = f(t_n + h a_{2,1}, x_{n,2})

We must have c_2 = a_{2,1}. If this is satisfied then s̄_2 = s_2, hence for the next stage

x_{n,3} = x_n + h (a_{3,1} s_1 + a_{3,2} s_2)    W̄_{n,3} = (t_n + h (a_{3,1} + a_{3,2}), x_{n,3})

Then we must have c_3 = a_{3,1} + a_{3,2}, and we obtain by induction that the c_i are equal to the sum of the a_{i,j} on the same row.
To derive conditions for a RK method we use corollary (4.1). For a RK method the function Φ is given by

Φ(t, x, h) = Σ_{i=1}^{k} b_i s_i

where it is clear that it is the s_i which depend on (t, x, h).
RK formulas of order 1. From corollary (4.1) we must have f(t, x) = Φ(t, x, 0). When h = 0, from the formulas, all the intermediate points are given by x_{n,i} = x_n, and s_i = f(t_n, x_n). Hence

Φ(t, x, 0) = ( Σ_{i=1}^{k} b_i ) f(t, x)

A RK formula is of order at least 1 iff

Σ_{i=1}^{k} b_i = 1
RK formulas of order 2. From corollary (4.2), the formula is of order 2 if it is of order 1 and if

(1/2) [ (∂f/∂t)(t, x) + (∂f/∂x)(t, x) f(t, x) ] = (∂Φ/∂h)(t, x, 0)

We must evaluate (∂Φ/∂h)(t, x, h). We have

(∂Φ/∂h)(t, x, h) = Σ_{i=1}^{k} b_i (∂s_i/∂h)

From the formulas,

∂s_i/∂h = c_i (∂f/∂t)(t_n + c_i h, x_{n,i}) + (∂f/∂x) ( Σ_{j=1}^{i-1} a_{i,j} s_j + h A )

where the expression A is given by A = Σ_{j=1}^{i-1} a_{i,j} (∂s_j/∂h).
When h = 0 we have already seen that x_{n,i} = x_n and s_i = f(t_n, x_n); then

(∂s_i/∂h)(t, x, 0) = c_i (∂f/∂t)(t, x) + (∂f/∂x)(t, x) ( Σ_{j=1}^{i-1} a_{i,j} ) f(t, x)
With the hypothesis Σ_{j=1}^{i-1} a_{i,j} = c_i, and using the fact that ∂f/∂x is a linear map, we get

(∂s_i/∂h)(t, x, 0) = c_i [ (∂f/∂t)(t, x) + (∂f/∂x)(t, x) f(t, x) ] = c_i f^{[1]}(t, x)

Finally the condition reduces to

(1/2) f^{[1]}(t, x) = ( Σ_{i=1}^{k} b_i c_i ) f^{[1]}(t, x)

We have proved that a RK formula is at least of order 2 iff

Σ_{i=1}^{k} b_i = 1  and  Σ_{i=1}^{k} b_i c_i = 1/2
The computation can be continued, but appropriate techniques must be used; see Butcher or Hairer and Wanner. The conditions obtained can be expressed as matrix operations. We introduce, from the Butcher array, the matrix A of size k × k: the coefficients of A for i > j are the a_{i,j} of the array, the other coefficients are 0. We introduce in the same manner the length-k column vector C and the length-k row vector B. We use freely the Scilab notation. In particular Scilab can be used for testing the order of a RK formula, so we express the relations as Scilab tests. We give the conditions up to order 5. RK formulas of order 7 and 8 exist. Finding RK coefficients is also an art: some formulas of the same order are more efficient than others.

Relations for the order of RK formulas :
Each test adds to the preceding ones. First, it must be verified that the c_i satisfy the condition Σ_j a_{i,j} = c_i :

sum(A,'c')==C

The test must output %T.
Order 1

sum(B)==1

Order 2

B*C==1/2

Order 3

B*C.^2==1/3              B*A*C==1/6

Order 4

B*C.^3==1/4              B*A*C.^2==1/12
B*(C.*(A*C))==1/8        B*A^2*C==1/24

Order 5

B*C.^4==1/5              B*((C.^2).*(A*C))==1/10
B*(C.*(A*C.^2))==1/15    B*(C.*((A^2)*C))==1/30
B*((A*C).*(A*C))==1/20   B*A*(C.^3)==1/20
(B*A)*(C.*(A*C))==1/40   B*A^2*C.^2==1/60
B*A^3*C==1/120

Exercise
Check, on the examples, the order of the RK formulas. For the embedded RK formulas you will find, for example for the BS(2,3) pair, that if you take only the 4 × 4 Butcher array of BS(2,3) you have an order 3 RK formula; the complete array gives an order 2 formula. This explains the line and the double line of the array. Since the computational cost of a code is measured by the number of evaluations of f, BS(2,3) gives the second order formula for free. The interest of such formulas for step size control will be explained in the next sections.
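For instance, here is a sketch of these tests for the classical RK4 array of example (4.8). Since fractions such as 1/6 are not exactly representable in binary, comparisons with a tolerance are safer than strict equality:

-->A=[0 0 0 0;1/2 0 0 0;0 1/2 0 0;0 0 1 0];

-->B=[1/6 2/6 2/6 1/6]; C=[0;1/2;1/2;1]; // row and column vectors

-->sum(A,'c')==C // consistency of the c_i

-->sum(B)==1 // order 1

-->abs(B*C-1/2)<%eps // order 2, tolerant comparison

-->abs(B*C.^2-1/3)<%eps & abs(B*A*C-1/6)<%eps // order 3

-->abs(B*C.^4-1/5)<%eps // an order 5 condition: %F, RK4 is of order 4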
4.3.5 RK formulas are Lipschitz
If we prove this, then from our preceding results the RK formulas are convergent as soon as they are of order at least 1.
We must evaluate

Φ(t, x, h) - Φ(t, y, h) = Σ_{i=1}^{k} b_i ( s_i(x) - s_i(y) )

We denote α = max |a_{i,j}|, β = max |b_i|, and by L the Lipschitz constant of f.
With the notation of RK formulas, a simple induction shows that (for the intermediate points x_{n,i}(x) and x_{n,i}(y)) we have

|x_{n,i}(x) - x_{n,i}(y)| ≤ (1 + hαL)^{i-1} |x - y|

and

|s_i(x) - s_i(y)| ≤ L (1 + hαL)^{i-1} |x - y|

Hence

|Φ(t, x, h) - Φ(t, y, h)| ≤ β ((1 + hαL)^k - 1)/(hα) |x - y|

This proves that the RK formulas are Lipschitz methods.
4.3.6 Local error estimation and control of stepsize
The local error of a method of order p is given by

e_n = x(t_{n+1}, x_n, t_n) - x_{n+1}

Suppose that we have a formula of order p+1 which computes an estimate x*_{n+1}; then we have

est_n = x*_{n+1} - x_{n+1} = [ x(t_{n+1}, x_n, t_n) - x_{n+1} ] - [ x(t_{n+1}, x_n, t_n) - x*_{n+1} ]

Then

|est_n| = le_n + O(h^{p+2}) = O(h^{p+1}) = C h^{p+1}

where C depends on (t_n, x_n). Since le_n is O(h^{p+1}), the difference est_n gives a computable estimate of the error. Since the most expensive part of taking a step is the evaluation of the slopes, embedded RK formulas are particularly interesting since they give an estimate of the error for free.
Modern codes use this kind of error estimation. If the error is beyond a tolerance tol (given by the user, or a default value used by the code), the step is rejected and the code tries to reduce the stepsize.
If we had taken a step σh, the local error would have been

le^σ_n = C (σh)^{p+1} = σ^{p+1} |est_n|

(we use here |est_n| for le_n); then a step size σh passing the test satisfies

σ < ( tol / |est_n| )^{1/(p+1)}

The code will use as new step h ( tol / |est_n| )^{1/(p+1)}.
This is the way most popular codes select the step size, but there are practical details. This is only an estimate. Generally how much the step size can be decreased or increased is limited. Moreover a failed step is expensive, so the codes use, for safety, a fraction of the predicted step size, usually 0.8 or 0.9. Besides, a maximum allowed step can be used, to prevent too big steps. The maximal step increase is usually chosen between 1.5 and 5. It is also usual, after a step rejection, to limit the increase to 1.
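Here is a minimal sketch of this acceptance/rejection logic. The helper onestep, which would return the advanced point and the embedded error estimate of a pair of lower order p, is hypothetical, and the constants 0.9, 0.2 and 5 are the usual safety and limit factors discussed above:

function [t,x,h]=controlled_step(f,t,x,h,tol,p)
accepted=%f
while ~accepted
  [xnew,est]=onestep(f,t,x,h) // hypothetical one-step advance with estimate
  sigma=0.9*(tol/max(est,1d-300))^(1/(p+1)) // predicted change, safety 0.9
  sigma=min(5,max(0.2,sigma)) // limit the increase and the decrease
  if est<=tol then
    accepted=%t; t=t+h; x=xnew // step accepted: advance
  end
  h=h*sigma // step size for the next attempt, or the next step
end
endfunction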
In the case of embedded RK formulas, two approximations are computed. If the step is accepted, with which formula does the code advance? Advancing with the higher order result is called local extrapolation. This is for example the case of the BS(2,3) pair. The 4th line of BS(2,3) gives an order 3 RK formula and an approximation, say x_{n+1}. To compute the error estimate, f(t_{n+1}, x_{n+1}) is computed. If the step is accepted and the code advances with x_{n+1}, this slope is already the first one of the next step: the next step's first evaluation is free (at least for the evaluation of f). This kind of formula is called FSAL (first same as last). The DOPRI5 pair is also a FSAL pair (check it!).
4.4 Experimentation
The objective of this section is to experiment and verify some of the theoretical results obtained, using the power of Scilab. For the needs of the experiments we shall code some of the methods described before. This is for heuristic reasons, and we give here a piece of advice to the reader: do not rewrite codes that have been well written before and have proved to be reliable. The solvers of Scilab are high quality codes; don't reinvent the wheel. However it is important to know how codes work. This was the reason for the preceding chapter.
4.4.1 Coding the formulas
We shall code three RK formulas: Euler, Heun, and classical RK4. Since all the formulas given are RK formulas, we shall indicate one general method. We suppose that the Butcher array is given by the triple (A, B, C), of sizes respectively k × k, 1 × k and k × 1. With these data you can check the order of the RK formula.
When this is done, we suppress the first row of A (only zeros):

A(1,:)=[ ];

The computation of the slopes is straightforward: we can write a k-loop or simply write everything out. In pseudo-code, for the main loop (computing x_{i+1}), we obtain the following, assuming that the RHS is coded in odefile with inputs (t, x), that k is the number of stages of the RK formula, and that x_i has already been computed:
n=length(x0) // dimension of the ODE
s=zeros(n,k)
// k the number of stages, preallocation of the
// matrix of slopes
A1=A' // transposition, for technical reasons
// computing x(i+1)
// inside the loop, lightening the notation :
x=x(i);
t=t(i);
sl=feval(t,x,odefile); s(:,1)=sl(:);
// making sure sl is a column vector
sl=feval(t+h*C(2),x+h*s(:,1)*A1(1,1),odefile); s(:,2)=sl(:);
// ... write the other stages till k-1, and the last :
sl=feval(t+h*C(k),x+h*s(:,1:k-1)*A1(1:k-1,k-1),odefile); s(:,k)=sl(:);
// compute the new point
x=x+h*s*B';
x(i+1)=x;
This is the core of the code; care must be taken with the details. How is x(i) stored, column or row? Since the utility of a solver is also to plot solutions, the mesh points should be stored as a column vector; then x(i) should be the i-th row of an N × n matrix, where N is the number of mesh points. Once you have written this code, it is not difficult to code all the RK formulas.
Do it!
The routine can also be coded more naively. For example for RK4, here is a Scilab file named rk4.sci:
function [T,X]=rk4(fct,t0,x0,tf,N)
//Integrates with RK4 classical method
// fct user supplied function : the RHS of the ODE
//
// t0 initial time
//
// x0 initial condition
//
// tf final time [t0,tf] is the integration interval
//
// N number of steps
x0=x0(:) // make sure x0 column !
n=length(x0)
h=(tf-t0)/(N-1)
T=linspace(t0,tf,N)
T=T(:)
X=zeros(N,n) // preallocation
X(1,:)=x0’
//main loop
for i=1:N-1
// compute the slopes
s1=feval(T(i),X(i,:),fct)
s1=s1(:)
x1=X(i,:)+h*(s1/2)’
s2=feval(T(i)+h/2,x1’,fct)
s2=s2(:)
x2=X(i,:)+h*(s2/2)’
s3=feval(T(i)+h/2,x2’,fct)
s3=s3(:)
x3=X(i,:)+h*s3’
s4=feval(T(i)+h,x3’,fct)
s4=s4(:)
X(i+1,:)=X(i,:)+h*(s1’+2*s2’+2*s3’+s4’)/6
end
A WARNING
When you create a Scilab file, this file has the syntax

function [out1,out2, ...] =name_of_function (in1,in2, ...)

where the outputs out1, out2, ... are the wanted quantities computed by the file, and in1, in2, ... are the inputs required by the file.
ALWAYS SAVE THE FILE UNDER THE NAME OF THE GIVEN FUNCTION! That is, save under

name_of_function.sci

If you save under another name, Scilab gets confused, and you are in trouble.
4.4.2 Testing the methods
We shall use a test function, for example ẋ = x + t with I.V. x(0) = 1. The exact solution is x(t) = -1 - t + 2e^t.
We write the file fctest1.sci:

function xdot=fctest1(t,x)
//test function
//
xdot=x+t;
And finally write a script for testing the three methods
//compare the different Runge Kutta methods
// and their respective error
//
//initialization
//
;getf("/Users/sallet/Documents/Scilab/euler.sci");
;getf("/Users/sallet/Documents/Scilab/rk2.sci");
;getf("/Users/sallet/Documents/Scilab/rk4.sci");
;getf("/Users/sallet/Documents/Scilab/fctest1.sci");
Nstepvect=(100:100:1000);
n=length(Nstepvect);
eul_vect=zeros(n,1);
rk2_vect=zeros(n,1);
rk4_vect=zeros(n,1);
for i=1:n
[s,xeul]=euler(fctest1,0,1,3,Nstepvect(i));
// sol for euler
[s,xrk2]=rk2(fctest1,0,1,3,Nstepvect(i));
// sol for Heun RK2
[s,xrk4]=rk4(fctest1,0,1,3,Nstepvect(i));
// sol for RK4
z=-1-s+2*exp(s); // exact solution for the points s
eulerror=max(abs(z-xeul)); // biggest error for Eul
rk2error=max(abs(z-xrk2)); // biggest error for RK2
rk4error=max(abs(z-xrk4)); // biggest error for RK4
//
//
eul_vect(i)=eulerror;
rk2_vect(i)=rk2error;
rk4_vect(i)=rk4error;
//
end
//
//plot
xbasc()
plot2d(’ll’,Nstepvect,[eul_vect,rk2_vect,rk4_vect])
xgrid(2)
Now we call this script from the Scilab command window :
;getf("/Users/sallet/Documents/Scilab/comparRK.sci");
We get the following picture
Fig. 1 – Comparison of RK formulas: absolute error versus number of steps (log-log plot) for Euler, RK2 and RK4.
Measuring the slopes clearly shows that the methods are respectively of order 1, 2 and 4, as expected.
4.4.3 Testing roundoff errors
We want to experiment with the effects of roundoff, that is to say the effects of finite precision arithmetic. We choose for example the RK2 improved Euler method, and we test the method on the function ẋ = x with the I.V. x_0 = 1. We look at x(1) = e.
Since we look only at the terminal point, and since we shall take a great number of steps, we choose to minimize the number of variables used. Scilab stores the usual variables in a stack, whose size depends on the amount of free memory. So for this experiment we use a modified RK2 code, minimizing the number of variables. Compare with the original code given for RK2 (same structure as RK4) before.
function sol=rk2mini(fct,t0,x0,tf,N)
// Integrates with RK2 method, improved Euler,
// with minimum code, minimum allocation memory
// for experiment computing the solution only
// at final time
//
// fct user supplied function : the RHS of the ODE
//
// t0 initial time
//
// x0 initial condition
//
// tf final time [t0,tf] is the integration intervall
//
// N number of steps
x0=x0(:)
n=length(x0)
h=(tf-t0)/N
x=x0;
t=t0;
k=1;
//main loop
while k< N+1
s1=feval(t,x,fct) // compute the slope
s1=s1(:)
x1=x+h*s1 // Euler point
s2=feval(t+h,x1,fct) // slope at Euler point
s2=s2(:)
x=x+h*(s1+s2)/2
t=t+h
k=k+1
end
sol=x
Since we work in finite precision, we must be aware that (t_f - t_0) ≠ N*h: in fact N*h equals the length of the integration interval only up to machine precision. The reader is invited to check this.
We obtain the following results:
-->result
result =
! 10. 0.0051428 0.0011891!
! 100. 0.0000459 0.0000169!
! 1000. 4.536E-07 1.669E-07!
! 10000. 4.531E-09 1.667E-09!
! 100000. 4.524E-11 1.664E-11!
! 1000000. 2.491E-13 9.165E-14!
! 10000000. 1.315E-13 4.852E-14!
! 30000000. 3.060E-13 1.126E-13!
where the first column is the number of steps, the second is the absolute error |e - sol| and the third column is the relative error |e - sol|/e.
If we plot, for example, the absolute error versus the number of steps, in a log-log plot, by the commands
-->plot2d(result(:,1),result(:,2),logflag='ll',style=-4)

-->plot2d(result(:,1),result(:,2),logflag='ll',style=1)
We obtain the figure
Fig. 2 – Effects of finite machine precision on accuracy: absolute error versus number of steps (log-log plot).
The reader is invited to make his own experiments. We draw the reader's attention to the fact that the computation can be lengthy. With a double precision computer (as is the case in this lecture, since Scilab uses double precision) the effects appear, for an order 2 one-step method, around 10^7 steps.
For our test problem, the RK2 method is simply

x_{n+1} = x_n (1 + h + h^2/2)

The local error is equivalent to the first neglected term of the Taylor development, (h^3/6) x_n; then on [0, 1] we can estimate le_n ≈ (e/6) h^3. Then in the formula (10) we can use C = e/6, which gives M_2 ≈ 10^{-21}.
In this experiment we have used the IEEE arithmetic precision of the computer. This implies that, if we want to see the phenomenon, we must use a large number of steps. This has been the case.
We can simulate single precision or even a less stringent precision. A famous anecdote is about C.B. Moler, a famous numerical analyst, creator and founder of MATLAB, who was accustomed to describe the precisions as "half precision" and "full precision" rather than single and double precision.
Here is a Scilab function which chops any number to the nearest number with t significant decimal digits:
function c=chop10(x,t)
// chop10 rounds the elements of a matrix
// chop10(X,t) is the matrix obtained
// by rounding the elements of X
// to t significant decimal digits. The rounding is toward
// the nearest number with t decimal digits.
// In other words the arithmetic is with unit
// roundoff 10^(-t)
//
y=abs(x)+ (x==0);
e=floor(log10(y)+1);
c=(10.^(e-t)).*round((10.^(t-e)).*x);
There is a little trick here in this code: note the line

y=abs(x)+ (x==0);

which is intended to encompass the case x = 0 (so that log10 is never evaluated at 0).
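For example:

-->chop10(%pi,4) // returns 3.142

-->chop10(123456,2) // returns 120000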
Now we can rewrite the RK2 code with chop10. This time we must take the effect of the arithmetic into account, since N*h ≠ (t_f - t_0): at the end of the main loop we add a step to take care of this, to obtain a value at t_f. Note that we have included the function chop10 as a sub-function of rk2hp; Scilab permits this.
function sol=rk2hp(fct,t0,x0,tf,N)
// RK2 code with half precision (H.P.)
x0=x0(:)
n=length(x0)
h=(tf-t0)/N
h=chop10(h,8) // H.P.
x=x0;
t=t0;
k=1;
//main loop
while k< N +1
k=k+1
s1=feval(t,x,fct)
s1=s1(:)
s1=chop10(s1,8); // H.P.
x1= chop10(x+h*s1,8) // Euler point
s2=feval(t+h,x1,fct) // slope at Euler point
s2=s2(:)
s2=chop10(s2,8) //H.P.
x=x+h*(s1+s2)/2
x=chop10(x,8) // H.P.
t=t+h
end
if t<tf then
h=tf-t;
s1=feval(t,x,fct);
s1=s1(:);
x1=x+h*s1;
s2=feval(t+h,x1,fct);
s2=s2(:);
x=x+h*(s1+s2)/2;
t=t+h;
end
sol=x
////////////////////////////
function c=chop10(x,t)
y=abs(x)+ (x==0);
e=floor(log10(y)+1);
c=(10.^(e-t)).*round((10.^(t-e)).*x);
Now we write a script for getting the results :
// script for getting the effects of roundoff
// in single precision
//note : the function chop10
// has been included in rk2hp
;getf("/Users/sallet/Documents/Scilab/rk2hp.sci");
;getf("/Users/sallet/Documents/Scilab/ode1.sci");
N=100:100:3000;
//
results=zeros(30,3);
results(:,1)=N’;
//
for i=1:30,
sol=rk2hp(ode1,0,1,1,N(i));
results(i,2)=abs(%e-sol);
results(i,3)=abs(%e-sol)/%e;
end
We obtain
! 100. 0.0000456 0.0000168 !
! 200. 0.0000104 0.0000038 !
! 300. 0.0000050 0.0000018 !
! 400. 0.0000015 5.623E-07 !
! 500. 0.0000024 8.934E-07 !
! 600. 1.285E-07 4.726E-08 !
! 700. 0.0000018 6.527E-07 !
! 800. 3.285E-07 1.208E-07 !
! 900. 0.0000010 3.683E-07 !
! 1000. 9.285E-07 3.416E-07 !
! 1100. 5.285E-07 1.944E-07 !
! 1200. 0.0000035 0.0000013 !
! 1300. 0.0000039 0.0000014 !
! 1400. 9.121E-07 3.356E-07 !
49
! 1500. 0.0000023 8.566E-07 !
! 1600. 0.0000020 7.462E-07 !
! 1700. 0.0000012 4.380E-07 !
! 1800. 0.0000013 4.678E-07 !
! 1900. 0.0000016 5.991E-07 !
! 2000. 8.715E-07 3.206E-07 !
! 2100. 0.0000011 4.151E-07 !
! 2200. 4.987E-07 1.835E-07 !
! 2300. 0.0000030 0.0000011 !
! 2400. 0.0000011 4.151E-07 !
! 2500. 0.0000058 0.0000021 !
! 2600. 0.0000046 0.0000017 !
! 2700. 0.0000020 7.452E-07 !
! 2800. 0.0000028 0.0000010 !
! 2900. 0.0000044 0.0000016 !
! 3000. 0.0000030 0.0000011 !
This result gives the following plot for the absolute error (the relative error
is given by the third column)
50
2
10
3
10
4
10
7
10
6
10
5
10
4
10






























Number of steps
A
b
s
o
l
u
t
e
E
r
r
o
r
Fig. 3 – Effects of half precision on accuracy.
We observe that for a number of step under 1000 the slope of the error is 2 as
predicted by the theory, but rapidly the situation deteriorates with roundoff
errors.
4.5 Methods with memory
Methods with memory exploit the fact that when the step (t
n
, x
n
) is reached,
we have at our disposition previously computed values, at mesh points, x
i
and f(t
i
, x
i
). An advantage of methods with memory is their high order of
accuracy with just few evaluation of f. We recall that, in the evaluation of the
method, it is considered that the highest cost in computing is the evaluation
of f.
Methods with memory distinguishes between explicit and implicit methods.
Implicit methods are used for a class of ODE named stiff. A number of theo-
retical question are still open. Many popular codes use variable order methods
with memory. Practice shows that this codes are efficient. But understanding
51
the variation of order is an open question. Only recently theoretical justifica-
tion have been presented (Shampine, Zhang 1990, Shampine 2002). To quote
Shampine :
our theoretical understanding of modern Adams and BDF codes
that vary their order leaves much to be desired.
In this section we just give an idea of how works methods with memory.
In modern codes the step is varying and the order of the method also. To
be simple we shall consider (as for one-step methods ) the case of constant
step size. This is for pedagogical reasons. In theory changing the step size
is ignored, but in practice it cannot. Varying the step size is important for
the control of the local error. It is considered that changing step step size is
equivalent to restarting . But starting a method with memory is a delicate
thing. Usually in the early codes the starting procedure use a one-step method
with an appropriate order. Another problem is the problem of stability of
these methods, this is related to the step size. Methods with memory are
more complex to analyze mathematically. However considering constant step
size is still useful. General purpose solvers tend to work with constant step
size, at least during a while. Special problems, for example, problems arising
from semi-discretization in PDE are solved with constant step size.
4.5.1 Linear MultistepsMethods LMM
When the step size is constant the Adams and the BDF methods are included
in a large class of formulas called Linear Multipstep Methods. These
methods are defined by
Definition 4.6 (LMM methods) :
A LMM method of k-steps defines a rule to compute x
n+1
, when k preceding
steps (x
n
, x
n−1
, , x
n−k+1
) are known . This rule is given by
k

i=0
α
i
x
n+1−i
= h
k

i=0
β
i
f(t
n+1−i
, x
n+1−i
) (13)
It is always supposed that α
0
,= 0, which gives x
n+1
If β
0
= 0 the method is explicit.
Otherwise, since x
n+1
appears on the two sides of the equation ,the method
is implicit. This means that x
n+1
must be computed from the formula (13).
52
Three class of methods can be defined from LMMs. The Adams-Basforth
explicit methods, the Adams-Moulton methods, and the Backward Differen-
tiation Methods (BDF). The later methods are implicit .
The relation (13 ) can be rewritten
When β
0
= 0
x
n+1
= −
k

i=1
α
i
α
0
x
n+1−i
+ h
k

i=1
β
i
α
0
f(t
n+1−i
, x
n+1−i
)
And when β
0
,= 0
x
n+1
= h
β
0
α
0
f(t
n+1
, x
n+1
) −
k

i=1
α
i
α
0
x
n+1−i
+ h
k

i=1
β
i
α
0
f(t
n+1−i
, x
n+1−i
)
4.5.2 Adams-Basforth methods (AB)
The family of AB methods are described by the numerical scheme ABk
x
n+1
= x
n
+ h
k

i=1
β

k,i
f(t
n+1−i
, x
n+1−i
) (14)
This class of method is quite simple and requires only one evaluation of f at
each step, unlike RK methods.
The mathematical principle under Adams formulas is based on a general pa-
radigm in numerical analysis : When you don’t know a function approximate
it with an interpolating polynomial, and use the polynomial in place of the
function. We know approximations of f at (t
i
, x
i
), for k preceding steps. To
lighten the notation we use
f(t
i
, x
i
) = f
i
The classical Cauchy problem
_
˙ x = f(t, x)
x(t
n
) = x
n
For passing from x
n
to x
n+1
this is equivalent to the equation
x
n+1
= x
n
+
_
x
n+1
xn
f(s, x(s)) ds
53
There exists a unique k −1 degree polynomial P
k
(t) approximating f at the
k-points (t
i
, x
i
). This polynomial can be expressed by
P
k
(t) =
k

i=1
L
i
(t) f
n+1−i
Where L
i
the classical fundamental Lagrangian interpolating polynomal.
Replacing f by P
k
under the integral, remembering that h = x
n+1
− x
n
=
x
i+1
− x
i
, gives the formula (14). The coefficients β

k,i
are the results of this
computation. For constant step size, it is not necessary to make the explicit
computation. We can construct a table of Adams-Basforth coefficients.
We give the idea to construct the array , and compute the two first lines.
If we consider the polynomial function 1, the approximation of degree 0 is
exact, and the formula must be exact. Then consider ˙ x = 1, with x(1) = 1,
the solution is x(t) = t, then integrate from 1 to 1 + h, with AB1 formula,
gives
1 +h = 1 +hβ

1,1
1
Evidently β

1,1
= 1 and AB1 is the forward Euler formula.
Considering AB2, with ˙ x = 2t , x(1) = 1, with the 2 steps at 1 and 1 +h, we
get for the evaluation at 1 + 2h , with the formula
x
n+1
= x
n
+ h(β

2,1
f
n
+ β

2,2
f
n−1
)
The result
(1 + 2h)
2
= (1 +h)
2
+ h
_
β

2,1
2(1 +h) + β

2,2
2
_
This is a polynomial equation in h, by identification of the coefficients (this
formula must be satisfied for any h ), we obtain 1 +2β

2,1
= 4 with 1 +β

2,1
+
β

2,2
= 2 which gives
β

2,1
=
3
2
and β

2,2
= −
1
2
The reader is invited to establish the following tableau of Adams-Basfortth
coefficients
54
k β

k,1
β

k,2
β

k,3
β

k,4
β

k,5
β

k,6
1 1
2
3
2

1
2
3
23
12

16
12
5
12
4
55
24

59
24
37
24

9
24
5
1901
720

2774
720
2616
720

1274
720
251
720
6
4277
1440

7923
1440
9982
1440

7298
1440
2877
1440

475
1440
When the mesh spacing is not constant, the h
n
appears explicitly in the
formula and everything is more complicated. In particular it is necessary to
compute the coefficient at each step. Techniques has been devised for doing
this task efficiently.
It can be proved, as a consequence of the approximation of the integral, that
Proposition 4.2 (order of AB) :
The method ABk is of order k.
4.5.3 Adams-Moulton methods
The family of AM methods are described by the numerical scheme AMk
x
n+1
= x
n
+ h
k−1

i=0
β
k,i
f(t
n+1−i
, x
n+1−i
) (15)
The AMk formula use k − 1 preceding steps (because one step is at x
n+1
,
then one step less that the corresponding ABk method
The principle is the same as for AdamsBasforth but in this case the inter-
polation polynomial use k points, including (t
n+1
, x
n+1
). The k interpolating
values are then
(f
n+1
, f
n
, , f
n+2−k
)
55
Accepting the value (implicit) f
n+1
, exactly the same reasoning as for AB,
gives an analogous formula, with f
n+1
entering in the computation. This is an
implicit method. The coefficient of the AM family can be constructed in the
same manner as for the AB family, and we get a tableau of AM coefficients :
k β
k,0
β
k,1
β
k,2
β
k,3
β
k,4
β
k,5
1 1
2
1
2
1
2
3
5
12
8
12

1
12
4
9
24
19
24

5
24
1
24
5
251
720
646
720

264
720
106
720

19
720
6
475
1440
1427
1440

798
1440
482
1440

173
1440
27
1440
The method AM1, is known as Forward Euler formula
x
n+1
= x
n
+ hf(t
n+1
, x
n+1
)
The method AM2 is known as the trapezoidal rule (explain why !), in the
context of PDEs it is also called Crank-Nicholson method
x
n+1
= x
n
+
h
2
[f(t
n
, x
n
) + f(t
n+1
, x
n+1
)
It can be proved, it is a consequence of the approximation of the integral,
that
Proposition 4.3 (order of AM) :
The method AMk is of order k.
It can be shown that the AM formula is considerably more accurate that the
AB formula of the same order, for moderately large order k.
56
4.5.4 BDF methods
The principle behind BDF is the same. We use also a polynomial interpola-
tion but for a different task. Now we interpolate the solution itself, including
the searched point x
n+1
at the mesh point t
n+1
. Using the k + 1 values
x
n+1
, x
n
, , x
n+1−k
at mesh points t
n+1
, t
n
, , t
n+1−k
we obtain a unique
polynomial P
k
(t) of degree k. A way of approximate a derivative is to dif-
ferentiate an interpolating polynomial. Using this, we collocate the ODE at
t
n+1
i.e.
P

k
(t
n+1
) = f(t
n+1
, P
k
(t
n+1
) ) = f(t
n+1
, x
n+1
)
This relation is an implicit formula for x
n+1
. If we write
P
k
(t) =
k

i=0
L
i
(t) x
n+1−i
The preceding relation becomes
k

i=0
L

i
(t
n+1
) x
n+1−i
= f(t
n+1
, x
n+1
)
If we recall that
L
i
(t) =

j=i
(t −x
n+1−j
)

j=i
(x
n+1−i
−x
n+1−j
)
It is then clear that for each index i, we can write
L
i
(t) =
1
h
M
i
(t)
Hence the final formula, with constant step for BDF is
k

i=0
α
i
x
n+1−i
= hf(t
n+1
, x
n+1
) (16)
The BDF formulas are then LMMs methods.
It is easy to see that
Proposition 4.4 (order of BDF) :
The method BDFk is of order k.
57
For constant step an array of BDF coefficient can be computed with the
same principle as for the building of the coefficient of Adams methods. There
however a difference, the Adams formulas are based on interpolation of the
derivative of the solution, the BDFs are based on the interpolation of solution
values themselves.
let look at at example. Considering the ODE ˙ x = 3t
2
with x(1) = 1, should
gives the coefficients of BDF3.
Looking at the mesh (1 +h, 1, 1 −h, 1 −2h) give the relation
α
0
(1 +h)
3
+ α
1
+ α
2
(1 −h)
3
+ α
3
(1
2
h)
3
= 3h(1 +h)
2
This relation gives
_
¸
¸
_
¸
¸
_
α
0
−α
2
−8α
3
= 3
α
0
+ α
2
+ 4α
3
= 2
α
0
−α
2
−2α
3
= 1
α
0
+ α
1
+ α
2
+ α
3
= 0
From which we obtain
α
0
=
11
6
α
1
= −3 α
2
=
3
2
α
3
= −
1
3
The array for BDF is
k α
k,0
α
k,1
α
k,2
α
k,3
α
k,4
α
k,5
1 1 −1
2
3
2
−2
1
2
3
11
6
−3
3
2

1
3
4
25
12
−4 3 −
4
3
1
4
5
137
60
−5 5 −
10
3
5
4

1
5
6
147
60
−6
15
12

20
3
15
4

6
5
1
6
58
The name Backward difference comes from the fact that expressed in Back-
ward Differences the interpolating polynomial has a simple form. The Back-
ward differences of a function are defined inductively, ∇
n+1
f
∇f(t) = f(t) −f(t −h)

n+1
f(t) = ∇(∇
n
f(t))
If we set x
n+1
= x(t
n+1
) for interpolation, the polynomial is
P
k
(t) = x
n+1
+
k

i=1
(t −t
n+1
) (t −t
n+2−i
)
h
i
i!

i
x
n+1
Then the BDF formula takes the simple form
k

i=1
1
i

i
x
n+1
= hf(t
n+1
, x
n+1
)
4.5.5 Implicit methods and PECE
The reader at this point can legitimately ask the question : why using implicit
methods, since an implicit method requires to solve an equation, and then
add some computation overhead?
The answer is multiple. A first answer is in stability. We shall compare two
methods on a simple stable equation :
_
˙ x = αx
x(0) = 1
With α < 0.
The simple method RK1=AB1 gives rise to the sequence
x
n+1
= (1 +αh)
n+1
This sequence converge iff [ 1 +αh [< 1 , which implies [ h [<
2
|α|
On contrary the forward Euler method AM1=BDF1 gives
x
n+1
=
1
1 −αh
x
n
59
This sequence is always convergent for any h > 0. If we accept complex values
for α the stability region is all the left half of the complex plane. Check that
this is also true for the AM2 method. We shall study in more details this
concept in the next section.
Stability restricts the step size of Backward Euler to [ h [<
2
|α|
. When α takes
great values, the forward Euler becomes interesting despite the expense of
solving an implicit equation.
The step size is generally reduced to get the accuracy desired or to keep the
computation stable. The popular implicit methods used in modern codes,
are much more stable than the corresponding explicit methods. If stability
reduces sufficiently the step size for an explicit method, forcing to take a great
number of steps, an implicit method can be more cheap in computation time
and efficiency.
In stiff problems, with large Lipschitz constants, a highly stable implicit
method is a good choice.
A second answer is in comparing AB and AM methods. AM are much more
accurate than the corresponding AB method. The AM methods permits big-
ger step size and in fact compensate for the cost of solving an implicit equa-
tion.
We shall takes a brief look on the implicit equation. All the methods consi-
dered here are all LMMs and, when implicit, can be written
x
n+1
= h
β
0
α
0
f(t
n+1
, x
n+1
) −
k

i=1
α
i
α
0
x
n+1−i
+ h
k

i=1
β
i
α
0
f(t
n+1−i
, x
n+1−i
)
Or in a shorter form
x
n+1
= hC
1
F(x
n+1
) + hC
2
+ C
3
The function Φ(x) = hC
1
F(x) +fC
2
+C
3
is clearly Lipschitz with constant
[ h [ C
1
L, where L is the Lipschitz constant of f the RHS of the ODE. Then
by a classical fixed point theorem, the implicit equation has a unique solution
if [ h [ C
1
L < 1 or equivalently
[ h [ <
1
C
1
L
The unique solution is the limit of the iterated sequence z
n+1
= Φ(z
n
). The
convergence rate is of order of h
60
If h is small enough a few iterations will suffice. For example it can be required
that the estimated [ h [ C
1
L is less that 0.1 , or that three iterations are
enough . . .. Since x
n+1
is not expected to be the exact solution there is no
point in computing it more accurately than necessary. We get an estimation
of the solution of the implicit equation, this estimate is called a prediction.
The evaluation of f is called an evaluation. This brings us to another type
of methods. Namely the predictor corrector methods.
Another methods are predictor corrector methods or P.E.C.E. methods.
The idea is to obtain explicitly a first approximation (prediction) px
n+1
of
x
n+1
. Then we can predict pf
n+1
i.e. f(t
n+1
, px
n+1
). A substitution of f
n+1
by pf
n+1
in an implicit formula , give a new value for x
n+1
said corrected
value, with this corrected value f
n+1
can be evaluated . . .
_
¸
¸
_
¸
¸
_
Prediction px
n+1
= explicit formula
Evaluation pf
n+1
= f(t
n+1
, px
n+1
)
Correction x
n+1
= implicit(pf
n+1
)
Evaluaton f
n+1
= f(t
n+1
, x
n+1
)
(17)
Let apply this scheme to an example with AB1 and AM2 methods (look at
the corresponding arrays)
_
¸
¸
_
¸
¸
_
Prediction AB1 px
n+1
= x
n
+ h f(t
n
, x
n
)
Evaluation pf
n+1
= f(t
n+1
, px
n+1
)
Correction AM2 x
n+1
= x
n
+
1
2
[f
n
+ pf
n+1
]
Evaluaton f
n+1
= f(t
n+1
, x
n+1
)
(18)
We rediscover in another disguise the Heun method RK2.
It is simple to look at the local error for a PECE method. The reader is invited
to do so. Check that the influence of the predictor method is less than the
corrector method. Then for a k order predictor is wise to choose a k + 1
corrector method. Prove that for a order p

predictor method and an order
p corrector method , the PECE associated method is of order min(p

+1, p).
A PECE is an explicit method. As we have seen AB1-AM2 is RK2 method.
Check the stability of this method and verify that this method has a finite
stability region.
4.5.6 Stability region of a method
:
The classic differential equation
61
_
_
_
˙ x = λx
x(0) = 1
'(λ) < 0
(19)
Is called Dahlquist test equation.
When applied to this test equation a discrete method gives a sequence defined
by induction. To be more precise, since all the method encountered in this
notes can be put in LMM form, we look at this expression for a LMM method.
using the definition (13) of a LMM formula we obtain, when applied to the
Dahlquist equation (19), the recurrence formula :
k

i=0

i
−β
i
hλ)x
n+1−i
= 0
We set µ = hλ
It is well known that to this linear recurrent sequence is associated a poly-
nomial

0
−β
0
µ)ζ
k+1
+ + (α
i
−β
i
µ)ζ
k−i+1
+ + (α
k
−β
k
µ) = 0
Since this formula is linear in µ the right hand side can be written
R(µ) = ρ(ζ) −µσ(ζ)
The polynomial ρ, with coefficient α
i
is known as the first characteristic
polynomial of the method, The polynomial σ, with coefficient β
i
is known as
the second characteristic polynomial of the method.
From recurrence formulas, it is well known that a recurrence formula is
convergent if the simple root of the polynomial R(µ) are contained in the
closed complex disk U of radius 1, and the multiple roots are in the open
disk.
Definition 4.7 (Stability Domain) :
The set S
S = ¦µ ∈ C [ roots(R(µ) ⊂ U¦
is called the stability region of the LMM method.
If S ⊂ C

the method is said to be A-stable.
62
characterization of the stability domain We have
µ =
ρ(ζ)
σ(ζ)
The boundary of S is then the image of the unit circle by the function
H(ζ) =
ρ(ζ)
σ(ζ)
. The stability region, whenever it is not empty, must lie on the
left of this boundary. With Scilab it is straightforward to plot the stability
region and obtain the classical pictures of the book (HW, S; B)
For example for Adams-Basforth formulas
H(ζ) =
ζ
k+1
−ζ
k
β

k,1
ζ
k
+ + β

k,k
=
ζ −1
β

k,1
+ + β

k,k
ζ
−k
For BDF formulas
µ = H(ζ) =
ρ(ζ)
ζ
n+1
In Scilab, using the vectorization properties
-->z=exp(%i*%pi*linspace(0,200,100);
-->r=z-1;
-->-->s2=(3-(1)./z)/2;
-->w2=(r./s2)’;
-->plot2d(real(w2),imag(w2),1)
-->s2=(23-(16)./z+(5)./z.^2)/12;
-->w3=(r./s2)’;
-->plot(ream(w3),imag(w3),2)
-->s4=(55-(59)./z+(37)./z^2-(9)./z.^3)/24;
-->w4=(r./s4)’;
-->plot(real(w4),imag(w4),3)
-->s5=(1901-(2774)./z+(2616)./z.^2-(1274)./z.^3+...
-->(251)./z.^4)/720;
-->w5=(r./s5)’;
-->plot2d(real(w5),imag(w5),4)
Here is A WARNING
Pay attention at the code. We write
(16)./z
63
And not
16./z
The first code is clear, and Scilab interpret like ones(z) ./ z . The second
is interpreted like the real 1. follow by a slash and the matrix z. In Scilab
x = A /B is the solution of x ∗ B = A. Then Scilab compute the solution of
the matrix equation x ∗ z = 1..
The two codes gives very different results. We are looking for the element
wise operation ./, this is the reason of the parenthesis. We could also have
written, using space
1 ./ z
However we prefer to use parenthesis as a warning for the reader.
We obtain the following pictures
1. 5 1. 1 0. 7 0. 3 0.1 0.5 0.9 1.3
1. 5
1. 1
0. 7
0. 3
0.1
0.5
0.9
1.3
1.7
1. 5 1. 1 0. 7 0. 3 0.1 0.5 0.9 1.3
1. 5
1. 1
0. 7
0. 3
0.1
0.5
0.9
1.3
1.7
1. 5 1. 1 0. 7 0. 3 0.1 0.5 0.9 1.3
1. 5
1. 1
0. 7
0. 3
0.1
0.5
0.9
1.3
1.7
1. 5 1. 1 0. 7 0. 3 0.1 0.5 0.9 1.3
1. 5
1. 1
0. 7
0. 3
0.1
0.5
0.9
1.3
1.7
AB2
AB3
AB4
AB5
Adams Basforth Stability Region
Fig. 4 – Adams-Basforth region of stability
The reader is invited to obtain the following figures for Adams-Moulton and
BDF formulas
64
6 5 4 3 2 1 0 1
4
3
2
1
0
1
2
3
4
6 5 4 3 2 1 0 1
4
3
2
1
0
1
2
3
4
6 5 4 3 2 1 0 1
4
3
2
1
0
1
2
3
4
AM2
AM3
AM4
Fig. 5 – Adams-Moulton region of stability
3 1 5 9 13 17 21
12
8
4
0
4
8
12
3 1 5 9 13 17 21
12
8
4
0
4
8
12
3 1 5 9 13 17 21
12
8
4
0
4
8
12
3 1 5 9 13 17 21
12
8
4
0
4
8
12
3 1 5 9 13 17 21
12
8
4
0
4
8
12
2 3
4
5
Fig. 6 – BDF region of stability, on the left of the boundaries
It can be shown that for RK methods of order p the polynomial R(ζ) is
65
R(ζ) = 1 +ζ +
ζ
2
+ +
ζ
k
k!
.
4.5.7 Implicit methods, stiff equations, implementation
In this lectures we have often spoken of stiff equations. Stiffness is a com-
plicated idea that is difficult to define. We have seen that for solving ODE
numerically the equation must be stable. This for example implies that the
Jacobian of the RHS must be a stable matrix. But with the Jacobian can
be associated the Lipschitz constant, related with the norm of the Jacobian.
Roughly a stiff problem is the combination of stability, with big eigenvalues
(negative real part) of the Jacobian , implying a big norm, hence a big Lip-
schitz constant. This imply that the process described by the ODE contains
components operating on different time scale.
For a stiff problem the constant appearing in the upper bound hL , with L
the Lipschitz constant , the step size giving the desired accuracy, since L is
big, must be chosen very small for assuring stability. A working definition for
stiffness could be
Equations where certains implicit methods, in particular BDF,
perform better, usually tremendously better, than explicit ones
To implement an implicit method, we have to solve some implicit equation
in x
x = hC
1
F(x) + hC
2
+ C
3
Where F(x) = f(t
n+1
, x).
We have proposed a first solution by iteration procedure. If the norm of the
Jacobian of f is big, this must imply for convergence a small step size h.
Another method is Newton’s method, which define a sequence converging to
the solution. We set
F(x) = x −(hC
1
F(x) + hC
2
+ C
3
)
The sequence is defined by induction, x
(0)
is the initial guess, and
x
(k+1)
= x
(k)

_
∂F
∂x
(x
(k)
_
−1
F(x
(k)
)
66
With
_
∂F
∂x
(x)
_
= I −hC
1
∂f(t
n+1
, x)
∂x
Each Newton iteration requires solving a linear system with a matrix related
to the Jacobian of the system. For any step size, Newton’s method converge
if the initial guess is good enough.
To implement Newton’s method, or some other variant, the matrix of partial
derivatives of the RHS is required. The user is encouraged to provide di-
rectly the Jacobian. If the Jacobian is not provided, the solver must compute
numerically the Jacobian with all the accompanying disadvantages.
5 SCILAB solvers
In the simplest use of Scilab to solve ODE, all you must do is to tell to Scilab
what Cauchy problem is to be solved. That is, you must provide a function
that evaluate f(t, x), the initial conditions (x
0
, t
0
) and the values where you
want the solution to be evaluated.
5.1 Using Scilab Solvers, Elementary use
The syntax is
x=ode(x0,t0,T,function);
The third input is a vector T. The components of T are the times at which
you want the solution computed.
The 4th input ‘ ‘function” is a function which compute the RHS of the ODE,
f(t, x).
A warning : the function must always have in its input the ex-
pression t , even if the ODE is autonomous.
That is, if given in a Scilab file, the code of the right hand side must begin
as
function xdot =RHS(t,x)
// the following lines are the lines of code that compute
// f(t,x)
67
The time t must appears in the first line.
This file must be saved as RHS.sci and called from the window command
line by a getf ( ”../RHS.sci ”)
Or as an alternative you can code as an inline function, directly from the
window command :
-->deff(’xdot=RHS(t,x)’,’xdot=t^2*x’)
-->RHS
RHS =
[xdot]=RHS(t,x)
You can check that the function RHS exists, in typing RHS, and Scilab
answer give the input and the output.
If you are interested in plotting the solution, you must get a mesh of points
on the integration interval
-->T=linspace(0,1,100);
-->x=ode(1,0,T’,RHS);
-->xbasc()
-->plot2d(T’,x’)
you obtain
68
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
1.00
1.04
1.08
1.12
1.16
1.20
1.24
1.28
1.32
1.36
1.40
Fig. 7 – Plot of a solution
When the function is simple, it can be entered directly from the command
line. But the solvers of Scilab are intended to be used with large and com-
plicated systems. In this case the equation is provided as Scilab function.
We shall give another example
Example 5.1 (Euler rigid body equations) :
We consider the following system
_
_
_
˙ x
1
= x
2
x
3
˙ x
2
= −x
3
x
1
˙ x
3
= −0.51 x
1
x
2
With I.V.
[x
1
(0); x
2
(0); x
3
(0)] = [0; 1; 1]
This example illustrates the solution of a standard test problem proposed by
Krogh for solvers intended for nonstiff problems. It is studied in the book of
Shampine and Gordon. This problem has a known solution in term of Jacobi
special function. We give some preliminaries.
We define the so-called incomplete elliptic integral of the first kind (Abra-
mowitz and Stegun) :
69
F(Φ, m) =
_
Φ
0

_
1 −msin
2
θ
=
_
sin(Φ)
0
dt
_
(1 −t
2
)(1 −mt
2
Since F is clearly a monotone increasing function of Φ, it admits an inverse
function, called the amplitude.
u = F(Φ, m) Φ = am(u, m)
Definition 5.1 (Jacobi ellliptic functions) :
The Jacobi elliptic functions are given respectively by (see Abramowitz and
Stegun)
_
_
_
sn(u, m) = sin(Φ) = sin(am(u, m))
cn(u, m) = cos(Φ) = cos(am(u, m))
dn(u, m) =
_
1 −sn
2
(u, m)
It is now a simple exercise to check that
_
¸
¸
¸
¸
_
¸
¸
¸
¸
_
d
dt
sn(u, m) = cn(u, m) dn(u, m)
d
dt
cn(u, m) = −sn(u, m) dn(u, m)
d
dt
dn(u, m) = −msn(u, m) cn(u, m)
With sn(0, m) = 0, cn(0, m) = dn(0, m) = 1.
We immediately see that the solution of the Euler equations, with the I.V.
given, are
_
_
_
x
1
(t) = sn(t, m)
x
2
(t) = cn(t, m)
x
3
(t) = dn(t, m)
With m = 0.51. We readily see that the solutions are periodic with period
4 F(
π
2
, m), which is known as the complete elliptic integral of the first kind
4 K(m) = 4 F(
π
2
, m).
In Scilab the function sn(u, m) is built-in and called by
%sn(u,m)
70
. The inverse of the Jacobi sinus is
%asn(u,m)
Then the amplitude is given by am(u, m) = asin(%sn(u, m), and the incom-
plete elliptic integral is given by F(Φ, m) = %asn(sin(Φ)). The complete
integral is given by
%k(m)
With this preliminaries we can make some experiments. First we get the
figure of Gordon-Shampine book :
Coding the rigid body equations
function xdot=rigidEuler(t,x)
xdot=[x(2)*x(3);-x(3)*x(1);-0.51*x(1)*x(2)]
Computing and plotting the solution :
-->;getf("/Users/sallet/Documents/Scilab/rigidEuler.sci");
--> x0=[0,1,1]’; t0=0; T=linspace(0,12,1200);
--> m=0.51;K=% k(m);
-->sol=ode(x0,t0,T,rigidEuler);
-->plot2d(T’,sol’)
Note the transpose for plotting. The solver gives the solution in n rows, n the
dimension of the system, and N columns the number of time points asked
by the user. The function plot use column vectors.
We obtain the figure
71
0 2 4 6 8 10 12
1. 0
0. 6
0. 2
0.2
0.6
1.0
1.4
x
x
x
1
3
2
Euler Equations of Rigid Body :solutions
Fig. 8 – Solution of the movement of a rigid body
We can now appreciate the quality of the solution.
The theoretical solution is
(sn(t, m), cn(t, m), dn(t, m))
As we have seen we have access in Scilab to sn, and then to dn(t, m) =
_
1 −msn(t, m)
2
. We have to use an addition formula to obtain cn from sn.
We have
sn(u + v) =
sn(u) cn(v) dn(v) + sn(v) cn(u) dn(u)
1 −msn(u)
2
sn(v)
2
We obtain from this relation (analogous to sin(x + π/2))
cn(x) = sn(x + K) dn(x)
Using the vectorized properties of Scilab (look at the operations .∗ and .)
-->theorsol=[\%sn(T,m); ...
72
-->% sn(T+K,m).*sqrt(1-m*(% sn(T,m)).^2);...
-->sqrt(1-m*%sn(T,m).^2)];
-->norm(abs(theorsol-sol))/norm(theorsol)
ans =
0.0244019
We obtain an absolute error of 2% for all the interval. Don’t forget that we
integrates on [0, 12]. If we only look after a time interval of 1s, the result are
better :
-->(%sn(1,m)-sol(1,101))/%sn(1,m)
ans =
0.0011839
And at the end we have also
--> norm(theorsol(:,$)-sol(:,$))/norm(theorsol)
ans =
0.0015027
Which is quite correct.
We shall see how to improve these results from the accuracy point of view.
5.2 More on Scilab solvers
The syntax ode is an interface to various solvers, in particular to ODEPACK.
The library ODEPACK can be found in Netlib. They are variations of Hind-
smarch’s LSODE. LSODE implements Adams-Moulton formulas for non stiff
problems and BDF’s formulas. When the user says nothing, the solver selects
between stiff and non stiff methods. LSODE is a successor of the code GEAR
(Hindsmarch), which is itself a revised version of the seminal code DIFSUB
of Gear (1971) based on BDF’s formulas.
When the user does no precise the solver used, ODE call LSODA solver of the
ODEPACK package. Lsodar is the livermore solver for ordinary differential
equations, with automatic method switching for stiff and nonstiff problems,
and with root-finding. This version has been modified by scilab group on Feb
97 following Dr Hindmarsh direction.
73
5.2.1 Syntax
x=ode(’method’,x0,t0,T,rtol,atol, function,Jacobian)
Method :
The user can choose between
1. ’adams’
2. ’Stiff’
3. ’RKF’
4. ’RK’.
The method ’Adams’ use Adams-Moulton formulas. The maximum order for
the formulas is, by default, 12.
The method ’Stiff’ uses a BDF method. The maximum order is 5
The method ’RKF’ uses the program of Shampine and Watts (1976) based
on the RKF embedded formula of Fehlberg of example (4.12)
An adaptative Runge-Kutta formula of order 4
Tolerance : rtol, atol :
Tolerance for the error can be prescribed by the user. At each step the solver
produce an approximation ¯ x at mesh points t and an estimation of the error
est is evaluated. If the user provide a n-vector rtol of the dimension of the
system, and another n-vector atol, then the solver will try to satisfy
est(i) ≤ rtol(i) [ ¯ x(i) [ +atol(i)
For any index i = 1 : n.
This inequality defines a mixed error control. If atol = 0 then it corresponds
to a pure relative error control, if rtol = 0 i corresponds to a pure absolute
error control.
When ¯ x(i) is big enough, the absolute error test can lead to impossible re-
quirements : The inequality
[ ¯ x(i) −x(i) [≤ est(i) ≤ atol(i)
gives, if x(i) is big enough
[ ¯ x(i) −x(i) [
[ x(i) [

atol(i)
[ x(i) [
≤ eps
74
This tolerance ask for a relative error smaller than the precision machine,
which is impossible.
In a similar way, absolute error control can lead also to troubles, if x(i)
becomes small, for example, if [ x(i) [≤ atol(i) , with a pure absolute control
any number smaller than atol(i) will pass the test of error control. The norm
of x(i) can be small but theses values can have influence on the global solution
of the ODE, or they can later grow and then must be taken in account. The
scale of the problem must be reflected in the tolerance. We shall give examples
later, for example section(5.4.3).
If rtol and atol are not given as vectors, but as scalars, Scilab interprets rtol
as rtol = rtol ∗ ones(1, n), and similarly for atol.
Default values for rtol and atol are respectively
atol = 10
−7
and rtol = 10
−5
Jacobian :
We have seen why the Jacobian is useful when the problems are stiff in section
(4.5.7).
When the problems are stiff, it is better to give the Jacobian of the RHS
of the ODE. You must give Jacobian as a Scilab function, and the syntax
must include t and the state x in the beginning of the code for the Jacobian.
This is exactly the same rule as for the RHS of the ODE. For a Scilab file
function :
function J=jacobian (t,x)
The same rule must be respected for an “inline ” function.
When the problem is stiff you are encouraged to provide the Jacobian. If this
is not feasible the solver will compute internally by finite differences, i.e. the
solver compute the Jacobian numerically. It is now easy to understand why
providing Jacobian is interesting for the computation overhead and also for
the accuracy.
Before illustrating with examples the preceding notions we describe how to
change options for the ODE solvers in Scilab.
5.3 Options for ODE
Some options are defined by default. A vector must be created to modify
these options If created this vector is of length 12. At the beginning of the
75
session this vector does not exist.
The ODE function checks if this variable exists and in this case it uses it. If
this variable does not exist ODE uses default values. For using default values
you have to clear this variable.
To create it you must execute the command line displayed by odeoptions.
--> % ODEOPTIONS=[itask,tcrit,h0,hmax,hmin,jactyp,...
--> mxstep,maxordn,maxords,ixpr,ml,mu]
A Warning : the syntax is exactly as written, particularly you must
write with capital letters. Scilab makes the distinction. If you write
% odeoptions , ODE will use the default values !
The default values are
[1,0,0,%inf,0,2,500,12,5,0,-1,-1];
The meaning of the elements is described below.
5.3.1 itask
Default value 1
The values of “itask” are 1 to 5
– itask =1 : normal computation at specified times
– itask=2 : computation at mesh points (given in first row of output of
ode)
– itask=3 : one step at one internal mesh point and return
– itask= 4 : normal computation without overshooting tcrit
– itask=5 : one step, without passing tcrit, and return
5.3.2 tcrit
Default value 0
tcrit assumes itask equals 4 or 5, described above.
5.3.3 h0
Default value 0
h0 first step tried
It Can be useful to modify h0, for event location and root finding.
76
5.3.4 hmax
Default value ∞ ( % inf in Scilab)
hmax max step size
It Can be useful to modify hmax, for event location and root finding.
5.3.5 hmin
Default value 0
hmin min step size
It Can be useful to modify hmin, for event location and root finding.
5.3.6 jactype
Default value 2
– jactype= 0 : functional iterations, no jacobian used (”adams” or ”stiff”
only)
– jactype= 1 : user-supplied full jacobian
– jactype= 2 : internally generated full jacobian
– jactype= 3 : internally generated diagonal jacobian (”adams” or ”stiff”
only)
– jactype= 4 : user-supplied banded jacobian (see ml and mu below)
– jactype = 5 : internally generated banded jacobian (see ml and mu
below)
5.3.7 mxstep
Default value 500
mxstep maximum number of steps allowed.
5.3.8 maxordn
maxordn maximum non-stiff order allowed, at most 12
5.3.9 maxords
maxords maximum stiff order allowed, at most 5
77
5.3.10 ixpr
Default value 0
ixpr print level, 0 or 1
5.3.11 ml,mu
ml,mu .If jactype equals 4 or 5, ml and mu are the lower and upper half-
bandwidths of the banded jacobian : the band is the lower band of the i,j’s
with i-ml ¡= j and the upper band with j -mu ¡=i. (mu as for upper, ml as
for lower . . .)
mu+1
ml+1
n-ml
n-mu
n-mu-ml
ml diagonal lines m
mu
n columns
n
r
o
w
s
n
ml+mu+1
Fig. 9 – A banded Jacobian
If jactype equals 4 the jacobian function must return a matrix J which is
ml+mu+1 x ny (where ny=dim of x in xdot=f(t,x)) , such that column 1
of J is made of mu zeros followed by df1/dx1, df2/dx1, df3/dx1, ... (1+ml
78
possibly non-zero entries) , column 2 is made of mu-1 zeros followed by
df1/dx2, df2/dx2, etc
This is summarized by the following sketch :
nl+1
mu
n
mu+1
n-ml
mu+1
Fig. 10 – The matrix given to Scilab fo banded Jacobian
To be more precise and justify the preceding sketch, let a banded (mu,ml)
Jacobian , with entries J(i, j)
J(i, j) =
_
∂f
i
∂x
j
_
This express that the function f
i
depends only of the variables
(x
i−ml
, x
i−ml+1
, , x
i+mu
)
Or differently that the variable x
j
occurs only in the functions
(f
j−ml
, f
j−ml+1
, , f
j+mu
)
When the Jacobian is not banded if Jactype=1, you store the Jacobian in a
n n matrix J as defined before.
79
When jactype=2, and you have given values in %ODEOPTIONS, to ml
and mu , with 0 ≤ ml, mu ≤ n − 1, Scilab is waiting to store the Jacobian
in a ml + mu + 1 n matrix. In the column j of this matrix is loaded
[f
j−ml
; , f
j−ml+1
; ; f
j+ml−1
; f
j+ml
]
This equivalent to say that
∂f
i
∂x
j
is loaded in J(i−j +mu+1, j). Since we have
−mu ≤ i −j ≤ ml , the number of line is then between 1 and ml +mu + 1.
You can check that this give the preceding picture.
We shall see an example of this option, in the example Brusselator in section
(5.4.4).
5.4 Experimentation
There exists test problems in the litterature. For example Hull [1972] et al,
Krogh, Enright et al [1975] Hairer &Wanner . We shall use some of these test
problems.
5.4.1 Two body problem
We use the problem D5 of Hull
_
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
_
¨ x = −
x
(x
2
+y
2
)
3/2
¨ y = −
y
(x
2
+y
2
)
3/2
x(0) = 1 −e ˙ x = 0
y(0) = 0 ˙ y =
_
1+e
1−e
(20)
The solution is an ellipse of eccentricity e. The solution is
x(t) = cos(u) −e y(t) =

1 −e
2
sin(u)
Where u is the solution of the Kepler’s equation u −e sin(u) = t.
We shall integrates this equation for e = 0.9 and find the solution for tf = 20.
We must solve the Kepler’s equation, for tf = 20, i.e. solve
u −tf −e sin(u) = 0
80
It is clear that this equation has always a unique solution, we shall find this
solution uf with Scilab, using fsolve
-->tf=20;e=0.9;
-->deff(’v=kepler(u)’,’v=u-tf-e*sin(u)’)
-->uf=fsolve(1,kepler)
uf =
20.8267099361762185
--> solex=[=cos(uf)-e;...
-->-sin(uf)/(1-e*cos(uf));...
-->sqrt(1-e^2)*sin(uf);...
-->sqrt(1-e^2)*cos(uf)/(1-e*cos(uf))]
solex =
! - 1.29526625098757586 !
! - 0.67753909247075539 !
! 0.40039389637923184 !
! - 0.12708381542786892 !
This result corresponds to the solution in Shampine. Here is some computa-
tions
-->t0=0;x0=[0.1;0;0;sqrt(19)];tf=20;
-->solstand=ode(x0,t0,t,D5);
lsoda-- at t (=r1), mxstep (=i1) steps
needed before reaching tout
where i1 is : 500
where r1 is : 0.1256768952473E+02
Scilab tells you that the number of steps is too small. We shall then modify
this. To have access to % ODEOPTIONS, we type
81
-->%ODEOPTIONS=[1,0,0,% inf,0,2,20000,12,5,0,-1,-1];
-->solstand=ode(x0,t0,tf,D5)
solstand =
! - 1.29517035385191548 !
! - 0.67761734310378363 !
! 0.40040490187681727 !
! - 0.12706223352197193 !
-->rtol=1d-4;atol=1d-4;
--solad2=ode(’adams’,x0,t0,tf,rtol,atol,D5);
-->norm(solex1-solad2)/norm(solex1)
ans =
0.12383663388400930
-->rtol=1d-12;atol=1d-14;
-->solad14=ode(’adams’,x0,t0,tf,rtol,atol,D5);
-->norm(solex1-solad14)/norm(solex1)
ans =
0.00000000085528126
// comparing the solutions
-->solex
solex =
! - 1.29526625098757586 !
! - 0.67753909247075539 !
! 0.40039389637923184 !
! - 0.12708381542786892 !
-->solad2
82
solad2 =
! - 1.420064880055532 !
! - 0.56322053593253107 !
! 0.33076856194984439 !
! - 0.17162136959584925 !
-->solad14
solad14 =
! - 1.29526624993082473 !
! - 0.67753909321197181 !
! 0.40039389630222816 !
! - 0.12708381528611454 !
A tolerance of 10
−4
gives a result with one correct digit ! the default tolerance
4 significant digits, a tolerance of 10
−14
gives 7 correct digits. This problem
illustrates the dangers of too crude tolerances.
5.4.2 Roberston Problem : stiffness
The following system is a very popular example in numerical studies of stiff
problem. It describes the concentration of three products in a chemical reac-
tion :
_
_
_
˙ x
1
= −k
1
x
1
+ k
2
x
2
x
3
˙ x
2
= k
1
x
1
−k
2
x
2
x
3
−k
3
x
2
2
˙ x
3
= k
3
x
2
2
(21)
The coefficients are
k
1
= 0.04
k
2
= 10
4
k
3
= 3.10
7
With the initial value
x
0
=
_
_
1
0
0
_
_
83
The Jacobian of the system is
J =
_
_
−k
1
k
2
x
3
k
2
x
2
k
1
−k
2
x
3
−2k
3
x
2
−k
2
x
2
0 2k
3
x
2
0
_
_
It is straightforward to see that the eigenvalues of J are 0 and two others
negatives values
−k
1
−2 k
2
x
2
−k
2
x
3
±
_
−8 k
2
x
2
(k
1
+ k
3
x
2
) + (k
1
+ k
2
(2 x
2
+ x
3
))
2
2
Check that for this system, with the corresponding initial value, x
1
+x
2
+x
3
=
1 . Hint = ˙ x
1
+ ˙ x
2
+ ˙ x
3
= 0. The equations are redondant.
We have one equilibrium x
1
= x
2
= 0 and x
3
= 1. At the I.V. the eigenvalues
are
(0, 0, −k
1
)
At the equilibrium the eigenvalues are
(0, 0, −(k
1
+ k
2
))
We remark that the system let invariant the positive orthant. This is ap-
propriate, since concentration cannot be negative. To study the system it is
sufficient to study the two first equation. On this orthant we have a Lyapunov
function V (x
1
, x
2
) = x
1
+ x
2
. It is clear that
˙
V = ˙ x
1
+ ˙ x
2
= −k
3
x
2
2
≤ 0
By LaSalle principle, the greatest invariant set contained in
˙
V = 0, i.e. x
2
= 0
is reduced to x
1
= 0 (look at the second equation). Hence the equilibrium
x
1
= x
2
= 0 is a globally asymptotically stable equilibrium (in the positive
orthant). This means that all the trajectories goes toward (0, 0, 1). Near this
equilibrium the system is stiff, since −(k
1
+ k
3
) = −10
4
−0.04
We can now look at the numerical solutions
Before we define a file Robertsoneqn.sci, and a file JacRober the Jacobian of
the RHS.
84
function xdot=JacRober(t,x)
xdot=[-.04 , 1d4*x(3), 1d4*x(2);..
.04,-1d4*x(3)-3.d7*x(2),-1.d4*x(2);..
0,6D7*x(2),0]
Then we compute, solve and plot, in a first time, naively :
;getf("/Users/sallet/Documents/Scilab/Robertsoneqn.sci");
-->;getf("/Users/sallet/Documents/Scilab/JacRober.sci");
-->T=logspace(-6,11,10000);
--> t0=0;x0=[1;0;0];
--> sol1=ode(x0,t0,T,Robertsoneqn);
-->xbasc()
-->plot2d(T’,(sol1)’)
We obtain
6
10
5
10
4
10
3
10
2
10
1
10
0
10
1
10
2
10
3
10
4
10
5
10
6
10
7
10
8
10
9
10
10
10
11
10
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
Fig. 11 – Robertson chemical reaction
It seems that a curve has disappeared. More attention shows that the second
curve is on the x–axis. To see what happens we multiply x
2
(t) by 10
4
85
We have specified in the plot some options. strf =

xyz

means no captions
(x=0), with a rectangle defined (y=1), axes are drawn, y axis on the left
(z=1). The option rect precise the frame by a length 4 column vector
[xmin; ymin; xmax; ymax]
finally logflag = ‘ln

says that the x-axis is logarithmic and the y-axis is
normal.
-->xbasc()
-->plot2d(T’,(diag([1,1d4,1])*sol1)’,strf=’011’,
rect=[1d-6;-.1;1d11;1.1],logflag=’ln’)
To obtain the classical figure.
6
10
5
10
4
10
3
10
2
10
1
10
0
10
1
10
2
10
3
10
4
10
5
10
6
10
7
10
8
10
9
10
10
10
11
10
0.10
0.02
0.14
0.26
0.38
0.50
0.62
0.74
0.86
0.98
1.10
X
X x 10
X
1
^4
2
3
Fig. 12 – Robertson chemical reaction
Since the component x
2
becomes very small we can augment the requirement
on atol for x
2
and also use the Jacobian.
If we have defined verb %ODEOPTIONS” we must only change jacstyle. See
on the first example, two body, for %ODEOPTIONS. We have
86
-->rtol=1d-4; atol=[1d-4;1d-6;1d-4];
-->%ODEOPTIONS(6)=1
-->solJac=ode(x0,t0,T,rtol,atol,Robertsoneqn,JacRober);
-->solJac(:,$)
ans =
! 0.00000001989864378 !
! 0.00000000000008137 !
! 0.97813884204576329 !
-->sol1(:,$)
ans =
! 0.00000002076404250 !
! 0.00000000000008323 !
! 0.99773865098969228 !
5.4.3 A problem in Ozone Kinetics model
We borrow this example from the book Kahaner-Moler-Nash. The system
corresponds to the modeling of the amount of Ozone in the atmosphere.
This example will illustrate the need of some mathematical analysis, with
the necessity of some exploratory computations for adjusting the tolerances.
One reaction mechanism is
O + O
2
k
1
−→O
3
O + O
3
k
2
−→2 O
2
O
2
k
3
(t)
−→2 O
O
3
k
4
(t)
−→O + O
2
87
The concentration of O
2
, molecular oxygen, is many orders of magnitude
larger than the other concentrations and therefore is assumed to be constant
O
2
= 3.5 10
16
cm
−3
The equations are
_
_
_
˙ x
1
= −k
1
x
1
O
2
−k
2
x
1
x
2
+ 2k
3
(t) O
2
+ k
4
(t) x
2
˙ x
2
= k
1
x
1
O
2
−k
2
x
1
x
2
−k
4
(t)x
2
(22)
The different variables represent x
1
the concentration of free oxygen, x
2
the
concentration of ozone. The rate of kinetics constants are known
k
1
= 1.63 10
−16
k
2
= 4.66 10
−16
The two other reaction rates vary twice a day and are modeled by
k
i
(t) =
_
_
_
e
−c
i
/ sin(ω t)
if sin(ω t) > 0
0 if sin(ω t) ≤ 0
With
ω =
π
43200
s
−1
c
3
= 22.62 c
4
= 7.601
The time is measured in seconds (43200s = 12h = 1/2 day)
The rates k
3
and k
4
govern the production of free singlet oxygen, they rise
rapidly at dawn, reach a peak at noon, decrease to zero at sunset. These
constants are zero at night. These functions are (

.
The I.V are
x
1
(0) = 10
6
cm
−3
x
2
= 10
12
cm
−3
A mathematical analysis shows that the system is “well-posed”, i.e. the po-
sitive orthant is invariant by the system, namely any solution starting in the
positive orthant cannot leaves this orthant. The system is difficult for the
solvers. They are rapid change in the solutions at daylight and at night the
ODE reduces to
_
_
_
˙ x
1
= −k
1
x
1
O
2
−k
2
x
1
x
2
˙ x
2
= k
1
x
1
O
2
−k
2
x
1
x
2
Moreover the reaction rates are special. We shall plot the reaction rates k
3
and k
4
. We have to code these functions and the ODE.
88
function xdot=ozone1(t,x)
k1=1.63d-16; k2=4.66d-16; O2=3.7d16;
xdot=[-k1*x(1)*O2-k2*x(1)*x(2)+2*k3(t)*O2+k4(t)*x(2);..
k1*x(1)*O2-k2*x(1)*x(2)-k4(t)*x(2)];
/////////////
function w=k3(t)
c3=22.62; omega=%pi/43200;
w= exp(-c3./(sin(omega*t)+...
((sin(omega*t)==0)))).*...(sin(omega*t)~=0)
//////
function J=jacozone1(t,x)
k1=1.63d-16; k2=4.66d-16; O2=3.7d16;
J=[-k1*O2-k2*x(2),-k2*x(1)+k4(t);k1*O2-k2*x(2),-k2*x(1)-k4(t)]
////////////
function v=k4(t)
c4=7.601; omega=%pi/43200;
v= exp(-c4./(sin(omega*t)+((sin(omega*t)==0)))).*...
(sin(omega*t)~=0)
Some remarks, here, are necessary. First we have coded in ozone1.sci some
sub-functions of the main program. This is important when several functions
must be supplied. We have as sub-function, the Jacobian, jacozone1 and the
depending of time functions k
3
(t) and k
4
(t). These function are used by the
main program “ozone”.
We draw the reader’s attention to the code of theses two functions. A natural,
but clumsy , way to code “case functions” as k
3
is to use a if condition. There
is a principle in vectorial language as Scilab to vectorize the code. Moreover
using loops is time consuming. A motto is
Time is too short to short to spend writing loop
89
A rule- of-thumb is that the execution time of a MATLAB (SCI-
LAB) function is proportional to the number of statements exe-
cuted. No matter what those statements actually do (C. Moler)
We use test functions.
sin(omega*t)==0
The value is %T if the test succeed , %F otherwise. These are boolean va-
riables True or False. These variables can be converted in numerics where
%T is replaced by 1 and % F replaced by 0. But if boolean variables are used
in some arithmetic operation the conversion is automatic. Example :
-->(1-1==0)*2
ans =
2.
-->(1-1==0)+2
ans =
3.
-->(1-1==0)/2
!--error 4
undefined variable : %b_r_s
-->2/(1-1==0)
!--error 4
undefined variable : %s_r_b
-->2/bool2s(1-1==0)
ans =
2.
90
The operations ∗ and + convert automatically, / need to use “bool2s”. If you
are not being sure, use bool2s.
We have in the code, the value of the function which is multiplied by
.*(sin(omega*t)~=0)
Note the “dot” operation to vectorize. This imply that our function is 0 when
sin(ω t) ≤ 0.
We add a term, in the denominator of the exponential :
+( (sin(omega*t)==0) )
To prevent to divide by zero, which shall give a error message an interrupt
the program. In this manner we obtain the result. Moreover the function is
vectorized, as we shall see :
-->;getf("/Users/sallet/Documents/Scilab/ozone1.sci");
-->T=linspace(2,43200,1000);
-->plot2d(T’,[1d5*k3(T)’,k4(T)’])
We have multiplied k
3
by 10
5
since k
3
varies from 0 to 15. 10
−11
and k
4
varies
from 0 to 5. 10
−4
. Otherwise the graph of k
3
would have been squashed on
the x-axis.
0 1e4 2e4 3e4 4e4 5e4
0
1e-4
2e-4
3e-4
4e-4
5e-4
k4(t)
x k3(t) 10
^5
Fig. 13 – reaction rates
91
It seems that these functions are zero for some values of t in the interval. We
check it , with the help of the function find
-->I2=find(k4(T)==0)
I2 =
! 1. 2. 3. 4. 997. 998. 999. 1000. !
-->I1=find(k3(T)==0)
I1 =
column 1 to 11
! 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 991. !
column 12 to 19
! 992. 993. 994. 995. 996. 997. 998. 999. !
column 20
! 1000. !
We have obtain the indexes for which the function are zero, we deduce the
time
-->[T(4),T(5)]
ans =
! 131.72372 174.96496 !
-->k4([T(4),T(5)])
ans =
! 0. 3.568-260 !
92
-->k3(T(11))
ans =
9.669-312
To understand what happens here, we must go back to the computer arith-
metic. Scilab is conforming to the IEEE standard. To be under the complete
IEEE arithmetic you must type
ieee(2)
This means that invalid operations give NaN (Not a Number). An Overflow
give %inf. The range is 10
±308
. Or in other words the overflow threshold
is 1.797e308 and the underflow around 4.941d −324. There is no symmetry.
The reader is advised to cautious, when working around this boundaries. The
unit roundoff is %eps :
-->ieee(2)
-->log(0)
ans =
-Inf
-->1/0
ans =
Inf
-->0/0
ans =
Nan
-->1.7977d308
ans =
93
Inf
-->1.79765d308
ans =
1.798+308
-->5d-324
ans =
4.941-324
-->5d-325
ans =
0.
-->4.8d-324
ans =
4.941-324
-->4d-324
ans =
4.941-324
In other words under 10
−324
, for the computer everything is zero. With ex-
ponential function some care is necessary , as we have seen with the function
k
i
.
We make the following experience
-->T=linspace(1,500,5000);
-->Ind=find(k3(T));
-->min(Ind);
94
-->T(min(Ind))
ans =
417.54871
-->k3(T(min(Ind)))
ans =
4.941-324
-->T=418.5:.01:420;
-->plot2d(’nl’,T’,k3(T)’)
T=linspace(42781,42789,100);
-Ind=max(find(k3(T))); ;
-->T(Ind)
ans =
42782.455
We are playing with the limits of underflow machine ! Notice the shattering
effects in the next figure.
95
418.5 418.7 418.9 419.1 419.3 419.5 419.7 419.9 420.1
-324
10
-323
10
-322
10
-321
10
k3(t)
time in s
Fig. 14 – Effects of IEEE arithmetic
We know, that for the computer , the functions k
3
is zero on the intervals
[0, 418] and [42782, 43200] and this periodically on period 43200. In the same
manner we find out that k
4
is zero on the intervals [0, 141] and [43059, 43200]
and this periodically on period 43200.
We shall first integrate in [0, 141] with a simplified function ozone2.sci . Sim-
ply we have erased k
1
, k
2
and modified the Jacobian :
function xdot=ozone2(t,x)
k1=1.63d-16; k2=4.66d-16; O2=3.7d16;
xdot=[-k1*x(1)*O2-k2*x(1)*x(2);k1*x(1)*O2-k2*x(1)*x(2)];
/////////////
function J=jacozone2(t,x)
96
k1=1.63d-16; k2=4.66d-16; O2=3.7d16;
J=[-k1*O2-k2*x(2),-k2*x(1);k1*O2-k2*x(2),-k2*x(1)]
We can now integrate. We choose to see the evolution, second by second,
(look at the choice of T)
-->%ODEOPTIONS=[1,0,0,%inf,0,2,20000,12,5,0,-1,-1];
--> ;getf("/Users/sallet/Documents/Scilab/ozone2.sci");
-->t0=0; x0=[1d6;1d12];
-->T1=linspace(1,141,141);
-->sol1=ode(x0,t0,T1,ozone2);
-->plot2d(T1’,sol1(1,:)’)
Since the two components are on different scale we plot two separate figures.
We obtain the two plots
0 20 40 60 80 100 120 140 160
-100
300
700
1100
1500
1900
2300
2700
O
Fig. 15 – concentration of free oxygen
97
0 20 40 60 80 100 120 140 160
100000089e4
100000093e4
100000097e4
100000101e4
100000105e4
100000109e4
100000113e4
Time in s
O3
Fig. 16 – concentration of Ozone
We see on the figure that the concentration is negative ! We check this
-->ind=find(sol1(1,:)<0);
-->min(ind)
ans =
7.
-->T(7)
ans =
7.
-->sol1(1,6)
ans =
0.0000000001981
-->sol1(1,7)
98
ans =
- 0.0000000000283
We also see that the ozone rapidly reach a constant value, 14 significant digits
remain constant :
-->format(’v’,16)
-->sol1(2,10)
ans =
1000000999845.5
-->sol1(2,$)
ans =
1000000999845.5
After some experimentation, we choose atol and rtol. For the first component
x
1
, what is important is the absolute error, since x
1
is rapidly very small.
See the discussion on error and tolerance absolute and relative in paragraph
(5.2.1). The components are scaled very differently as we can see. There exists
a rule-of-thumb, to begin :
Let m the number of significant digits required for solution com-
ponent x
i
, set rtol(i) = 10
m+1
and atol(i) at the value at which
[x
i
[ is essentially significant.
Beginning with this and after some trial and error, we come with
-->rtol=[1d-2;1d-13];
-->atol=[1d-150,1d-2];
-->sol11=ode(x0,t0,T1,rtol,atol,ozone2);
99
-->ind=find(sol11(1,:)<0);
-->min(ind)
ans =
60.
-->max(ind)
ans =
77.
It means that we have only 17 values which are negative.
We can try with a stiff only method.
-->%ODEOPTIONS(6)=1;
-->sol11=ode(x0,t0,T1,rtol,atol,ozone2,jacozone2);
--ind=find(sol11(1,:)<0);
--min(ind)
ans =
60.
--max(ind)
ans =
77.
The result is equivalent. A try with a pure adams method gives poorer results.
We shall now take the last results of the integration as new start, and use
on the interval [141, 418] a function ozone3.sci which use only the function
k
4
(t). This is immediate and we omit it.
-->x0=sol11(:,$)
x0 =
100
! 1.914250067-152 !
! 1000000999845.5 !
-->to=T1($)
to =
141.
-->;getf("/Users/sallet/Documents/Scilab/ozone3.sci");
-->T2=linspace(141,418,277);
-->sol2=ode(x0,t0,T2,rtol,atol,ozone3,jacozone3);
-->xbasc()
-->plot2d(T2’,sol2(1,:)’)
-->xset(’window’,1)
-->xbasc()
-->plot2d(T2’,sol2(2,:)’)
-->min(sol2(1,:))
ans =
- 3.483813747-152
The first component has still negative values, but they are in the absolute
tolerance ! We obtain the figure for the concentration of free oxygen. The
figure for the ozone is the same and the concentration stays constant.
101
140 180 220 260 300 340 380 420
-1e-99-
3e-99
7e-99
11e-99
15e-99
19e-99
23e-99
27e-99
31e-99
35e-99
39e-99
Fig. 17 – concentration of free oxygen on [141, 418]
We can now integrate on the interval [418, 43059], with the two functions k
i
,
using ozone1.sci the complete program. We make an exploratory computation
with default tolerance atol = 10
−7
and rtol = 10
−5
.
t0=T2($);x0=sol2(:,$);
-->;getf("/Users/sallet/Documents/Scilab/ozone1.sci");
-->rtol=1d-7;atol=1d-9;
-->T3=linspace(418,42782,42364);
-->sol3=ode(x0,t0,T3,rtol,atol,ozone1);
-->xset(’window’,2)
-->plot2d(T3’,sol3(1,:)’)
-->xset(’window’,3)
-->plot2d(T3’,sol3(2,:)’)
-->min(sol3(1,:))
102
ans =
- 1.814E-11
-->sol3(1,$)
ans =
1.866E-13
0 1e4 2e4 3e4 4e4 5e4
1e7
0
1e7
2e7
3e7
4e7
5e7
6e7
7e7
8e7
9e7
O
Time in s
418 42742
Fig. 18 – Free oxygen on day light
103
0 1e4 2e4 3e4 4e4 5e4
99e10
100e10
101e10
102e10
103e10
104e10
105e10
106e10
107e10
108e10
O2
Fig. 19 – Molecular oxygen, day light
On day light the free oxygen, starting at very low concentration (3.6 10
−98
)
goes to a peak 8.8 10
7
at noon (21429s), and then decreases to 1.86 10
−13
.
We check that the concentration of O has still negative values, but under the
value of atol. We observe that the concentration of molecular oxygen change
from a plateau 1.000 10
12
to another plateau 1.077 10
12
. Our tolerance are
satisfactory.
We complete now the simulation, with two other runs, on the third interval,
corresponding to [42782, 43059] with the function ozone3 , and on the interval
[43059, 43341], the remaining of the day and the beginning of the next one,
with ozone2, since the k
i
are zero. The remaining of the simulation is obtained
in a periodic way. You should write a script to obtain for example a simulation
on 10 days.
For reference for the reader, we give the figure corresponding for the two
intervals for the concentration of free oxygen at the beginning of the day,
and the two intervals at the end and at the beginning of the next day. If we
plot, and pasted all the results obtained, together, these plots will disappear,
in reason of their scale. We shall get something like the figure (18).
We use the following script :
104
-->xbasc()
-->[([0,0.0,0.5,0.5])
-->plot2d(T1’,sol11(1,:)’)
-->xsetech([0.5,0.0,0.5,0.5])
-->plot2d(T2’,sol2(1,:)’)
-->xsetech([0,0.5,0.5,0.5])
-->plot2d(T4’,sol4(1,:)’)
-->xsetech([0.5,0.5,0.5,0.5])
-->plot2d(T5’,sol5(1,:)’)
0 20 40 60 80 100 120 140 160
-100
300
700
1100
1500
1900
2300
2700
140 180 220 260 300 340 380 420
-1e-99
3e-99
7e-99
11e-99
15e-99
19e-99
23e-99
27e-99
31e-99
35e-99
39e-99
42780 42820 42860 42900 42940 42980 43020 43060
-1e-14
1e-14
3e-14
5e-14
7e-14
9e-14
11e-14
13e-14
15e-14
17e-14
19e-14
43050 43090 43130 43170 43210 43250 43290 43330 43370
-6e-152
-4e-152
-2e-152
0
2e-152
4e-152
6e-152
8e-152
10e-152
Free Oxygen
0-141s
141-418s
42782-43059s
43059-43341s
Fig. 20 – Molecular oxygen at low levels
The chattering effect in the graph of interval [43059, 43341] is simply a scale
zoom effect. The same thing arrive in the interval [0, 141] but the I.V. for this
105
interval is 10
6
, and we begin the plot at time 1 giving a concentration of 2410,
whereas the I.V. at the beginning of the interval for the concentration of free
oxygen is 3.8 10
−156
. The chattering effect in interval [0, 141] is squashed
by the scale. You can check by yourself, that this happen also in the first
interval, with the command
-->xbasc()
-->plot2d(T1(70:141)’,sol11(1,70:141)’)
The remaining of the simulation should give something approximatively per-
iodic, since the I.V. are almost the same, comparing the starting for the
second intervals (k
4
,= 0 and k
3
= 0) :
-->sol5(:,$)
ans =
! 2.982-155 !
! 1.077E+12 !
-->sol11(:,$)
ans =
! 1.924-152 !
! 1.000E+12 !
Do it ! Obtain a simulation for 10 days.
5.4.4 Large systems and Method of lines
Till now we have just solve ODE with small dimension. But the solver of
Scilab are able to handle large systems. We shall here consider an example in
dimensions 20 and 40. But depending on your computer, problems of larger
size can be considered.
A source of large systems of ODE is the discretization of PDE. We shall
consider some examples and particularly the example of the Brusselator with
diffusion ( HW).
The brusselator with diffusion is the system of PDE :
106
_
∂u
∂t
= A+ u
2
v −(B + 1)u + α

2
u
∂x
2
∂v
∂t
= −Bu −u
2
v + α

2
v
∂x
2
(23)
With 0 ≤ x ≤ 1
With the boundaries conditions
u(0, t) = u(1, t) = 1 v(0, t) = v(1, t) = 3
u(x, 0) = 1 + sin(2πx) v(x, 0) = 3
When a PDE is discretized in space (semi-discretization), the result is an ODE. For example, let us consider the variable-coefficient wave equation

    u_t + c(x) u_x = 0

for x ∈ [0, 2π]. We consider the grid 0 = x_0 < x_1 < x_2 < ⋯ < x_N < x_{N+1} = 2π. Functions v_i(t) are used to approximate the values u(x_i, t), and the spatial derivative of u(t, x) is approximated in turn.
For example, we can use a first-order difference approximation on a grid of spacing Δx,

    ∂u(t, x_i)/∂x ≈ (v_i(t) − v_{i−1}(t)) / Δx

A more accurate approximation is the central difference

    ∂u(t, x_i)/∂x ≈ (v_{i+1}(t) − v_{i−1}(t)) / (2Δx)
Of course these formulas must be adapted at the boundary of the grid.
If one uses spectral methods,

    ∂u(t, x_i)/∂x ≈ (Dv)_i

where D is a differentiation matrix applied to the vector v, and (Dv)_i is its ith component.
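As an illustration (our own sketch, not taken from the notes), the wave equation above can be semi-discretized in Scilab with central differences; we assume, for simplicity, periodic boundary conditions and a hypothetical speed c(x) = 1 + sin(x)^2:
// method of lines for u_t + c(x)u_x = 0 on [0,2*%pi] (our sketch)
// central differences, periodic boundary conditions
function vdot=wavemol(t,v)
N=size(v,'*'); dx=2*%pi/N;
xg=(0:N-1)'*dx;
c=1+sin(xg).^2;   // hypothetical wave speed
// v(i+1) and v(i-1) with periodic wrap-around
vdot=-c.*(v([2:N,1])-v([N,1:N-1]))/(2*dx);
Such a function can be passed directly to ode, exactly like the Brusselator function below.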
Consider the Brusselator with diffusion (23). We use a grid of N points,

    x_i = i/(N + 1),   i = 1 : N,   Δx = 1/(N + 1)

We approximate the second spatial derivatives by finite differences to obtain a system of ODEs of dimension 2N:
    u̇_i = A + u_i² v_i − (B + 1) u_i + (α/(Δx)²)(u_{i−1} − 2u_i + u_{i+1})
    v̇_i = B u_i − u_i² v_i + (α/(Δx)²)(v_{i−1} − 2v_i + v_{i+1})
    u_0(t) = u_{N+1}(t) = 1;   v_0(t) = v_{N+1}(t) = 3
    u_i(0) = 1 + sin(2πx_i)
    v_i(0) = 3                                                   (24)
We shall code this ODE. We order the components and the equations as follows:

    (u_1, v_1, u_2, v_2, …, u_N, v_N)
That is, we set x_{2i−1} = u_i and x_{2i} = v_i for i = 1 : N. If we adopt the convention

    x_{−1} = u_0(t) = x_{2N+1} = u_{N+1}(t) = 1   and   x_0(t) = v_0(t) = x_{2N+2}(t) = v_{N+1}(t) = 3

to incorporate the boundary conditions, equation (24) is equivalent to
    ẋ_{2i−1} = A + x_{2i−1}² x_{2i} − (B + 1) x_{2i−1} + (α/(Δx)²)(x_{2i−3} − 2x_{2i−1} + x_{2i+1})
    ẋ_{2i}   = B x_{2i−1} − x_{2i−1}² x_{2i} + (α/(Δx)²)(x_{2i−2} − 2x_{2i} + x_{2i+2})
    x_{2i−1}(0) = 1 + sin(2π i/(N + 1))
    x_{2i}(0)   = 3                                              (25)
It is straightforward to check from the original equation that each RHS of the system depends at most on the 2 preceding and the 2 following coordinates. This proves that the Jacobian is banded, with mu=2 and ml=2.
Following the syntax of section (5.3.11), and setting c = α/(Δx)² and D = (B + 1) + 2c to shorten the formulas, the band matrix J is, for the first 4 columns,
    [ 0               0               c               c              ]
    [ 0               x_1²            0               x_3²           ]
    [ 2x_1x_2 − D     −x_1² − 2c      2x_3x_4 − D     −x_3² − 2c     ]
    [ B − 2x_1x_2     0               B − 2x_3x_4     0              ]
    [ c               c               c               c              ]
And for the remaining columns, up to the two last columns (the generic pair of columns (2i − 1, 2i) on the left, the last pair (2N − 1, 2N) on the right),

    [ c                       c                   c                       c                  ]
    [ 0                       x_{2i−1}²           0                       x_{2N−1}²          ]
    [ 2x_{2i−1}x_{2i} − D     −x_{2i−1}² − 2c     2x_{2N−1}x_{2N} − D     −x_{2N−1}² − 2c    ]
    [ B − 2x_{2i−1}x_{2i}     0                   B − 2x_{2N−1}x_{2N}     0                  ]
    [ c                       c                   0                       0                  ]
The code, using the vectorization properties of Scilab, results from these formulas. We introduce a temporary vector to take the boundary conditions into account. The Jacobian is coded as a sub-function of the main program.
function xdot=brusseldiff(t,x)
//constants
A=1;B=3;alpha=1/50;
//
// dimension of the problem
N=20;
c=alpha*(N+1)^2;
// a temporary vector for taking into account
// the boundary conditions
x=x(:);
y=[1;3;x;1;3];
xdot=zeros(2*N,1);
i=1:2:2*N-1;
xdot(i)=A+y(i+3).*y(i+2).^2-(B+1)*y(i+2)+...
c*(y(i)-2*y(i+2)+y(i+4));
xdot(i+1)=B*y(i+2)-y(i+3).*y(i+2).^2+...
c*(y(i+1)-2*y(i+3)+y(i+5));
/////////////////
function J=Jacbruss(t,x)
//banded Jacobian ml=2 mu=2
//see the notes for the explanation
x=x(:); N=20;
J=zeros(5,2*N);
A=1;B=3;alpha=1/50;
c=alpha*(N+1)^2;
d=B+1+2*c;
// five rows to be defined
J(1,3:2*N)=c;
J(5,1:2*N-2)=c;
J(2,2:2:2*N)=(x(1:2:2*N).^2)';
J(3,1:2:2*N)=(2*x(1:2:2*N).*x(2:2:2*N))'-d;
J(3,2:2:2*N)=-J(2,2:2:2*N)-2*c;   // diagonal: -x(2i-1)^2-2c
J(4,1:2:2*N)=-J(3,1:2:2*N)-2*c-1; // equals B-2*x(2i-1)*x(2i)
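To gain confidence in a hand-coded banded Jacobian, it is worth comparing it with a finite-difference approximation at an arbitrary point. The following check is our own addition (it is not in the original notes); recall that in band storage the full-matrix entry J(i,j) is kept in Jband(i-j+mu+1,j):
// our sanity check: one column of the banded Jacobian versus
// a finite difference of brusseldiff (both functions loaded)
N=20; n=2*N; mu=2; ml=2;
x=rand(n,1); t=0;
Jb=Jacbruss(t,x);
j=5; h=1e-6;
e=zeros(n,1); e(j)=h;
col=(brusseldiff(t,x+e)-brusseldiff(t,x))/h;
for i=max(1,j-mu):min(n,j+ml)
mprintf("%d  %g  %g\n",i,Jb(i-j+mu+1,j),col(i));
end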
Now we can integrate the ODE on the interval [0, 10].
We have to load the file brusseldiff.sci, set %ODEOPTIONS with jactype=4, mu=2, ml=2 and mxstep=20000, and set the I.V.:
-->;getf("/Users/sallet/Documents/Scilab/brusseldiff.sci");
-->%ODEOPTIONS=[1,0,0,%inf,0,4,20000,12,5,0,2,2];
-->N=20;
-->X=(1:N)/(N+1); t0=0; T=linspace(0,10,100);
-->x0=zeros(2*N,1);
-->x0(1:2:2*N)=1+sin(2*%pi*X)';
-->x0(2:2:2*N)=3;
-->solbru=ode(x0,t0,T,brusseldiff,Jacbruss);
The solution is obtained in a snap. We can do the same for N = 40: you just have to change the value of N in the file brusseldiff.sci, in only two places, at the beginning of the functions brusseldiff and Jacbruss. Don't forget to save the file and make a "getf"; then retype the same lines of instructions in the command window and you get the solution for N = 40!
Now we can plot the solution of the PDE; u_i is the discretization of the solution u. Using the properties of plot3d, these lines of commands
-->plot3d1(T,X,solbru(1:2:2*N,:)')
-->xset('window',1)
-->plot3d1(T,X,solbru(2:2:2*N,:)')
give the figures
[Figure: 3D surface]
Fig. 21 – Solution u(x, t) of (23)
[Figure: 3D surface]
Fig. 22 – Solution v(x, t) of (23)
To obtain a figure comparable to the one in HW p. 7, we change the point of view and the orientation of the x-axis:
-->plot3d1(T,X(N:-1:1),solbru(1:2:2*N,:)',170,45)
and obtain
[Figure: 3D surface]
Fig. 23 – Solution u(x, t), other point of view
You are now able to test the ODE solvers of Scilab on classic test problems. See the section of examples. Do it! Experiment, make your own errors; it is the only way to learn how to use Scilab.
5.5 Passing parameters to functions
It is frequent to study ODEs with parameters, and to want to see what the effects of the parameters on the solutions are. The pedestrian way is to change the parameters in the file of the ODE, save the file, and make a "getf" on this file in the command window to redefine the function. In the preceding section we used this method to change the number of discretization points N from 20 to 40. If you want to test, say, 100 parameter values, this is not very convenient: the value of N is assigned inside the file.
If a variable in a function is not defined (and is not among the input parameters), then it takes the value of the variable having the same name in the calling environment. This variable however remains local, in the sense that modifying it within the function does not alter the variable in the calling environment. A function thus has access to all the variables of the upper levels, but cannot modify these variables. Functions can be invoked with fewer input or output parameters than declared.
It is not possible, though, to call a function if one of the parameters of the calling sequence is not defined.
In this section we give a way to set parameters easily and to automate the process using a script.
In the ode syntax of Scilab you replace the argument f by a list:
ode(x0,t0,t,lst)
where lst is a list with the following structure:
lst=list(f,param1,param2,...,paramk);
The arguments param1, param2, …, paramk are the parameters you want. The function f must be a function with the syntax
function xdot=f(t,x,param1,param2,...,paramk)
We shall give an example. If a Jacobian is coded, the same rules apply.
5.5.1 Lotka-Volterra equations, passing parameters
The Lotka-Volterra ODEs can be found in any book on ODEs.
The predator–prey Lotka-Volterra equations are defined by

    ẋ_1 = (a − b x_2) x_1
    ẋ_2 = (−c + d x_1) x_2                                       (26)
We have 4 positive parameters: (a, b, c, d).
Then we code the function
function xdot=lotka(t,x,a,b,c,d)
xdot=[(a-b*x(2))*x(1);(-c+d*x(1))*x(2)]
Now, at the command line, we can give values to the parameters and call the solver ode:
a=1;b=1;c=1;d=2; //example
lst=list(lotka,a,b,c,d)
sol=ode(x0,t0,t,lst);
To change the parameters it suffices to change the values of (a, b, c, d).
For example we fix a = 1, b = 0.1, c = 1 and we want to see what happens when d varies from 0.1 to 0.9. We want the solution (x_1(10), x_2(10)) for each value of the parameter d = 0.1 : 0.1 : 0.9, with initial value [1; 1].
Here is a script lotkascript.sci producing an array answering the question:
// Lotka script for the tableau
//
;getf("/Users/sallet/Documents/Scilab/lotka.sci");
a=1;b=0.1;c=1;
d=.1:.1:.9;
t0=0;
tf=10;
x0=[1;1];
tableau=zeros(2,9);
tableau=[d;tableau];
n=length(d);
for i=1:n
X=ode(x0,t0,tf,list(lotka,a,b,c,d(i)));
tableau(2:3,i)=X;
end
You execute the script in the command window:
-->;exec("/Users/sallet/Documents/Scilab/lotkascript.sci");
-->tableau
tableau=
! 0.1 0.2 0.3 0.4 0.5 !
! 2.7917042 0.7036127 0.5474123 0.5669242 0.6287856 !
! 25.889492 13.875439 6.9745155 4.3248549 3.0348836 !
column 6 to 9
! 0.6 0.7 0.8 0.9 !
! 0.7055805 0.7867920 0.8661472 0.9386788 !
! 2.2915144 1.8082973 1.4650893 1.2052732 !
5.5.2 variables in Scilab and passing parameters
To sum up the question of passing parameters: if you change the parameters often, use the parameters as inputs of the function, and use a list when you invoke the solver ode.
You can use parameters inside the code of a function, but these parameters must exist somewhere: created in the file, or existing in the current memory of the command window.
It is not necessary to create global variables. A warning: a sub-function has no access to the parameters of the other functions of the program.
For example, consider the following function foo.sci
function w=foo(t)
c3=22.62; omega=%pi/43200;
w=1+subfoo(t)+N
//
function z=subfoo(t)
z=c3*t
If you type in Scilab
-->N=40;
-->;getf("/Users/sallet/Documents/Scilab/foo.sci");
-->foo(1)
ans =
63.62
-->subfoo(1)
!--error 4
undefined variable : c3
at line 2 of function subfoo called by :
subfoo(1)
The function subfoo has no access to c3, only foo does. To remedy this you have 3 solutions:
– create c3 in subfoo, giving it a value. This is the manner used in the sub-function Jacozone of the function ozone.sci in section (5.4.3). In this case c3 is a variable of the program foo.sci.
– create c3 in the base workspace, by typing it in the command window.
– declare c3 global in foo and in subfoo, and assign a value to c3 in foo. This is equivalent to the first method, with a little subtlety.
Ordinarily, each Scilab function has its own local variables and can "read" all variables created in the base workspace or by the calling functions. The keyword global makes variables readable and writable across functions: any assignment to such a variable, in any function, is available to all the other functions declaring it global. For example, create the two functions
foo1.sci (see the difference with foo.sci):
function w=foo1(t)
global c3
c3=22.62;
w=1+subfoo(t)/c3
//
function z=subfoo(t)
global c3
z=c3*t
And the function foo2.sci:
function w=foo2(t)
global c3
w=1+subfoo2(t)/c3
//
function z=subfoo2(t)
global c3
z=c3*t
If you load these two functions by getf, a simple trial shows that foo1 and foo2 share the variable c3, even though c3 is given no value in foo2, where it is just declared global. Note however that c3 is not a variable of the base workspace.
-->foo2(1)
ans =
2.
-->subfoo(1)
ans =
22.62
-->c3
!--error 4
undefined variable : c3
5.6 Discontinuities
As we have seen in the theoretical results on existence and uniqueness for ODEs, a standard hypothesis is a certain regularity of the function defining the RHS of the ODE. Lipschitz continuity is the minimum, and very often some smoothness is assumed. Moreover, the theoretical results on the convergence and the order of the methods assume that the function f is sufficiently differentiable (at least p times). So when the RHS has some discontinuity we have a problem. Unfortunately this is often the case in applications.
All the algorithms are intended for smooth problems. Even if quality codes can cope with singularities, this places a great burden on the code, and unexpected results can come out. We shall illustrate this with an example from the book of Shampine.
5.6.1 Pharmacokinetics
This example is a simple compartmental model of lithium absorption. Mania is a severe form of emotional disturbance in which the patient is progressively and inappropriately euphoric and simultaneously hyperactive in speech and locomotor behaviour. In some patients, periods of depression and mania alternate, giving rise to the form of affective psychosis known as bipolar depression, or manic-depressive disorder. The most effective medications for this form of emotional disorder are the simple salts lithium chloride or lithium carbonate. Although some serious side effects can occur with large doses of lithium, the ability to monitor blood levels and keep the doses within modest ranges (approximately one milliequivalent [mEq] per litre) makes it an effective remedy for manic episodes, and it can also stabilize the mood swings of the manic-depressive patient.
Lithium carbonate has a half-life of 24 hours in the blood. The first question is: with a dosage of one tablet every 12 hours, how long does it take for the medication to reach a steady state in the blood? Remember that the safe range is narrow.
This is modeled by a two-compartment model. The first compartment is the digestive tract and the second is the blood. The dynamics can be summarized by
[Figure: compartment diagram; input u(t) into the intestinal tract (x_1), transfer at rate a ≈ 5.6 to the blood (x_2), excretion at rate b ≈ 0.7]
Fig. 24 – Compartment model of Lithium
A mass balance analysis gives the equations

    ẋ_1 = −a x_1 + u(t)
    ẋ_2 = a x_1 − b x_2                                          (27)
Since the half-life of lithium in the blood is 24 hours, if we use the day as the unit of time, the ODE giving the excretion of lithium (without further input) is ẋ_2 = −b x_2, which gives a half-life of log(2)/b; hence

    b = log(2) ≈ 0.6931472

The half-life in the digestive tract is 3 hours, which gives

    a = log(2)/(3/24) ≈ 5.5451774
The lithium is administered in one unit dose, absorbed over one half-hour, every half-day. So u(t) is a periodic function of period 1/2 day, constant on an interval of length 1/48 (half an hour in days) and zero on the rest of the period. The constant is such that the integral over a half-day (the quantity of drug absorbed) is 1.
Then u(t), of period 1/2, is defined on [0, 1/2] by

    u(t) = 48   if 0 ≤ t ≤ 1/48
    u(t) = 0    if 1/48 < t < 1/2
Clearly the RHS of the ODE is discontinuous, but the points of discontinuity are known.
Programming u(t) in Scilab:
Once more there is a clumsy way to code u(t): case by case, using if statements. There is a better way, which has the great advantage of giving vectorized code, and which moreover uses built-in functions and is therefore considerably quicker.
The keys are boolean test expressions and the function pmodulo. The function pmodulo is vectorized and can be used to build any periodic function.
The function pmodulo(x,P) is

    pmodulo(x,P) = x - P .* floor(x ./ P)

That is, pmodulo(x,P) is the remainder of the Euclidean division of x by P (seen as a period). The number pmodulo(x,P) satisfies

    0 ≤ pmodulo(x, P) < P

When you consider the grid PZ, pmodulo(x, P) is the distance from x to the nearest grid point on its left, i.e. if n is the integer such that nP ≤ x < (n + 1)P, then pmodulo(x, P) = x − nP.
With these preliminaries it is now easy to code a periodic function of period P whose value is 1 on [0, w] and 0 on ]w, P]. We call it pulsep(t, w, P):
function x=pulsep(t,w,p)
//gives a periodic step function of the variable t, of period p
//value 1 if 0 <= t <= w and 0 if w < t < p
x=bool2s(pmodulo(t,p) <= w);
We test whether the remainder is less than or equal to w; we obtain boolean values, %T or %F, which bool2s converts to 1 or 0. See the comments in the code of the function k_i in section (5.4.3).
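A quick check of pulsep on a few values (our own example; the expected results follow directly from the definition):
// one unit-height pulse of width 1/48 at the start of each half-day
t=[0, 1/96, 1/48, 0.1, 0.5, 0.51];
pulsep(t,1/48,1/2)   // expected: 1. 1. 1. 0. 1. 1.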
Now it is simple to code the lithium model:
function xdot=phk(t,x)
// pharmacokinetics example from Shampine
// plot of numerical solutions of ODE pp105
b=log(2); a=log(2)/(3/24);
xdot=[-a 0; a -b]*x + 48*[pulsep(t,1/48,1/2); 0];
//
function x=pulsep(t,w,p)
x=bool2s(pmodulo(t,p) <= w);
We now write a script to test some solvers: the RK, Adams and RKF (Shampine and Watts) methods.
// compute solution for the pharmaco-kinetics problem
//
;getf("/Users/sallet/Documents/Scilab/phk.sci");
//
x0=[0;0];
t0=0;
tf=10;
T=linspace(t0,tf,100);
T=T(:);
X=ode('rk',x0,t0,T,phk);
xset('window',0)
xbasc()
plot2d(T,X(2,:)',2)
X1=ode('rkf',x0,t0,T,phk);
xset('window',1)
xbasc()
plot2d(T,X1(2,:)',3)
X2=ode('adams',x0,t0,T,phk);
xset('window',2)
xbasc()
plot2d(T,X2(2,:)',3)
For the method "RKF":
[Figure: blood concentration versus time]
Fig. 25 – concentration of Li in blood, by RKF method
For the method "RK":
[Figure: blood concentration versus time]
Fig. 26 – concentration of Li in blood, by RK method
For the method "Adams":
[Figure: blood concentration versus time]
Fig. 27 – concentration of Li in blood, by Adams method
The solvers give contradictory results. To understand what happens, we should also plot the first component:
[Figure: both components for the RKF and RK methods]
Fig. 28 – Plotting the two components
It is clear that the "RKF" method, which was substantially quicker, has taken big steps, too big, and has missed some absorptions of tablets, namely at times 3, 3.5, 4 and 4.5.
The way to get a more precise solution is to integrate interval by interval, stopping at the bad points and restarting.
Integrating on intervals of smoothness for f:
We shall now integrate on the intervals of smoothness of u(t), for 10 days. We have to integrate on 40 intervals, defined by the mesh

    [0, 1/48, 1/2, …, p/2, p/2 + 1/48, (p + 1)/2, …, 19/2 + 1/48, 10]

On intervals of the form [p/2, p/2 + 1/48] the function u has the value 48; on intervals of the form [p/2 + 1/48, (p + 1)/2] it has the value 0.
To define this mesh we could use, for example, a for loop. But in Scilab, once again, this is clumsy: a principle is to use the vectorized capabilities of Scilab and avoid loops, which are time-consuming. The idea is to take the interval [0, 1/48] and make 20 translations of step 1/2. To begin, we concatenate 20 copies of this interval. This is done using the Kronecker product of matrices:
-->I=[0,1/48] ;
-->J=ones(1,20);
-->A=kron(J,I);
now to each block we have to make a translation of a multiple of 1/2.
-->K=ones(1,2);
-->L=0:1/2:19/2 ; // translation
-->M=kron(L,K); //blocks of translations
-->mesh=A+M;
-->mesh=[mesh,10]; // the last point is missing.
We obtain a vector of mesh points of length 41.
Now that we have the mesh, we can integrate. To begin we prepare the output. We take n = 100 output points per integration interval.
t0=0;
x0=[0;0];
xout=x0';
tout=t0;
n=100;
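The loop below calls two right-hand-side functions, odephk2 (with lithium absorption, u = 48) and odephk1 (without absorption, u = 0), which are not listed in these notes. A minimal sketch consistent with equation (27) would be:
function xdot=odephk2(t,x)
// RHS during absorption: u(t)=48 (our sketch)
b=log(2); a=log(2)/(3/24);
xdot=[-a*x(1)+48; a*x(1)-b*x(2)];
//
function xdot=odephk1(t,x)
// RHS without absorption: u(t)=0 (our sketch)
b=log(2); a=log(2)/(3/24);
xdot=[-a*x(1); a*x(1)-b*x(2)];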
We set the loop:
// a loop for solving the problem:
// we integrate half a day at a time;
// each half-day is divided in two
// integration intervals
for i=1:20
//integration with lithium absorption
//on the first part of the half-day
T=linspace(mesh(2*i-1),mesh(2*i),n);
X=ode(x0,t0,T,odephk2);
tout=[tout;T(2:$)'];
xout=[xout;X(:,2:$)'];
//new IV
t0=T($);
x0=X(:,$);
//integration without lithium absorption
//on the second part of the half-day
T=linspace(mesh(2*i),mesh(2*i+1),n);
X=ode(x0,t0,T,odephk1);
tout=[tout;T(2:$)'];
xout=[xout;X(:,2:$)'];
//new IV
t0=T($);
x0=X(:,$);
end
xbasc()
plot2d(tout,xout(:,2))
So we obtain the following plot:
[Figure: blood concentration versus time]
Fig. 29 – concentration of Li in blood, by interval
We can compare the different plots: in blue the "true value", obtained by integrating interval by interval; in red with dots the RK method; in green the RKF method. We obtain the following plot:
[Figure: the three solutions superimposed, labelled RKF, RK and Interval]
Fig. 30 – comparison of methods
We see that the RKF method gives bad results; this is due to the skipping of some lithium absorptions, since RKF chooses too large integration steps (when the step size is not constrained by the user). The RK and Adams methods behave well.
Plot the first component and compare.
5.7 Event locations
The preceding example of the absorption of lithium is a case of a discontinuous RHS where the "bad" points are known. In this case, breaking up the integration interval solves the problem. Each break gives rise to a restart; if there are a great many places where singularities occur, the restarts become expensive, and there are cases where it is difficult to approach the problem in this way.
All this presumes that we know where the bad places occur. When they depend on the solution itself, the matter becomes much more difficult, because it is necessary to locate the bad points. This situation is quite frequent in applications where the RHS depends on a switching function. The problem is to detect when the solution crosses a surface of discontinuity.
A common example is a problem which can be written

    ẋ = f_1(x)   if g(x) > 0
    ẋ = f_2(x)   if g(x) < 0                                     (28)

The function g is called a switching function, and the hypersurface of equation g(y) = 0 is the surface of discontinuity. We shall come back to this point later, when considering the Coulomb example.
Scilab has event location capabilities.
We shall illustrate them with three examples and explain how to use these capabilities. The first two are given by a bouncing ball; the third is an example, taken from HW, of Coulomb's friction law.
We first describe the syntax of "root".
5.7.1 Syntax of "root"
The syntax, in the command window, is:
[x,rd]=ode('root',x0,t0,tf,rtol,atol,f,jac,ng,g)
The parameters are
– x0 : real vector or matrix (initial conditions).
– t0 : real scalar (initial time).
– tf : real scalar (final time).
– f : external, i.e. a function which gives the RHS of the ODE.
– rtol, atol : real constants or real vectors of the same size as x.
– jac : external, i.e. a function which gives the Jacobian of the RHS of the ODE.
– ng : integer, the number of constraints to be satisfied, i.e. the dimension of g.
– g : external, i.e. a function, a character string or a list.
5.7.2 Description
With this syntax (first argument equal to "root") ode computes the solution of the differential equation ẋ = f(t, x) until the state x(t) crosses the surface of R^n of equation g(t, x) = 0.
Needless to say, we must choose the final time tf safely, i.e. so that we are sure to encounter the surface.
The function g should give the equation of the surface. It is an external, i.e. a function with a specified syntax, or the name of a Fortran subroutine or a C function (character string) with a specified calling sequence, or a list.
If g is a function, the syntax should be as follows: z = g(t, x), where t is a real scalar (time) and x a real vector (state). It returns a vector of size ng which corresponds to the ng constraints. If g is a character string, it refers to the name of a Fortran subroutine or a C function, with the following calling sequence: g(n, t, x, ng, gout), where ng is the number of constraints and gout is the value of g (the output of the program).
If we must pass parameters to g, the function g can be a list, with the same conventions as for f (see section (5.5)); the same holds for jac.
The output rd is a 1 × k vector. The first entry contains the stopping time; the other entries indicate which components of g have changed sign. A value of k larger than 2 indicates that more than one surface ((k − 1) surfaces) have been simultaneously traversed.
The output x is the (computed) solution at the time the surface defined by g is reached.
5.7.3 Precautions to be taken in using "root"
Missing a zero:
Some care must be taken when using "root" for event location. The solver uses the root-finding capabilities of Scilab. But, on the other hand, the solver adjusts its steps according to the required tolerances. When the ODE is particularly smooth and easy, the solver can take large steps and miss some zeros of the function g.
To show what can happen, we use an explicit example. We define a function with some zeros near the origin, constant outside a neighborhood of the origin, and we choose a very simple ODE, namely ẋ = 0!
The function g is built from a sinusoid, sin(50πt), with zeros at each step 1/50: we multiply this sinusoid by the characteristic function of the interval [0, 0.05] and add the characteristic function of the interval [0.05, ∞[. Then our function g, which depends only on time, has zeros located at 0, 0.02 and 0.04, and is constant, equal to 1, on [0.05, ∞[.
With this we use "root":
-->deff('xdot=testz(t,x)','xdot=0')
-->deff('y=g(t,x)','y=sin(50*%pi*t).*(t>0).*(t<.05)+...
bool2s(t>=.05)')
-->[xsol,rd]=ode('root',x0,t0,tf,testz,1,g);
-->rd
rd =
[]
The vector rd is empty. No zero has been detected!
The remedy for this is to adjust hmax, the maximal step size used by the solver (see section (5.3.4)). We use %ODEOPTIONS and set hmax to 0.01:
-->%ODEOPTIONS=[1,0,0,.01,0,2,10000,12,5,0,-1,-1];
-->[xsol,rd]=ode('root',x0,t0,tf,testz,1,g);
-->rd
rd =
! 0.02 1. !
The solver stops at the first zero of g. We can restart with this new I.V.:
-->t0=rd(1);x0=xsol;
-->[xsol,rd]=ode('root',x0,t0,tf,testz,1,g);
-->rd
rd =
! 0.04 1. !
We obtain the second zero.
Restarting from an event location:
There is here another subtlety. Sometimes, when we ask Scilab to find the next event without precautions, we get a warning message:
lsodar- one or more components of g has a root
too near to the initial point
In other words, the root-finding capabilities find the event you have previously detected. In our example this has not occurred, but it can happen (see the next examples). You can remedy this situation in three ways:
1. Adjust the minimum step hmin, %ODEOPTIONS(5), which is 0 by default (see section (5.3.5)).
2. Adjust the first step tried h0, %ODEOPTIONS(3), which is 0 by default (see section (5.3.3)).
3. Or integrate the solution a little bit, and restart from this new point to search for the next events. We have to use this trick in the example of the ball on a ramp, in section (5.7.5), and in the example of the dry friction of Coulomb's law.
Remark 5.1 :
Modifying h0 or hmin is not harmless. You can miss some singularities of your system when one of these two quantities is too big.
5.7.4 The bouncing ball problem
A ball is thrown in a vertical plane. The position of the ball is represented by its abscissa x(t) and its height h(t).
The ODE satisfied by (x(t), h(t)) is coded, with state (x, ẋ, h, ḣ), by
function xdot=fallingball2d(t,x)
xdot=[x(2); 0; x(4); -9.81]
At time t_0 = 0 the ball is thrown with initial conditions x(0) = 0, ẋ(0) = 5, h(0) = 0 and ḣ(0) = 10.
We compute numerically the time at which the ball hits the ground.
-->;getf("/Users/sallet/Documents/Scilab/fallingball2d.sci");
-->t0=0;
-->x0=[0;5;0;10];
-->tf=100;
-->deff('z=g(t,x)','z=x(3)')
-->[x,rd]=ode('root',x0,t0,tf,fallingball2d,1,g);
-->rd(1)
ans =
2.038736
-->x
x =
! 10.19368 !
! 5. !
! 0. !
! -10. !
What are the values of x, ẋ, h, ḣ? They are obtained by displaying x, so

    x = 10.19…;   ẋ = 5;   h = 0;   ḣ = −10
When the ball hits the ground it bounces; in other words, if t_hit is the time of the hit, the arrival velocity (ẋ(t_hit), ḣ(t_hit)) becomes

    (ẋ(t_hit), −k ḣ(t_hit))

where k is a coefficient of elasticity. We set k = 0.8.
We write the following script:
// script for a falling ball
//
;getf("/Users/sallet/Documents/Scilab/fallingball2d.sci");
tstart = 0;
tfinal = 30;
x0 = [0; 5; 0; 10];
deff('z=g(t,x)','z=x(3)')
tout = tstart;
xout = x0.';
teout = [];
xeout = [];
for i = 1:10
//solve till the first stop
[x,rd]=ode('root',x0,tstart,tfinal,fallingball2d,1,g);
//accumulate output
//
T=linspace(tstart,rd(1),100)';
T=T(:);
X=ode(x0,tstart,T,fallingball2d);
tout=[tout;T(2:$)];
xout=[xout;X(:,2:$)'];
teout=[teout;rd(1)];
xeout=[xeout;x'];
// new IV (initial values): the ball bounces
x0=[x(1); x(2); 0; -.8*x(4)];
tstart=rd(1);
end
xset('window',1)
xbasc()
plot2d(tout,xout(:,3))
plot2d(teout,zeros(teout),-9,"000")
When the script is executed, we obtain a tableau of the 10 first times at which the ball hits the ground, together with the values of x, ẋ, h, ḣ at these events. The time is given in the first column.
-->[teout xeout]
ans =
! 2.038736 10.19368 5. 0. -10. !
! 3.6697248 18.348624 5. 0. - 8. !
! 4.9745158 24.872579 5. 0. - 6.4 !
! 6.0183486 30.091743 5. - 3.109E-15 - 5.12 !
! 6.0183486 30.091743 5. 1.061E-15 4.096 !
! 6.0183486 30.091743 5. - 1.850E-15 - 3.2768 !
! 6.0183486 30.091743 5. 8.147E-16 2.62144 !
! 6.0183486 30.091743 5. - 9.009E-17 - 2.097152 !
! 6.0183486 30.091743 5. 6.549E-16 1.6777216 !
! 6.0183486 30.091743 5. - 1.103E-15 - 1.3421773 !
With the accumulation of output (see the script) we obtain the following plot:
[Figure: height of the ball versus time, with the hitting events marked]
Fig. 31 – A bouncing ball on the ground
5.7.5 The bouncing ball on a ramp
We consider the same problem, but this time the ball bounces on a ramp (of equation x + y = 1).
The rule for restarting is a reflection rule on a surface: the new starting velocity vector is the mirror image of the arrival velocity with respect to the normal at the hitting point, reduced by the reduction coefficient.
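As a sanity check (ours, not in the notes), the matrix A of the script below implements exactly this rule for the 45-degree ramp x + y = 1, whose unit normal is n = [1;1]/sqrt(2):
// damped reflection of the velocity v=(xdot,hdot) across the ramp
k=0.422;
n=[1;1]/sqrt(2);
v=[3;-2];                // an arbitrary incoming velocity
vnew=k*(v-2*(n'*v)*n)    // equals [-k*v(2); -k*v(1)],
                         // i.e. the velocity rows of the matrix A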
We need here a refinement: when we ask Scilab to find the next event without precautions we get, as we have already said, the warning message
lsodar- one or more components of g has a root
too near to the initial point
Simply put, when starting from the new initial point, which is precisely an "event" location point, the solver finds this point again.
So we need to advance the solution a little bit and restart from this new point to search for the next events.
// script for a falling ball on a ramp
//
;getf("/Users/sallet/Documents/Scilab/fallingball2d.sci");
t0 = 0;
tfinal = 100;
x0 = [0;0;2;0];
deff('z=g(t,x)','z=[x(1)+x(3)-1;x(3)]')
tout = t0;
xout = x0.';
teout = [];
xeout = [];
k=0.422;
eps=0.001;
A=[1 0 0 0 ;0 0 0 -k;0 0 1 0; 0 -k 0 0];
for i = 1:7
//solve till the first stop
[x,rd]=ode('root',x0,t0,tfinal,fallingball2d,2,g);
//accumulate output
//
T=linspace(t0,rd(1),100)';
T=T(:);
X=ode(x0,t0,T,fallingball2d);
tout=[tout;T(2:$)];
xout=[xout;X(:,2:$)'];
teout=[teout;rd(1)];
xeout=[xeout;x'];
// new IV (initial values)
// must push the solution a little bit
x0=A*x;
t0=rd(1);
Z1=ode(x0,t0,t0+eps,fallingball2d);
tout=[tout;t0+eps];
xout=[xout;Z1'];
// new I.V.
x0=Z1(:,$);
t0=t0+eps;
end
xset('window',1)
xbasc()
plot2d(xout(:,1),xout(:,3))
plot2d(xeout(:,1),xeout(:,3),-9,"000")
We obtain the plot
[Figure: trajectory of the ball bouncing on the ramp, with the events marked; title "Falling Ball on a ramp"]
Fig. 32 – A bouncing ball on a ramp
Try the preceding code without "the little push" to see Scilab's reaction. Try the other solutions: modifying h0 or hmin.
Suppose that in the preceding configuration there is a vertical wall located at the end of the ramp. Compute the path of the ball.
In the first example the ground is flat. Suppose now that the ground is given by 0.1 sin(5πt). The ball is thrown at time 0 with I.V. [2, 0.1]. The coefficient of restitution is k = 0.9. Simulate the path of the ball.
5.7.6 Coulomb’s law of friction
We consider the example

    ẍ + 2D ẋ + µ sign(ẋ) + x = A cos(ωt)                         (29)

The parameters are D = 0.1, µ = 4, A = 2 and ω = π; the initial values are x(0) = 3 and ẋ(0) = 4.
This equation models the movement of a solid of mass M moving on a line, on which acts a periodic force A cos(ωt), subject to a dry friction force of intensity µ and to a viscous friction of intensity 2D.
We write the equation in standard form:

    ẋ_1 = x_2
    ẋ_2 = −0.2 x_2 − x_1 + 2 cos(πt) − 4 sign(x_2)               (30)
The switching function is x_2; the surface is then the x_1–axis. When the solution crosses the x_1–axis, different cases occur.
We follow a solution starting in the upper half-plane until it hits the manifold x_2 = 0. Let t_I denote the time at which the solution reaches the manifold, and let

    f_I(x, t) = −x_1 + 2 cos(πt) − 4   and   f_II(x, t) = −x_1 + 2 cos(πt) + 4

be the two values of the vector field on the x_1–axis.
Let us look at the different vector fields near the intersection with the x_1–axis:
1. f_I(x, t_I) < 0 and f_II(x, t_I) < 0: we continue the integration in the half-plane x_2 < 0.
2. f_I(x, t_I) < 0 and f_II(x, t_I) > 0: the solution remains trapped in the manifold x_2 = 0, with x_1 = constant = x_1(t_I), until one of the values f_I or f_II changes sign, so that both have the same positive sign; then the solution leaves the manifold and goes in the common direction given by the vector fields.
We shall now implement the solution of this problem.
For the differential equation we have 3 possibilities:
1. the dry friction term µ sign(ẋ_2) has the value 4 in the expression of ẋ_2 in equation (30);
2. or the dry friction term has the value −4 in the expression of ẋ_2 in equation (30);
3. or, if the speed is 0 (we are then in the x_1–axis manifold) and if the components of the two vector fields on the x_2 axis, given by the formulas

    −x_1 + 2 cos(πt) − 4   and   −x_1 + 2 cos(πt) + 4,

have opposite signs, then we are trapped in the manifold x_2 = 0, and the ODE is given by ẋ = 0.
We code these 3 possibilities by introducing a parameter MODE, which takes the values 1, −1 or 0 and is given by sign(x_2).
The event function is twofold. When we are not in the manifold x_2 = 0, the event is: the component x_2 takes the value 0. At this point we have to decide which way the solution of the ODE is going.
If the components of the two vector fields on the x_2 axis, given by the two formulas −x_1 + 2 cos(πt) − 4 and −x_1 + 2 cos(πt) + 4, have the same sign, the solution goes through the x_1–axis and we choose the appropriate vector field, corresponding to MODE=1 or MODE=-1.
If the two preceding expressions have opposite signs, then we set MODE=0, and the next event that we must detect is the change of sign of the expression

    (−x_1 + 2 cos(πt) − 4)(−x_1 + 2 cos(πt) + 4)

The event is then: looking for a zero of this expression.
With this analysis, a solution for the code could be
function xdot = coulomb(t,x,MODE)
// Example of a spring with dry and viscous friction
// and a periodic force
// MODE describe the vector field
// MODE=0 we are on the x1 axis
M=1; // Mass
A=2; // Amplitude of input force
mu=4; // Dry Friction force
D=0.1; // viscous force
//
if MODE==0
xdot = [0; 0];
else
xdot=[x(2); (-2*D*x(2)- mu*MODE +A*cos(%pi*t)-x(1))/M];
end
//------------------------------------------------------
//sub function for event location
function value=gcoul(t,x,MODE)
M=1; A=2; mu=4; D=0.1;
if MODE==0
F=( (-mu+A*cos(%pi*t)-x(1)) *( mu+A*cos(%pi*t)-x(1) ) );
value = F ;
else
value=x(2);
end
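At an event point one can check by hand whether the solution sticks to the manifold, by evaluating the two candidate fields there. This little verification is our own (the numbers are the second event from the table of results further below):
// signs of the two candidate fields at an event point (te,x1e):
// opposite signs mean the solution is trapped (MODE=0)
te=2.0352271; x1e=3.2164227;
fI=-x1e+2*cos(%pi*te)-4;    // field for x2>0
fII=-x1e+2*cos(%pi*te)+4;   // field for x2<0
sticking=(fI*fII<0)         // %T here: x1 stays constant for a while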
Now we write a script for finding the events and accumulating the output. The same remark applies: we have to push the solution a little away from the "event point".
// script for plotting Coulomb example
//-----------------------------------------------------
clear
A = 2; M = 1; mu=4; D=0.1;
t0=0;
tstart =t0;
tfinal = 10;
x0 = [3; 4];
;getf("/Users/sallet/Documents/Scilab/coulomb.sci");
// limiting hmax
%ODEOPTIONS=[1,0,0,.5,0,2,20000,12,5,0,-1,-1];
//
//-------------------------------------------------------
MODE=sign(x0(2));
tout = tstart;
xout = x0;
teout = [];
xeout = [];
while tfinal >= tstart
// Solve until the first terminal event.
//updating value of MODE
list1=list(coulomb,MODE);
list2=list(gcoul,MODE);
// We need to push a little bit the integration
// in order that gcoul has not a zero too near
// the initial point
Tplus=tstart+0.05;
xsolplus=ode(x0,tstart,Tplus,list1);
// Looking for a new zero of gcoul
x0=xsolplus;
tstart=Tplus;
[xsole,rd] = ode('root',x0,tstart,tfinal,list1,1,list2);
//If no event occur, then terminates till final time
if rd==[] then
T=linspace(tstart,tfinal,100);
xsol=ode(x0,tstart,T,list1);
xout=[xout,xsol(:,2:$)];
tout=[tout,T(2:$)];
break,
end;
// If event occurs accumulate events
xeout=[xeout,xsole];
teout=[teout,rd(1)];
// Accumulate output.
T=linspace(tstart,rd(1),100);
xsol=ode(x0,tstart,T,list1);
tout = [tout, T(2:$)];
xout = [xout, xsol(:,2:$)];
//start at end of the preceding integration
tstart = rd(1);
x0 = [xsole(1);0];
a1=-mu+A*cos(%pi*tstart)-x0(1);
F=(-mu+A*cos(%pi*tstart)-x0(1) ) * ...
(mu+A*cos(%pi*tstart)-x0(1));
if MODE==0
MODE=sign(a1);
elseif F<=0
MODE= 0 ;
else
MODE=sign(a1);
end
end
// plotting
xset('window',1)
xbasc()
// plot the force
t = t0 + (0:100)*((tfinal-t0)/100);
plot2d(t,A*cos(%pi*t),5);
//plot the solution
plot2d(tout',xout')
//Mark the events with a little "o"
plot2d(teout',xeout',style=[-9,-9])
Running this script gives the times and locations of the events:
-->[teout’,xeout’]
ans =
! 0.5628123 4.2035123 - 2.845E-16 !
! 2.0352271 3.2164227 7.683E-16 !
! 2.6281436 3.2164227 0. !
! 3.727185 2.9024581 7.175E-16 !
! 4.6849041 2.9024581 0. !
! 5.6179261 2.7281715 1.410E-15 !
! 6.7193768 2.7281715 0. !
! 7.5513298 2.6140528 9.055E-16 !
! 8.7436998 2.6140528 0. !
! 9.5042179 2.5325592 9.558E-16 !
Notice that the second component x_2 is zero, up to machine precision. Notice also that in the script we set this component to zero when we restart from these points.
The script gives the following plot:
[Figure: position, speed, and the force A cos(πt), with the events marked]
Fig. 33 – Solution of (30)
5.8 In the plane
Scilab has tools for ODEs in the plane. We can plot vector fields in the plane, and it is possible to plot solutions directly from a mouse click in the graphic window.
5.8.1 Plotting vector fields
calling sequence :
fchamp(f,t,xr,yr,[arfact,rect,strf])
fchamp(f,t,xr,yr,<opt_args>)
parameters :
– f : an external (function or character string) or a list which describes the ODE.
– It can be a function name f, where f is supposed to be a function of type y = f(t, x, [u]); f returns a column vector of size 2, y, which gives the value of the direction field f at point x and at time t.
– It can also be an object of type list, list(f,u1), where f is a function of type y=f(t,x,u) and u1 gives the value of the parameter u.
– t : the selected time.
– xr, yr : two row vectors of sizes n1 and n2 which define the grid on which the direction field is computed.
– <opt_args> : a sequence of statements key1=value1, key2=value2, ..., where key1, key2, ... can be one of the following: arfact, rect, strf.
DESCRIPTION :
fchamp is used to draw the direction field of a 2D first-order ODE defined by the external function f. Note, as usual, that if the ODE is autonomous, the argument t is useless, but it must be given.
Example 5.2 :
We consider the equation of the pendulum (5) and we plot the corresponding
vector field
// plotting the vector field of the pendulum
//
// get Equation ODE pendulum
;getf("/Users/sallet/Documents/Scilab/pendulum.sci");
//
n = 30;
delta_x = 2*%pi;
delta_y = 2;
x=linspace(-delta_x,delta_x,n);
y=linspace(-delta_y,delta_y,n);
xbasc()
fchamp(pendulum,0,x,y,1,...
[-delta_x,-delta_y,delta_x,delta_y],"031")
xselect()
Running this script gives you the picture
[Figure: direction field of the pendulum]
Fig. 34 – Vector field of the pendulum
5.8.2 Using the mouse
The following section is taken from B. Pinçon's book [21].
We can plot solutions of an ODE directly from the mouse, using xclick. We get the I.V. (for autonomous systems) with the mouse, then we integrate from this I.V.
The primitive xclick is governed by
Calling sequence :
[c_i,c_x,c_y,c_w]=xclick([flag])
parameters :
– c_i : integer, mouse button number.
– c_x, c_y : real scalars, position of the mouse.
– c_w : integer, window number.
– flag : integer. If present, the click event queue is not cleared when entering xclick.
description :
xclick waits for a mouse click in the graphics window.
If it is called with 3 left-hand-side arguments, it waits for a mouse click in the current graphics window.
If it is called with 4 left-hand-side arguments, it waits for a mouse click in any graphics window.
c_i : an integer which gives the number of the mouse button that was pressed: 0, 1 or 2 (for left, middle and right), or -1 in case of problems with xclick.
c_x, c_y : the coordinates of the position of the mouse click, in the current graphics scale.
c_w : the window number where the click has occurred.
Example 5.3 :
We consider the Brusselator equation (ODE); we plot the vector field and solve from a few clicks.
This is the Brusselator without diffusion, an ODE in the plane. We pass a parameter:
function [f] = Brusselator(t,x,eps)
//
f =[ - (6+eps)*x(1) + x(1)^2*x(2); (5+eps)*x(1) - x(1)^2*x(2)]
We now have the following script from [21] :
// Plotting Brusselator
;getf("/Users/sallet/Documents/Scilab/Brusselator.sci");
eps = -4;
P_stat = [2 ; (5+eps)/2];
// plotting limits
delta_x = 6; delta_y = 4;
x_min = P_stat(1) - delta_x; x_max = P_stat(1) + delta_x;
y_min = P_stat(2) - delta_y; y_max = P_stat(2) + delta_y;
n = 20;
x = linspace(x_min, x_max, n);
y = linspace(y_min, y_max, n);
// 1/ vector field plotting
xbasc()
fchamp(list(Brusselator,eps),0,x,y,1,[x_min,...
y_min,x_max,y_max],"031")
xfrect(P_stat(1)-0.08,P_stat(2)+0.08,0.16,0.16)
// critical point
xselect()
// 2/solving ODE
m = 500 ; T = 5 ;
rtol = 1.d-09; atol = 1.d-10; // solver tolerance
t = linspace(0,T,m);
color = [21 2 3 4 5 6 19 28 32 9 13 22 ...
18 21 12 30 27];
num = -1;
while %t
[c_i,c_x,c_y]=xclick();
if c_i == 0 then
plot2d(c_x, c_y, -9, "000")
u0 = [c_x;c_y];
[u] = ode('stiff',u0, 0, t, rtol, atol, list(Brusselator,eps));
num = modulo(num+1,length(color));
plot2d(u(1,:)’,u(2,:)’,color(num+1),"000")
elseif c_i == 2 then
break
end
end
Try it!
[Figure: Brusselator direction field with trajectories started from mouse clicks]
Fig. 35 – Brusselator trajectories from the mouse
5.9 test functions for ODE
Experimentation is a good way to understand a solver's performance. Collections of test problems exist; we give some here. The reader is invited to test his ability with Scilab on these problems.
5.9.1
    ẋ = x (4t³ − x)/(t⁴ − 1),    x(2) = 15

To integrate on [0, 10], [0, 30], [0, 50].
The solution is the polynomial

    x(t) = 1 + t + t² + t³
5.9.2
    ẋ = x,    x(0) = 1

To integrate on [0, 10], [0, 30], [0, 50].
5.9.3
    ẋ = −0.5 x³,    x(0) = 1

To integrate on [0, 20].
The solution is

    x(t) = 1/√(t + 1)
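As an illustration of how these tests can be run (our own sketch, with a hypothetical function name):
// solve problem 5.9.3 and compare with the exact solution
deff('xdot=f593(t,x)','xdot=-0.5*x^3')
t=linspace(0,20,101);
x=ode(1,0,t,f593);
err=max(abs(x-(t+1).^(-1/2)))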
5.9.4
    ẋ_1 = x_2
    ẋ_2 = −x_1
    x_1(0) = 0,  x_2(0) = 1

To integrate on [0, 2π], [0, 6π], [0, 16π].
5.9.5
    ẋ = Ax,    x(0) = (1, 0, …, 0)^T

where A is an N × N tridiagonal matrix,

    A = −2 eye(N,N) + diag(ones(N−1,1), 1) + diag(ones(N−1,1), −1)

To integrate on [0, 10], [0, 30], [0, 50] for N = 50.
The solution is

    x(t) = Q diag(exp(λ_i t)) Q^{−1} x_0

where the eigenvalues are

    λ_i = −4 sin²(iπ/(2(N + 1)))

and Q is the orthogonal matrix

    Q_ij = √(2/(N + 1)) sin(ijπ/(N + 1))
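Setting up this linear problem in Scilab is short (our own sketch; the function reads the matrix A from the calling environment, as explained in section 5.5.2):
N=50;
A=-2*eye(N,N)+diag(ones(N-1,1),1)+diag(ones(N-1,1),-1);
deff('xdot=flin(t,x)','xdot=A*x')
x0=[1;zeros(N-1,1)];
sol=ode(x0,0,linspace(0,10,101),flin);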
5.9.6
    ẋ_1 = x_2 x_3
    ẋ_2 = −x_1 x_3
    ẋ_3 = −0.51 x_1 x_2
    x_1(0) = 0,  x_2(0) = 1,  x_3(0) = 1

To integrate on [0, 4K], [0, 6K], [0, 28K].
The solution is

    x_1(t) = sn(t, 0.51),  x_2(t) = cn(t, 0.51),  x_3(t) = dn(t, 0.51)

where K = 1.862640080233273855203.
5.9.7
    ẍ_1 = −x_1/r³
    ẍ_2 = −x_2/r³
    r = √(x_1² + x_2²)
    x_1(0) = 1,  ẋ_1(0) = 0
    x_2(0) = 0,  ẋ_2(0) = 1

To integrate on [0, 2π], [0, 6π], [0, 16π].
The solution is

    x_1(t) = cos(t),  x_2(t) = sin(t)
5.9.8
    ẍ_1 = −x_1/r³
    ẍ_2 = −x_2/r³
    r = √(x_1² + x_2²)
    x_1(0) = 0.4,  ẋ_1(0) = 0
    x_2(0) = 0,   ẋ_2(0) = 2

To integrate on [0, 2π], [0, 6π], [0, 16π].
The solution is

    x_1(t) = cos(u) − 0.6,  x_2(t) = 0.8 sin(u)

where u is the solution of Kepler's equation u − 0.6 sin(u) = t.
5.9.9 Arenstorf orbit
    ẍ_1 = x_1 + 2ẋ_2 − µ* (x_1 + µ)/r_1³ − µ (x_1 − µ*)/r_2³
    ẍ_2 = x_2 − 2ẋ_1 − µ* x_2/r_1³ − µ x_2/r_2³
    r_1 = √((x_1 + µ)² + x_2²),   r_2 = √((x_1 − µ*)² + x_2²)
    µ = 1/82.45;  µ* = 1 − µ
    x_1(0) = 1.2,  ẋ_1(0) = 0
    x_2(0) = 0,   ẋ_2(0) = −1.0493575098303199

These are the equations of motion of a restricted three-body problem. There is no analytic expression available for the solution. The error is measured at the end of the period T,

    T = 6.19216933131963970
5.9.10 Arenstorf orbit
Same equations as in section 5.9.9, with

    µ = 1/82.45;  µ* = 1 − µ
    x_1(0) = 0.994,  ẋ_1(0) = 0
    x_2(0) = 0,     ẋ_2(0) = −2.00158510637908252

Period T:

    T = 17.06521656015796255
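A possible way to reproduce this orbit (our own sketch, with the state ordered as (x_1, ẋ_1, x_2, ẋ_2); mu and mus are read from the calling environment):
function xdot=aren(t,x)
// restricted three-body RHS (our sketch)
r1=sqrt((x(1)+mu)^2+x(3)^2);
r2=sqrt((x(1)-mus)^2+x(3)^2);
xdot=[x(2);
x(1)+2*x(4)-mus*(x(1)+mu)/r1^3-mu*(x(1)-mus)/r2^3;
x(4);
x(3)-2*x(2)-mus*x(3)/r1^3-mu*x(3)/r2^3];
and then:
mu=1/82.45; mus=1-mu;
T=17.06521656015796255;
x0=[0.994;0;0;-2.00158510637908252];
t=linspace(0,T,1000);
sol=ode(x0,0,t,1d-10,1d-12,aren);
norm(sol(:,$)-x0)   // error at the end of the period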
5.9.11 The knee problem
[Dahlquist] :

    ε ẋ = (1 − t)x − x²,    x(0) = 1

The parameter ε satisfies 0 < ε ≪ 1.
This is a stiff problem. Take ε = 10⁻⁴, ε = 10⁻⁶. To integrate on [0, 2].
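A possible run with a stiff solver (our own sketch):
// the knee problem with eps = 1e-4
epsk=1e-4;
deff('xdot=knee(t,x)','xdot=((1-t)*x-x^2)/epsk')
t=linspace(0,2,201);
x=ode('stiff',1,0,t,knee);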
5.9.12 Problem not smooth
[Dahlquist] :

    ẍ = −x − sign(x) − 3 sin(2t)
    x(0) = 0,  ẋ(0) = 3

To integrate on [0, 8π].
5.9.13 The Oregonator
    ẋ_1 = 77.27 (x_2 + x_1(1 − 8.375 × 10⁻⁶ x_1 − x_2))
    ẋ_2 = (1/77.27) (x_3 − (1 + x_1) x_2)
    ẋ_3 = 0.161 (x_1 − x_3)
    x_1(0) = 1,  x_2(0) = 2,  x_3(0) = 3

To integrate on [0, 30], [0, 60].
This is a stiff problem with a limit cycle.
Références
[1] P. Bogacki and L.F. Shampine. A 3(2) pair of Runge-Kutta formulas. Appl. Math. Lett., 2(4):321–325, 1989.
[2] P. Bogacki and L.F. Shampine. An efficient Runge-Kutta (4,5) pair. Comput. Math. Appl., 32(6):15–28, 1996.
[3] Peter N. Brown, George D. Byrne, and Alan C. Hindmarsh. VODE: a variable-coefficient ODE solver. SIAM J. Sci. Stat. Comput., 10(5):1038–1051, 1989.
[4] J.C. Butcher. Numerical methods for ordinary differential equations. Wiley, 2003.
[5] George D. Byrne and Alan C. Hindmarsh. Stiff ODE solvers: a review of current and coming attractions. J. Comput. Phys., 70:1–62, 1987.
[6] C.W. Gear. Numerical initial value problems in ODE. Prentice Hall, 1971.
[7] George E. Forsythe, Michael A. Malcolm, and Cleve B. Moler. Computer methods for mathematical computations. Prentice Hall, 1977.
[8] C.W. Gear. Algorithm 407: DIFSUB for solution of ordinary differential equations [D2]. Commun. ACM, 14(3):185–190, 1971.
[9] Ernst Hairer and Martin Hairer. GniCodes - Matlab programs for geometric numerical integration. In Frontiers in numerical analysis, 2002.
[10] Ernst Hairer, Syvert P. Nørsett, and Gerhard Wanner. Solving ordinary differential equations. I: Nonstiff problems. 2nd rev. ed. Springer-Verlag, 1993.
[11] Ernst Hairer and Gerhard Wanner. Solving ordinary differential equations. II: Stiff and differential-algebraic problems. 2nd rev. ed. Springer-Verlag, 1996.
[12] Jack K. Hale. Ordinary differential equations. 2nd ed. R.E. Krieger Publishing Company, 1980.
[13] A.C. Hindmarsh and G.D. Byrne. EPISODE. In Numerical methods for differential systems, 1976.
[14] A.C. Hindmarsh and G.D. Byrne. ODEPACK, a systematized collection of ODE solvers. In R.S. Stepleman et al., editors, Scientific Computing, pages 55–64. North-Holland, Amsterdam, 1976.
[15] A.C. Hindmarsh and G.D. Byrne. LSODE and LSODI, two new initial value ordinary differential equation solvers. ACM SIGNUM, 15:10–11, 1980.
[16] Morris W. Hirsch and Stephen Smale. Differential equations, dynamical systems, and linear algebra. Academic Press, 1974.
[17] T.E. Hull, W.H. Enright, B.M. Fellen, and A.E. Sedgwick. Comparing numerical methods for ordinary differential equations. SIAM J. Numer. Anal., 9:603–637, 1972.
[18] David Kahaner, Cleve Moler, and Stephen Nash. Numerical methods and software. With disc. Prentice Hall, 1989.
[19] Fred T. Krogh. On testing a subroutine for the numerical integration of ordinary differential equations. J. Assoc. Comput. Mach., 20:545–562, 1973.
[20] F.T. Krogh. A test for instability in the numerical solution of ordinary differential equations. J. Assoc. Comput. Mach., 14:351–354, 1967.
[21] B. Pinçon. Une introduction à Scilab. IECN, 2000.
[22] Lawrence F. Shampine. Numerical solution of ordinary differential equations. Chapman and Hall, 1994.
[23] L.F. Shampine. What everyone solving differential equations numerically should know. In Computational techniques for ordinary differential equations, pages 1–17, 1980.
[24] L.F. Shampine and C.W. Gear. A user's view of solving stiff ordinary differential equations. SIAM Rev., 21:1–17, 1979.
[25] L.F. Shampine, I. Gladwell, and S. Thompson. Solving ODEs with MATLAB. Cambridge University Press, 2003.
[26] L.F. Shampine and M.K. Gordon. Computer solution of ordinary differential equations. The initial value problem. W.H. Freeman, 1975.
[27] L.F. Shampine and S. Thompson. Event location for ordinary differential equations. Comput. Math. Appl., 39(5-6):43–54, 2000.
[28] L.F. Shampine, H.A. Watts, and S.M. Davenport. Solving nonstiff ordinary differential equations - the state of the art. SIAM Rev., 18:376–411, 1976.
[29] Andrew H. Sherman and Alan C. Hindmarsh. GEARS: a package for the solution of sparse, stiff ordinary differential equations. In Initial value problems for ODE, pages 190–200, 1980.

4 A basic inequality Definition 2. since the syntax of the solvers in Scilab are --> ode(x0. t0 ) the unique solution of x = f (t.5 Continuity and initial conditions Definition 2. is written (t0 .4 : Let z1 (t) et z2 (t) two ε–approximate solutions of (1).2 : The notation used here x(t. to respect the order of Scilab The syntax in Matlab is 9 . x0 ). x) ˙ x(t0 ) = x0 Remark 2.V. where f is Lipschitz with constant L.f) Since this course is devoted to ODE and Scilab. x0 . when there is existence and uniqueness of solution. x0 . If the solutions are respectively ε1 et ε2 approximate.2. We reverse the order deliberately. t0 ) is somewhat different of the notation used in often the literature.8 : If f is locally Lipschitz we have existence and uniqueness of solutions. z(t)) ˙ ≤ ε Theorem 2. where usually the I. by x(t.t0. we choose. for convenience. we have the inequality eL|t−t0 | − 1 + (ε1 +ε2 ) L z1 (t) − z2 (t) ≤ z1 (t0 ) − z2 (t0 ) e L|t−t0 | (3) 2.t. We shall denote.7 A derivable function z(t) is said to be an ε–approximatesolution of the ODE (1) if for any t where z(t) is defined we have z(t) − f (t.

t0 ) ≤ e|t−t0 | x − x0 2. Proposition 2.2 : With the hypothesis of the Cauchy-Lipschitz theorem the solution x(t. Theorem 2. that ω(x) is lower semi-continuous . .5 : Let (t0 . [t0. λ) and Lipschitzian in x uniformly relatively to (t.x0) Proposition 2. x. λ) 2.3 With the preceding hypothesis the solution x(t. x0 ) a point of the open set Ω and a compact interval I on which the solution is defined. t0 ) is defined on I. The solution is locally Lipschitz with respect to the initial starting point : x(t. x. x0 . x. t0 ) − x(t. λ) x(t0 ) = x0 We consider the following problem where f is supposed continuous with respect to (t. for example. x.>> ode(@f. λ) is continuous with respect to (t. x0 . x0 t0 ) is defined on an maximal open interval t0 ∈ ]α(x) . t0. ω(x)[ .t]. t0 ) is continuous with respect to x0 . This implies that. λ).6 Continuous dependence of parameters x ˙ = f (t. Then there exists an open set U containing x0 such that for any x in U the solution x(t. x0 . 10 .7 Semi-continuity of the definition interval We have seen in the Lipschitzian case that x(t.

x0 . t0 . What happens when t → ω(t0 . x0 ) cannot stay in any compact set. containing the point (t0 . The most common form accepted by ODE solvers is the standard form.7 For any compact set K of Ω. ∂x The the solution x(t. x) ˙ is called the standard form for an ODE.9 Maximal solution We have seen that to any point (t0 . 3 3. ∂x0 ∂x ∂x0 2. x(t.then (t. x0 ) is differentiable in t and satisfies the linear differential equation ˙ ∂f ∂ ∂ x(t.2. t0 )) . x(t. t0 ) (t. x). x0 ) Particularly when ω(t0 . x0 . x) x(t0 ) = x0 Theorem 2. x0 . In order to solve numerically an ODE problem. ω(t0 . x0 ) < ∞ we see that x(t. t0 ) = x(t. x0 ) ? Theorem 2. x0 )) does not stays in K when t → ω(t0 . It is not only convenient for the theory.6 We consider the system We suppose that f is continuous on Ω and such that the partial derivative ∂f (t. t0 . Scilab solvers accepts ODEs in the more general form 11 . x0 )[. t0 . it is also important in practice.1 Getting started Standard form The equation x = f (t. x0 ) .8 Differentiability of solutions x ˙ = f (t. x0 ) is associated a maximal open set : ]α(t0 . x) is continuous with respect to (t. x0 ). you must write it in a form acceptable to your code.

for differential equations of order n is to introduce new variables :  ˙  x1 = x   x2 = x ¨ (4)  ···   xn−1 = x(n−1) Example 3.A(t. x) is a non singular mass matrix and also implicit forms g(t. Even simple differential equations can have complicated solutions.1 : We consider the classical differential equation of the mathematical pendulum :  ¨ = −sin(x)  x x(0) = a (5)  x(0) = b ˙ (6) This ODE is equivalent to  ˙  x1 = x2 x2 = −sin(x1 )  x1 (0) = a. by known analytical functions. cannot be expressed. Indeed the ODE encountered in practice have no analytical solutions. x. With either forms the ODEs must be formulated as a system of first-order equations.2 Why to solve numerically ODEs ? The ODEs. x)x = f (t. are the exception rather that the rule. The solution of the pendulum ODE is given by mean of Jacobi elliptic function SN and CN. Now the solution of the ODE of the pendulum with friction. Let consider some simple ODEs on Rn 12 . x) = 0 ˙ . x2 (0) = b 3. The classical way to do this. for which you can get an analytical expression for the solution. y) ˙ Where A(t.

). or simply the Bessel function. 3.2.2. x[t]. x[t].Given in a computer algebra system (Mathematica. this ODE has for solution DSolve[x’[t] == x[t]^2 + t. t] Which gives N D Where 2 3 t2 3 3 N = −C J− 1 3 + t 2 J− 2 3 2 3 t2 3 − C J− 4 3 2 3 t2 3 + C J2 3 2 3 t2 3 and D = 2t J 1 3 2 3 t2 3 + C J− 1 3 2 3 t2 3 Where the functions Jα (x) are the Bessel function. The Bessel functions Bessel Jn (t) and Bessel Yn (t) are linearly independent solutions to the differential equation t2 x + tx + (t2 − n2 )x = 0. t] Which gives N D Where t2 2 − 2t2 J− 3 4 N = −C J− 1 4 t2 2 13 − C J− 5 4 t2 2 + C J3 4 t2 2 .. Jn (t) is often called the Bessel function ¨ ˙ of the first kind.3..1 x = x2 + t ˙ The term in x2 prevents to use the method of variation of constants. Maple..2 x = x2 + t2 ˙ this ODE has for solution proposed DSolve[x’[t] == x[t]^2 + t^2. Mupad.

this error is set by the user or have default values. t0 ) evaluated at time t1 . with existence and uniqueness (7) A discrete variable method start with the given value x0 and produce an approximation x1 of the solution x(t1 . This is described as advancing the integration a step size h0 = t1 − t0 .3 x = x3 + t ˙ This time Mathematica has no answer In [ 3] := DSolve[x’[t] == x[t]^3 + t. This a called a continuous output. t] 4 Theory x = f (t. Generally the continuous approximation has the same error tolerance that the discrete method. x[t]. x[t]. and by polynomial interpolation a function is generated that approximate the solution everywhere on the interval of integration. x) ˙ x(t0 ) = x0 For a Cauchy problem . It is 14 . t] Out[3]: = DSolve[x’[t] == x[t]^3 + t. x0 . to satisfy the tolerance requirements. choose their steps to control the error.and D = 2t J 1 4 t2 2 + C J− 1 4 t2 2 3.2. Usually the contemporary good quality codes. The processus is then repeated producing approximations xj of the solution evaluated on a mesh t0 < t1 < · · · < ti < · · · < tf Where tf is the final step of integration. Moreover some codes are supplemented with methods for approximating the solution between mesh points. The mesh points are chosen by the solver.

tn ) − xn+1 + ≤ ··· x(tn+1 . t0 ) Only indirectly. 15 . using the definition (2. A big L would implies small step size to obtain some accuracy on the result. xn . A solver try to control the local error. by en = xn+1 − x(tn+1 . The local error measure the error produced by the step and ignore the possible error accumulated already in xn . xn ) to produce xn+1 . start from (tn . but with different I. t0 ) − x(tn+1 . 4. tn ) The method gives an approximation xn at time tn . The next step. controlled by the solver. at time (tn+1 ) of the mesh. tn ) Or the size of the error by len = xn+1 − x(tn+1 .1 : We define the local error.V. The Lipschitz constant on the integration interval play a critical role. This inequality tells us that is L | t − t0 | is small. tn ) The first term in the right hand side of the inequality is the local error.8). the second is given by the difference of two solutions evaluated at same time. the ODE has some kind of stability. The basic inequality (3 ) is of interest for this issue. xn . t0 ) − xn+1 · · · ≤ x(tn+1 . since the propagation of error can be upper bounded by x(tn+1 . so as to select an appropriate method for solving the problem with the required accuracy.1 Generalities Definition 4. xn . x0 . If the ODE is unstable the numerical problem can be ill-posed. The concept of stability of the solutions of an ODE is then essential : two solutions starting from nearby Initial values. x0 .important to know how a code produces answers at specific points. This controls the global error xn+1 − x(tn+1 . xn . x0 . must stay close.

0].1 : We consider the Cauchy-Problem  ˙  x1 = x2 x2 = 100 x1 ˙  x1 (0) = 1 . tf ] for any solution x(t. the alternative is to consider the error maximum for each step. -->t0=0. x2 (0) = −10 (8) The analytic solution is x(t) = e−10t . then x(tn ) − x(tf ) → 0 When maxi hi → 0. Since this definition is for any integration interval. Example 4. We give two example of unstable ODE problems. has been chosen for.exec("/Users/sallet/Documents/Scilab/ch4ex1. We shall give two examples of what can happen. x0 the first approximation ˜ of the method to the starting point x0 . Our definition consider the last step. The I. -->x0=[0. x) ˙ x(t0 ) = x0 If we denote by n the number of steps.We also speak of well conditioned problem. ˜ It should be equivalent to introduce maxi x(ti )−x(ti ) . and x0 → x0 . 16 . of the Cauchy problem x = f (t. This is done by ODEOPTIONS. tf = tn . we must augment the number of steps allowed. Some commands will be explained later on.e.V.2 (convergent method) : A discrete method is said to be convergent if on any interval [t0 . by various solver implemented in SCILAB. t0 ). We shall see this in the corresponding section -->. In particular. there are equivalent. for the default method in SCILAB. i.sci"). We now try to get numerical solutions. x0 . Before we give a definition Definition 4.

3.0.t0.0.3.x0. RKF.0. -->ode(x0.t0.2.ch4ex1) ans = ! .3. The reason is not in the 17 .0244166 ! -->ode(’rkf’. With LSODA.0004887 ! -->ode(’adams’.0004145 ! 0.0.0041445 ! -->ode(’rk’.0.0.0045630 ! ! .358E-14 All the results obtained are terrible.-1].x0.5. RK.%inf.0.12.ch4ex1) ans = ! .t0. RK.-->%ODEOPTIONS=[1.3000.3.x0.0.ADAMS.ch4ex1) ans = ! ! 0.0024417 ! 0.0000452 ! ! .0456302 ! -->exp(-30) ans = 9. With RKF and ADAMS the value is even negative ! ! All the codes are of the highest quality and all fails to give a sound result.ch4ex1) ans = ! ! 0.-1.t0.

Moreover. 18 . −10) is effectively e−10t . it is simple to check that the Lipschitz constant is L = 100 for the euclidean norm (make in Scilab norm(A)..2 : We consider the Cauchy-Problem   x1 = x2  ˙  x = tx  ˙2  1 2 3− 3  x1 (0) = Γ( 2 )  3  1   x2 (0) = − 3−13 Γ( ) 3 (9) This simple equation has for solution the special function Airy Ai(t) With the help of Mathematica : DSolve[{x’[t] == y[t].). {x[t]. The ODE is unstable. The system is a linear system x = Ax ˙ with A= 0 1 100 0 The eigenvalues of A are 1 and −10. t] which gives x(t) = C1 Ai(t) + C2 Bi(t) y(t) = C1 Ai (t) + C2 Bi (t) The function Ai et Bi are two independent functions solutions of the general equation x = tx. but any other solution goes toward infinity. This one of the two solutions going to the equilibrium (the pass of the saddle) . The Airy functions often appear as the solutions to ¨ boundary value problems in electromagnetic theory and quantum mechanics. the trajectories are going apart. The problem is ill conditioned. y’[t] == t x[t]}. The problem is doubly ill conditioned Example 4. but in the ODE. the solutions are linear combination of et and e−10t .. in last year lectures. Then when a solver gives an approximation which is not on the trajectory. y[t]}.code. The picture is a saddle node (or mountain pass). The solution starting at (1. As seen.

1]. Then the Lipschitz constant on [0. the Lipschitz constant is 10. with ˙ A(t) = 0 1 t 0 The norm associated to the euclidean norm is t (same reason as for the preceding example).’xdot=[x(2).x0.x. -->x0(1)=1/((3^(2/3))*gamma(2/3)).t*x(1)]’) --> t0=0. Now the situation is similar to the preceding example. This is quite reasonable. this example require references to special function. we obtain an upper bound of e100 .1 The Airy functions are related to modified Bessel functions with 3 orders. 1] is L = 1. In view of the upper bound for the Airy ODE on the error. The Bessel K and I are built–in . Simply. This equation is a linear nonautonomous equation x = A(t). We can anticipate numerical problems. while Bi(t) increases unboundedly. in Scilab (Type help bessel in the window command)). It is known that Ai(t) tends to zero for large positive t. the results should be correct. We have the formulas (see Abramowitz and Stegun ) Ai(t) = and Bi(t) = 3 t I− 1 2/3t 2 + I 1 3 3 3 1 π 3 t K 1 2/3t 2 3 3 2 3 t2 3 Where In (t) and Kn (t) are respectively the first and second kind modified Bessel functions. -->x0(2)=1/((3^(1/3))*gamma(1/3)). We check that the I. -->deff(’xdot=airy(t. So we can now look at the solutions : We first look at the solution for tf = 1.1.V are such that Ai(t) is solution of the Cauchy-Problem.airy) X1 = 19 . It is not so immediate as in the preceding example.t0. 10] . which gives an upper bound from the inequality (3) of e. On the interval [0. -->X1=ode(’rk’.x)’. integrating on [0.

airy) X1 = 1.airy) 20 .1352924 ! ! .! 0.airy) X3 = ! 0.1591226 ! --X3=ode(’adams’.2437119 ! --X3=ode(’adams’.10.1591163 ! -trueval=(1/%pi)*sqrt(1/3)*besselk(1/3.0.1352924 The results are satisfactory.2516987 ! X2=ode(x0. Now we shall look at the result for an integration interval [0.0E+08 * ! ! 2.1591474 ! -->X2=ode(x0.2/3) trueval = 0.t0.10.x0.1352983 ! ! . 10] -->X1=ode(’rk’.x0.6281018 ! 8. The best result is given by the ‘ ‘rk” method.10.6306454 ! 8.t0.x0.0.0E+08 * ! ! 2.airy) X1 = 1.1352974 ! ! .t0.airy) X2 = ! 0.1.0.t0.1.t0.

105E-10 This time the result is terrible ! The three solvers agree on 3 digits.628452 ! 8. This can be done by a Hermite interpolating polynomial of degree 3. t0 ) of the Cauchy Problem x ˙ = f (t. 21 .X3 = 1. but the real value is 1.105 10−10 . For example we can choose a polynomial function which goes through the points (ti . The failure was predictable. x0 . Well conditioned problem will give small perturbations.(2/3)*10^(3/2)) trueval = 1.0E+08 * ! ! 2.2 Backward error analysis A classical view for the error of a numerical method is called backward error analysis. x) + r(t) ˙ Backward error analysis consider the size of the perturbation.V. . a method for solving an ODE with I. xi ). The analysis is the same of for the preceding example. xi ). x) x(t0 ) = x0 We can choose a function z(t) which interpolates the numerical solution. z(t)) ˙ In other words the numerical solution z is solution of modified problem x = f (t. 4. around 2. We can now look at r(t) = z(t) − f (t.62108.2448118 ! -->trueval=(1/%pi)*sqrt(10/3)*besselk(1/3. an IVP problem approximate the solutions x(tn . with a derivative at this point f (ti . The function z is then with a continuous derivative.

xn + hn f (tn . Example 4. xn )] 2 This method is known as the Improved Euler method or also as Heun method. hn ). xn . If we choose 1 [f (tn . In the sequel we shall restrict.4. hn ) Where the function Φ depends on the mesh point xn at time tn and the next time step hn = tn+1 − tn .3 One–step methods One-steps methods are methods that use only information gathered at the current mesh point tj .3. Φ depends on (tn . to one-step methods with constant step size. xn ) We have the extremely famous Euler method. xn .3 (Order of a method) : A method is said of order p. hn ) = f (tn . hn ) = 4. If we choose Φ(tn . The result we shall give.3 (Improved Euler Method) . They are expressed under the form xn+1 = xn + hn Φ(tn . are also true for variable step size. but the proof are more complicated.4 (Euler Method) . Example 4.1 Analysis of one-step methods We shall analyze the error and propagation of error. xn . Φ(tn . xn . if there exists a constant C such that for any point of the approximation mesh we have len ≤ C hp+1 22 . Definition 4. for simplicity of the exposition. xn ) + f (tn + hn .

We consider two sequences xn and xn defined by ˜ xn+1 = xn + hΦ(tn . h) In practice. ˜ In view of this errors we shall establish a lemma Lemma 4. x0 . z(tn ). we have ˜ z(tn+1 ) = z(tn ) + hΦ(tn . x. using that Φ is Lipschitz of constant Λ : xn+1 − xn+1 ≤ (1 + hΛ) xn − xn + ε ˜ ˜ e(n+1)hΛ − 1 hΛ − t0 = (n + 1)h. For a one-step method len = x(tn+1 . x Now let consider any solution of the Cauchy problem. h) + en 23 . h) + εn ˜ ˜ ˜ We suppose that | εn |≤ ε Then we have the inequality xn+1 − xn+1 ≤ eΛ|tn+1 −t0 | ( x0 − x0 + (n + 1)ε) ˜ ˜ Proof We have. t0 ). uniformly with respect to h and t. xn . xn . in place of xn+1 . setting for shortness 1 − e−ΛT ΛT A simple induction gives xn+1 − xn+1 ≤ (1 + hΛ)n+1 x0 − x0 + ε ˜ ˜ Now using (1 + x) ≤ ex . in other words the method restart with xn+1 . and tn+1 T = tn+1 − t0. we have : xn+1 − xn+1 ≤ eΛT ˜ −x x0 − x0 + (n + 1)ε ˜ Remarking that 1−e ≤ 1. h) xn+1 = xn + hΦ(tn . gives the result. ˜ Set z(t) = x(t. xn . x0 . the computation of xn+1 is corrupted by roundoff errors. For example x(t. xn . We suppose that the function Φ(t.1 (propagation of errors) : We consider a one-step method. h) is Lipschitzian of constant Λ relatively to x. tn ) − xn − Φ(tn . t0 ).That means that the local error at any approximating point is O(hp+1 ). with constant step.

tn+1 − t0 = T . If we suppose that ˜ we know upper bounds for αn and βn . using the lemma. We see that for any solution of the Cauchy problem we can define xn = ˜ x(tn . We have also proved that any method of order p ≤ 1 is convergent. tf = tn+1 . we obtain x(tf . x0 . Without ˜ roundoff error. But this is the minimum we can ask for a method. for example M1 and M2 . If we add the rounding error to the theoretical error en .). since with a computer. using the lemma. that the final error x(tf . t0 ). h) + hαn + βn ˜ ˜ ˜ We have εn = hαn +βn +en . x0 . since this sequence satisfies the condition of the lemma. This is evidently theoritical.Where en is evidently en = z(tn+1 ) − z(tn ) − hΦ(tn . t0 ) − xn+1 ≤ eΛT (n + 1)CT hp The accumulation of local error. ˜ With this lemma we can give an upper bound for the global error. x0 . if we suppose that the rounding are upper bounded by ε (in fact. it is the relative error. h) + αn ) + βn ˜ ˜ ˜ xn+1 = xn + hΦ(tn . gives an error of order p at the end of the integration interval. taking in consideration the theoretical error. x0 . z(tn ). the sizes of steps are necessarily finite. but the principle is the same) we have xn+1 = xn + h (Φ(tn . h) By definition len = en . xn . Since we can set xn = x(tn . This justifies the name of order p methods. xn . we know that en = len ≤ C hp+1 Then we see. where αn is the roundoff error made in computing Φ and βn the roundoff error on the computation of xn+1 . t0 ) ( we choose directly the good I. a simple computation shows that the error E(h) has an expression of the kind E(h) = eΛT x0 − x0 + C T hp + T M1 + ˜ 24 M2 T h .V. but since by definition T = (n + 1)h. t0 ) − xn+1 is upper bounded by eΛT (n + 1)Chp+1 .

So we can write E(h) = K1 + eΛT T C hp + M2 h

This shows that the error has a minimum for a step size of M2 pC
1 p+1

Which gives an optimal number of steps for an order p method Nopt = T pC M2
1 p+1

(10)

In other words finite precision arithmetic ruins accuracy when too much steps are taken. Beyond a certain point the accuracy diminishes. We shall verify this fact experimentally. To be correct the error is relative i.e. in computer the result x of the compu¯ tation of a quantity x satisfies x−x ¯ ≤u x Where for IEEE arithmetic u = 1.2 10−16 Since we are integrating on compact intervals, the solutions involved x(t) are bounded and since Φ is continuous , Φ(t, x, h) is also bounded. This means that we must have an idea of the size of x(t) and the corresponding Φ. We shall meet this kind of problem in the sequel, when we examine absolute error tolerance and relative error tolerance 4.3.2 Conditions for a one-step method to be of order p

We shall give necessary and sufficient condition for a method to be of order ≤ p. Since the local error is len = xn+1 − x(tn+1 , xn , tn ) Calling If we suppose that Φ is of class C p , the Taylor formula gives 25 en = x(tn+1 , xn , tn ) − xn − h Φ(tn , xn , h)

p

Φ(t, x, h) =
i=0

hi ∂ i Φ (t, x, 0) + ◦(hp ) i! ∂hi

If we suppose f of class C , Once again by Taylor formula, denoting x(t) = x(t, xn , tn )
p+1

p

x(tn+1 , xn , tn ) − xn = x(tn+1 , xn , tn ) − x(tn , xn , tn ) hj dj x (tn , xn ) + ◦(hp+1 ) j! dtj

x(tn+1 , xn , tn ) − xn =

j=1

Now since x(t) is solution of the differential equation we have dx (tn , xn ) = f (tn , xn ) dt ∂f ∂f d2 x (t , xn ) = (tn , xn ) + (tn , xn ) f (tn , xn ) 2 n dt ∂t ∂x To continue, we denote by f [n] (t, x(t)) = Then ∂2f ∂2f ∂2f (t, x) f (t, x) + 2 (t, x) (f (t, x), f (t, x)) (t, x) + 2 ∂t2 ∂t ∂x ∂x The computation becomes rapidly involved, and special techniques, using graph theory, must be developped. But the computation can be made (at least theoretically). Then f [2] (t, x) =
p+1

dj x [f (t, x(t))] dtj

x(tn+1 , xn , tn ) − xn =

j=1

hj [j−1] f (tn , xn ) + ◦(hp+1 ) j!

Hence we deduce immediately (beware at the shift of indexes)
p+1

en =
k=1

hk ∂ k−1 Φ hk [k−1] f (tn , xn ) − (tn , xn , 0) k! (k − 1)! ∂hk−1 26

+ ◦(hp+1 )

We have proven Proposition 4.1 (order of a method) : We consider a method of class C p . A method is of order at least p + 1 (beware at the shift of indexes) iff Φ satisfies, for k = 1 : p + 1 ∂ k−1 Φ 1 [k−1] f (t, x) = (t, x, 0) k ∂hk−1 Corollary 4.1 (order 1 method) : A method is of order at least 1, iff f (t, x) = Φ(t, x, 0) The Euler method is then of order 1. Corollary 4.2 (order 2 method) : A method is of order at least 2, iff f (t, x) = Φ(t, x, 0) and ∂f 1 ∂f (t, x) + (t, x) f (t, x) 2 ∂t ∂x Check that Heun is an order 2, method. 4.3.3 Runge-Kutta method = ∂Φ (t, x, 0) ∂h

The Runge-Kutta methods are one-steps methods. The principle of RK methods is to collect information around the last approximation, to define the next step. Two RK methods are implemented in Scilab : “rk” and “rkf” . To describe a RK method we define Butcher Arrays. Definition 4.4 (Butcher Array) : A Butcher array of size (k + 1) × (k + 1) is an array

27

.c1 0 c2 a2.1 a3.k = xn + h i=1 k−1 s3 = f (tn + c3 h . We can now describe a k-stages Runge-Kutta formula Definition 4.2 s2 ) . . .2 ) xn. . which gives points xn.1 = xn s1 = f (tn . .2 · · · ak. xn.j A Butcher array is composed of (k − 1)2 + k datas. the next step (tn+1 . . xn. xn. xn.k−1 0 b1 b2 · · · bk−1 bk The ci coefficients satisfying i−1 ci = j=1 ai. .3 ) ak. . .2 0 ..1 s1 + a3.k ) The next step is then defined by k xn+1 = xn + h i=1 bi si 28 . .2 = xn + h a2. xn. If a (tn .i si sk = f (tn + ck h .. by finite induction ..1 ak. xn+1 ) is given by the following algorithm : First we define k intermediate stages. .5 (k-stages RK formula) : A k-stages RK formula is defined by a k + 1 Butcher array. xn ) approximation has be obtained..i and slopes si : xn.1 ) xn.1 0 c3 a3.1 s1 s2 = f (tn + c2 h . ck ak. .3 = xn + h (a3.

3)) : It is the first example of a Embedded RK formulas. We shall give a quite complete set of examples.9 (Bogacki-Shampine Pair. named also modified Euler formula. since this RK formulas are used by high quality codes.The intermediates points xn. and can be found in the routines. We shall come on this matter later. 29 .6 (Heun RK2) : The Heun method (or improved Euler) is a RK formula with associated Butcher array 0 1 1 1 2 1 2 Example 4. is a RK formula with associated Butcher array 0 1 2 1 2 0 1 Example 4. BS(2.8 (Classical RK4) : The classical RK4 formula with associated Butcher array is 0 0 1 2 1 0 0 1 1 6 2 6 2 6 1 6 1 2 1 2 1 2 Example 4. Example 4.5 (Euler RK1) : The Euler method is a RK formula with associated Butcher array 0 0 1 Example 4.7 (Heun RK2) : The Midpoint method.i are associated with times tn + ci h.

0 1 2 3 4 1 2 0 2 9 3 4 1 3 4 9 1 7 24 1 4 1 3 1 8 30 .

0 1 5 3 10 4 5 8 9 1 5 3 40 44 45 19372 6561 9017 3168 9 40 − 56 15 − 25360 2187 − 355 33 0 0 32 9 64448 6561 46732 5247 − 212 729 49 − 176 5103 − 18656 1 1 35 384 5179 57600 500 1113 7571 16695 125 192 393 640 − 2187 6784 92097 − 339200 11 84 187 2100 1 40 Example 4. 0 1 4 27 40 1 4 − 189 800 214 891 729 800 1 1 33 650 891 41 162 0 800 1053 1 − 78 31 .11 (an ode23 formula) : This example was the embedded RK formulas used in MATLAB till version 4.Example 4.10 (Dormand Prince Pair DOPRI5 ) : It is the second example of embedded RK formulas. for the solver ODE23.

32 .3. or more exactly replaced by an equivalent autonomous system.Example 4. Let the classical system x ˙ = f (t. x) x(t0 ) = x0 This system is clearly equivalent to the autonomous system   y=1  ˙  z ˙ = f (y. It is still an “interlocking” RK formula. Any ODE can be rendered autonomous. We shall justify this assumption. We have said that for a RK formula the coefficient ci are imposed. They are used in Scilab “RKF”. The trick is to built a “clock”. z) = x0  z(t0 )   y(t0) = t0 (11) (12) Actually some codes require the equations to be presented in autonomous form.12 (RKF Runge-Kutta Fehlberg 45) : This formula is the Runge-Kutta Fehlberg coefficients. 0 1 4 3 8 12 13 1 4 3 32 1932 2197 9017 3168 9 32 − 7200 2197 − 355 33 −8 7296 2917 46732 5247 49 − 176 5103 − 18656 1 1 4.4 439 216 3680 513 845 − 4104 −1 5 Order of RK formulas We shall derive conditions for which RK formulas are of order p. It is expected that a solver will have the same behavior with the two systems.

2 .1). Let ¯ before denote by Si the slopes for the second system in R × Rn .3 Then we must have c2 = a3. Since the first coordinate of the autonomous system is a clock. hence for the next ¯ step xn. the code should gives a sequence.1 + a3.2 s2 ) ¯ Wn.2 ¯ Wn. using the notations of RK formulas s1 = f (tn .1 xn.i the intermediate points obtained. For a RK method the function P hi is given by k Φ(t. h).We shall compare the steps taken by a RK formula for the two systems.2 ) ¯ We must have c2 = a2.1 s1 s2 = f (tn + c2 h. starting at time t0 .i evaluated at time tn + ci h.i we have.i = tn + ci h xn. h) = i=1 bi si Where it is clear that it is the si which depends on (t.3 = xn + h (a3.1 s1 + a3.2 ) s1 = ¯ 1 f (tn .2 = ¯ S2 = 1 s2 = f (tn + ha2. we obtain by induction that the ci are equal to the sum of the ai.1 .3 = tn + h (a3. To derive condition for RK method we use the corallary (4. x.2 ) xn. xn. 33 . xn ) xn.1 . since the sequence of the original system is given by the intermediate points xn. xn ) tn + ha2.1 + a3. x. necessarily of form ¯ Si = 1 ¯ si ¯ And by Wn. for the intermediates points corresponding to the autonomous system ¯ Wn.2 = xn + ha2. xn.j on the same row. If this is satisfied then s2 = s2 .

k f (t. x) = Φ(t. all the intermediates points are given by xn.k k=1 ∂sk ∂h . x. x. x) k=1 .i ) + ∂h ∂t ∂x Where the expression A is given by i−1 i−1 ai. x) i=1 bi = f (t. then ∂si ∂f ∂f (t. x) 2 ∂t ∂x We must evaluate ∂Φ (t. When h = 0 we have already seen that xn.k sk + hA k=1 ai. if it of order 1 and if 1 ∂f ∂f (t. 0). x. RK formulas of order 2 : From corollary (4.2). x)) + ∂h ∂t ∂x 34 i−1 ai.i = xn and si = f (tn . 0) = ci (t. h). x) + (t. x. xn. iff k bi = 1 i=1 . Hence k Φ(t. ∂h = ∂Φ (t. x) f (t. and si = f (tn .RK formulas of order 1 From corollary (4. 0) = f (t. When h = 0 from the formulas. x) A RK formula is of order at least 1. x.1) we must have f (t. 0) ∂h We have k bi i=1 ∂si ∂h From the formulas ∂si ∂f ∂f = ci (ti + ci h.1 = xn . xn ). xn ).

The coefficients of A for i > j are the ai. The other coefficient are 0. We shall give condition till order 5. x) 2 k bi ci i=1 We have proved that a RK is at least of order 2 if k k bi = 1 i=1 and i=1 bi ci = 1 2 The computation can be continued. x.Withe the hypothesis cation we get i−1 k=1 ai. in the same manner.k = ci .’c’)==C The test must output %T.j = ci : . RK formulas of order 7 and 8 exists. x)f (t. x) = ci f [1] (t. Particularly Scilab can be used for testing the order of a RK formula. Some formulas of the same order are more efficient. See Bucher or Haire and Wanner. We use freely the Scilab notation. x) = f [1] (t. from the Butcher array. We introduce. We introduce the length k column vector C and the length k vector B. x)) + (t. 0) = ci (t. So we express the relation as a Scilab test. but appropriate techniques must be used. Order 1 Sum(B)==1 Order 2 35 ai. of size k × k. Relations for Order of RK formulas : Each test add to the preceding.j of the array. Finding RK coefficients is also an art. x) ∂h ∂t ∂x Finally the condition reduces to 1 [1] f (t. Before. The condition obtained can be expressed as matrix operations. using the fact that ∂f ∂x is a linear appli- ∂si ∂f ∂f (t. it must be verified that the ci satisfy the condition sum(A. the matrix A.

Since the computation cost of a code is measured by the number of evaluation of f .^2==1/3 Order 4 B*C. The interest of such formulas for step size control will be explained in the next sections.3) give the second order formula for free.^2) .*(A*C))==1/40 B*A^3C== 1/120 Exercise Check.3) pair that if you only take the 4 × 4 Butcher array of the BS(2.^2==1/60 B*A*C. The complete array gives an order 2 formula.^3==1/4 B*(C.*((A^2)* C)==1/30 B*A*C.*(A*C))==1/20 (B*A) *(C.^ 2==1/12 B*A^ 2*C==1/24 B*A*C==1/6 If we prove this. the BS(2. 4.*(A*C.^4==1/5 B*(C. on the examples. This explain the line and the double line of the array.3. then from our preceding results.*^3==1/20 B*A^2*C.*(A*C))==1/8 Order 5 B*C. For the Embedded RK formulas. the RK formulas are convergent and at least of order 1. you will find out that.5 RK formulas are Lipschitz B*( (C.* (A*C))==1/10 B*( C.3) you have an order 3 RK formula.^2))==1/15 B*((A*C). the order of the RK formulas.B*C==1/2 Order 3 B*C. We must evaluate 36 . for example for the BS(2.

k Φ(t. x. x. h) − Φ(t. xn . Since the most expensive part of taking a step is the evaluation of the slopes. xn ). β = max|bi | . the difference estn give a computable estimate of the error. embedded RK formulas are particularly interesting since they give an estimation of the error for free. 37 . xn .3. y. h) = i=1 bi (si (x) − si (y)) We denote by α = max|ai. by L the Lipschitz constant of f. a simple induction shows that (for the intermediate points xi and yi ) we have xi − yi and si (x) − si (y) Hence (1 + hαL)k − 1 Φ(t. tn ) − xn+1 ∗ Suppose that we have a formula of order p + 1 to compute an estimate xn+1 . tn ) − xn+1 ] − x(tn+1 . xn . tn ) − x∗ n+1 n+1 Then estn = len + O(hp+2 ) = O(hp+1 ) = Chp+1 Where C is depending on (tn . h) − Φ(t. then we have estn = x∗ − xn+1 = [x(tn+1 . y. h) ≤ β hα This prove that the RK formula are Lipschitz methods. 4.6 x−y ≤ L(1 + hαL)i−1 x − y ≤ (1 + hαL)i−1 x − y Local error estimation and control of stepsize The local error of a method of order p is given par en = x(tn+1 . With the notation of RK formula.j | . Since len is O(hp+1 ).

or is a default value used by the code). The maximal step increase is usually chosen between 1.4 Experimentation The objective of this section is to experiment and verify some of the theoretical results obtained. two approximation are computed.9. 4. Generally how much the step size is decreased or increased is limited.8 or 0. This is only an estimation. can be used. For the need of experiments we shall code some of the methods described before. We use here estn for len . a fraction of the predicted step size. xn+1 ) is computed. This is for example the case of the BS(2. The next step is free (at least for evaluation of f ). the local error would have been leσ = C(σh)p+1 = σ p+1 est n . the step is rejected and the code try to reduce the stepsize. The pairs DOPRI5 and RKF are also a FSAL pair. If the step is accepted. This is the way that most popular codes select the step size. If the error is beyond a tolerance tol (given by the user. Moreover a failed step is expensive. for safety. This kind of formula are called FSAL (first same as last). If we have taken a step σh. If the step is accepted. This is for heuristic reasons. It is also used that after a step rejection to limit the increase to 1.Modern codes use this kind of error estimation. the codes use. In the case of embedded RK formulas. to prevent too big steps. The solver of Scilab 38 . To compute the error f (tn+1 . usually 0. We give here an advice to the reader : do not rewrite codes that have been well written before and have proved to be sure.3) a 3 order RK formula and an estimation say xn+1 . The power of Scilab is used. with which formula does the code will advance ? Advancing with the higher order result is called local extrapolation. the slope is already computed. (check it !). The 4th line gives of BS(2. Besides a maximum step allowed. if the code advance with xn+1 . But there is practical details.5 and 5.3) pair. then the step size σh passing the test satisfies σ< tol est tol est 1 p+1 1 p+1 The code will use for new step h .

k)).4. C) of size respectively k × k.1)).x. s(:. We can write a k-loop or simply write everything. When this done we suppress the first row of A ( only 0). We suppose that the Butcher array is given by the “ triplette ” (A. .x +h*A1(:. 1 × k and k × 1. With theses data you can check the order of the RK formula. // compute the new point 39 .1 coding the formulas We shall code three RK formulas : Euler. s(:. Since all the formulas given are RK formulas we shall indicates one general method. odefile). RK4 (classical). sl=feval ( t . preallocation of the // matrix of slopes A1=A’.are high quality codes.1)=sl(:). In pseudo-code. write the other stages till k-1 and the last sl=feval (t+h*C(k). don’t reinvent the wheel. x=x(i). 4. However it is important to know how codes are working. //technical reasons // computing x(i+1) // inside the slope lightening the notation. This was the reason of the preceding chapter. t=t(i).2)=sl(:).k)=sl(:). x). we obtain : Assuming that the RHS is coded in odefile with inputs (t.k) // k the number of stages... //making sure sl is a column vector sl=feval(t+h*C(2).. Heun. that k is the number of stages of the RK formulas. that xi has already been computed : n=length(x0) // dimension of the ODE s=zeros ( n. s(:. x+h*A(:.:)=[ ]. B. The computation of the slopes is straighforward. A(1. for the main loop (computing xi+1 ) .

fct) s1=s1(:) 40 . for example RK4. named rk4.X(i.sci : function [T. This the core of the code. Do it ! the routine can also be coded more naively. then x(i) should be the i-th row of N × n matrix. Since the utility of a solver is also to plot solution. here is a Scilab file.x= x+ h*s*B’.X]=rk4(fct.:). where N is the number of mesh points.tf. Once you have write the code.[.tf.t0.:)=x0’ //main loop for i=1:N-1 // compute the slopes s1=feval(T(i). x(i+1)=x.N]) T=T(:) X=zeros(N. How is stored x(i) column ? row ?.tf] is the integration interval // // N number of steps x0=x0(:) // make sure x0 column ! n=length(x0) h=(tf-t0)/(N-1) T=linspace(t0. this is not difficult to code all the RK formulas. care must be taken to details. the mesh points should be stored as a column vector.n) // preallocation X(1.N) //Integrates with RK4 classical method // fct user supplied function : the RHS of the ODE // // t0 initial time // // x0 initial condition // // tf final time [t0.x0.

:)=X(i..out2. x(0) = 1. this file has the syntax function [out1.in2. And finally write a script for testing the three methods 41 .x) //test fonction // xdot=x+t.fct) s4=s4(:) X(i+1..) Where output : out1.sci. We write the file fctest1.:)+h*(s1’+2*s2’+2*s3’+s4’)/6 end A WARNING When you create a Scilab file.sci function xdot=fctest1(t. . ..x1’.x3’.x2’. . . for example x = x + t with I. . If you are saving under another name Scilab get confused. .are the wanted quantities computed by the file. out2. and you are in trouble.2 Testing the methods We shall use a test function. ALWAYS SAVE THE FILE UNDER THE NAME OF THE GIVEN FUNCTION ! That is save under name_of_function.fct) s2=s2(:) x2=X(i.:)+h*(s1/2)’ s2=feval(T(i)+h/2.] =name_of_function (in1.. The ˙ t exact solution is x(t) = −1 − t − e .4.V. 4.fct) s3=s3(:) x3=X(i. in2.:)+h*(s2/2)’ s3=feval(T(i)+h/2.are the inputs required by the file.:)+h*s3’ s4=feval(T(i)+h. . and the in1. .x1=X(i.

3. // sol for Heun RK2 [s.1.1.Nstepvect(i)). n=length(Nstepvect). // sol for RK4 z=-1-s+2*exp(s). for i=1:n [s.Nstepvect(i)).getf("/Users/sallet/Documents/Scilab/euler.xrk4]=rk4(fctest1.3.3. // end // //plot xbasc() 42 . . // exact solution for the points s eulerror=max(abs(z-xeul)). // sol for euler [s.1).getf("/Users/sallet/Documents/Scilab/fctest1.sci").getf("/Users/sallet/Documents/Scilab/rk2.xrk2]=rk2(fctest1.//compare the different Runge Kutta methods // and their respective error // //initialization // . rk2_vect=zeros(n. .0.xeul]=euler(fctest1.Nstepvect(i)).sci").1.0. eul_vect=zeros(n.sci").getf("/Users/sallet/Documents/Scilab/rk4.sci").1). . // biggest error for Eul rk2error=max(abs(z-xrk2)). rk4_vect=zeros(n. rk2_vect(i)=rk2error. // biggest error for RK2 rk4error=max(abs(z-xrk4)).1). // biggest error for RK4 // // eul_vect(i)=eulerror. rk4_vect(i)=rk4error.0. Nstepvect=(100:100:1000).

2 and 4. that is to say the effects of finite precision arithmetic.V. We get the following picture 1 10 0 Euler A b s o l u t e E r r o r 10 1 10 2 10 3 10 4 RK2 10 5 10 6 10 7 10 8 10 9 RK4 10 10 10 11 10 10 2 3 Number of steps 10 Fig.3 Testing roundoff errors We want to experiment the effects of roundoff. and we test the method with the function x = x and the I.plot2d(’ll’. ˙ We look at x(1) = e. We choose for example the RK2 improved Euler method. x0 = 1. 4. as expected.4. 1 – Comparison of RK Formulas Clearly measuring the slopes show that the methods are respectively of order 1.rk2_vect.sci"). 43 .[eul_vect.getf("/Users/sallet/Documents/Scilab/comparRK.rk4_vect]) xgrid(2) Now we call this script from the Scilab command window : .Nstepvect.

x1.fct) // slope at Euler point s2=s2(:) 44 .x0. Scilab stores usual variables in a stack.tf.t0. minimum allocation memory // for experiment computing the solution only // at final time // // fct user supplied function : the RHS of the ODE // // t0 initial time // // x0 initial condition // // tf final time [t0. minimizing the number of variables. The size of this stack depends on the amount of free memory.N) // Integrates with RK2 method. k=1.fct) // compute the slope s1=s1(:) x1=x+h*s1 // Euler point s2=feval(t+h.tf] is the integration intervall // // N number of steps x0=x0(:) n=length(x0) h=(tf-t0)/N x=x0. we choose to minimize the number of variables used.Since we look only at the terminal point. Compare with the original code given RK2 (same as RK4) before. and since we shall have a great number of steps. improver Euler. Then we use for this experiment a modified code of RK2.x. //main loop while k< N s1=feval(t. function sol=rk2mini(fct. // with minimum code. t=t0.

491E-13 1.060E-13 0.logflag=’ll’. the absolute error versus the number of steps.x=x+h*(s1’+s2’)/2 t=t+h k=k+1 end sol=x Since we work in finite precision.3).315E-13 3.result1(:. in a loglog plot. The reader is invited to check this. 0. 10000000.1).logflag=’ll’.style=1) We obtain the figure 45 . 1000000. by the command -->plot2d(result1(:.126E-13! Where the first column is the number of steps. 30000000. We obtain the following results : -->result result = ! ! ! ! ! ! ! ! 10.style=-4) -->plot2d(result1(:.0011891! 0. 10000. 1000. we must be aware that (tf − t0 ) = N ∗ h. 100000.667E-09! 1. In fact N ∗ h is equal to the length of the integration interval at precision machine. for example.524E-11 2.165E-14! 4.531E-09 4.669E-07! 1.0000169! 1.0051428 0. 100. the second is the absolute error |e − sol| and the third column is the relative error | e−sol |.852E-14! 1.3).1).664E-11! 9.result1(:.0000459 4.536E-07 4. e If we plot.

2 – Effects of finite precision machine on accuracy The reader is invited to make his own experiments. xn+1 = xn (1 + h + 46 ♦ A b s o l u t e 10 4 10 ♦ 2 3 4 5 6 7 8 2 10 10 10 10 10 10 Number of steps . then on [0. we can estimate len ≤ 6 h3 . which gives M2 ≈ 10 . We draw the reader’s attention to the fact that the computation can be lengthy. since Scilab use double precision) the effects appears for an order 2 one-step method around 107 steps. This has been the case. we must to use a large number of steps. Then in the formula 6 21 (10) we can use C = e/6. For our test problem. This implies that. We use a double precision computer (as it is the case in this lecture. the RK2 method is simply h2 ) 2 The local error is equivalent to the first neglected factor of the Taylor deve3 e lopment h xn . In this experiment we have use the IEEE arithmetic precision of the computer. 1].10 3 5 10 6 ♦ 10 7 10 ♦ E r r o r 8 10 9 10 10 ♦ 10 11 10 12 ♦ ♦ ♦ 10 13 10 10 1 10 Fig. if we want to see the phenomena.

*round((10. A famous anecdote is about C. Note the line y=abs(x)+ (x==0). The rounding is toward //the nearest number with t decimal digits. we add a step. c=(10. There a little trick.P. Here is a Scilab function which chop any number to the nearest number with t significant decimal digits function c=chop10(x.*x). function sol=rk2hp(fct.^(t-e)). // In other words the aritmetic is with unit // roundoff 10^(-t) // y=abs(x)+ (x==0).P.t0. Note that we have included the function chop as a sub–function of RK2.N) // RK2 code with half precision (H. e=floor(log10(y)+1).t) // chop10 rounds the elements of a matrix // chop10(X) is the matrix obtained // in rounding the elements of X // to t significant decimal digits.8) // H. to take care of this. Which is intended to encompass the case x = 0.We can simulate simple precision or a even a less stringent precision. This time. in this code.B. to obtain a value at tf .tf. At the end of the main loop. x=x0. here. which was accustomed to describe the precision as “half–precision” and “full precision” rather than single and double precision. with chop10. and creator and founder ofMATLAB.^(e-t)). since N ∗ h = (tf − t0 ). we must take the effect of the arithmetic into account. Scilab permits this. Now we can rewrite the RK2 code.x0.) x0=x0(:) n=length(x0) h=(tf-t0)/N h=chop10(h. Moler. a famous numerician. 47 .

s2=feval(t+h.t=t0.fct) s1=s1(:) s1=chop10(s1. s1=feval(t.x.^(t-e)).P.P. //main loop while k< N +1 k=k+1 s1=feval(t.*x).x. end sol=x //////////////////////////// function c=chop10(x.fct). t=t+h end if t<tf then h=tf-t.8) // Euler point s2=feval(t+h.fct) // slope at Euler point s2=s2(:) s2=chop10(s2. 48 .t) y=abs(x)+ (x==0). x=x+h*(s1+s2)/2.*round((10. e=floor(log10(y)+1). x1=x+h*s1. x=x+h*(s1+s2)/2 x=chop10(x. c=(10.fct).P. s2=s2(:).8) // H. s1=s1(:).8) //H. x1= chop10(x+h*s1.^(e-t)).x1. // H.8). k=1.x1. t=t+h.

1.3)=abs(%e-sol)/%e. results(i. 1000. 900.0.0000039 9.285E-07 5. end We obtain ! ! ! ! ! ! ! ! ! ! ! ! ! ! 100.N(i)). results(:. 0.0000456 0.getf("/Users/sallet/Documents/Scilab/ode1.3).726E-08 6.0000104 0.Now we write a script for getting the results : // script for getting the effects of roundoff // in single precision //note : the function chop10 // has been included in rk2hp .0000038 0.0000024 1.0000015 0. 200. 1300.0000014 3.1.285E-07 0. 800.934E-07 4. results(i. // for i=1:30.0000050 0.0000018 5.0000035 0. 500.416E-07 1.0000010 9.0000013 0.683E-07 3.getf("/Users/sallet/Documents/Scilab/rk2hp. 1100.0000018 3.sci").208E-07 3. 400.527E-07 1.1)=N’.2)=abs(%e-sol).944E-07 0. 700. .285E-07 0. 300. 600.sci").285E-07 0.356E-07 49 ! ! ! ! ! ! ! ! ! ! ! ! ! ! . 1400. // results=zeros(30. 1200. N=100:100:3000. sol=rk2hp(ode1.121E-07 0.0000168 0.623E-07 8.

0000030 8.0000030 0. 2400.0000013 0.0000011 ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! This result gives the following plot for the absolute error (the relative error is given by the third column) 50 .0000011 4. 1700. 2200. 1900. 2800.0000023 0.0000021 0.0000011 0. 0. 2300.0000044 0. 2700.987E-07 0.715E-07 0. 2600.206E-07 4.462E-07 4.566E-07 7.0000058 0.380E-07 4.0000010 0.0000017 7.991E-07 3.0000020 0. 3000. 1600. 2100.151E-07 1.452E-07 0. 2000.0000016 8.0000012 0. 2900.0000028 0. 2500.835E-07 0.0000016 0.0000046 0.151E-07 0.678E-07 5.0000020 0.! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! 1500.0000011 4. 1800.

in the evaluation of the method.4 10 ♦ 5 10 ♦ ♦ ♦ ♦♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦♦ ♦ ♦ ♦ ♦ 4 ♦ 6 10 7 10 10 2 3 10 Number of steps Fig. We observe that for a number of step under 1000 the slope of the error is 2 as predicted by the theory. A number of theoretical question are still open. xi ). it is considered that the highest cost in computing is the evaluation of f .5 Methods with memory Methods with memory exploit the fact that when the step (tn . An advantage of methods with memory is their high order of accuracy with just few evaluation of f . Methods with memory distinguishes between explicit and implicit methods. 3 – Effects of half precision on accuracy. at mesh points. xi and f (ti . Practice shows that this codes are efficient. Many popular codes use variable order methods with memory. xn ) is reached. 4. We recall that. But understanding 51 ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ E r r o r ♦ ♦ ♦ A b s o l u t e 10 . we have at our disposition previously computed values. Implicit methods are used for a class of ODE named stiff. but rapidly the situation deteriorates with roundoff errors.

the method is implicit. 4. This rule is given by k k αi xn+1−i = h i=0 i=0 βi f (tn+1−i .5. This means that xn+1 must be computed from the formula (13). In modern codes the step is varying and the order of the method also. which gives xn+1 If β0 = 0 the method is explicit. However considering constant step size is still useful. In theory changing the step size is ignored. Another problem is the problem of stability of these methods.the variation of order is an open question.6 (LMM methods) : A LMM method of k-steps defines a rule to compute xn+1 . when k preceding steps (xn .1 Linear MultistepsMethods LMM When the step size is constant the Adams and the BDF methods are included in a large class of formulas called Linear Multipstep Methods. General purpose solvers tend to work with constant step size. for example. In this section we just give an idea of how works methods with memory. Varying the step size is important for the control of the local error. · · · . but in practice it cannot. this is related to the step size. But starting a method with memory is a delicate thing. It is considered that changing step step size is equivalent to restarting . xn−1 . These methods are defined by Definition 4. at least during a while. xn−k+1 ) are known . problems arising from semi-discretization in PDE are solved with constant step size. 52 . Shampine 2002). Methods with memory are more complex to analyze mathematically. To be simple we shall consider (as for one-step methods ) the case of constant step size. This is for pedagogical reasons. Otherwise. since xn+1 appears on the two sides of the equation . Usually in the early codes the starting procedure use a one-step method with an appropriate order. Special problems. xn+1−i ) (13) It is always supposed that α0 = 0. Zhang 1990. Only recently theoretical justification have been presented (Shampine. To quote Shampine : our theoretical understanding of modern Adams and BDF codes that vary their order leaves much to be desired.

The relation (13 ) can be rewritten When β0 = 0 k xn+1 = − And when β0 = 0 i=1 αi xn+1−i + h α0 k i=1 βi f (tn+1−i . The mathematical principle under Adams formulas is based on a general paradigm in numerical analysis : When you don’t know a function approximate it with an interpolating polynomial. To lighten the notation we use f (ti . x(s)) ds 53 . the Adams-Moulton methods. The Adams-Basforth explicit methods. xn+1−i ) α0 xn+1 4. and the Backward Differentiation Methods (BDF).2 β0 = h f (tn+1 . x) x(tn ) = xn For passing from xn to xn+1 this is equivalent to the equation xn+1 xn+1 = xn + xn f (s. We know approximations of f at (ti . unlike RK methods. xn+1 ) − α0 k i=1 αi xn+1−i + h α0 k i=1 βi f (tn+1−i . xi ). and use the polynomial in place of the function. xn+1−i ) (14) This class of method is quite simple and requires only one evaluation of f at each step. xn+1−i ) α0 Adams-Basforth methods (AB) The family of AB methods are described by the numerical scheme ABk k xn+1 = xn + h i=1 ∗ βk.Three class of methods can be defined from LMMs. The later methods are implicit . for k preceding steps.i f (tn+1−i .5. xi ) = fi The classical Cauchy problem x ˙ = f (t.

gives the formula (14). with x(1) = 1. If we consider the polynomial function 1.i are the results of this computation. For constant step size.2 fn−1 ) The result ∗ ∗ (1 + 2h)2 = (1 + h)2 + h β2.1 = 4 with 1 + β2. This polynomial can be expressed by k Pk (t) = i=1 Li (t) fn+1−i Where Li the classical fundamental Lagrangian interpolating polynomal.There exists a unique k − 1 degree polynomial Pk (t) approximating f at the k-points (ti . by identification of the coefficients (this ∗ ∗ formula must be satisfied for any h ). Considering AB2. and compute the two first lines. with the 2 steps at 1 and 1 + h. xi ).1 1 ∗ Evidently β1. we obtain 1 + 2β2.2 = − 2 2 The reader is invited to establish the following tableau of Adams-Basfortth coefficients ∗ β2.1 fn + β2.1 = 54 . Replacing f by Pk under the integral. and the formula must be exact. it is not necessary to make the explicit computation. remembering that h = xn+1 − xn = ∗ xi+1 − xi .1 + ∗ β2. with x = 2t .2 = 2 which gives 3 1 ∗ and β2. gives ∗ 1 + h = 1 + hβ1. the approximation of degree 0 is exact.1 2(1 + h) + β2. with AB1 formula. we ˙ get for the evaluation at 1 + 2h . Then consider x = 1. with the formula ∗ ∗ xn+1 = xn + h(β2. ˙ the solution is x(t) = t. We can construct a table of Adams-Basforth coefficients.1 = 1 and AB1 is the forward Euler formula.2 2 This is a polynomial equation in h. x(1) = 1. The coefficients βk. We give the idea to construct the array . then integrate from 1 to 1 + h.

i f (tn+1−i .5 ∗ βk.∗ k βk. · · · . fn . including (tn+1 .3 Adams-Moulton methods The family of AM methods are described by the numerical scheme AMk k−1 xn+1 = xn + h i=0 βk.2 (order of AB) : The method ABk is of order k.3 ∗ βk. as a consequence of the approximation of the integral.4 ∗ βk. xn+1 ). that Proposition 4. The k interpolating values are then (fn+1 . Techniques has been devised for doing this task efficiently. It can be proved. fn+2−k ) 55 . xn+1−i ) (15) The AMk formula use k − 1 preceding steps (because one step is at xn+1 . 4. the hn appears explicitly in the formula and everything is more complicated. then one step less that the corresponding ABk method The principle is the same as for AdamsBasforth but in this case the interpolation polynomial use k points. In particular it is necessary to compute the coefficient at each step.6 1 2 3 4 5 6 1 3 2 23 12 55 24 1901 720 4277 1440 −1 2 − 16 12 − 59 24 − 2774 720 − 7923 1440 5 12 37 24 2616 720 9982 1440 9 − 24 − 1274 720 − 7298 1440 251 720 2877 1440 475 − 1440 When the mesh spacing is not constant.5.1 ∗ βk.2 ∗ βk.

The coefficient of the AM family can be constructed in the same manner as for the AB family. xn+1 ) 2 It can be proved. It can be shown that the AM formula is considerably more accurate that the AB formula of the same order.Accepting the value (implicit) fn+1 . it is a consequence of the approximation of the integral.3 βk.0 βk.3 (order of AM) : The method AMk is of order k.2 βk. gives an analogous formula.5 1 − 12 5 − 24 1 24 106 720 482 1440 19 − 720 173 − 1440 27 1440 − 264 720 798 − 1440 The method AM1. in the context of PDEs it is also called Crank-Nicholson method h xn+1 = xn + [f (tn . with fn+1 entering in the computation. exactly the same reasoning as for AB. that Proposition 4. and we get a tableau of AM coefficients : k βk. xn+1 ) The method AM2 is known as the trapezoidal rule (explain why !).1 1 2 3 4 5 6 1 1 2 5 12 9 24 251 720 475 1440 1 2 8 12 19 24 646 720 1427 1440 βk. xn ) + f (tn+1 . is known as Forward Euler formula xn+1 = xn + hf (tn+1 .4 βk. for moderately large order k. This is an implicit method. 56 .

Pk (tn+1 ) = f (tn+1 . xn+1 ) This relation is an implicit formula for xn+1 . tn . we collocate the ODE at tn+1 i. Now we interpolate the solution itself. We use also a polynomial interpolation but for a different task.e.4 BDF methods The principle behind BDF is the same. tn+1−k we obtain a unique polynomial Pk (t) of degree k. xn+1−k at mesh points tn+1 . with constant step for BDF is Li (t) = k αi xn+1−i = hf (tn+1 . xn+1 ) i=0 (16) The BDF formulas are then LMMs methods. xn+1 ) i=0 If we recall that Li (t) = − xn+1−j ) j=i (xn+1−i − xn+1−j ) j=i (t It is then clear that for each index i. · · · . we can write 1 Mi (t) h Hence the final formula. Using the k + 1 values xn+1 .5. A way of approximate a derivative is to differentiate an interpolating polynomial. It is easy to see that Proposition 4. 57 .4 (order of BDF) : The method BDFk is of order k.4. Using this. · · · . Pk (tn+1 ) ) = f (tn+1 . If we write k Pk (t) = i=0 Li (t) xn+1−i The preceding relation becomes k Li (tn+1 ) xn+1−i = f (tn+1 . including the searched point xn+1 at the mesh point tn+1 . xn .

the BDFs are based on the interpolation of solution values themselves.4 αk.0 αk. There however a difference.2 αk.For constant step an array of BDF coefficient can be computed with the same principle as for the building of the coefficient of Adams methods. Looking at the mesh (1 + h. 1.5 1 2 3 4 5 6 1 3 2 11 6 25 12 137 60 147 60 −1 −2 −3 −4 −5 −6 1 2 3 2 −1 3 −4 3 − 10 3 − 20 3 58 1 4 5 4 15 4 3 5 15 12 −1 5 −6 5 1 6 .1 αk. the Adams formulas are based on interpolation of the derivative of the solution.3 αk. 1 − h. 1 − 2h) give the relation α0 (1 + h)3 + α1 + α2 (1 − h)3 + α3 (12 h)3 = 3h(1 + h)2 This relation gives   α0 − α2 − 8α3 = 3   α0 + α2 + 4α3 = 2  α0 − α2 − 2α3 = 1   α0 + α1 + α2 + α3 = 0 11 6 α1 = −3 α2 = 3 2 α3 = − 1 3 From which we obtain α0 = The array for BDF is k αk. let look at at example. Considering the ODE x = 3t2 with x(1) = 1. should ˙ gives the coefficients of BDF3.

The name Backward difference comes from the fact that expressed in Backward Differences the interpolating polynomial has a simple form. The Backward differences of a function are defined inductively. which implies | h |< On contrary the forward Euler method AM1=BDF1 gives xn+1 = 1 xn 1 − αh 59 2 |α| . the polynomial is k Pk (t) = xn+1 + i=1 (t − tn+1 ) · · · (t − tn+2−i ) hi i! i xn+1 Then the BDF formula takes the simple form k i=1 1 i i xn+1 = hf (tn+1 . A first answer is in stability. and then add some computation overhead ? The answer is multiple. The simple method RK1=AB1 gives rise to the sequence xn+1 = (1 + αh)n+1 This sequence converge iff | 1 + αh |< 1 .5. n+1 f f (t) = f (t) − f (t − h) n+1 f (t) = ( n f (t)) If we set xn+1 = x(tn+1 ) for interpolation. xn+1 ) 4. since an implicit method requires to solve an equation.5 Implicit methods and PECE The reader at this point can legitimately ask the question : why using implicit methods. We shall compare two methods on a simple stable equation : x = αx ˙ x(0) = 1 With α < 0.

with large Lipschitz constants. the implicit equation has a unique solution if | h | C1 L < 1 or equivalently 1 C1 L The unique solution is the limit of the iterated sequence zn+1 = Φ(zn ). Then by a classical fixed point theorem. We shall study in more details this concept in the next section. the forward Euler becomes interesting despite the expense of solving an implicit equation. AM are much more accurate than the corresponding AB method. A second answer is in comparing AB and AM methods. when implicit. where L is the Lipschitz constant of f the RHS of the ODE. If stability reduces sufficiently the step size for an explicit method. 2 Stability restricts the step size of Backward Euler to | h |< |α| . an implicit method can be more cheap in computation time and efficiency. If we accept complex values for α the stability region is all the left half of the complex plane. The AM methods permits bigger step size and in fact compensate for the cost of solving an implicit equation. When α takes great values. The convergence rate is of order of h |h|< 60 . The popular implicit methods used in modern codes. can be written β0 = h f (tn+1 . xn+1−i ) α0 Or in a shorter form xn+1 = hC1 F (xn+1 ) + hC2 + C3 The function Φ(x) = hC1 F (x) + f C2 + C3 is clearly Lipschitz with constant | h | C1 L. Check that this is also true for the AM2 method. a highly stable implicit method is a good choice. All the methods considered here are all LMMs and. The step size is generally reduced to get the accuracy desired or to keep the computation stable. We shall takes a brief look on the implicit equation. forcing to take a great number of steps. are much more stable than the corresponding explicit methods. In stiff problems.This sequence is always convergent for any h > 0. xn+1 ) − α0 k xn+1 i=1 αi xn+1−i + h α0 k i=1 βi f (tn+1−i .

Namely the predictor corrector methods. .e. The evaluation of f is called an evaluation.1 . Then for a k order predictor is wise to choose a k + 1 corrector method. or that three iterations are enough . It is simple to look at the local error for a PECE method. with this corrected value fn+1 can be evaluated . The reader is invited to do so. For example it can be required that the estimated | h | C1 L is less that 0. Another methods are predictor corrector methods or P. Check that the influence of the predictor method is less than the corrector method. . A PECE is an explicit method.C.E. f (tn+1 . methods. pxn+1 ) (17) xn+1 = implicit(pfn+1 )  Correction   Evaluaton fn+1 = f (tn+1 . xn )   Evaluation pfn+1 = f (tn+1 . p). this estimate is called a prediction. Then we can predict pfn+1 i.5. Check the stability of this method and verify that this method has a finite stability region. xn+1 ) We rediscover in another disguise the Heun method RK2.E. xn+1 ) Let apply this scheme to an example with AB1 and AM2 methods (look at the corresponding arrays)   Prediction AB1 pxn+1 = xn + h f (tn .6 Stability region of a method : The classic differential equation 61 . . pxn+1 ) (18) 1  Correction AM2 xn+1 = xn + 2 [fn + pfn+1 ]   Evaluaton fn+1 = f (tn+1 .. Since xn+1 is not expected to be the exact solution there is no point in computing it more accurately than necessary.If h is small enough a few iterations will suffice. 4. This brings us to another type of methods. pxn+1 ). We get an estimation of the solution of the implicit equation.   Prediction pxn+1 = explicit f ormula   Evaluation pfn+1 = f (tn+1 . . Prove that for a order p∗ predictor method and an order p corrector method . give a new value for xn+1 said corrected value. As we have seen AB1-AM2 is RK2 method. A substitution of fn+1 by pfn+1 in an implicit formula . The idea is to obtain explicitly a first approximation (prediction) pxn+1 of xn+1 . the PECE associated method is of order min(p∗ + 1.

Is called Dahlquist test equation. using the definition (13) of a LMM formula we obtain. when applied to the Dahlquist equation (19). with coefficient αi is known as the first characteristic polynomial of the method. we look at this expression for a LMM method. the recurrence formula : k  ˙  x = λx x(0) = 1  (λ) < 0 (19) i=0 (αi − βi hλ)xn+1−i = 0 We set µ = hλ It is well known that to this linear recurrent sequence is associated a polynomial (α0 − β0 µ)ζ k+1 + · · · + (αi − βi µ)ζ k−i+1 + · · · + (αk − βk µ) = 0 Since this formula is linear in µ the right hand side can be written R(µ) = ρ(ζ) − µ σ(ζ) The polynomial ρ.7 (Stability Domain) : The set S S = {µ ∈ C | roots(R(µ) ⊂ U} is called the stability region of the LMM method. with coefficient βi is known as the second characteristic polynomial of the method. If S ⊂ C− the method is said to be A-stable. since all the method encountered in this notes can be put in LMM form. Definition 4. 62 . From recurrence formulas. To be more precise. it is well known that a recurrence formula is convergent if the simple root of the polynomial R(µ) are contained in the closed complex disk U of radius 1. When applied to this test equation a discrete method gives a sequence defined by induction. and the multiple roots are in the open disk. The polynomial σ.

-->plot2d(real(w5)./s5)’./z^2-(9). -->w5=(r.200./z. S ./z 63 A WARNING .characterization of the stability domain We have µ= ρ(ζ) σ(ζ) The boundary of S is then the image of the unit circle by the function ρ(ζ) H(ζ) = σ(ζ) ./z. The stability region.1) -->s2=(23-(16).k βk.imag(w5)./s4)’. -->w2=(r./s2)’. -->plot(real(w4).2) -->s4=(55-(59). must lie on the left of this boundary. With Scilab it is straightforward to plot the stability region and obtain the classical pictures of the book (HW./z./z. -->(251).4) Here is Pay attention at the code. -->plot(ream(w3). -->w3=(r. -->plot2d(real(w2). whenever it is not empty./s2)’.imag(w2).^2-(1274). -->w4=(r.^2)/12./z+(37).. using the vectorization properties µ = H(ζ) = -->z=exp(%i*%pi*linspace(0. B) For example for Adams-Basforth formulas ζ −1 ζ k+1 − ζ k = ∗ H(ζ) = ∗ k ∗ ∗ βk.imag(w3). -->r=z-1..100)./z+(2616).k ζ −k For BDF formulas ρ(ζ) ζ n+1 In Scilab.3) -->s5=(1901-(2774).^4)/720.imag(w4)./z+(5).^3)/24.1 ζ + · · · + βk.1 + · · · + βk. We write (16)./z. -->-->s2=(3-(1)./z)/2.^3+.

And not 16..9 1.1 0. and Scilab interpret like ones(z) . using space 1 ./ z . 7 0./. 5 1.7 1. 5 1. The second is interpreted like the real 1. 1 0. We could also have written.5 0. 7 1. 3 0.3 Fig. 1 1. We are looking for the element wise operation . 3 0. In Scilab x = A /B is the solution of x ∗ B = A.9 AB2 0. We obtain the following pictures Adams Basforth Stability Region 1. The two codes gives very different results. follow by a slash and the matrix z. 4 – Adams-Basforth region of stability The reader is invited to obtain the following figures for Adams-Moulton and BDF formulas 64 ./z The first code is clear.1 AB5 AB3 AB4 0. this is the reason of the parenthesis./ z However we prefer to use parenthesis as a warning for the reader.3 0. Then Scilab compute the solution of the matrix equation x ∗ z = 1.5 0.

on the left of the boundaries It can be shown that for RK methods of order p the polynomial R(ζ) is 65 .4 3 AM2 2 AM3 1 0 AM4 1 2 3 4 6 5 4 3 2 1 0 1 Fig. 5 – Adams-Moulton region of stability 12 8 4 0 2 3 4 5 4 8 12 3 1 5 9 13 17 21 Fig. 6 – BDF region of stability.

than explicit ones To implement an implicit method. We set F (x) = x − (hC1 F (x) + hC2 + C3 ) ∂F (k) (x ∂x 66 −1 The sequence is defined by induction. Another method is Newton’s method. If the norm of the Jacobian of f is big.5.7 ζk ζ +···+ 2 k! Implicit methods. This imply that the process described by the ODE contains components operating on different time scale. must be chosen very small for assuring stability. Roughly a stiff problem is the combination of stability. 4. in particular BDF. stiff equations. For a stiff problem the constant appearing in the upper bound hL . with big eigenvalues (negative real part) of the Jacobian .R(ζ) = 1 + ζ + . since L is big. the step size giving the desired accuracy. We have seen that for solving ODE numerically the equation must be stable. this must imply for convergence a small step size h. which define a sequence converging to the solution. usually tremendously better. implying a big norm. implementation In this lectures we have often spoken of stiff equations. perform better. and x(k+1) = x(k) − F (x(k) ) . with L the Lipschitz constant . Stiffness is a complicated idea that is difficult to define. A working definition for stiffness could be Equations where certains implicit methods. x(0) is the initial guess. We have proposed a first solution by iteration procedure. But with the Jacobian can be associated the Lipschitz constant. we have to solve some implicit equation in x x = hC1 F (x) + hC2 + C3 Where F (x) = f (tn+1 . This for example implies that the Jacobian of the RHS must be a stable matrix. x). related with the norm of the Jacobian. hence a big Lipschitz constant.

the matrix of partial derivatives of the RHS is required. t0 ) and the values where you want the solution to be evaluated.function). x). x) ∂F (x) = I − hC1 ∂x ∂x Each Newton iteration requires solving a linear system with a matrix related to the Jacobian of the system. the initial conditions (x0 .x) 67 . or some other variant. if given in a Scilab file. the solver must compute numerically the Jacobian with all the accompanying disadvantages. all you must do is to tell to Scilab what Cauchy problem is to be solved. The 4th input ‘ ‘function” is a function which compute the RHS of the ODE. For any step size. The third input is a vector T . x). even if the ODE is autonomous. A warning : the function must always have in its input the expression t . That is. Newton’s method converge if the initial guess is good enough. If the Jacobian is not provided.t0. The user is encouraged to provide directly the Jacobian. That is. you must provide a function that evaluate f (t.T. The components of T are the times at which you want the solution computed. 5. the code of the right hand side must begin as function xdot =RHS(t. Elementary use The syntax is x=ode(x0.1 Using Scilab Solvers. To implement Newton’s method.x) // the following lines are the lines of code that compute // f(t. f (t. 5 SCILAB solvers In the simplest use of Scilab to solve ODE.With ∂f (tn+1 .

/RHS. -->x=ode(1.100).RHS).1.The time t must appears in the first line.x’) you obtain 68 .’xdot=t^2*x’) -->RHS RHS = [xdot]=RHS(t. directly from the window command : -->deff(’xdot=RHS(t.x)’.T’.x) You can check that the function RHS exists.. This file must be saved as RHS.sci ”) Or as an alternative you can code as an inline function. you must get a mesh of points on the integration interval -->T=linspace(0. If you are interested in plotting the solution. in typing RHS. and Scilab answer give the input and the output.sci and called from the window command line by a getf ( ”.0. -->xbasc() -->plot2d(T’.

This problem has a known solution in term of Jacobi special function.40 1.12 1. it can be entered directly from the command line. But the solvers of Scilab are intended to be used with large and complicated systems.51 x1 x2 ˙ With I.0 Fig.28 1.16 1. x3 (0)] = [0.3 0. We give some preliminaries.20 1.6 0. It is studied in the book of Shampine and Gordon.36 1.08 1.8 0.00 0 0. In this case the equation is provided as Scilab function.04 1. [x1 (0). x2 (0).5 0.1. 1.9 1. We shall give another example Example 5.7 0. 1] This example illustrates the solution of a standard test problem proposed by Krogh for solvers intended for nonstiff problems.24 1.V.4 0. 7 – Plot of a solution When the function is simple.2 0. We define the so-called incomplete elliptic integral of the first kind (Abramowitz and Stegun) : 69 .1 0.1 (Euler rigid body equations) : We consider the following system  x1 = x2 x3 ˙  x2 = −x3 x1 ˙  x3 = −0.32 1.

it admits an inverse function. m). u = F (Φ. m) = −m sn(u. m) = 0.m) 70 .V. cn(0. which is known as the complete elliptic integral of the first kind 2 4 K(m) = 4 F ( π . m) dt With sn(0. m). m) dn(u. m) = cos(Φ) = cos(am(u. m) = cn(u. m) x2 (t) = cn(t. m) is built-in and called by %sn(u. m) dn(u. m))  dn(u. m) = 1 − sn2 (u. m) Φ = am(u. m) = −sn(u. m) = 0 dθ 1 − m sin2 θ sin(Φ) = 0 dt (1 − t2 )(1 − mt2 Since F is clearly a monotone increasing function of Φ. m) With m = 0.Φ F (Φ. given. m)) cn(u. m)     d cn(u. m) Definition 5.1 (Jacobi ellliptic functions) : The Jacobi elliptic functions are given respectively by (see Abramowitz and Stegun)   sn(u. m) It is now a simple exercise to check that  d  dt sn(u. called the amplitude. with the I. m)  dt    d  dn(u. m)  x3 (t) = dn(t. m) = sin(Φ) = sin(am(u. m) cn(u. m) = dn(0. 2 In Scilab the function sn(u. We readily see that the solutions are periodic with period 4 F ( π . We immediately see that the solution of the Euler equations. m) = 1. are   x1 (t) = sn(t.51.

sci"). --> x0=[0. We obtain the figure 71 . and N columns the number of time points asked by the user.12. --> m=0. First we get the figure of Gordon-Shampine book : Coding the rigid body equations function xdot=rigidEuler(t. m) = %asn(sin(Φ)).51.1]’.rigidEuler).1200). The solver gives the solution in n rows. -->plot2d(T’. T=linspace(0. t0=0.1.-x(3)*x(1). n the dimension of the system.sol’) Note the transpose for plotting. and the incomplete elliptic integral is given by F (Φ.m) Then the amplitude is given by am(u. The inverse of the Jacobi sinus is %asn(u. -->sol=ode(x0. m). The complete integral is given by %k(m) With this preliminaries we can make some experiments.51*x(1)*x(2)] Computing and plotting the solution : -->..x) xdot=[x(2)*x(3).K=% k(m).-0. m) = asin(%sn(u.getf("/Users/sallet/Documents/Scilab/rigidEuler.T.t0. The function plot use column vectors.

2 0.m).. 8 – Solution of the movement of a rigid body We can now appreciate the quality of the solution. We have sn(u + v) = sn(u) cn(v) dn(v) + sn(v) cn(u) dn(u) 1 − m sn(u)2 sn(v)2 We obtain from this relation (analogous to sin(x + π/2)) cn(x) = sn(x + K) dn(x) Using the vectorized properties of Scilab (look at the operations . m). m)) As we have seen we have access in Scilab to sn.Euler Equations of Rigid Body :solutions 1. We have to use an addition formula to obtain cn from sn. m). and then to dn(t. 2 0.) -->theorsol=[\%sn(T.0 x3 0.∗ and . The theoretical solution is (sn(t..6 x1 0. cn(t. . m) = 1 − m sn(t. 0 Fig. 72 .4 1. dn(t. 6 x2 0 2 4 6 8 10 12 1. m)2 .

m).. If we only look after a time interval of 1s.^2).m)-sol(1. LSODE implements Adams-Moulton formulas for non stiff problems and BDF’s formulas.101))/%sn(1.0015027 Which is quite correct. 5.m)). and with root-finding. LSODE is a successor of the code GEAR (Hindsmarch). They are variations of Hindsmarch’s LSODE.$)-sol(:. the result are better : -->(%sn(1.$))/norm(theorsol) ans = 0. Don’t forget that we integrates on [0. 73 . Lsodar is the livermore solver for ordinary differential equations. When the user says nothing.^2)]. 12].2 More on Scilab solvers The syntax ode is an interface to various solvers. which is itself a revised version of the seminal code DIFSUB of Gear (1971) based on BDF’s formulas. in particular to ODEPACK.m) ans = 0. -->sqrt(1-m*%sn(T.0244019 We obtain an absolute error of 2% for all the interval.0011839 And at the end we have also --> norm(theorsol(:. -->norm(abs(theorsol-sol))/norm(theorsol) ans = 0.. The library ODEPACK can be found in Netlib. We shall see how to improve these results from the accuracy point of view.m).*sqrt(1-m*(% sn(T. This version has been modified by scilab group on Feb 97 following Dr Hindmarsh direction. the solver selects between stiff and non stiff methods. When the user does no precise the solver used.. with automatic method switching for stiff and nonstiff problems. ODE call LSODA solver of the ODEPACK package.-->% sn(T+K.

The maximum order for the formulas is. If the user provide a n-vector rtol of the dimension of the system. At each step the solver produce an approximation x at mesh points t and an estimation of the error ¯ est is evaluated. When x(i) is big enough. the absolute error test can lead to impossible re¯ quirements : The inequality | x(i) − x(i) |≤ est(i) ≤ atol(i) ¯ gives.atol.T. The maximum order is 5 The method ’RKF’ uses the program of Shampine and Watts (1976) based on the RKF embedded formula of Fehlberg of example (4. This inequality defines a mixed error control.1 Syntax x=ode(’method’. ’RKF’ 4. by default. ’adams’ 2. The method ’Stiff’ uses a BDF method.x0. The method ’Adams’ use Adams-Moulton formulas. if rtol = 0 i corresponds to a pure absolute error control.rtol.t0. and another n-vector atol.2.5. atol : Tolerance for the error can be prescribed by the user. if x(i) is big enough | x(i) − x(i) | ¯ atol(i) ≤ ≤ eps | x(i) | | x(i) | 74 . ’Stiff’ 3.12) An adaptative Runge-Kutta formula of order 4 Tolerance : rtol. function.Jacobian) Method : The user can choose between 1. ’RK’. If atol = 0 then it corresponds to a pure relative error control. 12. then the solver will try to satisfy est(i) ≤ rtol(i) | x(i) | + atol(i) ¯ For any index i = 1 : n.

with a pure absolute control any number smaller than atol(i) will pass the test of error control. for example section(5. We shall give examples later. For a Scilab file function : function J=jacobian (t. It is now easy to understand why providing Jacobian is interesting for the computation overhead and also for the accuracy. n).x) The same rule must be respected for an “inline ” function. or they can later grow and then must be taken in account. If rtol and atol are not given as vectors. At the beginning of the 75 . Scilab interprets rtol as rtol = rtol ∗ ones(1.5. In a similar way.7). and the syntax must include t and the state x in the beginning of the code for the Jacobian. the solver compute the Jacobian numerically. You must give Jacobian as a Scilab function. but as scalars. and similarly for atol. it is better to give the Jacobian of the RHS of the ODE. 5. A vector must be created to modify these options If created this vector is of length 12. absolute error control can lead also to troubles. When the problems are stiff. This is exactly the same rule as for the RHS of the ODE.3 Options for ODE Some options are defined by default. The scale of the problem must be reflected in the tolerance. Before illustrating with examples the preceding notions we describe how to change options for the ODE solvers in Scilab.e. When the problem is stiff you are encouraged to provide the Jacobian. Default values for rtol and atol are respectively atol = 10−7 and rtol = 10−5 Jacobian : We have seen why the Jacobian is useful when the problems are stiff in section (4. for example.4. The norm of x(i) can be small but theses values can have influence on the global solution of the ODE. i.3).This tolerance ask for a relative error smaller than the precision machine. If this is not feasible the solver will compute internally by finite differences. if | x(i) |≤ atol(i) . if x(i) becomes small. which is impossible.

and return 5.1 itask Default value 1 The values of “itask” are 1 to 5 – itask =1 : normal computation at specified times – itask=2 : computation at mesh points (given in first row of output of ode) – itask=3 : one step at one internal mesh point and return – itask= 4 : normal computation without overshooting tcrit – itask=5 : one step. The ODE function checks if this variable exists and in this case it uses it.3. If you write % odeoptions .2 tcrit Default value 0 tcrit assumes itask equals 4 or 5. described above.ixpr. 5.mu] A Warning : the syntax is exactly as written.3. To create it you must execute the command line displayed by odeoptions.h0. The meaning of the elements is described below.ml.12..2.%inf. For using default values you have to clear this variable.jactyp. particularly you must write with capital letters.5. If this variable does not exist ODE uses default values.0.3.hmin. for event location and root finding.maxords. --> mxstep.0. ODE will use the default values ! The default values are [1. 5.500.-1]..0.session this vector does not exist.tcrit..-1. without passing tcrit. Scilab makes the distinction. --> % ODEOPTIONS=[itask.3 h0 Default value 0 h0 first step tried It Can be useful to modify h0.hmax.0. 76 .maxordn.

5.3.4

hmax

Default value ∞ ( % inf in Scilab) hmax max step size It Can be useful to modify hmax, for event location and root finding. 5.3.5 hmin

Default value 0 hmin min step size It Can be useful to modify hmin, for event location and root finding. 5.3.6 jactype

Default value 2 – jactype= 0 : functional iterations, no jacobian used (”adams” or ”stiff” only) – jactype= 1 : user-supplied full jacobian – jactype= 2 : internally generated full jacobian – jactype= 3 : internally generated diagonal jacobian (”adams” or ”stiff” only) – jactype= 4 : user-supplied banded jacobian (see ml and mu below) – jactype = 5 : internally generated banded jacobian (see ml and mu below) 5.3.7 mxstep

Default value 500 mxstep maximum number of steps allowed. 5.3.8 maxordn

maxordn maximum non-stiff order allowed, at most 12 5.3.9 maxords

maxords maximum stiff order allowed, at most 5

77

5.3.10

ixpr

Default value 0 ixpr print level, 0 or 1 5.3.11 ml,mu

ml,mu .If jactype equals 4 or 5, ml and mu are the lower and upper halfbandwidths of the banded jacobian : the band is the lower band of the i,j’s with i-ml ¡= j and the upper band with j -mu ¡=i. (mu as for upper, ml as for lower . . .)
mu+1

n-mu-ml

n r o w s

ml+1 ml+mu+1

mu

m ml diagonal lines n

n-mu

n-ml

n columns

Fig. 9 – A banded Jacobian If jactype equals 4 the jacobian function must return a matrix J which is ml+mu+1 x ny (where ny=dim of x in xdot=f(t,x)) , such that column 1 of J is made of mu zeros followed by df1/dx1, df2/dx1, df3/dx1, ... (1+ml

78

possibly non-zero entries) , column 2 is made of mu-1 zeros followed by df1/dx2, df2/dx2, etc This is summarized by the following sketch :

mu mu+1

nl+1

mu+1

n-ml

n

Fig. 10 – The matrix given to Scilab fo banded Jacobian To be more precise and justify the preceding sketch, let a banded (mu,ml) Jacobian , with entries J(i, j) J(i, j) = ∂fi ∂xj

This express that the function fi depends only of the variables (xi−ml , xi−ml+1 , · · · , xi+mu ) Or differently that the variable xj occurs only in the functions (fj−ml , fj−ml+1 , · · · , fj+mu ) When the Jacobian is not banded if Jactype=1, you store the Jacobian in a n × n matrix J as defined before.

79

. fj−ml+1.1 Two body problem We use the problem D5 of Hull  x  x = − (x2 +y2 )3/2  ¨        y = − 2 y2 3/2  ¨ (x +y )   x(0) = 1 − e          y(0) = 0  y= ˙ (20) x=0 ˙ 1+e 1−e The solution is an ellipse of eccentricity e. The solution is √ x(t) = cos(u) − e y(t) = 1 − e2 sin(u) Where u is the solution of the Kepler’s equation u − e sin(u) = t.9 and find the solution for tf = 20. for tf = 20. mu ≤ n − 1. We shall see an example of this option. We must solve the Kepler’s equation.4. to ml and mu . i.4).e.4. We shall integrates this equation for e = 0. We shall use some of these test problems. with 0 ≤ ml. in the example Brusselator in section (5. Krogh.When jactype=2. You can check that this give the preceding picture. fj+ml ] 5. and you have given values in %ODEOP T IONS. [fj−ml . fj+ml−1 . 5. j). Enright et al [1975] Hairer &Wanner . · · · .4 Experimentation There exists test problems in the litterature. the number of line is then between 1 and ml + mu + 1. Scilab is waiting to store the Jacobian in a ml + mu + 1 × n matrix. In the column j of this matrix is loaded ∂f This equivalent to say that ∂xi is loaded in J(i−j + mu + 1. solve u − tf − e sin(u) = 0 80 . Since we have j −mu ≤ i − j ≤ ml . For example Hull [1972] et al.

lsoda-.t... Here is some computations -->t0=0.sqrt(19)].0.1256768952473E+02 Scilab tells you that the number of steps is too small.’v=u-tf-e*sin(u)’) -->uf=fsolve(1.kepler) uf = 20. We shall then modify this.0.. -->sqrt(1-e^2)*sin(uf).1.tf=20.x0=[0. -->solstand=ode(x0.It is clear that this equation has always a unique solution...8267099361762185 --> solex=[=cos(uf)-e.e=0. -->deff(’v=kepler(u)’..67753909247075539 ! ! 0.9.. we shall find this solution uf with Scilab.0.0.t0. using f solve -->tf=20.D5)..at t (=r1). mxstep (=i1) steps needed before reaching tout where i1 is : 500 where r1 is : 0. -->sqrt(1-e^2)*cos(uf)/(1-e*cos(uf))] solex = ! .1. -->-sin(uf)/(1-e*cos(uf)).29526625098757586 ! ! . To have access to % ODEOPTIONS.40039389637923184 ! ! . we type 81 ..12708381542786892 ! This result corresponds to the solution in Shampine.

atol.0.0. -->norm(solex1-solad14)/norm(solex1) ans = 0.-->%ODEOPTIONS=[1.40040490187681727 ! ! .t0.atol=1d-14.67761734310378363 ! ! 0.tf.0.0.0. -->solstand=ode(x0.D5).-1].00000000085528126 // comparing the solutions -->solex solex = ! .atol=1d-4.D5).1.12.12383663388400930 -->rtol=1d-12.67753909247075539 ! ! 0.tf. -->solad14=ode(’adams’.5.2.rtol.12708381542786892 ! -->solad2 82 .-1.0.29526625098757586 ! ! .tf.0.x0.20000.12706223352197193 ! -->rtol=1d-4.t0.t0.1. -->norm(solex1-solad2)/norm(solex1) ans = 0.0.% inf. --solad2=ode(’adams’.40039389637923184 ! ! .rtol.29517035385191548 ! ! .x0.D5) solstand = ! .atol.

a tolerance of 10−14 gives 7 correct digits.40039389630222816 ! ! .420064880055532 ! ! .0.0.67753909321197181 ! ! 0.0.solad2 = ! .1.12708381528611454 ! A tolerance of 10−4 gives a result with one correct digit ! the default tolerance 4 significant digits.0.29526624993082473 ! ! .04 k2 = 104 k3 = 3. This problem illustrates the dangers of too crude tolerances.2 Roberston Problem : stiffness The following system is a very popular example in numerical studies of stiff problem.107  With the initial value  1 x0 =  0  0 83 . 5. It describes the concentration of three products in a chemical reaction :  ˙  x1 = −k1 x1 + k2 x2 x3 x2 = k1 x1 − k2 x2 x3 − k3 x2 ˙ (21) 2  x3 = k3 x2 ˙ 2 The coefficients are k1 = 0.4.1.56322053593253107 ! ! 0.33076856194984439 ! ! .17162136959584925 ! -->solad14 solad14 = ! .

and a file JacRober the Jacobian of the RHS. −k1 ) At the equilibrium the eigenvalues are (0. The equations are redondant.sci. 1).04 We can now look at the numerical solutions Before we define a file Robertsoneqn. This means that all the trajectories goes toward (0.It is straightforward to see that the eigenvalues of J are 0 and two others negatives values −8 k2 x2 (k1 + k3 x2 ) + (k1 + k2 (2 x2 + x3 ))2 2 The Jacobian of the system is   −k1 k2 x3 k2 x2 J =  k1 −k2 x3 − 2k3 x2 −k2 x2  0 2k3x2 0 −k1 − 2 k2 x2 − k2 x3 ± Check that for this system. 0. the eigenvalues are (0. with the corresponding initial value. i. ˙ ˙ ˙ We have one equilibrium x1 = x2 = 0 and x3 = 1. x2 ) = x1 + x2 . Hint = x1 + x2 + x3 = 0. since concentration cannot be negative. 84 . 0.e. x1 +x2 +x3 = 1 . the greatest invariant set contained in V = 0. −(k1 + k2 )) We remark that the system let invariant the positive orthant.V. Hence the equilibrium x1 = x2 = 0 is a globally asymptotically stable equilibrium (in the positive orthant). To study the system it is sufficient to study the two first equation. 0. It is clear that ˙ V = x1 + x2 = −k3 x2 ≤ 0 ˙ ˙ 2 ˙ By LaSalle principle. Near this equilibrium the system is stiff. x2 = 0 is reduced to x1 = 0 (look at the second equation). At the I. On this orthant we have a Lyapunov function V (x1 . This is appropriate. since −(k1 + k3 ) = −104 − 0.

. 1d4*x(2). --> sol1=ode(x0.Robertsoneqn).getf("/Users/sallet/Documents/Scilab/JacRober.0].7 0.8 0.sci").04.-1d4*x(3)-3..0.6 0. naively : .04 .4 0. -->xbasc() -->plot2d(T’. -->. 11 – Robertson chemical reaction It seems that a curve has disappeared. solve and plot.. in a first time.(sol1)’) We obtain 1..x) xdot=[-. 1d4*x(3).1 0 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 10 11 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 Fig.-1..11. -->T=logspace(-6.0] Then we compute.t0. 0.9 0. To see what happens we multiply x2 (t) by 104 85 .6D7*x(2). --> t0=0.T.10000).function xdot=JacRober(t.d7*x(2). More attention shows that the second curve is on the x–axis.sci").5 0.0 0.getf("/Users/sallet/Documents/Scilab/Robertsoneqn.2 0.3 0.d4*x(2).x0=[1.

38 0.74 0. If we have defined verb %ODEOPTIONS” we must only change jacstyle. The option rect precise the frame by a length 4 column vector [xmin.14 0.1.We have specified in the plot some options. with a rectangle defined (y=1).50 0.10 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 10 11 X1 X3 X2 x 10 ^4 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 Fig.02 0. rect=[1d-6.logflag=’ln’) To obtain the classical figure. two body.1d4.strf=’011’. See on the first example.10 0. y axis on the left (z=1). ymin. for %ODEOPTIONS. xmax.26 0.86 0. We have 86 .-. -->xbasc() -->plot2d(T’.1d11.1. strf = xyz means no captions (x=0).98 0.1])*sol1)’.(diag([1. 1.1].62 0. 12 – Robertson chemical reaction Since the component x2 becomes very small we can augment the requirement on atol for x2 and also use the Jacobian. axes are drawn. ymax] finally logf lag = ‘ln says that the x-axis is logarithmic and the y-axis is normal.

$) ans = ! ! ! 0.00000002076404250 ! 0.rtol. with the necessity of some exploratory computations for adjusting the tolerances. -->solJac(:.97813884204576329 ! -->sol1(:.00000000000008137 ! 0.$) ans = ! ! ! 0.4.00000001989864378 ! 0.JacRober).1d-6.99773865098969228 ! 5.3 A problem in Ozone Kinetics model We borrow this example from the book Kahaner-Moler-Nash. One reaction mechanism is 1 O + O2 −→ O3 2 O + O3 −→ 2 O2 k k O2 −→ 2 O O3 −→ O + O2 k4 (t) k3 (t) 87 .Robertsoneqn.-->rtol=1d-4.atol.1d-4]. atol=[1d-4.00000000000008323 ! 0.T. The system corresponds to the modeling of the amount of Ozone in the atmosphere. This example will illustrate the need of some mathematical analysis.t0. -->%ODEOPTIONS(6)=1 -->solJac=ode(x0.

the positive orthant is invariant by the system. These functions are C ∞ .601 43200 The time is measured in seconds (43200s = 12h = 1/2 day) The rates k3 and k4 govern the production of free singlet oxygen. We shall plot the reaction rates k3 and k4 .e. they rise rapidly at dawn. 88  x2 = k1 x1 O2 − k2 x1 x2 ˙ .62 c4 = 7. namely any solution starting in the positive orthant cannot leaves this orthant. The I. reach a peak at noon. The system is difficult for the solvers.63 10−16 k2 = 4. We have to code these functions and the ODE.The concentration of O2 . The rate of kinetics constants are known k1 = 1. These constants are zero at night. molecular oxygen. decrease to zero at sunset. They are rapid change in the solutions at daylight and at night the ODE reduces to  ˙  x1 = −k1 x1 O2 − k2 x1 x2 Moreover the reaction rates are special. x2 the concentration of ozone. is many orders of magnitude larger than the other concentrations and therefore is assumed to be constant O2 = 3. i.V are ω= x1 (0) = 106 cm−3 x2 = 1012 cm−3 With The two other reaction rates vary twice a day and are modeled by  −c / sin(ω t) if sin(ω t) > 0  e i ki (t) =  0 if sin(ω t) ≤ 0 A mathematical analysis shows that the system is “well-posed”.5 1016 cm−3 The equations are  ˙  x1 = −k1 x1 O2 − k2 x1 x2 + 2k3 (t) O2 + k4 (t) x2 (22)  x2 = k1 x1 O2 − k2 x1 x2 − k4 (t)x2 ˙ The different variables represent x1 the concentration of free oxygen.66 10−16 π s−1 c3 = 22.

function xdot=ozone1(t,x) k1=1.63d-16; k2=4.66d-16; O2=3.7d16; xdot=[-k1*x(1)*O2-k2*x(1)*x(2)+2*k3(t)*O2+k4(t)*x(2);.. k1*x(1)*O2-k2*x(1)*x(2)-k4(t)*x(2)]; ///////////// function w=k3(t) c3=22.62; omega=%pi/43200; w= exp(-c3./(sin(omega*t)+... ((sin(omega*t)==0)))).*...(sin(omega*t)~=0) ////// function J=jacozone1(t,x) k1=1.63d-16; k2=4.66d-16; O2=3.7d16; J=[-k1*O2-k2*x(2),-k2*x(1)+k4(t);k1*O2-k2*x(2),-k2*x(1)-k4(t)] //////////// function v=k4(t) c4=7.601; omega=%pi/43200; v= exp(-c4./(sin(omega*t)+((sin(omega*t)==0)))).*... (sin(omega*t)~=0) Some remarks, here, are necessary. First we have coded in ozone1.sci some sub-functions of the main program. This is important when several functions must be supplied. We have as sub-function, the Jacobian, jacozone1 and the depending of time functions k3 (t) and k4 (t). These function are used by the main program “ozone”. We draw the reader’s attention to the code of theses two functions. A natural, but clumsy , way to code “case functions” as k3 is to use a if condition. There is a principle in vectorial language as Scilab to vectorize the code. Moreover using loops is time consuming. A motto is Time is too short to short to spend writing loop 89

A rule- of-thumb is that the execution time of a MATLAB (SCILAB) function is proportional to the number of statements executed. No matter what those statements actually do (C. Moler) We use test functions. sin(omega*t)==0 The value is %T if the test succeed , %F otherwise. These are boolean variables True or False. These variables can be converted in numerics where %T is replaced by 1 and % F replaced by 0. But if boolean variables are used in some arithmetic operation the conversion is automatic. Example : -->(1-1==0)*2 ans = 2. -->(1-1==0)+2 ans = 3. -->(1-1==0)/2 !--error 4 undefined variable : %b_r_s

-->2/(1-1==0) !--error 4 undefined variable : %s_r_b -->2/bool2s(1-1==0) ans = 2.

90

The operations ∗ and + convert automatically, / need to use “bool2s”. If you are not being sure, use bool2s. We have in the code, the value of the function which is multiplied by .*(sin(omega*t)~=0) Note the “dot” operation to vectorize. This imply that our function is 0 when sin(ω t) ≤ 0. We add a term, in the denominator of the exponential : +( (sin(omega*t)==0) ) To prevent to divide by zero, which shall give a error message an interrupt the program. In this manner we obtain the result. Moreover the function is vectorized, as we shall see : -->;getf("/Users/sallet/Documents/Scilab/ozone1.sci"); -->T=linspace(2,43200,1000); -->plot2d(T’,[1d5*k3(T)’,k4(T)’]) We have multiplied k3 by 105 since k3 varies from 0 to 15. 10−11 and k4 varies from 0 to 5. 10−4. Otherwise the graph of k3 would have been squashed on the x-axis.
5e-4

k4(t)
4e-4

3e-4

2e-4

1e-4

^5

10 x k3(t)
0 0 1e4 2e4 3e4 4e4 5e4

Fig. 13 – reaction rates 91

999. ! We have obtain the indexes for which the function are zero. ! column 12 to 19 ! 992. 998. ! column 20 ! 1000. 997. 991. 2. 1 to 11 4. 3. 999. 2. 10.72372 174. 6. 8.96496 ! -->k4([T(4). 7. 4.T(5)]) ans = ! 0.It seems that these functions are zero for some values of t in the interval.568-260 ! 92 . 997. with the help of the function find -->I2=find(k4(T)==0) I2 = ! 1. 3. we deduce the time -->[T(4). ! -->I1=find(k3(T)==0) I1 = column ! 1. 9. 995. 5. 993. 998.T(5)] ans = ! 131. 1000. 994. We check it . 3. 996.

The unit roundoff is %eps : -->ieee(2) -->log(0) ans = -Inf -->1/0 ans = Inf -->0/0 ans = Nan -->1. Or in other words the overflow threshold is 1. The range is 10±308 . Scilab is conforming to the IEEE standard.669-312 To understand what happens here. we must go back to the computer arithmetic. To be under the complete IEEE arithmetic you must type ieee(2) This means that invalid operations give NaN (Not a Number).-->k3(T(11)) ans = 9. There is no symmetry. The reader is advised to cautious.941d − 324. when working around this boundaries.797e308 and the underflow around 4.7977d308 ans = 93 . An Overflow give %inf.

-->min(Ind).500.941-324 In other words under 10−324 . -->4. for the computer everything is zero.798+308 -->5d-324 ans = 4. With exponential function some care is necessary . 94 .79765d308 ans = 1. We make the following experience -->T=linspace(1.941-324 -->4d-324 ans = 4.941-324 -->5d-325 ans = 0.8d-324 ans = 4. -->Ind=find(k3(T)).5000).Inf -->1. as we have seen with the function ki .

95 . -Ind=max(find(k3(T))).941-324 -->T=418.455 . -->plot2d(’nl’.k3(T)’) T=linspace(42781.5:.54871 -->k3(T(min(Ind))) ans = 4.42789. We are playing with the limits of underflow machine ! Notice the shattering effects in the next figure. -->T(Ind) ans = 42782.T’.01:420.-->T(min(Ind)) ans = 417.100).

418] and [42782.1 time in s Fig. k2 and modified the Jacobian : function xdot=ozone2(t.9 420.3 419. 43200] and this periodically on period 43200.k1*x(1)*O2-k2*x(1)*x(2)]. 43200] and this periodically on period 43200.5 418. 14 – Effects of IEEE arithmetic We know. ///////////// function J=jacozone2(t.-321 10 -322 k3(t) 10 -323 10 -324 10 418. xdot=[-k1*x(1)*O2-k2*x(1)*x(2).x) 96 .7 418.5 419.x) k1=1.63d-16. We shall first integrate in [0.sci . Simply we have erased k1 . O2=3. In the same manner we find out that k4 is zero on the intervals [0. k2=4. that for the computer .7 419. the functions k3 is zero on the intervals [0. 141] with a simplified function ozone2.7d16.66d-16.9 419.1 419. 141] and [43059.

T1. --> .-1. second by second.0. 15 – concentration of free oxygen 97 . k2=4.66d-16.63d-16. O2=3.141. We choose to see the evolution.0.getf("/Users/sallet/Documents/Scilab/ozone2. -->T1=linspace(1. (look at the choice of T) -->%ODEOPTIONS=[1.-k2*x(1)] We can now integrate.0. x0=[1d6.141).k1=1.20000.sol1(1. We obtain the two plots 2700 2300 1900 1500 1100 700 300 O 0 20 40 60 80 100 120 140 160 -100 Fig.5.sci").-1]. -->sol1=ode(x0.-k2*x(1). -->plot2d(T1’.%inf.ozone2).0.7d16.k1*O2-k2*x(2).2.1d12]. J=[-k1*O2-k2*x(2).12.:)’) Since the two components are on different scale we plot two separate figures.t0. -->t0=0.

-->T(7) ans = 7.7) 98 .0000000001981 -->sol1(1.6) ans = 0.100000113e4 100000109e4 100000105e4 O3 100000101e4 100000097e4 100000093e4 100000089e4 0 20 40 60 80 100 120 140 160 Time in s Fig. -->min(ind) ans = 7.:)<0). -->sol1(1. 16 – concentration of Ozone We see on the figure that the concentration is negative ! We check this -->ind=find(sol1(1.

ans = .5 -->sol1(2.0. 14 significant digits remain constant : -->format(’v’. 99 .1).16) -->sol1(2. There exists a rule-of-thumb.0000000000283 We also see that the ozone rapidly reach a constant value.t0.5 After some experimentation.T1.atol. -->sol11=ode(x0.1d-13]. we come with -->rtol=[1d-2.2.ozone2). since x1 is rapidly very small.10) ans = 1000000999845.rtol. Beginning with this and after some trial and error. The components are scaled very differently as we can see.1d-2]. we choose atol and rtol.$) ans = 1000000999845. -->atol=[1d-150. what is important is the absolute error. set rtol(i) = 10m+1 and atol(i) at the value at which |xi | is essentially significant. For the first component x1 . See the discussion on error and tolerance absolute and relative in paragraph (5. to begin : Let m the number of significant digits required for solution component xi .

--max(ind) ans = 77. -->max(ind) ans = 77.rtol.T1. -->sol11=ode(x0. -->min(ind) ans = 60.:)<0). It means that we have only 17 values which are negative. 418] a function ozone3.sci which use only the function k4 (t).atol. --ind=find(sol11(1.t0. and use on the interval [141. This is immediate and we omit it. -->%ODEOPTIONS(6)=1.jacozone2).-->ind=find(sol11(1.$) x0 = 100 .ozone2.:)<0). --min(ind) ans = 60. We shall now take the last results of the integration as new start. The result is equivalent. A try with a pure adams method gives poorer results. We can try with a stiff only method. -->x0=sol11(:.

483813747-152 The first component has still negative values.3. -->T2=linspace(141.T2. The figure for the ozone is the same and the concentration stays constant.:)’) -->xset(’window’.914250067-152 ! 1000000999845.sol2(1.t0. -->.getf("/Users/sallet/Documents/Scilab/ozone3.jacozone3).atol.:)) ans = . -->sol2=ode(x0.rtol.sci").ozone3. 101 .1) -->xbasc() -->plot2d(T2’.sol2(2.418.:)’) -->min(sol2(1. -->xbasc() -->plot2d(T2’.! ! 1.5 ! -->to=T1($) to = 141.277). but they are in the absolute tolerance ! We obtain the figure for the concentration of free oxygen.

39e-99 35e-99 31e-99 27e-99 23e-99 19e-99 15e-99 11e-99 7e-99 3e-99 -1e-99140 180 220 260 300 340 380 420 Fig.3) -->plot2d(T3’. with the two functions ki . -->T3=linspace(418. -->sol3=ode(x0.sci").sci the complete program.t0.:)) 102 . t0=T2($).42364). -->.sol3(2. -->rtol=1d-7.x0=sol2(:.42782.:)’) -->xset(’window’. -->xset(’window’. 17 – concentration of free oxygen on [141. 418] We can now integrate on the interval [418.rtol.T3.atol. using ozone1. 43059].sol3(1.ozone1).atol=1d-9.:)’) -->min(sol3(1.2) -->plot2d(T3’. We make an exploratory computation with default tolerance atol = 10−7 and rtol = 10−5 .getf("/Users/sallet/Documents/Scilab/ozone1.$).

866E-13 9e7 8e7 O 7e7 6e7 5e7 4e7 3e7 2e7 1e7 0 1e7 0 1e4 2e4 3e4 4e4 5e4 418 Time in s 42742 Fig. 18 – Free oxygen on day light 103 .1.ans = .$) ans = 1.814E-11 -->sol3(1.

with two other runs. 43341]. For reference for the reader.000 1012 to another plateau 1. on the third interval. we give the figure corresponding for the two intervals for the concentration of free oxygen at the beginning of the day.86 10−13 . Our tolerance are satisfactory. We check that the concentration of O has still negative values. and on the interval [43059. 19 – Molecular oxygen. If we plot.108e10 107e10 O2 106e10 105e10 104e10 103e10 102e10 101e10 100e10 99e10 0 1e4 2e4 3e4 4e4 5e4 Fig. starting at very low concentration (3. with ozone2.077 1012 . corresponding to [42782. We complete now the simulation. and then decreases to 1. day light On day light the free oxygen. We shall get something like the figure (18). 43059] with the function ozone3 . these plots will disappear.6 10−98 ) goes to a peak 8. since the ki are zero. but under the value of atol. You should write a script to obtain for example a simulation on 10 days. and the two intervals at the end and at the beginning of the next day. the remaining of the day and the beginning of the next one. We use the following script : 104 . and pasted all the results obtained. together. The remaining of the simulation is obtained in a periodic way. in reason of their scale. We observe that the concentration of molecular oxygen change from a plateau 1.8 107 at noon (21429s).

5]) -->plot2d(T5’.:)’) -->xsetech([0.0. 141] but the I.5.sol5(1.V.5.0.sol4(1.5. 20 – Molecular oxygen at low levels The chattering effect in the graph of interval [43059.0. 43341] is simply a scale zoom effect.0.5.0.0.0.sol11(1.0.5]) -->plot2d(T1’.sol2(1.:)’) -->xsetech([0. The same thing arrive in the interval [0.0.:)’) Free Oxygen 2700 2300 1900 1500 1100 700 300 -100 0 20 40 60 80 100 120 140 160 39e-99 35e-99 31e-99 27e-99 0-141s 23e-99 19e-99 15e-99 11e-99 7e-99 3e-99 -1e-99 140 180 141-418s 220 260 300 340 380 420 19e-14 17e-14 15e-14 13e-14 11e-14 9e-14 7e-14 5e-14 3e-14 1e-14 -1e-14 42780 42820 42860 42900 42940 42980 43020 43060 10e-152 8e-152 6e-152 43059-43341s 42782-43059s 4e-152 2e-152 0 -2e-152 -4e-152 -6e-152 43050 43090 43130 43170 43210 43250 43290 43330 43370 Fig.5]) -->plot2d(T4’.5.-->xbasc() -->[([0.5.0.5.0.0.0.:)’) -->xsetech([0. for this 105 .5]) -->plot2d(T2’.0.5.

000E+12 ! Do it ! Obtain a simulation for 10 days. You can check by yourself. A source of large systems of ODE is the discretization of PDE.$) ans = ! ! 2.982-155 ! 1. and we begin the plot at time 1 giving a concentration of 2410. since the I. The chattering effect in interval [0. with the command -->xbasc() -->plot2d(T1(70:141)’. that this happen also in the first interval.$) ans = ! ! 1.V. But the solver of Scilab are able to handle large systems.4. The brusselator with diffusion is the system of PDE : 106 .sol11(1.V. whereas the I. at the beginning of the interval for the concentration of free oxygen is 3. are almost the same. problems of larger size can be considered.interval is 106 . We shall consider some examples and particularly the example of the Brusselator with diffusion ( HW).4 Large systems and Method of lines Till now we have just solve ODE with small dimension.8 10−156 . 141] is squashed by the scale. But depending on your computer. 5.077E+12 ! -->sol11(:.924-152 ! 1. We shall here consider an example in dimensions 20 and 40. comparing the starting for the second intervals (k4 = 0 and k3 = 0) : -->sol5(:.70:141)’) The remaining of the simulation should give something approximatively periodic.

∂u(t.∂u ∂t ∂v ∂t With 0 ≤ x ≤ 1 With the boundaries conditions = A + u2 v − (B + 1)u + α ∂ u ∂x2 ∂2 = −Bu − u2 v + α ∂xv 2 2 (23) u(0. We consider the grid 0 = x0 < x1 < x2 < · · · < xN < xN +1 = 2π Some functions vi (t) are used to approximate the function u(xi . t). let consider the variable coefficient wave equation ut + c(x)ux = 0 For x ∈ [0. For example we can use a first order difference approximation on a grid of spacing ∆x vi (t) − vi−1 (t) ∂u(t. Consider the Brusselator with diffusion (23). t) = u(1. Another approximation is given of the spatial derivatives of u(t. t) = 1 u(x. t) = v(1. t) = 3 v(x. If one use the spectral methods. We use a grid of N points 107 . 2π]. x). 0) = 1 + sin(2πx) v(0. xi ) vi+1 (t) − vi−1 (t) ≈ ∂x 2∆x Of course these formulas must be adapted at the boundary of the grid. 0) = 3 When a PDE is discretized in space (semi-discretization) the result is an ODE. For example. xi ) ≈ ∂x ∆x A more accurate approximation can be a central difference ∂u(t. and (Dv)i is the ith component. xi ) ≈ (Dv)i ∂x Where D is a differentiation matrices applied to the function v.

u2 . the equation (24) is equivalent to  α 2 ˙  x2i−1 = A + x2i−1 x2i − (B + 1)x2i + (∆x)2 (x2i−3 − 2x2i−1 + x2i+1 )   α x2i = Bx2i−1 − x2 x2i + (∆x)2 (x2i−2 − 2x2i + x2i+2 ) ˙ 2i−1 i  x2i−1 (0) = 1 + 2 sin(2π N +1 )   x2i (0) = 3 (25) It is straightforward. · · · . v2 . from the original equation to check that each RHS of the system depends at most only of the 2 preceding and the 2 following coordinates. v0 (t) = vN +1 (t) = 3  0  u (0) = 1 + 2 sin(2πx )  i i   vi (0) = 3 xi = We shall code this ODE.3. We ordered the components and the equations in the following order : (u1 .1 i i=1:N ∆x = N +1 N +1 We approximate the second spatial derivatives by finite difference to obtain a system of ODE of dimension N. If we follow the syntax of the section (5. D = (B + 1) + 2c For the 4 first columns   0 0 c c ···  ···  0 x2 0 x2 3 1    2x1 x2 − D −x2 − 2c 2x3 x4 − D −x2 − 2c · · ·  3 1    B − 2x1 x2 0 B − 2x1 x4 0 ···  c c c c ··· 108 . If we adopt the convention x−1 = u0 (t) = x2N +1 = uN (t) = 1 and x0 (t) = v0 (t) = x2N +2 (t) = vN (t) = 3 to incorporate the boundary conditions. This proves that the Jacobian is banded with mu=2 and ml=2. for shortening the formulas .11) the matrix J is. v1 . setting.  α 2 ˙  ui = A + ui vi − (B + 1)ui + (∆x)2 (ui−1 − 2ui + ui+1)   2  vi = Bui − ui vi + α 2 (vi−1 − 2vi + vi+1 )  ˙ (∆x) (24) u (t) = uN +1 (t) = 1. vN ) That is we set x2i−1 = ui and x2i = vi for i = 1 : N. uN .

.. We introduce a temporary vector to take care of the boundaries conditions. function xdot=brusseldiff(t. c=alpha*(N+1)^2. using the vectorization properties of Scilab. c*(y(i+1)-2*y(i+3)+y(i+5)) ///////////////// function J=Jacbruss(t.B=3. results from these formulas. c*(y(i)-2*y(i+2)+y(i+4)) xdot(i+1)=B*y(i+2)-y(i+3).x) //banded Jacobian ml=2 mu=2 109 .x. xdot=zeros(2*N.3].And for the remaining columns till the two last columns.1.1) i=1:2:2*N-1 xdot(i)=A+y(i+3).       ··· c c 2 ··· x2i−1 0 · · · 2x2i−1 x2i − D −x2 − 2c 2i−1 · · · B − 2x2i+1 x2i 0 ··· c c ··· c c 2 ··· 0 x2N −1 · · · 2x2N −1 x2N − D −x2 −1 − 2c 2N · · · B − 2x2N +1 x2N 0 ··· 0 0       The code.*y(i+2)..3.*y(i+2).x) //constants A=1. // A temporary vector for taking in account // boundary conditions x=x(:).alpha=1/50. The Jacobian is coded as a sub-function of the main program. y=[1.^2+..^2-(B+1)*y(i+2)+. // // dimension of the problem N=20.

1:2:2*N)=(2*x(1:2:2*N).2:2:2*N)-2*c. -->%ODEOPTIONS=[1. d=B+1+2*c. We have to call the function brussdiff. then you retype in the window command the same 6 lines of instructions and you get the solution for N=40 ! Now we can plot the solution of the PDE. We can do the same for N = 40. -->x0(1:2:2*N)=1+sin(2*\%pi*X)’.^2)’.X. T=linspace(0.solbru(2:2:2*N.:)’) -->xset(’window’. : -->.mu=2. J(2.3:2*N)=c.20000.ml=2. You just have to change in the function brussdiff.0.solbru(1:2:2*N. -->N=20. J(3.0.0.1) --plot3d1(T.%inf.getf("/Users/sallet/Documents/Scilab/brusseldiff. c=alpha*(N+1)^2.B=3.T.10.2].0. in only two places at the beginning of the function brussdiff and Jacbruss. set the %ODEOP T IONS with Jactype=4.Jacbruss). thes lines of commands --plot3d1(T.V.sci.t0. and set the I. -->x0(2:2:2*N)=3.100). J(5. J(3. The solution is obtained in a snap.1).*x(2:2:2*N))’-d.5. A=1. t0=0. -->solbru=ode(x0.brusseldiff.1:2:2*N)=-J(3.1:2:2*N)-2*c-1. -->x0=zeros(2*N. ui is the discretization of the solution u.sci the value of N .1:2*N-3)=c.X. N=20.alpha=1/50.//see the notes for the explaination x=x(:).sci"). Using the properties of plot3d .12.2*N). -->X=(1:N)/(N+1). maxstep=20000. // five lines to be defined J(1.2:2:2*N)=J(2.2.:)’) 110 . Don’t forget to save the file and make a “getf”. J(4.2:2:2*N)=(x(1:2:2*N). 10]. J=zeros(5. Now we can integrate the ODE on the interval [0.4.

50 5 X t 0.66 1. 22 – Solution of v(x.02 0. t) of (23) 0 Z 4.89 1.50 Y 0.02 0.98 10 5 X Fig. t) of (23) 111 .Give the figures 0 Z 2. 21 – solution of u(x.98 10 Fig.09 2.45 0.00 0.24 0.

t) other point of view You are now able to test the ODE solver of MATLAB on classic test problems.solbru(1:2:2*N. we change the point of view.89 1. save the file and in the window command make a “getf” for this file to redefine the function.45 0. The pedestrian way is to change the parameters in the file of the ODE.98 0.02 Fig. and change the orientation of the x-axis. 112 .To obtain a comparable figure as in HW pp7. 5. In the preceding section we have use this way to change the number of discretization N from 20 to 40. See the section of examples. Do it ! Experiment.:)’. -->plot3d1(T.5 Passing parameters to functions It is frequent to study ODE with parameters and to want to see what are the effects of parameters on the solutions.50 Y 0. 23 – Solution u(x. for example for 100 parameters this is not very convenient. it is the only way to learn how to use Scilab.00 10 5 X 0 0.45) And obtain Z 2.X(N:-1:1). If you want to test. make your own errors. In this file N has an affected value in the file.170.

param2. d) The we code the function function xdot=lotka(t.c. paramk are the parameters you want.(-c+d*x(1))*x(2)] (26) 113 . c. In the syntax of ODE of Scilab you replace the argument f by a list : ode ( x0. 5. If a Jacobian is coded the same rules applies. b..param1. The predator-prey Lotka-Volterra equations are defined by x1 = (a − b x2 ) x1 ˙ x2 = (−c + dx1 ) x2 ˙ We have 4 positive parameters : (a..1 Lotka-Volterra equations.lst) Where lst is a list with the following structure lst=list (f.b. The function f must be a function with the syntax function xdot =f(t...x..paramk).. t . .paramk) We shall give an example. Functions can be invoked with less input or output parameters.a.. . param1. but cannot modified this variables. passing parameters The ODE Lotka-Volterra are contained in any book on ODE. The argument param1.param2. .x. A function has then access to all the variables in the superior levels.5.If a variable in a function is not defined (and is not among the input parameters) then it takes the value of a variable having the same name in the calling environment. This variable however remains local in the sense that modifying it within the function does not alter the variable in the calling environment. param2.d) xdot=[(a-b*x(2))*x(1).t0... It is not possible to call a function if one of the parameter of the calling sequence is not defined In this section we shall give a way to do affect parameters easily and to automatize the process using a script.

d(i))). tableau(2:3.875439 0.889492 0.5669242 4.3248549 0.9745155 114 0.tf.9).t. b = 0.1:.b.9. c. //example lst=list(lotka.1.i)=X.5 ! 0. tableau=zeros(2. we can give values to the parameters and call the solver ODE a= 1.9. n=length(alpha).getf("/Users/sallet/Documents/Scilab/lotka.sci").list(lotka.1].7036127 13.1 2.lst).4 0.c=1: d=. We want to have the solution (x1 (10). -->tableau tableau= ! ! ! 0.1 : 0.exec("/Users/sallet/Documents/Scilab/lotkascript.tableau]. d). tableau=[alpha.b.6287856 ! 3.d) sol=ode(x0. Here is a script lotkascript.t0. x2 (10)) for each value of the parameter d = 0.b=0. To change the parameters it suffice to change the values of (a.5474123 6. For example we fix a = 1.1 to 0. a=1.7917042 25.sci"). t0=0.c.3 0.a.9 with initial value [1. 1].1.c = 1 and we want to see what happens when d varies from 0. end In the window command you execute the script -->.b=1.Now at the command line.2 0.0348836 ! . tf=10.1 : 0.a. b. x0=[1.c. for i=1:n X=ode(x0.1:.t0.sci for getting an array answering the question // Lotka script for the tableau // .c=1=d=2.

8082973 0. If you change often the parameters.4650893 0.getf("/Users/sallet/Documents/Scilab/foo. A warning : a sub-function has not access to the parameters of the other function of the program.8 0.9 ! 0. -->foo(1) ans = 63. -->.5.7 0.7867920 1. Created in the file.7055805 2.2915144 0.2052732 ! 5.2 variables in Scilab and passing parameters To sum up the question of passing parameters. Use parameters as input in the function and use list when you invoke solvers with ODE.column 6 to 9 ! ! ! 0.8661472 1. but these parameters must exists somewhere. You can use in a function some parameters.sci function w=foo(t) c3=22.6 0. or existing in the current local memory of the window command. consider the following function foo. w=1+subfoo(t)+N // function z=subfoo(t) z=c3*t If you type in Scilab -->N=40.62 -->subfoo(1) !--error 4 undefined variable : c3 at line 2 of function subfoo subfoo(1) 115 called by : .sci").9386788 ! 1. omega=%pi/43200.62. For example. inside the code. It is not necessary to create global variables.

The function subfoo has not access to c3. Any assignment to that variable. This equivalent to the first method.sci) function w=foo1(t) global c3 c3=22. each Scilab function. with a little subtlety Ordinarily.62.sci function w=foo2(t) global c3 w=1+subfoo2(t)/c3 // function z=subfoo2(t) global c3 z=c3*t If you load by getf these two functions.sci – create c3 in the base workspace. global in foo and in subfoo.2. Then if for example you create the two functions foo1. 116 . This is the manner used in subfunction Jacozone of function ozone.sci (see the difference with foo. To remedy you have 3 solutions – create c3 in subfoo. Affecting a value to c3 in foo.4.3). – declaring c3. even with the variable c3 having no given value in foo2. In this case c3 is a variable of the program foo. is available to all the other functions declaring it global. a simple trial show that foo. giving it value. has its own local variables and can “read” all variables created in the base workspace or by the calling functions. by typing in the window command.2 share the variable c3. The global allow to make variables read/write across functions.1 and foo. w=1+subfoo(t)/c3 // function z=subfoo(t) global c3 z=c3*t And the function foo. only foo.sci in section (5. but however c3 is not a variable in the base workspace. it is just declared global. in any function.

this is a great burden on the code. The most effective medications for this form of emotional disorder are the simple salts lithium chloride or lithium carbonate. 117 . giving rise to the form of affective psychosis known as bipolar depression. a current hypothesis is a certain regularity of the function defined by the RHS of the ODE. 5.62 -->c3 !--error 4 undefined variable : c3 5. All the algorithms are intended for smooth problems. -->subfoo1(1) ans = 22.6 Discontinuities As we have seen in the theoretical results for existence and uniqueness for ODE. the ability to monitor blood levels and keep the doses within modest ranges (approximately one milliequivalent [mEq] per litre) makes it an effective remedy for manic episodes and it can also stabilize the mood swings of the manic-depressive patient. Even if the quality codes can cope with singularities. periods of depression and mania alternate. and unexpected results can come out. We shall illustrate this by an example from the book of Shampine.-->foo2(1) ans = 2. Lipschitzian is the minimum. In some patients.6. Unfortunately it is often the case in applications. or manic-depressive disorder. Mania is a severe form of emotional disturbance in which the patient is progressively and inappropriately euphoric and simultaneously hyperactive in speech and locomotor behaviour. Although some serious side effects can occur with large doses of lithium. Then when the function in the RHS has some discontinuity we have a problem. Moreover the theoretical results for the convergence and the order of the methods suppose that the function f is sufficiently derivable (at least p). and very often some smoothness is supposed.1 Pharmacokinetics This example is a simple compartmental model of Lithium absorption.

the ODE giving the excretion of lithium (without further input) is x2 = −a x2 . Then u(t) is periodic function on 1/2 day. This is modeled by a two compartment model. 24 – Compartment model of Lithium A mass balance analysis give the equation x1 = −a x1 + u(t) ˙ x2 = a x1 − b x2 ˙ (27) Since the half-life of Lithium in the blood is 24 hours.6 x1 x2 b=O. The constant is such that the integration on a half-day (the quantity of drug metabolized) is 1.The carbonate lithium has a half-life of 24 hours in the blood. if we use for unity of time the day. Remember that the safe range is narrow.5451774 The lithium is administrated in one unity dose. The first question is with a dosage of one tablet every 12 hours. The first is the digestive tract and the second is the blood. constant on 1/48 and taking zero value on the remaining of the day. how long does it take for the medication to reach a steady-state in the blood. which gives a half-life of log(2)/a we have ˙ b = log(2) ≈ 0. passing during one half-hour. 118 . The dynamic can be summarized by Intestinal tract Blood u(t) a=5.6931472 The half-life in the digestive tract is 3 hours which gives a = log(2)/(3/24) ≈ 5. every half-day.7 Excretion Fig.

P) satisfies 0 ≤ pmodulo(x.. The function pmodulo is vectorized and can be used to build any periodic function.P)=x-P . and moreover which is built-in.. P ]. The function modulo (x. programming u(t) in Scilab : Once more there is a clumsy way to code u(t). The keys are the test function and the function pmodulo. modulo(x. is defined on [0.P) is modulo(x. which has the great advantage to give a vectorized code. 1/2] by u(t) = 48 0 1 if 0 ≤ t ≤ 48 1 if 48 < t < 1 2 Clearly the RHS of the ODE is discontinuous. When you consider the grid P Z. With these preliminaries it is now easy to code a periodic function. The number pmodulo(x.P) is the remainder of the division with integer quotient of x by P (as period). P ) = x − nP ./ P ) That is pmodulo(x.Then u(t) of period 1/2. w] and taking value 0 on ]w.* floor(x . that is considerably quicker. and using if loops. w. i. P ) is the distance of x to the nearest left number on the grid. 119 . Doing by case .w. P ) function x=pulsep(t. P ) < P . of period P which value is 1 on [0.p) <= w ).p) //give a step periodic function of the variable t of period p //value 1 if 0 <= t <= w and 0 if w < t < p x= bool2s(pmodulo(t. We call it pulsep(t.e if n is the smallest integer such that nP ≤ x < (n + 1)P modulo(x. But the bad points of discontinuity are known. There is a better manner.

For safety, we test is the remainder is less or equal to w, we obtain boolean variables, %T or %F , and we convert in variables 1 or 0. See the comments in the code of the function ki in section (5.4.3). Now it is simple to code the Lithium model function xdot= phk(t,x) // pharmacokinetics example from Shampine //plot of numerical solutions of ODE pp105 b =log(2); a=log(2)/(3/24): dotx=[-a 0; a -b]*x + 48 * [pulsep(t,1/48,1/2) ; 0]; // function pulsep(t,w,p) x=bool2s( pmodulo(t,p) <= w ); We now write a script for testing some solvers. We test the default solver RK, Adams and RKF (Shampine and Watts). // compute solution for the pharmaco-kinetics problem // ;getf("/Users/sallet/Documents/Scilab/phk.sci"); // x0=[0;0]; t0=0; tf=10; T=linspace(t0,tf,100); T=T(:); X=ode(x0,t0,T,phk); xset(’window’,0) xbasc() plot2d(’RK’,T,X(:,2)’,2) X1=ode(’rkf’,x0,t0,T,phk); xset(’window’,1) xbasc() plot(T,X1(:,2)’,3) 120

X2=ode(’adams’,x0,t0,T,phk); xset(’window’,2) xbasc() plot(T,X2(:,2)’,3) For the method “RKF”
3.2

2.8

2.4

2.0

1.6

1.2

0.8

0.4

0 0 1 2 3 4 5 6 7 8 9 10

Fig. 25 – concentration of Li in blood, by RKF method For the method “RK”

121

3.2

2.8

2.4

2.0

1.6

1.2

0.8

0.4

0 0 1 2 3 4 5 6 7 8 9 10

Fig. 26 – concentration of Li in blood, by RK method For the method “Adams”
3.2

2.8

2.4

2.0

1.6

1.2

0.8

0.4

0 0 1 2 3 4 5 6 7 8 9 10

Fig. 27 – concentration of Li in blood, by Adams method The solver gives contradictory results. To understand what happens, we 122

But in Scilab. too much big steps. We have to integrate on 20 intervals. once again . 3. 1/481/2. namely at time 3.2 0. 10] On intervals of the kind [p/2. which was substantially quicker .8 2. defined by the mesh [0.8 0.8 2. a for loop. p + 1. 28 – Plotting the two components It is clear that the “RKF” method. Integrating on interval of smoothness for f : We shall now integrate on the intervals of smothness for u(t).2 2.4 2.4 2. To define this interval. To get the more precise result of the solution is to integrate on interval. this is clumsy : a principle is to use the vectorized capacity of 123 . we can use. · · · .4 0 0 1 2 3 4 5 6 7 8 9 10 RK Fig.0 1. and has missed some absorption of tablets.5. on the intervals of the kind [p/2 + 1/48. has tacken big steps. p/2 + 1/48. 19/2 + 1/48.0 1. for example . for 10 days. · · · .5.should have plot also the first component : 3. p/2 + 1/48] the fonction u has value 48.2 2. 4 and 4.8 0. p + 1] the function has value 0.4 0 0 1 2 3 4 5 6 7 8 9 10 RKF 3.2 0.6 1.6 1. stop at bad points and restart.

-->A=kron(J.mesh(2*i). now to each block we have to make a translation of a multiple of 1/2. To begin we concatene 20 copies of this interval. to begin we prepare the output. -->mesh=[mesh.I). 124 . // the last point is missing.2). We obtain a vector of mesh points of length 48. This done by using the kronecker product of matrices : -->I=[0. We set the loop // a loop for solving the problem // we integrate on a half a day. 1/48] and to make 20 translations of steps 1/2.n). //blocks of translations -->mesh=A+M. xout=x0’ tout=t0. x0=[0. -->K=ones(1. -->J=ones(1. Now we have the mesh we can integrate. The idea is to take the interval [0.10].Scilab and avoid loops which are computer time consuming.20). -->L=0:1/2:19/2 .K). t0=0. n=100.0]. //this half day is divised in two // integration intervals for i=1:20 //integration with lithium absorption // on the first half--day T=linspace(mesh(2*i-1).1/48] . We take n = 100 steps by interval. // translation -->M=kron(L.

mesh(2*i+1).2)) So we obtain the following plot : 125 . xout=[xout . x0=X(:.X(:. x0=X(:. end xbasc() plot2d(tout.T.t0.X(:.n).2:$)’].$).odephk2). xout=[xout . X=ode(x0.T.X=ode(x0.2:$)’]. tout=[tout.$).odephk1). tout=[tout.T(2:$)’].xout(:. //new IV t0=T($).T(2:$)’].t0. //new IV t0=T($). // integration without lithium absorption //on the second half day T=linspace(mesh(2*i).

4 0 0 1 2 3 4 5 6 7 8 9 10 Fig. 29 – concentration of Li in blood.2 0.8 RKF 0. in blue the “true value” integrating by interval.0 1.3.8 2.2 0. 30 – comparison of methods 126 . So we obtain the following plot : 3.2 Intervall 2.4 2.6 1. in green the method RKF.4 0 0 1 2 3 4 5 6 7 8 9 10 Fig. by interval We can compare the different plots.2 2.8 2. in red with dots the method RK.4 RK 2.0 1.8 0.6 1.

When this depends of the solution itself the matter becomes much more difficult because it is necessary to locate bad points. We describe first the syntax of “root” : 5.1 Syntax of “Roots ” The syntax is in the window command : 127 . and the hypersurface of equation g(y) = 0 is the surface of discontinuities. The “bad” points are known. A common example is a problem which can be written x = f1 (x) if g(x) > 0 ˙ x = f2 (x) if g(x) < 0 ˙ (28) The function g is called a switching function.7. In this case breaking up the the integration interval solve the problem. The two first one are given by a bouncing ball. Scilab has a event location capabilities. The RK and Adams methods has good behavior.We see that the RKF method. when considering the Coulomb example. Actually breaking up the problem gives rise to a restart. since RKF choose too big integrating steps (when this is not precised by the user) . 5. this is due to the skipping of some absorption of Lithium. The problem is to detect when the solution is crossing a surface of discontinuity. All this presumes that we know where the bad places occur. We shall illustrate this by two examples and explain how to use these capabilities.7 Event locations The preceding example of the absorption of Lithium is a case of a RHS discontinuous. If there are a great many places where singularities occur. Plot the first component and compare. the restarts becomes expensive. h as bad results. the third is an example taken from HW of the friction Coulonb’s law. But there is some cases where there can be difficult to approach the problem in this way. This situation is quite frequent in applications where the RHS depends on a switching function. We shall come back on this point later on.

7. x.atol. i. x) where t is a real scalar (time) and x a real vector (state). The dimension of g – g : external i.2 Description With this syntax (first argument equal to “root”) ode computes the solution of the differential equation x = f (t.rd]=ode(’root’. It is an external i. The number of constraints to be satisfied . x) = 0.ng. to be sure that we encounter the surface. gout) where ng is the number of constraints and gout is the value of g (output of the program). ng.e.[x. If g is a character string it refers to the name of a Fortran subroutine or a C function.x0. function which gives the RHS of the ODE – rtol.e.rtol . 128 . – f : external i.e. The function g g should give the equation of the surface. – t0 : real scalar (initial time).e.t0. If we must pass parameters to g. – t : real vector (times at which the solution is computed). – ng : integer.5)) or Jac Ouput rd is a 1 × k vector. or the name of a Fortran subroutine or a C function (character string) with specified calling sequence or a list. The first entry contains the stopping time.jac .g ) The parameters are – x0 : real vector or matrix (initial conditions). with the following calling sequence : g(n. a function with specified syntax.e. – jac : external i. Other entries indicate which components of g have changed sign. x) until the state x(t) crosses the surface ˙ in Rn of equation g(t. 5. It returns a vector of size ng which corresponds to the ng constraints. the function g can be a list with the same conventions as for f apply (see section (5. k larger than 2 indicates that more than one surface ((k − 1) surfaces) have been simultaneously traversed.tf .f . Needless to say that we must choose the final time tf safely. t. If g is a function the syntax should be as follows : z = g(t.atol : real constants or real vectors of the same size as x. function which gives the Jacobian of the RHS of the ODE. function or character string or list.

-1.testz. namely.05] and we add the characteristic function of the interval [0. 0. ∞]. and constant outside the neighborhood of the origin.05.tf.x)’.01.05)’) -->[xsol.x)’.0. To show what can happen.-1]. 0. and set hmax to 0.t0.. xdot =0 ! The function g is obtain by a a sinusoid.01 : -->%ODEOPTIONS=[1. No zero has been detected ! The remedy for this is to adjust the maxstep used by the solver (see section (5.2. This function is constant. The solver use the root finding capabilities of Scilab ..1. With this we use “root” -->deff(’xdot=testz(t.3 Precautions to be taken in using “root” Missing a zero : Some care must be taken when using “root” for event location. 0. We define a function with some zeros near the origin.12. with zeros at each step 1/50.g).5. We use %ODEOPTIONS.x0.4).tf. Then our function g depending only of time.05)+. -->rd rd = [] The vector rd is empty.0.0. we use an explicit example. equal to 1 on [0. sin(50πt).The output x is the first solution (computed) reaching the surface defined by g 5. and we choose a very simple ODE. we multiply this sinusoid by the characteristic function of the interval [0. But in other hand the solver adjust the steps according the tolerance required.rd]=ode(’root’.’xdot=0’) -->deff(’y=g(t.1.04.t0. has 2 zeros located at 0.02. 129 . ∞].*(t<.7.3.rd]=ode(’root’.g). -->[xsol.*(t>0).0.testz. bool2s(t> =.x0.10000. When the ODE is particularly smooth and easy the solver can take great step and miss some zeros of the function..05.’y=sin(50*%pi*t).

testz. 130 .5). and restart from this new point to search the new events.rd]=ode(’root’.3. in other words.3. You can remedy to this situation with three ways 1. and in the example of the dry friction of Coulomb’s law.7.x0. which is 0 by default (see section (5.one or more components of g has a root too near to the initial point That is. in section (5.5) 2. without precautions. In our example. -->t0=rd(1). Or integrate a little bit the solution.x0=xsol. -->rd rd = ! 0. we get a warning message : lsodar. Adjust the first step tried h0. Adjust the minimum step hmin. We can restart with this new I.g).3) 3.V.-->rd rd = ! 0.tf. %ODEOPTIONS(3). But it can happens (see the next examples). ¡ Restarting from an event location : There is here another subtlety. which is 0 by default (see section (5.1.02 1.t0. We have to use this trick in the example of the ball on a ramp.04 1. the root finding capabilities find the event you have previously detected. this has not occurred. ! The solver stop at the first zero of g. Sometimes when we ask Scilab to find the next event. --[xsol. %ODEOPTIONS(5). ! To obtain the second zero.

4 The bouncing ball problem A ball is thrown in a vertical plane.10].g).rd]=ode(’roots’.0. -->x0=[0. 0. x(4).Remark 5. -->tf=100. h(t)) satisfies is coded by function xdot=fallingball2d(t.’z=x(3)’) -->[x.x)’. -->t0=0. You can miss some singularities of your system when one of these two quantities are too big. the time at which the ball hits the ground. We give numerically. The ODE that (x(t).81] At time t0 = 0.7.1.5.fallingball2d.getf("/Users/sallet/Documents/Scilab/fallingball2d.tf. -->rd(1) ans = 2. the ball is thrown with initial condition x(0) = 0. ˙ ˙ h(0) = 0 and h(0) = 10. 5.t0.038736 -->x x = 131 . -->.x0.sci"). -->deff(’z=g(t. The position of the ball is represented by the abscissa x(t) and his height h(t). x(0) = 5.1 : Modifying h0 or hmin is not without harmlessness. -9.x) xdot=[x(2).

the arrival speed is (x(thit ). so ˙ x = 10. xout = x0. if thit is the ˙ time of hit. xeout = [].g).1. y = −10 ˙ ˙ When the ball hits the ground.. x = 5.fallingball. −k h(thit )) ˙ Where k is a coefficient of elasticity. x0 = [0. h ? they are obtained by calling x. for i = 1:10 //solve till the first stop [x.sci").! 10. teout = []. h(thit )) and becomes ˙ ˙ (x(thit ). ! -10.19368 ! 5.’. x. tfinal = 30. it bounces. tstart = 0. deff(’z=g(t.8. 20]. ! 0.tstart.getf("/Users/sallet/Documents/Scilab/fallingball.. h. y = 0. //accumulate output 132 .rd]=ode(’root’.x)’. We write the following script // script for a falling ball // // script for a falling ball // . We set k = 0..19. or in other words .’z=x(1)’) tout = tstart.x0. ! ! ! ! ˙ What are the value of x.tfinal.

0183486 6.2:$)’].5.9*x(2). 0.T(2:$)].091743 30.rd(1)].9745158 6. 5."000") When the script is executed we obtain a tableau of the 10 first times where ˙ the ball hits the ground. xeout=[xeout.0183486 10.096 . 0.0183486 6. 5. X=ode(x0.097152 1.091743 30. T=T(:).8. // new IV (initial values) x0(1)=0. end xset(’window’.1)) plot2d(teout.091743 30.2.549E-16 1.3. 5.091743 30.872579 30.348624 24.0183486 6.850E-15 8.xout(:.038736 3.109E-15 1.tstart.0183486 6. 133 0. 5.6697248 4.100)’. . 5.19368 18. teout=[teout.T. 5. x.6777216 .rd(1).1) xbasc() plot2d(tout.6.091743 30.fallingball).X(:. tout=[tout. x0(2)=-.091743 5.2768 2. 5. .// T=linspace(tstart.zeros(teout). h.009E-17 6.0183486 6. The time ˙ is given in the first column -->[teout xeout] ans = ! ! ! ! ! ! ! ! ! ! 2. tstart=rd(1). h at this events.3421773 ! ! ! ! ! ! ! ! ! ! - . give the values of x.12 4.103E-15 -10. 5.147E-16 9.1.-9. xout=[xout.0183486 6. 5. 3.x’].061E-15 1.62144 .091743 30.4 .

one or more components of g has a root too near to the initial point Simply when starting from the new initial point. but this time the ball is bouncing on a ramp (equation x = y = 1) The rule for restarting is a reflection rule on a surface : the new starting vector speed is symmetric with respect to the normal at the hitting point. the solver find this point. We need here a refinement : when we ask Scilab to find the next event.7. and reduced accordingly to the reduction coefficient. which is precisely an “event ” location point. 134 Ο Ο Ο Ο Ο Ο Ο Ο Ο Ο .With accumulation of output (see the script) we obtain the following plot 23 19 15 11 7 3 1 0 4 8 12 16 20 24 28 Fig.5 The bouncing ball on a ramp We consider the same problem. without precautions. 31 – A Bouncing ball on the ground 5. as we have already said a warning message : lsodar. we get .

fallingball2d).T(2:$)]. t0 = 0.0.’. t0=rd(1). Z1=ode(x0.0 0 1 0. // new IV (initial values) // must push a little bit the solution x0=A*x. 0 -k 0 0].001.g). tfinal = 100.rd(1)].fallingball2d.2. xeout=[xeout. 135 .x)’. //accumulate output // T=linspace(t0.422.0].Then we need to advance a little bit the solution et restart from this point for searching for the new events. for i = 1:7 //solve till the first stop [x. k=0.getf("/Users/sallet/Documents/Scilab/fallingball2d.t0+eps.t0.100)’.0 0 0 -k. tout=[tout.rd(1). // script for a falling ball // .rd]=ode(’root’. x0 = [0. A=[1 0 0 0 . xout=[xout. teout=[teout.2:$)’].x(3)]’) tout = t0.t0.x’].tfinal. teout = []. deff(’z=g(t.x0.fallingball2d). T=T(:).T. xout = x0.t0. eps=0.’z=[x(1)+x(3)-1.2. xeout = [].X(:.sci"). X=ode(x0.

1 0 0. Try the other solutions : modifying h0. Compute the path of theball In the first example."000") We obtain the plot 2.4 0.$).1 Ο 0. there is a vertical wall located at the end of the ramp.V.1) xbasc() plot2d(xout(:. Suppose that the ground is given by 136 ΟΟ Ο Ο Ο Ο .1).tout=[tout.3 1. 0 1.2 Fig. 32 – A bouncing ball on a ramp Try the preceding code without “the little push” to see the reaction of Scilab. Suppose that in the preceding configuration.3)) plot2d(xeout(:.6 0. end xset(’window’. // new I.xout(:.1). or hmin. x0=Z1(:.3). the ground is flat.8 1.2 0.xeout(:.3 0.-9.7 0. xout=[xout. tO=t0+eps.9 1.t0+eps].Z1’].5 Falling Ball on a ramp 1.

tI ) < 0 and fII (x.1.7. The coefficient of restitution is k = 0. fI (x. We follow a solution starting in the upper half–plane till it hits the manifold x2 = 0. Let denote tI the time where the solution reaches the manifold. A = 2. When the solution crosses the x1 –axis different cases occur. tI ) < 0 and fII (x.1 sin(5πt). t) = −x1 + 2 cos(πt) − 4 and fII (x. Let look at the different vectors fields near the intersection with the x1 –axis : 1. tI ) < 0. the surface is then the x1 –axis. on which acts a periodic force A cos(ωt.0. fI (x. We write the equation in standard form x1 = x2 ˙ x2 = −0. 0. The term of the dry friction has value 4 in the expression of x2 in ˙ equation (30) 137 . and subject to a dry friction force of intensity mu and to a viscous friction of intensity 2D.1] . 5. the solution remains trapped in the manifold x2 = 0 and x1 = Constant = x1 (tI ) until one of the value of fI or fII changes sign and becomes of same positive sign.9. Simulates the path of the ball. : [2. and ω = π. 2. µ = 4. fI (x. The ball is thrown at time 0 with I.2 x2 − x1 + 2 cos(πt) − 4sign(x2 ) ˙ (30) The switching function is x2 . then the solution will leave the manifold and go toward the common direction given by the vector fields. 1. t) = −x1 + 2 cos(πt) + 4 the different values of the vectors field on the x1 –axis.6 Coulomb’s law of friction We consider the example x + 2D x + µ sign(x) + x = A cos(ωt) ¨ ˙ ˙ (29) The parameters are D = 0.V. the initial values are x(0) = 3 and x(0) = 4. We continue the integration in the x2 < 0 half–plane. ˙ This equation model the movement of a solid of mass M moving on a line. We shall now implement the solution of this problem. We have for the differential equation 3 possibilities. tI ) > 0.

MODE) // Example of a spring with dry and viscous // and a periodic force // MODE describe the vector field // MODE=0 we are on the x1 axis M=1. // Mass A=2. then we are trapped in the manifold x2 = 0. Or if the speed is 0 (we are then in the x1 –axis manifold) and if the components of two vector fields on the x2 axis given by the formulas −x1 + 2 cos(πt) − 4 and −x1 + 2 cos(πt) + 4 have opposite sign. If the two preceding expressions have opposite signs. // Amplitude of input force 138 friction . With this analysis. an solution for the code could be function xdot = coulomb(t. corresponding to MODE=1 or MODE=-1. and then the ODE is given by x = 0.2.−1 or 0. the event is : the component x2 taking the value 0. and is given by sign(x_2) The event function is twofold. When we are not in the manifold x2 = 0. have same sign the solution goes through the x1 –axis and we choose the appropriate vector field. these 3 possibilities. which takes values 1. Or the term of the dry friction has value −4 in the expression of x2 in ˙ equation (30) 3. given by the 2 formulas −x1 + 2 cos(πt) − 4 and −x1 + 2 cos(πt) + 4. If the the components of the two vector fields on the x2 axis. by the introduction of a parameter MODE.x. ˙ We shall code. At this point we have to decide which way the solution of the ODE is going. then we set MODE=0 and the next event that we must detect is the change of sign of the expression (−x1 + 2 cos(πt) − 4)(−x1 + 2 cos(πt) + 4) The event is then : looking for a zero of this expression.

. A=2.1. if MODE==0 F=( (-mu+A*cos(%pi*t)-x(1)) *( mu+A*cos(%pi*t)-x(1) ) ).mu*MODE +A*cos(%pi*t)-x(1))/M].5.-1].0. (-2*D*x(2). else value=x(2).getf("/Users/sallet/Documents/Scilab/coulomb.1. end //-----------------------------------------------------//sub function for event location function value=gcoul(t. mu=4.2.0. // script for plotting Coulomb example //----------------------------------------------------clear A = 2.0. value = F . // Dry Friction force D=0.5.mu=4.MODE) M=1. end Now we write a script for finding the events and accumulate the output. 4]. .sci").20000. mu=4. x0 = [3. D=0. 0]. tfinal = 10. // 139 .0.x. M = 1. // viscous force // if MODE==0 xdot = [0. D=0. // limiting hmax %ODEOPTIONS=[1. t0=0.1.12. else xdot=[x(2).-1. tstart =t0. We have the same remark that we have to push a little bit the solution from the “event point ”.

xeout = []. [xsole.x0.rd] = ode(’root’. xsol=ode(x0. xout = x0.T(2:$)].//------------------------------------------------------MODE=sign(x0(2)). list2=list(gcoul.xsole]. tout=[tout.1.list1. //updating value of MODE list1=list(coulomb.05.list1). tout = tstart.rd(1)]. // We need to push a little bit the integration // in order that gcoul has not a zero too near // the initial point Tplus=tstart+0. // Looking for a new zero of gcoul x0=xsolplus.MODE). break. end. //If no event occur.100). 140 .tfinal.Tplus. tstart=Tplus.list1).tstart.tfinal. teout=[teout.xsol(:. while tfinal >= tstart // Solve until the first terminal event. xout=[xout.tstart.T.MODE). xsolplus=ode(x0.2:$)]. teout = [].tstart. then terminates till final time if rd(1)==[] then T=linspace(tstart.list2). // If event occurs accumulate events xeout=[xeout.

xout’) //Mark the events with a little "o" plot2d(teout’.0].T.rd(1). if MODE==0 MODE=sign(a1). a1=-mu+A*cos(%pi*tstart)-x0(1).tstart. //start at end of the preceding integration tstart = rd(1).style=[-9.-9]) Running this script gives the times and location of events : -->[teout’. plot2d(t. F=(-mu+A*cos(%pi*tstart)-x0(1) ) * .// Accumulate output.A*cos(%pi*t).xeout’] ans = 141 . xsol=ode(x0. (mu+A*cos(%pi*tstart)-x0(1)). T=linspace(tstart. end end // plotting xset(’window’. elseif F<=0 MODE= 0 . xout = [xout..xeout’. x0 = [xsole(1).5). else MODE=sign(a1). //plot the solution plot2d(tout’. tout = [tout..100). xsol(:. T(2:$)].2:$)].1) xbasc() // plot the force t = t0 + (0:100)*((tfinal-t0)/100).list1).

5628123 2. The script gives the following plot : 5 4 2 Force 1 0 1 speed 2 0 1 2 3 4 5 6 7 8 9 10 Fig.6140528 2. 7.6179261 6. in the script.055E-16 0.5513298 8. 9.2164227 3.6849041 5. but within precision machine accuracy.9024581 2.410E-15 0.2. 33 – Solution of (30) 142 Ο Ο Ο Ο Ο Ο Ο Ο Ο Ο Ο Ο Ο Ο 3 Ο Ο Ο Ο Ο position Ο . Notice also that we set to zero. 9.683E-16 0.7281715 2.! ! ! ! ! ! ! ! ! ! 0.5325592 .175E-16 0.845E-16 7.9024581 2.7281715 2. 1.727185 4.0352271 2.558E-16 ! ! ! ! ! ! ! ! ! ! Notice that the second component x2 is zero.5042179 4.6140528 2.2035123 3.7436998 9.2164227 2.7193768 7.6281436 3. this component when we restart from these points.

argument t is useless. rect. Example 5. Note as usual. but it must be given.yr.8 In the plane Scilab has tools for ODE in the plane. f returns a column vector of size 2.1 Plotting vector fields calling sequence : fchamp(f.<opt_args>) parameters : – f : An external (function or character string) or a list which describes the ODE. – . strf DESCRIPTION : fchamp is used to draw the direction field of a 2D first order ODE defined by the external function f. – It can also be an object of type list. list(f. and it is possible to plot solution directly from a mouse click in the graphic window. 5. [u]).5.rect.yr. which gives the value of the direction field f at point x and at time t.yr : Two row vectors of size n1 and n2 which define the grid on which the direction field is computed.8.u) and u1 gives the value of the parameter u. where f is supposed to be a function of type y = f (t.2 : We consider the equation of the pendulum (5) and we plot the corresponding vector field // plotting the vector field of the pendulum // // get Equation ODE pendulum 143 .u1) where f is a function of type y=f(t. key2..y.xr. x.It can be a function name f...[arfact.x.t. – t : The selected time. – xr. key2=value2. can be one of the following : arfact... y. where key1..xr. that if the ODE is autonomous.strf]) fchamp(x. We can plot vector fields in the plane. – <opt args> : This represents a sequence of statements key1=value1.

28 5.03 6.n).1.03 3.delta_x.y.n).89 1. Then we integrates from this I. We get I.77 2.26 2. x=linspace(-delta_x. 144 ."031") xselect() Running this script give you the picture 4. Pin¸on ’s book [21]. delta_x = 2*%pi.51 3.00 1.51 1.77 5.delta_y].2 Using the mouse The following section is taken from B.delta_x.sci"). We use xclick.26 0. [-delta_x..66 1.0.55 2. delta_y = 2.44 3.00 0.66 3.x.-delta_y. y=linspace(-delta_y..44 6...V. 34 – Vectors field pendulum 5. c We can plot solution of ODE directly from the mouse.delta_y.89 0.78 0. xbasc() fchamp(pendulum. // n = 30.55 4. (for autonomous systems) with the mouse.8.V.getf("/Users/sallet/Documents/Scilab/pendulum.78 2.28 Fig.

window number.3 : We consider the Brusselator equation (ODE). – c w : integer. If it is called with 3 left hand side arguments. position of the mouse.c_x. P_stat = [2 . We pass a parameter function [f] = Brusselator(t.c y : the coordinates of the position of the mouse click in the current graphics scale. we plot the vector fields and solve with some clicks The ODE in the plane. This is the Brusselator without diffusion.sci").c y : real scalars. If it is called with 4 left hand side arguments. – flag : integer. it waits for a mouse click in any graphics window. Example 5.c_w. mouse button number. eps = -4. it waits for a mouse click in the current graphics window.The primitive xclick is governed by Calling sequence : [c_i. description : xclick waits for a mouse click in the graphics window. c x.eps) // f =[ . middle and right) or -1 in case of problems with xclick. 1 or 2 (for left. c i : an integer which gives the number of the mouse button that was pressed 0.x(1)^2*x(2)] We now have the following script from [21] : // Plotting Brusselator . 145 . (5+eps)*x(1) .getf("/Users/sallet/Documents/Scilab/Brusselator. c w : the window number where the click has occurred.(6+eps)*x(1) + x(1)^2*x(2). the click event queue is not cleared when entering xclick.c_y. – c x.x.]=xclick([flag]) parameters : – c i : integer. (5+eps)/2]. If present.

.08.color(num+1). x_max = P_stat(1) + delta_x.[x_min. // 1/ vector field plotting xbasc() fchamp(list(Brusselator.1.:)’. T = 5 . x_min = P_stat(1) .u0. c_y.y_max]. y = linspace(y_min. n).y. atol = 1.. y_max. // solver tolerance t = linspace(0.delta_y. rtol = 1.0.m).d-09.length(color)). n). delta_y = 4.d-10. y_min. num = -1. -9. t.P_stat(2)+0..c_y].c_y]=xclick().eps)). "000") u0 = [c_x. [u] = ode(’stiff’. n = 20.x."000") elseif c_i == 2 then break end end Try it 146 . list(Brusselator."031") xfrect(P_stat(1)-0.0.delta_x.T. 18 21 12 30 27]. rtol. x_max. if c_i == 0 then plot2d(c_x. while %t [c_i.0.c_x. x = linspace(x_min. num = modulo(num+1.16. y_min = P_stat(2) .x_max..u(2. y_max = P_stat(2) + delta_y. plot2d(u(1. 0.08.. atol.eps).// plotting limits delta_x = 6.16) // critical point xselect() // 2/solving ODE m = 500 . color = [21 2 3 4 5 6 19 28 32 9 13 22 .:)’.

Some collections of test problem exist.74 4. 6 0.4.8 2.2 4.35 0.9.0 Ο Ο Ο Ο Ο Fig.20 2.35 1. 35 – Brusselator trajectories from the mouse 5.1 −x x = 4t4 −1 x ˙ t x(2) = 15 2 To integrate on [0. 4 0.9 test functions for ODE To understand a solver’s performances experimentation is a good way.89 3.89 3.50 0. 30]. 5. 50] The solution is the polynomial x(t) = 1 + t + t2 + t3 −x x = 4t4 −1 x ˙ t x(2) = 15 2 147 Ο Ο Ο Ο Ο Ο Ο Ο Ο Ο Ο .04 2.20 1.8 8. We give here some test problems.4 5. 10].74 3.04 2. The reader is invited to test his ability with Scilab on these problems. 0 2. [0. 8 1.6 6.0 3. 0.

0. 10]. . 148 . 0.9. 50] 5. [0. 2π]. 30]. 0)T Where A is a N × N tridiagonal matrix.9.To integrate on [0. 30]. . 10]. 6π]. .9. 0.5 x = Ax ˙ x(0) = (1. 16π] 5.5x3 ˙ x(0) = 1 To integrate on [0. .2 x=x ˙ x(0) = 1 To integrate on [0. x2 (0) = 1 1 t+1 To integrate on [0.4   x1 = x2  ˙  x2 = −x1 ˙    x1 (0) = 0 .3 x = −0. [0. 20] The solution is x(t) = √ 5.9. [0. 50] The solution is the polynomial x(t) = 1 + t + t2 + t3 5. 0.

1) + diag(ones(N − 1. 0.51) With period K=1.A = −2 ∗ eye(N. 28K The solution is x1 (t) = sn(t.t))Q−1 . 50] for N = 50 The solution is x(t) = Q diag(exp(λi . 1). 30]. x2 (0) = 1 ˙ 149 . x2 (0) = 1 x3 = 1 2 sin N +1 ijπ N +1 iπ 2(N + 1) To integrate on [0.9. 0. 10]. 1) To integrate on [0. [0.x0 Where the eigenvalues λi = −4 sin2 and R is an orthogonal matrix Rij = 5.862640080233273855203 5. 6K].6   x1 = x2 x3  ˙  x2 = −x1 x3 ˙  x3 = −0. N) + diag(ones(N − 1.51x1 x2  ˙  x1 (0) = 0 .7  1  x1 = − x3 r  ¨  x2  x2 = − 3  ¨ r    r = sqrtx2 + x2 2 1    x1 (0) = 1 . 0. 1).51) x3 (t) = dn(t.51) x2 (t) = cn(t. 4K].9. [0. 0. x1 (0) = 0 ˙    x2 (0) = 0 . 0.

x2 (0) = −1. There is no analytic expression available for the solution. [0. 16π] The solution is  r = sqrtx2 + x2 1 2    x1 (0) = 0.9 Arenstorf orbit There are the equations of motion of a restricted three body problem.0493575098303199 ˙ . 0. The error is measured at the end of the period T 150  −µ∗ 2 2  x1 = 2x2 + x1 − µ∗ x1r+µ − µ x1r3 x2 = −2x1 + x2 − µ∗ x3 − µ x3 ˙ ¨ ˙ 3  ¨ r1 r2  1 2       r = (x + µ)2 + x2  1 1  2   r = (x − µ∗ )2 + x2  2 1 2 x1 (0) = 0. µ∗ = 1 − µ          x1 (0) = 0 .6 x2 t) = 0.4 . [0.9. x1 (0) = 0 ˙   µ = 1/82. 16π] The solution is x1 (t) = cos(t) x2 t) = sin(x) 5.9.8  1  x1 = − x3 r  ¨  x2  x2 = − 3  ¨ r   To integrate on [0. x2 (0) = 2 ˙ x1 (t) = cos(u) − 0. x1 (0) = 0 ˙    x2 (0) = 0 .8 sin(u) where u is the solution of u = −0.45. 2π].To integrate on [0. 6π].6 sin u 5. 2π]. x1 (0) = 0 ˙    x2 (0) = 0 .4 . 0. 6π].

x2 (0) = −2. µ∗ = 1 − µ          x1 (0) = 0.19216933131963970 5. To integrate on [0. x1 (0) = 0 ˙    x2 (0) = 0 .06521656015796255 5.11 The knee problem  ˙  εx = (1 − t)x − x2  x(0) = 1 [Dahlquist] : The parameter ε satisfies 0 < ε 1 This is a stiff problem. ε = 10−6.45. ε = 10−4 . 8π] 151 .T = 6.994 . 2] 5.4 .12 Problem not smooth  ¨  x = −x − sign(x) − 3 sin(2t)  x(0) = 0 x(0) = 3 ˙ [Dahlquist] : To integrate on [0.00158510637908252 ˙ T = 17.10 Arenstorf orbit Period T  −µ∗ 2 2  x1 = 2x2 + x1 − µ∗ x1r+µ − µ x1r3 x2 = −2x1 + x2 − µ∗ x3 − µ x3 ˙ ¨ ˙ 3  ¨ r1 r2  1 2       r = (x + µ)2 + x2  1 1  2    r2 = (x1 − µ∗ )2 + x2 2 x1 (0) = 0. x1 (0) = 0 ˙   µ = 1/82.9.9.9.

152 .375 10−6x1 − x2 )  ˙  1 x2 = 77.27 (x3 − (1 + x1 )x2 ˙  x3 = 0. Comput. 1989. Commun. 1971.. 1993. Butcher. ed. ACM. An efficient Runge-Kutta (4. 10(5) :1038–1051. Nørsett. Appl.C. 2. Appl. Stat. In Frontiers in numerical analysis. Math. Byrne and Alan C. Bogacki and L. Brown. Shampine. Algorithm 407 : Difsub for solution of ordinary differential equations [d2].9. [3] Peter N. Malcolm. Michael A. [0. Bogacki and L. I : Nonstiff problems. [7] George E. 2(4) :321–325.. and Cleve B. 60] This is a stiff problem with a limit cycle. Comput. Numerical methods for ordinary differential equations. Shampine. [6] Gear C. Solving ordinary differential equations.13 To integrate on [0. J. [9] Ernst Hairer and Martin Hairer.W. Prentice Hall. [4] J.F. 32(6) :15–28. Computer methods for mathematical computations. [10] Ernst Hairer. Byrne. 14(3) :185–190. and Gerhard Wanner. Math. VODE : A variable-coefficient ODE solver. and Alan C. 1996. George D. Moler.5. Wiley. Comput. Phys.F. 1977. [5] George D.. 5) pair. 2002. Lett.161(x1 − x2 )  ˙  x1 (0) = 1 x2 (0) = 2 x3 (0) = 3 R´f´rences ee [1] P. W. A 3(2) pair of Runge-Kutta formulas.. Forsythe. [2] P. 30]. Sci. Hindmarsh. [8] C. 1987. Hindmarsh. Gear. 1971. The Oregonator   x1 = 77. Prentice Hall. Numerical initial value problems in ODE. SIAM J. GniCodes –Matlab programs for geometric numerical integration. Stiff ODE solvers : A review of current and coming attractions. Springer-Verlag. 1989. Syvert P. 2003. 70 :1–62. rev.27(x2 + x1 (18 .

A user’s view of solving stiff ordinary differential equations.[11] Ernst Hairer and Gerhard Wanner. 2nd rev. Shampine. Chapman and Hall. Hindmarsch and G. Gladwell. Sedgwick.W. 1973. editor. [14] A. II : Stiff and differential-algebraic problems. I. Prentice Hall. Odepack. Scientific Computing. Numerical solution of ordinary differential equations. 2003. and Stephen Nash. [21] B. Numer. Assoc. Krogh. Gear. Byrne. Academic Press. 20 :545–562.D. 153 . Hirsch and Stephen Smale. 1994. pages 55–64. SpringerVerlag.. B. [15] A.D. dynamical systems.M. SIAM Rev. Mach. [13] A. 1974. Une introduction ` Scilab.. Comparing numerical methods for ordinary differential equations. Byrne. Solving ODEs with MATLAB. W. 1980. [12] Jack K.F. 1989. Fellen. 15 :10–11. Byrne. Assoc. [18] David Kahaner. Enright. A test for instability in the numerical solution of ordinary differential equations. SIAM J. In Computational techniques for ordinary differential equations. [23] L. Hull.C. [25] L. Anal. 2000. Lsode and lsodi. With disc. 2nd ed. Thompson.D. 1967. Comput. [19] Fred T. and A. 1980.E.C. Ordinary differential equations. Numerical methods and software. Krieger Publishing Company.T. Cleve Moler. 1976. pages 1–17. Comput. On testing a subroutine for the numerical integration of ordinary differential equations.. [17] T. 1979. Shampine and C. S.F.. Hindmarsch and G. J. Krogh. In Numerical methods for differential systems. R.. IECN. two new initial value ordinary differential equation. Pin¸on . R. J. Shampine. Mach. and S.E. [24] L. Amsterdam. 1976.. Cambridge University Press. Differential equations. Episode. 1980. 9 :603–637. What everyone solving differential equations numerically should know. 14 :351–354.E.H. and linear algebra. Scientific Computing. Stepleman et al. 1980.F. North-Holland Publ. ACM SIGNUM.C. [16] Morris W. c a [22] Lawrence F. In Eds. Shampine. a systematized collection of ode solvers. 1972. [20] F. Hindmarsch and G. Solving ordinary differential equations. Hale. ed. 21 :1–17.

Shampine and M. [29] Andrew H. [27] L. GEARS : A package for the solution of sparse.A. and S..F. Shampine and S. Computer solution of ordinary differential equations.[26] L. 1975. Hindmarsh. In Initial value problems for ODE. 1976. 39(5-6) :43–54.F. The initial value problem. Shampine. Math.the state of the art. Gordon.M. W. 154 . [28] L. Freeman. Sherman and Alan C. Appl. Thompson.H..K. Comput.F. 1980. Solving nonstiff ordinary differential equations . 18 :376–411. stiff ordinary differential equations. Davenport. H. pages 190–200. Event location for ordinary differential equations. Watts. 2000. SIAM Rev.

Sign up to vote on this title
UsefulNot useful