which decays to zero for growing n only when h is so small that |1 + hλ| < 1. This imposes a severe step size restriction when λ is a negative number of large absolute value (just think of λ = −10^10). For larger step sizes the numerical solution suffers an instability that is manifested in wild oscillations of increasing amplitude. The problem is that the explicit Euler method and the differential equation have completely different stability behaviors unless the step size is chosen extremely small (see Figure 1).

Such behavior is not restricted to the simple scalar example considered above, but extends to linear systems of differential equations in which the matrix has eigenvalues with large negative real part.

Differential equations for which the numerical solution using the implicit Euler method is more efficient than that using the explicit Euler method are called stiff differential equations. They include important applications in the description of processes with multiple time scales (e.g., fast and slow chemical reactions) and in spatial semi-discretizations of time-dependent partial differential equations. For example, for the heat equation, stable numerical solutions are obtained with the explicit Euler method only when temporal step sizes are bounded by the square of the spatial grid size, whereas the implicit Euler method is unconditionally stable.
where ỹ(t) is the solution of the modified differential equation ỹ′ = f̃(ỹ) with initial value y_0. The remarkable feature is that when the symplectic Euler method is applied to a Hamiltonian system, then the modified differential equation is again Hamiltonian. The modified Hamilton function has an asymptotic expansion

H̃ = H + h H_2 + h^2 H_3 + · · · .

The symplectic Euler method therefore conserves the modified energy H̃ (up to arbitrarily high powers of h), which is close to the original energy H. This conservation of the modified energy prevents the linearly growing drift in the energy that is present along numerical solutions of the explicit and implicit Euler methods. For these two methods the modified differential equation is no longer Hamiltonian.

3 Nonstiff problems

3.1 Higher-order methods

A method is said to have order p if the local error (recall that this is the error after one step of the method starting from the exact solution) is bounded by C h^{p+1}, where h is the step size and C depends only on bounds of derivatives of the solution y(t) and of the function f. As for the Euler method in Section 2.1, the order is determined by comparing the Taylor expansions of the exact and the numerical solution, which for a method of order p should agree up to and including the h^p term.

A drawback of the Euler methods is that they are only of order 1. There are different ways to increase the order: using additional, auxiliary function evaluations in passing from y_n to y_{n+1} (one-step methods); using previously computed solution values y_{n−1}, y_{n−2}, . . . and/or their function values (multistep methods); or using both (general linear methods). For nonstiff initial value problems the most widely used methods are explicit Runge–Kutta methods of orders up to 8, in the class of one-step methods, and Adams-type multistep methods up to order 12. For very stringent accuracy requirements of 10 or 100 digits, high-order extrapolation methods or high-order Taylor series expansions of the solution (when higher derivatives of f are available with automatic differentiation software) are sometimes used.

3.2 Explicit Runge–Kutta methods

Two ideas underlie Runge–Kutta methods: first, the integral in

y(t_0 + h) = y(t_0) + h ∫_0^1 y′(t_0 + θh) dθ,

with y′(t) = f(t, y(t)), is approximated by a quadrature formula with weights b_i and nodes c_i,

y_1 = y_0 + h Σ_{i=1}^s b_i Y_i′,   Y_i′ = f(t_0 + c_i h, Y_i).

Second, the internal stage values Y_i ≈ y(t_0 + c_i h) are determined by another quadrature formula for the integral from 0 to c_i:

Y_i = y_0 + h Σ_{j=1}^s a_{ij} Y_j′,   i = 1, . . . , s,

with the same function values Y_j′ as for y_1. If the coefficients satisfy a_{ij} = 0 for j ≥ i, then the above sum actually extends only from j = 1 to i − 1, and hence Y_1, Y_1′, Y_2, Y_2′, . . . , Y_s, Y_s′ can be computed explicitly one after the other. The methods are named after Carl Runge, who in 1895 proposed two- and three-stage methods of this type, and after Wilhelm Kutta, who in 1901 proposed what is now known as the classical Runge–Kutta method of order 4, which extends the Simpson quadrature rule from integrals to differential equations:

     0   |
    1/2  | 1/2
    1/2  |  0   1/2
     1   |  0    0    1
    -----+--------------------
         | 1/6  2/6  2/6  1/6

(nodes c_i in the left column, coefficients a_{ij} in the body, weights b_j in the bottom row)

Using Lady Windermere's fan as in Section 2, one finds that the global error y_n − y(t_n) of a pth-order Runge–Kutta method over a bounded time interval is O(h^p).
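The classical method of the tableau above can be written out stage by stage. The following minimal sketch (ours; the test problem is illustrative) also checks the O(h^4) behavior of the global error:

```python
# The classical fourth-order Runge-Kutta method, stage by stage.
import math

def rk4_step(f, t, y, h):
    k1 = f(t, y)                         # stage at c1 = 0
    k2 = f(t + h / 2, y + h / 2 * k1)    # stage at c2 = 1/2
    k3 = f(t + h / 2, y + h / 2 * k2)    # stage at c3 = 1/2
    k4 = f(t + h, y + h * k3)            # stage at c4 = 1
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6   # weights 1/6, 2/6, 2/6, 1/6

def integrate(f, t0, y0, h, n):
    t, y = t0, y0
    for _ in range(n):
        y = rk4_step(f, t, y, h)
        t += h
    return y

# y' = y, y(0) = 1 on [0, 1]; the exact solution at t = 1 is e.
err1 = abs(integrate(lambda t, y: y, 0.0, 1.0, 0.1, 10) - math.e)
err2 = abs(integrate(lambda t, y: y, 0.0, 1.0, 0.05, 20) - math.e)
```

Halving the step size reduces the error by roughly a factor of 2^4 = 16, in line with the O(h^p) global error estimate for p = 4.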
The order conditions of general Runge–Kutta methods were elegantly derived by John Butcher in 1964, using a tree model for the derivatives of f and their concatenations by the chain rule, as they appear in the Taylor expansions of the exact and the numerical solutions. This enabled the construction of even higher-order methods, among which excellently constructed methods of orders 5 and 8 by Dormand and Prince (from 1980) have found widespread use. These methods are equipped with lower-order error indicators from embedded formulas that use the same function evaluations. These error indicators are used for an adaptive selection of the step size that is intended to keep the local error close to a given error tolerance in each time step.

3.3 Extrapolation methods

The global error of the explicit Euler method has an asymptotic expansion in powers of the step size,

y(t, h) = y(t) + e_1(t) h + · · · + e_p(t) h^p + O(h^{p+1}),

where y(t, h) is the explicit Euler approximation at t obtained with step size h. At t = t_0 + H, the error expansion coefficients up to order p can be eliminated by evaluating at h = 0 (extrapolating) the interpolation polynomial through the Euler values y(t_0 + H, h_j) for j = 1, . . . , p corresponding to different step sizes h_j = H/j. This gives a method of order p, which formally falls into the general class of Runge–Kutta methods. Instead of using the explicit Euler method as the basic method, it is preferable to take Gragg's method (from 1964), which uses the explicit midpoint rule

y_{n+1} = y_{n−1} + 2h f(t_n, y_n),   n ≥ 1,

and an explicit Euler starting step to compute y_1. This method has an error expansion in powers of h^2 (instead of h above) at even n, and with the elimination of each error coefficient one therefore gains a power of h^2. Extrapolation methods have built-in error indicators that can be used for order and step-size control.

3.4 Adams methods

The methods introduced by Astronomer Royal John Couch Adams in 1855 were the first of high order that use only function evaluations, and in their current variable-order, variable step-size implementations they are among the most efficient methods for general nonstiff initial value problems.

When k function values f_{n−j} = f(t_{n−j}, y_{n−j}) (j = 0, . . . , k − 1) have already been computed, the integrand in

y(t_{n+1}) = y(t_n) + h ∫_0^1 f(t_n + θh, y(t_n + θh)) dθ

is replaced by the interpolation polynomial through the points (t_{n−j}, f_{n−j}). Integrating this polynomial yields the explicit Adams method of order k,

y_{n+1} = y_n + h f_n + h Σ_{i=1}^{k−1} γ_i ∇^i f_n,

with the backward differences ∇f_n = f_n − f_{n−1}, ∇^i f_n = ∇^{i−1} f_n − ∇^{i−1} f_{n−1}, and with coefficients (γ_i) = (1/2, 5/12, 3/8, 251/720, . . .). The method thus corrects the explicit Euler method by adding differences of previous function values.

Especially for higher orders, the accuracy of the approximation suffers from the fact that the interpolation polynomial is used outside the interval of interpolation. This is avoided if the, as yet, unknown value (t_{n+1}, f_{n+1}) is added to the interpolation points. Let P*(t) denote the corresponding interpolation polynomial, which is now used to replace the integrand f(t, y(t)). This yields the implicit Adams method of order k + 1, which takes the form

y_{n+1} = y_n + h f_{n+1} + h Σ_{i=1}^k γ_i* ∇^i f_{n+1},

with (γ_i*) = (−1/2, −1/12, −1/24, −19/720, . . .).
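The backward-difference form of the explicit Adams method is easy to exercise for k = 2, where it reads y_{n+1} = y_n + h f_n + (h/2)(f_n − f_{n−1}). A minimal sketch (ours; the choice of an explicit Euler starting step for y_1 is illustrative):

```python
# Two-step explicit Adams method: explicit Euler corrected by the first
# backward difference with coefficient gamma_1 = 1/2.
import math

def adams2(f, t0, y0, h, n):
    f_prev = f(t0, y0)
    y = y0 + h * f_prev            # explicit Euler starting step for y_1
    t = t0 + h
    for _ in range(n - 1):
        f_cur = f(t, y)
        y = y + h * f_cur + (h / 2) * (f_cur - f_prev)
        f_prev = f_cur
        t += h
    return y

# y' = y, y(0) = 1 on [0, 1]; exact value e. Halving h should divide the
# error by about 2^2 = 4, in line with order 2.
err1 = abs(adams2(lambda t, y: y, 0.0, 1.0, 0.01, 100) - math.e)
err2 = abs(adams2(lambda t, y: y, 0.0, 1.0, 0.005, 200) - math.e)
```

Note that the starting step contributes an O(h^2) error of its own, so the overall order 2 is not degraded.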
The equation for y_{n+1} is solved approximately by one or at most two fixed-point iterations, taking the result from the explicit Adams method as the starting iterate (the predictor) and inserting its function value on the right-hand side of the implicit Adams method (the corrector).

In a variable-order, variable step-size implementation, the required starting values are built up by starting with the methods of increasing order 1, 2, 3, . . . one after the other. Strategies for increasing or decreasing the order are based on monitoring the backward difference terms. Changing the step size is computationally more expensive, since it requires a recalculation of all method coefficients. It is facilitated by passing information from one step to the next by the Nordsieck vector, which collects the values of the interpolation polynomial and all its derivatives at t_n, scaled by powers of the step size.

3.5 Linear multistep methods

Both explicit and implicit Adams methods (with constant step size) belong to the class of linear multistep methods

Σ_{j=0}^k α_j y_{n+j} = h Σ_{j=0}^k β_j f_{n+j}.

A fundamental requirement on such a method, identified by Germund Dahlquist in 1956, is stability. It can be phrased as saying that all solutions to the linear difference equation Σ_{j=0}^k α_j y_{n+j} = 0 stay bounded as n → ∞, or equivalently:

All roots of the polynomial Σ_{j=0}^k α_j ζ^j lie in the complex unit disk, and those on the unit circle are simple.

If this stability condition is satisfied and the method is of order p, then the error satisfies y_n − y(t_n) = O(h^p) on bounded time intervals, provided that the error in the starting values is O(h^p).

Dahlquist also proved order barriers: the order of a stable k-step method cannot exceed k + 2 if k is even, k + 1 if k is odd, and k if the method is explicit (β_k = 0).

3.6 General linear methods

Predictor-corrector Adams methods fall neither into the class of linear multistep methods, since they use the predictor as an internal stage, nor into the class of Runge–Kutta methods, since they use previous function values. Linear multistep methods and Runge–Kutta methods are extreme cases of a more general class of methods, the general linear methods.
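Dahlquist's root condition can be checked mechanically. A minimal sketch (ours) for 2-step methods, using the quadratic formula on ρ(ζ) = α_2 ζ^2 + α_1 ζ + α_0; the unstable example is the classical two-step method of order 3 with ρ(ζ) = ζ^2 + 4ζ − 5, which the root condition rules out:

```python
# Checking the root condition for 2-step linear multistep methods.
import math

def rho_roots(a2, a1, a0):
    disc = a1 * a1 - 4 * a2 * a0
    if disc >= 0:
        r = math.sqrt(disc)
        return [(-a1 + r) / (2 * a2), (-a1 - r) / (2 * a2)]
    # complex conjugate pair: return the common modulus twice
    m = math.hypot(-a1 / (2 * a2), math.sqrt(-disc) / (2 * a2))
    return [m, m]

def satisfies_root_condition(roots, tol=1e-12):
    # all roots in the closed unit disk (simplicity of roots on the unit
    # circle is not checked in this sketch)
    return all(abs(z) <= 1 + tol for z in roots)

# 2-step Adams methods: rho(zeta) = zeta^2 - zeta, roots 1 and 0 -> stable
adams = rho_roots(1.0, -1.0, 0.0)
# rho(zeta) = zeta^2 + 4*zeta - 5, roots 1 and -5 -> unstable despite order 3
unstable = rho_roots(1.0, 4.0, -5.0)
```

The root at −5 makes solutions of the homogeneous difference equation grow like 5^n, so the method is useless in practice despite its high order.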
An explicit Runge–Kutta method applied to the linear test equation y′ = λy yields y_{n+1} = P(hλ) y_n, where P is a polynomial of degree s, the number of stages. The stability region of such a method is necessarily bounded, since |P(z)| → ∞ as |z| → ∞.

On the other hand, an implicit Runge–Kutta method has

y_{n+1} = R(hλ) y_n

with a rational function R(z), called the stability function of the method, which is an approximation to the exponential at the origin, R(z) = e^z + O(z^{p+1}) as z → 0. The method is A-stable if |R(z)| ≤ 1 for Re z ≤ 0. The subtle interplay between order and stability is clarified by the theory of order stars, developed by Wanner, Hairer and Nørsett in 1978. In particular, this theory shows that among the Padé approximations R_{k,j}(z) to the exponential (the rational approximations of numerator degree k and denominator degree j of highest possible order p = j + k), precisely those with k ≤ j ≤ k + 2 are A-stable. Optimal-order implicit Runge–Kutta methods having the diagonal Padé approximations R_{s,s} as stability function are the collocation methods based on the Gauss quadrature nodes, while those having the sub-diagonal Padé approximations R_{s−1,s} are the collocation methods based on the right-hand Radau quadrature nodes. We turn to these important implicit Runge–Kutta methods next.

4.4 Gauss and Radau methods

A collocation method based on the nodes 0 ≤ c_1 < · · · < c_s ≤ 1 determines a polynomial u(t) of degree at most s such that u(t_0) = y_0 and the differential equation is satisfied at the s points t_0 + c_i h:

u′(t_0 + c_i h) = f(t_0 + c_i h, u(t_0 + c_i h)),   i = 1, . . . , s.

The numerical solution after one step is y_1 = u(t_0 + h). Such a collocation method is an implicit Runge–Kutta method whose order equals that of the quadrature formula with nodes c_i; the maximal order p = 2s is thus obtained with Gauss nodes. Nevertheless, Gauss methods have found little use in stiff initial value problems (as opposed to boundary value problems, see Section 6). The reason for this is that the stability function here satisfies |R(z)| → 1 as z → −∞, whereas e^z → 0 as z → −∞.

The desired damping property at infinity is obtained for the sub-diagonal Padé approximation. This is the stability function for the collocation method at Radau points, which are the nodes of the quadrature of order p = 2s − 1 with c_s = 1. Let us collect the basic properties: the s-stage Radau method is an implicit Runge–Kutta method of order p = 2s − 1; it is A-stable and has R(∞) = 0.

The Radau methods have some more remarkable features: they are nonlinearly stable with the so-called algebraic stability property; their last internal stage equals the starting value for the next step (this property is useful for very stiff and for differential-algebraic equations); and their internal stages all have order s.

The last property does indeed hold for every collocation method with s nodes. It is important because of the phenomenon of order reduction: in the application of an implicit Runge–Kutta method to stiff problems, the method may have only the stage order, or stage order + 1, with stiffness-independent error constants, instead of the full classical order p that is obtained with nonstiff problems.

The implementation of Radau methods by Hairer and Wanner (from 1991) is known for its robustness in dealing with stiff problems and differential-algebraic equations of the type M y′ = f(t, y) with a singular matrix M.
Certain classes of differential equations have geometric properties that a numerical method should preserve over long times. The most prominent of these are Hamiltonian systems, which are all-important in physics. Their flow has the geometric property of being symplectic.

In respecting the phase space geometry under discretization and analyzing its effect on the long-time behavior of a numerical method, there is a shift of viewpoint: from concentrating on the approximation of a single solution trajectory to considering the numerical method as a discrete dynamical system that approximates the flow of the differential equation.

While many differential equations have interesting structures to preserve under discretization, and much work has been done in devising and analyzing appropriate numerical methods for doing so, we will restrict our attention to Hamiltonian systems here. Their numerical treatment has been an active research area for the past two decades.

5.1 Symplectic methods

The time-t flow of a differential equation y′ = f(y) is the map ϕ_t that associates with an initial value y_0 at time 0 the solution value at time t: ϕ_t(y_0) = y(t). Consider a Hamiltonian system

p′ = −∇_q H(p, q),   q′ = ∇_p H(p, q),

or equivalently, for y = (p, q),

y′ = J^{−1} ∇H(y)   with   J = (  0   I
                                 −I   0 ).

The flow ϕ_t of a Hamiltonian system is symplectic; that is, the derivative Dϕ_t with respect to the initial value satisfies

Dϕ_t(y)^T J Dϕ_t(y) = J

for all y and t for which ϕ_t(y) exists. This is a quadratic relation formally similar to orthogonality, with J in place of the identity matrix I, but it is related to the preservation of areas rather than lengths in phase space.

A numerical one-step method y_{n+1} = Φ_h(y_n) is called symplectic if the numerical flow Φ_h is a symplectic map:

DΦ_h(y)^T J DΦ_h(y) = J.

Such methods exist: the "symplectic Euler method" of Section 1.3 is indeed symplectic. Great interest in symplectic methods was spurred when, in 1988, Lasagni, Sanz-Serna and Suris independently characterized symplectic Runge–Kutta methods as those whose coefficients satisfy the condition

b_i a_{ij} + b_j a_{ji} − b_i b_j = 0   for all i, j.

Gauss methods (see Section 4.4) were already known to satisfy this condition, and were thus found to be symplectic. Soon after these discoveries it was realized that a numerical method is symplectic if and only if the modified differential equation of backward analysis (Section 2.5) is again Hamiltonian. This made it possible to prove rigorously the almost-conservation of energy over times that are exponentially long in the inverse step size, as well as further favorable long-time properties such as the almost-preservation of KAM (Kolmogorov–Arnold–Moser) tori of perturbed integrable systems over exponentially long times.

5.2 The Störmer–Verlet method

By the time symplecticity was entering the field of numerical analysis, scientists in molecular simulation had been doing symplectic computations for more than twenty years without knowing it: the standard integrator of molecular dynamics, the method used successfully ever since Luc Verlet introduced it to the field in 1967, is symplectic. For a Hamiltonian H(p, q) = (1/2) p^T M^{−1} p + V(q) with a symmetric positive definite mass matrix M, the method is explicit and given by the formulas

p_{n+1/2} = p_n − (h/2) ∇V(q_n)
q_{n+1} = q_n + h M^{−1} p_{n+1/2}
p_{n+1} = p_{n+1/2} − (h/2) ∇V(q_{n+1}).

Such a method was also formulated by the astronomer Störmer in 1907, and it can in fact be traced back to Newton's Principia of 1687, where it was used as a theoretical tool in the proof of the preservation of angular momentum in the two-body problem (Kepler's second law), which is indeed preserved by this method.
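The algebraic symplecticity condition of Section 5.1 is easy to test for given Runge–Kutta coefficients. A minimal sketch (ours), applied to three tableaux: the one-stage Gauss method (the implicit midpoint rule) satisfies it, while the explicit Euler method and the classical RK4 method of Section 3.2 do not.

```python
# Checking the condition b_i*a_ij + b_j*a_ji - b_i*b_j = 0 for all i, j.

def is_symplectic(a, b, tol=1e-14):
    s = len(b)
    return all(
        abs(b[i] * a[i][j] + b[j] * a[j][i] - b[i] * b[j]) <= tol
        for i in range(s)
        for j in range(s)
    )

midpoint = is_symplectic([[0.5]], [1.0])   # 1*(1/2) + 1*(1/2) - 1*1 = 0
euler = is_symplectic([[0.0]], [1.0])      # 0 + 0 - 1*1 = -1, not symplectic
a_rk4 = [[0, 0, 0, 0],
         [0.5, 0, 0, 0],
         [0, 0.5, 0, 0],
         [0, 0, 1, 0]]
rk4 = is_symplectic(a_rk4, [1/6, 1/3, 1/3, 1/6])   # classical RK4: fails too
```

So neither of the classical explicit methods is symplectic, in agreement with the fact that symplectic Runge–Kutta methods are necessarily implicit.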
Given that there are already sufficiently many Newton methods in numerical analysis, it is fair to refer to the method as the Störmer–Verlet method (Verlet method and leapfrog method are also often-used names). As will be discussed in the next three subsections, the symplecticity of this method can be understood in various ways by relating the method to different classes of methods that have proven useful in a variety of applications.

5.3 Composition methods

Let us denote the one-step map (p_n, q_n) ↦ (p_{n+1}, q_{n+1}) of the Störmer–Verlet method by Φ^SV_h, that of the symplectic Euler method of Section 1.3 by Φ^SE_h, and that of the adjoint symplectic Euler method by Φ^SE*_h, where instead of the argument (p_{n+1}, q_n) one uses (p_n, q_{n+1}). Then the second-order Störmer–Verlet method can be interpreted as the composition of the first-order symplectic Euler methods with halved step size:

Φ^SV_h = Φ^SE*_{h/2} ∘ Φ^SE_{h/2}.

Since the composition of symplectic maps is symplectic, this shows the symplecticity of the Störmer–Verlet method. We further note that the method is time-reversible (or symmetric): Φ_{−h} ∘ Φ_h = id, or equivalently Φ_h = Φ*_h with the adjoint method Φ*_h := Φ^{−1}_{−h}. This is known to be another favorable property for conservative systems.

Moreover, from first-order methods we have obtained a second-order method. More generally, starting from a low-order method Φ_h, methods of arbitrary order can be constructed by suitable compositions

Φ_{c_s h} ∘ · · · ∘ Φ_{c_1 h}   or   Φ*_{b_s h} ∘ Φ_{a_s h} ∘ · · · ∘ Φ*_{b_1 h} ∘ Φ_{a_1 h}.

Systematic approaches to high-order compositions were first given by Suzuki and Yoshida in 1990. Whenever the base method is symplectic, then so is the composed method. The coefficients can be chosen such that the resulting method is symmetric.

5.4 Splitting methods

Splitting the Hamiltonian H(p, q) = T(p) + V(q) into its kinetic energy T(p) = (1/2) p^T M^{−1} p and potential energy V(q), we have that the flows ϕ^T_t and ϕ^V_t of the systems with Hamiltonians T and V, respectively, are obtained by solving the trivial differential equations

ϕ^T_t :  p′ = 0,  q′ = M^{−1} p;        ϕ^V_t :  p′ = −∇V(q),  q′ = 0.

We then note that the Störmer–Verlet method can be interpreted as a composition of the exact flows of the split differential equations:

Φ^SV_h = ϕ^V_{h/2} ∘ ϕ^T_h ∘ ϕ^V_{h/2}.

Since the flows ϕ^V_{h/2} and ϕ^T_h are symplectic, so is their composition.

Splitting the vector field of a differential equation and composing the flows of the subsystems is a structure-preserving approach that yields methods of arbitrary order and is useful in the time integration of a variety of ordinary and partial differential equations, such as linear and nonlinear Schrödinger equations.

5.5 Variational integrators

For the Hamiltonian H(p, q) = (1/2) p^T M^{−1} p + V(q), the Hamilton equations of motion ṗ = −∇V(q), q̇ = M^{−1} p can be combined to give the second-order differential equation

M q̈ = −∇V(q),

which can be interpreted as the Euler–Lagrange equations for minimizing the action integral

∫_{t_0}^{t_N} L(q(t), q̇(t)) dt   with   L(q, q̇) = (1/2) q̇^T M q̇ − V(q)

over all paths q(t) with fixed end-points. In the Störmer–Verlet method, eliminating the momenta yields the second-order difference equations

M (q_{n+1} − 2 q_n + q_{n−1}) = −h^2 ∇V(q_n),

which are the discrete Euler–Lagrange equations for minimizing the discretized action integral

Σ_{n=0}^{N−1} (h/2) [ L(q_n, (q_{n+1} − q_n)/h) + L(q_{n+1}, (q_{n+1} − q_n)/h) ],

which results from a trapezoidal rule approximation to the action integral and a piecewise linear approximation to q(t).
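As a concrete illustration, here is the Störmer–Verlet scheme of Section 5.2 applied to the pendulum, with H(p, q) = p^2/2 − cos q and M = I (our illustrative example): over a long time interval the energy error stays bounded instead of drifting.

```python
# Stoermer-Verlet for H(p, q) = p^2/2 - cos(q): half kick, full drift, half kick.
import math

def verlet(grad_V, p, q, h, n):
    for _ in range(n):
        p = p - (h / 2) * grad_V(q)   # half kick
        q = q + h * p                 # full drift (M = I)
        p = p - (h / 2) * grad_V(q)   # half kick
    return p, q

grad_V = math.sin                      # V(q) = -cos(q), so grad V = sin
H = lambda p, q: p * p / 2 - math.cos(q)

p0, q0 = 0.0, 1.2
p, q = verlet(grad_V, p0, q0, 0.05, 20000)   # integrate up to t = 1000
drift = abs(H(p, q) - H(p0, q0))             # stays O(h^2), no linear growth
```

This is the behavior explained by backward analysis: the method conserves a modified energy exactly up to exponentially small terms, so the true energy oscillates within an O(h^2) band.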
The Störmer–Verlet method can thus be interpreted as resulting from the direct discretization of the Hamilton variational principle. Such an interpretation can in fact be given for every symplectic method. Conversely, symplectic methods can be constructed by minimizing a discrete action integral. In particular, approximating the action integral by a quadrature formula and the positions q(t) by a piecewise polynomial leads to a symplectic partitioned Runge–Kutta method, which in general uses different Runge–Kutta formulas for positions and momenta. With Gauss quadrature one reinterprets in this way the Gauss methods of Section 4.4, and with higher-order Lobatto quadrature formulas one obtains higher-order relatives of the Störmer–Verlet method.

5.6 Oscillatory problems

Highly oscillatory solution behavior in Hamiltonian systems typically arises when the potential is a multiscale sum V = V^[slow] + V^[fast], where the Hessian of V^[fast] has positive eigenvalues that are large compared with those of V^[slow]. (Here we assume M = I for simplicity.) With standard methods such as the Störmer–Verlet method, very small time steps would be required, for reasons of both accuracy and stability. Various numerical methods have been devised with the aim of overcoming this limitation. Here we describe just one such method, a multiple time-stepping method that reduces the computational work significantly when the slow force f^[slow] = −∇V^[slow] is far more expensive to evaluate than the fast force f^[fast] = −∇V^[fast]. A basic principle is to rely on averages instead of pointwise force evaluations. In the averaged-force method, the force f_n = −∇V(q_n) in the Störmer–Verlet method is replaced with an averaged force f̄_n as follows: we freeze the slow force at q_n and consider the auxiliary differential equation

ü = f^[slow](q_n) + f^[fast](u)

with initial values u(0) = q_n, u̇(0) = 0. We then define the averaged force as

f̄_n = ∫_{−1}^{1} (1 − |θ|) ( f^[slow](q_n) + f^[fast](u(θh)) ) dθ,

which equals

f̄_n = (1/h^2) ( u(h) − 2u(0) + u(−h) ).

The value u(h) is computed approximately with smaller time steps, noting u(h) = u(−h).

The argument of f^[slow] might preferably be replaced with an averaged value q̄_n, in order to mitigate the adverse effect of possible step size resonances, which appear when the product of h with an eigenfrequency of the Hessian is close to an integral multiple of π.

If the fast potential is quadratic, V^[fast](q) = (1/2) q^T A q, the auxiliary differential equation can be solved exactly in terms of trigonometric functions of the matrix h^2 A. The resulting method can then be viewed as an exponential integrator as considered in Section 4.6.

6 Boundary value problems

In a two-point boundary value problem, the differential equation is coupled with boundary conditions of the same dimension:

y′(t) = f(t, y(t)),   a ≤ t ≤ b,
r(y(a), y(b)) = 0.

As an important class of examples, such problems arise as the Euler–Lagrange equations of variational problems, typically with separated boundary conditions r_a(y(a)) = 0, r_b(y(b)) = 0.

6.1 The sensitivity matrix

The problem of existence and uniqueness of a solution is more subtle than for initial value problems. For a linear boundary value problem

y′(t) = C(t) y(t) + g(t),   a ≤ t ≤ b,
A y(a) + B y(b) = q,

a unique solution exists if and only if the sensitivity matrix E = A + B U(b, a) is invertible, where U(t, s) is the propagation matrix yielding v(t) = U(t, s) v(s) for every solution of the linear differential equation v′(t) = C(t) v(t).

A solution of a nonlinear boundary value problem is locally unique if the linearization along this solution has an invertible sensitivity matrix.

6.2 Shooting
Just as Newton's method replaces a nonlinear system of equations with a sequence of linear systems, the shooting method replaces a boundary value problem with a sequence of initial value problems. The objective is to find an initial value x such that the solution of the differential equation with this initial value, denoted y(t; x), satisfies the boundary conditions:

r(x, y(b; x)) = 0.

This nonlinear system for x is solved iteratively, using numerical integrations of the differential equation to evaluate y(b; x).

6.3 Multiple shooting

In the multiple shooting method, the interval [a, b] is subdivided, and unknown initial values x_0, . . . , x_N at the subdivision points are determined simultaneously: each iteration requires the solution of initial value problems on the subintervals together with their linearization, and then a linear system with a large sparse matrix is solved for the increments in (x_0, . . . , x_N).
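For a linear model problem, single shooting reduces to a one-dimensional root-finding problem for the initial slope. A minimal sketch (our illustrative choice): y″ = y, y(0) = 0, y(1) = 1, with exact solution sinh(t)/sinh(1), solved by a secant iteration on the slope x and a basic RK4 integrator for the first-order system (y, v)′ = (v, y).

```python
# Single shooting: find x with y(1; x) = 1 for y'' = y, y(0) = 0, y'(0) = x.
import math

def integrate(x, h=0.01, n=100):
    y, v = 0.0, x
    f = lambda y, v: (v, y)
    for _ in range(n):
        k1 = f(y, v)
        k2 = f(y + h / 2 * k1[0], v + h / 2 * k1[1])
        k3 = f(y + h / 2 * k2[0], v + h / 2 * k2[1])
        k4 = f(y + h * k3[0], v + h * k3[1])
        y += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        v += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return y                      # the value y(1; x)

def shoot(target=1.0):
    x0, x1 = 0.0, 1.0             # two initial guesses for the slope
    g0, g1 = integrate(x0) - target, integrate(x1) - target
    for _ in range(20):
        if abs(g1) < 1e-12:
            break
        x0, g0, x1 = x1, g1, x1 - g1 * (x1 - x0) / (g1 - g0)
        g1 = integrate(x1) - target
    return x1

slope = shoot()                   # exact value: 1/sinh(1)
```

Because the problem is linear, x ↦ y(1; x) is an affine map and the secant iteration converges in a single step; for nonlinear problems one would use Newton's method with the sensitivity matrix in its place.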
Further Reading
1. Ascher, U.M., Mattheij, R.M.M. and Russell, R.D. 1995 Numerical solution of boundary value problems for ordinary differential equations. SIAM, Philadelphia.
2. Ascher, U.M. and Petzold, L.R. 1998 Computer methods for ordinary differential equations and differential-algebraic equations. SIAM, Philadelphia.
3. Butcher, J.C. 2008 Numerical methods for ordinary differential equations, second revised ed. Wiley, Chichester.
4. Crouzeix, M. and Mignot, A.L. 1989 Analyse numérique des équations différentielles, 2e éd. révisée et augmentée. Masson, Paris.
5. Deuflhard, P. and Bornemann, F. 2002 Scientific computing with ordinary differential equations. Springer, New York.
6. Gear, C.W. 1971 Numerical initial value problems in ordinary differential equations. Prentice-Hall, Englewood Cliffs, NJ.
7. Hairer, E., Nørsett, S.P. and Wanner, G. 1993 Solving ordinary differential equations. I: Nonstiff problems, 2nd revised ed. Springer, Berlin.
8. Hairer, E. and Wanner, G. 1996 Solving ordinary differential equations. II: Stiff and differential-algebraic problems, 2nd revised ed. Springer, Berlin.
9. Hairer, E., Lubich, C. and Wanner, G. 2006 Geometric numerical integration. Structure-preserving algorithms for ordinary differential equations, 2nd revised ed. Springer, Berlin.
10. Henrici, P. 1962 Discrete variable methods in ordinary differential equations. Wiley, New York.
11. Iserles, A. 2009 A first course in the numerical analysis of differential equations, second edition. Cambridge Univ. Press, Cambridge.
12. Leimkuhler, B. and Reich, S. 2004 Simulating Hamiltonian dynamics. Cambridge Univ. Press, Cambridge.