Are you sure?
This action might not be possible to undo. Are you sure you want to continue?
highdimensional systems of nonlinear equations
Proefschrift
ter verkrijging van
de graad van Doctor aan de Universiteit Leiden,
op gezag van de Rector Magniﬁcus Dr. D. D. Breimer,
hoogleraar in de faculteit der Wiskunde en
Natuurwetenschappen en die der Geneeskunde,
volgens besluit van het College voor Promoties
te verdedigen op dinsdag 9 december 2003
klokke 15.15 uur
door
Bartholomeus Andreas van de Rotten
geboren te Uithoorn
op 20 oktober 1976
Samenstelling van de promotiecommissie:
promotoren: prof. dr. S.M. Verduyn Lunel
prof. dr. ir. A. Bliek (Universiteit van Amsterdam)
referent: prof. dr. D. Estep (Colorado State University)
overige leden: prof. dr. G. van Dijk
dr. ir. H.C.J. Hoefsloot (Universiteit van Amsterdam)
prof. dr. R. van der Hout
dr. W.H. Hundsdorfer (CWI, Amsterdam)
prof. dr. L.A. Peletier
prof. dr. M.N. Spijker
A limited memory Broyden method
to solve highdimensional systems
of nonlinear equations
Bart van de Rotten
Mathematisch Instituut, Universiteit Leiden, The Netherlands
ISBN: 9090175768
Printed by PrintPartners Ipskamp
The research that led to this thesis was funded by N.W.O. (Nederlandse or
ganisatie voor Wetenschappelijk Onderzoek) grant 61661410 and supported
by the Thomas Stieltjes Institute for Mathematics:
Contents
Introduction 1
I Basics of limited memory methods 15
1 An introduction to iterative methods 17
1.1 Iterative methods in one variable . . . . . . . . . . . . . . . . . 18
1.2 The method of Newton . . . . . . . . . . . . . . . . . . . . . . . 25
1.3 The method of Broyden . . . . . . . . . . . . . . . . . . . . . . 35
2 Solving linear systems with Broyden’s method 55
2.1 Exact convergence for linear systems . . . . . . . . . . . . . . . 56
2.2 Two theorems of Gerber and Luk . . . . . . . . . . . . . . . . . 62
2.3 Linear transformations . . . . . . . . . . . . . . . . . . . . . . . 67
3 Limited memory Broyden methods 71
3.1 New representations of Broyden’s method . . . . . . . . . . . . 72
3.2 Broyden Rank Reduction method . . . . . . . . . . . . . . . . . 83
3.3 Broyden Base Reduction method . . . . . . . . . . . . . . . . . 96
3.4 The approach of Byrd . . . . . . . . . . . . . . . . . . . . . . . 100
II Features of limited memory methods 109
4 Features of Broyden’s method 111
4.1 Characteristics of the Jacobian . . . . . . . . . . . . . . . . . . 113
4.2 Solving linear systems with Broyden’s method . . . . . . . . . . 120
4.3 Introducing coupling . . . . . . . . . . . . . . . . . . . . . . . . 125
4.4 Comparison of selected limited memory Broyden methods . . . 128
i
ii Contents
5 Features of the Broyden rank reduction method 135
5.1 The reverse ﬂow reactor . . . . . . . . . . . . . . . . . . . . . . 135
5.2 Singular value distributions of the update matrices . . . . . . . 138
5.3 Computing on a ﬁner grid using same amount of memory . . . 140
5.4 Comparison of selected limited memory Broyden methods . . . 142
III Limited memory methods
applied to periodically forced processes 147
6 Periodic processes in packed bed reactors 149
6.1 The advantages of periodic processes . . . . . . . . . . . . . . . 149
6.2 The model equations of a cooled packed bed reactor . . . . . . 152
7 Numerical approach for solving
periodically forced processes 167
7.1 Discretization of the model equations . . . . . . . . . . . . . . . 168
7.2 Tests for the discretized model equations . . . . . . . . . . . . . 173
7.3 Bifurcation theory and continuation techniques . . . . . . . . . 175
8 Eﬃcient simulation of periodically forced reactors in 2D 183
8.1 The reverse ﬂow reactor . . . . . . . . . . . . . . . . . . . . . . 183
8.2 The behavior of the reverse ﬂow reactor . . . . . . . . . . . . . 185
8.3 Dynamic features of the full twodimensional model . . . . . . . 191
Notes and comments 195
Bibliography 201
A Test functions 207
B Matlab code of the limited memory Broyden methods 211
C Estimation of the model parameters 219
Samenvatting (Waarom Broyden?) 223
Nawoord 227
Curriculum Vitae 229
Introduction
Periodic chemical processes form a ﬁeld of major interest in chemical reac
tor engineering. Examples of such processes are the pressure swing adsorber
(PSA), the thermal swing adsorber (TSA), the reverse ﬂow reactor (RFR),
and the more recently developed pressure swing reactor (PSR). The state of
a chemical reactor, that contains a periodically forced process, is given by the
temperature proﬁles and concentration proﬁles of the reactants. Starting with
an initial state, the reactor generally goes through a transient phase during
many periods before converging to a periodic limiting state. This periodic
limiting state is also known as the cyclic steady state (CSS). Because the re
actor operates in this state most of the time, it is interesting to investigate
the dependence of the cyclic steady state on the operating parameters of the
reactor.
The simulation of periodically forced processes in packed bed reactors leads
to the development of partial diﬀerential equations. In order to investigate the
behavior of the system numerically, we discretize the equations in space. The
action of the process during one period can be computed by integrating the
obtained system of ordinary diﬀerential equations in time for one period. The
map that assigns to an initial state of the process, the state after one period
is called the period map. We denote the period map by f : R
n
→R
n
, which,
in general, is highly nonlinear. The dynamical process in the reactor can now
be formulated by the dynamical system
x
k+1
= f(x
k
), k = 0, 1, 2, . . . ,
where x
k
denotes the state of the reactor after k periods. Periodic states of
the reactor are ﬁxed points of the period map f and a stable cyclic steady
state can be computed by taking the limit of x
k
as k → ∞. Depending on
the convergence properties of the system at hand, the transient phase of the
process might be very long, and eﬃcient methods to ﬁnd the ﬁxed points of f
are essential.
1
2 Introduction
Fixed points of the map f correspond to zeros of g : R
n
→ R
n
where g is
given by
g(x) = f(x) −x.
So, the basic equation we want to solve is
g(x) = 0 for x ∈ R
n
. (1)
Because (1) is a system of n nonlinear equations, iterative algorithms are
needed to approximate a zero of the function g. The iterative algorithms pro
duce a sequence ¦x
k
¦ of approximations to the zero x
∗
of g. A function eval
uation can be a rather expensive task, and it is generally accepted that the
most eﬃcient iterative algorithms for solving (1) minimize the least number
of function evaluations.
In his thesis [49], Van Noorden compares several iterative algorithms for
the determination of the CSS of periodically forced processes, by solving (1).
He deduces that for this type of problem, the NewtonPicard method and the
method of Broyden are especially promising.
In this thesis, the study of periodically forced processes is extended to
more complex models. Because the dimension of the discretized system of
such models is very large, memory constraints arise and care must be taken
in the choice of iterative methods. Therefore, it is necessary to develop lim
ited memory algorithms to solve (1), that is, algorithms that use a restricted
amount of memory. Since the method of Broyden is popular in the chemical
reactor engineering, we focus on approaches aimed at reducing the memory
needed by the method of Broyden. We call the resulting algorithms limited
memory Broyden methods.
Basics of limited memory methods
The standard iterative algorithm is the method of Newton. Let x
0
∈ R
n
be an
initial guess in the neighborhood of a zero x
∗
of g. Newton’s method deﬁnes a
sequence ¦x
k
¦ in R
n
of approximations of x
∗
by
x
k+1
= x
k
−J
−1
g
(x
k
)g(x
k
), k = 0, 1, 2, . . . , (2)
where J
g
(x), is the Jacobian of g at the point x. An advantage of the method
of Newton is that the convergence is quadratic in a neighborhood of a zero,
i.e.,
x
k+1
−x
∗
 < cx
k
−x
∗

2
,
Introduction 3
for a certain constant c > 0. Since it is not always possible to determine the
Jacobian of g analytically, we often have to approximate J
g
using ﬁnite diﬀer
ences. The number of function evaluations per iteration step in the resulting
approximate Newton’s method is (n + 1).
In 1965, Broyden [8] proposed a method that uses only one function evalu
ation per iteration step instead of (n+1). The main idea of Broyden’s method
is to approximate the Jacobian of g by a matrix B
k
. Thus the scheme (2) is
replaced by
x
k+1
= x
k
−B
−1
k
g(x
k
), k = 0, 1, 2, . . . . (3)
After every iteration step, the Broyden matrix B
k
is updated using a rank
onematrix. If g(x) is an aﬃne function, so for some A ∈ R
n×n
and b ∈ R
n
,
g(x) = Ax +b, then
g(x
k+1
) −g(x
k
) = A(x
k+1
−x
k
)
holds. According to this equality, the updated Broydenmatrix B
k+1
is chosen
such that it satisﬁes the equation
y
k
= B
k+1
s
k
, (4)
with
s
k
= x
k+1
−x
k
and y
k
= g(x
k+1
) −g(x
k
).
Equation (4) is called the secant equation and algorithms for which this con
dition is satisﬁed are called secant methods. If we assume that B
k+1
and B
k
are identical on the orthogonal complement of the linear space spanned by s
k
,
the condition in (4) results in the following update scheme for the Broyden
matrix B
k
B
k+1
= B
k
+ (y
k
−B
k
s
k
)
s
T
k
s
T
k
s
k
= B
k
+
g(x
k+1
)s
T
k
s
T
k
s
k
(5)
In 1973, Broyden, Dennis and Mor´e [11] published a proof that the method
of Broyden is locally qsuperlinearly convergent, i.e.,
lim
k→∞
x
k+1
−x
∗

x
k
−x
∗

= 0.
In 1979, Gay [22] proved that for linear problems the method of Broyden is in
fact exactly convergent in 2n iterations. Moreover, he showed that this implies
locally 2nstep, quadratic convergence for nonlinear problems
x
k+2n
−x
∗
 ≤ cx
k
−x
∗

2
,
4 Introduction
with c > 0. This proof of exact convergence was simpliﬁed and sharpened in
1981 by Gerber and Luk [23]. In practice, these results imply that the method
of Broyden needs more iterations to converge than the method of Newton.
Yet, since only one function evaluation is made for every iteration step, the
method of Broyden might signiﬁcantly reduce the amount of CPUtime to
solve the problem.
In Chapter 1, we discuss the method of Newton and the method of Broyden
in more detail and in particular describe the derivation and convergence prop
erties. Subsequently, we consider the method of Broyden for linear systems
of equations in Chapter 2. We deduce that Broyden’s method uses selective
information of the system to solve it.
Both Newton’s and Broyden’s method need to store an (n n)matrix,
see (2) and (3). Therefore, for highdimensional systems, this might lead to
severe memory constraints. From the early seventies, there has been serious
attention paid to the issue of reducing the amount of storage required for
the iterative methods. Diﬀerent techniques have appeared for solving large
nonlinear problems [62].
The problem that we consider is the general nonlinear equation (1), so
nothing is known beforehand about the structure of the Jacobian of the sys
tem. In Chapter 3, we develop several limited memory methods that do not
depend on the structure of the Jacobian and are based on the method of
Broyden. In addition to a large reduction of the memory used, these limited
memory methods give more insight in the original method of Broyden, since we
investigate the question of how much and which information can be dropped
without destroying the property of superlinear convergence. In Section 3.2,
we derive our main algorithm, the Broyden Rank Reduction method (BRR).
To introduce the idea of the BRR method we ﬁrst consider an example.
Example 1. The period map f : R
n
→ R
n
to be considered is a small (take
ε = 1.0 10
−2
) quadratic perturbation to two times the identity map,
f(x) =
¸
¸
¸
¸
2x
1
−εx
2
2
.
.
.
2x
n−1
−εx
2
n
2x
n
¸
. (6)
The unique ﬁxed points of the function f, x
∗
= 0, can be found by applying
Broyden’s method to solve (1) with g(x) = f(x) −x up to a certain residual
g(x) < 1.0 10
−12
,
Introduction 5
using initial estimate vector x
0
= (1, . . . , 1). In order to obtain a good example
of memory reduction, we choose n = 100, 000. Starting with a simple initial
matrix B
0
= −I, the ﬁrst Broyden matrix is given by,
B
1
= B
0
+c
1
d
T
1
,
where c
1
= g(x
1
)/s
0
 and d
1
= s
0
/s
0
. Because B
0
does not have to be
stored, it is more economical to store the vector c
1
and d
1
instead of the
matrix B
1
itself. Applying another update, we obtain the second Broyden
matrix,
B
2
= B
1
+c
2
d
T
2
= B
0
+c
1
d
T
1
+c
2
d
T
2
, (7)
where c
2
= g(x
2
)/s
1
 and d
2
= s
1
/s
1
. Now 4 n = 400, 000 locations are
used to store the vector pairs ¦c
1
, d
1
¦ and ¦c
2
, d
2
¦. In the next iteration step,
we would need 6 n = 600, 000 storage locations to store all of the vector pairs.
We consider the update matrix in the ﬁfth iteration that consists of the ﬁrst
ﬁve rankone updates to the Broyden matrix,
Q = c
1
d
T
1
+c
2
d
T
2
+. . . +c
5
d
T
5
.
Because Q is the sum of ﬁve rankone matrices, it has rank less or equal to
ﬁve, and if we compute the singular value decomposition of Q (see Section 3.2
for details) we see that Q can be written as
Q = σ
1
u
1
v
T
1
+. . . +σ
5
u
5
v
T
5
,
where ¦u
1
, . . . , u
5
¦ and ¦v
1
, . . . , v
5
¦ are orthonormal sets of vectors and
σ
1
= 2.49, σ
2
= 1.61, σ
3
= 0.214 10
−5
,
σ
4
= 0.121 10
−12
, σ
5
= 0.00.
This suggests that we can ignore the singular value σ
5
in the singular value
decomposition of Q without changing the update matrix Q. We replace the
matrix Q by
¯
Q with
¯
Q = Q−σ
5
u
5
v
T
5
= σ
1
u
1
v
T
1
+. . . +σ
4
u
4
v
T
4
.
We deﬁne ˜ c
i
:= σ
i
u
i
and
˜
d
i
:= v
i
for i = 1, . . . 4. The diﬀerence between the
original Broyden matrix B
5
= B
0
+Q and the ’reduced’ matrix
¯
B
5
= B
0
+
¯
Q
can be estimated as
B
5
−
¯
B
5
 = B
0
+Q−B
0
−
¯
Q = σ
5
u
5
v
T
5
 = σ
5
u
5
v
5
 = σ
5
,
6 Introduction
which is equal to zero in this case. After this reduction we can store a new
pair of update vectors
c
5
:= g(x
6
)/s
5
 and d
2
:= s
2
/s
2
.
Continuing, on every iteration step we ﬁrst remove the singular value σ
5
of
Q before computing the new update. This leads to Algorithm 3.11 of Section
3.2, the Broyden Rank Reduction method, with parameter p = 5. Surprisingly
the ﬁfth singular value of the update matrix remains zero in all subsequent
iterations until the process is converged, see Figure 1. Therefore, if in every
iteration we save the four largest singular values of the update matrix and
drop the ﬁfth singular value, we do not alter the Broyden matrix. In fact, we
apply the method of Broyden itself.
0 5 10 15
0
0.5
1
1.5
2
2.5
3
PSfrag replacements
iteration k
s
i
n
g
u
l
a
r
v
a
l
u
e
s
Figure 1: The singular values of the update matrix during the BRR process with
p = 5.
The rate of convergence of this process is plotted in Figure 2, together with
the rate of convergence of the BRR method for other values of p.
If p is larger than 5, the rate of convergence does not increase. We observe
that the residual g(x
k
) is approximately 10
−14
after 14 iterations. For p = 5,
the number of required storage locations is reduced from n
2
= 10
10
for the
Broyden matrix of the original method to 2pn = 10
6
for the BRR method.
Note that p cannot be any small number, and care is needed to ﬁnd the
optimal p. Clearly, the BRR process does not converge for p = 2. In the ﬁrst
iterations, it might not be harmful to remove the second singular value of
the update matrix, but after 8 iterations the process fails to keep the fast q
superlinear convergence and starts to diverge. For p = 3 we observe the same
kind of behavior, where the diﬃculties start in the 9th iteration.
Introduction 7
0 10 20 30 40 50 60
10
−15
10
−10
10
−5
10
0
10
5
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 2: The convergence rate of the Broyden Rank Reduction method when com
puting a ﬁxed point of the function f given by (6). [’◦’(p = 1), ’×’(p = 2), ’+’(p = 3),
’∗’(p = 4), ’’(p = 5)]
The reduction applied to the update matrix Q can also be explained as
follows. For the method of Broyden the action of the Broyden matrix has to be
known is all n directions, for example, in order to compute the new Broyden
step s
k
, see (3). In this situation the BRR method is satisﬁed with the action
of the Broyden matrix in only p directions. These directions are produced by
the Broyden process itself.
Features of limited memory methods
In Part I, we discuss most of the properties of Broyden’s method and the newly
developed Broyden Rank Reduction method. The good results in practical
applications of Broyden’s method can only be explained to a limited degree.
Therefore, in Part II we apply the method of Broyden and the BRR method to
several test functions. We are especially interested in nonlinear test functions,
since it is known that for linear systems of equations, Broyden’s method is far
less eﬃcient than, for example, GMRES [61] and BiCGSTAB [70, 71].
In the neighborhood of the solution, a function often can be considered as
approximately aﬃne. Moreover, the rate of convergence of Broyden’s method
applied to the linearization of a function indicates how much iterations it
might need for the function itself. In Part I, we show that if we apply Broy
den’s method to an aﬃne function g(x) = Ax + b, the diﬀerence B
k
− A
between the Broyden matrix and the Jacobian of the function does not in
crease as k increases, if measured in the matrix norm induced by the l
2
vector
8 Introduction
norm. For nonlinear functions g the diﬀerence B
k
− J
g
(x
∗
)
F
, measured in
the Frobenius norm, may increase. However, we can choose a neighborhood
N
1
of the solution x
∗
and a neighborhood N
2
of the Jacobian J
g
(x
∗
), so that
if (x
0
, B
0
) ∈ N
1
N
2
the diﬀerence B
k
− J
g
(x
∗
)
F
never exceeds two times
the initial diﬀerence B
0
−J
g
(x
∗
)
F
.
Example 2. Let the matrix A be given by the sum
A =
¸
¸
¸
¸
¸
2 1
.
.
.
.
.
.
.
.
.
1
2
¸
+S, (8)
where the elements of S are in between zero and one. The matrix S contains
in fact the values of a grayscale picture of a cat. We consider the system
of linear equations Ax = 0 and apply the method of Broyden from initial
condition x
0
= (1, . . . , 1) and with initial Broyden matrix B
0
= −I. The
dimension of the problem is n = 100. Since A is invertible, the Theorem of
Gay implies that it will take Broyden’s method less then 200 iterations to solve
the problem exactly. It turns out that in the simulation, about 219 iterations
are needed to reach a residual of g(x
k
) < 10
−12
. The ﬁnite arithmetic of the
computer has probably introduced a nonlinearity into the system so that the
conditions of Gay’s Theorem are not completely fulﬁlled.
In Figure 3, we have plotted the matrix A and the Broyden matrix for
diﬀerent iterations. We observe that in some way the Broyden matrix B
k
tries to approximate the Jacobian A. Although the ﬁnal Broyden matrix is
certainly not equal to the Jacobian, it approximates the Jacobian to such an
extent that the solution to the problem Ax = 0 can be found.
After 50 iterations, the rough contour of the cat can be recognized in
the Broyden matrix B
50
. While reconstructing the two main diagonals of the
Jacobian, the picture of the cat is sharpened. Note that the light spot at
the left side of the image of the Jacobian is considered less interesting by the
method of Broyden. On the other hand, the nose and eyes of the cat are
clearly detected.
Limited memory methods applied to periodic processes
We now consider an application in the chemical reactor engineering. In case
of signiﬁcant temperature ﬂuctuations, it turns out to be essential to include
a second space dimension in the model of packed bed reactors. Moreover, to
Introduction 9
Jacobian B
50
B
100
B
218
Figure 3: The Jacobian as given by (8) and the Broyden matrix at three diﬀerent
iterations of the Broyden process (n = 100). Black corresponds to values smaller than
0 and white to values larger than 1.
obtain an accurate approximation of the periodic state of the reactor, it is nec
essary to use a ﬁne grid. This implies that the number of equations, n, is very
large. Combining the integration of the system of ordinary diﬀerential equa
tions for the evaluation of the function g with a ﬁne grid in the reactor makes
it practically impossible to solve (1) using classical iterative algorithms for
nonlinear equations. To overcome severe memory constraints, many authors
have reverted to pseudohomogeneous onedimensional models and to coarse
10 Introduction
grid discretization, which renders such models inadequate or inaccurate.
The radial transport of heat and matter is essential in nonisothermal
packed bed reactors [72]. A highly exothermic reaction, a large width of the
reactor, and eﬃcient cooling of the reactor at the wall cause radial temperature
gradients to be present, see Figure 4. Clearly, for reactors operating under
these conditions the radial dimension must be taken into account explicitly.
PSfrag replacements
ax. distance
rad. distance
temperature
c
o
n
v
e
r
s
i
o
n
PSfrag replacements
ax. distance
rad. distance
t
e
m
p
e
r
a
t
u
r
e
conversion
Figure 4: Qualitative conversion and temperature distribution of the cooled reverse
ﬂow reactor in the cyclic steady state using the twodimensional model (10)(12) with
the parameter values of Tables C.1 and C.2.
As an example, we consider a reverse ﬂow reactor, which is considered in
detail in Chapter 8. The reverse ﬂow reactor (RFR) is a catalytic packedbed
reactor in which the ﬂow direction is periodically reversed to trap a hot zone
within the reactor. Upon entering the reactor, the cold feed gas is heated up
regeneratively by the hot bed so that a reaction can occur. The reaction is
assumed to be exothermic. At the other end of the reactor, the hot product
gas is cooled by the colder catalyst particles. The beginning and end of the
reactor thus eﬀectively work as heat exchangers. The cold feed gas purges
the hightemperature (reaction) front in downstream direction. Before the
hot reaction zone exits the reactor, the feed ﬂow direction is reversed. The
ﬂowreversal period, denoted by t
f
, is usually constant and predeﬁned. One
complete cycle of the RFR consists of two ﬂowreverse periods. Overheating of
the catalyst and hot spot formation are avoided by a limited degree of cooling.
In Chapter 6, we derive the balance equations of the twodimensional model
of a general packed bed reactor. Here we give a short summary of the deriva
tion. We start with the onedimensional pseudohomogeneous model of Khi
nast, Jeong and Luss [33], which takes into account the axial heat and mass
dispersion.
Introduction 11
The concentration and temperature depend on the axial and the radial
direction, c = c(z, r, t) and T = T(z, r, t). The second spatial dimension is
incorporated by including the radial components of the diﬀusion terms,
εD
rad
1
r
∂
∂r
r
∂c
∂r
and λ
rad
1
r
∂
∂r
r
∂T
∂r
,
in the component balance and the energy balance, respectively. The cooling
term in the energy balance disappears. Instead, at the wall of the reactor the
boundary condition
λ
rad
∂T
∂r
r=R
= −U
w
(T(R) −T
c
), (9)
is added to the system. Equation (9) describes the heat loss at the reactor
wall to the surrounding cooling jacket.
In summary, we can now give the complete twodimensional model. The
component balance is given by
ε
∂c
∂t
= εD
ax
∂
2
c
∂z
2
−u
∂c
∂z
−r
(c, T) +εD
rad
1
r
∂
∂r
r
∂c
∂r
, (10)
the energy balance is given by
((ρc
p
)
s
(1 −ε) + (ρc
p
)
g
ε)
∂T
∂t
= λ
ax
∂
2
T
∂z
2
−u(ρc
p
)
g
∂T
∂z
+
(−∆H)r
(c, T) +λ
rad
1
r
∂
∂r
r
∂T
∂r
, (11)
and the boundary conditions are given by
−λ
ax
∂T
∂z
z=0
= u(ρc
p
)
g
(T
0
−T(0)),
∂T
∂z
z=L
= 0,
−εD
ax
∂c
∂z
z=0
= u(c
0
−c(0)),
∂c
∂z
z=L
= 0,
∂c
∂r
r=0
= 0,
∂c
∂r
r=R
= 0,
∂T
∂r
r=0
= 0, λ
rad
∂T
∂r
r=R
= −U
w
(T(R) −T
c
).
(12)
The values of the parameters in this model are derived in Appendix C and
summarized in Tables C.1 and C.2.
12 Introduction
In Chapter 7, we describe a numerical approach to deal with the partial
diﬀerential equations in order to compute the cyclic steady state of the process.
When we are able to evaluate the period map of the process using discretization
techniques and integration routines, we can apply the limited memory Broyden
methods of Chapter 3.
We deﬁne f : R
n
→ R
n
to be the period map of the RFR of one ﬂow
reverse period associated to the balance equations (10)  (12). We use 100
equidistant grid points in the axial direction and 25 grid points in the radial
direction. The state vector, denoted by x, consist of the temperature and the
concentration in every grid point. This implies that n = 5000.
In Chapter 8, we propose the use of the Broyden Rank Reduction method
to simulate a full twodimensional model of the reverse ﬂow reactor with radial
gradients taken into account. A disadvantage of the method of Broyden is
that an initial approximation of the solution has to be chosen as well as an
initial Broyden matrix. This problem is naturally solved by the application.
The reverse ﬂow reactor is usually started in preheated state, for example
T = 2T
0
, where the reactor is ﬁlled with the carrier gas without a trace of the
reactants, that is, c = 0. This initial state of the reactor is chosen as the initial
state of the Broyden process. In the ﬁrst periods, the state of the reverse ﬂow
reactor converges relatively fast to the cyclic steady state. Thereafter the rate
of convergence decreases and the dynamical process takes many periods before
it reaches the cyclic steady state. By taking the initial Broyden matrix equal
to minus the identity, B
0
= −I, the ﬁrst iteration of the Broyden process is a
dynamical simulation step,
x
1
= x
0
−B
−1
0
g(x
0
) = x
0
+f(x
0
) −x
0
= f(x
0
).
We apply the BRR method for diﬀerent values of p to approximate a
zero of the function g(x) = f(x) − x with a residual of 10
−8
. Figure 5 shows
that the BRR method converges in 49 iterations for p = 10. Here, instead of
25, 000, 000 (n
2
) required for a standard Broyden iteration, only 100, 000 (2pn)
storage locations are needed for the Broyden matrix. If we use a few more
iterations, p can even be chosen equal to 5 and the number of storage locations
is reduced further. If p is chosen too small (p = 2) the (fast) convergence is
lost. For complete details of the computations see Chapter 5.
The BRR method makes it possible to compute eﬃciently the cyclic steady
state of the reverse ﬂow reactor, where radial gradients are integrated in the
model.
Introduction 13
0 10 20 30 40 50 60 70 80 90 100
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 5: The convergence rate of the method of Broyden and the BRR method, for
diﬀerent values of p, applied to the period map of the reverse ﬂow reactor using the
twodimensional model (10)(12) with the parameter values of Tables C.1 and C.2.
[’+’(p = 10), ’∗’(p = 5), ’’(p = 2)]
14 Introduction
Part I
Basics of limited memory
methods
15
Chapter 1
An introduction to iterative
methods
A general nonlinear system of algebraic equations can be written as
g(x) = 0, (1.1)
where x = (x
1
, . . . , x
n
) is a vector in R
n
, the ndimensional real vector space.
The function g : R
n
→ R
n
is assumed to be continuously diﬀerentiable in an
open, convex domain T. The Jacobian of g, denoted by J
g
, is assumed to be
Lipschitz continuous in T, that is, there exists a constant γ ≥ 0 such that for
every u and v in T,
J
g
(u) −J
g
(v) ≤ γu −v.
We write J
g
∈ Lip
γ
(T).
In general, systems of nonlinear equations cannot be solved analytically
and we have to consider numerical approaches. These approaches are generally
built on the concept of iterations. Steps involving similar computations are
performed over and over again until the solution is approximated. The oldest
and the most famous iterative method might be the method of Newton, also
called the NewtonRaphson method. In Section 1.2, we derive and discuss the
method of Newton and describe the convergence properties. In addition, we
discuss some quasiNewton methods, based on the method of Newton.
The quasiNewton method of most interest to this work is the method of
Broyden, proposed by Charles Broyden in 1965 [8]. In Section 1.3, we derive
this method in the same way as Broyden. We prove the local convergence of
the method and discuss the most interesting features.
As a simple introduction into quasiNewton methods, we ﬁrst consider a
scalar problem.
17
18 Chapter 1. An introduction to iterative methods
1.1 Iterative methods in one variable
The algorithms we discuss in this section are the scalar version of Newton’s
method, Broyden’s method and other quasiNewton methods, as discussed in
Sections 1.2 and 1.3. The multidimensional versions of the methods are more
complex, but an understanding of the scalar case will help in understanding
the multidimensional case. The theorems in this section are special cases of
the theorems in Sections 1.2 and 1.3. So, we often omit the proof, unless it
gives insight in the algorithms.
The scalar version of Newton’s method
The standard iterative method to solve (1.1) is the method of Newton, which
can be described by a single expression. We choose an initial guess x
0
∈ R to
the solution x
∗
and compute the sequence ¦x
k
¦ using the iteration scheme
x
k+1
= x
k
−
g(x
k
)
g
(x
k
)
, k = 0, 1, 2, . . . . (1.2)
This iteration scheme involves solving a local aﬃne model for the function g
instead of solving the nonlinear equation (1.1) directly. A clear choice for the
aﬃne model, denoted by l
k
(x), is the tangent line to the graph of g in the
point (x
k
, g(x
k
)). So, the function is linearized in the point x
k
, i.e.,
l
k
(x) = g(x
k
) +g
(x
k
)(x −x
k
), (1.3)
and x
k+1
is deﬁned to be the zero of this aﬃne function, which yields (1.2).
We illustrate this idea with an example.
Example 1.1. Let g : R →R be given by
g(x) = x
2
−2. (1.4)
The derivative of this function is g
(x) = 2x and an exact zero of g is x
∗
=
√
2.
As initial condition, we take x
0
= 3. The ﬁrst aﬃne model equals the tangent
line to g at x
0
,
l
0
(x) = g(x
0
) +g
(x
0
)(x −x
0
) = 7 + 6(x −3) = 6x −11.
The next iterate x
1
is determined to be the intersection point of the tangent
line and the xaxis, x
1
=
11
6
, see Figure 1.1. Next, we repeat the same step
starting from the new estimate x
1
.
1.1 Iterative methods in one variable 19
1 2 3 4
−2
0
2
4
6
8
10
PSfrag replacements
x
0
x
1
x
2
Figure 1.1: The ﬁrst two steps of the scalar version of Newton’s method (1.2) for
x
2
− 2 = 0, starting at x
0
= 3.
An important fact is that the method of Newton is locally qquadratically
convergent. In every iteration, the number of accurate digits is doubled when
the iteration starts close to the true solution. The scalar version of Theorem
1.10 reads.
Theorem 1.2. Let g : R →R be continuously diﬀerentiable in an open inter
val T, where g
∈ Lip
γ
(T). Assume that for some ρ > 0, [g
(x)[ ≥ ρ for every
x ∈ T. If g(x) = 0 has a solution x
∗
∈ T, then there exists an ε > 0 such that
if [x
0
−x
∗
[ < ε, the sequence ¦x
k
¦ generated by
x
k+1
= x
k
−
g(x
k
)
g
(x
k
)
, k = 0, 1, . . . ,
exists and converges to x
∗
. Furthermore, for k ≥ 0,
[x
k+1
−x
∗
[ ≤
γ
2ρ
[x
k
−x
∗
[
2
.
The condition that g
(x) has a nonzero lower bound in T, simply means
that g
(x
∗
) must be nonzero for Newton’s method to converge quadratically.
If g
(x
∗
) = 0, then x
∗
is a multiple root, and Newton’s method converges only
20 Chapter 1. An introduction to iterative methods
linearly [18]. In addition, if g
(x) ≥ ρ on T, the continuity of g implies that
x
∗
is the only solution in T.
Theorem 1.2 guarantees the convergence only for a starting point x
0
that
lies in a neighborhood of the solution x
∗
. If [x
0
− x
∗
[ is too large, Newton’s
method might not converge. So, the method is useful for its fast local conver
gence, but we need to combine it with a more robust algorithm that is can
converge from starting points further away from the true solution.
The secant method
In many practical applications, the nonlinear equation cannot be given in
closed form. For example, the function g might be the output of a compu
tational or experimental procedure. In this case, g
(x) is not available and
we have to modify Newton’s method, which requires the derivative g
(x) to
model g around the current estimate x
k
by the tangent line to g(x) at x
k
. The
tangent line can be approximated by the secant line through g(x) at x
k
and a
nearby point x
k
+h
k
. The slope of this line is given by
a
k
=
g(x
k
+h
k
) −g(x
k
)
h
k
. (1.5)
The function g(x) is modeled by
l
k
(x) = g(x
k
) +a
k
(x −x
k
). (1.6)
Iterative methods that solve (1.6) in every iteration step are called quasi
Newton methods. These methods follow the scheme
x
k+1
= x
k
−
g(x
k
)
a
k
, k = 0, 1, . . . . (1.7)
Of course we have to choose h
k
in the right way. For h
k
suﬃciently small,
a
k
is a ﬁnitediﬀerence approximation to g
(x
k
). In Theorem 1.5, we show
that using a
k
given by (1.5) with suﬃciently small h
k
, works as well as using
the derivative itself. However, in every iteration two function evaluations are
needed. If computing g(x) is very expensive, using h
k
= x
k−1
−x
k
may be a
better choice, where x
k−1
is the previous iterate. Substituting h
k
= x
k−1
−x
k
in (1.5) gives
a
k
=
g(x
k−1
) −g(x
k
)
x
k−1
−x
k
, (1.8)
and only one function evaluation is required, since g(x
k−1
) is already computed
in the previous iteration. This quasiNewton method is called the secant
1.1 Iterative methods in one variable 21
method, because the local model uses the secant line through the points x
k
and x
k−1
. Since a
0
is not deﬁned by the secant method, a
0
is often chosen
using (1.5) with h
0
small or a
0
= −1.
While it may seem locally ad hoc, it turns out to work well. The method is
slightly slower than a ﬁnitediﬀerence method, but usually it is more eﬃcient in
terms of the total number of function evaluations required to obtain a speciﬁed
accuracy.
To prove the convergence of the secant method we need the following
lemma, which also plays a role in the multidimensional setting.
Lemma 1.3. Let g : R →R be continuously diﬀerentiable in an open interval
T, and let g
∈ Lip
γ
(T). Then for any x, y in T,
[g(y) −g(x) −g
(x)(y −x)[ ≤
γ(y −x)
2
2
.
Proof. The fundamental theorem of calculus gives that g(y)−g(x) =
y
x
g
(z)dz,
which implies
g(y) −g(x) −g
(x)(y −x) =
y
x
(g
(z) −g
(x))dz. (1.9)
Under the change of variables
z = x +t(y −x), dz = dt(y −x),
(1.9) becomes
g(y) −g(x) −g
(x)(y −x) =
1
0
(g
(x +t(y −x)) −g
(x))(y −x)dt.
Applying the triangle inequality to the integral and using the Lipschitz conti
nuity of g
, yields
[g(y) −g(x) −g
(x)(y −x)[ = [y −x[
1
0
γ[t(y −x)[dt = γ[y −x[
2
/2.
We analyze one step of the quasiNewton process (1.6). By construction,
x
k+1
−x
∗
= a
−1
k
(a
k
(x
k
−x
∗
) −g(x
k
) +g(x
∗
))
= a
−1
k
(g(x
∗
) −g(x
k
) −g
(x
k
)(x
∗
−x
k
) + (g
(x
k
) −a
k
)(x
∗
−x
k
))
= a
−1
k
x
∗
x
k
(g
(z) −g
(x
k
))dz + (g
(x
k
) −a
k
)(x
∗
−x
k
)
.
22 Chapter 1. An introduction to iterative methods
If we deﬁne e
k
= [x
k
− x
∗
[ and use g
∈ Lip
γ
(T) in the same way as in the
proof of Lemma 1.3, we obtain
e
k+1
≤ [a
−1
k
[
γ
2
e
2
k
+[g
(x
k
) −a
k
[e
k
. (1.10)
In order to use (1.10), we have to know how close the ﬁnite diﬀerence
approximation a
k
is to g
(x
k
) as a function of h
k
.
Lemma 1.4. Let g : R →R be continuously diﬀerentiable in an open interval
T and let g
∈ Lip
γ
(T). If x
k
, x
k
+h
k
∈ T and a
k
is deﬁned by (1.5), then
[a
k
−g
(x
k
)[ ≤
γ[h
k
[
2
. (1.11)
Proof. From Lemma 1.3, we have
[g(x
k
) −g(x
k
+h
k
) −h
k
g
(x
k
)[ ≤
γ[h
k
[
2
2
.
Dividing both sides by [h
k
[ gives the desired result.
Substituting (1.11) in (1.10) gives
e
k+1
≤
γ
2[a
k
[
(e
k
+[h
k
[)e
k
. (1.12)
Using this inequality, it is not diﬃcult to prove the following theorem.
Theorem 1.5. Let g : R → R be continuously diﬀerentiable in an open in
terval T and let g
∈ Lip
γ
(T). Assume that [g
(x)[ ≥ ρ for some ρ > 0 and
for every x ∈ T. If g(x) = 0 has a solution x
∗
∈ T, then there exist positive
constants ε, h such that if ¦h
k
¦ is a real sequence with 0 < [h
k
[ ≤ h, and if
[x
0
−x
∗
[ < ε, then the sequence ¦x
k
¦ given by
x
k+1
= x
k
−
g(x
k
)
a
k
, a
k
=
g(x
k
+h
k
) −g(x
k
)
h
k
,
for k = 0, 1, . . . , is well deﬁned and converges qlinearly to x
∗
. If lim
k→∞
h
k
=
0, then the convergence is qsuperlinear. If there exists some constant c
1
such
that
[h
k
[ ≤ c
1
[x
k
−x
∗
[,
or equivalently, a constant c
2
such that
[h
k
[ ≤ c
2
[g(x
k
)[, (1.13)
1.1 Iterative methods in one variable 23
then the convergence is qquadratic. If there exists some constant c
3
such that
[h
k
[ ≤ c
3
[x
k
−x
k−1
[, (1.14)
then the convergence is at least twostep qquadratic.
If we would like the ﬁnitediﬀerence methods to converge qquadratically,
we just set h
k
= c
2
[g(x
k
)[. Indeed, if x
k
is close enough to x
∗
, the mean value
theorem and the fact that g(x
∗
) = 0 implies that [g(x
k
)[ ≤ c[x
k
−x
∗
[ for some
c > 0. Note that the secant method h
k
= x
k−1
−x
k
is included as a special case
of (1.14). We restrict the proof of Theorem 1.5 to the twostep qquadratic
convergence of the secant method.
Proof (of Theorem 1.5). We ﬁrst prove that the secant method is qlinearly
convergent. Choose ε = ρ/(4γ) and h = ρ/(2γ). Suppose x
0
and x
1
are in T
and in addition [x
0
− x
∗
[ < ε, [x
1
− x
∗
[ < ε and [h
1
[ = [x
1
− x
0
[ < η
. Since
[g
(x)[ ≥ ρ for all x ∈ T, (1.11) implies that
[a
1
[ = [a
1
−g
(x
1
) +g
(x
1
)[
≥ [g
(x
1
)[ −[a
1
−g
(x
1
)[
≥ ρ −
γ[h
1
[
2
≥ ρ −
γ
2
ρ
2γ
=
3
4
ρ.
From (1.12), this gives
e
2
≤
γ
2
3
4
ρ
(e
1
+[h
1
[)e
1
≤
2γ
3ρ
ρ
4γ
+
ρ
2γ
e
1
=
1
2
e
1
.
Therefore, we have [x
2
−x
∗
[ ≤
1
2
ε < ε and
[h
2
[ = [x
2
−x
1
[ ≤ e
2
+e
1
≤
3
2
ε =
3
4
h < h.
Using the same arguments, we obtain
e
k+1
≤
2γ
3ρ
(e
k
+[h
k
[)e
k
≤
1
2
e
k
for all k = 1, 2, . . . . (1.15)
To prove the twostep qquadratic convergence of the secant method, we note
that
[h
k
[ = [x
k
−x
k−1
[ ≤ e
k
+e
k−1
, k = 1, 2, . . . .
24 Chapter 1. An introduction to iterative methods
Using the linear convergence, we derive from (1.15)
e
k+1
≤
2γ
3ρ
(e
k
+e
k
+e
k−1
)e
k
≤
2γ
3ρ
(2e
k−1
)
1
2
e
k−1
=
2γ
3ρ
e
2
k−1
.
This implies the twostep qquadratic convergence.
In numerical simulations, we have to deal with the restrictions arising from
the ﬁnite arithmetic of the computer. In particular, we should not choose h
k
too small because then fl(x
k
) = fl(x
k
+h
k
), where fl(a) is the ﬂoating point
representation of a, and the ﬁnitediﬀerence approximation of g
(x
k
) is not
deﬁned. Additionally, it can also happen that fl(g(x
k
)) = fl(g(x
k
+ h
k
)),
although the derivative is not equal to zero. This is one reason that the secant
process can fail. Of course, it depends on the function and how accurately we
can approximate the zero of the function.
In the next example, we consider the secant method again applied to the
function g(x) = x
2
−2.
Example 1.6. Let g : R
n
→ R
n
be given by (1.4). We apply the secant
method, (1.7) and (1.8), to solve g(x) = 0, starting from the initial condition
x
0
= 3. In the ﬁrst iteration step of the secant method, we use a
0
= g
(x
0
). In
Figure 1.2, the ﬁrst two steps of the secant method are displayed. Table 1.1
shows that the secant method needs just a few more iterations to obtain the
same precision for the approximation of
√
2 as Newton’s method.
iteration Newton’s method Secant method
0 3.0 3.0
1 1.833333333333 1.833333333333
2 1.462121212121 1.551724137931
3 1.414998429895 1.431239388795
4 1.414213780047 1.414998429895
5 1.414213562373 1.414218257349
6  1.414213563676
7  1.414213562373
Table 1.1: The method of Newton (1.2) and the secant method (1.7) and (1.8) ap
proximating
√
2.
1.2 The method of Newton 25
1 2 3 4
−2
0
2
4
6
8
10
PSfrag replacements
x
0
x
1
x
2
Figure 1.2: The ﬁrst two steps of the secant method, deﬁned by (1.7) and (1.8), for
x
2
− 2 = 0, starting from x
0
= 3.
1.2 The method of Newton
This section we derive the method of Newton in the multidimensional set
ting. By Theorem 1.10, we prove that in R
n
Newton’s method is locally
qquadratically convergent. We then discuss the ﬁnitediﬀerence version of
Newton’s method and other types of quasiNewton methods that leads to the
introduction of the method of Broyden in Section 1.3.
Derivation of the algorithm
Similar to the onedimensional setting, the method of Newton is based on
ﬁnding the root of an aﬃne approximation to g at the current iterate x
k
. The
local model is derived from the equality
g(x
k
+s) = g(x
k
) +
x
k
+s
x
k
J
g
(z)sdz,
where J
g
denotes the Jacobian of g. If the integral is approximated by J
g
(x
k
)s,
the model in the current iterate becomes
l
k
(x
k
+s) = g(x
k
) +J
g
(x
k
)s. (1.16)
26 Chapter 1. An introduction to iterative methods
We solve this aﬃne model for s, that is, ﬁnd s
k
∈ R
n
such that
l
k
(x
k
+s
k
) = 0.
This Newton step, s
k
, is added to the current iterate
x
k+1
= x
k
+s
k
.
The new iterate x
k+1
is not expected to equal x
∗
, but only to be a better
estimate than x
k
. Therefore, we build the Newton iteration into an algorithm,
starting from an initial guess x
0
.
Algorithm 1.7 (Newton’s method). Choose an initial estimate x
0
∈ R
n
,
pick ε > 0, and set k := 0. Repeat the following sequence of steps until
g(x
k
) < ε.
i) Solve J
g
(x
k
)s
k
= −g(x
k
) for s
k
,
ii) x
k+1
:= x
k
+s
k
.
In order to judge every iterative method described in this thesis, we con
sider the rate of convergence of each method on a test function, the discrete
integral equation function, as described in Appendix A. We have chosen this
function from a large set of test function, called the CUTE collection, cf.
[18, 47]. It is a commonly chosen problem and, in addition, the method of
Broyden is able to compute a zero of this function rather easily. The ﬁrst time
we use this test function, we explicitly give the expression of the function. In
future examples, we refer to Appendix A.
Example 1.8. We apply Algorithm 1.7 to ﬁnd a zero of the discrete integral
equation function, given by
g
i
(x) = x
i
+
h
2
(1−t
i
)
i
¸
j=1
t
j
(x
j
+t
j
+1)
3
+t
i
n
¸
j=i+1
(1−t
j
)(x
j
+t
j
+1)
3
, (1.17)
for i = 1, . . . , n, where h = 1/(n+1) and t
i
= i h, i = 1, . . . , n. We start with
the initial vector x
0
given by
x
0
= (t
1
(t
1
−1), . . . , t
n
(t
n
−1)).
In Table 1.2, the convergence properties of Newton’s method are described,
for diﬀerent dimensions n of the problem. The initial residual g(x
0
) and the
1.2 The method of Newton 27
ﬁnal residual g(x
k
∗) are given, where k
∗
is the number of iterations used.
The variable R is a measure of the rate of convergence and deﬁned by
R = log(g(x
0
)/g(x
k
∗))/k
∗
. (1.18)
The residual g(x
k
), k = 0, . . . , k
∗
, is plotted in Figure 1.3. We observe that
the dimension of the problem does not inﬂuence the convergence of Newton’s
method in case of this test function.
method n g(x
0
) g(x
k
∗) k
∗
R
Newton 10 0.2518 6.3085 · 10
−15
3 10.4393
Newton 100 0.7570 1.7854 · 10
−14
3 10.4594
Newton 200 1.0678 2.4858 · 10
−14
3 10.4637
Table 1.2: The convergence properties of Algorithm 1.7 applied to the discrete integral
equation function (1.17) for diﬀerent dimensions n.
0 1 2 3
10
−15
10
−10
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 1.3: The convergence rate of Algorithm 1.7 applied to the discrete integral
equation function (1.17) for diﬀerent dimensions n. [’◦’(n = 10), ’×’(n = 100), ’+’(n =
200)]
Note that if g is an aﬃne function, Newton solves the problem in one
iteration. Even if a component function of g is aﬃne, each iterate generated
by Newton’s method is a zero of this component function, that is, if g
1
is aﬃne
then g
1
(x
1
) = g
1
(x
2
) = . . . = 0. We illustrate this with an example.
28 Chapter 1. An introduction to iterative methods
Example 1.9. We consider the Rosenbrock function g : R
2
→R
2
deﬁned by
g(x) =
10(x
2
−x
2
1
)
1 −x
1
.
As initial condition, we choose x
0
= (−1.2, 1). The function value in x
0
equals g(x
0
) = (−4.4, 2.2). Note that the second component of the Rosen
brock function is aﬃne in x. This explains the zero in the function value of
x
1
= (1, −3.84), which equals g(x
1
) = (−48.4, 0). As said before, all future
iterates will be a zero of the second component function. This implies that
the ﬁrst component of x
k
will be equal to 1 for all future iterations. So, the
ﬁrst component of the Rosenbrock function has become aﬃne and the next
iterate yields the solution x
2
= (1, 1).
Some problems arise in implementing Algorithm 1.7. The Jacobian of
g is often not analytically available, for example, if g itself is not given in
analytic form. A ﬁnitediﬀerence method or less expensive methods should be
used to approximate the Jacobian. Secondly, if J
g
(x
k
) is illconditioned, then
J
g
(x
k
)s
k
= −g(x
k
) will not give a reliable solution.
Local convergence of Newton’s method
In this section, we give a proof of the local qquadratic convergence of Newton’s
method and discuss its implications. The proof is a prototype of the proofs
for convergence of the quasiNewton methods. All the convergence results in
this thesis are local, i.e., there exists an ε > 0 such that the iterative method
converges for all x
0
in an open neighborhood N(x
∗
, ε) of the solution x
∗
. Here,
N(x
∗
, ε) = ¦x ∈ R
n
[ x −x
∗
 < ε¦.
Theorem 1.10. Let g : R
n
→ R
n
be continuously diﬀerentiable in an open,
convex set T ∈ R
n
. Assume that J
g
∈ Lip
γ
(T) and that there exist x
∗
∈ R
n
and β > 0 such that g(x
∗
) = 0 and J
g
(x
∗
) is nonsingular with J
g
(x
∗
)
−1
 ≤ β.
Then there exists an ε > 0 such that for all x
0
∈ N(x
∗
, ε) the sequence ¦x
k
¦
generated by
x
k+1
= x
k
−J
g
(x
k
)
−1
g(x
k
), k = 0, 1, 2, . . . ,
is well deﬁned, converges to x
∗
, and satisﬁes
x
k+1
−x
∗
 ≤ βγx
k
−x
∗

2
for k = 0, 1, 2, . . . . (1.19)
1.2 The method of Newton 29
We can show that the convergence is qquadratic, by choosing ε such that
J
g
(x) is nonsingular for all x ∈ N(x
∗
, ε). The reason is that if the Jacobian
is nonsingular, the local error in the aﬃne model (1.16) is at most of order
O(x
k
−x
∗

2
). This is a consequence of the following lemma.
Lemma 1.11. Let g : R
n
→ R
n
be continuously diﬀerentiable in the open,
convex set T ⊂ R
n
and let x ∈ T. If J
g
∈ Lip
γ
(T), then for any y ∈ T,
g(y) −g(x) −J
g
(x)(y −x) ≤
γ
2
y −x
2
.
Proof. According to the fundamental theorem of calculus, we have
g(y) −g(x) −J
g
(x)(y −x)
=
1
0
J
g
(x +t(y −x))(y −x)dt
−J
g
(x)(y −x)
=
1
0
(J
g
(x +t(y −x)) −J
g
(x))(y −x)dt. (1.20)
We can bound the integral on the right hand side of (1.20) in terms of the
integrand. Together with the Lipschitz continuity of J
g
at x ∈ T, this implies
g(y) −g(x) −J
g
(x)(y −x) ≤
1
0
J
g
(x +t(y −x)) −J
g
(x)(y −x)dt
≤
1
0
γt(y −x)(y −x)dt
= γ(y −x)
2
1
0
tdt =
γ
2
y −x
2
.
The next theorem says that matrix inversion is continuous in norm. Fur
thermore, it gives a relation between the norms of the inverses of two nearby
matrices that is useful later in analyzing algorithms.
Theorem 1.12. Let . be the induced l
2
norm on R
n×n
and let E ∈ R
n×n
.
If E < 1, then (I −E)
−1
exists and
(I −E)
−1
 ≤
1
1 −E
.
If A is nonsingular and A
−1
(B −A) < 1, then B is nonsingular and
B
−1
 ≤
A
−1

1 −A
−1
(B −A)
.
30 Chapter 1. An introduction to iterative methods
The proof of Theorem 1.12 can be found in [18].
Proof (of Theorem 1.10). We choose
ε ≤ 1/(2βγ) (1.21)
so that N(x
∗
, ε) ⊂ T. By induction on k, we show that (1.19) holds for each
iteration step and that
x
k+1
−x
∗
 ≤
1
2
x
k
−x
∗
,
which implies that x
k+1
∈ N(x
∗
, ε) if x
k
∈ N(x
∗
, ε).
We ﬁrst consider the basis step (k = 0). Using the Lipschitz continuity of
J
g
at x
∗
, x
0
−x
∗
 ≤ ε and (1.21), we obtain
J
g
(x
∗
)
−1
(J
g
(x
0
) −J
g
(x
∗
)) ≤ J
g
(x
∗
)
−1
J
g
(x
0
) −J
g
(x
∗
)
≤ βγx
0
−x
∗
 ≤ βγε ≤
1
2
.
Theorem 1.12 implies that J
g
(x
0
) is nonsingular and
J
g
(x
0
)
−1
 ≤
J
g
(x
∗
)
−1

1 −J
g
(x
∗
)
−1
(J
g
(x
0
) −J
g
(x
∗
))
≤ 2J
g
(x
∗
)
−1
 ≤ 2β. (1.22)
This implies that x
1
is well deﬁned and, additionally,
x
1
−x
∗
= x
0
−x
∗
−J
g
(x
0
)
−1
g(x
0
)
= x
0
−x
∗
−J
g
(x
0
)
−1
(g(x
0
) −g(x
∗
))
= J
g
(x
0
)
−1
(g(x
∗
) −g(x
0
) −J
g
(x
0
)(x
∗
−x
0
)). (1.23)
The second factor in (1.23) gives the diﬀerence between g(x
∗
) and the aﬃne
model l
0
(x) evaluated at x
∗
. Therefore, by Lemma 1.11 and (1.22),
x
1
−x
∗
 ≤ J
g
(x
0
)
−1
g(x
∗
) −g(x
0
) −J
g
(x
0
)(x
∗
−x
0
)
≤ 2β
γ
2
x
0
−x
∗

2
= βγx
0
−x
∗

2
.
We have shown (1.19) for k = 0. Since x
0
−x
∗
 ≤ ε ≤ 1/(2βγ), it follows that
x
1
− x
∗
 ≤
1
2
x
0
− x
∗
, which yields x
1
∈ N(x
∗
, ε) as well. This completes
the proof for k = 0.
The proof of the induction step proceeds in the same way.
1.2 The method of Newton 31
Note that if g is aﬃne, the Jacobian is constant and the Lipschitz constant
γ can be chosen to be zero. We then have
x
1
−x
∗
 ≤ βγx
0
−x
∗

2
= 0,
and the method of Newton converges exactly in one single iteration. If g is a
nonlinear function, the relative nonlinearity of g at x
∗
is given by, γ
rel
= β γ.
So, for x ∈ T,
J
g
(x
∗
)
−1
(J
g
(x) −J
g
(x
∗
)) ≤ J
g
(x
∗
)
−1
J
g
(x) −J
g
(x
∗
)
≤ βγx −x
∗
 = γ
rel
x −x
∗
.
The radius of guaranteed convergence of Newton’s method is inversely propor
tional to the relative nonlinearity, γ
rel
, of g at x
∗
. The bound ε for the region
of convergence is a worstcase estimate. In directions from x
∗
in which g is
less nonlinear, the region of convergence may very well be much larger.
We conclude with a summery of the characteristics of Newton’s method.
Advantages of Newton’s method
• qQuadractically convergent from good starting points if J
g
(x
∗
) is non
singular,
• Exact solution in one iteration for an aﬃne function g (exact at each
iteration for any aﬃne component function of g).
Disadvantages of Newton’s method
• Not globally convergent for many problems,
• Requires the Jacobian J
g
(x
k
) at each iteration step,
• Each iteration step requires the solution of a system of linear equations
that might be singular or illconditioned.
QuasiNewton methods
We have already indicated that it is not always possible to compute the Ja
cobian of the function g, or that it is very expensive. In this case, we have to
approximate the Jacobian, for example, by using ﬁnite diﬀerences.
Algorithm 1.13 (Discrete Newton method). Choose an initial estimate
x
0
∈ R
n
and set k := 0. Repeat the following sequence until g(x
k
) < ε.
32 Chapter 1. An introduction to iterative methods
i) Compute
A
k
=
(g(x
k
+h
k
e
1
) −g(x
k
))/h
k
(g(x
k
+h
k
e
n
) −g(x
k
))/h
k
ii) Solve A
k
s
k
= −g(x
k
) for s
k
,
iii) x
k+1
:= x
k
+s
k
.
Example 1.14. We consider the discrete integral equation function g, given
by (A.5). We assume that h
k
≡ h and apply Algorithm 1.13 for diﬀerent values
of h. We start with the initial condition x
0
given by (A.6) and set ε = 10
−12
.
The convergence properties of the discrete Newton method are described in
Table 1.3. The rate of convergence is plotted in Figure 1.4. The diﬀerence
between the real Jacobian and the approximated Jacobian J
g
(x
k
)−A
k
 turns
out to be of order 10
−5
for h = 1.0 10
−4
, of order 10
−7
for h = 1.0 10
−8
and
of order 10
−3
for h = 1.0 10
−12
.
method n h g(x
0
) g(x
k
∗) k
∗
R
Discrete Newton 100 1.0 · 10
−4
0.7570 4.4908 · 10
−16
4 8.7652
Discrete Newton 100 1.0 · 10
−8
0.7570 8.6074 · 10
−14
3 9.9351
Discrete Newton 100 1.0 · 10
−12
0.7570 2.1317 · 10
−13
4 7.2246
Table 1.3: The convergence properties of Algorithm 1.13, applied to the discrete
integral equation function (A.5) for diﬀerent values of h.
If the ﬁnitediﬀerence step size h
k
is properly chosen, the discrete Newton
method is also qquadratically convergent. This is the conclusion of the next
theorem. We denote the l
1
vectornorm and the corresponding induced matrix
norm by .
1
.
Theorem 1.15. Let g and x
∗
satisfy the assumptions of Theorem 1.10. Then
there exist ε, h > 0 such that if ¦h
k
¦ is a real sequence with 0 < [h
k
[ ≤ h and
x
0
∈ N(x
∗
, ε), the sequence ¦x
k
¦ generated by
x
k+1
= x
k
−A
−1
k
g(x
k
), k = 0, 1, 2, . . . ,
where
A
k
=
(g(x
k
+h
k
e
1
) −g(x
k
))/h
k
(g(x
k
+h
k
e
n
) −g(x
k
))/h
k
,
is well deﬁned and converges qlinearly to x
∗
. Additionally, if
lim
k→∞
h
k
= 0,
1.2 The method of Newton 33
0 1 2 3 4
10
−15
10
−10
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 1.4: The convergence rate of the discrete Newton method 1.13 applied to
the discrete integral equation function (A.5) for diﬀerent values of h. [’◦’(h = 10
−4
),
’×’(h = 10
−8
), ’+’(h = 10
−12
)]
then the convergence is qsuperlinear. If there exists a constant c
1
such that
[h
k
[ ≤ c
1
x
k
−x
∗

1
,
or equivalently a constant c
2
such that
[h
k
[ ≤ c
2
g(x
k
)
1
,
then the convergence is qquadratic.
For the proof of Theorem 1.15 we refer to [18]. Another way to avoid com
putations of the Jacobian in every iteration is to compute the Jacobian in the
ﬁrst iteration, A = J
g
(x
0
), and use this matrix in all subsequent iterations as
an approximation of J
g
(x
k
). This method is called the NewtonChord method.
It turns out that the NewtonChord method is locally linearly convergent [38].
Algorithm 1.16 (NewtonChord method). Choose an initial estimate
x
0
∈ R
n
, set k := 0, and compute the Jacobian A := J
g
(x
0
). Repeat the
following sequence of steps until g(x
k
) < ε.
i) Solve As
k
= −g(x
k
) for s
k
,
ii) x
k+1
:= x
k
+s
k
.
Example 1.17. Let g be the discrete integral equation function given by
(A.5). We apply Algorithm 1.16 and Algorithm 1.7 to approximate the zero
34 Chapter 1. An introduction to iterative methods
of g. As initial estimate we choose x
0
given by (A.6) multiplied by a factor 1, 10
or 100. The convergence properties of the NewtonChord method and Newton’s
method are described in Table 1.4. In Figure 1.5, we can observe the linear
convergence of the NewtonChord method. The rate of convergence of the
NewtonChord method is very low in case of the initial condition 100x
0
. Clearly
for all initial conditions, the NewtonChord method needs more iterations to
converge than the original method of Newton, see Figure 1.6.
method n factor g(x
0
) g(x
k
∗) k
∗
R
Newton Chord 100 1 0.7570 2.3372 · 10
−14
8 3.8886
Newton Chord 100 10 18.5217 2.9052 · 10
−13
16 1.9866
Newton Chord 100 100 3.8215 · 10
3
2.1287 200 0.0375
Newton 100 1 0.7570 1.7854 · 10
−14
3 10.4594
Newton 100 10 18.5217 2.7007 · 10
−16
4 9.6917
Newton 100 100 3.8215 · 10
3
3.9780 · 10
−13
9 4.0890
Table 1.4: The convergence properties of Algorithm 1.16 and Algorithm 1.7 applied to
the discrete integral equation function (A.5) for diﬀerent initial conditions (x
0
, 10x
0
and 100x
0
).
0 2 4 6 8 10 12 14 16 18 20
10
−15
10
−10
10
−5
10
0
10
5
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 1.5: The convergence rate of Algorithm 1.16 applied to the discrete integral
equation function (A.5) for diﬀerent initial conditions. [’◦’(x
0
), ’×’(10x
0
), ’+’(100x
0
)]
1.3 The method of Broyden 35
0 1 2 3 4 5 6 7 8 9 10
10
−15
10
−10
10
−5
10
0
10
5
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 1.6: The convergence rate of Algorithm 1.7 applied to the discrete integral
equation function (A.5) for diﬀerent initial conditions. [’◦’(x
0
), ’×’(10x
0
), ’+’(100x
0
)]
1.3 The method of Broyden
The NewtonChord method of the previous section saves us the expensive
computation of the Jacobian J
g
(x
k
) is every iterate x
k
of the process, by ap
proximating it by the Jacobian in the initial condition, A = J
g
(x
0
). Additional
information about the Jacobian obtained during the process is neglected. This
information consists of the function values of g in the iterates needed to com
pute the step s
k
. In this section, we start with the basis idea for a class of
methods that adjust the approximation matrix to the Jacobian J
g
(x
k
) using
only the function value g(x
k
). We single out the method proposed by C.G.
Broyden in 1965 [8] that has a qsuperlinear and even 2nstep qquadratic
local convergence rate and seems to be very successful in practice. This algo
rithm, which is analogous to the method of Newton, is called the method of
Broyden.
A derivation of the algorithm
Recall that in one dimension we use the local model (1.6),
l
k+1
(x) = g(x
k+1
) +a
k+1
(x −x
k+1
)
for the nonlinear function g. Note that l
k+1
(x
k+1
) = g(x
k+1
) for all choices of
a
k+1
∈ R
n
. If we set a
k+1
= g
(x
k+1
), we obtain Newton’s method. If g
(x
k+1
)
is not available, we force the scheme to satisfy l
k+1
(x
k
) = g(x
k
), that is
g(x
k
) = g(x
k+1
) +a
k+1
(x
k
−x
k+1
),
36 Chapter 1. An introduction to iterative methods
which yields the secant approximation (1.8),
a
k+1
=
g(x
k+1
) −g(x
k
)
x
k+1
−x
k
.
The next iterate x
k+2
is the zero of the local model, l
k+1
. Therefore we arrive
at the quasiNewton update
x
k+2
= x
k+1
−g(x
k+1
)/a
k+1
.
The price we have to pay is a reduction in local convergence rate, from q
quadratic to 2step qquadratic convergence.
In multiple dimensions, we apply an analogous aﬃne model
l
k+1
(x) = g(x
k+1
) +B
k+1
(x −x
k+1
).
For Newton’s method B
k+1
equals the Jacobian J
g
(x
k+1
). We enforce the same
requirement that led to the onedimensional secant method. So, we assume
that l
k+1
(x
k
) = g(x
k
), which implies that
g(x
k
) = g(x
k+1
) +B
k+1
(x
k
−x
k+1
). (1.24)
Furthermore, if we deﬁne the current step by s
k
= x
k+1
−x
k
, and the yield of
the current step by y
k
= g(x
k+1
) −g(x
k
), Equation (1.24) is reduced to
B
k+1
s
k
= y
k
. (1.25)
We refer to (1.25) as the secant equation. For completeness we ﬁrst give the
deﬁnition of a secant method.
Deﬁnition 1.18. The iterative process
x
k+1
= x
k
−B
−1
k
g(x
k
)
is called a secant method if the matrix B
k
satisﬁes the secant equation (1.25)
in every iteration step.
The crux of the problem in extending the secant method to more than
one dimension is that (1.25) does not completely specify the matrix B
k+1
. In
fact, if s
k
= 0, there is an n(n − 1)dimensional aﬃne subspace of matrices
satisfying (1.25). Constructing a successful secant approximation consists of
selecting a good approach to choose from all these possibilities. The choice
should enhance the Jacobian approximation properties of B
k+1
or facilitate
its use in a quasiNewton algorithm.
1.3 The method of Broyden 37
A possible strategy is using the former function evaluations. That is, in
additional to the secant equation, we set
g(x
l
) = g(x
k+1
) +B
k+1
(x
l
−x
k+1
), l = k −m, . . . , k −1.
This is equivalent to
g(x
l
) = g(x
l+1
) +B
k+1
(x
l
−x
l+1
), l = k −m, . . . , k −1,
so,
B
k+1
s
l
= y
l
, l = k −m, . . . , k −1. (1.26)
For m = n−1 and linear independent s
k−m
, . . . , s
k
the matrix B
k+1
is uniquely
determined by (1.25) and (1.26). Unfortunately, most of the time s
k−m
, . . . , s
k
tend to be linearly dependent, making the computation of B
k+1
a poorly posed
numerical problem.
The approach that leads to the successful secant approximation is quite
diﬀerent. Aside from the secant equation no new information about either the
Jacobian or the model is given. The idea is to preserve as much as possible of
what we already have. Therefore, we try to minimize the change in the aﬃne
model, subject to the secant equation (1.25). The diﬀerence between the new
and the old aﬃne model, at any x is given by
l
k+1
(x) −l
k
(x) = g(x
k+1
) +B
k+1
(x −x
k+1
) −g(x
k
) −B
k
(x −x
k
)
= y
k
−B
k+1
s
k
+ (B
k+1
−B
k
)(x −x
k
)
= (B
k+1
−B
k
)(x −x
k
).
The last equality is due to the secant equation. Now if we write an arbitrary
x ∈ R
n
as
x −x
k
= αs
k
+q, where q
T
s
k
= 0, α ∈ R,
the expression that we want to minimize becomes
l
k+1
(x) −l
k
(x) = α(B
k+1
−B
k
)s
k
+ (B
k+1
−B
k
)q. (1.27)
We have no control over the ﬁrst term on the right hand side of (1.27), since
it equals
(B
k+1
−B
k
)s
k
= y
k
−B
k
s
k
. (1.28)
However, we can make the second term on the right hand side of (1.27) zero
for all x ∈ R
n
, by choosing B
k+1
such that
(B
k+1
−B
k
)q = 0, for all q ⊥ s
k
. (1.29)
38 Chapter 1. An introduction to iterative methods
This implies that (B
k+1
− B
k
) has to be a rankone matrix of the form us
T
k
,
with u ∈ R
n
. Equation (1.28) now implies that u = (y
k
−B
k
s
k
)/(s
T
k
s
k
). This
leads to the Broyden or secant update
B
k+1
= B
k
+
(y
k
−B
k
s
k
)s
T
k
s
T
k
s
k
. (1.30)
The word ’update’ indicates that we are not approximating the Jacobian in
the new iterate, J
g
(x
k+1
), from scratch. Rather a former approximation B
k
is updated into a new one, B
k+1
. This type of updating is shared by all the
successful multidimensional secant approximation techniques.
We arrive at the algorithm of Broyden’s method.
Algorithm 1.19 (Broyden’s method). Choose an initial estimate x
0
∈
R
n
and a nonsingular initial Broyden matrix B
0
. Set k := 0 and repeat the
following sequence of steps until g(x
k
) < ε.
i) Solve B
k
s
k
= −g(x
k
) for s
k
,
ii) x
k+1
:= x
k
+s
k
iii) y
k
:= g(x
k+1
) −g(x
k
),
iv) B
k+1
:= B
k
+ (y
k
−B
k
s
k
)s
T
k
/(s
T
k
s
k
),
In this section we use the Frobenius norm, denoted by .
F
. The norm is
deﬁned by
A
F
=
n
¸
i=1
n
¸
j=1
A
2
ij
1/2
. (1.31)
So, it equals the l
2
vector norm of the matrix written as a n
2
vector. For
y, s ∈ R
n
the set of all matrices that satisfy the secant equation As = y is
denoted by
O(y, s) = ¦A ∈ R
n×n
[ As = y¦.
In the preceding, we have followed the steps of Broyden when developing
his iterative method in [8], but the derivation of the Broyden update can be
made much more rigorous. The Broyden update is the minimum change to
B
k
consistent with the secant equation (1.25), if (B
k+1
− B
k
) is measured in
Frobenius norm. That is, of all matrices A that satisfy the secant equation
(1.25) the new Broyden matrix B
k+1
yields the minimum of A−B
k

F
. This
will be proved in Lemma 1.20.
1.3 The method of Broyden 39
Lemma 1.20. Let B ∈ R
n×n
and s, y ∈ R
n
arbitrary. If s = 0, then the
unique solution A =
¯
B to
min
A∈Q(y,s)
A−B
F
(1.32)
is given by
¯
B = B +
(y −Bs)s
T
s
T
s
.
Proof. We compute for any A ∈ O(y, s),

¯
B −B
F
=
(y −Bs)s
T
s
T
s
F
=
(A−B)ss
T
s
T
s
F
≤ A−B
F
ss
T
s
T
s
2
= A−B
F
.
Note that O(y, s) is a convex (in fact, aﬃne) subset of R
n×n
. Because the
Frobenius norm is strictly convex, the solution to (1.32) is unique on the
convex subset O(y, s).
We have not deﬁned yet what should be chosen for the initial approxima
tion B
0
to the Jacobian in the initial estimate, J
g
(x
0
). The ﬁnite diﬀerences
approximation turns out to be a good start. It also makes the minimum change
characteristics of Broyden’s update more appealing, as given in Lemma 1.20.
Another choice, that avoids the computation of J
g
(x
0
), is taking the initial
approximation equal to minus identity,
B
0
= −I. (1.33)
Suppose the function g is deﬁned by g(x) = f(x) − x, where f is the period
map of a dynamical process
x
k+1
= f(x
k
), k = 0, 1, . . . . (1.34)
A ﬁxed point of the process (1.34) is a zero of the function g. By choosing
B
0
= −I, the ﬁrst iteration of Broyden is just a dynamical simulation step,
x
1
= x
0
−B
−1
0
g(x
0
) = x
0
−(f(x
0
) −x
0
) = f(x
0
).
So, in this way, we let the system choose the direction of the ﬁrst step. In
addition, the initial Broyden matrix is easy to store and can be directly imple
mented in the computer code. This makes the reduction methods discussed
in Chapter 3 eﬀective.
We now apply the method of Broyden to the test function (A.5).
40 Chapter 1. An introduction to iterative methods
Example 1.21. Let g be the discrete integral equation function given by
(A.5). We deﬁne the initial condition x
0
by (A.6) and we set ε = 10
−12
. We
apply Algorithm 1.19 for diﬀerent dimensions of the problem. The convergence
results for the method of Broyden are described in Table 1.5. Although the
Broyden’s method needs more iterations to converge than Newton’s method, it
avoids the computation of the Jacobian. The method of Broyden method only
makes one function evaluation per iteration compared to the n + 1 function
evaluations of Algorithm 1.13. The rate of convergence again does not depend
on the dimension of the problem, see also Figure 1.7.
method n g(x
0
) g(x
k
∗) k
∗
R
Broyden 10 0.2518 4.8980 · 10
−14
21 1.3937
Broyden 100 0.7570 4.4398 · 10
−13
21 1.3412
Broyden 200 1.0678 6.3644 · 10
−13
21 1.3404
Table 1.5: The convergence results for Algorithm 1.19 applied to the discrete integral
equation function (A.5) for diﬀerent dimensions n.
0 2 4 6 8 10 12 14 16 18 20 22
10
−15
10
−10
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 1.7: The convergence rate of Algorithm 1.19 applied to the discrete integral
equation function (A.5) for diﬀerent dimensions n. [’◦’(n = 10), ’×’(n = 100), ’+’(n =
200)]
Example 1.22. Let g be the discrete integral equation function given by
(A.5). We apply Algorithm 1.19 to approximate the zero of g. As we did for
the NewtonChord method and the method of Newton we multiply the initial
condition x
0
, given by (A.6), by a factor 1, 10 and 100. The convergence results
1.3 The method of Broyden 41
for Broyden’s method are described in Table 1.6. For the initial condition
100x
0
the method of Broyden fails to converge.
method n factor g(x
0
) g(x
k
∗) k
∗
R
Broyden 100 1 0.7570 4.4398 · 10
−13
21 1.3412
Broyden 100 10 18.5217 8.7765 · 10
−13
33 0.9297
Broyden 100 100 3.8215 · 10
3
1.0975 · 10
+20
13 2.9151
Table 1.6: The convergence results for Broyden’s method 1.19 applied to the discrete
integral equation function (A.5) for diﬀerent initial conditions, x
0
, 10x
0
and 100x
0
.
Superlinear convergence
In order to prove the convergence of Broyden’s method, we ﬁrst need the
following extension of Lemma 1.11.
Lemma 1.23. Let g : R
n
→ R
n
be continuously diﬀerentiable in the open,
convex set T ⊂ R
n
, x ∈ T. If J
g
∈ Lip
γ
(T) then for every u and v in T
g(v) −g(u) −J
g
(x)(v −u) ≤ γ max¦v −x, u −x¦v −u. (1.35)
Moreover, if J
g
(x) is invertible, there exist ε > 0 and ρ > 0 such that
(1/ρ)v −u ≤ g(v) −g(u) ≤ ρv −u, (1.36)
for all u, v ∈ T for which max¦v −x, u −x¦ ≤ ε.
Proof. The proof of Equation (1.35) is similar to the proof of Lemma 1.11.
Equation (1.35) together with the triangle inequality implies that for u, v
satisfying max¦v −x, u −x¦ ≤ ε,
g(v) −g(u) ≤ J
g
(x)(v −u) +g(v) −g(u) −J
g
(x)(v −u)
≤ (J
g
(x) +γ max¦v −x, u −x¦)v −u
≤ (J
g
(x) +γε)v −u.
Similarly,
g(v) −g(u) ≥ J
g
(x)(v −u) −g(v) −g(u) −J
g
(x)(v −u)
≥
1
J
g
(x)
−1

−γ max¦v −x, u −x¦
v −u
≥
1
J
g
(x)
−1

−γε
v −u.
42 Chapter 1. An introduction to iterative methods
Thus if ε < (1/J
g
(x)
−1
γ), then 1/J
g
(x)
−1
 −γε > 0 and (1.36) holds if we
choose ρ large enough such that
ρ > J
g
(x) +γε,
and
1
ρ
<
1
J
g
(x)
−1

−γε.
In the next theorem it is necessary to use the Frobenius norm (1.31).
Because all norms in a ﬁnitedimensional vector space are equivalent, there is
a constant η > 0 such that
A ≤ ηA
F
, (1.37)
where . is the l
2
operator norm induced by the corresponding vector norm.
By L(R
n
) we denote the space of all linear maps from R
n
to R
n
, i.e., all
(n n)matrices. So, an element of the power set {¦L(R
n
)¦ is a set of linear
maps from R
n
to R
n
. The function Φ appearing in Theorem 1.24 is a set valued
function, that assigns to a couple of a vector x ∈ R
n
and a matrix B ∈ L(R
n
) a
set of matrices ¦
¯
B¦. This can consists of one single element and it can contain
B itself.
Theorem 1.24. Let g : R
n
→ R
n
be continuously diﬀerentiable in the open,
convex set T ⊂ R
n
, and assume that J
g
∈ Lip
γ
(T). Assume that there exists an
x
∗
∈ T such that g(x
∗
) = 0 and J
g
(x
∗
) is nonsingular. Let Φ : R
n
L(R
n
) →
{¦L(R
n
)¦ be deﬁned in a neighborhood N = N
1
N
2
of (x
∗
, J
g
(x
∗
)) where N
1
is contained in T and N
2
only contains nonsingular matrices. Suppose there
are nonnegative constants α
1
and α
2
such that for each (x, B) in N, and for
¯ x = x −B
−1
g(x), the function Φ satisﬁes

¯
B −J
g
(x
∗
)
F
≤
1 +α
1
max¦¯ x −x
∗
, x −x
∗
¦
B −J
g
(x
∗
)
F
+α
2
max¦¯ x −x
∗
, x −x
∗
¦ (1.38)
for each
¯
B in Φ(x, B). Then for arbitrary r ∈ (0, 1), there are positive constants
ε(r) and δ(r) such that for x
0
−x
∗
 < ε(r) and B
0
−J
g
(x
∗
)
F
< δ(r), and
B
k+1
∈ Φ(x
k
, B
k
), k ≥ 0, the sequence
x
k+1
= x
k
−B
−1
k
g(x
k
) (1.39)
is well deﬁned and converges to x
∗
. Furthermore,
x
k+1
−x
∗
 ≤ rx
k
−x
∗
 (1.40)
for each k ≥ 0, and ¦B
k
¦, ¦B
−1
k
¦ are uniformly bounded.
1.3 The method of Broyden 43
Proof. Let r ∈ (0, 1) be given and set β ≥ J
g
(x
∗
)
−1
. Choose δ(r) = δ and
ε(r) = ε such that
(2α
1
δ +α
2
)
ε
1 −r
≤ δ, (1.41)
and for η given by (1.37),
β(1 +r)(γε + 2ηδ) ≤ r. (1.42)
If necessary further restrict ε and δ so that (x, B) lies in the neighborhood N
whenever B−J
g
(x
∗
)
F
< 2δ and x−x
∗
 < ε. Suppose that B
0
−J
g
(x
∗
)
F
<
δ and x
0
−x
∗
 < ε. Then B
0
−J
g
(x
∗
) < ηδ < 2ηδ, and since (1.42) yields
2β(1 +r)ηδ ≤ r, (1.43)
Theorem 1.12 gives
B
−1
0
 ≤
β
1 −β2ηδ
≤
β
1 −r/(1 +r)
= (1 +r)β.
Lemma 1.23 now implies that
x
1
−x
∗
 ≤ x
0
−B
−1
0
g(x
0
) −x
∗

≤ B
−1
0

g(x
0
) −g(x
∗
) −J
g
(x
∗
)(x
0
−x
∗
)
+B
0
−J
g
(x
∗
)x
0
−x
∗

≤ β(1 +r)(γε + 2ηδ)x
0
−x
∗
,
and by (1.42) it follows that x
1
− x
∗
 ≤ rx
0
− x
∗
. Hence, x
1
− x
∗
 < ε,
and thus x
1
∈ T.
We complete the proof with an induction argument. Assume that both
B
k
−J
g
(x
∗
)
F
≤ 2δ and x
k+1
−x
∗
 ≤ rx
k
−x
∗
 for k = 0, 1, . . . , m−1. It
follows from (1.38) that
B
k+1
−J
g
(x
∗
)
F
−B
k
−J
g
(x
∗
)
F
≤ (α
1
B
k
−J
g
(x
∗
)
F
+α
2
) max¦x
k+1
−x
∗
, x
k
−x
∗
¦
≤ (2α
1
δ +α
2
) max¦rx
k
−x
∗
, x
k
−x
∗
¦
≤ (2α
1
δ +α
2
)r
k
x
0
−x
∗

≤ (2α
1
δ +α
2
)εr
k
,
and by summing both sides from k = 0 to m−1, we obtain
B
m
−J
g
(x
∗
)
F
≤ B
0
−J
g
(x
∗
)
F
+ (2α
1
δ +α
2
)
ε
1 −r
,
44 Chapter 1. An introduction to iterative methods
which by (1.41) implies that B
m
−J
g
(x
∗
) ≤ 2δ. To complete the induction
step we only need to prove that x
m+1
− x
∗
 ≤ rx
m
− x
∗
. This follows by
an argument similar to the one for m = 1. In fact, since B
m
−J
g
(x
∗
) ≤ 2ηδ,
Lemma 1.12 and (1.43) implies that
B
−1
m
 ≤ (1 +r)β,
and by Lemma 1.23 it follows that
x
m+1
−x
∗
 ≤ B
−1
m

g(x
m
) −g(x
∗
) −J
g
(x
∗
)(x
m
−x
∗
)
+B
m
−J
g
(x
∗
)x
m
−x
∗

≤ β(1 +r)(γε + 2ηδ)x
m
−x
∗
,
and x
m+1
−x
∗
 ≤ rx
m
−x
∗
 follows from (1.42).
Corollary 1.25. Assume that the hypotheses of Theorem 1.24 hold. If some
subsequence of ¦B
k
− J
g
(x
∗
)¦ converges to zero, then the sequence ¦x
k
¦
converges qsuperlinearly at x
∗
.
Proof. We would like to show that
lim
k→∞
x
k+1
−x
∗

x
k
−x
∗

= 0.
By Theorem 1.24 there are numbers ε(
1
2
) and δ(
1
2
) such that B
0
−J
g
(x
∗
)
F
<
δ(
1
2
) and x
0
−x
∗
 < ε(
1
2
) imply that x
k+1
− ≤
1
2
x
k
−x
∗
 for each k ≥ 0. Let
now r ∈ (0, 1) be given. We can choose m > 0 such that B
m
−J
g
(x
∗
)
F
< δ(r)
and x
m
− x
∗
 < ε(r). So, x
k+1
− x
∗
 ≤ rx
k
− x
∗
 for each k ≥ m. Since
r ∈ (0, 1) was arbitrary, the proof is completed.
It should be clear that some condition like the one in Corollary 1.25 is
necessary to guarantee qsuperlinear convergence. For example, the Newton
Chord iteration scheme, see Algorithm 1.16,
x
k+1
= x
k
−J
g
(x
0
)
−1
g(x
k
)
satisﬁes (1.38) with α
1
= α
2
= 0, but is, in general, only linearly convergent.
One of the interesting aspects of the result of the following theorem is that
qsuperlinear convergence is guaranteed for the method of Broyden, without
any subsequence of ¦B
k
−J
g
(x
∗
)¦ necessarily converging to zero.
1.3 The method of Broyden 45
Theorem 1.26. Let g : R
n
→ R
n
be continuously diﬀerentiable in the open,
convex set T ⊂ R
n
, and assume that J
g
∈ Lip γ(T). Let x
∗
be a zero of g, for
which J
g
(x
∗
) is nonsingular. Then the update function Φ(x, B) = ¦
¯
B [ s = 0¦,
where
¯
B = B + (y −Bs)
s
T
s
T
s
, (1.44)
is well deﬁned in a neighborhood N = N
1
N
2
of (x
∗
, J
g
(x
∗
)), and the corre
sponding iteration
x
k+1
= x
k
−B
−1
k
g(x
k
) (1.45)
with B
k+1
∈ Φ(x
k
, B
k
), k ≥ 0, is locally and qsuperlinearly convergent at x
∗
.
Before we can prove the theorem we need some preparations. The idea
of the proof of Theorem 1.26 is in the following manner. If
¯
B is given by
(1.44), then Lemma 1.23 and standard properties of the matrix norms .
2
and .
F
imply that there exists a neighborhood N of (x
∗
, J
g
(x
∗
)) such that
condition (1.38) is satisﬁed for every (x, B) in N. Subsequently, Theorem 1.24
yields that iteration (1.45) is locally and linearly convergent. The qsuperlinear
convergence is a consequence of the following two lemma’s.
Lemma 1.27. Let x
k
∈ R
n
, k ≥ 0. If ¦x
k
¦ converges qsuperlinearly to x
∗
∈
R
n
, then in any norm .,
lim
k→∞
x
k+1
−x
k

x
k
−x
∗

= 1.
Deﬁne the error in the current iteration e
k
by
e
k
= x
k
−x
∗
. (1.46)
The proof is drawn in Figure 1.8. Clearly, if
lim
k→∞
e
k+1

e
k

= 0, then lim
k→∞
s
k

e
k

= 1.
Proof (of Lemma 1.27). With e
k
given by (1.46) we compute
lim
k→∞
s
k

e
k

−1
= lim
k→∞
s
k
 −e
k

e
k

≤ lim
k→∞
s
k
+e
k

e
k

= lim
k→∞
e
k+1

e
k

= 0,
46 Chapter 1. An introduction to iterative methods
PSfrag replacements
x
k
x
k+1
s
k
e
k
e
k+1
x
∗
Figure 1.8: Schematic drawing of two subsequent iterates.
where the ﬁnal equality is the deﬁnition of qsuperlinear convergence if e
k
= 0
for all k.
Note that Lemma 1.27 is also of interest to the stopping criteria in our al
gorithms. It shows that whenever an algorithm achieves at least qsuperlinear
convergence, then any stopping test that uses s
k
is essentially equivalent to
the same test using e
k
, which is the quantity we are really interested in.
Lemma 1.28. Let T ⊆ R
n
be an open, convex set, g : R
n
→R
n
continuously
diﬀerentiable, and J
g
∈ Lip
γ
(T). Assume that J
g
(x
∗
) is nonsingular for some
x
∗
∈ T. Let ¦A
k
¦ be a sequence of nonsingular matrices in L(R
n
). Suppose
for some x
0
∈ T that the sequence of points generated by
x
k+1
= x
k
−A
−1
k
g(x
k
) (1.47)
remains in T, and satisﬁes lim
k→∞
x
k
= x
∗
, where x
k
= x
∗
for every k. Then
¦x
k
¦ converges qsuperlinearly to x
∗
in some norm . and g(x
∗
) = 0, if and
only if
lim
k→∞
(A
k
−J
g
(x
∗
))s
k

s
k

= 0 (1.48)
where s
k
= x
k+1
−x
k
.
Proof. Deﬁne e
k
= x
k
−x
∗
. First we assume that (1.48) holds, and show that
g(x
∗
) = 0 and that ¦x
k
¦ converges qsuperlinearly to x
∗
. Equation (1.47) gives
0 = A
k
s
k
+g(x
k
) = (A
k
−J
g
(x
∗
))s
k
+g(x
k
) +J
g
(x
∗
)s
k
,
so that
−g(x
k+1
) = (A
k
−J
g
(x
∗
))s
k
+ (−g(x
k+1
) +g(x
k
) +J
g
(x
∗
)s
k
), (1.49)
1.3 The method of Broyden 47
and
g
k+1

s
k

≤
A
k
−J
g
(x
∗
)s
k

s
k

+
 −g(x
k+1
) +g(x
k
) +J
g
(x
∗
)s
k

s
k

≤
A
k
−J
g
(x
∗
)s
k

s
k

+γ max¦x −x
∗
, x −x
∗
¦ (1.50)
where the second inequality follows from Lemma 1.23. Equation (1.50) to
gether with lim
k→∞
e
k
 = 0 and (1.48) gives
lim
k→∞
g(x
k+1
)
s
k

= 0. (1.51)
Since lim
k→∞
s
k
 = 0, it follows that
g(x
∗
) = lim
k→∞
g(x
k
) = 0.
From Lemma 1.23, there exist ρ > 0, k
0
≥ 0, such that
g(x
k+1
) = g(x
k+1
) −g(x
∗
) ≥
1
ρ
e
k+1
, (1.52)
for all k ≥ k
0
. Combining (1.51) and (1.52) gives
0 = lim
k→∞
g(x
k+1
)
s
k

≥ lim
k→∞
1
ρ
e
k+1

s
k

≥ lim
k→∞
1/ρ e
k+1

e
k
 +e
k+1

= lim
k→∞
1/ρ r
k
1 +r
k
,
where r
k
= e
k+1
/e
k
. This implies
lim
k→∞
r
k
= 0,
which completes the proof of qsuperlinear convergence.
The proof of the reverse implication, that qsuperlinear convergence and
g(x
∗
) = 0 imply (1.48), is the derivation above read in more or less the reversed
order. From Lemma 1.23, there exist ρ > 0, k
0
≥ 0, such that
g(x
k+1
) ≥ ρe
k+1

48 Chapter 1. An introduction to iterative methods
for all k ≥ k
0
. Therefore,
0 = lim
k→∞
e
k+1

e
k

≥ lim
k→∞
g(x
k+1
)
1/ρe
k

≥ lim
k→∞
ρ
g(x
k+1
)
s
k

s
k

e
k

. (1.53)
The qsuperlinear convergence implies that lim
k→∞
s
k
/e
k
 = 1 according
to Lemma 1.27. Together with (1.53) this gives that (1.51) holds. Finally,
from (1.49) and Lemma 1.23,
(A
k
−J
g
(x
∗
))s
k

s
k

≤
g(x
k+1
)
s
k

+
 −g(x
k+1
) +g(x
k
) +J
g
(x
∗
)s
k

s
k

≤
g(x
k+1
)
s
k

+γ max¦x −x
∗
, x −x
∗
¦,
which together with (1.51) and lim
k→∞
e
k
 = 0 proves (1.48).
Due to the Lipschitz continuity of J
g
, it is easy to show that Lemma 1.28
remains true if (1.48) is replaced by
lim
k→∞
(A
k
−J
g
(x
k
))s
k

s
k

= 0. (1.54)
This condition has an interesting interpretation. Because s
k
= −A
−1
k
g(x
k
),
Equation (1.54) is equivalent to
lim
k→∞
J
g
(x
k
)(s
N
k
−s
k
)
s
k

= 0,
where s
N
k
= −J
g
(x
k
)
−1
g(x
k
) is the Newton step from x
k
. Thus the necessary
and suﬃcient condition for the qsuperlinear convergence of a secant method
is that the secant steps converge, in magnitude and direction, to the Newton
steps from the same points.
After stating a ﬁnal lemma we are able to prove the main theorem of this
section.
Lemma 1.29. Let s ∈ R
n
be nonzero and E ∈ R
n×n
. Then
E
I −
ss
T
s
T
s
F
=
E
2
F
−
Es
s
2
1/2
(1.55)
≤ E
F
−
1
2E
F
Es
s
2
. (1.56)
1.3 The method of Broyden 49
Proof. Note that I −(ss
T
/s
T
s) is a Euclidean projection, and so is ss
T
/s
T
s.
So by the Pythagorean theorem,
E
2
F
=
E
ss
T
s
T
s
2
F
+
E
I −
ss
T
s
T
s
2
F
,
and the equality
E
ss
T
s
T
s
F
=
Es
s
,
we have proved (1.55). Because for any α ≥ [β[ ≥ 0, (α
2
−β
2
)
1/2
≤ α−β
2
/2α,
Equation (1.55) implies (1.56).
Proof (of Theorem 1.26). In order to be able use both Theorem 1.24 and
Lemma 1.28, we ﬁrst derive an estimate for 
¯
B − J
g
(x
∗
). Assume that ¯ x
and x are in T and s = 0. Deﬁne
¯
E =
¯
B − J
g
(x
∗
), E = B − J
g
(x
∗
),
¯ e = ¯ x −x
∗
, and e = x −x
∗
. Note that
¯
E =
¯
B −J
g
(x
∗
)
= B −J
g
(x
∗
) + (y −Bs)
s
T
s
T
s
= (B −J
g
(x
∗
))
I −
ss
T
s
T
s
+ (y −J
g
(x
∗
))
s
T
s
T
s
.
Therefore,

¯
E
F
≤
(B −J
g
(x
∗
))
I −
ss
T
s
T
s
F
+
y −J
g
(x
∗
)s
s
≤
E
I −
ss
T
s
T
s
F
+γ max¦¯ e, e¦. (1.57)
For the last inequality of (1.57) Lemma 1.23 is used. Because I − ss
T
/(s
T
s)
is an orthogonal projection it has l
2
norm equal to one,
I −
ss
T
s
T
s
= 1.
Therefore, the inequality (1.57) can be reduced to

¯
E
F
≤ E
F
+γ max¦¯ e, e¦. (1.58)
We deﬁne the neighborhood N
2
of J
g
(x
∗
) by
N
2
=
B ∈ L(R
n
) [ J
g
(x
∗
)
−1
 B −J
g
(x
∗
) <
1
2
¸
.
50 Chapter 1. An introduction to iterative methods
Then any B ∈ N
2
is nonsingular and satisﬁes
B
−1
 ≤
J
g
(x
∗
)
−1

1 −(J
g
(x
∗
)
−1
(B −J
g
(x
∗
))
≤ 2J
g
(x
∗
)
−1
.
To deﬁne the neighborhood N
1
of x
∗
, choose ε > 0 and ρ > 0 as in Lemma
1.23 so that max¦¯ x − x
∗
, x − x
∗
¦ ≤ ε implies that x and ¯ x belong to T
and that (1.36) holds, for u = x and v = ¯ x, i.e.,
(1/ρ)x − ¯ x ≤ g(x) −g(¯ x) ≤ ρx − ¯ x. (1.59)
In particular, if x −x
∗
 ≤ ε and B ∈ N
2
then x ∈ T and
s = B
−1
g(x) ≤ B
−1
g(x) −g(x
∗
) ≤ 2ρJ
g
(x
∗
)
−1
x −x
∗
.
Let N
1
be the set of all x ∈ R
n
such that x −x
∗
 <
ε
2
and
2ρJ
g
(x
∗
)
−1
x −x
∗
 <
ε
2
.
If N = N
1
N
2
and (x, B) ∈ N, then
¯ x −x
∗
 ≤ s +x −x
∗
 ≤ ε.
Hence, ¯ x ∈ T and moreover, (1.59) shows that s = 0 if and only if x = x
∗
. So,
the update function is well deﬁned in N. Equation (1.58) then shows that the
update function associated with the iteration (1.45) satisﬁes the hypotheses
of Theorem 1.24 and therefore, the algorithm according to (1.45) is locally
convergent at x
∗
. In addition we can choose r ∈ (0, 1) in (1.40) arbitrarily. We
take r =
1
2
, so
x
k+1
−x
∗
 ≤
1
2
x
k
−x
∗
 (1.60)
Considering Lemma 1.28, a suﬃcient condition for ¦x
k
¦ to converge q
superlinearly to x
∗
is
lim
k→∞
E
k
s
k

s
k

= 0. (1.61)
In order to justify Equation (1.61) we write Equation (1.57) as
E
k+1

F
≤
E
k
I −
s
k
s
T
k
s
T
k
s
k
F
+γ max¦e
k+1
, e
k
¦. (1.62)
Using Equation (1.60) and Lemma 1.29 in (1.62), we obtain
E
k+1

F
≤ E
k

F
−
E
k
s
k

2
2E
k

F
s
k

2
+γe
k
,
1.3 The method of Broyden 51
or
E
k
s
k

2
s
k

2
≤ 2E
k

F
E
k

F
−E
k+1

F
+γe
k

. (1.63)
Theorem 1.24 gives that ¦B
k
¦ is uniformly bounded for k ≥ 0. This implies
that there exists an M > 0 independently of k such that
E
k
 = B
k
−J
g
(x
∗
) ≤ B
k
 +J
g
(x
∗
) ≤ M.
By Equation (1.60) we obtain
∞
¸
k=0
e
k
 ≤ 2ε.
Thus from (1.63),
E
k
s
k

2
s
k

2
≤ 2M
E
k

F
−E
k+1

F
+γe
k

, (1.64)
and summing the left and right sides of (1.64) for k = 0, 1, . . . , m, yields
m
¸
k=0
E
k
s
k

2
s
k

2
≤ 2M
E
0

F
−E
m+1

F
+γ
m
¸
k=0
e
k

≤ 2M
E
0

F
+ 2εγ
≤ 2M
M + 2εγ
. (1.65)
Because (1.65) is true for any m ≥ 0, we obtain
∞
¸
k=0
E
k
s
k

2
s
k

2
< ∞,
which implies (1.61) and completes the proof.
The inverse notation of Broyden’s method
A restriction of the method of Broyden is that it is necessary to solve an
ndimensional system to compute the Broyden step, see Algorithm 1.19. To
avoid this problem, instead of the Broydenmatrix one could store the inverse
of this matrix, and the operation is reduced to a matrixvector multiplication.
If H
k
is the inverse of B
k
then
s
k
= −H
k
g(x
k
),
52 Chapter 1. An introduction to iterative methods
and the secant equation becomes
H
k+1
y
k
= s
k
. (1.66)
Equation (1.66) again does not deﬁne a unique matrix but a class of matrices.
In Section 1.3, the new Broyden matrix B
k+1
has been chosen so that,
in addition to the secant equation (1.25), it satisﬁes B
k+1
q = B
k
q in any
direction q orthogonal to s
k
. This was suﬃcient to deﬁne B
k+1
uniquely and
the update was given by (1.30).
It is possible, using Householder’s modiﬁcation formula, to compute the
new inverse Broyden matrix H
k+1
= B
−1
k+1
with very little eﬀort from H
k
.
Householder’s formula, also called the ShermanMorrison formula, states that
if A is a nonsingular (n n)matrix, u and v are vectors in R
n
, and (1 +
v
T
A
−1
u) = 0, then (A+uv
T
) is nonsingular and
(A+uv
T
)
−1
= A
−1
−
A
−1
uv
T
A
−1
1 +v
T
A
−1
u
. (1.67)
This formula is a particular case of the ShermanMorrisonWoodbury for
mula derived in the next theorem.
Theorem 1.30. Let A ∈ R
n×n
be nonsingular and U, V ∈ R
n×p
be arbitrary
matrices with p ≤ n. If (I +V
T
A
−1
U) is nonsingular then (A+UV
T
)
−1
exists
and
(A+UV
T
)
−1
= A
−1
−A
−1
U(I +V
T
A
−1
U)
−1
V
T
A
−1
. (1.68)
Proof. The formula (1.68) is easily veriﬁed by computing
(A
−1
−A
−1
U(I +V
T
A
−1
U)
−1
V
T
A
−1
)(A+UV
T
),
and
(A+UV
T
)(A
−1
−A
−1
U(I +V
T
A
−1
U)
−1
V
T
A
−1
),
that both yield the identity. Therefore A+UV
T
is invertible and the inverse
is given by (1.68).
Equation (1.67) gives that if s
T
k
H
k
y
k
= 0, then
H
k+1
= B
−1
k+1
= (B
k
+ (y
k
−B
k
s
k
)
s
T
k
s
T
k
s
k
)
−1
= B
−1
k
−(B
−1
k
y
k
−s
k
)
s
T
k
B
−1
k
s
T
k
B
−1
k
y
k
= H
k
+ (s
k
−H
k
y
k
)
s
T
k
H
k
s
T
k
H
k
y
k
. (1.69)
1.3 The method of Broyden 53
The iterative scheme
x
k+1
= x
k
−H
k
g(x
k
), k = 0, 1, 2, . . . ,
together with the rank one update (1.69) equals Algorithm 1.19.
Instead of assuming that B
k+1
q = B
k
q in any direction q orthogonal to s
k
,
we could also require that
H
k+1
q = H
k
q for q
T
y
k
= 0.
This is, in some sense, the complement of the ﬁrst method of Broyden. Since
H
k+1
satisﬁes (1.66), it is readily seen that for this method H
k+1
is uniquely
given by
H
k+1
= H
k
+ (s
k
−H
k
y
k
)
y
T
k
y
T
k
y
k
.
This update scheme, however, appears in practice to be unsatisfactory and is
called the second or ’bad’ method of Broyden [8].
54 Chapter 1. An introduction to iterative methods
Chapter 2
Solving linear systems with
Broyden’s method
One important condition for an algorithm to be a good iterative method is
that it should use a ﬁnite number of iterations to solve a system of linear
equations
Ax +b = 0, (2.1)
where A ∈ R
n×n
and b ∈ R
n
. As we have seen in Section 1.2, the method
of Newton satisﬁes this condition, that is, it solves a system of linear equa
tions in just one single iteration step. Although computer simulations indicate
that the method of Broyden satisﬁes ﬁnite convergence, for a long time it was
not possible to prove this algebraically. In 1979, fourteen years after Charles
Broyden proposed his algorithm, David Gay published a proof that Broyden’s
method converges in at most 2n iteration steps for any system of linear equa
tions (2.1) where A is nonsingular [22]. In addition Gay proved under which
conditions the method of Broyden needs exactly 2n iterations.
For many examples, however, it turns out that Broyden’s method needs
much less iterations. In 1981, Richard Gerber and Franklin Luk [23] published
an approach to compute the exact number of iterations that Broyden’s method
needs to solve (2.1).
In this chapter, we discuss the Theorems of Gay, Section 2.1, and of Ger
ber and Luk, Section 2.2, and we give examples to illustrate the theorems. In
Section 2.3, we show that the method of Broyden is invariant under unitary
transformations and in some weak sense also under nonsingular transforma
tions. This justiﬁes that we restrict ourselves to examples where A is in Jordan
canonical block form, cf. Section 4.2.
But ﬁrst we start again with the problem in the onedimensional setting.
55
56 Chapter 2. Solving linear systems with Broyden’s method
The onedimensional case
Consider the function g : R →R given by
g(x) = αx +β,
where α = 0. It is clear that Newton’s method converges in one iteration
starting from any initial point x
0
, diﬀerent from the solution x
∗
. Indeed,
x
1
= x
0
−
g(x
0
)
g
(x
0
)
= x
0
−
αx
0
+β
α
= −
β
α
.
and g(x
1
) = 0. It turns out that Broyden needs two iterations from the same
initial point x
0
= x
∗
, if b
0
∈ R is an arbitrary nonzero scalar, with b
0
= α. We
compute
s
0
= −
g(x
0
)
b
0
= −
α
b
0
x
0
−
β
b
0
.
So, if x
1
= x
0
+s
0
, then
g(x
1
) = α
1 −α/b
0
x
0
+
1 −α/b
0
β =
1 −α/b
0
(αx
0
+β).
The scalar b
0
is updated by
b
1
= b
0
+
g
1
s
0
= b
0
+
(1 −α/b
0
)(αx
0
+β)
−(αx
0
+β)/b
0
= b
0
−(b
0
−α) = α.
Thus after one iteration Broyden’s method succeeds to ﬁnd the derivative of
the function g. Therefore the method converges in the next iterations step,
that is,
x
2
= x
1
−
g(x
1
)
b
1
= x
1
−
α
b
1
x
1
−
β
b
1
= −
β
α
.
2.1 Exact convergence for linear systems
Suppose that g : R
n
→R
n
is an aﬃne function, that is, for x ∈ R
n
,
g(x) = Ax +b, (2.2)
where A : R
n×n
and b ∈ R
n
. The matrix A is assumed to be nonsingular.
For notational simplicity we denote g(x
k
) by g
k
. We consider the following
generalization of the method of Broyden.
2.1 Exact convergence for linear systems 57
Algorithm 2.1 (Generalized Broyden’s method). Choose x
0
∈ R
n
and
a nonsingular (n n)matrix H
0
. Compute s
0
:= −H
0
g(x
0
) and let k := 0.
Repeat the following sequence of steps as long as s
k
= 0.
i) x
k+1
:= x
k
+s
k
,
ii) y
k
:= g(x
k+1
) −g(x
k
),
iii) Choose v
k
such that the conditions
v
T
k
y
k
= 1, (2.3)
v
k
= H
T
k
u
k
, (2.4)
are satisﬁed, where u
T
k
s
k
= 0.
iv) H
k+1
:= H
k
+ (s
k
−H
k
y
k
)v
T
k
,
v) Compute s
k+1
:= −H
k+1
g(x
k+1
),
Property (2.3) establishes the inverse secant equation (1.66),
H
k+1
y
k
= s
k
.
Note that both properties are satisﬁed when Broyden’s ’good’ update is used,
i.e.,
v
k
= H
T
k
s
k
/(s
T
k
H
k
y
k
) for s
T
k
H
k
y
k
= 0.
The ’bad’ Broyden update v
k
= y
k
/(y
T
k
y
k
) clearly satisﬁes property (2.3), but
might non keep (2.4) invariant.
The updated Broyden matrix can be written as
H
k+1
= H
k
+ (s
k
−H
k
y
k
)v
T
k
= H
k
−H
k
(g
k
+y
k
)v
T
k
= H
k
(I −g
k+1
v
T
k
). (2.5)
With this relation the following lemma is easily been shown.
Lemma 2.2. If H
k
is invertible and v
k
satisﬁes conditions (2.3) and (2.4)
then H
k+1
is invertible as well.
Proof. The determinant of the matrix I −g
k+1
v
T
k
equals
det(I −g
k+1
v
T
k
) = 1 −v
T
k
g
k+1
= 1 −v
T
k
(y
k
+g
k
)
= 1 −1 −u
T
k
H
k
g
k
= u
T
k
s
k
.
58 Chapter 2. Solving linear systems with Broyden’s method
Because u
T
k
s
k
is assumed to be nonzero, this implies that H
k+1
is invertible if
H
k
is invertible.
According to the deﬁnition of the Broyden step, s
k
= −H
k
g
k
, the non
singularity of H
0
implies that s
k
= 0 if and only if g
k
= 0 for all k ≥ 0. Thus
the algorithm stops if and only if the zero of the function g is found.
Since g is a aﬃne function the yield of the step size can be expressed as
y
k
= As
k
. (2.6)
The matrix A is assumed to be nonsingular. So, (2.6) establishes that y
k
is
a nonzero vector throughout the execution of Algorithm 2.1. We also use the
relations
y
k
= −AH
k
g(x
k
), (2.7)
and
(I −AH
k+1
)y
k
= 0. (2.8)
Equation (2.7) implies that
g
k+1
= y
k
+g
k
= (I −AH
k
)g
k
. (2.9)
A theorem of Gay
In this section, we show that Algorithm 2.1 converges in at most 2n steps when
applied to an aﬃne function g : R
n
→ R
n
, given by (2.2), where A ∈ R
n×n
is nonsingular and b ∈ R
n
. This follows as an easy corollary to the following
lemma. The notation σ used below denotes the greatest integer less than or
equal to σ ∈ R.
In the proof of Lemma 2.3, we need the equalities
AH
k+1
= A(H
k
+ (s
k
−H
k
y
k
)v
T
k
) (2.10)
and
AH
k+1
= AH
k
(I −(I −AH
k
)g
k
v
T
k
), (2.11)
for which we have used (2.6) and (2.7). From (2.10) we deduce
I −AH
k+1
= (I −AH
k
)(I +AH
k
g
k
v
T
k
)
= (I −AH
k
)(I −y
k
v
T
k
). (2.12)
2.1 Exact convergence for linear systems 59
Lemma 2.3. If A ∈ R
n×n
and Algorithm 2.1 is applied to g(x) ≡ Ax − b
with the result that g
k
≡ g(x
k
) and y
k−1
are linearly independent, then for
1 ≤ j ≤ (k + 1)/2, the vectors
(AH
k−2j+1
)
i
g
k−2j+1
, 0 ≤ i ≤ j, (2.13)
are linearly independent.
Proof. We prove (2.13) by induction on j. The linearity of g implies that
y
k−1
= As
k−1
= −AH
k−1
g
k−1
, so (2.13) is easily seen to hold for j = 1, using
that y
k−1
= g
k
− g
k−1
. For the induction we proof that the vectors in (2.13)
are linearly independent for j = 2. The proof for 3 ≤ j ≤ (k +1)/2 is similar
and we refer to [22] for a complete derivation.
By (2.11) we have that
AH
k−1
= AH
k−2
(I −(I −AH
k−2
)g
k−2
v
T
k−2
)
Moreover, Equation (2.9) gives g
k−1
= (I −AH
k−2
)g
k−2
and therefore
(AH
k−1
)g
k−1
= AH
k−2
(I −(I −AH
k−2
)g
k−2
v
T
k−2
)g
k−1
= AH
k−2
(I −AH
k−2
)g
k−2
(1 −v
T
k−2
g
k−1
)
= (1 −v
T
k−2
g
k−1
)(I −AH
k−2
)AH
k−2
g
k−2
Since g
k−1
and (AH
k−1
)g
k−1
are linearly independent
(I −AH
k−2
)g
k−2
and (I −AH
k−2
)AH
k−2
g
k−2
are linearly independent as well. According to (2.8) we have
(I −AH
k−2
)y
k−3
= 0.
Therefore y
k−3
, g
k−2
and AH
k−2
g
k−2
are linearly independent. Note that, as
before,
g
k−2
= (I −AH
k−3
)g
k−3
and
(AH
k−2
)g
k−2
= (1 −v
T
k−3
g
k−2
)(I −AH
k−3
)AH
k−3
g
k−3
.
Since y
k−3
= AH
k−3
g
k−3
, we see that (AH
k−3
)
i
g
k−3
, 0 ≤ i ≤ 2, are linearly
independent. Therefore (2.13) holds for j = 2.
Theorem 2.4. If g(x) = Ax−b and A ∈ R
n×n
is nonsingular, then Algorithm
2.1 converges in at most 2n steps, i.e., g
k
= 0 for some k ≤ 2n.
60 Chapter 2. Solving linear systems with Broyden’s method
Proof. By Lemma 2.3 there exists a k with 1 ≤ k ≤ 2n − 1 such that g
k
and
y
k−1
are linearly dependent. The theorem clearly holds if g
k
= 0, so assume
that g
k
= 0, (whence g
k−1
= 0 too). Lemma 2.2 shows that H
l
is nonsingular
for l ≥ 0, so s
k−1
= 0. Because A is also nonsingular, we must have y
k−1
= 0
and hence g
k
= λy
k−1
for some λ = 0. According to (2.8) y
k−1
= AH
k−1
y
k−1
,
so g
k
= AH
k
g
k
, whence g
k+1
= g
k
−AH
k
g
k
= 0.
With the following example we illustrate Theorem 2.4.
Example 2.5. Consider the linear function g
1
(x) = A
1
x and g
2
(x) = A
2
x,
where
A
1
=
¸
¸
¸
2 1 0 0
0 2 1 0
0 0 2 1
0 0 0 2
¸
and A
2
=
¸
¸
¸
2 1 0 0
0 2 0 0
0 0 2 1
0 0 0 2
¸
. (2.14)
We apply the method of Broyden, see Algorithm 1.19, starting with the initial
matrix B
0
= −I and initial estimate x
0
= (1, 1, 1, 1). The rate of convergence
is given in Figure 2.1. Clearly, here the number of 2n iterations is an upper
bound for Algorithm 1.19 to obtain the exact zero of the function g
1
and g
2
.
0 1 2 3 4 5 6 7 8
10
−15
10
−10
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 2.1: The convergence rate of Algorithm 1.19, when solving A
i
x = 0, for i = 1, 2,
where A
1
and A
2
are deﬁned in (2.14) [’◦’: A
1
, ’×’: A
2
]
In Section 1.3, we have seen that the Broyden matrix B
k
does not neces
sarily converge to the Jacobian even if the sequence ¦x
k
¦ converges to x
∗
.
2.1 Exact convergence for linear systems 61
Lemma 2.6. Let g : R
n
→ R
n
be an aﬃne function, with nonsingular Jaco
bian A ∈ R
n×n
. Consider Algorithm 1.19, where s
T
k
B
−1
k
y
k
is nonzero in every
iteration k. Then for all k = 0, . . . , k
∗
,
B
k+1
−A
F
≤ B
k
−A
F
(2.15)
Proof. Because we assume that s
T
k
B
−1
k
y
k
= 0 for all k, Algorithms 1.19 and 2.1
are equivalent for v
k
= H
T
k
s
k
/(s
T
k
H
k
y
k
). Theorem (2.4) gives that the process
is well deﬁned and converges in a ﬁnite number of iterations.
Because g is aﬃne we have that y
k
= As
k
and according to the Broyden
update, we obtain
B
k+1
= B
k
+ (y
k
−B
k
s
k
)
s
T
k
s
T
k
s
k
B
k+1
−A = (B
k
−A)
I −
s
k
s
T
k
s
T
k
s
k
,
and by taken the Frobenius norm of both sides, we arrive at
B
k+1
−A
F
≤ B
k
−A
F
I −
s
k
s
T
k
s
T
k
s
k
≤ B
k
−A
F
.
In other words, the diﬀerence between the Jacobian and the Broyden ma
trix, is projected on an (n−1)dimensional subspace orthogonal to the Broyden
step s
k
. Therefore, the ﬁnal diﬀerence B
k
∗ − A
F
depends in particular on
the orthogonality of the Broyden steps ¦s
0
, . . . , s
k
∗¦.
Lemma 2.7. Let g : R
n
→ R
n
be an aﬃne function, with nonsingular Ja
cobian A ∈ R
n×n
. Consider Algorithm 2.1 and suppose for some k ≥ 1, that
y
k
= 0, v
T
k
y
k−1
= 0 and rank(I −AH
k
) = n−1 then rank(I −AH
k+1
) = n−1
and y
k
spans the kernel of (I −AH
k+1
).
Proof. The assumption that y
k
= 0 implies that y
k−1
= 0. According to (2.8)
we see that y
k
is in the kernel of (I −AH
k+1
). Similarly, since rank(I −AH
k
) =
n −1 the vector y
k−1
spans the kernel of (I −AH
k
). Any other null vector y
of (I −AH
k+1
) must (after scaling) satisfy (I −y
k
v
T
k
)y = y
k−1
. But v
k
spans
the kernel of (I − y
k
v
T
k
)
T
, and because, by assumption, v
T
k
y
k−1
= 0, y
k−1
is
not in the range of (I −y
k
v
T
k
). So, y
k
spans the kernel of (I −AH
k+1
).
Lemma 2.7 leads to the important observation that the sequence of ma
trices H
k
does not terminate with the inverse of the matrix A, at least in the
usual case in which all v
T
k
y
k−1
= 0. In fact, each matrix H
k
and A
−1
agree
only on a subspace of dimension one.
62 Chapter 2. Solving linear systems with Broyden’s method
2.2 Two theorems of Gerber and Luk
We consider the Broyden process again applied to compute a zero of the aﬃne
function g given by (2.2). Let Z
k
be deﬁned as the subspace spanned by the
Krylov sequence ¦g
k
, AH
k
g
k
, (AH
k
)
2
g
k
, . . .¦. So,
Z
k
= span¦g
k
, AH
k
g
k
, (AH
k
)
2
g
k
, . . .¦, (2.16)
for k ≥ 0. We will call subspace Z
k
the kth Krylov subspace. So, Z
0
will
be called the zeroth subspace. We already have proved that Algorithm 2.1
terminates at the kth iteration if and only if g(x
k
) = 0. Thus s
k
= 0 if and
only the dimension of Z
k
is zero.
We proceed to show how the Z
k
’s decrease in dimension and ﬁrst derive
several lemma’s.
Lemma 2.8. Let z
k+1
be any vector in Z
k+1
. Then there exists a vector z
k
in
Z
k
such that
z
k+1
= (I −AH
k
)z
k
.
Proof. It suﬃces to show that for j ≥ 0, there is a vector t
j
in Z
k
such that
(AH
k+1
)
j
g
k+1
= (I −AH
k
)t
j
.
We prove this by induction. If j = 0, we have t
0
= g
k
because of (2.9). Assume
there is a vector t
j
in Z
k
such that
(AH
k+1
)
j
g
k+1
= (I −AH
k
)t
j
.
By deﬁnition of H
k+1
, we obtain
(AH
k+1
)
j+1
g
k+1
= (AH
k+1
)(AH
k+1
)
j
g
k+1
= (AH
k
+A(s
k
−H
k
y
k
)v
T
k
)(I −AH
k
)t
j
= (I −AH
k
)AH
k
t
j
+c(I −AH
k
)y
k
,
where c = v
T
k
(I −AH
k
)t
j
. So,
(AH
k+1
)
j+1
g
k+1
= (I −AH
k
)t
j+1
,
where t
j+1
= AH
k
t
j
+cy
k
. By (2.7) the vector t
j+1
∈ Z
k
.
We immediately see that
Z
k+1
⊂ Z
k
. (2.17)
Another direct implication of Lemma 2.8 is formulated in the following lemma.
2.2 Two theorems of Gerber and Luk 63
Lemma 2.9. Let the vectors ¦t
1
, t
2
, . . . , t
d
¦ span Z
k
. Then the vectors
(I −AH
k
)t
1
, (I −AH
k
)t
2
, . . . , (I −AH
k
)t
d
span Z
k+1
.
Lemma 2.10. Let dimZ
k
= d + 1. If there is a nonzero vector w
k
in the
subspace Z
k
∩ Ker(I −AH
k
) then
w
k
=
d
¸
i=0
α
i
(AH
k
)
i
g
k
, α
d
= 0.
Proof. We have
0 = (I −AH
k
)w
k
= α
0
g
k
+
d
¸
i=1
(α
i
−α
i−1
)(AH
k
)
i
g
k
−α
d
(AH
k
)
d+1
g
k
.
Suppose that α
d
= 0. As the vectors (AH
k
)
i
g
k
for i = 0, 1, . . . , d are linearly
independent, we deduce that α
i
= 0 for i = 0, 1, . . . , d−1. But this contradicts
the assumption of a nonzero w
k
.
A consequence of Lemma 2.10 is that w
k
is unique up to a scalar multiple,
i.e., if there is a nonzero vector w
k
in Z
k
∩ Ker(I − AH
k
), then w
k
spans
Z
k
∩ Ker(I −AH
k
). Thus, with Lemma 2.9, we obtain the inequality
dimZ
k
−1 ≤ dimZ
k+1
≤ dimZ
k
. (2.18)
The following theorems state the basic result.
Theorem 2.11. If dimZ
k+1
= dimZ
k
, then dimZ
k+2
= dimZ
k+1
−1.
Theorem 2.12. If dimZ
k+1
= dimZ
k
−1 and
v
T
k
w
k
= 0, (2.19)
where w
k
spans Z
k
∩ Ker(I −AH
k
), then dimZ
k+2
= dimZ
k+1
.
Before we start to prove both theorems, a few remarks are in order on the
vector w
k
in Theorem 2.12. Since dimZ
k+1
= dimZ
k
− 1, Lemma 2.9 shows
that a nonzero w
k
must exist. In fact, it can be shown that w
k
= λy
k−1
for
some scalar λ, with the exception of the case where there exists a nonzero w
0
.
64 Chapter 2. Solving linear systems with Broyden’s method
Proof (of Theorem 2.11). Since dimZ
k+1
= dimZ
k
, the subspaces Z
k+1
Z
k
are identical by (2.17), and so y
k
∈ Z
k+1
. By (2.7)(2.8) and Lemma 2.10, y
k
spans
Z
k+1
∩ Ker(I −AH
k+1
).
Applying Lemma 2.9 completes the proof.
The next lemma is needed in the proof of Theorem 2.12.
Lemma 2.13. If there is a nonzero vector w
k
in Z
k
∩ Ker(I − AH
k
) and if
v
T
k
w
k
= 0, then Z
k+1
∩ Ker(I −AH
k+1
) = ¦0¦.
Proof. Suppose there is a nonzero vector w
k+1
in Z
k+1
∩Ker(I −AH
k+1
). First
we show that w
k+1
= λy
k
for some scalar λ. Then we will prove that y
k
is not
in Z
k+1
.
Assume w
k+1
= λy
k
for all nonzero scalars λ. Since w
k+1
∈ Z
k
and from
the equation
(I −AH
k+1
) = (I −AH
k
)(I −y
k
v
T
k
),
we deduce that
(I −y
k
v
T
k
)w
k+1
= αw
k
,
for some nonzero scalar α. But then
αv
T
k
w
k
= v
T
k
w
k+1
−v
T
k
y
k
v
T
k
w
k+1
= 0,
contradicting the assumption that v
T
k
w
k
= 0. So w
k+1
= λy
k
for some nonzero
scalar λ.
Now we show that y
k
is not in Z
k+1
. Let dimZ
k
= d + 1 where d ≥ 1. By
Lemma 2.10, the set of vectors
¦g
k
, AH
k
g
k
, . . . , (AH
k
)
d−1
g
k
, w
k
¦
is a basis for Z
k
. Assuming that y
k
∈ Z
k+1
, Lemma 2.9 implies that
y
k
= (I −AH
k
)
d−1
¸
i=0
β
i
(AH
k
)
i
g
k
+β
d
w
k
= (I −AH
k
)
d−1
¸
i=0
β
i
(AH
K
)
i
g
k
.
So, if d > 1, then
β
0
+ (β
1
−β
0
+ 1)AH
k
g
k
+. . .
+ (β
d−1
−β
d−2
)(AH
k
)
d−1
g
k
−β
d−1
(AH
k
)
d
g
k
= 0.
2.2 Two theorems of Gerber and Luk 65
For d = 1 we obtain
β
0
g
k
+ (1 −β
0
)AH
k
g
k
= 0.
Either case is impossible, as the vectors (AH
k
)
i
g
k
, i = 0, . . . , d, are linearly
independent. So y
k
/ ∈ Z
k+1
and hence Z
k+1
∩ Ker(I −AH
k+1
) = ¦0¦.
Proof (of Theorem 2.12). The proof follows directly from the Lemmas 2.9 and
2.13.
Theorems 2.11 and 2.12 imply ﬁnite termination of the method. Let
dimZ
0
= d
0
and dimZ
1
= d
1
. From (2.18), we see that d
0
− 1 ≤ d
1
≤ d
0
.
Applying the Theorems 2.11 and 2.12, we conclude that Algorithm 2.1 must
terminate in exactly d
0
+d
1
steps, if (2.19) is satisﬁed in every iteration where
dimZ
k+1
= dimZ
k
− 1. A weaker statement, though easier to check, is that
Broyden’s method needs at most 2d
0
iterations, which is a direct consequence
of 2.11 and (2.18).
Corollary 2.14. Let d
0
= dimZ
0
then Algorithm 2.1 needs at most 2d
0
iter
ations to converge.
Example 2.15. It turns out that in case of the function g
1
in Example 2.5
both the zeroth and the ﬁrst Krylov space of the Broyden process has dimen
sion 2 (= d
0
+ d
1
). This predicts the four iterations the method of Broyden
needs to solve g
1
(x) = 0.
In case of the function g
2
of Example 2.5, both the zeroth and the ﬁrst
Krylov space of the Broyden process has dimension 4. The method of Broyden
needs 8 iterations to solve the equation g
2
(x) = 0.
In the next example, we show (2.19) is a necessary condition for Theorem
2.12.
Example 2.16. Consider the linear function g(x) = Ax, where
A =
¸
−1 1 0
0 −1 1
0 0 −1
¸
.
If we apply the method of Broyden starting with the initial matrix B
0
= −I
and initial estimate x
0
= (1, 1, 1). Then we obtain the following process. Note
that the inverse of the Broyden also equal minus the identity, H
0
= −I.
The function value in x
0
equals g(x
0
) = Ax
0
= (0, 0, −1). Therefore the
zeroth Krylov space is given by
Z
0
= span¦g
0
, AH
0
g
0
, (AH
0
)
2
g
0
, . . .¦ = span
¸
0
0
−1
¸
,
¸
0
1
−1
¸
,
¸
−1
2
−1
¸
¸
,
66 Chapter 2. Solving linear systems with Broyden’s method
and the dimension of Z
0
equals three. The kernel of (I − AH
0
) is one
dimensional and spanned by the vector (−1, 0, 0). So the vector w
0
that spans
Z
0
∩ Ker(I −AH
0
) equals w
0
= (−1, 0, 0).
We see that the ﬁrst Broyden step equals s
0
= −H
0
g
0
= (0, 0, −1) and
so the last element of the iterate is nicely removed, x
1
= x
0
+ s
0
= (1, 1, 0).
With the yield of the Broyden step y
0
= (0, −1, 1) we compute the new inverse
Broyden matrix,
v
0
= H
T
0
s
0
/(s
T
0
H
0
y
0
) = (0, 0, 1),
and
H
1
= H
0
+ (s
0
−H
0
y
0
)v
T
0
=
¸
−1 0 0
0 −1 −1
0 0 −1
¸
.
The new Broyden matrix B
1
is given by
B
1
=
¸
−1 0 0
0 −1 1
0 0 −1
¸
.
The function value in x
1
equals g(x
1
) = Ax
1
= (0, −1, 0). It turn out that
the ﬁrst Krylov space is twodimensional and is given by
Z
1
= span¦g
1
, AH
1
g
1
, . . .¦ = span
¸
0
−1
0
¸
,
¸
1
−1
0
¸
¸
.
We thus have that d
0
= 3 and d
1
= 2. According to Theorem 2.12 the in
tersection of Z
1
and the kernel of (I − AH
1
) must be empty and dimZ
2
=
dimZ
1
, if v
T
0
w
0
= 0. However, in this example v
T
0
w
0
= 0. The kernel of
(I − AH
1
) is spanned by the vectors (−1, 0, 0) and (0,
1
2
√
2,
1
2
√
2). Therefore
Z
1
∩ Ker(I − AH
1
) is spanned by the nonzero vector w
1
= (−1, 0, 0). The
dimension of the second Krylov space, denoted by d
2
, equals one. Note that
w
0
and w
1
are parallel (here chosen to be identical).
The second Broyden step equals s
1
= −H
1
g
1
= (0, −1, 0) and so the second
element of the iterate is nicely removed, x
2
= x
1
+s
1
= (1, 0, 0). Together with
the yield of the Broyden step y
1
= (−1, 1, 0), we compute the new inverse
Broyden matrix, v
1
= H
T
1
s
1
/(s
T
1
H
1
y
1
) = (0, 1, 1), and
H
2
= H
1
+ (s
1
−H
1
y
1
)v
T
1
=
¸
−1 −1 −1
0 −1 −1
0 0 −1
¸
.
2.3 Linear transformations 67
The Broyden matrix B
2
is given by
B
2
=
¸
−1 1 0
0 −1 1
0 0 −1
¸
.
The Broyden matrix B
2
equals the Jacobian of the linear function. Hence, we
know that the Broyden process terminates in the next iteration. Again the
vector v
1
and w
1
are orthogonal.
The function value in x
2
equals g(x
2
) = Ax
2
= (−1, 0, 0) and the second
Krylov space is given by
Z
2
= span¦g
2
, . . .¦ = span
¸
−1
0
0
¸
¸
.
Because H
2
is the inverse of the Jacobian, A, the kernel of (I − AH
2
) is the
entire space. The vector w
2
is thus given by w
2
= (−1, 0, 0), and w
2
= w
1
=
w
0
.
The ﬁnal Broyden step equals s
2
= (−1, 0, 0) and the solution to the
problem is found, that is, x
3
= 0. The third Krylov space is given by Z
3
=
span¦g
3
, AH
3
g
3
, . . .¦ = ¦0¦.
2.3 Linear transformations
An important observation for our present approach is that Broyden’s method
is invariant under unitary transformations for general systems. We make this
precise in the next lemma.
Lemma 2.17. Let g : R
n
→ R
n
be a general function, and choose x
0
∈ R
n
and B
0
∈ R
n×n
. Let U be a unitary matrix. Consider Algorithm 1.19 starting
with ˜ x
0
= U
T
x
0
and
¯
B
0
= U
T
B
0
U, applied to the function ˜ g(z) = U
T
g(Uz),
z ∈ R
n
. Then for every k = 0, 1, . . . ,
˜ x
k
= U
T
x
k
and
¯
B
k
= U
T
B
k
U. (2.20)
In particular,
˜ g(˜ x
k
) = g(x
k
).
Proof. Statement (2.20) is easily proved using an induction principle. For
k = 0, Equation (2.20) follows from the assumptions. We compute
˜ x
k+1
= ˜ x
k
−
¯
B
−1
k
˜ g(˜ x
k
)
= U
T
x
k
−U
T
B
−1
k
UU
T
g(UU
T
x
k
)
= U
T
(x
k
−B
−1
k
g(x
k
)) = U
T
x
k+1
.
68 Chapter 2. Solving linear systems with Broyden’s method
Therefore
˜ g
k+1
= ˜ g(˜ x
k+1
) = U
T
g(UU
T
x
k+1
) = U
T
g(x
k+1
) = U
T
g
k+1
,
and
˜ s
k
= ˜ x
k+1
− ˜ x
k
= U
T
x
k+1
−U
T
x
k
= U
T
s
k
.
This leads to
¯
B
k+1
=
¯
B
k
+
(˜ y
k
−
¯
B
k
˜ s
k
)˜ s
T
k
˜ s
T
k
˜ s
k
=
¯
B
k
+
˜ g
k+1
˜ s
T
k
˜ s
T
k
˜ s
k
= U
T
B
k
U +
U
T
g(UU
T
x
k+1
)(U
T
s
k
)
T
(U
T
s
k
)
T
(U
T
s
k
)
= U
T
B
k
U +U
T
g(x
k+1
)s
T
k
s
T
k
UU
T
s
k
U
= U
T
(B
k
+
g(x
k+1
)s
T
k
s
T
k
s
k
)U = U
T
B
k+1
U.
So, (2.20) is true for every k = 0, 1, . . . and
˜ g(˜ x
k
) = U
T
g(UU
T
x
k
) = U
T
g(x
k
) = g(x
k
).
It might happen that a system is more or less singular. This is unproﬁtable
for the numerical procedures to solve this system. The question is whether
scaling of the system does change the rate of convergence of Broyden’s method.
For linear systems of equations we have the following result.
Lemma 2.18. Let g : R
n
→R
n
be an aﬃne function. Suppose, for a certain
choice of x
0
and H
0
, the dimension of the zeroth Krylov space Z
0
(2.16) is equal
to d
0
. Let U be a nonsingular matrix, and consider the Broyden process starting
with ˜ x
0
= U
−1
x
0
and
¯
B
0
= U
−1
B
0
U, applied to the function ˜ g(z) = U
−1
g(Uz),
z ∈ R
n
. Then the method of Broyden needs at most 2d
0
iterations to converge
exactly to the zero of ˜ g, i.e., ˜ g(˜ x
k
) = 0 for some k ≤ 2d
0
.
Proof. First note that
˜ g
0
= ˜ g(˜ x
0
) = U
−1
g(U˜ x
0
) = U
−1
g(UU
−1
x
0
) = U
−1
g(x
0
) = U
−1
g
0
.
2.3 Linear transformations 69
If we apply the linear transformation x → Ux, the zeroth Krylov space
¯
Z
0
built with ˜ g
0
and
¯
A
¯
H
0
becomes
¯
Z
0
= span¦˜ g
0
,
¯
A
¯
H
0
˜ g
0
, (
¯
A
¯
H
0
)
2
˜ g
0
, . . .¦
= span¦U
−1
g
0
, U
−1
AUU
−1
H
0
UU
−1
g
0
, (U
−1
AUU
−1
H
0
U)
2
U
−1
g
0
, . . .¦
= span¦U
−1
g
0
, U
−1
AH
0
g
0
, U
−1
(AH
0
)
2
g
0
, . . .¦ = U
−1
Z
0
Because U is of full rank, the dimensions of Z
0
and
¯
Z
0
are equal. Corollary
2.14 completes the proof.
70 Chapter 2. Solving linear systems with Broyden’s method
Chapter 3
Limited memory Broyden
methods
In the previous chapters, we saw that the method of Broyden has several
advantages. In comparison with the method of Newton it does not need ex
pensive calculation of the Jacobian of the function g. According to a clever
updating scheme of the Broyden matrix, every iteration step includes only
one function evaluation. This makes the method eﬃcient for problems where
the evaluation of g is very timeconsuming. Although Broyden’s method fails
to have local qquadratic convergence it is still qsuperlinearly convergent for
nonlinear equations and exact convergent for linear equations. In addition, the
method of Broyden turns out to be quite suitable for problems stemming from
applications, for example, from chemical reaction engineering, see Section 8.3.
A disadvantage of Broyden’s method arises if we consider highdimensional
systems of nonlinear equations, involving a large amount of memory to store
the n
2
elements of the Broyden matrix.
In this chapter, we develop a structure to reduce the number of storage
locations for the Broyden matrix. All methods described in this chapter are
based on the method of Broyden and reduce the amount of memory needed
for the Broyden matrix from n
2
storage locations to 2pn storage locations.
Therefore we call these algorithms limited memory Broyden methods. The
parameter p is ﬁxed during the iteration steps of a limited memory Broyden
method.
In Section 3.1, we describe how we can use the structure of the Broyden
update scheme, to write the Broyden matrix B as a sum of the initial Broyden
matrix B
0
and an update matrix Q, which is written as the product of two
(n p)matrices, Q = CD
T
. The initial Broyden matrix is set to minus the
71
72 Chapter 3. Limited memory Broyden methods
identity at every simulation (B
0
= −I). By applying a reduction to the rank
of Q in subsequent iterations of the Broyden process, the number of elements
to store never exceeds 2pn.
The Broyden Rank Reduction method is introduced in Section 3.2. This
method considers the singular value decomposition of Q, and applies the reduc
tion by truncating the singular value decomposition up to p−1 singular values.
We prove under which conditions of the pth singular value of the update ma
trix, the qsuperlinear convergence of the method of Broyden is retained. In
addition, we discuss several properties of the Broyden Rank Reduction method
that also gives more insight in the original Broyden process.
To increase the understanding of limited memory Broyden methods we
give in Section 3.3 a generalization of the Broyden Rank Reduction method.
In Section 3.4, we observe a limited memory Broyden method coming from
the work of Byrd et al. [12]. This approach cannot be trapped in the frame
work of Section 3.3 but due to its natural derivation, we have taken it into
consideration.
3.1 New representations of Broyden’s method
The updates of the ’good’ method of Broyden, Algorithm 1.19, are generated
by
B
k+1
= B
k
+ (y
k
−B
k
s
k
)
s
T
k
s
T
k
s
k
= B
k
+
g(x
k+1
)s
T
k
s
T
k
s
k
, (3.1)
with
s
k
= x
k+1
−x
k
and y
k
= g(x
k+1
) −g(x
k
).
Equation (3.1) implies that if an initial matrix B
0
is updated p times, the
resulting matrix B
p
can be written as the sum of the initial matrix B
0
and p
rank one matrices, that is,
B
p
= B
0
+
p−1
¸
k=0
(y
k
−B
k
s
k
)
s
T
k
s
T
k
s
k
= B
0
+CD
T
, (3.2)
where C = [c
1
, . . . , c
p
], D = [d
1
, . . . , d
p
] are deﬁned by
c
k+1
= (y
k
−B
k
s
k
)/s
k
, d
k+1
= s
k
/s
k
,
for k = 0, . . . , p −1.
3.1 New representations of Broyden’s method 73
The sum of all correction terms to the initial Broyden matrix B
0
in (3.2),
we call the update matrix. So, if Q denotes the update matrix, then
Q = CD
T
=
p
¸
k=1
c
k
d
T
k
. (3.3)
By choosing B
0
to be minus the identity (B
0
= −I), the initial Broyden
matrix can be implemented in the code for the algorithm. So, it suﬃces to
store the (np)matrices C and D. In addition, we take advantage of Equation
(3.3) to compute the product Qz for any vector z ∈ R
n
. The following lemma
is clear.
Lemma 3.1. Let Q = CD
T
, where C and D are arbitrary (n p)matrices.
Storing the matrices C and D requires 2pn storage locations. Furthermore the
computation of the matrix vector product Qz = C(D
T
z), with z ∈ R
n
, costs
2pn ﬂoating point operations.
In the next iteration step of the Broyden process, 2(p+1)n storage locations
are needed to store the Broyden matrix B
p+1
. In the following iteration step
2(p + 2)n storage locations are needed to store B
p+2
, etc. In case n is even,
after n/2 iterations of Broyden’s method, 2(n/2)n = n
2
storage locations are
needed, which equals the number of storage locations we need for the Broyden
matrix itself. In other words, this alternative notation for the Broyden matrix,
given by (3.2), is only useful if p can be kept small (p < n). However, if the
method of Broyden needs more than p iterations to converge, we have to reduce
the number of rankone matrices that forms the update matrix (3.3). We ﬁx
the maximal number of corrections to be stored at p. After p iterations of the
method of Broyden all columns of the matrices C and D are used. To make it
possible to proceed after these p iterations, the two next examples are obvious.
We remove all corrections made to the initial Broyden matrix and start all over
again, or, we freeze the Broyden matrix and neglect all subsequent corrections.
Example 3.2. Let g be the discrete integral equation function, given by
(A.5). As initial estimate we choose x
0
given by (A.6) and we set ε = 10
−12
.
We apply the original method of Broyden, Algorithm 1.19, where the updates
to the initial Broyden matrix are stored as in (3.3). After p iterations we
remove all stored corrections and restart the Broyden algorithm with initial
estimate x
0
= x
p
. The dimension of the problem is ﬁxed at n = 100, therefore
the initial residual is g(x
0
) ≈ 0.7570. The rate of convergence is given in
Figure 3.1. It turns out that for p = 10 the same number of iterations are
needed as for the original method of Broyden. For p = 3 and p = 5 a few more
74 Chapter 3. Limited memory Broyden methods
iterations are needed (24 and 26, respectively). However, for p = 2 about 92
iterations are needed.
0 5 10 15 20 25 30 35 40
10
−15
10
−10
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 3.1: The convergence rate of Algorithm 1.19 applied to the discrete integral
equation function (A.5) where after p iterations the Broyden process is restarted.
[’◦’(Broyden), ’×’(p = 10), ’’(p = 5), ’∗’(p = 4), ’’(p = 3), ’’(p = 2), ’6’(p = 1)]
Example 3.3. Let g be the discrete integral equation function, given by
(A.5). As initial estimate we choose x
0
given by (A.6) and we set ε = 10
−12
.
We apply the original method of Broyden, Algorithm 1.19, where the updates
to the initial Broyden matrix are stored as in (3.3). After p iterations all future
corrections are neglected. The dimension of the problem if ﬁxed at n = 100,
and the initial residual equals g(x
0
) ≈ 0.7570. The rate of convergence is
given in Figure 3.2. The method is divergent for every value of p. Note that for
p = 1 and p = 2 the convergence behavior is equal. This can be explained by
Table 3.1. The diﬀerence between B
1
and B
2
is relatively small and therefore
it makes no diﬀerence whether we freeze the Broyden matrix after the ﬁrst or
after the second iteration. Similarly the diﬀerences in l
2
norm between B
3
, B
4
and B
5
is of order 10
−1
and we see that the convergence behavior is equal for
p = 3, 4 and 5.
An appropriate name for the method used in Example 3.3 would be the
BroydenChord method. Unfortunately, the method does not work. The
method of Example 3.2 is more promising. However, it is worth investigating
whether it is possible to save more information about the previous iterations
of the process.
We introduce a more sophisticated approach. If p corrections to the initial
Broyden matrix are stored, the update matrix Q is the sum of p rankone
3.1 New representations of Broyden’s method 75
k B
k+1
− B
k
k B
k+1
− B
k
0 2.2125 10 1.893
1 0.052184 11 0.233
2 2.044 12 0.028375
3 0.11172 13 1.8165
4 0.19556 14 0.37294
5 2.014 15 0.0093723
6 0.0085303 16 1.9524
7 1.6974 17 0.10937
8 0.59031 18 0.060311
9 0.014515 19 1.9682
Table 3.1: The diﬀerence in l
2
norm between two subsequent Broyden matrices of
Algorithm 1.19 applied to the discrete integral equation function (A.5).
0 2 4 6 8 10 12 14 16 18 20 22
10
−10
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 3.2: The convergence rate of Algorithm 1.19 applied to the discrete inte
gral equation function (A.5), where after p iterations the Broyden matrix is frozen.
[’◦’(Broyden), ’×’(p = 10), ’’(p = 8), ’’(p = 7), ’+’(p = 6), ’’(p = 5), ’’(p = 3),
’’(p = 2), ’6’(p = 1)]
matrices, and has at most rank p. If we approximate the update matrix by a
matrix of lower rank q (q ≤ p − 1). This approximation, denoted by
¯
Q, can
be decomposed using two (n q)matrices,
¯
C and
¯
D. In this way, memory is
available to store p − q additional updates. Repeating this action every time
that p updates are stored, it is sure that the number of storage locations for
the Broyden matrix never exceeds 2pn.
In this chapter, we derive a number of limited memory Broyden methods
that based on trying to ’reduce’ the update matrix Q, given by (3.3). To gather
76 Chapter 3. Limited memory Broyden methods
all methods in a general updatereduction scheme, we propose the following
conditions.
• The parameter p is predeﬁned and ﬁxed (1 ≤ p ≤ n).
• The Broyden matrix B
k
is written as the sum of the initial Broyden
matrix B
0
and the update matrix Q.
• The update matrix Q is written as a product of two (n p)matrices C
and D, that is Q = CD
T
, with C = [c
1
, . . . , c
p
] and D = [d
1
, . . . , d
p
]. A
rankone update to the Broyden matrix is stored in a column of C and
the corresponding column of D.
• The current number of stored updates is denoted by m (0 ≤ m ≤ p).
The maximal number of updates to the initial Broyden matrix is thus
given by p.
• The initial Broyden matrix equals minus identity, B
0
= −I. We start
the limited memory Broyden process with the matrices C and D equal
to zero (m := 0).
• A new update is stored in column m + 1 of the matrices C and D.
If already p updates are stored (m = p), a reduction is applied to the
update matrix just before the next update is computed. The new number
of updates after the reduction is denoted by q (0 ≤ q ≤ p − 1). No
reduction is performed as long as m < p.
• When applying the reduction, the decomposition CD
T
of the update
matrix is optionally rewritten by
CD
T
= C(
¯
DZ)
T
= CZ
T
¯
D
T
=:
¯
C
¯
D
T
, (3.4)
where the matrix Z ∈ R
p×p
is nonsingular. Thereafter the last p − q
columns of the matrices C and D are set to zero (m := q).
After rewriting the matrices C and D, the columns are ordered in such
a way that the last p − q columns of the matrices C and D can be removed
to perform the reduction. So, the ﬁrst q columns are saved and the reduced
Broyden matrix is given by
¯
B = B
k
−
p
¸
l=q+1
c
l
d
T
l
= B
0
+
q
¸
l=1
c
l
d
T
l
. (3.5)
3.1 New representations of Broyden’s method 77
The new Broyden matrix
¯
B after the updating scheme becomes
¯
B =
¯
B + (y
k
−
¯
Bs
k
)s
T
k
/(s
T
k
s
k
). (3.6)
Only if m = p, a reduction to the update matrix is needed just before
storing the new correction. In the other case, if m < p, no reduction is applied
and the correction to the Broyden matrix is simply given by (3.1). This normal
update is stored in column m+1 of the matrices C and D. So, d
m+1
= s
k
/s
k

and
c
m+1
:=
y
k
−B
0
s
k
−
m
¸
l=1
c
l
d
T
l
s
k
/s
k

= g(x
k+1
)/s
k
. (3.7)
However, directly after a reduction is applied, care should be taken by comput
ing the next update. Since the Broyden matrix B
k
is replaced by a reduced
matrix
¯
B, (3.7) is no longer valid. Substituting (3.5) into (3.6) gives, with
m = q,
c
m+1
:=
y
k
−B
0
s
k
−
q
¸
l=1
c
l
d
T
l
s
k
/s
k
 (3.8)
or equivalently
c
m+1
:=
g(x
k+1
) +
p
¸
l=q+1
c
l
d
T
l
s
k
/s
k
. (3.9)
Note that the ﬁrst approach, (3.8) has the disadvantage that we have to
store the vector y
k
. The number q determines which approach is the cheapest
one in ﬂoating points operations. Especially if q = p−1, the second approach,
(3.9), is very attractive. The update is then reduced to
c
p
:= (g(x
k+1
) +c
p
d
T
p
s
k
)/s
k
.
In (3.9), the last p −q columns of C and D are still used to compute the new
update before they are set to zero. We proceed with the ﬁrst approach.
We are now ready to give the algorithm of a general limited memory Broy
den method.
Algorithm 3.4 (The limited memory Broyden method). Choose an
initial estimate x
0
∈ R
n
, set the parameters p and q, and let C = [c
1
, . . . , c
p
],
D = [d
1
, . . . , d
p
] ∈ R
n×p
be initialized by c
i
= d
i
= 0 for i = 1, . . . , p (m := 0).
Set k := 0 and repeat the following sequence of steps until g(x
k
) < ε.
78 Chapter 3. Limited memory Broyden methods
i) Solve (B
0
+CD
T
)s
k
= −g(x
k
) for s
k
,
ii) x
k+1
:= x
k
+s
k
,
iii) y
k
:= g(x
k+1
) −g(x
k
),
iv) If m = p deﬁne
¯
C = CZ
T
and
¯
D = DZ
−1
for a nonsingular matrix
Z ∈ R
p×p
and set c
i
= d
i
= 0 for i = q + 1, . . . , p (m := q),
v) Perform the Broyden update, i.e.,
c
m+1
:= (y
k
−B
0
s
k
−
¸
m
l=1
c
l
d
T
l
s
k
)/s
k

d
m+1
:= s
k
/s
k
,
and set m := m+ 1.
It actually turns out that one can avoid solving the large ndimensional
system B
k
s
k
= −g(x
k
), by using the ShermanMorrison formula (1.68). This
gives
(B
0
+CD
T
)
−1
= B
−1
0
−B
−1
0
C(I +D
T
B
−1
0
C)
−1
D
T
B
−1
0
. (3.10)
By inspection of (3.10) we see that (I + D
T
B
−1
0
C) is a (p p)matrix. So,
we only have to solve a linear system in R
p
. Due to our choice of the initial
Broyden matrix (B
0
= −I), the inverse is trivial, that is B
−1
0
= −I.
The new update to the Broyden matrix is always made after the reduction
to the update matrix Q. So, the limited memory Broyden method is still a
secant method. Equation (3.6) implies that
¯
Bs
k
=
¯
Bs
k
+ (y
k
−
¯
Bs
k
)s
T
k
s
k
/(s
T
k
s
k
) = y
k
.
Finally, note that in the ﬁrst p iteration steps no reduction takes place.
So, during these iterations the limited memory Broyden method is equivalent
to the method of Broyden. Since x
p+1
is computed still using the original, not
yet reduced, Broyden matrix B
p
. The diﬀerence between a limited memory
Broyden method and the method of Broyden itself can be detected only in
iteration step k = p + 2.
We make a ﬁrst attempt to reduce the number of columns of the matrices
C and D. The simplest thought is to do nothing with the columns of the
matrices C and D (so, Z = I) and if no additional corrections to the Broyden
matrix can be stored, free memory can be created by removing old updates.
We just make a selection of q updates that we would like to keep. The columns
of C and D corresponding to these updates are placed in the ﬁrst q columns
3.1 New representations of Broyden’s method 79
and hereafter the last p − q columns of both matrices are put to zero. After
the reduction additional updates can be stored for the next p −q iterations of
the Broyden process. We will discuss some of the basic choices for the updates
to save.
One possibility is removing the update matrix Q completely and start all
over again. Thus take q = 0 and remove all columns of C and D. Note,
however, that the Broyden process does not restart with the initial matrix
¯
B = B
0
, because directly after the reduction a new update is stored in the
ﬁrst columns of C and D. So the algorithm considered in Example 3.5 is indeed
diﬀerent from the algorithm of Example 3.2. Note also that, in this case, it is
superﬂuous to rewrite the matrices C and D, because all columns are removed.
Example 3.5. Let g be the discrete integral equation function, given by
(A.5). As initial estimate we choose x
0
given by (A.6) and we set ε = 10
−12
.
We apply Algorithm 3.4, where q is set to zero. Again the dimension is chosen
to be n = 100, and thus g(x
0
) = 0.7570. The rate of convergence is given in
Figure 3.3. For p = 2, 4, 5 and 10 more or less the same number of iterations
are needed as for the method of Broyden itself. Only for p = 3 more iterations
are needed to converge and for p = 1 the process does not converge at all.
0 5 10 15 20 25 30 35 40
10
−15
10
−10
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 3.3: The convergence rate of Algorithm 3.4 applied to the discrete integral
equation function (A.5) with q = 0. [’◦’(Broyden), ’×’(p = 10), ’’(p = 5), ’∗’(p = 4),
’’(p = 3), ’’(p = 2), ’6’(p = 1)]
Another possibility to reduce the update matrix is to remove the ﬁrst
column of both matrices, C and D, i.e., the oldest update of the Broyden
process. The parameter q is set to q := p −1. If Z ∈ R
p×p
is the permutation
80 Chapter 3. Limited memory Broyden methods
matrix
Z =
¸
¸
¸
¸
¸
0 1
.
.
.
.
.
.
.
.
.
1
1 0
¸
, (3.11)
step (iv) of Algorithm 3.4, then implies that
¯
C = CZ
T
=
c
2
c
p
c
1
and
¯
D = DZ
−1
=
d
2
d
p
d
1
.
Example 3.6. Let g be the discrete integral equation function given by (A.5).
As initial estimate we choose x
0
given by (A.6) and we set ε = 10
−12
. We apply
the Algorithm 3.4, where q is set to p −1 and Z is given by (3.11). We choose
n = 100 and thus g(x
0
) ≈ 0.7570. The rate of convergence is given in Figure
3.4. For p = 2 and p = 3 a few more iterations are needed than for the method
of Broyden. For all other values of p the convergence is much slower. For p = 1
we have no convergence, which was already known because this is exactly the
same case as the algorithm used in Example 3.5 with p = 1.
0 5 10 15 20 25 30 35 40
10
−15
10
−10
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 3.4: The convergence rate of Algorithm 3.4, applied to the discrete integral
equation function (A.5), with q = p−1 and Z given by (3.11). [’◦’(Broyden), ’×’(p =
10), ’’(p = 5), ’∗’(p = 4), ’’(p = 3), ’’(p = 2), ’6’(p = 1)]
The next approach of reduction is removing the last column of both ma
trices, C and D, i.e., the latest update of the Broyden process. So, again
3.1 New representations of Broyden’s method 81
q := p − 1, but now in step (iv) of Algorithm 3.4 the decomposition of the
update matrix is not rewritten (Z = I). Because the columns are removed
after the Broyden step, i.e., after x
k+1
is computed, this approach is not equal
to freezing the Broyden matrix. Besides the new update is still computed and
added to the Broyden matrix. Therefore, this method is a secant method, that
is, the new Broyden matrix B
k+1
satisﬁes the secant equation (1.25).
Example 3.7. Let g be the discrete integral equation function given by (A.5).
As initial estimate we choose x
0
given by (A.6) and we set ε = 10
−12
. We apply
Algorithm 3.4, where q is set to p −1 and Z = I. We choose n = 100 and thus
g(x
0
) ≈ 0.7570. The rate of convergence is given in Figure 3.5. The process
diverges for p = 4 and 5. For p = 2 and 3 the convergence is rather slow. Only
for p = 10 we have fast convergence. Note that we already have discussed the
case p = 1 in the previous two examples.
0 5 10 15 20 25 30 35 40
10
−15
10
−10
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 3.5: The convergence rate of Algorithm 3.4 applied to the discrete integral
equation function (A.5), with q = p − 1 and Z = I. [’◦’(Broyden), ’×’(p = 10),
’’(p = 5), ’∗’(p = 4), ’’(p = 3), ’’(p = 2), ’6’(p = 1)]
Instead of removing one single update, we can remove the ﬁrst two columns
of both matrices, C and D, the two oldest updates of the Broyden process.
In Section 1.3, we have seen that after the method of Broyden diverges for
one iteration, the next iteration it makes a large step in the right direction.
Perhaps two updates are in some way related. The parameter q is set to p−2.
82 Chapter 3. Limited memory Broyden methods
If Z ∈ R
p×p
is the permutation matrix
Z =
¸
¸
¸
¸
¸
¸
¸
¸
0 0 1
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
1
1
.
.
.
0
1 0
¸
(3.12)
step (iv) of Algorithm 3.4 implies that
¯
C = CZ
T
=
c
3
c
p
c
1
c
2
and
¯
D = DZ
−1
=
d
3
d
p
d
1
d
2
,
and subsequently the last two columns of C and D are set to zero.
Example 3.8. Let g be the discrete integral equation function given by (A.5).
As initial estimate we choose x
0
given by (A.6) and we set ε = 10
−12
. We apply
Algorithm 3.4, where q is set to p−2 and Z given by (3.12). We choose n = 100
and thus g(x
0
) ≈ 0.7570. The rate of convergence is given in Figure 3.6. The
method cannot be applied for p = 1. The rate of convergence is rather fast for
the smaller values of p, for p = 8 and p = 10 the process converges slower.
0 5 10 15 20 25 30 35 40
10
−15
10
−10
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 3.6: The convergence rate of Algorithm 3.4 applied to the discrete integral
equation function (A.5), with q = p − 2 and Z is given by (3.12). [’◦’(Broyden),
’×’(p = 10), ’’(p = 8), ’∗’(p = 4), ’’(p = 3), ’’(p = 2)]
3.2 Broyden Rank Reduction method 83
In the ﬁnal example of this section, we remove the last two columns of the
matrices C and D, when a reduction has to be applied. So, we remove the
two latest update of the Broyden process. The parameter q is set to p−2, and
again Z = I.
Example 3.9. Let g be the discrete integral equation function given by (A.5).
As initial estimate we choose x
0
given by (A.6) and we set ε = 10
−12
. We apply
Algorithm 3.4, where q is set to p − 2 and Z = I. We choose n = 100 and
thus g(x
0
) ≈ 0.7570. The rate of convergence is given in Figure 3.7. Again
the method cannot be applied for p = 1. It is Remarkable that for p = 5 and
p = 10 the rate of convergence is much lower.
0 5 10 15 20 25 30 35 40
10
−15
10
−10
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 3.7: The convergence rate of Algorithm 3.4 applied to the discrete integral
equation function (A.5), with q = p − 2 and Z = I. [’◦’(Broyden), ’×’(p = 10),
’’(p = 5), ’∗’(p = 4), ’’(p = 3), ’’(p = 2)]
Of course we could think of more fancy approaches for the selection of
the columns of C and D. Perhaps, it is interesting to remove all odd or all
even columns of the matrices C and D. A more serious approach would be
removing the p −q updates stored in the update matrix that are the smallest
in Frobenius norm. From Table 3.1 we can derive that these updates probably
are the 2nd, 4th and 5th update, etc, for the example used in this chapter.
3.2 Broyden Rank Reduction method
In this section, we arrive to the main work of this thesis. In Algorithm 3.4,
we have represented the Broyden matrix by
B
k
= B
0
+CD
T
,
84 Chapter 3. Limited memory Broyden methods
where C, D ∈ R
n×p
. We store the corrections to the initial Broyden matrix in
the columns of the matrices C = [c
1
, . . . , c
p
] and D = [d
1
, . . . , d
p
]. The update
matrix, denoted by Q, is deﬁned by
Q = CD
T
=
p
¸
l=1
c
l
d
T
l
. (3.13)
In Section 3.1, we have tried to reduce the rank of Q during the Broyden
process. We saw that for small values of p often has diﬃculties to converge
and if the process succeeds to converge, the rate of convergence might be low.
In addition, if the limited memory Broyden method converges for a certain
value for p it might diverges for a larger value for p. Clearly, we cannot tell yet
whether and when removing an update destroys the structure of the Broyden
matrix too much.
To introduce a new special limited memory Broyden method, we ﬁrst recall
some basic properties of singular values. Every real matrix A ∈ R
n×n
can be
written as
A = UΣV
T
= σ
1
u
1
v
T
1
+ +σ
n
u
n
v
T
n
(3.14)
where U = [u
1
, . . . , u
n
] and V = [v
1
, . . . , v
n
] are orthogonal matrices and
Σ = diag(σ
1
, . . . , σ
n
). The real nonnegative numbers σ
1
≥ . . . ≥ σ
n
≥ 0 are
called the singular values of A. Because, for i = 1, . . . , n,
A
T
Av
i
= V ΣU
T
UΣV
T
v
i
= σ
2
i
v
i
,
AA
T
u
i
= UΣV
T
V ΣU
T
u
i
= σ
2
i
u
i
,
the column vectors of U are the eigenvectors of AA
T
and are called the left
singular vectors of A, and the column vectors of V are the eigenvectors of A
T
A
and are called the right singular vectors of A. The rank of a matrix A equals p if
and only if σ
p
is the smallest positive singular value, i.e., σ
p
= 0 and σ
p+1
= 0.
The following basic theorem yields that the best rankp approximation of a
matrix A is given by the ﬁrst p terms of the singular value decomposition.
The proof can be found in [27].
Theorem 3.10. Let the singular value decomposition of A ∈ R
n×n
be given
by (3.14). If q < r = rank A and
A
q
=
q
¸
k=1
σ
k
u
k
v
T
k
then
min
rank B=q
A−B = A−A
q
 = σ
q+1
,
where . denotes the l
2
matrix norm.
3.2 Broyden Rank Reduction method 85
An interpretation of this theorem is that the largest singular values of a
matrix A contain the most important information of the matrix A. The theory
of singular values can be extended to rectangular matrices, see again [27].
This leads us to consider the following reduction procedure for the limited
memory Broyden method. Compute the singular value decomposition of the
update matrix Q. Because the rank of Q is less or equal to p, the singular
value decomposition can be reduced to
Q = σ
1
u
1
v
T
1
+ +σ
p
u
p
v
T
p
.
Next choose q and remove the smallest p − q singular value and their corre
sponding left and right singular vectors from the singular value decomposition
of Q.
In other words, considering the general Algorithm 3.4, for step (iv) the
singular value decomposition of Q is computed and stored in the matrices C
and D. Then, by setting the last p − q columns of both matrices to zero the
last p − q terms of the singular value decomposition are removed. This leads
to the best rank q approximation of the update matrix Q that is available in
the l
2
norm.
A problem we still have to deal with is that we do not want to compute the
(n n)update matrix Q explicitly. So, the question is how we can determine
the singular values of this matrix.
Using the QRdecomposition of D =
¯
DR we observe that Q can be written
as
CD
T
= C(
¯
DR)
T
= CR
T
¯
D
T
=
¯
C
¯
D
T
,
where
¯
D is orthogonal. Now, using the singular value decomposition of
¯
C =
UΣW
T
, we see that
¯
C
¯
D
T
= (UΣW
T
)
¯
D
T
= (UΣ)(
¯
DW)
T
=
´
C
´
D
T
.
Because W and
¯
D are orthogonal matrices, the product
´
D is orthogonal as
well. Therefore,
´
C
´
D
T
represents an economic version of the singular value
decomposition of the update matrix. The matrix Z in Algorithm 3.4 is given
by Z = W
−1
R.
Note that the singular values of
¯
C are the square roots of the eigenvalues
of
¯
C
T
¯
C which is an (p p)matrix. In addition, the matrix W consists of
the eigenvectors of
¯
C
T
¯
C. The right singular vectors of Q are obtained using
these eigenvectors. So the ndimensional problem of the computation of the
singular values of Q has, in fact, become a pdimensional problem.
This limited memory Broyden method in which we remove the pth singu
lar value of the update matrix in every iteration, is called the Broyden Rank
86 Chapter 3. Limited memory Broyden methods
Reduction (BRR) method. In applications it turns out to be a very eﬃcient
algorithm to solve highdimensional systems of nonlinear equations. The the
oretical justiﬁcation of the BRR method is given in Theorem 3.12 where we
show under which conditions the method is qsuperlinear convergent. The
general Algorithm 3.4 can be replaced by this new algorithm.
Algorithm 3.11 (The Broyden Rank Reduction method). Choose an
initial estimate x
0
∈ R
n
, set the parameters p and q, and let C = [c
1
, . . . , c
p
],
D = [d
1
, . . . , d
p
] ∈ R
n×p
be initialized by c
i
= d
i
= 0 for i = 1, . . . , p (m := 0).
Set k := 0 and repeat the following sequence of steps until g(x
k
) < ε.
i) Solve (I −D
T
C)t
k
= D
T
g(x
k
) for t
k
,
ii) s
k
:= g(x
k
) +Ct
k
,
iii) x
k+1
:= x
k
+s
k
,
iv) y
k
:= g(x
k+1
) −g(x
k
),
v) Compute the QRdecomposition of D =
¯
DR,
D :=
¯
D, C := CR
T
,
vi) Compute the SVD of C = UΣW
T
, (σ
1
≥ ≥ σ
p
)
C := UΣ, D := DW,
vii) If m = p then set c
i
= d
i
= 0 for i = q + 1, . . . , p (m := q),
viii) Perform the Broyden update, i.e.,
c
m+1
:= (y
k
+s
k
−
¸
m
l=1
c
l
d
T
l
s
k
)/s
k

d
m+1
:= s
k
/s
k
,
and set m := m+ 1.
In Algorithm 3.11, we compute the singular value decomposition of Q in
every iteration, even if we apply no reduction, in order to obtain a better
understanding of the importance of the updates to the Broyden matrix. For
economical reasons it would be better to compute the singular value decom
position
Note that in step (v) of the ﬁrst iteration of Algorithm 3.11 the QR
decomposition is computed of a zero matrix. Since, however, the matrix R is
then set to zero, we can choose any orthogonal matrix Q without disturbing
the procedure.
3.2 Broyden Rank Reduction method 87
To proof the qsuperlinear convergence of a limited memory Broyden method
we observe that before a Broyden update is applied, the Broyden matrix is
reduced using a reduction matrix R,
¯
B = B −R.
The new updated Broyden matrix
¯
B is therefore given by
¯
B =
¯
B + (y −
¯
B)s
T
/(s
T
s) = B + (y −Bs)
s
T
s
T
s
−R
I −
ss
T
s
T
s
. (3.15)
Comparable to the proof of the convergence of Broyden’s method, Theorem
1.26, we estimate the diﬀerence between the new Broyden matrix and the
Jacobian of g at x
∗
. It follows that
¯
B −J
g
(x
∗
) = (B −J
g
(x
∗
))
I −
ss
T
s
T
s
+ (y −J
g
(x
∗
)s)
s
T
s
T
s
−R
I −
ss
T
s
T
s
.
Thus instead of (1.58) we obtain

¯
B −J
g
(x
∗
)
F
≤ B −J
g
(x
∗
)
F
+γ max¦¯ x −x
∗
, x −x
∗
¦ +R (3.16)
According to Theorem 1.24, a general limited memory Broyden method would
be linearly convergent to x
∗
if the norm R of the reduction R can be esti
mated by the length of the Broyden step s, because then
R ≤ s ≤ 2 max¦¯ x −x
∗
, x −x
∗
¦. (3.17)
So, (1.38) is satisﬁed for all
¯
B ∈ Φ(x, B) where Φ : R
n
L(R
n
) →{¦L(R
n
)¦
is deﬁned as Φ(x, B) = ¦
¯
B [ R ≤ s, s = 0¦ and
¯
B by (3.15). This leads
to the following theorem.
Theorem 3.12. Let g : R
n
→ R
n
be continuously diﬀerentiable in the open,
convex set T ⊂ R
n
, and assume that J
g
∈ Lip γ(T). Let x
∗
be a zero of g,
for which J
g
(x
∗
) is nonsingular. Then the update function Φ(x, B) = ¦
¯
B :
R
F
< s, s = 0¦, where
¯
B = B + (y −Bs)
s
T
s
T
s
−R
I −
ss
T
s
T
s
,
is well deﬁned in a neighborhood N = N
1
N
2
of (x
∗
, J
g
(x
∗
)), and the corre
sponding iteration
x
k+1
= x
k
−B
−1
k
g(x
k
)
with B
k+1
∈ Φ(x
k
, B
k
), k ≥ 0, is locally and qsuperlinearly convergent at x
∗
.
88 Chapter 3. Limited memory Broyden methods
The proof of the qsuperlinear convergence of a limited memory Broyden
method is identical to the proof of Theorem 1.26. Due to (3.16) and (3.17),
the inequality in (1.57) holds, if γ is replaced by γ + 2.
We apply Algorithm 3.11 to our test function.
Example 3.13. Let g be the discrete integral equation function given by
(A.5). As initial estimate we choose x
0
given by (A.6) and we set ε = 10
−12
.
We apply Algorithm 3.11, for diﬀerent values of p, and q := p−1. So, we remove
only the smallest singular value starting from the pth iteration. In Table 3.2,
the convergence results for the BRR method are given, for diﬀerent values of
p. It turns out that the method only converges for p ≥ 7, however in those
cases the rate of convergence is exactly the same as the rate of convergence of
Broyden’s method, see Figure 3.8. In Figure 3.9, we consider the ratio between
the removed singular value σ
p
and the size of the Broyden step s
k−1
 in the
kth iteration, k = 0, . . . , k
∗
. It is clear that for every p the quotient σ
p
/s
k−1

eventually increases. If this quotient becomes of order one, the BRR method
get diﬃculties to achieve the fast convergence and the process starts to deviate
from the convergence of the method of Broyden.
method n p g(x
0
) g(x
k
∗) k
∗
R
Broyden 100  0.7570 4.4433 · 10
−13
21 1.3411
BRR 100 10 0.7570 4.4433 · 10
−13
21 1.3411
BRR 100 8 0.7570 4.4469 · 10
−13
21 1.3411
BRR 100 7 0.7570 3.6068 · 10
−13
21 1.3511
BRR 100 6 0.7570 1.6256 · 10
−3
200 0.0307
BRR 100 5 0.7570 3.4376 · 10
−2
200 0.0155
BRR 100 4 0.7570 1.7514 · 10
−1
200 0.0073
BRR 100 3 0.7570 1.9912 · 10
+22
160 −0.3227
BRR 100 2 0.7570 4.6464 · 10
+30
55 −1.2889
BRR 100 1 0.7570 3.2381 · 10
−2
200 0.0158
Table 3.2: Characteristics of Algorithm 3.11 applied to the discrete integral equation
function (A.5), with q = p − 1.
We can conclude that the Broyden Rank Reduction method converges as
fast as the method of Broyden as long as the quotient σ
p
/s
k−1
 remains small.
If the quotient grows, we cannot control the convergence process.
Instead of removing the smallest singular value we also could remove other
singular values from the SVD of the update matrix Q. If we want to remove
the largest singular value in every iteration, we have to include an intermediate
3.2 Broyden Rank Reduction method 89
0 5 10 15 20 25 30 35 40
10
−15
10
−10
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 3.8: The convergence rate of Algorithm 3.11 applied to the discrete integral
equation function (A.5), with q = p − 1. [’◦’(Broyden), ’×’(p = 10), ’’(p = 8),
’’(p = 7), ’+’(p = 6), ’’(p = 5), ’∗’(p = 4), ’’(p = 3), ’’(p = 2), ’6’(p = 1)]
0 5 10 15 20 25 30 35 40
10
−10
10
0
10
10
PSfrag replacements
iteration k
q
u
o
t
i
e
n
t
σ
p
/
s
k
−
1
Figure 3.9: The quotient σ
p
/s
k−1
for Algorithm 3.11, applied to the discrete integral
equation function (A.5), with q = p − 1. [’◦’(Broyden), ’×’(p = 10), ’’(p = 8),
’’(p = 7), ’+’(p = 6), ’’(p = 5), ’∗’(p = 4), ’’(p = 3), ’’(p = 2), ’6’(p = 1)]
step. After computing the singular value decomposition of the update matrix
in step (vi), we additionally permute the columns of the matrices C and D, so
that the ﬁrst column of both matrices is moved to the last column. In other
words, we apply Algorithm 3.4 where q is set to q = p − 1 and the matrix Z
90 Chapter 3. Limited memory Broyden methods
is equal to
Z =
¸
¸
¸
¸
¸
0 1
.
.
.
.
.
.
.
.
.
1
1 0
¸
W
−1
R. (3.18)
Example 3.14. Let g be the discrete integral equation function given by
(A.5). As initial estimate we choose x
0
given by (A.6) and we set ε = 10
−12
.
We apply Algorithm 3.4 with q := p − 1 and Z deﬁned by (3.18) So, we
remove only the largest singular value starting from the pth iteration. In
Figure 3.10, we observe that the process diverges shortly after we remove the
largest singular value from the singular value decomposition of the update
matrix.
0 5 10 15 20 25 30 35 40
10
−15
10
−10
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 3.10: The convergence rate of Algorithm 3.11 applied to the discrete integral
equation function (A.5), with q = p−1 and Z given by (3.18). [’◦’(Broyden), ’×’(p =
10), ’’(p = 7), ’’(p = 5), ’∗’(p = 4), ’’(p = 3), ’6’(p = 1)]
The computations
In order to compute the singular value decomposition of the update matrix
Q = CD
T
, we use two steps. First we make the matrix D orthogonal and
then we compute the singular value decomposition of the matrix
¯
C. In these
steps two important matrices are involved. We now have a closer view to the
the matrix R of the QRdecomposition of D and the matrix W, containing the
eigenvectors of
¯
C
T
¯
C.
3.2 Broyden Rank Reduction method 91
Note that after p iterations of the BRR process, the matrix D is nearly
orthogonal. The ﬁrst p − 1 columns denoted by v
1
, . . . , v
p−1
are the right
singular vectors of the previous update matrix, and form an orthonormal set
in the R
n
. Let cd
T
be the new rankone update to the Broyden matrix, then
the last column of D contains the vector d. The decomposition of the update
matrix is rewritten by
CD
T
= C(
¯
DR)
T
= CR
T
¯
D
T
=
¯
C
¯
D
T
,
where
¯
DR is the QRdecomposition of D. So, R has the structure
R =
¸
¸
¸
¸
1 r
1p
.
.
.
.
.
.
1 r
1,p−1
r
pp
¸
where the last column (r
1p
, . . . , r
pp
) describes how the new vector d is dis
tributed over the old ’directions’ of the update matrix. In fact r
lp
= v
T
l
d,
l = 1, . . . , p − 1, and r
pp
normalizes the new vector
˜
d
p
after the orthog
onalization. The matrix R is invertible if and only if r
pp
= 0, that is, if
d / ∈ span¦v
1
, . . . , v
p−1
¦. The inverse matrix is then given by
R
−1
=
¸
¸
¸
¸
1 −r
1p
/r
pp
.
.
.
.
.
.
1 −r
1,p−1
/r
pp
1/r
pp
¸
,
and
¯
D = DR
−1
, thus
˜
d
p
=
1
r
pp
d −
p−1
¸
l=1
r
lp
r
pp
v
l
,
which is equivalent to the GrammSchmidt orthogonalization of d with re
spect to the orthonormal set ¦v
1
, . . . , v
p−1
¦. On the other hand, if r
pp
= 0
then d ∈ span¦v
1
, . . . , v
p−1
¦ and
˜
d
p
can be any vector orthogonal to the set
¦v
1
, . . . , v
p−1
¦.
In order to obtain the singular value decomposition of Q, the eigenvectors
of
¯
C
T
¯
C are computed and stored in the (p p)matrix W. The right singular
vectors of Q are obtained by multiplying W from the left by
¯
D. So,
¯
C
¯
D
T
= (UΣW
T
)
¯
D
T
= (UΣ)(
¯
DW)
T
=
´
C
´
D
T
.
92 Chapter 3. Limited memory Broyden methods
After the ﬁrst p iterations of the BRR process, W has no particular structure
W =
¸
¸
w
11
w
1p
.
.
.
.
.
.
.
.
.
w
p1
w
pp
¸
.
Nothing can be said about the entries of the matrix, because a rankone per
turbation of a matrix can disturb the singular value decomposition completely.
On the other hand, if
¯
C
¯
D
T
is already in SVDformat then W = I. The matrix
W tells us how we have to turn the columns of
¯
D to obtain the right singular
vectors of Q. By considering the diagonal of W we can observe whether or
not the update to the Broyden matrix changes the form of the singular value
decomposition.
Tables 3.3 and 3.4 can be explained in the following way. Because ini
tially the matrices C and D are zero, the singular values decomposition of the
updatematrices does not have to be computed in the ﬁrst iteration. For k = 2
it is trivial that all but the ﬁrst element of the ﬁrst column of R are equal to
zero, since m = 1. For k = 3 the element [r
12
[ is close to one. This implies
that the ﬁrst two Broyden steps, s
0
and s
1
, point in more or less the same
direction. Note that the diﬀerence between the Broyden matrices B
1
and B
2
is small, see Table 3.1. The diagonal of W shows that the update matrix is
in singular value decomposition format, in spite of the addition of a rankone
matrix. In the fourth iteration (k = 4) a second direction orthogonal to the
ﬁrst is involved. According to the diagonal of W the two directions have to
be adjusted slightly to obtain the singular value decomposition. In the sixth
iteration (k = 6) the fourth and ﬁfth direction are twisted ([w
44
[, [w
55
[ = 1).
Note that the singular values corresponding to these directions are small. Di
rectly after the introduction of a third direction in iteration 7 the second and
third direction are twisted. So, a direction is found that is more important
than the second direction of the last iteration. In iteration k = 8 the Broyden
step lies mainly in this new direction. Note that the ﬁrst direction obtained by
Broyden’s method, remains the principal direction in all subsequent iterations.
3.2 Broyden Rank Reduction method 93
k m
r
1m
 · · · r
pm

w
11
 · · · w
pp

2 1
0
1
0
1
3 2
0
1
0
1
4 3
0
1
0
1
5 4
0
1
0
1
6 5
0
1
0
1
7 6
0
1
0
1
8 7
0
1
0
1
9 7
0
1
0
1
10 7
0
1
0
1
11 7
0
1
0
1
Table 3.3: The absolute values of the elements of column m of R and the diagonal
of W during the BRR process, Algorithm 3.11 with p = 7 and q = 6, applied to the
discrete integral equation function (A.5) (n = 100).
94 Chapter 3. Limited memory Broyden methods
k m
r
1m
 · · · r
pm

w
11
 · · · w
pp

12 7
0
1
0
1
13 7
0
1
0
1
14 7
0
1
0
1
15 7
0
1
0
1
16 7
0
1
0
1
17 7
0
1
0
1
18 7
0
1
0
1
19 7
0
1
0
1
20 7
0
1
0
1
21 7
0
1
0
1
Table 3.4: The absolute values of the elements of column m of R and the diagonal
of W during the BRR process, Algorithm 3.11 with p = 7 and q = 6, applied to the
discrete integral equation function (A.5) (n = 100).
3.2 Broyden Rank Reduction method 95
The Broyden Rank Reduction Inverse method
The reduction process to the update matrix of the Broyden matrix can also be
applied in case of the inverse notation of the method of Broyden. The inverse
Broyden matrix H can also be written as the sum of the initial matrix H
0
and
an update matrix Q. Apart from the computation of the Broyden step and the
rankone update to the Broyden matrix, the algorithm is essentially the same
and has similar convergence properties. The ShermanMorrisonWoodbury
formula (1.68) shows, however, that Algorithm 3.11 and Algorithm 3.15 are
not identical.
Algorithm 3.15 (The Broyden Rank Reduction Inverse method).
Choose an initial estimate x
0
∈ R
n
, set the parameters p and q, and let
C = [c
1
, . . . , c
p
], D = [d
1
, . . . , d
p
] ∈ R
n×p
be initialized by c
i
= d
i
= 0 for
i = 1, . . . , p (m := 0). Set k := 0 and repeat the following sequence of steps
until g(x
k
) < ε.
i) s
k
:= g(x
k
) −CD
T
g(x
k
),
ii) x
k+1
:= x
k
+s
k
,
iii) y
k
:= g(x
k+1
) −g(x
k
),
iv) Compute the QRdecomposition of D =
¯
DR,
D :=
¯
D, C := CR
T
,
v) Compute the SVD of C = UΣW
T
, (σ
1
≥ ≥ σ
p
)
C := UΣ, D := DW,
vi) If m = p then set c
i
= d
i
= 0 for i = q + 1, . . . , p (m := q),
vii) Perform the Broyden update, i.e.,
α := −s
T
k
y
k
+
¸
m
l=1
(c
T
l
s
k
)(d
T
l
y
k
),
c
m+1
:= (s
k
+y
k
−
¸
m
l=1
c
l
d
T
l
y
k
)/α,
d
m+1
:= −s
k
+
¸
m
l=1
d
l
c
T
l
s
k
,
and
c
m+1
:= c
m+1
d
m+1
,
d
m+1
:= d
m+1
/d
m+1
,
and set m := m+ 1.
96 Chapter 3. Limited memory Broyden methods
Example 3.16. Let g be the discrete integral equation function given by
(A.5). As initial estimate we choose x
0
given by (A.6) and we set ε = 10
−12
.
We apply Algorithm 3.15, for diﬀerent values of p, and q := p − 1. So, we
remove only the smallest singular value starting from the pth iteration. It
turns out that the method converges as fast as the method of Broyden for
p ≥ 7, see Figure 3.8. For p = 6 just a few more iterations are needed. In
Figure 3.9, we consider the ratio between the removed singular value σ
p
and
the size of the Broyden step s
k−1
 in the kth iteration, k = 0, . . . , k
∗
. It
is clear that for every p the quotient σ
p
/s
k−1
 eventually increases. If this
quotient becomes of order one, the BRR method get diﬃculties to achieve the
fast convergence and the process starts to deviate from the convergence of the
method of Broyden. Note that the results are similar to those of Example
3.13.
0 5 10 15 20 25 30 35 40
10
−15
10
−10
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 3.11: The convergence rate of Algorithm 3.15, applied to the discrete integral
equation function (A.5), with q = p − 1. [’◦’(Broyden), ’×’(p = 10), ’’(p = 8),
’’(p = 7), ’+’(p = 6), ’’(p = 5), ’∗’(p = 4), ’’(p = 3), ’’(p = 2), ’6’(p = 1)]
3.3 Broyden Base Reduction method
We now develop a generalization of the reduction methods described in the
previous section. For this purpose we repeat some results. The Broyden
matrix after the pth correction can be written
B = B
0
+Q,
3.3 Broyden Base Reduction method 97
0 5 10 15 20 25 30 35 40
10
−10
10
0
10
10
PSfrag replacements
iteration k
q
u
o
t
i
e
n
t
σ
p
/
s
k
−
1
Figure 3.12: The quotient σ
p
/s
k−1
for Algorithm 3.15 applied to the discrete inte
gral equation function (A.5), with q = p − 1. [’◦’(Broyden), ’×’(p = 10), ’’(p = 8),
’’(p = 7), ’+’(p = 6), ’’(p = 5), ’∗’(p = 4), ’’(p = 3), ’’(p = 2), ’6’(p = 1)]
where the update matrix, denoted by Q, has at most rank p. As we have seen
before Q is the product of two (n p)matrices C and D,
Q = CD
T
.
In order to reduce the rank of Q, we propose the following approach. Let
1 be a qdimensional subspace of the R
n
with orthonormal basis ¦v
1
, . . . , v
q
¦
(q ≤ p). The idea is that Q is approximated with a new matrix
¯
Q without
destroying the action of the update matrix on the qdimensional subspace 1,
i.e.,
Q[
V
=
¯
Q[
V
. (3.19)
To assure that
¯
Q has rank less than q, it is set equal to zero on the orthogonal
complement of 1, thus
Q[
V
⊥ = 0. (3.20)
The new update matrix
¯
Q can be decomposed in two (nq)matrices
¯
C and
¯
D,
where
¯
D equals the matrix V = [v
1
, . . . , v
q
] that consists of the basis vectors.
By (3.19) and the orthogonality of V, it follows that
QV =
¯
QV =
¯
CV
T
V =
¯
C.
Note that we have projected the update matrix on the qdimensional subspace
1,
¯
Q = QV V
T
.
98 Chapter 3. Limited memory Broyden methods
Besides, the second condition (3.20) is fulﬁlled, because for u ⊥ 1 it follows
that
¯
Qu = QV V
T
u = 0. The usefulness of this approach depends on the choice
of the subspace 1.
Notice that if 1 = ImD with dim1 = q ≤ p and
¯
Q is deﬁned by (3.19),
then
¦Ker Q¦
⊥
= ImQ
T
= ImDC
T
⊂ ImD = 1.
Thus 1
⊥
⊂ Ker Q and
Q[
V
⊥ =
¯
Q[
V
⊥ = 0,
which implies that Q and
¯
Q are equal on the whole R
n
. The diﬀerence in
decomposition between CD
T
and
¯
C
¯
D
T
is that
¯
D is an orthogonal (n q)
matrix and D is not necessarily.
Because the number of columns of C and D is reduced using an orthonor
mal basis ¦v
1
, . . . , v
q
¦ of the subspace 1, we call this approach the Broyden
Base Reduction (BBR) method.
Example 3.17. We assume that rank Q ≤ p and take the subspace 1 spanned
by the right singular vectors ¦v
1
, . . . , v
p
¦, corresponding to the largest p sin
gular values σ
1
, . . . , σ
p
of Q. The set of right singular vectors also forms an
orthonormal basis of 1. We deﬁne
¯
D := V = [v
1
, . . . , v
p
] and
¯
C := QV. The
product
¯
C
¯
D
T
represents the singular value decomposition of Q, since
¯
C
T
¯
C = V
T
Q
T
QV = V
T
Σ
2
V = Σ
2
,
where Σ = diag(σ
1
, . . . , σ
p
) implies that the columns of
¯
C are orthogonal and
there exists an orthogonal (n p)matrix U such that
¯
C = UΣ. By taking
the subspace 1 = span¦v
1
, . . . , v
q
¦ with q < p the Broyden Rank Reduction
method is obtained.
After p iterations of Broyden’s method, the matrix D is given by
D =
s
0
s
0
s
p−1
s
p−1
.
To apply a reduction on the columns of D, we take a qdimensional subspace,
1 that contains the vectors ¦s
p−q
, . . . , s
p−1
¦, q < p. To obtain an orthonormal
basis for 1 we can compute the QLdecomposition of D = V L, where V is an
orthogonal (n q)matrix, and L is a lower triangular (p p)matrix. We use
the QLdecomposition instead of the usual QRdecomposition because then
for l = 1, . . . , p, we have that
span¦s
l−1
, . . . , s
p−1
¦ ⊂ span¦v
l
, . . . , v
p
¦.
3.3 Broyden Base Reduction method 99
So, we can take 1 = span¦v
p−q+1
, . . . , v
p
¦. We rewrite the decomposition of
Q by
CD
T
= C(V L)
T
= CL
T
V
T
=:
¯
C
¯
D
T
. (3.21)
By removing the ﬁrst (p −q) columns of
¯
C and
¯
D the update matrix retains
the same action on the last q Broyden steps. Note that this is not the case in
Example 3.6. This reduction is applied whenever the maximum number of p
columns in C and D is reached.
Example 3.18. Let g be the discrete integral equation function given by
(A.5). As initial estimate we choose x
0
given by (A.6) and we set ε = 10
−12
.
We apply Algorithm 3.4, for q := p −1 and Z given by
Z =
¸
¸
¸
¸
¸
0 1
.
.
.
.
.
.
.
.
.
1
1 0
¸
L, (3.22)
where L comes from the QLdecomposition of D. In Figure 3.13, we observe
that the rate of convergence is high for p = 8. For smaller values of p the rate
of convergence is rather low, or the process even diverges.
0 5 10 15 20 25 30 35 40
10
−15
10
−10
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 3.13: The convergence rate of Algorithm 3.4 applied to the discrete integral
equation function (A.5), with q = p−1 and Z given by (3.22). [’◦’(Broyden), ’×’(p =
10), ’’(p = 5), ’∗’(p = 4), ’’(p = 3), ’’(p = 2), ’6’(p = 1)]
Another choice for 1 could be the subspace that contains the ﬁrst p − 1
Broyden steps. Thus 1 ⊃ span¦s
0
, . . . , s
p−2
¦. Note that after the pth iteration,
100 Chapter 3. Limited memory Broyden methods
the subspace 1 is set to ImV where D = V R. In the subsequent iterations,
the subspace 1 remains ﬁxed. This implies that after every iteration, the new
correction cd
T
to the Broyden matrix is subdivided over the p − 1 existing
directions, because the update matrix is rewritten as
CD
T
= C(V R)
T
= CR
T
V D
T
=:
¯
C
¯
D
T
, (3.23)
where V = [v
1
, . . . , v
p−1
,
˜
d
p
]. The last column
˜
d
p
of V is orthogonal to the base
vectors v
1
, . . . , v
p−1
. After the reduction the ﬁrst p −1 columns of the matrix
C have been adapted and the ﬁrst p−1 columns of D are still the base vectors
v
1
, . . . , v
p−1
. Because we store the basis from the pth iteration, we call this
method the Broyden Base Storing (BBS) method.
Example 3.19. Let g be the discrete integral equation function given by
(A.5). As initial estimate we choose x
0
given by (A.6) and we set ε = 10
−12
.
We apply Algorithm 3.4, for q := p−1 and Z = R, where R comes from the QR
decomposition of D. In Figure 3.13, we observe that the rate of convergence
is again high for p = 8. For smaller values of p the rate of convergence is very
low, or the process diverges.
0 5 10 15 20 25 30 35 40
10
−15
10
−10
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 3.14: The convergence rate of Algorithm 3.4, applied to the discrete integral
equation function (A.5), with q = p − 1 and Z = R. [’◦’(Broyden), ’×’(p = 10),
’’(p = 5), ’∗’(p = 4), ’’(p = 3), ’’(p = 2), ’6’(p = 1)]
3.4 The approach of Byrd
In 1994, Byrd, Nocedal and Schnabel derived a compact representation of
the matrices generated by Broyden’s update (3.1) for systems of nonlinear
3.4 The approach of Byrd 101
equations. These new compact representation is of interest in its own right, but
also of use in limited memory methods. Therefore we include the derivation
in this chapter.
Let us deﬁne the (n k)matrices S
k
and Y
k
by
S
k
=
s
0
. . . s
k−1
, Y
k
=
y
0
. . . y
k−1
. (3.24)
We ﬁrst prove a preliminary lemma on products of projection matrices
V
k
= I −
y
k
s
T
k
y
T
k
s
k
, (3.25)
that will be useful in subsequent analysis and is also interesting in its own
right.
Lemma 3.20. The product of a set of k projection matrices of the form (3.25)
satisﬁes
V
0
V
k−1
= I −Y
k
R
−1
k
S
T
k
, (3.26)
where R
k
is the (k k)matrix
(R
k
)
i,j
=
s
T
i−1
y
j−1
if i ≤ j,
0 otherwise.
Proof. Proceeding by induction, we note that (3.26) holds for k = 1, because
in this case the right hand side of (3.26) is given by
I −y
0
1
s
T
0
y
0
s
T
0
= V
0
.
Now, assume that (3.26) holds for some k, and consider k +1. If we write the
matrix R
k+1
as
R
k+1
=
¸
R
k
S
T
k
y
k
0 1/ρ
k
,
we see that
R
−1
k+1
=
¸
R
−1
k
−ρ
k
R
k−1
S
T
k
y
k
0 ρ
k
.
This implies that
I −Y
k+1
R
−1
k+1
S
T
k+1
= I −
Y
k
y
k
¸
R
−1
k
−ρ
k
R
k
−1S
T
k
y
k
0 ρ
k
¸
S
T
k
s
T
k
= I −Y
k
R
−1
k
S
T
k
+ρ
k
Y
k
R
−1
k
S
T
k
y
k
s
T
k
−ρ
k
y
k
s
T
k
= (I −Y
k
R
−1
k
S
T
k
)(I −ρ
k
y
k
s
T
k
).
102 Chapter 3. Limited memory Broyden methods
Together with the induction hypothesis, we obtain
V
0
V
k
= (I −Y
k
R
−1
k
S
T
k
)(I −ρ
k
y
k
s
T
k
)
= (I −Y
k+1
R
−1
k+1
S
T
k+1
),
which establishes the product relation (3.26) for all k.
Compact representation of the Broyden matrix
As before, we deﬁne
S
k
=
s
0
. . . s
k−1
, Y
k
=
y
0
. . . y
k−1
,
and we assume that the vectors s
i
are nonzero.
Theorem 3.21. Let B
0
be a nonsingular starting matrix, and let B
k
be ob
tained by updating B
0
k times using Broyden’s formula (3.1) and the pairs
¦s
i
, y
i
¦
k−1
i=0
. Then
B
k
= B
0
+ (Y
k
−B
0
S
k
)N
−1
k
S
T
k
, (3.27)
where N
k
is the k k matrix
(N
k
)
i,j
=
s
T
i−1
s
j−1
if i ≤ j,
0 otherwise.
(3.28)
Proof. It is easy to show (using induction) that B
k
can be written as
B
k
= C
k
+D
k
, (3.29)
where C
k
and D
k
are deﬁned recursively by
C
0
= B
0
, C
k+1
= C
k
(I −ρ
k
s
k
s
T
k
) k = 0, 1, 2, . . . ,
and
D
0
= 0, D
k+1
= D
k
(I −ρ
k
s
k
s
T
k
) +ρ
k
y
k
s
T
k
k = 0, 1, 2, . . . , (3.30)
where
ρ
k
= 1/s
T
k
s
k
.
Considering ﬁrst C
k
we note that it can be expressed as the product of C
0
with a sequence of projection matrices,
C
k
= C
0
(I −ρ
0
s
0
s
T
0
) (I −ρ
k−1
s
k−1
s
T
k−1
). (3.31)
3.4 The approach of Byrd 103
Now we apply Lemma 3.20, with y := s in the deﬁnition (3.25), to (3.31) in
order to obtain
C
k
= B
0
−B
0
S
k
N
−1
k
S
T
k
, (3.32)
for all k = 1, 2, 3, . . . .
Next we show by induction that D
k
has the compact representation
D
k
= Y
k
N
−1
k
S
T
k
. (3.33)
By the deﬁnition (3.30), we have that D
1
= y
0
ρ
0
s
T
0
, which agrees with (3.33)
for k = 1. Assume now that (3.33) holds for some k. Then by (3.30),
D
k+1
= Y
k
N
−1
k
S
T
k
(I −ρ
k
s
k
s
T
k
) +ρ
k
y
k
s
T
k
= Y
k
N
−1
k
S
T
k
−ρ
k
Y
k
N
−1
k
s
k
s
T
k
+ρ
k
y
k
s
T
k
=
Y
k
y
k
¸
N
−1
k
−ρ
k
N
k
−1S
T
k
s
k
0 0
¸
S
T
k
s
T
k
+
Y
k
y
k
¸
0 0
0 ρ
k
¸
S
T
k
s
T
k
= Y
k+1
¸
N
−1
k
−ρ
k
N
k
−1S
T
k
s
k
0 ρ
k
S
T
k+1
. (3.34)
Note, however, that
¸
N
−1
k
−ρ
k
N
k
−1S
T
k
s
k
0 ρ
k
¸
N
k
S
T
k
s
k
0 1/ρ
k
= I,
which implies that the second matrix on the right hand side of (3.34) is N
−1
k+1
.
By induction this establishes (3.33). Finally, substituting (3.32) and (3.33) in
(3.29), we obtain (3.27).
We now derive a compact representation of the inverse Broyden update
which is given by
H
k+1
= H
k
+ (s
k
−H
k
y
k
)
s
T
k
H
k
s
T
k
H
k
y
k
(3.35)
Theorem 3.22. Let H
0
be a nonsingular starting matrix, and let H
k
be ob
tained by updating H
0
k times using the inverse Broyden’s formula (3.35) and
the pairs ¦s
i
, y
i
¦
k−1
i=0
. Then
H
k
= H
0
+ (S
k
−H
0
Y
k
)(M
k
+S
T
k
H
0
Y
k
)
−1
S
T
k
H
0
, (3.36)
where S
k
and Y
k
are given by (3.24) and M
k
is the (k k)matrix
(M
k
)
i,j
=
−s
T
i−1
s
j−1
if i > j,
0 otherwise.
(3.37)
104 Chapter 3. Limited memory Broyden methods
Proof. Let
U = Y
k
−B
0
S
k
, V
T
= N
−1
k
S
T
k
,
so that (3.27) becomes
B
k
= B
0
+UV
T
Applying the ShermanMorrisonWoodbury formula (1.68), we obtain
H
k
= B
−1
k
= B
−1
0
−B
−1
0
U(I +V
T
B
−1
0
U)
−1
V
T
B
−1
0
= H
0
−H
0
(Y
k
−B
0
S
k
)(I +N
−1
k
S
T
k
H
0
(Y
k
−B
0
S
k
))
−1
N
−1
k
S
T
k
H
0
= H
0
−(H
0
Y
k
−S
k
)(N
k
+S
T
k
H
0
Y
k
−S
T
k
S
k
)
−1
S
T
k
H
0
.
By (3.28) and (3.37) we have N
k
−S
T
k
S
k
= M
k
, which gives (3.36).
Note that since we have assumed that all the updates given by (3.35) exist,
we have implicitly assumed the nonsingularity of B
k
. This nonsingularity
along with the ShermanMorrison formula (1.68) ensures that (M
k
+S
T
k
H
0
Y
k
)
is nonsingular.
In applications, we only use the representation (3.36) of the inverse Broy
den matrix. Because we always start with H
0
= −I as initial matrix (3.36) is
reduced to
H
k
= −I −(Y
k
+S
k
)(M
k
−S
T
k
Y
k
)
−1
S
T
k
. (3.38)
The matrix we want to invert, (M
k
−S
T
k
Y
k
), can be approximately singular,
because the size of the Broyden step s
k
 decreases if the process converges.
In that case, the norm of the ﬁrst column of M is much larger than the
norm of the last but one column. In the pth iteration, the ﬁrst column of M
equals (0, −s
T
1
s
0
, . . . , −s
T
p−1
s
0
) and column p −1 equals (0, . . . , 0, −s
T
p−1
s
p−2
).
For the same reason the (p p)matrix S
T
p
Y
p
probably does not have p large
singular values. In addition, the vectors ¦s
0
, . . . , s
p−1
¦ can be more or less
linear dependent. There exists a remarkable way to solve this problem. Instead
of storing the Broyden steps and their yields, we deﬁne
D =
s
0
s
0
s
p−1
s
p−1
= S
p
T, (3.39)
C =
y
0
s
0
y
p−1
s
p−1
= Y
p
T, (3.40)
where T = diag(1/s
0
, . . . , 1/s
p−1
). Note that T is invertible, since s
k
= 0
during the Broyden process. So, we substitute S
p
= DT
−1
and Y
p
= CT
−1
into (3.38) and arrive at
H
p
= −I −(DT
−1
+CT
−1
)(T
−1
(TMT)T
−1
−(DT
−1
)
T
CT
−1
)
−1
(DT
−1
)
T
= −I −(D +C)((TMT) −D
T
C)
−1
D
T
. (3.41)
3.4 The approach of Byrd 105
Note that the product TMT equals the (p p)matrix
(TMT)
i,j
=
−
s
T
i−1
s
i−1
s
j−1
s
j−1
if i > j,
0 otherwise.
=
−d
T
i−1
d
j−1
if i > j,
0 otherwise.
Removing updates
Using this notation, the method of Broyden can simply be transformed into a
limited memory method. We apply the condition that at most p updates to
the Broyden matrix can be stored. The Broyden steps and their corresponding
yields are stored according to (3.39) and (3.40) in the matrices C and D. So,
before a new Broyden step in iteration k = p + 1 can be computed, we have
to remove a column of both matrices C and D.
Algorithm 3.23 (The limited memory Broyden method of Byrd).
Choose an initial estimate x
0
∈ R
n
, set the parameters p and q, and let
C = [c
1
, . . . , c
p
], D = [d
1
, . . . , d
p
] ∈ R
n×p
be initialized by c
i
= d
i
= 0 for
i = 1, . . . , p (m := 0). Set k := 0 and repeat the following sequence of steps
until g(x
k
) < ε.
i) Compute for i = 1, . . . , m and j = 1, . . . , m,
M
i,j
=
−d
T
i−1
d
j−1
if i > j,
0 otherwise.
(3.42)
ii) Solve (M −
¸
m
l=1
d
T
l
c
l
)t
k
= −
¸
m
l=1
d
T
l
g(x
k
) for t
k
,
iii) d
m+1
:= −g(x
k
) +
¸
m
l=1
(c
l
+d
l
)t
k
,
iv) x
k+1
:= x
k
+d
m+1
,
v) c
m+1
:= g(x
k+1
) −g(x
k
),
vi) d
m+1
:= d
m+1
/d
m+1
and c
m+1
:= c
m+1
/c
m+1
,
vii) Let m := m+ 1,
viii) If m = p then set c
l
= d
l
= 0 for l = q + 1, . . . , p (m := q).
Note that we changed the meaning of the matrices C and D compared to
the other limited memory methods. Instead of B
k
= B
0
+ CD
T
we have in
106 Chapter 3. Limited memory Broyden methods
Algorithm 3.23 the matrix B
k
given by
B
k
= B
0
+ (
c
1
c
m
+
d
1
d
m
)N
−1
d
T
1
.
.
.
d
T
m
¸
¸
¸
(3.43)
where the matrix N is given by
(N)
i,j
=
d
T
i−1
d
j−1
if i ≤ j,
0 otherwise,
(3.44)
for i = 1, . . . , m and j = 1, . . . , m. Note that the dimensions of N and M
depend on m, and so these dimensions are variable.
Because after the reduction step (vii) the latest (scaled) Broyden step s
k
and its yield y
k
are stored in column q of the matrices C and D, Algorithm
3.23 is still a secant method.
Theorem 3.24. Let B
k+1
be given by (3.43), where C = [c
1
, . . . , c
m
] and
D = [d
1
, . . . , d
m
] are both (n m)matrices and N is given by (3.44). If
s
k
/s
k
 is stored in column m of D and y
k
/s
k
 is stored in column m of C,
then B
k+1
satisﬁes the secant equation (1.25).
Proof. Because N is nonsingular, v = s
k
 e
m
is the unique solution of the
equation
Nv = D
T
s
k
or, equivalently
¸
¸
¸
¸
¸
d
T
0
d
0
d
T
0
d
m−1
d
T
0
s
k
s
k
.
.
.
.
.
.
.
.
.
d
T
m−1
d
m−1
d
T
m−1
s
k
s
k
s
T
k
s
k
s
k
s
k
¸
v =
¸
¸
¸
¸
¸
d
T
0
s
k
.
.
.
d
T
m−1
s
k
s
T
k
s
k
s
k
¸
.
Therefore
B
k+1
s
k
= B
0
s
k
+ (C −B
0
D)N
−1
D
T
s
k
= B
0
s
k
+ (C −B
0
D)s
k
e
m
= B
0
s
k
+y
k
−B
0
s
k
= y
k
.
3.4 The approach of Byrd 107
We observe that in case of p = 1 the update to the Broyden matrix is
directly removed at the end of every iteration step. Therefore, Algorithm 3.23
equals dynamical simulation for p = 1.
Example 3.25. Let g be the discrete integral equation function, given by
(A.5). As initial estimate we choose x
0
given by (A.6) and we set ε = 10
−12
.
We apply Algorithm 3.23, for diﬀerent values values of p, the parameter q is
set to p −1. The rate of convergence is given in Figure 3.15. Clearly, for every
value of p the method needs more iterations than the method of Broyden.
Only for p = 3 and p = 4 the convergence is reasonably fast. Note that for
p = 2 we have again the same result as we have for p = 1 in case of the other
limited memory Broyden methods. For p = 1 the method directly diverges.
0 5 10 15 20 25 30 35 40
10
−15
10
−10
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 3.15: The convergence rate of Algorithm 3.23 applied to the discrete integral
equation function (A.5), for diﬀerent values of p. [’◦’(Broyden), ’×’(p = 10), ’’(p =
5), ’∗’(p = 4), ’’(p = 3), ’’(p = 2), ’6’(p = 1)]
108 Chapter 3. Limited memory Broyden methods
Part II
Features of limited memory
methods
109
Chapter 4
Features of Broyden’s method
In Part I we provided the theoretical background to the limited memory Broy
den methods. We discussed the derivation and convergence of the method of
Broyden and indicated the freedom in the algorithm to reduce the amount of
memory to store the Broyden matrix and still preserving the fast convergence.
In this chapter we investigate whether the characteristics of the function g
can tell us whether or not our main algorithm, the Broyden Rank Reduction
method, will succeed to approximate a zero x
∗
of g.
We consider the function g as the diﬀerence between a period map f :
R
n
→R
n
and the identity. So,
g(x) = f(x) −x.
A zero x
∗
of the function g is a ﬁxed point of the function f. As we pointed
out in Section 1.3, the ﬁrst step of a limited memory Broyden method is a
dynamical simulation step if the initial Broyden matrix is given by B
0
= −I,
that is,
x
1
= f(x
0
).
In addition, suppose that g is an aﬃne function, g(x) = Ax + b, where A ∈
R
n×n
and b ∈ R
n
. We know that
B
k+1
s
k
= y
k
= As
k
,
for every k = 0, 1, 2, . . . , and that the Broyden matrix B
k+1
is the sum of B
0
and the update matrix CD
T
. Therefore, the equality
CD
T
s
k
= (A+I)s
k
holds, where (A+I) is the Jacobian of the period map f. For nonlinear func
tions g and f, this suggests that the update matrix approximates in some sense
111
112 Chapter 4. Features of Broyden’s method
the Jacobian of f, J
f
(x
∗
) = J
g
(x
∗
) + I. Also, the parameter p of the limited
memory Broyden method could be chosen rank J
f
(x
∗
) +1. We investigate this
conjecture with an example.
Example 4.1. Let g : R
n
→R
n
be given by g(x) = ax, with nonzero a ∈ R.
The unique solution of the system g(x) = 0, is x
∗
= 0 and the Jacobian of
g is given by J
g
(x
∗
) = aI. Let A = aI, then the Jacobian of the period map
f : R
n
→R
n
is given by J
f
(x
∗
) = A+I = (a + 1)I. Let x
0
= 0 be arbitrarily
given, the ﬁrst Broyden step becomes
s
0
= −B
−1
0
g(x
0
) = ax
0
,
and thus x
1
= x
0
+ax
0
= (a +1)x
0
. Note that in case of a = −1 the Jacobian
of f equals the zero matrix, which has rank zero, and that the exact solution
is found in just one single iteration of the Broyden process. This is clear
since the initial Broyden matrix equals the Jacobian (B
0
= A = −I) and the
method of Newton converges in one iteration on linear systems. Now assume
that a = −1. Because g(x
1
) = a(a + 1)x
0
the new Broyden matrix becomes
B
1
= B
0
+
g(x
1
)s
T
0
s
T
0
s
0
= −I +
a(a + 1)x
0
(ax
0
)
T
(ax
0
)
T
(ax
0
)
= −I + (a + 1)
x
0
x
T
0
x
T
0
x
0
.
The next Broyden step is given by
s
1
= −B
−1
1
g(x
1
)
= −
−I + (a + 1)
x
0
x
T
0
x
T
0
x
0
−1
a(a + 1)x
0
= −
−I +
a + 1
a
x
0
x
T
0
x
T
0
x
0
a(a + 1)x
0
= a(a + 1)x
0
−(a + 1)
2
x
0
= −(a + 1)x
0
,
and x
2
= x
1
+s
1
= (a+1)x
0
−(a+1)x
0
= 0. This was to be expected as well,
because the zeroth Krylov subspace is given by
Z
0
= span ¦g(x
0
), (AH
0
)g(x
0
), (AH
0
)
2
g(x
0
), . . .¦
= span ¦g(x
0
), ag(x
0
), a
2
g(x
0
), . . .¦ = span ¦x
0
¦,
and has dimension d
0
= dimZ
0
= 1, for a = 0. According to Corollary 2.14
the method of Broyden converges in less that 2d
0
= 2 iterations. Note that,
although A + I = (a + 1)I has full rank, it has singular value (a + 1) with
multiplicity n. Obviously, the method of Broyden uses the information of the
Jacobian J
f
(x
∗
) in only one direction, that is, on the subspace spanned by
x
0
. The last update matrix before the process converges exactly is given by
CD
T
= (a + 1)x
0
x
T
0
/(x
T
0
x
0
).
4.1 Characteristics of the Jacobian 113
4.1 Characteristics of the Jacobian
In a small neighborhood of the solution x
∗
, the nonlinear function g can be
considered as approximately linear, depending on the relative nonlinearity γ
rel
,
see Section 1.3. Therefore, we compare in this section the convergence proper
ties of the method of Broyden for several test functions and their linearizations
around x
∗
. The initial Broyden matrix is set to minus the identity.
We deﬁne the aﬃne function l : R
n
→R
n
by
l(x) = g(x
∗
) +J
g
(x
∗
)(x −x
∗
) = J
g
(x
∗
)x −J
g
(x
∗
)x
∗
,
and write
l(x) = Ax +b (4.1)
where
A = J
g
(x
∗
) and b = −J
g
(x
∗
)x
∗
. (4.2)
In this section we compute both the singular values of the matrix A + I and
of the zeroth Krylov space for the linearized problem, given by
Z
0
= span ¦l(x
0
), (AH
0
)l(x
0
), (AH
0
)
2
l(x
0
), . . .¦.
We investigate the connection between d
0
= dimZ
0
an the choice of p for
the BRR method to solve g(x) = 0. The dimension of a Krylov space can
not always be determined exactly. The vector (AH
0
)
j
l(x
0
), for example, can
still be linearly independent of the ﬁrst j vectors in the Krylov sequence,
l(x
0
), . . . , (AH
0
)
j−1
l(x
0
), but also lie close to the subspace spanned by these
j vectors. Therefore, we deﬁne the zeroth Krylov matrix by
K
0
:= K(l(x
0
), AH
0
), (4.3)
where
K(v, A) =
v
v
Av
Av
A
n−1
v
A
n−1
v
.
The rank of K
0
equals the dimension of Z
0
. However, we can derive the singular
values of K
0
, to obtain a more continuous description of the rank of K
0
. The
rank of K
0
can be approximated by the number of relatively large (for example,
≥ 10
−15
) singular values of K
0
.
The discrete integral equation function
We consider the function g : R
n
→ R
n
as given by (A.5), with dimension
n = 20. As expected from Section 1.2 it takes Newton’s method 3 iterations to
114 Chapter 4. Features of Broyden’s method
0 5 10 15 20 25 30 35 40
10
−15
10
−10
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
quotient σ
p
/s
k−1
0 5 10 15 20 25 30 35 40
10
−10
10
0
10
10
PSfrag replacements
iteration k
residual g(x
k
)
q
u
o
t
i
e
n
t
σ
p
/
s
k
−
1
Figure 4.1: The convergence rate of Algorithm 1.19 and Algorithm 3.11, with q = p−1,
applied to the discrete integral equation function (A.5) and additionally the quotient
σ
p
/s
k−1
. [’◦’(Broyden), ’×’(p = 10), ’’(p = 8), ’’(p = 7), ’+’(p = 6), ’’(p = 5),
’∗’(p = 4), ’’(p = 3), ’’(p = 2), ’6’(p = 1)]
converge to a residual of g(x
k
) < 10
−12
, starting from the initial condition
x
0
given by (A.6). In Figure 4.1 we have plotted the rate of convergence for
the method of Broyden and the BRR method for diﬀerent values of p.
The method of Broyden needs 21 iterations to obtain the same order of
residual. It turns out that the BRR method needs also 21 iterations for p ≥ 7,
cf. Section 3.2, where we took n = 50. For smaller values of the parameter p
the residual diverges from the path of Broyden’s method when the quotient
σ
p
/s
k−1
 has become too large, see again Figure 4.1. Thereafter the residual
g(x
k
) changes very little from iteration to iteration and the process slowly
diverges. So, the quotient σ
p
/s
k−1
 decreases rather because the size of the
Broyden step increases than because the singular value σ
p
gets smaller.
In Figure 4.2, we have plotted the singular values of both J
f
(x
∗
) and K
0
,
deﬁned by (4.3). The graph of the singular values of J
f
(x
∗
) describes an
exponential decay to 2. Clearly the matrix J
f
(x
∗
) has full rank. The singular
4.1 Characteristics of the Jacobian 115
values of K
0
describe a fast linear decay until the 9th singular value. The
remaining singular values are of the same order. Note that it is not evident to
determine the dimension of the zeroth Krylov space.
0 5 10 15 20
2
2.05
2.1
2.15
2.2
2.25
2.3
0 5 10 15 20
10
−15
10
−10
10
−5
10
0
Figure 4.2: The singular values of J
f
(x
∗
) (left) and K
0
(right) in case of the discrete
integral equation function (A.5), n = 20.
In Figure 4.3, we have plotted the singular values of the same matrices
J
f
(x
∗
) and K
0
for a larger dimension (n = 50). We observe that the number
0 10 20 30 40 50
2
2.05
2.1
2.15
2.2
2.25
2.3
0 10 20 30 40 50
10
−15
10
−10
10
−5
10
0
Figure 4.3: The singular values of J
f
(x
∗
) (left) and K
0
(right) in case of the discrete
integral equation function (A.5), n = 50.
of large singular values of K
0
is about the same as in case of n = 20. So, for
the linearized system the method of Broyden would need as many iterations
for n = 20 as it needs for n = 50. This explains the same rate of convergence
for diﬀerent dimensions of the nonlinear problem, see Example 1.21.
The discrete boundary value function
We consider the function g : R
n
→ R
n
as given by (A.2), with dimension
n = 20. The method of Newton needs again 3 iterations to converge to a
116 Chapter 4. Features of Broyden’s method
residual of g(x
k
) < 10
−10
, starting from the initial condition x
0
, given by
(A.3). The method of Broyden needs 60 iterations to obtain the same order of
residual, see Figure 4.4. It turns out that the BRR method fails to converge
for every value of p. As can be seen in Figure 4.4 the residual increases directly
after the quotient σ
p
/s
k−1
 has become too large.
0 10 20 30 40 50 60
10
−10
10
−5
10
0
10
5
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
quotient σ
p
/s
k−1
0 10 20 30 40 50 60
10
−10
10
−5
10
0
PSfrag replacements
iteration k
residual g(x
k
)
q
u
o
t
i
e
n
t
σ
p
/
s
k
−
1
Figure 4.4: The convergence rate of Algorithm 1.19 and Algorithm 3.11, with q = p−1,
applied to the discrete boundary value function (A.2) and additionally the quotient
σ
p
/s
k−1
. [’◦’(Broyden), ’×’(p = 10), ’’(p = 5), ’∗’(p = 4), ’’(p = 3), ’’(p = 2),
’6’(p = 1)]
The singular values of J
f
(x
∗
) are all diﬀerent and are nicely distributed
over the interval [1, 5], Figure 4.5. All singular values of K
0
are larger than
10
−15
and more than 10 singular values are even larger than 10
−5
. Although
one might consider K
0
not to have full rank it is rather close to be nonsingular.
So, the method of Broyden would need almost all 2n iterations to converge on
the linearized problem.
4.1 Characteristics of the Jacobian 117
0 5 10 15 20
1
2
3
4
5
0 5 10 15 20
10
−10
10
−5
10
0
Figure 4.5: The singular values of J
f
(x
∗
) (left) and K
0
(right) in case of the discrete
boundary value function (A.2), n = 20.
The extended Rosenbrock function
We consider the function g : R
n
→ R
n
as given by (A.7), with dimension
n = 20. The method of Newton needs 3 iterations to converge to a residual of
g(x
k
) < 10
−12
, starting from the initial condition x
0
given by (A.7). The
method of Broyden needs 18 iteration to obtain the same order of residual,
see Figure 4.6.
For p = 1 and p = 2 the BRR method fails to converges. For larger values
of p, however, the BRR method has a high rate of convergence and is even
faster than the method of Broyden. If we take p larger than 5 the rate of
convergence of the BRR method does not increase, that is, the BRR method
still needs 11 iterations. Note that for p = 5 the quotient σ
p
/s
k−1
 does not
exceeds 10
−15
, see Figure 4.6.
The unique solution of the extended Rosenbrock function is the vector
x
∗
= (1, . . . , 1). The extended Rosenbrock function is a system of n/2 copies
of the Rosenbrock function, see Example 1.9. So, the Jacobian J
f
= J
g
+I at
the solution x
∗
is a blockdiagonal matrix, with blocks given by
−19 10
−1 1
.
Therefore, the Jacobian J
f
(x
∗
) has two diﬀerent singular values, that is, σ
1
=
. . . = σ
n/2
≈ 21.5134 and σ
n/2+1
= . . . = σ
n
≈ 0.4183, see Figure 4.7. Clearly,
only two singular values of the matrix K
0
are signiﬁcant. So, the dimension of
the zeroth Krylov space Z
0
is 2 and the method of Broyden would need at most
4 iterations to solve the linearized system. Note that the BRR method can
approximate the zero of the extended Rosenbrock function if we take p = 3.
118 Chapter 4. Features of Broyden’s method
0 5 10 15 20 25
10
−15
10
−10
10
−5
10
0
10
5
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
quotient σ
p
/s
k−1
0 5 10 15 20 25
10
−15
10
−10
10
−5
10
0
10
5
PSfrag replacements
iteration k
residual g(x
k
)
q
u
o
t
i
e
n
t
σ
p
/
s
k
−
1
Figure 4.6: The convergence rate of Algorithm 1.19 and Algorithm 3.11, with q = p−1,
applied to the extended Rosenbrock function (A.7) and additionally the quotient
σ
p
/s
k−1
. [’◦’(Broyden), ’’(p = 5), ’∗’(p = 4), ’’(p = 3), ’’(p = 2), ’6’(p = 1)]
The extended Powell singular function
We consider the function g : R
n
→ R
n
as given by (A.9), with dimension
n = 20. It turns out that the method of Newton converges linearly in 23
iterations to a residual of g(x
k
) < 10
−12
, starting from the initial condition
given by (A.10). With the same initial condition, the method of Broyden fails
to converge to the zero of the extended Powell singular function, as does the
BRR method, for every value of p.
The unique solution of the extended Powell singular function is the zero
vector, x
∗
= (0, . . . , 0). The Jacobian J
f
= J
g
+ I at the solution x
∗
is a
4.1 Characteristics of the Jacobian 119
0 5 10 15 20
0
5
10
15
20
0 5 10 15 20
10
−15
10
−10
10
−5
10
0
Figure 4.7: The singular values of J
f
(x
∗
) (left) and K
0
(right) in case of the extended
Rosenbrock function (A.7), n = 20.
blockdiagonal matrix, with blocks given by
¸
¸
¸
2 10 0 0
0 1
(5) −
(5)
0 0 1 0
0 0 0 1
¸
.
The Jacobian J
f
is nonsingular and has four diﬀerent singular values, that is,
σ
1
= . . . = σ
n/4
≈ 10.2501, σ
n/4+1
= . . . = σ
n/2
≈ 3.3064, σ
n/2+1
= . . . =
σ
3n/4
≈ 1.0000, and σ
3n/4+1
= . . . = σ
n
≈ 0.0590, see Figure 4.8. Only three
0 5 10 15 20
0
2
4
6
8
10
0 5 10 15 20
10
−15
10
−10
10
−5
10
0
Figure 4.8: The singular values of J
f
(x
∗
) (left) and K
0
(right) in case of the extended
Powell singular function (A.9), n = 20.
singular values of the matrix K
0
are signiﬁcant. The dimension of the zeroth
Krylov space Z
0
is about 3 and the method of Broyden would need at most 6
iterations to solve the linearized system. The Jacobian J
g
, however, is singular
120 Chapter 4. Features of Broyden’s method
at the zero x
∗
of the function g and the theory of Sections 1.3 and 3.2 cannot
be applied.
4.2 Solving linear systems with Broyden’s method
As we have seen in Section 4.1 the linearized problem gives more insight in
the success of the method of Broyden on the original nonlinear problem. We
ﬁrst recall the main results of Chapter 2 In many problems, components of
the function are linear or nearly linear. Therefore it is interesting to consider
the method of Broyden on linear systems.
Theorems 2.11 and 2.12 show that the number of iterations needed by the
method of Broyden to converge exactly on linear problems
Ax +b = 0, (4.4)
can be predicted by the sum of the dimensions of the Krylov spaces Z
0
and
Z
1
. By Corollary 2.14 we know that the method of Broyden needs at most
2d
0
iterations on linear systems, where d
0
= dimZ
0
. According to Lemma
2.18 Broyden’s method needs at most 2d
0
iterations for all linearly translated
systems of (4.4).
Therefore, we consider the method of Broyden applied to linear systems
where A has a Jordan canonical block form. In this section the vector b is
chosen to be the zero vector. As initial Broyden matrix we choose again
B
0
= −I and in all examples we choose the initial condition x
0
= (1, . . . , 1).
Another conclusion of Chapter 2 is that although the diﬀerence between
the Broyden matrix and the Jacobian does not grow, Lemma 2.6, it is not
necessary that the Broyden matrix approaches the Jacobian even if the linear
system (4.4) is solved. It has been proved that, under certain conditions, the
Broyden matrix and the Jacobian only coincide in one single direction, Lemma
2.7. In this section we illustrate the development of the Broyden matrix along
the Broyden process.
One Jordan block
Let us consider the matrix A ∈ R
n×n
, given by
A =
¸
¸
¸
¸
¸
λ 1
.
.
.
.
.
.
.
.
.
1
λ
¸
. (4.5)
4.2 Solving linear systems with Broyden’s method 121
The vector b is set to zero and we choose λ equal to 2. If x
0
is given by
x
0
= (1, . . . , 1), the dimension of the zeroth Krylov space Z
0
equals d
0
= n.
It takes Broyden’s method at most 2d
0
= 2n iterations to solve (4.4). In
Example 2.5 we have seen that, indeed, for n = 4 the method of Broyden
needs 8 iterations to converge.
We choose the dimension n = 20 and apply the method of Broyden. The
residual g(x
k
) oscillates around 10
1
for 39 iterations and then suddenly drops
to 10
−12
in the 40th iteration step.
We have plotted the Jacobian A in Figure 4.9. The structure of the Jaco
bian can be clearly distinguished. In the same ﬁgure we have also plotted the
initial Broyden matrix (B
0
= −I), as well as the Broyden matrix at several
iterations. The matrix B
40
is the ﬁnal matrix before the method of Broyden
solves (4.4).
Jacobian B
0
B
3
B
4
B
5
B
6
B
10
B
15
B
20
B
25
B
30
B
40
Figure 4.9: The Jacobian (4.5) of the linear system, the initial Broyden matrix and
the Broyden matrix at subsequent iterations (n = 20). Black corresponds to the value
−1 and white to the value 2.
122 Chapter 4. Features of Broyden’s method
Clearly, Broyden’s method tries to recover the structure of the Jacobian,
starting from its initial matrix. Due to our choice of the Jacobian, the initial
Broyden matrix and the initial estimate x
0
, this recovery starts at the bottom
right side of the matrix. Iteration after iteration the update to the Broyden
matrix involves a next entry of the main diagonal. We see that the Broyden
matrix is also developing a subdiagonal. After about 25 iterations the upper
left corner of the matrix is reached. Thereafter the elements of the two main
diagonals are adjusted and the oﬀdiagonal elements are pressed to zero.
We have applied Algorithm 3.11 to solve the linear system with the Jaco
bian of (4.5), for every value of p. The BRR method is only converging for
p = 20 but not as fast as the method of Broyden itself. In 60 iterations a resid
ual is reached of 3.2468 10
−11
. We have seen in Figure 4.9 that the method
of Broyden mainly concerns the main diagonals of the matrix. The other ele
ments of the Broyden matrix are kept approximately zero. The BRR method,
however, disturbs the structure of the Broyden matrix. That is, where the
elements of the Broyden matrix should be zero a pattern arises, see Figure
4.10.
B
30
B
40
B
50
B
60
Figure 4.10: The Broyden matrix at three diﬀerent iterations of Algorithm 3.11, with
p = 20 (n = 20). Black corresponds to the value −1 and white to the value 2.
For smaller values of p the BRR method fails to converge. That is, the
convergence behavior of Broyden’s method is followed for about 2p iterations,
but then the process diverges. We now apply Algorithm 3.11 with p = 1. This
implies that there is only one singular value available to update the initial
Broyden matrix B
0
in order to approximate the Jacobian (4.5). We have
plotted the Broyden matrix at four iterations of the BRR process, see Figure
4.11. Again the update process starts at the lower right corner of the matrix.
However, instead of creating the upper subdiagonal, after a few iterations the
diagonal structure is restored in the lower right corner.
We apply Algorithm 3.11 with p = 2. We have plotted the Broyden matrix
at four iterations of the BRR process, see Figure 4.12. The update process
4.2 Solving linear systems with Broyden’s method 123
B
10
B
20
B
30
B
40
Figure 4.11: The Broyden matrix at three diﬀerent iterations of Algorithm 3.11, with
p = 1 (n = 20). Black corresponds to the value −1 and white to the value 2.
starts as expected in the lower right corner of the matrix. It turns out that the
process fails to reconstruct the Jacobian again. Instead, two spots are created
on the diagonal and destroy the banded structure of the Broyden matrix. As
said before the process fails to converge.
B
10
B
20
B
30
B
40
Figure 4.12: The Broyden matrix at three diﬀerent iterations of Algorithm 3.11, with
p = 2 (n = 20). Black corresponds to the value −1 and white to the value 2.
Two equal Jordan blocks
We assume that n is even and consider the matrix A ∈ R
n×n
, given by
A =
A
11
0
0 A
22
, (4.6)
where both A
11
, A
22
∈ R
n/2×n/2
, are Jordan blocks (4.5) with the same eigen
value λ = 2. The vector b is the zero vector. If x
0
is given by x
0
= (1, . . . , 1)
the dimension of the zeroth Krylov space Z
0
equals d
0
= n/2. It takes Broy
den’s method at most 2d
0
= n iterations to solve (4.4). In Example 2.5 we
have seen that for n = 4 the method of Broyden needs 4 iterations to converge.
124 Chapter 4. Features of Broyden’s method
We choose the dimension n = 20 and plot the Jacobian A, see Figure 4.13.
The Broyden matrix is plotted at several iterations. The matrix B
20
is the
ﬁnal matrix before the method of Broyden solves the problem.
Jacobian B
5
B
10
B
20
Figure 4.13: The Jacobian (4.6) of the linear system, the initial Broyden matrix and
the Broyden matrix at subsequent iterations (n = 20). Black corresponds to the value
−1 and white to the value 2.
As for the previous example, Broyden’s method tries to recover the struc
ture of the Jacobian, starting from the initial matrix. The process starts at
the bottom right side of the matrix. Iteration after iteration the update to
the Broyden matrix involves a next entry of the main diagonal. Note that
the Broyden matrix again is developing a subdiagonal. But, in addition, two
bands arise that connect both (n/2)dimensional systems.
Here the method of Broyden needs 20 iterations to converge, and g(x
k
)
oscillates before it drops to 10
−12
. It turns out that the BRR method is as fast
as Broyden’s method for p ≥ 11. For p ≤ 10 the process eventually diverges.
Two diﬀerent Jordan blocks
We assume that n is even and consider the matrix A ∈ R
n×n
, given by
A =
A
11
0
0 A
22
, (4.7)
where both A
11
, A
22
∈ R
n/2×n/2
, are Jordan blocks (4.5) but with diﬀerent
eigenvalues λ
1
= 2 and λ
2
= 3. The vector b is the zero vector. If the initial
condition is given by x
0
= (1, . . . , 1), the dimension of the zeroth Krylov space
Z
0
equals d
0
= n. It takes Broyden’s method at most 2d
0
= 2n iterations to
solve (4.4).
We choose the dimension n = 20 and plot the Jacobian A, see Figure 4.14.
The Broyden matrix is plotted at several iterations. The matrix B
40
is the
ﬁnal matrix before the method of Broyden solves the problem.
4.3 Introducing coupling 125
Jacobian B
10
B
30
B
40
Figure 4.14: The Jacobian (4.7) of the linear system, the initial Broyden matrix and
the Broyden matrix at subsequent iterations (n = 20). Black corresponds to the value
−1 and white to the value 3.
Broyden’s method tries to recover the structure of the Jacobian, starting
from the initial matrix. As in the previous example the two bands are devel
oped that connect the two (n/2)dimensional systems. However, at the end of
the process these bands are eventually removed.
Here the same description of the computations is valid as for the ﬁrst
matrix. The method of Broyden needs 40 iterations to converge. We have to
choose p = 20 for the BRR method to converge. For smaller values of p the
BRR method indeed diverges.
4.3 Introducing coupling
In the previous section we have seen that the method of Broyden ﬁnds out
when a system of equations can be split into several independent systems of
equations and that it tries to solve the independent systems simultaneously.
We consider the matrix A ∈ R
n×n
, given by
A =
¸
¸
¸
¸
¸
λ δ
.
.
.
.
.
.
.
.
.
δ
λ
¸
. (4.8)
The vector b is set to zero and we choose λ equal to 2. The parameter δ varies
between zero and one. With x
0
given by x
0
= (1, . . . , 1) the exact dimension
of the zeroth Krylov space Z
0
equals d
0
= n, if δ = 0.
However, it turns out that for small values of δ the method of Broyden
needs less than 2n iterations.
If δ = 1.0 10
−4
the method of Broyden needs 8 iterations to converge to a
residual of 9.82210
−16
. In Figure 4.15 we have plotted the Jacobian and several
126 Chapter 4. Features of Broyden’s method
Broyden matrices of the process. After 8 iterations, only three elements on the
diagonal are ’recovered’, but evidently this is enough for Broyden’s method
to ﬁnd the solution. The BRR method turns out to be convergent for every
value for p.
Jacobian B
4
B
6
B
8
Figure 4.15: The Jacobian (4.8) of the linear system, with δ = 1.0 · 10
−4
, and the
Broyden matrix at subsequent iterations (n = 20). Black corresponds to the value
−1 and white to the value 2.
If δ = 1.0 10
−3
the method of Broyden needs 10 iterations to converge
to a residual of 1.4698 10
−14
. The Broyden matrix in the 10th iterations has
recovered four elements on the diagonal, see Figure 4.16. This is enough for
the method of Broyden to ﬁnd the solution. Simulations show that the BRR
method is convergent for every value for p, except for p = 2.
Jacobian B
6
B
8
B
10
Figure 4.16: The Jacobian (4.8) of the linear system, with δ = 1.0 · 10
−3
, and the
Broyden matrix at subsequent iterations (n = 20). Black corresponds to the value
−1 and white to the value 2.
If δ = 1.010
−2
, then the method of Broyden needs 14 iterations to converge
to a residual of 3.2768 10
−13
. Similarly to the previous cases, the process has
to recover some of the diagonal elements of the Jacobian, before it ﬁnds the
solutions. Here, the ﬁnal number of recovered elements is 6, see Figure 4.17.
The oﬀdiagonal elements are still small and therefore not distinguishable.
Remarkably, the BRR method has exactly the same rate of convergence for
4.3 Introducing coupling 127
p ≥ 6. For smaller values of p the rate of convergence is low or the process
diverges.
Jacobian B
6
B
10
B
14
Figure 4.17: The Jacobian (4.8) of the linear system, with δ = 1.0 · 10
−2
, and the
Broyden matrix at subsequent iterations (n = 20). Black corresponds to the value
−1 and white to the value 2.
If δ = 0.1 the method of Broyden needs 30 iterations to converge. The
Broyden matrix starts to recover the 14th elements of the diagonal, when the
process converges, see Figure 4.18. Clearly, the oﬀdiagonal elements of the
Jacobian become important. The BRR method only converges equally fast for
p ≥ 12.
Jacobian B
10
B
20
B
30
Figure 4.18: The Jacobian (4.8) of the linear system, with δ = 0.1, and the Broyden
matrix at subsequent iterations (n = 20). Black corresponds to the value −1 and
white to the value 2.
If δ = 0.5 the method of Broyden needs 40 iterations to converge. The
situation is comparable to the one described in Section 4.2, where we consid
ered a Jacobian consisting of one canonical Jordan block. The plots in Figure
4.19 are similar to those of Figure 4.9. The BRR method fails to converge for
p < 20, and even for p = 20 the rate of convergence is lower, i.e., 47 iterations
are needed instead.
For diﬀerent values of δ we have plotted the rate of convergence of the
method of Broyden when solving g(x) = 0, see Figure 4.20.
128 Chapter 4. Features of Broyden’s method
Jacobian B
20
B
30
B
40
Figure 4.19: The Jacobian (4.8) of the linear system, with δ = 0.5, and the Broyden
matrix at subsequent iterations (n = 20). Black corresponds to the value −1 and
white to the value 2.
0 5 10 15 20 25 30 35 40
10
−15
10
−10
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 4.20: The rate of convergence of Broyden’s method solving (4.4) where A
is given by (4.8) for diﬀerent values of δ. [’◦’(δ = 1.0 · 10
−4
), ’×’(δ = 1.0 · 10
−3
),
’+’(δ = 1.0 · 10
−2
) ’∗’(δ = 0.1), ’’(σ = 0.5)]
4.4 Comparison of selected limited memory Broy
den methods
In this section we compare the most promising limited memory Broyden meth
ods, derived in Chapter 3. For every test function of Appendix A and every
linear system discussed in Section 4.2 we applied the methods for p = 1, . . . , 20.
The results we have put in Tables 4.2  4.6. The results of the discrete bound
ary value function (A.2) and the extended Powell singular function (A.9) are
not included because all limited memory Broyden methods fail to converge for
these functions.
In the tables and the description of the results we have used abbreviations
for the limited memory Broyden methods, as listed in Table 4.1.
In all tables the methods are listed vertically and diﬀerent values of p are
listed in horizontal direction. The initial condition x
0
as well as the dimension
4.4 Comparison of selected limited memory Broyden methods 129
UPALL The Broyden Update Restart method
(Algorithm 3.4 with q = 0)
UP1 The Broyden Update Reduction method
(Algorithm 3.4 with q = 1 and Z = I)
BRR The Broyden Rank Reduction method
(Algorithm 3.11 with q = p − 1)
BRRI The Broyden Rank Reduction Inverse method
(Algorithm 3.15 with q = p − 1)
BRR2 The Broyden Rank Reduction method
(Algorithm 3.11 with q = p − 2)
BBR The Broyden Base Reduction method
(Algorithm 3.4 with q = p − 1 and Z given by (3.22))
BBS The Broyden Base Storing method
(Algorithm 3.4 with q = p − 1 and Z = R)
BYRD The limited memory Broyden method proposed by Byrd et al.
(Algorithm 3.23 with q = p − 1)
Table 4.1: The abbreviation of several limited memory Broyden methods.
n are uniform for every simulation in a table. For every combination of the
method and parameter p, the number of iterations of the simulation are given
and the variable R that represents the rate of convergence of the process to
obtain a residual of g(x
k
) < ε. Note that if R is negative the ﬁnal residual
g(x
k
∗) is larger than the initial residual g(x
0
). If R is large, then the
method has a high rate of convergence. If a process fails to converge this is
indicated by an asterisk.
In the examples of Chapter 3 we saw that in some cases a process initially
converges, but fails after a few iterations. So, the residual g(x
k
) might have
been smaller than the ﬁnal residual. This situation cannot be distinguished
in the tables of this section. We refer for further details to Chapter 3 and
Sections 4.1 and 4.2.
The discrete integral equation function
We consider the discrete integral equation function (A.5) for n = 20 and apply
the limited memory Broyden methods starting from x
0
given by (A.6) to a
residual of g(x
k
) < 10
−12
. The results are given in Table 4.2.
For p ≥ 10 all methods succeed to converge. Some of the methods are even
indistinguishable from the method of Broyden. It turns out that especially
UPALL has good results, because the method converges for all p ≥ 2. For
p = 1 all methods fail to converge. Note that for p = 1 all methods, except for
BRR2 and BYRD, are in fact equal. For BRR there exists a sharp boundary,
130 Chapter 4. Features of Broyden’s method
method p = 20 p = 19 p = 18 p = 17 p = 16
UPALL 21 1.3412 23 1.3544 21 1.3619 21 1.4441 22 1.2688
UP1 21 1.3412 21 1.3412 21 1.3412 21 1.3412 21 1.3327
BRR 21 1.3412 21 1.3412 21 1.3412 21 1.3412 21 1.3412
BBRI 21 1.3411 21 1.3411 21 1.3411 21 1.3411 21 1.3411
BRR2 21 1.3412 21 1.3412 21 1.3412 21 1.3412 21 1.3412
BBR 21 1.3412 21 1.3412 21 1.3412 21 1.3412 21 1.3412
BBS 21 1.3412 21 1.3412 21 1.3412 21 1.3412 21 1.3412
BYRD 21 1.3411 21 1.3411 21 1.3411 21 1.3410 21 1.3432
p = 15 p = 14 p = 13 p = 12 p = 11
UPALL 23 1.3103 20 1.3677 23 1.2103 22 1.3442 21 1.3353
UP1 24 1.2372 21 1.3029 27 1.0782 25 1.1281 29 0.9653
BRR 21 1.3412 21 1.3412 21 1.3412 21 1.3412 21 1.3412
BBRI 21 1.3411 21 1.3411 21 1.3411 21 1.3411 21 1.3411
BRR2 21 1.3412 21 1.3412 21 1.3412 21 1.3412 21 1.3412
BBR 21 1.3412 21 1.3412 21 1.3412 21 1.3412 21 1.3412
BBS 21 1.3412 21 1.3412 21 1.3411 21 1.3411 21 1.3401
BYRD 27 1.0585 32 0.8577 32 0.8585 63 0.4433 111 0.2489
p = 10 p = 9 p = 8 p = 7 p = 6
UPALL 22 1.3908 21 1.3658 20 1.4545 23 1.2895 22 1.2677
UP1 29 0.9555 41 0.6953 52 0.5278 32 0.8740 44 0.6512
BRR 21 1.3412 21 1.3412 21 1.3411 21 1.3511 200 0.0307
∗
BBRI 21 1.3411 21 1.3411 21 1.3410 21 1.3329 23 1.1922
BRR2 21 1.3412 21 1.3411 21 1.3384 23 1.2061 36 0.7662
BBR 21 1.3412 21 1.3410 21 1.3401 200 0.0773
∗
200 0.0655
∗
BBS 22 1.3909 22 1.3864 22 1.3733 131 −0.4034
∗
104 −0.6570
∗
BYRD 171 0.1650 200 0.1027
∗
90 0.3157 200 0.0640
∗
94 0.2991
p = 5 p = 4 p = 3 p = 2 p = 1
UPALL 21 1.3469 24 1.2088 33 0.8307 24 1.2468 200 0.0158
∗
UP1 83 0.3425 30 0.9762 25 1.1068 26 1.0909 200 0.0158
∗
BRR 200 0.0155
∗
200 0.0073
∗
160 −0.3226
∗
55 −1.2889
∗
200 0.0158
∗
BBRI 50 0.5684 53 0.5189 119 0.2407 94 0.2995 200 0.0158
∗
BRR2 51 0.5655 84 0.3360 114 0.2410 24 1.2468 – –
BBR 200 0.0605
∗
200 0.0008
∗
168 0.1646 38 0.7934 200 0.0158
∗
BBS 64 0.4277 200 0.1195
∗
62 −0.7674
∗
200 0.0778
∗
200 0.0158
∗
BYRD 75 0.3734 30 0.9509 38 0.7933 200 0.0158
∗
9 −6.3791
∗
Table 4.2: The number of iterations and the rate of convergence for the limited
memory Broyden methods of Table 4.1, applied to the discrete integral equation
function (A.5) (n = 20). [’*’ (no convergence)]
that is, the method converges for p ≥ 7 and fail for p ≤ 6. The same holds for
the methods BBR and BBS, both methods converge for p ≥ 8. These methods,
however, also converge for some smaller values of p. For p ≤ 12 the method
BYRD needs many iterations to converge, except for p = 3 and p = 4, where
4.4 Comparison of selected limited memory Broyden methods 131
the method does converge rather fast. We can conclude that every method
can be trusted if p is larger than a certain critical value (p = 6 for BRR,
p = 7 for BBR and BBS, etc). Beneath this critical value the method might
occasionally converge.
The extended Rosenbrock function
For the extended Rosenbrock function (A.7) with n = 20, we give the results
of the simulations only for p ≤ 10, since for every method starting from x
0
given by (A.8) the rate of convergence hardly increases for larger values of p.
The results are listed in Table 4.3.
method p = 10 p = 9 p = 8 p = 7 p = 6
UPALL 11 2.9756 13 ∞ 12 2.8867 14 ∞ 16 2.2081
UP1 11 2.9756 11 2.9444 15 ∞ 14 2.4949 25 1.2801
BRR 11 2.9756 11 2.9756 11 2.9756 11 2.9756 11 2.9756
BRRI 11 2.9756 11 2.9756 11 2.9756 11 2.9756 11 2.9756
BRR2 11 2.9756 11 2.9756 11 2.9756 11 2.9756 11 2.9756
BBR 11 2.9756 11 2.9756 11 2.9756 11 2.9756 11 2.9756
BBS 11 2.9756 11 2.9756 11 2.9756 11 2.9756 11 2.9756
BYRD 11 2.9723 14 ∞ 14 ∞ 18 1.8733 19 1.9444
p = 5 p = 4 p = 3 p = 2 p = 1
UPALL 14 2.4087 38 0.8749 200 −0.1284
∗
200 −0.0374
∗
200 −0.0303
∗
UP1 23 1.4538 20 1.6973 66 0.4997 30 1.1411 200 −0.0303
∗
BRR 13 2.6200 13 2.6875 22 1.5367 62 0.5074 200 −0.0303
∗
BRRI 13 2.5924 13 2.6017 77 0.4126 33 −1.3923
∗
200 −0.0316
∗
BRR2 13 2.6200 13 2.6875 33 0.9461 200 −0.0374
∗
– –
BBR 11 2.9756 11 2.9756 11 2.9756 200 −0.0986
∗
200 −0.0303
∗
BBS 11 2.9756 11 2.9756 11 2.9756 18 1.7653 200 −0.0303
∗
BYRD 20 1.6295 32 1.0355 145 0.2341 200 −0.0380
∗
4 −14.9518
∗
Table 4.3: The number of iterations and the rate of convergence for the limited
memory Broyden methods of Table 4.1, applied to the extended Rosenbrock function
(A.7) (n = 20). [’*’ (no convergence)]
The ’∞’sign implies that accidentally the exact zero of the extended
Rosenbrock function is found. Again all methods fail to converge for p = 1.
For p = 2 only the methods UP1, BRR and BBS converge. Note that most
methods converge for p = 3. The method BBR and BBS are for p = 3 even as
fast as for p = 10.
132 Chapter 4. Features of Broyden’s method
One Jordan block
We consider again the matrix A ∈ R
n×n
, given by (4.5), with λ equal to 2 and
n = 20. The vector b is the zero vector. In Table 4.4 we give the results for
the limited memory Broyden methods for 16 ≤ p ≤ 20, starting from x
0
given
by x
0
= (1, . . . , 1). All methods fail to converge for smaller values of p. Note
that the method of Broyden needs 2n iterations to solve (4.4), see Section 4.2.
method p = 20 p = 19 p = 18 p = 17 p = 16
UPALL 200 −0.0470
∗
200 −0.0491
∗
200 −0.0501
∗
200 −0.0472
∗
200 −0.0482
∗
UP1 200 −0.0515
∗
200 −0.0636
∗
200 −0.0516
∗
200 −0.0474
∗
200 −0.0491
∗
BRR 65 0.4889 162 −0.2709
∗
152 −0.2872
∗
143 −0.3057
∗
136 −0.3216
∗
BRRI 83 0.3662 200 0.0620
∗
200 −0.0495
∗
200 −0.0805
∗
200 −0.1101
∗
BRR2 96 0.3259 167 0.1812 136 0.2245 200 0.0361
∗
200 0.1294
∗
BBR 194 −0.2331
∗
200 −0.1959
∗
200 −0.1987
∗
153 −0.2883
∗
182 −0.2450
∗
BBS 62 0.5514 139 −0.3142
∗
123 −0.3592
∗
114 −0.3892
∗
104 −0.4233
∗
BYRD 200 −0.0564
∗
200 −0.0560
∗
200 −0.0549
∗
200 −0.0634
∗
200 −0.0378
∗
Table 4.4: The number of iterations and the rate of convergence for the limited
memory Broyden methods of Table 4.1, applied to the linear equation (4.4) where A
is given by (4.5) and b = 0 (n = 20). [’*’ (no convergence)]
Two equal Jordan blocks
We assume that n is even and consider the matrix A ∈ R
n×n
, given by (4.6)
where both A
11
, A
22
∈ R
n/2×n/2
, are Jordan blocks (4.5) with the eigenvalue
λ = 2. The vector b is the zero vector. The initial condition x
0
is given by
x
0
= (1, . . . , 1). Note that it takes Broyden’s method n iterations to solve
(4.4), see Section 4.2. The results of the simulation are given in Table 4.5
Most of the limited memory Broyden methods fail for p ≤ 10 except for the
method BRR2 and BBS (and UP1 for p = 2). The methods UPALL, UP1
also fail to converge for 11 ≤ p ≤ 16 and the method BYRD for 11 ≤ p ≤ 18.
Two diﬀerent Jordan blocks
We take n = 20 and consider the matrix A ∈ R
n×n
, given by (4.7) where
both A
11
, A
22
∈ R
n/2×n/2
, are Jordan blocks given by (4.5), with diﬀerent
eigenvalues λ
1
= 2 and λ
2
= 3. The vector b is the zero vector and the initial
condition x
0
is given by x
0
= (1, . . . , 1). Note that it takes Broyden’s method
2n iterations to solve (4.4), see Section 4.2. The results for the limited memory
Broyden methods for 16 ≤ p ≤ 20 are given in Table 4.6. More or less the
same description is valid as for Table 4.4.
4.4 Comparison of selected limited memory Broyden methods 133
method p = 20 p = 19 p = 18 p = 17 p = 16
UPALL 20 1.7351 20 1.7351 20 1.6369 104 0.3384 200 0.0589
∗
UP1 20 1.7351 20 1.7351 20 1.6805 167 0.1870 200 0.1403
∗
BRR 20 1.7351 20 1.7351 20 1.7331 20 1.7259 20 1.6852
BRRI 20 1.7685 20 1.7685 20 1.7662 20 1.7521 20 1.7198
BRR2 20 1.7351 20 1.7351 20 1.7326 20 1.7280 20 1.6866
BBR 20 1.7351 20 1.7351 20 1.7465 20 1.7275 20 1.7245
BBS 20 1.7351 20 1.7351 20 1.7387 20 1.7325 20 1.7205
BYRD 20 1.7035 20 1.7036 200 −0.0623
∗
200 0.0924
∗
200 0.0033
∗
p = 15 p = 14 p = 13 p = 12 p = 11
UPALL 200 0.0476
∗
200 0.0171
∗
200 0.0251
∗
200 0.0151
∗
200 0.0162
∗
UP1 200 0.0132
∗
200 0.0182
∗
200 0.0235
∗
200 0.0079
∗
200 0.0042
∗
BRR 20 1.6375 20 1.6571 20 1.6027 20 1.5553 20 1.5288
BRRI 20 1.7052 20 1.6270 20 1.6756 20 1.5609 20 1.5227
BRR2 20 1.6466 20 1.6467 20 1.6305 20 1.5621 73 0.4173
BBR 20 1.7495 20 1.7549 20 1.6908 20 1.6256 20 1.5424
BBS 20 1.7113 20 1.6994 20 1.7035 20 1.7338 20 1.7177
BYRD 200 0.0469
∗
200 0.0598
∗
200 0.0121
∗
200 0.0114
∗
200 0.0029
∗
p = 10 p = 9 p = 8 p = 7 p = 6
UPALL 200 0.0157
∗
200 0.0432
∗
200 0.0213
∗
200 0.0398
∗
200 0.0515
∗
UP1 200 0.0182
∗
200 0.0348
∗
200 0.0359
∗
200 0.0088
∗
200 0.0581
∗
BRR 166 −0.2652
∗
130 −0.3374
∗
116 −0.3775
∗
117 −0.3750
∗
105 −0.4178
∗
BRRI 200 0.0287
∗
200 −0.0357
∗
200 −0.1002
∗
200 −0.0933
∗
200 −0.2109
∗
BRR2 90 0.3381 159 0.1909 128 0.2394 200 0.0913
∗
200 0.0122
∗
BBR 200 −0.1304
∗
200 −0.1804
∗
165 −0.2639
∗
200 −0.1673
∗
196 −0.2335
∗
BBS 36 1.0697 122 −0.3577
∗
103 −0.4280
∗
100 −0.4388
∗
89 −0.4907
∗
BYRD 200 0.0146
∗
200 0.0127
∗
200 0.0086
∗
200 0.0402
∗
200 0.0351
∗
p = 5 p = 4 p = 3 p = 2 p = 1
UPALL 200 0.0465
∗
200 0.0989
∗
200 0.0585
∗
200 0.1395
∗
105 −0.4200
∗
UP1 200 0.0641
∗
200 0.0964
∗
200 0.1116
∗
184 0.1727 105 −0.4200
∗
BRR 98 −0.4522
∗
93 −0.4750
∗
105 −0.4177
∗
97 −0.4541
∗
105 −0.4200
∗
BRRI 196 −0.2225
∗
190 −0.2299
∗
146 −0.3023
∗
148 −0.3154
∗
105 −0.4200
∗
BRR2 200 −0.0312
∗
200 0.0069
∗
200 −0.0013
∗
200 0.1395
∗
– –
BBR 200 −0.1937
∗
200 −0.1759
∗
200 0.0980
∗
200 0.1205
∗
105 −0.4200
∗
BBS 86 −0.5056
∗
99 −0.4398
∗
99 −0.4464
∗
99 −0.4484
∗
105 −0.4200
∗
BYRD 200 0.0960
∗
200 0.0747
∗
200 0.1205
∗
105 −0.4200
∗
33 −1.3498
∗
Table 4.5: The number of iterations and the rate of convergence for the limited
memory Broyden methods of Table 4.1, applied to the linear equation (4.4) where A
is given by (4.6) and b = 0 (n = 20). [’*’ (no convergence)]
134 Chapter 4. Features of Broyden’s method
method p = 20 p = 19 p = 18 p = 17 p = 16
UPALL 200 −0.0930
∗
200 −0.0806
∗
200 −0.0813
∗
200 −0.0798
∗
200 −0.0633
∗
UP1 200 −0.0630
∗
200 −0.0570
∗
200 −0.0558
∗
200 −0.0465
∗
200 −0.0408
∗
BRR 38 0.8408 45 0.7108 147 −0.2954
∗
149 −0.2953
∗
130 −0.3400
∗
BRRI 38 0.8642 46 0.6741 200 0.0648
∗
200 −0.1018
∗
200 −0.0847
∗
BRR2 38 0.8229 103 0.3301 112 0.2736 200 0.1223
∗
200 0.0566
∗
BBR 42 0.7646 168 −0.2635
∗
200 −0.1886
∗
167 −0.2646
∗
164 −0.2708
∗
BBS 40 0.8473 41 0.8489 146 0.2196 87 −0.5110
∗
105 −0.4182
∗
BYRD 200 −0.0444
∗
200 −0.0678
∗
200 −0.0589
∗
200 −0.0672
∗
200 −0.0502
∗
Table 4.6: The number of iterations and the rate of convergence for the limited
memory Broyden methods of Table 4.1, applied to the linear equation (4.4) where A
is given by (4.7) and b = 0 (n = 20). [’*’ (no convergence)]
Chapter 5
Features of the Broyden rank
reduction method
Anticipating on the simulations of Chapter 8, we investigate the convergence
properties of the Broyden Rank Reduction method for computing ﬁxed points
of the period map f : R
n
→R
n
of the reverse ﬂow reactor deﬁned by (8.3), cor
responding to the partial diﬀerential equations of the one and twodimensional
model. The results are described in Section 5.1. In Section 5.2 we consider
the singular values of the update matrix for both models. In 5.3 we show that
the BRR method makes it possible to compute the limiting periodic state of
the reverse ﬂow reactor on a ﬁner grid using the same amount of memory and
just a few more iterations. Finally, in Section 5.4 we compare the convergence
properties of the limited memory Broyden methods listed in Table 4.1.
5.1 The reverse ﬂow reactor
The onedimensional model
Let f : R
n
→R
n
be the map of one ﬂow reverse period, see (8.3), that corre
sponds to the balance equations in (6.23)(6.25) using the parameters values
of Table 6.2. In addition, we ﬁx the ﬂow reverse period and the dimensionless
cooling capacity (t
f
= 1200s and Φ = 0.2). As initial condition we take a
state of the reactor that is at high constant temperature (T = 2T
0
) and ﬁlled
with inert gas (c = 0). For the ﬁnite volume discretization an equidistant
grid is used with N grid points (N = 100). This leads to an ndimensional
discretized problem, where n = 2N = 200. The system of ordinary diﬀerential
equations is integrated over one reverse ﬂow period using the NAGlibrary
135
136 Chapter 5. Features of the Broyden rank reduction method
routine D02EJF. To solve the equation g(x) = 0 with g(x) = f(x) − x, the
BRR method is applied for diﬀerent values of p.
0 10 20 30 40 50 60
10
−10
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 5.1: The convergence rate of Algorithm 1.19 and Algorithm 3.11, with q = p−1,
applied to the period map of the reverse ﬂow reactor (8.3) using the onedimensional
model (6.23)(6.25) with the parameter values of Table 6.2. [’◦’(Broyden), ’×’(p = 20),
’+’(p = 10), ’∗’(p = 5), ’’(p = 4), ’’(p = 3), ’’(p = 2), ’’(p = 1)]
The information in Figure 5.1 can be interpreted in the following way.
The method of Broyden converges to a residual with g(x
k
) < 10
−10
in 52
iterations. For p = 20, the BRR method approximates the convergence rate
of the method of Broyden using one ﬁfth of the amount of memory. Note that
the residuals of both methods are equal up to the 45th iteration. For p = 10,
the BRR method is even faster than the method of Broyden. So, the number
of iterations needed to converge to g(x
k
) < 10
−10
is not monotonously
increasing for smaller values of p. If we take p = 5 or p = 4 instead of p = 10,
the BRR method needs a few more iterations to converge. However, the
amount of memory usage is divided by a factor 2 and
5
2
, respectively. For
p = 3, p = 2, and p = 1 the BRR method has a very low rate of convergence.
We see that a large reduction of memory is obtained with just a few more
iterations.
The twodimensional model
Let f : R
n
→ R
n
now be the map of one ﬂow reverse period, see (8.3),
corresponding to the balance equations in (6.26)(6.28) using the parameters
values of Tables 6.2. We ﬁx the ﬂow reverse period and the dimensionless
cooling capacity (t
f
= 1200s and Φ = 0.2). The ratio between the width and
5.1 The reverse ﬂow reactor 137
the length of the reactor is set at R/L = 0.0025. As initial condition a state of
the reactor is taken that is at high constant temperature (T = 2T
0
) and ﬁlled
with inert gas (c = 0). For the ﬁnite volume discretization an equidistant grid
is used with N grid points in the axial direction (N = 100). In the radial
direction a nonuniform grid of M grid points is chosen that becomes ﬁner in
the direction of the wall of the reactor (M = 25). In fact, a segment of the
reactor is divided in M rings with the same volume. The dimension of the
discretized problem is denoted by n (n = 2 M N = 5000). The system of
ordinary diﬀerential equations is integrated over one reverse ﬂow period using
the NAGlibrary routine D02NCF.
To solve the equation g(x) = 0, with g(x) = f(x) − x, the BRR method
is applied for diﬀerent values of p. It turns out that for the twodimensional
model it is no longer possible to apply the original method of Broyden, due to
memory constraints.
0 10 20 30 40 50 60
10
−5
10
0
PSfrag replacements
iteration k
r
e
s
i
d
u
a
l
g
(
x
k
)
Figure 5.2: The convergence rate of Algorithm 3.11, with q = p − 1, applied to the
period map of the reverse ﬂow reactor (8.3) using the twodimensional model (6.26)
(6.28) with the parameter values of Table 6.2. [’×’(p = 20), ’+’(p = 10), ’∗’(p = 5),
’’(p = 4), ’’(p = 3), ’’(p = 2)]
Figure 5.2 shows that the BRR method has a high rate of convergence for
p ≥ 5. For 2 ≤ p ≤ 4 the BRR method does not converge within 60 iterations.
The amount of memory needed to store the Broyden matrix can be reduced
by choosing p = 10 instead of p = 20, using approximately the same number
of iterations.
138 Chapter 5. Features of the Broyden rank reduction method
5.2 Singular value distributions
of the update matrices
As we have explained in Section 3.2 the rank of the update matrix increases
during the Broyden process, since in every iteration a rankone matrix is added
to the update matrix. So, the number of nonzero singular values of the update
matrix increases. In this section we investigate what happens with the singular
values if we remove the pth singular value in every iteration, that is, if we apply
the Broyden Rank Reduction method with parameter p.
The onedimensional model
As in the previous section, we ﬁrst consider the period map of the reverse ﬂow
reactor deﬁned by (8.3) corresponding to the onedimensional model (6.23)
(6.25). In Figure 5.3 we have plotted the singular values of the update matrix
during the BRR process, for diﬀerent values of p.
In case of p = 50, we see that the update matrix has rank one at the begin
ning of the second iteration, that is, the matrix has one nonzero singular value.
In every iteration one nonzero singular value is added, the smallest singular
value. This singular value increases during some iterations and thereafter is
reaches a more or less stable value. For example, the largest singular value σ
1
jumps in the second iteration from about 10
−1
to 10
0
. Subsequently the value
of σ
1
is rather stable. Because the parameter p is larger than the number of
iterations done by the BRR process, no singular values are removed. So, we
have considered a situation where the BRR method is equal to the method of
Broyden.
If we choose p = 10 we see that during the ﬁrst 10 iterations, the singular
value distribution is exactly the same as for p = 50. Thereafter the singular
value σ
10
starts jumping around. The other nine singular values seems to be
invariant under the reduction procedure.
Decreasing the parameter p to 5, the singular value σ
5
starts to oscillate
after the 5th iteration. In addition the singular value σ
1
is larger than for
p = 50 and p = 10. The other singular values are still rather stable.
The twodimensional model
We apply the same investigation for the period map of the reverse ﬂow reactor
deﬁned by (8.3) corresponding to the twodimensional model (6.26)(6.28). In
Figure 5.4 we have plotted the singular values of the update matrix during
the BRR process, for diﬀerent values of p.
5.2 Singular value distributions of the update matrices 139
0 5 10 15 20 25 30 35 40 45
10
−4
10
−2
10
0
PSfrag replacements
iteration k
s
i
n
g
u
l
a
r
v
a
l
u
e
s
0 5 10 15 20 25 30 35 40 45 50
10
−4
10
−2
10
0
PSfrag replacements
iteration k
s
i
n
g
u
l
a
r
v
a
l
u
e
s
0 5 10 15 20 25 30 35 40 45 50 55
10
−4
10
−2
10
0
PSfrag replacements
iteration k
s
i
n
g
u
l
a
r
v
a
l
u
e
s
Figure 5.3: The singular value distribution of the update matrix during the BRR
process, with q = p − 1, applied to the period map of the reverse ﬂow reactor (8.3)
corresponding to the onedimensional model (6.23)(6.25). [up (p = 50), middle
(p = 10), down (p = 5)]
It turns out the we can describe the behavior of the singular values of the
update matrix in the same way as we did for the onedimensional model. The
only diﬀerence is that for p = 5 the singular value σ
4
starts to alter instead of
the singular value σ
1
.
140 Chapter 5. Features of the Broyden rank reduction method
0 5 10 15 20 25 30 35 40 45
10
−4
10
−2
10
0
PSfrag replacements
iteration k
s
i
n
g
u
l
a
r
v
a
l
u
e
s
0 5 10 15 20 25 30 35 40 45 50
10
−4
10
−2
10
0
PSfrag replacements
iteration k
s
i
n
g
u
l
a
r
v
a
l
u
e
s
0 10 20 30 40 50 60
10
−4
10
−2
10
0
PSfrag replacements
iteration k
s
i
n
g
u
l
a
r
v
a
l
u
e
s
Figure 5.4: The singular value distribution of the update matrix during the BRR
process, with q = p − 1, applied to the period map of the reverse ﬂow reactor (8.3)
corresponding to the twodimensional model (6.26)(6.28). [pu (p = 50), middle
(p = 10), down (p = 5)].
5.3 Computing on a ﬁner grid using same amount
of memory
In Section 5.1 we have seen that the BRR method makes it possible to ﬁnd
symmetric periodic solutions of the RFR using the full twodimensional model
5.3 Computing on a ﬁner grid using same amount of memory 141
of the RFR. In addition, we have shown that even when using the one
dimensional description of the RFR, the BRR method saves memory. It turned
out that, surprisingly, for the twodimensional model the same values for p can
be used as in case of the onedimensional model. We now show that it is pos
sible to use a ﬁner grid with the same amount of memory to store the Broyden
matrix at the expense of just a few more iterations.
For the above simulation of the twodimensional model a very slim reactor
is used (R/L = 0.0025). As will be discussed in Section 8.3, gradients in the
radial direction are absent in this case, and the twodimensional model leads
to exactly the same results as the onedimensional model. If we take a larger
radius for the reactor (R/L = 0.025), then radial temperature gradients are in
fact introduced. To illustrate the beneﬁts of our limiting memory method we
compare two simulations of the model with M = 25 and M = 5 grid points
in the radial direction. So, the dimension of the discretized problem becomes
n = 5000 and n = 1000, respectively.
We have applied the BRR method with diﬀerent values of p to compute
the periodic state of the reactor, see Table 5.1.
M = 25 M = 5
# iterations # storage loc. # iterations # storage loc.
p = 20 48 200, 000 53 40, 000
p = 10 50 100, 000 55 20, 000
p = 5 61 50, 000 65 10, 000
p = 4 65 40, 000 82 8, 000
p = 3 80 30, 000 76 6, 000
p = 2 > 100 20, 000 90 4, 000
p = 1 > 100 10, 000 > 100 2, 000
Table 5.1: The number of iterations of Algorithm 3.11, with q = p − 1, applied to
the period map of the reverse ﬂow reactor (8.3) corresponding to the twodimensional
model (6.26)(6.28), and the number of storage locations for the Broyden matrix
using a grid with N = 100 grid points in the axial direction and M = 25, respectively
M = 5, grid points in the radial direction.
Although a few more iterations are needed than in case of the slim reactor
(R/L = 0.0025), still the same values for p can be used for both the ﬁne and
the coarse grid. Note that for every value of p the rate of convergence for
M = 25 is higher than for M = 5. Suppose, for example, that at most 40, 000
storage locations are available. To accelerate the convergence we want to use
the largest value of p. For the coarse grid the parameter p can be chosen to
142 Chapter 5. Features of the Broyden rank reduction method
be 20 and for the ﬁne grid at most p = 4. This implies that instead of 53
iterations for a coarse grid, 65 iterations are needed for a ﬁne grid to solve
the discretized problem while using the same amount of memory to store the
Broyden matrix.
Although the approximation of the cyclic steady state is qualitatively good
using the coarse grid, Figure 5.5(b), the more accurate approximation using
the ﬁne grid, Figure 5.5(a), is preferable.
0
0.2
0.4
0.6
0.8
1 0
0.2
0.4
0.6
0.8
1
0
0.2
0.4
0.6
0.8
1
1.2
PSfrag replacements
t
e
m
p
e
r
a
t
u
r
e
ax. distance
rad. distance
(a) Fine grid (M = 25)
0
0.2
0.4
0.6
0.8
1 0
0.2
0.4
0.6
0.8
1
0
0.2
0.4
0.6
0.8
1
1.2
PSfrag replacements
t
e
m
p
e
r
a
t
u
r
e
ax. distance
rad. distance
(b) Coarse grid (M = 5)
Figure 5.5: Temperature distribution over the reactor bed using a coarse and a ﬁne
grid in the radial direction.
5.4 Comparison of selected limited memory Broy
den methods
Concluding this chapter, we apply the limited memory Broyden methods listed
in Table 4.1, to compute a ﬁxed point of the period map f : R
n
→R
n
deﬁned
by (8.3), as we did for several test functions in Chapter 4. The computations
are stopped if a maximal number of 200 iterations is reached or the process
is converged to a residual of g(x
k
) < ε, where ε = 10
−10
for the one
dimensional model and ε = 10
−8
for the twodimensional model.
The onedimensional model
The results of the simulations with the period map of the onedimensional
model as given in Table 5.2.
5.4 Comparison of selected limited memory Broyden methods 143
It turns out that the methods BRR, BRRI, BRR2, BBR and BBS are
rather fast for p ≥ 5. Note that for p = 5 we need 2pn = 2 5 200 = 2000
storage locations to store the update matrix and for p = 50 we need 20, 000
storage locations. The method BRR can even be applied for p = 4 using 57
iterations. The method BRRI is still applicable for p = 3 using 69 iterations.
We clearly see that a smaller value of p does not have to imply that more
iterations are needed for the limited memory Broyden process. The fact that
for p = 50 not all methods converge in 48 iterations can be explained by the
introduction of rounding errors of the large computations. For two simulations
the results were not returned by the program, because an evaluation of the
period map failed during the process.
The twodimensional model
The results of the simulation with the period map of the twodimensional
model are given in Table 5.3. Note that all methods converge in 47 iterations
for p = 50. For p ≥ 10 the method BRR, BRRI, BRR2 and BBR need less
than 51 iterations. The convergence properties of the limited memory Broyden
methods in case of the twodimensional model are comparable to those in case
of the onedimensional model. Note that the method BRR is applicable for
p = 4 using 64 iterations instead of 47.
144 Chapter 5. Features of the Broyden rank reduction method
method p = 50 p = 40 p = 30 p = 25 p = 20
UPALL 49 0.5186 75 0.3421 66 0.3917 79 0.3178 63 0.4051
UP1 49 0.5186 65 0.3839 92 0.2731 86 0.2923 118 0.2156
BRR 48 0.5202 47 0.5324 53 0.4750 51 0.4998 55 0.4575
BRRI 50 0.5019 56 0.4610 51 0.4979 47 0.5305 50 0.5056
BRR2 48 0.5202 47 0.5324 49 0.5134 60 0.4221 52 0.5067
BBR 49 0.5186 50 0.5069 49 0.5101 50 0.5187 55 0.4543
BBS 49 0.5186 47 0.5340 53 0.4744 55 0.4547 48 0.5264
BYRD 52 0.4831 50 0.4991 62 0.4215 57 0.4429 59 0.4441
p = 15 p = 14 p = 13 p = 12 p = 11
UPALL 82 0.3046 95 0.2654 71 0.3540 74 0.3421 72 0.3455
UP1 120 0.2076 126 0.2013 146 0.1716 119 0.2096 115 0.2216
BRR 56 0.4503 51 0.4889 50 0.5029 49 0.5119 54 0.4670
BRRI 53 0.4815 55 0.4561 48 0.5201 49 0.5193 46 0.5727
BRR2 46 0.5490 48 0.5216 57 0.4471 46 0.5414 47 0.5298
BBR 49 0.5078 48 0.5314 64 0.3968 46 0.5497 44 0.5732
BBS 54 0.4626 52 0.4851 59 0.4232 53 0.4701 51 0.5023
BYRD 76 0.3287 68 0.3807 101 0.2477 65 0.3883 85 0.3010
p = 10 p = 9 p = 8 p = 7 p = 6
UPALL 75 0.3342 86 0.2967 102 0.2442 131 0.1945 96 0.2596
UP1 162 0.1544 . . . . . .
∗
132 0.1892 145 0.1725 146 0.1705
BRR 52 0.4843 55 0.4523 58 0.4332 50 0.5030 53 0.4715
BRRI 46 0.5511 49 0.5135 49 0.5159 44 0.5753 53 0.4693
BRR2 63 0.4016 49 0.5271 49 0.5281 52 0.4789 55 0.4528
BBR 49 0.5154 47 0.5323 42 0.5960 67 0.3809 47 0.5295
BBS 43 0.5852 58 0.4303 45 0.5667 53 0.4772 59 0.4314
BYRD 71 0.3550 91 0.2747 97 0.2572 133 0.1913 112 0.2266
p = 5 p = 4 p = 3 p = 2 p = 1
UPALL 116 0.2201 138 0.1854 164 0.1561 200 0.0786
∗
200 0.0757
∗
UP1 155 0.1628 155 0.1608 185 0.1388 191 0.1310 200 0.0757
∗
BRR 54 0.4634 57 0.4432 85 0.2936 105 0.2420 200 0.0757
∗
BRRI 62 0.4166 63 0.3952 69 0.3668 90 0.2778 200 0.0819
∗
BRR2 60 0.4211 92 0.2737 158 0.1602 200 0.1129
∗
– –
BBR 60 0.4175 . . . . . .
∗
154 0.1617 182 0.1382 200 0.0757
∗
BBS 71 0.3541 81 0.3098 76 0.3305 200 0.0717
∗
200 0.0757
∗
BYRD 171 0.1458 128 0.1971 150 0.1663 188 0.1335 200 0.0059
∗
Table 5.2: The number of iterations and the rate of convergence for diﬀerent limited
memory Broyden methods, applied to the period map of the reverse ﬂow reactor (8.3)
according to the onedimensional model (6.23)(6.25), n = 200. [’*’ (no convergence),
’. . .’ (no data)]
5.4 Comparison of selected limited memory Broyden methods 145
method p = 50 p = 40 p = 30 p = 25 p = 20
UPALL 47 0.4780 52 0.4259 54 0.4373 66 0.3568 67 0.3269
UP1 47 0.4780 61 0.3734 72 0.3167 98 0.2240 78 0.2830
BRR 47 0.4782 47 0.4782 47 0.4783 47 0.4783 47 0.5126
BRRI 47 0.4782 47 0.4783 47 0.4782 47 0.4783 47 0.5118
BRR2 47 0.4782 47 0.4782 47 0.4781 47 0.4778 47 0.4770
BBR 47 0.4780 47 0.4796 47 0.4912 47 0.4995 47 0.4669
BBS 47 0.4780 47 0.4782 47 0.4819 47 0.4876 50 0.4434
BYRD 47 0.4781 48 0.4567 53 0.4235 59 0.3742 56 0.4087
p = 15 p = 14 p = 13 p = 12 p = 11
UPALL 62 0.3760 72 0.3071 77 0.2880 115 0.1924 84 0.2659
UP1 78 0.2809 93 0.2357 101 0.2180 116 0.1909 115 0.1960
BRR 47 0.4913 47 0.5062 48 0.4657 48 0.4574 46 0.4781
BRRI 47 0.4955 47 0.5027 48 0.4640 48 0.4785 47 0.4690
BRR2 47 0.4796 48 0.4666 48 0.4833 47 0.4668 50 0.4427
BBR 48 0.4652 51 0.4580 47 0.4802 46 0.4833 47 0.4655
BBS 48 0.4648 50 0.4532 49 0.4813 60 0.3727 48 0.4678
BYRD 77 0.2861 61 0.3597 79 0.2813 74 0.3017 . . . . . .
∗
p = 10 p = 9 p = 8 p = 7 p = 6
UPALL 82 0.2677 79 0.2830 . . . . . .
∗
103 0.2173 . . . . . .
∗
UP1 137 0.1629 171 0.1289 94 0.2404 110 0.1994 152 0.1539
BRR 49 0.4513 52 0.4512 55 0.4002 52 0.4238 50 0.4488
BRRI 48 0.4622 50 0.4621 51 0.4293 49 0.4593 59 0.3770
BRR2 50 0.4455 53 0.4198 59 0.3747 . . . . . .
∗
53 0.4143
BBR 49 0.4643 . . . . . .
∗
55 0.4104 55 0.4426 55 0.4082
BBS 52 0.4562 55 0.3983 55 0.4045 59 0.3908 74 0.2961
BYRD 88 0.2520 82 0.2723 79 0.2846 78 0.2828 92 0.2480
p = 5 p = 4 p = 3 p = 2 p = 1
UPALL 81 0.2780 142 0.1557 96 0.2577 200 0.0560
∗
. . . . . .
∗
UP1 127 0.1725 130 0.1695 . . . . . .
∗
92 0.2453 . . . . . .
∗
BRR 60 0.3996 64 0.3441 79 0.2778 107 0.2112 . . . . . .
∗
BRRI 85 0.2610 59 0.3713 91 0.2405 110 0.2028 . . . . . .
∗
BRR2 61 0.3732 78 0.2804 79 0.2770 200 0.0891
∗
– –
BBR 56 0.3924 80 0.2749 92 0.2388 . . . . . .
∗
. . . . . .
∗
BBS 57 0.3997 73 0.3027 102 0.2331 200 0.0807
∗
. . . . . .
∗
BYRD 123 0.1785 . . . . . .
∗
. . . . . .
∗
. . . . . .
∗
200 0.0014
∗
Table 5.3: The number of iterations and the rate of convergence for diﬀerent limited
memory Broyden methods, computing a ﬁxed point of the period map (8.3) according
to the twodimensional model (6.26)(6.28), n = 5000. [’*’ (no convergence), ’. . .’ (no
data)]
146 Chapter 5. Features of the Broyden rank reduction method
Part III
Limited memory methods
applied to periodically forced
processes
147
Chapter 6
Periodic processes in packed
bed reactors
In this chapter, we give a short introduction in the chemical reactor engineer
ing. In Section 6.1, we discuss the most common cyclic processes in packed
bed reactors and explain their advantages. The balance equations for a general
packed bed reactor are derived in Section 6.2.
6.1 The advantages of periodic processes
Periodic processes in packed bed reactors mainly arise from periodically vary
ing the feeding conditions, that is, the temperature, pressure and direction of
the feed streams.
Pressure and thermal swing adsorption
In pressure swing adsorption (PSA) processes, gas mixtures are separated by
selective adsorption over a bed of sorbent materials. If the adsorbent is satu
rated, that is, it cannot adsorb any more adsorbate, it has to be regenerated.
Therefore, the adsorbent must bind components reversibly, so that it does
not have to be replaced every time it is saturated, but can be cleaned in the
reactor itself. The periodic nature of the PSA arises from the high pressure
adsorption phase and the subsequent low pressure regeneration phase.
During adsorption one component is selectively adsorbed, such that at the
product end of the reactor the gas stream does not contain this component. In
a packed bed a front is therefore formed that slowly migrates in the direction of
the product end. From the feed point up to the adsorption front, the feed gas
149
150 Chapter 6. Periodic processes in packed bed reactors
mixture is in equilibrium with a saturated sorbent, while further downstream,
the gas phase contains nonadsorbing components only and the sorbent is not
saturated. During this step the pressure is maintained at a high level.
Before the adsorbent in the reactor is completely saturated, the product
end of the reactor is closed and the pressure is released at the feed end of the
reactor. This second step is called the blowdown step.
When the pressure has dropped to suﬃcient low level, it is maintained at
this level and ’clean’ carrier gas is led into the reactor at the product end such
that the adsorbent in the reactor is purged, that is, the adsorbed component
is removed from the sorbent during this regeneration step.
When the adsorbent has lost enough of its loading, the product end of
the reactor is again closed and the pressure is raised to the former high level.
After this pressurization the process returns to the ﬁrst step.
De Montgareuil and Domine and independently Skarstrom are generally
considered to be the inventors of the PSA. The Skarstrom PSA cycle was
immediately accepted for commercial use in air drying. Pressure swing ad
sorption is widely used for bulk separation and puriﬁcation of gasses. Major
applications are for example, moisture removal from air and natural gas, sep
aration of normal and isoalkanes, and hydrogen recovery and puriﬁcation. A
pressure swing adsorber designed to separate water from air has been studied
by e.g. Kvamsdal and Hertzberg [39].
Thermal swing adsorption (TSA) processes are similar to pressure swing
adsorption processes and are also intended to separate gas mixtures. But here
the cyclic nature arises from the low temperature adsorption phase and the
subsequent high temperature regeneration phase. Studies of thermal swing
adsorbers can be found in work by e.g. Davis and Levan [14]. Combinations
of PSA and TSA processes also exist.
Pressure swing reactor
The principle of Pressure Swing Reactors (PSR), sometimes also referred to
as Sorption Enhanced Reaction Processes (SERP), is based upon physically
admixing a sorbent and a catalyst in one vessel in order to achieve a separation
concurrent with a reaction. Sorption and catalysis may even be integrated in
a single material. The sorption enhanced reaction process has been demon
strated primarily in achieving supra equilibrium levels in equilibrium limited
reactions. The adsorption is typically used to purify one of the reaction prod
ucts. The cyclic nature of a pressure swing reactor arises from the same high
pressure adsorption and low pressure regeneration phases as in the pressure
swing adsorber. The pressure swing reactor is a relatively new process and
6.1 The advantages of periodic processes 151
has been studied by e.g. Hufton et al. [29], Carvill et al. [13] and Kodde and
Bliek [36].
The PSR potentially oﬀers the following advantages:
• Increased conversion of reactants,
• Improved selectivities and yields of desired products,
• Reduced requirements for external supply or cooling capacity,
• Reduced capital expenditure by process intensiﬁcation,
• More favorable reaction conditions might be possible, resulting in longer
lifetime of equipment and less catalyst deactivation.
A well known application of the pressure swing reactor is the removal of
CO from syngas, combining lowtemperature shift catalysis and selective CO
2
removal by adsorption. Production of high purity hydrogen from syngas, as
required for instance for fuel cell applications, normally uses a multistep pro
cess, involving both a water gas shift and a selective oxidation process. In the
latter step a part of the produced hydrogen is inevitably lost. This disadvan
tage can be avoided in a reactive separation using PSR. By a combination of
low temperature shift catalysis and selective adsorption of carbon dioxide in
one vessel, the removal of CO as a result of the shift reaction rather than by
selective oxidation might become feasible.
The shift reaction is given by
H
2
O + CO
−→
←−
H
2
+ CO
2
.
When adsorbing the CO
2
the equilibrium of the above reaction shifts to the
right. This implies that more H
2
is produced and more CO is removed. Being
a member of the family of adsorptive reactors, the PSR is limited to compara
tively low temperature applications in order to maintain suﬃcient adsorption
capacity for the sorbent.
The reverse ﬂow reactor
The simplest example of a periodic process might be the reverse ﬂow reactor
(RFR), a packed bed reactor in which the ﬂow direction is periodically re
versed in order to trap a hot reaction zone within the reactor. In this way
even systems with a small adiabatic temperature rise can be operated with
out preheating the feed stream. The reverse ﬂow reactor concept was ﬁrst
proposed and patented by Cottrell in 1938 for the removal of pollutants. We
describe the RFR is more detail in Section 8.1.
152 Chapter 6. Periodic processes in packed bed reactors
6.2 The model equations
of a cooled packed bed reactor
We consider a tubular reactor ﬁlled with small catalyst particles in which gas
ﬂows in axial direction. The gas contains a bulk part of an inert gas with
a trace of reactants A that, when meeting the catalyst, reacts to a product
B. We deal with exothermic reactions only. To avoid overheating  melting
and burning  of the catalyst particles, the reactor is cooled using a cooling
jacket around the reactor. Turbulences of the gas around the particles cause
a nearly constant velocity over a cross section of the reactor. The reactor we
have described here is called a cooled packed bed reactor.
In this section a mathematical model is derived that describes the essen
tials of the reactor unit. The dimension of the model denotes the number of
spatial directions in the model. Time is considered as an additional dimension.
Therefore, the onedimensional model consists of the axial dimension. For the
twodimensional model also the radial direction is taken into account. If we
distinguish between the solid, the catalyst or the adsorbent, and the gas phase,
we obtain a heterogeneous model. We consider a pseudohomogeneous model,
neglect the diﬀerence in temperature between the solid particles and the gas
phase, and assume that the species only exists in the gas phase. The model is
based on the conservation of mass and energy, which is described by balance
equations. In order to be able to formulate the model several assumptions
have to be made.
The mass transport mechanisms we take into account are convective mass
transport, turbulence around the catalyst particles and bulk diﬀusion. The
latter two are lumped together as dispersion in axial (and radial) direction.
Heat transfer is the result of the following mechanisms.
• Mechanisms independent of ﬂow:
– Thermal conduction through the solid particle,
– Thermal conduction through the contact point of two particles,
– Radiant heat transfer between the surfaces of two adjacent pellets.
• Mechanisms depending on the ﬂuid ﬂow:
– Thermal conduction through the ﬂuid ﬁlm near the contact surface
of two pellets,
– Heat transfer by convection,
– Heat conduction within the ﬂuid,
6.2 The model equations of a cooled packed bed reactor 153
– Heat transfer by lateral mixing.
The contribution of radiation to the total heat ﬂow turns out to be im
portant at temperatures above 400
◦
C. Below this temperature the various
mechanisms of heat transport, except for the heat transport by convection,
are usually described by a lumped parameter, the eﬀective thermal conduc
tivity.
The transport resistance between the gas phase and the catalyst is negli
gible, as is the multiplicity of the catalyst particles, that is, the diﬀerence in
activity. Therefore the eﬀectivity, denoted by η, is equal to one. We assume
that the gas phase satisﬁes the ideal gas law, and ﬂows through the vessel at a
constant velocity. The velocity over a cross section of the reactor is assumed
constant, due to a high rate of turbulence. We assume that the pressure drop
over the unit caused by the ﬂow along the catalyst bed, is negligible
The equipment both upstream and downstream the reactor have no inﬂu
ence on the behavior of the ﬂow inside the vessel. Furthermore we assume that
dispersion of energy and mass, caused by diﬀusion and turbulence around the
catalyst particles, can only occur inside the reactor and not in the channels
leading to it. In addition, the reaction only occurs inside the reactor. There
fore we can apply Danckwerts boundary conditions, see [63]. The temperature
and composition of the feed streams and the mass ﬂow are constant in time.
The thermal equilibrium between the gas and the catalyst occurs instan
taneously. Hence, intraparticle gradients in temperature or concentration are
assumed to be negligible. We assume that all the physical properties are con
stant in the range of temperature and concentration that occurs in the reactor.
The dispersion coeﬃcient is assumed to be constant and equal for every com
ponent. The reaction is exothermic and the heat of reaction is independent
of the temperature. The reaction does not change the number of moles in the
gas phase, thus one mole of species A gives one mole of species B.
In order to model the cooling we assume that the reactor wall is cooled at
an more or less constant temperature, caused by a high ﬂow rate or a large
density of the cooling ﬂow. Inside the reactor the cooling occurs only via the
gas phase due to the negligible contact area between the catalyst particles and
the reactor wall.
In Table 6.1 we have summarized the assumptions made for both the one
and the two dimensional model of a packed bed reactor. The additional con
dition for the onedimensional model is that concentration and temperature
are constant over a cross section of the reactor.
154 Chapter 6. Periodic processes in packed bed reactors
• The gas phase satisﬁes the ideal gas law.
• The velocity of the ﬂow is constant.
• The heat and concentration equilibrium between the gas phase and the catalyst
occurs instantaneously.
• The transport resistance and multiplicity of the catalyst particles is negligible.
• The physical properties, like the dispersion coeﬃcient, the thermal conductiv
ity and the molar based heat capacity, are independent of temperature and
concentration and equal for every component.
• The pressure drop caused by the catalyst particles is negligible.
• The reaction does not change the number of moles in the gas phase.
• The equipment both upstream and downstream has no inﬂuence on the ﬂow
inside the reactor.
• Dispersion of heat and mass occurs only inside the reactor.
• The temperature and composition of the feed gas is constant in time.
• The reactor wall is cooled at constant temperature.
• Cooling at the reactor wall inside the reactor occurs only via the gas phase.
Table 6.1: Assumptions on the cooled packed bed reactor
The component balances and the mass balance
The component balance represents the conservation of mass of one single
species in the gas phase. We consider a very basic example of a species A
reacting into species B, that is A → B. The total concentration of the gas is
denoted by ρ and the mole fraction of species A by y
A
. The partial concentra
tion of species A is given by C
A
= ρy
A
.
We compute the ﬂow of species A through the cross section of the reactor
at z = z
0
, that is, the number of moles that passes at z = z
0
every second.
The ﬂow is caused by convection and diﬀusion.
The convection is the bulk motion caused by feeding the reactor. If u is
the rate of the ﬂow, then the convection is given by
B
A
= C
A
u mol/(m
2
s).
The diﬀusion is based on contributions from molecular diﬀusion in the gas
phase (to create the highest possible entropy) and from the turbulent ﬂow
6.2 The model equations of a cooled packed bed reactor 155
PSfrag replacements
z z + ∆z
Figure 6.1: A segment of the reactor of length ∆z.
around the particles, and is given by
J
A
= −ρD
ax
∂y
A
∂z
mol/(m
2
s).
The molar ﬂux is the sum of the convection and the diﬀusion term and
represents the number of moles of a component that crosses a unit area per
second,
W
A
= J
A
+B
A
mol/(m
2
s).
To compute the ﬂow one has to multiply the ﬂux by the cross sectional area
of the reactor, denoted by A
c
. But, since the reactor is ﬁlled with particles,
the void fraction ε has to be taken into account. So, the ﬂow equals
F
A
= εA
c
W
A
mol/s.
The component balance is obtained by considering a small segment of the
reactor, see Figure 6.1. The volume of the segment is equal to ∆V = A
c
∆z.
The number of moles that accumulate in the small segment, ε∆V ∂C
A
/∂t, is
equal to the number of moles that enters the section, F
A
(z), minus the number
that leaves, F
A
(z + ∆z), minus the number that reacts per second. If r
is
the number of moles that reacts per kilogram catalyst per second, we have to
multiply this by (1 −ε)∆V ρ
cat
, to obtain the number of moles that reacts per
second in the segment. This leads to the equality
¦accumulation¦ = ¦in¦ −¦out¦ −¦reaction¦
ε∆V
∂C
A
∂t
= F
A
(z) −F
A
(z + ∆z) −(1 −ε)∆V ρ
cat
r
.
After dividing both sides by ∆V and letting the length of the segment going
to zero (∆z ↓ 0), we arrive at a partial diﬀerential equation. The component
balance of species A reads
ε
∂C
A
∂t
=
∂
∂z
ρεD
ax
∂y
A
∂z
−εuC
A
¸
−(1 −ε)ρ
cat
r
(6.1)
156 Chapter 6. Periodic processes in packed bed reactors
The left hand side of the component balance denotes the accumulation of
component A in the gas phase. The convective and diﬀusive contributions to
the ﬂow and the reaction rate are represented by the right hand side terms.
In the same way we obtain the component balance of species B, given by
ε
∂C
B
∂t
=
∂
∂z
ρεD
ax
∂y
B
∂z
−εuC
B
¸
+ (1 −ε)ρ
cat
r
. (6.2)
Note the plus sign in front of the reaction term, which implies that the reaction
increases the concentration of species B.
Finally, a third species is also present in the reactor, namely, the carrier
gas. The carrier gas is an inert, that is, it does not take part in the reaction.
Therefore, the component balance of the inert is given by
ε
∂C
I
∂t
=
∂
∂z
ρεD
ax
∂y
I
∂z
−εuC
I
¸
. (6.3)
If we add the component balances of all species we obtain the overall
mass balance, which is an important equation if the velocity is not necessarily
constant. Because the sum of the mole fractions equals one, y
A
+y
B
+y
I
= 1,
and has zero derivative, the overall mass balance is given by
ε
∂ρ
∂t
= −ε
∂uρ
∂z
.
Note that the reaction term is also canceled in this equation since the reaction
does not change the total number of molecules.
The component balance equations contain a second order derivative of the
mole fraction. So, we derive the boundary conditions at both ends of the
reactor. Note that we have assumed that mass dispersion appears only inside
the reactor. The reactor is called a closedclosed vessel, that is, either upstream
or downstream of the reactor mass dispersion is negligible. At the entrance
the boundary equation compares the ﬂux in and in front of the reactor This
leads to the equality W
A,0
= W
A
[
z=0
, which is equal to
C
A,0
u = −ρD
ax
∂y
A
∂z
+C
A
u
z=0
. (6.4)
At the other end we have assumed that no inﬂuences exists of the equipment
on the behavior of the ﬂow, which implies no gradients in the concentration
of the components,
∂y
A
∂z
z=L
= 0. (6.5)
6.2 The model equations of a cooled packed bed reactor 157
The energy balance
Let us consider the open system given by a thin segment of the packed bed
reactor of length ∆z. The total energy contained in the segment is given by
E
sys
. The energy balance describes the change in total energy of the system
(∂E
sys
/∂t), which can be computed in two diﬀerent ways.
For the ﬁrst approach we analyze what happens inside the segment. The
total energy is given by the sum of the energy of the catalyst and the energy
of the gas phase, that is,
E
sys
= E
s
ρ
s
(1 −ε)∆V +
¸
i
E
i
ρ
i
ε∆V. (6.6)
The energy E
i
of a species in the gas phase consists of the enthalpy, H
i
, and the
product −PV
i
. Here V
i
is the speciﬁc volume per mol of species i. Therefore,
¸
i
E
i
ρ
i
=
¸
i
(H
i
−PV
i
)ρ
i
=
¸
H
i
ρ
i
−P.
In the last step we used that
¸
i
V
i
ρ
i
= 1. Dividing (6.6) by ∆V and diﬀeren
tiating in time leads to the change in energy. If we assume that the density of
the catalyst is constant (∂ρ
s
/∂t = 0), then the change in potential energy is
linearly proportional to the change in temperature, that is,
∂H
∂t
= c
p
∂T
∂t
,
where c
p
denotes the speciﬁc heat capacity, at constant pressure. The speciﬁc
heat capacity c
p
is assumed to be independent of temperature and concentra
tion. Using the component balances (6.1)(6.3), that is,
ε
∂ρ
i
∂t
= ε
∂W
i
∂z
+ν
i
(1 −ε)ρ
s
r
,
the change in energy reads
1
∆V
∂E
sys
∂t
=
∂E
s
∂t
ρ
s
(1 −ε) +ε
¸
i
H
i
∂ρ
i
∂t
+ε
¸
i
ρ
i
∂H
i
∂t
−ε
∂P
∂t
= (1 −ε)(ρc
p
)
s
∂T
∂t
+ε(ρc
p
)
g
∂T
∂t
+ε
¸
i
H
i
∂W
i
∂z
+(1 −ε)ρ
s
(−∆H)r
−ε
∂P
∂t
, (6.7)
where (−∆H) =
¸
i
ν
i
H
i
denotes the heat of reaction.
158 Chapter 6. Periodic processes in packed bed reactors
On the other hand, we can consider the interaction of the segment with its
surrounding. We have to take into account conduction (through the gas phase
and the catalyst particles), heat transport due to the ﬂow in the reactor, and
cooling. The energy that results from the work of equipment is neglected.
For the conduction term, we use Fick’s ﬁrst law of diﬀusion, cf. [63, 72].
The amount of energy that passes a cross section of the reactor per square
meter, per second equals
−λ
ax
∂T
∂z
, (6.8)
where λ
ax
is the eﬀective axial heat conductivity, that depends on the heat
conductivities in the gas and the solid phase and on the heattransfer resistance
between the two phases. Note that the conduction operates in the direction
of decreasing temperature, indicated by the minus sign.
The ﬂow of energy is the amount of energy that crosses a passes section of
the reactor per second and is given by
¸
i
F
i
E
i
= εA
c
¸
i
W
i
E
i
.
Because the energy of species i equals E
i
= H
i
−PV
i
, we obtain
¸
i
W
i
E
i
=
¸
i
W
i
H
i
−P
¸
i
W
i
V
i
=
¸
i
W
i
H
i
+PρD
ax
¸
i
∂y
i
∂z
V
i
−Pu
¸
i
ρ
i
V
i
. (6.9)
By assuming that the speciﬁc volume is equal for every species in the gas
phase, the second term of the last expression in (6.9) disappears.
The cooling, denoted by
˙
Q, is the amount of energy that leaves the segment
at the wall of the reactor per second. The cooling rate per square meter
surface area is linearly proportional to the diﬀerence in temperature of the
segment and of the cooling jacket (−U
w
(T − T
c
)). The surface area of the
segment equals 2πR∆Z and the volume of the segment is πR
2
∆z. By a
w
we denote the ratio between the surface area and the volume of the segment
(a
w
= 2πR∆z/(πR
2
∆z) = 2/R.). The total cooling per second is thus given
by
˙
Q = −U
w
(T −T
c
)A
c
∆za
w
. (6.10)
From (6.8), (6.9) and (6.10), we obtain the following expression for the change
6.2 The model equations of a cooled packed bed reactor 159
in energy of the segment
lim
∆z→0
1
∆V
∂E
sys
∂t
= lim
∆z→0
1
A
c
∆z
−λ
ax
A
c
∂T
∂z
z
z+∆z
+εA
c
¸
i
W
i
H
i
−Pu
z
z+∆z
−U
w
a
w
A
c
∆z(T −T
c
)
¸
= λ
ax
∂
2
T
∂z
2
−ε
∂
∂z
¸
i
W
i
H
i
+
∂
∂z
Pu
¸
−U
w
a
w
(T −T
c
). (6.11)
Note that the second term of the right hand side can be expanded to
ε
∂
∂z
¸
i
W
i
H
i
= ε
¸
i
∂H
i
∂z
W
i
+ε
¸
i
H
i
∂W
i
∂z
. (6.12)
Because
¸
i
W
i
=
¸
i
−ρD
a
x
∂y
i
∂z
+ρ
i
u
= ρ
g
u.
and because ∂H
i
/∂z can be approximated by (c
p
)
g
∂T/∂z the ﬁrst term of the
right hand side of (6.12) becomes
ε(ρc
p
)
g
u
∂T
∂z
.
Since (6.7) is valid for all ∆V, we can combine (6.7) and (6.11). The term
ε
¸
i
H
i
∂W
i
/∂z cancels and we derive the equation for the energy balance
(1 −ε)(ρc
p
)
s
+ε(ρc
p
)
g
∂T
∂t
−ε
∂P
∂t
= λ
ax
∂
2
T
∂z
2
−ε(ρc
p
)
g
u
∂T
∂z
−
∂
∂z
Pu
¸
−U
w
a
w
(T −T
c
) + (1 −ε)ρ
s
(−∆H)r
. (6.13)
The left hand side shows the accumulation of enthalpy in the gas and solid
phase. On the right hand side, the ﬁrst two terms show the contribution of the
heat transfer by convection and diﬀusion. The fourth term denotes the heat
transfer through the reactor wall to the surroundings. The last term gives the
enthalpy change due to reaction.
The boundary conditions are obtained in a similar way to those of the
component balance, (6.4) and (6.5), and are given by
u(ρc
p
)
g
T
0
= −λ
ax
∂T
∂z
+u(ρc
p
)
g
T
z=0
,
160 Chapter 6. Periodic processes in packed bed reactors
at the entrance of the reactor and
∂T
∂z
z=L
= 0.
at the product end.
Reaction rate
The reaction rate depends on many factors. First of all it depends on the
concentration of the reactants in the reactor. In addition the temperature is
important. At low temperature, for example, the reaction might not occur at
all. If the heat of the reactor is extremely high, it can accelerate the reaction
and the reactor might explode. The type of the catalyst and the system of the
reaction on the catalyst increase the complexity of the formula.
In the simulations of Chapter 8 we restrict ourselves to the reaction rate
given by
r
(c, T) =
ηk
∞
a
v
k
c
exp[−E
a
/R
gas
T]
a
v
k
c
+ηk
∞
exp[−E
a
/R
gas
T]
c,
according to Khinast et al. [33].
Radial direction
To extend to onedimensional model with the radial direction, we assume
that the state in the reactor is cylindrically symmetric and that the dispersion
coeﬃcient D
rad
and the thermal conductivity λ
rad
are independent of position,
concentration and temperature. In addition, we assume that energy transport
by mass diﬀusion can be lumped into the thermal conductivity.
We consider the radial part of the diﬀusion in the energy balance equation.
The radial part of the diﬀusion in the component balance is obtained in a
similar way. We subdivide the segment of the reactor with width ∆z in M
rings, see Figure 6.2. The widths of the rings are given by ∆r
1
, . . . , ∆r
M
.
Denote by r
i
the center radius of the ith ring, that is r
1
=
1
2
∆r
1
, r
2
= ∆r
1
+
1
2
∆r
2
, and in general r
i
=
¸
i−1
j=1
∆r
j
+
1
2
∆r
i
, i = 1, . . . , M. We take a ring
with center radius r and width ∆r. The volume of this ring is given by
∆V = ∆z
π
r +
1
2
∆r
2
−π
r −
1
2
∆r
2
¸
= 2π∆z r∆r.
Similarly to the axial case (6.8), the heat conductivity in radial direction per
m
2
surface area, is given by
−λ
rad
∂T
∂r
.
6.2 The model equations of a cooled packed bed reactor 161
PSfrag replacements
∆z
r
1
r
2
Figure 6.2: A segment of the reactor of length ∆z.
The accumulation in the ring under consideration equals the ﬂow through the
surface of the ring at r −
1
2
∆r minus the ﬂow through the surface of the ring
at r +
1
2
∆r. If we divide the accumulation term by the volume of the ring, we
obtain
1
∆V
(−λ
rad
)
∂T
∂r
r−
1
2
∆r
2π
r −
1
2
∆r
∆z
−(−λ
rad
)
∂T
∂r
r+
1
2
∆r
2π
r +
1
2
∆r
∆z
¸
. (6.14)
The expression in Formula (6.14) can be further simpliﬁed to
(λ
rad
)
1
∆r
∂T
∂r
r+
1
2
∆r
r−
1
2
∆r
+
1
2r
(λ
rad
)
∂T
∂r
r−
1
2
∆r
+
∂T
∂r
r+
1
2
∆r
.
By taking the limit ∆r →0, we arrive at
(λ
rad
)
∂
2
T
∂r
2
+
1
r
∂T
∂r
¸
,
which equals
(λ
rad
)
1
r
∂
∂r
r
∂T
∂r
¸
. (6.15)
The radial part of the diﬀusion in the component balance is given by
(ρD
rad
)
1
r
∂
∂r
r
∂y
A
∂r
¸
. (6.16)
162 Chapter 6. Periodic processes in packed bed reactors
At the wall of the reactor the boundary condition
λ
rad
∂T
∂r
r=R
= −U
w
(T(R) −T
c
), (6.17)
is added to the system. Equation (6.17) describes the heat loss at the reactor
wall to the surrounding cooling jacket, which is linearly proportional to the
diﬀerence in the temperature inside and outside of the reactor wall. Because
no material can pass through the wall of the reactor, we have
∂y
A
∂r
r=R
= 0.
The cylindrical symmetry in the reactor yields the boundary conditions
∂y
A
∂r
r=0
= 0, and
∂T
∂r
r=0
= 0.
A justiﬁcation of the twodimensional model
In the following, we justify that the above extension of the onedimensional
model is indeed natural. The relation between the one and twodimensional
balance equations is based on the idea of a weighted average. To give a useful
onedimensional representation of the twodimensional state of the reactor, the
weighted average can be taken of the temperature and the concentration over
the cross section of the reactor. In the twodimensional model, the temperature
in the point (z, r) at time t is denoted by T(z, r, t). So, the average temperature
over the cross section through z = z
0
equals
¯
T(z
0
, t) =
2
R
2
R
0
rT(z
0
, r, t)dr. (6.18)
Before we compare the energy balance equations of both models, we apply
a few simpliﬁcations. We assume that the term (ρc
p
)
g
is constant in time and
space, as well as the velocity and the pressure. Therefore, the energy balance
(6.13) becomes
(1 −ε)(ρc
p
)
s
+ε(ρc
p
)
g
∂T
∂t
=
λ
ax
∂
2
T
∂z
2
−ε(ρc
p
)
g
u
∂T
∂z
−U
w
a
w
(T −T
c
). (6.19)
6.2 The model equations of a cooled packed bed reactor 163
The twodimensional version of the energy balance equation reads
(1 −ε)(ρc
p
)
s
+ε(ρc
p
)
g
∂T
∂t
=
λ
ax
∂
2
T
∂z
2
−ε(ρc
p
)
g
u
∂T
∂z
+ (λ
rad
)
1
r
∂
∂r
r
∂T
∂r
¸
. (6.20)
If we take the weighted average of both sides of the energy balance, (6.20),
over the cross section of the reactor and use (6.18) for the weighted average
of the temperature, we obtain
((ρc
p
)
s
(1 −ε) + (ρc
p
)
g
ε)
∂
¯
T
∂t
= λ
ax
∂
2
¯
T
∂z
2
−u(ρc
p
)
g
∂
¯
T
∂z
+
(−∆H)
2
R
2
R
0
r r
(c, T)dr +λ
rad
2
R
2
r
∂T
∂r
R
r=0
. (6.21)
Using the boundary conditions in radial direction, we can rewrite the last term
of (6.21) in the following way
λ
rad
2
R
2
r
∂T
∂r
R
r=0
= −
2
R
U
w
T(R) −T
0
. (6.22)
If we substitute (6.22) in (6.21) and assume that the concentration and the
temperature are constant in the radial direction, we recover the energy balance
of the onedimensional model, (6.19), with a
w
= 2/R.
In the same way we can show that the component balance of the one
dimensional model is also a limiting case of the component balance of the
twodimensional model.
Dimensionless equations
In order to obtain the dimensionless versions of the balance equations we
use the following dimensionless variables. The conversion is given by x =
(c
0
− c)/c
0
, where c
0
is the concentration of the reactants in the feeding gas
and c = C
A
= ρy
A
. If the conversion equals zero no reaction has occurred
and if the conversion equals one the reaction is completed. The dimensionless
temperature is given by θ = (T −T
0
)/T
0
, where T
0
is the temperature of the
feeding gas. Since the reaction is exothermic and the cooling temperature is
ﬁxed at T
0
, the dimensionless temperature is always positive. The independent
dimensionless variables are time, τ = tu/L, the axial distance, ξ = z/L, and
the radial distance, ζ = r/R, for the twodimensional model.
164 Chapter 6. Periodic processes in packed bed reactors
a
1
= (ρc
p
)
s
(1 −ε)/(ρc
p
)
g
+ε a
2
= (−∆H)c
0
/(T
0
(ρc
p
)
g
) = ∆T
ad
/T
0
a
3
= Lηk
∞
/u a
4
= ηk
∞
/(a
v
k
c
)
Pe
m
= uL/(εD
ax
) Pe
h
= uL(ρc
p
)
g
/λ
ax
β = E
a
/(RT
0
) Φ = 2LU
w
/(Ru(ρc
p
)
g
)
Pe
mp
= uL/(εD
rad
) Pe
hp
= (ρc
p
)
g
Lu/λ
rad
Table 6.2: The dimensionless parameters of the balance equations.
We ﬁrst derive the dimensionless version of the component (conversion)
balance of the onedimensional model. Substituting the expressions for the
dimensionless variables into (6.1) gives
ε
∂(1 −x)c
0
∂τL/u
= εD
ax
∂
2
(1 −x)c
0
∂(ξL)
2
−u
∂(1 −x)c
0
∂ξL
−
ηk
∞
a
v
k
c
(1 −x)c
0
a
v
k
c
exp(
E
a
RT
0
1
1+θ
) +ηk
∞
.
Hereafter, we divide both sides by the factor −c
0
u/L. By gathering all param
eters in dimensionless groups, we obtain
ε
∂x
∂τ
=
1
Pe
m
∂
2
x
∂ξ
2
−
∂x
∂ξ
+a
3
(1 −x)
exp(β/(1 +θ)) +a
4
.
In the same way the energy balance in (6.19) becomes
((ρc
p
)
s
(1 −ε) + (ρc
p
)
g
ε)
∂T
0
(1 +θ)
∂τL/u
= λ
ax
∂
2
T
0
(1 +θ)
∂(ξL)
2
−u(ρc
p
)
g
∂T
0
(1 +θ)
∂ξL
+
(−∆H)
ηk
∞
a
v
k
c
(1 −x)c
0
a
v
k
c
exp(
E
a
RT
0
1
1+θ
) +ηk
∞
−U
w
a
w
(T
0
(1 +θ) −T
c
).
Dividing by (ρc
p
)
g
T
0
u/L gives
a
1
∂θ
∂τ
=
1
Pe
h
∂
2
θ
∂ξ
2
−
∂θ
∂ξ
+a
2
a
3
(1 −x)
exp[β/(1 +θ)] +a
4
−Φθ.
The expressions for the dimensionless parameters are given in Table 6.2.
For the twodimensional model the major part follows from the above dis
cussion. We only deal with the radial components of the diﬀusion terms. After
dividing the radial diﬀusion term of the energy balance (6.15) by (ρc
p
)
g
T
0
u/L,
the dimensionless version is given by
λ
rad
(ρc
p
)
g
Lu
L
2
R
2
1
ζ
∂
∂ζ
ζ
∂θ
∂ζ
¸
.
6.2 The model equations of a cooled packed bed reactor 165
Therefore, we deﬁne
Pe
hp
=
(ρc
p
)
g
Lu
λ
rad
.
By substituting the dimensionless variables in the radial term of the com
ponent balance (6.16) and subsequently dividing again by −c
0
u/L, we obtain
εD
rad
uL
L
2
R
2
1
ζ
∂
∂ζ
ζ
∂x
∂ζ
¸
,
and we deﬁne Pe
mp
by
Pe
mp
=
uL
εD
rad
.
In Appendix C we explain that in general D
ax
= D
rad
. We have added the
parameters Pe
mp
and Pe
hp
in Table 6.2.
The dimensionless boundary conditions in axial direction of the one and
twodimensional model are derived in the same way as the balance equations
and given at the end of this section. To point out the diﬀerence between
the one and twodimensional model we explicitly derive the dimensionless
boundary condition for the temperature at r = R (ζ = 1). So, starting with
λ
rad
∂T
∂r
r=R
= −U
w
(T(R) −T
c
),
we substitute θ and ζ, and divide both sides by u(ρc
p
)
g
T
0
R/L
2
, which leads
to
λ
rad
Lu(ρc
p
)
g
L
2
R
2
∂θ
∂ζ
ζ=1
= −
U
w
L
Ru(ρc
p
)
g
θ(1) = −
1
2
Φ θ(1).
Note that dimensionless cooling capacity, Φ, is multiplied by a factor one half.
A summary of the model equations
We summarize the complete dimensionless one and twodimensional model.
The parameters involved are given in Table 6.2.
For the onedimensional model the component balance reads
ε
∂x
∂τ
=
1
Pe
m
∂
2
x
∂ξ
2
−
∂x
∂ξ
+χ(x, θ), (6.23)
where the reaction rate is given by
χ(x, θ) = a
3
(1 −x)
exp(β/(1 +θ)) +a
4
¸
−1
.
166 Chapter 6. Periodic processes in packed bed reactors
The energy balance involves a cooling term and is given by
a
1
∂θ
∂τ
=
1
Pe
h
∂
2
θ
∂ξ
2
−
∂θ
∂ξ
+a
2
χ(x, θ) −Φθ. (6.24)
The boundary conditions read
θ −
1
Pe
h
∂θ
∂ξ
ξ=0
= 0,
∂θ
∂ξ
ξ=1
= 0,
x −
1
Pe
m
∂x
∂ξ
ξ=0
= 0,
∂x
∂ξ
ξ=1
= 0.
(6.25)
For the twodimensional model the component balance is given by
ε
∂x
∂τ
=
1
Pe
m
∂
2
x
∂ξ
2
−
∂x
∂ξ
+χ(x, θ) +
1
Pe
mp
L
2
R
2
1
ζ
∂
∂ζ
ζ
∂x
∂ζ
¸
, (6.26)
the energy balance is given by
a
1
∂θ
∂τ
=
1
Pe
h
∂
2
θ
∂ξ
2
−
∂θ
∂ξ
+a
2
χ(x, θ) +
1
Pe
hp
1
ζ
∂
∂ζ
ζ
∂θ
∂ζ
¸
(6.27)
and the boundary conditions are given by
θ −
1
Pe
h
L
2
R
2
∂θ
∂ξ
ξ=0
= 0,
∂θ
∂ξ
ξ=1
= 0,
x −
1
Pe
m
∂x
∂ξ
ξ=0
= 0,
∂x
∂ξ
ξ=1
= 0,
∂θ
∂ζ
ζ=0
= 0,
1
2
Φθ +
1
Pe
hp
L
2
R
2
∂θ
∂ζ
ζ=1
= 0,
∂x
∂ζ
ζ=0
= 0,
∂x
∂ζ
ζ=1
= 0.
(6.28)
Chapter 7
Numerical approach for
solving periodically forced
processes
To solve a model consisting of partial diﬀerential equations including nonlinear
terms, the use of a computer is unavoidable. Therefore the model has to be
discretized in space and implemented in the computer. In Section 7.1 we give
an example of a discretization, called ﬁnite volumes.
During this process of discretization and implementation many errors might
be made. For instance, rounding errors can have a major inﬂuence. Further
more, the grid should be chosen ﬁne enough. So, the implemented models
have to be checked. This will be done in Section 7.2.
In this chapter we use basic partial diﬀerential equations. The ideas de
veloped in Sections 7.2 and 7.1 can easily be extended to the model equations
of the packed bed reactor derived in Section 6.2.
The last part of this chapter, Section 7.3, contains a short description of
bifurcation theory and a continuation technique. The bifurcation theory can
be used to ﬁnd out whether a periodically forced process has a periodic stable
limiting state. In addition, it shows when small changes in the parameters
have a major inﬂuence on the behavior of the limiting state. For parame
ter investigation we do not want to compute the periodic limiting state from
scratch for every value of the bifurcation parameter. If we change the bifurca
tion parameter slightly we would prefer to take the old periodic limiting state
as initial estimate of an iterative method to compute the new one.
167
168 Numerical approach for solving periodically forced processes
7.1 Discretization of the model equations
We consider the following initialboundary value problem
u
t
= du
zz
−au
z
+h(u),
(au −du
z
)
z=0
= du
z
z=1
= 0,
u(z, 0) = u
0
(z), z ∈ [0, 1],
(7.1)
where u : [0, 1] R
+
→ R and a, d > 0. The partial diﬀerential equation
describes, for example, the temperature distribution in a reactor. Note that
the positivity of a implies that the gas is ﬂowing from the left to the right end
of the reactor, in positive zdirection.
In order to discretize (7.1) we divide the reactor in N segments of equal
width. The state u is assumed to be constant over a segment and located
in the center, see Figure 7.1. For every segment, i = 1, . . . , N, a balance
PSfrag replacements
z = 0 z = 1
∆z
u(z
1
) u(z
2
) u(z
i
) u(z
N
)
Figure 7.1: The distribution of the grid points over the interval.
equation is derived, where the accumulation term u
t
(z
i
) is expressed in terms
of the state of the segment, u(z
i
), and the states of the neighboring segments,
u(z
i−2
), u(z
i−1
), u(z
i+1
) and u(z
i+2
). This results in a large system of ordinary
diﬀerential equations, which can be written as
U
t
= F(U(t)), (7.2)
where U(t) = (u(z
1
, t), . . . , u(z
n
, t)).
In other words, we divide the interval [0, 1] in N small intervals of equal
length (∆z = 1/N), and deﬁne z
i
= (i −
1
2
) ∆z, i = 1, . . . , N. The boundaries
of the ith interval are given by z
i+
1
2
= i ∆z and z
i−
1
2
= (i − 1) ∆z, for
i = 1, . . . , N. Therefore, z1
2
= 0 and z
N+
1
2
= 1. In order to approximate the
ﬁrst and second derivative of u in a grid point z
i
, we use the Taylor expansion
7.1 Discretization of the model equations 169
of u around z
i
u(z) = u(z
i
) +u
z
(z
i
)(z −z
i
) +
1
2
u
zz
(z
i
)(z −z
i
)
2
+
1
6
u
zzz
(z
i
)(z −z
i
)
3
+O([z −z
i
[
4
). (7.3)
The ﬁrst derivative can be computed in several ways. One approach is
called ﬁrst order upwind. If the ﬂow rate in the reactor is rather high, we
have to use information of grid points that lie in upstream direction. So, if
the ﬂow comes from the left we use the state value u in z
i−1
. We apply (7.3)
in z = z
i−1
and obtain
u(z
i−1
) = u(z
i
) −u
z
(z
i
)∆z +
1
2
u
zz
(z
i
)(∆z)
2
+O((∆z)
3
). (7.4)
If we rearrange the terms, the derivative of u in z
i
equals
u
z
(z
i
) =
u(z
i
) −u(z
i−1
)
∆z
+O(∆z). (7.5)
The ﬁrst term of the right hand side of (7.5) is the approximation for the
derivative of u. The situation is given schematically in Figure 7.2.
PSfrag replacements
z
i−1
z
i−
1
2
z
i
z
i+1
u(z
i−1
)
u(z
i
)
u(z
i+1
)
Figure 7.2: First order upwind approximation of the derivative in z
i
Another approach, called second order central, is applicable if the diﬀusion
in the reactor dominates the dynamics. We apply (7.3) in z
i+1
u(z
i+1
) = u(z
i
) +u
z
(z
i
)∆z +
1
2
u
zz
(z
i
)(∆z)
2
+O((∆z)
3
). (7.6)
170 Numerical approach for solving periodically forced processes
Subtracting (7.4) from (7.6) gives
u(z
i+1
) −u(z
i−1
) = 2∆zu
z
(z
i
) +O((∆z)
3
)
and therefore
u
z
(z
i
) =
u(z
i+1
) −u(z
i−1
)
2∆z
+O((∆z)
2
). (7.7)
The situation is given schematically in Figure 7.3.
PSfrag replacements
z
i−1
z
i−
1
2
z
i
z
i+
1
2
z
i+1
u(z
i−1
)
u(z
i
) u(z
i+1
)
Figure 7.3: Second order central approximation of the derivative in z
i
For the second derivative of u one relevant approximation is available. Note
that
u(z
i+1
) −u(z
i
)
∆z
= u
z
(z
i
) +
1
2
u
zz
(z
i
)∆z +
1
6
u
zzz
(z
i
)(∆z)
2
+O((∆z)
3
), (7.8)
and
u(z
i
) −u(z
i−1
)
∆z
= u
z
(z
i
) −
1
2
u
zz
(z
i
)∆z +
1
6
u
zzz
(z
i
)(∆z)
2
+O((∆z)
3
). (7.9)
We subtract (7.9) from (7.8), divide by ∆z and arrive at
u
zz
(z
i
) =
u(z
i+1
) −2u(z
i
) +u(z
i−1
)
h
2
+O(h
2
).
If we use second order central for the diﬀusion term and ﬁrst order upwind
for the convective term, the ith component function of F in (7.2) is given by
d
(∆z)
2
u(z
i+1
) −2u(z
i
) +u(z
i−1
)
−
a
∆z
u(z
i
) −u(z
i−1
)
+h(u(z
i
)), (7.10)
7.1 Discretization of the model equations 171
for i = 2, . . . , N − 1. In order to derive the ﬁrst component function and the
last component function of F, we apply a slightly diﬀerent approach. First,
note that
(du
zz
−au
z
)
z
i
= (du
z
−au)
z
z
i
≈
1
z
i+
1
2
−z
i−
1
2
(du
z
−au)
z
i+
1
2
z
i−
1
2
.
The ﬁrst derivative u
z
evaluated at the boundary between two segments,
(z
i+
1
2
), is approximated using the state u at the grid points of both segments
(z
i
and z
i+1
), that is,
u
z
(z
i+
1
2
) =
1
z
i+1
−z
i
(u(z
i+1
) −u(z
i
)).
If we apply ﬁrst order upwind then u evaluated at z
i+
1
2
is approximated by
u(z
i
), the nearest point in upstream direction. The ith component function of
F becomes
d
1
z
i+
1
2
−z
i−
1
2
u(z
i+1
) −u(z
i
)
z
i+1
−z
i
−
u(z
i
) −u(z
i−1
)
z
i
−z
i−1
−a
u(z
i
) −u(z
i−1
)
z
i+
1
2
−z
i−
1
2
+h(u(z
i
)), (7.11)
for i = 2, . . . , N−1. Note that for equidistant grid (7.10) and (7.11) are equal.
Because the mesh is chosen such that z1
2
= 0, the left boundary condition
reads (du
z
−au)[
z
1
2
= 0. The ﬁrst component function of F is given by
d
1
z3
2
u(z
2
) −u(z
1
)
z
2
−z
1
−a
u(z
1
)
z3
2
+h(u(z
1
)).
On the other hand, we have deﬁned z
N+
1
2
= 1 and thus du
z
[
z
N+
1
2
= 0. The
Nth component function of F yields
d
1
1 −z
N−
1
2
−
u(z
N
) −u(z
N−1
)
z
N
−z
N−1
−a
u(z
N
) −u(z
N−1
)
1 −z
N−
1
2
+h(u(z
N
)).
If we apply central discretization for the ﬁrst derivative, u(z
i+
1
2
) is approx
imated by (u(z
i
) + u(z
i+1
))/2 to derive the last component function of F. In
this case we have to evaluate u at z
N+1
. Because the ﬁrst derivative of u in
z
N+
1
2
= 1 equals zero, we can replace u(z
N+1
) by u(z
N
).
172 Numerical approach for solving periodically forced processes
The twodimensional initialboundary value problem
u
t
= d
1
u
zz
−a
1
u
z
+
d
2
r
(ru
r
)
r
+h(u),
(a
1
u −d
1
u
z
)
z=0
= d
1
u
z
z=1
= 0,
u
r
r=0
= (a
2
u +d
2
ru
r
)
r=1
= 0,
u(z, r, 0) = u
0
(z, r), (z, r) ∈ [0, 1]
2
,
(7.12)
where a
1
, a
2
, d
1
, d
2
> 0, can be discretized using mainly the same approach.
In addition to the axial derivative in the partial diﬀerential equation, we have
to deal with radial component of the diﬀusion term. Therefore, we divide the
radial axis into M intervals, with boundaries r1
2
, r3
2
, . . . , r
M+
1
2
. We set r1
2
= 0
and r
M+
1
2
= 1. In every interval j we choose a grid point r
j
, j = 1, . . . , M,
and approximate the radial term at the grid point r
j
, by
d
2
r
(ru
r
)
r
r
j
≈
d
2
r
j
1
r
j+
1
2
−r
j−
1
2
(ru
r
)
r
j+
1
2
r
j−
1
2
. (7.13)
For j = 2, . . . , M −1 we can expand (7.13) to
d
2
r
(ru
r
)
r
r
j
≈
d
2
r
j
1
r
j+
1
2
−r
j−
1
2
r
j+
1
2
u(r
j+1
) −u(r
j
)
r
j+1
−r
j
−r
j−
1
2
u(r
j
) −u(r
j−1
)
r
j
−r
j−1
.
For the ﬁrst and the last grid point the boundary conditions have to be taken
into account. Because the derivative u
r
at r = 0 equals zero, we obtain for
j = 1,
d
2
r
(ru
r
)
r
r
1
≈
d
2
r
1
1
r3
2
−r1
2
(ru
r
)
r
3
2
r
1
2
≈
d
2
r
1
u(r
2
) −u(r
1
)
r
2
−r
1
.
The other boundary condition leads to
d
2
r
(ru
r
)
r
r
M
≈
d
2
r
M
1
r
M+
1
2
−r
M−
1
2
(ru
r
)
r
M+
1
2
r
M−
1
2
≈
1
r
M
1
1 −r
M−
1
2
−a
2
u(1) −d
2
r
M−
1
2
u(r
M
) −u(r
M−1
)
r
M
−r
M−1
.
The value of u at r = 1 is not deﬁned yet. Because the gradient of u at r = 1
in general is not equal to zero, we extrapolate u using the values at the grid
points r
M−1
and r
M
, see Figure 7.4. This leads to
u(1) = u(r
M
) + (1 −r
M
)
u(r
M
) −u(r
M−1
)
r
M
−r
M−1
.
7.2 Tests for the discretized model equations 173
PSfrag replacements
r
M−1
r
M−
1
2
r
M
r
M+
1
2
u(r
M−1
)
u(r
M
)
u(1)
Figure 7.4: An approximation of the value u(1).
So far, we have only considered the axial and radial terms separately.
In order to discretize (7.12) the values of u in the grid points of the two
dimensional mesh should be stored in one single vector U. In order to obtain
a small band width of the Jacobian of the function F, we ﬁrst count in the
direction that has the smallest number of grid points (often M < N). So the
vector U becomes
U = (u
1,1
, u
1,2
, . . . , u
1,M
, u
2,1
, . . . , u
N,1
, . . . , u
N,M
),
where u
i,j
= u(z
i
, r
j
).
7.2 Tests for the discretized model equations
The initialboundary value problem (7.1) can explicitly be solved if the func
tion h is aﬃne. In general the solution is a inﬁnite sum of terms with co
sine, sine and the exponential function, where the coeﬃcients are ﬁxed by
the boundary conditions and the initial condition. The explicit solution is
derived by splitting the variables, that is, assuming that u in of the form
u(z, t) = Z(z)T(t) and substituting this in the partial diﬀerential equations.
The obtained analytical solution can be compared to the results of numerical
simulation.
If it is not possible to compute the explicit solution we have to consider
other techniques to check the solution given by the computer. One approach
is to check the balances. We ﬁrst integrate both sides of the partial diﬀerential
174 Numerical approach for solving periodically forced processes
equation of (7.1) in time (T > 0) and space,
T
0
1
0
u
t
dzdt =
T
0
1
0
(du
zz
−au
z
+h(u))dzdt.
If we assume that the solution is continuous, we can change the order of
integration. Therefore,
1
0
u(z, t)
T
0
¸
dz =
T
0
(dw
z
−au)
1
0
+
1
0
h(u)dz
¸
dt.
By applying the boundary conditions of (7.1) we obtain
1
0
¦u(z, T) −u
0
(z)¦dz =
T
0
−au(1, t) +
1
0
h(u)dz
¸
dt.
We can check the simulation by computing the integral at the right hand side
simultaneously with the variable u.
Another possibility is multiplying both side of the partial diﬀerential equa
tion by u and then integrating in time and space, which results in the following
’energy estimate’,
T
0
1
0
(uu
t
)dzdt =
T
0
1
0
u(du
zz
−au
z
+h(u))dzdt.
Again interchanging the integration order gives
1
0
1
2
u
2
T
0
¸
dz =
T
0
duu
z
1
0
−
1
0
du
2
z
dz −
a
2
u
2
1
0
+
1
0
uh(u)dz
¸
dt,
and inserting the boundary conditions yields
1
2
1
0
¦u
2
(z, T) −u
2
(z, 0)¦dz =
T
0
−au
2
(0, t)
−d
1
0
u
2
z
dz −
a
2
u
2
(1, t) +
a
2
u
2
(0, t) +
1
0
uh(u)dz
¸
dt.
This can be simpliﬁed to
1
0
u
2
(z, T)dz =
1
0
u
2
0
(z)dz
−
T
0
au
2
(1, t) +au
2
(0, t)) + 2d
1
0
u
2
z
dz −
1
0
uh(u)dz
¸
dt.
7.3 Bifurcation theory and continuation techniques 175
This implies that if the function h satisﬁes
1
0
uh(u)dz < 0 then the ’total
amount of energy’ in the system (u(., t)
2
2
=
1
0
u
2
(z, t)dz) is decreasing.
If the function h is aﬃne, the twodimensional problem (7.12) can be solved
explicitly by splitting the variables, that is, assuming that u in of the form
u(z, r, t) = Z(z)R(r)T(t). The resulting ordinary diﬀerential equations in z
and t are solved in the same way, as was done for the onedimensional problem.
The ordinary diﬀerential equation that involves the variable r is a socalled
Bessel equation and the solutions are given in terms of the Bessel function. In
general the solution of (7.12) is an inﬁnite sum of terms with cosine, sine, the
exponential function and the Bessel function, where the coeﬃcients are ﬁxed
by the boundary conditions and the initial condition.
We shortly discuss some additional approaches to check the implemen
tation of the discretized equations. If a reliable implementation of the one
dimensional model exists, we can consider (artiﬁcial) limit cases of the process
in which the onedimensional and the twodimensional should give the same
results. That is, when the state of the reactor has no gradients in radial di
rection. We describe two situations starting from an initial state u
0
, which
is constant in radial direction. Note that the boundary condition at z = 0
and z = 1 are uniform over the cross section of the reactor. If the diﬀusion
in radial direction is high, diﬀerences in radial direction will be removed in
stantaneously. In terms of 7.12 this means that if d
2
is large, the term (ru
r
)
r
will become small shortly. Another example is when the cooling of the reactor
stagnates at or nearby the reactor wall. That is, the reactor wall is isolated,
a
2
= 0, or no diﬀusion exists in radial direction, d
2
= 0. If a
2
= 0 then the
temperature gradient at r = 1 will be zero and no gradients are introduced.
In order to check the implemented model if radial gradients are present,
we can again integrate the initialboundary value problem (7.12) in time and
space. Note that for the radial direction we ﬁrst have to multiply the equation
by r, in order to obtain the weighted average. We obtain similar integral
equations as in the case of the system (7.1). Terms of the integral equation
can simultaneously be integrated with the variable u.
7.3 Bifurcation theory and continuation techniques
Periodically forced processes in packed bed reactors can be described by use
of partial diﬀerential equations. In order to investigate the behavior of the
system numerically, we discretize the equations in space using a ﬁnite volumes
technique with ﬁrst order upwind for the convective term. The state of the
reactor at time t is denoted by a vector x(t) from the ndimensional vector
176 Numerical approach for solving periodically forced processes
space, R
n
. The resulting system of n ordinary diﬀerential equations can be
written as
x
(t) = F(x(t), t), (7.14)
where F(, t +t
c
) = F(, t) for t ∈ R, and t
c
denotes the period length.
The map f : R
n
→R
n
that assigns to an initial state at time zero, x(0) =
x
0
, the value of the solution after one cycle, x(t
c
), is called the Poincar´e or
period map of (7.14). So, we have
f(x
0
) = x(t
c
). (7.15)
In other words, evaluating the map f is equivalent to simulating one cycle of
the process in (7.14).
Moreover, a periodic state of the reactor corresponds to a t
c
periodic so
lution x(t) of (7.14). Since the initial condition, x
0
, of a periodic solution is a
ﬁxed point of the period map, we solve
f(x) −x = 0, (7.16)
using iterative methods. Note that the value f(x) is obtained by integrating
a large system of ordinary diﬀerential equations over a period t
c
. Therefore,
the function evaluation is a computationally expensive task, and the iterative
method that needs the fewest evaluations of f to solve (7.16), is the most
eﬃcient. Since it might take a long transient time before the limiting periodic
state is reached, direct methods are preferable to dynamical simulation.
The model of a packed bed reactor contains several physical parameters,
which may vary over certain speciﬁed intervals. Therefore, it is important to
understand the qualitative behavior of the systems as a bifurcation parame
ter change. A good design for the periodically forced process in the packed
bed reactor is such that the qualitative behavior does not change when the
bifurcation parameter is varied slightly from the value for which the original
design was made. The value of the bifurcation parameter where the qualita
tive property of the state of the reactor changes is called a bifurcation point.
Knowledge of the bifurcation points is necessary for a good understanding of
the system. Our objective in this section is to give a short view of the easiest
types of bifurcations and method to ﬁnd the bifurcation values.
Let f be the period map of a periodically forced process. Now assume that
the period map depends on a bifurcation parameter λ. Thus, we consider the
dynamical system
x
k+1
= f(x
k
; λ) (7.17)
where f : R
n
R → R
n
, starting from an initial condition x
0
. A periodic
state of the process is a ﬁxed point of the period map (f(x
∗
) = x
∗
). In
7.3 Bifurcation theory and continuation techniques 177
fact, the periodic state depends on the value of the bifurcation parameter
(x
∗
= x
∗
(λ)). We are interested whether and how the periodic state changes,
when we vary the bifurcation parameter slightly. To understand the local
behavior of the system in more detail, we have to consider the Jacobian of the
period map, also called the monodromy matrix. The monodromy matrix M
describes the evolution of a small perturbation over one period. The stability
of periodic solutions is determined by the Floquet multipliers, the eigenvalues
of the monodromy matrix, [30].
A periodic solution is stable when the absolute values of all the (possibly
complex) eigenvalues of M are smaller than unity. This implies that a neigh
borhood exists of the periodic state x
∗
in which all trajectories converge to
the periodic state as time goes to inﬁnity.
PSfrag replacements
µ
1
µ
1
µ
1
µ
2
1 −1
θ
0
Figure 7.5: Diﬀerent bifurcation scenarios of a periodically forced system.
When changing the bifurcation parameter an eigenvalue might cross the
unit circle and the dynamics of the system can change completely. Landing
at a bifurcation point the periodic state becomes unstable or disappears. The
angle at which an eigenvalue µ crosses the unit circle, determines the type of
bifurcation. In the following example we consider three diﬀerent scenario’s,
see Section 8.2 for more details.
Example 7.1. If the eigenvalue leaves the unit circle at µ = 1 the number of
periodic solutions of the system changes. In general this will be by two. The
bifurcation point is called a limit point or a saddlenode. Let the ﬂowreversal
time t
f
be the bifurcation parameter and ﬁxed all other physical parameters
of the system. For moderate values of t
f
a stable periodic state exists at high
temperature. However the longer we ﬂow from one direction the more energy
is purged out of the reactor during one ﬂowreversal period. There exists a
minimum value for t
f
for which the extinguished state is still the only possible
periodic state. This value for t
f
corresponds to the bifurcation point.
If the eigenvalue leaves the unit circle at µ = −1 the period of the solution
is doubled. Let f be the map corresponding to half a period of the reverse ﬂow
178 Numerical approach for solving periodically forced processes
reactor, deﬁned by (8.3). If x
∗
is a stable ﬁxed point of f, the limiting state
of the reactor is a symmetric period state, see Figure 8.5(a). If we alternate
the bifurcation parameter such that the largest eigenvalue of the monodromy
matrix leaves the unit circle at µ = −1, the ﬁxed point x
∗
becomes unstable,
and the limiting state corresponds to a new point ˜ x
∗
that satisﬁes
f
2
(˜ x
∗
) = ˜ x
∗
= f(˜ x
∗
).
Since f
2
equals the period map of a whole cycle of the reverse ﬂow reactor,
˜ x
∗
is a periodic state of the process. However, it has becomes asymmetric, see
Figure 8.5(b).
If a pair of eigenvalues leaves the unit circle at µ
1
= e
iθ
0
and µ
2
= e
−iθ
0
,
where 0 < θ
0
< π, the limiting state of the reactor becomes quasiperiodic,
which implies the state follows two frequencies, see 8.5(c). Note that a com
plex eigenvalue always has a conjugate partner. This bifurcation is called a
NeimarkSacker bifurcation, and corresponds to a transition from a single to
a twofrequency motion.
Continuation techniques
Clearly, we are interested in the dependence of the limiting periodic state of the
periodically forced process on certain bifurcation parameters. In this section
we describe the basics of continuation techniques to analyze the dynamical
system
x
k+1
= F(x
k
, α), α ∈ R,
where the period map F : R
n+1
→R
n
depends upon one bifurcation parameter
α.
Fixed points of the period map, also called equilibrium points, satisfy the
equation
F(x, α) = x. (7.18)
If we denote a point in R
n+1
by y = (x, α) and deﬁne G : R
n+1
→ R
n
by
G(y) = F(x, α) −x, Equation (7.18) leads to
G(y) = 0. (7.19)
By the implicit function Theorem the system of (7.19) locally deﬁnes a smooth
onedimensional curve C in R
n+1
passing through a point y
0
, that satisﬁes
(7.19), provided that
rank J
G
(y
0
) = n. (7.20)
7.3 Bifurcation theory and continuation techniques 179
Here J
G
(y
0
) denotes the Jacobian of G at y
0
. Every point on the curve C that
satisﬁes (7.20) is called regular.
During continuation, points on this curve (y
0
, y
1
, y
2
, . . .) are approximated
with a desired accuracy. The ﬁrst two points of the sequence are computed
by ﬁxing the bifurcation parameter and applying iterative methods to solve
(7.18). For the subsequent points, most of the continuation algorithms used
in bifurcation analysis implement predictorcorrector methods that include
three basic steps, prediction, correction and step size adjustment. The next
continuation point is predicted by adding a step to the previous point, that is
based on previously computed points of the branch and an appropriate step
length. Next, the prediction is corrected bordered with a step length condition.
Finally the step size is adapted.
We describe some of the basic choices for the prediction and the correction
step, and strategies to validate the new computed point on the bifurcation
branch in order to choose the new step size.
Prediction
Suppose that a regular point y
k
in the sequence approximating the curve C
has been found. Then, the initial guess ˜ y of the next point in the sequence is
made using the prediction formula
˜ y = y
k
+ ∆s
k
v
k
, (7.21)
where ∆s
k
is the current step size, and v
k
∈ R
n+1
is a vector of unit length
(v
k
 = 1).
A possible choice for v
k
is the tangent vector to the curve in y
k
. To obtain
the tangent vector we parametrize the curve near y
k
, by the arclength s with
y(0) = y
k
. If we substitute the parametrization into (7.19) and take derivative
with respect to s, we obtain
J
G
(y
k
)v
k
= 0, (7.22)
since v
k
= dy/ds(0). System (7.22) has a unique solution (v
k
has unit length)
because rank J
G
(y
k
) = n by the assumption of regularity.
Another popular prediction method is the secant prediction. It requires
two previous points on the curve, y
k−1
and y
k
. The prediction is given by
(7.21), where now
v
k
=
y
k−1
−y
k
y
k−1
−y
k

. (7.23)
The advantage of this method is that the computation of the Jacobian J
G
and
of the solution of a large system of equations is avoided.
180 Numerical approach for solving periodically forced processes
A third, simple, method is changing the bifurcation parameter only and
using the last point in the sequence y
k
as initial guess of the next point on the
bifurcation branch. The direction of the step is therefore given by
v
k
= e
n+1
.
where e
n+1
is the last unitvector in R
n+1
. A minor of this method is that
the branch can only be detected in increasing direction of the bifurcation
parameter.
Correction
Having predicted a point ˜ y presumably close to the curve, one needs to locate
the next point y
k+1
on the curve to within a speciﬁed accuracy. This correction
is usually performed by some Newtonlike iterations. However, the standard
Newton iterations have to be applied to a system in which the number of
equations is equal to that of the unknowns. So, in order to apply Newton’s
method or a quasiNewton method, a scalar condition
h
k
(y) = 0
has to be appended to the system (7.18), where h
k
: R
n+1
→ R is called the
control function. We redeﬁne function G : R
n+1
→R
n+1
by
G(y) =
F(x, α) −x
h
k
(y)
.
Solving
G(y) = 0 (7.24)
geometrically means that one looks for an intersection of the curve C with some
surface near ˜ y. It is natural to assume that the prediction point ˜ y belongs to
this surface as well (that is, h
k
(˜ y) = 0). There are several ways to specify the
function h
k
(y).
The simplest way is to take a hyperplane passing through the point ˜ y that
is orthogonal to the coordinate axis of the bifurcation parameter, namely, set
h
k
(y) = α − ˜ α.
This approach is called natural continuation. Instead of the coordinate axis
of the bifurcation parameter often the axis is taken that corresponds to the
index of the component of v
k
with the maximum absolute value. Because the
element of y
k
with this index is locally the most rapidly changing along C.
7.3 Bifurcation theory and continuation techniques 181
PSfrag replacements
y
0
y
1
˜ y ˜ y
y
v v
Figure 7.6: Prediction and correction step.
Another possibility, called pseudoarclength continuation, is to select the
hyperplane passing through the point ˜ y that is orthogonal to the vector v
k
.
This hyperplane is deﬁned by
0 = 'y − ˜ y, v
k
`.
Therefore we set
h
k
(y) = 'y − ˜ y, v
k
`
= 'y −(y
k
+ ∆s
k
v
k
), v
k
`
= 'y −y
k
, v
k
` −∆s
k
. (7.25)
If the curve is regular (rank J
G
(y) = n for all y ∈ C) and the step size ∆s
k
is suﬃciently small, one can prove that the Newton iterations for (7.24) will
converge to a point on the curve C from the predicted point ˜ y of the tangent
prediction or the secant prediction, [38].
For the third possibility not a hyperplane is taken but a sphere around
the previous computed point y
k
in the sequence. That is, the distant between
the approximation of the next point on the curve and y
k
is ﬁxed at ∆s
k
. The
control function is therefore deﬁned as
h
k
(y) = y −y
k
 −∆s
k
.
Clearly, the predicted point ˜ y lies on the sphere. The main disadvantages of
this approach is that the control function is not linear and, especially in the
neighborhood of bifurcation point, the continuation might go in the wrong
direction, since the curve has at least two intersection points with the sphere.
Note that the matrix J
G
(y
k
) needed in (7.22) can be extracted from the
last iteration of the Newton process solving (7.24).
182 Numerical approach for solving periodically forced processes
Step size adjustment
There are many sophisticated algorithms to control the step size ∆s
k
. The
simplest convergence dependent control, however, has proved to be reliable
and easily implementable. That is, if no convergence occurs after a prescribed
number of iterations in the correction step, we decrease the step size and return
to the prediction step. If the last point is successfully computed, we accept
it as a new point of the sequence and multiply the step length with a given,
constant factor greater than one. If the convergence succeeds but it uses many
iterations we accept the new point of the sequence but decrease the step size
∆s
k
.
To summarize this section we give the continuation algorithm that we have
applied in our simulations.
Algorithm 7.2 (Continuation scheme). Let y
k
= (x
k
, α
k
) and y
k−1
=
(x
k−1
, α
k−1
) be the last successfully computed points in the sequence approx
imating the branch. Fix the real parameters a and b (a > 1 and 0 < b < 1)
and the number i
max
. The next point, y
k+1
= (x
k+1
, α
k+1
), in the sequence is
determined by:
Secant prediction: Set ˜ y = y
k
+ ∆s
k
v
k
, where ∆s
k
is the current step
size and v
k
is deﬁned by (7.23).
pseudoarclength continuation: Solve (7.24) where G(y) = (F(x, α) −
x, h
k
(y)) and h
k
(y) is deﬁned by (7.25)
stepsize control: If the correction step fails, then multiply ∆s
k
by b and
return to the prediction step. If the correction step succeeds using less
than i
max
/2 iterations of an iterative method, then accept the new point
in the sequence and set ∆s
k+1
= a∆s
k
. If the correction step succeeds
but using more than i
max
/2 iterations, then accept the new point in the
sequence and set ∆s
k+1
= b∆s
k
.
Chapter 8
Eﬃcient simulation of
periodically forced reactors in
2D
The ﬁnal chapter of this thesis is devoted to the connection between the it
erative method for solving highdimensional systems of nonlinear equations
and the eﬃcient simulation of a twodimensional model for the reverse ﬂow
reactor, where the radial direction is taken into account.
8.1 The reverse ﬂow reactor
We start to recall the description of the reverse ﬂow reactor from the intro
duction. The reverse ﬂow reactor (RFR) is a catalytic packedbed reactor in
which the ﬂow direction is periodically reversed to trap a hot zone within the
reactor. Upon entering the reactor, the cold feed gas is heated up regenera
tively by the hot bed so that a reaction can occur. The reaction is assumed
to be exothermic. At the other end of the reactor the hot product gas is
cooled by the colder catalyst particles. The beginning and end of the reactor
thus eﬀectively work as heat exchangers. The cold feed gas purges the high
temperature (reaction) front in downstream direction. Before the hot reaction
zone exits the reactor, the feed ﬂow direction is reversed. The ﬂowreversal
period, denoted by t
f
, is usually constant and predeﬁned. One complete cycle
of the RFR consists of two ﬂowreverse periods. Overheating of the catalyst
and hot spot formation are avoided by a limited degree of cooling, at the wall
at constant temperature. This can be done by using a huge amount of cool
ing water, that ﬂows at a high rate along the outside of the reactor wall. A
183
184 Chapter 8. Eﬃcient simulation of periodically forced reactors in 2D
schematic diagram of the reactor is shown in Figure 8.1.
PSfrag replacements
T
c
T
c
catalyst
cooling jacket
gas ﬂow
Figure 8.1: Schematic drawing of the cooled reverse ﬂow reactor.
Starting with an initial state, the reactor goes through a long transient
phase before converging to a periodic limiting state, also called the cyclic
steady state (CSS). Limiting states of periodically forced packed bed reactors
are of interest to the industry because the reactor operates in this situation
most of the time.
The basic model for a ﬁxed bed catalytic reactor, such as the RFR, is
the socalled pseudohomogeneous onedimensional model. This model does
not diﬀerentiate between the ﬂuid and the solid phase and considers gradients
in the axial direction only. Eigenberger and Nieken [19] have investigated a
simpliﬁed onedimensional model. Due to a very short residence time of the
gas in the reactor, they assume the continuity equation and the mass balance
equation to be in quasi steady state when compared to the energy balance
equation. They apply standard dynamical simulation to compute the limiting
periodic states of the reverse ﬂow reactor. Due to their choice of the model
and the values of the parameters all periodic states discovered are symmetric,
that is, the state after one ﬂow reversal period is the mirror image of the initial
state.
Rehacek, Kubicek and Marek [57, 58] have extended the model of the RFR
to a twophase model with transfer of mass and energy between the ﬂuid and
solid phase. They consider the period map, that is, the map which assigns
the new state after one period of the process to an initial state. To obtain a
numerical expression of the period map, the authors discretize the partial dif
ferential equations of the model in space and integrate the resulting system of
ordinary diﬀerential equations over one period. Again with dynamical simula
8.2 The behavior of the reverse ﬂow reactor 185
tion, that is, iterating the period map, symmetric stable periodic states of the
RFR are obtained. In addition, they observe asymmetric and quasiperiodic
behavior.
Khinast, Luss et al. [34, 32, 33, 35] have developed an eﬃcient method to
compute bifurcation diagrams of periodic processes. Their approach is based
on previous work of Gupta and Bhatia [26] in which the system of partial
diﬀerential equations is considered as a boundary value problem in time. The
boundary condition implies that the initial state of the reactor equals the state
at the end of the cycle and, therefore, has to be a ﬁxed point of the period map,
as explained in more detail in Section 7.3. The method of Broyden is used
in combination with continuation techniques to ﬁnd the parameter dependent
ﬁxed points of the period map.
For steady state processes that have coeﬃcients and boundary conditions
invariant in time, twodimensional models are standard practice, see [56].
When modeling a steady state process, a time invariant state can often be
expressed as the solution of a system of ordinary diﬀerential equations, where
time derivatives are absent. For the theoretical analysis of limiting states of
steady state processes, a great number of eﬃcient mathematical and numer
ical tools is available. In periodically forced systems, such as the RFR, the
limiting solution varies in time. To our knowledge, full twodimensional mod
els for the RFR have never been solved using a direct iterative method, such
as the method of Broyden. The reason being that an accurate simulation re
quires a ﬁne grid which yields a high dimensional discretized system. Due to
large computational costs, both regarding CPUtime and regarding memory
usage, twodimensional models of periodically forced systems have so far been
avoided, at the expense of relevance and accuracy.
The radial transport of heat and matter, however, is very important in
nonisothermal packed bed reactors [72]. Highly exothermic reaction, a large
width of the reactor, and eﬃcient cooling of the reactor at the wall cause
radial temperature gradients to be present, see Figure 8.2(b). Clearly, for
cooled reverse ﬂow reactors the radial dimension must explicitly be taken into
account.
8.2 The behavior of the reverse ﬂow reactor
From the initial state to the CSS
As initial condition for the reverse ﬂow reactor we take a preheated reactor
ﬁlled with an inert gas. In the computations we consider the dimensionless
temperature θ = (T −T
0
)/T
0
and the conversion x = (c
0
−c)/c
0
, where T
0
is
186 Chapter 8. Eﬃcient simulation of periodically forced reactors in 2D
PSfrag replacements
ax. distance
rad. distance
temperature
c
o
n
v
e
r
s
i
o
n
(a) conversion
PSfrag replacements
ax. distance
rad. distance
t
e
m
p
e
r
a
t
u
r
e
conversion
(b) temperature
Figure 8.2: Qualitative temperature and conversion distribution of the cooled reverse
ﬂow reactor in the cyclic steady state according to the twodimensional model (6.26)
(6.28) with the parameter values of Table 6.2.
the temperature and c
0
is the concentration of the feed gas. The dimensionless
initial condition is set to
θ ≡ 1 and x ≡ 1.
We start the process and let the gas ﬂow entering the reactor at the left end.
The feed gas contains a trace of the reactant A and is at low temperature.
When entering the hot reactor, the cold feed is heated up and comes into
contact with the catalyst. Therefore, the reaction occurs and the concentration
of species A decreases. Because the reaction is assumed to be exothermic, the
temperature increases and a reaction front is created. On the other hand, the
catalyst at the left side of the reactor is cooled due to the low temperature
of the feed gas. At the left side of the reaction front the temperature is too
low to activate the reaction. At the other side of the reaction front all of the
reactants has reacted and the conversion is completed.
In Figure 8.3 the state of the reactor is given a diﬀerent times. The re
action front can be easily distinguished. Because the reactor is cooled the
temperature decreases at the right side of the reaction front.
After a period of time t
f
the feeding at the left end of the reactor is stopped
and the ﬂow direction is reversed by feeding from the right end of the reactor.
Directly after this ﬂow reversal, the hot reaction zone withdraws from the
right end and moves in left direction. The concentration A still present in the
left part is purged out of the reactor and after a short intermediate phase the
8.2 The behavior of the reverse ﬂow reactor 187
0 0.2 0.4 0.6 0.8 1
0
0.2
0.4
0.6
0.8
1
c
o
n
PSfrag replacements
axial distance
temperature
conversion
0 0.2 0.4 0.6 0.8 1
0
0.2
0.4
0.6
0.8
1
PSfrag replacements
axial distance
t
e
m
p
e
r
a
t
u
r
e
conversion
Figure 8.3: Snap shots of the ﬁrst reverse ﬂow period of the reactor.
conversion of species A in the product gas is again equal to one. The product
gas during this intermediate phase is often considered as waste gas. Note that
the reaction front now occurs at the right side of the hot zone, see Figure 8.4.
0 0.2 0.4 0.6 0.8 1
0
0.2
0.4
0.6
0.8
1
c
o
n
PSfrag replacements
axial distance
temperature
conversion
0 0.2 0.4 0.6 0.8 1
0
0.2
0.4
0.6
0.8
1
PSfrag replacements
axial distance
t
e
m
p
e
r
a
t
u
r
e
conversion
Figure 8.4: Snapt shots of the second reverse ﬂow period of the reactor.
By reversing the ﬂow direction after a ﬁxed period t
f
over and over again,
the hot reaction zone is catch in the reactor. It depends on the conditions of the
process what will be the state of the reactor after many cycles. Clearly, when
the cooling capacity is too high the state extinguishes, because the reaction
188 Chapter 8. Eﬃcient simulation of periodically forced reactors in 2D
cannot be sustained at low temperatures. If the reverse ﬂow period is too
long the reaction front exits the reactor. We describe the limiting state of the
reactor for diﬀerent values of the cooling capacity and a moderate reverse ﬂow
period.
Periodical, asymmetric and quasiperiodic states
We use dynamical simulation to determine the limiting state of the reactor
for diﬀerent values of the dimensionless cooling capacity. Adiabatic operation
leads to periodic, symmetric states at which the temperature (and concen
tration) proﬁles at the beginning and end of a ﬂowreversal period are mir
ror images. We call these states symmetric period1 operation. Laboratory
and pilotplant RFRs usually cannot be operated in an adiabatic mode [43].
Moreover, in some applications involving equilibriumlimited reactions cooling
is applied to avoid exceeding some critical temperatures at which either un
desired reactions or catalyst deactivation may occur. Various modes of RFR
cooling were described by Matros and Bunimovich [44]. Reactor cooling may
introduce some complex and rich dynamic features, which do not exist in its
absence. For example, under relatively fast ﬂowreversal frequencies the sym
metric states of a cooled RFR may become unstable and either asymmetric or
quasiperiodic states may be obtained. Quasiperiodic behavior of the reactor
means that in addition to the ﬂowreversal period, the forcing frequency, a
second period determines the overall behavior. Examples of the dimension
less temperature proﬁles, for these three types of states are shown in Figure
8.5. The diﬀerences in the dynamic features are caused by changes in the
dimensionless cooling capacity Φ, as deﬁned in Table 6.2.
0 0.2 0.4 0.6 0.8 1
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
PSfrag replacements
axial distance
t
e
m
p
e
r
a
t
u
r
e
(a) Φ = 0.332
0 0.2 0.4 0.6 0.8 1
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
PSfrag replacements
axial distance
t
e
m
p
e
r
a
t
u
r
e
(b) Φ = 0.324
0 0.2 0.4 0.6 0.8 1
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
PSfrag replacements
axial distance
t
e
m
p
e
r
a
t
u
r
e
(c) Φ = 0.3
Figure 8.5: The limiting state of the reverse ﬂow reactor at the switch of the ﬂow
direction
8.2 The behavior of the reverse ﬂow reactor 189
To illustrate the development of quasiperiodic behavior we consider the
maximum temperature of the reactor at the end of every ﬂowreversal period,
see Figure 8.6. After a transient phase of about 50 ﬂowreversal periods the
0 50 100 150 200 250 300 350 400 450 500
0.7
0.75
0.8
0.85
0.9
0.95
1
1.05
PSfrag replacements
θ
m
a
x
number of ﬂowreversals
420 430 440 450 460 470 480 490 500
0.75
0.8
0.85
0.9
PSfrag replacements
θ
m
a
x
number of ﬂowreversals
Figure 8.6: The maximal temperature of the reactor at the switch of the ﬂow direction,
for the ﬁrst 500 ﬂowreversal periods, and the same picture starting after 420 ﬂow
reversal periods (Φ = 0.3).
reactor reaches a quasiperiodic regime. The second frequency of the quasi
periodic behavior of the reactor equals 45 ﬂowreversal periods.
We construct a corresponding Poincar´e map by considering ∆θ
ave
(n) versus
θ
ave
(n), where we deﬁne
θ
ave
(n) =
1
0
θ(z, nt
f
)dz, n = 0, 1, 2, . . . , (8.1)
and
∆θ
ave
(n) = 2
1/2
0
θ(z, nt
f
)dz −
1
1/2
θ(z, nt
f
)dz
, n = 0, 1, 2, . . . . (8.2)
190 Chapter 8. Eﬃcient simulation of periodically forced reactors in 2D
The value θ
ave
(n) is the average reactor temperature after the nth ﬂow reversal
and ∆θ
ave
(n) is the corresponding averaged diﬀerence between the tempera
tures in the right and left half of the reactor. Clearly, the sign of ∆θ
ave
(n)
changes upon alternating ﬂow reversal. For symmetric period1 states, the
Poincar´e map consists of two points, both for the same θ
ave
(n) value. For
asymmetric period1 states, the Poincar´e map has two points, but not for the
same θ
ave
(n) values. In Figure 8.7 we have plotted the Poincar´e map corre
sponding to the quasiperiodic behavior of Figure 8.6.
consist of a set of points forming two closed curves, thus indicating quasi
period behavior. Each curve corresponds to one ﬂow direction.
0.41 0.415 0.42 0.425
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
PSfrag replacements
θ
ave
∆
θ
a
v
e
Figure 8.7: The Poincar´e map of ∆θ
ave
(2k) versus θ
ave
(2k), representing the quasi
periodic behavior of the reverseﬂow reactor after the transient phase (2k ≥ 50).
8.3 Dynamic features of the full twodimensional model 191
8.3 Dynamic features of the full twodimensional
model
Before doing simulations with the twodimensional model (6.26)(6.28), we
simplify the problem in the following way. From the mathematical point of
view, it makes no diﬀerence if the ﬂow direction in the reactor is reversed or
if the reactor is reversed itself while the ﬂuid ﬂows from the same direction.
Therefore we do not compute the state of the RFR after a whole cycle, but we
integrate the system over one ﬂowreversal period (t
f
) and then reverse the
reactor in the axial direction. So, instead of f(x
0
) = x(z, (t
c
u)/L), the period
map is given by
f(x
0
) = x((L −z)/L, (t
f
u)/L), (8.3)
where L is length of the reactor and u is the superﬁcial velocity. The state
of the reactor after a whole cycle is then obtained by applying the map f
twice to the initial condition. A ﬁxed point of f corresponds to a symmetric
periodic state of the reactor. If asymmetric periodic states exist, we can ﬁnd
them by computing ﬁxed points of the original period map. The only way to
determine whether the limiting state of the reactor is quasiperiodic is by using
dynamical simulation. In this section we restrict ourselves to the computation
of symmetric periodic states.
We consider aspects of limiting periodic states of the RFR for diﬀerent
values of the dimensionless reactor radius, denoted by R/L. The results are
expressed in the dimensionless temperature (T − T
0
)/T
0
and the conversion
(c
0
−c)/c
0
. We have ﬁxed the ﬂowreversal period (t
f
= 1200s). As a bifurca
tion parameter we use the dimensionless cooling capacity, deﬁned by,
Φ =
2LU
w
Ru(ρc
p
)
g
.
To obtain the results of this section the BRR method is used with p = 30.
The bifurcation diagrams, describing the dependence of the symmetric periodic
state of the reactor on the dimensionless cooling capacity, are constructed
using a standard continuation technique in combination with the BRR method.
Eigenvalues of the Jacobian J
f
are determined using the subspace method with
locking [60].
We describe two diﬀerent cases of the limiting periodic state for a ﬁxed
value of the cooling capacity (Φ = 0.2). If the reactor is rather slim (for exam
ple, R/L = 0.0025), we observe that the temperature is constant over every
cross section of the reactor, see Figure 8.8(b). In this way we can validate
the twodimensional model. Indeed according to the theory of Section 6.2, if
192 Chapter 8. Eﬃcient simulation of periodically forced reactors in 2D
radial gradients are absent, the weighted average of the twodimensional tem
perature proﬁle equals the temperature proﬁle of the onedimensional model.
This has been conﬁrmed by simulations of the onedimensional model. The
same observation is valid for the conversion, see Figure 8.8(a).
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
0
0.2
0.4
0.6
0.8
1
1.2
PSfrag replacements
temperature
c
o
n
v
e
r
s
i
o
n
axial distance
(a) Conversion
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
0
0.2
0.4
0.6
0.8
1
1.2
PSfrag replacements
t
e
m
p
e
r
a
t
u
r
e
conversion
axial distance
(b) Temperature
Figure 8.8: Axial temperature and conversion proﬁles of the RFR (in CSS) at the
beginning of a reverse ﬂow according to the twodimensional model (6.26)(6.28) with
the parameter values of Table 6.2. The cooling capacity Φ is ﬁxed at 0.2 and the
radius of the reactor equals R/L = 0.0025.
We use the same value for the cooling capacity (Φ = 0.2), but now with a
larger reactor width (R/L = 0.025). This implies that the cooling now prop
agates less easily through the reactor and steep temperature gradients in the
radial direction arise. In Figure 8.9(b) we have represented the distribution of
the temperature over the catalyst bed in the cyclic steady state. For several
positions in the radial direction, the temperature proﬁle along the reactor is
plotted. The lines with the highest temperatures correspond to radial posi
tions near the axis of the reactor. The lines with the lowest temperatures
correspond to radial positions near the wall of the reactor. Clearly, the cool
ing is especially inﬂuencing the temperature of the catalyst near the wall of
the reactor. Note that for diﬀerent radial positions the axial position of the
maximum temperature is shifted. This results in a lower maximum of the
weighted average of the temperature. In Figure 8.9(a) the conversion of the
same cyclic steady state is given. The lines with the highest conversion cor
respond to radial positions near the axis of the reactor. The lines with the
lowest conversion correspond to radial positions near the wall of the reactor.
Note that only around the axis the conversion is complete at the end of the
8.3 Dynamic features of the full twodimensional model 193
reactor. Therefore the product gas consists of a mixture of both products as
reactants and on an average the conversion is not complete.
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
0
0.2
0.4
0.6
0.8
1
1.2
PSfrag replacements
temperature
c
o
n
v
e
r
s
i
o
n
axial distance
(a) Conversion
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
0
0.2
0.4
0.6
0.8
1
1.2
PSfrag replacements
t
e
m
p
e
r
a
t
u
r
e
conversion
axial distance
(b) Temperature
Figure 8.9: Axial temperature and conversion proﬁles of the RFR (in CSS) at the
beginning of a reverse ﬂow period according to the twodimensional model (6.26)
(6.28) with the parameter values of Table 6.2. The cooling capacity Φ is ﬁxed at 0.2
and the radius of the reactor equals R/L = 0.025. In addition, the weighted average
(6.18) is given (’◦’).
Two bifurcation branches are shown in Figure 8.10. The weighted average
(6.18) of the temperature is computed over every cross section. The maxi
mum of these values is plotted versus the dimensionless cooling capacity Φ for
diﬀerent values of R/L. It can be shown that, for every value of the cooling ca
pacity, a stable extinguished state exists. For the slim reactor (R/L = 0.0025)
the maximum average temperature is always higher than for the wide reactor
(R/L = 0.025) at the same cooling capacity. This can be explained by the fact
that for the wide reactor, for diﬀerent radial positions, the maximum of the
temperature is not found at the same axial position in the reactor. Note that
for the slim reactor there exists a minimum in the upper branch (at Φ ≈ 0.3).
The reason is that the two high temperature zones, cf. Figure 8.8(b), merge
into one. For cooling capacities higher than Φ ≈ 0.67, the reactor cannot op
erate at high temperature and dies out. The part of the branch with negative
cooling capacity has of course no physical meaning. The bifurcation branch
for the wide reactor has more or less the same characteristics. However, the
minimum has disappeared and the upper branch has become monotonically
decreasing.
To determine the stability of the points on the bifurcation branches, we
194 Chapter 8. Eﬃcient simulation of periodically forced reactors in 2D
0
0.2
0.4
0.6
0.8
1
1.2
−0.2 −0.1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8
−1.4
−1.2
−1
−0.8
0.8
1
1.2
1.4
1.6
1.8
PSfrag replacements
dimensionless cooling capacity Φ
θ
m
a
x
µ
m
a
x
Figure 8.10: The maximum dimensionless temperature (θ
max
) and the largest Floquet
multiplier (µ
max
) versus the cooling capacity (Φ) for two diﬀerent values of the reactor
radius. The twodimensional model (6.26)(6.28) was used with the parameter values
of Table 6.2. [’∗’ (R/L = 0.0025), ’◦’ (R = 0.025)]
have also plotted the largest Floquet multiplier (µ
max
) in Figure 8.10. Start
ing with Φ = 0 at the upper branch of the bifurcation diagram the largest
eigenvalue of the Jacobian at the ﬁxed points is slightly less than +1, im
plying that the ﬁxed points are stable. At Φ ≈ 0.15 a negative eigenvalue
the largest eigenvalue in modulus and crosses the unit circle at µ = −1 for
Φ ≈ 0.19, causing a symmetry loss bifurcation, that is, the symmetric state
become unstable and a stable asymmetric period1 state emerges. For cooling
capacities higher than Φ ≈ 0.32 (Φ ≈ 0.48), the largest eigenvalue returns to
the unit circle but remains close to −1. Then the symmetric state is stable
but it takes the reactor a large number of cycles to converge to this limiting
state. Finally, at the limit point, for which Φ ≈ 0.67 (Φ ≈ 0.65), a positive
eigenvalue crosses the unit circle at µ = +1. So, for higher cooling capacities
the cooling eventually causes extinction of the reactor. The ﬁxed points of the
lower branches for both the wide and the slim reactor are unstable.
Notes and comments
Section 1.1
For more information on ﬁnite arithmetic see [18] and [20].
A clear and detailed introduction in quasiNewton methods to solve non
linear equations and optimization problems is given by Dennis and Schnabel
[18].
The proof of Theorem 1.2 is given in [18] where it is Theorem 2.4.3.
Lemma 1.3 is Lemma 2.4.2 of [18] and Lemma 1.4 is Corollary 2.6.2 of [18].
Theorem 1.5 is given without proof in [18] where it is Theorem 2.6.3.
Section 1.2
In [8] Broyden uses the mean convergence rate R given by
R =
1
m
log(g(x
0
)/g(x
m−1
))
as the measure of eﬃciency of a method for solving a particular problem,
where m is the total number of function evaluations. In this thesis we divide
the logarithm by k
∗
instead of m, which makes R inﬁnity if k
∗
= 0.
Theorem 1.10 is a simpliﬁcation of Theorem 5.2.1 of [18], where it is as
sumed that J
g
∈ Lip
γ
(N(x
∗
, r)) with N(x
∗
, r) ⊂ T, for some r.
Lemma 1.11 is a simpliﬁcation of Lemma 4.1.12 of [18], where J
g
is assumed
to be Lipschitz continuous at x only.
Theorem 1.12 is called the Banach perturbation theorem. Theorem 3.1.4
of [18] is a more general version of the perturbation theorem, where . can be
any norm on R
n×n
that satisﬁes AB ≤ A B, A, B ∈ R
n×n
and I = 1.
The theorem is also given [55].
Theorem 1.15 is Theorem 5.4.1 of [18].
195
196 Notes and comments
Section 1.3
Lemma 1.20 is a special case of Lemma 8.1.1 of [18]. If the l
2
operator norm
is used in (1.32) instead of the Frobenius norm, multiple solutions for A exist,
some clearly less desirable than Broyden’s update.
Lemma 1.23 is a combination of Lemma’s 4.1.15 and 4.1.16 of [18]. The
lemma is also given in [11] where it is Lemma 3.1.
Theorem 1.24 is a special case of Theorem 3.2 of [11], where instead of the
Frobenius norm a weighted matrix norm, denoted by .
M
. This very general
theorem of Broyden, Dennis and Mor´e was developed to extend the analysis
given by Dennis for Broyden’s method [15], to other secant methods. The
theorem is in some sense considered as unsatisfying, because the initial Broy
den matrix must be close to the Jacobian. For the limited memory Broyden
methods, where we choose B
0
= −I in general, this is assumption is not satis
ﬁed. However, all other convergence proofs of the quasiNewton methods for
nonlinear equations are built on this result.
Corollary 1.25 is Corollary 3.3 of [11].
Theorem 1.26 is a particular case of Theorem 4.3 of [11]. The proof of the
theorem is simpliﬁed by using results of [18] and [16].
Lemma 1.27 is Lemma 8.2.3 of [18].
Lemma 1.28 is Lemma 2.2 of [16].
Lemma 1.29 is Lemma 8.2.5 of [18].
Theorem 1.30 can be found in e.g. [28].
For practical implementation, the method of Broyden has to be used in
combination with global algorithms. Well known approaches are for example
line search and the modeltrust region approach, see Sections 6.3 and 6.4
of [18]. To obtain a more robust method, Broyden himself chose the ﬁnite
diﬀerence approximation of the Jacobian for the initial estimate B
0
and applied
a backtracking strategy for the line search, see [8].
An overview of many of the important theoretical results of secant methods
is given in e.g. [17, 42].
In 1970 Broyden [7] has proved that his method converges Rsuperlinearly
on linear problems and in 1971 he has proved that the method converges
locally and at least linearly on nonlinear problems [9].
In 2000 Broyden has written a short note on the discovery of the ’good
Broyden’ method [10].
Notes and comments 197
Section 2.1
The generalized Broyden’s method, Algorithm 2.1, is proposed by Gerber and
Luk in [23]. The algorithm was also published by Gay in [22], where in case
of y
k
= 0 the new inverse Broyden matrix H
k+1
was set to H
k
. In Chapter 2
we only consider aﬃne functions g(x) = Ax + b, where the matrix A ∈ R
n×n
is nonsingular. Therefore, y
k
= As
k
and y
k
= 0 if and only if s
k
= 0.
Lemma 2.2 is a particular case of results derived in [23].
A full proof of Lemma 2.3 can be found in [22], where it is Lemma 2.1.
Theorem 2.4 is Theorem 2.2 of [22].
Lemma 2.7 is a slightly adjusted version of Lemma 3.1 of [54], that is
derived from Lemma 2.3 of [22].
In [22] Gay proved under which conditions Algorithm 2.1 requires a full
2n steps to convergence.
As a result of the 2nstep exact convergence for linear systems, Gay proved
in [22] that the method of Broyden is 2nstep quadratically convergent for
nonlinear functions.
Section 2.2
Lemma 2.8 is Lemma 3.1 of [23], Lemma 2.9 is Lemma 3.2 of [23] and Lemma
2.10 is Lemma 3.3 of [23].
Theorems 2.11 and 2.12 are Theorems 3.1 and 3.2 of [23]. Note that
the condition (2.19) is unsatisfactory, since it has to be checked during the
process. We would like to sharpen Theorem 2.12 in the following way. If
dimZ
k+1
= dimZ
k
− 1 and v
T
k
w
k
= 0 then dimZ
k+2
= dimZ
k+1
− 1, a
nonzero vector w
k+1
∈ Z
k+1
∩ Ker(I − AH
k+1
) exists, that equals w
k
and
satisﬁes v
T
k+1
w
k+1
= 0. This would imply that if w
0
= 0 and v
T
0
w
0
= 0 the
method of Broyden needs d
0
iterations to converge. Simulations conﬁrmed
this conjecture, see Example 2.16.
Lemma 2.13 is Lemma 3.4 of [23].
Section 2.3
According Lemma 2.18 we consider in the examples of Chapters 2 and 4 aﬃne
functions g(x) = Ax, where A is in Jordan normal form, see [64] for more
details.
198 Notes and comments
Section 3.1
In this thesis we only consider limited memory methods that are based on the
method of Broyden and are applicable for nonlinear functions with general
nonsingular Jacobian. In 1970 Schubert [62] has proposed a secant method
to solve nonlinear functions, where the Jacobian is sparse and the locations
of the nonzero elements are known. In addition to the secant equation, he
imposes the updated Broyden matrix to have the same sparsity structure as
the Jacobian. In 1971 Broyden [9] has investigated the properties of this
modiﬁed algorithm, both theoretical and experimental. Toint has extended
this approach to quasiNewton algorithms for optimization problems, cf. [65,
66, 67].
The NewtonPicard method is ﬁrst proposed by Lust et al. [41]. The
algorithm applies the method of Newton on a small pdimensional subspace
and dynamical simulation, Picard iteration, on the orthogonal subspace. The
small subspace is formed by the eigenvectors corresponding to the largest
eigenvalues in modulus of the Jacobian J
g
at the current iterate x
k
. The p
eigenvectors can be computed using subspace iteration, see [60], avoiding the
usage of a large (n n)matrix in the algorithm.
A relatively new ﬁeld of research is the NewtonKrylov method, cf. [6], that
is based on solving the Newton iteration step without computing the Jacobian
of the function explicitly. To approximate the Newton step, subspace iteration
is used. Derived from this idea, TensorKrylov methods provide even faster
algorithms, see [5].
Limited memory quasiNewton methods for optimization problems have
been studied by e.g. Kolda, O’Leary and Nazareth [37], Liu and Nocedal [40],
Morales and Nocedal [45] and Nocedal [48].
Section 3.2
A good overview of singular values can be found in e.g. [27, 25]
The rank reduction applied in Algorithm 3.11 with q = p − 1 can also be
considered as an additional rankone update. Let v
p
be the right singular vec
tor corresponding to the pth singular value of the update matrix Q in iteration
p + 1. The new update matrix
¯
Q satisﬁes
¯
Qv
p
= 0, and in all other directions
¯
Q has the same action as Q. This implies for the intermediate Broyden matrix
¯
B that
¯
Bv
p
= B
0
v
p
,
¯
Bu = Bu for u ⊥ v
p
,
Notes and comments 199
and therefore
¯
B = B + (B
0
v
p
−Bv
p
)
v
T
p
v
T
p
v
p
= B −Qv
p
v
T
p
,
which is a rank oneupdate of B.
The condition (3.17) on the reduction matrix R is already suggested in
[11].
Section 3.4
The idea of Section 3.4 comes from an article by Byrd, Nocedal and Schnabel
[12], in which they derived short representation for diﬀerent quasiNewton
methods.
Lemma 3.20 is Lemma 2.1 of [12], Theorem 3.21 is Theorem 6.1 of [12] and
Theorem 3.22 is Theorem 6.2 of [12].
The scaling in Algorithm 3.23 is proposed by Richard Byrd.
In a limited context using the notation of Section 3.4, the multiple secant
version of Broyden’s update, see (1.25) and (1.26), is given by
B
k
= B
0
+ (Y
k
−B
0
S
k
)(S
T
k
S
k
)
−1
S
T
k
. (8.4)
This update is well deﬁned as long as S
k
has full column rank, and obeys the
k secant equations B
k
S
k
= Y
k
.
Comparing (8.4) to the formula in (3.27) for k consecutive, standard Broy
den updates, we see that in the multiple secant approach we use S
T
k
S
k
, while
in (3.27) it is the upper triangular portion of this matrix, including the main
diagonal. Therefore, the two update are the same if the directions in S
k
are
orthogonal. The preference between these two formulas does not appear to be
clear cut. The formula (3.27) has the advantage that it is well deﬁned for any
S
k
, while (8.4) is only well deﬁned numerically if the k step directions that
make up S
k
are suﬃciently linearly independent. If they are not, only some
subset of them can be utilized in a numerical implementation of the multiple
Broyden method. This is the approach that has often been taken in implemen
tations of this update. On the other hand, (8.4) always enforces the k prior
secant equations while (3.27) only enforces the most recent equation. Thus
it would probably be worthwhile considering either method (or their inverse
formulations) in a limited memory method for solving nonlinear equations.
Section 6.2
A comprehensive overview of chemical reactors and modeling techniques is
written by e.g. Scott Fogler [63] and Froment and Bischoﬀ [21].
200 Notes and comments
A clear introduction is given by Aris [3].
Section 7.2
An introduction to dynamical systems can be found in [4].
Section 7.1
Basics of discretization techniques are given in [59].
Section 7.3
For locating a bifurcation branch it is enough to approximate the points on
the branch op to an error of about 10
−2
during the continuation scheme. In
the neighborhood of bifurcation points the points on the branch might have
to be determined more accurately.
Van Noorden et al. [51, 52] compared several convergence acceleration
techniques (such as the method of Newton, the method of Broyden and the
NewtonPicard method) in combination with continuation techniques. From
their work it turns out that Broyden’s method is the most eﬃcient for solving
large systems of nonlinear equations in terms of function evaluations.
An advanced adapted Broyden method that uses information of the con
tinuation process to update the Broyden matrix is developed by Van Noorden
et al. [50].
Studies in continuation techniques and bifurcation analysis can be found
in work by e.g. Allgower, Chien and Georg [1] and Allgower and Georg [2].
Section 8.2
An extended investigation of the dynamical behavior of the reverse ﬂow reactor
is given by Khinast et al. [33].
Recent studies of the reverseﬂow reactor can be found in work by Gl¨ockler,
Kolios and Eigenberger [24] and Jeong and Luss [31].
Bibliography
[1] E.L. Allgower, C.S. Chien, and K. Georg. Large sparse continuation
problems. J. Comput. Appl. Math., 26:3–21, 1989.
[2] E.L. Allgower and K. Georg. Numerical continuation methods, vol
ume 13 of Springer Series in Computational Mathematics. Springer
Verlag, Berlin, 1990. An introduction.
[3] R. Aris. Mathematical modelling techniques. Dover Publications Inc.,
New York, 1994. Corrected and expanded reprint of the 1978 original.
[4] D.K. Arrowsmith and C.M. Place. An introduction to dynamical systems.
Cambridge University Press, Cambridge, 1990.
[5] A. Bouaricha. TensorKrylov methods for large nonlinear equations. Com
put. Optim. Appl., 5:207–232, 1996.
[6] P.N. Brown and Y. Saad. Convergence theory of nonlinear NewtonKrylov
algorithms. SIAM J. Optim., 4:297–330, 1994.
[7] C. G. Broyden. The convergence of singlerank quasiNewton methods.
Math. Comp., 24:365–382, 1970.
[8] C.G. Broyden. A class of methods for solving nonlinear simultaneous
equations. Math. Comp., 19:577–593, 1965.
[9] C.G. Broyden. The convergence of an algorithm for solving sparse non
linear systems. Math. Comp., 25:285–294, 1971.
[10] C.G. Broyden. On the discovery of the ’good Broyden’ method. Math.
Program., B 87:209–213, 2000.
[11] C.G. Broyden, J.E. Dennis, Jr., and J.J. Mor´e. On the local and su
perlinear convergence of quasiNewton methods. J. Inst. Math. Appl.,
12:223–245, 1973.
201
202 Bibliography
[12] R.H. Byrd, J. Nocedal, and R.B. Schnabel. Representations of quasi
Newton matrices and their use in limited memory methods. Math. Pro
gram., 63:129–156, 1994.
[13] B.T. Carvill, J.R. Hufton, M. Anand, and S. Sircar. Sorption enhanced
reaction process. AIChE J., 42(10):2765–2772, 1996.
[14] M.M. Davis and M.D. Levan. Experiments on optimization of thermal
swing adsorption. Ind. Eng. Chem. Res., 28:778–785, 1989.
[15] J.E. Dennis, Jr. On the convergence of Broyden’s method for nonlinear
systems of equations. Math. Comp., 25:559–567, 1971.
[16] J.E. Dennis, Jr. and J.J. Mor´e. A characterization of superlinear con
vergence and its application to quasiNewton methods. Math. Comp.,
28:549–560, 1974.
[17] J.E. Dennis, Jr. and J.J. Mor´e. QuasiNewton methods, motivation and
theory. SIAM Rev., 19:46–89, 1977.
[18] J.E. Dennis, Jr. and R.B. Schnabel. Numerical methods for unconstrained
optimization and nonlinear equations, volume 16 of Classics in applied
mathematics. Society for Industrial and Applied Mathematics (SIAM),
Philadelphia, PA, 1996. Corrected reprint of the 1983 original.
[19] G. Eigenberger and U. Nieken. Catalytic combustion with periodicﬂow
reversal. Chem. Eng. Sci., 43:2109–2115, 1988.
[20] K. Eriksson, D. Estep, P. Hansbo, and C. Johnson. Computational dif
ferential equations. Cambridge University Press, Cambridge, 1996.
[21] G.F. Froment and K.B. Bischoﬀ. Chemical reactor analysis and design.
John Wiley & Sons Ltd., New York, 1990.
[22] D.M. Gay. Some convergence properties of Broyden’s method. SIAM J.
Numer. Anal., 16:623–630, 1979.
[23] R.R. Gerber and F.T. Luk. A generalized Broyden’s method for solving
simultaneous linear equations. SIAM J. Numer. Anal., 18:882–890, 1981.
[24] B Gl¨ockler, G. Kolios, and G. Eigenberger. Analysis of a novel reverse
ﬂow reactor concept for autothermal methane steam reforming. Chem.
Eng. Sci., 58:593–601, 2003.
Bibliography 203
[25] G.H. Golub and C.F. Van Loan. Matrix computations. Johns Hopkins
Studies in the Mathematical Sciences. Johns Hopkins University Press,
third edition, 1996.
[26] V.K. Gupta and S.K. Bhatia. Solution of cyclic proﬁles in catalytic reactor
operation with periodicﬂow reversal. Comput. Chem. Eng., 15:229–237,
1991.
[27] R.A. Horn and C.R. Johnson. Matrix analysis. Cambridge University
Press, Cambridge, 1990. Corrected reprint of the 1985 original.
[28] A.S. Householder. Principles of numerical analysis, pages 135–138. Mc
GrawHill, New York, 1953.
[29] J.R. Hufton, S. Mayorga, and S. Sircar. Sorptionenhanced reaction pro
cess for hydrogen production. AIChE J., 45:248–256, 1999.
[30] G. Iooss and D.D. Joseph. Elementary stability and bifurcation theory.
Undergraduate texts in mathematics. SpringerVerlag, New York, second
edition, 1990.
[31] Y.O. Jeong and D. Luss. Pollutant destruction in a reverseﬂow chro
matographic reactor. Chem. Eng. Sci., 58:1095–1102, 2003.
[32] J.G. Khinast, A. Gurumoorthy, and D. Luss. Complex dynamic features
of a cooled reverseﬂow reactor. AIChE J., 44:1128–1140, 1998.
[33] J.G. Khinast, Y.O. Jeong, and D. Luss. Dependence of cooled reverseﬂow
reactor dynamics on reactor model. AIChE J., 45:299–309, 1999.
[34] J.G. Khinast and D. Luss. Mapping regions with diﬀerent bifurcation
diagrams of a reverseﬂow reactor. AIChE J., 43:2034–2047, 1997.
[35] J.G. Khinast and D. Luss. Eﬃcient bifurcation analysis of periodically
forced distributed parameter systems. Comput. Chem. Eng., 24:139–152,
2000.
[36] A.J. Kodde and A. Bliek. Selectivity enhancement in consecutive reac
tions using the pressure swing reactor. Stud. Surf. Sci. Catal., 109:419–
428, 1997.
[37] T.G. Kolda, D.P. O’Leary, and L. Nazareth. BFGS with update skipping
and varying memory. SIAM Journal on Optimization, 8:1060–1083, 1998.
204 Bibliography
[38] Y.A. Kuznetsov. Elements of applied bifurcation theory, volume 112 of
Applied Mathematical Sciences. SpringerVerlag, New York, second edi
tion, 1998.
[39] H.M. Kvamsdal and T. Hertzberg. Optimization of pressure swing ad
sorption systems the eﬀect of mass transfer during the blowdown step.
Chem. Eng. Sci., 50:1203–1212, 1995.
[40] D.C. Liu and J. Nocedal. On the limited memory BFGS method for large
scale optimization. Math. Program., 45:503–528, 1989.
[41] K. Lust, D. Roose, A. Spence, and A.R. Champneys. An adaptive
NewtonPicard algorithm with subspace iteration for computing periodic
solutions. SIAM J. Sci. Comput., 19:1188–1209, 1998.
[42] J.M. Mart´ınez. Practical quasiNewton methods for solving nonlinear
systems. J. Comput. Appl. Math., 124:97–121, 2000.
[43] Yu.Sh. Matros. Catalytic processes under unsteady state conditions. El
sevier, Amsterdam, 1989.
[44] Yu.Sh. Matros and G.A. Bunimovich. Reverseﬂow operation in ﬁxed bed
catalytic reactors. Catal. Rev., 38:1–68, 1996.
[45] J.L. Morales and J. Nocedal. Automatic preconditioning by limited mem
ory quasiNewton updating. SIAM J. Optim., 10:1079–1096, 2000.
[46] J.J. Mor´e and M.Y. Cosnard. Numerical solution of nonlinear equations.
ACM Trans. Math. Soft., 5:64–85, 1979.
[47] J.J. Mor´e, B.S. Garbow, and K.E. Hillstrom. Testing unconstrained op
timization software. ACM Trans. Math. Soft., 7:17–41, 1981.
[48] J. Nocedal. Updating quasiNewton matrices with limited storage. Math
ematics of Computation, 35:773–782, 1980.
[49] T.L. van Noorden. New algorithms for parameterswing reactors. PhD
thesis, Vrije Universiteit, Amsterdam, 2002.
[50] T.L. van Noorden, S.M. Verduyn Lunel, and A. Bliek. A Broyden rank
p +1 update continuation method with subspace iteration. To appear in
SIAM J. Sci. Comput.
Bibliography 205
[51] T.L. van Noorden, S.M. Verduyn Lunel, and A. Bliek. Acceleration of
the determination of periodic states of cyclically operated reactors and
separators. Chem. Eng. Sci., 57:1041–1055, 2002.
[52] T.L. van Noorden, S.M. Verduyn Lunel, and A. Bliek. The eﬃcient com
putation of periodic states of cyclically operated chemical processes. IMA
J. Appl. Math., 68:149–166, 2003.
[53] Numerical Algorithms Group (NAG). The NAG Fortran library manual,
Mark 20, 2003. Available from http://www.nag.co.uk/.
[54] D.P. O’Leary. Why Broyden’s nonsymmetric method terminates on linear
equations. SIAM J. Optim., 5:231–235, 1995.
[55] J.M. Ortega and W.C. Rheinboldt. Iterative solution of nonlinear equa
tions in several variables, volume 30 of Classics in applied mathematics.
Society for Industrial and Applied Mathematics (SIAM), Philadelphia,
PA, 2000. Reprint of the 1970 original.
[56] R.M. Quinta Ferreira and C.A. AlmeidaCosta. Heterogeneous models
of tubular reactors packed with ionexchange resins: Simulation of the
MTBE synthesis. Ind. Eng. Chem. Res., 35:3827–3841, 1996.
[57] J. Rehacek, M. Kubicek, and M. Marek. Modeling of a tubular catalytic
reactor with ﬂow reversal. Chem. Eng. Sci., 47:2897–2902, 1992.
[58] J. Rehacek, M. Kubicek, and M. Marek. Periodic, quasiperiodic and
chaotic spatiotemporal patterns in a tubular catalytic reactor with peri
odic ﬂow reversal. Comput. Chem. Eng., 22:283–297, 1998.
[59] R.D. Richtmyer and K.W. Morton. Diﬀerence methods for initialvalue
problems, volume 4 of Interscience tracts in pure and applied mathematics.
John Wiley & Sons Ltd., New YorkLondonSydney, second edition, 1967.
[60] Y. Saad. Numerical methods for large eigenvalue problems, algorithms and
architectures for advanched scientiﬁc computing. Manchester university
press, Manchester, 1992.
[61] Y. Saad and M.H. Schultz. GMRES: a generalized minimal residual al
gorithm for solving nonsymmetric linear systems. SIAM J. Sci. Statist.
Comput., 7:856–869, 1986.
[62] L.K. Schubert. Modiﬁcation of a quasiNewton method for nonlinear
equations with a sparse Jacobian. Math. Comput., 24:27–30, 1970.
206 Bibliography
[63] H. Scott Fogler. Elements of chemical reaction engineering. Prentice Hall
PTR, third edition, 1999.
[64] J. Stoer and R. Bulirsch. Introduction to numerical analysis, volume 12 of
Texts in Applied Mathematics. SpringerVerlag, New York, third edition,
2002. Translated from the German by R. Bartels, W. Gautschi and C.
Witzgall.
[65] Ph.L. Toint. On sparse and symmetric matrix updating subject to a linear
equation. Math. Comput., 31:954–961, 1977.
[66] Ph.L. Toint. On the superlinear convergence of an algorithm for solving
a sparse minimization. SIAM J. Numer. Anal., 16, 1979.
[67] Ph.L. Toint. A sparse quasiNewton update derived variationally with
a nondiagonally weighted Frobenius norm. Math. Comput., 37:425–433,
1981.
[68] B.A. van de Rotten and S.M. Verduyn Lunel. A limited memory Broy
den method to solve highdimensional systems of nonlinear equations.
Technical Report 200306, Universiteit Leiden, 2003.
[69] B.A. van de Rotten, S.M. Verduyn Lunel, and A. Bliek. Eﬃcient sim
ulation of periodically forced reactor in 2d. Technical Report 200313,
Universiteit Leiden, 2003.
[70] H.A. van der Vorst. BiCGSTAB: a fast and smoothly converging variant
of BiCG for the solution of nonsymmetric linear systems. SIAM J. Sci.
Statist. Comput., 13:631–644, 1992.
[71] H.A. van der Vorst and G.L.G. Sleijpen. Iterative BiCG type methods
and implementation aspects. In Algorithms for large scale linear algebraic
systems (Gran Canaria, 1996), volume 508 of NATO Adv. Sci. Inst. Ser.
C Math. Phys. Sci., pages 217–253. Kluwer Acad. Publ., Dordrecht, 1998.
[72] K.R. Westerterp, W.P.M. van Swaaij, and A.A.C.M. Beenackers. Chemi
cal reactor design and operation. John Wiley & Sons Ltd., second edition,
1988.
Appendix A
Test functions
This appendix is devoted to a discussion of the test functions used to test
the diﬀerent limited memory Broyden methods of Chapter 3. Because the
methods of Newton and Broyden are not globally converging and the area of
convergence can be small, we have chosen some speciﬁc test functions, taken
from the CUTE collection, cf. [18, 47].
Discrete boundary value function
The twopoint boundary value problem
u
(t) =
1
2
(u(t) +t + 1)
3
, 0 < t < 1, u(0) = u(1) = 0. (A.1)
can be discretized by considering the equation at the points t = t
i
, i = 1, . . . , n.
We apply the standard O(h
2
) discretization and denote h = 1/(n + 1) and
t
i
= i h, i = 1, . . . , n. The resulting system of equations is given by
g(x) = 0,
where
g
i
(x) = 2x
i
−x
i−1
−x
i+1
+
h
2
2
(x
i
+t
i
+ 1)
3
, i = 1, . . . , n, (A.2)
for x = (x
1
, . . . , x
n
) and x
i
= u(t
i
), i = 1, . . . , n. The Jacobian of this function
has a band structure with on both subdiagonals the value −1. The elements
on the diagonal of the Jacobian are given by
∂g
i
∂x
i
= 2 +
2h
2
2
(x
i
+ih + 1)
2
, i = 1, . . . , n.
207
208 Appendix A. Test functions
As initial condition we deﬁne the vector x
0
by
x
0
= (t
1
(t
1
−1), . . . , t
n
(t
n
−1)). (A.3)
The socalled discrete boundary value function was ﬁrst used by Mor´e and
Cosnard to test the methods of Brent and of Brown [46]. In Figure A.1 we
have plotted the initial condition x
0
and the zero x
∗
of the function g.
0 0.2 0.4 0.6 0.8 1
−0.25
−0.2
−0.15
−0.1
−0.05
0
Figure A.1: The initial condition x
0
(dotted line) and the zero x
∗
(solid line) of the
function g given by (A.2).
Discrete integral equation function
In the same article Mor´e and Cosnard also considered the discrete integral
equation function [46]. If we integrate the boundary value problem (A.1) two
times and apply the boundary conditions, we obtain the nonlinear integral
equation
u(t) +
1
2
1
0
H(s, t)(u(s) +s + 1)
3
ds = 0, (A.4)
where
H(s, t) =
s(1 −t), s < t,
t(1 −s), s ≥ t.
209
To discretize Equation (A.4), we replace the integral by an npoint rectangular
rule based on the points t = t
i
, i = 1, . . . , n. If we denote h = 1/(n + 1) and
t
i
= i h, i = 1, . . . , n, the resulting system of equations reads
g(x) = 0,
where g(x) is given by
g
i
(x) = x
i
+
h
2
(1−t
i
)
i
¸
j=1
t
j
(x
j
+t
j
+1)
3
+t
i
n
¸
j=i+1
(1−t
j
)(x
j
+t
j
+1)
3
, (A.5)
for i = 1, . . . , n. Note that the Jacobian of the function g has a dense structure.
As in case of the discrete boundary value function, we start with the initial
vector x
0
, given by
x
0
= (t
1
(t
1
−1), . . . , t
n
(t
n
−1)). (A.6)
Extended Rosenbrock function
The extended Rosenbrock function g : R
n
→R
n
is deﬁned for even n by
g
2i−1
(x) = 10(x
2i
−x
2
2i−1
),
g
2i
(x) = 1 −x
2i−1
,
i = 1, . . . , n/2. (A.7)
This implies that the equation
g(x) = 0
equals n/2 copies of a system in the twodimensional space.
Note that the Jacobian of the extended Rosenbrock function is a block
diagonal matrix. The (2 2)matrices on the diagonal are given by
−20x
2i−1
10
−1 0
.
The unique zero of (A.7) is given by x
∗
= (1, . . . , 1), so that the Jacobian of g
is nonsingular at x
∗
and has singular values approximately 22.3786 and 0.4469
with multiplicity n/2.
As initial vector x
0
for the iterative methods we choose
(−1.2, 1, . . . , −1.2, 1). (A.8)
210 Appendix A. Test functions
Extended Powell singular function
The extended Powell singular function contains n/4 copies of the same function
in the fourdimensional space. Let n be a multiple of 4 and deﬁne the function
g : R
n
→R
n
by
g
4i−3
(x) = x
4i−3
+ 10x
4i−2
,
g
4i−2
(x) =
√
5(x
4i−1
−x
4i
),
g
4i−1
(x) = (x
4i−2
−2x
4i−1
)
2
,
g
4i
(x) =
√
10(x
4i−3
−x
4i
)
2
,
i = 1, . . . , n/4. (A.9)
The unique zero of (A.9) is x
∗
= 0. So, the Jacobian is a block diagonal matrix
with blocks
¸
¸
1 10 0 0
0 0
√
5 −
√
5
0 2(x
4i−2
− 2x
4i−1
) −4(x
4i−2
− 2x
4i−1
) 0
2
√
10(x
4i−3
− x
4i
) 0 0 −2
√
10(x
4i−3
− x
4i
)
¸
and singular at the zero x
∗
.
The initial point x
0
is given by
(3, −1, 0, 1, . . . , 3, −1, 0, 1). (A.10)
Appendix B
Matlab code of the limited
memory Broyden methods
We have implemented the codes of the iterative methods described in Chapters
1 and 3 in the computer languages Fortran and Matlab. The codes in Fortran
were used in order to apply the integration routines and matrix manipulation
routines of the Fortran NAGlibrary [53], as well as to compute the solutions
of large dimensional systems of equations (n ≥ 1000). The codes in Matlab
were used in order to obtain more insight in the Broyden matrices and the
update matrices, to manufacture plots of the Broyden matrices, the singular
values of the update matrices and the rate of convergence, as well as to present
the codes in a convenient manner.
The method of Broyden
We omit the codes of Newton’s method, Algorithm 1.7, the NewtonChord
method, Algorithm 1.16, and the Discrete Newton method, Algorithm 1.13,
and start with the plain method of Broyden, Algorithm 1.19, that forms the
basis of the codes of all the limited memory Broyden methods to come.
function [ x ] = . . .
broyden ( gcn , x , B, n , imax , i eps , i f a i l )
%%% I n i t i a l i s a t i o n %%%
g = f eval ( gcn , x , n ) ; i t e = 0;
ne ( i t e +1) = sqrt ( g ’ ∗ g ) ;
%%% Broyden i t e r a t i o n %%%
while ( ne ( i t e +1) > i e ps ) ,
%%% Broyden s t ep %%%
211
212 Appendix B. Matlab code of the limited memory Broyden methods
s = −B\g ; ns = s ’ ∗ s ;
x = x + s ;
y = f eval ( gcn , x , n) − g ; g = y + g ;
i t e = i t e + 1;
ne ( i t e +1) = sqrt ( g ’ ∗ g ) ;
%%% Matri x updat e %%%
B = B + ( y − B∗ s )∗ s ’ / ns ;
end;
We are not only interested in the zero of the function g but also in the
convergence properties of the method. Therefore, we include extra output
parameters of the subroutine, such as the number of iterations ’ite’ and the
residue at every iteration step ’ne’. In addition the algorithm can stuck at
several points. The reason of failure of the subroutine we return in the variable
’ifail’. The local matrices and vectors are declared at the beginning of the
subroutine. The extended code for the method of Broyden reads
function [ x , i t e , ne , i f a i l ] = . . .
broyden ( gcn , x , B, n , imax , i eps , meps , i f a i l )
i f ( i f a i l ˜=0  imax == 0) i f a i l = 1; return , end;
disp( ’ # ∗∗∗ The method of Broyden ∗∗∗ ’ ) ;
%%% Pr e al l oc at i on %%%
g = zeros ( n , 1 ) ; s = zeros ( n , 1 ) ; y = zeros ( n , 1 ) ;
%%% I n i t i a l i s a t i o n %%%
g = f eval ( gcn , x , n ) ; i t e = 0;
ne ( i t e +1) = sqrt ( g ’ ∗ g ) ;
%%% Broyden i t e r a t i o n %%%
while ( ne ( i t e +1) > i e ps ) ,
i f ( ne ( i t e +1) > meps ( 2 ) ) i f a i l = 4; break , end;
i f ( i t e >= imax ) i f a i l = 2; break , end;
%%% Broyden s t ep %%%
i f ( rcond(B) < meps ( 1 ) ) i f a i l = 5; break , end;
s = −B\g ; ns = s ’ ∗ s ;
i f ( ns <= 0) i f a i l = 3; break , end;
x = x + s ;
y = f eval ( gcn , x , n) − g ; g = y + g ;
i t e = i t e + 1;
ne ( i t e +1) = sqrt ( g ’ ∗ g ) ;
%%% Matri x updat e %%%
B = B + ( y − B∗ s )∗ s ’ / ns ;
end;
If the residual becomes larger than a predeﬁned value ’meps(2)’, the process
is not expected to converge. Therefore the computation is stopped to avoid
213
overﬂow. The condition number of the Broyden matrix B is computed by
rcond(B) = B B
−1
. The Broyden matrix is considered as approximately
singular if the condition number is smaller than the machine precision, stored
in ’meps(1)’.
The general limited memory Broyden method
We indicated in Chapter 3 that the structure of all limited memory Broyden
methods are similar, except for Algorithms 3.15 and 3.23. The basis of the
limited memory Broyden methods as described in Algorithm 2.1 is given by
the following routine
function [ x , i t e , ne , i f a i l ] = . . .
lmb( gcn , x , C, D, n , p , q , m, imax , i eps , meps , i f a i l )
i f ( i f a i l ˜=0  imax == 0) i f a i l = 1; return , end;
i f ( p < 1  p > n ) i f a i l = 1; return , end;
i f ( q < 0  q > p−1) i f a i l = 1; return , end;
i f (m < 0  m > p ) i f a i l = 1; return , end;
disp( ’ # ∗∗∗ The l i mi t e d memory Broyden method ∗∗∗ ’ ) ;
%%% Pr e al l oc at i on %%%
g = zeros ( n , 1 ) ; s = zeros ( n , 1 ) ; y = zeros ( n , 1 ) ;
B2 = zeros ( p , p ) ;
%%% I n i t i a l i s a t i o n %%%
g = f eval ( gcn , x , n ) ; i t e = 0;
ne ( i t e +1) = sqrt ( g ’ ∗ g ) ;
%%% Broyden i t e r a t i o n %%%
while ( ne ( i t e +1) > i e ps ) ,
i f ( ne ( i t e +1) > meps ( 2 ) ) i f a i l = 4; break , end;
i f ( i t e >= imax ) i f a i l = 2; break , end;
%%% Broyden s t ep %%%
B2 = eye( p)−D’ ∗C;
i f ( rcond(B2) < meps ( 1 ) ) i f a i l = 5; break , end;
s = C∗(B2 \ (D’ ∗ g ) ) + g ; ns = sqrt ( s ’ ∗ s ) ;
i f ( ns <= 0) i f a i l = 3; break , end;
x = x + s ;
y = f eval ( gcn , x , n) − g ; g = y + g ;
i t e = i t e + 1;
ne ( i t e +1) = sqrt ( g ’ ∗ g ) ;
i f (m == p)
%%% Recomposi ti on %%%
%%% Reduct i on %%%
m = q ;
end;
214 Appendix B. Matlab code of the limited memory Broyden methods
%%% Matri x updat e %%%
m = m + 1;
C( : ,m) = ( y + s − C( : , 1 : m−1)∗D( : , 1 : m−1) ’∗ s )/ ns ;
D( : ,m) = s /ns ;
end;
The only part that has to be ﬁlled in is how the decomposition CD
T
of
the update matrix is rewritten and which columns of the matrices C and D
are removed.
In the main program the subroutine ’lmb’ is for example called in the
following way.
function program
%%% Pr e al l oc at i on and i n i t i a l i s a t i o n %%%
i e ps = 1. 0E−12;
meps = [ 1 . 0E−16; 1. 0 E20 ] ;
imax = 200;
n = 100;
p = 5;
C = zeros ( n , p ) ; D = zeros ( n , p ) ;
m = 0;
x0 = ones ( n , 1 ) ;
i f a i l = 0;
q = p−1;
[ x , i t e , ne , i f a i l ] = . . .
lmb( ’ gcn ’ , x , C, D, n , p , q , m, imax , i eps , meps , i f a i l ) ;
Removing columns in normal format
The simplest way to create free columns in the (np)matrices C and D is just
by setting p−q columns equal to zero. To satisfy the conditions superposed on
the limited memory Broyden methods, see Section 3.1, the nonzero columns
are stored in the ﬁrst q columns of the matrices.
For example, we can remove the newest p − q updates of the Broyden
process by setting the last p −q columns of C and D to zero.
%%% Reduct i on %%%
C( : , q+1:p) = zeros ( n , p−q ) ;
D( : , q+1:p) = zeros ( n , p−q ) ;
The oldest p − q updates of the Broyden process are removed by storing
the last q columns of C and D in the ﬁrst q columns and again setting the last
p −q columns of the new matrices C and D equal to zero.
%%% Reduct i on %%%
C( : , 1 : q) = C( : , p−q+1:p ) ; C( : , q+1:p) = zeros ( n , p−q ) ;
D( : , 1 : q) = D( : , p−q+1:p ) ; D( : , q+1:p) = zeros ( n , p−q ) ;
215
Removing columns in SVDformat
For the Broyden Rank Reduction method three additional (p p)matrices
have to be declared.
%%% Pr e al l oc at i on %%%
R = zeros ( p , p ) ; S = zeros ( p , p ) ; W = zeros ( p , p ) ;
Before the reduction is applied, the matrices C and D are written as
the singular value decomposition of the update matrix. For this the QR
decomposition is computed of the matrix D and thereafter the SVDdecomposition
of C.
%%% Recomposi ti on %%%
%%%QR−decomposi t i on , R%%%
[ D, R] = qr (D, 0 ) ; C = C∗R’ ;
%%% SVD−decomposi t i on , W%%%
[ C, S ,W] = svd(C, 0 ) ; C = C∗S ; D = D∗W;
The smallest p − q singular values of the update matrix are removed by
setting the last p −q columns of C and D equal to zero.
%%% Reduct i on %%%
C( : , q+1:p) = zeros ( n , p−q ) ;
D( : , q+1:p) = zeros ( n , p−q ) ;
In order to remove the largest p −q singular values of the update matrix,
the last q columns of C and D are copied to the ﬁrst q columns of both matrices
and subsequently the new last p−q columns of C and D are set equal to zero.
%%% Reduct i on %%%
C( : , 1 : q) = C( : , p−q+1:p ) ; C( : , q+1:p) = zeros ( n , p−q ) ;
D( : , 1 : q) = D( : , p−q+1:p ) ; D( : , q+1:p) = zeros ( n , p−q ) ;
Note that these reduction procedures also have been applied in the normal
format.
Removing the ﬁrst columns in QLformat
The Broyden Base Reduction method is actually similar to removing the ﬁrst
columns of the matrices C and D in the normal format. Before we apply the
reduction the matrix D is ﬁrst orthogonalized using a QLdecomposition. So,
the (p p)matrix L has to be declared.
%%% Pr e al l oc at i on %%%
L = zeros ( p , p ) ;
The decomposition of the update matrix is rewritten in the following way.
%%% Recomposi ti on %%%
%%% QL−decomposi t i on , L%%%
[ D, L] = ql (D) ; C = C∗L ’ ;
The ﬁrst p −q columns of C and D are removed in the same way as done
in normal format.
216 Appendix B. Matlab code of the limited memory Broyden methods
%%% Reduct i on %%%
C( : , 1 : q) = C( : , p−q+1:p ) ; C( : , q+1:p) = zeros ( n , p−q ) ;
D( : , 1 : q) = D( : , p−q+1:p ) ; D( : , q+1:p) = zeros ( n , p−q ) ;
For the subroutine ’ql’ we used the QRdecomposition routine of Matlab.
Let ¦d
1
, . . . , d
p
¦ be the columns of D. If the QRdecomposition of [d
p
, . . . , d
1
]
is given by
d
p
d
1
=
˜
d
p
˜
d
1
¸
¸
r
11
r
1p
.
.
.
.
.
.
r
pp
¸
,
then we obtain
d
1
d
p
=
˜
d
1
˜
d
p
¸
¸
r
pp
.
.
.
.
.
.
r
1p
r
11
¸
=:
¯
DL.
So, the ’ql’subroutine reads
function [ Q, L] = ql (A) ;
%%% [Q, L] = QL(A) produces t he ” economy s i z e ” QL−decomposi t i on .
%%% I f A i s m−by−n wi t h m > n , t hen t he f i r s t n col umns of Q
%%% are computed . L i s a l ower t r i a ng ul a r mat ri x .
[ Q, L] = qr ( f l i pl r (A) , 0 ) ;
Q = f l i pl r (Q) ;
L = f l i pl r ( fl i pud (L) ) ;
Removing the last columns in QRformat
The Broyden Base Storing method computes the QRdecomposition of the
matrix D before it removes the last p − q columns of the matrices C and D.
Therefore, we declare the (p p)matrix R.
%%% Pr e al l oc at i on %%%
R = zeros ( p , p ) ;
Subsequently we rewrite the decomposition of the update matrix in the
following way.
%%% Recomposi ti on %%%
%%% QL−decomposi t i on , L%%%
[ D, R] = qr (D, 0 ) ; C = C∗R’ ;
The last p−q columns are removed in the same way as done in the normal
format.
%%% Reduct i on %%%
C( : , q+1:p) = zeros ( n , p−q ) ;
D( : , q+1:p) = zeros ( n , p−q ) ;
217
The inverse notation of Broyden’s method
For the inverse notation only the computation of the Broyden step s and the
update to the inverse Broyden matrix are diﬀerent from the standard ’lmb’
subroutine. So, the Broyden iterations reads
%%% Broyden i t e r a t i o n %%%
while ( ne ( i t e +1) > i e ps ) ,
i f ( ne ( i t e +1) > meps ( 2 ) ) i f a i l = 4; break , end;
i f ( i t e >= imax ) i f a i l = 2; break , end;
%%% Broyden s t ep %%%
s = g − C( : , 1 :m)∗D( : , 1 :m) ’ ∗ g ; ns = sqrt ( s ’ ∗ s ) ;
i f ( ns <= 0) i f a i l = 3; break , end;
x = x + s ;
y = f eval ( gcn , x , n) − g ; g = y + g ;
i t e = i t e + 1;
ne ( i t e +1) = sqrt ( g ’ ∗ g ) ;
i f (m == p)
%%% Recomposi ti on %%%
%%% Reduct i on %%%
m = q ;
end;
%%% Matri x updat e %%%
m = m + 1;
C( : ,m) = C( : , 1 : m−1)∗D( : , 1 : m−1) ’∗y − y ;
stHy = s ’ ∗C( : ,m) ;
D( : ,m) = D( : , 1 : m−1)∗C( : , 1 : m−1) ’∗ s − s ;
nHts = sqrt (D( : ,m) ’ ∗D( : ,m) ) ;
i f ( stHy == 0  nHts <= 0) i f a i l = 3; break , end;
C( : ,m) = ( s−C( : ,m) ) / stHy∗nHts ;
D( : ,m) = D( : ,m)/ nHts ;
end;
All reduction methods discussed above are applicable to the limited mem
ory inverse Broyden method.
The limited memory Broyden method proposed by Byrd et al.
This limited memory Broyden method has to be implemented in a quite diﬀer
ent setting. In fact, the update to the Broyden matrix is not clearly computed,
but inherent of the algorithm. For the sake of clarity, we declare the matrix
M2 instead of B2. On the other hand the vectors s and y are not used.
%%% Pr e al l oc at i on %%%
g = zeros ( n , 1 ) ;
M2 = zeros ( p , p ) ;
For simplicity we only consider the case where the oldest p −q updates to
the Broyden matrix are removed if m = p. The complete Broyden iteration
reads
218 Appendix B. Matlab code of the limited memory Broyden methods
%%% Broyden i t e r a t i o n %%%
while ( ne ( i t e +1) > i e ps ) ,
i f ( ne ( i t e +1) > meps ( 2 ) ) i f a i l = 4; break , end;
i f ( i t e >= imax ) i f a i l = 2; break , end;
%%%% Broyden s t ep and updat e %%%
i f (m == 0)
D( : , 1 ) = g ;
el se
M2 = zeros (m,m) ;
for i = 1:m
for j = 1: i −1
M2( i , j ) = −D( : , i ) ’ ∗D( : , j ) ;
end;
end;
M2( 1 :m, 1 :m) = M2( 1 :m, 1 :m)−D( : , 1 :m) ’ ∗C( : , 1 :m) ;
i f ( rcond(M2( 1 :m, 1 :m)) < meps ( 1 ) ) i f a i l = 5; break , end;
D( : ,m+1) = . . .
(C( : , 1 :m)+D( : , 1 :m) ) ∗(M2( 1 :m, 1 :m) \ (D( : , 1 :m) ’ ∗ g ) ) + g ;
end;
x = x + D( : ,m+1);
ns = sqrt (D( : ,m+1) ’∗D( : ,m+1));
i f ( ns <= 0) i f a i l = 3; break , end;
C( : ,m+1) = f eval ( gcn , x , n) − g ; g = C( : ,m+1) + g ;
i t e = i t e + 1; ng = sqrt ( g ’ ∗ g ) ; ne ( i t e +1) = ng ;
%%% Scal i ng %%%
D( : ,m+1) = D( : ,m+1)/ns ;
C( : ,m+1) = C( : ,m+1)/ns ;
m = m + 1;
i f (m == p) ,
%%%% Reduct i on %%%
C( : , 1 : q) = C( : , p−q+1:p ) ; C( : , q+1:p) = zeros ( n , p−q ) ;
D( : , 1 : q) = D( : , p−q+1:p ) ; D( : , q+1:p) = zeros ( n , p−q ) ;
m = q ;
end;
end;
The scaling is inserted to overcome a bad condition number for the matrix
M2. In contrast with the matrix B2 the matrix M2 is not invertible if m < p,
because we have declared M2 as a (p p)matrix. Therefore, we use the
leftupper (mm)part of the matrix M2.
Appendix C
Estimation of the model
parameters
In the simulations of the twodimensional model (6.26)(6.28), we take the
same parameter values as used by Khinast, Jeong and Luss (1999), see Table
C.1. To compute the eﬀective axial heat conductivity the following expression
is proposed
λ
ax
= (1 −ε)λ
s
+λ
g
+
u
2
(ρc
p
)
2
g
ha
v
.
(ρc
p
)
s
1382.0 kJ/m
3
K (ρc
p
)
g
0.6244 kJ/m
3
K η 1
k
∞
9.85 · 10
6
s
−1
a
v
1426.0 m
2
surf
/m
3
react
k
c
0.115 m/s
h 0.02kW/(m
2
K) L 4.0 m ε 0.38
T
c
= T
0
323 K ∆T
ad
50 K E
a
/R
gas
8328.6 K
D
ax
3 · 10
−5
m
2
/s u 1.0 m/s t
f
1200s
λ
s
0.0 kW/(mK) λ
g
2.6 · 10
−4
kW/(mK)
Table C.1: Parameter values for the reverse ﬂow reactor
In this appendix we derive appropriate values for the radial dispersion
D
rad
and for the radial heat conductivity λ
rad
using correlation formulas of
Westerterp, Swaaij and Beenackers [72]. The derived values of the radial
parameters are given in Table C.2. In our simulations we ﬁx the ﬂow reverse
time (t
f
= 1200s).
When a ﬂuid ﬂows through a packed bed of solid particles with low porosity,
the variations in the local velocity cause a dispersion in the direction of the
ﬂow. In not too short beds (i.e., L/d
p
> 10, where d
p
denotes the particle
size) this dispersion can be described by means of a longitudinal dispersion
219
220 Appendix C. Estimation of the model parameters
coeﬃcient, although in reality no back mixing occurs. The void spaces of a
packed bed can be considered as ideal mixers, and the number of voids is
roughly equal to
N ∼
L
d
p
.
Using the relation N = Pe
m
/2 = uL/(2εD
ax
), the following expression for the
axial dispersion in packed beds, denoted by the Bodenstein number, can be
derived
Bo
ax
=
ud
p
εD
ax
= 2.
To avoid large wall eﬀects, it is assumed that d
p
/2R < 0.1. It is known that
the radial Bodenstein number, Bo
rad
∼ ud
p
/(εD
rad
), approaches a value of 10
to 12 at Re > 100. This implies that the coeﬃcient of transverse dispersion
D
rad
is about six times smaller than D
ax
.
Heat can be transported perpendicular to the main ﬂow by the same mech
anism if a transverse temperature gradient exists, resulting in a convective heat
conductivity λ
rad
. Besides, heat transport occurs by thermal radiation between
the particles. The (isotropic) thermal conductivity of the bed is denoted by
λ
0
. The total radial thermal conductivity is then given by
λ
rad
= λ
0
+λ
rad
,
where λ
0
and λ
rad
act fairly independently. For the convective heat conduc
tivity the following correlation is given
λ
rad
=
(ρc
p
)
g
d
p
u
8[2 −(1 −d
p
/R)
2
]
.
Note that under stagnant conditions, we have that λ
rad
= 0 and that the
radial heat dispersion coeﬃcient equals the thermal conductivity. If the heat
diﬀusion through the solid particles can be neglected (that is, λ
s
= 0) the
following expression is valid for λ
0
,
λ
0
= 0.67 λ
g
ε,
in case of 0.26 < ε < 0.93 and T < 673K. Using the parameter values given in
Table C.1, we arrive at the following expression for the radial heat conductivity
λ
rad
= λ
0
+λ
rad
= 6.6 10
−5
+
0.6244 d
p
1.0
8[2 −(1 −d
p
/R)
2
]
= 6.6 10
−5
+ 7.81 10
−2
d
p
2 −(1 −d
p
/R)
2
.
221
If we choose the particle diameter to be d
p
= 1.010
−3
m and take R in the range
from 0.01m to 0.1m, then the radial heat conductivity varies from 1.3210
−4
to
1.4310
−4
kW/(mK). Therefore, we ﬁx the value at λ
rad
= 1.410
−4
kW/(mK).
d
p
1.0 · 10
−3
m D
rad
0.5 · 10
−5
m
2
/s λ
rad
1.4 · 10
−4
kW/(mK)
Table C.2: The values of the radial parameters for the twodimensional model of the
reverse ﬂow reactor
In the computations of the Chapter 5 and 8 we have used the dimen
sionless equations for the onedimensional model, (6.23)(6.25), and the two
dimensional model, (6.26)(6.28). The corresponding dimensionless parame
ters of Table 6.2 are computed using the values of Tables C.1 and C.2.
Symbols
In Section 6.2 we have used the following symbols.
Roman
a
v
speciﬁc external particle surface area, m
2
surf
/m
3
reactor
a
w
speciﬁc reactor wall surface area, m
2
wall
/m
3
reactor
c, C concentration, kmol/m
3
D dispersion coeﬃcient, m
2
/s
d
p
particle diameter, m
E
a
activation energy, kJ/kmol
h heattransfer coeﬃcient, kW/(m
2
K)
k
c
masstransfer coeﬃcient, m/s
k
∞
frequency factor for reaction, s
−1
L reactor length, m
r radial distance, m
R radius of the reactor, m
R
gas
universal gas constant, kJ/(kmol K)
u superﬁcial gas velocity, m/s
U
w
heattransfer coeﬃcient at reactor wall, kW/(m
2
K)
t time, s
t
f
ﬂow reverse time, s
T temperature, K
T
c
(T
0
) cooling(feed) temperature, K
z axial distance, m
222 Appendix C. Estimation of the model parameters
Greek
−∆H heat of reaction, kJ/kmol
−∆T
ad
adiabatic temperature rise, K
ε void fraction, [−]
η eﬀectiveness factor, [−]
λ
0
(isotropic) thermal conductivity, kW/(m
3
K)
λ
convective heat conductivity, kW/(m
3
K)
λ thermal conductivity, kW/(m
3
K)
(ρc
p
) volumetric heat capacity, kJ/(m
3
K)
Φ dimensionless cooling capacity, [−]
Dimensionless parameters
Bo Bodenstein number
Pe P´eclet number
Pr Prandl number
Re Reynolds number
Subscripts
ax axial direction
rad radial direction
g gas phase
s solid phase
Samenvatting
(Waarom Broyden?)
Wiskunde is ´e´en van de oudste wetenschappen in de wereld, maar speelt in
het huidige wetenschappelijke onderzoek nog altijd een prominente rol. Be
schouwen we de volgende voorbeelden.
• Ecologisch onderzoek naar vervuiling van een fabriek die zijn afval loost
in een baai met open verbinding naar zee.
• Het gedrag van veengrond onder invloed van dag en nacht.
• De toestand van kraakbeen in de pols onder herhaaldelijke belasting.
• Periodiek aangedreven processen in chemische reactoren.
Wat deze processen met elkaar gemeen hebben is dat ze wiskundig ge
zien eigenlijk hetzelfde zijn. Ze worden namelijk beschreven met behulp van
parti¨ele diﬀerentiaalvergelijkingen met tijdsafhankelijke parameters en rand
condities. We zijn vooral ge¨ınteresseerd in wat een systeem doet na verloop
van (een lange) tijd. Omdat de condities van het proces periodiek in de tijd
zijn, verwachten we hetzelfde voor de uiteindelijke toestand van het systeem.
In dat geval noemen we het systeem in periodiek stabiele toestand (’cyclic
steady state’).
We beschouwen een variabele, zeg x, bijvoorbeeld de concentratie van de
giftige stof in het water van een baai, of de temperatuur van de reactor. Deze
variabele hangt af van plaats en tijd. In eenvoudige gevallen zijn de parti¨ele
diﬀerentiaalvergelijkingen nog op te lossen. Maar wanneer meerdere variabe
len in het proces een rol spelen of wanneer het mechanisme moeilijker wordt,
is dit al niet meer te doen en moeten we op een andere manier een oplossing
vinden. De laatste decennia kunnen we de hulp inroepen van de computer.
Daarvoor dienen we de parti¨ele diﬀerentiaalvergelijkingen eerst aan te passen
223
224 Samenvatting
zodat de computer er ¨ uberhaupt iets mee kan doen. We discretiseren de ver
gelijkingen op een grid, dat wil zeggen, we delen de ruimte op in kleine blokjes
en nemen aan dat per blokje de variabele constant is. In plaats van de parti¨ele
diﬀerentiaalvergelijkingen hebben we nu een groot stelsel van n gewone diﬀe
rentiaalvergelijkingen, voor elk blokje in de ruimte een vergelijking die afhangt
van x. We deﬁni¨eren de periodeafbeelding f : R
n
→R
n
als de functie die de
toestand aan het begin van een periode overbrengt naar de toestand aan het
einde van de periode. Om de periodeafbeelding te verkrijgen, moeten we het
stelsel van gewone diﬀerentiaalvergelijkingen integreren over ´e´en periode. De
periodiek stabiele toestand is dus een vast punt van de periodeafbeelding en
een nulpunt van de functie g(x) = f(x) − x. De uiteindelijke vergelijking die
we moeten oplossen, wordt
g(x) = 0.
De uitdaging is nu om steeds gedetailleerdere modellen te gebruiken om de
processen beter te kunnen omschrijven. Bovendien kan het nodig zijn om een
ﬁjner grid te gebruiken, dat wil zeggen, meer gridpunten. De dimensie van het
gediscretiseerde probleem n wordt hierdoor groter en eﬃci¨ente methodes zijn
nodig om g(x) = 0 op te lossen.
In toepassingen is de methode van Broyden populair. Deze methode gaat
uit van een beginschatting x
0
voor het nulpunt x
∗
van de functie g. Met behulp
van een iteratief schema wordt een reeks van iteraties ¦x
k
¦ berekend dat naar
de oplossing x
∗
convergeert. Hierbij wordt gebruik gemaakt van een matrix
B
k
dat de Jacobiaan (de afgeleide) van de functie g in de iteratie x
k
benadert.
De Broyden matrix wordt elke iteratie aangepast door er een rang´e´enmatrix
bij op te tellen. Per iteratie wordt slechts ´e´en functieevaluatie uitgevoerd. De
methode van Broyden blijkt in het bijzonder geschikt te zijn voor problemen
afkomstig van periodieke processen.
Een nadeel van de methode van Broyden, is dat de (n n)matrix B
k
op
geslagen moet worden. Dit kan een probleem vormen als het model te groot
wordt. De vraag is dus of we de Broyden matrix op een eﬃci¨ente manier
kunnen opslaan. Na een uitgebreide studie van de methode van Broyden,
waarvan de simulaties vooral te vinden zijn in het tweede deel van dit proef
schrift, hebben we een oplossing gevonden. Door de Broyden matrix (zelf een
benadering) te benaderen, lukt het om de matrix op te slaan met behulp van
2pn elementen, in plaats van de n
2
elementen. De parameter p wordt vooraf
vastgesteld en de ideale waarde voor p is bepaald door eigenschappen van de
functie g en niet van de dimensie n van het gediscretiseerde probleem. Het
blijkt dat voor veel gevallen p klein gekozen kan worden. De methode die we
hebben ontwikkeld, noemen we de ’Broyden Rank Reduction’ methode. Als
Samenvatting 225
bijkomend voordeel, gaan de grote ndimensionale berekeningen die nodig zijn
voor de methode van Broyden over in kleine pdimensionale berekeningen.
n
n
p
p
We hebben bewezen onder welke omstandigheden de ’Broyden Rank Reduc
tion’ methode even snel is als de originele methode van Broyden. Dit is het
belangrijkste resultaat van het eerste deel van dit proefschrift.
De aanleiding voor het ontwikkelen van de ’Broyden Rank Reduction’ me
thode was een probleem afkomstig uit de reactorkunde. Dit probleem wordt
in het laatste deel van dit proefschrift volledig uitgewerkt.
De ’reverseﬂow’ reactor is een cilindrische buis gevuld met een katalysator
deeltjes waar een gas doorheen wordt gestuurd. In dit gas zit een reagens dat,
wanneer het in contact komt met de katalysator, reageert tot een product. We
veronderstellen dat de reactie exotherm is, dat wil zeggen, er komt warmte bij
vrij. Omdat de reactie alleen plaatsvindt als de temperatuur hoog genoeg
is, warmen we de reactor eerst op voordat we het proces starten. Als we nu
koud gas (op kamertemperatuur) de reactor inlaten, warmt het op en vindt de
reactie plaats. Dit heeft twee eﬀecten. Op de plaats waar de reactie plaats
vindt, ontstaat een reactiefront en stijgt de temperatuur. Vervolgens wordt
dit reactiefront door de reactor heen gestuwd en zal, wanneer we niets ver
anderen aan de condities van het proces, de reactor verlaten. De hele reactor
is dan op kamertemperatuur en er kan geen reactie meer plaatsvinden. We
kunnen dit voorkomen door voordat het reactiefront de reactor heeft verla
ten de reactor in omgekeerde richting te gaan gebruiken. We laten dan het
gas binnen aan het rechter uiteinde en vangen het product op aan het linker
uiteinde. Hierdoor zal het reactiefront weer naar links verschuiven.
Daar er veel energie bij de reactie vrijkomt en de reactor aan de wand wordt
gekoeld, ontstaan er in radiale richting temperatuursgradi¨enten. Daarom zou
den we graag de reactor beschrijven met behulp van een tweedimensionaal
model, met de concentratie van het reagens en de temperatuur als variabelen.
Indien we voor de discretisatie 100 gridpunten nemen in axiale richting en
25 gridpunten in radiale richting, is de dimensie van het gediscretiseerde pro
bleem, n, gelijk aan 2 100 25 = 5000. Het blijkt dat de vergelijking g(x) = 0,
niet meer is op te lossen met behulp van de methode van Broyden, daar naast
226 Samenvatting
alle andere matrices en vectoren een Broyden matrix moet worden opgesla
gen met 25.000.000 elementen. De ’Broyden Rank Reduction’ methode kan
wel worden toegepast voor bijvoorbeeld p = 20 of p = 10. Hiervoor moeten
respectievelijk 200.000 en 100.000 elementen worden opgeslagen. De methode
convergeert voor beide p’s even snel, terwijl het geheugengebruik wordt ge
halveerd. De parameter p kan zelfs gelijk worden gekozen aan 5 (opnieuw een
halvering van het geheugengebruik) in ruil voor een paar extra iteraties. Een
periodieke toestand van de ’reverseﬂow’ reactor met temperatuursgradi¨enten
in radiale richting kan met de ’Broyden Rank Reduction’ methode nu voor het
eerst eﬃci¨ent berekend worden.
Nawoord
Dit proefschrift was nooit voltooid zonder de inbreng en steun van vele vrien
den, collega’s en bekenden. Daarbij denk ik vooral aan de personen die direct
bij het totstandkoming van dit proefschrift betrokken zijn geweest. Verschil
lende leden van de promotiecommissie hebben door hun opmerkingen en vra
gen de presentatie van de resultaten overzichtelijker gemaakt en het aantal
fouten en onnauwkeurigheden verminderd.
Financi¨ele ondersteuning voor congres en werkbezoek heb ik gekregen van
het Leids Universiteits Fonds, Shell en van NWO via het Pioneer project. Het
Mathematisch Instituut heeft mij alle vrijheid gegeven om ongestoord mijn on
derzoek te doen en bovendien mij te kunnen voorbereiden op het toekomstig
werk voor de klas. Gedurende vier jaar heb ik me thuis gevoeld op twee totaal
verschillende instituten. Zowel in Leiden als bij het Instituut voor de Techni
sche Scheikunde in Amsterdam was er altijd wel iemand om een oplossing te
vragen voor een probleem (niet noodzakelijk betreﬀende mijn onderzoek) of
om trots mijn nieuwste resultaten te laten zien (van mijn kat Siep bijvoorbeeld,
zie de introductie).
In het laatste jaar van mijn promotie kreeg ik de gelegenheid om een maand
de Colorado State University te bezoeken op uitnodiging van Don Estep. Het
was voor mij een grote uitdaging om het resultaat van mijn onderzoek te
bespreken met hem en zijn collega’s. Het was leuk om de verschillende ﬁets
routes in en rondom Fort Collins te ontdekken.
Vanaf het begin van mijn promotie heb ik veel gehad aan de gesprekken
met Tycho, die mij in wezen is voorgegaan in dit onderzoek. Zowel zijn heldere
idee¨en als vermogen tot relativeren hebben mij veel geholpen. Het basisidee
van de ’Broyden Rank Reduction’ methode is mede aan hem te danken.
De tijd die ik met Miguel op de kamer zat, waren aangenaam en eﬀectief.
Het schoolbord op de kamer werd intensief gebruikt, want het was vaak al
voldoende om een probleem uit te leggen aan de ander om zelf ineens de
oplossing te zien. Zijn kennis over Latex is dit proefschrift zeker ten goede
gekomen.
227
228 Nawoord
Ook buiten de universiteit hebben velen mij geholpen bij het voltooien
van mijn promotie, vaak ook zonder dat zij zich ervan bewust waren. Door
het commentaar van Bertram en Luuk is de leesbaarheid van het begin en
het einde van dit proefschrift vergroot. De onvoorwaardelijke steun van mijn
ouders, mijn broer en mijn zus is voor mij van onschatbare waarde.
D´esir´ee is voor mij de reden om altijd vol te blijven houden.
Curriculum Vitae
Bart van de Rotten is geboren op 20 oktober 1976 in Uithoorn. Op zevenjarige
leeftijd behaalde hij als laatste van de klas het diploma voor vermenigvuldigen.
Deze valse start werd echter snel ingehaald en hij slaagde erin om alle reken
taken van de basisschool uit te werken. Door ´e´en van zijn leraren werd hij zelfs
getipt als de toekomstig directeur van de IBM. Ook op de middelbare school
bleef de interesse voor de wiskunde groeien, mede door het enthousiasme van
zijn wiskundedocent. Vanaf de derde klas tot aan het laatste jaar van de
universiteit heeft hij leerlingen begeleid in wis en natuurkunde. Zowel voor
wiskunde A, wiskunde B als natuurkunde sloot hij het gymnasium af met
(afgerond) een tien voor het eindexamen.
Op 3 juli 1995 nam hij zijn diploma in ontvangst en in de herfst van
datzelfde jaar begon hij de wiskundeopleiding aan de Vrije Universiteit te
Amsterdam. In de zomer van 1998 startte hij een specialisatie in de Operato
rentheorie onder begeleiding van dr. A.C.M. Ran. Voor het schrijven van zijn
eindscriptie ’Invariant Lagrangian subspaces of inﬁnite dimensional Hamilto
nians and the Ricatti equation’, vertrok hij in oktober 1998 voor vier maanden
naar de Technische Universit¨at Wien. Hier was hij te gast bij prof. dr. H.
Langer een expert op het gebied van de operatorentheorie. Op 25 augustus
1999 studeerde hij cum laude af.
Zijn interesse ging over van de zuivere naar de meer toegepaste wiskun
de. Bij prof. dr. S.M. Verduyn Lunel en prof. dr. A. Bliek (Universiteit
van Amsterdam) deed hij zijn onderzoek in het oplossen van grote stelsels
nietlineaire vergelijkingen, afkomstig van modellen voor chemische, periodiek
aangedreven reactoren, waarvan dit proefschrift het resultaat is. Tijdens deze
periode gaf hij werkcollege’s in analyse en numerieke wiskunde. Daarnaast be
zocht hij conferenties in Lunteren, Wageningen, Hasselt en Montreal, gaf hij
diverse keren een voordracht in Leiden, Utrecht en Amsterdam, deed hij mee
aan een modelleerweek in Eindhoven en volgde hij onder andere een cursus
in Delft. Als hoogtepunt van zijn promotie bezocht hij het voorjaar van 2003
de Colorado State University. Daar was het mogelijk om zijn onderzoek te
229
230 Curriculum Vitae
verdiepen met behulp van de expertise van prof. dr. D. Estep en zijn groep.
Binnenkort keert hij terug naar de Vrije Universiteit waar hij een opleiding
gaat volgen tot wiskundedocent.
Samenstelling van de promotiecommissie:
promotoren:
prof. dr. S.M. Verduyn Lunel prof. dr. ir. A. Bliek (Universiteit van Amsterdam) prof. dr. D. Estep (Colorado State University) prof. dr. G. van Dijk dr. ir. H.C.J. Hoefsloot (Universiteit van Amsterdam) prof. dr. R. van der Hout dr. W.H. Hundsdorfer (CWI, Amsterdam) prof. dr. L.A. Peletier prof. dr. M.N. Spijker
referent: overige leden:
A limited memory Broyden method to solve highdimensional systems of nonlinear equations
Bart van de Rotten Mathematisch Instituut, Universiteit Leiden, The Netherlands ISBN: 9090175768 Printed by PrintPartners Ipskamp The research that led to this thesis was funded by N.W.O. (Nederlandse organisatie voor Wetenschappelijk Onderzoek) grant 61661410 and supported by the Thomas Stieltjes Institute for Mathematics:
. . . 100 2 Solving linear systems with Broyden’s 2. . . . . . .1 Exact convergence for linear systems . . . . . . . . . .3 Broyden Base Reduction method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3 Introducing coupling . . 2. . . 67 71 . 35 method 55 . . 3. . . . . . . . . . . . . . . . 56 . .2 Two theorems of Gerber and Luk . . . . . . . . . . . . . . . 18 The method of Newton . .2 Broyden Rank Reduction method . . . . . . . . . . . . . 4. . . 96 . . . . . 3. .1 1. . . . . . . . . . 72 . . . . . . . . II Features of limited memory methods . . . . . . . . . . . .4 The approach of Byrd . . . . .2 Solving linear systems with Broyden’s method . . . . . . .1 Characteristics of the Jacobian . . . . . . . .4 Comparison of selected limited memory Broyden methods i . . . . . . . . 62 . . . . . . . . . . . . . . . . 4. . . .2 1. . . . 83 . . . . 2. . . . . . . . . . . . . . . . . . . . . 3 Limited memory Broyden methods 3. . . . . .3 Linear transformations .Contents Introduction 1 I Basics of limited memory methods 15 1 An 1. . . 109 111 113 120 125 128 4 Features of Broyden’s method 4. . . . .1 New representations of Broyden’s method 3. . . . . .3 introduction to iterative methods 17 Iterative methods in one variable . . . . . 4. . 25 The method of Broyden . . . . .
. . 7. . . . . .ii Contents 5 Features of the Broyden rank reduction method 5. . . . . .2 Singular value distributions of the update matrices . . . . . 191 Notes and comments Bibliography A Test functions B Matlab code of the limited memory Broyden methods C Estimation of the model parameters Samenvatting (Waarom Broyden?) Nawoord Curriculum Vitae 195 201 207 211 219 223 227 229 . . . . . . . . . .1 Discretization of the model equations . . .2 The behavior of the reverse ﬂow reactor . . . . . . . . . . 167 168 173 175 8 Eﬃcient simulation of periodically forced reactors in 2D 183 8. . .4 Comparison of selected limited memory Broyden methods .1 The reverse ﬂow reactor . . . . . . . . 135 135 138 140 142 III Limited memory methods applied to periodically forced processes 147 6 Periodic processes in packed bed reactors 149 6. . . . . .2 The model equations of a cooled packed bed reactor . . . . . . . . . . . . 183 8. 5. . .3 Computing on a ﬁner grid using same amount of memory 5. . . . . . . 7. . . . 5. . .3 Bifurcation theory and continuation techniques . . . 152 7 Numerical approach for solving periodically forced processes 7. . . . . . . . . . . . . . . .1 The advantages of periodic processes . . . . . . . . . . . . . . . . 185 8. .3 Dynamic features of the full twodimensional model . . . . . .2 Tests for the discretized model equations . .1 The reverse ﬂow reactor . . . . . . . . . . . 149 6. .
1.Introduction Periodic chemical processes form a ﬁeld of major interest in chemical reactor engineering. The dynamical process in the reactor can now be formulated by the dynamical system xk+1 = f (xk ). the state after one period is called the period map. in general. it is interesting to investigate the dependence of the cyclic steady state on the operating parameters of the reactor. is highly nonlinear. the reactor generally goes through a transient phase during many periods before converging to a periodic limiting state. which. the thermal swing adsorber (TSA). . . The action of the process during one period can be computed by integrating the obtained system of ordinary diﬀerential equations in time for one period. . We denote the period map by f : Rn → Rn . and the more recently developed pressure swing reactor (PSR). 1 . k = 0. . is given by the temperature proﬁles and concentration proﬁles of the reactants. Periodic states of the reactor are ﬁxed points of the period map f and a stable cyclic steady state can be computed by taking the limit of xk as k → ∞. we discretize the equations in space. In order to investigate the behavior of the system numerically. the transient phase of the process might be very long. and eﬃcient methods to ﬁnd the ﬁxed points of f are essential. that contains a periodically forced process. Starting with an initial state. The state of a chemical reactor. The map that assigns to an initial state of the process. Examples of such processes are the pressure swing adsorber (PSA). where xk denotes the state of the reactor after k periods. the reverse ﬂow reactor (RFR). The simulation of periodically forced processes in packed bed reactors leads to the development of partial diﬀerential equations. This periodic limiting state is also known as the cyclic steady state (CSS). Because the reactor operates in this state most of the time. Depending on the convergence properties of the system at hand. 2.
(2) where Jg (x). Basics of limited memory methods The standard iterative algorithm is the method of Newton. k = 0. In this thesis. (1) Because (1) is a system of n nonlinear equations. We call the resulting algorithms limited memory Broyden methods. An advantage of the method of Newton is that the convergence is quadratic in a neighborhood of a zero. . 1. The iterative algorithms produce a sequence {xk } of approximations to the zero x∗ of g. In his thesis [49]. . . the study of periodically forced processes is extended to more complex models.. the NewtonPicard method and the method of Broyden are especially promising. Let x0 ∈ Rn be an initial guess in the neighborhood of a zero x∗ of g. it is necessary to develop limited memory algorithms to solve (1). the basic equation we want to solve is g(x) = 0 for x ∈ Rn . Because the dimension of the discretized system of such models is very large. Newton’s method deﬁnes a sequence {xk } in Rn of approximations of x∗ by −1 xk+1 = xk − Jg (xk )g(xk ). and it is generally accepted that the most eﬃcient iterative algorithms for solving (1) minimize the least number of function evaluations. 2. So. A function evaluation can be a rather expensive task. Therefore. .2 Introduction Fixed points of the map f correspond to zeros of g : Rn → Rn where g is given by g(x) = f (x) − x. algorithms that use a restricted amount of memory. xk+1 − x∗ < c xk − x∗ 2 . is the Jacobian of g at the point x. memory constraints arise and care must be taken in the choice of iterative methods.e. iterative algorithms are needed to approximate a zero of the function g. i. . Since the method of Broyden is popular in the chemical reactor engineering. Van Noorden compares several iterative algorithms for the determination of the CSS of periodically forced processes. He deduces that for this type of problem. by solving (1). that is. we focus on approaches aimed at reducing the memory needed by the method of Broyden.
then g(xk+1 ) − g(xk ) = A(xk+1 − xk ) holds. the updated Broydenmatrix Bk+1 is chosen such that it satisﬁes the equation yk = Bk+1 sk .e. Dennis and Mor´ [11] published a proof that the method e of Broyden is locally qsuperlinearly convergent. In 1965. with sk = xk+1 − xk and yk = g(xk+1 ) − g(xk ). the Broyden matrix Bk is updated using a rankonematrix. k→∞ lim xk+1 − x∗ = 0. Moreover. the condition in (4) results in the following update scheme for the Broyden matrix Bk Bk+1 = Bk + (yk − Bk sk ) sT g(xk+1 )sT k k = Bk + sT sk sT sk k k (5) (4) In 1973. After every iteration step. If g(x) is an aﬃne function. . xk − x∗ In 1979. . 2. quadratic convergence for nonlinear problems xk+2n − x∗ ≤ c xk − x∗ 2 . 1. Broyden [8] proposed a method that uses only one function evaluation per iteration step instead of (n + 1). The main idea of Broyden’s method is to approximate the Jacobian of g by a matrix Bk . According to this equality. . If we assume that Bk+1 and Bk are identical on the orthogonal complement of the linear space spanned by s k . i. Since it is not always possible to determine the Jacobian of g analytically. we often have to approximate Jg using ﬁnite diﬀerences. Gay [22] proved that for linear problems the method of Broyden is in fact exactly convergent in 2n iterations. . The number of function evaluations per iteration step in the resulting approximate Newton’s method is (n + 1). Broyden.Introduction 3 for a certain constant c > 0. g(x) = Ax + b.. he showed that this implies locally 2nstep. Equation (4) is called the secant equation and algorithms for which this condition is satisﬁed are called secant methods. Thus the scheme (2) is replaced by −1 k = 0. . so for some A ∈ Rn×n and b ∈ Rn . (3) xk+1 = xk − Bk g(xk ).
since only one function evaluation is made for every iteration step. Diﬀerent techniques have appeared for solving large nonlinear problems [62]. this might lead to severe memory constraints. To introduce the idea of the BRR method we ﬁrst consider an example. these results imply that the method of Broyden needs more iterations to converge than the method of Newton. In Chapter 3. since we investigate the question of how much and which information can be dropped without destroying the property of superlinear convergence. In Chapter 1. In addition to a large reduction of the memory used. Therefore. Both Newton’s and Broyden’s method need to store an (n × n)matrix. f (x) = . there has been serious attention paid to the issue of reducing the amount of storage required for the iterative methods. we derive our main algorithm.0 · 10−12 . 2xn−1 − εx2 n 2xn (6) The unique ﬁxed points of the function f. The problem that we consider is the general nonlinear equation (1).2. We deduce that Broyden’s method uses selective information of the system to solve it. Example 1. Yet. see (2) and (3). . so nothing is known beforehand about the structure of the Jacobian of the system. for highdimensional systems.4 Introduction with c > 0. we develop several limited memory methods that do not depend on the structure of the Jacobian and are based on the method of Broyden. we consider the method of Broyden for linear systems of equations in Chapter 2. the method of Broyden might signiﬁcantly reduce the amount of CPUtime to solve the problem. we discuss the method of Newton and the method of Broyden in more detail and in particular describe the derivation and convergence properties. From the early seventies. This proof of exact convergence was simpliﬁed and sharpened in 1981 by Gerber and Luk [23]. Subsequently. x∗ = 0. . 2x1 − εx2 2 . the Broyden Rank Reduction method (BRR). these limited memory methods give more insight in the original method of Broyden. The period map f : Rn → Rn to be considered is a small (take ε = 1. . In practice. can be found by applying Broyden’s method to solve (1) with g(x) = f (x) − x up to a certain residual g(x) < 1. In Section 3.0 · 10−2 ) quadratic perturbation to two times the identity map.
the ﬁrst Broyden matrix is given by. This suggests that we can ignore the singular value σ5 in the singular value decomposition of Q without changing the update matrix Q. . and if we compute the singular value decomposition of Q (see Section 3. . The diﬀerence between the ˜ original Broyden matrix B5 = B0 + Q and the ’reduced’ matrix B5 = B0 + Q can be estimated as T B 5 − B 5 = B 0 + Q − B 0 − Q = σ 5 u 5 v5 = σ 5 u 5 v5 = σ 5 .61. . v5 } are orthonormal sets of vectors and σ1 = 2. In the next iteration step. . d2 }. . Because B0 does not have to be stored. d1 } and {c2 .49. . Starting with a simple initial matrix B0 = −I. 000 storage locations to store all of the vector pairs. Q = c 1 dT + c 2 dT + . We consider the update matrix in the ﬁfth iteration that consists of the ﬁrst ﬁve rankone updates to the Broyden matrix. we obtain the second Broyden matrix. 1). . . . . . .214 · 10−5 . . + c 5 dT . We replace the matrix Q by Q with T T T Q = Q − σ 5 u 5 v5 = σ 1 u 1 v1 + . σ2 = 1. . (7) 2 1 2 where c2 = g(x2 )/ s1 and d2 = s1 / s1 . . u5 } and {v1 . σ5 = 0. σ3 = 0. . . + σ 5 u 5 v5 . . 000.Introduction 5 using initial estimate vector x0 = (1. σ4 = 0. we choose n = 100. Applying another update. 000 locations are used to store the vector pairs {c1 . . B1 = B 0 + c 1 d T .121 · 10−12 . 1 where c1 = g(x1 )/ s0 and d1 = s0 / s0 . where {u1 . Now 4 · n = 400. .00. 4. . 1 2 5 Because Q is the sum of ﬁve rankone matrices. ˜ We deﬁne ci := σi ui and di := vi for i = 1. . B2 = B 1 + c 2 d T = B 0 + c 1 d T + c 2 d T . In order to obtain a good example of memory reduction. we would need 6 · n = 600. it is more economical to store the vector c1 and d1 instead of the matrix B1 itself.2 for details) we see that Q can be written as T T Q = σ 1 u 1 v1 + . it has rank less or equal to ﬁve. + σ 4 u 4 v4 .
11 of Section 3. This leads to Algorithm 3. In the ﬁrst iterations.5 0 PSfrag replacements 0 5 10 15 iteration k Figure 1: The singular values of the update matrix during the BRR process with p = 5. After this reduction we can store a new pair of update vectors c5 := g(x6 )/ s5 and d2 := s2 / s2 . with parameter p = 5. Note that p cannot be any small number.5 singular values 2 1. and care is needed to ﬁnd the optimal p. if in every iteration we save the four largest singular values of the update matrix and drop the ﬁfth singular value. the Broyden Rank Reduction method. it might not be harmful to remove the second singular value of the update matrix. For p = 5. on every iteration step we ﬁrst remove the singular value σ5 of Q before computing the new update. the number of required storage locations is reduced from n2 = 1010 for the Broyden matrix of the original method to 2pn = 106 for the BRR method. If p is larger than 5. Clearly. We observe that the residual g(xk ) is approximately 10−14 after 14 iterations. we apply the method of Broyden itself. we do not alter the Broyden matrix. Continuing. the rate of convergence does not increase. Therefore. .5 1 0.2. The rate of convergence of this process is plotted in Figure 2. 3 2. where the diﬃculties start in the 9th iteration. together with the rate of convergence of the BRR method for other values of p. For p = 3 we observe the same kind of behavior. In fact. Surprisingly the ﬁfth singular value of the update matrix remains zero in all subsequent iterations until the process is converged.6 Introduction which is equal to zero in this case. but after 8 iterations the process fails to keep the fast qsuperlinear convergence and starts to diverge. see Figure 1. the BRR process does not converge for p = 2.
since it is known that for linear systems of equations. ’∗’(p = 4). we discuss most of the properties of Broyden’s method and the newly developed Broyden Rank Reduction method. We are especially interested in nonlinear test functions. In Part I. in Part II we apply the method of Broyden and the BRR method to several test functions. In this situation the BRR method is satisﬁed with the action of the Broyden matrix in only p directions. a function often can be considered as approximately aﬃne. for example. The good results in practical applications of Broyden’s method can only be explained to a limited degree. in order to compute the new Broyden step sk . the rate of convergence of Broyden’s method applied to the linearization of a function indicates how much iterations it might need for the function itself. for example. Moreover. Features of limited memory methods In Part I. ’+’(p = 3). if measured in the matrix norm induced by the l2 vector .Introduction 7 10 5 residual g(xk ) 10 0 10 −5 10 −10 PSfrag replacements 10 −15 0 10 20 30 40 50 60 iteration k Figure 2: The convergence rate of the Broyden Rank Reduction method when computing a ﬁxed point of the function f given by (6). the diﬀerence Bk − A between the Broyden matrix and the Jacobian of the function does not increase as k increases. 71]. ’ ’(p = 5)] The reduction applied to the update matrix Q can also be explained as follows. Therefore. Broyden’s method is far less eﬃcient than. GMRES [61] and BiCGSTAB [70. [’◦’(p = 1). we show that if we apply Broyden’s method to an aﬃne function g(x) = Ax + b. These directions are produced by the Broyden process itself. In the neighborhood of the solution. see (3). ’×’(p = 2). For the method of Broyden the action of the Broyden matrix has to be known is all n directions.
. Note that the light spot at the left side of the image of the Jacobian is considered less interesting by the method of Broyden. . the nose and eyes of the cat are clearly detected. 1 2 (8) where the elements of S are in between zero and one. On the other hand. It turns out that in the simulation. the picture of the cat is sharpened. may increase. we have plotted the matrix A and the Broyden matrix for diﬀerent iterations. In Figure 3. Although the ﬁnal Broyden matrix is certainly not equal to the Jacobian. However. B0 ) ∈ N1 × N2 the diﬀerence Bk − Jg (x∗ ) F never exceeds two times the initial diﬀerence B0 − Jg (x∗ ) F . We observe that in some way the Broyden matrix Bk tries to approximate the Jacobian A. . . The matrix S contains in fact the values of a grayscale picture of a cat. In case of signiﬁcant temperature ﬂuctuations. to .8 Introduction norm. . Moreover.. it turns out to be essential to include a second space dimension in the model of packed bed reactors. measured in the Frobenius norm. . Example 2. For nonlinear functions g the diﬀerence Bk − Jg (x∗ ) F . While reconstructing the two main diagonals of the Jacobian. The dimension of the problem is n = 100. Let the matrix A be 2 A= given by the sum 1 . + S. it approximates the Jacobian to such an extent that the solution to the problem Ax = 0 can be found. the Theorem of Gay implies that it will take Broyden’s method less then 200 iterations to solve the problem exactly. The ﬁnite arithmetic of the computer has probably introduced a nonlinearity into the system so that the conditions of Gay’s Theorem are not completely fulﬁlled. . about 219 iterations are needed to reach a residual of g(xk ) < 10−12 .. We consider the system of linear equations Ax = 0 and apply the method of Broyden from initial condition x0 = (1. . Limited memory methods applied to periodic processes We now consider an application in the chemical reactor engineering. so that if (x0 . 1) and with initial Broyden matrix B0 = −I. . we can choose a neighborhood N1 of the solution x∗ and a neighborhood N2 of the Jacobian Jg (x∗ ). the rough contour of the cat can be recognized in the Broyden matrix B50 . After 50 iterations. Since A is invertible..
Black corresponds to values smaller than 0 and white to values larger than 1. many authors have reverted to pseudohomogeneous onedimensional models and to coarse . Combining the integration of the system of ordinary diﬀerential equations for the evaluation of the function g with a ﬁne grid in the reactor makes it practically impossible to solve (1) using classical iterative algorithms for nonlinear equations. This implies that the number of equations. obtain an accurate approximation of the periodic state of the reactor. n. To overcome severe memory constraints. it is necessary to use a ﬁne grid. is very large.Introduction 9 Jacobian B50 B100 B218 Figure 3: The Jacobian as given by (8) and the Broyden matrix at three diﬀerent iterations of the Broyden process (n = 100).
we consider a reverse ﬂow reactor. we derive the balance equations of the twodimensional model of a general packed bed reactor. The radial transport of heat and matter is essential in nonisothermal packed bed reactors [72]. The beginning and end of the reactor thus eﬀectively work as heat exchangers. and eﬃcient cooling of the reactor at the wall cause radial temperature gradients to be present. PSfrag replacements PSfrag replacements temperature rad. which renders such models inadequate or inaccurate. The cold feed gas purges the hightemperature (reaction) front in downstream direction. We start with the onedimensional pseudohomogeneous model of Khinast. Upon entering the reactor. One complete cycle of the RFR consists of two ﬂowreverse periods. for reactors operating under these conditions the radial dimension must be taken into account explicitly. As an example. Overheating of the catalyst and hot spot formation are avoided by a limited degree of cooling. distance rad. distance Figure 4: Qualitative conversion and temperature distribution of the cooled reverse ﬂow reactor in the cyclic steady state using the twodimensional model (10)(12) with the parameter values of Tables C. distance conversion temperature conversion ax. Before the hot reaction zone exits the reactor. is usually constant and predeﬁned. Jeong and Luss [33]. the hot product gas is cooled by the colder catalyst particles. At the other end of the reactor. The reaction is assumed to be exothermic. Clearly. the feed ﬂow direction is reversed.2. the cold feed gas is heated up regeneratively by the hot bed so that a reaction can occur. . which takes into account the axial heat and mass dispersion. a large width of the reactor. In Chapter 6. distance ax.10 Introduction grid discretization. The reverse ﬂow reactor (RFR) is a catalytic packedbed reactor in which the ﬂow direction is periodically reversed to trap a hot zone within the reactor. Here we give a short summary of the derivation. denoted by tf . which is considered in detail in Chapter 8. The ﬂowreversal period.1 and C. A highly exothermic reaction. see Figure 4.
= −Uw (T (R) − Tc ). (9) r=R is added to the system. (10) the energy balance is given by ((ρcp )s (1 − ε) + (ρcp )g ε) ∂T ∂2T ∂T = λax 2 − u(ρcp )g + ∂t ∂z ∂z (−∆H)r (c. The second spatial dimension is incorporated by including the radial components of the diﬀusion terms.1 and C. Instead. ∂c −εDax ∂z ∂c ∂r r=0 ∂T ∂r r=0 z=0 = 0. in the component balance and the energy balance. T ) + εDrad ∂t ∂z ∂z r ∂r r ∂c ∂r . we can now give the complete twodimensional model. The component balance is given by ε ∂2c ∂c 1 ∂ ∂c = εDax 2 − u − r (c. In summary. = u(c0 − c(0)).Introduction 11 The concentration and temperature depend on the axial and the radial direction. at the wall of the reactor the boundary condition λrad ∂T ∂r = −Uw (T (R) − Tc ). Equation (9) describes the heat loss at the reactor wall to the surrounding cooling jacket. r=R . respectively. The cooling term in the energy balance disappears. c = c(z. εDrad 1 ∂ r ∂r r ∂c ∂r and λrad 1 ∂ r ∂r r ∂T ∂r . r. r. ∂T ∂z z=L ∂c ∂z z=L ∂c ∂r r=R = 0. (11) = u(ρcp )g (T0 − T (0)).2. T ) + λrad and the boundary conditions are given by −λax ∂T ∂z z=0 1 ∂ r ∂r r ∂T ∂r . = 0. = 0. = 0. t). λrad ∂T ∂r (12) The values of the parameters in this model are derived in Appendix C and summarized in Tables C. t) and T = T (z.
where the reactor is ﬁlled with the carrier gas without a trace of the reactants. where radial gradients are integrated in the model. Here.(12). In Chapter 8. This initial state of the reactor is chosen as the initial state of the Broyden process. p can even be chosen equal to 5 and the number of storage locations is reduced further. we describe a numerical approach to deal with the partial diﬀerential equations in order to compute the cyclic steady state of the process. This implies that n = 5000. For complete details of the computations see Chapter 5. that is. we propose the use of the Broyden Rank Reduction method to simulate a full twodimensional model of the reverse ﬂow reactor with radial gradients taken into account. for example T = 2T0 . 000 (n2 ) required for a standard Broyden iteration. A disadvantage of the method of Broyden is that an initial approximation of the solution has to be chosen as well as an initial Broyden matrix. The state vector. We apply the BRR method for diﬀerent values of p to approximate a zero of the function g(x) = f (x) − x with a residual of 10−8 . If we use a few more iterations. When we are able to evaluate the period map of the process using discretization techniques and integration routines.12 Introduction In Chapter 7. B0 = −I. Thereafter the rate of convergence decreases and the dynamical process takes many periods before it reaches the cyclic steady state. 000 (2pn) storage locations are needed for the Broyden matrix. consist of the temperature and the concentration in every grid point. c = 0. By taking the initial Broyden matrix equal to minus the identity. 000. We use 100 equidistant grid points in the axial direction and 25 grid points in the radial direction. We deﬁne f : Rn → Rn to be the period map of the RFR of one ﬂow reverse period associated to the balance equations (10) . only 100. the ﬁrst iteration of the Broyden process is a dynamical simulation step. The BRR method makes it possible to compute eﬃciently the cyclic steady state of the reverse ﬂow reactor. This problem is naturally solved by the application. Figure 5 shows that the BRR method converges in 49 iterations for p = 10. the state of the reverse ﬂow reactor converges relatively fast to the cyclic steady state. . −1 x1 = x0 − B0 g(x0 ) = x0 + f (x0 ) − x0 = f (x0 ). instead of 25. In the ﬁrst periods. denoted by x. If p is chosen too small (p = 2) the (fast) convergence is lost. The reverse ﬂow reactor is usually started in preheated state. we can apply the limited memory Broyden methods of Chapter 3.
1 and C. ’∗’(p = 5). for diﬀerent values of p.Introduction 13 10 0 residual g(xk ) 10 −5 PSfrag replacements 0 10 20 30 40 50 60 70 80 90 100 iteration k Figure 5: The convergence rate of the method of Broyden and the BRR method. applied to the period map of the reverse ﬂow reactor using the twodimensional model (10)(12) with the parameter values of Tables C. [’+’(p = 10).2. ’ ’(p = 2)] .
14 Introduction .
Part I Basics of limited memory methods 15 .
.
As a simple introduction into quasiNewton methods. . In addition.1) where x = (x1 . the ndimensional real vector space. that is. We prove the local convergence of the method and discuss the most interesting features.Chapter 1 An introduction to iterative methods A general nonlinear system of algebraic equations can be written as g(x) = 0. we ﬁrst consider a scalar problem. In Section 1. In general. The oldest and the most famous iterative method might be the method of Newton. We write Jg ∈ Lipγ (D). we discuss some quasiNewton methods. . Jg (u) − Jg (v) ≤ γ u − v . is assumed to be Lipschitz continuous in D. 17 . xn ) is a vector in Rn . also called the NewtonRaphson method. These approaches are generally built on the concept of iterations. proposed by Charles Broyden in 1965 [8]. based on the method of Newton. we derive and discuss the method of Newton and describe the convergence properties. Steps involving similar computations are performed over and over again until the solution is approximated. convex domain D.3. The function g : Rn → Rn is assumed to be continuously diﬀerentiable in an open. (1. The quasiNewton method of most interest to this work is the method of Broyden. The Jacobian of g. there exists a constant γ ≥ 0 such that for every u and v in D. systems of nonlinear equations cannot be solved analytically and we have to consider numerical approaches. we derive this method in the same way as Broyden. denoted by Jg . In Section 1. . .2.
Let g : R → R be given by (1. .3. We illustrate this idea with an example. An introduction to iterative methods 1.e. .1. So.2 and 1.18 Chapter 1. we often omit the proof. as discussed in Sections 1. 1. denoted by lk (x). A clear choice for the aﬃne model. Next.2 and 1. is the tangent line to the graph of g in the point (xk .1. . . see Figure 1. we take x0 = 3. The scalar version of Newton’s method The standard iterative method to solve (1. (1. the function is linearized in the point xk . The multidimensional versions of the methods are more complex.3) and xk+1 is deﬁned to be the zero of this aﬃne function. unless it gives insight in the algorithms.2). but an understanding of the scalar case will help in understanding the multidimensional case.2) This iteration scheme involves solving a local aﬃne model for the function g instead of solving the nonlinear equation (1. g (xk ) k = 0. We choose an initial guess x0 ∈ R to the solution x∗ and compute the sequence {xk } using the iteration scheme xk+1 = xk − g(xk ) .3.1) directly. x1 = 11 . which can be described by a single expression. So.1) is the method of Newton. l0 (x) = g(x0 ) + g (x0 )(x − x0 ) = 7 + 6(x − 3) = 6x − 11. (1. Example 1. i. . which yields (1. The theorems in this section are special cases of the theorems in Sections 1. 2. g(xk )). g(x) = x2 − 2.. we repeat the same step 6 starting from the new estimate x1 . Broyden’s method and other quasiNewton methods.1 Iterative methods in one variable The algorithms we discuss in this section are the scalar version of Newton’s method. The ﬁrst aﬃne model equals the tangent line to g at x0 .4) √ The derivative of this function is g (x) = 2x and an exact zero of g is x∗ = 2. As initial condition. The next iterate x1 is determined to be the intersection point of the tangent line and the xaxis. lk (x) = g(xk ) + g (xk )(x − xk ).
and Newton’s method converges only . starting at x0 = 3. exists and converges to x∗ .1 Iterative methods in one variable 19 10 8 6 4 2 PSfrag replacements 0 x2 1 x1 2 x0 3 4 −2 Figure 1. Let g : R → R be continuously diﬀerentiable in an open interval D. the sequence {xk } generated by xk+1 = xk − g(xk ) .10 reads. 1. . Theorem 1.1: The ﬁrst two steps of the scalar version of Newton’s method (1. Furthermore.2) for x2 − 2 = 0.2. In every iteration. If g (x∗ ) = 0. The scalar version of Theorem 1. then there exists an ε > 0 such that if x0 − x∗  < ε. An important fact is that the method of Newton is locally qquadratically convergent. . . If g(x) = 0 has a solution x∗ ∈ D. Assume that for some ρ > 0. g (x) ≥ ρ for every x ∈ D. the number of accurate digits is doubled when the iteration starts close to the true solution. simply means that g (x∗ ) must be nonzero for Newton’s method to converge quadratically. then x∗ is a multiple root.1. . where g ∈ Lipγ (D). for k ≥ 0. xk+1 − x∗  ≤ γ xk − x∗ 2 . 2ρ The condition that g (x) has a nonzero lower bound in D. g (xk ) k = 0.
For hk suﬃciently small. However.5) gives g(xk−1 ) − g(xk ) ak = . The secant method In many practical applications. (1. In this case. ak is a ﬁnitediﬀerence approximation to g (xk ). In Theorem 1. the nonlinear equation cannot be given in closed form. This quasiNewton method is called the secant . the method is useful for its fast local convergence. (1. . we show that using ak given by (1. Substituting hk = xk−1 − xk in (1. ak k = 0.2 guarantees the convergence only for a starting point x 0 that lies in a neighborhood of the solution x∗ . The tangent line can be approximated by the secant line through g(x) at xk and a nearby point xk + hk . In addition.6) Iterative methods that solve (1. if g (x) ≥ ρ on D.5) with suﬃciently small hk . .8) xk−1 − xk and only one function evaluation is required. but we need to combine it with a more robust algorithm that is can converge from starting points further away from the true solution. using hk = xk−1 − xk may be a better choice. works as well as using the derivative itself. where xk−1 is the previous iterate. in every iteration two function evaluations are needed. since g(xk−1 ) is already computed in the previous iteration. The slope of this line is given by ak = g(xk + hk ) − g(xk ) . the continuity of g implies that x∗ is the only solution in D. These methods follow the scheme xk+1 = xk − g(xk ) . If computing g(x) is very expensive. So. which requires the derivative g (x) to model g around the current estimate xk by the tangent line to g(x) at xk .7) Of course we have to choose hk in the right way. g (x) is not available and we have to modify Newton’s method. An introduction to iterative methods linearly [18]. For example. If x0 − x∗  is too large. (1.5. Newton’s method might not converge. the function g might be the output of a computational or experimental procedure. 1.5) The function g(x) is modeled by lk (x) = g(xk ) + ak (x − xk ). . .6) in every iteration step are called quasiNewton methods. hk (1. Theorem 1.20 Chapter 1.
g(y) − g(x) − g (x)(y − x) ≤ γ(y − x)2 . The fundamental theorem of calculus gives that g(y)−g(x) = which implies y g (z)dz. . We analyze one step of the quasiNewton process (1. it turns out to work well.3. g(y) − g(x) − g (x)(y − x) = Under the change of variables z = x + t(y − x).1.9) dz = dt(y − x).9) becomes 1 x (g (z) − g (x))dz. To prove the convergence of the secant method we need the following lemma. (1. because the local model uses the secant line through the points xk and xk−1 . 2 y x Proof. but usually it is more eﬃcient in terms of the total number of function evaluations required to obtain a speciﬁed accuracy. Let g : R → R be continuously diﬀerentiable in an open interval D. The method is slightly slower than a ﬁnitediﬀerence method. Applying the triangle inequality to the integral and using the Lipschitz continuity of g . y in D. Lemma 1. which also plays a role in the multidimensional setting. Then for any x.5) with h0 small or a0 = −1. Since a0 is not deﬁned by the secant method.6).1 Iterative methods in one variable 21 method. a0 is often chosen using (1. xk+1 − x∗ = a−1 (ak (xk − x∗ ) − g(xk ) + g(x∗ )) k = a−1 k x∗ = a−1 (g(x∗ ) − g(xk ) − g (xk )(x∗ − xk ) + (g (xk ) − ak )(x∗ − xk )) k xk (g (z) − g (xk ))dz + (g (xk ) − ak )(x∗ − xk ) . yields 1 g(y) − g(x) − g (x)(y − x) = y − x 0 γt(y − x)dt = γy − x2 /2. and let g ∈ Lipγ (D). While it may seem locally ad hoc. By construction. (1. g(y) − g(x) − g (x)(y − x) = 0 (g (x + t(y − x)) − g (x))(y − x)dt.
If g(x) = 0 has a solution x∗ ∈ D. . then ak − g (xk ) ≤ Proof.10) gives ek+1 ≤ γ (ek + hk )ek . we have to know how close the ﬁnite diﬀerence approximation ak is to g (xk ) as a function of hk . If there exists some constant c1 such that hk  ≤ c1 xk − x∗ . it is not diﬃcult to prove the following theorem. If xk .11) Using this inequality. Let g : R → R be continuously diﬀerentiable in an open interval D and let g ∈ Lipγ (D).3. Theorem 1. and if x0 − x∗  < ε. h such that if {hk } is a real sequence with 0 < hk  ≤ h.3. is well deﬁned and converges qlinearly to x∗ . then there exist positive constants ε. we obtain ek+1 ≤ a−1  k γ 2 e + g (xk ) − ak ek . (1. From Lemma 1. Substituting (1. 2 (1. then the sequence {xk } given by xk+1 = xk − g(xk ) . ak ak = g(xk + hk ) − g(xk ) .5).12) γhk 2 . Lemma 1.5. then the convergence is qsuperlinear. . 2 γhk  .10). . or equivalently.22 Chapter 1.13) . a constant c2 such that hk  ≤ c2 g(xk ). 2ak  (1.4. xk + hk ∈ D and ak is deﬁned by (1. If limk→∞ hk = 0. Let g : R → R be continuously diﬀerentiable in an open interval D and let g ∈ Lipγ (D). 1. 2 k (1.10) In order to use (1. An introduction to iterative methods If we deﬁne ek = xk − x∗  and use g ∈ Lipγ (D) in the same way as in the proof of Lemma 1.11) in (1. hk for k = 0. we have g(xk ) − g(xk + hk ) − hk g (xk ) ≤ Dividing both sides by hk  gives the desired result. . Assume that g (x) ≥ ρ for some ρ > 0 and for every x ∈ D.
14). If there exists some constant c3 such that hk  ≤ c3 xk − xk−1 . we obtain ek+1 ≤ 2γ 1 (ek + hk )ek ≤ ek 3ρ 2 for all k = 1. We ﬁrst prove that the secant method is qlinearly convergent. We restrict the proof of Theorem 1. then the convergence is at least twostep qquadratic. Suppose x0 and x1 are in D and in addition x0 − x∗  < ε. Proof (of Theorem 1.12). . the mean value theorem and the fact that g(x∗ ) = 0 implies that g(xk ) ≤ cxk − x∗  for some c > 0. 3ρ 4γ 2γ 2 2 · 3ρ 4 1 2 Therefore. . .15) To prove the twostep qquadratic convergence of the secant method. we have x2 − x∗  ≤ · ε < ε and 3 3 h2  = x2 − x1  ≤ e2 + e1 ≤ ε = h < h. this gives e2 ≤ ρ 2γ ρ 1 γ + (e1 + h1 )e1 ≤ e1 = e1 . . . we note that hk  = xk − xk−1  ≤ ek + ek−1 . . . (1. (1. 2.5 to the twostep qquadratic convergence of the secant method.5). Note that the secant method hk = xk−1 −xk is included as a special case of (1. if xk is close enough to x∗ . 2 2γ 4 (1. .11) implies that a1  = a1 − g (x1 ) + g (x1 ) ≥ g (x1 ) − a1 − g (x1 ) γh1  ≥ ρ− 2 γ ρ 3 ≥ ρ− · = ρ. k = 1.1 Iterative methods in one variable 23 then the convergence is qquadratic. Since g (x) ≥ ρ for all x ∈ D. 2 4 Using the same arguments. If we would like the ﬁnitediﬀerence methods to converge qquadratically. .14) From (1. Indeed. Choose ε = ρ/(4γ) and h = ρ/(2γ). we just set hk = c2 g(xk ). 2.1. x1 − x∗  < ε and h1  = x1 − x0  < η .
833333333333 1. .2) and the secant method (1. starting from the initial condition x0 = 3.0 1.414213562373 Table 1.414998429895 1.833333333333 1.7) and (1.7) and (1.8) ap√ proximating 2. In numerical simulations. iteration 0 1 2 3 4 5 6 7 Newton’s method 3.6. we should not choose h k too small because then f l(xk ) = f l(xk + hk ).431239388795 1.2. to solve g(x) = 0. we have to deal with the restrictions arising from the ﬁnite arithmetic of the computer.414213562373  Secant method 3. and the ﬁnitediﬀerence approximation of g (xk ) is not deﬁned.551724137931 1.414218257349 1.4).15) ek+1 ≤ 2γ 1 2γ 2 2γ (ek + ek + ek−1 )ek ≤ (2ek−1 ) ek−1 = e . In Figure 1.24 Chapter 1. Of course. (1.1 shows that the secant method needs just√ few more iterations to obtain the a same precision for the approximation of 2 as Newton’s method.414213780047 1. we consider the secant method again applied to the function g(x) = x2 − 2. although the derivative is not equal to zero. Table 1. where f l(a) is the ﬂoating point representation of a. it depends on the function and how accurately we can approximate the zero of the function. In the next example. In particular. We apply the secant method.1: The method of Newton (1. An introduction to iterative methods Using the linear convergence. it can also happen that f l(g(xk )) = f l(g(xk + hk )).462121212121 1. Let g : Rn → Rn be given by (1. This is one reason that the secant process can fail. Example 1. we use a0 = g (x0 ).414998429895 1.8). the ﬁrst two steps of the secant method are displayed. In the ﬁrst iteration step of the secant method. Additionally. 3ρ 3ρ 2 3ρ k−1 This implies the twostep qquadratic convergence. we derive from (1.414213563676 1.0 1.
deﬁned by (1.16) .10. where Jg denotes the Jacobian of g. 1. (1. The local model is derived from the equality xk +s g(xk + s) = g(xk ) + xk Jg (z)sdz.3. If the integral is approximated by Jg (xk )s. Derivation of the algorithm Similar to the onedimensional setting. the model in the current iterate becomes lk (xk + s) = g(xk ) + Jg (xk )s.2: The ﬁrst two steps of the secant method.8).2 The method of Newton This section we derive the method of Newton in the multidimensional setting. By Theorem 1. for x2 − 2 = 0. the method of Newton is based on ﬁnding the root of an aﬃne approximation to g at the current iterate x k .7) and (1.1. we prove that in Rn Newton’s method is locally qquadratically convergent. starting from x0 = 3.2 The method of Newton 25 10 8 6 4 2 PSfrag replacements 0 x2 1 x1 2 x0 3 4 −2 Figure 1. We then discuss the ﬁnitediﬀerence version of Newton’s method and other types of quasiNewton methods that leads to the introduction of the method of Broyden in Section 1.
. n. . . we build the Newton iteration into an algorithm.17) for i = 1. Algorithm 1. given by gi (x) = xi + h (1−ti ) 2 i n tj (xj +tj +1)3 +ti j=1 j=i+1 (1−tj )(xj +tj +1)3 . for diﬀerent dimensions n of the problem. Therefore. tn (tn − 1)). starting from an initial guess x0 . and set k := 0. n. ii) xk+1 := xk + sk . in addition. cf. We have chosen this function from a large set of test function. we explicitly give the expression of the function. An introduction to iterative methods We solve this aﬃne model for s. the convergence properties of Newton’s method are described. The new iterate xk+1 is not expected to equal x∗ .7 to ﬁnd a zero of the discrete integral equation function. We start with the initial vector x0 given by x0 = (t1 (t1 − 1). The initial residual g(x0 ) and the . The ﬁrst time we use this test function.2. Example 1. pick ε > 0. i) Solve Jg (xk )sk = −g(xk ) for sk . Choose an initial estimate x0 ∈ Rn . we refer to Appendix A. is added to the current iterate xk+1 = xk + sk . In order to judge every iterative method described in this thesis. the method of Broyden is able to compute a zero of this function rather easily. we consider the rate of convergence of each method on a test function. This Newton step. i = 1. . . but only to be a better estimate than xk . 47]. where h = 1/(n + 1) and ti = i · h. as described in Appendix A. . that is. Repeat the following sequence of steps until g(xk ) < ε. [18. We apply Algorithm 1. . sk . ﬁnd sk ∈ Rn such that lk (xk + sk ) = 0. . . It is a commonly chosen problem and. . . . In future examples.26 Chapter 1.7 (Newton’s method). the discrete integral equation function. called the CUTE collection. In Table 1.8. (1.
7 applied to the discrete integral equation function (1.7570 1. Newton solves the problem in one iteration.4858 · 10−14 k∗ 3 3 3 R 10. ’+’(n = 200)] Note that if g is an aﬃne function. 10 0 residual g(xk ) 10 −5 10 −10 PSfrag replacements 10 −15 0 1 2 3 iteration k Figure 1. We illustrate this with an example.4594 10.4637 Table 1.17) for diﬀerent dimensions n. . is plotted in Figure 1.7854 · 10−14 2.0678 g(xk∗ ) 6.3085 · 10−15 1. . .17) for diﬀerent dimensions n. Even if a component function of g is aﬃne. (1. [’◦’(n = 10). . k ∗ . where k ∗ is the number of iterations used. = 0. if g1 is aﬃne then g1 (x1 ) = g1 (x2 ) = .2518 0. We observe that the dimension of the problem does not inﬂuence the convergence of Newton’s method in case of this test function.2 The method of Newton 27 ﬁnal residual g(xk∗ ) are given.3: The convergence rate of Algorithm 1.1. each iterate generated by Newton’s method is a zero of this component function. .7 applied to the discrete integral equation function (1.3.18) The residual g(xk ) . .4393 10. The variable R is a measure of the rate of convergence and deﬁned by R = log( g(x0 ) / g(xk∗ ) )/k ∗ .2: The convergence properties of Algorithm 1. method Newton Newton Newton n 10 100 200 g(x0 ) 0. . k = 0. ’×’(n = 100). that is.
2. . if Jg (xk ) is illconditioned. which equals g(x1 ) = (−48. A ﬁnitediﬀerence method or less expensive methods should be used to approximate the Jacobian. ε) of the solution x∗ . then Jg (xk )sk = −g(xk ) will not give a reliable solution.2. Some problems arise in implementing Algorithm 1. 1). All the convergence results in this thesis are local. Let g : Rn → Rn be continuously diﬀerentiable in an open. is well deﬁned.28 Chapter 1.. 2. converges to x∗ . The proof is a prototype of the proofs for convergence of the quasiNewton methods. So.7. ε) the sequence {xk } generated by xk+1 = xk − Jg (xk )−1 g(xk ). . (1. N (x∗ . the ﬁrst component of the Rosenbrock function has become aﬃne and the next iterate yields the solution x2 = (1. The function value in x0 equals g(x0 ) = (−4. Local convergence of Newton’s method In this section. 2. 1. for example. We consider the Rosenbrock function g : R2 → R2 deﬁned by g(x) = 10(x2 − x2 ) 1 .9. An introduction to iterative methods Example 1. Secondly. As said before. 1 − x1 As initial condition. we choose x0 = (−1. The Jacobian of g is often not analytically available. we give a proof of the local qquadratic convergence of Newton’s method and discuss its implications.4. This implies that the ﬁrst component of xk will be equal to 1 for all future iterations. convex set D ∈ Rn .4. there exists an ε > 0 such that the iterative method converges for all x0 in an open neighborhood N (x∗ . for k = 0. i. . 1). −3.10. 1. ε) = {x ∈ Rn  x − x∗ < ε}. Theorem 1. Here. Then there exists an ε > 0 such that for all x0 ∈ N (x∗ . .2). . Assume that Jg ∈ Lipγ (D) and that there exist x∗ ∈ Rn and β > 0 such that g(x∗ ) = 0 and Jg (x∗ ) is nonsingular with Jg (x∗ )−1 ≤ β.e. all future iterates will be a zero of the second component function. 0). . and satisﬁes xk+1 − x∗ ≤ βγ xk − x∗ 2 k = 0. if g itself is not given in analytic form.19) . . .84). This explains the zero in the function value of x1 = (1. Note that the second component of the Rosenbrock function is aﬃne in x.
be the induced l2 norm on Rn×n and let E ∈ Rn×n . by choosing ε such that Jg (x) is nonsingular for all x ∈ N (x∗ . this implies 1 g(y) − g(x) − Jg (x)(y − x) ≤ ≤ 0 1 0 Jg (x + t(y − x)) − Jg (x) (y − x) dt γ t(y − x) (y − x) dt 2 0 1 = γ (y − x) tdt = γ y − x 2. g(y) − g(x) − Jg (x)(y − x) ≤ γ y − x 2. Together with the Lipschitz continuity of Jg at x ∈ D.1. we have g(y) − g(x) − Jg (x)(y − x) 1 = 0 1 Jg (x + t(y − x))(y − x)dt − Jg (x)(y − x) (1. If Jg ∈ Lipγ (D).11. Let g : Rn → Rn be continuously diﬀerentiable in the open. 1 − A−1 (B − A) .16) is at most of order O( xk − x∗ 2 ). Lemma 1. This is a consequence of the following lemma. then B is nonsingular and B −1 ≤ A−1 . If A is nonsingular and A−1 (B − A) < 1. then for any y ∈ D.2 The method of Newton 29 We can show that the convergence is qquadratic.20) = 0 (Jg (x + t(y − x)) − Jg (x))(y − x)dt. 2 The next theorem says that matrix inversion is continuous in norm. We can bound the integral on the right hand side of (1. then (I − E)−1 exists and (I − E)−1 ≤ 1 1− E . Theorem 1. If E < 1. 2 Proof. the local error in the aﬃne model (1. convex set D ⊂ Rn and let x ∈ D.12. Let . The reason is that if the Jacobian is nonsingular. Furthermore. According to the fundamental theorem of calculus.20) in terms of the integrand. ε). it gives a relation between the norms of the inverses of two nearby matrices that is useful later in analyzing algorithms.
Therefore.12 implies that Jg (x0 ) is nonsingular and Jg (x0 )−1 ≤ Jg (x∗ )−1 ≤ 2· Jg (x∗ )−1 ≤ 2β. additionally. it follows that 1 x1 − x∗ ≤ 2 x0 − x∗ . ε). By induction on k. 2 Jg (x∗ )−1 Theorem 1. by Lemma 1. we obtain Jg (x∗ )−1 (Jg (x0 ) − Jg (x∗ )) ≤ Jg (x0 ) − Jg (x∗ ) 1 ≤ βγ x0 − x∗ ≤ βγε ≤ .30 Chapter 1. This completes the proof for k = 0. Since x0 −x∗ ≤ ε ≤ 1/(2βγ). ε) ⊂ D.23) gives the diﬀerence between g(x∗ ) and the aﬃne model l0 (x) evaluated at x∗ . 2 We have shown (1. The proof of the induction step proceeds in the same way.10).21) so that N (x∗ . ε) as well. x1 − x∗ ≤ Jg (x0 )−1 g(x∗ ) − g(x0 ) − Jg (x0 )(x∗ − x0 ) γ ≤ 2β x0 − x∗ 2 = βγ x0 − x∗ 2 . ε) if xk ∈ N (x∗ . x1 − x∗ = x0 − x∗ − Jg (x0 )−1 g(x0 ) = Jg (x0 )−1 (g(x∗ ) − g(x0 ) − Jg (x0 )(x∗ − x0 )). We ﬁrst consider the basis step (k = 0).19) for k = 0.23) The second factor in (1. (1. = x0 − x∗ − Jg (x0 )−1 (g(x0 ) − g(x∗ )) (1.12 can be found in [18].22) 1 − Jg (x∗ )−1 (Jg (x0 ) − Jg (x∗ )) This implies that x1 is well deﬁned and.21).22). we show that (1. .19) holds for each iteration step and that xk+1 − x∗ ≤ 1 xk − x∗ . We choose ε ≤ 1/(2βγ) (1. 2 which implies that xk+1 ∈ N (x∗ . An introduction to iterative methods The proof of Theorem 1. Using the Lipschitz continuity of Jg at x∗ . x0 − x∗ ≤ ε and (1.11 and (1. which yields x1 ∈ N (x∗ . Proof (of Theorem 1.
Repeat the following sequence until g(xk ) < ε. • Requires the Jacobian Jg (xk ) at each iteration step. of g at x∗ .13 (Discrete Newton method). • Each iteration step requires the solution of a system of linear equations that might be singular or illconditioned. If g is a nonlinear function. • Exact solution in one iteration for an aﬃne function g (exact at each iteration for any aﬃne component function of g). for example. Advantages of Newton’s method • qQuadractically convergent from good starting points if Jg (x∗ ) is nonsingular. We conclude with a summery of the characteristics of Newton’s method. We then have x1 − x∗ ≤ βγ x0 − x∗ 2 = 0. Disadvantages of Newton’s method • Not globally convergent for many problems. and the method of Newton converges exactly in one single iteration. The bound ε for the region of convergence is a worstcase estimate. . QuasiNewton methods We have already indicated that it is not always possible to compute the Jacobian of the function g. for x ∈ D. In this case.1. Algorithm 1. the relative nonlinearity of g at x∗ is given by. γrel = β · γ. Choose an initial estimate x0 ∈ Rn and set k := 0. or that it is very expensive. γrel . So. the region of convergence may very well be much larger. the Jacobian is constant and the Lipschitz constant γ can be chosen to be zero. In directions from x∗ in which g is less nonlinear.2 The method of Newton 31 Note that if g is aﬃne. Jg (x) − Jg (x∗ ) The radius of guaranteed convergence of Newton’s method is inversely proportional to the relative nonlinearity. by using ﬁnite diﬀerences. Jg (x∗ )−1 (Jg (x) − Jg (x∗ )) ≤ Jg (x∗ )−1 ≤ βγ x − x∗ = γrel x − x∗ . we have to approximate the Jacobian.
The diﬀerence between the real Jacobian and the approximated Jacobian Jg (xk )−Ak turns out to be of order 10−5 for h = 1. ε). . The convergence properties of the discrete Newton method are described in Table 1. Theorem 1.6074 · 10−14 2. iii) xk+1 := xk + sk .0 · 10−4 .1317 · 10−13 k∗ 4 3 4 R 8. We start with the initial condition x0 given by (A. 1.13 for diﬀerent values of h.7570 g(xk∗ ) 4.7570 0. The rate of convergence is plotted in Figure 1. if k→∞ lim hk = 0. k = 0.15.2246 (g(xk + hk en ) − g(xk ))/hk Discrete Newton 100 Discrete Newton 100 Discrete Newton 100 Table 1. is well deﬁned and converges qlinearly to x∗ . If the ﬁnitediﬀerence step size hk is properly chosen.10. Let g and x∗ satisfy the assumptions of Theorem 1.32 Chapter 1. Then there exist ε.3. .7570 0. An introduction to iterative methods i) Compute Ak = (g(xk + hk e1 ) − g(xk ))/hk · · · ii) Solve Ak sk = −g(xk ) for sk . 1 .9351 7.4908 · 10−16 8. method n h 1. We assume that hk ≡ h and apply Algorithm 1. Additionally. Example 1. .6) and set ε = 10−12 .0 · 10−8 and of order 10−3 for h = 1.0 · 10−8 1. given by (A. 2. .5) for diﬀerent values of h.4. This is the conclusion of the next theorem.13. the discrete Newton method is also qquadratically convergent.3: The convergence properties of Algorithm 1. .0 · 10−4 1. k where Ak = (g(xk + hk e1 ) − g(xk ))/hk · · · (g(xk + hk en ) − g(xk ))/hk . applied to the discrete integral equation function (A. We denote the l1 vectornorm and the corresponding induced matrix norm by .14. h > 0 such that if {hk } is a real sequence with 0 < hk  ≤ h and x0 ∈ N (x∗ .5). We consider the discrete integral equation function g. the sequence {xk } generated by xk+1 = xk − A−1 g(xk ).0 · 10−12 .0 · 10−12 g(x0 ) 0.7652 9. of order 10−7 for h = 1.
1.2 The method of Newton
33
10
0
residual g(xk )
10
−5
10
−10
PSfrag replacements
10
−15
0
1
2
3
4
iteration k Figure 1.4: The convergence rate of the discrete Newton method 1.13 applied to the discrete integral equation function (A.5) for diﬀerent values of h. [’◦’(h = 10 −4 ), ’×’(h = 10−8 ), ’+’(h = 10−12 )]
then the convergence is qsuperlinear. If there exists a constant c1 such that hk  ≤ c1 xk − x∗ 1 , or equivalently a constant c2 such that hk  ≤ c2 g(xk ) 1 , then the convergence is qquadratic. For the proof of Theorem 1.15 we refer to [18]. Another way to avoid computations of the Jacobian in every iteration is to compute the Jacobian in the ﬁrst iteration, A = Jg (x0 ), and use this matrix in all subsequent iterations as an approximation of Jg (xk ). This method is called the NewtonChord method. It turns out that the NewtonChord method is locally linearly convergent [38]. Algorithm 1.16 (NewtonChord method). Choose an initial estimate x0 ∈ Rn , set k := 0, and compute the Jacobian A := Jg (x0 ). Repeat the following sequence of steps until g(xk ) < ε. i) Solve Ask = −g(xk ) for sk , ii) xk+1 := xk + sk . Example 1.17. Let g be the discrete integral equation function given by (A.5). We apply Algorithm 1.16 and Algorithm 1.7 to approximate the zero
34
Chapter 1. An introduction to iterative methods
of g. As initial estimate we choose x0 given by (A.6) multiplied by a factor 1, 10 or 100. The convergence properties of the NewtonChord method and Newton’s method are described in Table 1.4. In Figure 1.5, we can observe the linear convergence of the NewtonChord method. The rate of convergence of the NewtonChord method is very low in case of the initial condition 100x0 . Clearly for all initial conditions, the NewtonChord method needs more iterations to converge than the original method of Newton, see Figure 1.6.
method Newton Chord Newton Chord Newton Chord Newton Newton Newton
n 100 100 100 100 100 100
factor 1 10 100 1 10 100
g(x0 ) 0.7570 18.5217 3.8215 · 103 0.7570 18.5217 3.8215 · 103
g(xk∗ ) 2.3372 · 10−14 2.9052 · 10−13 2.1287 1.7854 · 10−14 2.7007 · 10−16 3.9780 · 10−13
k∗ 8 16 200 3 4 9
R 3.8886 1.9866 0.0375 10.4594 9.6917 4.0890
Table 1.4: The convergence properties of Algorithm 1.16 and Algorithm 1.7 applied to the discrete integral equation function (A.5) for diﬀerent initial conditions (x 0 , 10x0 and 100x0 ).
10
5
10
0
residual g(xk )
10
−5
10
−10
PSfrag replacements
10
−15
0
2
4
6
8
10
12
14
16
18
20
iteration k Figure 1.5: The convergence rate of Algorithm 1.16 applied to the discrete integral equation function (A.5) for diﬀerent initial conditions. [’◦’(x0 ), ’×’(10x0 ), ’+’(100x0 )]
1.3 The method of Broyden
35
10
5
10
0
residual g(xk )
10
−5
10
−10
PSfrag replacements
10
−15
0
1
2
3
4
5
6
7
8
9
10
iteration k Figure 1.6: The convergence rate of Algorithm 1.7 applied to the discrete integral equation function (A.5) for diﬀerent initial conditions. [’◦’(x0 ), ’×’(10x0 ), ’+’(100x0 )]
1.3
The method of Broyden
The NewtonChord method of the previous section saves us the expensive computation of the Jacobian Jg (xk ) is every iterate xk of the process, by approximating it by the Jacobian in the initial condition, A = Jg (x0 ). Additional information about the Jacobian obtained during the process is neglected. This information consists of the function values of g in the iterates needed to compute the step sk . In this section, we start with the basis idea for a class of methods that adjust the approximation matrix to the Jacobian Jg (xk ) using only the function value g(xk ). We single out the method proposed by C.G. Broyden in 1965 [8] that has a qsuperlinear and even 2nstep qquadratic local convergence rate and seems to be very successful in practice. This algorithm, which is analogous to the method of Newton, is called the method of Broyden.
A derivation of the algorithm
Recall that in one dimension we use the local model (1.6), lk+1 (x) = g(xk+1 ) + ak+1 (x − xk+1 ) for the nonlinear function g. Note that lk+1 (xk+1 ) = g(xk+1 ) for all choices of ak+1 ∈ Rn . If we set ak+1 = g (xk+1 ), we obtain Newton’s method. If g (xk+1 ) is not available, we force the scheme to satisfy lk+1 (xk ) = g(xk ), that is g(xk ) = g(xk+1 ) + ak+1 (xk − xk+1 ),
36
Chapter 1. An introduction to iterative methods
which yields the secant approximation (1.8), ak+1 = g(xk+1 ) − g(xk ) . xk+1 − xk
The next iterate xk+2 is the zero of the local model, lk+1 . Therefore we arrive at the quasiNewton update xk+2 = xk+1 − g(xk+1 )/ak+1 . The price we have to pay is a reduction in local convergence rate, from qquadratic to 2step qquadratic convergence. In multiple dimensions, we apply an analogous aﬃne model lk+1 (x) = g(xk+1 ) + Bk+1 (x − xk+1 ). For Newton’s method Bk+1 equals the Jacobian Jg (xk+1 ). We enforce the same requirement that led to the onedimensional secant method. So, we assume that lk+1 (xk ) = g(xk ), which implies that g(xk ) = g(xk+1 ) + Bk+1 (xk − xk+1 ). (1.24)
Furthermore, if we deﬁne the current step by sk = xk+1 − xk , and the yield of the current step by yk = g(xk+1 ) − g(xk ), Equation (1.24) is reduced to Bk+1 sk = yk . (1.25)
We refer to (1.25) as the secant equation. For completeness we ﬁrst give the deﬁnition of a secant method. Deﬁnition 1.18. The iterative process
−1 xk+1 = xk − Bk g(xk )
is called a secant method if the matrix Bk satisﬁes the secant equation (1.25) in every iteration step. The crux of the problem in extending the secant method to more than one dimension is that (1.25) does not completely specify the matrix Bk+1 . In fact, if sk = 0, there is an n(n − 1)dimensional aﬃne subspace of matrices satisfying (1.25). Constructing a successful secant approximation consists of selecting a good approach to choose from all these possibilities. The choice should enhance the Jacobian approximation properties of Bk+1 or facilitate its use in a quasiNewton algorithm.
we can make the second term on the right hand side of (1. The idea is to preserve as much as possible of what we already have. (1. . . l = k − m. = yk − Bk+1 sk + (Bk+1 − Bk )(x − xk ) l = k − m. sk the matrix Bk+1 is uniquely determined by (1.27) zero for all x ∈ Rn . The approach that leads to the successful secant approximation is quite diﬀerent. . so. Bk+1 sl = yl . at any x is given by lk+1 (x) − lk (x) = g(xk+1 ) + Bk+1 (x − xk+1 ) − g(xk ) − Bk (x − xk ) = (Bk+1 − Bk )(x − xk ). .27). k − 1. we try to minimize the change in the aﬃne model. Aside from the secant equation no new information about either the Jacobian or the model is given. k − 1.27) We have no control over the ﬁrst term on the right hand side of (1. making the computation of Bk+1 a poorly posed numerical problem. Unfortunately. α ∈ R. . .29) . . (1. we set g(xl ) = g(xk+1 ) + Bk+1 (xl − xk+1 ). . in additional to the secant equation.25) and (1. The last equality is due to the secant equation. .3 The method of Broyden 37 A possible strategy is using the former function evaluations. . Therefore. l = k − m. . by choosing Bk+1 such that (Bk+1 − Bk )q = 0. . Now if we write an arbitrary x ∈ Rn as x − xk = αsk + q. . subject to the secant equation (1. for all q ⊥ sk . . (1.26) For m = n−1 and linear independent sk−m . . where q T sk = 0. . The diﬀerence between the new and the old aﬃne model.1. . (1. .25). That is. the expression that we want to minimize becomes lk+1 (x) − lk (x) = α(Bk+1 − Bk )sk + (Bk+1 − Bk )q. k − 1.28) However. since it equals (Bk+1 − Bk )sk = yk − Bk sk . most of the time sk−m . sk tend to be linearly dependent. This is equivalent to g(xl ) = g(xl+1 ) + Bk+1 (xl − xl+1 ).26). . .
38 Chapter 1. An introduction to iterative methods This implies that (Bk+1 − Bk ) has to be a rankone matrix of the form usT . This k leads to the Broyden or secant update Bk+1 = Bk + (yk − Bk sk )sT k . This will be proved in Lemma 1. we have followed the steps of Broyden when developing his iterative method in [8]. We arrive at the algorithm of Broyden’s method. Set k := 0 and repeat the following sequence of steps until g(xk ) < ε.20. For y. ii) xk+1 := xk + sk iii) yk := g(xk+1 ) − g(xk ).25). k k In this section we use the Frobenius norm. i) Solve Bk sk = −g(xk ) for sk .19 (Broyden’s method). The Broyden update is the minimum change to Bk consistent with the secant equation (1. Choose an initial estimate x0 ∈ Rn and a nonsingular initial Broyden matrix B0 . deﬁned by n n F. denoted by . s ∈ Rn the set of all matrices that satisfy the secant equation As = y is denoted by Q(y.28) now implies that u = (yk − Bk sk )/(sT sk ). iv) Bk+1 := Bk + (yk − Bk sk )sT /(sT sk ). sT sk k (1. Rather a former approximation Bk is updated into a new one. .25) the new Broyden matrix Bk+1 yields the minimum of A − Bk F . but the derivation of the Broyden update can be made much more rigorous. of all matrices A that satisfy the secant equation (1. The norm is (1. Bk+1 . from scratch. k with u ∈ Rn . This type of updating is shared by all the successful multidimensional secant approximation techniques. s) = {A ∈ Rn×n  As = y}. Jg (xk+1 ). So. Equation (1. it equals the l2 vector norm of the matrix written as a n2 vector. Algorithm 1. In the preceding.30) The word ’update’ indicates that we are not approximating the Jacobian in the new iterate. That is.31) A F = i=1 j=1 A2 ij 1/2 . if (Bk+1 − Bk ) is measured in Frobenius norm.
Note that Q(y. then the ¯ unique solution A = B to A∈Q(y. k = 0. where f is the period map of a dynamical process xk+1 = f (xk ). s). . We compute for any A ∈ Q(y.s) min A−B F (1.34) is a zero of the function g. . . B0 = −I.5). Because the Frobenius norm is strictly convex. aﬃne) subset of Rn×n . the ﬁrst iteration of Broyden is just a dynamical simulation step. By choosing B0 = −I.20. . in this way. (1. as given in Lemma 1. So. ¯ B−B F = = ≤ (y − Bs)sT sT s F (A − B)ssT sT s F ssT A−B F T s s 2 = A−B F.34) A ﬁxed point of the process (1. The ﬁnite diﬀerences approximation turns out to be a good start. is taking the initial approximation equal to minus identity. the solution to (1. Jg (x0 ). s) is a convex (in fact. We have not deﬁned yet what should be chosen for the initial approximation B0 to the Jacobian in the initial estimate. we let the system choose the direction of the ﬁrst step. In addition. If s = 0. y ∈ Rn arbitrary. that avoids the computation of Jg (x0 ). . We now apply the method of Broyden to the test function (A.32) is unique on the convex subset Q(y. the initial Broyden matrix is easy to store and can be directly implemented in the computer code. sT s Proof. Another choice. 1.1. (1.33) Suppose the function g is deﬁned by g(x) = f (x) − x. s). It also makes the minimum change characteristics of Broyden’s update more appealing.32) is given by (y − Bs)sT ¯ B=B+ . −1 x1 = x0 − B0 g(x0 ) = x0 − (f (x0 ) − x0 ) = f (x0 ).3 The method of Broyden 39 Lemma 1. Let B ∈ Rn×n and s. This makes the reduction methods discussed in Chapter 3 eﬀective.20.
’×’(n = 100). We apply Algorithm 1.19 to approximate the zero of g. The method of Broyden method only makes one function evaluation per iteration compared to the n + 1 function evaluations of Algorithm 1. given by (A. Let g be the discrete integral equation function given by (A. The convergence results .5. [’◦’(n = 10).5).19 for diﬀerent dimensions of the problem.13. ’+’(n = 200)] Example 1.7: The convergence rate of Algorithm 1.21.6).5) for diﬀerent dimensions n.22.3404 Table 1. As we did for the NewtonChord method and the method of Newton we multiply the initial condition x0 .0678 g(xk∗ ) 4. it avoids the computation of the Jacobian. 10 and 100. We apply Algorithm 1. by a factor 1.5: The convergence results for Algorithm 1. The convergence results for the method of Broyden are described in Table 1. Although the Broyden’s method needs more iterations to converge than Newton’s method.7.6) and we set ε = 10−12 .3937 1.3412 1.8980 · 10−14 4. see also Figure 1. The rate of convergence again does not depend on the dimension of the problem. method Broyden Broyden Broyden n 10 100 200 g(x0 ) 0. We deﬁne the initial condition x0 by (A.4398 · 10−13 6. 10 0 residual g(xk ) 10 −5 10 −10 PSfrag replacements 10 −15 0 2 4 6 8 10 12 14 16 18 20 22 iteration k Figure 1. Let g be the discrete integral equation function given by (A.19 applied to the discrete integral equation function (A.5) for diﬀerent dimensions n.19 applied to the discrete integral equation function (A.7570 1.3644 · 10−13 k∗ 21 21 21 R 1.40 Chapter 1.5). An introduction to iterative methods Example 1.2518 0.
Moreover.35) is similar to the proof of Lemma 1. for all u.7570 18. Equation (1. v satisfying max{ v − x . u − x } ≤ ε. Jg (x)−1 . g(v) − g(u) ≥ ≥ ≥ Jg (x)(v − u) − g(v) − g(u) − Jg (x)(v − u) 1 − γ max{ v − x . u − x } v − u Jg (x)−1 1 − γε v − u . there exist ε > 0 and ρ > 0 such that (1/ρ) v − u ≤ g(v) − g(u) ≤ ρ v − u . method Broyden Broyden Broyden n 100 100 100 factor 1 10 100 g(x0 ) 0.3 The method of Broyden 41 for Broyden’s method are described in Table 1.23.36) (1.6: The convergence results for Broyden’s method 1. Let g : Rn → Rn be continuously diﬀerentiable in the open. if Jg (x) is invertible. convex set D ⊂ Rn .9151 Table 1. u − x }) v − u ≤ ( Jg (x) + γε) v − u . x ∈ D. v ∈ D for which max{ v − x .6. 10x0 and 100x0 . Jg (x)(v − u) + g(v) − g(u) − Jg (x)(v − u) (1.19 applied to the discrete integral equation function (A. The proof of Equation (1.5) for diﬀerent initial conditions. u − x } ≤ ε.35) Similarly.0975 · 10+20 k∗ 21 33 13 R 1. If Jg ∈ Lipγ (D) then for every u and v in D g(v) − g(u) − Jg (x)(v − u) ≤ γ max{ v − x . u − x } v − u . For the initial condition 100x0 the method of Broyden fails to converge.9297 2.4398 · 10−13 8. g(v) − g(u) ≤ ≤ ( Jg (x) + γ max{ v − x . we ﬁrst need the following extension of Lemma 1.7765 · 10−13 1.5217 3.11. Proof.1. Lemma 1.11.35) together with the triangle inequality implies that for u. Superlinear convergence In order to prove the convergence of Broyden’s method.3412 0. x0 .8215 · 103 g(xk∗ ) 4.
36) holds if we choose ρ large enough such that ρ > Jg (x) + γε.31). x − x } (1. the sequence is well deﬁned and converges to x∗ . { Bk } are uniformly bounded. Suppose there are nonnegative constants α1 and α2 such that for each (x.. This can consists of one single element and it can contain B itself. is the l2 operator norm induced by the corresponding vector norm. An introduction to iterative methods Thus if ε < (1/ Jg (x)−1 γ). and 1 1 < − γε. Because all norms in a ﬁnitedimensional vector space are equivalent.42 Chapter 1. convex set D ⊂ Rn .24 is a set valued function. ρ Jg (x)−1 where . the function Φ satisﬁes ¯ ¯ B − Jg (x∗ ) F In the next theorem it is necessary to use the Frobenius norm (1. Let Φ : Rn × L(Rn ) → P{L(Rn )} be deﬁned in a neighborhood N = N1 × N2 of (x∗ .38) ¯ (1. that assigns to a couple of a vector x ∈ Rn and a matrix B ∈ L(Rn ) a ¯ set of matrices {B}. B). and for x = x − B −1 g(x). −1 xk+1 = xk − Bk g(xk ) + α2 max{ x − x . Furthermore. Then for arbitrary r ∈ (0. (1. x − x∗ } · B − Jg (x∗ ) ¯ ∗ ∗ F ¯ for each B in Φ(x.24. Bk ).e. So. B) in N. Jg (x∗ )) where N1 is contained in D and N2 only contains nonsingular matrices. an element of the power set P{L(Rn )} is a set of linear maps from Rn to Rn . and Bk+1 ∈ Φ(xk . i. Let g : Rn → Rn be continuously diﬀerentiable in the open. Theorem 1. Assume that there exists an x∗ ∈ D such that g(x∗ ) = 0 and Jg (x∗ ) is nonsingular.37) ≤ 1 + α1 max{ x − x∗ . there are positive constants ε(r) and δ(r) such that for x0 − x∗ < ε(r) and B0 − Jg (x∗ ) F < δ(r).40) . The function Φ appearing in Theorem 1. and { Bk }. By L(Rn ) we denote the space of all linear maps from Rn to Rn . 1). all (n × n)matrices. then 1/ Jg (x)−1 − γε > 0 and (1.39) −1 for each k ≥ 0. there is a constant η > 0 such that A ≤ η A F. and assume that Jg ∈ Lipγ (D). k ≥ 0. xk+1 − x∗ ≤ r xk − x∗ (1.
and by (1. It follows from (1. Assume that both Bk − Jg (x∗ ) F ≤ 2δ and xk+1 − x∗ ≤ r xk − x∗ for k = 0.12 gives −1 B0 ≤ (1. 1. Bm − Jg (x∗ ) ≤ (2α1 δ + α2 )rk x0 − x∗ ≤ (2α1 δ + α2 ) max{r xk − x∗ . 1 − β2ηδ 1 − r/(1 + r) −1 x0 − B0 g(x0 ) − x∗ −1 B0 · Lemma 1. 1−r .42) yields 2β(1 + r)ηδ ≤ r. (1. B) lies in the neighborhood N whenever B−Jg (x∗ ) F < 2δ and x−x∗ < ε.37). Theorem 1. xk − x∗ } ≤ (α1 Bk − Jg (x∗ ) − Bk − Jg (x∗ ) F F + α2 ) max{ xk+1 − x∗ . Let r ∈ (0.38) that Bk+1 − Jg (x∗ ) F ≤ (2α1 δ + α2 )εrk . and thus x1 ∈ D. x1 − x∗ < ε. β(1 + r)(γε + 2ηδ) ≤ r. 1) be given and set β ≥ Jg (x∗ )−1 . xk − x∗ } and by summing both sides from k = 0 to m − 1. Then B0 − Jg (x∗ ) < ηδ < 2ηδ.1. Suppose that B0 −Jg (x∗ ) F < δ and x0 − x∗ < ε. . Hence. Choose δ(r) = δ and ε(r) = ε such that ε ≤ δ. . m − 1. and since (1.43) β β ≤ = (1 + r)β.42) it follows that x1 − x∗ ≤ r x0 − x∗ . we obtain F ≤ B0 − Jg (x∗ ) F + (2α1 δ + α2 ) ε . .23 now implies that x1 − x∗ ≤ ≤ g(x0 ) − g(x∗ ) − Jg (x∗ )(x0 − x∗ ) + B0 − Jg (x∗ ) x0 − x∗ ≤ β(1 + r)(γε + 2ηδ) x0 − x∗ . (1.3 The method of Broyden 43 Proof.41) (2α1 δ + α2 ) 1−r and for η given by (1. We complete the proof with an induction argument.42) If necessary further restrict ε and δ so that (x. .
the proof is completed. To complete the induction step we only need to prove that xm+1 − x∗ ≤ r xm − x∗ . . 1) be given. xk+1 − x∗ ≤ r xk − x∗ for each k ≥ m. xk+1 = xk − Jg (x0 )−1 g(xk ) satisﬁes (1. since Bm − Jg (x∗ ) ≤ 2ηδ. We would like to show that lim xk+1 − x∗ = 0. but is. 1) was arbitrary. in general. only linearly convergent. For example. then the sequence {xk } converges qsuperlinearly at x∗ . If some subsequence of { Bk − Jg (x∗ ) } converges to zero.12 and (1.44 Chapter 1. the NewtonChord iteration scheme.42).25 is necessary to guarantee qsuperlinear convergence. Assume that the hypotheses of Theorem 1. Corollary 1.43) implies that −1 Bm ≤ (1 + r)β. Proof.38) with α1 = α2 = 0.16. and xm+1 − x∗ ≤ r xm − x∗ follows from (1. So.23 it follows that xm+1 − x∗ ≤ −1 Bm g(xm ) − g(x∗ ) − Jg (x∗ )(xm − x∗ ) + Bm − Jg (x∗ ) xm − x∗ ≤ β(1 + r)(γε + 2ηδ) xm − x∗ . see Algorithm 1. This follows by an argument similar to the one for m = 1. It should be clear that some condition like the one in Corollary 1.24 hold. Lemma 1.24 there are numbers ε( 1 ) and δ( 2 ) such that B0 −Jg (x∗ ) F < 2 1 1 ∗ < ε( 1 ) imply that x ∗ for each k ≥ 0. and by Lemma 1. xk − x∗ k→∞ 1 By Theorem 1. One of the interesting aspects of the result of the following theorem is that qsuperlinear convergence is guaranteed for the method of Broyden. In fact. An introduction to iterative methods which by (1.41) implies that Bm − Jg (x∗ ) ≤ 2δ. We can choose m > 0 such that Bm −Jg (x∗ ) F < δ(r) and xm − x∗ < ε(r).25. without any subsequence of { Bk − Jg (x∗ ) } necessarily converging to zero. Since r ∈ (0. Let δ( 2 ) and x0 −x k+1 − ≤ 2 xk −x 2 now r ∈ (0.
44).27).23 and standard properties of the matrix norms .45) Before we can prove the theorem we need some preparations. Theorem 1. F imply that there exists a neighborhood N of (x∗ . xk − x∗ ek = x k − x ∗ . is locally and qsuperlinearly convergent at x∗ . If B is given by (1.27. and assume that Jg ∈ Lip γ(D). Let g : Rn → Rn be continuously diﬀerentiable in the open. Then the update function Φ(x. Bk ).24 yields that iteration (1.8. for ¯ which Jg (x∗ ) is nonsingular. The idea ¯ of the proof of Theorem 1.38) is satisﬁed for every (x. B) = {B  s = 0}. where sT ¯ B = B + (y − Bs) T .44) s s is well deﬁned in a neighborhood N = N1 × N2 of (x∗ . ek . Let x∗ be a zero of g. B) in N.46) Deﬁne the error in the current iteration ek by The proof is drawn in Figure 1.1. k ≥ 0.26 is in the following manner.3 The method of Broyden 45 with Bk+1 ∈ Φ(xk . convex set D ⊂ Rn . Lemma 1. and the corresponding iteration −1 xk+1 = xk − Bk g(xk ) (1. then in any norm . k ≥ 0. (1. ek then k→∞ lim sk = 1. Let xk ∈ Rn . Jg (x∗ )) such that condition (1. (1. k→∞ lim xk+1 − xk = 1. then Lemma 1. Subsequently. if k→∞ lim ek+1 = 0. Theorem 1. ek Proof (of Lemma 1. . With ek given by (1. 2 and . Jg (x∗ )). The qsuperlinear convergence is a consequence of the following two lemma’s. If {xk } converges qsuperlinearly to x∗ ∈ Rn .26.45) is locally and linearly convergent.46) we compute k→∞ lim sk −1 ek = ≤ = k→∞ lim k→∞ lim k→∞ lim sk − e k ek sk + ek ek ek+1 = 0. Clearly.
48) k→∞ sk where sk = xk+1 − xk .49) . and g(x∗ ) = 0.48) holds. Proof.27 is also of interest to the stopping criteria in our algorithms. Suppose for some x0 ∈ D that the sequence of points generated by xk+1 = xk − A−1 g(xk ) k (1. so that −g(xk+1 ) = (Ak − Jg (x∗ ))sk + (−g(xk+1 ) + g(xk ) + Jg (x∗ )sk ). and Jg ∈ Lipγ (D). convex set.47) remains in D. Then {xk } converges qsuperlinearly to x∗ in some norm .28. First we assume that (1. Deﬁne ek = xk − x∗ . Note that Lemma 1. if and only if (Ak − Jg (x∗ ))sk lim =0 (1. Let D ⊆ Rn be an open. then any stopping test that uses sk is essentially equivalent to the same test using ek . Equation (1. which is the quantity we are really interested in. Lemma 1. An introduction to iterative methods xk+1 PSfrag replacements sk ek+1 xk ek x∗ Figure 1. (1.47) gives 0 = Ak sk + g(xk ) = (Ak − Jg (x∗ ))sk + g(xk ) + Jg (x∗ )sk . where xk = x∗ for every k. Assume that Jg (x∗ ) is nonsingular for some x∗ ∈ D.8: Schematic drawing of two subsequent iterates.46 Chapter 1. and satisﬁes limk→∞ xk = x∗ . where the ﬁnal equality is the deﬁnition of qsuperlinear convergence if ek = 0 for all k. g : Rn → Rn continuously diﬀerentiable. It shows that whenever an algorithm achieves at least qsuperlinear convergence. Let {Ak } be a sequence of nonsingular matrices in L(Rn ). and show that g(x∗ ) = 0 and that {xk } converges qsuperlinearly to x∗ .
is the derivation above read in more or less the reversed g(x order.23. there exist ρ > 0. This implies k→∞ lim rk = 0. k→∞ From Lemma 1. x − x∗ } (1. Equation (1. k→∞ ek + ek+1 k→∞ 1 + rk lim 1 ek+1 .52) gives 0 = g(xk+1 ) k→∞ sk 1 ek+1 ≥ lim k→∞ ρ sk 1/ρ · ek+1 1/ρ · rk ≥ lim = lim . that qsuperlinear convergence and ∗ ) = 0 imply (1. it follows that g(x∗ ) = lim g(xk ) = 0. k0 ≥ 0. k0 ≥ 0.52) where rk = ek+1 / ek . such that g(xk+1 ) ≥ ρ ek+1 . sk (1.50) sk where the second inequality follows from Lemma 1. ρ (1.23.3 The method of Broyden 47 and gk+1 sk ≤ ≤ Ak − Jg (x∗ )sk − g(xk+1 ) + g(xk ) + Jg (x∗ )sk + sk sk Ak − Jg (x∗ )sk + γ max{ x − x∗ .51) and (1.48) gives lim g(xk+1 ) = 0.1. The proof of the reverse implication. Combining (1.23. which completes the proof of qsuperlinear convergence. From Lemma 1. there exist ρ > 0.50) together with limk→∞ ek = 0 and (1. such that g(xk+1 ) = g(xk+1 ) − g(x∗ ) ≥ for all k ≥ k0 .51) k→∞ Since limk→∞ sk = 0.48).
49) and Lemma 1.53) The qsuperlinear convergence implies that limk→∞ sk / ek = 1 according to Lemma 1.23.54) This condition has an interesting interpretation.53) this gives that (1. to the Newton steps from the same points. from (1. Due to the Lipschitz continuity of Jg .56) .48 Chapter 1.48) is replaced by k→∞ lim (Ak − Jg (xk ))sk = 0.54) is equivalent to k→∞ lim Jg (xk )(sN − sk ) k = 0. Therefore. After stating a ﬁnal lemma we are able to prove the main theorem of this section.27. 0 = ek+1 k→∞ ek g(xk+1 ) ≥ lim k→∞ 1/ρ ek g(xk+1 ) sk ≥ lim ρ · · . sk where sN = −Jg (xk )−1 g(xk ) is the Newton step from xk .55) 2 Es s . (1. x − x∗ }. Thus the necessary k and suﬃcient condition for the qsuperlinear convergence of a secant method is that the secant steps converge. Then E I− ssT sT s F = ≤ E E F 2 F Es s 1 − 2 E F − 2 1/2 (1.51) and limk→∞ ek = 0 proves (1. sk (1. Because sk = −A−1 g(xk ).29. k→∞ sk ek lim (1. in magnitude and direction. k Equation (1. Finally. Lemma 1.51) holds. (Ak − Jg (x∗ ))sk sk ≤ ≤ − g(xk+1 ) + g(xk ) + Jg (x∗ )sk g(xk+1 ) + sk sk g(xk+1 ) + γ max{ x − x∗ . An introduction to iterative methods for all k ≥ k0 . Let s ∈ Rn be nonzero and E ∈ Rn×n . it is easy to show that Lemma 1.48). sk which together with (1.28 remains true if (1. Together with (1.
28. I− ssT = 1.3 The method of Broyden 49 Proof. Assume that x ¯ ¯ = B − Jg (x∗ ). we ﬁrst derive an estimate for B − Jg (x∗ ) .55) implies (1. Deﬁne E e = x − x∗ .55). and e = x − x∗ . ¯ and x are in D and s = 0.24 and ¯ Lemma 1. E and the equality E 2 F = E ssT sT s ssT sT s 2 F + E I− = Es . Note that I − (ssT /sT s) is a Euclidean projection. Because for any α ≥ β ≥ 0. Equation (1. s ssT sT s 2 F . F we have proved (1. Note that ¯ ¯ ¯ ¯ E = B − Jg (x∗ ) = B − Jg (x∗ ) + (y − Bs) = (B − Jg (x∗ )) I − Therefore. 2 (1. (α2 − β 2 )1/2 ≤ α − β 2 /2α. and so is ssT /sT s. the inequality (1.23 is used. e }.1. sT s Therefore. ¯ E F sT sT s ssT sT + (y − Jg (x∗ )) T . So by the Pythagorean theorem.58) We deﬁne the neighborhood N2 of Jg (x∗ ) by N2 = B ∈ L(Rn )  Jg (x∗ )−1 · B − Jg (x∗ ) < . ¯ 1 .26).57) Lemma 1.57) can be reduced to ¯ E F ≤ E F + γ max{ e . Proof (of Theorem 1. Ts s s s ≤ ≤ (B − Jg (x∗ )) I − E I− ssT sT s F ssT sT s F + y − Jg (x∗ )s s (1. E = B − Jg (x∗ ). ¯ For the last inequality of (1. Because I − ssT /(sT s) is an orthogonal projection it has l2 norm equal to one. In order to be able use both Theorem 1. e }.57) + γ max{ e .56).
24 and therefore.61) k→∞ sk In order to justify Equation (1.57) as Ek+1 F ≤ Ek I − sk sT k sT sk k F + γ max{ ek+1 .e.45) satisﬁes the hypotheses of Theorem 1. ¯ (1/ρ) x − x ≤ g(x) − g(¯) ≤ ρ x − x .62).60) and Lemma 1. 1) in (1.61) we write Equation (1. so 2 1 xk+1 − x∗ ≤ xk − x∗ (1. So. We take r = 1 . if x − x∗ ≤ ε and B ∈ N2 then x ∈ D and s = B −1 g(x) ≤ B −1 g(x) − g(x∗ ) ≤ 2ρ Jg (x∗ )−1 ε 2 (1. .29 in (1. x ∈ D and moreover. Equation (1. ¯ x ¯ In particular. x − x∗ } ≤ ε implies that x and x belong to D ¯ ¯ and that (1. ek }.62) Using Equation (1. (1. B) ∈ N. (1. 1 − (Jg (x∗ )−1 (B − Jg (x∗ )) To deﬁne the neighborhood N1 of x∗ . then ε x − x∗ < . In addition we can choose r ∈ (0..28.60) 2 Considering Lemma 1. choose ε > 0 and ρ > 0 as in Lemma 1. for u = x and v = x.40) arbitrarily.58) then shows that the update function associated with the iteration (1. (1.36) holds.50 Chapter 1. Let N1 be the set of all x ∈ Rn such that x − x∗ < 2ρ Jg (x∗ )−1 If N = N1 × N2 and (x.23 so that max{ x − x∗ . ¯ the update function is well deﬁned in N. 2 and x − x∗ ≤ s + x − x∗ ≤ ε.45) is locally convergent at x∗ . ¯ Hence. i. An introduction to iterative methods Then any B ∈ N2 is nonsingular and satisﬁes B −1 ≤ Jg (x∗ )−1 ≤ 2 Jg (x∗ )−1 .59) shows that s = 0 if and only if x = x∗ . we obtain Ek+1 F ≤ Ek F − E k sk 2 2 E k F sk 2 + γ ek . the algorithm according to (1. a suﬃcient condition for {xk } to converge qsuperlinearly to x∗ is E k sk lim = 0.59) x − x∗ .
(1. .63). instead of the Broydenmatrix one could store the inverse of this matrix.24 gives that { Bk } is uniformly bounded for k ≥ 0.63) Theorem 1. m.60) we obtain ∞ k=0 ek ≤ 2ε. Thus from (1.65) E k sk 2 < ∞. To avoid this problem. we obtain ∞ k=0 (1. By Equation (1. .3 The method of Broyden 51 or E k sk 2 ≤ 2 Ek sk 2 F Ek F − Ek+1 F + γ ek . . The inverse notation of Broyden’s method A restriction of the method of Broyden is that it is necessary to solve an ndimensional system to compute the Broyden step. This implies that there exists an M > 0 independently of k such that Ek = Bk − Jg (x∗ ) ≤ Bk + Jg (x∗ ) ≤ M.64) for k = 0. 1. E k sk 2 ≤ 2M sk 2 Ek F − Ek+1 F + γ ek . .61) and completes the proof.65) is true for any m ≥ 0. If Hk is the inverse of Bk then sk = −Hk g(xk ). and the operation is reduced to a matrixvector multiplication. Because (1.19. (1.1. sk 2 which implies (1.64) and summing the left and right sides of (1. . see Algorithm 1. yields m k=0 E k sk 2 sk 2 m ≤ 2M ≤ 2M E0 E0 F − Em+1 + 2εγ F +γ k=0 ek F ≤ 2M M + 2εγ .
This was suﬃcient to deﬁne Bk+1 uniquely and the update was given by (1. then k −1 Hk+1 = Bk+1 = (Bk + (yk − Bk sk ) (A + U V T )(A−1 − A−1 U (I + V T A−1 U )−1 V T A−1 ).25). in addition to the secant equation (1. Let A ∈ Rn×n be nonsingular and U. An introduction to iterative methods and the secant equation becomes Hk+1 yk = sk .52 Chapter 1. The formula (1. Householder’s formula. using Householder’s modiﬁcation formula. the new Broyden matrix Bk+1 has been chosen so that. to compute the −1 new inverse Broyden matrix Hk+1 = Bk+1 with very little eﬀort from Hk . Theorem 1.30).68) Proof. V ∈ Rn×p be arbitrary matrices with p ≤ n. states that if A is a nonsingular (n × n)matrix. then (A + uv T ) is nonsingular and (A + uv T )−1 = A−1 − A−1 uv T A−1 .68) is easily veriﬁed by computing (A−1 − A−1 U (I + V T A−1 U )−1 V T A−1 )(A + U V T ). it satisﬁes Bk+1 q = Bk q in any direction q orthogonal to sk . s T H k yk k (1.30.67) gives that if sT Hk yk = 0.67) This formula is a particular case of the ShermanMorrisonWoodbury formula derived in the next theorem. also called the ShermanMorrison formula. It is possible. In Section 1. sT −1 k ) sT sk k −1 s T Bk k −1 s T B k yk k −1 −1 = Bk − (Bk yk − sk ) = Hk + (sk − Hk yk ) sT Hk k .66) again does not deﬁne a unique matrix but a class of matrices. If (I + V T A−1 U ) is nonsingular then (A + U V T )−1 exists and (A + U V T )−1 = A−1 − A−1 U (I + V T A−1 U )−1 V T A−1 .3. u and v are vectors in Rn . Equation (1. and (1 + v T A−1 u) = 0. and that both yield the identity.68).69) . (1. (1. 1 + v T A−1 u (1.66) Equation (1. Therefore A + U V T is invertible and the inverse is given by (1.
69) equals Algorithm 1. . Since Hk+1 satisﬁes (1.19. we could also require that Hk+1 q = Hk q for q T yk = 0. appears in practice to be unsatisfactory and is called the second or ’bad’ method of Broyden [8]. This is. 2. k = 0. Instead of assuming that Bk+1 q = Bk q in any direction q orthogonal to sk . 1.3 The method of Broyden 53 The iterative scheme xk+1 = xk − Hk g(xk ). .66). together with the rank one update (1. . . .1. the complement of the ﬁrst method of Broyden. it is readily seen that for this method Hk+1 is uniquely given by yT Hk+1 = Hk + (sk − Hk yk ) Tk . however. in some sense. yk yk This update scheme.
54 Chapter 1. An introduction to iterative methods .
Although computer simulations indicate that the method of Broyden satisﬁes ﬁnite convergence. In 1981. Section 2. In Section 2. fourteen years after Charles Broyden proposed his algorithm. for a long time it was not possible to prove this algebraically.Chapter 2 Solving linear systems with Broyden’s method One important condition for an algorithm to be a good iterative method is that it should use a ﬁnite number of iterations to solve a system of linear equations Ax + b = 0. we discuss the Theorems of Gay.1). Richard Gerber and Franklin Luk [23] published an approach to compute the exact number of iterations that Broyden’s method needs to solve (2. In addition Gay proved under which conditions the method of Broyden needs exactly 2n iterations. Section 4. cf. 55 .3. however. the method of Newton satisﬁes this condition. (2. As we have seen in Section 1. This justiﬁes that we restrict ourselves to examples where A is in Jordan canonical block form. it solves a system of linear equations in just one single iteration step.1) where A ∈ Rn×n and b ∈ Rn . Section 2. and we give examples to illustrate the theorems. it turns out that Broyden’s method needs much less iterations. that is. In this chapter.2. we show that the method of Broyden is invariant under unitary transformations and in some weak sense also under nonsingular transformations. David Gay published a proof that Broyden’s method converges in at most 2n iteration steps for any system of linear equations (2.2. For many examples. In 1979.1) where A is nonsingular [22].2.1. and of Gerber and Luk. But ﬁrst we start again with the problem in the onedimensional setting.
Indeed. where α = 0. The scalar b0 is updated by b1 = b0 + g1 (1 − α/b0 )(αx0 + β) = b0 + = b0 − (b0 − α) = α. s0 = − b0 b0 b0 So. We consider the following generalization of the method of Broyden. g(x) = Ax + b. (2. Solving linear systems with Broyden’s method The onedimensional case Consider the function g : R → R given by g(x) = αx + β. if b0 ∈ R is an arbitrary nonzero scalar. g (x0 ) α α and g(x1 ) = 0. Therefore the method converges in the next iterations step. We compute α β g(x0 ) = − x0 − . It turns out that Broyden needs two iterations from the same initial point x0 = x∗ .2) where A : Rn×n and b ∈ Rn . then g(x1 ) = α 1 − α/b0 x0 + 1 − α/b0 β = 1 − α/b0 (αx0 + β).56 Chapter 2. if x1 = x0 + s0 . It is clear that Newton’s method converges in one iteration starting from any initial point x0 . with b0 = α. x1 = x 0 − g(x0 ) αx0 + β β = x0 − =− . for x ∈ Rn .1 Exact convergence for linear systems Suppose that g : Rn → Rn is an aﬃne function. diﬀerent from the solution x∗ . s0 −(αx0 + β)/b0 Thus after one iteration Broyden’s method succeeds to ﬁnd the derivative of the function g. α β β g(x1 ) = x 1 − x1 − =− . x2 = x 1 − b1 b1 b1 α 2. that is. The matrix A is assumed to be nonsingular. . For notational simplicity we denote g(xk ) by gk . that is.
Hk+1 yk = sk . The determinant of the matrix I − gk+1 vk equals T T det(I − gk+1 vk ) = 1 − vk gk+1 T = 1 − vk (yk + gk ) = u T sk . but might non keep (2. If Hk is invertible and vk satisﬁes conditions (2. T Proof.4) then Hk+1 is invertible as well.2. Choose x0 ∈ Rn and a nonsingular (n × n)matrix H0 .3).2.3) (2. k = 1 − 1 − u T Hk g k k .3) establishes the inverse secant equation (1.1 Exact convergence for linear systems 57 Algorithm 2.66). i) xk+1 := xk + sk .3) and (2. Property (2.5) With this relation the following lemma is easily been shown. T T vk = Hk sk /(sk Hk yk ) for sT Hk yk = 0.e. k T The ’bad’ Broyden update vk = yk /(yk yk ) clearly satisﬁes property (2. Note that both properties are satisﬁed when Broyden’s ’good’ update is used. where uT sk = 0. Repeat the following sequence of steps as long as sk = 0. k T iv) Hk+1 := Hk + (sk − Hk yk )vk . Compute s0 := −H0 g(x0 ) and let k := 0. (2..1 (Generalized Broyden’s method). ii) yk := g(xk+1 ) − g(xk ). i.4) invariant. T H k uk . iii) Choose vk such that the conditions T vk yk = 1. Lemma 2. T = Hk − Hk (gk + yk )vk (2. The updated Broyden matrix can be written as T Hk+1 = Hk + (sk − Hk yk )vk T = Hk (I − gk+1 vk ).4) vk = are satisﬁed. v) Compute sk+1 := −Hk+1 g(xk+1 ).
Thus the algorithm stops if and only if the zero of the function g is found. Solving linear systems with Broyden’s method Because uT sk is assumed to be nonzero. where A ∈ Rn×n is nonsingular and b ∈ Rn . From (2. According to the deﬁnition of the Broyden step.58 Chapter 2. this implies that Hk+1 is invertible if k Hk is invertible.6) The matrix A is assumed to be nonsingular.1. Since g is a aﬃne function the yield of the step size can be expressed as yk = Ask .9) (2.10) and T AHk+1 = AHk (I − (I − AHk )gk vk ). Equation (2.8) A theorem of Gay In this section.3.7) implies that gk+1 = yk + gk = (I − AHk )gk . (2.10) we deduce T I − AHk+1 = (I − AHk )(I + AHk gk vk ) T = (I − AHk )(I − yk vk ). (2. we show that Algorithm 2.1 converges in at most 2n steps when applied to an aﬃne function g : Rn → Rn . (2.6) and (2.12) .11) for which we have used (2. (2. So. (2. the nonsingularity of H0 implies that sk = 0 if and only if gk = 0 for all k ≥ 0. This follows as an easy corollary to the following lemma.6) establishes that yk is a nonzero vector throughout the execution of Algorithm 2. given by (2. sk = −Hk gk . We also use the relations yk = −AHk g(xk ). The notation σ used below denotes the greatest integer less than or equal to σ ∈ R. In the proof of Lemma 2.7). we need the equalities T AHk+1 = A(Hk + (sk − Hk yk )vk ) (2.2).7) and (I − AHk+1 )yk = 0. (2.
The proof for 3 ≤ j ≤ (k + 1)/2 is similar and we refer to [22] for a complete derivation. If g(x) = Ax−b and A ∈ Rn×n is nonsingular. gk−2 = (I − AHk−3 )gk−3 and Since yk−3 = AHk−3 gk−3 .11) we have that T AHk−1 = AHk−2 (I − (I − AHk−2 )gk−2 vk−2 ) 0 ≤ i ≤ j. so (2.e.. By (2. T (AHk−2 )gk−2 = (1 − vk−3 gk−2 )(I − AHk−3 )AHk−3 gk−3 . Therefore (2.1 Exact convergence for linear systems 59 Lemma 2.13) holds for j = 2. If A ∈ Rn×n and Algorithm 2. Note that.13) is easily seen to hold for j = 1. then Algorithm 2. 0 ≤ i ≤ 2.13) are linearly independent for j = 2. According to (2. gk−2 and AHk−2 gk−2 are linearly independent.2.4. i.13) Moreover. .1 is applied to g(x) ≡ Ax − b with the result that gk ≡ g(xk ) and yk−1 are linearly independent. the vectors (AHk−2j+1 )i gk−2j+1 . then for 1 ≤ j ≤ (k + 1)/2 . gk = 0 for some k ≤ 2n. We prove (2. are linearly independent. as before. (2.13) by induction on j. For the induction we proof that the vectors in (2.1 converges in at most 2n steps. Proof.3. Equation (2.8) we have (I − AHk−2 )yk−3 = 0. Theorem 2. we see that (AHk−3 )i gk−3 .9) gives gk−1 = (I − AHk−2 )gk−2 and therefore T (AHk−1 )gk−1 = AHk−2 (I − (I − AHk−2 )gk−2 vk−2 )gk−1 T = (1 − vk−2 gk−1 )(I − AHk−2 )AHk−2 gk−2 T = AHk−2 (I − AHk−2 )gk−2 (1 − vk−2 gk−1 ) Since gk−1 and (AHk−1 )gk−1 are linearly independent (I − AHk−2 )gk−2 and (I − AHk−2 )AHk−2 gk−2 are linearly independent as well. The linearity of g implies that yk−1 = Ask−1 = −AHk−1 gk−1 . Therefore yk−3 . are linearly independent. using that yk−1 = gk − gk−1 .
10 0 residual g(xk ) 10 −5 10 −10 PSfrag replacements 10 −15 0 1 2 3 4 5 6 7 8 iteration k Figure 2. Because A is also nonsingular. The theorem clearly holds if gk = 0. 1 2 0 0 0 0 2 0 0 0 .60 Chapter 2.19 to obtain the exact zero of the function g 1 and g2 .2 shows that Hl is nonsingular for l ≥ 0. According to (2.4. By Lemma 2. With the following example we illustrate Theorem 2. 2. 1.19. so sk−1 = 0. see Algorithm 1. Lemma 2. . Solving linear systems with Broyden’s method Proof. we must have yk−1 = 0 and hence gk = λyk−1 for some λ = 0.19. Clearly. ’×’: A2 ] In Section 1. 1).14) We apply the method of Broyden. Example 2. (whence gk−1 = 0 too).14) [’◦’: A1 . whence gk+1 = gk − AHk gk = 0. for i = 1. starting with the initial matrix B0 = −I and initial estimate x0 = (1. The rate of convergence is given in Figure 2. where A1 and A2 are deﬁned in (2.5. 1 2 (2.3.1.8) yk−1 = AHk−1 yk−1 . when solving Ai x = 0. we have seen that the Broyden matrix Bk does not necessarily converge to the Jacobian even if the sequence {xk } converges to x∗ . Consider the linear function g1 (x) where 2 1 0 0 2 0 2 1 0 0 A1 = and A2 = 0 0 2 1 0 0 0 0 2 0 = A1 x and g2 (x) = A2 x. here the number of 2n iterations is an upper bound for Algorithm 1. 1. so assume that gk = 0. so gk = AHk gk .1: The convergence rate of Algorithm 1.3 there exists a k with 1 ≤ k ≤ 2n − 1 such that gk and yk−1 are linearly dependent.
y spans the kernel of (I − AH not in the range of (I − yk vk k k+1 ). . is projected on an (n−1)dimensional subspace orthogonal to the Broyden step sk .1 k T s /(sT H y ). .1 Exact convergence for linear systems 61 Lemma 2. Let g : Rn → Rn be an aﬃne function. iteration k. and because.1 and suppose for some k ≥ 1. . . . Theorem (2.19 and 2.6. vk yk−1 = 0 and rank(I − AHk ) = n − 1 then rank(I − AHk+1 ) = n − 1 and yk spans the kernel of (I − AHk+1 ). Any other null vector y T of (I − AHk+1 ) must (after scaling) satisfy (I − yk vk )y = yk−1 .7 leads to the important observation that the sequence of matrices Hk does not terminate with the inverse of the matrix A. the ﬁnal diﬀerence Bk∗ − A F depends in particular on the orthogonality of the Broyden steps {s0 . So. yk−1 is T ). since rank(I −AHk ) = n − 1 the vector yk−1 spans the kernel of (I − AHk ). Consider Algorithm 2. Consider Algorithm 1. sk∗ }. k −1 Proof.4) gives that the process are equivalent for vk = Hk k k k k is well deﬁned and converges in a ﬁnite number of iterations. Because g is aﬃne we have that yk = Ask and according to the Broyden update. Algorithms 1. In other words. Similarly. . Let g : Rn → Rn be an aﬃne function. at least in the T usual case in which all vk yk−1 = 0. we obtain Bk+1 − A F ≤ Bk − A F (2. where sT Bk yk is nonzero in every k ∗. Lemma 2. Because we assume that sT Bk yk = 0 for all k. . sT sk k and by taken the Frobenius norm of both sides. the diﬀerence between the Jacobian and the Broyden matrix.8) we see that yk is in the kernel of (I −AHk+1 ). Then for all k = 0. Proof. with nonsingular Jaco−1 bian A ∈ Rn×n . . Therefore. In fact. According to (2. . that T yk = 0. v T y the kernel of (I − yk vk k k−1 = 0.19. with nonsingular Jacobian A ∈ Rn×n . we arrive at Bk+1 − A F ≤ Bk − A F I− ≤ Bk − A F. But vk spans T )T . Lemma 2.15) Bk+1 = Bk + (yk − Bk sk ) Bk+1 − A = (Bk − A) I − sk sT k sT sk k sT k sT sk k sk sT k .2. each matrix Hk and A−1 agree only on a subspace of dimension one. The assumption that yk = 0 implies that yk−1 = 0.7. by assumption.
We immediately see that Zk+1 ⊂ Zk . . We proceed to show how the Zk ’s decrease in dimension and ﬁrst derive several lemma’s. we obtain (AHk+1 )j+1 gk+1 = (AHk+1 )(AHk+1 )j gk+1 = (I − AHk )AHk tj + c(I − AHk )yk .1 terminates at the kth iteration if and only if g(xk ) = 0. Let Zk be deﬁned as the subspace spanned by the Krylov sequence {gk .16) for k ≥ 0. (2.62 Chapter 2. . . It suﬃces to show that for j ≥ 0. So. We already have proved that Algorithm 2. .7) the vector tj+1 ∈ Zk . (2. where tj+1 = AHk tj + cyk . Thus sk = 0 if and only the dimension of Zk is zero. Zk = span{gk . T where c = vk (I − AHk )tj .}. Proof.2 Two theorems of Gerber and Luk We consider the Broyden process again applied to compute a zero of the aﬃne function g given by (2.17) Another direct implication of Lemma 2. we have t0 = gk because of (2. So.8. Then there exists a vector zk in Zk such that zk+1 = (I − AHk )zk .9).8 is formulated in the following lemma. AHk gk .2). We will call subspace Zk the kth Krylov subspace. Let zk+1 be any vector in Zk+1 . (AHk )2 gk . If j = 0. AHk gk . By deﬁnition of Hk+1 . Solving linear systems with Broyden’s method 2. . So. there is a vector tj in Zk such that (AHk+1 )j gk+1 = (I − AHk )tj .}. (AHk )2 gk . . Lemma 2. We prove this by induction. . Assume there is a vector tj in Zk such that (AHk+1 )j gk+1 = (I − AHk )tj . Z0 will be called the zeroth subspace. T = (AHk + A(sk − Hk yk )vk )(I − AHk )tj (AHk+1 )j+1 gk+1 = (I − AHk )tj+1 . By (2.
. But this contradicts the assumption of a nonzero wk . . . αd = 0. In fact. . Then the vectors (I − AHk )t1 . .9 shows that a nonzero wk must exist. Thus. i. If dim Zk+1 = dim Zk . if there is a nonzero vector wk in Zk ∩ Ker(I − AHk ). 1. . . Let dim Zk = d + 1. with the exception of the case where there exists a nonzero w 0 . A consequence of Lemma 2. then dim Zk+2 = dim Zk+1 . .12. If there is a nonzero vector wk in the subspace Zk ∩ Ker(I − AHk ) then d wk = i=0 αi (AHk )i gk . (2. Let the vectors {t1 .18) (2. . As the vectors (AHk )i gk for i = 0. t2 . . .. Theorem 2. Since dim Zk+1 = dim Zk − 1.10 is that wk is unique up to a scalar multiple.11. . Proof. The following theorems state the basic result. We have d 0 = (I − AHk )wk = α0 gk + i=1 (αi − αi−1 )(AHk )i gk − αd (AHk )d+1 gk . . then wk spans Zk ∩ Ker(I − AHk ). (I − AHk )td span Zk+1 .2. . Before we start to prove both theorems.12. we obtain the inequality dim Zk − 1 ≤ dim Zk+1 ≤ dim Zk . Suppose that αd = 0. a few remarks are in order on the vector wk in Theorem 2. d are linearly independent.2 Two theorems of Gerber and Luk 63 Lemma 2. we deduce that αi = 0 for i = 0.9. Lemma 2. d − 1.9.10. . . it can be shown that wk = λyk−1 for some scalar λ. 1. with Lemma 2. Lemma 2. .19) where wk spans Zk ∩ Ker(I − AHk ).e. then dim Zk+2 = dim Zk+1 − 1. td } span Zk . Theorem 2. If dim Zk+1 = dim Zk − 1 and T vk wk = 0. (I − AHk )t2 .
By Lemma 2. If there is a nonzero vector wk in Zk ∩ Ker(I − AHk ) and if T vk wk = 0. Lemma 2. .13. (AHk )d−1 gk . Lemma 2. AHk gk . i=0 + (βd−1 − βd−2 )(AHk )d−1 gk − βd−1 (AHk )d gk = 0. T (I − yk vk )wk+1 = αwk . First we show that wk+1 = λyk for some scalar λ.10. Proof (of Theorem 2. Assume wk+1 = λyk for all nonzero scalars λ. Then we will prove that yk is not in Zk+1 . T contradicting the assumption that vk wk = 0. yk spans Zk+1 ∩ Ker(I − AHk+1 ). and so yk ∈ Zk+1 . Solving linear systems with Broyden’s method Applying Lemma 2. wk } is a basis for Zk . Let dim Zk = d + 1 where d ≥ 1. Since wk+1 ∈ Zk and from the equation T (I − AHk+1 ) = (I − AHk )(I − yk vk ). Now we show that yk is not in Zk+1 . βi (AHk )i gk + βd wk i=0 d−1 βi (AHK )i gk . the subspaces Zk+1 Zk are identical by (2.9 implies that d−1 yk = (I − AHk ) = (I − AHk ) So. So wk+1 = λyk for some nonzero scalar λ. . Suppose there is a nonzero vector wk+1 in Zk+1 ∩Ker(I −AHk+1 ). Since dim Zk+1 = dim Zk . . . if d > 1.11).9 completes the proof.12. . . . But then T T T T αvk wk = vk wk+1 − vk yk vk wk+1 = 0. we deduce that Proof. By (2. the set of vectors {gk .64 Chapter 2.10. then β0 + (β1 − β0 + 1)AHk gk + . for some nonzero scalar α.7)(2. The next lemma is needed in the proof of Theorem 2.8) and Lemma 2. then Zk+1 ∩ Ker(I − AHk+1 ) = {0}.17). Assuming that yk ∈ Zk+1 .
The method of Broyden needs 8 iterations to solve the equation g2 (x) = 0. 0 0 −1 β0 gk + (1 − β0 )AHk gk = 0. 2 .15. .2 Two theorems of Gerber and Luk 65 For d = 1 we obtain Either case is impossible. . 1. we see that d0 − 1 ≤ d1 ≤ d0 . AH0 g0 . Then we obtain the following process. Applying the Theorems 2.18).12).16. Corollary 2. In the next example. This predicts the four iterations the method of Broyden needs to solve g1 (x) = 0. where −1 1 0 A = 0 −1 1 . though easier to check. which is a direct consequence of 2. The proof follows directly from the Lemmas 2. −1).12. So yk ∈ Zk+1 and hence Zk+1 ∩ Ker(I − AHk+1 ) = {0}. if (2. .5.} = span 0 . 1 . are linearly independent. 1). It turns out that in case of the function g1 in Example 2. If we apply the method of Broyden starting with the initial matrix B0 = −I and initial estimate x0 = (1.12 imply ﬁnite termination of the method.2. . . H0 = −I.5 both the zeroth and the ﬁrst Krylov space of the Broyden process has dimension 2 (= d0 + d1 ).12.19) is satisﬁed in every iteration where dim Zk+1 = dim Zk − 1.14.1 must terminate in exactly d0 + d1 steps.11 and 2. i = 0. The function value in x0 equals g(x0 ) = Ax0 = (0.9 and 2. . is that Broyden’s method needs at most 2d0 iterations. both the zeroth and the ﬁrst Krylov space of the Broyden process has dimension 4. −1 −1 −1 . Note that the inverse of the Broyden also equal minus the identity. Consider the linear function g(x) = Ax. d. / Proof (of Theorem 2.19) is a necessary condition for Theorem 2. we show (2. In case of the function g2 of Example 2. A weaker statement.13. Theorems 2. Let d0 = dim Z0 then Algorithm 2. as the vectors (AHk )i gk . .1 needs at most 2d0 iterations to converge. (AH0 )2 g0 . 0. Let dim Z0 = d0 and dim Z1 = d1 . Example 2.11 and (2. Example 2.11 and 2. Therefore the zeroth Krylov space is given by 0 0 −1 Z0 = span{g0 . we conclude that Algorithm 2.18). From (2.
66 Chapter 2. 0) and so the second element of the iterate is nicely removed. denoted by d2 . 0. 1) we compute the new inverse Broyden matrix. 0 0 −1 The function value in x1 equals g(x1 ) = Ax1 = (0. 0 0 −1 We thus have that d0 = 3 and d1 = 2. 1 2. 0. 0 0 . −1. 0. 1.} = span −1 . The dimension of the second Krylov space. 0). . T v0 = H0 s0 /(sT H0 y0 ) = (0. if v0 w0 = 0. However. 0). we compute the new inverse T Broyden matrix. 0. So the vector w0 that spans Z0 ∩ Ker(I − AH0 ) equals w0 = (−1. v1 = H1 s1 /(sT H1 y1 ) = (0. −1) and so the last element of the iterate is nicely removed. 0. We see that the ﬁrst Broyden step equals s0 = −H0 g0 = (0. 1). Note that w0 and w1 are parallel (here chosen to be identical). −1. and 1 −1 −1 −1 T H2 = H1 + (s1 − H1 y1 )v1 = 0 −1 −1 . . 0). Solving linear systems with Broyden’s method and the dimension of Z0 equals three. 1 2).12 the intersection of Z1 and the kernel of (I − AH1 ) must be empty and dim Z2 = T T = The kernel of dim Z1 . It turn out that the ﬁrst Krylov space is twodimensional and is given by 0 1 Z1 = span{g1 . 0) and (0. 1. √ (I − AH1 ) is spanned by the vectors (−1. Therefore 2 2 Z1 ∩ Ker(I − AH1 ) is spanned by the nonzero vector w1 = (−1. −1. According to Theorem 2. 0). . AH1 g1 . 1. 0). Together with the yield of the Broyden step y1 = (−1. The kernel of (I − AH0 ) is onedimensional and spanned by the vector (−1. 0). 0 and −1 0 0 T H1 = H0 + (s0 − H0 y0 )v0 = 0 −1 −1 . equals one. With the yield of the Broyden step y0 = (0. 1). x2 = x1 +s1 = (1. 0). in this example v0 w0 √ 0. −1 . 0 0 −1 The new Broyden matrix B1 is given by −1 0 0 B1 = 0 −1 1 . 0. The second Broyden step equals s1 = −H1 g1 = (0. x1 = x0 + s0 = (1. 0.
Lemma 2. 0).3 Linear transformations 67 The Broyden matrix B2 is given by −1 1 0 B2 = 0 −1 1 . Again the vector v1 and w1 are orthogonal. we know that the Broyden process terminates in the next iteration. . 0. . ˜ ˜ n .19 starting with x0 = U T x0 and B0 = U T B0 U. . We make this precise in the next lemma. Z2 = span{g2 .20) follows from the assumptions.3 Linear transformations An important observation for our present approach is that Broyden’s method is invariant under unitary transformations for general systems. . . We compute −1 ˜x xk+1 = xk − Bk g (˜k ) ˜ ˜ −1 = U T xk − U T Bk U U T g(U U T xk ) −1 = U T (xk − Bk g(xk )) = U T xk+1 . 0) and the solution to the problem is found. the kernel of (I − AH2 ) is the entire space. 0. The function value in x2 equals g(x2 ) = Ax2 = (−1. Equation (2. Let g : Rn → Rn be a general function. x3 = 0. .20) is easily proved using an induction principle. 0) and the second Krylov space is given by −1 0 . that is. AH3 g3 . The third Krylov space is given by Z3 = span{g3 . . For k = 0. 1. ˜x Proof. 0. g (˜k ) = g(xk ) . and w2 = w1 = w0 . 2. Statement (2. Then for every k = 0. and choose x0 ∈ Rn and B0 ∈ Rn×n .20) In particular.} = {0}. Let U be a unitary matrix. 0 0 −1 The Broyden matrix B2 equals the Jacobian of the linear function. Consider Algorithm 1. . Hence. The ﬁnal Broyden step equals s2 = (−1. applied to the function g (z) = U T g(U z). . . (2. A. z∈R x k = U T xk ˜ and Bk = U T Bk U.2.} = span 0 Because H2 is the inverse of the Jacobian.17. . The vector w2 is thus given by w2 = (−1.
˜ ˜ z ∈ Rn . and consider the Broyden process starting with x0 = U −1 x0 and B0 = U −1 B0 U. . The question is whether scaling of the system does change the rate of convergence of Broyden’s method. For linear systems of equations we have the following result. for a certain choice of x0 and H0 . . First note that g0 = g (˜0 ) = U −1 g(U x0 ) = U −1 g(U U −1 x0 ) = U −1 g(x0 ) = U −1 g0 . i. ˜ ˜ ˜ This leads to Bk+1 = Bk + = Bk + (˜k − Bk sk )˜T y ˜ sk Ts sk ˜k ˜ gk+1 sT ˜ ˜k sT sk ˜k ˜ = U T Bk U + U T g(U U T xk+1 )(U T sk )T (U T sk )T (U T sk ) g(xk+1 )sT k U = U T Bk U + U T T sk U U T sk = U T (Bk + g(xk+1 )sT k )U = U T Bk+1 U. . Let g : Rn → Rn be an aﬃne function. ˜ ˜x Proof.e. sT sk k So. Then the method of Broyden needs at most 2d0 iterations to converge exactly to the zero of g . (2.68 Chapter 2. ˜ ˜x and sk = xk+1 − xk = U T xk+1 − U T xk = U T sk . ˜x It might happen that a system is more or less singular.20) is true for every k = 0. applied to the function g (z) = U −1 g(U z).16) is equal to d0 . the dimension of the zeroth Krylov space Z0 (2. This is unproﬁtable for the numerical procedures to solve this system. and g (˜k ) = U T g(U U T xk ) = U T g(xk ) = g(xk ) . Suppose. Let U be a nonsingular matrix. g (˜k ) = 0 for some k ≤ 2d0 .. ˜ ˜x ˜ . 1.18. Solving linear systems with Broyden’s method Therefore gk+1 = g (˜k+1 ) = U T g(U U T xk+1 ) = U T g(xk+1 ) = U T gk+1 . Lemma 2.
3 Linear transformations 69 If we apply the linear transformation x → U x.} g ˜ ˜ = span{U −1 g0 . U −1 (AH0 )2 g0 . . Corollary 2. . the zeroth Krylov space Z0 built with g0 and AH0 becomes ˜ Z0 = span{˜0 . . . . the dimensions of Z0 and Z0 are equal. U −1 AH0 g0 .2. . . (AH0 )2 g0 . (U −1 AU U −1 H0 U )2 U −1 g0 . . AH0 g0 .} = span{U −1 g0 .14 completes the proof. . U −1 AU U −1 H0 U U −1 g0 .} = U −1 Z0 Because U is of full rank. .
70 Chapter 2. Solving linear systems with Broyden’s method .
Although Broyden’s method fails to have local qquadratic convergence it is still qsuperlinearly convergent for nonlinear equations and exact convergent for linear equations. Q = CD T . from chemical reaction engineering. In this chapter. involving a large amount of memory to store the n2 elements of the Broyden matrix. we describe how we can use the structure of the Broyden update scheme. In comparison with the method of Newton it does not need expensive calculation of the Jacobian of the function g.3. The initial Broyden matrix is set to minus the 71 . for example. to write the Broyden matrix B as a sum of the initial Broyden matrix B0 and an update matrix Q. This makes the method eﬃcient for problems where the evaluation of g is very timeconsuming. see Section 8. All methods described in this chapter are based on the method of Broyden and reduce the amount of memory needed for the Broyden matrix from n2 storage locations to 2pn storage locations. the method of Broyden turns out to be quite suitable for problems stemming from applications. In Section 3. According to a clever updating scheme of the Broyden matrix. every iteration step includes only one function evaluation. which is written as the product of two (n × p)matrices. In addition. we develop a structure to reduce the number of storage locations for the Broyden matrix. The parameter p is ﬁxed during the iteration steps of a limited memory Broyden method. we saw that the method of Broyden has several advantages.Chapter 3 Limited memory Broyden methods In the previous chapters.1. Therefore we call these algorithms limited memory Broyden methods. A disadvantage of Broyden’s method arises if we consider highdimensional systems of nonlinear equations.
dp ] are deﬁned by ck+1 = (yk − Bk sk )/ sk . . This approach cannot be trapped in the framework of Section 3.1) implies that if an initial matrix B0 is updated p times. [12]. we have taken it into consideration. The Broyden Rank Reduction method is introduced in Section 3. sT sk k (3. .1 New representations of Broyden’s method The updates of the ’good’ method of Broyden.2.1) sk sk sT sk k with sk = xk+1 − xk and yk = g(xk+1 ) − g(xk ). . p − 1. Limited memory Broyden methods identity at every simulation (B0 = −I).4. 3. . .72 Chapter 3. (3.2) where C = [c1 . In addition. we discuss several properties of the Broyden Rank Reduction method that also gives more insight in the original Broyden process. . we observe a limited memory Broyden method coming from the work of Byrd et al. dk+1 = sk / sk . are generated by T sT g(xk+1 )sk Bk+1 = Bk + (yk − Bk sk ) Tk = Bk + . . This method considers the singular value decomposition of Q. Equation (3. . By applying a reduction to the rank of Q in subsequent iterations of the Broyden process.3 a generalization of the Broyden Rank Reduction method. .19. To increase the understanding of limited memory Broyden methods we give in Section 3. the number of elements to store never exceeds 2pn. . D = [d1 . the qsuperlinear convergence of the method of Broyden is retained. We prove under which conditions of the pth singular value of the update matrix. and applies the reduction by truncating the singular value decomposition up to p−1 singular values. . . In Section 3. for k = 0. . cp ]. the resulting matrix Bp can be written as the sum of the initial matrix B0 and p rank one matrices.3 but due to its natural derivation. that is. p−1 Bp = B 0 + k=0 (yk − Bk sk ) sT k = B0 + CDT . Algorithm 1.
1. with z ∈ Rn . costs 2pn ﬂoating point operations. To make it possible to proceed after these p iterations. In the next iteration step of the Broyden process. The dimension of the problem is ﬁxed at n = 100. where C and D are arbitrary (n × p)matrices. is only useful if p can be kept small (p n). We remove all corrections made to the initial Broyden matrix and start all over again. therefore the initial residual is g(x0 ) ≈ 0. 2(p+1)n storage locations are needed to store the Broyden matrix Bp+1 . then p Q = CD T = k=1 ck d T . After p iterations we remove all stored corrections and restart the Broyden algorithm with initial estimate x0 = xp . we call the update matrix. we have to reduce the number of rankone matrices that forms the update matrix (3. etc. which equals the number of storage locations we need for the Broyden matrix itself. we take advantage of Equation (3. We apply the original method of Broyden. Storing the matrices C and D requires 2pn storage locations. For p = 3 and p = 5 a few more . Let g be the discrete integral equation function. Let Q = CD T . given by (A. if the method of Broyden needs more than p iterations to converge.3. if Q denotes the update matrix.1 New representations of Broyden’s method 73 The sum of all correction terms to the initial Broyden matrix B0 in (3. Lemma 3. In addition. In the following iteration step 2(p + 2)n storage locations are needed to store Bp+2 .7570. 2(n/2)n = n2 storage locations are needed. after n/2 iterations of Broyden’s method.2). k (3. given by (3. this alternative notation for the Broyden matrix.3). However. Example 3. or. Furthermore the computation of the matrix vector product Qz = C(D T z).19. the two next examples are obvious.3). So. We ﬁx the maximal number of corrections to be stored at p. In other words.6) and we set ε = 10−12 . So.5). The rate of convergence is given in Figure 3.1.3) to compute the product Qz for any vector z ∈ Rn . The following lemma is clear.2. it suﬃces to store the (n×p)matrices C and D. where the updates to the initial Broyden matrix are stored as in (3.3) By choosing B0 to be minus the identity (B0 = −I). After p iterations of the method of Broyden all columns of the matrices C and D are used. we freeze the Broyden matrix and neglect all subsequent corrections. It turns out that for p = 10 the same number of iterations are needed as for the original method of Broyden. the initial Broyden matrix can be implemented in the code for the algorithm. As initial estimate we choose x0 given by (A. In case n is even.2). Algorithm 1.
The diﬀerence between B1 and B2 is relatively small and therefore it makes no diﬀerence whether we freeze the Broyden matrix after the ﬁrst or after the second iteration. ’6’(p = 1)] Example 3. the update matrix Q is the sum of p rankone .74 Chapter 3. B4 and B5 is of order 10−1 and we see that the convergence behavior is equal for p = 3. ’ ’(p = 2). As initial estimate we choose x0 given by (A. We apply the original method of Broyden.1: The convergence rate of Algorithm 1. Similarly the diﬀerences in l2 norm between B3 . respectively). Limited memory Broyden methods iterations are needed (24 and 26.3 would be the BroydenChord method.7570. 4 and 5. for p = 2 about 92 iterations are needed. This can be explained by Table 3. However. and the initial residual equals g(x0 ) ≈ 0. The method is divergent for every value of p. However.5). If p corrections to the initial Broyden matrix are stored. the method does not work.6) and we set ε = 10−12 . Note that for p = 1 and p = 2 the convergence behavior is equal. ’∗’(p = 4). it is worth investigating whether it is possible to save more information about the previous iterations of the process. The rate of convergence is given in Figure 3. ’ ’(p = 5).1.3).3. Algorithm 1. Let g be the discrete integral equation function. 0 10 residual g(xk ) 10 −5 10 −10 PSfrag replacements 10 −15 0 5 10 15 20 25 30 35 40 iteration k Figure 3. An appropriate name for the method used in Example 3. The dimension of the problem if ﬁxed at n = 100. ’×’(p = 10).19.2 is more promising.5) where after p iterations the Broyden process is restarted. Unfortunately. ’ ’(p = 3).19 applied to the discrete integral equation function (A. given by (A.2. [’◦’(Broyden). where the updates to the initial Broyden matrix are stored as in (3. After p iterations all future corrections are neglected. The method of Example 3. We introduce a more sophisticated approach.
denoted by Q. ’6’(p = 1)] matrices. 10 0 residual g(xk ) 10 −5 PSfrag replacements 10 −10 0 2 4 6 8 10 12 14 16 18 20 22 iteration k Figure 3.19 applied to the discrete integral equation function (A. To gather .5). it is sure that the number of storage locations for the Broyden matrix never exceeds 2pn. In this chapter.2125 0.233 0.19 applied to the discrete integral equation function (A. In this way. memory is available to store p − q additional updates.3). ’+’(p = 6). This approximation. ’ ’(p = 2). and has at most rank p. ’ ’(p = 8). where after p iterations the Broyden matrix is frozen.5).2: The convergence rate of Algorithm 1.10937 0. ’×’(p = 10).0093723 1.893 0. ’ ’(p = 3).9524 0.6974 0.59031 0.1 New representations of Broyden’s method 75 k 0 1 2 3 4 5 6 7 8 9 Bk+1 − Bk 2. ’ ’(p = 5).9682 Table 3. [’◦’(Broyden).060311 1.1: The diﬀerence in l2 norm between two subsequent Broyden matrices of Algorithm 1.8165 0. ’ ’(p = 7). we derive a number of limited memory Broyden methods that based on trying to ’reduce’ the update matrix Q.0085303 1.19556 2.37294 0.028375 1.052184 2.11172 0.014515 k 10 11 12 13 14 15 16 17 18 19 Bk+1 − Bk 1. C and D. can be decomposed using two (n × q)matrices. given by (3.044 0. Repeating this action every time that p updates are stored.3. If we approximate the update matrix by a matrix of lower rank q (q ≤ p − 1).014 0.
cp ] and D = [d1 . the columns are ordered in such a way that the last p − q columns of the matrices C and D can be removed to perform the reduction. the decomposition CD T of the update matrix is optionally rewritten by CDT = C(DZ)T = CZ T DT =: C DT . So. A rankone update to the Broyden matrix is stored in a column of C and the corresponding column of D. . • The current number of stored updates is denoted by m (0 ≤ m ≤ p). • When applying the reduction. • A new update is stored in column m + 1 of the matrices C and D. . . • The update matrix Q is written as a product of two (n × p)matrices C and D. dp ]. • The Broyden matrix Bk is written as the sum of the initial Broyden matrix B0 and the update matrix Q. If already p updates are stored (m = p). with C = [c1 .5) . • The parameter p is predeﬁned and ﬁxed (1 ≤ p ≤ n). (3. . the ﬁrst q columns are saved and the reduced Broyden matrix is given by p q B = Bk − cl d T l l=q+1 = B0 + l=1 cl d T . Thereafter the last p − q columns of the matrices C and D are set to zero (m := q). No reduction is performed as long as m < p. a reduction is applied to the update matrix just before the next update is computed. we propose the following conditions. We start the limited memory Broyden process with the matrices C and D equal to zero (m := 0). The new number of updates after the reduction is denoted by q (0 ≤ q ≤ p − 1).76 Chapter 3. Limited memory Broyden methods all methods in a general updatereduction scheme.4) where the matrix Z ∈ Rp×p is nonsingular. . that is Q = CD T . . After rewriting the matrices C and D. • The initial Broyden matrix equals minus identity. . l (3. . B0 = −I. The maximal number of updates to the initial Broyden matrix is thus given by p.
. Set k := 0 and repeat the following sequence of steps until g(xk ) < ε. Especially if q = p − 1. directly after a reduction is applied. care should be taken by computing the next update. . dm+1 = sk / sk and m cm+1 := = yk − B 0 s k − g(xk+1 )/ sk . In the other case.6) gives.1 New representations of Broyden’s method 77 ¯ The new Broyden matrix B after the updating scheme becomes ¯ B = B + (yk − Bsk )sT /(sT sk ). . (3.9). k k (3.8) p cm+1 := g(xk+1 ) + l=q+1 cl d T s k / s k . The number q determines which approach is the cheapest one in ﬂoating points operations. p (m := 0). l (3. . if m < p.8) has the disadvantage that we have to store the vector yk . D = [d1 . q cm+1 := yk − B0 sk − or equivalently cl d T s k / s k l l=1 (3. Choose an initial estimate x0 ∈ Rn .1).7) is no longer valid.4 (The limited memory Broyden method). We are now ready to give the algorithm of a general limited memory Broyden method. . and let C = [c1 . . Since the Broyden matrix Bk is replaced by a reduced matrix B. This normal update is stored in column m+1 of the matrices C and D. the last p − q columns of C and D are still used to compute the new update before they are set to zero. . . . .9) Note that the ﬁrst approach. We proceed with the ﬁrst approach. (3. a reduction to the update matrix is needed just before storing the new correction. p In (3.5) into (3. is very attractive.3. cl d T s k / s k l l=1 (3. . with m = q. . So. cp ]. the second approach. . dp ] ∈ Rn×p be initialized by ci = di = 0 for i = 1. Substituting (3.6) Only if m = p. set the parameters p and q.7) However.9). no reduction is applied and the correction to the Broyden matrix is simply given by (3. Algorithm 3. The update is then reduced to cp := (g(xk+1 ) + cp dT sk )/ sk . (3.
v) Perform the Broyden update. Z = I) and if no additional corrections to the Broyden matrix can be stored. Since xp+1 is computed still using the original. The new update to the Broyden matrix is always made after the reduction to the update matrix Q. . Equation (3. free memory can be created by removing old updates. that is B0 = −I. The simplest thought is to do nothing with the columns of the matrices C and D (so. Broyden matrix Bp . and set m := m + 1. i. Limited memory Broyden methods i) Solve (B0 + CDT )sk = −g(xk ) for sk . So. ii) xk+1 := xk + sk . .10) we see that (I + D T B0 C) is a (p × p)matrix. So. We just make a selection of q updates that we would like to keep. This gives −1 −1 −1 −1 (B0 + CDT )−1 = B0 − B0 C(I + D T B0 C)−1 DT B0 .. It actually turns out that one can avoid solving the large ndimensional system Bk sk = −g(xk ).78 Chapter 3. iv) If m = p deﬁne C = CZ T and D = DZ −1 for a nonsingular matrix Z ∈ Rp×p and set ci = di = 0 for i = q + 1. the limited memory Broyden method is still a secant method.e. by using the ShermanMorrison formula (1. The diﬀerence between a limited memory Broyden method and the method of Broyden itself can be detected only in iteration step k = p + 2. So. We make a ﬁrst attempt to reduce the number of columns of the matrices C and D. The columns of C and D corresponding to these updates are placed in the ﬁrst q columns . we only have to solve a linear system in Rp . p (m := q). . not yet reduced. m T l=1 cl dl sk )/ sk (3.10) −1 By inspection of (3. note that in the ﬁrst p iteration steps no reduction takes place. the inverse is trivial. during these iterations the limited memory Broyden method is equivalent to the method of Broyden. cm+1 := (yk − B0 sk − dm+1 := sk / sk .6) implies that T ¯ Bsk = Bsk + (yk − Bsk )sT sk /(sk sk ) = yk . iii) yk := g(xk+1 ) − g(xk ). k Finally. . Due to our choice of the initial −1 Broyden matrix (B0 = −I).68).
and thus g(x0 ) = 0.e.4 applied to the discrete integral equation function (A. ’6’(p = 1)] Another possibility to reduce the update matrix is to remove the ﬁrst column of both matrices. Note also that. The rate of convergence is given in Figure 3. i. 5 and 10 more or less the same number of iterations are needed as for the method of Broyden itself. ’ ’(p = 3). After the reduction additional updates can be stored for the next p − q iterations of the Broyden process. ’ ’(p = 2). given by (A..3.2. the oldest update of the Broyden process. in this case. We will discuss some of the basic choices for the updates to save. however. Again the dimension is chosen to be n = 100. it is superﬂuous to rewrite the matrices C and D. So the algorithm considered in Example 3.5 is indeed diﬀerent from the algorithm of Example 3. because directly after the reduction a new update is stored in the ﬁrst columns of C and D.5. If Z ∈ Rp×p is the permutation . where q is set to zero. [’◦’(Broyden).4. One possibility is removing the update matrix Q completely and start all over again. As initial estimate we choose x0 given by (A. Let g be the discrete integral equation function. that the Broyden process does not restart with the initial matrix B = B0 .1 New representations of Broyden’s method 79 and hereafter the last p − q columns of both matrices are put to zero.3: The convergence rate of Algorithm 3. Thus take q = 0 and remove all columns of C and D.5) with q = 0. For p = 2. ’ ’(p = 5). because all columns are removed. ’×’(p = 10). The parameter q is set to q := p − 1. We apply Algorithm 3.7570. Only for p = 3 more iterations are needed to converge and for p = 1 the process does not converge at all.5).6) and we set ε = 10−12 . C and D. 10 0 residual g(xk ) 10 −5 10 −10 PSfrag replacements 10 −15 0 5 10 15 20 25 30 35 40 iteration k Figure 3. Example 3.3. Note. ’∗’(p = 4). 4.
7570.6.. Example 3. c p c1 (3.5). applied to the discrete integral equation function (A. ’6’(p = 1)] The next approach of reduction is removing the last column of both matrices.. ’ ’(p = 2).5 with p = 1. Let g be the discrete integral equation function given by (A. We choose n = 100 and thus g(x0 ) ≈ 0. . 10 0 residual g(xk ) 10 −5 10 −10 PSfrag replacements 10 −15 0 5 10 15 20 25 30 35 40 iteration k Figure 3. with q = p − 1 and Z given by (3. . ’ ’(p = 3).4. ’ ’(p = 5). The rate of convergence is given in Figure 3. the latest update of the Broyden process. As initial estimate we choose x0 given by (A. again . So. [’◦’(Broyden).11).11). C and D. . ’∗’(p = 4).4: The convergence rate of Algorithm 3. i.4. ’×’(p = 10). where q is set to p − 1 and Z is given by (3.80 Chapter 3. then implies that C = CZ T = c2 · · · and D = DZ −1 = d2 · · · Z= 0 1 .. 1 . For p = 2 and p = 3 a few more iterations are needed than for the method of Broyden. We apply the Algorithm 3. . 1 0 .. For p = 1 we have no convergence.11) d p d1 . Limited memory Broyden methods matrix step (iv) of Algorithm 3.5). which was already known because this is exactly the same case as the algorithm used in Example 3.4.6) and we set ε = 10−12 .e. For all other values of p the convergence is much slower.4.
this method is a secant method. ’∗’(p = 4). Example 3.5: The convergence rate of Algorithm 3. The process diverges for p = 4 and 5. As initial estimate we choose x0 given by (A. In Section 1. we can remove the ﬁrst two columns of both matrices. with q = p − 1 and Z = I. we have seen that after the method of Broyden diverges for one iteration. We choose n = 100 and thus g(x0 ) ≈ 0. For p = 2 and 3 the convergence is rather slow. this approach is not equal to freezing the Broyden matrix.e. ’ ’(p = 5). ’ ’(p = 2). ’6’(p = 1)] Instead of removing one single update. the new Broyden matrix Bk+1 satisﬁes the secant equation (1.3. the next iteration it makes a large step in the right direction.5.1 New representations of Broyden’s method 81 q := p − 1.4. Let g be the discrete integral equation function given by (A.7570. ’ ’(p = 3). Therefore. i. but now in step (iv) of Algorithm 3.7. where q is set to p − 1 and Z = I.5).. Because the columns are removed after the Broyden step.4 applied to the discrete integral equation function (A. Note that we already have discussed the case p = 1 in the previous two examples. after xk+1 is computed. C and D. [’◦’(Broyden). ’×’(p = 10). . We apply Algorithm 3. that is. Perhaps two updates are in some way related. Besides the new update is still computed and added to the Broyden matrix. The parameter q is set to p − 2. the two oldest updates of the Broyden process. The rate of convergence is given in Figure 3. Only for p = 10 we have fast convergence.3. 10 0 residual g(xk ) 10 −5 10 −10 PSfrag replacements 10 −15 0 5 10 15 20 25 30 35 40 iteration k Figure 3.5).6) and we set ε = 10−12 .25).4 the decomposition of the update matrix is not rewritten (Z = I).
’ ’(p = 3). for p = 8 and p = 10 the process converges slower. .12) C = CZ T = c3 · · · and D = DZ −1 = d3 · · · d p d1 d2 .7570. We apply Algorithm 3.4 applied to the discrete integral equation function (A.. . where q is set to p−2 and Z given by (3. Example 3. . [’◦’(Broyden)... Limited memory Broyden methods If Z ∈ Rp×p is the permutation matrix 0 0 . 1 . As initial estimate we choose x0 given by (A.6: The convergence rate of Algorithm 3. and subsequently the last two columns of C and D are set to zero.8. c p c1 c2 (3. . Z= 1 1 step (iv) of Algorithm 3. The method cannot be applied for p = 1. ’∗’(p = 4).. The rate of convergence is given in Figure 3. 0 0 . Let g be the discrete integral equation function given by (A. with q = p − 2 and Z is given by (3.. .6) and we set ε = 10−12 .12). ’×’(p = 10). ’ ’(p = 2)] .82 Chapter 3. .5). The rate of convergence is rather fast for the smaller values of p.4 implies that 1 .6.5). .. 10 0 residual g(xk ) 10 −5 10 −10 PSfrag replacements 10 −15 0 5 10 15 20 25 30 35 40 iteration k Figure 3. .4.12). ’ ’(p = 8). We choose n = 100 and thus g(x0 ) ≈ 0.
[’◦’(Broyden). Let g be the discrete integral equation function given by (A.4 applied to the discrete integral equation function (A. The parameter q is set to p − 2. In Algorithm 3. We apply Algorithm 3.9. ’ ’(p = 2)] Of course we could think of more fancy approaches for the selection of the columns of C and D. As initial estimate we choose x0 given by (A.1 we can derive that these updates probably are the 2nd. for the example used in this chapter. we arrive to the main work of this thesis. when a reduction has to be applied.7570. 3. ’∗’(p = 4).2 Broyden Rank Reduction method In this section.5). Example 3. we remove the two latest update of the Broyden process.5). The rate of convergence is given in Figure 3.4. ’ ’(p = 3). Perhaps. 4th and 5th update. . we remove the last two columns of the matrices C and D. ’ ’(p = 5). 0 10 residual g(xk ) 10 −5 10 −10 PSfrag replacements 10 −15 0 5 10 15 20 25 30 35 40 iteration k Figure 3.7. etc.4.3. ’×’(p = 10).2 Broyden Rank Reduction method 83 In the ﬁnal example of this section. and again Z = I. A more serious approach would be removing the p − q updates stored in the update matrix that are the smallest in Frobenius norm.6) and we set ε = 10−12 . From Table 3. Again the method cannot be applied for p = 1. with q = p − 2 and Z = I.7: The convergence rate of Algorithm 3. where q is set to p − 2 and Z = I. So. It is Remarkable that for p = 5 and p = 10 the rate of convergence is much lower. We choose n = 100 and thus g(x0 ) ≈ 0. we have represented the Broyden matrix by Bk = B0 + CDT . it is interesting to remove all odd or all even columns of the matrices C and D.
1. .14) A = U ΣV T = σ1 u1 v1 + · · · + σn un vn where U = [u1 . if the limited memory Broyden method converges for a certain value for p it might diverges for a larger value for p. Let the singular value decomposition of A ∈ Rn×n be given by (3. we cannot tell yet whether and when removing an update destroys the structure of the Broyden matrix too much. . In addition. where . cp ] and D = [d1 . Limited memory Broyden methods where C. Clearly. denoted by Q. 2 AT Avi = V ΣU T U ΣV T vi = σi vi . . . .. we ﬁrst recall some basic properties of singular values. . . n. The real nonnegative numbers σ1 ≥ . denotes the l2 matrix norm. . . vn ] are orthogonal matrices and Σ = diag(σ1 . σp = 0 and σp+1 = 0. is deﬁned by p Q = CD T = l=1 cl d T . D ∈ Rn×p . The update matrix. . We store the corrections to the initial Broyden matrix in the columns of the matrices C = [c1 . .10. . un ] and V = [v1 . l (3. . . we have tried to reduce the rank of Q during the Broyden process. . and the column vectors of V are the eigenvectors of A T A and are called the right singular vectors of A. If q < r = rank A and q Aq = k=1 T σ k u k vk then rank B=q min A − B = A − Aq = σq+1 . . The rank of a matrix A equals p if and only if σp is the smallest positive singular value. the rate of convergence might be low. . . We saw that for small values of p often has diﬃculties to converge and if the process succeeds to converge. .84 Chapter 3. dp ]. . . . .e. To introduce a new special limited memory Broyden method. the column vectors of U are the eigenvectors of AAT and are called the left singular vectors of A. The proof can be found in [27].14). i. The following basic theorem yields that the best rankp approximation of a matrix A is given by the ﬁrst p terms of the singular value decomposition. Because. Every real matrix A ∈ Rn×n can be written as T T (3. 2 AAT ui = U ΣV T V ΣU T ui = σi ui . σn ). Theorem 3. . . . ≥ σn ≥ 0 are called the singular values of A. . for i = 1.13) In Section 3.
In other words. In addition. is called the Broyden Rank .3. the singular value decomposition can be reduced to T T Q = σ 1 u 1 v1 + · · · + σ p u p vp . Using the QRdecomposition of D = DR we observe that Q can be written as CDT = C(DR)T = CRT DT = C DT . Now. A problem we still have to deal with is that we do not want to compute the (n × n)update matrix Q explicitly. Therefore. see again [27]. This limited memory Broyden method in which we remove the pth singular value of the update matrix in every iteration. the question is how we can determine the singular values of this matrix. using the singular value decomposition of C = U ΣW T . considering the general Algorithm 3. in fact. the matrix W consists of the eigenvectors of C T C. Then. become a pdimensional problem. So. So the ndimensional problem of the computation of the singular values of Q has. Compute the singular value decomposition of the update matrix Q. This leads us to consider the following reduction procedure for the limited memory Broyden method. The matrix Z in Algorithm 3.4 is given by Z = W −1 R. Because the rank of Q is less or equal to p. where D is orthogonal. This leads to the best rank q approximation of the update matrix Q that is available in the l2 norm. for step (iv) the singular value decomposition of Q is computed and stored in the matrices C and D.4. The theory of singular values can be extended to rectangular matrices. Because W and D are orthogonal matrices. the product D is orthogonal as well. Note that the singular values of C are the square roots of the eigenvalues of C T C which is an (p × p)matrix.2 Broyden Rank Reduction method 85 An interpretation of this theorem is that the largest singular values of a matrix A contain the most important information of the matrix A. we see that C DT = (U ΣW T )DT = (U Σ)(DW )T = C DT . Next choose q and remove the smallest p − q singular value and their corresponding left and right singular vectors from the singular value decomposition of Q. C DT represents an economic version of the singular value decomposition of the update matrix. by setting the last p − q columns of both matrices to zero the last p − q terms of the singular value decomposition are removed. The right singular vectors of Q are obtained using these eigenvectors.
The theoretical justiﬁcation of the BRR method is given in Theorem 3.. and let C = [c1 . cm+1 := (yk + sk − dm+1 := sk / sk . and set m := m + 1. . i) Solve (I − D T C)tk = DT g(xk ) for tk . . C := CRT . For economical reasons it would be better to compute the singular value decomposition Note that in step (v) of the ﬁrst iteration of Algorithm 3. ii) sk := g(xk ) + Ctk . D = [d1 . . p (m := q). . . we compute the singular value decomposition of Q in every iteration. dp ] ∈ Rn×p be initialized by ci = di = 0 for i = 1. . . the matrix R is then set to zero.11 the QRdecomposition is computed of a zero matrix. Choose an initial estimate x0 ∈ Rn . however. . . D := DW.86 Chapter 3. Algorithm 3. Limited memory Broyden methods Reduction (BRR) method. In applications it turns out to be a very eﬃcient algorithm to solve highdimensional systems of nonlinear equations. Set k := 0 and repeat the following sequence of steps until g(xk ) < ε. iii) xk+1 := xk + sk . . D := D. . vi) Compute the SVD of C = U ΣW T . iv) yk := g(xk+1 ) − g(xk ). In Algorithm 3. even if we apply no reduction. . . v) Compute the QRdecomposition of D = DR. set the parameters p and q. .12 where we show under which conditions the method is qsuperlinear convergent. in order to obtain a better understanding of the importance of the updates to the Broyden matrix. .e. cp ]. vii) If m = p then set ci = di = 0 for i = q + 1. The general Algorithm 3. p (m := 0). . Since.11 (The Broyden Rank Reduction method).4 can be replaced by this new algorithm. (σ1 ≥ · · · ≥ σp ) C := U Σ. we can choose any orthogonal matrix Q without disturbing the procedure. i. viii) Perform the Broyden update.11. m T l=1 cl dl sk )/ sk .
B = B − R. It follows that ssT sT ssT ¯ B − Jg (x∗ ) = (B − Jg (x∗ )) I − T + (y − Jg (x∗ )s) T − R I − T . ¯ (3.2 Broyden Rank Reduction method 87 To proof the qsuperlinear convergence of a limited memory Broyden method we observe that before a Broyden update is applied. we estimate the diﬀerence between the new Broyden matrix and the Jacobian of g at x∗ . k ≥ 0. This leads to the following theorem.12. Theorem 3. (1. s s s s (3. x − x∗ }. s = 0}.3.15). where sT ssT ¯ B = B + (y − Bs) T − R I − T .15) Comparable to the proof of the convergence of Broyden’s method. B) where Φ : Rn × L(Rn ) → P{L(Rn )} ¯ ¯ is deﬁned as Φ(x. Theorem 1. is locally and qsuperlinearly convergent at x∗ . the Broyden matrix is reduced using a reduction matrix R.38) is satisﬁed for all B ∈ Φ(x. convex set D ⊂ Rn . B) = {B  R ≤ s . and assume that Jg ∈ Lip γ(D). Bk ). Let x∗ be a zero of g. s s s s s s Thus instead of (1. a general limited memory Broyden method would be linearly convergent to x∗ if the norm R of the reduction R can be estimated by the length of the Broyden step s . ¯ The new updated Broyden matrix B is therefore given by sT ssT ¯ B = B + (y − B)sT /(sT s) = B + (y − Bs) T − R I − T . ¯ for which Jg (x∗ ) is nonsingular.16) According to Theorem 1.58) we obtain ¯ B − Jg (x∗ ) F ≤ B − Jg (x∗ ) F + γ max{ x − x∗ . Jg (x∗ )).17) ¯ So. Then the update function Φ(x. s = 0} and B by (3.24. B) = {B : R F < s . s s s s is well deﬁned in a neighborhood N = N1 × N2 of (x∗ . x − x∗ } + R ¯ (3.26. Let g : Rn → Rn be continuously diﬀerentiable in the open. . and the corresponding iteration −1 xk+1 = xk − Bk g(xk ) with Bk+1 ∈ Φ(xk . because then R ≤ s ≤ 2 max{ x − x∗ .
7570 0.26. So.7570 0.2381 · 10−2 k∗ 21 21 21 21 200 200 200 160 55 200 R 1.4376 · 10−2 1.13.3227 −1. If we want to remove the largest singular value in every iteration. If the quotient grows.7570 g(xk∗ ) 4.7570 0. we cannot control the convergence process.3411 1.0307 0.7570 0.11.9.6) and we set ε = 10−12 . In Figure 3.8. If this quotient becomes of order one.7570 0.9912 · 10+22 4. We apply Algorithm 3. we have to include an intermediate .11 to our test function. and q := p−1. Limited memory Broyden methods The proof of the qsuperlinear convergence of a limited memory Broyden method is identical to the proof of Theorem 1.0155 0.7570 0. We apply Algorithm 3.7570 0. see Figure 3.6068 · 10−13 1. for diﬀerent values of p.3411 1. Let g be the discrete integral equation function given by (A.0158 Table 3. As initial estimate we choose x0 given by (A. Instead of removing the smallest singular value we also could remove other singular values from the SVD of the update matrix Q. .2.7570 0.3411 1.6464 · 10+30 3. .2: Characteristics of Algorithm 3.6256 · 10−3 3. k ∗ .4433 · 10−13 4. We can conclude that the Broyden Rank Reduction method converges as fast as the method of Broyden as long as the quotient σp / sk−1 remains small. It turns out that the method only converges for p ≥ 7. .17).0073 −0.7570 0.4469 · 10−13 3. we remove only the smallest singular value starting from the pth iteration. if γ is replaced by γ + 2. with q = p − 1.5). k = 0. the inequality in (1. for diﬀerent values of p.11 applied to the discrete integral equation function (A.2889 0. method Broyden BRR BRR BRR BRR BRR BRR BRR BRR BRR n 100 100 100 100 100 100 100 100 100 100 p 10 8 7 6 5 4 3 2 1 g(x0 ) 0. Due to (3. . the convergence results for the BRR method are given.5).3511 0.7514 · 10−1 1. Example 3.57) holds. however in those cases the rate of convergence is exactly the same as the rate of convergence of Broyden’s method. In Table 3. we consider the ratio between the removed singular value σp and the size of the Broyden step sk−1 in the kth iteration.4433 · 10−13 4.88 Chapter 3. the BRR method get diﬃculties to achieve the fast convergence and the process starts to deviate from the convergence of the method of Broyden.16) and (3. It is clear that for every p the quotient σp / sk−1 eventually increases.
applied to the discrete integral equation function (A. so that the ﬁrst column of both matrices is moved to the last column. ’ ’(p = 7).5). ’ ’(p = 5).2 Broyden Rank Reduction method 89 10 0 residual g(xk ) 10 −5 10 −10 PSfrag replacements 10 −15 0 5 10 15 20 25 30 35 40 iteration k Figure 3. In other words. ’ ’(p = 7). with q = p − 1. ’ ’(p = 2).4 where q is set to q = p − 1 and the matrix Z . ’∗’(p = 4). ’×’(p = 10).5). ’6’(p = 1)] 10 10 quotient σp / sk−1 10 0 PSfrag replacements 10 −10 0 5 10 15 20 25 30 35 40 iteration k Figure 3. ’×’(p = 10). we apply Algorithm 3. ’+’(p = 6).3. ’6’(p = 1)] step. ’ ’(p = 2). ’ ’(p = 8). ’ ’(p = 8).8: The convergence rate of Algorithm 3. ’+’(p = 6).11.9: The quotient σp / sk−1 for Algorithm 3. After computing the singular value decomposition of the update matrix in step (vi). ’ ’(p = 3). [’◦’(Broyden). [’◦’(Broyden). with q = p − 1.11 applied to the discrete integral equation function (A. ’ ’(p = 5). ’ ’(p = 3). we additionally permute the columns of the matrices C and D. ’∗’(p = 4).
1 0 . .90 Chapter 3. Let g be the discrete integral equation function given by (A. Z= 0 1 .18). we observe that the process diverges shortly after we remove the largest singular value from the singular value decomposition of the update matrix.5). we remove only the largest singular value starting from the pth iteration. In Figure 3. with q = p − 1 and Z given by (3. 1 −1 W R.. .5). ’ ’(p = 3).18) So. we use two steps.11 applied to the discrete integral equation function (A. In these steps two important matrices are involved. ’×’(p = 10). As initial estimate we choose x0 given by (A. Limited memory Broyden methods is equal to Example 3. (3. First we make the matrix D orthogonal and then we compute the singular value decomposition of the matrix C. [’◦’(Broyden). ’∗’(p = 4).. ’ ’(p = 5). . We now have a closer view to the the matrix R of the QRdecomposition of D and the matrix W.4 with q := p − 1 and Z deﬁned by (3.10: The convergence rate of Algorithm 3. .6) and we set ε = 10−12 . ’6’(p = 1)] The computations In order to compute the singular value decomposition of the update matrix Q = CD T . ’ ’(p = 7).18) 10 0 residual g(xk ) 10 −5 10 −10 PSfrag replacements 10 −15 0 5 10 15 20 25 30 35 40 iteration k Figure 3.14..10. We apply Algorithm 3. containing the eigenvectors of C T C. .
. .p−1 /rpp 1/rpp rlp vl . . p − 1. . if d ∈ span{v1 . So. In order to obtain the singular value decomposition of Q. The inverse matrix is then given by / R−1 = and D = DR−1 . vp−1 }. . vp−1 are the right singular vectors of the previous update matrix. the matrix D is nearly orthogonal. 1 −r1. On the other hand. then the last column of D contains the vector d. C DT = (U ΣW T )DT = (U Σ)(DW )T = C DT . vp−1 }. The matrix R is invertible if and only if rpp = 0. . and form an orthonormal set in the Rn . if rpp = 0 ˜ then d ∈ span{v1 . .. . . vp−1 } and dp can be any vector orthogonal to the set {v1 . the eigenvectors of C T C are computed and stored in the (p × p)matrix W. . . . . that is. . vp−1 }. . . The decomposition of the update matrix is rewritten by CDT = C(DR)T = CRT DT = C DT . . ˜ l = 1. where DR is the QRdecomposition of D. . . The ﬁrst p − 1 columns denoted by v1 .3. . . . . rpp ) describes how the new vector d is distributed over the old ’directions’ of the update matrix.2 Broyden Rank Reduction method 91 Note that after p iterations of the BRR process. So. and rpp normalizes the new vector dp after the orthogonalization. In fact rlp = vlT d. R= 1 r1. . thus 1 ˜ d− dp = rpp p−1 l=1 1 −r1p /rpp . Let cdT be the new rankone update to the Broyden matrix. The right singular vectors of Q are obtained by multiplying W from the left by D.. . . . . . . .p−1 rpp where the last column (r1p . . . rpp which is equivalent to the GrammSchmidt orthogonalization of d with respect to the orthonormal set {v1 . . . . . . R has the structure 1 r1p .
in spite of the addition of a rankone matrix. So. Note that the diﬀerence between the Broyden matrices B1 and B2 is small. .92 Chapter 3. . Tables 3. s0 and s1 . W has no particular structure w11 · · · w1p . since m = 1. In the fourth iteration (k = 4) a second direction orthogonal to the ﬁrst is involved. if C DT is already in SVDformat then W = I. In the sixth iteration (k = 6) the fourth and ﬁfth direction are twisted (w44 . . The diagonal of W shows that the update matrix is in singular value decomposition format. the singular values decomposition of the updatematrices does not have to be computed in the ﬁrst iteration. On the other hand. For k = 2 it is trivial that all but the ﬁrst element of the ﬁrst column of R are equal to zero. . In iteration k = 8 the Broyden step lies mainly in this new direction. For k = 3 the element r12  is close to one. Directly after the introduction of a third direction in iteration 7 the second and third direction are twisted. see Table 3. . wp1 · · · wpp . point in more or less the same direction.1. because a rankone perturbation of a matrix can disturb the singular value decomposition completely. This implies that the ﬁrst two Broyden steps.4 can be explained in the following way. a direction is found that is more important than the second direction of the last iteration. Because initially the matrices C and D are zero.3 and 3. Note that the singular values corresponding to these directions are small. remains the principal direction in all subsequent iterations. Note that the ﬁrst direction obtained by Broyden’s method. Limited memory Broyden methods Nothing can be said about the entries of the matrix. . By considering the diagonal of W we can observe whether or not the update to the Broyden matrix changes the form of the singular value decomposition. According to the diagonal of W the two directions have to be adjusted slightly to obtain the singular value decomposition.. The matrix W tells us how we have to turn the columns of D to obtain the right singular vectors of Q. w55  = 1). W = . . After the ﬁrst p iterations of the BRR process.
Algorithm 3.3: The absolute values of the elements of column m of R and the diagonal of W during the BRR process.11 with p = 7 and q = 6.5) (n = 100). applied to the discrete integral equation function (A.3.2 Broyden Rank Reduction method 93 k 2 3 4 5 6 7 8 9 10 11 m 1 r1m  ··· rpm  1 w11  ··· wpp  1 2 3 4 5 6 7 7 7 7 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 Table 3. .
. Limited memory Broyden methods k 12 13 14 15 16 17 18 19 20 21 m 1 r1m  ··· rpm  1 w11  ··· wpp  7 7 7 7 7 7 7 7 7 7 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 Table 3.4: The absolute values of the elements of column m of R and the diagonal of W during the BRR process.11 with p = 7 and q = 6.94 Chapter 3. applied to the discrete integral equation function (A.5) (n = 100). Algorithm 3.
.e. .3. . Algorithm 3. set the parameters p and q. p (m := q). cm+1 := (sk + yk − m cl dT yk )/α. . that Algorithm 3. . Choose an initial estimate x0 ∈ Rn . D := D. p (m := 0). cp ]. . . vii) Perform the Broyden update.. . iv) Compute the QRdecomposition of D = DR. m T T l=1 (cl sk )(dl yk ). i. i) sk := g(xk ) − CD T g(xk ). l=1 l α := −sT yk + k and cm+1 := cm+1 · dm+1 .68) shows. . vi) If m = p then set ci = di = 0 for i = q + 1.11 and Algorithm 3. dm+1 := dm+1 / dm+1 . iii) yk := g(xk+1 ) − g(xk ). ii) xk+1 := xk + sk . v) Compute the SVD of C = U ΣW T . Apart from the computation of the Broyden step and the rankone update to the Broyden matrix. C := CRT . D := DW. Set k := 0 and repeat the following sequence of steps until g(xk ) < ε. . and let C = [c1 . however. . the algorithm is essentially the same and has similar convergence properties.15 (The Broyden Rank Reduction Inverse method).15 are not identical. l=1 l dm+1 := −sk + m dl cT sk .2 Broyden Rank Reduction method 95 The Broyden Rank Reduction Inverse method The reduction process to the update matrix of the Broyden matrix can also be applied in case of the inverse notation of the method of Broyden. . . . The inverse Broyden matrix H can also be written as the sum of the initial matrix H0 and an update matrix Q. dp ] ∈ Rn×p be initialized by ci = di = 0 for i = 1. and set m := m + 1. The ShermanMorrisonWoodbury formula (1. (σ1 ≥ · · · ≥ σp ) C := U Σ. . D = [d1 . . .
’ ’(p = 3). ’ ’(p = 2).9.3 Broyden Base Reduction method We now develop a generalization of the reduction methods described in the previous section.96 Chapter 3. . It turns out that the method converges as fast as the method of Broyden for p ≥ 7. . and q := p − 1. applied to the discrete integral equation function (A. For this purpose we repeat some results. 10 0 residual g(xk ) 10 −5 10 −10 PSfrag replacements 10 −15 0 5 10 15 20 25 30 35 40 iteration k Figure 3. ’ ’(p = 8). So. for diﬀerent values of p. We apply Algorithm 3. ’6’(p = 1)] 3. see Figure 3.8.16. The Broyden matrix after the pth correction can be written B = B0 + Q. k ∗ .13.15. It is clear that for every p the quotient σp / sk−1 eventually increases. For p = 6 just a few more iterations are needed. ’+’(p = 6). ’×’(p = 10). k = 0. the BRR method get diﬃculties to achieve the fast convergence and the process starts to deviate from the convergence of the method of Broyden. Let g be the discrete integral equation function given by (A. . In Figure 3. . ’ ’(p = 5). with q = p − 1. Note that the results are similar to those of Example 3. we consider the ratio between the removed singular value σ p and the size of the Broyden step sk−1 in the kth iteration. ’∗’(p = 4).6) and we set ε = 10−12 .15. As initial estimate we choose x0 given by (A.11: The convergence rate of Algorithm 3.5). we remove only the smallest singular value starting from the pth iteration. If this quotient becomes of order one. . [’◦’(Broyden). Limited memory Broyden methods Example 3.5). ’ ’(p = 7).
. . . ’6’(p = 1)] where the update matrix. Let V be a qdimensional subspace of the Rn with orthonormal basis {v1 . .5). we propose the following approach.. ’ ’(p = 3). it is set equal to zero on the orthogonal complement of V. The new update matrix Q can be decomposed in two (n×q)matrices C and D.e. ’ ’(p = 5). The idea is that Q is approximated with a new matrix Q without destroying the action of the update matrix on the qdimensional subspace V. thus (3. . has at most rank p. By (3. Q = CD T . ’ ’(p = 8). denoted by Q.12: The quotient σp / sk−1 for Algorithm 3. where D equals the matrix V = [v1 . it follows that QV = QV = CV T V = C. As we have seen before Q is the product of two (n × p)matrices C and D. . ’×’(p = 10). . ’ ’(p = 7).15 applied to the discrete integral equation function (A. Q = QV V T . vq } (q ≤ p). i.19) To assure that Q has rank less than q.20) QV ⊥ = 0. ’+’(p = 6). Note that we have projected the update matrix on the qdimensional subspace V. vq ] that consists of the basis vectors.19) and the orthogonality of V.3. ’∗’(p = 4). [’◦’(Broyden). (3. QV = QV . with q = p − 1.3 Broyden Base Reduction method 97 10 10 quotient σp / sk−1 10 0 PSfrag replacements 10 −10 0 5 10 15 20 25 30 35 40 iteration k Figure 3. In order to reduce the rank of Q. ’ ’(p = 2). . .
. . the second condition (3. since C T C = V T QT QV = V T Σ2 V = Σ2 . The set of right singular vectors also forms an orthonormal basis of V. Notice that if V = Im D with dim V = q ≤ p and Q is deﬁned by (3. Thus V ⊥ ⊂ Ker Q and QV ⊥ = QV ⊥ = 0. . then {Ker Q}⊥ = Im QT = Im DC T ⊂ Im D = V. . σp ) implies that the columns of C are orthogonal and there exists an orthogonal (n × p)matrix U such that C = U Σ. . By taking the subspace V = span{v1 . . . . . vp }. . sp−1 } ⊂ span{vl . σp of Q. . The product C DT represents the singular value decomposition of Q. . Limited memory Broyden methods Besides. vp }. We assume that rank Q ≤ p and take the subspace V spanned by the right singular vectors {v1 . . p. Example 3. . . because for u ⊥ V it follows that Qu = QV V T u = 0. . To obtain an orthonormal basis for V we can compute the QLdecomposition of D = V L. we take a qdimensional subspace. The usefulness of this approach depends on the choice of the subspace V. . q < p. . we call this approach the Broyden Base Reduction (BBR) method. . . .19). . vp ] and C := QV. .98 Chapter 3. V that contains the vectors {sp−q . The diﬀerence in decomposition between CD T and C DT is that D is an orthogonal (n × q)matrix and D is not necessarily. .17. We use the QLdecomposition instead of the usual QRdecomposition because then for l = 1. we have that span{sl−1 . . . where V is an orthogonal (n × q)matrix. where Σ = diag(σ1 . .20) is fulﬁlled. . . . We deﬁne D := V = [v1 . the matrix D is given by D= s0 s0 ··· sp−1 sp−1 . vq } with q < p the Broyden Rank Reduction method is obtained. . . corresponding to the largest p singular values σ1 . . To apply a reduction on the columns of D. sp−1 }. . . . . vq } of the subspace V. Because the number of columns of C and D is reduced using an orthonormal basis {v1 . After p iterations of Broyden’s method. . . . . which implies that Q and Q are equal on the whole Rn . and L is a lower triangular (p × p)matrix.
.4. 10 0 residual g(xk ) 10 −5 10 −10 PSfrag replacements 10 −15 0 5 10 15 20 25 30 35 40 iteration k Figure 3.5). Note that after the pth iteration.13: The convergence rate of Algorithm 3. Thus V ⊃ span{s0 . . vp }.22) Z= . .18. .22). ’6’(p = 1)] Another choice for V could be the subspace that contains the ﬁrst p − 1 Broyden steps. .. This reduction is applied whenever the maximum number of p columns in C and D is reached. we can take V = span{vp−q+1 .. We apply Algorithm 3.6. For smaller values of p the rate of convergence is rather low.6) and we set ε = 10−12 .3. . ’∗’(p = 4). . We rewrite the decomposition of Q by CDT = C(V L)T = CLT V T =: C DT . [’◦’(Broyden). 1 0 1 where L comes from the QLdecomposition of D. ’ ’(p = 5). . L. sp−2 }.. with q = p − 1 and Z given by (3.13. . Note that this is not the case in Example 3. . ’ ’(p = 3). (3. In Figure 3. (3.4 applied to the discrete integral equation function (A. Example 3. ’ ’(p = 2).5). As initial estimate we choose x0 given by (A. . we observe that the rate of convergence is high for p = 8. for q := p − 1 and Z given by 0 1 .3 Broyden Base Reduction method 99 So. . . Let g be the discrete integral equation function given by (A. or the process even diverges.21) By removing the ﬁrst (p − q) columns of C and D the update matrix retains the same action on the last q Broyden steps. ’×’(p = 10).
6) and we set ε = 10−12 .4. dp ]. . ’ ’(p = 2). . we observe that the rate of convergence is again high for p = 8. . Nocedal and Schnabel derived a compact representation of the matrices generated by Broyden’s update (3. . . vp−1 . or the process diverges. Because we store the basis from the pth iteration. Let g be the discrete integral equation function given by (A.100 Chapter 3. because the update matrix is rewritten as CDT = C(V R)T = CRT V DT =: C DT . In the subsequent iterations. ’ ’(p = 5). vp−1 . Byrd. the subspace V remains ﬁxed. vp−1 . .1) for systems of nonlinear . with q = p − 1 and Z = R. . ’ ’(p = 3). ’∗’(p = 4).5). . [’◦’(Broyden). As initial estimate we choose x0 given by (A.19.5). ’×’(p = 10).13. .23) ˜ ˜ where V = [v1 . the new correction cdT to the Broyden matrix is subdivided over the p − 1 existing directions. . For smaller values of p the rate of convergence is very low. . we call this method the Broyden Base Storing (BBS) method. .4 The approach of Byrd In 1994. (3. The last column dp of V is orthogonal to the base vectors v1 . In Figure 3.4. After the reduction the ﬁrst p − 1 columns of the matrix C have been adapted and the ﬁrst p − 1 columns of D are still the base vectors v1 . for q := p−1 and Z = R. where R comes from the QRdecomposition of D.14: The convergence rate of Algorithm 3. Limited memory Broyden methods the subspace V is set to Im V where D = V R. Example 3. We apply Algorithm 3. 10 0 residual g(xk ) 10 −5 10 −10 PSfrag replacements 10 −15 0 5 10 15 20 25 30 35 40 iteration k Figure 3. This implies that after every iteration. ’6’(p = 1)] 3. applied to the discrete integral equation function (A.
assume that (3.25) satisﬁes −1 T V0 · · · Vk−1 = I − Yk Rk Sk . Lemma 3. . Therefore we include the derivation in this chapter. (3. .26) holds for k = 1.j = sT yj−1 i−1 0 if i ≤ j. k −1 T −1 T = I − Yk Rk Sk + ρk Yk Rk Sk yk sT − ρk yk sT k k . sk−1 . 0 1/ρk we see that −1 Rk+1 = −1 T Rk −ρk Rk−1 Sk yk . 0 ρk This implies that −1 T I − Yk+1 Rk+1 Sk+1 = I − Yk yk −1 T Rk −ρk Rk −1Sk yk 0 ρk T Sk sT k −1 T = (I − Yk Rk Sk )(I − ρk yk sT ). Proceeding by induction. but also of use in limited memory methods. because in this case the right hand side of (3.26) holds for some k.24) We ﬁrst prove a preliminary lemma on products of projection matrices Vk = I − yk s T k . s T y0 0 0 Now. (3.4 The approach of Byrd 101 equations. Let us deﬁne the (n × k)matrices Sk and Yk by Sk = s0 . The product of a set of k projection matrices of the form (3.26) where Rk is the (k × k)matrix (Rk )i. otherwise. and consider k + 1.26) is given by I − y0 1 T s = V0 . we note that (3.3. T yk s k (3. Proof. These new compact representation is of interest in its own right. yk−1 . Yk = y0 . . .25) that will be useful in subsequent analysis and is also interesting in its own right. If we write the matrix Rk+1 as T Rk Sk yk Rk+1 = .20.
Yk = y0 . we deﬁne Sk = s0 . 2. . we obtain −1 T V0 · · · Vk = (I − Yk Rk Sk )(I − ρk yk sT ) k −1 T = (I − Yk+1 Rk+1 Sk+1 ). .26) for all k. Compact representation of the Broyden matrix As before. Theorem 3.29) . Ck = C0 (I − ρ0 s0 sT ) · · · (I − ρk−1 sk−1 sT ). 1. where Ck and Dk are deﬁned recursively by C0 = B 0 . 0 k−1 (3. . . otherwise. Limited memory Broyden methods Together with the induction hypothesis.102 Chapter 3.31) Dk+1 = Dk (I − ρk sk sT ) + ρk yk sT k k k = 0.21. . yk−1 .1) and the pairs k−1 {si . (3. where ρk = 1/sT sk . .j = sT sj−1 i−1 0 if i ≤ j. It is easy to show (using induction) that Bk can be written as Bk = C k + D k .27) where Nk is the k × k matrix (Nk )i.30) Ck+1 = Ck (I − ρk sk sT ) k k = 0. . . and we assume that the vectors si are nonzero. (3. Then −1 T Bk = B0 + (Yk − B0 Sk )Nk Sk . . and let Bk be obtained by updating B0 k times using Broyden’s formula (3. yi }i=0 .28) Proof. Let B0 be a nonsingular starting matrix. sk−1 . (3. (3. and D0 = 0. which establishes the product relation (3. k Considering ﬁrst Ck we note that it can be expressed as the product of C0 with a sequence of projection matrices. 1. 2. . . .
4 The approach of Byrd 103 for all k = 1.24) and Mk is the (k × k)matrix (Mk )i. Assume now that (3.30). By induction this establishes (3. Now we apply Lemma 3. (3.22. .35) and k−1 the pairs {si . 0 1/ρk −1 which implies that the second matrix on the right hand side of (3. Next we show by induction that Dk has the compact representation −1 T Dk = Y k N k S k . however. to (3. yi }i=0 . . We now derive a compact representation of the inverse Broyden update which is given by Hk+1 = Hk + (sk − Hk yk ) sT Hk k s T H k yk k (3. 0 ρk (3.j = −sT sj−1 i−1 0 if i > j. .37) . which agrees with (3. substituting (3.34) is Nk+1 .33) in (3. Finally. we have that D1 = y0 ρ0 sT . 2.30). (3. we obtain (3.33) holds for some k.33) By the deﬁnition (3.27).32) (3.36) where Sk and Yk are given by (3. and let Hk be obtained by updating H0 k times using the inverse Broyden’s formula (3.25). with y := s in the deﬁnition (3.35) Theorem 3. otherwise.3. that −1 T Nk −ρk Nk −1Sk sk 0 ρk T Nk S k sk = I.32) and (3.33). Then by (3. Then T T Hk = H0 + (Sk − H0 Yk )(Mk + Sk H0 Yk )−1 Sk H0 . .34) Note. 3.31) in order to obtain −1 T Ck = B 0 − B 0 S k Nk S k . −1 T Dk+1 = Yk Nk Sk (I − ρk sk sT ) + ρk yk sT k k −1 T Nk −ρk Nk −1Sk sk 0 0 −1 T −1 = Y k N k S k − ρ k Y k N k s k s T + ρ k yk s T k k = Y k yk T Sk + Y k yk sT k 0 0 0 ρk T Sk sT k = Yk+1 −1 T Nk −ρk Nk −1Sk sk T Sk+1 . Let H0 be a nonsingular starting matrix.33) 0 for k = 1. (3.29).20.
(3. Instead of storing the Broyden steps and their yields. since sk = 0 during the Broyden process. 0. = Yp T. There exists a remarkable way to solve this problem.39) (3. Limited memory Broyden methods Proof. . Because we always start with H0 = −I as initial matrix (3. Note that T is invertible.68) ensures that (Mk + Sk H0 Yk ) is nonsingular.68). 1 p−1 p−1 T For the same reason the (p × p)matrix Sp Yp probably does not have p large singular values. −sT sp−2 ).36). . .38) and arrive at Hp = −I − (DT −1 + CT −1 )(T −1 (T M T )T −1 − (DT −1 )T CT −1 )−1 (DT −1 )T = −I − (D + C)((T M T ) − D T C)−1 DT . −sT s0 . we obtain −1 −1 −1 −1 −1 Hk = B k = B0 − B0 U (I + V T B0 U )−1 V T B0 −1 T −1 T = H0 − H0 (Yk − B0 Sk )(I + Nk Sk H0 (Yk − B0 Sk ))−1 Nk Sk H0 −1 T V T = N k Sk . . . −sT s0 ) and column p − 1 equals (0.41) . . sp−1 } can be more or less linear dependent. T The matrix we want to invert. we only use the representation (3. because the size of the Broyden step sk decreases if the process converges.40) where T = diag(1/ s0 . 1/ sp−1 ). the ﬁrst column of M equals (0. In the pth iteration. . .35) exist. . In that case. the norm of the ﬁrst column of M is much larger than the norm of the last but one column. T T T = H0 − (H0 Yk − Sk )(Nk + Sk H0 Yk − Sk Sk )−1 Sk H0 . .104 Chapter 3. . . This nonsingularity T along with the ShermanMorrison formula (1.37) we have Nk − Sk Sk = Mk . In applications. T By (3. so that (3. (3. In addition. .36) is reduced to T T Hk = −I − (Yk + Sk )(Mk − Sk Yk )−1 Sk . we substitute Sp = DT −1 and Yp = CT −1 into (3.28) and (3. (3. .36) of the inverse Broyden matrix.27) becomes Bk = B 0 + U V T Applying the ShermanMorrisonWoodbury formula (1. can be approximately singular. (Mk −Sk Yk ). the vectors {s0 . . Let U = Y k − B0 Sk . which gives (3. . we have implicitly assumed the nonsingularity of Bk . So.38) D = C = s0 s0 y0 s0 ··· ··· sp−1 sp−1 yp−1 sp−1 = Sp T. we deﬁne Note that since we have assumed that all the updates given by (3.
Removing updates Using this notation. . m and j = 1. and let = 0 for of steps −dT dj−1 i−1 0 if i > j. . . . p (m := 0). viii) If m = p then set cl = dl = 0 for l = q + 1. Note that we changed the meaning of the matrices C and D compared to the other limited memory methods. So. dp ] ∈ Rn×p be initialized by ci = di i = 1. . . vii) Let m := m + 1. . we have to remove a column of both matrices C and D. C = [c1 . .j = − sT i−1 sj−1 si−1 sj−1 if i > j. . . i) Compute for i = 1. . m. We apply the condition that at most p updates to the Broyden matrix can be stored. . set the parameters p and q. The Broyden steps and their corresponding yields are stored according to (3. before a new Broyden step in iteration k = p + 1 can be computed.42) ii) Solve (M − =− m T l=1 dl g(xk ) iii) dm+1 := −g(xk ) + iv) xk+1 := xk + dm+1 . . . . otherwise. vi) dm+1 := dm+1 /dm+1 and cm+1 := cm+1 /cm+1 .23 (The limited memory Broyden method of Choose an initial estimate x0 ∈ Rn .4 The approach of Byrd 105 Note that the product T M T equals the (p × p)matrix (T M T )i.39) and (3. . cp ]. Set k := 0 and repeat the following sequence until g(xk ) < ε. . Mi. the method of Broyden can simply be transformed into a limited memory method. . D = [d1 .j = m T l=1 dl cl )tk Byrd). 0 = −dT dj−1 i−1 0 if i > j. (3. .40) in the matrices C and D. for tk . Algorithm 3. . Instead of Bk = B0 + CDT we have in . . m l=1 (cl + dl )tk . v) cm+1 := g(xk+1 ) − g(xk ). otherwise. otherwise. p (m := q). . .3. .
. d m−1 sk sT k sk sk . . = T . . . . cm ] and D = [d1 . . Limited memory Broyden methods Algorithm 3. . v = sk · em is the unique solution of the equation N v = D T sk or. Proof. = B0 sk + (C − B0 D) sk em sk T dT sk d0 dm−1 0 .43). . dm ] are both (n × m)matrices and N is given by (3.23 is still a secant method. . . m. then Bk+1 satisﬁes the secant equation (1. )N −1 . . s T T dm−1 dm−1 dm−1 sk k sT k sk sk sk v dT sk 0 . m and j = 1.24.. . otherwise. If sk / sk is stored in column m of D and yk / sk is stored in column m of C.j = dT dj−1 i−1 0 if i ≤ j.25). . Let Bk+1 be given by (3. . . . . . Note that the dimensions of N and M depend on m. where C = [c1 . (3. . Theorem 3. . Because N is nonsingular. . and so these dimensions are variable.43) where the matrix N is given by (N )i. . Because after the reduction step (vii) the latest (scaled) Broyden step s k and its yield yk are stored in column q of the matrices C and D. Algorithm 3.23 the matrix Bk given by dT 1 . . .44) for i = 1. . equivalently dT d0 · · · 0 . Therefore Bk+1 sk = B0 sk + (C − B0 D)N −1 DT sk = B 0 sk + y k − B 0 sk = y k .44).106 Chapter 3. dT m B k = B 0 + ( c1 · · · c m + d1 · · · dm (3.
Clearly. for every value of p the method needs more iterations than the method of Broyden.23. Only for p = 3 and p = 4 the convergence is reasonably fast.23 applied to the discrete integral equation function (A.15. For p = 1 the method directly diverges. ’ ’(p = 2). ’∗’(p = 4). 10 0 residual g(xk ) 10 −5 10 −10 PSfrag replacements 10 −15 0 5 10 15 20 25 30 35 40 iteration k Figure 3. for diﬀerent values values of p. Note that for p = 2 we have again the same result as we have for p = 1 in case of the other limited memory Broyden methods.3.15: The convergence rate of Algorithm 3. The rate of convergence is given in Figure 3.6) and we set ε = 10−12 .4 The approach of Byrd 107 We observe that in case of p = 1 the update to the Broyden matrix is directly removed at the end of every iteration step. Example 3. for diﬀerent values of p. ’ ’(p = 3). Algorithm 3.25.23 equals dynamical simulation for p = 1. As initial estimate we choose x0 given by (A. ’6’(p = 1)] . Let g be the discrete integral equation function. given by (A. Therefore.5). the parameter q is set to p − 1. [’◦’(Broyden). ’ ’(p = 5). ’×’(p = 10).5). We apply Algorithm 3.
Limited memory Broyden methods .108 Chapter 3.
Part II Features of limited memory methods 109 .
.
the Broyden Rank Reduction method. For nonlinear functions g and f.3. x1 = f (x0 ). where A ∈ Rn×n and b ∈ Rn . 1. the ﬁrst step of a limited memory Broyden method is a dynamical simulation step if the initial Broyden matrix is given by B0 = −I. . g(x) = f (x) − x. So. where (A + I) is the Jacobian of the period map f. We consider the function g as the diﬀerence between a period map f : Rn → Rn and the identity. will succeed to approximate a zero x∗ of g. As we pointed out in Section 1.Chapter 4 Features of Broyden’s method In Part I we provided the theoretical background to the limited memory Broyden methods. In this chapter we investigate whether the characteristics of the function g can tell us whether or not our main algorithm. for every k = 0. A zero x∗ of the function g is a ﬁxed point of the function f. We discussed the derivation and convergence of the method of Broyden and indicated the freedom in the algorithm to reduce the amount of memory to store the Broyden matrix and still preserving the fast convergence. this suggests that the update matrix approximates in some sense 111 . . . 2. We know that Bk+1 sk = yk = Ask . . g(x) = Ax + b. Therefore. and that the Broyden matrix Bk+1 is the sum of B0 and the update matrix CD T . that is. suppose that g is an aﬃne function. the equality CDT sk = (A + I)sk holds. In addition.
Now assume that a = −1. then the Jacobian of the period map f : Rn → Rn is given by Jf (x∗ ) = A + I = (a + 1)I. (ax0 )T (ax0 ) sT s0 x0 x0 0 −1 s1 = −B1 g(x1 ) The next Broyden step is given by x0 xT 0 xT x0 0 −1 = − − I + (a + 1) = − −I + a(a + 1)x0 a + 1 x0 xT · T 0 a(a + 1)x0 a x0 x0 = a(a + 1)x0 − (a + 1)2 x0 = −(a + 1)x0 . with nonzero a ∈ R.14 the method of Broyden converges in less that 2d0 = 2 iterations. This was to be expected as well. (AH0 )g(x0 ). and that the exact solution is found in just one single iteration of the Broyden process. for a = 0. Note that in case of a = −1 the Jacobian of f equals the zero matrix. Features of Broyden’s method the Jacobian of f. because the zeroth Krylov subspace is given by Z0 = span {g(x0 ). Also. . The unique solution of the system g(x) = 0. although A + I = (a + 1)I has full rank. the parameter p of the limited memory Broyden method could be chosen rank Jf (x∗ ) + 1. Note that. . Jf (x∗ ) = Jg (x∗ ) + I.112 Chapter 4. the method of Broyden uses the information of the Jacobian Jf (x∗ ) in only one direction. and has dimension d0 = dim Z0 = 1. . . Because g(x1 ) = a(a + 1)x0 the new Broyden matrix becomes B1 = B 0 + g(x1 )sT a(a + 1)x0 · (ax0 )T x0 xT 0 = −I + = −I + (a + 1) T 0 . . the ﬁrst Broyden step becomes −1 s0 = −B0 g(x0 ) = ax0 . is x∗ = 0 and the Jacobian of g is given by Jg (x∗ ) = aI. ag(x0 ). .} = span {g(x0 ). According to Corollary 2. which has rank zero. Example 4. that is. We investigate this conjecture with an example. This is clear since the initial Broyden matrix equals the Jacobian (B0 = A = −I) and the method of Newton converges in one iteration on linear systems. . on the subspace spanned by x0 . Obviously. Let A = aI.1. (AH0 )2 g(x0 ). it has singular value (a + 1) with multiplicity n. Let x0 = 0 be arbitrarily given.} = span {x0 }. a2 g(x0 ). and x2 = x1 + s1 = (a + 1)x0 − (a + 1)x0 = 0. The last update matrix before the process converges exactly is given by T T CDT = (a + 1)x0 x0 /(x0 x0 ). and thus x1 = x0 + ax0 = (a + 1)x0 . Let g : Rn → Rn be given by g(x) = ax.
see Section 1. As expected from Section 1. (4. (AH0 )l(x0 ). The rank of K0 can be approximated by the number of relatively large (for example. . We investigate the connection between d0 = dim Z0 an the choice of p for the BRR method to solve g(x) = 0. .5). depending on the relative nonlinearity γ rel . . l(x0 ). The dimension of a Krylov space cannot always be determined exactly. Therefore. However. we deﬁne the zeroth Krylov matrix by K0 := K(l(x0 ). A) = v v Av Av (4. ≥ 10−15 ) singular values of K0 . with dimension n = 20.3) An−1 v An−1 v ··· . to obtain a more continuous description of the rank of K0 . where K(v. we can derive the singular values of K0 .1 Characteristics of the Jacobian 113 4. we compare in this section the convergence properties of the method of Broyden for several test functions and their linearizations around x∗ . We deﬁne the aﬃne function l : Rn → Rn by l(x) = g(x∗ ) + Jg (x∗ )(x − x∗ ) = Jg (x∗ )x − Jg (x∗ )x∗ . . . AH0 ). (AH0 )j−1 l(x0 ). The vector (AH0 )j l(x0 ).2 it takes Newton’s method 3 iterations to .1) (4. but also lie close to the subspace spanned by these j vectors. .4. and write l(x) = Ax + b where A = Jg (x∗ ) and b = −Jg (x∗ )x∗ .3. The rank of K0 equals the dimension of Z0 . The discrete integral equation function We consider the function g : Rn → Rn as given by (A. Therefore. for example. given by Z0 = span {l(x0 ). can still be linearly independent of the ﬁrst j vectors in the Krylov sequence.}. (AH0 )2 l(x0 ).1 Characteristics of the Jacobian In a small neighborhood of the solution x∗ . The initial Broyden matrix is set to minus the identity.2) In this section we compute both the singular values of the matrix A + I and of the zeroth Krylov space for the linearized problem. the nonlinear function g can be considered as approximately linear. .
’×’(p = 10). the quotient σp / sk−1 decreases rather because the size of the Broyden step increases than because the singular value σp gets smaller. applied to the discrete integral equation function (A. ’+’(p = 6). cf.1 we have plotted the rate of convergence for the method of Broyden and the BRR method for diﬀerent values of p. with q = p−1. Thereafter the residual g(xk ) changes very little from iteration to iteration and the process slowly diverges.11. Clearly the matrix Jf (x∗ ) has full rank. see again Figure 4. The graph of the singular values of Jf (x∗ ) describes an exponential decay to 2. In Figure 4.3). So. The singular . The method of Broyden needs 21 iterations to obtain the same order of residual. starting from the initial condition x0 given by (A. ’ ’(p = 8). It turns out that the BRR method needs also 21 iterations for p ≥ 7.2. Section 3. ’ ’(p = 2).2.1. Features of Broyden’s method 10 0 residual g(xk ) 10 −5 PSfrag replacements 10 −10 quotient σp / sk−1 10 −15 0 5 10 15 20 25 30 35 40 iteration k 10 10 quotient σp / sk−1 10 0 PSfrag replacements residual g(xk ) 10 −10 0 5 10 15 20 25 30 35 40 iteration k Figure 4. ’∗’(p = 4). ’ ’(p = 5).5) and additionally the quotient σp / sk−1 . ’ ’(p = 7). For smaller values of the parameter p the residual diverges from the path of Broyden’s method when the quotient σp / sk−1 has become too large. ’6’(p = 1)] converge to a residual of g(xk ) < 10−12 .1: The convergence rate of Algorithm 1. [’◦’(Broyden). where we took n = 50.19 and Algorithm 3. we have plotted the singular values of both Jf (x∗ ) and K0 .6). In Figure 4. deﬁned by (4. ’ ’(p = 3).114 Chapter 4.
2). see Example 1.3 2.2: The singular values of Jf (x∗ ) (left) and K0 (right) in case of the discrete integral equation function (A. In Figure 4.1 2.15 10 2. This explains the same rate of convergence for diﬀerent dimensions of the nonlinear problem.3: The singular values of Jf (x∗ ) (left) and K0 (right) in case of the discrete integral equation function (A.3.3 2. 2. n = 20. The method of Newton needs again 3 iterations to converge to a . Note that it is not evident to determine the dimension of the zeroth Krylov space.5). We observe that the number 2. n = 50. for the linearized system the method of Broyden would need as many iterations for n = 20 as it needs for n = 50.4. we have plotted the singular values of the same matrices Jf (x∗ ) and K0 for a larger dimension (n = 50).2 2.1 2.25 2. of large singular values of K0 is about the same as in case of n = 20.5).05 2 0 10 20 30 40 50 0 10 20 30 40 50 10 −15 −10 10 0 10 −5 Figure 4. The discrete boundary value function We consider the function g : Rn → Rn as given by (A.2 2.05 2 0 5 10 15 20 0 5 10 15 20 10 −10 10 0 10 −5 10 −15 Figure 4.15 2. with dimension n = 20.1 Characteristics of the Jacobian 115 values of K0 describe a fast linear decay until the 9th singular value. The remaining singular values are of the same order.21. So.25 2.
It turns out that the BRR method fails to converge for every value of p. ’6’(p = 1)] The singular values of Jf (x∗ ) are all diﬀerent and are nicely distributed over the interval [1. [’◦’(Broyden). 10 5 residual g(xk ) 10 0 PSfrag replacements 10 −5 10 −10 quotient σp / sk−1 0 10 20 30 40 50 60 iteration k quotient σp / sk−1 10 0 10 −5 PSfrag replacements residual g(xk ) 10 −10 0 10 20 30 40 50 60 iteration k Figure 4.5. with q = p−1.116 Chapter 4. 5]. ’ ’(p = 5). All singular values of K0 are larger than 10−15 and more than 10 singular values are even larger than 10−5 . starting from the initial condition x0 . see Figure 4. ’∗’(p = 4).2) and additionally the quotient σp / sk−1 . ’ ’(p = 2). the method of Broyden would need almost all 2n iterations to converge on the linearized problem. Features of Broyden’s method residual of g(xk ) < 10−10 .4 the residual increases directly after the quotient σp / sk−1 has become too large. As can be seen in Figure 4.19 and Algorithm 3. Although one might consider K0 not to have full rank it is rather close to be nonsingular. given by (A. So.4. ’×’(p = 10). applied to the discrete boundary value function (A. ’ ’(p = 3). Figure 4.4: The convergence rate of Algorithm 1.11.3). The method of Broyden needs 60 iterations to obtain the same order of residual. .
only two singular values of the matrix K0 are signiﬁcant. For p = 1 and p = 2 the BRR method fails to converges.2). For larger values of p. starting from the initial condition x0 given by (A. the Jacobian Jf = Jg + I at the solution x∗ is a blockdiagonal matrix. with dimension n = 20.9. .1 Characteristics of the Jacobian 117 5 4 10 0 10 3 2 1 0 5 10 15 20 −5 10 −10 0 5 10 15 20 Figure 4. . see Figure 4.5: The singular values of Jf (x∗ ) (left) and K0 (right) in case of the discrete boundary value function (A. . the dimension of the zeroth Krylov space Z0 is 2 and the method of Broyden would need at most 4 iterations to solve the linearized system. −1 1 Therefore.7). the BRR method still needs 11 iterations. The extended Rosenbrock function is a system of n/2 copies x of the Rosenbrock function.7. . = σn/2 ≈ 21. .4183. So. the Jacobian Jf (x∗ ) has two diﬀerent singular values. The extended Rosenbrock function We consider the function g : Rn → Rn as given by (A. Clearly. n = 20. Note that for p = 5 the quotient σp / sk−1 does not exceeds 10−15 .4. that is. the BRR method has a high rate of convergence and is even faster than the method of Broyden. however. = σn ≈ 0. with blocks given by −19 10 . . .5134 and σn/2+1 = .6. The method of Newton needs 3 iterations to converge to a residual of g(xk ) < 10−12 . If we take p larger than 5 the rate of convergence of the BRR method does not increase. The unique solution of the extended Rosenbrock function is the vector ∗ = (1. Note that the BRR method can approximate the zero of the extended Rosenbrock function if we take p = 3. see Figure 4. see Figure 4. . The method of Broyden needs 18 iteration to obtain the same order of residual.7). . 1). that is. σ1 = . So.6. see Example 1.
the method of Broyden fails to converge to the zero of the extended Powell singular function. . with q = p−1. The unique solution of the extended Powell singular function is the zero vector. With the same initial condition.19 and Algorithm 3. ’ ’(p = 3). . x∗ = (0.9).10).6: The convergence rate of Algorithm 1.7) and additionally the quotient σp / sk−1 .118 Chapter 4. ’6’(p = 1)] The extended Powell singular function We consider the function g : Rn → Rn as given by (A. starting from the initial condition given by (A. [’◦’(Broyden). 0). ’ ’(p = 5). ’∗’(p = 4). for every value of p. . . with dimension n = 20. It turns out that the method of Newton converges linearly in 23 iterations to a residual of g(xk ) < 10−12 . as does the BRR method. applied to the extended Rosenbrock function (A. Features of Broyden’s method 10 5 residual g(xk ) 10 0 10 −5 PSfrag replacements 10 −10 quotient σp / sk−1 10 −15 0 5 10 15 20 25 iteration k 10 5 quotient σp / sk−1 10 0 10 −5 PSfrag replacements residual g(xk ) 10 −10 10 −15 0 5 10 15 20 25 iteration k Figure 4.11. The Jacobian Jf = Jg + I at the solution x∗ is a . ’ ’(p = 2).
σn/4+1 = . . see Figure 4. .9).7).4.8. σ1 = . = σn ≈ 0. n = 20.7: The singular values of Jf (x∗ ) (left) and K0 (right) in case of the extended Rosenbrock function (A. .0590.0000. . The dimension of the zeroth Krylov space Z0 is about 3 and the method of Broyden would need at most 6 iterations to solve the linearized system. σn/2+1 = . .8: The singular values of Jf (x∗ ) (left) and K0 (right) in case of the extended Powell singular function (A. The Jacobian Jg . . however. 1 0 0 1 The Jacobian Jf is nonsingular and has four diﬀerent singular values. = σn/4 ≈ 10. = σn/2 ≈ 3. with blocks given by 2 10 0 1 0 0 0 0 0 0 (5) − (5) .2501. . that is.3064. n = 20. is singular . singular values of the matrix K0 are signiﬁcant. blockdiagonal matrix. Only three 10 8 10 6 4 2 0 0 5 10 15 20 10 −15 −5 10 0 10 −10 0 5 10 15 20 Figure 4. . and σ3n/4+1 = .1 Characteristics of the Jacobian 119 20 15 10 10 0 10 −5 10 5 0 0 5 10 15 20 10 −10 −15 0 5 10 15 20 Figure 4. = σ3n/4 ≈ 1.
Lemma 2.6.. under certain conditions. .. where d0 = dim Z0 . We ﬁrst recall the main results of Chapter 2 In many problems.14 we know that the method of Broyden needs at most 2d0 iterations on linear systems. .2 Solving linear systems with Broyden’s method As we have seen in Section 4. it is not necessary that the Broyden matrix approaches the Jacobian even if the linear system (4. components of the function are linear or nearly linear. Therefore it is interesting to consider the method of Broyden on linear systems. According to Lemma 2. 1). In this section the vector b is chosen to be the zero vector. Features of Broyden’s method at the zero x∗ of the function g and the theory of Sections 1.. Theorems 2.5) . 4. the Broyden matrix and the Jacobian only coincide in one single direction. (4. . .7.1 the linearized problem gives more insight in the success of the method of Broyden on the original nonlinear problem. A= . Lemma 2. 1 λ (4.3 and 3. given by λ 1 .120 Chapter 4. By Corollary 2. . In this section we illustrate the development of the Broyden matrix along the Broyden process. As initial Broyden matrix we choose again B0 = −I and in all examples we choose the initial condition x0 = (1.18 Broyden’s method needs at most 2d0 iterations for all linearly translated systems of (4. . .4) can be predicted by the sum of the dimensions of the Krylov spaces Z0 and Z1 . Another conclusion of Chapter 2 is that although the diﬀerence between the Broyden matrix and the Jacobian does not grow. .2 cannot be applied. we consider the method of Broyden applied to linear systems where A has a Jordan canonical block form. Therefore.12 show that the number of iterations needed by the method of Broyden to converge exactly on linear problems Ax + b = 0. It has been proved that.11 and 2.4). .4) is solved. One Jordan block Let us consider the matrix A ∈ Rn×n .
1). .4).5 we have seen that.9. We have plotted the Jacobian A in Figure 4.4. as well as the Broyden matrix at several iterations. The matrix B40 is the ﬁnal matrix before the method of Broyden solves (4. We choose the dimension n = 20 and apply the method of Broyden. Jacobian B0 B3 B4 B5 B6 B10 B15 B20 B25 B30 B40 Figure 4. If x0 is given by x0 = (1. The structure of the Jacobian can be clearly distinguished.9: The Jacobian (4.4). .5) of the linear system. In the same ﬁgure we have also plotted the initial Broyden matrix (B0 = −I). . the dimension of the zeroth Krylov space Z0 equals d0 = n. for n = 4 the method of Broyden needs 8 iterations to converge. the initial Broyden matrix and the Broyden matrix at subsequent iterations (n = 20). Black corresponds to the value −1 and white to the value 2.2 Solving linear systems with Broyden’s method 121 The vector b is set to zero and we choose λ equal to 2. It takes Broyden’s method at most 2d0 = 2n iterations to solve (4. The residual g(xk ) oscillates around 101 for 39 iterations and then suddenly drops to 10−12 in the 40th iteration step. . In Example 2. indeed. .
Thereafter the elements of the two main diagonals are adjusted and the oﬀdiagonal elements are pressed to zero. the convergence behavior of Broyden’s method is followed for about 2p iterations.11 with p = 1. That is. with p = 20 (n = 20). this recovery starts at the bottom right side of the matrix. where the elements of the Broyden matrix should be zero a pattern arises. however. We have applied Algorithm 3. The other elements of the Broyden matrix are kept approximately zero. Black corresponds to the value −1 and white to the value 2.11. see Figure 4. This implies that there is only one singular value available to update the initial Broyden matrix B0 in order to approximate the Jacobian (4. The BRR method is only converging for p = 20 but not as fast as the method of Broyden itself.2468 · 10−11 . After about 25 iterations the upper left corner of the matrix is reached.12.11 with p = 2.5). after a few iterations the diagonal structure is restored in the lower right corner. see Figure 4. We have seen in Figure 4.10: The Broyden matrix at three diﬀerent iterations of Algorithm 3.10.11 to solve the linear system with the Jacobian of (4. see Figure 4.5). For smaller values of p the BRR method fails to converge. Iteration after iteration the update to the Broyden matrix involves a next entry of the main diagonal. Due to our choice of the Jacobian. We now apply Algorithm 3. starting from its initial matrix. We have plotted the Broyden matrix at four iterations of the BRR process. In 60 iterations a residual is reached of 3. The update process . Features of Broyden’s method Clearly.9 that the method of Broyden mainly concerns the main diagonals of the matrix.122 Chapter 4. disturbs the structure of the Broyden matrix. The BRR method.11. Again the update process starts at the lower right corner of the matrix. Broyden’s method tries to recover the structure of the Jacobian. That is. instead of creating the upper subdiagonal. but then the process diverges. We see that the Broyden matrix is also developing a subdiagonal. However. We apply Algorithm 3. the initial Broyden matrix and the initial estimate x0 . B30 B40 B50 B60 Figure 4. We have plotted the Broyden matrix at four iterations of the BRR process. for every value of p.
with p = 1 (n = 20).4. The vector b is the zero vector. . . two spots are created on the diagonal and destroy the banded structure of the Broyden matrix. In Example 2. 0 A22 (4. . It turns out that the process fails to reconstruct the Jacobian again. are Jordan blocks (4.5) with the same eigenvalue λ = 2.4). with p = 2 (n = 20). . given by A= A11 0 . A22 ∈ Rn/2×n/2 . 1) the dimension of the zeroth Krylov space Z0 equals d0 = n/2.6) where both A11 . It takes Broyden’s method at most 2d0 = n iterations to solve (4. Black corresponds to the value −1 and white to the value 2. B10 B20 B30 B40 Figure 4.11: The Broyden matrix at three diﬀerent iterations of Algorithm 3. As said before the process fails to converge. Two equal Jordan blocks We assume that n is even and consider the matrix A ∈ Rn×n . Black corresponds to the value −1 and white to the value 2.11. If x0 is given by x0 = (1.5 we have seen that for n = 4 the method of Broyden needs 4 iterations to converge. .11. starts as expected in the lower right corner of the matrix. Instead.12: The Broyden matrix at three diﬀerent iterations of Algorithm 3.2 Solving linear systems with Broyden’s method 123 B10 B20 B30 B40 Figure 4.
13: The Jacobian (4. Broyden’s method tries to recover the structure of the Jacobian. starting from the initial matrix. 1). . But. Here the method of Broyden needs 20 iterations to converge. The Broyden matrix is plotted at several iterations. .14. If the initial condition is given by x0 = (1.4). see Figure 4. and g(x k ) oscillates before it drops to 10−12 . . two bands arise that connect both (n/2)dimensional systems. The vector b is the zero vector. .13. given by A= A11 0 . It turns out that the BRR method is as fast as Broyden’s method for p ≥ 11.6) of the linear system. A22 ∈ Rn/2×n/2 . Features of Broyden’s method We choose the dimension n = 20 and plot the Jacobian A. For p ≤ 10 the process eventually diverges. The Broyden matrix is plotted at several iterations. the dimension of the zeroth Krylov space Z0 equals d0 = n. . The matrix B20 is the ﬁnal matrix before the method of Broyden solves the problem. the initial Broyden matrix and the Broyden matrix at subsequent iterations (n = 20). It takes Broyden’s method at most 2d0 = 2n iterations to solve (4. Black corresponds to the value −1 and white to the value 2. 0 A22 (4. The matrix B40 is the ﬁnal matrix before the method of Broyden solves the problem. As for the previous example. Note that the Broyden matrix again is developing a subdiagonal. Iteration after iteration the update to the Broyden matrix involves a next entry of the main diagonal. are Jordan blocks (4. We choose the dimension n = 20 and plot the Jacobian A.124 Chapter 4. in addition. see Figure 4. Jacobian B5 B10 B20 Figure 4. Two diﬀerent Jordan blocks We assume that n is even and consider the matrix A ∈ Rn×n .7) where both A11 .5) but with diﬀerent eigenvalues λ1 = 2 and λ2 = 3. The process starts at the bottom right side of the matrix.
7) of the linear system.8) A= . The parameter δ varies between zero and one. Here the same description of the computations is valid as for the ﬁrst matrix. 1) the exact dimension of the zeroth Krylov space Z0 equals d0 = n.. 4.15 we have plotted the Jacobian and several .3 Introducing coupling 125 Jacobian B10 B30 B40 Figure 4. δ λ The vector b is set to zero and we choose λ equal to 2. given by λ δ .4.3 Introducing coupling In the previous section we have seen that the method of Broyden ﬁnds out when a system of equations can be split into several independent systems of equations and that it tries to solve the independent systems simultaneously. (4. at the end of the process these bands are eventually removed. . Black corresponds to the value −1 and white to the value 3. . We have to choose p = 20 for the BRR method to converge. starting from the initial matrix. .14: The Jacobian (4. the initial Broyden matrix and the Broyden matrix at subsequent iterations (n = 20). However. Broyden’s method tries to recover the structure of the Jacobian. We consider the matrix A ∈ Rn×n . As in the previous example the two bands are developed that connect the two (n/2)dimensional systems. . In Figure 4.. For smaller values of p the BRR method indeed diverges.0 · 10−4 the method of Broyden needs 8 iterations to converge to a residual of 9. it turns out that for small values of δ the method of Broyden needs less than 2n iterations.822·10−16 . . With x0 given by x0 = (1. . However. The method of Broyden needs 40 iterations to converge. . .. . If δ = 1. if δ = 0.
0·10−2 . If δ = 1. This is enough for the method of Broyden to ﬁnd the solution. Features of Broyden’s method Broyden matrices of the process. The Broyden matrix in the 10th iterations has recovered four elements on the diagonal.0 · 10−4 . Jacobian B6 B8 B10 Figure 4. with δ = 1. the BRR method has exactly the same rate of convergence for . the process has to recover some of the diagonal elements of the Jacobian. then the method of Broyden needs 14 iterations to converge to a residual of 3. Black corresponds to the value −1 and white to the value 2.8) of the linear system. The oﬀdiagonal elements are still small and therefore not distinguishable.0 · 10−3 . and the Broyden matrix at subsequent iterations (n = 20). After 8 iterations. If δ = 1.4698 · 10−14 . only three elements on the diagonal are ’recovered’. with δ = 1. see Figure 4.8) of the linear system. and the Broyden matrix at subsequent iterations (n = 20).0 · 10−3 the method of Broyden needs 10 iterations to converge to a residual of 1. Similarly to the previous cases. but evidently this is enough for Broyden’s method to ﬁnd the solution. see Figure 4. Black corresponds to the value −1 and white to the value 2.16. The BRR method turns out to be convergent for every value for p. Remarkably. before it ﬁnds the solutions.17. except for p = 2.126 Chapter 4. Jacobian B4 B6 B8 Figure 4.16: The Jacobian (4. Simulations show that the BRR method is convergent for every value for p. the ﬁnal number of recovered elements is 6.15: The Jacobian (4. Here.2768 · 10−13 .
0 · 10−2 .8) of the linear system.17: The Jacobian (4.2.e. Clearly. The situation is comparable to the one described in Section 4. and the Broyden matrix at subsequent iterations (n = 20). Black corresponds to the value −1 and white to the value 2.. For smaller values of p the rate of convergence is low or the process diverges. 47 iterations are needed instead. The Broyden matrix starts to recover the 14th elements of the diagonal. with δ = 0.1. For diﬀerent values of δ we have plotted the rate of convergence of the method of Broyden when solving g(x) = 0. Jacobian B6 B10 B14 Figure 4. i. and even for p = 20 the rate of convergence is lower. where we considered a Jacobian consisting of one canonical Jordan block. The plots in Figure 4.20.4. see Figure 4. The BRR method fails to converge for p < 20.19 are similar to those of Figure 4. If δ = 0. .9.3 Introducing coupling 127 p ≥ 6.5 the method of Broyden needs 40 iterations to converge. when the process converges. Black corresponds to the value −1 and white to the value 2.18: The Jacobian (4. Jacobian B10 B20 B30 Figure 4.8) of the linear system. and the Broyden matrix at subsequent iterations (n = 20).18. the oﬀdiagonal elements of the Jacobian become important. see Figure 4. with δ = 1.1 the method of Broyden needs 30 iterations to converge. The BRR method only converges equally fast for p ≥ 12. If δ = 0.
128
Chapter 4. Features of Broyden’s method
Jacobian
B20
B30
B40
Figure 4.19: The Jacobian (4.8) of the linear system, with δ = 0.5, and the Broyden matrix at subsequent iterations (n = 20). Black corresponds to the value −1 and white to the value 2.
10
0
residual g(xk )
10
−5
PSfrag replacements
10
−10
10
−15
0
5
10
15
20
25
30
35
40
iteration k Figure 4.20: The rate of convergence of Broyden’s method solving (4.4) where A is given by (4.8) for diﬀerent values of δ. [’◦’(δ = 1.0 · 10−4 ), ’×’(δ = 1.0 · 10−3 ), ’+’(δ = 1.0 · 10−2 ) ’∗’(δ = 0.1), ’ ’(σ = 0.5)]
4.4
Comparison of selected limited memory Broyden methods
In this section we compare the most promising limited memory Broyden methods, derived in Chapter 3. For every test function of Appendix A and every linear system discussed in Section 4.2 we applied the methods for p = 1, . . . , 20. The results we have put in Tables 4.2  4.6. The results of the discrete boundary value function (A.2) and the extended Powell singular function (A.9) are not included because all limited memory Broyden methods fail to converge for these functions. In the tables and the description of the results we have used abbreviations for the limited memory Broyden methods, as listed in Table 4.1. In all tables the methods are listed vertically and diﬀerent values of p are listed in horizontal direction. The initial condition x0 as well as the dimension
4.4 Comparison of selected limited memory Broyden methods
129
UPALL UP1 BRR BRRI BRR2 BBR BBS BYRD
The Broyden Update Restart method (Algorithm 3.4 with q = 0) The Broyden Update Reduction method (Algorithm 3.4 with q = 1 and Z = I) The Broyden Rank Reduction method (Algorithm 3.11 with q = p − 1) The Broyden Rank Reduction Inverse method (Algorithm 3.15 with q = p − 1) The Broyden Rank Reduction method (Algorithm 3.11 with q = p − 2) The Broyden Base Reduction method (Algorithm 3.4 with q = p − 1 and Z given by (3.22)) The Broyden Base Storing method (Algorithm 3.4 with q = p − 1 and Z = R) The limited memory Broyden method proposed by Byrd et al. (Algorithm 3.23 with q = p − 1)
Table 4.1: The abbreviation of several limited memory Broyden methods.
n are uniform for every simulation in a table. For every combination of the method and parameter p, the number of iterations of the simulation are given and the variable R that represents the rate of convergence of the process to obtain a residual of g(xk ) < ε. Note that if R is negative the ﬁnal residual g(xk∗ ) is larger than the initial residual g(x0 ) . If R is large, then the method has a high rate of convergence. If a process fails to converge this is indicated by an asterisk. In the examples of Chapter 3 we saw that in some cases a process initially converges, but fails after a few iterations. So, the residual g(xk ) might have been smaller than the ﬁnal residual. This situation cannot be distinguished in the tables of this section. We refer for further details to Chapter 3 and Sections 4.1 and 4.2.
The discrete integral equation function
We consider the discrete integral equation function (A.5) for n = 20 and apply the limited memory Broyden methods starting from x0 given by (A.6) to a residual of g(xk ) < 10−12 . The results are given in Table 4.2. For p ≥ 10 all methods succeed to converge. Some of the methods are even indistinguishable from the method of Broyden. It turns out that especially UPALL has good results, because the method converges for all p ≥ 2. For p = 1 all methods fail to converge. Note that for p = 1 all methods, except for BRR2 and BYRD, are in fact equal. For BRR there exists a sharp boundary,
130
Chapter 4. Features of Broyden’s method
method UPALL UP1 BRR BBRI BRR2 BBR BBS BYRD 21 21 21 21 21 21 21 21
p = 20 1.3412 1.3412 1.3412 1.3411 1.3412 1.3412 1.3412 1.3411 p = 15 23 21 21 21 21 21 21 21
p = 19 1.3544 1.3412 1.3412 1.3411 1.3412 1.3412 1.3412 1.3411 p = 14 20 21 21 21 21 21 21 32 1.3677 1.3029 1.3412 1.3411 1.3412 1.3412 1.3412 0.8577 p=9 21 41 21 21 21 21 22 200 1.3658 0.6953 1.3412 1.3411 1.3411 1.3410 1.3864 0.1027∗ p=4 24 30 200 53 84 200 200 30 1.2088 0.9762 0.0073∗ 0.5189 0.3360 0.0008∗ 0.1195∗ 0.9509 33 25 160 119 114 168 62 38 20 52 21 21 21 21 22 90 23 27 21 21 21 21 21 32 21 21 21 21 21 21 21 21
p = 18 1.3619 1.3412 1.3412 1.3411 1.3412 1.3412 1.3412 1.3411 p = 13 1.2103 1.0782 1.3412 1.3411 1.3412 1.3412 1.3411 0.8585 p=8 1.4545 0.5278 1.3411 1.3410 1.3384 1.3401 1.3733 0.3157 p=3 0.8307 1.1068 −0.3226∗ 0.2407 0.2410 0.1646 −0.7674∗ 0.7933 24 26 55 94 24 38 200 200 23 32 21 21 23 200 131 200 22 25 21 21 21 21 21 63 21 21 21 21 21 21 21 21
p = 17 1.4441 1.3412 1.3412 1.3411 1.3412 1.3412 1.3412 1.3410 p = 12 1.3442 1.1281 1.3412 1.3411 1.3412 1.3412 1.3411 0.4433 p=7 1.2895 0.8740 1.3511 1.3329 1.2061 0.0773∗ −0.4034∗ 0.0640∗ p=2 1.2468 1.0909 −1.2889∗ 0.2995 1.2468 0.7934 0.0778∗ 0.0158∗ 200 200 200 200 – 200 200 9 22 44 200 23 36 200 104 94 21 29 21 21 21 21 21 111 22 21 21 21 21 21 21 21
p = 16 1.2688 1.3327 1.3412 1.3411 1.3412 1.3412 1.3412 1.3432 p = 11 1.3353 0.9653 1.3412 1.3411 1.3412 1.3412 1.3401 0.2489 p=6 1.2677 0.6512 0.0307 ∗ 1.1922 0.7662 0.0655∗ −0.6570∗ 0.2991 p=1 0.0158 ∗ 0.0158 ∗ 0.0158∗ 0.0158 ∗ – 0.0158∗ 0.0158∗ −6.3791∗
UPALL UP1 BRR BBRI BRR2 BBR BBS BYRD
23 24 21 21 21 21 21 27
1.3103 1.2372 1.3412 1.3411 1.3412 1.3412 1.3412 1.0585 p = 10
UPALL UP1 BRR BBRI BRR2 BBR BBS BYRD
22 29 21 21 21 21 22 171
1.3908 0.9555 1.3412 1.3411 1.3412 1.3412 1.3909 0.1650 p=5
UPALL UP1 BRR BBRI BRR2 BBR BBS BYRD
21 83 200 50 51 200 64 75
1.3469 0.3425 0.0155∗ 0.5684 0.5655 0.0605∗ 0.4277 0.3734
Table 4.2: The number of iterations and the rate of convergence for the limited memory Broyden methods of Table 4.1, applied to the discrete integral equation function (A.5) (n = 20). [’*’ (no convergence)]
that is, the method converges for p ≥ 7 and fail for p ≤ 6. The same holds for the methods BBR and BBS, both methods converge for p ≥ 8. These methods, however, also converge for some smaller values of p. For p ≤ 12 the method BYRD needs many iterations to converge, except for p = 3 and p = 4, where
4.4 Comparison of selected limited memory Broyden methods
131
the method does converge rather fast. We can conclude that every method can be trusted if p is larger than a certain critical value (p = 6 for BRR, p = 7 for BBR and BBS, etc). Beneath this critical value the method might occasionally converge.
The extended Rosenbrock function
For the extended Rosenbrock function (A.7) with n = 20, we give the results of the simulations only for p ≤ 10, since for every method starting from x0 given by (A.8) the rate of convergence hardly increases for larger values of p. The results are listed in Table 4.3.
method UPALL UP1 BRR BRRI BRR2 BBR BBS BYRD 11 11 11 11 11 11 11 11 p = 10 2.9756 2.9756 2.9756 2.9756 2.9756 2.9756 2.9756 2.9723 p=5 UPALL UP1 BRR BRRI BRR2 BBR BBS BYRD 14 23 13 13 13 11 11 20 2.4087 1.4538 2.6200 2.5924 2.6200 2.9756 2.9756 1.6295 38 20 13 13 13 11 11 32 13 11 11 11 11 11 11 14 p=9 ∞ 2.9444 2.9756 2.9756 2.9756 2.9756 2.9756 ∞ p=4 0.8749 1.6973 2.6875 2.6017 2.6875 2.9756 2.9756 1.0355 200 66 22 77 33 11 11 145 12 15 11 11 11 11 11 14 p=8 2.8867 ∞ 2.9756 2.9756 2.9756 2.9756 2.9756 ∞ p=3 −0.1284∗ 0.4997 1.5367 0.4126 0.9461 2.9756 2.9756 0.2341 200 30 62 33 200 200 18 200 14 14 11 11 11 11 11 18 p=7 ∞ 2.4949 2.9756 2.9756 2.9756 2.9756 2.9756 1.8733 p=2 −0.0374∗ 1.1411 0.5074 −1.3923∗ −0.0374∗ −0.0986∗ 1.7653 −0.0380∗ 200 200 200 200 – 200 200 4 16 25 11 11 11 11 11 19 p=6 2.2081 1.2801 2.9756 2.9756 2.9756 2.9756 2.9756 1.9444 p=1 −0.0303∗ −0.0303 ∗ −0.0303 ∗ −0.0316∗ – −0.0303∗ −0.0303 ∗ −14.9518∗
Table 4.3: The number of iterations and the rate of convergence for the limited memory Broyden methods of Table 4.1, applied to the extended Rosenbrock function (A.7) (n = 20). [’*’ (no convergence)]
The ’∞’sign implies that accidentally the exact zero of the extended Rosenbrock function is found. Again all methods fail to converge for p = 1. For p = 2 only the methods UP1, BRR and BBS converge. Note that most methods converge for p = 3. The method BBR and BBS are for p = 3 even as fast as for p = 10.
3892∗ −0. . with diﬀerent eigenvalues λ1 = 2 and λ2 = 3.0636∗ −0. 1). In Table 4.0472∗ −0. The vector b is the zero vector.1101∗ 0.0516∗ −0.5) with the eigenvalue λ = 2.2. [’*’ (no convergence)] Two equal Jordan blocks We assume that n is even and consider the matrix A ∈ Rn×n .2. 1).2245 −0.2450∗ −0.2872∗ −0.0495∗ 0. All methods fail to converge for smaller values of p. .1987∗ −0.3057∗ −0. .2709∗ 0. . Two diﬀerent Jordan blocks We take n = 20 and consider the matrix A ∈ Rn×n . method UPALL UP1 BRR BRRI BRR2 BBR BBS BYRD 200 200 65 83 96 194 62 200 p = 20 −0. .0620∗ 0.3592∗ −0.0515∗ 0. see Section 4.0470∗ −0.132 Chapter 4. Note that the method of Broyden needs 2n iterations to solve (4.4 we give the results for the limited memory Broyden methods for 16 ≤ p ≤ 20.0361∗ −0.1294∗ −0.3662 0. The vector b is the zero vector.6) where both A11 . given by (4. starting from x0 given by x0 = (1. .0501∗ −0.0482∗ −0. More or less the same description is valid as for Table 4. The results for the limited memory Broyden methods for 16 ≤ p ≤ 20 are given in Table 4.0549∗ 200 200 143 200 200 153 114 200 p = 17 −0.0378∗ Table 4.4) where A is given by (4. A22 ∈ Rn/2×n/2 .4889 0.3216∗ −0. .0491∗ −0.7) where both A11 . given by (4. . . see Section 4.3142∗ −0. with λ equal to 2 and n = 20. 1).0491∗ −0.1. are Jordan blocks (4.0634∗ 200 200 136 200 200 182 104 200 p = 16 −0.4). . Note that it takes Broyden’s method n iterations to solve (4. applied to the linear equation (4.3259 −0.5 Most of the limited memory Broyden methods fail for p ≤ 10 except for the method BRR2 and BBS (and UP1 for p = 2).4: The number of iterations and the rate of convergence for the limited memory Broyden methods of Table 4. Features of Broyden’s method One Jordan block We consider again the matrix A ∈ Rn×n . UP1 also fail to converge for 11 ≤ p ≤ 16 and the method BYRD for 11 ≤ p ≤ 18. .4).2331∗ 0.4233∗ −0.2883∗ −0. The vector b is the zero vector and the initial condition x0 is given by x0 = (1. The methods UPALL. .6.5).4.5514 −0.0474∗ −0.0564∗ 200 200 162 200 167 200 139 200 p = 19 −0. see Section 4.4). .0560∗ 200 200 152 200 136 200 123 200 p = 18 −0.1959∗ −0. The initial condition x0 is given by x0 = (1. A22 ∈ Rn/2×n/2 . Note that it takes Broyden’s method 2n iterations to solve (4. given by (4.1812 −0.2.5). The results of the simulation are given in Table 4.0805∗ 0.5) and b = 0 (n = 20). are Jordan blocks given by (4.
0964∗ −0.4464∗ 0.7326 1.6305 1.6027 1.1673∗ −0.0033∗ p = 11 0.0029∗ p=6 0.1205∗ −0.0980∗ −0.3750∗ −0.7351 1.2639∗ −0.0402∗ p=2 0.6270 1.7259 1.1870 1.7351 1.3023∗ −0.0114∗ p=7 0.4907∗ 0.6756 1.0697 0.0924∗ p = 12 0.0476∗ 0.7338 0.7351 1.5424 1.0747∗ 200 200 105 146 200 200 99 200 200 200 116 200 128 165 103 200 200 200 20 20 20 20 20 200 20 20 20 20 20 20 20 200 p = 18 1.3498∗ UPALL UP1 BRR BRRI BRR2 BBR BBS BYRD 200 200 20 20 20 20 20 200 0.0960∗ Table 4.3374∗ −0.1205∗ 200 184 97 148 200 200 99 105 200 200 117 200 200 200 100 200 200 200 20 20 20 20 20 200 104 167 20 20 20 20 20 200 p = 17 0.7351 1.0182∗ −0.7205 0.2652∗ 0.0913∗ −0.4200∗ −0.1937∗ −0.1727 −0.4200∗ 105 105 105 105 – 105 105 33 200 200 105 200 200 196 89 200 200 200 20 20 73 20 20 200 200 200 20 20 20 20 20 200 p = 16 0.1804∗ −0.4388∗ 0.1002∗ 0.7052 1.1403 ∗ 1.7521 1.4) where A is given by (4.6852 1.0623∗ p = 13 0.4177∗ −0.7113 0.0122∗ −0.4200∗ −0.0146∗ p=5 UPALL UP1 BRR BRRI BRR2 BBR BBS BYRD 200 200 98 196 200 200 86 200 0.4398∗ 0.0151∗ 0.1759∗ −0.5288 1.7662 1.0157∗ 0.0581∗ −0.6467 1.2335∗ −0.1909 −0.7245 1. [’*’ (no convergence)] .1395∗ 0.0348∗ −0.0585∗ 0.2225∗ −0.1304∗ 1.0312∗ −0.7325 0.0432∗ 0.7035 p = 15 20 20 20 20 20 20 20 20 p = 19 1.5056∗ 0.6) and b = 0 (n = 20).0287∗ 0.7351 1.0515∗ 0.0989∗ 0.7351 1.0589 ∗ 0.7495 1.0933∗ 0.6805 1.5553 1.4541∗ −0.4173 1.1.4200∗ −0.7351 1.6866 1.7685 1.0086∗ p=3 0.5609 1.7035 0.5: The number of iterations and the rate of convergence for the limited memory Broyden methods of Table 4.7387 −0.6256 1.7036 p = 14 200 200 20 20 20 20 20 200 0.0088∗ −0.7275 1.1116∗ −0.0469∗ p = 10 UPALL UP1 BRR BRRI BRR2 BBR BBS BYRD 200 200 166 200 90 200 36 200 0.0121∗ p=8 0. applied to the linear equation (4.0079∗ 1.4522∗ −0.2394 −0.7351 1.2299∗ 0.1395∗ 0.0465∗ 0.4200∗ −1.6908 1.2109∗ 0.4750∗ −0.0641∗ −0.7351 1.4200∗ – −0.7351 1.4200∗ −0.7549 1.0162∗ 0.7331 1.7465 1.0235∗ 1.4.7177 0.7685 1.6571 1.6994 0.0171∗ 0.0398∗ 0.5621 1.0351∗ p=1 −0.7351 1.6466 1.7351 1.0357∗ 0.3775∗ −0.4178∗ −0.0132∗ 1.0251∗ 0.3384 0.0213∗ 0.0182∗ 1.4280∗ 0.3577∗ 0.7280 1.4 Comparison of selected limited memory Broyden methods 133 method UPALL UP1 BRR BRRI BRR2 BBR BBS BYRD 20 20 20 20 20 20 20 20 p = 20 1.4484∗ −0.7198 1.3154∗ 0.3381 −0.5227 0.0013∗ 0.6375 1.6369 1.0598∗ p=9 200 200 130 200 159 200 122 200 0.0127∗ p=4 200 200 93 190 200 200 99 200 0.0042∗ 1.0359∗ −0.0069∗ −0.
0465∗ −0.8229 0.0648∗ 0.2196 −0.0566∗ −0.3301 −0.0678∗ 200 200 147 200 112 200 146 200 p = 18 −0. Features of Broyden’s method method UPALL UP1 BRR BRRI BRR2 BBR BBS BYRD 200 200 38 38 38 42 40 200 p = 20 −0.2708∗ −0.7) and b = 0 (n = 20).6: The number of iterations and the rate of convergence for the limited memory Broyden methods of Table 4. [’*’ (no convergence)] .2646∗ −0.0633∗ −0.0558∗ −0.0806∗ −0.4) where A is given by (4.0847∗ 0.8408 0.0930∗ −0.7646 0.0630∗ 0.2736 −0.2635∗ 0.0502∗ Table 4.6741 0.1.2954∗ 0.8489 −0.0589∗ 200 200 149 200 200 167 87 200 p = 17 −0.0813∗ −0.1018∗ 0.0444∗ 200 200 45 46 103 168 41 200 p = 19 −0.3400∗ −0. applied to the linear equation (4.8473 −0.7108 0.4182∗ −0.0672∗ 200 200 130 200 200 164 105 200 p = 16 −0.5110∗ −0.134 Chapter 4.1223∗ −0.2953∗ −0.8642 0.0570∗ 0.0408∗ −0.1886∗ 0.0798∗ −0.
3). in Section 5. that corresponds to the balance equations in (6. The system of ordinary diﬀerential equations is integrated over one reverse ﬂow period using the NAGlibrary 135 . This leads to an ndimensional discretized problem. For the ﬁnite volume discretization an equidistant grid is used with N grid points (N = 100). corresponding to the partial diﬀerential equations of the one.25) using the parameters values of Table 6. In Section 5. In addition.3 we show that the BRR method makes it possible to compute the limiting periodic state of the reverse ﬂow reactor on a ﬁner grid using the same amount of memory and just a few more iterations. see (8.3).2).1. As initial condition we take a state of the reactor that is at high constant temperature (T = 2T 0 ) and ﬁlled with inert gas (c = 0).1 The reverse ﬂow reactor The onedimensional model Let f : Rn → Rn be the map of one ﬂow reverse period.Chapter 5 Features of the Broyden rank reduction method Anticipating on the simulations of Chapter 8.23)(6. In 5.4 we compare the convergence properties of the limited memory Broyden methods listed in Table 4. 5.2 we consider the singular values of the update matrix for both models. we investigate the convergence properties of the Broyden Rank Reduction method for computing ﬁxed points of the period map f : Rn → Rn of the reverse ﬂow reactor deﬁned by (8. Finally. where n = 2N = 200. The results are described in Section 5.and twodimensional model.1.2. we ﬁx the ﬂow reverse period and the dimensionless cooling capacity (tf = 1200s and Φ = 0.
3). Note that the residuals of both methods are equal up to the 45th iteration. [’◦’(Broyden). and p = 1 the BRR method has a very low rate of convergence. So. ’ ’(p = 3). the BRR method is even faster than the method of Broyden. respectively. the number of iterations needed to converge to g(xk ) < 10−10 is not monotonously increasing for smaller values of p. the BRR method approximates the convergence rate of the method of Broyden using one ﬁfth of the amount of memory. the BRR method is applied for diﬀerent values of p. ’ ’(p = 1)] The information in Figure 5. If we take p = 5 or p = 4 instead of p = 10.19 and Algorithm 3. the BRR method needs a few more iterations to converge.2. 10 0 residual g(xk ) 10 −5 PSfrag replacements 10 −10 0 10 20 30 40 50 60 iteration k Figure 5. For p = 10. To solve the equation g(x) = 0 with g(x) = f (x) − x. We see that a large reduction of memory is obtained with just a few more iterations. We ﬁx the ﬂow reverse period and the dimensionless cooling capacity (tf = 1200s and Φ = 0. The ratio between the width and . For 2 p = 3. applied to the period map of the reverse ﬂow reactor (8. ’∗’(p = 5).1: The convergence rate of Algorithm 1.136 Chapter 5. However.25) with the parameter values of Table 6.23)(6. see (8. For p = 20. p = 2. Features of the Broyden rank reduction method routine D02EJF. ’×’(p = 20).11. The method of Broyden converges to a residual with g(xk ) < 10−10 in 52 iterations. corresponding to the balance equations in (6.2). ’+’(p = 10).3) using the onedimensional model (6. ’ ’(p = 4). The twodimensional model Let f : Rn → Rn now be the map of one ﬂow reverse period.28) using the parameters values of Tables 6. the amount of memory usage is divided by a factor 2 and 5 .26)(6.1 can be interpreted in the following way.2. ’ ’(p = 2). with q = p−1.
’∗’(p = 5).3) using the twodimensional model (6.2. In fact. To solve the equation g(x) = 0.5. ’ ’(p = 3). The system of ordinary diﬀerential equations is integrated over one reverse ﬂow period using the NAGlibrary routine D02NCF. ’+’(p = 10). 10 0 residual g(xk ) 10 −5 PSfrag replacements 0 10 20 30 40 50 60 iteration k Figure 5. For 2 ≤ p ≤ 4 the BRR method does not converge within 60 iterations. with g(x) = f (x) − x. a segment of the reactor is divided in M rings with the same volume. [’×’(p = 20). applied to the period map of the reverse ﬂow reactor (8. The amount of memory needed to store the Broyden matrix can be reduced by choosing p = 10 instead of p = 20. with q = p − 1. ’ ’(p = 4). As initial condition a state of the reactor is taken that is at high constant temperature (T = 2T0 ) and ﬁlled with inert gas (c = 0).2 shows that the BRR method has a high rate of convergence for p ≥ 5. due to memory constraints. It turns out that for the twodimensional model it is no longer possible to apply the original method of Broyden.28) with the parameter values of Table 6. For the ﬁnite volume discretization an equidistant grid is used with N grid points in the axial direction (N = 100).26)(6. using approximately the same number of iterations. the BRR method is applied for diﬀerent values of p.2: The convergence rate of Algorithm 3.1 The reverse ﬂow reactor 137 the length of the reactor is set at R/L = 0.11. ’ ’(p = 2)] Figure 5. The dimension of the discretized problem is denoted by n (n = 2 · M · N = 5000).0025. . In the radial direction a nonuniform grid of M grid points is chosen that becomes ﬁner in the direction of the wall of the reactor (M = 25).
Because the parameter p is larger than the number of iterations done by the BRR process. Subsequently the value of σ1 is rather stable.25). In Figure 5. the smallest singular value. The onedimensional model As in the previous section.26)(6. Features of the Broyden rank reduction method 5. In Figure 5. that is. In this section we investigate what happens with the singular values if we remove the pth singular value in every iteration.3) corresponding to the onedimensional model (6. the singular value σ5 starts to oscillate after the 5th iteration.2 Singular value distributions of the update matrices As we have explained in Section 3.3 we have plotted the singular values of the update matrix during the BRR process. The twodimensional model We apply the same investigation for the period map of the reverse ﬂow reactor deﬁned by (8. for diﬀerent values of p.4 we have plotted the singular values of the update matrix during the BRR process. the singular value distribution is exactly the same as for p = 50.23)(6. we see that the update matrix has rank one at the beginning of the second iteration. the largest singular value σ 1 jumps in the second iteration from about 10−1 to 100 . In case of p = 50. we have considered a situation where the BRR method is equal to the method of Broyden. we ﬁrst consider the period map of the reverse ﬂow reactor deﬁned by (8. In every iteration one nonzero singular value is added. . Thereafter the singular value σ10 starts jumping around. Decreasing the parameter p to 5. for diﬀerent values of p. So. the matrix has one nonzero singular value. In addition the singular value σ1 is larger than for p = 50 and p = 10. For example. that is.28).138 Chapter 5. The other singular values are still rather stable. So.2 the rank of the update matrix increases during the Broyden process. the number of nonzero singular values of the update matrix increases.3) corresponding to the twodimensional model (6. If we choose p = 10 we see that during the ﬁrst 10 iterations. if we apply the Broyden Rank Reduction method with parameter p. since in every iteration a rankone matrix is added to the update matrix. This singular value increases during some iterations and thereafter is reaches a more or less stable value. no singular values are removed. The other nine singular values seems to be invariant under the reduction procedure.
23)(6. .3) corresponding to the onedimensional model (6. down (p = 5)] It turns out the we can describe the behavior of the singular values of the update matrix in the same way as we did for the onedimensional model. The only diﬀerence is that for p = 5 the singular value σ4 starts to alter instead of the singular value σ1 . [up (p = 50). with q = p − 1.3: The singular value distribution of the update matrix during the BRR process. applied to the period map of the reverse ﬂow reactor (8.25). middle (p = 10).5.2 Singular value distributions of the update matrices 139 singular values 10 0 10 −2 PSfrag replacements 10 −4 0 5 10 15 20 25 30 35 40 45 iteration k singular values 10 0 10 −2 PSfrag replacements 10 −4 0 5 10 15 20 25 30 35 40 45 50 iteration k singular values 10 0 10 −2 PSfrag replacements 10 −4 0 5 10 15 20 25 30 35 40 45 50 55 iteration k Figure 5.
26)(6. applied to the period map of the reverse ﬂow reactor (8. [pu (p = 50).1 we have seen that the BRR method makes it possible to ﬁnd symmetric periodic solutions of the RFR using the full twodimensional model . down (p = 5)]. with q = p − 1.3) corresponding to the twodimensional model (6.28).4: The singular value distribution of the update matrix during the BRR process. Features of the Broyden rank reduction method singular values 10 0 10 −2 PSfrag replacements 10 −4 0 5 10 15 20 25 30 35 40 45 iteration k singular values 10 0 10 −2 PSfrag replacements 10 −4 0 5 10 15 20 25 30 35 40 45 50 iteration k singular values 10 0 10 −2 PSfrag replacements 10 −4 0 10 20 30 40 50 60 iteration k Figure 5. 5.3 Computing on a ﬁner grid using same amount of memory In Section 5. middle (p = 10).140 Chapter 5.
To accelerate the convergence we want to use the largest value of p. the BRR method saves memory. 000 6. 000 8. M = 25 # iterations p = 20 p = 10 p=5 p=4 p=3 p=2 p=1 48 50 61 65 80 > 100 > 100 # storage loc. 200. 40.3 Computing on a ﬁner grid using same amount of memory 141 of the RFR. see Table 5. If we take a larger radius for the reactor (R/L = 0. respectively. that at most 40.26)(6. 000 20. 000 10. gradients in the radial direction are absent in this case.1. 000 20.3. and the number of storage locations for the Broyden matrix using a grid with N = 100 grid points in the axial direction and M = 25. For the coarse grid the parameter p can be chosen to . for example. 000 2.5. respectively M = 5. 000 # iterations 53 55 65 82 76 90 > 100 M =5 # storage loc. 000 100.11. 000 50. 000 Table 5.1: The number of iterations of Algorithm 3. grid points in the radial direction. we have shown that even when using the onedimensional description of the RFR. 000 10. Suppose. with q = p − 1. Note that for every value of p the rate of convergence for M = 25 is higher than for M = 5.3) corresponding to the twodimensional model (6. To illustrate the beneﬁts of our limiting memory method we compare two simulations of the model with M = 25 and M = 5 grid points in the radial direction.0025). surprisingly. still the same values for p can be used for both the ﬁne and the coarse grid. Although a few more iterations are needed than in case of the slim reactor (R/L = 0. We now show that it is possible to use a ﬁner grid with the same amount of memory to store the Broyden matrix at the expense of just a few more iterations.0025). We have applied the BRR method with diﬀerent values of p to compute the periodic state of the reactor. and the twodimensional model leads to exactly the same results as the onedimensional model. 000 40. 000 30. As will be discussed in Section 8. In addition.025). So.28). For the above simulation of the twodimensional model a very slim reactor is used (R/L = 0. for the twodimensional model the same values for p can be used as in case of the onedimensional model. applied to the period map of the reverse ﬂow reactor (8. the dimension of the discretized problem becomes n = 5000 and n = 1000. 000 4. then radial temperature gradients are in fact introduced. It turned out that. 000 storage locations are available.
5. distance (a) Fine grid (M = 25) ax. This implies that instead of 53 iterations for a coarse grid.4 0. The onedimensional model The results of the simulations with the period map of the onedimensional model as given in Table 5.4 0.2 0.8 0.2 0. Features of the Broyden rank reduction method be 20 and for the ﬁne grid at most p = 4.2.1.5(b).4 1 PSfrag replacements PSfrag replacements rad.142 Chapter 5. we apply the limited memory Broyden methods listed in Table 4.2 0 0 0.6 0. The computations are stopped if a maximal number of 200 iterations is reached or the process is converged to a residual of g(xk ) < ε. distance (b) Coarse grid (M = 5) Figure 5.6 0.8 0.8 0.6 0. 1.2 temperature 1 0.2 1. distance ax. the more accurate approximation using the ﬁne grid.5(a).4 0.5: Temperature distribution over the reactor bed using a coarse and a ﬁne grid in the radial direction. Although the approximation of the cyclic steady state is qualitatively good using the coarse grid. to compute a ﬁxed point of the period map f : Rn → Rn deﬁned by (8.6 0. as we did for several test functions in Chapter 4. 65 iterations are needed for a ﬁne grid to solve the discretized problem while using the same amount of memory to store the Broyden matrix.4 Comparison of selected limited memory Broyden methods Concluding this chapter. Figure 5.2 0 0. Figure 5.2 0 0 0.6 0.3).8 1 0.4 0. where ε = 10−10 for the onedimensional model and ε = 10−8 for the twodimensional model.8 0. distance rad.8 1 0.2 0 0. .6 0. is preferable.4 1 temperature 1 0.
Note that the method BRR is applicable for p = 4 using 64 iterations instead of 47. The convergence properties of the limited memory Broyden methods in case of the twodimensional model are comparable to those in case of the onedimensional model. because an evaluation of the period map failed during the process.5. For p ≥ 10 the method BRR.4 Comparison of selected limited memory Broyden methods 143 It turns out that the methods BRR. The twodimensional model The results of the simulation with the period map of the twodimensional model are given in Table 5. BRRI. 000 storage locations. Note that all methods converge in 47 iterations for p = 50.3. BBR and BBS are rather fast for p ≥ 5. The method BRRI is still applicable for p = 3 using 69 iterations. . For two simulations the results were not returned by the program. We clearly see that a smaller value of p does not have to imply that more iterations are needed for the limited memory Broyden process. The method BRR can even be applied for p = 4 using 57 iterations. BRR2. Note that for p = 5 we need 2pn = 2 · 5 · 200 = 2000 storage locations to store the update matrix and for p = 50 we need 20. BRRI. BRR2 and BBR need less than 51 iterations. The fact that for p = 50 not all methods converge in 48 iterations can be explained by the introduction of rounding errors of the large computations.
’ (no data)] . 55 49 49 47 58 91 0.1663 200 191 105 90 200 182 200 188 p=2 0.1129∗ 0.3839 0.3952 0.1854 0.5186 0.2266 UPALL UP1 BRR BRRI BRR2 BBR BBS BYRD 82 120 56 53 46 49 54 76 0.4889 0.3455 0.1602 0.5078 0.144 Chapter 5.2096 0.1892 0.4471 0.5187 0.2420 0.1628 0.4626 0.5134 0..1335 200 200 200 200 – 200 200 200 p=1 0.1705 0.3668 0.1725 0.4303 0.4991 p = 14 95 126 51 55 48 48 52 68 0.2201 0.0757∗ 0.23)(6.5264 0.3) according to the onedimensional model (6. n = 200.5324 0.1310 0. .1388 0.4701 0.4693 0.4670 0.2477 p=8 0.4221 0.3917 0.3098 0.1561 0.1617 0.5019 0.5154 0.5159 0.5497 0.5323 0.3342 0. .4175 0.0757∗ 0.4772 0.4561 0.3421 0.3178 0.0717∗ 0.5852 0.3010 p=6 0.5202 0.2216 0.5135 0.5101 0. 81 128 p=4 0.4016 0..1382 0.5186 0.2442 0. applied to the period map of the reverse ﬂow reactor (8.1913 96 146 53 53 55 47 59 112 72 115 54 46 47 44 51 85 63 118 55 50 52 55 48 59 p = 20 0.5511 0.4815 0.3287 p = 10 UPALL UP1 BRR BRRI BRR2 BBR BBS BYRD 75 162 52 46 63 49 43 71 0.1716 0.5753 0.5056 0. Features of the Broyden rank reduction method method UPALL UP1 BRR BRRI BRR2 BBR BBS BYRD 49 49 48 50 48 49 49 52 p = 50 0.2936 0.3421 0.4051 0. ’.2013 0.2747 102 132 58 49 49 42 45 97 71 146 50 48 57 64 59 101 66 92 53 51 49 49 53 62 p = 30 0.5029 0.3550 p=5 UPALL UP1 BRR BRRI BRR2 BBR BBS BYRD 116 155 54 62 60 60 71 171 0.4750 0.5201 0.4332 0.2731 0.3807 p=9 86 .∗ 0.4634 0.5069 0.2737 .2156 0..2572 131 145 50 44 52 67 53 133 74 119 49 49 46 46 53 65 79 86 51 47 60 50 55 57 p = 25 0.4744 0.5727 0.2596 0.1945 0.2654 0.0757∗ 0.2076 0.2923 0.4523 0.5298 0.0819 ∗ – 0.4979 0.4851 0.4215 p = 13 0.5414 0.4314 0.4831 p = 15 75 65 47 56 47 50 47 50 p = 40 0. .3305 0.2778 0.4232 0.4432 0.5202 0.5271 0.1458 138 155 57 63 92 .3540 0..5490 0.5667 0.4610 0.4441 p = 11 0.5305 0.4547 0.5186 0.4543 0.4715 0.5340 0.5281 0.5314 0. .4575 0. .∗ 0.4843 0.4166 0.2967 .5119 0.4211 0.4429 p = 12 0. [’*’ (no convergence).2: The number of iterations and the rate of convergence for diﬀerent limited memory Broyden methods.3968 0.3809 0. .3541 0.0757 ∗ 0.3046 0.0059 ∗ Table 5.0757 ∗ 0.0786∗ 0.4789 0.5030 0.25).5295 0.3883 p=7 0.5960 0.5193 0.5023 0.1971 164 185 85 69 158 154 76 150 p=3 0.4528 0.4503 0.5324 0.4998 0.5732 0.5216 0.5186 0.1608 0.5067 0.1544 0.
.2173 0.3441 0.4082 0....1960 0.4783 0.4580 0. .. computing a ﬁxed point of the period map (8.4652 0..3997 0..4781 p = 15 52 61 47 47 47 47 47 48 p = 40 0.∗ 96 .2480 p=1 .2520 p=5 UPALL UP1 BRR BRRI BRR2 BBR BBS BYRD 81 127 60 85 61 56 57 123 0.1539 0.3996 0.2677 0. .1924 0.∗ 0.4690 0.4780 0.3713 0..2404 0. [’*’ (no convergence)... . 55 59 78 115 116 48 48 47 46 60 74 66 98 47 47 47 47 47 59 p = 25 0.∗ .4678 .4434 0.3908 0.2809 0.3167 0.2723 p=4 142 130 64 59 78 80 73 . 67 78 47 47 47 47 50 56 p = 20 0.4782 0.4198 . .4669 0.3924 0.∗ 0. .4574 0. .2331 .4426 0.4621 0.4235 p = 13 0.4780 0.2861 p = 10 UPALL UP1 BRR BRRI BRR2 BBR BBS BYRD 82 137 49 48 50 49 52 88 0. – .4782 0..2846 p=3 0..3071 0.∗ . .2240 0. .4913 0.2453 0.4782 0.4778 0.0807∗ .4833 0.2770 0. 152 50 59 53 55 74 92 84 115 46 47 50 47 48 . . .1557 0..2405 0. .3269 0. ..2830 0.28).4783 0..4655 0. . ..5062 0.2804 0...2180 0.4488 0. . .5. . . .2610 0. .2357 0.3742 p = 12 0.2828 p=2 0. .1695 0.2778 0.∗ . 200 .4238 0.5126 0.3017 p=7 0.4455 0.1909 0. .3: The number of iterations and the rate of convergence for diﬀerent limited memory Broyden methods. ..4567 p = 14 72 93 47 47 48 51 50 61 0.2830 0.4781 0. .4513 0.∗ 0.∗ p=6 .0891∗ .2880 0.4819 0.∗ 0. ’.2813 p=8 .4045 0..3770 0. .2780 0. .4802 0.3747 0.∗ 0. n = 5000. . . 94 55 51 59 55 55 79 77 101 48 48 48 47 49 79 54 72 47 47 47 47 47 53 p = 30 0.3983 0.∗ .2749 0. .4780 0.0014∗ UPALL UP1 BRR BRRI BRR2 BBR BBS BYRD 62 78 47 47 47 48 48 77 0.3568 0. . 0.4876 0.1629 0.1725 0. .. . .4512 0.4427 0.4593 .4783 0.∗ 0.4562 0..2659 0. 79 91 79 92 102 .4643 0.4259 0.4666 0.1994 0.4104 0.2112 0.2028 0.2577 .4781 0.4782 0.4780 0.4648 0. 200 .4640 0.4796 0. 55 82 0.4143 0.3732 0.5118 0..∗ 200 92 107 110 200 .4532 0.3) according to the twodimensional model (6.4783 0. ..4782 0..∗ – .3727 0..4785 0.’ (no data)] .4782 0.4087 p = 11 0.2388 0.4912 0..0560∗ 0.4657 0.4622 0.4770 0. . . .4 Comparison of selected limited memory Broyden methods 145 method UPALL UP1 BRR BRRI BRR2 BBR BBS BYRD 47 47 47 47 47 47 47 47 p = 50 0.4833 0.4813 0.4668 0.26)(6.4293 0.. 103 110 52 49 .5027 0.3734 0.3027 .4995 0.4796 0..4002 0.3760 0.3597 p=9 79 171 52 50 53 .4782 0....1785 Table 5.∗ .2961 0.1289 0.4373 0..∗ 0.4955 0. .
146 Chapter 5. Features of the Broyden rank reduction method .
Part III Limited memory methods applied to periodically forced processes 147 .
.
so that it does not have to be replaced every time it is saturated.Chapter 6 Periodic processes in packed bed reactors In this chapter. that is. gas mixtures are separated by selective adsorption over a bed of sorbent materials. it cannot adsorb any more adsorbate. During adsorption one component is selectively adsorbed. In Section 6. pressure and direction of the feed streams. The balance equations for a general packed bed reactor are derived in Section 6. the adsorbent must bind components reversibly. Pressure and thermal swing adsorption In pressure swing adsorption (PSA) processes. From the feed point up to the adsorption front. Therefore. that is. we give a short introduction in the chemical reactor engineering. In a packed bed a front is therefore formed that slowly migrates in the direction of the product end. it has to be regenerated.1. 6.2. The periodic nature of the PSA arises from the high pressure adsorption phase and the subsequent low pressure regeneration phase. but can be cleaned in the reactor itself. such that at the product end of the reactor the gas stream does not contain this component. If the adsorbent is saturated.1 The advantages of periodic processes Periodic processes in packed bed reactors mainly arise from periodically varying the feeding conditions. we discuss the most common cyclic processes in packed bed reactors and explain their advantages. the temperature. the feed gas 149 .
while further downstream.g. During this step the pressure is maintained at a high level. Studies of thermal swing adsorbers can be found in work by e. Pressure swing adsorption is widely used for bulk separation and puriﬁcation of gasses.g. Major applications are for example. that is. When the adsorbent has lost enough of its loading. The sorption enhanced reaction process has been demonstrated primarily in achieving supra equilibrium levels in equilibrium limited reactions. Pressure swing reactor The principle of Pressure Swing Reactors (PSR). Davis and Levan [14]. A pressure swing adsorber designed to separate water from air has been studied by e. When the pressure has dropped to suﬃcient low level. The pressure swing reactor is a relatively new process and . separation of normal and isoalkanes. Thermal swing adsorption (TSA) processes are similar to pressure swing adsorption processes and are also intended to separate gas mixtures. After this pressurization the process returns to the ﬁrst step. This second step is called the blowdown step. moisture removal from air and natural gas. the gas phase contains nonadsorbing components only and the sorbent is not saturated. The cyclic nature of a pressure swing reactor arises from the same high pressure adsorption and low pressure regeneration phases as in the pressure swing adsorber. Sorption and catalysis may even be integrated in a single material. De Montgareuil and Domine and independently Skarstrom are generally considered to be the inventors of the PSA. and hydrogen recovery and puriﬁcation. The adsorption is typically used to purify one of the reaction products. Periodic processes in packed bed reactors mixture is in equilibrium with a saturated sorbent. the product end of the reactor is closed and the pressure is released at the feed end of the reactor. But here the cyclic nature arises from the low temperature adsorption phase and the subsequent high temperature regeneration phase. Combinations of PSA and TSA processes also exist.150 Chapter 6. Kvamsdal and Hertzberg [39]. sometimes also referred to as Sorption Enhanced Reaction Processes (SERP). the product end of the reactor is again closed and the pressure is raised to the former high level. The Skarstrom PSA cycle was immediately accepted for commercial use in air drying. Before the adsorbent in the reactor is completely saturated. the adsorbed component is removed from the sorbent during this regeneration step. it is maintained at this level and ’clean’ carrier gas is led into the reactor at the product end such that the adsorbent in the reactor is purged. is based upon physically admixing a sorbent and a catalyst in one vessel in order to achieve a separation concurrent with a reaction.
the PSR is limited to comparatively low temperature applications in order to maintain suﬃcient adsorption capacity for the sorbent. ←− When adsorbing the CO2 the equilibrium of the above reaction shifts to the right. The reverse ﬂow reactor The simplest example of a periodic process might be the reverse ﬂow reactor (RFR). • Improved selectivities and yields of desired products. Carvill et al. A well known application of the pressure swing reactor is the removal of CO from syngas. We describe the RFR is more detail in Section 8. Being a member of the family of adsorptive reactors. This disadvantage can be avoided in a reactive separation using PSR. [29]. combining lowtemperature shift catalysis and selective CO 2 removal by adsorption.1. This implies that more H2 is produced and more CO is removed. [13] and Kodde and Bliek [36]. • More favorable reaction conditions might be possible. By a combination of low temperature shift catalysis and selective adsorption of carbon dioxide in one vessel. Production of high purity hydrogen from syngas. a packed bed reactor in which the ﬂow direction is periodically reversed in order to trap a hot reaction zone within the reactor. • Reduced capital expenditure by process intensiﬁcation. The shift reaction is given by H2 O + CO −→ H2 + CO2 .g. Hufton et al. resulting in longer lifetime of equipment and less catalyst deactivation. the removal of CO as a result of the shift reaction rather than by selective oxidation might become feasible. In the latter step a part of the produced hydrogen is inevitably lost. The PSR potentially oﬀers the following advantages: • Increased conversion of reactants. The reverse ﬂow reactor concept was ﬁrst proposed and patented by Cottrell in 1938 for the removal of pollutants.6.1 The advantages of periodic processes 151 has been studied by e. normally uses a multistep process. as required for instance for fuel cell applications. . In this way even systems with a small adiabatic temperature rise can be operated without preheating the feed stream. • Reduced requirements for external supply or cooling capacity. involving both a water gas shift and a selective oxidation process.
Therefore. • Mechanisms depending on the ﬂuid ﬂow: – Thermal conduction through the ﬂuid ﬁlm near the contact surface of two pellets. – Heat transfer by convection. • Mechanisms independent of ﬂow: – Thermal conduction through the solid particle. the onedimensional model consists of the axial dimension. The dimension of the model denotes the number of spatial directions in the model. Periodic processes in packed bed reactors 6. – Radiant heat transfer between the surfaces of two adjacent pellets. The reactor we have described here is called a cooled packed bed reactor. the catalyst or the adsorbent. – Thermal conduction through the contact point of two particles. when meeting the catalyst.of the catalyst particles. – Heat conduction within the ﬂuid. Time is considered as an additional dimension. We consider a pseudohomogeneous model. To avoid overheating . The model is based on the conservation of mass and energy. Heat transfer is the result of the following mechanisms.152 Chapter 6. and the gas phase.melting and burning . which is described by balance equations. turbulence around the catalyst particles and bulk diﬀusion. The latter two are lumped together as dispersion in axial (and radial) direction. In order to be able to formulate the model several assumptions have to be made. We deal with exothermic reactions only.2 The model equations of a cooled packed bed reactor We consider a tubular reactor ﬁlled with small catalyst particles in which gas ﬂows in axial direction. Turbulences of the gas around the particles cause a nearly constant velocity over a cross section of the reactor. The mass transport mechanisms we take into account are convective mass transport. we obtain a heterogeneous model. For the twodimensional model also the radial direction is taken into account. reacts to a product B. If we distinguish between the solid. and assume that the species only exists in the gas phase. the reactor is cooled using a cooling jacket around the reactor. In this section a mathematical model is derived that describes the essentials of the reactor unit. The gas contains a bulk part of an inert gas with a trace of reactants A that. . neglect the diﬀerence in temperature between the solid particles and the gas phase.
We assume that the pressure drop over the unit caused by the ﬂow along the catalyst bed. are usually described by a lumped parameter. The additional condition for the onedimensional model is that concentration and temperature are constant over a cross section of the reactor. Hence. We assume that all the physical properties are constant in the range of temperature and concentration that occurs in the reactor. thus one mole of species A gives one mole of species B. see [63]. the diﬀerence in activity. as is the multiplicity of the catalyst particles. The reaction is exothermic and the heat of reaction is independent of the temperature. In Table 6. can only occur inside the reactor and not in the channels leading to it. that is. Inside the reactor the cooling occurs only via the gas phase due to the negligible contact area between the catalyst particles and the reactor wall. except for the heat transport by convection.6. The dispersion coeﬃcient is assumed to be constant and equal for every component. intraparticle gradients in temperature or concentration are assumed to be negligible. In order to model the cooling we assume that the reactor wall is cooled at an more or less constant temperature. due to a high rate of turbulence. caused by a high ﬂow rate or a large density of the cooling ﬂow. the eﬀective thermal conductivity.1 we have summarized the assumptions made for both the one and the two dimensional model of a packed bed reactor. Furthermore we assume that dispersion of energy and mass. Below this temperature the various mechanisms of heat transport. caused by diﬀusion and turbulence around the catalyst particles. We assume that the gas phase satisﬁes the ideal gas law. . the reaction only occurs inside the reactor. The velocity over a cross section of the reactor is assumed constant.2 The model equations of a cooled packed bed reactor 153 – Heat transfer by lateral mixing. The transport resistance between the gas phase and the catalyst is negligible. denoted by η. is equal to one. The contribution of radiation to the total heat ﬂow turns out to be important at temperatures above 400◦ C. The thermal equilibrium between the gas and the catalyst occurs instantaneously. is negligible The equipment both upstream and downstream the reactor have no inﬂuence on the behavior of the ﬂow inside the vessel. The temperature and composition of the feed streams and the mass ﬂow are constant in time. Therefore we can apply Danckwerts boundary conditions. The reaction does not change the number of moles in the gas phase. and ﬂows through the vessel at a constant velocity. In addition. Therefore the eﬀectivity.
The convection is the bulk motion caused by feeding the reactor. the thermal conductivity and the molar based heat capacity. If u is the rate of the ﬂow. like the dispersion coeﬃcient. • Cooling at the reactor wall inside the reactor occurs only via the gas phase. • Dispersion of heat and mass occurs only inside the reactor. • The velocity of the ﬂow is constant. that is A → B. • The equipment both upstream and downstream has no inﬂuence on the ﬂow inside the reactor. • The physical properties. We consider a very basic example of a species A reacting into species B. • The pressure drop caused by the catalyst particles is negligible. that is. • The temperature and composition of the feed gas is constant in time. • The transport resistance and multiplicity of the catalyst particles is negligible.154 Chapter 6. • The heat and concentration equilibrium between the gas phase and the catalyst occurs instantaneously. The partial concentration of species A is given by CA = ρyA . • The reaction does not change the number of moles in the gas phase. then the convection is given by BA = C A · u mol/(m2 s). The ﬂow is caused by convection and diﬀusion. We compute the ﬂow of species A through the cross section of the reactor at z = z0 .1: Assumptions on the cooled packed bed reactor The component balances and the mass balance The component balance represents the conservation of mass of one single species in the gas phase. Periodic processes in packed bed reactors • The gas phase satisﬁes the ideal gas law. the number of moles that passes at z = z0 every second. The diﬀusion is based on contributions from molecular diﬀusion in the gas phase (to create the highest possible entropy) and from the turbulent ﬂow . Table 6. are independent of temperature and concentration and equal for every component. • The reactor wall is cooled at constant temperature. The total concentration of the gas is denoted by ρ and the mole fraction of species A by yA .
the void fraction ε has to be taken into account. denoted by Ac . So.2 The model equations of a cooled packed bed reactor 155 PSfrag replacements z z + ∆z Figure 6.6. ε∆V ∂t After dividing both sides by ∆V and letting the length of the segment going to zero (∆z ↓ 0). This leads to the equality {accumulation} = {in} − {out} − {reaction} ∂CA = FA (z) − FA (z + ∆z) − (1 − ε)∆V ρcat r . minus the number that leaves. FA (z).1: A segment of the reactor of length ∆z. The component balance is obtained by considering a small segment of the reactor. But. ε∆V ∂CA /∂t. The volume of the segment is equal to ∆V = Ac ∆z. and is given by JA = −ρDax ∂yA ∂z mol/(m2 s). FA (z + ∆z). around the particles. since the reactor is ﬁlled with particles. see Figure 6. WA = J A + B A mol/(m2 s). If r is the number of moles that reacts per kilogram catalyst per second. To compute the ﬂow one has to multiply the ﬂux by the cross sectional area of the reactor.1. to obtain the number of moles that reacts per second in the segment. The molar ﬂux is the sum of the convection and the diﬀusion term and represents the number of moles of a component that crosses a unit area per second. The number of moles that accumulate in the small segment.1) . is equal to the number of moles that enters the section. minus the number that reacts per second. The component balance of species A reads ε ∂CA ∂ ∂yA = ρεDax − εuCA − (1 − ε)ρcat r ∂t ∂z ∂z (6. we arrive at a partial diﬀerential equation. the ﬂow equals FA = εAc WA mol/s. we have to multiply this by (1 − ε)∆V ρcat .
the carrier gas. In the same way we obtain the component balance of species B.4) At the other end we have assumed that no inﬂuences exists of the equipment on the behavior of the ﬂow. the component balance of the inert is given by ε ∂CI ∂ ∂yI = ρεDax − εuCI . a third species is also present in the reactor. Periodic processes in packed bed reactors The left hand side of the component balance denotes the accumulation of component A in the gas phase. which implies no gradients in the concentration of the components. yA + yB + yI = 1.0 = WA z=0 . namely. The convective and diﬀusive contributions to the ﬂow and the reaction rate are represented by the right hand side terms. it does not take part in the reaction.156 Chapter 6.2) Note the plus sign in front of the reaction term.0 u = −ρDax ∂yA + CA u ∂z z=0 . we derive the boundary conditions at both ends of the reactor. which implies that the reaction increases the concentration of species B. and has zero derivative. Because the sum of the mole fractions equals one. At the entrance the boundary equation compares the ﬂux in and in front of the reactor This leads to the equality WA. Note that we have assumed that mass dispersion appears only inside the reactor. So. given by ε ∂CB ∂ ∂yB = ρεDax − εuCB + (1 − ε)ρcat r . ∂t ∂z ∂z (6. the overall mass balance is given by ε ∂ρ ∂uρ = −ε . ∂yA = 0. ∂t ∂z Note that the reaction term is also canceled in this equation since the reaction does not change the total number of molecules. that is. which is equal to CA. which is an important equation if the velocity is not necessarily constant.5) ∂z z=L . Therefore. (6. (6. Finally. ∂t ∂z ∂z (6. that is. The reactor is called a closedclosed vessel.3) If we add the component balances of all species we obtain the overall mass balance. either upstream or downstream of the reactor mass dispersion is negligible. The component balance equations contain a second order derivative of the mole fraction. The carrier gas is an inert.
Dividing (6. and the product −P Vi . The speciﬁc heat capacity cp is assumed to be independent of temperature and concentration.6) by ∆V and diﬀerentiating in time leads to the change in energy. at constant pressure. that is.3). i (6. The energy balance describes the change in total energy of the system (∂Esys /∂t). For the ﬁrst approach we analyze what happens inside the segment. In the last step we used that i Vi ρi = 1. The total energy is given by the sum of the energy of the catalyst and the energy of the gas phase. The total energy contained in the segment is given by Esys . E i ρi = i i (Hi − P Vi )ρi = Hi ρi − P. If we assume that the density of the catalyst is constant (∂ρs /∂t = 0). Here Vi is the speciﬁc volume per mol of species i. ∂t ∂t where cp denotes the speciﬁc heat capacity. Esys = Es ρs (1 − ε)∆V + Ei ρi ε∆V. that is.7) = (1 − ε)(ρcp )s ∂T ∂T + ε(ρcp )g +ε ∂t ∂t i +(1 − ε)ρs (−∆H)r − ε where (−∆H) = i ν i Hi ∂P . Therefore. ∂t denotes the heat of reaction. Hi . that is. . ∂t ∂z the change in energy reads 1 ∂Esys ∆V ∂t = ∂Es ρs (1 − ε) + ε ∂t Hi i ∂ρi +ε ∂t ρi i ∂Hi ∂P −ε ∂t ∂t Hi ∂Wi ∂z (6. Using the component balances (6.1)(6.2 The model equations of a cooled packed bed reactor 157 The energy balance Let us consider the open system given by a thin segment of the packed bed reactor of length ∆z.6. ∂T ∂H = cp .6) The energy Ei of a species in the gas phase consists of the enthalpy. which can be computed in two diﬀerent ways. then the change in potential energy is linearly proportional to the change in temperature. ε ∂ρi ∂Wi =ε + νi (1 − ε)ρs r .
[63.9) disappears. (6. The ﬂow of energy is the amount of energy that crosses a passes section of the reactor per second and is given by Fi Ei = εAc i i W i Ei . denoted by Q.9) By assuming that the speciﬁc volume is equal for every species in the gas phase. that depends on the heat conductivities in the gas and the solid phase and on the heattransfer resistance between the two phases. heat transport due to the ﬂow in the reactor. and cooling. per second equals ∂T −λax .). ˙ The cooling. indicated by the minus sign. Because the energy of species i equals Ei = Hi − P Vi . (6. we can consider the interaction of the segment with its surrounding.10). we obtain the following expression for the change . We have to take into account conduction (through the gas phase and the catalyst particles). is the amount of energy that leaves the segment at the wall of the reactor per second. we obtain W i Ei = i i Wi Hi − P W i Vi i = i Wi Hi + P ρDax i ∂yi Vi − P u ∂z ρi Vi . Periodic processes in packed bed reactors On the other hand. The total cooling per second is thus given by ˙ Q = −Uw (T − Tc )Ac ∆zaw .8).158 Chapter 6. i (6. the second term of the last expression in (6.10) From (6. 72].9) and (6. cf. we use Fick’s ﬁrst law of diﬀusion. Note that the conduction operates in the direction of decreasing temperature. The surface area of the segment equals 2πR∆Z and the volume of the segment is πR2 ∆z.8) ∂z where λax is the eﬀective axial heat conductivity. For the conduction term. The cooling rate per square meter surface area is linearly proportional to the diﬀerence in temperature of the segment and of the cooling jacket (−Uw (T − Tc )). The amount of energy that passes a cross section of the reactor per square meter. (6. By aw we denote the ratio between the surface area and the volume of the segment (aw = 2πR∆z/(πR2 ∆z) = 2/R. The energy that results from the work of equipment is neglected.
4) and (6. ∂z and because ∂Hi /∂z can be approximated by (cp )g ∂T /∂z the ﬁrst term of the right hand side of (6. The fourth term denotes the heat transfer through the reactor wall to the surroundings. The last term gives the enthalpy change due to reaction.12) − ρDa x ∂yi + ρi u = ρg u.11) i ∂ P u − Uw aw (T − Tc ). ∂z (6. the ﬁrst two terms show the contribution of the heat transfer by convection and diﬀusion.13) ∂z The left hand side shows the accumulation of enthalpy in the gas and solid phase. ∂z Note that the second term of the right hand side can be expanded to ε Because Wi = i i ∂ ∂z Wi Hi = ε i i ∂Hi Wi + ε ∂z Hi i ∂Wi .11).7) and (6. . (6. The boundary conditions are obtained in a similar way to those of the component balance.5). ∂z ε Since (6. On the right hand side.7) is valid for all ∆V. and are given by u(ρcp )g T0 = −λax ∂T + u(ρcp )g T ∂z z=0 .2 The model equations of a cooled packed bed reactor 159 in energy of the segment 1 ∂Esys ∆z→0 ∆V ∂t lim = lim 1 ∆z→0 Ac ∆z − λax Ac i ∂T ∂z z z+∆z z z+∆z +εAc = λax ∂2T ∂ −ε 2 ∂z ∂z Wi Hi − P u Wi Hi + − Uw aw Ac ∆z(T − Tc ) (6.12) becomes ε(ρcp )g u ∂T .6. The term i Hi · ∂Wi /∂z cancels and we derive the equation for the energy balance (1 − ε)(ρcp )s + ε(ρcp )g − ∂T ∂P ∂2T ∂T −ε = λax 2 − ε(ρcp )g u ∂t ∂t ∂z ∂z ∂ P u − Uw aw (T − Tc ) + (1 − ε)ρs (−∆H)r . we can combine (6. (6.
We take a ring j=1 2 2 with center radius r and width ∆r. for example. and in general ri = i−1 ∆rj + 1 ∆ri . . z=L = 0. the heat conductivity in radial direction per m2 surface area. . see Figure 6. Reaction rate The reaction rate depends on many factors. The volume of this ring is given by ∆V 1 = ∆z π r + ∆r 2 = 2π∆z · r∆r. .160 Chapter 6. . it can accelerate the reaction and the reactor might explode. . The widths of the rings are given by ∆r1 .8). T ) = c. Periodic processes in packed bed reactors at the entrance of the reactor and ∂T ∂z at the product end. The type of the catalyst and the system of the reaction on the catalyst increase the complexity of the formula. av kc + ηk∞ exp[−Ea /Rgas T ] according to Khinast et al. . Radial direction To extend to onedimensional model with the radial direction. If the heat of the reactor is extremely high. First of all it depends on the concentration of the reactants in the reactor. . concentration and temperature. we assume that energy transport by mass diﬀusion can be lumped into the thermal conductivity. Denote by ri the center radius of the ith ring. In the simulations of Chapter 8 we restrict ourselves to the reaction rate given by ηk∞ av kc exp[−Ea /Rgas T ] r (c. i = 1. In addition the temperature is important. At low temperature. we assume that the state in the reactor is cylindrically symmetric and that the dispersion coeﬃcient Drad and the thermal conductivity λrad are independent of position. that is r1 = 1 ∆r1 . The radial part of the diﬀusion in the component balance is obtained in a similar way. We subdivide the segment of the reactor with width ∆z in M rings. [33]. In addition. M. the reaction might not occur at all. . We consider the radial part of the diﬀusion in the energy balance equation.2. is given by ∂T . ∆rM . r2 = ∆r1 + 2 1 ∆r2 . −λrad ∂r . 2 1 − π r − ∆r 2 2 Similarly to the axial case (6.
(6.14) The expression in Formula (6.15) r ∂r ∂r The radial part of the diﬀusion in the component balance is given by (λrad ) (ρDrad ) 1 ∂ ∂yA r . + ∂r2 r ∂r 1 ∂ ∂T r . If we divide the accumulation term by the volume of the ring. we arrive at (λrad ) which equals ∂2T 1 ∂T . r ∂r ∂r (6. (6.2 The model equations of a cooled packed bed reactor 161 r2 r1 PSfrag replacements ∆z Figure 6.2: A segment of the reactor of length ∆z.16) . r+ 1 ∆r 2 By taking the limit ∆r → 0.6. we 2 obtain 1 ∂T (−λrad ) ∆V ∂r 1 2π r − ∆r ∆z 2 − (−λrad ) ∂T ∂r 1 r+ 2 ∆r 1 r− 2 ∆r 1 2π r + ∆r ∆z 2 .14) can be further simpliﬁed to 1 ∂T (λrad ) ∆r ∂r 1 r+ 2 ∆r + 1 r− 2 ∆r 1 ∂T (λrad ) 2r ∂r + r− 1 ∆r 2 ∂T ∂r . The accumulation in the ring under consideration equals the ﬂow through the 1 surface of the ring at r − 2 ∆r minus the ﬂow through the surface of the ring at r + 1 ∆r.
18) Before we compare the energy balance equations of both models. the temperature in the point (z. t).19) ∂z ∂z . To give a useful onedimensional representation of the twodimensional state of the reactor. as well as the velocity and the pressure. t) = 2 R R rT (z0 . we apply a few simpliﬁcations. the average temperature over the cross section through z = z0 equals 2 ¯ T (z0 . In the twodimensional model. Therefore. Because no material can pass through the wall of the reactor. the weighted average can be taken of the temperature and the concentration over the cross section of the reactor. 0 (6. which is linearly proportional to the diﬀerence in the temperature inside and outside of the reactor wall. Equation (6. r) at time t is denoted by T (z. The relation between the one. (6. r=R The cylindrical symmetry in the reactor yields the boundary conditions ∂yA ∂r = 0. So. we justify that the above extension of the onedimensional model is indeed natural.17) r=R is added to the system.17) describes the heat loss at the reactor wall to the surrounding cooling jacket. We assume that the term (ρcp )g is constant in time and space. the energy balance (6. we have ∂yA ∂r = 0. and ∂T ∂r = 0.and twodimensional balance equations is based on the idea of a weighted average. r. r. t)dr. Periodic processes in packed bed reactors At the wall of the reactor the boundary condition λrad ∂T ∂r = −Uw (T (R) − Tc ). r=0 r=0 A justiﬁcation of the twodimensional model In the following. (6.162 Chapter 6.13) becomes (1 − ε)(ρcp )s + ε(ρcp )g ∂T = ∂t ∂2T ∂T λax 2 − ε(ρcp )g u − Uw aw (T − Tc ).
where T0 is the temperature of the feeding gas. (6. The dimensionless temperature is given by θ = (T − T0 )/T0 . τ = tu/L. (6. In the same way we can show that the component balance of the onedimensional model is also a limiting case of the component balance of the twodimensional model. If the conversion equals zero no reaction has occurred and if the conversion equals one the reaction is completed. The conversion is given by x = (c0 − c)/c0 . where c0 is the concentration of the reactants in the feeding gas and c = CA = ρyA . the axial distance.20) ∂z ∂z r ∂r ∂r If we take the weighted average of both sides of the energy balance.22) If we substitute (6. (6. . for the twodimensional model. over the cross section of the reactor and use (6. we can rewrite the last term of (6.6.20). the dimensionless temperature is always positive.2 The model equations of a cooled packed bed reactor 163 The twodimensional version of the energy balance equation reads (1 − ε)(ρcp )s + ε(ρcp )g ∂T = ∂t ∂2T ∂T 1 ∂ ∂T λax 2 − ε(ρcp )g u + (λrad ) r . R (6. we recover the energy balance of the onedimensional model. ξ = z/L.22) in (6.21) in the following way λrad 2 ∂T r R2 ∂r R r=0 =− 2 · Uw T (R) − T0 . and the radial distance. Dimensionless equations In order to obtain the dimensionless versions of the balance equations we use the following dimensionless variables. ζ = r/R.19). (6. with aw = 2/R.21) r=0 Using the boundary conditions in radial direction. Since the reaction is exothermic and the cooling temperature is ﬁxed at T0 . The independent dimensionless variables are time. we obtain ((ρcp )s (1 − ε) + (ρcp )g ε) ¯ ¯ ¯ ∂T ∂2T ∂T = λax 2 − u(ρcp )g + ∂t ∂z ∂z R 2 ∂T 2 r · r (c.21) and assume that the concentration and the temperature are constant in the radial direction. T )dr + λrad 2 r (−∆H) 2 R 0 R ∂r R .18) for the weighted average of the temperature.
By gathering all parameters in dimensionless groups. We only deal with the radial components of the diﬀusion terms.164 Chapter 6. We ﬁrst derive the dimensionless version of the component (conversion) balance of the onedimensional model. we divide both sides by the factor −c0 u/L. Periodic processes in packed bed reactors a1 = (ρcp )s · (1 − ε)/(ρcp )g + ε a3 = Lηk∞ /u Pem = uL/(εDax ) β = Ea /(RT0 ) Pemp = uL/(εDrad ) a2 = (−∆H)c0 /(T0 (ρcp )g ) = ∆Tad /T0 a4 = ηk ∞ /(av kc ) Peh = uL(ρcp )g /λax Φ = 2LUw /(Ru(ρcp )g ) Pehp = (ρcp )g Lu/λrad Table 6. the dimensionless version is given by λrad L2 1 ∂ ∂θ · 2 ζ . Substituting the expressions for the dimensionless variables into (6. (ρcp )g Lu R ζ ∂ζ ∂ζ . For the twodimensional model the major part follows from the above discussion. 2 ∂τ Peh ∂ξ ∂ξ exp[β/(1 + θ)] + a4 The expressions for the dimensionless parameters are given in Table 6. = εDax −u − Ea 1 2 ∂τ L/u ∂(ξL) ∂ξL av kc exp( RT0 · 1+θ ) + ηk∞ Hereafter. we obtain ε (1 − x) 1 ∂ 2 x ∂x ∂x = − + a3 . After dividing the radial diﬀusion term of the energy balance (6.1) gives ε ∂(1 − x)c0 ∂ 2 (1 − x)c0 ∂(1 − x)c0 ηk∞ av kc (1 − x)c0 .15) by (ρcp )g T0 u/L. ∂τ Pem ∂ξ 2 ∂ξ exp(β/(1 + θ)) + a4 In the same way the energy balance in (6. (−∆H) Ea 1 av kc exp( RT0 1+θ ) + ηk∞ Dividing by (ρcp )g T0 u/L gives a1 1 ∂ 2 θ ∂θ (1 − x) ∂θ = − + a 2 a3 − Φθ.2.2: The dimensionless parameters of the balance equations.19) becomes ((ρcp )s (1 − ε) + (ρcp )g ε) ∂T0 (1 + θ) ∂τ L/u 2 T (1 + θ) ∂ 0 ∂T0 (1 + θ) = λax − u(ρcp )g + 2 ∂(ξL) ∂ξL ηk∞ av kc (1 − x)c0 − Uw aw (T0 (1 + θ) − Tc ).
starting with λrad ∂T ∂r = −Uw (T (R) − Tc ). The dimensionless boundary conditions in axial direction of the one. εDrad In Appendix C we explain that in general Dax = Drad . To point out the diﬀerence between the one.16) and subsequently dividing again by −c0 u/L.2. The parameters involved are given in Table 6. We have added the parameters Pemp and Pehp in Table 6. Lu(ρcp )g R ∂ζ ζ=1 Ru(ρcp )g 2 Note that dimensionless cooling capacity.and twodimensional model are derived in the same way as the balance equations and given at the end of this section. we deﬁne Pehp = (ρcp )g Lu .and twodimensional model we explicitly derive the dimensionless boundary condition for the temperature at r = R (ζ = 1). A summary of the model equations We summarize the complete dimensionless one.6. For the onedimensional model the component balance reads ε ∂x 1 ∂ 2 x ∂x = − + χ(x. θ) = a3 (1 − x) exp(β/(1 + θ)) + a4 −1 .2.2 The model equations of a cooled packed bed reactor 165 Therefore. θ).23) where the reaction rate is given by χ(x. λrad By substituting the dimensionless variables in the radial term of the component balance (6. we obtain εDrad L2 1 ∂ ∂x · 2 ζ . ∂τ Pem ∂ξ 2 ∂ξ (6.and twodimensional model. is multiplied by a factor one half. r=R we substitute θ and ζ. So. uL R ζ ∂ζ ∂ζ and we deﬁne Pemp by Pemp = uL . which leads to λrad Uw L L2 ∂θ 1 =− · 2 · θ(1) = − Φ · θ(1). . Φ. and divide both sides by u(ρcp )g T0 R/L2 .
∂θ ∂ξ ξ=1 ∂x ∂ξ ξ=1 1 2 Φθ = 0. = 0.24) The boundary conditions read θ− x− 1 ∂θ Peh ∂ξ ξ=0 = 0. ∂x ∂ζ ζ=1 = 0. Periodic processes in packed bed reactors The energy balance involves a cooling term and is given by a1 1 ∂ 2 θ ∂θ ∂θ = − + a2 χ(x. . 1 ∂x Pem ∂ξ ξ=0 For the twodimensional model the component balance is given by ε ∂x 1 ∂ 2 x ∂x 1 L2 1 ∂ ∂x = − + χ(x. ∂θ ∂ξ ξ=1 ∂x ∂ξ ξ=1 = 0. ∂τ Peh ∂ξ 2 ∂ξ (6. + = 0. = 0. θ) + ζ 2 ∂τ Peh ∂ξ ∂ξ Pehp ζ ∂ζ ∂ζ (6. = 0.25) = 0.26) the energy balance is given by a1 1 ∂ 2 θ ∂θ 1 1 ∂ ∂θ ∂θ = − + a2 χ(x.166 Chapter 6. (6. ∂θ ∂ζ ζ=0 ∂x ∂ζ ζ=0 = 0.28) 1 L2 ∂θ Pehp R2 ∂ζ ζ=1 = 0. ∂τ Pem ∂ξ 2 ∂ξ Pemp R2 ζ ∂ζ ∂ζ (6.27) and the boundary conditions are given by θ− x− 1 L2 ∂θ Peh R2 ∂ξ ξ=0 1 ∂x Pem ∂ξ ξ=0 = 0. θ) − Φθ. (6. θ) + ζ .
Section 7. contains a short description of bifurcation theory and a continuation technique. The bifurcation theory can be used to ﬁnd out whether a periodically forced process has a periodic stable limiting state. Furthermore. For parameter investigation we do not want to compute the periodic limiting state from scratch for every value of the bifurcation parameter. If we change the bifurcation parameter slightly we would prefer to take the old periodic limiting state as initial estimate of an iterative method to compute the new one.3. rounding errors can have a major inﬂuence. The ideas developed in Sections 7. The last part of this chapter.1 can easily be extended to the model equations of the packed bed reactor derived in Section 6. In addition. For instance. This will be done in Section 7. the use of a computer is unavoidable.2 and 7. Therefore the model has to be discretized in space and implemented in the computer. it shows when small changes in the parameters have a major inﬂuence on the behavior of the limiting state.Chapter 7 Numerical approach for solving periodically forced processes To solve a model consisting of partial diﬀerential equations including nonlinear terms.2. During this process of discretization and implementation many errors might be made.2. In this chapter we use basic partial diﬀerential equations. called ﬁnite volumes. In Section 7. So. the implemented models have to be checked. 167 .1 we give an example of a discretization. the grid should be chosen ﬁne enough.
168
Numerical approach for solving periodically forced processes
7.1
Discretization of the model equations
ut = duzz − auz + h(u), (au − duz ) = duz = 0, z=0 z=1 u(z, 0) = u0 (z), z ∈ [0, 1],
We consider the following initialboundary value problem
(7.1)
where u : [0, 1] × R+ → R and a, d > 0. The partial diﬀerential equation describes, for example, the temperature distribution in a reactor. Note that the positivity of a implies that the gas is ﬂowing from the left to the right end of the reactor, in positive zdirection. In order to discretize (7.1) we divide the reactor in N segments of equal PSfrag replacements width. The state u is assumed to be constant over a segment and located in the center, see Figure 7.1. For every segment, i = 1, . . . , N, a balance
u(z1 ) u(z2 ) u(zi ) u(zN )
z=0 ∆z
z=1
Figure 7.1: The distribution of the grid points over the interval.
equation is derived, where the accumulation term ut (zi ) is expressed in terms of the state of the segment, u(zi ), and the states of the neighboring segments, u(zi−2 ), u(zi−1 ), u(zi+1 ) and u(zi+2 ). This results in a large system of ordinary diﬀerential equations, which can be written as Ut = F (U (t)), (7.2)
where U (t) = (u(z1 , t), . . . , u(zn , t)). In other words, we divide the interval [0, 1] in N small intervals of equal length (∆z = 1/N ), and deﬁne zi = (i − 1 ) · ∆z, i = 1, . . . , N. The boundaries 2 of the ith interval are given by zi+ 1 = i · ∆z and zi− 1 = (i − 1) · ∆z, for 2 2 i = 1, . . . , N. Therefore, z 1 = 0 and zN + 1 = 1. In order to approximate the 2 2 ﬁrst and second derivative of u in a grid point zi , we use the Taylor expansion
7.1 Discretization of the model equations
169
of u around zi 1 u(z) = u(zi ) + uz (zi )(z − zi ) + uzz (zi )(z − zi )2 2 1 + uzzz (zi )(z − zi )3 + O(z − zi 4 ). (7.3) 6 The ﬁrst derivative can be computed in several ways. One approach is called ﬁrst order upwind. If the ﬂow rate in the reactor is rather high, we have to use information of grid points that lie in upstream direction. So, if the ﬂow comes from the left we use the state value u in zi−1 . We apply (7.3) in z = zi−1 and obtain 1 u(zi−1 ) = u(zi ) − uz (zi )∆z + uzz (zi )(∆z)2 + O((∆z)3 ). 2 If we rearrange the terms, the derivative of u in zi equals uz (zi ) = u(zi ) − u(zi−1 ) + O(∆z). ∆z (7.5) (7.4)
The ﬁrst term of the right hand side of (7.5) is the approximation for the derivative of u. The situation is given schematically in Figure 7.2.
u(zi )
PSfrag replacements
u(zi−1 )
zi+1 zi−1 u(zi+1 ) zi− 1 2 zi
Figure 7.2: First order upwind approximation of the derivative in zi
Another approach, called second order central, is applicable if the diﬀusion in the reactor dominates the dynamics. We apply (7.3) in zi+1 1 u(zi+1 ) = u(zi ) + uz (zi )∆z + uzz (zi )(∆z)2 + O((∆z)3 ). 2 (7.6)
170
Numerical approach for solving periodically forced processes
Subtracting (7.4) from (7.6) gives u(zi+1 ) − u(zi−1 ) = 2∆zuz (zi ) + O((∆z)3 ) u(zi+1 ) − u(zi−1 ) + O((∆z)2 ). 2∆z The situation is given schematically in Figure 7.3. PSfrag replacements uz (zi ) =
u(zi )
and therefore
(7.7)
u(zi+1 )
u(zi−1 )
zi−1
1 zi− 2
zi zi+ 1 2
zi+1
Figure 7.3: Second order central approximation of the derivative in zi
For the second derivative of u one relevant approximation is available. Note that u(zi+1 ) − u(zi ) 1 1 = uz (zi ) + uzz (zi )∆z + uzzz (zi )(∆z)2 + O((∆z)3 ), (7.8) ∆z 2 6 and u(zi ) − u(zi−1 ) 1 1 = uz (zi ) − uzz (zi )∆z + uzzz (zi )(∆z)2 + O((∆z)3 ). (7.9) ∆z 2 6 We subtract (7.9) from (7.8), divide by ∆z and arrive at uzz (zi ) = u(zi+1 ) − 2u(zi ) + u(zi−1 ) + O(h2 ). h2
If we use second order central for the diﬀusion term and ﬁrst order upwind for the convective term, the ith component function of F in (7.2) is given by d u(zi+1 ) − 2u(zi ) + u(zi−1 ) (∆z)2 − a u(zi ) − u(zi−1 ) + h(u(zi )), (7.10) ∆z
7.1 Discretization of the model equations
171
for i = 2, . . . , N − 1. In order to derive the ﬁrst component function and the last component function of F, we apply a slightly diﬀerent approach. First, note that (duzz − auz )
zi
= (duz − au)z
zi
≈
zi+ 1
2
1 (duz − au) − zi− 1
2
zi+ 1
2
zi− 1
2
.
The ﬁrst derivative uz evaluated at the boundary between two segments, (zi+ 1 ), is approximated using the state u at the grid points of both segments 2 (zi and zi+1 ), that is, uz (zi+ 1 ) =
2
1 (u(zi+1 ) − u(zi )). zi+1 − zi
If we apply ﬁrst order upwind then u evaluated at zi+ 1 is approximated by 2 u(zi ), the nearest point in upstream direction. The ith component function of F becomes d 1 zi+ 1 − zi− 1
2
2
u(zi+1 ) − u(zi ) u(zi ) − u(zi−1 ) − zi+1 − zi zi − zi−1 −a
2
u(zi ) − u(zi−1 ) + h(u(zi )), (7.11) zi+ 1 − zi− 1
2
for i = 2, . . . , N − 1. Note that for equidistant grid (7.10) and (7.11) are equal. Because the mesh is chosen such that z 1 = 0, the left boundary condition 2 reads (duz − au)z 1 = 0. The ﬁrst component function of F is given by
2
d
1 u(z2 ) − u(z1 ) u(z1 ) · + h(u(z1 )). −a z3 z2 − z 1 z3
2 2 2 2
N th component function of F yields d 1 1 − zN − 1 −
On the other hand, we have deﬁned zN + 1 = 1 and thus duz zN + 1 = 0. The u(zN ) − u(zN −1 ) zN − zN −1 u(zN ) − u(zN −1 ) + h(u(zN )). 1 − zN − 1
2
2
−a
If we apply central discretization for the ﬁrst derivative, u(zi+ 1 ) is approx2 imated by (u(zi ) + u(zi+1 ))/2 to derive the last component function of F. In this case we have to evaluate u at zN +1 . Because the ﬁrst derivative of u in zN + 1 = 1 equals zero, we can replace u(zN +1 ) by u(zN ).
2
Because the derivative ur at r = 0 equals zero. see Figure 7. . we divide the radial axis into M intervals. Therefore. r (a u − d u ) 1 = 0. Because the gradient of u at r = 1 in general is not equal to zero. r) ∈ [0.172 Numerical approach for solving periodically forced processes The twodimensional initialboundary value problem ut = d1 uzz − a1 uz + d2 (rur )r + h(u). u(z. with boundaries r 1 . we have to deal with radial component of the diﬀusion term. . .4. .13) to d2 (rur )r r rj ≈ 1 d2 rj rj+ 1 − rj− 1 2 2 rj+ 1 u(rj+1 ) − u(rj ) u(rj ) − u(rj−1 ) . − rj− 1 2 2 rj+1 − rj rj − rj−1 For the ﬁrst and the last grid point the boundary conditions have to be taken into account. d1 . This leads to u(1) = u(rM ) + (1 − rM ) · u(rM ) − u(rM −1 ) . rM − rM −1 . 2 rM − rM −1 The value of u at r = 1 is not deﬁned yet.13) For j = 2. 1]2 . In addition to the axial derivative in the partial diﬀerential equation. We set r 1 = 0 2 2 2 2 and rM + 1 = 1. . . . 2 and approximate the radial term at the grid point rj . = d 1 uz 1 z z=0 z=1 where a1 . r1 r2 − r 1 The other boundary condition leads to d2 (rur )r r rM ≈ ≈ 1 d2 · (rur ) rM rM + 1 − r M − 1 2 2 rM + 1 2 2 rM − 1 1 1 · rM 1 − r M − 1 2 − a2 u(1) − d2 rM − 1 u(rM ) − u(rM −1 ) . . d2 (rur )r r r1 ≈ d2 1 (rur ) · r1 r 3 − r 1 2 2 r3 2 r1 2 ≈ d2 u(r2 ) − u(r1 ) · . . (7. . M. a2 . rM + 1 . r). M − 1 we can expand (7. by d2 (rur )r r rj ur r=0 = (a2 u + d2 rur ) r=1 = 0. r. 0 (7. j = 1. . 0) = u (z.12) ≈ 1 d2 (rur ) rj rj+ 1 − rj− 1 2 2 rj+ 1 2 2 rj− 1 . we extrapolate u using the values at the grid points rM −1 and rM . . (z. r 3 . we obtain for j = 1. d2 > 0. can be discretized using mainly the same approach. In every interval j we choose a grid point rj .
2 . . . . u1. . uN. sine and the exponential function. . In general the solution is a inﬁnite sum of terms with cosine. One approach is to check the balances. 7. . where ui.j = u(zi . assuming that u in of the form u(z. So the vector U becomes U = (u1. where the coeﬃcients are ﬁxed by the boundary conditions and the initial condition. If it is not possible to compute the explicit solution we have to consider other techniques to check the solution given by the computer.7. The explicit solution is derived by splitting the variables. So far. uN. t) = Z(z)T (t) and substituting this in the partial diﬀerential equations. rj ). .M .4: An approximation of the value u(1). that is.M ).1) can explicitly be solved if the function h is aﬃne.2 Tests for the discretized model equations The initialboundary value problem (7.1 . We ﬁrst integrate both sides of the partial diﬀerential . In order to discretize (7. . u2. .1 . .2 Tests for the discretized model equations 173 u(1) u(rM ) PSfrag replacements u(rM −1 ) rM −1 rM − 1 2 rM rM + 1 2 Figure 7. we have only considered the axial and radial terms separately.12) the values of u in the grid points of the twodimensional mesh should be stored in one single vector U.1 . The obtained analytical solution can be compared to the results of numerical simulation. . In order to obtain a small band width of the Jacobian of the function F. u1. . we ﬁrst count in the direction that has the smallest number of grid points (often M < N ).
T ) − u2 (z. T ) − u0 (z)}dz = 0 − au(1. t)) + 2d 0 u2 dz − z 1 uh(u)dz dt. 0 . T 0 0 1 T 1 0 (uut )dzdt = 0 u(duzz − auz + h(u))dzdt. t) + u2 (0.174 Numerical approach for solving periodically forced processes equation of (7. t) + h(u)dz dt.1) in time (T > 0) and space. If we assume that the solution is continuous. Another possibility is multiplying both side of the partial diﬀerential equation by u and then integrating in time and space. 0 u2 (z. 0)}dz = 1 T 0 − au2 (0. t) 1 −d This can be simpliﬁed to 1 0 T 0 a a u2 dz − u2 (1. Therefore. 1 u(z. t) + au2 (0. 0 We can check the simulation by computing the integral at the right hand side simultaneously with the variable u.1) we obtain 1 0 T 1 {u(z. t) + z 2 2 uh(u)dz dt. Again interchanging the integration order gives 1 0 1 2 u 2 T 0 T dz = 0 duuz 1 0 1 − 0 a du2 dz − u2 z 2 1 0 1 + 0 uh(u)dz dt. we can change the order of integration. T 0 0 1 T 1 0 ut dzdt = 0 (duzz − auz + h(u))dzdt. t) 0 T 0 T dz = 0 (dwz − au) + 0 1 1 h(u)dz dt. 0 By applying the boundary conditions of (7. T )dz = 0 1 u2 (z)dz 0 1 0 − au2 (1. which results in the following ’energy estimate’. and inserting the boundary conditions yields 1 2 1 0 {u2 (z.
r.12) is an inﬁnite sum of terms with cosine.3 Bifurcation theory and continuation techniques 175 This implies that if the function h satisﬁes 0 uh(u)dz < 0 then the ’total 1 amount of energy’ in the system ( u(. the term (rur )r will become small shortly. That is.. which is constant in radial direction.12) in time and space. the twodimensional problem (7. In order to check the implemented model if radial gradients are present.12) can be solved explicitly by splitting the variables. or no diﬀusion exists in radial direction. as was done for the onedimensional problem. In general the solution of (7. a2 = 0.12 this means that if d2 is large. the exponential function and the Bessel function. Note that for the radial direction we ﬁrst have to multiply the equation by r. sine. in order to obtain the weighted average. We obtain similar integral equations as in the case of the system (7. diﬀerences in radial direction will be removed instantaneously. Another example is when the cooling of the reactor stagnates at or nearby the reactor wall. the reactor wall is isolated. If a reliable implementation of the onedimensional model exists. 1 7. The ordinary diﬀerential equation that involves the variable r is a socalled Bessel equation and the solutions are given in terms of the Bessel function. In terms of 7. where the coeﬃcients are ﬁxed by the boundary conditions and the initial condition. 2 If the function h is aﬃne. That is. The state of the reactor at time t is denoted by a vector x(t) from the ndimensional vector . that is. t) 2 = 0 u2 (z. we discretize the equations in space using a ﬁnite volumes technique with ﬁrst order upwind for the convective term. t) = Z(z)R(r)T (t).1). In order to investigate the behavior of the system numerically. we can consider (artiﬁcial) limit cases of the process in which the onedimensional and the twodimensional should give the same results. Note that the boundary condition at z = 0 and z = 1 are uniform over the cross section of the reactor. We describe two situations starting from an initial state u0 .3 Bifurcation theory and continuation techniques Periodically forced processes in packed bed reactors can be described by use of partial diﬀerential equations. If the diﬀusion in radial direction is high. We shortly discuss some additional approaches to check the implementation of the discretized equations. Terms of the integral equation can simultaneously be integrated with the variable u. when the state of the reactor has no gradients in radial direction. The resulting ordinary diﬀerential equations in z and t are solved in the same way. t)dz) is decreasing. assuming that u in of the form u(z. d2 = 0.7. we can again integrate the initialboundary value problem (7. If a2 = 0 then the temperature gradient at r = 1 will be zero and no gradients are introduced.
Moreover. the function evaluation is a computationally expensive task. t + tc ) = F (·. Knowledge of the bifurcation points is necessary for a good understanding of the system. x0 .16). Rn . we consider the dynamical system xk+1 = f (xk . Our objective in this section is to give a short view of the easiest types of bifurcations and method to ﬁnd the bifurcation values. is the most eﬃcient. The value of the bifurcation parameter where the qualitative property of the state of the reactor changes is called a bifurcation point.15) In other words. x(0) = x0 . the value of the solution after one cycle. it is important to understand the qualitative behavior of the systems as a bifurcation parameter change. (7.14) where F (·. Now assume that the period map depends on a bifurcation parameter λ. starting from an initial condition x0 .14). which may vary over certain speciﬁed intervals. a periodic state of the reactor corresponds to a t c periodic solution x(t) of (7. direct methods are preferable to dynamical simulation.17) where f : Rn × R → Rn . λ) (7. The map f : Rn → Rn that assigns to an initial state at time zero. we have f (x0 ) = x(tc ). (7. A periodic state of the process is a ﬁxed point of the period map (f (x∗ ) = x∗ ). t) for t ∈ R. Thus. Therefore. In . The model of a packed bed reactor contains several physical parameters. x(tc ). So. and tc denotes the period length. The resulting system of n ordinary diﬀerential equations can be written as x (t) = F (x(t). Since the initial condition.14). t).14). and the iterative method that needs the fewest evaluations of f to solve (7. Therefore. of a periodic solution is a ﬁxed point of the period map. Note that the value f (x) is obtained by integrating a large system of ordinary diﬀerential equations over a period tc . is called the Poincar´ or e period map of (7. A good design for the periodically forced process in the packed bed reactor is such that the qualitative behavior does not change when the bifurcation parameter is varied slightly from the value for which the original design was made. Let f be the period map of a periodically forced process. Since it might take a long transient time before the limiting periodic state is reached. we solve f (x) − x = 0. (7.176 Numerical approach for solving periodically forced processes space.16) using iterative methods. evaluating the map f is equivalent to simulating one cycle of the process in (7.
see Section 8.1. If the eigenvalue leaves the unit circle at µ = 1 the number of periodic solutions of the system changes. also called the monodromy matrix. For moderate values of tf a stable periodic state exists at high temperature. The stability of periodic solutions is determined by the Floquet multipliers. the eigenvalues of the monodromy matrix. Landing at a bifurcation point the periodic state becomes unstable or disappears. However the longer we ﬂow from one direction the more energy is purged out of the reactor during one ﬂowreversal period. determines the type of bifurcation.2 for more details. This implies that a neighborhood exists of the periodic state x∗ in which all trajectories converge to the periodic state as time goes to inﬁnity. We are interested whether and how the periodic state changes. Let the ﬂowreversal time tf be the bifurcation parameter and ﬁxed all other physical parameters of the system. If the eigenvalue leaves the unit circle at µ = −1 the period of the solution is doubled. Example 7. In general this will be by two. There exists a minimum value for tf for which the extinguished state is still the only possible periodic state. Let f be the map corresponding to half a period of the reverse ﬂow . [30].5: Diﬀerent bifurcation scenarios of a periodically forced system. the periodic state depends on the value of the bifurcation parameter (x∗ = x∗ (λ)). we have to consider the Jacobian of the period map. In the following example we consider three diﬀerent scenario’s. when we vary the bifurcation parameter slightly. To understand the local behavior of the system in more detail.7. A periodic solution is stable when the absolute values of all the (possibly complex) eigenvalues of M are smaller than unity. PSfrag replacements µ1 µ1 1 µ1 −1 θ0 µ2 Figure 7. The bifurcation point is called a limit point or a saddlenode. This value for tf corresponds to the bifurcation point. The monodromy matrix M describes the evolution of a small perturbation over one period. The angle at which an eigenvalue µ crosses the unit circle. When changing the bifurcation parameter an eigenvalue might cross the unit circle and the dynamics of the system can change completely.3 Bifurcation theory and continuation techniques 177 fact.
see Figure 8. α) = x. see ˜ Figure 8. α) and deﬁne G : Rn+1 → Rn by G(y) = F (x. Fixed points of the period map. If a pair of eigenvalues leaves the unit circle at µ1 = eiθ0 and µ2 = e−iθ0 . satisfy the equation (7.5(a). If x∗ is a stable ﬁxed point of f. If we denote a point in Rn+1 by y = (x.19) By the implicit function Theorem the system of (7. Continuation techniques Clearly. Note that a complex eigenvalue always has a conjugate partner. where 0 < θ0 < π.18) F (x. provided that rank JG (y0 ) = n. we are interested in the dependence of the limiting periodic state of the periodically forced process on certain bifurcation parameters. α) − x.19).5(c). that satisﬁes (7. and the limiting state corresponds to a new point x∗ that satisﬁes ˜ f 2 (˜∗ ) = x∗ = f (˜∗ ). also called equilibrium points.5(b). α). x∗ is a periodic state of the process. the ﬁxed point x∗ becomes unstable.178 Numerical approach for solving periodically forced processes reactor.19) locally deﬁnes a smooth onedimensional curve C in Rn+1 passing through a point y0 . it has becomes asymmetric. and corresponds to a transition from a single to a twofrequency motion. the limiting state of the reactor becomes quasiperiodic. see 8.20) . This bifurcation is called a NeimarkSacker bifurcation. the limiting state of the reactor is a symmetric period state. which implies the state follows two frequencies. Equation (7. x ˜ x Since f 2 equals the period map of a whole cycle of the reverse ﬂow reactor. However. (7. where the period map F : Rn+1 → Rn depends upon one bifurcation parameter α. If we alternate the bifurcation parameter such that the largest eigenvalue of the monodromy matrix leaves the unit circle at µ = −1.3). deﬁned by (8. (7.18) leads to G(y) = 0. α ∈ R. In this section we describe the basics of continuation techniques to analyze the dynamical system xk+1 = F (xk .
˜ (7. y1 . . prediction.21) where ∆sk is the current step size.20) is called regular. yk−1 and yk .23) vk = yk−1 − yk The advantage of this method is that the computation of the Jacobian J G and of the solution of a large system of equations is avoided. The ﬁrst two points of the sequence are computed by ﬁxing the bifurcation parameter and applying iterative methods to solve (7. Next. points on this curve (y0 . The next continuation point is predicted by adding a step to the previous point. that is based on previously computed points of the branch and an appropriate step length. We describe some of the basic choices for the prediction and the correction step.21). y2 . most of the continuation algorithms used in bifurcation analysis implement predictorcorrector methods that include three basic steps.7. and vk ∈ Rn+1 is a vector of unit length ( vk = 1). The prediction is given by (7. During continuation. Then. (7. Every point on the curve C that satisﬁes (7. For the subsequent points. . Another popular prediction method is the secant prediction. . and strategies to validate the new computed point on the bifurcation branch in order to choose the new step size. To obtain the tangent vector we parametrize the curve near yk . by the arclength s with y(0) = yk .) are approximated with a desired accuracy. If we substitute the parametrization into (7. the initial guess y of the next point in the sequence is ˜ made using the prediction formula y = yk + ∆sk vk .22) has a unique solution (vk has unit length) because rank JG (yk ) = n by the assumption of regularity. . A possible choice for vk is the tangent vector to the curve in yk . It requires two previous points on the curve.3 Bifurcation theory and continuation techniques 179 Here JG (y0 ) denotes the Jacobian of G at y0 .19) and take derivative with respect to s.18). where now yk−1 − yk . (7. Finally the step size is adapted. the prediction is corrected bordered with a step length condition. correction and step size adjustment. System (7. we obtain JG (yk )vk = 0.22) since vk = dy/ds(0). Prediction Suppose that a regular point yk in the sequence approximating the curve C has been found.
hk (˜) = 0). namely. Correction Having predicted a point y presumably close to the curve. Because the element of yk with this index is locally the most rapidly changing along C. F (x. α) − x . We redeﬁne function G : Rn+1 → Rn+1 by G(y) = Solving G(y) = 0 (7. ˜ This approach is called natural continuation. The simplest way is to take a hyperplane passing through the point y that ˜ is orthogonal to the coordinate axis of the bifurcation parameter. simple. Instead of the coordinate axis of the bifurcation parameter often the axis is taken that corresponds to the index of the component of vk with the maximum absolute value. It is natural to assume that the prediction point y belongs to ˜ ˜ this surface as well (that is. one needs to locate ˜ the next point yk+1 on the curve to within a speciﬁed accuracy. a scalar condition hk (y) = 0 has to be appended to the system (7.180 Numerical approach for solving periodically forced processes A third.24) geometrically means that one looks for an intersection of the curve C with some surface near y . However. in order to apply Newton’s method or a quasiNewton method. the standard Newton iterations have to be applied to a system in which the number of equations is equal to that of the unknowns. So.18). The direction of the step is therefore given by vk = en+1 . where en+1 is the last unitvector in Rn+1 . method is changing the bifurcation parameter only and using the last point in the sequence yk as initial guess of the next point on the bifurcation branch. hk (y) . This correction is usually performed by some Newtonlike iterations. where hk : Rn+1 → R is called the control function. There are several ways to specify the y function hk (y). A minor of this method is that the branch can only be detected in increasing direction of the bifurcation parameter. set hk (y) = α − α.
7.3 Bifurcation theory and continuation techniques
181
PSfrag replacements
y0 y1 v y ˜ y v y ˜
Figure 7.6: Prediction and correction step.
Another possibility, called pseudoarclength continuation, is to select the hyperplane passing through the point y that is orthogonal to the vector v k . ˜ This hyperplane is deﬁned by 0 = y − y , vk . ˜ Therefore we set hk (y) = = = y − y , vk ˜
y − yk , vk − ∆sk .
y − (yk + ∆sk vk ), vk
(7.25)
If the curve is regular (rank JG (y) = n for all y ∈ C) and the step size ∆sk is suﬃciently small, one can prove that the Newton iterations for (7.24) will converge to a point on the curve C from the predicted point y of the tangent ˜ prediction or the secant prediction, [38]. For the third possibility not a hyperplane is taken but a sphere around the previous computed point yk in the sequence. That is, the distant between the approximation of the next point on the curve and yk is ﬁxed at ∆sk . The control function is therefore deﬁned as hk (y) = y − yk − ∆sk . Clearly, the predicted point y lies on the sphere. The main disadvantages of ˜ this approach is that the control function is not linear and, especially in the neighborhood of bifurcation point, the continuation might go in the wrong direction, since the curve has at least two intersection points with the sphere. Note that the matrix JG (yk ) needed in (7.22) can be extracted from the last iteration of the Newton process solving (7.24).
182
Numerical approach for solving periodically forced processes
Step size adjustment There are many sophisticated algorithms to control the step size ∆sk . The simplest convergence dependent control, however, has proved to be reliable and easily implementable. That is, if no convergence occurs after a prescribed number of iterations in the correction step, we decrease the step size and return to the prediction step. If the last point is successfully computed, we accept it as a new point of the sequence and multiply the step length with a given, constant factor greater than one. If the convergence succeeds but it uses many iterations we accept the new point of the sequence but decrease the step size ∆sk . To summarize this section we give the continuation algorithm that we have applied in our simulations. Algorithm 7.2 (Continuation scheme). Let yk = (xk , αk ) and yk−1 = (xk−1 , αk−1 ) be the last successfully computed points in the sequence approximating the branch. Fix the real parameters a and b (a > 1 and 0 < b < 1) and the number imax . The next point, yk+1 = (xk+1 , αk+1 ), in the sequence is determined by: Secant prediction: Set y = yk + ∆sk · vk , where ∆sk is the current step ˜ size and vk is deﬁned by (7.23). pseudoarclength continuation: Solve (7.24) where G(y) = (F (x, α) − x, hk (y)) and hk (y) is deﬁned by (7.25) stepsize control: If the correction step fails, then multiply ∆sk by b and return to the prediction step. If the correction step succeeds using less than imax /2 iterations of an iterative method, then accept the new point in the sequence and set ∆sk+1 = a∆sk . If the correction step succeeds but using more than imax /2 iterations, then accept the new point in the sequence and set ∆sk+1 = b∆sk .
Chapter 8
Eﬃcient simulation of periodically forced reactors in 2D
The ﬁnal chapter of this thesis is devoted to the connection between the iterative method for solving highdimensional systems of nonlinear equations and the eﬃcient simulation of a twodimensional model for the reverse ﬂow reactor, where the radial direction is taken into account.
8.1
The reverse ﬂow reactor
We start to recall the description of the reverse ﬂow reactor from the introduction. The reverse ﬂow reactor (RFR) is a catalytic packedbed reactor in which the ﬂow direction is periodically reversed to trap a hot zone within the reactor. Upon entering the reactor, the cold feed gas is heated up regeneratively by the hot bed so that a reaction can occur. The reaction is assumed to be exothermic. At the other end of the reactor the hot product gas is cooled by the colder catalyst particles. The beginning and end of the reactor thus eﬀectively work as heat exchangers. The cold feed gas purges the hightemperature (reaction) front in downstream direction. Before the hot reaction zone exits the reactor, the feed ﬂow direction is reversed. The ﬂowreversal period, denoted by tf , is usually constant and predeﬁned. One complete cycle of the RFR consists of two ﬂowreverse periods. Overheating of the catalyst and hot spot formation are avoided by a limited degree of cooling, at the wall at constant temperature. This can be done by using a huge amount of cooling water, that ﬂows at a high rate along the outside of the reactor wall. A 183
184
Chapter 8. Eﬃcient simulation of periodically forced reactors in 2D
schematic diagram of the reactor is shown in Figure 8.1.
Tc
catalyst
PSfrag replacements
gas ﬂow
cooling jacket Tc
Figure 8.1: Schematic drawing of the cooled reverse ﬂow reactor.
Starting with an initial state, the reactor goes through a long transient phase before converging to a periodic limiting state, also called the cyclic steady state (CSS). Limiting states of periodically forced packed bed reactors are of interest to the industry because the reactor operates in this situation most of the time. The basic model for a ﬁxed bed catalytic reactor, such as the RFR, is the socalled pseudohomogeneous onedimensional model. This model does not diﬀerentiate between the ﬂuid and the solid phase and considers gradients in the axial direction only. Eigenberger and Nieken [19] have investigated a simpliﬁed onedimensional model. Due to a very short residence time of the gas in the reactor, they assume the continuity equation and the mass balance equation to be in quasi steady state when compared to the energy balance equation. They apply standard dynamical simulation to compute the limiting periodic states of the reverse ﬂow reactor. Due to their choice of the model and the values of the parameters all periodic states discovered are symmetric, that is, the state after one ﬂow reversal period is the mirror image of the initial state. Rehacek, Kubicek and Marek [57, 58] have extended the model of the RFR to a twophase model with transfer of mass and energy between the ﬂuid and solid phase. They consider the period map, that is, the map which assigns the new state after one period of the process to an initial state. To obtain a numerical expression of the period map, the authors discretize the partial differential equations of the model in space and integrate the resulting system of ordinary diﬀerential equations over one period. Again with dynamical simula
In periodically forced systems. 8. as explained in more detail in Section 7.2 The behavior of the reverse ﬂow reactor From the initial state to the CSS As initial condition for the reverse ﬂow reactor we take a preheated reactor ﬁlled with an inert gas. the limiting solution varies in time. twodimensional models are standard practice. To our knowledge. however. and eﬃcient cooling of the reactor at the wall cause radial temperature gradients to be present. Khinast. such as the RFR. both regarding CPUtime and regarding memory usage.3. where time derivatives are absent. For the theoretical analysis of limiting states of steady state processes. The reason being that an accurate simulation requires a ﬁne grid which yields a high dimensional discretized system.8. twodimensional models of periodically forced systems have so far been avoided. For steady state processes that have coeﬃcients and boundary conditions invariant in time. 35] have developed an eﬃcient method to compute bifurcation diagrams of periodic processes. Clearly. 32. a great number of eﬃcient mathematical and numerical tools is available. therefore. iterating the period map. The method of Broyden is used in combination with continuation techniques to ﬁnd the parameter dependent ﬁxed points of the period map. Their approach is based on previous work of Gupta and Bhatia [26] in which the system of partial diﬀerential equations is considered as a boundary value problem in time. such as the method of Broyden. In the computations we consider the dimensionless temperature θ = (T − T0 )/T0 and the conversion x = (c0 − c)/c0 . where T0 is . see Figure 8. a large width of the reactor. a time invariant state can often be expressed as the solution of a system of ordinary diﬀerential equations. Highly exothermic reaction. at the expense of relevance and accuracy. for cooled reverse ﬂow reactors the radial dimension must explicitly be taken into account. that is. see [56]. The radial transport of heat and matter. In addition. full twodimensional models for the RFR have never been solved using a direct iterative method. has to be a ﬁxed point of the period map. The boundary condition implies that the initial state of the reactor equals the state at the end of the cycle and.2 The behavior of the reverse ﬂow reactor 185 tion. [34. When modeling a steady state process. Luss et al. they observe asymmetric and quasiperiodic behavior.2(b). Due to large computational costs. symmetric stable periodic states of the RFR are obtained. is very important in nonisothermal packed bed reactors [72]. 33.
26)(6. The reaction front can be easily distinguished.186 Chapter 8. the reaction occurs and the concentration of species A decreases. After a period of time tf the feeding at the left end of the reactor is stopped and the ﬂow direction is reversed by feeding from the right end of the reactor. Eﬃcient simulation of periodically forced reactors in 2D PSfrag replacements PSfrag replacements temperature rad. We start the process and let the gas ﬂow entering the reactor at the left end. Directly after this ﬂow reversal.2.3 the state of the reactor is given a diﬀerent times. Because the reactor is cooled the temperature decreases at the right side of the reaction front. the catalyst at the left side of the reactor is cooled due to the low temperature of the feed gas. The feed gas contains a trace of the reactant A and is at low temperature. distance conversion temperature conversion ax. the temperature increases and a reaction front is created. The concentration A still present in the left part is purged out of the reactor and after a short intermediate phase the . distance (a) conversion rad. the cold feed is heated up and comes into contact with the catalyst. the hot reaction zone withdraws from the right end and moves in left direction. Therefore.28) with the parameter values of Table 6. At the left side of the reaction front the temperature is too low to activate the reaction. At the other side of the reaction front all of the reactants has reacted and the conversion is completed. the temperature and c0 is the concentration of the feed gas. In Figure 8.2: Qualitative temperature and conversion distribution of the cooled reverse ﬂow reactor in the cyclic steady state according to the twodimensional model (6. The dimensionless initial condition is set to θ≡1 and x ≡ 1. distance ax. distance (b) temperature Figure 8. On the other hand. Because the reaction is assumed to be exothermic. When entering the hot reactor.
6 0. Clearly. Note that the reaction front now occurs at the right side of the hot zone.6 temperature 0.2 temperature conversion 0 0.8 con 0.2 0.4 0.8 con 0.6 0.4: Snapt shots of the second reverse ﬂow period of the reactor.2 0.8 1 0 0 0. because the reaction . see Figure 8.8.2 0.4 PSfrag replacements 0.6 temperature 0.8 1 conversion axial distance axial distance Figure 8.2 The behavior of the reverse ﬂow reactor 187 1 1 0.2 temperature conversion 0 0.4 0.8 1 0 0 0.6 0.4 0.6 0. It depends on the conditions of the process what will be the state of the reactor after many cycles.6 0.2 PSfrag replacements 0 0.4 0.8 0.8 1 conversion axial distance axial distance Figure 8. conversion of species A in the product gas is again equal to one.8 0.4.4 0.4 0.6 0. 1 1 0.3: Snap shots of the ﬁrst reverse ﬂow period of the reactor.4 PSfrag replacements 0. when the cooling capacity is too high the state extinguishes.2 PSfrag replacements 0 0. The product gas during this intermediate phase is often considered as waste gas. the hot reaction zone is catch in the reactor.2 0. By reversing the ﬂow direction after a ﬁxed period tf over and over again.
in some applications involving equilibriumlimited reactions cooling is applied to avoid exceeding some critical temperatures at which either undesired reactions or catalyst deactivation may occur.2.8 0.2 0.7 0.6 0. Periodical.5: The limiting state of the reverse ﬂow reactor at the switch of the ﬂow direction .5 0. If the reverse ﬂow period is too long the reaction front exits the reactor.2 0. a second period determines the overall behavior.1 0 0 0.8 0.3 (a) Φ = 0.9 1 0.1 0 0 0. Examples of the dimensionless temperature proﬁles. symmetric states at which the temperature (and concentration) proﬁles at the beginning and end of a ﬂowreversal period are mirror images. Adiabatic operation leads to periodic.4 0. We describe the limiting state of the reactor for diﬀerent values of the cooling capacity and a moderate reverse ﬂow period.332 (b) Φ = 0.7 0.2 0. Eﬃcient simulation of periodically forced reactors in 2D cannot be sustained at low temperatures.7 0. Reactor cooling may introduce some complex and rich dynamic features.5 0.5 0.4 0.2 0.4 0.2 0. For example.188 Chapter 8.3 0.1 0 0 0. which do not exist in its absence.4 0.6 0. Various modes of RFR cooling were described by Matros and Bunimovich [44].9 temperature temperature 0.4 0. under relatively fast ﬂowreversal frequencies the symmetric states of a cooled RFR may become unstable and either asymmetric or quasiperiodic states may be obtained.5.4 0. for these three types of states are shown in Figure 8.2 0.8 1 PSfrag replacements PSfrag replacements axial distance 1 PSfrag replacements axial distance 1 axial distance (c) Φ = 0.8 0.3 0. Laboratory and pilotplant RFRs usually cannot be operated in an adiabatic mode [43].3 0. We call these states symmetric period1 operation.6 0. Quasiperiodic behavior of the reactor means that in addition to the ﬂowreversal period. as deﬁned in Table 6.9 1 0. the forcing frequency.8 temperature 0. asymmetric and quasiperiodic states We use dynamical simulation to determine the limiting state of the reactor for diﬀerent values of the dimensionless cooling capacity.6 0. The diﬀerences in the dynamic features are caused by changes in the dimensionless cooling capacity Φ.8 0.324 Figure 8. 1 0.6 0. Moreover.6 0.
We construct a corresponding Poincar´ map by considering ∆θave (n) versus e θave (n). ntf )dz .8 0.9 θmax 0.85 θmax 0.6.85 0. for the ﬁrst 500 ﬂowreversal periods. . . and the same picture starting after 420 ﬂowreversal periods (Φ = 0. .9 0.2) . 2.2 The behavior of the reverse ﬂow reactor 189 To illustrate the development of quasiperiodic behavior we consider the maximum temperature of the reactor at the end of every ﬂowreversal period. reactor reaches a quasiperiodic regime.75 0. 1. After a transient phase of about 50 ﬂowreversal periods the 1. (8. see Figure 8.05 1 0. The second frequency of the quasiperiodic behavior of the reactor equals 45 ﬂowreversal periods.6: The maximal temperature of the reactor at the switch of the ﬂow direction. 1.3). ntf )dz. where we deﬁne 1 θave (n) = 0 θ(z. . 2. (8.1) and 1/2 1 ∆θave (n) = 2 0 θ(z.7 0 50 100 150 200 250 300 350 400 450 500 PSfrag replacements number of ﬂowreversals 0. 1/2 n = 0. .75 420 430 440 450 460 470 480 490 500 number of ﬂowreversals Figure 8. ntf )dz − θ(z. . . .8 PSfrag replacements 0.8. n = 0.95 0.
2 0. the Poincar´ map consists of two points. .6. thus indicating quasiperiod behavior.41 0. Each curve corresponds to one ﬂow direction. 0.25 0.42 0.7 we have plotted the Poincar´ map corree sponding to the quasiperiodic behavior of Figure 8.1 0. In Figure 8. the sign of ∆θave (n) changes upon alternating ﬂow reversal. both for the same θave (n) value. For symmetric period1 states.3 ∆θave 0. but not for the e same θave (n) values.5 0. representing the quasie periodic behavior of the reverseﬂow reactor after the transient phase (2k ≥ 50). consist of a set of points forming two closed curves.15 PSfrag replacements 0. For e asymmetric period1 states.05 0 0.425 θave Figure 8.35 0. the Poincar´ map has two points.190 Chapter 8.45 0.4 0.415 0.7: The Poincar´ map of ∆θave (2k) versus θave (2k). Eﬃcient simulation of periodically forced reactors in 2D The value θave (n) is the average reactor temperature after the nth ﬂow reversal and ∆θave (n) is the corresponding averaged diﬀerence between the temperatures in the right and left half of the reactor. Clearly.
Therefore we do not compute the state of the RFR after a whole cycle. If asymmetric periodic states exist.26)(6.3 Dynamic features of the full twodimensional model Before doing simulations with the twodimensional model (6. The state of the reactor after a whole cycle is then obtained by applying the map f twice to the initial condition.28). Indeed according to the theory of Section 6. Ru(ρcp )g To obtain the results of this section the BRR method is used with p = 30.0025). The results are expressed in the dimensionless temperature (T − T0 )/T0 and the conversion (c0 − c)/c0 . We describe two diﬀerent cases of the limiting periodic state for a ﬁxed value of the cooling capacity (Φ = 0. (8. As a bifurcation parameter we use the dimensionless cooling capacity. Eigenvalues of the Jacobian Jf are determined using the subspace method with locking [60]. but we integrate the system over one ﬂowreversal period (tf ) and then reverse the reactor in the axial direction. see Figure 8. So. we simplify the problem in the following way. we observe that the temperature is constant over every cross section of the reactor. (tc u)/L).2). it makes no diﬀerence if the ﬂow direction in the reactor is reversed or if the reactor is reversed itself while the ﬂuid ﬂows from the same direction. We consider aspects of limiting periodic states of the RFR for diﬀerent values of the dimensionless reactor radius. describing the dependence of the symmetric periodic state of the reactor on the dimensionless cooling capacity. R/L = 0. instead of f (x0 ) = x(z.2. denoted by R/L.8(b).3) where L is length of the reactor and u is the superﬁcial velocity.8. We have ﬁxed the ﬂowreversal period (tf = 1200s).3 Dynamic features of the full twodimensional model 191 8. the period map is given by f (x0 ) = x((L − z)/L. are constructed using a standard continuation technique in combination with the BRR method. If the reactor is rather slim (for example. The bifurcation diagrams. A ﬁxed point of f corresponds to a symmetric periodic state of the reactor. In this section we restrict ourselves to the computation of symmetric periodic states. if . (tf u)/L). In this way we can validate the twodimensional model. Φ= 2LUw . From the mathematical point of view. deﬁned by. we can ﬁnd them by computing ﬁxed points of the original period map. The only way to determine whether the limiting state of the reactor is quasiperiodic is by using dynamical simulation.
9 1 0 0 0. Note that only around the axis the conversion is complete at the end of the .2 0.4 PSfrag replacements 0.9(a) the conversion of the same cyclic steady state is given.5 0. Note that for diﬀerent radial positions the axial position of the maximum temperature is shifted.2 0 conversion 0 0. the temperature proﬁle along the reactor is plotted.1 0.2 1.8 0. the weighted average of the twodimensional temperature proﬁle equals the temperature proﬁle of the onedimensional model.7 0.28) with the parameter values of Table 6. the cooling is especially inﬂuencing the temperature of the catalyst near the wall of the reactor.6 PSfrag replacements 0.6 0. For several positions in the radial direction.3 0.192 Chapter 8. This results in a lower maximum of the weighted average of the temperature.025).3 0. This implies that the cooling now propagates less easily through the reactor and steep temperature gradients in the radial direction arise.2). We use the same value for the cooling capacity (Φ = 0. The lines with the highest temperatures correspond to radial positions near the axis of the reactor.2 1 1 conversion 0.1 0.26)(6. but now with a larger reactor width (R/L = 0. 1.2 0.4 0.4 0.6 0.5 0.6 0. In Figure 8.9 1 axial distance (a) Conversion axial distance (b) Temperature Figure 8.8 0. The lines with the lowest conversion correspond to radial positions near the wall of the reactor.2 0.0025.8: Axial temperature and conversion proﬁles of the RFR (in CSS) at the beginning of a reverse ﬂow according to the twodimensional model (6.2 and the radius of the reactor equals R/L = 0. Eﬃcient simulation of periodically forced reactors in 2D radial gradients are absent. The same observation is valid for the conversion. This has been conﬁrmed by simulations of the onedimensional model.2. see Figure 8.8 temperature 0. The lines with the lowest temperatures correspond to radial positions near the wall of the reactor. The cooling capacity Φ is ﬁxed at 0.9(b) we have represented the distribution of the temperature over the catalyst bed in the cyclic steady state.4 temperature 0. Clearly.7 0.8 0.8(a). The lines with the highest conversion correspond to radial positions near the axis of the reactor. In Figure 8.
2.8 0.18) is given (’◦’).3).8. for diﬀerent radial positions.9: Axial temperature and conversion proﬁles of the RFR (in CSS) at the beginning of a reverse ﬂow period according to the twodimensional model (6.8 temperature 0.6 PSfrag replacements 0.3 Dynamic features of the full twodimensional model 193 reactor. It can be shown that. The reason is that the two high temperature zones.4 0. The weighted average (6.025) at the same cooling capacity.8 0.2 1.6 0.2 1 1 conversion 0.1 0. This can be explained by the fact that for the wide reactor.2 0 conversion 0 0. a stable extinguished state exists.4 temperature 0. the minimum has disappeared and the upper branch has become monotonically decreasing. the reactor cannot operate at high temperature and dies out. merge into one.025.8(b). 1.7 0. The part of the branch with negative cooling capacity has of course no physical meaning. for every value of the cooling capacity.2 0. Note that for the slim reactor there exists a minimum in the upper branch (at Φ ≈ 0.2 0.2 and the radius of the reactor equals R/L = 0.4 0. the weighted average (6.6 0. cf. we .28) with the parameter values of Table 6. For cooling capacities higher than Φ ≈ 0.8 0. The maximum of these values is plotted versus the dimensionless cooling capacity Φ for diﬀerent values of R/L. Figure 8.5 0.9 1 0 0 0.2 0.18) of the temperature is computed over every cross section. To determine the stability of the points on the bifurcation branches.7 0.26)(6. the maximum of the temperature is not found at the same axial position in the reactor. For the slim reactor (R/L = 0.0025) the maximum average temperature is always higher than for the wide reactor (R/L = 0. The bifurcation branch for the wide reactor has more or less the same characteristics.6 0. Two bifurcation branches are shown in Figure 8.1 0.10. The cooling capacity Φ is ﬁxed at 0.3 0. However. In addition.4 PSfrag replacements 0.3 0. Therefore the product gas consists of a mixture of both products as reactants and on an average the conversion is not complete.67.5 0.9 1 axial distance (a) Conversion axial distance (b) Temperature Figure 8.
2 1 0. at the limit point.1 0.26)(6. Eﬃcient simulation of periodically forced reactors in 2D 1.7 0.48).5 0.67 (Φ ≈ 0.025)] have also plotted the largest Floquet multiplier (µmax ) in Figure 8.0025). for higher cooling capacities the cooling eventually causes extinction of the reactor.4 −0. ’◦’ (R = 0.32 (Φ ≈ 0.1 0 0. the largest eigenvalue returns to the unit circle but remains close to −1.8 θmax µmax 0.10.19.8 dimensionless cooling capacity Φ Figure 8.2 1 0.4 0.2 0 1.10: The maximum dimensionless temperature (θmax ) and the largest Floquet multiplier (µmax ) versus the cooling capacity (Φ) for two diﬀerent values of the reactor radius. The ﬁxed points of the lower branches for both the wide and the slim reactor are unstable.6 0.4 0. [’∗’ (R/L = 0.194 Chapter 8.2.2 −0. Then the symmetric state is stable but it takes the reactor a large number of cycles to converge to this limiting state.8 −1 −1. that is.4 1. causing a symmetry loss bifurcation. . Finally. for which Φ ≈ 0.6 1.65). So.28) was used with the parameter values of Table 6. The twodimensional model (6.2 0.6 0.8 PSfrag replacements −0.3 0. implying that the ﬁxed points are stable.15 a negative eigenvalue the largest eigenvalue in modulus and crosses the unit circle at µ = −1 for Φ ≈ 0.8 1. the symmetric state become unstable and a stable asymmetric period1 state emerges. Starting with Φ = 0 at the upper branch of the bifurcation diagram the largest eigenvalue of the Jacobian at the ﬁxed points is slightly less than +1. At Φ ≈ 0. For cooling capacities higher than Φ ≈ 0.2 −1. a positive eigenvalue crosses the unit circle at µ = +1.
where m is the total number of function evaluations. A clear and detailed introduction in quasiNewton methods to solve nonlinear equations and optimization problems is given by Dennis and Schnabel [18].12 is called the Banach perturbation theorem.2 In [8] Broyden uses the mean convergence rate R given by R= 1 log( g(x0 ) / g(xm−1 ) ) m as the measure of eﬃciency of a method for solving a particular problem.4 of [18] is a more general version of the perturbation theorem. Theorem 1.3 is Lemma 2.1 of [18].1 of [18]. Theorem 1.5 is given without proof in [18] where it is Theorem 2.3.1 For more information on ﬁnite arithmetic see [18] and [20]. where it is assumed that Jg ∈ Lipγ (N (x∗ . for some r. r) ⊂ D. where .15 is Theorem 5.3. Theorem 1.2.2 of [18]. In this thesis we divide the logarithm by k ∗ instead of m.6.11 is a simpliﬁcation of Lemma 4. B ∈ Rn×n and I = 1. which makes R inﬁnity if k ∗ = 0.1. The theorem is also given [55].12 of [18]. r)) with N (x∗ .2 of [18] and Lemma 1. Lemma 1. A. where Jg is assumed to be Lipschitz continuous at x only.4 is Corollary 2. Theorem 3. 195 . Section 1.4. Theorem 1.6.10 is a simpliﬁcation of Theorem 5.4. can be any norm on Rn×n that satisﬁes AB ≤ A · B . The proof of Theorem 1.2 is given in [18] where it is Theorem 2.4.Notes and comments Section 1.1. Lemma 1.
An overview of many of the important theoretical results of secant methods is given in e. This very general theorem of Broyden.1 of [18].3 Lemma 1. Theorem 1. Corollary 1.25 is Corollary 3.1. because the initial Broyden matrix must be close to the Jacobian.2. to other secant methods.27 is Lemma 8. Well known approaches are for example line search and the modeltrust region approach. The theorem is in some sense considered as unsatisfying. where we choose B0 = −I in general.26 is a particular case of Theorem 4. For practical implementation. Lemma 1.2 of [11]. the method of Broyden has to be used in combination with global algorithms.32) instead of the Frobenius norm.16 of [18]. For the limited memory Broyden methods. Lemma 1. where instead of the Frobenius norm a weighted matrix norm. see [8]. . see Sections 6. To obtain a more robust method.5 of [18].196 Notes and comments Section 1. M .28 is Lemma 2. some clearly less desirable than Broyden’s update.15 and 4. The lemma is also given in [11] where it is Lemma 3. The proof of the theorem is simpliﬁed by using results of [18] and [16]. Theorem 1.2. [17.4 of [18]. In 1970 Broyden [7] has proved that his method converges Rsuperlinearly on linear problems and in 1971 he has proved that the method converges locally and at least linearly on nonlinear problems [9].3 of [11].g. this is assumption is not satisﬁed. In 2000 Broyden has written a short note on the discovery of the ’good Broyden’ method [10]. denoted by . multiple solutions for A exist. Lemma 1. If the l2 operator norm is used in (1.29 is Lemma 8. 42].g. Theorem 1.3 of [18]. all other convergence proofs of the quasiNewton methods for nonlinear equations are built on this result. Broyden himself chose the ﬁnitediﬀerence approximation of the Jacobian for the initial estimate B0 and applied a backtracking strategy for the line search.3 of [11].1.2 of [16].20 is a special case of Lemma 8.1.30 can be found in e.24 is a special case of Theorem 3. [28]. However.23 is a combination of Lemma’s 4.1. Lemma 1. Dennis and Mor´ was developed to extend the analysis e given by Dennis for Broyden’s method [15].3 and 6.
As a result of the 2nstep exact convergence for linear systems. where A is in Jordan normal form.1 of [23]. The algorithm was also published by Gay in [22]. Simulations conﬁrmed this conjecture. since it has to be checked during the process. where it is Lemma 2.1 and 3.18 we consider in the examples of Chapters 2 and 4 aﬃne functions g(x) = Ax.8 is Lemma 3. .16.7 is a slightly adjusted version of Lemma 3. We would like to sharpen Theorem 2.2 of [23].4 of [23]. see Example 2. that is derived from Lemma 2.3 According Lemma 2. Gay proved in [22] that the method of Broyden is 2nstep quadratically convergent for nonlinear functions.11 and 2.9 is Lemma 3. Lemma 2. Theorems 2. In Chapter 2 we only consider aﬃne functions g(x) = Ax + b.2 Lemma 2. Therefore. a nonzero vector wk+1 ∈ Zk+1 ∩ Ker(I − AHk+1 ) exists.10 is Lemma 3. Theorem 2.3 of [23]. Lemma 2. This would imply that if w0 = 0 and v0 w0 = 0 the method of Broyden needs d0 iterations to converge.4 is Theorem 2.Notes and comments 197 Section 2.1 of [54]. Lemma 2. that equals wk and T T satisﬁes vk+1 wk+1 = 0.2 of [22]. yk = Ask and yk = 0 if and only if sk = 0. Algorithm 2.1 requires a full 2n steps to convergence. Section 2. where in case of yk = 0 the new inverse Broyden matrix Hk+1 was set to Hk . is proposed by Gerber and Luk in [23]. In [22] Gay proved under which conditions Algorithm 2. Section 2.13 is Lemma 3.12 in the following way. A full proof of Lemma 2. where the matrix A ∈ Rn×n is nonsingular. Note that the condition (2.2 of [23] and Lemma 2. If T dim Zk+1 = dim Zk − 1 and vk wk = 0 then dim Zk+2 = dim Zk+1 − 1. Lemma 2. see [64] for more details.1 The generalized Broyden’s method.19) is unsatisfactory.1.3 of [22].2 is a particular case of results derived in [23].3 can be found in [22].1.12 are Theorems 3.
avoiding the usage of a large (n × n)matrix in the algorithm. see [60]. [6]. TensorKrylov methods provide even faster algorithms. Derived from this idea. 67]. Morales and Nocedal [45] and Nocedal [48]. 66. To approximate the Newton step. [41]. A relatively new ﬁeld of research is the NewtonKrylov method.g. This implies for the intermediate Broyden matrix B that Bvp = B0 vp . The algorithm applies the method of Newton on a small pdimensional subspace and dynamical simulation. on the orthogonal subspace. Liu and Nocedal [40]. cf. The NewtonPicard method is ﬁrst proposed by Lust et al. Kolda.g. [27. In 1970 Schubert [62] has proposed a secant method to solve nonlinear functions. and in all other directions Q has the same action as Q. he imposes the updated Broyden matrix to have the same sparsity structure as the Jacobian. see [5]. Toint has extended this approach to quasiNewton algorithms for optimization problems. The small subspace is formed by the eigenvectors corresponding to the largest eigenvalues in modulus of the Jacobian Jg at the current iterate xk . where the Jacobian is sparse and the locations of the nonzero elements are known. The p eigenvectors can be computed using subspace iteration. Let vp be the right singular vector corresponding to the pth singular value of the update matrix Q in iteration p + 1. [65.11 with q = p − 1 can also be considered as an additional rankone update. Limited memory quasiNewton methods for optimization problems have been studied by e. both theoretical and experimental. In addition to the secant equation. O’Leary and Nazareth [37].1 In this thesis we only consider limited memory methods that are based on the method of Broyden and are applicable for nonlinear functions with general nonsingular Jacobian. Section 3. . 25] The rank reduction applied in Algorithm 3. The new update matrix Q satisﬁes Qvp = 0. that is based on solving the Newton iteration step without computing the Jacobian of the function explicitly. Bu = Bu for u ⊥ vp . subspace iteration is used. Picard iteration.2 A good overview of singular values can be found in e. cf. In 1971 Broyden [9] has investigated the properties of this modiﬁed algorithm.198 Notes and comments Section 3.
see (1. In a limited context using the notation of Section 3.2 of [12]. while (8. only some subset of them can be utilized in a numerical implementation of the multiple Broyden method. . Therefore.22 is Theorem 6. the multiple secant version of Broyden’s update.23 is proposed by Richard Byrd.4 comes from an article by Byrd. including the main diagonal. standard BroyT den updates. If they are not.4) is only well deﬁned numerically if the k step directions that make up Sk are suﬃciently linearly independent.26). in which they derived short representation for diﬀerent quasiNewton methods. Section 3. Thus it would probably be worthwhile considering either method (or their inverse formulations) in a limited memory method for solving nonlinear equations. The preference between these two formulas does not appear to be clear cut.2 A comprehensive overview of chemical reactors and modeling techniques is written by e. while in (3.4) always enforces the k prior secant equations while (3.20 is Lemma 2.4) This update is well deﬁned as long as Sk has full column rank.4 The idea of Section 3. This is the approach that has often been taken in implementations of this update.27) has the advantage that it is well deﬁned for any Sk . The formula (3. and obeys the k secant equations Bk Sk = Yk .4.27) for k consecutive.1 of [12] and Theorem 3. The scaling in Algorithm 3. On the other hand. (8.g. we see that in the multiple secant approach we use Sk Sk . Scott Fogler [63] and Froment and Bischoﬀ [21].4) to the formula in (3. Comparing (8. Theorem 3. Lemma 3. The condition (3. T vp vp which is a rank oneupdate of B. Nocedal and Schnabel [12].1 of [12]. (8.Notes and comments 199 and therefore B = B + (B0 vp − Bvp ) T vp T = B − Qvp vp .17) on the reduction matrix R is already suggested in [11]. is given by T T Bk = B0 + (Yk − B0 Sk )(Sk Sk )−1 Sk .27) it is the upper triangular portion of this matrix.21 is Theorem 6. Section 6. the two update are the same if the directions in S k are orthogonal.25) and (1.27) only enforces the most recent equation.
Allgower.g.2 An extended investigation of the dynamical behavior of the reverse ﬂow reactor is given by Khinast et al. Studies in continuation techniques and bifurcation analysis can be found in work by e. [50]. [51. An advanced adapted Broyden method that uses information of the continuation process to update the Broyden matrix is developed by Van Noorden et al. [33].2 An introduction to dynamical systems can be found in [4]. From their work it turns out that Broyden’s method is the most eﬃcient for solving large systems of nonlinear equations in terms of function evaluations. Section 7. In the neighborhood of bifurcation points the points on the branch might have to be determined more accurately. Section 8. Section 7. Recent studies of the reverseﬂow reactor can be found in work by Gl¨ckler. Section 7.200 Notes and comments A clear introduction is given by Aris [3]. .3 For locating a bifurcation branch it is enough to approximate the points on the branch op to an error of about 10−2 during the continuation scheme. o Kolios and Eigenberger [24] and Jeong and Luss [31]. Van Noorden et al. Chien and Georg [1] and Allgower and Georg [2]. the method of Broyden and the NewtonPicard method) in combination with continuation techniques. 52] compared several convergence acceleration techniques (such as the method of Newton.1 Basics of discretization techniques are given in [59].
Dover Publications Inc.. Math. Math. SpringerVerlag. 201 .J.S.. J. 19:577–593.L. Broyden.G. Math. 1990. Mor´. 2000. [5] A. On the discovery of the ’good Broyden’ method. [8] C. 1965. 1994. Optim. 25:285–294. Comp. Convergence theory of nonlinear NewtonKrylov algorithms. volume 13 of Springer Series in Computational Mathematics. Comp..L.N. [9] C. Optim. 1971. Appl. Bouaricha.G. Brown and Y. Berlin. Dennis. Appl. Cambridge University Press. 26:3–21. A class of methods for solving nonlinear simultaneous equations. Broyden. An introduction to dynamical systems. Broyden. Arrowsmith and C. [4] D.. Corrected and expanded reprint of the 1978 original. The convergence of singlerank quasiNewton methods. B 87:209–213. 24:365–382. Aris. Broyden. [10] C. [3] R. 5:207–232. Math. and K. 1996. Chien. Broyden. 1973. C. 12:223–245. Math. Math.. The convergence of an algorithm for solving sparse nonlinear systems. Georg.G. 1970. TensorKrylov methods for large nonlinear equations. and J.. New York. Inst. Appl. Allgower and K. Jr. Place. 1989. 4:297–330. [11] C. Large sparse continuation problems. G.. 1994. 1990. Comput..G. Comput. Georg. An introduction. Allgower. [2] E..E. On the local and sue perlinear convergence of quasiNewton methods.Bibliography [1] E. Cambridge. SIAM J. Program. Comp. Saad. [6] P. J.M. Mathematical modelling techniques. [7] C.. J. Numerical continuation methods.K.
Byrd. Jr. PA.. Cambridge. Sircar. Sci. QuasiNewton methods. AIChE J. Program. Analysis of a novel reverseo ﬂow reactor concept for autothermal methane steam reforming. 16:623–630. Jr. and G.J.E. Hufton. 42(10):2765–2772.202 Bibliography [12] R.. [14] M. Froment and K. 63:129–156. Schnabel. Catalytic combustion with periodicﬂow reversal. Math. Mor´. A characterization of superlinear cone vergence and its application to quasiNewton methods. Computational differential equations. [23] R. Kolios.. Anal. Some convergence properties of Broyden’s method. Comp.E. Experiments on optimization of thermal swing adsorption. Eng. Davis and M. . Eng. Chem. Corrected reprint of the 1983 original. Chem. Nocedal. Anand. On the convergence of Broyden’s method for nonlinear systems of equations. Jr. 1996. SIAM Rev. Representations of quasiNewton matrices and their use in limited memory methods. P. John Wiley & Sons Ltd. Dennis.. 1974. Gay. Chem. Gerber and F. [22] D. Dennis. Luk.T.. Johnson. Carvill. motivation and e theory. D. 1990. and R. [24] B Gl¨ckler.H. 2003. Ind. [15] J. [18] J.B. [17] J. 58:593–601. 19:46–89.B. Levan. volume 16 of Classics in applied mathematics. M. 18:882–890. and J. [21] G. and R. Eng. [16] J. [13] B. Numer.B.M. 1979.R. and J. J. Numer.R.. Estep. G. [19] G. 28:549–560. Eigenberger and U. Nieken. 25:559–567. Chemical reactor analysis and design. SIAM J. Cambridge University Press. Society for Industrial and Applied Mathematics (SIAM).. Philadelphia. Bischoﬀ.E. 1977. 1989. and C. Dennis. Anal.. Mor´.. 1994. 1996. A generalized Broyden’s method for solving simultaneous linear equations. Sorption enhanced reaction process. Sci. [20] K. 1971. 1988. Eigenberger.. Math.F. Dennis. Jr.D. SIAM J. J. Res.J. 28:778–785.M.T. 1981. Hansbo.. Math. Numerical methods for unconstrained optimization and nonlinear equations. Eriksson. Comp. and S. 1996.E. Schnabel. 43:2109–2115. New York.
and L.. . Khinast and D. Chem. 24:139–152.G. Gupta and S.G. [30] G. Nazareth. pages 135–138. Mayorga. Matrix computations. Corrected reprint of the 1985 original. AIChE J. 45:299–309.. 109:419– 428. 58:1095–1102. [31] Y. Sorptionenhanced reaction process for hydrogen production. [28] A. SpringerVerlag. D.. O’Leary. Luss.F. Horn and C. Hufton. Comput. Sci.. Johnson. Pollutant destruction in a reverseﬂow chromatographic reactor. 1998. Cambridge University Press. Luss. AIChE J. 1990. BFGS with update skipping and varying memory. [33] J. 2003. 1999. second edition. Sci.D. Jeong.P.K. 43:2034–2047. 1999. A. Bhatia. and S. AIChE J. Kodde and A. Khinast. 1997. Eﬃcient bifurcation analysis of periodicallyforced distributed parameter systems.R. Solution of cyclic proﬁles in catalytic reactor operation with periodicﬂow reversal. 2000. Matrix analysis. Undergraduate texts in mathematics.J. Sircar. 45:248–256. Johns Hopkins Studies in the Mathematical Sciences.G.G. New York.H.G. Luss. Iooss and D. [29] J. Van Loan. Catal. [36] A. Bliek. Complex dynamic features of a cooled reverseﬂow reactor.Bibliography 203 [25] G.O. Mc GrawHill.K. Chem. and D. Eng. Eng. 15:229–237. Stud. 1998. 1990. 1997. New York. Khinast. 1991. Eng. 44:1128–1140. Mapping regions with diﬀerent bifurcation diagrams of a reverseﬂow reactor. Luss. Johns Hopkins University Press. Khinast and D. Golub and C. and D. Dependence of cooled reverseﬂow reactor dynamics on reactor model. Jeong and D. S. Elementary stability and bifurcation theory. Selectivity enhancement in consecutive reactions using the pressure swing reactor. 1953.R. Luss. 1996. third edition. Comput. Y. Kolda.O. AIChE J..S. Joseph. Cambridge. 8:1060–1083. Householder. [34] J. Gurumoorthy. SIAM Journal on Optimization. [32] J. [26] V. [37] T.. [35] J..A. [27] R.. Principles of numerical analysis. Chem. Surf.
Liu and J. Comput. Nocedal. Morales and J. Bliek.A. 1979. Sci.C. Chem. 5:64–85. Amsterdam.R. 1998. [46] J. Optim. SIAM J.A. 1989. Roose. [44] Yu. Testing unconstrained ope timization software. 124:97–121. 7:17–41. Verduyn Lunel. 1980. and A.. Automatic preconditioning by limited memory quasiNewton updating. Soft.E. PhD thesis. e ACM Trans. Math. Mor´. Eng. Elsevier. 1995. [49] T. [50] T. 1996. Catal.Sh. Hertzberg.J. Updating quasiNewton matrices with limited storage. ACM Trans.. [39] H.M. Matros and G. Mor´ and M. Nocedal. [42] J. Sci. [45] J. Mathematics of Computation. van Noorden. [40] D. 2000. Lust.204 Bibliography [38] Y. van Noorden. Garbow. Nocedal. Comput.L. Kuznetsov. Spence.L. Reverseﬂow operation in ﬁxed bed catalytic reactors. S. volume 112 of Applied Mathematical Sciences. SpringerVerlag. 35:773–782. second edition. An adaptive NewtonPicard algorithm with subspace iteration for computing periodic solutions. 50:1203–1212. Sci. 1981. B. New algorithms for parameterswing reactors. A. Math.M. Comput. Champneys. Rev..L. To appear in SIAM J. Mart´ ınez. 2002. Hillstrom. D.. Numerical solution of nonlinear equations.Sh. 45:503–528. Bunimovich. New York. On the limited memory BFGS method for large scale optimization.M. 2000. and K. Catalytic processes under unsteady state conditions. SIAM J.. A Broyden rank p + 1 update continuation method with subspace iteration. Amsterdam. Kvamsdal and T. and A. Soft. .. Elements of applied bifurcation theory.S. 1989. 10:1079–1096. [41] K. Vrije Universiteit. [48] J. Cosnard. [43] Yu.J. Practical quasiNewton methods for solving nonlinear systems. Appl. 38:1–68. Math.. Optimization of pressure swing adsorption systems the eﬀect of mass transfer during the blowdown step. [47] J. J. Matros. 19:1188–1209. 1998. Math.Y. Program..
Math.M. Sci. 2000. Math. Marek. Bliek. SIAM J. Available from http://www. Sci. Ortega and W. [62] L. Reprint of the 1970 original.uk/.M. van Noorden. van Noorden. Bliek. SIAM J. Philadelphia. 1967. Rehacek. [56] R.. second edition. [54] D. GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems.. New YorkLondonSydney..M. S. M. Numerical methods for large eigenvalue problems.. quasiperiodic and chaotic spatiotemporal patterns in a tubular catalytic reactor with periodic ﬂow reversal. and A. Sci.P. [60] Y. S. 7:856–869. Chem. and M. Comput.nag.K. algorithms and architectures for advanched scientiﬁc computing. Eng.C. 1970. 57:1041–1055. Marek. M.A. 22:283–297. Verduyn Lunel. . Heterogeneous models of tubular reactors packed with ionexchange resins: Simulation of the MTBE synthesis. IMA J. Optim.D.Bibliography 205 [51] T. Appl. Society for Industrial and Applied Mathematics (SIAM). Eng. [58] J. Periodic. [52] T. 1986. Saad. 5:231–235.. Why Broyden’s nonsymmetric method terminates on linear equations.co. Rheinboldt. Modiﬁcation of a quasiNewton method for nonlinear equations with a sparse Jacobian. Comput. John Wiley & Sons Ltd.. [53] Numerical Algorithms Group (NAG). Schultz.H. Ind. [61] Y. 35:3827–3841. 47:2897–2902. [59] R. Mark 20. Res. 2002. Chem. Chem.M. [57] J. Modeling of a tubular catalytic reactor with ﬂow reversal. 1995. Statist. Diﬀerence methods for initialvalue problems. 1998. 1992. Chem.. The NAG Fortran library manual. 2003. Eng..L. [55] J. Eng. Acceleration of the determination of periodic states of cyclically operated reactors and separators. 24:27–30.W. 2003. Saad and M. Rehacek. volume 30 of Classics in applied mathematics. AlmeidaCosta. 1996. Morton. Manchester university press. and A. Iterative solution of nonlinear equations in several variables. PA. O’Leary. 68:149–166. volume 4 of Interscience tracts in pure and applied mathematics. Richtmyer and K. Kubicek. 1992. Kubicek. The eﬃcient computation of periodic states of cyclically operated chemical processes. Verduyn Lunel. Schubert. and M.. Quinta Ferreira and C.L. Manchester. Comput.
. [64] J. Toint. Inst.M. 1988. 13:631–644. SIAM J.A. [68] B. volume 508 of NATO Adv. Sci. Anal.M.A. Translated from the German by R. 1992.L. 1998. 1999. third edition. A sparse quasiNewton update derived variationally with a nondiagonally weighted Frobenius norm.. Technical Report 200313. Universiteit Leiden. Scott Fogler.L.A. pages 217–253. Toint. C Math. [66] Ph. SIAM J. van de Rotten and S.G. 16. Sci.A. Statist.M. Toint. Technical Report 200306. 37:425–433.A.M. 1996). W. Ser. Comput. Chemical reactor design and operation. W. Prentice Hall PTR. Elements of chemical reaction engineering.. Witzgall. 31:954–961. . van Swaaij. van de Rotten. Numer. Sci. Bartels. Comput. 2003. On the superlinear convergence of an algorithm for solving a sparse minimization. Math. Universiteit Leiden..L.. [70] H. Kluwer Acad. Eﬃcient simulation of periodically forced reactor in 2d. Sleijpen. third edition. Comput.P. [69] B. [72] K.C.206 Bibliography [63] H. Bliek. Bulirsch. 1981. [67] Ph. and A. second edition. Introduction to numerical analysis. John Wiley & Sons Ltd. Dordrecht.L.R. 1977. and A. SpringerVerlag. BiCGSTAB: a fast and smoothly converging variant of BiCG for the solution of nonsymmetric linear systems. S. Stoer and R. [65] Ph.. 1979. Phys. Westerterp. Publ. Beenackers.. Iterative BiCG type methods and implementation aspects. A limited memory Broyden method to solve highdimensional systems of nonlinear equations. 2002. Verduyn Lunel. Math. van der Vorst. New York. [71] H. Verduyn Lunel. van der Vorst and G. In Algorithms for large scale linear algebraic systems (Gran Canaria. On sparse and symmetric matrix updating subject to a linear equation. 2003. volume 12 of Texts in Applied Mathematics. Gautschi and C.
we have chosen some speciﬁc test functions. i = 1. The resulting system of equations is given by g(x) = 0. . The Jacobian of this function has a band structure with on both subdiagonals the value −1.1) can be discretized by considering the equation at the points t = ti . . . . . 47]. n. The elements on the diagonal of the Jacobian are given by ∂gi 2h2 =2+ (xi + ih + 1)2 . n. We apply the standard O(h2 ) discretization and denote h = 1/(n + 1) and ti = i · h. u(0) = u(1) = 0. .Appendix A Test functions This appendix is devoted to a discussion of the test functions used to test the diﬀerent limited memory Broyden methods of Chapter 3.2) for x = (x1 . xn ) and xi = u(ti ). . taken from the CUTE collection. . . . (A. . . ∂xi 2 207 i = 1. (A. Because the methods of Newton and Broyden are not globally converging and the area of convergence can be small. where gi (x) = 2xi − xi−1 − xi+1 + h2 (xi + ti + 1)3 . 2 i = 1. cf. . n. n. i = 1. . . . i = 1. . . 2 0 < t < 1. . . . Discrete boundary value function The twopoint boundary value problem 1 u (t) = (u(t) + t + 1)3 . . [18. . . . n.
s ≥ t. . If we integrate the boundary value problem (A.1 −0.3) The socalled discrete boundary value function was ﬁrst used by Mor´ and e Cosnard to test the methods of Brent and of Brown [46].2 0.208 Appendix A. .05 −0.25 0 0.2 −0.1: The initial condition x0 (dotted line) and the zero x∗ (solid line) of the function g given by (A.2). . In Figure A.15 −0. tn (tn − 1)). Test functions As initial condition we deﬁne the vector x0 by x0 = (t1 (t1 − 1).4) u(t) + 2 0 where H(s. t)(u(s) + s + 1)3 ds = 0. t) = t(1 − s). (A.6 0. Discrete integral equation function In the same article Mor´ and Cosnard also considered the discrete integral e equation function [46]. we obtain the nonlinear integral equation 1 1 H(s.8 1 Figure A. (A.1 we have plotted the initial condition x0 and the zero x∗ of the function g. s < t. 0 −0. . s(1 − t). .1) two times and apply the boundary conditions.4 0.
(A.4). . . −1 0 The unique zero of (A. . 1). n. we replace the integral by an npoint rectangular rule based on the points t = ti .209 To discretize Equation (A. . This implies that the equation g(x) = 0 equals n/2 copies of a system in the twodimensional space. we start with the initial vector x0 . 2i−1 i = 1. . i = 1. given by x0 = (t1 (t1 − 1).8) g2i−1 (x) = 10(x2i − x2 ). . . The (2 × 2)matrices on the diagonal are given by −20x2i−1 10 . tn (tn − 1)). (A. . .2. n. the resulting system of equations reads g(x) = 0. i = 1. . . . .7) . As in case of the discrete boundary value function.5) for i = 1. −1.6) Extended Rosenbrock function The extended Rosenbrock function g : Rn → Rn is deﬁned for even n by g2i (x) = 1 − x2i−1 . As initial vector x0 for the iterative methods we choose (−1.2. . . . . . (A.4469 with multiplicity n/2. n. so that the Jacobian of g is nonsingular at x∗ and has singular values approximately 22. . 1). .3786 and 0. If we denote h = 1/(n + 1) and ti = i · h. where g(x) is given by h gi (x) = xi + (1−ti ) 2 i n tj (xj +tj +1) +ti j=1 j=i+1 3 (1−tj )(xj +tj +1)3 . . . n/2. .7) is given by x∗ = (1. . (A. Note that the Jacobian of the function g has a dense structure. . . . Note that the Jacobian of the extended Rosenbrock function is a block diagonal matrix. 1. .
.9) is x∗ = 0. 3. . (A.210 Appendix A. √ g4i−2 (x) = 5(x4i−1 − x4i ). 0. . i = 1. −1. (A. . . The initial point x0 is given by 0 0 √ √ 5 − 5 −4(x4i−2 − 2x4i−1 ) 0 √ 0 −2 10(x4i−3 − x4i ) (3. Let n be a multiple of 4 and deﬁne the function g : Rn → Rn by g4i−3 (x) = x4i−3 + 10x4i−2 . the Jacobian is a block diagonal matrix with blocks √ 2 10(x4i−3 − x4i ) and singular at the zero x∗ .10) . 0. n/4. 1. . 1).9) g4i−1 (x) = (x4i−2 − 2x4i−1 )2 . 1 0 0 10 0 2(x4i−2 − 2x4i−1 ) 0 The unique zero of (A. Test functions Extended Powell singular function The extended Powell singular function contains n/4 copies of the same function in the fourdimensional space. . . So. √ g4i (x) = 10(x4i−3 − x4i )2 . −1.
x . Algorithm 1. %%% Broyden s t e p %%% 211 . %%% Broyden i t e r a t i o n %%% while ( ne ( i t e +1) > i e p s ) . the singular values of the update matrices and the rate of convergence.16.7. . broyden ( gcn . The codes in Matlab were used in order to obtain more insight in the Broyden matrices and the update matrices. i e p s . Algorithm 1. B . n . Algorithm 1. The method of Broyden We omit the codes of Newton’s method. that forms the basis of the codes of all the limited memory Broyden methods to come. Algorithm 1. i t e = 0 . x . to manufacture plots of the Broyden matrices. and start with the plain method of Broyden.13.Appendix B Matlab code of the limited memory Broyden methods We have implemented the codes of the iterative methods described in Chapters 1 and 3 in the computer languages Fortran and Matlab. ne ( i t e +1) = sqrt ( g ’ ∗ g ) . imax . i f a i l ) %%% I n i t i a l i s a t i o n %%% g = f e v a l ( gcn . as well as to present the codes in a convenient manner. n ) . the NewtonChord method. and the Discrete Newton method. as well as to compute the solutions of large dimensional systems of equations (n ≥ 1000).19. . The codes in Fortran were used in order to apply the integration routines and matrix manipulation routines of the Fortran NAGlibrary [53]. function [ x ] = .
In addition the algorithm can stuck at several points. ne ( i t e +1) = sqrt ( g ’ ∗ g ) . Matlab code of the limited memory Broyden methods s = −B\ g . %%% Broyden i t e r a t i o n %%% while ( ne ( i t e +1) > i e p s ) . break . Therefore. i e p s . x . end . end . i t e = i t e + 1. s = −B\ g . imax . i t e = 0 . n . break . ne ( i t e +1) = sqrt ( g ’ ∗ g ) . The reason of failure of the subroutine we return in the variable ’ifail’. y = zeros ( n . If the residual becomes larger than a predeﬁned value ’meps(2)’. end . ns = s ’ ∗ s . x . We are not only interested in the zero of the function g but also in the convergence properties of the method. return . i t e = i t e + 1. break . n ) . end . end . n ) − g . i f ( ne ( i t e +1) > meps ( 2 ) ) i f a i l = 4 . . ne . The extended code for the method of Broyden reads function [ x . %%% Matrix u p d a t e %%% B = B + ( y − B∗ s ) ∗ s ’ / ns . i f ( i t e >= imax ) i f a i l = 2 . s = zeros ( n . n ) − g . 1 ) . break . x . end . disp ( ’ # ∗∗∗ The method o f Broyden ∗∗∗ ’ ) . . %%% P r e a l l o c a t i o n %%% g = zeros ( n . i f a i l ) i f ( i f a i l ˜ = 0  imax == 0) i f a i l = 1 . Therefore the computation is stopped to avoid . end . g = y + g . meps . 1 ) . i f ( ns <= 0) i f a i l = 3 . B . x . The local matrices and vectors are declared at the beginning of the subroutine. i t e . i f a i l ] = . x = x + s. x = x + s. %%% Matrix u p d a t e %%% B = B + ( y − B∗ s ) ∗ s ’ / ns . y = f e v a l ( gcn . ns = s ’ ∗ s . we include extra output parameters of the subroutine. g = y + g . ne ( i t e +1) = sqrt ( g ’ ∗ g ) . 1 ) . such as the number of iterations ’ite’ and the residue at every iteration step ’ne’. %%% Broyden s t e p %%% i f ( rcond (B) < meps ( 1 ) ) i f a i l = 5 .212 Appendix B. y = f e v a l ( gcn . %%% I n i t i a l i s a t i o n %%% g = f e v a l ( gcn . broyden ( gcn . the process is not expected to converge.
meps . i t e = i t e + 1. end . i f ( p < 1  p > n ) i f a i l = 1 . disp ( ’ # ∗∗∗ The l i m i t e d memory Broyden method ∗∗∗ ’ ) . break . The condition number of the Broyden matrix B is computed by rcond(B) = B · B −1 . y = f e v a l ( gcn . n ) . B2 = zeros ( p .23. end .1 is given by the following routine function [ x . x . The general limited memory Broyden method We indicated in Chapter 3 that the structure of all limited memory Broyden methods are similar. The basis of the limited memory Broyden methods as described in Algorithm 2. end . i t e = 0 . p . end . . ne ( i t e +1) = sqrt ( g ’ ∗ g ) . n ) − g . break . 1 ) . i f (m == p ) %%% R e c o m p o s i t i o n %%% %%% R e d u c t i o n %%% m = q. x = x + s. lmb ( gcn . s = zeros ( n . i f a i l ] = . end . . . ne . D. %%% P r e a l l o c a t i o n %%% g = zeros ( n . x . y = zeros ( n . g = y + g . stored in ’meps(1)’. %%% I n i t i a l i s a t i o n %%% g = f e v a l ( gcn . i f ( ne ( i t e +1) > meps ( 2 ) ) i f a i l = 4 . q . end . break . ne ( i t e +1) = sqrt ( g ’ ∗ g ) . imax . C . return . i t e . i f ( i t e >= imax ) i f a i l = 2 . end . p ) . %%% Broyden s t e p %%% B2 = eye ( p)−D’ ∗C . m. s = C∗ ( B2 \ ( D’ ∗ g ) ) + g . The Broyden matrix is considered as approximately singular if the condition number is smaller than the machine precision. i f ( rcond ( B2) < meps ( 1 ) ) i f a i l = 5 . %%% Broyden i t e r a t i o n %%% while ( ne ( i t e +1) > i e p s ) . return . i f a i l ) i f ( i f a i l ˜ = 0  imax == 0) i f a i l = 1 . ns = sqrt ( s ’ ∗ s ) . x . except for Algorithms 3. i f ( ns <= 0) i f a i l = 3 . end . 1 ) . break . 1 ) .213 overﬂow. i f ( q < 0  q > p −1) i f a i l = 1 . end . return . i f (m < 0  m > p ) i f a i l = 1 .15 and 3. return . n . i e p s .
meps .1. n = 100. D = zeros ( n . 1 : m−1)∗D ( : . n . end . i e p s . x0 = o n e s ( n . m = 0. q +1:p ) = zeros ( n . p−q ) .214 Appendix B. D( : . The only part that has to be ﬁlled in is how the decomposition CD T of the update matrix is rewritten and which columns of the matrices C and D are removed. 1 : q ) = C ( : . 0 E20 ] . i f a i l ) . x . To satisfy the conditions superposed on the limited memory Broyden methods. C = zeros ( n . p ) . Removing columns in normal format The simplest way to create free columns in the (n×p)matrices C and D is just by setting p−q columns equal to zero. i f a i l ] = . C . In the main program the subroutine ’lmb’ is for example called in the following way. i f a i l = 0. [ x . 0 E−12. D ( : . 1 : m−1) ’∗ s ) / ns . ne . p−q +1:p ) . p = 5. D. q . p . p−q +1:p ) . %%% R e d u c t i o n %%% C ( : . i t e . lmb ( ’ gcn ’ .m) = ( y + s − C ( : . imax = 2 0 0 . function program %%% P r e a l l o c a t i o n and i n i t i a l i s a t i o n %%% i e p s = 1 . For example. imax . we can remove the newest p − q updates of the Broyden process by setting the last p − q columns of C and D to zero.m) = s / ns . . D( : . D( : . q = p−1. C ( : . . p−q ) . 1 : q ) = D( : . p−q ) . the nonzero columns are stored in the ﬁrst q columns of the matrices. 1 ) . p−q ) . . see Section 3. 1 . meps = [ 1 . 0 E− 1 6 . Matlab code of the limited memory Broyden methods %%% Matrix u p d a t e %%% m = m + 1. %%% R e d u c t i o n %%% C ( : . q +1:p ) = zeros ( n . p ) . C ( : . The oldest p − q updates of the Broyden process are removed by storing the last q columns of C and D in the ﬁrst q columns and again setting the last p − q columns of the new matrices C and D equal to zero. q +1:p ) = zeros ( n . m. q +1:p ) = zeros ( n .
The ﬁrst p − q columns of C and D are removed in the same way as done in normal format. So. q +1:p ) = zeros ( n . p ) . Removing the ﬁrst columns in QLformat The Broyden Base Reduction method is actually similar to removing the ﬁrst columns of the matrices C and D in the normal format. S . p ) . q +1:p ) = zeros ( n . p−q ) . p−q ) . the matrices C and D are written as the singular value decomposition of the update matrix. W = zeros ( p . D ( : .W] = svd (C . 1 : q ) = D( : . L %%% [ D. The smallest p − q singular values of the update matrix are removed by setting the last p − q columns of C and D equal to zero. C = C∗R ’ . C = C∗S .215 Removing columns in SVDformat For the Broyden Rank Reduction method three additional (p × p)matrices have to be declared. Before we apply the reduction the matrix D is ﬁrst orthogonalized using a QLdecomposition. The decomposition of the update matrix is rewritten in the following way. %%% R e d u c t i o n %%% C ( : . D( : . W%%% [ C. R %%% −d [ D. Before the reduction is applied. C = C∗L ’ . %%% R e d u c t i o n %%% C ( : . C ( : . the (p × p)matrix L has to be declared. %%% P r e a l l o c a t i o n %%% L = zeros ( p . Note that these reduction procedures also have been applied in the normal format. the last q columns of C and D are copied to the ﬁrst q columns of both matrices and subsequently the new last p − q columns of C and D are set equal to zero. 1 : q ) = C ( : .R] = qr (D. q +1:p ) = zeros ( n . q +1:p ) = zeros ( n . p−q +1:p ) . In order to remove the largest p − q singular values of the update matrix. p ) . p ) . For this the QRdecomposition is computed of the matrix D and thereafter the SVDdecomposition of C. 0 ) . p−q ) . S = zeros ( p . . p−q ) . 0 ) . %%% R e c o m p o s i t i o n %%% %%% QL−d e c o m p o s i t i o n . %%% R e c o m p o s i t i o n %%% %%% QR e c o m p o s i t i o n . %%% SVD−d e c o m p o s i t i o n . D = D∗W. L ] = q l (D ) . %%% P r e a l l o c a t i o n %%% R = zeros ( p . D( : . p−q +1:p ) .
If the QRdecomposition of [dp . . . q +1:p ) = zeros ( n . Let {d1 . . The last p − q columns are removed in the same way as done in the normal format. d1 ] is given by r11 · · · r1p . 0 ) . ˜p . %%% [Q.. . q +1:p ) = zeros ( n . D( : . D ( : . C ( : . p−q ) .R] = qr (D. D( : . If A is m −by−n w i t h m > n . p−q ) . . p−q +1:p ) . d . Subsequently we rewrite the decomposition of the update matrix in the following way. we declare the (p × p)matrix R. 1 : q ) = C ( : . C = C∗R ’ . . . d1 · · · So. Matlab code of the limited memory Broyden methods %%% R e d u c t i o n %%% C ( : . dp · · · d 1 = dp · · · d1 . 0 ) . Q = f l i p l r (Q) . L %%% [ D. L i s a l o w e r t r i a n g u l a r m a t r i x . t h e n t h e f i r s t n columns o f Q %%% %%% a r e computed . . dp } be the columns of D. p−q +1:p ) . r1p · · · r11 =: DL. 1 : q ) = D( : . %%% R e c o m p o s i t i o n %%% %%% QL−d e c o m p o s i t i o n . . L ] = q l (A ) . then we obtain For the subroutine ’ql’ we used the QRdecomposition routine of Matlab.216 Appendix B. L = f l i p l r ( f l i p u d (L ) ) . Removing the last columns in QRformat The Broyden Base Storing method computes the QRdecomposition of the matrix D before it removes the last p − q columns of the matrices C and D. q +1:p ) = zeros ( n . . q +1:p ) = zeros ( n . Therefore. L ] = qr ( f l i p l r (A) . %%% R e d u c t i o n %%% C ( : .. [ Q. . the ’ql’subroutine reads function [ Q. L ] = QL(A) p r o d u c e s t h e ” economy s i z e ” QL−d e c o m p o s i t i o n . . rpp ˜ d p = d1 · · · rpp . p−q ) . p ) . %%% P r e a l l o c a t i o n %%% R = zeros ( p . . ˜ ˜ . p−q ) . .
stHy = s ’ ∗C ( : . 1 :m) ∗D ( : . i f ( stHy = = 0  nHts <= 0) i f a i l = 3 . p ) . i f ( i t e >= imax ) i f a i l = 2 .m) = D ( : .217 The inverse notation of Broyden’s method For the inverse notation only the computation of the Broyden step s and the update to the inverse Broyden matrix are diﬀerent from the standard ’lmb’subroutine. M2 = zeros ( p . end .m) = ( s−C ( : . %%% Broyden s t e p %%% s = g − C ( : .m) = D( : . end . 1 : m−1) ’∗ s − s . For the sake of clarity. ns = sqrt ( s ’ ∗ s ) . but inherent of the algorithm. nHts = sqrt (D( : . end . D( : . break . All reduction methods discussed above are applicable to the limited memory inverse Broyden method. g = y + g . i f (m == p ) %%% R e c o m p o s i t i o n %%% %%% R e d u c t i o n %%% m = q. end . 1 : m−1)∗C ( : . break . C ( : . 1 ) . ne ( i t e +1) = sqrt ( g ’ ∗ g ) . %%% P r e a l l o c a t i o n %%% g = zeros ( n . end . break . C ( : .m) / nHts . This limited memory Broyden method has to be implemented in a quite diﬀerent setting. end . x = x + s. the update to the Broyden matrix is not clearly computed. 1 :m) ’ ∗ g . %%% Matrix u p d a t e %%% m = m + 1. The complete Broyden iteration reads . x .m) ’ ∗D( : . n ) − g . i f ( ns <= 0) i f a i l = 3 . break . For simplicity we only consider the case where the oldest p − q updates to the Broyden matrix are removed if m = p. D( : . On the other hand the vectors s and y are not used. 1 : m−1)∗D ( : . the Broyden iterations reads %%% Broyden i t e r a t i o n %%% while ( ne ( i t e +1) > i e p s ) . In fact.m) = C ( : . So. 1 : m−1) ’∗ y − y . we declare the matrix M 2 instead of B2. i f ( ne ( i t e +1) > meps ( 2 ) ) i f a i l = 4 .m) ) / stHy ∗ nHts . i t e = i t e + 1. The limited memory Broyden method proposed by Byrd et al.m) .m) ) . y = f e v a l ( gcn .
x = x + D( : . M2( 1 :m. ne ( i t e +1) = ng . end . . i f ( i t e >= imax ) i f a i l = 2 . 1 :m)−D ( : . end . i f ( ne ( i t e +1) > meps ( 2 ) ) i f a i l = 4 . m = q. D( : . q +1:p ) = zeros ( n . i f (m == p ) . (C ( : . The scaling is inserted to overcome a bad condition number for the matrix M 2. end . end . . In contrast with the matrix B2 the matrix M 2 is not invertible if m < p. end . f o r i = 1 :m f o r j = 1 : i −1 M2( i . i ) ’ ∗D( : . i f ( rcond (M2( 1 :m. break . 1 :m) = M2( 1 :m. D ( : . i f ( ns <= 0) i f a i l = 3 . j ) = −D( : . we use the leftupper (m × m)part of the matrix M 2. i t e = i t e + 1 . end .m) . Matlab code of the limited memory Broyden methods %%% Broyden i t e r a t i o n %%% while ( ne ( i t e +1) > i e p s ) .m+1). g = C ( : . 1 : q ) = C ( : .m+ 1 ) ) .m+1)/ ns .218 Appendix B. 1 :m) . 1 :m) ) ∗ (M2( 1 :m. ns = sqrt (D( : . p−q +1:p ) . 1 :m) ) < meps ( 1 ) ) i f a i l = 5 . else M2 = zeros (m. C ( : . %%% S c a l i n g %%% D( : . end . p−q +1:p ) . p−q ) . D( : . 1 :m)+D ( : . n ) − g . break . q +1:p ) = zeros ( n . break . end . m = m + 1.m+1) ’∗D( : .m+1) = D( : . . end . j ) . C ( : . Therefore.m+ 1 ) = .m+1) = f e v a l ( gcn . break . 1 ) = g . because we have declared M 2 as a (p × p)matrix. %%%% R e d u c t i o n %%% C ( : .m+1) + g . 1 :m) ’ ∗C ( : .m+1) = C ( : . x .m+1)/ ns . 1 :m) ’ ∗ g ) ) + g . 1 :m) \ (D ( : . 1 : q ) = D( : . %%%% Broyden s t e p and u p d a t e %%% i f (m == 0) D( : . p−q ) . ng = sqrt ( g ’ ∗ g ) . C ( : .
1.6 · 10−4 kW/(mK) 3 η kc ε Ea /Rgas tf 1 0.0 kJ/m K 9.1: Parameter values for the reverse ﬂow reactor In this appendix we derive appropriate values for the radial dispersion Drad and for the radial heat conductivity λrad using correlation formulas of Westerterp. In not too short beds (i.115 m/s 0. we take the same parameter values as used by Khinast. Swaaij and Beenackers [72]. The derived values of the radial parameters are given in Table C.0 m/s 2.6244 kJ/m K 1426.. where dp denotes the particle size) this dispersion can be described by means of a longitudinal dispersion 219 . λax = (1 − ε)λs + λg + (ρcp )2 g (ρcp )s k∞ h Tc = T 0 Dax λs 1382.02kW/(m2 K) 323 K 3 · 10−5 m2 /s 0.0 kW/(mK) 3 (ρcp )g av L ∆Tad u λg 0. In our simulations we ﬁx the ﬂow reverse time (tf = 1200s). see Table C. When a ﬂuid ﬂows through a packed bed of solid particles with low porosity.0 m 50 K 1. Jeong and Luss (1999). the variations in the local velocity cause a dispersion in the direction of the ﬂow. To compute the eﬀective axial heat conductivity the following expression is proposed u2 hav .2.6 K 1200s Table C. L/dp > 10.26)(6.38 8328.0 m2 /m3 react surf 4.85 · 106 s−1 0.e.Appendix C Estimation of the model parameters In the simulations of the twodimensional model (6.28).
we have that λrad = 0 and that the radial heat dispersion coeﬃcient equals the thermal conductivity.6 · 10−5 + 0.81 · 10−2 . denoted by the Bodenstein number.1. Boax = εDax To avoid large wall eﬀects. dp Using the relation N = Pem /2 = uL/(2εDax ). 2 − (1 − dp /R)2 .26 < ε < 0. The (isotropic) thermal conductivity of the bed is denoted by λ0 . 8[2 − (1 − dp /R)2 ] Note that under stagnant conditions. For the convective heat conductivity the following correlation is given λrad = (ρcp )g dp u .1. λ0 = 0. Heat can be transported perpendicular to the main ﬂow by the same mechanism if a transverse temperature gradient exists. The void spaces of a packed bed can be considered as ideal mixers. resulting in a convective heat conductivity λrad . If the heat diﬀusion through the solid particles can be neglected (that is. Besides. can be derived udp = 2. and the number of voids is roughly equal to L N∼ . λs = 0) the following expression is valid for λ0 .6244 · dp · 1. This implies that the coeﬃcient of transverse dispersion Drad is about six times smaller than Dax . we arrive at the following expression for the radial heat conductivity λrad = λ0 + λrad = 6.67 · λg · ε. the following expression for the axial dispersion in packed beds.220 Appendix C. where λ0 and λrad act fairly independently. heat transport occurs by thermal radiation between the particles. Borad ∼ udp /(εDrad ). The total radial thermal conductivity is then given by λrad = λ0 + λrad . in case of 0. approaches a value of 10 to 12 at Re > 100. Using the parameter values given in Table C. although in reality no back mixing occurs. It is known that the radial Bodenstein number. it is assumed that dp /2R < 0.0 8[2 − (1 − dp /R)2 ] dp = 6.6 · 10−5 + 7.93 and T < 673K. Estimation of the model parameters coeﬃcient.
Therefore. The corresponding dimensionless parameters of Table 6. dp 1.0 · 10−3 m Drad 0.32·10 −4 to 1.4 · 10−4 kW/(mK) Table C.2: The values of the radial parameters for the twodimensional model of the reverse ﬂow reactor In the computations of the Chapter 5 and 8 we have used the dimensionless equations for the onedimensional model. m . s−1 reactor length. (6. m/s frequency factor for reaction.2.1 and C.0·10−3 m and take R in the range from 0. C D dp Ea h kc k∞ L r R Rgas u Uw t tf T Tc (T0 ) z speciﬁc external particle surface area.5 · 10−5 m2 /s λrad 1. then the radial heat conductivity varies from 1. s ﬂow reverse time. m/s heattransfer coeﬃcient at reactor wall.26)(6. kJ/(kmol K) superﬁcial gas velocity. we ﬁx the value at λrad = 1. and the twodimensional model.43·10−4 kW/(mK).01m to 0. K cooling(feed) temperature.221 If we choose the particle diameter to be dp = 1.1m. s temperature. kmol/m3 dispersion coeﬃcient.4·10−4 kW/(mK). m2 /m3 reactor wall concentration. m activation energy.2 are computed using the values of Tables C. m radius of the reactor. kJ/kmol heattransfer coeﬃcient.23)(6.28). m radial distance.25).2 we have used the following symbols. m2 /m3 reactor surf speciﬁc reactor wall surface area. (6. kW/(m2 K) masstransfer coeﬃcient. K axial distance. m universal gas constant. kW/(m2 K) time. m2 /s particle diameter. Roman av aw c. Symbols In Section 6.
[−] Dimensionless parameters Bo Pe Pr Re Bodenstein number P´clet number e Prandl number Reynolds number Subscripts ax rad g s axial direction radial direction gas phase solid phase . kJ/kmol adiabatic temperature rise. [−] (isotropic) thermal conductivity. kW/(m3 K) convective heat conductivity. kW/(m3 K) thermal conductivity. kW/(m3 K) volumetric heat capacity. kJ/(m3 K) dimensionless cooling capacity. Estimation of the model parameters Greek −∆H −∆Tad ε η λ0 λ λ (ρcp ) Φ heat of reaction. [−] eﬀectiveness factor.222 Appendix C. K void fraction.
• Ecologisch onderzoek naar vervuiling van een fabriek die zijn afval loost in een baai met open verbinding naar zee. Deze variabele hangt af van plaats en tijd. Daarvoor dienen we de parti¨le diﬀerentiaalvergelijkingen eerst aan te passen e 223 . De laatste decennia kunnen we de hulp inroepen van de computer. • Het gedrag van veengrond onder invloed van dag en nacht. bijvoorbeeld de concentratie van de giftige stof in het water van een baai. of de temperatuur van de reactor.Samenvatting (Waarom Broyden?) Wiskunde is ´´n van de oudste wetenschappen in de wereld. • De toestand van kraakbeen in de pols onder herhaaldelijke belasting. In dat geval noemen we het systeem in periodiek stabiele toestand (’cyclic steady state’). We zijn vooral ge¨ ınteresseerd in wat een systeem doet na verloop van (een lange) tijd. Omdat de condities van het proces periodiek in de tijd zijn. In eenvoudige gevallen zijn de parti¨le e diﬀerentiaalvergelijkingen nog op te lossen. zeg x. verwachten we hetzelfde voor de uiteindelijke toestand van het systeem. Wat deze processen met elkaar gemeen hebben is dat ze wiskundig gezien eigenlijk hetzelfde zijn. • Periodiek aangedreven processen in chemische reactoren. Maar wanneer meerdere variabelen in het proces een rol spelen of wanneer het mechanisme moeilijker wordt. We beschouwen een variabele. Beschouwen we de volgende voorbeelden. is dit al niet meer te doen en moeten we op een andere manier een oplossing vinden. maar speelt in ee het huidige wetenschappelijke onderzoek nog altijd een prominente rol. Ze worden namelijk beschreven met behulp van parti¨le diﬀerentiaalvergelijkingen met tijdsafhankelijke parameters en rande condities.
moeten we het stelsel van gewone diﬀerentiaalvergelijkingen integreren over ´´n periode. wordt g(x) = 0. noemen we de ’Broyden Rank Reduction’ methode. De ee periodiek stabiele toestand is dus een vast punt van de periodeafbeelding en een nulpunt van de functie g(x) = f (x) − x. De methode die we hebben ontwikkeld. In toepassingen is de methode van Broyden populair. meer gridpunten. dat wil zeggen. waarvan de simulaties vooral te vinden zijn in het tweede deel van dit proefschrift. De parameter p wordt vooraf vastgesteld en de ideale waarde voor p is bepaald door eigenschappen van de functie g en niet van de dimensie n van het gediscretiseerde probleem. Deze methode gaat uit van een beginschatting x0 voor het nulpunt x∗ van de functie g. Het blijkt dat voor veel gevallen p klein gekozen kan worden. Met behulp van een iteratief schema wordt een reeks van iteraties {xk } berekend dat naar de oplossing x∗ convergeert. De ee methode van Broyden blijkt in het bijzonder geschikt te zijn voor problemen afkomstig van periodieke processen. De Broyden matrix wordt elke iteratie aangepast door er een rang´´nmatrix ee bij op te tellen. De uitdaging is nu om steeds gedetailleerdere modellen te gebruiken om de processen beter te kunnen omschrijven. in plaats van de n2 elementen. hebben we een oplossing gevonden. Een nadeel van de methode van Broyden. Om de periodeafbeelding te verkrijgen. Bovendien kan het nodig zijn om een ﬁjner grid te gebruiken. Door de Broyden matrix (zelf een benadering) te benaderen. We deﬁni¨ren de periodeafbeelding f : Rn → Rn als de functie die de toestand aan het begin van een periode overbrengt naar de toestand aan het einde van de periode. We discretiseren de ver¨ gelijkingen op een grid. is dat de (n × n)matrix Bk opgeslagen moet worden. we delen de ruimte op in kleine blokjes en nemen aan dat per blokje de variabele constant is. Dit kan een probleem vormen als het model te groot wordt. Per iteratie wordt slechts ´´n functieevaluatie uitgevoerd. dat wil zeggen. voor elk blokje in de ruimte een vergelijking die afhangt e van x. In plaats van de parti¨le e diﬀerentiaalvergelijkingen hebben we nu een groot stelsel van n gewone diﬀerentiaalvergelijkingen. lukt het om de matrix op te slaan met behulp van 2pn elementen. Na een uitgebreide studie van de methode van Broyden. Hierbij wordt gebruik gemaakt van een matrix Bk dat de Jacobiaan (de afgeleide) van de functie g in de iteratie xk benadert. Als . De vraag is dus of we de Broyden matrix op een eﬃci¨nte manier e kunnen opslaan.224 Samenvatting zodat de computer er uberhaupt iets mee kan doen. De uiteindelijke vergelijking die we moeten oplossen. De dimensie van het gediscretiseerde probleem n wordt hierdoor groter en eﬃci¨nte methodes zijn e nodig om g(x) = 0 op te lossen.
Als we nu koud gas (op kamertemperatuur) de reactor inlaten. De aanleiding voor het ontwikkelen van de ’Broyden Rank Reduction’ methode was een probleem afkomstig uit de reactorkunde. met de concentratie van het reagens en de temperatuur als variabelen. Dit is het belangrijkste resultaat van het eerste deel van dit proefschrift. De hele reactor is dan op kamertemperatuur en er kan geen reactie meer plaatsvinden. ontstaat een reactiefront en stijgt de temperatuur. daar naast . We veronderstellen dat de reactie exotherm is. de reactor verlaten. reageert tot een product. Het blijkt dat de vergelijking g(x) = 0. warmen we de reactor eerst op voordat we het proces starten. Op de plaats waar de reactie plaatsvindt. We kunnen dit voorkomen door voordat het reactiefront de reactor heeft verlaten de reactor in omgekeerde richting te gaan gebruiken. Hierdoor zal het reactiefront weer naar links verschuiven. In dit gas zit een reagens dat. n n p p We hebben bewezen onder welke omstandigheden de ’Broyden Rank Reduction’ methode even snel is als de originele methode van Broyden. Omdat de reactie alleen plaatsvindt als de temperatuur hoog genoeg is.Samenvatting 225 bijkomend voordeel. niet meer is op te lossen met behulp van de methode van Broyden. er komt warmte bij vrij. wanneer het in contact komt met de katalysator. gaan de grote ndimensionale berekeningen die nodig zijn voor de methode van Broyden over in kleine pdimensionale berekeningen. ontstaan er in radiale richting temperatuursgradi¨nten. Dit heeft twee eﬀecten. gelijk aan 2 · 100 · 25 = 5000. Indien we voor de discretisatie 100 gridpunten nemen in axiale richting en 25 gridpunten in radiale richting. Daarom zoue den we graag de reactor beschrijven met behulp van een tweedimensionaal model. is de dimensie van het gediscretiseerde probleem. wanneer we niets veranderen aan de condities van het proces. We laten dan het gas binnen aan het rechter uiteinde en vangen het product op aan het linker uiteinde. Daar er veel energie bij de reactie vrijkomt en de reactor aan de wand wordt gekoeld. Dit probleem wordt in het laatste deel van dit proefschrift volledig uitgewerkt. De ’reverseﬂow’ reactor is een cilindrische buis gevuld met een katalysatordeeltjes waar een gas doorheen wordt gestuurd. Vervolgens wordt dit reactiefront door de reactor heen gestuwd en zal. warmt het op en vindt de reactie plaats. n. dat wil zeggen.
000 elementen. De parameter p kan zelfs gelijk worden gekozen aan 5 (opnieuw een halvering van het geheugengebruik) in ruil voor een paar extra iteraties.226 Samenvatting alle andere matrices en vectoren een Broyden matrix moet worden opgeslagen met 25. De ’Broyden Rank Reduction’ methode kan wel worden toegepast voor bijvoorbeeld p = 20 of p = 10. e . Een periodieke toestand van de ’reverseﬂow’ reactor met temperatuursgradi¨nten e in radiale richting kan met de ’Broyden Rank Reduction’ methode nu voor het eerst eﬃci¨nt berekend worden. Hiervoor moeten respectievelijk 200.000. terwijl het geheugengebruik wordt gehalveerd.000 elementen worden opgeslagen.000 en 100. De methode convergeert voor beide p’s even snel.
Het Mathematisch Instituut heeft mij alle vrijheid gegeven om ongestoord mijn onderzoek te doen en bovendien mij te kunnen voorbereiden op het toekomstig werk voor de klas. Het schoolbord op de kamer werd intensief gebruikt.Nawoord Dit proefschrift was nooit voltooid zonder de inbreng en steun van vele vrienden.en werkbezoek heb ik gekregen van e het Leids Universiteits Fonds. Gedurende vier jaar heb ik me thuis gevoeld op twee totaal verschillende instituten. Shell en van NWO via het Pioneer project. Het basisidee e van de ’Broyden Rank Reduction’ methode is mede aan hem te danken. want het was vaak al voldoende om een probleem uit te leggen aan de ander om zelf ineens de oplossing te zien. Het was leuk om de verschillende ﬁetsroutes in en rondom Fort Collins te ontdekken. De tijd die ik met Miguel op de kamer zat. waren aangenaam en eﬀectief. collega’s en bekenden. Zowel in Leiden als bij het Instituut voor de Technische Scheikunde in Amsterdam was er altijd wel iemand om een oplossing te vragen voor een probleem (niet noodzakelijk betreﬀende mijn onderzoek) of om trots mijn nieuwste resultaten te laten zien (van mijn kat Siep bijvoorbeeld. Zowel zijn heldere idee¨n als vermogen tot relativeren hebben mij veel geholpen. Financi¨le ondersteuning voor congres. 227 . Het was voor mij een grote uitdaging om het resultaat van mijn onderzoek te bespreken met hem en zijn collega’s. Daarbij denk ik vooral aan de personen die direct bij het totstandkoming van dit proefschrift betrokken zijn geweest. Zijn kennis over Latex is dit proefschrift zeker ten goede gekomen. Vanaf het begin van mijn promotie heb ik veel gehad aan de gesprekken met Tycho. die mij in wezen is voorgegaan in dit onderzoek. Verschillende leden van de promotiecommissie hebben door hun opmerkingen en vragen de presentatie van de resultaten overzichtelijker gemaakt en het aantal fouten en onnauwkeurigheden verminderd. zie de introductie). In het laatste jaar van mijn promotie kreeg ik de gelegenheid om een maand de Colorado State University te bezoeken op uitnodiging van Don Estep.
. e e D´sir´e is voor mij de reden om altijd vol te blijven houden. De onvoorwaardelijke steun van mijn ouders. vaak ook zonder dat zij zich ervan bewust waren.228 Nawoord Ook buiten de universiteit hebben velen mij geholpen bij het voltooien van mijn promotie. mijn broer en mijn zus is voor mij van onschatbare waarde. Door het commentaar van Bertram en Luuk is de leesbaarheid van het begin en het einde van dit proefschrift vergroot.
Tijdens deze periode gaf hij werkcollege’s in analyse en numerieke wiskunde. gaf hij diverse keren een voordracht in Leiden. Deze valse start werd echter snel ingehaald en hij slaagde erin om alle rekentaken van de basisschool uit te werken. Op 3 juli 1995 nam hij zijn diploma in ontvangst en in de herfst van datzelfde jaar begon hij de wiskundeopleiding aan de Vrije Universiteit te Amsterdam. vertrok hij in oktober 1998 voor vier maanden naar de Technische Universit¨t Wien. Vanaf de derde klas tot aan het laatste jaar van de universiteit heeft hij leerlingen begeleid in wis. Op zevenjarige leeftijd behaalde hij als laatste van de klas het diploma voor vermenigvuldigen. Op 25 augustus 1999 studeerde hij cum laude af. Bliek (Universiteit van Amsterdam) deed hij zijn onderzoek in het oplossen van grote stelsels nietlineaire vergelijkingen. Bij prof. dr. Voor het schrijven van zijn eindscriptie ’Invariant Lagrangian subspaces of inﬁnite dimensional Hamiltonians and the Ricatti equation’. S. Als hoogtepunt van zijn promotie bezocht hij het voorjaar van 2003 de Colorado State University. Hasselt en Montreal. Ran. Daarnaast bezocht hij conferenties in Lunteren. Wageningen. A. A. Hier was hij te gast bij prof. Utrecht en Amsterdam. In de zomer van 1998 startte hij een specialisatie in de Operatorentheorie onder begeleiding van dr. Door ´´n van zijn leraren werd hij zelfs ee getipt als de toekomstig directeur van de IBM. dr. a Langer een expert op het gebied van de operatorentheorie.M. waarvan dit proefschrift het resultaat is. wiskunde B als natuurkunde sloot hij het gymnasium af met (afgerond) een tien voor het eindexamen.M.en natuurkunde.C. dr. deed hij mee aan een modelleerweek in Eindhoven en volgde hij onder andere een cursus in Delft. Ook op de middelbare school bleef de interesse voor de wiskunde groeien. H. Zijn interesse ging over van de zuivere naar de meer toegepaste wiskunde. Daar was het mogelijk om zijn onderzoek te 229 . Verduyn Lunel en prof. periodiek aangedreven reactoren. Zowel voor wiskunde A. mede door het enthousiasme van zijn wiskundedocent. afkomstig van modellen voor chemische.Curriculum Vitae Bart van de Rotten is geboren op 20 oktober 1976 in Uithoorn.
.230 Curriculum Vitae verdiepen met behulp van de expertise van prof. Binnenkort keert hij terug naar de Vrije Universiteit waar hij een opleiding gaat volgen tot wiskundedocent. dr. Estep en zijn groep. D.
This action might not be possible to undo. Are you sure you want to continue?