AXEL KLAWONN
SIAM J. SCI. COMPUT. © 2000 Society for Industrial and Applied Mathematics
Vol. 22, No. 4, pp. 1199–1219
Abstract. A new domain decomposition method with Lagrange multipliers for elliptic problems
is introduced. It is based on a reformulation of the well-known finite element tearing and
interconnecting (FETI) method as a saddle point problem with both primal and dual variables as unknowns.
The resulting linear system is solved with block-structured preconditioners combined with a suitable
Krylov subspace method. This approach allows the use of inexact subdomain solvers for the positive
definite subproblems. It is shown that the condition number of the preconditioned saddle point
problem is bounded independently of the number of subregions and depends only polylogarithmically on
the number of degrees of freedom of individual local subproblems. Numerical results are presented
for a plane stress cantilever membrane problem.
Key words. domain decomposition, Lagrange multipliers, finite element tearing and
interconnecting, preconditioners, elliptic systems, finite elements
AMS subject classifications. 65F10, 65N30, 65N55
PII. S1064827599352495
1. Introduction. In the past decade a great deal of research has been carried
out on nonoverlapping domain decomposition methods using Lagrange multipliers. In
these methods the original domain is decomposed into nonoverlapping subdomains.
The continuity is then enforced by using Lagrange multipliers across the interface
defined by the subdomain boundaries. A computationally quite efficient member
of this class of domain decomposition algorithms is the finite element tearing and
interconnecting (FETI) method introduced by Farhat and Roux [7]. In its original
version, a Neumann problem is solved on each subdomain and the method is known
to be scalable in the sense that its rate of convergence is independent of the number
of subproblems. In a variant of the FETI method introduced in Farhat, Mandel, and
Roux [6] an additional Dirichlet problem is solved exactly on each subdomain, in each
iteration. This makes the rate of convergence of the iteration even less sensitive to
the number of unknowns of the local problems. The use of inexact Dirichlet solvers is
possible without a radical change of the FETI method. However, the use of inexact
Neumann solvers does require a redesign of these algorithms; this is the topic of the
present work.
In this paper, a new domain decomposition method with Lagrange multipliers is
introduced by first reformulating the system of the FETI algorithm as a saddle point
problem with both primal and dual variables. The resulting system is then solved
Received by the editors February 26, 1999; accepted for publication (in revised form) May 1,
2000; published electronically October 18, 2000.
http://www.siam.org/journals/sisc/224/35249.html
Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York,
NY 10012 (widlund@cs.nyu.edu, http://www.cs.nyu.edu/cs/faculty/widlund). The work of this author
was supported in part by the National Science Foundation under grant NSF-CCR-9732208 and
in part by the US Department of Energy under contract DE-FG02-92ER25127.
1200 AXEL KLAWONN AND OLOF B. WIDLUND
using block-structured preconditioners and a suitable Krylov subspace method. We
can then avoid potentially quite costly direct solvers, relying instead on any of a number
of well-tested preconditioners for positive definite subproblems, such as incomplete LU
methods, (algebraic) multigrid, etc. The good features of the FETI method, such as
scalability and efficiency, are preserved. Essentially, the storage and factorization costs
related to use of exact solvers in the FETI algorithm are eliminated while the new
method will incur new costs for the preconditioners on the subdomains and additional
storage of primal variable components in the iteration; we also have to expect that
the iteration count will increase in comparison to when exact subdomain solvers are
employed. A complete theory, based quite directly on the pioneering work of Mandel
and Tezaur [16], and on earlier work by Klawonn [11], is also developed in this paper;
see Theorems 9 and 15. In addition, we show how these results can be improved for
a different family of preconditioners developed in [14]; see Theorems 10 and 16.
The remainder of this article is organized as follows. In section 2, we present the
equations of linear elasticity and a finite element discretization thereof. In section
3, we review the FETI method and we develop our new method in section 4. In
subsection 5.1, the convergence analysis of Mandel and Tezaur [16] is extended from
scalar, second order elliptic equations to the system of equations of linear elasticity.
A convergence analysis and condition number estimates for the block-diagonal
preconditioner are given in subsection 5.2. The paper concludes with section 6, in which
we report on some of our numerical experiments.
We note that a short conference paper [13] prepared previously describes and
discusses a slightly different version of our main algorithm.
2. The elliptic problem. In this section, we introduce our model problem,
the elliptic system arising from the displacement formulation of compressible, linear
elasticity and its discretization by conforming finite elements.
2.1. The equations of linear elasticity. The equations of linear elasticity
model the displacement of a linear elastic material under the action of external and
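internal forces. We denote the elastic body by $\Omega \subset \mathbb{R}^d$, $d = 2, 3$, and its boundary by $\partial\Omega$, and assume that one part of the boundary, $\partial\Omega_0$, is clamped, i.e., with homogeneous Dirichlet boundary conditions, and that the rest, $\partial\Omega_1 := \partial\Omega \setminus \partial\Omega_0$, is subject to a surface force $g$, i.e., a natural boundary condition. We can also introduce a body force $f$, e.g., gravity. The appropriate space for a variational formulation is the Sobolev space $H^1_0(\Omega) := \{v \in H^1(\Omega)^d : v|_{\partial\Omega_0} = 0\}$. The linear elasticity problem consists of finding the displacement $u \in H^1_0(\Omega)$ of the elastic body $\Omega$, such that
\[
G \int_\Omega \varepsilon(u) : \varepsilon(v)\,dx + G\beta \int_\Omega \operatorname{div} u \,\operatorname{div} v\,dx = \langle F, v\rangle \quad \forall v \in H^1_0(\Omega). \tag{1}
\]
Here $G$ and $\beta$ are material parameters depending on the Poisson ratio $\nu$ and Young's modulus $E$ with $\nu \in (0, 1/2]$ and $E > 0$. In the case of plane stress, we have $G = E/(1+\nu)$, $\beta = \nu/(1-\nu)$, and for plane strain and three-dimensional elasticity we have $G = E/(1+\nu)$, $\beta = \nu/(1-2\nu)$. Furthermore, $\varepsilon_{ij}(u) := \frac{1}{2}\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right)$ is the linearized strain tensor, and
\[
\varepsilon(u) : \varepsilon(v) = \sum_{i,j=1}^{d} \varepsilon_{ij}(u)\,\varepsilon_{ij}(v), \qquad
\langle F, v\rangle := \sum_{i=1}^{d} \int_\Omega f_i v_i\,dx + \sum_{i=1}^{d} \int_{\partial\Omega_1} g_i v_i\,d\sigma.
\]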
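As a small numerical illustration of the material parameters entering (1), the following sketch evaluates $G$ and $\beta$ from Young's modulus and the Poisson ratio. The function name `material_parameters` and the flag `plane_stress` are illustrative assumptions and do not come from the paper; the sample values are the steel parameters quoted in section 6.

```python
def material_parameters(E, nu, plane_stress=True):
    """Return (G, beta) as used in the bilinear form (1).

    plane_stress=True  : G = E/(1+nu), beta = nu/(1-nu)
    plane_stress=False : plane strain or 3D, beta = nu/(1-2*nu)
    (hypothetical helper; names are not taken from the paper)
    """
    if E <= 0.0:
        raise ValueError("Young's modulus E must be positive")
    if not 0.0 < nu <= 0.5:
        raise ValueError("Poisson ratio nu must lie in (0, 1/2]")
    G = E / (1.0 + nu)
    if plane_stress:
        beta = nu / (1.0 - nu)
    else:
        if nu == 0.5:
            # beta is unbounded in the incompressible limit
            raise ValueError("nu = 1/2 is not admissible for plane strain / 3D")
        beta = nu / (1.0 - 2.0 * nu)
    return G, beta


# Steel-like values from section 6 (plane stress)
G_steel, beta_steel = material_parameters(2.1e11, 0.3)
```

The plane strain branch rejects $\nu = 1/2$ because $\beta$ becomes unbounded there, which is the incompressible limit excluded by the compressibility assumption.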
DOMAIN DECOMPOSITION WITH LAGRANGE MULTIPLIERS 1201
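The associated bilinear form of linear elasticity is
\[
a(u, v) = G \int_\Omega \varepsilon(u) : \varepsilon(v)\,dx + G\beta \int_\Omega \operatorname{div} u \,\operatorname{div} v\,dx.
\]
[...] $|u|^2\,dx$, and $|u|^2_{H^1(\Omega)} := \|\nabla u\|^2_{L_2(\Omega)}$. It is obvious that the bilinear form $a(\cdot,\cdot)$ is continuous with respect to $\|\cdot\|_{H^1(\Omega)}$. However, proving ellipticity is far less trivial, but it can be established from Korn's first inequality; see, e.g., Ciarlet [3].
Lemma 1 (Korn's first inequality). Let $\Omega \subset \mathbb{R}^d$, $d \ge 2$, be a Lipschitz domain. Then, there exists a positive constant $c = c(\Omega, \partial\Omega_0)$, such that
\[
\int_\Omega \varepsilon(u) : \varepsilon(v)\,dx \ [...]
\]
and introduce two inner products on $(H^1(\Omega))^2$, for a region $\Omega$ with diameter one,
\[
(u, v)_{E_1} := (\varepsilon(u), \varepsilon(v))_{L_2(\Omega)} + (u, v)_{L_2(\Omega)}
\]
and
\[
(u, v)_{E_2} := (\varepsilon(u), \varepsilon(v))_{L_2(\Omega)} + \sum_{i=1}^{3} l_i(u)\,l_i(v),
\]
where the $l_i(\cdot)$ are defined by
\[
l_i(u) := \int_\Omega (r_i)^t u\,dx.
\]
By using Lemma 2, $\|\cdot\|_{E_1}$, given by the inner product $(\cdot,\cdot)_{E_1}$, is a norm and not just a seminorm, and so, by construction, is $\|\cdot\|_{E_2}$.
The norms, just introduced, are equivalent.
Lemma 3. There exist constants $0 < c \le C < \infty$, such that
\[
c\,\|u\|_{E_1} \le \|u\|_{E_2} \le C\,\|u\|_{E_1} \quad \forall u \in (H^1(\Omega))^2.
\]
Proof. The proof of the right inequality follows immediately from the Cauchy–Schwarz inequality. The left inequality is proven by contradiction and by using Rellich's theorem as in a proof of generalized Poincaré–Friedrichs inequalities, cf., e.g., Nečas [18, Chap. 2.7].
We obviously have
\[
\|\varepsilon(u)\|_{L_2(\Omega)} \le \|\nabla u\|_{L_2(\Omega)} \quad \forall u \in (H^1(\Omega))^2. \tag{2}
\]
Using (2) and Lemmas 2 and 3, we obtain the following.
Lemma 4. There exists a constant $0 < c$, such that
\[
c\,\|\nabla u\|_{L_2(\Omega)} \le \|\varepsilon(u)\|_{L_2(\Omega)} \le \|\nabla u\|_{L_2(\Omega)} \quad \forall u \in (H^1(\Omega))^2,\ u \perp \ker(\varepsilon).
\]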
2.2. Finite elements and the discrete problem. Since we consider only
compressible elastic materials, it follows from Lemma 1 that the bilinear form $a(\cdot,\cdot)$ is uniformly elliptic. We can therefore successfully discretize the system (1) with low-order, conforming finite elements, such as linear or bilinear elements.
We assume that a triangulation $\tau_h$ of $\Omega$ is given which is shape regular and has a typical element diameter of $h$. We denote by $W^h(\Omega) \subset H^1_0(\Omega)$ the corresponding conforming space of finite element functions, e.g., piecewise linear or bilinear continuous functions. Thus, it is our goal to solve the discrete problem
\[
a(u_h, v_h) = \langle F, v_h\rangle \quad \forall v_h \in W^h(\Omega). \tag{3}
\]
In what follows, we work exclusively with the discrete problem and we drop the
subscript h from now on.
3. A review of the FETI method. In this section, we give a brief review of
the original FETI method introduced in Farhat and Roux [7] and the variant with
a Dirichlet preconditioner introduced in Farhat, Mandel, and Roux [6]. For more
detailed descriptions and proofs, we refer to [4, 5, 17, 22] and the references therein.
Let the domain $\Omega \subset \mathbb{R}^2$ be decomposed into $N$ nonoverlapping subdomains $\Omega_i$, $i = 1, \ldots, N$, each of which is the union of elements and such that the finite element nodes on the boundaries of neighboring subdomains match across the interface $\Gamma := \left(\bigcup_{i=1}^{N} \partial\Omega_i\right) \setminus \partial\Omega$. Let the corresponding conforming finite element spaces be $W_i = W^h(\Omega_i)$, $i = 1, \ldots, N$, and let $W := \prod_{i=1}^{N} W_i$ be the associated product space. When it is necessary to use vectors of nodal values, which define the elements of a finite element space, we underline; e.g., $\underline{W}$ is the product space that corresponds to $W$. Analogously, we denote the vector of nodal values associated with the finite element function $u$ by $\underline{u}$.
For each subdomain $\Omega_i$, $i = 1, \ldots, N$, we assemble the local stiffness matrices $K_i$ and local load vectors $f_i$ by integrating the appropriate expressions over individual subdomains. We denote the local vectors of nodal values by $u_i$.
We can now formulate a minimization problem with constraints given by the
intersubdomain continuity conditions.
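Find $u \in W$, such that
\[
\left.
\begin{array}{l}
J(u) := \frac{1}{2} u^t K u - f^t u \to \min \\
Bu = 0
\end{array}
\right\}, \tag{4}
\]
where
\[
u = \begin{bmatrix} u_1 \\ \vdots \\ u_N \end{bmatrix}, \quad
f = \begin{bmatrix} f_1 \\ \vdots \\ f_N \end{bmatrix}, \quad \text{and} \quad
K = \begin{bmatrix}
K_1 & O & \cdots & O \\
O & K_2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & O \\
O & \cdots & O & K_N
\end{bmatrix}.
\]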
The matrix $B = [B_1, \ldots, B_N]$ is constructed from $\{0, 1, -1\}$ such that the values of the solution $u$, associated with more than one subdomain, coincide when $Bu = 0$. The local stiffness matrices $K_i$ are positive semidefinite. The problem (4) is uniquely solvable if and only if $\ker(K) \cap \ker(B) = \{0\}$, i.e., $K$ is invertible on the null space of $B$. This condition holds since the original finite element model is elliptic.
By introducing a vector of Lagrange multipliers $\lambda$ to enforce the constraint $Bu = 0$, we obtain a saddle point formulation of (4):
Find $(u, \lambda) \in W \times U$, such that
\[
\left.
\begin{array}{rcl}
Ku + B^t\lambda & = & f \\
Bu & = & 0
\end{array}
\right\}. \tag{5}
\]
We note that the solution $\lambda$ of (5) is in general only unique up to an additive vector from $\ker(B^t)$. The space of Lagrange multipliers $U$ is therefore chosen as $\operatorname{range}(B)$; see also discussion below.
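To make the structure of (4) and (5) concrete, here is a minimal sketch that tears a small problem into two subdomains and glues it back with a single Lagrange multiplier. It assumes a scalar 1D Poisson model ($-u'' = 1$ on $(0,1)$, $u(0) = 0$, linear elements, $h = 1/4$) rather than the elasticity system of the paper, and all variable names are illustrative. The second subdomain is floating, so its stiffness matrix is singular, exactly the situation the text describes.

```python
import numpy as np

h = 0.25  # mesh size: four linear elements on (0, 1)

# Subdomain 1 = (0, 0.5), clamped at x = 0; unknowns at x = 0.25, 0.5.
K1 = np.array([[2.0, -1.0],
               [-1.0, 1.0]]) / h
f1 = np.array([h, h / 2])

# Subdomain 2 = (0.5, 1), floating; unknowns at x = 0.5, 0.75, 1.0.
# K2 is singular: constants are in its null space.
K2 = np.array([[1.0, -1.0, 0.0],
               [-1.0, 2.0, -1.0],
               [0.0, -1.0, 1.0]]) / h
f2 = np.array([h / 2, h, h / 2])

# Block-diagonal K and stacked f as in (4).
K = np.block([[K1, np.zeros((2, 3))],
              [np.zeros((3, 2)), K2]])
f = np.concatenate([f1, f2])

# Jump operator with entries from {0, 1, -1}: one constraint at x = 0.5.
B = np.array([[0.0, 1.0, -1.0, 0.0, 0.0]])

# Saddle point system (5): [K B^t; B 0] [u; lambda] = [f; 0].
n, m = K.shape[0], B.shape[0]
A = np.block([[K, B.T],
              [B, np.zeros((m, m))]])
sol = np.linalg.solve(A, np.concatenate([f, np.zeros(m)]))
u, lam = sol[:n], sol[n:]

# Reference: the assembled, undecomposed problem on x = 0.25, ..., 1.0.
K_glob = np.array([[2.0, -1.0, 0.0, 0.0],
                   [-1.0, 2.0, -1.0, 0.0],
                   [0.0, -1.0, 2.0, -1.0],
                   [0.0, 0.0, -1.0, 1.0]]) / h
f_glob = np.array([h, h, h, h / 2])
u_glob = np.linalg.solve(K_glob, f_glob)
```

Although $K$ itself is singular, the saddle point matrix is invertible here because $\ker(K) \cap \ker(B) = \{0\}$ and $B$ has full row rank; the two copies of the interface value in `u` agree with the assembled solution `u_glob`.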
We will also use a full rank matrix, built from the rigid body motions of the
subdomains
\[
R = \begin{bmatrix}
R_1 & O & \cdots & O \\
O & R_2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & O \\
O & \cdots & O & R_N
\end{bmatrix},
\]
such that $\operatorname{range}(R) = \ker(K)$; the subdomains with nonsingular stiffness matrices do not contribute any columns to $R$.
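The solution of the first equation in (5) exists if and only if $f - B^t\lambda \in \operatorname{range}(K)$; this constraint will lead to the introduction of a projection $P$. We obtain
\[
u = K^\dagger(f - B^t\lambda) + R\alpha \quad \text{if } f - B^t\lambda \perp \ker(K), \tag{6}
\]
where $K^\dagger$ is a pseudoinverse of $K$. Substituting (6) into the second equation of (5) gives
\[
BK^\dagger B^t\lambda = BK^\dagger f + BR\alpha.
\]
By considering the component orthogonal to $BR$, we find that
\[
\left.
\begin{array}{rcl}
PF\lambda & = & Pd \\
G^t\lambda & = & e
\end{array}
\right\} \tag{7}
\]
with $G := BR$, $F := BK^\dagger B^t$, $d := BK^\dagger f$, $P := I - G(G^tG)^{-1}G^t$, and $e := R^t f$. The second equation in (7) is a consequence of the orthogonality condition of (6). We note that $P$ is an orthogonal projection from $U$ onto $\ker(G^t)$.
Any solution $\lambda$ of (5) and (7) yields the same solution $u$ of (4) and (5) if $\alpha := -(G^tG)^{-1}G^t(d - F\lambda)$ is chosen; see Mandel, Tezaur, and Farhat [17, Thm. 2.4].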
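The chain from (6) to (7) can be traced numerically on a toy problem. The sketch below assumes a scalar 1D Poisson model with two subdomains (one clamped, one floating) instead of elasticity, uses the Moore–Penrose pseudoinverse as one admissible choice of pseudoinverse, and solves the dual problem through the equivalent augmented system $F\lambda - G\alpha = d$, $G^t\lambda = e$ with a dense solver rather than by projected conjugate gradients. All names are illustrative.

```python
import numpy as np

h = 0.25
# Two 1D subdomains: (0, 0.5) clamped at x = 0, and (0.5, 1) floating.
K1 = np.array([[2.0, -1.0], [-1.0, 1.0]]) / h
K2 = np.array([[1.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 1.0]]) / h
K = np.block([[K1, np.zeros((2, 3))], [np.zeros((3, 2)), K2]])
f = np.array([h, h / 2, h / 2, h, h / 2])
B = np.array([[0.0, 1.0, -1.0, 0.0, 0.0]])

# Columns of R span ker(K): the constant mode of the floating subdomain.
R = np.array([[0.0], [0.0], [1.0], [1.0], [1.0]])

# Moore-Penrose pseudoinverse as one possible K^dagger (an assumption).
K_dag = np.linalg.pinv(K, rcond=1e-12)

G = B @ R               # G := B R
F = B @ K_dag @ B.T     # F := B K^dagger B^t
d = B @ K_dag @ f       # d := B K^dagger f
e = R.T @ f             # e := R^t f
P = np.eye(G.shape[0]) - G @ np.linalg.solve(G.T @ G, G.T)

# Augmented form of (7): F lam - G alpha = d, G^t lam = e.
n_lam, n_rbm = G.shape
A = np.block([[F, -G], [G.T, np.zeros((n_rbm, n_rbm))]])
sol = np.linalg.solve(A, np.concatenate([d, e]))
lam, alpha = sol[:n_lam], sol[n_lam:]

# Null space coefficients from the closed formula in the text.
alpha_formula = -np.linalg.solve(G.T @ G, G.T @ (d - F @ lam))

# Displacement from (6).
u = K_dag @ (f - B.T @ lam) + R @ alpha
```

In this chain-like 1D setting $G$ is square and invertible, so $V = \ker(G^t)$ is trivial, $P$ vanishes, and $\lambda$ is fixed by $G^t\lambda = e$ alone; in two or three dimensions $V$ is nontrivial and the projected equation $PF\lambda = Pd$ carries the actual iteration.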
We define the space of admissible increments by
\[
V := \{\mu \in U : \mu \perp Bw \ \ \forall w \in \ker(K)\} = \ker(G^t).
\]
The original FETI method is a conjugate gradient method in the space $V$ applied to
\[
PF\lambda = Pd, \quad \lambda \in \lambda_0 + V \tag{8}
\]
with an initial approximation $\lambda_0$ chosen such that $G^t\lambda_0 = e$. To introduce preconditioned variants, let $D$ be a diagonal matrix. Then, the preconditioner $M^{-1}$, introduced in Farhat, Mandel, and Roux [6], is of the form
\[
M^{-1} = B \begin{bmatrix} O & O \\ O & D^{-1} S D^{-1} \end{bmatrix} B^t
\]
with
\[
S = \begin{bmatrix}
S_1 & O & \cdots & O \\
O & S_2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & O \\
O & \cdots & O & S_N
\end{bmatrix}.
\]
The matrix $S$ is the Schur complement of $K$ obtained by eliminating the interior degrees of freedom of each of the subdomains. This computation clearly can be carried out in parallel and results in the block-diagonal matrix $S$ which operates only on the degrees of freedom on the subdomain boundaries. In the application of $M^{-1}$ to a vector, $N$ independent Dirichlet problems have to be solved in each iteration step; $M^{-1}$ is therefore often called the Dirichlet preconditioner. The simplest choice for $D$ is the identity matrix; this choice is made for the original Dirichlet preconditioner as introduced in Farhat, Mandel, and Roux [6]. Another possibility, which leads to faster convergence, is to choose $D$ as a diagonal matrix, where the diagonal elements equal the number of subdomains to which the interface node belongs. This multiplicity scaling (MS) is discussed in Rixen and Farhat [22, 23]; see also further discussion in [14].
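The multiplicity scaling just described amounts to counting, for every interface node, the subdomains that contain it. A minimal sketch under assumed data structures: each subdomain is given as a list of hashable labels for its interface nodes, and the function (a hypothetical helper, not from the paper) returns the diagonal entries of $D$ subdomain by subdomain.

```python
from collections import Counter


def multiplicity_scaling(interface_nodes_per_subdomain):
    """Return, for each subdomain, the diagonal entries of D.

    Each entry equals the number of subdomains that share the
    corresponding interface node (hypothetical helper).
    """
    counts = Counter()
    for nodes in interface_nodes_per_subdomain:
        counts.update(set(nodes))  # count each subdomain at most once
    return [[float(counts[node]) for node in nodes]
            for nodes in interface_nodes_per_subdomain]


# 2 x 2 decomposition: four edge nodes and one cross point "c".
subdomains = [
    ["e12", "c", "e13"],  # subdomain 1
    ["e12", "c", "e24"],  # subdomain 2
    ["e13", "c", "e34"],  # subdomain 3
    ["e24", "c", "e34"],  # subdomain 4
]
D_diag = multiplicity_scaling(subdomains)
```

For the 2 x 2 layout above, nodes on an edge between two subdomains receive the weight 2 and the cross point receives 4, whereas the choice $D = I$ would weight all of them by 1.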
To keep the search directions of this preconditioned conjugate gradient method in the space $V$, the application of the preconditioner $M^{-1}$ has to be followed by another application of the projection $P$. Hence, the Dirichlet variant of the FETI method is the conjugate gradient algorithm applied to the preconditioned system
\[
PM^{-1}PF\lambda = PM^{-1}Pd, \quad \lambda \in \lambda_0 + V, \tag{9}
\]
with an initial approximation $\lambda_0$ chosen such that $G^t\lambda_0 = e$. We note that, for $\lambda \in V$, $PM^{-1}PF\lambda = PM^{-1}P^tP^tFP\lambda$ and that we can therefore view the operator on the left-hand side of the preconditioned FETI system as the product of two symmetric matrices as required for the conjugate gradient algorithm.
4. The block-diagonal preconditioner. Using the decomposition $\lambda = \lambda_0 + \mu$, with $\mu \in V$, we can rewrite (8) as
\[
PBK^\dagger B^t\mu = PBK^\dagger(f - B^t\lambda_0). \tag{10}
\]
Since $u = K^\dagger(f - B^t\lambda) + R\alpha$ and $Bu = 0$, we see that the solution of (10) satisfies
\[
\begin{bmatrix} K & B^t \\ PB & O \end{bmatrix}
\begin{bmatrix} u \\ \mu \end{bmatrix}
=
\begin{bmatrix} f - B^t\lambda_0 \\ 0 \end{bmatrix}.
\]
We note that the elimination of the displacement variables of this system, by applying $K^\dagger$ to the first equation, gives us (10). Using that $\mu \in V$, i.e., $P\mu = \mu$, and that $P^t = P$, we can make the system matrix symmetric such that
\[
\begin{bmatrix} K & (PB)^t \\ PB & O \end{bmatrix}
\begin{bmatrix} u \\ \mu \end{bmatrix}
=
\begin{bmatrix} f - B^t\lambda_0 \\ 0 \end{bmatrix}. \tag{11}
\]
The displacement variable is not uniquely defined by this system; we are enforcing $PBu = 0$ but not $Bu = 0$. Given any solution $u$ of (11), we obtain a solution that satisfies all the requirements by computing $u - R(G^tG)^{-1}G^tBu$; this is quite similar to the choice of the null space component of (6).
For the solution of the saddle point problem (11), we propose a preconditioned conjugate residual method with a block-diagonal preconditioner. For a detailed description of this algorithm, see Hackbusch [9] or Klawonn [11, 12]. We note that this algorithm will be designed such that the first component of each iterate belongs to $\operatorname{range}(K)$.
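Our preconditioner has the form
\[
\mathcal{B} = \begin{bmatrix} \hat{K} & O \\ O & \hat{M} \end{bmatrix}.
\]
Here $\hat{K}$ is assumed to be symmetric and a good preconditioner for $K + D_H Q$, where
\[
Q = \begin{bmatrix}
Q_1 & O & \cdots & O \\
O & Q_2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & O \\
O & \cdots & O & Q_N
\end{bmatrix},
\]
with $Q_i$ the mass matrices associated with the mesh on $\Omega_i$ and $D_H = \operatorname{diag}_{i=1}^{N}(H_i^{-2} I_i)$ is a diagonal matrix. Here $H_i$ denotes the diameter of the subdomain $\Omega_i$. We further assume that $\hat{M}$ is symmetric and a good preconditioner for $M$; i.e., we assume there exist constants $k_0, k_1, m_0, m_1$ with $0 < c \le m_0 \le m_1 \le C < \infty$ and $0 < c \le k_0 \le k_1 \le C < \infty$, such that
\[
\begin{array}{ll}
k_0\, u^t(K + D_H Q)u \le u^t \hat{K} u \le k_1\, u^t(K + D_H Q)u & \forall u \in W, \\
m_0\, \lambda^t M \lambda \le \lambda^t \hat{M} \lambda \le m_1\, \lambda^t M \lambda & \forall \lambda \in V.
\end{array} \tag{12}
\]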
Here $c$ and $C$ are generic constants independent of, or only weakly dependent on, the mesh size. Because of the block-diagonal structure of $K$ and $M$ and the preconditioners, $c$ and $C$ do not depend on the number of subdomains.
A preconditioner is often said to be optimal when the constants $c$ and $C$ can be chosen to be independent of the mesh size and the number of subdomains. We also note that all that is required here are preconditioners for quite benign positive definite problems on individual subregions and that the bounds are independent of the number of subdomains.
From these assumptions it is clear that our preconditioner $\mathcal{B}$ is symmetric positive definite and thus it can be used with the preconditioned conjugate residual method. In order to have a computationally efficient preconditioner, we must also assume that $\hat{K}^{-1}$ and $\hat{M}^{-1}$ can be applied to a vector at a low cost.
To guarantee that the iterates belong to $\operatorname{range}(K)$, we make use of the orthogonal projection $P_R$ onto $\operatorname{range}(K)$ which is given by
\[
P_R := I - R(R^tR)^{-1}R^t.
\]
Here we recall that $\operatorname{range}(R) = \ker(K)$, and we note that $P_R$ is a block matrix with a $3 \times 3$ block for each interior subdomain; the expense of applying $P_R$ to a vector is therefore very modest.
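The low cost of this projection is easy to see in code: applying $P_R$ only needs products with $R$ and $R^t$ and the solution of a small system with $R^tR$. The following sketch applies the formula without ever forming $P_R$; the function name and the toy $R$ (a single constant mode on three of five unknowns) are illustrative assumptions.

```python
import numpy as np


def apply_range_projection(R, v):
    """Return P_R v = v - R (R^t R)^{-1} R^t v without forming P_R."""
    coeffs = np.linalg.solve(R.T @ R, R.T @ v)  # small system with R^t R
    return v - R @ coeffs


# Toy example: one floating block whose null space is the constant mode.
R = np.array([[0.0], [0.0], [1.0], [1.0], [1.0]])
v = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
w = apply_range_projection(R, v)
```

For this example the mean value 4 of the last three entries is removed, so `w` is orthogonal to the columns of `R`, and applying the projection a second time leaves `w` unchanged.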
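Our domain decomposition method is now the conjugate residual algorithm applied to the preconditioned system
\[
\mathcal{B}^{-1}\mathcal{A}x = \mathcal{B}^{-1}\mathcal{F}
\]
with
\[
\mathcal{A} = \begin{bmatrix} K & (PB)^t \\ PB & O \end{bmatrix}, \quad
\mathcal{B}^{-1} = \begin{bmatrix} P_R \hat{K}^{-1} P_R^t & O \\ O & P \hat{M}^{-1} P^t \end{bmatrix}, \tag{13}
\]
\[
x = \begin{bmatrix} u \\ \mu \end{bmatrix}, \quad
\mathcal{F} = \begin{bmatrix} f - B^t\lambda_0 \\ 0 \end{bmatrix}.
\]
We note that it is easy to see that only two matrix-vector products with the projection $P$ and one with the projection $P_R$ are required in each step. We note that the iterates of the conjugate residual method belong to $W_R \times V$ with $W_R := \operatorname{range}(K)$.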
5. Analysis. In this section, we will work with both finite element functions and vectors of nodal values representing them. We will make no distinction between operators and their matrix representation.
We will use the spaces $W$, $U$, and $V$, and we begin by defining a norm $\|\cdot\|_W$ and a seminorm $|\cdot|_W$ on $W$ by
\[
\|v\|_W^2 := \sum_{i=1}^{N} \|v_i\|^2_{H^1(\Omega_i)}, \qquad |v|_W^2 := \sum_{i=1}^{N} |v_i|^2_{H^1(\Omega_i)},
\]
where $\|v_i\|^2_{H^1(\Omega_i)} := |v_i|^2_{H^1(\Omega_i)} + \frac{1}{H_i^2}\|v_i\|^2_{L_2(\Omega_i)}$ for $v = [v_1, \ldots, v_N] \in W$. We also need the orthogonal decomposition of $W$ into $W_R := \operatorname{range}(K)$ and its orthogonal complement $W_R^\perp := \ker(K)$, i.e.,
\[
W = W_R \oplus W_R^\perp.
\]
For the analysis we need to consider the trace space $W_\Gamma$ of $W$. We equip the space $W_\Gamma$ with the seminorm and norm
\[
|w_\Gamma|^2_{W_\Gamma} := \sum_{i=1}^{N} |w_{\Gamma,i}|^2_{H^{1/2}(\partial\Omega_i)}, \qquad
\|w_\Gamma\|^2_{W_\Gamma} := |w_\Gamma|^2_{W_\Gamma} + \sum_{i=1}^{N} \frac{1}{H_i}\|w_{\Gamma,i}\|^2_{L_2(\partial\Omega_i)}.
\]
We also define
\[
B_\Gamma : W_\Gamma \to U, \qquad B_\Gamma := B|_{W_\Gamma}.
\]
From the construction of the jump operator $B$ it is clear that
\[
Bw = B_\Gamma w_\Gamma \quad \text{for } w \in W \text{ where } w_\Gamma = w|_\Gamma,
\]
since continuity is enforced only across the interface. As a consequence, we obtain
\[
\ker(B^t) = \ker(B_\Gamma^t) \quad \text{and} \quad BB^t = B_\Gamma B_\Gamma^t.
\]
Let us now introduce a norm on $V$ by $\|\mu\|_V := |B_\Gamma^t\mu|_{W_\Gamma}$. This defines a norm, and not only a seminorm, since $\operatorname{range}(B_\Gamma^t)$ [...] $V'$. For the sake of simplicity, we will identify the space $V'$ with $V$, but we will use both norms. The above definitions are essentially the same as in [16].
Let us also define the product space $X := W \times V$ which we equip with the graph norm $\|x\|_X := \sqrt{\|v\|_W^2 + \|\mu\|_{V'}^2}$ for $x = (v, \mu) \in X$. We also need the subspace $X_R := W_R \times V$ with the same norm as $X$.
5.1. FETI for linear elasticity. For scalar, second order elliptic equations, it has been shown by Mandel and Tezaur [16] that the condition number of the FETI method with the original Dirichlet preconditioner ($D = I$) satisfies
\[
\kappa(PM^{-1}PF) \le C\,(1 + \log(H/h))^3.
\]
We note that $(H/h)^2$ is proportional to the number of degrees of freedom of a subdomain. An improved result for a family of Dirichlet preconditioners using multiplicity scaling, which leads to a $C\,(1 + \log(H/h))^2$ estimate, is given in Klawonn and Widlund [14]; this work is also extended to problems with discontinuous coefficients.
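To get a feeling for how mildly these bounds grow, the following sketch tabulates the factors $(1 + \log(H/h))^2$ and $(1 + \log(H/h))^3$ for a few subdomain sizes. It assumes the natural logarithm (a different base only changes the generic constant $C$) and ignores $C$ altogether, so the numbers indicate growth rates only, not actual condition numbers; the function name is illustrative.

```python
import math


def polylog_factor(H_over_h, power):
    """Return (1 + log(H/h))**power, the growth factor in the bounds."""
    if H_over_h < 1.0:
        raise ValueError("H/h is expected to be at least 1")
    return (1.0 + math.log(H_over_h)) ** power


ratios = [4, 8, 16, 32, 64]
# Each row: (H/h, factor with exponent 2, factor with exponent 3)
table = [(r, polylog_factor(r, 2), polylog_factor(r, 3)) for r in ratios]

for r, quad, cubic in table:
    print(f"H/h = {r:3d}   (1+log)^2 = {quad:7.2f}   (1+log)^3 = {cubic:7.2f}")
```

While the number of unknowns per subdomain, proportional to $(H/h)^2$, grows by a factor of 256 between the first and the last row, both factors grow far more slowly, which is the sense in which the dependence on the local problem size is only polylogarithmic.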
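In this section, we extend the results of Mandel and Tezaur [16] from scalar elliptic equations to the system of linear elasticity. To be able to use the results of [16], we now introduce the vector-valued Laplacian to be used only in our analysis. We denote by
\[
K_\Delta := \begin{bmatrix}
K_{\Delta,1} & O & \cdots & O \\
O & K_{\Delta,2} & \ddots & \vdots \\
\vdots & \ddots & \ddots & O \\
O & \cdots & O & K_{\Delta,N}
\end{bmatrix}
\]
a block-diagonal matrix, where the local stiffness matrices $K_{\Delta,i}$, $i = 1, \ldots, N$, are obtained from the discretization of the inner product $(u, u)_{H^1(\Omega_i)}$ using the same finite element space as for $a(\cdot,\cdot)$. We denote by $S_\Delta$ [...]
\[
(S_\Delta u_\Gamma, u_\Gamma) = \min_{\substack{v \in W \\ v|_\Gamma = u_\Gamma}} (K_\Delta v, v) = (K_\Delta v_{\mathrm{harm}}, v_{\mathrm{harm}})
\]
for $u_\Gamma$ [...]. Here $v_{\mathrm{harm}} \in W$ is the discrete harmonic extension of $u_\Gamma$ [...]
\[
(K_\Delta v_{\mathrm{harm}}, v) = 0 \quad \forall v \in W_0 \quad \text{with } v_{\mathrm{harm}}|_\Gamma = u_\Gamma,
\]
where $W_0$ is the subspace of $W$ with elements that vanish on the interface $\Gamma$.
Returning to the elasticity case, it can also easily be seen that
\[
(Su_\Gamma, u_\Gamma) = \min_{\substack{v \in W \\ v|_\Gamma = u_\Gamma}} (Kv, v) = (Kv_{\mathrm{elast}}, v_{\mathrm{elast}})
\]
for $u_\Gamma$ [...]. Here $v_{\mathrm{elast}} \in W$ is the extension of $u_\Gamma$ [...].
Using these formulas and the continuity of $a(\cdot,\cdot)$, we find that for $u_\Gamma$ [...],
\[
\begin{array}{rcl}
(Su_\Gamma, u_\Gamma) & = & \min_{v \in W,\ v|_\Gamma = u_\Gamma} (Kv, v) \\
& \le & (Kv_{\mathrm{harm}}, v_{\mathrm{harm}}) \\
& \le & C\,(K_\Delta v_{\mathrm{harm}}, v_{\mathrm{harm}}) \\
& = & C \min_{v \in W,\ v|_\Gamma = u_\Gamma} (K_\Delta v, v) \\
& = & C\,(S_\Delta u_\Gamma, u_\Gamma).
\end{array}
\]
To bound $(Su_\Gamma, u_\Gamma)$ [...] $u_\Gamma \in \operatorname{range}(S)$,
\[
\begin{array}{rcl}
(S_\Delta u_\Gamma, u_\Gamma) & = & \min_{v \in W,\ v|_\Gamma = u_\Gamma} (K_\Delta v, v) \\
& \le & (K_\Delta v_{\mathrm{elast}}, v_{\mathrm{elast}}) \\
& \le & C\,(Kv_{\mathrm{elast}}, v_{\mathrm{elast}}) \\
& = & C \min_{v \in \operatorname{range}(K),\ v|_\Gamma = u_\Gamma} (Kv, v) \\
& = & C\,(Su_\Gamma, u_\Gamma).
\end{array}
\]
Since $(S_\Delta u_\Gamma, u_\Gamma)$ and $|u_\Gamma|^2_{W_\Gamma}$ are equivalent for $u_\Gamma$ [...]
\[
c\,|u_\Gamma|^2_{W_\Gamma} \le (Su_\Gamma, u_\Gamma) \le C\,|u_\Gamma|^2_{W_\Gamma} \quad \forall u_\Gamma \in \operatorname{range}(S).
\]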
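The next lemma follows from Mandel and Tezaur [16, proof of Lem. 3.11].
Lemma 6.
\[
\inf_{\mu \in V} \sup_{w \in W} \frac{(\mu, B_\Gamma w_\Gamma)}{\|\mu\|_{V'}\,\|w_\Gamma\|_{W_\Gamma}} \ge C\,(1 + \log(H/h))^{-1/2},
\]
\[
\sup_{\mu \in V} \sup_{w \in W} \frac{(\mu, B_\Gamma w_\Gamma)}{\|\mu\|_{V'}\,\|w_\Gamma\|_{W_\Gamma}} \le C\,(1 + \log(H/h)).
\]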
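The next result is essentially an extension of that lemma to the system of equations of linear elasticity.
Lemma 7. There exist constants $0 < c \le C < \infty$, independent of the mesh size and the number of subdomains, such that
\[
c\,(1 + \log(H/h))^{-1}\,\|\mu\|^2_{V'} \le (F\mu, \mu) \le C\,(1 + \log(H/h))^{2}\,\|\mu\|^2_{V'} \quad \forall \mu \in V.
\]
Proof. As shown in [16, proof of Lem. 3.11], we have for $\mu \in V$
\[
(F\mu, \mu) = \sup_{w_\Gamma \in \operatorname{range}(S)} \frac{(\mu, B_\Gamma w_\Gamma)^2}{(Sw_\Gamma, w_\Gamma)}.
\]
Using Lemma 5 and that $B_\Gamma^t$ [...]
\[
\begin{array}{rcl}
[...]\ \dfrac{(\mu, B_\Gamma w_\Gamma)^2}{|w_\Gamma|^2_{W_\Gamma}}
& \le & C \sup_{w_\Gamma \in \operatorname{range}(S)} \dfrac{(\mu, B_\Gamma w_\Gamma)^2}{\inf_{z_\Gamma \in \ker(S)} \|w_\Gamma + z_\Gamma\|^2_{W_\Gamma}} \\[2ex]
& = & C \sup_{w_\Gamma \in \operatorname{range}(S),\ z_\Gamma \in \ker(S)} \dfrac{(\mu, B_\Gamma w_\Gamma)^2}{\|w_\Gamma + z_\Gamma\|^2_{W_\Gamma}} \\[2ex]
& = & C \sup_{w \in W} \dfrac{(\mu, B_\Gamma w_\Gamma)^2}{\|w_\Gamma\|^2_{W_\Gamma}}.
\end{array}
\]
Analogously, we obtain
\[
(F\mu, \mu) \ge c \sup_{w \in W} \frac{(\mu, B_\Gamma w_\Gamma)^2}{\|w_\Gamma\|^2_{W_\Gamma}}.
\]
The bounds of $(F\mu, \mu)$ now follow from Lemma 6.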
Combining the definitions of the (exact) Dirichlet preconditioner $M^{-1}$ and of the norm $\|\cdot\|_V$ with Lemma 5, we obtain the next lemma.
Lemma 8. There exist constants $0 < c \le C < \infty$, independent of the mesh size and the number of subdomains, such that
\[
c\,\|\mu\|^2_V \le (M^{-1}\mu, \mu) \le C\,\|\mu\|^2_V \quad \forall \mu \in V.
\]
A condition number estimate of $PM^{-1}PF$ follows easily from these estimates; cf. also [16]. The proof for the algorithm using an inexact Dirichlet solver proceeds along very similar lines.
Theorem 9. There exists a positive constant $C$, independent of the mesh size and the number of subdomains, such that
\[
\kappa(PM^{-1}PF) \le C\,(1 + \log(H/h))^3,
\]
where $\kappa(PM^{-1}PF) := \lambda_{\max}/\lambda_{\min}$ is the spectral condition number defined by the ratio of the largest and the smallest eigenvalue $\lambda_{\max}$ and $\lambda_{\min}$ of $PM^{-1}PF$.
Similarly, there exists a positive constant $C$, independent of the mesh size and the number of subdomains, such that
\[
\kappa(P\hat{M}^{-1}PF) \le C\,(1 + \log(H/h))^3,
\]
for any preconditioner $\hat{M}^{-1}$ that is spectrally equivalent to the exact Dirichlet preconditioner.
We note that Tezaur [27] established a condition number estimate of the form $C\,(1 + \log(H/h))^2$ for an algebraic FETI method developed by Park and Felippa [20] and Park, Justino, and Felippa [21]; see also the discussion in Rixen et al. [24].
Recently, a modified family of Dirichlet preconditioners, which includes the one with multiplicity scaling for both redundant and nonredundant Lagrange multipliers, was introduced and analyzed by the authors in [14]. Using redundant Lagrange multipliers amounts to choosing the maximal number of constraints, pairwise connecting all degrees of freedom which belong to the same point $x$, and using a Lagrange multiplier for each possible pair. In contrast, nonredundant Lagrange multipliers are obtained by choosing the minimal number which ensures continuity of the displacements at each point on the interface.
For the case of nonredundant Lagrange multipliers, our preconditioner is of the form
\[
\hat{M}_{nr}^{-1} := (BB^t)^{-1} B \begin{bmatrix} O & O \\ O & S \end{bmatrix} B^t (BB^t)^{-1}
= (B_\Gamma B_\Gamma^t)^{-1} B_\Gamma S B_\Gamma^t (B_\Gamma B_\Gamma^t)^{-1}
\]
and for the redundant case, we have
\[
\hat{M}_{r}^{-1} := M^{-1} = B \begin{bmatrix} O & O \\ O & D^{-1}SD^{-1} \end{bmatrix} B^t
= B_\Gamma D^{-1} S D^{-1} B_\Gamma^t,
\]
where $D$ is defined by multiplicity scaling; cf. section 3. We note that in the case of discontinuous coefficients, the preconditioners $\hat{M}_{nr}$ and $\hat{M}_{r}$ have to be enhanced using a more elaborate scaling; we refer to [14] for a more detailed discussion.
In [14], condition number estimates of $C\,(1 + \log(H/h))^2$ are given for second order scalar elliptic equations for both the preconditioners $\hat{M}_{nr}$ and $\hat{M}_{r}$. We note that the construction and analysis in [14] also hold for problems with discontinuous coefficients. In [14, proof of Thm. 1], we show in the nonredundant case
\[
\begin{array}{rcl}
\sup_{w_\Gamma \in \operatorname{range}(S)} \dfrac{\langle \mu, B_\Gamma w_\Gamma \rangle^2}{|w_\Gamma|^2_S} & \ge & \langle \hat{M}_{nr}\mu, \mu \rangle, \\[2ex]
\sup_{w_\Gamma \in \operatorname{range}(S)} \dfrac{\langle \mu, B_\Gamma w_\Gamma \rangle^2}{|w_\Gamma|^2_S} & \le & C\,(1 + \log(H/h))^2\, \langle \hat{M}_{nr}\mu, \mu \rangle.
\end{array} \tag{14}
\]
Similar inequalities can be obtained using $\hat{M}_{r}$. In that paper, [14, Lem. 3], we define a norm on $V$ in terms of the preconditioner $\hat{M}_{nr}$, i.e., $\|\mu\|_V := \langle \hat{M}_{nr}^{-1}\mu, \mu \rangle^{1/2}$; the roles of $\|\cdot\|_V$ and $\|\cdot\|_{V'}$ are reversed in [14] compared to this paper. The dual norm $\|\cdot\|_{V'}$ can now be defined, as before, using the new $V$ norm. A simple computation gives $\|\lambda\|^2_{V'} = \langle \hat{M}_{nr}\lambda, \lambda \rangle$. The same arguments are equally valid for $\hat{M}_{r}$ in the redundant case.
Using the inf-sup and sup-sup estimates (14), Lemma 5, and the definition of $\|\cdot\|_V$, we obtain a condition number estimate for the FETI preconditioners for linear elasticity, using the same arguments as in the proof of Theorem 9. We note that the improved condition number estimate is a result of the optimality of the inf-sup constant in the first inequality in (14).
Theorem 10. There exists a positive constant $C$, independent of the mesh size and the number of subdomains, such that
\[
\kappa(P\hat{M}^{-1}PF) \le C\,(1 + \log(H/h))^2,
\]
where $\hat{M}$ is either $\hat{M}_{r}$ or $\hat{M}_{nr}$.
5.2. Analysis of the block-diagonal preconditioner. In this section, we give a condition number estimate for the block-diagonal preconditioner for the systems of equations arising from linear elasticity. This results in a convergence estimate for the preconditioned conjugate residual method.
As shown in section 4, the system of equations (11) involves an operator from $W \times V$ onto itself. The component of $u$ in $\ker(K)$ is determined after the completion of the iteration. It is therefore appropriate to consider the restriction of the operator $\mathcal{A}$ to the subspace $X_R = W_R \times V$. Similarly, we can view the preconditioner $\mathcal{B}^{-1}$ as a mapping from $X_R$ onto itself; see (13).
An upper bound for the convergence rate of the conjugate residual method can be given in terms of the condition number $\kappa(\mathcal{B}^{-1}\mathcal{A})$ of the preconditioned system. A theory of block-diagonal preconditioners for saddle point problems of different origins has been developed by several authors; see Rusten and Winther [25], Silvester and Wathen [26], Kuznetsov [15], and Klawonn [12]. To the best of our knowledge, the first proof for block-preconditioners applied to saddle point problems with a singular block $K$ is given in Klawonn [10, 11] and independently in Arnold, Falk, and Winther [1]. In order to obtain a condition number estimate for $\mathcal{B}^{-1}\mathcal{A}$, we follow the short argument given in [1] which is in the same spirit as that given by Mandel and Tezaur [16, Lem. 3.1] for the positive definite case. For completeness, we include the short proof here using our notation and the norms of $X_R$ and its dual. Denoting by $\rho(\cdot)$ the spectral radius of a matrix, we find
\[
\begin{array}{rcl}
\kappa(\mathcal{B}^{-1}\mathcal{A}) & = & \rho(\mathcal{B}^{-1}\mathcal{A})\,\rho((\mathcal{B}^{-1}\mathcal{A})^{-1}) \\
& \le & \|\mathcal{B}^{-1}\mathcal{A}\|_{X_R \to X_R}\,\|(\mathcal{B}^{-1}\mathcal{A})^{-1}\|_{X_R \to X_R} \\
& \le & \|\mathcal{B}^{-1}\|_{X_R' \to X_R}\,\|\mathcal{A}\|_{X_R \to X_R'}\,\|\mathcal{B}\|_{X_R \to X_R'}\,\|\mathcal{A}^{-1}\|_{X_R' \to X_R}.
\end{array}
\]
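Hence, we need estimates of the norms of the operators $\mathcal{B}$, $\mathcal{A}$, and their inverses.
The next lemma is due to Brezzi [2] and is given here in our notation.
Lemma 11. Let $B : W_R \to V$ satisfy an inf-sup and a sup-sup condition, i.e.,
\[
\begin{array}{rcl}
\inf_{\mu \in V} \sup_{w \in W_R} \dfrac{(\mu, Bw)}{\|\mu\|_{V'}\,\|w\|_W} & \ge & \beta_0, \\[2ex]
\sup_{\mu \in V} \sup_{w \in W_R} \dfrac{(\mu, Bw)}{\|\mu\|_{V'}\,\|w\|_W} & \le & \beta_1,
\end{array} \tag{15}
\]
where $\beta_0, \beta_1 > 0$.
Furthermore, let $K : W_R \to W_R$ be a symmetric operator satisfying
\[
\begin{array}{rcll}
|(Kw, v)| & \le & \alpha_1\,\|w\|_W\,\|v\|_W & \forall w, v \in W_R, \\
(Kw, w) & \ge & \alpha_0\,\|w\|^2_W & \forall w \in W_R,
\end{array} \tag{16}
\]
where $\alpha_0, \alpha_1 > 0$.
Then, $\mathcal{A} : X_R \to X_R'$ is an isomorphism with $\|\mathcal{A}\|_{X_R \to X_R'} \le C(\alpha_1, \beta_1)$ and $\|\mathcal{A}^{-1}\|_{X_R' \to X_R} \le C(\alpha_0, \alpha_1, \beta_0)$, where $C(\alpha_1, \beta_1) := \alpha_1 + \beta_1$ and
\[
C(\alpha_0, \alpha_1, \beta_0) := \max\left( \alpha_0^{-1} + \beta_0^{-1}(1 + \alpha_1/\alpha_0),\ (\beta_0^{-1} + \alpha_1\beta_0^{-2})(1 + \alpha_1/\alpha_0) \right).
\]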
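The two constants of Lemma 11 are simple enough to evaluate directly, which shows how the logarithmic factors from the inf-sup and sup-sup bounds propagate into the norm estimates for the saddle point operator and its inverse. The sketch below writes $\alpha_0, \alpha_1$ for the ellipticity and continuity constants of $K$ in (16) and $\beta_0, \beta_1$ for the inf-sup and sup-sup constants in (15); this assignment of symbols follows the standard form of Brezzi's estimate and is an assumption where the printed formula is hard to read. The function names and the sample values are illustrative.

```python
import math


def norm_bound_A(alpha1, beta1):
    """C(alpha1, beta1) := alpha1 + beta1, bound for the operator norm."""
    return alpha1 + beta1


def norm_bound_A_inverse(alpha0, alpha1, beta0):
    """C(alpha0, alpha1, beta0), bound for the norm of the inverse."""
    first = 1.0 / alpha0 + (1.0 / beta0) * (1.0 + alpha1 / alpha0)
    second = (1.0 / beta0 + alpha1 / beta0 ** 2) * (1.0 + alpha1 / alpha0)
    return max(first, second)


# Illustration with unit generic constants: beta0 ~ (1 + log(H/h))^(-1/2),
# beta1 ~ (1 + log(H/h)), alpha0 = alpha1 = 1 (all assumed values).
L = 1.0 + math.log(8.0)
bound_A = norm_bound_A(1.0, L)
bound_A_inv = norm_bound_A_inverse(1.0, 1.0, L ** -0.5)
```

With these sample values both bounds grow like $1 + \log(H/h)$: the first is $1 + L$ and the second is dominated by the term $\alpha_1\beta_0^{-2} = L$, where $L = 1 + \log(H/h)$.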
The uniform boundedness and ellipticity of $K$ on $W_R$ follow directly from the definition of the norm $\|\cdot\|_W$. Thus, we obtain constants $\alpha_0, \alpha_1 > 0$ which are independent of $h$, $H$.
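We are left with showing the inf-sup and sup-sup conditions for $B$.
Lemma 12.
\[
\inf_{\mu \in V} \sup_{w \in W_R} \frac{(\mu, Bw)}{\|\mu\|_{V'}\,\|w\|_W} \ge C\,(1 + \log(H/h))^{-1/2} =: \beta_0,
\]
\[
\sup_{\mu \in V} \sup_{w \in W_R} \frac{(\mu, Bw)}{\|\mu\|_{V'}\,\|w\|_W} \le C\,(1 + \log(H/h)) =: \beta_1.
\]
Proof. For $\mu \in V$, let us now consider
\[
\begin{array}{rcl}
\sup_{w \in W_R} \dfrac{(\mu, Bw)}{\|w\|_W}
& \le & \sup_{w \in W_R} \dfrac{(\mu, Bw)}{|w|_W} \\[2ex]
& \le & C \sup_{w \in W_R} \dfrac{(\mu, Bw)}{(Kw, w)^{1/2}} \\[2ex]
& = & C \sup_{w \in W_R} \dfrac{(\mu, B_\Gamma w_\Gamma)}{\inf_{v \in W,\ v|_\Gamma = w_\Gamma} (Kv, v)^{1/2}} \\[2ex]
& = & C \sup_{w_\Gamma \in \operatorname{range}(S)} \dfrac{(\mu, B_\Gamma w_\Gamma)}{(Sw_\Gamma, w_\Gamma)^{1/2}} \\[2ex]
& \le & C \sup_{w \in W} \dfrac{(\mu, B_\Gamma w_\Gamma)}{\|w_\Gamma\|_{W_\Gamma}} \\[2ex]
& \le & \beta_1\,\|\mu\|_{V'}
\end{array}
\]
with $\beta_1 := C\,(1 + \log(H/h))$. The last inequality follows from Lemma 6. Analogously, we obtain
\[
\sup_{w \in W_R} \frac{(\mu, Bw)}{\|w\|_W} \ge \beta_0\,\|\mu\|_{V'}
\]
with $\beta_0 := C\,(1 + \log(H/h))^{-1/2}$.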
Combining these results with Lemma 11, we obtain estimates of the norm of $\mathcal{A}$ and $\mathcal{A}^{-1}$ as follows.
Lemma 13. The operator $\mathcal{A} : X_R \to X_R'$ is an isomorphism and satisfies
\[
\|\mathcal{A}\|_{X_R \to X_R'} \le C\,(1 + \log(H/h)), \qquad
\|\mathcal{A}^{-1}\|_{X_R' \to X_R} \le C\,(1 + \log(H/h)),
\]
where $C > 0$ is a generic constant independent of the mesh size and the subdomain diameters.
The next lemma provides bounds for $\|\mathcal{B}\|_{X_R \to X_R'}$ and $\|\mathcal{B}^{-1}\|_{X_R' \to X_R}$.
Lemma 14. There exist constants $0 < c \le C < \infty$, which depend only on $k_0$, $k_1$, $m_0$, and $m_1$, such that
\[
\|\mathcal{B}\|_{X_R \to X_R'} \le C, \qquad \|\mathcal{B}^{-1}\|_{X_R' \to X_R} \le \frac{1}{c}.
\]
These bounds are uniform with respect to the mesh size and the number of subdomains if $\mathcal{B}$ is an optimal preconditioner, i.e., if the constants $k_0$, $k_1$, $m_0$, $m_1$ are uniformly bounded.
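Proof. From the first inequalities of (12) and Korn's second inequality (Lemma 2), we obtain by a standard scaling argument that [...]
\[
(\hat{M}^{-1}\mu, \mu) \le C\,(M^{-1}\mu, \mu) = C\,(B_\Gamma S B_\Gamma^t\mu, \mu) \le C\,|B_\Gamma^t\mu|^2_{W_\Gamma} = C\,\|\mu\|^2_V.
\]
In the last inequality, we have used that $\mu \in V$ implies $B_\Gamma^t$ [...]
\[
(\hat{M}^{-1}\mu, \mu) \ge c\,\|\mu\|^2_V.
\]
By using these inequalities, we get for $\lambda \in V$
\[
\|\lambda\|^2_{V'} = \sup_{\mu \in V} \frac{(\lambda, \mu)^2}{\|\mu\|^2_V}
\le C \sup_{\mu \in V} \frac{(\lambda, \mu)^2}{(\hat{M}^{-1}\mu, \mu)}
= C \sup_{\nu \in V} \frac{(\lambda, \hat{M}^{1/2}\nu)^2}{(\nu, \nu)}
= C\,(\hat{M}\lambda, \lambda)
\]
and analogously
\[
\|\lambda\|^2_{V'} \ge c\,(\hat{M}\lambda, \lambda).
\]
From the definition of $\mathcal{B}$ follows
\[
(\mathcal{B}x, x) = (\hat{K}u, u) + (\hat{M}\mu, \mu) \le C\,(\|u\|^2_W + \|\mu\|^2_{V'}) = C\,\|x\|^2_X
\]
with $x = (u, \mu) \in X_R$ and
\[
(\mathcal{B}x, x) \ge c\,\|x\|^2_X.
\]
The boundedness of $\mathcal{B}$ and $\mathcal{B}^{-1}$ follows by using the following formulas:
\[
\|\mathcal{B}\|_{X_R \to X_R'} = \sup_{x \in X_R} \frac{(\mathcal{B}x, x)}{\|x\|^2_X}, \qquad
\|\mathcal{B}^{-1}\|^{-1}_{X_R' \to X_R} = \inf_{x \in X_R} \frac{(\mathcal{B}x, x)}{\|x\|^2_X}.
\]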
From Lemmas 13 and 14 the next theorem follows.
Theorem 15.
\[
\kappa(\mathcal{B}^{-1}\mathcal{A}) \le C\,(1 + \log(H/h))^2
\]
with a constant $C$ independent of $H$, $h$.
Fig. 1. Sample domain decomposition of the cantilever with 16 subdomains. (Labels shown in the figure: 1.0E5, H.)
We can also improve the results of Theorem 15 by using the results of [14], just as Theorem 10 represents an improvement of Theorem 9. We see that the inf-sup estimate used in the proof of Lemma 12 can be replaced with the improved constant bound given by the first inequality of (14). Revising the arguments in the proof of Theorem 15 in this respect, we obtain the following.
Theorem 16. There exists a constant $C > 0$ independent of $H$, $h$ such that
\[
\kappa(\mathcal{B}^{-1}\mathcal{A}) \le C\,(1 + \log(H/h)).
\]
It is known that the convergence rate of the conjugate residual method depends linearly on the condition number, whereas that of the conjugate gradient algorithm depends on its square root. Thus, from the improved estimates in Theorems 10 and 16, we see that both methods have an asymptotic convergence rate on the order of $(1 + \log(H/h))$. This is also reflected in the numerical experiments in section 6.
6. Numerical results. We have applied our domain decomposition method to
a plane stress problem described in section 2. The Poisson ratio is $\nu = 0.3$ and the
elasticity modulus $E = 2.1 \cdot 10^{11}$ N/m$^2$, which models steel. The domain is the unit
square fixed on the left-hand side and free on the remainder of the boundary except
for the upper-right corner element. The only nonzero components of the load vector
are associated with the node at that corner; cf. Figure 1.
All our computations have been performed in MATLAB 5.3. Our Krylov subspace
method is the preconditioned conjugate residual method with a zero initial guess. The
stopping criterion is $\|r_n\|_2 / \|r_0\|_2 < 10^{-6}$, where $r_n$ and $r_0$ are the nth and initial
residuals.
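The iteration and stopping test described above can be sketched as follows. This is a minimal sketch and not the MATLAB code used for the experiments: it assumes a symmetric operator and a symmetric positive definite preconditioner, measures the unpreconditioned residual in the stopping test (the text does not say which residual is meant), and uses hypothetical names (`pcr`, `apply_A`, `apply_Minv`). For simplicity the demonstration system is a small symmetric positive definite tridiagonal matrix; for indefinite systems the recurrence can break down, and the sketch simply stops in that case.

```python
import math


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def axpy(alpha, x, y):
    """Return alpha * x + y as a new list."""
    return [alpha * a + b for a, b in zip(x, y)]


def pcr(apply_A, b, apply_Minv, tol=1e-6, maxit=500):
    """Preconditioned conjugate residual method with zero initial guess.

    Stops when ||r_n||_2 / ||r_0||_2 < tol. Returns (x, iterations, relres).
    """
    x = [0.0] * len(b)
    r = list(b)                       # r_0 = b because x_0 = 0
    r0_norm = math.sqrt(dot(r, r))
    if r0_norm == 0.0:
        return x, 0, 0.0
    z = apply_Minv(r)                 # preconditioned residual
    p = list(z)                       # search direction
    Az = apply_A(z)
    w = list(Az)                      # w = A p
    zAz = dot(z, Az)
    it = 0
    for it in range(1, maxit + 1):
        Minv_w = apply_Minv(w)
        denom = dot(w, Minv_w)
        if denom == 0.0 or zAz == 0.0:
            break                     # breakdown: give up in this sketch
        alpha = zAz / denom
        x = axpy(alpha, p, x)
        r = axpy(-alpha, w, r)
        z = axpy(-alpha, Minv_w, z)
        relres = math.sqrt(dot(r, r)) / r0_norm
        if relres < tol:
            return x, it, relres
        Az = apply_A(z)
        zAz_new = dot(z, Az)
        beta = zAz_new / zAz
        p = axpy(beta, p, z)          # p = z + beta * p
        w = axpy(beta, w, Az)         # w = A z + beta * w
        zAz = zAz_new
    return x, it, math.sqrt(dot(r, r)) / r0_norm


# Demonstration on a hypothetical system: tridiag(-1, 2, -1), right-hand side of ones.
n = 10


def apply_A(v):
    return [2.0 * v[i]
            - (v[i - 1] if i > 0 else 0.0)
            - (v[i + 1] if i < n - 1 else 0.0) for i in range(n)]


def apply_Minv(v):
    return [vi / 2.0 for vi in v]     # Jacobi (diagonal) preconditioner


x, its, relres = pcr(apply_A, [1.0] * n, apply_Minv)
```

The operator and the preconditioner are passed as functions, so a block preconditioner can be supplied without forming any matrix explicitly.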
Our domain is decomposed into $N \times N$ square subdomains with H := 1/N;
see Figure 1. In our implementation, we use redundant Lagrange multipliers; see the
end of subsection 5.1 for a definition. This is known to yield a smaller number of
iterations. An explanation from a mechanical viewpoint is given in Rixen and Farhat
[23]; see also Klawonn and Widlund [14] for an analysis and connections between the
cases of redundant and nonredundant Lagrange multipliers.
Table 1
Iteration counts for an increasing number of subdomains of constant size. (I) $\hat{K} = K + \frac{1}{H^2} M_Q$ and $\hat{M} = M$; (II) $\hat{K} = K + \frac{1}{H^2} M_Q$ and $\hat{M}$ using ILU(0); (III) $\hat{K}$ using ILU($10^{-3}$) of $K + \frac{1}{H^2} M_Q$ and $\hat{M} = M$; (IV) $\hat{K}$ using ILU($10^{-3}$) of $K + \frac{1}{H^2} M_Q$ and $\hat{M}$ using ILU(0); (V) FETI using Dirichlet preconditioner. MS: D = multiplicity scaling, Id: D = Identity.
H/h = 8      Iter (I)    Iter (II)   Iter (III)  Iter (IV)   Iter (V)
1/h   1/H    MS    Id    MS    Id    MS    Id    MS    Id    MS    Id
 16     2    11    19    11    19    23    36    24    37     9    14
 32     4    17    27    17    27    33    67    33    70    13    24
 64     8    21    33    23    33    41    84    41    85    17    29
 96    12    21    39    23    39    47    89    49    93    20    32
128    16    21    39    25    41    51    91    55    96    21    35
Table 2
Iteration counts for a constant number of subdomains of increasing size. (I) $\hat{K} = K + \frac{1}{H^2} M_Q$ and $\hat{M} = M$; (II) $\hat{K} = K + \frac{1}{H^2} M_Q$ and $\hat{M}$ using ILU(0); (III) $\hat{K}$ using ILU($10^{-3}$) of $K + \frac{1}{H^2} M_Q$ and $\hat{M} = M$; (IV) $\hat{K}$ using ILU($10^{-3}$) of $K + \frac{1}{H^2} M_Q$ and $\hat{M}$ using ILU(0); (V) FETI using Dirichlet preconditioner. MS: D = multiplicity scaling, Id: D = Identity.
H = 1/4      Iter (I)    Iter (II)   Iter (III)  Iter (IV)   Iter (V)
1/h   H/h    MS    Id    MS    Id    MS    Id    MS    Id    MS    Id
 16     4    15    25    15    25    28    61    28    60    12    22
 32     8    17    27    17    27    33    67    33    70    13    24
 64    16    19    29    19    29    51    99    58   102    15    27
128    32    19    33    23    47    69   140    79   145    18    31
We report on two different series of experiments for different combinations of
preconditioners $\hat{K}$ and $\hat{M}$ in order to analyze the numerical scalability of the method.
In our first set of runs, we keep the dimension of the subproblems, and thus H/h, fixed and
increase the number of subdomains and thus the overall problem size. In a second
series, we keep a fixed number of subdomains and increase the value of H/h, resulting
in a smaller h.
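The two series can be enumerated from the relations H = 1/N and 1/h = N (H/h) on the unit square. The sketch below generates the mesh parameters listed in the first two columns of Tables 1 and 2; the function names are hypothetical and serve only as an illustration.

```python
def series_fixed_subdomain_size(H_over_h, N_values):
    """First series: H/h fixed, N x N subdomains with H = 1/N.

    Returns a list of (1/h, 1/H) pairs.
    """
    return [(N * H_over_h, N) for N in N_values]


def series_fixed_subdomain_count(N, H_over_h_values):
    """Second series: N x N subdomains fixed, H/h increasing.

    Returns a list of (1/h, H/h) pairs.
    """
    return [(N * ratio, ratio) for ratio in H_over_h_values]


# Parameters read off from Tables 1 and 2.
series1 = series_fixed_subdomain_size(8, [2, 4, 8, 12, 16])
series2 = series_fixed_subdomain_count(4, [4, 8, 16, 32])

# Number of subdomains in each run of the first series (N squared).
subdomain_counts = [N * N for _, N in series1]
```

The largest run of the first series thus uses 256 subdomains, while every run of the second series uses the 16 subdomains shown in Figure 1.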
In order to see how our method behaves in the best possible case, we first report
on results for $\hat{K} = K + \frac{1}{H^2} Q$ and $\hat{M} = M$; cf. Tables 1 and 2 and Figures 2 and
3. For both cases, we present results for $\hat{M}$ constructed using D = I as well as
with the multiplicity scaling D = MS; cf. section 3. We also compare the iteration
counts of our new method with those of the original FETI method using the Dirichlet
preconditioner M; see Tables 1 and 2 and Figures 2 and 3. In the experiments with
the FETI algorithm, we stop the iteration when the preconditioned projected residual
is smaller than $10^{-6}\,\|f\|_2$; see Farhat and Roux [8]. In all cases, the convergence is
considerably faster with MS; using this scaling the asymptotic convergence rate is
also reached much earlier than for D = I. For both choices of D, we obtain scalable
domain decomposition methods in both series of experiments.
Fig. 2. Iteration counts for an increasing number of subdomains of constant size. H/h = 8. Upper left: $\hat{K} = K + \frac{1}{H^2} Q$, $\hat{M} = M$. Upper right: $\hat{K} = K + \frac{1}{H^2} M_Q$ and $\hat{M}$ using ILU(0). Lower left: $\hat{K}$ using ILU($10^{-3}$) of $K + \frac{1}{H^2} M_Q$ and $\hat{M} = M$. Lower right: $\hat{K}$ using ILU($10^{-3}$) of $K + \frac{1}{H^2} M_Q$ and $\hat{M}$ using ILU(0). (Axes: 1/h versus iterations. Curves: DIAG MS and DIAG Id in all panels; FETI MS and FETI Id in the upper left panel.)
To gain insight into the convergence behavior with inexact blocks $\hat{K}$ and $\hat{M}$,
we use preconditioners based on an incomplete Cholesky factorization (ILU). In the
following, ILU(0) stands for an incomplete Cholesky factorization with no fill-in, while
ILU(tol) is a threshold ILU factorization, with a threshold of tol, as provided in
MATLAB 5.3; any entry in a column of the Cholesky factor L is dropped if its
magnitude is smaller than the drop tolerance tol times the norm of its column. We
denote by $\hat{S}$ the matrix that replaces S when ILU(0) is used to solve the Dirichlet
problems in each subdomain. Three different combinations are considered:
1. $\hat{K} = K + \frac{1}{H^2} Q$ and $\hat{M}^{-1} = B \begin{pmatrix} O & O \\ O & D^{-1} \hat{S} D^{-1} \end{pmatrix} B^t$;
2. $\hat{K}$ is built with ILU($10^{-3}$) applied to $K + \frac{1}{H^2} Q$ and $\hat{M}^{-1} = M^{-1}$;
3. $\hat{K}$ is built with ILU($10^{-3}$) applied to $K + \frac{1}{H^2} Q$ and $\hat{M}^{-1}$, as in combination 1., by using the inexact Schur complement $\hat{S}$.
The computational results are given in Tables 1 and 2;
cf. also Figures 2 and 3.
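The drop rule quoted above for ILU(tol) can be illustrated on a single column. The sketch below shows only the thresholding step and is not an incomplete factorization, nor MATLAB's implementation. It assumes the Euclidean norm of the column and assumes that the diagonal entry is always kept; the function name `drop_small_entries` and the sample column are hypothetical.

```python
import math


def drop_small_entries(column, diag_index, tol):
    """Apply the threshold rule to one sparse column.

    column: dict mapping row index to value.
    Entries with magnitude below tol * ||column||_2 are dropped; the
    diagonal entry is always kept (an assumption of this sketch).
    """
    norm = math.sqrt(sum(v * v for v in column.values()))
    return {i: v for i, v in column.items()
            if i == diag_index or abs(v) >= tol * norm}


# Hypothetical column with two large and two small entries.
col = {0: 4.0, 1: -1.0, 2: 1e-4, 3: 2e-3}

kept_coarse = drop_small_entries(col, diag_index=0, tol=1e-3)
kept_fine = drop_small_entries(col, diag_index=0, tol=1e-6)
```

With the larger tolerance both small entries are discarded, while the smaller tolerance keeps the whole column; a smaller tol therefore gives a denser and more accurate factor.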
We also present results with a more accurate ILU decomposition based on a
threshold tolerance tol = $10^{-6}$ in Table 3. We see that a more accurate preconditioner
$\hat{K}$ improves the overall rate of convergence significantly. Comparing these results and
the number of iterations obtained for the new method and the original FETI algorithm,
both with the exact Dirichlet preconditioner, there is a strong indication that
the increase in the iteration counts, which results when using the ILU($10^{-3}$) preconditioner,
would be largely eliminated if a better block preconditioner, e.g., (algebraic)
multigrid, were used as $\hat{K}$.
Fig. 3. Iteration counts for a constant number of subdomains of increasing size. H = 1/4. Upper left: $\hat{K} = K + \frac{1}{H^2} Q$, $\hat{M} = M$. Upper right: $\hat{K} = K + \frac{1}{H^2} M_Q$ and $\hat{M}$ using ILU(0). Lower left: $\hat{K}$ using ILU($10^{-3}$) of $K + \frac{1}{H^2} M_Q$ and $\hat{M} = M$. Lower right: $\hat{K}$ using ILU($10^{-3}$) of $K + \frac{1}{H^2} M_Q$ and $\hat{M}$ using ILU(0). (Axes: 1/h versus iterations. Curves: DIAG MS and DIAG Id in all panels; FETI MS and FETI Id in the upper left panel.)
Table 3
Iteration counts. (I) $\hat{K}$ based on ILU($10^{-6}$) of $K + \frac{1}{H^2} M_Q$ and $\hat{M} = M$; (II) $\hat{K}$ based on ILU($10^{-6}$) of $K + \frac{1}{H^2} M_Q$ and $\hat{M}$ using ILU(0). MS: D = multiplicity scaling, Id: D = Identity.
H/h = 8      Iter (I)    Iter (II)
1/h   1/H    MS    Id    MS    Id
 16     2    18    30    20    30
 32     4    17    27    17    27
 64     8    21    33    23    35
128    16    21    39    25    41
Acknowledgments. The first author would like to acknowledge the hospitality
of and many discussions with Xiao-Chuan Cai, Charbel Farhat, and Daniel Rixen
at the University of Colorado at Boulder and also wishes to thank Daniel Rixen for
providing parts of his FETI MATLAB code.
REFERENCES
[1] D. N. Arnold, R. S. Falk, and R. Winther, Preconditioning discrete approximations of the Reissner-Mindlin plate model, RAIRO Model. Math. Anal. Numer., 31 (1997), pp. 517-557.
[2] F. Brezzi, On the existence, uniqueness, and approximation of saddle-point problems arising from Lagrange multipliers, RAIRO Model. Math. Anal. Numer., 8 (1974), pp. 129-151.
[3] P. G. Ciarlet, Mathematical Elasticity Volume I: Three-Dimensional Elasticity, North-Holland, Amsterdam, 1988.
[4] C. Farhat, P. Chen, and J. Mandel, A scalable Lagrange multiplier based domain decomposition method for time-dependent problems, Internat. J. Numer. Methods Engrg., 38 (1995), pp. 3831-3853.
[5] C. Farhat, P. Chen, J. Mandel, and F.-X. Roux, The two-level FETI method. Part II: Extensions to shell problems, parallel implementation and performance results, Comput. Methods Appl. Mech. Engrg., 155 (1998), pp. 153-179.
[6] C. Farhat, J. Mandel, and F.-X. Roux, Optimal convergence properties of the FETI domain decomposition method, Comput. Methods Appl. Mech. Engrg., 115 (1994), pp. 367-388.
[7] C. Farhat and F.-X. Roux, A method of finite element tearing and interconnecting and its parallel solution algorithm, Internat. J. Numer. Methods Engrg., 32 (1991), pp. 1205-1227.
[8] C. Farhat and F.-X. Roux, Implicit parallel processing in structural mechanics, Comput. Mech. Adv., 2 (1994), pp. 1-124.
[9] W. Hackbusch, Iterative Solution of Large Sparse Systems of Equations, Springer, New York, 1994.
[10] A. Klawonn, An Optimal Preconditioner for a Class of Saddle Point Problems with a Penalty Term, Part II: General Theory, Technical report 14/95, Westfälische Wilhelms-Universität Münster, Germany, April 1995. Also available as Technical report 683, Courant Institute of Mathematical Sciences, New York University, New York, 1995.
[11] A. Klawonn, Preconditioners for Indefinite Problems, Ph.D. thesis, Westfälische Wilhelms-Universität Münster, 1995; Technical report 716, Courant Institute of Mathematical Sciences, New York University, New York, 1996.
[12] A. Klawonn, An optimal preconditioner for a class of saddle point problems with a penalty term, SIAM J. Sci. Comput., 19 (1998), pp. 540-552.
[13] A. Klawonn and O. B. Widlund, A domain decomposition method with Lagrange multipliers for linear elasticity, in Proceedings of the 11th International Conference on Domain Decomposition Methods, Greenwich, UK, 1998, C. H. Lai, P. E. Bjørstad, M. Cross, and O. B. Widlund, eds., Domain Decomposition Press, Bergen, Norway, 1998, pp. 49-56. Also available online from http://www.ddm.org/DD11/Klawonn.ps.gz.
[14] A. Klawonn and O. B. Widlund, FETI and Neumann-Neumann iterative substructuring methods: Connections and new results, Comm. Pure Appl. Math., to appear.
[15] Yu. A. Kuznetsov, Efficient iterative solvers for elliptic finite element problems on nonmatching grids, Russian J. Numer. Anal. Math. Modelling, 10 (1995), pp. 187-211.
[16] J. Mandel and R. Tezaur, Convergence of a substructuring method with Lagrange multipliers, Numer. Math., 73 (1996), pp. 473-487.
[17] J. Mandel, R. Tezaur, and C. Farhat, A scalable substructuring method by Lagrange multipliers for plate bending problems, SIAM J. Numer. Anal., 36 (1999), pp. 1370-1391.
[18] J. Nečas, Les méthodes directes en théorie des équations elliptiques, Academia, Prague, 1967.
[19] J. A. Nitsche, On Korn's second inequality, RAIRO Model. Math. Anal. Numer., 15 (1981), pp. 237-248.
[20] K. C. Park and C. A. Felippa, A Variational Framework for Solution Method Developments in Structural Mechanics, Technical report CU-CAS Report 96-21, University of Colorado at Boulder, 1996.
[21] K. C. Park, M. R. Justino, and C. A. Felippa, An algebraically partitioned FETI method for parallel structural analysis: Algorithm description, Internat. J. Numer. Methods Engrg., 40 (1997), pp. 2717-2737.
[22] D. Rixen and C. Farhat, Preconditioning the FETI and balancing domain decomposition methods for problems with intra- and inter-subdomain coefficient jumps, in Proceedings of the Ninth International Conference on Domain Decomposition Methods in Science and Engineering, Bergen, Norway, 1996, P. Bjørstad, M. Espedal, and D. Keyes, eds., Domain Decomposition Press, Bergen, Norway, 1998, pp. 472-479. Also available online from http://www.ddm.org/DD9/Rixen.ps.gz.
[23] D. Rixen and C. Farhat, A simple and efficient extension of a class of substructure based preconditioners to heterogeneous structural mechanics problems, Internat. J. Numer. Methods Engrg., 44 (1999), pp. 489-516.
[24] D. Rixen, C. Farhat, R. Tezaur, and J. Mandel, Theoretical comparison of the FETI and algebraically partitioned FETI methods, and performance comparisons with a direct sparse solver, Internat. J. Numer. Methods Engrg., 46 (1999), pp. 501-534.
[25] T. Rusten and R. Winther, A preconditioned iterative method for saddle-point problems, SIAM J. Matrix Anal. Appl., 13 (1992), pp. 887-904.
[26] D. Silvester and A. Wathen, Fast iterative solutions of stabilised Stokes systems. Part II: Using general block preconditioners, SIAM J. Numer. Anal., 31 (1994), pp. 1352-1367.
[27] R. Tezaur, Analysis of Lagrange Multiplier Based Domain Decomposition, Ph.D. thesis, University of Colorado at Denver, 1998; also available online from http://www-math.cudenver.edu/graduate/thesis/rtezaur.ps.gz.