
Lecture Notes in Mathematics 1894

Editors:
J.-M. Morel, Cachan
F. Takens, Groningen
B. Teissier, Paris
Charles W. Groetsch

Stable Approximate
Evaluation of
Unbounded Operators

Author
Charles W. Groetsch
The Traubert Chair
School of Science and Mathematics
The Citadel
Charleston, SC 29409
USA
e-mail: charles.groetsch@citadel.edu

Library of Congress Control Number: 2006931917

Mathematics Subject Classification (2000): Primary: 47A52, 65J20; Secondary: 47A58, 65J22

ISSN print edition: 0075-8434


ISSN electronic edition: 1617-9692
ISBN-10 3-540-39942-9 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-39942-1 Springer Berlin Heidelberg New York

DOI 10.1007/3-540-39942-9

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting,
reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9,
1965, in its current version, and permission for use must always be obtained from Springer. Violations are
liable for prosecution under the German Copyright Law.
Springer is a part of Springer Science+Business Media
springer.com
© Springer-Verlag Berlin Heidelberg 2007

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply,
even in the absence of a specific statement, that such names are exempt from the relevant protective laws
and regulations and therefore free for general use.
Typesetting by the author and SPi using a Springer LaTeX package
Cover design: WMXDesign GmbH, Heidelberg
Printed on acid-free paper SPIN: 11856795 VA41/3100/SPi 543210
In Memory of Joaquin Bustoz, Jr.
1939-2003
Preface

This monograph is a study of an aspect of operator approximation theory


that emerges from the theory of linear inverse problems in the mathematical
sciences. Such inverse problems are often modeled by operator equations of the
first kind involving a compact linear operator defined on a Hilbert space. The
conventional solution operator for the inverse problem is the Moore-Penrose
generalized inverse of the model operator. Except in the unusual case when the
model operator has finite rank, the Moore-Penrose inverse is a densely defined,
closed, unbounded operator. Therefore bounded perturbations in the data can
be amplified without bound by the solution operator. Indeed, it is a common
experience of those who have dealt with such inverse problems to find that
low amplitude noise in the data expresses itself as high amplitude oscillations
in the computed solution. The successful treatment of these inverse problems
therefore requires two ingredients: an approximation of the Moore-Penrose
inverse coupled with a stabilization technique to dampen spurious oscillations
in the approximate solution.
Here we consider stabilized evaluation of an unbounded operator as a prob-
lem in its own right in operator approximation theory. By stabilized evaluation
we mean that the value of an unbounded operator at some vector in its do-
main is approximated by applying bounded linear operators to an approximate
data vector that is not necessarily in the domain of the original unbounded
operator. Questions of convergence and orders of approximation will be of
foremost concern. A unifying thread for the discussion is a classical theorem
of von Neumann on certain bounded “resolvents” of closed densely defined
unbounded operators. This result is the bridge that allows passage from the
unbounded operator to a class of approximating bounded operators. When
von Neumann’s theorem is combined with the spectral theorem for bounded
self-adjoint operators a general scheme for stabilized evaluation of the un-
bounded operator results. Particular cases of the general scheme, notably the
Tikhonov-Morozov method and its variants, are studied in some detail and
finite-dimensional realizations are dealt with in the final chapter.

The key idea of von Neumann’s proof involves regarding the graph of the
operator as a subspace embedded in a product Hilbert space (in this sense
the proof is truly Cartesian). We also show that this notion, combined with
von Neumann’s alternating projection theorem, applied in the product space,
can be used to give an alternate non-spectral proof of one of the best known
operator stabilization methods.
Our intent is for the monograph to be reasonably self-contained. We begin
with a fairly informal introductory chapter in which a number of model inverse
problems leading to the evaluation of unbounded operators are introduced.
The next chapter fills in background material from functional analysis and
operator theory that is helpful in the sequel. We hope that this approach
will make the monograph a useful source of collateral reading for students
in graduate courses in functional analysis and related courses in analysis and
applied mathematics.
Much of the work that is reported here was originally carried out in col-
laboration with my friends Otmar Scherzer of the University of Innsbruck,
Austria and Martin Hanke-Bourgeois of the University of Mainz, Germany.
With both Otmar and Martin I had the happy experience of open and friendly
collaborations in which my benefits exceeded my contributions.
While writing these notes I learned of the tragic death of my earliest
colleague and coauthor, Joaquin Bustoz, Jr., of Arizona State University.
Besides his many research papers on summability theory and special functions,
Joaquin’s important and lasting contributions to the mathematical education
of disadvantaged youth made his passing a great loss to the profession, to say
nothing of the personal sense of loss felt by his friends and colleagues. This
monograph is fondly dedicated to Joaquin’s memory.
I am grateful to the Charles Phelps Taft Research Center for providing
support, in the form of a Faculty Fellowship, during the preparation of this
work.

Seabrook Island, S.C. Charles Groetsch


May, 2006
Contents

1 Some Problems Leading to Unbounded Operators
  1.1 Introduction
  1.2 Some Unbounded Operators
    1.2.1 Differentiation and Multiplication
    1.2.2 Dirichlet-to-Neumann Map
    1.2.3 Potential Theory
    1.2.4 Moment Theory
  1.3 Unbounded Operators and the Heat Equation
    1.3.1 Time Reversal
    1.3.2 Source Identification
    1.3.3 Diffusivity from Temperature History
    1.3.4 The Steady State
    1.3.5 Surface Phenomena
  1.4 Overview
  1.5 Notes

2 Hilbert Space Background
  2.1 Hilbert Space
  2.2 Weak Convergence
  2.3 Bounded Linear Operators
  2.4 Unbounded Operators
  2.5 Pseudo-inversion
  2.6 Optimization
  2.7 Notes

3 A General Approach to Stabilization
  3.1 A General Method
  3.2 Some Cases
    3.2.1 The Tikhonov-Morozov Method
    3.2.2 The Iterated Tikhonov-Morozov Method
    3.2.3 An Interpolation-Based Method
    3.2.4 A Method Suggested by Dynamical Systems
  3.3 Notes

4 The Tikhonov-Morozov Method
  4.1 The Tikhonov-Morozov Method
  4.2 The Discrepancy Criterion
  4.3 Iterated Tikhonov-Morozov Method
  4.4 The Nonstationary Method
  4.5 Notes

5 Finite-Dimensional Approximations
  5.1 Stabilization by Projection
  5.2 Finite Elements
    5.2.1 A Finite Element Discrepancy Criterion
  5.3 Notes

References

Index
1 Some Problems Leading to Unbounded Operators

His majesty then turned to me, and requested me to explain the reason
why such great effects should proceed from so small a cause ...
The Adventures of Hajji Baba of Isfahan
J. Morier

1.1 Introduction

Unbounded operators can transform arbitrarily small vectors into arbitrarily large vectors – a phenomenon known as instability. Stabilization methods
strive to approximate a value of an unbounded operator by applying a family
of bounded operators to rough approximate data that do not necessarily lie
within the domain of the unbounded operator. This monograph is a math-
ematical study of stabilization techniques for the evaluation of unbounded
operators acting between Hilbert spaces. The goal of the study is to contribute
to the theoretical basis of a wide-ranging methodology for inverse problems,
data smoothing and image analysis. My intent is entirely theoretical. I do not
propose to delve into the details of these applications, but rather my aim is
to use the application areas as motivators for the study of specific topics in
the theory of operator approximations.
Stabilization problems inevitably arise in the solution of inverse problems
that are phrased in infinite-dimensional function spaces. The direct versions
of these problems typically involve a highly smoothing operator and conse-
quently the inversion process is usually highly ill-posed. This ill-posed problem
can be viewed on a theoretical level (and often on a quite practical level) as
evaluating an unbounded operator on some data space (the range of the direct
operator). The main difficulty arises when inaccuracies (arising in the physi-
cal setting from measurement errors) carry the observable data outside of the
theoretical data space. The unbounded nature of the solution operator can
then give rise to spectacular instabilities in the approximate solution. General
stabilization techniques, the subject of this monograph, involve insertion of

the given data into the data space of the solution operator and subsequent
approximate evaluation of the solution operator. Of course these approxima-
tions must be carried out very carefully in order to preserve stability. A key
point in this process is the choice of parameters defining the approximate
evaluation in such a way as to control the instabilities.
It is hoped that this monograph, which explores these ideas in rigorous operator-theoretic terms, will be interesting to mathematicians
and will also be potentially useful to those in the scientific community who
confront unstable ill-posed problems in their work. Our intent is also to pro-
vide a source for collateral study for students in courses in applied functional
analysis and related courses. With this in mind the next chapter provides
brief details of some of the operator theory that will be used in the sequel.
Before proceeding to this background material we give a semi-formal intro-
duction to some problems from application areas that lead to the evaluation
of unbounded operators.
We remind the reader that a linear operator T : X → Y from a normed
linear space X into a normed linear space Y is called bounded if the number

‖T‖ = sup{ ‖Tx‖/‖x‖ : x ≠ 0 }

is finite (unless further specification is necessary, ‖·‖ will always denote the


norm in an appropriate space). For linear operators the concepts of continuity
at a point, uniform continuity and boundedness coincide. If T is an unbounded
linear operator, then there is a sequence of unit vectors {zn } in the domain
of T satisfying ‖Tzn‖ → ∞ as n → ∞. For this reason we frequently say that
for an unbounded linear operator T the mapping z → T z is unstable. That
is, for linear operators we treat the terms stable, bounded and continuous as
synonymous.

1.2 Some Unbounded Operators


In elementary calculus the differentiation process is regarded as basic and
benign. But differentiation is Janus-faced and its darker side is all too fa-
miliar to scientists who must differentiate functions that are given empir-
ically. This sinister face of differentiation is instability. In operator terms
this instability means that the linear operator defined by differentiation is
unbounded on function spaces with conventional norms. For example, the functions φn(t) = √2 sin nπt are unit vectors in the space L²[0, 1], yet the L²-norm of the derivatives, ‖φ′n‖ = πn, is unbounded. This means, for example, that the differentiation operator defined on the subspace of differentiable functions in L²[0, 1], when considered as an operator taking values in L²[0, 1], is unbounded.
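The growth ‖φ′n‖ = πn is easy to check numerically. The following Python fragment is an illustrative aside of ours, not part of the original text; the grid resolution is an arbitrary choice.

```python
# Sketch: the unit vectors phi_n(t) = sqrt(2) sin(n pi t) in L2[0,1] have
# derivatives of L2-norm n*pi, so differentiation is unbounded.
import numpy as np

t = np.linspace(0.0, 1.0, 100001)

def l2_norm(f):
    # crude Riemann-sum approximation of the L2[0,1] norm
    return np.sqrt(np.mean(f**2))

for n in (1, 10, 100):
    phi = np.sqrt(2.0) * np.sin(n * np.pi * t)
    dphi = np.sqrt(2.0) * n * np.pi * np.cos(n * np.pi * t)
    print(n, l2_norm(phi), l2_norm(dphi))   # ~1 and ~n*pi
```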

1.2.1 Differentiation and Multiplication

Differentiation of functions defined on unbounded domains is closely related


to the action of multiplication operators. For example, in the space L2 (R) dif-
ferentiation may be viewed in the frequency domain via the Fourier transform
as multiplication by −iω where ω is the frequency variable. The operation of
multiplication by the independent variable also plays a key role in quantum
mechanics where it is used to model the mean position of a particle in a given
state. The basic multiplication operator M on L2 (R) defined on the domain
D(M) = { f ∈ L²(R) : ∫_{−∞}^{∞} (x f(x))² dx < ∞ }

is a densely defined linear operator which is unbounded. For example, the functions fn = n^{−1/2} χ_{[−n/2, n/2]} ∈ D(M), where χS indicates the characteristic function of a set S, satisfy ‖fn‖ = 1, but ‖M fn‖ → ∞ as n → ∞.
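A quick numerical check (again an illustration of ours, with an arbitrary discretization) confirms that ‖fn‖ = 1 while ‖M fn‖² = n²/12 grows without bound.

```python
# Sketch: f_n = n^{-1/2} chi_[-n/2, n/2] has unit norm, but multiplication
# by x inflates it: ||M f_n||^2 = n^2/12.
import numpy as np

for n in (1, 10, 100):
    x = np.linspace(-n/2, n/2, 100001)
    dx = x[1] - x[0]
    fn = np.full_like(x, 1.0 / np.sqrt(n))
    print(n, np.sum(fn**2) * dx, np.sum((x * fn)**2) * dx)  # ~1 and ~n^2/12
```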
Differentiation is such a ubiquitous operation that it may seem pointless
to give a specific example. However, since the unbounded operators that mo-
tivate this monograph often arise in inverse problems for partial differential
equations, we provide a simple example of such an inverse problem occur-
ring in a model problem for the wave equation. Consider a tightly stretched
string of nonuniform density extending between the boundaries x = 0 and
x = 1 and executing motion in the vertical plane under the influence of tensile
forces alone. The vertical displacement of the point on the string at position
x ∈ [0, 1] and time t > 0 is given by a function u(x, t). The evolution of u(x, t)
is governed, under certain simplifying assumptions, by a partial differential
equation of the form

ρ(x)utt = uxx , 0 < x < 1, t>0

where ρ is a measure of the nonuniform mass density distribution of the string.


We suppose that the ends of the string are fixed on the horizontal axis, that
is, that the boundary conditions

u(0, t) = 0, u(1, t) = 0

hold for all t ≥ 0. The classical separation of variables technique suggests a


solution in the form of a superposition of standing waves of the form r(t)y(x).
The spatial factor in the standing wave then satisfies the two point boundary
value problem
y″ + ω²ρy = 0, y(0) = 0, y(1) = 0
where the separation constant ω² is limited to certain discrete values. These
“eigenvalues” are associated via the time factor rω (t) of the standing wave
with characteristic frequencies of vibration of the string. Such a frequency for
an admissible value ω² might be determined experimentally by the use of a
properly tuned stroboscopic light and the corresponding spatial factor y(x)

of the standing wave could then be observed directly. The density coefficient
ρ(x) satisfies
ρ(x) = −y″(x)/(ω² y(x)).
That is, the required coefficient ρ(x) is obtained by evaluating the (nonlinear)
operator
L(y) = −y″/(ω² y)
defined, for example, on the set of functions y on [0, 1] having absolutely
continuous first derivative and for which L(y) lies in an appropriate function
space, say L2 [0, 1]. The unbounded nature of the second derivative operator
can then mean that exceedingly small experimental errors in the measurement
of y can lead to unacceptably large errors in the computation of y″ and hence
the evaluation of L(y) is unstable.
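The instability can be seen in a few lines of Python. The sketch below is ours, under assumed data: a uniform string with ρ = 1 and ω = π, for which y(x) = sin πx is an exact standing-wave factor; a tiny amount of synthetic noise is added to y before differencing.

```python
# Sketch: recovering rho = -y''/(omega^2 y) by second differences amplifies
# measurement noise in y by roughly 1/h^2.
import numpy as np

rng = np.random.default_rng(0)
h = 1e-3
x = np.arange(h, 1.0, h)              # interior points; y > 0 there
y = np.sin(np.pi * x)                 # assumed exact observation (rho = 1)
y_noisy = y + 1e-6 * rng.standard_normal(y.size)

def second_diff(v, h):
    return (v[2:] - 2.0 * v[1:-1] + v[:-2]) / h**2

rho_clean = -second_diff(y, h) / (np.pi**2 * y[1:-1])
rho_noisy = -second_diff(y_noisy, h) / (np.pi**2 * y[1:-1])
print(np.max(np.abs(rho_clean - 1.0)))  # small discretization error
print(np.max(np.abs(rho_noisy - 1.0)))  # noise amplified enormously
```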

1.2.2 Dirichlet-to-Neumann Map

Another concrete unbounded operator is provided by the Dirichlet-to-Neumann map in potential theory (see [6] for a survey and applications to
tomography). We consider the simplest illustration. Let Ω be the unit disk in
the plane and denote by ∂Ω the unit circle which is its boundary. For a given
square integrable function f on ∂Ω let u be the function which is harmonic
in Ω and agrees with f on ∂Ω. That is, u is the solution of the Dirichlet
problem
∆u = 0 in Ω
u = f on ∂Ω
where ∆ is the Laplacian operator. The Dirichlet-to-Neumann map is the
mapping
f ↦ ∂u/∂ν
where ν is the outer unit normal to ∂Ω, i.e., the directional derivative of u in
the outer radial direction. Denote the Fourier coefficients of f by f̂(n), that is,

f̂(n) = (1/2π) ∫₀^{2π} f(t) e^{−int} dt.
Using separation of variables techniques in polar coordinates one is led to the formal solution

u(r, θ) = ∑_{n∈Z} r^{|n|} f̂(n) exp(inθ)

and hence

∂u/∂ν = ∑_{n∈Z} |n| f̂(n) exp(inθ).

The Dirichlet-to-Neumann map is therefore the linear operator L defined on the subspace

D(L) = { f ∈ L²(∂Ω) : ∑_{n∈Z} |n|² |f̂(n)|² < ∞ }

and taking values in L²(∂Ω), defined by

(Lf)(e^{iθ}) = ∑_{n∈Z} |n| f̂(n) exp(inθ).

The functions {fm} ⊆ D(L) defined by fm(e^{iθ}) = cos mθ are uniformly bounded in L²(∂Ω), but ‖Lfm‖ → ∞ as m → ∞ and hence the operator L is unbounded.
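Since L acts as the Fourier multiplier |n|, it is easy to exercise numerically. The sketch below is an illustrative aside of ours; the grid and frequency ordering follow NumPy's FFT conventions.

```python
# Sketch: the Dirichlet-to-Neumann map on the unit disk multiplies the n-th
# Fourier coefficient by |n|; for f_m = cos(m theta) it returns m f_m.
import numpy as np

N = 512
theta = 2.0 * np.pi * np.arange(N) / N
n = np.fft.fftfreq(N, d=1.0 / N)       # integer frequencies in FFT order

def dtn(f):
    return np.real(np.fft.ifft(np.abs(n) * np.fft.fft(f)))

for m in (1, 8, 64):
    fm = np.cos(m * theta)
    print(m, np.max(np.abs(dtn(fm) - m * fm)))  # ~0: L f_m = m f_m
```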
Another instance of the mapping of Dirichlet data to Neumann information
is provided by the Laplace equation in a strip:

∆u = 0, x > 0, 0 < y < π,

u(x, π) = u(x, 0) = 0, x≥0

u(0, y) = f (y), 0≤y≤π

u(x, y) → 0, as x → ∞.

Formal separation of variables techniques lead easily to the representation

g(y) := ux(0, y) = (Lf)(y)

where the operator L is defined by

(Lf)(y) = − ∑_{n=1}^∞ n a_n sin ny.

Here

a_n = (2/π) ∫₀^π f(η) sin nη dη

and the unbounded operator L is defined on the domain

D(L) = { f ∈ L²[0, π] : ∑_{n=1}^∞ n² |a_n|² < ∞ }.

1.2.3 Potential Theory

Instability may arise in more subtle ways in the analysis of partial differential
equations. A well-known example occurs in potential theory. Consider the
model Cauchy problem for Laplace’s equation on an infinite strip. Here the
unstable mapping takes one boundary distribution into another. Specifically,
suppose u(x, y) is a harmonic function in the strip

Ω = {(x, y) : 0 < x < 1, −∞ < y < ∞}

satisfying a homogeneous Neumann condition on the boundary x = 0 and one


wishes to determine the boundary values g(y) = u(1, y) given the boundary
values f (y) = u(0, y). That is, u(x, y) satisfies

∆u = 0 in Ω,

u(0, y) = f (y),

ux (0, y) = 0,

u(1, y) = g(y)

for −∞ < y < ∞.


Applying the Fourier transform û = F{u} with respect to the y-variable
û(x, ω) = (1/√(2π)) ∫_{−∞}^{∞} u(x, y) e^{−iωy} dy
results in the initial value problem involving the frequency parameter ω:

d²û/dx² = ω² û,  û(0) = f̂,  (dû/dx)(0) = 0,

giving

ĝ(ω) = û(1, ω) = f̂(ω) cosh(ω).
Therefore the linear operator connecting f to g is given by

g = Lf = F⁻¹{f̂(ω) cosh(ω)}.

This linear operator is defined only on functions f ∈ L2 (R) for which

f̂(ω) cosh(ω) ∈ L²(R),

a condition that says roughly that high frequency components of f must decay
very rapidly. In particular, L is defined on band-limited functions and hence
is densely defined in L2 (R). However, L is unbounded. Indeed, if
f̂n = χ_{[n−1/2, n+1/2]}

is the characteristic function of the interval [n − 1/2, n + 1/2], then fn ∈ L²(R) and ‖fn‖ = 1, by the Parseval-Plancherel relation. However,

‖Lfn‖² = ∫_{n−1/2}^{n+1/2} (cosh ω)² dω ≥ (1/4) ∫_{n−1/2}^{n+1/2} e^{2ω} dω → ∞ as n → ∞,

showing that L is unbounded.
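By Parseval, ‖Lfn‖ can be computed directly in the frequency domain; the following rough numerical illustration is ours.

```python
# Sketch: with f^_n = chi_[n-1/2, n+1/2], Parseval gives
# ||L f_n||^2 = integral of cosh(w)^2 over [n-1/2, n+1/2].
import numpy as np

for n in (1, 5, 10, 20):
    w = np.linspace(n - 0.5, n + 0.5, 20001)
    dw = w[1] - w[0]
    print(n, np.sqrt(np.sum(np.cosh(w)**2) * dw))   # grows like e^n
```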


Another common example of an unbounded linear operator occurring in
potential theory is provided by the problem of extending a harmonic function
from an interior curve to the boundary of a region. Suppose u is a harmonic
function in the unit disc and 0 < h < 1. Standard separation of variables
techniques lead to a representation of u in polar coordinates in the form

u(r, θ) = ∑_{n∈Z} c_n r^{|n|} e^{inθ},

where {cn } are certain complex constants. Suppose boundary values g(θ) =
u(1, θ) on the unit circle are to be given in terms of values f (ϕ) = u(h, ϕ) on
the interior circle r = h, where g has the complex Fourier expansion

g(θ) = ∑_{n∈Z} c_n e^{inθ}

and that f is given by

f(ϕ) = ∑_{n∈Z} a_n e^{inϕ}.

Then

f(ϕ) = ∑_{n∈Z} c_n h^{|n|} e^{inϕ}

and hence c_n = h^{−|n|} a_n, that is, g = Lf where

(Lf)(θ) = ∑_{n∈Z} h^{−|n|} a_n e^{inθ}

and

D(L) = { f ∈ L²[0, 2π] : ∑_{n∈Z} h^{−2|n|} |a_n|² < ∞ },

where

a_n = (1/2π) ∫₀^{2π} f(ϕ) e^{−inϕ} dϕ.

The operator L is densely defined in L²[0, 2π] and unbounded since, if fm(ϕ) = e^{imϕ}, then ‖fm‖ is bounded, while ‖Lfm‖ → ∞ as m → ∞. Note that L may be expressed in real form as

g(θ) = (Lf)(θ) = (1/2π) ∫₀^{2π} f(ϕ) dϕ + (1/π) ∑_{n=1}^∞ h^{−n} ∫₀^{2π} f(ϕ) cos n(θ − ϕ) dϕ.
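Even for h fairly close to 1 the amplification factors h^{−|n|} applied to the Fourier modes of f grow rapidly, as this tiny illustration of ours shows.

```python
# Sketch: harmonic continuation from r = h to r = 1 multiplies the m-th
# Fourier mode by h^{-m}; with h = 0.9 the factors explode quickly.
h = 0.9
for m in (10, 50, 100):
    print(m, h**(-m))    # roughly 2.9, 194, 37600
```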

1.2.4 Moment Theory

Next we consider an unbounded operator occurring in moment theory. A key


mathematical idea to emerge from Fourier’s pioneering studies in heat con-
duction was the notion of representing a function by a sequence of numbers
– its Fourier coefficients. The inverse problem of associating a function with
a sequence of numbers, that is, the problem of characterizing those functions
which are representable by Fourier series of various types has a huge liter-
ature. In abstract terms, Fourier theory may be viewed as the study of the
relationship between an element x in a separable real Hilbert space H and the sequence of “moments” {⟨x, qn⟩}_{n=0}^∞, where {qn} is a given complete orthonormal basis for H. The mapping A : H → l², where l² is the space of square summable real sequences, defined by

Ax = {⟨x, qn⟩}

and its inverse L : l² → H defined by

L({cn}) = ∑_{n=0}^∞ cn qn

are, by Parseval’s identity, linear isomorphisms.


The crucial ingredient in Fourier’s recipe is orthogonality. If the orthogo-
nality of the basis is relaxed, one can ask the same question about the rela-
tionship between a function and its moments with respect to a given basis. If
{pk}_{k=0}^∞ is a given linearly independent set of vectors in a separable Hilbert space H, then the moments of a vector x ∈ H with respect to {pk} are the numbers

µk = ⟨x, pk⟩, k = 0, 1, 2, . . . .

For example, if H is the real space L2 [0, 1] and pk (s) = sk , these are called
the Hausdorff moments. These moments have a mechanical interpretation if
we think of x as representing the mass density distribution of a bar, for then
∫₀¹ x(s) ds,  ∫₀¹ s x(s) ds,  ∫₀¹ s² x(s) ds

represent the total mass of the bar and its moment and moment of inertia,
respectively, about its left end. The mapping from the moments to the dis-
tribution is unbounded. In fact, let L : D(L) ⊂ l2 → L2 [0, 1] be the linear
operator defined on
D(L) = { {µk} ∈ l² : ∑_{k=0}^∞ µk s^k ∈ L²[0, 1] }.

For a given p > 0, let

µ(p)_k = ∫₀¹ s^{k−(1+p ln s)/2} ds = ∫_{−∞}^0 e^{(k+1/2)u − pu²/2} du

for k > p/2 and note that µ(p)_k ≤ 1/(k − p/2), and hence {µ(p)_k} ∈ l² and ‖µ(p)‖ is bounded independently of p. Now let x_p(s) = s^{−(1+p ln s)/2} and note that

‖x_p‖² = ∫₀¹ s^{−(1+p ln s)} ds = ∫_{−∞}^0 e^{−t(1+pt)} e^t dt = (1/2)√(π/p)

and hence x_p ∈ L²[0, 1]. Also note that L({µ(p)_k}) = x_p, and, as previously observed, ‖µ(p)‖ is bounded, but

‖x_p‖² = (1/2)√(π/p) → ∞ as p → 0,

showing that L is unbounded.
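A finite-dimensional shadow of this ill-posedness may be worth noting (our aside, not from the original text): expanding x in monomials and matching the first k Hausdorff moments leads to the Gram matrix with entries ∫₀¹ s^i s^j ds = 1/(i + j + 1), the Hilbert matrix, whose condition number grows explosively with k.

```python
# Sketch: condition number of the k x k Hilbert matrix H_ij = 1/(i+j+1),
# the Gram matrix of the monomials 1, s, s^2, ... on [0, 1].
import numpy as np

for k in (4, 8, 12):
    H = np.array([[1.0 / (i + j + 1) for j in range(k)] for i in range(k)])
    print(k, np.linalg.cond(H))   # grows roughly like e^{3.5 k}
```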

1.3 Unbounded Operators and the Heat Equation

In direct problems for partial differential equations the goal is to find a solu-
tion, or features of a solution, given the differential operator, including coeffi-
cients, and certain ancillary information in the form of initial values, boundary
values or source terms. The differential operator may be viewed as a mapping
from certain combinations of coefficients, initial values, boundary values and
source terms to solutions. Inverse problems for partial differential equations
typically require the recovery of distributed coefficients, initial values, bound-
ary values or source terms given sufficient information about the solution.
Since the solution enters the problem through some process of differentiation
it is reasonable to expect that the resolution of inverse problems for par-
tial differential equations will involve the evaluation of unbounded operators.
In this section we provide several examples of inverse problems for the one-
dimensional heat equation that give rise to unbounded solution operators.
The examples treat the reconstruction of initial distributions, source terms,
coefficients and boundary values. In each example we proceed formally and
identify the unbounded solution operator for the inverse problem.

1.3.1 Time Reversal


Suppose a uniform bar, identified with the interval [0, π], is heated to an initial
temperature distribution g(x) for x ∈ [0, π] while the endpoints of the bar are
kept at temperature zero. For suitable choices of constants the evolution of
the space-time temperature distribution of the bar, u(x, t), is governed by the
one-dimensional heat equation
∂u/∂t = ∂²u/∂x², 0 < x < π, t > 0
and satisfies the boundary conditions u(0, t) = u(π, t) = 0. Suppose we observe
the temperature distribution f (x) of the bar at some later time, say t = 1,
that is, the function f(x) = u(x, 1) is observed, and we wish to reconstruct
the initial distribution g(x). Separation of variables leads to a solution of the
form

u(x, t) = ∑_{n=1}^∞ c_n e^{−n²t} sin nx

and therefore

f(x) = ∑_{n=1}^∞ c_n e^{−n²} sin nx

where

c_m = (2/π) e^{m²} ∫₀^π f(y) sin my dy.

Therefore,

u(x, t) = (2/π) ∑_{n=1}^∞ e^{n²} e^{−n²t} ∫₀^π f(y) sin ny sin nx dy

and hence

g(x) = u(x, 0) = (2/π) ∑_{n=1}^∞ e^{n²} ∫₀^π f(y) sin ny sin nx dy.

That is, g = Lf where

Lf(x) = (2/π) ∑_{n=1}^∞ e^{n²} sin nx ∫₀^π f(y) sin ny dy.

In other words, the solution g of the inverse problem is obtained from the data f via the unbounded operator L defined on functions f in the set

D(L) = { f ∈ L²[0, π] : ∑_{m=1}^∞ e^{2m²} a_m² < ∞, a_m = ∫₀^π f(y) sin my dy }.

Here the instability is apparent: small (in L² norm) perturbations of the data f can, thanks to the factors e^{n²}, translate into very large changes in the solution g.
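To get a feel for the size of these factors, here is a trivial numerical aside of ours.

```python
# Sketch: the n-th sine coefficient of the data f is multiplied by e^{n^2};
# even machine-precision noise overwhelms the reconstruction of g.
import numpy as np

for n in (1, 3, 5, 7):
    print(n, np.exp(n**2))
# A 1e-16 perturbation of the 7th coefficient already contributes
# about e^49 * 1e-16 ~ 2e5 to the recovered initial distribution.
```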

1.3.2 Source Identification

We now treat a problem of determining a spatially distributed source term


from the temperature distribution at a specific time. Suppose that in the
model
∂u/∂t = ∂²u/∂x² + g(x), 0 < x < π, 0 ≤ t   (1.1)
where u(x, t) is subject to the boundary conditions

u(0, t) = u(π, t) = 0, 0 < t   (1.2)

and initial condition


u(x, 0) = 0, 0 ≤ x ≤ π   (1.3)
one wishes to reconstruct the source distribution g(x) from the spatial tem-
perature distribution at some later time, say, f(x) = u(x, 1). A particular
purely spatial solution of (1.1)-(1.2) is:
w(x) = (x/π) ∫₀^π ∫₀^s g(τ) dτ ds − ∫₀^x ∫₀^s g(τ) dτ ds,

and hence the general solution of (1.1)-(1.2) has the form




u(x, t) = ∑_{n=1}^∞ b_n exp(−n²t) sin nx + w(x).

In order to satisfy (1.3) we need

b_n = −(2/π) ∫₀^π w(s) sin ns ds, n = 1, 2, . . .   (1.4)

which, after a couple of integrations by parts, noting that w″ = −g, is equivalent to

−n² b_n = (2/π) ∫₀^π g(s) sin ns ds.
Therefore,


g(x) = ∑_{n=1}^∞ (−n² b_n) sin nx.   (1.5)

However,

f(x) = u(x, 1) = ∑_{n=1}^∞ b_n e^{−n²} sin nx + w(x)
= ∑_{n=1}^∞ ( b_n e^{−n²} + (2/π) ∫₀^π w(s) sin ns ds ) sin nx
= ∑_{n=1}^∞ (b_n e^{−n²} − b_n) sin nx, by (1.4),

and hence

b_n (e^{−n²} − 1) = (2/π) ∫₀^π f(s) sin ns ds.

By (1.5) one is then led to the explicit representation

g(x) = Lf(x) = (2/π) ∑_{n=1}^∞ (n²/(1 − e^{−n²})) sin nx ∫₀^π f(s) sin ns ds.

That is, g = Lf, where L is the linear operator on L²[0, π] with domain

D(L) = { f : ∑_{m=1}^∞ m⁴ a_m² < ∞, a_m = (2/π) ∫₀^π f(s) sin ms ds }.

We note that D(L) is dense in L²[0, π] since all finite linear combinations of the eigenfunctions fn(s) = sin ns are contained in D(L). Moreover, L is unbounded since ‖fn‖² = π/2 while

‖Lfn‖² = πn⁴/(2(1 − e^{−n²})²) → ∞

as n → ∞.

1.3.3 Diffusivity from Temperature History


Now we illustrate a model in which a time dependent diffusivity coefficient in
a heat equation is to be determined by measuring the temperature history at
a single internal point, say at x = .5. It is a routine matter to verify that the
problem (see Cannon (1984), p. 191)
∂u/∂t = g(t) ∂²u/∂x², 0 < x < 1, 0 < t
u(0, t) = u(1, t) = 0, u(x, 0) = sin πx
has formal solution
u(x, t) = exp( −π² ∫₀ᵗ g(s) ds ) sin πx

and therefore the coefficient g(t) is related to the profile h(t) = u(.5, t) by

g(t) = −h′(t)/(π² h(t)).

That is, the coefficient g(t) is related to the processed profile

f(t) = (− ln |h(t)|)/π²

by the unbounded operator L = d/dt, namely, g = Lf.
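A short sketch (our illustration; the assumed exact data g(t) = 1 gives h(t) = e^{−π²t}) exhibits the noise amplification inherent in the differentiation.

```python
# Sketch: g = -h'/(pi^2 h) computed by differencing noisy values of h;
# relative noise of size 1e-6 in h is magnified by roughly 1/(pi^2 dt).
import numpy as np

rng = np.random.default_rng(1)
dt = 1e-3
t = np.arange(0.0, 0.5, dt)
h = np.exp(-np.pi**2 * t)                       # exact profile for g = 1
h_noisy = h * (1.0 + 1e-6 * rng.standard_normal(t.size))

g = -np.gradient(h_noisy, dt) / (np.pi**2 * h_noisy)
print(np.max(np.abs(g - 1.0)))                  # ~1e-4, not ~1e-6
```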

1.3.4 The Steady State

An unbounded operator also plays a role in a simple model problem for the
two-dimensional steady state heat equation. Suppose that the temperature
distribution g(x) = u(x, 1) on the top edge of a rectangle [0, π] × [0, 1] is
desired, where the temperature u(x, y) on the rectangle satisfies

∆u = 0 in (0, π) × (0, 1)

u(0, y) = u(π, y) = 0 for y ∈ [0, 1]

u(x, 0) = 0 for x ∈ [0, π]

and boundary data on [0, π] are over-specified by the boundary flux condition

(∂u/∂y)(x, 0) = f(x), for x ∈ [0, π].
Elementary separation of variables techniques then lead to the representation
g = Lf , where the operator
(Lf)(x) = (2/π) ∑_{n=1}^∞ (sinh n / n) ∫₀^π f(s) sin ns ds · sin nx

is defined on the space

D(L) = { f ∈ L²[0, π] : ∑_{n=1}^∞ a_n² e^{2n}/n² < ∞, a_n = ∫₀^π f(s) sin ns ds }.

Note that if fn(x) = sin nx, then ‖fn‖² = π/2, while ‖Lfn‖² → ∞, and hence the operator L : D(L) ⊆ L²[0, π] → L²[0, π] is unbounded.

1.3.5 Surface Phenomena

Unbounded operators raise their heads in certain problems of identification


of nonlinear surface phenomena for the heat equation. In the simplest case a
homogeneous semi-infinite solid, modeled by the right half-plane x ≥ 0, which
is initially at temperature 0, is brought into contact with a gas represented by
the region x < 0. The ambient temperature of the gaseous region is a known
increasing function of time f (t) and, as the ambient warms, thermal energy
is transferred to the solid via interaction at the surface x = 0. This surface
effect is modeled in terms of the temperature gradient at the surface of the
body. If the temperature of the body at position x and time t is designated
by u(x, t), then (under suitable normalizations) the governing equations are

∂u/∂t = ∂²u/∂x², 0 < x, 0 < t

u(x, 0) = 0, 0 ≤ x

(∂u/∂x)(0, t) = g(u(0, t)) − f(t)
where g is some unknown function satisfying g(0) = 0. The goal is to estimate
the unknown function g which governs the thermal transfer at the surface,
given the ambient temperature history f (t) and the surface temperature his-
tory ϕ(t) = u(0, t), both of which can be measured by thermal probes. This is
an inverse problem of the classic type of determining physics from measure-
ments and leads, via a Laplace transform analysis, to the equation
g(ϕ(t)) = f(t) + ∫₀ᵗ ϕ′(τ) k(t − τ) dτ, t > 0

where the kernel k is given by k(t) = 1/√(πt). From this equation values of
the unknown function g may be determined in terms of measured values of
ϕ(t) := u(0, t) and the unbounded operator
(Lϕ)(t) = ∫₀ᵗ ϕ′(τ) k(t − τ) dτ.   (1.6)
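For smooth ϕ the operator (1.6) itself is easy to evaluate numerically; the instability enters through the derivative ϕ′. As a sanity check, the following sketch (ours; midpoint quadrature is an arbitrary choice that avoids the integrable endpoint singularity) evaluates Lϕ for ϕ(t) = t, where the exact value is 2√(t/π).

```python
# Sketch: (L phi)(t) = int_0^t phi'(tau) / sqrt(pi (t - tau)) d tau
# for phi(t) = t (so phi' = 1); exact value 2 sqrt(t/pi).
import numpy as np

def L_phi(t, m=20000):
    tau = (np.arange(m) + 0.5) * (t / m)          # midpoints of [0, t]
    return np.sum(1.0 / np.sqrt(np.pi * (t - tau))) * (t / m)

for t in (0.25, 1.0, 4.0):
    print(t, L_phi(t), 2.0 * np.sqrt(t / np.pi))  # agree to ~3 digits
```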

Similar problems arise for other simple geometries and boundary condi-
tions. We illustrate the identification of a nonlinear boundary condition in
a simple heat transfer problem for a hollow ball. Consider a spherical shell
b ≤ r ≤ a, which is initially at temperature zero, whose inner surface r = b
is constantly at temperature zero, and whose outer surface r = a is heated
under the influence of a spatially homogeneous ambient temperature f (t). In
suitable units the temperature u(r, t) satisfies
ut = ∆u = urr + (2/r) ur, b < r < a, t > 0
u(r, 0) = 0, b ≤ r ≤ a
u(b, t) = 0, t > 0.

We assume that the flux at the active surface r = a is given by

ur (a, t) = g(u(a, t)) − f (t)

where f (t) is a given source term and g is an unknown function to be


determined. To simplify the notation we let ϕ(t) := u(a, t) be the temper-
ature history at the outer surface. We assume that ϕ is differentiable and that
all functions of time are sufficiently regular to allow application of standard
Laplace transform techniques. If we denote Laplace transforms (with respect

to time) by corresponding upper case letters, e.g., U = U (r, s) = L{u(r, t)},


then the equations above transform to

r²U″ + 2rU′ − sr²U = 0,

U(b) = 0, U′(a) = G,

where G is the Laplace transform of g(u(a, t)) − f(t) and primes signify differentiation with respect to r. It is a routine, though tedious, matter to show that

U = a²G (e^{√s r}/r − e^{−√s(r−2b)}/r) / ( (a√s − 1)e^{a√s} + (a√s + 1)e^{−(a−2b)√s} )

and in the limit as r → a⁻, we obtain

Φ = G a(e^{a√s} − e^{−(a−2b)√s}) / ( (a√s − 1)e^{a√s} + (a√s + 1)e^{−(a−2b)√s} ),

where Φ is the Laplace transform of the surface temperature history ϕ(t) = u(a, t). Therefore

G(s) = {sΦ(s)} × ( (a√s − 1)e^{a√s} + (a√s + 1)e^{−(a−2b)√s} ) / ( as(e^{a√s} − e^{−(a−2b)√s}) )

and hence, by the convolution theorem,

g(ϕ(t)) − f(t) = ∫₀ᵗ ϕ′(τ) k(t − τ) dτ

where

k(t) = L⁻¹{ ( (a√s − 1)e^{a√s} + (a√s + 1)e^{−(a−2b)√s} ) / ( as(e^{a√s} − e^{−(a−2b)√s}) ) }
= −1/a + (1/√(πt)) ∑_{n=−∞}^∞ exp(−(a − b)²n²/t).

We see that identifying the unknown function g which controls the heat trans-
fer at the surface involves computing values of the function
(Lϕ)(t) = ∫₀ᵗ ϕ′(τ) k(t − τ) dτ.

Other unbounded operators with similar kernels occur in similar problems.


For example, in the case of a solid ball of radius a, the same type of operator
arises but with the kernel
k(t) = −1/a + (1/√(πt)) ∑_{n=−∞}^∞ exp(−a²n²/t),

while for a slab 0 ≤ x ≤ a with a frozen face at x = a (i.e., u(a, t) = 0) the kernel is

k(t) = (1/√(πt)) ∑_{n=−∞}^∞ exp(−a²n²/t).

For the slab with insulated face at x = a (i.e., ux(a, t) = 0) the kernel is

k(t) = (1/√(πt)) ∑_{n=−∞}^∞ (−1)ⁿ exp(−a²n²/t).

1.4 Overview
In this chapter we have acted a bit cavalierly in our use of mathematics, as
our chief concern was to provide a variety of examples of unbounded linear
operators arising in application areas, particularly in inverse problems. The
remainder of this monograph has a more rigorous tone. In the coming chap-
ters certain aspects of rigorous operator approximation theory are developed
and applied to the general problem of stabilizing the evaluation of a closed
unbounded linear operator defined on a dense subspace of a Hilbert space.
In the next chapter, some basic results of the theory of Hilbert spaces
and operator theory are briefly reviewed. Among the tools that shall prove
indispensable for the later development are certain weak compactness and
convergence notions, the spectral theorem for a self-adjoint bounded linear
operator, von Neumann’s theorem on the boundedness of a type of resolvent
of a closed unbounded operator, and his theorem on the convergence of the
alternating projection algorithm. These topics and related ideas will be put
to use in the general approximation theory developed in later chapters.
Chapter 3 presents an abstract general scheme for stabilized evaluation of
a closed densely defined linear operator. In the general scheme von Neumann’s
theorem is teamed up with the spectral theorem for bounded self-adjoint linear
operators to produce “smoothed” data that converge in operator graph norm
to the true data vector. Some general convergence results are proved beginning
with the case of error-free data. For the case of rough, approximately specified
data, convergence is proved under certain a priori assumptions relating the
stability parameter to the error level in the data and to certain general features
of the approximation scheme. Various special instances of the general scheme,
including methods in which the stability parameter assumes a continuous
range of values as well as methods in which an iteration number assumes the
role of the stability parameter, are briefly discussed.
The best-known stabilized evaluation method, the Tikhonov-Morozov
method, is taken up in Chapter 4. Some special features of this method,

including the discrepancy principle and saturation results, are discussed in


some detail. An iterative version of the method is also investigated and an
independent proof of the convergence of the stationary iterative version of the
method, based on von Neumann’s alternating projection algorithm, is pre-
sented. Finally, a nonstationary version of the iterative method is discussed.
To this point all approximations have been considered in the context of
infinite dimensional Hilbert space. In the final chapter, approximations that
reside in fixed finite-dimensional subspaces, and consequently are effectively
computable, are treated. Direct projection methods, including methods in
which the given data is projected onto a finite dimensional subspace of the
domain of the operator before evaluation, and a scheme related to the “dual
least squares” method, are discussed. A finite element version of the Tikhonov-
Morozov method is discussed in some detail.

1.5 Notes

In Section 1.2 the simplest case of the Dirichlet-to-Neumann map is considered. Of course many generalizations are possible. For example, the Dirichlet-to-Neumann map for general elliptic partial differential operators on domains in Rⁿ, considered as an unbounded operator in a Banach space, is treated in
[10]. For more details on the Dirichlet-to-Neumann map see [45] and another
example is found in [10]. An extensive development of the coefficient identi-
fication problem for a nonuniform string and related problems may be found
in [43]. For much more on moment problems related to inverse problems (in
which the example given above is attributed to Inglese) see [2]. Cannon [5]
is the best source for information on the one-dimensional heat equation and
[25] is an excellent general source on applied partial differential equations and
integral equations. The direct problem of heat transfer for a radiating semi-
infinite body was first treated by Mann and Wolf [38]. The inverse problem
of determining the surface thermal transfer function for certain nonlinearly
radiating solids with simple geometries can be found in [19]. A discussion of
the theory of stabilization methods for the evaluation of singular convolution
operators, based on the theory of Fourier transforms, can be found in [42].
2 Hilbert Space Background

Geometry (which is the only science that it hath pleased God hitherto to bestow on mankind).
Hobbes

This chapter outlines the Hilbert-Schmidt geometrization of analysis and some


of its far-reaching consequences. Collected here are the principal results on
Hilbert spaces and the theory of operators that will be put to use in the
sequel. We have tried to strike a balance between thoroughness and brevity.
Proofs, sometimes rather brief or informal, will be supplied for many results;
for others references will be given to easily accessible sources. Some readers will
find much of this material to be familiar fare; they may confidently proceed
to the next chapter. Others may find this précis useful, if for no other reason
than to establish some notation and basic terminology that will be used in
later chapters.

2.1 Hilbert Space


We shall for the most part restrict our consideration to real Hilbert spaces.
The inner product on a generic Hilbert space H will be symbolized by ⟨·, ·⟩ and the associated norm by ‖·‖, that is,

‖x‖ = ⟨x, x⟩^{1/2},

for all x ∈ H. With this norm a Hilbert space is a complete normed linear
space, that is, a sequence {xn} ⊂ H which is Cauchy in this norm (that is, ‖xn − xm‖ → 0 as n, m → ∞) converges to some vector in H. A typical example of a Hilbert space is the space L²(Ω) consisting of Lebesgue measurable square integrable real-valued functions on a domain Ω ⊆ Rⁿ with the inner product

⟨f, g⟩ = ∫_Ω f(x)g(x) dx


and the associated norm

‖f‖ = ( ∫_Ω |f(x)|² dx )^{1/2}.

We remind the reader that any inner product space may be completed to
form a Hilbert space by a process of endowing the space of certain equivalence
classes of Cauchy sequences with an appropriate inner product (see, e.g., [50]).
The inner product in a Hilbert space satisfies the Cauchy-Schwarz
inequality
|⟨x, y⟩| ≤ ‖x‖ ‖y‖
and hence the inner product is continuous with respect to the associated norm.
Since ‖x‖² = ⟨x, x⟩, one quickly finds that the norm associated with an inner product satisfies the Parallelogram Law:

‖x + y‖² + ‖x − y‖² = 2‖x‖² + 2‖y‖².

Two vectors x and y are termed orthogonal (denoted x ⊥ y) if ⟨x, y⟩ = 0, and a collection of vectors {xα}_{α∈A} is called orthogonal if xα₁ ⊥ xα₂ for α₁ ≠ α₂. If, in addition, ‖xα‖ = 1 for each α ∈ A the set is called orthonormal. If {xα}_{α∈A} is an orthonormal set then for each x ∈ H,

∑_{α∈A} |⟨x, xα⟩|² ≤ ‖x‖²,

where at most countably many terms in the sum are nonzero. This is known as Bessel’s inequality. An orthonormal set {xα}_{α∈A} is called complete if ⟨x, xα⟩ = 0 for every α ∈ A implies that x = 0. If {xα}_{α∈A} is a complete orthonormal set, then every x ∈ H has a convergent Fourier expansion

x = ∑_{α∈A} ⟨x, xα⟩ xα

and

‖x‖² = ∑_{α∈A} |⟨x, xα⟩|².

The latter equation is known as Parseval’s identity. A Hilbert space is called


separable if it contains a countable complete orthonormal set. Hilbert spaces
that are important in applications are usually separable.
The orthogonal complement of a subset S ⊆ H is the closed subspace

S⊥ = {y ∈ H : ⟨x, y⟩ = 0 for all x ∈ S}.

Every closed subspace S of a Hilbert space H effects a “Cartesian” decomposition, H = S ⊕ S⊥, meaning that each x ∈ H has a unique representation in the form x = x1 + x2 where x1 ∈ S and x2 ∈ S⊥. Indeed, if

d = inf{‖x − y‖ : y ∈ S} and {yn} ⊆ S satisfies ‖x − yn‖ → d, then by the parallelogram law

‖ym − yn‖² = ‖ym − x + x − yn‖²
= 2(‖ym − x‖² + ‖x − yn‖²) − 4‖(ym + yn)/2 − x‖²
≤ 2(‖ym − x‖² + ‖x − yn‖²) − 4d² → 0, as n, m → ∞.

Therefore, {yn} is a Cauchy sequence and hence yn → x1 for some x1 ∈ S. The vector x1 is called the projection of x onto S and is symbolized by PS x. Now define x2 = x − x1, and suppose that y ∈ S is a unit vector. Then, since x1 + ⟨x2, y⟩y ∈ S, we have

‖x2‖² = ‖x − x1‖² = d² ≤ ‖x − (x1 + ⟨x2, y⟩y)‖² = ‖x2‖² − |⟨x2, y⟩|²

and hence ⟨x2, y⟩ = 0, that is, x2 ∈ S⊥.
If the subspace S contains a complete orthonormal set {xn }, then PS may
be computed by use of the Fourier projection:

PS x =
x, xn xn
n

for each x ∈ H.
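In a finite-dimensional stand-in for H the Fourier projection is a one-liner; the following sketch (ours) verifies the defining orthogonality x − PS x ⊥ S.

```python
# Sketch: P_S x = sum_n <x, x_n> x_n for an orthonormal basis {x_n} of S.
import numpy as np

rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((5, 2)))  # orthonormal basis of S
x = rng.standard_normal(5)
Px = Q @ (Q.T @ x)                                # Fourier projection onto S
print(np.allclose(Q.T @ (x - Px), 0.0))           # x - P_S x is in S-perp
```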
If S is a subspace of H, then S⊥⊥ = S⁻, the (norm) closure of S. In fact, S⊥⊥ = (S⊥)⊥ is a closed subspace of H containing S and therefore, S⁻ ⊆ S⊥⊥. On the other hand, if x ∈ S⊥⊥, then x = x1 + x2 where x1 ∈ S⁻ ⊆ S⊥⊥ and x2 ∈ (S⁻)⊥ = S⊥. Hence, x2 = x − x1 ∈ S⊥⊥ and so x2 ∈ S⊥⊥ ∩ S⊥ = {0}. That is, x = x1 ∈ S⁻, and hence S⊥⊥ ⊆ S⁻.
Each y ∈ H gives rise to a real-valued linear functional l : H → R defined by l(x) = ⟨x, y⟩. The Cauchy-Schwarz inequality guarantees that this linear
x, y . The Cauchy-Schwarz inequality guarantees that this linear
functional is continuous (with respect to the norm). The Riesz Representation
Theorem asserts that every continuous linear functional on H has this form.
Indeed, if l : H → R is a nonzero continuous linear functional, then the
nullspace of l, defined by

N (l) = {x ∈ H : l(x) = 0}

is a closed subspace of H. Choose z ∈ N(l)⊥ with l(z) = 1; then x − l(x)z ∈ N(l) for every x ∈ H, and hence

⟨x − l(x)z, z⟩ = 0,

that is, l(x) = ⟨x, y⟩, where y = z/‖z‖². We call this y the Riesz representer of the functional l. It is easy to see that this representer is unique.
Many spaces of functions, or generalized functions, that are important
in applied mathematics form Hilbert spaces when endowed with appropriate
inner products. An example of an important Hilbert space in the theory of


partial differential equations is the Sobolev space H₀¹(Ω) which is the completion with respect to the norm generated by the inner product

⟨u, v⟩ = ∫_Ω ∇u · ∇v dx

of the space C02 (Ω) of twice continuously differentiable functions with com-
pact support contained in the interior of the bounded domain Ω ⊆ Rn with
piecewise smooth boundary ∂Ω. If u, v ∈ C02 (Ω), then u and v vanish on the
boundary ∂Ω and Green’s identity gives
 
∫_Ω (−∆u)v dx = ∫_Ω ∇u · ∇v dx = ⟨u, v⟩

where ∆ is the Laplacian operator. Therefore any classical solution u of the Poisson equation

−∆u = f in Ω,
u = 0 on ∂Ω,   (2.1)

where f ∈ L²(Ω), satisfies

⟨u, v⟩ = ∫_Ω f v dx

for all v ∈ C₀²(Ω). On the other hand, for any fixed f ∈ L²(Ω), the linear functional l : H₀¹(Ω) → R defined by

l(v) = ∫_Ω f(x)v(x) dx

is continuous on H01 (Ω) (a consequence of the Poincaré-Zaremba inequality


[35]). Therefore, by the Riesz representation theorem there is a unique u ∈
H₀¹(Ω) satisfying

⟨u, v⟩ = ∫_Ω f(x)v(x) dx

for all v ∈ H01 (Ω). This u ∈ H01 (Ω) is called the weak solution of the elliptic
boundary value problem (2.1). “Weak” notions play an important role in the
sequel.

2.2 Weak Convergence


A sequence {xn} in a Hilbert space H is said to converge weakly to x ∈ H, denoted xn ⇀ x, if ⟨xn, y⟩ → ⟨x, y⟩ for all y ∈ H. For example, if {xn} is an orthonormal sequence, then Bessel’s inequality immediately implies that xn ⇀ 0. Loosely speaking, one may say that continuous linear functionals are
ultimately unable to distinguish the terms of a weakly convergent sequence


from its weak limit. This has important practical implications as many mea-
surement processes may be modeled by continuous linear functionals.
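As a toy illustration (ours, a finite sampling of an infinite-dimensional fact), the orthonormal vectors xn(t) = √2 sin nπt satisfy ⟨xn, y⟩ → 0 for each fixed y, although ‖xn‖ = 1 for every n.

```python
# Sketch: weak convergence to 0 of an orthonormal sequence in L2[0,1].
import numpy as np

t = np.linspace(0.0, 1.0, 200001)
dt = t[1] - t[0]
y = t * (1.0 - t)                         # a fixed test vector in L2[0,1]
for n in (1, 11, 101, 1001):
    xn = np.sqrt(2.0) * np.sin(n * np.pi * t)
    print(n, np.sum(xn * y) * dt)         # <x_n, y> -> 0
```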
If {xn} converges strongly to x, that is, if ‖xn − x‖ → 0, then xn ⇀ x by the Cauchy-Schwarz inequality. On the other hand, if xn ⇀ x and ‖xn‖ → ‖x‖, then

‖x − xn‖² = ⟨x − xn, x − xn⟩ = ‖x‖² − 2⟨xn, x⟩ + ‖xn‖² → 0,

that is, weak convergence along with convergence of the norms implies strong convergence. The Cauchy-Schwarz inequality also implies that the norm is weakly lower semi-continuous, namely, if xn ⇀ x, then

‖x‖² = lim inf ⟨x, xn⟩ ≤ ‖x‖ lim inf ‖xn‖

and hence ‖x‖ ≤ lim inf ‖xn‖. The uniform boundedness principle implies
that every weakly convergent sequence is bounded (see [26], p. 183, for an
elementary proof that does not employ the uniform boundedness principle).
The next basic result is a partial converse and plays an important role in later
developments.
Theorem 2.1. Each bounded sequence in a Hilbert space has a weakly con-
vergent subsequence.

Proof. Suppose {xn } is a bounded sequence in a Hilbert space H. By restrict-


ing our attention to the closure of the span of {xn } we may assume that H is
separable, that is, there is a sequence {yn } that is dense in H. We shall show
that there is a subsequence {zn } of {xn } such that the mapping

v ↦ lim_n ⟨zn, v⟩

is a continuous linear functional on H. The subsequence {zn} is extracted via a diagonal process. Since {xn} is bounded, the numerical sequence {⟨xn, y1⟩} is bounded and hence there is a subsequence of {xn}, which we will denote by {xn^{(1)}}, such that {⟨xn^{(1)}, y1⟩} is convergent. From {xn^{(1)}} we may extract a subsequence {xn^{(2)}} such that {⟨xn^{(2)}, y2⟩} is convergent, and so on. Now let zn = xn^{(n)}. Then {⟨zn, yj⟩} is convergent for every j = 1, 2, 3, . . .. Given v ∈ H and ε > 0, choose a yN such that ‖v − yN‖ < ε. Since {zn} is bounded, there is a number M such that ‖zn − zm‖ < M for all n, m. Then

|⟨zn − zm, v⟩| = |⟨zn − zm, v − yN⟩ + ⟨zn − zm, yN⟩| ≤ Mε + |⟨zn − zm, yN⟩| < (M + 1)ε

for n, m sufficiently large and hence {⟨zn, v⟩} is a Cauchy sequence. Let

l(v) = lim_n ⟨zn, v⟩;

then one sees that l is a linear functional and l(v) ≤ M v and hence l is
continuous. By the Riesz representation theorem there is a w ∈ H such that
l(v) =
w, v , that is
lim
zn , v =
w, v
n

for each v ∈ H, that is, zn  w. 




2.3 Bounded Linear Operators


A function T : H1 → H2, where H1 and H2 are Hilbert spaces, is called a linear operator if T(αx + βy) = αTx + βTy for any scalars α and β and any vectors x, y ∈ H1. Such an operator is called bounded if the number

‖T‖ = sup{ ‖Tx‖/‖x‖ : x ≠ 0 }

is finite. The number ‖T‖ is called the norm of the bounded linear operator T and one sees that

‖Tx − Ty‖ ≤ ‖T‖ ‖x − y‖

for all x, y ∈ H1. Therefore, bounded linear operators are continuous (relative to the norms in each space) and it is easy to see that if a linear operator is continuous at any point, then it is bounded. A bounded linear operator T : H → R is called, as we have seen, a bounded linear functional and the Riesz representation theorem identifies the space of all bounded linear functionals on a Hilbert space H with the space H itself. We note that if S is a closed subspace of a Hilbert space H, then the projection operator PS defined previously is bounded and in fact ‖PS‖ = 1.
Suppose T : H1 → H2 is a bounded linear operator. Then for each y ∈ H2 the mapping l : H1 → R defined by l(x) = ⟨Tx, y⟩ is a bounded linear functional. Therefore, the Riesz representation theorem guarantees the existence of a unique vector in H1, denoted T*y, satisfying

⟨Tx, y⟩ = ⟨x, T*y⟩.

The operator T* : H2 → H1 so defined is called the adjoint of T; it is easy to see that T* is a bounded linear operator, in fact, ‖T*‖ = ‖T‖ and T** = T. If T = T*, then the operator T is called self-adjoint.
Each linear operator T gives rise to two important subspaces, the nullspace

N (T ) = {x ∈ H1 : T x = 0}
and the range
R(T ) = {T x : x ∈ H1 }.
There is a useful relationship between these subspaces for an operator and its
adjoint. We see that w ∈ R(T )⊥ if and only if

0 = ⟨Tx, w⟩ = ⟨x, T*w⟩

for all x ∈ H1, that is, R(T)⊥ = N(T*). Therefore, R(T)⁻ = R(T)⊥⊥ = N(T*)⊥. Replacing T by T* and noting that T** = T, we get two additional relationships. Taken together, we have these relationships between the four fundamental subspaces associated with a bounded linear operator T:

N(T*) = R(T)⊥,  N(T*)⊥ = R(T)⁻,
N(T) = R(T*)⊥,  N(T)⊥ = R(T*)⁻.

If N (T ) = {0} and R(T ) = H2 , then the operator T has a linear inverse


operator T −1 defined on the entire space H2 . If, in addition, T is bounded then
Banach’s theorem (in a more general version this is called the open mapping
theorem) asserts that T −1 is bounded.
Theorem 2.2 (Banach). If T : H1 → H2 is a bounded linear operator with N(T) = {0} and R(T) = H2, then T⁻¹ : H2 → H1 is a bounded linear operator.
This is a good place to remind the reader of another fundamental result
on bounded linear operators:

Theorem 2.3 (Uniform Boundedness Principle). If {Tα } is a collection


of bounded linear operators from a Hilbert space H1 into a Hilbert space H2
with the attribute that the collection of vectors {Tα x} is bounded for each
x ∈ H1 , then the collection of operators {Tα } has a uniform bound, that is,
there exists a constant C, which is independent of α, such that ‖Tα‖ ≤ C for
all α.

If T is a bounded linear operator satisfying ‖Tx‖ ≥ m‖x‖ for all x ∈ H1 and some m > 0, then R(T) is closed. Indeed, if Txn → y, then {Txn} is a Cauchy sequence and hence {xn} is a Cauchy sequence. Therefore, xn → x for some x ∈ H1 and by the continuity of T, we have y = Tx. Banach’s theorem then implies that T has a bounded inverse T⁻¹ : R(T) → H1 and in fact, ‖T⁻¹‖ ≤ m⁻¹.
A linear operator T : H1 → H2 is called an operator of finite rank if its
range is spanned by finitely many vectors in H2 . In other words, T has finite
rank if there are linearly independent vectors {v1 , . . . , vm } in H2 such that


Tx = ∑_{k=1}^m lk(x) vk

where the coefficients lk are bounded linear functionals on H1. Therefore, by the Riesz representation theorem,

Tx = ∑_{k=1}^m ⟨x, uk⟩ vk

for certain vectors {u1 , . . . , um } ⊂ H1 . Note that every finite rank operator is
continuous, but more is true. If {xn } is a weakly convergent sequence in H1 ,
say xn ⇀ w, then

Txn = ∑_{k=1}^m ⟨xn, uk⟩ vk → ∑_{k=1}^m ⟨w, uk⟩ vk = Tw,

and hence a finite rank operator T maps weakly convergent sequences into
strongly convergent sequences. Finite rank operators are a special case of an
important class of linear operators that enjoy this weak-to-strong continuity
property.

Definition. A linear operator T : H1 → H2 is called compact if xn ⇀ w implies Txn → Tw.

Note that every finite rank operator is compact and every compact oper-
ator is a fortiori continuous. Also, limits of finite rank operators are compact
(the proof is a standard “ε/3” argument). More precisely, if {Tn} is a sequence
of finite rank operators converging in operator norm to an operator T , then
T is compact.
Compact operators acting on a Hilbert space have a particularly sim-
ple structure expressed in terms of certain characteristic subspaces known as
eigenspaces. The eigenspace of a linear operator T : H → H associated with
a scalar λ is the subspace

N (T − λI) = {x ∈ H : T x = λx}.

In general, an eigenspace may be trivial, that is, it may consist only of the zero vector. If N(T − λI) ≠ {0}, we say that λ is an eigenvalue of T. If T is self-adjoint, then the eigenvalues of T are real numbers and vectors in distinct eigenspaces are orthogonal to each other. If T is self-adjoint, compact, and of infinite rank, then the nonzero eigenvalues of T form a sequence of real numbers {λn}. This sequence converges to zero, for taking an orthonormal sequence {xn} with xn ∈ N(T − λnI) we have λn xn = Txn → 0 since xn ⇀ 0 (a consequence of Bessel’s inequality). Since ‖xn‖ = 1, it follows that λn → 0.
The fact that T xn → 0 for a sequence of unit vectors {xn } is an abstract
version of the Riemann-Lebesgue Theorem.
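The decay of the eigenvalues is easy to observe numerically. In the sketch below (ours; a simple Nyström discretization with midpoint quadrature) the kernel k(s, t) = min(s, t) on [0, 1] is used, whose exact eigenvalues are 1/((n − 1/2)²π²).

```python
# Sketch: eigenvalues of a discretized compact self-adjoint integral
# operator tend to zero.
import numpy as np

N = 400
s = (np.arange(N) + 0.5) / N               # midpoint nodes on [0, 1]
K = np.minimum.outer(s, s) / N             # quadrature-weighted kernel
lam = np.sort(np.linalg.eigvalsh(K))[::-1]
print(lam[:5])                             # ~ 1/((n - 1/2)^2 pi^2) -> 0
```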
The prime exemplar of a compact self-adjoint operator is the integral
operator on the real space L2 [a, b] generated by a symmetric kernel k(·, ·) ∈
L2 ([a, b] × [a, b]):
(Tf)(s) = ∫_a^b k(s, t) f(t) dt.

This operator is bounded since for f ∈ L2 [a, b] and k(·, ·) ∈ L2 ([a, b] × [a, b])
the Cauchy-Schwarz inequality gives
‖Tf‖² = ∫_a^b | ∫_a^b k(s, t) f(t) dt |² ds ≤ ∫_a^b ( ∫_a^b |k(s, t)|² dt ∫_a^b |f(t)|² dt ) ds = ‖f‖² ∫_a^b ∫_a^b |k(s, t)|² dt ds.
It is also compact because it is the limit in operator norm of the finite rank operators

TN f = ∑_{n,m≤N} cn,m ∫_a^b f(t) φm(t) dt φn

where {φn}_{n=1}^∞ is a complete orthonormal sequence in L²[a, b] and

k(s, t) = ∑_{n,m=1}^∞ cn,m φn(s) φm(t)

is the Fourier expansion of k(·, ·) relative to the orthonormal basis {φn(s)φm(t)} for L²([a, b] × [a, b]).
Operators with weakly singular kernels form another class of compact
operators on the space L2 [a, b]. These operators are important in certain ap-
plication areas, particularly potential theory. As an illustration we consider
the one-dimensional case. Given a constant α with 0 < α < 1 and a function
k(·, ·) ∈ L2 ([a, b] × [a, b]), the integral operator T defined by
 b
k(s, t)
(T f )(s) = f (t)dt
a |s − t|α
is said to have a weakly singular kernel. Such an operator is bounded on
L2 [a, b] since if f ∈ L2 [a, b] and if M is an essential bound for |k(·, ·)|, then
by the Cauchy-Schwarz inequality
 b  b 2

 k(s, t) 
T f  =
2
 f (t)dt  ds
a  a |s − t|α 

   
b b
1 b
|f (t)|2
≤M 2
dt dt ds
a a |s − t|α a |s − t|α

 
M 2 (b − a)(1−α) b b
|f (t)|2
≤ dtds
1−α a a |s − t|α
 
M 2 (b − a)(1−α) b b
= |f (t)|2 |s − t|−α dsdt
1−α a a

M 2 (b − a)2(1−α)
≤ f 2 .
(1 − α)2
28 2 Hilbert Space Background

Now define operators Tn by


 b
k(s, t)
(Tn f )(s) = f (t)dt
a rn (s, t)α

where ⎧
⎨ |s − t|, |s − t| ≥ 1
n
rn (s, t) =

1/n, |s − t| < 1
n.

Then Tn is compact since it is generated by a square integrable kernel. Using


the Cauchy-Schwarz inequality and a change of order of integration as above,
and denoting the interval [a, b] ∩ [s − 1/n, s + 1/n] by In and various positive
constants that occur by a generic “C”, we have
 b  2
 
T f − Tn f 2 =  k(s, t)(|s − t|−α
− r (s, t)−α
)f (t)dt  ds
 n 
a In

 b  2
 k(s, t) 
=  (1 − n |s − t| )f (t)dt ds
α α
 |s − t|α
a In

  2
b
|k(s, t)|
≤C |f (t)|dt ds
a In |s − t|α
    
b
dt |f (t)|2
≤C dt ds
a In |s − t|α In |s − t|α
 
C b
|f (t)|2 C
≤ dtds ≤ 1−α f 2
n1−α a In |s − t|α n

and hence T − Tn  → 0 as n → ∞. Therefore the weakly singular operator


T is compact.
Each of the kernels introduced at the end of the previous chapter which
are associated with the identification of a nonlinear boundary condition at
the active surface of a simple solid that is being warmed in an ambient bath
is weakly singular. For example, the operator modeling the semi-infinite solid
 t 
ϕ (τ )
(Lϕ)(t) = √ dτ
0 t−τ

can be considered on any interval [0, T ] as a product of a compact operator √ K


on L2 [0, T ] which is generated by the weakly singular kernel k(t, τ )/ t − τ ,
where k(t, τ ) = χ[0,t] (τ ) is the characteristic function of the interval [0, t], and
the unbounded differentiation operator D defined on

D(D) = {ϕ ∈ L2 [0, T ] : ϕ abs.cont., ϕ(0) = 0, ϕ ∈ L2 [0, T ]}.


2.3 Bounded Linear Operators 29

The spectral theorem for a compact self-adjoint operator gives a particu-


larly simple characterization of the range of such an operator. We have seen
that the nonzero eigenvalues of a compact self-adjoint operator T of infinite
rank form a sequence of real numbers {λn } with λn → 0 as n → ∞. The
corresponding eigenspaces N (T − λn I) are all finite-dimensional and there is
a sequence {vn } of orthonormal eigenvectors, that is, vectors satisfying

vj  = 1,
vi , vj = 0, for i = j, and T vj = λj vj .

(here the eigenvalues are repeated in accordance with the dimension of the
associated eigenspace). The closure of the range of T is the closure of the span
of this sequence of eigenvectors. Since T is self-adjoint, N (T )⊥ = R(T ); the
decomposition theorem then gives the representation


w = Pw +
w, vj vj
j=1

for all w ∈ H, where P is the projection operator from H onto N (T ). The


range of T therefore has the form


Tw = λj
w, vj vj .
j=1

This result is known as the Spectral Theorem. The spectral theorem enables
the development of a remarkably useful functional calculus of operators. Given
a real-valued function which is defined and bounded on the set of eigenvalues
{λj }∞
j=1 of a compact self-adjoint operator T : H → H, the operator f (T ) :
H → H is defined by


f (T )x = f (λj )
x, vj vj
j=1

where {vj } are orthonormal eigenvectors associated with the respective (re-
peated as necessary) eigenvalues. Since f is bounded, Bessel’s inequality en-
sures that this series converges, that is, f (T ) : H → H is well-defined. It is a
routine matter to verify that the operator f (T ) so defined is linear, bounded
(in fact, f (T ) = max |f (λj )|) and self-adjoint.
j
The spectral theorem for compact self-adjoint linear operators can be
phrased in another way that is suggestive of its generalization to bounded
self-adjoint linear operators. Denote by Pλj the orthogonal projection
 onto the
eigenspace N (T − λj I), in particular, P0 = PN (T ) . Now let Eλ = λj ≤λ Pλj .
Then Eλ is a projection operator satisfying Eλ = O for λ < λmin and Eλ = I
for λ ≥ λmax , where [λmin , λmax ] is the smallest closed interval containing
the eigenvalues of T . The spectral representation of T may now be written in
terms of the family of projections {Eλ } as
30 2 Hilbert Space Background

  b
Tx = λj (Eλj − Eλj−1 )x = λdEλ
j a

for any a ≤ λmin and b > λmax , in which the operator-valued integral is to be
interpreted in the Riemann-Stieljes sense. We note that the projection-valued
function Eλ is right continuous in the sense that limλ→λ+ Eλ x = Eλ0 x for
0
each x ∈ H.
If T is a bounded linear operator, then the spectrum of T , denoted σ(T ), is
the set of all complex numbers λ such that (T − λI) does not have a bounded
inverse. The spectrum of a bounded self-adjoint linear operator is a nonempty
compact subset of real numbers contained in the interval [mT , MT ] where

mT = min {
T x, x : x = 1}

and
MT = max {
T x, x : x = 1}.
The spectral theorem asserts that there exists a family {Eλ }λ∈R of projection
operators on H, called a resolution of the identity generated by T , having the
properties enunciated in the previous paragraph, such that
 b
Tx = λdEλ
a

for all x ∈ H, where a ≤ mT and b > MT . Also, if f is a continuous function


on [a, b], then the operator f (T ) defined by
 b
f (T )x = f (λ)dEλ x
a

is a bounded self-adjoint linear operator satisfying

f (T ) = max |f (λ)|.
λ∈σ(T )

A particularly important consequence of this result is the fact that if {pn (λ)}
is a sequence of easily computable functions (e.g., polynomials) which are
continuous and uniformly convergent to f on a closed interval [a, b] containing
the spectrum of T , then

pn (T ) − f (T ) → 0

as n → ∞.

2.4 Unbounded Operators


A linear operator L defined on a linear subspace D(L) of a Hilbert space
H1 and taking values in a Hilbert space H2 (we write L : D(L) ⊆ H1 → H2 )
2.4 Unbounded Operators 31

is unbounded if there is a sequence of unit vectors {un } ⊆ D(L) with


Lun  → ∞. Many interesting linear operators are not bounded. Indeed,
the collection of examples given in the first chapter of unstable solution pro-
cedures for various inverse problems provides many illustrations of unbounded
linear operators.
The adjoint operator may also be defined for unbounded linear operators
with dense domains. Given a linear operator L : D(L) ⊆ H1 → H2 with dense
domain D(L), let D(L∗ ) be the subspace of all vectors y ∈ H2 satisfying


Lx, y =
x, y ∗

for some vector y ∗ ∈ H1 and all x ∈ D(L). The vector y ∗ is then uniquely
defined and we set L∗ y = y ∗ . Then the operator L∗ : D(L∗ ) ⊆ H2 → H1 is
linear. If L is both densely defined and closed, then L∗ is also densely defined
(see below).
As a simple example, we determine the adjoint of the differentiation oper-
ator on H = L2 [0, 1]. Let

D(L) = {f ∈ H : f is absolutely cont. on [0, 1] and f  ∈ H}.

Then D(L) is dense in H since it contains the complete orthonormal set


{sin nπx}∞
n=1 . Let

D∗ = {g ∈ D(L) : g(0) = 0 = g(1)}.

Then for f ∈ D(L) and g ∈ D∗ ,


 1  1

Lf, g = f  (t)g(t)dt = f (t)g(t)|10 − f (t)g  (t)dt =
f, −g  .
0 0

Therefore D ⊆ D(L ) and L g = −g for g ∈ D∗ . On the other hand, if


∗ ∗ ∗ 

g ∈ D(L∗ ), let g ∗ = L∗ g. Then


Lf, g =
f, g ∗
 1
for all f ∈ D(L). In particular, for f ≡ 1 we find that g ∗ (t)dt = 0. Now let
0
 t
h(t) = − g ∗ (s)ds.
0

Then h ∈ D∗ and L∗ h = g ∗ = L∗ g and hence h − g ∈ N (L∗ ). Therefore,

Lf, h − g = 0 for all f ∈ D(L). But R(L) contains all continuous functions
and hence g = h ∈ D∗ . We conclude that D(L∗ ) = D∗ and L∗ g = −g  .
A linear operator L : D(L) ⊆ H1 → H2 is called closed if its graph G(L) =
{(x, Lx) : x ∈ D(L)} is a closed subspace of the product Hilbert space H1 ×H2 .
This means that if {xn } ⊂ D(L), xn → x ∈ H1 , and Lxn → y ∈ H2 , then
32 2 Hilbert Space Background

(x, y) ∈ G(L), that is, x ∈ D(L) and Lx = y. For example, the differentiation
operator defined in the previous paragraph is closed. For suppose {fn } ⊆ D(L)
and fn → f and fn → g, in each case the convergence being in the L2 [0, 1]
norm. Since  x
fn (x) = fn (0) + fn (t)dt
0

we see that the sequence of constant functions {fn (0)} converges in L2 [0, 1]
and hence the numerical sequence {fn(0)} converges to some real number C.
x
Now define h ∈ D(L) by h(x) = C + g(t)dt. Then, for any t ∈ [0, 1], we
0
have by use of the Cauchy-Schwarz inequality

x
|fn (x) − h(x)| = |fn (0) − C + 0 fn (t) − g(t)dt|

1
≤ |fn (0) − C| + 0
|fn (t) − g(t)|dt

≤ |fn (0) − C| + fn − g

and hence fn → h uniformly. Therefore, f = h ∈ D(L) and Lf = f  = h = g,


verifying that the operator L is closed.
A simple example from one-dimensional heat conduction, treated in
Chapter 1, provides another illustration of a closed densely defined linear
operator. If in the model

∂u ∂2u
= + g(s), 0 < s < π, 0 < t < 1,
∂t ∂s2
where u(s, t) is subject to the boundary and initial conditions

u(0, t) = u(π, t) = 0, u(s, 0) = 0 for 0 ≤ s ≤ π,

one wishes to reconstruct the source distribution g(s) from the spatial temper-
ature distribution x(s) = u(s, 1), one is led by formal separation of variables
techniques, as was seen in Chapter 1, to the representation

 n2
g(s) = (Lx)(s) = an sin ns
n=1
1 − e−n2

where  π
2
an = x(s) sin nsds.
π 0

That is, g = Lx, where L is the linear operator on L2 [0, π] with domain
 ∞  π 
 2
D(L) = x ∈ L2 [0, π] : m4 a2m < ∞, am = x(s) sin msds
m=1
π 0
2.4 Unbounded Operators 33

which is defined above. Note that D(L) is dense in L2 [0, π] since all finite
linear combinations of the eigenfunctions φn (s) = sin ns are contained in
D(L). Moreover, L is unbounded since φn 2 = π/2, while
πn2
Lφn 2 = →∞
2(1 − e−n2 )
as n → ∞. Finally, L is a closed operator, that is, its graph is a closed subspace
of L2 [0, π]×L2 [0, π]. Indeed, if {xk } ⊆ D(L) and xk → x in L2 [0, π] as k → ∞,
while Lxk → g in L2 [0, π], then
g − Lxk , φn → 0 as k → ∞ for each n. Also,
n2 n2

g − Lxk , φn =
g, φn − −n2
xk , φn →
g, φn −
x, φn .
1−e 1 − e−n2
Therefore,
n2

g, φn =
x, φn
1 − e−n2
and hence

 ∞ 
 2
m2
m4 |
x, φm |2 ≤ |
g, φm |2 = g2 .
m=1 m=1
1 − e−m2

Therefore, x ∈ D(L) and



 ∞
n2 2 2
Lx =
x, φn φ n =
g, φn φn = g.
n=1
1 − e−n π
2
π n=1

As closed linear operators are an essential feature of the sequel, we provide


yet another example. The reader is invited to investigate the closedness of the
remaining examples of linear unbounded operators illustrated in Chapter 1.
Consider the linear operator L defined on

D(L) = {f ∈ L2 (R) : fˆ(ω) cosh ω ∈ L2 (R)}

by
Lf = F −1 {fˆ(ω) cosh ω}
where F is the Fourier transform operator and F{f } = fˆ. If {fn } ⊆ D(L)
and Lfn → g in L2 (R), then

F −1 {fˆn (ω) cosh ω} → g ∈ L2 (R)

and hence {fˆn (ω) cosh ω} converges to ĝ and to fˆ cosh ω. Therefore, fˆ cosh ω ∈
L2 (R), that is, f ∈ D(L), and

Lf = F −1 {fˆn (ω) cosh ω} = F −1 {ĝ} = g.

Therefore, L is closed.
The next proposition shows that closed linear operators are common.
34 2 Hilbert Space Background

Theorem 2.4. If L is densely defined, then L∗ is closed.

Proof. Suppose {yn } ⊆ D(L∗ ), yn → y and L∗ yn → z. Then for every u in


the dense subspace D(L) we have


Lu, y = limn→∞
Lu, yn

= limn→∞
u, L∗ yn =
u, z

and hence y ∈ D(L∗ ) and L∗ y = z, that is, L∗ is closed. 




The four fundamental equations relating the ranges and nullspaces that we
discussed previously for bounded linear operators continue to hold for closed
densely defined linear operators. For example, if w ∈ R(L)⊥ , then


Lx, w = 0 =
x, 0

for all x ∈ D(L) and hence w ∈ D(L∗ ) and L∗ w = 0, that is, R(L)⊥ ⊆ N (L∗ ).
On the other hand, if w ∈ D(L∗ ) and L∗ w = 0, then for all x ∈ D(L)


Lx, w =
x, L∗ w = 0

and hence N (L∗ ) ⊆ R(L)⊥ . Therefore, R(L)⊥ = N (L∗ ). It then follows that

N (L∗ )⊥ = R(L)⊥⊥ = R(L).

Similarly, one finds that N (L) = R(L∗ )⊥ and N (L)⊥ = R(L∗ ).


A densely defined linear operator L : D(L) ⊆ H → H is called symmetric
if

Lx, y =
x, Ly for all x, y ∈ D(L).
A densely defined linear operator L is called self-adjoint if L is symmetric
and D(L∗ ) = D(L), that is, L∗ = L. Every self-adjoint transformation is,
of course, symmetric; however, a symmetric transformation is not necessarily
self-adjoint. Consider, for example, the linear operator Lf = f  defined on the
subspace of L2 [0, 1] consisting of all functions f with absolutely continuous
first derivative, such that f and f  vanish at the endpoints of [0, 1] and f  ∈
L2 [0, 1]. For f, g ∈ D(L) one has
 1  1


Lf, g = f (t)g(t)dt = f (t)g  (t)dt =
f, Lg .
0 0

Therefore, L is symmetric. Yet L is not self-adjoint as any function g having


an absolutely continuous derivative and a square integrable second derivative,
but not satisfying the boundary conditions specified above, is in D(L∗ ) but
not in D(L) and hence D(L)  D(L∗ ).
2.4 Unbounded Operators 35

The example just given shows that a symmetric linear operator is not
necessarily bounded. The Hellinger-Toeplitz Theorem gives a sufficient con-
dition for a symmetric operator to be bounded: a symmetric linear operator
whose domain is the entire space is bounded. This theorem is a special case
of the next result because a symmetric linear operator is, as is every adjoint
operator, necessarily closed (see Theorem 2.4).
If a linear operator L : D(L) ⊆ H1 → H2 is closed, then D(L) is a Hilbert
space when endowed with the graph inner product [·, ·] defined by:

[x, y] =
x, y +
Lx, Ly for x, y ∈ D(L).

Indeed, if {xn } ⊆ D(L) is Cauchy in the graph norm, then {xn } is Cauchy
in H1 and {Lxn } is Cauchy in H2 . Therefore, there is an x ∈ H1 such that
xn → x and a y ∈ H2 such that Lxn → y. Since L is closed, x ∈ D(L) and
y = Lx and hence {xn } converges to x ∈ D(L) in the graph norm, that is,
D(L) is complete with respect to the graph norm.
If L is closed and everywhere defined, i.e., D(L) = H1 , then since the
graph norm dominates the norm on H1 , we find, by the corollary to Banach’s
theorem, that the norm in H1 is equivalent to the graph norm. In particular,
the operator L is then bounded:
Theorem 2.5 (Closed Graph Theorem). A closed everywhere defined lin-
ear operator is bounded.
The arguments developed by von Neumann in his study of unbounded
operators may properly be called Cartesian as they rely on clever use of the
graph of the linear operator L : D(L) ⊆ H1 → H2 , considered as a subspace
of the product Hilbert space H1 × H2 , along with like treatment of the adjoint
operator L∗ . We now present an important theorem of von Neumann [47]
concerning the operators
 := (I + L∗ L)−1
L 
and LL

where L : D(L) ⊆ H1 → H2 is a closed densely defined linear operator. This


theorem is a key ingredient in our general stabilization theory which will be
presented in the next chapter.
Theorem 2.6 (von Neumann). The operators L  and LL
 are both defined
 is self-adjoint.
on all of H1 and are bounded. Also, L
Von Neumann’s proof makes elegant use of the graph of L

G(L) = {(x, Lx) : x ∈ D(L)}

considered as a closed subspace of the Hilbert space H1 × H2 with inner


product [·, ·] defined by

[(x, y), (w, z)] =


x, w +
y, z
36 2 Hilbert Space Background

Since G(L) is closed, the product Hilbert space may be decomposed as

H1 × H2 = G(L) + G(L)⊥ .

However,
G(L)⊥ = {(−L∗ y, y) : y ∈ D(L∗ ) ⊆ H2 }. (2.1)

Indeed, if y ∈ D(L∗ ), then for any x ∈ D(L)


−L∗ y, x +
y, Lx =
−L∗ y, x +
L∗ y, x = 0

and hence (−L∗ y, y) ∈ G(L)⊥ . On the other hand, if (u, v) ∈ G(L)⊥ , then for
all x ∈ D(L)

x, u +
Lx, v = 0

that is,

Lx, v =
x, −u

and hence, by definition of L∗ , v ∈ D(L∗ ) and u = −L∗ v. That is

(u, v) = (−L∗ v, v)

which establishes (2.1).


Therefore, for any h ∈ H1 , the decomposition of H1 × H2 given above
ensures the representation

(h, 0) = (x, Lx) + (−L∗ y, y)

for some x ∈ D(L) and y ∈ D(L∗ ). But then y = −Lx and h = x − L∗ y, that
is, x ∈ D(L∗ L) and h = (I + L∗ L)x. Hence (I + L∗ L)−1 is defined on all of
H1 and, since h2 ≥ x2 ,

 = (I + L∗ L)−1  ≤ 1.
L

We also have

 2 =
L(I + L∗ L)−1 x, L(I + L∗ L)−1 x =
Lx,
LLx  x − Lx
 ≤ x2

and hence LL ≤ 1.



Since L is everywhere defined and bounded to prove that L  is self-adjoint
we need only show that it is symmetric. Given x, y ∈ H1 , there are x̄, ȳ ∈
D(L∗ L) such that x = (I + L∗ L)x̄ and y = (I + L∗ L)ȳ. Then
 y =
x̄, y =
x̄, (I + L∗ L)ȳ =
x, ȳ =
x, Ly

Lx, 

 is self-adjoint. 
and hence L 
2.4 Unbounded Operators 37

The ideas of this proof may also be used to provide a result of independent
interest regarding the adjoint. Suppose the operator T : H2 × H1 → H1 × H2
is defined by
T (y, x) = (−x, y).
Then, as we have seen in the proof of the last theorem (see 2.1),

T (G(L∗ )) = G(L)⊥ (2.2)

where the orthogonal complement is in the product space H1 × H2 . This is


used to establish the following result.
Theorem 2.7. If L is a closed densely defined linear operator, then L∗∗ = L.
Proof. To ensure that L∗∗ exists we must first show that D(L∗ ) is dense in
H2 . Suppose to the contrary that there is a non-zero vector z ∈ D(L∗ )⊥ . Then


L∗ y, 0 −
y, z = 0

for all y ∈ D(L∗ ). That is,

(0, z) ∈ (T (G(L∗ )))⊥ .

By (2.2), along with the fact that G(L) is closed in H1 × H2 , we have

G(L) = G(L)⊥⊥ = (T (G(L∗ )))⊥

and therefore (0, z) ∈ G(L). Since L is linear we then have z = L0 = 0 and


hence D(L∗ ) is dense, so the adjoint L∗∗ exists.
Viewing (T (G(L∗ )))⊥ in a slightly different light, we see that it consists of
vectors (x, u) ∈ H1 × H2 satisfying


L∗ y, x =
y, u

for all y ∈ D(L∗ ). But this is the defining condition for L∗∗ , that is,

G(L) = (T (G(L∗ )))⊥ = G(L∗∗ ),

and hence L = L∗∗ . 




 and L
We point out a few basic properties of the operators L  that will be
useful later.
 = D(LL∗ ) and R(L)
Lemma 2.8. R(L)  = D(L∗ L).

Proof. Since L  ∈ D(LL∗ ) for all z ∈ H2 ,


 = (I + LL∗ )−1 , it follows that Lz
 ⊆ D(LL ). On the other hand, if y ∈ D(LL∗ ), then y = Lw
that is, R(L) ∗ 
where w = y + LL∗ y, i.e., D(LL∗ ) ⊆ R(L). The other equality is established
in the same way. 

38 2 Hilbert Space Background

This result extends easily to positive integral powers of the operator. In-
 ⊆ D(LL∗ ), it follows im-
deed, if n is a positive integer, then, since R(L)
 ∗ n
mediately that R(L ) ⊆ D((LL ) ). On the other hand, supposing that
n

D((LL∗ )k ) ⊆ R(L  k ), we find that if x ∈ D((LL∗ )k+1 ), then LL∗ x ∈


∗ k
D((LL ) ) and hence

 k ),
(I + LL∗ )x = x + LL∗ x ∈ D((LL∗ )k ) ⊆ R(L

that is
 k ) = R(L
x ∈ (I + LL∗ )−1 R(L  k+1 ).

Therefore, by induction, we have D((LL∗ )n ) ⊆ R(L  n ), for all positive


integers n.
Next we give a relationship between the unbounded operator L∗ L and the
bounded operator I − L 
 (a similar result relates LL∗ and I − L).
 and R(L∗ L) = R(I − L).
Lemma 2.9. N (L∗ L) = N (I − L) 

Proof. The first equality follows from the fact that L∗ Lx = 0 if and only if
 = x.
x = (I + L∗ L)x, that is, if and only if Lx
Suppose x ∈ D(L L) and let w = x + L∗ Lx. Then (I − L)w
∗  = L∗ Lx and
∗  
hence R(L L) ⊆ R(I − L). On the other hand, if y = z − Lz for some z ∈ H1 ,
then
 ∈ D(L∗ L)
z − y = Lz
 ⊆ R(L∗ L).
and (I + L∗ L)(z − y) = z. That is, y = L∗ L(z − y), i.e., R(I − L)



The same sort of argument shows that


N (LL∗ ) = N (I − L) 
and R(LL∗ ) = R(I − L).

We note that since L is self-adjoint and bounded, with L ≤ 1, it fol-


 ⊆ [0, 1]. Also, it can be shown that LL
lows that σ(L)  1/2
is bounded with

LL  ≤ 1 and, if x ∈ D(L), then
1/2

 1/2 x = L
LL  1/2 Lx

(a more general result is proved in Theorem 3.2 of the next chapter). Using
this we have:
 1/2 : H1 → H2 is bounded and
Lemma 2.10. LL
 1/2 )∗ (LL
(LL  1/2 ) = I − L

2.4 Unbounded Operators 39

Proof. For any x, y ∈ D(L) we have


(LL  1/2 )x, y =
L
 1/2 )∗ (LL 
 1/2 Lx, L∗ LLx, y .

But,

L∗ LLx  = (I − L)x,
= L∗ LLx 

and hence, for a given y ∈ D(L),

 1/2 )∗ (LL

(LL  1/2 )x, y =
(I − L)x,
 y

for all x ∈ D(L). Therefore, since D(L) is dense in H1 , and I − L is bounded,


this identity extends to all x ∈ H1 . The resulting identity then holds for all y
in the dense subspace D(L) and hence

(LL  1/2 ) = I − L.
 1/2 )∗ (LL  


 is self-adjoint with σ(L)


Since L  ⊆ [0, 1] we see that I − L
 ≤ 1. We note
  < 1, then
that I − L < 1 if and only if L is bounded. Indeed, if I − L
 = I + L L is bounded, so L L is symmetric and everywhere defined, and
L −1 ∗ ∗

hence bounded by the Hellinger-Toeplitz theorem. For any x ∈ H1 , we then


have in view of Theorem 2.7

L∗ Lx2 ≥
L∗ Lx, x = Lx2

and hence L is bounded. On the other hand, if L is bounded, then taking


{Eλ }λ≥0 to be a resolution of the identity for the operator L∗ L, we find that
 L 2  2  2
 2= λ L2
(I − L)x d
Eλ x, x ≤ x2
0 1+λ 1 + L2

 < 1.
and hence I − L
We also note the following fact about the operator LL∗

Theorem 2.11. If L is closed and densely defined, then LL∗ is closed.

Proof. Suppose yn ∈ D(LL∗ ), yn → y and LL∗ yn → u ∈ H2 . Then,


 is bounded, yn → L(y
(I +LL∗ )yn → y +u and hence, since L  +u). Therefore,

L(y + u) = y, that is, y ∈ D(LL ) and y + LL y = y + u, or LL∗ y = u and
∗ ∗

hence LL∗ is closed. 




 consider the differentiation


As a specific example of the operator L

operator Lf = f , where

D(L) = {f ∈ L2 [0, 1] : f abs.cont., f  ∈ L2 [0, 1]}.


40 2 Hilbert Space Background

Earlier we showed that the adjoint has domain

D(L∗ ) = {y ∈ L2 [0, 1] : y abs.cont., y  ∈ L2 [0, 1], y(0) = y(1) = 0}

and L∗ y = −y  . Therefore,

D(L∗ L) = {f ∈ D(L) : Lf ∈ D(L∗ )}

= {f ∈ D(L) : f  abs.cont., f  ∈ L2 [0, 1], f  (0) = f  (1) = 0}

 = f if and only if f ∈ D(L∗ L)


and L∗ Lf = −f  . Now, given h ∈ L2 [0, 1], Lh

and h = (I + L L)f . That is, f is the solution of the two point boundary
value problem
f  − f = −h, f  (0) = f  (1) = 0.
 in terms of Green’s function for this boundary
We may therefore express L
value problem:  1

Lh(t) = G(t, s)h(s)ds
0

where G(·, ·) ∈ L ([0, 1] × [0, 1]) is the symmetric kernel


2


cosh (t − 1) cosh s/ sinh 1, s ≤ t
G(s, t) =
cosh (s − 1) cosh t/ sinh 1, t < s

That is, L is the compact self-adjoint operator defined on L2 [0, 1] by the


integral transform above.
The operator L, treated in Chapter 1, which maps a spatial temperature
distribution f (x) = u(x, 1) into the source term g of the one-dimensional heat
equation
∂u ∂2u
= + g(x), 0 < x < π, 0 < t
∂t ∂x2
provides another example in which the operator L  = (I + L∗ L)−1 is compact.
This operator L is defined by


Lf = cn
f, ϕn ϕn
n=1

where
n2
ϕn (x) = sin nx and cn = .
1 − e−n2
It is easy to see that


 =
Lf (1 + c2n )−1
f, ϕn ϕn
n=1
2.5 Pseudo-inversion 41

is a bounded self-adjoint linear operator defined on L2 [0, π]. Also, if fk  f ,


∞ boundedness theorem, fk − f  is bounded, and hence,
then by the uniform
since the series n=1 (1 + c2n )−1 is convergent, given any  > 0, there is an N
such that 
 (1 + c2n )−1
fk − f, ϕn ϕn  < /2.
n>N

Since {fk } converges weakly to f , we then have



Lf  ≤
 k − Lf (1 + c2n )−1
fk − f, ϕn ϕn  + /2 < 
n≤N

for k sufficiently large. Hence Lf  , showing that L


 k converges strongly to Lf 
is compact.
Operators L∗ L for which L is compact are said to be operators with com-
pact resolvent. This class of operators plays an important role in one of the
coming sections. Kato [30] has noted that operators of this type are especially
significant in applications:
Operators with compact resolvent occur frequently in mathematical
physics. It may be said that most differential operators that appear
in classical boundary value problems are of this type.

2.5 Pseudo-inversion
We are interested in generalized solutions of the equation Lx = y where
L : D(L) ⊆ H1 → H2 is a closed densely defined linear operator and y ∈ H2 .
Of course the equation has a solution if and only if y ∈ R(L), however we can
associate generalized solutions with any y in the dense subspace R(L)+R(L)⊥
of H2 . Indeed, if y ∈ R(L) + R(L)⊥ , then P y ∈ R(L) where P is the projector
of H2 onto R(L), the closure of the range of L. In this case the equation

Lx = P y

has solutions and we call any such solution a least squares solution of the
original equation. If x ∈ D(L) is a least squares solution then

Lx − y2 = P y − y2 = min Lz − y2


z∈D(L)

which accounts for the “least squares” terminology. Among the possibly infi-
nitely many least squares solutions there can be at most one lying in N (L)⊥ .
In fact, if u and v are least squares solutions, then u − v ∈ D(L) and

L(u − v) = Lu − Lv = P y − P y = 0

and hence if u and v lie in N (L)⊥ , then u − v ∈ N (L) ∩ N (L)⊥ = {0}. On the
other hand, if x is a least squares solution and x = x1 + x2 ∈ N (L) + N (L)⊥ ,
42 2 Hilbert Space Background

then x2 = x − x1 ∈ D(L) and Lx2 = Lx − 0 = P y, i.e., x2 is a least squares


solution lying in N (L)⊥ . Therefore, if y ∈ R(L) + R(L)⊥ , then there is a
unique vector x† ∈ D(L) ∩ N (L)⊥ satisfying Lx† = P y. It is easy to see that
x† is characterized as the least squares solution of smallest norm. Indeed, if z
is another least squares solution, then z − x† ∈ N (L) and hence

z2 = z − x† 2 + x† 2 ≥ x† 2 .

Let D(L† ) = R(L) + R(L)⊥ . The operator L† : D(L† ) ⊆ H2 → H1 defined


by L† y = x† where x† is the unique least squares solution of Lx = y lying
in D(L) ∩ N (L)⊥ is called the Moore-Penrose generalized inverse of L. In the
case when N (L) = {0} and R(L) = H2 one sees that L† = L−1 . We note that
L† is itself a closed densely defined linear operator. Indeed, if {yn } ⊆ D(L† )
and yn → y ∈ H2 , while L† yn → x ∈ H1 , then LL† yn = P yn → P y since
P is bounded. Also, since L† yn ∈ N (L)⊥ , we have x ∈ N (L)⊥ since N (L)⊥
is closed. But since L is closed we then have x ∈ D(L) and Lx = P y, i.e.,
y ∈ D(L† ) and L† y = x. Therefore, L† is closed. Furthermore, L† is defined
on all of H2 if and only if R(L) is closed. Therefore, by the Closed Graph
Theorem, L† is bounded if and only if R(L) is closed. From the definition of
the Moore-Penrose inverse it follows that

LL† y = PR(L) y

for all y ∈ D(L† ). We also note that

L† Lx = PN (L)⊥ x

for all x ∈ D(L). In fact, if x ∈ D(L) and x = x1 + x2 ∈ N (L) + N (L)⊥ , then

PN (L)⊥ x = x2 = x − x1 ∈ D(L).

Also,
L† Lx = L† Lx2 .
However, x2 ∈ N (L)⊥ ∩ D(L) and Lx2 = Lx = PR(L) Lx and hence, x2 =
L† Lx. That is, L† Lx = PN (L)⊥ x. We summarize these results in the following
theorem:
Theorem 2.12. If L : D(L) ⊆ H1 → H2 is a closed densely defined linear
operator, then L† : D(L† ) = R(L) + R(L)⊥ → D(L) ∩ N (L)⊥ is closed and
densely defined. Also

(i) L† is bounded if and only if R(L) is closed.

(ii) LL† y = PR(L) y for all y ∈ D(L† ).

(iii) L† Lx = PN (L)⊥ x for all x ∈ D(L).


2.6 Optimization 43

We now give a result that relates the Moore-Penrose inverse of the generally

unbounded operator L∗ L to that of the bounded linear operator I − L.
Theorem 2.13. If L is closed and densely defined, then
 − L)
(L∗ L)† = L(I  †.

 is bounded
Proof. First note that from Lemma 2.9 and the fact that L

D((L∗ L)† ) = R(L∗ L) + R(L∗ L)⊥

 + R(I − L)
= R(I − L)  ⊥

 † ) = D(L(I
= D((I − L)  − L)
 † ).

Suppose now that z = L∗ Lw + u ∈ D((L∗ L)† ), where w ∈ D(L∗ L) ∩ N (L∗ L)⊥


and u ∈ R(L∗ L)⊥ . Then (L∗ L)† z = w. Also, using Lemma 2.9

 − L)
L(I  † z = L(I
 − L)
 † L∗ Lw

 − L)
= L(I  † (I + L∗ L − I)w

 − L)
= L(I 
 † (I − L)(I + L∗ L)w


= LP ∗
)⊥ (I + L L)w.
N (I−L

 = N (L∗ L) = N (L) we have


However, for w ∈ N (L)⊥ and x ∈ N (I − L)
 (I + L∗ L)w =
x, L(I

x, (I + L∗ L)w =
Lx,  + L∗ L)w = 0

 ⊥ , i.e.,
therefore, (I + L∗ L)w ∈ N (I − L)
∗ ∗
PN (I−L
)⊥ (I + L L)w = (I + L L)w.
Hence
 − L)
L(I  † z = L(I
 + L∗ L)w = w = (L∗ L)† z. 


2.6 Optimization
We begin with a fairly general situation in which there may be multiple op-
tima. Suppose S ⊆ H is weakly closed and x ∈ H. Then there is a y ∈ S such
that y − x = dist(S, x), where dist(S, x) is the distance from x to S:

dist(S, x) = min{z − x : z ∈ S}.


44 2 Hilbert Space Background

This is a simple consequence of previous considerations. Indeed, if

d = inf{z − x : z ∈ S},

then there is a sequence {zn } ⊆ S such that lim zn − x = d. The sequence
{zn } then has, by Theorem 2.1, a weakly convergent subsequence, say znk  y.
Since S is weakly closed we then have y ∈ S and, by the weak lower semi-
continuity of the norm,

d ≤ y − x ≤ inf znk − x = d.

Note that as a consequence of the continuity of the norm, dist(S, x) also exists
if S is (strongly) closed. The set of all points in S nearest to x is called the
projection of x onto S and is denoted PS (x):

PS (x) = {y ∈ S : y − x = inf{z − x : z ∈ S}} .

The distance to S also has a simple continuity property:

|dist(S, u) − dist(S, v)| ≤ u − v

Indeed, for any w ∈ S, since u − w ≤ u − v + v − w we have

dist(S, u) ≤ u − v + inf{v − w : w ∈ S} = u − v + dist(S, v)

interchanging u and v we get the desired result.


In the case when S is weakly closed and convex the projection PS is single-
valued, for if u, v ∈ PS (x), then by the parallelogram law and the fact that
(u + v)/2 ∈ S we have

u − v2 = u − x − (v − x)2

= 2u − x2 + 2v − x2 − 4(u + v)/2 − x2

≤ 4 × dist(S, x) − 4 × dist(S, x) = 0.

The purely geometric fact that a weakly closed convex set has the nearest
point property has many important applications in optimization theory. As
an illustration we provide a very simple example of optimal control of a one-
dimensional dynamical system. Suppose a unit point mass moves on the real
line and is steered from the origin with initial velocity 1 by a control (i.e.,
external force) u ∈ L2 ([0, 1]). Our interest is in a control that returns the
particle to a soft landing at the origin in unit time while expending minimal
effort. As a measure of effort we take
 1
u2 = |u(t)|2 dt.
0

The position of the particle is modeled as a function


2.6 Optimization 45

x ∈ H 2 [0, 1] = {x ∈ L2 ([0, 1]) : x, ẋ, abs.cont., ẍ ∈ L2 ([0, 1])}

where the dots indicate temporal derivatives. The motion of the particle is
then governed by

ẍ = u, x(0) = 0, ẋ(0) = 1, x(1) = 0, ẋ(1) = 0. (2.3)

Let S be the set of all functions (controls) u ∈ L2 ([0, 1]) for which the equa-
tions (2.3) are satisfied for some trajectory x ∈ H 2 [0, 1]. Note that S is clearly
a convex set. Also, S is weakly closed in L2 ([0, 1]). Indeed, suppose {un } ⊂ S
and un  u ∈ L2 ([0, 1]). Let
 t s
x(t) = u(τ )dτ ds
0 1

and note that x ∈ H 2 [0, 1], ẍ = u, x(0) = 0, and ẋ(1) = 0. Also, if xn is the
trajectory corresponding to the control un , then since um  u,
 0  0
1 = ẋn (0) = un (τ )dτ =
−1, un →
−1, u = u(τ )dτ = ẋ(0)
1 1

and hence ẋ(0) = 1. Finally,



1
s
0 = xn (1) = 0 1 un (τ )dτ ds

1
1
= 0
(1 − τ )un (τ )dτ → 0
(1 − τ )u(τ )dτ = x(1)

and hence the set S is weakly closed. Therefore S contains a unique control
that is nearest to the origin, and this of course is the minimal effort control
which solves the optimal control problem.
A simple sketch convinces one that the nearest point in a weakly closed
convex set is characterized by the following useful inequality.
Theorem 2.14. If S is weakly closed and convex, then x0 = PS (x) if and
only if x0 ∈ S satisfies
x − x0 , y − x0 ≤ 0 for all y ∈ S.
Proof. If x0 satisfies the stated inequality, then for any y ∈ S,

y − x2 = (y − x0 ) + (x0 − x)2

= y − x0 2 + 2
y − x0 , x0 − x + x0 − x2 ≥ x − x0 2

and hence x0 = PS (x).


On the other hand, if x0 = PS (x) then for any y ∈ S and any t ∈ [0, 1] the
vector y(t) := (1 − t)x0 + ty lies in S and

x − y(t)2 = x − x0 + t(x0 − y)2

= x − x0 2 − 2t
x − x0 , y − x0 + t2 x0 − y2 .
46 2 Hilbert Space Background

So if
x − x0 , y − x0 > 0 then x − y(t) < x − x0  for t sufficiently small,
contradicting the fact that x0 = PS (x). 

One consequence of Theorem 2.14 is a very strong continuity property of


PS , namely
PS (u) − PS (v) ≤ u − v (2.4)
for all u, v ∈ H. Indeed from Theorem 2.14 we have

u − PS (u), PS (v) − PS (u) ≤ 0 and
v − PS (v), PS (u) − PS (v) ≤ 0
and hence

u − v + PS (v) − PS (u), PS (v) − PS (u) ≤ 0
that is,

u − v, PS (v) − PS (u) + PS (v) − PS (u)2 ≤ 0
from which we conclude that
PS (u) − PS (v)2 ≤
v − u, PS (v) − PS (u) ≤ u − vPS (u) − PS (v)
establishing (2.4).
Every weakly closed set is a fortiori strongly closed, but a strongly closed
set is not necessarily weakly closed. However, a strongly closed set which is also
convex is weakly closed and hence weak closedness and strong closedness are
equivalent in the class of convex sets. This is also a consequence of Theorem
2.14. Let S be a closed convex set and suppose {xn } ⊂ S and xn  x0 . Recall
that we have previously observed that PS (x0 ) exists uniquely if S is closed
and convex. By Theorem 2.14

x0 − PS (x0 ), xn − PS (x0 ) ≤ 0
for all n and hence, since xn  x0 ,
x0 − PS (x0 )2 = lim
x0 − PS (x0 ), xn − PS (x0 ) ≤ 0.
n

Therefore, x0 = PS (x0 ) ∈ S and hence S is weakly closed.


The geometry of the projector comes into sharper focus when the base
convex set is in fact a subspace.
Lemma 2.15. If S is a closed subspace of H, then x − PS (x) ∈ S ⊥ for all
x ∈ H.
Proof. Since S is a subspace w + PS (x) ∈ S for all w ∈ S and hence by
Theorem 2.14

x − PS (x), w =
x − PS (x), w + PS (x) − PS (x) ≤ 0.
Replacing w by −w we get

x − PS (x), w = 0
for all w ∈ S, that is, x − PS (x) ∈ S ⊥ for all x ∈ H. 

2.6 Optimization 47

Theorem 2.16. If S is a closed subspace, then PS is a self-adjoint linear


operator and PS  = 1.
Proof. Suppose x, y ∈ H. Since S is a subspace and PS (y) ∈ S, any u ∈ S has
the form u = z + PS (y) for some z ∈ S. Therefore, by the lemma,
u − (x + y)2 = z + PS (y) − (x + y)2

= z − PS (x) + PS (x) − x + PS (y) − y2

= z − PS (x)2 + PS (x) − x + PS (y) − y2

≥ PS (x) + PS (y) − (x + y)2


and hence PS (x + y) = PS (x) + PS (y). A similar, but easier, argument shows
that PS (αx) = αPS (x) for all x ∈ H and α ∈ R. Therefore, PS is a linear
operator and we will follow custom by dispensing with the use of parentheses
and write PS x for PS (x).
By (2.4), PS x ≤ x for all x ∈ H and hence PS  ≤ 1. However, since
PS2 = PS , it follows that PS  = PS2  ≤ PS 2 and hence PS  = 1. Finally,
from the lemma,

PS x, y −
x, PS y =
PS x, y −
x − PS x + PS x, PS y

=
PS x, y −
PS x, PS y =
PS x, y − PS y = 0
and therefore PS is self-adjoint. 


Theorem 2.17. If a self-adjoint bounded linear operator T satisfies T 2 = T ,


then T = PS where S = N (T − I).
Proof. First note that S is a closed subspace. Also, for any y ∈ S,

x − T x, y − T x =
x − T x, T 2 y − T 2 x =
T x − T x, T y − T x = 0
and hence T x = PS x by Theorem 2.14. 
.

Projections provide best approximations in a set to a given vector. So it


is not surprising that algorithms for best approximations that satisfy more
than one criterion may be developed by using more than one projection. The
intersection point of a pair of lines provides a trivial example. Draw two in-
tersecting lines. Now take an arbitrary point, project it onto the first line,
then project that point onto the second line, and continue the process, alter-
nately projecting onto each line in turn. You will see that the points so con-
structed follow a zig-zag path and converge to the intersection point of the two
lines. This is the simplest version of von Neumann’s theorem on alternating
projections [48] first given in 1933.
48 2 Hilbert Space Background

Theorem 2.18. If S1 and S2 are closed subspaces of a Hilbert space H, then


(PS2 PS1 )n x → PS1 ∩S2 x as n → ∞ for each x ∈ H.
Proof. Let P1 = PS1 , P2 = PS2 and define Tn by

T 1 = P1 , T2 = P2 P1 , T3 = P1 P2 P1 , etc.

that is, ⎧
⎨ (P2 P1 )n/2 , n even
Tn =

P1 (P2 P1 )(n−1)/2 , n odd
One sees easily that

⎨ Tm+n , m+n odd

Tm Tn =

Tm+n−1 , m + n − 1 odd
and hence

∗ ∗
Tm x − Tn x2 =
Tm Tm x, x −
Tm Tn x, x −
Tn∗ Tm x, x +
Tn∗ Tn x, x

=
T2m−1 x, x − 2
Tq(m,n) x, x +
T2n−1 x, x
(2.5)
where q(m, n) = m + n or m + n − 1, whichever is odd. Now for any positive
integer j we have


T2j−1 x, x = Tj x2 ≤ Tj−1 x2 =
T2j−3 x, x

and therefore {
T2j−1 x, x } is a decreasing sequence of non-negative numbers
and hence
a(x) = lim
T2j−1 x, x
j→∞

exists for each x ∈ H. From (2.5) it follows that

Tm x − Tn x2 → a(x) − 2a(x) + a(x) = 0

as m, n → ∞ and hence {Tn x} is a Cauchy sequence. Therefore, Tn x → P (x)


as n → ∞ for some P (x) ∈ H. The operator P : H → H so defined is linear,
as each of the Tn is linear, and symmetric, as each of the T2n+1 is symmetric.
Therefore, P is bounded by the Hellinger-Toeplitz theorem. Since P1 Tn = Tn
or Tn+1 , depending on whether n is odd or even, respectively, we find that

P1 P x = lim P1 Tn x = lim Tn x = P x
n→∞ n→∞

and hence P x ∈ S1 . One shows in a similar fashion that P2 P x = P x and


hence P x ∈ S1 ∩S2 . These results show that Tn P x = P x for each n and hence
P 2 = P . We may now conclude from Theorem 2.17 that P x = PS1 ∩S2 x.
2.6 Optimization 49

Finally, since (P2 P1 )n = T2n , we see that (P2 P1 )n x → PS1 ∩S2 x as n → ∞


for each x ∈ H.  

Having specialized from projections onto closed convex sets to projections


onto closed subspaces we now take a half-step backward and treat the case of
projection onto affine sets. By an affine subset of a Hilbert space H we mean
a set of the form S(z) = {s + z : s ∈ S} where S is a given closed subspace of
H and z ∈ H. Note that affine sets are closed and convex and S(z) = S(y) if
and only if z − y ∈ S. Also if S1 (z) and S2 (y) are two non-disjoint affine sets,
then for any w ∈ S1 (z) ∩ S2 (y) we have S1 (z) = S1 (w), S2 (y) = S2 (w) and
S1 (z) ∩ S2 (y) = (S1 ∩ S2 )(w).

Lemma 2.19. PS(z) (x) = PS x + PS ⊥ w, for any w ∈ S(z).


Proof. First note that PS ⊥ w = PS ⊥ z for any w ∈ S(z) and hence
PS x + PS ⊥ w = PS x + PS ⊥ z = PS x − PS z + z ∈ S(z)
for any w ∈ S(z). Also, if s + z ∈ S(z), then
x − (s + z) = x − z − s

≥ x − z − PS (x − z) = x − (PS x + PS ⊥ z)


and hence the result. 


In particular we see (setting w = z) that PS(z) (x) = PS(z) (y) if and only
if x − y ∈ S ⊥ .

Lemma 2.20. If A1 = S1 (z) and A2 = S2 (y) are non-disjoint affine sets then
PA2 (PA1 (x)) − PA1 ∩A2 (x) = PS2 PS1 (x − PA1 ∩A2 (x)).
 = PA1 ∩A2 (x), then by the previous lemma, we have, since x
Proof. Let x ∈
A1 ∩ A2 :
PA2 (PA1 (x)) − x
 = PS2 PA1 (x) + PS2⊥ x
−x


= PS2 PA1 (x) − PS2 x




= PS2 (PA1 (x) − x


)

−x
= PS2 (PS1 x + PS1⊥ x )

= PS2 (PS1 x − PS1 x


) = PS2 PS1 (x − x
). 

50 2 Hilbert Space Background

Lemma 2.21. If A1 and A2 are non-disjoint affine sets then


PA1 ∩A2 (PA2 (PA1 (x))) = PA1 ∩A2 (x)
for all x ∈ H.
Proof. Suppose A1 = S1 (w) and A2 = S2 (w) for some subspaces S1 and S2
and some w ∈ A1 ∩ A2 . The stated equality is equivalent to the condition
x − PA2 (PA1 (x)) ∈ (S1 ∩ S2 )⊥ .
However,
x − PA2 (PA1 (x)) = x − PS2 PA1 (x) − PS2⊥ w

= x − PS2 (PS1 x + PS1⊥ w) − PS2⊥ w

= (x − PS2 PS1 x) − PS2 PS1⊥ w − PS2⊥ w.


and each of the terms of this last expression is a member of (S1 ∩ S2 )⊥ . 


Having dispensed with these preliminaries we may now prove the conver-
gence of the alternating projection algorithm for two affine sets. From the
previous two lemmas we have
(PA2 PA1 )2 (x) − PA1 ∩A2 (x) = PA2 PA1 (PA2 PA1 (x)) − PA1 ∩A2 (x)

= PA2 PA1 (PA2 PA1 (x)) − PA1 ∩A2 (PA2 PA1 (x))

= PS2 PS1 (PA2 PA1 (x) − PA1 ∩A2 (PA2 PA1 (x)))

= PS2 PS1 (PA2 PA1 (x) − PA1 ∩A2 (x))

= (PS2 PS1 )2 (x − PA1 ∩A2 (x)).


Similarly for any positive integer n,
(PA2 PA1 )n (x) − PA1 ∩A2 (x) = (PS2 PS1 )n (x − PA1 ∩A2 (x)).
Therefore, by Theorem 2.18
(PA2 PA1 )n (x) − PA1 ∩A2 (x) = (PS2 PS1 )n (x − PA1 ∩A2 (x))

→ PS1 ∩S2 (x − PA1 ∩A2 (x))


as n → ∞ for any x ∈ H. However, x − PA1 ∩A2 (x) ∈ (S1 ∩ S2 )⊥ and hence
we have proved:
Theorem 2.22 (Alternating Projection Theorem). If A1 and A2 are
non-disjoint affine subsets of H, then (PA2 PA1 )n (x) → PA1 ∩A2 (x) as n → ∞
for each x ∈ H.
The Alternating Projection Theorem may be extended in a number of
directions; see Deutsch [7] for a clear and comprehensive survey.
2.7 Notes 51

2.7 Notes

The literature on Hilbert spaces and linear operators is wide and deep. Our
treatment in this chapter is close in spirit to works on “applied” or “numer-
ical” functional analysis such as [3], [49], [50], [33] and [36]. The classical
masterpiece by pioneers in the subject [44] is still hard to beat and the work
[35] by a modern master is highly recommended. A proof of Banach’s Theo-
rem (also called the Open Mapping Theorem) can be found, for example, in
[13] p. 57. The usual proof of the uniform boundedness principle relies on a
topological “category” argument. Halmos [26] (Problem 20) called for a more
elementary non-topological proof. Such was supplied by Hennefeld [29] whose
proof turned out to be a rediscovery of a classical argument of Hausdorff [28].
Lucid presentations of fundamental results on closed unbounded opera-
tors, including the closed graph theorem, von Neumann’s Theorem (originally
presented in [47]) and the Hellinger-Toeplitz theorem can be found in [44].
Lemma 2.8 can be extended to positive powers of the operators L  and L
(see [21]).
An account of the Moore-Penrose inverse of bounded linear operators,
with special emphasis on approximation methods, can be found in [12]. The
most authoritative source for best approximation in Hilbert space is [7]. The
Alternating Projection theorem for closed subspaces was presented by von
Neumann in his Princeton lectures of 1933 and published in [48]. The identity
in Theorem 2.13 and related results may be found in [17].
Successive projection onto finitely many special affine sets, namely hyper-
planes, is the basis of Kaczmarz’s iterative method for underdetermined linear
systems, which in turn is the foundation of the ART (algebraic reconstruction
technique) method in computed tomography (see, e.g., [16] for a discussion
and additional references). The method has been widely extended to cover
the case of successive projection onto convex sets (see, e.g., [7]) and has been
applied to various nonlinear optimization problems (see, e.g. [46]).
3
A General Approach to Stabilization

The first Almighty Cause acts not by partial,


but by gen’ral laws.
A. Pope

In this short chapter we develop a general approach for stabilized evaluation of


a closed unbounded operator using von Neumann’s theorem and the spectral
theory of bounded self-adjoint operators as basic tools. Recall that the essential
problem of evaluating an unbounded linear operator L : D(L) ⊆ H1 → H2 at a
vector x ∈ D(L) given only an approximation xδ ∈ H1 satisfying x − xδ  ≤ δ
is that in general xδ ∈/ D(L), and even if xδ ∈ D(L) one can not guarantee
that Lx → Lx as δ → 0, since L is discontinuous. Our goal in this chapter
δ

is to develop general approximations to Lx of the form Sα x, where Sα is


a bounded linear operator depending on a parameter α > 0. Since Sα is
bounded the vector Sα xδ is defined for all xδ ∈ H1 and for each α > 0
the mapping xδ → Sα xδ is stable. The next ingredient in the stabilization
strategy is a parameter choice scheme α = α(δ) that ensures the regularity of
the approximations, that is, so that

Sα(δ) xδ − Lx → 0 as δ → 0.

3.1 A General Method


The key to our development is von Neumann’s theorem. Recall that this states
that if L : D(L) ⊆ H1 → H2 is a closed densely defined linear operator, then
the operator L : H1 → H2 defined by

 = (I + L∗ L)−1
L
 ⊆ [0, 1] and the operator LL
is bounded and self-adjoint with σ(L)  : H1 → H2
is bounded with LL  ≤ 1. In the same way the operator L  : H 2 → H2
defined by
54 3 A General Approach to Stabilization

 = (I + LL∗ )−1
L
 : H2 → H1 is bounded.
is bounded and self-adjoint and the operator L∗ L
Before proceeding we notice a simple property of these operators.

Theorem 3.1. If f ∈ C[0, 1] and x ∈ D(L), then f (L)x ∈ D(L) and
 
f (L)Lx = Lf (L)x.
 ∈ D(L∗ L) and (I +L∗ L)Lx
Proof. For any x ∈ H1 we have Lx  = x. Therefore,
if x ∈ D(L), then

 = (I + LL∗ )LLx.
Lx = L(I + L∗ L)Lx 

We then have

LLx 
= (I + LL∗ )−1 Lx = LLx
and from this it follows that

p(L)Lx 
= Lp(L)x

for any polynomial p. If f ∈ C[0, 1], then by the Weierstrass approximation


theorem there is a sequence of polynomials {pn } that converges uniformly to
f . For x ∈ D(L) we then have


f (L)Lx 
= lim pn (L)Lx 
= lim Lpn (L)x.
n n

For any y ∈ D(L∗ ) it follows that



f (L)Lx,  y
y = limn
Lpn (L)x,

 L∗ y
= limn
pn (L)x,

 L∗ y .
=
f (L)x,

Therefore, using Theorem 2.7, we find that


 ∈ D(L∗∗ ) = D(L)
f (L)x

and
 = f (L)Lx.
Lf (L)x  


 ∗ x = L∗ f (L)x
In the same way one finds that f (L)L  for all x ∈ D(L∗ ).

 is bounded
Corollary 3.2. If f (t) = tg(t) where g ∈ C[0, 1], then Lf (L)
and
 ≤ g∞
Lf (L)
3.1 A General Method 55

Proof. If x, z ∈ D(L), then

 ∗ (Lf (L))x,

(Lf (L))   Lf (L)z
z =
Lf (L)x, 


=
f (L)Lx, 
f (L)Lz

=
L 
 1/2 g(L)Lx,  1/2 g(L)Lz
L 

 L)Lx,
=
Lg(   L)Lz
Lg( 


=
LLg(  Lg(L)z
L)x, 


=
L∗ LLg(  g(L)z .
L)x, 

But
 =I −L
L∗ LL 

and therefore,
 ∗ (Lf (L))x,

(Lf (L))   L)x,
z =
(I − L)g(  g(L)z .


But, since I − L and g(L)


 are bounded linear operators, this identity extends
to all x, z ∈ H1 , and hence
 2 ≤ I − Lg(
Lf (L)   2 ≤ (g∞ )2 .
L) 


A general stabilization procedure is suggested by the formal identity Lx =


L
LL  −1 x. Stable approximations {yα } to Lx will be formed in the following
way:
 α (L)x
yα = LLT  (3.1)
where Tα ∈ C[0, 1] for each α > 0 and the family of functions is shaped to
approximate t−1 in the following sense:

Tα (t) → t−1 as α → 0 for each t ∈ (0, 1] (3.2)

and
|tTα (t)| is uniformly bounded for α > 0, t ∈ [0, 1]. (3.3)
Note that, since LL  are both (by von Neumann’s theorem) bounded
 and Tα (L)
linear operators, the approximations yα given by (3.1) are defined for all
x ∈ H1 , not just for x ∈ D(L), and the mapping x → yα is stable. In particular
this means that the approximations
 α (L)x
yαδ = LLT  δ

are defined and stable for approximations xδ ∈ H1 to x ∈ D(L) satisfying


x − xδ  ≤ δ even if xδ ∈/ D(L). A general stable approximation scheme
56 3 A General Approach to Stabilization

for Lx then consists of a choice of a family {Tα } satisfying (3.2) and (3.3)
matched with a parameter choice strategy α = α(δ) designed to ensure that
δ
yα(δ) → Lx as δ → 0. Before treating some specific cases we establish basic
convergence and stabilization results. We consider first the case of error-free
data x.
Theorem 3.3. Suppose L : D(L) ⊆ H1 → H2 is a closed densely defined lin-
ear operator and {Tα } is a family of continuous real-valued functions defined
on [0, 1] satisfying (3.2) and (3.3).

 α (L)x
(i) For all x ∈ H1 , xα = LT  → x as α → 0.

(ii) If x ∈ D(L), then yα = Lxα → Lx as α → 0.

/ D(L), then yα  → ∞ as α → 0.


(iii) If x ∈
Proof. First note that
 α (L))x
x − xα = (I − LT 

But by (3.2) and (3.3), the function 1 − tTα (t) converges in a pointwise and
uniformly bounded manner to the function

1, t = 0
ϕ(t) =
0, t ∈ (0, 1]
 then
The spectral theorem applied to the bounded self-adjoint operator L
gives
x − xα → PN (L
) x = 0, as α → 0.
By Lemma (3.1), if x ∈ D(L), then
 α (L))x
Lx − yα = L(I − LT   α (L))Lx.
= (I − LT 

 then
The spectral theorem applied to the bounded self-adjoint operator L
gives
Lx − yα → PN (L
) Lx = 0, as α → 0.
To establish the final assertion, note that if {yα } has a bounded sequence,
then it has a weakly convergent subsequence, say yαn  w, for some sequence
αn → 0. Now yαn = Lxn where xn = LT  α (L)x
 and by the properties of {Tα }
n
and the Spectral Theorem,
xn → x − PN (L
) x = x as n→∞

But since the graph of L is closed and convex, and hence weakly closed,
xn → x and Lxn  w, and we have x ∈ D(L) and Lx = w. So if x ∈ / D(L),
then {yα } must be unbounded. 

3.1 A General Method 57

Suppose now that xδ ∈ H1 is an approximation to x ∈ D(L) satisfying


x − xδ  ≤ δ. The stability error
 α (L)x
yα − yαδ := LLT  − LLT
 α (L)x
 δ

may be estimated as follows:


 α (L)(x
yα − yαδ 2 =
L∗ LLT   α (L)(x
− xδ ), LT  − xδ )

=
(I − L)T 
 α (L)(x  α (L)(x
− xδ ), LT  − xδ )

 α (L)
≤ δ 2 ||(I − L)T  LT  α (L).


Therefore, if r(α) is a function satisfying


|(1 − t)Tα (t)| ≤ r(α) for t ∈ [0, 1] (3.4)
 α (L)
then, since LT  is uniformly bounded, we find that

yα − yαδ  ≤ δO( r(α)).
Putting these results together we obtain the following general stabilization
result:
Theorem 3.4. If x ∈ D(L) and x − xδ  ≤ δ then
 α (L)x
yαδ = LLT  δ → Lx as δ → 0

if α = α(δ) → 0 as δ → 0 in such a way that δ r(α(δ)) → 0.
Under appropriate conditions an actual rate of convergence can be obtained
in terms of a function ω(α, ν) satisfying
max |(1 − tTα (t))tν | ≤ ω(α, ν) (3.5)
t∈[0,1]

 ν ) for some ν > 0. If {Tα }


Theorem 3.5. Suppose x ∈ D(L) and Lx ∈ R(L
satisfies (3.5), then
Lx − LLT 
 α (L)x = O(ω(α, ν)).
 ν w, then
Proof. This is immediate for if Lx = L
 α (L)x
Lx − LLT  =L  ν w − LT
 α (L)
 Lν w
  
= (I − LTα (L))L w. 
ν

Note that in the particular case ν = 1 the requirement on x in the previous
theorem may be expressed simply as x ∈ D(LL∗ L) since by Lemma 2.8

D(LL∗ L) = {x ∈ D(L) : Lx ∈ R(L)}.
Also, the relaxed assumption that x ∈ D(L∗ L) leads to a special convergence
rate.
58 3 A General Approach to Stabilization

 α (L)x
Theorem 3.6. If x ∈ D(L∗ L), then Lx − LLT  = O(ω(α, 1/2)).
Proof. Write
Lx − LLT  = LSα (L)x
 α (L)x 
where Sα (t) = 1 − tTα (t). Then, on setting w = x + L∗ Lx, we find that
 α (L)x
Lx − LLT  2 =
L∗ LSα (L)x,
 Sα (L)x


 + Sα (L)w,
=
−Sα (L)x  
Sα (L)x



Sα (L)w,  Lw
Sα (L) 

 L
= Sα (L)  1/2 w2 ≤ ω(α, 1/2)2 w2 . 


3.2 Some Cases


We now illustrate the general results of the previous section on some spe-
cific stable approximate evaluation methods. In some of the examples the
stabilization parameter has a continuous range of positive values, while in
other iterative methods the role of the stabilization parameter is played by
a discrete iteration number that tends to infinity. Depending on the particu-
lar method under consideration, the stabilization parameter may take values
which approach either 0 or ∞.

3.2.1 The Tikhonov-Morozov Method

The best known stabilization procedure is what we call the Tikhonov-Morozov


method in which y = Lx is approximated by

yα = L(I + αL∗ L)−1 x (3.6)

where α is a positive stabilization parameter converging to zero.


Before taking up general results for the Tikhonov-Morozov method, we
illustrate the method on a couple of simple examples of unbounded opera-
tors that were discussed in Chapter 1. For example, if L is the Dirichlet-to-
Neumann map on the unit disk with domain
 

2 ˆ
D(L) = f ∈ L (∂D) :
2
|n| |f (n)| < ∞
2

n∈Z

defined by 
(Lf )(eiθ ) = |n|fˆ(n) exp(inθ).
n∈Z
3.2 Some Cases 59

where  2π
1
fˆ(n) = f (t)e−int dt .
2π 0
then one sees easily that
 |n|

yαδ (eiθ ) = L(I + αL L)∗ −1 δ
f (exp iθ) = fδ (n) exp (inθ).
1 + αn2
n∈Z

As another illustration consider the problem of reconstructing a source distri-


bution g in the heat problem

∂u ∂2u
= + g(x), 0 < x < π, 0 < t,
∂t ∂x2

where u(x, t) is subject to the boundary conditions

u(0, t) = u(π, t) = 0, 0<t

and initial condition


u(x, 0) = 0, 0 ≤ x ≤ π,

from knowledge of the temperature distribution f (x) = u(x, 1) at time t = 1.


We found in Chapter 1 that g = Lf where
 ∞  
 2 π
D(L) = h : m am < ∞, am =
4 2
h(s) sin msds
m=1
π 0

and  π

2 n2
Lf (x) = sin nx f (s) sin nsds.
π n=1 1 − e−n2 0

In this case the Tikhonov-Morozov approximation to g is found to be

gαδ = L(I + αL∗ L)−1 f δ


∞  π
π
= αn sin nx f δ (s) sin nsds
2 n=1 0

where     2 
n2 n2
αn = / 1+α .
1 − e−n2 1 − e−n2
Returning to the general situation, we find that since

(I + αL∗ L)−1 = ((1 − α)I + α(I + L∗ L))−1 = L((1  + αI)−1
− α)L
60 3 A General Approach to Stabilization

the Tikhonov-Morozov method fits into our general scheme if we take

Tα (t) = (α + (1 − α)t)−1 , t ∈ [0, 1].

Note that this class of functions satisfies criteria (3.2) and (3.3) and hence

L(I + αL∗ L)−1 x → Lx as α→0

for each x ∈ D(L). Also, since


1
max |(1 − t)Tα (t)| =
t∈[0,1] α

we may take r(α) = 1/α in Theorem 3.4. That is, if we set

xα = (I + αL∗ L)−1 x and xδα = (I + αL∗ L)−1 xδ

then we have:
Corollary 3.7. If x ∈ D(L) and x − xδ  ≤ δ, then

Lxα − Lxδα  ≤ δ/ α.

∈ H1 is some approximation to x ∈ D(L) satisfying x−xδ || ≤


Therefore, if xδ
δ and if δ = ◦( α(δ)), then by Theorem 3.4

L(I + α(δ)L∗ L)−1 xδ → Lx as δ → 0.

In order to apply Theorem 3.5 and obtain a convergence rate we require


an upper bound for
α(1 − t)tν
(1 − tTα (t))tν = .
α + (1 − α)t
For 0 < ν ≤ 1 we claim that
α(1 − t)tν
≤ αν for t ∈ [0, 1]. (3.7)
α + (1 − α)t
This is the same as
(1 − t)(t/α)ν
≤ 1.
1 + (1 − α)(t/α)
But setting z = t/α, this is seen to be equivalent to

z ν − αz ν+1 ≤ 1 + (1 − α)z, for z ∈ [0, 1/α]. (3.8)

The function on the left of the inequality above has a maximum of


 ν
ν 1
α1−ν
ν+1 ν+1
3.2 Some Cases 61

which is clearly not greater than 1 for 0 < ν ≤ 1 and hence for 0 < ν ≤ 1 and
0 < α < 1 inequality (3.8) holds and hence so does the bound (3.7). We may
therefore take
ω(α, ν) = αν

for the Tikhonov-Morozov method. An immediate application of Theorems


3.5 and 3.6 gives:
Corollary 3.8. If x ∈ D(L) and Lx ∈ R(L) (equivalently, x ∈ D(LL∗ L)),
then
Lx − yα  = O(α).

If x ∈ D(L∗ L), then



Lx − yα  = O( α).

A converse of the first rate in Corollary 3.8 may be of interest. We provide


two proofs of such a converse as each is instructive in its own way. In the
first converse we assume that L  is compact, that is, that LL∗ has compact

resolvent. For operators LL with compact resolvent, we show that if x ∈ D(L)
and Lx − Lxα  = O(α) as α → 0, then x ∈ D(LL∗ L). To see this suppose
the self-adjoint bounded operator L  is compact and let {uj ; λj } be a complete
orthonormal eigensystem for L.  Note that the eigenvalues {λj } lie in the
interval (0, 1]. Suppose they are ordered as:

0 < . . . ≤ λn+1 ≤ λn ≤ . . . ≤ λ2 ≤ λ1 .

Then

Lxα = LL(αI  −1 x
+ (1 − α)L)


= L(αI  −1 Lx
+ (1 − α)L)

∞ λj
=
Lx, uj uj
j=1
α + (1 − α)λj
and hence
∞ (1 − λj )2
Lx − Lxα 2 = α2 |
Lx, uj |2
j=1
(α + (1 − α)λj )2
∞
≥ α2 (1 − λ1 )2 j=1 (α + (1 − α)λj )−2 |
Lx, uj |2 .

Therefore, if Lx − Lxα  = O(α), we have




(α + (1 − α)λj )−2 |
Lx, uj |2 ≤ C
j=1
62 3 A General Approach to Stabilization

for some constant C and all α ∈ (0, 1]. In particular, all of the partial sums of
the above series are uniformly bounded by C. Letting α → 0+ in each of the
individual partial sums shows that

n
λ−2
j |
Lx, uj | ≤ C
2

j=1

∞
for each n and hence the series j=1 λ−2
j |
Lx, uj | is convergent. The vector
2



z= λ−1
j
Lx, uj uj
j=1

is therefore well-defined and




 =
Lz
Lx, uj uj = Lx,
j=1

 and hence x ∈ D(LL∗ L) in light of Lemma 2.8. We now


that is, Lx ∈ R(L)
 is compact.
drop the assumption that L

Theorem 3.9. If x ∈ D(L) and Lx − Lxα  = O(α), then x ∈ D(LL∗ L).

Proof. First note that

xα = (I + αL∗ L)−1 x ∈ D(L∗ L)

and hence
Lxα = (I + αLL∗ )−1 Lx ∈ D(LL∗ ).
and
Lxα − Lx = −αLL∗ Lxα .
Therefore, by the hypothesis

LL∗ Lxα  = O(1). (3.9)

By Theorem 2.11 we know that LL∗ is closed. This can be seen in another
way by using Theorem 2.7. Indeed, if {yn } ⊆ D(LL∗ ) satisfies, yn → y and
LL∗ yn → p, then, using Theorem (2.7), we have for any u ∈ D(LL∗ ) =
D(L∗∗ L∗ )


LL∗ u, y = limn→∞
LL∗ u, yn = limn→∞
u, L∗∗ L∗ yn

= limn→∞
u, LL∗ yn =
u, p .
3.2 Some Cases 63

Therefore, y ∈ D(LL∗ ) and LL∗ y = p, that is, LL∗ is closed. Hence the
graph of LL∗ is closed and convex, and hence weakly closed. By (3.9) there is
then a sequence αn → 0 with

LL∗ Lxαn  w

for some w. But Lxαn → Lx, by Theorem 3.3 (ii). Since the graph of LL∗ is
weakly closed, it follows that Lx ∈ D(LL∗ ) and LL∗ Lx = w. In particular,
x ∈ D(LL∗ L) as claimed.  

An application of Theorem 3.5 and Theorem 3.4 gives:


Theorem 3.10. Suppose x ∈ D(L) and Lx ∈ R(L  ν ) for some 0 < ν ≤ 1.
If x − x  ≤ δ and the stabilization parameter is chosen in the form α =
δ

Cδ 2/(2ν+1) , then
yα(δ)
δ
− Lx = O(δ 2ν/(2ν+1) ).

Note that this theorem gives a best rate of O(δ 2/3 ) for the case ν = 1. In the
next chapter we shall show that this rate is essentially best possible.

3.2.2 The Iterated Tikhonov-Morozov Method

In this section we briefly investigate an iterative stabilization method related


to the Tikhonov-Morozov method. The simplest iterative stabilization method
is suggested by the approximation

1 1 
n−1
1 − (1 − t)n
= ≈ (1 − t)j = =: Tn (t).
t 1 − (1 − t) j=0
t

It is easy to see that the family {Tn (t)} satisfies (3.2) and (3.3). Furthermore,

tTn (t) = (1 − t)tTn−1 (t) + t, T0 (t) = 0

and hence one is led via the general spectral approach to the iterative method
 n−1 + Lx
xn = (I − L)x 

or equivalently

(I + L∗ L)xn = L∗ Lxn−1 + x, x0 = 0 (3.10)

This gives, for each n, a stable approximation yn = Lxn to the value Lx.
The method (3.10) is a special case (for α = 1, see below) of the iterated
Tikhonov-Morozov method.
Approximation orders that are arbitrarily near to the order of error in the
data, O(δ), are achievable by use of iteration methods. One such method is the
64 3 A General Approach to Stabilization

iterated Tikhonov-Morozov method. A different iterative stabilization method


is studied in the next section. In the iterated Tikhonov-Morozov method a
positive parameter α is fixed and the job of stabilization is assumed by an
iteration number n (so in this case the stabilization parameter n → ∞). In
the ordinary Tikhonov-Morozov method with parameter α the value Lx is
approximated by a stable approximation Lxα where xα satisfies

(I + αL∗ L)xα = x.

In the iterated method approximations x0 , x1 , . . . are given by x0 = 0 and

(I + αL∗ L)xn = x + αL∗ Lxn−1 (3.11)

where α is a fixed positive parameter.


In the case when only an approximate data vector xδ is available satisfying
x − xδ  ≤ δ, the approximations generated in the same way with x replaced
by xδ will be denoted {xδn }. We emphasize that in this case the stabilization
parameter is the iteration number n which satisfies n → ∞, rather than the
parameter α → 0, as in the general discussion of the first section, and we trust
that this trivial modification will cause no confusion.
The iterated method (3.11) may be expressed in terms of the operator

L = (I + L∗ L)−1 by

 n = Lx
(αI + (1 − α)L)x  + α(I − L)x
 n−1 .

In other words the stabilized approximations to Lx are given by yn = Lxn


where xn = LT n (L)x
 and the functions {Tn } are defined iteratively by T0 (t) =
0 and
 
1 α(1 − t)
Tn (t) = 1+ Tn−1 (t) , n = 1, 2, . . .
(1 − α)t + α t

for t ∈ (0, 1], or equivalently


  n 
1 α(1 − t)
Tn (t) = 1− .
t (1 − α)t + α

The definition Tn (0) = 0 extends these functions continuously to [0, 1]. Note
that |tTn (t)| ≤ 1 for all n and Tn (t) → 1/t as n → ∞ for each t ∈ (0, 1],
therefore {Tn } satisfies (3.2) and (3.3) and hence the approximations satisfy
Lxn → Lx as n → ∞.
The general stability estimate requires a bound r(n) for the function

1 1−s
(1 − t)Tn (t) = (1 − (1 − s)n )
α s
3.2 Some Cases 65

where s = t/(α + (1 − α)t) ∈ [0, 1]. But note that

1−s  n
(1 − (1 − s)n ) = (1 − s)j ≤ n
s j=1

and hence we may take r(n) = n/α. We therefore have

 Suppose x ∈ D(L)δ and x − x  ≤ δ. If n = n(δ) → ∞ as


δ
Corollary 3.11.
n → ∞ while δ n(δ) → 0, then Lxn(δ) → Lx.
We shall give another proof of this result from an entirely different perspective
in the next chapter.
To obtain an error bound we notice that
 ν
s
t (1 − tTn (t)) = α (1 − s)
ν ν n
(1 − s) + sα

where again s = t/(α + (1 − α)t) ∈ [0, 1]. We may assume that α < 1 and then
 ν
s
αν (1 − s)n ≤ (1 − s)n sν
(1 − s) + sα
 n  ν
n ν
≤ = O(n−ν ).
n+ν n+ν

 ν ) for some ν > 0, we then find that


If x ∈ D(L) and Lx ∈ R(L

yn − Lx = O(n−ν ).

Combining this with the previous result gives


Corollary 3.12. If x ∈ D(L) and Lx ∈ R(L  ν ) for some ν > 0, then for
x − x  ≤ δ if the iteration parameter n = n(δ) is chosen so that n(δ) ∼
δ

δ −2/(2ν+1) we have
yn(δ)
δ
− Lx = O(δ 2ν/(2ν+1) ).
Note that the restriction ν ≤ 1 is not imposed here and hence rates arbitrarily
close to O(δ) are in principle achievable by the iterated Tikhonov-Morozov
method. In the next chapter we will study the nonstationary Tikhonov-
Morozov method in which the value of the constant α may change from one
iteration to the next.

3.2.3 An Interpolation-Based Method

Of course there are many other possible choices for the family {Tα } that
lead to stabilization methods. For example, another arises from interpolatory
function theory. Let Tn be the polynomial of degree not greater than n that
66 3 A General Approach to Stabilization

interpolates the function f (t) = 1/t at t = β1−1 , β2−1 , . . .


, βn−1 , where {βj } are
positive numbers satisfying 1 ≥ βn → 0 as n → ∞ and βj = ∞, that is
⎛ ⎞
1⎝ n
Tn (t) = 1− (1 − βj t)⎠ .
t j=1

Note that {Tn } is given iteratively by T1 (t) = β1 and

Tn+1 (t) = βn+1 + (1 − βn+1 t) Tn (t), n = 1, 2, . . . . (3.12)

It is evident that |tTn (t)| ≤ 1 and therefore {Tn } satisfies (3.3). Also,
⎛ ⎞
n n n
0≤ (1 − βj t) ≤ e−tβj = exp ⎝− βj t⎠ → 0 as n → ∞
j=1 j=1 j=1

for t ∈ (0, 1], and hence {Tn } satisfies (3.2). We therefore immediately obtain
from Theorem 3.3 that the iteratively defined sequence
" #
yn+1 = βn+1 LLx + I − βn+1 L  yn , n = 0, 1, . . . y0 = 0 (3.13)

converges to Lx for each x ∈ D(L). In order to obtain a stability result we


need a candidate for the function r(n) in equation (3.4) (again in this instance
the iteration number n plays the role of a stability parameter). This is easily
had from the iterative formulation (3.12):

|(1 − t)Tn+1 (t)| ≤ βn+1 + |(1 − t)Tn (t)|

and hence

n
max |(1 − t)Tn (t)| ≤ βj =: σn .
t∈[0,1]
j=1

From the general stability result of the previous section we then obtain

ynδ − yn  ≤ δσn1/2 .

Therefore, if the iteration parameter n = n(δ) grows at a rate controlled so


that δ 2 σn(δ) → 0 as δ → 0, then Theorem 3.4 guarantees that
δ
yn(δ) → Lx, as δ→0

where ynδ is defined as in (3.13) with x replaced by xδ satisfying x − xδ  ≤ δ.


For convergence rates we need a function ω(n, ν) satisfying (3.5). But note
that
n
0 ≤ (1 − tTn (t))tν ≤ tν e−tβj = tν e−σn t .
j=1
3.2 Some Cases 67

However, the function on the right of the inequality above achieves for t ∈ [0, 1]
a maximum value of ν ν (eσn )−ν and hence we may use
" ν #ν
ω(n, ν) = σn−ν .
e
From Theorem 3.5 we obtain:
 ν ) for some ν > 0, then
Corollary 3.13. If x ∈ D(L) and Lx ∈ R(L

yn − Lx = O(σn−ν ).

Combining the results of Theorem (3.4) and Theorem (3.5) we therefore


obtain:
Corollary 3.14. If x ∈ D(L) and Lx ∈ R(L  ν ) for some ν > 0, then for
x − x  ≤ δ if the iteration parameter n = n(δ) is chosen so that σn(δ) ∼
δ

δ −2/(2ν+1) we have
yn(δ)
δ
− Lx = O(δ 2ν/(2ν+1) ).
Under appropriate conditions one can achieve rates that are arbitrarily
close to optimal for this method by use of an a posteriori choice of the iteration
parameter rather than with an a priori choice as in Theorem 3.14. In fact,
note that ⎛ ⎞
n " #
yn = L ⎝I −  ⎠ x = Lxn
I − βj L (3.14)
j=1

where ⎛ ⎞
n "
 #
xn = ⎝I −  ⎠ x,
I − βj L x0 = 0 (3.15)
j=1

and {xδn } is defined in the same way with x replaced by xδ . The approxima-
tions {xδn } can be compared with the available data xδ in order to monitor
the convergence of {ynδ } to Lx. First note that xδn → xδ as n → ∞ and
$" # $
$  (xδ − xδ )$
xδ − xδ  = $ I − βn L
n $ ≤ xδ − xδ .
n−1 n−1

We assume that for a given constant τ > 1, the signal-to-noise ratio is not
less than τ , i.e., we assume that

xδ − xδ0  = xδ  ≥ τ δ.

There is then a first value n = n(δ) ≥ 1 of the iteration index for which

xδ − xδn(δ)  < τ δ. (3.16)

Note that this iteration number is chosen in an a posteriori manner as the


computation proceeds.
68 3 A General Approach to Stabilization

Lemma 3.15. If xn(δ) is given by (3.15) where n(δ) satisfies (3.16), then

x − xn(δ)  ≤ (τ + 1)δ.

Proof. From (3.15), using the approximate data xδ in one case and “clean”
data x ∈ D(L) in the other, we have
⎛ ⎞

n(δ)
xδn(δ) − xn(δ) = ⎝I −  ⎠ (xδ − x).
(I − βj L)
j=1

Now
x − xn(δ) = xδ − xδn(δ) + x − xδ + xδn(δ) − xn(δ)
"% #
= xδ − xδn(δ) +
n(δ)  (x − xδ ).
− βj L)
j=1 (I

 ≤ 1, we have by (3.16)
Since I − βj L

x − xn(δ)  ≤ τ δ + x − xδ  ≤ (τ + 1)δ. 


We now need an inequality.


 µ z ≤ z1/(2µ+1) L
Lemma 3.16. For µ > 0, L  µ+1/2 z2µ/(2µ+1) .

Proof. Let {Eλ } be a resolution of the identity for H2 generated by the


bounded self-adjoint operator L : H2 → H2 . By Hölder’s inequality

 µ z2 = 1 1 · λ2µ dEλ z2


L 0

"
#1/(2µ+1) "
#2µ/(2µ+1)
1 1
≤ 0
1dEλ z2 0
λ2µ+1 dEλ z2

" #2µ/(2µ+1)
 µ+1/2 z2
= z2/(2µ+1) L . 


 µ w for some w ∈ D(L), then


Lemma 3.17. If x ∈ D(L) and x = L

Lx − Lxn(δ)  = O(δ µ/(µ+1) ).

Proof. From (3.14) and lemma 3.1 we find


"% #
Lx − Lxn(δ) = L
n(δ)  L
− βj L) µ w
j=1 (I

"% #
=
n(δ)  L
− βj L)  µ Lw = L
 µ zn(δ)
j=1 (I

"% #
where zn(δ) =
n(δ)  Lw and hence zn(δ)  ≤ Lw.
− βj L)
j=1 (I
3.2 Some Cases 69

Applying the previous lemma, we find

 µ+1/2 zn(δ) 2µ/(2µ+1)


Lx − Lxn(δ)  ≤ Lw1/(2µ+1) L
(3.17)
= Lw 1/(2µ+1)  1/2
L (Lx − Lxn(δ) ) 2µ/(2µ+1)
.

 ≤ 1,
However, since x − xn(δ)  ≤ (τ + 1)δ and LL

 1/2 (Lx − Lxn(δ) )2 =


L(Lx
L  − Lxn(δ) ), Lx − Lxn(δ)

 − xn(δ) ), Lx − Lxn(δ)
=
LL(x

≤ (τ + 1)δLx − Lxn(δ) .

Therefore (3.17) gives

Lx − Lxn(δ)  = O(δ µ/(2µ+1) )Lx − Lxn(δ) µ/(2µ+1)

that is,
Lx − Lxn(δ)  = O(δ µ/(µ+1) ). 


Theorem 3.18. Suppose that x ∈ D(L) and x = L  µ w for some w ∈ D(L)


and µ > 1/2. If x ∈ H1 satisfies x − x  ≤ δ and n(δ) is chosen by (3.16),
δ δ

then
Lxδn(δ) − Lx = O(δ min((2µ−1)/(2µ),µ/(µ+1)) ).

Proof. First note that


⎛ ⎞
"
n−1 #
(xn−1 − xδn−1 ) − (x − xδ ) = − ⎝  ⎠ (x − xδ )
I − βj L
j=1

and hence by (3.16)

xn(δ)−1 − x = xδn(δ)−1 − xδ + (xn(δ)−1 − xδn(δ)−1 ) − (x − xδ )


"% " ##
≥ xδn(δ)−1 − xδ  − 
n(δ)−1  (x − xδ ) (3.18)
I − βj L
j=1

≥ τ δ − δ = (τ − 1)δ

 µ w, then
If x = L
$ $
$ n−1 # $
$ µ" $
xn−1 − x = $   w$ = O(σ −µ ).
I − βj L
$L $ n−1
$ j=1 $
70 3 A General Approach to Stabilization
−µ
However, σn /σn−1 → 1 as n → ∞ and hence σn−1 = O(σn−µ ), therefore
−µ
xn(δ)−1 − x = O(σn(δ) ).

In light of (3.18), we then have

σn(δ) = O(δ −1/µ ). (3.19)

By the general stability estimate



Lxδn(δ) − Lxn(δ)  = yn(δ)
δ
− yn(δ)  ≤ δ σn(δ) = O(δ (2µ−1)/(2µ) ).

Combining this with the previous lemma gives the result. 




We note that this result says that rates arbitrarily close to optimal may in
principle be obtained by use of the iteration number choice criterion (3.16).

3.2.4 A Method Suggested by Dynamical Systems

We begin with some heuristics and then we develop a theoretical method


for stable approximation of Lx. This method will then be used to give an-
other motivation for the iterated Tikhonov-Morozov method at the end of
this section.
Given data xδ ∈ H1 our goal is to produce a smoothed approximation
z(α) ∈ D(L) to xδ with Lz(α) → Lx as α → ∞ (again in this section we
find it convenient to reverse the direction of the stabilization parameter).
That is, if w(α) = xδ − z(α) we want w(0) = xδ and w(α) → 0 as α → ∞
in an appropriate way as δ → 0. We are only concerned with unbounded
operators L and in this case the positive unbounded operator L∗ L has an
unbounded spectrum. On the other hand the operator (L∗ L)† typically has
positive eigenvalues that converge to 0. In order to suppress high frequency
components in w one might then seek the long term trend in the solution of
dw
= −(L∗ L)† w, w(0) = xδ . (3.20)

This has the formal solution

xδ − z(α) = w(α) = exp (−(L∗ L)† α)xδ

or equivalently & '


z(α) = I − exp (−(L∗ L)† α) xδ
which in light of Theorem 2.13 suggests the definition
 α (L)x
z δ (α) = LT  δ
3.2 Some Cases 71

where ⎧1& '


⎨ t 1 − e−αt/(1−t) , t ∈ (0, 1)
Tα (t) =

0 , t = 0, 1.
Note that these functions are continuous on [0, 1] and satisfy conditions (3.2)
and (3.3) (of course, with the modification that α → ∞). While this does
not result in a computable method, the theory of the previous section applies
nevertheless. Setting s = t/(1 − t) we find that

max |(1 − t)Tα (t)| = max (1 − e−αs )/s = α


t∈[0,1] s∈[0,∞)

Therefore, we may set r(α) = α in Theorem 3.4 and hence if α = α(δ) → ∞


and δ 2 α(δ) → 0 , then Lz δ (α(δ)) → Lx as δ → 0.
With the same transformation t → s we find that for ν > 0
 ν " ν #ν
s
(1 − tTα (t))tν = e−αs ≤ sν e−αs ≤ α−ν .
s+1 e

Therefore we may take ω(α, ν) = O(α−ν ) in Theorem 3.5. Combining these


two results we obtain
 ν ) for
Corollary 3.19. Suppose x ∈ D(L) and x − xδ  ≤ δ. If Lx ∈ R(L
−2/(2ν+1)
some ν > 0 and α(δ) = Cδ then

Lz δ (α(δ)) − Lx = O(δ 2ν/(2ν+1) ).

Working formally, (3.20) suggests

dw
L∗ L = −w

or since w(α) = xδ − z(α)

dz
L∗ L = xδ − z(α). (3.21)

If we approximate the solution of the differential equation by the simple
implicit forward difference method
zn − zn−1
L∗ L = xδ − zn , z0 = 0
h
with step size h, we find on setting β = 1/h and rearranging that

(I + βL∗ L)zn = βL∗ Lzn−1 + xδ

which is the iterated Tikhonov-Morozov method. An entirely different motiva-


tion and convergence proof for this method will be given in the next chapter.
72 3 A General Approach to Stabilization

The iterated Tikhonov-Morozov method may be motivated by equation


(3.21) in a slightly different way. Specifically, from (3.21), we have
 αn+1  αn+1
∗ dz
L L dτ = (αn+1 − αm )x −δ
z(τ )dτ
αn dτ αn

or, on setting β = 1/(αn+1 − αn ), and using the right hand rule on the last
integral, we are led to
zn+1 + βL∗ Lzn+1 = xδ + βL∗ Lzn
which is the iterated Tikhonov-Morozov method. This approach suggests the
use of other closed quadrature rules on the integral above. While we do not
suggest that such rules will lead to methods that offer any computational
advantage over the Tikhonov-Morozov method, it is instructive to see how
the general theory applies to another method. For example, if the trapezoidal
rule is used we are led to the approximation
βL∗ L(zn+1 − zn ) = xδ − (zn+1 + zn )/2
or, setting γ = 2β > 0,
(I + γL∗ L)zn+1 = 2xδ + γL∗ Lzn − zn .
Equivalently,
 n (L)x
zn = LT  δ
where T0 (t) = 0 and
2 γ(1 − t) − t
Tn+1 (t) = + Tn (t), n = 0, 1, . . . .
γ(1 − t) + t γ(1 − t) + t
One finds immediately that
  n 
1 γ(1 − t) − t
Tn (t) = 1−
t γ(1 − t) + t
and from this it follows that
2
|(1 − t)Tn (t)| ≤ n.
γ
Therefore, one may take r(n) = O(n) in the general stability estimate. How-
ever, for this method the convergence analysis given previously does not
automatically apply since
γ(1 − t) − t
−1 ≤ <1
γ(1 − t) + t
for t ∈ (0, 1] with the equality holding at t = 1. Therefore, the convergence of
the spectral approximation may fail if the resolution of the identity generated
by L has a jump discontinuity at t = 1. This is equivalent to the condition
that {0} = N (I − L) = N (L) (see Lemma 2.9).
3.3 Notes 73

3.3 Notes

Lardy [34] was the first to exploit von Neumann’s theorem in applications
to series representations for the Moore-Penrose inverse of a closed unbounded
linear operator. The general spectral approach to stabilized approximate eval-
uation based on von Neumann’s theorem was introduced in [15]. Another
approach to general stabilization theory, based on the theory of regulariza-
tion and Theorem 2.10 is suggested in [21].
The best known specific instance of the general method is the Tikhonov-
Morozov method. This method was developed by V.A. Morozov and his
co-workers and is summarized in [39]. A much more extensive treatment of
this method, based on our spectral approach, emerges in the following chap-
ters. See also [40] and [22] for further developments. The line of reasoning in
the proof of Theorem 3.9 is inspired by an argument of Neubauer [41]. The
iterative stabilization method that is motivated by functional interpolation
appears in [18]; the techniques of that paper owe a lot to [27]. It would appear
that other stabilization methods based on numerical integration techniques for
initial value problems for ordinary differential equations could be developed.
In appropriate circumstances the Tikhonov-Morozov method can be
adapted to stably evaluate certain nonlinear operators A. We outline the the-
ory of Al’ber [1] for accomplishing this. Suppose that A : D(A) ⊆ H → H
is a nonlinear monotone operator defined on a subset D(A) of a real Hilbert
space H, that is

Ax − Ay, x − y ≥ 0

for all x, y ∈ D(A). Given xδ ∈ H and x ∈ D(A) with

x − xδ  ≤ δ

we wish to stably approximate Ax using the data xδ . We assume that A is


discontinuous in the usual sense, but satisfies a weak condition called hemi-
continuity, namely that A is weakly continuous along rays, that is

A(u + tv)  Au as t → 0+

when u + tv ∈ D(A) for sufficiently small nonnegative t.


Under these conditions it can be shown, by use of a fundamental result on
maximal monotone operators, that for α > 0 the nonlinear operator I + αA
has a single valued continuous inverse defined on all of H (see, e.g., [49]). The
mapping
xδ → (I + αA)−1 xδ

is therefore a stable operation. Also,


1
A(I + αA)−1 = (I − (I + αA)−1 )
α
74 3 A General Approach to Stabilization

and hence, for fixed α > 0, the operation

xδ → A(I + αA)−1 xδ = Axδα

where xδα be the unique solution of

xδα + αAxδα = xδ

is a stable operation. Let xα be the solution of the same equation using the
“clean” data x:
xα + αAxα = x.
The goal is to show that Axδα → Ax if α = α(δ) → 0 in some appropriate
sense as δ → 0. First, we show that Axα → Ax as α → 0.
Note that, by the monotonicity of A

0 ≤
Axα − Ax, xα − x

= −α
Axα − Ax, Axα

≤ −αAxα 2 + αAxAxα 

and hence
Axα  ≤ Ax.
From the definition of xα we then have

xα − x = αAxα  = O(α)

and hence xα → x as α → 0.
Suppose v ∈ H is arbitrary and t ≥ 0. Then

0 ≤
Axα − A(x + tv), xα − (x + tv)

=
Axα , xα − x − t
Axα , v −
A(x + tv), xα − x − tv .

Since Axα  is bounded, for any sequence αn → 0, there is a subsequence,


which we again denote by αn , and a y ∈ H such that

Axαn  y.

Therefore, taking limits as αn → 0 above, and using the fact that xαn → x,
we arrive at

0 ≤ −
y, v +
A(x + tv), v =
A(x + tv) − y, v .

By the hemicontinuity of A we then have

0 ≤
Ax − y, v
3.3 Notes 75

for any v ∈ H. Therefore, Ax = y, that is,

Axα  Ax as α → 0.

However, since Axα  ≤ Ax, it follows from the weak lower semicontinuity
of the norm that Axα  → Ax and hence

Axα → Ax as α → 0.

The convergence of {Axδα } will now be established. First, we have

Axδα − Ax ≤ Axδα − Axα  + Axα − Ax

and
Axδα − Axα  = α−1 (xδ − xδα ) + (xα − x)
(3.22)
≤ δ/α + α−1 xα − xδα .
But, since
xδα − xα + α(Axδα − Axα ) = xδ − x,
one finds, using the monotonicity of A,

xδα − xα 2 ≤ xδα − xα 2 +
Axδα − Axα , xδα − xα

=
xδ − x, xδα − xα ≤ δxδα − xα 

and hence xδα − xα  ≤ δ. From (3.22), we then obtain

Axδα − Axα  ≤ 2δ/α.

Therefore, if α = α(δ) → 0 in such a way that δ/α → 0, then

Axδα − Axα → 0.

But, as previously established, Axα → Ax and hence

Axδα → Ax.
4
The Tikhonov-Morozov Method

Though this be madness, yet there is method in it.


Shakespeare

In this chapter we treat the Tikhonov-Morozov method and some of its


variants in greater detail. We show that the O(δ 2/3 ) asymptotic order of
approximation (for ν = 1) given in Corollary 3.10 is essentially best possi-
ble. Variational characterizations of the method and its iterated version are
given, and an alternative convergence proof for the iterated method, based
on the Alternating Projection Theorem, is presented. Finally the theory of a
nonstationary version of the iterated method is developed.

4.1 The Tikhonov-Morozov Method


We begin with a variational characterization of the Tikhonov-Morozov method.
To that end we note that for a given fixed positive number α the product space
H1 × H2 packaged with the inner product [·, ·] defined by

[(x, y), (u, v)] =


x, u + α
y, v

and associated norm | · | satisfying

|(x, y)|2 = x2 + αy2

is a Hilbert space. Also, if L : D(L) ⊆ H1 → H2 is a closed linear operator


then the graph of L

G(L) = {(x, Lx) : x ∈ D(L)} ⊆ H1 × H2

is a closed subspace of H1 × H2 . Therefore, the orthogonal projector P of


H1 × H2 onto G(L) is a bounded linear operator. Given xδ ∈ H1 , the vector
78 4 The Tikhonov-Morozov Method

P(xδ , 0) ∈ G(L) is therefore uniquely determined and satisfies


|P(xδ , 0) − (xδ , 0)|2 = minz∈D(L) |(z, Lz) − (xδ , 0)|2

= minz∈D(L) Φα (z; xδ )

where Φα (·; xδ ) is the functional defined on D(L) by


Φα (z; xδ ) = z − xδ 2 + αLz2 .
It turns out that the minimizer of this functional is the Tikhonov-Morozov
approximation xδα of the previous chapter.
Theorem 4.1. For any xδ ∈ H1 the functional Φα (·; xδ ) has a unique min-
imizer in D(L). This minimizer lies in D(L∗ L) and is given by xδα =
(I + αL∗ L)−1 xδ .
Proof. The existence and uniqueness of the minimizer xδα of Φα (·; xδ ) over
D(L) is proved above, namely, (xδα , Lxδα ) = P(xδ , 0). For any z ∈ D(L) we
therefore have
d
Φα (xδα + tz; xδ )|t=0 = 0.
dt
But
Φα (xδα + tz, xδ ) =
xδα − xδ + tz, xδα − xδ + tz

+ α
Lxδα + tLz, Lxδα + tLz
and hence the previous condition is equivalent to

xδα − xδ , z = −α
Lxδα , Lz
for all z ∈ D(L). Therefore, Lxδα ∈ D(L∗ ), that is, xδα ∈ D(L∗ L), and
(I + αL∗ L)xδα = xδ .
We note that (I + αL∗ L)√has a bounded inverse by von Neumann’s theorem
(applied to the operator αL). Therefore, the unique minimizer of Φα (·; xδ )
is given by
xδα = (I + αL∗ L)−1 xδ . 

Besides providing additional insight into the Tikhonov-Morozov method,
this theorem suggests the possibility of computing the Tikhonov-Morozov ap-
proximation by a finite element method. This theme will be developed later.
In Theorem 3.8 it was shown that with exact data x ∈ D(L) a convergence
rate of O(α) is achievable with the Tikhonov-Morozov method. We now show
that, except in a trivial case, this rate is best possible. We will denote the
Tikhonov-Morozov approximation with exact data x ∈ D(L) by xα . That is,
xα = (I + αL∗ L)−1 x.
First we note that the order of approximation O(α) is essentially best possible.
4.1 The Tikhonov-Morozov Method 79

Theorem 4.2. If x ∈ D(L) and Lx − Lxα  = o(α), then x ∈ N (L).

Proof. Define eα by

eα = Lx − Lxα = Lx − LL[αI  −1 x
+ (1 − α)L]


= {I − L[αI  −1 }Lx,
+ (1 − α)L]

and hence
 α = α(I − L)Lx.
[αI + (1 − α)L]e 

 ≤ 1 and so, if eα  = o(α), we find that


But αI + (1 − α)L

(I − L)Lx =0
 = D(LL∗ ) (see Lemma 2.8). Furthermore,
and therefore, Lx ∈ R(L)

LL∗ Lx = (I + LL∗ )(I − L)Lx = 0.
But then
L∗ Lx2 =
LL∗ Lx, Lx = 0
and hence Lx ∈ R(L) ∩ N (L∗ ) = {0}, i.e., x ∈ N (L). 


This result raises the question whether the rate O(δ 2/3 ) obtained in Corol-
lary 3.10 is also best possible when dealing with inexact data. It transpires
that this is the case, at least for the very important class of operators L for
which L∗ L has compact resolvent, that is, those operators L for which L  is
compact.
To see that the rate O(δ 2/3 ) can not be improved, we begin with an
estimate for the stabilization parameter.
& 2'
Lemma 4.3. If x ∈ / N (L) and Lxδα − Lx = o δ 3 for all xδ satisfying
& 2 '
x − xδ  ≤ δ, then α = o δ 3 .

Proof. To see this, let xδ = x − δu, where u is a unit vector and let eδα =
Lxδα − Lx. Then
" #
 δ = [αI + (1 − α)L]
[αI + (1 − α)L]e  LL(αI
 + (1 − α)  −1 x − Lx
L)
α

 L(αI
− δ[αI + (1 − α)L]L   −1 u
+ (1 − α)L)

 − I)Lx − δLLu.
= α(L 
& 2'
Since eδα  = o δ 3 , by assumption, and since
& '
 ≤ δLL
δLLu  = o δ 23 ,
80 4 The Tikhonov-Morozov Method

we find that
α  − I)Lx → 0, as δ → 0.
2 (L
δ3
" #
But, by assumption, x ∈  − I)L) and hence α = o δ 23 . 
/ N (L) = N ((L 

"We2
#now show that for a wide class of operators the order of convergence
O δ 3 can not be improved. We only consider the important class of opera-
tors L∗ L which have a divergent sequence of eigenvalues. Such is the case if
L is the derivative operator, when −L∗ L is the Laplacian operator, or more
generally whenever L is a differential operator for which L is compact, i.e.,

when L L has compact resolvent.
Theorem 4.4. Suppose x ∈ D(L) and L  is compact. If Lx − xδ  = o(δ 2/3 )
α
for all xδ ∈ H1 satisfying x − xδ  ≤ δ, then x ∈ N (L).
 has a sequence {λn } of eigenvalues
Proof. The spectral theorem implies that L

satisfying λn → 0 and hence
" L# L has a sequence of eigenvalues µn = 1/λn −
2
1 → ∞. If Lxδα −Lx = o δ 3 for all xδ with x−xδ  ≤ δ, then we will show
& 2'
that x ∈ N (L). Indeed, if x ∈
/ N (L), then, as we have just shown, α = o δ 3 .
Let eδα = Lxδα − Lx, then

eδα 2 = Lxα − Lx2 + 2


Lxα − Lx, Lxδα − Lxα + Lxδα − Lxα 2

and by hypothesis
4
Lxα − Lx2 /δ 3 → 0
as δ → 0 (since xδ = x satisfies x − xδ  ≤ δ). Therefore, we must have

2
Lxα − Lx, Lxδα − Lxα + Lxδα − Lxα 2
4 → 0 as δ → 0.
δ3
Suppose that {un } are orthonormal eigenvectors of L∗ L associated with {µn }.
 associated with the eigenvalues
Then {un } are eigenvectors of L

λn = 1/(1 + µn )

and λn → 0 as n → ∞. Now let xδ = x + δun . Then

Lxδα − Lxα 2


= δ 2
L(αI  −1 un , L∗ LL(αI
+ (1 − α)L)   −1 un
+ (1 − α)L)

= δ 2 λ2n µn (α + (1 − α)λn )−2

= δ 2 λn (1 − λn )(α + (1 − α)λn )−2 .


4.2 The Discrepancy Criterion 81
3
Therefore, if δ = δn = λn2 , then δn → 0 as n → ∞ and
 −2
Lxδαn − Lxα 2 α
4 = (1 − λn ) 2 +1−α → 1 as n → ∞.
δn3 δn3
Finally, we have

|
Lxα − Lx, Lxδαn − Lxα | Lxα − Lx Lxδαn − Lxα 
4 ≤ 2 2 → 0.
δn
3
δn3 δn3

This contradiction establishes that x ∈ N (L) and hence that the order O(δ 2/3 )
is essentially best possible. 


4.2 The Discrepancy Criterion

The convergence rates discussed thus far for the Tikhonov-Morozov method
were based on a priori choices for the parameter α. We now investigate an
a posteriori method of choosing α, due to V.A. Morozov, which depends on
monitoring the quantity xδα −xδ . The basic philosophy of the method is that
this “discrepancy” quantity should be of the same order as the error level in
the data. Throughout we make the reasonable assumption that there is more
signal than noise in the data, that is, that the signal-to-noise ratio, xδ /δ, is
greater than one:
x − xδ  ≤ δ < xδ . (4.1)
As the discrepancy method requires observation of xδα − xδ , we first note
that

xδα − xδ = L(αI  −1 xδ − xδ
+ (1 − α)L)
(4.2)
 −1 [α(L
= (αI + (1 − α)L)  − I)xδ ].

Therefore, if {Eλ }λ∈[0,1] is the resolution of the identity for H1 generated by


 then
the self-adjoint linear operator L,
 1
α2 (1 − λ)2
xδα − xδ 2 = dEλ xδ 2 .
0 (α + (1 − α)λ) 2

But note that



α2 (1 − λ)2 1, λ = 0
→ as α→0
(α + (1 − α)λ)2 0, λ ∈ (0, 1]

in a uniformly bounded manner, and hence the function

f (α) = xδα − xδ 2
82 4 The Tikhonov-Morozov Method

is continuous and satisfies

lim f (α) = PN (L


) x  = 0.
δ 2
α→0+

Also,
α2 (1 − λ)2
→ 1 as α→∞
(α + (1 − α)λ)2
and therefore,
lim f (α) = xδ 2 > δ 2
α→∞

by (4.1). Furthermore, f is an increasing function of α and hence there is a


unique α = α(δ) satisfying

xδα(δ) − xδ  = δ. (4.3)

The a posteriori choice of the parameter α by the criterion (4.3) is called the
choice by the discrepancy method.

Lemma 4.5. If x ∈
/ N (L) and α is chosen by the criterion (4.3), then α(δ) =
O(δ).

Proof. By (4.2)
 δ − xδ ) = α(δ)(L
[αI + (1 − α)L](x  − I)xδ
α(δ)

and therefore by (4.3)


 − I)xδ  ≤ δ/α(δ).
(L
 − I), we have
/ N (L) = N (L∗ L) = N (L
Since x ∈
 − I)x ≤ lim inf δ/α(δ)
0 < (L
δ→0+

and hence the result. 




We next prove the convergence of the Tikhonov-Morozov method when


the parameter is chosen by the discrepancy principle. Recall that

xδα = argminz∈D(L) Φα (z; xδ ) = xδ − z2 + αLz2 .

Therefore, if α = α(δ) is chosen according to criterion (4.3), then

δ 2 + α(δ)Lxδα(δ) 2 = Φα(δ) (xδα(δ) ; xδ )

≤ Φα(δ) (x; xδ ) = δ 2 + α(δ)Lx2

and hence
Lxδα(δ)  ≤ Lx. (4.4)
4.2 The Discrepancy Criterion 83

Theorem 4.6. If x ∈ D(L) and α(δ) is chosen according to (4.3), then


xδα(δ) → x and Lxδα(δ) → Lx as δ → 0.

Proof. The statement is equivalent to the convergence of xδα(δ) to x in the


graph norm of L, that is,

|(xδα(δ) , Lxδα(δ) ) − (x, Lx)| → 0

where the norm on the product space H1 × H2 satisfies

|(u, v)|2 = u2 + v2 .

To establish this it suffices to show that every sequence {δn } of positive num-
bers converging to zero has a subsequence, which for notational simplicity we
designate {δm }, satisfying

xδα(δ
m
m)
→ x and Lxδα(δ
m
m)
→ Lx.

First note that

xδα(δ) − x ≤ xδα(δ) − xδ  + xδ − x = 2δ

and hence xδα(δ) → x as δ → 0. By (4.4) {Lxδα(δn ) } is bounded and hence any


sequence δn → 0 has a subsequence {δm } satisfying

Lxδα(δ
m
m)
y

for some y ∈ H2 . However, G(L) is weakly closed (being closed and convex)
and hence y = Lx, i.e.,

xδα(δ
m
m)
→ x and Lxδα(δ
m
m)
 Lx.

But then,

Lxδα(δ
m
m)
, Lx → Lx2
and hence by the Cauchy-Schwarz inequality and (4.4),

Lx ≤ lim inf Lxδα(δ


m
m)
 ≤ lim sup Lxδα(δ
m
m)
 ≤ Lx

that is, Lxδα(δ


m
m)
 → Lx. Therefore, Lxδα(δ
m
m)
→ Lx, and hence, as this is
true for any subsequence, Lxδα(δ
n
n)
→ Lx. 


Under appropriate conditions on the true data x, one can prove a conver-
gence rate for the Tikhonov-Morozov method with parameter chosen by the
discrepancy principle.
Theorem 4.7. If x ∈ D(L∗ L), and if α(δ) is chosen by (4.3), then

Lxδα(δ) − Lx = O( δ).
84 4 The Tikhonov-Morozov Method

Proof. By (4.4) and (4.3)

Lxδα(δ) − Lx2 = Lxδα(δ) 2 − 2


Lxδα(δ) , Lx + Lx2

≤ 2
Lx − Lxδα(δ) , Lx

= 2
x − xδα(δ) , L∗ Lx

≤ 4δL∗ Lx. 


The order of approximation given in this theorem can not be improved


 is compact, then there is an x ∈ D(L∗ L) such that
in general. In fact, if L √
Lxα(δ) − Lx is not of order o( δ) under the stated conditions. Indeed,
δ

let {µn } be a sequence of eigenvalues of L∗ L with µn → ∞ and let {un }


be an associated sequence of orthonormal eigenvectors. Let x = u1 and let
xδn = x + δn un , where {δn } is a sequence of positive numbers converging
to zero which will be specified later. Note that the signal-to-noise criterion is
 associated
satisfied. Let λn = 1/(1+µn ) and note that λn is an eigenvalue of L
with the eigenvector un . Then

Lxδαn − Lx2 = LL(αI  −1 − Lu1 2
+ (1 − α)L)


=
L∗ L[L(αI  −1 (u1 + δn un ) − u1 ],
+ (1 − α)L)


L(αI  −1 (u1 + δn un ) − u1
+ (1 − α)L)

=
(λ1 µ1 /(α + (1 − α)λ1 ) − µ1 )u1 + (δn λn µn /(α + (1 − α)λn ))un ,

(λ1 /(α + (1 − α)λ1 ) − 1)u1 + (δn λn /(α + (1 − α)λn ))un

= (1 − λ1 )2 α2 µ1 (α + (1 − α)λ1 )−2 + δn2 λ2n µn (α + (1 − α)λn )−2

≥ δn2 λ2n µn (α + (1 − α)λn )−2 = (α/δn + (1 − α)λn /δn )−2 λ2n µn .



Now let δn = µn /(1 + µn )2 . If Lxδα(δ
n
n)
− Lx|| = o( δn ), then

(α/δn + (1 − α)λn /δn )−2 λ2n µn /δn

= (α(δn )/δn + (1 − α(δn ))λ/δn )−2 → 0

as n → ∞. However, by Lemma 4.5, α(δn )/δn is bounded and hence λn /δn →


∞. But,
λn 1
=1+ →1
δn µn
which is a contradiction.
4.2 The Discrepancy Criterion 85

This result shows that the rate in Theorem 4.7 can not be generally
improved, but if we make a more specific requirement on x and modify
the choice criterion (4.3) appropriately, then the best possible rate for the
Tikhonov-Morozov method is achievable by use of a modified discrepancy
principle .
We suggest the following modified discrepancy principle: choose α = α(δ)
to satisfy
xδα(δ) − xδ  = δ 2 /α2 . (4.5)

First note that for fixed δ > 0 the function


 1
α6 (1 − λ)2
f (α) = α4 xδα − xδ 2 = dEλ xδ 2 ,
0 (α + (1 − α)λ)
2

where {Eλ } is a resolution of the identity generated by the bounded self-


 is continuous, increasing, f (0) = 0, and limα→∞
adjoint linear operator L,
f (α) = ∞. Therefore there is a unique α = α(δ) satisfying (4.5).

Lemma 4.8. Suppose x ∈ D(L∗ L) and x ∈ / N (L). If α(δ) is given by (4.5),


then δ 2 /α(δ)3 → L∗ Lx > 0 as δ → 0.

Proof. To simplify the typography we simply write α for α(δ) in this proof.
Since
 δα − xδ ) = α(L
[αI + (1 − α)L](x  − I)xδ ,

we have
 − I)xδ  ≤ xδ − xδ  = δ 2 /α2
α(L α
 − I)xδ  → (L
and since (L  − I)x > 0 as δ → 0, we find that α → 0 as
δ → 0, and hence xα − x → 0. We then have

xδα − x ≤ xδα − xα  + xα − x

= (I + αL∗ L)−1 (xδ − x) + xα − x

≤ δ + xα − x → 0 as δ → 0.

But then, by (4.5)

δ 2 /α2 = xδα − xδ  ≤ xδα − x + δ → 0 as δ → 0.

Next, note that since α → 0 as δ → 0



L∗ Lxα − L∗ Lx = (L(αI  −1 − I)L∗ Lx → 0
+ (1 − α)L) (4.6)

as δ → 0. Also,

L∗ L(xδα − xα ) = (I − L)(αI  −1 (xδ − x)
+ (1 − α)L)
86 4 The Tikhonov-Morozov Method

and hence

L∗ L(xδα − xα ) ≤ I − Lδ/α → 0 as δ → 0.

In light of (4.6) it follows that L∗ Lxδα → L∗ Lx as δ → 0. However, xδ − xδα =


αL∗ Lxδα and hence by (4.5)

δ2
= L∗ Lxδα  → L∗ Lx
α3
as δ → 0. 


We can now prove the promised order of approximation result for the
modified discrepancy principle. Recall from Lemma 2.8 that the condition

x ∈ D(LL∗ L) is equivalent to x ∈ D(L) and Lx ∈ R(L).

Theorem 4.9. If x ∈ D(LL∗ L), x ∈


/ N (L) and α(δ) satisfies (4.5), then
Lxδα(δ) − Lx = O(δ 2/3 ).

Proof. By the lemma, α(δ) ∼ δ 2/3 , so from Theorem 3.4 and Corollary 3.8 we
have
Lxδα(δ) − Lx ≤ Lxδα(δ) − Lxα(δ)  + Lxα(δ) − Lx

≤ δ/ α(δ) + O(α(δ)) = O(δ 2/3 ). 


4.3 Iterated Tikhonov-Morozov Method


Recall that in the iterated Tikhonov-Morozov method α is a fixed positive
parameter and approximations {xδn } are generated iteratively by

 δ = Lx
(αI + (1 − α)L)x  δ + α(I − L)x
 δ , xδ0 = 0
n n−1

or equivalently
(I + αL∗ L)xδn = xδ + αL∗ Lxδn−1 . (4.7)
We have seen that the ordinary Tikhonov-Morozov method has a varia-
tional interpretation. In a similar way the iterated Tikhonov-Morozov method
may be viewed as a multi-stage optimization procedure in which each iterate
serves to stabilize the next iterate. Specifically, it is easy to see that xδn is the
unique minimizer of

Ψn (z; xδ ) = z − xδ 2 + αLz − Lxδn−1 2 ,

over D(L). That is, xδ0 = 0 and

xδn = argminz∈D(L) Ψn (z; xδ )


4.3 Iterated Tikhonov-Morozov Method 87

which, by the same line of argument used in the analysis of the ordinary
Tikhonov-Morozov method, is equivalent to (4.7).
Again it is instructive to interpret the iterated method in the product
Hilbert space H = H1 × H2 with the norm | · | satisfying

|(u, v)|2 = u2 + αv2 .

One sees immediately that (xδn , Lxδn ) is the point in the graph G(L) ⊆ H
which is nearest (in the norm | · |) to the point (xδ , Lxδn−1 ) ∈ H.
The basic convergence theory of the iterated Tikhonov-Morozov method
was developed in the previous chapter. The basis of that theoretical develop-
ment was von Neumann’s theorem and the spectral theorem. We now present
an alternative convergence proof grounded on an entirely different idea – the
alternating projection theorem. To do so we formulate the iterated method
in the product Hilbert space H. We begin the discussion by assuming that
the data is the error-free vector x ∈ D(L). Let V1 = {x} × H2 ⊆ H and
V2 = G(L) ⊆ H. Note that both of the sets V1 and V2 are closed in H and
affine (V2 is of course a closed subspace of H since L is closed). Also, the
condition x ∈ D(L) is equivalent to V1 ∩ V2 = ∅.
The iterated method with exact data x ∈ D(L) generates a sequence {xn }
with x0 = 0 and

xn = argminz∈D(L) z − x2 + αLz − Lxn−1 2 , n = 1, 2, 3, . . .

In particular

x1 = argminz∈D(L) z − x2 + αLz − L02

= argminz∈D(L) z − x2 + αLz − 02

that is,
(x1 , Lx1 ) = P2 (x, 0) = P2 P1 (0, 0)
where P2 is the projection of H onto V2 = G(L) and P1 is the projection of
H onto the closed affine set V1 = {x} × H2 . Then P1 (x1 , Lx1 ) = (x, Lx1 ).
Furthermore,
x2 = argminz∈D(L) |(z, Lz) − (x, Lx1 )|2
and hence
(x2 , Lx2 ) = P2 (x, Lx1 ) = P2 P1 (x1 , Lx1 ).
Continuing in this manner we see that

(xn , Lxn ) = (P2 P1 )n (0, 0).

By the alternating projection theorem we have

(xn , Lxn ) → PV1 ∩V2 (0, 0) as n→∞


88 4 The Tikhonov-Morozov Method

where the convergence is in the Hilbert space H, if and only if V1 ∩ V2 = ∅,


that is, if and only if x ∈ D(L), in which case V1 ∩ V2 = {(x, Lx)}. To sum
up: if x ∈ D(L), then

xn → x and Lxn → Lx as n → ∞.

We note that the iterated method converges in the graph norm of L to


(x, Lx) for any initial approximation x0 ∈ D(L∗ L), not just for x0 = 0. Indeed,
one sees that if

(I + αL∗ L)xn = x + αL∗ Lxn−1 , x0 ∈ D(L∗ L)

then, if x ∈ D(L)

(xn , Lxn ) = (P2 P1 )n (x0 , Lx0 )

→ PV1 ∩V2 (x0 , Lx0 ) = (x, Lx).

Suppose now that the available data xδ ∈ H1 satisfies xδ − x ≤ δ where


x ∈ D(L). Let P1δ be the projection of H onto V1δ = {xδ } × H2 . Then

(xδn , Lxδn ) = P2 P1δ (xδn−1 , Lxδn−1 ), xδ0 = 0.

Also,

αLxδn − Lxn 2 ≤ |(xδn , Lxδn ) − (xn , Lxn )|2

= |P2 P1δ (xδn−1 , Lxδn−1 ) − P2 P1 (xn−1 , Lxn−1 )|2

= |P2 P1δ (xn−1 , Lxδn−1 ) − P2 P1 (xn−1 , Lxn−1 )|2

≤ |P1δ (xn−1 , Lxδn−1 ) − P1 (xn−1 , Lxn−1 )|2

= |(xδ , Lxδn−1 ) − (x, Lxn−1 )|2

= xδ − x2 + αLxδn−1 − Lxn−1 2

≤ δ 2 + αLxδn−1 − Lxn−1 2

and therefore
Lxδn − Lxn 2 ≤ nδ 2 /α.
We have now reproduced, from a different perspective, the result of
Corollary 3.11, namely

Lxδn(δ) → Lx if δ n(δ) → 0 as δ → 0.
4.4 The Nonstationary Method 89

4.4 The Nonstationary Method


In Section 3.2.4 the iterated Tikhonov-Morozov method was motivated in
terms of an application of an implicit forward difference scheme to the initial
value problem (we first consider “clean” data x ∈ D(L))
dz
L∗ L = x − z(α), z(0) = 0

Rather than using a fixed step size in the difference scheme one might consider
an adaptive scheme in which the step size changes from step to step. That is,
one might generate approximation {xn } by
xn − xn−1
L∗ L = x − xn , x0 = 0
hn
where hn are positive parameters. Setting αn = 1/hn this yields

(I + αn L∗ L)xn = x + αn L∗ Lxn−1

or equivalently
 n = Lx
(αn I + (1 − αn )L)x  + αn (I − L)x
 n−1 . (4.8)

This is just a nonstationary version of the Tikhonov-Morozov method in which


the stabilization parameter changes with the step. We will assume throughout
that {αj } ⊂ (0, 1]. The nonstationary method comes under the umbrella of the
 n (L)
general considerations of the previous chapter when we note that xn = LT 
where T0 (t) = 0 and

(αn + (1 − αn )t)Tn (t) = 1 + αn (1 − t)Tn−1 (t). (4.9)

In the interest of tidying up the notation we will set

βj (t) = αj + (1 − αj )t.

Note that βj (t) ∈ [t, 1] for all t ∈ [0, 1]. With this notation

βn (t)Tn (t) = 1 + αn (1 − t)Tn−1 (t)

and hence
t αn (1 − t)
tTn (t) = + (tTn−1 (t))
βn (t) βn (t)
 
t t
= + 1− (tTn−1 (t)).
βn (t) βn (t)
Since t/βn (t) ∈ [t, 1] ⊆ [0, 1] we find that

|tTn (t)| ≤ max{1, |tTn−1 (t)|}


90 4 The Tikhonov-Morozov Method

and, as T0 (t) = 0, it follows that

|tTn (t)| ≤ 1 (4.10)

for all n and all t ∈ [0, 1]. Therefore, {Tn (t)} satisfies condition (3.2).
Also, since
Tn (t) = 1/βn (t) + (1 − t/βn (t))Tn−1 (t)
one finds that
  
1 t 1
Tn (t) − = 1− Tn−1 (t) −
t βn (t) t
and hence   n  

Tn (t) − 1  1  t
 = 1−
t t j=1 βj (t)

for t ∈ (0, 1]. Since {αj } is bounded, there is a subsequence {αjk } converging
to some number α ∈ [0, 1]. It then happens that
1 t
→ = 0
βjk (t) α + (1 − α)t

for t ∈ (0, 1] and hence



 t
= ∞.
j=1
βj (t)

So we then find that


  n  

Tn (t) − 1  1  t
 = 1− →0
t t j=1 βj (t)

as n → ∞, for all t ∈ (0, 1]. Therefore, {Tn (t)} satisfies condition (3.3).
As a consequence of Theorem (3.3) we have:
Corollary 4.10. If {αj } ⊆ (0, 1] and x ∈ D(L), then Lxn → Lx, where {xn }
is the sequence generated by the nonstationary method (4.8).
Suppose now that x−xδ  ≤ δ and denote by {xδn } the sequence generated
by nonstationary method with data xδ , that is,
 δn = Lx
(αn I + (1 − αn )L)x  δ + αn (I − L)x
 δn−1

Since  
1−t t
(1 − t)Tn (t) = + 1− ((1 − t)Tn−1 (t))
βn (t) βn (t)
and 0 ≤ (1 − t)/βj (t) ≤ 1/αj for t ∈ [0, 1], we have

|(1 − t)Tn (t)| ≤ 1/αn + |(1 − t)Tn (t)|


4.4 The Nonstationary Method 91

and hence

n
|(1 − t)Tn (t)| ≤ σn := αj−1 .
j=1

In other words, we may use r(n) = σn in Theorem 3.4 to obtain:



Corollary 4.11. If n = n(δ) → ∞ while δ σn(δ) → 0 as δ → 0, then
Lxδn(δ) → Lx as δ → 0.

To capture an order of convergence result we will need a bound ω(n, ν)


satisfying
|(1 − tTn (t))tν | ≤ ω(n, ν) for ν > 0.
From the definition of {Tn (t)} we have
 
t
1 − tTn (t) = 1 − (1 − tTn (t))
βn (t)

which gives %n
(1 − tTn (t))tν = tν j=1 (1 − t/βj (t))
%n αj
= (1 + s)−ν sν j=1
αj + s
%n αj
≤ sν j=1
αj + s
where s = t/(1 − t) ∈ [0, ∞). We now need some numerical estimates that
result from a close analysis of the functions

n
αj
f (s) = sν , s ∈ [0, ∞)
α
j=1 j
+s

where ν > 0. We are interested only in the case n → ∞, so we shall assume


that n ≥ ν. Note that f (s) is nonnegative and
  n
n
s  αj

f (s) = s ν−1
ν− .
αj + s j=1 αj + s
k=1

Therefore, f  (s) = 0 if and only if


n
1
g(t) := =ν (4.11)
1 + αk t
k=1

where t = 1/s. Now g is continuous, strictly decreasing on [0, ∞), and

g(0) = n ≥ ν > 0 = g(∞)


92 4 The Tikhonov-Morozov Method

and hence the equation (4.11) has a unique positive solution which we will
denote by t1 . Also, since f (0) = 0 = f (∞), we have

max f (s) = f (t−1 −ν


1 ) ≤ t1 . (4.12)
s∈[0,∞)

The negative solutions of the equation (4.11), which we will denote by


t2 , t3 , . . . , tn are separated by the vertical asymptotes t = αj−1 of the func-
tion g defined in equation (4.11) and hence


n 
n−1
− tj ≥ αj−1 = σn−1 . (4.13)
j=2 j=1

Next we characterize the sum of the solutions of (4.11).


n 1−ν
Lemma 4.12. i=1 ti = σn .
ν
Proof. Equation (4.11) may be written in the equivalent form

n 
n
1 
n
(t + αj−1 ) − (t + αj−1 ) = 0.
j=1
ναk j=1
k=1
j=k

But the sum of the roots of a monic polynomial is the negative of the next-
to-highest order coefficient, and hence

n 
n 
n
1 −1 1−ν
ti = − αi−1 + α = σn . 

i=1 i=1
ν k ν
k=1
" #ν
Lemma 4.13. If 0 < ν < 1, then maxs∈[0,∞) f (s) ≤ ν
1−ν σn−ν .

Proof. Since t2 , t3 , . . . , tn are negative, Lemma 4.12 gives



n
1−ν
t1 ≥ ti = σn
i=1
ν

and hence  −ν


1−ν
t−ν
1 ≤ σn−ν
ν
and the result follows from inequality (4.12). 


The case ν ≥ 1 requires and additional condition on the parameters {αj }. We


shall assume that there is a positive constant c such that
1
≤ cσn−1 (4.14)
αn
4.4 The Nonstationary Method 93

for all n sufficiently large. Note that this is certainly the case for the stationary
method (αj = α), the geometric choice of parameters (αj = q j−1 α, q < 1),
and the harmonic choice (αj = 1/j) and many other parameter sequences.
While this condition may appear ad hoc, it is in fact a necessary condition for
an estimate of the form

max f (s) = O(σn−ν )


s∈[0,∞)

when ν > 1.
Theorem 4.14. If ν > 1 and f (s) ≤ cν σn−ν for some cν > 0 and all n, then
{αj } satisfies (4.14) for some c > 0.

Proof. Let
k  
αj
pk (s) =
j=1
αj + s

for s ≥ 0. Then, pn−1 (s) > 0, pn−1 (0) = −σn−1 , and pn−1 (0) = 1. Therefore,

pn−1 (s) ≥ 1 − σn−1 s, for s ≥ 0.

And hence,
s
cν σn−ν ≥ sν pn (s) = αn sν−1 pn−1 (s)
αn + s
αn ν
≥ s (1 − σn−1 s), for s ≥ 0.
αn + s
−1
In particular, we find on setting s = 12 σn−1 that
αn σn−1 ν −ν
cν σn−1 ≥ (1/2) σn−1
1 + 2αn σn−1

and therefore,
 ν  ν
αn σn−1 σn−1 αn σn−1
≤ 2ν cν ν
= 2 cν .
1 + 2αn σn−1 σn 1 + αn σn−1

We then have
 ν−1
1 αn σn−1 αn σn−1
< ≤ cν ≤ 2ν cν (αn σn−1 )ν−1
2 1 + 2αn σn−1 1 + αn σn−1

and hence
1
≤ cν1/(ν−1) 2(ν+1)/(ν−1) σn−1 ,
σn
giving the result. 

94 4 The Tikhonov-Morozov Method

Before returning to our development of a rate of convergence result, we


note that the “O” estimate given in the previous theorem cannot be improved
to a “o” estimate. Indeed, if

max sν pn (s) = o(σn−ν ), as n → ∞,


s∈[0,∞)

then since 1 − σn s ≤ pn (s), we obtain on setting s = σn−1 /2,

0 < 2−ν−1 ≤ σnν max sν pn (s) = o(1), as n → ∞,


s∈[0,∞)

which is a contradiction.
We now remove the restriction ν < 1 used in Lemma 4.13 by imposing the
condition (4.14).

Lemma 4.15. If 0 < ν < n and condition (4.14) is satisfied, then

max f (s) ≤ cν σn−ν


s∈[0,∞)

where cν = (2ν(c + 1))ν for 0 < ν ≤ 1, while cν = (2ν(c + 1)ν )ν for ν > 1.

First note that if 0 < ν ≤ 1/2, then


 ν
ν
≤ (2ν)ν ≤ cν
1−ν

and the stated result follows from Lemma 4.13. On the other hand, if 1/2 <
ν ≤ 1, then by Lemma 4.12 and inequality (4.13),

n
t1 ≥ − ti ≥ σn−1
i=2

and condition (4.14) then implies


1
σn = + σn−1 ≤ (c + 1)σn−1 . (4.15)
αn
It then follows from (4.12) that

max f (s) = f (t−1 −ν −ν ν −ν −ν


1 ) ≤ t1 ≤ σn−1 ≤ (c + 1) σn ≤ cν σn
s∈[0,∞)

which completes the case 0 < ν ≤ 1. The case ν > 1 is dispatched by an


inductive argument. Suppose that

f (t−1 −ν
1 ) ≤ cν σn (4.16)

holds for all ν with 0 < ν ≤ ν0 and some ν0 ≥ 1. Suppose ν ∈ (ν0 , ν0 + 1] and
n > ν. We will show that (4.16) holds for ν. By Lemma 4.12 and inequality
4.4 The Nonstationary Method 95

(4.13) one has


n 1−ν
t1 = − i=2 ti + σn
ν
1−ν 1 ν−1 1
≥ σn−1 + σn = σn−1 − .
ν ν ν αn
Consider now two cases. If 1/αn ≤ σn−1 /(2(ν − 1)), then
1 ν−1 1 1
t1 ≥ σn−1 − σn−1 = σn−1 ,
ν ν 2(ν − 1) 2ν

and hence by (4.12) and (4.15),

f (t−1 −ν ν −ν ν −ν −ν
1 ) ≤ t1 ≤ (2ν) σn−1 ≤ (2ν(c + 1)) σn ≤ cν σn .

On the other hand, if αn < 2(ν − 1)σn−1 , then by (4.16) and (4.15) (recall that
n − 1 > ν − 1 by assumption),

αn t1  αi t1  αi t1
n−1 n−1
−(ν−1)
f (t−1 −ν
1 ) = t1 ≤ αn t1
1 + αn t1 i=1 1 + αi t1 i=1
1 + αi t1

−(ν−1)
≤ αn cν−1 σn−1 ≤ 2(ν − 1)(c + 1)ν cν−1 σn−ν .

We now have an upper bound ω(n, ν) = O(σn−ν ) needed to apply Theorem


3.5 in order to obtain:
Corollary 4.16. Suppose the parameters {αn } in the nonstationary iterated
Tikhonov-Morozov method satisfy (4.14). If x ∈ D(L) and Lx ∈ R(L ν ), then
−ν
Lxn − Lx = O(σn ).
Combining this result with the general stability estimate using r(n) = σn
gives:
Corollary 4.17. Suppose x ∈ D(L) and Lx ∈ R(L  ν ). If n = n(δ) is chosen
so that σn(δ) ∼ δ −2/(2ν+1) , then Lxδn(δ) − Lx = O(δ 2ν/(2ν+1) ).

The geometric choice of parameters αn = α×q n−1 where α > 0 is fixed and
0 < q < 1 is an attractive option for the choice of the sequence of stabilization
parameters. In this case
1 1−n 1 − q n 1
σn = q ≥ q 1−n = q/αn+1
α 1−q α
and hence (4.14) holds for this choice. Also, by Corollary 4.16

Lxn − Lx = O(σn−ν ) = O(q νn )


96 4 The Tikhonov-Morozov Method

and a linear rate of convergence results which is faster the smoother (i.e., the
larger ν) the true data x.
We now consider an a posteriori choice of the iteration index in the non-
stationary Tikhonov-Morozov method. The analysis follows closely that for
the iterative method based on functional interpolation that was treated in
the previous chapter. Again we assume that the signal-to-noise ratio of the
data is strictly bounded above one:

xδ  ≥ τ δ ≥ τ xδ − x (4.17)

where τ > 1. We take xδ0 = 0 and again use the criterion (3.16), that is we
denote by n(δ) the first value of the iteration index that satisfies

xδ − xδn(δ)  < τ δ. (4.18)

But, by (4.9) and (4.10),


 n (L)(x
xδ − xδn  = LT  δ − xδ ) ≤ xδ − xδ .
n−1 n−1

This, along with Theorem 3.3, guarantees that there is a first value n = n(δ)
of the iteration index that satisfies (4.18). Now,
 n(δ) (L)(x
xδn(δ) − xn(δ) = LT  δ − x)

and hence
 n(δ) (L)(x
x − xn(δ) = xδ − xδn(δ) + LT  − xδ ).
But since  
 n  
 t 
|1 − tTn (t)| =  1− ≤1
j=1 βj (t) 

we have
x − xn(δ)  ≤ τ δ + δ = (τ + 1)δ, (4.19)
just as in Lemma 2.8.6.
 µ w for some w ∈ D(L), then
If x = L
 µ (I − LT
Lx − Lxn(δ) = L  n(δ) (L))Lw
  µ zn(δ)
=L

 n(δ) (L))Lw
where zn(δ) = (I − LT  and hence

zn(δ)  ≤ Lw.

The same argument that was used in the proof of Lemma 3.17 of the previous
chapter now gives:
Lemma 4.18. If x = L  µ w for some µ > 0 and some w ∈ D(L) and if n(δ)
is chosen according to (4.18), then

Lx − Lxn(δ)  = O(δ µ/(µ+1) ).


4.5 Notes 97

Now,
 n−1 (L))(x
(xn−1 − xδn−1 ) − (x − xδ ) = −(I − LT  − xδ )
 n−1 (L))
and I − LT  ≤ 1 and hence by (4.18),

xn(δ)−1 − x = xδn(δ)−1 − xδ + (xn(δ)−1 − xδn(δ)−1 ) − (x − xδ )


(4.20)
≥ (τ − 1)δ.

 µ w, then
However, if x = L
−µ
xn−1 − x = O(σn−1 ).

But, since {αj } satisfies (4.14),

σn = 1/αn + σn−1 ≤ (c + 1)σn−1


−µ
and hence σn−1 = O(σn−µ ). In particular,

xn(δ)−1 − x = O(σn−µ )

and therefore by (4.20),


σn(δ) = O(δ −1/µ ).
By the general stability estimate, we have

Lxδn(δ) − Lxn(δ)  ≤ δ σn(δ) = O(δ (2µ−1)/(2µ) ).

Combining this with the previous result we have


Theorem 4.19. Suppose {αj } satisfies (4.14) and that x ∈ D(L) and x =
 µ w for some w ∈ D(L) and µ > 1/2. If xδ ∈ H1 satisfies x − xδ  ≤ δ and
L
n(δ) is chosen by (4.18), then

⎨ O(δ (2µ−1)/(2µ) ), µ ≤ 1
Lxn(δ) − Lx =
δ

O(δ µ/(µ+1) ), µ ≥ 1.

4.5 Notes
The general theory of the Tikhonov-Morozov method was initiated by V.A.
Morozov and his colleagues in the late sixties. An account of the theory, in a
more general context than that of the present monograph, can be found in [39]
(see also [40]). The proof that the order of convergence O(δ 2/3 ) is essentially
best possible for the Tikhonov-Morozov method and related results, including
the sharp order of convergence for the method when the parameter is chosen
according to the discrepancy criterion, can be found in [22]. The analysis
98 4 The Tikhonov-Morozov Method

of the iterated Tikhonov-Morozov method in the context of the alternating


projection algorithm in the product Hilbert space is carried out in [24], while
a convergence analysis of the nonstationary method may be found in [23].
We employ spectral methods throughout this monograph in our analysis
of stabilization methods for the evaluation of a closed unbounded operator
L. Our approach requires that the domain D(L) be a dense linear subspace.
However, the basic convergence theory for the Tikhonov-Morozov method
(particularly Corollary 3.7) may be developed under relaxed assumptions on
D(L). In fact, the analysis of Morozov [39], [40] requires only that D(L) be
convex and that L be linear and closed. The key difference from the analysis
provided in this chapter is that the solution of the variational problem

Φα (xδα ; xδ ) = min Φα (z; xδ ) = min z − xδ 2 + αLz2 (4.21)


z∈D(L) z∈D(L)

is now characterized as the solution of the variational inequality


xδα − xδ , z − xδα + α
Lxδα , L(z − xδα ) ≥ 0 (4.22)

for all z ∈ D(L), rather than as the solution of an Euler equation, as in


Theorem 4.1. We note that since D(L) is convex and L is closed, the graph
G(L) is a weakly closed convex subset of the product Hilbert space H1 × H2 .
Also, the condition (4.21) says that the point (xδα , Lxδα ) in G(L) is the closest
point in G(L) to the point (xδ , 0) in the product space H1 × H2 endowed with
the norm | · | defined by

|(w, v)|2 = w2 + αv2 .

It then follows from Theorem 2.14 that

0 ≥ [(xδ , 0) − (xδα , Lxδα ), (z, Lz) − (xδα , Lxδα )]


(4.23)
= [(xδ − xδα , −Lxδα ), (z − xδα , L(z − xδα ))]

for all z ∈ D(L), where

[(u, v), (w, y)] =


u, w + α
v, y

is the inner product on H1 × H2 that generates the norm | · |. But one sees
easily that (4.23) is equivalent to (4.22). That is, the minimizer xδα of (4.21),
exists since D(L) is weakly closed, and is unique since D(L) is convex, and is
characterized by the variational inequality (4.22).
If we denote by xα the minimizer of Φα (z; x), where x ∈ D(L), then we
obtain as above


xα − x, z − xα + α
Lxα , L(z − xα ) ≥ 0 (4.24)
4.5 Notes 99

for all z ∈ D(L). Setting z = xα in (4.22), and z = xδα in (4.24), and adding,
one obtains

xδα − xα 2 + αL(xα − xδα )2 ≤


x − xδ , xα − xδα . (4.25)

From this one obtains the key inequality given in the previous case in Corollary
3.7. Indeed, from (4.25) it follows that

xδα − xα  ≤ δ

and, again from (4.25),

αL(xα − xδα )2 ≤ δxδα − xα  ≤ δ 2

and hence √
L(xα − xδα ) ≤ δ/ α.
By the variational property of Φα (·; x), we have

Φα (xα ; x) ≤ Φα (x; x)

and hence
xα − x2 + αLxα 2 ≤ αLx2 .
In particular, xα → x as α → 0 and Lxα  ≤ Lx. Since the graph of L is
closed and convex, and therefore weakly closed, one concludes in the standard
way that xα → x in the graph norm and hence Lxα → Lx as α → 0. One
then has √
Lx − Lxδα  ≤ Lx − Lxα  + δ/ α

and hence, if α = α(δ) satisfies α/ α → 0 as δ → 0, then Lxδα → Lx.
5
Finite-Dimensional Approximations

It is hard to be finite upon an infinite subject.


Herman Melville

In previous chapters approximations to values of an unbounded operator


were formed in the context of a generally infinite dimensional Hilbert space
and hence were not in themselves effectively computable. In this chapter we
treat approximations that are formed within finite-dimensional subspaces of a
Hilbert space and hence are computable in finite terms. We begin by consider-
ing approximations that result from projection alone. We then take up finite
element type approximations which exploit the variational characteristics of
the Tikhonov-Morozov method and its variants.

5.1 Stabilization by Projection

Suppose L : D(L) ⊆ H1 → H2 is a linear densely defined operator with closed


range. Let {Vj } be a collection of finite dimensional subspaces of H1 satisfying

V1 ⊆ V2 ⊆ . . . ⊆ D(L) ∩ N (L)⊥

and
H1 ⊆ ∪∞
j=1 Vj .

The finite-dimensional subspaces {Vj } should be thought of as spanned by


some simple functions (e.g., polynomials, splines, eigen-functions, etc.) which
will serve to approximate the desired values of the unbounded operator L. The
simplest approach to such approximations is to project the given data onto
the subspace Vn and then apply the operator to this projected approximation
of the data. To this end, let Pn : H1 → Vn be a projection operator (i.e., Pn
is bounded and Pn2 = Pn , but Pn is not necessarily an orthogonal projection).
Given x ∈ D(L) we will approximate Lx by yn = LPn x. Note that for each n
the approximation LPn x is defined for all x ∈ H1 and is stable with respect
102 5 Finite-Dimensional Approximations

to perturbations in x since the operator LPn has finite rank and hence is a
bounded linear operator.
As a minimal requirement we assume that the projection method is con-
vergent, that is we assume that

LPn x → Lx as n→∞ (5.1)

for each x ∈ D(L). Let Qn be the orthogonal projector of H2 onto L(Vn ).


Since R(L) ⊆ ∪∞
j=1 L(Vj ) we have

(I − Qn )Lx → 0 as n→∞ (5.2)

for all x ∈ D(L). Since R(L) is closed, the Moore-Penrose inverse L† : H2 →


N (L)⊥ ∩ D(L) is bounded (see Theorem 2.12) and hence LPn L† is bounded.
By (5.1) we then have

LPn L† y → LL† y as n→∞

for all y ∈ H2 . Therefore, by the uniform boundedness principle there is a


constant C, which is independent of n, satisfying

LPn L†  ≤ C. (5.3)

Also note that

LPn L† Lx = LPn PN (L)⊥ x = LPn x

since Vn ⊆ N (L)⊥ ∩ D(L).


In particular,

LPn x = LPn L† Lx ≤ CLx for all x ∈ D(L). (5.4)

Theorem 5.1. Suppose x ∈ D(L) and xδ ∈ H1 satisfies x − xδ  ≤ δ. If


n = n(δ) → ∞ and LPn(δ) δ → 0 as δ → 0, then LPn(δ) xδ → Lx as δ → 0.

Proof. Let y = Lx, yn = LPn x and ynδ = LPn xδ . Then,

ynδ − y = ynδ − yn + yn − y = LPn (xδ − x) + yn − y. (5.5)

Let zn ∈ Vn satisfy Lzn = Qn y = Qn Lx, where Qn is the orthogonal projector


of H2 onto L(Vm ), then

yn − y = LPn x − Lx

= LPn x − Lzn − (Lx − Lzn )

= LPn (x − zn ) − (Lx − Lzn ).


5.1 Stabilization by Projection 103

Therefore, by (5.4)

yn − y ≤ (C + 1)Lx − Lzn  = (C + 1)(I − Qn )Lx.

By (5.5) and (5.2) we then have

yn(δ)
δ
− y ≤ LPn(δ) δ + (C + 1)(I − Qn(δ) )Lx → 0 (5.6)

as δ → 0. 


In the simple projection method the index n which characterizes the


dimension of the approximating subspace Vn is used as a stabilization
parameter. This parameter is chosen in an a priori way as a function of the
error level δ in the data in the previous theorem. We now present an a pos-
teriori way of choosing the subspace index based on a discrepancy principle,
that is, by monitoring the discrepancy between the measured data xδ and the
computed quantity xδn = Pn xδ which is the pre-image of the stable approxi-
mation Lxδn to the desired value Lx. The method will depend on knowledge
of an upper bound γn satisfying

γn ≥ I − Pn 

which quantifies the approximating power of the subspace Vn when linked


with the projection Pn . Note that I − Pn  ≥ 1 as I − Pn is a projection, and
that I − Pn  = 1 if Pn is an orthogonal projection, however, {γn } may be
unbounded in general.
Lemma 5.2. Suppose x − xδ  ≤ δ and γn ≥ I − Pn . For a given τ > 1
there is a value of n for which

xδ − xδn  ≤ τ γn δ

Proof. Let Mn = γn−1 (I − Pn ). Then Mn  ≤ 1 and

Mn z → 0 as n→∞

for each z in the dense subspace ∪∞


j=1 Vj of H1 , and hence the same is so for
all z ∈ H1 . But,

xδ − xδn = (I − Pn )xδ = (I − Pn )(xδ − x) + (I − Pn )x

and therefore
γn−1 xδ − xδn  ≤ δ + Mn x ≤ τ δ
for n sufficiently large. 

104 5 Finite-Dimensional Approximations

The subspace chosen by the discrepancy principle is that subspace Vn for


which n = n(δ) is the first subspace index satisfying

xδ − xδn  ≤ τ γn δ (5.7)

(note that this choice depends on the particular value selected for τ and on the
upper bound γn accepted for I − Pn ). The convergence of the projection
method with this discrepancy principle will depend on the approximating
power of the subspaces and the size of the stability bound LPn . We shall
assume that
LPn dist(x, Vn−1 ) → 0 as n → ∞ (5.8)
where
dist(x, Vn−1 ) = inf{x − v : v ∈ Vn−1 }.

Theorem 5.3. Let τ > 1 and suppose n = n(δ) is the first value of n satisfy-
ing (5.7). If (5.8) holds, then Lxδn(δ) → Lx as δ → 0.

Proof. In the interest of shortening the notation we shall write n for n(δ). For
any v ∈ Vn−1 we find that

τ γn−1 δ < xδ − xδn−1  = (I − Pn−1 )xδ 

≤ (I − Pn−1 )(xδ − x) + (I − Pn−1 )(x − v)

≤ γn−1 (δ + dist(x, Vn−1 ))

and therefore
1
δ≤ dist(x, Vn−1 ). (5.9)
τ −1
From (5.6) we then have

Lxδn − Lx ≤ LPn dist(x, Vn−1 )/(τ − 1) + (1 + C)(I − Qn )Lx (5.10)

and therefore, if n = n(δ) → ∞ as δ → 0, the result is proved.


Suppose, however, that for some sequence δk → 0, the sequence {n(δk )} is
bounded. Then there is some subsequence {δj } with δj → 0 and n(δj ) → n0 <
∞ as j → ∞. Since the n(δj ) are positive integers, we then have n(δj ) = n0
for j ≥ j0 , say. Using this fact and (5.7) we then find that

δ δ
x − xn0  = limj xδj − xnj0  = limj xδj − xn(δ
j
j)


≤ limj τ γn(δj ) δj = limj τ γn0 δj = 0,


5.1 Stabilization by Projection 105

and hence x = xn0 . But then it follows that


δ
j
lim (Lxn(δj)
− Lx) = lim (Lxδnj0 − Lx) = lim LPn0 (xδj − x) = 0
j→∞ j→∞ j→∞

yielding the result. 




As an illustration of these results, consider the case in which L∗L has compact resolvent. In this case L̃ = (I + L∗L)^{−1} has a nonincreasing sequence of positive eigenvalues {μ_n} with μ_n → 0 and an associated orthonormal sequence of eigenvectors {v_n} which is complete in the space H_1. Setting λ_n = (1 − μ_n)/μ_n, we find that {λ_n} is an increasing sequence satisfying λ_n → ∞ and L∗L v_n = λ_n v_n. Let

    V_n = span{v_1, v_2, . . . , v_n}            (5.11)

and let P_n be the orthogonal projector of H_1 onto V_n. Then the closure of ∪_{j=1}^∞ V_j is H_1, and for any z ∈ H_1,

    ‖LP_n z‖² = ⟨ Σ_{j=1}^n ⟨z, v_j⟩ v_j , Σ_{k=1}^n λ_k ⟨z, v_k⟩ v_k ⟩ ≤ λ_n ‖z‖²

with equality if z = v_n. Therefore,

    ‖LP_n‖ = √λ_n.                               (5.12)

Also, note that {λ_j^{−1/2} Lv_j}_{j=1}^m is a complete orthonormal set for L(V_m). Therefore,

    dist(x, V_{n−1})² ≤ ‖(I − P_{n−1})x‖² = Σ_{j=n}^∞ |⟨x, v_j⟩|²
        = Σ_{j=n}^∞ λ_j^{−1} |⟨Lx, λ_j^{−1/2} Lv_j⟩|²
        ≤ λ_n^{−1} dist(Lx, L(V_{n−1}))²

and hence

    ‖LP_n‖ dist(x, V_{n−1}) ≤ dist(Lx, L(V_{n−1})) → 0

as n → ∞, which verifies the hypothesis (5.8) and shows the applicability of the discrepancy principle (5.7) in this case.
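When the eigensystem {λ_j; v_j} is known explicitly (as it is for many model differential operators), the stable approximation Lx^δ_n = LP_n x^δ can be computed directly in the image basis. A minimal Python sketch under that assumption (array names are hypothetical):

import numpy as np

def stabilized_image_coeffs(xd_coeffs, lam, n):
    """Coefficients of L x^delta_n = L P_n x^delta in the orthonormal
    image basis u_j = lambda_j^{-1/2} L v_j of L(V_n).

    xd_coeffs : coefficients <x^delta, v_j> of the noisy data
    lam       : eigenvalues lambda_j of L*L (increasing)
    n         : subspace index, e.g. chosen by discrepancy_index above
    """
    c = np.zeros_like(xd_coeffs, dtype=float)
    # L v_j = sqrt(lambda_j) u_j, so <L P_n x^delta, u_j> = sqrt(lambda_j) <x^delta, v_j>
    c[:n] = np.sqrt(lam[:n]) * xd_coeffs[:n]
    return c

The factor √λ_j is precisely the amplification ‖LP_n‖ = √λ_n of (5.12): coefficients beyond the cut-off, where the noise would be amplified the most, are simply discarded.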
For this particular example we may also give a bound for the error when
the discrepancy method is used. The following theorem shows that for this
example orders of approximation that are arbitrarily near to the optimal order
may be achieved in principle. Before proving the theorem we establish a basic
inequality that we will need.

Lemma 5.4. If L∗L has compact resolvent and z ∈ D((L∗L)^{ν+1}) for some ν > 0, then

    ‖L∗Lz‖ ≤ ‖z‖^{ν/(ν+1)} ‖(L∗L)^{ν+1} z‖^{1/(ν+1)}.

Proof. First notice that

    ‖L∗Lz‖² = Σ_{k=1}^∞ λ_k² |⟨z, v_k⟩|² = Σ_{k=1}^∞ a_k b_k

where

    a_k = |⟨z, v_k⟩|^{2ν/(ν+1)}  and  b_k = λ_k² |⟨z, v_k⟩|^{2/(ν+1)}.

Using p = (ν + 1)/ν and q = ν + 1, we see that {a_k} ∈ l^p and {b_k} ∈ l^q and hence by Hölder's inequality

    ‖L∗Lz‖² ≤ (Σ_{k=1}^∞ a_k^p)^{1/p} (Σ_{k=1}^∞ b_k^q)^{1/q}
        = (Σ_{k=1}^∞ |⟨z, v_k⟩|²)^{ν/(ν+1)} (Σ_{k=1}^∞ λ_k^{2ν+2} |⟨z, v_k⟩|²)^{1/(ν+1)}
        = ‖z‖^{2ν/(ν+1)} ‖(L∗L)^{ν+1} z‖^{2/(ν+1)}. □


Theorem 5.5. Suppose L∗L has compact resolvent, V_n is given by (5.11) and n = n(δ) is chosen by the discrepancy principle (5.7). If x ∈ D((L∗L)^{ν+1}) for some ν > 0, then

    ‖Lx^δ_{n(δ)} − Lx‖ = O(δ^{(2ν+1)/(2ν+2)}).

Proof. Throughout this proof we use n = n(δ) where n(δ) is given by (5.7). Also, since the projectors are orthogonal, we have γ_n = 1. Let z = (L∗L)^{ν+1} x; then

    dist(x, V_{n−1}) = ‖ Σ_{k=n}^∞ λ_k^{−ν−1} ⟨z, v_k⟩ v_k ‖ ≤ λ_n^{−ν−1} ‖z‖

and hence by (5.9) and (5.12)

    ‖LP_n‖ = √λ_n = O(δ^{−1/(2ν+2)}),            (5.13)

and

    dist(x, V_{n−1}) = O(λ_n^{−ν−1}) = O(δ).

Also,

    dist(Lx, L(V_n))² ≤ ‖L(x − x_n)‖²
        = ⟨L∗L(x − x_n), x − x_n⟩
        ≤ ‖L∗L(x − x_n)‖ ‖x − x_n‖.



However, by the lemma,

    ‖L∗L(x − x_n)‖ ≤ ‖x − x_n‖^{ν/(ν+1)} ‖(L∗L)^{ν+1}(x − x_n)‖^{1/(ν+1)}

and hence

    dist(Lx, L(V_n)) ≤ ‖x − x_n‖^{(2ν+1)/(2ν+2)} ‖(L∗L)^{ν+1}(x − x_n)‖^{1/(2ν+2)}.

But,

    ‖(L∗L)^{ν+1}(x − x_n)‖ = ‖(I − P_n)z‖ ≤ ‖z‖

and hence

    dist(Lx, L(V_n)) = O(‖x − x_n‖^{(2ν+1)/(2ν+2)}).

However, by (5.7)

    ‖x − x_n‖ = ‖(I − P_n)x‖
        ≤ ‖(I − P_n)(x − x^δ)‖ + ‖x^δ − P_n x^δ‖
        = ‖(I − P_n)(x − x^δ)‖ + ‖x^δ − x^δ_n‖
        ≤ δ + τ δ = (1 + τ)δ,

and hence

    ‖(I − Q_n)Lx‖ = dist(Lx, L(V_n)) = O(δ^{(2ν+1)/(2ν+2)}).

Therefore, by (5.13) and (5.10), we have

    ‖Lx − Lx^δ_{n(δ)}‖ = O(δ^{(2ν+1)/(2ν+2)}). □




We now treat a different projection scheme which relies on choosing basis functions in a special subspace. This method relies on projection, with a 'duality twist', onto a finite-dimensional subspace of D(L∗), the domain of the adjoint operator. In the previous method stabilization was achieved, under hypotheses that may be difficult to verify, by projecting the data onto a finite-dimensional subspace of the operator domain and then applying the operator. The method here in a sense reverses this procedure: the value of the operator is projected onto a finite-dimensional subspace of the domain of the adjoint and a weak form of the equations that characterize this projection is used to extend the technique to data that are not necessarily in the domain of the operator.

Since L is densely defined the adjoint operator L∗ has a domain D(L∗) which is dense in H_2. Suppose that for each positive integer m, the set

    {l_1^{(m)}, l_2^{(m)}, . . . , l_{n(m)}^{(m)}}

consists of linearly independent vectors in D(L∗) and let

    V_m = span{l_1^{(m)}, . . . , l_{n(m)}^{(m)}}.

Denote the orthogonal projector of H_2 onto V_m by P_m and suppose that P_m y → y as m → ∞ for each y ∈ D(L∗) (and hence the same holds for each y ∈ H_2 since D(L∗) is dense in H_2). In particular, P_m Lx → Lx as m → ∞ for each x ∈ D(L). The projection y_m := P_m Lx is characterized by the conditions y_m ∈ V_m and

    ⟨y_m, l_i^{(m)}⟩ = ⟨Lx, l_i^{(m)}⟩ = ⟨x, L∗l_i^{(m)}⟩,  i = 1, . . . , n(m).   (5.14)

Note that while the formulation y_m = P_m Lx requires that x ∈ D(L), the conditions (5.14) have meaning for any x ∈ H_1.

The approximation y_m has another operational characterization. Define operators B_m : H_1 → R^{n(m)} and I_m : R^{n(m)} → H_2 by

    B_m z = ( ⟨z, L∗l_1^{(m)}⟩, . . . , ⟨z, L∗l_{n(m)}^{(m)}⟩ )^T

and

    I_m c = Σ_{j=1}^{n(m)} c_j l_j^{(m)},

respectively, and let G_m : R^{n(m)} → R^{n(m)} be the linear operator whose matrix representation relative to the standard basis is the Gramian matrix

    [G_m]_{ij} = ⟨l_i^{(m)}, l_j^{(m)}⟩.

If

    y_m = Σ_{j=1}^{n(m)} c_j l_j^{(m)} = I_m c,

then by (5.14), G_m c = B_m x and hence y_m = L_m x where L_m := I_m G_m^{−1} B_m. Note that the operators L_m : H_1 → H_2 are bounded and defined on all of H_1. Evaluating these operators is therefore a stable process.
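In a concrete computation L_m amounts to one Gramian linear solve. A schematic Python sketch (all argument names are hypothetical; the moments ⟨z, L∗l_i^{(m)}⟩ and the Gramian would be supplied by the particular discretization):

import numpy as np

def apply_Lm(moments, gram, basis_eval, s_grid):
    """Evaluate L_m z = I_m G_m^{-1} B_m z pointwise on a grid (sketch).

    moments    : vector B_m z with entries <z, L* l_i>
    gram       : Gramian matrix [<l_i, l_j>]
    basis_eval : callable with basis_eval(j, s) = l_j(s)
    """
    c = np.linalg.solve(gram, moments)          # G_m c = B_m z
    # I_m c = sum_j c_j l_j, evaluated on the grid
    return sum(c_j * basis_eval(j, s_grid) for j, c_j in enumerate(c))

Note that only the moments of the data with respect to the vectors L∗l_i^{(m)} are needed, which is what makes the scheme applicable to data outside D(L).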
We now show that the operators L_m are stabilizers of the unbounded operator L when the index m, which acts as a stabilization parameter, is suitably matched with the approximating subspace V_m and the error level δ in the data. The stabilization relies on relating the error level to the subspace V_m by way of the smallest eigenvalue λ_m of the matrix G_m and the parameter ‖B_m‖.

Theorem 5.6. If n(m) = n(m(δ)) → ∞ and δ‖B_{m(δ)}‖/√λ_{m(δ)} → 0, then

    ‖Lx − L_{m(δ)} x^δ‖ → 0

as δ → 0, for each x ∈ D(L).

Proof. First note that

    ‖Lx − L_{m(δ)} x‖ = ‖Lx − P_{m(δ)} Lx‖ → 0,  as  δ → 0.   (5.15)

It remains only to estimate ‖L_m‖ = ‖I_m G_m^{−1} B_m‖. Since

    L_m x = Σ_{j=1}^{n(m)} c_j l_j^{(m)}

where G_m c = B_m x, we have

    ‖L_m x‖² = ⟨I_m c, I_m c⟩ = Σ_{i,j} c_i c_j ⟨l_i^{(m)}, l_j^{(m)}⟩ = c^T G_m c
        = (G_m^{−1} B_m x)^T B_m x ≤ ‖G_m^{−1}‖ ‖B_m‖² ‖x‖²

and hence

    ‖L_m‖² ≤ ‖G_m^{−1}‖ ‖B_m‖².                  (5.16)

Furthermore, by a well-known property of positive definite matrices, ‖G_m^{−1}‖ = λ_m^{−1}, where λ_m is the smallest eigenvalue of G_m. We then have the stability estimate

    ‖L_m x − L_m x^δ‖ ≤ δ‖L_m‖ ≤ δ‖B_m‖/√λ_m

which with (5.15) proves the result. □


It is perhaps worth noting that the condition

    ‖(I − P_m)Lx‖ → 0  as  m → ∞

used in the proof above, which follows from the density of ∪_{m=1}^∞ V_m, can be given in a more quantitative fashion if additional hypotheses are assumed. In fact, suppose the true data x has extra smoothness as expressed by the condition x ∈ D(L∗L), and let w = L∗Lx. Then w ∈ D(L∗†) and L∗†w = Lx. Indeed,

    Lx ∈ N(L∗)^⊥

and

    P_{R(L∗)} w = P_{R(L∗)} L∗Lx = L∗(Lx).

Suppose now that L∗† is compact; then

    ‖(I − P_m)Lx‖ = ‖(I − P_m)L∗†w‖ ≤ γ_m ‖w‖

where

    γ_m := ‖(I − P_m)L∗†‖ → 0  as  m → ∞.
The model inverse problem of reconstructing a spatially distributed source term g(x), given a measured version of the temperature distribution f(x) = u(x, 1) at a later time in the heat problem

    ∂u/∂t = ∂²u/∂x² + g(x),  0 < x < π,  0 < t
    u(0, t) = u(π, t) = u(x, 0) = 0,

provides an illustration of the result of this section. Formal separation of variables techniques lead to the representation

    g(x) = (Lf)(x) = (2/π) Σ_{n=1}^∞ [n²/(1 − e^{−n²})] ( ∫_0^π f(s) sin ns ds ) sin nx.

Here L is a closed unbounded linear operator on L²[0, π] whose domain is the Sobolev space H_0²[0, π]:

    D(L) = { f ∈ L²[0, π] : Σ_{n=1}^∞ n⁴ a_n² < ∞,  a_n = (2/π) ∫_0^π f(s) sin ns ds }.

Let l_k(s) = sin ks; then the orthogonal projector of L²[0, π] onto

    V_m = span{l_1, . . . , l_m}

is

    P_m φ = (2/π) Σ_{j=1}^m ⟨φ, l_j⟩ l_j.

Also, G_m = (π/2) I, where I is the identity operator on R^m, and hence λ_m = π/2. Since

    L∗l_k = [k²/(1 − e^{−k²})] l_k,

we have

    ‖B_m f‖² = Σ_{k=1}^m [k²/(1 − e^{−k²})]² |⟨f, l_k⟩|² ≤ O(m⁴)‖f‖²

and hence ‖B_m‖ = O(m²). Finally, since

    Lf − P_m Lf = (2/π) Σ_{n>m} [n²/(1 − e^{−n²})] ⟨f, l_n⟩ l_n,

if we assume greater smoothness on f in the form f ∈ H_0^r[0, π] for r > 2, we have

    ‖Lf − P_m Lf‖² ≤ C Σ_{n>m} (n⁴ n^{−2r}) n^{2r} |⟨f, l_n⟩|² ≤ C m^{4−2r} ‖f‖²_{H^r}.

If f^δ ∈ L²[0, π] satisfies ‖f − f^δ‖ ≤ δ and f ∈ H_0^{2+ν}[0, π] for some ν > 0, then a choice of cut-off level of the form m ∼ δ^{−1/(2(ν+1))} yields

    ‖Lf − L_m f^δ‖ = O(δ^{ν/(ν+1)})

and hence an order of approximation arbitrarily near to the optimal order O(δ) is achievable in principle for sufficiently smooth data.
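This example is simple enough to compute end to end. The following Python sketch is a schematic illustration only; uniform-grid sampling and a crude Riemann sum for the Fourier coefficients are our assumptions, not part of the text:

import numpy as np

def reconstruct_source(f_delta, s, delta, nu=1.0):
    """Stabilized reconstruction of the source g = Lf from noisy data.

    f_delta : samples of f^delta on a uniform interior grid s in (0, pi)
    delta   : noise level, ||f - f^delta|| <= delta
    nu      : assumed smoothness, f in H_0^{2+nu}[0, pi]
    Uses the cut-off m ~ delta^(-1/(2(nu+1))) from the text.
    """
    m = max(1, int(delta ** (-1.0 / (2.0 * (nu + 1.0)))))
    h = s[1] - s[0]
    g = np.zeros_like(f_delta)
    for n in range(1, m + 1):
        a_n = (2.0 / np.pi) * h * np.sum(f_delta * np.sin(n * s))  # sine coefficient
        g += (n**2 / (1.0 - np.exp(-n**2))) * a_n * np.sin(n * s)  # truncated (Lf)(x)
    return g

The severe amplification factor n²/(1 − e^{−n²}) ≈ n² is applied only to the first m coefficients; everything beyond the cut-off, where the data carry essentially no information about g, is discarded.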

5.2 Finite Elements

The variational characterization of the Tikhonov-Morozov approximation x^δ_α as the minimizer over D(L) of the functional

    Φ_α(z; x^δ) = ‖z − x^δ‖² + α‖Lz‖²

(see Theorem 4.1) suggests the possibility of using finite element methods to effectively compute the approximations. To this end, suppose {V_m}_{m=1}^∞ is a sequence of finite-dimensional subspaces of H_1 satisfying

    V_1 ⊆ V_2 ⊆ · · · ⊆ D(L)  and the closure of ∪_{m=1}^∞ V_m is H_1.

Given x ∈ D(L), the finite element approximation to Lx will be Lx_{α,m} where

    x_{α,m} = argmin_{z∈V_m} { ‖z − x‖² + α‖Lz‖² }.

Since V_m is finite-dimensional, such a minimizer exists and is unique. Suppose dim V_m = n(m) and that {φ_1^{(m)}, . . . , φ_{n(m)}^{(m)}} is a basis for V_m. Then the coefficients {c_j^{(m)}} of the approximation

    x_{α,m} = Σ_{j=1}^{n(m)} c_j^{(m)} φ_j^{(m)}

are determined by the necessary conditions

    (d/dt) Φ_α(x_{α,m} + tφ_i^{(m)}; x)|_{t=0} = 0,  i = 1, . . . , n(m)

which are equivalent to the linear algebraic equations

    Σ_{j=1}^{n(m)} [ ⟨φ_i^{(m)}, φ_j^{(m)}⟩ + α⟨Lφ_i^{(m)}, Lφ_j^{(m)}⟩ ] c_j^{(m)} = ⟨x, φ_i^{(m)}⟩,  i = 1, . . . , n(m).
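In matrix terms these are the regularized normal equations (M + αS)c = b with mass matrix M, stiffness-type matrix S, and moment vector b. A minimal assembly-and-solve sketch in Python (the three inputs are assumed to be provided by the chosen finite element basis; the names are ours):

import numpy as np

def tikhonov_morozov_fe(mass, stiff, moments, alpha):
    """Solve the Galerkin system for the coefficients of x_{alpha,m}.

    mass    : matrix [<phi_i, phi_j>]
    stiff   : matrix [<L phi_i, L phi_j>]
    moments : vector [<x, phi_i>]  (use <x^delta, phi_i> for noisy data)
    """
    # mass + alpha*stiff is symmetric positive definite, so the
    # system has a unique solution for every alpha > 0
    return np.linalg.solve(mass + alpha * stiff, moments)

Replacing the moment vector ⟨x, φ_i^{(m)}⟩ by ⟨x^δ, φ_i^{(m)}⟩ gives the coefficients of the approximation x^δ_{α,m} defined next.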

When only an approximation x^δ ∈ H_1 is available satisfying ‖x − x^δ‖ ≤ δ, the minimizer of the functional Φ_α(·; x^δ) over V_m is denoted by x^δ_{α,m}:

    x^δ_{α,m} = argmin_{z∈V_m} Φ_α(z; x^δ).

As a first step in our analysis, we determine a stability bound for ‖Lx_{α,m} − Lx^δ_{α,m}‖. The stability bound turns out to be the same as that found for the approximation in infinite dimensional space (see Corollary 3.7). We will employ the inner product [·,·] defined on D(L) by

    [u, w] = ⟨u, w⟩ + α⟨Lu, Lw⟩

where α is a fixed positive number, and the associated norm |u| = [u, u]^{1/2}. Note that, since L is closed, D(L) is a Hilbert space when endowed with this inner product.

Lemma 5.7. If x ∈ D(L) ⊆ H_1 and x^δ ∈ H_1 satisfies ‖x − x^δ‖ ≤ δ, then

    ‖Lx_{α,m} − Lx^δ_{α,m}‖ ≤ δ/√α.

Proof. The necessary condition

    (d/dt) Φ_α(x_{α,m} + tv; x)|_{t=0} = 0

for all v ∈ V_m gives

    ⟨x_{α,m} − x, v⟩ + α⟨Lx_{α,m}, Lv⟩ = 0,  for all v ∈ V_m   (5.17)

and similarly

    ⟨x^δ_{α,m} − x^δ, v⟩ + α⟨Lx^δ_{α,m}, Lv⟩ = 0,  for all v ∈ V_m.

The condition (5.17) may be expressed in terms of the inner product [·,·] in the following way:

    [x_{α,m} − x, v] = ⟨x_{α,m} − x, v⟩ + α⟨L(x_{α,m} − x), Lv⟩ = −α⟨Lx, Lv⟩   (5.18)

for all v ∈ V_m. On the other hand,

    [x^δ_{α,m} − x, v] = ⟨x^δ_{α,m} − x, v⟩ + α⟨Lx^δ_{α,m} − Lx, Lv⟩
        = ⟨x^δ − x, v⟩ + ⟨x^δ_{α,m} − x^δ, v⟩ + α⟨L(x^δ_{α,m} − x), Lv⟩
        = ⟨x^δ − x, v⟩ − α⟨Lx, Lv⟩

and therefore,

    [x^δ_{α,m} − x_{α,m}, v] = ⟨x^δ − x, v⟩                  (5.19)

for all v ∈ V_m. In particular, setting v = x^δ_{α,m} − x_{α,m} in (5.19), and applying the Cauchy-Schwarz inequality, one obtains

    ‖x^δ_{α,m} − x_{α,m}‖² + α‖Lx^δ_{α,m} − Lx_{α,m}‖² = |x^δ_{α,m} − x_{α,m}|²
        = ⟨x^δ − x, x^δ_{α,m} − x_{α,m}⟩
        ≤ δ‖x^δ_{α,m} − x_{α,m}‖.

Therefore,

    ‖x^δ_{α,m} − x_{α,m}‖ ≤ δ

and hence

    α‖Lx^δ_{α,m} − Lx_{α,m}‖² ≤ δ²

giving the result. □


We see from this lemma that the condition δ/√α → 0, combined with a condition that ensures that ‖Lx_{α,m} − Lx‖ → 0, will together guarantee the convergence of the stabilized finite element approximations to Lx.

The remaining development requires an analysis of the difference between x_{α,m} and the infinite dimensional smoothed approximation x_α using exact data x ∈ D(L), which is characterized by

    x_α = argmin_{z∈D(L)} { ‖z − x‖² + α‖Lz‖² }.

This is equivalent to

    0 = ⟨x_α − x, v⟩ + α⟨Lx_α, Lv⟩ = [x_α − x, v] + α⟨Lx, Lv⟩

for all v ∈ D(L). The corresponding finite element approximation x_{α,m} satisfies (5.18), that is

    0 = [x_{α,m} − x, v] + α⟨Lx, Lv⟩

for all v ∈ V_m. Subtracting, we find that

    [x_α − x_{α,m}, v] = 0  for all  v ∈ V_m.                (5.20)

We can express this in a geometrical way by saying that x_{α,m} is the [·,·]-orthogonal projection of x_α onto the finite-dimensional subspace V_m. That is,

    x_{α,m} = P̂_m x_α                                       (5.21)

where P̂_m : D(L) → V_m is the orthogonal projector of the Hilbert space D(L), equipped with the inner product [·,·], onto the subspace V_m ⊆ D(L).

Let P_m be the (ordinary) orthogonal projector of H_1 onto V_m. One may bound the quantity ‖Lx_α − Lx_{α,m}‖ in terms of the two quantities

    β_m = ‖(I − P_m)L̃‖

and

    γ_m = ‖L(I − P_m)L̃‖.

Note that since LL̃, LP_m and L̃ are all bounded linear operators, both of these quantities are finite. We begin with a result that requires relatively modest assumptions on the true data x.

Theorem 5.8. If x ∈ D(L∗L), then

    ‖Lx_α − Lx_{α,m}‖² = O(β_m²/α + γ_m²) ‖x + L∗Lx‖².

Proof. First we note that x ∈ D(L∗L) if and only if x = L̃w where w = x + L∗Lx. From the characterization (5.21) we have

    α‖Lx_α − Lx_{α,m}‖² ≤ |x_α − x_{α,m}|²
        = |x_α − P̂_m x_α|² ≤ |x_α − P_m x_α|²
        = ‖(I − P_m)x_α‖² + α‖L(I − P_m)x_α‖².

But

    x_α = (I + αL∗L)^{−1} x = L̃(αI + (1 − α)L̃)^{−1} x = L̃(αI + (1 − α)L̃)^{−1} L̃w.

Also, ‖(αI + (1 − α)L̃)^{−1} L̃‖ ≤ 1. Therefore,

    α‖Lx_α − Lx_{α,m}‖² ≤ ( ‖(I − P_m)L̃‖² + α‖L(I − P_m)L̃‖² )‖w‖²,

that is,

    ‖Lx_α − Lx_{α,m}‖² ≤ (β_m²/α + γ_m²)‖w‖². □
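The quantities β_m and γ_m are computable in any concrete matrix discretization, which makes the bound of Theorem 5.8 checkable in practice. A rough Python sketch; everything here, including the matrix names, is our assumption (Lt stands for a discretization of L̃, Lmat for one of L, and P for the matrix of P_m):

import numpy as np

def stabilization_norms(Lt, Lmat, P):
    """Spectral-norm estimates of beta_m = ||(I - P_m) Ltilde|| and
    gamma_m = ||L (I - P_m) Ltilde|| for matrix discretizations (sketch)."""
    I = np.eye(P.shape[0])
    beta = np.linalg.norm((I - P) @ Lt, 2)            # largest singular value
    gamma = np.linalg.norm(Lmat @ ((I - P) @ Lt), 2)
    return beta, gamma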


We now need a well-known consequence of the uniform boundedness principle.

Lemma 5.9. Suppose {A_n} is a sequence of bounded linear operators satisfying A_n x → 0 as n → ∞ for each x ∈ H_1. If K is a compact linear operator, then ‖A_n K‖ → 0 as n → ∞.

Proof. Let B = {x ∈ H_1 : ‖x‖ ≤ 1} be the unit ball in H_1. Assume to the contrary that

    sup_{x∈B} ‖A_n Kx‖ ≥ 2d > 0

for some sequence n → ∞. Then for each n there is an x_n ∈ B with

    ‖A_n K x_n‖ ≥ d > 0.

Since B is closed and convex, it is weakly closed and also bounded. Therefore, by Theorem 2.1 there is a subsequence {x_{n_k}} with x_{n_k} ⇀ x ∈ B. As K is compact, one then has Kx_{n_k} → Kx. By the uniform boundedness principle there is a constant C with ‖A_n‖ ≤ C for all n. We then have

    0 < d ≤ ‖A_{n_k} K x_{n_k}‖ ≤ ‖A_{n_k}(Kx_{n_k} − Kx)‖ + ‖A_{n_k} Kx‖
        ≤ C‖Kx_{n_k} − Kx‖ + ‖A_{n_k} Kx‖ → 0

as k → ∞, which is a contradiction. □


Suppose L∗L has compact resolvent, i.e., L̃ is compact. Applying the previous lemma with A_n = I − P_n, we see that

    β_n → 0  as  n → ∞.

If we assume that LP_n x → Lx for all x ∈ D(L∗L), that is, we assume condition (5.1), then we can also apply the lemma to the operators A_n = L(I − P_n) to find that

    γ_n → 0  as  n → ∞.

We may now give a basic convergence result for the stabilized finite element approximations.

Theorem 5.10. Suppose L̃ is compact, condition (5.1) holds, and x ∈ D(L∗L). If α = α(δ) → 0 as δ → 0 and m = m(α) → ∞ as α → 0 in such a way that δ/√α → 0 and β_m²/α → 0, then Lx^δ_{α,m} → Lx.

Proof. Note that

    ‖Lx^δ_{α,m} − Lx‖ ≤ ‖Lx^δ_{α,m} − Lx_{α,m}‖ + ‖Lx_{α,m} − Lx_α‖ + ‖Lx_α − Lx‖
        ≤ δ/√α + O( (β_m²/α + γ_m²)^{1/2} ) + ‖Lx_α − Lx‖.   (5.22)

But we know from Theorem 3.3 that ‖Lx_α − Lx‖ → 0 as α → 0 and hence the result. □


Remark. If we are willing to assume more on the true data, namely that x ∈ R(L̃^ν) for some ν ≥ 1, then minor modifications of the argument above give the bound

    ‖Lx^δ_{α,m} − Lx‖ ≤ δ/√α + O( (β_m(ν)²/α + γ_m(ν)²)^{1/2} ) + ‖Lx_α − Lx‖   (5.23)

where

    β_m(ν) = ‖(I − P_m)L̃^ν‖

and

    γ_m(ν) = ‖L(I − P_m)L̃^ν‖.

From Corollary 3.8 we know that if x ∈ D(LL∗L), then ‖Lx_α − Lx‖ = O(α). We therefore obtain the following corollary:

Corollary 5.11. If, in addition to the hypotheses of Theorem 5.10, x ∈ D(LL∗L), γ_m = O(α), β_m = O(α^{3/2}) and α ∼ δ^{2/3}, then

    ‖Lx^δ_{α,m} − Lx‖ = O(δ^{2/3}).

This corollary shows that in principle the finite element approximations are capable of achieving the optimal order of convergence possible for the Tikhonov-Morozov method.

5.2.1 A Finite Element Discrepancy Criterion

We now investigate a discrepancy principle for the finite element method treated in the last section. We assume that the true data x is an element of D(L∗L) ⊆ H_1 and that x^δ ∈ H_1 is some approximation to x whose signal-to-noise ratio is strictly bounded away from one. That is, we assume that for some fixed τ > 1,

    ‖x − x^δ‖ ≤ δ < (1/τ)‖x^δ‖.                  (5.24)

We assume that {V_m}_{m=1}^∞ is a sequence of finite dimensional subspaces of D(L∗L) satisfying

    V_m ⊆ V_{m+1}  and the closure of ∪_{m=1}^∞ V_m is H_1.

We denote the dimension of V_m by n(m). Again the functional Φ_α(·; x^δ) is defined on D(L) by

    Φ_α(z; x^δ) = ‖z − x^δ‖² + α‖Lz‖²,

and we denote by x^δ_{m,α} the smoothed finite dimensional approximation to x^δ in V_m defined by

    x^δ_{m,α} = argmin_{z∈V_m} Φ_α(z; x^δ).

Let P_m denote the orthogonal projector of H_1 onto V_m. Since

    ‖z − x^δ‖² = ‖z − P_m x^δ‖² + ‖(I − P_m)x^δ‖²

for all z ∈ V_m, we see that

    x^δ_{m,α} = argmin_{z∈V_m} Φ_α(z; P_m x^δ).



Finally, let L_m = L|_{V_m} be the restriction of L to V_m. Then L_m is a bounded linear operator and for z ∈ V_m

    Φ_α(z; P_m x^δ) = ‖z − P_m x^δ‖² + α‖L_m z‖²

and hence

    x^δ_{m,α} = argmin_{z∈V_m} { ‖z − P_m x^δ‖² + α‖L_m z‖² }.

Therefore, as previously seen, we have

    x^δ_{m,α} = (I + αL_m^∗ L_m)^{−1} P_m x^δ

where L_m^∗ L_m is a bounded self-adjoint linear operator on V_m. If {φ_j; λ_j}_{j=1}^{n(m)} is an orthonormal eigensystem for the operator L_m^∗ L_m, then

    ‖x^δ_{m,α} − P_m x^δ‖² = Σ_{j=1}^{n(m)} ( αλ_j/(1 + αλ_j) )² |⟨P_m x^δ, φ_j⟩|².   (5.25)

Since P_m x^δ → x^δ as m → ∞, we have ‖P_m x^δ‖ > δτ for m sufficiently large.

Theorem 5.12. Suppose m = m(δ) is large enough so that ‖P_m x^δ‖ > δτ. Then there is a unique α = α(m(δ), δ) satisfying

    ‖x^δ_{m,α} − P_m x^δ‖ = τδ.

Proof. From (5.25) we see that for such fixed m = m(δ) the function

    d(α) = ‖x^δ_{m,α} − P_m x^δ‖

is continuous, strictly increasing, d(α) → 0 as α → 0+ and

    d(α) → ‖P_m x^δ‖ > τδ  as  α → ∞.

Therefore, there is a unique positive α satisfying d(α) = τδ. □




We call the choice of α according to Theorem 5.12 the choice by the finite-dimensional discrepancy principle.
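Because d(α) is continuous and strictly increasing, the equation d(α) = τδ can be solved by any bracketing method. A schematic Python sketch based on the representation (5.25); the eigenvalues and coefficients are assumed given, and the names are ours:

import numpy as np

def discrepancy_alpha(coeffs, lam, tau, delta, tol=1e-12):
    """Solve ||x^delta_{m,alpha} - P_m x^delta|| = tau*delta for alpha.

    coeffs : <P_m x^delta, phi_j> in the eigensystem of L_m* L_m
    lam    : eigenvalues lambda_j (assumed positive numpy array)
    Requires ||P_m x^delta|| > tau*delta, as in Theorem 5.12.
    """
    def d(alpha):  # the discrepancy, formula (5.25)
        r = alpha * lam / (1.0 + alpha * lam)
        return np.sqrt(np.sum((r * np.abs(coeffs)) ** 2))
    lo, hi = 0.0, 1.0
    while d(hi) < tau * delta:      # d increases to ||P_m x^delta|| > tau*delta
        hi *= 2.0
    while hi - lo > tol * (1.0 + hi):
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if d(mid) >= tau * delta else (mid, hi)
    return 0.5 * (lo + hi)

Monotonicity of d guarantees that the bracketing loop terminates and that the bisection converges to the unique α of Theorem 5.12.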

Lemma 5.13. If α = α(m, δ) is chosen according to the criterion in Theorem 5.12, then ‖Lx^δ_{m,α}‖ ≤ ‖LP_m x‖.

Proof. Since

    x^δ_{m,α} = argmin_{z∈V_m} Φ_α(z; P_m x^δ)

we have

    Φ_α(x^δ_{m,α}; P_m x^δ) ≤ Φ_α(P_m x; P_m x^δ)

and hence

    ‖x^δ_{m,α} − P_m x^δ‖² + α‖Lx^δ_{m,α}‖² ≤ ‖P_m x − P_m x^δ‖² + α‖LP_m x‖².

But then,

    τ²δ² + α‖Lx^δ_{m,α}‖² ≤ δ² + α‖LP_m x‖²

giving

    ‖Lx^δ_{m,α}‖ ≤ ‖LP_m x‖

since τ > 1. □


We now assume that the subspaces V_m faithfully support the operator L in the sense of (5.1). Actually, we need only the weaker condition that

    lim_{m→∞} ‖LP_m x‖ ≤ ‖Lx‖                   (5.26)

for all x ∈ D(L∗L). For a given δ > 0 it then follows that there is an M_δ such that

1. ‖LP_m x‖² ≤ δ + ‖Lx‖²
2. ‖(I − P_m)x^δ‖ ≤ δ
3. τδ ≤ ‖P_m x^δ‖

for all m ≥ M_δ. Furthermore, for each such m = m(δ), there is a unique α = α(m, δ) satisfying the criterion of Theorem 5.12.
Having set the stage, we may now prove the convergence of the approxi-
mations provided by the finite dimensional discrepancy principle.
Theorem 5.14. Suppose that x ∈ D(L), and that m = m(δ) is large enough so that (1)-(3) hold. Let α = α(m, δ) be the stabilization parameter determined by Theorem 5.12. Then

    x^δ_{m,α} → x  and  Lx^δ_{m,α} → Lx  as  δ → 0.

Proof. To simplify notation we write x^δ_{m,α} for x^δ_{m(δ),α(m(δ),δ)}. First note that

    ‖x^δ_{m,α} − x‖ ≤ ‖x^δ_{m,α} − P_m x^δ‖ + ‖P_m x^δ − x^δ‖ + ‖x^δ − x‖ ≤ (τ + 2)δ

and hence x^δ_{m,α} → x as δ → 0. Also, by Lemma 5.13 and (1), ‖Lx^δ_{m,α}‖ is bounded and hence for each sequence {δ_n} converging to zero, there is a subsequence, which we will denote by {δ_k}, such that Lx^{δ_k}_{m,α} ⇀ Lx, since G(L) is weakly closed. By the weak lower semi-continuity of the norm, along with Lemma 5.13 and (1), we then have

    ‖Lx‖ ≤ lim inf ‖Lx^{δ_k}_{m,α}‖ ≤ lim sup ‖Lx^{δ_k}_{m,α}‖ ≤ ‖Lx‖

and therefore ‖Lx^{δ_k}_{m,α}‖ → ‖Lx‖. Since also Lx^{δ_k}_{m,α} ⇀ Lx, the convergence is in fact strong. Therefore, each sequence {δ_n} converging to zero has a subsequence {δ_k} satisfying

    (x^{δ_k}_{m,α}, Lx^{δ_k}_{m,α}) → (x, Lx)

and the theorem is proved. □


With somewhat stronger hypotheses on the true data x, we can recover the order of convergence O(√δ) proved in the infinite dimensional setting in Theorem 4.7.

Theorem 5.15. Suppose x ∈ D(L∗L), condition (5.24) is satisfied, and M_δ is chosen such that (1)-(3) are satisfied. Then for any m ≥ M_δ,

    ‖Lx^δ_{m,α} − Lx‖² ≤ Cδ

where C is independent of m, if α = α(m, δ) is chosen according to the finite dimensional discrepancy criterion of Theorem 5.12.
Proof. By the lemma we have, using (1)-(3),

    ‖Lx^δ_{m,α} − Lx‖² = ‖Lx^δ_{m,α}‖² − 2⟨Lx^δ_{m,α}, Lx⟩ + ‖Lx‖²
        ≤ ‖LP_m x‖² − 2⟨Lx^δ_{m,α}, Lx⟩ + ‖Lx‖²
        ≤ δ + ‖Lx‖² − 2⟨Lx^δ_{m,α}, Lx⟩ + ‖Lx‖²
        = δ + 2⟨Lx − Lx^δ_{m,α}, Lx⟩
        = δ + 2⟨x − x^δ_{m,α}, L∗Lx⟩
        = δ + 2⟨x − x^δ + x^δ − P_m x^δ + P_m x^δ − x^δ_{m,α}, L∗Lx⟩
        ≤ δ(1 + 2‖L∗Lx‖) + 2⟨(I − P_m)x^δ, L∗Lx⟩ + 2⟨P_m x^δ − x^δ_{m,α}, L∗Lx⟩
        ≤ δ(1 + 6‖L∗Lx‖). □



This shows that the rate O(√δ) proved for the infinite dimensional discrepancy principle (see Theorem 4.7) is in principle achievable by the finite element method.

5.3 Notes
The material on direct projection in Section 5.1 follows [19]. The choice of
subspaces in D(L∗ ) is closely related to the method of dual least squares
(see, e.g., Engl, Hanke and Neubauer [8]) for solution of first kind operator
equations by projection.
References

1. Al'ber, Ya. (1978): Method of monotonicity and approximate computation of the value of an unbounded nonlinear operator. Siberian Mathematical Journal, 19, 179-183
2. Ang, D., Gorenflo, R., Le, V., Trong, D. (2002): Moment Theory and Some
Inverse Problems in Potential Theory and Heat Conduction, Springer-Verlag
LNM1792, Berlin
3. Atkinson, K., Han, W. (2001): Theoretical Numerical Analysis, A Functional
Analysis Framework, Springer-Verlag, New York
4. Binder, A., Engl, H., Groetsch, C., Neubauer, A., Scherzer, O. (1995): Weakly
closed nonlinear operators and parameter identification in parabolic equations.
Applicable Analysis, 55, 215-234
5. Cannon, J. R. (1984): The One-dimensional Heat Equation, Encyclopedia of
Mathematics and Its Applications, Vol. 23. Addison-Wesley, Menlo Park, CA.
6. Colton, D., Ewing, W., Rundell, W., Eds. (1990): Inverse Problems in Partial
Differential Equations. SIAM, Philadelphia
7. Deutsch, F. (2001): Best Approximation in Inner Product Spaces. Springer-
Verlag, New York
8. Engl, H.W., Hanke, M., Neubauer, A. (1996): Regularization of Inverse Prob-
lems. Kluwer, Dordrecht
9. Epstein, C. (2003): Introduction to the Mathematics of Medical Imaging.
Pearson, Prentice-Hall, Upper Saddle River, N.J.
10. Gerlach, W., Wolfersdorf, L.v. (1986): On approximate computation of the val-
ues of the normal derivative of solutions to linear partial differential equations
of second order with application to Abel’s integral equation. Zeitschrift für
angewandte Mathematik und Mechanik, 66, 31-36
11. Glasko, V. (1984): Inverse Problems of Mathematical Physics. Translated from
the Russian by A. Bincer. American Institute of Physics, New York
12. Groetsch, C.W. (1977): Generalized Inverses of Linear Operators: Representa-
tion and Approximation. Dekker, New York
13. Groetsch, C.W. (1980): Elements of Applicable Functional Analysis. Dekker,
New York
14. Groetsch, C.W. (1984): The Theory of Tikhonov Regularization for Fredholm
Equations of the First Kind. Pitman, London

15. Groetsch, C.W. (1992): Spectral methods for linear inverse problems with
unbounded operators. Journal of Approximation Theory, 70, 16-28
16. Groetsch, C.W. (1993): Inverse Problems in the Mathematical Sciences. Vieweg,
Braunschweig
17. Groetsch, C.W. (1995): Inclusions and identities for the Moore-Penrose inverse
of a closed linear operator, Mathematische Nachrichten, 171, 157-164
18. Groetsch, C.W. (2006): An iterative stabilization method for the evaluation of
unbounded operators, Proceedings of the American Mathematical Society 134,
1173-1181
19. Groetsch, C.W., Hanke, M. (1995): Regularization by projection of unbounded
operators arising in inverse problems, Inverse Problems and Applications in Geo-
physics, Industry, Medicine and Technology (D.D. Ang et al. Eds.), pp. 61-70,
Publications of the HoChiMinh City Mathematical Society, Vol. 2, HoChiMinh
City, Viet Nam
20. Groetsch, C.W., Hanke, M. (1995): Regularization of some one-dimensional in-
verse problems for identification of nonlinear surface phenomena. Proceedings
of the 1995 Design Engineering Technical Conferences, Vol. 3C, pp. 903-908,
American Society of Mechanical Engineers, New York
21. Groetsch, C.W., Hanke, M. (1996): A general framework for regularized evalua-
tion of unstable operators, Journal of Mathematical Analysis and Applications,
203, 451-463
22. Groetsch, C.W., Scherzer, O. (1993): The optimal order of convergence for stable
evaluation of differential operators. Electronic Journal of Differential Equations,
3, 1-12
23. Groetsch, C.W., Scherzer, O. (2000): Nonstationary iterated Tikhonov-Morozov
method and third order differential equations for the evaluation of unbounded
operators, Mathematical Methods in the Applied Sciences, 23, 1287-1300
24. Groetsch, C.W., Scherzer, O. (2002): Iterative stabilization and edge detection.
Contemporary Mathematics, 313, 129-142
25. Guenther, R.B., Lee, J.W. (1988): Partial Differential Equations of Mathemat-
ical Physics and Integral Equations. Prentice-Hall, Englewood Cliffs, N.J.
26. Halmos, P. (1967): A Hilbert Space Problem Book. Van Nostrand, Princeton
27. Hanke, M., Groetsch, C.W. (1998): Nonstationary iterated Tikhonov regular-
ization, Journal of Optimization Theory and Applications, 98, 37-53
28. Hausdorff, F. (1932): Zur Theorie der linearen metrischen Räume, Journal für
die Reine und angewandte Mathematik, 167, 294-311
29. Hennefeld, J. (1980): A nontopological proof of the uniform boundedness theo-
rem, American Mathematical Monthly, 87, 217
30. Kato, T. (1966): Perturbation Theory for Linear Operators. Springer-Verlag,
Berlin
31. Keller, J.B., Olmstead, W.E. (1972): Temperature of a nonlinearly radiating
semi-infinite body. Quarterly Journal of Applied Mathematics, 30, 559-566
32. Kirsch, A. (1996): An Introduction to the Mathematical Theory of Inverse Prob-
lems. Springer-Verlag, New York
33. Kress, R. (1989): Linear Integral Equations. Springer-Verlag, Berlin
34. Lardy, L.J. (1975): A series representation of the generalized inverse of a closed
linear operator. Atti Accad. Naz. Lincei (Ser. VIII), 58, 152-157.
35. Lax, P.D. (2002): Functional Analysis. Wiley, New York
36. Lebedev, L.P., Vorovich, I.I., Gladwell, G.M.L. (1996): Functional Analysis:
Applications in Mechanics and Inverse Problems, Kluwer, Dordrecht

37. Louis, A. (1989): Inverse und schlecht gestellte Probleme. Teubner, Stuttgart
38. Mann, W.R., Wolf, F. (1951): Heat transfer between solids and gases under
nonlinear boundary conditions, Quarterly Journal of Applied Mathematics, 9,
163-184
39. Morozov, V.A. (1984): Methods for Solving Incorrectly Posed Problems,
Springer-Verlag, New York
40. Morozov, V.A. (1999): Some aspects of restoration of signals by a regularization
method. Recent Advances in Numerical Methods and Applications II, O.P. Iliev
et al., (Eds.). World Scientific, Singapore
41. Neubauer, A. (1985): Tikhonov-regularization of ill-posed linear operator equa-
tions on closed convex sets, dissertation, Linz
42. Popov, D.A., Sushko, D.V. (1994): Computation of singular convolutions,
pp. 43-127 in Applied Problems of Radon Transform. S. Gindikin, Ed., American
Mathematical Society Translations, Series 2, Vol. 162, Providence
43. Pöschel, J., Trubowitz, E. (1987): Inverse Spectral Theory, Academic Press,
Orlando
44. Riesz, F., Sz.-Nagy, B. (1955): Functional Analysis. Ungar, New York
45. Sylvester, J., Uhlmann, G. (1990): The Dirichlet to Neumann map and ap-
plications, in Inverse Problems in Partial Differential Equations (D. Colton,
R. Ewing, W. Rundell, Eds.), 101-139. SIAM, Philadelphia
46. Youla, D., (1987): Mathematical theory of image restoration by the method of
convex projection, in Image Recovery: Theory and Application, H. Stark, Ed.,
Academic Press, New York
47. von Neumann, J. (1932): Über adjungierte Funktionaloperatoren. Annals of
Mathematics, 33, 294-310
48. von Neumann, J. (1950): Functional Operators - Vol. II. The Geometry of
Orthogonal Spaces, Annals of Mathematics Studies #22. Princeton University
Press, Princeton (This is a reprint of mimeographed lecture notes first distributed in 1933.)
49. Zeidler, E. (1990): Nonlinear Functional Analysis and Its Applications. Springer-
Verlag, Berlin
50. Zeidler, E. (1995): Applied Functional Analysis: Main Principles and Their
Applications. Springer-Verlag, Berlin