You are on page 1of 47

Electronic Journal of Statistics

Vol. 1 (2007) 3076


ISSN: 1935-7524
DOI: 10.1214/07-EJS014

arXiv:0705.0274v1 [math.ST] 2 May 2007

Needlet algorithms for estimation in


inverse problems
G
erard Kerkyacharian
Laboratoire de Probabilit
es et Mod`
eles Al
eatoires and Universit
e Paris X-Nanterre
(MODALX), 175 rue du Chevaleret, 75013 Paris, France
e-mail: kerk@math.jussieu.fr

Pencho Petrushev
Department of Mathematics, University of South Carolina, Columbia, SC 29208
e-mail: pencho@math.sc.edu

Dominique Picard and Thomas Willer


Laboratoire de Probabilit
es et Mod`
eles Al
eatoires and Universit
e Paris VII-Denis Diderot,
175 rue du Chevaleret, 75013 Paris, France
e-mail: picard@math.jussieu.fr; willer@math.jussieu.fr
Abstract: We provide a new algorithm for the treatment of inverse problems which combines the traditional SVD inversion with an appropriate
thresholding technique in a well chosen new basis. Our goal is to devise
an inversion procedure which has the advantages of localization and multiscale analysis of wavelet representations without losing the stability and
computability of the SVD decompositions. To this end we utilize the construction of localized frames (termed needlets) built upon the SVD bases.
We consider two different situations: the wavelet scenario, where the
needlets are assumed to behave similarly to true wavelets, and the Jacobitype scenario, where we assume that the properties of the frame truly
depend on the SVD basis at hand (hence on the operator). To illustrate
each situation, we apply the estimation algorithm respectively to the deconvolution problem and to the Wicksell problem. In the latter case, where
the SVD basis is a Jacobi polynomial basis, we show that our scheme is
capable of achieving rates of convergence which are optimal in the L2 case,
we obtain interesting rates of convergence for other Lp norms which are
new (to the best of our knowledge) in the literature, and we also give a
simulation study showing that the NEED-D estimator outperforms other
standard algorithms in almost all situations.
AMS 2000 subject classifications: Primary 62G05, 62G20; secondary
65J20.
Keywords and phrases: statistical inverse problems, minimax estimation, second-generation wavelets.
Received March 2007.

1. Introduction
We consider the problem of recovering a function f from a blurred (by a linear
. It is important to note that,
operator) and noisy version of f : Y = Kf + W
30

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

31

in general, for a problem like this there exists a basis which is fully adapted
to the problem, and as a consequence, the inversion remains stable; this is the
Singular Value Decomposition (SVD) basis. The SVD basis, however, might be
difficult to determine and handle numerically. Also, it might not be appropriate
for accurate description of the solution with a small number of parameters.
Furthermore, in many practical situations, the signal exhibits inhomogeneous
regularity, and its local features are particularly interesting to recover. In such
cases, other bases or frames (in particular, localized wavelet type bases) might
be much more appropriate to represent the object at hand.
Our goal is to devise an inversion procedure which has the advantages of
localization and multiscale analysis of wavelet representations without losing
the stability and computability of the SVD decompositions. To this end we
utilize the construction (due to Petrushev and his co-authors) of localized frames
(termed needlets) built upon particular bases - here the SVD bases. This
construction uses a Calder
on type decomposition combined with an appropriate
quadrature (cubature) formula. It has the big advantage of producing frames
which are close to wavelet bases in terms of dyadic properties and localization,
but because of their compatibility with the SVD bases provide stable and easily
computable schemes.
NEED-D is an algorithm combining the traditional SVD inversion with an
appropriate thresholding technique in a well chosen new basis. It enables one to
approximate the targeted functions with excellent rates of convergence for any
Lp loss function, and over a wide range of Besov spaces.
Our main idea is by combining the thresholding algorithm with SVD-based
frames to create an effective and practically feasible algorithm for solving the
inverse problem described above. The properties of the localized frame to be
constructed depend on the underlying SVD basis. We will consider two different
behaviors, the first corresponds to a wavelet behavior in the sense that the
properties of the system are equivalent (as far as we are concerned) to the
properties of a true wavelet basis. This case typically arises in the deconvolution
setting. In the second case, the properties of the frame may differ from wavelet
bases and truly depend on the SVD basis at hand (hence on the operator K).
We will explore in detail a case typically arising when the SVD basis is a Jacobi
polynomial basis. It is illustrated by the Wicksell problem. We show that our
scheme is capable of achieving rates of convergence which are optimal in the L2
case (to the best of our knowledge, for the Wicksell problem this is the only
case studied up to now). For other Lp norms we obtain interesting rates of
convergence, which are new in the literature.
We also give a simulation study for the Wicksell problem which shows that the
NEED-D algorithm applied in combination with SVD based frames is valuable
since it outperforms other standard algorithms in almost all situations.
The paper is organized in the following way: the second section introduces
the model, the classical SVD methods, and the two basic examples considered
in this paper, i.e. the deconvolution and Wicksell problems. The third section
introduces the needlet construction, gives some basic properties of needlets and
introduces the NEED-D algorithm. The fourth section explores its properties in

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

32

the wavelet scenario. The main motivation for the NEED-D algorithm is given
there after. The fifth section is devoted to the results in a Jacobi scenario. The
sixth section is devoted to simulation results. The proofs of the main results
from sections 45 are given in sections 78, respectively. The last section is
an appendix which contains the definition and basic properties of the Jacobi
needlets and the associated Besov spaces.
2. Inverse Models
Suppose H and K are two Hilbert spaces and let K : H 7 K be a linear
operator. The standard linear ill-posed inverse problem consists in recovering a
good approximation f of the solution f of
g = Kf

(1)

when only a perturbation Y of g is observed. In this paper, we will consider the


case when this perturbation is an additive stochastic white noise. Namely, we
observe Y defined by the following identity:
,
Y = Kf + W

(2)

where is the amplitude of the noise. It is supposed to be a small parameter


which tends to 0. The error will be measured in terms of this small parameter.
is a K-white noise, i.e. for any g, h K, (g) := (W
, g)K , (h) :=
Here W

(W , h)K form random Gaussian vectors (centered) with marginal variance kgk2K ,
khk2K , and covariance (g, h)K (with the obvious extension when one considers k
functions instead of 2).
Equation (2) means that for any g K, we observe Y (g) := (Y , g)K =
(Kf, g)K + (g), where (g) N (0, kgk2), and Y (g), Y (h) are independent
random variables for orthogonal functions g and h.
2.1. The SVD Method
Under the assumption that K is compact, there exist two orthonormal bases
(SVD bases) (ek ) of H and (gk ) of K, and a sequence (bk ), bk 0 as k ,
such that
K Kek = b2k ek ,

KK gk = b2k gk ,

Kek = bk gk ;

with K being the adjoint operator of K.


The Singular Value Decomposition (SVD) of K
X
Kf =
bk hf, ek igk
k

gives rise to approximations of the type


f =

N
X

k=0

b1
k hY , gk iek ,

K gk = bk ek

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

33

where N = N () has to be properly selected. This SVD method is very attractive theoretically and can be shown to be asymptotically optimal in many situations (see Mathe and Pereverzev [23] together with their non linear counterparts
Cavalier and Tsybakov [6], Cavalier et al [5], Tsybakov [33], Goldenschluger and
Pereverzev [16], Efromovich and Koltchinskii [12]). It also has the big advantage of performing a quick and stable inversion of the operator. However, it has
serious limitations: First, the SVD bases might be difficult to determine and
handle numerically. Secondly, while these bases are fully adapted to describe
the operator K, they might not be appropriate for accurate description of the
solution with a small number of coefficients. Also in many practical situations,
the signal has inhomogeneous regularitiy, and its local features are particularly
interesting to recover. In such cases, other bases (in particular, localized wavelet
type bases) are much more suitable for representation of the object at hand.
In the last ten years, various nonlinear methods have been developed, especially in the direct case with the objective of automatically adapting to the
unknown smoothness and local singular behavior of the solution. In the direct
case, one of the most attractive methods is probably wavelet thresholding, since
it allies numerical simplicity to asymptotic optimality on a large variety of functional classes such as Besov or Sobolev spaces.
To apply this approach to inverse problems, Donoho [10] introduced a waveletlike decomposition, specifically adapted to the operator K (Wavelet-VagueletteDecomposition) and utilized a thresholding algorithm to this decomposition. In
Abramovich and Silverman [1], this method was compared with the similar
vaguelette-wavelet decomposition. Other wavelet schemes should be mentioned
here, such as the ones from Antoniadis and Bigot [3], Antoniadis & al [4], Dicken
and Maass [9], and especially for the deconvolution problem, Penski & Vidakovic
[29], Fan & Koo [13], Kalifa & Mallat [19], Neelamani & al [27]. Later on Cohen et al [7] introduced an algorithm combining a Galerkin inversion with a
thresholding algorithm.
The approach developed here was greatly influenced by these works.
2.1.1. Deconvolution
The deconvolution problem is probably one of the most famous inverse problems,
giving rise to a great deal of investigations, specially in signal processing, and
has an extensive bibliography. In the deconvolution problem, we consider the
following operator: Let in this case H = K be the set of square integrable periodic
functions, with the standard L2 [0, 1] norm, and consider
Z 1
f H 7 Kf =
(u t)f (t)dt H,
(3)
0

where is a known function in H. It is generally assumed to be a regular


function. A standard example is the box-car function which plays an important
role in extending this model to image processing and especially to analysis of
sequences of images.

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

34

In this case simple calculations show that the SVD bases ek and gk both
coincide with the Fourier basis. The singular values correspond to the Fourier
coefficients of the function :
bk = k .
(4)
2.1.2. Wicksells problem
Another typical example is the following classical Wicksells problem [34]. Suppose a population of spheres is embedded in a medium. The spheres have radii
that may be assumed to be drawn independently from a density f . A random
plane slice is taken through the medium and those spheres that are intersected
by the plane furnish circles which radii are the points of observation Y1 , . . . , Yn .
The unfolding problem is then to determine the density of the sphere radii from
the observed circle radii. This problem also arises in medicine, where the spheres
might be tumors in an animals liver (see Nyshka et al [28]), as well as in numerous other contexts (biological, engineering, etc.) see for instance Cruz-Orive
[8].
The difficulty of estimating the target function is well illustrated by figure 1.
The Wicksell operator has a smoothing effect, thus the local variations of the
target function become almost invisible in the case of observations corrupted
by noise. (Also compare the blurred and noised observations in figure 6 to the
target functions of figure 4.)
0.5

0.05

0.5

0.05

0.1
0.05
0

0.5

0.1

0.05

0.5

0.1

0.5

Fig 1. Heavisine function, its image by the Wicksell operator without and with gaussian noise
with rsnr = 5

Following Wicksell [34] and Johnstone and Silverman [18], the Wicksells
problem corresponds to the following operator:
H = L2 ([0, 1], d), d(x) = (4x)1 dx,
K = L2 ([0, 1], d), d(x) = 4 1 (1 y 2 )1/2 dy,
and

Kf (y) = y(1 y 2 )1/2


4

(x2 y 2 )1/2 f (x)d.

In this case, following [18], we have the following SVD bases:


ek (x) = 4(k + 1)1/2 x2 Pk0,1 (2x2 1)
gk (y) = U2k+1 (y).

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

35

Here Pk0,1 is the kth degree Jacobi polynomial of type (0, 1) and Uk is the second
type Chebishev polynomial of degree k. The singular values are
bk =

(1 + k)1/2 .
16

(5)

In this article, in order to avoid some additional technicalities, we consider


this problem in the white noise framework, which is simpler than the original
problem described above in density terms.
3. General scheme for construction of frames (Needlets) and
thresholding
Frames were introduced in the 1950s by Duffin and Schaeffer [11] as a means for
studying nonharmonic Fourier series. These are redundant systems which behave
like bases and allow for a lot of flexibility. Tight frame that are very close to
orthonormal bases are particularly useful in signal and image processing.
In the following we present a general scheme for construction of frames due
to Petrushev and his co-authors [26, 31, 30]. As will be shown this construction
has the advantage of producing easily computable frame elements which are
extremely well localized in all cases of interest. Following [26, 31, 30] we will call
them needlets.
Recall first the definition of a tight frame.
Definition 1. Let H be a Hilbert space. A sequence (n ) in H is said to be a
tight frame if
X
kf k2 =
|hf, n i|2 f H.
n

Let (Y, ) be a measure space with a finite positive measure. Suppose we


have the following decomposition
L2 (Y, ) =

Hk ,

k=0

where the Hk s are finite dimensional spaces. For simplicity, we assume that H0
is reduced to the constants.
Let (eki )i=1,...,lk be an orthonormal basis of Hk . Then the orthogonal projector
Lk onto Hk takes the form
Z
Lk (f )(x) =
f (y)Lk (x, y)d(y), f L2 (Y, ),
Y

where
Lk (x, y) =

lk
X

eki (x)eki (y).

i=1

Note the obvious property of the orthogonal projectors:


Z
Lk (x, y)Lm (y, z)d(z) = k,m Lk (x, z).
Y

(6)

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

36

The construction, inspired by the -transform of Frazier and Jawerth [15],


consists of two main steps: (i) Calder
on type decomposition and (ii) Discretization, which are described in the following two subsections.
3.1. Calder
on type decomposition
Let be a C function supported in [1, 1] such that 0 () 1 and
() = 1 if || 21 . Define a() 0 from
a2 () = (/2) () 0.

Then

a2 (/2j ) = 1,

j0

We now introduce the operator


j =

|| 1.

(7)

a2 (k/2j )Lk

k0

and its associated kernel


X
j (x, y) =
a2 (k/2j )Lk (x, y) =
k0

a2 (k/2j )Lk (x, y).

2j1 <k<2j+1

The operators j provide a decomposition of L2 (Y, ) which we record in the


following proposition.
Proposition 1. For all f L2 (Y, ), we have
f = L0 (f ) +

j (f )

in L2 (Y, ).

(8)

j=0

Proof. By the definition of Lk and (7)


L0 +

J
X

j = L0 +

J X
X
j=0

j=0

a2 (k/2j )Lk =

(k/2J+1 )Lk

(9)

and hence
kf L0 (f )

J
X

j (f )k2 =

j=0

l2J+1

l2J

kLl (f )k2 +

2J l<2J+1

kLl (f )k2 0 as

which completes the proof.


3.2. Discretization
Let us define
Kk =

k
M

m=0

Hm .

kLl (f )(1 (l/2J+1 )k2

J ,

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

37

We make two additional assumptions which will enable us to discretize decomposition (8) from Proposition 1:
(a)
f Kk , g Kl = f g Kk+l .

(b) Quadrature formula: For any k N there exists Xk a finite subset of Y


and positive numbers > 0, Xk , such that
Z
X
f d =
f () f Kk .
(10)
Y

Xk

(Obviously, #X0 = 1.)


We define
X
Mj (x, y) =
a(k/2j )Lk (x, y) for
k

j 0.

Then as a consequence of (6), we have


Z
Mj (x, z)Mj (z, y)d(z).
j (x, y) =

(11)

(12)

It is readily seen that Mj (x, z) = Mj (z, x) and


x, z 7 Mj (x, z) K2j+1 1

and hence

z 7 Mj (x, z)Mj (z, y) K2j+2 2 .

Now, by (10)
j (x, y) =

Mj (x, z)Mj (z, y)d(z) =

Mj (x, )Mj (, y),

X2j+2 2

which implies
Z
Z
j (x, y)f (y)d(y) =
j f (x) =

Mj (x, )Mj (, y)f (y)d(y)

Y X
2j+2 2

X2j+2 2

Z
p
p
Mj (x, )
f (y) Mj (y, )d(y).

(13)

We are now prepared to introduce the desired frame. Let Zj = X2j+2 2 for j 0
and Z1 = X0 . We define the frame elements (needlets) by
p
(14)
j, (x) = Mj (x, ), Zj , j 1.

Notice that Z1 consists of a single point and 0 = 1, , Z1 , is the


L2 -normalized positive constant. Now (13) becomes
X
j f (x) =
hf, j, ij, (x).
(15)
Zj

Proposition 2. The family (j, )Zj ,j1 is a tight frame for L2 (Y, ).

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

38

Proof. As
f = lim L0 (f ) +
J

we have

J
X

j (f )

j=0

J
X
hj (f ), f i.
kf k = lim hL0 (f ), f i +
2

j=0

But by (15)
hj (f ), f i =

Zj

hf, j, ihj, , f i =

Zj

|hf, j, i|2 ,

j 0,

and since 0 is the normalized constant hL0 (f ), f i = |hf, 0 i|2 . Hence


X
kf k2 = |hf, 0 i|2 +
|hf, j, i|2 ,
jN0 , Zj

which shows that (j, ) is a tight frame.


3.3. Localization properties
The critical property of the frame construction above which makes it so attractive is the excellent localization of the frame elements (needlets) (j, ) in
various settings of interest (see [25, 26, 31, 30]). The following figure (due to
Paolo Baldi) is an illustration of this phenomenon. The rapidly oscillating function is the Legendre polynomial of degree 28 , whereas the localized one is a
needlet constructed as explained above using Legendre polynomials of degree
28 and centered approximately at zero. Its localization is remarkable taking
into account that both functions are polynomials of the same degree.
200

150

100

50

50

100

150
1.0

0.8

0.6

0.4

0.2

0.0

0.2

0.4

0.6

0.8

1.0

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

39

In the case of the unit sphere in Rd+1 , where Hk are the spaces of spherical
harmonics, the following localization property of the needlets is established in
Narcowich, Petrushev and Ward [25, 26]: For any k N there exists a constant
Ck such that:
Ck 2dj/2
.
|j ()|
[1 + 2j arccos < , >]k
In the case of Jacobi polynomials on [1, 1], the localization of the needlets
proved in Petrushev, Xu [31] takes the form: For any k N there exists a
constant Ck such that
|j (cos )|

Ck 2j/2
p
,
(1 + 2j | arccos |)k w, (2j , cos )

|| ,

where w, (n, x) = (1 x + n2 )+1/2 (1 + x + n2 )+1/2 and , > 1/2.


The almost exponential localization of the needlets and their semi-orthogonal
structure allows to use them for characterization of spaces other than L2 , in
particular the more general Triebel-Lizorkin and Besov spaces (see [26, 31]).
3.4. NEED-D algorithm: thresholding needlet coefficients
We describe here the general idea of the method. The first step is to construct a
needlet system (frame) {j : Zj , j 1} as described in section 3, where
Hk is simply the space spanned by the k-th vector ek of the SVD basis.
The needlet decomposition of any f H takes the form
XX
(f, j )H j .
f=
j

Zj

P
i
with fi = (f, ei )H
Using Parsevals identity, we have j = (f, j )H = i fi j
i
and j = (j , ei )H . If we put Yi = (Y , gi )K , then
X
fj ej , K gi )H + i = bi fi + i ,
Yi = (Kf, gi )K + i = (f, K gi )K + i = (
j

, gi )K form a sequence of centered Gaussian variables with variwhere i = (W


ance 1. Thus
X Yi
i
j =
j
b
i
i

is an unbiased estimate of j . Notice that from the needlet construction (see


the previous section) it follows that the sum above is finite. More precisely,
i
j
6= 0 only for 2j1 < i < 2j+1 .
Let us consider the following estimate of f :
f =

J
X
X

j=1 Zj

t(j )j ,

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

40

where t is a thresholding operator defined by


t(j ) = j I{|j | t j }
r
1
t = log .

with

(16)
(17)

Here is a tuning parameter of the method which will be properly selected later
on. Notice that the thresholding depends on the resolution level j through the
constant j which will also be specified later on, and the same with regard to
the upper level of details J. Notice also that, in this paper, we concentrated on
hard thresholding, whereas various other kind of thresholdings could be used,
likely giving comparable results at least theoretically.
We will particularly focus on two situations (corresponding to the two examples discussed above). In the first case (see subsection 4), the needlets have very
nice properties and behave exactly like wavelets. This is for instance the case
of the deconvolution, where the SVD basis is the Fourier basis. However, more
complicated problems e.g. the Wicksells problem exhibit more delicate concentration properties for the needlets giving rise to different behaviors in terms of
rates of convergence for the estimators.
4. NEED-D in wavelet scenario
In this section, we assume that the needlet system has the following properties:
For any 1 p < , there exist positive constants cp , Cp , and Dp such that
Card Zj C2j ,
j( p
2 1)

(18)
j( p
2 1)

kj kpp

cp 2

Cp 2
,
X
X
u j kpp Dp
|u |p kj kpp , for any collection (u ).
k
Zj

(19)
(20)

Zj

s
define the space B,r
as the collection of all functions f with f =
P We P

such
that
j0
Zj j j
1

s
kf kB,r
:= k(2j[s+ 2 ] k(j )Zj kl )j0 klr < ,

s
B,r
(M )

s
kf kB,r
M.

and

(21)

(22)

Theorem 1. Let 1 < p < , 2 + 1 > 0, and


j2 := sup

i
X j
]2 C22j , j 0.
[
b
i
i
2

(23)

Suppose 2 16p and 2J = [t ] 2+1 with t as in (16).


s
Then for f B,r
(M ) with 1, s 1/, r 1 (with the restriction r
1 p
if s = ( + 2 )( 1)), we have
p
(24)
Ekf f kpp C log(1/)p1 [ log(1/)]p ,

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

41

where
1 p
s
, if s ( + )( 1)
s + + 1/2
2
1
1 p
s 1/ + 1/p
, if
s < ( + )( 1).
=
s + + 1/2 1/

2
=

The proof of this theorem is given in section 7.


Remarks :
1. These results are essentially minimax (see Willer [35]) up to logarithmic
factors. We find back here the elbow, which was already observed in the
direct problem, as well as in the deconvolution setting (see [17], for instance).
2. Condition (23) is essential in this problem. In the deconvolution case, the
i
SVD basis is the Fourier basis and hence j
are simply the Fourier coefficients of j . Then assuming that we are in the so-called regular case
(bk k , for all k), it is easy to show that (23) is true for the needlet
system as constructed above (see also the discussion in the following subsection). A similar remark can be made regarding conditions (19) and (20).
In the deconvolution setting, the needlet construction is not strictly needed
and, as is shown in Johnstone, Kerkyacharian, Picard, Raimondo[17], the
periodized Meyer wavelet basis (see Meyer [24] and Mallat [22]) can replace the needlet construction. Condition (23) also holds in more general
cases such as the box-car deconvolution, see [17], [20] where this algorithm
is applied using Meyers wavelets. 3
4.1. Condition (23) and the needlet construction
The following lines are intended to a posteriori motivate our decision to build
upon the needlet construction. As was mentioned above condition (23) is very
important for our algorithm. The proof will reveal that it is essential, since j2 is
exactly the variance of our estimator of j , so in a sense no other thresholding
strategy can be better.
Let us now examine how condition (23) links the frame (j ) with the SVD
basis (ek ). To see this clearly let us suppose that (j ) is an arbitrary frame
and let us place ourselves in the regular case:
bi i
(this means that there exist two positive constants c and c such that c i
bi ci ). If condition (23) holds true, we have
C22j

m 2m i2m+1 1

i
j
]2 .
bi

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

Hence, m j,

42

i 2
[j
] c22(jm) .

2m i2m+1 1

i
This means that the energy of j
decays exponentially for i 2j , which reveals
the role of the Littlewood Paley decomposition in the previous construction,
replacing the exponential discrepancy by a cut-off.
The following proposition establishes a kind of converse property: The construction of needlet systems always implies that condition (23) is satisfied in the
regular case.
i
Proposition 3. If (j, ) is a frame such that {i : j
6= 0} is contained in a
j
j

set {C1 2 , . . . , C2 2 }, and bi i , then

j2 :=

i
X j
]2 C22j .
[
b
i
i

i
Proof. Since the elements of an arbitrary frame are bounded in norm and j
6=
j
j
0 only for C1 2 i C2 2 , we have
i
X j
[
]2 C22j kj, k2 C 22j .
b
i
i

5. NEED-D in a Jacobi-type case


Properties (19)(20) are not necessarily valid for an arbitrary needlet system,
since as mentioned above the localization properties of the frame elements depend on the initial underlying basis, and hence on the problem at hand. We will
consider here a particular case motivated by Wicksells problem.
We consider the space H = L2 (I, d(x)), where I = [1, 1], d(x) = , (x)dx,
, (x) = c, (1 x) (1 + x) , , > 1/2,
R
and c, is selected so that I d, (x) = 1. We will assume that (otherwise we can interchange the roles of and ).
Let (Pk )k0 be the L2 (I, d(x)) normalized Jacobi polynomials. We assume
that the Jacobi polynomials appear as an SVD basis of the operator K. This is
the case of Wicksells problem, where = 0, = 1, bk k 1/2 .
In the Jacobi case, the needlets have been introduced and studied in Petrushev and Xu [31]. See also the appendix, where the definition and some important properties of Jacobi needlets are given.
We will state our results in a more general setting, assuming that only a
few conditions on the needlet system are valid. Note that these conditions are

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

43

fulfilled by the needlet system (Jacobi needlets) constructed using the Jacobi
polynomials (Pk )k0 . The proofs are given in the appendix.
We will consider two sets of conditions. The first one (which only depends on
) is the following:
Card Zj 2j ,
X
kj kpp Cp 2jp/2 2j(p2)(1+) ,

Zj

Xj

j, kp C(

Xj

(25)
1
,
j, p 6= 2 +
+ 1/2

| |p kj, kpp )1/p .

(26)
(27)

s
e,r
We define the space B
as the collection of all functions f on [1, 1] with
representation
X X
f=
j j
j1 Zj

such that
kf kBes

,r

X
:= k(2js (
|j, | kj, k )1/ )j1 klr < ,

and

e s (M ) kf k s M.
f B
,r
er
B

(28)
(29)

e s is the Besov space defined in the Appendix as a space


In the Jacobi case, B
,r
of approximation (not depending on the special needlet-frame).
Theorem 2. Let 1 < p < and > 12 . Suppose
t =
Let 2 16p[1 + 4{( 2
regular case, i.e.

p
log 1/

+1
p )+

and

( 2

bi i ,

2
1+2

2 J = t

+1
p )+ }]

and assume that we are in the

1
> .
2

e s (M ) with s > [ 1 2( + 1)( 1 1 )]+ , we have


Then for f B
p,r
2
2
p
Ekf f kpp C[log(1/)]p1 [

where
(i) if p < 2 +

1
+1/2 ,

then
=

(ii) if p > 2 +

1
+1/2 ,

p
log(1/)]p ,

s
;
s + + 12

then
=

s
.
s + + ( + 1)(1 p2 )

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

44

Remarks :
1
, the rate obtained here is the usual one, and can
1. In the case p < 2 + +1/2
1
be proved to be minimax (see [35]). The case p > 2 + +1/2
introduces a
new rate of convergence.
2. Conditions (25)(27) enabled us to estimate the rates of convergence of
our scheme, whenever the index of the Besov space is the same as the
index of the loss function (p = ). In the sequel, we will study the case
where p and are independently chosen. This requires, however, some
additional assumptions. 3

If in addition to properties (25)(27), we now assume that the following


conditions are fulfilled: For any Zj , j 0,
c2j(p2)(+1) k()(p2)(+1/2) kj kpp C2j(p2)(+1) k()(p2)(+1/2) ,
k() < 2j1 ,

c2

j(p2)(+1)

(p2)(+1/2)

k ()

kj kpp

(30)

j(p2)(+1)

(p2)(+1/2)

C2

k ()

k () = 2 k() < 2

j1

(31)

where k() {1, . . . , 2j } is the index of Zj . Here we assume that the


points in Zj are ordered so that 1 > 2 > > 2j . Note that in the case of
Jacobi needlets Zj consists of the zeros of the Jacobi polynomial P2,
(see the
j
appendix). In the following we will briefly write k instead of k() and k instead
of k (). Of course, (26) is now a consequence of conditions (30)(31).
Observe the important fact that properties (30)(31) are valid in the case of
Jacobi Polynomials (see the appendix).
Theorem 3. Let 1 < p < and > 21 . Suppose that conditions
(25) (27) and (30) (31) are fulfilled. Let
2
1+2

2 J = t

and

2 16p[1 + 4{(

+1

+1

)+ (
)+ }]
2
p
2
p

and suppose that we are in the regular case, i.e.


bi i ,

1
> .
2

s
e,r
Then for f B
(M ) with s > max{,} { 12 2( + 1)( 21
1
1
1)( p ) 0}, we have

Ekf f kpp C[log(1/)]p1+a [


where
= min{(s), (s, ), (s, )}

and

1
)

p
log(1/)]p ,

a = max{a(), a()} 2

2( +
(32)

with

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

(s) =
(s, ) =

45

s
,
s + + 12
s 2(1 + )( 1 p1 )

s + + 2(1 + )( 12 1 )

and,
a() =

I{p = 0}

(+ 12 )(p)
(2)(+1/2)1

if [p ][1 (p 2)( + 1/2)] 0,

+ I{s = 0} if [p ][1 (p 2)( + 1/2)] < 0,

with p = 1 (p 2)( + 1/2) and s = s[1 (p 2)( + 1/2)] p(2 + 1)( +


1)( 1 p1 ).
The proofs of Theorems 2 and 3 are relegated to section 8.
Remarks :
1. Naturally, Theorem 2 follows by Theorem 3. We stated this two theorems
separately because the hypotheses of Theorem 2 are less restrictive than
the conditions in Theorem 3 and hence Theorem 2 potentially has wider
range of application than Theorem 3.
2. It is interesting to notice that the convergence rates in (32) depend only
on three distinctive regions for the parameters (which are actually present
in Theorem 2, but hidden in the condition ), which depends on a
very subtle interrelation between the parameters s, , , p, .
3. It is also interesting to note that the usual rates of convergence obtained
e.g. in the wavelet scenario are realized in the extreme case = = 12 .
3
6. Simulation study
In this section we investigate the numerical performances of the NEED-D estimator in the context of the Wicksell problem described in section 2.1.2. We
compare the results for simulated datasets to those obtained with several SVD
methods.
6.1. The estimators
6.1.1. Singular value decomposition estimators
With the notations introduced before, f can be naturally estimated by the following linear estimator based on the singular value decomposition of operator K:
f =

X
i

where (i )iN is a deterministic filter.

Yi
ei ,
bi

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

46

In the simulations a first SVD estimator with projection weights was used:
(
i = 1 if i N,
i = 0 if i > N,
where the parameter N was fitted for each setting so as to minimize the root
mean square error (RM SE) of the estimator.
We also use the SVD estimator developed in Cavalier and Tsybakov [6], which
is completely adaptive with a data driven choice of the filter and thus much more
convenient than the former in practice. The values of i are constant in blocs
Ij = [j1 , j 1] with limits 0 = 1 and J = N + 1 determined further:

2
= 1 j (1+j ) 
if i Ij , j = 1, . . . J,
i
kY k2(j)
+
= 0
if i > N,
i

where:

Yi
Yi = ,
bi
j =

kY k2(j) =

maxiIj b2
i
P
2 ,
b
iIj i

Yi2 ,

j2 = 2

iIj

bi2 ,

iIj

0 < < 1/2,

and we used the notation x+ = max(0, x).


The blocks are determined by the following procedure. Let
1
max(5, log log(1/)) and = log(
, we define:
)

if j = 0,
j = 1
j =
if j = 1,

j = j1 + (1 + )j1
if j = 2, . . . , J,
Pm
2 3
where J is large enough such that: J > max{m : i=1 b2
}.
i

In the simulation settings considered further the value taken by J = N + 1


is too large compared to the level n of the discretization resolution, thus the
estimation was performed at the level N0 = min ( n2 , N ) instead of N .
6.1.2. Construction of the needlet basis
Every needlet j,k defined on I = [1, 1] is a linear combination of Jacobi
polynomials as described in section 3, with weights depending on some filter a.
This function is chosen as:
p
a(x) = (x/2) (x), x 0

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

47

where (x) = I{x < 0.5} + P (x)I{0.5 x 1} and P is a polynomial adjusted


such that the corresponding needlet is sufficiently regular. In practice this choice
seems to be slightly better than a C filter with exponential shape.
The shape of a is given by figure 2, and some examples of needlets are given
in figure 3. Their amplitudes and supports fit automatically to the location of
: the needlets located near the edges of I are much sharper than those located
in the middle.
1
0.8
0.6
0.4
0.2
0

0.5

1.5

2.5

Fig 2. Filter a with polynomial shape

60

40

20

20

40
1

0.7

4
1

2
1

4
0.7

Fig 3. Examples of needlets: 7,10 , 7,40 , 7,80 and 7,120 (from left to right)

Finally NEED-D is performed by using the following basis (ej, ) adapted to


the Wicksell problem:

x [0, 1], ej, (x) = 4 2x2 j, (2x2 1).

With such a basis we have for all i N:

p
i
ej,
= a(i/2j1 )Pi () bj, ,

thus the estimated coefficients of f in the frame are very easy to compute.
6.2. Parameters of the simulation
We consider the four commonly used target functions f represented in figure 4,
and three levels of noise corresponding to three values of the root signal to

G. Kerkyacharian et al./NEED-D: estimation in inverse problems


Blocks

Bumps

Heavisine

48

Doppler

1
0.5

0.5

0.5
0.5

0
0

0
0.5

0.5
0

0.5

0.5

0.5

0.5

Fig 4. Target functions

noise ratio of K(f ): rsnr {3, 5, 7}. The discretization resolution level is setto
n = 1024, and the constant in the thresholds of NEED-D is set to = 0.75 2.
The estimation error is evaluated by a Monte Carlo approximation of several
Lp () losses:
L1 is computed as the average over 20 runs of

1
n

RM SE is computed as the average over 20 runs of

n
P

i=1
s

|f ( ni ) f( ni )|/( 4i
n ).

1
n

n
P

(f ( ni ) f( ni ))2 /( 4i
n ).

i=1

In each run, the gaussian noise component is simulated independently of its


values in the other runs.
6.3. Analysis of the results
The performance of the non adaptive SVD estimator depends very strongly
on the choice of N (see figure 5). A large N is needed in the case of small
noise (first row of the figure) and in the case of very oscillating functions such
as Doppler and Bumps. However even with this optimal a posteriori choice of
N , the adaptive filter leads to better results than the non adaptive projection
weights as shown in tables 1 and 2. Indeed the former is more adapted to the
ill posed nature of the problem and to the variations of the noise, by adjusting
over the singular values (bk ) and the data (yk ).
Moreover the NEED-D estimator generally outperforms both SVD estimators. As can be seen on figure 6, the differences are obvious in high noise for
the Bumps and Doppler targets, where the SVD estimators are very noisy (in
fact all the estimators happen to leave some noise unfiltered near the right edge

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

0.8

0.8

0.6
0.4
0.2
0

200

400

200

400

200

400

0.8

0.2

200

400

200

400

200

400

0.5

0.6

0.3

0.4

0.4

0.4
0.2

0.2
0

200

400

0.8

0.1

0.3

0.2
0

200

400

0.4

0.6

200

400

0.8

0.2
0.8

0.6

0.3

0.4

0.6

0.4
0.2

0.2
0

0.3

0.2

0.4

0.6

0.4

0.4

0.1

0.8

0.5

0.6
0.2

49

200

400

0.1

0.4

0.2
0

200

400

200

400

0.2

Fig 5. Value of the mean square error of the non adaptive SVD estimator (y-axis) for each
value of N (x-axis) for rsnr = 7 to rsnr = 3 (from top to bottom) and for the target function
Blocks, Bumps, Heavisine and Doppler (from left to right)

of the interval, which is given lesser importance by errors measured with the
weight (x) = 1/(4x), for x ]0, 1].) This order of comparison is confirmed by
the lower values of L1 and RM SE for NEED-D than for SVD in all the settings
(see tables 1 and 2).

Blocks
Bumps
Heavisine
Doppler

Adaptive SVD
NEED-D
low
med
high
low
med
0.0399 0.0465 0.0591 0.0347 0.0404
0.0258 0.0295 0.0361 0.0180 0.0206
0.0248 0.0299 0.0401 0.0205 0.0254
0.1002 0.1085 0.1230 0.0858 0.0909
Table 1
Error L1 for each target, each noise level and each estimator

low
0.0452
0.0324
0.0257
0.1032

SVD
med
0.0495
0.0388
0.0305
0.1138

high
0.0677
0.0463
0.0402
0.1307

high
0.0511
0.0270
0.0321
0.1007

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

0.1

0.05

0.05
0
0.05

0.1

0.1

0
0

0.5

0.5

0.1

0.5

0.1

0.5

0.5

0
0.5

0
0

0.5

0.5

0.5

0.5

0.5

0
0.5

0
0

0.5

0.5

0.5

0.5

0.5

0
0.5

50

0
0

0.5

0.5

0.5

0.5

0.5

0.5

0.5

Fig 6. From top to bottom: observed data, NEED-D estimator, adaptive SVD estimator and
non adaptive SVD estimator for high noise (rsnr=3)

7. Proof of Theorem 1
In this proof, C will denote an absolute constant which may change from one
line to the other.

Blocks
Bumps
Heavisine
Doppler

Adaptive SVD
low
med
high
0.0665 0.0743 0.0900
0.0453 0.0508 0.0617
0.0266 0.0317 0.0418
0.1042 0.1114 0.1258
Table 2
Error L2 for each target, each noise level and each

low
0.0714
0.0489
0.0278
0.1092

SVD
med
0.0790
0.0577
0.0327
0.1200

high
0.0959
0.0706
0.0422
0.1378

low
0.0606
0.0378
0.0235
0.0969

NEED-D
med
0.0673
0.0416
0.0288
0.0999

estimator

high
0.0816
0.0523
0.0379
0.1071

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

51

First we have the following decomposition:


Ekf f kpp 2p1 {Ek
=: I + II

J
X
X

j=1 Zj

(t(j ) j )j kpp + k

XX

j>J Zj

j j kpp }

s
The term II is easy to analyse, as follows: Since f belongs to B,r
(M ), using
standard embedding results (which in this case simply follows from direct com1
p1 )+
s(

parisons between lq norms) we have that f also belong to Bp,r


some constant M . Hence
XX
1
1
j j kp C2J[s( p )+ ] .
k

(M ), for

j>J Zj

Then we only need to verify that


difficult.

1
1
p
)+
s(
+1/2

is always larger that , which is not

Bounding the term I is more involved. Using the triangular inequality together
with H
older inequality, and property (20) for the second line, we get
I 2p1 J p1

J
X

Ek

j=1

2p1 J p1 C

J
X

Zj

j=1 Zj

(t(j ) j )j kpp

E|t(j ) j |p kj kpp .

Now, we separate four cases:


J
X
X

j=1 Zj

E|t(j ) j |p kj kpp

J
X
X

j=1 Zj



E|t(j ) j |p kj kpp I{|j | t j } + I{|j | < t j }

J
X
X

E|j j |p kj kpp I{|j | t j }


j=1 Zj




I{|j | t j } + I{|j | < t j }


2
2


p
p

+|j | kj kp I{|j | < t j } I{|j | 2t j } + I{|j | < 2t j }

: Bb + Bs + Sb + Ss.

P i
= i i bj
is a gaussian rani
i
P j
2
2
dom variable centered, and with variance
i [ bi ] , we have using standard
P
If we notice that j j = i

Yi bi fi i
j
bi

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

52

properties of the gaussian distribution, for any q 1, if we recall that we set


i
P j
2
2j
, and denote by sq the qth absolute moment of the
j2 =:
i [ bi ] C2
gaussian distribution when centered and with variance 1:
E|j j |q sq jq q
2

P{|j j | t j } 2 /8
2
Hence,
Bb
Ss

J
X
X

j=1 Zj
J
X
X

j=1 Zj

jp p kj kpp I{|j |

t j }
2

|j |p kj kpp I{|j | < 2t j }.

And,
Bs

J
X
X

j=1 Zj

[E|j j |2p ]1/2 [P{|j j |

t j }]1/2 kj kpp I{|j | < t j }


2
2
J
X
X 1/2 p
2

s2p j p 21/2 /16 kj kpp I{|j | < t j }


2
j=1

Zj

J
X

2jp(+ 2 ) p

/16

j=1

/16

Now, if we remark that the j are necessarily all bounded by some constant
s
(depending on M ) since f belongs to B,r
(M ), and using (19),
Sb

J
X
X

j=1 Zj
J
X
X

j=1 Zj

J
X

j=1

|j |p kj kpp P{|j j | 2t j }I{|j | 2t j }


|j |p kj kpp 2
p

2 j 2

/8

2
8

/8

I{|j | 2t t j }

p
(2+1)

It is easy to check that in any cases if 2 16p the terms Bs and Sb are
smaller than the rates announced in the theorem.
If we recall that:
r
1
t = log

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

53

We have using (19) and condition (23) for any z 0:


Bb Cp
Cp

J
X

2j(p+ 2 1)

j=1
J
X

I{|j |

|j |z [t j ]z

Zj
p

2j(p+ 2 1)

j=1

Ct pz

Zj

J
X

2j[(pz)+ 2 1]

j=1

Zj

t j }
2

|j |z

Also, for any p z 0


Ss C

J
X

2j( 2 1)

j=1

Zj

C[t ]pz

J
X

|j |z jpz [t ]pz
p

2j((pz)+ 2 1)

j=1

Zj

|j |z

So in both cases we have the same bound to investigate. We will write this
bound on the following form (forgetting the constant):
I + II = t pz1 [

j0
X

2j[(pz1 )+ 2 1]

j=1

+ t pz2 [

|j |z1 ]

Zj

J
X

2j[(pz2 )+ 2 1]

j=j0 +1

Zj

|j |z2 ]

The constants zi and j0 will be chosen depending on the cases, with the only
constraint p zi 0.
Notice first, that we only need to investigate the case p , since when
s
s
p , Br
(M ) Bpr
(M ).
Let us first consider the case where s ( + 21 )( p 1), put
q=

p(2 + 1)
2(s + ) + 1

and observe that on the considered domain, q and p > q. In the sequel it
will be useful to observe that we have s = ( + 21 )( pq 1). Now, taking z2 = ,
we get:
J
X
X
p
|j | ]
2j[(p)+ 2 1]
II t p [
j=j0 +1

Now, as

Zj

p
1
p
1
1
+ ( 1) = s +
2q
q
2

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

and

54

|j | = 2j(s+ 2 ) j

Zj

s
with (j )j lr (this last thing is a consequence of the fact that f B,r
(M )
and item (5)), we can write:
X
1

2jp(1 q )(+ 2 ) j
II t p
j=j0 +1

Ct

1
p j0 p(1
q )(+ 2 )

The last inequality is true for any r 1 if > q and for r if = q. Notice
p
12 ). Now if we choose j0 such that
that = q is equivalent to s = (2 + 1)( 2
p
1
2j0 q (+ 2 ) t 1 we get the bound
t pq
which exactly gives the rate announced in the theorem for this case.
As for the first part of the sum (before j0 ), we have, taking now z1 = qe, with
1
P
P
1
q e
qe , so that [ 21j Zj |j |e
] q [ 21j Zj |j | ] , and using again (7),
I

j0
X
X
q
q
q )+ p
2 1]
|j |e
]
t pe
[
2j[(pe
1

j0
X

q
t pe
[

q
t pe

j0
X

Zj

eq
eq X
q )+ p
2 ][
2j[(pe
|j | ] ]
Zj

e
q
1
q
2j[(+ 2 )p(1 q )] je

e
q
q j0 [(+ 21 )p(1 q )]
Ct pe
2

Ct pq

The last two lines are valid if qe is chosen strictly smaller than q (this is possible
since q).
p
Let us now consider the case where s < (2 + 1)( 2
21 ), and choose now
q=p

1
2

s+ +

1
2

1
p

In such a way that we easily verify that p q =


(p)( 12 +)s
1
s++ 21

> 0 Furthermore we also have s +

1
2

1
1

)
p(s+ p
and q
1
s++ 21
p
p
1
1
= 2q q + ( q

=
1).

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

55

s
Hence taking z1 = and using again the fact that f belongs to B,r
(M ),

I t

j0
X
X
p
|j | ]
2j[(p)+ 2 1]
[

j0
X

t p
Ct

This is true

2j[(+ 2 p ) q (q)] j

p
p j0 [(+ 21 p1 ) q (q)]

since + 21 p1
1
1
j0 p
q (+ 2 p )

If we now take 2

Zj

is also strictly positive because of our constraints.


t 1 we get the bound
t pq

which is the rate announced in the theorem for this case.


Again, for II, we have, taking now z2 = qe > q(> )
q
II t pe
[

J
X

j=j0 +1

q
Ct pe

q )+ 2 1]
2j[(pe

j=j0 +1

Zj

q
|j |e
]

e
q
1
1 p
q )]
2j[(+ 2 p ) q (qe
zj

q )]
q j0 [(+ 2 p ) q (qe
Ct pe
2
1

Ct pq

8. Proof of the Theorems 2 and 3


The proof essentially follows the same steps as in the previous section. However,
the following proposition will be helpful in the sequel.
Proposition 4. Let us suppose that the following estimates are verified: Under
the conditions (30) and (31), we have
1.
X
X
p(
|j |p kj, kpp )1/p (
|j | kj, k )1/

2.

<p

,k()<2j1

<p

,k()<2j1

,k()2j1

,k()2j1

|j |p kj, kpp )1/p


|j | kj, k )1/ 22j(+1)(1/1/p)
|j |p kj, kpp )1/p
|j | kj, k )1/ 22j(+1)(1/1/p)

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

56

Proof. If p clearly, because, CardZj 2j ,


X
X
(
|j |p kj, kpp )1/p 2j(1/p1/) (
|j | kj, kp )1/
Zj

Zj

But, using (30) and (31),


p kj, kp Ckj, k 2j(1/1/p) .
So

X
X
(
|j |p kj, kpp )1/p (
|j | kj, k )1/

If p, clearly
X
(
|j |p kj, kpp )1/p (
,k()<2j1

,k()<2j1

|j | kj, kp )1/

But
Hence,
X
(

,k()<2j1

kj, kp Ckj, k 2j2(+1)(1/1/p) , k() < 2j1


|j |p kj, kpp )1/p (

,k()<2j1

|j | kj, k )1/ 22j(+1)(1/1/p)

The proof of the inequality with instead of obviously is identical.


Going back to the main stream of the proof, we first decompose:
Ekf f kpp

2p1 {Ek

=: I + II

J
X
X

j=1 Zj

(t(j ) j )j kpp + k

XX

j>J Zj

j j kpp }

For II, using (27),


XX
X X
X X
k
j j kpp (
k
j j kp )p C[
(
kj j kpp )1/p ]p
j>J Zj

j>J

If p, if we put =

2
1+2 ,

Zj

j>J Zj

e s (M ),
using f B
p,r

II C2Jsp = Ctsp

If < p, we decompose II in the following way


X
X
II C{[
(
|j |p kj kpp )1/p ]p
j>J ,k()<2j1

+[

j>J ,k()2j1

:=

II() + II().

|j |p kj kpp )1/p ]p }

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

57

s
ep,r
Now, using (4), and f B
(M ),

II() C[

2js 2j2(+1)(1/1/p) ]p

j>J

If s > 2( + 1)(1/ 1/p)


II() C2J(s2(+1)(1/1/p))p = Ct(s2(+1)(1/1/p))p
.

The term II() can be treated in the same way.


For I
Using the triangular inequality together with H
older inequality, and (27) for
the second line, we get
I

2p1 J p1

J
X

Ek

j=1

2p1 J p1 C

p1 p1

J
X

Zj

j=1 Zj

(t(j ) j )j kpp

E|t(j ) j |p kj kpp

C[I() + I()]

In the last line we separated as previously, in the sum Zj , the indices


k() < 2j1 and k() 2j1 . We will only investigate in the sequel I(), since
the argument for I() goes in the same way.
Now, we separate four cases:
J
X

j=1 ,k()<2j1

J
X

E|t(j ) j |p kj kpp

j=1 ,k()<2j1

J
X

j=1 ,k()<2j1



E|t(j ) j |p kj kpp I{|j | t j } + I{|j | < t j }

E|j j |p kj kpp I{|j | t j }

t j } + I{|j | < t j }
2
2


+|j |p kj kpp I{|j | t j } I{|j | 2t j } + I{|j | < 2t j }

I{|j |

: Bb + Bs + Sb + Ss

P i
= i i bj
is a gausi
i
P

)2 , we have using
sian random variable centered, and with variance 2 i ( bj
i
standard properties of the gaussian distribution, for any q > 0:
P
If we notice, as before, that j j = i

Yi bi fi i
j
bi

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

58

i
X j
)2 ]q/2 sq jq q C2jq q
(
b
i
i
r
2

1
P{|j j | log j } c /8
2

E|j j |q sq [2

Hence,
r
1

log j }
2

j=1 ,k()<2j1
r
J
X
X
1
p
p
|j | kj kp I{|j | < 2 log j }

j1
j=1
J
X

Bb
Ss

jp p kj kpp I{|j |

,k()<2

And,
r

1
2p 1/2

[E|j j | ] [P{|j j | log j }]1/2


Bs
2

j=1 ,k()<2j1
r

1
kj kpp I{|j | < log j }
2

r
J
X
X
1

1/2 p p 1/2 2 /16


p

2p j c
kj kp I{|j | < log j }
2

j1
j=1
J
X

,k()<2

c p

J
X

/16

2jp

j=1

Zj

kj kpp

p 2 /16 J(p+(p/2)(p2)(1+))

using (26). Now, if we remark that the j are necessarily all bounded by some
s
ep,r
constant M , since f B
(M ),
Sb

J
X

j=1 ,k()<2j1

|j |p kj kpp P{|j j |

r
1
1
2 log j }I{|j | 2 log j }

r
J
X
X
2
1
|j |p kj kpp c /8 I{|j | 2 log j }

j1
j=1
,k()<2

/8

J
X
X

j=1 Zj

2
8

kj kpp

2J(p/2(p2)(1+))

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

59

It is easy to check that in any cases for 2 large enough, the terms Bs and
Sb are smaller than the rates announced in the two theorems.
Now we focus on the bounds of Bb and Ss. Let q [0, p], we always have:
p

J
X

j=1 ,k()<2j1

J
X

j=1

jp kj kpp I{

,k()<2j1

|j |
t }
j
2

jp kj kpp |j |q

(j t /2)q

p (t /2)q

J
X

j=1 ,k()<2j1

jpq kj kpp |j |q

And
J
X

j=1

,k()<2j1
J
X

|j |p kj kpp I{|j | < 2t j }


X

(2t j )pq

j=1

(2t )pq

,k()<2j1
J
X

j=1 ,k()<2j1

|j |q kj kpp

jpq kj kpp |j |q .

So like in the wavelet scenario we have the same bound to investigate:


Bb + Ss

J
X

j=1 ,k()<2j1

(t j )pq kj kpp |j |q ,

then we use (30) and we separate as before the bound obtained in two terms A
and B with some parameters j0 , z1 and z2 determined later, depending on the
cases:
A :=

j0
X

(t j )pz1 2j(p2)(+1)

j=1

B :=

J
X

(t j )pz2 2j(p2)(+1)

j=j0 +1

|j |z1 k (p2)(+1/2)

|j |z2 k (p2)(+1/2) .

,k()<2j1

,k()<2j1

Let us first suppose that p and (p 2)( + 1/2) 1, or that p and


(p 2)( + 1/2) 1.

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

60

Then we take z1 = 0 and z2 = p, and let us denote p = 1 (p 2)( + 21 ). We


have:
A =

j0
X

j=1

j0
X

(t j )p 2j(p2)(+1)

k (p2)(+1/2)

,k()<2j1

(t j )p 2j(p/2)(p2)(+1) j I(p =0)

j=1

1
C(t j0 )p 2j0 (p/2)(p2)(+1) (log )I(p =0) .

And by treating B as was done previously with the term II(), we obtain:
B

J
X

j=j0 +1

2j(p2)(+1)

,k()<2j1
1

|j |p k (p2)(+1/2)

C2j0 p[s2(+1)( p )+ ] .
1/[s++ 12 ]

So if p and (p 2)( + 1/2) 1 we set 2j0 = t


yields:
s
p
1
s++ 1
2
A + B Ct
(log )I(p =0) ,

, which

2
)]
1/[s++(+1)(1

and if p and (p 2)( + 1/2) 1 we take 2j0 = t


yields:
s2(+1)( 1 1 )

p
s++(+1)(1 2 )

A + B Ct

, which

1
(log )I(p =0) .

In the other cases: p < and (p2)(+1/2) > 1, or p > and (p2)(+1/2) <
(p2)(+1/2)1
, which satisfies:
1, let us set q = (2)(+1/2)1
pq =

2( + 1)( p)
,
( 2)( + 1/2) 1

and q =

( + 1/2)( p)
,
( 2)( + 1/2) 1

so q ]0, p [ under the assumptions made above.


Let us bound the quantity

,k()<2j1

q
1 = ( 2)( + 1/2),

|j |q k (p2)(+1/2) . We define:

and 2 = (p 2)( + 1/2) 1 .

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

61

s
ep,r
Using H
older inequality, (30), and the fact that f B
(M ), we have:
X
|j |q k (p2)(+1/2)
,k()<2j1

Zj

|j |q k 1 k 2
X

,k()<2j1

C2jsqj

|j | k (2)(+1/2) ] [

2
q

k 1 ]1

,k()<2j1

q
(2)(+1)

2
q

]1

,k()<2j1

= C2j(p2)(+1) 2j(sq+

pq
2 )

j 1 .

In the last line we used the fact that:


q
pq
( 2)( + 1) = sq +
,

(p 2)( + 1) sq

and

2
= 1.
q

1. Let us assume that:


1
sq + (p q)( + ) < 0,
2
i.e. that:
s[(p 2)( + 1/2) 1] + ( + 1)( p)(2 + 1)
< 0.
( 2)( + 1/2) 1
Then we take z1 = 0 and z2 = q:
j0
X

A =

j=1

J
X

(t j )pq 2j(p2)(+1)

j=j0 +1
J
X

k (p2)(+1/2)

,k()<2j1
p j0 (p/2)(p2)(+1)

(t j0 ) 2
B

(t j )p 2j(p2)(+1)
,

,k()<2j1

(t j )pq 2j(sq+

C[

C(t j0 )pq 2j0 (sq+

pq
2 )

|j |q k (p2)(+1/2)

]J 1

j=j0 +1
pq
2 )

q
1
(log )1 .

2
1/[s++(+1)(1
)]

If (p 2)( + 1/2) > 1 we take 2j0 = t


p

A + B Ct

s2(+1)( 1 1 )

p
s++(+1)(1 2 )

q
1
(log )1 ,

, which yields:

G. Kerkyacharian et al./NEED-D: estimation in inverse problems


1/[s++ 21 ]

and if (p 2)( + 1/2) < 1 we take 2j0 = t

62

, which yields:

s
p
s++ 1
2

q
1
(log )1 .

Notice that, because of our conditions on s, we always have j0 J.

A + B Ct

2. Let us now assume that:


s[(p 2)( + 1/2) 1] + ( + 1)( p)(2 + 1)
> 0.
( 2)( + 1/2) 1
Then we take z1 = q and z2 = p:
A C[

j0
X

(t j )pq 2j(sq+

pq
2 )

]J 1

j=1

C(t j0 )pq 2j0 (sq+

pq
2 )

q
1
(log )1 ,

and as before with the bias term II():


B

J
X

j=j0 +1

C2

2j(p2)(+1)

,k()<2j1

1
1
p
)+ ]
j0 p[s2(+1)(

1/[s++ 21 ]

If > p we take 2j0 = t

|j |p k (p2)(+1/2)

, which yields:

A + B Ct

s
s++ 1
2

q
1
(log )1 ,

2
1/[s++(+1)(1
)]

and if < p we take 2j0 = t

s2(+1)( 1 1 )

p
p
s++(+1)(1 2 )

A + B Ct

, which yields:

q
1
(log )1 .

3. Let us finally assume that:


s[(p 2)( + 1/2) 1] + ( + 1)( p)(2 + 1) = 0.
We take z1 = q and z2 = p as previously:
A+B

j0
X

tpq
j 1 + C2j0 p[s2(+1)( p )+ ]

j=1

q
1
1
1
Ctpq
(log )2 + C2j0 p[s2(+1)( p )+ ] .

We proceed exactly like in the previous case, and we obtain the same rate
whether p or < p:

A + B Ct

s
s++ 1
2

q
1
(log )2 .

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

63

We can sum up all the results for Bb and Ss (and thus on I()) very simply:
if 2( + 1)( 1 p1 ) < s and s[1 (p 2)( + 1/2)] p(2 + 1)( + 1)( 1 p1 )
then:
1
1
p

Bb + Ss Ct

s+2(+1)( )
p

s++(+1)(1 2 )

1
(log )a ,

if s[1 (p 2)( + 1/2)] > p(2 + 1)( + 1)( 1 p1 ) then:


p

Bb + Ss Ct

s
s++ 1
2

1
(log )a ,

where the power of the log factor depends on the parameters:


(
I{p = 0}
if [p ][1 (p 2)( + 1/2)] 0,
a=
(+ 12 )(p)
(2)(+1/2)1 + I{s = 0} if [p ][1 (p 2)( + 1/2)] < 0,
with p = 1 (p 2)( + 1/2) and s = s[1 (p 2)( + 1/2)] p(2 + 1)( +
1)( 1 p1 ).
Note that the first term in the second case is bounded by 1, so we have a 2
whatever the case.
9. Appendix: Needlets induced by Jacobi polynomials
The main references for this appendix are the two papers [31] of Petrushev and
Xu, and [21] of Kyriazis, Petrushev and Xu.
9.1. Jacobi needlets: Definition and basic properties
In this section we apply the general scheme from 3 for the construction of
Jacobi needlets. We begin by introducing some necessary notation. We denote
I = [1, 1] and d, (x) = c, , (x)dx, where
, (x) = (1 x) (1 + x) ; , > 1/2,
R
and c, is defined by I d, (x) = 1. Assume P , are the classical Jacobi
polynomials (cf. e.g. [32]). Let ,
be the Jacobi polynomial of degree k, nork
malized in L2 (d ), i.e.
Z
,
,
k n d, = m,n .
I

Let a() be as in 3.1 with the additional condition: a() > c > 0 for 3/4
7/4. Note that supp a [1/2, 2]. We define as in 3.1
X
,
j (x, y) =
a(k/2j ),
k (x)k (y).
k

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

64

Let = cos j, , = 1, 2, . . . , 2j , be the zeros of the Jacobi polynomial P2j


ordered so that 1 > 2 > > 2j and hence 0 < j,1 < j,2 < < j,2j < .
It is well known that (cf. [32])
j,

.
2j

(33)

We set
Xj = { : = 1, 2, . . . , 2j }.
Let n denote the space of all polynomials of degree less than n. As is well
known [32] the zeros of the Jacobi polynomial P2j serve as knots of the Gaussian
quadrature which is exact for all polynomials from 2j+1 1 , that is,
Z
X
P d, =
bj, P ( ), P 2j+1 1 ,
I

Xj

where the coefficients bj, > 0 are the Christoffel numbers [32] and bj,
2j , (2j ; ) with
, (2j ; x) := (1 x + 22j )+1/2 (1 + x + 22j )+1/2 .
We now define the Jacobi needlets by
p
j, (x) = bj, 2j (x, ),

= 1, 2, . . . , 2j ; j 0,

and we set 0 (x) = 1, (x) = 1, X1 with X1 consisting of only one


point = 0. From Proposition 2, (j, ) is a tight frame of L2 (d ), i.e.
X X
kf k22 =
|hf, j, i|2 , f L2 (d ).
j1 Xj

Hence
kj, k2 1.

(34)

Notice that (34) cannot be an equality since otherwise the needlet system (j, )
would be an orthonormal basis and this is impossible since
Z
Xp
X
bj, j, =
bj, L2j (x, ) = L2j (x, y)d(x) = 0.

We now recall the two main results from [31] which will be essential steps in
our development.
Theorem 4. For any l 1 there exists a constant Cl > 0 such that
1
2j/2
|j, (cos )| Cl p
j
j
, (2 , cos ) (1 + 2 |

l ,
2j |)

0 .

(35)

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

65

Obviously
, (2j ; cos ) = (2 sin2 (/2) + 22j )+1/2 (2 cos2 (/2) + 22j )+1/2 .

(36)

Therefore, 0 /2 = , (2j , cos ) ((2j + 1)2+1 2j(2+1) and hence


|j, (cos )| Cl

1
2j(1+)
,
l (2j + 1)+1/2
(1 + 2j |
|)
2j

0 /2.

(37)

Similarly, from (36)


/2 = , (2j , cos ) (2j ( ) + 1)2+1 2j(2+1)
and hence
|j, (cos )| Cl

2j(1+)
1
,
l
j
j
(1 + 2 | 2j |) (2 ( ) + 1)+1/2

Theorem 5. Let 0 < p . Then



1/p
Z
kj, kp =
Cp
|j, (x)|p d,
I

/2 . (38)

1/21/p
2j
.
, (2j ; )

Using (33) and (36), we infer , (j; ) 2j(2+1) 2+1 if 1 2j1


and , (j; ) 2j(2+1) (2j + 1)2+1 if 2j1 < 2j . Consequently,
1 2j1 =
2

j1

<2

kj, kp Cp

kj, kp Cp

2j(+1)
+1/2

12/p

2j(+1)
(2j + 1)+1/2

12/p

(39)

(40)

Corollary 1. Let 1 p and 1/p + 1/q = 1. Then

(j, ), kj, kp kj, kq Cp Cq


9.2. Estimation of the Lp norms of the needlets
Here we establish estimates (30)(31) for the norms of the Jacobi needlets. In
fact we only need to prove the lower bounds because the upper bounds are given
above, see Theorem 5 and (39)(40). We record these bounds in the following
theorem. We want to express our thanks to Yuan Xu for communicating to us
another proof of this result.
Theorem 6. 0 < p , j N,
= 1, . . . , 2

j1

, cp

2j(+1)
+1/2

12/p

k j, kp Cp

2j(+1)
+1/2

12/p

G. Kerkyacharian et al./NEED-D: estimation in inverse problems

66

2j1 < 2j ,


12/p
12/p
2j(+1)
2j(+1)
cp
k j, kp Cp
(1 + (2j ))+1/2
(1 + (2j ))+1/2

or equivalently

1/21/p
2j
.
(41)
, (2j ; )
A critical role in the proof of this theorem will play the following proposition.
kj, kp

Proposition 5. Let c be an arbitrary positive constant. Then there exists a


constant c > 0 such that
2N
1
X
k=N

[Pk, (cos )]2 c, (N ; cos )1

for

c N 1 c N 1 ,

N 2.
(42)

Proof. The proof will rely on the well known asymptotic representation of Jacobi
polynomials (sf. [32, Theorem 8.21.12, p. 195]): For any constants c > 0 and
>0
 



cos
sin
Pn, (cos )
2
2
(n + + 1)  1/2
= N
J (N ) + 1/2 O(n3/2 )
(43)
n!
sin
for cn1 , where N = n + ( + + 1)/2 and J is the Bessel function.
Further, using the well known asymptotic identity
 1/2
2
J (z) =
cos(z + ) + O(z 3/2 ), z ( = /2 /4), (44)
z
one obtains (sf. [32, Theorem 8.21.13, p. 195])
1/2 
1/2


cos
Pn, (cos ) = (n)1/2 sin
{cos(N +)+(n)1 O(1)}
2
2
(45)
for cn1 cn1 .
,
As is well known the Jacobi polynomials Pk, and Pk+1
have no common
zeros and hence it suffices to prove (42) only for sufficiently large N . Also,
Pk, (x) = (1)k Pk, (x) and therefore it suffices to prove (42) only in the
case c N 1 /2.
Denote by $F_N(\theta)$ the left-hand side quantity in (42). Then by (45), applied with $\bar c/2$ in place of $\bar c$, it follows that
\[
F_N(\theta) \ge N^{-1} \theta^{-2\alpha-1} \sum_{k=N}^{2N-1} \bigl\{c_1 \cos^2(k\theta + h(\theta)) - c_2 (k\theta)^{-2}\bigr\}
\ge c_1 N^{-1} \theta^{-2\alpha-1} \sum_{k=N}^{2N-1} \cos^2(k\theta + h(\theta)) - c_2\, \theta^{-2\alpha-1} (N\theta)^{-2}
\]
for $\bar c N^{-1} \le \theta \le \pi/2$, where $h(\theta) = \theta(\alpha+\beta+1)/2 - \alpha\pi/2 - \pi/4$. It is easy to verify that for $0 < \theta \le \pi/2$
\[
\sum_{k=N}^{2N-1} \cos^2(k\theta + h) = \frac{N}{2} + \frac{\sin N\theta}{2\sin\theta}\, \cos\bigl((3N-1)\theta + 2h\bigr) \ge \frac{N}{2} - \frac{\pi}{4\theta}.
\]
Therefore,
\[
F_N(\theta) \ge \theta^{-2\alpha-1} \bigl(c_1/4 - c_2 (N\theta)^{-2}\bigr) \ge (c_1/8)\, \theta^{-2\alpha-1} \ge c\, W_{\alpha,\beta}(N; \cos\theta)^{-1} \quad\text{for } c_* N^{-1} \le \theta \le \pi/2, \tag{46}
\]
where $c_* = \max\{\pi, (8c_2/c_1)^{1/2}\} > 0$.


It remains to establish (42) for $\bar c N^{-1} \le \theta \le c_* N^{-1}$. Denote $\gamma = (\alpha+\beta+1)/2$. We now apply (43) with the constant $\bar c$ and with $\varepsilon = \pi/2$ to obtain, using that $\Gamma(n+\alpha+1)/n! \asymp n^\alpha$, $\sin\theta \asymp \theta$, and (44),
\[
\bigl[P_k^{\alpha,\beta}(\cos\theta)\bigr]^2 \ge \theta^{-2\alpha} \Bigl\{c_1 \bigl[J_\alpha((k+\gamma)\theta)\bigr]^2 - c_2\, k^{-3/2} \theta^{1/2} \bigl|J_\alpha((k+\gamma)\theta)\bigr|\Bigr\}
\ge c_1\, \theta^{-2\alpha} \bigl[J_\alpha((k+\gamma)\theta)\bigr]^2 - c_2\, \theta^{-2\alpha} k^{-2}.
\]
Choose $\tau$ so that $\theta = \tau/N$ and $\bar c \le \tau \le c_*$. Summing up above we get
\[
F_N(\theta) \ge c_1\, \theta^{-2\alpha} \sum_{k=N}^{2N-1} \bigl[J_\alpha((k+\gamma)\theta)\bigr]^2 - c_2\, \theta^{-2\alpha} N^{-1}
= c_1\, \theta^{-2\alpha} N \sum_{k=N}^{2N-1} \frac{1}{N} \Bigl[J_\alpha\Bigl(\frac{\tau(k+\gamma)}{N}\Bigr)\Bigr]^2 - c_2\, \theta^{-2\alpha} N^{-1}
\]
\[
= c_1\, \theta^{-2\alpha} N \sum_{i=0}^{N-1} \frac{1}{N} \Bigl[J_\alpha\Bigl(\tau\Bigl(\frac{i}{N} + 1 + \frac{\gamma}{N}\Bigr)\Bigr)\Bigr]^2 - c_2\, \theta^{-2\alpha} N^{-1}.
\]

Obviously, the last sum above involves only values of the Bessel function $J_\alpha(z)$ for $\bar c \le z \le c_*(2 + \gamma)$, and hence, uniformly in $\tau \in [\bar c, c_*]$,
\[
\sum_{i=0}^{N-1} \frac{1}{N} \Bigl[J_\alpha\Bigl(\tau\Bigl(\frac{i}{N} + 1 + \frac{\gamma}{N}\Bigr)\Bigr)\Bigr]^2 - \sum_{i=0}^{N-1} \frac{1}{N} \Bigl[J_\alpha\Bigl(\tau\Bigl(\frac{i}{N} + 1\Bigr)\Bigr)\Bigr]^2 \longrightarrow 0, \qquad N \to \infty.
\]
The second sum above can be viewed as a Riemann sum of the integral $\int_0^1 J_\alpha^2(\tau(u+1))\, du$, which is a continuous function of $\tau \in [\bar c, c_*]$, and hence $\min_{\tau \in [\bar c, c_*]} \int_0^1 J_\alpha^2(\tau(u+1))\, du \ge \tilde c > 0$. Consequently, for sufficiently large $N$,
\[
F_N(\theta) \ge \theta^{-2\alpha} \bigl(\tilde c\, c_1 N/2 - c_2 N^{-1}\bigr) \ge c\, W_{\alpha,\beta}(N; \cos\theta)^{-1} \quad\text{for}\quad \bar c N^{-1} \le \theta \le c_* N^{-1}.
\]
From this and (46) it follows that (42) holds for sufficiently large $N$, and this completes the proof of Proposition 5.
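Proposition 5 also lends itself to a direct numerical check. The Python sketch below (scipy assumed; the indices, $N$, and the grid are illustrative) evaluates the ratio of the two sides of (42) over the admissible range of $\theta$ and confirms that it stays bounded away from zero:

    import numpy as np
    from scipy.special import eval_jacobi

    alpha, beta = 0.5, 0.5
    N, cbar = 64, 1.0

    def W(n, theta):
        # the weight W_{alpha,beta}(n; cos theta) of (36)
        return ((2 * np.sin(theta / 2) ** 2 + n ** -2.0) ** (alpha + 0.5)
                * (2 * np.cos(theta / 2) ** 2 + n ** -2.0) ** (beta + 0.5))

    theta = np.linspace(cbar / N, np.pi - cbar / N, 2000)
    F = sum(eval_jacobi(k, alpha, beta, np.cos(theta)) ** 2 for k in range(N, 2 * N))
    ratio = F * W(N, theta)     # (42): bounded below by some c_* > 0
    print(ratio.min(), ratio.max())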


Proof of Theorem 6. We first note that (cf. [32]) $\mathcal{P}_k^{\alpha,\beta}(x) \asymp k^{1/2} P_k^{\alpha,\beta}(x)$, where $\mathcal{P}_k^{\alpha,\beta}$ denotes the orthonormal Jacobi polynomial, and hence, using $b_{j,\eta} \gtrsim 2^{-j} W_{\alpha,\beta}(2^j; \xi_\eta)$ and the fact that $a \ge c > 0$ on $[3/4, 7/4]$,
\[
\|\psi_{j,\eta}\|_2^2 = b_{j,\eta} \sum_{2^{j-1} < k < 2^{j+1}} a^2(k/2^j) \bigl(\mathcal{P}_k^{\alpha,\beta}(\cos\theta_{j,\eta})\bigr)^2
\ge c\, W_{\alpha,\beta}(2^j; \xi_\eta)\, 2^{-j} \sum_{2^{j-1} < k < 2^{j+1}} a^2(k/2^j)\, k \bigl(P_k^{\alpha,\beta}(\cos\theta_{j,\eta})\bigr)^2
\]
\[
\ge c\, W_{\alpha,\beta}(2^j; \xi_\eta) \sum_{\frac{3}{4} 2^j \le k \le \frac{7}{4} 2^j} \bigl(P_k^{\alpha,\beta}(\cos\theta_{j,\eta})\bigr)^2.
\]
Observe also that there exists a constant $c' > 0$ such that $c'^{-1}\, \eta/2^j \le \theta_{j,\eta} \le c'\, \eta/2^j$, $\eta = 1, 2, \ldots, 2^j$. We now employ Proposition 5 and (34) to conclude that
\[
0 < c \le \|\psi_{j,\eta}\|_2 \le 1. \tag{47}
\]
We need to establish only the lower bound in Theorem 6. Recall first the upper bound from Theorem 5:
\[
\|\psi_{j,\eta}\|_p \le C_p \biggl(\frac{2^j}{W_{\alpha,\beta}(2^j; \xi_\eta)}\biggr)^{1/2-1/p}, \qquad 0 < p \le \infty. \tag{48}
\]
Suppose $2 < p < \infty$ and let $1/p + 1/q = 1$. By (47) and Hölder's inequality we have
\[
0 < c \le \|\psi_{j,\eta}\|_2^2 \le \|\psi_{j,\eta}\|_p\, \|\psi_{j,\eta}\|_q \le c' \|\psi_{j,\eta}\|_p \biggl(\frac{2^j}{W_{\alpha,\beta}(2^j; \xi_\eta)}\biggr)^{1/2-1/q},
\]
which yields
\[
\|\psi_{j,\eta}\|_p \ge c \biggl(\frac{2^j}{W_{\alpha,\beta}(2^j; \xi_\eta)}\biggr)^{1/2-1/p}. \tag{49}
\]
The case $p = \infty$ is similar. In the case $0 < p < 2$, we have, using (47),
\[
0 < c \le \|\psi_{j,\eta}\|_2^2 \le \|\psi_{j,\eta}\|_p^p\, \|\psi_{j,\eta}\|_\infty^{2-p} \le c' \|\psi_{j,\eta}\|_p^p \biggl(\frac{2^j}{W_{\alpha,\beta}(2^j; \xi_\eta)}\biggr)^{1-p/2},
\]
which implies (49). The lower bound estimates in Theorem 6 follow by (49).
9.3. Bounding the norm of a linear combination of needlets

Our goal is to prove estimate (27), which we record in the following theorem:

Theorem 7. Let $0 < p < \infty$. There exists a constant $A_p > 0$ such that for any collection of numbers $\{\lambda_\eta : \eta = 1, 2, \ldots, 2^j\}$, $j \ge 0$,
\[
\Bigl\| \sum_{\eta=1}^{2^j} \lambda_\eta \psi_{j,\eta} \Bigr\|_{L_p(\omega_{\alpha,\beta})}^p \le A_p \sum_{\eta=1}^{2^j} |\lambda_\eta|^p\, \|\psi_{j,\eta}\|_{L_p(\omega_{\alpha,\beta})}^p, \tag{50}
\]
and if $p = \infty$,
\[
\Bigl\| \sum_{\eta=1}^{2^j} \lambda_\eta \psi_{j,\eta} \Bigr\|_{L_\infty(\omega_{\alpha,\beta})} \le A_\infty \sup_{1 \le \eta \le 2^j} |\lambda_\eta|\, \|\psi_{j,\eta}\|_{L_\infty(\omega_{\alpha,\beta})}. \tag{51}
\]
Proof. Let us begin with the simplest case, $p = \infty$:
\[
\Bigl\| \sum_\eta \lambda_\eta \psi_{j,\eta} \Bigr\|_\infty \le \sup_\eta \bigl(|\lambda_\eta|\, \|\psi_{j,\eta}\|_\infty\bigr)\, \Bigl\| \sum_\eta \frac{|\psi_{j,\eta}|}{\|\psi_{j,\eta}\|_\infty} \Bigr\|_\infty.
\]
We can conclude using the following lemma:

Lemma 1. There exists $C < \infty$ such that for all $j \in \mathbb{N}$,
\[
\Bigl\| \sum_\eta \frac{|\psi_{j,\eta}|}{\|\psi_{j,\eta}\|_\infty} \Bigr\|_\infty \le C.
\]

Proof. Using Theorem 4 and (49), we have
\[
\frac{|\psi_{j,\eta}(\cos\theta)|}{\|\psi_{j,\eta}\|_\infty} \le C_l\, \frac{2^{j/2}}{\sqrt{W_{\alpha,\beta}(2^j, \cos\theta)}}\, \frac{1}{(1 + 2^j|\theta - \theta_{j,\eta}|)^l} \cdot \frac{\sqrt{W_{\alpha,\beta}(2^j, \xi_\eta)}}{2^{j/2}},
\]
and hence
\[
\Bigl\| \sum_\eta \frac{|\psi_{j,\eta}|}{\|\psi_{j,\eta}\|_\infty} \Bigr\|_\infty \le \sup_\theta\, C_l \sum_\eta \sqrt{\frac{W_{\alpha,\beta}(2^j, \xi_\eta)}{W_{\alpha,\beta}(2^j, \cos\theta)}}\, \frac{1}{(1 + 2^j|\theta - \theta_{j,\eta}|)^l}.
\]
It remains to prove that this last quantity is bounded. But using (36), one easily shows that (see [21]) there is a constant $\tau = \tau(\alpha, \beta) > 0$ such that
\[
\sqrt{\frac{W_{\alpha,\beta}(2^j, \xi_\eta)}{W_{\alpha,\beta}(2^j, \cos\theta)}} = \sqrt{\frac{W_{\alpha,\beta}(2^j, \cos\theta_{j,\eta})}{W_{\alpha,\beta}(2^j, \cos\theta)}} \le C\, (1 + 2^j|\theta - \theta_{j,\eta}|)^{1/2+\tau};
\]
consequently,
\[
\sup_\theta \sum_\eta \sqrt{\frac{W_{\alpha,\beta}(2^j, \xi_\eta)}{W_{\alpha,\beta}(2^j, \cos\theta)}}\, \frac{1}{(1 + 2^j|\theta - \theta_{j,\eta}|)^l} \le C \sup_\theta \sum_\eta \frac{1}{(1 + 2^j|\theta - \theta_{j,\eta}|)^{l - 1/2 - \tau}} < \infty \tag{52}
\]
for sufficiently large $l$, since the nodes $\theta_{j,\eta}$ are almost equally spaced with mesh size $\asymp 2^{-j}$.


Let now $0 < p < \infty$. Consider the maximal operator
\[
(M_s f)(x) = \sup_{J \ni x} \Bigl(\frac{1}{|J|} \int_J |f(u)|^s\, du\Bigr)^{1/s}, \qquad s > 0,
\]
where the supremum is taken over all intervals $J \subset [-1,1]$ which contain $x$, and $|J|$ denotes the length of $J$. As elsewhere, let $\alpha \ge \beta > -1/2$. It is well known that the weight $\omega_{\alpha,\beta}(x) = (1-x)^\alpha (1+x)^\beta$ on $[-1,1]$ belongs to the Muckenhoupt class $A_p$, $p > 1$, if $\alpha, \beta < p - 1$. Then in the weighted case the Fefferman-Stein maximal inequality (see [14] and [2]) can be stated as follows: if $1 < p, r < \infty$ and $\omega_{\alpha,\beta} \in A_p$, then for any sequence of functions $(f_k)$ on $[-1,1]$,
\[
\Bigl\| \Bigl(\sum_k (M_1 f_k)^r\Bigr)^{1/r} \Bigr\|_{L_p(\omega_{\alpha,\beta})} \le C_{p,r}\, \Bigl\| \Bigl(\sum_k |f_k|^r\Bigr)^{1/r} \Bigr\|_{L_p(\omega_{\alpha,\beta})}.
\]
Using that $M_1 |f|^s = (M_s f)^s$, one easily infers from the above that the following maximal inequality holds: if $0 < p, r < \infty$ and $0 < s < \min\{p, r, \frac{p}{\alpha+1}\}$, then for any sequence of functions $(f_k)$ on $[-1,1]$,
\[
\Bigl\| \Bigl(\sum_k (M_s f_k)^r\Bigr)^{1/r} \Bigr\|_{L_p(\omega_{\alpha,\beta})} \le C\, \Bigl\| \Bigl(\sum_k |f_k|^r\Bigr)^{1/r} \Bigr\|_{L_p(\omega_{\alpha,\beta})}. \tag{53}
\]
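For intuition, $M_s$ is straightforward to mimic on a grid. The following brute-force Python sketch (purely illustrative: the grid, the test function, and the quadratic-time search over intervals are naive choices made for clarity, not an implementation used in the paper) computes a discrete analogue of $M_s f$ and exhibits the polynomial decay away from the support of an indicator, of the kind exploited in (54) below:

    import numpy as np

    def M_s(f_vals, x, s):
        # brute-force discrete analogue of (M_s f)(x): sup over grid subintervals
        dx = np.gradient(x)                           # local mesh widths
        c = np.concatenate([[0.0], np.cumsum(np.abs(f_vals) ** s * dx)])
        out = np.zeros_like(x)
        n = len(x)
        for lo in range(n):
            for hi in range(lo + 1, n):
                avg = (c[hi + 1] - c[lo]) / (x[hi] - x[lo])   # (1/|J|) int_J |f|^s
                out[lo:hi + 1] = np.maximum(out[lo:hi + 1], avg)
        return out ** (1.0 / s)

    x = np.linspace(-1.0, 1.0, 301)
    f = np.where(np.abs(x) < 0.1, 1.0, 0.0)          # indicator of a small interval I
    m = M_s(f, x, s=0.5)
    # inside I, M_s = 1; away from I, (M_s 1_I)^s ~ |I|/dist, so M_s ~ (|I|/dist)^{1/s}
    print(m[150], m[225], m[300])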

As in 9.1, let $\xi_\eta = \cos\theta_{j,\eta}$, $\eta = 1, 2, \ldots, 2^j$, be the zeros of the Jacobi polynomial $P_{2^j}^{\alpha,\beta}$. Set $\xi_0 = 1$, $\xi_{2^j+1} = -1$ and $\theta_{j,0} = 0$, $\theta_{j,2^j+1} = \pi$, respectively. Denote $I_\eta = \bigl[\frac{\xi_\eta + \xi_{\eta+1}}{2}, \frac{\xi_\eta + \xi_{\eta-1}}{2}\bigr]$ and put
\[
H_\eta = h_\eta\, \mathbf{1}_{I_\eta} \qquad\text{with}\qquad h_\eta = \biggl(\frac{2^j}{W_{\alpha,\beta}(2^j; \xi_\eta)}\biggr)^{1/2},
\]
where $\mathbf{1}_{I_\eta}$ is the indicator function of $I_\eta$. We next show that for any $s > 0$,
\[
|\psi_{j,\eta}(x)| \le c\, (M_s H_\eta)(x), \qquad x \in [-1,1], \quad \eta = 1, 2, \ldots, 2^j, \quad j \ge 0. \tag{54}
\]
Obviously, $(M_s \mathbf{1}_{I_\eta})(x) = 1$ for $x \in I_\eta$. Let $x \in [-1,1] \setminus I_\eta$ and set $\cos\theta = x$, $\theta \in [0,\pi]$. Then
\[
\bigl[(M_s \mathbf{1}_{I_\eta})(x)\bigr]^s \ge \frac{|I_\eta|}{|I_\eta| + |x - \xi_\eta|} \ge c\, \frac{\sin\frac{1}{2}(\theta_{j,\eta+1} - \theta_{j,\eta-1})\, \sin\frac{1}{2}(\theta_{j,\eta+1} + \theta_{j,\eta-1})}{\sin\frac{1}{2}|\theta - \theta_{j,\eta}|\, \sin\frac{1}{2}(\theta + \theta_{j,\eta})} \ge c\, \frac{2^{-j}\, \theta_{j,\eta}}{|\theta - \theta_{j,\eta}|\, (\theta + \theta_{j,\eta})}.
\]
Using that $\theta_{j,\eta} \ge c' 2^{-j}$ for some constant $c' > 0$, one easily verifies the inequality
\[
\frac{\theta_{j,\eta}}{\theta + \theta_{j,\eta}} \ge \frac{1}{(2 + c'^{-1})(1 + 2^j|\theta - \theta_{j,\eta}|)}.
\]
From the above it follows that
\[
(M_s \mathbf{1}_{I_\eta})(\cos\theta) \ge \frac{c}{(1 + 2^j|\theta - \theta_{j,\eta}|)^{2/s}}, \qquad \theta \in [0,\pi],
\]
which along with (35) (applied with $l$ large enough; $l \ge 2/s + 1/2 + \tau$ suffices) yields (54).


Combining (54) and (53), we get
\[
\Bigl\| \sum_{\eta=1}^{2^j} \lambda_\eta \psi_{j,\eta} \Bigr\|_{L_p(\omega_{\alpha,\beta})}^p \le c \sum_{\eta=1}^{2^j} |\lambda_\eta|^p\, \|H_\eta\|_{L_p(\omega_{\alpha,\beta})}^p. \tag{55}
\]
A straightforward calculation shows that $\|\mathbf{1}_{I_\eta}\|_{L_p(\omega_{\alpha,\beta})} \asymp \bigl(2^{-j}\, W_{\alpha,\beta}(2^j; \xi_\eta)\bigr)^{1/p}$ and hence, using Theorem 6,
\[
\|H_\eta\|_{L_p(\omega_{\alpha,\beta})} \asymp \biggl(\frac{2^j}{W_{\alpha,\beta}(2^j; \xi_\eta)}\biggr)^{1/2-1/p} \asymp \|\psi_{j,\eta}\|_{L_p(\omega_{\alpha,\beta})}.
\]
This coupled with (55) implies (50).


Corollary 2. We have, for $1 \le p \le \infty$,
\[
\Bigl\| \sum_{\eta=1}^{2^j} \langle f, \psi_{j,\eta}\rangle \psi_{j,\eta} \Bigr\|_{L_p(\omega_{\alpha,\beta})} \le A_p \Bigl(\sum_{\eta=1}^{2^j} |\langle f, \psi_{j,\eta}\rangle|^p\, \|\psi_{j,\eta}\|_{L_p(\omega_{\alpha,\beta})}^p\Bigr)^{1/p} \le A_p'\, \|f\|_{L_p(\omega_{\alpha,\beta})}. \tag{56}
\]

Proof. We only have to prove the right-hand side inequality. Let us first consider the case $p = 1$:
\[
\sum_{\eta=1}^{2^j} |\langle f, \psi_{j,\eta}\rangle|\, \|\psi_{j,\eta}\|_{L_1(\omega_{\alpha,\beta})} \le \sum_{\eta=1}^{2^j} \int |f(x)|\, |\psi_{j,\eta}(x)|\, d\gamma(x)\, \|\psi_{j,\eta}\|_{L_1(\omega_{\alpha,\beta})}
\]
\[
\le \int |f(x)| \Bigl\{\sum_{\eta} \frac{|\psi_{j,\eta}(x)|}{\|\psi_{j,\eta}\|_\infty}\, \|\psi_{j,\eta}\|_\infty\, \|\psi_{j,\eta}\|_1\Bigr\}\, d\gamma(x) \le C\, C_1 C_\infty\, \|f\|_1,
\]
using Lemma 1 and Corollary 1. For $p = \infty$, we have
\[
\sup_{1 \le \eta \le 2^j} |\langle f, \psi_{j,\eta}\rangle|\, \|\psi_{j,\eta}\|_{L_\infty(\omega_{\alpha,\beta})} \le \sup_{1 \le \eta \le 2^j} \int |f(x)|\, |\psi_{j,\eta}(x)|\, d\gamma(x)\, \|\psi_{j,\eta}\|_{L_\infty(\omega_{\alpha,\beta})}
\]
\[
\le \|f\|_\infty \sup_{1 \le \eta \le 2^j} \|\psi_{j,\eta}\|_1\, \|\psi_{j,\eta}\|_\infty \le C_1 C_\infty\, \|f\|_\infty,
\]
where again we have used Corollary 1.


Let now $1 < p < \infty$. Using Hölder's inequality ($1/p + 1/q = 1$) we get
\[
|\langle f, \psi_{j,\eta}\rangle|^p \le \Bigl\{\int |f|\, |\psi_{j,\eta}|^{1/p}\, |\psi_{j,\eta}|^{1/q}\, d\gamma\Bigr\}^p \le \int |f|^p\, |\psi_{j,\eta}|\, d\gamma\, \Bigl(\int |\psi_{j,\eta}|\, d\gamma\Bigr)^{p/q} = \int |f|^p\, |\psi_{j,\eta}|\, d\gamma\, \|\psi_{j,\eta}\|_1^{p/q}.
\]
Hence
\[
\sum_{\eta=1}^{2^j} |\langle f, \psi_{j,\eta}\rangle|^p\, \|\psi_{j,\eta}\|_p^p \le \sum_{\eta=1}^{2^j} \int |f|^p\, |\psi_{j,\eta}|\, d\gamma\, \|\psi_{j,\eta}\|_1^{p/q}\, \|\psi_{j,\eta}\|_p^p
= \int |f|^p \Bigl\{\sum_{\eta=1}^{2^j} |\psi_{j,\eta}(x)|\, \|\psi_{j,\eta}\|_1^{p/q}\, \|\psi_{j,\eta}\|_p^p\Bigr\}\, d\gamma(x)
\]
\[
= \int |f|^p \Bigl\{\sum_{\eta=1}^{2^j} \frac{|\psi_{j,\eta}(x)|}{\|\psi_{j,\eta}\|_\infty}\, \|\psi_{j,\eta}\|_\infty\, \|\psi_{j,\eta}\|_1^{p/q}\, \|\psi_{j,\eta}\|_p^p\Bigr\}\, d\gamma(x) \le C\, C_\infty C_1^{p/q} C_p^p\, \|f\|_p^p,
\]
since by (39) and (40) we have
\[
\|\psi_{j,\eta}\|_\infty\, \|\psi_{j,\eta}\|_1^{p/q}\, \|\psi_{j,\eta}\|_p^p \le C_\infty C_1^{p/q} C_p^p,
\]
and we have concluded using Lemma 1.
Corollary 3. Let
\[
\Lambda_j(f) = \sum_{\eta=1}^{2^j} \langle f, \psi_{j,\eta}\rangle \psi_{j,\eta}.
\]
Then
\[
\|\Lambda_j(f)\|_{L_p(\omega_{\alpha,\beta})} \le C_p \Bigl(\sum_{\eta=1}^{2^j} |\langle f, \psi_{j,\eta}\rangle|^p\, \|\psi_{j,\eta}\|_{L_p(\omega_{\alpha,\beta})}^p\Bigr)^{1/p} \le C_p'\, \|L_{j-1} f + L_j f + L_{j+1} f\|_{L_p(\omega_{\alpha,\beta})}
\]
(by convention $L_{-1}(f) = 0$). This claim is a simple consequence of the previous corollary, since for all $\eta \in X_j$,
\[
\langle f, \psi_{j,\eta}\rangle = \langle L_{j-1} f + L_j f + L_{j+1} f,\, \psi_{j,\eta}\rangle.
\]
9.4. Besov spaces

For $f \in L_p(\omega_{\alpha,\beta})$ let us define the best polynomial approximation errors
\[
\forall n \in \mathbb{N}, \qquad E_n(f, p) = \inf_{P \in \Pi_n} \|f - P\|_p,
\]
where $\Pi_n$ denotes the polynomials of degree at most $n$. Clearly
\[
\|f\|_p \ge E_0(f, p) \ge E_1(f, p) \ge \cdots \tag{57}
\]
Definition 2 (Besov space $B^s_{p,q}$). Let $1 \le p \le \infty$, $0 < q \le \infty$, $0 < s < \infty$. The space $B^s_{p,q}$ is defined as the set of all functions $f \in L_p$ such that
\[
\|f\|_{B^s_{p,q}} = \|f\|_p + \Bigl(\sum_{n=1}^\infty \frac{1}{n} \bigl(n^s E_n(f, p)\bigr)^q\Bigr)^{1/q} < \infty, \quad\text{if } q < \infty
\]
(resp. $\|f\|_{B^s_{p,\infty}} = \|f\|_p + \sup_{n \ge 1} n^s E_n(f, p) < \infty$). As $n \mapsto E_n(f, p)$ is non-increasing, we have an equivalent norm $\|f\|_{B^s_{p,q}}$:
\[
\|f\|_{B^s_{p,q}} \asymp \|f\|_p + \Bigl(\sum_{j=0}^\infty \bigl(2^{js} E_{2^j}(f, p)\bigr)^q\Bigr)^{1/q}, \quad\text{if } q < \infty
\]
(resp. $\|f\|_{B^s_{p,\infty}} \asymp \|f\|_p + \sup_{j \ge 0} 2^{js} E_{2^j}(f, p)$).

Theorem 8. Let $f \in L_p(\omega_{\alpha,\beta})$ and
\[
\Lambda_j(f) = \sum_{\eta=1}^{2^j} \langle f, \psi_{j,\eta}\rangle \psi_{j,\eta}.
\]
Then:

1.
\[
\|f\|_{B^s_{p,q}} \asymp \|f\|_p + \Bigl(\sum_{j=0}^\infty \bigl(2^{js}\, \|\Lambda_j(f)\|_p\bigr)^q\Bigr)^{1/q}, \tag{58}
\]
with the usual modification for $q = \infty$.

2. If $p < \infty$,
\[
\|f\|_{B^s_{p,q}} \asymp \|f\|_p + \Bigl\{\sum_{j=0}^\infty \Bigl(2^{js} \Bigl(\sum_{\eta=1}^{2^j} |\langle f, \psi_{j,\eta}\rangle|^p\, \|\psi_{j,\eta}\|_p^p\Bigr)^{1/p}\Bigr)^q\Bigr\}^{1/q}, \tag{59}
\]
with the usual modification for $q = \infty$, and
\[
\|f\|_{B^s_{\infty,q}} \asymp \|f\|_\infty + \Bigl\{\sum_{j=0}^\infty \Bigl(2^{js} \sup_{1 \le \eta \le 2^j} |\langle f, \psi_{j,\eta}\rangle|\, \|\psi_{j,\eta}\|_\infty\Bigr)^q\Bigr\}^{1/q}, \tag{60}
\]
with the previous modification for $q = \infty$.


Proof. If $Q \in \Pi_{2^{j-1}}$, $j \ge 1$, we have
\[
\|\Lambda_j(f)\|_p = \|\Lambda_j(f - Q)\|_p \le C_p\, \|f - Q\|_p,
\]
and hence
\[
\|\Lambda_j(f)\|_p \le C_p\, E_{2^{j-1}}(f, p).
\]
On the other hand,
\[
E_{2^j}(f, p) \le \sum_{m=j+1}^\infty \|\Lambda_m(f)\|_p.
\]
Therefore, if
\[
\|\Lambda_m(f)\|_p = \varepsilon_m 2^{-ms}, \qquad (\varepsilon_m) \in \ell_q(\mathbb{N}),
\]
we have
\[
E_{2^j}(f, p) \le \sum_{m=j+1}^\infty \varepsilon_m 2^{-ms} = 2^{-js} \sum_{m=j+1}^\infty \varepsilon_m 2^{-|m-j|s} = \delta_j 2^{-js},
\]
with $(\delta_j) \in \ell_q(\mathbb{N})$, by a classical convolution result. Thus (58) is established. Now, by Corollary 3 we have
\[
\|\Lambda_j(f)\|_p \le C_p \Bigl(\sum_{\eta=1}^{2^j} |\langle f, \psi_{j,\eta}\rangle|^p\, \|\psi_{j,\eta}\|_{L_p(\omega_{\alpha,\beta})}^p\Bigr)^{1/p} \le C_p' \bigl(\|L_{j-1} f\|_p + \|L_j f\|_p + \|L_{j+1} f\|_p\bigr).
\]
This combined with (58) readily implies (59) and (60).
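In practice, the characterization (59) is what makes needlet coefficients convenient: a Besov norm is read off a weighted sequence norm of the frame coefficients. A minimal Python sketch follows (the function name, the coefficient layout, and the toy data are illustrative assumptions; the norms $\|\psi_{j,\eta}\|_p$ would come from Theorem 6 or from quadrature as in the earlier sketch):

    import numpy as np

    def besov_seminorm(coeffs, psi_p_norms, s, p, q):
        # sequence-norm side of (59): coeffs[j] holds <f, psi_{j,eta}>, eta = 1..2^j,
        # and psi_p_norms[j] the matching ||psi_{j,eta}||_p
        levels = np.array([
            2.0 ** (j * s) * np.sum((np.abs(c) * w) ** p) ** (1.0 / p)
            for j, (c, w) in enumerate(zip(coeffs, psi_p_norms))])
        return levels.max() if np.isinf(q) else np.sum(levels ** q) ** (1.0 / q)

    # toy data: levels with (sum_eta |c_{j,eta}|^2)^{1/2} of order 2^{-j s0}
    # give a finite B^s_{2,q} seminorm precisely when s < s0
    rng = np.random.default_rng(0)
    s0 = 1.0
    coeffs = [2.0 ** (-j * (s0 + 0.5)) * rng.standard_normal(2 ** j) for j in range(12)]
    norms = [np.ones(2 ** j) for j in range(12)]   # ||psi_{j,eta}||_2 = O(1), cf. (47)
    print(besov_seminorm(coeffs, norms, s=0.8, p=2.0, q=2.0))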


Remark 1. Suppose $(\lambda_{j,\eta})$ is a family of numbers such that
\[
\Bigl\{\sum_{j=0}^\infty \Bigl(2^{js} \Bigl(\sum_{\eta=1}^{2^j} |\lambda_{j,\eta}|^p\, \|\psi_{j,\eta}\|_p^p\Bigr)^{1/p}\Bigr)^q\Bigr\}^{1/q} < \infty
\]
(with the usual modifications for $p = \infty$, $q = \infty$), and let
\[
f = c\,\mathbf{1} + \sum_{j=0}^\infty \sum_{\eta=1}^{2^j} \lambda_{j,\eta}\, \psi_{j,\eta}
\]
(which is, by Theorem 7, obviously defined in $L_p$). Then $f \in B^s_{p,q}$, even though not necessarily
\[
\lambda_{j,\eta} = \langle f, \psi_{j,\eta}\rangle.
\]
This is due to Theorem 7, and to the fact that
\[
\langle f, \psi_{j,\eta}\rangle = \langle A_{j-1} + A_j + A_{j+1},\, \psi_{j,\eta}\rangle,
\]
where
\[
A_j = \sum_{\eta=1}^{2^j} \lambda_{j,\eta}\, \psi_{j,\eta}, \qquad \|A_j\|_p \le C_p \Bigl(\sum_{\eta=1}^{2^j} \|\lambda_{j,\eta}\, \psi_{j,\eta}\|_p^p\Bigr)^{1/p}.
\]

References

[1] F. Abramovich and B. W. Silverman. Wavelet decomposition approaches to statistical inverse problems. Biometrika, 85(1):115–129, 1998. MR1627226
[2] Kenneth F. Andersen and Russel T. John. Weighted inequalities for vector-valued maximal functions and singular integrals. Studia Math., 69(1):19–31, 1980/81. MR0604351
[3] A. Antoniadis and J. Bigot. Poisson inverse models. 2004. Preprint Grenoble.
[4] A. Antoniadis, J. Fan, and I. Gijbels. A wavelet method for unfolding sphere size distributions. The Canadian Journal of Statistics, 29:265–290, 2001. MR1840708
[5] L. Cavalier, G. K. Golubev, D. Picard, and A. B. Tsybakov. Oracle inequalities for inverse problems. Ann. Statist., 30(3):843–874, 2002. MR1922543
[6] Laurent Cavalier and Alexandre Tsybakov. Sharp adaptation for inverse problems with random noise. Probab. Theory Related Fields, 123(3):323–354, 2002. MR1918537
[7] Albert Cohen, Marc Hoffmann, and Markus Reiß. Adaptive wavelet Galerkin methods for linear inverse problems. SIAM J. Numer. Anal., 42(4):1479–1501 (electronic), 2004. MR2114287
[8] L. M. Cruz-Orive. Distribution-free estimation of sphere size distributions from slabs showing overprojections and truncations, with a review of previous methods. J. Microscopy, 131:265–290, 1983.
[9] V. Dicken and P. Maass. Wavelet-Galerkin methods for ill-posed problems. J. Inverse Ill-Posed Probl., 4(3):203–221, 1996. MR1401487
[10] David L. Donoho. Nonlinear solution of linear inverse problems by wavelet-vaguelette decomposition. Appl. Comput. Harmon. Anal., 2(2):101–126, 1995. MR1325535
[11] R. J. Duffin and A. C. Schaeffer. A class of nonharmonic Fourier series. Trans. Amer. Math. Soc., 72:341–366, 1952. MR0047179
[12] Sam Efromovich and Vladimir Koltchinskii. On inverse problems with unknown operators. IEEE Trans. Inform. Theory, 47(7):2876–2894, 2001. MR1872847
[13] J. Fan and J. K. Koo. Wavelet deconvolution. IEEE Trans. Inform. Theory, 48(3):734–747, 2002. MR1889978
[14] C. Fefferman and E. M. Stein. Some maximal inequalities. Amer. J. Math., 93:107–115, 1971. MR0284802
[15] M. Frazier, B. Jawerth, and G. Weiss. Littlewood-Paley theory and the study of function spaces. CBMS Regional Conference Series in Mathematics, 79. AMS, 1991. MR1107300
[16] Alexander Goldenshluger and Sergei V. Pereverzev. On adaptive inverse estimation of linear functionals in Hilbert scales. Bernoulli, 9(5):783–807, 2003. MR2047686
[17] Iain M. Johnstone, Gérard Kerkyacharian, Dominique Picard, and Marc Raimondo. Wavelet deconvolution in a periodic setting. J. R. Stat. Soc. Ser. B Stat. Methodol., 66(3):547–573, 2004. MR2088290
[18] Iain M. Johnstone and Bernard W. Silverman. Discretization effects in statistical inverse problems. J. Complexity, 7(1):1–34, 1991. MR1096170
[19] Jérôme Kalifa and Stéphane Mallat. Thresholding estimators for linear inverse problems and deconvolutions. Ann. Statist., 31(1):58–109, 2003. MR1962500
[20] G. Kerkyacharian, D. Picard, and M. Raimondo. Adaptive boxcar deconvolution on full Lebesgue measure sets. 2005. Preprint LPMA.
[21] G. Kyriazis, P. Petrushev, and Y. Xu. Jacobi decomposition of weighted Triebel-Lizorkin and Besov spaces. 2006. IMI 2006.
[22] Stéphane Mallat. A wavelet tour of signal processing. Academic Press Inc., San Diego, CA, 1998. MR1614527
[23] Peter Mathé and Sergei V. Pereverzev. Geometry of linear ill-posed problems in variable Hilbert scales. Inverse Problems, 19(3):789–803, 2003. MR1984890
[24] Yves Meyer. Ondelettes et opérateurs. I. Actualités Mathématiques. [Current Mathematical Topics]. Hermann, Paris, 1990. MR1085487
[25] F. Narcowich, P. Petrushev, and J. Ward. Local tight frames on spheres. SIAM J. Math. Anal., 2006. To appear. MR2237162
[26] F. J. Narcowich, P. Petrushev, and J. D. Ward. Decomposition of Besov and Triebel-Lizorkin spaces on the sphere. J. Funct. Anal., 2006. To appear. MR2253732
[27] R. Neelamani, H. Choi, and R. Baraniuk. Wavelet-based deconvolution for ill-conditioned systems. 2000. http://wwwdsp.rice.edu/publications/pub/neelsh98icassp.pdf
[28] D. Nychka, G. Wahba, S. Goldfarb, and T. Pugh. Cross-validated spline methods for the estimation of three-dimensional tumor size distributions from observations on two-dimensional cross sections. J. Amer. Statist. Assoc., 79:832–846, 1984. MR0770276
[29] M. Pensky and B. Vidakovic. Adaptive wavelet estimator for nonparametric density deconvolution. Ann. Statist., 27:2033–2053, 1999. MR1765627
[30] P. Petrushev and Y. Xu. Localized polynomial kernels and frames (needlets) on the ball. 2005. IMI 2005.
[31] Pencho Petrushev and Yuan Xu. Localized polynomial frames on the interval with Jacobi weights. J. Fourier Anal. Appl., 11(5):557–575, 2005. MR2182635
[32] Gábor Szegő. Orthogonal polynomials. American Mathematical Society, Providence, R.I., 1975.
[33] Alexandre Tsybakov. On the best rate of adaptive estimation in some inverse problems. C. R. Acad. Sci. Paris Sér. I Math., 330(9):835–840, 2000. MR1769957
[34] S. D. Wicksell. The corpuscle problem: a mathematical study of a biometric problem. Biometrika, 17:84–99, 1925.
[35] T. Willer. Deconvolution in white noise with a random blurring effect. 2005. Preprint LPMA.
