Downloaded 10/05/12 to 132.206.27.25. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php

CHAPTER 1
The What, Why, and How of
Wavelets

The wavelet transform is a tool that cuts up data or functions or operators into
different frequency components, and then studies each component with a resolu-
tion matched to its scale. Forerunners of this technique were invented indepen-
dently in pure mathematics (Calderón's resolution of the identity in harmonic
analysis—see e.g., Calderón (1964)), physics (coherent states for the (ax + b)-
group in quantum mechanics, first constructed by Aslaksen and Klauder (1968),
and linked to the hydrogen atom Hamiltonian by Paul (1985)) and engineering
(QMF filters by Esteban and Galand (1977), and later QMF filters with exact
reconstruction property by Smith and Barnwell (1986), Vetterli (1986) in elec-
trical engineering; wavelets were proposed for the analysis of seismic data by
J. Morlet (1983)). The last five years have seen a synthesis between all these
different approaches, which has been very fertile for all the fields concerned.
Let us stay for a moment within the signal analysis framework. (The dis-
cussion can easily be translated to other fields.) The wavelet transform of a
signal evolving in time (e.g., the amplitude of the pressure on an eardrum, for
acoustical applications) depends on two variables: scale (or frequency) and time;
wavelets provide a tool for time-frequency localization. The first section tells us
what time-frequency localization means and why it is of interest. The remaining
sections describe different types of wavelets.

1.1. Time-frequency localization.


In many applications, given a signal f (t) (for the moment, we assume that
t is a continuous variable), one is interested in its frequency content locally in
time. This is similar to music notation, for example, which tells the player which
notes (= frequency information) to play at any given moment. The standard
Fourier transform,

$$(\mathcal{F} f)(\omega) = \frac{1}{\sqrt{2\pi}} \int dt\, e^{-i\omega t} f(t),$$

also gives a representation of the frequency content of $f$, but information
concerning time-localization of, e.g., high-frequency bursts cannot be read off easily
from $\mathcal{F} f$. Time-localization can be achieved by first windowing the signal $f$, so

as to cut off only a well-localized slice of $f$, and then taking its Fourier transform:

$$(T^{\mathrm{win}} f)(\omega, t) = \int ds\, f(s)\, g(s - t)\, e^{-i\omega s}. \tag{1.1.1}$$

This is the windowed Fourier transform, which is a standard technique for
time-frequency localization.$^1$ It is even more familiar to signal analysts in its
discrete version, where $t$ and $\omega$ are assigned regularly spaced values:
$t = nt_0$, $\omega = m\omega_0$, where $m, n$ range over $\mathbb{Z}$, and
$\omega_0, t_0 > 0$ are fixed. Then (1.1.1) becomes

$$T^{\mathrm{win}}_{m,n}(f) = \int ds\, f(s)\, g(s - nt_0)\, e^{-im\omega_0 s}. \tag{1.1.2}$$

This procedure is schematically represented in Figure 1.1: for fixed $n$, the
$T^{\mathrm{win}}_{m,n}(f)$ correspond to the Fourier coefficients of
$f(\cdot)g(\cdot - nt_0)$. If, for instance, $g$ is compactly supported, then it is
clear that, with appropriately chosen $\omega_0$, the Fourier coefficients
$T^{\mathrm{win}}_{m,n}(f)$ are sufficient to characterize and, if need be,
to reconstruct $f(\cdot)g(\cdot - nt_0)$. Changing $n$ amounts to shifting the "slices" by
steps of $t_0$ and its multiples, allowing the recovery of all of $f$ from the
$T^{\mathrm{win}}_{m,n}(f)$. (We will discuss this in more mathematical detail in
Chapter 3.) Many possible choices have been proposed for the window function $g$ in
signal analysis, most of which have compact support and reasonable smoothness. In
physics, (1.1.1) is related to coherent state representations; the
$g^{\omega,t}(s) = e^{i\omega s} g(s - t)$ are the coherent states associated to the
Weyl-Heisenberg group (see, e.g., Klauder and Skagerstam (1985)). In this context, a
very popular choice is a Gaussian $g$. In all applications, $g$ is supposed to be well
concentrated in both time and frequency; if $g$ and $\hat{g}$ are both concentrated
around zero, then $(T^{\mathrm{win}} f)(\omega, t)$ can be interpreted loosely as the
"content" of $f$ near time $t$ and near frequency $\omega$. The windowed
Fourier transform thus provides a description of $f$ in the time-frequency plane.
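To make (1.1.2) concrete, here is a minimal Python sketch (not taken from the text) that approximates the integral by a sum over unit-spaced samples. The function names `hamming_window` and `windowed_fourier`, the Hamming shape chosen for the compactly supported window $g$, the hop size standing in for $t_0$, and the test tone are all illustrative assumptions.

```python
import cmath
import math

def hamming_window(width):
    """Compactly supported window g, sampled at `width` points (Hamming shape: an assumption)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (width - 1)) for k in range(width)]

def windowed_fourier(samples, window, hop, m_values, omega0):
    """Sum approximation of (1.1.2) with unit sample spacing and t0 = hop.

    result[n][j] = sum_s f(s) g(s - n*hop) exp(-i m omega0 s), with m = m_values[j].
    """
    width = len(window)
    n_slices = (len(samples) - width) // hop + 1
    result = []
    for n in range(n_slices):
        start = n * hop  # the window g(s - n*t0) is nonzero only for start <= s < start + width
        row = []
        for m in m_values:
            acc = 0j
            for k in range(width):
                s = start + k
                acc += samples[s] * window[k] * cmath.exp(-1j * m * omega0 * s)
            row.append(acc)
        result.append(row)
    return result

# Hypothetical test tone: a cosine whose frequency equals 8 * omega0.
omega0 = 2 * math.pi / 64
tone = [math.cos(8 * omega0 * s) for s in range(256)]
coefficients = windowed_fourier(tone, hamming_window(64), 32, range(33), omega0)
# For every slice n, the index m with the largest |T_{m,n}| locates the tone in frequency.
peaks = [max(range(33), key=lambda m: abs(row[m])) for row in coefficients]
```

For a pure tone every slice peaks at the same value of $m$, which is the discrete counterpart of reading the "content" of $f$ near time $nt_0$ and frequency $m\omega_0$.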

FIG. 1.1. The windowed Fourier transform: the function $f(t)$ is multiplied with the window
function $g(t)$, and the Fourier coefficients of the product $f(t)g(t)$ are computed; the procedure
is then repeated for translated versions of the window, $g(t - t_0)$, $g(t - 2t_0)$, $\ldots$.

1.2. The wavelet transform: Analogies and differences with the windowed Fourier transform.

The wavelet transform provides a similar time-frequency description, with a
few important differences. The wavelet transform formulas analogous to (1.1.1)
and (1.1.2) are

$$(T^{\mathrm{wav}} f)(a, b) = |a|^{-1/2} \int dt\, f(t)\, \psi\left(\frac{t - b}{a}\right) \tag{1.2.1}$$

and

$$T^{\mathrm{wav}}_{m,n}(f) = a_0^{-m/2} \int dt\, f(t)\, \psi(a_0^{-m} t - nb_0). \tag{1.2.2}$$

In both cases we assume that $\psi$ satisfies

$$\int dt\, \psi(t) = 0 \tag{1.2.3}$$

(for reasons explained in Chapters 2 and 3).
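As an illustration of (1.2.1) and (1.2.3), the following Python sketch replaces the integrals by Riemann sums over equally spaced samples. It is a hedged example, not the author's algorithm: the function names, the sample spacing `dt`, and the choice $\psi(t) = (1 - t^2)\exp(-t^2/2)$ (one commonly used real wavelet) are assumptions made for the demonstration.

```python
import math

def mexican_hat(t):
    """Assumed analyzing wavelet psi(t) = (1 - t^2) exp(-t^2 / 2)."""
    return (1.0 - t * t) * math.exp(-t * t / 2.0)

def wavelet_transform(samples, dt, a, b, psi=mexican_hat):
    """Riemann-sum approximation of (1.2.1): |a|^(-1/2) * sum_k f(k dt) psi((k dt - b)/a) dt."""
    total = 0.0
    for k, value in enumerate(samples):
        t = k * dt
        total += value * psi((t - b) / a)
    return abs(a) ** -0.5 * total * dt

dt = 0.01
# Condition (1.2.3): the wavelet integrates to zero (checked numerically on [-10, 10]).
psi_integral = sum(mexican_hat(-10.0 + k * dt) for k in range(2001)) * dt

# A constant signal is therefore invisible to the transform ...
flat = [1.0] * 2001
flat_response = wavelet_transform(flat, dt, a=1.0, b=10.0)

# ... while a one-sample pulse of unit area at t = 10 gives |a|^(-1/2) * psi(0).
spike = [0.0] * 2001
spike[1000] = 1.0 / dt
spike_response_coarse = wavelet_transform(spike, dt, a=1.0, b=10.0)
spike_response_fine = wavelet_transform(spike, dt, a=0.5, b=10.0)
```

With these assumptions the response to the pulse grows like $|a|^{-1/2}$ as the scale $a$ shrinks, while the response to the constant stays at zero up to rounding.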
Formula (1.2.2) is again obtained from (1.2.1) by restricting $a$, $b$ to only
discrete values: $a = a_0^m$, $b = nb_0a_0^m$ in this case, with $m, n$ ranging over
$\mathbb{Z}$, and $a_0 > 1$, $b_0 > 0$ fixed. One similarity between the wavelet and
windowed Fourier transforms is clear: both (1.1.1) and (1.2.1) take the inner products
of $f$ with a family of functions indexed by two labels,
$g^{\omega,t}(s) = e^{i\omega s} g(s - t)$ in (1.1.1), and
$\psi^{a,b}(s) = |a|^{-1/2}\, \psi\left(\frac{s - b}{a}\right)$ in (1.2.1). The functions
$\psi^{a,b}$ are called "wavelets"; the function $\psi$ is sometimes called "mother
wavelet." (Note that $\psi$ and $g$ are implicitly assumed to be real, even though this
is by no means essential; if they are not, then complex conjugates have to be
introduced in (1.1.1), (1.2.1).) A typical choice for $\psi$ is
$\psi(t) = (1 - t^2)\exp(-t^2/2)$, the second derivative of the Gaussian (up to sign),
sometimes called the Mexican hat function because it resembles a cross
section of a Mexican hat. The Mexican hat function is well localized in both time
and frequency, and satisfies (1.2.3). As $a$ changes, the
$\psi^{a,0}(s) = |a|^{-1/2}\psi(s/a)$ cover different frequency ranges (large values of
the scaling parameter $|a|$ correspond to small frequencies, or large scale
$\psi^{a,0}$; small values of $|a|$ correspond to high frequencies or very fine scale
$\psi^{a,0}$). Changing the parameter $b$ as well allows us to move the time
localization center: each $\psi^{a,b}(s)$ is localized around $s = b$. It follows that
(1.2.1), like (1.1.1), provides a time-frequency description of $f$. The difference
between the wavelet and windowed Fourier transforms lies in the shapes of the
analyzing functions $g^{\omega,t}$ and $\psi^{a,b}$, as shown in Figure 1.2.
The functions $g^{\omega,t}$ all consist of the same envelope function $g$, translated
to the proper time location, and "filled in" with higher frequency oscillations. All
the $g^{\omega,t}$, regardless of the value of $\omega$, have the same width. In
contrast, the $\psi^{a,b}$ have time-widths adapted to their frequency: high frequency
$\psi^{a,b}$ are very narrow, while low frequency $\psi^{a,b}$ are much broader. As a
result, the wavelet transform is better able than the windowed Fourier transform to
"zoom in" on very short-lived high frequency phenomena, such as transients in signals
(or singularities

[Figure 1.2: panel (a) shows $g(x)$; panel (b) shows $\psi^{a,b}$ (labels: $a > 1$, $b < 0$); horizontal axis $x$.]

FIG. 1.2. Typical shapes of (a) windowed Fourier transform functions $g^{\omega,t}$, and
(b) wavelets $\psi^{a,b}$. The $g^{\omega,t}(x) = e^{i\omega x} g(x - t)$ can be viewed as
translated envelopes $g$, "filled in" with higher frequencies; the $\psi^{a,b}$ are all
copies of the same functions, translated and compressed or stretched.

in functions or integral kernels). This is illustrated by Figure 1.3, which shows
windowed Fourier transforms and the wavelet transform of the same signal $f$
defined by

$$f(t) = \sin(2\pi\nu_1 t) + \sin(2\pi\nu_2 t) + \gamma[\delta(t - t_1) + \delta(t - t_2)].$$

In practice, this signal is not given by this continuous expression, but by samples,
and adding a $\delta$-function is then approximated by adding a constant to one sample
only. In sampled version, we have then

$$f(n\tau) = \sin(2\pi\nu_1 n\tau) + \sin(2\pi\nu_2 n\tau) + \alpha[\delta_{n,n_1} + \delta_{n,n_2}].$$

For the example in Figure 1.3a, $\nu_1 = 500$ Hz, $\nu_2 = 1$ kHz, $\tau = 1/8{,}000$ sec (i.e.,
we have 8,000 samples per second), $\alpha = 1.5$, and $n_2 - n_1 = 32$ (corresponding
to 4 milliseconds between the two pulses). The three spectrograms (graphs of

FIG. 1.3. (a) The signal $f(t)$. (b) Windowed Fourier transforms of $f$ with three different
window widths. These are so-called spectrograms: only $|T^{\mathrm{win}}(f)|$ is plotted (the
phase is not rendered on the graph), using grey levels (high values = black, zero = white,
intermediate grey levels are assigned proportional to $\log |T^{\mathrm{win}}(f)|$) in the
$t$ (abscissa), $\omega$ (ordinate) plane. (c) Wavelet transform of $f$. To make the comparison
with (b) we have also plotted $|T^{\mathrm{wav}}(f)|$, with the same grey level method, and a
linear frequency axis (i.e., the ordinate corresponds to $a^{-1}$). (d) Comparison of the
frequency resolution between the three spectrograms and the wavelet transform. I would like
to thank Oded Ghitza for generating this figure.

the modulus of the windowed Fourier transform) in Figure 1.3b use standard
Hamming windows, with widths 12.8, 6.4, and 3.2 milliseconds, respectively.
(Time $t$ varies horizontally, frequency $\omega$ vertically, on these plots; the grey
levels indicate the value of $|T^{\mathrm{win}}(f)|$, with black standing for the highest
value.) As the window width increases, the resolution of the two pure tones gets better,
but it becomes harder or even impossible to resolve the two pulses. Figure 1.3c
shows the modulus of the wavelet transform of $f$ computed by means of the
(complex) Morlet wavelet
$\psi(t) = C e^{-t^2/\alpha^2}\left(e^{i\pi t} - e^{-\pi^2\alpha^2/4}\right)$, with $\alpha = 4$.
(To make comparison with the spectrograms easier, a linear frequency axis has been
used here; for wavelet transforms, a logarithmic frequency axis is more usual.)
One already sees that the two impulses are resolved even better than with the
3.2 msec Hamming window (right in Figure 1.3b), while the frequency resolution
for the two pure tones is comparable with that obtained with the 6.4 msec
Hamming window (middle in Figure 1.3b). This comparison of frequency resolutions
is illustrated more clearly by Figure 1.3d: here sections of the spectrograms
(i.e., plots of $|(T^{\mathrm{win}} f)(\cdot, t)|$ with fixed $t$) and of the wavelet
transform modulus ($|(T^{\mathrm{wav}} f)(\cdot, b)|$ with fixed $b$) are compared. The
dynamic range (ratio between the maxima and the "dip" between the two peaks) of the
wavelet transform is comparable to that of the 6.4 msec spectrogram. (Note that the
flat horizontal "tail" for the wavelet transform in the graphs in Figure 1.3d is an
artifact of the plotting package used, which set a rather high cut-off, as compared
with the spectrogram plots; anyway, this cut-off is already at $-24$ dB.)
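The sampled test signal of Figure 1.3a is simple to generate, and doing the arithmetic on the window widths shows why only the shortest window separates the pulses. The sketch below uses the parameter values quoted in the text; the function name `sampled_signal`, the signal length, and the pulse positions `N1`, `N2` are hypothetical (the text only fixes $n_2 - n_1 = 32$).

```python
import math

def sampled_signal(n_samples, n1, n2, nu1=500.0, nu2=1000.0, tau=1.0 / 8000.0, alpha=1.5):
    """f(n tau) = sin(2 pi nu1 n tau) + sin(2 pi nu2 n tau) + alpha [delta_{n,n1} + delta_{n,n2}]."""
    samples = []
    for n in range(n_samples):
        value = math.sin(2 * math.pi * nu1 * n * tau) + math.sin(2 * math.pi * nu2 * n * tau)
        if n in (n1, n2):
            value += alpha  # a delta-function approximated by a constant added to one sample
        samples.append(value)
    return samples

TAU = 1.0 / 8000.0   # sampling interval from the text (8,000 samples per second)
N1, N2 = 496, 528    # hypothetical pulse positions; only N2 - N1 = 32 comes from the text
signal = sampled_signal(1024, N1, N2)

# Distance between the pulses, in milliseconds.
pulse_gap_ms = (N2 - N1) * TAU * 1000.0

# Hamming window widths of Figure 1.3b, converted from milliseconds to samples.
window_widths_in_samples = {ms: ms / 1000.0 / TAU for ms in (12.8, 6.4, 3.2)}
# Windows narrower than the 32-sample gap cannot contain both pulses at once.
narrower_than_gap = [ms for ms, width in window_widths_in_samples.items() if width < N2 - N1]
```

Under these assumptions the 12.8 and 6.4 msec windows span more than 32 samples, so a single slice can contain both pulses, whereas the 3.2 msec window cannot; this matches the qualitative behavior described for Figure 1.3b.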
In fact, our ear uses a wavelet transform when analyzing sound, at least in
the very first stage. The pressure amplitude oscillations are transmitted from
the eardrum to the basilar membrane, which extends over the whole length of
the cochlea. The cochlea is rolled up as a spiral inside our inner ear; imagine it
unrolled to a straight segment, so that the basilar membrane is also stretched
out. We can then introduce a coordinate $y$ along this segment. Experiment and
numerical simulation show that a pressure wave which is a pure tone,
$f_\omega(t) = e^{i\omega t}$, leads to a response excitation along the basilar membrane
which has the same frequency in time, but with an envelope in $y$,
$F_\omega(t, y) = e^{i\omega t}\phi_\omega(y)$. In a first approximation, which turns out
to be pretty good for frequencies $\omega$ above 500 Hz, the dependence on $\omega$ of
$\phi_\omega(y)$ corresponds to a shift by $\log\omega$: there exists one function $\phi$
so that $\phi_\omega(y)$ is very close to $\phi(y - \log\omega)$. For a general excitation
function $f$, $f(t) = \frac{1}{\sqrt{2\pi}} \int d\omega\, \hat{f}(\omega) e^{i\omega t}$,
it follows that the response function $F(t, y)$ is given by the corresponding
superposition of "elementary response functions,"

$$F(t, y) = \frac{1}{\sqrt{2\pi}} \int d\omega\, \hat{f}(\omega)\, F_\omega(t, y)
= \frac{1}{\sqrt{2\pi}} \int d\omega\, \hat{f}(\omega)\, e^{i\omega t}\, \phi(y - \log\omega).$$

If we now introduce a change of parameterization, by defining

$$\hat{\psi}(e^{-x}) = (2\pi)^{-1/2}\, \phi(x), \qquad G(a, t) = F(t, \log a),$$



then it follows that

$$G(a, t) = \int dt'\, f(t')\, \psi(a(t - t')),$$

which (up to normalization) is exactly a wavelet transform. The dilation parameter
comes in, of course, because of the logarithmic shifts in frequency in the $\phi_\omega$.
The occurrence of the wavelet transform in the first stage of our own biological
acoustical analysis suggests that wavelet-based methods for acoustical analysis
have a better chance than other methods to lead, e.g., to compression schemes
undetectable by our ear.

1.3. Different types of wavelet transform.

There exist many different types of wavelet transform, all starting from the
basic formulas (1.2.1), (1.2.2). In these notes we will distinguish between

A. The continuous wavelet transform (1.2.1), and

B. The discrete wavelet transform (1.2.2).


Within the discrete wavelet transform we distinguish further between

B1. Redundant discrete systems (frames) and

B2. Orthonormal (and other) bases of wavelets.

1.3.1. The continuous wavelet transform. Here the dilation and translation
parameters $a$, $b$ vary continuously over $\mathbb{R}$ (with the constraint $a \neq 0$). The
wavelet transform is given by formula (1.2.1); a function can be reconstructed
from its wavelet transform by means of the "resolution of identity" formula

$$f = C_\psi^{-1} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{da\, db}{a^2}\, \langle f, \psi^{a,b} \rangle\, \psi^{a,b}, \tag{1.3.1}$$

where $\psi^{a,b}(x) = |a|^{-1/2}\, \psi\left(\frac{x - b}{a}\right)$, and
$\langle\ ,\ \rangle$ denotes the $L^2$-inner product. The constant $C_\psi$ depends only
on $\psi$ and is given by

$$C_\psi = 2\pi \int_{-\infty}^{\infty} d\xi\, |\hat{\psi}(\xi)|^2\, |\xi|^{-1}; \tag{1.3.2}$$

we assume $C_\psi < \infty$ (otherwise (1.3.1) does not make sense). If $\psi$ is in
$L^1(\mathbb{R})$ (this is the case in all examples of practical interest), then
$\hat{\psi}$ is continuous, so that $C_\psi$ can be finite only if $\hat{\psi}(0) = 0$,
i.e., $\int dx\, \psi(x) = 0$. A proof for (1.3.1) will be given in Chapter 2. (Note that
we have implicitly assumed that $\psi$ is real; for complex $\psi$, we should use
$\bar{\psi}$ instead of $\psi$ in (1.2.1). In some applications, such complex $\psi$ are
useful.)
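As a worked example of (1.3.2), added here for illustration, take the Mexican hat function $\psi(t) = (1 - t^2)\exp(-t^2/2)$ of §1.2 and assume the Fourier transform normalization of §1.1, under which the Gaussian $e^{-t^2/2}$ is its own transform. Since $\psi = -\frac{d^2}{dt^2} e^{-t^2/2}$, one finds

$$\hat{\psi}(\xi) = \xi^2 e^{-\xi^2/2}, \qquad
C_\psi = 2\pi \int_{-\infty}^{\infty} d\xi\, \xi^4 e^{-\xi^2}\, |\xi|^{-1}
= 4\pi \int_0^{\infty} d\xi\, \xi^3 e^{-\xi^2}
= 4\pi \cdot \tfrac{1}{2} = 2\pi .$$

The integral is finite precisely because $\hat{\psi}(0) = 0$ cancels the singularity of $|\xi|^{-1}$ at the origin, in line with the zero-mean condition stated above.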
Formula (1.3.1) can be viewed in two different ways: (1) as a way of
reconstructing $f$ once its wavelet transform $T^{\mathrm{wav}} f$ is known, or (2) as a way to

write $f$ as a superposition of wavelets $\psi^{a,b}$; the coefficients in this superposition
are exactly given by the wavelet transform of $f$. Both points of view lead to
interesting applications.
The correspondence $f(x) \to (T^{\mathrm{wav}} f)(a, b)$ represents a one-variable function
by a function of two variables, into which lots of correlations are built in (see
Chapter 2). This redundancy of the representation can be exploited; a beautiful
application is the concept of the "skeleton" of a signal, extracted from the
continuous wavelet transform, which can be used for nonlinear filtering (see, e.g.,
Torrésani (1991), Delprat et al. (1992)).

1.3.2. The discrete but redundant wavelet transform: frames. In this
case the dilation parameter $a$ and the translation parameter $b$ both take only
discrete values. For $a$ we choose the integer (positive and negative) powers of
one fixed dilation parameter $a_0 > 1$, i.e., $a = a_0^m$. As already illustrated by
Figure 1.2, different values of $m$ correspond to wavelets of different widths. It
follows that the discretization of the translation parameter $b$ should depend on
$m$: narrow (high frequency) wavelets are translated by small steps in order to
cover the whole time range, while wider (lower frequency) wavelets are translated
by larger steps. Since the width of $\psi(a_0^{-m} x)$ is proportional to $a_0^m$, we
therefore choose to discretize $b$ by $b = nb_0a_0^m$, where $b_0 > 0$ is fixed, and
$n \in \mathbb{Z}$. The corresponding discretely labelled wavelets are therefore

$$\psi_{m,n}(x) = a_0^{-m/2}\, \psi\left(a_0^{-m}(x - nb_0a_0^m)\right)
= a_0^{-m/2}\, \psi(a_0^{-m} x - nb_0). \tag{1.3.3}$$

Figure 1.4a shows schematically the lattice of time-frequency localization centers
corresponding to the $\psi_{m,n}$. For a given function $f$, the inner products
$\langle f, \psi_{m,n} \rangle$ then give exactly the discrete wavelet transform
$T^{\mathrm{wav}}_{m,n}(f)$ as defined in (1.2.2) (we assume again that $\psi$ is real).
In the discrete case, there does not exist, in general, a "resolution of the
identity" formula analogous to (1.3.1) for the continuous case. Reconstruction
of $f$ from $T^{\mathrm{wav}}(f)$, if at all possible, must therefore be done by some other means.
The following questions naturally arise:

(1) Is it possible to characterize $f$ completely by knowing $T^{\mathrm{wav}}(f)$?

(2) Is it possible to reconstruct $f$ in a numerically stable way from $T^{\mathrm{wav}}(f)$?

These questions concern the recovery of $f$ from its wavelet transform. We can
also consider the dual problem (see §1.3.1), the possibility of expanding $f$ into
wavelets, which then leads to the dual questions:

(1') Can any function be written as a superposition of $\psi_{m,n}$?

(2') Is there a numerically stable algorithm to compute the coefficients for such
an expansion?

Chapter 3 addresses these questions. As in the continuous case, these discrete
wavelet transforms often provide a very redundant description of the original

[Figure 1.4: panels (a) and (b), lattices of time-frequency localization centers; in (a) the lattice points are labelled by $(m, n)$.]

FIG. 1.4. The lattices of time-frequency localization for the wavelet transform and
windowed Fourier transform. (a) The wavelet transform: $\psi_{m,n}$ is localized around
$a_0^m nb_0$ in time. We assume here that $|\hat{\psi}|$ has two peaks in frequency, at
$\pm\xi_0$ (this is the case, e.g., for the Mexican hat wavelet
$\psi(t) = (1 - t^2)e^{-t^2/2}$); $|\hat{\psi}_{m,n}(\xi)|$ then peaks at
$\pm a_0^{-m}\xi_0$, which are the two localization centers of $\psi_{m,n}$ in frequency.
(b) The windowed Fourier transform: $g_{m,n}$ is localized around $nt_0$ in time, around
$m\omega_0$ in frequency.

function. This redundancy can be exploited (it is, for instance, possible to
compute the wavelet transform only approximately, while still obtaining
reconstruction of $f$ with good precision), or eliminated to reduce the transform to its bare
essentials (such as in the image compression work of Mallat and Zhong (1992)). It
is in this discrete form that the wavelet transform is closest to the "$\phi$-transform"
of Frazier and Jawerth (1988).
The choice of the wavelet $\psi$ used in the continuous wavelet transform or in
frames of discretely labelled families of wavelets is essentially only restricted by
the requirement that $C_\psi$, as defined by (1.3.2), is finite. For practical reasons,
one usually chooses $\psi$ so that it is well concentrated in both the time and the
frequency domain, but this still leaves a lot of freedom. In the next section we
will see how giving up most of this freedom allows us to build orthonormal bases
of wavelets.

1.3.3. Orthonormal wavelet bases: Multiresolution analysis. For
some very special choices of $\psi$ and $a_0$, $b_0$, the $\psi_{m,n}$ constitute an orthonormal
basis for $L^2(\mathbb{R})$. In particular, if we choose $a_0 = 2$, $b_0 = 1$,$^2$ then there
exist $\psi$, with good time-frequency localization properties, such that the

$$\psi_{m,n}(x) = 2^{-m/2}\, \psi(2^{-m} x - n) \tag{1.3.4}$$

constitute an orthonormal basis for $L^2(\mathbb{R})$. (For the time being, and until
Chapter 10, we restrict ourselves to $a_0 = 2$.) The oldest example of a function $\psi$ for
which the $\psi_{m,n}$ defined by (1.3.4) constitute an orthonormal basis for
$L^2(\mathbb{R})$ is the Haar function,

$$\psi(x) = \begin{cases} 1 & 0 \le x < \tfrac{1}{2} \\ -1 & \tfrac{1}{2} \le x < 1 \\ 0 & \text{otherwise.} \end{cases}$$

The Haar basis has been known since Haar (1910). Note that the Haar function
does not have good time-frequency localization: its Fourier transform $\hat{\psi}(\xi)$
decays like $|\xi|^{-1}$ for $\xi \to \infty$. Nevertheless we will use it here for illustration
purposes. What follows is a proof that the Haar family does indeed constitute
an orthonormal basis. This proof is different from the one in most textbooks; in
fact, it will use multiresolution analysis as a tool.
In order to prove that the $\psi_{m,n}(x)$ constitute an orthonormal basis, we need
to establish that

(1) the $\psi_{m,n}$ are orthonormal;

(2) any $L^2$-function $f$ can be approximated, up to arbitrarily small precision,
by a finite linear combination of the $\psi_{m,n}$.

Orthonormality is easy to establish. Since support $(\psi_{m,n}) = [2^m n, 2^m(n+1)]$,
it follows that two Haar wavelets of the same scale (same value of $m$) never
overlap, so that $\langle \psi_{m,n}, \psi_{m,n'} \rangle = \delta_{n,n'}$. Overlapping supports
are possible if the two wavelets have different sizes, as in Figure 1.5. It is easy to
check, however, that if $m < m'$, then support $(\psi_{m,n})$ lies wholly within a region
where $\psi_{m',n'}$ is

constant (as on the figure). It follows that the inner product of $\psi_{m,n}$ and
$\psi_{m',n'}$ is then proportional to the integral of $\psi$ itself, which is zero.
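The orthonormality argument can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: it evaluates the inner products of a few Haar wavelets by a midpoint rule on a dyadic grid, which is exact (up to rounding) for the scales $m \ge -1$ used here. The function names, the integration interval, and the set of labels $(m, n)$ are arbitrary choices.

```python
def haar(x):
    """The Haar function: 1 on [0, 1/2), -1 on [1/2, 1), 0 otherwise."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def haar_mn(m, n, x):
    """psi_{m,n}(x) = 2^(-m/2) psi(2^(-m) x - n), as in (1.3.4)."""
    return 2.0 ** (-m / 2.0) * haar(2.0 ** (-m) * x - n)

def inner_product(m1, n1, m2, n2, lo=-16.0, hi=16.0, step=0.25):
    """Midpoint-rule integral of psi_{m1,n1} * psi_{m2,n2} over [lo, hi).

    The cells of width `step` never straddle a jump of the wavelets used below,
    so the rule integrates these piecewise constant products exactly.
    """
    count = int(round((hi - lo) / step))
    total = 0.0
    for k in range(count):
        x = lo + (k + 0.5) * step
        total += haar_mn(m1, n1, x) * haar_mn(m2, n2, x)
    return total * step

# All wavelets with -1 <= m <= 3 and -2 <= n <= 1 are supported inside [-16, 16].
labels = [(m, n) for m in range(-1, 4) for n in range(-2, 2)]
gram = {(p, q): inner_product(p[0], p[1], q[0], q[1]) for p in labels for q in labels}
max_deviation = max(abs(value - (1.0 if p == q else 0.0)) for (p, q), value in gram.items())
```

The Gram matrix of these twenty wavelets agrees with the identity up to rounding error, including for pairs such as $\psi_{1,1}$ and $\psi_{3,0}$ whose supports overlap.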

[Figure 1.5: plot of two Haar wavelets, one labelled $\psi_{3,0}$; axis ticks at 0, 4, 8.]

FIG. 1.5. Two Haar wavelets; the support of the "narrower" wavelet is completely con-
tained in an interval where the "wider" wavelet is constant.

We concentrate now on how well an arbitrary function $f$ can be approximated
by linear combinations of Haar wavelets. Any $f$ in $L^2(\mathbb{R})$ can be arbitrarily well
approximated by a function with compact support which is piecewise constant
on the $[\ell 2^{-j}, (\ell+1)2^{-j}[$ (it suffices to take the support and $j$ large enough). We
can therefore restrict ourselves to such piecewise constant functions only: assume
$f$ to be supported on $[-2^{J_1}, 2^{J_1}]$, and to be piecewise constant on the
$[\ell 2^{-J_0}, (\ell+1)2^{-J_0}[$, where $J_1$ and $J_0$ can both be arbitrarily large (see
Figure 1.6). Let us denote the constant value of $f^0 = f$ on
$[\ell 2^{-J_0}, (\ell+1)2^{-J_0}[$ by $f^0_\ell$. We now represent $f^0$ as a sum of two
pieces, $f^0 = f^1 + \delta^1$, where $f^1$ is an approximation to $f^0$ which is piecewise
constant over intervals twice as large as originally, i.e.,
$f^1|_{[k2^{-J_0+1}, (k+1)2^{-J_0+1}[} \equiv \text{constant} = f^1_k$. The values $f^1_k$ are
given by the averages of the two corresponding constant values for $f^0$,
$f^1_k = \frac{1}{2}(f^0_{2k} + f^0_{2k+1})$ (see Figure 1.6). The function $\delta^1$ is
piecewise constant with the same stepwidth as $f^0$; one immediately has

$$\delta^1_{2\ell} = f^0_{2\ell} - f^1_\ell = \tfrac{1}{2}(f^0_{2\ell} - f^0_{2\ell+1})$$

and

$$\delta^1_{2\ell+1} = f^0_{2\ell+1} - f^1_\ell = \tfrac{1}{2}(f^0_{2\ell+1} - f^0_{2\ell}) = -\delta^1_{2\ell}.$$

It follows that $\delta^1$ is a linear combination of scaled and translated Haar functions,
one for each interval $[\ell 2^{-J_0+1}, (\ell+1)2^{-J_0+1}[$ inside the support of $f$:

$$\delta^1 = \sum_{\ell = -2^{J_1+J_0-1}}^{2^{J_1+J_0-1}-1} \delta^1_{2\ell}\, \psi(2^{J_0-1}x - \ell).$$

We have therefore written $f$ as

$$f = f^0 = f^1 + \sum_\ell c_{-J_0+1,\ell}\, \psi_{-J_0+1,\ell},$$

[Figure 1.6: (a) the function $f$; (b) "blow up" of a portion of $f$, showing the values $f^0_\ell$, their pairwise averages $f^1_k$, and the differences $\delta^1_\ell$.]

FIG. 1.6. (a) A function $f$ with support $[-2^{J_1}, 2^{J_1}]$, piecewise constant on the
$[k2^{-J_0}, (k+1)2^{-J_0}[$. (b) A blowup of a portion of $f$. On every pair of intervals, $f$
is replaced by its average ($\to f^1$); the difference between $f$ and $f^1$ is $\delta^1$,
a linear combination of Haar wavelets.

where $f^1$ is of the same type as $f^0$, but with stepwidth twice as large. We can
apply the same trick to $f^1$, so that

$$f^1 = f^2 + \sum_\ell c_{-J_0+2,\ell}\, \psi_{-J_0+2,\ell},$$

with $f^2$ still supported on $[-2^{J_1}, 2^{J_1}]$, but piecewise constant on the even larger
intervals $[k2^{-J_0+2}, (k+1)2^{-J_0+2}[$. We can keep going like this, until we have

$$f = f^{J_0+J_1} + \sum_{m=-J_0+1}^{J_1} \sum_\ell c_{m,\ell}\, \psi_{m,\ell}.$$

Here $f^{J_0+J_1}$ consists of two constant pieces (see Figure 1.7), with
$f^{J_0+J_1}|_{[0, 2^{J_1}[} \equiv f^{J_0+J_1}_0$ equal to the average of $f$ over
$[0, 2^{J_1}[$, and $f^{J_0+J_1}|_{[-2^{J_1}, 0[} \equiv f^{J_0+J_1}_{-1}$ the average of $f$
over $[-2^{J_1}, 0[$.

Even though we have "filled out" the whole support of $f$, we can still keep
going with our averaging trick: nothing stops us from widening our horizon from


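[Figure 1.7: the two constant pieces $f^{J_0+J_1}_{-1}$ and $f^{J_0+J_1}_0$ on $[-2^{J_1}, 0[$ and $[0, 2^{J_1}[$, their halved values spread over $[-2^{J_1+1}, 0[$ and $[0, 2^{J_1+1}[$, and the differences, which are multiples of $\psi(2^{-J_1-1}x + 1)$ and $\psi(2^{-J_1-1}x)$.]

FIG. 1.7. The averages of $f$ on $[0, 2^{J_1}]$ and $[-2^{J_1}, 0]$ can be "smeared" out over
the bigger intervals $[0, 2^{J_1+1}]$, $[-2^{J_1+1}, 0]$; the difference is a linear
combination of very stretched out Haar functions.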

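$2^{J_1}$ to $2^{J_1+1}$, and writing $f^{J_0+J_1} = f^{J_0+J_1+1} + \delta^{J_0+J_1+1}$, where

$$f^{J_0+J_1+1}|_{[0, 2^{J_1+1}[} \equiv \tfrac{1}{2} f^{J_0+J_1}_0, \qquad
f^{J_0+J_1+1}|_{[-2^{J_1+1}, 0[} \equiv \tfrac{1}{2} f^{J_0+J_1}_{-1},$$

and

$$\delta^{J_0+J_1+1} = \tfrac{1}{2} f^{J_0+J_1}_0\, \psi(2^{-J_1-1}x) - \tfrac{1}{2} f^{J_0+J_1}_{-1}\, \psi(2^{-J_1-1}x + 1)$$

(see Figure 1.7). This can again be repeated, leading to

$$f = f^{J_0+J_1+K} + \sum_{m=-J_0+1}^{J_1+K} \sum_\ell c_{m,\ell}\, \psi_{m,\ell},$$

where support $(f^{J_0+J_1+K}) = [-2^{J_1+K}, 2^{J_1+K}]$ and

$$f^{J_0+J_1+K}|_{[0, 2^{J_1+K}[} = 2^{-K} f^{J_0+J_1}_0, \qquad
f^{J_0+J_1+K}|_{[-2^{J_1+K}, 0[} = 2^{-K} f^{J_0+J_1}_{-1}.$$

It follows immediately that

$$\left\| f - \sum_{m=-J_0+1}^{J_1+K} \sum_\ell c_{m,\ell}\, \psi_{m,\ell} \right\|_{L^2}
= \left\| f^{J_0+J_1+K} \right\|_{L^2}
= 2^{-K/2} \cdot 2^{J_1/2} \left[ |f^{J_0+J_1}_0|^2 + |f^{J_0+J_1}_{-1}|^2 \right]^{1/2},$$

which can be made arbitrarily small by taking sufficiently large $K$. As claimed,
$f$ can therefore be approximated to arbitrary precision by a finite linear
combination of Haar wavelets!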
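The averaging-and-differencing steps of this argument translate directly into code. The following Python sketch is a simplified illustration, not the author's construction: it works on the list of constant values $f^0_\ell$ over a single interval (rather than on the two half-lines separately), and the function names and the sample values are hypothetical.

```python
import math

def split_step(values):
    """One step f^j = f^(j+1) + delta^(j+1): pairwise averages and half-differences."""
    half = len(values) // 2
    averages = [(values[2 * k] + values[2 * k + 1]) / 2.0 for k in range(half)]
    details = [(values[2 * k] - values[2 * k + 1]) / 2.0 for k in range(half)]
    return averages, details

def merge_step(averages, details):
    """Inverse of split_step: f_{2k} = avg_k + d_k and f_{2k+1} = avg_k - d_k."""
    values = []
    for avg, d in zip(averages, details):
        values.extend([avg + d, avg - d])
    return values

def haar_decompose(values):
    """Repeat split_step until one average remains (length must be a power of two)."""
    details_per_level = []
    current = list(values)
    while len(current) > 1:
        current, details = split_step(current)
        details_per_level.append(details)
    return current[0], details_per_level

def haar_reconstruct(mean, details_per_level):
    """Undo haar_decompose, starting from the coarsest level."""
    current = [mean]
    for details in reversed(details_per_level):
        current = merge_step(current, details)
    return current

def smeared_norm(mean, length, k):
    """L2 norm of the constant mean / 2^k spread over an interval of size length * 2^k."""
    return abs(mean) / 2 ** k * math.sqrt(length * 2 ** k)

step_values = [4.0, 2.0, 5.0, 7.0]            # hypothetical values f^0_l on unit intervals
mean, details = haar_decompose(step_values)   # mean = 4.5, details = [[1.0, -1.0], [-1.5]]
rebuilt = haar_reconstruct(mean, details)
```

Each detail list holds the values $\delta_{2\ell}$ of one level, i.e. the coefficients of Haar functions at that scale. Widening the horizon $K$ more times divides the leftover constant by $2^K$ while its support grows by $2^K$, which is the $2^{-K/2}$ decay of the norm computed by `smeared_norm`.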
The argument we just saw has implicitly used a "multiresolution" approach:
we have written successive coarser and coarser approximations to $f$ (the $f^j$,

averaging $f$ over larger and larger intervals), and at every step we have written
the difference between the approximation with resolution $2^{j-1}$, and the next
coarser level, with resolution $2^j$, as a linear combination of the $\psi_{j,k}$. In fact, we
have introduced a ladder of spaces $(V_j)_{j \in \mathbb{Z}}$ representing the successive
resolution levels: in this particular case,
$V_j = \{f \in L^2(\mathbb{R});\ f \text{ piecewise constant on the } [2^j k, 2^j(k+1)[,\ k \in \mathbb{Z}\}$.
These spaces have the following properties:

(1) $\cdots \subset V_2 \subset V_1 \subset V_0 \subset V_{-1} \subset V_{-2} \subset \cdots$;

(2) $\bigcap_{j \in \mathbb{Z}} V_j = \{0\}$, $\overline{\bigcup_{j \in \mathbb{Z}} V_j} = L^2(\mathbb{R})$;

(3) $f \in V_j \leftrightarrow f(2^j \cdot) \in V_0$;

(4) $f \in V_0 \to f(\cdot - n) \in V_0$ for all $n \in \mathbb{Z}$.

Property 3 expresses that all the spaces are scaled versions of one space (the
"multiresolution" aspect). In the Haar example we found then that there exists
a function $\psi$ so that

$$\mathrm{Proj}_{V_{j-1}} f = \mathrm{Proj}_{V_j} f + \sum_{k \in \mathbb{Z}} \langle f, \psi_{j,k} \rangle\, \psi_{j,k}. \tag{1.3.5}$$

The beauty of the multiresolution approach is that whenever a ladder of spaces
$V_j$ satisfies the four properties above, together with

(5) $\exists \phi \in V_0$ so that the $\phi_{0,n}(x) = \phi(x - n)$ constitute an orthonormal
basis for $V_0$,

then there exists $\psi$ so that (1.3.5) holds. (In the Haar example above, we
can take $\phi(x) = 1$ if $0 \le x < 1$, $\phi(x) = 0$ otherwise.) The $\psi_{j,k}$
automatically constitute an orthonormal basis. It turns out that there are many
examples of such "multiresolution analysis ladders," corresponding to many
examples of orthonormal wavelet bases. There exists an explicit recipe for the
construction of $\psi$: since $\phi \in V_0 \subset V_{-1}$, and the
$\phi_{-1,n}(x) = \sqrt{2}\, \phi(2x - n)$ constitute an orthonormal basis for $V_{-1}$
(by (3) and (5) above), there exist $\alpha_n = \langle \phi, \phi_{-1,n} \rangle$ so that
$\phi(x) = \sqrt{2} \sum_n \alpha_n\, \phi(2x - n)$. It then suffices to take
$\psi(x) = \sqrt{2} \sum_n (-1)^n \alpha_{-n+1}\, \phi(2x - n)$. The function $\phi$ is called
a scaling function of the multiresolution analysis. The correspondence
multiresolution analysis $\to$ orthonormal basis of wavelets will be explained in detail
in Chapter 5, and further explored in subsequent chapters. This multiresolution
approach is also linked with subband filtering, as explained in §5.6 (Chapter 5).
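The recipe for $\psi$ can be tried out on the Haar example. The sketch below is an illustration only: it assumes the expansions are written with the explicit factor $\sqrt{2}$ from $\phi_{-1,n}(x) = \sqrt{2}\,\phi(2x - n)$, stores the coefficients $\alpha_n$ in a dictionary keyed by $n$, and uses the hypothetical helper names `wavelet_coefficients`, `box`, and `refine`.

```python
import math

def wavelet_coefficients(alpha):
    """Coefficients of psi in the phi_{-1,n}: beta_n = (-1)^n alpha_{-n+1}."""
    beta = {}
    for k, value in alpha.items():
        n = 1 - k                      # alpha_k appears in the term with n = 1 - k
        sign = 1.0 if n % 2 == 0 else -1.0
        beta[n] = sign * value
    return beta

def box(x):
    """Haar scaling function: phi(x) = 1 for 0 <= x < 1, 0 otherwise."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0

def refine(coefficients, phi, x):
    """Evaluate sqrt(2) * sum_n c_n phi(2x - n)."""
    return math.sqrt(2.0) * sum(c * phi(2.0 * x - n) for n, c in coefficients.items())

# For the Haar case, alpha_0 = alpha_1 = 1/sqrt(2) and all other alpha_n vanish.
alpha = {0: 1.0 / math.sqrt(2.0), 1: 1.0 / math.sqrt(2.0)}
beta = wavelet_coefficients(alpha)

sample_points = [-0.5, 0.25, 0.75, 1.5]
phi_values = [refine(alpha, box, x) for x in sample_points]   # reproduces phi itself
psi_values = [refine(beta, box, x) for x in sample_points]    # the Haar function
```

With these coefficients the first expansion returns the box function and the second returns $+1$ on $[0, \frac{1}{2}[$ and $-1$ on $[\frac{1}{2}, 1[$, i.e. the Haar function of §1.3.3.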
Figure 1.8 shows some examples of pairs of functions $\phi$, $\psi$ corresponding to
different multiresolution analyses which we will encounter in later chapters. The
Meyer wavelets (Chapters 4 and 5) have compactly supported Fourier transform;
$\phi$ and $\psi$ themselves are infinitely supported; they are shown in Figure 1.8a. The
Battle-Lemarié wavelets (Chapter 5) are spline functions (linear in Figure 1.8b,
cubic in Figure 1.8c), with knots at $\mathbb{Z}$ for $\phi$, at $\frac{1}{2}\mathbb{Z}$ for $\psi$.
Both $\phi$ and $\psi$ have infinite support, and exponential decay; their numerical decay
is faster than for the Meyer wavelets (for comparison, the horizontal scale is the same in (a),


[Figure 1.8: six pairs of plots of $\phi$ and $\psi$: (a) $\phi_{\mathrm{Meyer}}$, $\psi_{\mathrm{Meyer}}$; (b) $\phi_{BL,1}$, $\psi_{BL,1}$; (c) $\phi_{BL,3}$, $\psi_{BL,3}$; (d) $\phi_{\mathrm{Haar}}$, $\psi_{\mathrm{Haar}}$; (e) and (f) compactly supported wavelets.]

FIG. 1.8. Some examples of orthonormal wavelet bases. For every $\psi$ in this figure, the
family $\psi_{j,k}(x) = 2^{-j/2}\, \psi(2^{-j}x - k)$, $j, k \in \mathbb{Z}$, constitutes an
orthonormal basis of $L^2(\mathbb{R})$. The figure plots $\phi$ (the associated scaling
function) and $\psi$ for different constructions which we will encounter in later chapters.
(a) The Meyer wavelets; (b) and (c) Battle-Lemarié wavelets; (d) the Haar wavelet; (e) the
next member of the family of compactly supported wavelets, ${}_2\psi$; (f) another compactly
supported wavelet, with less asymmetry.

(b), and (c) of Figure 1.8). The Haar wavelet, in Figure 1.8d, has been known
since 1910. It can be viewed as the smallest degree Battle-Lemarié wavelet
($\psi_{\mathrm{Haar}} = \psi_{BL,0}$) or also as the first of a family of compactly supported
wavelets constructed in Chapter 6, $\psi_{\mathrm{Haar}} = {}_1\psi$. Figure 1.8e plots the next
member of the family of compactly supported wavelets ${}_N\psi$; ${}_2\phi$ and ${}_2\psi$
both have support width 3, and are continuous. In this family of ${}_N\psi$ (constructed in
§6.4), the regularity increases linearly with the support width (Chapter 7). Finally, Figure
1.8f shows another compactly supported wavelet, with support width 11, and
less asymmetry (see Chapter 8).

Notes.

1. There exist other techniques for time-frequency localization than the
windowed Fourier transform. A well-known example is the Wigner distribution.
(See, e.g., Boashash (1990) for a good review on the use of the Wigner
distribution for signal analysis.) The advantage of the Wigner distribution
is that, unlike the windowed Fourier transform or the wavelet transform,
it does not introduce a reference function (such as the window function,
or the wavelet) against which the signal has to be integrated. The
disadvantage is that the signal enters in the Wigner distribution in a quadratic
rather than linear way, which is the cause of many interference phenomena.
These may be useful in some applications, especially for, e.g., signals
which have a very short time duration (an example is Janse and Kaiser
(1983); Boashash (1990) contains references to many more examples); for
signals which last for a longer time, they make the Wigner distribution less
attractive. Flandrin (1989) shows how the absolute values of both the
windowed Fourier transform and the wavelet transform of a function can also
be obtained by "smoothing" its Wigner distribution in an appropriate way;
the phase information is lost in this process, however, and reconstruction is
no longer possible.

2. The restriction $b_0 = 1$, corresponding to (1.3.4), is not very serious: if
(1.3.4) provides an orthonormal basis, then so do the
$\tilde{\psi}_{m,n}(x) = 2^{-m/2}\, \tilde{\psi}(2^{-m}x - nb_0)$, with
$\tilde{\psi}(x) = |b_0|^{-1/2}\, \psi(b_0^{-1}x)$, where $b_0 \neq 0$ is arbitrary.
The choice $a_0 = 2$ cannot be modified by scaling, and in fact $a_0$
cannot be chosen arbitrarily. The general construction of orthonormal
bases we will present here can be made to work for all rational choices for
$a_0 > 1$, as shown in Auscher (1989), but the choice $a_0 = 2$ is the simplest.
Different choices for $a_0$ correspond of course to different $\psi$. Although the
constructive method for orthonormal wavelet bases, called multiresolution
analysis, can work only if $a_0$ is rational, it is an open question whether
there exist orthonormal wavelet bases (necessarily not associated with a
multiresolution analysis), with good time-frequency localization, and with
irrational $a_0$.
