SpringerBriefs in Electrical and Computer Engineering: Speech Technology

Application of Wavelets in Speech Processing
Second Edition
SpringerBriefs in Electrical and Computer
Engineering
Speech Technology
Series editor
Amy Neustein, Fort Lee, NJ, USA
Editor’s Note
The authors of this series have been hand-selected. They comprise some of the most outstanding scientists, drawn from academia and private industry, whose research is marked by its novelty, applicability, and practicality in providing broad-based speech solutions. The SpringerBriefs in Speech Technology series presents the latest findings in speech technology gleaned from comprehensive literature reviews and empirical investigations performed in both laboratory and real-life settings. Topics covered in this series include real-life commercial deployment of spoken dialog systems, contemporary methods of speech parameterization, developments in information security for automated speech, forensic speaker recognition, use of sophisticated speech analytics in call centers, and an exploration of new methods of soft computing for improving human-computer interaction. Those in academia, the private sector, the self-service industry, law enforcement, and government intelligence are among the principal audience for this series, which is designed to serve as an important and essential reference guide for speech developers, system designers, speech engineers, linguists, and others. In particular, a major audience will consist of researchers and technical experts in the automated call center industry, where speech processing is a key component of customer care contact centers.
Application of Wavelets in Speech Processing
Second Edition
Mohamed Hesham Farouk
Department of Engineering Mathematics and Physics, Faculty of Engineering, Cairo University, Giza, Egypt
The chapters of this book have been structured such that each one is self-contained and can be read separately. Each chapter is concerned with a specific application of wavelets in speech technology. Every module in a chapter surveys the literature on its topic, explains how wavelets are used in the work, and then discusses the experimental results of the proposed methods. Chapter 1 introduces the topic of speech processing, while Chap. 2 discusses the speech production process and different approaches to modeling a speech signal. Chapter 3, thereafter, explains how wavelets can describe and model many features of a speech signal. Applications of the wavelet transform (WT) in speech processing are the subjects of the subsequent chapters. The power of the WT in estimating the spectral characteristics of speech is explained in Chap. 4, showing how elements of the spectrum, such as pitch and formants, can be derived. Chapter 5 addresses the problem of speech activity detection and signal separation based on features extracted from the WT. Enhancement and noise cancellation are reviewed in Chap. 6, showing how the WT improves the process. The problem of speech recognition is discussed in Chap. 7 in view of the powerful features provided by wavelet analysis. Another recognition problem, the identification of a speaker from his voice, is considered in Chap. 8. A similar topic, emotion recognition through wavelet features in an utterance, is elucidated in Chap. 9. Another key application is discussed in Chap. 10, showing how a speech signal can be coded and synthesized using wavelets.
Acknowledgment
The author would like to thank the editorial board of the SpringerBriefs series for letting him prepare this monograph and for their continuous cooperation during the preparation of the work. Thanks should also go to his colleagues at the Engineering Physics Department, Bahira Elsebelgy, Ph.D., and M. El-Gohary, M.Sc., for helping with proofreading.
Contents

1 Introduction
  1.1 History and Definition of Speech Processing
  1.2 Applications of Speech Processing
  1.3 Recent Progress in Speech Processing
  1.4 Wavelet Analysis as an Efficient Tool for Speech Processing
  References
2 Speech Production and Perception
  2.1 Speech Production Process
  2.2 Classification of Speech Sounds
  2.3 Speech Production Modeling
  2.4 Speech Perception Modeling
  2.5 Intelligibility and Speech Quality Measures
  References
3 Wavelets, Wavelet Filters, and Wavelet Transforms
  3.1 Short-Time Fourier Transform (STFT)
  3.2 Multiresolution Analysis and Wavelet Transform
  3.3 Wavelets and Bank of Filters
  3.4 Wavelet Families
  3.5 Wavelet Packets
  3.6 Undecimated Wavelet Transform
  3.7 The Continuous Wavelet Transform (CWT)
  3.8 Wavelet Scalogram
  3.9 Empirical Wavelets
  References
4 Spectral Analysis of Speech Signal and Pitch Estimation
  4.1 Spectral Analysis
  4.2 Formant Tracking and Estimation
  4.3 Pitch Estimation
  References
Index
Chapter 1
Introduction

The first trials of speech processing through machines may date back to the ancient Egyptians, who built statues that produced sounds. From the eighteenth century, there are documented attempts to build speaking machines [1].
In the human speech processing system, several transformations may be involved: thought to articulation, articulators' movements to an acoustical signal, propagation of the speech signal, electronic transmission/storage, loudspeaker to the listener's ears, acoustic-to-electrical conversion in the inner ear, and interpretation by the listener's brain. These transformations are modeled through many mathematical algorithms. In most speech processing algorithms, a feature space of lower dimension is built using a transformation kernel, which allows a post-processing stage to readily extract more useful information.
Accordingly, speech processing concerns the methods and algorithms used in analyzing and manipulating speech signals. Since signals are usually processed in a digital representation, speech processing can be regarded as a special case of digital signal processing applied to speech signals. The main topics of speech processing are recognition, coding, synthesis, and enhancement. Processing for speech recognition concentrates on extracting the features that achieve the highest recognition rate with a given classifier. For speech coding and synthesis, the coding parameters extracted from speech signals should form a low-dimensional set that gives the closest match between the original and reconstructed signals. For speech enhancement, efforts are directed toward identifying the analysis components that carry sources of signal degradation.
Most applications of speech processing emerged many years ago. Recent years, however, have seen the widespread deployment of smartphones and other portable devices able to make good-quality recordings of speech and even video. Such recordings can be processed locally or transmitted for processing on remote stations with more computational power and storage, both of which increase rapidly as computational technology advances.
The current state of speech processing systems is still far from human performance. A major problem for most speech-based applications is robustness: such systems may be insufficiently general. As an example, a truly robust automatic speech recognition (ASR) system should be independent of any speaker in reasonable environments. Environmental noise from natural sources or machines, as well as communication channel distortions, all tend to degrade the system's performance, often severely. Human listeners, by contrast, can often adapt rapidly to these difficulties, which suggests that significant enhancement is still possible. Much of what we know about human speech production and perception has yet to be integrated into research efforts.
As a result, research in speech processing is directed mainly toward finding robust processing techniques. This concern is motivated by the need for lower-complexity, more efficient methods of speech feature extraction, which are required both to enhance the naturalness, acceptability, and intelligibility of reconstructed speech corrupted by environmental noise, and to reduce noise so that robust speech recognition systems can achieve high recognition rates in harsh environments [3]. New algorithms are continuously developed to enhance the performance of speech processing for different applications. Most of the improvements are founded on the growth of the research infrastructure in the speech area and on inspiration from related biological systems. Powerful computation and communication systems admit more sophisticated and efficient algorithms, enabling more reliable and robust applications of speech processing. In addition, larger speech corpora are becoming available, and the research infrastructure of the field continues to grow.
Since wavelets entered the field of digital signal processing in the 1990s, they have found wide application in speech processing, and wavelet analysis has continued to serve many speech-based applications ever since. Numerous algorithms and hardware implementations have been developed that employ wavelet analysis as an efficient spectral analysis tool, compensating for the limitations of Fourier-based algorithms [4]. The different merits of the WT support efficient feature extraction in most areas of research, especially as newer speech corpora emerge continuously. Moreover, the WT is an economical analysis tool from a processing-time perspective, since it can be obtained in O(L), whereas the short-time Fourier transform (STFT) representation requires O(L log M), where L is the length of the discretized speech signal and M denotes the subframe length of the window used [5].
Chapter 2
Speech Production and Perception

Speech sounds are produced by the movement of the organs constituting the vocal tract (glottis, velum, tongue, lips) acting on air from the respiratory passages (trachea, larynx, pharynx, mouth, nose). The vocal organs generate a local disturbance of the air at several positions in the vocal tract, creating the sources for speech production. The acoustic waves generated by such sources are then modulated, during propagation, by the specific shape of the vocal tract. Accordingly,
speech sounds are generated by the combined effect of sound sources and vocal
tract characteristics. The source-filter model of speech production assumes that the
spectrum of source excitation at the glottis is shaped according to filtering proper-
ties of the vocal tract. Such filtering properties change continuously with time.
Continuous changes in the shape of the vocal tract and in the excitation, whether through the glottis or a tract constriction, make the sounds produced at the lips nonstationary.
Wavelet analysis is one of the best methods for extracting spectral features from
nonstationary signals, since it employs multiresolution measures both in time and
frequency.
2.1 Speech Production Process

The speech production process takes place inside the vocal tract, which extends from the glottis to the lips. The process is energized by air from the lungs. The vocal tract is a chamber of extremely complicated geometrical shape whose dimensions and configuration may vary continuously with time and whose walls are composed of tissues with widely ranging properties.
Figure 2.1 shows the anatomical structure of the vocal tract. The glottis is a slit-like orifice between the vocal cords (at the top of the trachea). The cartilages around the cords support them and facilitate adjustment of their tension.

[Fig. 2.1 Anatomical structure of the vocal tract: nasal cavity, nasal pharynx, soft palate (velum), oral pharynx, epiglottis, lips, tongue, false and true vocal folds, larynx, laryngeal ventricle, thyroid cartilage, trachea, and esophagus]

The flexible structure of the vocal cords makes them oscillate easily. These oscillations are
responsible for the periodic excitation of vowels. Other sounds may be excited by a jet of air forced through a constriction within the vocal tract, or by a combination of the periodic excitation and that jet of air. The nasal tract constitutes an ancillary path for sound transmission; it begins at the velum and terminates at the nostrils.
2.2 Classification of Speech Sounds

Speech sounds are classified according to the type and place of excitation. Voiced sounds and vowels are characterized by a periodic excitation at the glottis: the air expelled from the lungs causes the vocal cords to vibrate as a relaxation oscillator, and the airstream is modulated into discrete puffs. These oscillations start when the subglottal pressure is increased sufficiently to
force the initially abducted cords apart with lateral acceleration. As the air flow
builds up in the orifice, the local pressure is reduced and a force acts to return the
cords to a proximate position. Consequently, the pressure approaches the subglottal
value as the flow decreases with the decrease in the orifice (glottal area). The relax-
ation cycle is then repeated. The mass and compliance of the cords and the subglot-
tal pressure determine the oscillation frequency (pitch) [1].
Unvoiced sounds are generated by passing the airstream through a constriction in the tract. The pressure perturbations due to these excitation mechanisms provide an acoustic wave which propagates along the vocal tract toward the lips. The source for voiced fricatives is a combined effect of both types of excitation.
If the nasal tract is coupled to the vocal cavity through the velum, the radiated sound is the resultant of radiation at both the lips and the nostrils, and the result is called a nasalized sound (as in /m/ and /n/). The distinctive sounds of any language (phonemes) are uniquely determined by describing the excitation source and the vocal tract configuration.
The variation of the cross-sectional area along the vocal tract, which depends on the articulators' positions, is called the area function. The area function of the vowels is determined primarily by the position of the tongue, but the positions of the jaw, lips, and, to a small extent, the velum also influence the resulting sound. The area function together with the excitation type can uniquely define the produced sound.
The produced speech signal s(t) can thus be modeled as the convolution of a source excitation e(t) with the impulse response h(t) of the vocal tract:

s(t) = e(t) ∗ h(t)    (2.1)
Specifically, sound source e(t) is either voiced or unvoiced. A voiced source can
be modeled, in the simplest case, by a generator of periodic pulses or asymmetrical
triangular waves which are repeated at every fundamental period (pitch). The peak
value of the source wave corresponds to the loudness of the voiced sound. On the
other hand, an unvoiced sound source can be modeled by a white noise generator,
while the mean energy corresponds to the loudness [2].
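As an illustrative aside (not the book's code), the source-filter idea of Eq. (2.1) can be sketched numerically: a periodic impulse train, the simplest voiced-source model, is convolved with a toy single-resonance impulse response standing in for the vocal tract filter h(t). All names and parameter values below are invented for the example.

```python
import numpy as np

def voiced_excitation(n_samples, pitch_period):
    """Periodic impulse train: the simplest model of the voiced source e(t)."""
    e = np.zeros(n_samples)
    e[::pitch_period] = 1.0
    return e

def source_filter(e, h):
    """s = e * h: excitation convolved with the tract impulse response (Eq. 2.1)."""
    return np.convolve(e, h)[:len(e)]

fs = 8000                                    # sampling rate (Hz), arbitrary choice
t = np.arange(200) / fs
h = np.exp(-400 * t) * np.sin(2 * np.pi * 500 * t)   # toy one-resonance "tract"

e = voiced_excitation(800, pitch_period=80)  # 100 Hz pitch at fs = 8 kHz
s = source_filter(e, h)                      # a buzzy, vowel-like waveform
```

Changing `pitch_period` changes the perceived pitch while `h` fixes the resonance (formant) structure, mirroring the model's separation of source and filter.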
For many speech applications such as recognition, coding, and synthesis, good
performance can be achieved with a speech model that reflects broad characteristics
of timing and articulatory patterns as well as varying frequency properties [3, 4]. In such a model, a scheme is designed to perform some spectral shaping on a
certain excitation wave so that it matches the natural spectrum (i.e., the vocal tract tube acts as a spectral shaper of the excitation). This approach is called "terminal analog" since its output is analogous to the natural process at the terminals only. The
main interest in such an approach is centered on resonance frequencies (formants or
system poles) and their bandwidths. The two widely used methods of this approach
are the formant model [5] and the linear prediction (LP) model [6]. These models
provide simpler implementation schemes for both hardware and software. Many
commercial products now adopt such models in their operation [7]. Terminal analog models afford sufficient intelligibility for many applications, along with fast response, owing to their simplicity and amenability to implementation on many available media. The features extracted using such models can be considered different forms of the resonances or spectral content of a speech signal.
As wavelets are considered one of the efficient methods for representing the spec-
trum of speech signals, WT can efficiently help in implementing such models [8].
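As a brief illustrative sketch (not material from the book), the coefficients of the linear prediction (LP) model mentioned above can be estimated by the autocorrelation (Yule-Walker) method; the function name and the synthetic test signal are invented for the example.

```python
import numpy as np

def lpc(x, order):
    """Linear prediction coefficients via the autocorrelation method:
    solve R a = r so that x[n] ~ a[0] x[n-1] + ... + a[order-1] x[n-order]."""
    n = len(x)
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

# Synthetic AR(2) signal with known coefficients (1.3, -0.8) for checking
rng = np.random.default_rng(1)
x = np.zeros(20000)
for i in range(2, len(x)):
    x[i] = 1.3 * x[i - 1] - 0.8 * x[i - 2] + rng.standard_normal()

a = lpc(x, order=2)   # should recover approximately [1.3, -0.8]
```

In speech work the same fit is applied frame by frame to windowed speech, and the roots of the prediction polynomial give formant estimates.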
2.4 Speech Perception Modeling

The ear is the main organ in the process of speech perception. It consists of an outer part, a middle part, and an inner part. Figure 2.2 shows the structure of the human
auditory system. The main function of the outer ear is to catch sound waves which
is done by the pinna. The pinna is pointed forward and has a number of curves to be
able to catch the sound and determine its direction. After the sound reaches the
pinna, it is guided through the external auditory canal to the middle ear until it reaches the eardrum.

[Fig. 2.2 Structure of the human auditory system: eardrum, cochlea, and auditory nerve]

The main function of the middle ear is to amplify the sound pressure, because the inner ear transfers sound through fluid rather than air as in the middle and outer ears. The inner ear then starts with the cochlea, the most important organ in
the human ear. The cochlea performs the spectral analysis of the speech signal
through splitting it into several frequency bands which are called critical bands [9].
The ear averages the energies of the frequencies within each critical band and thus
forms a compressed representation of the original stimulus.
Studies have shown that human perception of the frequency content of sounds, whether pure tones or speech signals, does not follow a linear scale. The majority of speech and speaker recognition systems use feature vectors derived from a filter bank designed according to a model of the auditory system. A number of filter forms are used, but all are based on a frequency scale that is approximately linear below 1 kHz and approximately logarithmic above that point. Wavelet multiresolution analysis can provide accurate
localization in both time and frequency domains which can emulate the operation of
the human auditory system [8].
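One commonly used scale with exactly this linear-then-logarithmic behavior is the mel scale. The conversion below uses the common 2595 log10(1 + f/700) form; this is an illustrative aside, not material from the book.

```python
import numpy as np

def hz_to_mel(f):
    """Mel scale: roughly linear below 1 kHz, roughly logarithmic above."""
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

# Center frequencies of a 10-band mel-spaced filter bank up to 4 kHz:
# equally spaced in mel, hence denser at low frequencies in Hz.
edges_mel = np.linspace(hz_to_mel(0.0), hz_to_mel(4000.0), 12)
centers_hz = mel_to_hz(edges_mel[1:-1])
```

Spacing analysis bands equally in mel rather than in Hz is one simple way to emulate the critical-band behavior of the cochlea described above.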
2.5 Intelligibility and Speech Quality Measures

The terms intelligibility and quality of speech are often used interchangeably. The degradation of speech quality is mainly a result of background noise, whether from a communication channel or from the environment [3]. The evaluation of speech quality is highly important in many speech applications. Subjective listening or conversation tests are the most reliable measure of speech quality; however, these tests are often fairly expensive, time-consuming, labor-intensive, and difficult to reproduce.
For some applications, however, such as the assessment of alternative coding or enhancement algorithms, an objective measure is more economical, giving the designer an immediate and reliable estimate of the anticipated perceptual quality of a particular algorithm. Traditional objective quality measures that rely on waveform matching, such as the signal-to-noise ratio (SNR) or variants like the segmental SNR (SSNR), are examples of straightforward measures. Perceptual quality measures are better candidates for fast assessment with more accurate results. The motivation for this perception-based approach is to create estimators that resemble the human hearing system as described by psychoacoustic models. In a psychoacoustic model of human hearing, the whole spectral bandwidth of the speech signal is divided into the critical bands of hearing. The WT can contribute to quality evaluation of speech in the context of critical-band decomposition and auditory masking [10-13]. Moreover, wavelet analysis can reduce the computational effort associated with mapping speech signals onto an auditory scale [10].
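The waveform-matching measures named above can be computed directly. The sketch below is illustrative, not the book's code; the 160-sample frame corresponds to 20 ms at an assumed 8 kHz sampling rate.

```python
import numpy as np

def snr_db(clean, degraded):
    """Global SNR in dB between a clean reference and a degraded signal."""
    noise = clean - degraded
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))

def segmental_snr_db(clean, degraded, frame=160):
    """Segmental SNR: the mean of per-frame SNRs, closer to perception
    than the global SNR because quiet frames are not swamped by loud ones."""
    vals = []
    for i in range(0, len(clean) - frame + 1, frame):
        c = clean[i:i + frame]
        n = c - degraded[i:i + frame]
        if np.sum(n ** 2) > 0:          # skip frames with zero error
            vals.append(10.0 * np.log10(np.sum(c ** 2) / np.sum(n ** 2)))
    return float(np.mean(vals))

rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 200 * np.arange(1600) / 8000)   # 200 Hz tone
noisy = clean + 0.1 * rng.standard_normal(1600)            # additive noise
```

Both measures require the clean reference, which is why they suit codec and enhancement evaluation but not blind quality estimation.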
References
1. J. Flanagan, Speech Analysis; Synthesis and Perception (Springer, New York, 1972)
2. M. Hesham, Vocal tract modeling. Ph.D. dissertation, Faculty of Engineering, Cairo University, 1994
3. J. Deller, J. Proakis, J. Hansen, Discrete-Time Processing of Speech Signals (IEEE PRESS,
New York, 2000)
4. M. Hesham, M.H. Kamel, A unified mathematical analysis for acoustic wave propagation
inside the vocal tract. J. Eng. Appl. Sci. 48(6), 1099–1114 (2001)
5. D. Klatt, Review of text-to-speech conversion for English. J. Acoust. Soc. Am. 82(3), 737–793 (1987)
6. J.D. Markel, A.H.J. Gray, Linear Prediction of Speech (Springer, New York, 1976), ch.4
7. D. O’Shaughnessy, Invited paper: automatic speech recognition: history, methods and chal-
lenges. Pattern Recogn. 41(10), 2965–2979 (2008)
8. S. Ayat, A new method for threshold selection in speech enhancement by wavelet thresholding,
in International Conference on Computer Communication and Management (ICCCM 2011),
(Sydney, May 2011)
9. J. Benesty, M. Sondhi, Y. Huang, Springer Handbook of Speech Processing (Springer-Verlag
New York, Inc., Secaucus, 2007)
10. M. Hesham, A predefined wavelet packet for speech quality assessment. J. Eng. Appl. Sci.
53(5), 637–652 (2006)
11. A. Karmakar, A. Kumar, R.K. Patney, A multiresolution model of auditory excitation pattern
and its application to objective evaluation of perceived speech quality. IEEE Trans. Audio
Speech Lang. Process. 14(6), 1912–1923 (2006)
12. L. Rabiner, R. Schafer, Theory and Applications of Digital Speech Processing (Pearson, New
Jersey, 2011)
13. W. Dobson, J. Yang, K. Smart, F. Guo, High quality low complexity scalable wavelet audio
coding, in Proceedings of IEEE International Conference Acoustics, Speech, and Signal
Processing (ICASSP’97), Apr 1997, pp. 327–330
Chapter 3
Wavelets, Wavelet Filters, and Wavelet Transforms
Multiresolution analysis based on wavelet theory permits the introduction of the concept of signal filtering with different bandwidths or frequency resolutions. The WT provides a framework to decompose a signal into a number of new signals, each with a different degree of resolution. While the Fourier transform (FT) gives an idea of the frequency content of a signal, the wavelet representation is intermediate between the frequency and time representations, and it can provide good localization in both domains. Fast variation in both domains can be detected by inspecting the coefficients of the WT. Because of the complex nature of speech signals and their fast variation with time, the WT is well suited to them. In this chapter, we review the properties of different approaches for obtaining a WT.
3.1 Short-Time Fourier Transform (STFT)

In general, any mathematical transform of a signal or function of time s(t) takes the form:

S(α) = ∫_{−∞}^{∞} s(t) K(α, t) dt    (3.1)

where S(α) is the transform of s(t) with respect to the kernel K(α, t), and α is the transform variable. In the Fourier transform, the kernel is K(ω, t) = e^{−jωt}, where ω is the angular frequency, equal to 2πf, with f the frequency.
The FT is the main tool for spectral analysis of signals. The STFT cuts a signal into short-time intervals (frames) and performs the FT on each frame in order to capture the time-varying spectrum:

S(ω, β) = ∫_{−∞}^{∞} s(t) w(t − β) e^{−jωt} dt    (3.2)

where s(t) is the signal, S(ω, β) is its STFT, and w(t − β) is a window function centered at β in time. The window function is shifted in time, and the FT of the product is computed again. So, for a fixed shift β of the window w(t), the window captures the features of the signal s(t) around the location defined by β. The window localizes the time-domain data within a limited period of time before the frequency-domain information is obtained. The signal is assumed to be quasi-stationary during the period of w(t). The STFT can be viewed as a convolution of the signal s(t) with a filter having an impulse response of the form h(t) = w(t − β) e^{−jωt}. The STFT can also be interpreted as a bank of narrow, slightly overlapping bandpass filters with additional phase information for each one. Alternatively, it can be seen as a special case of a family of transforms that use basis functions.
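The discrete-time STFT described above amounts to windowing each frame and taking its FT. The following is a minimal illustrative sketch, not the book's code; the 256-sample Hamming window and 128-sample hop are arbitrary choices for the example.

```python
import numpy as np

def stft(s, frame_len=256, hop=128):
    """STFT: multiply each frame by a Hamming window, then FFT it."""
    w = np.hamming(frame_len)
    frames = [s[i:i + frame_len] * w
              for i in range(0, len(s) - frame_len + 1, hop)]
    return np.array([np.fft.rfft(f) for f in frames])

fs = 8000
sig = np.sin(2 * np.pi * 1000 * np.arange(fs) / fs)   # 1 s of a 1 kHz tone
S = stft(sig)                                         # shape: (n_frames, n_bins)
peak_hz = np.argmax(np.abs(S[0])) * fs / 256          # strongest bin, frame 0
```

Every frame uses the same fixed window width, which is exactly the limitation the wavelet transform relaxes.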
In order to improve the accuracy of the STFT with respect to time-dependent variation, it is necessary to shorten the frame period; however, the frequency resolution becomes worse with decreasing frame length. In other words, the requirements of time localization and frequency resolution are conflicting, and the major drawback of the STFT is that it uses a fixed window width. The WT addresses this problem to a certain extent and provides a more flexible time-frequency representation of the signal: in contrast to the STFT, which uses a single analysis window, the WT uses short windows at high frequencies and long windows at low frequencies.
The wavelet coefficients of a sampled signal are computed as d_{j,k} = Σ_n s(nT_s) ψ*_{j,k}(nT_s), where j and k are indices indicating the scale and location of a particular wavelet, and T_s is the sampling time of s(t). Without loss of generality, T_s = 1 can be taken in the discrete case, and ψ* is the complex conjugate of ψ.
Wavelet theory allows fine frequency analysis and synthesis, with the possibility of capturing long-lasting (low-frequency) components while also localizing short irregularities, spikes, and other singularities with high-frequency content. The former objective is approached by wavelets at low scales, while the latter is served by wavelets at high scales and appropriate locations. Wavelet localization obeys the Heisenberg uncertainty principle in the time and frequency domains: for any given wavelet, Δt Δf ≥ 1/2π.
As in (3.3), a pure wavelet expansion requires an infinite number of scales to represent the signal s(t) completely. This is impractical: if the expansion is known only for scales j < M, we need a complementary component to carry the information of the expansion for j > M. This is done by introducing a scaling function φ(t) such that [2]:

φ_{j,k}(t) = 2^{−j/2} φ(2^{−j} t − k)    (3.6)
where the set φ_{j,k}(t) is an orthonormal basis for a subspace of L²(ℝ). With the introduced component, the signal s(t) can be represented as a limit of successive approximations corresponding to different resolutions. This formulation is called multiresolution analysis (MRA).
Consequently, the signal s(t) can be expressed as the sum of its approximations plus M details at the Mth decomposition level. Equation (3.5) can be rewritten, after including the approximations, as:

s(t) = Σ_k a_{M,k} φ_{M,k}(t) + Σ_{j=1}^{M} Σ_k d_{j,k} ψ_{j,k}(t)    (3.7)

where M represents the number of scales, the a_{M,k} are the approximation (scaling) coefficients, and the d_{j,k} are the detail (wavelet) coefficients. As a result, the WT can be obtained in O(L), whereas the STFT representation requires O(L log₂ M), where L is the length of the discretized signal and M denotes the subframe length of the window used [3].
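Eq. (3.7) can be checked numerically with the orthonormal Haar filters. The sketch below is illustrative only (all function names are invented); it performs a three-level Haar MRA and returns the coefficient list [a_M, d_M, ..., d_1].

```python
import numpy as np

def haar_step(a):
    """One analysis level: orthonormal Haar low/high-pass + downsample by 2."""
    p = a.reshape(-1, 2)
    approx = (p[:, 0] + p[:, 1]) / np.sqrt(2)   # scaling (approximation) branch
    detail = (p[:, 0] - p[:, 1]) / np.sqrt(2)   # wavelet (detail) branch
    return approx, detail

def haar_mra(s, levels):
    """Return [a_M, d_M, ..., d_1], the coefficient sets of Eq. (3.7)."""
    details = []
    a = np.asarray(s, dtype=float)
    for _ in range(levels):
        a, d = haar_step(a)
        details.append(d)
    return [a] + details[::-1]

s = np.arange(8, dtype=float)
coeffs = haar_mra(s, levels=3)
```

Because the Haar basis is orthonormal, the coefficient energies sum to the signal energy; each level costs work proportional to the current length, so the whole decomposition is O(L).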
[Fig. 3.1 Analysis and reconstruction of a signal using the DWT through a two-channel filter bank: the input s(n) is filtered by the high-pass h(n) and low-pass l(n) analysis filters and downsampled by 2, yielding the detail d_{1,1} and approximation a_{1,1} coefficients; upsampling by 2 followed by the synthesis filters h′(n) and l′(n) reconstructs s̃(n)]
The original signal can be reconstructed using this bank of filters. In the synthe-
sis phase, the signals are upsampled and passed through the synthesis filters. The
output of the filters in the synthesis bank is summed to get the reconstructed
signal.
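The two-channel bank of Fig. 3.1 can be sketched with the orthonormal Haar filters; in this case the synthesis filters l′(n), h′(n) are simply the time-reversed analysis filters. This is an illustrative sketch, not the book's implementation.

```python
import numpy as np

l = np.array([1.0, 1.0]) / np.sqrt(2)    # low-pass analysis filter l(n)
h = np.array([1.0, -1.0]) / np.sqrt(2)   # high-pass analysis filter h(n)

def analyze(s):
    """One analysis level: filter with l(n), h(n), then downsample by 2."""
    return np.convolve(s, l)[1::2], np.convolve(s, h)[1::2]

def synthesize(a, d):
    """Synthesis: upsample by 2, filter with l'(n), h'(n), and sum."""
    up_a = np.zeros(2 * len(a)); up_a[::2] = a
    up_d = np.zeros(2 * len(d)); up_d[::2] = d
    return np.convolve(up_a, l[::-1]) + np.convolve(up_d, h[::-1])

s = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
a1, d1 = analyze(s)
s_rec = synthesize(a1, d1)[:len(s)]      # perfect reconstruction of s
```

The analysis halves the rate in each branch, so the total number of coefficients equals the signal length, and summing the two synthesis branches recovers the input exactly.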
3.4 Wavelet Families

The members of a wavelet family are generated from a mother wavelet ψ(t) as:

ψ_{j,k}(t) = 2^{j/2} ψ(2^j t − k)    (3.8)

where j and k are indices indicating the scale and location of a particular wavelet. Accordingly, a wavelet family is a collection of wavelet functions ψ_{j,k}(t) obtained by dilating a basic wavelet by a factor of 2^j and translating the dilated wavelet along the time axis. The wavelets of a family share the same properties, and their collection constitutes a complete basis. The basic wavelet function must have local (or almost local) support in both the real dimension (time, in the case of speech signals) and the frequency domain. Several kinds of wavelet functions have been developed, all with specific properties [4], as follows:
1. A wavelet function has finite energy [5]:

∫_{−∞}^{∞} |ψ(t)|² dt < ∞    (3.9)

2. If Ψ(f) is the Fourier transform of the wavelet function, the wavelet must have zero mean, Ψ(0) = 0, and a similar condition must hold for Ψ(f)/f. This admissibility condition can be formulated as:

∫_0^∞ |Ψ(f)|²/f df < ∞    (3.10)
[Fig. 3.2 (a) The Haar scaling function φ(t) and (b) the Haar wavelet ψ(t)]
There are a number of basis functions that can be used as the mother wavelet for
wavelet transformation. Since the mother wavelet produces all wavelet functions
used in the transformation through translation and scaling, it determines the charac-
teristics of the resulting transform. Therefore, the appropriate mother wavelet
should be chosen in order to use the wavelet analysis effectively for a specific
application.
Figure 3.2 shows the simplest wavelet, the Haar wavelet, which is one of the oldest and simplest types. The Haar scaling function acts as a low-pass filter through an averaging effect on the signal, while its wavelet counterpart acts as a high-pass filter.
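The low-pass/high-pass behaviour can be seen directly in a small sketch (illustrative, not from the book): the Haar difference filter annihilates a constant (DC) signal, and the Haar averaging filter annihilates the fastest-alternating (Nyquist-rate) signal.

```python
import math

INV_SQRT2 = 1 / math.sqrt(2)

def lowpass(s):
    """Haar scaling filter: pairwise averages (up to the sqrt(2) gain)."""
    return [(s[2 * i] + s[2 * i + 1]) * INV_SQRT2 for i in range(len(s) // 2)]

def highpass(s):
    """Haar wavelet filter: pairwise differences."""
    return [(s[2 * i] - s[2 * i + 1]) * INV_SQRT2 for i in range(len(s) // 2)]

dc = [1.0] * 8            # constant (zero-frequency) signal
nyq = [1.0, -1.0] * 4     # fastest-alternating signal

print(highpass(dc))   # high-pass branch blocks DC: all zeros
print(lowpass(nyq))   # low-pass branch blocks the alternation: all zeros
```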
Daubechies wavelets are the most popular. They represent the foundations of
wavelet signal processing and are used in numerous applications. The Haar,
Daubechies, Symlets, and Coiflets are compactly supported orthogonal wavelets.
These wavelets along with Meyer wavelets can provide a perfect reconstruction of
a signal. The Meyer, Morlet, and Mexican Hat wavelets are symmetric in shape [5].
The discrete form of a scale function is the impulse response of a low-pass filter,
while the wavelet is the impulse response of a high-pass filter.
A wavelet packet basis consists of a set of multiscale functions derived from the shift and dilation of a basic wavelet function as in (3.8). The wavelet packet (WP) basis space is generated by decomposing both the low-pass filter function space and the corresponding high-pass filter function space. The conventional wavelet basis space can be considered a special case of the WP space in which the decomposition takes place only in the low-pass filter function space [6]. Assuming that the discrete form of the scale function is l(n) and that of the wavelet is h(n), the WP basis can be expressed by the standard recursion

w_{2n}(t) = \sqrt{2} \sum_{k} l(k) \, w_n(2t - k),
w_{2n+1}(t) = \sqrt{2} \sum_{k} h(k) \, w_n(2t - k),

where w_0(t) is the scaling function and w_1(t) is the basic wavelet.
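To contrast the WP decomposition with the plain DWT, the sketch below (Haar filters, pure Python, an illustrative choice) splits every band at every level, producing the full 2^levels subbands instead of the DWT's single low-pass chain.

```python
import math

INV_SQRT2 = 1 / math.sqrt(2)

def split(s):
    """Split one band into its low-pass and high-pass halves (Haar)."""
    low = [(s[2 * i] + s[2 * i + 1]) * INV_SQRT2 for i in range(len(s) // 2)]
    high = [(s[2 * i] - s[2 * i + 1]) * INV_SQRT2 for i in range(len(s) // 2)]
    return low, high

def wavelet_packet(s, levels):
    """Full WP tree: unlike the DWT, *every* band is split at each level."""
    bands = [list(s)]
    for _ in range(levels):
        next_bands = []
        for band in bands:
            low, high = split(band)
            next_bands.extend([low, high])
        bands = next_bands
    return bands

s = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
bands = wavelet_packet(s, 2)
print(len(bands), [len(b) for b in bands])  # -> 4 [2, 2, 2, 2]
```

Because the Haar split is orthonormal, the total energy of the subband coefficients equals that of the input, just as in the DWT case.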