1.1 The Speech Chain
1.2.2 Text-to-Speech Synthesis
1.2.3 Speech Recognition and Other Pattern Matching Problems
1.2.4 Other Speech Applications
1.3 Our Goal for this Text
2.1 Phonetic Representation of Speech
2.2 Models for Speech Production
2.3 More Refined Models
3.1 The Human Ear
3.2 Perception of Loudness
3.3 Critical Bands
3.4 Pitch Perception
3.5 Auditory Masking
3.6 Complete Model of Auditory Processing
4.1 Short-Time Energy and Zero-Crossing Rate
4.2 Short-Time Autocorrelation Function (STACF)
4.3 Short-Time Fourier Transform (STFT)
4.4 Sampling the STFT in Time and Frequency
4.5 The Speech Spectrogram
4.6 Relation of STFT to STACF
4.7 Short-Time Fourier Synthesis
4.8 Short-Time Analysis is Fundamental to our Thinking
5.1 Definition of the Cepstrum and Complex Cepstrum
5.2 The Short-Time Cepstrum
5.3.1 Computation Using the DFT
5.3.2 z-Transform Analysis
5.3.3 Recursive Computation of the Complex Cepstrum
5.4 Short-Time Homomorphic Filtering of Speech
5.5 Application to Pitch Detection
5.6.1 Compensation for Linear Filtering
5.6.2 Liftered Cepstrum Distance Measures
5.6.3 Mel-Frequency Cepstrum Coefficients
5.7 The Role of the Cepstrum
6.1 Linear Prediction and the Speech Model
6.2.1 The Covariance Method
6.2.2 The Autocorrelation Method
6.3 The Levinson–Durbin Recursion
6.4 LPC Spectrum
6.5.1 Roots of Prediction Error System Function
6.5.2 LSP Coefficients
6.5.3 Cepstrum of Vocal Tract Impulse Response
6.5.4 PARCOR Coefficients
6.5.5 Log Area Coefficients
6.6 The Role of Linear Prediction
7.1.1 Uniform Quantization Noise Analysis
7.1.2 µ-Law Quantization
7.1.3 Non-Uniform and Adaptive Quantization
7.2 Digital Speech Coding
7.3.1 Predictive Coding
7.3.2 Delta Modulation
Coding the ADPCM Parameters
Quality vs. Bit Rate for ADPCM Coders
Basic Analysis-by-Synthesis Coding System
Perceptual Weighting of the Difference Signal
Generating the Excitation Signal
Multi-Pulse Excitation Linear Prediction (MPLP)
Code-Excited Linear Prediction (CELP)
Long-Delay Predictors in Analysis-by-Synthesis Coders
7.4.1 The Two-State Excitation Model
7.4.2 Residual-Excited Linear Predictive Coding
7.4.3 Mixed Excitation Systems
7.5 Frequency-Domain Coders
7.6 Evaluation of Coders
8.1.1 Document Structure Detection
8.1.2 Text Normalization
8.1.3 Linguistic Analysis
8.1.4 Phonetic Analysis
8.1.5 Homograph Disambiguation
8.1.6 Letter-to-Sound (LTS) Conversion
8.1.7 Prosodic Analysis
8.2.1 Early Speech Synthesis Approaches
8.2.2 Word Concatenation Synthesis
8.2.3 Articulatory Methods of Synthesis
8.2.4 Terminal Analog Synthesis of Speech
8.4 TTS Applications
8.5 TTS Future Needs
9.1 The Problem of Automatic Speech Recognition
9.2.1 Recognition Feature Set
9.3.1 Mathematical Formulation of the ASR Problem
9.3.2 The Hidden Markov Model
9.3.3 Step 1 — Acoustic Modeling
9.3.4 Step 2 — The Language Model
9.3.5 Step 3 — The Search Problem
9.4 Representative Recognition Performance
9.5 Challenges in ASR Technology
Introduction to Digital Speech Processing

Published by: dbullseye on Feb 06, 2011
Copyright: Attribution Non-commercial


