SPEECH PROCESSING
By
1. Hawal Suyog R.
2. Hajare Vinayak B.
3. Sakhawalkar Rohit R.
4. Akhil Bhan
Under Guidance of Mr. M. M. Kamble
Introduction
Prior Definitions
Pitch: The perceptual appreciation of the highness or lowness of a sound. It is related to the periodicity of the sound.
Frequency: A physical attribute of a sound or of any other type of signal. It describes the number of times a repeated event occurs per unit of time.
Fundamental Frequency: In a complex sound or signal, the lowest partial.
Introduction
Application of Pitch Tracking
Score following
Musical queries by singing or humming
Acoustic feature for human-computer interaction
Sound-editing programs, e.g. pitch-shifting and time-scaling operations
Introduction
Non-Exclusive Classification
Voice (speech, singing)
Instrumental
Monophonic
Polyphonic
Time-based algorithm
Spectral-based algorithm
Alternative
Introduction
Music-Specific Difficulties
Large frequency range for musical instruments
Many instrumental sounds have inharmonic partials
Expressiveness factors (glissando, vibrato, trill)
Fast algorithms needed for real-time processing
Multiphonics
Short-time speech analysis
Requirement
Comparison of human male and female voice
Zero-crossing detection
Average energy
Average power
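The short-time measures listed above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the frame length of 240 samples, the hop of 80 samples and the 8 kHz sampling rate are all assumptions, and "energy" is taken here as the sum of squared samples in a frame with "power" as that sum divided by the frame length.

```python
import math

def frames(signal, frame_len, hop):
    """Yield successive frames of frame_len samples, advancing by hop samples."""
    for start in range(0, len(signal) - frame_len + 1, hop):
        yield signal[start:start + frame_len]

def zero_crossings(frame):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))

def short_time_energy(frame):
    """Sum of squared samples over the frame (assumed definition)."""
    return sum(s * s for s in frame)

def average_power(frame):
    """Energy divided by the number of samples in the frame."""
    return short_time_energy(frame) / len(frame)

# Illustrative input (assumption): one second of a 100 Hz sine at 8 kHz.
fs = 8000
tone = [math.sin(2 * math.pi * 100 * n / fs) for n in range(fs)]
for i, frame in enumerate(frames(tone, frame_len=240, hop=80)):
    if i < 3:
        print(i, zero_crossings(frame), round(average_power(frame), 3))
```

Frames overlap because the hop is shorter than the frame, which is a common way to keep the measures smooth from one frame to the next.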
Homomorphic Analysis
Separation of combined signals
Removes multiplicative noise
Cepstrum analysis
Autocorrelations
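As a rough illustration of cepstrum analysis, the sketch below computes a real cepstrum (inverse DFT of the log magnitude spectrum) with a naive DFT from the standard library. The test signal, an impulse plus one echo, and all function names are assumptions chosen for the example. Taking the logarithm turns the product of the two spectra into a sum, so the echo is expected to appear as a separate peak at its delay.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform (adequate for short frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def real_cepstrum(x, floor=1e-12):
    """Inverse DFT of the log magnitude spectrum of x."""
    n = len(x)
    # The floor guards against log(0) for spectral nulls.
    log_mag = [math.log(max(abs(v), floor)) for v in dft(x)]
    # log_mag is real and even, so the inverse transform is real.
    return [sum(log_mag[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

# Illustrative input (assumption): unit impulse plus an echo of gain 0.5
# delayed by 8 samples, in a 64-sample frame.
n, delay = 64, 8
x = [0.0] * n
x[0], x[delay] = 1.0, 0.5
cep = real_cepstrum(x)
peak = max(range(1, n // 2), key=lambda q: cep[q])
print("largest cepstral peak at quefrency", peak)
```

For real speech frames an FFT routine would normally replace the naive DFT; the quadratic version is used here only to keep the example dependency-free.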
Synthesis
Sine wave
FM wave
Voice signal
Mute wave
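Three of the listed signal types can be generated directly; a possible sketch follows. The function names, parameters and the sinusoidal form of the FM wave are assumptions made for illustration, and "mute wave" is interpreted as silence. A voice signal would come from a recording and is not covered here.

```python
import math

def sine_wave(freq, duration, fs, amplitude=1.0):
    """Pure tone of the given frequency (Hz) and duration (s)."""
    count = int(round(duration * fs))
    return [amplitude * math.sin(2 * math.pi * freq * n / fs) for n in range(count)]

def fm_wave(carrier, modulator, index, duration, fs, amplitude=1.0):
    """Sinusoidal FM: the carrier phase is offset by index * sin(modulator phase)."""
    count = int(round(duration * fs))
    return [amplitude * math.sin(2 * math.pi * carrier * n / fs
                                 + index * math.sin(2 * math.pi * modulator * n / fs))
            for n in range(count)]

def mute_wave(duration, fs):
    """Silence: a run of zero-valued samples."""
    return [0.0] * int(round(duration * fs))

# Illustrative values (assumptions): 8 kHz sampling, half-second signals.
fs = 8000
tone = sine_wave(440.0, 0.5, fs)
warble = fm_wave(carrier=440.0, modulator=5.0, index=2.0, duration=0.5, fs=fs)
silence = mute_wave(0.5, fs)
print(len(tone), len(warble), len(silence))
```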
Prediction
Use Of Filter
Formant estimation and tracking
Formants are important in applications such as voice coding (vocoders), some approaches to recognition, and formant-based synthesis. However, tracking the dynamics of the formants in continuous speech is very difficult. Analyzing the roots of the denominator polynomial (the poles) in the z domain can help: a pole close to the unit circle may indicate a formant. Even so, formant tracking remains difficult and unreliable.
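The pole-based idea can be made concrete with the usual conversion from a pole at radius r and angle theta to a resonance frequency theta * fs / (2 * pi) and an approximate bandwidth -ln(r) * fs / pi. The sketch below assumes the poles have already been obtained (for example from an LPC fit, which is not shown); the function names, the 400 Hz bandwidth limit and the example pole values are assumptions for illustration only.

```python
import cmath
import math

def pole_to_formant(pole, fs):
    """Return (frequency_hz, bandwidth_hz) for one complex pole."""
    freq = cmath.phase(pole) * fs / (2 * math.pi)
    bandwidth = -math.log(abs(pole)) * fs / math.pi
    return freq, bandwidth

def formant_candidates(poles, fs, max_bandwidth=400.0):
    """Keep upper-half-plane poles that lie close to the unit circle."""
    found = []
    for p in poles:
        if p.imag <= 0:
            continue  # skip real poles and the conjugate partner of each pair
        freq, bw = pole_to_formant(p, fs)
        if bw < max_bandwidth:  # narrow bandwidth means the pole is near the circle
            found.append((freq, bw))
    return sorted(found)

# Made-up poles (assumption): one sharp resonance at 500 Hz, one broad at 1500 Hz.
fs = 8000
sharp = 0.95 * cmath.exp(2j * math.pi * 500 / fs)
broad = 0.50 * cmath.exp(2j * math.pi * 1500 / fs)
poles = [sharp, sharp.conjugate(), broad, broad.conjugate()]
for freq, bw in formant_candidates(poles, fs):
    print(round(freq), "Hz, bandwidth", round(bw), "Hz")
```

Only the sharp pole passes the bandwidth limit, which mirrors the slide's point that proximity to the unit circle is an indicator rather than a guarantee.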
Voiced vs. Unvoiced Speech
Voiced
Quasi-periodic excitation
Modulation by vocal tract
Production of mainly vowels
High energy
Unvoiced
No periodic vibration of vocal cords
Noise-like nature
Production of most consonants
Low energy
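The energy and periodicity contrast above suggests a simple per-frame decision rule. The sketch below is one possible heuristic, not the method used in the presentation: it labels a frame voiced when its average power is high and its zero-crossing rate is low. Both thresholds are placeholder assumptions that would need tuning on real recordings.

```python
import math
import random

def classify_frame(frame, power_threshold=0.01, zcr_threshold=0.25):
    """Label a frame 'voiced' or 'unvoiced' from average power and zero-crossing rate."""
    power = sum(s * s for s in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    zcr = crossings / (len(frame) - 1)  # fraction of sample pairs that change sign
    if power > power_threshold and zcr < zcr_threshold:
        return "voiced"
    return "unvoiced"

# Illustrative frames (assumptions): a strong 120 Hz tone stands in for a vowel,
# weak uniform noise stands in for an unvoiced consonant.
fs = 8000
vowel_like = [0.5 * math.sin(2 * math.pi * 120 * n / fs) for n in range(240)]
rng = random.Random(0)
noise_like = [rng.uniform(-0.05, 0.05) for _ in range(240)]
print(classify_frame(vowel_like), classify_frame(noise_like))
```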
Analysis per Frame
Cuts the signal into selected frames
Shows the FFT and LPC of the frame, and the residual
Shows error probabilities
Shows the AMDF of a frame with its fundamental frequency
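The AMDF (average magnitude difference function) of a frame dips toward zero at lags equal to the pitch period, so the fundamental frequency can be read from the position of the first deep valley. The sketch below is a minimal version under stated assumptions: the 60 to 400 Hz search range, the 10 percent valley tolerance and the function names are illustrative choices, and no voicing decision is made.

```python
import math

def amdf(frame, lag):
    """Average magnitude difference between the frame and itself shifted by lag."""
    count = len(frame) - lag
    return sum(abs(frame[n] - frame[n + lag]) for n in range(count)) / count

def amdf_pitch(frame, fs, f_min=60.0, f_max=400.0, tolerance=0.1):
    """Estimate the fundamental frequency (Hz) from the first deep AMDF valley."""
    min_lag = int(fs / f_max)
    max_lag = min(int(fs / f_min), len(frame) - 1)
    curve = {lag: amdf(frame, lag) for lag in range(min_lag, max_lag + 1)}
    lowest, highest = min(curve.values()), max(curve.values())
    threshold = lowest + tolerance * (highest - lowest)
    # Take the shortest lag inside a deep valley, so that a multiple of the
    # true period is not picked by accident.
    lag = next(l for l in range(min_lag, max_lag + 1) if curve[l] <= threshold)
    # Walk down to the bottom of that valley.
    while lag < max_lag and curve[lag + 1] < curve[lag]:
        lag += 1
    return fs / lag

# Illustrative frame (assumption): 400 samples of a 200 Hz sine at 8 kHz.
fs = 8000
frame = [math.sin(2 * math.pi * 200 * n / fs) for n in range(400)]
print(amdf_pitch(frame, fs))
```

Because the lag is an integer number of samples, the estimate is quantized; higher sampling rates or interpolation around the valley would refine it.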
AMDF Pitch contours
Gives the contour of pitch over time
Shows contour dots only where pitch is present; otherwise the field is left blank
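A pitch contour of this kind can be assembled frame by frame: estimate the pitch where the frame carries enough energy and record nothing otherwise. The sketch below is an assumption-laden illustration. The estimator is passed in as a function, and the zero-crossing estimator used in the demo is only a stand-in for an AMDF-based one; the energy floor, frame sizes and text rendering are likewise hypothetical.

```python
import math

def pitch_contour(signal, fs, frame_len, hop, estimate_f0, power_floor=1e-4):
    """Return (time_s, f0_or_None) per frame; None marks frames with no pitch."""
    contour = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        power = sum(s * s for s in frame) / frame_len
        f0 = estimate_f0(frame, fs) if power > power_floor else None
        contour.append((start / fs, f0))
    return contour

def render(contour, f_max=400.0, width=40):
    """One text row per frame: a dot positioned by pitch, or a blank field."""
    rows = []
    for time_s, f0 in contour:
        row = f"{time_s:6.3f} |"
        if f0 is not None:
            col = min(width - 1, int(f0 / f_max * width))
            row += " " * col + "."
        rows.append(row)
    return rows

def zero_crossing_f0(frame, fs):
    """Stand-in estimator (assumption): half the zero-crossing rate."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings * fs / (2 * len(frame))

# Illustrative signal (assumption): 0.1 s of a 200 Hz tone followed by 0.1 s of silence.
fs = 8000
signal = [0.5 * math.sin(2 * math.pi * 200 * n / fs) for n in range(800)] + [0.0] * 800
for row in render(pitch_contour(signal, fs, 400, 400, zero_crossing_f0)):
    print(row)
```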