Module 3 PDF
Introduction
and for sustained unvoiced speech, assuming white noise excitation with unit power,
the power spectrum of the output is
Thus, it is to be expected that the Fourier spectrum of the output would reflect the
properties of the excitation, the vocal tract and radiation frequency responses.
However, although vowels and fricatives can be sustained for several seconds with
little variation, natural speech is continually changing in time. Thus the standard
Fourier representations that are appropriate for periodic, transient, or stationary
random signals are not directly applicable to the representation of speech
signals. We have already seen ample evidence that the short-time analysis principle
is a valid approach to speech processing. We have seen, for example, that temporal
properties such as energy, zero crossings, and correlation are slowly varying so that
they can be assumed to be fixed over time intervals on the order of 10 to 40 msec.
We will demonstrate that spectral properties of speech likewise can be assumed to
change relatively slowly with time.
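The short-time principle described above can be illustrated with a minimal sketch. Everything in it is an assumption chosen for illustration and not taken from the text: the 8 kHz sampling rate, the 30 msec Hamming-windowed frames, and the toy two-tone signal standing in for speech. Each frame is treated as if its spectrum were fixed, and the location of the spectral peak is tracked from frame to frame.

```python
import numpy as np

fs = 8000                       # sampling rate in Hz (assumed)
frame_len = int(0.030 * fs)     # 30 msec frame, inside the 10 to 40 msec range
hop = frame_len // 2            # 50% frame overlap (assumed)
n_fft = 512                     # zero-padded DFT length (assumed)

# Toy stand-in for speech: 500 Hz for the first half second, then 1500 Hz.
t = np.arange(fs) / fs
x = np.where(t < 0.5, np.sin(2 * np.pi * 500 * t), np.sin(2 * np.pi * 1500 * t))

window = np.hamming(frame_len)
peak_hz = []
for start in range(0, len(x) - frame_len + 1, hop):
    frame = x[start:start + frame_len] * window      # windowed segment
    magnitude = np.abs(np.fft.rfft(frame, n_fft))    # short-time magnitude spectrum
    peak_hz.append(np.argmax(magnitude) * fs / n_fft)
peak_hz = np.array(peak_hz)

print(peak_hz[:3], peak_hz[-3:])
```

A single Fourier transform of the whole signal would show both tones with no indication of when each occurs; the frame-by-frame spectra recover that timing, at the cost of frequency resolution set by the frame length.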
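A discrete-time signal, $x[n]$, and its DTFT are related by the pair of equations
$$X(e^{j\omega}) = \sum_{n=-\infty}^{\infty} x[n]\, e^{-j\omega n},$$
$$x[n] = \frac{1}{2\pi} \int_{-\pi}^{\pi} X(e^{j\omega})\, e^{j\omega n}\, d\omega.$$
For a finite-length sequence of $N$ samples, the DFT satisfies $X[k] = X(e^{j 2\pi k/N})$, $k = 0, 1, \ldots, N-1$;
that is, the DFT is a sampled (in frequency) version of the DTFT.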
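The statement that the DFT samples the DTFT can be checked numerically. The sketch below assumes a short finite-length sequence (the sample values are arbitrary) and a hypothetical helper, dtft, that evaluates the DTFT sum directly; it then compares the result at the frequencies 2*pi*k/N with NumPy's FFT.

```python
import numpy as np

def dtft(x, omega):
    """Evaluate sum_n x[n] e^{-j omega n} for a finite-length x, n = 0..N-1."""
    n = np.arange(len(x))
    return np.array([np.sum(x * np.exp(-1j * w * n)) for w in np.atleast_1d(omega)])

# Arbitrary finite-length test sequence (illustrative values only).
x = np.array([1.0, 2.0, 0.5, -1.0, 0.25, 0.0, -0.5, 1.5])
N = len(x)

omega_k = 2 * np.pi * np.arange(N) / N    # DFT frequencies 2*pi*k/N
X_dtft_samples = dtft(x, omega_k)         # DTFT evaluated at those frequencies
X_dft = np.fft.fft(x)                     # N-point DFT

dft_matches_dtft = np.allclose(X_dft, X_dtft_samples)
print(dft_matches_dtft)
```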
Where
(7.10)
The STFT equations can be interpreted in two distinct ways. First, assuming that $\hat{n}$ is fixed,
we observe from Eq. (7.8) that $X_{\hat{n}}(e^{j\hat{\omega}})$ is simply the DTFT of the sequence $w[\hat{n} - m]\,x[m]$,
$-\infty < m < \infty$.
Therefore, for fixed $\hat{n}$, $X_{\hat{n}}(e^{j\hat{\omega}})$ has the same properties as a normal DTFT.
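The fixed-time interpretation can be sketched as follows. The code takes the description above literally: the STFT at a fixed time index is the DTFT of the sequence w[n_hat - m] x[m]. The function name stft_fixed_time, the random test signal, the 16-point Hamming window, and the convention that the window is nonzero only for indices 0 to L-1 are all assumptions made for this illustration.

```python
import numpy as np

def stft_fixed_time(x, w, n_hat, omega):
    """DTFT over m of w[n_hat - m] x[m] for one fixed n_hat.

    x and w are finite-length arrays indexed from 0 (assumed convention).
    Returns the transform values and the windowed sequence itself.
    """
    m = np.arange(len(x))
    k = n_hat - m                          # window index for each m
    valid = (k >= 0) & (k < len(w))        # where the window is nonzero
    seq = np.zeros(len(x))
    seq[valid] = w[k[valid]] * x[valid]
    values = np.array([np.sum(seq * np.exp(-1j * wk * m))
                       for wk in np.atleast_1d(omega)])
    return values, seq

rng = np.random.default_rng(0)
x = rng.standard_normal(64)                # stand-in signal (assumed)
w = np.hamming(16)                         # analysis window (assumed)
n_hat = 40

# Property 1: sampled at 2*pi*k/N, it equals the DFT of the windowed sequence.
N = len(x)
values, seq = stft_fixed_time(x, w, n_hat, 2 * np.pi * np.arange(N) / N)
matches_dft = np.allclose(values, np.fft.fft(seq))

# Property 2: like any DTFT, it is periodic in frequency with period 2*pi.
a, _ = stft_fixed_time(x, w, n_hat, 0.3)
b, _ = stft_fixed_time(x, w, n_hat, 0.3 + 2 * np.pi)
is_periodic = np.allclose(a, b)
print(matches_dft, is_periodic)
```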
The vertical lines in Figure 7.4 show the regions of support $0 \le \hat{\omega} < 2\pi$ for different
values of $\hat{n}$. The second interpretation follows by considering $X_{\hat{n}}(e^{j\hat{\omega}})$ as a function of the
time index $\hat{n}$ with $\hat{\omega}$ fixed at, for example, $\hat{\omega}_0$ as in Figure 7.4.
In this case we observe that Eq. (7.8), Eq. (7.9), and Eq. (7.10) are all in the form of a
discrete-time convolution.
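The convolution view can be verified with a short sketch. It assumes the form implied by the fixed-time interpretation, a sum over m of w[n_hat - m] x[m] e^{-j omega_0 m}, which is the convolution of the modulated signal x[n] e^{-j omega_0 n} with the window w[n]. The test signal, the 32-point Hamming window, and the value of omega_0 are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(200)               # stand-in signal (assumed)
w = np.hamming(32)                         # analysis window, indices 0..31 (assumed)
omega0 = 2 * np.pi * 0.1                   # fixed analysis frequency (assumed)

# Modulate the signal down by omega0, then filter with the window.
n = np.arange(len(x))
modulated = x * np.exp(-1j * omega0 * n)
stft_via_conv = np.convolve(modulated, w)  # entry n_hat holds the STFT at time n_hat

def stft_direct(n_hat):
    """Direct sum over m of w[n_hat - m] x[m] e^{-j omega0 m}."""
    total = 0j
    for m in range(len(x)):
        k = n_hat - m
        if 0 <= k < len(w):
            total += w[k] * x[m] * np.exp(-1j * omega0 * m)
    return total

check_times = [0, 31, 100, 199]
conv_matches_direct = all(
    np.isclose(stft_via_conv[nh], stft_direct(nh)) for nh in check_times
)
print(conv_matches_direct)
```

Seen this way, the window acts as the impulse response of a lowpass filter applied to the frequency-shifted signal, so one convolution yields the STFT at omega_0 for every time index at once.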
DTFT interpretation