
Module III

Frequency Domain Representations

Introduction

In many areas of science and engineering, the representation of signals or other
functions by sums of sinusoids or complex exponentials leads to convenient
solutions to problems.
Such representations—Fourier representations as they are commonly called—are
useful in signal processing for two basic reasons. The first is that for linear systems,
it is very convenient to determine the response to a superposition of sinusoids or
complex exponentials. The second reason is that the Fourier representations often
serve to place in evidence certain properties of the signal that may be obscure or at
least less evident in the original signal.
It is helpful to recall that the discrete-time model for the production of samples of a
steady-state speech sound, such as a vowel or fricative, as shown in Figure 7.1,
consists of a linear system with system function $V(z)$, excited by a source that is
either periodically varying ($A_V\,p[n] * g[n]$ for voiced speech) or randomly varying
($A_N\,u[n]$ for unvoiced speech). A transfer function, $R(z)$, represents radiation of sound
at the lips.
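In general, the spectrum of the output of such a model would be the product of the
frequency response of the vocal tract system, the spectrum of the excitation source,
and the frequency response of the model of sound radiation. For voiced speech such as a
sustained vowel, we can write the discrete-time Fourier transform (DTFT) expression
as

$$S(e^{j\omega}) = R(e^{j\omega})\,V(e^{j\omega})\,G(e^{j\omega})\,A_V\,P(e^{j\omega}), \tag{7.1}$$

where $G(e^{j\omega})$ and $P(e^{j\omega})$ denote the DTFTs of $g[n]$ and $p[n]$. For sustained
unvoiced speech, assuming white noise excitation with unit power,
the power spectrum of the output is

$$\Phi_{ss}(e^{j\omega}) = |R(e^{j\omega})|^2\,|V(e^{j\omega})|^2\,A_N^2. \tag{7.2}$$
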
Thus, it is to be expected that the Fourier spectrum of the output would reflect the
properties of the excitation, the vocal tract and radiation frequency responses.
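
The product relationship can be checked numerically. The following is a minimal sketch, not the model of Figure 7.1 itself: the sequences `g`, `v`, and `r`, the pitch period, and the gain `A_v` are hypothetical placeholders chosen only so that the cascade of convolutions can be compared with the product of spectra on a common DFT grid.

```python
import numpy as np

# Hypothetical placeholder sequences (assumptions, not taken from the text).
A_v = 2.0                                   # voiced gain A_V
p = np.zeros(64)
p[::16] = 1.0                               # impulse train, period 16 samples
g = np.array([0.2, 0.6, 1.0, 0.5, 0.1])     # stand-in glottal pulse g[n]
k = np.arange(20)
v = 0.9 ** k * np.cos(0.3 * np.pi * k)      # stand-in vocal tract impulse response
r = np.array([1.0, -1.0])                   # first difference as a crude radiation model

# Time domain: the output is a cascade of convolutions.
s = np.convolve(np.convolve(np.convolve(A_v * p, g), v), r)

# Frequency domain: product of the individual spectra.
# N >= len(s) ensures the DFT product corresponds to linear convolution.
N = len(s)
S_product = (np.fft.fft(r, N) * np.fft.fft(v, N)
             * np.fft.fft(g, N) * A_v * np.fft.fft(p, N))
S_direct = np.fft.fft(s, N)

spectra_match = np.allclose(S_direct, S_product)
print(spectra_match)
```

The zero-padded length `N` matters here: with a shorter DFT the product of spectra would correspond to a time-aliased (circular) convolution rather than the linear convolution computed in the time domain.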
However, although vowels and fricatives can be sustained for several seconds with
little variation, natural speech is continually changing in time. Thus the standard
Fourier representations that are appropriate for periodic, transient, or stationary
random signals are not directly applicable to the representation of speech
signals. We have already seen ample evidence that the short-time analysis principle
is a valid approach to speech processing. We have seen, for example, that temporal
properties such as energy, zero crossings, and correlation are slowly varying so that
they can be assumed to be fixed over time intervals on the order of 10 to 40 msec.
We will demonstrate that spectral properties of speech likewise can be assumed to
change relatively slowly with time.

In order to study spectral properties of speech signals, we will define a time-varying
Fourier transform, which we generally refer to as the short-time Fourier transform
(STFT). Use of the STFT is therefore termed short-time Fourier analysis (STFA). We
will also show that the STFT is invertible in the sense that, with certain constraints,
we can recover the original sampled signal by a process that we term short-time
Fourier synthesis (STFS). Indeed, STFA/STFS provides a representation of the
speech waveform that can serve as the basis for many types of speech processing
including coding and various types of signal enhancement. This is depicted in Figure
7.2, which shows that the processing can be controlled by “side information”
extracted by other means from the speech signal.

DISCRETE TIME FOURIER ANALYSIS

A discrete-time signal, x[n], and its DTFT are related by the pair of equations

$$X(e^{j\omega}) = \sum_{n=-\infty}^{\infty} x[n]\,e^{-j\omega n}, \tag{7.3a}$$

$$x[n] = \frac{1}{2\pi}\int_{-\pi}^{\pi} X(e^{j\omega})\,e^{j\omega n}\,d\omega, \tag{7.3b}$$

where $\omega$ is the normalized frequency variable of $X(e^{j\omega})$ in units of radians. When we
wish to clearly specify the discrete-time Fourier transform defined by Eqs. (7.3a) and
(7.3b), we shall use the acronym DTFT or the terminology “normal (discrete-time)
Fourier transform,” to distinguish it from the STFT to be defined below.
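The DFT and its inverse are given by the equations

$$X[k] = \sum_{n=0}^{N-1} x[n]\,e^{-j(2\pi/N)kn}, \qquad k = 0, 1, \ldots, N-1,$$

$$x[n] = \frac{1}{N}\sum_{k=0}^{N-1} X[k]\,e^{j(2\pi/N)kn}, \qquad n = 0, 1, \ldots, N-1.$$

The DFT and DTFT can both be used as mathematical representations of a
finite-length sequence; specifically, for a sequence that is zero outside
$0 \le n \le N-1$, the DFT and the DTFT are related by

$$X[k] = X(e^{j\omega})\Big|_{\omega = 2\pi k/N}, \qquad k = 0, 1, \ldots, N-1;$$

that is, the DFT is a sampled (in frequency) version of the DTFT.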
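
The sampling relationship is easy to verify numerically. The sketch below assumes the standard DTFT sum over a sequence that is zero outside 0 ≤ n ≤ N − 1; the helper name `dtft` and the test sequence are illustrative choices, not part of the text.

```python
import numpy as np

def dtft(x, omega):
    """Evaluate X(e^{j omega}) = sum_n x[n] exp(-j omega n) for a sequence
    x[n] that is zero outside 0 <= n < len(x)."""
    n = np.arange(len(x))
    omega = np.atleast_1d(omega)
    return np.exp(-1j * np.outer(omega, n)) @ x

x = np.array([1.0, 2.0, 0.5, -1.0, 0.0, 3.0, -2.0, 1.5])  # arbitrary test sequence
N = len(x)

omega_k = 2 * np.pi * np.arange(N) / N   # DFT frequencies 2*pi*k/N
X_dtft_samples = dtft(x, omega_k)        # DTFT evaluated at those frequencies
X_dft = np.fft.fft(x)                    # N-point DFT

dft_is_sampled_dtft = np.allclose(X_dft, X_dtft_samples)
print(dft_is_sampled_dtft)
```

Calling `dtft` with a denser frequency grid than `omega_k` gives intermediate values of the same underlying function, which is what zero-padding before `np.fft.fft` also produces.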

SHORT-TIME FOURIER ANALYSIS

We define the time-dependent, or short-time, Fourier transform (STFT) as

$$X_{\hat{n}}(e^{j\hat{\omega}}) = \sum_{m=-\infty}^{\infty} w[\hat{n}-m]\,x[m]\,e^{-j\hat{\omega}m}, \tag{7.8}$$

where $w[\hat{n}-m]$ is a real window sequence whose purpose is to determine the
portion of the input signal that receives emphasis at a particular time index, $\hat{n}$.
The time-dependent Fourier transform is a complex function of two variables:
the time index, $\hat{n}$, which is discrete, and the frequency variable, $\hat{\omega}$, which is
continuous. The transform is periodic in $\hat{\omega}$ with period $2\pi$.

A plot showing the domain of the two variables, $\hat{n}$ and $\hat{\omega}$, is given in Figure 7.4 for the
range $0 \le \hat{n} \le 8$ ($\hat{n}$ is defined for all discrete values, but only a few are shown in this
figure) and for $0 \le \hat{\omega} < 2\pi$ (since the STFT is periodic in $\hat{\omega}$ with period $2\pi$). Alternatively, we
could use the range $-\pi < \hat{\omega} \le \pi$.
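
As a concrete illustration, the sketch below evaluates the STFT at a single point (n̂, ω̂), assuming the definition takes the standard form of a sum over m of w[n̂ − m] x[m] e^{−jω̂m}, and then checks the 2π periodicity in ω̂. The function name `stft_point`, the causal finite-length window, and the test signal are assumptions made for the example.

```python
import numpy as np

def stft_point(x, w, n_hat, omega_hat):
    """Evaluate sum_m w[n_hat - m] x[m] exp(-j omega_hat m).

    x[m] is taken as zero outside 0 <= m < len(x), and w[k] as zero
    outside 0 <= k < len(w) (a causal, finite-length window)."""
    total = 0j
    for m in range(len(x)):
        k = n_hat - m                     # index into the window
        if 0 <= k < len(w):
            total += w[k] * x[m] * np.exp(-1j * omega_hat * m)
    return total

rng = np.random.default_rng(0)
x = rng.standard_normal(50)               # arbitrary test signal
L = 16
w = 0.54 - 0.46 * np.cos(2 * np.pi * np.arange(L) / (L - 1))  # Hamming window

n_hat, omega_hat = 20, 0.7
X_a = stft_point(x, w, n_hat, omega_hat)
X_b = stft_point(x, w, n_hat, omega_hat + 2 * np.pi)  # one period later

periodic = bool(np.isclose(X_a, X_b))
print(periodic)
```

The explicit loop mirrors the summation term by term; it is meant for reading, not speed.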
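An alternative form of Eq. (7.8) is obtained by a change of summation index, which yields
the expression

$$X_{\hat{n}}(e^{j\hat{\omega}}) = \sum_{m=-\infty}^{\infty} w[m]\,x[\hat{n}-m]\,e^{-j\hat{\omega}(\hat{n}-m)}. \tag{7.9}$$

Then

$$X_{\hat{n}}(e^{j\hat{\omega}}) = e^{-j\hat{\omega}\hat{n}}\,\tilde{X}_{\hat{n}}(e^{j\hat{\omega}}),$$

where

$$\tilde{X}_{\hat{n}}(e^{j\hat{\omega}}) = \sum_{m=-\infty}^{\infty} x[\hat{n}-m]\,w[m]\,e^{j\hat{\omega}m}. \tag{7.10}$$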

The STFT equations can be interpreted in two distinct ways. First, assuming that $\hat{n}$ is fixed,
we observe from Eq. (7.8) that $X_{\hat{n}}(e^{j\hat{\omega}})$ is simply the DTFT of the sequence $w[\hat{n}-m]\,x[m]$,
$-\infty < m < \infty$.
Therefore, for fixed $\hat{n}$, $X_{\hat{n}}(e^{j\hat{\omega}})$ has the same properties as a normal DTFT.
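
The fixed-time interpretation suggests a practical computation: window the signal around n̂ and take a DFT. The sketch below assumes a causal window of length L (so that w[n̂ − m] x[m] is nonzero only for n̂ − L + 1 ≤ m ≤ n̂) and frequencies sampled at 2πk/N; the variable names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(200)           # arbitrary test signal
L = 32
w = np.hanning(L)                      # window w[k], zero outside 0 <= k < L
n_hat = 100                            # fixed analysis time
N = 64                                 # DFT length, N >= L

# Windowed sequence w[n_hat - m] x[m], restricted to the L samples
# m = n_hat - L + 1, ..., n_hat where the window is defined.
m0 = n_hat - L + 1
segment = x[m0:n_hat + 1] * w[::-1]    # w[n_hat - m] runs backwards in m

# The FFT treats the segment as starting at index 0; the factor
# exp(-j omega m0) restores the true time origin m = m0.
omega_k = 2 * np.pi * np.arange(N) / N
X_fft = np.exp(-1j * omega_k * m0) * np.fft.fft(segment, N)

# Direct evaluation of the DTFT sum at the same frequencies.
m = np.arange(m0, n_hat + 1)
X_direct = np.exp(-1j * np.outer(omega_k, m)) @ segment

fixed_time_is_dtft = np.allclose(X_fft, X_direct)
print(fixed_time_is_dtft)
```

Dropping the phase factor changes only the phase of each frequency sample, not its magnitude, which is why magnitude spectrograms are often computed from the plain FFT of each windowed segment.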
The vertical lines in Figure 7.4 show the regions of support $0 \le \hat{\omega} < 2\pi$ for different
values of $\hat{n}$. The second interpretation follows by considering $X_{\hat{n}}(e^{j\hat{\omega}})$ as a function of the
time index $\hat{n}$ with $\hat{\omega}$ fixed at, for example, $\hat{\omega}_0$, as in Figure 7.4.
In this case we observe that Eq. (7.8), Eq. (7.9), and Eq. (7.10) are all in the form of a
discrete-time convolution.

This interpretation leads us naturally to consider the time-dependent Fourier representation
in terms of linear filtering.
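
A short numerical sketch of the filtering view follows. It assumes the STFT is the sum over m of w[n̂ − m] x[m] e^{−jω̂m} with a causal finite-length window, and compares two equivalent structures at a fixed frequency ω̂₀: modulating x[n] by e^{−jω̂₀n} and filtering with w[n], versus filtering x[n] with w[n] e^{jω̂₀n} and then multiplying by e^{−jω̂₀n̂}. All names and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(120)           # test signal, zero outside 0 <= n < 120
L = 24
w = np.hamming(L)                      # causal window, used as an impulse response
omega_0 = 0.4 * np.pi                  # fixed analysis frequency
n = np.arange(len(x))

# Structure 1: modulate x[n] down by omega_0, then filter with w[n].
X_lowpass = np.convolve(x * np.exp(-1j * omega_0 * n), w)

# Structure 2: filter x[n] with the bandpass response w[n] exp(j omega_0 n),
# then multiply the output by exp(-j omega_0 n_hat).
n_out = np.arange(len(X_lowpass))      # output time index n_hat
k = np.arange(L)
X_bandpass = (np.exp(-1j * omega_0 * n_out)
              * np.convolve(x, w * np.exp(1j * omega_0 * k)))

# Direct evaluation of the defining sum at one time index for comparison.
n_hat = 50
m = np.arange(n_hat - L + 1, n_hat + 1)
X_direct = np.sum(w[n_hat - m] * x[m] * np.exp(-1j * omega_0 * m))

forms_agree = np.allclose(X_lowpass, X_bandpass)
matches_definition = bool(np.isclose(X_lowpass[n_hat], X_direct))
print(forms_agree, matches_definition)
```

Each entry of `X_lowpass` is the STFT at frequency ω̂₀ for one value of n̂, so a single convolution yields the whole time trajectory along one horizontal line of the (n̂, ω̂) plane.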

DTFT interpretation
