
Music 421

Spring 2004-2005
Homework #8
Overlap-Add STFT Processing, Filter Banks
60 points
Due in 5 days (5/31/2005)

1. (10 pts) Draw a block diagram of the filter-bank interpretation of the DFT, and briefly
explain the function of each block.

2. Define the signal $y_k(m) = X_m(\omega_k)\, e^{j\omega_k mR}$, with $k$ viewed as a fixed parameter and $m$
viewed as the independent variable.

(a) (10 pts) Show that
$$\frac{1}{N} \sum_{k=0}^{N-1} y_k(m) = w(0)\, x(mR)$$
if $N \ge M$, or if $N < M$ and $w(mN) = 0$ for $m = \pm 1, \pm 2, \ldots$.
(b) (2 pts) What does the term $e^{j\omega_k mR}$ do in the reconstruction?
(c) (8 pts) What are the disadvantages of using the case $N < M$?
(d) (10 pts) How do we recover $x(n)$ for all $n$ when $R > 1$?
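
As a numerical sanity check on the identity in 2(a) (not a substitute for the derivation), the sketch below evaluates the channel sum directly. It assumes the STFT convention $X_m(\omega_k) = \sum_n x(n)\, w(n - mR)\, e^{-j\omega_k n}$ with $\omega_k = 2\pi k/N$ and a zero-centered window of odd length $M$. These conventions, the test signal, and all function names are assumptions made for illustration; they are not taken from the assignment or from hw8xsynth.m.

```python
import cmath
import math


def zero_phase_hann(M):
    """Odd-length, zero-centered Hann-like window as {offset: value}; w(0) = 1."""
    half = (M - 1) // 2
    return {n: 0.5 * (1.0 + math.cos(2.0 * math.pi * n / M))
            for n in range(-half, half + 1)}


def channel_sum(x, w, m, R, N):
    """Return (1/N) * sum_k y_k(m), with y_k(m) = X_m(w_k) * exp(j*w_k*m*R)."""
    total = 0j
    for k in range(N):
        wk = 2.0 * math.pi * k / N
        # Assumed convention: X_m(w_k) = sum_n x(n) w(n - mR) exp(-j w_k n)
        X = sum(x[m * R + off] * wv * cmath.exp(-1j * wk * (m * R + off))
                for off, wv in w.items())
        total += X * cmath.exp(1j * wk * m * R)
    return total / N


# Arbitrary deterministic test signal (hypothetical, for illustration only).
x = [math.sin(0.37 * n) for n in range(64)]
M, R, m = 7, 3, 5
w = zero_phase_hann(M)

# N >= M: the channel sum collapses to w(0) * x(mR).
y_ok = channel_sum(x, w, m, R, N=8)

# N < M with w(+-N) != 0: time-aliased terms x(mR + l*N) * w(l*N) leak in.
N_small = 3
y_alias = channel_sum(x, w, m, R, N=N_small)
expected_alias = sum(x[m * R + l * N_small] * w[l * N_small] for l in (-1, 0, 1))

print(abs(y_ok - w[0] * x[m * R]))     # should be near zero
print(abs(y_alias - w[0] * x[m * R]))  # should be clearly nonzero
print(abs(y_alias - expected_alias))   # should be near zero
```

With these particular choices, the $N = 3$ case picks up the window samples at offsets $\pm 3$, so the channel sum no longer equals $w(0)\,x(mR)$ alone.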

3. (20 pts) Suppose the window transform $W(\omega)$ is a lowpass filter with cut-off frequency
$\omega_c = 2\pi/R$. That is, $W(\omega) \approx 0$ for $|\omega| \ge \omega_c$. In this case, show that
$$\sum_{m=0}^{M-1} w(n - mR) \approx \frac{1}{R}\, W(0).$$

If these approximations were exact equalities, specify the set of useable frame step sizes
$R_0$ such that
$$\sum_{m=0}^{M-1} w(n - mR_0) = \text{constant}.$$
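
The overlap-add sum in Problem 3 can be explored numerically before attempting the proof. The sketch below is only an illustration: it assumes a periodic Hann window of length 16, sums over every frame that overlaps the sample index $n$ (rather than a fixed range of $m$, to avoid edge effects), and tries a few arbitrary hop sizes. The window choice, the hop sizes, and the function names are all assumptions, not part of the assignment.

```python
import math


def periodic_hann(M):
    """Periodic Hann window of length M (an illustrative choice of w)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / M) for n in range(M)]


def overlap_add_sum(w, R, n):
    """Sum of w(n - m*R) over every frame index m whose window covers n."""
    M = len(w)
    total = 0.0
    for m in range((n - M) // R, n // R + 1):
        idx = n - m * R
        if 0 <= idx < M:
            total += w[idx]
    return total


def ripple(w, R):
    """Peak-to-peak variation of the overlap-add sum over one hop period."""
    sums = [overlap_add_sum(w, R, n) for n in range(R)]
    return max(sums) - min(sums)


M = 16
w = periodic_hann(M)
W0 = sum(w)  # W(0), the DC gain of the window

# Columns: hop size R, overlap-add sum at n = 0, W(0)/R, peak-to-peak ripple.
for R in (4, 5, 8, 16):
    print(R, overlap_add_sum(w, R, 0), W0 / R, ripple(w, R))
```

Comparing the ripple column across hop sizes shows which of the tried values give a constant sum for this particular window, and the constant can be compared against $W(0)/R$.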

4. (Optional) Cross-Synthesis
Download the skeleton program hw8xsynth.m [1] and the sound source files SteveJobs.wav [2]
and motorcycle.wav [3]. The program analyzes the spectral envelope of the speech, which
is then imposed on the spectrum of a broadband signal, here a motorcycle sound.

(a) (5 pts) Fill in the comments (5 of them) in the program to explain what the code
in the next few lines does and why we might want to do that.
[1] http://www-ccrma.stanford.edu/~jos/hw421/hw8/hw8xsynth.m
[2] http://www-ccrma.stanford.edu/~jos/hw421/hw8/SteveJobs.wav
[3] http://www-ccrma.stanford.edu/~jos/hw421/hw8/motorcycle.wav

(b) (25 pts) Fill in the unfinished lines to make an alias-free cross synthesizer. Turn in
your code with all the comments completed and a sample of your cross synthesis
result between SteveJobs.wav and motorcycle.wav. Name the cross-synthesis
wave file xxxxhw8.wav where xxxx are the first four letters of your last name.
(c) (5 pts) For an arbitrary time n, plot the following:
i. The short-time speech spectrum magnitude (dB).
ii. The amplitude response of the all-pole filter 1/A(z) obtained by linear
prediction analysis at that time.
iii. The short-time spectral magnitude (dB) of the synthesized sample.
(d) (5 pts) Discuss the criteria for selecting the two signals fed into the
cross-synthesis so that the synthesized speech is clearly intelligible.
Remark: It helps to create bad examples as well as good examples,
and to investigate why they are good or bad.
