MUS421/EE367B Lecture 3 FFT Windows
Julius O. Smith III (jos@ccrma.stanford.edu) Center for Computer Research in Music and Acoustics (CCRMA) Department of Music, Stanford University Stanford, California 94305
Outline
April 26, 2005
• Rectangular, Hann, Hamming
• MLT Sine
• BlackmanHarris Window Family
• Slepian and Kaiser
• DolphChebyshev
• Bartlett
• Poisson
• Gaussian
• Optimal Windows
1
The Rectangular Window
Previously, we looked at the rectangular window:
w _{R} (n)
=
∆
1,
n ≥ ^{M}^{−}^{1}
2
0, otherwise
Zero−Phase Rectangular Window − M = 21
Time (samples)
The window transform was found to be
W _{R} (ω) = ^{s}^{i}^{n} ^{M} ω
2
sin ^{} ^{ω}
2
∆
=
M · asinc _{M} (ω)
(1)
where asinc _{M} (ω) denotes the aliased sinc function.
asinc _{M} (ω)
∆
_{=}
sin(M ω/2)
M · sin(ω/2)
2
This result is plotted below:
DFT of a Rectangular Window of length M = 11
Normalized Frequency
(rad/sample)
Note that this is the complete window transform, not just its magnitude. We obtain real window transforms like this only for symmetric, zerocentered windows.
3
More generally, we may plot both the magnitude and phase of the window versus frequency:
DFT of a Rectangular Window − M = 11
Normalized Frequency
(rad/sample)
Phase of Rectangular Window Transform (M = 11)
Normalized Frequency
4
(rad/sample)
In audio work, we more typically plot the window transform magnitude on a decibel (dB) scale:
DFT of a Rectangular Window − M = 11
Normalized Frequency
5
(rad/sample)
Since the DTFT of the rectangular window approximates the sinc function, it should “roll oﬀ” at approximately 6 dB per octave, as veriﬁed in the loglog plot below:
DFT of a Rectangular Window − M = 20
Normalized Frequency (rad/sample)
As the sampling rate approaches inﬁnity, the rectangular window transform converges exactly to the sinc function. Therefore, the departure of the rolloﬀ from that of the sinc function can be ascribed to aliasing in the frequency domain, due to sampling in the time domain.
6
Sidelobe RollOﬀ Rate
In general, if the ﬁrst n derivatives of a continuous
function w(t) exist (i.e., they are ﬁnite and uniquely deﬁned), then its Fourier Transform magnitude is asymptotically proportional to
_{}_{W}_{(}_{ω}_{)}_{} _{→} constant
_{ω} n+1
(as ω → ∞)
Proof: Papoulis, Signal Analysis, McGrawHill, 1977
Thus, we have the following ruleofthumb:
n derivatives ↔ −6(n + 1) dB per octave rolloﬀ rate
(since −20 log _{1}_{0} (2) = 6.0205999
.).
This is also −20(n + 1) dB per decade.
To apply this result, we normally only need to look at the window’s endpoints. The interior of the window is usually diﬀerentiable of all orders.
Examples:
• Amplitude discontinuity ↔ −6 dB/octave rolloﬀ
• Slope discontinuity ↔ −12 dB/octave rolloﬀ
• Curvature discontinuity ↔ −18 dB/octave rolloﬀ
For discretetime windows, the rolloﬀ rate slows down at high frequencies due to aliasing.
7
In summary, the DTFT of the M sample rectangular window is proportional to the ‘aliased sinc function’:
asinc _{M} (ωT)
∆
_{=}
_{≈}
Some important points:
sin(ωM T /2)
M · sin(ωT /2)
sin(πf M T )
= ∆ sinc(f M T )
MπfT
• Zero crossings at integer multiples of Ω _{M}
_{=} ∆ 2π
M
where Ω _{M}
_{=} ∆ 2π
M
= frequency sampling interval for a
length M DFT
• Main lobe width is 2Ω _{M} = ^{4}^{π}
M
• As M gets bigger, the mainlobe narrows (better frequency resolution)
• M has no eﬀect on the height of the side lobes (Same as the “Gibbs phenomenon” for Fourier series)
• First sidelobe only 13 dB down from mainlobe peak
• Side lobes roll oﬀ at approximately 6dB per octave
• A phase term arises when we shift the window to make it causal, while the window transform is real in the zerocentered case (i.e., centered about time 0)
8
Generalized Hamming Window Family
Consider the following picture in the frequency domain:
ω/Ω _{M}
We have added 2 extra aliased sinc functions (shifted), which results in the following behavior:
• There is some cancellation of the side lobes
• The width of the main lobe is doubled
9
In terms of the rectangular window transform W _{R} (ω) = M · asinc _{M} (ω) (zerocentered, unitamplitude case), this can be written as:
∆
W _{H} (ω) = αW _{R} (ω) + βW _{R} (ω − Ω _{M} ) + βW _{R} (ω + Ω _{M} )
Using the Shift Theorem dual, we can take the inverse transform of the above equation:
w _{H} = αw _{R} (n) + βe ^{−}^{j}^{Ω} ^{M} ^{n} w _{R} (n) + βe ^{j}^{Ω} ^{M} ^{n} w _{R} (n)
=
w _{R} (n) α + 2β cos ^{2}^{π}^{n}
M
Choosing various parameters for α and β result in diﬀerent windows in the generalized Hamming family, some of which have names.
10
Hann or Hanning or Raised Cosine
The Hann window is deﬁned by the settings α = 1/2 and β = 1/4:
w _{H} (n) = w _{R} (n)
1
2 ^{+}
1 _{2} cos(Ω _{M} n) = w _{R} (n) cos ^{2} ^{Ω} ^{M} n
2
Hann Window, M = 21
Time (samples)
Normalized Frequency (rad/sample)
11
Hann window properties:
• Mainlobe is 4Ω _{M} wide
• 1st side lobe is at 31dB
• Side lobes roll oﬀ at approx. 18dB / octave
Hamming
This window is determined by choosing α to cancel the ﬁrst side lobe and β to normalize peak amplitude to 1 in the time domain:
α 
= 
25 _{4}_{6} ≈ 0.54 
β 
= 
(1 − α)/2 
Note: The Hamming window is very close to the generalized Hamming window which minimizes sidelobe level within the family:
α = 0.53836
(minimum peak sidelobe magnitude)
Thus, the Hamming window is also the “Chebyshev Generalized Hamming Window” rounded to two signiﬁcant digits.
Chebyshevtype designs generally exhibit equiripple error behavior, since the worstcase error (sidelobe level in this case) is minimized (see DolphChebyshev window below)
12
Hamming Window
Hamming Window, M = 21
Time (samples)
Normalized Frequency (rad/sample)
13
Hamming Window Properties
• Discontinuous “slam to zero” at endpoints
• main lobe is 4Ω _{M}
• Roll oﬀ is approx. 6 dB / octave (aliased)
• 1st side lobe is improved over Hann
• side lobes closer to “equal ripple”
14
Longer Hamming Window
Hamming Window, M = 101
Time (samples)
Normalized Frequency (rad/sample)
• Since the sidelobes nearest the main lobe are most aﬀected by the Hamming optimization, we now have a larger frequency region over which the spectral envelope looks like that of the asinc function (an “aliased 6 dB/octave rolloﬀ”).
• The sidelobe level (42.7 dB) is also improved over that of the shorter window (40.6 dB).
15
Window Transform Summary
Window Transforms, Window length = 20
Normalized Frequency (cycles per sample)
Normalized Frequency (cycles per sample)
Normalized Frequency (cycles per sample)
16
The MLT Sine Window
The Modulated Lapped Transform (MLT) uses the sine window :
w(n) = sin n + ^{1} _{2}
_{2}_{M} ,
π
n = 0, 1, 2,
, 2M − 1 .
• Used in MPEG1, Layer 3 (MP3 format), MPEG2 AAC, MPEG4
• Sidelobes 24 dB down
• Asymptotically optimal coding gain
• Zerophase window transform (“truncated cosine
window”) has smallest moment of inertia over all
windows: _{}
π
−π
ω ^{2} W (ω)dω = min
17
BlackmanHarris Window Family
• The BlackmanHarris family of windows is basically a generalization of the Hamming family.
• In the case of the Hamming family, we constructed a summation of 3 shifted sinc functions.
• The BlackmanHarris family is derived by considering a more general summation of shifted sinc functions:
w _{B} (n) = w _{R} (n)
L−1
l=0
α _{l} cos(lΩ _{M} n)
∆
where Ω _{M} = 2π/M, n = −(M − 1)/2,
Special Cases:
(M − 1)/2, (M odd).
• L = 1 ⇒ Rectangular
• L = 2 ⇒ Generalized Hamming
• L = 3 ⇒ Blackman Family
18
FrequencyDomain Implementation
The BlackmanHarris window family can be very eﬃciently implemented in the frequency domain as a (2L − 1)point convolution with the spectrum of the unwindowed data. Examples:
• Start with a length M rectangular window
• Take an M point DFT
• Convolve the DFT data with the 3point smoother [1/4, 1/2, 1/4] to implement a Hann window
• Note that the Hann window requires no multiplies in linear ﬁxedpoint data formats
• Implement any Blackman window as a 5point smoother in the frequency domain
19
Classic Blackman
The socalled “Blackman Window” is the speciﬁc case in which α _{0} = 0.42 α _{1} = 0.5, and α _{2} = 0.08
Properties:
• Sidelobes roll oﬀ about 18dB per octave (as T → 0)
• −58dB sidelobe level (worst case)
• One degree of freedom used to increase the rolloﬀ rate from 6dB/octave to 18 dB per octave
• One degree of freedom used to minimize sidelobes
• One degree of freedom used to scale the window
Matlab:
N
=
101;
L
=
3;
No2
=
(N1)/2;
n=No2:No2;
ws 
= 
zeros(L,3*N); 
z 
= 
zeros(1,N); 
for 
l=0:L1 
ws(l+1,:)
=
[z,cos(l*2*pi*n/N),z];
end
alpha
=
w
[0.42,0.5,0.08];
*
ws;
=
alpha
%
Classic
Blackman
20
Classic Blackman Window and Transform
Classic Blackman Window: M = 51
Time (samples)
Frequency (bins)
21
ThreeTerm BlackmanHarris
Properties:
• α _{0} = 0.4243801 α _{1} = 0.4973406, and α _{2} = 0.0782793.
• Sidelobe level −71.5 dB.
• Side lobes roll oﬀ ≈ 6dB per octave in the absence of aliasing (like rectangular and Hamming).
• All degrees of freedom (scaling aside) are used to minimize side lobes (like Hamming).
Matlab:
N
=
101;
L
=
3;
No2
=
(N1)/2;
n=No2:No2;
ws 
= 
zeros(L,3*N); 
z 
= 
zeros(1,N); 
for 
l=0:L1 
ws(l+1,:)
=
[z,cos(l*2*pi*n/N),z];
end
% 3term
alpha
w
BlackmanHarris(Nuttall):
[0.4243801,
*
ws;
0.4973406,
0.0782793];
=
=
alpha
22
ThreeTerm BlackmanHarris Window and Transform
Three−Term Blackman−Harris Window: M = 51
Time (samples)
Frequency (bins)
23
Amplitude
Longer ThreeTerm BlackmanHarris Window and Transform
Three−Term Blackman−Harris Window: M = 301
1
0.8
0.6
0.4
0.2
0
Time (samples)
Frequency (bins)
24
PowerofCosine
w(n) = w _{R} (n) cos ^{L} ^{π}^{n} ,
M
n
∈ − ^{M} ^{−} ^{1}
2
_{,} M − 1
2
• L = 0, 1, 2,
• ﬁrst L terms of its Taylor expansion, evaluated at the endpoints are identically 0
• rolloﬀ rate ≈ 6(L + 1)dB/octave
• L = 0 ⇒ Rectangular window
• L = 1 ⇒ MLT sine window
• L = 2 ⇒ Hann window (“raised cosine” = “cos ^{2} ”)
• L = 3 ⇒ Alternate Blackman (max rolloﬀ rate)
Thus, cos ^{L} windows parametrize Lthorder BlackmanHarris windows conﬁgured so as to use all L degrees of freedom to maximize rolloﬀ rate.
25
Miscellaneous Windows
Bartlett (“Triangular”)
w(n) = w _{R} (n) 1 −
n − 1)/2
(M
• Convolution of two halflength rectangular windows
• Window transform is sinc ^{2} =⇒
• First sidelobe twice as far down as rect (26 dB)
• Main lobe twice as wide as that of a rectangular window having the same length (same as that of a halflength rect used to make it)
• Often applied to sample correlations of ﬁnite data
• Also called the “tent function”
26
Poisson (“Exponential”)
w _{P} (n) = w _{R} (n)e ^{−}^{α}
n
(M −1)/2
where: α determines the time constant ( τ =
Poisson Window M = 51
M
2α
^{)}
time (samples)
In the zplane, the Poisson window has the eﬀect of contracting the unit circle. Applied to a causal signal h(n) having z transform H(z), we have
27
H _{P} (z) =
=
=
∞
n=0
∞
n=0
∞
n=0
[w(n)h(n)]z ^{−}^{n}
h(n)e ^{−} ^{M}^{/}^{2} z ^{−}^{n}
αn
h(n)z ^{−}^{n} r ^{−}^{n} =
∞
n=0
= H(zr)
∆
(let r = e
α
M/2 _{)}
h(n)(zr) ^{−}^{n}
1
28
HannPoisson (“No Sidelobes”)
w(n) = _{2} 1 + cos π
1
− 1)/2 ^{e} −α
n
(M
n
(M −1)/2
• Poisson window times Hann window (exponential times raised cosine)
• “No sidelobes” for α ≥ 2
• Valuable for “hill climbing” optimization methods (gradientbased)
Matlab:
function
%HANNPOISSON
[w,h,p]

=
hannpoisson(M,alpha)
Length
M
HannPoisson
window
Mo2 
= 
(M1)/2; 
n=(Mo2:Mo2)’; 

scl 
= 
alpha 
/ 
Mo2; 
p =
scl2
exp(scl*abs(n));
=
pi
/
Mo2;
h 
= 
0.5*(1+cos(scl2*n)); 
w 
= 
p.*h; 
29
HannPoisson Window and Transform
Hann−Poisson Window, M = 21, Alpha = 2.0
Time (samples)
ωT
30
(radians per sample)
HannPoisson Slope and Curvature
Slope and Curvature of DTFT dB Magnitude, Hann−Poisson Window, M = 21, Alpha = 2.0
ωT
(radians per sample)
31
ωT
(radians per sample)
Slope and Curvature for Larger Alpha
Slope and Curvature of DTFT dB Magnitude, Hann−Poisson Window, M = 21, Alpha = 3.0
ωT
(radians per sample)
32
ωT
(radians per sample)
Maximum MainLobe Energy Window: DPSS
Question: How do we use all M degrees of freedom (sample values) in an M point window w(n) to obtain W (ω) ≈ δ(ω) in some optimal sense?
That is, we wish to perform the following optimization:
max
w
^{} main lobe energy
total energy
In the continuoustime case [ω ∈ (−∞, ∞)], this
problem is solved by a prolate spheroidal wave function, an eigenfunction of the integral equation
ω c
−ω _{c}
_{W}_{(}_{ν}_{)} sin(πD · (ω − ν) π(ω − ν)
dω = λW (ω)
where D is the nonzero timeduration of w(t) in seconds.
Interpretation:
FT(Chop _{D} (IFT(Chop _{2}_{ω} _{c} (W )))) = λW
where Chop _{D} (w) is a rectangular windowing operation which zeros w outside the interval t ∈ [−D/2, D/2].
W is thus the bandlimited extrapolation of its main lobe.
The optimal window transform W is an eigenfunction of this operation sequence corresponding to the largest eigenvalue.
33
The resulting optimal window w has maximum mainlobe energy as a fraction of total energy.
It may be called the Slepian window, or prolate spheroidal window in the continuoustime case.
In discrete time, we need Discrete Prolate Spheroidal Sequences (DPSS), eigenvectors of the following symmetric Toeplitz matrix constructed from a sampled sinc function:
S[k, l] = ^{ω} ^{c} ^{T}^{(}^{k} k − ^{−} l ^{l}^{)} ,
k, l = 0, 1, 2,
, M − 1
• M = window length in samples
• ω _{c} = mainlobe cutoﬀ frequency (rad/sec)
• T = sampling period in seconds.
The DPSS window (digital Slepian window) is then given by the eigenvector corresponding to the largest eigenvalue.
0.1 Matlab for the DPSS Window
function
[w,A,V]
=
dpssw(M,Wc);
% 
DPSSW 
 
Compute 
Digital 
Prolate Spheroidal Seq 

% 
(Slepian 
window) 
of length 
M, having 
c 
34
% 
Wc 
in 
(0,pi). 

k 
= 
(1:M1); 

s 
= 
sin(Wc*k)./ 
k; 

c0 
= [Wc,s]; 

A 
= 
toeplitz(c0); 
% 
c0=1st 
col 
of symmetric 
To 

[V,evals] 
= 
eig(A); 
% 
Really 
need 
only principal 
[emax,imax]
=
max(abs(diag(evals)));
Time (samples)
35
Normalized Frequency (Hz)
• DPSS window has a slightly narrower main lobe
• DPSS window has lower overall sidelobe levels
• Kaiser window side lobes roll oﬀ faster
Kaiser
Kaiser discovered an amazingly good approximation to
prolate spheroidal wave functions using Bessel functions:
I _{0} β 1−
I _{0} (β)
M/2 2
n
∆
w _{K} (n) =
^{,}
0,
This is called the Kaiser window.
36
_{−} M−1
2
≤ n ≤ ^{M}^{−}^{1}
2
elsewhere
The Fourier transform of the Kaiser window is given by
W(ω _{k} ) =
^{M}
_{0} (β)
I
sinh β ^{2} − ^{M}^{ω} ^{k}
2
2 ^{}
β ^{2} − ^{M}^{ω} ^{k}
2
2
where I _{0} is a Bessel function of the ﬁrst kind, and is equal to
∞
k=0
^{} (x/2) ^{k} k!
(x/2) ^{k}
∆
I _{0} (x) =
∞
(Compare this with e ^{x}^{/}^{2} =
k!
k=0
2
.)
Note: Sometimes you see the Kaiser window parameterized by α where β = πα.
∆
37
SLL vs Main Lobe Width for the Kaiser Window ( =0,0.5,1,
,4)
Main lobe width in units of rectangular−window main−lobe widths
• β = Mω _{c} T is equal to 1/2 ‘timebandwidth product’
⇒ α = ∆t · ∆f
• β trades oﬀ side lobe level for main lobe width larger β ⇒ lower S.L.L., wider mainlobe
38
β = [0, 2, 4, 6, 8, 10]
Length 50 Kaiser windows, 6 betas between 0 and 10
Time (samples)
39
dB
dB
dB
dB
dB
dB
β = [0, 2, 4, 6, 8, 10]
Length 50 Kaiser windows, 6 betas between 0 and 10
Normalized Frequency (cycles per sample)
40
dB
dB
dB
dB
M = [20, 30, 40, 50]
Kaiser window transforms, 4 window lengths between 20 and 50, beta=4
0
−20
−40
−60
0
−20
−40
−60
0
−20
−40
−60
0
−20
−40
−60
Normalized Frequency (cycles per sample)
41
Kaiser Windows and Transforms
α = [1, 2, 3]
(β = [π, 2π, 3π])
42
DolphChebyshev
Minimize the Chebyshev norm of the side lobes:
∆
min _{w} sidelobes(W ) _{∞} = min _{w} {max _{ω}_{>}_{ω} _{c} W(ω)}
Alternatively, minimize main lobe width subject to a sidelobe spec:
max (ω _{c} )
w
^{} W (ω) ≤ c _{α} , ∀ω≥ω _{c}
ClosedForm Window Transform (Dolph):
_{W}_{(}_{ω} k _{)} _{=} cos ^{} M cos ^{−}^{1} ^{} β cos ^{} ^{π}^{k}
M
, (k ≤ M − 1)
cosh ^{} M cosh ^{−}^{1} (β) ^{}
β
= cosh _{M} cosh ^{−}^{1} (10 ^{α} ) ,
1
(α ≈ 2, 3, 4)
[Note error in text, which says “β = cosh ^{−}^{1} [· · · ]”]
• Window w = IDFT(W ) [zerocentered case] or IDFT of (−1) ^{k} W(ω _{k} ) for causal case
• α controls sidelobe level (“stopband ripple”):
SideLobe Level in dB = −20α.
• smaller ripple ⇒ larger ω _{c}
• see matlab function “chebwin(M,ripple)”
43
DolphChebyshev Window, Length 31, Ripple 40 dB
Length 31 Chebyshev window
Time (samples)
Normalized Frequency (cycles/sample)
44
DolphChebyshev Window, Length 31, Ripple 200 dB
Length 31 Chebyshev window
Time (samples)
Normalized Frequency (cycles/sample)
45
DolphChebyshev Window, Length 101, Ripple 40 dB
Length 101 Chebyshev window
Time (samples)
Normalized Frequency (cycles/sample)
46
DolphChebyshev and Hamming Windows Compared
DFT of Chebyshev and Hamming Window: M = 31, Ripple = −42 dB
Normalized Frequency (rad/sample)
For the comparison, we set the ripple parameter for chebwin to 42 dB:
window
=
[
chebwin(31,42)’
47
zeros(1,102431)
];
Gaussian
The Gaussian “bell curve” is the only smooth function that transforms to itself:
1
σ _{√} 2π _{e} −t ^{2} _{/} 2σ ^{2} _{↔} _{e} −ω ^{2} _{/} 2(1/σ) ^{2}
It also achieves the minimum timebandwidth product
σ _{t} σ _{ω} = σ × (1/σ) = 1
when “width” of a function is deﬁned as the square root of its second central moment. For even functions w(t),
σ t
=
∆
∞
t
^{2} w(t)dt.
−∞
• Since the true Gaussian function has inﬁnite duration, in practice we must window it with some ﬁnite window, or at least truncate it.
• Philippe Depalle
suggests using a triangular window
raised to some power α for this purpose.
• This choice preserves the absence of sidelobes for suﬃciently large α. • It also preserves nonnegativity of the transform
48
The Gaussian Window in Spectral Modeling
Special Property: On a dB scale, the Gaussian is quadratic ⇒ parabolic interpolation of a sampled Gaussian transform is exact.
Conjecture: Quadratic interpolation of spectral peaks is generally more accurate on a logmagnitude scale (e.g., dB) than on a linear magnitude scale. This has been veriﬁed in a number of cases, and no counterexamples are yet known. Exercise: Prove this is true for the rectangular window.
Matlab for the Gaussian Window
function
[w]
=
gausswin(M,sigma)
n=((M1)/2:(M1)/2)’;
w
exp(n.*n./(2*sigma.*sigma));
=
49
Amplitude
Gaussian Window and Transform
1
0.8
0.6
0.4
0.2
0
Gaussian Window, M = 21, Sigma = M/8
Time (samples)
50
ωT
(radians per sample)
Optimal Windows
Generally we desire
W(ω) ≈ δ(ω)
• Best results are obtained by formulating this as an FIR ﬁlter design problem.
• In general, both timedomain and frequencydomain speciﬁcations are needed.
• Equivalently, both magnitude and phase speciﬁcations are necessary in the frequency domain.
51
Optimal Windows for Audio Coding
Recently, numerically optimized windows have been developed by Dolby which achieve the following objectives:
• Narrow the window in time
• Smooth the onset and decay in time
• Reduce sidelobes below the worstcase masking threshold
Windows in Graphics
^{1} http://www.cg.tuwien.ac.at/studentwork/CESCG99/TTheussl/paper.html
52
Conclusion
• There is rarely a closed form expression for an optimal window in practice.
• The hardest task is formulating the ideal error criterion.
• Given an error criterion, it is usually straightforward to minimize it numerically with respect to the window samples w.
53
Much more than documents.
Discover everything Scribd has to offer, including books and audiobooks from major publishers.
Cancel anytime.