Multirate Systems and Applications

EURASIP Journal on Advances in Signal Processing
Guest Editors: Yuan-Pei Lin, See-May Phoong, Ivan Selesnick, Soontorn Oraintara, and Gerald Schuller

Copyright © 2007 Hindawi Publishing Corporation. All rights reserved.
This is a special issue published in volume 2007 of “EURASIP Journal on Advances in Signal Processing.” All articles are open access
articles distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in
any medium, provided the original work is properly cited.
Editor-in-Chief
Ali H. Sayed, University of California, Los Angeles, USA
Associate Editors
Kenneth E. Barner, USA Søren Holdt Jensen, Denmark Marc Moonen, Belgium
Richard J. Barton, USA Mark Kahrs, USA Vitor Heloiz Nascimento, Brazil
Ati Baskurt, France Thomas Kaiser, Germany Sven Nordholm, Australia
Kostas Berberidis, Greece Moon Gi Kang, South Korea Douglas O’Shaughnessy, Canada
Jose C. Bermudez, Brazil Matti Karjalainen, Finland Antonio Ortega, USA
Enis Ahmet Cetin, Turkey Walter Kellermann, Germany Bjorn Ottersten, Sweden
Jonathon Chambers, UK Joerg Kliewer, USA Wilfried Philips, Belgium
Benoit Champagne, Canada Lisimachos Paul Kondi, USA Ioannis Psaromiligkos, Canada
Joe C. Chen, USA Alex Kot, Singapore Phillip Regalia, France
Liang-Gee Chen, Taiwan Vikram Krishnamurthy, Canada Markus Rupp, Austria
Huaiyu Dai, USA C. -C. Jay Kuo, USA William Allan Sandham, UK
Satya Dharanipragada, USA Tan Lee, China Bülent Sankur, Turkey
Frank Ehlers, Italy Geert Leus, The Netherlands Erchin Serpedin, USA
Sharon Gannot, Israel Bernard C. Levy, USA Dirk Slock, France
Fulvio Gini, Italy Ta-Hsin Li, USA Yap-Peng Tan, Singapore
Irene Y. H. Gu, Sweden Mark Liao, Taiwan Dimitrios Tzovaras, Greece
Fredrik Gustafsson, Sweden Yuan-Pei Lin, Taiwan Jacques G. Verly, Belgium
Peter Handel, Sweden Shoji Makino, Japan Bernhard Wess, Austria
R. Heusdens, The Netherlands Stephen Marshall, UK Douglas Williams, USA
Ulrich Heute, Germany C. Mecklenbräuker, Austria Roger Woods, UK
Arden Huang, USA Gloria Menegaz, Italy Jar-Ferr Kevin Yang, Taiwan
Jiri Jan, Czech Republic Ricardo Merched, Brazil Azzedine Zerguine, Saudi Arabia
Sudharman Jayaweera, USA Rafael Molina, Spain Abdelhak M. Zoubir, Germany
Contents
Multirate Systems and Applications, Yuan-Pei Lin, See-May Phoong, Ivan Selesnick, Soontorn Oraintara, and Gerald Schuller
Volume 2007, Article ID 41658, 3 pages
Design of Optimal Quincunx Filter Banks for Image Coding, Yi Chen, Michael D. Adams, and Wu-Sheng Lu
An Approach for Synthesis of Modulated M-Channel FIR Filter Banks Utilizing the Frequency-Response Masking Technique, Linnéa Rosenbaum, Per Löwenborg, and Håkan Johansson
Fixed Wordsize Implementation of Lifting Schemes, Tanja Karp
Quaternionic Lattice Structures for Four-Channel Paraunitary Filter Banks, Marek Parfieniuk and Alexander Petrovsky
Noniterative Design of 2-Channel FIR Orthogonal Filters, M. Elena Domínguez Jiménez
A Generalized Algorithm for Blind Channel Identification with Linear Redundant Precoders, Borching Su and P. P. Vaidyanathan
Channel Equalization in Filter Bank Based Multicarrier Modulation for Wireless Communications, Tero Ihalainen, Tobias Hidalgo Stitz, Mika Rinne, and Markku Renfors
Frequency-Domain Equalization in Single-Carrier Transmission: Filter Bank Approach, Yuan Yang, Tero Ihalainen, Mika Rinne, and Markku Renfors
Design of Nonuniform Filter Bank Transceivers for Frequency Selective Channels, Han-Ting Chiang, See-May Phoong, and Yuan-Pei Lin
Flexible Frequency-Band Reallocation Networks Using Variable Oversampled Complex-Modulated Filter Banks, Håkan Johansson and Per Löwenborg
Wavelets in Recognition of Bird Sounds, Arja Selin, Jari Turunen, and Juha T. Tanttu
Subband Approach to Bandlimited Crosstalk Cancellation System in Spatial Sound Reproduction, Mingsian R. Bai and Chih-Chung Lee
Subband Affine Projection Algorithm for Acoustic Echo Cancellation System, Hun Choi and Hyeon-Deok Bae
Hindawi Publishing Corporation
doi:10.1155/2007/41658
Editorial
Yuan-Pei Lin,1 See-May Phoong,2 Ivan Selesnick,3 Soontorn Oraintara,4 and Gerald Schuller5
1 Department of Electrical and Control Engineering, National Chiao-Tung University, Hsinchu 300, Taiwan
2 Department of Electrical Engineering and Graduate Institute of Communication Engineering,
National Taiwan University, Taipei 10617, Taiwan
3 Department of Electrical and Computer Engineering, Polytechnic University, Brooklyn, NY 11201, USA
4 Department of Electrical Engineering, The University of Texas at Arlington, Arlington, TX 76010, USA
5 Audio Coding for Special Applications Research Group, Fraunhofer Institute for Digital Media Technology (IDMT),
Langewiesener Strasse 22, 98693 Ilmenau, Germany

Received 24 January 2007; Accepted 24 January 2007
Copyright © 2007 Yuan-Pei Lin et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Filterbanks for the application of subband coding of speech were introduced in the 1970s. Since then, filterbanks and multirate systems have been studied extensively. There has been great success in applying multirate systems to many applications. The most notable of these applications include subband coding, signal analysis and representation using wavelets, subband denoising, and so forth. Different applications also call for different filterbank designs, and the topic of designing one-dimensional and multidimensional filterbanks for specific applications has been of great interest. Recently there has also been a lot of interest in applying multirate theories to the area of communication systems such as transmultiplexers, filterbank transceivers, and precoded systems. There are strikingly many dualities and similarities between multirate systems and multicarrier communication systems. Many problems in multicarrier transmission can be studied by extending results from multirate systems and filterbanks. This exciting research area is one that is of increasing importance.

The aim of this special issue is to bring forward recent developments on filterbanks and the ever expanding area of applications of multirate systems. In this special issue, there are a total of 13 papers, which are roughly grouped into 3 categories.

1. THEORY, DESIGN, AND IMPLEMENTATION OF FILTERBANKS

Yi Chen et al. developed two methods of designing quincunx filterbanks for image coding. Based on a lifting framework, a parameterization of quincunx filterbanks is employed to maximize coding gain subject to constraints on vanishing moments and frequency selectivity. The proposed methods are shown to be highly effective for image coding.

A frequency response masking approach to the design of cosine modulated M-channel filterbanks is developed by Linnéa Rosenbaum et al. Using frequency response masking, this method can obtain a sharper prototype and hence analysis and synthesis filters with narrower transition bands. Furthermore, a lower complexity can be achieved at the cost of a slightly increased overall delay.

The problem of fixed wordsize implementation of lifting schemes is addressed by Tanja Karp. A reversible nonlinear discrete wavelet transform with a fixed wordsize based on lifting schemes is presented. It is shown that when the additions in the lifting steps are done using the modulus operation, overflows (if any) will cancel out. An analysis of the effect of finite wordsize implementation on the performance of image compression systems is performed. The results are useful for a practical implementation of lifting schemes.

The paper by M. Parfieniuk and A. Petrovsky proposes new quaternionic lattice structures for four-channel paraunitary filterbanks. Quaternion multipliers are used as the paraunitary building blocks and they have the advantage that losslessness is preserved under coefficient quantization. The one-regularity condition can be expressed in terms of the lattice coefficients and can be satisfied even under finite precision. The proposed structure is useful for the design and implementation of four-channel paraunitary filterbanks.

A new characterization of real paraunitary two-channel filterbanks is proposed by M. Elena Domínguez Jiménez. The
new formulation gives an explicit expression of all real FIR paraunitary filterbanks and it leads to a method that designs any two-channel paraunitary filterbank directly, with no need for iterative procedures.

2. APPLICATION OF FILTERBANK SYSTEMS TO COMMUNICATIONS

Blind channel identification using redundant filterbank precoders is addressed by B. Su and P. P. Vaidyanathan. A generalized algorithm for solving the problem is proposed. The authors show how the parameters can be designed to jointly optimize the system performance and computational complexity. It is shown that the generalized algorithm outperforms the previous ones. In addition, a new concept of generalized signal richness and its properties are also investigated in the paper.

The issue of channel equalization in filterbank-based multicarrier systems is investigated by Tero Ihalainen et al. A new low-complexity per-subcarrier equalizer is proposed. A comprehensive performance analysis of the proposed system is presented and the performance of the proposed equalizer structures is compared to the cyclic-prefixed OFDM system, taking into account various practical issues like transmitter nonlinearity and frequency offsets. The study shows that the filterbank system is a promising candidate for multicarrier communications.

In a companion paper, Yuan Yang et al. investigate the use of exponentially modulated filterbanks for frequency-domain equalization in single-carrier systems. Two low-complexity equalizer structures are studied. It is demonstrated that the proposed filterbank-based single-carrier system outperforms the widely used DFT-based single-carrier system, especially when there is narrowband interference.

The paper by Han-Ting Chiang et al. studies nonuniform filterbank transceivers for frequency selective channels. The authors propose a design method for jointly optimizing the frequency response and signal-to-interference ratio. Simulation results show that nonuniform filterbank transceivers with good frequency responses and high signal-to-interference ratio can be obtained.

Frequency band reallocation is an important aspect of satellite-based communication systems. A variable oversampled complex modulated filterbank is introduced by H. Johansson and P. Löwenborg for flexible frequency band reallocation. Due to variable oversampling, the network is more flexible in accommodating various types of services. In addition, a lower complexity is simultaneously achieved due to inherent parallel processing.

3. FILTERBANK SYSTEMS FOR SOUND AND ACOUSTICS APPLICATIONS

In the paper by Arja Selin et al., filterbanks are applied to the recognition of bird sounds. Bird sounds can be tonal or inharmonic, with the latter not easily captured by conventional spectral analysis methods. Using wavelet packet decomposition for feature extraction, inharmonic and transient sounds can be recognized with a high success rate.

Filterbanks have also been applied to crosstalk cancellation in spatial sound reproduction using multi-channel loudspeakers. The widespread use of the crosstalk cancellation system has been hampered by its heavy computational loading. The subband-based bandlimited cancellation system proposed by M. R. Bai and C.-C. Lee significantly reduces the complexity while having a performance comparable to that of the full-band system.

Convergence speed and complexity are known to be two important issues in acoustic echo cancellation associated with long echo paths. H. Choi and H.-D. Bae present a new subband affine projection method, combining subband filtering and affine projection, to address these two issues. The new algorithm outperforms both subband filtering and fullband affine projection methods in terms of convergence. At the same time, a lower complexity can be achieved.

ACKNOWLEDGMENTS

The editors would like to thank all the authors who submitted to this special issue and express their gratitude to all the reviewers for their valuable comments and suggestions. They also appreciate very much the support of the EURASIP JASP Editorial Board. They hope that this special issue will stimulate more new developments and discoveries on the theories, designs, and applications of filterbank systems.

Yuan-Pei Lin
See-May Phoong
Ivan Selesnick
Soontorn Oraintara
Gerald Schuller

Yuan-Pei Lin was born in Taipei, Taiwan, in 1970. She received the B.S. degree in control engineering from the National Chiao-Tung University, Taiwan, in 1992, and the M.S. degree and the Ph.D. degree, both in electrical engineering from California Institute of Technology, in 1993 and 1997, respectively. She joined the Department of Electrical and Control Engineering of National Chiao-Tung University, Taiwan, in 1997. Her research interests include digital signal processing, multirate filterbanks, and signal processing for digital communication, particularly in the area of multicarrier transmission. She is a recipient of the 2004 Ta-You Wu Memorial Award. She served as an Associate Editor for IEEE Transactions on Signal Processing (2002–2006). She is currently an Associate Editor for IEEE Signal Processing Letters, IEEE Transactions on Circuits and Systems II, EURASIP Journal on Advances in Signal Processing, and Multidimensional Systems and Signal Processing, Academic Press. She is also a Distinguished Lecturer of the IEEE Circuits and Systems Society for 2006–2007.
See-May Phoong was born in Johor, Malaysia, in 1968. He received the B.S. degree in electrical engineering from the National Taiwan University (NTU), Taipei, Taiwan, in 1991 and the M.S. and Ph.D. degrees in electrical engineering from the California Institute of Technology (Caltech), Pasadena, Calif, USA, in 1992 and 1996, respectively. He was with the faculty of the Department of Electronic and Electrical Engineering, Nanyang Technological University, Singapore, from September 1996 to September 1997. In September 1997, he joined the Graduate Institute of Communication Engineering and the Department of Electrical Engineering, NTU, as an Assistant Professor, and since August 2006, he has been a Professor. He is currently an Associate Editor for the IEEE Transactions on Circuits and Systems I. He has previously served as an Associate Editor for Transactions on Circuits and Systems II: Analog and Digital Signal Processing (January 2002–December 2003) and IEEE Signal Processing Letters (March 2002–February 2005). His interests include multirate signal processing, filterbanks, and their application to communications. He received the Charles H. Wilts Prize (1997) for outstanding independent research in electrical engineering at Caltech. He was also a recipient of the Chinese Institute of Electrical Engineering’s Outstanding Youth Electrical Engineer Award (2005).

Ivan Selesnick received the B.S., M.E.E., and Ph.D. degrees in electrical engineering in 1990, 1991, and 1996, respectively, from Rice University, Houston, Tex. In 1997, he was a Visiting Professor at the University of Erlangen-Nurnberg, Germany. He then joined the Department of Electrical and Computer Engineering, Polytechnic University, NY, USA, where he is an Associate Professor. His current research interests are in the area of digital signal processing, wavelet-based signal processing, and non-Gaussian probability models. In 1991, he received a DARPA-NDSEG Fellowship. In 1996, Dr. Selesnick’s Ph.D. dissertation received the Budd Award for Best Engineering Thesis at Rice University and an award from the Rice-TMC Chapter of Sigma Xi. He received an Alexander von Humboldt Award (1997) and a National Science Foundation Career Award (1999). He has been a Member of the IEEE Signal Processing Theory and Methods Technical Committee and he is an Associate Editor of the IEEE Transactions on Image Processing.

Soontorn Oraintara received the B.E. degree (with first-class honors) from King Mongkut’s Institute of Technology Ladkrabang, Bangkok, Thailand, in 1995, and the M.S. and Ph.D. degrees in electrical engineering from the University of Wisconsin, Madison, in 1996, and Boston University, Boston, Mass, USA, in 2000, respectively. He joined the Department of Electrical Engineering, University of Texas at Arlington (UTA), as an Assistant Professor in July 2000, where he is currently an Associate Professor. From May 1998 to April 2000, he was an Intern and a Consultant with the Advanced Research and Development Group, Ericsson, Inc., Research Triangle Park, NC, USA. His current research interests are in the field of digital signal processing: wavelets, filterbanks, and multirate systems and their applications in data compression, image analysis, and biomedical signal processing. He is an Associate Editor for the IEEE Transactions on Signal Processing and the Circuits, Systems and Signal Processing Journal. He received the Technology Award from Boston University for his integer DCT invention (with Y. J. Chen and T. Q. Nguyen) in 1999. In 2003, he received the College of Engineering Outstanding Young Faculty Member Award from UTA. He represented Thailand in the International Mathematical Olympiad competitions and, respectively, received the Honorable Mention Award in Beijing, China, in 1990, and the bronze medal in Sigtuna, Sweden, in 1991.

Gerald Schuller has been the head of the Audio Coding for Special Applications Research Group at the Fraunhofer Institute for Digital Media Technology in Ilmenau, Germany, since January 2002, and is an Adjunct Professor at the Technical University of Ilmenau. From spring of 2005 until spring of 2006, he was Deputy Professor for Applied Media Systems at that university. He received his Ph.D. degree from the University of Hanover in 1997. From 1998 to 2001, he was a member of technical staff at Bell Laboratories, Lucent Technologies, and Agere Systems, a Lucent spin-off. There he worked in the Multimedia Communications Research Laboratory. He was an Associate Editor of the IEEE Transactions on Speech and Audio Processing from 2002 until 2006, and has been an Associate Editor of the IEEE Transactions on Signal Processing since 2006. He is a Member of the IEEE Technical Committees on Audio and Electroacoustics, on Speech and Language Processing, and Member of the Audio Engineering Society (AES) Technical Committees on Coding of Audio Signals, and on Signal Processing.
doi:10.1155/2007/83858
Research Article
Design of Optimal Quincunx Filter Banks for Image Coding
Yi Chen, Michael D. Adams, and Wu-Sheng Lu
Department of Electrical and Computer Engineering, University of Victoria, Victoria, BC, Canada V8W 3P6
Received 31 December 2005; Revised 8 June 2006; Accepted 16 July 2006
Recommended by Ivan Selesnick
Two new optimization-based methods are proposed for the design of high-performance quincunx filter banks for the application of image coding. These new techniques are used to build linear-phase finite-length-impulse-response (FIR) perfect-reconstruction (PR) systems with high coding gain, good frequency selectivity, and certain prescribed vanishing-moment properties. A parametrization of quincunx filter banks based on the lifting framework is employed to structurally impose the PR and linear-phase conditions. Then, the coding gain is maximized subject to a set of constraints on vanishing moments and frequency selectivity. Examples of filter banks designed using the newly proposed methods are presented and shown to be highly effective for image coding. In particular, our new optimal designs are shown to outperform three previously proposed quincunx filter banks in 72% to 95% of our experimental test cases. Moreover, in some limited cases, our optimal designs are even able to outperform the well-known (separable) 9/7 filter bank (from the JPEG-2000 standard).
Copyright © 2007 Yi Chen et al. This is an open access article distributed under the Creative Commons Attribution License, which
permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. INTRODUCTION

Filter banks have proven to be a highly effective tool for image coding applications [1]. In such applications, one typically desires filter banks to have perfect reconstruction (PR), linear-phase, high coding gain, good frequency selectivity, and satisfactory vanishing-moment properties. The PR property facilitates the construction of a lossless compression system. The linear-phase property is crucial to avoiding phase distortion. High coding gain leads to filter banks with good energy compaction capabilities. The presence of vanishing moments helps to reduce the number of nonzero coefficients in the highpass subbands and tends to lead to smoother synthesis basis functions. Good frequency selectivity serves to minimize aliasing in the subband signals. Designing nonseparable two-dimensional (2D) filter banks with all of the preceding properties is an extremely challenging task.

In the one-dimensional (1D) case, various filter-bank design techniques have been successfully developed. In the nonseparable 2D case, however, far fewer effective methods have been proposed. Variable transformation methods are commonly used for the design of 2D filter banks. With such methods, a 1D prototype filter bank is first designed, and then mapped into a 2D filter bank through a transformation of variables [2–6]. For example, the McClellan transformation [7] has been used in numerous design approaches. Other design techniques have also been proposed where a transformation is applied to the polyphase components of the filters instead of the original filter transfer functions [8–11]. These transformation-based designs have the restriction that one cannot explicitly control the shape of the 2D filter frequency responses. Moreover, in some cases, the transformed 2D filter banks can only achieve approximate PR. Direct optimization of the filter coefficients has also been proposed [12–14], but because of the involvement of large numbers of variables and nonlinear, nonconvex constraints, such optimization typically leads to a very complicated system, which is often difficult to solve. Designs utilizing the lifting framework [15, 16] have been proposed in [17, 18] for two-channel 2D filter banks with an arbitrary number of vanishing moments. With these methods, however, only interpolating filter banks are considered (i.e., filter banks with two lifting steps).

The Cayley transform has been used in the characterization and design of multidimensional orthogonal filter banks [19, 20]. In [21], B-spline filters and the McClellan transformation are used to construct orthogonal quincunx wavelets with fractional order of approximation. A technique utilizing polyharmonic B-splines is proposed in [22] for designing multidimensional/quincunx wavelet bases. Although the preceding design methods are interesting and certainly worthy of mention, they are not useful for the particular design
problem considered in our work. This is due to the fact that we consider the design of nontrivial linear-phase finite-length-impulse-response (FIR) PR filter banks. In the quincunx case, such filter banks cannot be orthogonal [23]. Furthermore, since we are interested in FIR filter banks, methods that yield filter banks with infinite-length-impulse-response (IIR) filters are not helpful either.

Uniform and nonuniform 2D directional filter banks are proposed in [24] to process images with better directional selectivity than conventional wavelets. Although we mention this development here for completeness, it addresses a different problem from that considered herein. In our work, we seek to design filter banks that can be used in a standard wavelet configuration. For this reason, methods for the design of directional filter banks, while interesting, are not applicable to the problem at hand.

In this paper, we propose two new optimization-based methods for constructing FIR quincunx filter banks with all of the aforementioned desirable properties (i.e., PR, linear-phase, high coding gain, good frequency selectivity, and certain vanishing-moments properties).

The rest of this paper is structured as follows. Section 2 briefly presents the notational conventions used herein. Then, Section 3 introduces quincunx filter banks, and Section 4 presents a parametrization of linear-phase PR quincunx filter banks based on the lifting framework. Optimal design algorithms for quincunx filter banks with two and more than two lifting steps are proposed in Sections 5 and 6, respectively. Several design examples are then presented in Section 7 and their effectiveness for image coding is demonstrated in Section 8. Finally, Section 9 concludes with a summary of our work and some closing remarks.

2. NOTATION AND TERMINOLOGY

Before proceeding further, a few comments are in order concerning the notation used herein. In this paper, the sets of integers and real numbers are denoted as $\mathbb{Z}$ and $\mathbb{R}$, respectively. The symbols $\mathbb{Z}^*$, $\mathbb{Z}^+$, $\mathbb{Z}^-$, $\mathbb{Z}_o$, and $\mathbb{Z}_e$ denote the sets of nonnegative, positive, negative, odd, and even integers, respectively. For $a \in \mathbb{R}$, $\lfloor a \rfloor$ denotes the largest integer no greater than $a$, and $\lceil a \rceil$ denotes the smallest integer no less than $a$. For $m, n \in \mathbb{Z}$, we define the mod function as $\operatorname{mod}(m, n) = m - n \lfloor m/n \rfloor$.

Matrices and vectors are denoted by upper- and lower-case boldface letters, respectively. The symbols $\mathbf{0}$, $\mathbf{1}$, and $\mathbf{I}$ are used to denote a vector/matrix of all zeros, a vector/matrix of all ones, and an identity matrix, respectively, the dimensions of which should be clear from the context. For matrix multiplication, we define the product notation as $\prod_{k=M}^{N} \mathbf{A}_k \triangleq \mathbf{A}_N \mathbf{A}_{N-1} \cdots \mathbf{A}_{M+1} \mathbf{A}_M$ for $N \ge M$. For convenience, a linear (or polynomial) function of the elements of a vector $\mathbf{x}$ is simply referred to as a linear (or polynomial) function in $\mathbf{x}$.

An element of a sequence $x$ defined on $\mathbb{Z}^2$ is denoted either as $x[\mathbf{n}]$ or as $x[n_0, n_1]$ (whichever is more convenient), where $\mathbf{n} = [n_0\ n_1]^T$ and $n_0, n_1 \in \mathbb{Z}$. Let $\mathbf{n} = [n_0\ n_1]^T$ and let $\mathbf{z} = [z_0\ z_1]^T$. Then, we define $|\mathbf{n}| = n_0 + n_1$ and $\mathbf{z}^{\mathbf{n}} = z_0^{n_0} z_1^{n_1}$. Furthermore, for a matrix $\mathbf{M} = [\mathbf{m}_0\ \mathbf{m}_1]$ with $\mathbf{m}_k$ being the $k$th column of $\mathbf{M}$, we define $\mathbf{z}^{\mathbf{M}} = [\mathbf{z}^{\mathbf{m}_0}\ \mathbf{z}^{\mathbf{m}_1}]^T$. In the rest of this paper, unless otherwise noted, we will use $\mathbf{M}$ to denote the generating matrix $\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$ of the quincunx lattice. For convenience, we denote the partial derivative operator with respect to $\boldsymbol{\omega} = [\omega_0\ \omega_1]^T$ as

$$\mathcal{D}^{\mathbf{n}} = \frac{\partial^{|\mathbf{n}|}}{\partial \omega_0^{n_0}\, \partial \omega_1^{n_1}}, \tag{1}$$

where $\mathbf{n} = [n_0\ n_1]^T \in (\mathbb{Z}^*)^2$.

The Fourier transform of a sequence $h$ is denoted as $\hat{h}$. A (2D) filter $H$ with impulse response $h$ is said to be linear phase with group delay $\mathbf{c}$ if, for some $\mathbf{c} \in \frac{1}{2}\mathbb{Z}^2$, $h[\mathbf{n}] = h[2\mathbf{c} - \mathbf{n}]$ for all $\mathbf{n} \in \mathbb{Z}^2$. In passing, we note that the frequency response $\hat{h}(\boldsymbol{\omega})$ of a linear-phase filter with impulse response $h$ and group delay $\mathbf{c}$ can be expressed as

$$\hat{h}(\boldsymbol{\omega}) = e^{-j\boldsymbol{\omega}^T \mathbf{c}} \sum_{\mathbf{n} \in \mathbb{Z}^2} h[\mathbf{n}] \cos\left(\boldsymbol{\omega}^T (\mathbf{n} - \mathbf{c})\right). \tag{2}$$

For convenience, in what follows, we define the signed amplitude response $\hat{h}_a(\boldsymbol{\omega})$ of $H$ as

$$\hat{h}_a(\boldsymbol{\omega}) = \sum_{\mathbf{n} \in \mathbb{Z}^2} h[\mathbf{n}] \cos\left(\boldsymbol{\omega}^T (\mathbf{n} - \mathbf{c})\right) \tag{3}$$

(i.e., the quantity $\hat{h}_a(\boldsymbol{\omega})$ is $\hat{h}(\boldsymbol{\omega})$ without the exponential factor $e^{-j\boldsymbol{\omega}^T \mathbf{c}}$). Thus, the magnitude response of $H$ is trivially given by $|\hat{h}_a(\boldsymbol{\omega})|$.

In image coding, the peak-signal-to-noise ratio (PSNR) is a commonly used measure for distortion. For an original image $x$ and its reconstructed version $x_r$, the PSNR is defined as

$$\mathrm{PSNR} = 20 \log_{10} \frac{2^P - 1}{\sqrt{\mathrm{MSE}}}, \tag{4}$$

where

$$\mathrm{MSE} = \frac{1}{N_0 N_1} \sum_{n_0=0}^{N_0-1} \sum_{n_1=0}^{N_1-1} \left(x_r[n_0, n_1] - x[n_0, n_1]\right)^2, \tag{5}$$

and each image has dimension $N_0 \times N_1$ and $P$ bits/sample.

3. QUINCUNX FILTER BANKS

A quincunx filter bank has the canonical form shown in Figure 1. The filter bank consists of lowpass and highpass analysis filters $H_0$ and $H_1$, lowpass and highpass synthesis filters $G_0$ and $G_1$, and $\mathbf{M}$-fold downsamplers and upsamplers.

Figure 1: The canonical form of a quincunx filter bank: (a) analysis side, and (b) synthesis side.

Figure 2: The structure of an L-level octave-band filter bank: (a) analysis side, and (b) synthesis side.

Figure 3: The equivalent nonuniform filter bank associated with the L-level octave-band filter bank.
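The M-fold downsampler mentioned above keeps the samples x[Mn]. The following Python sketch illustrates this for the quincunx generating matrix M = [1 1; 1 −1] from Section 2. It is an illustration only and not code from the paper: the dictionary-based signal representation and the function names `lattice_point`, `on_quincunx_lattice`, and `quincunx_downsample` are hypothetical choices.

```python
# Quincunx generating matrix M = [1 1; 1 -1] from the notation section.
M = ((1, 1), (1, -1))

def lattice_point(n0, n1):
    """Return M @ [n0, n1]^T, i.e. (n0 + n1, n0 - n1)."""
    return (M[0][0] * n0 + M[0][1] * n1, M[1][0] * n0 + M[1][1] * n1)

def on_quincunx_lattice(m0, m1):
    """A point is of the form M n exactly when m0 + m1 is even."""
    return (m0 + m1) % 2 == 0

def quincunx_downsample(x):
    """Compute y[n] = x[M n] for a signal stored as {(m0, m1): value}."""
    y = {}
    for (m0, m1), value in x.items():
        if on_quincunx_lattice(m0, m1):
            # Solve M n = m; both divisions are exact on the lattice.
            n = ((m0 + m1) // 2, (m0 - m1) // 2)
            y[n] = value
    return y

# |det M| = 2, so the downsampler keeps one sample out of every two.
det_M = M[0][0] * M[1][1] - M[0][1] * M[1][0]

# Usage: a 4 x 4 test signal keeps 8 of its 16 samples.
x = {(i, j): 10 * i + j for i in range(4) for j in range(4)}
y = quincunx_downsample(x)
```

The retained points form the checkerboard pattern that gives the quincunx lattice its name. The same index arithmetic, run in reverse with zeros filled in, would describe the M-fold upsampler.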
In image coding applications, a quincunx filter bank is typically applied in a recursive manner, resulting in an octave-band filter-bank structure as shown in Figure 2. For an $L$-level octave-band filter bank generated from a quincunx filter bank with analysis filters $\{H_k\}$, the equivalent nonuniform filter bank has $L + 1$ channels with analysis filters $\{H'_i\}$ and synthesis filters $\{G'_i\}$ as shown in Figure 3. The transfer functions $\{H'_i(\mathbf{z})\}$ of $\{H'_i\}$ are given by

$$H'_i(\mathbf{z}) = \begin{cases} \displaystyle\prod_{k=0}^{L-1} H_0\left(\mathbf{z}^{\mathbf{M}^k}\right), & i = 0, \\ \displaystyle H_1\left(\mathbf{z}^{\mathbf{M}^{L-i}}\right) \prod_{k=0}^{L-i-1} H_0\left(\mathbf{z}^{\mathbf{M}^k}\right), & 1 \le i \le L - 1, \\ H_1(\mathbf{z}), & i = L. \end{cases} \tag{6}$$

The transfer functions $\{G'_i(\mathbf{z})\}$ of the equivalent synthesis filters $\{G'_i\}$ can be derived in a similar fashion.

4. LIFTING PARAMETRIZATION OF QUINCUNX FILTER BANKS

Rather than parameterizing a quincunx filter bank in terms of its canonical form, shown earlier in Figure 1, we instead employ the lifting framework [15, 16]. The lifting realization of a quincunx filter bank has the form shown in Figure 4. Essentially, the filter bank is realized in polyphase form, with the analysis and synthesis polyphase filtering each being performed by a ladder network consisting of $2\lambda$ lifting filters $\{A_k\}$. Without loss of generality, we may assume that none of the $\{A_k(\mathbf{z})\}$ are identically zero, except possibly $A_1(\mathbf{z})$ and $A_{2\lambda}(\mathbf{z})$.

Figure 4: Lifting realization of a quincunx filter bank: (a) analysis side, and (b) synthesis side.

Given the lifting filters $\{A_k\}$, the corresponding analysis filter transfer functions $H_0(\mathbf{z})$ and $H_1(\mathbf{z})$ can be calculated as

$$\begin{bmatrix} H_0(\mathbf{z}) \\ H_1(\mathbf{z}) \end{bmatrix} = \begin{bmatrix} H_{0,0}(\mathbf{z}^{\mathbf{M}}) & H_{0,1}(\mathbf{z}^{\mathbf{M}}) \\ H_{1,0}(\mathbf{z}^{\mathbf{M}}) & H_{1,1}(\mathbf{z}^{\mathbf{M}}) \end{bmatrix} \begin{bmatrix} 1 \\ z_0 \end{bmatrix}, \tag{7}$$

where

$$\begin{bmatrix} H_{0,0}(\mathbf{z}) & H_{0,1}(\mathbf{z}) \\ H_{1,0}(\mathbf{z}) & H_{1,1}(\mathbf{z}) \end{bmatrix} = \prod_{k=1}^{\lambda} \left( \begin{bmatrix} 1 & A_{2k}(\mathbf{z}) \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ A_{2k-1}(\mathbf{z}) & 1 \end{bmatrix} \right). \tag{8}$$

The synthesis filter transfer functions $G_0(\mathbf{z})$ and $G_1(\mathbf{z})$ can then be trivially computed as $G_k(\mathbf{z}) = (-1)^{1-k} z_0^{-1} H_{1-k}(-\mathbf{z})$. Since the synthesis filters are completely determined by the analysis filters, we need only to consider the analysis side of the filter bank in what follows.

The use of the above lifting-based parametrization is helpful in several respects. First, the PR condition is automatically satisfied by such a parametrization. Second, the linear-phase condition can be imposed with relative ease, as we will see momentarily. Thus, the need for additional cumbersome constraints during optimization for PR and linear phase is eliminated. Lastly, the lifting realization trivially allows for the construction of reversible integer-to-integer mappings [25], which are often useful for image coding and are employed later in this work.
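The claim that the ladder structure gives PR automatically, even when each lifting update is rounded to an integer, can be checked with a small experiment. The Python sketch below is a simplified illustration under stated assumptions, not the authors' implementation. It works on two already-separated polyphase channels, uses 1D circular filters in place of the paper's 2D lifting filters, and rounds with floor(x + 1/2). The tap values and the names `lift_filter`, `analysis`, and `synthesis` are hypothetical.

```python
import math

def lift_filter(taps, signal):
    """Circularly filter `signal` with `taps`, then round to integers.

    Assumption: 1D circular filtering stands in for the 2D lifting
    filters A_k, and floor(x + 0.5) is an arbitrary rounding rule.
    """
    n = len(signal)
    return [
        math.floor(sum(t * signal[(i - k) % n] for k, t in enumerate(taps)) + 0.5)
        for i in range(n)
    ]

def analysis(p0, p1, lifting_taps):
    """Apply lifting steps A_1, A_2, ..., in that order, as in the ladder of (8)."""
    p0, p1 = list(p0), list(p1)
    for k, taps in enumerate(lifting_taps, start=1):
        if k % 2 == 1:   # odd-indexed step: second channel += A_k(first channel)
            p1 = [b + u for b, u in zip(p1, lift_filter(taps, p0))]
        else:            # even-indexed step: first channel += A_k(second channel)
            p0 = [a + u for a, u in zip(p0, lift_filter(taps, p1))]
    return p0, p1

def synthesis(y0, y1, lifting_taps):
    """Undo the lifting steps in reverse order by subtracting the same updates."""
    y0, y1 = list(y0), list(y1)
    for k in range(len(lifting_taps), 0, -1):
        taps = lifting_taps[k - 1]
        if k % 2 == 1:
            y1 = [b - u for b, u in zip(y1, lift_filter(taps, y0))]
        else:
            y0 = [a - u for a, u in zip(y0, lift_filter(taps, y1))]
    return y0, y1

# Usage with two hypothetical lifting filters (lambda = 1).
taps = [[-0.5, -0.5], [0.25, 0.25]]
p0, p1 = [3, 7, 1, 8], [4, 2, 9, 5]
y0, y1 = analysis(p0, p1, taps)
r0, r1 = synthesis(y0, y1, taps)   # recovers p0 and p1 exactly
```

Each step changes only one channel using a function of the other, so the inverse can recompute the identical update and subtract it. That is why reconstruction is exact whatever the taps or the rounding rule, and why the subband samples stay integer-valued.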
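Now we further consider the linear-phase condition. As it turns out, the linear-phase condition can be satisfied with a prudent choice of lifting filters $\{A_k\}$. In particular, we have shown the following result.

Theorem 1 (sufficient condition for linear phase). Consider a quincunx filter bank constructed from the lifting framework with $2\lambda$ lifting filters as shown in Figure 4(a). If each lifting filter $A_k$ is symmetric with its group delay $\mathbf{c}_k$ satisfying

$$
\mathbf{c}_k = (-1)^k \begin{bmatrix} \tfrac{1}{2} & \tfrac{1}{2} \end{bmatrix}^T,
\tag{9}
$$

then the analysis filters $H_0$ and $H_1$ are symmetric with group delays $[0\ 0]^T$ and $[-1\ 0]^T$, respectively.

A proof of the preceding theorem is provided in the first author's thesis [26] but is omitted here for the sake of brevity. The significance of Theorem 1 is that the linear-phase condition can be trivially satisfied by choosing the lifting filters to have certain symmetry properties.

Now, we examine the relationship between the analysis filter frequency responses and the lifting-filter coefficients. Since the lifting filter $A_k$ has linear phase with group delay $\mathbf{c}_k = (-1)^k [1/2\ \ 1/2]^T$, the support region of $A_k$ is a rectangle of size $2l_{k,0} \times 2l_{k,1}$ for some $l_{k,0}, l_{k,1} \in \mathbb{Z}^+$, and the number of independent coefficients of $A_k$ is $2 l_{k,0} l_{k,1}$. Let $\mathbf{a}_k$ be a vector containing the independent coefficients of $A_k$. Then, there are $2 l_{k,0} l_{k,1}$ elements in $\mathbf{a}_k$ indexed from $0$ to $2 l_{k,0} l_{k,1} - 1$.

Consider an odd-indexed lifting filter $A_{2k-1}$. Its support region can be expressed as $\{-l_{2k-1,0}, -l_{2k-1,0}+1, \ldots, l_{2k-1,0}-1\} \times \{-l_{2k-1,1}, -l_{2k-1,1}+1, \ldots, l_{2k-1,1}-1\}$. The $n$th element of the coefficient vector $\mathbf{a}_{2k-1}$ is defined as $a_{2k-1}[n_0, n_1]$ with $n_0$ and $n_1$ given by

$$
\begin{aligned}
n_0 &= \left\lfloor \frac{n}{2 l_{2k-1,1}} \right\rfloor \in \{0, 1, \ldots, l_{2k-1,0}-1\},\\
n_1 &= \operatorname{mod}\bigl(n, 2 l_{2k-1,1}\bigr) - l_{2k-1,1} \in \{-l_{2k-1,1}, -l_{2k-1,1}+1, \ldots, l_{2k-1,1}-1\}.
\end{aligned}
\tag{10}
$$

Since $A_{2k-1}$ has linear phase, the frequency response of $A_{2k-1}$ can be written from (2) as

$$
\begin{aligned}
a_{2k-1}(\boldsymbol{\omega})
&= e^{-j\boldsymbol{\omega}^T \mathbf{c}_{2k-1}} \sum_{\mathbf{n}\in\mathbb{Z}^2} a_{2k-1}[\mathbf{n}] \cos\bigl(\boldsymbol{\omega}^T(\mathbf{n} - \mathbf{c}_{2k-1})\bigr)\\
&= 2 e^{j(1/2)(\omega_0+\omega_1)} \sum_{n_0=0}^{l_{2k-1,0}-1}\ \sum_{n_1=-l_{2k-1,1}}^{l_{2k-1,1}-1} a_{2k-1}[n_0, n_1]
\cos\Bigl(\omega_0\bigl(n_0 + \tfrac{1}{2}\bigr) + \omega_1\bigl(n_1 + \tfrac{1}{2}\bigr)\Bigr).
\end{aligned}
\tag{11}
$$

In the upsampled domain, $a_{2k-1}(\mathbf{M}^T\boldsymbol{\omega})$ can then be expressed as

$$
a_{2k-1}\bigl(\mathbf{M}^T\boldsymbol{\omega}\bigr)
= 2 e^{j\omega_0} \sum_{n_0=0}^{l_{2k-1,0}-1}\ \sum_{n_1=-l_{2k-1,1}}^{l_{2k-1,1}-1} a_{2k-1}[n_0, n_1]
\cos\bigl(\omega_0(n_0 + n_1 + 1) + \omega_1(n_0 - n_1)\bigr).
\tag{12}
$$

Thus, $a_{2k-1}(\mathbf{M}^T\boldsymbol{\omega})$ can be compactly written as

$$
a_{2k-1}\bigl(\mathbf{M}^T\boldsymbol{\omega}\bigr) = e^{j\omega_0}\, \mathbf{a}_{2k-1}^T \mathbf{v}_{2k-1},
\tag{13}
$$

where $\mathbf{v}_{2k-1}$ is a vector of $2 l_{2k-1,0} l_{2k-1,1}$ elements indexed from $0$ to $2 l_{2k-1,0} l_{2k-1,1} - 1$, and the $n$th element of $\mathbf{v}_{2k-1}$ is given by

$$
v_{2k-1}[n] = 2\cos\bigl(\omega_0(n_0 + n_1 + 1) + \omega_1(n_0 - n_1)\bigr)
\tag{14}
$$

with $n_0$ and $n_1$ given by (10).

Now, consider an even-indexed lifting filter $A_{2k}$. Its support region is $\{-l_{2k,0}+1, -l_{2k,0}+2, \ldots, l_{2k,0}\} \times \{-l_{2k,1}+1, -l_{2k,1}+2, \ldots, l_{2k,1}\}$. The $n$th element of the coefficient vector $\mathbf{a}_{2k}$ is defined as $a_{2k}[n_0, n_1]$ with $n_0$ and $n_1$ given by

$$
\begin{aligned}
n_0 &= \left\lfloor \frac{n}{2 l_{2k,1}} \right\rfloor + 1 \in \{1, 2, \ldots, l_{2k,0}\},\\
n_1 &= \operatorname{mod}\bigl(n, 2 l_{2k,1}\bigr) - l_{2k,1} + 1 \in \{-l_{2k,1}+1, -l_{2k,1}+2, \ldots, l_{2k,1}\},
\end{aligned}
\tag{15}
$$

respectively. The frequency response $a_{2k}(\boldsymbol{\omega})$ of $A_{2k}$ is computed as

$$
a_{2k}(\boldsymbol{\omega}) = 2 e^{-j(1/2)(\omega_0+\omega_1)} \sum_{n_0=1}^{l_{2k,0}}\ \sum_{n_1=1-l_{2k,1}}^{l_{2k,1}} a_{2k}[n_0, n_1]
\cos\Bigl(\omega_0\bigl(n_0 - \tfrac{1}{2}\bigr) + \omega_1\bigl(n_1 - \tfrac{1}{2}\bigr)\Bigr).
\tag{16}
$$

In the upsampled domain, $a_{2k}(\mathbf{M}^T\boldsymbol{\omega})$ can be expressed as

$$
a_{2k}\bigl(\mathbf{M}^T\boldsymbol{\omega}\bigr) = e^{-j\omega_0}\, \mathbf{a}_{2k}^T \mathbf{v}_{2k},
\tag{17}
$$

where $\mathbf{v}_{2k}$ is a vector of $2 l_{2k,0} l_{2k,1}$ elements indexed from $0$ to $2 l_{2k,0} l_{2k,1} - 1$, and the $n$th element of $\mathbf{v}_{2k}$ is defined as

$$
v_{2k}[n] = 2\cos\bigl(\omega_0(n_0 + n_1 - 1) + \omega_1(n_0 - n_1)\bigr)
\tag{18}
$$

with $n_0$ and $n_1$ given by (15).

Rewriting (7) and (8) in the Fourier domain, we have

$$
\begin{bmatrix} h_0(\boldsymbol{\omega}) \\ h_1(\boldsymbol{\omega}) \end{bmatrix}
=
\begin{bmatrix}
h_{0,0}\bigl(\mathbf{M}^T\boldsymbol{\omega}\bigr) & h_{0,1}\bigl(\mathbf{M}^T\boldsymbol{\omega}\bigr)\\
h_{1,0}\bigl(\mathbf{M}^T\boldsymbol{\omega}\bigr) & h_{1,1}\bigl(\mathbf{M}^T\boldsymbol{\omega}\bigr)
\end{bmatrix}
\begin{bmatrix} 1 \\ e^{j\omega_0} \end{bmatrix},
\tag{19}
$$

$$
\begin{bmatrix}
h_{0,0}(\boldsymbol{\omega}) & h_{0,1}(\boldsymbol{\omega})\\
h_{1,0}(\boldsymbol{\omega}) & h_{1,1}(\boldsymbol{\omega})
\end{bmatrix}
=
\prod_{k=1}^{\lambda}
\left(
\begin{bmatrix} 1 & a_{2k}(\boldsymbol{\omega}) \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ a_{2k-1}(\boldsymbol{\omega}) & 1 \end{bmatrix}
\right),
\tag{20}
$$

respectively. Substituting (13), (17), and (20) into (19), we obtain the frequency responses of the analysis filters as

$$
\begin{bmatrix} h_0(\boldsymbol{\omega}) \\ h_1(\boldsymbol{\omega}) \end{bmatrix}
=
\prod_{k=1}^{\lambda}
\left(
\begin{bmatrix} 1 & e^{-j\omega_0}\mathbf{a}_{2k}^T\mathbf{v}_{2k} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ e^{j\omega_0}\mathbf{a}_{2k-1}^T\mathbf{v}_{2k-1} & 1 \end{bmatrix}
\right)
\begin{bmatrix} 1 \\ e^{j\omega_0} \end{bmatrix}.
\tag{21}
$$

We further define a vector $\mathbf{x}$ containing all of the independent coefficients $\{\mathbf{a}_k\}$ of the lifting filters $\{A_k\}$ as

$$
\mathbf{x} = \begin{bmatrix} \mathbf{a}_1^T & \mathbf{a}_2^T & \cdots & \mathbf{a}_{2\lambda}^T \end{bmatrix}^T.
\tag{22}
$$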
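The index map (10) and the half-support cosine form (11) for an odd-indexed lifting filter can be checked numerically against a direct evaluation over the full symmetric support. The sketch below assumes the convention $a(\boldsymbol{\omega}) = \sum_{\mathbf{n}} a[\mathbf{n}] e^{-j\boldsymbol{\omega}^T\mathbf{n}}$ for the frequency response and the symmetry $a[n_0, n_1] = a[-1-n_0, -1-n_1]$ implied by the group delay $[-1/2\ \ {-1/2}]^T$; the function names are hypothetical.

```python
import numpy as np

def index_to_n(n, l1):
    """Index map (10): position n in the coefficient vector -> (n0, n1)."""
    return n // (2 * l1), n % (2 * l1) - l1

def response_from_independent(a, l0, l1, w0, w1):
    """Evaluate (11) using only the 2*l0*l1 independent coefficients."""
    total = 0.0
    for n in range(2 * l0 * l1):
        n0, n1 = index_to_n(n, l1)
        total += a[n] * np.cos(w0 * (n0 + 0.5) + w1 * (n1 + 0.5))
    return 2.0 * np.exp(0.5j * (w0 + w1)) * total

def response_direct(a, l0, l1, w0, w1):
    """Sum a[n] exp(-j w^T n) over the full support, filling in mirrored taps."""
    total = 0j
    for n in range(2 * l0 * l1):
        n0, n1 = index_to_n(n, l1)
        # Each independent tap and its mirror image about (-1/2, -1/2).
        for m0, m1 in ((n0, n1), (-1 - n0, -1 - n1)):
            total += a[n] * np.exp(-1j * (w0 * m0 + w1 * m1))
    return total
```

The two evaluations agree because each tap and its mirror combine into $2 e^{j(\omega_0+\omega_1)/2}\cos(\cdot)$, which is exactly the term appearing in (11).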
Thus, $\mathbf{x}$ has $l_x = 2\sum_{i=1}^{2\lambda} l_{i,0} l_{i,1}$ elements. Clearly, each vector $\mathbf{a}_k$ can be expressed in terms of $\mathbf{x}$ as

$$
\mathbf{a}_k = \underbrace{\begin{bmatrix} \mathbf{0}_{2l_{k,0}l_{k,1}\times\alpha_0} & \mathbf{I}_{2l_{k,0}l_{k,1}} & \mathbf{0}_{2l_{k,0}l_{k,1}\times\beta_0} \end{bmatrix}}_{\mathbf{E}_k} \mathbf{x} = \mathbf{E}_k \mathbf{x},
\tag{23}
$$

where $\alpha_0 = 2\sum_{i=1}^{k-1} l_{i,0} l_{i,1}$ and $\beta_0 = 2\sum_{i=k+1}^{2\lambda} l_{i,0} l_{i,1}$. Substituting (23) into (21), we have

$$
\begin{bmatrix} h_0(\boldsymbol{\omega}) \\ h_1(\boldsymbol{\omega}) \end{bmatrix}
=
\prod_{k=1}^{\lambda}
\left(
\begin{bmatrix} 1 & e^{-j\omega_0}\mathbf{x}^T\mathbf{E}_{2k}^T\mathbf{v}_{2k} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ e^{j\omega_0}\mathbf{x}^T\mathbf{E}_{2k-1}^T\mathbf{v}_{2k-1} & 1 \end{bmatrix}
\right)
\begin{bmatrix} 1 \\ e^{j\omega_0} \end{bmatrix}.
\tag{24}
$$

By expanding the preceding equation, each of the analysis filter frequency responses can be viewed as a polynomial in $\mathbf{x}$, the order of which depends on the number of lifting steps.

5. DESIGN OF FILTER BANKS WITH TWO LIFTING STEPS

Consider a quincunx filter bank as shown in Figure 4(a) with two lifting steps (i.e., $\lambda = 1$). As explained earlier, for image coding applications, we seek a filter bank with PR, linear-phase, high coding gain, good frequency selectivity, and certain vanishing-moment properties. To satisfy both the PR and linear-phase conditions, we use the lifting-based parametrization from Theorem 1. Having elected the use of a lifting-based parametrization for optimization purposes, we must now determine the relationships between the lifting-filter coefficients and the other desirable properties (such as high coding gain, good frequency selectivity, and certain vanishing-moment properties). In the sections that follow, these relationships are examined in more detail.

5.1. Coding gain

We begin by considering the relationship between the lifting-filter coefficients and coding gain. Coding gain is a measure of the energy compaction ability of a filter bank, and is defined as the ratio between the reconstruction error variance obtained by quantizing a signal directly to that obtained by quantizing the corresponding subband coefficients using an optimal bit allocation strategy. For an $L$-level octave-band quincunx filter bank, the coding gain $G_{\mathrm{SBC}}$ [27] is computed as

$$
G_{\mathrm{SBC}} = \prod_{k=0}^{L} \left(\frac{\alpha_k}{A_k B_k}\right)^{\alpha_k},
\tag{25}
$$

where

$$
\begin{aligned}
A_k &= \sum_{\mathbf{m}\in\mathbb{Z}^2}\sum_{\mathbf{n}\in\mathbb{Z}^2} h_k[\mathbf{m}]\, h_k[\mathbf{n}]\, r[\mathbf{m}-\mathbf{n}],\\
B_k &= \alpha_k \sum_{\mathbf{n}\in\mathbb{Z}^2} g_k^2[\mathbf{n}],\\
\alpha_k &=
\begin{cases}
2^{-L} & \text{for } k = 0,\\
2^{-(L+1-k)} & \text{for } k = 1, 2, \ldots, L,
\end{cases}
\end{aligned}
\tag{26}
$$

$h_k[\mathbf{n}]$ and $g_k[\mathbf{n}]$ are the impulse responses of the equivalent analysis and synthesis filters $H_k$ and $G_k$ (given by (6)), respectively, and $r$ is the normalized autocorrelation of the input. Depending on the source image model, $r$ is given by

$$
r[n_0, n_1] =
\begin{cases}
\rho^{|n_0|+|n_1|} & \text{for separable model},\\
\rho^{\sqrt{n_0^2+n_1^2}} & \text{for isotropic model},
\end{cases}
\tag{27}
$$

where $\rho$ is the correlation coefficient (typically, $0.90 \le \rho \le 0.95$). Due to the relationship between $\{h_k[\mathbf{n}]\}$, $\{g_k[\mathbf{n}]\}$, and the lifting-filter coefficient vector $\mathbf{x}$, the coding gain is a nonlinear function of $\mathbf{x}$.

5.2. Vanishing moments

Now, let us consider the relationship between the lifting-filter coefficients and vanishing moments. For a quincunx filter bank, the number of vanishing moments is equivalent to the order of zero at $[0\ 0]^T$ or $[\pi\ \pi]^T$ in the highpass or lowpass filter frequency response, respectively. For a linear-phase filter $H$ with group delay $\mathbf{d} \in \mathbb{Z}^2$, its frequency response $h(\boldsymbol{\omega})$ can be computed by (2). The $\mathbf{m}$th-order partial derivative of its signed amplitude response $h_a(\boldsymbol{\omega})$ defined in (3) is then given by

$$
h_a^{(\mathbf{m})}(\boldsymbol{\omega}) =
\begin{cases}
(-1)^{|\mathbf{m}|/2} \displaystyle\sum_{\mathbf{n}\in\mathbb{Z}^2} h[\mathbf{n}](\mathbf{n}-\mathbf{d})^{\mathbf{m}} \cos\bigl(\boldsymbol{\omega}^T(\mathbf{n}-\mathbf{d})\bigr) & \text{for } |\mathbf{m}| \in \mathbb{Z}_e,\\[8pt]
(-1)^{(|\mathbf{m}|+1)/2} \displaystyle\sum_{\mathbf{n}\in\mathbb{Z}^2} h[\mathbf{n}](\mathbf{n}-\mathbf{d})^{\mathbf{m}} \sin\bigl(\boldsymbol{\omega}^T(\mathbf{n}-\mathbf{d})\bigr) & \text{otherwise},
\end{cases}
\tag{28}
$$

where $\mathbf{m} = [m_0\ m_1]^T$. From the above equation, it follows that when $|\mathbf{m}| \in \mathbb{Z}_o$, the $\mathbf{m}$th-order partial derivative of $h_a(\boldsymbol{\omega})$ is automatically zero at $[0\ 0]^T$ and $[\pi\ \pi]^T$. Therefore, in order to have an $N$th-order zero at $\boldsymbol{\omega} = [0\ 0]^T$, the filter coefficients need only satisfy

$$
\sum_{\mathbf{n}\in\mathbb{Z}^2} h[\mathbf{n}](\mathbf{n}-\mathbf{d})^{\mathbf{m}} = 0 \quad \forall\, |\mathbf{m}| \in \mathbb{Z}_e \text{ such that } |\mathbf{m}| < N.
\tag{29}
$$

Similarly, in order to have an $N$th-order zero at $\boldsymbol{\omega} = [\pi\ \pi]^T$, the filter coefficients need only satisfy

$$
\sum_{\mathbf{n}\in\mathbb{Z}^2} (-1)^{|\mathbf{n}-\mathbf{d}|} h[\mathbf{n}](\mathbf{n}-\mathbf{d})^{\mathbf{m}} = 0 \quad \forall\, |\mathbf{m}| \in \mathbb{Z}_e \text{ such that } |\mathbf{m}| < N.
\tag{30}
$$

Since we only need to consider the case with $|\mathbf{m}| \in \mathbb{Z}_e$ in (29) and (30), the number of linear equations is $\lceil N/2 \rceil^2$. Thus, for a filter bank to have $\tilde{N}$ dual and $N$ primal vanishing moments, the analysis filter coefficients are required to satisfy equations like those shown in (29) and (30). Since we use a lifting-based parametrization, the relationships need to be expressed in terms of the lifting-filter coefficients.
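For concreteness, the coding gain in (25) through (27) can be evaluated by brute force once the equivalent filters are available. The following is a minimal sketch, assuming that the caller supplies the equivalent analysis and synthesis impulse responses $h_k$ and $g_k$ from (6) as small 2-D arrays, ordered $k = 0, \ldots, L$; the function names and the array representation are assumptions of this example. Only tap differences enter $A_k$, so the array origin does not matter.

```python
import numpy as np

def autocorr(d0, d1, rho, model="isotropic"):
    """Normalized autocorrelation r[d0, d1] from (27)."""
    if model == "separable":
        return rho ** (abs(d0) + abs(d1))
    return rho ** np.hypot(d0, d1)

def subband_variance(h, rho, model="isotropic"):
    """A_k in (26): sum over tap pairs of h[m] h[n] r[m - n]."""
    h = np.asarray(h, dtype=float)
    taps = [(i, j, h[i, j]) for i, j in zip(*np.nonzero(h))]
    return sum(hm * hn * autocorr(i - p, j - q, rho, model)
               for i, j, hm in taps for p, q, hn in taps)

def coding_gain_db(analysis, synthesis, rho=0.95, model="isotropic"):
    """10 log10(G_SBC) from (25) for L = len(analysis) - 1 levels."""
    L = len(analysis) - 1
    log_gain = 0.0
    for k, (h, g) in enumerate(zip(analysis, synthesis)):
        alpha = 2.0 ** (-L) if k == 0 else 2.0 ** (-(L + 1 - k))
        A = subband_variance(h, rho, model)
        B = alpha * np.sum(np.asarray(g, dtype=float) ** 2)
        log_gain += alpha * np.log10(alpha / (A * B))
    return 10.0 * log_gain
```

As a sanity check, a one-level "lazy" filter bank in which every equivalent filter is a unit impulse gives $A_k = 1$ and $B_k = \alpha_k$, so each factor in (25) equals one and the coding gain is 0 dB.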
For a quincunx filter bank constructed with two lifting filters $A_1$ and $A_2$ as shown in Figure 4(a) with $\lambda = 1$, the constraints on vanishing moments form a linear system of equations in the lifting-filter coefficients. In order for this filter bank to have $\tilde{N}$ dual and $N$ primal vanishing moments, the impulse responses $a_1[\mathbf{n}]$ and $a_2[\mathbf{n}]$ of the lifting filters $A_1$ and $A_2$, respectively, should satisfy

$$
\sum_{\mathbf{n}\in\mathbb{Z}^2} a_1[\mathbf{n}](-\mathbf{n})^{\mathbf{m}} = -\boldsymbol{\tau}_1^{\mathbf{m}}, \quad \forall\, \mathbf{m} \in (\mathbb{Z}^*)^2 \text{ with } |\mathbf{m}| < \tilde{N},
\tag{31}
$$

$$
\sum_{\mathbf{n}\in\mathbb{Z}^2} a_2[\mathbf{n}](-\mathbf{n})^{\mathbf{m}} = \tfrac{1}{2}\boldsymbol{\tau}_2^{\mathbf{m}}, \quad \forall\, \mathbf{m} \in (\mathbb{Z}^*)^2 \text{ with } |\mathbf{m}| < N,
\tag{32}
$$

where $\boldsymbol{\tau}_1 = [1/2\ \ 1/2]^T$ and $\boldsymbol{\tau}_2 = -\boldsymbol{\tau}_1 = [-1/2\ \ {-1/2}]^T$ [18]. The total number of equations in (31) and (32) combined is $\binom{\tilde{N}+1}{2} + \binom{N+1}{2} = \bigl((\tilde{N}+1)\tilde{N} + (N+1)N\bigr)/2$.

The above results on vanishing moments can be applied to the filter banks from Theorem 1, where the lifting filters have linear phase. The support region of $A_1$ is $\{-l_{1,0}, -l_{1,0}+1, \ldots, l_{1,0}-1\} \times \{-l_{1,1}, -l_{1,1}+1, \ldots, l_{1,1}-1\}$ for some $l_{1,0}, l_{1,1} \in \mathbb{Z}$. Then, (31) can be rewritten as

$$
\sum_{\mathbf{n}\in\{0,\ldots,l_{1,0}-1\}\times\{-l_{1,1},\ldots,l_{1,1}-1\}} a_1[\mathbf{n}]\bigl((\mathbf{n}+\mathbf{1})^{\mathbf{m}} + (-\mathbf{n})^{\mathbf{m}}\bigr) = -2^{-|\mathbf{m}|},
\tag{33}
$$

for $\mathbf{m} \in (\mathbb{Z}^*)^2$ and $|\mathbf{m}| < \tilde{N}$. As previously discussed, we only need to consider the case with $|\mathbf{m}| \in \mathbb{Z}_e$. Therefore, the number of equations in (33) can be reduced to $\lceil \tilde{N}/2 \rceil^2$. If we use $\mathbf{a}_1$ to denote the independent coefficients of $A_1$, the set of linear equations in (33) can be expressed in a more compact form as

$$
\mathbf{A}_1 \mathbf{a}_1 = \mathbf{b}_1,
\tag{34}
$$

where $\mathbf{A}_1$ is an $M_0 \times M_1$ matrix with $M_0 = \lceil \tilde{N}/2 \rceil^2$ and $M_1 = 2 l_{1,0} l_{1,1}$, and $\mathbf{b}_1$ is a vector with $\lceil \tilde{N}/2 \rceil^2$ elements. Each element of $\mathbf{A}_1$ assumes the form $(\mathbf{n}+\mathbf{1})^{\mathbf{m}} + (-\mathbf{n})^{\mathbf{m}}$, and each element of $\mathbf{b}_1$ assumes the form $-2^{-|\mathbf{m}|}$.

Similarly, because of the linear-phase property of the second lifting filter $A_2$, (32) becomes

$$
\sum_{\mathbf{n}\in\{1,\ldots,l_{2,0}\}\times\{-l_{2,1}+1,\ldots,l_{2,1}\}} a_2[\mathbf{n}]\bigl((\mathbf{n}-\mathbf{1})^{\mathbf{m}} + (-\mathbf{n})^{\mathbf{m}}\bigr) = -(-2)^{-|\mathbf{m}|-1},
\tag{35}
$$

for $\mathbf{m} \in (\mathbb{Z}^*)^2$, $|\mathbf{m}| \in \mathbb{Z}_e$, and $|\mathbf{m}| < N$. With $\mathbf{a}_2$ denoting the $2 l_{2,0} l_{2,1}$ independent coefficients of $A_2$, (35) can be rewritten as

$$
\mathbf{A}_2 \mathbf{a}_2 = \mathbf{b}_2,
\tag{36}
$$

where $\mathbf{A}_2$ is an $M_0 \times M_1$ matrix with $M_0 = \lceil N/2 \rceil^2$ and $M_1 = 2 l_{2,0} l_{2,1}$, and $\mathbf{b}_2$ is a vector with $\lceil N/2 \rceil^2$ elements. Elements of $\mathbf{A}_2$ and $\mathbf{b}_2$ assume the forms of $(\mathbf{n}-\mathbf{1})^{\mathbf{m}} + (-\mathbf{n})^{\mathbf{m}}$ and $-(-2)^{-|\mathbf{m}|-1}$, respectively.

Combining (34) and (36), we have the linear system of equations involving the lifting-filter coefficient vector $\mathbf{x}$ given by

$$
\mathbf{A}\mathbf{x} = \mathbf{b},
\tag{37}
$$

where $\mathbf{A} = \begin{bmatrix} \mathbf{A}_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{A}_2 \end{bmatrix}$, $\mathbf{x} = \begin{bmatrix} \mathbf{a}_1 \\ \mathbf{a}_2 \end{bmatrix}$, and $\mathbf{b} = \begin{bmatrix} \mathbf{b}_1 \\ \mathbf{b}_2 \end{bmatrix}$. The number of equations in (37) is $\lceil \tilde{N}/2 \rceil^2 + \lceil N/2 \rceil^2$.

It is worth noting that for a linear-phase filter bank with two lifting steps, the analysis filter frequency responses have some special properties if this filter bank has at least one dual vanishing moment. In particular, we have the result below.

Theorem 2 (filter banks with two lifting steps). Consider a filter bank with two lifting steps satisfying Theorem 1. Let $h_0(\boldsymbol{\omega})$ and $h_1(\boldsymbol{\omega})$ be the frequency responses of the lowpass and highpass analysis filters $H_0$ and $H_1$, respectively. If this filter bank has at least one dual vanishing moment, then

$$
h_0(0, 0) = 1, \tag{38a}
$$

$$
h_1(\pi, \pi) = -2 \tag{38b}
$$

(i.e., the DC gain of the lowpass analysis filter $H_0$ is one and the Nyquist gain of the highpass analysis filter $H_1$ is two).

A proof of the above theorem is omitted here, but again can be found in the first author's thesis [26].

In the preceding discussion for filter banks with two lifting steps, it is assumed that the number of dual vanishing moments is no less than that of the primal ones (i.e., $\tilde{N} \ge N$). This is desirable in the case of image coding, as the dual vanishing moments are more important than the primal ones for reducing the number of nonzero coefficients in the highpass subbands by annihilating polynomials. Furthermore, the presence of dual vanishing moments usually leads to smoother synthesis scaling and wavelet functions, which help to improve the subjective quality of the reconstructed images.

5.3. Frequency response

For image coding, we desire analysis filters with good frequency selectivity. Since a lifting-based parametrization of quincunx filter banks is employed, we consider the relationship between analysis filter frequency selectivity and the lifting filter coefficients.

To quantify the frequency selectivity of the filter bank, we measure the deviation in frequency response between an analysis filter $H$ and an ideal filter $H_d$. In particular, we define the weighted frequency response error function $e_h$ of $H$ as

$$
e_h = \int_{[-\pi,\pi)^2} W(\boldsymbol{\omega})\,\bigl|h_a(\boldsymbol{\omega}) - D\,h_d(\boldsymbol{\omega})\bigr|^2\, d\boldsymbol{\omega},
\tag{39}
$$

where $W(\boldsymbol{\omega})$ is a weighting function defined on $[-\pi,\pi)^2$, $h_a(\boldsymbol{\omega})$ is the signed amplitude response of $H$ as defined by (3), $h_d(\boldsymbol{\omega})$ is the frequency response of the ideal filter $H_d$, and $D$ is a scaling factor. In order for the filter $H$ to approximate
Figure 5: Ideal frequency responses of quincunx filter banks for the (a) lowpass filters and (b) highpass filters, where the shaded and unshaded areas represent the passband and stopband, respectively.

Figure 6: Weighting function for a highpass filter with diamond-shaped stopband.

the ideal filter, the frequency response error function $e_h$ is required to satisfy

$$
e_h \le \delta_h,
\tag{40}
$$

where $\delta_h$ is a prescribed upper bound on the error.

For a quincunx filter bank with sampling matrix $\mathbf{M} = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$, the shape of filter passband is not unique [3, 17]. Herein, in order to match the human visual system, we use diamond-shaped ideal passband/stopband for the analysis and synthesis filters [28]. Figure 5(a) illustrates the ideal lowpass filter frequency response given by

$$
h_{0d}(\boldsymbol{\omega}) =
\begin{cases}
1 & \text{for } |\omega_0 \pm \omega_1| \le \pi,\\
0 & \text{otherwise},
\end{cases}
\tag{41}
$$

and Figure 5(b) depicts the ideal highpass filter frequency response given by

$$
h_{1d}(\boldsymbol{\omega}) =
\begin{cases}
1 & \text{for } |\omega_0 \pm \omega_1| \ge \pi,\ \omega_0, \omega_1 \in [-\pi, \pi),\\
0 & \text{otherwise}.
\end{cases}
\tag{42}
$$

The weighting function $W(\boldsymbol{\omega})$ is used to control the relative importance of the passband and stopband. For a quincunx highpass filter with a diamond-shaped stopband, $W(\boldsymbol{\omega})$ is defined as

$$
W(\boldsymbol{\omega}) =
\begin{cases}
1 & \text{for passband } |\omega_0 \pm \omega_1| \ge \pi + \omega_p,\ \omega_0, \omega_1 \in [-\pi, \pi),\\
\gamma & \text{for stopband } |\omega_0 \pm \omega_1| \le \omega_s,\\
0 & \text{otherwise (i.e., transition band)},
\end{cases}
\tag{43}
$$

where $\gamma \ge 0$. By adjusting the value of $\gamma$, we can control the filter's performance in the stopband relative to the passband. In the case of highpass filters, for example, the weighting function is as depicted in Figure 6. The weighting function for a quincunx lowpass filter is defined in a similar way (i.e., with the roles of passband and stopband reversed in (43)).

Consider a filter bank as shown in Figure 4 with two lifting filters $A_1$ and $A_2$ satisfying Theorem 1. From (24), we obtain the frequency responses of the analysis filters as

$$
\begin{aligned}
\begin{bmatrix} h_0(\boldsymbol{\omega}) \\ h_1(\boldsymbol{\omega}) \end{bmatrix}
&=
\begin{bmatrix} 1 & e^{-j\omega_0}\mathbf{x}^T\mathbf{E}_2^T\mathbf{v}_2 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ e^{j\omega_0}\mathbf{x}^T\mathbf{E}_1^T\mathbf{v}_1 & 1 \end{bmatrix}
\begin{bmatrix} 1 \\ e^{j\omega_0} \end{bmatrix}\\
&=
\begin{bmatrix}
1 + \mathbf{x}^T\mathbf{E}_2^T\mathbf{v}_2 + \mathbf{x}^T\mathbf{E}_2^T\mathbf{v}_2\mathbf{v}_1^T\mathbf{E}_1\mathbf{x}\\
e^{j\omega_0}\bigl(1 + \mathbf{x}^T\mathbf{E}_1^T\mathbf{v}_1\bigr)
\end{bmatrix}.
\end{aligned}
\tag{44}
$$

Then, the signed amplitude response $h_{1a}(\boldsymbol{\omega})$ of $H_1$ is

$$
h_{1a}(\boldsymbol{\omega}) = 1 + \mathbf{x}^T\mathbf{E}_1^T\mathbf{v}_1.
\tag{45}
$$

The frequency response error function of the highpass analysis filter $H_1$ is computed as

$$
e_{h_1} = \int_{[-\pi,\pi)^2} W(\boldsymbol{\omega})\,\bigl|h_{1a}(\boldsymbol{\omega}) - D\,h_{1d}(\boldsymbol{\omega})\bigr|^2\, d\boldsymbol{\omega},
\tag{46}
$$

where $W(\boldsymbol{\omega})$ is the weighting function defined in (43), $h_{1d}(\boldsymbol{\omega})$ is the ideal frequency response of a quincunx highpass filter defined in (42), and the scaling factor $D$ is chosen to be $D = 2$ in accordance with (38b). The frequency response error function in (46) can be expressed as the quadratic in the lifting-filter coefficient vector $\mathbf{x}$ given by

$$
e_{h_1} = \mathbf{x}^T\mathbf{H}_x\mathbf{x} + \mathbf{x}^T\mathbf{s}_x + c_x,
\tag{47}
$$

where

$$
\begin{aligned}
\mathbf{H}_x &= \int_{[-\pi,\pi)^2} W(\boldsymbol{\omega})\,\mathbf{E}_1^T\mathbf{v}_1\mathbf{v}_1^T\mathbf{E}_1\, d\boldsymbol{\omega},\\
\mathbf{s}_x &= \int_{[-\pi,\pi)^2} 2W(\boldsymbol{\omega})\,\mathbf{E}_1^T\mathbf{v}_1\bigl(1 - 2h_{1d}(\boldsymbol{\omega})\bigr)\, d\boldsymbol{\omega},\\
c_x &= \int_{[-\pi,\pi)^2} W(\boldsymbol{\omega})\bigl(1 - 2h_{1d}(\boldsymbol{\omega})\bigr)^2\, d\boldsymbol{\omega},
\end{aligned}
\tag{48}
$$

and $\mathbf{H}_x$ is a positive semidefinite matrix. Substituting (47) into the constraint on the frequency response (40), we obtain a quadratic inequality involving $\mathbf{x}$ as

$$
\mathbf{x}^T\mathbf{H}_x\mathbf{x} + \mathbf{x}^T\mathbf{s}_x + c_x - \delta_h \le 0.
\tag{49}
$$
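The quadratic form (47) with the terms in (48) can be approximated numerically by replacing the integrals with a midpoint rule on a uniform grid. The sketch below makes several assumptions that are not in the text: it works directly with $\mathbf{a}_1$ (i.e., $\mathbf{E}_1$ is dropped, since $h_{1a}$ in (45) depends only on the first lifting filter), it reads $|\omega_0 \pm \omega_1| \ge c$ as $|\omega_0| + |\omega_1| \ge c$, and it takes the passband and stopband edges as the hypothetical parameters `pass_edge` and `stop_edge` rather than committing to particular values of $\omega_p$ and $\omega_s$.

```python
import numpy as np

def grid_quantities(l0, l1, pass_edge, stop_edge, gamma=1.0, grid=64):
    """Sample v_1 from (14), W from (43), and 1 - 2*h_1d from (42) on a midpoint grid."""
    step = 2.0 * np.pi / grid
    w = -np.pi + step * (np.arange(grid) + 0.5)
    w0, w1 = np.meshgrid(w, w, indexing="ij")
    radius = np.abs(w0) + np.abs(w1)          # equals max(|w0 + w1|, |w0 - w1|)
    h1d = (radius >= np.pi).astype(float)
    W = np.where(radius >= pass_edge, 1.0,
                 np.where(radius <= stop_edge, gamma, 0.0))
    n = np.arange(2 * l0 * l1)
    n0, n1 = n // (2 * l1), n % (2 * l1) - l1  # index map (10)
    V = 2.0 * np.cos(w0.reshape(-1, 1) * (n0 + n1 + 1)
                     + w1.reshape(-1, 1) * (n0 - n1))
    return V, W.ravel(), (1.0 - 2.0 * h1d).ravel(), step ** 2

def quadratic_terms(V, W, r, dA):
    """Discretized H_x, s_x, c_x from (48), expressed in terms of a_1."""
    H = (V * W[:, None]).T @ V * dA
    s = 2.0 * (V.T @ (W * r)) * dA
    c = np.sum(W * r ** 2) * dA
    return H, s, c

def direct_error(a, V, W, r, dA):
    """Discretized (46) with D = 2: sum of W * (1 + a^T v_1 - 2 h_1d)^2."""
    return np.sum(W * (V @ a + r) ** 2) * dA
```

Because the same grid is used for both evaluations, `a @ H @ a + a @ s + c` agrees with `direct_error` up to rounding, and `H` is positive semidefinite by construction, mirroring the statement after (48).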
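5.4. Design problem formulation

Consider a filter bank as shown in Figure 4(a) with two lifting steps. The design of such a filter bank with all of the desirable properties (i.e., PR, linear-phase, high coding gain, good frequency selectivity, and certain vanishing-moment properties) can be formulated as a constrained optimization problem. We employ the lifting-based parametrization introduced in Theorem 1. In this way, the PR and linear-phase conditions are automatically satisfied. We then maximize the coding gain subject to a set of constraints, which are chosen to ensure that the desired vanishing moment and frequency selectivity conditions are met. In what follows, we will show more precisely how this design problem can be formulated as a second-order cone programming (SOCP) problem.

In an SOCP problem, a linear function is minimized subject to a set of second-order cone constraints [29]. In other words, we have a problem of the following form:

$$
\begin{aligned}
&\text{minimize} && \mathbf{f}^T\mathbf{x}\\
&\text{subject to} && \bigl\|\mathbf{F}_i^T\mathbf{x} + \mathbf{c}_i\bigr\| \le \mathbf{f}_i^T\mathbf{x} + d_i \quad \text{for } i = 1, \ldots, q,
\end{aligned}
\tag{50}
$$

where $\mathbf{x} \in \mathbb{R}^n$ is the design vector containing $n$ free variables, and $\mathbf{f} \in \mathbb{R}^n$, $\mathbf{F}_i \in \mathbb{R}^{n\times m_i}$, $\mathbf{c}_i \in \mathbb{R}^{m_i}$, $\mathbf{f}_i \in \mathbb{R}^n$, and $d_i \in \mathbb{R}$. The constraint $\|\mathbf{F}_i^T\mathbf{x} + \mathbf{c}_i\| \le \mathbf{f}_i^T\mathbf{x} + d_i$ is called a second-order cone constraint.

Consider a filter bank satisfying Theorem 1 with two lifting filters $A_1$ and $A_2$, having support sizes of $2l_{1,0} \times 2l_{1,1}$ and $2l_{2,0} \times 2l_{2,1}$, respectively. We use $\mathbf{x}$ to denote the vector consisting of the $2l_{1,0}l_{1,1} + 2l_{2,0}l_{2,1}$ independent lifting-filter coefficients defined in (22). As explained previously, in terms of the lifting-filter coefficient vector $\mathbf{x}$, the constraint on vanishing moments is linear and the constraint on the frequency response of the highpass analysis filter is quadratic.

From Section 5.2, we know that in order for a filter bank to have $N$ primal and $\tilde{N}$ dual vanishing moments, $\mathbf{x}$ needs to be the solution of a system of $\lceil \tilde{N}/2 \rceil^2 + \lceil N/2 \rceil^2$ linear equations given by

$$
\mathbf{A}\mathbf{x} = \mathbf{b}.
\tag{51}
$$

In (51), $\mathbf{A} \in \mathbb{R}^{m\times n}$ with rank $r$ and $\mathbf{b} \in \mathbb{R}^{m\times 1}$, where $m = \lceil \tilde{N}/2 \rceil^2 + \lceil N/2 \rceil^2$, $n = 2l_{1,0}l_{1,1} + 2l_{2,0}l_{2,1}$, and $r \le \min\{m, n\}$. The system is underdetermined when there are enough lifting-filter coefficients such that $m < n$. In what follows, we assume that the system is underdetermined so that our eventual optimization problem will have a feasible region containing more than one point. Let the singular value decomposition (SVD) of $\mathbf{A}$ be $\mathbf{A} = \mathbf{U}\mathbf{S}\mathbf{V}^T$. All of the solutions to (51) can be parameterized as

$$
\mathbf{x} = \underbrace{\mathbf{A}^+\mathbf{b}}_{\mathbf{x}_s} + \mathbf{V}_r\boldsymbol{\phi} = \mathbf{x}_s + \mathbf{V}_r\boldsymbol{\phi},
\tag{52}
$$

where $\mathbf{A}^+$ is the Moore-Penrose pseudoinverse of $\mathbf{A}$, $\mathbf{V}_r = [\mathbf{v}_{r+1}\ \mathbf{v}_{r+2}\ \cdots\ \mathbf{v}_n]$ is a matrix composed of the last $n - r$ columns of $\mathbf{V}$, and $\boldsymbol{\phi}$ is an arbitrary $(n - r)$-dimensional vector. Henceforth, we will use $\boldsymbol{\phi}$ as the design vector instead of $\mathbf{x}$. Thus, the vanishing-moment condition is automatically satisfied for any choice of $\boldsymbol{\phi}$ and the number of free variables involved is reduced from $n$ to $n - r$.

The design objective is to maximize the coding gain $G_{\mathrm{SBC}}$ of an $L$-level octave-band quincunx filter bank, which is computed by (25) and can be expressed as a nonlinear function of the design vector $\boldsymbol{\phi}$. Let $G = -10\log_{10} G_{\mathrm{SBC}}$. Then, the problem of maximizing $G_{\mathrm{SBC}}$ is equivalent to minimizing $G$. Although taking the logarithm helps to improve the numerical stability of the optimization algorithm and reduces the nonlinearity in $G$, the direct minimization of $G$ remains a very difficult task. Our design strategy is that, for a given parameter vector $\boldsymbol{\phi}$, we seek a small perturbation $\boldsymbol{\delta}_\phi$ such that $G(\boldsymbol{\phi} + \boldsymbol{\delta}_\phi)$ is reduced relative to $G(\boldsymbol{\phi})$. Because $\boldsymbol{\delta}_\phi$ is small, we can write the quadratic and linear approximations of $G(\boldsymbol{\phi} + \boldsymbol{\delta}_\phi)$, respectively, as

$$
G(\boldsymbol{\phi} + \boldsymbol{\delta}_\phi) \approx G(\boldsymbol{\phi}) + \mathbf{g}^T\boldsymbol{\delta}_\phi + \tfrac{1}{2}\boldsymbol{\delta}_\phi^T\mathbf{Q}\boldsymbol{\delta}_\phi,
\tag{53}
$$

$$
G(\boldsymbol{\phi} + \boldsymbol{\delta}_\phi) \approx G(\boldsymbol{\phi}) + \mathbf{g}^T\boldsymbol{\delta}_\phi,
\tag{54}
$$

where $\mathbf{g}$ and $\mathbf{Q}$ are, respectively, the gradient and the Hessian of $G(\boldsymbol{\phi})$ at the point $\boldsymbol{\phi}$. Having obtained such a $\boldsymbol{\delta}_\phi$ (subject to some additional constraints to be described shortly), the parameter vector $\boldsymbol{\phi}$ is updated to $\boldsymbol{\phi} + \boldsymbol{\delta}_\phi$. This iterative process continues until the reduction in $G$ (i.e., $|G(\boldsymbol{\phi} + \boldsymbol{\delta}_\phi) - G(\boldsymbol{\phi})|$) becomes less than a prescribed tolerance $\varepsilon$.

Now, consider the constraint on the frequency response. In Section 5.3, we showed that for filter banks constructed with two lifting steps, the frequency response error function $e_{h_1}$ of the highpass analysis filter $H_1$ is a quadratic polynomial in $\mathbf{x}$ as given by (47). Substituting (52) into (47), we have

$$
e_{h_1} = \boldsymbol{\phi}^T\mathbf{H}_\phi\boldsymbol{\phi} + \boldsymbol{\phi}^T\mathbf{s}_\phi + c_\phi,
\tag{55}
$$

where

$$
\begin{aligned}
\mathbf{H}_\phi &= \mathbf{V}_r^T\mathbf{H}_x\mathbf{V}_r,\\
\mathbf{s}_\phi &= \mathbf{V}_r^T\bigl(\mathbf{H}_x + \mathbf{H}_x^T\bigr)\mathbf{x}_s + \mathbf{V}_r^T\mathbf{s}_x,\\
c_\phi &= \mathbf{x}_s^T\mathbf{H}_x\mathbf{x}_s + \mathbf{x}_s^T\mathbf{s}_x + c_x,
\end{aligned}
\tag{56}
$$

and $\mathbf{H}_x$, $\mathbf{s}_x$, and $c_x$ are given in (48). Moreover, it follows from the fact that $\mathbf{H}_x$ is positive semidefinite that $\mathbf{H}_\phi$ is also positive semidefinite. Now, let us replace $\boldsymbol{\phi}$ by $\boldsymbol{\phi}_k + \boldsymbol{\delta}_\phi$ and let the SVD of $\mathbf{H}_\phi$ be given by

$$
\mathbf{H}_\phi = \mathbf{U}_H\boldsymbol{\Sigma}\mathbf{V}_H^T.
\tag{57}
$$

Then, (55) can also be written as

$$
e_{h_1} = \bigl\|\tilde{\mathbf{H}}_k\boldsymbol{\delta}_\phi + \tilde{\mathbf{s}}_k\bigr\|^2 + \tilde{c}_k,
\tag{58}
$$

and the constraint (40) becomes the second-order cone constraint

$$
\bigl\|\tilde{\mathbf{H}}_k\boldsymbol{\delta}_\phi + \tilde{\mathbf{s}}_k\bigr\|^2 \le \delta_{h_1} - \tilde{c}_k,
\tag{59}
$$

where

$$
\begin{aligned}
\tilde{\mathbf{H}}_k &= \boldsymbol{\Sigma}^{1/2}\mathbf{U}_H^T,\\
\tilde{\mathbf{s}}_k &= \tfrac{1}{2}\tilde{\mathbf{H}}_k^{-T}\bigl(2\mathbf{H}_\phi\boldsymbol{\phi}_k + \mathbf{s}_\phi\bigr),\\
\tilde{c}_k &= \boldsymbol{\phi}_k^T\mathbf{H}_\phi\boldsymbol{\phi}_k + \boldsymbol{\phi}_k^T\mathbf{s}_\phi + c_\phi - \bigl\|\tilde{\mathbf{s}}_k\bigr\|^2.
\end{aligned}
\tag{60}
$$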
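The vanishing-moment system (51) and the parametrization (52) can be illustrated on a small case. The sketch below builds the rows of (33) and (35) for even $|\mathbf{m}|$, stacks them as in (37), and then forms $\mathbf{x}_s$ and $\mathbf{V}_r$ with NumPy's SVD and pseudoinverse. The function names, the rank tolerance, and the example sizes are assumptions made for illustration; the column ordering follows the index maps (10) and (15).

```python
import numpy as np
from itertools import product

def moment_system(l0, l1, order, first=True):
    """Rows of (33) (first=True) or (35) (first=False) for even |m| < order."""
    if first:
        support = list(product(range(0, l0), range(-l1, l1)))
        shift = 1
    else:
        support = list(product(range(1, l0 + 1), range(-l1 + 1, l1 + 1)))
        shift = -1
    rows, rhs = [], []
    for m0, m1 in product(range(order), repeat=2):
        m = m0 + m1
        if m < order and m % 2 == 0:
            rows.append([(n0 + shift) ** m0 * (n1 + shift) ** m1
                         + (-n0) ** m0 * (-n1) ** m1 for n0, n1 in support])
            # Right-hand sides: -2^{-|m|} for (33), -(-2)^{-|m|-1} for (35).
            rhs.append(-(2.0 ** (-m)) if first else -((-2.0) ** (-m - 1)))
    return np.array(rows, dtype=float), np.array(rhs)

def affine_solution_set(A, b, tol=1e-10):
    """Return (x_s, V_r) such that x = x_s + V_r @ phi solves A x = b, as in (52)."""
    _, S, Vt = np.linalg.svd(A)           # full V^T, shape (n, n)
    r = int(np.sum(S > tol * S.max()))    # numerical rank
    return np.linalg.pinv(A) @ b, Vt[r:].T

# Example: 2x2 lifting-filter supports (l = 1) with one dual and one primal pair of
# conditions (order 2 for each filter); these sizes are chosen only for illustration.
A1, b1 = moment_system(1, 1, 2, first=True)
A2, b2 = moment_system(1, 1, 2, first=False)
A = np.block([[A1, np.zeros_like(A2)], [np.zeros_like(A1), A2]])
b = np.concatenate([b1, b2])
x_s, V_r = affine_solution_set(A, b)
```

In this example $m = 2$ and $n = 4$, so $\boldsymbol{\phi}$ has $n - r = 2$ free entries, and any choice of $\boldsymbol{\phi}$ yields coefficients that satisfy the moment equations exactly.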
Based on the preceding discussions, we now show how to employ the SOCP technique to solve the problem of maximizing the coding gain $G_{\mathrm{SBC}}$, or equivalently minimizing $G$, with the vanishing-moment constraint $\mathbf{A}\mathbf{x} = \mathbf{b}$ as in (51) and the frequency response constraint $e_{h_1} \le \delta_{h_1}$ as in (40). This problem can be solved via Algorithm 1.

Algorithm 1: Two-lifting-step case.

This iterative algorithm consists of the following steps (where $k$ denotes the iteration number indexed from zero).

Step (1)
Compute $\mathbf{A}$ and $\mathbf{b}$ in (37) for the desired numbers of vanishing moments, and calculate $\mathbf{H}_\phi$, $\mathbf{s}_\phi$, and $c_\phi$ in (55). Then, select an initial point $\boldsymbol{\phi}_0$. This point can be chosen randomly, or chosen to be a quincunx filter bank proposed in [18]. The vanishing-moment condition is satisfied, and because of the way in which we choose the upper bound $\delta_{h_1}$ for the frequency response error function (to be discussed later), $\boldsymbol{\phi}_0$ will not violate the frequency response constraint. In this way, the initial point is in the feasible region.

Step (2)
For the $k$th iteration, at the point $\boldsymbol{\phi}_k$, compute the gradient $\mathbf{g}$ of $G(\boldsymbol{\phi})$ in (54), and calculate $\tilde{\mathbf{H}}_k$, $\tilde{\mathbf{s}}_k$, and $\tilde{c}_k$ in (59). Then, solve the SOCP problem given by:

$$
\begin{aligned}
&\text{minimize} && \mathbf{g}^T\boldsymbol{\delta}_\phi\\
&\text{subject to} && \bigl\|\tilde{\mathbf{H}}_k\boldsymbol{\delta}_\phi + \tilde{\mathbf{s}}_k\bigr\| \le \sqrt{\delta_{h_1} - \tilde{c}_k},\\
& && \bigl\|\boldsymbol{\delta}_\phi\bigr\| \le \beta,
\end{aligned}
\tag{p1}
$$

where $\beta$ is a given small value used to ensure that the solution is within the vicinity of $\boldsymbol{\phi}_k$. Then, update $\boldsymbol{\phi}_k$ by $\boldsymbol{\phi}_{k+1} = \boldsymbol{\phi}_k + \gamma\boldsymbol{\delta}_\phi$, where $\gamma$ is either chosen as one or determined by a line search explained in more detail later. A number of software packages are available for solving SOCP problems. In our work, for example, we use SeDuMi [30].

Step (3)
If $|G(\boldsymbol{\phi}_{k+1}) - G(\boldsymbol{\phi}_k)| < \varepsilon$, output $\boldsymbol{\phi}^* = \boldsymbol{\phi}_{k+1}$, compute $\mathbf{x}^* = \mathbf{x}_s + \mathbf{V}_r\boldsymbol{\phi}^*$, and stop. Otherwise, go to Step (2).

The vector $\mathbf{x}^*$ output by Algorithm 1 is then the optimal solution to this problem. The filter bank constructed from the lifting-filter coefficient vector $\mathbf{x}^*$ has high coding gain, good frequency selectivity, and the desired vanishing-moment properties (as well as PR and linear phase).

Two additional comments are now in order concerning the SOCP problem (p1) in the second step of the iterative algorithm (Algorithm 1). First, the choice of $\beta$ is critical to the success of the algorithm. It should be chosen such that

$$
\mathbf{g}^T\boldsymbol{\delta} \approx G(\boldsymbol{\phi} + \boldsymbol{\delta}) - G(\boldsymbol{\phi}) \quad \text{for } \|\boldsymbol{\delta}\| = \beta.
\tag{61}
$$

If $\beta$ is too large, the linear approximation (54) is less accurate, resulting in the linear term $\mathbf{g}^T\boldsymbol{\delta}_\phi$ not correctly reflecting the actual reduction in $G$. If $\beta$ is too small, in the $k$th iteration, the solution is restricted to an unnecessarily small region around $\boldsymbol{\phi}_k$, causing points outside this region which may provide a greater reduction in $G$ to be excluded. For this reason, we incorporate a line search in Step (2) to find a better solution along the direction of $\boldsymbol{\delta}_\phi$. We first evaluate $G$ at $N_0$ equally spaced points between $\boldsymbol{\phi}_k$ and $\boldsymbol{\phi}_k + \alpha\boldsymbol{\delta}_\phi$ along the direction of $\boldsymbol{\delta}_\phi$ for some $\alpha \ge 1$, including the point $\boldsymbol{\phi}_k + \boldsymbol{\delta}_\phi$. Then, we use the point $\boldsymbol{\phi}_k^*$ corresponding to the minimal $G$ to select $\gamma$. By including a line search, in each iteration the reduction in $G$ is at least as large as the reduction obtained without the line search. This makes the algorithm converge in fewer iterations. The choice of $\alpha$ depends on the choice of $\beta$. When $\beta$ is large, we can choose $\alpha = 1$. When $\beta$ is small, we can choose $\alpha \ge 1$. Note that a greater value of $\alpha$ may imply more evaluations of the coding gain function $G$ in each iteration.

The second comment about Step (2) concerns the choice of the upper bound $\delta_{h_1}$ of the frequency response error function in the SOCP problem (p1). If $\delta_{h_1}$ is too small, the feasible region of the SOCP problem may be an empty set, especially for designs starting from a random initial point. Therefore, for the $k$th iteration, we choose $\delta_{h_1}$ to be a scaled version of the error function $e_{h_1}$ evaluated at $\boldsymbol{\phi}_k$. That is, we select

$$
\delta_{h_1} = d\bigl(\boldsymbol{\phi}_k^T\mathbf{H}_\phi\boldsymbol{\phi}_k + \boldsymbol{\phi}_k^T\mathbf{s}_\phi + c_\phi\bigr),
\tag{62}
$$

where $0 < d \le 1$ is a scaling factor. In this way, the error $e_{h_1}$ is reduced after each iteration, and the frequency response of the highpass analysis filter $H_1$ improves gradually with each iteration.

5.5. Design algorithm with Hessian

In Algorithm 1, a linear approximation (54) of the coding gain function $G$ is employed. This necessitates that the perturbation $\boldsymbol{\delta}_\phi$ be located in a small region. For this design problem, we can instead use the quadratic approximation in (53). In this way, the approximation accuracy can be improved, and the solution can be sought in a larger region. Algorithm 1 can be adapted to utilize the quadratic approximation with some minor changes to the SOCP problem in each iteration. In Step (2), we minimize $\mathbf{g}^T\boldsymbol{\delta}_\phi + (1/2)\boldsymbol{\delta}_\phi^T\mathbf{Q}\boldsymbol{\delta}_\phi$ instead of $\mathbf{g}^T\boldsymbol{\delta}_\phi$ in (p1). That is, we seek a solution to the following problem:

$$
\begin{aligned}
&\text{minimize} && \mathbf{g}^T\boldsymbol{\delta}_\phi + \tfrac{1}{2}\boldsymbol{\delta}_\phi^T\mathbf{Q}\boldsymbol{\delta}_\phi\\
&\text{subject to} && \bigl\|\tilde{\mathbf{H}}\boldsymbol{\delta}_\phi + \tilde{\mathbf{s}}\bigr\| \le \sqrt{\delta_{h_1} - \tilde{c}},\\
& && \bigl\|\boldsymbol{\delta}_\phi\bigr\| \le \beta.
\end{aligned}
\tag{63}
$$

Let the SVD of $(1/2)\mathbf{Q}$ be $(1/2)\mathbf{Q} = \mathbf{U}_Q\boldsymbol{\Sigma}_Q\mathbf{V}_Q^T$. When $\mathbf{Q}$ is positive semidefinite, we can rewrite the objective function as

$$
\mathbf{g}^T\boldsymbol{\delta}_\phi + \tfrac{1}{2}\boldsymbol{\delta}_\phi^T\mathbf{Q}\boldsymbol{\delta}_\phi = \bigl\|\tilde{\mathbf{Q}}\boldsymbol{\delta}_\phi + \tilde{\mathbf{s}}_Q\bigr\|^2 + \tilde{c}_Q,
\tag{64}
$$

where

$$
\tilde{\mathbf{Q}} = \boldsymbol{\Sigma}_Q^{1/2}\mathbf{U}_Q^T, \qquad
\tilde{\mathbf{s}}_Q = \tfrac{1}{2}\tilde{\mathbf{Q}}^{-T}\mathbf{g}, \qquad
\tilde{c}_Q = -\tilde{\mathbf{s}}_Q^T\tilde{\mathbf{s}}_Q.
\tag{65}
$$
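The line search used in Step (2) of Algorithm 1 amounts to sampling $G$ along the direction $\boldsymbol{\delta}_\phi$ and keeping the best step length. A minimal sketch follows, assuming that `G` is any callable returning the objective value and that the $N_0$ sample points are placed at $\gamma = \alpha/N_0, 2\alpha/N_0, \ldots, \alpha$, with $\gamma = 1$ added if it is not already on the grid. The exact placement of the samples is not specified in the text, so this spacing and the names `line_search_step` and `num_points` are assumptions.

```python
import numpy as np

def line_search_step(G, phi_k, delta, alpha=2.0, num_points=8):
    """Pick the step length gamma in (0, alpha] that minimizes G(phi_k + gamma * delta)."""
    gammas = np.linspace(alpha / num_points, alpha, num_points)
    # union1d sorts and removes duplicates; it guarantees that the unscaled
    # step phi_k + delta is among the candidates.
    gammas = np.union1d(gammas, [1.0])
    values = [G(phi_k + g * delta) for g in gammas]
    best = int(np.argmin(values))
    return float(gammas[best]), float(values[best])
```

Because $\gamma = 1$ is always a candidate, the selected point can never be worse than the plain update $\boldsymbol{\phi}_k + \boldsymbol{\delta}_\phi$. Each call costs up to `num_points + 1` evaluations of $G$, which is the trade-off noted above for larger $\alpha$.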
Table 1: Comparison of algorithms with linear and quadratic approximations.

Filter bank                                  EX1      EX2
Approximation                                Linear   Quadratic
One-level isotropic coding gain (dB)         6.86     6.86
Number of evaluations of G per iteration     10       65
Average time per iteration (seconds)         0.4      1.0
Number of iterations                         41       5
Total time (seconds)                         20.1     5.1

If we further define $\tilde{\boldsymbol{\delta}}_\phi = [\eta\ \ \boldsymbol{\delta}_\phi^T]^T$ and $\mathbf{f} = [1\ 0\ \cdots\ 0]^T$, then (63) becomes the SOCP problem,

$$
\begin{aligned}
&\text{minimize} && \mathbf{f}^T\tilde{\boldsymbol{\delta}}_\phi\\
&\text{subject to} && \bigl\|\hat{\mathbf{Q}}\tilde{\boldsymbol{\delta}}_\phi + \tilde{\mathbf{s}}_Q\bigr\| \le \mathbf{f}^T\tilde{\boldsymbol{\delta}}_\phi,\\
& && \bigl\|\hat{\mathbf{H}}\tilde{\boldsymbol{\delta}}_\phi + \tilde{\mathbf{s}}\bigr\| \le \sqrt{\delta_{h_1} - \tilde{c}},\\
& && \bigl\|\hat{\mathbf{I}}\tilde{\boldsymbol{\delta}}_\phi\bigr\| \le \beta,
\end{aligned}
\tag{66}
$$

where $\hat{\mathbf{Q}} = [\mathbf{0}\ \ \tilde{\mathbf{Q}}]$, $\hat{\mathbf{H}} = [\mathbf{0}\ \ \tilde{\mathbf{H}}]$, and $\hat{\mathbf{I}} = [\mathbf{0}\ \ \mathbf{I}]$.

Note that (64) holds only when $\mathbf{Q}$ is positive semidefinite and $\mathbf{Q}$ need not always be positive semidefinite. When $\mathbf{Q}$ is not positive semidefinite, however, we can simply revert to using a linear approximation.

When a quadratic approximation is employed, the algorithm reaches an optimal solution with fewer iterations than in the linear case, but takes longer for each iteration as the coding gain is evaluated many more times when computing the Hessian. To demonstrate this difference in behavior, we designed two filter banks, EX1 and EX2, using the original Algorithm 1 and the revised algorithm with the Hessian, respectively. Each optimization used the same initial point. This led to the results shown in Table 1. Clearly, very similar optimization results are obtained for these two designs in terms of the coding gain. For the design with the quadratic approximation, the time used for each iteration is increased compared to the linear-approximation case, but the number of iterations is reduced greatly, resulting in a much shorter overall time.

6. DESIGN OF FILTER BANKS WITH MORE THAN TWO LIFTING STEPS

Although Algorithm 1 only works for the two-lifting-step case, this algorithm can be generalized to design filter banks with more than two lifting steps. When more lifting filters are involved, however, the relationships between the filter-bank characteristics (i.e., coding gain, vanishing-moment properties, and frequency selectivity) and the lifting-filter coefficients become more complicated. In this section, we consider how to formulate the design as an SOCP problem based on these relationships.

The computation of the coding gain in this case is basically the same as the two-lifting-step case discussed in Section 5.1. For an $L$-level octave-band quincunx filter bank, the coding gain $G_{\mathrm{SBC}}$ is computed by (25), and $G_{\mathrm{SBC}}$ is a nonlinear function of the lifting-filter coefficients.

6.1. Vanishing moments

Compared to the two-lifting-step case, the vanishing-moments condition changes considerably for a filter bank as shown in Figure 4(a) with at least three lifting steps (i.e., $\lambda \ge 2$). The condition is no longer linear with respect to the lifting-filter coefficient vector $\mathbf{x}$. With the notations $\mathbf{a}_k$, $\mathbf{v}_k$, $\mathbf{x}$, and $\mathbf{E}_k$ introduced in Section 4, the frequency responses $\{h_k(\boldsymbol{\omega})\}$ of the analysis filters are given by (24), and $\{h_k(\boldsymbol{\omega})\}$ can each be expressed as a polynomial in $\mathbf{x}$.

In order for this filter bank to have $\tilde{N}$ dual vanishing moments, the frequency response $h_1(\boldsymbol{\omega})$ of the highpass analysis filter should have an $\tilde{N}$th-order zero at $[0\ 0]^T$. Therefore, $h_{1a}^{(\mathbf{m})}(0, 0) = 0$ for all $\mathbf{m} \in (\mathbb{Z}^*)^2$ such that $|\mathbf{m}| \in \mathbb{Z}_e$ and $|\mathbf{m}| < \tilde{N}$, where $h_{1a}(\boldsymbol{\omega})$ is the signed amplitude response of $H_1$ as defined in (3). As $H_1$ has linear phase and $h_1(\boldsymbol{\omega})$ can be viewed as a polynomial in $\mathbf{x}$, $h_{1a}(\boldsymbol{\omega})$, and thus $h_{1a}^{(\mathbf{m})}(0, 0)$, can also be viewed as polynomials in $\mathbf{x}$. In this way, in order to have $\tilde{N}$ dual vanishing moments, the lifting-filter coefficients in $\mathbf{x}$ need to satisfy $\lceil \tilde{N}/2 \rceil^2$ polynomial equations. Similarly, in order to have $N$ primal vanishing moments, the frequency response $h_0(\boldsymbol{\omega})$ of the lowpass analysis filter $H_0$ should satisfy $h_{0a}^{(\mathbf{m})}(\pi, \pi) = 0$ for all $\mathbf{m} \in (\mathbb{Z}^*)^2$ such that $|\mathbf{m}| \in \mathbb{Z}_e$ and $|\mathbf{m}| < N$. It follows that $\mathbf{x}$ needs to satisfy $\lceil N/2 \rceil^2$ polynomial equations.

6.2. Frequency responses

Recall that in the two-lifting-step case, the frequency response constraint is defined in (39) and (40), and the constraint on the highpass analysis filter is a second-order cone. For filter banks with more than two lifting steps, we define the frequency response constraint in a similar way. The frequency response error functions of the lowpass and highpass analysis filters, however, are at least fourth-order polynomials in the lifting-filter coefficients. This is because the frequency responses of the analysis filters $H_0$ and $H_1$ are at least quadratic polynomials in the lifting-filter coefficient vector $\mathbf{x}$ when more than two lifting filters are involved.

6.3. Design problem formulation

In the two-lifting-step case, we saw that in terms of the lifting-filter coefficients, the vanishing-moment condition is a linear system of equations and the frequency response constraint is a second-order cone. For filter banks with more than two lifting steps, the design problem becomes increasingly complicated as the constraints on vanishing moments and frequency responses become higher-order polynomials in the lifting-filter coefficients. In order to use the SOCP technique, the constraints on vanishing moments and the frequency response must be approximated by linear and quadratic constraints, respectively.
We deal with the coding gain $G_{\mathrm{SBC}}(\mathbf{x})$ with the same strategy as in the two-lifting-step case. The linear approximation of $G$ with $G(\mathbf{x}) = -10\log_{10} G_{\mathrm{SBC}}(\mathbf{x})$ is given by

$$
G(\mathbf{x} + \boldsymbol{\delta}_x) \approx G(\mathbf{x}) + \mathbf{g}^T\boldsymbol{\delta}_x,
\tag{67}
$$

where $\mathbf{g}$ is the gradient of $G$ at point $\mathbf{x}$. We iteratively seek a small perturbation $\boldsymbol{\delta}_x$ in $\mathbf{x}$ such that $G(\mathbf{x} + \boldsymbol{\delta}_x)$ is reduced relative to $G(\mathbf{x})$ until the difference between $G(\mathbf{x} + \boldsymbol{\delta}_x)$ and $G(\mathbf{x})$ is less than a prescribed tolerance.

As discussed in Section 6.1, the constraint on vanishing moments is a set of polynomial equations in $\mathbf{x}$. We substitute $\mathbf{x}$ with $\mathbf{x}_k + \boldsymbol{\delta}_x$. Provided that $\|\boldsymbol{\delta}_x\|$ is small, the quadratic and higher-order terms in $\boldsymbol{\delta}_x$ can be neglected, and these polynomial equations can be approximated by the linear system

$$
\mathbf{A}_k\boldsymbol{\delta}_x = \mathbf{b}_k.
\tag{68}
$$

In this way, the filter bank constructed with lifting-filter coefficients $\mathbf{x}_k + \boldsymbol{\delta}_x$ has, to first order, the desired vanishing-moment properties. Due to the problem formulation, the moments of interest are only guaranteed to be small, but not exactly zero. In practice, however, the moments are typically very close to zero, as will be illustrated later via our design examples.

This iterative algorithm consists of the following steps (where $k$ denotes the iteration number indexed from zero).

Step (1)
Select an initial point $\mathbf{x}_0$ such that the resulting filter bank has the desired number of vanishing moments. We can choose the first two lifting filters using the method proposed for the two-lifting-step case, and then set the coefficients of the other $K - 2$ lifting filters to be all zeros. Alternatively, we can randomly select the coefficients of the first $K - 2$ filters, and then use the last two lifting filters to provide dual and primal vanishing moments. In this way, the filter bank constructed with the initial point $\mathbf{x}_0$ has the desired number of vanishing moments. Moreover, since the upper bound $\delta_{h_1}$ for the frequency response error function is chosen in the same way as in Algorithm 1, the frequency response constraint will not be violated. Therefore, $\mathbf{x}_0$ is inside the feasible region.

Step (2)
For the $k$th iteration, at the point $\mathbf{x}_k$, compute the gradient $\mathbf{g}$ of $G(\mathbf{x})$, $\mathbf{A}_k$ and $\mathbf{b}_k$ in (68), and $\tilde{\mathbf{H}}_k$, $\tilde{\mathbf{s}}_k$, and $\tilde{c}_k$ in (70). Then, solve the SOCP problem:

$$
\begin{aligned}
&\text{minimize} && \mathbf{g}^T\boldsymbol{\delta}_x\\
&\text{subject to} && \mathbf{A}_k\boldsymbol{\delta}_x = \mathbf{b}_k,\\
& && \bigl\|\tilde{\mathbf{H}}_k\boldsymbol{\delta}_x + \tilde{\mathbf{s}}_k\bigr\| \le \sqrt{\delta_{h_1} - \tilde{c}_k},\\
& && \bigl\|\boldsymbol{\delta}_x\bigr\| \le \beta.
\end{aligned}
\tag{p2}
$$
Now we consider the frequency response of the highpass
analysis filter H1 . The weighted error function eh1 is defined The linear constraint Ak δ x = bk can be parameterized as in
in (39). In order to have good frequency selectivity, the func- Algorithm 1 to reduce the number of design variables, or be
approximated by the second-order cone Ak δ x − bk ≤ εδ
tion eh1 must satisfy the constraint (40). From (8), h1a (ω) with εδ being a prescribed tolerance. Then, we can use the
has at least a second-order term in x. Therefore, eh1 is at least optimal solution δ x to update xk by xk+1 = xk + δ x . We can
a fourth-order polynomial in x. Using a similar approach as also optionally incorporate a line search into this process to
above, we replace x by xk +δ x in h1a (ω) with δ x being small, improve the efficiency of the algorithm.
and neglect the second- and higher-order terms in δ x . Now, Step (3)
If |G(xk+1 ) − G(xk )| < ε, then output x∗ = xk+1 and stop.
h1a (ω) is approximated by a linear function of δ x . Using (39),
Otherwise, go to Step (2).
a quadratic approximation of eh1 is obtained as
eh1 = δ Tx Hk δ x + δ Tx sk + ck , (69) Algorithm 2: More-than-two lifting-step case.
where Hk is a symmetric positive semidefinite matrix, and

Hk , sk , and ck are dependent on xk . Therefore, the constraint Step (2), we deal with the constant δh1 in the same way as in
eh1 ≤ δh1 can be expressed in the form of a second-order cone Algorithm 1 (i.e., δh1 is chosen to be a scaled version of the
constraint as error function evaluated at the point xk ). We use a variable
$ $ scaling factor D in the frequency response error function (39)
$H! k δ x + !sk $2 ≤ δh1 − c!k . (70) since the Nyquist gain of H1 is dependent on the lifting-filter
coefficients in this case. For the kth iteration, we choose D
Note that the approximation is not applied to eh1 , but to to be the Nyquist gain of the highpass analysis filter obtained
h1a (ω). In this way, the matrix Hk is guaranteed to be posi- from the previous iteration (i.e., D = h1a (π, π) with h1a (ω)
tive semidefinite, which allows for the form of a second-order being the signed amplitude response of H1 obtained from the
cone as in (70). (k − 1)th iteration).
Based on the preceding approximation methods of the Due to the linear approximation (68), the moments as-
vanishing-moment condition and frequency response con- sociated with the desired vanishing-moment conditions are
straint, the design of filter banks with more than two lift- only guaranteed to be small but not necessarily zero. An ad-
ing steps can be formulated as an iterative SOCP problem. justment step can be applied after Step (3) to further re-
To solve this design problem, we use a scheme similar to duce the moments in question at the expense of a slight de-
Algorithm 1. Let K be the number of lifting steps. The mod- crease in the coding gain. This step is formulated as follows.
ified algorithm (Algorithm 2) is given. Let {Γi (x)} = 0 be the set of polynomial equations that the
Upon termination of Algorithm 2, the output x∗ will cor- lifting-filter coefficient vector x needs to satisfy to achieve N
respond to a filter bank with all of the desired properties. In primal and N! dual vanishing moments. When δ x is small,
the linear approximation of $\Gamma_i(x^* + \delta_x)$ is obtained by

$$\Gamma_i(x^* + \delta_x) = \Gamma_i(x^*) + g_i^T \delta_x, \tag{71}$$

where $g_i$ is the gradient of $\Gamma_i$ at the point $x^*$. This adjustment process can then be formulated as the following optimization problem:

$$\begin{aligned}
\text{minimize} \quad & \sum_i \left[ \Gamma_i(x^*) + g_i^T \delta_x \right]^2 \\
\text{subject to} \quad & \left\| \delta_x \right\| \le \beta_a,
\end{aligned} \tag{72}$$

where $\beta_a$ is a prescribed small value. The objective function of (72) can be rewritten as

$$\sum_i \left[ \Gamma_i(x^*) + g_i^T \delta_x \right]^2 = \delta_x^T \left( \sum_i g_i g_i^T \right) \delta_x + \delta_x^T \left( 2 \sum_i \Gamma_i(x^*) g_i \right) + \sum_i \Gamma_i^2(x^*). \tag{73}$$

Since $\sum_i g_i g_i^T$ is positive semidefinite, the objective function can be expressed in the form $\| \tilde{H}_\delta \delta_x + \tilde{s}_\delta \|^2 + \tilde{c}_\delta$. If we introduce another variable $\eta$ to be the upper bound of the term $\| \tilde{H}_\delta \delta_x + \tilde{s}_\delta \|$, the problem in (72) becomes

$$\begin{aligned}
\text{minimize} \quad & \eta \\
\text{subject to} \quad & \left\| \tilde{H}_\delta \delta_x + \tilde{s}_\delta \right\| \le \eta, \\
& \left\| \delta_x \right\| \le \beta_a.
\end{aligned} \tag{74}$$

The above problem is equivalent to the SOCP problem,

$$\begin{aligned}
\text{minimize} \quad & f^T \tilde{\delta}_x \\
\text{subject to} \quad & \left\| \hat{H}_\delta \tilde{\delta}_x + \tilde{s}_\delta \right\| \le f^T \tilde{\delta}_x, \\
& \left\| \tilde{I} \tilde{\delta}_x \right\| \le \beta_a,
\end{aligned} \tag{75}$$

where $\tilde{\delta}_x = [\eta \;\; \delta_x^T]^T$, $f = [1 \; 0 \; \cdots \; 0]^T$, $\hat{H}_\delta = [0 \;\; \tilde{H}_\delta]$, and $\tilde{I} = [0 \;\; I]$.

In Algorithm 2, instead of using the linear approximation (67) of the coding gain function $G$, we can also employ the quadratic approximation of $G$ given by

$$G(x + \delta_x) \approx G(x) + g^T \delta_x + \frac{1}{2} \delta_x^T Q \delta_x, \tag{76}$$

where $g$ and $Q$ are the gradient and the Hessian of $G(x)$ at the point $x$, respectively. A change similar to that used in Section 5.5 can be made to the SOCP problem (p2) in Step (2) of Algorithm 2.

The approximation method for the frequency response constraint explained previously in this section can also be used to control the frequency response of the lowpass analysis filter $H_0$ for filter banks with two or more lifting steps. For example, in the two-lifting-step case, the analysis lowpass filter frequency response $h_0(\omega)$ is a quadratic polynomial in the design vector $\phi$. We can replace $\phi$ by $\phi_k + \delta_\phi$ in $h_0(\omega)$ and keep only the constant and first-order terms. Then, the error function $e_{h_0}$ computed with this linear approximation of $h_0(\omega)$ becomes a quadratic function of $\delta_\phi$, and the constraint $e_{h_0} \le \delta_{h_0}$ can be expressed as a second-order cone in $\delta_\phi$.

7. DESIGN EXAMPLES

In order to demonstrate the effectiveness of our proposed design methods, we now present several examples of filter banks constructed using Algorithms 1 and 2. In passing, we note that our software implementation of these algorithms (written in MATLAB) is available on the Internet [31]. For all of the design examples in this section, the optimization is carried out for maximal coding gain assuming an isotropic image model with correlation coefficient $\rho = 0.95$ and a six-level wavelet decomposition.

Using our proposed methods, we designed three filter banks, henceforth referred to by the names OPT1, OPT3, and OPT4. The lifting-filter coefficient vectors $\{a_i\}$ (as defined in (10) and (15)) for these three filter banks are given in Table 2. For comparison purposes, we also consider four filter banks produced using methods previously proposed by others, with three being quincunx and one being separable. The first two quincunx filter banks are constructed using the technique of [18], and are henceforth referred to by the names KS1 and KS2. The third quincunx filter bank is the so-called (6, 2) filter bank proposed in [9], which we henceforth refer to by the name G62. The one separable filter bank considered herein is the well-known 9/7 filter bank employed in the JPEG-2000 standard [1]. Some important characteristics of the various filter banks are shown in Table 3. The OPT1 filter bank was designed using Algorithm 1 with two lifting steps. The next two filter banks, referred to as OPT3 and OPT4, were designed using Algorithm 2 with three or more lifting steps, and thus, the desired vanishing-moment conditions are only guaranteed to be met approximately (i.e., the moments in question are only guaranteed to be close to zero). For each of these two filter banks, the order of the largest nonzero moment (of those in question) is shown in the rightmost column of Table 3. The frequency responses of the analysis and synthesis lowpass filters are shown in Figures 7, 8, and 9. Since the highpass filter frequency responses are simply modulated versions of the lowpass ones, the former have been omitted here due to space constraints. The primal scaling and wavelet functions are illustrated in Figures 10, 11, and 12.

From Table 3, clearly, the optimal designs, OPT1, OPT3, and OPT4, each have a higher isotropic coding gain than the KS1, KS2, and G62 quincunx filter banks. Furthermore, the designs with three and four lifting steps also have a higher isotropic coding gain than the 9/7 filter bank, which is very impressive considering that the 9/7 filter bank is well known for its high coding gain. For OPT3 and OPT4, the zeroth moments are nearly vanishing on the order of $10^{-10}$ to $10^{-12}$, which is small enough to be considered as zero for all practical purposes. The first moments are automatically zero due to the linear-phase property as previously
discussed in Section 6.1. Lastly, from Figures 7 to 12, we see that the optimal filter banks have good diamond-shaped passbands/stopbands and smooth primal scaling and wavelet functions.

Table 2: Lifting-filter coefficients for the (a) OPT1, (b) OPT3, and (c) OPT4 filter banks (where the coefficient vectors {ai} are as defined in (10) and (15)).

(a)
a1               a2
−0.0159198316    0.0141419383
0.0570315087     −0.0475750610
−0.3319070666    0.1826552865
−0.3336501890    0.1839773572
0.0596966372     −0.0501021101
−0.0177016160    0.0165757568
0                0
−0.0002158944    0.0073072183
0.0584826734     −0.0487234955
0.0590711965     −0.0488388947
−0.0014144431    0.0082567802
0                0
0                0
0                0
−0.0171945340    0.0165064152
−0.0162784411    0.0158188087
0                0
0                0

(b)
a1               a2               a3
0.0121916538     −0.0412467652    0.0312090846
−0.2252324567    0.2230448713     −0.1065049947
−0.2244562781    0.2234323639     −0.1060172665
0.0131716139     −0.0423652185    0.0301113988
0                0                0
0.0123383222     −0.0429058837    0.0289842780
0.0125969226     −0.0419932594    0.0317300494
0                0                0

(c)
a1               a2
0.0634983772     −0.0451377582
−0.1474840240    0.0687594491
−0.2023765008    0.1518386544
0.0294352099     −0.0326419204
0                0
0.0622324334     −0.0460766038
0.0202133422     −0.0240443429
0                0

a3               a4
−0.2321916679    0.2012955400
−0.0651787971    0.0186944256

8. IMAGE CODING RESULTS AND ANALYSIS

In order to further demonstrate the utility of our new filter banks, they were employed in an enhanced version of the embedded lossy/lossless image codec of [32]. This codec can be used with either nonseparable or separable filter banks based on the lifting framework. Some additional information about the codec is included in the appendix. For test data, all twenty seven (reasonably sized) grayscale images from the JPEG-2000 test set [33] were used in our experiments.

Using each of the filter banks listed in Table 3, the test images were coded in a lossy manner at four compression ratios (i.e., 128, 64, 32, and 16), and then decoded. In each case, the difference between original and reconstructed images was measured in terms of PSNR. In the cases of quincunx and separable filter banks, six and three levels of decomposition were employed, respectively.

A statistical summary of all of the lossy compression results (i.e., for the twenty seven test images coded at four compression ratios) obtained with the quincunx filter banks is provided in Table 4. In particular, the table shows the percentage of cases where the OPT1, OPT3, and OPT4 optimal designs outperform the KS1, KS2, and G62 filter banks. We can see that our new filter banks outperform KS1 in 70% to 80% of the cases, outperform KS2 in more than 80% of the cases, and outperform G62 in more than 90% of the cases. It is worth noting that the KS1 filter bank has the best performance among all of the quincunx filter banks constructed using the method in [18] with filter supports comparable to our design examples, and the G62 filter bank has the best performance among the three filter banks in [9]. In other words, we are comparing our optimal designs to the very best competing quincunx filter banks produced by other methods.

For illustrative purposes, we now provide a subset of the lossy coding results, namely those obtained for the test images sar2 and gold. Information about these two images is provided in Table 5. The sar2 image is more isotropic (than separable) in nature, while the gold image is more separable, as demonstrated by the contour plots of their normalized autocorrelation functions shown in Figure 13. The lossy coding results for the sar2 and gold images are shown in Table 6. Obviously, our three optimal designs (i.e., OPT1, OPT3, and OPT4) perform very well, consistently outperforming the KS1, KS2, and G62 quincunx filter banks in all cases. For example, in the case of the sar2 image at a compression ratio of 16, our optimal designs outperform the KS1, KS2, and G62 filter banks by margins of 0.12 to 0.23, 0.29 to 0.4, and 0.42 to 0.53 dB, respectively. Moreover, for the isotropic sar2 image, our optimal designs even achieve better results than the 9/7 filter bank in most cases. For example, the OPT3 design outperforms the 9/7 filter bank at all of the four compression ratios considered (for the sar2 image). This is quite an encouraging result, as the 9/7 filter bank is generally held to be one of the very best in the literature.¹

¹ Of course, the idea that nonseparable filter banks can offer improved performance (over separable ones) for images with nonseparable (e.g., isotropic) statistics is not a new one. In fact, it is this very idea that has inspired much research in the area of nonseparable filter banks. For example, this idea has been expressed in [21] as well as in many other works.

The reconstructed images associated with the optimal filter banks also have subjective quality comparable to that of
Table 3: Comparison of filter-bank characteristics.

        Support of                  Support of analysis filters   Coding gain (dB)        Vanishing moments
Name    lifting filters†            Lowpass     Highpass          Isotropic   Separable   Ñ    N    Max.
OPT1    6 × 6, 6 × 6                13 × 13     7 × 7             12.06       13.59       2    2    -
OPT3    4 × 4, 4 × 4, 4 × 4         9 × 9       13 × 13           12.23       13.26       2    2    10^-12
OPT4    4 × 4, 4 × 4, 2 × 2, 2 × 2  13 × 13     11 × 11           12.21       13.07       2    2    10^-10
KS1     6 × 6, 6 × 6                13 × 13     7 × 7             11.95       13.64       6    6    -
KS2     8 × 8, 4 × 4                15 × 15     11 × 11           11.75       13.92       8    4    -
G62     6 × 6, 2 × 2                13 × 13     11 × 11           11.64       12.98       6    2    -
9/7     2, 2, 2, 2                  9           7                 12.09       14.88       4    4    -

† Support regions are diamond-shaped for OPT1, OPT3, OPT4, KS1, and KS2, and rectangular-shaped for G62.
Figure 7: Frequency responses of the (a) lowpass analysis and (b) lowpass synthesis filters of OPT1. (Surface plots of magnitude versus the frequencies Fx and Fy.)

Figure 8: Frequency responses of the (a) lowpass analysis and (b) lowpass synthesis filters of OPT3. (Surface plots of magnitude versus the frequencies Fx and Fy.)
the KS1, KS2, G62, and 9/7 filter banks. As an example, the lossy reconstructed images for sar2 using these filter banks are shown in Figure 14. It is apparent from the figures that the reconstructed images corresponding to OPT1, OPT3, and OPT4 have good subjective quality.

9. CONCLUSIONS

In this paper, we have proposed two new optimization-based methods (and variations thereof) for the design of quincunx filter banks for image coding. The proposed design techniques (i.e., Algorithms 1 and 2) yield linear-phase PR systems with high coding gain, good frequency selectivity, and certain prescribed vanishing-moment properties.

Using Algorithms 1 and 2, we designed several filter banks with all of the desirable properties. These optimal filter banks were employed in an image codec and their coding performance was compared to that of four previously proposed filter banks (three quincunx and one separable). The experimental results show that our new filter banks outperform the three previously proposed quincunx filter banks in 72% to 95% of the test cases. Thus, our design methods
Figure 9: Frequency responses of the (a) lowpass analysis and (b) lowpass synthesis filters of OPT4. (Surface plots of magnitude versus the frequencies Fx and Fy.)

Figure 11: The (a) primal wavelet and (b) primal scaling functions for OPT3.

Figure 12: The (a) primal wavelet and (b) primal scaling functions for OPT4.
clearly yield superior filter banks compared to other quincunx filter-bank design methods. Moreover, in some cases, our optimal designs even outperform the (separable) 9/7 filter bank, which is considered to be one of the very best in the literature. These results demonstrate the effectiveness of our new design techniques. Furthermore, through the use of our design methods, it is possible to develop higher-performance image codecs based on quincunx filter banks.

Figure 10: The (a) primal wavelet and (b) primal scaling functions for OPT1.

Table 4: Statistical summary of the lossy compression results for twenty seven test images, each coded at compression ratios of 128, 64, 32, and 16. Percentage of cases where the OPT1, OPT3, and OPT4 optimal designs outperform the KS1, KS2, and G62 (quincunx) filter banks.

Filter banks    OPT1    OPT3    OPT4
KS1             78%     75%     72%
KS2             83%     82%     81%
G62             95%     94%     93%

Table 5: Small subset of test images.

Image    Size         bpp    Model        Description
sar2     800 × 800    12     Isotropic    Synthetic aperture radar
gold     720 × 576    8      Separable    Houses
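To make the compression ratios used in the experiments more concrete, the following Python sketch converts the image sizes and bit depths listed in Table 5 into a coded bit rate and an approximate compressed size. It assumes the usual definition of compression ratio as original size divided by compressed size, which the text does not state explicitly; the function names `coded_bitrate` and `compressed_bytes` are hypothetical.

```python
from fractions import Fraction

# (width, height, bits per pixel) transcribed from Table 5.
IMAGES = {"sar2": (800, 800, 12), "gold": (720, 576, 8)}

def coded_bitrate(bpp, ratio):
    """Bits per pixel after coding, assuming ratio = original size / coded size."""
    return Fraction(bpp, ratio)

def compressed_bytes(width, height, bpp, ratio):
    """Approximate size of the coded image in bytes under the same assumption."""
    original_bits = width * height * bpp
    return Fraction(original_bits, 8 * ratio)

for name, (w, h, bpp) in IMAGES.items():
    for ratio in (128, 64, 32, 16):
        rate = float(coded_bitrate(bpp, ratio))
        size = float(compressed_bytes(w, h, bpp, ratio))
        print(f"{name}: CR {ratio:3d} -> {rate:.4f} bpp, about {size:.0f} bytes")
```

Under this assumption, the 12 bpp sar2 image coded at a compression ratio of 16 corresponds to 0.75 bits per pixel.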
Figure 13: The contour plots of the autocorrelation functions of the (a) sar2 and (b) gold images.

Figure 14: Part of the lossy reconstructions obtained for the sar2 image at a compression ratio of 32 using the (a) OPT1, (b) OPT3, (c) OPT4, (d) KS1, (e) KS2, (f) G62, and (g) 9/7 filter banks.

Table 6: Lossy compression results for the (a) sar2 and (b) gold images.

(a)
         PSNR (dB)
CR†      OPT1     OPT3     OPT4     KS1      KS2      G62      9/7
128      22.73    22.77    22.75    22.66    22.56    22.39    22.75
64       23.54    23.60    23.61    23.45    23.34    23.13    23.56
32       24.73    24.82    24.79    24.62    24.49    24.29    24.70
16       26.67    26.78    26.75    26.55    26.38    26.25    26.62

(b)
         PSNR (dB)
CR†      OPT1     OPT3     OPT4     KS1      KS2      G62      9/7
128      27.14    27.19    27.12    26.98    26.92    26.72    27.16
64       28.90    28.95    28.95    28.82    28.71    28.47    29.06
32       30.90    30.97    30.95    30.81    30.70    30.50    31.28
16       33.36    33.41    33.35    33.28    33.17    32.97    33.82

† Compression ratio.
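As a minimal sketch of how the PSNR figures and the comparisons drawn from Table 6 can be reproduced, the Python code below defines PSNR in the conventional way (peak value 2^b − 1 for b-bit samples, which is an assumption since the paper does not spell out its definition) and tallies win percentages and margins from the sar2 values transcribed from Table 6(a). The function names are hypothetical and the code is not the authors' codec.

```python
import numpy as np

def psnr(original, reconstructed, bits_per_sample):
    """PSNR in dB, assuming a peak value of 2**bits_per_sample - 1."""
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(reconstructed, dtype=np.float64)
    mse = np.mean((a - b) ** 2)
    if mse == 0.0:
        return float("inf")
    peak = 2.0 ** bits_per_sample - 1.0
    return 10.0 * np.log10(peak ** 2 / mse)

# PSNR values (dB) for sar2 from Table 6(a); entries ordered by CR 128, 64, 32, 16.
SAR2 = {
    "OPT1": [22.73, 23.54, 24.73, 26.67],
    "OPT3": [22.77, 23.60, 24.82, 26.78],
    "OPT4": [22.75, 23.61, 24.79, 26.75],
    "KS1":  [22.66, 23.45, 24.62, 26.55],
    "KS2":  [22.56, 23.34, 24.49, 26.38],
    "G62":  [22.39, 23.13, 24.29, 26.25],
    "9/7":  [22.75, 23.56, 24.70, 26.62],
}

def win_percentage(table, ours, other):
    """Percentage of compression ratios at which `ours` has strictly higher PSNR."""
    wins = sum(a > b for a, b in zip(table[ours], table[other]))
    return 100.0 * wins / len(table[ours])

def margin_range(table, other, cr_index, ours=("OPT1", "OPT3", "OPT4")):
    """Smallest and largest PSNR margin (dB) of the optimal designs over `other`."""
    margins = [round(table[n][cr_index] - table[other][cr_index], 2) for n in ours]
    return min(margins), max(margins)

for other in ("KS1", "KS2", "G62"):
    print(other, margin_range(SAR2, other, cr_index=3))   # margins at CR 16
print(win_percentage(SAR2, "OPT3", "9/7"))
```

Applied to the two-image subset only, such a tally covers far fewer cases than the statistics over all twenty seven test images summarized in Table 4.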
APPENDIX

IMAGE CODEC

The image codec [32] used for collecting experimental results herein was written in C++ and it supports both lossy and lossless compression of grayscale images. The codec was partly inspired by technologies contained in the JPEG-2000 Verification-Model 0.0 software [34]. Although originally developed in [32], the codec has undergone major changes since that time in order to improve its coding performance. The codec employs reversible integer-to-integer versions of wavelet transforms [25] (which can be trivially constructed from the lifting realization of a filter bank).

The general structure of the codec is as follows. In the encoder, a wavelet transform is first applied to the input data. Then, a bitplane coder is applied independently to each of the resulting subband signals. The bitplane coder employs three coding passes per bitplane (i.e., predicted significant, refinement, and predicted insignificant passes), similar in spirit to those found in the JPEG-2000 codec [1], for example. The symbols generated by the bitplane coder are then entropy-coded using a context-based adaptive arithmetic coder. The ordering of the data in the codestream is optimized for rate-distortion performance, and rate control is achieved solely by the truncation of the embedded codestream. The structure of the decoder essentially mirrors that of the encoder.

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for useful comments, which have helped to improve the quality of this paper.

REFERENCES

[1] ISO/IEC 15444-1, Information technology - JPEG 2000 image coding system - Part 1: Core coding system, 2000.
[2] D. B. H. Tay and N. G. Kingsbury, “Flexible design of multidimensional perfect reconstruction FIR 2-band filters using transformations of variables,” IEEE Transactions on Image Processing, vol. 2, no. 4, pp. 466–480, 1993.
[3] P. P. Vaidyanathan, Multirate Systems and Filter Banks, Prentice Hall, Upper Saddle River, NJ, USA, 1993.
[4] T. Chen and P. P. Vaidyanathan, “Multidimensional multirate filters and filter banks derived from one-dimensional filters,” IEEE Transactions on Signal Processing, vol. 41, no. 5, pp. 1749–1765, 1993.
[5] J. M. Shapiro, “Adaptive McClellan transformations for quincunx filter banks,” IEEE Transactions on Signal Processing, vol. 42, no. 3, pp. 642–648, 1994.
[6] T. A. C. M. Kalker and I. A. Shah, “Group theoretic approach to multidimensional filter banks: theory and applications,” IEEE Transactions on Signal Processing, vol. 44, no. 6, pp. 1392–1405, 1996.
[7] J. H. McClellan, “The design of two-dimensional digital filters by transformation,” in Proceedings of the 7th Annual Princeton Conference on Information Sciences and Systems, pp. 247–251, Princeton, NJ, USA, March 1973.
[8] S.-M. Phoong, C. W. Kim, P. P. Vaidyanathan, and R. Ansari, “New class of two-channel biorthogonal filter banks and wavelet bases,” IEEE Transactions on Signal Processing, vol. 43, no. 3, pp. 649–665, 1995.
[9] A. Gouze, M. Antonini, and M. Barlaud, “Quincunx lifting scheme for lossy image compression,” in Proceedings of the IEEE International Conference on Image Processing (ICIP ’00), vol. 1, pp. 665–668, Vancouver, BC, Canada, September 2000.
[10] S. C. Chan, K. S. Pun, and K. L. Ho, “On the design and implementation of a class of multiplierless two-channel 1D and 2D nonseparable PR FIR filterbanks,” in Proceedings of the IEEE International Conference on Image Processing (ICIP ’01), vol. 2, pp. 241–244, Thessaloniki, Greece, October 2001.
[11] K. S. C. Pun and T. Q. Nguyen, “A novel and efficient design of multidimensional PR two-channel filter banks with hourglass-shaped passband support,” IEEE Signal Processing Letters, vol. 11, no. 3, pp. 345–348, 2004.
[12] G. Karlsson and M. Vetterli, “Theory of two-dimensional multirate filter banks,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 38, no. 6, pp. 925–937, 1990.
[13] E. Viscito and J. P. Allebach, “The analysis and design of multidimensional FIR perfect reconstruction filter banks for arbitrary sampling lattices,” IEEE Transactions on Circuits and Systems, vol. 38, no. 1, pp. 29–41, 1991.
[14] T. D. Tran, R. L. de Queiroz, and T. Q. Nguyen, “Linear-phase perfect reconstruction filter bank: lattice structure, design, and application in image coding,” IEEE Transactions on Signal Processing, vol. 48, no. 1, pp. 133–147, 2000.
[15] W. Sweldens, “The lifting scheme: a custom-design construction of biorthogonal wavelets,” Applied and Computational Harmonic Analysis, vol. 3, no. 2, pp. 186–200, 1996.
[16] I. Daubechies and W. Sweldens, “Factoring wavelet transforms into lifting steps,” Journal of Fourier Analysis and Applications, vol. 4, no. 3, pp. 247–268, 1998.
[17] T. Cooklev, A. Nishihara, T. Yoshida, and M. Sablatash, “Multidimensional two-channel linear phase FIR filter banks and wavelet bases with vanishing moments,” Multidimensional Systems and Signal Processing, vol. 9, no. 1, pp. 39–76, 1998.
[18] J. Kovačević and W. Sweldens, “Wavelet families of increasing order in arbitrary dimensions,” IEEE Transactions on Image Processing, vol. 9, no. 3, pp. 480–496, 2000.
[19] J. Zhou, M. N. Do, and J. Kovačević, “Multidimensional orthogonal filter bank characterization and design using the Cayley transform,” IEEE Transactions on Image Processing, vol. 14, no. 6, pp. 760–769, 2005.
[20] J. Zhou, M. N. Do, and J. Kovačević, “Special paraunitary matrices, Cayley transform, and multidimensional orthogonal filter banks,” IEEE Transactions on Image Processing, vol. 15, no. 2, pp. 511–519, 2006.
[21] M. Feilner, D. Van De Ville, and M. Unser, “An orthogonal family of quincunx wavelets with continuously adjustable order,” IEEE Transactions on Image Processing, vol. 14, no. 4, pp. 499–510, 2005.
[22] D. Van De Ville, T. Blu, and M. Unser, “Isotropic polyharmonic B-splines: scaling functions and wavelets,” IEEE Transactions on Image Processing, vol. 14, no. 11, pp. 1798–1813, 2005.
[23] J. Kovačević and M. Vetterli, “Nonseparable multidimensional perfect reconstruction filter banks and wavelet bases for R^n,” IEEE Transactions on Information Theory, vol. 38, no. 2, part 2, pp. 533–555, 1992.
[24] T. T. Nguyen and S. Oraintara, “Multiresolution direction filterbanks: theory, design, and applications,” IEEE Transactions on Signal Processing, vol. 53, no. 10, pp. 3895–3905, 2005.
[25] A. R. Calderbank, I. Daubechies, W. Sweldens, and B.-L. Yeo, “Wavelet transforms that map integers to integers,” Applied and Computational Harmonic Analysis, vol. 5, no. 3, pp. 332–369, 1998.
[26] Y. Chen, “Design and application of quincunx filter banks,” M.S. thesis, Department of Electrical and Computing Engineering, University of Victoria, Victoria, BC, Canada, 2006.
[27] J. Katto and Y. Yasuda, “Performance evaluation of subband coding and optimization of its filter coefficients,” in Visual Communications and Image Processing (VCIP ’91), vol. 1605 of Proceedings of SPIE, pp. 95–106, Boston, Mass, USA, November 1991.
[28] M. Vetterli, J. Kovačević, and D. J. Legall, “Perfect reconstruction filter banks for HDTV representation and coding,” Signal Processing: Image Communication, vol. 2, no. 3, pp. 349–363, 1990.
[29] M. S. Lobo, L. Vandenberghe, S. Boyd, and H. Lebret, “Applications of second-order cone programming,” Linear Algebra and Its Applications, vol. 284, no. 1–3, pp. 193–228, 1998.
[30] J. F. Sturm, “Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones,” Optimization Methods and Software, vol. 11, no. 1, pp. 625–653, 1999.
[31] Michael Adams’, August 2006, http://www.ece.uvic.ca/~mdadams.
[32] M. D. Adams, “ELEC 545 project: a wavelet-based lossy/lossless image compression system,” Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC, Canada, April 1999.
[33] “JPEG-2000 test images,” ISO/IEC JTC 1/SC 29/WG 1 N 545, July 1997.
[34] SAIC and University of Arizona, “JPEG-2000 VM 0 software,” ISO/IEC JTC 1/SC 29/WG 1 N 840, May 1998.

Yi Chen received the B.Eng. degree in electronic engineering from Tsinghua University, Beijing, China, in 2002, and the M.A.Sc. degree in electrical engineering from the University of Victoria, Victoria, BC, Canada, in 2006. Her research interests include image processing, wavelets, and multirate systems.

Michael D. Adams received the B.A.Sc. degree in computer engineering from the University of Waterloo, Waterloo, ON, Canada, in 1993, the M.A.Sc. degree in electrical engineering from the University of Victoria, Victoria, BC, Canada, in 1998, and the Ph.D. degree in electrical engineering from the University of British Columbia, Vancouver, BC, Canada, in 2002. Since 2003, he has been an Assistant Professor in the Department of Electrical and Computer Engineering at the University of Victoria. From 1993 to 1995, he was a member of technical staff at Bell-Northern Research (now Nortel Networks) in Ottawa, ON, Canada. His research interests include digital signal processing, wavelets, multirate systems, image coding, and multimedia systems. He is the recipient of a Natural Sciences and Engineering Research Council (of Canada) Postgraduate Scholarship. He is a voting member of the Canadian Delegation to ISO/IEC JTC 1/SC 29 (i.e., Coding of Audio, Picture, Multimedia and Hypermedia Information), and has been an active participant in the JPEG-2000 standardization effort, serving as Coeditor of the JPEG-2000 Part-5 standard and principal author of one of the first JPEG-2000 implementations (i.e., JasPer). He is also a Member of the IEEE and a registered Professional Engineer in the province of British Columbia.

Wu-Sheng Lu received the B.S. degree in mathematics from Fudan University, Shanghai, China, in 1964, and the M.S. degree in electrical engineering and the Ph.D. degree in control science from the University of Minnesota, Minn, USA, in 1983 and 1984, respectively. He was a Postdoctoral Fellow at the University of Victoria, Victoria, BC, Canada, in 1985 and a visiting Assistant Professor with the University of Minnesota in 1986. Since 1987, he has been with the University of Victoria where he is a Professor. His current teaching and research interests are in the general areas of digital signal processing and application of optimization methods. He is the coauthor with A. Antoniou of Two-Dimensional Digital Filters (Marcel Dekker, 1992). He served as an Associate Editor of the Canadian Journal of Electrical and Computer Engineering in 1989, and Editor of the same journal from 1990 to 1992. He served as an Associate Editor for the IEEE Transactions on Circuits and Systems, Part II, from 1993 to 1995 and for Part I of the same journal from 1999 to 2001 and from 2004 to 2005. Presently he is serving as an Associate Editor for the International Journal of Multidimensional Systems and Signal Processing. He is a Fellow of the Engineering Institute of Canada and the IEEE.
doi:10.1155/2007/68285
Research Article
An Approach for Synthesis of Modulated M-Channel FIR Filter
Banks Utilizing the Frequency-Response Masking Technique
Linnéa Rosenbaum, Per Löwenborg, and Håkan Johansson
Department of Electrical Engineering, Linköping University, 581 83 Linköping, Sweden
Received 22 December 2005; Revised 29 June 2006; Accepted 26 August 2006
Recommended by Soontorn Oraintara
The frequency-response masking (FRM) technique was introduced as a means of generating linear-phase FIR filters with narrow
transition band and low arithmetic complexity. This paper proposes an approach for synthesizing modulated maximally decimated
FIR filter banks (FBs) utilizing the FRM technique. A new tailored class of FRM filters is introduced and used for synthesizing
nonlinear-phase analysis and synthesis filters. Each of the analysis and synthesis FBs is realized with the aid of only three subfilters,
one cosine-modulation block, and one sine-modulation block. The overall FB is a near-perfect reconstruction (NPR) FB which
in this case means that the distortion function has a linear-phase response but small magnitude errors. Small aliasing errors are
also introduced by the FB. However, by allowing these small errors (that can be made arbitrarily small), the arithmetic complexity
can be reduced. Compared to conventional cosine-modulated FBs, the proposed ones significantly lower the overall arithmetic
complexity at the expense of a slightly increased overall FB delay in applications requiring narrow transition bands. Compared
to other proposals that also combine cosine-modulated FBs with the FRM technique, the arithmetic complexity can typically be
reduced by 40% in specifications with narrow transition bands. Finally, a general design procedure is given for the proposed FBs
and examples are included to illustrate their benefits.
Copyright © 2007 Linnéa Rosenbaum et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly
cited.
1. INTRODUCTION

Maximally decimated FBs (see Figure 1) find applications in numerous areas [1–3]. Over the past two decades, a vast number of papers on the theory and design of such FBs have been published. Traditionally, the attention has to a large extent been paid to the problem of designing perfect reconstruction (PR) FBs. In a PR FB, the output sequence of the overall system is simply a shifted version of the input sequence. However, FBs are most often used in applications where small errors (emanating from quantizations, etc.) are inevitable and allowed. Imposing PR on the FB is then an unnecessarily severe restriction which may lead to a higher arithmetic complexity than is actually required to meet the specification at hand (arithmetic complexity is defined in this article as the number of arithmetic operations per sample needed in an implementation of an FB). To reduce the complexity one should therefore use near perfect reconstruction (NPR) FBs. For example, it is demonstrated in [4–6] that the complexity can be reduced significantly by using NPR instead of PR FBs. For this reason, this paper proposes a new class of FBs with nearly perfect reconstruction. The distortion function has a linear-phase response but a small magnitude distortion. Further, small aliasing errors are present. The magnitude distortion and aliasing errors can however be made arbitrarily small by properly designing a prototype filter, and a general design procedure for this purpose is presented. Compared to conventional cosine modulated FBs as well as similar approaches, the proposed ones lower the overall arithmetic complexity significantly, in applications requiring narrow transition bands. An example of such an application is frequency-band decomposition for parallel sigma-delta systems [7] (what is gained using parallelism, is lost with a wide transition band). In the former comparison, also the number of distinct coefficients is reduced significantly, at the expense of a slightly increased overall delay. Apart from the NPR property, the main features of the FBs presented here are the following.

Modulation

Regular cosine modulated FBs are widely used and known to be highly efficient, since each of the analysis and synthesis
Figure 1: M-channel maximally decimated FB.
parts can be implemented with the aid of only one (prototype) filter and a discrete cosine transform [2]. The efficiency of this technique is exploited in the article after appropriate modifications. Specifically, both cosine and sine modulations are utilized together with a modified class of FRM filters (see below), which generates efficient overall FBs.

Frequency-response masking (FRM)

When the transition bands of the filters are narrow, the overall complexity may be high. This is due to the fact that the order of an FIR filter is inversely proportional to the transition bandwidth [8]. To alleviate this problem, one can use the FRM technique which was introduced as a means of generating linear-phase FIR filters with both narrow transition band and low arithmetic complexity [9–12]. However, to make the technique suitable for the proposed modulated FBs, we introduce a modified class of FRM filters. This modified class has been considered in [13, 14], but not in the context of M-channel FBs. The main difference is that these FRM filters have a nonlinear-phase response whereas the traditional ones have a linear-phase response. The proposed FRM filters are used as prototype filters in the proposed cosine and sine modulation-based FBs. Each of the analysis and synthesis FBs is realized with the aid of three subfilters, one cosine modulation block, and one sine modulation block. The reason for using the modified FRM filters in the proposed modulation scheme is that the corresponding FB structure requires a lower arithmetic complexity. Using instead the conventional FRM filters, one would need three cosine modulation blocks.

Few optimization parameters

Another advantage of the proposed FB class is that the number of parameters to optimize is few, which is an important issue in extensive designs. Efficient structures are given for implementing the proposed FBs, and procedures for optimizing them in the minimax sense are described.

Relation to previous work

Cosine modulated FIR FBs based on the original FRM filters have been considered in [15–19]. The resulting structure requires only one modulation block in each of the analysis and synthesis parts but, on the other hand, additional upsamplers (and downsamplers) are needed, which makes some subfilters work at an unnecessarily high sampling rate. The focus is also different, since the goal in [15–19] is to minimize the number of optimization parameters and not the arithmetic complexity. It should also be noted that, except for two examples in [18, 19], the examples in [15–19] have filter specifications where only one branch in the FRM structure is needed. For such specifications, the arithmetic complexity is not lower than for that of a regular direct-form FIR prototype filter. Thus, in terms of multiplications per input/output sample there is nothing to gain using narrow-band (one-branch) FRM prototype filters, and therefore they are not discussed in this paper. Finally, it is noted that this paper is an extension of the work presented at two conferences [20, 21], where the basic principles were introduced without giving all details presented in this paper.

The outline of the paper is as follows: in Section 2, a brief treatment of the conventional FRM technique is given. After that, the proposed FB is described in detail in Section 3. This section also includes some important properties and a realization of the FB class. Section 4 gives a general design procedure, followed by a design example and comparisons in Section 5. The paper is concluded in Section 6.

2. FRM TECHNIQUE

As an introduction to FRM, the conventional FRM technique for generating lowpass linear-phase filters is reviewed in this section. The modifications used in the proposed FB class are described in the subsequent section.

In the frequency-response masking technique, the transfer function of the overall filter is expressed as [9–12]

$$H(z) = G(z^L) F_0(z) + G_c(z^L) F_1(z), \tag{1}$$
where $G(z)$ and $G_c(z)$ are referred to as the model filter and complementary model filter, respectively. The filters $F_0(z)$ and $F_1(z)$ are referred to as the masking filters which extract one or several passbands of the periodic model filter $G(z^L)$ and periodic complementary$^1$ model filter $G_c(z^L)$. The structure is illustrated in Figure 2 and typical magnitude responses of the subfilters as well as the resulting filter can be seen in Figure 3 in the next section.

Figure 2: Structure used in the FRM approach.
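
As an illustration of the structure in (1), the sketch below builds the impulse response of $H(z)$ from given subfilter coefficients. It is a minimal example under stated assumptions, not the authors' implementation: the helper names (`upsample`, `convolve`, `add`, `conventional_complement`, `frm_filter`) and the toy coefficients are hypothetical, and the complement is formed with the conventional relation $G_c(z) = z^{-N_G/2} - G(z)$ for an even-order linear-phase model filter.

```python
def upsample(h, L):
    """Coefficients of H(z^L): L - 1 zeros are inserted between the taps of h."""
    out = [0.0] * ((len(h) - 1) * L + 1)
    out[::L] = h
    return out


def convolve(a, b):
    """Coefficients of the product A(z) B(z)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def add(a, b):
    """Coefficients of A(z) + B(z), zero-padding the shorter sequence."""
    n = max(len(a), len(b))
    a = list(a) + [0.0] * (n - len(a))
    b = list(b) + [0.0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


def conventional_complement(g):
    """Gc(z) = z^(-N/2) - G(z) for an even-order linear-phase model filter g."""
    order = len(g) - 1
    if order % 2:
        raise ValueError("the conventional complement needs an even order")
    gc = [-c for c in g]
    gc[order // 2] += 1.0
    return gc


def frm_filter(g, gc, f0, f1, L):
    """Impulse response of H(z) = G(z^L) F0(z) + Gc(z^L) F1(z), as in (1)."""
    upper = convolve(upsample(g, L), f0)   # model-filter branch
    lower = convolve(upsample(gc, L), f1)  # complementary branch
    return add(upper, lower)


# Toy data (illustrative only, not a designed filter).
g = [0.25, 0.5, 0.25]
gc = conventional_complement(g)
h = frm_filter(g, gc, [0.5, 0.5], [0.5, 0.5], L=3)
print(len(h) - 1)  # overall order L*N_G + N_F = 3*2 + 1 = 7
```

With identical masking filters, as in this toy call, the two branches add up to a delayed copy of the masking filter, which reflects the complementarity of the model filter pair. In an actual design the two masking filters differ, so that each one extracts different passbands of the periodic responses.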
The FRM technique was originally introduced in [10] as a means to reduce the arithmetic complexity of linear-phase FIR filters with narrow transition bands. In this approach, $G(z)$ and $G_c(z)$ have to be even-order linear-phase filters of equal delays and form a complementary filter pair, whereas both $F_0(z)$ and $F_1(z)$ are either even- or odd-order linear-phase filters of equal delays. These filters could be used directly to generate the analysis and synthesis filters in the proposed modulated FB scheme to be considered in the following section, but the result is that each of the analysis and synthesis FB then requires three modulation blocks. Therefore, we introduce in the next section modified FRM FIR filters that make it possible to use only two modulation blocks. These modified FRM FIR filters have been considered in [13, 14] but not in the context of M-channel FBs.

3. PROPOSED FILTER BANKS

This section gives transfer functions, properties, and realizations of the proposed FBs. The choices of prototype filters and analysis and synthesis transfer functions assure the overall filter bank to fulfill the NPR criteria.

Figure 3: Illustration of magnitude functions in the FRM approach, where (c) and (d) show the two alternatives Case 1 and Case 2, respectively. Panels: (a) $|G(e^{j\omega T})|$ and $|G_c(e^{j\omega T})|$; (b) $|G(e^{jL\omega T})|$ and $|G_c(e^{jL\omega T})|$; (c), (d) $|P_a(e^{j\omega T})|$ together with $|F_0(e^{j\omega T})|$ and $|F_1(e^{j\omega T})|$.
1 In the case of linear-phase FIR filters, this means that the sum of the zero-phase frequency responses of the filter pair is equal to unity.

3.1. Prototype filter transfer functions

For the proposed modulated FBs, the transfer functions of the analysis and synthesis filters are generated from the prototype filter transfer functions $P_a(z)$ and $P_s(z)$, respectively. These transfer functions are given by

$$P_a(z) = G(z^L) F_0(z) + G_c(z^L) F_1(z), \tag{2}$$

$$P_s(z) = G(z^L) F_0(z) - G_c(z^L) F_1(z). \tag{3}$$

Typical magnitude responses for the model filter, the masking filter, and overall filter $P_a(z)$ are as shown in Figure 3. The transition band of $P_a(z)$ (and $P_s(z)$) can be selected to be one of the transition bands provided by either $G(z^L)$ or $G_c(z^L)$. We refer to these two different cases as Case 1 and Case 2, respectively. Further, we let $\omega_c T$, $\omega_s T$, $\delta_c$, and $\delta_s$ denote the passband edge, stopband edge, passband ripple, and stopband ripple, respectively, for the overall filter $P_a(z)$ (and $P_s(z)$). For the model and masking filters $G(z)$, $G_c(z)$, $F_0(z)$, and $F_1(z)$, additional superscripts $(G)$, $(G_c)$, $(F_0)$, and $(F_1)$, respectively, are included in the corresponding ripples and edges. The periodicity $L$, and the subfilters $G(z)$, $G_c(z)$, $F_0(z)$, and $F_1(z)$ are selected to satisfy the following criteria.

(i) The model filters $G(z)$ and $G_c(z)$ are linear-phase FIR filters of odd order $N_G$, with symmetrical and antisymmetrical impulse responses, respectively. They are related as

$$G_c(z) = G(-z) \tag{4}$$
and designed to be approximately power complementary (i.e., $|G(e^{j\omega T})|^2 + |G_c(e^{j\omega T})|^2 \approx 1$). This is mainly what distinguishes the proposed FRM filters from the conventional ones,$^2$ and it means for example that the transition band of $G(z)$ must be centered at $\pi/2$.

(ii) $L$ is an integer related to the number of channels $M$ as

$$L = \begin{cases} (4m + 1)M, & \text{Case 1}, \\ (4m - 1)M, & \text{Case 2}. \end{cases} \tag{5}$$

The reason for this restriction is that the transition band of the FRM filter (see the illustration of the two different cases in Figures 3(c) and 3(d)) must coincide with the transition band of the prototype filter at $\pi/2M$. Thus,

$$\frac{2k\pi \pm \pi/2}{L} = \frac{\pi}{2M}. \tag{6}$$

(iii) The masking filters $F_0(z)$ and $F_1(z)$ are of order $N_F$ and linear-phase lowpass filters with symmetrical impulse responses. The filter order can be either even or odd. Further, in order to ensure approximate power complementarity of the analysis filters, additional restrictions in the transition bands of $P_a(z)$ and $P_s(z)$ must be added. This leads to slightly tightened restrictions on the passband and stopband edges of the masking filters compared to [10], which is illustrated in Figure 3.

2 For the conventional FRM filters, $N_G$ must be even and $G_c(z) = z^{-N_G/2} - G(z)$. In this case, it is not possible to make $G(z)$ and $G_c(z)$ approximately power complementary.

3.2. Analysis and synthesis filter transfer functions

For Case 1, the analysis filters $H_{ak}(z)$ and synthesis filters $H_{sk}(z)$ are obtained by modulating the prototype filters $P_a(z)$ and $P_s(z)$ according to

$$H_{ak}(z) = \beta_k P_a\big(zW_{2M}^{(k+0.5)}\big) + \beta_k^* P_a\big(zW_{2M}^{-(k+0.5)}\big), \tag{7}$$

$$H_{sk}(z) = c\,j(-1)^k \Big[\beta_k P_s\big(zW_{2M}^{-(k+0.5)}\big) - \beta_k^* P_s\big(zW_{2M}^{(k+0.5)}\big)\Big], \tag{8}$$

respectively, for $k = 0, 1, \ldots, M - 1$, with

$$c = \begin{cases} -1, & N_G + 1 = 4m, \\ 1, & N_G + 1 = 4m + 2 \end{cases} \tag{9}$$

for some integer $m$, and

$$W_M = e^{-j2\pi/M}, \qquad \beta_k = W_{2M}^{(k+0.5)N_F/2}. \tag{10}$$

For Case 2, (9) is negated. Note that this type of modulation is slightly different from the one that is usually employed in cosine-modulated FBs [2]. For example, $\theta_k$ in [2] is not needed here, since power complementarity can be achieved directly by choosing the model filters according to Section 3.1. The main difference is though that unlike the conventional ones, the proposed prototype filters have a nonlinear-phase response. Nevertheless, by the choices in (7)–(10), the FB is ensured to have all the important properties that are stated later in Section 3.3.

3.3. Filter bank properties

This section gives five important properties of the proposed FBs useful in the design procedure. Proofs of the first four properties are given in the appendix. The fifth property is shown in Section 4.

(1) The magnitude responses of $P_a(z)$ and $P_s(z)$ are equal, that is,

$$|P_a(e^{j\omega T})| = |P_s(e^{j\omega T})|. \tag{11}$$

(2) The cascaded filter $P_a(z)P_s(z)$ has a linear-phase response.

(3) The magnitude responses of $H_{ak}(z)$ and $H_{sk}(z)$ are equal, that is,

$$|H_{ak}(e^{j\omega T})| = |H_{sk}(e^{j\omega T})|. \tag{12}$$

(4) The distortion transfer function $V_0(z)$ (see Section 4) has a linear-phase response with a delay of $LN_G + N_F$ samples.

(5) The FBs can readily be designed in such a way that (a) the analysis and synthesis filters are arbitrarily good frequency-selective filters, and (b) the magnitude distortion and aliasing errors are arbitrarily small.

3.4. Filter bank structures

In this section it is shown how to realize the proposed analysis FB class with two modulation blocks instead of three. The synthesis FB can be realized in a corresponding way [2]. We begin by expressing $G(z)$ and $G_c(z)$ in polyphase forms according to

$$G(z) = G_0(z^2) + z^{-1}G_1(z^2), \qquad G_c(z) = G(-z) = G_0(z^2) - z^{-1}G_1(z^2) \tag{13}$$

so that $P_a(z)$ in (2) can be written on the form

$$P_a(z) = G_0(z^{2L})A(z) + z^{-L}G_1(z^{2L})B(z). \tag{14}$$

In (14), the filters $A(z)$ and $B(z)$ are the sum and the difference of the two masking filters according to

$$A(z) = F_0(z) + F_1(z), \qquad B(z) = F_0(z) - F_1(z). \tag{15}$$
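
The step from (2) to (14) can be checked numerically. The sketch below is a minimal example under stated assumptions: the integer coefficients are arbitrary toy values rather than a designed filter, and all function names are hypothetical. It forms $P_a(z)$ once directly from (2) with $G_c(z) = G(-z)$, and once from the polyphase form (13)–(15), and compares the two impulse responses.

```python
def upsample(h, L):
    """Coefficients of H(z^L): L - 1 zeros are inserted between the taps of h."""
    out = [0] * ((len(h) - 1) * L + 1)
    out[::L] = h
    return out


def convolve(a, b):
    """Coefficients of the product A(z) B(z)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def add(a, b):
    """Coefficients of A(z) + B(z), zero-padding the shorter sequence."""
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


def prototype_direct(g, f0, f1, L):
    """Pa(z) = G(z^L) F0(z) + Gc(z^L) F1(z) with Gc(z) = G(-z), cf. (2) and (4)."""
    gc = [(-1) ** n * c for n, c in enumerate(g)]
    return add(convolve(upsample(g, L), f0), convolve(upsample(gc, L), f1))


def prototype_polyphase(g, f0, f1, L):
    """Pa(z) = G0(z^2L) A(z) + z^-L G1(z^2L) B(z), cf. (13)-(15)."""
    g0, g1 = g[0::2], g[1::2]              # polyphase components of G(z)
    a = [x + y for x, y in zip(f0, f1)]    # A(z) = F0(z) + F1(z)
    b = [x - y for x, y in zip(f0, f1)]    # B(z) = F0(z) - F1(z)
    branch_a = convolve(upsample(g0, 2 * L), a)
    branch_b = [0] * L + convolve(upsample(g1, 2 * L), b)  # delay by z^-L
    return add(branch_a, branch_b)


# Toy data: odd-order symmetric model filter, equal-order masking filters,
# and L = 3 (M = 3 with m = 0 in Case 1 of (5)).
g, f0, f1, L = [1, 3, 3, 1], [1, 2, 1], [1, 4, 1], 3
assert prototype_direct(g, f0, f1, L) == prototype_polyphase(g, f0, f1, L)
```

The polyphase form is what makes the two-block realization possible: the model filter enters only through $G_0$ and $G_1$, while the masking filters enter only through their sum and difference.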
The analysis filters $H_{ak}(z)$ can then be written as

$$H_{ak}(z) = G_0(-z^{2L})A_k(z) + s(-1)^k j z^{-L} G_1(-z^{2L}) B_k(z), \tag{16}$$

where

$$A_k(z) = \beta_k A\big(zW_{2M}^{(k+0.5)}\big) + \beta_k^* A\big(zW_{2M}^{-(k+0.5)}\big), \qquad B_k(z) = \beta_k B\big(zW_{2M}^{(k+0.5)}\big) - \beta_k^* B\big(zW_{2M}^{-(k+0.5)}\big), \tag{17}$$

$$s = \begin{cases} -1, & \text{Case 1}, \\ 1, & \text{Case 2}. \end{cases} \tag{18}$$

As seen in (16), $G_0(-z^{2L})$ and $G_1(-z^{2L})$ are conveniently independent of $k$ and are thus the same in each channel.

Let $a(n)$, $b(n)$, $a_k(n)$, and $b_k(n)$ denote the impulse responses of $A(z)$, $B(z)$, $A_k(z)$, and $B_k(z)$, respectively. We then get from (17) and (10) that $a_k(n)$ and $b_k(n)$ are related to $a(n)$ and $b(n)$ through

$$a_k(n) = 2a(n)\cos\!\left(\frac{(2k+1)\pi}{2M}\left(n - \frac{N_F}{2}\right)\right), \qquad b_k(n) = 2j\,b(n)\sin\!\left(\frac{(2k+1)\pi}{2M}\left(n - \frac{N_F}{2}\right)\right). \tag{19}$$

Since $b_k(n)$ is purely imaginary, $H_{ak}(z)$ is obviously the transfer function of a filter with a real impulse response. It can be written as

$$H_{ak}(z) = G_0(-z^{2L})A_k(z) - s(-1)^k z^{-L} G_1(-z^{2L}) B_{kR}(z), \tag{20}$$

where

$$B_{kR}(z) = -jB_k(z). \tag{21}$$

Through a similar derivation as above, the synthesis filters $H_{sk}(z)$ can be rewritten as

$$H_{sk}(z) = (-1)^k G_0(-z^{2L}) B_{kR}(z) + s z^{-L} G_1(-z^{2L}) A_k(z). \tag{22}$$

The realization of the analysis FB is shown in Figure 4, where $Q_i^{(A)}(-z^2)$ and $Q_i^{(B)}(-z^2)$, $i = 0, 1, \ldots, 2M - 1$, are the polyphase components of $A(z)$ and $B(z)$, respectively. The cosine modulation block $T_1$ is a simplified version of the corresponding one in [2] (with $\theta_k = 0$). It consists of two trivial matrices and an $M \times M$ DCT-IV matrix. The other one, $T_2$, is a corresponding sine modulation block. Further, because of symmetry in the coefficients of $G(z)$, the two filters $G_0(-z^2)$ and $G_1(-z^2)$ can share multipliers. This is illustrated for the 0th channel and filter order $N_G = 3$, in Figure 5. Although we have three subfilters to implement, $G(z)$, $F_0(z)$, and $F_1(z)$, we have been able to reduce the number of modulation blocks needed from three to only two.

4. FILTER BANK DESIGN

For M-channel maximally decimated FBs (see Figure 1) the z-transform of the output signal is given by

$$Y(z) = \sum_{m=0}^{M-1} V_m(z) X\big(zW_M^m\big), \tag{23}$$

where

$$V_m(z) = \sum_{k=0}^{M-1} H_{ak}\big(zW_M^m\big) H_{sk}(z). \tag{24}$$

Here, $V_0(z)$ is the distortion transfer function whereas the remaining $V_m(z)$ are the aliasing transfer functions. For a PR (near-PR) FB, it is required that the distortion function is (approximates) a delay, and that the aliasing components are (approximate) zero. We now derive expressions for the specification of the model filter $G(z)$ and the masking filters $F_0(z)$ and $F_1(z)$, in order for the analysis filters $H_{ak}(z)$, the distortion function $V_0(z)$, and the aliasing terms $V_m(z)$, to fulfill a given specification.

Let the specifications of $H_{ak}(z)$ be

$$1 - \delta_c \le |H_{ak}(e^{j\omega T})| \le 1 + \delta_c, \quad \omega T \in \Omega_{c,k}, \qquad |H_{ak}(e^{j\omega T})| \le \delta_s, \quad \omega T \in \Omega_{s,k}, \tag{25}$$

where $\Omega_{c,k}$ and $\Omega_{s,k}$, respectively, are the passband and stopband regions of $H_{ak}(z)$. Expressed with the aid of $\Delta$, where $\Delta$ is half the transition bandwidth, they are as illustrated in Figure 6. Furthermore, the magnitude of the distortion and aliasing functions are to meet

$$1 - \delta_0 \le |V_0(e^{j\omega T})| \le 1 + \delta_0, \quad \omega T \in [0, \pi], \tag{26}$$

$$|V_m(e^{j\omega T})| \le \delta_1, \quad \omega T \in [0, \pi], \quad m = 1, \ldots, M - 1, \tag{27}$$

respectively. To fulfill the above specifications, the following optimization problem is solved:

$$\begin{aligned}
&\text{minimize } \delta \\
&\text{subject to } \big||H_{ak}(e^{j\omega T})| - 1\big| \le \delta\,\frac{\delta_c}{\delta_1}, \quad \omega T \in \Omega_{c,k}, \\
&\qquad |H_{ak}(e^{j\omega T})| \le \delta\,\frac{\delta_s}{\delta_1}, \quad \omega T \in \Omega_{s,k}, \\
&\qquad \big||V_0(e^{j\omega T})| - 1\big| \le \delta\,\frac{\delta_0}{\delta_1}, \quad \omega T \in [0, \pi], \\
&\qquad |V_m(e^{j\omega T})| \le \delta, \quad \omega T \in [0, \pi].
\end{aligned} \tag{28}$$

The adjustable parameters in (28) are the filter coefficients of the subfilters $G(z)$, $F_0(z)$, and $F_1(z)$, and $\delta$. For the specifications (25)–(27) to be fulfilled, we must find a solution with $\delta \le \delta_1$. The problem is a nonlinear optimization problem and therefore requires a good initial solution. For this purpose, we first optimize $G(z)$, $F_0(z)$, and $F_1(z)$ separately
Figure 4: Realization of the proposed analysis FB.

Figure 5: Sharing of multipliers between $G_0(-z^2)$ and $G_1(-z^2)$ in the 0th channel when $N_G = 3$.
and then these filters can serve as a good initial solution for further optimization according to (28).

In the following three sections, we give formulas for designing $G(z)$, $F_0(z)$, and $F_1(z)$, so that they together fulfill a general specification of an NPR FB. These formulas are based on worst-case assumptions, and therefore in general, we get some unnecessary design margin. Because of this, it might be possible to successively decrease the filter orders of the subfilters and still satisfy the given specifications (25)–(27) after simultaneous optimization.

For some specifications, for example, when $M$ is large, it might not be possible to do simultaneous optimization. Then, separate optimization can be used exclusively and give a good (although not optimal) solution. The masking filters
$F_0(z)$ and $F_1(z)$ can be designed using the McClellan-Parks algorithm [22] or linear programming to fulfill $\delta_c^{(F_0)}$, $\delta_s^{(F_0)}$, and $\delta_c^{(F_1)}$, $\delta_s^{(F_1)}$, respectively. The model filter $G(z)$ should be designed to fulfill $\delta_c^{(G)}$ and $\delta_s^{(G)}$ but also to be approximately power complementary with a maximally allowed error of $\delta_{PC}$. To this end, nonlinear optimization must be used, and, for example, the algorithm in [22] can be used to obtain an initial solution. Throughout the paper, the nonlinear optimization is performed in the minimax sense, but optimization in, for example, the least-squares sense is also possible after minor modifications.$^3$

3 The focus in this paper is on the design procedure, not the specific design criterion.

Figure 6: Passband and stopband regions for $H(e^{j\omega T})$.

4.1. Analysis filters

In order to fulfill the specification of frequency selectivity of the analysis filters, the magnitude of $H_{ak}(z)$ is studied, as a function of the three subfilters $G(z)$, $F_0(z)$, and $F_1(z)$. For convenience, we use the notation $X^{(\pm k)}(z)$ which stands for

$$X\big(e^{\pm j((2k+1)/2M)\pi} z\big). \tag{29}$$

This notation allows the transfer functions of the analysis filters to be written on the form

$$H_{ak}(z) = G^{(-k)}(z^L) E_{0k}(z) + G_c^{(-k)}(z^L) E_{1k}(z), \tag{30}$$

where $E_{0k}(z)$ and $E_{1k}(z)$ are two different combinations of the masking filters according to

$$E_{0k}(z) = \beta_k F_0^{(-k)}(z) + \beta_k^* F_1^{(+k)}(z), \qquad E_{1k}(z) = \beta_k^* F_0^{(+k)}(z) + \beta_k F_1^{(-k)}(z). \tag{31}$$

The reason for this rewriting is that the filters in (31) belong to Subclass I in [14] where useful formulas for ripple estimations are found. Using these formulas, as well as the fact that both $E_{0k}(z)$ and $E_{1k}(z)$ are the sum of the two filters $F_0(z)$ and $F_1(z)$, just shifted differently, the following restrictions on the different filters can be deduced:

$$\begin{aligned}
\delta_c^{(F_0)} + \delta_s^{(F_1)} &\le \min\big\{\delta_c^{(E_0)}, \delta_c^{(E_1)}\big\}, \\
\delta_c^{(F_1)} + \delta_s^{(F_0)} &\le \min\big\{\delta_c^{(E_0)}, \delta_c^{(E_1)}\big\}, \\
\delta_s^{(F_0)} + \delta_s^{(F_1)} &\le \min\big\{\delta_s^{(E_0)}, \delta_s^{(E_1)}\big\}.
\end{aligned} \tag{32}$$

These formulas hold under the condition that second- and higher-order terms are neglected. As seen, $F_0(z)$ and $F_1(z)$ are restricted equally and we can use the simplified notations $\delta_c^{(F)} = \delta_c^{(F_0)} = \delta_c^{(F_1)}$ and $\delta_s^{(F)} = \delta_s^{(F_0)} = \delta_s^{(F_1)}$. Furthermore, $G(z)$ has the same ripples as its complementary filter, [$G_c(z) = G(-z)$]; thus $\delta_c^{(G)} = \delta_c^{(G_c)}$ and $\delta_s^{(G)} = \delta_s^{(G_c)}$. This implies that Case 1 and Case 2 with respect to the design do not differ, and the final simplified requirements on the subfilters regarding ripples are

$$\begin{aligned}
\delta_c^{(F)} + \delta_s^{(F)} + \delta_{PC} &\le \delta_c, \\
\delta_c^{(F)} + \delta_s^{(F)} + \delta_c^{(G)} &\le \delta_c, \\
2\big(\delta_s^{(F)}\big)^2 + \big(\delta_s^{(G)}\big)^2 &\le \delta_s^2, \\
2\delta_s^{(F)} &\le \delta_s.
\end{aligned} \tag{33}$$

4.2. Distortion function

The distortion transfer function $V_0(z)$ is given by

$$V_0(z) = \sum_{k=0}^{M-1} H_{ak}(z) H_{sk}(z). \tag{34}$$

In the appendix, it is shown that the frequency response of the distortion function can be expressed using the zero-phase frequency response $V_{0R}(\omega T)$ as

$$V_0(e^{j\omega T}) = e^{-j(N_G L + N_F)\omega T} V_{0R}(\omega T), \tag{35}$$

where

$$V_{0R}(\omega T) = \sum_{k=0}^{M-1} \Big\{ \big[G_R^{(-k)}(L\omega T)\big]^2 \big[F_{0R}^{(-k)}(\omega T) + F_{1R}^{(+k)}(\omega T)\big]^2 + \big[G_{cR}^{(-k)}(L\omega T)\big]^2 \big[F_{0R}^{(+k)}(\omega T) + F_{1R}^{(-k)}(\omega T)\big]^2 \Big\}. \tag{36}$$

To have near PR, $V_0(e^{j\omega T})$ should approximate a pure delay. Here, linear phase is fulfilled exactly (with a delay of $LN_G + N_F$ samples) and therefore it is enough to make sure that $V_{0R}(\omega T)$ approximates one. Equation (36) leads to the following worst case ripple, ignoring second-order effects:

$$2\Big(\delta_c^{(F)} + \delta_s^{(F)} + \max\big\{\delta_{PC}, \delta_c^{(G)}\big\}\Big) \le \delta_0. \tag{37}$$

4.3. Aliasing functions

Because of the decimation after the analysis filters in Figure 1, $M - 1$ unwanted aliasing functions are introduced in the system. Their transfer functions are given in (24) for $m = 1, \ldots, M - 1$ and should approximate zero in a near-PR FB. Normally in modulated FBs, adjacent terms in the aliasing functions are summed up to zero. This is called adjacent-channel aliasing cancellation [2]. By inserting the expressions for $H_{ak}(z)$ and $H_{sk}(z)$ as given by (7) and (8) into (23) and (24), we obtain expressions for all $V_m(z)$, $m = 1, \ldots, M - 1$, and after a close investigation of these sums, the following
conclusions can be drawn. There are two masking filters, but only the contribution from one of them (the largest overlap) is perfectly cancelled by adjacent-channel cancellation. Because of this, all the $M$ terms in each aliasing function will make a small contribution to the aliasing error. The maximal ripple is determined by the stopband ripple of the masking filters, $\delta_s^{(F)}$, and the squared stopband ripple of the model filter, $(\delta_s^{(G)})^2$. More precisely we get $5\delta_s^{(F)} + 2(\delta_s^{(G)})^2$. Nonadjacent terms will have a maximum ripple of $2\delta_s^{(F)}$ and we have $M - 2$ of these terms. Therefore the worst case magnitude error for one aliasing function $\delta_1$ will be

$$2(M - 2)\delta_s^{(F)} + 5\delta_s^{(F)} + 2\big(\delta_s^{(G)}\big)^2 \le \delta_1. \tag{38}$$

For large $M$, this worst-case estimation of the aliasing functions will unfortunately be far from the real case. Therefore (38) is only useful for small and moderate values of $M$. A number of different filter banks have been synthesized, and these results indicate that $\delta_1$ typically has about the same size as $\delta_0$. This can be used as a guideline when designing filter banks for larger values of $M$.

4.4. Estimation of optimal L

The total number of multiplications per input/output sample (mults/sample) for the analysis (or synthesis) filter bank is expressed as

$$R = 2\,\frac{N_F + 1}{M} + \frac{N_G + 1}{2}, \tag{39}$$

where $N_G$ is the filter order of $G(z)$ and $N_F$ is the filter order of $F_0(z)$ and $F_1(z)$. Both $N_G$ and $N_F$ depend on the periodicity factor $L$ in the FRM technique, and this implies that the arithmetic complexity is heavily dependent on the choice of $L$. Therefore, a formula is derived for estimating its optimal value. The filters $F_0(z)$ and $F_1(z)$ work at a sampling rate reduced by a factor $M$ and thereby their number of mults/sample is also decreased by the same factor. Further, $G(z)$ is symmetric and it is possible for its polyphase components $G_0(z)$ and $G_1(z)$ to share multipliers.

To estimate the filter order of an FIR filter, one can use the formula

$$N = \frac{K}{\omega_s T - \omega_c T}, \tag{40}$$

where $\omega_s T$ and $\omega_c T$ are the stopband and passband edges of the filter. For $N_F$, a good approximation of $K$ is [8]

$$K_F = 2\pi\,\frac{-20\log_{10}\sqrt{\delta_s^{(F)}\delta_c^{(F)}} - 13}{14.6} \tag{41}$$

but for $N_G$, the additional condition of power complementarity [14] will increase the corresponding $K_G$. The masking filters $F_0(z)$ and $F_1(z)$ have the same transition bandwidth, $\pi/L - 2\Delta$, while the corresponding value for $G(z)$ is $2L\Delta$. With (40) and (41) the total number of mults/sample can be estimated as

$$R = \frac{2}{M}\left(\frac{K_F}{\pi/L - 2\Delta} + 1\right) + \frac{1}{2}\left(\frac{K_G}{2L\Delta} + 1\right). \tag{42}$$

By finding the derivative of this expression with respect to $L$, the optimal $L$ can be found for each specification as$^4$

$$L_{\mathrm{opt}} = \frac{1}{(2\Delta)/\pi + \sqrt{8\Delta K_F/(M\pi K_G)}}. \tag{43}$$

In addition, $L$ is restricted by the number of channels $M$, as $L = (4m \pm 1)M$ in (5).

4 The variable $K_G$ is assumed to be independent of $L$.

5. DESIGN EXAMPLES

To demonstrate the proposed design method, several modulated FBs are designed.$^5$ In the first two examples, the specifications in (25)–(27) are the following: $\delta_c = \delta_s = \delta_0 = \delta_1 = 0.01$. Further, the number of channels $M$ varies and determines the width of the transition band $2\Delta$, with $\Delta = 0.025\pi/M$. The third example is a comparison to [18, Example 2]. The interesting aspect to study when comparing multirate FBs is not the filter orders, but the number of multiplications per input/output sample (number of multiplications at the lower rate), here denoted as mults/sample. This is because different filters can work at different sample rates. For the proposed FBs, the number of mults/sample can be calculated as in (39), whereas with a regular FIR prototype filter of order $N$, it is simply $2((N + 1)/M)$. One should also keep in mind that the modulation blocks also contribute to the total arithmetic complexity of the FBs and that only one is needed with a regular FIR prototype filter or with the approach in [18]. This contribution is however independent of the filter orders and has a relatively low complexity compared to the filter part. It is therefore not discussed here.

5 For the joint optimization, the Matlab function fminimax.m has been used.

Example 1. An FB with $M = 5$ was designed and the estimated optimal $L$ was found to be either 5 or 15, depending on the choice of $K_G$ in Section 4.4. Both cases were considered, and 15 was found to give the FB with lowest complexity for the given specification. Translating the specification to restrictions on the three subfilters gives $\delta_c^{(F)} = 0.001$, $\delta_s^{(F)} = 0.00085$, $\delta_c^{(G)} = 0.0031$, $\delta_{PC} = 0.0031$, and $\delta_s^{(G)} = 0.0099$. These specifications are met with filter orders $N_G = 47$ and $N_F = 114$. Further, with successive decrement of $N_F$, the specification was found to be fulfilled for $N_F \ge 102$. Magnitude responses of the analysis filters, distortion function, and aliasing functions with $N_F = 102$ are plotted in Figures 7, 8, and 9. Using nonlinear optimization, the filter orders could be lowered to $N_G = 39$ and $N_F = 58$ and still meet the specification. This shows that for this particular specification, there was a large design margin. The corresponding magnitude responses are depicted in Figures 10, 11, and 12. Using (39), the implementation cost without the nonlinear optimization procedure for the overall FB (including the
analysis and synthesis parts) is 130.4 mults/sample plus the cost to implement the cosine and sine modulation blocks. After the nonlinear optimization procedure, the number is only 87.2.

As a comparison, a regular FIR$^6$ cosine modulated NPR FB is estimated to need a filter order of about 580. Therefore, at least about 232 mults/sample are needed in the filter part using a regular FIR prototype filter. Thus, even without the nonlinear optimization procedure, the proposed method gives a solution with substantially lower arithmetic complexity.

6 The estimation is taken from the 2-channel case, and then when generalizing, the filter order is assumed to be inversely proportional to the transition bandwidth.

As usual when employing the FRM technique, we achieve more savings when the transition band becomes more narrow. The price to pay for the decreased arithmetic complexity and the decreased number of optimization parameters is, as always when using an FRM approach with linear-phase subfilters, a longer overall delay. In this example, the delay is about 39% longer for the proposed FB without joint optimization compared to the regular FB. With joint optimization, the figure is decreased to 11%.

Figure 7: Magnitude responses of the analysis filters without the nonlinear optimization procedure with $N_G = 47$ and $N_F = 102$, Example 1. Axes: magnitude (dB) versus $\omega T$ (rad).

Figure 8: Magnitude response of the distortion function without the nonlinear optimization procedure with $N_G = 47$ and $N_F = 102$, Example 1. Axes: magnitude (dB) versus $\omega T$ (rad).

Figure 9: Magnitude responses of the aliasing functions without the nonlinear optimization procedure with $N_G = 47$ and $N_F = 102$, Example 1. Axes: magnitude (dB) versus $\omega T$ (rad).

Figure 10: Magnitude responses of the analysis filters with $N_G = 39$ and $N_F = 58$, Example 1. Axes: magnitude (dB) versus $\omega T$ (rad).

Example 2. With increasing $M$, also $L$ increases and it becomes difficult to optimize the different filters together in the minimax sense. However, optimizing them separately also gives good results. Filter banks with $M = 8$, 16, 32, and 256 were designed, and the optimal $L$ was found to be 24, 48, 96, and 768, respectively. The number of multiplications required per sample in the filter parts is listed in Table 1. For comparison reasons, the estimated complexity with a regular FIR prototype filter (estimated as above) is also given. Further, the total delay of the filter parts of the different FBs is given, as well as the number of distinct filter coefficients to optimize. When the number of channels is doubled, the transition bands of the masking filters and the regular FIR filter are halved. This corresponds to an approximately doubled filter order. But since the sampling rate for the filters is also halved, the number of multiplications per sample remains about the same. This is the reason for the limited variations for different $M$ in Table 1. For further illustration, some details for $M = 32$ are given. When (33) and (37) are used to distribute the ripples ((38) is not considered because of the size of $M$), the required filter orders were $N_G = 47$ and $N_F = 716$. With a successive decrement of $N_F$, the specification was found to be fulfilled for $N_F \ge 658$.$^7$ The ripples after the separate design are $\delta_c < 0.0040$, $\delta_s < 0.0034$, $\delta_0 < 0.0096$, and $\delta_1 < 0.0071$, and the magnitude response of the analysis filters is shown in Figure 13.

7 The decrease of $N_F$ may seem large, but it only corresponds to a reduction of 5% of the overall complexity.

Example 3. A comparison with [18, Example 2] has been made and the results are summarized in Table 2. The data in the first column is synthesized with $L = 24$. The second column corresponds to a separate design of the subfilters us-
1.01
0
Magnitude (dB)
Magnitude (dB)
1.005
20
1
40
0.995
0.99 60
0 0.2π 0.4π 0.6π 0.8π π 0 0.2π 0.4π 0.6π 0.8π π
ωT (rad) ωT (rad)
Figure 11: Magnitude response of the distortion function without
the nonlinear optimization procedure with NG = 39 and NF = 58, Figure 13: Magnitude responses of the analysis filters with separate
Example 1. optimization for M = 32, Example 2.
40 Table 2: Comparison with [18, Example 2].

Magnitude (dB)
[18, Example 2] L = 24 L = 40
60
NG 186 169 101
80 NF0 (NF1 ) 143 210 329
δs 0.0014 0.0014 0.0014
0 0.2π 0.4π 0.6π 0.8π π
ωT (rad)
δ0 0.009 0.000 47 0.006
δ1 0.0018 0.000 51 0.000 81
Figure 12: Magnitude responses of the aliasing functions without
the nonlinear optimization procedure with NG = 39 and NF = 58, Coefficients 475(238) 297 383
Example 1. Mults./sample 446 275.5 267
Delay 4 607 4 266 4 369
Table 1: Number of multiplications per sample, total delay, and

number of optimization parameters using the proposed prototype
filters or a regular FIR prototype filter, for different numbers of solution with L = 40 is preferable. Due to the extra up-
channels. samplers in [18], some subfilters work at a higher sampling
FB class M Mults/sample Coefficients Delay rate compared to our proposal. This seems to be the main
explanation to the significant difference (40% decrease) in
Proposed 8 130.5 190 1292 arithmetic complexity. The number of distinct coefficients
Regular FIR 8 232.25 465 928 to be optimized given in [18, Example 2] is 475, but since
Proposed 16 129.75 352 2582 their three subfilters all have linear phase, the correct num-
ber seems more likely to be 238. However, using the number
Regular FIR 16 232.125 929 1856
given in the example, the proposed FBs have about 20% less
Proposed 32 130.375 683 5170 optimization parameters.
Regular FIR 32 232.0625 1857 3712
Proposed 256 132.39 5426 41 496 6. CONCLUSION
Regular FIR 256 232.008 14 849 29 696 This paper introduced an approach for synthesizing mod-
ulated maximally decimated FIR FBs using the FRM tech-
nique. For this purpose, a new class of FRM filters was in-
troduced. Each of the analysis and synthesis FBs is realized
ing the distribution formulas given in (33), (37), and (38), with the aid of three filters, one cosine modulation block, and
with L = 24. In the last column, results with L = 40 are pre- one sine modulation block. The overall FBs achieve nearly
sented. When the distribution formulas for L = 40 were used, PR with a linear-phase distortion function. Further, a design
NF0 and NF1 were found to be 361, but after the separate op- procedure is given, allowing synthesis of a general FB speci-
timization, it was possible to lower these orders to 329.8 No fication. Compared to similar approaches, the proposed FBs
joint optimization has been performed on the FBs in column have about 40% lower arithmetic complexity. Compared to
two or three; thus these results can be improved further. regular cosine modulated FIR FBs, both the overall arith-
In terms of distinct coefficients, L = 24 is the best choice, metic complexity and the number of distinct filter coeffi-
but if the number of mults/sample is more interesting, the cients are significantly reduced, at the expense of an increased
overall FB delay in applications requiring narrow transition
bands. These statements were demonstrated by means of sev-
8 For L = 24, it was not possible to decrease the filter orders. eral design examples.
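The scaling argument of Example 2 (doubling $M$ doubles the filter order but halves the rate at which the filter runs) can be illustrated with a short sketch that reproduces the "Regular FIR" rows of Table 1. The formulas below are inferred from the numbers in the table rather than quoted from the text: the function name `regular_fir_row`, the order $N = 116M$, and the per-sample cost $(N+1)\cdot 2/M$ are assumptions that happen to match the tabulated values.

```python
def regular_fir_row(M, order_per_channel=116):
    """Return (mults_per_sample, coefficients, delay) for a regular FIR prototype.

    Assumptions (inferred from Table 1, not stated in the text):
    the prototype order is N = 116 * M, the filter has linear phase,
    and the per-sample cost of the filter part is (N + 1) * 2 / M.
    """
    order = order_per_channel * M            # doubles whenever M doubles
    delay = order                            # matches the Delay column
    coefficients = order // 2 + 1            # distinct taps of a symmetric filter
    mults_per_sample = (order + 1) * 2 / M   # stays close to 232 for every M
    return mults_per_sample, coefficients, delay


for M in (8, 16, 32, 256):
    print(M, regular_fir_row(M))
```

Under these assumptions the multiplication count approaches 232 as $M$ grows, which is the limited variation across $M$ that the text attributes to the order and the sampling rate scaling in opposite directions.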
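APPENDIX

This appendix shows some of the properties of the proposed FBs concerning the prototype filters, the analysis filters, and the synthesis filters.

We first regard the magnitude response of the prototype filters and the phase response of $P_a(e^{j\omega T})P_s(e^{j\omega T})$ (properties (1) and (2) in Section 3.3). The frequency responses of $G(e^{j\omega T})$, $G_c(e^{j\omega T})$, $F_0(e^{j\omega T})$, and $F_1(e^{j\omega T})$ can be written as

\[
\begin{aligned}
G(e^{j\omega T}) &= e^{-jN_G\omega T/2}\,G_R(\omega T),\\
G_c(e^{j\omega T}) &= e^{-jN_G\omega T/2}\,G_{cR}(\omega T),\\
F_0(e^{j\omega T}) &= e^{-jN_F\omega T/2}\,F_{0R}(\omega T),\\
F_1(e^{j\omega T}) &= e^{-jN_F\omega T/2}\,F_{1R}(\omega T),
\end{aligned}
\tag{A.1}
\]

where $G_R(\omega T)$, $G_{cR}(\omega T)$, $F_{0R}(\omega T)$, and $F_{1R}(\omega T)$ denote zero-phase frequency responses. We rewrite the magnitude responses of the prototype filters in (2) and (3) as

\[
\begin{aligned}
\left|P_a(e^{j\omega T})\right|
&= \left|G(e^{jL\omega T})F_0(e^{j\omega T}) + G_c(e^{jL\omega T})F_1(e^{j\omega T})\right|\\
&= \left|e^{-j(N_G L+N_F)\omega T/2}\left[G_R(L\omega T)F_{0R}(\omega T) + jG_{cR}(L\omega T)F_{1R}(\omega T)\right]\right|,\\
\left|P_s(e^{j\omega T})\right|
&= \left|G(e^{jL\omega T})F_0(e^{j\omega T}) - G_c(e^{jL\omega T})F_1(e^{j\omega T})\right|\\
&= \left|e^{-j(N_G L+N_F)\omega T/2}\left[G_R(L\omega T)F_{0R}(\omega T) - jG_{cR}(L\omega T)F_{1R}(\omega T)\right]\right|.
\end{aligned}
\tag{A.2}
\]

From (A.2) it follows that the squared magnitude responses of the two prototype filters are

\[
\left|P_a(e^{j\omega T})\right|^2
= G_R^2(L\omega T)F_{0R}^2(\omega T) + G_{cR}^2(L\omega T)F_{1R}^2(\omega T)
= \left|P_s(e^{j\omega T})\right|^2,
\tag{A.3}
\]

thus identical. Further, the product of the two frequency responses has linear phase, as can be seen in (A.4) below. Hereafter, $(\omega T)$ and $(L\omega T)$ are left out for the sake of simplicity,

\[
\begin{aligned}
P_a(e^{j\omega T})P_s(e^{j\omega T})
&= e^{-j(N_G L+N_F)\omega T}\left(G_R F_{0R} + jG_{cR}F_{1R}\right)\left(G_R F_{0R} - jG_{cR}F_{1R}\right)\\
&= e^{-j(N_G L+N_F)\omega T}\left(G_R^2 F_{0R}^2 + G_{cR}^2 F_{1R}^2\right).
\end{aligned}
\tag{A.4}
\]

Secondly, we show that the magnitude responses of the analysis filters and the synthesis filters are equal, and that the product of $H_{ak}(e^{j\omega T})$ and $H_{sk}(e^{j\omega T})$ has a linear-phase response with delay $LN_G + N_F$ (properties (3) and (4) in Section 3.3). We use the notation in (29) and rewrite the transfer functions of the analysis filters, (7), and the synthesis filters, (8), as

\[
\begin{aligned}
H_{ak}(z) &= \beta_k P_a^{(-k)}(z) + \beta_k^{*} P_a^{(+k)}(z)\\
&= \beta_k\left[G^{(-k)}(z^L)F_0^{(-k)}(z) + G_c^{(-k)}(z^L)F_1^{(-k)}(z)\right]
 + \beta_k^{*}\left[G^{(+k)}(z^L)F_0^{(+k)}(z) + G_c^{(+k)}(z^L)F_1^{(+k)}(z)\right],\\
H_{sk}(z) &= c\,j(-1)^k\left[\beta_k P_s^{(-k)}(z) - \beta_k^{*} P_s^{(+k)}(z)\right]\\
&= c\,j(-1)^k\left\{\beta_k\left[G^{(-k)}(z^L)F_0^{(-k)}(z) - G_c^{(-k)}(z^L)F_1^{(-k)}(z)\right]
 - \beta_k^{*}\left[G^{(+k)}(z^L)F_0^{(+k)}(z) - G_c^{(+k)}(z^L)F_1^{(+k)}(z)\right]\right\}.
\end{aligned}
\tag{A.5}
\]

We use the fact that

\[
\begin{aligned}
\left(e^{\pm j((2k+1)/2M)\pi}z\right)^{2L} &= -z^{2L},\\
\left(e^{j((2k+1)/2M)\pi}z\right)^{L} &= \pm j(-1)^k z^{L},\\
\left(e^{-j((2k+1)/2M)\pi}z\right)^{L} &= \mp j(-1)^k z^{L},
\end{aligned}
\tag{A.6}
\]

where the plus or minus sign depends on $k$ and on $m$ in (5). Rewriting the model filters using their polyphase components we get

\[
\begin{aligned}
G^{(-k)}(z^L) &= G_0(-z^{2L}) \mp j(-1)^k z^{-L}G_1(z^{2L}),\\
G^{(+k)}(z^L) &= G_0(-z^{2L}) \pm j(-1)^k z^{-L}G_1(z^{2L}),\\
G_c^{(-k)}(z^L) &= G_0(-z^{2L}) \pm j(-1)^k z^{-L}G_1(z^{2L}),\\
G_c^{(+k)}(z^L) &= G_0(-z^{2L}) \mp j(-1)^k z^{-L}G_1(z^{2L}).
\end{aligned}
\tag{A.7}
\]

This gives us the following relation between $G(z)$ and $G_c(z)$:

\[
G^{(-k)}(z^L) = G_c^{(+k)}(z^L), \qquad G^{(+k)}(z^L) = G_c^{(-k)}(z^L).
\tag{A.8}
\]

Now we rewrite the transfer functions of the analysis and synthesis filters as

\[
\begin{aligned}
H_{ak}(z) &= G^{(-k)}(z^L)\left[\beta_k F_0^{(-k)}(z) + \beta_k^{*}F_1^{(+k)}(z)\right]
 + G_c^{(-k)}(z^L)\left[\beta_k^{*}F_0^{(+k)}(z) + \beta_k F_1^{(-k)}(z)\right],\\
H_{sk}(z) &= c\,j(-1)^k\left\{G^{(-k)}(z^L)\left[\beta_k F_0^{(-k)}(z) + \beta_k^{*}F_1^{(+k)}(z)\right]
 - G_c^{(-k)}(z^L)\left[\beta_k^{*}F_0^{(+k)}(z) + \beta_k F_1^{(-k)}(z)\right]\right\}.
\end{aligned}
\tag{A.9}
\]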
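The rotation identities in (A.6) can be checked numerically with a small sketch. It assumes that the periodicity factor has the form $L = (2m+1)M$, which is consistent with the values used in the examples ($L = 24$ and $L = 40$ for $M = 8$) but is an assumption here, since the defining equation (5) lies outside this appendix; the function name `modulation_factors` is hypothetical.

```python
import cmath


def modulation_factors(k, M, L):
    """Constant factors produced by substituting z -> exp(+/- j*theta) * z.

    theta = (2k + 1) * pi / (2M) is the modulation angle of channel k.
    Returns the factors multiplying z^(2L), z^L (plus rotation), and
    z^L (minus rotation), in that order.
    """
    theta = (2 * k + 1) * cmath.pi / (2 * M)
    return (cmath.exp(2j * theta * L),
            cmath.exp(1j * theta * L),
            cmath.exp(-1j * theta * L))


M = 8
for L in (24, 40):                 # assumed form L = (2m + 1) * M
    m = (L // M - 1) // 2
    for k in range(M):
        f2L, f_plus, f_minus = modulation_factors(k, M, L)
        expected = 1j * (-1) ** (k + m)   # sign set jointly by k and m
        assert abs(f2L + 1) < 1e-9        # first line of (A.6)
        assert abs(f_plus - expected) < 1e-9
        assert abs(f_minus + expected) < 1e-9
print("(A.6) holds for all tested k")
```

Under this assumption the factor for the positive rotation is $j(-1)^{k+m}$, which is one way to read the statement that the sign in (A.6) depends on both $k$ and $m$.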
We use (A.9) and omit $(\omega T)$ and $(L\omega T)$ to write their frequency responses as

\[
\begin{aligned}
H_{ak}(e^{j\omega T}) &= e^{-j(N_G L+N_F)\omega T/2 \,\mp\, j(k+0.5)\pi N_G}
 \cdot\left[G_R^{(-k)}\left(F_{0R}^{(-k)} + F_{1R}^{(+k)}\right) + jG_{cR}^{(-k)}\left(F_{0R}^{(+k)} + F_{1R}^{(-k)}\right)\right],\\
H_{sk}(e^{j\omega T}) &= c\,j(-1)^k e^{-j(N_G L+N_F)\omega T/2 \,\mp\, j(k+0.5)\pi N_G}
 \cdot\left[G_R^{(-k)}\left(F_{0R}^{(-k)} + F_{1R}^{(+k)}\right) - jG_{cR}^{(-k)}\left(F_{0R}^{(+k)} + F_{1R}^{(-k)}\right)\right].
\end{aligned}
\tag{A.10}
\]

From this, it follows that the magnitudes of the frequency responses are equal, as can be seen in (A.11) below,

\[
\begin{aligned}
\left|H_{ak}(e^{j\omega T})\right| &= \left|G_R^{(-k)}\left(F_{0R}^{(-k)} + F_{1R}^{(+k)}\right) + jG_{cR}^{(-k)}\left(F_{0R}^{(+k)} + F_{1R}^{(-k)}\right)\right|,\\
\left|H_{sk}(e^{j\omega T})\right| &= \left|G_R^{(-k)}\left(F_{0R}^{(-k)} + F_{1R}^{(+k)}\right) - jG_{cR}^{(-k)}\left(F_{0R}^{(+k)} + F_{1R}^{(-k)}\right)\right|.
\end{aligned}
\tag{A.11}
\]

Finally, since $e^{\mp j(k+0.5)\pi N_G} = -c\,j(-1)^k$, the product of the filters $H_{ak}(e^{j\omega T})$ and $H_{sk}(e^{j\omega T})$ is

\[
H_{ak}(e^{j\omega T})H_{sk}(e^{j\omega T})
= e^{-j(N_G L+N_F)\omega T}\cdot\left[\left(G_R^{(-k)}\right)^2\left(F_{0R}^{(-k)} + F_{1R}^{(+k)}\right)^2
 + \left(G_{cR}^{(-k)}\right)^2\left(F_{0R}^{(+k)} + F_{1R}^{(-k)}\right)^2\right]
\tag{A.12}
\]

and thus

\[
\begin{aligned}
V_0(e^{j\omega T}) &= \sum_{k=0}^{M-1} H_{ak}(e^{j\omega T})H_{sk}(e^{j\omega T})\\
&= e^{-j(N_G L+N_F)\omega T}\sum_{k=0}^{M-1}\left[\left(G_R^{(-k)}\right)^2\left(F_{0R}^{(-k)} + F_{1R}^{(+k)}\right)^2
 + \left(G_{cR}^{(-k)}\right)^2\left(F_{0R}^{(+k)} + F_{1R}^{(-k)}\right)^2\right],
\end{aligned}
\tag{A.13}
\]

which obviously has a linear-phase response of $-(N_G L + N_F)\omega T$.

REFERENCES

[1] R. E. Crochiere and L. R. Rabiner, Multirate Digital Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, USA, 1983.
[2] P. P. Vaidyanathan, Multirate Systems and Filter Banks, Prentice-Hall, Englewood Cliffs, NJ, USA, 1993.
[3] N. J. Fliege, Multirate Digital Signal Processing, John Wiley & Sons, New York, NY, USA, 1994.
[4] T. Saramäki, “A generalized class of cosine modulated filter banks,” in Proceedings of the 1st International Workshop on Transforms and Filter Banks, pp. 336–365, Tampere, Finland, February 1998.
[5] R. Bregović and T. Saramäki, “An efficient approach for designing nearly perfect-reconstruction low-delay cosine-modulated filter banks,” in Proceedings of the IEEE International Symposium on Circuits and Systems, vol. 1, pp. 825–828, Phoenix, Ariz, USA, May 2002.
[6] L. Svensson, P. Löwenborg, and H. Johansson, “A class of cosine-modulated causal IIR filter banks,” in Proceedings of the 9th International Conference on Electronics, Circuits and Systems (ICECS ’02), vol. 3, pp. 915–918, Dubrovnik, Croatia, September 2002.
[7] A. Eshraghi and T. S. Fiez, “A comparative analysis of parallel delta-sigma ADC architectures,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 51, no. 3, pp. 450–458, 2004.
[8] J. F. Kaiser, “Nonrecursive digital filter design using I0-sinh window function,” in Proceedings of the IEEE Symposium on Circuits & Systems (ISCAS ’74), vol. 3, pp. 20–23, San Francisco, Calif, USA, April 1974.
[9] T. Saramäki, “Finite impulse response filter design,” in Handbook for Digital Signal Processing, S. K. Mitra and J. F. Kaiser, Eds., chapter 4, pp. 155–277, John Wiley & Sons, New York, NY, USA, 1993.
[10] Y. C. Lim, “Frequency-response masking approach for the synthesis of sharp linear phase digital filters,” IEEE Transactions on Circuits and Systems, vol. 33, no. 4, pp. 357–364, 1986.
[11] Y. C. Lim and Y. Lian, “The optimum design of one and two-dimensional FIR filters using the frequency response masking technique,” IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 40, no. 2, pp. 88–95, 1993.
[12] T. Saramäki, “Design of computationally efficient FIR filters using periodic subfilters as building blocks,” in The Circuits and Filters Handbook, W. K. Chen, Ed., pp. 2578–2601, CRC Press, Boca Raton, Fla, USA, 1995.
[13] H. Johansson and T. Saramäki, “Two-channel FIR filter banks based on the frequency-response masking approach,” in Proceedings of the 2nd International Workshop on Transforms Filter Banks, Brandenburg an der Havel, Germany, March 1999.
[14] H. Johansson, “New classes of frequency-response masking FIR filters,” in Proceedings of the IEEE International Symposium on Circuits and Systems, vol. 3, pp. 81–84, Geneva, Switzerland, May 2000.
[15] P. S. R. Diniz, L. C. R. De Barcellos, and S. L. Netto, “Design of cosine-modulated filter bank prototype filters using the frequency-response masking approach,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP ’01), vol. 6, pp. 3621–3624, Salt Lake, Utah, USA, May 2001.
[16] M. B. Furtado Jr., P. S. R. Diniz, and S. L. Netto, “Optimized prototype filter based on the FRM approach for cosine-modulated filter banks,” Circuits, Systems, and Signal Processing, vol. 22, no. 2, pp. 193–210, 2003.
[17] S. L. Netto, L. C. R. De Barcellos, and P. S. R. Diniz, “Efficient design of narrowband cosine-modulated filter banks using a two-stage frequency-response masking approach,” Journal of Circuits, Systems and Computers, vol. 12, no. 5, pp. 631–642, 2003.
[18] P. S. R. Diniz, L. C. R. De Barcellos, and S. L. Netto, “Design of high-resolution cosine-modulated transmultiplexers with sharp transition band,” IEEE Transactions on Signal Processing, vol. 52, no. 5, pp. 1278–1288, 2004.
[19] M. B. Furtado Jr., P. S. R. Diniz, S. L. Netto, and T. Saramäki, “On the design of high-complexity cosine-modulated transmultiplexers based on the frequency-response masking approach,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 52, no. 11, pp. 2413–2426, 2005.
[20] L. Svensson, P. Löwenborg, and H. Johansson, “Modulated m-channel FIR filter banks utilizing the frequency response masking approach,” in Proceedings of the IEEE Nordic Signal Processing Symposium (NORSIG ’02), Hurtigruta, Tromsö-Trondheim, Norway, October 2002.
[21] L. Rosenbaum, P. Löwenborg, and H. Johansson, “Cosine and sine modulated FIR filter banks utilizing the frequency-response masking approach,” in Proceedings of the IEEE International Symposium on Circuits and Systems, vol. 3, pp. 882–885, Bangkok, Thailand, May 2003.
[22] J. H. McClellan, T. W. Parks, and L. R. Rabiner, “A computer program for designing optimum FIR linear phase digital filters,” IEEE Transactions on Audio and Electroacoustics, vol. 21, no. 6, pp. 506–526, 1973.
Linnéa Rosenbaum (maiden name Svensson) was born in Färgaryd, Sweden, in 1976. She received the M.S. degree in applied physics and electrical engineering and the Licentiate degree in electronics systems from Linköping University, Sweden, in 2001 and 2003, respectively. She is currently pursuing her studies for the Doctoral degree. Her research interests are digital filters with emphasis on realization and implementation of filter banks. She received the IEEE Nordic Signal Processing Symposium Best Paper Award 2002.

Per Löwenborg was born in Oskarshamn, Sweden, in 1974. He received the M.S. degree in applied physics and electrical engineering and the Licentiate and Doctoral degrees in electronics systems from Linköping University, Sweden, in 1998, 2001, and 2002, respectively. His research interests are within the field of theory, design, and implementation of analog and digital signal processing electronics. He is the author or coauthor of one book and more than 50 international journal and conference papers. He was awarded the 1999 IEEE Midwest Symposium on Circuits and Systems Best Student Paper Award and the 2002 IEEE Nordic Signal Processing Symposium Best Paper Award. He is a Member of the IEEE.

Håkan Johansson was born in Kumla, Sweden, in 1969. He received the M.S. degree in computer science and the Licentiate, Doctoral, and Docent degrees in electronics systems from Linköping University, Sweden, in 1995, 1997, 1998, and 2001, respectively. During 1998 and 1999, he held a postdoctoral position at Signal Processing Laboratory, Tampere University of Technology, Finland. He is currently a Professor in electronics systems at the Department of Electrical Engineering of Linköping University. His research interests include theory, design, and implementation of signal processing systems. He is the author or coauthor of four textbooks and more than 100 international journal and conference papers. He has served/serves as an Associate Editor for the IEEE Transactions on Circuits and Systems-II (2000–2001), IEEE Signal Processing Letters (2004–2007), and IEEE Transactions on Signal Processing (2006–2008), and he is a Member of the IEEE International Symposium on Circuits and Systems DSP Track Committee.
doi:10.1155/2007/13754
Research Article
Fixed Wordsize Implementation of Lifting Schemes
Tanja Karp
Department of Electrical and Computer Engineering, College of Engineering, Texas Tech University,
P.O. Box 43102, Lubbock, TX 79409-3102, USA
Received 16 December 2005; Revised 29 May 2006; Accepted 26 August 2006
We present a reversible nonlinear discrete wavelet transform with predefined fixed wordsize based on lifting schemes. Restricting the dynamic range of the wavelet domain coefficients due to a fixed wordsize may result in overflow. We show how this overflow has to be handled in order to maintain reversibility of the transform. We also analyze how large a wordsize of the wavelet coefficients is needed to perform optimal lossless and lossy compression of images. The scheme is advantageous over well-known integer-to-integer transforms since the wordsize of adders and multipliers can be predefined and does not increase steadily. This also results in significant gains in hardware implementations.
Copyright © 2007 Tanja Karp. This is an open access article distributed under the Creative Commons Attribution License, which
permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. INTRODUCTION

Lifting schemes have been introduced by Daubechies and Sweldens as a structure to design and implement the discrete wavelet transform (DWT) [1]. They have gained immense popularity, since they provide an elegant means to maintain perfect reconstruction of the DWT while staying within the integers [2], thus allowing for lossless coding. At each lifting step, a quantizer that ensures that intermediate results are rounded to integer numbers is introduced.

Lifting schemes have been applied to a variety of fields. They are used to implement the polyphase filters of modulated filter banks [3], and are building blocks of integer-to-integer transforms such as the IntFFT [4], the IntDCT [5], and the YCoCg-R color space transform [6]. More recently, the concept of lifting has been extended to multidimensional inputs [7]. Lifting schemes can even be designed in such a way that they preserve signal symmetries, a fact that is important for image compression. The performance of nonlinear reversible DWT using lifting schemes has been studied in [8].

While keeping the wordlength of the wavelet coefficients finite is a necessary condition for lossless coding, the dynamic range of the signal and thus its wordlength normally increase. This is not only disadvantageous from a compression point of view, but also means a higher complexity of the circuitry. For example, in an FPGA, the number of gates needed for a multiplier increases significantly with the wordlength. It is therefore of interest to keep the maximum wordsize limited, thus resulting in a fixed-point implementation of the lifting scheme, where, in addition to rounding, overflow occurs if a value exceeds the range of representable numbers.

In this paper, we show that lifting schemes are also robust towards overflow if it is handled as wrap-around and we evaluate the performance of fixed wordsize lifting schemes.

2. FIXED WORDSIZE LIFTING

A typical lifting step and its inverse can be described in a matrix form as

\[
\begin{aligned}
\begin{bmatrix} Z\{y_0(n)\} \\ Z\{y_1(n)\} \end{bmatrix}
&= \begin{bmatrix} 1 & A(z) \\ 0 & 1 \end{bmatrix}
\cdot \begin{bmatrix} Z\{x_0(n)\} \\ Z\{x_1(n)\} \end{bmatrix},\\
\begin{bmatrix} Z\{\hat{x}_0(n)\} \\ Z\{\hat{x}_1(n)\} \end{bmatrix}
&= \begin{bmatrix} 1 & -A(z) \\ 0 & 1 \end{bmatrix}
\cdot \begin{bmatrix} Z\{y_0(n)\} \\ Z\{y_1(n)\} \end{bmatrix},
\end{aligned}
\tag{1}
\]

where $Z\{x(n)\}$ denotes the z transform of a sequence $x(n)$ and $A(z)$ is an FIR filter. For integer-to-integer lifting, the output sequence of the filter $A(z)$ is rounded to be an integer. For a fixed wordsize implementation, we additionally need to take care of overflow, in case the result exceeds the dynamic range defined by the wordsize. Overflow can occur at each multiplication and each addition during the lifting step. For reasons of simplicity, let us first look at the case where the filter $A(z)$ simplifies to a constant, that is, $A(z) = a$. The rounding and overflow operations for the lifting and inverse lifting steps are shown in Figure 1. Overflow needs to be handled at the output of the multiplier as well as at the output of the adder.

Figure 1: Fixed-point implementation of (a) a lifting step and (b) its inverse. (Block diagram: $x_0(n)$ is multiplied by $a$, rounded, and passed through an overflow block to give $v(n)$; $v(n)$ is combined with the second branch in an adder that is followed by a further overflow block.)

We assume that $x_0(n)$, $x_1(n)$ are finite resolution numbers representable with a fixed wordsize, and $a$ is an arbitrary real number. It can be easily seen from Figure 1 that $\hat{x}_0(n) = x_0(n)$ always holds true if the same fixed-point number format is chosen for lifting and inverse lifting. The multiplication $a \cdot x_0(n)$ is implemented at both the lifting step and the inverse lifting step since $y_0 = x_0$, resulting in the same sequence $v(n)$ at the lifting and inverse lifting steps as long as the same rounding rules are applied and overflow is handled in the same way (saturation or wrap-around) for both schemes. If no overflow occurs during the addition $y_1(n) = x_1(n) + v(n)$, then no overflow occurs during the subtraction at the inverse lifting step and we obtain $\hat{x}_1(n) = y_1(n) - v(n) = x_1(n)$.

For overflow occurring during the addition $v(n) + x_1(n)$ and the subtraction $y_1(n) - v(n)$, the overflow only cancels if it is treated as wrap-around and not as saturation. In the following, we assume that we can represent numbers in the range from $x_{\min}$ to $x_{\max}$, including these two margins.

In saturation, numbers that are larger than $x_{\max}$ or smaller than $x_{\min}$ are mapped to $x_{\max}$ or $x_{\min}$, respectively. Assuming that overflow occurs at a certain time index $n_0$ in such a way that $x_1(n_0) + v(n_0) > x_{\max}$ yields

\[
y_1(n_0) = \operatorname{saturate}\big(x_1(n_0) + v(n_0)\big) = x_{\max},
\tag{2}
\]
\[
\hat{x}_1(n_0) = y_1(n_0) - v(n_0) = x_{\max} - v(n_0) \neq x_1(n_0).
\tag{3}
\]

Note that the result of $y_1(n_0) - v(n_0)$ always lies within the range of representable numbers, and therefore no overflow occurs in (3). A similar case can be made for $x_1(n_0) + v(n_0) < x_{\min}$.

In wrap-around, overflow is handled in such a way that the amount by which a number exceeds $x_{\max}$ will be added to $x_{\min}$ and the amount by which a number is smaller than $x_{\min}$ will be subtracted from $x_{\max}$. Note that wrap-around is very similar to the modulus operation and results in $\operatorname{wrap}(x + k(x_{\max} - x_{\min})) = x$, if $k$ is an integer value. While the overflow error is larger in wrap-around than in saturation, it has the advantage that the result obtained after adding a number and then subtracting the same number is the original value even if overflow affected the intermediate result. Thus, if $x_1(n_0) + v(n_0) > x_{\max}$, we obtain

\[
\begin{aligned}
y_1(n_0) &= \operatorname{wrap}\big(x_1(n_0) + v(n_0)\big)\\
&= x_{\min} + \big(x_1(n_0) + v(n_0) - x_{\max}\big)\\
&= x_1(n_0) + v(n_0) - (x_{\max} - x_{\min}),\\
\hat{x}_1(n_0) &= \operatorname{wrap}\big(y_1(n_0) - v(n_0)\big)\\
&= \operatorname{wrap}\big(x_1(n_0) + v(n_0) - (x_{\max} - x_{\min}) - v(n_0)\big)\\
&= \operatorname{wrap}\big(x_1(n_0) - (x_{\max} - x_{\min})\big) = x_1(n_0),
\end{aligned}
\tag{4}
\]

and if $x_1(n_0) + v(n_0) < x_{\min}$, we get

\[
\begin{aligned}
y_1(n_0) &= \operatorname{wrap}\big(x_1(n_0) + v(n_0)\big)\\
&= x_{\max} - \big(x_{\min} - (x_1(n_0) + v(n_0))\big)\\
&= x_1(n_0) + v(n_0) + (x_{\max} - x_{\min}),\\
\hat{x}_1(n_0) &= \operatorname{wrap}\big(y_1(n_0) - v(n_0)\big)\\
&= \operatorname{wrap}\big(x_1(n_0) + v(n_0) + (x_{\max} - x_{\min}) - v(n_0)\big)\\
&= \operatorname{wrap}\big(x_1(n_0) + (x_{\max} - x_{\min})\big) = x_1(n_0).
\end{aligned}
\tag{5}
\]

From the equations above, we see that the overflow error introduced at the lifting step is canceled at the inverse lifting step. If we now return to the general form of lifting with $A(z)$ being a filter, we realize that all we have to ensure for overflow errors to cancel is that the filter output signal $v(n)$ is the same for lifting and inverse lifting. Since both filters and their input signals are identical, this means that we have to implement the convolution in an identical way at the lifting and the inverse lifting steps, that is, applying the same wordsize, using the same rounding algorithms, and treating overflow in the same way. Note that for the overflow handling blocks which calculate $v(n)$, it does not matter whether we choose saturation or wrap-around in the case of overflow, as long as we do the same for lifting and inverse lifting.
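The cancellation argument of this section can be illustrated with a minimal sketch for the constant case $A(z) = a$. It is not the author's implementation: the function names, the 8-bit integer format, and the use of Python's built-in `round` are assumptions made for illustration. The wrap here is the usual two's-complement one with period $2^B$, that is, one step more than the idealized period $x_{\max} - x_{\min}$ used in the text.

```python
def wrap(value, bits):
    """Two's-complement wrap-around into the signed `bits`-bit range."""
    modulus = 1 << bits
    half = modulus >> 1
    return (value + half) % modulus - half


def saturate(value, bits):
    """Clip to the signed `bits`-bit range [x_min, x_max]."""
    half = 1 << (bits - 1)
    return max(-half, min(half - 1, value))


def lifting_filter(x0, a, bits, inner):
    """v(n) = overflow(round(a * x0(n))), computed identically on both sides."""
    return [inner(round(a * s), bits) for s in x0]


def lift(x0, x1, a, bits, inner=saturate, outer=wrap):
    """Forward lifting step: y0 = x0, y1 = overflow(x1 + v)."""
    v = lifting_filter(x0, a, bits, inner)
    return list(x0), [outer(s + t, bits) for s, t in zip(x1, v)]


def unlift(y0, y1, a, bits, inner=saturate, outer=wrap):
    """Inverse lifting step: x1 = overflow(y1 - v), with v recomputed from y0."""
    v = lifting_filter(y0, a, bits, inner)
    return list(y0), [outer(s - t, bits) for s, t in zip(y1, v)]


x0, x1 = [100, -120, 7], [100, -100, 5]   # 8-bit samples that force overflow
y0, y1 = lift(x0, x1, a=1.5, bits=8)
print(unlift(y0, y1, a=1.5, bits=8) == (x0, x1))          # wrap at the adder
s0, s1 = lift(x0, x1, a=1.5, bits=8, outer=saturate)
print(unlift(s0, s1, a=1.5, bits=8, outer=saturate) == (x0, x1))  # saturation
```

In this sketch the first comparison succeeds although two of the three additions overflow, while the second fails, mirroring (3) to (5). Swapping `inner` between `saturate` and `wrap` does not affect reversibility as long as both directions use the same choice.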
Figure 2: Fixed-point approximation and detail coefficients in the wavelet domain (3-level decomposition) with wordsize of (a) 13 bits and (b) 9 bits.

3. PERFORMANCE ANALYSIS

To evaluate the performance of the fixed-point lifting scheme, we implemented Daubechies’ 9–7 wavelet using the lifting scheme described in [1]. Each lifting step is implemented according to Figure 1, where instead of a constant a we have first-order filters. The dual lifting step is implemented accordingly. We then apply three levels of the fixed-point wavelet transform to the 256 × 256 pixel 8-bit grey-scale image known as “cameraman” using different fixed-point formats and wordsizes. Image boundaries are treated using symmetric extension [9]. Since the final scaling factor in the lifting decomposition of the 9–7 wavelet does not preserve the symmetry of the approximation and detail coefficients if it is implemented as a 4-step lifting [1], it has been omitted in our implementation. Reference [9] explains how it could be implemented using a per-lifting-step symmetric extension method.

In a first step, the input image is scaled such that all pixel values lie within the range (−1, 1), and thus can be represented by a sign bit and 7 bits describing the values after the binary point. All lifting coefficients and intermediate signals are realized using an identical fixed-point format. For our analysis, we consider wordsizes with 7 bits after the binary point (same as input) and 9 bits after the binary point to allow for a more accurate representation of intermediate results. The number of bits before the binary point is fixed for each implementation. Depending on the setup, it varies from 1 (sign bit) to 6, allowing for different dynamic ranges for intermediate signals as well as the approximation and detail coefficients. Overflow at the lifting step addition is treated as wrap-around to ensure cancellation of overflow errors at the reconstruction. Overflow within the lifting step is treated either as wrap-around or saturation.

Figure 2 shows the wavelet coefficients of the 3-level decomposition if 7 bits are used after the binary point, saturation is applied for overflow within a lifting step, and wrap-around is used at the adders. For Figure 2(a), we allow 6 bits in front of the binary point (i.e., a total wordsize of 13 bits) and for Figure 2(b), we only allow 2 bits before the binary point (i.e., a total wordsize of 9 bits).

While Figure 2(a) looks very similar to what we are used to seeing for linear, integer-to-floating-point DWT decomposition, namely horizontal, vertical, and diagonal edge information in the detail bands and a scaled copy of the original image in the approximation band, Figure 2(b) shows large coefficients in the detail bands. In fact, most of the detail bands look like an approximation of the original image. This can be explained by the fact that handling overflow as wrap-around turns a large positive number into a negative number of large magnitude, a nonlinear operation that produces new frequencies, resulting in the large number of high-frequency components seen in the detail bands of Figure 2(b).

Figure 3: Lossless compression using SPIHT. Wavelet coefficients have 7 bits after the binary point (frac7) or 9 bits (frac9); the number of bits before the binary points varies. (Plot of compression ratio versus wordsize from 8 to 15 bits; curves: Frac9, sat. and wrap; Frac9, wrap; Frac7, sat. and wrap; Frac7, wrap.)

To examine the suitability of our fixed-point lifting scheme for lossless compression, we have losslessly encoded the fixed-point wavelet coefficients using the SPIHT encoder [10]. Figure 3 shows the obtained compression ratios. As
expected, the scheme with 7 bits after the binary point achieves a higher compression ratio than the one with 9 bits, since SPIHT can stop encoding 2-bit levels earlier. However, the schemes with the shorter wordlength result in a poorer compression ratio, mainly because of the large number of significant wavelet coefficients in the detail bands. We can observe from Figure 3 that we need a wordsize that is about 3 to 4 bits larger than the 8-bit input wordsize to obtain the highest compression ratio. Having a larger wordsize than that does not improve the performance any further but means increased circuit complexity, since adders and multipliers of larger wordsize have to be used.

Also, we see from Figure 3 that the fixed-point implementation that uses saturation for the lifting filter and wrap-around for the addition outperforms the one that uses wrap-around at both places. This is because saturation introduces the lower overflow error of both schemes.

Figure 4 shows the PSNRs obtained when performing lossy compression of ratios 4 : 1 and 16 : 1 using SPIHT for the fixed-point wavelet coefficients. We notice that the PSNR is significantly reduced if fewer than 4 extra bits are spent before the binary point to cover the increased dynamic range of the signal. Again, for short wordsizes, SPIHT has to spend many bits to encode the large number of significant detail coefficients. For the range of 1 to 3 extra bits, the fixed-point scheme with saturation for overflow handling at the lifting filter again outperforms the one that uses wrap-around. Also, we see that the increased precision of the intermediate signals, lifting filter coefficients, and wavelet coefficients maintained in the scheme with 9 bits after the binary point shows only minimal improvement over the scheme with 7 bits, and thus does not justify the increased circuitry since it also has a lower compression ratio in the lossless case. Note that the low compression ratio obtained is due to the small image size.

Figure 4: Lossy coding using SPIHT. Wavelet coefficients have 7 bits after the binary point (frac7) or 9 bits (frac9); the number of bits before the binary points varies. (a) Lossy compression 4 : 1; (b) lossy compression 16 : 1. (Plots of PSNR versus extra bits before the binary point, from 0 to 5; curves: Frac7, sat. and wrap; Frac7, wrap; Frac9, sat. and wrap; Frac9, wrap.)

4. APPROPRIATE CHOICE OF WORDSIZE

The results from the previous section have shown that the performance of the compression scheme highly depends on the wordsize chosen. From Figure 2, we have seen that overflow that happens when calculating the approximation coefficients results in a significant change of the detail coefficients and a reduced ability of SPIHT to compress the image effectively. Since most large wavelet coefficients are in the approximation band, which shows a reduced-size thumbnail of the original image, we have to choose the wordsize such that overflow only happens rarely when calculating these coefficients. For Daubechies’ 9–7 wavelet, the lowpass filter amplifies the amplitude by √2, resulting in a factor of 2 per 2D decomposition step, and thus 1 additional bit being required per level of decomposition. Since we did not implement the scaling factor of 1.1496 in the lifting decomposition [1], the gain is actually slightly lower. Figures 3 and 4 confirm that we obtain close-to-optimal performance in terms of compression ratio for lossless compression and in terms of PSNR for lossy compression with 3 additional bits before the binary point if we use saturation for overflow within a lifting step. However, from the difference in PSNR at 3 additional bits when using wrap-around or saturation within a lifting step, we conclude that overflow still happens during the calculations since otherwise both results should be identical. Only at 4 additional bits do the results for wrap-around and saturation converge. Thus, in addition to the general scaling of the approximation coefficients, one needs to take intermediate results into account, where overflow can occur even if the final wavelet coefficient lies within the range of realizable numbers. For the lifting implementation of Daubechies’ 9–7 wavelet [1], the most critical step in the decomposition is the first lifting step, where neighboring even-indexed pixels are added and then scaled by a factor of −1.5861, thus resulting in values that have a magnitude that is 3.1723 times as large as the incoming values if those belong to a smooth region of the image and are identical. Thus, the extra bit (4 additional bits before the binary point instead of 3) avoids major occurrence of overflow at this step. The other lifting steps of
our implementation [1] have scaling factors with magnitudes less than one and are thus of no concern.

5. CONCLUSION

In this paper, we have presented a fixed wordsize implementation of the lifting scheme that maintains perfect reconstruction even in the case of overflow occurrences. It provides a nonlinear, reversible, integer-to-integer discrete wavelet transform with predefined dynamic range of the wavelet coefficients. However, when limiting the dynamic range of the wavelet coefficients too much, a large number of significant detail coefficients occur due to overflow errors and make the scheme less suitable for lossy and even lossless compression using standard wavelet encoding schemes. However, as long as the wordlength is chosen such that overflow occurs only rarely, the scheme provides the advantage that limited wordsize adders and multipliers can be used, which is of particular advantage for FPGA implementations.

Tanja Karp received the Dipl.-Ing. degree in electrical engineering (M.S.E.E.) and the Dr.-Ing. degree (Ph.D.) from Hamburg University of Technology, Hamburg, Germany, in 1993 and 1997, respectively. In 1995 and 1996, she spent two months as a Visiting Researcher at the Signal Processing Department of ENST, Paris, France, and at the Multirate Signal Processing Group, University of Wisconsin at Madison, respectively, working on modulated filter banks. In 1997, she joined the Institute of Computer Engineering at Mannheim University, Germany, as a Senior Research and Teaching Associate. From 1998 to 1999, she also taught as a Guest Lecturer at the Institute for Microsystem Technology at Freiburg University, Germany. From 2000 to 2006, she was an Assistant Professor in the Department of Electrical and Computer Engineering at Texas Tech University, Lubbock, Texas. She is now an Associate Professor in the same department. Her research interests include multirate signal processing, filter banks, audio coding, multicarrier modulation, and signal
We have derived how the increase in wordsize depends on the processing for communications. She is an IEEE Member and regu-
larly reviews articles for several IEEE and EURASIP transactions.
number of levels of wavelet transform that are performed as
well as the values of the lifting coefficients.
REFERENCES
into lifting steps,” Journal of Fourier Analysis and Applications,
vol. 4, no. 3, pp. 247–269, 1998.
[2] A. R. Calderbank, I. Daubechies, W. Sweldens, and B.-L. Yeo,
“Wavelet transforms that map integers to integers,” Applied
and Computational Harmonic Analysis, vol. 5, no. 3, pp. 332–
369, 1998.
[3] A. Mertins and T. Karp, “Modulated, perfect reconstruction
filterbanks with integer coefficients,” IEEE Transactions on Sig-
nal Processing, vol. 50, no. 6, pp. 1398–1408, 2002.
[4] S. Oraintara, Y.-J. Chen, and T. Q. Nguyen, “Integer fast
Fourier transform,” IEEE Transactions on Signal Processing,
vol. 50, no. 3, pp. 607–618, 2002.
[5] J. Liang and T. D. Tran, “Fast multiplierless approximations of
the DCT with the lifting scheme,” IEEE Transactions on Signal
Processing, vol. 49, no. 12, pp. 3032–3044, 2001.
[6] H. Malvar and G. Sullivan, “YCoCg-R: a color space with
RGB reversibility and low dynamic range,” ISO/IEC JTC1/
SC29/WG11 and ITU-T SG16 Q.6, July 2003.
[7] R. Geiger, Y. Yokotani, and G. Schuller, “Improved integer
transforms for lossless audio coding,” in Proceedings of the
37th Asilomar Conference on Signals, Systems and Computers
(ACSSC ’03), vol. 2, pp. 2119–2123, Pacific Grove, Calif, USA,
November 2003.
[8] J. Reichel, G. Menegaz, M. J. Nadenau, and M. Kunt, “Integer
wavelet transform for embedded lossy to lossless image com-
pression,” IEEE Transactions on Image Processing, vol. 10, no. 3,
pp. 383–392, 2001.
[9] M. D. Adams and R. K. Ward, “Symmetric-extension-
compatible reversible integer-to-integer wavelet transforms,”
IEEE Transactions on Signal Processing, vol. 51, no. 10, pp.
2624–2636, 2003.
[10] A. Said and W. A. Pearlman, “A new, fast, and efficient im-
age codec based on set partitioning in hierarchical trees,”
IEEE Transactions on Circuits and Systems for Video Technol-
ogy, vol. 6, no. 3, pp. 243–250, 1996.
doi:10.1155/2007/37481
Research Article
Quaternionic Lattice Structures for Four-Channel
Paraunitary Filter Banks
Marek Parfieniuk and Alexander Petrovsky
Department of Real-Time Systems, Faculty of Computer Science, Bialystok Technical University,

Wiejska 45A street, 15-351 Bialystok, Poland
Received 31 December 2005; Revised 1 October 2006; Accepted 9 October 2006
Recommended by Gerald Schuller
A novel approach to the design and implementation of four-channel paraunitary filter banks is presented. It utilizes hypercomplex
number theory, which has not yet been employed in these areas. Namely, quaternion multipliers are presented as alternative pa-
raunitary building blocks, which can be regarded as generalizations of Givens (planar) rotations. The corresponding quaternionic
lattice structures maintain losslessness regardless of coefficient quantization and can be viewed as extensions of the classic two-
band lattice developed by Vaidyanathan and Hoang. Moreover, the proposed approach enables a straightforward expression of the
one-regularity conditions. They are stated in terms of the lattice coefficients, and thus can be easily satisfied even in finite-precision
arithmetic.
Copyright © 2007 M. Parfieniuk and A. Petrovsky. This is an open access article distributed under the Creative Commons
Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
1. INTRODUCTION

Paraunitary filter banks (PUFBs) can be considered the most important among multirate systems [1]. This results from the fact that such filter banks are lossless in addition to guaranteeing perfect reconstruction. A clear relation between the fullband and subband signal energies greatly simplifies theoretical considerations, and hence makes PUFBs useful for applications such as image coding.

The paraunitary property means that the basis functions related to the subbands of a filter bank are orthogonal. However, it is more convenient to work with the analysis polyphase transfer matrix $\mathbf{E}(z)$, which is paraunitary if $\mathbf{E}^H(z^{-1})\mathbf{E}(z) = c\mathbf{I}_M$, where $c$ is a nonzero constant and $M$ denotes the number of channels [2]. Thus, instead of constraining the impulse response coefficients, the usual way to obtain a PUFB is to compose its polyphase matrix from suitable building blocks. From a different point of view, the matrix is appropriately factorized. In this way, other properties of the filter frequency responses can be simultaneously imposed, such as linear phase (LP), pairwise-mirror-image (PMI) symmetry, and regularity. The selection and arrangement of factorization components are decisive.

Lattice and dyadic-based factorizations of paraunitary polyphase matrices can be distinguished. The first approach utilizes Givens (planar) rotations [2]. They are implemented with the help of a specific structure, whose shape is the reason for using the name "lattice." The second technique is based on Householder reflections and degree-one building blocks, which are of a different nature [3]. The lattice structures are more frequently used because the structural imposition of the above-mentioned additional properties is easier [4–6].

A serious practical problem with the factorizations for PUFBs is that they lose essential properties in the case of finite-precision implementation. The only exception is the two-band lattice structure reported in [7]. These facts are not widely known because the effects of coefficient quantization in PUFBs were studied only in [8]. This is undoubtedly a consequence of the growing popularity of lifting factorizations, which guarantee perfect reconstruction under finite precision [9, 10]. However, they lead to biorthogonal systems with a complicated relation between the fullband and subband signal energies.

In this paper, we propose a novel approach to the design and implementation of four-band PUFBs. It utilizes hypercomplex number theory, which has not yet been employed in these areas. Namely, quaternion multipliers are presented as
alternative paraunitary building blocks, which can be viewed as generalizations of Givens rotations. The lattice structures based on them maintain losslessness regardless of coefficient quantization [11]. Moreover, the one-regularity conditions can be expressed in terms of the lattice coefficients and thus satisfied even under finite precision [12].

The limitation of the applicability of the technique to the case of four channels is undoubtedly a serious disadvantage. However, the proposed solution can be recognized as an extension of the two-band lossless lattice presented in [7]. Moreover, our development can stimulate further research aimed at its generalization, on the one hand, and practical applications, on the other hand.

The organization of the paper is as follows. In Section 2, the conventional lattice structures for PUFBs are briefly reviewed to provide the necessary background for further discussion. Losslessness and regularity are approached more closely, and the effect of coefficient quantization on these properties is accentuated. Section 3 introduces a quaternionic multiplier as an alternative building block for four-channel PUFBs. In Section 4, quaternionic variants of the factorizations from Section 2 are derived, as well as the one-regularity conditions on their coefficients. The advantages of the proposed solution, related to finite-precision implementations, are emphasized. The obtained results are exploited in Section 5, where three representative PUFB design examples are shown. Finally, some concluding remarks are given in Section 6.

Notations 1. Column vectors are denoted by lowercase boldfaced characters, whereas matrices by the uppercase ones. The notation $a_{mn}$ refers to the $(m, n)$ entry of a matrix $\mathbf{A}$. $\mathbf{I}_m$ and $\mathbf{J}_m$ denote the $m \times m$ identity and reverse identity matrices, respectively. The superscript $T$ stands for transposition. Quantization is indicated with $Q(\cdot)$. Three specific vectors $\mathbf{e} = [1\ 0\ 0\ 0]^T$, $\mathbf{a} = [1\ 1\ 0\ 0]^T$, and $\mathbf{o} = [1\ 1\ 1\ 1]^T$ are helpful. The $L_2$-norm is considered in our discussion.

2. CONVENTIONAL LATTICE STRUCTURES

2.1. Four-channel general PUFB

The most essential issue in lossless system design is how to obtain an $M \times M$ paraunitary polyphase transfer matrix $\mathbf{E}(z)$ of a given McMillan degree [2]. No other properties are required.

In the first successful attempt to solve this problem [13], the factorization

\[ \mathbf{E}(z) = \mathbf{R}_{N-1}\boldsymbol{\Lambda}(z)\mathbf{R}_{N-2}\boldsymbol{\Lambda}(z) \cdots \mathbf{R}_1\boldsymbol{\Lambda}(z)\mathbf{E}_0 \tag{1} \]

was used. It contains the delays

\[ \boldsymbol{\Lambda}(z) = \operatorname{diag}\left(z^{-1}, \mathbf{I}_{M-1}\right) \tag{2} \]

and orthogonal matrices: a general one, $\mathbf{E}_0$, with $M(M-1)/2$ degrees of freedom, and $\mathbf{R}_i$, $i = 1, \ldots, N-1$, constrained to have $M-1$ of these. Both kinds of matrices are commonly implemented using Givens (planar) rotations, each of which corresponds to one degree of freedom [2].

For $M = 4$ and $N = 3$, the details of this approach, which is tightly connected with the QR decomposition of a matrix, are explained in the scheme shown in Figure 1.

Figure 1: Conventional plane rotation-based lattice structure for 4-channel general PUFB (N = 3).

2.2. Four-channel LP PUFB

Linear phase responses of a filter bank are necessary to use symmetric extension to handle the boundaries of finite-length signals [14]. Therefore, LP PUFBs are very important from a practical point of view, especially in image processing. For these systems, the best known factorization of the polyphase transfer matrix assumes $M$ to be an even number and has the following form [4, 15]:

\[ \mathbf{E}(z) = \mathbf{G}_{N-1}(z)\mathbf{G}_{N-2}(z) \cdots \mathbf{G}_1(z)\mathbf{E}_0, \tag{3} \]

in which

\[ \mathbf{E}_0 = \frac{1}{\sqrt{2}}\boldsymbol{\Phi}_0\mathbf{W}\operatorname{diag}\left(\mathbf{I}_{M/2}, \mathbf{J}_{M/2}\right), \tag{4} \]

\[ \mathbf{G}_i(z) = \frac{1}{2}\boldsymbol{\Phi}_i\mathbf{W}\boldsymbol{\Lambda}(z)\mathbf{W}, \quad i = 1, \ldots, N-1, \tag{5} \]

where

\[ \mathbf{W} = \begin{bmatrix} \mathbf{I}_{M/2} & \mathbf{I}_{M/2} \\ \mathbf{I}_{M/2} & -\mathbf{I}_{M/2} \end{bmatrix}, \tag{6} \]

\[ \boldsymbol{\Lambda}(z) = \operatorname{diag}\left(\mathbf{I}_{M/2}, z^{-1}\mathbf{I}_{M/2}\right), \tag{7} \]

\[ \boldsymbol{\Phi}_i = \operatorname{diag}\left(\mathbf{U}_i, \mathbf{V}_i\right). \tag{8} \]

The design freedom is related to the $M/2 \times M/2$ orthogonal matrices $\mathbf{U}_i$ and $\mathbf{V}_i$, which are again parameterized using Givens rotations. For $M = 4$ and $N = 3$, this approach leads to the structure shown in Figure 2. It should be noted that a $2 \times 2$ orthogonal matrix corresponds to a single rotation.

A relatively recent result is the simplification of the above factorization derived in [16]. Namely, for $i > 0$, $\mathbf{U}_i$ can be replaced with the identity matrix, so that

\[ \boldsymbol{\Phi}_i = \operatorname{diag}\left(\mathbf{I}_{M/2}, \mathbf{V}_i\right), \quad i > 0, \tag{9} \]

without affecting the completeness of the factorization.
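As a rough numerical illustration of the conventional LP factorization (3)–(8), the sketch below assembles $\mathbf{E}(z)$ for $M = 4$ and $N = 3$ from 2 × 2 Givens rotations and checks that $\mathbf{E}^H\mathbf{E} = \mathbf{I}$ at one point on the unit circle. The helper names (`lp_polyphase`, `gram`, and so on), the rotation angles, and the test frequency are arbitrary choices made for this example and do not come from the paper; each $\mathbf{U}_i$ and $\mathbf{V}_i$ is assumed to be a pure rotation.

```python
import cmath
import math

def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def scale(A, s):
    """Multiply every entry of A by the scalar s."""
    return [[s * x for x in row] for row in A]

def rot(theta):
    """2 x 2 Givens rotation (one degree of freedom)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def block_diag(U, V):
    """diag(U, V) for two 2 x 2 blocks."""
    return [U[0] + [0.0, 0.0], U[1] + [0.0, 0.0],
            [0.0, 0.0] + V[0], [0.0, 0.0] + V[1]]

I2 = [[1.0, 0.0], [0.0, 1.0]]
J2 = [[0.0, 1.0], [1.0, 0.0]]
W = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 0, -1, 0], [0, 1, 0, -1]]  # (6), M = 4

def lp_polyphase(z, u_angles, v_angles):
    """E(z) of (3) for M = 4, with Phi_i = diag(U_i, V_i) as in (8)."""
    phi0 = block_diag(rot(u_angles[0]), rot(v_angles[0]))
    E = scale(matmul(matmul(phi0, W), block_diag(I2, J2)), 1 / math.sqrt(2))  # (4)
    lam = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1 / z, 0], [0, 0, 0, 1 / z]]    # (7)
    for au, av in zip(u_angles[1:], v_angles[1:]):
        phi = block_diag(rot(au), rot(av))
        G = scale(matmul(matmul(matmul(phi, W), lam), W), 0.5)                # (5)
        E = matmul(G, E)
    return E

def gram(E):
    """E^H E, which should equal the identity on the unit circle."""
    n = len(E)
    return [[sum(E[k][i].conjugate() * E[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

z = cmath.exp(1j * 0.7)                                  # arbitrary test frequency
E = lp_polyphase(z, [0.3, -1.1, 0.5], [0.8, 2.0, -0.4])  # N = 3, arbitrary angles
G = gram(E)
max_err = max(abs(G[i][j] - (1.0 if i == j else 0.0))
              for i in range(4) for j in range(4))
print(max_err)  # deviation from the identity, at rounding-error level
```

With unquantized (double-precision) rotations the deviation stays at rounding-error level; the check says nothing about what happens once the rotation entries are rounded to a short wordlength.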
Figure 2: Conventional lattice structure for 4-channel LP PUFB (N = 3).

Figure 3: Conventional lattice structure for 4-channel PMI LP PUFB (N = 3).

2.3. Four-channel PMI LP PUFB

Among LP PUFBs, there are systems with pairwise-mirror-image symmetric frequency responses [17]. This property means that the magnitude responses of the pairs of filters are symmetric with respect to $\pi/2$, which can be expressed in terms of the transfer functions or impulse responses of the analysis filters as

\[ H_{M-1-k}(z) = \pm H_k(-z), \tag{10} \]

or

\[ h_{M-1-k}(n) = \pm(-1)^n h_k(n), \tag{11} \]

respectively, where $k = 0, \ldots, N-1$ and $n = 0, \ldots, L-1$, assuming that the filters are of length $L$.

In the case of an even $M$, PMI symmetry can be easily obtained by slightly modifying the lattice factorization for LP PUFBs. Namely, it is sufficient to associate $\mathbf{U}_i$ with $\mathbf{V}_i$ in (8) so that

\[ \mathbf{U}_i = \boldsymbol{\Gamma}\mathbf{V}_i\boldsymbol{\Gamma}, \quad i = 0, \ldots, N-2, \tag{12} \]

\[ \mathbf{U}_{N-1} = \mathbf{J}_{M/2}\mathbf{V}_{N-1}\boldsymbol{\Gamma}, \tag{13} \]

where $\boldsymbol{\Gamma}$ is the diagonal matrix whose diagonal entries are $\gamma_{mm} = (-1)^{m-1}$, $m = 1, \ldots, M/2$.

As the number of the degrees of design freedom is reduced, the optimization of filter bank coefficients is easier, which was the main motivation behind the development of such systems. Recently, it has been shown how to achieve further simplifications [18].

For $M = 4$ and $N = 3$, such an approach leads to the structure shown in Figure 3.

2.4. Construction of synthesis filter bank

To process a signal in subbands, both analysis and synthesis filter banks are needed. In practice, the synthesis computational scheme is constructed by arranging the inverses of the components of the factorization of the analysis polyphase transfer matrix in reverse order. It is noteworthy, however, that in the paraunitary case, the synthesis filters are simply the time-reversed version of the analysis ones.

2.5. Coefficient quantization effects

2.5.1. Losslessness

All presented conventional factorizations lose paraunitary property in the case of coefficient quantization. Even perfect reconstruction is not provided by finite-precision lattice structures. This is because a quantized Givens rotation
matrix, for example,

\[ \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & Q(\cos\alpha) & -Q(\sin\alpha) \\ 0 & 0 & Q(\sin\alpha) & Q(\cos\alpha) \end{bmatrix} \tag{14} \]

is not orthogonal as there are two different column norms: 1 and $\sqrt{Q^2(\cos\alpha) + Q^2(\sin\alpha)} \neq 1$, and only one nonorthogonal component is enough to destroy the losslessness of an entire factorization [8].

2.5.2. Regularity

Coefficient quantization also affects the regularity of a filter bank. This property is crucial for low bit-rate coding where subband coefficients are aggressively quantized, as it alleviates blocking artifacts [14]. The concept originates from wavelet theory, where it is a property of scaling functions and wavelets, critical for smooth signal approximation [19, 20]. However, it is not straightforward to extend the notion to discrete-time systems, especially to $M$-band ones in which $M > 2$.

For an $M$-band filter bank, regularity can be defined as the number of zeros at the mirror (aliasing) frequencies $2k\pi/M$, $k = 1, \ldots, M-1$, of the lowpass filter $H_0(z)$. To obtain $K$ degrees of regularity, the polyphase matrix $\mathbf{E}(z)$ must satisfy the condition [6]

\[ \left.\frac{d^n}{dz^n}\left\{\mathbf{E}\left(z^M\right)\left[1\ \ z^{-1}\ \cdots\ z^{-(M-1)}\right]^T\right\}\right|_{z=1} = c_n\mathbf{e}, \tag{15} \]

with $c_n \neq 0$ for $n = 0, \ldots, K-1$. In particular, for the one-regularity ($K = 1$) and four bands ($M = 4$), the above expression simplifies to

\[ \mathbf{E}(1)\mathbf{o} = c_0\mathbf{e}. \tag{16} \]

It is easy to verify that this is equivalent to have zero magnitude responses of all bandpass filters $H_k(z)$, $k = 1, \ldots, M-1$, at DC (zero) frequency. Thus a constant input is entirely captured by the lowpass filter, and there is no leakage to the remaining bands, which would cause the checkerboard artifact in the case of an image coding application [14].

Conventionally, the regularity conditions are expressed in terms of the angles of the Givens rotations which form a lattice structure [6, 14]. However, such an approach is of limited practical importance, as quantization of rotation matrices changes the corresponding angles, which destroys regularity. So it is more advantageous to have the regularity conditions expressed directly in terms of lattice coefficients.

3. QUATERNIONS AND ORTHOGONAL MATRICES

3.1. Quaternions

Quaternions were discovered by Hamilton [21]. They are hypercomplex numbers of the form [22]

\[ q = q_1 + q_2 i + q_3 j + q_4 k, \quad q_1, q_2, q_3, q_4 \in \mathbb{R}, \tag{17} \]

with one real and three distinct imaginary parts. The imaginary units $i$, $j$, and $k$ are related by the following equations:

\[ i^2 = j^2 = k^2 = ijk = -1, \quad ij = -ji = k, \quad jk = -kj = i, \quad ki = -ik = j. \tag{18} \]

They define quaternion multiplication so that

\[ \begin{aligned} pq ={} & \left(p_1q_1 - p_2q_2 - p_3q_3 - p_4q_4\right) \\ & + \left(p_1q_2 + p_2q_1 + p_3q_4 - p_4q_3\right)i \\ & + \left(p_1q_3 + p_3q_1 + p_4q_2 - p_2q_4\right)j \\ & + \left(p_1q_4 + p_4q_1 + p_2q_3 - p_3q_2\right)k, \end{aligned} \tag{19} \]

which is associative and distributive, but noncommutative ($pq \neq qp$) unless one of the operands is a scalar. This mainly distinguishes quaternions, as the definitions of other operations are nothing more than simple extensions of those related to complex numbers. As examples, we can consider the addition

\[ p \pm q = \left(p_1 \pm q_1\right) + \left(p_2 \pm q_2\right)i + \left(p_3 \pm q_3\right)j + \left(p_4 \pm q_4\right)k, \tag{20} \]

the conjugate

\[ \bar{q} = q_1 - q_2 i - q_3 j - q_4 k, \tag{21} \]

and the norm (modulus)

\[ |q| = \sqrt{q\bar{q}} = \sqrt{\bar{q}q} = \sqrt{q_1^2 + q_2^2 + q_3^2 + q_4^2}. \tag{22} \]

The division is defined as the multiplication by the reciprocal

\[ q^{-1} = \frac{\bar{q}}{|q|^2}, \tag{23} \]

which satisfies the identity

\[ qq^{-1} = q^{-1}q = 1. \tag{24} \]

The modulus $|q|$ forms the basis for the polar representation [21]

\[ \begin{aligned} q_1 &= |q|\cos\varphi, \\ q_2 &= |q|\sin\varphi\cos\psi, \\ q_3 &= |q|\sin\varphi\sin\psi\cos\chi, \\ q_4 &= |q|\sin\varphi\sin\psi\sin\chi, \end{aligned} \tag{25} \]

where the angles $0 \le \varphi \le \pi$, $0 \le \psi \le \pi$, and $0 \le \chi < 2\pi$ are the three remaining degrees of freedom. Polar representation allows us to easily parameterize fixed-modulus quaternions. In our case, unit quaternions ($|q| = 1$) are of great importance.
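The operations (18)–(24) are easy to try out. The following minimal sketch stores a quaternion as a 4-tuple (real, i, j, k) and uses exact rational arithmetic; the function names `qmul`, `qconj`, `qnorm2`, and `qinv` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction

def qmul(p, q):
    """Quaternion product pq, written out term by term as in (19)."""
    p1, p2, p3, p4 = p
    q1, q2, q3, q4 = q
    return (
        p1 * q1 - p2 * q2 - p3 * q3 - p4 * q4,
        p1 * q2 + p2 * q1 + p3 * q4 - p4 * q3,
        p1 * q3 + p3 * q1 + p4 * q2 - p2 * q4,
        p1 * q4 + p4 * q1 + p2 * q3 - p3 * q2,
    )

def qconj(q):
    """Conjugate (21): negate the three imaginary parts."""
    return (q[0], -q[1], -q[2], -q[3])

def qnorm2(q):
    """Squared modulus, i.e. the square of (22)."""
    return sum(c * c for c in q)

def qinv(q):
    """Reciprocal (23), computed exactly with Fractions."""
    n2 = Fraction(qnorm2(q))
    return tuple(Fraction(c) / n2 for c in qconj(q))

# Imaginary units as 4-tuples
i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

ij = qmul(i, j)               # equals k
ji = qmul(j, i)               # equals -k, so the product is noncommutative
ijk = qmul(qmul(i, j), k)     # equals -1

q = (1, 2, 3, 4)
identity = qmul(q, qinv(q))   # equals 1, as stated by (24)
```

Using `Fraction` keeps the reciprocal exact, so the identity (24) can be compared with `==` instead of a tolerance.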
Figure 4: Graphical symbols for the quaternion multipliers whose coefficient q is (a) the left multiplication operand and (b) the right multiplication operand, respectively.

Figure 5: Structural transformation corresponding to (31).

3.2. Quaternion multiplication matrices

Because quaternions can be identified with four-element column vectors:

\[ q \iff \mathbf{q} = \left[q_1\ q_2\ q_3\ q_4\right]^T, \tag{26} \]

all operations on hypercomplex numbers can be consistently represented in vector-matrix notation. We are particularly interested in multiplication, which can be written in two equivalent forms as

\[ pq \iff \underbrace{\begin{bmatrix} p_1 & -p_2 & -p_3 & -p_4 \\ p_2 & p_1 & -p_4 & p_3 \\ p_3 & p_4 & p_1 & -p_2 \\ p_4 & -p_3 & p_2 & p_1 \end{bmatrix}}_{\mathbf{M}^{+}(p)} \begin{bmatrix} q_1 \\ q_2 \\ q_3 \\ q_4 \end{bmatrix} = \underbrace{\begin{bmatrix} q_1 & -q_2 & -q_3 & -q_4 \\ q_2 & q_1 & q_4 & -q_3 \\ q_3 & -q_4 & q_1 & q_2 \\ q_4 & q_3 & -q_2 & q_1 \end{bmatrix}}_{\mathbf{M}^{-}(q)} \begin{bmatrix} p_1 \\ p_2 \\ p_3 \\ p_4 \end{bmatrix}. \tag{27} \]

Thus two different multiplication matrices exist, the left-operand $\mathbf{M}^{+}(\cdot)$ and the right-operand $\mathbf{M}^{-}(\cdot)$ one.

In the following discussion, we restrict ourselves to unit quaternion multiplication matrices. To represent quaternion multipliers graphically, we also introduce the symbols shown in Figure 4.

Both matrices are orthogonal as

\[ \mathbf{M}^{\pm}(q)^{-1} = \mathbf{M}^{\pm}(q)^T, \tag{28} \]

and have determinant +1. Hence, they belong to the $4 \times 4$ special orthogonal group commonly referred to as SO(4) [23]. They also form groups with respect to multiplication, which implies the following identities:

\[ \mathbf{M}^{+}\left(q_{N-1}\right) \cdots \mathbf{M}^{+}\left(q_0\right) = \mathbf{M}^{+}\left(q_{N-1} \cdots q_0\right), \tag{29a} \]

\[ \mathbf{M}^{-}\left(q_{N-1}\right) \cdots \mathbf{M}^{-}\left(q_0\right) = \mathbf{M}^{-}\left(q_0 \cdots q_{N-1}\right). \tag{29b} \]

Another interesting and useful relation is

\[ \mathbf{M}^{\pm}(q)^T = \mathbf{M}^{\pm}(\bar{q}). \tag{30} \]

From a PUFB perspective, the connections between quaternion multiplication matrices and arbitrary $4 \times 4$ orthogonal ones are intriguing. To make the paper comprehensive, we have decided to repeat the derivations from [24, 25] to quite a large extent, but the emphasis is placed on slightly different nuances.

3.3. Reduction of a 4 × 4 orthogonal matrix

Theorem 1. Every $4 \times 4$ orthogonal matrix $\mathbf{A}$ can be represented as the product

\[ \mathbf{A} = \mathbf{M}^{\pm}(a)\operatorname{diag}(1, \mathbf{B}), \tag{31} \]

where

\[ a = a_{11} - a_{21}i - a_{31}j - a_{41}k \tag{32} \]

and $\mathbf{B}$ is a $3 \times 3$ orthogonal matrix.

Proof. As $\mathbf{A}$ and $\mathbf{M}^{\pm}(a)$ are both orthogonal, $\mathbf{M}^{\pm}(a)^T\mathbf{A}$ must be orthogonal as well. The quaternion $a$ is constructed so as to have the inner product of the first columns of $\mathbf{A}$ and $\mathbf{M}^{\pm}(a)$ equal to unity. This is the value of the (1, 1)st element of $\mathbf{M}^{\pm}(a)^T\mathbf{A}$, so all the remaining elements in the first row and column of this matrix must be zeros. Thus the rest of its elements forms a $3 \times 3$ orthogonal matrix $\mathbf{B}$.

The corresponding structural transformation is shown in Figure 5. It should be noted that the reducing ability of unit quaternion multiplication matrices suggests their tight connections with Givens rotations, which are commonly used in matrix parameterization via QR decomposition, as it has been mentioned earlier. One quaternion multiplication is related to three degrees of freedom and can be treated as a four-dimensional generalization of a Givens rotation [11].

3.4. Parameterization of a 4 × 4 orthogonal matrix

Theorem 2 (see [24]). For every orthogonal $4 \times 4$ matrix $\mathbf{A}$, there exists a unique (up to signs) pair of unit quaternions $p$ and $q$ such that

\[ \mathbf{A} = \mathbf{M}^{+}(p)\mathbf{M}^{-}(q) = \mathbf{M}^{-}(q)\mathbf{M}^{+}(p). \tag{33} \]

Proof. We begin by decomposing the given matrix $\mathbf{A}$ according to (31) and then deal with $\operatorname{diag}(1, \mathbf{B})$. It is known [25] that the latter matrix can be represented using one unit quaternion $b$ as

\[ \mathbf{M}^{+}(b)\mathbf{M}^{-}(\bar{b}) = \operatorname{diag}(1, \mathbf{B}), \tag{34} \]

where

\[ \mathbf{B} = \begin{bmatrix} b_1^2 + b_2^2 - b_3^2 - b_4^2 & 2\left(-b_1b_4 + b_2b_3\right) & 2\left(b_1b_3 + b_2b_4\right) \\ 2\left(b_1b_4 + b_3b_2\right) & b_1^2 - b_2^2 + b_3^2 - b_4^2 & 2\left(-b_1b_2 + b_3b_4\right) \\ 2\left(-b_1b_3 + b_4b_2\right) & 2\left(b_1b_2 + b_4b_3\right) & b_1^2 - b_2^2 - b_3^2 + b_4^2 \end{bmatrix}. \tag{35} \]
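The matrix identities of Sections 3.2 to 3.4 can be checked numerically. The sketch below builds the left- and right-operand multiplication matrices of (27), reading the conjugates in (30) and (34) explicitly, and compares them against the direct quaternion product and the rotation matrix (35). All function names are hypothetical helpers, and the rational unit quaternion `b` is an arbitrary test value, not one taken from the paper.

```python
from fractions import Fraction as F

def qmul(p, q):
    """Quaternion product pq as in (19)."""
    p1, p2, p3, p4 = p
    q1, q2, q3, q4 = q
    return (p1*q1 - p2*q2 - p3*q3 - p4*q4,
            p1*q2 + p2*q1 + p3*q4 - p4*q3,
            p1*q3 + p3*q1 + p4*q2 - p2*q4,
            p1*q4 + p4*q1 + p2*q3 - p3*q2)

def qconj(q):
    return (q[0], -q[1], -q[2], -q[3])

def m_plus(p):
    """Left-operand matrix M+(p) of (27): M+(p) q <=> pq."""
    p1, p2, p3, p4 = p
    return [[p1, -p2, -p3, -p4], [p2, p1, -p4, p3],
            [p3, p4, p1, -p2], [p4, -p3, p2, p1]]

def m_minus(q):
    """Right-operand matrix M-(q) of (27): M-(q) p <=> pq."""
    q1, q2, q3, q4 = q
    return [[q1, -q2, -q3, -q4], [q2, q1, q4, -q3],
            [q3, -q4, q1, q2], [q4, q3, -q2, q1]]

def matvec(A, x):
    return tuple(sum(a * b for a, b in zip(row, x)) for row in A)

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def rotation(b):
    """3 x 3 matrix B of (35) for a unit quaternion b."""
    b1, b2, b3, b4 = b
    return [[b1*b1 + b2*b2 - b3*b3 - b4*b4, 2*(-b1*b4 + b2*b3), 2*(b1*b3 + b2*b4)],
            [2*(b1*b4 + b3*b2), b1*b1 - b2*b2 + b3*b3 - b4*b4, 2*(-b1*b2 + b3*b4)],
            [2*(-b1*b3 + b4*b2), 2*(b1*b2 + b4*b3), b1*b1 - b2*b2 - b3*b3 + b4*b4]]

p = (F(1), F(2), F(-3), F(4))                # arbitrary test operands
q = (F(-2), F(1), F(5), F(3))
b = (F(1, 5), F(2, 5), F(2, 5), F(4, 5))     # rational unit quaternion

both_forms_agree = matvec(m_plus(p), q) == qmul(p, q) == matvec(m_minus(q), p)
products_commute = matmul(m_plus(p), m_minus(q)) == matmul(m_minus(q), m_plus(p))
transpose_is_conj = transpose(m_plus(q)) == m_plus(qconj(q))

B = rotation(b)
diag_1_B = [[1, 0, 0, 0]] + [[0] + row for row in B]
rotation_form_holds = matmul(m_plus(b), m_minus(qconj(b))) == diag_1_B
```

Exact fractions are used so that every comparison is an equality test rather than a tolerance check.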
The equations

\[ \begin{aligned} b_1^2 &= \tfrac{1}{4}\left(1 + b_{11} + b_{22} + b_{33}\right), \\ b_2^2 &= \tfrac{1}{4}\left(1 + b_{11} - b_{22} - b_{33}\right), \\ b_3^2 &= \tfrac{1}{4}\left(1 - b_{11} + b_{22} - b_{33}\right), \\ b_4^2 &= \tfrac{1}{4}\left(1 - b_{11} - b_{22} + b_{33}\right), \end{aligned} \tag{36} \]

\[ \begin{aligned} b_1b_2 &= \tfrac{1}{4}\left(b_{32} - b_{23}\right), & b_1b_3 &= \tfrac{1}{4}\left(b_{13} - b_{31}\right), \\ b_1b_4 &= \tfrac{1}{4}\left(b_{21} - b_{12}\right), & b_2b_3 &= \tfrac{1}{4}\left(b_{12} + b_{21}\right), \\ b_2b_4 &= \tfrac{1}{4}\left(b_{13} + b_{31}\right), & b_3b_4 &= \tfrac{1}{4}\left(b_{23} + b_{32}\right), \end{aligned} \tag{37} \]

which can be easily derived, allow us to calculate $b$ from $\mathbf{B}$.

This system of equations is overdetermined as the number of equations exceeds the number of unknowns. To avoid a contradiction, the equation which gives the $b_k$ of a maximum absolute value should be selected from among (36). Then it must be supplemented by the three equations in (37) which involve $b_k$, to allow us to determine all components of the quaternion $b$. It should be noted that the squares at the left-hand side of (36) make $-b$ an equivalent solution. Finally, we get the desired factorization

\[ \mathbf{A} = \mathbf{M}^{+}(a)\mathbf{M}^{+}(b)\mathbf{M}^{-}(\bar{b}) = \mathbf{M}^{+}(ab)\mathbf{M}^{-}(\bar{b}) \tag{38} \]

based on the quaternions $p = ab$ and $q = \bar{b}$.

It should be emphasized that the matrix product (33) is commutative, though the product of the related quaternions is not. The theorem is also true after the transition to $-p$ and $-q$.

3.5. Quaternion multiplier as paraunitary building block

The parameterization (33) has several advantages which make quaternion multipliers interesting paraunitary building blocks.

First of all, a quantized quaternion multiplication matrix, for example,

\[ \begin{bmatrix} Q(q_1) & -Q(q_2) & -Q(q_3) & -Q(q_4) \\ Q(q_2) & Q(q_1) & -Q(q_4) & Q(q_3) \\ Q(q_3) & Q(q_4) & Q(q_1) & -Q(q_2) \\ Q(q_4) & -Q(q_3) & Q(q_2) & Q(q_1) \end{bmatrix} \tag{39} \]

still has the same sets of absolute values in all its rows and columns. So the column norm is constant and is equal to $\sqrt{Q(q_1)^2 + Q(q_2)^2 + Q(q_3)^2 + Q(q_4)^2}$, and hence the product (33) always represents an orthogonal transformation.

Moreover, it is sufficient to hold only 8 real numbers (2 quaternions) in memory, whereas the direct representation of the corresponding matrix would require storing all its 16 entries.

The specific structures of quaternion multiplication matrices allow us to perform this operation in 8 real multiplications, but the algorithm is quite intricate [26].

The possibility of multiplierless implementations is much more important. They can be realized with distributed arithmetic or using a four-dimensional CORDIC algorithm. The feasibility of computation parallelization or pipelining together with the regularity of the layout of a digital circuit make the quaternionic multiplier very attractive for FPGA and VLSI technologies [27].

4. QUATERNIONIC LATTICE STRUCTURES

4.1. Four-channel general PUFB

Theorem 3 (see [11]). The quaternionic variant of the factorization (1) for a 4-channel general PUFB results from the following substitution:

\[ \mathbf{E}_0 = \mathbf{M}^{+}\left(q_0\right)\mathbf{M}^{-}\left(p_0\right), \tag{40} \]

\[ \mathbf{R}_i = \mathbf{M}^{\pm}\left(q_i\right), \quad i = 1, \ldots, N-1, \tag{41} \]

where $p_0$ and all $q_i$ are unit-norm quaternions.

Proof. Both of the theorems from the previous section, which concern $4 \times 4$ orthogonal matrices, are exploited. According to (31), the matrices $\mathbf{R}_i$ in (1) can be represented as

\[ \mathbf{R}_i = \mathbf{M}^{\pm}\left(q_i\right)\operatorname{diag}\left(1, \mathbf{B}_i\right). \tag{42} \]

Since

\[ \operatorname{diag}\left(1, \mathbf{B}_i\right)\boldsymbol{\Lambda}(z) = \boldsymbol{\Lambda}(z)\operatorname{diag}\left(1, \mathbf{B}_i\right), \tag{43} \]

the $3 \times 3$ orthogonal matrix $\mathbf{B}_i$ can be moved to the preceding stage and multiplied by $\mathbf{R}_{i-1}$. The same procedure can be applied to the resulting orthogonal matrix. Starting from $\mathbf{R}_{N-1}$, we process the subsequent stages to reach $\mathbf{E}_0$ and to apply (33).

Figure 6 shows the corresponding quaternionic lattice structure.

Figure 6: Quaternionic lattice structure for 4-channel general PUFB (N = 3).
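The robustness argument of Section 3.5, on which the lattice of Theorem 3 relies, can be illustrated by quantizing a Givens rotation as in (14) and a quaternion multiplication matrix as in (39) and comparing their Gram matrices. This is only a sketch: the rounding rule `quantize`, the 4 fractional bits, the angle, and the test quaternion are assumptions made for the example and are not the authors' settings.

```python
from fractions import Fraction
import math

def quantize(x, frac_bits):
    """Round x to the nearest multiple of 2**-frac_bits (exact Fraction)."""
    return Fraction(round(x * 2 ** frac_bits), 2 ** frac_bits)

def m_plus(p):
    """Left-operand quaternion multiplication matrix, laid out as in (39)."""
    p1, p2, p3, p4 = p
    return [[p1, -p2, -p3, -p4], [p2, p1, -p4, p3],
            [p3, p4, p1, -p2], [p4, -p3, p2, p1]]

def givens4(c, s):
    """Givens rotation embedded in a 4 x 4 identity, laid out as in (14)."""
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, c, -s], [0, 0, s, c]]

def gram(A):
    """A^T A, computed exactly."""
    n = len(A)
    return [[sum(A[k][i] * A[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_scaled_identity(G):
    """True if G = c * I for some scalar c (orthogonal up to scaling)."""
    n = len(G)
    return all(G[i][j] == (G[0][0] if i == j else 0)
               for i in range(n) for j in range(n))

FRAC_BITS = 4                      # hypothetical, deliberately coarse wordlength
alpha = 0.7                        # arbitrary rotation angle
c_hat = quantize(math.cos(alpha), FRAC_BITS)
s_hat = quantize(math.sin(alpha), FRAC_BITS)

norm = math.sqrt(30.0)             # |(1, 2, 3, 4)|
q_hat = tuple(quantize(x / norm, FRAC_BITS) for x in (1, 2, 3, 4))

givens_ok = is_scaled_identity(gram(givens4(c_hat, s_hat)))
quat_ok = is_scaled_identity(gram(m_plus(q_hat)))
print(givens_ok, quat_ok)
```

The quantized quaternion matrix keeps equal column norms and zero cross terms, so its Gram matrix is a multiple of the identity, whereas the quantized Givens matrix ends up with two different column norms.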

Figure 7: Quaternionic lattice structure for 4-channel LP PUFB (N = 3).

Theorem 4 (see [12]). A four-band general PUFB determined by (1) in conjunction with (40) and (41) is one-regular if and only if

\[ p_0 = \pm\frac{1}{2}\,\bar{o}\;\overline{q_{N-1} \cdots q_0}, \tag{44} \]

under the assumption that the left-operand multiplication matrix is used in (41).

Proof. By substituting $\mathbf{E}(z)$ with (1) in (16), and then using (40) and (41), we get

\[ \mathbf{M}^{+}\left(q_{N-1}\right) \cdots \mathbf{M}^{+}\left(q_0\right)\mathbf{M}^{-}\left(p_0\right)\mathbf{o} = c_0\mathbf{e}, \tag{45} \]

as $\boldsymbol{\Lambda}(1) = \mathbf{I}_4$. A simple analysis of the norms of the factors in this expression gives the value of $c_0$. It must be $\pm 2$ as the norm of $\mathbf{o}$ equals 2, and those of $\mathbf{e}$ and the rows/columns of the quaternion matrices are unity. Hence, by exploiting (29a) too, we can write

\[ \mathbf{M}^{+}\left(q_{N-1} \cdots q_0\right)\mathbf{M}^{-}\left(p_0\right)\mathbf{o} = \pm 2\mathbf{e}. \tag{46} \]

This clearly suggests to make $p_0$ constrained, so we use (30) to obtain

\[ \mathbf{M}^{-}\left(p_0\right)\mathbf{o} = \pm 2\mathbf{M}^{+}\left(\overline{q_{N-1} \cdots q_0}\right)\mathbf{e}. \tag{47} \]

This matrix-vector expression can be interpreted as the quaternionic equation

\[ o\,p_0 = \pm 2\,\overline{q_{N-1} \cdots q_0} \tag{48} \]

in which $p_0$ is unknown. The left multiplication by $\bar{o}/4$ leads to the solution, or (44).

The above result can be easily adapted to the case of the right-operand multiplication matrix in (41). We omit this for brevity reasons.

4.2. Four-channel LP PUFB

Theorem 5 (see [28]). The conventional, presented in Section 2.2, factorization for 4-channel LP PUFBs changes into a quaternionic alternative when

\[ \boldsymbol{\Phi}_0 = \mathbf{M}^{-}\left(p_0\right)\mathbf{M}^{+}\left(q_0\right), \tag{49} \]

\[ \boldsymbol{\Phi}_i = \mathbf{M}^{-}\left(p_i\right), \quad i = 1, \ldots, N-1, \tag{50} \]

are used instead of (8). All $p_i$ and $q_0$ are unit quaternions which have the two last imaginary parts (related to $j$ and $k$) zeroed, so they are constrained to be complex numbers in fact.

Proof. Theorem 2 allows us to decompose each matrix $\boldsymbol{\Phi}_i$ defined by (8) in the following way:

\[ \boldsymbol{\Phi}_i = \mathbf{M}^{-}\left(p_i\right)\mathbf{M}^{+}\left(q_i\right). \tag{51} \]

The block-diagonal structure of $\boldsymbol{\Phi}_i$ is inherited by the quaternion multiplication matrices and this is the cause of the degeneration of the corresponding hypercomplex numbers to the complex ones. It is easy to check that

\[ \mathbf{M}^{+}\left(q_i\right)\mathbf{W} = \mathbf{W}\mathbf{M}^{+}\left(q_i\right), \tag{52} \]

\[ \mathbf{M}^{+}\left(q_i\right)\boldsymbol{\Lambda}(z) = \boldsymbol{\Lambda}(z)\mathbf{M}^{+}\left(q_i\right). \tag{53} \]

Thus $\mathbf{M}^{+}(q_i)$ can be moved to the preceding stage $\mathbf{G}_{i-1}(z)$, which leads to (50). As the product $\boldsymbol{\Phi}_{i-1}\mathbf{M}^{+}(q_i)$ maintains orthogonality and a block-diagonal structure, the procedure can be repeated on it. The only exception is at $\boldsymbol{\Phi}_0$, which must be represented using both quaternion multiplication matrices.

The corresponding lattice structure is shown in Figure 7. It is noteworthy that by quaternionic factorization, the number of different coefficients is reduced with respect to the conventional approach [4] and the same as in the simplified variant derived in [16]. However, the computational complexity remains unchanged. The differences between the mentioned realizations of $\boldsymbol{\Phi}_i$ are explained in Figure 8.

Figure 8: (a) Conventional, (b) simplified, and (c) quaternionic realizations of $\boldsymbol{\Phi}_i$.
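Returning to the one-regularity condition of Theorem 4, the sketch below evaluates (44), read with conjugates as $p_0 = \pm\frac{1}{2}\bar{o}\,\overline{q_{N-1} \cdots q_0}$, in exact rational arithmetic and then tests (16) directly. It assumes that the coefficients listed in Table 1 belong to a general PUFB with $N = 2$ and the left-operand matrix in (41); that reading of the table, and all helper names, are assumptions of this example rather than statements from the paper.

```python
from fractions import Fraction as F

def qmul(p, q):
    """Quaternion product pq as in (19)."""
    p1, p2, p3, p4 = p
    q1, q2, q3, q4 = q
    return (p1*q1 - p2*q2 - p3*q3 - p4*q4,
            p1*q2 + p2*q1 + p3*q4 - p4*q3,
            p1*q3 + p3*q1 + p4*q2 - p2*q4,
            p1*q4 + p4*q1 + p2*q3 - p3*q2)

def qconj(q):
    return (q[0], -q[1], -q[2], -q[3])

def m_plus(p):
    """Left-operand matrix: M+(p) x <=> p x."""
    p1, p2, p3, p4 = p
    return [[p1, -p2, -p3, -p4], [p2, p1, -p4, p3],
            [p3, p4, p1, -p2], [p4, -p3, p2, p1]]

def m_minus(q):
    """Right-operand matrix: M-(q) x <=> x q."""
    q1, q2, q3, q4 = q
    return [[q1, -q2, -q3, -q4], [q2, q1, q4, -q3],
            [q3, -q4, q1, q2], [q4, q3, -q2, q1]]

def matvec(A, x):
    return tuple(sum(a * b for a, b in zip(row, x)) for row in A)

def regular_p0(qs):
    """(44) with the + sign; qs = [q_0, ..., q_{N-1}]."""
    prod = (F(1), F(0), F(0), F(0))
    for q in qs:
        prod = qmul(q, prod)              # accumulates q_{N-1} ... q_0
    o = (1, 1, 1, 1)
    return tuple(F(1, 2) * c for c in qmul(qconj(o), qconj(prod)))

# Coefficient values as printed in Table 1
q0 = (F(-11, 16), F(-1, 2), F(1, 16), F(7, 16))
q1 = (F(3, 8), F(1, 8), F(3, 4), F(-1, 2))
p0_table = (F(-45, 128), F(9, 16), F(-31, 128), F(-5, 8))

p0 = regular_p0([q0, q1])

# E(1) o = M+(q1) M+(q0) M-(p0) o, since Lambda(1) = I
v = matvec(m_minus(p0), (1, 1, 1, 1))
v = matvec(m_plus(q0), v)
v = matvec(m_plus(q1), v)
print(p0 == p0_table, v)
```

Under these assumptions the computed $p_0$ coincides with the tabulated one and the last three entries of $\mathbf{E}(1)\mathbf{o}$ vanish exactly, even though the quantized $q_0$ and $q_1$ are no longer of unit norm.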
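Figure 9: Quaternionic lattice structure for 4-channel PMI LP PUFB (N = 3).

Theorem 6 (see [12]). A 4-band LP PUFB realized using the quaternionic approach is one-regular if and only if

\[ q_0 = \pm\frac{1}{\sqrt{2}}\;\overline{p_0 \cdots p_{N-1}}\;\bar{a}. \tag{54} \]

Proof. As in the case of a general PUFB, the first step is to expand (16) in accordance with the considered factorization of $\mathbf{E}(z)$. We get

\[ \mathbf{M}^{-}\left(p_{N-1}\right) \cdots \mathbf{M}^{-}\left(p_0\right)\mathbf{M}^{+}\left(q_0\right)\sqrt{2}\,\mathbf{a} = c_0\mathbf{e} \tag{55} \]

as $\mathbf{W}\boldsymbol{\Lambda}(1)\mathbf{W} = 2\mathbf{I}_4$ and $\mathbf{W}\operatorname{diag}(\mathbf{I}_2, \mathbf{J}_2)\mathbf{o} = 2\mathbf{a}$. The value of $c_0$ again results from the examination of the norms of the factors and must be $\pm 2$ as the norm of $\mathbf{a}$ equals $\sqrt{2}$, while the remaining ones are unity. Applying (29b), we obtain

\[ \mathbf{M}^{-}\left(p_0 \cdots p_{N-1}\right)\mathbf{M}^{+}\left(q_0\right)\mathbf{a} = \pm\sqrt{2}\,\mathbf{e} \tag{56} \]

and see that it is the easiest to make $q_0$ dependent on the remaining coefficients. The identity (30) allows us to write the matrix equation

\[ \mathbf{M}^{+}\left(q_0\right)\mathbf{a} = \pm\sqrt{2}\,\mathbf{M}^{-}\left(\overline{p_0 \cdots p_{N-1}}\right)\mathbf{e} \tag{57} \]

and then convert it into the quaternionic equivalent

\[ q_0\,a = \pm\sqrt{2}\;\overline{p_0 \cdots p_{N-1}}. \tag{58} \]

The right multiplication by $\bar{a}/2$ gives the desired regularity constraint (54) on $q_0$.

4.3. Four-channel PMI LP PUFB

Theorem 7 (see [28]). The constraints (12)-(13) on the matrices used in the factorization from Section 2.3, for 4-channel PMI LP PUFBs, can be satisfied by taking

\[ \boldsymbol{\Phi}_i = \mathbf{M}^{-}\left(p_i\right), \quad i = 0, \ldots, N-2, \tag{59} \]

\[ \boldsymbol{\Phi}_{N-1} = \mathbf{M}^{-}\left(p_{N-1}\right)\operatorname{diag}\left(\mathbf{J}_2\boldsymbol{\Gamma}, \mathbf{I}_2\right), \tag{60} \]

where $\boldsymbol{\Gamma} = \operatorname{diag}(1, -1)$ and the quaternionic coefficients $p_i$ are restricted to be unit complex numbers.

Proof. In the case of 4 channels, $\boldsymbol{\Gamma}\mathbf{V}_i\boldsymbol{\Gamma} = \mathbf{V}_i^T$, and so the first condition (12) necessary to obtain PMI symmetry directly imposes the form of $\boldsymbol{\Phi}_i$ which coincides with a quaternion multiplication matrix, because

\[ \boldsymbol{\Phi}_i = \operatorname{diag}\left(\boldsymbol{\Gamma}\mathbf{V}_i\boldsymbol{\Gamma}, \mathbf{V}_i\right) = \underbrace{\operatorname{diag}\left(\mathbf{V}_i^T, \mathbf{V}_i\right)}_{\mathbf{M}^{-}(p_i)} \tag{61} \]

if $p_i$ is constrained to be a complex number.

The obvious identities $\mathbf{J}\mathbf{J} = \mathbf{I}$ and $\mathbf{J}_2\mathbf{V}_i\mathbf{J}_2 = \mathbf{V}_i^T$ allow a quaternion multiplication matrix to be extracted also from $\boldsymbol{\Phi}_{N-1}$ determined by the condition (13). Namely,

\[ \boldsymbol{\Phi}_{N-1} = \operatorname{diag}\left(\mathbf{J}_2\mathbf{V}_{N-1}\boldsymbol{\Gamma}, \mathbf{V}_{N-1}\right) = \underbrace{\operatorname{diag}\left(\mathbf{V}_{N-1}^T, \mathbf{V}_{N-1}\right)}_{\mathbf{M}^{-}(p_{N-1})}\operatorname{diag}\left(\mathbf{J}_2\boldsymbol{\Gamma}, \mathbf{I}_2\right). \tag{62} \]

The corresponding structure is shown in Figure 9. In the case of a PMI LP PUFB, by quaternionic factorization the number of coefficients is decreased with respect to the conventional solution and is the same as in its simplification derived in [18].

Theorem 8 (see [12]). A four-band PMI LP PUFB realized according to Theorem 7 is one-regular if and only if

\[ p_{N-1} = \pm\frac{1}{\sqrt{2}}\,a\;\overline{p_0 \cdots p_{N-2}}. \tag{63} \]

Proof. Given the quaternionic factorization, we can expand (16) into

\[ \mathbf{M}^{-}\left(p_{N-1}\right)\operatorname{diag}\left(\mathbf{J}_2\boldsymbol{\Gamma}, \mathbf{I}_2\right)\mathbf{M}^{-}\left(p_{N-2}\right) \cdots \mathbf{M}^{-}\left(p_0\right)\sqrt{2}\,\mathbf{a} = c_0\mathbf{e}. \tag{64} \]

The value of $c_0$ results from norm inspection and equals $\pm 2$. Noticing that

\[ \operatorname{diag}\left(\mathbf{J}_2\boldsymbol{\Gamma}, \mathbf{I}_2\right) = \frac{1}{2}\mathbf{M}^{+}(a)\mathbf{M}^{-}(a), \tag{65} \]

and utilizing (29b) and (30), we can rewrite (64) as

\[ \frac{1}{2}\mathbf{M}^{+}(a)\mathbf{M}^{-}(a)\mathbf{M}^{-}\left(p_0 \cdots p_{N-2}\right)\mathbf{a} = \pm\sqrt{2}\,\mathbf{M}^{-}\left(\bar{p}_{N-1}\right)\mathbf{e}. \tag{66} \]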
Then, the transition to quaternions yields

$$\frac{1}{2\sqrt{2}}\, a a p_0 \cdots p_{N-2}\, a = \pm p_{N-1}, \tag{67}$$

and we obtain (63) by conjugating both sides, as aa = 2.

Table 1: Rational coefficient values for general PUFB.

Coeff.   Re        Im_i    Im_j      Im_k    Wordlength
p0       −45/128   9/16    −31/128   −5/8    8
q0       −11/16    −1/2    1/16      7/16    5
q1       3/8       1/8     3/4       −1/2    4
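As a rough illustration of how the wordlengths in Table 1 relate to the rational coefficient values, the following sketch stores the coefficients as exact fractions and counts the bits of a signed fixed-point representation. The rule used here (fractional bits of the largest power-of-two denominator, plus one sign bit) is an assumption inferred from the tabulated values, not a definition given in the paper, and the names `TABLE_1`, `wordlength`, and `squared_norm` are hypothetical.

```python
from fractions import Fraction as F

# Quaternion coefficients (Re, Im_i, Im_j, Im_k) copied from Table 1.
TABLE_1 = {
    "p0": (F(-45, 128), F(9, 16), F(-31, 128), F(-5, 8)),
    "q0": (F(-11, 16), F(-1, 2), F(1, 16), F(7, 16)),
    "q1": (F(3, 8), F(1, 8), F(3, 4), F(-1, 2)),
}


def wordlength(parts):
    """Assumed rule: fractional bits of the largest dyadic denominator + 1 sign bit."""
    max_den = max(x.denominator for x in parts)
    if max_den & (max_den - 1):
        raise ValueError("denominator is not a power of two")
    return (max_den.bit_length() - 1) + 1


def squared_norm(parts):
    """Exact squared norm of a quaternion given by its four rational parts."""
    return sum(x * x for x in parts)


for name, parts in TABLE_1.items():
    print(name, wordlength(parts), squared_norm(parts))
```

Under this assumed rule the computed wordlengths agree with the last column of Table 1. The exact squared norms are not equal to 1, which seems consistent with the remark in Section 4.4 that the gain c can deviate from 1 after quantization.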
4.4. Robustness to coefficient quantization

All of the developed lattice structures are lossless regardless of coefficient quantization. This is because the derived factorizations contain no components which become nonorthogonal when represented with finite precision. Thus, the frequency responses of such systems are always power-complementary [2]:

$$\sum_{k=0}^{M-1} \left|H_k(e^{j\omega})\right|^2 = c^2, \quad \forall\omega, \tag{68}$$

though c can deviate from 1. If a compensation of this effect is desired, c can be calculated as

$$c^2 = \left.\sum_{k=0}^{M-1} \left|H_k(e^{j\omega})\right|^2\right|_{\omega=0} = \sum_{k=0}^{M-1}\left(\sum_{n=0}^{L-1} h_k(n)\right)^2, \tag{69}$$

and the multiplication by its reciprocal can be easily embedded into the computational scheme.

The situation is more involved if regularity is considered, because the quaternion conditioned by the others under (44), (54), or (63) must be represented accurately. Fortunately, the necessary wordlength is finite and strictly determined by those of the remaining coefficients. Moreover, any scaling of the coefficient value does not disturb regularity.

To demonstrate that regularity can indeed be easily imposed on quaternionic lattice structures, even under finite precision, the next section shows three design examples. The obtained filter banks with rational quaternionic coefficients can be implemented using fixed-point arithmetic, possibly in a multiplierless manner as in [29].

5. DESIGN EXAMPLES

5.1. Coefficient synthesis methodology

The goal was to obtain frequency-selective filter banks with high coding gains. So, the weighted sum of two criteria was used as an objective function for optimization.

The first criterion is the stopband attenuation measured in terms of energy as

$$\varepsilon_{SBE} = \sum_{k=0}^{M-1} \int_{\omega\in\Omega_k} \left|H_k(e^{j\omega})\right|^2 d\omega, \tag{70}$$

where Ω_k denotes the stopband of the kth filter and the number of channels, M, equals 4 in our case.

The second performance criterion is a coding gain defined as

$$CG = 10\log_{10} \frac{(1/M)\sum_{k=0}^{M-1}\sigma_{x_k}^2}{\left(\prod_{k=0}^{M-1}\sigma_{x_k}^2\right)^{1/M}}, \tag{71}$$

where $\sigma_{x_k}^2$ are the subband variances. They correspond to the diagonal elements of the autocorrelation matrix of the transformed signal, R_yy:

$$\sigma_{x_k}^2 = \left[R_{yy}\right]_{kk}. \tag{72}$$

It can be determined as the product

$$R_{yy} = H R_{xx} H^T \tag{73}$$

of the autocorrelation matrix of the input signal, R_xx, and the transform matrix H formed from the impulse responses of the filter bank as follows:

$$[H]_{kn} = h_k(L - 1 - n), \tag{74}$$

where k = 0, ..., M − 1 and n = 0, ..., L − 1.

In our experiments, the matrix R_xx was generated for an AR(1) input process with unit variance and the correlation coefficient of 0.95. Such a model is appropriate only for natural images, and therefore other applications will require different approaches.

In the synthesis procedures, the quaternion lattice coefficients assumed to be unconstrained in (54) and (63) were represented in the polar form (25). So the standard Matlab routines intended for unconstrained optimization, that is, fminunc and fminsearch, could be used to search for the angles that minimize the objective function. Given infinite-precision coefficients, we carefully converted them into rationals. This was done intuitively by hand, but the development of an advanced algorithm, like that proposed in [29], is planned.

5.2. Design example 1: 8-tap general PUFB

A 4-channel 8-tap PUFB was designed using the results from Section 4.1. The synthesized coefficients of the quaternionic lattice structure are given in Table 1 and the corresponding magnitude responses are shown in Figure 10(a). The filter bank is characterized by a coding gain of 8.1227 dB and a minimum stopband attenuation of 20 dB, so it can compete with the similar system demonstrated in [10].

The plots allow us to verify that the designed PUFB is indeed one-regular. It can be easily noticed in Figure 10(a) that only the lowpass filter has a nonzero magnitude response at DC frequency. On the other hand, the zero-pole plot in Figure 10(b) shows that the lowpass filter has a single zero in each of those points of the unit circle, which correspond
to the mirror aliasing frequencies. Figure 10(c) demonstrates the wavelet basis related to the system.

Figure 10: Design example of general PUFB: (a) magnitude responses, (b) zeros of H0(z), and (c) the scaling function and wavelets.

Table 2: Rational coefficient values for LP PUFB.

Coeff.   Re         Im_i       Im_j   Im_k   Wordlength
q0       −231/512   459/1024   0      0      11
p0       −7/8       −3/8       0      0      4
p1       −3/16      15/16      0      0      5
p2       −9/16      −13/16     0      0      5

Figure 11: Design example of LP PUFB: (a) magnitude responses, (b) zeros of H0(z), and (c) the scaling function and wavelets.
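The coding gain criterion (71)-(74) from Section 5.1 can be sketched in a few lines. The example below is only illustrative: it uses a 2-channel Haar filter bank rather than one of the designed 4-channel PUFBs, and the function names `ar1_autocorrelation` and `coding_gain_db` are hypothetical. It builds R_xx for a unit-variance AR(1) process, forms H from the time-reversed impulse responses as in (74), and evaluates (73) and (71).

```python
import math


def ar1_autocorrelation(size, rho):
    """Autocorrelation matrix of a unit-variance AR(1) process: [Rxx]_ij = rho^|i-j|."""
    return [[rho ** abs(i - j) for j in range(size)] for i in range(size)]


def coding_gain_db(filters, rho=0.95):
    """Coding gain (71) for impulse responses h_k(n), k = 0..M-1, n = 0..L-1."""
    M, L = len(filters), len(filters[0])
    # (74): [H]_kn = h_k(L - 1 - n)
    H = [[h[L - 1 - n] for n in range(L)] for h in filters]
    Rxx = ar1_autocorrelation(L, rho)
    # (72)-(73): the subband variances are the diagonal of H Rxx H^T
    variances = [
        sum(H[k][i] * Rxx[i][j] * H[k][j] for i in range(L) for j in range(L))
        for k in range(M)
    ]
    arithmetic_mean = sum(variances) / M
    geometric_mean = math.exp(sum(math.log(v) for v in variances) / M)
    return 10.0 * math.log10(arithmetic_mean / geometric_mean)


s = 1.0 / math.sqrt(2.0)
haar = [[s, s], [s, -s]]  # illustrative filter bank, not one from the paper
print(coding_gain_db(haar))
```

For the Haar pair the subband variances are 1 + ρ and 1 − ρ, so the result can be checked against the closed form 10 log10(1/√(1 − ρ²)). Applying the same function to the 4 × L impulse-response matrix of a designed PUFB would be the analogous computation, assuming the impulse responses have first been obtained from the lattice coefficients.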
5.3. Design example 2: 12-tap LP PUFB

The second design example demonstrates the usefulness of the theory developed in Section 4.2. The hypercomplex coefficient values given in Table 2 determine the 12-tap LP PUFB which has a coding gain of 8.1845 dB. The plots in Figure 11 allow us to evaluate the magnitude responses of the system and verify its one-regularity. The filters have good frequency selectivity. For the lowpass and highpass ones, the sidelobes are below the −35 dB level, whereas for the bandpass filters, the peak amplitude of the sidelobes is about −20 dB.

5.4. Design example 3: 12-tap PMI LP PUFB

The results from Section 4.3 allowed us to design the 12-tap LP PUFB whose pairwise-mirror-image symmetric magnitude responses are shown in Figure 12(a). The coefficients of the quaternionic lattice are given in Table 3. In spite of decreased design freedom and shorter coefficient wordlengths, the characteristics of the system are very close to those of the LP filter bank presented above. Namely, the coding gain is 8.1699 dB and the sidelobe levels are −31 and −20 dB. The reason for this is the similarity of the zero locations shown in Figure 12(b) to those in Figure 11(b). The differences between the wavelet bases are almost unnoticeable.

Table 3: Rational coefficient values for PMI LP PUFB.

Coeff.   Re        Im_i    Im_j   Im_k   Wordlength
p0       7/8       3/8     0      0      4
p1       3/16      −1      0      0      5
p2       −17/128   43/64   0      0      8

Figure 12: Design example of PMI LP PUFB: (a) magnitude responses, (b) zeros of H0(z), and (c) the scaling function and wavelets.

6. CONCLUSION

The developed quaternionic approach to the design and implementation of four-band PUFBs seems to be very competitive with the conventional techniques. Its unique advantage is the structural imposition of the paraunitary property (losslessness) even with finite-precision arithmetic. It also enables the straightforward expression of the one-regularity conditions in terms of the coefficients of the quaternionic lattice structure, which is also advantageous in fixed-point implementations. So the solution is especially interesting from a practical point of view.

ACKNOWLEDGMENTS

This work was supported by the Polish Ministry of Science and Higher Education (MNiSzW) in years 2005-2006 (Grant no. 3 T11F 014 29). It was also partially supported by Bialystok Technical University under the Grant W/WI/2/05.

REFERENCES

[1] P. P. Vaidyanathan and Z. Doǧanata, “The Role of lossless systems in modern digital signal processing: a tutorial,” IEEE Transactions on Education, vol. 32, no. 3, pp. 181–197, 1989.
[2] P. P. Vaidyanathan, Multirate Systems and Filter Banks, Prentice-Hall, Englewood Cliffs, NJ, USA, 1993.
[3] P. P. Vaidyanathan, T. Q. Nguyen, Z. Doǧanata, and T. Saramaki, “Improved technique for design of perfect reconstruction FIR QMF banks with lossless polyphase matrices,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, no. 7, pp. 1042–1056, 1989.
[4] A. K. Soman, P. P. Vaidyanathan, and T. Q. Nguyen, “Linear phase paraunitary filter banks: theory, factorizations and designs,” IEEE Transactions on Signal Processing, vol. 41, no. 12, pp. 3480–3496, 1993.
[5] A. K. Soman and P. P. Vaidyanathan, “A complete factorization of paraunitary matrices with pairwise mirror-image symmetry in the frequency domain,” IEEE Transactions on Signal Processing, vol. 43, no. 4, pp. 1002–1004, 1995.
[6] S. Oraintara, T. D. Tran, P. N. Heller, and T. Q. Nguyen, “Lattice structure for regular paraunitary linear-phase filterbanks and M-band orthogonal symmetric wavelets,” IEEE Transactions on Signal Processing, vol. 49, no. 11, pp. 2659–2672, 2001.
[7] P. P. Vaidyanathan and P.-Q. Hoang, “Lattice structures for optimal design and robust implementation of two-band perfect reconstruction QMF banks,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 36, no. 1, pp. 81–94, 1988.
[8] P. P. Vaidyanathan, “On coefficient-quantization and computational roundoff effects in lossless multirate filter banks,” IEEE Transactions on Signal Processing, vol. 39, no. 4, pp. 1006–1008, 1991.
[9] Y.-J. Chen and K. S. Amaratunga, “M-channel lifting factorization of perfect reconstruction filter banks and reversible M-band wavelet transforms,” IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 50, no. 12, pp. 963–976, 2003.
[10] Y.-J. Chen, S. Oraintara, and K. S. Amaratunga, “Dyadic-based factorizations for regular paraunitary filterbanks and M-band orthogonal wavelets with structural vanishing moments,” IEEE Transactions on Signal Processing, vol. 53, no. 1, pp. 193–207, 2005.
[11] M. Parfieniuk and A. Petrovsky, “Quaternionic building block for paraunitary filter banks,” in Proceedings of the 12th European Signal Processing Conference (EUSIPCO ’04), pp. 1237–1240, Vienna, Austria, September 2004.
[12] M. Parfieniuk and A. Petrovsky, “Quaternionic formulation of the first regularity for four-band paraunitary filter banks,” in Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS ’06), pp. 883–886, Kos, Greece, May 2006.
[13] Z. Doǧanata, P. P. Vaidyanathan, and T. Q. Nguyen, “General synthesis procedures for FIR lossless transfer matrices, for perfect-reconstruction multirate filter bank applications,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 36, no. 10, pp. 1561–1574, 1988.
[14] G. Strang and T. Q. Nguyen, Wavelets and Filter Banks, Wellesley-Cambridge Press, Wellesley, Mass, USA, 1996.
[15] R. L. de Queiroz, T. Q. Nguyen, and K. R. Rao, “The GenLOT: generalized linear-phase lapped orthogonal transform,” IEEE Transactions on Signal Processing, vol. 44, no. 3, pp. 497–507, 1996.
[16] L. Gan and K.-K. Ma, “A simplified lattice factorization for linear-phase perfect reconstruction filter bank,” IEEE Signal Processing Letters, vol. 8, no. 7, pp. 207–209, 2001.
[17] T. Q. Nguyen and P. P. Vaidyanathan, “Maximally decimated perfect-reconstruction FIR filter banks with pairwise mirror-image analysis (and synthesis) frequency responses,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 36, no. 5, pp. 693–706, 1988.
[18] L. Gan and K.-K. Ma, “A simplified lattice factorization for linear-phase paraunitary filter banks with pairwise mirror image frequency responses,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 51, no. 1, pp. 3–7, 2004.
[19] O. Rioul, “Regular wavelets: a discrete-time approach,” IEEE Transactions on Signal Processing, vol. 41, no. 12, pp. 3572–3579, 1993.
[20] P. Steffen, P. N. Heller, R. A. Gopinath, and C. S. Burrus, “Theory of regular M-band wavelet bases,” IEEE Transactions on Signal Processing, vol. 41, no. 12, pp. 3497–3511, 1993.
[21] W. R. Hamilton, “On quaternions; or on a new system of imaginaries in algebra,” The London, Edinburgh and Dublin Philosophical Magazine and Journal of Science, vol. 25, pp. 489–495, 1844.
[22] I. L. Kantor and A. S. Solodovnikov, Hypercomplex Numbers: An Elementary Introduction to Algebras, Springer, New York, NY, USA, 1989.
[23] A. Baker, Matrix Groups: An Introduction to Lie Group Theory, Springer, London, UK, 2002.
[24] H. G. Baker, “Quaternions and orthogonal 4x4 real matrices,” Tech. Rep., June 1996, http://www.gamedev.net/reference/articles/article428.asp.
[25] E. Salamin, “Application of quaternions to computation with rotations,” Tech. Rep., Stanford AI Lab, Stanford, Calif, USA, 1979.
[26] T. D. Howell and J. C. Lafon, “The complexity of the quaternion product,” Tech. Rep. TR 75-245, Cornell University, Ithaca, NY, USA, June 1975, http://citeseer.ist.psu.edu/howell75complexity.html.
[27] M. Parfieniuk and A. Petrovsky, “Implementation perspectives of quaternionic component for paraunitary filter banks,” in Proceedings of the International Workshop on Spectral Methods and Multirate Signal Processing (SMMSP ’04), pp. 151–158, Vienna, Austria, September 2004.
[28] M. Parfieniuk and A. Petrovsky, “Linear phase paraunitary filter banks based on quaternionic component,” in Proceedings of International Conference on Signals and Electronic Systems (ICSES ’04), pp. 203–206, Poznań, Poland, September 2004.
[29] Y.-J. Chen, S. Oraintara, T. D. Tran, K. Amaratunga, and T. Q. Nguyen, “Multiplierless approximation of transforms with adder constraint,” IEEE Signal Processing Letters, vol. 9, no. 11, pp. 344–347, 2002.

Marek Parfieniuk was born in Bialystok, Poland, in 1975. He received his M.S. degree in computer science, with honors, from Bialystok Technical University, in 2000. Currently, he is completing the procedures for the Ph.D. degree. Since 2000, he has been an Assistant Lecturer at Faculty of Computer Science, Bialystok Technical University. From 2000 to 2003, he also worked for ComputerLand S.A. as an Enterprise Software Developer.

Alexander Petrovsky received the Dipl.-Ing. degree in computer engineering, in 1975, and the Ph.D. degree, in 1980, both from the Minsk Radio-Engineering Institute, Minsk, Belarus. In 1989, he received the Doctor of Science degree from The Institute of Simulation Problems in Power Engineering, Academy of Science, Kiev, Ukraine. In 1975, he joined Minsk Radio-Engineering Institute. He became a Research Worker and Assistant Professor; and since 1980, he has been an Associate Professor at the Computer Science Department. From 1983 to 1984, he was a Research Worker at the Royal Holloway College and the Imperial College of Science and Technology, University of London, London, UK. Since May 1990, he has been a Professor and Head of the Computer Engineering Department, the Belarusian State University of Informatics and Radioelectronics, and he is with the Real-Time Systems department, Faculty of Computer Science, Bialystok Technical University, Poland. Recently, his main research interests are acoustic signal processing, such as speech and audio codings, noise reduction and acoustic echo cancellation, robust speech recognition, and real-time signal processing. He is a Member of Russian A. S. Popov Society for Radioengineering, Electronics, and Communications, and an Editorial Staff Member of the Russian journal Digital Signal Processing, AES, IEEE, EURASIP.
doi:10.1155/2007/45816
Research Article
Noniterative Design of 2-Channel FIR Orthogonal Filters
M. Elena Domı́nguez Jiménez
TACA Research Group, ETSI Industriales, Universidad Politécnica de Madrid, 28006 Madrid, Spain
Received 1 January 2006; Revised 12 June 2006; Accepted 26 August 2006
This paper addresses the problem of obtaining an explicit expression of all real FIR paraunitary filters. In this work, we present a
general parameterization of 2-channel FIR orthogonal filters. Unlike other approaches which make use of a lattice structure, we
show that our technique designs any orthogonal filter directly, with no need of iteration procedures. Moreover, in order to design
an L-tap 2-channel paraunitary filterbank, it suffices to choose L/2 independent parameters, and introduce them in a simple ex-
pression which provides the filter coefficients directly. Some examples illustrate how this new approach can be used for designing
filters with certain desired properties. Further conditions can be eventually imposed on the parameters so as to design filters for
specific applications.
Copyright © 2007 M. Elena Domı́nguez Jiménez. This is an open access article distributed under the Creative Commons
properly cited.
1. INTRODUCTION

Filterbanks are widely used in all signal processing areas. In the 2-channel case, the filterbank decomposes any signal into its lowpass and highpass components; this is achieved by convolution with a lowpass filter h and a highpass filter g. When using finite impulse response (FIR) filters, the implementation is even more direct.

In particular, for signal compression applications, orthogonal subband transforms are desired; hence, paraunitary filterbanks are required. Thus, some design techniques have been addressed in the literature. Moreover, the appearance of the wavelet theory gave a new insight into the filter bank theory, and also provided new methods for the design of real FIR paraunitary filters.

Despite the large number of publications, we have focused on the most well-known results, which are contained in [1–5]. We can merge these main approaches into three groups.

(a) Methods based on spectral factorizations. Commonly used in wavelet theory [1, 3], they are based on the following characterization: a real L-tap filter h = (h1, h2, ..., hL) is paraunitary if and only if its transfer function $H(z) = \sum_{n=1}^{L} h_n z^{1-n}$ verifies

$$|H(z)|^2 + |H(-z)|^2 = 2, \quad |z| = 1. \tag{1}$$

Hence, it suffices to find the power spectral response P(z) = |H(z)|² which satisfies P(z) + P(−z) = 2, and then factor it as P(z) = |H(z)|² = H(z)H(z⁻¹) so as to get the real filter coefficients. Thus, it is necessary to compute roots of a 2L − 1 degree polynomial P; but the main drawback is that the corresponding iterative algorithms generally become numerically unstable for long filters.

(b) Lattice filters design. Instead of computing a polynomial, this approach designs the polyphase matrix associated with the FIR 2-channel cell given by filters h, g. The polyphase matrix is defined as

$$H_p(z) = \begin{pmatrix} H_{\mathrm{even}}(z) & H_{\mathrm{odd}}(z) \\ G_{\mathrm{even}}(z) & G_{\mathrm{odd}}(z) \end{pmatrix} \tag{2}$$

and the filterbank is paraunitary if and only if H_p(z) is unitary for every |z| = 1. Thus, it suffices to build unitary matrices of this kind. The paradigm of these methods is Vaidyanathan’s algorithm [2, 6, 7]. Basically, any paraunitary real L-tap FIR causal filter is obtained through iteration, because its polyphase matrix can be factorized as

$$H_p(z) = \left[\prod_{j=1}^{L/2-1} \left(I + (z^{-1} - 1)\, v_j v_j^t\right)\right] Q, \tag{3}$$
where Q is unitary of order 2, and v_j are unitary column vectors of R². Thus, in order to design a paraunitary filter of length L, we need a total amount of L/2 parameters in [−1, 1], and L/2 signs. This algorithm behaves well numerically, but it is difficult to apply when imposing extra desired properties upon the filter. Besides, as we will see later on, this representation is redundant, in the sense that it could eventually give rise to filters of smaller length L − 2.

(c) Lifting scheme [5, 8]. This is apparently another approach for either orthogonal or biorthogonal two-channel filter banks. The key idea is to build filters of length L with desirable properties by lifting filters of length smaller than L. But, for the orthogonal case [5], it turns out to be another iterative algorithm, equivalent to the lattice factorization already mentioned. Thus, we will consider it as a particular example of the lattice filters design approach.

In summary, all these well-known approaches present some disadvantages.

On the other hand, in our recent work [9] we have proposed the first parameterization for paraunitary filters of length L by means of only L/2 independent parameters and 1 sign; besides, the filter coefficients are obtained explicitly, using neither an iteration process nor a root finding procedure. In this paper, we improve our design technique by obtaining a simpler expression; we also give for the first time a rigorous proof of its validity for lowpass filters. Finally, as one of the main contributions, we present a novel explicit expression of the power spectral response P(z) of paraunitary filters. It constitutes a new tool for the design of filters. For each specific application, it may be used to design the paraunitary filter to satisfy the desired properties.

The paper is organized as follows: in Section 2 we derive the explicit expression of all real 2-channel FIR orthogonal filterbanks. In Section 3 we study the particular case of lowpass paraunitary filterbanks, and illustrative examples are shown. In Section 4 we obtain the general explicit expression of the halfband power spectral response of a paraunitary filter, by means of the free parameters; conclusions are finally discussed in Section 5.

Let us now introduce the following notation, necessary to follow the development of our work: for any set of real numbers (a1, ..., am), let us denote the Toeplitz low-triangular matrix of order m which contains these numbers in its first column,

$$T(a_1, \dots, a_m) = \begin{pmatrix} a_1 & 0 & 0 & \cdots & 0 \\ a_2 & a_1 & 0 & \ddots & \vdots \\ a_3 & a_2 & a_1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & 0 \\ a_m & \cdots & a_3 & a_2 & a_1 \end{pmatrix}. \tag{4}$$

Throughout this paper, only real matrices and vectors are considered. Matrices are denoted by capital letters, and vectors by boldface lowercase letters. The superscript t denotes transposition.

Finally, P will denote the exchange matrix, that is, the one that produces a reversal. Recall that any Toeplitz matrix T verifies PTP = T^t. In effect, by reversing the order of its rows and columns we obtain its transpose.

2. NEW EXPRESSION OF ORTHOGONAL FILTERS

Throughout this paper, we will say that an L-tap filter h is orthogonal if it is orthogonal to its even shifts, that is, if it satisfies

$$\forall k = 1, \dots, \frac{L}{2} - 1, \quad 0 = \sum_{n=1}^{L-2k} h_n h_{n+2k}. \tag{5}$$

If we additionally impose the norm 1 condition ($\sum_{n=1}^{L} h_n^2 = 1$), then h will be called paraunitary.

The orthogonality condition implies that L is even; then, (5) can be rewritten, for any k = 1, ..., L/2 − 1, as

$$\sum_{n\ \mathrm{odd}} h_n h_{n+2k} = -\sum_{n\ \mathrm{even}} h_n h_{n+2k}. \tag{6}$$

For instance, if k = L/2 − 1, we have that h1 hL−1 = −h2 hL; as h1 · hL ≠ 0, there must be a real parameter a1 such that

$$\frac{h_{L-1}}{h_L} = -\frac{h_2}{h_1} = a_1. \tag{7}$$

Hence,

$$h_2 = -a_1 h_1, \quad h_{L-1} = a_1 h_L. \tag{8}$$

In other words, h2, hL−1 can be derived from h1, hL, respectively.

Now the key question arises: can we always write the even components of the filter by means of the odd ones, and vice versa? In [9], we have proved the next result, which guarantees that the answer is yes. Its proof is also included here.

Theorem 1. h = (h1, h2, ..., hL) is an orthogonal real filter if and only if there exists a unique set of real numbers a1, ..., aL/2−1 such that, for any k = 1, ..., L/2 − 1,

$$h_{2k} = -\sum_{j=1}^{k} h_{2k+1-2j}\, a_j, \quad h_{L+1-2k} = \sum_{j=1}^{k} h_{L-2k+2j}\, a_j. \tag{9}$$

Or, in an equivalent matricial way,

$$(h_2, h_4, \dots, h_{L-2})^t = -T(a_1, \dots, a_{L/2-1})\, (h_1, h_3, \dots, h_{L-3})^t, \tag{10}$$

$$(h_3, \dots, h_{L-3}, h_{L-1})^t = T(a_1, \dots, a_{L/2-1})^t\, (h_4, \dots, h_{L-2}, h_L)^t. \tag{11}$$
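To make Theorem 1 concrete, the following sketch recovers the parameters a_1, ..., a_{L/2−1} from a given orthogonal filter by forward substitution in the first relation of (9) and then checks the second relation. It is an illustration under stated assumptions rather than the paper's own code: all function names are hypothetical, and the test filters (the closed-form 4-tap Daubechies filter and a 6-tap filter generated by a plain rotation lattice) are not taken from this paper.

```python
import math


def parameters_from_filter(h):
    """Recover a_1..a_{L/2-1} from h = (h1..hL), stored at indices 0..L-1.

    Uses h_{2k} = -sum_{j=1}^{k} h_{2k+1-2j} a_j; the j = k term is h_1 a_k,
    so h_1 must be nonzero.
    """
    a = []
    for k in range(1, len(h) // 2):
        known = sum(h[2 * k - 2 * j] * a[j - 1] for j in range(1, k))
        a.append((-h[2 * k - 1] - known) / h[0])
    return a


def second_relation_residual(h, a):
    """Largest violation of h_{L+1-2k} = sum_{j=1}^{k} h_{L-2k+2j} a_j."""
    L = len(h)
    worst = 0.0
    for k in range(1, L // 2):
        rhs = sum(h[L - 2 * k + 2 * j - 1] * a[j - 1] for j in range(1, k + 1))
        worst = max(worst, abs(h[L - 2 * k] - rhs))
    return worst


def lattice_filter(angles):
    """Illustrative helper: orthogonal filter of length 2*len(angles) from rotations."""
    e0, e1 = [math.cos(angles[0])], [math.sin(angles[0])]
    for t in angles[1:]:
        d1 = [0.0] + e1   # delay the second polyphase component by one sample
        e0 = e0 + [0.0]   # pad the first component to the same length
        c, s = math.cos(t), math.sin(t)
        e0, e1 = ([c * x - s * y for x, y in zip(e0, d1)],
                  [s * x + c * y for x, y in zip(e0, d1)])
    # interleave the polyphase components: h1 = e0[0], h2 = e1[0], h3 = e0[1], ...
    return [v for pair in zip(e0, e1) for v in pair]


s3 = math.sqrt(3.0)
d4 = [x / (4.0 * math.sqrt(2.0)) for x in (1 + s3, 3 + s3, 3 - s3, 1 - s3)]
for h in (d4, lattice_filter([0.3, 1.1, -0.7])):
    a = parameters_from_filter(h)
    print(len(h), a, second_relation_residual(h, a))
```

For the 4-tap Daubechies filter the single parameter works out to a_1 = −√3, and in both cases the residual of the second relation should vanish up to rounding, as the theorem predicts for filters with h1 · hL ≠ 0.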
Proof. Equation (5) may be easily rewritten matricially as

$$T(h_1, h_3, \dots, h_{L-3})\, (h_{L-1}, h_{L-3}, \dots, h_3)^t = -T(h_L, h_{L-2}, \dots, h_4)\, (h_2, h_4, \dots, h_{L-2})^t, \tag{12}$$

where we have used our notation for lower triangular Toeplitz matrices. As h1 · hL ≠ 0, both matrices are nonsingular; besides, their inverses are also lower triangular Toeplitz matrices; finally, such matrices always commute, so we can state that

$$T(h_L, h_{L-2}, \dots, h_4)^{-1} (h_{L-1}, h_{L-3}, \dots, h_3)^t = -T(h_1, h_3, \dots, h_{L-3})^{-1} (h_2, h_4, \dots, h_{L-2})^t = (a_1, a_2, \dots, a_{L/2-1})^t; \tag{13}$$

in other words, we define (a1, ..., aL/2−1)^t as any of these two vectors. For instance, the first coefficient a1 is the one that verifies a1 = hL−1/hL = −h2/h1. Thus, we simultaneously have

$$(h_2, h_4, \dots, h_{L-2})^t = -T(h_1, h_3, \dots, h_{L-3})\, (a_1, \dots, a_{L/2-1})^t, \quad (h_{L-1}, h_{L-3}, \dots, h_3)^t = T(h_L, h_{L-2}, \dots, h_4)\, (a_1, \dots, a_{L/2-1})^t. \tag{14}$$

Note also that the set of parameters $(a_j)_{j=1}^{L/2-1}$ which satisfies any of these conditions is unique. Besides, these equations are equivalent to

$$(h_2, h_4, \dots, h_{L-2})^t = -T(a_1, \dots, a_{L/2-1})\, (h_1, h_3, \dots, h_{L-3})^t, \quad (h_{L-1}, h_{L-3}, \dots, h_3)^t = T(a_1, \dots, a_{L/2-1})\, (h_L, h_{L-2}, \dots, h_4)^t. \tag{15}$$

The former identity yields (10) directly. On the other hand, by reversing the rows of the second identity we obtain (11). Just recall that T(a1, ..., aL/2−1) is a Toeplitz matrix, so the exchange matrix P satisfies

$$P\, T(a_1, \dots, a_{L/2-1}) = T(a_1, \dots, a_{L/2-1})^t\, P, \tag{16}$$

which concludes the proof.

For example, for k = 2, Theorem 1 implies that hL−3 = a1 hL−2 + a2 hL and −h4 = a1 h3 + a2 h1. So we have shown that it is possible to express every odd coefficient h2k+1 by means of the even coefficients that follow it, and every even coefficient h2k by means of the odd ones that precede it.

2.1. New simplified expression of orthogonal filters

Once we have demonstrated the existence of the vector of parameters $a = (a_j)_{j=1}^{L/2-1}$, then we define

(i) a Toeplitz low-triangular matrix of order L/2 − 1:

$$A := T(0, a_1, \dots, a_{L/2-2}); \tag{17}$$

(ii) and two vectors of length L/2 − 1:

$$b = -(I + AA^t)^{-1} a, \quad c = A^t b = -A^t (I + AA^t)^{-1} a, \tag{18}$$

which are well defined because I + AA^t is always a positive definite matrix. Note also that b1 = −a1 and cL/2−1 = 0 because of the null diagonal of A.

For the sake of simplicity, from now on we will denote

$$h_{\mathrm{even}} = (h_2, h_4, h_6, \dots, h_{L-2})^t, \quad h_{\mathrm{odd}} = (h_3, h_5, \dots, h_{L-3}, h_{L-1})^t, \tag{19}$$

which contain the even and odd indexed coefficients of h except the first and the last ones, h1, hL.

Now we are able to finally express all the components of the filter by means of h1, hL, and the L/2 − 1 parameters. This is one of the main results of this paper, which constitutes a new characterization and design method of all orthogonal filters, even simpler than the one obtained in [9].

Theorem 2. h = (h1, h2, ..., hL) is an orthogonal filter if and only if there exist L/2 − 1 real numbers a1, ..., aL/2−1 such that

$$h_{\mathrm{even}} = h_1 b + h_L P c, \quad h_{\mathrm{odd}} = h_1 c - h_L P b. \tag{20}$$

Proof. By making use of the matrix A and the vectors h_even and h_odd introduced above, (10) and (11) can be, respectively, rewritten as

$$-h_{\mathrm{even}} = h_1 a + A h_{\mathrm{odd}}, \quad h_{\mathrm{odd}} = h_L P a + A^t h_{\mathrm{even}} \tag{21}$$

so we have that

$$h_{\mathrm{even}} + A h_{\mathrm{odd}} = -h_1 a, \quad -A^t h_{\mathrm{even}} + h_{\mathrm{odd}} = h_L P a. \tag{22}$$
It just suffices to solve this linear system with unknowns h_even, h_odd. By elementary Gaussian elimination operations, it is equivalent to the system

$$(I + AA^t)\, h_{\mathrm{even}} = -(h_1 I + h_L AP)\, a, \quad (I + A^tA)\, h_{\mathrm{odd}} = (h_L P - h_1 A^t)\, a, \tag{23}$$

from which we can obtain both vectors independently, because I + AA^t and I + A^tA are nonsingular. Moreover, we can exploit the fact that A is Toeplitz: A^t = PAP, A = PA^tP, and A^tA = PAA^tP so (I + A^tA) = P(I + AA^t)P and

$$(I + A^tA)^{-1} P = P (I + AA^t)^{-1}; \tag{24}$$

besides, it is easy to show that

$$(I + AA^t)^{-1} A = A (I + A^tA)^{-1}, \quad (I + A^tA)^{-1} A^t = A^t (I + AA^t)^{-1}. \tag{25}$$

Finally, we make use of all these expressions and the definition of b and c given in expressions (18) in order to obtain (20):

$$h_{\mathrm{even}} = -(I + AA^t)^{-1} (h_1 I + h_L AP)\, a = h_1 b + h_L P c, \quad h_{\mathrm{odd}} = (I + A^tA)^{-1} (h_L P - h_1 A^t)\, a = -h_L P b + h_1 c. \tag{26}$$

We have derived that, by choosing L/2 − 1 arbitrary parameters and 2 arbitrary nonzero numbers h1, hL, we are able to parameterize the whole set of orthogonal filters h of length L. In other words, these filters are characterized by means of just L/2 + 1 parameters. And this representation is unique: different sets of parameters always yield different filters, so there is no redundancy in this parameterization.

All the coefficients of the filter are of the following form (first: odd coefficients, last: even coefficients):

$$\begin{pmatrix} h_1 \\ h_{\mathrm{odd}} \\ h_{\mathrm{even}} \\ h_L \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ c & -Pb \\ b & Pc \\ 0 & 1 \end{pmatrix} \begin{pmatrix} h_1 \\ h_L \end{pmatrix}. \tag{27}$$

Thus, any orthogonal filter is a linear combination of these two columns, which are indeed orthogonal filters of length L − 2. They are orthogonal columns; moreover, it can easily be seen that they are conjugate quadrature filters. In effect, the odd components of the first filter correspond to the even components of the second one, reversed; and the even components of the first filter are the opposite of the odd components of the second one, reversed. This property will be exploited in the next section.

Let us remark that this property confirms the underlying idea of lattice factorization [6] and lifting scheme [5]. L-tap paraunitary filters can be built by means of paraunitary filters of smaller length (L − 2). In this sense, our design approach generalizes those existing techniques.

To finish this section, let us notice that we can also write

$$\begin{pmatrix} h_{\mathrm{odd}} & P h_{\mathrm{even}} \end{pmatrix} = \begin{pmatrix} c & -Pb \end{pmatrix} \begin{pmatrix} h_1 & h_L \\ h_L & -h_1 \end{pmatrix}. \tag{28}$$

Remark 1. From this expression, we also deduce that any pair of conjugate quadrature mirror filters is associated to the same set of independent parameters a1, ..., aL/2−1; the only difference is the value of the first and last coefficients. If we choose h1, hL for the filter h, then we just have to set g1 = hL, gL = −h1 for its CQM filter g.

2.2. New expression of paraunitary filters

Next, we impose the constraint that the vector h has norm 1; regarding (27), let us note that the squared norm of each column is equal to

$$1 + \|b\|^2 + \|c\|^2 = 1 - b^t a \ge 1, \tag{29}$$

where we have used that

$$\|b\|^2 + \|A^t b\|^2 = b^t (I + AA^t)\, b = -b^t a \ge 0. \tag{30}$$

Due to the orthogonality of the two columns of this expression (27), the norm of h is very easy to compute:

$$1 = \|h\|^2 = (1 - b^t a)\,(h_1^2 + h_L^2). \tag{31}$$

As the quantity 1 ≤ 1 − b^t a < ∞ only depends on the choice of the parameters, it just suffices to choose h1, hL in the circle of radius

$$0 < \frac{1}{\sqrt{1 - b^t a}} \le 1. \tag{32}$$

Corollary 1. h = (h1, h2, ..., hL) is a paraunitary filter if and only if there exist L/2 − 1 real numbers a1, ..., aL/2−1 verifying (20), and

$$h_1^2 + h_L^2 = (1 - b^t a)^{-1}. \tag{33}$$

This means that hL (up to its sign) is expressed by means of h1. In other words, it is deduced that the set of paraunitary filters of length L is determined by L/2 parameters, and 1 sign.

For instance, if all the parameters are chosen null, then vectors b and c are null, and the filter obtained is of the type h = (h1, 0, ..., 0, hL) which is orthogonal, and unitary whenever h1² + hL² = 1/(1 + 0² + 0² + 0²) = 1.

3. DESIGN OF ORTHOGONAL FILTERS WITH DESIRABLE PROPERTIES

3.1. Design of lowpass orthogonal filters

Lowpass filters must satisfy H(1) = s ≠ 0. Equivalently, let us now impose s = H(1) = Σ h_n = u^t h where u is the vector whose components are all equal to 1. Again, by using (27), the sum of the coefficients of h is a linear combination of the sum of each one of the two columns:

$$s = h_1 \left(1 + u^t (b + c)\right) + h_L \left(1 + u^t (c - b)\right) \tag{34}$$

so we get the equation of a straight line. Note that the normal vector is always nonzero, because the sum of both columns cannot vanish simultaneously. The reason is that they are
conjugate quadrature mirror filters. Hence, there are always infinite choices for $h_1, h_L$ in that line.

For example, by choosing all parameters null, the equation of the line is $s = h_1 + h_L$, so the associated orthogonal filter is $h = (h_1, 0, \ldots, 0, s - h_1)$.

3.2. Design of lowpass paraunitary filters

It is well known that lowpass paraunitary filters must satisfy the DC leakage condition. As $H(-1) = 0$, introducing it into (1) we obtain that $H(1) = \sqrt{2}$. Now we impose both conditions over the orthogonal filter $h$: norm 1 and sum $\sqrt{2}$.

The equations that $h_1, h_L$ must verify are

$$h_1^2 + h_L^2 = \bigl(1 - b^t a\bigr)^{-1}, \qquad \sqrt{2} = h_1\bigl(1 + u^t(b + c)\bigr) + h_L\bigl(1 + u^t(-b + c)\bigr). \tag{35}$$

In other words, $(h_1, h_L)$ lies in the intersection between a circle and a line in $\mathbb{R}^2$. May this intersection be null? This question was open in our previous work [9] but now we demonstrate that the answer is no. The reason is that, for any lowpass filter of first and last coefficients $h_1, h_L$, the corresponding conjugate highpass filter of the same length is the one whose first and last coefficients are $\pm h_L, \mp h_1$. This means that the line which is orthogonal to the previous one and contains the origin will surely intersect such circle in two points: $\pm(h_L, -h_1)$. So there is only one highpass filter (up to the sign); hence, there is only one lowpass orthogonal filter which satisfies both conditions above. So this justifies that such intersection is not null; moreover, it contains only one point.

For example, if all the parameters are chosen to be null, then all these vectors are null, and this condition is clearly satisfied, giving rise to the paraunitary lowpass filters $\pm(\sqrt{2}/2)(1, 0, 0, \ldots, 0, 0, 1)$; for $L = 2$, we obtain the Haar filter.

3.3. Example: 4-tap lowpass paraunitary filters

As a very simple example, let us consider paraunitary filters of length 4; they must be of the following form: $h = (h_1, -ah_1, ah_4, h_4)$, they must have norm 1, and satisfy the DC condition:

$$h_1^2 + h_4^2 = \bigl(1 + a^2\bigr)^{-1}, \qquad \sqrt{2} = h_1(1 - a) + h_4(1 + a). \tag{36}$$

But such conditions are always possible for all $a$, since the line and the circle intersect in only one point,

$$h_1 = \frac{1 - a}{(1 + a^2)\sqrt{2}}, \qquad h_4 = \frac{1 + a}{(1 + a^2)\sqrt{2}}, \tag{37}$$

obtaining the unique expression for the filter

$$h = \frac{1}{(1 + a^2)\sqrt{2}}\bigl(1 - a,\; a^2 - a,\; a^2 + a,\; 1 + a\bigr). \tag{38}$$

Let us compare it to the other approaches. The spectral method would have required a greater amount of operations; as for the lattice filters, we only would need $4/2 - 1 = 1$ unitary vector, and a unitary matrix of order 2. It is easy to see that the unitary matrix for lowpass filters is always equal to

$$Q = \frac{\sqrt{2}}{2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \tag{39}$$

Next, by choosing an arbitrary unitary vector $v = (c, d)^t/\sqrt{c^2 + d^2}$ and the unitary matrix $Q$, the paraunitary lowpass filters computed via the lattice method are

$$h = \frac{1}{(c^2 + d^2)\sqrt{2}}\bigl(d^2 - cd,\; d^2 + cd,\; c^2 + cd,\; c^2 - cd\bigr), \tag{40}$$

so they are of length 4, except when $c = 0$ (and we have Haar filter), or $d = 0$ or $c = d$ (shifted versions of Haar filter). This is an example that the lattice design may provide filters of smaller length.

On the other hand, note that its components are $h = (h_1, -ah_1, ah_4, h_4)$ with (in case the length is exactly 4)

$$a = \frac{c + d}{c - d} = \frac{1 + d/c}{1 - d/c}, \tag{41}$$

and the ratio $d/c$ is the very important direction of vector $v$, whereas $a \in \mathbb{R}$ is a free parameter which can take all possible real values. This means that our expression (38) is simpler than (40), and yields the same set of paraunitary filters.

3.4. Example: 4-tap lowpass paraunitary filter with maximum attenuation

The attenuation of the lowpass filter may be measured as

$$\int_0^{\pi/2} |H(w)|^2\,dw = \frac{\pi}{2} + 2\sum_{n=0}^{L/2-1} \frac{(-1)^n}{2n+1}\, r(2n+1), \tag{42}$$

where $r(n)$ denotes the autocorrelation coefficients of the filter.

Let us impose now the maximum attenuation to our 4-tap designed filters. In this case we should maximize $r(1) - r(3)/3$. To this aim, we compute such autocorrelation coefficients of (38):

$$r(1) = \frac{3a^2 + a^4}{2(1 + a^2)^2}, \qquad r(3) = h_1 h_4 = \frac{1 - a^2}{2(1 + a^2)^2}. \tag{43}$$

Next, it suffices to maximize

$$r(1) - \frac{r(3)}{3} = \frac{10a^2 + 3a^4 - 1}{6(1 + a^2)^2}. \tag{44}$$

We obtain that the maximum is achieved for $a = \pm\sqrt{3}$. For $a = \sqrt{3}$, we have

$$h = \frac{1}{4\sqrt{2}}\bigl(1 - \sqrt{3},\; 3 - \sqrt{3},\; 3 + \sqrt{3},\; 1 + \sqrt{3}\bigr), \tag{45}$$

whereas for $a = -\sqrt{3}$, we obtain

$$h = \frac{1}{4\sqrt{2}}\bigl(1 + \sqrt{3},\; 3 + \sqrt{3},\; 3 - \sqrt{3},\; 1 - \sqrt{3}\bigr), \tag{46}$$
which correspond to the 4-tap Daubechies filters (minimum/maximum phase), which are the optimal ones, with attenuation $(\pi/2) + (7/6)$.

Let us remark that our technique confirms the results obtained by means of other approaches, although in a more direct way. Nevertheless, working with longer filters will involve maximizing a functional which depends on more variables, and the expressions will be more complicated.

4. NEW EXPRESSION OF THE POWER SPECTRAL RESPONSE

As another final contribution, we will find the explicit expression of the halfband polynomial $P(z) = |H(z)|^2$ associated to a paraunitary filter of length $L$. Our final aim would be to design the polynomial $P$ instead of the filter itself. To this end, we first must find the desired expression of $P$ by means of the $L/2 - 1$ independent parameters $(a_1, \ldots, a_{L/2-1})$, apart from $h_1, h_L$ which verify the 1-norm condition (33).

We will use the simple expression (27) already obtained. Let us denote $H_1(z)$ the transfer function of the filter given by the first column. On one hand, its even coefficients constitute vector $b$, while its odd coefficients are $(1, c)$. Note that it is a filter of length $L - 2$ because the last component of $c$ is zero. So we can write $H_1(z) = C(z^2) + z^{-1}B(z^2)$, where $B, C$ are the respective transfer functions associated to the filters $b$, and $(1, c)$, both of length $L/2 - 1$.

Moreover, this first column constitutes an orthogonal filter; in effect, the filter $|H_1(z)|^2$ is halfband:

$$d = 2\bigl(1 - b^t a\bigr) = |H_1(z)|^2 + |H_1(-z)|^2 = 2\,|C(z^2)|^2 + 2\,|B(z^2)|^2. \tag{47}$$

On the other hand, the second column is a shifted version of its CQF filter, so we easily deduce that

$$H(z) = h_1 H_1(z) + h_L z^{1-L} H_1(-z^{-1}). \tag{48}$$

Let us finally compute the power spectral response, also by making use of (33):

$$\begin{aligned} P(z) &= |H(z)|^2 = \bigl|h_1 H_1(z) + h_L z^{1-L} H_1(-z^{-1})\bigr|^2 \\ &= \frac{d}{2}\bigl(h_1^2 + h_L^2\bigr) + 2h_1 h_L \operatorname{Re}\bigl(z^{L-1} H_1(-z) H_1(z)\bigr) + 2\bigl(h_1^2 - h_L^2\bigr) \operatorname{Re}\bigl(z\,C(z^2) B(z^{-2})\bigr) \\ &= 1 + 2h_1 h_L \operatorname{Re}\bigl(z^{L-1}\bigl(C^2(z^2) - z^{-2} B^2(z^2)\bigr)\bigr) + 2\bigl(h_1^2 - h_L^2\bigr) \operatorname{Re}\bigl(z\,C(z^2) B(z^{-2})\bigr), \end{aligned} \tag{49}$$

where Re stands for real part of the complex number.

We summarize that the coefficients of $P$, which are the autocorrelation coefficients $r(n)$ of the filter, can be easily obtained by means of the coefficients of $C^2$ and $B^2$ (resp., the autoconvolution of $(1, c)$, and the autoconvolution of $b$) and the coefficients of $C(z^2)B(z^{-2})$ (say, the correlation between $(1, c)$ and $b$). Recall that all these vectors are computed directly from the free parameters $a$. Finally, $h_L$ is simply obtained from $h_1$ by means of (33), up to a sign.

5. CONCLUSIONS

We have presented a novel characterization of real paraunitary FIR filterbanks. This provides a new method for the direct design of this type of filters. Its main advantage is that it does not need any iteration process. It just suffices to choose arbitrary values of some parameters, and substitute them into a closed-form expression. We have also obtained the general expression of lowpass paraunitary filters. Moreover, the proposed technique helps us to design filters with desired properties in a very simple and direct way, even more than the existing techniques, as has been illustrated with 4-tap filters. For paraunitary filters of arbitrary length, we have also obtained a simple explicit expression of its power spectral response. This yields a new powerful tool for designing paraunitary filters which satisfy extra conditions, as it is usually requested in specific applications.

ACKNOWLEDGMENTS

This work has been supported by UPM through the AYUDA PUENTE reference AY05/11263, and by CICYT through the Research Project DIPSTICK reference TEC2004-02551/TCM.

REFERENCES

[1] I. Daubechies, Ten Lectures on Wavelets, SIAM, Philadelphia, Pa, USA, 1992.
[2] P. P. Vaidyanathan, Multirate Systems and Filter Banks, Prentice-Hall, Englewood Cliffs, NJ, USA, 1993.
[3] M. Vetterli and J. Kovacevic, Wavelets and Subband Coding, Prentice-Hall, Englewood Cliffs, NJ, USA, 1995.
[4] G. Strang and T. Nguyen, Wavelets and Filter Banks, Wellesley-Cambridge Press, Wellesley, Mass, USA, 1996.
[5] I. Daubechies and W. Sweldens, “Factoring wavelet transforms into lifting steps,” Journal of Fourier Analysis and Applications, vol. 4, no. 3, pp. 247–269, 1998.
[6] P. P. Vaidyanathan, T. Q. Nguyen, Z. Doganata, and T. Saramaki, “Improved technique for design of perfect reconstruction FIR QMF banks with lossless polyphase matrices,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, no. 7, pp. 1042–1056, 1989.
[7] S.-M. Phoong, C. W. Kim, P. P. Vaidyanathan, and P. Ansari, “A new class of two-channel biorthogonal filter banks and wavelet bases,” IEEE Transactions on Signal Processing, vol. 43, no. 3, pp. 649–665, 1995.
[8] J. Kovacevic and W. Sweldens, “Wavelet families of increasing order in arbitrary dimensions,” IEEE Transactions on Image Processing, vol. 9, no. 3, pp. 480–496, 2000.
[9] M. E. Domínguez Jiménez, “New technique for design of 2-channel FIR paraunitary filter banks,” in Proceedings of the 7th IASTED International Conference on Signal and Image Processing (SIP ’05), pp. 220–225, Honolulu, Hawaii, USA, August 2005.

M. Elena Domı́nguez Jiménez was born in

Madrid, Spain, in 1969. She received the de-
gree in mathematical sciences from the Uni-
versidad Complutense de Madrid in 1992
and the Ph.D. degree from the Universidad
Politécnica de Madrid in 2001. She works as
an Assistant Professor at the ETSII Department
of Applied Mathematics of the Universidad
Politécnica de Madrid. Since 2005,
she also belongs to the Research Group
TACA of the same University. Her research interests include wavelet
theory, filter design, multiresolution signal processing, and audio
compression. She obtained an Extraordinary Award from the Uni-
versidad Politécnica de Madrid for the best doctoral dissertation
during 2001.
doi:10.1155/2007/25672
Research Article
A Generalized Algorithm for Blind Channel Identification with
Linear Redundant Precoders
Borching Su and P. P. Vaidyanathan
Department of Electrical Engineering, California Institute of Technology, Pasadena, CA 91125, USA
Received 25 December 2005; Revised 19 April 2006; Accepted 11 June 2006
Recommended by See-May Phoong
It is well known that redundant filter bank precoders can be used for blind identification as well as equalization of FIR channels.
Several algorithms have been proposed in the literature exploiting trailing zeros in the transmitter. In this paper we propose a
generalized algorithm of which the previous algorithms are special cases. By carefully choosing system parameters, we can jointly
optimize the system performance and computational complexity. Both time domain and frequency domain approaches of chan-
nel identification algorithms are proposed. Simulation results show that the proposed algorithm outperforms the previous ones
when the parameters are optimally chosen, especially in time-varying channel environments. A new concept of generalized signal
richness for vector signals is introduced of which several properties are studied.
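Copyright © 2007 B. Su and P. P. Vaidyanathan. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.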
1. INTRODUCTION

Wireless communication systems often suffer from a problem due to multipath fading which makes the channels frequency-selective. Channel coefficients are often unknown to the receiver so that channel identification needs to be done before equalization can be performed. Among techniques for identifying unknown channel coefficients, blind methods have long been of great interest. In the literature many blind methods have been proposed based on the knowledge of second-order statistics (SOS) or higher-order statistics of the transmitted symbols [1, 2]. These methods often need to accumulate a large number of received symbols until channel coefficients can be estimated accurately. This requirement leads to a disadvantage when the system is working over a fast-varying channel.

A deterministic blind method using redundant filterbank precoders was proposed by Scaglione et al. [3] by exploiting trailing zeros introduced at the transmitter. Figure 1 shows a typical linear redundant precoded system. Source symbols are divided into blocks with size $M$ and linearly precoded into $P$-symbol blocks which are then transmitted on the channel. It is well known that when $P \geq M + L$, where $L$ is the maximum order of the FIR channel, interblock interference (IBI) can be completely eliminated in absence of noise. When the block size $M$ increases, the bandwidth efficiency $\eta = (M + L)/M$ approaches unity asymptotically. The deterministic method proposed in [3] (which we will call the SGB method) exploits trailing zeros with length $L$ introduced in each transmitted block and assumes the input sequence is rich. That is, the matrix composed of finite source blocks achieves full rank.

The method in [3] requires the receiver to accumulate at least $M$ blocks before channel coefficients can be identified. This prevents the system from identifying channel coefficients accurately when the channel is fast-varying, especially when the block size $M$ is large. More recently, Manton and Neumann pointed out that the channel could be identifiable with only two received blocks [4]. An algorithm based on viewing the channel identification problem as finding the greatest common divisor (GCD) of two polynomials is proposed in [5] (which we will call the MNP method). Even though it greatly reduces the number of received blocks needed for channel identification, the algorithm has much more computational complexity, especially when the block size $M$ is large.

In this paper, we propose a generalized algorithm of which the SGB algorithm proposed in [3] and the MNP algorithm in [5] are both special cases. By carefully choosing parameters, the system performance and computational
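[Block diagram: the source streams $s_1(n), \ldots, s_M(n)$ (vector $\mathbf{s}(n)$) enter the precoder $\mathbf{R}(z)$; its outputs $u_1(n), \ldots, u_P(n)$ are interleaved through expanders ↑P and delays into $u(n)$, sent over the channel $H(z)$ with additive noise $e(n)$, blocked through decimators ↓P into $y_1(n), \ldots, y_P(n)$ (vector $\mathbf{y}(n)$), and passed to the equalizer $\mathbf{G}(z)$, which outputs the recovered streams $\hat{s}_1(n), \ldots, \hat{s}_M(n)$.]
Figure 1: Communication system with redundant filter bank precoders.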
complexity can be jointly optimized. The rest of the paper is organized as follows. Section 2 describes the system structure with linear precoder filter banks and reviews several existing blind algorithms. In Section 3 we present the generalized algorithm and derive the conditions on the input sequence under which the algorithm operates properly. In Section 4 we propose a frequency domain version of the generalized algorithm. The concept of generalized signal richness is introduced in Section 5 and some properties thereof are studied in detail. Simulation results and complexity analysis of both time and frequency domain approaches are presented in Section 6. In particular, simulations under time-varying channel environments are presented to demonstrate the strength of the proposed algorithm against channel variation. Finally, conclusions are made in Section 7. Some of the results in the paper have been presented at a conference [6].

1.1. Notations

Boldfaced lower-case letters represent column vectors. Boldfaced upper-case letters and calligraphic upper case letters are reserved for matrices. Superscripts as in $\mathbf{A}^T$ and $\mathbf{A}^\dagger$ denote the transpose and transpose-conjugate operations, respectively, of a matrix or a vector. All the vectors and matrices in this paper are complex-valued. In the figures “↑P” represents an expander and “↓P” a decimator [7].

If $\mathbf{v} = [v_1\ v_2\ \cdots\ v_M]^T$ is an $M \times 1$ column vector, then $\mathcal{T}(\mathbf{v}, q)$ denotes an $(M + q - 1) \times q$ Toeplitz matrix whose first row and first column are $[v_1\ 0\ \cdots\ 0]$ and $[v_1\ v_2\ \cdots\ v_M\ 0\ \cdots\ 0]^T$, respectively. For example,

$$\mathcal{T}\left(\begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{bmatrix}, 3\right) = \begin{bmatrix} a_1 & 0 & 0 \\ a_2 & a_1 & 0 \\ a_3 & a_2 & a_1 \\ a_4 & a_3 & a_2 \\ 0 & a_4 & a_3 \\ 0 & 0 & a_4 \end{bmatrix}. \tag{1}$$

2. PROBLEM FORMULATION AND LITERATURE REVIEW

2.1. Redundant filter bank precoders

Consider the multirate communication system [8] depicted in Figure 1. The source symbols $s_1(n), s_2(n), \ldots, s_M(n)$ may come from $M$ different users or from a serial-to-parallel operation on data of a single user. For convenience we consider the blocked version $\mathbf{s}(n)$ as indicated. The vector $\mathbf{s}(n)$ is precoded by a $P \times M$ matrix $\mathbf{R}(z)$ where $P > M$. The information with redundancy is then sent over the channel $H(z)$. We assume $H(z)$ is an FIR channel with a maximum order $L$, that is,

$$H(z) = \sum_{k=0}^{L} h_k z^{-k}. \tag{2}$$

The signal is corrupted by channel noise $e(n)$. The received symbols $y(n)$ are divided into $P \times 1$ block vectors $\mathbf{y}(n)$. The $M \times P$ matrix $\mathbf{G}(z)$ is the channel equalizer and $\hat{s}_1(n), \hat{s}_2(n), \ldots, \hat{s}_M(n)$ are the recovered symbol streams. Also, for simplicity we define $\mathbf{h}$ as the column vector $[h_0\ h_1\ \cdots\ h_L]^T$. We set

$$P = M + L, \tag{3}$$

that is, the redundancy introduced in a block is equal to the maximum channel order.

2.2. Trailing zeros as transmitter guard interval

Suppose we choose the precoder $\mathbf{R}(z) = \begin{bmatrix} \mathbf{R}_1 \\ \mathbf{0} \end{bmatrix}$, where $\mathbf{R}_1$ is an $M \times M$ constant invertible matrix and the $L \times M$ zero matrix $\mathbf{0}$ represents zero-padding with length $L$ in each transmitted block, as indicated in Figure 2. For simplicity of describing the algorithms, in this section we assume the noise is absent.
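The Toeplitz notation of (1) and the zero-padded block model can be made concrete with a short sketch. It is only an illustration under stated assumptions: real integer data, no noise, and helper names (`toeplitz_T`, `matvec`, `convolve`) and numeric values that are hypothetical and not taken from the paper. It builds T(v, q) as defined in (1) and checks that multiplying a length-M block by T(h, M) gives the same P = M + L samples as a linear convolution with the channel, which is why L trailing zeros keep consecutive blocks from overlapping.

```python
def toeplitz_T(v, q):
    """Return T(v, q): the (len(v) + q - 1) x q full-banded Toeplitz matrix of (1)."""
    m = len(v)
    return [[v[i - j] if 0 <= i - j < m else 0 for j in range(q)]
            for i in range(m + q - 1)]


def matvec(A, x):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def convolve(a, b):
    """Linear convolution of two coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


# Illustrative values (assumed, not from the paper): order L = 2, block size M = 3.
h = [1, 2, -1]                 # h_0, h_1, h_2
u = [3, 0, 4]                  # one precoded block, before the L trailing zeros
H_M = toeplitz_T(h, len(u))    # P x M matrix with P = M + L = 5
y = matvec(H_M, u)             # received block in the absence of noise
```

The received block has exactly P entries, so the tail of one block's channel response falls into its own zero-padded guard interval instead of the next block.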
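[Block diagram: the source streams $s_1(n), \ldots, s_M(n)$ (vector $\mathbf{s}(n)$) enter the $M \times M$ precoder $\mathbf{R}_1$, producing $u_1(n), \ldots, u_M(n)$; a block of $L$ zeros is appended, the $P$ branches are interleaved through expanders ↑P and delays into $u(n)$, sent over the channel $H(z)$ with additive noise $e(n)$, and blocked through decimators ↓P into $y_1(n), \ldots, y_P(n)$ (vector $\mathbf{y}(n)$).]
Figure 2: The zero-padding system with precoder $\mathbf{R}_1$.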
Now, the received blocks can be written as

$$\underbrace{\bigl[\mathbf{y}(1)\ \ \mathbf{y}(2)\ \cdots\ \mathbf{y}(J)\bigr]}_{\mathbf{Y}\ \text{matrix; size}\ P \times J} = \mathbf{H}_M \mathbf{R}_1 \underbrace{\bigl[\mathbf{s}(1)\ \ \mathbf{s}(2)\ \cdots\ \mathbf{s}(J)\bigr]}_{\mathbf{S}\ \text{matrix; size}\ M \times J}, \tag{4}$$

where $\mathbf{H}_M = \mathcal{T}(\mathbf{h}, M)$ is the full-banded Toeplitz channel matrix. As long as vector $\mathbf{h}$ is nonzero, the matrix $\mathbf{H}_M$ has full column rank $M$. Now, we assume the signal $\mathbf{s}(n)$ is rich, that is, there exists an integer $J$ such that the matrix $\mathbf{S}$ has full row rank $M$. Since $\mathbf{R}_1$ is an $M \times M$ invertible matrix, we conclude that the $P \times J$ matrix $\mathbf{Y}$ has rank $M$. So there exist $L$ linearly independent vectors that are left annihilators of $\mathbf{Y}$. In other words, there exists a $P \times L$ matrix $\mathbf{U}_0$ such that $\mathbf{U}_0^\dagger \mathbf{Y} = \mathbf{U}_0^\dagger \mathbf{H}_M \mathbf{R}_1 \mathbf{S} = \mathbf{0}$. Now that $\mathbf{R}_1 \mathbf{S}$ has rank $M$, this implies

$$\mathbf{U}_0^\dagger \mathbf{H}_M = \mathbf{0}. \tag{5}$$

The channel coefficients $\mathbf{h}$ can then be determined by solving (5). In practice, where channel noise is present, the computation of the annihilators is replaced with the computation of the left singular vectors corresponding to the $L$ smallest singular values of $\mathbf{Y}$. In this and the following sections, the channel noise term is not shown explicitly.

Note that this algorithm [3] works under the assumption that $\mathbf{S}$ has full row rank $M$. Obviously $J \geq M$ is a necessary condition for this assumption. This means the receiver must accumulate at least $M$ blocks (i.e., a duration of $M(M + L)$ symbols) before channel identification can be performed. This could be a disadvantage when the system is working over a fast-varying channel.

2.3. The GCD approach

Another approach proposed in [5] requires only two received blocks for blind channel identification. Recall that the channel is described by $\mathbf{y} = \mathbf{H}_M \mathbf{u} = \mathcal{T}(\mathbf{h}, M)\mathbf{u}$, or

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_P \end{bmatrix} = \begin{bmatrix} h_0 & & 0 \\ h_1 & \ddots & \\ \vdots & & h_0 \\ h_L & & h_1 \\ & \ddots & \vdots \\ 0 & & h_L \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_M \end{bmatrix}. \tag{6}$$

Multiplying both sides of (6) on the left by $[1\ x\ x^2\ \cdots\ x^{P-1}]$, we obtain

$$y(x) = h(x)u(x), \tag{7}$$

where

$$y(x) \triangleq \sum_{k=0}^{P-1} y_{k+1} x^k, \qquad h(x) \triangleq \sum_{k=0}^{L} h_k x^k, \qquad u(x) \triangleq \sum_{k=0}^{M-1} u_{k+1} x^k \tag{8}$$

are polynomial representations of the output vector, channel vector, and input vector, respectively. This means that (6) is nothing but a polynomial multiplication. Now, suppose we have two received blocks $\mathbf{y}(1)$ and $\mathbf{y}(2)$, and let $y_1(x) = h(x)u_1(x)$ and $y_2(x) = h(x)u_2(x)$ represent the polynomial forms of these. Then the channel polynomial $h(x)$ can be found as the GCD of $y_1(x)$ and $y_2(x)$, given that the input polynomials $u_1(x)$ and $u_2(x)$ are coprime to each other.
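The GCD idea can be checked on a toy case. The sketch below is a hedged illustration, not the procedure of [5]: it swaps in the textbook Euclidean algorithm with exact rational arithmetic, which only makes sense for noise-free data, whereas a practical receiver needs a noise-tolerant matrix formulation. The function names and the numeric values are assumptions made for the example; coefficient lists are in ascending powers of x, matching (8).

```python
from fractions import Fraction


def poly_mul(a, b):
    """Product of two polynomials given as ascending coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def trim(p):
    """Drop zero coefficients of the highest powers."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_mod(a, b):
    """Remainder of a divided by b (b must be nonzero)."""
    a, b = trim(a), trim(b)
    while len(a) >= len(b):
        factor = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, bi in enumerate(b):
            a[shift + i] -= factor * bi
        a = trim(a)          # the leading term has been cancelled exactly
    return a


def poly_gcd(a, b):
    """Monic GCD via the Euclidean algorithm over the rationals."""
    a = trim([Fraction(c) for c in a])
    b = trim([Fraction(c) for c in b])
    while b:
        a, b = b, poly_mod(a, b)
    return [c / a[-1] for c in a]


# Illustrative values (assumed): h(x) = 1 + 2x - x^2, two coprime input blocks.
h = [1, 2, -1]
u1 = [3, 0, 4]
u2 = [1, 1, 2]
y1 = poly_mul(h, u1)         # noise-free received block 1 as a polynomial
y2 = poly_mul(h, u2)         # noise-free received block 2 as a polynomial
g = poly_gcd(y1, y2)         # equals h(x) up to a scalar factor
h_est = [c / g[0] * h[0] for c in g]   # remove the scalar using h_0 for display
```

The last line uses the true h_0 only to undo the scalar ambiguity for display; a blind receiver recovers h(x) only up to that scalar.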
To compute the GCD of $y_1(x)$ and $y_2(x)$, we first construct a $(2P - 1) \times 2P$ matrix [9]

$$\mathbf{Y}_P \triangleq \begin{bmatrix} y_{11} & 0 & \cdots & 0 & y_{21} & 0 & \cdots & 0 \\ y_{12} & y_{11} & \ddots & \vdots & y_{22} & y_{21} & \ddots & \vdots \\ \vdots & y_{12} & \ddots & 0 & \vdots & y_{22} & \ddots & 0 \\ y_{1P} & \vdots & \ddots & y_{11} & y_{2P} & \vdots & \ddots & y_{21} \\ 0 & y_{1P} & & y_{12} & 0 & y_{2P} & & y_{22} \\ \vdots & \ddots & \ddots & \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & y_{1P} & 0 & \cdots & 0 & y_{2P} \end{bmatrix}. \tag{9}$$

One can verify that

$$\mathbf{Y}_P = \underbrace{\begin{bmatrix} h_0 & & 0 \\ h_1 & \ddots & \\ \vdots & & h_0 \\ h_L & & h_1 \\ & \ddots & \vdots \\ 0 & & h_L \end{bmatrix}}_{\text{matrix}\ \mathbf{H}_{M+P-1};\ \text{size}\ (2P-1) \times (M+P-1)} \underbrace{\begin{bmatrix} u_{11} & & 0 & u_{21} & & 0 \\ u_{12} & \ddots & & u_{22} & \ddots & \\ \vdots & & u_{11} & \vdots & & u_{21} \\ u_{1M} & & u_{12} & u_{2M} & & u_{22} \\ & \ddots & \vdots & & \ddots & \vdots \\ 0 & & u_{1M} & 0 & & u_{2M} \end{bmatrix}}_{\text{matrix}\ \mathbf{U};\ \text{size}\ (M+P-1) \times 2P}. \tag{10}$$

When $u_1(x)$ and $u_2(x)$ are coprime to each other, it can be shown that the matrix $\mathbf{U}$ has full rank $M + P - 1$ (see Section 5). Since $\mathbf{H}_{M+P-1} = \mathcal{T}(\mathbf{h}, M + P - 1)$ also has rank $M + P - 1$, $\operatorname{rank}(\mathbf{Y}_P) = M + P - 1$ and hence $\mathbf{Y}_P$ has $L$ left annihilators (i.e., there exists a $(2P - 1) \times L$ matrix $\mathbf{U}_0$ such that $\mathbf{U}_0^\dagger \mathbf{Y}_P = \mathbf{0}$). These annihilators are also annihilators of each column of matrix $\mathbf{H}_{M+P-1}$, and we can therefore, in absence of noise, identify channel coefficients $h_0, h_1, \ldots, h_L$ up to a scalar ambiguity. In presence of noise, the columns of $\mathbf{U}_0$ would be selected as the left singular vectors associated with the $L$ smallest singular values of $\mathbf{Y}_P$.

2.4. Connection to the earlier literature

The MNP method described above can be viewed as a dual version of the subspace methods proposed in the earlier literature in multichannel blind identification [10, 11]. In the subspace method in [11], the single source can be estimated as the GCD of the received data from two (more generally $N$) different antennas. The MNP method [5] swaps the roles of data blocks and multichannel coefficients.

3. A GENERALIZED ALGORITHM

In this section we propose a generalized algorithm of which each of the two algorithms described in the previous section is a special case. Comparing the two algorithms described above, we find that the MNP approach needs much fewer received blocks for blind identifiability. However, it has more computational complexity. Each received block is repeated $P$ times to build a big matrix. Using the generalized algorithm, we can choose the number of repetitions and the number of received blocks freely as long as they satisfy a certain constraint.

3.1. Algorithm description

Observe (6) again and note that it can be rewritten as

$$\mathcal{T}(\mathbf{y}, Q) = \mathcal{T}(\mathbf{h}, M + Q - 1)\,\mathcal{T}(\mathbf{u}, Q), \tag{11}$$

where $\mathcal{T}(\cdot, \cdot)$ is defined as in (1). Here $Q$ can be any positive integer. Note that in the MNP method $Q$ is chosen as $P$, as described in the previous section. Suppose the receiver gathers $J$ blocks with $J \geq 2$. Then we have $\mathbf{Y}_Q^{(J)} = \mathbf{H}_{M+Q-1}\mathbf{U}_Q^{(J)}$, where

$$\mathbf{Y}_Q^{(J)} = \bigl[\mathcal{T}(\mathbf{y}(1), Q)\ \ \mathcal{T}(\mathbf{y}(2), Q)\ \cdots\ \mathcal{T}(\mathbf{y}(J), Q)\bigr], \tag{12}$$

$$\mathbf{H}_{M+Q-1} = \mathcal{T}(\mathbf{h}, M + Q - 1), \qquad \mathbf{U}_Q^{(J)} = \bigl[\mathcal{T}(\mathbf{u}(1), Q)\ \cdots\ \mathcal{T}(\mathbf{u}(J), Q)\bigr]. \tag{13}$$

Note that $\mathbf{U}_Q^{(J)}$ has size $(M + Q - 1) \times QJ$ and $\mathbf{Y}_Q^{(J)}$ has size $(P + Q - 1) \times QJ$. For notational simplicity, from now on we will use subscript $Q$ as in $N_Q$ to denote $N_Q = N + Q - 1$ where $N$ denotes a positive integer. In particular,

$$M_Q = M + Q - 1, \qquad P_Q = P + Q - 1. \tag{14}$$

Notice that they still have the relationship $P_Q = M_Q + L$. Assume now the matrix $\mathbf{U}_Q^{(J)}$ has full row rank $M_Q$. Taking singular-value decomposition (SVD) of $\mathbf{Y}_Q^{(J)}$ we have

$$\mathbf{Y}_Q^{(J)} = \bigl[\mathbf{U}_r\ \ \mathbf{U}_0\bigr] \begin{bmatrix} \boldsymbol{\Sigma} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix} \bigl[\mathbf{V}_r\ \ \mathbf{V}_0\bigr]^\dagger. \tag{15}$$

The size of $\boldsymbol{\Sigma}$ is $M_Q \times M_Q$ since both $\mathbf{H}_{M_Q}$ and $\mathbf{U}_Q^{(J)}$ have full rank $M_Q$. The columns of the $P_Q \times L$ matrix $\mathbf{U}_0$ are left annihilators of matrix $\mathbf{Y}_Q^{(J)}$ and also of $\mathbf{H}_{M_Q}$ since $\mathbf{U}_Q^{(J)}$ has full row rank. Suppose

$$\mathbf{U}_0^\dagger = \begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1,P+Q-1} \\ u_{21} & u_{22} & \cdots & u_{2,P+Q-1} \\ \vdots & & & \vdots \\ u_{L1} & u_{L2} & \cdots & u_{L,P+Q-1} \end{bmatrix}. \tag{16}$$

Form the Hankel matrices

$$\mathbf{U}_k \triangleq \begin{bmatrix} u_{k1} & u_{k2} & \cdots & u_{k,L+1} \\ u_{k2} & u_{k3} & \cdots & u_{k,L+2} \\ \vdots & & & \vdots \\ u_{k,M_Q} & u_{k,M_Q+1} & \cdots & u_{k,P_Q} \end{bmatrix} \tag{17}$$

for $k$, $1 \leq k \leq L$. Then we have

$$\underbrace{\begin{bmatrix} \mathbf{U}_1 \\ \mathbf{U}_2 \\ \vdots \\ \mathbf{U}_L \end{bmatrix}}_{\mathbf{U}\ \text{matrix; size}\ LM_Q \times (L+1)} \mathbf{h} = \mathbf{0}. \tag{18}$$

Vector $\mathbf{h}$ can thus be identified up to a scalar ambiguity.
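The steps of Section 3.1 can be traced end to end on a small noise-free example. The following is a minimal sketch under explicit assumptions: real integer data (so conjugation plays no role), exact rational arithmetic in place of the SVD, and hypothetical helper names and numeric values that do not come from the paper. It stacks the Toeplitz matrices of (12), finds the left annihilators, forms the Hankel rows of (17), and solves (18).

```python
from fractions import Fraction


def toeplitz_T(v, q):
    """T(v, q) as in (1): (len(v) + q - 1) x q full-banded Toeplitz matrix."""
    m = len(v)
    return [[v[i - j] if 0 <= i - j < m else 0 for j in range(q)]
            for i in range(m + q - 1)]


def null_space(A):
    """Basis of {x : A x = 0} by exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in r] for r in A]
    n, pivots, r = len(rows[0]), [], 0
    for c in range(n):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pv = rows[r][c]
        rows[r] = [x / pv for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    basis = []
    for fc in (c for c in range(n) if c not in pivots):
        x = [Fraction(0)] * n
        x[fc] = Fraction(1)
        for i, pc in enumerate(pivots):
            x[pc] = -rows[i][fc]
        basis.append(x)
    return basis


def identify_channel(blocks, L, Q):
    """Noise-free sketch of (12)-(18); returns h scaled so its first nonzero tap is 1."""
    Ts = [toeplitz_T(y, Q) for y in blocks]
    Y = [sum((T[i] for T in Ts), []) for i in range(len(Ts[0]))]   # P_Q x QJ, as in (12)
    annihilators = null_space([list(col) for col in zip(*Y)])      # w with w^T Y = 0
    M_Q = len(Y) - L
    hankel = [w[j:j + L + 1] for w in annihilators for j in range(M_Q)]  # rows of (17)
    sol = null_space(hankel)                                       # solves (18)
    if len(sol) != 1:
        raise ValueError("channel not identifiable from these blocks")
    lead = next(c for c in sol[0] if c != 0)
    return [c / lead for c in sol[0]]


# Illustrative data (assumed): M = 3, L = 2, J = 2 noise-free received blocks
# generated from h = [1, 2, -1] and inputs [3, 0, 4] and [1, 1, 2].
M, L, J = 3, 2, 2
Q = -(-(M - 1) // (J - 1))          # smallest integer Q with Q*J >= M + Q - 1
blocks = [[3, 6, 1, 8, -4], [1, 3, 3, 3, -2]]
h_est = identify_channel(blocks, L, Q)
```

With noisy data the exact null spaces no longer exist, which is where the singular vectors of (15) take over; the sketch only mirrors the algebra.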
[Block diagram: the $N \times 1$ vector $\mathbf{v}(n)$ feeds $Q$ parallel branches that apply the $N_Q \times N$ matrices $\begin{bmatrix} \mathbf{I}_N \\ \mathbf{0} \end{bmatrix}$, $\begin{bmatrix} \mathbf{0} \\ \mathbf{I}_N \\ \mathbf{0} \end{bmatrix}$, $\ldots$, $\begin{bmatrix} \mathbf{0} \\ \mathbf{I}_N \end{bmatrix}$ (successively shifted copies of the identity); the branch outputs are combined through expanders ↑Q and delays into $\mathbf{v}_Q(n)$.]
Figure 3: Q-repetition and shifting operation.

3.2. Q-repetition and shifting operation

As we can see in the previous section, the repetition and shifting operation on a vector signal is crucial in the generalized algorithm. Figure 3 gives a block diagram of this operation. For future notational convenience, the subscript $Q$ as in $\mathbf{v}_Q(n)$ denotes the result of this operation on a vector signal. By viewing (11) and applying this operation on $\mathbf{y}(n)$ and $\mathbf{u}(n)$, we obtain the relationship

$$\mathbf{y}_Q(n) = \mathbf{H}_{M+Q-1}\,\mathbf{u}_Q(n)$$

for any positive integer $Q$.

3.3. Special cases of the algorithm

The blind channel identification algorithm described above uses two parameters: (a) the number of received blocks $J$; (b) the number of repetitions per block $Q$. A number of points should be noted here:

(1) the algorithm works for any $J$ and $Q$ as long as $\mathbf{U}_Q^{(J)}$ has full row rank $M_Q$. This is the only constraint for choosing parameters $J$ and $Q$;
(2) note that if we choose $Q = 1$ and $J \geq M$, then the algorithm reduces to the SGB algorithm [3];
(3) if we choose $Q = P$ and $J = 2$, it becomes the MNP algorithm [5].

So both the SGB method and the MNP method are special cases of the proposed algorithm. Since $\mathbf{U}_Q^{(J)}$ has size $M_Q \times QJ$, $\mathbf{U}_Q^{(J)}$ having full row rank implies $QJ \geq M_Q = M + Q - 1$, or

$$Q \geq \frac{M - 1}{J - 1}. \tag{19}$$

Also note that we cannot choose $J = 1$ since $\mathbf{U}_Q^{(J)}$ can never have full rank unless the block size $M = 1$. This is consistent with the theory that two blocks are required for blind channel identification [4]. While the inequality (19) is a necessary condition for $\mathbf{U}_Q^{(J)}$ to have full rank, it is not sufficient because it also depends on the values of entries of $\mathbf{u}(n)$. Nevertheless, when inequality (19) is satisfied, the probability of $\mathbf{U}_Q^{(J)}$ having full rank is usually close to unity in practice, especially when a large symbol constellation is used. Thus,

$$Q = \left\lceil \frac{M - 1}{J - 1} \right\rceil \tag{20}$$

appears to be a selection that minimizes the computational cost given the number of received blocks $J$. A detailed study on the conditions for $\mathbf{U}_Q^{(J)}$ to have full rank is presented in Section 5.

When $J = 2$, $Q$ can be chosen as small as $M - 1$ rather than $P$. If we take $J = 3$, $Q = \lceil (M - 1)/2 \rceil$ makes the matrix $\mathbf{Y}$ twice smaller. We can choose $Q = 1$ only when $J \geq M$. This coincides with the SGB algorithm which uses a richness assumption [3].

4. FREQUENCY DOMAIN APPROACH

In this section we slightly modify the blind identification algorithm and directly estimate the frequency responses of the channel at different frequency bins and equalize the channel in the frequency domain. We call the modified algorithm frequency domain approach. Some of the ideas come from [12]. The receiver structure for the frequency domain approach is shown in Figure 4. To demonstrate how this system works, observe the $P_Q \times M_Q$ full-banded Toeplitz channel matrix

$$\mathbf{H}_{M_Q} = \mathcal{T}(\mathbf{h}, M_Q). \tag{21}$$

Define a row vector $\mathbf{v}_\rho^T = [1\ \ \rho^{-1}\ \cdots\ \rho^{-(P_Q-1)}]$ with $\rho$ a nonzero complex number. Due to full-banded Toeplitz structure of $\mathbf{H}_{M_Q}$, we have

$$\mathbf{v}_\rho^T \mathbf{H}_{M_Q} = \bigl[H(\rho)\ \ \rho^{-1}H(\rho)\ \cdots\ \rho^{-(M_Q-1)}H(\rho)\bigr], \tag{22}$$

where $H(\rho) = \sum_{k=0}^{L} h_k \rho^{-k}$ is the channel $z$-transform evaluated at $z = \rho$.

Let $N$ be chosen as an integer greater than or equal to $P_Q$, and let $\rho_1, \rho_2, \ldots, \rho_N$ be distinct nonzero complex numbers. Consider an $N \times P_Q$ matrix $\mathbf{V}_{N \times P_Q}$ whose $i$th row is $\mathbf{v}_{\rho_i}^T$:

$$\mathbf{V}_{N \times P_Q} = \begin{bmatrix} 1 & \rho_1^{-1} & \rho_1^{-2} & \cdots & \rho_1^{-(P_Q-1)} \\ 1 & \rho_2^{-1} & \rho_2^{-2} & \cdots & \rho_2^{-(P_Q-1)} \\ \vdots & & & & \vdots \\ 1 & \rho_N^{-1} & \rho_N^{-2} & \cdots & \rho_N^{-(P_Q-1)} \end{bmatrix}. \tag{23}$$

It is easy to verify that

$$\mathbf{V}_{N \times P_Q}\mathbf{H}_{M_Q} = \boldsymbol{\Lambda}_N \underbrace{\begin{bmatrix} 1 & \rho_1^{-1} & \cdots & \rho_1^{-(M_Q-1)} \\ 1 & \rho_2^{-1} & \cdots & \rho_2^{-(M_Q-1)} \\ \vdots & & & \vdots \\ 1 & \rho_N^{-1} & \cdots & \rho_N^{-(M_Q-1)} \end{bmatrix}}_{\mathbf{V}_{N \times M_Q}\ \text{matrix}}, \tag{24}$$
where

$$\boldsymbol{\Lambda}_N = \operatorname{diag}\bigl[H(\rho_1)\ \ H(\rho_2)\ \cdots\ H(\rho_N)\bigr] \triangleq \operatorname{diag}\bigl(\hat{\mathbf{h}}_N\bigr) \tag{25}$$

is a diagonal matrix with frequency domain channel coefficients as the diagonal entries. Now, when we gather receiving blocks and repeat them as in (12), we get the following matrix:

$$\mathbf{Y}_Q^{(J)} = \bigl[\mathcal{T}(\mathbf{y}(1), Q)\ \ \mathcal{T}(\mathbf{y}(2), Q)\ \cdots\ \mathcal{T}(\mathbf{y}(J), Q)\bigr]. \tag{26}$$

Since we have $\mathbf{Y}_Q^{(J)} = \mathbf{H}_{M_Q}\mathbf{U}_Q^{(J)}$ in absence of noise, by multiplying $\mathbf{V}_{N \times P_Q}$ and $\mathbf{Y}_Q^{(J)}$, we have

$$\mathbf{Z} = \mathbf{V}_{N \times P_Q}\mathbf{Y}_Q^{(J)} = \mathbf{V}_{N \times P_Q}\mathbf{H}_{M_Q}\mathbf{U}_Q^{(J)} = \boldsymbol{\Lambda}_N \mathbf{V}_{N \times M_Q}\mathbf{U}_Q^{(J)}. \tag{27}$$

Recall that $\operatorname{rank}(\mathbf{Y}_Q^{(J)}) = \operatorname{rank}(\mathbf{U}_Q^{(J)}) = M_Q$. Since $\rho_1, \rho_2, \ldots, \rho_N$ are all distinct, the matrix $\mathbf{Z}$ has the same rank as $\mathbf{Y}_Q^{(J)}$. The dimension of the null space of matrix $\mathbf{Z}$ is hence $N - M_Q$. By performing SVD on $\mathbf{Z}$, we can find these $N - M_Q$ left annihilators of $\mathbf{Z}$, which are also annihilators of $\boldsymbol{\Lambda}_N \mathbf{V}_{N \times M_Q}$. There exists an $(N - M_Q) \times N$ matrix $\mathbf{U}_0^\dagger$ such that $\mathbf{U}_0^\dagger \mathbf{Z} = \mathbf{0}$. Since $\mathbf{U}_Q^{(J)}$ has full rank, this implies

$$\mathbf{U}_0^\dagger \boldsymbol{\Lambda}_N \mathbf{V}_{N \times M_Q} = \mathbf{0}. \tag{28}$$

Suppose

$$\mathbf{U}_0^\dagger = \begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1N} \\ u_{21} & u_{22} & \cdots & u_{2N} \\ \vdots & \vdots & & \vdots \\ u_{N-M_Q,1} & u_{N-M_Q,2} & \cdots & u_{N-M_Q,N} \end{bmatrix}. \tag{29}$$

Then by observing the $ij$th entry of (28), we have

$$\mathbf{u}_{ij}^\dagger\,\hat{\mathbf{h}}_N^T = 0 \tag{30}$$

for all $i, j$, $1 \leq i \leq N - M_Q$ and $1 \leq j \leq M_Q$, where $\mathbf{u}_{ij} = \bigl[u_{i1}\rho_1^{-(j-1)}\ \ u_{i2}\rho_2^{-(j-1)}\ \cdots\ u_{iN}\rho_N^{-(j-1)}\bigr]^\dagger$. Here $\hat{\mathbf{h}}_N$ is the row vector in (25). Form the $M_Q \times N$ matrices

$$\mathbf{U}_i = \begin{bmatrix} u_{i1} & u_{i2} & \cdots & u_{iN} \\ u_{i1}\rho_1^{-1} & u_{i2}\rho_2^{-1} & \cdots & u_{iN}\rho_N^{-1} \\ u_{i1}\rho_1^{-2} & u_{i2}\rho_2^{-2} & \cdots & u_{iN}\rho_N^{-2} \\ \vdots & & & \vdots \\ u_{i1}\rho_1^{-(M_Q-1)} & u_{i2}\rho_2^{-(M_Q-1)} & \cdots & u_{iN}\rho_N^{-(M_Q-1)} \end{bmatrix}, \tag{31}$$

and let $\mathbf{U} = [\mathbf{U}_1^T\ \ \mathbf{U}_2^T\ \cdots\ \mathbf{U}_{N-M_Q}^T]^T$. Then from (30) we have $\mathbf{U}\hat{\mathbf{h}}_N^T = \mathbf{0}$. Then the frequency domain channel coefficients $\hat{\mathbf{h}}_N$ can be estimated by solving this equation. After the frequency domain channel coefficients are estimated, the received symbols can be equalized directly in the frequency domain, as in DMT systems.

Recall that we have the freedom to choose $N$ as any integer greater than or equal to $P_Q$ and the values of $\rho_i$, $1 \leq i \leq N$, as any nonzero complex number in the $z$-domain. In this paper, we use $N = P_Q$ and

$$\rho_k = \exp\left(\frac{j2k\pi}{N}\right), \qquad k = 0, 1, \ldots, N - 1. \tag{32}$$

Note that since $H(z)$ is an $L$th order system, there are at most $L$ values among $H(\rho_i)$ which can be zero (channel nulls). By choosing $N \geq P_Q$, there are at least $M_Q$ nonzero values among $H(\rho_i)$, $i = 1, 2, \ldots, P_Q$. In practice we can choose to equalize the received symbols in frequency bins associated with the largest $M_Q$ frequency responses $H(\rho_i)$ to enhance the system performance. This provides resistance to channel nulls.

[Block diagram: the received sequence $y(n)$ is blocked through decimators ↓P into the $P \times 1$ vector $\mathbf{y}(n)$, passed through the Q-repetition and shifting operation to give $\mathbf{y}_Q(n)$ with $P_Q$ rows, and multiplied by $\mathbf{V}_{N \times P_Q}$ to give the vector $\mathbf{z}(n)$ with entries $z_1(n), \ldots, z_N(n)$.]
Figure 4: Receiver structure for frequency domain approach.

5. GENERALIZED SIGNAL RICHNESS

For the generalized blind channel identification method proposed in this paper to work properly, the matrix $\mathbf{U}_Q^{(J)}$ defined in (13) must have full row rank for given parameters $J$ and $Q$. An obvious necessary condition has been presented as inequality (19) in Section 3. The sufficiency, however, depends on the content of signal $\mathbf{u}(n)$. When $Q = 1$ and $\mathbf{u}(n)$ is rich, then there exists $J$ such that $\mathbf{U}_Q^{(J)} = [\mathbf{u}(0)\ \ \mathbf{u}(1)\ \cdots\ \mathbf{u}(J - 1)]$ has full rank. When $Q > 1$, $\mathbf{u}(n)$ requires another kind of richness property so that $\mathbf{U}_Q^{(J)}$ has full rank for a finite integer $J$. We call this property the generalized signal richness and define it as follows.

Definition 1. An $M \times 1$ sequence $\mathbf{s}(n)$, $n \geq 0$, is said to be $(1/Q)$-rich if there exists a finite integer $J$ such that the $(M + Q - 1) \times JQ$ matrix

$$\mathbf{U}_Q^{(J)} = \bigl[\mathcal{T}(\mathbf{s}(0), Q)\ \ \mathcal{T}(\mathbf{s}(1), Q)\ \cdots\ \mathcal{T}(\mathbf{s}(J - 1), Q)\bigr] \tag{33}$$

has full row rank $M + Q - 1$.

Several interesting properties of generalized signal richness will be presented in this section. The reason why we use the notation of $(1/Q)$ will soon be clear when these properties are presented.
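Definition 1 can be tested directly on a finite set of blocks by computing a rank. The sketch below is an illustration only, with hypothetical function names and example data: it uses exact rational elimination on small integer blocks and checks richness for the given J blocks rather than for an infinite sequence. It also shows a case in which the block polynomials share a root, so that no value of Q gives full rank.

```python
from fractions import Fraction


def toeplitz_T(v, q):
    """T(v, q) as in (1): (len(v) + q - 1) x q full-banded Toeplitz matrix."""
    m = len(v)
    return [[v[i - j] if 0 <= i - j < m else 0 for j in range(q)]
            for i in range(m + q - 1)]


def rank(A):
    """Rank by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in A]
    rk = 0
    for c in range(len(rows[0])):
        p = next((i for i in range(rk, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[rk], rows[p] = rows[p], rows[rk]
        for i in range(rk + 1, len(rows)):
            f = rows[i][c] / rows[rk][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


def is_rich(blocks, Q):
    """True if [T(s(0),Q) ... T(s(J-1),Q)] has full row rank M + Q - 1."""
    M = len(blocks[0])
    Ts = [toeplitz_T(s, Q) for s in blocks]
    U = [sum((T[i] for T in Ts), []) for i in range(M + Q - 1)]
    return rank(U) == M + Q - 1


def min_Q(blocks, q_max):
    """Smallest Q <= q_max for which the given blocks are (1/Q)-rich, else None."""
    return next((Q for Q in range(1, q_max + 1) if is_rich(blocks, Q)), None)


# Illustrative blocks (assumed): M = 3, J = 2.
coprime_blocks = [[3, 0, 4], [1, 1, 2]]     # 3 + 4x^2 and 1 + x + 2x^2
shared_root_blocks = [[3, 4, 1], [1, 3, 2]] # (1 + x)(3 + x) and (1 + x)(1 + 2x)
q_coprime = min_Q(coprime_blocks, 5)
q_shared = min_Q(shared_root_blocks, 5)
```

For the coprime pair the smallest workable value is Q = M - 1 = 2, in line with (20) for J = 2. For the pair sharing the root x = -1, the vector [1, -1, 1, -1, ...] annihilates every Toeplitz block, so the rank test fails for every Q tried.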
5.1. Measure of generalized signal richness

Lemma 1. If an M × 1 sequence s(n) is (1/Q)-rich, then s(n) is (1/(Q + 1))-rich.

Proof. See the appendix.

Lemma 1 states a basic property of generalized signal richness: the smaller the value of Q is, the “stronger” the condition of (1/Q)-richness is. For example, if an M × 1 sequence s(n) is 1-rich, or simply rich, then it is (1/Q)-rich for any positive integer Q. On the contrary, a (1/2)-rich signal s(n) is not necessarily 1-rich. We can thus define a measure of generalized signal richness for a given M × 1 sequence s(n) as follows.

Definition 2. Given an M × 1 sequence s(n), n ≥ 0, the degree of nonrichness of s(n) is defined as

$$Q_{\min} \triangleq \min_{Q} \{\, Q : s(n) \text{ is } (1/Q)\text{-rich} \,\}. \tag{34}$$

Recall that the larger the degree of nonrichness $Q_{\min}$ is, the weaker the richness of the signal s(n) is. If s(n) is not (1/Q)-rich for any Q, then $Q_{\min} = \infty$. The property of an infinite degree of nonrichness can be described in the following lemma. We use the notation $p_M(x)$ to denote the column vector:

$$p_M(x) = [\,1\ \ x\ \ x^2\ \ \cdots\ \ x^{M-1}\,]^T. \tag{35}$$

Lemma 2. Consider an M × 1 sequence s(n). The following statements are equivalent:

(1) s(n) is not (1/Q)-rich for any Q;
(2) the degree of nonrichness of s(n) is infinity;
(3) either there exists a complex number α such that $[1\ \alpha\ \cdots\ \alpha^{M-1}]$ is an annihilator of s(n) or [0 · · · 0 1] is an annihilator of s(n);
(4) either polynomials $p_n(x) = p_M^T(x)s(n)$, n ≥ 0 share a common zero (at α) or their orders are all less than M − 1.

Proof. See the appendix.

Note that the statement [0 · · · 0 1] is an annihilator of s(n) in condition (3) and the statement that polynomials $p_n(x)$ have orders less than M − 1 in condition (4) can be interpreted as the special situation when the common zero α is at infinity.

If an M × 1 sequence s(n) has a finite degree of non-richness, that is, s(n) is (1/Q)-rich for some integer Q, then it can be shown that the maximum possible value of $Q_{\min}$ is M − 1, as described in the following lemma.

Lemma 3. If M > 1 and an M × 1 sequence s(n) is not (1/(M − 1))-rich, then it is not (1/Q)-rich for any Q.

Proof. See the appendix.

With Lemma 3, we can see that for an M × 1 sequence s(n), the possible values of the degree of non-richness $Q_{\min}$ are 1, 2, . . . , M − 1, and ∞. (1/(M − 1))-richness is thus the weakest form of generalized richness. When using the MNP method [9], this weakest form of generalized richness is crucial. If this weakest form of richness of s(n) is not achieved, then by Lemma 3 and Lemma 2 s(n) has an infinite degree of non-richness and the polynomials $p_M^T(x)s(n)$ have a common factor (x − α). Then as in Section 2.3, when we take the GCD of the polynomials representing the received blocks, the receiver would be unable to determine whether the factor (x − α) belongs to the channel polynomial or is a common factor of the symbol polynomials. Therefore, if the input signal s(n) has infinite degree of non-richness, all methods proposed in this paper will fail for all Q.

Furthermore, the MNP method proposed in [5] uses Q = P. Using Lemma 3, we see that using Q = M − 1 is sufficient if we are computing the GCD of polynomials representing received blocks and the following two conditions are true: (1) the GCD is known to have a degree less than or equal to L; (2) the degree of each symbol polynomial is less than or equal to M − 1. Using Q = P is not only computationally unnecessary but also, as we will see in the simulation results in Section 6, sometimes performs worse than Q = M − 1 in the presence of noise.

The sufficiency of Q = M − 1 can also be understood from the point of view of polynomial theory. Suppose polynomials a(x) and b(x) have degrees less than or equal to P − 1 and have a greatest common divisor d(x) whose degree is less than or equal to L. Suppose $a(x) = d(x)a_1(x)$ and $b(x) = d(x)b_1(x)$, where both $a_1(x)$ and $b_1(x)$ have degrees less than or equal to M − 1 and are coprime to each other. Then there exist polynomials p(x) and q(x) whose degrees are less than or equal to M − 2 such that $1 = p(x)a_1(x) + q(x)b_1(x)$ and thus $d(x) = p(x)a(x) + q(x)b(x)$.

5.2. Connection to earlier literature

An earlier proposition mathematically equivalent to Lemma 3 has been presented in the single-input-multiple-output (SIMO) blind equalization literature [10, 13]. We review it here briefly.

Proposition 1. Let h[n] be J × 1 vectors. Suppose a QJ × (Q + M − 1) block Toeplitz matrix

$$\mathcal{T}_Q(h) = \begin{bmatrix} h[0] & h[1] & \cdots & h[M-1] & 0 & \cdots & 0 \\ 0 & h[0] & h[1] & \cdots & h[M-1] & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & h[0] & h[1] & \cdots & h[M-1] \end{bmatrix} \tag{36}$$

satisfies the following conditions:

(1) h[0] ≠ 0 and h[M − 1] ≠ 0;
(2) h[n] = 0 for n < 0 and n ≥ M;
(3) Q ≥ M − 1.
Then $\mathcal{T}_Q(h)$ has full column rank if and only if

$$h(z) = \sum_{i=0}^{M} h[i] z^{-i} \neq 0, \quad \forall z. \tag{37}$$

Here h[n] was used to refer to the impulse response of a J × 1 channel. Q stands for the observation period in the multiple-channel receiver end. Conditions (1) and (2) imply that the channel has finite impulse response. Condition (3) can be met by increasing the observation period Q. While this old proposition focuses on the coefficients of multiple channels rather than values of transmitted symbols, it is mathematically equivalent to the statement that s(n) is (1/(M − 1))-rich if and only if polynomials $p_M^T(x)s(n)$ do not share common zeros. The case of Q < M − 1, however, has not been considered earlier in the literature, to the best of our knowledge.

Figure 5: Normalized least squared channel error estimation. (M = 8; L = 4. Axes: normalized channel MSE versus SNR (dB). Curves: J = 2, Q = 12 (GCD); J = 2, Q = 1 (SGB); J = 2, Q = 8; J = 10, Q = 12 (GCD); J = 10, Q = 1 (SGB); J = 10, Q = 2.)

5.3. Remarks on generalized signal richness
In this section we introduced the concept of generalized signal richness. Given an M × 1 signal s(n), n ≥ 0, the degree of non-richness $Q_{\min}$ was defined. For an input signal with a degree of non-richness $Q_{\min}$, we can choose any

$$Q \geq Q_{\min} \tag{38}$$

and some finite J for the generalized algorithm proposed in Section 3 to work properly. The possible values of $Q_{\min}$ are 1, 2, . . . , M − 1, and ∞. If s(n) has an infinite degree of non-richness, the algorithm proposed in this paper will fail for all Q. The degree of non-richness of a signal s(n) directly depends on its content. A deeper study of degree of non-richness will be presented elsewhere [14].

6. SIMULATIONS AND DISCUSSIONS

In this section, several simulation results, comparisons, and discussions will be presented. We will first test our proposed method and compare it with the existing methods [3, 5] described in Section 2. Secondly, we will compare the performances of time domain versus frequency domain approaches and show that under some channel conditions the frequency domain approach outperforms the time domain approach. Finally, we will analyze and compare the computational complexity of algorithms proposed in this paper.

6.1. Simulations of time domain approaches

A Rayleigh fading channel of order L = 4 is used. The size of transmitted blocks is M = 8 and received block size is P = M + L = 12. The normalized least squared channel estimation error, denoted as $E_{ch}$, is used as the figure of merit for channel identification and is defined as follows:

$$E_{ch} = \frac{\|\hat{h} - h\|^2}{\|h\|^2}, \tag{39}$$

where $\hat{h}$ and h are the estimated and the true channel vectors, respectively. The simulated normalized channel estimation error is shown in Figure 5 and the corresponding BER is presented in Figure 6. When the number of blocks J = 10, the MNP method (with the number of block repetitions Q = 12) outperforms the SGB method (Q = 1) by a considerable margin. Taking Q = 2 saves a lot of computation and yet yields a good performance as indicated. Furthermore, in the case of J = 2, the system with Q = 8 even outperforms the original MNP method with Q = 12. This also strengthens our argument in Section 5 that choosing Q as large as P is unnecessary.

Figure 6: Bit error rate. (M = 8; L = 4. Axes: BER versus SNR (dB). Curves: J = 2, Q = 12 (GCD); J = 2, Q = 1 (SGB); J = 2, Q = 8; J = 10, Q = 12 (GCD); J = 10, Q = 1 (SGB); J = 10, Q = 2.)
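To put the computational saving mentioned above in numbers, the following sketch evaluates the SVD cost model O(QJ(P + Q − 1)²) that Section 6.3 states for the (P + Q − 1) × QJ data matrix, using the (J, Q) pairs of Figures 5 and 6. The formula is taken at face value with constants dropped, so only the ratios are meaningful; the function name `svd_cost` is ours.

```python
def svd_cost(J, Q, P):
    """Cost model from Section 6.3 without constants: Q * J * (P + Q - 1)^2."""
    return Q * J * (P + Q - 1) ** 2

P = 12  # received block size, P = M + L = 8 + 4

# (J, Q) pairs that appear in the legends of Figures 5 and 6.
settings = [(10, 1), (10, 2), (10, 12), (2, 1), (2, 8), (2, 12)]
costs = {(J, Q): svd_cost(J, Q, P) for J, Q in settings}

for (J, Q), cost in costs.items():
    ratio = cost / svd_cost(J, 1, P)  # relative to the SGB choice Q = 1
    print(f"J={J:2d}, Q={Q:2d}: cost ~ {cost:6d}  ({ratio:.1f}x the Q = 1 cost)")
```

Under this model the ratio between Q = P and Q = 1 does not depend on J, which is consistent with the "around 4P times" statement in Section 6.3.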
Figure 7: Normalized least squared channel error estimation. (M = 8; L = 4. Horizontal axis: SNR (dB). Curves: FD 9 blocks Q = 1; TD 9 blocks Q = 1; FD 9 blocks Q = 2; TD 9 blocks Q = 2.)

6.2. Simulations of frequency domain approaches

Figure 7 shows the comparison of frequency domain approach and time domain approach under the channel coefficients $H(z) = 1 - jz^{-1} + (-1 + 0.01j)z^{-2} + (0.01 + j)z^{-3} - 0.01jz^{-4}$.

For frequency domain approach, the normalized least squared channel error is defined as

$$E_{ch} = \frac{\|\hat{h} - h\|^2}{\|h\|^2}, \tag{40}$$

where

$$h = [\,H(\rho_1)\ \ H(\rho_2)\ \ \cdots\ \ H(\rho_N)\,] \tag{41}$$

and $\hat{h}$ is the estimation of h. Simulation results show that frequency domain approach outperforms time domain approach especially when the noise level is high. While the frequency domain approach does not in general beat the time domain approach for a random channel, it has been consistently observed that frequency domain approach performs better than time domain approach when the last channel coefficient h(L) has a small magnitude (i.e., at least one zero of H(z) is close to the origin).

Since we have the freedom to choose values of coefficients $\rho_i$, the receiver can adjust $\rho_i$ dynamically according to the a priori knowledge of the approximated channel zero locations. This is especially useful when the channel coefficients are changing slowly from block to block.

6.3. Complexity analysis

For the algorithms presented in Section 3, the SVD computation dominates the computational complexity. The number of blocks J, the number of repetitions per block Q, and the received block size P decide the size of the matrix on which SVD is taken. The complexity of SVD operation on an n × m matrix [15] is on the order of $O(mn^2)$ with m ≥ n. Since $Y_Q^{(J)}$ has size (P + Q − 1) × QJ, the complexity is $O(QJ(P + Q - 1)^2)$. We can see that the complexity can be greatly reduced by choosing a smaller Q. Recall that the SGB method [3] uses Q = 1 and the MNP method [5] uses Q = P. We thus have the following arguments:

(i) the MNP method has a complexity around 4P times the complexity of the SGB method for any J. A choice of Q between 1 and P could be seen as a compromise between system performance and complexity;
(ii) when J is large, we have the freedom to choose a smaller Q, as explained in the previous section.

For the frequency domain approach presented in Section 4, an additional matrix multiplication is required. This demands extra computational complexity of the order of $O(JP_Q^2)$. However, if the values $\rho_i$ are chosen as equally spaced on the unit circle, an FFT algorithm can be exploited and the computational complexity will be reduced to $O(JP_Q \log P_Q)$ and is negligible compared to the complexity of SVD operations.

6.4. Simulations for time-varying channels

In this section, we demonstrate the capability of the proposed generalized blind identification algorithm in time-varying channels environments. The received symbols can be expressed as

$$y(n) = \sum_{k=0}^{L} h(n, k)x(n - k), \tag{42}$$

where the (L + 1)-tap channel coefficients h(n, k) vary as the time index n changes. We generate the channel coefficients as follows. During a time interval T, the channel coefficients change from $h_1(k)$ to $h_2(k)$, where $h_1(k)$ and $h_2(k)$, 0 ≤ k ≤ L represent two sets of (L + 1)-tap independent coefficients. The variation of the coefficient is done by linear interpolation such that

$$h(n, k) = \begin{cases} h_1(k), & \text{if } n = 0, \\ h_2(k), & \text{if } n = T, \\ \dfrac{T - n}{T} h_1(k) + \dfrac{n}{T} h_2(k), & \text{otherwise}. \end{cases} \tag{43}$$

In our simulation, we choose T = 180. Coefficients of $h_1(k)$ and $h_2(k)$ are given in Table 1. The size of transmitted blocks is M = 8 and received block size is P = M + L = 12 (so the channel coefficients completely change after 15 blocks). Simulations are performed under different choices of J and Q, as indicated in Figures 8 and 9. The normalized least squared
channel error is defined as

$$E_{ch} = \frac{\|\hat{h} - \bar{h}\|^2}{\|\bar{h}\|^2}, \tag{44}$$

where $\hat{h}$ is the estimated channel and $\bar{h}$ is the averaged coefficients during the time the channel is being estimated:

$$\bar{h} = \frac{1}{JP} \sum_{n=n_0}^{n_0+JP-1} [\,h(n, 0)\ \ h(n, 1)\ \ \cdots\ \ h(n, L)\,]^T. \tag{45}$$

In Figure 8 we see that when J = 10, the time range is too large for the algorithm to estimate the time-varying channel accurately. The performance for J = 2 is much better in high SNR region because the channel does not vary too much during the time of two blocks. However, in low SNR region the performance for J = 2 becomes bad. The case for J = 4 has the best performance among all choices because the channel does not vary too much during the duration of four receiving blocks, and more data are available for accurate estimation. This simulation result provides clues about how we can choose the optimal J: if the channel variation is fast (T is smaller) we need a smaller J while we can use a larger J when T is larger.

Table 1: Coefficients for the time-varying channel.

k    h1(k)                  h2(k)
0    −0.6563 + 0.7059i      −1.2519 + 0.2295i
1    −0.6534 + 1.1774i       0.9347 + 0.1237i
2    −0.4229 − 0.2362i       0.0346 − 0.6180i
3     0.2145 − 0.2207i       0.7272 − 1.4084i
4    −0.1478 + 0.2802i       0.8612 + 0.3455i

Figure 8: Normalized channel MSE performance for a time-varying channel. (M = 8; L = 4. Horizontal axis: SNR (dB). Curves: J = 2, Q = 8; J = 4, Q = 3; J = 6, Q = 2; J = 8, Q = 2; J = 10, Q = 2; J = 10, Q = 1.)
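The time-varying channel model can be sketched in a few lines of Python. The code below generates h(n, k) by the linear interpolation of (43) from the Table 1 coefficients, forms the averaged channel of (45), and evaluates the error measure of (44). The function names and the "drift" illustration (how far the instantaneous channel moves away from its own window average) are ours and are not taken from the paper; they only indicate why a long estimation window is harder to track.

```python
import numpy as np

# Table 1 coefficients for k = 0..4.
h1 = np.array([-0.6563 + 0.7059j, -0.6534 + 1.1774j, -0.4229 - 0.2362j,
               0.2145 - 0.2207j, -0.1478 + 0.2802j])
h2 = np.array([-1.2519 + 0.2295j, 0.9347 + 0.1237j, 0.0346 - 0.6180j,
               0.7272 - 1.4084j, 0.8612 + 0.3455j])
T, P = 180, 12  # interpolation interval and received block size

def h_tv(n):
    """Eq. (43): linear interpolation, equal to h1 at n = 0 and h2 at n = T."""
    return ((T - n) * h1 + n * h2) / T

def h_avg(n0, J):
    """Eq. (45): average of h(n, .) over the J*P samples starting at n0."""
    return np.mean([h_tv(n) for n in range(n0, n0 + J * P)], axis=0)

def e_ch(h_est, h_ref):
    """Eq. (44): normalized squared error of h_est with respect to h_ref."""
    return np.linalg.norm(h_est - h_ref) ** 2 / np.linalg.norm(h_ref) ** 2

# Largest deviation of the instantaneous channel from its window average.
drift = {}
for J in (2, 4, 10):
    ref = h_avg(0, J)
    drift[J] = max(e_ch(h_tv(n), ref) for n in range(J * P))
    print(f"J = {J:2d}: worst-case deviation from the window average = {drift[J]:.4f}")
```

Because the interpolation is linear, the window average equals the channel at the midpoint of the window, so the deviation grows with the window length JP.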
6.5. Remarks on choosing the optimal parameters

According to the simulation results above, we summarize here a general guideline to choose a set of optimal parameters in practice.

(1) When the channel is constant and for a fixed Q, a larger J appears to have a better performance (as shown in Figure 5) since more data are available for accurate estimation.
(2) When the channel is time-varying, the optimal choice of J depends on the speed of channel variation. Simulation results in Figures 8 and 9 suggest that when the channel coefficients completely change in N blocks, a choice of J ≈ N/4 could be appropriate.
(3) Suppose J is given, a choice of Q as the smallest integer that satisfies inequality (19) often has a satisfactory performance. A slightly larger Q can sometimes be better (see Figure 5 for J = 10) at the expense of a slightly increased complexity. However, if Q is too large, the performance could be even worse (see Figure 5 for J = 2, Q = 12).

The guidelines above are given by observing the simulation results. An analytically optimal set of J and Q is still under investigation.

Figure 9: Bit error rate performance for a time-varying channel. (M = 8; L = 4. Axes: BER versus SNR (dB). Curves: J = 2, Q = 8; J = 4, Q = 3; J = 6, Q = 2; J = 8, Q = 2; J = 10, Q = 2; J = 10, Q = 1.)

6.6. Noise handling for large J

It should be noted that when J is very large (and Q = 1), the proposed method behaves like a traditional subspace method using second-order statistics. Suppose

$$Y^{(J)} = HU^{(J)} + E^{(J)}, \tag{46}$$
where $E^{(J)}$ is composed of J columns of noise vectors e(n). The autocorrelation matrix of received blocks can be estimated as

$$R_{yy} = E[\,y(n)y^{\dagger}(n)\,] \approx \frac{1}{J} Y^{(J)} Y^{(J)\dagger}. \tag{47}$$

If the input signal and channel noise are uncorrelated, we can write $R_{yy}$ as

$$R_{yy} = HR_{uu}H^{\dagger} + R_{ee}, \tag{48}$$

where $R_{uu} = E[u(n)u^{\dagger}(n)]$ and $R_{ee} = E[e(n)e^{\dagger}(n)]$ are autocorrelation matrices of input blocks and noise vectors, respectively. If $R_{ee}$ is known (e.g., if the noise is white and noise variance is $N_0$, then $R_{ee} = N_0 I_P$), an improved estimation of annihilators of matrix H can be performed by taking eigen-decomposition of $R_{yy} - R_{ee}$, which results in better channel estimation [3]. This technique, however, does not apply when J is small.

7. CONCLUDING REMARKS

In this paper we proposed a generalized algorithm for blind channel identification with linear redundant precoders. The number of received blocks J ≥ 2 can be chosen freely depending on the speed of channel variation. The minimum number of repetitions Q of each received block is derived to optimize the computational complexity while retaining good performance. Simulation shows that when the system parameter Q is properly chosen, the generalized algorithm outperforms previously reported special cases, especially in time-varying channel environments.

A frequency domain version of the generalized algorithm is also presented. Simulation result shows that it outperforms time domain approach at low SNR region for certain types of channels, for example, channels with a zero close to the origin. Since we have the freedom to choose different frequency parameters in the frequency domain approach, certain choices other than equally spaced grids on the unit circle can be used to improve the system performance for different channel zero locations. An even more challenging problem might be to analytically derive the optimal frequency points for a specific type of channel.

The concept of generalized signal richness for a vector signal is introduced. With the degree of non-richness of the input signal decided, we can determine the minimum number of repetitions theoretically. A complete set of necessary and sufficient conditions for signals satisfying generalized signal richness is still under investigation. The study of effect of a linear precoder on the property of generalized signal richness could also be a challenging problem.

APPENDIX

Proof of Lemma 1. Suppose s(n) is (1/Q)-rich but not (1/(Q + 1))-rich, then there exists a 1 × (M + Q) nonzero vector $v^T = [v_1\ v_2\ \cdots\ v_{M+Q}]$ such that

$$v^T \mathcal{T}(s(n), Q + 1) = 0_{1 \times (Q+1)}, \quad \forall n. \tag{A.1}$$

Observing the first Q elements of the vector equation above, we obtain

$$[\,v_1\ \ v_2\ \ \cdots\ \ v_{M+Q-1}\,]\,\mathcal{T}(s(n), Q) = 0_{1 \times Q}, \quad \forall n. \tag{A.2}$$

Without loss of generality, assume $[v_1\ v_2\ \cdots\ v_{M+Q-1}]$ to be nonzero and it is an annihilator of $\mathcal{T}(s(n), Q)$. This violates the assumption that s(n) is (1/Q)-rich.

Proof of Lemma 2. Conditions (1) and (2) are equivalent by definition. The equivalence of conditions (3) and (4) can also be easily examined. If condition (3) is true, then either $p_{M+Q-1}^T(\alpha)$ or [0 · · · 0 1] is an annihilator of $s_Q(n)$ (as defined in Section 3.2) for all Q and hence condition (1) is also true. In the case condition (1) is true, assume there exists n ≥ 0 such that the degree of the polynomial $p_M^T(x)s(n)$ is M − 1. Then for any Q, there exists a row vector $v^T = [v_1\ v_2\ \cdots\ v_{M+Q-1}]$ such that $v^T s_Q(n) = 0$, for all n. This implies

$$\sum_{l=1}^{M} v_{k+l}\,[s(n)]_l = 0, \quad \forall n,\ k \geq 0, \tag{A.3}$$

where $[\cdot]_l$ represents the lth element of a column vector. So the series $\{v_k\}_{k=1}^{M+Q-1}$ must satisfy the recurrence (A.3) for any n ≥ 0. This requires the characteristic polynomials $p_M^T(x)s(n)$, n ≥ 0 to share at least one zero. So condition (4) must be true. By the arguments above, these four conditions are equivalent.

Proof of Lemma 3. If s(n) is proportional to the same nonzero vector x for all n, then it is obviously not (1/Q)-rich for any Q. We thus assume without loss of generality that s(0) and s(1) are linearly independent. Suppose polynomials $p_M^T(x)s(0)$ and $p_M^T(x)s(1)$ have two sets of distinct zeros $\{\alpha_{01}, \alpha_{02}, \ldots, \alpha_{0,M-1}\}$ and $\{\alpha_{11}, \alpha_{12}, \ldots, \alpha_{1,M-1}\}$, respectively. Since s(n) is not (1/Q)-rich, there exists a (2M − 2)-row vector $v^T = [v_1\ v_2\ \cdots\ v_{2M-2}]$ such that $v^T \mathcal{T}(s(n), M - 1) = 0_{1 \times (M-1)}$. We have that the nonzero row vector $v^T$ must have the form of

$$v^T = \sum_{k=1}^{M-1} c_k\,[\,1\ \ \alpha_{0,k}^{-1}\ \ \alpha_{0,k}^{-2}\ \ \cdots\ \ \alpha_{0,k}^{-(M-2)}\,] = \sum_{k=1}^{M-1} d_k\,[\,1\ \ \alpha_{1,k}^{-1}\ \ \alpha_{1,k}^{-2}\ \ \cdots\ \ \alpha_{1,k}^{-(M-2)}\,] \tag{A.4}$$

for some coefficients $c_1, c_2, \ldots, c_{M-1}, d_1, d_2, \ldots, d_{M-1}$. This implies

$$[\,c^T\ \ -d^T\,]\,V = 0^T, \tag{A.5}$$
where $c^T = [c_1\ c_2\ \cdots\ c_{M-1}]$, $d^T = [d_1\ d_2\ \cdots\ d_{M-1}]$, and

$$V = \begin{bmatrix} p_{2M-2}^T(\alpha_{01}) \\ \vdots \\ p_{2M-2}^T(\alpha_{0,M-1}) \\ p_{2M-2}^T(\alpha_{11}) \\ \vdots \\ p_{2M-2}^T(\alpha_{1,M-1}) \end{bmatrix} \tag{A.6}$$

is a Vandermonde matrix. If all zeros $\{\alpha_{ij}\}$ are distinct, V is a (2M − 2) × (2M − 2) invertible matrix and (A.5) implies $c^T = d^T = 0^T$ and hence $v^T = 0^T$. This contradicts the assumption that s(n) is not (1/(M − 1))-rich. Therefore, if s(n) is not (1/(M − 1))-rich, there must be a common zero shared by $p_{2M-2}^T(x)s(0)$ and $p_{2M-2}^T(x)s(1)$. Similarly, we can obtain that there exists an α such that $p_{2M-2}^T(\alpha)s(n) = 0$ for all n. Using Lemma 2, this implies that s(n) is not (1/Q)-rich for all Q.

In the case where the polynomial $p_{2M-2}^T(x)s(n)$ has multiple zeros for some n, the matrix V in (A.5) can be replaced with a confluent Vandermonde matrix [15] which is still invertible.

ACKNOWLEDGMENTS

This work was supported in part by the NSF Grant CCF-0428326, ONR Grant N00014-06-1-0011, and the Moore Fellowship of the California Institute of Technology.

REFERENCES

[1] B. Porat and B. Friedlander, “Blind equalization of digital communication channels using high-order moments,” IEEE Transactions on Signal Processing, vol. 39, no. 2, pp. 522–526, 1991.
[2] L. Tong, G. Xu, and T. Kailath, “Blind identification and equalization based on second-order statistics: a time domain approach,” IEEE Transactions on Information Theory, vol. 40, no. 2, pp. 340–349, 1994.
[3] A. Scaglione, G. B. Giannakis, and S. Barbarossa, “Redundant filter bank precoders and equalizers part II: blind channel estimation, synchronization, and direct equalization,” IEEE Transactions on Signal Processing, vol. 47, no. 7, pp. 2007–2022, 1999.
[4] J. H. Manton and W. D. Neumann, “Totally blind channel identification by exploiting guard intervals,” Systems and Control Letters, vol. 48, no. 2, pp. 113–119, 2003.
[5] D. H. Pham and J. H. Manton, “A subspace algorithm for guard interval based channel identification and source recovery requiring just two received blocks,” in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP ’03), vol. 4, pp. 317–320, Hong Kong, April 2003.
[6] B. Su and P. P. Vaidyanathan, “A generalization of deterministic algorithm for blind channel identification with filter bank precoders,” in Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS ’06), Kos Island, Greece, May 2006.
[7] P. P. Vaidyanathan, Multirate Systems and Filter Banks, Prentice-Hall, Englewood Cliffs, NJ, USA, 1993.
[8] Y.-P. Lin and S.-M. Phoong, “Perfect discrete multitone modulation with optimal transceivers,” IEEE Transactions on Signal Processing, vol. 48, no. 6, pp. 1702–1711, 2000.
[9] W. Qiu, Y. Hua, and K. Abed-Meraim, “A subspace method for the computation of the GCD of polynomials,” Automatica, vol. 33, no. 4, pp. 741–743, 1997.
[10] L. Tong, G. Xu, and T. Kailath, “A new approach to blind identification and equalization of multipath channels,” in Proceedings of the 25th Asilomar Conference on Signals, Systems, & Computers, vol. 2, pp. 856–860, Pacific Grove, Calif, USA, November 1991.
[11] E. Moulines, P. Duhamel, J.-F. Cardoso, and S. Mayrargue, “Subspace methods for the blind identification of multichannel FIR filters,” IEEE Transactions on Signal Processing, vol. 43, no. 2, pp. 516–525, 1995.
[12] P. P. Vaidyanathan and B. Vrcelj, “A frequency domain approach for blind identification with filter bank precoders,” in Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS ’04), vol. 3, pp. 349–352, Vancouver, BC, Canada, May 2004.
[13] Y. Li and Z. Ding, “Blind channel identification based on second order cyclostationary statistics,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’93), vol. 4, pp. 81–84, Minneapolis, Minn, USA, April 1993.
[14] B. Su and P. P. Vaidyanathan, “Generalized signal richness preservation problem and Vandermonde-form preserving matrices,” to appear in IEEE Transactions on Signal Processing.
[15] G. H. Golub and C. F. Van Loan, Matrix Computations, Johns Hopkins University Press, Baltimore, MD, USA, 3rd edition, 1996.

Borching Su was born in Tainan, Taiwan, on October 8, 1978. He received the B.S. and M.S. degrees in electrical engineering and communication engineering, both from National Taiwan University (NTU), Taipei, Taiwan, in 1999 and 2001, respectively. He is currently pursuing the Ph.D. degree in the field of digital signal processing at California Institute of Technology (Caltech). In 2003, he was awarded the Moore Fellowship from Caltech. His current research interests include multirate systems and their applications on digital communications.

P. P. Vaidyanathan received the B.Tech. and M.Tech. degrees in radiophysics and electronics, from the University of Calcutta, and the Ph.D. degree in electrical and computer engineering from the University of California at Santa Barbara, in 1982. Since then he has been with the Faculty of Electrical Engineering at the California Institute of Technology. He has authored many papers in the signal processing area. He has received several awards for excellence in teaching at the California Institute of Technology. In 1989, he received the IEEE ASSP Senior Paper Award. In 1990, he was recipient of the S. K. Mitra Memorial Award from the Institute of Electronics and Telecommunications Engineers, India, for his joint paper in the IETE journal. He is a Fellow of the IEEE. He received the 1995 F. E. Terman Award of the American Society for Engineering Education. He has served as a Distinguished Lecturer for the IEEE Signal Processing Society (1996-1997). In 1999 he was chosen to receive the IEEE CAS Society’s Golden Jubilee Medal. He was a recipient of the IEEE Signal Processing Society’s Technical Achievement Award for the year 2002.
doi:10.1155/2007/49389
Research Article
Channel Equalization in Filter Bank Based Multicarrier
Modulation for Wireless Communications
Tero Ihalainen,1 Tobias Hidalgo Stitz,1 Mika Rinne,2 and Markku Renfors1
1 Institute of Communications Engineering, Tampere University of Technology, P.O. Box 553, Tampere FI-33101, Finland
2 Nokia Research Center, P.O. Box 407, Helsinki FI-00045, Finland
Received 5 January 2006; Revised 6 August 2006; Accepted 13 August 2006
Recommended by See-May Phoong
Channel equalization in filter bank based multicarrier (FBMC) modulation is addressed. We utilize an efficient oversampled filter
bank concept with 2x-oversampled subcarrier signals that can be equalized independently of each other. Due to Nyquist pulse
shaping, consecutive symbol waveforms overlap in time, which calls for special means for equalization. Two alternative linear
low-complexity subcarrier equalizer structures are developed together with straightforward channel estimation-based methods to
calculate the equalizer coefficients using pointwise equalization within each subband (in a frequency-sampled manner). A novel
structure, consisting of a linear-phase FIR amplitude equalizer and an allpass filter as phase equalizer, is found to provide enhanced
robustness to timing estimation errors. This allows the receiver to be operated without time synchronization before the filter bank.
The coded error-rate performance of FBMC with the studied equalization scheme is compared to a cyclic prefix OFDM reference
in wireless mobile channel conditions, taking into account issues like spectral regrowth with practical nonlinear transmitters and
sensitivity to frequency offsets. It is further emphasized that FBMC provides flexible means for high-quality frequency selective
filtering in the receiver to suppress strong interfering spectral components within or close to the used frequency band.
Copyright © 2007 Tero Ihalainen et al. This is an open access article distributed under the Creative Commons Attribution License,
1. INTRODUCTION

Orthogonal frequency division multiplexing (OFDM) [1] has become a widely accepted technique for the realization of broadband air-interfaces in high data rate wireless access systems. Indeed, due to its inherent robustness to multipath propagation, OFDM has become the modulation choice for both wireless local area network (WLAN) and terrestrial digital broadcasting (digital audio and video broadcasting; DAB, DVB) standards. Furthermore, multicarrier transmission schemes are generally considered candidates for the future “beyond 3G” mobile communications.

All these current multicarrier systems are based on the conventional cyclic prefix OFDM modulation scheme. In such systems, very simple equalization (one complex coefficient per subcarrier) is made possible by converting the broadband frequency selective channel into a set of parallel flat-fading subchannels. This is achieved using the inverse fast Fourier transform (IFFT) processing and by inserting a time domain guard interval, in the form of a cyclic prefix (CP), to the OFDM symbols at the transmitter. By dimensioning the CP longer than the maximum delay spread of the radio channel, interference from the previous OFDM symbol, referred to as inter-symbol-interference (ISI), will only affect the guard interval. At the receiver, the guard interval is discarded to elegantly avoid ISI prior to transforming the signal back to frequency domain using the fast Fourier transform (FFT).

While enabling a very efficient and simple way to combat multipath effects, the CP is pure redundancy, which decreases the spectral efficiency. As a consequence, there has recently been a growing interest towards alternative multicarrier schemes, which could provide the same robustness without requiring a CP, that is, offering improved spectral efficiency. Pulse shaping in multicarrier transmission dates back to the early work of Chang [2] and Saltzberg [3] in the sixties. Since then, various multicarrier concepts based on the Nyquist pulse shaping idea with overlapping symbols and bandlimited subcarrier signals have been developed by Hirosaki [4], Le Floch et al. [5], Sandberg and Tzannes [6], Vahlin and Holte [7], Wiegand and Fliege [8], Nedic [9], Vandendorpe et al. [10], Van Acker et al. [11], Siohan et al. [12], Wyglinski et al. [13], Farhang-Boroujeny [14, 15], Phoong et al. [16], and others. One central ingredient in the
later developments is the theory of efficiently implementable, modulation-based uniform filter banks, developed by Vetterli [17], Malvar [18], Vaidyanathan [19], and Karp and Fliege [20], among others. In this context, the filter banks are used in a transmultiplexer (TMUX) configuration.

We refer to the general concept as filter bank based multicarrier (FBMC) modulation. In FBMC, the subcarrier signals cannot be assumed flat-fading unless the number of subcarriers is very high. One approach to deal with the fading frequency selective channel is to use waveforms that are well localized, that is, the pulse energy both in time and frequency domains is well contained to limit the effect on consecutive symbols and neighboring subchannels [5, 7, 12]. In this context, a basic subcarrier equalizer structure of a single complex coefficient per subcarrier is usually considered. Another approach uses finite impulse response (FIR) filters as subcarrier equalizers with cross-connections between the adjacent subchannels to cancel the inter-carrier-interference (ICI) [6, 10]. A third line of studies applies a receiver filter bank structure providing oversampled subcarrier signals and performs per-subcarrier equalization using FIR filters [4, 8, 9, 11, 13]. The main idea here is that equalization of the oversampled subcarrier signals restores the orthogonality of the subcarrier waveforms and there is no need for cross-connections between the subcarriers. This paper contributes to this line of studies by developing low-complexity linear per-subcarrier channel equalizer structures for FBMC. The earlier contributions either lack connection to the theory of efficient multirate filter banks, use just a complex multiplier as subcarrier equalizer or, in case of non trivial subcarrier equalizers, lack the analysis of needed equalizer length in practical wireless communication applications (many of such studies have focused purely on wireline transmission). Also various practical issues like peak-to-average power ratio and effects of timing and frequency offsets have not properly been addressed in this context before.

The basic model of the studied adaptive sine modulated/cosine modulated filter bank equalizer for transmultiplexers (ASCET) has been presented in our earlier work [21–23]. This paper extends the low-complexity equalizer of [23, 24], presenting comprehensive performance analysis,

synthesis and a 2x-oversampled analysis bank. The problem of channel equalization is addressed in Section 3. The theoretical background and principles of the proposed compensation method are presented. The chosen filter bank structure leads to a relatively simple signal model that results in criteria for perfect subcarrier equalization and formulas for FBMC performance analysis in case of practical equalizers. A complex FIR filter-based subcarrier equalizer (CFIR-SCE) and the so-called amplitude-phase (AP-SCE) equalizer are presented. Especially, some low-complexity cases are analyzed and compared in Section 4. In Section 5, we present a semianalytical and a full time domain simulation setup to evaluate the performance of the equalizer structures in a broadband wireless communication channel. Furthermore, the effects of timing and frequency offsets, nonlinearity of a power amplifier, and overall system complexity are briefly investigated. Finally, the conclusions are drawn in Section 6.

2. EXPONENTIALLY MODULATED PERFECT RECONSTRUCTION TRANSMULTIPLEXER

Figure 1 shows the structure of the complex exponentially modulated TMUX that can produce a complex in-phase/quadrature (I/Q) baseband signal required for spectrally efficient radio communications [23]. It has real format for the low-rate input signals and complex I/Q-presentation for the high-rate channel signal. It should be noted that FBMC with (real) m-PAM as subcarrier modulation and OFDM with (complex) $m^2$-QAM ideally provide the same bit rate since in general the subcarrier symbol rate in FBMC is twice that of OFDM for a fixed subchannel spacing. In this structure, there are 2M low-rate subchannels equally spaced between $[-F_s/2, F_s/2]$, $F_s$ denoting the high sampling rate.

EMFBs belong to a class of filter banks in which the subfilters are formed by frequency shifting the lowpass prototype $h_p[n]$ with an exponential sequence [27]. Exponential modulation translates $H_p(e^{j\omega})$ (lowpass frequency response) around the new center frequency determined by the subcarrier index k. The prototype $h_p[n]$ can be optimized in such a manner that the filter bank satisfies the perfect-reconstruction (PR) condition, that is, the output signal is
and studies the tradeoffs between equalizer complexity and a delayed version of the input signal [27, 28]. In the gen-
number of subcarriers required to achieve close-to-ideal per- eral form, the synthesis and analysis filters of EMFBs can be
formance in a practical broadband wireless communication written as
environment. A simple channel estimation-based calculation
of the equalizer coefficients is presented. The performance of 2 M+1 1 π
fk [n] = h p [n] exp j n + k+ , (1)
the studied equalizer structures is compared to OFDM, tak- M 2 2 M
ing into account various practical issues.

In a companion paper [25], a similar subband equalizer 2 M+1 1 π
hk [n] = h p [n] exp − j N − n + k+ ,
structure is applied to the filter bank approach for frequency M 2 2 M
domain equalization in single carrier transmission. In that (2)
context, filter banks are used in the analysis-synthesis config-
uration to replace the traditional FFT-IFFT transform-pair respectively, where n = 0, 1, . . . , N and k = 0, 1, . . . , 2M − 1.
in the receiver. Furthermore, it is assumed that the filter order is N = 2KM −
The rest of the paper is organized as follows. Section 2 1. The overlapping factor K can be used as a design parame-
briefly describes an efficient implementation structure for ter because it affects on how much stopband attenuation can
the TMUX based on exponentially modulated filter banks be achieved. Another essential design parameter is the stop-
(EMFB) [26]. The structure consists of a critically sampled band edge of the prototype filter ωs = (1 + ρ)π/2M, where
[Figure 1, block diagram: CMFB and SMFB synthesis banks, channel, parallel CMFB and SMFB analysis banks, and per-subcarrier equalizers (SCE); subchannel responses $F_0(\omega), \ldots, F_{2M-1}(\omega)$ shown on $[-\pi, \pi]$.]
Figure 1: Complex TMUX with oversampled analysis bank and per-subcarrier equalizers.
the roll-off parameter $\rho$ determines how much adjacent subchannels overlap. Typically, $\rho = 1.0$ is used, in which case only the contiguous subchannels are overlapping with each other, and the overall subchannel bandwidth is twice the subchannel spacing.

In the approach selected here, the EMFB is implemented using cosine and sine modulated filter bank (CMFB/SMFB) blocks [28], as can be seen in Figure 1. The extended lapped transform (ELT) is an efficient method for implementing PR CMFBs [18] and SMFBs [28]. The relations between the synthesis and analysis filters of the $2M$-channel EMFB and the corresponding $M$-channel CMFB and SMFB with the same real FIR prototype $h_p[n]$ are

\[ f_k[n] = \begin{cases} f_k^c[n] + j f_k^s[n], & k \in [0, M-1], \\ -\left( f_{2M-1-k}^c[n] - j f_{2M-1-k}^s[n] \right), & k \in [M, 2M-1], \end{cases} \tag{3} \]

\[ h_k[n] = \begin{cases} h_k^c[n] - j h_k^s[n], & k \in [0, M-1], \\ -\left( h_{2M-1-k}^c[n] + j h_{2M-1-k}^s[n] \right), & k \in [M, 2M-1], \end{cases} \tag{4} \]

respectively. A specific feature of the structure in Figure 1 is that while the synthesis filter bank is critically sampled, the subchannel output signals of the analysis bank are oversampled [26] by a factor of two. This is achieved by using the symbol-rate complex (I/Q) subchannel signals, instead of the real ones that are sufficient for detection after the channel equalizer, or in case of a distortion-free channel.

We consider here the use of EMFBs which have odd channel stacking, that is, the center-most pair of subchannels is symmetrically located around the zero frequency at the baseband. We could equally well use a modified EMFB structure [26] with even stacking (the center-most subchannel located symmetrically about zero). The latter form also has a slightly more efficient implementation structure, based on DFT-processing. The proposed equalizer structure can also be applied with modified DFT (MDFT) filter banks [20], with modified subchannel processing. However, for the following analysis EMFB was selected since it results in the most straightforward system model.

Further, although the discussion here is based on the use of PR filter banks, also nearly perfect-reconstruction (NPR) designs could be utilized. In the critically sampled case, the implementation benefits of NPR designs are limited because the efficient ELT structures cannot be utilized [29]. However, in the 2x-oversampled case, having two parallel CMFB and SMFB blocks, the implementation benefits of NPR designs could be more significant.

3. CHANNEL EQUALIZATION

The problem of channel equalization in the FBMC context is not as well understood as in the DFT-based systems. Our equalizer concept can be applied to both real and complex modulated baseband signal formats; here we focus on the complex case. In its simplest form, the subcarrier equalizer structure consists only of a single complex coefficient that adjusts the amplitude and phase responses of each subchannel in the receiver [22]. Higher-order SCEs are able to equalize each subchannel better if the channel frequency response is not flat within the subchannel. As a result, the use of higher-order SCEs makes it possible to increase the relative subchannel bandwidth because the subchannel responses are allowed to take mildly frequency selective shapes. As a consequence, the number of subchannels needed to cover a given signal bandwidth by FBMC can be reduced. In general, higher-order equalizer structures provide flexibility and scalability to system design because they offer a tradeoff between the number of required subchannels and complexity of the subcarrier equalizers.

The oversampled receiver is essential for the proposed equalizer structure. In case of roll-off $\rho = 1.0$ or lower, nonaliased versions of the subchannel signals are obtained in the 2x-oversampled receiver when complex (I/Q) signals are sampled at the symbol rate. Consequently, complete channel equalization in an optimal manner is possible. As a result of the high stopband attenuation of the subchannel filters, there is practically no aliasing of the subchannel signals in the receiver bank. Thus perfect equalization of the distorting channel within the subchannel passband and transition band regions would completely restore the orthogonality of the subchannel signals [9].
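As an illustration of the exponential modulation in (1) and (2), the following NumPy sketch builds the $2M$ synthesis and analysis filters from a given prototype. The prototype used here (a Hamming-windowed sinc) and the values of `M` and `K` are assumptions chosen only for illustration; the prototype is not PR-optimized, so the sketch demonstrates the filter structure rather than perfect reconstruction.

```python
import numpy as np

def emfb_filters(h_p, M):
    """Synthesis (f) and analysis (h) filters of a 2M-channel EMFB, per (1) and (2)."""
    N = len(h_p) - 1                       # filter order
    n = np.arange(N + 1)                   # time index
    k = np.arange(2 * M)[:, None]          # subchannel index, one row per filter
    scale = np.sqrt(2.0 / M)
    f = scale * h_p * np.exp(1j * (n + (M + 1) / 2) * (k + 0.5) * np.pi / M)
    h = scale * h_p * np.exp(-1j * (N - n + (M + 1) / 2) * (k + 0.5) * np.pi / M)
    return f, h

# Illustrative prototype (assumption, not PR-optimized): Hamming-windowed sinc
M, K = 8, 2
N = 2 * K * M - 1
n = np.arange(N + 1)
h_p = np.sinc((n - N / 2) / (2 * M)) * np.hamming(N + 1)
h_p /= h_p.sum()

f, h = emfb_filters(h_p, M)

# Locate the magnitude peak of each synthesis filter on a dense FFT grid
n_fft = 4096
F = np.fft.fft(f, n_fft, axis=1)
peak = 2 * np.pi * np.argmax(np.abs(F), axis=1) / n_fft
expected = (np.arange(2 * M) + 0.5) * np.pi / M

print(np.allclose(peak, expected, atol=4 * np.pi / n_fft))   # peaks at (k + 1/2) * pi / M
print(np.allclose(h, np.conj(f[:, ::-1])))                    # h_k[n] = conj(f_k[N - n])
print(np.allclose(f[M:], -np.conj(f[:M][::-1])))              # second case of (3), even M
```

The three checks correspond to the odd channel stacking (center frequencies at $(k + 1/2)\pi/M$), to the analysis filters being time-reversed conjugates of the synthesis filters for a symmetric prototype, and to the second case of (3) for even $M$.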
3.1. Theoretical background and principles

Figure 2(a) shows a subchannel model of the complex TMUX with per-subcarrier equalizer. A more detailed model that includes the interference from the contiguous subchannels is shown in Figure 2(b). Limiting the sources of interference to the closest neighboring subchannels is justified if the filter bank design provides sufficiently high stopband attenuation. Furthermore, in this model the order of downsampling and equalization is interchanged based on the multirate identities [19]. The latter model is used as a basis for the cross-talk analysis that follows. It is also convenient for semianalytical performance evaluations. The equalizer concept is based on the property that with ideal sampling and equalization, the desired subchannel signal, carried by the real part of the complex subchannel output, is orthogonal to the contiguous subchannel signal components occupying the imaginary part. The orthogonality between the subchannels is introduced when the linear-phase lowpass prototype $h_p[n]$ is exponentially frequency shifted as a bandpass filter, with 90-degree phase-shift between the carriers of the contiguous subchannels.

In practice, the nonideal channel causes amplitude and phase distortion. The latter results in rotation between the I- and Q-components of the neighboring subchannel signals, causing ICI or cross-talk between the subchannels. ISI, on the other hand, is mainly caused by the amplitude distortion. The following set of equations provides proofs for these statements. We derive them for an arbitrary subchannel $k$ on the positive side of the baseband spectrum and the results can easily be extended for the subchannels on the negative side using (3) and (4). In the following analysis we use a noncausal zero-phase system model, which is obtained by using, instead of (2), analysis filters of the form

\[ h_k[n] = \sqrt{\frac{2}{M}}\, h_p[n + N] \exp\!\left( -j \left( -n + \frac{M+1}{2} \right) \left( k + \frac{1}{2} \right) \frac{\pi}{M} \right). \tag{5} \]

By referring to the equivalent form, shown in Figure 2(b), and adopting the notation from there, we can express the cascade of the synthesis and analysis filters of the desired subchannel $k$ as

\[ \begin{aligned} f_k[n] * h_k[n] &= \left[ \sum_{l=a}^{b} h_k^c[l]\, f_k^c[n-l] + \sum_{l=a}^{b} h_k^s[l]\, f_k^s[n-l] \right] \\ &\quad + j \cdot \left[ \sum_{l=a}^{b} h_k^c[l]\, f_k^s[n-l] - \sum_{l=a}^{b} h_k^s[l]\, f_k^c[n-l] \right] \\ &= t_k^I[n] + j \cdot t_k^Q[n] = t_k[n], \end{aligned} \tag{6} \]

where $*$ denotes the convolution operation, summation indexes are $a = -N + \max(n, 0)$ and $b = \min(n, 0)$, and $n \in [-N, \ldots, N]$.

3.1.1. ICI analysis

For the potential ICI terms from the contiguous subchannels $k - 1$ and $k + 1$ (below and above) to the subchannel $k$ of interest, we can write

\[ \begin{aligned} f_{k-1}[n] * h_k[n] &= \left[ \sum_{l=a}^{b} h_k^c[l]\, f_{k-1}^c[n-l] + \sum_{l=a}^{b} h_k^s[l]\, f_{k-1}^s[n-l] \right] \\ &\quad + j \cdot \left[ \sum_{l=a}^{b} h_k^c[l]\, f_{k-1}^s[n-l] - \sum_{l=a}^{b} h_k^s[l]\, f_{k-1}^c[n-l] \right] \\ &= v_k^I[n] + j \cdot v_k^Q[n] = v_k[n], \\ f_{k+1}[n] * h_k[n] &= \left[ \sum_{l=a}^{b} h_k^c[l]\, f_{k+1}^c[n-l] + \sum_{l=a}^{b} h_k^s[l]\, f_{k+1}^s[n-l] \right] \\ &\quad + j \cdot \left[ \sum_{l=a}^{b} h_k^c[l]\, f_{k+1}^s[n-l] - \sum_{l=a}^{b} h_k^s[l]\, f_{k+1}^c[n-l] \right] \\ &= u_k^I[n] + j \cdot u_k^Q[n] = u_k[n], \end{aligned} \tag{7} \]

respectively.

Due to PR design, the real parts $v_k^I[m]$ and $u_k^I[m]$ ($m$ being the sample index at the low rate) of the downsampled subchannel signals are all-zero sequences (or close to zero sequences in the NPR case). So ideally, when the real part of the signal is taken in the receiver, no crosstalk from the neighboring subchannels is present in the signal used for detection. Channel distortion, however, causes phase rotation between the I- and Q-components, breaking the orthogonality between the subcarriers. Channel equalization is required to recover the orthogonality of the subcarriers.

The ICI components from other subcarriers located further apart from the subchannel of interest are considered negligible. This is a reasonable assumption because the extent of overlapping of subchannel spectra and the level of stopband attenuation can easily be controlled in FBMC. In fact, they are used as optimization criteria in filter bank design, as discussed in the previous section.

The cascade of the distorting channel with instantaneous impulse response (in the baseband model) $h_{ch}[n]$ and the upsampled version of the per-subcarrier equalizer $\bar{c}_k[n]$ (see Figure 2) applied to the subchannel $k$ of interest can be expressed as

\[ h_{ch}[n] * \bar{c}_k[n] = r_k[n]. \tag{8} \]

In the analysis, a noncausal high-rate impulse response $\bar{c}_k[n]$ is used for the equalizer, although in practice the low-rate causal form $c_k[m]$ is applied.

Next we analyze the ICI components potentially remaining in the real parts of the subchannel signals that are used for detection. Figure 3 visualizes the two ICI bands for subchannel $k = 0$. We start from the lower-side ICI term and use an equivalent baseband model, where the potential ICI energy
[Figure 2, block diagrams: (a) upsampling by $M$, synthesis bank $f_k[n]$, distorting channel $h_{ch}[n]$, analysis bank $h_k[n]$, downsampling by $M$, equalizer $c_k[m]$, and real-part operation; (b) parallel branches $t_k^I[n] + j t_k^Q[n] = f_k[n] * h_k[n]$, $u_k^I[n] + j u_k^Q[n] = f_{k+1}[n] * h_k[n]$, and $v_k^I[n] + j v_k^Q[n] = f_{k-1}[n] * h_k[n]$, followed by $r_k^I[n] + j r_k^Q[n] = h_{ch}[n] * \bar{c}_k[n]$, the real-part operation, and downsampling by $M$, where $\bar{c}_k[n] = c_k[n/M]$ for $n = mM$, $m \in \mathbb{Z}$, and $\bar{c}_k[n] = 0$ otherwise.]
Figure 2: Complex TMUX with per-subcarrier equalizer. (a) System model for subchannel k. (b) Equivalent form including also contiguous subchannels for crosstalk analysis.
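The equivalence between forms (a) and (b) of Figure 2 rests on the standard multirate identity: filtering with the zero-stuffed (upsampled) equalizer at the high rate and then decimating by $M$ gives the same samples as decimating first and filtering with the low-rate equalizer. A minimal NumPy sketch of this identity follows; the random input (a stand-in for the analysis filter output), the causal three-tap equalizer, and the helper name `upsample_taps` are all assumptions made for illustration.

```python
import numpy as np

def upsample_taps(c, M):
    """Insert M - 1 zeros between the taps of c (the high-rate equalizer form)."""
    c_bar = np.zeros((len(c) - 1) * M + 1, dtype=complex)
    c_bar[::M] = c
    return c_bar

rng = np.random.default_rng(0)
M = 4
x = rng.standard_normal(64) + 1j * rng.standard_normal(64)   # stand-in subchannel signal
c = np.array([0.1 - 0.2j, 1.0 + 0.0j, -0.05 + 0.1j])         # illustrative equalizer taps

# Form (b): equalize at the high rate with the upsampled taps, then decimate by M
high_rate = np.convolve(x, upsample_taps(c, M))[::M]
# Form (a): decimate by M, then equalize at the low rate
low_rate = np.convolve(x[::M], c)

print(np.allclose(high_rate, low_rate))
```

The sketch uses a causal equalizer for simplicity, whereas the analysis in the text uses a noncausal form; the identity itself does not depend on that choice.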
is symmetrically located about zero frequency. We can write the baseband cross-talk impulse response from subchannel $k - 1$ to subchannel $k$ in case of an ideal channel as

\[ \tilde{v}_k[n] = \tilde{v}_k^I[n] + j\, \tilde{v}_k^Q[n] = v_k[n]\, e^{-jnk\pi/M}. \tag{9} \]

In the appendix, it is shown that this impulse response is purely imaginary, that is, $\tilde{v}_k^I[n] \equiv 0$ and $\tilde{v}_k[n] = \tilde{v}_0[n]$. In case of nonideal channel with channel equalization, the baseband cross-talk impulse response can now be written as

\[ g_k^{k-1}[n] = j\, \tilde{v}_0^Q[n] * \tilde{r}_k[n], \tag{10} \]

where $\tilde{r}_k[n] = r_k[n]\, e^{-jnk\pi/M}$. Here the upper index denotes the source of ICI. Now we can see that if the equalized channel impulse response is real in the baseband model, then the cross-talk impulse response is purely imaginary, and there is no lower-side ICI in the real part of the subchannel signal that is used for detection.

[Figure 3, plot versus $\omega$: RX filter of the desired subchannel, TX filter of the contiguous subchannel, and the potential ICI spectrum.]
Figure 3: Potential ICI spectrum for subchannel k = 0.

At this point we have to notice that the lower-side ICI energy is zero-centered after decimation only for the even-indexed subchannels, and for the odd subchannels the above model is not valid as such. However, we can establish a simple relation between the actual decimated subchannel output sequence $z_k[mM]$ in the filter bank system and the sequence obtained by decimating in the baseband model. It is straightforward to see that the following relation holds:

\[ \left[ z_k[n]\, e^{-jnk\pi/M} \right]_{n=mM} = (-1)^{mk} z_k[mM]. \tag{11} \]

Thus, for odd subchannels, the actual decimated ICI sequence is obtained by lowpass-to-highpass transformation (i.e., through multiplication by an alternating $\pm 1$-sequence) from the ICI sequence of the baseband model. Then the actual ICI is guaranteed to be zero if it is zero in the baseband model. Therefore, a sufficient condition for zero lower-side ICI in all subchannels is that the equalized baseband channel impulse response is purely real.

For the upper-side ICI, we can first write the baseband model as

\[ \tilde{u}_k[n] = \tilde{u}_k^I[n] + j\, \tilde{u}_k^Q[n] = u_k[n]\, e^{-jn(k+1)\pi/M}. \tag{12} \]

Again, it is shown in the appendix that this baseband impulse response is purely imaginary, that is, $\tilde{u}_k^I[n] \equiv 0$ and $\tilde{u}_k[n] = \tilde{u}_{2M-1}[n]$. With equalized nonideal channel, the cross-talk response is now

\[ g_k^{k+1}[n] = j\, \tilde{u}_{2M-1}^Q[n] * \left( \tilde{r}_k[n]\, e^{-jn\pi/M} \right) \tag{13} \]

and the upper-side ICI vanishes if the equalized channel impulse response is real in this baseband model. Now the relation between the decimated models is

\[ \left[ z_k[n]\, e^{-jn(k+1)\pi/M} \right]_{n=mM} = (-1)^{m(k+1)} z_k[mM] \tag{14} \]
and a sufficient condition also for zero upper-side ICI is that the equalized baseband channel impulse response is purely real. However, the baseband models for the two cases are slightly different, and both conditions

\[ \begin{aligned} \operatorname{Im}\left[ \tilde{r}_k[n] \right] &\equiv 0, \\ \operatorname{Im}\left[ \tilde{r}_k[n]\, e^{-jn\pi/M} \right] &\equiv 0 \end{aligned} \tag{15} \]

have to be simultaneously satisfied to achieve zero overall ICI. In the frequency domain, the equalized channel frequency response is required to have symmetric amplitude and antisymmetric phase with respect to both of the frequencies $k\pi/M$ and $(k + 1)\pi/M$ to suppress both ICI components. Naturally, the ideal full-band channel equalization (resulting in constant amplitude and zero phase) implies both conditions. In our FBMC system, the equalization is performed at low rate, after filtering and decimation by $M$, and the mentioned two frequencies correspond to $0$ and $\pi$, that is, the filtered and downsampled portion of $H_{ch}(e^{j\omega})$ in subchannel $k$ multiplied by the equalizer $C_k(e^{j\omega})$ must fulfill the symmetry condition for zero ICI. In this case, the two symmetry conditions are equivalent (i.e., symmetric amplitude around $0$ implies symmetric amplitude around $\pi$, and antisymmetric phase around $0$ implies antisymmetric phase around $\pi$). The target is to approximate ideal channel equalization over the subchannel passband and transition bands with sufficient accuracy.

3.1.2. ISI analysis

In case of an ideal channel, the desired subchannel impulse response of the baseband model can be written as

\[ \tilde{t}_k[n] = \tilde{t}_k^I[n] + j\, \tilde{t}_k^Q[n] = t_k[n]\, e^{-jnk\pi/M}. \tag{16} \]

For odd subchannels, a lowpass-to-highpass transformation has to be included in the model to get the actual response for the decimated filter bank, but the model above is suitable for analyzing all subchannels. Now the real part of the subchannel response with actual channel and equalizer can be written (see the appendix) as

\[ \begin{aligned} g_k[n] &= \operatorname{Re}\left[ \tilde{t}_k[n] * \tilde{r}_k[n] \right] = \operatorname{Re}\left[ \tilde{t}_0[n] * \tilde{r}_k[n] \right] \\ &= \tilde{t}_0^I[n] * \operatorname{Re}\left[ \tilde{r}_k[n] \right] - \tilde{t}_0^Q[n] * \operatorname{Im}\left[ \tilde{r}_k[n] \right]. \end{aligned} \tag{17} \]

The conditions for suppressing ICI are also sufficient for suppressing the latter term of this equation. Furthermore, in case of PR filter bank design, $\tilde{t}_0^I[n]$ is a Nyquist pulse. Designing the channel equalizer to provide unit amplitude and zero-phase response, a condition equivalent to having

\[ \operatorname{Re}\left[ \tilde{r}_k[n] \right] = \delta[n] = \begin{cases} 1, & n = 0, \\ 0, & \text{otherwise}, \end{cases} \tag{18} \]

would suppress the ISI within the subchannel.

The above conditions were derived in the high-rate, full-band case, and if the conditions are fully satisfied, ISI within the subchannel and ICI from the lower and upper adjacent subchannels are completely eliminated. In practice, the equalization takes place at the decimated low sampling rate, and can be done only within the passband and transition band regions (assuming roll-off $\rho = 1.0$). However, the ICI and ISI components outside the equalization band are proportional to the stopband attenuation of the subchannel filters and can be ignored.

3.2. Optimization criteria for the equalizer coefficients

Our interest is in low-complexity subcarrier equalizers, which do not necessarily provide responses very close to the ideal in all cases. Therefore, it is important to analyze the ICI and ISI effects with practical equalizers. This can be carried out most conveniently in the frequency domain. In the baseband model, the lower and upper ICI spectrum magnitudes are

\[ \begin{aligned} \left| \tilde{V}_k^Q(e^{j\omega})\, \tilde{R}_k^Q(e^{j\omega}) \right| &= \left| \tilde{V}_0^Q(e^{j\omega})\, \tilde{R}_k^Q(e^{j\omega}) \right| \\ &= \frac{M}{2} \left| H_p\!\left( e^{j(\omega - (\pi/2M))} \right) H_p\!\left( e^{j(\omega + (\pi/2M))} \right) \right| \cdot \left| \tilde{R}_k^Q(e^{j\omega}) \right|, \\ \left| \tilde{U}_k^Q(e^{j\omega})\, \tilde{R}_k^Q\!\left( e^{j(\omega + (\pi/M))} \right) \right| &= \left| \tilde{U}_{2M-1}^Q(e^{j\omega})\, \tilde{R}_k^Q\!\left( e^{j(\omega + (\pi/M))} \right) \right| \\ &= \frac{M}{2} \left| H_p\!\left( e^{j(\omega - (\pi/2M))} \right) H_p\!\left( e^{j(\omega + (\pi/2M))} \right) \right| \cdot \left| \tilde{R}_k^Q\!\left( e^{j(\omega + (\pi/M))} \right) \right|, \end{aligned} \tag{19} \]

respectively. Here the upper-case symbols stand for the Fourier transforms of the impulse responses denoted by the corresponding lower-case symbols. The terms involving the two frequency shifted prototype frequency responses are the overall magnitude response for the crosstalk. $H_p(e^{j(\omega - (\pi/2M))})$ appears here as the receive filter for the desired subchannel and $H_p(e^{j(\omega + (\pi/2M))})$ denotes the response of the transmit filter of the contiguous (potentially interfering) subchannel. The actual frequency response includes phase terms, but based on the discussion in the previous subsection we know that, in the baseband model of the ideal channel case, all the cross-talk energy is in the imaginary part of the impulse response. The residual imaginary part of the equalized channel impulse response $\tilde{r}_k^Q[n]$ determines how much of this cross-talk energy appears as ICI in detection. It can be calculated as a function of frequency for a given set of equalizer coefficients, assuming the required knowledge on the channel response is available. Now the ICI power for subchannel $k$ can be obtained with good accuracy by integrating over the transition bands in the baseband
model

\[ \begin{aligned} P_k^{\mathrm{ICI}} &= \int_{-\pi/2M}^{\pi/2M} \frac{M^2}{4} \left| H_p\!\left( e^{j(\omega - (\pi/2M))} \right) H_p\!\left( e^{j(\omega + (\pi/2M))} \right) \right|^2 \cdot \left| \tilde{R}_k^Q(e^{j\omega}) \right|^2 d\omega \\ &\quad + \int_{-\pi/2M}^{\pi/2M} \frac{M^2}{4} \left| H_p\!\left( e^{j(\omega - (\pi/2M))} \right) H_p\!\left( e^{j(\omega + (\pi/2M))} \right) \right|^2 \cdot \left| \tilde{R}_k^Q\!\left( e^{j(\omega + (\pi/M))} \right) \right|^2 d\omega. \end{aligned} \tag{20} \]

Also the ISI power can be calculated, as soon as the channel and equalizer responses are known, from the aliased spectrum of $G_k(e^{j\omega})$, as

\[ P_k^{\mathrm{ISI}} = \int_{0}^{\pi/M} \left| M - \sum_{l=-1}^{1} G_k\!\left( e^{j(\omega + (l\pi/M))} \right) \right|^2 d\omega. \tag{21} \]

Here, the Nyquist criterion in frequency domain is used: in ISI-free conditions, the folded spectrum of the overall subchannel response $G_k(e^{j\omega})$ adds up to a constant level $M$, a condition equivalent to the overall impulse response being a unit impulse. By calculating the difference between this ideal reference level and the actual spectrum, the spectrum resulting from the residual ISI can be extracted. Integration over this residual spectrum gives the ISI power, according to (21).

Typically, the pulse shape applied to the symbol detector, the slicer, is constrained to satisfy the Nyquist criterion. In the presence of ISI, this often requires from the receive filter (in this context, the term “receive filter” is assumed to include both the analysis filter and the equalizer) a gain that compensates for the channel loss and causes the noise power to be amplified. This is called noise enhancement. The subchannel noise gain can be calculated as

\[ \beta_k^n = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left| C_k(e^{j\omega})\, H_p\!\left( e^{j((\omega \mp \pi/2)/M)} \right) \right|^2 d\omega, \tag{22} \]

where $C_k(e^{j\omega})$ is the response of the subchannel equalizer. The $-$ and $+$ signs are valid for even and odd subchannel indexes, respectively.

3.3. Semianalytical performance evaluation

The performance of the studied FBMC, using per-subcarrier equalization to combat multipath distortion, can be evaluated semianalytically according to the discussion above. The term “semianalytically” refers, in this context, to the fact that no actual signal needs to be generated for transmission. Instead, a frequency domain analysis of the distorting channel and the equalizer can be applied to derive the ICI and ISI power spectra and the noise enhancement involved. Based on $P_k^{\mathrm{ICI}}$, $P_k^{\mathrm{ISI}}$, and $\beta_k^n$, the overall signal to interference plus noise ratio(s) (SINR) for given $E_b/N_0$-value(s) can be obtained. Then, well-known formulas based on the Q-function [30] and Gray-coding assumption can be exploited to estimate the uncoded bit error-rate (BER) performance. This can further be averaged over a number of channel instances corresponding to a given power delay profile.

4. LOW-COMPLEXITY POINTWISE PER-SUBCARRIER EQUALIZATION

The known channel equalization solutions for FBMC suffer from insufficient performance, as in the case of the 0th-order ASCET [22], and/or from relatively high implementation complexity, as in the FIR filter based approach described, for example, by Hirosaki in [4]. To overcome these problems, a specific structure that equalizes at certain frequency points is considered. The pointwise equalization principle proceeds from the consideration that the subchannel equalizers are designed to equalize the channel optimally at certain frequency points within the subband. To be more precise, the coefficients of the equalizer are set such that, at all the considered frequency points, the equalizer amplitude response optimally approaches the inverse of the determined channel amplitude response and the equalizer phase response optimally approaches the negative of the determined channel phase response. Optimal equalization at all frequencies would implicitly fulfill the zero ICI conditions of (15), and the zero ISI condition of (18). In pointwise equalization, the optimal linear equalizer is approximated between the considered points and the residual ICI and ISI interference powers depend on the degree of inaccuracy with respect to the zero ICI/ISI conditions and can be measured using (20) and (21), respectively. On the other hand, the level of inaccuracy depends on the relation of the channel coherence bandwidth [31] to the size of the filter bank and the order of the pointwise per-subcarrier equalizer. For mildly frequency selective subband responses, low-complexity structures are sufficient to keep the residual ICI and ISI at tolerable levels.

Alternative optimization criteria are possible for the equalizer coefficients from the amplitude equalization point of view, namely, zero-forcing (ZF) and mean-squared error (MSE) criteria [30, 31]. The most straightforward approach is ZF, where the coefficients are set such that the achieved equalizer response compensates the channel response exactly at the predetermined frequency points. The ZF criterion aims to minimize $P_k^{\mathrm{ICI}}$ and $P_k^{\mathrm{ISI}}$, but ignores the effect of noise. Ultimately, the goal is to minimize the probability of decision errors. The MSE criterion tries to achieve this goal by making a tradeoff between the noise enhancement and residual ISI at the slicer input. The MSE criterion thus alleviates the noise enhancement problem of ZF and could provide improved performance for those subchannels that coincide with the deep notches in the channel frequency response. For high SNR, the MSE solution of the amplitude equalizer converges to that obtained by the ZF criterion.

4.1. Complex FIR equalizer

A straightforward way to perform equalization at certain frequency points within a subband is to use a complex FIR filter (CFIR-SCE), an example structure of which is shown in Figure 4, that has the desired frequency response at those given points. In order to equalize for example at three
frequency points, a 3-tap complex FIR with noncausal transfer function

\[ H_{\mathrm{CFIR\text{-}SCE}}(z) = c_{-1} z + c_0 + c_1 z^{-1} \tag{23} \]

offers the needed degrees of freedom. The equalizer coefficients are calculated by evaluating the transfer function, which is set to the desired response, at the chosen frequency points and setting up an equation system that is solved for the coefficients.

[Figure 4, block diagram: three-tap complex FIR filter with taps $c_{-1k}$, $c_{0k}$, $c_{1k}$, delay elements $z^{-1}$, and a real-part operation at the output.]
Figure 4: An example structure of the CFIR-SCE subcarrier equalizer.

4.2. Amplitude-phase equalizer

We consider a linear equalizer structure consisting of an allpass phase correction section and a linear-phase amplitude equalizer section. This structure is applied to each complex subchannel signal for separately adjusting the amplitude and phase. This particular structure makes it possible to independently design the amplitude equalization and phase equalization parts, leading to simple algorithms for optimizing the equalizer coefficients. The orders of the equalizer stages are chosen to obtain a low-complexity solution. A few variants of the filter structure have been studied and will be described in the following.

An example structure of the AP-SCE equalizer is illustrated in detail in Figure 5. In this case, each subchannel equalizer comprises a cascade of a first-order complex allpass filter, a phase rotator combined with the operation of taking the real part of the signal, and a first-order real allpass filter for compensating the phase distortion. The structure, moreover, consists of a symmetric 5-tap FIR filter for compensating the amplitude distortion. Note that the operation of taking the real part of the signal for detection is moved before the real allpass phase correction stage. This does not affect the output of the AP-SCE, but reduces its implementation complexity.

The transfer functions of the real and complex first-order allpass filters are given by

\[ H_r(z) = \frac{1 + b_r z}{1 + b_r z^{-1}}, \tag{24} \]

\[ H_c(z) = \frac{1 - j b_c z}{1 + j b_c z^{-1}}, \tag{25} \]

respectively. In practice, these filters are realized in the causal form as $z^{-1} H_{\cdot}(z)$, but the above noncausal forms simplify the following analysis. For the considered example structure, the overall phase response of the AP-SCE phase correction section (for the $k$th subchannel) can be derived from (24) and (25)

\[ \begin{aligned} \arg\left[ H_{peq}(e^{j\omega}) \right] &= \arg\left[ e^{j\varphi_{0k}} \cdot H_c(e^{j\omega}) \cdot H_r(e^{j\omega}) \right] \\ &= \varphi_{0k} + 2 \arctan\!\left( \frac{-b_{ck} \cos\omega}{1 + b_{ck} \sin\omega} \right) + 2 \arctan\!\left( \frac{b_{rk} \sin\omega}{1 + b_{rk} \cos\omega} \right). \end{aligned} \tag{26} \]

In a similar manner, we can express the transfer function of the amplitude equalizer section in a noncausal form as

\[ H_{aeq}(z) = a_2 z^2 + a_1 z + a_0 + a_1 z^{-1} + a_2 z^{-2}, \tag{27} \]

from which the equalizer magnitude response for the $k$th subchannel is obtained

\[ \left| H_{aeq}(e^{j\omega}) \right| = \left| a_{0k} + 2 a_{1k} \cos\omega + 2 a_{2k} \cos 2\omega \right|. \tag{28} \]

4.3. Low-complexity AP-SCE and CFIR-SCE

Case 1. The subchannel equalization is based on a single frequency point located at the center frequency of a specific subchannel, at $\pm\pi/2$ at the low sampling rate. Here the $+$ sign is valid for the even and the $-$ sign is valid for the odd subchannel indexes, respectively. In this case, the associated phase equalizer only has to comprise a complex coefficient $e^{j\varphi_{0k}}$ for phase rotation. The amplitude equalizer is reduced to just one real coefficient as a scaling factor. This case corresponds to the 0th-order ASCET or a single-tap CFIR-SCE.

Case 2. Here, equalization at two frequency points located at the edges of the passband of a specific subchannel, at $\omega = 0$ and $\omega = \pm\pi$, is expected to be sufficient. The $+$ and $-$ signs are again valid for the even and odd subchannels, respectively. In this case, the associated equalizer has to comprise, in addition to the complex coefficient $e^{j\varphi_{0k}}$, the first-order complex allpass filter as the phase equalizer, and a symmetric 3-tap FIR filter as the amplitude equalizer. Compared to the equalizer structure of Figure 5, the real allpass filter is omitted and the length of the 5-tap FIR filter is reduced to 3. In the CFIR-SCE approach, two taps are used.

Case 3. Here, three frequency points are used for channel equalization. One frequency point is located at the center of the subchannel frequency band, at $\omega = \pm\pi/2$, and two frequency points are located at the passband edges of the subchannel, at $\omega = 0$ and $\omega = \pm\pi$. In this case, the associated equalizer has to comprise all the components of the equalizer structure depicted in Figure 5. In the CFIR-SCE structure of Figure 4, all three taps are used.

Mixed cases of phase and amplitude equalization. Naturally, also mixed cases of AP-SCE are possible, in which a different number of frequency points within a subband are considered for the compensation of phase and amplitude distortion. For
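[Figure 5, block diagram: phase equalizer consisting of a complex allpass filter (coefficient $b_{ck}$), a phase rotator $e^{j\varphi_{0k}}$ with real-part operation, and a real allpass filter (coefficient $b_{rk}$), followed by the amplitude equalizer, a 5-tap symmetric FIR filter with taps $a_{2k}, a_{1k}, a_{0k}, a_{1k}, a_{2k}$.]
Figure 5: An example structure of the AP-SCE subcarrier equalizer.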
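The AP-SCE responses in (24) to (28) can be checked numerically. The sketch below, a minimal illustration rather than the authors' implementation, evaluates the noncausal transfer functions on the unit circle and compares them with the closed-form phase and amplitude expressions. The function names and all coefficient values are hypothetical, and $|b_c| < 1$, $|b_r| < 1$ is assumed so that the arctangent expressions in (26) give the correct branch.

```python
import numpy as np

def apsce_phase(w, phi0, b_c, b_r):
    """Closed-form phase response of the phase correction section, per (26)."""
    return (phi0
            + 2 * np.arctan(-b_c * np.cos(w) / (1 + b_c * np.sin(w)))
            + 2 * np.arctan(b_r * np.sin(w) / (1 + b_r * np.cos(w))))

def apsce_response(w, phi0, b_c, b_r, a0, a1, a2):
    """Phase section and amplitude section responses evaluated at z = exp(jw)."""
    z = np.exp(1j * w)
    H_r = (1 + b_r * z) / (1 + b_r / z)                    # real allpass, (24)
    H_c = (1 - 1j * b_c * z) / (1 + 1j * b_c / z)          # complex allpass, (25)
    H_a = a2 * z**2 + a1 * z + a0 + a1 / z + a2 / z**2     # symmetric 5-tap FIR, (27)
    return np.exp(1j * phi0) * H_c * H_r, H_a

# Hypothetical coefficient values, chosen only for illustration
phi0, b_c, b_r = 0.3, 0.4, -0.25
a0, a1, a2 = 1.0, 0.2, -0.05

w = np.linspace(-np.pi, np.pi, 513)
H_p, H_a = apsce_response(w, phi0, b_c, b_r, a0, a1, a2)

print(np.allclose(np.abs(H_p), 1.0))                                   # allpass: unit magnitude
print(np.allclose(np.exp(1j * apsce_phase(w, phi0, b_c, b_r)), H_p))   # phase matches (26)
print(np.allclose(H_a, a0 + 2 * a1 * np.cos(w) + 2 * a2 * np.cos(2 * w)))  # real response, cf. (28)
```

Comparing complex exponentials of the phase, rather than the angles themselves, avoids spurious mismatches from phase wrapping.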
example, Case 3 phase equalization could be combined with Case 2 amplitude correction and so forth. Ideally, the number of frequency points considered within each subchannel is not fixed in advance, but can be individually determined for each subchannel based on the frequency domain channel estimates of each data block. This enables the structure of each subchannel equalizer to be controlled such that the associated subchannel response is equalized optimally at the minimum number of frequency points which can be expected to result in sufficient performance. The CFIR-SCE cannot provide such mixed cases.

Also further cases could be considered since additional frequency points are expected to result in better performance when the subband channel response is more selective. However, this comes at the cost of increased complexity in processing the data samples and much more complicated formulas for obtaining the equalizer coefficients.

For Case 3 structure, CFIR-SCE and AP-SCE equalizer coefficients can be calculated by evaluating (23) and (26), and (28), respectively, at the frequency points of interest, setting them equal to the target values, and solving the resulting system of equations for the equalizer coefficients:

CFIR-SCE:

$$
\begin{aligned}
c_{-1k} &= \frac{\gamma}{4}\bigl((\chi_{0k}-\chi_{2k}) \mp j(2\chi_{1k}-\chi_{0k}-\chi_{2k})\bigr),\\
c_{0k} &= \frac{\gamma}{2}\bigl(\chi_{0k}+\chi_{2k}\bigr),\\
c_{1k} &= \frac{\gamma}{4}\bigl((\chi_{0k}-\chi_{2k}) \pm j(2\chi_{1k}-\chi_{0k}-\chi_{2k})\bigr);
\end{aligned}
\tag{29}
$$

AP-SCE:

$$
\begin{aligned}
\varphi_{0k} &= \frac{\xi_{0k}+\xi_{2k}}{2}, &
a_{0k} &= \frac{\gamma}{4}\bigl(\epsilon_{0k}+2\epsilon_{1k}+\epsilon_{2k}\bigr),\\
b_{ck} &= \pm\tan\!\left(\frac{\xi_{2k}-\xi_{0k}}{4}\right), &
a_{1k} &= \pm\frac{\gamma}{4}\bigl(\epsilon_{0k}-\epsilon_{2k}\bigr),\\
b_{rk} &= \pm\tan\!\left(\frac{\xi_{1k}-\varphi_{0k}}{2}\right), &
a_{2k} &= \frac{\gamma}{8}\bigl(\epsilon_{0k}-2\epsilon_{1k}+\epsilon_{2k}\bigr).
\end{aligned}
\tag{30}
$$

Here the ± signs are again for the even/odd subchannels, respectively, and $\chi_{ik}$, $\xi_{ik}$, and $\epsilon_{ik}$, i = 0, ..., 2, are the complex target response, the target phase, and amplitude response values at the three considered frequency points for subchannel k. The value i = 1 corresponds to the subchannel center frequency whereas values i = 0 and i = 2 refer to the lower and upper passband edge frequencies, respectively. With MSE criterion,

$$
\chi_{ik} = \frac{H_{ch}^{*}\bigl(e^{j(2k+i)(\pi/2M)}\bigr)}{\bigl|H_{ch}\bigl(e^{j(2k+i)(\pi/2M)}\bigr)\bigr|^{2} + \eta},
\qquad
\xi_{ik} = \arg\bigl(\chi_{ik}\bigr), \qquad \epsilon_{ik} = \bigl|\chi_{ik}\bigr|,
\tag{31}
$$

where $H_{ch}$ is the channel frequency response in the baseband model of the overall system. The effect of noise enhancement is incorporated into the solution of the equalizer parameters using the noise-to-signal ratio η and a scaling factor $\gamma = 3/\sum_{i=0}^{2}\chi_{ik}H_{ch}(e^{j(2k+i)(\pi/2M)})$ that normalizes the subchannel signal power to avoid any scaling in the symbol values used for detection. In the case of ZF criterion, η = 0 and γ = 1.

The operation of the ZF-optimized amplitude and phase equalizer sections of Case 3 AP-SCE is illustrated with randomly selected subchannel responses in Figures 6 and 7, respectively.

In Case 2, MSE-optimized coefficients for CFIR-SCE and AP-SCE amplitude equalizer can be calculated as

$$
\begin{aligned}
c_{0k} &= \frac{\gamma}{2}\bigl(\chi_{0k}+\chi_{2k}\bigr), &
a_{0k} &= \frac{\gamma}{2}\bigl(\epsilon_{0k}+\epsilon_{2k}\bigr),\\
c_{1k} &= \pm\frac{\gamma}{2}\bigl(\chi_{0k}-\chi_{2k}\bigr), &
a_{1k} &= \pm\frac{\gamma}{4}\bigl(\epsilon_{0k}-\epsilon_{2k}\bigr),
\end{aligned}
\tag{32}
$$

where $\gamma = 2/\bigl(\chi_{0k}H_{ch}(e^{j(k\pi/M)}) + \chi_{2k}H_{ch}(e^{j(2k+2)(\pi/2M)})\bigr)$. The AP-SCE phase equalizer coefficients $\varphi_{0k}$ and $b_{ck}$ can be obtained as in Case 3.

Case 1 equalizers are obtained as special cases of the used structures, including only a single complex coefficient for CFIR-SCE and an amplitude scaling factor and a phase rotator for AP-SCE. It is natural to calculate these coefficients based on the frequency response values at the subchannel center frequencies, that is,

$$
c_{0k} = \chi_{1k}, \qquad a_{0k} = \bigl|\chi_{1k}\bigr|, \qquad \varphi_{0k} = \arg\bigl(\chi_{1k}\bigr),
\tag{33}
$$

with η = 0, since MSE and ZF solutions are the same.
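The Case 3 AP-SCE formulas (30) and (31) can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes an even subchannel whose three frequency points map to ω = 0, π/2, and π at the subband rate, it treats the equalizer as a plain product of the phase rotator, the two allpass sections (24) and (25), and the 5-tap symmetric FIR (ignoring the real-part operation), and the channel values and function names are made up for illustration.

```python
import cmath
import math

def ap_sce_case3(h, eta=0.0, even=True):
    """Case 3 AP-SCE coefficients from channel values h[0..2] taken at the
    lower edge, centre and upper edge of one subchannel, per (30)-(31)."""
    chi = [hi.conjugate() / (abs(hi) ** 2 + eta) for hi in h]    # (31)
    gamma = 3.0 / sum((c * hi).real for c, hi in zip(chi, h))    # chi*h is real
    xi = [cmath.phase(c) for c in chi]      # target phases
    eps = [abs(c) for c in chi]             # target amplitudes
    s = 1.0 if even else -1.0               # upper/lower sign of "±" in (30)
    phi0 = (xi[0] + xi[2]) / 2
    bc = s * math.tan((xi[2] - xi[0]) / 4)
    br = s * math.tan((xi[1] - phi0) / 2)
    a0 = gamma / 4 * (eps[0] + 2 * eps[1] + eps[2])
    a1 = s * gamma / 4 * (eps[0] - eps[2])
    a2 = gamma / 8 * (eps[0] - 2 * eps[1] + eps[2])
    return phi0, bc, br, (a0, a1, a2)

def equalizer_response(coeffs, w):
    """Product of the section responses at frequency w (assumed model)."""
    phi0, bc, br, (a0, a1, a2) = coeffs
    z = cmath.exp(1j * w)
    h_c = (1 - 1j * bc * z) / (1 + 1j * bc / z)    # complex allpass, (25)
    h_r = (1 + br * z) / (1 + br / z)              # real allpass, (24)
    amp = a0 + 2 * a1 * math.cos(w) + 2 * a2 * math.cos(2 * w)
    return cmath.exp(1j * phi0) * h_c * h_r * amp

# Hypothetical channel values at the three frequency points (illustration only).
h = [1.0 + 0.2j, 0.8 - 0.3j, 0.5 + 0.4j]
coeffs = ap_sce_case3(h)                           # ZF: eta = 0, gamma = 1
combined = [equalizer_response(coeffs, w) * hi
            for w, hi in zip((0.0, math.pi / 2, math.pi), h)]
```

Under these assumptions and the ZF criterion, each entry of `combined` (channel times equalizer) should equal 1 up to rounding, which is the behavior the combined-response curves of Figures 6 and 7 illustrate at the target points.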
[Figure 6 plot: amplitude in linear scale versus normalized frequency (Fs/2). Curves: channel response, equalizer target points $\epsilon_i$ ($\epsilon_0$, $\epsilon_1$, $\epsilon_2$), equalizer amplitude response, and combined response of channel and equalizer.]
Figure 6: Operation of the ZF-optimized Case 3 amplitude equalizer section.

[Figure 7 plot: phase (degrees) versus normalized frequency (Fs/2). Curves: channel response, equalizer target points $\xi_i$ ($\xi_0$, $\xi_1$, $\xi_2$), equalizer phase response, and combined response of channel and equalizer.]
Figure 7: Operation of the Case 3 phase equalizer section.

5. NUMERICAL RESULTS

The performance of the low-complexity subcarrier equalizers was evaluated with different numbers of subchannels both semianalytically and using full simulations in time domain. First, basic results are reported to illustrate how the performance depends on the number of subcarriers and the equalizer design case. Also the reliability of the semianalytical model is examined and the differences between ZF and MSE criteria are compared. Finally, more complete simulations with error control coding are reported and compared to an OFDM reference in a realistic simulation environment. Also sensitivity to timing and frequency offsets and performance with practical transmitter power amplifiers are investigated. We consider equally spaced real 2-PAM, 4-PAM, and 8-PAM constellations for FBMC and complex square-constellations QPSK, 16-QAM, and 64-QAM in the OFDM case.

5.1. Semianalytical performance evaluation

Semianalytical simulations were carried out with the Vehicular-A power delay profile (PDP), defined by the recommendations of the ITU [32], for a 20 MHz signal bandwidth. These simulations were performed in quasi-static conditions, that is, the channel was time-invariant during each transmitted frame. Perfect channel information was assumed. In all the simulations, the average channel power gain was scaled to unity. Performance was tested with filter banks consisting of 2M = {64, 128, 256} subchannels. The filter bank designs used roll-off ρ = 1.0 and overlapping factor K = 5 resulting in about 50 dB stopband attenuation. The statistics are based on 2000 frame transmissions for each of which an independent channel realization was considered. The semianalytical results were obtained by calculating the subcarrierwise ICI and ISI powers $P_k^{ICI}$ and $P_k^{ISI}$, respectively, together with noise gains $\beta_k^{n}$ for k = 0, 1, ..., 2M − 1. These were then used to determine the subcarrierwise SINR-values, as a function of channel $E_b/N_0$-values, for all the channel instances. The uncoded BER results were obtained for 2-, 4-, and 8-PAM modulations by evaluating first the theoretical subcarrierwise BERs based on the SINR-values using the Q-function and Gray-coding assumption, and finally averaging the BER over all the subchannels and 2000 channel instances.

5.1.1. Basic results for AP-SCE

The comparison in Figure 8(a) for ZF 4-PAM shows that the time domain simulation-based (Sim) and semi-analytic model-based (SA) results match quite well. This encourages carrying out system performance evaluations, especially in the algorithm development phase, mostly using the semianalytical approach, which is computationally much faster. Time domain simulation results in Figure 8(b) for 4-PAM indicate that the performance difference of ZF and MSE criteria is rather small. Figures 8(c) and 8(d) show the semi-analytic results for 2-PAM and 8-PAM, respectively, using the ZF criterion. It can be observed that higher-order AP-SCE improves the equalizer performance significantly, allowing the use of a lower number of subcarriers. Also ideal OFDM performance (without guard interval overhead) is shown as a reference. With the aid of the AP-SCE equalizer, the performance of FBMC with a modest number of subcarriers can be made to approach that of the ideal OFDM.

5.1.2. Comparison of CFIR-FBMC and AP-FBMC

In the other simulations, it is assumed that the receiver is time-synchronized such that the first path corresponds to
[Figure 8 plots, panels (a)-(d): BER (log scale, $10^{0}$ to $10^{-3}$) versus $E_b/N_0$ (0 to 25 dB). Panel titles: (a) and (b) 4-PAM/16-QAM; (c) 2-PAM/QPSK; (d) 8-PAM/64-QAM. Curves: 2M = 64, 128, and 256 with Case 1 and Case 3 equalizers (Sim and SA in panel (a), ZF and MSE in panel (b)), and ideal OFDM.]
Figure 8: Uncoded BER results for AP-SCE with a quasi-static ITU-R Vehicular-A channel model and 20 MHz bandwidth. (a) Comparison
of time domain simulations (Sim) and semi-analytic model (SA) for ZF 4-PAM. (b) Comparison of ZF and MSE criteria with 4-PAM based
on time domain simulations. (c) Semi-analytic performance of ZF 2-PAM. (d) Semi-analytic performance of ZF 8-PAM. Ideal OFDM (using
corresponding square-constellation QAM, without guard interval overhead) included in all figures as a reference.
zero delay. Figure 9, however, shows a semi-analytic BER comparison of the two subcarrier equalizer structures for 2M = 256 subchannels when the effect of time synchronization error is considered. Simulations were carried out with a quasi-static channel model based on the extended ITU-R Vehicular-A PDP [33]. Simulation result statistics are based on 2000 independent channel instances of this model and the MSE criterion was used in the derivation of the amplitude equalizer coefficients. Figure 9 shows the performance in two cases: with delays of 0 and 64 samples, corresponding to 0 and 50% of the subcarrier symbol interval, respectively. It is seen that with 0 timing offset, CFIR-SCE and AP-SCE have very similar performance. However, AP-SCE is clearly more robust in the presence of timing offset. Especially with high-order modulations, the performance of CFIR-SCE is significantly degraded when the timing error approaches half of the subcarrier symbol interval. AP-SCE is very robust in this sense, and the results demonstrate that FBMC with AP-SCE can be operated without timing synchronization prior to the receiver filter bank.

[Figure 9 plots, panels (a)-(c): BER (log scale) versus $E_b/N_0$ (5 to 25 dB) for (a) 2-PAM, (b) 4-PAM, and (c) 8-PAM. Curves: CFIR-SCE Case 3 and AP-SCE Case 3, each with d = 0 and d = 0.5, and ideal OFDM.]
Figure 9: Semi-analytic BER in AP-SCE and CFIR-SCE. Parameter d = timing offset/subcarrier symbol interval.

Figure 10 shows the signal-to-interference ratio (SIR) performance in case of an ideal channel with timing offset only. Here, Case 2 AP-SCE includes only the first-order complex allpass and phase rotation; the real allpass does not have any effect in this case. Figure 10 was obtained in the 2M = 256 subcarrier case, but it was observed that with other filter bank sizes, the behavior in terms of relative timing offset is very similar. It is seen that Case 3 CFIR-SCE gives clearly better performance than simple phase rotation (Case 1), and with timing offsets approaching half of the symbol interval, Case 2 AP-SCE has 3 dB better performance than Case 3 CFIR-SCE. This is in accordance with the findings in [34], where it is observed that allpass IIR structures provide performance gain in fractional delay compensation compared to FIR structures with similar complexity.

[Figure 10 plot: SIR (dB) versus timing offset/symbol interval (0 to 0.5). Curves: AP-SCE, Case 2; CFIR-SCE, Case 3; AP/CFIR-SCE, Case 1.]
Figure 10: Semi-analytic SIR due to timing phase offset in AP-SCE and CFIR-SCE in an ideal channel.

5.2. Performance comparisons with channel coding

5.2.1. Channel model, system parameters, and OFDM reference

We have also carried out full simulations in time domain comparing cyclic prefix OFDM and FBMC. It was of particular interest to evaluate the performance of FBMC with AP-SCE and CFIR-SCE per-subcarrier equalizers and to explore the potential spectral efficiency gain. Time-variant radio channel impairments were modeled based on the extended ITU-R Vehicular-A PDP [33] (maximum excess delay of 2.51 μs). This upgraded channel model has been shown to improve the frequency correlation properties when compared to the original PDP, making it better suited for evaluation of wideband transmission with frequency-dependent characteristics. Mobile velocity of 50 km/h and carrier frequency of 5 GHz were assumed. With sampling rate of 26.88 MHz (7× WCDMA chip rate), 616 subcarriers of 1024 in OFDM and 84/168/672 subchannels of 128/256/1024 in FBMC were activated to obtain systems with the same effective bandwidth of 18 MHz (at 40 dB below passband level). This corresponds to subchannel bandwidths of 26.25 kHz and 210/105/26.25 kHz, respectively. 2-, 4-, and 8-PAM modulations were considered for FBMC whereas QPSK, 16-QAM, and 64-QAM were used for OFDM. The FBMC design used roll-off ρ = 1.0 and overlapping factor K = 5 resulting in a stopband attenuation (defined as the level of the highest sidelobe) of about 50 dB (for 2-PAM/QPSK comparison also K = 3 was considered, giving stopband attenuation of about 38 dB). Channel coding
was performed using low-density parity check (LDPC) coding [35]. The maximum number of iterations in iterative decoding was set to ten. About 10% overhead for pilot carriers is assumed in OFDM and similar overhead for training sequences in FBMC. OFDM has 41.67 μs overall symbol duration, with 2.53 μs guard interval and 1.04 μs raised-cosine roll-off for spectral shaping. Both systems have a single zero power subcarrier in the middle of the spectrum to facilitate receiver implementation. The information bit rates in the two systems were approximately matched using code rates of R = 3/4 and R = 2/3 for OFDM and FBMC, respectively. Bits for a single frame to be transmitted were coded in blocks of 3348 and 3990 bits, respectively, after which all the coded bits of a frame were randomly interleaved before bits-to-symbols and symbols-to-subcarriers mappings. The resulting numbers of source bits in a fixed frame duration of 250 μs are 5022 and 5320 for QPSK/OFDM and 2-PAM/FBMC, respectively. Ideal channel estimation was assumed for both OFDM and FBMC modulations. Simulation result statistics are based on 5000 transmitted frames for each of which an independent realization of the channel model was applied. MSE optimization criterion was used to derive the amplitude equalizer parameters.

5.2.2. Coded results

Figures 11(a), 11(b), and 11(c) show the obtained results for 2-PAM/QPSK, 4-PAM/16-QAM, and 8-PAM/64-QAM comparisons, respectively. Coded frame error rate (FER) and BER are shown as a function of required energy per source bit to noise spectral density ratio. Due to the absence of time domain guard interval and reduced frequency domain guard-bands, higher spectral efficiency in FBMC is achieved. This excess transmission capacity can be used to transmit more redundant data (lower coding rate) while maintaining similar information data rate compared to OFDM. This works in favor of FBMC in the FER/BER performance comparison as somewhat less energy in FBMC is sufficient to result in similar error probability compared to OFDM. Alignment of the performance curves for K = 3 and K = 5 in Figure 11(a) indicates that at least in narrowband interference-free conditions, FBMC design with K = 3 (and possibly even K = 2) provides sufficient performance with reduced complexity compared to K = 5.

5.2.3. Effect of AP-SCE structure and parameters

The ability of AP-SCE/CFIR-SCE equalizer to compensate for mildly frequency selective subchannel responses is clearly visible in the simulation results. FBMC of 2M = 256 subchannels with Case 3 AP-SCE/CFIR-SCE follows the performance curves obtained with the structure consisting of 2M = 1024 subchannels with Case 1 equalizer. So, great reduction in the number of subchannels required to cover the 18 MHz signal band can be achieved with higher-order AP-SCE/CFIR-SCE structures. In case of 2-PAM modulation even a filter bank with 2M = 128 subchannels can be considered. For 4-PAM and 8-PAM, 2M = 256 subchannels are required to keep the performance benefit with respect to the OFDM reference.

5.3. Performance with nonlinear power amplifier

The ratio between the maximum instantaneous power of a signal and its mean power (PAPR) is proportional to the number of subcarriers and also depends on the modulation constellation used. This is a matter of concern when the signal passes through a nonlinear device such as the power amplifier (PA). In this situation, signal components of different instantaneous power might be amplified differently, introducing distortion to the signal and causing spectral regrowth to the bands adjacent to the signal. In this section, we focus on the spectral regrowth caused by a PA on FBMC and OFDM with similar parameters as in the time domain BER simulations. We apply time domain raised-cosine windowing of 28 samples to the OFDM signal in order to assure attenuation of 40 dB for the signal at 9 MHz from the carrier frequency. Therefore, the overall 40 dB bandwidth for OFDM and FBMC is 18 MHz. The PA follows the solid state power amplifier (SSPA) model that can be found in [36]. Only amplitude nonlinearity is taken into account. The amplitude gain is given by

$$p_o = \frac{p_i}{\sqrt{1 + \bigl(p_i/p_{sat}\bigr)^{2}}}, \tag{34}$$

where $p_i$ and $p_o$ are the amplitude of the PA input signal and output signal, respectively, and $p_{sat}$ denotes the saturation voltage of the PA. The spectral regrowth is measured as a function of the input back-off (IBO) of the input signal at the amplifier. In Figure 12 we show the regrowth of the spectra of FBMC (dashed lines) and OFDM (continuous lines). For FBMC we simulate IBOs that are 1.2 dB higher than for OFDM. This reflects the fact that for a similar coded BER performance we can use an FBMC signal with 1.2 dB less power than OFDM. We can see from the figure that it is advantageous to be able to use a weaker signal, since close to the desired passband we obtain less spectral regrowth. At more distant frequencies, the OFDM spectrum decays faster because the useful bandwidth is smaller than the useful bandwidth in FBMC (16.2 MHz versus 17.6 MHz). OFDM with a comparable useful bandwidth (672 active subcarriers) has a spectral decay profile similar to FBMC's. Moreover, at the same IBOs and same useful bandwidths, both systems show very similar regrowth curves.

5.4. Frequency offset

In multicarrier transmissions, frequency offsets (e.g., due to Doppler and inaccuracy of local oscillators in the transmission chain) introduce ICI. In case of a fixed frequency offset in OFDM, the SIR due to the resulting ICI is given by [37]

$$
\mathrm{SIR} = \frac{1}{\sin^{2}(\pi\Delta f)\displaystyle\sum_{p=0,\;p\neq N_c/2}^{N_c-1} \frac{1}{\bigl(N_c \sin\bigl(\pi(p+\Delta f)/N_c\bigr)\bigr)^{2}}},
\tag{35}
$$
[Figure 11 plots, panels (a)-(c): coded FER and BER ($P_e$, log scale from $10^{0}$ to $10^{-4}$) versus $E_b/N_0$ (0 to 20 dB); upgraded ITU-R Veh-A channel at 50 km/h, 5000 frames, LDPC coding. Panel titles: (a) 2-PAM/QPSK; (b) 4-PAM/16-QAM; (c) 8-PAM/64-QAM. Legend entries: OFDM reference; FBMC with 2M = 128 or 256 (AP or CFIR, Case 3); in panels (a) and (b) also FBMC with 2M = 1024 (AP, Case 1). In panel (a), dotted lines denote K = 3, solid lines K = 5, and dashed lines CFIR.]
Figure 11: Coded FER and BER performance: (a) 2-PAM/FBMC and QPSK/OFDM; (b) 4-PAM/FBMC and 16-QAM/OFDM; and (c) 8-PAM/FBMC and 64-QAM/OFDM.
where $N_c$ and $\Delta f$ are the number of subcarriers and frequency offset, respectively. The effects of frequency offsets in FBMC were tested with a simple simulation experiment by measuring the mean squared error in symbol detection with a set of fixed frequency offsets. The results are shown and compared to the OFDM performance in Figure 13. Here $N_c$ = 256 for both systems.

Basically, the frequency offset introduces a time-varying phase offset, which is common to all subcarriers. In the simulation, as well as in the analytical results for OFDM, the constant part of the common phase offset is assumed to be cancelled by the channel equalizer such that in the middle of each symbol the phase error of each subcarrier is zero.
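To give a feel for the orders of magnitude involved in OFDM, the sketch below evaluates the textbook DFT leakage model for a fixed frequency offset. This is an illustrative assumption, not a transcription of the closed form from [37]: the indexing and normalization may differ from those used in the analytical OFDM results reported here, and the function name and the offset grid are hypothetical.

```python
import math

def ofdm_sir_db(delta_f, n_c=256):
    """SIR in dB caused by ICI for a frequency offset delta_f, given as a
    fraction of the subcarrier spacing (delta_f must be nonzero)."""
    def leak(p):
        # Squared magnitude of the DFT leakage from a subcarrier p bins away.
        x = math.pi * (p + delta_f)
        return (math.sin(x) / (n_c * math.sin(x / n_c))) ** 2
    signal = leak(0)                              # desired subcarrier
    ici = sum(leak(p) for p in range(1, n_c))     # all other subcarriers
    return 10 * math.log10(signal / ici)

offsets = [0.02, 0.05, 0.1, 0.2]                  # illustrative grid
sir = [ofdm_sir_db(d) for d in offsets]
```

Because only magnitudes enter, the model implicitly assumes that the common phase rotation is removed, in line with the assumption stated above. In this model the SIR drops steadily as the offset grows, and for $N_c$ = 256 it is a little under 15 dB at an offset of one tenth of the subcarrier spacing.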
[Figure 12 plot: PSD (dB) versus frequency (MHz, −30 to 30). Curves: FB with no PA and with IBO = 1.2, 7.2, 13.2, and 19.2 dB; windowed OFDM with no PA and with IBO = 0, 6, 12, and 18 dB.]
Figure 12: Spectral regrowth due to PA nonlinearity.

[Figure 13 plot: SIR (dB) versus frequency offset as a fraction of subcarrier spacing (0 to 0.2). Curves: OFDM and FBMC.]
Figure 13: SIR due to frequency offset in OFDM and FBMC.

It can be seen from Figure 13 that for a given relative (with respect to subcarrier spacing) frequency offset, the FBMC SIR performance is slightly better but within 2 dB from the OFDM performance. Since FBMC allows significantly wider subcarrier spacing, the relative frequency offsets are smaller, and there is a clear performance benefit for FBMC in terms of frequency offset effects. This indicates also a potential for better performance in case of fast fading.

Table 1: Multiplications in receiver per one detected complex symbol in OFDM and per two detected real symbols in FBMC.

                                      Case 1   Case 3
OFDM, 1k-FFT                            10       -
FBMC, K = 2, 2M = 128, AP-SCE           20       30
FBMC, K = 5, 2M = 256, AP-SCE           34       44
FBMC, K = 2, 2M = 128, CFIR-SCE         20       28
FBMC, K = 5, 2M = 256, CFIR-SCE         34       42
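The entries of Table 1 can be reproduced from the multiplication counts quoted in Section 5.5. The sketch below is a hedged reading of those counts rather than the authors' own calculation: it assumes that a block of M high-rate samples yields 2M real symbols (M pairs), that all subcarriers are in use, and that a Case 1 CFIR-SCE costs 2 real multiplications per real symbol (inferred from the table, not stated in the text). The function names are hypothetical.

```python
import math

def ofdm_mults(n_fft=1024, eq_mults=3):
    """Real multiplications per detected complex symbol in OFDM."""
    fft = n_fft * (math.log2(n_fft) - 3) + 4      # split-radix FFT, per block
    return fft / n_fft + eq_mults

def fbmc_mults(two_m, k, eq_mults_per_real_symbol):
    """Real multiplications per two detected real symbols in FBMC."""
    m = two_m // 2
    bank = two_m * (2 * k - 2 + math.log2(m))     # analysis bank, per M samples
    return bank / m + 2 * eq_mults_per_real_symbol

AP = {"Case 1": 2, "Case 3": 7}      # AP-SCE cost per real symbol
CFIR = {"Case 1": 2, "Case 3": 6}    # Case 1 value is an assumption

table = {
    "OFDM, 1k-FFT": round(ofdm_mults()),
    "FBMC, K=2, 2M=128, AP-SCE": [fbmc_mults(128, 2, AP[c]) for c in AP],
    "FBMC, K=5, 2M=256, AP-SCE": [fbmc_mults(256, 5, AP[c]) for c in AP],
    "FBMC, K=2, 2M=128, CFIR-SCE": [fbmc_mults(128, 2, CFIR[c]) for c in CFIR],
    "FBMC, K=5, 2M=256, CFIR-SCE": [fbmc_mults(256, 5, CFIR[c]) for c in CFIR],
}
```

With these assumptions the computed values match Table 1; the OFDM figure is 7172/1024 + 3, which rounds to 10.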
5.5. Complexity

In this subsection, a rough evaluation of the computational complexity of FBMC is presented, using a simple complexity measure: the number of real multiplications required to detect a symbol. We focus on the receiver side where the OFDM FFT or FBMC analysis bank and the equalizer are the main processing blocks. Channel estimation and calculation of the equalizer coefficients are not included in this evaluation.

One of the most efficient algorithms for implementing DFT is the split radix FFT algorithm [18], taking $M(\log_2(M) - 3) + 4$ real multiplications for a block of M complex samples. In the OFDM case, 3 real multipliers are enough to do the complex multiplication to equalize each of the used subcarriers.

In the FBMC case, the FFT-based algorithm presented in [26] is the most efficient one to implement the oversampled analysis bank in terms of multiplication rate. It requires $2M(2K - 2 + \log_2(M))$ real multiplications for a block of M high-rate samples. In an efficient implementation, the AP-SCE subcarrier equalizers take 2, 5, and 7 real multiplications in Cases 1, 2, and 3, respectively, per detected real symbol. Alternatively, the 3-tap CFIR-SCE structure takes 6 real multiplications per detected real symbol.

For a fair comparison, we calculate the overall number of multiplications per detected complex symbol in the OFDM case and per two detected real symbols in the FBMC case. For simplicity, it is assumed that all the subcarriers are in use. The resulting overall multiplication rates with the two extreme cases of FBMC-complexity are shown in Table 1.

It is observed that with this complexity measure, FBMC is more complex than the basic OFDM system. However, the implementation of FBMC is still quite realistic with today's efficient digital signal processors or dedicated very large scale integration (VLSI) hardware. It is expected that there are many possibilities to optimize the EMFB implementation in dedicated hardware, using short coefficient word-lengths, sums of powers of two implementations for coefficients, and so forth.

Furthermore, it can be noted that due to the larger block-size, OFDM requires significantly bigger data memory and coefficient storage in processor-based implementations.

One quite significant aspect is the needed baseband filtering in the receiver before the filter bank or FFT. The oversampled analysis bank acts as a high-quality channel selection filter, effectively suppressing adjacent channels and other interference components appearing in the range of the unused subcarriers. In the OFDM case, the attenuation capability
of the DFT is rather limited, regarding the adjacent channels and other out-of-band interference sources that are not synchronized to the guard interval structure. Therefore additional highly selective digital baseband filtering is usually needed in OFDM, especially if the frequency domain guard-bands between the adjacent frequency channels are to be minimized. Including the baseband filtering in the complexity comparison may change the measures significantly.

6. CONCLUSION

We have studied a new low-complexity per-subcarrier channel equalizer for FBMC transceiver for high-rate wideband communications over doubly-dispersive channel and analyzed its performance. It was shown that the coded error-rate performance of FBMC is somewhat better than that of the OFDM reference. It was also indicated that the performance of FBMC with a practical nonlinear power amplifier is similar to that of OFDM or even better. Further, FBMC is much less sensitive to frequency offsets than OFDM due to the possibility of using significantly lower number of subcarriers. The latter observation indicates also a potential for improved performance benefit in case of fast-fading channels. It was seen that an FBMC receiver can be operated without time synchronization prior to the receiver bank, also with higher-order modulations if AP-SCE is used. This leads to a simplified overall receiver structure.

The arithmetic complexity of FBMC is, no doubt, higher than that of the reference OFDM, but yet realistic with digital VLSI technologies. However, FBMC has various benefits over OFDM, like higher flexibility in choosing the design parameters. Especially, it can be emphasized that the receiver filter bank in FBMC acts as a high-quality frequency selective filter for all the signal components in the received band. This is in contrast to OFDM where transients are introduced to the signal components that are not synchronized to the guard interval structure, causing leakage of interference power also to noninterfered parts of the spectrum. Therefore, FBMC has the potential of providing sufficient attenuation for adjacent channels and other interferences, reducing the complexity of the baseband filtering of the receiver. On the other hand, this feature makes it feasible to have small frequency domain guard-bands between the asynchronous adjacent channel users, increasing the spectral efficiency. Selectivity provides also high robustness to narrowband interferences in the signal band, and a possibility to use multiple nonadjacent frequency slots for a single user in a flexible manner.

The numerical examples were mostly performed with high-end TMUX designs (K = 5) with relatively high stopband attenuation (about 50 dB). However, results with lower-complexity filter banks (K = 3) (about 38 dB) were also shown and even the case with K = 2 (about 30 dB) was tested. The conclusion is that even though the performance analysis assumed infinite attenuation for the subchannel stopbands, the performance degradation even with about 30 dB stopband attenuation is minor if the system does not need to suppress strong interfering signal components.

These results encourage further research on FBMC for beyond 3G communications. Such studies include development of robust synchronization and channel estimation techniques, as well as optimization of filter banks for low complexity with high flexibility. For example, efficient NPR filter bank designs form an interesting topic.

APPENDIX

ICI AND ISI RELATED IMPULSE RESPONSES IN THE BASEBAND MODEL

We first derive the equivalent baseband model for the lower ICI response using (1) and (5):

$$
\begin{aligned}
v_k[n] &= \bigl(f_{k-1}[n] * h_k[n]\bigr)e^{-j(nk\pi/M)}\\
&= \bigl(f_{k-1}[n]e^{-j(nk\pi/M)}\bigr) * \bigl(h_k[n]e^{-j(nk\pi/M)}\bigr)\\
&= \bigl(f_{k-1}[n]e^{-j(nk\pi/M)}e^{-j((M+1)k\pi/2M)}\bigr) * \bigl(h_k[n]e^{-j(nk\pi/M)}e^{+j((M+1)k\pi/2M)}\bigr)\\
&= \bigl(f_{k-1}[n]e^{-j(n+(M+1)/2)(k\pi/M)}\bigr) * \bigl(h_k[n]e^{j(-n+(M+1)/2)(k\pi/M)}\bigr)\\
&= f_{2M-1}[n] * h_0[n] = v_0[n].
\end{aligned}
\tag{A.1}
$$

Further,

$$
\begin{aligned}
v_0[n] &= f_{2M-1}[n] * h_0[n]\\
&= \bigl(-f_0^{c}[n] + j f_0^{s}[n]\bigr) * \bigl(h_0^{c}[n] - j h_0^{s}[n]\bigr)\\
&= -f_0^{c}[n] * h_0^{c}[n] + f_0^{s}[n] * h_0^{s}[n]\\
&\quad + j\bigl(f_0^{s}[n] * h_0^{c}[n] + f_0^{c}[n] * h_0^{s}[n]\bigr).
\end{aligned}
\tag{A.2}
$$

Applying the relationships between the sine/cosine modulated analysis/synthesis filters that can be found in [26] and which apply for a PR TMUX,

$$
\begin{aligned}
h_k^{s}[n] &= (-1)^{k+K} f_k^{c}[n],\\
f_k^{s}[n] &= (-1)^{k+K} h_k^{c}[n],
\end{aligned}
\tag{A.3}
$$

we finally obtain

$$
v_0[n] = j\bigl(f_0^{s}[n] * h_0^{c}[n] + f_0^{c}[n] * h_0^{s}[n]\bigr).
\tag{A.4}
$$

Although (A.2) and (A.4) use the noncausal form for the analysis filters, we can see that the introduced time delay does not affect (A.3) in any way. Further, since the delay is the same in both sine and cosine analysis filters, the real terms cancel each other out the same way they would do if causal expressions were used. Therefore, the lower ICI is purely imaginary in the baseband model. Likewise, we can write for the upper ICI response

$$
\begin{aligned}
u_k[n] &= \bigl(f_{k+1}[n] * h_k[n]\bigr)e^{-j(n(k+1)\pi/M)}\\
&= \bigl(f_{k+1}[n]e^{-j(n+(M+1)/2)((k+1)\pi/M)}\bigr) * \bigl(h_k[n]e^{j(-n+(M+1)/2)((k+1)\pi/M)}\bigr)\\
&= f_0[n] * h_{2M-1}[n] = u_{2M-1}[n].
\end{aligned}
\tag{A.5}
$$
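The cancellation of the real terms invoked for (A.4), and needed again in the next step, can be written out explicitly. As a short worked check, using only (A.3) with k = 0 and the commutativity of convolution:

$$
f_0^{s}[n] * h_0^{s}[n]
= \bigl((-1)^{K} h_0^{c}[n]\bigr) * \bigl((-1)^{K} f_0^{c}[n]\bigr)
= (-1)^{2K}\, f_0^{c}[n] * h_0^{c}[n]
= f_0^{c}[n] * h_0^{c}[n],
$$

so that $-f_0^{c}[n] * h_0^{c}[n] + f_0^{s}[n] * h_0^{s}[n] = 0$ and only the imaginary part of the response remains.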
Now

$$
\begin{aligned}
u_{2M-1}[n] &= \bigl(f_0^{c}[n] + j f_0^{s}[n]\bigr) * \bigl(-h_0^{c}[n] - j h_0^{s}[n]\bigr)\\
&= -f_0^{c}[n] * h_0^{c}[n] + f_0^{s}[n] * h_0^{s}[n]\\
&\quad - j\bigl(f_0^{c}[n] * h_0^{s}[n] + f_0^{s}[n] * h_0^{c}[n]\bigr)\\
&= -j\bigl(f_0^{c}[n] * h_0^{s}[n] + f_0^{s}[n] * h_0^{c}[n]\bigr),
\end{aligned}
\tag{A.6}
$$

so also the upper ICI response is purely imaginary.

For the impulse response of subchannel k, we can write in the baseband model

$$
\begin{aligned}
t_k[n] &= \bigl(f_k[n] * h_k[n]\bigr)e^{-j(nk\pi/M)}\\
&= \bigl(f_k[n]e^{-j(n+(M+1)/2)(k\pi/M)}\bigr) * \bigl(h_k[n]e^{+j(-n+(M+1)/2)(k\pi/M)}\bigr)\\
&= f_0[n] * h_0[n] = t_0[n].
\end{aligned}
\tag{A.7}
$$

ACKNOWLEDGMENTS

The authors would like to thank the anonymous reviewers and the associate editor for their constructive comments and suggestions that helped to improve the manuscript, both in presentation and contents. This research was partly funded by Nokia.

REFERENCES

[1] R. van Nee and R. Prasad, OFDM for Wireless Multimedia Communications, Artech House, Boston, Mass, USA, 2000.
[2] R. W. Chang, “Synthesis of band-limited orthogonal signals for multichannel data transmission,” Bell System Technical Journal, vol. 45, pp. 1775–1796, 1966.
[3] B. R. Saltzberg, “Performance of an efficient parallel data transmission system,” IEEE Transactions on Communications, vol. 15, no. 6, pp. 805–811, 1967.
[4] B. Hirosaki, “An analysis of automatic equalizers for orthogonally multiplexed QAM systems,” IEEE Transactions on Communications, vol. 28, no. 1, pp. 73–83, 1980.
[5] B. Le Floch, M. Alard, and C. Berrou, “Coded orthogonal frequency division multiplex,” Proceedings of the IEEE, vol. 83, no. 6, pp. 982–996, 1995.
[6] S. D. Sandberg and M. A. Tzannes, “Overlapped discrete multitone modulation for high speed copper wire communications,” IEEE Journal on Selected Areas in Communications, vol. 13, no. 9, pp. 1571–1585, 1995.
[7] A. Vahlin and N. Holte, “Optimal finite duration pulses for OFDM,” IEEE Transactions on Communications, vol. 44, no. 1, pp. 10–14, 1996.
[8] T. Wiegand and N. J. Fliege, “Equalizers for transmultiplexers in orthogonal multiple carrier data transmission,” in Proceedings of the European Signal Processing Conference (EUSIPCO ’96), vol. 2, pp. 1211–1214, Trieste, Italy, September 1996.
[9] S. Nedic, “An unified approach to equalization and echo cancellation in OQAM-based multi-carrier data transmission,” in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM ’97), vol. 3, pp. 1519–1523, Phoenix, Ariz, USA, November 1997.
[10] L. Vandendorpe, L. Cuvelier, F. Deryck, J. Louveaux, and O. van de Wiel, “Fractionally spaced linear and decision-feedback detectors for transmultiplexers,” IEEE Transactions on Signal Processing, vol. 46, no. 4, pp. 996–1011, 1998.
[11] K. Van Acker, G. Leus, M. Moonen, O. van de Wiel, and T. Pollet, “Per tone equalization for DMT-based systems,” IEEE Transactions on Communications, vol. 49, no. 1, pp. 109–119, 2001.
[12] P. Siohan, C. Siclet, and N. Lacaille, “Analysis and design of OFDM/OQAM systems based on filterbank theory,” IEEE Transactions on Signal Processing, vol. 50, no. 5, pp. 1170–1183, 2002.
[13] A. M. Wyglinski, P. Kabal, and F. Labeau, “Adaptive filterbank multicarrier wireless systems for indoor environments,” in Proceedings of the 56th IEEE Vehicular Technology Conference (VTC ’02), vol. 1, pp. 336–340, Vancouver BC, Canada, September 2002.
[14] B. Farhang-Boroujeny, “Multicarrier modulation with blind detection capability using cosine modulated filter banks,” IEEE Transactions on Communications, vol. 51, no. 12, pp. 2057–2070, 2003.
[15] B. Farhang-Boroujeny and L. Lin, “Analysis of post-combiner equalizers in cosine-modulated filterbank-based transmultiplexer systems,” IEEE Transactions on Signal Processing, vol. 51, no. 12, pp. 3249–3262, 2003.
[16] S.-M. Phoong, Y. Chang, and C.-Y. Chen, “DFT-modulated filterbank transceivers for multipath fading channels,” IEEE Transactions on Signal Processing, vol. 53, no. 1, pp. 182–192, 2005.
[17] M. Vetterli, “Perfect transmultiplexers,” in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP ’86), vol. 11, pp. 2567–2570, Tokyo, Japan, September 1986.
[18] H. S. Malvar, Signal Processing with Lapped Transforms, Artech House, Boston, Mass, USA, 1992.
[19] P. P. Vaidyanathan, Multirate Systems and Filter Banks, Prentice Hall, Englewood Cliffs, NJ, USA, 1993.
[20] T. Karp and N. J. Fliege, “Modified DFT filter banks with perfect reconstruction,” IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 46, no. 11, pp. 1404–1414, 1999.
[21] J. Alhava and M. Renfors, “Adaptive sine-modulated/cosine-modulated filter bank equalizer for transmultiplexers,” in Proceedings of the European Conference on Circuit Theory and Design (ECCTD ’01), vol. 3, pp. 337–340, Espoo, Finland, August 2001.
[22] T. Ihalainen, T. Hidalgo Stitz, and M. Renfors, “On the performance of low-complexity ASCET-equalizer for a complex transmultiplexer in wireless mobile channel,” in Proceedings of the 7th International OFDM-Workshop, Hamburg, Germany, September 2002.
[23] T. Ihalainen, T. Hidalgo Stitz, and M. Renfors, “Efficient per-carrier channel equalizer for filter bank based multicarrier systems,” in Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS ’05), pp. 3175–3178, Kobe, Japan, May 2005.
[24] T. Ihalainen, T. Hidalgo Stitz, and M. Renfors, “Performance comparison of LDPC-coded FBMC and CP-OFDM in beyond 3G context,” in Proceedings of IEEE International Symposium
on Circuits and Systems (ISCAS ’06), pp. 2049–2052, Kos, Tobias Hidalgo Stitz was born in 1974 in
Greece, May 2006. Eschwege, Germany. He obtained the M.S.
degree in telecommunications engineering
[25] Y. Yuan, T. Ihalainen, M. Rinne, and M. Renfors, “Frequency
from the Polytechnic University of Madrid
domain equalization in single carrier transmission: filter bank
(UPM) in 2001, after writing his Masters
approach,” accepted to EURASIP Journal on Applied Signal
Thesis at the Institute of Communications
Processing.
Engineering of the Tampere University of
[26] A. Viholainen, J. Alhava, and M. Renfors, “Efficient imple- Technology (TUT). From 1999 to 2001, he
mentation of 2x oversampled exponentially modulated filter was Research Assistant at TUT and is now
banks,” IEEE Transactions on Circuits and Systems II, vol. 53, working towards his doctoral degree there.
pp. 1138–1142, 2006. His research interests include wireless communications based on
[27] J. Alhava and M. Renfors, “Complex lapped transforms and multicarrier systems, especially focusing on filter bank based sys-
modulated filter banks,” in Proceedings of the 2nd International tems and other filter bank applications for signal processing.
Workshop on Spectral Methods and Multirate Signal Processing
(SMMSP ’02), pp. 87–94, Toulouse, France, September 2002. Mika Rinne received his M.S. degree from
[28] A. Viholainen, T. Hidalgo Stitz, J. Alhava, T. Ihalainen, and M. TUT in signal processing and computer sci-
Renfors, “Complex modulated critically sampled filter banks ence, in 1989. He acts as Principal Scien-
based on cosine and sine modulation,” in Proceedings of IEEE tist in the Radio Technologies Laboratory of
International Symposium on Circuits and Systems (ISCAS ’02), Nokia Research Center. His background is
vol. 1, pp. 833–836, Scottsdale, Ariz, USA, May 2002. in research of multiple-access methods, ra-
dio resource management, and implemen-
[29] A. Viholainen, J. Alhava, and M. Renfors, “Efficient imple-
tation of packet decoders for radio commu-
mentation of complex modulated filter banks using cosine and
nication systems. Currently, his interests are
sine modulated filter banks,” Eurasip Journal on Applied Signal
in research of protocols and algorithms for
Processing, vol. 2006, Article ID 58564, 10 pages, 2006.
wireless communications including WCDMA, long-term evolution
[30] E. A. Lee and D. G. Messerschmitt, Digital Communication,
of 3G and beyond 3G systems.
Kluwer Academic, Boston, Mass, USA, 2nd edition, 1994.
[31] J. G. Proakis, Digital Communications, McGraw-Hill, New Markku Renfors was born in Suoniemi,
York, NY, USA, 3rd edition, 1995. Finland, on January 21, 1953. He received
[32] ITU-R, “Guidelines for evaluation of radio transmission tech- the Diploma Engineer, Licentiate of Tech-
nologies for IMT-2000,” Recommendation M.1225, 1997. nology, and Doctor of Technology degrees
[33] T. B. Sorensen, P. E. Mogensen, and F. Frederiksen, “Extension from (TUT), Tampere, Finland, in 1978,
of the ITU channel models for wideband (OFDM) systems,” in 1981, and 1982, respectively. From 1976 to
Proceedings of the 62nd IEEE Vehicular Technology Conference 1988, he held various research and teach-
(VTC ’05), vol. 1, pp. 392–396, Dallas, Tex, USA, September ing positions at TUT. From 1988 to 1991,
2005. he was a Design Manager at the Nokia Re-
[34] T. I. Laakso, V. Vālimāki, M. Karjalainen, and U. K. Laine, search Center and Nokia Consumer Elec-
“Splitting the unit: delay tools for fractional delay filter de- tronics, Tampere, Finland, where he focused on video signal pro-
sign,” IEEE Signal Processing Magazine, vol. 13, no. 1, pp. 30– cessing. Since 1992, he has been a Professor and Head of the In-
60, 1996. stitute of Communications Engineering at TUT. His main research
areas are multicarrier systems and signal processing algorithms for
[35] R. G. Gallager, “Low-density parity-check codes,” IRE Trans-
flexible radio receivers and transmitters.
actions on Information Theory, vol. 8, pp. 21–28, 1962.
[36] C. Rapp, “Effects of HPA-nonlinearity on a 4-DPSK/OFDM-
signal for a digital sound broadcasting system,” in Proceedings
of the 2nd European Conference on Satellite Communications,
pp. 179–184, Liege, Belgium, October 1991.
[37] P. H. Moose, “Technique for orthogonal frequency division
multiplexing frequency offset correction,” IEEE Transactions
on Communications, vol. 42, no. 10, pp. 2908–2914, 1994.
Tero Ihalainen received his M.S. degree in electrical engineering from Tampere University of Technology (TUT), Finland, in 2005. Currently, he is a Researcher and a Postgraduate student at the Institute of Communications Engineering at TUT, pursuing a doctoral degree. His main research interests are digital signal processing algorithms for multicarrier and frequency-domain equalized single-carrier modulation-based wireless communications, especially applications of multirate filter banks.
doi:10.1155/2007/10438
Research Article
Frequency-Domain Equalization in Single-Carrier
Transmission: Filter Bank Approach
Yuan Yang,1 Tero Ihalainen,1 Mika Rinne,2 and Markku Renfors1

1 Institute of Communications Engineering, Tampere University of Technology, P.O. Box 553, 33101 Tampere, Finland
2 Nokia Research Center, P. O. Box 407, Helsinki 00045, Finland
Received 12 January 2006; Revised 24 August 2006; Accepted 14 October 2006
Recommended by Yuan-Pei Lin
This paper investigates the use of complex-modulated oversampled filter banks (FBs) for frequency-domain equalization (FDE) in
single-carrier systems. The key aspect is mildly frequency-selective subband processing instead of a simple complex gain factor per
subband. Two alternative low-complexity linear equalizer structures with MSE criterion are considered for subband-wise equal-
ization: a complex FIR filter structure and a cascade of a linear-phase FIR filter and an allpass filter. The simulation results indicate
that in a broadband wireless channel the performance of the studied FB-FDE structures, with modest number of subbands, reaches
or exceeds the performance of the widely used FFT-FDE system with cyclic prefix. Furthermore, FB-FDE can perform a significant
part of the baseband channel selection filtering. It is thus observed that fractionally spaced processing provides significant perfor-
mance benefit, with a similar complexity to the symbol-rate system, when the baseband filtering is included. In addition, FB-FDE
effectively suppresses narrowband interference present in the signal band.
Copyright © 2007 Yuan Yang et al. This is an open access article distributed under the Creative Commons Attribution License,
1. INTRODUCTION

Future wireless communications must provide ever increasing data transmission rates to satisfy the growing demands of wireless networking. As symbol rates increase, the intersymbol interference, caused by the bandlimited time-dispersive channel, distorts the transmitted signal even more. The difficulty of channel equalization in single-carrier broadband systems is thus regarded as a major challenge to high-rate transmission over mobile radio channels. Single-carrier time-domain equalization has become impractical because of the high computational complexity of the required transversal filters, which need a high number of taps to cover the maximum delay spread of the channel [1]. This has led to extensive research on spread spectrum techniques and multicarrier modulation. On the other hand, single-carrier transmission has the benefit, especially for uplink, of a very simple transmitter architecture, which avoids, to a large extent, the peak-to-average power ratio problems of multicarrier and CDMA techniques. In recent years, the idea of single-carrier transmission in broadband wireless communications has been revived through the application of frequency-domain equalizers, which have clearly lower implementation complexity than time-domain equalizers [1–3]. Both linear and decision feedback structures have been considered. In [2, 4–6], it has been demonstrated that single-carrier frequency-domain equalization may have a performance advantage and that it is less sensitive to nonlinear distortion and carrier synchronization inaccuracies compared to multicarrier modulation.

The most common approach for FDE is based on FFT/IFFT transforms between the time and frequency domains. Usually, a cyclic prefix (CP) is employed for the transmission blocks. Such a system can be derived, for example, from OFDM by moving the IFFT from the transmitter to the receiver [4]. FFT-FDEs with CP are characterized by a flat-fading model of the subband responses, which means that one complex coefficient per subband is sufficient for ideal linear equalization. This approach has overhead in data transmission due to the guard interval between symbol blocks. Another approach is to use overlapped processing of FFT blocks [7–9], which allows equalization without CP. This results in a highly flexible FDE concept that can basically be used for any single-carrier system, including CDMA [8].

This paper develops high performance single-carrier FDE techniques without CP by the use of highly frequency-selective filter banks in the analysis-synthesis configuration, instead of the FFT and IFFT transforms. We examine the use of subband equalization for mildly frequency-selective subbands, which helps to reduce the number of subbands required to achieve close-to-ideal performance. This is facilitated by utilizing a proper complex, partially oversampled filter bank structure [10–13].

One central choice in the FDE design is between symbol-spaced equalizers (SSE) and fractionally spaced equalizers (FSE) [3, 14]. An ideal receiver includes a matched filter with the channel matched part, in addition to the root raised cosine (RRC) filter, before the symbol-rate sampling. SSE ignores the channel matched part, leading to performance degradation, whereas FSEs are, in principle, able to achieve ideal linear equalizer performance. However, symbol-rate sampling is often used due to its simplicity. In frequency-domain equalization, FSE can be done by doubling the number of subbands and the sampling rate at the filter bank input [1, 3, 6]. This paper also examines the performance and complexity tradeoffs of the SSE and FSE structures.

The main contribution of this paper is an efficient combination of an analysis-synthesis filter bank system and low-complexity subband-wise equalizers, applied to frequency-domain equalization. The filter bank has complex I/Q input and output signals suitable for processing baseband communication signals as such, so no additional single sideband filtering is needed in the receiver (real analysis-synthesis systems cannot be easily adapted to this application). The filter bank also has oversampled subband signals to facilitate subband-wise equalization. We consider two low-complexity equalizer structures operating subband-wise: (i) a 3-tap complex-valued FIR filter (CFIR-FBEQ), and (ii) the cascade of a low-order allpass filter as the phase equalizer and a linear-phase FIR filter as the amplitude equalizer (AP-FBEQ). In the latter structure, the amplitude and phase equalizer stages can be adjusted independently of each other, which turns out to have several benefits. Simple channel estimation based approaches for calculating the equalizer coefficients are developed for both the SSE and FSE configurations and for both equalizer structures. Further, the benefits of FB-FSEs in contributing significantly to the receiver selectivity will be addressed.

In a companion paper [15], a similar subband equalizer structure is utilized in filter bank based multicarrier (FBMC) modulation, and its performance is compared to a reference OFDM modulation in a doubly dispersive broadband wireless communication channel. In this paper, we continue with the comparisons of OFDM, FBMC, single-carrier FFT-FDE, and FB-FDE systems. The key idea of our equalizer concept has been presented in the earlier work [16] together with two of the simplest cases of the subband equalizer.

The content of this paper is organized as follows: Section 2 gives an overview of FFT-SSE and FFT-FSE. In addition, the mean-squared error (MSE) criterion based subband equalizer coefficients are derived. Section 3 addresses the exponentially modulated oversampled filter banks and the subband equalization structures, CFIR-FBEQ and AP-FBEQ. The particular low-complexity cases of these structures are presented, together with the formulas for calculating the equalizer coefficients from the channel estimates. Also, the channel estimation principle is briefly described. Section 4 gives numerical results, including simulation results to illustrate the effects of filter bank and equalizer parameters on the system performance. Then detailed comparisons of the studied FB-SSE and FB-FSE structures with the reference systems are given.

2. FFT BASED FREQUENCY-DOMAIN EQUALIZATION IN A SINGLE-CARRIER TRANSMISSION

Throughout this paper, we consider single-carrier block transmission over a linear bandlimited channel with additive white Gaussian noise. We assume that the channel has a time-invariant impulse response during each block transmission. For each block, a CP is inserted in front of the block, as shown in Figure 1. In this case, the received signal is obtained as a cyclic convolution of the transmitted signal and the channel impulse response. Therefore, the channel frequency response is accurately modeled by a complex coefficient for each frequency bin [17]. The length of the CP extension is $P \geq L$, where $L$ is the maximum length of the channel impulse response. The CP includes a copy of information symbols from the tail of the block. This results in bandwidth efficiency reduction by the factor $M/(M + P)$, where $M$ is the length of the information symbol block. In general, for a time-varying wireless environment, $M$ is chosen in such a way that the channel impulse response can be considered to be static during each block transmission.

The block diagram of a communication link with FFT-SSE and FFT-FSE is shown in Figure 1. The operations of the equalization include the forward transform from time to frequency domain, channel inversion, and the reverse transform from frequency to time domain. The CP is inserted after the symbol mapping in the transmitter and discarded before equalization in the receiver. At the transmitter side, a block of $M$ symbols $x(m)$, $m = 0, 1, \ldots, M - 1$, is oversampled and transmitted with the average power $\sigma_x^2$. The received oversampled signal $r(n)$ can be written as

$$
\begin{aligned}
r(n) &= x(n) \otimes c(n) + v(n), \\
c(n) &= g_T(n) \otimes h_{ch}(n) \otimes g_R(n).
\end{aligned}
\tag{1}
$$

Here $v(n)$ is additive white Gaussian noise with variance $\sigma_n^2$. The symbol $\otimes$ represents convolution, $h_{ch}(n)$ is the channel impulse response, and $g_T(n)$ and $g_R(n)$ are the transmit and receive filters, respectively. They are both RRC filters with the roll-off factor $\alpha \leq 1$ and the total signal bandwidth $B = (1 + \alpha)/T$, with $T$ denoting the symbol duration.

Generally in the paper, lowercase letters are used for time-domain notations and uppercase letters for frequency-domain notations. The letter $n$ is used for time-domain 2× symbol-rate data sequences and $m$ for symbol-rate sequences, while the subscript $k$ represents the index of frequency-domain subband signals. For example, in Figure 1, $R_k$ is the received signal of the $k$th subband, and $W_k$ and $\widetilde{W}_k$ represent the $k$th subband equalizer coefficients of SSE and FSE, respectively.
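The role of the cyclic prefix described in Section 2 can be illustrated numerically. The following is a minimal sketch, not the authors' simulation setup: it works at symbol rate, omits the RRC filters, oversampling, and noise, and uses a made-up 4-tap channel and hypothetical function names. It shows that, once the CP is removed, each FFT bin of the received block equals the corresponding bin of the transmitted block multiplied by one complex coefficient, so that per-bin channel inversion recovers the block.

```python
import numpy as np

def add_cyclic_prefix(block, prefix_len):
    """Prepend a copy of the last prefix_len symbols of the block (the CP)."""
    return np.concatenate([block[-prefix_len:], block])

def channel_and_cp_removal(block, channel, prefix_len):
    """Noiseless link: CP insertion, linear convolution with the channel, CP removal."""
    tx = add_cyclic_prefix(block, prefix_len)
    rx = np.convolve(tx, channel)
    return rx[prefix_len:prefix_len + len(block)]

def one_tap_inversion(received, channel):
    """Per-bin channel inversion: FFT, division by the channel bin, IFFT."""
    C = np.fft.fft(channel, len(received))
    return np.fft.ifft(np.fft.fft(received) / C)

def demo(M=64, P=8, seed=0):
    """Return (per-bin model holds, block recovered) for a random QPSK block."""
    rng = np.random.default_rng(seed)
    x = rng.choice([-1.0, 1.0], M) + 1j * rng.choice([-1.0, 1.0], M)
    # Hypothetical channel taps, chosen only for illustration (length 4 <= P).
    c = np.array([1.0, 0.5 - 0.3j, 0.2j, -0.1])
    r = channel_and_cp_removal(x, c, P)
    bins_match = np.allclose(np.fft.fft(r), np.fft.fft(c, M) * np.fft.fft(x))
    recovered = np.allclose(one_tap_inversion(r, c), x)
    return bins_match, recovered
```

Calling `demo()` checks both properties for one random block. Plain inversion is used here only because the sketch is noiseless; with noise, inversion would amplify it in bins where the channel is weak.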
Yuan Yang et al. 3
[Figure 1 (block diagram): the transmitter chain consists of bits, symbol mapping, CP insertion, upsampling by 2, and the Tx filter g_T(n), followed by the channel h_ch(n) and additive noise v(n). The symbol-spaced equalizer branch consists of the Rx filter g_R(n), downsampling by 2, CP removal, S/P conversion, M-FFT, one equalizer coefficient per bin, M-IFFT, and P/S conversion. The fractionally-spaced equalizer branch consists of CP removal, S/P conversion, 2M-FFT, one equalizer coefficient per bin, pairwise addition of the bins, M-IFFT, and P/S conversion. One transmitted block consists of a CP of P symbols followed by M data symbols.]

Figure 1: General model of FFT-SSE and FFT-FSE for single-carrier frequency-domain equalization.
2.1. Symbol-spaced equalizer

Suppose that $c^{SSE}(m)$ is the symbol-rate impulse response of the cascade of the transmit filter $g_T(n)$, channel $h_{ch}(n)$, and receiver filter $g_R(n)$, and $C_k^{SSE}$ is the $k$th bin of its DFT, the DFT length being equal to the symbol block length $M$. Assuming that the length of the CP is sufficient, that is, longer than the delay spread of $c^{SSE}(m)$, we can express the $k$th subband sample as

$$
R_k = C_k^{SSE} X_k + N_k, \quad k = 0, 1, \ldots, M - 1,
\tag{2}
$$

where $X_k$ is the ideal noise- and distortion-free sample and $N_k$ is zero mean Gaussian noise. The equalized frequency-domain samples are $\widetilde{X}_k = W_k R_k$, $k = 0, 1, \ldots, M - 1$. After the IFFT, the equalized time-domain signal $\tilde{x}(m)$ is processed by a slicer to get the detected symbols $\hat{x}(m)$. The error sequence at the slicer is $e(m) = \tilde{x}(m) - \hat{x}(m)$ and the MSE is defined as $E[|e(m)|^2]$.

The subband equalizer optimization criterion could be zero forcing (ZF) or MSE. In this paper, we focus on wideband single-carrier transmission with heavily frequency-selective channels. In such cases, ZF equalizers suffer from severe noise enhancement [14] and MSE provides clearly better performance. We consider here only the MSE criterion.

To minimize MSE, considering the residual intersymbol interference and additive noise, the frequency response of the optimum linear equalizer is given by [14]

$$
W_k = \frac{\left(C_k^{SSE}\right)^*}{\left|C_k^{SSE}\right|^2 + \sigma_n^2/\sigma_x^2},
\tag{3}
$$

where $k = 0, 1, \ldots, M - 1$ and $(\cdot)^*$ represents complex conjugate.

2.2. Fractionally-spaced equalizer

The FFT-FSE, shown in Figure 1, operates at 2× symbol-rate, $2/T$. In some papers, it is also called a T/2-spaced equalizer [14, 18]. For each transmitted block, the received samples are processed using a 2M-point FFT. The RRC filter block at the receiver is absent since it can be realized together with the equalizer in the frequency domain [1].

In the case of SSE, the folding is carried out before equalization, where the folding frequency is $1/2T$. It is evident in Figure 2 that uncontrolled aliasing over the transition band F1 takes place. This means that SSE can only compensate for the channel distortion in the aliased received signal, which results in performance loss. On the other hand, FSE compensates for the channel distortion in the received signal before the aliasing takes place. After equalization, the aliasing takes place in an optimal manner. The performance is expected to approach the performance of an ideal linear equalizer.

[Figure 2 (spectrum sketch): SSE and FSE signal spectra with roll-off α on a frequency axis marked at multiples of 1/2T up to 2/T, divided into the regions F0 (passband), F1 (transition band), and F2 (stopband); T is the symbol duration.]

Figure 2: Signal spectra in the cases of SSE and FSE.

Let $H_k^{ch}$, $k = 0, 1, \ldots, 2M - 1$, denote the 2M-point DFT of the T/2-spaced channel impulse response, and let $G_k$ denote the RRC filter in the transmitter or in the receiver side. Assuming a zero-phase model for the RRC filters, $G_k$ is always real-valued. The optimum linear equalizer model now includes the following elements: transmitter RRC filter, channel $h_{ch}(n)$, matched filter including the receiver RRC filter and the channel matched filter $h_{ch}^*(-n)$, resampling at the symbol-rate, and MSE linear equalizer at symbol-rate. The 2×-oversampled system frequency response can be written as

$$
\begin{aligned}
Q_k &= G_k H_k^{ch} \left(H_k^{ch}\right)^* G_k = \frac{\left|C_k^{FSE}\right|^2}{G_k^2}, \\
C_k^{FSE} &= H_k^{ch} G_k^2.
\end{aligned}
\tag{4}
$$

Here $C_k^{FSE}$ is the $k$th bin of the DFT of the T/2-spaced impulse response of the cascade of the channel and the two RRC filters. The channel estimator described in Section 3.4 provides estimates for $C_k^{FSE}$. Now the frequency bins $k$ and $M + k$ carry redundant information about the same subband data, just weighted differently by the RRC filters and the channel. The folding takes place in the sampling rate reduction, adding up these pairs of frequency bins. Before the addition, it is important to compensate the channel phase response so that the two bins are combined coherently, and also to weight the amplitudes in such a way that the SNR is maximized. The maximum ratio combining idea [1] and the sampled matched filter model [14] lead to the same result. Combining this front-end model with the MSE linear equalizer leads to the following expression for the optimal subband equalizer coefficients:

$$
\widetilde{W}_k = \frac{\left(C_k^{FSE}\right)^* / G_k}{Q_k + Q_{(M+k) \bmod (2M)} + \sigma_n^2/\sigma_x^2}.
\tag{5}
$$

The frequency index $k = 0, 1, \ldots, 2M - 1$ covers the entire spectrum $[0, 2\pi]$ as $\omega_k = 2\pi k/2M$, that is, $k = 0$ corresponds to DC and $k = M$ corresponds to the symbol-rate $1/T$. It should be noted that here the equalizer coefficients implement the whole matched filter together with the MSE equalizer. The whole spectrum where the equalization takes place, that is, the FFT frequency bins, can be grouped into three frequency regions with different equalizer actions.

(i) Passbands F0: $k \in [0, (1 - \alpha)M/2] \cup [(3 + \alpha)M/2, 2M - 1]$.
There is no aliasing in these two regions, so the equalizer coefficients can be written in simplified form as

$$
\widetilde{W}_k = \frac{\left(C_k^{FSE}\right)^* / G_k}{Q_k + \sigma_n^2/\sigma_x^2}.
\tag{6}
$$

(ii) Transition bands F1: $k \in [(1 - \alpha)M/2, (1 + \alpha)M/2] \cup [(3 - \alpha)M/2, (3 + \alpha)M/2]$.
Aliasing takes place when the received signal is folded, and (5) should be used.

(iii) Stopbands F2: $k \in [(1 + \alpha)M/2, (3 - \alpha)M/2]$.
Only noise and interference components are included and all subband signals can be set to zero, $\widetilde{W}_k = 0$.

The use of oversampling provides robustness to the sampling phase. Basically, the frequency-domain equalizer also implements symbol-timing adjustment. Furthermore, compared with the SSE system, the receiver filter of the FSE system can be implemented efficiently in the frequency domain. This means that the pulse shaping filtering will not introduce additional computational complexity, even if it has very sharp transition bands.

2.3. Computational complexity of SSE and FSE

In the following example, we count the real multiplications at the receiver side. The complexity mainly comes from RRC filtering, FFT and IFFT, and equalization.

(i) Suppose that $M = 512$ symbols are transmitted in a block. The number of the received samples is $2M = 1024$ because of the oversampling by 2.
(ii) Each subband equalizer has only one complex weight, resulting in 4 real multiplications per subband.
(iii) The pulse shaping filter is an RRC filter with the roll-off factor of $\alpha = 0.22$ and the length of $N_{RRC} = 31$. Because of symmetry, only $(N_{RRC} + 1)/2 = 16$ multipliers are needed for the RRC filtering in the SSE. In an efficient decimation structure, $(N_{RRC} + 1)/2$ multiplications per symbol are needed, both for the real and imaginary parts of the received signal.
(iv) The split-radix algorithm [19] is applied to the FFT. For an M-point FFT, $M(\log_2 M - 3) + 4$ real multiplications are needed.
(v) In the case of SSE, the total number of real multiplications per symbol is about $(N_{RRC} + 1) + 2\log_2 M - 2 \approx 48$.
(vi) In the case of FSE, the number of subbands used is $M(1 + \alpha)$. The total number of real multiplications per symbol is about $3\log_2 M - 3 + 4\alpha \approx 25$.

From the above discussion, we can conclude that FFT-FSE requires fewer real multiplications per symbol than FFT-SSE. This is mainly because much of the complexity is saved when the RRC filter is realized in the frequency domain.
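The per-symbol multiplication counts of Section 2.3 can be reproduced with a few lines of arithmetic. The sketch below is one possible itemization, assumed here rather than taken from the text: the function names are hypothetical, and the split into RRC filtering, transforms, and equalizer weights is chosen because it reproduces the quoted totals.

```python
import math

def split_radix_real_mults(n_fft):
    """Real multiplications of an n_fft-point split-radix FFT, as quoted in the text."""
    return n_fft * (math.log2(n_fft) - 3) + 4

def sse_mults_per_symbol(M, n_rrc):
    """FFT-SSE receiver: time-domain RRC filter, M-FFT, M-IFFT, one weight per bin."""
    rrc = 2 * (n_rrc + 1) / 2                       # real and imaginary rails
    transforms = 2 * split_radix_real_mults(M) / M  # forward and inverse M-point FFT
    equalizer = 4                                   # one complex weight per subband
    return rrc + transforms + equalizer

def fse_mults_per_symbol(M, alpha):
    """FFT-FSE receiver: 2M-FFT, M-IFFT, weights on the M(1 + alpha) used bins."""
    transforms = (split_radix_real_mults(2 * M) + split_radix_real_mults(M)) / M
    equalizer = 4 * (1 + alpha)                     # RRC filtering is folded into the weights
    return transforms + equalizer

if __name__ == "__main__":
    print(sse_mults_per_symbol(512, 31))    # about 48
    print(fse_mults_per_symbol(512, 0.22))  # about 25
```

For $M = 512$, $N_{RRC} = 31$, and $\alpha = 0.22$ the two functions evaluate to roughly 48.0 and 24.9, matching items (v) and (vi); the small excess over the closed-form expressions comes from the constant $+4$ term of the split-radix count.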
[Figure 3 (two plots): panel (a) DFT bank and panel (b) EMFB, each showing subband amplitude responses with Amplitude (dB) on the vertical axis and Frequency ω/π, from 0 to 1, on the horizontal axis.]

Figure 3: Comparison of the subband frequency responses of DFT and EMFB.

[Figure 4 (block diagram): bits, symbol mapping, upsampling by 2, Tx filter g_T(n), channel h_ch(n), and additive noise v(n) produce the received signal r(n); the real and imaginary parts of r(n) are fed to 2×-oversampled analysis banks built from CMFB and SMFB blocks, the subband signals R_0, ..., R_{2M−1} pass through the equalizer, and the real parts of the equalized subband signals are fed to critically sampled synthesis banks (CMFB and SMFB), followed by downsampling by 2.]

Figure 4: Generic FB-FDE system model in the FSE case.
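The exponentially modulated filter bank in Figure 4 builds all of its 2M subfilters from one real lowpass prototype, following equations (7) and (8) of Section 3.1. The sketch below generates the synthesis and analysis subfilters from (7) and checks relation (8) numerically. Several things are assumptions made for illustration: the prototype is a Hann-windowed sinc placeholder and not a perfect reconstruction design, the cosine- and sine-modulated parts are taken as the real and imaginary parts of the exponential modulation, M is assumed even, and all function names are hypothetical.

```python
import numpy as np

def placeholder_prototype(M, K):
    """Symmetric real lowpass of length 2KM (Hann-windowed sinc); NOT a PR design."""
    N = 2 * K * M
    n = np.arange(N)
    return np.sinc((n - (N - 1) / 2) / (2 * M)) * np.hanning(N) / (2 * M)

def emfb_filters(prototype, M):
    """Synthesis f_k^e(n) and analysis g_k^e(n) of (7); rows are k = 0..2M-1."""
    NB = len(prototype) - 1                 # filter order N_B = 2KM - 1
    n = np.arange(NB + 1)
    k = np.arange(2 * M)[:, None]
    scale = np.sqrt(2 / M) * prototype
    f = scale * np.exp(1j * (n + (M + 1) / 2) * (k + 0.5) * np.pi / M)
    g = scale * np.exp(-1j * (NB - n + (M + 1) / 2) * (k + 0.5) * np.pi / M)
    return f, g

def check_relation_8(M=8, K=4):
    """Rebuild subbands M..2M-1 from the CMFB/SMFB parts of subbands 0..M-1."""
    f, g = emfb_filters(placeholder_prototype(M, K), M)
    fc, fs = f[:M].real, f[:M].imag         # f_k^e = f_k^c + j f_k^s
    gc, gs = g[:M].real, -g[:M].imag        # g_k^e = g_k^c - j g_k^s
    # Reversing the rows maps subband k to index 2M-1-k.
    upper_f = -(fc[::-1] - 1j * fs[::-1])
    upper_g = -(gc[::-1] + 1j * gs[::-1])
    return np.allclose(f[M:], upper_f), np.allclose(g[M:], upper_g)
```

`check_relation_8()` returns a pair of booleans, one for the synthesis and one for the analysis filters. The sign in the upper half relies on M being even, because the factor $(-1)^{M+1}$ that appears when $k$ is replaced by $2M - 1 - k$ in (7) equals $-1$ only in that case.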
3. EXPONENTIALLY MODULATED FILTER BANK BASED FDE

Filter banks provide an alternative way to perform the signal transforms between time and frequency domains, instead of FFT. As shown in Figure 3, exponentially modulated FBs (EMFBs) achieve better frequency selectivity than DFT banks, but they have the drawback that, since the basis functions are overlapping and longer than a symbol block, the CP cannot be utilized. Consequently, the subbands cannot be considered to have flat frequency responses. However, the lack of CPs can be considered a benefit, since CPs add overhead and reduce the spectral efficiency. Furthermore, in the FSE case, frequency-domain filtering with a filter bank is quite effective in suppressing strong interfering spectral components in the stopband regions of the RRC filter.

Figure 4 shows the FB-FSE model including a complex exponentially modulated analysis-synthesis filter bank structure as the core of frequency-domain processing. The filter bank structure has complex baseband I/Q signals as its input and output, as required for spectrally efficient radio communications. The sampling rate conversion factor in the analysis and synthesis banks is $M$, and there are $2M$ low-rate subbands equally spaced between $[0, 2\pi]$. In the critically sampled case, this FB has a real format for the low-rate subband signals [12].

3.1. Exponentially modulated filter bank

EMFB belongs to a class of filter banks in which the subfilters are formed by modulating an exponential sequence with the lowpass prototype impulse response $h_p(n)$ [11, 12]. Exponential modulation translates $H_p(e^{j\omega})$ (the lowpass frequency response of the prototype filter) to a new center frequency determined by the subband index $k$. The prototype filter $h_p(n)$ can be optimized in such a manner that the filter bank satisfies the perfect reconstruction condition, that is, the output signal is purely a delayed version of the input signal. In the general form, the EMFB synthesis filters $f_k^e(n)$ and analysis filters $g_k^e(n)$ can be written as

$$
\begin{aligned}
f_k^e(n) &= \sqrt{\frac{2}{M}}\, h_p(n) \exp\left( j \left(n + \frac{M+1}{2}\right)\left(k + \frac{1}{2}\right)\frac{\pi}{M} \right), \\
g_k^e(n) &= \sqrt{\frac{2}{M}}\, h_p(n) \exp\left( -j \left(N_B - n + \frac{M+1}{2}\right)\left(k + \frac{1}{2}\right)\frac{\pi}{M} \right),
\end{aligned}
\tag{7}
$$

where $n = 0, 1, \ldots, N_B$ and the subband index $k = 0, 1, \ldots, 2M - 1$. Furthermore, it is assumed that the subband filter order is $N_B = 2KM - 1$. The overlapping factor $K$ can be used as a design parameter because it affects how much stopband attenuation can be achieved. Another essential design parameter is the stopband edge of the prototype filter $\omega_s = (1 + \rho)\pi/2M$, where the roll-off parameter $\rho$ determines how much adjacent subbands overlap. Typically, $\rho = 1.0$ is used, in which case only the neighboring subbands overlap with each other, and the overall subband bandwidth is twice the subband spacing.

The amplitude responses of the analysis and synthesis filters divide the whole frequency range $[0, 2\pi]$ into equally wide passbands. EMFB has odd channel stacking, that is, the $k$th subband is centered at the frequency $(k + 1/2)\pi/M$. After decimation, the even-indexed subbands have their passbands centered at $\pi/2$ and the odd-indexed at $-\pi/2$. This asymmetry has some implications in the later formulations of the subband equalizer design.

In our approach, EMFB is implemented using cosine- and sine-modulated filter bank (CMFB/SMFB) blocks [11, 12], as can be seen in Figure 4. The extended lapped transform is an efficient method for implementing perfect reconstruction CMFBs [20] and SMFBs [21]. The relations between the 2M-channel EMFB and the corresponding M-channel CMFB and SMFB with the same real prototype are

$$
\begin{aligned}
f_k^e(n) &=
\begin{cases}
f_k^c(n) + j f_k^s(n), & k \in [0, M - 1], \\
-\left( f_{2M-1-k}^c(n) - j f_{2M-1-k}^s(n) \right), & k \in [M, 2M - 1],
\end{cases} \\
g_k^e(n) &=
\begin{cases}
g_k^c(n) - j g_k^s(n), & k \in [0, M - 1], \\
-\left( g_{2M-1-k}^c(n) + j g_{2M-1-k}^s(n) \right), & k \in [M, 2M - 1],
\end{cases}
\end{aligned}
\tag{8}
$$

where $g_k^c(n)$ and $g_k^s(n)$ are the analysis CMFB/SMFB subfilter impulse responses, and $f_k^c(n)$ and $f_k^s(n)$ are the synthesis bank subfilter responses (the superscript denotes the type of modulation). They can be generated according to (7).

One additional feature of the structure in Figure 4 is that, while the synthesis filter bank is critically sampled, the subband output signals of the analysis bank are oversampled by the factor of two. This is achieved by using the complex I/Q subband signals, instead of the real ones which would be sufficient for reconstructing the analysis bank input signal in the synthesis bank when no subband processing is used [10, 13] (in a critically sampled implementation, the two lowermost blocks of the analysis bank of Figure 4 would be omitted). For a block of $M$ complex input samples, $2M$ real subband samples are generated in the critically sampled case and $2M$ complex subband samples are generated in the oversampled case.

The advantage of using a 2×-oversampled analysis filter bank is that the channel equalization can be done within each subband independently of the other subbands. Assuming roll-off $\rho = 1.0$ or less in the filter bank design, the complex subband signals of the analysis bank are essentially alias-free. This is because the aliasing signal components are attenuated by the stopband attenuation of the subband responses. Subband-wise equalization compensates the channel frequency response over the whole subband bandwidth, including the passband and transition bands. The imaginary parts of the subband signals are needed only for equalization. The real parts of the subband equalizer outputs are sufficient for synthesizing the time-domain equalized signal, using a critically sampled synthesis filter bank.

It should be mentioned that an alternative to oversampled subband processing is to use a critically sampled analysis bank together with subband processing algorithms that have cross-connections between the adjacent subbands [22]. However, we believe that the oversampled model results in simplified subband processing algorithms and competitive complexity.

After the synthesis bank, the time-domain symbol-rate signal is fed to the detection device. In the FSE model of Figure 4, the synthesis bank output signal is downsampled to the symbol-rate. In the case of FSE with frequency-domain folding, an M-channel synthesis bank would be sufficient, instead of the 2M-channel bank. The design of such a filter bank system in the nearly perfect reconstruction sense is discussed in [23].

We consider here the use of EMFB which has odd channel stacking, that is, the center-most pair of subbands is symmetrically located around the zero frequency at the baseband. We could equally well use a modified EMFB structure [13] with even channel stacking, that is, the center-most subband is located symmetrically around the zero frequency, which has a slightly more efficient implementation structure based on DFT processing. Modified DFT filter banks [24] could also be utilized with some modifications in the baseband processing. However, the following analysis is based on EMFBs since they result in the most straightforward system model.

Further, the discussion is based on the use of perfect reconstruction filter banks, but nearly perfect reconstruction (NPR) designs could also be utilized, which usually result in a shorter prototype filter length. In the critically sampled case, the implementation benefits of NPR are limited, because the efficient extended lapped transform structures cannot be utilized [12]. However, in the 2×-oversampled case, having parallel CMFB and SMFB blocks, the implementation benefit of the NPR designs could be significant.

3.2. Channel equalizer structures and designs

In the filter bank, the number of subbands is selected in such a way that the channel is mildly frequency selective within
each individual subband. We consider here several low-complexity subband equalizers which are designed to equalize the channel optimally at a small number of selected frequency points within each subband. Figure 5 shows one example, where the subband equalizer is determined by the channel response at three selected frequency points, one at the center frequency and the other two at the subband edges. In this example, the ZF criterion is used for equalization, that is, the channel frequency response is exactly compensated at those selected frequency points.

[Figure 5: An example of AP-FBEQ subband equalizer responses. (a) Amplitude compensation: amplitude in linear scale versus normalized frequency in Fs/2. (b) Phase compensation: phase in degrees versus normalized frequency in Fs/2. Each panel shows the channel response, the equalizer target points (εi in (a), ξi in (b)), the equalizer response, and the combined response of channel and equalizer.]

3.2.1. CFIR-FBEQ

A very basic approach is to use a complex FIR filter as a subband equalizer. A 3-tap FIR filter,1 $E^{\mathrm{CFIR}}(z) = c_0 z + c_1 + c_2 z^{-1}$, has the required degrees of freedom to equalize the channel frequency response within each subband.

1 In practice, the filter is realized in the causal form $z^{-1} E^{\mathrm{CFIR}}(z)$.

It should be noted that the subband equalizer response depends on the number of frequency points considered within each subband. Regarding the choice of the specific frequency points, the design can be greatly simplified when the choice is among the normalized frequencies ω = 0, ±π/2, and ±π. At the selected frequency points, the equalizer is designed to take the target values given by (5) in the FSE case and by (3) in the SSE case. Below we focus on the MSE based FSE.

When three subband frequency points are selected in the subband equalizer design, there are a total of 4M frequency points for 2M subbands, that is, we consider the MSE equalizer response $W_\kappa$ at equally spaced frequency points κπ/(2M), κ = 0, 1, ..., 4M − 1. For notational convenience, we define the target frequency responses in terms of subband index k = 0, 1, ..., 2M − 1, instead of frequency point index κ. The kth subband target response value is denoted as $\eta_{ik}$, which is defined as

$$
\eta_{ik} = W_{2k+i}, \quad i = 0, 1, 2. \tag{9}
$$

At the low rate after decimation, these frequency points $\{\eta_{0k}, \eta_{1k}, \eta_{2k}\}$ are located for the even subbands at the normalized frequencies ω = {0, π/2, π}, and for the odd subbands at the frequencies ω = {−π, −π/2, 0}. Combining (5) and (9), we can get the following equations for the subband equalizer response $E^{\mathrm{CFIR}}(e^{j\omega})$ at these target frequencies.

Even subbands:

$$
E_k^{\mathrm{CFIR}}\left(e^{j\omega}\right) =
\begin{cases}
c_{0k} + c_{1k} + c_{2k} = \eta_{0k}, & (\omega = 0), \\
j c_{0k} + c_{1k} - j c_{2k} = \eta_{1k}, & (\omega = \pi/2), \\
-c_{0k} + c_{1k} - c_{2k} = \eta_{2k}, & (\omega = \pi).
\end{cases}
\tag{10}
$$

Odd subbands:

$$
E_k^{\mathrm{CFIR}}\left(e^{j\omega}\right) =
\begin{cases}
-c_{0k} + c_{1k} - c_{2k} = \eta_{0k}, & (\omega = -\pi), \\
-j c_{0k} + c_{1k} + j c_{2k} = \eta_{1k}, & (\omega = -\pi/2), \\
c_{0k} + c_{1k} + c_{2k} = \eta_{2k}, & (\omega = 0).
\end{cases}
\tag{11}
$$
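The linear systems (10) and (11) can also be solved numerically, which gives a quick way to check a tap calculation. The following is a minimal sketch, assuming NumPy is available; the function names `cfir_taps` and `cfir_response` and the target values `eta` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Target frequencies after decimation, as stated in the text.
EVEN_FREQS = np.array([0.0, np.pi / 2, np.pi])
ODD_FREQS = np.array([-np.pi, -np.pi / 2, 0.0])


def cfir_taps(eta, even=True):
    """Solve (10) or (11) for the taps (c0, c1, c2) of E(z) = c0*z + c1 + c2/z."""
    w = EVEN_FREQS if even else ODD_FREQS
    z = np.exp(1j * w)
    # Row i evaluates c0*z + c1 + c2*z^-1 at the i-th target frequency.
    system = np.column_stack([z, np.ones_like(z), 1.0 / z])
    return np.linalg.solve(system, np.asarray(eta, dtype=complex))


def cfir_response(c, w):
    """Frequency response of the non-causal 3-tap filter at frequencies w."""
    z = np.exp(1j * np.asarray(w, dtype=float))
    return c[0] * z + c[1] + c[2] / z


# Hypothetical target values for one subband (illustration only).
eta = np.array([1.1 * np.exp(0.20j), 0.9 * np.exp(0.05j), 0.8 * np.exp(-0.15j)])
c_even = cfir_taps(eta, even=True)
c_odd = cfir_taps(eta, even=False)

# Both checks confirm that the response hits the targets at the design points.
print(np.allclose(cfir_response(c_even, EVEN_FREQS), eta))
print(np.allclose(cfir_response(c_odd, ODD_FREQS), eta))
```

In both cases the middle tap comes out as the average of the two edge targets, and the outer taps of the odd subband are the negatives of the even-subband ones for the same targets.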
[Figure 6: An example of the AP-FBEQ subband equalizer structure. The phase equalizer consists of a first-order complex allpass filter (coefficient bck), a phase rotator e^{jϕk} combined with the real-part operation, and a first-order real allpass filter (coefficient brk). The amplitude equalizer is a 5-tap symmetric FIR filter with taps a2k, a1k, a0k, a1k, a2k.]

The 3-tap complex FIR coefficients $\{c_{0k}, c_{1k}, c_{2k}\}$ of the kth subband equalizer can be obtained as follows (+ signs stand for even subbands and − signs for odd subbands, respectively):

$$
\begin{aligned}
c_{0k} &= \pm \frac{1}{2} \left[ \frac{\eta_{0k} - \eta_{2k}}{2} - j \left( \eta_{1k} - \frac{\eta_{0k} + \eta_{2k}}{2} \right) \right], \\
c_{1k} &= \frac{\eta_{0k} + \eta_{2k}}{2}, \\
c_{2k} &= \pm \frac{1}{2} \left[ \frac{\eta_{0k} - \eta_{2k}}{2} + j \left( \eta_{1k} - \frac{\eta_{0k} + \eta_{2k}}{2} \right) \right].
\end{aligned}
\tag{12}
$$

3.2.2. AP-FBEQ

The idea of the AP-FBEQ approach is to compensate channel amplitude and phase distortion separately. In other words, at those selected frequency points, the amplitude response of the equalizer is proportional to the inverse of the channel amplitude response, and the phase response of the equalizer is the negative of the channel phase response.

The subband equalizer structure, shown in Figure 6, is a cascade of a phase equalization section, consisting of allpass filter stages and a phase rotator, and an amplitude equalization section, consisting of a linear-phase FIR filter. This particular structure makes it possible to design the amplitude equalization and phase equalization independently, leading to simple formulas for channel estimation based solutions, or simplified and fast adaptive algorithms for adaptive subband equalizers. In this paper, we refer to this frequency-domain equalization approach as the amplitude-phase filter bank equalizer, AP-FBEQ.

The real parts of the equalized subband signals are sufficient for constructing the sample sequence for detection, and the imaginary parts are irrelevant after the subband equalizers. In the basic form of the AP-FBEQ subband equalizer, the operation of taking the real part would be after all the filters of the subband equalizer. But since the real filters (real allpass and magnitude equalizer) act independently on the real (I) and imaginary (Q) branch signals, the results of the Q-branch computations after the phase rotator would never be utilized. Therefore, it is possible to move the real part operation and combine it with the phase rotator, that is, only the real part of the phase rotator output needs to be calculated, and the real filters are implemented only for the I-branch. The structure of Figure 6 is completely equivalent to the original one, but it is computationally much more efficient. With the same kind of reasoning, it is easy to see that in the CFIR-FBEQ case, only two real multipliers are needed to implement each of the taps.

The orders of the equalizer sections, as well as the number of specific frequency points used in the subband equalizer design, offer a degree of freedom and are chosen to obtain a low-complexity solution. Firstly, we consider the subband equalizer structure shown in Figure 6. The transfer functions of the complex and real first-order allpass filters $A_k^c(z)$ and $A_k^r(z)$ can be given by2

$$
A_k^c(z) = \frac{1 - j b_{ck} z}{1 + j b_{ck} z^{-1}}, \qquad
A_k^r(z) = \frac{1 + b_{rk} z}{1 + b_{rk} z^{-1}},
\tag{13}
$$

respectively.

2 The allpass filters can be realized in the causal form $z^{-1} A_k(z)$.

The phase response of the equalizer for the kth subband can be described as

$$
\begin{aligned}
\arg\left[ E_k^{\mathrm{AP}}\left(e^{j\omega}\right) \right]
&= \arg\left[ e^{j\varphi_k} \cdot A_k^c\left(e^{j\omega}\right) \cdot A_k^r\left(e^{j\omega}\right) \right] \\
&= \varphi_k + 2 \arctan\left( \frac{-b_{ck} \cos\omega}{1 + b_{ck} \sin\omega} \right)
+ 2 \arctan\left( \frac{b_{rk} \sin\omega}{1 + b_{rk} \cos\omega} \right).
\end{aligned}
\tag{14}
$$

The equalizer magnitude response for the kth subband can be written as

$$
\left| E_k^{\mathrm{AP}}\left(e^{j\omega}\right) \right| = \left| a_{0k} + 2 a_{1k} \cos\omega + 2 a_{2k} \cos 2\omega \right|.
\tag{15}
$$

The AP-FBEQ idea can be applied to both SSE and FSE in a similar manner as CFIR-FBEQ. Here, we focus on the FSE case. Three subband frequency points at normalized frequencies ω = {0, π/2, π} for the even subbands and ω = {−π, −π/2, 0} for the odd subbands are selected in the subband equalizer design. Here, we define the target amplitude
and phase response values for subband k as $\epsilon_{ik}$ and $\zeta_{ik}$, respectively:

$$
\epsilon_{ik} = \left| W_{2k+i} \right|, \qquad
\zeta_{ik} = \arg\left( W_{2k+i} \right), \quad i = 0, 1, 2.
\tag{16}
$$

Then, combining (5), (14), (15), and (16) at these target frequencies, we can derive the two allpass filter coefficients $\{b_{ck}, b_{rk}\}$ and the phase rotation $\varphi_k$ for the phase compensation section, and the FIR coefficients $\{a_{0k}, a_{1k}, a_{2k}\}$ for amplitude compensation.

In this paper, the following three different low-complexity designs of the AP-FBEQ structure are considered. (+ signs stand for the even subbands and − signs for the odd ones.)

Case 1. One frequency point is selected in the subband. This model of subband equalizer consists only of the phase rotator $e^{j\varphi_k}$ for phase compensation and a real coefficient $a_{0k}$ for amplitude compensation. In fact, it behaves like one complex equalizer coefficient for each subband in the FFT-FDE system. The subband center frequency point is selected to determine the equalizer response

$$
\varphi_k = \zeta_{1k}, \qquad a_{0k} = \epsilon_{1k}.
\tag{17}
$$

Case 2. Two frequency points are selected at the subband edges, at the frequency points ω = 0 and ±π, to determine the equalizer coefficients. The subband equalizer structure consists of a cascade of a first-order complex allpass filter followed by a phase rotator and an operation of taking the real part of the signal. Finally, a symmetric linear-phase 3-tap FIR filter is applied for amplitude compensation. In this case, the equalizer coefficients can be calculated as

$$
\begin{aligned}
\varphi_k &= \frac{\zeta_{0k} + \zeta_{2k}}{2}, & a_{0k} &= \frac{1}{2} \left( \epsilon_{0k} + \epsilon_{2k} \right), \\
b_{ck} &= \pm \tan\left( \frac{\zeta_{2k} - \zeta_{0k}}{4} \right), & a_{1k} &= \pm \frac{1}{4} \left( \epsilon_{0k} - \epsilon_{2k} \right).
\end{aligned}
\tag{18}
$$

Case 3. Three frequency points are used in each subband, as we have discussed above, one at the subband center and two at the passband edges. The equalizer structure contains two allpass filters, a phase rotation stage, and a symmetric linear-phase 5-tap FIR filter. Their coefficients are calculated as below:

$$
\begin{aligned}
\varphi_k &= \frac{\zeta_{0k} + \zeta_{2k}}{2}, & a_{0k} &= \frac{\epsilon_{0k} + 2\epsilon_{1k} + \epsilon_{2k}}{4}, \\
b_{ck} &= \pm \tan\left( \frac{\zeta_{2k} - \zeta_{0k}}{4} \right), & a_{1k} &= \pm \frac{\epsilon_{0k} - \epsilon_{2k}}{4}, \\
b_{rk} &= \pm \tan\left( \frac{\zeta_{1k} - \varphi_k}{2} \right), & a_{2k} &= \frac{\epsilon_{0k} - 2\epsilon_{1k} + \epsilon_{2k}}{8}.
\end{aligned}
\tag{19}
$$

The subband equalizer structure is not necessarily fixed in advance but can be determined individually for each subband based on the frequency-domain channel estimates. This enables the structure of each subband equalizer to be controlled such that each subband response is equalized optimally at the minimum number of frequency points which can be expected to result in sufficient performance.

The performances of these three different subband equalizer designs, together with the 3-tap CFIR-FBEQ, will be examined in the next section.

3.3. FSE and SSE

Also in the SSE version of CFIR-FBEQ and AP-FBEQ, the decimating RRC filtering needs to be carried out before equalization, and uncontrolled aliasing results in similar performance loss as in the FFT-SSE.

In the FSE, the receiver RRC filter can again be implemented in the frequency domain together with the equalizer, with low complexity. Since no guard interval is employed and the subbands are highly frequency selective, frequency-domain filtering can be implemented independently of the roll-off and other filtering requirements, as long as the stopband attenuation in the filter bank design is sufficient for the receiver filter from the RF point of view. It can be noted that the FB-FSE structure provides a flexible solution for channel equalization and channel filtering, since the receiver filter bandwidth and roll-off can be controlled by adjusting the RRC-filtering part of the equalizer coefficient calculations.

In advanced receiver designs, a high initial sampling rate is often utilized, followed by a multistage decimation filter chain which is highly optimized for low implementation complexity [25]. The first stages of the decimation chain often utilize multiplier-free structures, like the cascaded integrator comb, and the major part of the implementation complexity is at the last stage. In such designs, FB-FSE provides a flexible generic solution for the last stage of a channel filtering chain.

3.4. Channel estimation

FB-FDEs, as well as FFT-FDEs, can be implemented by using adaptive channel equalization algorithms to adjust the equalizer coefficients. However, we focus here on the channel estimation based approach, where the equalizer coefficients are calculated at regular intervals based on the channel estimates and knowledge of the desired receiver filter frequency response, according to (3) or (5). In the performance studies, we have utilized a basic maximum likelihood (ML) channel estimation method (also known as the least-squares method) using training sequences [26]. Here, Gold codes [27] of different lengths are used as training sequences.

In SSE, a training sequence is transmitted, and the symbol-rate channel impulse response (including transmitter and receiver RRC filters) is estimated based on the received training sequence at the decimating RRC filter output. This channel estimate is used for calculating the equalizer coefficients using (3).
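Once channel estimates have been turned into subband target values, the coefficient calculation itself is a handful of closed-form expressions. As an illustration, the Case 3 AP-FBEQ design of Section 3.2.2 can be sketched and checked against the responses (13) to (15). This is a minimal sketch, assuming NumPy; the function names and the target values `W` are hypothetical. The tap a2 is taken with the same sign for even and odd subbands, because cos 2ω takes the same values at the target points of both.

```python
import numpy as np

EVEN_FREQS = np.array([0.0, np.pi / 2, np.pi])
ODD_FREQS = np.array([-np.pi, -np.pi / 2, 0.0])


def ap_fbeq_case3(W, even=True):
    """Case 3 coefficients from three complex target values W[0..2]."""
    eps = np.abs(W)                 # target amplitudes, as in (16)
    zeta = np.unwrap(np.angle(W))   # target phases, unwrapped across the subband
    s = 1.0 if even else -1.0       # + for even subbands, - for odd ones
    phi = (zeta[0] + zeta[2]) / 2
    b_c = s * np.tan((zeta[2] - zeta[0]) / 4)
    b_r = s * np.tan((zeta[1] - phi) / 2)
    a0 = (eps[0] + 2 * eps[1] + eps[2]) / 4
    a1 = s * (eps[0] - eps[2]) / 4
    a2 = (eps[0] - 2 * eps[1] + eps[2]) / 8
    return phi, b_c, b_r, (a0, a1, a2)


def ap_response(w, phi, b_c, b_r, a):
    """Complex response: phase rotator, two allpass sections (13), FIR (15)."""
    z = np.exp(1j * np.asarray(w, dtype=float))
    allpass_c = (1 - 1j * b_c * z) / (1 + 1j * b_c / z)
    allpass_r = (1 + b_r * z) / (1 + b_r / z)
    amplitude = a[0] + 2 * a[1] * np.cos(w) + 2 * a[2] * np.cos(2 * w)
    return amplitude * np.exp(1j * phi) * allpass_c * allpass_r


# Hypothetical, mildly frequency-selective targets (illustration only).
W = np.array([1.2 * np.exp(0.30j), 0.9 * np.exp(0.10j), 0.7 * np.exp(-0.20j)])

even_design = ap_fbeq_case3(W, even=True)
odd_design = ap_fbeq_case3(W, even=False)
print(np.allclose(ap_response(EVEN_FREQS, *even_design), W))
print(np.allclose(ap_response(ODD_FREQS, *odd_design), W))
```

The check compares complex values rather than phases, so it is insensitive to 2π phase wrapping. It would fail only in the degenerate cases where an allpass coefficient equals ±1 and the response is undefined at a target point.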
In FSE, we have chosen to estimate T/2-spaced impulse responses (including the two RRC filters). Including the receiver RRC filter in the estimated response minimizes the noise and interference coming into the channel estimator. Now, the channel estimator utilizes the receiver RRC filter output at two times the symbol-rate. It must be noted that this approach requires a time-domain RRC filter for the training sequences in the receiver, even if frequency-domain filtering is applied to the data symbols.

4. NUMERICAL RESULTS

4.1. Basic simulations and numerical comparisons

The considered models of FFT-FDE and FB-FDE were introduced in Figures 1 and 4, respectively. The pulse shaping filters both in the transmitter and receiver are real-valued RRC filters with α = 0.22. In the FSE case, the receiver RRC filter is realized by the equalizer. The filter bank designs in the simulations used roll-off ρ = 1.0, different numbers of subbands 2M = {128, 256}, and overlapping factors K = {2, 3, 5}, resulting in about 30 dB, 38 dB, and 50 dB stopband attenuations, respectively.

The performances were tested using the extended vehicular-A channel model of ITU-R with the maximum excess delay of about 2.5 μs [28]. The symbol-rate was 1/T = 15.36 MHz. The channel fading was modelled as quasistatic, that is, the channel frequency response was time invariant during each frame transmission. 4000 independent channel instances were simulated to obtain the average performance. The MSE criterion was applied to solve the equalizer coefficients. The bit-error-rate (BER) performance was simulated with QPSK, 16-QAM, and 64-QAM modulations, with Gray coding, and was compared to the performance of FFT-FDE. In all FFT-FDE simulations, the CP is included and assumed to be longer than the delay spread. The performance of the ideal MSE linear equalizer is also included for reference. This analytic performance reference was obtained by applying the MSE formula for the infinite-length linear MSE equalizer from [14] and then using the well-known formulas of the Q-function and the Gray-coding assumption for estimating the BER. The BER measure is averaged over 5000 independent channel instances. Ideal channel estimation was assumed in Figures 7, 8, and 9, but in Figures 10, 11, and 12, the channel estimator described in Section 3.4 was utilized. The BER and frame-error-rate (FER) performance with low density parity check (LDPC) [29] error correction coding are presented in Figures 11 and 12.

Raw BER performance of FB-FSE

Figure 7 presents the uncoded BER performance of the CFIR-FBEQ and AP-FBEQ compared to the analytic performance with QPSK, 16-QAM, and 64-QAM modulations. The three different designs of AP-FBEQ and a 3-tap CFIR-FBEQ were examined. It can be seen that the CFIR-FBEQ and AP-FBEQ Case 3 performances are rather similar, however, with a minor but consistent benefit for AP-FBEQ. With a low number of subbands and with high-order modulation, the differences are more visible. In the following comparisons, AP-FBEQ performance is considered. It is clearly visible that AP-FBEQ Cases 2 and 3 equalizers improve the performance significantly compared to Case 1. When the modulation order becomes higher, the performance gaps between different equalizer structures increase. As the most interesting uncoded BER region is between 1% and 10%, it is seen that 256 subbands with Case 3 are sufficient to achieve good performance even with high-order modulation. The resulting performance is rather close to the analytic BER bound; however, it is clear that the Gray-coding assumption is not very accurate at low Eb/N0, and the analytic performance curve is somewhat optimistic. With this specific channel model, 128 subbands are sufficient for QPSK and 16-QAM modulations when the AP-FBEQ Case 3 equalizer is used.

The FB design parameter, overlapping factor K, controls the level of stopband attenuation. Increasing K improves the stopband attenuation, at the cost of increased implementation complexity. Figure 8 presents the BER performance of the Case 3 equalizer with 256 subbands and the different K-factors. For QPSK modulation, it can be seen that the K-factor has a relatively small effect on the performance, and even K = 2 may provide sufficient performance. In the case of higher order modulations, K = 3 can achieve sufficient performance.

SSE versus FSE performance and FFT-FDE versus FB-FDE comparisons

Figure 9 presents the results for SSE and FSE in the FFT-FDE and FB-FDE receivers. It is clearly seen that FSE provides significant performance gain over SSE in the considered case. The performance differences between AP-FBEQ and the conventional FFT-FDE methods are relatively small. However, it should be noted that in Figure 9 the guard-interval overhead is not taken into account in the Eb/N0-axis scaling, even though a sufficiently long CP (200 samples) is utilized. In practice, the CP length affects the BER plots only through the Eb/N0-axis scaling.

Guard-interval considerations

For example, 10% or 25% guard-interval length would mean about 0.4 dB or 1 dB degradation on the Eb/N0-axis, respectively. The delay spread of the channel model corresponds to about 39 symbol-rate samples or 77 samples at twice the symbol-rate. Then the minimum FFT size to reach 10% guard-interval overhead is about 350 for SSE and 700 for FSE. However, the RRC pulse shaping and baseband channel filtering extend the delay spread, possibly by a factor of 2, so the CP length should be in the order of 5 μs in this example. Then the practical FFT length could be 512 or 1024 for SSE and 1024 or 2048 for FSE. The conclusion is that a considerably higher number of subbands is needed in the FFT case to reach realistic CP overhead.
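The guard-interval figures quoted above follow from simple arithmetic, sketched below in plain Python. Two conventions are assumptions made here so that the numbers come out as quoted: the Eb/N0 loss is computed as 10 log10(1 + g), with the guard length g expressed as a fraction of the useful block (the same form as the 10 log10(9/8) expression used in Section 4.2), and the minimum FFT size is computed with the 10% overhead measured against the total block length (CP plus FFT block). The function names are hypothetical.

```python
import math

SYMBOL_RATE_HZ = 15.36e6       # 1/T from the simulation set-up
MAX_EXCESS_DELAY_S = 2.5e-6    # extended vehicular-A channel, approximate


def cp_loss_db(cp_fraction):
    """Eb/N0 loss when the CP is cp_fraction of the useful block length."""
    return 10 * math.log10(1 + cp_fraction)


def delay_spread_samples(oversampling=1):
    """Channel delay spread in samples, rounded up."""
    return math.ceil(MAX_EXCESS_DELAY_S * SYMBOL_RATE_HZ * oversampling)


def min_fft_size(cp_samples, overhead):
    """FFT size N for which cp / (N + cp) equals the given overhead."""
    return cp_samples * (1 - overhead) / overhead


for g in (0.10, 0.25):
    print(f"guard {g:.0%}: {cp_loss_db(g):.2f} dB")

for name, factor in (("SSE", 1), ("FSE", 2)):
    cp = delay_spread_samples(factor)
    print(f"{name}: CP >= {cp} samples, FFT size >= {min_fft_size(cp, 0.10):.0f}")
```

Under these assumptions the sketch gives roughly 0.41 dB and 0.97 dB for the two guard lengths, and minimum FFT sizes of about 351 and 693, in line with the rounded values in the text.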
[Figure 7: Uncoded BER performance of FB-FSE (CFIR-FBEQ 3-tap and AP-FBEQ Cases 1, 2, 3) with overlapping factor K = 5 and 2M = {128, 256} subbands. Panels: (a) QPSK, (b) 16-QAM, (c) 64-QAM. Axes: BER versus Eb/N0 (dB). Curves: AP Cases 1, 2, and 3 and CFIR 3-tap, each for 2M = 128 and 2M = 256, together with the analytic reference.]

Performance with channel estimation

In Figure 10, the uncoded BER performance of AP-FBEQ is simulated with a practical channel estimator. The channel estimator described in Section 3.4 is utilized, using Gold codes of different lengths as a training sequence. It is observed that the training sequence length of 384 symbols is quite sufficient.

4.2. Performance comparison with practical parameters and error-correction coding

Here, we include LDPC forward error correction (FEC) coding and the channel estimator in the simulation model. The main parameters are indicated in Table 1. With the chosen parameters, the training symbol overhead is 10% and the two systems with different LDPC code-rates transmit
exactly the same number of source bits per frame. A higher code-rate is needed in the FFT-FDE system to accommodate the CP overhead. Meanwhile, the CP length, which is 1/8 of the useful symbol duration, introduces an Eb/N0 degradation of 10 log10(9/8) dB. The comparison of Figure 11 shows that FB-FDE has about 1 dB performance advantage over the FFT-FDE in the most interesting coded FER region of 1%–10%. This is the joint result of using a lower code-rate and the absence of the CP Eb/N0 degradation. Moreover, we can see that AP-FBEQ and CFIR-FBEQ have very similar performance.

[Figure 8: Uncoded BER performance for FB-FSE (AP-FBEQ Case 3 equalizer) with 2M = 256 subbands and different K-factors. Axes: BER versus Eb/N0 (dB). Curves for QPSK, 16-QAM, and 64-QAM.]

[Figure 9: Uncoded BER performance comparison between SSE and FSE-type FB-FDE and FFT-FDE with QPSK and 16-QAM modulations. AP-FBEQ Case 3 equalizer with 2M = 256 subbands and overlapping factor K = 5 was used. Axes: BER versus Eb/N0 (dB). Curves: SSE and FSE, each with AP-FBEQ Case 3 (2M = 256) and with 2048-FFT.]

[Figure 10: Uncoded BER performance for FB-FSE with ML based channel estimation using different training sequence lengths with QPSK, 16-QAM, and 64-QAM modulations. AP-FBEQ Case 3 equalizer with 2M = 256 subbands and overlapping factor K = 5 was used. Axes: BER versus Eb/N0 (dB).]

The AP-FBEQ and CFIR-FBEQ systems are also compared in Figure 12 with the FBMC and OFDM systems of [15]. The parameters of FB-FDE are the same as in Table 1, except that code-rate 3/4 is used to reach a similar bit rate to the other systems. The parameters are consistent with the ones considered in [15], with similar overhead for training sequences/pilots, signal bandwidth, and bit rates. The same type of LDPC code is used, however with the higher code-rate 3/4 in OFDM and FB-FDE, and code-rate 2/3 in the FBMC system. A higher code-rate is needed in OFDM to accommodate the CP overhead, and in FB-FDE to accommodate the overhead due to the excess band. With QPSK modulation, the numbers of source bits in one 250 μs frame are 5022, 5184, and 5320 for OFDM, FB-FDE, and FBMC, respectively.

Figure 12 shows that with QPSK modulation, FB-FDE has a clear performance benefit over FBMC and CP-OFDM, whereas with 16-QAM modulation, FB-FDE and CP-OFDM perform rather similarly and clearly worse than FBMC.

4.3. Complexity comparison between FFT-FDEs and FB-FDEs

Here we evaluate the receiver complexity of FFT-FDEs and FB-FDEs in terms of real multiplications per detected symbol. The complexity metric includes the FB or FFT transform, subband equalizers, as well as the baseband filtering in the SSE case. The time-domain RRC filter is assumed to be of length NRRC = 31. The receiver RRC filtering and decimation are realized in the frequency domain in both FSE systems, using a half-sized IFFT or FB on the synthesis side. The split-radix algorithm [19] is applied for FFT/IFFT, critically sampled filter banks are implemented with the fast extended
lapped transform algorithm [12], and the oversampled analysis banks are implemented using the optimized FFT based structure of [13]. The needed number of real multiplications for a block of M high-rate samples is M(log2 M − 3) + 4 for the FFT or IFFT, M(2K + log2 M + 2) for the critically sampled synthesis bank, and 2M(2K + log2 M − 2) for an oversampled analysis bank. For FB-FDE, we have seen that 128 or 256 subbands are sufficient, whereas FFT lengths of 1k or 2k are required for FFT-FDE. For FB-FDE, 2 real multipliers are needed for each tap of the CFIR, 2 for the first-order complex allpass and 1 for the real allpass (the two multipliers in the allpass structures of Figure 6 can be combined), 2 for phase rotation, and 2 for the amplitude equalizer (we can scale a0 = 1, and do the overall signal scaling in the phase rotator). The overall complexity figures are shown in Table 2, considering two extreme cases of filter bank complexity.

The comparison between SSE and FSE depends very much on the needed baseband RRC and channel filter complexity, but it is evident that, also in the FB-FDE case, FSE may actually be less complex to implement than SSE. The complexity of FB-FDE depends heavily on the K factor of the FB design. The subband equalizer choice has a minor effect on the overall complexity.

In a CP based system, the capability of the frequency-domain filter to suppress strong adjacent channels or other interferences in the stopbands is limited due to FFT blocking effects. Assume that there is a strong interference signal in the stopband of the RRC filter. Removing the CPs would cause transients in the interference waveforms, and these would cause relatively strong error transients at the ends of the time-domain symbol blocks even after filtering. Thus it seems that a baseband filter before the FFT is needed in CP based single-carrier FDE. FB-FSE may actually be very competitive compared to FFT-FSE, if additional baseband filtering is needed in the latter structure. With oversampled equalizer processing, the implementation of the baseband filter is not as efficient as in the SSE case. In the example set-up, if the RRC filter is implemented in the time domain at 2× symbol-rate, the FFT-FSE multiplication rates are increased by 64 multiplications per symbol.

Table 1: FFT-FDE and FB-FDE system parameters.

Parameter                  FB-FSE                       FFT-FSE
Sampling rate              30.72 MHz                    30.72 MHz
Symbol-rate                15.36 MHz                    15.36 MHz
RRC roll-off               0.22                         0.22
Signal bandwidth           18.74 MHz                    18.74 MHz
No. of subbands            256                          1024
Data symbols per frame     3456                         3072
Cyclic prefix (symbols)    0                            64
Training symbols           384                          384
Total symbols              3840                         3840
Frame duration             250 μs                       250 μs
FEC                        LDPC, code-rate 2/3          LDPC, code-rate 3/4

Modulation                 QPSK    16-QAM   64-QAM      QPSK    16-QAM   64-QAM
Transmit bits (coded)      6912    13824    20736       6144    12288    18432
Source bits                4608    9216     13824       4608    9216     13824

Table 2: Receiver complexity comparison between the FB-FDE and FFT-FDE receivers: number of real multiplications per symbol.

FFT-FDE                       Formula                                  M = 1024          M = 2048
SSE                           2 log2 M − 4 + NRRC + 1                  48                50
FSE                           3 log2 M − 6 + 4α                        24                27
FSE with time-domain RRC      3 log2 M − 6 + 4α + 2(NRRC + 1)          88                91

FB-FDE                        Formula                                  M = 128; K = 2    M = 256; K = 5
(1) AP-FBEQ
SSE, Case 1                   6K + 3 log2 M − 1 + NRRC                 63                84
SSE, Case 2                   6K + 3 log2 M + 2 + NRRC                 66                87
SSE, Case 3                   6K + 3 log2 M + 4 + NRRC                 68                89
FSE, Case 1                   10K + 5 log2 M − 4 + 2α                  51                86
FSE, Case 2                   10K + 5 log2 M − 1 + 5α                  55                90
FSE, Case 3                   10K + 5 log2 M + 1 + 7α                  57                92
(2) CFIR-FBEQ
FSE, 3-taps                   10K + 5 log2 M + 6α                      56                91
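The per-symbol formulas of Table 2 are easy to evaluate directly, which also makes it simple to try other values of M and K. The following is a minimal sketch in plain Python, using α = 0.22 and NRRC = 31 from the text. The function names are hypothetical, and rounding down to an integer is an assumption inferred from the tabulated values rather than something stated in the paper.

```python
import math

ALPHA = 0.22   # RRC roll-off
N_RRC = 31     # time-domain RRC filter length


def fft_fde(M):
    """Real multiplications per symbol for the FFT-FDE rows of Table 2."""
    L = math.log2(M)
    return {
        "SSE": 2 * L - 4 + N_RRC + 1,
        "FSE": 3 * L - 6 + 4 * ALPHA,
        "FSE, time-domain RRC": 3 * L - 6 + 4 * ALPHA + 2 * (N_RRC + 1),
    }


def fb_fde(M, K):
    """Real multiplications per symbol for the FB-FDE rows of Table 2."""
    L = math.log2(M)
    return {
        "AP SSE, Case 1": 6 * K + 3 * L - 1 + N_RRC,
        "AP SSE, Case 2": 6 * K + 3 * L + 2 + N_RRC,
        "AP SSE, Case 3": 6 * K + 3 * L + 4 + N_RRC,
        "AP FSE, Case 1": 10 * K + 5 * L - 4 + 2 * ALPHA,
        "AP FSE, Case 2": 10 * K + 5 * L - 1 + 5 * ALPHA,
        "AP FSE, Case 3": 10 * K + 5 * L + 1 + 7 * ALPHA,
        "CFIR FSE, 3-taps": 10 * K + 5 * L + 6 * ALPHA,
    }


def floored(rates):
    """Round each rate down to an integer (assumed table convention)."""
    return {name: math.floor(value) for name, value in rates.items()}


for M in (1024, 2048):
    print("FFT-FDE", M, floored(fft_fde(M)))
for M, K in ((128, 2), (256, 5)):
    print("FB-FDE", M, K, floored(fb_fde(M, K)))
```

With these settings the rounded-down values coincide with the entries of Table 2, and the difference between the two FFT-FSE rows is the 2(NRRC + 1) = 64 multiplications per symbol mentioned in the text.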
[Figure 11: Coded BER and FER performance comparison between FFT-FSE and FB-FSE with practical system parameters and LDPC coding. Both 3-tap CFIR and AP Case 3 subband equalizers are included in FB-FSE models. Panels: (a) QPSK modulation, (b) 16-QAM modulation. Axes: BER/FER versus Eb/N0 (dB). Curves: 1024-FFT FDE, CFIR-FBEQ (2M = 256), AP-FBEQ (2M = 256).]

[Figure 12: Coded BER and FER performance comparison between CP-OFDM, FBMC, and FB-FSE with practical system parameters and LDPC coding. Both 3-tap CFIR and AP Case 3 subband equalizers are included in FBMC and FB-FSE models. Panels: (a) QPSK modulation, (b) 16-QAM modulation. Axes: BER/FER versus Eb/N0 (dB). Curves: CP-OFDM, CFIR-FBMC, AP-FBMC, CFIR-FBEQ, AP-FBEQ, all with 2M = 256.]

5. CONCLUSION

We have presented a filter bank based frequency-domain equalizer with mildly frequency-selective subband processing and a modest number of subbands. The performance is better than that of the FFT-FDE. Furthermore, FB-FDE is applicable to any single carrier system, whether CP is included or not.

In certain wireless communication scenarios, strong narrowband interferences (NBI) are considered a serious problem [30], and various methods have been developed for mitigating their effects. Frequency-domain NBI mitigation can be easily combined with both FFT-FDE and FB-FDE with minor additional complexity. It has been observed that FFT based frequency-domain filtering has limitations as an NBI mitigation method due to the FFT leakage, while filter bank based approaches provide clearly better performance [30–32].
Regarding the choice between CFIR-FBEQ and AP-FBEQ, it was seen that the latter gives consistently slightly better performance at the cost of a slightly higher multiplication rate. Furthermore, in AP-FBEQ, the amplitude and phase responses can be adjusted independently of each other, which is a very useful feature in many respects. For example, in [33] the equalizer amplitude response is tuned to enhance narrowband interference suppression. In [23], a filter bank system with a 2M-channel analysis bank and an M-channel synthesis bank is developed, and it is observed that tuning of the phase response in the subband equalizers is needed to achieve nearly perfect reconstruction characteristics with low distortion.

The overlapped-FFT algorithms also avoid the use of CPs. This structure can be seen as a kind of a simple filter bank with basis functions overlapping in time [7–9]. It can be seen that there is a continuum of filter bank design cases between the overlapped FFT based approach and the FB based designs with high K values. If the frequency selectivity of the filter bank design is not important, then relatively low-complexity designs probably provide the best tradeoff. As we have seen, the performance difference between K = 3 and K = 5 is relatively small.

The complexity of FB-FDEs is no doubt higher than that of FFT-FDE structures. However, we believe that the same filter bank can be used to implement part of the channel filtering, with much higher performance than when using the FFT-FDE structures. FB-FDE provides an easily configurable structure for the final stage of the channel filtering chain, together with the channel equalization functionality.

Personal Multimedia Communications (WPMC ’02), vol. 1, pp. 27–36, Honolulu, Hawaii, USA, October 2002.
[8] L. Martoyo, T. Weiss, F. Capar, and F. K. Jondral, “Low complexity CDMA downlink receiver based on frequency domain equalization,” in Proceedings of IEEE 58th Vehicular Technology Conference (VTC ’03), vol. 2, pp. 987–991, Orlando, Fla, USA, October 2003.
[9] P. Schniter and H. Liu, “Iterative frequency-domain equalization for single-carrier systems in doubly-dispersive channels,” in Proceedings of the 38th Asilomar Conference on Signals, Systems, and Computers, vol. 1, pp. 667–671, Pacific Grove, Calif, USA, November 2004.
[10] J. Alhava and M. Renfors, “Adaptive sine-modulated/cosine-modulated filter bank equalizer for transmultiplexers,” in Proceedings of the European Conference on Circuit Theory and Design (ECCTD ’01), pp. 337–340, Espoo, Finland, August 2001.
[11] J. Alhava, A. Viholainen, and M. Renfors, “Efficient implementation of complex exponentially-modulated filter banks,” in Proceedings of IEEE International Symposium on Circuits and Systems, vol. 4, pp. 157–160, Bangkok, Thailand, May 2003.
[12] A. Viholainen, J. Alhava, and M. Renfors, “Efficient implementation of complex modulated filter banks using cosine and sine modulated filter banks,” EURASIP Journal on Applied Signal Processing, vol. 2006, Article ID 58564, 10 pages, 2006.
[13] A. Viholainen, J. Alhava, and M. Renfors, “Efficient implementation of 2x oversampled exponentially modulated filter banks,” IEEE Transactions on Circuits and Systems II, vol. 53, pp. 1138–1142, 2006.
[14] J. G. Proakis, Digital Communications, McGraw-Hill, New York, NY, USA, 4th edition, 2001.
[15] T. Ihalainen, T. Hidalgo Stitz, M. Rinne, and M. Renfors, “Channel equalization in filter bank based multicarrier modulation for wireless communications,” to appear in EURASIP Journal of Applied Signal Processing.
REFERENCES [16] Y. Yang, T. Ihalainen, and M. Renfors, “Filter bank based fre-
quency domain equalizer in single carrier modulation,” in Pro-
[1] M. V. Clark, “Adaptive frequency-domain equalization and di- ceedings of the 14th IST Mobile and Wireless Communications
versity combining for broadband wireless communications,” Summit, Dresden, Germany, June 2005.
IEEE Journal on Selected Areas in Communications, vol. 16, [17] N. Benvenuto and S. Tomasin, “On the comparison between
no. 8, pp. 1385–1395, 1998. OFDM and single carrier modulation with a DFE using a
[2] G. Kadel, “Diversity and equalization in frequency domain - frequency-domain feedforward filter,” IEEE Transactions on
a robust and flexible receiver technology for broadband mo- Communications, vol. 50, no. 6, pp. 947–955, 2002.
bile communication systems,” in Proceedings of IEEE 47th Ve- [18] J. R. Treichler, I. Fijalkow, and C. R. Johnson Jr., “Fractionally
hicular Technology Conference (VTC ’97), vol. 2, pp. 894–898, spaced equalizers: how long should they really be?” IEEE Signal
Phoenix, Ariz, USA, May 1997. Processing Magazine, vol. 13, no. 3, pp. 65–81, 1996.
[3] D. Falconer, S. L. Ariyavisitakul, A. Benyamin-Seeyar, and [19] P. Duhamel, “Implementation of split-radix FFT algorithms
B. Eidson, “Frequency domain equalization for single-carrier for complex, real, and real-symmetric data,” IEEE Transactions
broadband wireless systems,” IEEE Communications Maga- on Acoustics, Speech, and Signal Processing, vol. 34, no. 2, pp.
zine, vol. 40, no. 4, pp. 58–66, 2002. 285–295, 1986.
[4] H. Sari, G. Karam, and I. Jeanclaude, “Transmission tech- [20] H. S. Malvar, Signal Processing with Lapped Transforms, Artech
niques for digital terrestrial TV broadcasting,” IEEE Commu- House, Norwood, Mass, USA, 1992.
nications Magazine, vol. 33, no. 2, pp. 100–109, 1995. [21] A. Viholainen, T. Hidalgo Stitz, J. Alhava, T. Ihalainen, and M.
[5] A. Czylwik, “Comparison between adaptive OFDM and sin- Renfors, “Complex modulated critically sampled filter banks
gle carrier modulation with frequency domain equalization,” based on cosine and sine modulation,” in Proceedings of IEEE
in Proceedings of IEEE 47th Vehicular Technology Conference International Symposium on Circuits and Systems, vol. 1, pp.
(VTC ’97), vol. 2, pp. 865–869, Phoenix, Ariz, USA, May 1997. 833–836, Scottsdale, Ariz, USA, May 2002.
[6] A. Gusmão, R. Dinis, and N. Esteves, “On frequency-domain [22] M. R. Petraglia, R. G. Alves, and P. S. R. Diniz, “New structures
equalization and diversity combining for broadband wire- for adaptive filtering in subbands with critical sampling,” IEEE
less communications,” IEEE Transactions on Communications, Transactions on Signal Processing, vol. 48, no. 12, pp. 3316–
vol. 51, no. 7, pp. 1029–1033, 2003. 3327, 2000.
[7] D. D. Falconer and S. L. Ariyavisitakul, “Broadband wire- [23] A. Viholainen, T. Ihalainen, T. Hidalgo Stitz, Y. Yang, and
less using single carrier and frequency domain equalization,” M. Renfors, “Flexible filter bank dimensioning for mul-
in Proceedings of the 5th International Symposium on Wireless ticarrier modulation and frequency domain equalization,”
in Proceedings of IEEE Asia Pacific Conference on Circuits and Mika Rinne received his M.S. degree from
Systems, pp. 451–454, Singapore, December 2006. Tampere University of Technology (TUT)
[24] T. Karp and N. J. Fliege, “Modified DFT filter banks with per- in signal processing and computer science,
fect reconstruction,” IEEE Transactions on Circuits and Systems in 1989. He acts as Principal Scientist in the
II, vol. 46, no. 11, pp. 1404–1414, 1999. Radio Technologies laboratory of Nokia Re-
[25] T. Hentschel and G. Fettweis, Software Radio Receivers, Kluwer search Center. His background is in research
Academic, Boston, Mass, USA, 1999. of multiple-access methods, radio resource
[26] S. Kay, Fundamentals of Statistical Signal Processing: Estimation management and implementation of packet
Theory, Prentice-Hall, Englewood Cliffs, NJ, USA, 1993. decoders for radio communication systems.
[27] W. W. Peterson and E. J. Weldon Jr., Error-Correcting Codes, Currently, his interests are in research of
MIT Press, Cambridge, Mass, USA, 2nd edition, 1972. protocols and algorithms for wireless communications including
[28] T. B. Sorensen, P. E. Mogensen, and F. Frederiksen, “Extension WCDMA, long-term evolution of 3G and beyond 3G systems.
of the ITU channel models for wideband OFDM systems,”
Markku Renfors was born in Suoniemi,
in Proceedings of IEEE 62nd Vehicular Technology Conference
Finland, on January 21, 1953. He received
(VTC ’05), vol. 1, pp. 392–396, Dallas, Tex, USA, September
the Diploma Engineer, Licentiate of Tech-
2005.
nology, and Doctor of Technology degrees
[29] R. G. Gallager, Low-Density Parity-Check Codes, MIT Press,
from the Tampere University of Technology
Cambridge, Mass, USA, 1963.
(TUT), Tampere, Finland, in 1978, 1981,
[30] S. Hara, T. Matsuda, K. Ishikura, and N. Morinaga, “Co-exis-
and 1982, respectively. From 1976 to 1988,
tence problem of TDMA and DS-CDMA systems-application
he held various research and teaching posi-
of complex multirate filter bank,” in Proceedings of IEEE Global
tions at TUT. From 1988 to 1991, he was a
Telecommunications Conference (GLOBECOM ’96), vol. 2, pp.
Design Manager at the Nokia Research Cen-
1281–1285, London, UK, November 1996.
ter and Nokia Consumer Electronics, Tampere, Finland, where he
[31] M. J. Medley, G. J. Saulnier, and P. K. Das, “Narrow-band in-
focused on video signal processing. Since 1992, he has been a Pro-
terference excision in spread spectrum systems using lapped
fessor and Head of the Institute of Communications Engineering
transforms,” IEEE Transactions on Communications, vol. 45,
at TUT. His main research areas are multicarrier systems and signal
no. 11, pp. 1444–1455, 1997.
processing algorithms for flexible radio receivers and transmitters.
[32] T. Hidalgo Stitz and M. Renfors, “Filter-bank-based narrow-
band interference detection and suppression in spread spec-
trum systems,” EURASIP Journal on Applied Signal Processing,
vol. 2004, no. 8, pp. 1163–1176, 2004.
[33] Y. Yang, T. Hidalgo Stitz, M. Rinne, and M. Renfors, “Mitiga-
tion of narrowband interference in single carrier transmission
with filter bank equalization,” in Proceedings of IEEE Asia Pa-
cific Conference on Circuits and Systems, pp. 749–752, Singa-
pore, December 2006.
Yuan Yang received his B.S. degree in electrical engineering from HoHai University, Nanjing, China, in 1996, and his M.S. degree in information technology from Tampere University of Technology (TUT), Tampere, Finland, in 2001, respectively. Currently, he is a researcher and a postgraduate student at the Institute of Communications Engineering at TUT, working towards the doctoral degree. His research interests are in the field of broadband wireless communications, with emphasis in the topics of frequency-domain equalizers and multirate filter banks applications.

Tero Ihalainen received his M.S. degree in electrical engineering from Tampere University of Technology (TUT), Finland, in 2005. Currently, he is a researcher and a postgraduate student at the Institute of Communications Engineering at TUT, pursuing towards the doctoral degree. His main research interests are digital signal processing algorithms for multicarrier and frequency domain equalized single-carrier modulation based wireless communications, especially applications of multirate filter banks.
doi:10.1155/2007/61396
Research Article
Design of Nonuniform Filter Bank Transceivers
for Frequency Selective Channels
Han-Ting Chiang,1 See-May Phoong,1 and Yuan-Pei Lin2

1 Department of Electrical Engineering, Graduate Institute of Communication Engineering, National Taiwan University,
Taipei 10617, Taiwan
2 Department of Electrical and Control Engineering, National Chiao Tung University, Hsinchu 300, Taiwan
Received 14 January 2006; Revised 16 July 2006; Accepted 13 August 2006
In recent years, there has been considerable interest in the theory and design of filter bank transceivers due to their superior frequency response. In many applications, it is desired to have transceivers that can support multiple services with different incoming data rates and different quality-of-service requirements. To meet these requirements, we can either do resource allocation or design transceivers with a nonuniform bandwidth partition. In this paper, we propose a method for the design of nonuniform filter bank transceivers for frequency selective channels. Both frequency response and signal-to-interference ratio (SIR) can be incorporated in the transceiver design. Moreover, the technique can be extended to the case of nonuniform filter bank transceivers with rational sampling factors. Simulation results show that nonuniform filter bank transceivers with good filter responses as well as high SIR can be obtained by the proposed design method.
Copyright © 2007 Hindawi Publishing Corporation. All rights reserved.
1. INTRODUCTION

The orthogonal frequency division multiplexing (OFDM) system has enjoyed great success in many wideband communication systems due to its ability to combat intersymbol interference (ISI) [1]. It is known that the transmitting and receiving filters of the OFDM transceiver have poor frequency responses. As a result, many subchannels will be affected when there is narrowband interference, and the performance degrades significantly [2]. Many techniques have been proposed to solve this problem.

One of the solutions is the filter bank technique. In recent years, there has been considerable interest in the application of filter banks to the design of transceivers with good frequency characteristics [2–10]. Many of these previous studies [3–6] have focused on the design of filter bank transceivers (or transmultiplexers) under the assumption that the transmission channel is an ideal channel that does not create ISI. When the channel is a frequency selective channel, these filter bank transceivers suffer from severe ISI [7, 8], and a post-processing technique is needed at the receiver for channel equalization [4]. Recently the authors in [10] studied the filter bank transceiver for frequency selective channels. The transmitting and receiving filters are optimized for SIR (signal-to-interference ratio) maximization. Like OFDM systems, simple one-tap equalizers can be employed at the receiver for channel equalization. It has been demonstrated that filter bank transceivers with high SIR and good frequency responses can be obtained [10].

In many applications, it is desired to have transceivers that can support multiple services [11, 12]. Different services might have different incoming data rates and different quality-of-service requirements. One solution to this problem is to judiciously allocate the resources to meet the requirements, see, for example, [11]. Another solution is to use a nonuniform filter bank transceiver. The theory and design of nonuniform filter banks have been studied by a number of researchers [13–18]. These results are extended to the design of transceivers and transmultiplexers with nonuniform band separation in [12, 19]. In [12], the authors proposed a general building block for the design of nonuniform filter bank transmultiplexers. Near perfect reconstruction transmultiplexers with good frequency property can be obtained by the proposed method therein. In [19], a design of nonuniform transmultiplexers using semi-infinite programming was proposed. The proposed algorithm was efficient and good results were achieved. However, these nonuniform transceiver designs do not consider the channel effect. When the transmission channel is frequency selective, an additional equalizer is needed at the receiver.

In this paper, we consider the design of nonuniform transceivers for frequency selective channels. Both the cases of integer and rational sampling factors are considered. As the effect of the channel is taken into consideration at the time the filter bank is optimized, simple one-tap equalizers can be used at the receiver for channel equalization. Unlike the uniform case, the equivalent system from the transmitter input to the receiver output is no longer LTI and the ISI-free condition needs to be derived. Furthermore we will show that, like the uniform case [10], the SIR can be formulated as a Rayleigh-Ritz ratio of filter coefficients. The optimal filters that maximize the SIR can be obtained from an eigenvector of a positive definite matrix. Moreover, an iterative algorithm that can incorporate the frequency response is proposed for SIR maximization. Simulation results show that we can obtain nonuniform transceivers with very high SIR (around 50 dB) and good frequency response (stopband attenuation around 40 dB).

This paper is organized as follows. In Section 2, we study nonuniform filter bank transceivers with integer sampling factors. The ISI-free condition is derived and the SIR is formulated as a Rayleigh-Ritz ratio of transmitting and receiving filters. Then SIR-optimized transmitting and receiving filters are given. Moreover, the design method can be extended to the case of unknown frequency selective channels. In Section 3, an iterative algorithm is proposed to alternatingly optimize the transmitting and receiving filters for SIR maximization. We will show how to incorporate the frequency response into the objective function. The results are extended to the case of rational sampling factors in Section 4. In Section 5, simulation examples are given to demonstrate the usefulness of the proposed method. A conclusion is given in Section 6.

Notation

The $N$-fold downsampled and upsampled versions of $x(n)$ are respectively denoted by $[x(n)]_{\downarrow N}$ and $[x(n)]_{\uparrow N}$ in the time domain, and by $[X(z)]_{\downarrow N}$ and $[X(z)]_{\uparrow N}$ in the $z$ domain. The convolution of two sequences $x(n)$ and $y(n)$ is represented by $x(n) * y(n)$.

2. NONUNIFORM FILTER BANK TRANSCEIVERS WITH INTEGER SAMPLING FACTORS

Figure 1 shows a nonuniform filter bank transceiver. The downsampling and upsampling ratios $N_i$ are integers and they can be different for different $i$. A larger $N_i$ indicates a lower data rate and also implies that a smaller bandwidth is allocated to the $i$th subband. For a filter bank transceiver, the integers $N_i$ satisfy $\sum_{i=0}^{M-1} 1/N_i \le 1$, which is a necessary condition for recovering the input signals $x_i(n)$. When the equality $\sum_{i=0}^{M-1} 1/N_i = 1$ holds, the transceiver is said to be critically sampled. The transmission channel is modeled as an $L$th-order LTI channel with transfer function

$$C(z) = \sum_{l=0}^{L} c(l) z^{-l}. \tag{1}$$

Figure 1: A nonuniform filter bank transceiver with integer sampling factors.

The additive noise is denoted by $v(n)$. Because our formulation is based on the signal-to-interference ratio, the channel noise does not affect the transceiver design. Therefore in Sections 2, 3, and 4, we set $v(n) = 0$. For convenience, an advance operator $z^{l_0}$ is added at the receiver to account for the system delay caused by the channel $C(z)$. In practice, this advance element can be replaced by an appropriate delay. In this paper, we consider only FIR filter banks. The transmitting and receiving filters are, respectively,

$$F_i(z) = \sum_{n=0}^{N_{f_i}} f_i(n) z^{-n}, \qquad H_i(z) = \sum_{n=0}^{N_{h_i}} h_i(n) z^{n}. \tag{2}$$

The orders of these filters, $N_{f_i}$ and $N_{h_i}$, can be larger than $N_i$. For notational simplicity, we use the noncausal expression for the receiving filters. Causal filters can be obtained easily by adding sufficient delays. In addition, we assume that the input signals $x_i(n)$ are uncorrelated, zero mean, wide sense stationary (WSS), and white random processes with the same variance $\mathcal{E}_x$. That is,

$$E[x_i(n)] = 0, \qquad E[x_i(n) x_j^*(m)] = \mathcal{E}_x \, \delta(i - j) \, \delta(n - m). \tag{3}$$

This assumption is usually satisfied by properly interleaving the input data.
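As a small illustration of the sampling condition stated in Section 2, the following sketch evaluates $\sum_i 1/N_i$ exactly for a list of integer sampling factors and reports whether the necessary condition holds. The function names and the returned labels are hypothetical and chosen only for this example; they do not come from the paper.

```python
from fractions import Fraction


def sampling_load(factors):
    """Return sum_i 1/N_i as an exact fraction for integer factors N_i."""
    if any(n < 1 for n in factors):
        raise ValueError("sampling factors must be positive integers")
    return sum((Fraction(1, n) for n in factors), Fraction(0))


def classify(factors):
    """Label a set of factors using the condition sum_i 1/N_i <= 1.

    The labels are illustrative names, not terminology fixed by the paper,
    except for "critically sampled" (equality case).
    """
    load = sampling_load(factors)
    if load > 1:
        return "condition violated"      # necessary condition for recovery fails
    if load == 1:
        return "critically sampled"      # equality case
    return "below critical sampling"     # strict inequality


if __name__ == "__main__":
    for factors in ([2, 4, 4], [2, 4, 8], [2, 2, 4]):
        print(factors, sampling_load(factors), classify(factors))
```

Exact rational arithmetic is used here so that the equality case is not missed because of floating-point rounding.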
2.1. ISI-free condition

The filter bank transceiver shown in Figure 1 is said to be ISI-free if in the absence of noise, for all possible input signals $x_i(n)$, the outputs are

$$\hat{x}_i(n) = G_i x_i(n), \tag{4}$$

for some constant $G_i$. In this case, a zero-forcing solution can be obtained by cascading a simple one-tap equalizer. Expressing the output signal at the $j$th subband in the $z$ domain, we have

$$\begin{aligned}
\hat{X}_j(z) &= \sum_{i=0}^{M-1} \left[ X_i(z^{N_i}) F_i(z) z^{l_0} C(z) H_j(z) \right]_{\downarrow N_j} \\
&= X_j(z) \left[ F_j(z) z^{l_0} C(z) H_j(z) \right]_{\downarrow N_j} + \sum_{\substack{i=0 \\ i \neq j}}^{M-1} \left[ X_i(z^{N_i}) F_i(z) z^{l_0} C(z) H_j(z) \right]_{\downarrow N_j}.
\end{aligned} \tag{5}$$

From the above equation, we see that in general the system from the input $x_i(n)$ to the output $\hat{x}_j(n)$ is not LTI unless $N_j$ is a factor of $N_i$. This is very different from the case of uniform filter bank transceivers, in which all $N_i = N$. Let $g_{i,j}$ be the greatest common divisor (gcd) of $N_i$ and $N_j$. Define two coprime integers $p_{i,j} = N_i / g_{i,j}$ and $p_{j,i} = N_j / g_{i,j}$. Then we can write

$$\hat{X}_j(z) = X_j(z) \left[ F_j(z) z^{l_0} C(z) H_j(z) \right]_{\downarrow N_j} + \sum_{\substack{i=0 \\ i \neq j}}^{M-1} \left[ X_i(z^{p_{i,j}}) \left[ F_i(z) z^{l_0} C(z) H_j(z) \right]_{\downarrow g_{i,j}} \right]_{\downarrow p_{j,i}}. \tag{6}$$

Define

$$T_{i,j}(z) = \left[ F_i(z) z^{l_0} C(z) H_j(z) \right]_{\downarrow g_{i,j}} = \sum_{l=0}^{L} c(l) \left[ F_i(z) H_j(z) z^{l_0 - l} \right]_{\downarrow g_{i,j}} \tag{7}$$

for $0 \le i, j \le M - 1$. As the input signals $x_i(n)$ are arbitrary, one can show (see the appendix for a proof) that the ISI-free condition $\hat{X}_i(z) = G_i X_i(z)$ is satisfied if and only if

$$T_{i,j}(z) = \begin{cases} G_i, & j = i, \\ 0, & \text{otherwise}. \end{cases} \tag{8}$$

For convenience of discussion, we express $[F_i(z) H_j(z) z^{l_0 - l}]_{\downarrow g_{i,j}}$ in terms of the two sequences $\alpha_{i,l}(n)$ and $\beta_{i,j,l}(n)$ as

$$\left[ F_i(z) H_j(z) z^{l_0 - l} \right]_{\downarrow g_{i,j}} = \begin{cases} \alpha_{i,l}(0) + \displaystyle\sum_{n,\, n \neq 0} \alpha_{i,l}(n) z^{-n}, & i = j, \\[2ex] \displaystyle\sum_{n} \beta_{i,j,l}(n) z^{-n}, & i \neq j, \end{cases} \tag{9}$$

for $0 \le i, j \le M - 1$, and $0 \le l \le L$. Note that since $F_i(z)$ and $H_j(z)$ are of finite length, $\alpha_{i,l}(n)$ and $\beta_{i,j,l}(n)$ have finite nonzero terms only. Using the above definition, we can write the $j$th output signal $\hat{x}_j(n)$ as

$$\begin{aligned}
\hat{x}_j(n) ={}& \left( \sum_{l=0}^{L} \alpha_{j,l}(0) c(l) \right) x_j(n) + \sum_{l=0}^{L} c(l) \left[ \alpha_{j,l}(n) - \alpha_{j,l}(0) \delta(n) \right] * x_j(n) \\
&+ \sum_{\substack{i=0 \\ i \neq j}}^{M-1} \left[ \sum_{l=0}^{L} c(l) \beta_{i,j,l}(n) * \left[ x_i(n) \right]_{\uparrow p_{i,j}} \right]_{\downarrow p_{j,i}}.
\end{aligned} \tag{10}$$

The first, second, and third terms on the right-hand side of the above expression are the desired signal, the intraband ISI and the cross-band ISI, respectively. To get an ISI-free transceiver, we need to find the transmitting filters $F_k(z)$ and receiving filters $H_k(z)$ so that the second and third terms are equal to zero. The general solution to this problem is still unknown. In the following, we will show how to reduce the effect of ISI by finding a solution that maximizes the signal-to-interference ratio (SIR).

2.2. Matrix formulations of $\alpha_{i,l}(n)$ and $\beta_{i,j,l}(n)$

In this section, we will formulate the sequences $\alpha_{i,l}(n)$ and $\beta_{i,j,l}(n)$ in a matrix form. These expressions will be useful for the optimization of the transceivers. Recall from (9) that $\alpha_{i,l}(n)$ and $\beta_{i,j,l}(n)$ are obtained from the convolution of $f_k(n)$ and $h_k(n)$. Let us define the following vectors:

$$\boldsymbol{\alpha}_i(n) = \begin{bmatrix} \alpha_{i,0}(n) \\ \alpha_{i,1}(n) \\ \vdots \\ \alpha_{i,L}(n) \end{bmatrix}, \quad
\boldsymbol{\beta}_{i,j}(n) = \begin{bmatrix} \beta_{i,j,0}(n) \\ \beta_{i,j,1}(n) \\ \vdots \\ \beta_{i,j,L}(n) \end{bmatrix}, \quad
\mathbf{h}_i = \begin{bmatrix} h_i(0) \\ h_i(1) \\ \vdots \\ h_i(N_{h_i}) \end{bmatrix}, \quad
\mathbf{f}_i = \begin{bmatrix} f_i(0) \\ f_i(1) \\ \vdots \\ f_i(N_{f_i}) \end{bmatrix}. \tag{11}$$

Then from (9), it is not difficult to verify that the vectors $\boldsymbol{\alpha}_i(n)$ and $\boldsymbol{\beta}_{i,j}(n)$ can respectively be expressed as

$$\boldsymbol{\alpha}_i(n) = \mathbf{A}_i(n) \mathbf{h}_i, \qquad \boldsymbol{\beta}_{i,j}(n) = \mathbf{B}_{i,j}(n) \mathbf{h}_j, \tag{12}$$
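where the matrices $\mathbf{A}_i(n)$ and $\mathbf{B}_{i,j}(n)$ are respectively given by

$$\begin{aligned}
\mathbf{A}_i(n) &= \begin{bmatrix}
f_i(nN_i + l_0) & f_i(nN_i + l_0 + 1) & \cdots & f_i(nN_i + l_0 + N_{h_i}) \\
f_i(nN_i + l_0 - 1) & f_i(nN_i + l_0 - 1 + 1) & \cdots & f_i(nN_i + l_0 - 1 + N_{h_i}) \\
\vdots & \vdots & \ddots & \vdots \\
f_i(nN_i + l_0 - L) & f_i(nN_i + l_0 - L + 1) & \cdots & f_i(nN_i + l_0 - L + N_{h_i})
\end{bmatrix}, \\[1ex]
\mathbf{B}_{i,j}(n) &= \begin{bmatrix}
f_i(ng_{i,j} + l_0) & f_i(ng_{i,j} + l_0 + 1) & \cdots & f_i(ng_{i,j} + l_0 + N_{h_j}) \\
f_i(ng_{i,j} + l_0 - 1) & f_i(ng_{i,j} + l_0 - 1 + 1) & \cdots & f_i(ng_{i,j} + l_0 - 1 + N_{h_j}) \\
\vdots & \vdots & \ddots & \vdots \\
f_i(ng_{i,j} + l_0 - L) & f_i(ng_{i,j} + l_0 - L + 1) & \cdots & f_i(ng_{i,j} + l_0 - L + N_{h_j})
\end{bmatrix}.
\end{aligned} \tag{13}$$

The dimensions of the matrices $\mathbf{A}_i(n)$ and $\mathbf{B}_{i,j}(n)$ are, respectively, $(L + 1) \times (N_{h_i} + 1)$ and $(L + 1) \times (N_{h_j} + 1)$. Notice that $g_{i,j} = N_i$ when $i = j$. Similarly, we can also express the vectors $\boldsymbol{\alpha}_i(n)$ and $\boldsymbol{\beta}_{i,j}(n)$, respectively, in terms of the transmitting filter $\mathbf{f}_i$ as

$$\boldsymbol{\alpha}_i(n) = \tilde{\mathbf{A}}_i(n) \mathbf{f}_i, \qquad \boldsymbol{\beta}_{i,j}(n) = \tilde{\mathbf{B}}_{i,j}(n) \mathbf{f}_i, \tag{14}$$

for some matrices $\tilde{\mathbf{A}}_i(n)$ and $\tilde{\mathbf{B}}_{i,j}(n)$. The matrices $\tilde{\mathbf{A}}_i(n)$ and $\tilde{\mathbf{B}}_{i,j}(n)$ consist of the receiving filter coefficients $h_j(n)$ and they are very similar to $\mathbf{A}_i(n)$ and $\mathbf{B}_{i,j}(n)$, respectively.

2.3. SIR-optimized receiving filters

In this section, we will design the receiving filters so that the SIR is maximized for a fixed set of transmitting filters. As the $j$th receiving filter affects only the $j$th output signal $\hat{x}_j(n)$, the receiving filters can be designed separately; the $j$th receiving filter $H_j(z)$ is optimized so that the SIR of the $j$th output signal $\hat{x}_j(n)$ is maximized. Recall from (10) that the output of the $j$th subband $\hat{x}_j(n)$ consists of three components, namely, the desired signal, the intraband interference, and the cross-band interference. As the input signals $x_i(n)$ satisfy the uncorrelated and white property in (3), the desired signal power and intraband interference power at the $j$th output are given by

$$P_{\mathrm{sig}}(j) = \mathcal{E}_x \left| \sum_{l=0}^{L} \alpha_{j,l}(0) c(l) \right|^2, \qquad
P_{\mathrm{intra}}(j) = \mathcal{E}_x \sum_{n,\, n \neq 0} \left| \sum_{l=0}^{L} \alpha_{j,l}(n) c(l) \right|^2, \tag{15}$$

where $\mathcal{E}_x$ is the power of the input signal defined in (3). The computation of the cross-band interference power is more complicated because the sequence $[x_i(n)]_{\uparrow p_{i,j}}$ is not a WSS process. From multirate theory [20], we know that $[x_i(n)]_{\uparrow p_{i,j}}$ is cyclo wide sense stationary with period $p_{i,j}$, or CWSS($p_{i,j}$). Letting $u(n) = [x_i(n)]_{\uparrow p_{i,j}}$, its autocorrelation coefficients satisfy $E[u(n) u^*(n - k)] = E[u(n + p_{i,j}) u^*(n + p_{i,j} - k)]$. Since $p_{i,j}$ and $p_{j,i}$ are coprime, the quantity

$$\left[ \sum_{l=0}^{L} c(l) \beta_{i,j,l}(n) * \left[ x_i(n) \right]_{\uparrow p_{i,j}} \right]_{\downarrow p_{j,i}} \tag{16}$$

is also CWSS($p_{i,j}$) [20]. From (10), we see that the cross-band interference consists of $(M - 1)$ CWSS sequences with period $p_{i,j}$ for $i = 0, \ldots, j - 1, j + 1, \ldots, M - 1$. Let $P_j$ be the least common multiple of the integers $\{p_{0,j}, \ldots, p_{j-1,j}, p_{j+1,j}, \ldots, p_{M-1,j}\}$. Then the cross-band interference is a CWSS($P_j$) random process. We can compute the average cross-band interference power over one period $P_j$ and it is given by

$$P_{\mathrm{cross}}(j) = \mathcal{E}_x \sum_{\substack{i, n \\ i \neq j}} \frac{1}{p_{i,j}} \left| \sum_{l=0}^{L} \beta_{i,j,l}(n) c(l) \right|^2. \tag{17}$$

Next we will express the three quantities $P_{\mathrm{sig}}(j)$, $P_{\mathrm{intra}}(j)$, and $P_{\mathrm{cross}}(j)$ in terms of the receiving filter coefficients $h_j(n)$. To do this, let us define the $(L + 1) \times 1$ vector

$$\mathbf{c} = \begin{bmatrix} c(0) & c(1) & \cdots & c(L) \end{bmatrix}^T. \tag{18}$$

Then from (12), we can write

$$\mathcal{E}_x \left| \sum_{l=0}^{L} \alpha_{j,l}(0) c(l) \right|^2 = \mathcal{E}_x \left| \mathbf{c}^T \mathbf{A}_j(0) \mathbf{h}_j \right|^2 = \mathcal{E}_x \mathbf{h}_j^\dagger \mathbf{A}_j^\dagger(0) \mathbf{c}^* \mathbf{c}^T \mathbf{A}_j(0) \mathbf{h}_j. \tag{19}$$

Similarly, using the expressions of $\boldsymbol{\alpha}_i(n)$ and $\boldsymbol{\beta}_{i,j}(n)$ in (12), we can also write the intraband and cross-band interference powers in a quadratic form of $\mathbf{h}_j$. In summary, the three powers are given by

$$P_{\mathrm{sig}}(j) = \mathbf{h}_j^\dagger \mathbf{Q}_{\mathrm{sig},j} \mathbf{h}_j, \qquad P_{\mathrm{intra}}(j) = \mathbf{h}_j^\dagger \mathbf{Q}_{\mathrm{intra},j} \mathbf{h}_j, \qquad P_{\mathrm{cross}}(j) = \mathbf{h}_j^\dagger \mathbf{Q}_{\mathrm{cross},j} \mathbf{h}_j, \tag{20}$$

where the matrices $\mathbf{Q}_{\mathrm{sig},j}$, $\mathbf{Q}_{\mathrm{intra},j}$, and $\mathbf{Q}_{\mathrm{cross},j}$ are, respectively, given by

$$\begin{aligned}
\mathbf{Q}_{\mathrm{sig},j} &= \mathcal{E}_x \mathbf{A}_j^\dagger(0) \mathbf{c}^* \mathbf{c}^T \mathbf{A}_j(0), \\
\mathbf{Q}_{\mathrm{intra},j} &= \mathcal{E}_x \sum_{n,\, n \neq 0} \mathbf{A}_j^\dagger(n) \mathbf{c}^* \mathbf{c}^T \mathbf{A}_j(n), \\
\mathbf{Q}_{\mathrm{cross},j} &= \mathcal{E}_x \sum_{\substack{i, n \\ i \neq j}} \frac{1}{p_{i,j}} \mathbf{B}_{i,j}^\dagger(n) \mathbf{c}^* \mathbf{c}^T \mathbf{B}_{i,j}(n).
\end{aligned} \tag{21}$$

As $x_i(n)$ and $x_j(n)$ are uncorrelated for $i \neq j$, the total ISI power at the $j$th output is $P_{\mathrm{isi}}(j) = P_{\mathrm{intra}}(j) + P_{\mathrm{cross}}(j)$. Thus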
the SIR of the $j$th output is given by

$$\gamma_j = \frac{P_{\mathrm{sig}}(j)}{P_{\mathrm{isi}}(j)} = \frac{\mathbf{h}_j^\dagger \mathbf{Q}_{\mathrm{sig},j} \mathbf{h}_j}{\mathbf{h}_j^\dagger \mathbf{Q}_{\mathrm{isi},j} \mathbf{h}_j}, \tag{22}$$

where $\mathbf{Q}_{\mathrm{isi},j} = \mathbf{Q}_{\mathrm{intra},j} + \mathbf{Q}_{\mathrm{cross},j}$. Notice that both $\mathbf{Q}_{\mathrm{sig},j}$ and $\mathbf{Q}_{\mathrm{isi},j}$ are positive semidefinite matrices. Furthermore, except for some very rare cases, the matrix $\mathbf{Q}_{\mathrm{isi},j}$ is positive definite. From the above expression, we see that the SIR is expressed as a Rayleigh-Ritz ratio of $\mathbf{h}_j$. The optimal unit-norm vector $\mathbf{h}_j$ that maximizes $\gamma_j$ is well known [21]. Let $\mathbf{Q}_{\mathrm{isi},j}^{1/2}$ be the positive definite matrix such that $\mathbf{Q}_{\mathrm{isi},j} = \mathbf{Q}_{\mathrm{isi},j}^{1/2} \mathbf{Q}_{\mathrm{isi},j}^{1/2}$. The optimal $\mathbf{h}_j$ is given by

$$\mathbf{h}_{j,\mathrm{opt}} = \mathbf{Q}_{\mathrm{isi},j}^{-1/2} \arg\max_{\mathbf{v} \neq \mathbf{0}} \frac{\mathbf{v}^\dagger \mathbf{Q}_{\mathrm{isi},j}^{-1/2} \mathbf{Q}_{\mathrm{sig},j} \mathbf{Q}_{\mathrm{isi},j}^{-1/2} \mathbf{v}}{\mathbf{v}^\dagger \mathbf{v}}. \tag{23}$$

The optimal vector $\mathbf{v}$ is the eigenvector corresponding to the largest eigenvalue of the positive definite matrix $\mathbf{Q}_{\mathrm{isi},j}^{-1/2} \mathbf{Q}_{\mathrm{sig},j} \mathbf{Q}_{\mathrm{isi},j}^{-1/2}$.

2.4. SIR-optimized transmitting filters

In this section, we consider the SIR optimization of the transmitting filters $f_i(n)$ given a fixed set of the receiving filters. As the $i$th transmitting filter $f_i(n)$ affects only the $i$th input signal $x_i(n)$, we can consider the SIR due to the $i$th transmitted signal $x_i(n)$. Consider the transmission scenario when only the $i$th subband is transmitting, that is, $x_j(n) = 0$ for $j \neq i$. Then from (10), the outputs of the receiver are given by

$$\begin{aligned}
\hat{x}_i(n) &= \left( \sum_{l=0}^{L} \alpha_{i,l}(0) c(l) \right) x_i(n) + \sum_{l=0}^{L} c(l) \left[ \alpha_{i,l}(n) - \alpha_{i,l}(0) \delta(n) \right] * x_i(n), \\
\hat{x}_j(n) &= \left[ \sum_{l=0}^{L} c(l) \beta_{i,j,l}(n) * \left[ x_i(n) \right]_{\uparrow p_{i,j}} \right]_{\downarrow p_{j,i}}, \quad \text{for } i \neq j.
\end{aligned} \tag{24}$$

Note that the first and second terms on the right-hand side of the first equation in (24) are respectively the desired signal and the intraband interference due to the $i$th transmitted signal $x_i(n)$. On the other hand, $\hat{x}_j(n)$ represents the cross-band interference due to $x_i(n)$. By following a procedure similar to that in the previous section, we can compute the signal power and interference powers and express the SIR as a Rayleigh-Ritz ratio as follows:

$$\gamma_i = \frac{\mathbf{f}_i^\dagger \tilde{\mathbf{Q}}_{\mathrm{sig},i} \mathbf{f}_i}{\mathbf{f}_i^\dagger \tilde{\mathbf{Q}}_{\mathrm{isi},i} \mathbf{f}_i}, \tag{25}$$

where the matrices $\tilde{\mathbf{Q}}_{\mathrm{sig},i}$ and $\tilde{\mathbf{Q}}_{\mathrm{isi},i}$ are positive semidefinite matrices that have a form very similar to $\mathbf{Q}_{\mathrm{sig},i}$ and $\mathbf{Q}_{\mathrm{isi},i}$, respectively. Hence the optimal unit-norm $\mathbf{f}_i$ that maximizes the SIR can be obtained by solving the above Rayleigh-Ritz ratio.

2.5. SIR optimized for unknown channels

In many applications, the exact channel impulse response may not be available, and we may have only the statistics of the transmission channels. The above design method can easily be modified to obtain transceivers that are optimized for unknown channels. Assume that the vector containing the channel impulse response, $\mathbf{c}$, is zero-mean with autocorrelation matrix

$$\mathbf{R}_c = E\left[ \mathbf{c} \mathbf{c}^\dagger \right]. \tag{26}$$

In this case, the exact channel impulse response is not known. From previous discussions, we know that the signal power and interference powers at the output of the $j$th subband are respectively given by (20) and (21). When the channel is not known, we can compute the average signal power and interference powers by taking the expectation with respect to the channel impulse response $c(l)$. It is not difficult to verify that the average SIR can also be expressed as a Rayleigh-Ritz ratio of the filter coefficients $\mathbf{h}_i$.

Similarly, given the receiving filters, we can modify the optimization of transmitting filters $\mathbf{f}_i$ for the case of unknown channels by using the average SIR. In many situations, we do not know the statistics of the channel. In this case, it is often assumed that the channel impulse response coefficients are independent and identically distributed (i.i.d.). The autocorrelation matrix of the channel impulse response becomes $\mathbf{R}_c = \sigma_c^2 \mathbf{I}$.

3. AN ITERATIVE ALGORITHM FOR SIR OPTIMIZATION WITH FREQUENCY CRITERIA

From the previous discussions, we know that when the transmitting filters are given, we can obtain optimum receiving filters so that the SIR is maximized. Conversely, given the receiving filters we can design the transmitting filters that maximize the SIR. One can therefore alternatingly optimize the receiving and transmitting filters so that the SIR is maximized. Because in each iteration, the solution obtained in the previous iteration is also a candidate, the SIR cannot decrease¹ when the number of iterations increases. As we will see in the numerical examples, the increase in SIR is substantial as the number of iterations increases. However, because no constraint is applied on the filters, their frequency responses will often degrade significantly as the number of iterations increases.

¹ In general, it is not guaranteed that the SIR is monotonically increasing.

To solve this problem, we can incorporate the filter stopband energy in the optimization. Let us consider the design of the receiving filters $\mathbf{h}_j$. The stopband energy of the $j$th receiving filter $H_j(z)$ is given by

$$P_{\mathrm{stop}}(j) = \frac{1}{2\pi} \int_{\Omega_{h,j}} \left| H_j(e^{j\omega}) \right|^2 d\omega, \tag{27}$$

where $\Omega_{h,j}$ is the stopband region of $H_j(z)$. Define the vector $\mathbf{e}_N(z) = [1 \;\; z \;\; \cdots \;\; z^N]^T$. Then the weighted stopband
energy can be expressed as

$$P_{\mathrm{stop}}(j) = \mathbf{h}_j^\dagger \mathbf{Q}_{\mathrm{stop},j} \mathbf{h}_j, \tag{28}$$

where the matrix $\mathbf{Q}_{\mathrm{stop},j}$ is given by

$$\mathbf{Q}_{\mathrm{stop},j} = \frac{1}{2\pi} \int_{\Omega_{h,j}} \mathbf{e}_{N_{h_j}}(e^{j\omega}) \, \mathbf{e}_{N_{h_j}}^\dagger(e^{j\omega}) \, d\omega. \tag{29}$$

The new objective function that incorporates the frequency response is

$$\eta_j = \frac{\mathbf{h}_j^\dagger \mathbf{Q}_{\mathrm{sig},j} \mathbf{h}_j}{\mathbf{h}_j^\dagger \left[ \mathbf{Q}_{\mathrm{isi},j} + c_{h,j} \mathbf{Q}_{\mathrm{stop},j} \right] \mathbf{h}_j}, \tag{30}$$

where $c_{h,j} \ge 0$ is a weight that adjusts the relative importance of the frequency responses. When $c_{h,j} = 0$, the new objective function $\eta_j$ reduces to the SIR expression $\gamma_j$ in (22) and no frequency criteria are applied. One can see that $\eta_j$ is also a Rayleigh-Ritz ratio of $\mathbf{h}_j$. We can choose $\mathbf{h}_j$ to be the unit-norm vector that maximizes this ratio. Similarly, one can incorporate the stopband energy into the optimization of the transmitting filters $f_i(n)$. One will get a new objective function

$$\eta_i = \frac{\mathbf{f}_i^\dagger \tilde{\mathbf{Q}}_{\mathrm{sig},i} \mathbf{f}_i}{\mathbf{f}_i^\dagger \left[ \tilde{\mathbf{Q}}_{\mathrm{isi},i} + c_{f,i} \tilde{\mathbf{Q}}_{\mathrm{stop},i} \right] \mathbf{f}_i}, \tag{31}$$

where $\mathbf{f}_i^\dagger \tilde{\mathbf{Q}}_{\mathrm{stop},i} \mathbf{f}_i$ is the term corresponding to the stopband energy of the filter $f_i(n)$. The optimal $\mathbf{f}_i$ is the unit-norm vector that maximizes $\eta_i$.

Note that in the new objective function, the passband responses of the filters are not included. For unit-norm filters, when the stopband energy is small, the passband energy will be close to one. In transceiver designs, nearly zero ISI property can be guaranteed by a high SIR and the flatness of passband response is not needed.

The iterative algorithm for transceiver optimization is summarized as follows.

(1) Select a set of the receiving filters $H_i^{(0)}(z)$ with good frequency responses.

For $k \ge 1$, repeat the following steps.

(2) Given the receiving filters $H_i^{(k-1)}(z)$, optimize $F_j^{(k)}(z)$ so that $\eta_j$ is maximized for $0 \le j \le M - 1$.
(3) Given the transmitting filter $F_j^{(k)}(z)$, optimize $H_i^{(k)}(z)$ so that $\eta_i$ is maximized for $0 \le i \le M - 1$.
(4) Stop if the SIR is higher than the desired value or if it reaches the maximum number of iterations; otherwise, $k = k + 1$ and return to step (2).

4. NONUNIFORM FILTER BANK TRANSCEIVERS WITH RATIONAL SAMPLING FACTORS

In this section, we generalize the design method to the case of rational sampling factors. We will first employ the technique in [15] to convert the transceiver with rational sampling factors into an equivalent transceiver with integer sampling factor. Then the optimization method developed in the previous sections can be adopted. The block diagram of a nonuniform filter bank transceiver with rational sampling factors is shown in Figure 2. At the transmitter, the input signal $x_i(n)$ goes through an $N_i$-fold expander and an $M_i$-fold decimator. The bandwidth of the $i$th subband is proportional to the ratio $M_i / N_i$. Without loss of generality, we assume that the integers $M_i$ and $N_i$ are coprime. If they are not coprime, then it is known [20] that the $i$th subband can be replaced with an equivalent system with coprime $M_i$ and $N_i$, and such an equivalent system will have a lower complexity. Furthermore, to ensure symbol recovery, we assume

$$\sum_{i=0}^{M-1} \frac{M_i}{N_i} \le 1. \tag{32}$$

Figure 2: Nonuniform filter bank transceiver with rational sampling factors.

Let us decompose the $k$th transmitting and receiving filters using the polyphase representation as

$$H_k(z) = \sum_{\ell=0}^{M_k - 1} z^{\ell} E_{k,\ell}\left( z^{M_k} \right), \qquad
F_k(z) = \sum_{\ell=0}^{M_k - 1} z^{-\ell} R_{k,\ell}\left( z^{M_k} \right). \tag{33}$$

Note that no coefficient of $H_k(z)$ or $F_k(z)$ appears in more
Figure 3: (a) Equivalent circuit of the kth subband in the transmitting bank, (b) equivalent circuit of the kth subband in the receiving bank.
than one $E_{k,\ell}(z)$ or $R_{k,\ell}(z)$. As $M_k$ and $N_k$ are coprime, we can always find positive integers $a$ and $b$ such that $aM_k - bN_k = 1$. Let $a_{k,1}$ and $b_{k,1}$ be the smallest integers that satisfy this condition. Define

\[ a_{k,\ell} = \ell\, a_{k,1}, \qquad b_{k,\ell} = \ell\, b_{k,1}. \tag{34} \]

Using the polyphase representation and the noble identities [20], we can redraw the $k$th subbands of the transmitter and receiver, respectively, as those shown in Figures 3(a) and 3(b). Moreover, since $M_k$ and $b_{k,1}$ are coprime, we have²

\[ \left\{ [b_{k,1}]_{M_k}, [b_{k,2}]_{M_k}, \ldots, [b_{k,M_k-1}]_{M_k} \right\} = \left\{ 1, 2, \ldots, M_k - 1 \right\}, \tag{35} \]

where $[p]_q$ represents $p$ modulo $q$, which is a number between 0 and $q - 1$. Thus, in Figure 3(a), the signal $x_k(n)$ is split into its polyphase components $\{x_{k,0}(n), x_{k,1}(n), \ldots, x_{k,M_k-1}(n)\}$. Similarly, $\{\hat{x}_{k,0}(n), \hat{x}_{k,1}(n), \ldots, \hat{x}_{k,M_k-1}(n)\}$ in Figure 3(b) are the polyphase components of the signal $\hat{x}_k(n)$. Using these results, we can redraw Figure 2 as Figure 4. The transceiver in Figure 4 has the same structure as that in Figure 1. Since input signals $x_{i,j}(n)$ are also uncorrelated, we can apply the design method developed in previous sections to obtain the optimal $R_{k,\ell}(z)$ and $E_{k,\ell}(z)$. The filters $F_k(z)$ and $H_k(z)$ can be obtained from (33).

² See homework [20, Problem 4.9].

5. SIMULATIONS

In this section, we provide two examples to show the performance of nonuniform filter bank transceivers designed by using the proposed method. It is emphasized that in transceiver designs, the nearly zero ISI property is guaranteed by a high SIR value and passband flatness is not needed. We assume that the channel noise $v(n)$ is AWGN in the following examples.

Example 1. In this example, we design nonuniform filter bank transceivers with integer sampling factors. The number of subbands is $M = 4$ and the sampling factors are $\{N_0, N_1, N_2, N_3\} = \{2, 4, 8, 8\}$. Four-tap channels are used here. A total of 100 randomly generated iid channels are employed in the simulation. We assume that channel impulse responses are known. All the transmitting and receiving filters are of
Figure 4: Equivalent circuit of the nonuniform filter bank transceiver with rational sampling factors in Figure 2.
order 56. We consider the iterative algorithm both with and without frequency criteria. For the case with frequency criteria (indicated as fc), the weights for the stopband energy are chosen as $c_{f,0} = c_{f,1} = c_{h,0} = c_{h,1} = 0.05$ and $c_{f,2} = c_{f,3} = c_{h,2} = c_{h,3} = 0.4$. We plot the SIR averaged over the 100 random channels versus the number of iterations and the results are shown in Figure 5. From the figure, we see that the average SIR increases with the number of iterations. When no frequency criteria are applied, the average SIR increases by about 15 dB and it can be as high as 56 dB after 400 iterations. Even when the frequency criteria are applied, the average SIR increases by more than 8 dB. Thus the incorporation of frequency criteria results in a loss of SIR of about 7 dB. To show the improvement in frequency response when the frequency criteria are applied, we plot the magnitude responses of the transceiver optimized for one particular channel, Channel A, after the 200th iteration. The impulse response of Channel A is given by

\[ \text{Channel A} = \left[\, 0.2218 \;\; -0.475 \;\; 0.3906 \;\; 0.2845 \,\right]. \tag{36} \]

Figure 5: SIR versus the number of iterations. [Plot: SIR (dB) against the number of iterations, 0 to 400; curves: 100 random channels, and 100 random channels with frequency criteria.]
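The Rayleigh-Ritz maximization in (30) and (31), which each iteration of the algorithm performs for every filter, amounts to finding the largest generalized eigenvalue of a matrix pair. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the 2-by-2 real symmetric matrices are toy stand-ins for $Q_{\mathrm{sig}}$, $Q_{\mathrm{isi}}$, and $Q_{\mathrm{stop}}$ (which in the paper are Hermitian and depend on the channel and on the other filters), and plain power iteration on $B^{-1}A$ is used in place of a library eigensolver. It assumes the denominator matrix is positive definite.

```python
import math

def matvec(A, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    """Inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

def solve(B, y):
    """Solve B x = y by Gaussian elimination with partial pivoting."""
    n = len(B)
    aug = [list(B[i]) + [y[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def weighted_denominator(q_isi, q_stop, c):
    """Form Q_isi + c * Q_stop, the denominator matrix of (30)."""
    return [[a + c * b for a, b in zip(ra, rb)]
            for ra, rb in zip(q_isi, q_stop)]

def maximize_rayleigh_ratio(q_sig, q_den, iterations=500):
    """Return a unit-norm h maximizing (h' q_sig h) / (h' q_den h).

    Power iteration on q_den^{-1} q_sig; assumes real symmetric inputs,
    q_den positive definite, and a unique largest eigenvalue.
    """
    n = len(q_sig)
    h = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = solve(q_den, matvec(q_sig, h))
        norm = math.sqrt(dot(w, w))
        h = [x / norm for x in w]
    eta = dot(h, matvec(q_sig, h)) / dot(h, matvec(q_den, h))
    return h, eta

# Toy matrices (assumed values, for illustration only).
q_sig = [[2.0, 1.0], [1.0, 2.0]]
q_isi = [[1.0, 0.0], [0.0, 0.5]]
q_stop = [[1.0, 0.0], [0.0, 0.5]]
q_den = weighted_denominator(q_isi, q_stop, c=1.0)
h, eta = maximize_rayleigh_ratio(q_sig, q_den)
print(h, eta)
```

Setting `c=0.0` in `weighted_denominator` drops the stopband term, which corresponds to optimizing the SIR alone.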
Figure 6: Magnitude responses of the transmitting filters (no frequency criteria).
Figure 7: Magnitude responses of the receiving filters (no frequency criteria).
Figure 8: Magnitude responses of the transmitting filters (with frequency criteria).
Figure 9: Magnitude responses of the receiving filters (with frequency criteria).
[Plots for Figures 6 to 9: magnitude response (dB) against normalized frequency ω/π.]
The results are shown in Figures 6, 7, 8, and 9. Comparing the results in Figures 6 and 7 with those in Figures 8 and 9, we can see that the incorporation of the frequency criteria improves the frequency characteristics of the transceiver significantly. The tradeoff is a loss in SIR of around 7 dB.

Example 2. In this example, we design two-band nonuniform filter bank transceivers with rational sampling factors, where $N_0 = N_1 = 5$, $M_0 = 2$, and $M_1 = 3$. A total of 100 iid channels with 4 taps are randomly generated. The filter orders are $N_{f_0} = N_{h_0} = 58$ and $N_{f_1} = N_{h_1} = 87$. The transmitting filters $F_0(z)$ and $F_1(z)$ are, respectively, initialized as good lowpass and highpass filters with a passband bandwidth of $2\pi/5$. We consider 3 cases: (i) optimization without frequency criteria (indicated by $c = 0$); (ii) optimization with frequency criteria, where the weights on the stopband energy are $c_{f,0} = c_{f,1} = c_{h,0} = c_{h,1} = c = 0.1$ (indicated by $c = 0.1$); (iii) optimization with frequency criteria, where the weights on the stopband energy are $c_{f,0} = c_{f,1} = c_{h,0} = c_{h,1} = c = 10$ (indicated by $c = 10$). The SIR averaged over 100 random channels versus the number of iterations is given in Figure 10 for the three different values of $c$. From the figure, we see that the SIR is smaller when we impose frequency criteria. The heavier the frequency criteria, the lower the SIR. Comparing the cases of $c = 10$ and $c = 0$, the loss of SIR (after 200 iterations) is around 6 dB. Even with the frequency weighting of $c = 10$, the SIR can be as high as 47 dB, a value that is good enough for many applications. To demonstrate the effect of adding frequency criteria, we plot the filter magnitude responses for
Figure 10: SIR versus the number of iterations. [Plot: SIR (dB) against the number of iterations, 0 to 200; curves: c = 0, c = 0.1, c = 10.]
Figure 11: Magnitude response of F0(z). [Curves: c = 0, c = 0.1, c = 10, Initial.]
Figure 12: Magnitude response of F1(z). [Curves: c = 0, c = 0.1, c = 10, Initial.]
Figure 13: Magnitude response of H0(z). [Curves: c = 0, c = 0.1, c = 10.]
one particular channel, Channel B, after 200 iterations. The impulse response of Channel B is

\[ \text{Channel B} = \left[\, -0.44270 \;\; -0.42492 \;\; 0.39377 \;\; 0.34971 \,\right]. \tag{37} \]

The magnitude responses of the initial filters are given in Figures 11 and 12. The results are shown in Figures 11, 12, 13, and 14 (for the purpose of comparison, we also plot the initial $|F_i(e^{j\omega})|$ in the same figure). From the figures, it is clear that without any frequency weighting, the magnitude responses degrade significantly after 200 iterations and the frequency weighting can greatly enhance the selectivity of filters.

6. CONCLUSION

In this paper, we propose a method for designing nonuniform filter bank transceivers for frequency selective channels. By expressing the SIR as Rayleigh-Ritz ratios of transmitting and receiving filters respectively, we can iteratively optimize the filters so that SIR is maximized. Moreover, a new cost function that incorporates the filter frequency response is introduced. Iterative optimization algorithm based on the new
cost function yields transceivers with high SIR as well as good frequency responses.

Figure 14: Magnitude response of H1(z). [Plot: magnitude response against normalized frequency ω/π; curves: c = 0, c = 0.1, c = 10.]

APPENDIX

In the following, we will prove that the transceiver in Figure 1 is ISI-free if and only if the transfer function $T_{i,j}(z)$ in (7) satisfies (8). Suppose that $T_{i,j}(z) \ne 0$ for some $i \ne j$. Let $t_{i,j}(k)$ be one of the nonzero coefficients. So $T_{i,j}(z)$ contains the term $t_{i,j}(k) z^{-k}$. Note that the integers $p_{i,j}$ and $p_{j,i}$ are coprime. Thus there exist integers $a$ and $b$ such that $a p_{i,j} + b p_{j,i} = k$. As the inputs $X_i(z)$ are arbitrary, let us take $X_i(z) = z^{a}$ and $X_l(z) = 0$ for all $l \ne i$. From (6), one can find that $\hat{X}_j(z)$ will contain the term $t_{i,j}(k) z^{-b}$. That means, the $i$th transmitted signal is causing interference to the $j$th output of the receiver. Therefore we should have $T_{i,j}(z) = 0$ for all $i \ne j$. For the case of $i = j$, it is clear from (6) that the transceiver is ISI-free if and only if $T_{j,j}(z) = G_j$ for some constant $G_j$.

ACKNOWLEDGMENT

This work was supported in part by the National Science Council of Taiwan, under Contracts NSC94-2752-E-002-006-PAE, NSC94-2213-E-002-075, and NSC94-2213-E-009-038.

REFERENCES

[1] R. Van Nee and R. Prasad, OFDM Wireless Multimedia Communication, Artech House, Boston, Mass, USA, 2000.
[2] T. H. Luo, C. H. Liu, S.-M. Phoong, and Y.-P. Lin, “Design of channel-resilient DFT bank transceivers,” in Proceedings of 13th European Signal Processing Conference (EUSIPCO ’05), Antalya, Turkey, September 2005.
[3] M. Vetterli, “Perfect transmultiplexers,” in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP ’86), vol. 11, pp. 2567–2570, Tokyo, Japan, April 1986.
[4] S. D. Sandberg and M. A. Tzannes, “Overlapped discrete multitone modulation for high speed copper wire communications,” IEEE Journal on Selected Areas in Communications, vol. 13, no. 9, pp. 1571–1585, 1995.
[5] B.-S. Chen, C.-L. Tsai, and Y.-F. Chen, “Mixed H2/H∞ filtering design in multirate transmultiplexer systems: LMI approach,” IEEE Transactions on Signal Processing, vol. 49, no. 11, pp. 2693–2701, 2001.
[6] P. Martin-Martin, F. Cruz-Roldan, and T. Saramaki, “Optimized transmultiplexers for multirate systems,” in Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS ’05), vol. 2, pp. 1106–1109, Kobe, Japan, May 2005.
[7] A. D. Rizos, J. G. Proakis, and T. Q. Nguyen, “Comparison of DFT and cosine modulated filter banks in multicarrier modulation,” in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM ’94), vol. 2, pp. 687–691, San Francisco, Calif, USA, November-December 1994.
[8] S. Govardhanagiri, T. Karp, P. Heller, and T. Nguyen, “Performance analysis of multicarrier modulation systems using cosine modulated filter banks,” in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP ’99), vol. 3, pp. 1405–1408, Phoenix, Ariz, USA, March 1999.
[9] T. Ihalainen, T. H. Stitz, and M. Renfors, “Efficient per-carrier channel equalizer for filter bank based multicarrier systems,” in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP ’05), Philadelphia, Pa, USA, March 2005.
[10] S.-M. Phoong, Y. Chang, and C.-Y. Chen, “DFT-modulated filterbank transceivers for multipath fading channels,” IEEE Transactions on Signal Processing, vol. 53, no. 1, pp. 182–192, 2005.
[11] S. Dasgupta and A. Pandharipande, “Optimum multiflow biorthogonal DMT with unequal subchannel assignment,” IEEE Transactions on Signal Processing, vol. 53, no. 9, pp. 3572–3582, 2005.
[12] T. Liu and T. Chen, “Design of multichannel nonuniform transmultiplexers using general building blocks,” IEEE Transactions on Signal Processing, vol. 49, no. 1, pp. 91–99, 2001.
[13] P.-Q. Hoang and P. P. Vaidyanathan, “Non-uniform multirate filter banks: theory and design,” in Proceedings of IEEE International Symposium on Circuits and Systems, vol. 1, pp. 371–374, Portland, Ore, USA, May 1989.
[14] S. Akkarakaran and P. P. Vaidyanathan, “New results and open problems on nonuniform filter-banks,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’99), vol. 3, pp. 1501–1504, Phoenix, Ariz, USA, March 1999.
[15] J. Kovacevic and M. Vetterli, “Perfect reconstruction filter banks with rational sampling factors,” IEEE Transactions on Signal Processing, vol. 41, no. 6, pp. 2047–2066, 1993.
[16] J. Princen, “Design of nonuniform modulated filterbanks,” IEEE Transactions on Signal Processing, vol. 43, no. 11, pp. 2550–2560, 1995.
[17] K. Nayebi, T. P. Barnwell III, and M. J. T. Smith, “Nonuniform filter banks: a reconstruction and design theory,” IEEE Transactions on Signal Processing, vol. 41, no. 3, pp. 1114–1127, 1993.
[18] F. Argenti, B. Brogelli, and E. Del Re, “Design of pseudo-QMF banks with rational sampling factors using several prototype
filters,” IEEE Transactions on Signal Processing, vol. 46, no. 6, pp. 1709–1715, 1998.
[19] C. Y.-F. Ho, B. W.-K. Ling, Y.-Q. Liu, P. K.-S. Tam, and K.-L. Teo, “Optimal design of nonuniform FIR transmultiplexer using semi-infinite programming,” IEEE Transactions on Signal Processing, vol. 53, no. 7, pp. 2598–2603, 2005.
[20] P. P. Vaidyanathan, Multirate Systems and Filter Banks, Prentice Hall, Englewood Cliffs, NJ, USA, 1993.
[21] R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, UK, 1985.

Han-Ting Chiang was born in Taipei, Taiwan, in 1980. He received the B.S. degree in electrical engineering from the National Tsing Hua University, Hsinchu, Taiwan, and the M.S. degree in electrical engineering from the National Taiwan University, Taipei, Taiwan, in 2003 and 2005, respectively. His research interests include signal processing for communications, wireless communications, and multimedia signal processing.

See-May Phoong was born in Johor, Malaysia, in 1968. He received the B.S. degree in electrical engineering from the National Taiwan University (NTU), Taipei, Taiwan, in 1991 and M.S. and Ph.D. degrees in electrical engineering from the California Institute of Technology (Caltech), Pasadena, Calif, in 1992 and 1996, respectively. He was with the faculty of the Department of Electronic and Electrical Engineering, Nanyang Technological University, Singapore, from September 1996 to September 1997. In September 1997, he joined the Graduate Institute of Communication Engineering and the Department of Electrical Engineering, NTU, as an Assistant Professor, and since August 2006, he has been a Professor. He is currently an Associate Editor for the IEEE Transactions on Circuits and Systems I. He has previously served as an Associate Editor for Transactions on Circuits and Systems II: Analog and Digital Signal Processing (January 2002–December 2003) and IEEE Signal Processing Letters (March 2002–February 2005). His interests include multirate signal processing, filter banks and their application to communications. He received the Charles H. Wilts Prize (1997) for outstanding independent research in electrical engineering at Caltech. He was also a recipient of the Chinese Institute of Electrical Engineering's Outstanding Youth Electrical Engineer Award (2005).

Yuan-Pei Lin was born in Taipei, Taiwan, in 1970. She received the B.S. degree in control engineering from the National Chiao-Tung University, Taiwan, in 1992, and the M.S. degree and the Ph.D. degree, both in electrical engineering from California Institute of Technology, in 1993 and 1997, respectively. She joined the Department of Electrical and Control Engineering of National Chiao-Tung University, Taiwan, in 1997. Her research interests include digital signal processing, multirate filter banks, and signal processing for digital communication, particularly the area of multicarrier transmission. She is a recipient of the 2004 Ta-You Wu Memorial Award. She served as an Associate Editor for IEEE Transactions on Signal Processing (2002–2006). She is currently an Associate Editor for IEEE Transactions on Circuits and Systems II, EURASIP Journal on Advances in Signal Processing, and Multidimensional Systems and Signal Processing, Academic Press. She is also a Distinguished Lecturer of the IEEE Circuits and Systems Society for 2006-2007.
doi:10.1155/2007/63714
Research Article
Flexible Frequency-Band Reallocation Networks Using
Variable Oversampled Complex-Modulated Filter Banks
Håkan Johansson and Per Löwenborg
Electronics Systems, Department of Electrical Engineering, Linköping University, 58183 Linköping, Sweden
Received 22 December 2005; Revised 17 May 2006; Accepted 16 July 2006
A crucial issue in the next-generation satellite-based communication systems is the satellite on-board reallocation of information, which requires digital flexible frequency-band reallocation (FBR) networks. This paper introduces a new class of flexible FBR networks based on variable oversampled complex-modulated filter banks (FBs). The new class can outperform the previously existing ones when all the aspects (flexibility, low complexity and inherent parallelism, near-perfect frequency-band reallocation, and simplicity) are considered simultaneously.

Copyright © 2007 H. Johansson and P. Löwenborg. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. INTRODUCTION

The future society foresees globally interconnected digital communication systems offering multimedia services, information on demand, and delivery of information (data) at high data rates and low cost and with high performance. Terrestrial networks could in principle meet the requirements on communication capacity due to the practically unlimited bandwidth provided by fiber optic cables, but this capacity is rarely available today. A large investment is required to bridge the distance between the local exchange and the customer. It is therefore internationally recognized that satellite systems will play an important complementary role in providing the global coverage required for both fixed and mobile communications [1–3]. However, to meet the requirements of the communication systems of tomorrow, it is imperative to develop a new generation of satellite systems, payload architectures, ground technologies, and techniques combining flexibility with cost efficiency. It is envisaged that the improvements required as to the capacity as well as complexity fall in the range of one and two orders of magnitude [1].

The European Space Agency (ESA) outlines three major standard architectures for future broadband systems [1]. Two of these are the distributed access network and professional user network which are to provide high-capacity point-to-point and multicast services for ubiquitous Internet access. The satellites are to communicate with user units via multiple spot beams. In order to use the limited available frequency spectrum efficiently, the satellite on-board signal processing must support frequency-band reusage among the beams and also flexibility in bandwidth and transmitted power allocated to each user. Further, dynamic frequency allocation is desired for covering different service types requiring different data rates and bandwidths. An important issue in the next-generation satellite-based communication system is therefore the on-board reallocation of information. In technical terms, this calls for digital multi-input multi-output (MIMO) flexible frequency-band reallocation (FBR) networks (frequency-band reallocation is also referred to as frequency multiplexing and demultiplexing), which thus are critical components. Figure 1 illustrates the principle of FBR.

The following main requirements on the next-generation flexible FBR networks are identified.

Flexibility

Frequency bands of different and variable bandwidths must be handled.

Low complexity and inherent parallelism

The implementation complexity must be low. Further, the network (algorithm) itself should not impose restrictions
Figure 1: Illustration of frequency-band reallocation in the case of two input signals, four output signals, and six users. In practice, one must also include frequency guard bands between the subbands in order to make the network realizable (see Section 2). [Diagram: an FBR network with inputs In 1 and In 2 at input data rate 1/Tin samples/second and outputs Out 1 to Out 4 at output data rate 1/Tout samples/second; the spectra show user subbands 1 to 6 before and after reallocation.]
on the feasible throughput. The implementation technology available should be the limiting factor. Meeting these requirements, high-throughput/low-power implementations can be obtained.

Near-perfect frequency-band reallocation

Near-perfect FBR means that each subband can be shifted to the new positions with small errors. By using an FBR network that is able to approximate perfect FBR as closely as desired, the degradation of the overall system performance [typically measured in terms of bit-error-rate (BER)] due to these networks can be made as small as desired.

Simplicity

Simplicity means that the FBR network should be easily analyzed, designed, and implemented. Although this may not be strictly needed in order to arrive at a high-performance processor, it is naturally advantageous to keep everything as simple as possible.

1.1. Contribution of the paper and relation to previous work

The contribution of this paper is the introduction of a new class of flexible FBR networks based on variable oversampled complex-modulated filter banks (FBs). Compared to the existing FBR networks [4–6], the proposed ones can (1) outperform the regular complex-modulated DFT FB-based networks in terms of flexibility since that technique is totally inflexible, (2) outperform the tree-structured FB-based networks in terms of flexibility and complexity because tree-structured FBs in our environment only offer partial flexibility (although the title of [6] indicates full flexibility) and require a substantially higher complexity than that of modulated FBs (because most of the filtering does not take place at the lowest sampling rate involved), and (3) outperform the overlap/save DFT/IDFT-based networks [4, 5] in terms of near-perfect FBR since it is not known how to achieve this with that technique. Further, both tree-structured FBs and overlap/save DFT/IDFT networks appear more complicated to analyze and design. In summary, the new technique can outperform the previously existing techniques when all the aspects (flexibility, low complexity and inherent parallelism, near-perfect FBR, and simplicity) are considered simultaneously. Thus, the technique presented here has the potential to become a standard solution for the next-generation satellite-based communications systems. It is noted that, although the proposed technique primarily targets a problem present in satellite-based communication, as outlined in [1], it is a general technique that can be used in any communication environment that requires transparent (bent-pipe) flexible reallocation of information.

It is also noted that FBs have been used before in related contexts for partial reconstruction of spectra [7, 8], which is one of the functions of FBR networks, but neither of those papers addresses the general problem formulation of flexible FBR networks that is addressed in this paper. We also wish to point out that complex modulated filter banks have been studied in many papers before (see, e.g., [9–12]) but, again, none of those papers addresses the problem dealt with in this paper. Finally, it is noted that parts of the material in this paper have been presented at a conference [13].

1.2. Paper outline

Following this introduction, Sections 2–5 are devoted to the proposed single-input single-output (SISO) networks whereas Section 6 points out the necessary modifications for obtaining the proposed MIMO networks. The reason why the main part of the paper considers the SISO case, despite the fact that a practical multicast system requires a MIMO network, is that it is beneficial to first understand and solve the SISO network case. This is because the SISO network case is simpler and a properly designed FBR SISO network can be utilized in MIMO networks. In this way, the analysis and synthesis of MIMO networks are greatly simplified.

2. FLEXIBLE FBR SISO NETWORK

The section begins with the problem formulation and then introduces the proposed flexible FBR network.
Figure 2: Granularity bands and typical input signals. [Panels: (a) Q granularity bands, each of width 2π/Q, with guard bands of width 2Δ; (b) Q = 6, q = 6, user subbands X0 to X5; (c) Q = 6, q = 3, user subbands X0 to X2; (d) Q = 6, q = 1, user subband X0. Horizontal axis: ωTin.]
2.1. Problem formulation

The problem addressed here was outlined in [1] and is based on multiple frequency time division multiple access (MF/TDMA) schemes. The input signal is divided into Q fixed granularity bands. Any user can occupy one or several of these granularity bands. The input signal thus contains an on-line variable (adjustable) number of user subbands q, where 1 ≤ q ≤ Q. In the extreme cases, q = Q and q = 1, as illustrated in Figures 2(b) and 2(d), respectively, for Q = 6. The case with q = 3 and Q = 6 is illustrated in Figure 2(c). It is stressed that q is an on-line variable and its value can thus be changed during operation by an external controller. In addition, we assume full flexibility which means that all possible subband decompositions and reallocation schemes can occur. Furthermore, guard bands (transition bands) in frequency are assumed in order to ensure the network to be realizable in practice. This is also depicted in Figure 2. Guard bands are only present between different user subbands, not within a user subband.

The function of the flexible FBR SISO network is thus three-fold: it should (1) separate the input signal into the user subbands, (2) shift the user subbands in frequency to the desired positions, and (3) combine the frequency-shifted user subbands into the output signal. In principle, this function can be implemented through a bank of on-line adjustable-bandwidth filters for the signal separation, and time-varying complex-valued multiplications (modulators) for the frequency shifts. A straightforward implementation of the adjustable-bandwidth filters and time-varying multiplications would however result in a very high implementation cost. To solve the problem in a much more efficient way, we propose a new flexible FBR network based on oversampled complex-modulated FBs.

2.2. Proposed network

We introduce the flexible FBR SISO network shown in Figure 3. This scheme makes use of an N-channel analysis filter bank with fixed analysis filters $H_k(z)$ for splitting the input signal into N subbands, and downsampling and upsampling by M together with an N-channel synthesis filter bank with adjustable synthesis filters $G_k(z)$ for generating frequency shifts (i.e., redirecting the subbands to the desired output positions) as well as recombination of FB subbands into the q shifted user subbands $y_r(n)$, r = 0, 1, . . . , q − 1. In the SISO case, all $y_r(n)$ are finally summed to produce the single output $y(n)$, but in the general MIMO case, different $y_r(n)$ can be directed to different outputs (see Section 5).

The scheme in Figure 3 is the basis and it is used for analysis and design purposes. However, because the synthesis filters $G_k(z)$ are adjustable, they are not used in the final implementation because implementation of such filters becomes quite expensive. Therefore, a main point of this paper is to show that, with appropriate choices of filters and parameters in the FBR network, it is possible to implement the same
Figure 3: Proposed flexible FBR SISO network. The adjustable synthesis FB can be efficiently implemented using a fixed FB and a variable channel switch as indicated in Figure 4. [Diagram: fixed analysis FB with filters H0(z) to HN−1(z), downsampling and upsampling by M, adjustable synthesis FB with filters G0(z) to GN−1(z), and a channel combiner forming y0(n) to yq−1(n) and the output y(n).]

Figure 4: Efficient implementation of the proposed flexible FBR SISO network in Figure 3 using a fixed FB and a variable channel switch. With an appropriately chosen prototype filter order, all μkr become equal to unity. [Diagram: fixed uniform-band FBs with analysis and synthesis filters H0(z) to HN−1(z), downsampling and upsampling by M, a channel switch with phase rotations μkr, and a channel combiner.]
function using instead a variable channel switch and fixed FBs according to the proposed scheme in Figure 4 where the output from the analysis filter $H_k(z)$ is connected to the input of the synthesis filter $H_{c_{kr}}(z)$, with $c_{kr}$ being given by (16) and $\mu_{kr}$ being adjustable phase rotations given by (17) in Section 2.5. In this way, the complexity can be reduced substantially, as fixed filters are considerably less complex to implement in hardware compared to adjustable filters. Furthermore, the fixed analysis (synthesis) FB can be implemented using only one filter block and an IDFT (DFT) block, and all $\mu_{kr}$ become unity for an appropriately chosen filter order. In all, this results in a very efficient realization with retained full flexibility. The key to this efficient solution is to make use of oversampling to avoid channel aliasing, more channels than granularity bands, and appropriately matched analysis and synthesis filters. The following sections give the details.

2.3. Restrictions on M and N

As opposed to fixed networks, aliasing components cannot be completely eliminated through cancellation in fully flexible FBR networks due to the large number of reallocation possibilities and constraints. Instead it must be possible to make them arbitrarily small in each channel, which can be done using oversampling FBs and analysis filters with high enough stopband attenuation. To ensure this in the present setup, it is first observed that the filters are to extract spectra in accordance with Figures 2 and 5. This is achieved by dividing each granularity band into a number of uniform-band FB channels with principal filter magnitude responses according to Figure 6 (also cf. the discussion below). The filter bandwidths are thus 2π/N and their transition bands are 2Δ wide. It is now required that passbands and transition bands of shifted terms caused by decimation do not overlap. This is achieved when

\[ M \le \frac{N}{1 + N\Delta/\pi} < N. \tag{1} \]

In addition to the constraint in (1), there is an additional relation between M and N that must be fulfilled and it is derived as follows. Through decimation and interpolation by the factor M, frequency shifts of 2πm/M radians for m = 0, 1, . . . , M − 1 can be generated. It is required that one is able to generate all integer frequency shifts of the granularity frequency shift, that is, all frequency shifts 2πq/Q for q = 0, 1, . . . , Q − 1. In particular, one must be able to shift the granularity bands by all values 2πq/Q. It is therefore required that M be a multiple of Q, that is,

\[ M = BQ, \quad B \ge 1, \quad B \text{ integer}. \tag{2} \]
Figure 5: Illustration of frequency-band reallocation using the proposed FBs with Q = 4, N = 8. (a)–(d) Recombination of channels. (a), (b), (e), and (f) Recombination of channels and reallocation of subbands; $H^3$ stands for H shifted three granularity-band shifts to the right which amounts to one shift to the left when M = 4.
Since $N > M$ according to (1), this means that the number of uniform-band channels cannot equal the number of granularity bands. Instead, N must be a multiple of Q, as illustrated in Figure 5. That is,

$$N = AQ = \frac{AM}{B}, \quad A > B, \quad A \text{ integer}. \tag{3}$$

Because the downsampling-by-M blocks (upsampling-by-M blocks) in Figure 4 can be propagated to the input (output) [9], the complexity for a given N is minimized by selecting M as large as possible without introducing aliasing, that is, without violating (1). Thus, it follows from (2) and (3) that B is selected as

$$B = A - K, \quad 1 \le K \le A - 1, \quad K \text{ integer}, \tag{4}$$

whereby

$$M = N - KQ, \tag{5}$$

where K is the smallest integer allowed without introducing aliasing. From (1) and (5), it follows that K must satisfy

$$K \ge \frac{\varepsilon N^2}{Q(Q + \varepsilon N)} = \frac{\varepsilon A^2}{1 + \varepsilon A}, \tag{6}$$

where $\varepsilon$ denotes how much the guard band $2\Delta$ occupies the granularity band $2\pi/Q$, that is,

$$2\Delta = \varepsilon \frac{2\pi}{Q}, \quad 0 < \varepsilon \le 1. \tag{7}$$

Figure 6: Principle magnitude responses of the analysis filters.

Figure 7: Principle magnitude response of the prototype filter.

For any given Q, one can thus in principle choose any value of N [which also determines M through (5)] satisfying (2) and (3). In practice, it is selected so as to minimize the implementation complexity. This issue will be treated in Section 3.
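The relations (1)–(7) can be turned into a small parameter-selection routine. The following is only a sketch under stated assumptions: the function name `select_rates` and its argument names are hypothetical, the guard-band fraction is converted to a rational number so that the ceiling in (6) is evaluated exactly, and K is taken as the smallest integer satisfying both (4) and (6).

```python
import math
from fractions import Fraction


def select_rates(Q, A, eps):
    """Return (N, M, B, K) for Q granularity bands, A FB channels per
    granularity band, and guard-band fraction eps, following (3)-(7).

    Hypothetical helper for illustration; not part of the original paper.
    """
    eps = Fraction(eps).limit_denominator(10**6)  # exact arithmetic for the ceiling
    if Q < 1 or A < 2 or not (0 < eps <= 1):
        raise ValueError("need Q >= 1, A >= 2 and 0 < eps <= 1")
    # Smallest integer K satisfying (6); (4) additionally requires K >= 1.
    K = max(1, math.ceil(eps * A * A / (1 + eps * A)))
    if K > A - 1:
        raise ValueError("no admissible K: (4) and (6) cannot both hold")
    B = A - K        # (4)
    N = A * Q        # (3)
    M = N - K * Q    # (5); equals B*Q, as required by (2)
    # Constraint (1), using N*Delta/pi = eps*A from (3) and (7).
    assert M * (1 + eps * A) <= N and M < N
    return N, M, B, K


# Example 1 of the paper: Q = 4, A = 2, 2*Delta = 0.125 * 2*pi/Q.
print(select_rates(4, 2, 0.125))  # (8, 4, 1, 1)
```

For the parameters of Example 1 this returns N = 8 and M = 4, the values used there. When the guard band is too wide for the chosen A, no admissible K exists and the sketch raises an error instead of returning an aliasing-prone configuration.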
Discussion

As seen in (3), the new network makes use of more FB channels than granularity bands (maximum number of user subbands). This is necessary in order to be able to generate all possible frequency shifts, the reason being that a slight oversampling is employed. At first sight, this may seem to be a drawback but is in fact an advantage in that the implementation complexity can be reduced by using more channels than the minimum one required by the application, which is Q in the present application. It is possible to use $N > Q$ here (but not in all FB applications) because the role of the FB is to move spectra which one in principle can do with an arbitrary number of FB channels without degrading the performance in terms of BER, and so forth. In this way, the complexity may even be lower than that of regular maximally decimated FBs despite the fact that a slight oversampling is used.

2.4. Analysis filters

The analysis filters are obtained from a Dth-order linear-phase FIR prototype filter with transfer function

$$P(z) = \sum_{n=0}^{D} p(n) z^{-n} \tag{8}$$

and with the impulse response $p(n)$ being symmetric, that is, $p(n) = p(D - n)$. The frequency response of such a prototype filter can be written as

$$P(e^{j\omega T}) = e^{-jD\omega T/2} P_R(\omega T), \tag{9}$$

where $P_R(\omega T)$ is the real zero-phase frequency response [14] and $\omega T$ denotes the "discrete-time frequency." Its magnitude response is here principally as illustrated in Figure 7. The analysis filters are complex-modulated versions of the prototype filter according to

$$H_k(z) = \beta_k P(z W_N^{k+\alpha}), \quad k = 0, 1, \ldots, N-1, \tag{10}$$

where

$$W_N = e^{-j2\pi/N}, \tag{11}$$

$$\beta_k = W_N^{(k+\alpha)D/2}, \tag{12}$$

and $\alpha$ is a real-valued constant used for placing the filters at the desired centre frequencies according to Figures 2 and 5.

The constants $\beta_k$ compensate for the phase rotations that generally are introduced when replacing the Dth-order linear-phase FIR filter $P(z)$ with $P(zW_N^{k+\alpha})$. In this way, all analysis filters become linear-phase FIR filters with the same delay ($D/2$) as the prototype filter. Indeed, with $\beta_k$ as in (12), the frequency responses become

$$H_k(e^{j\omega T}) = e^{-jD\omega T/2} P_R\left(\omega T - \frac{2\pi(k+\alpha)}{N}\right). \tag{13}$$

2.5. Synthesis filters

As opposed to conventional FBs, where one is interested in perfect reconstruction of the input $x(n)$, the synthesis filters must here be chosen in such a way that the outputs $y_r(n)$, $r = 0, 1, \ldots, q-1$, ideally are frequency-shifted (and delayed due to the FB delay) versions of the subsignals according to

$$Y_r(z) = z^{-D} X_r(z W_Q^{s_r}), \tag{14}$$

where $W_Q = e^{-j2\pi/Q}$, $2\pi/Q$ is the granularity frequency shift (minimum allowed frequency shift), and $s_r$ is an integer denoting the desired number of granularity-band shifts of subband r. For example, if it is desired to move $X_0$ ($X_2$) in Figure 2(b) to the position of $X_2$ ($X_0$), then $s_r = 2$ ($s_r = -2$). Furthermore, it should be possible to approximate perfect FBR as close as desired (i.e., to approximate (14) as close as desired) for all values of q, $1 \le q \le Q$, by properly designing the FB. Both of these criteria are met by selecting the synthesis filters as

$$G_k(z) = \mu_{kr} H_{c_{kr}}(z), \tag{15}$$
where

$$c_{kr} = k + A s_r, \tag{16}$$

$$\mu_{kr} = W_N^{(m_r N/M)D/2} \tag{17}$$

with

$$m_r = \begin{cases} B s_r, & s_r \ge 0, \\ M + B s_r, & s_r < 0, \end{cases} \tag{18}$$

and B being given by (4). The equations above hold for $k = A i_r, A i_r + 1, \ldots, A i_r + A n_r - 1$, with $i_r$ denoting the left-most granularity band included in $x_r(n)$, A being given by (3), and $n_r$ denoting the number of granularity bands in subband r. To obtain (17), we have utilized that

$$W_M^{m_r} = W_N^{m_r N/M}. \tag{19}$$

It should be noted here that the pair $(k, r)$ only takes on values that correspond to $c_{kr} \in [0, N-1]$ which for obvious reasons must be ensured. This will always be the case because our notations reflect the fact that the input subband r covering the granularity-band positions i, for $i = i_r, i_r + 1, \ldots, i_r + n_r - 1$, is to be moved to the positions $i + s_r$. That is, it is a priori assumed that

$$i_r,\; i_r + n_r - 1 \in [0, Q-1] \tag{20}$$

as well as

$$i_r + s_r,\; i_r + s_r + n_r - 1 \in [0, Q-1]. \tag{21}$$

Since the number of FB channels is $N = AQ$, it follows that the input subband r is also covered by the analysis FB channels k, for $k = A i_r, A i_r + 1, \ldots, A i_r + A n_r - 1$. For these values of k, it now follows from (20) that

$$k + A s_r \in [0, A(Q-1) + A - 1] = [0, N-1]. \tag{22}$$

Thus, all $c_{kr}$ in (16) belong to $[0, N-1]$.

The constants $\mu_{kr}$ compensate for the phase rotations that generally are introduced when replacing the Dth-order linear-phase FIR filters $H_k(z)$ with $H_k(zW_M^{m_r})$. In this way, all synthesis filters become linear-phase FIR filters with the same delay ($D/2$) as the prototype filter (compare with the analysis filters in Section 2.4). Further simplifications are obtained by noting that it is always possible to make all $\mu_{kr} = 1$. Indeed, we have

$$\frac{m_r D}{2M} = \text{integer} \implies \mu_{kr} = 1. \tag{23}$$

Thus, it is always possible to make all $\mu_{kr}$ equal to unity by selecting the filter order D of the prototype filter properly. This is easily achieved by introducing a proper amount of additional delays.

Finally, it is noted that it follows from (15) that the network in Figure 3 with fixed filters and adjustable filters can be efficiently implemented by the network in Figure 4 that uses two sets of fixed filters and a variable channel switch.

Example 1. As a means of illustration, we consider the following example:

Number of granularity bands: Q = 4
Number of FB channels: N = 8
Downsampling factor: M = 4
Transition band width: Δ = 0.125π/Q = 0.125π/4
Frequency offset: α = 0.5
Prototype filter order: D = 134
Number of subbands: q = 3
Number of granularity bands in each input subband: $n_0 = 1$, $n_1 = 2$, $n_2 = 1$
First FB channel in each input subband: $k_0 = 0$, $k_1 = 2$, $k_2 = 6$.

The magnitude responses of the analysis filters are shown in Figure 8. Design details will be discussed in Section 4. The input spectrum is plotted in Figure 9. We now consider three different reallocation schemes.

Figure 8: Analysis filters in Example 1.

Figure 9: Input spectrum in Example 1.

Reallocation scheme (a)

In this case, we assume that the output subband positions are the same as the input subband positions. This illustrates the ability of the filter bank to recombine several adjacent channels. In this case, the synthesis filters are the same as the analysis filters which means that the channel switch simply passes on its inputs as seen in Figure 13(a). The output spectrum becomes as shown in Figure 10. It is seen that it is the same as the input spectrum except for small errors introduced in the FBR network. By properly designing the network, these errors can be made negligible compared to other errors that are always present in communication systems. To exemplify: the input samples are in this example randomly generated quadrature amplitude modulated symbols (QAM-16)
normalized so that the signal has a unity average power. Using an additional filter for recovering the first subband ($x_0$), we find that the maximum distance between the input and output samples is below 0.01. As a consequence, if the symbol error rate due to additive white noise alone (thus without errors created in the FBR network) is, say, $10^{-6}$, it will in the worst case be increased to $1.5 \times 10^{-6}$ due to the FBR network. By increasing the filter orders, and redesigning the FBR network, the degradation can be reduced to any level that in practice is negligible.

Figure 10: Output spectrum in Example 1 for reallocation scheme (a).

Reallocation scheme (b)

In this case, we assume a scheme as that shown earlier in Figures 5(a), 5(b), 5(e), and 5(f). This is achieved by selecting the synthesis filters according to (15) with the following numbers of granularity-band shifts: $s_0 = 3$, $s_1 = s_2 = -1$. These values imply that $m_r = 3$, for $r = 0, 1, 2$, which means that $\mu_{kr} = -j$ for all pairs of values kr of interest in (17), that is, for kr = 00, 10, 21, 31, 41, 51, 62, 72. These values of kr result in the following values of $c_{kr}$: $c_{00} = 6$, $c_{10} = 7$, $c_{21} = 0$, $c_{31} = 1$, $c_{41} = 2$, $c_{51} = 3$, $c_{62} = 4$, $c_{72} = 5$. When the synthesis FB is implemented using a switch and fixed filters, as shown in Figure 4, we recall that the role of the channel switch is to redirect its input at position k to its output at position $c_{kr}$. In this example, the switch in Figure 4 is thus implemented as shown in Figure 13(b). The output spectrum becomes as shown in Figure 11. The errors are of the same order as in scheme (a).

Figure 11: Output spectrum in Example 1 for reallocation scheme (b).

Reallocation scheme (c)

In this case, we assume that the two narrow-band subbands are to interchange their positions as compared to scheme (b). The output spectrum becomes in this case as shown in Figure 12. The errors are of the same order as in schemes (a) and (b). The parameter values are here as follows: $s_0 = 2$, $s_1 = -1$, $s_2 = 0$; $m_0 = 2$, $m_1 = 3$, $m_2 = 0$ (from (18), with $m_1 = M + Bs_1$ since $s_1 < 0$); $c_{00} = 4$, $c_{10} = 5$, $c_{21} = 0$, $c_{31} = 1$, $c_{41} = 2$, $c_{51} = 3$, $c_{62} = 6$, $c_{72} = 7$; $\mu_{00} = \mu_{10} = -1$, $\mu_{21} = \mu_{31} = \mu_{41} = \mu_{51} = -j$, $\mu_{62} = \mu_{72} = 1$. The switch is in this case implemented as shown in Figure 13(c).

Figure 12: Output spectrum in Example 1 for reallocation scheme (c).

Finally, it is noted that we used a filter order of 134 in this example which resulted in multiplier values $\mu_{kr}$ not equal to unity. This was done in order to illustrate that the proposed technique works in such cases as well. By increasing the filter order to, for example, 136, all $\mu_{kr}$ become equal to unity.

3. IMPLEMENTATION COMPLEXITY

The main point of this section is the selection of the number of FB channels N that minimizes the overall implementation complexity when efficient DFT- and IDFT-based realizations are employed.

3.1. Efficient DFT- and IDFT-based realizations

Utilizing the polyphase form of $P(z)$ given by [9]

$$P(z) = \sum_{i=0}^{N-1} z^{-i} P_i(z^N), \tag{24}$$

where $P_i(z)$ are the polyphase components, $H_k(z)$ in (10) can be rewritten as

$$H_k(z) = \beta_k \sum_{i=0}^{N-1} z^{-i} \alpha_i P_i(z^N W_N^{\alpha N}) W_N^{-ki}, \tag{25}$$

where

$$\alpha_i = W_N^{-\alpha i}. \tag{26}$$

Making use of (25), well-known properties of DFT and IDFT FBs, and properties of downsamplers and upsamplers, it is now recognized that the analysis and synthesis FBs can be realized with the aid of an N-point IDFT and N-point DFT, respectively, as shown in Figures 14 and 15 where all arithmetic operations take place at the lowest sampling rate ($f_{in}/M$). The multipliers in Figure 15 are given by

$$\gamma_k = \beta_k W_N^k. \tag{27}$$
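The polyphase identity (25) can be checked numerically. The sketch below is an illustration only: the function names are hypothetical, and a random symmetric sequence stands in for the prototype impulse response $p(n)$, because the identity does not depend on the actual design. It builds each $H_k(z)$ directly from (10)–(12) and again branch by branch from (25)–(26), then compares the two impulse responses.

```python
import numpy as np


def w(N, x):
    """W_N**x with W_N = exp(-j*2*pi/N), for real-valued exponents x."""
    return np.exp(-2j * np.pi * np.asarray(x, dtype=float) / N)


def analysis_filter_direct(p, k, N, alpha):
    """Impulse response of H_k(z) = beta_k * P(z * W_N**(k + alpha)), eqs. (10)-(12)."""
    D = len(p) - 1
    n = np.arange(D + 1)
    beta_k = w(N, (k + alpha) * D / 2)            # (12)
    return beta_k * p * w(N, -(k + alpha) * n)


def analysis_filter_polyphase(p, k, N, alpha):
    """Same impulse response assembled from the polyphase form (25)."""
    D = len(p) - 1
    beta_k = w(N, (k + alpha) * D / 2)
    h = np.zeros(D + 1, dtype=complex)
    for i in range(N):
        p_i = p[i::N]                             # polyphase component p(mN + i)
        m = np.arange(len(p_i))
        alpha_i = w(N, -alpha * i)                # (26)
        rotated = p_i * w(N, -alpha * N * m)      # coefficients of P_i(z^N W_N^{alpha N})
        h[i::N] = beta_k * alpha_i * rotated * w(N, -k * i)
    return h


# Stand-in prototype: random symmetric impulse response of order D = 134.
half = np.random.default_rng(0).standard_normal(68)
p = np.concatenate([half, half[-2::-1]])
errors = [np.max(np.abs(analysis_filter_direct(p, k, 8, 0.5)
                        - analysis_filter_polyphase(p, k, 8, 0.5)))
          for k in range(8)]
print(max(errors))  # maximum deviation between the two constructions
```

The direct impulse responses also satisfy $h_k(n) = h_k^*(D - n)$, which is the time-domain counterpart of the linear-phase form (13).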
Figure 13: Channel switch in Example 1; Schemes (a), (b), and (c).

In the efficient synthesis FB in Figure 15, the separate outputs $y_r(n)$ from the channel combiner (Figure 3) are not available. This means that the multipliers $\mu_{kr}$ have to be placed at the input, preferably in front of the DFT (instead of the channel switch) since they can then be combined with the multipliers already present there; this is illustrated in Figure 15. In this way, the multiplier cost can be minimized also in those cases when $\mu_{kr} \ne 1$. It should also be noted that not having the separate outputs $y_r(n)$ available is not a problem in the SISO case as only the composite output $y(n)$ is supposed to be used here. However, in the MIMO case, this is a problem that must be taken care of (see Section 5).

In summary, it is seen that the proposed FBR network has about the same low complexity as that of a regular fixed modulated FB but with the additional inherent flexibility. Naturally, there is an overhead cost due to the channel switch, but such a block is required in all flexible FBR networks and thus not an extra cost in comparison with other such networks.

3.2. Selection of N that minimizes the implementation complexity

As seen earlier in Section 2.3, there is not just one selection of the number of FB channels N that can be used for a fixed prespecified number of subbands Q and guard band width $2\Delta$. In practice, it is of course of interest to select N so that the overall implementation complexity is minimized. This issue is treated in this section.

Because the prototype filter $P(z)$ is a linear-phase FIR filter, its order D can be estimated as [15]

$$D = \frac{K_P}{2\Delta}, \tag{28}$$

where $2\Delta$ is the transition bandwidth (which equals the width of the guard band, see Figure 2) and

$$K_P = \frac{-20\log_{10}\sqrt{\delta_c \delta_s} - 13}{14.6/(2\pi)} \tag{29}$$

with $\delta_c$ and $\delta_s$ being the passband and stopband ripples, respectively. The order is thus inversely proportional to the transition bandwidth. The number of multipliers required in the prototype filter is $D + 1$, since the symmetry of the linear-phase FIR prototype filter cannot be utilized. Further, the implementation of an N-point DFT, as well as an IDFT, requires about $0.5N\log_2(N)$ multiplications per block of N input/output samples, provided that an efficient FFT algorithm is used. The complexity $C_A$ of the analysis FB becomes thereby$^{1,2}$

$$C_A = \frac{D + 1 + 0.5N\log_2(N)}{M} = \frac{K_P/\Delta + 2 + N\log_2(N)}{2M}. \tag{30}$$

For a fixed N, it is evident from (30) that the complexity reduces as M increases. This justifies the choice $M = N - KQ$ in (5) in Section 2.3. Expressed in terms of A, with $N = AQ$ according to (3), (30) can alternatively be written as

$$C_A = \frac{K_P/\Delta + 2 + AQ\log_2(AQ)}{2M}. \tag{31}$$

Assuming that equality holds in (1) and (6), one may find the minimum of the function $C_A$ by setting its derivative with respect to A to zero and solving for A, yielding the optimum A, denoted here as $A_{\rm opt}$. However, since $C_A$ and its derivative involve both A and the logarithm of A, it is not possible to express $A_{\rm opt}$ in a simple form. In practice it is therefore advisable to plot $C_A$ as a function of A from which $A_{\rm opt}$ easily can be identified. This is illustrated in Figure 16 for two different values of $K_P$. One should note here that there are basically three different cases that may occur. In the first case, as seen in the uppermost plot in Figure 16, $A_{\rm opt}$ lies between $A_{\min}$ and $A_{\max}$, which denote the minimum and maximum values of A, respectively. The minimum value is always $A_{\min} = 2$ due to (2) and (3). The maximum value is determined by the upper bound on N that exists because the number of channels ($N/Q$) in each subband times the guard bandwidth ($2\Delta$) cannot exceed the granularity bandwidth ($2\pi/Q$).$^{3}$ Hence, N is bounded by

$$N \le \frac{\pi}{\Delta} = \frac{Q}{\varepsilon}, \tag{32}$$

where the equality comes from (7). This implies that the maximum value of A is

$$A_{\max} = \left\lfloor \frac{1}{\varepsilon} \right\rfloor, \tag{33}$$

where $\lfloor x \rfloor$ stands for the maximum integer smaller than or equal to x. In the second case, as seen in the lowermost plot in Figure 16, $A_{\rm opt} = A_{\max}$. This occurs when $K_P$ is large. In the third case, $A_{\rm opt} = A_{\min}$, which occurs when $K_P$ is small.

1 As a measure of complexity, the multiplication rate is used. It is here the number of multiplications per input (output) sample in the analysis FB (synthesis FB). The multiplication rate takes into account the data rate at which the multiplications are performed.
2 The number of additions and delay elements is here roughly proportional to the number of multiplications and is therefore omitted in the discussion.
3 The bound can be increased, in principle to infinity, by reducing the guard bandwidth, but this does not make sense as it will increase the filter order.
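Since $A_{\rm opt}$ has no simple closed form, the search over A suggested above can be done by direct evaluation of (31). The sketch below is an illustration under stated assumptions: the function names are hypothetical, equality is taken in (1) so that $M = AQ/(1 + \varepsilon A)$ is treated as a real number rather than an integer, and the value Q = 16 in the usage line is an arbitrary choice (the parameters behind Figure 16 are not given in this section).

```python
import math
from fractions import Fraction


def kaiser_constant(delta_c, delta_s):
    """K_P from (29) for passband ripple delta_c and stopband ripple delta_s."""
    return (-20 * math.log10(math.sqrt(delta_c * delta_s)) - 13) / (14.6 / (2 * math.pi))


def analysis_complexity(A, Q, eps, K_P):
    """C_A from (31), assuming equality in (1), i.e. M = A*Q / (1 + eps*A)."""
    N = A * Q
    delta = eps * math.pi / Q        # half the guard band, from (7)
    M = N / (1 + eps * A)            # real-valued M (simplifying assumption)
    return (K_P / delta + 2 + N * math.log2(N)) / (2 * M)


def optimum_A(Q, eps, K_P):
    """Integer A in [A_min, A_max] = [2, floor(1/eps)] that minimizes C_A."""
    A_max = math.floor(1 / Fraction(eps).limit_denominator(10**6))   # (33)
    candidates = range(2, A_max + 1)
    if not candidates:
        raise ValueError("eps too large: A_max < A_min = 2")
    return min(candidates, key=lambda A: analysis_complexity(A, Q, eps, K_P))


K_P = kaiser_constant(0.01, 0.01)    # ripples chosen for illustration only
for A in range(2, 11):
    print(A, round(analysis_complexity(A, 16, 0.1, K_P), 3))
print("A_opt =", optimum_A(16, 0.1, K_P))
```

Under these assumptions the three cases described above can be reproduced by varying `K_P`: a very large value pushes the minimum to $A_{\max}$, and a very small value pushes it to $A_{\min}$.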
Figure 14: Analysis FB realizing the analysis filters $H_k(z)$, as given by (10), where $L = A/B$ = integer. When $A/B$ is not an integer, a more general polyphase implementation of the polyphase components $P_i(z^N)$ followed by downsampling has to be used [9], but all filtering operations can still be moved to the input rate.

Figure 15: Synthesis FB realizing the synthesis filters $G_k(z)$ as given by (15) using a channel switch and fixed filters $H_k(z)$ as given by (10).
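The channel-switch routing $c_{kr}$ and the multipliers $\mu_{kr}$ that are placed in front of the DFT in Figure 15 follow directly from (16)–(18). The sketch below is an illustration only; the function name `switch_parameters` and its argument layout are hypothetical. It uses the equivalent form $\mu_{kr} = e^{-j\pi m_r D/M}$ of (17) and reduces the phase modulo $2M$ so that the result is numerically clean.

```python
import numpy as np


def switch_parameters(Q, A, B, D, first_band, num_bands, shifts):
    """Return dictionaries {k: c_kr} and {k: mu_kr} for every analysis channel k.

    first_band[r] = i_r, num_bands[r] = n_r, shifts[r] = s_r for subband r.
    Hypothetical helper for illustration; not part of the original paper.
    """
    M = B * Q
    c, mu = {}, {}
    for i_r, n_r, s_r in zip(first_band, num_bands, shifts):
        # A priori assumptions (20) and (21) on source and target positions.
        if not (0 <= i_r and i_r + n_r - 1 <= Q - 1
                and 0 <= i_r + s_r and i_r + s_r + n_r - 1 <= Q - 1):
            raise ValueError("subband leaves the range [0, Q-1]")
        m_r = B * s_r if s_r >= 0 else M + B * s_r             # (18)
        # (17): W_N^{(m_r N/M) D/2} = exp(-j*pi*m_r*D/M)
        mu_r = np.exp(-1j * np.pi * ((m_r * D) % (2 * M)) / M)
        for k in range(A * i_r, A * (i_r + n_r)):
            c[k] = k + A * s_r                                 # (16)
            mu[k] = mu_r
    return c, mu


# Example 1, reallocation scheme (b): Q = 4, A = 2, B = 1, D = 134.
c, mu = switch_parameters(4, 2, 1, 134, [0, 1, 3], [1, 2, 1], [3, -1, -1])
print([c[k] for k in range(8)])
print(np.round([mu[k] for k in range(8)], 6))
```

With the parameters of Example 1 this reproduces the switch settings listed for schemes (b) and (c), and with D = 136 in place of 134 every multiplier evaluates to unity, in agreement with (23).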
Figure 16: Complexity $C_A$ as a function of $A = N/Q$ for two different values of $K_P$ as given by (29): (a) $K_P = 11.62$; (b) $K_P = 33.14$. In both plots, $A_{\min} = 2$ and $A_{\max} = 10$.

In the discussion above, some simplifications were made in order to arrive at the optimum A. In practice, there are several issues that must be taken into consideration which complicate the minimization of the complexity. These issues are discussed below.

First, it was assumed that the passband and stopband ripples are constant regardless of the value of N. As N increases, one should rather replace the stopband ripple $\delta_s$ by $\delta_s/N$ though, to compensate for the larger number of aliasing components, at least when using worst-case design techniques (see (49) in Section 4.3). However, since the order of an FIR filter depends on the stopband ripple logarithmically, this compensation will have a minor effect upon the order. Hence, if we instead use $\delta_s/N$ above, the complexity $C_A$ as a function of A will only change slightly.

Second, it was assumed that the prototype filter is a regular lowpass linear-phase FIR filter without requirements in the transition band. However, one should compensate for the fact that the prototype filter must exhibit an approximately power complementary behavior in the transition band. This means that the constant $K_P$ in (28) should be replaced by $cK_P$, $c > 1$. Our experience is that c is approximately constant, regardless of the other parameter values, although there exist no empirically derived formulas based on a large number of
designs that confirm this assertion. If c is constant, the effect is that we simply increase the value of $K_P$, the result of which is that $A_{\rm opt}$ will move closer to $A_{\max}$, unless $A_{\rm opt} = A_{\max}$ for $K_P$ in which case $A_{\rm opt}$ remains the same. This is seen in Figure 16.

Third, we have assumed that all multiplications have the same cost in an implementation. However, in cases where $\alpha$ takes on the value 0, $\pm 0.25$, or $\pm 0.5$ (implying that $W_N^{-\alpha N}$ takes on the values 1, $\pm j$, and $-1$) each multiplication in the polyphase components only requires one real multiplication whereas the multiplications in the DFT and IDFT, most of which are always complex, require at least three real multiplications [16]. Taking this into account amounts to replacing 0.5 in (30) with 1.5, the result of which is that $A_{\rm opt}$ will move closer to $A_{\min}$, unless $A_{\rm opt} = A_{\min}$ for the value 0.5, in which case $A_{\rm opt}$ remains the same.

Taking these issues into account, one can thus still generate a plot as that in Figure 16 from which the optimum value of A can be determined. As to the synthesis FB, its complexity is the same as that of the analysis FB when all $\mu_{kr}$ equal unity, which always can be guaranteed if a certain amount of additional delay can be accepted. In the most general case, with some or all of $\mu_{kr}$ not being equal to unity, at most $N/M$ additional complex multiplications per input/output sample are required. Since $N/M$ never exceeds 2, this is a minor extra cost during normal operation.

4. DESIGN

This section considers the design of the flexible FBR network which amounts to determining the linear-phase FIR prototype filter $P(z)$ so that the network approximates perfect FBR. This is in principle the same design problem as in conventional FBs, but it is much more complex here due to the many different reallocation schemes involved.

4.1. Distortion and aliasing

Using well-known input-output relations for the downsampler and upsampler [9], one finds that the z-transform of the output $y(n)$ in Figures 3 and 4 can be expressed as

$$Y(z) = \sum_{r=0}^{q-1} Y_r(z), \tag{34}$$

where the outputs $y_r(n)$, $r = 0, 1, \ldots, q-1$, are given by

$$Y_r(z) = \sum_{m=0}^{M-1} V_{rm}(z) X(zW_M^m) \tag{35}$$

with $W_M = e^{-j2\pi/M}$ and

$$V_{rm}(z) = \sum_{k=k_r}^{k_r + An_r - 1} H_k(zW_M^m) G_k(z), \tag{36}$$

where $k_r = Ai_r$ denotes the first FB channel included in the same band as $x_r(n)$. We now wish to state the condition under which perfect FBR is obtained. In order to do that in a simple form, we first recognize that (14) in the Fourier domain corresponds to

$$Y_r(e^{j\omega T}) = e^{-jD\omega T} X_r(e^{j\omega T} W_Q^{s_r}) \tag{37}$$

which, equivalently, can be written as

$$Y_r(e^{j\omega T}) = F_r(e^{j\omega T} W_Q^{s_r}) X(e^{j\omega T} W_Q^{s_r}) \tag{38}$$

with

$$F_r(e^{j\omega T}) = \begin{cases} e^{-jD\omega T}, & \omega T \in \Omega_x^{(r)}, \\ 0, & \omega T \notin \Omega_x^{(r)}, \end{cases} \tag{39}$$

where

$$\Omega_x^{(r)} = \left[ (2i_r - 1)\pi/Q + 2\pi\alpha/Q + \Delta,\; (2i_r + 2n_r - 1)\pi/Q + 2\pi\alpha/Q - \Delta \right] \tag{40}$$

and $1/T$ is the input and output sampling rate.

The network is a perfect FBR network if the right-hand side of (35) for $z = e^{j\omega T}$ equals that in (38). Thus, the network is a perfect FBR network if $V_{rm}(z)$ in (36) for all r and m satisfy

$$V_{rm}(e^{j\omega T}) = F_r(e^{j\omega T} W_Q^{s_r}), \quad m = m_r, \qquad V_{rm}(z) = 0, \quad m \ne m_r, \tag{41}$$

where $F_r(e^{j\omega T})$ is given by (39) and $m_r$ is given by (18). We have also utilized that $W_Q^{s_r} = W_M^{m_r}$. When $s_r$ is negative, $m_r$ equals $M + Bs_r$ instead of $Bs_r$ which is due to the fact that only positive values of m are used in (35). It is possible to replace $Bs_r$ with $M + Bs_r$ because $W_M^m = W_M^{M+m}$.

It should be noted that for the special case with $q = Q = 1$, a regular FB is obtained. In this case, no reallocation can take place (since only one band is present) and the whole band should be reconstructed. In this special case, a perfect FBR is the same as a perfect reconstruction FB.

4.2. Relation between $V_{rm_r}(e^{j\omega T})$ and $V_{r0}(e^{j\omega T})$

This section shows that the FBR network for all $s_r$ of interest can be related to an FBR network with $s_r = 0$, that is, when subbands are not reallocated but only recombined. This amounts to showing that $V_{rm_r}(e^{j\omega T})$ are frequency shifted versions of $V_{r0}(e^{j\omega T})$. This relation eases the design substantially as discussed in the following section.

We first note that the frequency responses corresponding to $H_k(zW_M^{m_r})$ and $G_k(z)$ are obtained from (9), (10), and (15), as

$$H_k(e^{j\omega T} W_M^{m_r}) = e^{-jD\omega T/2}\, W_N^{-(m_r N/M)D/2}\, P_R\left(\omega T - \frac{2\pi(k + m_r N/M + \alpha)}{N}\right),$$

$$G_k(e^{j\omega T}) = e^{-jD\omega T/2}\, W_N^{(m_r N/M)D/2}\, P_R\left(\omega T - \frac{2\pi(k + m_r N/M + \alpha)}{N}\right), \tag{42}$$
respectively. Hence, the frequency responses corresponding to $V_{rm_r}(z) = \sum_k H_k(zW_M^{m_r}) G_k(z)$ become

$$V_{rm_r}(e^{j\omega T}) = e^{-jD\omega T} \sum_{k=k_r}^{k_r + An_r - 1} P_R^2\left(\omega T - \frac{2\pi(k + m_r N/M + \alpha)}{N}\right). \tag{43}$$

Thus, the distortion function is a linear-phase function with delay D and magnitude

$$\left|V_{rm_r}(e^{j\omega T})\right| = \sum_{k=k_r}^{k_r + An_r - 1} P_R^2\left(\omega T - \frac{2\pi(k + m_r N/M + \alpha)}{N}\right). \tag{44}$$

We note that

$$V_{rm_r}(e^{j\omega T}) = V_{r0}(e^{j\omega T} W_N^{m_r N/M}), \tag{45}$$

where $V_{r0}(e^{j\omega T})$ is given by

$$V_{r0}(e^{j\omega T}) = e^{-jD\omega T} \sum_{k=k_r}^{k_r + An_r - 1} P_R^2\left(\omega T - \frac{2\pi(k + \alpha)}{N}\right), \tag{46}$$

which is the distortion function when the subbands are only recombined (thus not reallocated). This shows that $V_{rm_r}(z)$ are frequency-shifted versions of $V_{r0}(z)$. Hence, if the network is a near-perfect FBR network when $G_k(z) = H_k(z)$, so is the network when these $G_k(z)$ are replaced with the functions in (15). It should be mentioned, however, that the aliasing components do not remain the same but their magnitudes are still bounded by the stopband attenuation of the prototype filter.

4.3. Minimax design

Filter banks are commonly designed using minimax or least-squares design techniques, or combinations of such design techniques [17]. This paper discusses minimax design but the alternatives can of course be used as well after appropriate modifications.

Due to (45), it suffices to control $V_{r0}(e^{j\omega T})$, given by (46), for $r = 0, 1, \ldots, q-1$, and the aliasing terms in the design. For this reason, let the specifications of $V_{rm}(z)$ be

$$\left|V_{r0}(e^{j\omega T}) - F_r(e^{j\omega T})\right| \le \delta_0, \quad \omega T \in [0, \pi], \tag{47}$$

where $\delta_0 > 0$ and $F_r(e^{j\omega T})$ is given by (39), and

$$\left|V_{rm}(e^{j\omega T})\right| \le \delta_1, \quad \omega T \in [0, \pi], \tag{48}$$

for $m = 0, 1, \ldots, M-1$, $m \ne m_r$, $m_r$ being given by (18), and $\delta_1 > 0$. The parameters $\delta_0$ and $\delta_1$ are prescribed distortion and aliasing errors, respectively, and determined by the application at hand. In conventional FBs, the distortion and aliasing errors can be made zero by using certain classes of PR FBs. It is however not likely that one can find practical realizations with zero distortion and aliasing errors when it comes to flexible FBR reallocation networks. The reason is that (41) should be satisfied for all $r = 0, 1, \ldots, q-1$, all $q = 1, 2, \ldots, Q$, and all feasible combinations and reallocation schemes. This means that the number of conditions to satisfy is substantially larger for flexible FBR networks than for regular FBs. Therefore, one has to accept the use of near-perfect FBR networks. This is however not really a problem because the FB is to be used in a communication system which always contains other sources of errors which together result in a certain BER. The important point is that it is possible to design the FBR network to approximate perfect FBR as closely as desired, as one thereby can make the degradation due to the imperfect FBR network negligible compared to the other errors involved. In addition, it is known that the use of near-PR FBs instead of PR FBs can reduce the complexity substantially [17] which means that one should aim for near-PR systems anyhow. Exactly how close to perfect FBR the network must be is not specific for the proposed network but depends on the communication environment, modulation techniques, and other factors [18] that are beyond the scope of this paper.

In principle, one can apply any standard nonlinear optimization technique [19] directly to meet the criteria in (47) and (48). However, as the optimization is nonlinear, and will contain many constraints, it may become numerically difficult or infeasible to solve this problem in practice. One way to reduce the number of constraints substantially is to allow a slight overdesign and replace (48) with

$$\left|P(e^{j\omega T})\right| \le \frac{\delta_1}{N}, \quad \omega T \in \Omega_s, \tag{49}$$

where $\Omega_s$ denotes the stopband of $P(z)$. It is also noted that nonlinear optimization benefits from a good initial solution which here can be obtained by using the well-known algorithm in [20] which generates linear-phase FIR filters optimum in the minimax sense.

Finally, we note that, for a fixed reallocation scheme, (47) and (48) correspond to the requirements of partially reconstructing FBs [7]. However, as already explained, the design problem is much more complex here as a large number of reallocation options must be handled simultaneously in the design.

5. FLEXIBLE FBR MIMO NETWORKS

This section shows how to generalize the proposed SISO networks to MIMO networks.

5.1. K-input K-output frequency-band reallocation networks

Generalizing the SISO system considered so far to a MIMO system with equal number (K) of inputs and outputs, we propose the flexible FBR network depicted in Figure 17. It is here assumed that the subbands are reallocated to unique positions. Further, the analysis FBs (AFBs) and synthesis FBs (SFBs) are instances of the fixed FBs used in Section 3. Thus, the only difference from the SISO case is that the channel
switch in this MIMO case is able to redirect information from any input beam to any output beam, as illustrated in the example below. If the FBR SISO network is designed as outlined in Section 4, the overall performance for each output subband in the MIMO network will be the same as in the SISO network, except for some minor negligible differences caused by differences in the aliasing terms. Consequently, it suffices to design one prototype filter for the SISO case, that is, as outlined in Section 4, and then use K instances of the corresponding FBs according to Figure 17. This implies that the proposed MIMO system is modular which is attractive from the design and implementation points of view.

Example 2. The function of the proposed FBR MIMO network is illustrated through an example with two input and output beams. The two input spectra are plotted in Figures 18 and 19. It is desired to reallocate the subbands according to Figures 20 and 21, which plot the two output spectra. The frequency-band reallocation is achieved by using the channel switch in Figure 22 and FBs with the same filter magnitude responses as used earlier in Example 1 (Figure 8), but with an additional filter delay introduced to make all $\mu_{kr}$ equal to unity.

Figure 17: Proposed K-input K-output flexible FBR network using fixed FBs and a channel switch.

Figure 18: Input 1 spectrum in Example 2.

Figure 19: Input 2 spectrum in Example 2.

Figure 20: Output 1 spectrum in Example 2.

Figure 21: Output 2 spectrum in Example 2.

Figure 22: Channel switch in Example 2.
Figure 23: Proposed S-input K-output FBR network using fixed FBs, a channel switch, and channel combiners (Ch Co).
5.2. S-input K-output systems

Generalizing the K-input K-output system considered above to an S-input K-output system, we propose the flexible FBR network depicted in Figure 23. Again, it is assumed that the subbands are reallocated to unique positions which implies that K ≥ S. It is further assumed that

K = RS (50)

which corresponds to the fact that the output beams' bandwidth is assumed to be R times narrower than that of the input beams (see footnote 4). This means that only some of the synthesis FB outputs are combined to form the outputs. It also means that decimation by R can take place at the outputs without introducing aliasing. Hence, in principle, it is again possible to use only S fixed synthesis FBs, but it is then not possible to directly redirect all output subbands to the baseband. Instead, one has to make use of the whole band and let the subsequent decimation make the mapping to the baseband; that is, the spectrum at the input of each decimator has a bandwidth of π/R and is positioned between pπ/R and (p + 1)π/R with respect to the input sampling rate, with p being an integer belonging to the set [0, R − 1].

However, a problem of using only S fixed synthesis FBs is that it is then not possible to make use of the efficient realization in Figure 15 because the outputs of the synthesis filters are not available in that structure. To get around this problem, we propose to use instead K = RS fixed synthesis FBs, each being an instance of the fixed synthesis FB used in Section 3 (Figure 15) but with some of the inputs to the DFT being zero which corresponds to the fact that only a subset of the FB channels will be utilized in each synthesis FB (see footnote 4). In this case, one can redirect all output subbands to the baseband. Further, by making use of multirate identities [9] one can make the overall computational complexity of the K synthesis FBs roughly the same as earlier. That is, the number of arithmetic operations per time unit remains the same whereas the number of synthesis FB instances is R times higher. Note that analysis FBs can be implemented in the same way as for the SISO case and MIMO case with equal number of inputs and outputs. It is thus only the synthesis parts that need to be modified in this generalized MIMO case.

5.3. Further generalizations

One may also think of allowing S > K in the network in Section 5.2 above. However, this requires synthesis FBs with upsampling rates higher than the downsampling rates used in the analysis FBs. The proposed network cannot be used for this case straightforwardly and is therefore not discussed further in this paper.

6. CONCLUDING REMARKS

This paper introduced a new class of flexible FBR networks using variable oversampled complex-modulated FBs. The new network can outperform existing ones when all the aspects flexibility, low complexity and inherent parallelism, near-perfect FBR, and simplicity are considered simultaneously. The paper discussed design and complexity issues and provided examples that demonstrated the functionality. Finally, we wish to make the following two remarks. First, the FB prototype filter used in this paper is a linear-phase FIR filter. It is possible to use instead a nonlinear-phase FIR filter or an IIR filter, after appropriate modifications, as a means to reduce the delay and/or the implementation complexity. Second, the proposed design technique is simple, and attractive in that sense, but it generates overdesigned FBs. There is thus room for reduction of the complexity by using other design methods. These are topics for future research.

Footnote 4: This case can be generalized to allow outputs with different data rates which amounts to allowing different downsampling factors at the output in Figure 23. In the implementation, different instances of synthesis FBs must then be used, with different numbers of inputs to the DFT being set to zero.
REFERENCES

[1] B. Arbesser-Rastburg, R. Bellini, F. Coromina, et al., “R&D directions for next generation broadband multimedia systems: an ESA perspective,” in Proceedings of 20th AIAA International Communication Satellite Systems Conference and Exhibit, Montreal, Quebec, Canada, May 2002.
[2] E. Del Re and L. Pierucci, “Next-generation mobile satellite networks,” IEEE Communications Magazine, vol. 40, no. 9, pp. 150–159, 2002.
[3] M. Wittig, “Satellite onboard processing for multimedia applications,” IEEE Communications Magazine, vol. 38, no. 6, pp. 134–140, 2000.
[4] M.-L. Boucheret, I. Mortensen, and H. Favaro, “Fast convolution filter banks for satellite payloads with on-board processing,” IEEE Journal on Selected Areas in Communications, vol. 17, no. 2, pp. 238–248, 1999.
[5] G. Chiassarini and G. Gallinaro, “Frequency domain switching: algorithms, performances, implementation aspects,” in Proceedings of the 7th Tyrrhenian International Workshop on Digital Communications, Viareggio, Italy, September 1995.
[6] H. G. Göckler and B. Felbecker, “Digital on-board FDM-demultiplexing without restrictions on channel allocation and bandwidth,” in Proceedings of the 7th International Workshop on Digital Signal Processing Techniques for Space Applications (DSP ’99), Noordwijk, The Netherlands, 1999.
[7] T. Q. Nguyen, “Partial spectrum reconstruction using digital filter banks,” IEEE Transactions on Signal Processing, vol. 41, no. 9, pp. 2778–2795, 1993.
[8] W. A. Abu-Al-Saud and G. L. Stuber, “Efficient wideband channelizer for software radio systems using modulated PR filterbanks,” IEEE Transactions on Signal Processing, vol. 52, no. 10, part 1, pp. 2807–2820, 2004.
[9] P. P. Vaidyanathan, Multirate Systems and Filter Banks, Prentice-Hall, Englewood Cliffs, NJ, USA, 1993.
[10] P. N. Heller, T. Karp, and T. Q. Nguyen, “A general formulation of modulated filter banks,” IEEE Transactions on Signal Processing, vol. 47, no. 4, pp. 986–1002, 1999.
[11] T. Karp and N. J. Fliege, “Modified DFT filter banks with perfect reconstruction,” IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 46, no. 11, pp. 1404–1414, 1999.
[12] J. Alhava, A. Viholainen, and M. Renfors, “Efficient implementation of complex exponentially-modulated filter banks,” in Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS ’03), vol. 4, pp. 157–160, Bangkok, Thailand, May 2003.
[13] H. Johansson and P. Löwenborg, “Flexible frequency-band reallocation network based on variable oversampled complex-modulated filter banks,” in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP ’05), vol. 3, pp. 973–976, Philadelphia, Pa, USA, March 2005.
[14] T. Saramäki, “Finite impulse response filter design,” in Handbook for Digital Signal Processing, S. K. Mitra and J. F. Kaiser, Eds., chapter 4, pp. 155–277, John Wiley & Sons, New York, NY, USA, 1993.
[15] J. F. Kaiser, “Nonrecursive digital filter design using I0-sinh window function,” in Proceedings of IEEE International Symposium on Circuit and Systems (ISCAS ’74), pp. 20–23, San Francisco, Calif, USA, April 1974.
[16] K. K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation, chapter 8, John Wiley & Sons, New York, NY, USA, 1999.
[17] T. Saramäki and R. Bregovic, “Multirate systems and filter banks,” in Multirate Systems: Design and Applications, G. Jovanovic-Dolecek, Ed., chapter 2, pp. 27–85, Idea Group, Hershey, Pa, USA, 2002.
[18] S. Haykin, Digital Communications, John Wiley & Sons, New York, NY, USA, 1988.
[19] S. G. Nash and A. Sofer, Linear and Nonlinear Programming, McGraw-Hill, New York, NY, USA, 1996.
[20] J. H. McClellan, T. W. Parks, and L. R. Rabiner, “A computer program for designing optimum FIR linear phase digital filters,” IEEE Transactions on Audio and Electroacoustics, vol. 21, no. 6, pp. 506–526, 1973.

Håkan Johansson was born in Kumla, Sweden, in 1969. He received the Master of Science degree in computer science and the Licentiate, Doctoral, and Docent degrees in electronics systems from Linköping University, Sweden, in 1995, 1997, 1998, and 2001, respectively. During 1998 and 1999 he held a post doctoral position at Signal Processing Laboratory, Tampere University of Technology, Finland. He is currently a Professor in electronics systems at the Department of Electrical Engineering of Linköping University. His research interests include theory, design, and implementation of signal processing systems. He is the author or coauthor of four textbooks and more than 100 international journal and conference papers. He has served/serves as an Associate Editor for the IEEE Trans. on Circuits and Systems-II (2000-2001), IEEE Signal Processing Letters (2004–2007), and IEEE Trans. Signal Processing (2006–2008), and he is a Member of the IEEE Int. Symp. Circuits. Syst. DSP track committee.

Per Löwenborg was born in Oskarshamn, Sweden, in 1974. He received the Master of Science degree in applied physics and electrical engineering and the Licentiate, and Doctoral degrees in electronics systems from Linköping University, Sweden, in 1998, 2001, and 2002, respectively. His research interests are within the field of theory, design, and implementation of analog and digital signal processing electronics. He is the author or coauthor of one book and more than 50 international journal and conference papers. He was awarded the 1999 IEEE Midwest Symposium on Circuits and Systems best student paper award and the 2002 IEEE Nordic Signal Processing Symposium best paper award. He is a Member of the IEEE.
doi:10.1155/2007/51806
Research Article
Wavelets in Recognition of Bird Sounds
Arja Selin, Jari Turunen, and Juha T. Tanttu
Department of Information Technology, Tampere University of Technology, Pori, P.O. Box 300, 28101 Pori, Finland
Received 9 September 2005; Revised 30 May 2006; Accepted 22 June 2006
This paper presents a novel method to recognize inharmonic and transient bird sounds efficiently. The recognition algorithm
consists of feature extraction using wavelet decomposition and recognition using either a supervised or an unsupervised classifier. The
proposed method was tested on sounds of eight bird species, of which five species have inharmonic sounds and three reference
species have harmonic sounds. Inharmonic sounds are not well matched to the conventional spectral analysis methods, because
the spectral domain does not include any visible trajectories that a computer can track and identify. Thus, the wavelet analysis was
selected due to its ability to preserve both frequency and temporal information, and its ability to analyze signals which contain
discontinuities and sharp spikes. The shift invariant feature vectors calculated from the wavelet coefficients were used as inputs of
two neural networks: the unsupervised self-organizing map (SOM) and the supervised multilayer perceptron (MLP). The results
were encouraging: the SOM network recognized 78% and the MLP network 96% of the test sounds correctly.
Copyright © 2007 Arja Selin et al. This is an open access article distributed under the Creative Commons Attribution License,
1. INTRODUCTION

Nearly all birds make different kinds of sounds which are used in communication with other conspecifics and also between different species. Sounds are only produced when needed, and so all the sounds have some meaning [1, 2]. Most sounds are produced by the syrinx, which is the avian vocal organ [3]. In most species the syrinx is bipartite, so the bird can produce two notes simultaneously [4, 5]. Bird sounds can be tonal or inharmonic, which is one way to divide the bird species into groups. Inharmonic sounds are often transient and their frequency contents are very near each other. Bird vocalization contains both songs and calls. Calls are shorter and simpler than songs, and both sexes produce them throughout the year. It seems that most birds have from 5 to 15 distinct calls, and the functions of them can be, for example, flight, alarm, excitement, and so on. Some birds can have several different calls for the same function, whereas some birds use very similar calls in different circumstances to mean different things. In addition, in many species there is high individual and regional variability in phrases and song patterns [6–9]. Thus, two kinds of bird sound variability have to be taken into account in the classification. One is the variation of different sound types and another is the variation across geographic regions and among individuals.

Human ear and brain constitute an effective voice recognition system. For the human ear it is relatively easy to notice even subtle differences in sounds, whereas for the computer the recognition task is much more difficult. In bird sound research, the typical methods of classification have been listening and visual assessment of spectrograms. However, human decision is always subjective. So, the automatization of this classification process would be an important new tool for bioacoustic research [10]. Automatic classification offers new possibilities for the identification of vocal groups of birds, and may also give new tools for the classification of the sounds of other animals.

Classification of bird sounds has been studied a lot and its application range includes, for example, bird census and taxonomy [11–13]. Nevertheless, only a few studies exist where the identification of bird species by their sound is made automatically [14–19]. Most of these studies, for example, [14, 17], have focused on tonal and harmonic sounds, and are based on conventional spectral analysis methods. These methods are not well matched to inharmonic and transient sounds. In [19] inharmonic bird sounds have been classified using 19 low-level parameters of syllables. It seems, however, that the number of parameters is probably too high for an efficient recognition algorithm.

The aim of our study was to develop a computationally effective recognition method for inharmonic bird sounds, and to investigate the applicability of the wavelet analysis for

this task. The wavelet analysis has gained a great deal of attention in the field of digital signal processing [20]. It has many advantages, for example, its ability to find out both frequency and temporal information, and to analyze signals which contain discontinuities and sharp spikes. These properties are appropriate for inharmonic and transient bird sounds. In the wavelet packet transform the original signal is converted into wavelet coefficients. The orthogonal wavelet packets can be designed by hierarchical association of PR (perfect reconstruction) paraunitary filter banks [21]. Because the number of the coefficients is usually large after the decomposition and because using all wavelet coefficients as features will often lead to inaccurate results, the extraction of the most important features is essential. The feature extraction from wavelet coefficients has been studied, for example, in [22, 23]. In spite of the many advantages of the wavelet transform, it also has a disadvantage: it is time dependent. To avoid this problem, four shift invariant parameters were used as features in this study.

Artificial neural networks (ANNs) are being applied to pattern recognition and have successfully been used in the automated classification of acoustic signals including animal sounds [24–27]. The ANNs have also been used in the classification and recognition of bird sounds [28–30]. In this study, two commonly known neural networks, the unsupervised self-organizing map (SOM) and the supervised multilayer perceptron (MLP), were selected as the classifiers due to their ability to compensate discrepancies among the data. The distinguishability of bird species was first examined with the SOM, which is essentially a clustering algorithm, and after that the sound data was classified using the MLP.

2. METHODS

The model of the whole recognition process is presented in Figure 1. During the preprocessing the noise was reduced from the soundtracks. Then the soundtracks were segmented into smaller pieces which are called sounds in the sequel. During the postprocessing the sounds were checked manually. All the sounds were decomposed into the wavelet coefficients using the wavelet packet decomposition (WPD). The features were calculated from these wavelet coefficients and the feature vectors were composed. The feature vectors of the training data were introduced to the MLP and the SOM networks during the training phase. Finally, both networks were tested on separate testing data and the recognition results were examined. Altogether, the phases of the recognition process were automatic, except the checking of the sounds, which was made manually.

Figure 1: The recognition process. (Blocks: preprocessing, segmentation, postprocessing, wavelet decomposition, feature calculation, network training, network testing, recognition results.)

2.1. Preprocessing, segmentation, and postprocessing

During the preprocessing the zero mean data was normalized in the range [−1, 1], and the low-frequency wind noise was reduced using a long moving average filter. Because the noise level varied a lot between the sound tracks, the noise threshold level was calculated adaptively from long-term mean energy value during the segmentation. The soundtracks were extracted automatically into smaller pieces identifying the beginning and ending of each call. The soundtrack was clipped if the onset of the sound exceeded the adaptive threshold level and the end of the sound dropped under that threshold value.

During the postprocessing the interfering broadband noise was reduced from the sound signal, s, using the eight-band filter bank (cf. Figure 2).

Figure 2: The noise reduction using the filter bank.

The outputs $\hat{s}_i(n)$ from the thresholding blocks were calculated as

$$\hat{s}_i(n) = \begin{cases} 0 & \text{if } |s_i(n)| < \mathrm{Th}_0, \\ \operatorname{sgn}\bigl(s_i(n)\bigr)\bigl(|s_i(n)| - \mathrm{Th}_0\bigr) & \text{else}, \end{cases} \quad \text{for } i = 1, \ldots, 8, \qquad (1)$$

where the threshold value Th0 was defined as 2 times the standard deviation of the output s8 after preliminary tests. Reduction of the noise emphasized the essential information of the bird sound. At the end of the postprocessing all sounds were checked manually and verified consistently. A few sounds were recorded in a very noisy environment or they were in inseparable groups, and were therefore rejected during the manual checking.

2.2. Wavelet packet decomposition

The wavelet packet analysis was used for the signal decomposition [31, 32]. In the WPD the signal s is split into approximation (A) and detail (D) parts. Due to the downsampling, aliasing occurs in the WPD tree. This aliasing changes the
frequency order of some branches of the tree [33]. The symmetric wavelet decomposition tree is illustrated in Figure 3, where the WPD tree is put in an increasing frequency order from the left to the right.

Figure 3: The symmetric wavelet decomposition tree. The grey bins are used in the proposed method. (The tree splits the signal S into approximation (A) and detail (D) parts at each of the N = 6 levels, giving 64 bins at the bottom level.)
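The reordering mentioned above follows a reflected (Gray-code) pattern: each time a detail branch is split, the order of its two children is mirrored. The sketch below generates the node paths of a full WPD tree in increasing frequency order. It is an illustration of this standard construction, not the authors' code; the function name `frequency_ordered_paths` and the path notation ('a' for approximation, 'd' for detail, first character for the first split) are assumptions made here.

```python
def frequency_ordered_paths(level):
    """Return the node paths of a full wavelet packet tree at `level`,
    sorted by increasing frequency.

    Each path is a string of 'a' (approximation) and 'd' (detail);
    the first character is the first split applied to the signal.
    """
    if level < 1:
        raise ValueError("level must be at least 1")
    paths = ["a", "d"]
    for _ in range(level - 1):
        # Children of an approximation branch keep their order; children of
        # a detail branch are mirrored because of the aliasing caused by
        # downsampling the high-pass output.
        paths = ["a" + p for p in paths] + ["d" + p for p in reversed(paths)]
    return paths


# At level 2 the frequency order is aa, ad, dd, da (not aa, ad, da, dd).
print(frequency_ordered_paths(2))

# At level 6 this gives the 64 frequency-ordered bins used in the text.
print(len(frequency_ordered_paths(6)))
```

With such an ordering in hand, "bin r" can be read as the r-th entry of the list (counting from 1), which is the convention a frequency-ordered tree like the one in Figure 3 suggests.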
The preliminary tests showed that the best decomposition level (N) was six. Thus, the signal s was split into 2^6 = 64 parts, which are called bins in the sequel. Bin number 1 contained such low frequencies that it proved to be irrelevant for the recognition. Because the bins 33–64 also proved to be irrelevant, the wavelet coefficients were calculated from bins 2–32 marked grey in Figure 3.
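To give a feel for what bins 2–32 cover, the following sketch computes the nominal frequency band of each frequency-ordered bin. It assumes ideal half-band filters (real wavelet filters overlap between neighbouring bins) and uses the 44.1 kHz sampling rate stated for the sound data in Section 3; the function name `bin_band_edges` is hypothetical.

```python
def bin_band_edges(bin_number, level, fs):
    """Nominal frequency band (low, high) in Hz of a frequency-ordered
    WPD bin, assuming ideal half-band filters.

    Bins are numbered from 1, as in the text.
    """
    n_bins = 2 ** level
    if not 1 <= bin_number <= n_bins:
        raise ValueError("bin_number out of range")
    width = (fs / 2) / n_bins  # each bin covers an equal share of 0..fs/2
    return (bin_number - 1) * width, bin_number * width


FS = 44100.0   # sampling rate given in Section 3
LEVEL = 6      # decomposition level N

selected_bins = range(2, 33)  # bins 2-32, as used for the features
low, _ = bin_band_edges(selected_bins[0], LEVEL, FS)
_, high = bin_band_edges(selected_bins[-1], LEVEL, FS)
print(f"{len(selected_bins)} bins, about {low:.1f} Hz to {high:.1f} Hz")
```

Under these assumptions each bin is roughly 345 Hz wide, so the 31 selected bins span approximately 345 Hz to 11 kHz, the lower half of the available bandwidth without its lowest bin.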
There are several wavelet families that have proved to be particularly usable [34]. The Daubechies wavelet family (dbN) was selected, because in it both scaling and wavelet functions are compactly supported and they are orthogonal. The db10 wavelet was selected for the wavelet function, because the preliminary tests showed that it gave the best decomposition results of the tested alternatives with the selected bird sounds.

Figure 4: The four shift invariant features: maximum energy, position, spread, and width. The larger absolute values of the wavelet coefficients are presented with the darker color. (Axes: bins versus samples.)
2.3. Features

As mentioned before, the main disadvantage of the wavelet transform is its time dependence. That is why the four shift invariant parameters were selected as features. These four features, maximum energy, position, spread, and width are illustrated in Figure 4.

The number of the WPD coefficients of each bin is denoted as $n_c$. The bin energy $E_B(r)$ of the wavelet coefficients $c$ of bin $r$ was defined as

$$E_B(r) = \sum_{n=1}^{n_c} c^2(n, r), \quad r = 2, 3, \ldots, 32, \qquad (2)$$

and the average energy $\bar{E}_B(r)$ of each bin $r$ was defined as

$$\bar{E}_B(r) = \frac{E_B(r)}{n_c}. \qquad (3)$$

The largest average energy value

$$E_m = \max_r \bar{E}_B(r) \qquad (4)$$

was then searched, and it is called the maximum energy $E_m$ of the sound. The position $P$ represents the number of the bin $r$, in which the maximum energy was located.

The spread $S$ was calculated as

$$S = \frac{1}{\#J} \sum_{(q,r) \in J} c^2(q, r), \qquad (5)$$

where $q$ is the number of the sample and $r$ is the number of the bin. $J$ is a set of index pairs $(q, r)$ for which $c^2(q, r) > \mathrm{Th}_1(r)$. In (5) $\#J$ is the number of elements (cardinality) of the set $J$. So, the spread $S$ is a sum of the average energies of those coefficients whose energy exceeded the threshold value $\mathrm{Th}_1$. After the preliminary test with the data the threshold value $\mathrm{Th}_1(r)$ was calculated as

$$\mathrm{Th}_1(r) = \frac{\bar{E}_B(r)}{6} \qquad (6)$$

from the average energy $\bar{E}_B(r)$ of bin $r$.

The fourth feature, the width $W$ represents the number of bins which satisfy the inequality

$$\bar{E}_B(r) > \mathrm{Th}_2, \qquad (7)$$

where the threshold value $\mathrm{Th}_2$ was selected as 1.3 after preliminary tests with the data.

Finally all four features were normalized, in order to be comparable with one another. The normalization levels were defined after preliminary tests with the data. The maximum energy $E_m$ was normalized as

$$\tilde{E}_m = \frac{E_m}{n_B}, \qquad (8)$$
Table 1: Selected set of bird sounds used in this study.

Scientific abbr.  Scientific name         English name   Sound type     MLP training  SOM training  Testing
ANAPLA            Anas platyrhynchos      Mallard        Inharmonic     138           113           60
ANSANS            Anser anser             Greylag goose  Inharmonic     135           113           59
COTCOT            Coturnix coturnix       Quail          Tonal          190           113           83
CRECRE            Crex crex               Corncrake      Inharmonic     443           113           110
GLAPAS            Glaucidium passerinum   Pygmy owl      Pure harmonic  113           113           48
LOCFLU            Locustella fluviatilis  River warbler  Inharmonic     890           113           328
PICPIC            Pica pica               Magpie         Inharmonic     203           113           97
PORPOR            Porzana porzana         Spotted crake  Tonal          166           113           69
Total                                                                   2278          904           854
where $n_B$ is the number of the coefficients of the bin which exceeded the $\mathrm{Th}_1$. The position $P$ was normalized as

$$\tilde{P} = \frac{P}{2^N/4} = \frac{P}{16}. \qquad (9)$$

The spread $S$ was normalized as

$$\tilde{S} = \frac{S}{100} \qquad (10)$$

and the width $W$ as

$$\tilde{W} = \frac{W}{20}. \qquad (11)$$

Thus, $31 \times n_c$ WPD coefficients were reduced to four normalized features: maximum energy $\tilde{E}_m$, position $\tilde{P}$, spread $\tilde{S}$, and width $\tilde{W}$. These four features formed the final feature vector for recognition. The main reason for the normalization was the SOM, which yields better recognition results if the inputs are in the same scale. In addition, the training time of the SOM network is shorter with normalized inputs.

2.4. Classifiers

Two commonly known neural networks, unsupervised self-organizing map (SOM) [35] and supervised multilayer perceptron (MLP) [36], were used as classifiers. The neural networks were selected due to their ability to compensate discrepancies in the data. This is one way to deal with the individual and regional variability of bird vocalizations. The motivation for using unsupervised and supervised networks was to verify the predefined decisions of the supervised MLP against the unsupervised SOM, and to compare their relative performance. In the SOM the four-dimensional data was mapped into two-dimensional space. The SOM clusters the data so that neighbouring clusters are quite similar, while more distant clusters become increasingly diverse [35]. The low and high variability between the sounds of the species can be seen from the compactness of the clusters. Thus, in this study the distinguishability of the species was first examined with the SOM, and after that the classification was made with the MLP.

In the SOM training the calculated feature vectors were introduced to a 10 × 10-size SOM network. The other sizes, for example, 6 × 6, 8 × 8, and 12 × 12, of the network were also tested. However, the chosen size yielded best recognition results. The SOM network was trained for up to 3000 epochs using the training data (cf. Table 1). The results did not improve although the number of the epochs was changed.

After preliminary tests, the selected MLP architecture was 4-15-40-3. Each output was finally rounded to 0 or 1, and then three output bits of each sound were converted into numbers 1–8, which was enough for classes of eight bird sounds. The MLP network was trained for up to 65 epochs and the mean square error goal was 0.0001. After the training, it became obvious that all the nodes, and the weighting and bias parameters of the MLP network were needed, which means that none of the outputs of the nodes was too close to zero. Both networks were tested on separate testing data after the training.

3. THE BIRD SOUND DATA

Our main purpose was to study the efficient recognition of inharmonic or transient bird sounds. The sampling rate of the sound data, Fs, was 44.1 kHz and 16-bit accuracy was used. The data was analyzed in the Matlab environment [37], and the Wavelet Toolbox [34] was utilized. The idea was to choose such bird species whose sounds are inharmonic and sounds which resemble one another. This is the reason why the inharmonic sounds of the mallard, the greylag goose, the corncrake, the river warbler and the magpie were selected. The sounds of the quail and the spotted crake are tonal, but contain some transient features, for example, irregular pitch period. The pure tonal territorial song of the male pygmy owl was chosen as a reference sound.

In the classification, the variation of different sound types in every species has to be taken into account by examining each sound type separately. That is why only one type of call of each species was used in this study. However, several types of calls of the greylag goose were included, because these calls are very similar to one another. Hence, it was
tested how the greylag goose can be recognized using many types of calls. In addition, a sufficient number of recordings of those eight species was available quite easily and the quality of the recordings was sufficient. The data of the selected eight species is summarized in Table 1. The table contains scientific abbreviations and names, English names, and sound types. Also the number of sounds in the training and testing is indicated.

The sounds were recorded in Finland by Pertti Kalinainen, Ilkka Heiskanen, and Jan-Erik Bruun. There were 3132 sounds in total, which were divided into training data (2278 sounds) and testing data (854 sounds). The training and testing data were from different tracks. It turned out that if there were the same number of training data of each group, the SOM network yielded better results. Thus, in the case of the SOM network the training data was reduced to 113 samples per species.

The typical spectrograms and corresponding wavelet coefficient figures of eight species that were used in this study are presented in Figure 5. As can be seen, the wavelet transform compresses the energy of the coefficients more than traditional Fourier transform in spectrograms. Only the very essential information is preserved after the WPD.

4. RESULTS

4.1. Results using the SOM

The clustering result of the SOM network after training is illustrated in Figure 6.

The areas marked with letters present how sounds of each bird species were situated in the 10 × 10 SOM network (cf. Section 2.4) after the overlapping nodes had been analyzed. The SOM network was examined node by node and the outliers were labelled. The species which had most sounds in a particular node won and the possible other sounds were classified as outliers. If two or more different species had the same number of sounds in a particular node, all were classified as outliers. If no species won, the node was classified as unspecified. If no sound was situated in the node, it was classified as an empty node. Unspecified nodes are marked with black color and empty nodes with grey color in Figure 6. In the SOM, compact clusters represent the species with little variation between sounds, and, respectively, the scattered clusters represent the species with large variation. As can be seen, for example, the test sounds of the river warbler (R) form a compact and uniform area, whereas the sounds of the greylag goose (G) spread out in a broad area. The SOM clustered 87% of training sounds correctly.

The confusion matrix of Table 2 illustrates the recognition result of the SOM network after the trained network had been tested on the test sounds. The rows of the confusion matrix show how each species is recognized. All the test sounds of the river warbler (LOCFLU) were recognized correctly, as can be seen from the diagonal of the matrix. Altogether, 7% of the test sounds were unspecified and 15% were recognized wrongly. It should be noticed that only 51% of the sounds of the greylag goose were recognized correctly, and 23% of the sounds were recognized unspecified. That might result from the fact that several types of calls of the greylag goose were included in the study. Altogether, 92 sounds of all 854 test sounds were recognized wrongly. A total of 78% of the test sounds were recognized correctly with the SOM network.

4.2. Results using the MLP

Table 3 contains the recognition result of the MLP network. All the test sounds of the quail (COTCOT) and the spotted crake (PORPOR) were recognized correctly. Again, the recognition result of the sounds of the greylag goose was poor, and the reason might be the same as with the SOM network. Twenty-four sounds of all the test sounds were recognized wrongly. Altogether, 96% of the test sounds of the eight bird species were recognized correctly with the MLP network.

5. DISCUSSION AND CONCLUSIONS

Our purpose was to study how inharmonic and transient bird sounds can be recognized efficiently. The results of this study are very encouraging. The results indicate that it is possible to recognize bird sounds of the test species using neural networks with only four features calculated from the wavelet packet decomposition coefficients.

Segmentation plays an important role in sound recognition, because incorrectly segmented sounds will probably be classified wrongly. In most cases, segmentation is the most complicated and challenging part of the whole recognition process. However, it is quite difficult to make it totally automatic. Noise reduction goes hand in hand with successful segmentation. The segmentation is even more difficult if the sound tracks are very noisy. In this study the segmentation and noise reduction were implemented so that the original sound information of the target species remained as intact as possible. After the automatic segmentation, all the sounds were checked manually. The noise reduction was done using an eight-band filter bank, which reduced the irrelevant noise information and emphasized the essential information of the bird sound. The main purpose of the preprocessing was to control the signal quality so that all sounds were comparable with each other.

The selection of the wavelet function and the decomposition level are the most important phases of the WPD. In this study the db10 wavelet was selected for the wavelet function and the level of the decomposition was selected to be six after preliminary testing. The preliminary tests were used because the authors do not know any reliable algorithm for selecting the wavelet function and the decomposition level properly. The preliminary tests indicated that the db10 wavelet function and the 6th decomposition level gave the best decomposition results with selected bird sounds.

The four features were calculated from the wavelet packet decomposition coefficients. Many kinds of other features were calculated from the coefficients and they were also tested. However, the chosen four features: maximum energy,
[Figure 5, panels (a) to (p): for each species (ANAPLA, ANSANS, COTCOT, CRECRE, GLAPAS, LOCFLU, PICPIC, PORPOR), a spectrogram (Frequency (kHz) versus Samples) and the corresponding wavelet coefficient plot (Bins versus Samples).]
Figure 5: (a), (c), (e), (g), (i), (k), (m), and (o) typical spectrograms and (b), (d), (f), (h), (j), (l), (n), and (p) corresponding wavelet
coefficients of the eight species used in this study are presented. The frequency and bins are bounded to 11.025 kHz (Fs/4), because at the
higher frequencies there was no essential information. In the spectrograms the darker colors represent the higher energies of the sound.
Correspondingly, the larger absolute values of the coefficient are presented with the darker color in the adjacent wavelet coefficient figures.
The range of the coefficients is [−5, 5].
position, spread, and width, described and separated the sounds of the eight bird species best.
The data of the eight bird species that was used in this study was divided so that there were about 70% training data and 30% testing data. Both networks, the SOM and the MLP, were first trained and then tested on separate data. The training data contained very probably sounds of seven mallard, nine graylag goose, three quail, eight corncrake, five pygmy owl, two river warbler, six magpie, and three spotted crake individuals. The testing data was selected from tracks different from the training data and it was also very probably from different individuals. So, the testing data consisted of
Table 2: The confusion matrix in percentage terms when using the SOM network.
% ANAPLA ANSANS COTCOT CRECRE GLAPAS LOCFLU PICPIC PORPOR Unspecified

ANAPLA 78 20 0 0 0 0 0 0 2
ANSANS 24 51 0 0 0 0 0 2 23
COTCOT 0 0 87 0 0 0 8 4 1
CRECRE 0 0 0 83 0 0 1 0 16
GLAPAS 0 15 0 0 75 0 0 0 10
LOCFLU 0 0 0 0 0 100 0 0 0
PICPIC 1 0 2 1 0 0 58 38 0
PORPOR 0 0 0 0 0 0 9 91 0
Table 3: The confusion matrix in percentage terms when using the MLP network.
% ANAPLA ANSANS COTCOT CRECRE GLAPAS LOCFLU PICPIC PORPOR

ANAPLA 98 2 0 0 0 0 0 0
ANSANS 2 83 1.7 5.1 1.7 5.1 1.7 0
COTCOT 0 0 100 0 0 0 0 0
CRECRE 1 2 0 96 0 0 1 0
GLAPAS 0 2 0 0 96 2 0 0
LOCFLU 0 0.3 0 0 0 99.7 0 0
PICPIC 0 0 5 1 0 0 94 0
PORPOR 0 0 0 0 0 0 0 100
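As a rough cross-check of Tables 2 and 3, the short sketch below averages the diagonal entries of each confusion matrix. The unweighted mean over species is an assumption made here for illustration; the overall rates quoted in the text may instead have been weighted by the number of test sounds per species, which the tables do not give.

```python
# Diagonals of Tables 2 and 3: percentage of each species' test sounds
# that was recognized correctly (values copied from the tables).
SPECIES = ["ANAPLA", "ANSANS", "COTCOT", "CRECRE",
           "GLAPAS", "LOCFLU", "PICPIC", "PORPOR"]
SOM_DIAGONAL = [78, 51, 87, 83, 75, 100, 58, 91]
MLP_DIAGONAL = [98, 83, 100, 96, 96, 99.7, 94, 100]


def mean_per_species_rate(diagonal):
    """Unweighted mean of the per-species rates (assumption: equal weights)."""
    return sum(diagonal) / len(diagonal)


som_mean = mean_per_species_rate(SOM_DIAGONAL)
mlp_mean = mean_per_species_rate(MLP_DIAGONAL)

# Species with the lowest recognition rate in each network.
som_worst = SPECIES[SOM_DIAGONAL.index(min(SOM_DIAGONAL))]
mlp_worst = SPECIES[MLP_DIAGONAL.index(min(MLP_DIAGONAL))]

print(f"SOM: {som_mean:.1f}% (weakest: {som_worst})")
print(f"MLP: {mlp_mean:.1f}% (weakest: {mlp_worst})")
```

Under this equal-weight assumption the means round to 78% for the SOM and 96% for the MLP, and the greylag goose (ANSANS) is the weakest class for both networks.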
sounds of two mallard individuals, four graylag goose, two quail, two corncrake, and two pygmy owl individuals, and one river warbler, one magpie, and one spotted crake individual.

[Figure 6 node labels, row by row (unspecified and empty nodes not shown):
P P P P A G G G G
P P G A A G A G G
P P P P G A A G A G
P P P G G A A G G
G Q G A A A A G
Q S S S C G G A
Q S S S S M R R
Q M S S M R R R C
Q M M M S S R R C
Q Q Q Q Q M R R C C
Legend: A = ANAPLA, mallard; G = ANSANS, greylag goose; Q = COTCOT, quail; C = CRECRE, corncrake; P = GLAPAS, pygmy owl; R = LOCFLU, river warbler; M = PICPIC, magpie; S = PORPOR, spotted crake; unspecified node; empty node.]
Figure 6: The clustering result of the 10 × 10 SOM network after training.

In conclusion, the SOM classified 78% and the MLP 96% of the test sounds correctly. After the testing of both networks, all wrongly recognized sounds were manually examined and labelled. The test result showed that 24 sounds were recognized wrongly using the MLP network. In the SOM network 39 of the test sounds were unspecified and 92 sounds were recognized wrongly. After plotting and examining all the wavelet packet coefficient figures of the misrecognitions, the reason for most of the wrong recognitions became obvious. Firstly, the coefficient pattern of the misrecognitions was shifted so that two features, the position and the width, were off. Secondly, the wrong recognition resulted presumably from false segmentation or a low signal-to-noise ratio.
The proposed method provides quite a robust approach to sound recognition, particularly to the inharmonic and transient bird sounds. The variability among the bird sounds within and between the species was taken into account using neural networks in the classification. The sounds of the selected eight species vary only slightly. Also, the variation across geographic regions was insignificant, because all the sounds were recorded in Finland.
In conclusion, the results presented in this paper are very encouraging. They indicated that it is possible to recognize bird sounds using neural networks with only four features calculated from the wavelet packet coefficients. Although the neural networks have many benefits, such as their ability to learn and therefore generalize the variability of the data, there is a long way to go before the recognition system beats the human ear. When using neural networks in the pattern
classification, there has to be a fixed number of classes into which activations are classified. Hence, the disadvantage of the neural networks is the fixed number of output classes, that is, a closed set of species. When more species need to be classified, the network has to be retrained all over again before it can be tested on a new set of birds.
Although the tested algorithms proved to be quite robust recognition methods for a limited set of birds, the proposed method cannot beat a human expert listener. A human expert listener can identify birds with almost 100% accuracy by using a priori knowledge and environmental or other context-dependent information for classification, whereas our proposed method uses only a short recording without any other information. In [19] the inharmonic bird sounds were recognized with a nearest neighbor classifier using the Mahalanobis distance measure with 74% accuracy, whereas in this study the SOM classified 78% and the MLP 96% of the inharmonic bird sounds correctly. On the other hand, the results are not directly comparable to those of other methods, because the test set of birds was limited and the features were calculated differently.
The method tested in this study is intended for automatic monitoring of birds that are living in a predefined area, or of night-active or migratory birds whose probability of existence is known beforehand. The continuous monitoring of the same birds is costly and time-consuming. Thus, the aid of automatic recognition in field work might be desirable. The algorithm must be fine-tuned so that it recognizes the predefined and limited set of birds correctly, either leaving out the uncertain or unknown sounds or storing them for manual checking.
Automatic recognition presents a new method for identifying and differentiating bird species by their sounds, and may also offer new tools for bird researchers. However, the automatic recognition of bird species is by no means an easy task. The fact that sounds and calls vary among species and that the same species might have many call types makes automatic recognition even more difficult. In this demanding task the wavelet transform has proven to be an efficient method to be taken into consideration.

6. ACKNOWLEDGMENTS
The authors would like to thank Pertti Kalinainen, Ilkka Heiskanen, and Jan-Erik Bruun for their recordings and Docent Mikko Ojanen for his helpful comments on biological issues. The authors also wish to thank the reviewers for their encouraging comments and suggestions. This research was funded by the Academy of Finland under research Grant 206652 and by the Ulla Tuominen’s Foundation.

REFERENCES
[1] C. K. Catchpole and P. J. B. Slater, Bird Song: Biological Themes and Variations, Cambridge University Press, Cambridge, UK, 1995.
[2] D. E. Kroodsma, The Singing Life of Birds: The Art and Science of Listening to Birdsong, Houghton Mifflin, Boston, Mass, USA, 2005.
[3] C. H. Greenewalt, Bird Song: Acoustics and Physiology, Smithsonian Institution Press, Washington, DC, USA, 1968.
[4] S. A. Zollinger, T. Riede, and R. A. Suthers, “Production of nonlinear phenomena in the Northern Mockingbirds (Mimus polyglottos),” in Proceedings of the 1st International Conference on Acoustic Communication by Animals, pp. 283–284, College Park, Md, USA, July 2003.
[5] R. A. Suthers, G. Beckers, S. A. Zollinger, E. Vallet, and M. Kreuzer, “Mechanisms of vocal complexity in birds,” in Proceedings of the 1st International Conference on Acoustic Communication by Animals, pp. 237–238, College Park, Md, USA, July 2003.
[6] J. W. Bradbury, “Parrots and technology,” in Proceedings of the 1st International Conference on Acoustic Communication by Animals, pp. 29–30, College Park, Md, USA, July 2003.
[7] M. C. Baker and D. M. Logue, “Population differentiation in a complex bird sound: a comparison of three bioacoustical analysis procedures,” Ethology, vol. 109, no. 3, pp. 223–242, 2003.
[8] J. G. Groth, “Call matching and positive assortative mating in red crossbills,” The Auk, vol. 110, no. 2, pp. 398–401, 1993.
[9] M. S. Robb, “Introduction to vocalizations of crossbills in Northwestern Europe,” Dutch Birding, vol. 22, no. 2, pp. 61–107, 2000.
[10] V. B. Deecke and V. M. Janik, “Automated categorization of bioacoustic signals: avoiding perceptual pitfalls,” Journal of the Acoustical Society of America, vol. 119, no. 1, pp. 645–653, 2006.
[11] A. M. Elowson and J. P. Hailman, “Analysis of complex variation: dichotomous sorting of predator-elicited calls of the Florida scrub jay,” Bioacoustics, vol. 3, no. 4, pp. 295–320, 1991.
[12] J. G. Groth, “Resolution of cryptic species in Appalachian red crossbills,” The Condor, vol. 90, no. 4, pp. 745–760, 1988.
[13] S. F. Lovell and M. R. Lein, “Song variation in a population of Alder Flycatchers,” Journal of Field Ornithology, vol. 75, no. 2, pp. 146–151, 2004.
[14] A. Härmä, “Automatic identification of bird species based on sinusoidal modelling of syllables,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’03), vol. 5, pp. 545–548, Hong Kong, April 2003.
[15] A. Härmä and P. Somervuo, “Classification of the harmonic structure in bird vocalization,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’04), vol. 5, pp. 701–704, Montreal, Quebec, Canada, May 2004.
[16] N. Mesgarani and S. Shamma, “Bird call classification using multiresolution spectrotemporal auditory model,” in Proceedings of the 1st International Conference on Acoustic Communication by Animals, pp. 155–156, College Park, Md, USA, July 2003.
[17] J. T. Tanttu, J. Turunen, A. Selin, and M. Ojanen, “Automatic feature extraction and classification of crossbill (Loxia spp.) flight calls,” Bioacoustics, vol. 15, no. 3, pp. 251–269, 2006.
[18] P. Somervuo and A. Härmä, “Bird song recognition based on syllable pair histograms,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’04), vol. 5, pp. 825–828, Montreal, Quebec, Canada, May 2004.
[19] S. Fagerlund and A. Härmä, “Parametrization of inharmonic bird sounds for automatic recognition,” in Proceedings of the 13th European Signal Processing Conference (EUSIPCO ’05), Antalya, Turkey, September 2005, Proceedings on CD-ROM.
[20] O. Rioul and M. Vetterli, “Wavelets and signal processing,” IEEE Signal Processing Magazine, vol. 8, no. 4, pp. 14–38, 1991.
[21] A. K. Soman and P. P. Vaidyanathan, “Paraunitary filter banks and wavelet packets,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’92), pp. 397–400, San Francisco, Calif, USA, March 1992.
[22] S. Pittner and S. V. Kamarthi, “Feature extraction from wavelet coefficients for pattern recognition tasks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 1, pp. 83–88, 1999.
[23] R. Learned, “Wavelet packet based transient signal classification,” M.S. thesis, Massachusetts Institute of Technology, Cambridge, Mass, USA, 1992.
[24] S. M. Phelps and M. J. Ryan, “Neural networks predict response biases of female tungara frogs,” Proceedings of the Royal Society: Biological Sciences (Series B), vol. 265, no. 1393, pp. 279–285, 1998.
[25] V. B. Deecke, J. K. B. Ford, and P. Spong, “Quantifying complex patterns of bioacoustic variation: use of a neural network to compare killer whale (Orcinus orca) dialects,” The Journal of the Acoustical Society of America, vol. 105, no. 4, pp. 2499–2507, 1999.
[26] J. Placer and C. N. Slobodchikoff, “A fuzzy-neural system for identification of species-specific alarm calls of Gunnison’s prairie dogs,” Behavioural Processes, vol. 52, no. 1, pp. 1–9, 2000.
[27] A. Thorn, “Artificial neural networks for vocal repertoire analysis,” in Proceedings of the 1st International Conference on Acoustic Communication by Animals, pp. 245–246, College Park, Md, USA, July 2003.
[28] A. L. McIlraith and H. C. Card, “Birdsong recognition using backpropagation and multivariate statistics,” IEEE Transactions on Signal Processing, vol. 45, no. 11, pp. 2740–2748, 1997.
[29] A. M. R. Terry and P. K. McGregor, “Census and monitoring based on individually identifiable vocalizations: the role of neural networks,” Animal Conservation, vol. 5, no. 2, pp. 103–111, 2002.
[30] P. Somervuo and A. Härmä, “Analyzing bird song syllables on the self-organizing map,” in Proceedings of the Workshop on Self-Organizing Maps (WSOM ’03), Hibikino, Japan, September 2003, Proceedings on CD-ROM.
[31] A. Boggess and F. J. Narcowich, A First Course in Wavelets with Fourier Analysis, Prentice-Hall, Upper Saddle River, NJ, USA, 2001.
[32] I. Daubechies, Ten Lectures on Wavelets, SIAM, Philadelphia, Pa, USA, 1992.
[33] A. N. Akansu and R. A. Haddad, Multiresolution Signal Decomposition: Transforms, Subbands, and Wavelets, Academic Press, Boston, Mass, USA, 1992.
[34] M. Misiti, Y. Misiti, G. Oppenheim, and J.-M. Poggi, Wavelet Toolbox for Use with Matlab, MathWorks, Natick, Mass, USA, 2000.
[35] T. Kohonen, Self-Organizing Maps, Springer, Berlin, Germany, 2001.
[36] S. Haykin, Neural Networks: A Comprehensive Foundation, Macmillan College, New York, NY, USA, 1994.
[37] MathWorks, “Matlab Software Homepage,” June 2005, http://www.mathworks.com.

Arja Selin was born in Janakkala, Finland, on May 2, 1970. She received her M.S. degree in 2005. Currently she is preparing her doctoral thesis in signal processing and pattern recognition.

Jari Turunen received his M.S. and Ph.D. degrees in 1998 and 2003, respectively, from Tampere University of Technology. He currently works as a Senior Researcher at Tampere University of Technology, Pori. His current research interests cover topics such as speech and signal processing.

Juha T. Tanttu was born in Tampere, Finland, on November 25, 1957. He received his M.S. and Ph.D. degrees in electrical engineering from Tampere University of Technology in 1980 and 1987, respectively. From 1984 to 1992, he held various teaching and research positions at the Control Engineering Laboratory of Tampere University of Technology. He currently holds Professorship of Information Technology at Tampere University of Technology, Pori.
doi:10.1155/2007/71948
Research Article
Subband Approach to Bandlimited Crosstalk Cancellation
System in Spatial Sound Reproduction
Mingsian R. Bai and Chih-Chung Lee
Department of Mechanical Engineering, National Chiao-Tung University, 1001 Ta-Hsueh Road, Hsin-Chu 300, Taiwan
Received 27 December 2005; Revised 1 May 2006; Accepted 16 July 2006
A crosstalk cancellation system (CCS) plays a vital role in spatial sound reproduction using multichannel loudspeakers. However,
this technique is not yet widely used in practical applications because of its heavy computational load. To reduce this load,
a bandlimited CCS based on a subband filtering approach is presented in this paper. A pseudoquadrature mirror filter
(QMF) bank is employed in the implementation of CCS filters, which are bandlimited to 6 kHz, where human localization is the
most sensitive. In addition, a frequency-dependent regularization scheme is adopted in designing the CCS inverse filters. To justify
the proposed system, subjective listening experiments were undertaken in an anechoic room. The experiments include two parts:
the source localization test and the sound quality test. Analysis of variance (ANOVA) is applied to process the data and assess
the statistical significance of the subjective experiments. The results indicate that the bandlimited CCS performed comparably to the
fullband CCS, whereas the computational load was reduced by approximately eighty percent.
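Copyright © 2007 M. R. Bai and C.-C. Lee. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly
cited.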
1. INTRODUCTION

The fundamental idea of spatial audio reproduction is to synthesize a virtual sound image so that the listener perceives the signals reproduced at the ears as if they had been produced by a specific source located at an intended position relative to the listener [1, 2]. This attractive feature of spatial audio lends itself to an emerging audio technology with promising applications in mobile phones, personal computer multimedia, video games, home theater, and so forth.
Spatial audio is rendered either by headphones or by loudspeakers. Headphone reproduction is straightforward, but suffers from several shortcomings such as in-head localization, front-back reversal, and discomfort in wearing. While loudspeakers do not have the same problems as headphones, another issue adversely affects the performance of spatial audio rendering using loudspeakers. The issue associated with loudspeakers is the crosstalk in the contralateral paths from the loudspeakers to the listener’s ears, which may obscure the sense of source localization due to the Haas effect [3]. To overcome the problem, crosstalk cancellation systems (CCS) that seek to minimize, if not totally eliminate, crosstalk have been studied extensively by researchers [4–9]. Methods of designing CCS are divided into two kinds of approaches: time domain and frequency domain. Kirkeby and Nelson proposed an LS time-domain filtering method to approximate the desired inverse function [10]. In contrast to the time-domain method, which is time consuming for long filters, a fast frequency-domain deconvolution method offers more advantage in terms of computational speed [11].
Notwithstanding the preliminary success of CCS in the academic community, two problems seriously hamper the use of CCS in practical applications. One stems from the limited size of the so-called “sweet spot” in which CCS remains effective. The sweet spots are generally so small, especially at the lateral side, that a head movement of a few centimeters would completely destroy the cancellation performance. Two kinds of approaches can be used to address this problem: the adaptive design and the robust design. An example of adaptive CCS with a head tracker was presented in the work of Kyriakakis et al. [12] and Kyriakakis [13]. This approach dynamically adjusts the CCS filters by tracking the head position of the listener using optical or acoustical sensors. However, the approach has not been widely used because of the increased hardware and software complexity of the head tracker. On the other hand, instead of dynamically tracking the listener’s head, an alternative CCS design using fixed filters can be taken to create a “wide” sweet spot that accommodates larger
head movement. A well-known example of robust CCS is the “stereo dipole” presented by Kirkeby et al. [14]. Other approaches with multidrive loudspeakers have been suggested by Bai et al. [15], Takeuchi et al. [16], and Yang et al. [17, 18].
The other problem is the computational load due to multichannel filtering and long filters. In general, finer frequency resolution, that is, a long impulse response, is needed for excellent reproduction, especially in a reverberant room. The emphasis of this paper is placed on reducing the computational load. Considering the robustness against uncertainties of head-related transfer functions (HRTFs), head movement, and the head shadowing effect at high frequencies, the proposed CCS is bandlimited to frequencies below 6 kHz [19]. That is, the CCS only functions at low frequencies and the binaural signals are directly passed through at high frequencies. The bandlimited implementation approach suggested in [19] is more computationally demanding due to its fixed operating rate. In this work, we adopted a subband filtering technique based on a cosine modulated quadrature mirror filter (QMF) bank [20]. In this design, the approximate perfect reconstruction condition is fulfilled and the CCS is operated at a low rate. Therefore, it can spend more effort at low frequencies, in keeping with the characteristics of human perceptual hearing. Another feature of the proposed system is that the CCS filter is designed with frequency-dependent regularization [21]. The present approach, which differs from the methods using constant regularization [11], provides more flexibility in the design stage. In order to verify the proposed CCS, subjective listening experiments were conducted to compare it to the traditional CCS. The results of the subjective tests will be validated by using analysis of variance (ANOVA). The intention is to develop a CCS with a light computational load that performs comparably to the fullband CCS.

[Figure 1 block diagram: program input signals x(z); modeling delay z^{-m} and model M(z) giving the desired signals d(z); CCS filters C(z) giving the speaker input signals v(z); plant H(z) giving the reproduced signals w(z); error e(z).]
Figure 1: The block diagram of a multichannel model-matching problem in the CCS design.

2. MULTICHANNEL INVERSE FILTERING FOR CCS FROM A MODEL-MATCHING PERSPECTIVE

The CCS aims to cancel the crosstalk in the contralateral paths from the stereo loudspeakers to the listener’s ears so that the binaural signals are reproduced at the two ears like those reproduced using a headphone. This problem can be viewed from a model-matching perspective, as shown in Figure 1. In the block diagram, x(z) is a vector of Q program input signals, v(z) is a vector of P loudspeaker input signals, and e(z) is a vector of L error signals. M(z) is an L × Q matrix of the matching model, H(z) is an L × P plant transfer matrix, and C(z) is a P × Q matrix of the CCS filters. The $z^{-m}$ term accounts for the modeling delay to ensure causality of the CCS filters. Let us neglect the modeling delay for the moment; it is straightforward to write down the input-output relationship:

\[ \mathbf{e}(z) = \left[ \mathbf{M}(z) - \mathbf{H}(z)\mathbf{C}(z) \right] \mathbf{x}(z). \tag{1} \]

For arbitrary inputs, minimization of the error output is tantamount to the following optimization problem:

\[ \min_{\mathbf{C}} \left\| \mathbf{M} - \mathbf{H}\mathbf{C} \right\|_F^2, \tag{2} \]

where F symbolizes the Frobenius norm [22]. For an L × Q matrix A, the Frobenius norm is defined as

\[ \left\| \mathbf{A} \right\|_F^2 = \sum_{q=1}^{Q} \sum_{l=1}^{L} \left| a_{lq} \right|^2 = \sum_{q=1}^{Q} \left\| \mathbf{a}_q \right\|_2^2, \quad \mathbf{a}_q \text{ being the } q\text{th column of } \mathbf{A}. \tag{3} \]

Hence, the minimization problem of the Frobenius norm can be converted to the minimization problem of the 2-norm by partitioning the matrices into columns. Specifically, since there is no coupling between the columns of the matrix C, the minimization of the square of the Frobenius norm of the entire error matrix M − HC is tantamount to minimizing the square of the 2-norm of each column independently. Therefore, (2) can be rewritten into

\[ \min_{\mathbf{c}_q,\; q=1,2,\ldots,Q} \; \sum_{q=1}^{Q} \left\| \mathbf{H}\mathbf{c}_q - \mathbf{m}_q \right\|_2^2, \tag{4} \]

where $\mathbf{c}_q$ and $\mathbf{m}_q$ are the qth columns of the matrices C and M, respectively. The optimal solution of $\mathbf{c}_q$ can be obtained by applying the method of least squares to each column:

\[ \mathbf{c}_q = \mathbf{H}^{+} \mathbf{m}_q, \quad q = 1, 2, \ldots, Q, \tag{5} \]

where $\mathbf{H}^{+}$ is the pseudoinverse of H [22]. This optimal solution in the least-square sense can be assembled in a more compact matrix form:

\[ \left[ \mathbf{c}_1 \; \mathbf{c}_2 \; \cdots \; \mathbf{c}_Q \right] = \mathbf{H}^{+} \left[ \mathbf{m}_1 \; \mathbf{m}_2 \; \cdots \; \mathbf{m}_Q \right] \tag{6a} \]

or

\[ \mathbf{C} = \mathbf{H}^{+} \mathbf{M}. \tag{6b} \]

For a matrix H with full-column rank (L ≥ P), $\mathbf{H}^{+}$ can be calculated according to

\[ \mathbf{H}^{+} = \left( \mathbf{H}^{H} \mathbf{H} \right)^{-1} \mathbf{H}^{H}. \tag{7} \]
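The column-wise least-squares solution and the pseudoinverse formula can be checked numerically at a single frequency. The following is a minimal sketch using NumPy; the matrix sizes and the random H and M are hypothetical stand-ins chosen for illustration, not the measured plant of this paper.

```python
import numpy as np

rng = np.random.default_rng(0)
L, P, Q = 4, 2, 2  # hypothetical sizes with L >= P (full column rank case)

# Hypothetical complex plant matrix H (L x P) and matching model M (L x Q)
# at one frequency; random values are used purely for illustration.
H = rng.standard_normal((L, P)) + 1j * rng.standard_normal((L, P))
M = rng.standard_normal((L, Q)) + 1j * rng.standard_normal((L, Q))

# Eq. (7): left pseudoinverse, valid when H has full column rank.
H_pinv = np.linalg.inv(H.conj().T @ H) @ H.conj().T

# Eq. (6b): all columns of C at once.
C = H_pinv @ M

# Eq. (5): one independent least-squares problem per column of M.
C_columns = np.column_stack(
    [np.linalg.lstsq(H, M[:, q], rcond=None)[0] for q in range(Q)]
)

# Eq. (3): the squared Frobenius norm of the error matrix equals the
# sum of the squared 2-norms of its columns.
E = M - H @ C
frobenius_sq = np.linalg.norm(E, "fro") ** 2
column_sum_sq = sum(np.linalg.norm(E[:, q]) ** 2 for q in range(Q))

print(np.allclose(C, C_columns))                 # column-wise and matrix forms agree
print(np.allclose(H_pinv @ H, np.eye(P)))        # left-inverse property
print(np.isclose(frobenius_sq, column_sum_sq))   # norm decomposition
```

In a frequency-domain design this computation would be repeated for every frequency bin, which is why the columns of C can be solved independently.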
Here, $\mathbf{H}^{+}$ is also referred to as the left-pseudoinverse of H such that $\mathbf{H}^{+}\mathbf{H} = \mathbf{I}$.
In practice, the number of loudspeakers is usually greater than the number of ears, that is, L ≤ P. Regularization can be used to prevent the singularity of $\mathbf{H}^{H}\mathbf{H}$ from saturating the filter gains [11, 23]:

\[ \mathbf{H}^{+} = \left( \mathbf{H}^{H} \mathbf{H} + \gamma \mathbf{I} \right)^{-1} \mathbf{H}^{H}. \tag{8} \]

The regularization parameter γ can either be constant or frequency-dependent [21]. A frequency-dependent γ is based on a gain threshold on the maximum of the absolute values of all entries in C. If the threshold is exceeded, a larger γ should be chosen. The binary search method can be used to accelerate the search. It is noted that the procedure to obtain the filter C in (6) is essentially a frequency-domain formulation; an inverse Fourier transform along with a circular shift (hence the modeling delay) is needed to obtain causal FIR (finite impulse response) filters.

3. BANDLIMITED IMPLEMENTATION USING THE MULTIRATE APPROACH

Bandlimited implementation is chosen in this work for several reasons. First, the computational load is too high to afford a fullband (0 ∼ 20 kHz) implementation. For the example of the stereo loudspeaker considered herein, the CCS would contain 4 filters. If each filter has 3000 taps, the convolution would require 1.2 × 10^4 multiplications and additions per sample interval. Except on special-purpose DSP engines, real-time implementation of a fullband CCS is usually prohibitive at the sampling rates commonly used in audio processing, for example, 44.1 kHz or 48 kHz. Second, at high frequencies, the wavelength could be much smaller than a head width. Under this circumstance, the CCS would be extremely susceptible to misalignment of the listener’s head and uncertainties involved in HRTF modeling. Third, at high frequencies, a listener’s head provides natural shadowing for the contralateral paths, which is more robust than direct application of CCS. The CCS in this study is chosen to be bandlimited to 6 kHz (the wavelength at this frequency is approximately 5.6 cm). To accomplish this, a 4-channel pseudo-QMF bank is employed to divide the total audible frequency range into subbands for CCS and direct transmission, respectively.
The design strategy of the subband filter bank employed in this paper is the cosine modulated pseudo-QMF. In this method, an FIR filter must be selected as the prototype. Using this prototype, an M-channel maximally decimated filter bank (number of subbands = up/down sampling factor) is generated with the aid of cosine modulation. The maximum attenuation that can be attained by a perfectly reconstructing (PR) cosine modulated filter bank is about 40 dB. Nevertheless, this PR filter bank would still present an undesirable ringing problem. To alleviate this problem, the PR condition is relaxed in the FIR filter design to gain more stopband attenuation. From our experience, as much as 60 dB attenuation is required for acceptable reproduction.
Based on the method in [20], the following analysis and synthesis filter banks represented by $g_k(n)$ and $f_k(n)$, respectively, are employed to minimize phase distortion and aliasing:

\[ g_k(n) = 2 p_0(n) \cos\!\left[ \frac{\pi}{M} (k + 0.5) \left( n - \frac{N}{2} \right) + \theta_k \right], \tag{9} \]

\[ f_k(n) = g_k(N - n), \tag{10} \]

where $\theta_k = (-1)^k (\pi/4)$, $0 \le k \le M - 1$, and $p_0(n)$, $n = 1, 2, \ldots, N$ are the coefficients of the prototype FIR filter. The remaining problem is how to minimize the amplitude distortion. The distortion function T(z) for the filter bank is given as in [20]:

\[ T(z) = \frac{1}{M} \sum_{k=0}^{M-1} F_k(z) G_k(z). \tag{11} \]

The Z-transform of (10) leads to $F_k(z) = z^{-N} \tilde{G}_k(z)$, where $\tilde{G}_k(z)$ is the paraconjugation of $G_k(z)$. The distortion function can thus be written in the frequency domain as

\[ T\!\left(e^{j\omega}\right) = \frac{1}{M} e^{-j\omega N} \sum_{k=0}^{M-1} \left| G_k\!\left(e^{j\omega}\right) \right|^2. \tag{12} \]

A filter P(z) is called a Nyquist (M) filter if the following condition is met:

\[ p(Mn) = \begin{cases} c, & n = 0, \\ 0, & \text{otherwise}, \end{cases} \tag{13} \]

where p(n) is the impulse response of P(z) and c is a constant. In the frequency domain,

\[ \sum_{k=0}^{M-1} P\!\left(e^{j(\omega - 2\pi k/M)}\right) = Mc. \tag{14} \]

Equations (12) and (14) indicate that if $|G_k(e^{j\omega})|^2$ is a Nyquist (M) filter, or equivalently $|P_0(e^{j\omega})|^2$ is a Nyquist (2M) filter, the magnitude of T(z) will be flat.
In this QMF design, the Kaiser window is used as the FIR prototype [24]. Given the specifications of transition bandwidth $\Delta f$ and stopband attenuation $A_s$, the parameter β and the filter order N can be determined according to

\[ \beta = \begin{cases} 0.1102 \left( A_s - 8.7 \right) & \text{if } A_s > 50, \\ 0.5842 \left( A_s - 21 \right)^{0.4} + 0.07886 \left( A_s - 21 \right) & \text{if } 21 < A_s < 50, \\ 0 & \text{if } A_s < 21, \end{cases} \qquad N \approx \frac{A_s - 7.95}{14.36\, \Delta f}. \tag{15} \]

An optimization procedure is employed here to make $P_0(z)\tilde{P}_0(z)$ an approximate Nyquist (2M) filter, as posed by the following min-max problem [24]:

\[ \min_{\omega_c} \; \max_{n \neq 0} \; \left| \left[ p_0(n) * p_0(-n) \right]_{\downarrow 2M} \right|, \tag{16} \]
where the asterisk ∗ denotes the convolution operator. Because this is a convex problem, the optimal cutoff frequency can always be found [24]. After obtaining the optimal prototype filter, the analysis and synthesis filters are generated according to (9) and (10), respectively. The filter bank can be easily implemented with techniques such as the polyphase structure or the discrete cosine transform (DCT) [20].
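The design steps of this section (Kaiser prototype from (15), cutoff search for (16), and cosine modulation by (9) and (10)) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the order N = 119 (120 taps, the prototype length quoted in Section 4), the attenuation of 60 dB, and the coarse grid search over the cutoff are choices made here, and the aliasing terms of the decimated bank are not examined.

```python
import numpy as np

M = 4            # number of channels (the 4-channel bank of this section)
N = 119          # assumed filter order, giving 120 taps
A_S = 60.0       # assumed stopband attenuation in dB
BETA = 0.1102 * (A_S - 8.7)   # eq. (15), branch A_s > 50

n = np.arange(N + 1)


def prototype(cutoff):
    """Kaiser-windowed ideal lowpass p0(n); cutoff in rad/sample."""
    ideal = (cutoff / np.pi) * np.sinc((cutoff / np.pi) * (n - N / 2))
    return ideal * np.kaiser(N + 1, BETA)


def nyquist_error(cutoff):
    """Objective of eq. (16): largest |p0(n) * p0(-n)| at lags 2M, 4M, ..."""
    p0 = prototype(cutoff)
    autocorr = np.convolve(p0, p0[::-1])   # lag 0 sits at index N
    return np.max(np.abs(autocorr[N + 2 * M::2 * M]))


# Coarse grid search over the cutoff (an assumption made here; any
# one-dimensional minimizer could be used instead).
nominal = np.pi / (2 * M)
grid = np.linspace(nominal, 1.5 * nominal, 101)
best_cutoff = min(grid, key=nyquist_error)
p0 = prototype(best_cutoff)

# Eqs. (9) and (10): analysis filters g_k and synthesis filters f_k.
k = np.arange(M)[:, None]
theta = ((-1.0) ** k) * np.pi / 4
g = 2 * p0 * np.cos(np.pi / M * (k + 0.5) * (n - N / 2) + theta)
f = g[:, ::-1]                             # f_k(n) = g_k(N - n)

# Eq. (11) evaluated on the unit circle: T = (1/M) * sum_k F_k G_k.
G = np.fft.rfft(g, 4096, axis=1)
F = np.fft.rfft(f, 4096, axis=1)
T = (F * G).sum(axis=0) / M

print("cutoff / pi:", best_cutoff / np.pi)
print("peak-to-peak ripple of |T|:", np.ptp(np.abs(T)))
```

The printed ripple of |T| gives a quick indication of how closely the chosen cutoff approximates the flat-magnitude condition implied by (12) and (14).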
4. SUBJECTIVE EXPERIMENTS

In order to compare the performance of the proposed CCS and the fullband CCS, subjective experiments were undertaken in an anechoic room. The experimental arrangement is shown in Figure 2. This experiment employed a stereophonic two-way loudspeaker system, ELAC BS 103.2. The microphone and the preamplifier are GRAS 40AC and GRAS 26AM, respectively. The plant transfer function matrices were measured on an acoustical manikin, KEMAR (Knowles electronics manikin for acoustic research), along with the ear model, DB-065. The frequency responses of the plants are shown in Figure 3, wherein the solid line and dotted line represent the ipsilateral and the contralateral paths, respectively. Only responses measured on the right ear are shown because of the assumed symmetry. The x-axis is logarithmic frequency in Hz and the y-axis is magnitude in dB.

Figure 2: The experimental configuration.

[Figure 3 plot: Magnitude (dB) versus Frequency (Hz), 10^2 to 10^4 Hz; curves: ipsilateral path, contralateral path.]
Figure 3: The frequency responses of the plants including ipsilateral and contralateral paths.

The CCS filters with 3000 taps are designed according to the method presented in Section 2 with a 12 dB threshold. The matrix Q is defined as

\[ \mathbf{Q} = \begin{bmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{bmatrix} = \mathbf{H}\mathbf{C}. \tag{17} \]

This matrix attempts to approximate the model matrix M, which is set to be an identity matrix here. Figure 4(a) shows the frequency responses of $Q_{11f}$ and $Q_{12f}$, where the subscript f stands for the fullband method, represented as solid line and dotted line, respectively. After compensation, the ipsilateral magnitude is almost flat from 300 Hz to 8 kHz. Some mismatch can be seen at low frequencies and at high frequencies because the CCS filter gain is constrained, that
is, large regularization. On the other hand, the contralateral bandlimited CCS is 1500. In other words, the frequency (un-
magnitude is degraded to around −40 dB. Channel separa- der 6 kHz) resolution of the bandlimited CCS is twice than
tion, defined as the ratio of the contralateral response and that of the fullband CCS. That is, the bandlimited CCS has
the ipsilateral response, is employed as a performance index. finer resolution. Figure 7(a) shows the frequency responses
The channel separation, Q12 f /Q11 f , is shown in Figure 4(b) of Q11b and Q12b , where the subscript b stands for the ban-
as the dotted line. The solid line represents the natural chan- dlimited method, represented as solid line and dotted line,
nel separation, H12 /H11 . As mentioned above, the fullband respectively. The channel separation, Q12b /Q11b , is shown in
approach is impractical due to many reasons. The proposed Figure 7(b) as the dotted line. From Figures 4(b) and 7(b),
method in this work is bandlimited to 6 kHz with 48 kHz we can see that the bandlimited CCS gets better channel sep-
sampling rate. The block diagram of the bandlimited CCS is aration, especially from 100 Hz to 1 kHz.
illustrated in Figure 5. Through the use of the method pre- Subjective listening experiment includes two parts: the
sented in Section 3, the prototype FIR filter with 120 taps source localization test and the sound quality test. Eleven
and the analysis bank are plotted in Figures 6(a) and 6(b), subjects participated in the test. The listeners were instructed
respectively. The CCS only functions at the lowest band and to sit at the position where KEMAR was. In the first part,
operates at lower sampling rate. The computation load of an the test stimulus was a pink noise bandlimited to 20 kHz.
analysis bank or a synthesis bank equals to that of the pro- Each stimulus was played 5 times in 25 ms duration with
totype FIR filter when the polyphase structure is employed. 50 ms silent interval. Virtual sound images at 7 prespeci-
Since CCS operates at low rate, it is able to sample more fre- fied directions on the right horizontal plane with increment
quencies at design stage. In the experiment, the tap of the 30◦ azimuth are rendered by using HRTFs. Listeners were
10
G0 (z) 4 CCS 4 F0 (z)
0
G1 (z) 4 4 F1 (z)
10
G2 (z) 4 4 F2 (z)
Magnitude (dB)
20
G3 (z) 4 4 F3 (z)
30
40 Analysis bank synthesis bank
50 Figure 5: The block diagram of the bandlimited CCS.

60
70
102 103 104 0
Frequency (Hz)
The frequency responses of Q11 f 20
The frequency responses of Q12 f
Magnitude (dB)
(a) 40
60
10
0 80
10
100
Magnitude (dB)
20
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
30 Frequency (normalized by π)
40 (a)
50
0
60 G0 (z) G1 (z) G2 (z) G3 (z)
70 20
102 103 104
Magnitude (dB)
Frequency (Hz) 40
Natural channel separation
Compensated channel separation 60
(b)
80
Figure 4: (a) The frequency responses of Q11 f and Q12 f . (b) Natural
channel separation and compensated channel separation. 100
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

well trained by playing the stimuli of all angles prior to the Frequency (normalized by π)
test. The experiments were blind tests in which stimuli were (b)
played randomly without informing the subjects the source
direction. The results of localization test are shown in terms Figure 6: The magnitude responses of (a) prototype FIR filter and
of target angles versus judged angles in Figures 8(a) and 8(b), (b) analysis bank.
corresponding to the cases of fullband CCS and bandlimited
CCS. The size of each circle is proportional to the number of
the listeners who localized the same perceived angle. The 45- within the range 60◦ –120◦ . It is interesting to note that ban-
degree line indicates the perfect localization. It is observed dlimited CCS exists no back-front reversal problem which
from the results that subjects localized well at front (0 de- means that the subject localizes rear stimulus to front an-
gree) and back (180 degrees) no matter what approach is em- gle. In addition, a one-way analysis of variance (ANOVA)
ployed. While the fullband CCS performs well at 30-degree on the subjective localization result was conducted. These re-
angle, subjects were confused within the range 60◦ –120◦ . On sults were preprocessed into five levels of grade, as described
the other hand, bandlimited CCS performs slightly better in Table 1. Figure 9(a) shows the means and spreads (with
95% confidence intervals) of the grades for the two approaches. The mean of the bandlimited CCS is slightly larger than that of the fullband CCS, as observed previously. The ANOVA output reveals that the difference between the two approaches is not statistically significant (p = 0.2324 > 0.05).

Figure 7: (a) The frequency responses of $Q_{11b}$ and $Q_{12b}$. (b) Natural channel separation and compensated channel separation.

Figure 8: Results of the subjective localization test of azimuth. (a) Fullband CCS. (b) Bandlimited CCS.

Table 1: Description of five levels of grade for the subjective localization test.
Description                                                                           Grade
The judged angle is the same as the target angle                                      5.0
30° difference between the judged angle and the target angle                          4.0
Front-back reversal of the judged angle identical to the target angle                 3.0
30° difference between front-back reversal of the judged angle and the target angle   2.0
Otherwise                                                                             1.0

In the second part, the stimuli prefiltered by the fullband CCS and the bandlimited CCS were treated as the reference and the object, respectively. The “double-blind triple stimulus with hidden reference” method has been employed in this testing procedure [25]. One listener at a time was presented with three stimuli (“A,” “B,” and “C”) where “A” represented the reference and “B” and/or “C” represented the hidden reference and/or the object. A subject was requested to compare “B” to “A” and “C” to “A” with the five-grade impairment scale described in Table 2. The test stimuli contain three types of music including a bass (low frequency), a triangle (high frequency), and a popular song (comprehensive effect). Figure 9(b) shows the means and spreads (with 95% confidence intervals) of the grades for the two approaches. It seems that the fullband CCS earned a slightly higher grade than the subband approach since the fullband CCS was used as the reference. Nevertheless, the ANOVA test reveals that the performance difference between the two approaches is not statistically significant (p = 0.4109 > 0.05).

Table 2: Five-grade impairment scale.
Impairment                      Grade
Imperceptible                   5.0
Perceptible, but not annoying   4.0
Slightly annoying               3.0
Annoying                        2.0
Very annoying                   1.0

Figure 9: Means and spreads (with 95% confidence intervals) of the grades for two kinds of CCS approaches. (a) Grades of the source localization experiment. (b) Grades of the sound quality tests.

Here, the proposed method has been shown to perform comparably to the fullband CCS. In Table 3, the two approaches are compared in terms of computation loading, where MPU and APU represent multiplications and additions per unit time, respectively. The computation loadings are calculated using direct convolution in the time domain. The computation loading of the proposed subband filtering approach is reduced by approximately eighty percent compared to the conventional approach (from 12 000 to 1 980 multiplications per unit time). However, there are other fast convolution algorithms that can be adopted for efficient implementation. The overlap-add method of block convolution [26], for example, is compared in the simulation. This method is only used in the CCS filters, while the filter bank is still carried out by using direct convolution because of the efficient polyphase implementation. In the procedure of block convolution, the fast Fourier transform is used to realize the discrete Fourier transform. Moreover, the number of complex multiplications and additions of the fast Fourier transform is equal to N log2 N, where N is the number of transform points. After using block convolution, the results of computation loading are listed in Table 4.

Table 3: The comparison of computation loading of the fullband CCS and the bandlimited CCS with direct convolution.
       Fullband   Bandlimited
MPU    12 000     1 980
APU    11 998     1 976

Table 4: The comparison of computation loading of the fullband CCS and the bandlimited CCS with fast convolution.
       Fullband   Bandlimited
MPU    1 464      815
APU    1 462      808

The shuffler method can be applied due to the symmetry assumption. The shuffler structure is shown in Figure 10. It saves around fifty percent of computation [19]. The multichannel shuffler structure can be found in [18].

Figure 10: Shuffler filter structure for 2x2 CCS (inputs xL and xR, outputs vL and vR, filter blocks (C11 + C12)/2 and (C11 − C12)/2).

5. CONCLUSIONS

A bandlimited CCS based on subband filtering has been developed in this work. The intention is to establish a computationally efficient CCS without penalty on cancellation performance. The CCS is a bandlimited design which is effective up to 6 kHz. To achieve the bandlimited implementation, a pseudocosine modulated QMF is employed, allowing the CCS to operate at a low rate within an approximate PR structure. As a result, spatial audio processing can concentrate more on the low frequency range to better suit human perceptual hearing.

To compare the proposed CCS to traditional systems, subjective listening experiments were conducted in an anechoic room. The experiments include two parts: the source localization test and the sound quality test. By means of the techniques presented in Section 2, the fullband CCS operated at the sampling rate of 48 kHz requires four 3000-tap FIR filters. On the other hand, the bandlimited CCS operated at the sampling rate of 12 kHz requires only four 1500-tap FIR filters. The prototype FIR filter has 120 taps. The analysis bank and the synthesis bank are generated from the prototype and implemented via polyphase representation. The results of subjective tests processed by ANOVA indicate that the bandlimited CCS performs comparably to the fullband CCS not only in localization but also in sound quality. From Table 3, the computation loading of the proposed subband filtering approach is reduced by approximately eighty percent compared to the conventional approach. After employing the fast convolution algorithm, the difference between the two methods is reduced. Even though block convolution is very efficient, it requires more memory to store temporary data. In conclusion, which method is better depends on whether speed or memory is the primary concern. The bandlimited CCS with direct convolution and the shuffler method is an acceptable choice.

ACKNOWLEDGMENT

The work was supported by the National Science Council in Taiwan, under project number NSC94-2212-E009-019.

REFERENCES

[1] J. Blauert, Spatial Hearing: The Psychophysics of Human Sound Localization, MIT Press, Cambridge, Mass, USA, 1997.
[2] D. R. Begault, 3-D Sound for Virtual Reality and Multimedia, AP Professional, Cambridge, Mass, USA, 1994.
[3] A. Sibbald, “Transaural acoustic crosstalk cancellation,” Sensaura White Papers, 1999, http://www.sensaura.co.uk.
[4] M. R. Schroeder and B. S. Atal, “Computer simulation of sound transmission in rooms,” IEEE International Convention Record, vol. 11, no. 7, pp. 150–155, 1963.
[5] P. Damaske and V. Mellert, “A procedure for generating directionally accurate sound images in the upper-half space using two loudspeakers,” Acoustica, vol. 22, pp. 154–162, 1969.
[6] D. H. Cooper, “Calculator program for head-related transfer function,” Journal of the Audio Engineering Society, vol. 30, no. 1-2, pp. 34–38, 1982.
[7] W. G. Gardner, “Transaural 3D audio,” Tech. Rep. 342, MIT Media Laboratory, Cambridge, Mass, USA, 1995.
[8] D. H. Cooper and J. L. Bauck, “Prospects for transaural recording,” Journal of the Audio Engineering Society, vol. 37, no. 1-2, pp. 3–19, 1989.
[9] J. L. Bauck and D. H. Cooper, “Generalized transaural stereo and applications,” Journal of the Audio Engineering Society, vol. 44, no. 9, pp. 683–705, 1996.
[10] O. Kirkeby and P. A. Nelson, “Digital filter design for inversion problems in sound reproduction,” Journal of the Audio Engineering Society, vol. 47, no. 7, pp. 583–595, 1999.
[11] O. Kirkeby, P. A. Nelson, H. Hamada, and F. Orduna-Bustamante, “Fast deconvolution of multichannel systems using regularization,” IEEE Transactions on Speech and Audio Processing, vol. 6, no. 2, pp. 189–194, 1998.
[12] C. Kyriakakis, T. Holman, J.-S. Lim, H. Hong, and H. Neven, “Signal processing, acoustics, and psychoacoustics for high quality desktop audio,” Journal of Visual Communication and Image Representation, vol. 9, no. 1, pp. 51–61, 1998.
[13] C. Kyriakakis, “Fundamental and technological limitations of immersive audio systems,” Proceedings of the IEEE, vol. 86, no. 5, pp. 941–951, 1998.
[14] O. Kirkeby, P. A. Nelson, and H. Hamada, “The “stereo dipole” - a virtual source imaging system using two closely spaced loudspeakers,” Journal of the Audio Engineering Society, vol. 46, no. 5, pp. 387–395, 1998.
[15] M. R. Bai, C.-W. Tung, and C.-C. Lee, “Optimal design of loudspeaker arrays for robust cross-talk cancellation using the Taguchi method and the genetic algorithm,” Journal of the Acoustical Society of America, vol. 117, no. 5, pp. 2802–2813, 2005.
[16] T. Takeuchi, P. A. Nelson, and H. Hamada, “Robustness to head misalignment of virtual sound imaging systems,” Journal of the Acoustical Society of America, vol. 109, no. 3, pp. 958–971, 2001.
[17] J. Yang, W.-S. Gan, and S.-E. Tan, “Improved sound separation using three loudspeakers,” Acoustic Research Letters Online, vol. 4, no. 2, pp. 47–52, 2003.
[18] J. Yang, W.-S. Gan, and S.-E. Tang, “Development of virtual sound imaging system using triple elevated speakers,” IEEE Transactions on Consumer Electronics, vol. 50, no. 3, pp. 916–922, 2004.
[19] W. G. Gardner, 3-D Audio Using Loudspeakers, Kluwer Academic, London, UK, 1998.
[20] P. P. Vaidyanathan, Multirate Systems and Filter Banks, Prentice-Hall, Englewood Cliffs, NJ, USA, 1993.
[21] M. R. Bai and C.-C. Lee, “Development and implementation of cross-talk cancellation system in spatial audio reproduction based on subband filtering,” Journal of Sound and Vibration, vol. 290, no. 3–5, pp. 1269–1289, 2006.
[22] B. Noble, Applied Linear Algebra, Prentice-Hall, Englewood Cliffs, NJ, USA, 1988.
[23] A. Schuhmacher, J. Hald, K. B. Rasmussen, and P. C. Hansen, “Sound source reconstruction using inverse boundary element calculations,” Journal of the Acoustical Society of America, vol. 113, no. 1, pp. 114–127, 2003.
[24] Y.-P. Lin and P. P. Vaidyanathan, “A Kaiser window approach for the design of prototype filters of cosine modulated filterbanks,” IEEE Signal Processing Letters, vol. 5, no. 6, pp. 132–134, 1998.
[25] Rec. ITU-R BS.1116-1, “Method for the subjective assessment of small impairments in audio systems including multichannel sound systems,” International Telecommunications Union, Geneva, Switzerland, 1992–1994.
[26] A. V. Oppenheim, R. W. Schafer, and J. R. Buck, Discrete-Time Signal Processing, Prentice-Hall, Upper Saddle River, NJ, USA, 2nd edition, 1999.
Mingsian R. Bai was born in 1959 in Taipei,

Taiwan. He received the Bachelor’s degree
in power mechanical engineering from Na-
tional Tsing-Hwa University in 1981. He
also received the Master’s degree in busi-
ness management from National Chen-Chi
University in 1984. He left Taiwan in 1984
to enter graduate school of Iowa State Uni-
versity and later received the M.S. degree
in mechanical engineering in 1985 and the
Ph.D. degree in engineering mechanics and aerospace engineering
in 1989. In 1989, he joined the Department of Mechanical Engi-
neering of National Chiao-Tung University in Taiwan as an Asso-
ciate Professor and became a Professor in 1996. He was also a Vis-
iting Scholar to Center of Vibration and Acoustics, Penn State Uni-
versity, University of Adelaide, Australia, and Institute of Sound
and Vibration Research (ISVR), UK, in 1997, 2000, and 2002, re-
spectively. His current interests encompass acoustics, audio signal
processing, electroacoustic transducers, vibroacoustic diagnostics,
active noise and vibration control, and so forth. He has over 100
published papers and 13 granted or pending patents. He is a Mem-
ber of the Audio Engineering Society (AES), Acoustical Society of
America (ASA), Acoustical Society of Taiwan, and Vibration and
Noise Control Engineering Society in Taiwan.
Chih-Chung Lee was born in 1979 in Taipei, Taiwan. He received the B.S. degree and the M.S. degree in mechanical engineering from National Chiao-Tung University in 2001 and 2003, respectively. His Master’s thesis is on personal 3D virtual cinema based on panel speaker array. He is currently pursuing the Ph.D. degree in mechanical engineering at National Chiao-Tung University.
doi:10.1155/2007/75621
Research Article
Subband Affine Projection Algorithm for Acoustic Echo
Cancellation System
Hun Choi and Hyeon-Deok Bae
Department of Electronic Engineering, Chungbuk National University, 12 Gaeshin-Dong, Heungduk-Gu,

Cheongju 361-763, South Korea
Received 30 December 2005; Revised 14 April 2006; Accepted 18 May 2006
We present a new subband affine projection (SAP) algorithm for adaptive acoustic echo cancellation with long echo path delay. Generally, the acoustic echo canceller suffers from the long echo path and large computational complexity. To solve this problem, the proposed algorithm combines the merits of the affine projection (AP) algorithm and subband filtering. The convergence speed of the proposed algorithm is improved by the signal-decorrelating property of orthogonal subband filtering and by weight updating with the prewhitened input signal of the AP algorithm. Moreover, by applying the polyphase decomposition, the noble identity, and critical decimation to the subband adaptive filter, the sufficiently decomposed SAP updates the weights of the adaptive subfilters without a matrix inversion. Therefore, the computational complexity of the proposed method is considerably reduced. In the SAP, the derived weight updating formula for the subband adaptive filter has a simple form comparable to that of the normalized least-mean-square (NLMS) algorithm. The efficiency of the proposed algorithm for colored signals and speech signals was evaluated experimentally.
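Copyright © 2007 H. Choi and H.-D. Bae. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.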
1. INTRODUCTION

Adaptive filtering is essential for acoustic echo cancellation. Among the adaptive algorithms, least-mean-square (LMS) is the most popular for its simplicity and stability. However, when the input signal is highly correlated and a long adaptive filter is needed, the convergence speed of the LMS adaptive filter can deteriorate seriously [1, 2]. To overcome this problem, the affine projection (AP) algorithm was proposed [3–11]. The improved performance of the AP algorithm is characterized by an updating-projection scheme of an adaptive filter on a P-dimensional data-related subspace. Since the input signal is prewhitened by this projection on an affine subspace, the convergence rate of the AP adaptive filter is improved. However, a large computational complexity is a major drawback for its implementation, because the P-order AP adaptive filter is based on the data matrix that consists of the last P + 1 input vectors and it requires a matrix inversion in weight updating.

The orthogonal subband filtering (OSF) is an alternative method that can whiten the input signal [12–15]. The OSF can be considered a kind of projection operation. In terms of its decorrelating property, it is similar to the affine projection scheme. Therefore, in a subband structure with orthogonal analysis filter banks, the convergence speed of the subband adaptive filter (SAF) is improved by weight updating with the prewhitened inputs that result from the OSF. Recently, for fast convergence and efficient implementation, there has been increasing interest in combining the advantages of the AP and the SAF [16–21]. To reduce computational complexity, these algorithms are based on the fast variant of AP (FAP) instead of the conventional AP. The FAP-based algorithms use various iterative methods to avoid the matrix inversion in weight updating. However, in the FAP-based algorithms, the performance is deteriorated by the approximation errors of the iterative method, and the computational complexity is still high for implementation.

In this paper, we present a new subband affine projection (SAP) algorithm to improve the convergence speed and reduce the computational complexity of the AP algorithm. The SAP is based on the subband structure [13] that uses critically decimated adaptive filters with the polyphase decomposition and the noble identity. A new criterion is also presented for applying the AP algorithm to the polyphase-decomposed adaptive filter (adaptive subfilter) in each subband. In this algorithm, the derived weight updating formula for the subband adaptive filter has a simple form comparable to that of the normalized least-mean-square (NLMS) algorithm, and the weights of the adaptive subfilter are updated with the input prewhitened by the OSF in each subband. To evaluate the performance of the proposed SAP, computer simulations are performed for a system identification model of the echo cancellation problem.

The outline of this paper is as follows. In Section 2, the conventional AP algorithm is reviewed. In Section 3, we derive the new subband affine projection algorithm and describe the convergence analysis and computational complexity of the proposed algorithm. Section 4 describes simulation results, and Section 5 contains the conclusions.

Figure 1: Fullband system identification for adaptive acoustic echo canceller. (a) Fullband adaptive acoustic echo canceller. (b) Fullband adaptive system identification.

2. AFFINE PROJECTION ALGORITHM

Consider the adaptive acoustic echo cancellation (AEC) system and the block diagrams of system identification for the AEC in fullband structure as shown in Figure 1. In Figure 1(b), the adaptive filter attempts to estimate a desired signal $d(k)$ which is linearly related to the input signal $\mathbf{u}(k)$ by the model

$$d(k) = \mathbf{s}^{*T}\mathbf{u}(k) + r(k), \tag{1}$$

where $\mathbf{s}^{*}$ is the echo path that we wish to estimate and $r(k)$ is the measurement noise, an independent identically distributed (i.i.d.) random signal with zero mean and variance $\sigma_r^2$. The input signal $u(k)$ is assumed to be a zero-mean wide-sense stationary (WSS) autoregressive (AR) process of order P, so that $u(k)$ is described by

$$u(k) = \sum_{l=1}^{P} a_l\, u(k-l) + f(k), \tag{2}$$

where $f(k)$ is a WSS white process with variance $\sigma_f^2$. Let $\mathbf{u}(k)$ be a vector of N samples of the AR process described in (2); we can rewrite the AR signal as

$$\mathbf{u}(k) = \sum_{l=1}^{P} a_l\, \mathbf{u}(k-l) + \mathbf{f}(k) = \mathbf{U}_a(k)\,\mathbf{a} + \mathbf{f}(k), \tag{3}$$

where $\mathbf{U}_a(k) = [\mathbf{u}(k-1)\ \mathbf{u}(k-2)\ \cdots\ \mathbf{u}(k-P)]$, $\mathbf{u}(k-l) = [u(k-l)\ u(k-l-1)\ \cdots\ u(k-l-N+1)]^T$, and $\mathbf{f}(k) = [f(k)\ f(k-1)\ \cdots\ f(k-N+1)]^T$.

In the system identification for the fullband AEC as shown in Figure 1(b), $y(k)$ is the output signal of the adaptive filter at iteration k. The error signal is defined by $e(k) = d(k) - y(k)$. The P-order AP adaptive filter uses a $(P+1) \times N$ data matrix, and the optimization criterion for designing the adaptive filter is given by [2, 22]

$$\begin{aligned} &\text{minimize} && \left\| \mathbf{s}(k+1) - \mathbf{s}(k) \right\|^2 \\ &\text{subject to} && \mathbf{d}(k) = \mathbf{U}^T(k)\,\mathbf{s}(k+1), \end{aligned} \tag{4}$$

where

$$\mathbf{U}(k) = [\mathbf{u}(k)\ \mathbf{u}(k-1)\ \mathbf{u}(k-2)\ \cdots\ \mathbf{u}(k-P)] = [\mathbf{u}(k)\ \ \mathbf{U}_a(k)]. \tag{5}$$

It is well known that the AP algorithm is an underdetermined optimization problem. Generally, Lagrangian theory is used for solving this optimization problem with equality constraints [2, 22, 23]. From (4), the weights of the adaptive filter are updated by the AP algorithm as in

$$\begin{aligned} \mathbf{s}(k+1) &= \mathbf{s}(k) + \mu\,\mathbf{U}(k)\left[\mathbf{U}(k)^T\mathbf{U}(k)\right]^{-1}\mathbf{e}(k), \\ \mathbf{e}(k) &= \mathbf{d}(k) - \mathbf{y}(k) = [e(k)\ e(k-1)\ \cdots\ e(k-P)]^T, \\ \mathbf{d}(k) &= \mathbf{U}(k)^T\mathbf{s}^{*} = [d(k)\ d(k-1)\ \cdots\ d(k-P)]^T, \\ \mathbf{y}(k) &= \mathbf{U}(k)^T\mathbf{s}(k). \end{aligned} \tag{6}$$

Parameters N and P are the length of the adaptive filter and the projection order, respectively. The step size $\mu$ is the relaxation factor. In the P-order AP algorithm of (6), the AR(P) input signal is decorrelated by P orthogonal projection operations with the projection matrix

$$\mathbf{P}_{U_a}(k) = \mathbf{U}_a(k)\left[\mathbf{U}_a^T(k)\,\mathbf{U}_a(k)\right]^{-1}\mathbf{U}_a^T(k), \tag{7}$$

which achieves the projection operation onto the subspace spanned by the columns of $\mathbf{U}_a(k)$. Thus, the AP adaptive filter weights are updated by prewhitened input signals.
Figure 2: Subband system identification for adaptive acoustic echo canceller. (a) Subband adaptive acoustic echo canceller. (b) Subband adaptive system identification.
3. SUBBAND AFFINE PROJECTION ALGORITHM

Using polyphase decomposition and the noble identity [12], the fullband system of Figure 1 can be transformed into an M-subband system [13]. Figure 2 shows the M-subband adaptive acoustic echo cancellation (SAEC) system and the block diagram of system identification for the SAEC. In [15], this subband structure has been analyzed and shown to be alias free, always stable, and reasonable for implementation. In Figure 2, using orthogonal analysis filters (OAFs) $\mathbf{h}_0, \ldots, \mathbf{h}_{M-1}$, the input signal $u(k)$ and the desired signal $d(k)$ are partitioned into new signals denoted by $u_m(k)$ and $d_m(k)$, respectively. We can describe them as

$$\begin{aligned} u_m(k) &= \mathbf{h}_m^T\left[\mathbf{U}_{sa}(k)\,\mathbf{a} + \mathbf{f}_s(k)\right] = \mathbf{h}_m^T\mathbf{U}_{sa}(k)\,\mathbf{a} + f_m(k), \\ d_m(k) &= \mathbf{h}_m^T\mathbf{d}(k), \end{aligned} \tag{8}$$

where $\mathbf{U}_{sa}(k) = [\mathbf{u}_{sa}(k-1)\ \mathbf{u}_{sa}(k-2)\ \cdots\ \mathbf{u}_{sa}(k-P)]$, $\mathbf{u}_{sa}(k-l) = [u(k-l)\ u(k-l-1)\ \cdots\ u(k-l-L+1)]^T$, $\mathbf{f}_s(k) = [f(k)\ f(k-1)\ \cdots\ f(k-L+1)]^T$, and L is the length of the analysis filters. The notation $(\downarrow M)$ means a decimation by M. Note that the decimated signals $u_{mn}(k) = u_m(Mk-n)$ and $f_{mn}(k) = f_m(Mk-n)$ are the subband polyphase components of $u_m(k)$ and $f_m(k)$, respectively.
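These subband polyphase component vectors can be presented by

$$\begin{aligned} \mathbf{u}_{mn}(k) &= [u_{mn}(k)\ u_{mn}(k-1)\ \cdots\ u_{mn}(k-P_s)]^T, \\ \mathbf{f}_{mn}(k) &= [f_{mn}(k)\ f_{mn}(k-1)\ \cdots\ f_{mn}(k-P_s)]^T, \end{aligned} \tag{9}$$

where the subscript mn is the subband-decomposed polyphase index (m and n = 0, 1, ..., M − 1). In the M-subband structure, the adaptive filter can be represented in terms of polyphase components as

$$S(z) = S_0(z^M) + z^{-1}S_1(z^M) + \cdots + z^{-i}S_i(z^M). \tag{10}$$

Based on the principle of minimum disturbance [2] and the criterion of (4) for the fullband AP adaptive filter, we formulate a criterion for the M-subband AP filters as one of optimization subject to multiple constraints, as follows:

$$\begin{aligned} &\text{minimize} && f[\mathbf{s}(k)] = \left\|\mathbf{s}_0(k+1) - \mathbf{s}_0(k)\right\|^2 + \cdots + \left\|\mathbf{s}_{M-1}(k+1) - \mathbf{s}_{M-1}(k)\right\|^2 \\ &\text{subject to} && \mathbf{d}_m(k) = \sum_{n=0}^{M-1}\mathbf{U}_{mn}^T(k)\,\mathbf{s}_n(k+1) \quad \text{for } m = 0, 1, \ldots, M-1. \end{aligned} \tag{11}$$

From this criterion, we define the cost function for the AP algorithm in the two-subband (M = 2) structure shown in Figure 3 as

$$\begin{aligned} J(k) ={}& \left\|\mathbf{s}_0(k+1) - \mathbf{s}_0(k)\right\|^2 + \left\|\mathbf{s}_1(k+1) - \mathbf{s}_1(k)\right\|^2 \\ &+ \left[\mathbf{d}_0(k) - \mathbf{U}_{00}^T(k)\,\mathbf{s}_0(k+1) - \mathbf{U}_{01}^T(k)\,\mathbf{s}_1(k+1)\right]^T\boldsymbol{\lambda}_0 \\ &+ \left[\mathbf{d}_1(k) - \mathbf{U}_{10}^T(k)\,\mathbf{s}_0(k+1) - \mathbf{U}_{11}^T(k)\,\mathbf{s}_1(k+1)\right]^T\boldsymbol{\lambda}_1, \end{aligned} \tag{12}$$

$$\mathbf{U}_{mn}(k) = [\mathbf{u}_{mn}(k)\ \mathbf{u}_{mn}(k-1)\ \cdots\ \mathbf{u}_{mn}(k-P_s)], \tag{13}$$

where $\boldsymbol{\lambda}_0$ and $\boldsymbol{\lambda}_1$ are the Lagrange multiplier vectors, and $N_s$ and $P_s$ are the length of the adaptive subfilter and the projection order in each subband, respectively. In (12), the cost function is quadratic, and it is convex since its Hessian matrix is positive definite [2, 23]. Therefore, the proposed cost function has a global minimum solution. From (12), we can get the partial derivatives of the cost function with respect to $\mathbf{s}_0(k+1)$ and $\mathbf{s}_1(k+1)$, and set the results to zero as [2]

$$\begin{aligned} \frac{\partial J(k)}{\partial \mathbf{s}_0(k+1)} &= 2\left[\mathbf{s}_0(k+1) - \mathbf{s}_0(k)\right] - \mathbf{U}_{00}(k)\boldsymbol{\lambda}_0 - \mathbf{U}_{10}(k)\boldsymbol{\lambda}_1 = \mathbf{0}, \\ \frac{\partial J(k)}{\partial \mathbf{s}_1(k+1)} &= 2\left[\mathbf{s}_1(k+1) - \mathbf{s}_1(k)\right] - \mathbf{U}_{01}(k)\boldsymbol{\lambda}_0 - \mathbf{U}_{11}(k)\boldsymbol{\lambda}_1 = \mathbf{0}. \end{aligned} \tag{14}$$

Figure 3: System identification model for two-subband adaptive filter.

To find the Lagrange vectors $\boldsymbol{\lambda}_0$ and $\boldsymbol{\lambda}_1$ that minimize the cost function of (12) with respect to $\mathbf{s}_0(k+1)$ and $\mathbf{s}_1(k+1)$, the error vectors in each subband are expressed as

$$\begin{aligned} \mathbf{e}_0(k) &= \tfrac{1}{2}\left[\mathbf{U}_{00}^T(k)\mathbf{U}_{00}(k) + \mathbf{U}_{01}^T(k)\mathbf{U}_{01}(k)\right]\boldsymbol{\lambda}_0 + \tfrac{1}{2}\left[\mathbf{U}_{00}^T(k)\mathbf{U}_{10}(k) + \mathbf{U}_{01}^T(k)\mathbf{U}_{11}(k)\right]\boldsymbol{\lambda}_1, \\ \mathbf{e}_1(k) &= \tfrac{1}{2}\left[\mathbf{U}_{10}^T(k)\mathbf{U}_{00}(k) + \mathbf{U}_{11}^T(k)\mathbf{U}_{01}(k)\right]\boldsymbol{\lambda}_0 + \tfrac{1}{2}\left[\mathbf{U}_{10}^T(k)\mathbf{U}_{10}(k) + \mathbf{U}_{11}^T(k)\mathbf{U}_{11}(k)\right]\boldsymbol{\lambda}_1. \end{aligned} \tag{15}$$

From (15), $\boldsymbol{\lambda}_0$ and $\boldsymbol{\lambda}_1$ can be represented in matrix form as

$$\begin{bmatrix}\boldsymbol{\lambda}_0 \\ \boldsymbol{\lambda}_1\end{bmatrix} = 2\begin{bmatrix}\mathbf{A}_0(k) & \mathbf{B}(k) \\ \mathbf{B}^T(k) & \mathbf{A}_1(k)\end{bmatrix}^{-1}\begin{bmatrix}\mathbf{e}_0(k) \\ \mathbf{e}_1(k)\end{bmatrix}, \tag{16}$$

where

$$\begin{aligned} \mathbf{A}_0(k) &= \mathbf{U}_{00}^T(k)\mathbf{U}_{00}(k) + \mathbf{U}_{01}^T(k)\mathbf{U}_{01}(k), \\ \mathbf{A}_1(k) &= \mathbf{U}_{10}^T(k)\mathbf{U}_{10}(k) + \mathbf{U}_{11}^T(k)\mathbf{U}_{11}(k), \end{aligned} \tag{17}$$

$$\mathbf{B}(k) = \mathbf{U}_{00}^T(k)\mathbf{U}_{10}(k) + \mathbf{U}_{01}^T(k)\mathbf{U}_{11}(k). \tag{18}$$

In (16), the off-diagonal matrix $\mathbf{B}(k)$ is an undesirable cross-term that is produced by the signals of different subbands. To eliminate this cross-term, we define $\mathbf{G}_m(k) = E\{\mathbf{A}_m(k)\}$ and $\mathbf{K}(k) = E\{\mathbf{B}(k)\}$ ($E\{\cdot\}$ denotes the expectation of $\{\cdot\}$). The matrix $\mathbf{G}_m(k)$ in the main diagonal is the sum of $P_s \times P_s$ Grammian matrices that consist of sample autocorrelations $\mathbf{R}_m(k)$ (for m = 0 or 1). Therefore, $\mathbf{G}_0(k)$ and $\mathbf{G}_1(k)$ can be written as

$$\begin{aligned} \mathbf{G}_0(k) &= E\{\mathbf{A}_0(k)\} = E\left\{\mathbf{U}_{00}^T(k)\mathbf{U}_{00}(k) + \mathbf{U}_{01}^T(k)\mathbf{U}_{01}(k)\right\} \\ &= \mathbf{R}_0(k) + \mathbf{R}_0(k-1) + \cdots + \mathbf{R}_0(k-N_s+1), \\ \mathbf{G}_1(k) &= E\{\mathbf{A}_1(k)\} = E\left\{\mathbf{U}_{10}^T(k)\mathbf{U}_{10}(k) + \mathbf{U}_{11}^T(k)\mathbf{U}_{11}(k)\right\} \\ &= \mathbf{R}_1(k) + \mathbf{R}_1(k-1) + \cdots + \mathbf{R}_1(k-N_s+1). \end{aligned} \tag{19}$$

The off-diagonal matrix $\mathbf{K}(k)$, in contrast, is the sum of $P_s \times P_s$ sample cross-correlations $\mathbf{C}(k)$ that consist of signals of different subband components. The matrix $\mathbf{K}(k)$ can be written as

$$\begin{aligned} \mathbf{K}(k) &= E\{\mathbf{B}(k)\} = E\left\{\mathbf{U}_{00}^T(k)\mathbf{U}_{10}(k) + \mathbf{U}_{01}^T(k)\mathbf{U}_{11}(k)\right\} \\ &= \mathbf{C}(k) + \mathbf{C}(k-1) + \cdots + \mathbf{C}(k-N_s+1). \end{aligned} \tag{20}$$

In (20), each element of $\mathbf{K}(k)$ can be obtained as a sum of inner products of different subband components. We can write each element as

$$\gamma_{u_{00}u_{10}+u_{01}u_{11}}(k, l) = E\left\{\mathbf{u}_{00}^T(k)\mathbf{u}_{10}(l) + \mathbf{u}_{01}^T(k)\mathbf{u}_{11}(l)\right\}. \tag{21}$$

Assuming that the input signal is wide-sense stationary and ergodic, the cross-correlation at zero lag can be expressed as

$$\gamma_{u_{00}u_{10}+u_{01}u_{11}}(0) = \frac{\mathbf{u}_{00}^T(k)\mathbf{u}_{10}(k) + \mathbf{u}_{01}^T(k)\mathbf{u}_{11}(k)}{N_s}. \tag{22}$$

Figure 4: Sample power spectrum of u(k).

For analytical simplicity, we further assume that the input signal is white and its spectrum is flat in each subband as shown in Figure 4. From these assumptions, $E\{\mathbf{u}_{00}^T\mathbf{u}_{00} + \mathbf{u}_{01}^T\mathbf{u}_{01}\} = \sigma_{u_0}^2$ ($\sigma_{u_0}^2$ is the variance of the subband signal $\mathbf{h}_0^T\mathbf{u}$) and $E\{\mathbf{u}_{00}^T\mathbf{u}_{10} + \mathbf{u}_{01}^T\mathbf{u}_{11}\} = 0$. For colored inputs, $E\{\mathbf{u}_{00}^T\mathbf{u}_{10} + \mathbf{u}_{01}^T\mathbf{u}_{11}\} \neq 0$. However, if the frequency responses of the analysis filters do not overlap significantly, it still holds that $E\{\mathbf{u}_{00}^T\mathbf{u}_{10} + \mathbf{u}_{01}^T\mathbf{u}_{11}\} \ll E\{\mathbf{u}_{00}^T\mathbf{u}_{00} + \mathbf{u}_{01}^T\mathbf{u}_{01}\}$. This means that the elements of $\mathbf{B}(k)$ are very small compared with the elements of $\mathbf{A}_0(k)$ and $\mathbf{A}_1(k)$. Therefore, we can consider $\mathbf{B}(k) \approx \mathbf{0}$.

With the above approximations, (16) can be simplified as

$$\begin{bmatrix}\boldsymbol{\lambda}_0 \\ \boldsymbol{\lambda}_1\end{bmatrix} = 2\begin{bmatrix}\mathbf{A}_0(k) & \mathbf{B}(k) \\ \mathbf{B}^T(k) & \mathbf{A}_1(k)\end{bmatrix}^{-1}\begin{bmatrix}\mathbf{e}_0(k) \\ \mathbf{e}_1(k)\end{bmatrix} \approx 2\begin{bmatrix}\mathbf{A}_0(k) & \mathbf{0} \\ \mathbf{0} & \mathbf{A}_1(k)\end{bmatrix}^{-1}\begin{bmatrix}\mathbf{e}_0(k) \\ \mathbf{e}_1(k)\end{bmatrix}. \tag{23}$$

From (17) and (23), the Lagrange vectors $\boldsymbol{\lambda}_0$ and $\boldsymbol{\lambda}_1$ are obtained as

$$\begin{aligned} \boldsymbol{\lambda}_0 &= 2\left[\mathbf{U}_{00}^T(k)\mathbf{U}_{00}(k) + \mathbf{U}_{01}^T(k)\mathbf{U}_{01}(k)\right]^{-1}\mathbf{e}_0(k), \\ \boldsymbol{\lambda}_1 &= 2\left[\mathbf{U}_{10}^T(k)\mathbf{U}_{10}(k) + \mathbf{U}_{11}^T(k)\mathbf{U}_{11}(k)\right]^{-1}\mathbf{e}_1(k). \end{aligned} \tag{24}$$

Substituting (24) into (14), we can obtain the weight updating formulae of the SAP algorithm in the two-subband case as follows:

$$\begin{aligned} \mathbf{s}_0(k+1) &= \mathbf{s}_0(k) + \mu\left[\mathbf{U}_{00}(k)\mathbf{A}_0^{-1}(k)\mathbf{e}_0(k) + \mathbf{U}_{10}(k)\mathbf{A}_1^{-1}(k)\mathbf{e}_1(k)\right], \\ \mathbf{s}_1(k+1) &= \mathbf{s}_1(k) + \mu\left[\mathbf{U}_{01}(k)\mathbf{A}_0^{-1}(k)\mathbf{e}_0(k) + \mathbf{U}_{11}(k)\mathbf{A}_1^{-1}(k)\mathbf{e}_1(k)\right]. \end{aligned} \tag{25}$$

3.1. Extension to the M-subband case

To generalize (25), we consider the M-subband structure shown in Figure 2(b) [13]. The cost function for this case is defined as an extension of (12),

$$J(k) = \sum_{m=0}^{M-1}\left\{\left\|\mathbf{s}_m(k+1) - \mathbf{s}_m(k)\right\|^2 + \left[\mathbf{d}_m(k) - \sum_{n=0}^{M-1}\mathbf{U}_{mn}^T(k)\,\mathbf{s}_n(k+1)\right]^T\boldsymbol{\lambda}_m\right\} \quad \text{for } M = 2, 3, \ldots. \tag{26}$$

Using (25), the proposed weight updating formula for the M-subband case can be expressed in matrix form as follows:

$$\mathbf{S}(k+1) = \mathbf{S}(k) + \mu\,\mathbf{X}(k)\,\boldsymbol{\Pi}^{-1}(k)\,\mathbf{E}(k), \tag{27}$$

where

$$\mathbf{S}(k) = \left[\mathbf{s}_0^T(k)\ \mathbf{s}_1^T(k)\ \cdots\ \mathbf{s}_{M-1}^T(k)\right]^T,$$

$$\mathbf{X}(k) = \begin{bmatrix} \mathbf{U}_{00}(k) & \mathbf{U}_{10}(k) & \cdots & \mathbf{U}_{(M-1)0}(k) \\ \mathbf{U}_{01}(k) & \mathbf{U}_{11}(k) & \cdots & \mathbf{U}_{(M-1)1}(k) \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{U}_{0(M-1)}(k) & \mathbf{U}_{1(M-1)}(k) & \cdots & \mathbf{U}_{(M-1)(M-1)}(k) \end{bmatrix},$$

$\mathbf{X}(k)$ is an $MN_s \times MP_s$ matrix,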
⎡ ⎤
$$
\boldsymbol{\Pi}(k) = \begin{bmatrix}
\mathbf{A}_0(k) & \mathbf{0} & \cdots & \mathbf{0}\\
\mathbf{0} & \mathbf{A}_1(k) & \ddots & \vdots\\
\vdots & \ddots & \ddots & \mathbf{0}\\
\mathbf{0} & \cdots & \mathbf{0} & \mathbf{A}_{M-1}(k)
\end{bmatrix},
$$

$\boldsymbol{\Pi}(k)$ is an $MP_s \times MP_s$ matrix,

$$
\mathbf{E}(k) = \begin{bmatrix}\mathbf{e}_0(k)\\ \mathbf{e}_1(k)\\ \vdots\\ \mathbf{e}_{M-1}(k)\end{bmatrix},
\tag{28}
$$

and $\mathbf{E}(k)$ is an $MP_s \times 1$ vector.

3.2. The projection order reduced by signal partitioning

The AP algorithm of (6) is rewritten with a direction vector $\boldsymbol{\Phi}(k)$ as follows [24]:

$$
\mathbf{s}(k+1) = \mathbf{s}(k) + \mu\frac{\boldsymbol{\Phi}(k)}{\boldsymbol{\Phi}^T(k)\boldsymbol{\Phi}(k)}e(k),
\tag{29}
$$

$$
\begin{aligned}
\boldsymbol{\Phi}(k) &= \mathbf{u}(k) - \mathbf{U}_a(k)\mathbf{a}(k),\\
\mathbf{a}(k) &= \left[\mathbf{U}_a^T(k)\mathbf{U}_a(k)\right]^{-1}\mathbf{U}_a^T(k)\mathbf{u}(k).
\end{aligned}
\tag{30}
$$

In (29), the AP algorithm updates the adaptive filter weights $\mathbf{s}(k)$ in the direction of the vector $\boldsymbol{\Phi}(k)$. The direction vector is the estimation error vector (in the least-squares sense), and it is orthogonal to the last $P$ input vectors. Similarly, in (27), the SAP algorithm updates the adaptive subfilter weights $\mathbf{s}_n(k)$ in the direction of a vector $\boldsymbol{\Phi}_n(k)$ given by

$$
\boldsymbol{\Phi}_n(k) = \sum_{m=0}^{M-1}\boldsymbol{\Phi}_{mn}(k),
\tag{31}
$$

where each subdirection vector for the adaptive subfilters is given by

$$
\boldsymbol{\Phi}_{mn}(k) = \mathbf{u}_{mn}(k) - \mathbf{U}_{a_{mn}}(k)\mathbf{a}_{mn}(k),
\tag{32}
$$

$$
\mathbf{a}_{mn}(k) = \left[\mathbf{U}_{a_{mn}}^T(k)\mathbf{U}_{a_{mn}}(k)\right]^{-1}\mathbf{U}_{a_{mn}}^T(k)\mathbf{u}_{mn}(k),
\tag{33}
$$

$$
\mathbf{U}_{a_{mn}}(k) = \begin{bmatrix}\mathbf{u}_{mn}(k-1) & \mathbf{u}_{mn}(k-2) & \cdots & \mathbf{u}_{mn}(k-P_s)\end{bmatrix}.
\tag{34}
$$

In (33), $\mathbf{a}_{mn}(k)$ is the subband least-squares estimate of the parameter vector $\mathbf{a}$, transformed by orthogonal subband filtering. $\boldsymbol{\Phi}_{mn}(k)$ is orthogonal to the past $P_s$ input vectors $\mathbf{u}_{mn}(k-1), \mathbf{u}_{mn}(k-2), \ldots, \mathbf{u}_{mn}(k-P_s)$. From (31) and (32), it follows that the weights of the adaptive subfilter are updated in a direction orthogonal to the past $MP_s$ decomposed subband input vectors. In the fullband AP algorithm, the AR(P) input signal is decorrelated by the projection matrix as shown in (7). Similarly, each subband input signal is decorrelated by the subband projection matrices

$$
\mathbf{P}_{U_{a_{mn}}}(k) = \mathbf{U}_{a_{mn}}(k)\left[\mathbf{U}_{a_{mn}}^T(k)\mathbf{U}_{a_{mn}}(k)\right]^{-1}\mathbf{U}_{a_{mn}}^T(k).
\tag{35}
$$

To decorrelate the AR(P) input signal, the fullband AP algorithm performs P projection operations with the corresponding past P input vectors. In the proposed method, on the other hand, a projection of lower order ($P_s < P$) is sufficient for decorrelating the signal, because the input signal is prewhitened by the subband partitioning and the spectral dynamic range of each subband signal is therefore decreased. Moreover, the length of the adaptive subfilter becomes $N_s = N/M$ by applying the polyphase decomposition and the noble identity to the maximally decimated adaptive filter. In the weight update of the AP adaptive filter, the projection order governs the convergence rate of the adaptive algorithm, and the order required depends on the length of the AP adaptive filter as well as on the degree of input correlation. A high projection order is required for a long adaptive filter, whereas a lower projection order is sufficient for a shortened adaptive filter. Therefore, the projection order for the shortened adaptive subfilter can be $P_s \approx P/M$. When the size of the data matrix is $N \times (P + 1)$ in the fullband, it can be $N_s \times (P_s + 1) \approx (N/M) \times (P/M)$ in the subband. Moreover, regarding the computational complexity of the SAP, the weights of the adaptive subfilters in the subband structure are updated at the low rate provided by maximal decimation. Consequently, the computational complexity of the proposed method is much less than that of the fullband AP.

Now, we consider a simple implementation technique for the proposed SAP. Although the computational complexity of the proposed method is reduced, the matrix inversion problem still remains. In the AP algorithm, the projection order is typically much smaller than the length of the adaptive filter. By partitioning the P-order fullband AP into P subbands, we obtain the simplified SAP (SSAP) with $N/P \times 1$ data vectors for weight updating instead of data matrices. Consequently, the weight updating formula for each adaptive subfilter is similar to that of the NLMS adaptive filter, and the matrix inversion is not required. Now, we assume that the projection order in the fullband is 2 ($P = 2$). By partitioning into two subbands, (25) is simply rewritten as

$$
\begin{aligned}
\mathbf{s}_0(k+1) &= \mathbf{s}_0(k) + \mu\left[\frac{\mathbf{u}_{00}(k)e_0(k)}{\sigma_{u_0}^2(k)} + \frac{\mathbf{u}_{10}(k)e_1(k)}{\sigma_{u_1}^2(k)}\right],\\
\mathbf{s}_1(k+1) &= \mathbf{s}_1(k) + \mu\left[\frac{\mathbf{u}_{01}(k)e_0(k)}{\sigma_{u_0}^2(k)} + \frac{\mathbf{u}_{11}(k)e_1(k)}{\sigma_{u_1}^2(k)}\right],
\end{aligned}
\tag{36}
$$

where $\sigma_{u_m}^2(k)$ is the variance of the input signal in each subband. Note that the computational complexity of the subband partitioning is much less than that of calculating the inverse matrix. In a practical implementation, the SSAP gives considerable savings in computational complexity.

3.3. Convergence of the mean weight vector

To analyze the convergence behavior of the proposed SAP, we first define the mean-square deviation as

$$
D(k) = E\left\{\left\|\tilde{\mathbf{s}}(k)\right\|^2\right\} = E\left\{\left\|\mathbf{s}^* - \mathbf{s}(k)\right\|^2\right\}.
\tag{37}
$$
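As a concrete reading of the SSAP recursion (36) in Section 3.2, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes that each normalizer $\sigma_{u_m}^2(k)$ is taken as the scalar that $\mathbf{A}_m(k)$ in (17) reduces to when $P_s = 1$ (an input-energy estimate), and that the subband errors follow the constraint in (26). The function names, the small regularizer `eps`, and the toy vectors in the usage note are illustrative assumptions.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def subband_errors(s0, s1, u00, u01, u10, u11, d0, d1):
    """A priori subband errors e_m = d_m - sum_n u_mn^T s_n, following (26)."""
    e0 = d0 - (dot(u00, s0) + dot(u01, s1))
    e1 = d1 - (dot(u10, s0) + dot(u11, s1))
    return e0, e1


def ssap_update(s0, s1, u00, u01, u10, u11, e0, e1, mu=1.0, eps=1e-12):
    """One two-subband SSAP step in the spirit of (36).

    Assumption: the normalizers are the scalar versions of A_0 and A_1
    in (17); eps is a hypothetical guard against division by zero.
    """
    p0 = dot(u00, u00) + dot(u01, u01) + eps
    p1 = dot(u10, u10) + dot(u11, u11) + eps
    g0 = mu * e0 / p0
    g1 = mu * e1 / p1
    s0_next = [w + g0 * a + g1 * b for w, a, b in zip(s0, u00, u10)]
    s1_next = [w + g0 * a + g1 * b for w, a, b in zip(s1, u01, u11)]
    return s0_next, s1_next
```

With toy inputs chosen so that the cross-term of (18) vanishes, for example `u00 = [1, 0]`, `u01 = [0, 1]`, `u10 = [0, 1]`, `u11 = [-1, 0]`, one step with `mu = 1` drives both a posteriori subband errors to (numerically) zero, which is the NLMS-like behavior the text describes. When the cross-term is not zero, a residual proportional to it remains.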
Table 1: Comparison of the computational complexities; N is the length of the adaptive filter or unknown system (filter), L is the length of the analysis and synthesis filters, M is the number of subbands, P is the projection order, and D is the size of the data frame in LC-GSFAP.

| Algorithm | Multiplications/iteration | Multiplications/iteration for L = 64, N = 512, M = 4, P = 4, D = 2 |
| SNLMS [13] | 3N + 2M(L + 2) | 2064 |
| Fullband AP [3] | P^3/2 + 3NP^2 + NP + N | 27 168 |
| Subband LC-GSFAP [19] | M[P^2 + P + N + (2P + N)/D + 1] + 2ML | 3160 |
| The proposed SAP | P^3/(2M^3) + NP^2(M + 1)/M^3 + NP(P + M + 1)/M^2 + 2ML | ≈ 2305 |
| The SSAP | 3N + 2P(L + 2) | 2064 |
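The closed-form counts in Table 1 can be checked numerically. The sketch below evaluates four of the rows for the parameter set in the table header; the function names are illustrative, and the LC-GSFAP row is not reproduced here. For the proposed SAP the expression evaluates to 2304.5, which the table reports as approximately 2305.

```python
def snlms_mults(N, M, L):
    """Subband NLMS: 3N + 2M(L + 2) multiplications per iteration."""
    return 3 * N + 2 * M * (L + 2)


def fullband_ap_mults(N, P):
    """Fullband AP: P^3/2 + 3NP^2 + NP + N multiplications per iteration."""
    return P**3 / 2 + 3 * N * P**2 + N * P + N


def sap_mults(N, M, L, P):
    """Proposed SAP: P^3/(2M^3) + NP^2(M+1)/M^3 + NP(P+M+1)/M^2 + 2ML."""
    return (P**3 / (2 * M**3)
            + N * P**2 * (M + 1) / M**3
            + N * P * (P + M + 1) / M**2
            + 2 * M * L)


def ssap_mults(N, L, P):
    """SSAP (number of subbands equal to P): 3N + 2P(L + 2)."""
    return 3 * N + 2 * P * (L + 2)


if __name__ == "__main__":
    N, L, M, P = 512, 64, 4, 4
    print("SNLMS      ", snlms_mults(N, M, L))
    print("Fullband AP", fullband_ap_mults(N, P))
    print("SAP        ", sap_mults(N, M, L, P))
    print("SSAP       ", ssap_mults(N, L, P))
```

Because the SSAP sets the number of subbands equal to P, its count coincides with the SNLMS count whenever M = P, as in the tabulated case.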
For analytical simplicity, we consider the two-subband case. The polyphase components of the unknown filter, $\mathbf{s}_0^*$ and $\mathbf{s}_1^*$, can be represented as

$$
S^*(z) = S_0^*(z^2) + z^{-1}S_1^*(z^2).
\tag{38}
$$

From (27), we get

$$
\tilde{\mathbf{S}}(k+1) = \tilde{\mathbf{S}}(k) - \mu\mathbf{X}(k)\boldsymbol{\Pi}^{-1}(k)\mathbf{E}(k),
\tag{39}
$$

where $\tilde{\mathbf{S}}(k) = [\tilde{\mathbf{s}}_0^T(k)\ \tilde{\mathbf{s}}_1^T(k)]^T$, with $\tilde{\mathbf{s}}_0(k) = \mathbf{s}_0^* - \mathbf{s}_0(k)$ and $\tilde{\mathbf{s}}_1(k) = \mathbf{s}_1^* - \mathbf{s}_1(k)$. Taking the squared Euclidean norm of both sides of (39) and assuming that $\mathbf{X}^T(k)\mathbf{X}(k) \approx \boldsymbol{\Pi}(k)$, the weight updating formula can be represented as

$$
\left\|\tilde{\mathbf{S}}(k+1)\right\|^2 - \left\|\tilde{\mathbf{S}}(k)\right\|^2
= \mu^2\mathbf{E}^T(k)\boldsymbol{\Pi}^{-1}(k)\mathbf{E}(k) - 2\mu\tilde{\mathbf{S}}^T(k)\mathbf{X}(k)\boldsymbol{\Pi}^{-1}(k)\mathbf{E}(k),
\tag{40}
$$

and taking the expectation of both sides of (40), we get

$$
D(k+1) - D(k) = \mu^2E\left\{\mathbf{E}^T(k)\boldsymbol{\Pi}^{-1}(k)\mathbf{E}(k)\right\} - 2\mu E\left\{\boldsymbol{\xi}(k)\boldsymbol{\Pi}^{-1}(k)\mathbf{E}(k)\right\},
\tag{41}
$$

where

$$
\boldsymbol{\xi}(k) = \tilde{\mathbf{S}}^T(k)\mathbf{X}(k).
\tag{42}
$$

For the proposed algorithm to be stable, the mean-square deviation $D(k)$ must decrease monotonically with an increasing number of iterations $k$, implying that $D(k+1) - D(k) < 0$. Therefore, the step size $\mu$ has to fulfill the condition

$$
0 < \mu < \frac{2E\left\{\boldsymbol{\xi}(k)\boldsymbol{\Pi}^{-1}(k)\mathbf{E}(k)\right\}}{E\left\{\mathbf{E}^T(k)\boldsymbol{\Pi}^{-1}(k)\mathbf{E}(k)\right\}}.
\tag{43}
$$

In (43), $\boldsymbol{\xi}(k) = \tilde{\mathbf{S}}^T(k)\mathbf{X}(k)$ is the undisturbed error vector. If the disturbance is negligible, the undisturbed error vector is equal to the error vector $\mathbf{E}^T(k)$. Hence, in the absence of disturbance, the necessary and sufficient condition for convergence in the mean-square sense is that the step-size parameter must satisfy the double inequality

$$
0 < \mu < 2.
\tag{44}
$$

3.4. Computational complexity

The computational complexities per iteration, in terms of the number of multiplications, for the proposed SAP and the SSAP, the fullband AP [3], the subband NLMS (SNLMS) [13], and the subband LC-GSFAP [19] are shown in Table 1. When the fullband sampling rate is $F_s = 1/T_s$, the weights of the adaptive filter in the subband structure are updated at a lower rate, $1/(MT_s)$. In the AP and the SAP, matrix inversions were assumed to be performed with standard LU decomposition, which takes $O^3/2$ multiplications [17], where $O$ is the rank of a square matrix and is equal to the projection order in AP ($O = P$ or $P_s$). In the SSAP, which is partitioned into P subbands, the length of the subband adaptive filter is $N_s = N/M|_{M=P} = N/P$ and the projection order in each subband is $P_s = P/M|_{M=P} = 1$. In applications such as adaptive echo cancellation, the length of the analysis filters is typically much smaller than the length of the adaptive filter. Consequently, it can be seen that the proposed algorithm is much more efficient than the other algorithms.

4. SIMULATION RESULTS

To evaluate the performance of the proposed SAP algorithm, we carry out computer simulations in an acoustic echo cancellation scenario. The length of the unknown system shown in Figure 5 is N = 512. It is an actual impulse response of the echo path in a room, sampled at 8 kHz and truncated to 512 samples. For signal partitioning in all experiments, we use the cosine-modulated filter banks [25] (analysis and synthesis) with the prototype frequency responses shown in Figure 6.
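The dimensions quoted in Sections 3.4 and 4 follow from simple arithmetic on N, M, and the sampling rate. The short sketch below collects them; the function names are illustrative, and it assumes that N is an exact multiple of M, as in the configurations used here.

```python
def echo_path_duration_s(num_taps, fs_hz):
    """Time span covered by an impulse response of num_taps samples."""
    return num_taps / fs_hz


def subfilter_length(N, M):
    """Adaptive subfilter length Ns = N/M after polyphase decomposition."""
    if N % M != 0:
        raise ValueError("this sketch assumes N is a multiple of M")
    return N // M


def update_rate_hz(fs_hz, M):
    """Weight update rate 1/(M*Ts) under maximal decimation."""
    return fs_hz / M


if __name__ == "__main__":
    N, fs = 512, 8000
    print("echo path duration (s):", echo_path_duration_s(N, fs))
    for M in (2, 4, 8):
        print("M =", M, "Ns =", subfilter_length(N, M),
              "update rate (Hz) =", update_rate_hz(fs, M))
```

For the 512-tap echo path sampled at 8 kHz, this gives a 64 ms impulse response, and for M = 4 a subfilter length of 128 with weight updates at 2 kHz.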
Figure 5: Impulse response of the echo path measured in a room. [Plot: echo path amplitude versus samples.]

Figure 6: Frequency responses of the prototype filters. [Plot: H(w) in dB versus normalized frequency, for M = 2, M = 4, and M = 8.]

For efficient subband decomposition of the input signals, the lengths of the analysis filters are increased with M so that the ratio of the transition band to the passband remains nearly the same for all values of M. The prototype filters' lengths are 32, 64, and 128 for M = 2, 4, and 8, respectively. The input signals are a zero-mean wide-sense stationary AR(P) process and real speech sampled at 8 kHz. The AR(4) process is given by

$$
u(k) = \sum_{l=1}^{P}a_l u(k-l) + f(k),
\tag{45}
$$

where the AR coefficients are $\mathbf{a} = [1\ \ 0.999\ \ 0.99\ \ 0.995\ \ 0.9]^T$ for AR(4), and $f(k)$ is a zero-mean, unit-variance white Gaussian random process. The measurement noise is added to the desired signal $d(k)$ such that SNR = 30 dB. The step size is set to unity ($\mu = 1$) for fast convergence. In the acoustic echo cancellation systems shown in Figures 1 and 2, we evaluate the echo return loss enhancement (ERLE) performance of the proposed SAP, the fullband AP, and the four-subband LC-GSFAP algorithm with an oversampling factor of 2 (OS = 2):

$$
\mathrm{ERLE} = 10\log_{10}\frac{\sum_{i=0}^{N-1}d^2(n-i)}{\sum_{i=0}^{N-1}e^2(n-i)}.
\tag{46}
$$

Generally, the weights of the adaptive filter are frozen when double talk is detected and are readjusted when the double talk is inactive. For the double-talk condition, we evaluate the tracking ability of the proposed method. The echo path is changed at the detection time and the weights of the adaptive filter are frozen; then, when the double talk is inactive, the weights are readjusted to cancel the changed echo path.

Figure 7: ERLE curves of the fullband AP with P = 2, M = 4 LC-GSFAP with P = 2 and D = 2, M = 2 SAP with Ps = 2, and M = 4 SAP with Ps = 2 for AR(4) inputs (N = 512, μ = 1, SNR = 30 dB). [Plot: ERLE (dB) versus time (s).]

4.1. The proposed SAP with AR(4) input

Figure 7 shows the ERLE performance of the proposed method, the fullband AP, and the subband LC-GSFAP with the same projection order (P = Ps = 2) for different numbers of subbands (M = 2, 4). We assumed that the double talk is detected at about 4.5 seconds. For the same projection order, the SAP and the subband LC-GSFAP converge faster than the fullband AP. These results indicate that the convergence speed of the adaptive filter is improved by the subband filtering and that it increases with M. Figure 8 shows the ERLE of each algorithm with different values of the projection order (P = 4
and Ps = 1, 2) and different numbers of subbands (M = 2, 4). Comparing the results of Figure 8 with those of Figure 7, the convergence speed of the SAP can deteriorate when the projection order is reduced; however, it remains faster than that of the other algorithms. From these results, increasing M improves the convergence speed and also allows the projection order P to be reduced. Therefore, it can be said that the proposed SAP improves on the efficiency of the conventional AP. Consequently, the SAP is superior to the other algorithms in terms of computational complexity and convergence speed.

Figure 8: ERLE curves of the fullband AP with P = 4, M = 4 LC-GSFAP with P = 2, OS = 2, and D = 2, M = 2 SAP with Ps = 2, and M = 4 SAP with Ps = 1 for AR(4) inputs (N = 512, μ = 1, SNR = 30 dB). [Plot: ERLE (dB) versus time (s).]

4.2. The proposed SAP with real speech input

Figure 9: Input signal (speech) and its power spectrum (fs = 8 kHz). [Plots: input signal versus time (s); power spectrum (dB) versus frequency.]

Figure 10: Far-end signal and near-end signal of AEC with speech as excitation. [Plots: far-end signal and near-end signal versus time (s).]

Figure 11: Comparison of ERLE for fullband AP, M = 4, OS = 2, and D = 2 LC-GSFAP, M = 2 SAP, and M = 4 SAP with 8 kHz sampled speech as excitation (N = 512, P = Ps = 2, μ = 1, SNR = 30 dB). [Plot: ERLE (dB) versus time (s).]

The speech signal and its power spectrum are shown in Figure 9. The speech is a woman's voice sampled at 8 kHz. Figure 10 shows the far-end signal and the near-end signal of the AEC. The projection orders for all algorithms are equal to 2 (P = Ps = 2). The ratio of the speaker output signal to the measurement noise is set to 30 dB. Figure 11 shows ERLE curves of the
M = 2, 4 SAP, the M = 4, OS = 2 LC-GSFAP, and the fullband AP with real speech as excitation. Figure 12 illustrates the residual error signal of each algorithm.

Figure 12: Comparison of residual error signals for Fullband AP, M = 4, OS = 2, and D = 2 LC-GSFAP, M = 2 SAP, and M = 4 SAP with speech as excitation (N = 512, P = Ps = 2, μ = 1, SNR = 30 dB). [Panels: near-end signal, fullband AP, M = 4 LC-GSFAP, M = 2 SAP, M = 4 SAP, each versus time (s).]

4.3. MSE performance of the SAP and the simplified SAP

Figure 13: Comparison of MSE curves of the simplified SAP (SSAP) for AR(4) (N = 512, μ = 1, SNR = 30 dB). [Plot: MSE (dB) versus sample number (×10^4).]

Figure 14: Comparison of MSE curves of each algorithm for AR(4) (N = 512, μ = 1, SNR = 30 dB). [Plot: MSE (dB) versus sample number (×10^4).]

We compare the performance of the proposed algorithms (the SAP and the SSAP) with other algorithms. Figure 13 shows the MSE curves of the SAP and the fullband AP. The convergence rate of the fullband AP increases with P, and that of the SAP increases with P or M. Increasing P leads to a large computational complexity, whereas increasing M does not. For the same projection order, the SAP converges faster than the fullband AP. To evaluate the performance of the SSAP, two sets of simulations are considered. In the first set, the number of subbands in the SSAP and the projection order for the fullband AP are set to 4 (M = 4 and P = 4), whereas they are set to 8 (M = 8 and P = 8) in the second set. The projection order of the SSAP is 1 (Ps = 1) in both sets. Figure 14 shows the MSE curves of the SSAP and the fullband AP. In the first set, the convergence rate of the SSAP is similar to that of the fullband AP. In the second set, we can observe that the fullband AP is superior to the SSAP. However, the steady-state error of the fullband AP is larger than that of the SSAP. This large steady-state error is in accord with the result
of [24]. Moreover, as described earlier, the fullband AP with a higher projection order has extremely large computational complexity, whereas the computational complexity of the SSAP is comparable to that of the NLMS. Consequently, we can conclude that partitioning into more subbands is more effective than raising the projection order for improving the convergence rate of the fullband AP.

5. CONCLUSIONS

In this paper, we present a new subband affine projection algorithm based on the subband structure [13] and the fullband affine projection algorithm [3] for acoustic echo cancellation. The proposed algorithm uses the OSF for prewhitening the highly correlated inputs. This OSF is a kind of projection operation, and it can partly substitute for the updating-projection scheme of the fullband AP algorithm. Moreover, the OSF with the polyphase decomposition, the noble identity, and critical decimation can reduce the computational complexity. By combining the merits of the OSF and the AP algorithm, the derived method gives a rapid convergence rate and reduced computational complexity. In addition, we show that the proposed algorithm can be reduced to a simplified, NLMS-like form by setting the number of subbands equal to the projection order. The simplified form is a good approach to implementing the proposed method in most practical applications. Several simulation results support the theoretical predictions and show the improved performance.

REFERENCES

[1] B. Widrow and S. D. Stearns, Adaptive Signal Processing,
[2] S. Haykin, Adaptive Filter Theory, Prentice-Hall, Upper Saddle River, NJ, USA, 4th edition, 2002.
[3] K. Ozeki and T. Umeda, “An adaptive filtering algorithm using an orthogonal projection to an affine subspace and its properties,” Electronics & Communications in Japan, vol. 67, no. 5, pp. 19–27, 1984.
[4] S. G. Sankaran and A. A. Beex, “Convergence behavior of affine projection algorithms,” IEEE Transactions on Signal Processing, vol. 48, no. 4, pp. 1086–1096, 2000.
[5] S. L. Gay and J. Benesty, Acoustic Signal Processing for Telecommunication, Kluwer Academic, Boston, Mass, USA, 2000.
[6] M. Rupp, “A family of adaptive filter algorithms with decorrelating properties,” IEEE Transactions on Signal Processing, vol. 46, no. 3, pp. 771–775, 1998.
[7] S. Werner and P. S. R. Diniz, “Set-membership affine projection algorithm,” IEEE Signal Processing Letters, vol. 8, no. 8, pp. 231–235, 2001.
[8] H.-C. Shin and A. H. Sayed, “Mean-square performance of a family of affine projection algorithms,” IEEE Transactions on Signal Processing, vol. 52, no. 1, pp. 90–102, 2004.
[9] S. L. Gay and S. Tavathia, “The fast affine projection algorithm,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’95), vol. 5, pp. 3023–3026, Detroit, Mich, USA, May 1995.
[10] M. Tanaka, S. Makino, and J. Kojima, “A block exact fast affine projection algorithm,” IEEE Transactions on Speech and Audio Processing, vol. 7, no. 1, pp. 79–86, 1999.
[11] F. Albu and H. K. Kwan, “Fast block exact Gauss-Seidel pseudo affine projection algorithm,” Electronics Letters, vol. 40, no. 22, pp. 1451–1453, 2004.
[12] P. P. Vaidyanathan, Multirate Systems and Filter Banks,
[13] S. S. Pradhan and V. U. Reddy, “A new approach to subband adaptive filtering,” IEEE Transactions on Signal Processing, vol. 47, no. 3, pp. 655–664, 1999.
[14] M. R. Petraglia, R. G. Alves, and P. S. R. Diniz, “New structures for adaptive filtering in subbands with critical sampling,” IEEE Transactions on Signal Processing, vol. 48, no. 12, pp. 3316–3327, 2000.
[15] S. Miyagi and H. Sakai, “Convergence analysis of alias-free subband adaptive filters based on a frequency domain technique,” IEEE Transactions on Signal Processing, vol. 52, no. 1, pp. 79–89, 2004.
[16] S. Makino, K. Strauss, S. Shimauchi, Y. Haneda, and A. Nakagawa, “Subband stereo echo canceller using the projection algorithm with convergence to the true echo path,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’97), vol. 1, pp. 299–302, Munich, Germany, April 1997.
[17] M. Bouchard, “Multichannel affine and fast affine projection algorithms for active noise control and acoustic equalization systems,” IEEE Transactions on Speech and Audio Processing, vol. 11, no. 1, pp. 54–60, 2003.
[18] Q. G. Liu, B. Champagne, and K. C. Ho, “On the use of a modified fast affine projection algorithm in subbands for acoustic echo cancelation,” in Proceedings of the IEEE Digital Signal Processing Workshop, pp. 354–357, Loen, Norway, September 1996.
[19] E. Chau, H. Sheikhzadeh, and R. L. Brennan, “Complexity reduction and regularization of a fast affine projection algorithm for oversampled subband adaptive filters,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP ’04), vol. 5, pp. 109–112, Montreal, Quebec, Canada, May 2004.
[20] K. Nishikawa and H. Kiya, “New structure of affine projection algorithm using a novel subband adaptive system,” in Proceedings of the 3rd IEEE Workshop on Signal Processing Advances in Wireless Communications (SPAWC ’01), pp. 364–367, Taoyuan, Taiwan, March 2001.
[21] H. R. Abutalebi, H. Sheikhzadeh, R. L. Brennan, and G. H. Freeman, “Affine projection algorithm for oversampled subband adaptive filters,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP ’03), vol. 6, pp. 209–212, Hong Kong, April 2003.
[22] E. K. P. Chong and S. H. Zak, An Introduction to Optimization, John Wiley & Sons, New York, NY, USA, 1996.
[23] T. K. Moon and W. C. Stirling, Mathematical Methods and Algorithms, Prentice-Hall, Englewood Cliffs, NJ, USA, 2000.
[24] S. J. M. de Almeida, J. C. M. Bermudez, N. J. Bershad, and M. H. Costa, “A statistical analysis of the affine projection algorithm for unity step size and autoregressive inputs,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 52, no. 7, pp. 1394–1405, 2005.
[25] Y.-P. Lin and P. P. Vaidyanathan, “A Kaiser window approach for the design of prototype filters of cosine modulated filterbanks,” IEEE Signal Processing Letters, vol. 5, no. 6, pp. 132–134, 1998.
Hun Choi received the B.S. and the M.S. degrees in electronics from Chungbuk National University, South Korea, in 1996 and 2001, respectively. Since 2001, he has been pursuing the Ph.D. degree. From November 1996 to March 1997, he served as a Research Engineer in the Department of Product Development of LG Semicon. His research interests include adaptive signal processing, multirate signal processing, and methods applied to acoustic and communication systems.

Hyeon-Deok Bae received his M.S. and Ph.D. degrees in electronics from Seoul National University (SNU), South Korea, in 1980 and 1992, respectively. From 1983 to 1987, he was an Assistant Professor at Kwandong University, Kangwon, South Korea. Since 1987, he has been a Professor at Chungbuk National University, South Korea. His research interests include adaptive signal processing, multirate systems, and wavelet applications for signal processing. In 1994, he was a Visiting Professor at Syracuse University, Syracuse, NY, USA.

