
ADAPTED FILTER BANKS IN MACHINE LEARNING: APPLICATIONS IN BIOMEDICAL SIGNAL PROCESSING

D. J. Strauss¹,², W. Delb³, J. Jung³, and P. K. Plinkert³

¹ Key Numerics, Saarbrücken, Germany
² Institute for New Materials, Saarbrücken, Germany
³ Saarland University Hospital, Homburg/Saar, Germany

0-7803-7663-3/03/$17.00 ©2003 IEEE, ICASSP 2003, VI-425
This work has been supported by MED-EL Deutschland GmbH, Starnberg, Germany.

ABSTRACT

The theory of signal-adapted filter banks has been developed in signal compression in recent years and has only rarely been applied to other application fields such as machine learning. In this paper, we propose lattice structure based signal-adapted filter banks and time-scale atoms, respectively, for the construction of morphological local discriminant bases and hybrid wavelet-support vector classifiers. The first method is a more powerful construction of the recently introduced local discriminant bases algorithm which, in addition to the conventional wavelet-packet tree adjustment, adapts the analyzing time-scale atoms. The second method uses adapted wavelet decompositions which are tailored for support vector classifiers with radial basis functions as kernels. For both methods, we present applications in biomedical signal processing.

1. INTRODUCTION

A general issue when applying wavelet-like frequency decompositions is the design of appropriate basis functions and filter banks, respectively. The design of signal-adapted filter banks by means of energy compaction is an active and ongoing field of research [1, 2, 3, 4]. Often these approaches are closely related to the design of optimal wavelets for signal compression.

In this paper, we employ some of the ideas developed in this area for machine learning, where we want to model some dependency between an input and an output space given a finite number of associations. The major difference between our approach and the design of compaction filters is that we apply adaptation criteria other than energy compaction, which is not necessarily the best criterion for signal discrimination [5].

In particular, we present a wavelet packet feature extraction scheme. Wavelet based methods with no special focus on the data have been suggested for feature extraction, e.g., see [6, 7]. Data dependent schemes mainly rely on an adjustment of the decomposition tree, see [8] for a comparison of several schemes. The local discriminant basis algorithm (LDB) [9] is a well accepted scheme which relies on the best basis paradigm. We extend this scheme to the construction of more powerful morphological LDBs (MLDBs) using the lattice structure.

We also present hybrid wavelet-support vector classifiers where we apply an adaptation of wavelet decompositions which is tailored for the recently introduced support vector machine classifiers with radial basis functions as kernels. This adaptation strategy is well motivated by the paradigm of large margin classifiers [10]. It allows the representation of the data to be optimized before solving the quadratic programming problem of support vector machines by including a priori knowledge of the pattern recognition task.

2. ADAPTED WAVELET PACKET DECOMPOSITIONS

Let $H_0(z) := \sum_{k \in \mathbb{Z}} h_0[k] z^{-k}$ be the z-transform of the analysis lowpass filter and $H_1(z) := \sum_{k \in \mathbb{Z}} h_1[k] z^{-k}$ the z-transform of the analysis highpass filter of a two-channel paraunitary filter bank with real-valued filter coefficients and at least one vanishing moment of the highpass filter, i.e., we have that $H_1(1) = 0$. The polyphase matrix $H_{\mathrm{pol}}(z)$ of the analysis bank of such filter banks can be decomposed as follows

[equation (1), the lattice factorization of $H_{\mathrm{pol}}(z)$, is not recoverable from the source]

and finally the space $\mathcal{P}^K := \{\boldsymbol{\vartheta} = (\vartheta_0, \dots, \vartheta_{K-1}) : \vartheta_k \in [0, \pi)\}$ can serve to parameterize all two-channel paraunitary filter banks with at least one vanishing moment of the
highpass filter, see [11], [12, Theorem 4.7], and [13] for detailed discussions. We will use $(\boldsymbol{\vartheta})$ to denote the parameterization if necessary. We use $K = 2$ in (1) in the application section.

For the implementation of wavelet packet decompositions, two-channel paraunitary filter banks are arranged in a binary tree of decomposition depth $J$, see [14, 15] for an introduction to wavelet packets. For the decomposition of signals $x \in \ell^2$, we define the projections onto the orthogonal wavelet packet spaces $\Omega_{j,k} = \Omega_{j+1,2k} \oplus \Omega_{j+1,2k+1}$ ($\Omega_{0,0} := \ell^2$, $(j,k) \in \mathcal{I}$, where $\mathcal{I}$ specifies the binary decomposition tree) by $P_{j,k} : \ell^2 \to \Omega_{j,k}$, with the following expansions

$$P_{j,k}\, x = \sum_{m \in \mathbb{Z}} y_{j,k}[m]\, q_{j,k}^{m}, \qquad y_{j,k}[m] = \langle x, q_{j,k}^{m} \rangle_{\ell^2}, \tag{2}$$

where $q_{j,k}^{m}$ are the wavelet packet basis functions. In practice our signals are of finite length $d$. Here we will exclusively deal with signals whose length $d$ is a power of 2. For this, we define the maximal decomposition depth by $J := \log_2 d$ and the sets of indices $\mathcal{T}_j := \{0, 1, \dots, 2^{J-j} - 1\}$. Then we have the finite wavelet packet bases $B_{j,k} := \{q_{j,k}^{m} : m \in \mathcal{T}_j\}$.

3. MORPHOLOGICAL LOCAL DISCRIMINANT BASES

Suppose now that we are given $L$ classes $\mathcal{A}_l$ ($l = 1, \dots, L$) of $M_l$ training patterns, respectively, i.e., $\mathcal{A}_l := \{x_{i,l} \in \mathbb{R}^d : i = 1, \dots, M_l\}$. We denote the expansion coefficients (2) of these patterns by $y_{i,l,j,k} = (y_{i,l,j,k}[m])_{m \in \mathcal{T}_j} = (\langle x_{i,l}, q_{j,k}^{m} \rangle)_{m \in \mathcal{T}_j}$. For $(j,k) \in \mathcal{I}$ and $m \in \mathcal{T}_j$, the time-scale energy map $\Gamma_l$ of class $l$ is defined by

$$\Gamma_l(j,k,m) := \frac{\sum_{i=1}^{M_l} y_{i,l,j,k}[m]^2}{\sum_{i=1}^{M_l} \|x_{i,l}\|_{\ell^2}^2}. \tag{3}$$

Now the LDB algorithm can be summarized as follows [9]:

LDB Algorithm

Choose a dictionary and an additive discriminant measure $\mathcal{D}$ which may be, e.g., the $\ell^2$ distance or the relative entropy, see [9].
1. For every class $l = 1, \dots, L$ construct the time-scale energy map $\Gamma_l$.
2. For $k = 0, \dots, 2^J - 1$ set $A_{J,k} := B_{J,k}$ and $\Delta_{J,k} := \sum_{m \in \mathcal{T}_J} \mathcal{D}(\Gamma_1(J,k,m), \dots, \Gamma_L(J,k,m))$.
3. For $j = J-1, \dots, 0$ and $k = 0, \dots, 2^j - 1$ determine the "best subset" $A_{j,k}$ by the following rule:
   Set $\Delta_{j,k} := \sum_{m \in \mathcal{T}_j} \mathcal{D}(\Gamma_1(j,k,m), \dots, \Gamma_L(j,k,m))$.
   If $\Delta_{j,k} \ge \Delta_{j+1,2k} + \Delta_{j+1,2k+1}$
   then $A_{j,k} := B_{j,k}$
   else $A_{j,k} := A_{j+1,2k} \cup A_{j+1,2k+1}$ and $\Delta_{j,k} := \Delta_{j+1,2k} + \Delta_{j+1,2k+1}$.

Output: $A_{0,0}$

The basis $A_{0,0}$ is called the local discriminant basis (LDB). The discriminatory power of this basis is measured by $\Delta_{0,0}$. By the considerations in the previous section, each angle vector $\boldsymbol{\vartheta} \in \mathcal{P}^K$ defines a different dictionary $D(\boldsymbol{\vartheta})$. The discriminatory power of the dictionary $D(\boldsymbol{\vartheta})$ can be measured by $\Delta_{0,0}(\boldsymbol{\vartheta})$ computed in the LDB algorithm. Thus the MLDB is given by

$$\hat{\boldsymbol{\vartheta}} = \arg\max_{\boldsymbol{\vartheta} \in \mathcal{P}^K} \Delta_{0,0}(\boldsymbol{\vartheta}). \tag{4}$$

This problem can be solved by a hypercube evaluation, parameter space reduction, or genetic algorithms, see [13, 16] for details.

4. HYBRID WAVELET-SUPPORT VECTOR CLASSIFIERS

We only present our adaptation strategy here and refer to [10] for an introduction to SVMs.

Let $K : X \times X \to \mathbb{R}$ ($X$ is a compact subset of $\mathbb{R}^d$) be a positive definite symmetric function in $L^2(X \times X)$. For a given $K$, there exists a reproducing kernel Hilbert space

$$\mathcal{H}_K = \overline{\operatorname{span}}\,\{K(\tilde{x}, \cdot) : \tilde{x} \in X\}$$

of real valued functions on $X$ with inner product determined by $\langle K(\tilde{x}, x), K(\bar{x}, x) \rangle_{\mathcal{H}_K} = K(\tilde{x}, \bar{x})$ which has the reproducing kernel $K$, i.e., $\langle f(\cdot), K(\tilde{x}, \cdot) \rangle_{\mathcal{H}_K} = f(\tilde{x})$ ($f \in \mathcal{H}_K$). By Mercer's Theorem, the reproducing kernel $K$ can be expanded in a uniformly convergent series on $X \times X$

$$K(x, y) = \sum_{j=1}^{\infty} \eta_j\, \varphi_j(x)\, \varphi_j(y), \tag{5}$$

where $\eta_j \ge 0$. We restrict our interest to functions $K$ that arise from a radial basis function (RBF) such that

$$K(x, y) = k(\|x - y\|_2),$$

where $\|\cdot\|_2$ denotes the Euclidean norm on $\mathbb{R}^d$. We introduce a so-called feature map $\Phi : X \to \ell^2$ by

$$\Phi(\cdot) = \left(\sqrt{\eta_j}\, \varphi_j(\cdot)\right)_{j \in \mathbb{N}}.$$

Let $\ell^2$ denote the Hilbert space of real valued quadratic summable sequences. By (5), we have that $\Phi(x)$ ($x \in X$)
is an element in $\ell^2$ with $\|\Phi(x)\|_{\ell^2}^2 = \sum_{j=1}^{\infty} \eta_j \varphi_j^2(x) = K(x, x) = k(0)$. We define the feature space $\mathcal{F}_K \subset \ell^2$ by the $\ell^2$-closure of all finite linear combinations of elements $\Phi(x)$ ($x \in X$)

$$\mathcal{F}_K = \overline{\operatorname{span}}\,\{\Phi(x) : x \in X\}.$$

Then $\mathcal{F}_K$ is a Hilbert space with $\|\cdot\|_{\mathcal{F}_K} = \|\cdot\|_{\ell^2}$.

4.1. Adaptation in $\mathcal{F}_K$

For a fixed waveform $x$, we define the function

$$\xi_x(\boldsymbol{\vartheta}) = \left(\|y_1^{\boldsymbol{\vartheta}}\|_{\ell^p}, \dots, \|y_N^{\boldsymbol{\vartheta}}\|_{\ell^p}\right) \tag{6}$$

where $y_n$ ($n = 1, \dots, N$) are $N$ pre-selected subbands of a wavelet packet decomposition, e.g., the subbands of an octave band decomposition. We normalize $\xi_x(\boldsymbol{\vartheta})$ by its concentration in $\ell^p$. This feature vector carries the multilevel concentration of the waveform $x$, which is robust against local instabilities in time [13]. To gain shift-invariance of this feature vector, frame decompositions using nonsubsampled filter banks [14] can be applied instead of orthogonal decompositions, which are shift-variant [17].

SVMs rely on the optimal hyperplane classification [10] in the feature space $\mathcal{F}_K$. Thus we have to ensure that our extracted features are mapped to far apart points in $\mathcal{F}_K$ before solving the quadratic programming problem of the SVM. For a given training set

$$\mathcal{A}(\boldsymbol{\vartheta}) := \{(\xi_i(\boldsymbol{\vartheta}), y_i) \in X \subset \mathbb{R}^d \times \{-1, 1\} : i = 1, \dots, M\},$$

we denote the sets of indices $i \in \{1, \dots, M\}$ with $y_i = 1$ and $y_i = -1$ by $M_+$ and $M_-$, respectively. Consequently, for a large margin, we try to find $\hat{\boldsymbol{\vartheta}}$ such that

$$\hat{\boldsymbol{\vartheta}} = \arg\max_{\boldsymbol{\vartheta} \in \mathcal{P}^K} \left\{ \min_{i \in M_+,\, j \in M_-} \|\Phi(\xi_i(\boldsymbol{\vartheta})) - \Phi(\xi_j(\boldsymbol{\vartheta}))\|_{\ell^2}^2 \right\}. \tag{7}$$

It follows that

$$\|\Phi(\xi_i(\boldsymbol{\vartheta})) - \Phi(\xi_j(\boldsymbol{\vartheta}))\|_{\ell^2}^2 = 2k(0) - 2k\left(\|\xi_i(\boldsymbol{\vartheta}) - \xi_j(\boldsymbol{\vartheta})\|_2\right).$$

We suppose that $k(\cdot)$ is monotonically decreasing in $\|\cdot\|$. Then (7) can be rewritten as

$$\hat{\boldsymbol{\vartheta}} = \arg\max_{\boldsymbol{\vartheta} \in \mathcal{P}^K} \left\{ \min_{i \in M_+,\, j \in M_-} \|\xi_i(\boldsymbol{\vartheta}) - \xi_j(\boldsymbol{\vartheta})\|_2 \right\}.$$

Now we have an optimization problem in the original space $X$ instead of the feature space $\mathcal{F}_K$, which is not explicitly given. An approach to gain an approximated solution to problems of this type is given in [18].

5. APPLICATIONS

5.1. MLDB Feature Extraction

An auditory evoked response is a response within the auditory system that is produced or evoked by sounds (auditory or acoustic stimuli). Of particular interest is the auditory brainstem response (ABR), that is, the auditory evoked response that stems from the brainstem. The detection of discriminating features between the binaurally (stimulus to both ears) evoked ABRs and the sum of the monaurally (stimulus to one ear) evoked ABRs is of diagnostic interest [19]. For a total of ten patients, the discriminant power of MLDBs and LDB, measured by $\Delta_{0,0}$, is shown for the individual patient in Fig. 1. The LDB was constructed using the Daubechies wavelet with 6 filter coefficients. The better performance of the MLDB is clearly noticeable, see [16] for a connection with a classifier.

Fig. 1. MLDB and LDB performance

Fig. 2. SVMs and hybrid-wavelet SVMs

5.2. Hybrid Classification

The rate independent discrimination of endocardial electrograms (EEs), i.e., bioelectric signals which stem from the inner heart, representing a physiological and a pathological rhythm is of diagnostic interest [20]. For an application of the presented hybrid wavelet-support vector classifier (Gaussian RBF, and the subbands of an octave band decomposition in (6)) to the EEs from ten patients with a
physiological and a pathological rhythm, the number of support vectors found as well as the error rate is shown in Fig. 2 using a training set of 8 beats and a test set of 16 beats from the physiological and pathological rhythm for the individual patient. The results for SVM applied to time domain waveforms are also shown. A much better performance of the hybrid-wavelet support vector classifier in terms of the number of support vectors found and the error rate is clearly noticeable (note that we expect a better generalization if we have fewer support vectors [10]). See [18] for an application using shift-invariant frames.

6. CONCLUSIONS

We have presented two approaches in computational intelligence using signal-adapted filter banks. In particular, we constructed morphological local discriminant bases where we adapted the morphology of the analyzing atoms in addition to the tree adjustment. We also presented hybrid wavelet-support vector classifiers where the theoretically well founded support vector machines allowed the optimization of the representation of the data for the subsequent classification. The efficiency of each approach was verified in two applications. We conclude that signal-adapted filter banks are a sensible approach in machine learning.

7. REFERENCES

[1] P. H. Delsarte, B. Macq, and D. T. M. Slock, "Signal adapted multiresolution transforms for image coding," IEEE Trans. on Information Theory, vol. 38, pp. 897-903, 1992.
[2] M. K. Tsatsanis and G. B. Giannakis, "Principal component filter banks for optimal multiresolution analysis," IEEE Trans. on Signal Processing, vol. 43, pp. 1766-1777, 1995.
[3] P. P. Vaidyanathan, "Theory of optimum orthonormal filter banks," IEEE Trans. on Signal Processing, vol. 46, pp. 1528-1543, 1998.
[4] P. P. Vaidyanathan and S. Akkarakaran, "A review of the theory and applications of principal component filter banks," J. of Applied and Computational Harmonic Analysis, to appear.
[5] B. D. Ripley, Pattern Recognition and Neural Networks, Cambridge University Press, Cambridge, 1996.
[6] S. Pittner and S. V. Kamarthi, "Feature extraction from wavelet coefficients for pattern recognition tasks," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 21, pp. 83-88, 1999.
[7] L. J. Trejo and M. J. Shensa, "Feature extraction of event-related potentials using wavelets: An application to human performance monitoring," Brain and Language, vol. 66, pp. 89-107, 1999.
[8] G. Rutledge and G. McLean, "Comparison of several wavelet packet feature extraction algorithms," Submitted to IEEE Trans. on Pattern Analysis and Machine Intelligence, 2000.
[9] N. Saito and R. R. Coifman, "Local discriminant bases," in Wavelet Applications in Signal and Image Processing II, Proc. SPIE 2303, A. F. Laine and M. A. Unser, Eds., July 1994, pp. 2-14.
[10] V. Vapnik, The Nature of Statistical Learning Theory, Springer, NY, 1995.
[11] P. P. Vaidyanathan, Multirate Systems and Filter Banks, Prentice Hall, Englewood Cliffs, NJ, 1993.
[12] G. Strang and T. Nguyen, Wavelets and Filter Banks, Wellesley-Cambridge Press, Wellesley, MA, 1996.
[13] D. J. Strauss, Adapted Filter Banks and Support Vector Architectures: Hybrid Approaches to Machine Learning, Logos-Verlag, Berlin, Germany, 2002.
[14] M. Vetterli and J. Kovačević, Wavelets and Subband Coding, Prentice-Hall, Englewood Cliffs, NJ, 1995.
[15] M. V. Wickerhauser, Adapted Wavelet Analysis from Theory to Software, A. K. Peters, Ltd., Wellesley, MA, 1994.
[16] D. J. Strauss, G. Steidl, and W. Delb, "Feature extraction by shape-adapted local discriminant bases," Signal Processing, 2002, In press.
[17] E. P. Simoncelli, W. T. Freeman, E. H. Adelson, and D. J. Heeger, "Shiftable multiscale transforms," IEEE Trans. on Information Theory, vol. 38, pp. 587-608, 1992.
[18] D. J. Strauss and G. Steidl, "Hybrid wavelet-support vector classification of waveforms," J. of Computational and Applied Mathematics, 2002, In press.
[19] W. Delb, D. J. Strauss, G. Hohenberg, and K. P. Plinkert, "The binaural interaction component in children with central auditory processing disorders," International Journal of Audiology, 2002, In press.
[20] M. M. Shieh, L. Clem, L. Malden, A. Pixley, and E. Fain, "Improved supraventricular and ventricular tachycardia discrimination using electrogram morphology in an implantable cardioverter defibrillator," G. Ital. Cardiol., vol. 28 (supplement I), pp. 245-251, 1998.
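
The wavelet packet decomposition of Section 2 can be illustrated with a minimal Python sketch. The lattice used here is an assumption: it is the generic textbook factorization into plane rotations and delays, with the last angle chosen so that all angles sum to π/4, which enforces H1(1) = 0. It is not claimed to match equation (1) of the paper term by term. The names `lattice_filters`, `analyze`, and `wavelet_packet` are hypothetical, and periodic signal extension is assumed for finite-length signals.

```python
import math

def lattice_filters(free_angles):
    """Build lowpass/highpass filters of a two-channel paraunitary bank.

    Assumption: generic rotation-and-delay lattice. The final angle is set to
    pi/4 - sum(free_angles) so that the highpass filter sums to zero.
    """
    thetas = list(free_angles) + [math.pi / 4 - sum(free_angles)]
    c, s = math.cos(thetas[0]), math.sin(thetas[0])
    # Rows of the polyphase matrix; each entry is a coefficient list in z^-1.
    top = [[c], [s]]
    bot = [[-s], [c]]
    for th in thetas[1:]:
        c, s = math.cos(th), math.sin(th)
        # Apply the delay diag(1, z^-1): shift the bottom row, pad the top row.
        top = [p + [0.0] for p in top]
        bot = [[0.0] + p for p in bot]
        # Left-multiply by the rotation [[c, s], [-s, c]].
        new_top = [[c * a + s * b for a, b in zip(top[i], bot[i])] for i in range(2)]
        new_bot = [[-s * a + c * b for a, b in zip(top[i], bot[i])] for i in range(2)]
        top, bot = new_top, new_bot
    # Interleave the two polyphase components of each row into one filter.
    h0 = [v for pair in zip(top[0], top[1]) for v in pair]
    h1 = [v for pair in zip(bot[0], bot[1]) for v in pair]
    return h0, h1

def analyze(x, h):
    """Periodic filtering followed by downsampling by 2."""
    n = len(x)
    return [sum(h[k] * x[(2 * m + k) % n] for k in range(len(h)))
            for m in range(n // 2)]

def wavelet_packet(x, h0, h1, depth):
    """Return {(j, k): coefficients} for the full binary tree up to `depth`."""
    tree = {(0, 0): list(x)}
    for j in range(depth):
        for k in range(2 ** j):
            parent = tree[(j, k)]
            tree[(j + 1, 2 * k)] = analyze(parent, h0)
            tree[(j + 1, 2 * k + 1)] = analyze(parent, h1)
    return tree

# Example: one free angle gives length-4 filters; signal length d = 8, J = 3.
h0, h1 = lattice_filters([math.pi / 3])
tree = wavelet_packet([1.0, 4.0, -2.0, 0.5, 3.0, 3.0, -1.0, 2.0], h0, h1, 3)
```

Because every stage is a rotation or a delay, each level of the tree preserves the signal energy, which is a convenient sanity check when varying the angles.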

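The bottom-up pruning of the LDB algorithm in Section 3 can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the time-scale energy maps are taken as given inputs (one dictionary per class, keyed by tree node), and the additive discriminant measure is a sum of squared pairwise differences, standing in for the ℓ² distance mentioned in the text. The names `l2_discriminant` and `ldb_prune` are hypothetical.

```python
def l2_discriminant(values):
    """Sum of squared pairwise differences between the class energies."""
    return sum((a - b) ** 2
               for i, a in enumerate(values)
               for b in values[i + 1:])

def ldb_prune(energy_maps, depth, measure=l2_discriminant):
    """Select the most discriminating basis by bottom-up tree pruning.

    energy_maps: one dict per class, {(j, k): [energy for each index m]}.
    Returns (list of selected nodes forming A_{0,0}, discriminant power Delta_{0,0}).
    """
    def node_power(j, k):
        per_class = [gm[(j, k)] for gm in energy_maps]
        # Additivity: sum the measure over all coefficient indices m.
        return sum(measure(list(vals)) for vals in zip(*per_class))

    best, delta = {}, {}
    # Step 2: initialize with the deepest level.
    for k in range(2 ** depth):
        best[(depth, k)] = [(depth, k)]
        delta[(depth, k)] = node_power(depth, k)
    # Step 3: keep a parent only if it discriminates at least as well
    # as the best subsets of its two children combined.
    for j in range(depth - 1, -1, -1):
        for k in range(2 ** j):
            own = node_power(j, k)
            children = delta[(j + 1, 2 * k)] + delta[(j + 1, 2 * k + 1)]
            if own >= children:
                best[(j, k)], delta[(j, k)] = [(j, k)], own
            else:
                best[(j, k)] = best[(j + 1, 2 * k)] + best[(j + 1, 2 * k + 1)]
                delta[(j, k)] = children
    return best[(0, 0)], delta[(0, 0)]

# Toy example with two classes and depth 1 (values are made up for illustration).
class_a = {(0, 0): [0.5, 0.5], (1, 0): [0.9], (1, 1): [0.1]}
class_b = {(0, 0): [0.5, 0.5], (1, 0): [0.1], (1, 1): [0.9]}
basis, power = ldb_prune([class_a, class_b], depth=1)
```

In the toy example the root node cannot separate the classes while the two children can, so the children are selected. An MLDB search in the sense of (4) would repeat this pruning for each candidate angle vector and keep the one with the largest returned power.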

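The step from (7) to the criterion in the original space in Section 4.1 rests on the identity ‖Φ(a) − Φ(b)‖² = K(a,a) + K(b,b) − 2K(a,b) = 2k(0) − 2k(‖a − b‖₂). The following Python sketch checks this numerically for a Gaussian profile and selects, among candidate parameters, the one that maximizes the minimal inter-class distance. The width `gamma`, the function names, and the toy feature vectors are assumptions made for illustration only.

```python
import math
from itertools import product

def gaussian_profile(r, gamma=0.5):
    """Radial profile k(r) = exp(-gamma * r^2), decreasing in r >= 0."""
    return math.exp(-gamma * r * r)

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def feature_space_sq_dist(a, b, k=gaussian_profile):
    """Squared distance of the images of a and b in the feature space."""
    return 2.0 * k(0.0) - 2.0 * k(euclidean(a, b))

def min_interclass(features, labels, dist):
    """Smallest distance between a +1 pattern and a -1 pattern."""
    pos = [f for f, y in zip(features, labels) if y == 1]
    neg = [f for f, y in zip(features, labels) if y == -1]
    return min(dist(p, n) for p, n in product(pos, neg))

def select_angle(candidate_features, labels, dist=euclidean):
    """Return the candidate whose features maximize the minimal inter-class distance.

    candidate_features: {candidate parameter: list of feature vectors}.
    """
    return max(candidate_features,
               key=lambda th: min_interclass(candidate_features[th], labels, dist))

# Toy features for two candidate parameters (values are made up).
features = {
    0.1: [(0.0, 0.0), (1.0, 0.0), (0.5, 0.5)],
    0.7: [(0.0, 0.0), (3.0, 4.0), (2.0, 2.0)],
}
labels = [1, -1, -1]
best_euclid = select_angle(features, labels)
best_feature_space = select_angle(features, labels, feature_space_sq_dist)
```

Since the profile is monotonically decreasing, both criteria rank the candidates identically, so the search can be carried out with plain Euclidean distances without ever evaluating the feature map.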