**Speech Denoising by Adaptive Weighted Average Filtering in the EMD framework**

Kais KHALDI and Monia TURKI-HADJ ALOUANE
Unité Signaux et Systèmes, ENIT, BP 37, Le Belvédère 1002 Tunis, Tunisia
Email: kais.khaldi@gmail.com, m.turki@enit.rnu.tn

Abdel-Ouahab BOUDRAA
IRENav, Ecole Navale / E3I2 (EA3876), ENSIETA, Groupe ASM, Lanvéoc Poulmic, BP600, 29240 Brest-Armées, France
Email: boudra@ecole-navale.fr

Abstract—This paper introduces a new speech enhancement method that combines the Adaptive Center Weighted Average (ACWA) filter with Empirical Mode Decomposition (EMD). Both ACWA and EMD operate in the time domain. The ACWA filter is advantageous because it operates adaptively in the time domain and requires neither the stationarity of the signals nor the whiteness of the noise. Thanks to the data-driven decomposition of the EMD, applying the ACWA filter to the Intrinsic Mode Functions (IMFs) gives better results than ACWA filtering of the noisy signal itself. The proposed EMD-ACWA denoising method is applied to noisy speech signals with different noise levels, and the results are compared to those obtained by two other denoising methods: wavelet thresholding and ACWA filtering. The EMD-ACWA method is shown to be significantly superior to the two others in white noise contexts as well as in correlated noise ones.

I. INTRODUCTION

Recently, a new temporal signal decomposition method, called Empirical Mode Decomposition (EMD), has been introduced by Huang et al. [1] for analyzing data from nonstationary and nonlinear processes. The major advantage of the EMD is that the basis functions are derived from the signal itself. Hence, the analysis is adaptive, in contrast to traditional methods such as wavelets, where the basis functions are fixed. The EMD has received increasing attention in terms of applications [2]-[3], interpretation [4]-[5], and improvement [6]-[7]. The EMD is also used in speech denoising [8]. Speech signal noise reduction is a well-known problem in signal processing. Linear methods such as Wiener filtering [9] are widely used because linear filters are easy to design and to implement. However, these methods are not effective when the noise cannot be estimated or when the noise is colored. To overcome these difficulties, nonlinear methods have been proposed, especially those based on wavelet thresholding [10]-[11]. A limitation of the wavelet approach is that the basis functions are fixed and thus do not necessarily match all real signals. To overcome the drawbacks of the wavelet method, two strategies for noise reduction have been proposed in [8]: EMD associated with filtering is efficient for relatively low noise levels, while EMD associated with thresholding is attractive in particular for relatively high noise levels. However, in [8], only signals corrupted by additive white Gaussian noise are considered.

In this paper, an adaptive denoising scheme associating EMD with the ACWA filter is proposed. The ACWA filter [12] and related versions are mainly used in the image enhancement domain [13]. This filter operates adaptively in the time domain, which fits the EMD framework, and it requires neither the stationarity of the signals nor the whiteness of the noise. The effectiveness of the ACWA filter can be improved when it is associated with the EMD. Indeed, the IMFs are less noisy than the noisy speech. The proposed denoising method benefits from the advantages of the EMD and from the attractive properties of the ACWA filter, which is adaptive and easy to implement, to obtain good performance in the presence of white as well as colored noise.

II. EMD ALGORITHM

The EMD decomposes a given signal $x(t)$ into a series of IMFs, each one with a distinct time scale, through an iterative process called sifting [1]. The decomposition is based on the local time scale of $x(t)$ and yields adaptive basis functions. The EMD can be seen as a type of wavelet decomposition whose subbands are built up as needed to separate the different components of $x(t)$. Each IMF plays the role of the signal's detail at a certain scale or frequency band [4]. The EMD picks out the highest frequency oscillation that remains in $x(t)$. By definition, an IMF satisfies two conditions:
1) the number of extrema and the number of zero crossings may differ by no more than one;
2) the average value of the envelope defined by the local maxima and the envelope defined by the local minima is zero.
Thus, locally, each IMF contains lower frequency oscillations than the one extracted just before it. The EMD does not use a pre-determined filter or a wavelet function, and is a fully data-driven method [1]. To be successfully decomposed into IMFs, the signal $x(t)$ must have at least two extrema, one minimum and one maximum.
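As a rough illustration of condition 1), the following Python sketch counts the extrema and zero crossings of a sampled sequence and checks that the two counts differ by at most one. The function names are hypothetical and chosen for this example only, and condition 2) on the envelope mean is not checked here.

```python
import math


def count_extrema_and_zero_crossings(h):
    """Return (number of interior local extrema, number of sign changes)."""
    extrema = 0
    for t in range(1, len(h) - 1):
        # A slope sign change between consecutive differences marks an extremum.
        if (h[t] - h[t - 1]) * (h[t + 1] - h[t]) < 0:
            extrema += 1
    # A strict sign change between neighbouring samples marks a zero crossing.
    crossings = sum(1 for a, b in zip(h, h[1:]) if a * b < 0)
    return extrema, crossings


def satisfies_first_imf_condition(h):
    """Condition 1): extrema and zero-crossing counts differ by at most one."""
    extrema, crossings = count_extrema_and_zero_crossings(h)
    return abs(extrema - crossings) <= 1


# A sampled sinusoid oscillates around zero and passes the check.
tone = [math.sin(2 * math.pi * t / 10 + 0.3) for t in range(100)]
# The same tone riding on a constant offset never crosses zero and fails it.
shifted = [2.0 + v for v in tone]
```

For the sampled sinusoid the check passes, whereas adding a constant offset removes every zero crossing while keeping the extrema, so the check fails.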
The sifting involves the following steps:
Step 1: Fix the threshold $\epsilon$ and set $j \leftarrow 1$ ($j$th IMF)
Step 2: $r_{j-1}(t) \leftarrow x(t)$ (residual)
Step 3: Extract the $j$th IMF:
  (a): $h_{j,i-1}(t) \leftarrow r_{j-1}(t)$, $i \leftarrow 1$ ($i$ number of sifts)
  (b): Extract local maxima/minima of $h_{j,i-1}(t)$
  (c): Compute upper and lower envelopes $U_{j,i-1}(t)$ and $L_{j,i-1}(t)$ by interpolating, using cubic spline, respectively local maxima and minima of $h_{j,i-1}(t)$
  (d): Compute the mean of the envelopes: $\mu_{j,i-1}(t) = (U_{j,i-1}(t) + L_{j,i-1}(t))/2$
  (e): Update: $h_{j,i}(t) := h_{j,i-1}(t) - \mu_{j,i-1}(t)$, $i := i + 1$
  (f): Calculate the stopping criterion:
$$SD(i) = \sum_{t=1}^{T} \frac{|h_{j,i-1}(t) - h_{j,i}(t)|^2}{(h_{j,i-1}(t))^2}$$
  (g): Repeat Steps (b)-(f) until $SD(i) < \epsilon$ and then put $\mathrm{IMF}_j(t) \leftarrow h_{j,i}(t)$ ($j$th IMF)
Step 4: Update residual: $r_j(t) := r_{j-1}(t) - \mathrm{IMF}_j(t)$.
Step 5: Repeat Step 3 with $j := j + 1$ until the number of extrema in $r_j(t)$ is $\le 2$.

Here $T$ is the time duration of $x(t)$. The sifting is repeated several times ($i$) in order to get a true IMF that fulfills the conditions 1) and 2). The result of the sifting is that $x(t)$ is decomposed into a sum of $C$ IMFs and a residual $r_C(t)$ as follows:

$$x(t) = \sum_{j=1}^{C} \mathrm{IMF}_j(t) + r_C(t) \tag{1}$$

The value of $C$ is determined automatically using SD (Step 3(f)). The sifting has two effects: (a) it eliminates riding waves, and (b) it smoothes uneven amplitudes. To guarantee that the IMF components retain enough physical sense of both amplitude and frequency modulation, we have to determine the SD value for the sifting. This is accomplished by limiting the size of the standard deviation SD, computed from the two consecutive sifting results. Usually, SD (or $\epsilon$) is set between 0.2 and 0.3 [1].

2008 International Conference on Signals, Circuits and Systems
978-1-4244-2628-7/08/$25.00 ©2008 IEEE

III. THE EMD-ACWA DENOISING APPROACH

The proposed denoising method is shown in figure 1.

Fig. 1. Denoising scheme: the noisy signal $y(t)$ is decomposed by the EMD into $\mathrm{IMF}_1, \mathrm{IMF}_2, \mathrm{IMF}_3, \ldots, \mathrm{IMF}_C$ and a residual; each IMF passes through an ACWA filter and the outputs are summed with the residual to give $\tilde{x}(t)$.

The noisy signal $y(t)$, described by an additive model, is given by:

$$y(t) = x(t) + b(t), \tag{2}$$

where $x(t)$ corresponds to the clean speech signal and $b(t)$ denotes the noise signal. The noisy signal is decomposed into a sum of IMFs as follows:

$$y(t) = \sum_{j=1}^{C} \mathrm{IMF}_j(t) + r_C(t). \tag{3}$$

The extracted IMFs include the noise, since each IMF, indexed by $j$, can be approximated as follows:

$$\mathrm{IMF}_j(t) = f_j(t) + b_j(t), \tag{4}$$

where $\mathrm{IMF}_j$ is a noisy version of the data $f_j$. An estimation $\tilde{f}_j(t)$ of $f_j(t)$ based on the noisy observation $\mathrm{IMF}_j(t)$ is given by

$$\tilde{f}_j(t) = \Gamma[\mathrm{IMF}_j(t)], \tag{5}$$

where $\Gamma[\mathrm{IMF}_j(t)]$ is a temporal processing using the ACWA filter. Finally, the estimated signal, $\tilde{x}(t)$, is given by:

$$\tilde{x}(t) = \sum_{j=1}^{C} \tilde{f}_j(t) + r_C(t) \tag{6}$$

The denoising of the IMF by the ACWA filter is given as follows [12]:

$$\tilde{f}_j(t) = \begin{cases} F_{mean} + K_j\,(\mathrm{IMF}_j(t) - F_{mean}), & \text{if } F_{var} \ge \sigma_j^2 \\ F_{mean}, & \text{otherwise} \end{cases} \tag{7}$$

$$K_j = \left(1 - \frac{\sigma_j^2}{F_{var}}\right), \tag{8}$$

where $F_{mean}$ and $F_{var}$ denote respectively the average and the variance of the IMF computed over a sliding window of length $L$, and $\sigma_j^2$ designates the variance of the noise contained in the IMF indexed by $j$. The noise level $\sigma_j$ is estimated as in [14], [15], [16] as follows:

$$\sigma_j = 1.4826 \times \mathrm{Median}\{|\mathrm{IMF}_j(t) - \mathrm{Median}\{\mathrm{IMF}_j(t)\}|\}. \tag{9}$$

Classically, the ACWA filter has been used in image enhancement applications. It can also be interesting and effective in the context of audio signal enhancement. As shown by (7), this filter operates in the time domain, which corresponds well to the EMD framework. In contrast to classical filters such as the Wiener filter, all the parameters are computed in the time domain, and hence transformation to the frequency domain is not needed.
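Equations (7)-(9) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the authors' implementation: the names `mad_sigma` and `acwa_filter` are hypothetical, the window is centered on each sample and simply truncated at the signal borders (border handling is not specified in the text), and a guard against a zero local variance is added.

```python
import statistics


def mad_sigma(imf):
    """Noise level of an IMF from the median absolute deviation, as in (9)."""
    med = statistics.median(imf)
    return 1.4826 * statistics.median([abs(v - med) for v in imf])


def acwa_filter(imf, L=511, sigma=None):
    """ACWA filtering of one IMF following (7) and (8).

    L is the sliding-window length; sigma is the noise level, estimated
    with mad_sigma when not supplied. Centered, border-truncated windows
    are an assumption of this sketch.
    """
    if sigma is None:
        sigma = mad_sigma(imf)
    noise_var = sigma ** 2
    half = L // 2
    out = []
    for t in range(len(imf)):
        window = imf[max(0, t - half): t + half + 1]
        f_mean = sum(window) / len(window)
        f_var = sum((v - f_mean) ** 2 for v in window) / len(window)
        if f_var >= noise_var and f_var > 0:  # f_var > 0 is an added guard
            k = 1.0 - noise_var / f_var       # gain K_j of (8)
            out.append(f_mean + k * (imf[t] - f_mean))
        else:
            out.append(f_mean)                # local mean only
    return out
```

With a noise level of zero the gain equals one and the IMF passes through unchanged; with a very large noise level every sample is replaced by its local mean. Applying such a filter to each IMF and summing the outputs with the residual corresponds to the reconstruction in (6).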

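The sifting procedure of Section II (Steps 1 to 5) can be sketched as follows. This is a simplified illustration, not a reference implementation: the function names are hypothetical, the envelopes are built by piecewise-linear interpolation instead of the cubic splines used in the paper (to keep the example free of dependencies), a small constant is added to the denominator of the stopping criterion to avoid division by zero, and the caps `max_sifts` and `max_imfs` are safeguards that are not part of the original description.

```python
def local_extrema(h):
    """Indices of interior local maxima and minima of a sequence."""
    maxima, minima = [], []
    for t in range(1, len(h) - 1):
        if h[t - 1] < h[t] >= h[t + 1]:
            maxima.append(t)
        elif h[t - 1] > h[t] <= h[t + 1]:
            minima.append(t)
    return maxima, minima


def envelope(h, knots):
    """Piecewise-linear envelope through h at the knots (ends held flat).

    Assumption of this sketch: linear interpolation replaces cubic splines.
    """
    env, k = [], 0
    for t in range(len(h)):
        if t <= knots[0]:
            env.append(h[knots[0]])
        elif t >= knots[-1]:
            env.append(h[knots[-1]])
        else:
            while knots[k + 1] < t:
                k += 1
            a, b = knots[k], knots[k + 1]
            w = (t - a) / (b - a)
            env.append((1 - w) * h[a] + w * h[b])
    return env


def sift_imf(r, eps=0.25, max_sifts=50):
    """Step 3: sift the residual r until SD < eps (or max_sifts is reached)."""
    h = list(r)
    for _ in range(max_sifts):
        maxima, minima = local_extrema(h)
        if not maxima or not minima:
            break
        upper, lower = envelope(h, maxima), envelope(h, minima)
        h_new = [v - (u + l) / 2 for v, u, l in zip(h, upper, lower)]
        # Stopping criterion of Step 3(f); 1e-12 guards against division by zero.
        sd = sum((a - b) ** 2 / (a * a + 1e-12) for a, b in zip(h, h_new))
        h = h_new
        if sd < eps:
            break
    return h


def emd(x, eps=0.25, max_imfs=10):
    """Steps 1-5: return the list of IMFs and the final residual."""
    imfs, r = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(r)
        if len(maxima) + len(minima) <= 2:  # stopping rule of Step 5
            break
        imf = sift_imf(r, eps)
        imfs.append(imf)
        r = [a - b for a, b in zip(r, imf)]  # Step 4
    return imfs, r
```

Whatever the quality of the envelopes, the decomposition is complete by construction: summing the extracted IMFs and the final residual gives back the input, as in (1).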
With the ACWA filter, the hypotheses of signal stationarity and noise whiteness are relaxed. Besides, as the signal is enhanced sample by sample, this filter can perform in general noisy contexts: white as well as colored noise, high as well as low noise level.

IV. EXPERIMENTAL RESULTS

The proposed noise reduction method is tested on a speech signal corrupted by different noises whose levels are fixed through the input Signal to Noise Ratio (SNR):

$$\mathrm{SNR}_{in} = 10 \log_{10} \frac{\sum_{t=1}^{T} (x(t))^2}{\sum_{t=1}^{T} (y(t) - x(t))^2}, \tag{10}$$

where $x(t)$ and $y(t)$ are respectively the clean and the noisy signals. The results obtained by the proposed method are compared to the wavelet approach (Daubechies 8) and the ACWA filter. We choose db8 with a hard threshold as a tool of comparison because it gives good results compared with other wavelets. We use the ACWA filter as a comparison method because it gives better results than the MMSE filter [17]. As an objective criterion to evaluate the performance of the denoising method, we use the output Signal to Noise Ratio:

$$\mathrm{SNR}_{out} = 10 \log_{10} \frac{\sum_{t=1}^{T} (\tilde{x}(t))^2}{\sum_{t=1}^{T} (x(t) - \tilde{x}(t))^2}, \tag{11}$$

where $\tilde{x}(t)$ is the reconstructed signal.

The size $L$ of the ACWA filter window is fixed to 511. This choice is justified by the results shown in figure 3, which displays the variations of the SNRout versus $L$ for two values of SNRin: -2 dB and 0 dB. Figure 3 shows that for $L = 511$ the SNRout remains almost constant.

We take as examples two speech signals "a" and "b". These signals are corrupted by a colored noise "f16" with the SNR value fixed to -2 dB. The original signals and their corresponding noisy versions are depicted in figure 2. The denoised versions of signals "a" and "b" obtained by the EMD-ACWA, the wavelet thresholding (db8), and the ACWA filter are shown respectively in figures 4 and 5, with the SNRin fixed to -2 dB. A careful comparative examination of the signals of figures 4 and 5 shows that the EMD-ACWA performs better than the wavelet (db8) and the ACWA filter in terms of noise reduction. Note that, when listening to the enhanced speech, the EMD-ACWA produces lower residual noise and noticeably less speech distortion compared to the wavelet (db8) method and the ACWA filter.

Figures 6, 7 and 8 show the variations of the SNRout versus the SNRin for the denoising of signal "a" when corrupted respectively by a white Gaussian noise, the colored f16 noise and the colored factory noise. A significant SNR improvement, varying from 4.2 dB to 17.4 dB, is achieved by the EMD-ACWA method. In particular, even for very low SNR values, we can still observe the effectiveness of the proposed method in removing the noise components, as the gain in SNR can go up to 14 dB. In addition, the improvement in SNR provided by the EMD-ACWA is much higher than those given by the wavelet method and the ACWA filter. These results demonstrate the effectiveness of the proposed method.

V. CONCLUSION

In this paper, a new speech enhancement method to effectively remove the noise components is presented. We have combined two powerful adaptive methods: the EMD and the ACWA filtering. Results obtained for speech signals contaminated with different noises, with SNR values ranging from -10 dB to 10 dB, showed that the proposed method performs better than the wavelet approach and the ACWA filter. The reported results demonstrated that the EMD-ACWA denoising method is effective for noise removal and confirmed that it is a very attractive method to use in general noisy contexts.

REFERENCES

[1] N.E. Huang and al. The empirical mode decomposition and Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. Royal Society, 454(1971):903–995, 1998.
[2] A.O. Boudraa, J.C. Cexus, F. Salzenstein, and L. Guillon. IF estimation using empirical mode decomposition and nonlinear Teager energy operator. In Proc. IEEE ISCCSP, pages 45–48, Hammamet, 2004.
[3] S. Benramdane, J.C. Cexus, A.O. Boudraa, and J.A. Astolfi. Transient turbulent pressure signal processing using empirical mode decomposition. In Proc. Physics in Signal and Image Processing, Mulhouse, 2007.
[4] P. Flandrin, G. Rilling, and P. Goncalves. Empirical mode decomposition as a filter bank. IEEE Sig. Proc. Lett., 11(2):112–114, 2004.
[5] Z. Wu and N.E. Huang. A study of the characteristics of white noise using the empirical mode decomposition method. Proc. Roy. Soc. London A, 460:1597–1611, 2004.
[6] B. Weng and K.E. Barner. Optimal and bidirectional optimal empirical mode decomposition. In Proc. IEEE ICASSP, 3:1501–1504, Toulouse, 2007.
[7] R. Deering and J.F. Kaiser. The use of a masking signal to improve empirical mode decomposition. In Proc. IEEE ICASSP, 4:485–488, Philadelphia, 2005.
[8] K. Khaldi, A.O. Boudraa, A. Bouchiki, M. Turki-Hadj Alouane, and E. Samba Diop. Speech signal noise reduction by EMD. In Proc. IEEE ISCCSP, Malta, March 2008.
[9] J.G. Proakis and D.G. Manolakis. Digital Signal Processing: Principles, Algorithms, and Applications, volume 1. Prentice-Hall, 3rd edition, 1996.
[10] D.L. Donoho and I.M. Johnstone. Ideal spatial adaptation via wavelet shrinkage. Biometrika, 81:425–455, 1994.
[11] D.L. Donoho. De-noising by soft-thresholding. IEEE Trans. Inform. Theory, 41(3):613–627, 1995.
[12] J.S. Lee. Digital image enhancement and noise filtering by using local statistics. IEEE Trans. Pattern Anal. Mach. Intell., 2(4):165–168, March 1980.
[13] Masayuki Meguro, Akira Taguchi, and Nozomu Hamada. Data-dependent weighted average filtering for image sequence restoration. Electronics and Communications in Japan, Part III: Fundamental Electronics Science, 84(4):1–10, 2000.
[14] A.O. Boudraa, J.C. Cexus, and Z. Saidi. EMD-based signal noise reduction. Int. J. Sig. Process., 1(1):33–37, 2004. ISSN: 1304-4494.
[15] A.O. Boudraa and J.C. Cexus. Denoising via empirical mode decomposition. In Proc. IEEE ISCCSP, Marrakech, Morocco, 2006.

[16] William H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery. Numerical Recipes in C: The Art of Scientific Computing, volume 1. Cambridge University Press, 2nd edition, 1992.
[17] K. Khaldi, M. Turki-Hadj Alouane, and A.O. Boudraa. Voiced speech enhancement based on adaptive filtering of selected intrinsic mode functions. Advances in Adaptive Data Analysis (AADA), 2008 (submitted).

Fig. 2. The original signals ("a" and "b") and their noisy versions (f16 noise with SNR = -2 dB). Panels: a, b, noisy a, noisy b; axes: Time, Amplitude.

Fig. 3. The variation of the SNRout relating to the noisy signal "a" versus L, the size of the ACWA filter window (f16 noise with SNR = -2 dB and SNR = 0 dB). Curves: initial SNR = -2 dB and initial SNR = 0 dB.

Fig. 4. Denoised version of the signal "a" obtained by the EMD-ACWA, the Wavelet (db8) and the ACWA filter (f16 noise with SNR = -2 dB). Panels: EMD-ACWA, Wavelet (db8), ACWA filter; axes: Time, Amplitude.

Fig. 5. Denoised version of the signal "b" obtained by the EMD-ACWA, the Wavelet (db8) and the ACWA filter (f16 noise with SNR = -2 dB). Panels: EMD-ACWA, Wavelet (db8), ACWA filter; axes: Time, Amplitude.

Fig. 6. Variation of the SNRout versus the SNRin relating to the denoising of the signal "a" corrupted by a white Gaussian noise. Curves: EMD-ACWA, Wavelet (Daubechies 8), ACWA filter; axes: Initial SNR [dB], SNR output [dB].

Fig. 7. Variation of the SNRout versus the SNRin relating to the denoising of the signal "a" corrupted by the f16 noise. Curves: EMD-ACWA, Wavelet (Daubechies 8), ACWA filter; axes: Initial SNR [dB], SNR output [dB].

Fig. 8. Variation of the SNRout versus the SNRin relating to the denoising of the signal "a" corrupted by the factory noise. Curves: EMD-ACWA, Wavelet (Daubechies 8), ACWA filter; axes: Initial SNR [dB], SNR output [dB].

