
2013 IEEE International Conference on Consumer Electronics (ICCE)

Speech Enhancement by Kalman Filtering with a Particle Filter-Based Preprocessor


Yun-Kyung Lee, Gyeo-Woon Jung, and Oh-Wook Kwon, Member, IEEE
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. 2012-0001730).

Abstract: To reduce nonstationary noise in real environments, we propose to use a particle filter as a preprocessor for Kalman filtering. From the noisy input speech signal, the autoregressive (AR) model parameters are estimated by a particle filter. The clean speech signal is then estimated by a Kalman filter configured with the estimated parameters. Experimental results show that when the speech signal is corrupted by babble noise, the proposed algorithm improves the output SNR by 1.5 dB.

I. INTRODUCTION

A Kalman filter is an effective algorithm for enhancing speech signals from a series of measurements observed over time that contain random noise and other inaccuracies. The Kalman filter converges faster than a normalized least-mean-square (NLMS)-based adaptive filter. There have been numerous studies on Kalman filtering for speech enhancement [1]. Even though noise observed in real situations is nonstationary and dynamic, previous studies on Kalman filtering mostly assumed stationary white Gaussian noise for simplicity.

We present a sequential nonstationary speech enhancement method that combines Kalman filtering with a particle filter [2] to estimate the parameters of the speech signal and the variance of the nonstationary additive noise. Sequential importance sampling (SIS) is used to estimate the parameters of the particle filter. The particle filter estimates the clean speech signal frame by frame, and this estimate is passed to a Kalman filter. In this work, the speech signal is modeled as an autoregressive (AR) process. The noise variance and the parameters of the AR process are estimated for the Kalman filter. Our experimental results show that the proposed Kalman filtering with a particle filter leads to a significant signal-to-noise ratio (SNR) gain and improves the speech quality.

II. SYSTEM DESCRIPTION

A. Particle filter-based parameter estimation

In the general formulation of the state estimation problem, the objective is to track the time evolution of the filtering density. If we assume that the parameters of the speech and noise signals are known, the optimal estimate of the original speech signal can be obtained from a Kalman filter. However, in realistic scenarios, the background noise signal is unknown and the noise sources are mostly non-Gaussian. We use a particle filter to estimate the parameters (AR coefficients and noise variance) of the Kalman filter in non-Gaussian noise environments. The particle filter sequentially updates the filtering density under a relaxed Gaussian assumption. We estimate the parameters by using the SIS algorithm, which is summarized as [2]:

1. Sample $x_i^{(m)} \sim p(x_i \mid x_{i-1}^{(m)})$, $m = 1, \ldots, M$.

2. Compute the weights $w_i^{(m)} \propto p(y_i \mid x_i^{(m)})$, $m = 1, \ldots, M$.

3. Normalize the weights: $\tilde{w}_i^{(m)} = w_i^{(m)} / \sum_{k=1}^{M} w_i^{(k)}$, $m = 1, \ldots, M$.

4. Compute the new filtering density: $p(x_i \mid y_{1:i}) \approx \sum_{m=1}^{M} \tilde{w}_i^{(m)} \delta(x_i - x_i^{(m)})$.

5. Resample to obtain a new set of M equally weighted particles.

Resampling removes particles with low weights and replicates particles with high weights. Accordingly, the posterior probability distribution represented by the resampled particles is sharper. The particle filtering process is visualized in Fig. 1. From the speech signal estimated by the particle filter, the speech and noise parameters are computed through linear predictive coding (LPC). The estimated speech signal from the particle filter can be described by a p-th order AR model, and the state transition matrix $\tilde{F}$ is defined as:

$$
\tilde{F} =
\begin{bmatrix}
\tilde{l}_1 & \tilde{l}_2 & \cdots & \tilde{l}_{p-1} & \tilde{l}_p \\
1 & 0 & \cdots & 0 & 0 \\
\vdots & \ddots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 0
\end{bmatrix}
$$

where $\tilde{l}_1, \ldots, \tilde{l}_p$ are the LPC coefficients.
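The SIS steps 1 to 5 above can be illustrated with a minimal sketch. It is not the authors' implementation: the paper does not state its transition or likelihood models, so the scalar AR(1) state model, the Gaussian observation model, the function name `sis_step`, and the parameters `a`, `q_std`, and `r_std` are all assumptions made for illustration. Multinomial resampling is used here as one common choice for step 5.

```python
import math
import random


def sis_step(particles, y, a=0.9, q_std=0.5, r_std=0.5, rng=random):
    """One SIS iteration with resampling for a toy scalar model.

    Assumed state model:       x_i = a * x_{i-1} + N(0, q_std^2)
    Assumed observation model: y_i = x_i + N(0, r_std^2)
    """
    m = len(particles)
    # Step 1: sample each particle from the transition prior p(x_i | x_{i-1}).
    proposed = [a * x + rng.gauss(0.0, q_std) for x in particles]
    # Step 2: weight each particle by the likelihood p(y_i | x_i).
    weights = [math.exp(-0.5 * ((y - x) / r_std) ** 2) for x in proposed]
    # Step 3: normalize the weights so that they sum to one.
    total = sum(weights)
    if total == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0 / m] * m
    else:
        weights = [w / total for w in weights]
    # Step 4: the weighted particles approximate the filtering density;
    # its mean is used here as the state estimate.
    estimate = sum(w * x for w, x in zip(weights, proposed))
    # Step 5: draw M equally weighted particles (multinomial resampling).
    resampled = rng.choices(proposed, weights=weights, k=m)
    return resampled, estimate, weights


if __name__ == "__main__":
    rng = random.Random(0)
    particles = [rng.gauss(0.0, 1.0) for _ in range(500)]
    for y in [0.5, 0.8, 1.0, 1.1]:
        particles, estimate, _ = sis_step(particles, y, rng=rng)
        print(f"observation {y:+.2f} -> estimate {estimate:+.3f}")
```

In the paper's setting the estimated signal would then be passed to LPC analysis to obtain the AR coefficients; that stage is not shown here.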

Fig. 1. Concept of the particle filtering process [2]. The diagram shows three stages: prediction, in which particles $x_n^{(m)}$ are drawn from $p(x_n \mid x_{n-1})$; evaluation, in which the weights $w_n^{(m)} = p(y_n \mid x_n)$ are computed; and resampling of the filtering density $p(x_n \mid y_n)$, followed by prediction of $x_{n+1}^{(m)}$ from $p(x_{n+1} \mid x_n)$.

978-1-4673-1363-6/13/$31.00 © 2013 IEEE

Fig. 2 waveform panels: (a) Clean speech signal; (b) Noisy signal; (c) Enhanced signal with the baseline algorithm.

B. Estimation of clean speech by using the Kalman filter

After the speech signal is estimated in the first stage by the particle filter, we compute the AR parameters and the noise variance. Given these parameters, the final clean speech signal is extracted with a Kalman filtering process. Let $\tilde{v}_i$ denote the estimated observation (measurement) noise: $\tilde{v}_i = y_i - s_i$, where $y_i$ is the observed speech signal and $s_i$ is the original speech signal. Let $\tilde{V}$ denote the covariance matrix obtained from the estimated measurement noise $\tilde{v}_i$ of the particle filtering process. Then the Kalman filter is applied as follows.

Prediction:

$$\hat{x}_{i|i-1} = \tilde{F}_{i|i-1}\,\hat{x}_{i-1|i-1}$$

$$K_{i|i-1} = \tilde{F}_{i|i-1}\,K_{i-1}\,\tilde{F}_{i|i-1}^{T} + U_i$$

Correction:

$$G_i = K_{i|i-1} H_i^{T} \left[ H_i K_{i|i-1} H_i^{T} + \tilde{V}_i \right]^{-1}$$

$$e_i = y_i - H_i\,\hat{x}_{i|i-1}$$

$$\hat{x}_{i|i} = \hat{x}_{i|i-1} + G_i\,e_i$$

$$K_i = (I - G_i H_i)\,K_{i|i-1}$$
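The prediction and correction steps can be sketched in a few lines of dependency-free Python. This is an illustrative sketch under stated assumptions, not the authors' code: the paper does not specify the observation matrix or the structure of the process noise covariance, so the sketch assumes scalar observations with $H = [1, 0, \ldots, 0]$ and $U = \mathrm{diag}(u, 0, \ldots, 0)$. The names `companion_matrix`, `kalman_step`, `enhance`, `u_var`, and `v_var` are hypothetical.

```python
def companion_matrix(lpc):
    """State transition matrix F for an AR(p) model with coefficients lpc."""
    p = len(lpc)
    F = [[0.0] * p for _ in range(p)]
    F[0] = [float(c) for c in lpc]      # first row holds the LPC coefficients
    for r in range(1, p):
        F[r][r - 1] = 1.0               # sub-diagonal shifts past samples down
    return F


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def kalman_step(x, K, y, F, u_var, v_var):
    """One prediction/correction step, assuming H = [1, 0, ..., 0]."""
    p = len(x)
    # Prediction: x_pred = F x and K_pred = F K F^T + U.
    x_pred = [sum(F[r][c] * x[c] for c in range(p)) for r in range(p)]
    K_pred = matmul(matmul(F, K), transpose(F))
    K_pred[0][0] += u_var               # assumed U = diag(u_var, 0, ..., 0)
    # Correction: with scalar observations H K_pred H^T + V is a scalar,
    # so no matrix inversion is needed.
    s_var = K_pred[0][0] + v_var
    G = [K_pred[r][0] / s_var for r in range(p)]    # Kalman gain
    e = y - x_pred[0]                               # innovation
    x_new = [x_pred[r] + G[r] * e for r in range(p)]
    # K = (I - G H) K_pred
    K_new = [[K_pred[r][c] - G[r] * K_pred[0][c] for c in range(p)]
             for r in range(p)]
    return x_new, K_new


def enhance(noisy, lpc, u_var, v_var):
    """Run the filter over a frame and return the filtered samples."""
    p = len(lpc)
    F = companion_matrix(lpc)
    x = [0.0] * p
    K = [[1.0 if r == c else 0.0 for c in range(p)] for r in range(p)]
    out = []
    for y in noisy:
        x, K = kalman_step(x, K, y, F, u_var, v_var)
        out.append(x[0])                # first state entry is the current sample
    return out
```

A call such as `enhance(frame, lpc, u_var, v_var)` would process one frame, with `lpc` and `v_var` taken from the particle filter stage. Setting `v_var` to zero makes the filter trust the observations fully, while a large `v_var` pulls the output toward the AR prediction.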
In the Kalman filter equations, $\hat{x}_{i|i-1}$ is the predicted state estimate, $K_{i|i-1}$ is the predicted state-error covariance matrix, $K_i$ is the filtering-error covariance matrix, $U_i$ is the covariance matrix of the process noise, $H_i$ is the observation matrix, and $G_i$ is the Kalman gain.

III. EXPERIMENTAL RESULTS

We performed computer experiments to evaluate the proposed algorithm in various noise environments by using the database of the Speech Separation Challenge [3]. Three types of noise were added: N1 (car noise), N2 (babble noise), and N3 (white Gaussian noise). The sampling rate of the speech database was lowered from 25 kHz to 16 kHz. The noisy speech signals were generated by mixing clean speech with the noise sources at SNRs of -10, -5, 0, 5, and 10 dB. Note that N1 and N3 are stationary, whereas N2 is nonstationary.

Fig. 2 shows the waveforms of the clean, the noisy, and the enhanced speech signals, from top to bottom. In the figure, the noisy signal was corrupted with the N2 (babble) noise at an input SNR of 0 dB. We confirmed that the noise was suppressed remarkably in the enhanced speech signal. We also computed the output SNR (dB) of the enhanced speech signal. Table I compares the output SNR under the three noise conditions for the Kalman filter with an NLMS adaptive filter-based preprocessor (Baseline) and the Kalman filter with a particle filter-based preprocessor (Proposed). The proposed algorithm provides an average SNR increase of 2.7 dB, 1.5 dB, and 0.5 dB under the N1, N2, and N3 noise conditions, respectively. These results indicate that our algorithm improves the objective quality measure in nonstationary environments as well as in stationary environments.

(d) Enhanced signal with the proposed algorithm
Fig. 2. Sample waveforms from our computer experiments.

TABLE I
OUTPUT SNR (dB) UNDER DIFFERENT NOISE CONDITIONS

Noise type   Algorithm   Input SNR (dB)                        Average
                         -10     -5      0       5       10
N1           Baseline    1.7     2.9     4.1     5.5     6.3    4.1
N1           Proposed    3.8     5.2     7.5     8.4     9.1    6.8
N2           Baseline    1.5     3.8     5.2     6.2     6.9    4.7
N2           Proposed    2.9     4.7     7.1     7.9     8.6    6.2
N3           Baseline    3.0     2.1     2.6     4.1     5.8    6.0
N3           Proposed    3.4     4.6     7.6     7.8     8.8    6.4

IV. CONCLUSIONS

We proposed a Kalman filter-based speech enhancement algorithm in which a particle filter is used as a preprocessor to estimate the Kalman filter parameters in nonstationary noise conditions. In computer experiments with artificially mixed noisy speech signals, the proposed algorithm improved the output SNR by 2.7 dB, 1.5 dB, and 0.5 dB for the car noise, babble noise, and white Gaussian noise conditions, respectively.

REFERENCES
[1] D. C. Popescu and I. Zeljkovic, "Kalman filtering of colored noise for speech enhancement," Proc. ICASSP, pp. 997-1000, 1998.
[2] M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, "A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking," IEEE Trans. Signal Processing, vol. 50, no. 2, Feb. 2002.
[3] M. P. Cooke, J. Barker, S. P. Cunningham, and X. Shao, "An audio-visual corpus for speech perception and automatic speech recognition," J. Acoust. Soc. Am., vol. 120, no. 5, pp. 2421-2424, Nov. 2006.
