
Independent Component Analysis For Time Series Separation

Ahtasham Ashraf

ICA

- Blind Signal Separation (BSS), or Independent Component Analysis (ICA), is the identification & separation of mixtures of sources with little prior information.
- We will concentrate on time series separation of multiple targets.
- Applications include:
  - Audio processing
  - Medical data
  - Finance
  - Array processing (beamforming)
  - Coding
  - ... and most applications where Factor Analysis and PCA are currently used.
- While PCA seeks the directions that represent the data best in a ||x0 - x||^2 sense, ICA seeks the directions that are most independent from each other.

The Simple Cocktail Party Problem

- Sources s1, s2 (n sources) pass through a mixing matrix A to give the observations x1, x2 (m = n observations): x = As.

Motivation

- Two independent sources, mixed at two microphones:

    x1(t) = a11*s1 + a12*s2
    x2(t) = a21*s1 + a22*s2

- The coefficients a_ij depend on the distances of the microphones from the speakers.

Motivation

- Get the independent signals out of the mixture.

ICA Model (Noise-Free)

- Use a statistical "latent variables" model: random variables sk instead of time signals.
- xj = aj1*s1 + aj2*s2 + ... + ajn*sn, for all j; in matrix form, x = As.
- The ICs s are latent variables and thus unknown, AND the mixing matrix A is also unknown.
- Task: estimate A and s using only the observable random vector x.
- Assume that the number of ICs equals the number of observable mixtures, so that A is square and invertible.
- So after estimating A, we can compute W = A^-1 and hence s = Wx = A^-1 x (see the sketch below).
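
To make the model concrete, the following minimal sketch (not part of the original slides) mixes two synthetic non-Gaussian sources and estimates W with scikit-learn's FastICA, one standard ICA estimator; the sources, the mixing matrix, and all parameter values are illustrative assumptions.

```python
# Minimal sketch of the noise-free model x = As and its inversion s = Wx.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
n = 2000
s = np.column_stack([
    np.sign(np.sin(np.linspace(0, 40, n))),    # square-ish wave source
    rng.uniform(-np.sqrt(3), np.sqrt(3), n),   # uniform source, unit variance
])
A = np.array([[1.0, 0.5],
              [0.7, 1.2]])          # mixing matrix (unknown in practice)
x = s @ A.T                          # observed mixtures: each row obeys x = As

ica = FastICA(n_components=2, random_state=0)
s_hat = ica.fit_transform(x)         # recovered ICs, up to sign/scale/order
W = ica.components_                  # estimated unmixing matrix, s_hat ~ Wx
```

Note that s_hat matches s only up to sign, scale, and permutation, which is exactly the ambiguity discussed on the Ambiguities slide below.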

Illustration

- Two ICs with distribution

    p(si) = 1 / (2*sqrt(3))  if |si| <= sqrt(3),   0 otherwise

  i.e., uniform with zero mean and variance equal to 1.
- Mixing matrix:

    A = [ 2  3 ]
        [ 2  1 ]

- The edges of the parallelogram formed by the joint density of (x1, x2) lie in the directions of the columns of A.
- So if we can estimate the joint pdf of x1 & x2 and then locate the edges, we can estimate A (see the sketch below).
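
The parallelogram is easy to visualize by simulation; this sketch (my own illustration, reusing the slide's A and source density) scatters the mixtures and overlays the column directions of A.

```python
# Two unit-variance uniform ICs mixed by the slide's A: the scatter of
# (x1, x2) is a parallelogram whose edges follow the columns of A.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(5000, 2))  # p(si) from the slide
A = np.array([[2.0, 3.0],
              [2.0, 1.0]])
x = s @ A.T

plt.scatter(x[:, 0], x[:, 1], s=2, alpha=0.3)
for col in A.T:                               # overlay the columns of A
    plt.plot([0, 2 * col[0]], [0, 2 * col[1]], lw=2)
plt.title("Mixture density: edges follow the columns of A")
plt.show()
```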

Restrictions

- The si must be statistically independent: p(s1, s2) = p(s1) p(s2).
- The si must have non-Gaussian distributions:
  - The joint density of unit-variance Gaussian s1 & s2,

        p(x1, x2) = (1 / (2*pi)) * exp(-(x1^2 + x2^2) / 2),

    is rotationally symmetric, so it contains no information about the directions of the columns of the mixing matrix A. So A cannot be estimated.
  - If only one IC is Gaussian, the estimation is still possible.
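
The Gaussian case can be checked numerically: for Gaussian sources, the distribution of x = As is fully determined by the covariance A*A^T, so A is only identifiable up to a rotation. A small sketch of that fact (the example values are mine):

```python
# For Gaussian s, x = As and x = (AR)s share the covariance A A^T for any
# rotation R (since R R^T = I), hence they have identical distributions:
# A is not identifiable from Gaussian data.
import numpy as np

A = np.array([[2.0, 3.0],
              [2.0, 1.0]])
t = 0.7                                        # arbitrary rotation angle
R = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])

print(A @ A.T)               # covariance of As for unit-variance Gaussian s
print((A @ R) @ (A @ R).T)   # identical matrix -> same Gaussian distribution
```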

Ambiguities

- Can't determine the variances (energies) of the ICs:
  - Because both s & A are unknown, any scalar multiple of one of the sources can always be cancelled by dividing the corresponding column of A by it (see the check below).
  - Fix the magnitudes of the ICs by assuming unit variance: E{si^2} = 1.
  - Only the ambiguity of sign remains.
- Can't determine the order of the ICs:
  - The terms can be freely reordered, so we can call any IC the first one.
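
The scaling ambiguity can be verified in a few lines; this sketch (with illustrative values) rescales one source, divides the matching column of A by the same factor, and checks that the mixtures are unchanged.

```python
# Scaling ambiguity: multiply a source by alpha and divide the matching
# column of A by alpha; x = As is unchanged, so source energies are not
# identifiable from x alone.
import numpy as np

rng = np.random.default_rng(5)
A = rng.normal(size=(2, 2))
s = rng.normal(size=(2, 1000))
alpha = 7.3

A2 = A.copy(); A2[:, 0] /= alpha      # rescale a column of A ...
s2 = s.copy(); s2[0, :] *= alpha      # ... and the matching source
print(np.allclose(A @ s, A2 @ s2))    # True: the mixtures are identical
```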

ICA Principle (Non-Gaussian is Independent)

- The key to estimating A is non-gaussianity.
- The distribution of a sum of independent random variables tends toward a Gaussian distribution (by the CLT). Since a sum of two independent r.v.s is more Gaussian than the individual r.v.s, f(x1) = f(s1 + s2) is closer to Gaussian than f(s1) or f(s2).
- Consider

    y = w^T x = w^T A s = z^T s,

  where w is one of the rows of the matrix W.
- y is a linear combination of the si with weights given by the zi, so z^T s is more Gaussian than either of the si, AND it becomes least Gaussian when it equals one of the si.
- So we can take w to be a vector that maximizes the non-gaussianity of w^T x. Such a w corresponds to a z with only one non-zero component, so we get back one of the si (see the numerical check below).
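
As a numerical check of this principle (same illustrative set-up as in the earlier sketches): whiten the mixtures, sweep a unit vector w around the circle, and track the non-gaussianity of y = w^T x~, here measured by the kurtosis defined on the next slide.

```python
# Sweep w over the unit circle on whitened mixtures and measure |kurt(y)|
# of y = w^T x~; it is extremal when y recovers one of the (sub-Gaussian,
# uniform) sources up to sign.
import numpy as np

rng = np.random.default_rng(2)
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(20000, 2))
x = s @ np.array([[2.0, 3.0], [2.0, 1.0]]).T

d, E = np.linalg.eigh(np.cov(x.T))                     # EVD of the covariance
xw = (x - x.mean(0)) @ (E @ np.diag(d ** -0.5) @ E.T)  # whitened mixtures

def kurt(y):                                  # kurtosis, zero for a Gaussian
    return np.mean(y ** 4) - 3 * np.mean(y ** 2) ** 2

angles = np.linspace(0, np.pi, 181)
k = [abs(kurt(xw @ np.array([np.cos(t), np.sin(t)]))) for t in angles]
print("most non-Gaussian direction:", angles[int(np.argmax(k))])
```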

Measures of Non-Gaussianity

- We need a quantitative measure of non-gaussianity for ICA estimation.
- Kurtosis (zero for a Gaussian, but sensitive to outliers):

    kurt(y) = E{y^4} - 3*(E{y^2})^2

- Entropy (largest for a Gaussian, but difficult to estimate):

    H(y) = -∫ f(y) log f(y) dy

- Negentropy (zero for a Gaussian):

    J(y) = H(y_gauss) - H(y)

- Approximations:

    J(y) ≈ (1/12) E{y^3}^2 + (1/48) kurt(y)^2
    J(y) ≈ [ E{G(y)} - E{G(v)} ]^2

  where v is a standard Gaussian random variable and, e.g.,

    G(y) = (1/a) log cosh(a*y)   or   G(u) = -exp(-u^2 / 2)
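
The negentropy approximation is straightforward to estimate by Monte Carlo; this sketch uses the slide's G(y) = (1/a) log cosh(a*y), with a, the sample sizes, and the test distributions being my illustrative choices.

```python
# Monte Carlo estimate of J(y) ~ (E{G(y)} - E{G(v)})^2 with
# G(y) = (1/a) log cosh(a y); v is a standard Gaussian reference.
import numpy as np

rng = np.random.default_rng(3)
a = 1.0
G = lambda y: np.log(np.cosh(a * y)) / a

def negentropy(y, n_mc=200_000):
    v = rng.standard_normal(n_mc)             # standard Gaussian samples
    return (G(y).mean() - G(v).mean()) ** 2

uniform = rng.uniform(-np.sqrt(3), np.sqrt(3), 200_000)  # unit variance
gauss = rng.standard_normal(200_000)
print(negentropy(uniform))   # clearly positive: non-Gaussian
print(negentropy(gauss))     # ~ 0: Gaussian
```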

Data Centering & Whitening

- Centering: x = x' - E{x'}
  - This doesn't mean that ICA cannot estimate the mean; it just simplifies the algorithm.
  - The ICs are also zero mean, because E{s} = W E{x}.
  - After ICA, add W E{x'} back to the zero-mean ICs.
- Whitening
  - We transform the x's linearly so that x~ is white (uncorrelated, unit variance). This greatly simplifies ICA. It is done by EVD:

        x~ = (E D^-1/2 E^T) x = E D^-1/2 E^T A s = A~ s,   where E{x x^T} = E D E^T

  - So we only have to estimate an orthonormal matrix A~.
  - An orthonormal matrix has n(n-1)/2 degrees of freedom, so for a large-dimensional A we have to estimate only half as many parameters.
  - Reducing the dimension of the data (keeping the dominant eigenvalues) while whitening also helps.
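
A minimal sketch of both steps, following the slide's EVD formulas (the helper name and interface are mine):

```python
# Centering plus EVD-based whitening: x~ = E D^{-1/2} E^T (x' - E{x'}),
# where the covariance is factored as E{x x^T} = E D E^T.
import numpy as np

def center_whiten(X):
    """X is (samples x channels); returns whitened data and V = E D^-1/2 E^T."""
    Xc = X - X.mean(axis=0)               # centering: x = x' - E{x'}
    C = Xc.T @ Xc / len(Xc)               # sample covariance E{x x^T}
    d, E = np.linalg.eigh(C)              # EVD: C = E D E^T
    V = E @ np.diag(d ** -0.5) @ E.T      # whitening matrix E D^{-1/2} E^T
    return Xc @ V.T, V                    # whitened data has identity covariance
```

Dropping the smallest eigenvalues in D at the EVD step gives the dimensionality reduction mentioned above.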

Noisy ICA Model

    x = As + n

- A ... m x n mixing matrix
- s ... n-dimensional vector of ICs
- n ... m-dimensional random noise vector
- Same assumptions as for the noise-free model; however, we use measures of non-gaussianity which are immune to Gaussian noise.
- So Gaussian moments are used as contrast functions:

    J(y) ≈ [ E{G(y)} - E{G(v)} ]^2,   with   G(y) = (1 / (sqrt(2*pi)*c)) * exp(-y^2 / (2*c^2))

- However, in pre-whitening the effect of noise must be taken into account:

    x~ = (E{x x^T} - Sigma)^(-1/2) x,   so that   x~ = Bs + n~

  where Sigma is the noise covariance (see the sketch below).
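
A sketch of the quasi-whitening step, assuming the noise covariance Sigma is known or estimated separately; the eigenvalue clipping is my own guard against estimation error, not part of the slides.

```python
# Quasi-whitening for x = As + n: whiten with E{x x^T} - Sigma instead of
# the plain covariance, so that the signal part Bs is white.
import numpy as np

def quasi_whiten(X, Sigma):
    Xc = X - X.mean(axis=0)
    C = Xc.T @ Xc / len(Xc) - Sigma       # noise-corrected covariance
    d, E = np.linalg.eigh(C)
    d = np.clip(d, 1e-12, None)           # keep the inverse square root real
    V = E @ np.diag(d ** -0.5) @ E.T      # (E{x x^T} - Sigma)^{-1/2}
    return Xc @ V.T                       # x~ = Bs + n~
```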

Simulation Results

- I have used synthetic data, with & without noise, to separate the time series of DW & AAV, which are moving fairly close to each other.

Simulation Results

References

- ICA demo step-by-step: http://www.cis.hut.fi/projects/ica/icademo/
- Lots of links: http://sound.media.mit.edu/~paris/ica.html
- Object-based audio capture demos: http://www.media.mit.edu/~westner/sepdemo.html
- Demo for BSS with "CoBliSS" (wav-files): http://www.esp.ele.tue.nl/onderzoek/daniels/BSS.html
- Tomas Zeman's page on BSS research: http://ica.fun-thom.misto.cz/page3.html
- Virtual Laboratories in Probability and Statistics: http://www.math.uah.edu/stat/index.html
- Feature extraction (Images, Video): http://hlab.phys.rug.nl/demos/ica/
- Aapo Hyvarinen: ICA (1999): http://www.cis.hut.fi/aapo/papers/NCS99web/node11.html