
Independent Component Analysis
for Blind Source Separation and
Dimensionality Reduction
Dipanjan Roy
Associate Professor
School of AIDE
Indian Institute of Technology Jodhpur
Outline

1 Blind Source Separation

2 Independent Component Analysis

3 Experiments and Applications

4 Summary



Motivation

• A method: finds underlying factors or components from multi-dimensional statistical data

• Distinguishing feature: looks for components that are both statistically independent and non-Gaussian
1. ORIGINAL SOUND SOURCES

2. SAMPLES AT THE COCKTAIL PARTY

3. FOUND SOUND SOURCES


Apply ICA to separate the sampled sound sources

http://research.ics.aalto.fi/ica/cocktail/cocktail_en.cgi
What is Blind Source Separation?

Blind Source Separation (BSS) is a method to estimate the original signals from observed signals that consist of mixed original signals and noise.



Example of BSS

BSS is often used for speech analysis and image analysis.



Example of BSS (cont’d)

BSS is also very important for brain signal analysis.



ICA scope

• Independent Component Analysis


– ICA model
– ICA theory
– ICA applications/results

• Independent Subspace Analysis


– ISA model
– ISA theory
– ISA results
Model Formalization
The problem of BSS is formalized as follows. The matrix

X ∈ R^(m×d)   (1)

denotes the original signals, where m is the number of original signals and d is the dimension of one signal.
We consider that the observed signals Y ∈ R^(n×d) are given by a linear mixing system:

Y = A X + E,   (2)

where A ∈ R^(n×m) is the unknown mixing matrix and E ∈ R^(n×d) denotes noise. Basically, n ≥ m.

The goal of BSS is to estimate Â and X̂ so that X̂ reproduces the unknown original signals as accurately as possible.
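As a concrete illustration, here is a minimal NumPy sketch of the mixing model in Eq. (2); the Laplace-distributed sources, matrix sizes, and noise level are illustrative assumptions, not part of the original formulation.

    import numpy as np

    rng = np.random.default_rng(0)

    m, n, d = 3, 4, 1000                 # m sources, n observations, d samples each
    X = rng.laplace(size=(m, d))         # original (non-Gaussian) signals, X in R^(m x d)
    A = rng.normal(size=(n, m))          # unknown mixing matrix, A in R^(n x m)
    E = 0.01 * rng.normal(size=(n, d))   # additive noise, E in R^(n x d)

    Y = A @ X + E                        # observed signals, Eq. (2); note n >= m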

Kinds of BSS Methods

In fact, the BSS model has too many degrees of freedom to estimate A and X directly, because there are a huge number of pairs (A, X) that satisfy Y = A X + E.
Therefore, we need some constraint to solve the BSS problem, such as:
PCA : orthogonality constraint
SCA : sparsity constraint
NMF : non-negativity constraint
ICA : independence constraint
In this way, there are many methods to solve the BSS problem depending on the constraints. Which one we use depends on the subject matter.
Non-negative Matrix Factorization (NMF) was introduced in my previous seminar; its solution can be obtained with the alternating least squares algorithm.
Today, I will introduce another method, Independent Component Analysis.



Independent Component Analysis

The party problem

x1(t) = a11 s1(t) + a12 s2(t) + a13 s3(t)   (3)
x2(t) = a21 s1(t) + a22 s2(t) + a23 s3(t)   (4)
x3(t) = a31 s1(t) + a32 s2(t) + a33 s3(t)   (5)

x is an observed signal, and s is an original signal. We assume that {s1, s2, s3} are statistically independent of each other.

The model of ICA

Independent Component Analysis (ICA) estimates the independent components s(t) from x(t):

x(t) = A s(t)   (6)
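A minimal sketch of the party problem using scikit-learn’s FastICA (the three synthetic “voices” below are illustrative assumptions; any independent non-Gaussian signals would do). Note that scikit-learn stores signals as columns, and that sources are recovered only up to permutation and scaling.

    import numpy as np
    from sklearn.decomposition import FastICA

    rng = np.random.default_rng(0)
    t = np.linspace(0, 8, 2000)

    # three independent "voices" s1(t), s2(t), s3(t) (illustrative choices)
    S = np.c_[np.sin(2 * t), np.sign(np.sin(3 * t)), rng.laplace(size=t.size)]

    A = rng.normal(size=(3, 3))    # unknown mixing coefficients a_ij, Eqs. (3)-(5)
    X = S @ A.T                    # observed signals x1(t), x2(t), x3(t)

    ica = FastICA(n_components=3, random_state=0)
    S_hat = ica.fit_transform(X)   # estimated sources, up to order and scale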



Independent Component Analysis

Goal: [figure illustrating the goal of ICA: recover the original signals from the observed mixtures]
Approach

Hypothesis of ICA
1 {si} are statistically independent of each other:

p(s1, s2, . . . , sn) = p(s1) p(s2) · · · p(sn)   (7)

2 {si} follow a non-Gaussian distribution.
If {si} followed a Gaussian distribution, ICA would be impossible.
3 A is a regular (invertible) matrix.

Therefore, we can rewrite the model as

s(t) = B x(t),   (8)

where B = A⁻¹. It is only necessary to estimate B so that {si} are independent.



Independent Component Analysis

Model

[figure: observations (mixtures) → ICA → estimated signals, compared with the original signals]
Independent Component Analysis
Model

We observe: x(t) = A s(t)
We want: the original sources s(t)
Goal: estimate the unmixing matrix B = A⁻¹, so that s(t) = B x(t)
ICA vs PCA, Similarities

• Both perform linear transformations

• Both are matrix factorizations

PCA: low-rank matrix factorization for compression:
X (N rows) = U S, keeping M < N components

ICA: full-rank matrix factorization to remove dependency between the rows:
X (N rows) = A S, with N components
PCA and ICA

• Both analyze multi-dimensional statistical data
– PCA and ICA: reduce dimensions
• Differences:
• PCA: assumes a Gaussian model
• ICA: assumes a non-Gaussian model

• PCA: vectors are orthogonal

• ICA: vectors are not orthogonal
ICA vs PCA, Differences

• PCA: X = US, with UᵀU = I

• ICA: X = AS

• PCA does compression
– M < N
• ICA does not do compression
– same number of features (M = N)

• PCA removes only correlations, not higher-order dependence

• ICA removes correlations and higher-order dependence

• PCA: some components are more important than others (based on eigenvalues)

• ICA: components are equally important
Whitening and ICA

Definition of a white signal

White signals are defined as any z which satisfies

E[z] = 0,   E[z zᵀ] = I.   (9)

First, we show an example of original independent signals and observed signals:

(a) source (s1, s2)   (b) observed (x1, x2)   [figure]

Observed signals x(t) are given by x(t) = A s(t).
Whitening and ICA (cont’d)
Whitening is useful as a preprocessing step for ICA.
First, we apply whitening to the observed signals x(t).

(c) observed (x1, x2)   (d) whitened (z1, z2)   [figure]

The whitened signals are denoted (z1, z2), and they are given by

z(t) = V x(t),   (10)

where V is a whitening matrix for x. The model becomes

s(t) = U z(t) = U V x(t) = B x(t),   (11)

where U is an orthogonal transform matrix. We can say that whitening simplifies the ICA problem: it is only necessary to estimate the orthogonal matrix U.
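A minimal sketch of such a whitening matrix V, built from the eigendecomposition of the sample covariance (one standard choice; the symmetric ZCA form is an assumption here):

    import numpy as np

    def whiten(x):
        """Return z = V x with E[z] = 0 and E[z z^T] = I, as in Eq. (9)."""
        x = x - x.mean(axis=1, keepdims=True)                # remove the mean
        cov = np.cov(x)                                      # covariance of the rows
        eigvals, eigvecs = np.linalg.eigh(cov)               # cov = E D E^T
        V = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T   # V = E D^(-1/2) E^T
        return V @ x, V

After z, V = whiten(x), only the orthogonal matrix U in Eq. (11) remains to be estimated.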
Whitening solves half of the ICA problem

Note:
The number of free parameters of an N by N orthogonal matrix is N(N−1)/2, roughly half of the N² parameters of a general matrix; for example, with N = 10 a general matrix has 100 parameters but an orthogonal one only 45. In this sense, whitening solves half of the ICA problem.

[figure: original, mixed, whitened]
Non-Gaussianity and ICA

Non-Gaussianity is a measure of independence.

According to the central limit theorem, the Gaussianity of x(t) must be larger than that of s(t).
Now, take bᵢᵀ as a demixing vector, so that ŝᵢ(t) = bᵢᵀ x(t). We want to maximize the non-Gaussianity of bᵢᵀ x(t); such a b is then a row of the solution B.
For example, consider the following two vectors b′ and b [figure]. We can say that b is better than b′.
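Excess kurtosis is one common measure of non-Gaussianity (zero for a Gaussian). A minimal sketch, with Laplace sources as an illustrative assumption, showing that a mixture is indeed closer to Gaussian than the sources, as the central limit theorem predicts:

    import numpy as np

    def excess_kurtosis(y):
        """E[y^4] - 3 E[y^2]^2 for zero-mean y; zero for a Gaussian."""
        y = y - y.mean()
        return np.mean(y ** 4) - 3 * np.mean(y ** 2) ** 2

    rng = np.random.default_rng(0)
    s1 = rng.laplace(size=100_000)      # non-Gaussian source, excess kurtosis ~ 3
    s2 = rng.laplace(size=100_000)
    x = (s1 + s2) / np.sqrt(2)          # mixture of independent sources

    print(excess_kurtosis(s1), excess_kurtosis(x))   # the mixture is closer to 0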

Solving ICA

ICA task: Given x,

• find y (the estimate of s),
• find W (the estimate of A⁻¹)

ICA solution: y = Wx

• Remove the mean: E[x] = 0
• Whiten: E[xxᵀ] = I
• Find an orthogonal W optimizing an objective function (see the sketch below)
– Sequence of 2-D Jacobi (Givens) rotations
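Below is a minimal two-source sketch of the last step: after centering and whitening, search over 2-D Givens rotation angles for the one maximizing total non-Gaussianity. This brute-force search is a stand-in for the Jacobi rotation sequence, and kurtosis is an assumed choice of objective:

    import numpy as np

    def demix_2d(z, n_angles=180):
        """Find the 2-D rotation of whitened z maximizing non-Gaussianity."""
        def kurt(y):                              # excess kurtosis of unit-variance rows
            return np.mean(y ** 4, axis=1) - 3
        best_theta, best_score = 0.0, -np.inf
        for theta in np.linspace(0, np.pi / 2, n_angles, endpoint=False):
            c, s = np.cos(theta), np.sin(theta)
            W = np.array([[c, -s], [s, c]])       # orthogonal (Givens) rotation
            score = np.abs(kurt(W @ z)).sum()     # total non-Gaussianity
            if score > best_score:
                best_theta, best_score = theta, score
        c, s = np.cos(best_theta), np.sin(best_theta)
        return np.array([[c, -s], [s, c]])        # estimated orthogonal W; y = W z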

[figure: original, mixed, whitened, rotated (demixed)]
ICA Cost Functions

⇒ move away from the normal distribution (maximize non-Gaussianity)


Some ICA Applications

STATIC
• Image denoising
• Microarray data processing
• Decomposing the spectra of galaxies
• Face recognition
• Facial expression recognition
• Feature extraction
• Clustering
• Classification

TEMPORAL
• Medical signal processing: fMRI, ECG, EEG
• Brain-computer interfaces
• Modeling of the hippocampus, place cells
• Modeling of the visual cortex
• Time series analysis
• Financial applications
• Blind deconvolution
ICA Application: Removing Artifacts from EEG Signals
• EEG ~ a neural cocktail party
• Severe contamination of EEG activity by
– eye movements
– blinks
– muscle
– heart, ECG artifact
– vessel pulse
– electrode noise
– line noise, alternating current (60 Hz)

• ICA can improve the signal

– effectively detect, separate, and remove activity in EEG records from a wide variety of artifactual sources
(Jung, Makeig, Bell, and Sejnowski)

• ICA weights help find the location of sources

[figures from Jung et al.]
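As a practical illustration, a minimal sketch using the MNE-Python library. The file name and the excluded component indices are hypothetical; in practice, components are chosen by inspecting their maps and time courses:

    import mne
    from mne.preprocessing import ICA

    # "sample_eeg.fif" is a hypothetical recording; substitute your own data
    raw = mne.io.read_raw_fif("sample_eeg.fif", preload=True)
    raw.filter(l_freq=1.0, h_freq=None)   # high-pass filtering helps ICA converge

    ica = ICA(n_components=20, random_state=0)
    ica.fit(raw)
    ica.exclude = [0, 3]                  # hypothetical indices of eye/ECG components
    raw_clean = ica.apply(raw.copy())     # reconstruct the EEG without those components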
Experiments: Real Image 1

(a) newyork   (b) shanghai
Figure: Original Signals

(a) ob 1   (b) ob 2
Figure: Observed Signals

(a) estimated signal 1   (b) estimated signal 2
Figure: Estimated Signals
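A minimal sketch of this kind of image experiment with FastICA, treating each flattened image as one row of S. The file names and mixing weights are hypothetical stand-ins for the slide’s data:

    import numpy as np
    from PIL import Image
    from sklearn.decomposition import FastICA

    # hypothetical file names standing in for the two original images
    imgs = [np.asarray(Image.open(f).convert("L"), dtype=float)
            for f in ("newyork.png", "shanghai.png")]
    shape = imgs[0].shape
    S = np.stack([im.ravel() for im in imgs])   # each image flattened to one row

    A = np.array([[0.6, 0.4],
                  [0.3, 0.7]])                  # assumed mixing weights
    X = A @ S                                   # the two observed images ob1, ob2

    ica = FastICA(n_components=2, random_state=0)
    S_hat = ica.fit_transform(X.T).T            # rows are the estimated images
    est1, est2 = (s.reshape(shape) for s in S_hat)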
• Using ICA to analyze fMRI data of multiple subjects raises some questions:
• How are components to be combined across subjects?
• How should the final results be thresholded and/or presented?

ICA: Single Subject

The ICA maps from one subject for the visual and basal ganglia components are depicted along with their time courses (basal ganglia in green and visual in pink).

Note that the visual time course precedes the motor time course.
Approach 2

• Group ICA (stacking images)

• [V. D. Calhoun, T. Adali, G. D. Pearlson, and J. J. Pekar, "A Method for Making Group Inferences From Functional MRI Data Using Independent Component Analysis," Hum. Brain Map., vol. 14, pp. 140-151, 2001.]
• [V. J. Schmithorst and S. K. Holland, "Comparison of Three Methods for Generating Group Statistical Inferences From Independent Component Analysis of Functional Magnetic Resonance Imaging Data," J. Magn. Reson. Imaging, vol. 19, pp. 365-368, 2004.]
• Components and time courses can be directly compared (a minimal sketch follows the diagram below)

[diagram: data from Sub 1 … Sub N stacked, group ICA applied, yielding components for Sub 1 … Sub N]
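A minimal sketch of the temporal-concatenation idea; the random arrays stand in for per-subject (timepoints × voxels) data, and the sizes and component count are illustrative assumptions:

    import numpy as np
    from sklearn.decomposition import FastICA

    rng = np.random.default_rng(0)
    # stand-in data: 5 subjects, each with 120 timepoints x 500 voxels
    subjects = [rng.normal(size=(120, 500)) for _ in range(5)]

    group = np.concatenate(subjects, axis=0)       # stack images across subjects
    ica = FastICA(n_components=10, random_state=0)
    time_courses = ica.fit_transform(group)        # concatenated subject time courses
    spatial_maps = ica.components_                 # shared maps, comparable across subjects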
ICA for Motion Style Components

(Mori & Hoshino 2002, Shapiro et al. 2006, Cao et al. 2003)

• Method for analysis and synthesis of human motion from motion-captured data
• Provides perceptually meaningful components
• 109 markers, 327 parameters ⇒ 6 independent components (emotion, content, …)
Using ICA for Classification

Activity distributions of test data:
– within-category test images are much narrower
– off-category is closer to the Gaussian distribution

Train data: ICA basis [Happy], ICA basis [Disgust]
Test data: Happy, Disgust
[figure]

ICA basis vectors extracted from natural images resemble Gabor wavelets, edge detectors, and receptive fields.
Experiments: Real Image 2

(a) buta   (b) kobe
Figure: Original Signals

(a) ob 1   (b) ob 2
Figure: Observed Signals

(a) estimated signal 1   (b) estimated signal 2
Figure: Estimated Signals


Experiments: Real Image 2 (using filtering)

(a) buta   (b) kobe
Figure: Original Signals

(a) ob 1   (b) ob 2
Figure: Observed Signals

(a) estimated signal 1   (b) estimated signal 2
Figure: Estimated Signals


Experiments: Real Image 3 (using filtering)

(a) nyc   (b) sha   (c) rock   (d) pig
(e) obs1   (f) obs2   (g) obs3   (h) obs4
Figure: Originals & Observed

(a) estimated signal 1   (b) estimated signal 2
(c) estimated signal 3   (d) estimated signal 4
Figure: Estimated Signals
