Chapter 9
PCA, ICA, Blind Source Separation
EEE 485/585 Statistical Learning and Data Analytics
Unsupervised Learning
Principal Component Analysis (PCA)
Blind Source Separation Problem
Independent Component Analysis (ICA)
Cem Tekin
Bilkent University
Principal component analysis (PCA)
I.e., the subspace captures almost all variation in the data.
Normalize
$$\bar{x}_j = \frac{1}{n} \sum_{i=1}^{n} x_{ij}, \quad j = 1, \ldots, p$$
$$x_{ij} \leftarrow x_{ij} - \bar{x}_j, \quad j = 1, \ldots, p, \; i = 1, \ldots, n$$
$$\sigma_j^2 = \frac{1}{n} \sum_{i=1}^{n} x_{ij}^2, \quad j = 1, \ldots, p$$
$$x_{ij} \leftarrow \frac{x_{ij}}{\sigma_j}, \quad j = 1, \ldots, p, \; i = 1, \ldots, n$$
Dividing by $\sigma_j$ is not necessary if the attributes are on the same scale. Ex: each attribute corresponds to a value in USD.
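The normalization step can be sketched in NumPy. This is a minimal sketch, assuming the data sit in an n-by-p array with one row per observation and that no attribute is constant (otherwise $\sigma_j = 0$). The function name `standardize` and the demo values are illustrative and not from the slides.

```python
import numpy as np

def standardize(X):
    """Center each column, then divide by its 1/n standard deviation."""
    X = np.asarray(X, dtype=float)
    x_bar = X.mean(axis=0)                    # column means
    Xc = X - x_bar                            # centering step
    sigma = np.sqrt((Xc ** 2).mean(axis=0))   # sigma_j computed on centered data
    return Xc / sigma                         # scaling step

# Hypothetical data: two attributes on very different scales.
X_demo = np.array([[1.0, 200.0],
                   [2.0, 400.0],
                   [3.0, 900.0]])
Z_demo = standardize(X_demo)
```

After this step every column of `Z_demo` has zero mean and unit mean square, so no attribute dominates only because of its units.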
Computing the major axis of variation
$$\frac{1}{n} \sum_{i=1}^{n} (x_i^T u)^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i^T u)^T (x_i^T u) = \frac{1}{n} \sum_{i=1}^{n} u^T x_i x_i^T u = u^T \left( \frac{1}{n} \sum_{i=1}^{n} x_i x_i^T \right) u$$
Optimization problem
$\Sigma = \frac{1}{n} \sum_{i=1}^{n} x_i x_i^T$ is the sample covariance matrix of the data
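The statement of the optimization problem is not legible in this copy. A standard formulation consistent with the variance expression $u^T \Sigma u$ is sketched below. It is given as an assumption about what the slide states, not as a reproduction of it.

```latex
\begin{align*}
&\max_{u} \; u^T \Sigma u \quad \text{subject to} \quad u^T u = 1 \\
&\mathcal{L}(u, \lambda) = u^T \Sigma u - \lambda \, (u^T u - 1) \\
&\nabla_u \mathcal{L} = 2 \Sigma u - 2 \lambda u = 0 \;\Longrightarrow\; \Sigma u = \lambda u \\
&u^T \Sigma u = \lambda \, u^T u = \lambda
\end{align*}
```

Under this formulation every stationary point is a unit eigenvector of $\Sigma$ and the objective equals its eigenvalue, so the maximizer is the eigenvector with the largest eigenvalue.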
Result
PCA algorithm
1. Given $X = [x_1^T, \ldots, x_n^T]^T$, compute $\Sigma = \frac{1}{n} X^T X$
2. Compute all eigenvalues $\lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_p$ of $\Sigma$, and the corresponding eigenvectors $u_1, u_2, \ldots, u_p$
3. Pick $k \leq p$ eigenvectors with the largest eigenvalues, i.e., $u_1, u_2, \ldots, u_k$
4. For all $i = 1, \ldots, n$, let $z_i = [z_{i1}, z_{i2}, \ldots, z_{ik}]^T = [x_i^T u_1, x_i^T u_2, \ldots, x_i^T u_k]^T$
5. $\{z_i\}_{i=1}^{n}$ is the $k$-dimensional approximation to $\{x_i\}_{i=1}^{n}$
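The five steps can be sketched in NumPy as follows. This is a minimal sketch, assuming `X` is already centered with one observation per row. The function name `pca` and the synthetic data are illustrative. `np.linalg.eigh` is used because $\Sigma$ is symmetric; it returns eigenvalues in ascending order, so they are re-sorted.

```python
import numpy as np

def pca(X, k):
    """Project centered data X (n x p) onto its top-k principal directions."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    Sigma = X.T @ X / n                       # step 1: sample covariance
    eigvals, eigvecs = np.linalg.eigh(Sigma)  # step 2: eigen-decomposition
    order = np.argsort(eigvals)[::-1]         # sort eigenvalues descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    U = eigvecs[:, :k]                        # step 3: top-k eigenvectors
    Z = X @ U                                 # step 4: z_im = x_i^T u_m
    return Z, U, eigvals                      # step 5: Z is the k-dim representation

# Hypothetical data: 200 points in 3 dimensions with unequal spread.
rng = np.random.default_rng(0)
raw = rng.normal(size=(200, 3)) @ np.array([[3.0, 0.0, 0.0],
                                            [1.0, 1.0, 0.0],
                                            [0.0, 0.0, 0.1]])
X_centered = raw - raw.mean(axis=0)
Z, U, eigvals = pca(X_centered, k=2)
```

The mean square of the m-th column of `Z` equals the m-th eigenvalue, which is the variance captured along $u_m$.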
PCA example - Arrest dataset
0.28 0.87
Data visualization using PCA
[Figure: biplot of the Arrest dataset. Points are labeled by US state name; vertical axis: Second Principal Component; loading vectors labeled UrbanPop, Rape, and Assault.]
Figure 10.1 from "An introduction to statistical learning" by James et al.
Interpretation of the results
Importance of standardization in PCA
[Figure: two biplots with panel titles "Scaled" and "Unscaled". Horizontal axis: First Principal Component; vertical axis: Second Principal Component; loading vectors labeled Murder, Assault, UrbanPop, and Rape.]
Figure 10.3 from "An introduction to statistical learning" by James et al.
How to choose k?
Variance explained by the mth principal component
$$\frac{1}{n} \sum_{i=1}^{n} z_{im}^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i^T u_m)^2 = \frac{1}{n} \sum_{i=1}^{n} \left( \sum_{j=1}^{p} x_{ij} u_{mj} \right)^2$$
$$\mathrm{PVE}(m) = \frac{\sum_{i=1}^{n} \left( \sum_{j=1}^{p} x_{ij} u_{mj} \right)^2}{\sum_{j=1}^{p} \sum_{i=1}^{n} x_{ij}^2}$$
$$\mathrm{PVE}(\text{first } k) = \sum_{m=1}^{k} \mathrm{PVE}(m)$$
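The PVE formulas can be checked numerically. This is a minimal sketch, assuming centered data with one observation per row. The function name `proportion_variance_explained` and the four-point dataset are illustrative.

```python
import numpy as np

def proportion_variance_explained(X):
    """Return PVE(m) for every component and the cumulative PVE."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    eigvals, U = np.linalg.eigh(X.T @ X / n)   # eigenvectors of Sigma
    U = U[:, np.argsort(eigvals)[::-1]]        # largest eigenvalue first
    Z = X @ U                                  # scores z_im = x_i^T u_m
    pve = (Z ** 2).sum(axis=0) / (X ** 2).sum()
    return pve, np.cumsum(pve)

# Hypothetical centered data: spread 2 along the first axis, 1 along the second.
X_demo = np.array([[ 2.0,  0.0],
                   [-2.0,  0.0],
                   [ 0.0,  1.0],
                   [ 0.0, -1.0]])
pve, cumulative = proportion_variance_explained(X_demo)
# pve is [0.8, 0.2] and cumulative is [0.8, 1.0]
```

Here the squared scores sum to 8 and 2 out of a total of 10, so the first component alone explains 80% of the variance.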
[Figure: two panels plotted against Principal Component (1 to 4). Vertical axes: Prop. Variance Explained and Cumulative Prop. Variance Explained.]
Figure 10.4 from "An introduction to statistical learning" by James et al.
Blind source separation problem
[Figure: diagram with elements labeled A and B.]
Blind source separation problem
Observations:
$$\begin{bmatrix} x_1(1) \\ x_2(1) \\ x_3(1) \end{bmatrix}, \begin{bmatrix} x_1(2) \\ x_2(2) \\ x_3(2) \end{bmatrix}, \ldots, \begin{bmatrix} x_1(T) \\ x_2(T) \\ x_3(T) \end{bmatrix}$$
Source signals (unobserved):
$$\begin{bmatrix} s_1(1) \\ s_2(1) \end{bmatrix}, \begin{bmatrix} s_1(2) \\ s_2(2) \end{bmatrix}, \ldots, \begin{bmatrix} s_1(T) \\ s_2(T) \end{bmatrix}$$
Mixing model (coefficients $a_{ij}$ unknown):
$$x_i(t) = a_{i1} s_1(t) + a_{i2} s_2(t), \quad i = 1, 2, 3$$
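The setup with three observed signals and two unobserved sources can be simulated as follows. This is a minimal sketch, assuming a linear, noise-free mixture. The waveforms, the mixing coefficients, and the variable names are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
t = np.arange(T)

# Two hypothetical source signals s_1(t), s_2(t): a sine and a square wave.
S = np.vstack([np.sin(2 * np.pi * t / 50),
               np.sign(np.sin(2 * np.pi * t / 73))])   # shape (2, T)

# Mixing coefficients a_ij, unknown in practice; drawn at random here.
A = rng.uniform(0.5, 2.0, size=(3, 2))

# Three observed signals: x_i(t) = a_i1 * s_1(t) + a_i2 * s_2(t).
X = A @ S                                              # shape (3, T)
```

The blind source separation task is to recover `S` from `X` alone, without access to `A`.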
Image denoising
Medical signal processing
Brain computer interfaces
Time series analysis
Independent Component Analysis (ICA)
$$x_1 = a_{11} s_1 + a_{12} s_2 + \ldots + a_{1p} s_p$$
$$\vdots$$
Ambiguities in ICA
the same as or different from $s_j(t)$
These ambiguities do not create a problem in most applications
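The list of ambiguities is only partly legible in this copy. Two well-known ones for linear ICA are the ordering and the scaling (including sign) of the sources. The sketch below illustrates both numerically: a permuted and rescaled set of sources, paired with a compensating mixing matrix, produces exactly the same observations. All names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(2, 2))          # hypothetical mixing matrix
S = rng.laplace(size=(2, 500))       # hypothetical independent sources
X = A @ S                            # observations

P = np.array([[0.0, 1.0],
              [1.0, 0.0]])           # swaps the two sources
D = np.diag([2.0, -0.5])             # rescales them and flips one sign

S_alt = P @ D @ S                    # alternative sources
A_alt = A @ np.linalg.inv(D) @ P.T   # compensating mixing matrix
X_alt = A_alt @ S_alt                # equals X because P.T @ P = I
```

Since `X_alt` equals `X`, the observations alone cannot distinguish `(A, S)` from `(A_alt, S_alt)`.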
pdf of a linear transformation of a random vector
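The body of this slide is not legible in this copy. The standard change-of-variables result for an invertible linear map, which the ICA density formula relies on, is sketched here. The symbols $A$ (mixing matrix) and $W = A^{-1}$ follow the rest of the chapter; the scalar check is an added illustration.

```latex
\begin{align*}
x = A s, \quad W = A^{-1} \;&\Longrightarrow\; s = W x \\
p_X(x) &= p_S(W x) \, \lvert \det(W) \rvert \\
\text{Scalar check: } x = 2s, \; W = \tfrac{1}{2} \;&\Longrightarrow\; p_X(x) = \tfrac{1}{2} \, p_S\!\left( \tfrac{x}{2} \right)
\end{align*}
```

The factor $\lvert \det(W) \rvert$ accounts for how the map stretches or compresses volume, so that $p_X$ still integrates to one.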
ICA algorithm
Independence holds both over sources and over time!
$$p_X(x) = \prod_{j=1}^{p} p_{S_j}(w_j^T x) \, |\det(W)|$$
where
$$W = \begin{bmatrix} w_1^T \\ w_2^T \\ \vdots \\ w_p^T \end{bmatrix}$$
ICA algorithm - estimating W
Likelihood:
$$L(W) = \prod_{t=1}^{T} p_X(x(t)) = \prod_{t=1}^{T} \left( \prod_{j=1}^{p} p_{S_j}(w_j^T x(t)) \right) |\det(W)|$$
$$l(W) = \sum_{t=1}^{T} \left( \sum_{j=1}^{p} \log p_{S_j}(w_j^T x(t)) + \log |\det(W)| \right)$$
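One way to maximize $l(W)$ is batch gradient ascent. The sketch below is not taken from the slides: it assumes every source has the logistic density $p_S(s) = g(s)(1 - g(s))$ with $g$ the sigmoid, a common choice for which $\frac{d}{ds} \log p_S(s) = 1 - 2g(s)$. The step size, iteration count, function names, and synthetic data are all illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(W, X):
    """l(W) for observations X (p x T) under the assumed logistic source density."""
    Y = W @ X                                   # rows are w_j^T x(t)
    # log[g(y)(1 - g(y))] = -|y| - 2*log(1 + exp(-|y|)), written stably
    log_p = -np.abs(Y) - 2.0 * np.log1p(np.exp(-np.abs(Y)))
    T = X.shape[1]
    return log_p.sum() + T * np.log(abs(np.linalg.det(W)))

def fit_ica(X, lr=0.01, n_iter=500):
    """Gradient ascent on l(W)/T, starting from the identity."""
    p, T = X.shape
    W = np.eye(p)
    for _ in range(n_iter):
        Y = W @ X
        # gradient of l(W)/T: mean of (1 - 2g(Wx)) x^T plus (W^{-1})^T
        grad = (1.0 - 2.0 * sigmoid(Y)) @ X.T / T + np.linalg.inv(W).T
        W = W + lr * grad
    return W

# Hypothetical mixture of two heavy-tailed (Laplace) sources.
rng = np.random.default_rng(0)
S = rng.laplace(size=(2, 2000))
A = np.array([[1.0, 0.5],
              [0.5, 1.0]])
X = A @ S

W_hat = fit_ica(X)
ll_start = log_likelihood(np.eye(2), X)
ll_end = log_likelihood(W_hat, X)
```

Comparing `ll_end` with `ll_start` shows whether the ascent improved the log-likelihood. The recovered signals `W_hat @ X` can match the true sources only up to ordering and scaling.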
MLE:
Figure 12.20 from "Machine learning: A probabilistic perspective" by Kevin Murphy