Mixtures of Gaussians
Sargur Srihari
srihari@cedar.buffalo.edu
Goal of Gaussian Mixture Modeling
• Goal of modeling:
– Find the maximum likelihood parameters πk, µk, Σk
• Examples of data sets and models
[Figure: example data sets and models — 1-D data with K=2 subclasses; 2-D data with K=3 components]
Joint Distribution
• Define joint distribution of latent variable and observed variable
– p(x,z)=p(x|z) p(z)
– x is observed variable
– z is the hidden or missing variable
– Marginal distribution p(z)
– Conditional distribution p(x|z)
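This factorization supports ancestral sampling: draw z from p(z), then draw x from the component that z selects. A minimal Python sketch, with illustrative parameter values that are assumptions, not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 1-D mixture with K=2 components (parameter values are assumptions)
pi = np.array([0.4, 0.6])      # mixing coefficients p(z_k = 1)
mu = np.array([-2.0, 3.0])     # component means
sigma = np.array([0.5, 1.0])   # component standard deviations

def sample_joint(n):
    """Ancestral sampling from p(x, z) = p(z) p(x | z)."""
    z = rng.choice(len(pi), size=n, p=pi)   # draw the latent component label
    x = rng.normal(mu[z], sigma[z])         # draw x from the chosen Gaussian
    return x, z

x, z = sample_joint(1000)
```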
Marginal Distribution of Observed Variable x
p(x) = Σ_z p(z) p(x | z) = Σ_z ∏_{k=1}^K [π_k N(x | µ_k, Σ_k)]^{z_k} = Σ_{k=1}^K π_k N(x | µ_k, Σ_k)
– Since z is one-of-K (each z_k ∈ {0,1} with exactly one z_k = 1), the sum over z collapses to a single sum over k
• This is the standard form of a Gaussian mixture
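A minimal sketch of evaluating this mixture density at a point, assuming illustrative K=3 parameters and using scipy's multivariate normal:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Illustrative 2-D mixture with K=3 components (parameter values are assumptions)
pis = [0.3, 0.5, 0.2]
mus = [np.array([0.0, 0.0]), np.array([3.0, 0.0]), np.array([0.0, 3.0])]
Sigmas = [np.eye(2), 0.5 * np.eye(2), np.diag([1.0, 2.0])]

def gmm_density(x):
    """p(x) = sum_k pi_k N(x | mu_k, Sigma_k)."""
    return sum(pi_k * multivariate_normal.pdf(x, mean=mu_k, cov=S_k)
               for pi_k, mu_k, S_k in zip(pis, mus, Sigmas))

print(gmm_density(np.array([1.0, 1.0])))
```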
Responsibilities
• The posterior probability of component k given x follows from p(x,z) = p(x|z) p(z) and Bayes' theorem:
γ(z_k) ≡ p(z_k = 1 | x) = p(z_k = 1) p(x | z_k = 1) / Σ_{j=1}^K p(z_j = 1) p(x | z_j = 1)
                        = π_k N(x | µ_k, Σ_k) / Σ_{j=1}^K π_j N(x | µ_j, Σ_j)
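Computing γ(z_k) is one application of Bayes' theorem over the K components. A sketch, with parameters in the same (assumed) form as the density example above:

```python
import numpy as np
from scipy.stats import multivariate_normal

def responsibilities(x, pis, mus, Sigmas):
    """gamma_k = pi_k N(x | mu_k, Sigma_k) / sum_j pi_j N(x | mu_j, Sigma_j)."""
    weighted = np.array([pi_k * multivariate_normal.pdf(x, mean=mu_k, cov=S_k)
                         for pi_k, mu_k, S_k in zip(pis, mus, Sigmas)])
    return weighted / weighted.sum()   # normalize so responsibilities sum to 1 over k
```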
Plan of Discussion
• Next we look at:
1. How to synthetically generate data from a mixture model
2. Given a data set {x_1, ..., x_N}, how to model the data using a mixture of Gaussians
Illustration of responsibilities
• Evaluate, for every data point, the posterior probability of each component
• The responsibility γ(z_nk) is associated with data point x_n
• Color each point using proportions of red, blue and green ink
– If γ(z_n1) = 1 for a data point, it is colored red
– If γ(z_n2) = γ(z_n3) = 0.5 for another point, it has equal blue and green and will appear cyan
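A sketch of that coloring with matplotlib, assuming X is an (N, 2) data array and gamma an (N, 3) responsibility array whose columns are used directly as RGB proportions:

```python
import matplotlib.pyplot as plt

def plot_responsibilities(X, gamma):
    """With K=3, gamma[n] = (gamma_n1, gamma_n2, gamma_n3) is read as an RGB color."""
    plt.scatter(X[:, 0], X[:, 1], c=gamma, s=10)
    plt.show()
```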
• The data set can be written as an N × D matrix X in which the nth row is given by x_n^T:
X = [ x_1^T
      x_2^T
      ⋮
      x_N^T ]
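In numpy this is just row-stacking the observations; a minimal sketch with made-up values:

```python
import numpy as np

# Each x_n is a D-dimensional observation; stacking them row-wise gives X with
# shape (N, D), whose nth row is x_n^T (values here are illustrative only).
x_list = [np.array([1.0, 2.0]), np.array([0.5, -1.0]), np.array([3.0, 0.0])]
X = np.stack(x_list)   # X.shape == (3, 2)
```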
p(x) = Σ_z p(z) p(x | z) = Σ_{k=1}^K π_k N(x | µ_k, Σ_k)
– since z takes the K one-hot values {z_k} with probabilities {π_k}
Maximization of Log-Likelihood
ln p(X | π, µ, Σ) = Σ_{n=1}^N ln { Σ_{k=1}^K π_k N(x_n | µ_k, Σ_k) }
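A sketch of evaluating this quantity; the inner sum over k is done with logsumexp, since summing exponentially small densities directly can underflow (function and variable names are assumptions):

```python
import numpy as np
from scipy.stats import multivariate_normal
from scipy.special import logsumexp

def gmm_log_likelihood(X, pis, mus, Sigmas):
    """ln p(X) = sum_n ln sum_k pi_k N(x_n | mu_k, Sigma_k)."""
    # log_probs[n, k] = ln pi_k + ln N(x_n | mu_k, Sigma_k)
    log_probs = np.stack(
        [np.log(pi_k) + multivariate_normal.logpdf(X, mean=mu_k, cov=S_k)
         for pi_k, mu_k, S_k in zip(pis, mus, Sigmas)], axis=1)
    return logsumexp(log_probs, axis=1).sum()
```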
Problem of Identifiability
• A density p(x | θ) is identifiable if, whenever θ ≠ θ′, there is an x for which p(x | θ) ≠ p(x | θ′)
– A GMM is not identifiable in this sense: permuting the K component labels gives K! parameter settings with exactly the same density
Setting the derivatives of ln p(X | π, µ, Σ) to zero gives, with N_k = Σ_{n=1}^N γ(z_nk):
Means: µ_k = (1/N_k) Σ_{n=1}^N γ(z_nk) x_n
Covariance matrices: Σ_k = (1/N_k) Σ_{n=1}^N γ(z_nk) (x_n − µ_k)(x_n − µ_k)^T
Mixing coefficients: π_k = N_k / N
EM Formulation
• The results for µ_k, Σ_k, π_k are not closed-form solutions for the parameters
– since the responsibilities γ(z_nk) depend on those parameters in a complex way
• The results suggest an iterative solution
• It is an instance of the EM algorithm for the particular case of GMM
EM continued
• Step 1: Initialize the means µ_k, covariances Σ_k and mixing coefficients π_k
• Step 2: E step: Evaluate responsibilities using current parameter values
γ(z_nk) = π_k N(x_n | µ_k, Σ_k) / Σ_{j=1}^K π_j N(x_n | µ_j, Σ_j)
• Step 3: M step: Re-estimate parameters using current responsibilities
µ_k^new = (1/N_k) Σ_{n=1}^N γ(z_nk) x_n
Σ_k^new = (1/N_k) Σ_{n=1}^N γ(z_nk) (x_n − µ_k^new)(x_n − µ_k^new)^T
π_k^new = N_k / N, where N_k = Σ_{n=1}^N γ(z_nk)
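A vectorized sketch of these M-step updates, assuming gamma is the (N, K) responsibility matrix produced by the E step:

```python
import numpy as np

def m_step(X, gamma):
    """Re-estimate (pi, mu, Sigma) from data X (N, D) and responsibilities gamma (N, K)."""
    N, D = X.shape
    Nk = gamma.sum(axis=0)                  # effective counts N_k, shape (K,)
    mu = (gamma.T @ X) / Nk[:, None]        # mu_k = (1/N_k) sum_n gamma_nk x_n
    Sigma = np.empty((len(Nk), D, D))
    for k in range(len(Nk)):
        diff = X - mu[k]                    # rows are (x_n - mu_k)
        # Sigma_k = (1/N_k) sum_n gamma_nk (x_n - mu_k)(x_n - mu_k)^T
        Sigma[k] = (gamma[:, k, None] * diff).T @ diff / Nk[k]
    pi = Nk / N                             # pi_k = N_k / N
    return pi, mu, Sigma
```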
EM Continued
• Step 4: Evaluate the log likelihood
ln p(X | π, µ, Σ) = Σ_{n=1}^N ln { Σ_{k=1}^K π_k N(x_n | µ_k, Σ_k) }
– Check for convergence of either the parameters or the log likelihood; if the criterion is not satisfied, return to Step 2
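Putting the four steps together, a minimal EM loop; the initialization scheme and convergence threshold here are assumptions (k-means initialization is common in practice), and m_step is the sketch given after Step 3:

```python
import numpy as np
from scipy.stats import multivariate_normal
from scipy.special import logsumexp

def e_step(X, pis, mus, Sigmas):
    """Return responsibilities gamma (N, K) and the current log likelihood ln p(X)."""
    log_probs = np.stack(
        [np.log(pi_k) + multivariate_normal.logpdf(X, mean=mu_k, cov=S_k)
         for pi_k, mu_k, S_k in zip(pis, mus, Sigmas)], axis=1)
    gamma = np.exp(log_probs - logsumexp(log_probs, axis=1, keepdims=True))
    return gamma, logsumexp(log_probs, axis=1).sum()

def fit_gmm(X, K, n_iter=100, tol=1e-6):
    # Step 1: crude initialization (an assumption, not the slides' prescription)
    rng = np.random.default_rng(0)
    mus = X[rng.choice(len(X), K, replace=False)]
    Sigmas = np.array([np.cov(X.T) + 1e-6 * np.eye(X.shape[1]) for _ in range(K)])
    pis = np.full(K, 1.0 / K)
    prev_ll = -np.inf
    for _ in range(n_iter):
        gamma, ll = e_step(X, pis, mus, Sigmas)   # Steps 2 and 4
        pis, mus, Sigmas = m_step(X, gamma)       # Step 3 (defined earlier)
        if ll - prev_ll < tol:                    # convergence check on ln p(X)
            break
        prev_ll = ll
    return pis, mus, Sigmas
```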