MA5232 Modeling and Numerical Simulations
Lecture 2: Iterative Methods for Mixture-Model Segmentation
8 Apr 2015
National University of Singapore

Last time
PCA reduces the dimensionality of a data set while retaining as much of the data variation as possible.
Statistical view: the leading PCs are given by the leading eigenvectors of the covariance.
Geometric view: fitting a d-dimensional subspace model via the SVD.

Extensions of PCA
Probabilistic PCA via MLE
Kernel PCA via kernel functions and kernel matrices

This lecture
Review basic iterative algorithms for central clustering
Formulation of the subspace segmentation problem

Segmentation by Clustering

From: "Object Recognition as Machine Translation", Duygulu, Barnard, de Freitas, and Forsyth, ECCV 2002

Example 4.1
Euclidean distance-based clustering is not invariant to linear transformations of the data.
The distance metric therefore needs to be adjusted after a linear transformation.
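As a quick, hypothetical illustration (the data, means, and transformation below are my own, not from the lecture), the nearest-mean assignment under the Euclidean distance can change once one coordinate is rescaled:

```python
import numpy as np

def nearest_mean(x, means):
    """Index of the mean closest to x in Euclidean distance."""
    return int(((means - x) ** 2).sum(axis=1).argmin())

# Two cluster means and a query point in the original coordinates.
means = np.array([[0.0, 0.0],
                  [4.0, 1.0]])
x = np.array([3.0, 0.4])

# Anisotropic linear transformation: stretch the second coordinate by 10.
A = np.diag([1.0, 10.0])

print(nearest_mean(x, means))            # -> 1 in the original coordinates
print(nearest_mean(A @ x, means @ A.T))  # -> 0 after the linear transformation
```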

Central Clustering
Assume the data are sampled from a mixture of Gaussians.
The classical distance metric between a sample and the mean of the jth cluster is the Mahalanobis distance.
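For reference, with mean μ_j and covariance Σ_j for the jth cluster (standard notation, assumed here), the Mahalanobis distance from a sample x_i to that cluster is

```latex
d_{\Sigma_j}^{2}(x_i, \mu_j) \;=\; (x_i - \mu_j)^{\top}\, \Sigma_j^{-1}\, (x_i - \mu_j).
```

Unlike the plain Euclidean distance, this metric adapts to a linear transformation of the data through Σ_j.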

Central Clustering: K-Means
Assume a map function assigns each ith sample a cluster label.
An optimal clustering minimizes the within-cluster scatter, i.e., the average distance of all samples to their respective cluster means.
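In one common notation (assumed here, not copied from the handout), with assignment map c(i) ∈ {1, …, K} and cluster means μ_j, the within-cluster scatter to be minimized is

```latex
\min_{c,\;\mu_1,\dots,\mu_K}\;\; \frac{1}{n}\sum_{i=1}^{n} \big\lVert x_i - \mu_{c(i)} \big\rVert^{2}.
```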

Central Clustering: K-Means
However, since K is user-defined, the within-cluster scatter can always be reduced by increasing K; in the extreme case K = n, each point becomes a cluster by itself.
In this chapter, we assume the true K is known.

Algorithm
A chicken-and-egg view: the cluster labels are easy to assign once the means are known, and the means are easy to estimate once the labels are known.

Two-Step Iteration

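A minimal NumPy sketch of this two-step iteration (function and variable names are my own; the handout's pseudocode may differ in details such as the stopping rule):

```python
import numpy as np

def kmeans(X, K, n_iters=100, seed=0):
    """Minimal K-means: alternate the assignment step and the mean-update step."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=K, replace=False)]  # initialize means from samples
    for _ in range(n_iters):
        # Step 1 (segmentation): assign each sample to its nearest mean.
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Step 2 (estimation): recompute each mean from its assigned samples.
        new_mu = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else mu[j]
                           for j in range(K)])
        if np.allclose(new_mu, mu):   # stop when the means no longer move
            break
        mu = new_mu
    return labels, mu
```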


Example
http://util.io/k-means


Feature Space

Source: K. Grauman


Results of K-Means Clustering
[Figure: the input image, clusters on intensity, and clusters on color; k-means clustering using intensity alone and color alone.]
From Marc Pollefeys, COMP 256, 2003


A bad local optimum


Characteristics of K-Means
It is a greedy algorithm and is not guaranteed to converge to the global optimum.
Given fixed initial clusters/Gaussian models, the iterative process is deterministic.
The result may be improved by running k-means multiple times with different starting conditions (see the sketch below).
The segmentation-estimation process can be treated as a generalized expectation-maximization algorithm.
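Building on the kmeans sketch above (again a hypothetical helper, not the handout's code), multiple restarts simply keep the run with the smallest within-cluster scatter:

```python
def kmeans_restarts(X, K, n_restarts=10):
    """Run k-means from several random initializations and keep the best run."""
    best = None
    for seed in range(n_restarts):
        labels, mu = kmeans(X, K, seed=seed)
        scatter = ((X - mu[labels]) ** 2).sum()   # within-cluster scatter of this run
        if best is None or scatter < best[0]:
            best = (scatter, labels, mu)
    return best[1], best[2]
```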


EM Algorithm [Dempster-Laird-Rubin 1977]
Expectation Maximization (EM) estimates the model parameters and the segmentation in a maximum-likelihood (ML) sense.
Assume the samples are independently drawn from a mixture distribution, with the component indicated by a hidden discrete variable z.
The conditional distributions can be Gaussian.
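In standard mixture-model notation (assumed here), each sample is drawn from

```latex
p(x \mid \theta) \;=\; \sum_{j=1}^{K} \pi_j\, p(x \mid z = j,\, \theta_j),
\qquad
p(x \mid z = j,\, \theta_j) \;=\; \mathcal{N}(x;\, \mu_j, \Sigma_j),
```

where π_j = p(z = j) are the mixing proportions and z is the hidden component label.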


The Maximum-Likelihood Estimation
The unknown parameters are the mixing proportions and the parameters of each component, collected in θ.
The likelihood function is the product of the densities of the independent samples.
The optimal solution maximizes the log-likelihood.
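In the notation assumed above, the likelihood and log-likelihood of n independent samples are

```latex
L(\theta) \;=\; \prod_{i=1}^{n} p(x_i \mid \theta),
\qquad
\ell(\theta) \;=\; \sum_{i=1}^{n} \log \sum_{j=1}^{K} \pi_j\, p(x_i \mid z_i = j,\, \theta_j),
\qquad
\hat{\theta} \;=\; \arg\max_{\theta}\, \ell(\theta).
```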


The Maximum-Likelihood Estimation
Directly maximizing the log-likelihood function is a high-dimensional nonlinear optimization problem.


Define a new function:
The first term is called the expected complete log-likelihood function;
the second term is the conditional entropy.
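One standard way to write such a function (notation assumed here, with w_ij the probability that sample i belongs to component j) is

```latex
g(\theta, w) \;=\; \sum_{i=1}^{n}\sum_{j=1}^{K} w_{ij}\, \log p(x_i,\, z_i = j \mid \theta)
\;+\; H(w),
\qquad
H(w) \;=\; -\sum_{i=1}^{n}\sum_{j=1}^{K} w_{ij}\, \log w_{ij},
```

where the first term is the expected complete log-likelihood and H(w) is the conditional entropy of the assignments.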


Observation:


The Maximum-Likelihood Estimation
Regard the (incomplete) log-likelihood as a function of two variables.
Maximize g iteratively: an E step, followed by an M step.
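In the usual form (same assumed notation), iteration t alternates

```latex
\text{E step:}\quad
w_{ij}^{(t+1)} \;=\; p\big(z_i = j \mid x_i,\, \theta^{(t)}\big)
\;=\; \frac{\pi_j^{(t)}\, p\big(x_i \mid z_i = j,\, \theta_j^{(t)}\big)}
           {\sum_{l=1}^{K} \pi_l^{(t)}\, p\big(x_i \mid z_i = l,\, \theta_l^{(t)}\big)},

\text{M step:}\quad
\theta^{(t+1)} \;=\; \arg\max_{\theta}\; \sum_{i=1}^{n}\sum_{j=1}^{K}
w_{ij}^{(t+1)}\, \log p\big(x_i,\, z_i = j \mid \theta\big).
```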


The iteration converges to a stationary point.


Prop 4.2: Update


Update
Recall the definition of g above.
Assume the membership weights w are fixed; then maximize the expected complete log-likelihood.


To maximize the expected log-likelihood, assume as an example that each cluster is an isotropic normal distribution.
Eliminate the constant term in the objective.
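Under this isotropic assumption (Σ_j = σ_j² I in R^D), the weighted M-step updates commonly take the form below; treat this as a sketch in the notation assumed above, since the handout may arrange the expressions differently.

```latex
\pi_j = \frac{1}{n}\sum_{i=1}^{n} w_{ij},
\qquad
\mu_j = \frac{\sum_{i=1}^{n} w_{ij}\, x_i}{\sum_{i=1}^{n} w_{ij}},
\qquad
\sigma_j^{2} = \frac{\sum_{i=1}^{n} w_{ij}\, \lVert x_i - \mu_j \rVert^{2}}{D \sum_{i=1}^{n} w_{ij}}.
```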


Exer 4.2
Compared to k-means, EM assigns the samples softly to each cluster according to a set of probabilities.
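A small NumPy sketch of this soft assignment for the isotropic model (names are my own; a hard k-means assignment would instead take the argmax over j):

```python
import numpy as np

def e_step_isotropic(X, pi, mu, sigma2):
    """Soft assignment: responsibilities w[i, j] = p(z_i = j | x_i, theta)."""
    n, D = X.shape
    d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)   # squared distances to means
    log_w = np.log(pi) - 0.5 * D * np.log(2 * np.pi * sigma2) - 0.5 * d2 / sigma2
    log_w -= log_w.max(axis=1, keepdims=True)                  # subtract max for stability
    w = np.exp(log_w)
    return w / w.sum(axis=1, keepdims=True)                    # rows sum to one
```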


EM Algorithm
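A minimal EM loop for the isotropic Gaussian mixture, reusing e_step_isotropic from the sketch above (again my own names and a fixed iteration count, not the handout's algorithm):

```python
import numpy as np

def em_isotropic_gmm(X, K, n_iters=100, seed=0):
    """Alternate E and M steps for a mixture of K isotropic Gaussians."""
    rng = np.random.default_rng(seed)
    n, D = X.shape
    mu = X[rng.choice(n, size=K, replace=False)]   # random initial means
    sigma2 = np.full(K, X.var())                   # shared initial variance
    pi = np.full(K, 1.0 / K)                       # uniform mixing proportions
    for _ in range(n_iters):
        w = e_step_isotropic(X, pi, mu, sigma2)    # E step: soft responsibilities
        # M step: weighted updates of the mixture parameters.
        Nj = w.sum(axis=0)
        pi = Nj / n
        mu = (w.T @ X) / Nj[:, None]
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        sigma2 = (w * d2).sum(axis=0) / (D * Nj)
    return pi, mu, sigma2, w
```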


Example 4.3: A global maximum may not exist


Alternative view of EM: Coordinate ascent
[Figure: coordinate-ascent illustration, built up over several slides; the objective is increased by alternately updating along the coordinate directions (axis labels w, w1, w2 in the original slides).]


Visual example of EM


Potential Problems
Incorrect number of mixture components
Singularities

Incorrect Number of Gaussians


Incorrect Number of Gaussians

Singularities
A minority of the data can have a disproportionate effect on the model likelihood.
For example:


GMM example

Singularities
When a mixture component collapses onto a single point, its mean becomes that point and its variance goes to zero.
Consider the likelihood function as the covariance goes to zero: the likelihood approaches infinity.
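Concretely, if one isotropic component centers exactly on a data point x_i (same assumed notation), its density at that point is

```latex
\mathcal{N}\!\big(x_i \mid \mu_j = x_i,\; \sigma_j^{2} I\big)
\;=\; \frac{1}{(2\pi \sigma_j^{2})^{D/2}}
\;\longrightarrow\; \infty
\quad\text{as } \sigma_j \to 0,
```

so the log-likelihood is unbounded above and a global maximum need not exist (cf. Example 4.3).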


K-means vs. EM
K-means clustering and EM clustering on an artificial data set ("mouse"). The tendency of k-means to produce equal-sized clusters leads to poor results here, while EM benefits from the Gaussian distributions present in the data set.


So far
K-means
Expectation Maximization


Next up
Multiple-Subspace Segmentation
K-subspaces
EM for Subspaces


Multiple-Subspace Segmentation
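In broad terms (a paraphrase in assumed notation, not the handout's exact statement): given samples drawn from a union of K linear subspaces,

```latex
x_1, \dots, x_n \;\in\; S_1 \cup S_2 \cup \dots \cup S_K \;\subset\; \mathbb{R}^{D},
\qquad
S_j \;=\; \mathrm{span}(U_j),\quad U_j \in \mathbb{R}^{D \times d_j},
```

the task is to segment the samples by subspace and to estimate a basis U_j for each subspace.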


K-subspaces


K-subspaces
With noise, we minimize the total squared distance from each sample to its assigned subspace.
Unfortunately, unlike PCA, there is no constructive solution to the above minimization problem. The main difficulty is that the foregoing objective is hybrid: it is a combination of minimization over the continuous variables {Uj} and the discrete variable j.
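A common way to write this objective (a sketch in assumed notation: subspaces through the origin with orthonormal bases U_j ∈ R^{D×d_j}; the handout may also include mean vectors) is

```latex
\min_{U_1,\dots,U_K}\;\sum_{i=1}^{n}\;\min_{j \in \{1,\dots,K\}}\;
\big\lVert\, x_i - U_j U_j^{\top} x_i \,\big\rVert^{2},
```

where U_j U_j^T x_i is the orthogonal projection of x_i onto the jth subspace.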



K-subspaces
Given the segmentation, updating each basis Uj is exactly the same as in PCA.
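A minimal sketch of the resulting alternation, assuming K subspaces through the origin of equal dimension d (names and initialization are my own):

```python
import numpy as np

def k_subspaces(X, K, d, n_iters=50, seed=0):
    """Alternate subspace assignment and per-cluster PCA refits."""
    rng = np.random.default_rng(seed)
    D = X.shape[1]
    # Initialize each basis U_j as a random orthonormal D x d matrix.
    U = [np.linalg.qr(rng.standard_normal((D, d)))[0] for _ in range(K)]
    for _ in range(n_iters):
        # Assignment step: residual of x to subspace j is ||x - U_j U_j^T x||^2.
        resid = np.stack([((X - X @ Uj @ Uj.T) ** 2).sum(axis=1) for Uj in U], axis=1)
        labels = resid.argmin(axis=1)
        # Estimation step: refit each subspace exactly as in PCA
        # (top-d right singular vectors of the assigned samples).
        for j in range(K):
            Xj = X[labels == j]
            if len(Xj) >= d:
                _, _, Vt = np.linalg.svd(Xj, full_matrices=False)
                U[j] = Vt[:d].T
    return labels, U
```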



EM for Subspaces


EM for Subspaces
In the M step



Relationship between K-subspaces and EM
At each iteration,
the K-subspaces algorithm gives a definite assignment of every data point to one of the subspaces;
the EM algorithm views the membership as a random variable and uses its expected value to give a probabilistic assignment of the data point.


Homework
Read the handout, Chapter 4: Iterative Methods for Multiple-Subspace Segmentation.
Complete Exercise 4.2 (page 111) of the handout.
