
Image as Markov Random Field and Applications

intro, applications
MRF, labeling ...
how it can be computed at all?

Tomáš Svoboda, svoboda@cmp.felk.cvut.cz

Czech Technical University in Prague, Center for Machine Perception
http://cmp.felk.cvut.cz
Last update: July 14, 2010
Talk Outline

Applications in segmentation: GraphCut, GrabCut, demos

Please note that the lecture will be accompanied by several sketches and derivations
on the blackboard and a few live interactive demos in Matlab.

About this very lecture

a few notes before we start:

MRF is a complicated topic
this lecture is introductory
some simplifications in order not to lose the whole picture
most important references provided
many accessible explanations on the web (wikipedia ...)

Markov Random Field (MRF)

From wikipedia²:
A Markov random field, Markov network or undirected graphical model is a
graphical model in which a set of random variables have a Markov property
described by an undirected graph.
A more formal definition, which we follow, can be found in the first chapter³
of the book [4].

Let's think about images:

images are organized on a regular grid, which can be seen as an undirected graph
image intensities are the random variables
the values depend only on their immediate spatial neighborhood, which is the
Markov property

² http://en.wikipedia.org/wiki/Markov_random_field
³ freely available at: http://www.nlpr.ia.ac.cn/users/szli/MRF_Book/MRF_Book.html

Labeling for image analysis

Many image analysis and interpretation problems can be posed as labeling
problems: assign a label to each image pixel (or to features in general).
Image intensity itself can be considered as a label (think about palette images).

Sites
$S = \{1, \ldots, m\}$ indexes a discrete set of $m$ sites.

A site could be:
an individual pixel
an image region
a corner point, line segment, surface patch ...

Labels and sites

A label is an event that may happen to a site.
Set of labels $L$. We will discuss discrete labels:
$L = \{l_1, \ldots, l_M\}$, shortly $L = \{1, \ldots, M\}$.

What can be a label?
an intensity value
an object label
in edge detection, a binary flag $L = \{\mathrm{edge}, \mathrm{nonedge}\}$
...

Ordering of labels

Some labels can be ordered, some cannot.
Ordered labels can be used to measure a distance between labels.

The Labeling Problem

Assigning a label from the label set $L$ to each of the sites in $S$.

Example: edge detection in an image
Assign a label $f_i$ from the set $L = \{\mathrm{edge}, \mathrm{nonedge}\}$ to site $i \in S$, where the
elements in $S$ index the image pixels. The set
$f = \{f_1, \ldots, f_m\}$
is called a labeling.

Unique labels
When each site is assigned a unique label, the labeling can be seen as a mapping
from $S$ to $L$,
$f : S \to L$.
A labeling is also called a coloring in mathematical programming.

How many possible labelings?

Assuming all $m$ sites have the same label set $L$, the set of all labelings is

$\mathbb{F} = \underbrace{L \times L \times \cdots \times L}_{m \text{ times}} = L^m$,

hence there are $|L|^m$ possible labelings.

Imagine the image restoration problem [3]: $m$ is the number of pixels in the
image and $|L|$ equals the number of intensity levels.
Many, many possible labelings. Usually only a few are good.
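For example, a small 10 × 10 binary image ($m = 100$, $|L| = 2$) already admits
$2^{100} \approx 1.3 \cdot 10^{30}$ labelings.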

Labeling problems in image analysis

image restoration
region segmentation
edge detection
object detection and recognition
stereo
...

Labeling with contextual analysis

In images, the site neighborhood matters.
A probability $P(f_i)$ does not depend only on the site itself but also on the labeling
around it. Mathematically speaking, we must consider the conditional probability

$P(f_i \mid \{f_{i'}\})$,

where $\{f_{i'}\}$ denotes the set of the other labels.

no context:
$P(f) = \prod_{i \in S} P(f_i)$

Markov Random Field - MRF

$P(f_i \mid f_{S \setminus \{i\}}) = P(f_i \mid f_{N_i})$,

where $f_{N_i}$ stands for the labels at the sites neighboring $i$.

MRF and Gibbs Random Fields

How to specify an MRF: in terms of the conditional probabilities $P(f_i \mid f_{N_i})$ or
of the joint probability $P(f)$?
The Hammersley-Clifford theorem states the equivalence between an MRF and a
Gibbs distribution.
A set of random variables $F$ is said to be a Gibbs Random Field on $S$ with
respect to $N$ iff its configurations obey a Gibbs distribution

$P(f) = Z^{-1} e^{-\frac{1}{T} U(f)}$,

where

$Z = \sum_{f \in \mathbb{F}} e^{-\frac{1}{T} U(f)}$

is a normalizing constant.

[Figure: $P = e^{-U}$ plotted as a function of the energy term $U$.]

Gibbs distribution

$P(f) = Z^{-1} e^{-\frac{1}{T} U(f)}$

$T$ is the temperature, $T = 1$ unless stated otherwise.
$U(f)$ is the energy function.

[Figure: $P = e^{-\frac{1}{T} U}$ as a function of the energy term $U$, for T = 0.5, 1.0, 2.0, 5.0.]
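To make the role of $T$ and of the normalizing constant $Z$ concrete, here is a minimal
Python sketch (numpy only; the toy energies are made up) that turns the energies of a
handful of candidate labelings into Gibbs probabilities:

```python
import numpy as np

def gibbs_probabilities(energies, T=1.0):
    """Convert energies U(f) of candidate labelings into Gibbs probabilities.

    P(f) = exp(-U(f)/T) / Z, with Z summing over all candidates.
    The minimum is subtracted before exponentiation for numerical stability.
    """
    u = np.asarray(energies, dtype=float) / T
    w = np.exp(-(u - u.min()))   # unnormalized weights, numerically stable
    return w / w.sum()           # normalize by Z

# toy example: four candidate labelings with energies 1, 2, 3, 4
print(gibbs_probabilities([1.0, 2.0, 3.0, 4.0], T=1.0))
# lower temperature concentrates the mass on the lowest-energy labeling
print(gibbs_probabilities([1.0, 2.0, 3.0, 4.0], T=0.5))
```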

Energy function

$U(f) = \sum_{c \in C} V_c(f)$

is a sum of clique potentials $V_c(f)$ over all possible cliques $C$.⁴

⁴ Illustration from the book [4]

Simple cliques, Auto-models

Contextual constraints on two labels:

$U(f) = \sum_{i \in S} V_1(f_i) + \sum_{i \in S} \sum_{i' \in N_i} V_2(f_i, f_{i'})$

This can be interpreted as

$U(f) = U_{\mathrm{data}}(f) + U_{\mathrm{smooth}}(f)$

Neighborhood and cliques on a regular lattice⁵

⁵ Illustration from the book [4]

Energy minimization

What is to be computed? The Gibbs distribution:

$P(f) = Z^{-1} e^{-\frac{1}{T} U(f)}$

Recall that $f$ is the desired labeling, $f : S \to L$.
In the MAP formulation we seek the most probable labeling, the $f$ maximizing $P(f)$.
The best labeling minimizes the energy $U(f)$.
It is a combinatorial problem. The next explanation follows mainly [2].

Energy - data term and smoothness term

$U(f) = U_{\mathrm{data}}(f) + U_{\mathrm{smooth}}(f)$

The data term measures how well a label $f_p$ fits the particular pixel (site) $p$.
Globally,

$U_{\mathrm{data}}(f) = \sum_{p \in P} D_p(f_p)$

Example: in image restoration, $D_p(f_p) = (f_p - I_p)^2$, where $I_p$ is the
observed intensity and $f_p$ is the label (assigned intensity).

Smoothness term
Expresses the context. Setting a proper smoothness term is much more tricky:
smooth, but not everywhere (think about object boundaries)
discontinuity preserving
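As a concrete illustration of the two terms, here is a minimal numpy sketch (illustrative
only, not the lecture code) that evaluates $U(f)$ for an image restoration labeling on a
4-connected grid, using the quadratic data term above and a simple 0/1
neighbor-disagreement smoothness penalty (the Potts model introduced later):

```python
import numpy as np

def restoration_energy(f, I, lam=1.0):
    """U(f) = sum_p (f_p - I_p)^2 + lam * #(4-neighbor pairs with f_p != f_q).

    f : candidate labeling (restored intensities), same shape as I
    I : observed noisy image
    """
    data = np.sum((f.astype(float) - I.astype(float)) ** 2)
    # interactions over all horizontal and vertical neighbor pairs
    smooth = np.sum(f[:, 1:] != f[:, :-1]) + np.sum(f[1:, :] != f[:-1, :])
    return data + lam * smooth

I = np.array([[0, 0, 9], [0, 8, 9], [7, 8, 9]])
print(restoration_energy(I, I))                 # labeling = observation: data term 0
print(restoration_energy(np.zeros_like(I), I))  # all-zero labeling: large data term
```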

Interacting pixels

We consider the energy

$U(f) = \sum_{p \in P} D_p(f_p) + \sum_{\{p,q\} \in N} V_{p,q}(f_p, f_q)$

where $N$ is the set of interacting pixel pairs, typically adjacent pixels. $D_p$ is
assumed to be nonnegative.
$V_{p,q}$ is called the interaction penalty.
Interaction of adjacent pixels may have long range impact!

(Reasonable) interaction penalties

For labels $\alpha, \beta, \gamma \in L$:

$V(\alpha, \beta) = 0 \Leftrightarrow \alpha = \beta$   (1)
$V(\alpha, \beta) = V(\beta, \alpha) \geq 0$   (2)
$V(\alpha, \beta) \leq V(\alpha, \gamma) + V(\gamma, \beta)$   (3)

The penalty is a metric if (1)-(3) all hold, and a semimetric if only (2) and (3) are
satisfied.

Examples of discontinuity preserving penalties:
truncated quadratic $V(\alpha, \beta) = \min(K, (\alpha - \beta)^2)$
truncated absolute distance $V(\alpha, \beta) = \min(K, |\alpha - \beta|)$
Potts model $V(\alpha, \beta) = K \cdot T(\alpha \neq \beta)$, where $T(\cdot) = 1$ if the argument is
true, otherwise 0.
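A quick sketch of the three penalties (plain Python, toy values) showing why they are
discontinuity preserving: the penalty saturates at $K$ across a large jump, so an object
boundary is not smoothed away:

```python
# Three discontinuity-preserving interaction penalties (illustrative sketch).
def truncated_quadratic(a, b, K):
    return min(K, (a - b) ** 2)

def truncated_absolute(a, b, K):
    return min(K, abs(a - b))

def potts(a, b, K):
    return K if a != b else 0

# a small step (10 vs 11) versus a large jump (10 vs 200): the second column
# stays bounded by K in all three cases
for V in (truncated_quadratic, truncated_absolute, potts):
    print(V.__name__, V(10, 11, K=5), V(10, 200, K=5))
```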

Towards the best labeling (minimum energy)

Catch: finding the global minimum is NP-complete even for the simple Potts
model, so a local minimum is sought instead.

Problem: if the solution is poor, either
the energy function was poorly chosen, or
the local minimum is far from the global one.
Local minimum

A labeling $f$ is a local minimum of the energy $U$ if

$U(f) \leq U(f')$

for any $f'$ near to $f$. In case of discrete labelings, near means within a single
move of $f$.
Many local minimization methods use standard moves, where only one pixel
(site) may change its label at a time.
Example: greedy optimization
For each pixel, the label which gives the largest decrease of the energy is
chosen; a sketch of this strategy follows.
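A minimal sketch of such a greedy, standard-move optimizer (ICM-style, plain numpy;
the quadratic-data-plus-Potts energy is the example from the restoration slide, and the
sweep order is an arbitrary choice of this sketch):

```python
import numpy as np

def greedy_standard_moves(I, labels, lam=1.0, max_sweeps=10):
    """Greedy minimization of U(f) = sum (f_p - I_p)^2 + lam * Potts.

    One pixel changes its label at a time; we sweep the image until a full
    sweep produces no improving move, i.e. a local minimum is reached.
    """
    f = I.copy()
    H, W = I.shape
    for _ in range(max_sweeps):
        changed = False
        for y in range(H):
            for x in range(W):
                nbrs = [f[yy, xx] for yy, xx in
                        ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                        if 0 <= yy < H and 0 <= xx < W]
                # local energy of every candidate label at this pixel
                costs = [(l - I[y, x]) ** 2 + lam * sum(l != n for n in nbrs)
                         for l in labels]
                best = labels[int(np.argmin(costs))]
                if best != f[y, x]:
                    f[y, x] = best
                    changed = True
        if not changed:   # a sweep with no improving move: local minimum
            break
    return f

I = np.array([[0, 1, 9], [1, 0, 8], [0, 9, 9]])
print(greedy_standard_moves(I, labels=list(range(10)), lam=2.0))
```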

Fast approximate energy minimization via graph cuts

We only sketch the main ideas from the seminal work [2].⁶
Allow more than just one label change at a time: the α-β swap and the α-expansion
moves.⁷

⁶ Freely available implementation. Many difficult problems in computer vision were solved by using this
method and implementation.
⁷ image from [2]

Algorithm

[Algorithm figure, developed over three slides.]

α-β swap

How many iterations are in each cycle?
A cycle is successful if a strictly better labeling is found in any iteration.
Cycling stops after the first unsuccessful cycle; a schematic of this loop follows.
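A Python-style skeleton of that loop (a sketch of the cycling rule only;
`best_alpha_beta_swap` and `energy` are passed-in placeholders for the min-cut
subroutine of [2] and for $U(f)$, not real library calls):

```python
from itertools import combinations

def swap_move_minimization(f, labels, energy, best_alpha_beta_swap):
    """One cycle = an iteration over all label pairs {alpha, beta}.

    A cycle is successful if some swap strictly lowers the energy;
    cycling stops after the first unsuccessful cycle.
    """
    while True:
        success = False
        for alpha, beta in combinations(labels, 2):       # one cycle
            f_new = best_alpha_beta_swap(f, alpha, beta)  # one iteration
            if energy(f_new) < energy(f):                 # strictly better?
                f, success = f_new, True
        if not success:
            return f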

α-β swap and min-cut

The best α-β swap is found by constructing a graph and finding its min-cut.
min-cut is equivalent to max-flow (Ford-Fulkerson theorem).
Max-flow between terminals is a standard problem in combinatorial optimization.
Algorithms with low-order polynomial complexities exist.

Finding optimal α-β swap⁸

⁸ illustration from [2]
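Not the exact construction from [2], but a toy binary example in the same spirit
(networkx is a library choice assumed for this sketch, not used in the lecture): each
pixel gets two t-links carrying its data costs and n-links carrying a Potts penalty, and
a max-flow/min-cut routine recovers the labeling:

```python
import networkx as nx

# tiny 1D "image": per-pixel data costs D_p(0) and D_p(1) (toy numbers)
D0 = [0.1, 0.2, 0.9, 0.8]    # penalty for assigning label 0
D1 = [0.9, 0.8, 0.1, 0.2]    # penalty for assigning label 1
lam = 0.3                    # Potts interaction strength

G = nx.DiGraph()
for p in range(len(D0)):
    G.add_edge('s', p, capacity=D1[p])  # cut iff p lands on the sink side (label 1)
    G.add_edge(p, 't', capacity=D0[p])  # cut iff p stays on the source side (label 0)
for p in range(len(D0) - 1):
    G.add_edge(p, p + 1, capacity=lam)  # neighbor interaction, both directions
    G.add_edge(p + 1, p, capacity=lam)

cut_value, (source_side, sink_side) = nx.minimum_cut(G, 's', 't')
labels = [0 if p in source_side else 1 for p in range(len(D0))]
print(cut_value, labels)   # -> 0.9 [0, 0, 1, 1]: one boundary, where the data flips
```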

Finding optimal α-β swap: min-cut and re-labeling

[Figures: the swap graph, its min-cut, and the induced re-labeling, shown over three slides.]

MRF in image segmentation

(Exam: Friday, June 4.)

Segmentation with seeds - Interactive GraphCuts

Idea: mark a few pixels that truly belong to the object or to the background, and then
refine (grow) the segmentation by using soft constraints:

how to find the optimal boundary between object and background
data term
boundary penalties and/or pixel interactions

Segmentation with seeds - graph⁹

⁹ illustration from [1]

Edges, cuts, segmentation

Explained on the blackboard.
See [1] for details.

What data term?

$U_{\mathrm{data}}(obj, bck) = \sum_{p \in obj} D_p(obj) + \sum_{p \in bck} D_p(bck)$

The better the match between the pixel and the obj or bck model, the lower the
energy (penalty). (For simplicity you may also think about the opposite convention:
the better the match, the higher the value.)

How to model the background and foreground (object) pixels?
intensity or color distributions:
histograms
parametric models: GMM - Gaussian Mixture Model
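A minimal sketch of the histogram variant of such a data term (illustrative only; the
seed values and bin count are made up) in the spirit of [1]:

```python
import numpy as np

def histogram_model(seed_values, bins=32, lo=0, hi=256):
    """Normalized intensity histogram from user seeds, with a small floor
    so that -log stays finite for intensities unseen in the seeds."""
    h, _ = np.histogram(seed_values, bins=bins, range=(lo, hi))
    p = (h + 1e-6) / (h + 1e-6).sum()
    return p

def data_term(I, model, bins=32, lo=0, hi=256):
    """D_p = -ln p(I_p | model), evaluated for every pixel at once."""
    idx = np.clip(((I - lo) * bins) // (hi - lo), 0, bins - 1).astype(int)
    return -np.log(model[idx])

I = np.array([[10, 20, 200], [15, 210, 220]])        # toy image
p_obj = histogram_model([200, 210, 215, 220, 230])   # seeds marked as object
p_bck = histogram_model([5, 10, 15, 20, 25])         # seeds marked as background
print(data_term(I, p_obj))   # low penalty on the bright (object-like) pixels
print(data_term(I, p_bck))   # low penalty on the dark (background-like) pixels
```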

Image intensities - 1D GMM¹⁰

[Figure: relative frequency of image intensities (histogram over the intensity range).]
[Figure: the individual Gaussians fitted to the intensity histogram.]
[Figure: the resulting Gaussian mixture.]

¹⁰ Demo codes from [6]

2D GMM

[Figures: 2D data points and the fitted Gaussian mixture, shown over three slides.]

Data term by GMM - math summary

$p(x \mid f_i) = \sum_{k=1}^{K} w_k^{f_i} \, \frac{1}{(2\pi)^{d/2} \, |\Sigma_k^{f_i}|^{1/2}} \exp\!\left( -\frac{1}{2} (x - \mu_k^{f_i})^\top (\Sigma_k^{f_i})^{-1} (x - \mu_k^{f_i}) \right)$

where $d$ is the dimension.

$K$ is the number of Gaussians. User defined.
For each label $L = \{obj, bck\}$, different $w_k, \mu_k, \Sigma_k$ are estimated from the
data (seeds).
$x$ is the pixel value; it can be an intensity, a color vector, ...

Data term

$D(obj) = -\ln p(x \mid obj)$
$D(bck) = -\ln p(x \mid bck)$

[Figure: the data term $-\ln p$ as a function of the probability.]
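A sketch of this GMM data term using scikit-learn (a library assumption of this note;
the lecture demos use the Matlab codes from [6]). `score_samples` returns
$\ln p(x)$ per sample, so the data term is just its negation:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# toy 1D seeds: dark-ish background, bright-ish object (made-up numbers)
bck_seeds = rng.normal(40, 10, size=(200, 1))
obj_seeds = rng.normal(180, 25, size=(200, 1))

gmm_obj = GaussianMixture(n_components=2).fit(obj_seeds)   # K is user defined
gmm_bck = GaussianMixture(n_components=2).fit(bck_seeds)

x = np.array([[30.0], [100.0], [190.0]])   # pixel values to score
D_obj = -gmm_obj.score_samples(x)          # D(obj) = -ln p(x | obj)
D_bck = -gmm_bck.score_samples(x)          # D(bck) = -ln p(x | bck)
print(D_obj, D_bck)   # bright pixel: low D(obj); dark pixel: low D(bck)
```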

Results - data term only

See the live demo.¹¹

¹¹ Demo codes courtesy of V. Franc and A. Shekhovtsov

What boundary (interaction) term?

The Potts model: $V(p, q) = \lambda \, T(f_p \neq f_q)$, where $T(\cdot) = 1$ if the argument is true,
otherwise 0.
Effect of $\lambda$: see the live demo.

Results for data and interaction terms

GrabCut - going beyond the seeds

The main idea: iterate the graph cut and refine the data terms in each iteration.
Stop if the energy (penalty) does not decrease. [5]
The practical motivation: further reduce the user interaction.¹²

¹² images from [5]

GrabCut algorithm

Init: the user specifies background pixels $T_B$; $T_F = \emptyset$; $T_U = \overline{T_B}$ (the rest of the image).

Iterative minimization:
1. assign GMM components to pixels in $T_U$
2. learn GMM parameters from the data
3. segment by using min-cut (graph cut algorithm)
4. repeat from step 1 until convergence
Optional: edit some pixels and repeat the min-cut.
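The iteration as a Python-style skeleton (a sketch of the loop structure only; the
helper callables stand in for the four steps above and for the energy $U$, and are
assumptions of this sketch, not real library calls):

```python
def grabcut(image, T_B, init_labels, assign_components, fit_gmms,
            graph_cut_segment, energy, max_iters=20, tol=1e-3):
    """Skeleton of the GrabCut iteration [5]: alternate GMM re-estimation
    and min-cut segmentation until the energy stops decreasing."""
    alpha = init_labels(image, T_B)       # T_U starts as foreground, T_B is fixed
    prev = float('inf')
    for _ in range(max_iters):
        k = assign_components(image, alpha)             # step 1
        gmm_fg, gmm_bg = fit_gmms(image, alpha, k)      # step 2
        alpha = graph_cut_segment(image, gmm_fg, gmm_bg, T_B)  # step 3
        e = energy(image, alpha, gmm_fg, gmm_bg)
        if prev - e < tol:                # stop: energy no longer decreases
            break
        prev = e                          # step 4: otherwise repeat
    return alpha
```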

References

[1] Yuri Boykov and Marie-Pierre Jolly. Interactive graph cuts for optimal boundary & region segmentation of
objects in N-D images. In Proceedings of International Conference on Computer Vision (ICCV), 2001.
[2] Yuri Boykov, Olga Veksler, and Ramin Zabih. Fast approximate energy minimization via graph cuts. IEEE
Transactions on Pattern Analysis and Machine Intelligence, 23(11):1222-1239, November 2001.
[3] S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of
images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(6):721-741, 1984.
[4] Stan Z. Li. Markov Random Field Modeling in Image Analysis. Springer, 2009.
[5] Carsten Rother, Vladimir Kolmogorov, and Andrew Blake. GrabCut: Interactive foreground extraction
using iterated graph cuts. ACM Transactions on Graphics (SIGGRAPH'04), 2004.
[6] Tomáš Svoboda, Jan Kybic, and Václav Hlaváč. Image Processing, Analysis and Machine Vision. A
MATLAB Companion. Thomson, 2007. Accompanying www site http://visionbook.felk.cvut.cz.

End