Lecture 02 Artificial Vision Lecture2

Discrete Inference & Learning
in Artificial Vision
Lecture 2
Reparameterization and dynamic programming
M. Nikos Paragios & M. Pawan Kumar
Outline
Models
Exponential Family
Problem Formulation
Reparameterization
Dynamic Programming
Undirected Graph
Graph G = (V, E)
Vertices V = {V1, V2, ..., V9}
Edges E
[Figure: undirected graph on the nine vertices V1, ..., V9]
Markov Random Field (MRF)
Vertices are associated with random variables X = {X1, X2, ..., X9}
The Xp are unobserved random variables
[Figure: each vertex Vp of the graph carries a random variable Xp]
Neighbors
Edges define a neighborhood over random variables
MRF
Variable Xp takes a value, or label, xp from a set L = {l1, l2, ..., lh}
The label set L is discrete and finite
X = x is called a labeling
MRF
Total number of labelings is h^n for n random variables
Probability of a labeling is P(x)
MRF
MRF assumes the Markovian property for P(x)
Xp is conditionally independent of every non-neighboring Xq given Xp's neighbors

Hammersley-Clifford Theorem
MRF
Probability P(x) can be decomposed into clique potentials
Potentials: $\psi_{12}(x_1, x_2)$, $\psi_{56}(x_5, x_6)$, ... (one per edge)
$P(x) \propto \prod_{(p,q)} \psi_{pq}(x_p, x_q)$
Outline
Models
Incorporating Data
Conditional Random Fields
Exponential Family
Problem Formulation
Reparameterization
Dynamic Programming
MRF
Observed data d = {d1, d2, ..., d9}, one observation dp attached to each Xp
Unary potential: $\psi_1(x_1, d_1)$
$P(x) \propto \prod_{(p,q)} \psi_{pq}(x_p, x_q)$
$P(d \mid x) \propto \prod_p \psi_p(x_p, d_p)$
MRF
$P(x, d) = \frac{1}{Z} \prod_p \psi_p(x_p, d_p) \prod_{(p,q)} \psi_{pq}(x_p, x_q)$
Z is known as the partition function
MRF
High-order potential: $\psi_{4578}(x_4, x_5, x_7, x_8)$
[Figure: the clique over X4, X5, X7, X8 carries a single high-order potential]
Pairwise MRF
Unary potential: $\psi_1(x_1, d_1)$
Pairwise potential: $\psi_{56}(x_5, x_6)$
$P(x, d) = \frac{1}{Z} \prod_p \psi_p(x_p, d_p) \prod_{(p,q)} \psi_{pq}(x_p, x_q)$
Z is known as the partition function
Outline
Models
Incorporating Data
Conditional Random Fields
Exponential Family
Problem Formulation
Reparameterization
Dynamic Programming
Conditional Random Fields (CRF)

CRF assumes the Markovian property for P(x|d)

Hammersley-Clifford Theorem
CRF
Clique potentials that depend on the data
$P(x \mid d) \propto \prod_p \psi_p(x_p; d) \prod_{(p,q)} \psi_{pq}(x_p, x_q; d)$
CRF
$P(x \mid d) = \frac{1}{Z} \prod_p \psi_p(x_p; d) \prod_{(p,q)} \psi_{pq}(x_p, x_q; d)$
Z is known as the partition function
MRF and CRF
$P(x) = \frac{1}{Z} \prod_p \psi_p(x_p) \prod_{(p,q)} \psi_{pq}(x_p, x_q)$
Outline
Models
Exponential Family
Problem Formulation
Reparameterization
Dynamic Programming
Exponential Family
$P(x) = \frac{1}{Z} \prod_p \psi_p(x_p) \prod_{(p,q)} \psi_{pq}(x_p, x_q) = \frac{1}{Z} \exp(-E(x))$
Energy $E(x) = \sum_p \theta_p(x_p) + \sum_{(p,q)} \theta_{pq}(x_p, x_q)$
Unary parameters (unary potentials) $\theta_p(x_p)$: analogous to $-\log \psi_p(x_p)$
Pairwise parameters (pairwise potentials) $\theta_{pq}(x_p, x_q)$: analogous to $-\log \psi_{pq}(x_p, x_q)$
Lower energy corresponds to higher probability
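As a small illustration of the relation between energy and probability, the sketch below enumerates a hypothetical two-variable, two-label model, computes the partition function Z by brute force, and checks that the lowest-energy labeling is also the most probable one. The potential values and all names (`theta_p`, `theta_q`, `theta_pq`, `energy`) are assumptions made for this example.

```python
import itertools
import math

# Hypothetical two-variable model with two labels (illustrative numbers only).
theta_p = [5.0, 2.0]                 # unary potentials of Xp
theta_q = [2.0, 4.0]                 # unary potentials of Xq
theta_pq = [[0.0, 1.0], [1.0, 0.0]]  # pairwise potentials theta_pq[xp][xq]

def energy(xp, xq):
    """E(x) = theta_p(xp) + theta_q(xq) + theta_pq(xp, xq)."""
    return theta_p[xp] + theta_q[xq] + theta_pq[xp][xq]

labelings = list(itertools.product(range(2), repeat=2))

# Partition function: sum of exp(-E(x)) over all labelings.
Z = sum(math.exp(-energy(xp, xq)) for xp, xq in labelings)

# P(x) = exp(-E(x)) / Z
prob = {x: math.exp(-energy(*x)) / Z for x in labelings}

best_by_energy = min(labelings, key=lambda x: energy(*x))
best_by_prob = max(labelings, key=lambda x: prob[x])
```

Brute-force enumeration of Z is only feasible for tiny models like this one, since the number of labelings grows as h^n.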
Outline
Models
Exponential Family
Problem Formulation
Reparameterization
Dynamic Programming
Energy Function
Random variables X = {Xp, Xq, Xr, Xs}
Labels L = {l0, l1, ...}
Labelling x
[Figure: chain Xp, Xq, Xr, Xs; each variable has one node per label l0, l1]

Energy Function
Unary potential: $E(x) = \sum_p \theta_p(x_p)$ is easy to minimize, since each variable can be labelled independently
Neighbourhood: Neighbors = { (p,q), (q,r), (r,s) }
Pairwise potential: $E(x) = \sum_p \theta_p(x_p) + \sum_{(p,q)} \theta_{pq}(x_p, x_q)$
Parameter $\theta$: $E(x; \theta) = \sum_p \theta_p(x_p) + \sum_{(p,q)} \theta_{pq}(x_p, x_q)$

Unary potentials of the running example:
Variable   l0   l1
Xp          5    2
Xq          2    4
Xr          3    6
Xs          7    3

Pairwise potentials of the running example (first label belongs to the first variable of the edge):
Edge     (l0,l0)  (l0,l1)  (l1,l0)  (l1,l1)
(p,q)       0        1        1        0
(q,r)       1        3        2        0
(r,s)       0        1        4        1
Outline
Models
Exponential Family
Problem Formulation
Energy Minimization
Computing Min-Marginals
Reparameterization
Dynamic Programming
Energy Minimization
The energy of a labelling is the sum of the potentials it selects.
x = {1, 0, 0, 1}: $E(x; \theta)$ = 2 + 1 + 2 + 1 + 3 + 1 + 3 = 13
x = {0, 1, 1, 0}: $E(x; \theta)$ = 5 + 1 + 4 + 0 + 6 + 4 + 7 = 27
$x^* = \arg\min_x E(x; \theta)$
$e^* = \min_x E(x; \theta) = E(x^*; \theta)$
Energy Minimization
16 possible labellings

xp  xq  xr  xs   $E(x; \theta)$
0   0   0   0    18
0   0   0   1    15
0   0   1   0    27
0   0   1   1    20
0   1   0   0    22
0   1   0   1    19
0   1   1   0    27
0   1   1   1    20
1   0   0   0    16
1   0   0   1    13
1   0   1   0    25
1   0   1   1    18
1   1   0   0    18
1   1   0   1    15
1   1   1   0    23
1   1   1   1    16

x* = {1, 0, 0, 1}
e* = 13
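The table of 16 energies can be reproduced by brute-force enumeration. The sketch below is a minimal illustration; the potential values are the ones that reproduce the energies listed above, and the array layout and names (`unary`, `pairwise`, `energy`) are assumptions of this example.

```python
import itertools

# Potentials of the chain Xp - Xq - Xr - Xs (labels l0 = 0, l1 = 1).
unary = [[5, 2], [2, 4], [3, 6], [7, 3]]   # unary[a][label]
pairwise = [
    [[0, 1], [1, 0]],   # theta_pq[xp][xq]
    [[1, 3], [2, 0]],   # theta_qr[xq][xr]
    [[0, 1], [4, 1]],   # theta_rs[xr][xs]
]

def energy(x):
    """Sum of unary potentials plus pairwise potentials along the chain."""
    e = sum(unary[a][x[a]] for a in range(len(x)))
    e += sum(pairwise[a][x[a]][x[a + 1]] for a in range(len(x) - 1))
    return e

# Enumerate all h^n = 2^4 = 16 labellings.
energies = {x: energy(x) for x in itertools.product(range(2), repeat=4)}

x_star = min(energies, key=energies.get)   # MAP labelling
e_star = energies[x_star]                  # its energy
```

Enumeration costs O(h^n) energy evaluations, which is why the rest of the lecture looks for something better on chains and trees.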
Outline
Models
Exponential Family
Problem Formulation
Energy Minimization
Computing Min-Marginals
Reparameterization
Dynamic Programming
Min-Marginals
Min-marginal $e_p(i) = \min_x E(x; \theta)$ such that $x_p = i$
The corresponding labelling is $x^* = \arg\min_x E(x; \theta)$ such that $x_p = i$
Min-Marginals
$e_p(0) = 15$: the minimum over the 8 labellings with xp = 0 (energies 18, 15, 27, 20, 22, 19, 27, 20), attained at x = {0, 0, 0, 1}
Min-Marginals
$e_p(1) = 13$: the minimum over the 8 labellings with xp = 1 (energies 16, 13, 25, 18, 18, 15, 23, 16), attained at x = {1, 0, 0, 1}
Min-Marginals and Energy Minimization

Minimum min-marginal of any variable = energy of MAP labelling
$\min_i e_p(i) = \min_i \left( \min_x E(x; \theta) \text{ such that } x_p = i \right) = \min_x E(x; \theta)$
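The identity above can be checked numerically on the four-variable example. The following sketch computes min-marginals by brute force; the potential values are those that reproduce the 16 energies listed earlier, and the helper names (`energy`, `min_marginal`) are assumptions of this example.

```python
import itertools

unary = [[5, 2], [2, 4], [3, 6], [7, 3]]   # Xp, Xq, Xr, Xs
pairwise = [
    [[0, 1], [1, 0]],   # theta_pq[xp][xq]
    [[1, 3], [2, 0]],   # theta_qr[xq][xr]
    [[0, 1], [4, 1]],   # theta_rs[xr][xs]
]

def energy(x):
    e = sum(unary[a][x[a]] for a in range(4))
    e += sum(pairwise[a][x[a]][x[a + 1]] for a in range(3))
    return e

def min_marginal(a, i):
    """Minimum energy over all labellings whose a-th variable takes label i."""
    return min(energy(x)
               for x in itertools.product(range(2), repeat=4)
               if x[a] == i)

# Min-marginals of every variable: rows are Xp, Xq, Xr, Xs.
all_min_marginals = [[min_marginal(a, i) for i in range(2)] for a in range(4)]
e_p = all_min_marginals[0]

# The smallest min-marginal of each variable equals the MAP energy.
map_energy_per_variable = [min(row) for row in all_min_marginals]
```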
Outline
Models
Exponential Family
Problem Formulation
Reparameterization
Dynamic Programming
Reparameterization
Add a constant to all $\theta_p(i)$
Subtract that constant from all $\theta_q(k)$
Example with constant 2, potentials listed as (l0, l1): $\theta_p = (5, 2) \to (5 + 2, 2 + 2)$ and $\theta_q = (2, 4) \to (2 - 2, 4 - 2)$

xp  xq   $E(x; \theta')$
0   0    7 + 2 - 2
0   1    10 + 2 - 2
1   0    5 + 2 - 2
1   1    6 + 2 - 2

$E(x; \theta') = E(x; \theta)$
Reparameterization
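Add a constant to one $\theta_q(k)$
Subtract that constant from $\theta_{pq}(i, k)$ for all i
Example with constant 3 and k = 1: $\theta_q(1) = 4 \to 4 + 3$, $\theta_{pq}(0, 1) = 1 \to 1 - 3$, $\theta_{pq}(1, 1) = 0 \to 0 - 3$

xp  xq   $E(x; \theta')$
0   0    7
0   1    10 - 3 + 3
1   0    5
1   1    6 - 3 + 3

$E(x; \theta') = E(x; \theta)$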
Reparameterization
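General form, with one constant $M_{qp}(i)$ per label i of Xp and one constant $M_{pq}(k)$ per label k of Xq:
$\theta'_p(i) = \theta_p(i) + M_{qp}(i)$
$\theta'_q(k) = \theta_q(k) + M_{pq}(k)$
$\theta'_{pq}(i, k) = \theta_{pq}(i, k) - M_{pq}(k) - M_{qp}(i)$
$E(x; \theta') = E(x; \theta)$
[Figure: worked examples of these updates on two-variable models]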
Reparameterization
$\theta'$ is a reparameterization of $\theta$ iff $E(x; \theta') = E(x; \theta)$ for all x
Kolmogorov, PAMI, 2006
Equivalently
$\theta'_p(i) = \theta_p(i) + M_{qp}(i)$
$\theta'_q(k) = \theta_q(k) + M_{pq}(k)$
$\theta'_{pq}(i, k) = \theta_{pq}(i, k) - M_{pq}(k) - M_{qp}(i)$
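A quick way to convince oneself that these updates leave the energy unchanged is to apply them with arbitrary constants and compare all labelings. The sketch below does this for a two-variable model; the constants in `M_qp` and `M_pq` are arbitrary values chosen for illustration, and all names are assumptions of this example.

```python
import itertools

theta_p = [5, 2]
theta_q = [2, 4]
theta_pq = [[0, 1], [1, 0]]   # theta_pq[i][k]: i labels Xp, k labels Xq

# Arbitrary reparameterization constants (illustrative only).
M_qp = [2, -1]   # one constant per label i of Xp
M_pq = [3, 2]    # one constant per label k of Xq

# Apply the three update rules.
new_p = [theta_p[i] + M_qp[i] for i in range(2)]
new_q = [theta_q[k] + M_pq[k] for k in range(2)]
new_pq = [[theta_pq[i][k] - M_pq[k] - M_qp[i] for k in range(2)]
          for i in range(2)]

def energy(tp, tq, tpq, i, k):
    return tp[i] + tq[k] + tpq[i][k]

labelings = list(itertools.product(range(2), repeat=2))
old = {(i, k): energy(theta_p, theta_q, theta_pq, i, k) for i, k in labelings}
new = {(i, k): energy(new_p, new_q, new_pq, i, k) for i, k in labelings}
```

Whatever is added to a unary potential is subtracted from every pairwise entry that shares the same label, so each labeling picks up each constant exactly once with each sign.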

Outline
Models
Exponential Family
Problem Formulation
Reparameterization
Dynamic Programming
Dynamic Programming
Some problems are easy
Dynamic programming is exact for chains
Exact for trees
Clever Reparameterization
Outline
Models
Exponential Family
Problem Formulation
Reparameterization
Dynamic Programming
Two Variables
Three Variables
Chains and Trees
Two Variables
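[Figure: two-variable model with $\theta_p = (5, 2)$, $\theta_q = (2, 4)$, $\theta_{pq}(0,0) = 0$, $\theta_{pq}(0,1) = 1$, $\theta_{pq}(1,0) = 1$, $\theta_{pq}(1,1) = 0$]
Choose the right constant: $\theta'_q(k) = e_q(k)$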
Two Variables
$M_{pq}(0) = \min \{ \theta_p(0) + \theta_{pq}(0,0),\ \theta_p(1) + \theta_{pq}(1,0) \} = \min \{ 5 + 0,\ 2 + 1 \} = 3$
Choose the right constant: $\theta'_q(k) = e_q(k)$
Two Variables
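After reparameterizing with $M_{pq}(0) = 3$: $\theta'_q(0) = 2 + 3 = 5$, $\theta'_{pq}(0,0) = 0 - 3 = -3$, $\theta'_{pq}(1,0) = 1 - 3 = -2$
$\theta'_q(0) = e_q(0) = 5$, attained with xp = 1
Potentials along the red path add up to 0: $\theta_p(1) + \theta'_{pq}(1,0) = 2 - 2 = 0$
Choose the right constant: $\theta'_q(k) = e_q(k)$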

Two Variables
$M_{pq}(1) = \min \{ \theta_p(0) + \theta_{pq}(0,1),\ \theta_p(1) + \theta_{pq}(1,1) \} = \min \{ 5 + 1,\ 2 + 0 \} = 2$

Two Variables
After reparameterizing with $M_{pq}(1) = 2$: $\theta'_q(1) = 4 + 2 = 6$, $\theta'_{pq}(0,1) = 1 - 2 = -1$, $\theta'_{pq}(1,1) = 0 - 2 = -2$
$\theta'_q(0) = e_q(0) = 5$, attained with xp = 1
$\theta'_q(1) = e_q(1) = 6$, attained with xp = 1
Minimum of min-marginals = MAP estimate: xq = 0, xp = 1, with energy 5
We get all the min-marginals of Xq

Computational Complexity
Number of reparameterization constants = h
Complexity for each constant = O(h)
Total complexity = O(h^2)
Same complexity as brute-force!
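The two-variable computation can be written out directly. The sketch below computes the constants $M_{pq}(k)$, applies the reparameterization, and reads off the min-marginals of Xq and the MAP labeling; the variable names (`M_pq`, `best_xp`, `e_q`, `new_pq`) are assumptions of this example.

```python
theta_p = [5, 2]
theta_q = [2, 4]
theta_pq = [[0, 1], [1, 0]]   # theta_pq[i][k]: i labels Xp, k labels Xq
h = 2                         # number of labels

M_pq = []      # M_pq[k] = min_i (theta_p(i) + theta_pq(i, k))
best_xp = []   # label of Xp attaining the minimum, kept for backtracking
for k in range(h):
    costs = [theta_p[i] + theta_pq[i][k] for i in range(h)]
    M_pq.append(min(costs))
    best_xp.append(costs.index(min(costs)))

# Reparameterize: move M_pq(k) from the edge onto theta_q(k).
e_q = [theta_q[k] + M_pq[k] for k in range(h)]          # min-marginals of Xq
new_pq = [[theta_pq[i][k] - M_pq[k] for k in range(h)]
          for i in range(h)]

# MAP estimate: smallest min-marginal, then backtrack to Xp.
xq = e_q.index(min(e_q))
xp = best_xp[xq]
```

Each of the h constants takes a minimum over h values, which gives the O(h^2) count stated above.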
Outline
Models
Exponential Family
Problem Formulation
Reparameterization
Dynamic Programming
Two Variables
Three Variables
Chains and Trees
Three Variables
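[Figure: chain Xp, Xq, Xr with $\theta_p = (5, 2)$, $\theta_q = (2, 4)$, $\theta_r = (3, 6)$; $\theta_{pq}$ as in the two-variable case; $\theta_{qr}(0,0) = 1$, $\theta_{qr}(0,1) = 3$, $\theta_{qr}(1,0) = 2$, $\theta_{qr}(1,1) = 0$]
Reparameterize the edge (p,q) as before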
Three Variables
After reparameterizing (p,q): $\theta'_q = (5, 6)$, $\theta'_{pq}(0,0) = -3$, $\theta'_{pq}(1,0) = -2$, $\theta'_{pq}(0,1) = -1$, $\theta'_{pq}(1,1) = -2$; both labels of Xq are reached from xp = 1
Reparameterize the edge (q,r) as before
Three Variables
After reparameterizing (q,r): $\theta'_r = (9, 12)$, $\theta'_{qr}(0,0) = -5$, $\theta'_{qr}(1,0) = -4$, $\theta'_{qr}(0,1) = -3$, $\theta'_{qr}(1,1) = -6$
$e_r(0) = 9$, attained with xq = 0; $e_r(1) = 12$, attained with xq = 1
Backtracking from the smaller min-marginal: xr = 0, xq = 0, xp = 1
Number of reparameterization constants = 2h
Total complexity = O(2h^2) = O(h^2)
Better than brute-force O(h^3)
Outline
Models
Exponential Family
Problem Formulation
Reparameterization
Dynamic Programming
Two Variables
Three Variables
Chains and Trees
Chains
[Figure: chain X1, X2, X3, ..., Xn]
Reparameterize the edge (1,2)
Reparameterize the edges (2,3), ..., (n-2,n-1) in turn
Reparameterize the edge (n-1,n)
Min-marginals $e_n(i)$ for all labels

Start from left and move towards right
Pick the minimum of min-marginals
Backtrack to find the best labeling x

Number of reparameterization constants = (n-1)h
Total complexity = O(nh^2)
Better than brute-force O(h^n)
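The left-to-right pass with backtracking can be sketched as a short function. This is a minimal illustration rather than a reference implementation; the function name `chain_map` and the data layout are assumptions of this example, and the test data is the four-variable chain Xp, Xq, Xr, Xs with the potentials that reproduce the energies listed earlier.

```python
def chain_map(unary, pairwise):
    """Min-sum dynamic programming on a chain.

    unary[a][i]       : unary potential of variable a taking label i
    pairwise[a][i][k] : potential of edge (a, a+1) with labels (i, k)
    Returns the MAP labeling and the min-marginals of the last variable.
    """
    e = list(unary[0])   # reparameterized unary of the current variable
    back = []            # back[a-1][k]: best label of a-1 given label k of a
    for a in range(1, len(unary)):
        M, arg = [], []
        for k in range(len(unary[a])):
            costs = [e[i] + pairwise[a - 1][i][k] for i in range(len(e))]
            best = min(range(len(costs)), key=costs.__getitem__)
            M.append(costs[best])
            arg.append(best)
        e = [unary[a][k] + M[k] for k in range(len(unary[a]))]
        back.append(arg)

    # e now holds the min-marginals of the last variable.
    x = [min(range(len(e)), key=e.__getitem__)]
    for arg in reversed(back):       # backtrack from right to left
        x.append(arg[x[-1]])
    x.reverse()
    return x, e

unary = [[5, 2], [2, 4], [3, 6], [7, 3]]
pairwise = [
    [[0, 1], [1, 0]],   # theta_pq
    [[1, 3], [2, 0]],   # theta_qr
    [[0, 1], [4, 1]],   # theta_rs
]
x_star, e_s = chain_map(unary, pairwise)
```

The two nested loops over labels run once per edge, matching the O(nh^2) total given above.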
Trees
[Figure: tree over X1, ..., X7 rooted at X1]
Reparameterize the edges one at a time, from the leaves towards the root
Min-marginals $e_1(i)$ for all labels

Start from leaves and move towards root
Pick the minimum of min-marginals
Backtrack to find the best labeling x

Number of reparameterization constants = (n-1)h
Total complexity = O(nh^2)
Better than brute-force O(h^n)
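The leaves-to-root pass can be sketched in the same way as the chain case. In the example below, the tree shape, the potential values, and the names (`tree_map`, `parent`) are all hypothetical and chosen only for illustration; the sketch assumes nodes are numbered so that every parent has a smaller index than its children, and it compares the result with brute-force enumeration.

```python
import itertools

def tree_map(unary, parent, pairwise):
    """Min-sum dynamic programming on a rooted tree (root = node 0).

    parent[a]         : index of the parent of node a, with parent[a] < a
    pairwise[a][k][i] : potential of edge (a, parent[a]) for x_a = k, x_parent = i
    Returns the MAP labeling and the min-marginals of the root.
    """
    n = len(unary)
    e = [list(u) for u in unary]   # reparameterized unary potentials
    arg = [None] * n               # arg[a][i]: best label of a given parent label i
    for a in range(n - 1, 0, -1):  # children are processed before their parents
        par = parent[a]
        arg[a] = []
        for i in range(len(unary[par])):
            costs = [e[a][k] + pairwise[a][k][i] for k in range(len(e[a]))]
            best = min(range(len(costs)), key=costs.__getitem__)
            arg[a].append(best)
            e[par][i] += costs[best]
    x = [0] * n
    x[0] = min(range(len(e[0])), key=e[0].__getitem__)
    for a in range(1, n):          # backtrack from the root to the leaves
        x[a] = arg[a][x[parent[a]]]
    return x, e[0]

# Hypothetical 7-node binary tree with two labels per variable.
parent = [None, 0, 0, 1, 1, 2, 2]
unary = [[(3 * a + 5 * k) % 7 for k in range(2)] for a in range(7)]
pairwise = [None] + [[[0, 2], [2, 0]] for _ in range(6)]  # cost 2 if labels differ

def energy(x):
    e = sum(unary[a][x[a]] for a in range(7))
    e += sum(pairwise[a][x[a]][x[parent[a]]] for a in range(1, 7))
    return e

x_dp, e_root = tree_map(unary, parent, pairwise)
brute_min = min(energy(x) for x in itertools.product(range(2), repeat=7))
```

Each of the n-1 edges is visited once with an h-by-h inner loop, so the cost matches the chain case.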
