Sargur Srihari
srihari@cedar.buffalo.edu
Example
• Maximum of joint distribution p(x,y)
  – Occurs at x=1, y=0
  – With p(x=1,y=0)=0.4

           x=0    x=1
    y=0    0.3    0.4
    y=1    0.3    0.0

• Marginal p(x)
  – p(x=0) = p(x=0,y=0)+p(x=0,y=1) = 0.6
  – p(x=1) = p(x=1,y=0)+p(x=1,y=1) = 0.4
• Marginal p(y)
  – p(y=0) = 0.7
  – p(y=1) = 0.3
• Marginals are maximized by x=0 and y=0, which
corresponds to the value 0.3 of the joint distribution
• In fact, the set of individually most probable values
can have probability zero in the joint (see the sketch below)
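A minimal numpy sketch of this example (the 2x2 table above), verifying that the joint maximum at (x=1, y=0) differs from the pair of marginal maximizers (x=0, y=0):

import numpy as np

# Joint distribution p(x,y); rows index y, columns index x (table above)
p = np.array([[0.3, 0.4],    # y=0
              [0.3, 0.0]])   # y=1

# Maximum of the joint distribution
y_max, x_max = np.unravel_index(np.argmax(p), p.shape)
print(x_max, y_max, p[y_max, x_max])        # x=1, y=0, 0.4

# Marginals: sum out the other variable
p_x = p.sum(axis=0)                         # p(x) = [0.6, 0.4]
p_y = p.sum(axis=1)                         # p(y) = [0.7, 0.3]
print(np.argmax(p_x), np.argmax(p_y))       # marginal maximizers: x=0, y=0
print(p[np.argmax(p_y), np.argmax(p_x)])    # joint value there: 0.3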
Max-sum principle
• Seek an efficient algorithm for
  – Finding the value of x that maximizes p(x)
  – Finding the value of the joint distribution at that x
• The second task is written

  $\max_{\mathbf{x}} p(\mathbf{x}) = \max_{x_1} \cdots \max_{x_M} p(\mathbf{x})$

  where M is the total number of variables
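As a baseline, a brute-force sketch of this maximization over a small hypothetical joint table (K=2 states, M=3 variables); enumerating all K^M configurations is exactly the exponential cost the efficient algorithm below avoids:

import itertools
import numpy as np

# Hypothetical joint over M=3 binary variables, stored as a dense table
rng = np.random.default_rng(0)
p = rng.random((2, 2, 2))
p /= p.sum()                 # normalize into a distribution

# Brute force: maximize over all K**M configurations
best = max(itertools.product(range(2), repeat=3), key=lambda x: p[x])
print(best, p[best])         # maximizing configuration and max of the joint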
Chain example
• Markov chain: the joint distribution has the form

  $p(\mathbf{x}) = \frac{1}{Z}\,\psi_{1,2}(x_1,x_2)\,\psi_{2,3}(x_2,x_3)\cdots\psi_{N-1,N}(x_{N-1},x_N)$

• Evaluation of the probability maximum has the form

  $\max_{\mathbf{x}} p(\mathbf{x}) = \frac{1}{Z}\max_{x_1}\cdots\max_{x_N}\big[\psi_{1,2}(x_1,x_2)\cdots\psi_{N-1,N}(x_{N-1},x_N)\big]$
• Exchanging the maximizations with the products gives

  $\max_{\mathbf{x}} p(\mathbf{x}) = \frac{1}{Z}\max_{x_1}\left[\psi_{1,2}(x_1,x_2)\left[\cdots\max_{x_N}\psi_{N-1,N}(x_{N-1},x_N)\right]\right]$

  – Results in
    • More efficient computation
    • Interpretation as messages passed from node x_N to node x_1 (see the sketch below)
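A minimal sketch of this message passing on a chain, assuming hypothetical pairwise potential tables psi[n]; each step takes a max over one variable, passing a message mu from x_N back toward x_1, and the result matches brute-force enumeration:

import itertools
import numpy as np

rng = np.random.default_rng(1)
N, K = 5, 3                                        # chain length, states per node
psi = [rng.random((K, K)) for _ in range(N - 1)]   # psi[n] = psi_{n+1,n+2}

# Messages from x_N toward x_1:
# mu_new[x_n] = max_{x_{n+1}} psi_n(x_n, x_{n+1}) * mu[x_{n+1}]
mu = np.ones(K)
for n in reversed(range(N - 1)):
    mu = (psi[n] * mu[None, :]).max(axis=1)
max_val = mu.max()                                 # equals Z * max_x p(x)

# Check against brute force over all K**N configurations
def product_of_potentials(x):
    v = 1.0
    for n in range(N - 1):
        v *= psi[n][x[n], x[n + 1]]
    return v
brute = max(product_of_potentials(x)
            for x in itertools.product(range(K), repeat=N))
assert np.isclose(max_val, brute)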
• Substituting the factored joint into

  $\max_{\mathbf{x}} p(\mathbf{x}) = \max_{x_1} \cdots \max_{x_M} p(\mathbf{x})$
Maximum computation
• At the root node, as in the sum-product algorithm,

  $p(x) = \prod_{s \in \mathrm{ne}(x)} \mu_{f_s \to x}(x)$
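For max-sum, which works with logarithms (following the standard PRML treatment rather than anything stated on this slide), the product of incoming messages becomes a sum of log-domain messages, and the maximum at the root is

  $p^{\max} = \max_{x}\left[\sum_{s \in \mathrm{ne}(x)} \mu_{f_s \to x}(x)\right]$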
Backtracking in Trellis
• For each state of a given variable there is a unique
state of the previous variable that maximizes the
probability
  – ties are broken systematically or randomly
• Equivalent to propagating a message back down
the chain using

  $x_{n-1}^{\max} = \phi(x_n^{\max})$

• Known as backtracking (see the sketch below)
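Extending the chain sketch above with backpointers: a forward pass records, for each state of x_n, the maximizing state of x_{n-1} in a table phi, and backtracking then recovers the full maximizing configuration. The potential tables are again hypothetical:

import numpy as np

rng = np.random.default_rng(1)
N, K = 5, 3
psi = [rng.random((K, K)) for _ in range(N - 1)]   # psi[n](x_{n+1}, x_{n+2})

# Forward pass x_1 -> x_N, storing backpointers phi
mu = np.ones(K)
phi = []
for n in range(N - 1):
    scores = mu[:, None] * psi[n]       # scores[x_n, x_{n+1}]
    phi.append(scores.argmax(axis=0))   # best previous state for each x_{n+1}
    mu = scores.max(axis=0)

# Backtrack from the maximizing state of x_N: x_{n-1}^max = phi(x_n^max)
x = [int(mu.argmax())]
for n in reversed(range(N - 1)):
    x.append(int(phi[n][x[-1]]))
x.reverse()                             # maximizing configuration (x_1,...,x_N)
print(x, mu.max())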
Viterbi Algorithm
• Max-sum algorithm gives exact maximizing
configuration for variables provided factor graph
is a tree
• An important application is finding the most probable
sequence of hidden states in an HMM
  – known as the Viterbi algorithm (see the sketch below)
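A minimal Viterbi sketch in the log domain (max-sum rather than max-product, which avoids numerical underflow on long sequences); the initial, transition, and emission tables below are hypothetical:

import numpy as np

def viterbi(log_pi, log_A, log_B, obs):
    # log_pi[k]: initial, log_A[j,k]: transition j->k, log_B[k,o]: emission
    K, T = len(log_pi), len(obs)
    omega = log_pi + log_B[:, obs[0]]   # best log prob of a path ending in each state
    phi = np.zeros((T, K), dtype=int)   # backpointers
    for t in range(1, T):
        scores = omega[:, None] + log_A             # scores[j, k]
        phi[t] = scores.argmax(axis=0)
        omega = scores.max(axis=0) + log_B[:, obs[t]]
    states = [int(omega.argmax())]                  # backtrack from best final state
    for t in range(T - 1, 0, -1):
        states.append(int(phi[t][states[-1]]))
    return states[::-1], float(omega.max())

# Hypothetical 2-state, 2-symbol HMM
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
path, logp = viterbi(np.log(pi), np.log(A), np.log(B), [0, 0, 1, 1, 1])
print(path, logp)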