
CSE616 CH 3 Part 3

Chapter 3 discusses the Hidden Markov Model (HMM) and its three main problems: evaluation, decoding, and learning. The evaluation problem calculates the probability of observing a sequence of visible states, the decoding problem finds the most probable sequence of hidden states, and the learning problem adjusts model parameters to maximize the observation probability. The chapter also introduces algorithms such as the forward algorithm for evaluation, the Viterbi algorithm for decoding, and the Baum-Welch algorithm for learning.

Uploaded by Had Libremente
© Attribution Non-Commercial (BY-NC)

Chapter 3 (part 3): Maximum-Likelihood and Bayesian Parameter Estimation Hidden Markov Model: Extension of Markov Chains

All materials used in this course were taken from the textbook Pattern Classification by Duda et al., John Wiley & Sons, 2001 with the permission of the authors and the publisher

Hidden Markov Model (HMM)

Interaction of the visible states with the hidden states: b_jk = P(v_k(t) | ω_j(t)), with Σ_k b_jk = 1 for all j.

Three problems are associated with this model:

The evaluation problem
The decoding problem
The learning problem


CSE 616 Applied Pattern Recognition, Chapter 3, Section 3.10

Dr. Djamel Bouchaffra

The evaluation problem

Determine the probability that the model produces a sequence V^T of visible states:

P(V^T | θ) = Σ_{r=1}^{r_max} P(V^T | ω_r^T) P(ω_r^T),   where r_max = c^T (c = number of hidden states)

and each r indexes a particular sequence of T hidden states ω_r^T = {ω(1), ω(2), ..., ω(T)}.

P(V^T | ω_r^T) = Π_{t=1}^{T} P(v(t) | ω(t))   (1)   (conditional independence)

P(ω_r^T) = Π_{t=1}^{T} P(ω(t) | ω(t−1))   (2)   (first-order Markov chain)

Using equations (1) and (2), we can write:

P(V^T | θ) = Σ_{r=1}^{r_max} Π_{t=1}^{T} P(v(t) | ω(t)) P(ω(t) | ω(t−1))
Interpretation: The probability that we observe the particular sequence of T visible states V^T equals the sum, over all r_max possible sequences of hidden states, of the probability that the system made each particular transition multiplied by the probability that it then emitted the visible symbol in our target sequence.

Example: Let ω1, ω2, ω3 be the hidden states, v1, v2, v3 the visible states, and V^3 = {v1, v2, v3} the observed sequence of visible states. Then

P({v1, v2, v3} | θ) = P(ω1)·P(v1 | ω1)·P(ω2 | ω1)·P(v2 | ω2)·P(ω3 | ω2)·P(v3 | ω3) + ...

(the sum runs over all 3^3 = 27 possible hidden-state sequences!)
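The 27-term sum above can be checked with a brute-force sketch that enumerates every hidden-state path. This is a minimal illustration, assuming a hypothetical 3-state model; the numbers in pi, A, and B are made up for the example and do not come from the text.

```python
from itertools import product

# Hypothetical 3-state, 3-symbol HMM (illustrative numbers only).
pi = [0.5, 0.3, 0.2]                  # P(omega_i at t = 1)
A = [[0.6, 0.3, 0.1],                 # a_ij = P(omega_j(t+1) | omega_i(t))
     [0.2, 0.5, 0.3],
     [0.1, 0.4, 0.5]]
B = [[0.7, 0.2, 0.1],                 # b_jk = P(v_k(t) | omega_j(t))
     [0.1, 0.6, 0.3],
     [0.3, 0.3, 0.4]]

def evaluate_brute_force(obs):
    """Sum P(V^T, path) over all c^T hidden-state paths."""
    c, total = len(pi), 0.0
    for path in product(range(c), repeat=len(obs)):
        p = pi[path[0]] * B[path[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= A[path[t - 1]][path[t]] * B[path[t]][obs[t]]
        total += p
    return total

print(evaluate_brute_force([0, 1, 2]))  # sums over 3**3 = 27 paths
```

Because pi, each row of A, and each row of B sum to 1, the probabilities of all possible observation sequences of a given length must themselves sum to 1, which is a quick sanity check on the enumeration.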

First possibility:
ω1 (t = 1) → v1,  ω2 (t = 2) → v2,  ω3 (t = 3) → v3

Second possibility:
ω2 (t = 1) → v1,  ω3 (t = 2) → v2,  ω1 (t = 3) → v3

P({v1, v2, v3} | θ) = P(ω2)·P(v1 | ω2)·P(ω3 | ω2)·P(v2 | ω3)·P(ω1 | ω3)·P(v3 | ω1) + ...

Therefore:

P({v1, v2, v3} | θ) = Σ_{possible sequences of hidden states} Π_{t=1}^{3} P(v(t) | ω(t))·P(ω(t) | ω(t−1))

The evaluation problem is solved using the forward algorithm
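A minimal sketch of the forward algorithm, which computes the same probability with O(c^2·T) work instead of enumerating all c^T paths. The 3-state model parameters pi, A, and B below are hypothetical illustrative numbers, not taken from the text.

```python
# Hypothetical 3-state, 3-symbol HMM (illustrative numbers only).
pi = [0.5, 0.3, 0.2]
A = [[0.6, 0.3, 0.1],
     [0.2, 0.5, 0.3],
     [0.1, 0.4, 0.5]]
B = [[0.7, 0.2, 0.1],
     [0.1, 0.6, 0.3],
     [0.3, 0.3, 0.4]]

def forward(obs, pi, A, B):
    """Return P(V^T | theta) via the forward recursion.

    alpha[j] holds the probability of having generated obs[0..t]
    and being in hidden state j at step t.
    """
    c = len(pi)
    alpha = [pi[j] * B[j][obs[0]] for j in range(c)]       # t = 1
    for t in range(1, len(obs)):                           # induction step
        alpha = [B[j][obs[t]] * sum(alpha[i] * A[i][j] for i in range(c))
                 for j in range(c)]
    return sum(alpha)                                      # terminate

print(forward([0, 1, 2], pi, A, B))
```

Summing the result over every possible length-3 observation sequence should give exactly 1, since the rows of pi, A, and B are proper distributions.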



The decoding problem (optimal state sequence)

Given a sequence of visible states V^T, the decoding problem is to find the most probable sequence of hidden states. This problem can be expressed mathematically as: find the single best hidden-state sequence ω(1), ω(2), ..., ω(T) such that

{ω(1), ω(2), ..., ω(T)} = arg max_{ω(1), ..., ω(T)} P[ω(1), ω(2), ..., ω(T), v(1), v(2), ..., v(T) | θ]

Note that the summation disappears, since we want to find only the single best case!

where θ = (π, A, B):

π_i = P(ω(1) = ω_i)   (initial state probabilities)
A: a_ij = P(ω(t+1) = ω_j | ω(t) = ω_i)   (transition probabilities)
B: b_jk = P(v(t) = v_k | ω(t) = ω_j)   (emission probabilities)

In the preceding example, this computation corresponds to the selection of the best path amongst:
{ω1(t=1), ω2(t=2), ω3(t=3)}, {ω2(t=1), ω3(t=2), ω1(t=3)},
{ω3(t=1), ω1(t=2), ω2(t=3)}, {ω3(t=1), ω2(t=2), ω1(t=3)},
{ω2(t=1), ω1(t=2), ω3(t=3)}


The decoding problem is solved using the Viterbi Algorithm
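The arg max above can be computed without enumerating paths by replacing the forward algorithm's sum with a max and keeping back-pointers. This is a minimal Viterbi sketch under the same hypothetical 3-state model (pi, A, B are illustrative numbers, not from the text).

```python
# Hypothetical 3-state, 3-symbol HMM (illustrative numbers only).
pi = [0.5, 0.3, 0.2]
A = [[0.6, 0.3, 0.1],
     [0.2, 0.5, 0.3],
     [0.1, 0.4, 0.5]]
B = [[0.7, 0.2, 0.1],
     [0.1, 0.6, 0.3],
     [0.3, 0.3, 0.4]]

def viterbi(obs, pi, A, B):
    """Return (best hidden path, its joint probability with obs)."""
    c = len(pi)
    delta = [pi[j] * B[j][obs[0]] for j in range(c)]   # best path prob ending in j
    back = []                                          # back-pointers per step
    for t in range(1, len(obs)):
        new, ptr = [], []
        for j in range(c):
            best_i = max(range(c), key=lambda i: delta[i] * A[i][j])
            ptr.append(best_i)
            new.append(delta[best_i] * A[best_i][j] * B[j][obs[t]])
        delta = new
        back.append(ptr)
    last = max(range(c), key=lambda j: delta[j])
    path = [last]
    for ptr in reversed(back):                         # backtrack
        path.append(ptr[path[-1]])
    return list(reversed(path)), max(delta)

print(viterbi([0, 1, 2], pi, A, B))
```

The returned probability is a max over the 27 joint path probabilities rather than their sum, which is exactly the "summation disappeared" remark above.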



The learning problem (parameter estimation)

This third problem consists of determining a method to adjust the model parameters θ = (π, A, B) to satisfy a certain optimization criterion. We need to find the best model θ̂ = (π̂, Â, B̂) that maximizes the probability of the observation sequence:

θ̂ = arg max_θ P(V^T | θ)

We use an iterative procedure such as Baum-Welch (forward-backward) or gradient methods to find this local optimum


Parameter Updates: Forward-Backward Algorithm

γ_ij(t) = α_i(t−1) a_ij b_jk β_j(t) / P(V^T | θ)

â_ij = Σ_{t=1}^{T} γ_ij(t) / Σ_{t=1}^{T} Σ_k γ_ik(t)

b̂_jk = Σ_{t=1, v(t)=v_k}^{T} Σ_l γ_jl(t) / Σ_{t=1}^{T} Σ_l γ_jl(t)

where:
α_i(t) = P(model generates the visible sequence up to step t, given hidden state ω_i(t))
β_i(t) = P(model will generate the sequence from t+1 to T, given ω_i(t))


Parameter Learning Algorithm

Begin  initialize a_ij, b_jk, training sequence V^T, convergence criterion cc, z ← 0
  Do  z ← z + 1
      compute â(z) from a(z−1) and b(z−1)
      compute b̂(z) from a(z−1) and b(z−1)
      a_ij(z) ← â_ij(z−1)
      b_jk(z) ← b̂_jk(z−1)
  Until max_{i,j,k} {|a_ij(z) − a_ij(z−1)|, |b_jk(z) − b_jk(z−1)|} < cc
  Return a_ij ← a_ij(z); b_jk ← b_jk(z)
End
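The loop body above, one Baum-Welch re-estimation step that computes α, β, and γ and then the updated a_ij and b_jk, can be sketched as follows. This is a minimal single-sequence illustration; the model parameters pi, A, B are hypothetical made-up numbers, and π is held fixed for brevity.

```python
# Hypothetical 3-state, 3-symbol HMM (illustrative numbers only).
pi = [0.5, 0.3, 0.2]
A = [[0.6, 0.3, 0.1],
     [0.2, 0.5, 0.3],
     [0.1, 0.4, 0.5]]
B = [[0.7, 0.2, 0.1],
     [0.1, 0.6, 0.3],
     [0.3, 0.3, 0.4]]

def forward_backward(obs, pi, A, B):
    """Return (alpha, beta, P(V^T | theta))."""
    c, T = len(pi), len(obs)
    alpha = [[0.0] * c for _ in range(T)]
    beta = [[0.0] * c for _ in range(T)]
    for j in range(c):
        alpha[0][j] = pi[j] * B[j][obs[0]]
    for t in range(1, T):
        for j in range(c):
            alpha[t][j] = B[j][obs[t]] * sum(alpha[t - 1][i] * A[i][j]
                                             for i in range(c))
    for j in range(c):
        beta[T - 1][j] = 1.0
    for t in range(T - 2, -1, -1):
        for i in range(c):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(c))
    return alpha, beta, sum(alpha[T - 1])

def reestimate(obs, pi, A, B):
    """One Baum-Welch step: return updated (A_hat, B_hat)."""
    alpha, beta, pV = forward_backward(obs, pi, A, B)
    c, T = len(pi), len(obs)
    # gamma_ij(t): expected i -> j transition at step t, given V^T.
    g = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / pV
           for j in range(c)] for i in range(c)] for t in range(T - 1)]
    A2 = [[sum(g[t][i][j] for t in range(T - 1)) /
           sum(g[t][i][k] for t in range(T - 1) for k in range(c))
           for j in range(c)] for i in range(c)]
    # state occupancy gamma_j(t) = alpha_j(t) beta_j(t) / P(V^T | theta)
    occ = [[alpha[t][j] * beta[t][j] / pV for j in range(c)] for t in range(T)]
    B2 = [[sum(occ[t][j] for t in range(T) if obs[t] == k) /
           sum(occ[t][j] for t in range(T))
           for k in range(len(B[0]))] for j in range(c)]
    return A2, B2
```

Each re-estimated row of A and B remains a proper distribution, and the EM property guarantees that the likelihood P(V^T | θ) does not decrease from one iteration to the next, which is why the loop converges to a local optimum.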