Clustering applied to hidden Markov models
Ibrahim Kaddouri
Supervised by: Elisabeth Gassiat and Zacharie Naulet
September 8, 2022
Outline
2 Inference in HMMs
Clustering
Clustering is an ill-posed problem which aims to uncover interesting structures in the data or to derive a useful grouping of the observations.
In a hidden Markov model, each observation is drawn from the emission distribution of its hidden state:

Yk | Xk = j ∼ Fj

Observations: Y0, Y1, Y2, …, YT−1
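As a concrete illustration, the model Yk | Xk = j ∼ Fj can be simulated in a few lines. The transition matrix Q, the Bernoulli emission parameters and the seed below are hypothetical choices for a two-state chain, not values from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters: Q is the transition matrix,
# p[j] is the success probability of the emission law F_j.
Q = np.array([[0.9, 0.1],
              [0.2, 0.8]])
p = np.array([0.3, 0.7])
T = 10

# Sample the hidden chain X_0, ..., X_{T-1}, then Y_k | X_k = j ~ F_j.
X = np.empty(T, dtype=int)
X[0] = rng.choice(2)                     # arbitrary initial state
for k in range(1, T):
    X[k] = rng.choice(2, p=Q[X[k - 1]])  # X_k | X_{k-1} ~ Q[X_{k-1}, .]
Y = rng.binomial(1, p[X])                # Y_k | X_k ~ Bernoulli(p[X_k])
print(X, Y)
```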
Purpose: estimate the model parameters and the hidden states associated with the observations.
Backward smoothing
EM algorithm
Algorithm 3: EM algorithm
Randomly initialize matrices Q and F; set nb_iter, the number of iterations
for t ∈ {0, …, nb_iter − 1} do
    Compute the marginal distributions (ϕk|n)0≤k≤n and the joint distributions (ϕk:k+1|n)0≤k≤n−1 under the current parameters θ′ using Algorithm 1
    for i, j ∈ {0, …, r − 1} do
        q(i, j) ← ( Σ_{k=1}^{n} ϕk−1:k|n(i, j; θ′) ) / ( Σ_{k=1}^{n} Σ_{l=0}^{r−1} ϕk−1:k|n(i, l; θ′) )
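The update of q(i, j) above can be sketched in Python. The array layout of the joint smoothing distributions, and the toy values used to exercise it, are assumptions for illustration:

```python
import numpy as np

def update_transition(phi_joint):
    """M-step update: q(i,j) = sum_k phi(i,j) / sum_k sum_l phi(i,l).

    phi_joint has shape (n, r, r); phi_joint[k, i, j] approximates
    P(X_k = i, X_{k+1} = j | Y_{0:n}) under the current parameters.
    """
    num = phi_joint.sum(axis=0)                  # numerator: sum over k of phi(i, j)
    return num / num.sum(axis=1, keepdims=True)  # denominator: normalize each row i

# Toy joint distributions for r = 2 states (hypothetical values).
phi = np.array([[[0.30, 0.10], [0.20, 0.40]],
                [[0.25, 0.15], [0.10, 0.50]]])
Q_new = update_transition(phi)
print(Q_new)
```

Each row of the returned matrix sums to one, as required of a transition matrix.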
Example

Figure: The two plots show the estimated parameter values, together with the true values, at each EM iteration (0 to 200). On the left: parameters Q0,1 and Q1,1. On the right: parameters F0,0 and F1,0.
Example

[Plot: the log-likelihood at each EM iteration, increasing from roughly −4760 to −4620.]
where:

    Q = ( 1−p   p  )   and F = (f0, f1).
        (  q   1−q )

When p + q = 1, the two rows of Q coincide, so the hidden chain becomes an i.i.d. sequence:

[State diagram: from either state, move to state 1 with probability p and to state 0 with probability 1 − p.]

→ It becomes impossible to learn the model parameters.
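The degeneracy at p + q = 1 is easy to check numerically; the value of p below is an arbitrary illustration:

```python
import numpy as np

p = 0.3
q = 1 - p                    # the degenerate case p + q = 1
Q = np.array([[1 - p, p],
              [q, 1 - q]])

# Both rows of Q coincide: P(X_k = 1 | X_{k-1}) = p regardless of the
# previous state, so the hidden chain is i.i.d. and its dynamics
# (hence the pair (p, q)) cannot be recovered from the observations.
print(Q[0], Q[1])
```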
Ibrahim KADDOURI Clustering applied to hidden Markov models September 8, 2022 16 / 35
Reconstructing the hidden states
Let θ = (ν, Q, F).
Consider the loss function:

L1(x1:n, x′1:n) = 1{x1:n ≠ x′1:n}
Reconstruction algorithm
Example
Below are the observations and the associated states:
Figure: The observations and the associated hidden states. For the observations, the upper vertical lines correspond to observations of 1 and the lower ones to observations of 0.
Notations
∥ϕ⋆i|n(·, y1:n) − ϕ̂i|n(·, y1:n)∥TV ≤ (4(1 − δ)/δ²) [ ρ^{i−1} ∥ν − ν̂∥2
    + (1/(1 − ρ) + 1/(1 − ρ̂)) ( ∥Q − Q̂∥F + Σ_{l=1}^{n} ((ρ̂ ∨ ρ)^{|l−i|} / (f0(yl) ∨ f1(yl))) max_{x=0,1} |fx(yl) − f̂x(yl)| ) ]

where:
δ = min_{i,j} Qi,j > 0 and δ̂ = min_{i,j} Q̂i,j
ρ = (1 − 2δ)/(1 − δ) and ρ̂ = (1 − 2δ̂)/(1 − δ̂)
fx(y) = px^y (1 − px)^{1−y}
Efficiency of the plug-in empirical Bayes classifier
Asymptotic Bayes risk
Exponential forgetting

Proposition
Assume:
The initial distribution is the stationary distribution
The HMM has two hidden states and discrete emissions
δ = min_{i,j} Qi,j > 0

Then, for 0 ≤ j′ ≤ j, k ≥ 0 and n ≥ 0:

∥Pθ(Xk ∈ · | Y−j:n) − Pθ(Xk ∈ · | Y−j′:n)∥TV ≤ 2 ρ0^{(k∧n)+j′} ρ1^{k−(k∧n)}

where ρ0 = (1 − 2δ)/(1 − δ) and ρ1 = 1 − 2δ.
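The bound in the proposition is explicit, so its geometric decay in the extra past window j′ can be evaluated directly. The value of δ below is a hypothetical choice:

```python
def forgetting_bound(delta, k, n, j_prime):
    """Upper bound 2 * rho0^(min(k,n) + j') * rho1^(k - min(k,n)) from the proposition."""
    rho0 = (1 - 2 * delta) / (1 - delta)
    rho1 = 1 - 2 * delta
    m = min(k, n)
    return 2 * rho0 ** (m + j_prime) * rho1 ** (k - m)

# With delta = 0.2: rho0 = 0.6/0.8 = 0.75 and rho1 = 0.6.
# The bound shrinks geometrically as the extra past window j' grows.
bounds = [forgetting_bound(0.2, k=5, n=10, j_prime=j) for j in range(4)]
print(bounds)
```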
Asymptotic Bayes risk
Theorem
Under the same assumptions:

R⋆,offline_{n,HMM} − R⋆,offline_{∞,HMM} ≤ 2 / (n(1 − ρ0))

R⋆,Online_{n,HMM} − R⋆,Online_{∞,HMM} ≤ (ρ1 / 2n) · 1/(1 − ρ0)

where ρ0 = (1 − 2δ)/(1 − δ), ρ1 = 1 − 2δ and δ = min_{i,j} Qi,j > 0.
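Both risk gaps in the theorem are O(1/n) and can be computed for concrete values; δ and n below are hypothetical:

```python
def risk_gap_bounds(delta, n):
    """Upper bounds on R_n - R_inf from the theorem (offline and online cases)."""
    rho0 = (1 - 2 * delta) / (1 - delta)
    rho1 = 1 - 2 * delta
    offline = 2 / (n * (1 - rho0))
    online = rho1 / (2 * n) * 1 / (1 - rho0)
    return offline, online

# With delta = 0.25: rho0 = 2/3 and rho1 = 0.5; both gaps decay like 1/n.
offline_gap, online_gap = risk_gap_bounds(0.25, n=100)
print(offline_gap, online_gap)
```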
Theorem
The asymptotic Bayes risk for online clustering satisfies:

R⋆,Online_{∞,HMM} ≤ 1/2 − |ϕ1|/2

Assuming in addition that min(f0, f1) ≥ c and that |ϕ2| ≤ (c/12) ∧ (c²/4), one has:

R⋆,Online_{∞,HMM} ≥ 1/2 − |ϕ1|/2 − (1 − ϕ1²) ϕ2² ϕ3 / (2c)