You are on page 1of 2

Guide to slides of EM Convergence Presentation

Chen, Yu
December 2020

1 Important Concepts
1.1 Closed Point-to-set Map
• Point-to-set map (p2s map) M : map from pints of a set X to subsets of X

• M is closed at x∗ : xk ∈ X, lim xk = x∗ and lim yk = y ∗ , yk ∈ M (xk ) imply y ∗ ∈ M (x∗ )


k→∞ k→∞

1.2 Curved Exponential Family


X is a random vector with p.d.f. f (x|φ) from a probability space X.
k
X
f (x|φ) = A(φ) exp ( Ti (x)ηi (φ))h(x)
i=1

Where Ti (x) is a real valued statistics, ηi (φ) is a real valued function on the parameter space Ω ⊆ Rq , and
q < k ∈ N.
If covφ (T~ ) (T~ = [T1 , T2 , ·, Tk ]) is positive definite, then X belongs to the curved exponential family.

1.3 Three Important Sets


ϕ(a) = {φ ∈ ϕ : L(φ) ≡ a}
M (a) = {φ ∈ M : L(φ) ≡ a}
ψ(L) = {φ ∈ Ω : L(φ) ≡ L}
where M : local maxima in the interior of Ω
ϕ: stationary points in the interior of Ω
Ω: parameter space

2 Form of EM and GEM


2.1 EM
0 0 0 0 0
max
0
L(φ ) = Q(φ |φ) − H(φ |φ) = E{log(f (x|φ ))|y, φ} − E{log(k(x|y, φ ))|y, φ} (1)
φ

• E-STEP Determine Q(φ|φp ) for current φp


• M-STEP φp → φp+1 ∈ M (φp ) = {φ; φ =φ∈Ω Q(φ; φp )}

2.2 GEM
An iterative scheme:
φp+1 ∈ M (φp )
0
where φ → M (φ) is a point-to-set map such that:
0 0
Q(φ |φ) ≥ Q(φ|φ) ∀ φ ∈ M (φ)

1
3 Assumptions of GEM
• 1) Ω ⊆ Rr
• 2) Ωφ0 = {φ ∈ Ω : L(φ) ≥ L(φ0 )} is compact for any L(φ0 ) > −∞
• 3) L(φ) is continuous in Ω and differentiable in the interior of Ω

• 4) {L(φp )}p≥0 is bounded above for any φ0 ∈ Ω


• 5) φp is in the interior of Ω, int(Ω)
• 6) φp converges to some φ∗ ∈ int(Ω) such that the Hessian matrices ∇2 Q(φ∗ |φ∗ ) and ∇2 H(φ∗ |φ∗ ) exist
0 0
at the first φ∗ , and ∇2 Q(φ |φ) is continuous in (φ , φ)

4 Important Properties of EM (GEM)


• Any EM sequence {φp } increases the likelihood and L(φp ), if bounded above, converges to some L∗
• Property of point-to-set map: 0 0
Q(φ |φ) ≥ Q(φ|φ) ∀ φ ∈ M (φ) (2)

• According to Theorem 1 of DLR paper:


0 0
H(φ|φ) ≥ H(φ |φ) ∀ φ ∈ Ω (3)

• Lemma 1 of DLR: for any sequence {φp } from a GEM algorithm:

L(φp+1 ) ≥ L(φp ) (4)

5 Guide to the Theorems


Global Convergence Theorem from linear programming is the root of the first three theorems. Theorem
1 to Theorem 3 are dedicated to answering the first question: L(φp ) Converges to Global Maximum, Local
Maximum or Stationary Value? Theorem 4 to Theorem 7 answer the second question: Convergence of an GEM
sequence {φp }.

Specifically, Theorem 1 is a speciality or application of the Global Convergence Theorem in EM algorithm.


Theorem 1 introduces 2 conditions:
1) M is a closed p2s map over ϕC (or M C )
2) L(φp+1 ) > L(φp ) ∀φp ∈
/ ϕ (or M )
which are used as the conditions in Theorem 4 and Theorem 5.

Theorem 2 propose the continuity: Q(ψ|φ) is continuous in ψ and φ, which implies the 2 conditions in The-
orem 1, but this theorem can only apply to the situation of stationary point not the situation of local maximum.
0
Theorem 3 add an extra condition: supφ0 ∈Ω Q(φ |φ) > Q(φ|φ) ∀ φ ∈ ϕ\M to Theorem 2 to make Theorem
2 to be able to apply to the maximum case.

Theorem 4 inherits the two conditions from Theorem 1 and add another condition: ϕ(L∗ ) = {φ∗ } (or
M (L∗ ) = {φ∗ }) where L∗ = lim L(φp ) to answer the second question.
p→∞
Theorem 5 relaxes the condition ϕ(L∗ ) = {φ∗ } (or M (L∗ ) = {φ∗ }) where L∗ = lim L(φp ) of Theorem 4
p→∞
using lim ||φp+1 − φp || = 0, such that answers the question regarding the convergence of {φp }.
p→∞
Theorem 6, unlike Theorem 4 or 5, gives up the two conditions from Theorem 1, and replaces the two by:
0 0
∇Q(φp+1 |φp ) = 0 and ∇Q(φ |φ) is continuous in φ and φ, and a stronger condition: (a) ψ(L∗ ) = {φ∗ }, or (b)

lim ||φp+1 − φp || = 0 and ψ(L ) is discrete.
p→∞
Theorem 7 is a corollary of Theorem 6, and points out under which conditions the sequence {φp } converges
to the only maxima.
The most applicable conclusions are Theorem 2 and 7.

You might also like