
Exponential Family related to LDA

Exponential family

$$p(x \mid \theta) = p(x \mid \eta(\theta)) = h(x) \exp\left( \eta(\theta)^{T} t(x) - a(\eta(\theta)) \right)$$

Bernoulli distribution

Original form

$$p(x \mid \theta) = \theta^{x} (1 - \theta)^{1 - x}$$

Standard exponential family form

$$p(x \mid \eta) = \exp\left( \eta^{T} x - \ln(1 + e^{\eta}) \right)$$

where

$$\begin{aligned}
h(x) &= 1 \\
\eta(\theta) &= \ln \frac{\theta}{1 - \theta} \\
t(x) &= x \\
a(\eta(\theta)) &= \ln(1 + e^{\eta}) = -\ln(1 - \theta)
\end{aligned}$$

Inverse parameter mapping


$$\theta = \frac{1}{1 + e^{-\eta}}$$
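
The Bernoulli mapping above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names (`bernoulli_pmf`, `bernoulli_expfam`, `to_natural`, `to_mean`) are illustrative choices, not part of the original notes.

```python
import math

def bernoulli_pmf(x, theta):
    """Original form: theta^x * (1 - theta)^(1 - x)."""
    return theta ** x * (1.0 - theta) ** (1 - x)

def bernoulli_expfam(x, eta):
    """Exponential family form: exp(eta * x - ln(1 + e^eta)), with h(x) = 1."""
    log_partition = math.log1p(math.exp(eta))  # a(eta) = ln(1 + e^eta)
    return math.exp(eta * x - log_partition)

def to_natural(theta):
    """eta(theta) = ln(theta / (1 - theta))."""
    return math.log(theta / (1.0 - theta))

def to_mean(eta):
    """Inverse parameter mapping: theta = 1 / (1 + e^(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

theta = 0.3  # arbitrary example value
eta = to_natural(theta)
for x in (0, 1):
    # Both columns should agree up to floating-point error.
    print(x, bernoulli_pmf(x, theta), bernoulli_expfam(x, eta))
print("round trip:", to_mean(eta))
```

Both forms should agree for x = 0 and x = 1, and `to_mean(to_natural(theta))` should return the original theta up to rounding.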

Binomial distribution

Original form

$$p(x \mid \theta) = \binom{n}{x} \theta^{x} (1 - \theta)^{n - x}$$

Standard exponential family form

$$p(x \mid \eta) = \binom{n}{x} \exp\left( \eta^{T} x - n \ln(1 + e^{\eta}) \right)$$

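where

$$\begin{aligned}
h(x) &= \binom{n}{x} \\
\eta(\theta) &= \ln \frac{\theta}{1 - \theta} \\
t(x) &= x \\
a(\eta(\theta)) &= n \ln(1 + e^{\eta}) = -n \ln(1 - \theta)
\end{aligned}$$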

Inverse parameter mapping

$$\theta = \frac{1}{1 + e^{-\eta}}$$
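
As a quick sanity check of the binomial form, the sketch below compares the two expressions and confirms that the exponential family form sums to 1 over x = 0, ..., n. It assumes Python 3.8+ (for `math.comb`); the function names and the example values n = 5, theta = 0.4 are illustrative.

```python
import math

def binomial_pmf(x, n, theta):
    """Original form: C(n, x) * theta^x * (1 - theta)^(n - x)."""
    return math.comb(n, x) * theta ** x * (1.0 - theta) ** (n - x)

def binomial_expfam(x, n, eta):
    """Exponential family form: h(x) * exp(eta * x - a(eta))."""
    h = math.comb(n, x)                    # base measure h(x) = C(n, x)
    a = n * math.log1p(math.exp(eta))      # log-partition a(eta) = n ln(1 + e^eta)
    return h * math.exp(eta * x - a)

n, theta = 5, 0.4                          # arbitrary example values
eta = math.log(theta / (1.0 - theta))      # natural parameter
for x in range(n + 1):
    print(x, binomial_pmf(x, n, theta), binomial_expfam(x, n, eta))

# The probabilities should sum to 1 up to floating-point error.
print("sum:", sum(binomial_expfam(x, n, eta) for x in range(n + 1)))
```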

Beta distribution

Original form

$$p(x \mid \alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha - 1} (1 - x)^{\beta - 1}$$

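Standard exponential family form

$$p(x \mid \eta) = \frac{1}{x(1 - x)} \exp\left( \begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix}^{T} \begin{bmatrix} \ln x \\ \ln(1 - x) \end{bmatrix} - \ln\Gamma(\eta_1) - \ln\Gamma(\eta_2) + \ln\Gamma(\eta_1 + \eta_2) \right)$$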

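where

$$\begin{aligned}
h(x) &= \frac{1}{x(1 - x)} \\
\eta(\alpha, \beta) &= \begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix} = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \\
t(x) &= \begin{bmatrix} \ln x \\ \ln(1 - x) \end{bmatrix} \\
a(\eta(\alpha, \beta)) &= \ln\Gamma(\eta_1) + \ln\Gamma(\eta_2) - \ln\Gamma(\eta_1 + \eta_2) \\
&= \ln\Gamma(\alpha) + \ln\Gamma(\beta) - \ln\Gamma(\alpha + \beta)
\end{aligned}$$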

Inverse parameter mapping

$$\begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix}$$
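
The beta case can be verified the same way. This is a minimal sketch, assuming the log-partition function a = ln Γ(η1) + ln Γ(η2) − ln Γ(η1 + η2) given in the definitions above, so that the exponent is η·t(x) − a. The function names are illustrative.

```python
import math

def beta_pdf(x, alpha, beta):
    """Original form: Gamma(a+b) / (Gamma(a) Gamma(b)) * x^(a-1) * (1-x)^(b-1)."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return math.exp(log_norm) * x ** (alpha - 1) * (1.0 - x) ** (beta - 1)

def beta_expfam(x, eta1, eta2):
    """Exponential family form with h(x) = 1 / (x (1 - x)) and t(x) = (ln x, ln(1 - x))."""
    h = 1.0 / (x * (1.0 - x))
    a = math.lgamma(eta1) + math.lgamma(eta2) - math.lgamma(eta1 + eta2)
    dot = eta1 * math.log(x) + eta2 * math.log(1.0 - x)
    return h * math.exp(dot - a)

# For the beta distribution the natural parameters are (alpha, beta) themselves.
alpha, beta = 2.0, 3.0   # arbitrary example values
for x in (0.1, 0.25, 0.9):
    print(x, beta_pdf(x, alpha, beta), beta_expfam(x, alpha, beta))
```

The factor 1/(x(1 − x)) in h(x) is what turns the exponents α and β in the exponential family form back into α − 1 and β − 1 in the original form.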

Categorical Distribution

Original form

$$p(x \mid \theta) = \prod_{i=1}^{k} \theta_i^{[x = i]}$$

where $\theta = (\theta_1, \dots, \theta_k)$, $\theta_i$ is the probability of seeing element $i$, $\sum_{i=1}^{k} \theta_i = 1$, and $[x = i]$ evaluates to 1 if $x = i$ and 0 otherwise.

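Standard exponential family form

variant 1

$$p(x \mid \eta) = \exp\left( \begin{bmatrix} \eta_1 \\ \vdots \\ \eta_k \end{bmatrix}^{T} \begin{bmatrix} [x = 1] \\ \vdots \\ [x = k] \end{bmatrix} \right)$$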

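where

$$\begin{aligned}
h(x) &= 1 \\
\eta(\theta) &= \begin{bmatrix} \eta_1 \\ \vdots \\ \eta_k \end{bmatrix} = \begin{bmatrix} \ln \theta_1 \\ \vdots \\ \ln \theta_k \end{bmatrix} \\
t(x) &= \begin{bmatrix} [x = 1] \\ \vdots \\ [x = k] \end{bmatrix} \\
a(\eta(\theta)) &= 0
\end{aligned}$$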

Inverse parameter mapping

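$$\theta = \begin{bmatrix} \theta_1 \\ \vdots \\ \theta_k \end{bmatrix} = \begin{bmatrix} e^{\eta_1} \\ \vdots \\ e^{\eta_k} \end{bmatrix}$$

where $\sum_{i=1}^{k} e^{\eta_i} = 1$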
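variant 2

$$p(x \mid \eta) = \exp\left( \begin{bmatrix} \eta_1 \\ \vdots \\ \eta_k \end{bmatrix}^{T} \begin{bmatrix} [x = 1] \\ \vdots \\ [x = k] \end{bmatrix} - C \right)$$

where

$$\begin{aligned}
h(x) &= 1 \\
\eta(\theta) &= \begin{bmatrix} \eta_1 \\ \vdots \\ \eta_k \end{bmatrix} = \begin{bmatrix} \ln \theta_1 + C \\ \vdots \\ \ln \theta_k + C \end{bmatrix} \\
t(x) &= \begin{bmatrix} [x = 1] \\ \vdots \\ [x = k] \end{bmatrix} \\
a(\eta(\theta)) &= \ln\left( \sum_{i=1}^{k} e^{\eta_i} \right) = C
\end{aligned}$$

Inverse parameter mapping

$$\theta = \begin{bmatrix} \theta_1 \\ \vdots \\ \theta_k \end{bmatrix} = \begin{bmatrix} e^{-C} e^{\eta_1} \\ \vdots \\ e^{-C} e^{\eta_k} \end{bmatrix} = \begin{bmatrix} \dfrac{e^{\eta_1}}{\sum_{i=1}^{k} e^{\eta_i}} \\ \vdots \\ \dfrac{e^{\eta_k}}{\sum_{i=1}^{k} e^{\eta_i}} \end{bmatrix}$$

where $\sum_{i=1}^{k} e^{\eta_i} = e^{C}$. The constant $C$ is an arbitrary additive shift of the natural parameters; variant 1 is the special case $C = 0$.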
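
The normalized inverse mapping $\theta_i = e^{\eta_i} / \sum_j e^{\eta_j}$ does not change when the same constant is added to every $\eta_i$, which is why the first two variants describe the same distribution. A minimal sketch of this invariance follows; the shift value 1.7 and the function name `theta_from_eta` are arbitrary illustrative choices.

```python
import math

def theta_from_eta(eta):
    """Normalized inverse mapping: theta_i = e^(eta_i) / sum_j e^(eta_j)."""
    m = max(eta)  # subtract the max for numerical stability; the result is unchanged
    weights = [math.exp(e - m) for e in eta]
    total = sum(weights)
    return [w / total for w in weights]

theta = [0.2, 0.5, 0.3]                               # example probabilities
eta_unshifted = [math.log(t) for t in theta]          # eta_i = ln(theta_i)
eta_shifted = [math.log(t) + 1.7 for t in theta]      # eta_i = ln(theta_i) + constant

print(theta_from_eta(eta_unshifted))
print(theta_from_eta(eta_shifted))
```

Both calls should recover the same theta up to floating-point error.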

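variant 3

$$p(x \mid \eta) = \exp\left( \begin{bmatrix} \eta_1 \\ \vdots \\ \eta_{k-1} \\ 0 \end{bmatrix}^{T} \begin{bmatrix} [x = 1] \\ \vdots \\ [x = k] \end{bmatrix} - \ln\left( \sum_{i=1}^{k} e^{\eta_i} \right) \right)$$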
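where

$$\begin{aligned}
h(x) &= 1 \\
\eta(\theta) &= \begin{bmatrix} \eta_1 \\ \vdots \\ \eta_{k-1} \\ 0 \end{bmatrix} = \begin{bmatrix} \ln \dfrac{\theta_1}{\theta_k} \\ \vdots \\ \ln \dfrac{\theta_{k-1}}{\theta_k} \\ 0 \end{bmatrix} = \begin{bmatrix} \ln \dfrac{\theta_1}{1 - \sum_{i=1}^{k-1} \theta_i} \\ \vdots \\ \ln \dfrac{\theta_{k-1}}{1 - \sum_{i=1}^{k-1} \theta_i} \\ 0 \end{bmatrix} \\
t(x) &= \begin{bmatrix} [x = 1] \\ \vdots \\ [x = k] \end{bmatrix} \\
a(\eta(\theta)) &= \ln\left( \sum_{i=1}^{k} e^{\eta_i} \right) = \ln\left( 1 + \sum_{i=1}^{k-1} e^{\eta_i} \right) = -\ln \theta_k = -\ln\left( 1 - \sum_{i=1}^{k-1} \theta_i \right)
\end{aligned}$$

where $\theta_k = 1 - \sum_{i=1}^{k-1} \theta_i$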

Inverse parameter mapping

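$$\theta = \begin{bmatrix} \theta_1 \\ \vdots \\ \theta_k \end{bmatrix} = \begin{bmatrix} \dfrac{e^{\eta_1}}{\sum_{i=1}^{k} e^{\eta_i}} \\ \vdots \\ \dfrac{e^{\eta_k}}{\sum_{i=1}^{k} e^{\eta_i}} \end{bmatrix} = \begin{bmatrix} \dfrac{e^{\eta_1}}{1 + \sum_{i=1}^{k-1} e^{\eta_i}} \\ \vdots \\ \dfrac{e^{\eta_{k-1}}}{1 + \sum_{i=1}^{k-1} e^{\eta_i}} \\ \dfrac{1}{1 + \sum_{i=1}^{k-1} e^{\eta_i}} \end{bmatrix}$$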
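
The third variant pins the last natural parameter to 0, so only k − 1 parameters are free. The sketch below builds η from θ, evaluates the log-partition function, and recovers the probabilities; it assumes outcomes are labelled 1..k as in the notes, and the function names are illustrative.

```python
import math

def eta_from_theta(theta):
    """eta_i = ln(theta_i / theta_k); the last entry is ln(1) = 0."""
    return [math.log(t / theta[-1]) for t in theta]

def log_partition(eta):
    """a(eta) = ln(sum_i e^(eta_i)), which equals -ln(theta_k) in this variant."""
    return math.log(sum(math.exp(e) for e in eta))

def categorical_expfam(x, eta):
    """p(x | eta) = exp(eta_x - a(eta)) for an outcome x in 1..k."""
    return math.exp(eta[x - 1] - log_partition(eta))

theta = [0.2, 0.5, 0.3]              # example probabilities
eta = eta_from_theta(theta)
print("eta:", eta)
print("a(eta):", log_partition(eta), " -ln(theta_k):", -math.log(theta[-1]))
for x in (1, 2, 3):
    print(x, categorical_expfam(x, eta))
```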

Multinomial Distribution

Original form

$$p(x \mid \theta) = \frac{n!}{\prod_{i=1}^{k} x_i!} \prod_{i=1}^{k} \theta_i^{x_i}$$

where $\theta = (\theta_1, \dots, \theta_k)$, $\theta_i$ is the probability of seeing element $i$, $\sum_{i=1}^{k} \theta_i = 1$, $x = (x_1, \dots, x_k)$, and $\sum_{i=1}^{k} x_i = n$.

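Standard exponential family form

variant 1

$$p(x \mid \eta) = \frac{n!}{\prod_{i=1}^{k} x_i!} \exp\left( \begin{bmatrix} \eta_1 \\ \vdots \\ \eta_k \end{bmatrix}^{T} \begin{bmatrix} x_1 \\ \vdots \\ x_k \end{bmatrix} \right)$$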

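where

$$\begin{aligned}
h(x) &= \frac{n!}{\prod_{i=1}^{k} x_i!} \\
\eta(\theta) &= \begin{bmatrix} \eta_1 \\ \vdots \\ \eta_k \end{bmatrix} = \begin{bmatrix} \ln \theta_1 \\ \vdots \\ \ln \theta_k \end{bmatrix} \\
t(x) &= \begin{bmatrix} x_1 \\ \vdots \\ x_k \end{bmatrix} \\
a(\eta(\theta)) &= 0
\end{aligned}$$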

Inverse parameter mapping

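$$\theta = \begin{bmatrix} \theta_1 \\ \vdots \\ \theta_k \end{bmatrix} = \begin{bmatrix} e^{\eta_1} \\ \vdots \\ e^{\eta_k} \end{bmatrix}$$

where $\sum_{i=1}^{k} e^{\eta_i} = 1$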
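variant 2

$$p(x \mid \eta) = \frac{n!}{\prod_{i=1}^{k} x_i!} \exp\left( \begin{bmatrix} \eta_1 \\ \vdots \\ \eta_k \end{bmatrix}^{T} \begin{bmatrix} x_1 \\ \vdots \\ x_k \end{bmatrix} - nC \right)$$

where

$$\begin{aligned}
h(x) &= \frac{n!}{\prod_{i=1}^{k} x_i!} \\
\eta(\theta) &= \begin{bmatrix} \eta_1 \\ \vdots \\ \eta_k \end{bmatrix} = \begin{bmatrix} \ln \theta_1 + C \\ \vdots \\ \ln \theta_k + C \end{bmatrix} \\
t(x) &= \begin{bmatrix} x_1 \\ \vdots \\ x_k \end{bmatrix} \\
a(\eta(\theta)) &= n \ln\left( \sum_{i=1}^{k} e^{\eta_i} \right) = nC
\end{aligned}$$

Inverse parameter mapping

$$\theta = \begin{bmatrix} \theta_1 \\ \vdots \\ \theta_k \end{bmatrix} = \begin{bmatrix} e^{-C} e^{\eta_1} \\ \vdots \\ e^{-C} e^{\eta_k} \end{bmatrix} = \begin{bmatrix} \dfrac{e^{\eta_1}}{\sum_{i=1}^{k} e^{\eta_i}} \\ \vdots \\ \dfrac{e^{\eta_k}}{\sum_{i=1}^{k} e^{\eta_i}} \end{bmatrix}$$

where $\sum_{i=1}^{k} e^{\eta_i} = e^{C}$. The constant $C$ is an arbitrary additive shift of the natural parameters; variant 1 is the special case $C = 0$.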

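variant 3

$$p(x \mid \eta) = \frac{n!}{\prod_{i=1}^{k} x_i!} \exp\left( \begin{bmatrix} \eta_1 \\ \vdots \\ \eta_{k-1} \\ 0 \end{bmatrix}^{T} \begin{bmatrix} x_1 \\ \vdots \\ x_k \end{bmatrix} - n \ln\left( \sum_{i=1}^{k} e^{\eta_i} \right) \right)$$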
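where

$$\begin{aligned}
h(x) &= \frac{n!}{\prod_{i=1}^{k} x_i!} \\
\eta(\theta) &= \begin{bmatrix} \eta_1 \\ \vdots \\ \eta_{k-1} \\ 0 \end{bmatrix} = \begin{bmatrix} \ln \dfrac{\theta_1}{\theta_k} \\ \vdots \\ \ln \dfrac{\theta_{k-1}}{\theta_k} \\ 0 \end{bmatrix} = \begin{bmatrix} \ln \dfrac{\theta_1}{1 - \sum_{i=1}^{k-1} \theta_i} \\ \vdots \\ \ln \dfrac{\theta_{k-1}}{1 - \sum_{i=1}^{k-1} \theta_i} \\ 0 \end{bmatrix} \\
t(x) &= \begin{bmatrix} x_1 \\ \vdots \\ x_k \end{bmatrix} \\
a(\eta(\theta)) &= n \ln\left( \sum_{i=1}^{k} e^{\eta_i} \right) = n \ln\left( 1 + \sum_{i=1}^{k-1} e^{\eta_i} \right) = -n \ln \theta_k = -n \ln\left( 1 - \sum_{i=1}^{k-1} \theta_i \right)
\end{aligned}$$

where $\theta_k = 1 - \sum_{i=1}^{k-1} \theta_i$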

Inverse parameter mapping

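$$\theta = \begin{bmatrix} \dfrac{e^{\eta_1}}{\sum_{i=1}^{k} e^{\eta_i}} \\ \vdots \\ \dfrac{e^{\eta_k}}{\sum_{i=1}^{k} e^{\eta_i}} \end{bmatrix} = \begin{bmatrix} \dfrac{e^{\eta_1}}{1 + \sum_{i=1}^{k-1} e^{\eta_i}} \\ \vdots \\ \dfrac{e^{\eta_{k-1}}}{1 + \sum_{i=1}^{k-1} e^{\eta_i}} \\ \dfrac{1}{1 + \sum_{i=1}^{k-1} e^{\eta_i}} \end{bmatrix}$$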
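
For the multinomial, the same reference-category parameterization can be checked against the original probability mass function. This is a minimal sketch with illustrative function names and example counts; the exponential family version works in log space via `math.lgamma`.

```python
import math

def multinomial_pmf(x, theta):
    """Original form: n! / prod(x_i!) * prod(theta_i^x_i)."""
    coef = math.factorial(sum(x))
    for xi in x:
        coef //= math.factorial(xi)   # each division is exact
    p = float(coef)
    for xi, ti in zip(x, theta):
        p *= ti ** xi
    return p

def multinomial_expfam(x, eta):
    """h(x) * exp(eta . x - n ln(sum_i e^(eta_i))), with the last eta fixed at 0."""
    n = sum(x)
    log_h = math.lgamma(n + 1) - sum(math.lgamma(xi + 1) for xi in x)
    a = n * math.log(sum(math.exp(e) for e in eta))   # log-partition a(eta)
    dot = sum(e * xi for e, xi in zip(eta, x))
    return math.exp(log_h + dot - a)

theta = [0.2, 0.5, 0.3]                               # example probabilities
eta = [math.log(t / theta[-1]) for t in theta]        # eta_i = ln(theta_i / theta_k)
x = [2, 1, 1]                                         # example counts, n = 4
print(multinomial_pmf(x, theta), multinomial_expfam(x, eta))
```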

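Dirichlet distribution

Original form

$$p(x \mid \alpha) = \frac{\Gamma\left( \sum_{i=1}^{k} \alpha_i \right)}{\prod_{i=1}^{k} \Gamma(\alpha_i)} \prod_{i=1}^{k} x_i^{\alpha_i - 1}$$

where $\alpha = (\alpha_1, \dots, \alpha_k)$, $x = (x_1, \dots, x_k)$, and $\sum_{i=1}^{k} x_i = 1$.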

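Standard exponential family form

$$p(x \mid \eta) = \frac{1}{\prod_{i=1}^{k} x_i} \exp\left( \begin{bmatrix} \eta_1 \\ \vdots \\ \eta_k \end{bmatrix}^{T} \begin{bmatrix} \ln x_1 \\ \vdots \\ \ln x_k \end{bmatrix} - \sum_{i=1}^{k} \ln\Gamma(\eta_i) + \ln\Gamma\left( \sum_{i=1}^{k} \eta_i \right) \right)$$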

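where

$$\begin{aligned}
h(x) &= \frac{1}{\prod_{i=1}^{k} x_i} \\
\eta(\alpha) &= \begin{bmatrix} \eta_1 \\ \vdots \\ \eta_k \end{bmatrix} = \begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_k \end{bmatrix} \\
t(x) &= \begin{bmatrix} \ln x_1 \\ \vdots \\ \ln x_k \end{bmatrix} \\
a(\eta(\alpha)) &= \sum_{i=1}^{k} \ln\Gamma(\eta_i) - \ln\Gamma\left( \sum_{i=1}^{k} \eta_i \right) = \sum_{i=1}^{k} \ln\Gamma(\alpha_i) - \ln\Gamma\left( \sum_{i=1}^{k} \alpha_i \right)
\end{aligned}$$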

Inverse parameter mapping

$$\begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_k \end{bmatrix} = \begin{bmatrix} \eta_1 \\ \vdots \\ \eta_k \end{bmatrix}$$
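
The Dirichlet density can be checked in both forms at a point on the simplex. This is a minimal sketch assuming Python 3.8+ (for `math.prod`); the function names and the example point are illustrative.

```python
import math

def dirichlet_pdf(x, alpha):
    """Original form: Gamma(sum alpha) / prod Gamma(alpha_i) * prod x_i^(alpha_i - 1)."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    log_kernel = sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x))
    return math.exp(log_norm + log_kernel)

def dirichlet_expfam(x, eta):
    """Exponential family form with h(x) = 1 / prod(x_i) and t(x) = (ln x_1, ..., ln x_k)."""
    h = 1.0 / math.prod(x)
    a = sum(math.lgamma(e) for e in eta) - math.lgamma(sum(eta))   # log-partition
    dot = sum(e * math.log(xi) for e, xi in zip(eta, x))
    return h * math.exp(dot - a)

# For the Dirichlet the natural parameters are the alpha values themselves.
alpha = [2.0, 3.0, 4.0]      # example parameters
x = [0.2, 0.3, 0.5]          # example point on the simplex (entries sum to 1)
print(dirichlet_pdf(x, alpha), dirichlet_expfam(x, alpha))
```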

By the properties of exponential family distributions, we obtain

$$E_{p(x \mid \eta)}[t(x)] = \frac{\partial a(\eta)}{\partial \eta}$$

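That is, for the Dirichlet distribution,

$$E_{p(x \mid \eta)} \begin{bmatrix} \ln x_1 \\ \vdots \\ \ln x_k \end{bmatrix} = \frac{\partial \left( \sum_{i=1}^{k} \ln\Gamma(\alpha_i) - \ln\Gamma\left( \sum_{i=1}^{k} \alpha_i \right) \right)}{\partial \begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_k \end{bmatrix}} = \begin{bmatrix} \psi(\alpha_1) - \psi\left( \sum_{i=1}^{k} \alpha_i \right) \\ \vdots \\ \psi(\alpha_k) - \psi\left( \sum_{i=1}^{k} \alpha_i \right) \end{bmatrix}$$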

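The detailed derivation is as follows:

$$\begin{aligned}
\frac{\partial \left( \sum_{i'=1}^{k} \ln\Gamma(\alpha_{i'}) - \ln\Gamma\left( \sum_{i'=1}^{k} \alpha_{i'} \right) \right)}{\partial \alpha_i}
&= \frac{\partial \ln\Gamma(\alpha_i)}{\partial \alpha_i} - \frac{\partial \ln\Gamma\left( \sum_{i'=1}^{k} \alpha_{i'} \right)}{\partial \left( \sum_{i'=1}^{k} \alpha_{i'} \right)} \times \frac{\partial \left( \sum_{i'=1}^{k} \alpha_{i'} \right)}{\partial \alpha_i} \\
&= \psi(\alpha_i) - \psi\left( \sum_{i'=1}^{k} \alpha_{i'} \right)
\end{aligned}$$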

Here $\psi(\cdot)$ is the digamma function

$$\psi(\alpha) = \frac{\partial \ln\Gamma(\alpha)}{\partial \alpha} = \frac{\Gamma'(\alpha)}{\Gamma(\alpha)}$$
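
The identity $E[\ln x_i] = \psi(\alpha_i) - \psi(\sum_{i'} \alpha_{i'})$ can be checked numerically by differentiating the Dirichlet log-partition function with finite differences. The sketch below is one way to do this with the standard library only. Because `math` has no digamma, the `digamma` helper here is an assumed approximation (recurrence plus a truncated asymptotic series, accurate to roughly 1e-8 for positive arguments), not a library routine; the other names are illustrative as well.

```python
import math

def digamma(z):
    """Approximate psi(z) for z > 0 (hypothetical helper, not a library function)."""
    result = 0.0
    while z < 6.0:                 # recurrence: psi(z) = psi(z + 1) - 1/z
        result -= 1.0 / z
        z += 1.0
    inv2 = 1.0 / (z * z)
    # Asymptotic series: ln z - 1/(2z) - 1/(12 z^2) + 1/(120 z^4) - 1/(252 z^6)
    result += math.log(z) - 0.5 / z - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result

def log_partition(alpha):
    """a(alpha) = sum_i ln Gamma(alpha_i) - ln Gamma(sum_i alpha_i)."""
    return sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))

def expected_log_x(alpha):
    """Closed form: E[ln x_i] = psi(alpha_i) - psi(sum alpha)."""
    total = digamma(sum(alpha))
    return [digamma(a) - total for a in alpha]

def numeric_gradient(alpha, h=1e-5):
    """Central finite-difference gradient of the log-partition function."""
    grad = []
    for i in range(len(alpha)):
        up, down = list(alpha), list(alpha)
        up[i] += h
        down[i] -= h
        grad.append((log_partition(up) - log_partition(down)) / (2.0 * h))
    return grad

alpha = [0.5, 1.5, 3.0]            # example parameters
print(expected_log_x(alpha))
print(numeric_gradient(alpha))
```

The two printed vectors should agree to several decimal places, which is the statement $\partial a / \partial \alpha_i = \psi(\alpha_i) - \psi(\sum_{i'} \alpha_{i'})$ in numerical form.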
