
Exponential Families

The random vector Y = (Y_1, ..., Y_n)^T has a distribution from an exponential family if the density of Y is of the form

    f_Y(y|θ) = b(y) exp{η(θ)^T T(y) − a(θ)},

where

    η(θ) = (η_1(θ), ..., η_d(θ))^T,
    T(y) = (T_1(y), ..., T_d(y))^T.

◦ η = (η_1, ..., η_d)^T is the natural parameter of the exponential family.

◦ T = T(Y) is a sufficient statistic for θ (or for η).

Definition. A statistic T = T(Y_1, ..., Y_n) is said to be sufficient for θ if the conditional distribution of Y_1, ..., Y_n given T = t does not depend on θ for any value of t.

A sufficient statistic for θ contains all the information in the sample about θ. Thus, given the value of T, we cannot improve our knowledge about θ by a more detailed analysis of the data Y_1, ..., Y_n. In other words, an estimate based on T = t cannot be improved by using the full data Y_1, ..., Y_n.
Sufficiency and exponential families: Rice (1995), 280-284.
Example: Consider a sequence of independent Bernoulli trials,

    Y_i ~ Bin(1, θ)  iid.

The number of successes, S_n = Σ_{i=1}^n Y_i, is sufficient for the parameter θ. Additional information about the observed values Y_1, ..., Y_n, such as the order in which the successes occurred, does not convey any further information about θ.
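Sufficiency here can be checked directly: given S_n = s, every 0/1 sequence with s successes has the same conditional probability 1/C(n, s), whatever θ is. A short Python sketch (the function name `cond_prob` is ours, not from the text):

```python
from math import comb

def cond_prob(y, theta):
    """P(Y = y | S_n = sum(y)) for independent Bin(1, theta) trials."""
    n, s = len(y), sum(y)
    p_y = theta ** s * (1 - theta) ** (n - s)                # P(Y = y)
    p_s = comb(n, s) * theta ** s * (1 - theta) ** (n - s)   # P(S_n = s)
    return p_y / p_s                                         # = 1 / comb(n, s)

# the conditional probability is the same for every value of theta
for theta in (0.2, 0.5, 0.9):
    assert abs(cond_prob((1, 0, 1, 1, 0), theta) - 1 / comb(5, 3)) < 1e-12
```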

Exponential Families, Apr 13, 2004 -1-



Examples

◦ Binomial distribution: Y ~ Bin(n, θ)

    f_Y(y|θ) = (n choose y) θ^y (1 − θ)^{n−y}
             = (n choose y) exp{y log(θ/(1−θ)) + n log(1 − θ)}

  with

    η(θ) = log(θ/(1−θ)),
    T(y) = y.

◦ Normal distribution: Y ~ N(μ, σ²)

    f_Y(y|μ, σ²) = (2πσ²)^{−1/2} exp{−(y − μ)²/(2σ²)}
                 = exp{−y²/(2σ²) + (μ/σ²) y − μ²/(2σ²) − (1/2) log(2πσ²)}

  with

    η(μ, σ²) = (−1/(2σ²), μ/σ²)^T,
    T(y) = (y², y)^T.

  If Y = (Y_1, ..., Y_n)^T with Y_i ~ N(μ, σ²) iid, then

    T(y) = (Σ_{i=1}^n y_i², Σ_{i=1}^n y_i)^T.
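As a sanity check, the two expressions for the normal density above can be compared numerically; the following sketch evaluates both forms (function names are ours):

```python
from math import exp, log, pi

def normal_pdf(y, mu, sigma2):
    """Standard form of the N(mu, sigma2) density."""
    return (2 * pi * sigma2) ** -0.5 * exp(-(y - mu) ** 2 / (2 * sigma2))

def expfam_pdf(y, mu, sigma2):
    """Exponential-family form: exp{eta^T T(y) - a(theta)} with b(y) = 1."""
    eta = (-1 / (2 * sigma2), mu / sigma2)   # natural parameter
    T = (y ** 2, y)                          # sufficient statistic
    a = mu ** 2 / (2 * sigma2) + 0.5 * log(2 * pi * sigma2)
    return exp(eta[0] * T[0] + eta[1] * T[1] - a)

for y in (-1.5, 0.0, 2.3):
    assert abs(normal_pdf(y, 1.0, 4.0) - expfam_pdf(y, 1.0, 4.0)) < 1e-12
```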




◦ Gamma distribution: Y ~ Γ(α, λ)

    f_Y(y|α, λ) = (λ^α / Γ(α)) y^{α−1} exp(−λy)
                = y^{−1} exp{−λy + α log(y) + α log(λ) − log Γ(α)}

  with

    η(α, λ) = (−λ, α)^T,
    T(y) = (y, log(y))^T.

  If Y = (Y_1, ..., Y_n)^T with Y_i ~ Γ(α, λ) iid, then

    T(y) = (Σ_{i=1}^n y_i, Σ_{i=1}^n log(y_i))^T.

  This includes as special cases

  ◦ the exponential distribution (= Γ(1, λ)),

  ◦ the χ² distribution with n degrees of freedom (= Γ(n/2, 1/2)).

◦ Beta distribution: Y ~ B(α, β)

    f_Y(y|α, β) = (Γ(α + β) / (Γ(α) Γ(β))) y^{α−1} (1 − y)^{β−1}
                = [y(1 − y)]^{−1} exp{α log(y) + β log(1 − y)
                                      + log Γ(α + β) − log Γ(α) − log Γ(β)}

  with

    η(α, β) = (α, β)^T,
    T(y) = (log(y), log(1 − y))^T.



Maximum Likelihood for Exponential Families

Suppose that Y = (Y_1, ..., Y_n)^T has density

    f_Y(y|θ) = b(y) exp{η(θ)^T T(y) − a(θ)}.

Then the log-likelihood function is given by

    l_n(θ|Y) = η(θ)^T T(Y) − a(θ) + log b(Y).




Differentiating with respect to θ, we obtain the likelihood equations

    (∂η(θ)/∂θ)^T T(Y) = ∂a(θ)/∂θ.

The likelihood equations can be rewritten in the following form:

    (∂η(θ)/∂θ)^T E[T(Y)|θ] = (∂η(θ)/∂θ)^T T(Y).

If the matrix ∂η(θ)/∂θ is invertible, then this simplifies to

    E[T(Y)|θ] = T(Y).

Proof. To see this, note that

    (∂η(θ)/∂θ)^T E[T(Y)|θ] − ∂a(θ)/∂θ
      = ∫ ((∂η(θ)/∂θ)^T T(y) − ∂a(θ)/∂θ) exp{η(θ)^T T(y) − a(θ)} b(y) dy
      = ∂/∂θ ∫ exp{η(θ)^T T(y) − a(θ)} b(y) dy
      = 0,

since the integral in the last line equals 1 for all θ, so its derivative vanishes.




Examples

◦ Y_1, ..., Y_n ~ N(μ, σ²) iid

  Note that

    T(Y) = (Σ_{i=1}^n Y_i, Σ_{i=1}^n Y_i²)^T.

  Thus the ML estimator is given by the solution of

    Σ_{i=1}^n Y_i  = n μ,
    Σ_{i=1}^n Y_i² = n σ² + n μ².
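Solving these two equations gives the familiar estimates μ̂ = Ȳ and σ̂² = (1/n) Σ(Y_i − Ȳ)². A quick numerical illustration with made-up data:

```python
ys = [2.1, -0.4, 1.3, 0.8, 3.0, 1.6]           # toy data
n = len(ys)
t1, t2 = sum(ys), sum(y * y for y in ys)       # sufficient statistics
mu_hat = t1 / n                                # from sum(Y) = n*mu
sigma2_hat = t2 / n - mu_hat ** 2              # from sum(Y^2) = n*sigma2 + n*mu^2

# agrees with the usual (biased) variance estimator
assert abs(sigma2_hat - sum((y - mu_hat) ** 2 for y in ys) / n) < 1e-12
```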

◦ Y_1, ..., Y_n ~ Γ(α, λ) iid

  Note that

    T(Y) = (Σ_{i=1}^n Y_i, Σ_{i=1}^n log(Y_i))^T.

  Thus the ML estimator is given by the solution of

    Σ_{i=1}^n Y_i = n α/λ,
    Σ_{i=1}^n log(Y_i) = n E[log(Y_1)].

  It can be shown that

    E[log(Y_1)] = (∂Γ(α)/∂α) · (1/Γ(α)) − log(λ).
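In the Gamma case the likelihood equations have no closed-form solution, but substituting λ = α/Ȳ reduces them to a single equation in α that can be solved numerically. A stdlib-only sketch (the bisection bounds and the finite-difference approximation to the digamma function are our own choices):

```python
import random
from math import lgamma, log

def digamma(a, h=1e-6):
    # derivative of log Gamma, approximated by a central difference
    return (lgamma(a + h) - lgamma(a - h)) / (2 * h)

def gamma_mle(ys, lo=1e-3, hi=100.0):
    """Solve the Gamma likelihood equations by bisection in alpha."""
    ybar = sum(ys) / len(ys)
    logbar = sum(log(y) for y in ys) / len(ys)
    # with lambda = alpha / ybar, the remaining equation is
    #   digamma(alpha) - log(alpha) + log(ybar) = mean(log Y);
    # the left-hand side is increasing in alpha
    def f(a):
        return digamma(a) - log(a) + log(ybar) - logbar
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    alpha = 0.5 * (lo + hi)
    return alpha, alpha / ybar

random.seed(1)
sample = [random.gammavariate(3.0, 0.5) for _ in range(4000)]  # alpha = 3, lambda = 2
alpha_hat, lambda_hat = gamma_mle(sample)
```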



The EM Algorithm for Exponential Families

Suppose the complete data Y have a distribution from an exponential family,

    f_Y(y|θ) = b(y) exp{η(θ)^T T(y) − a(θ)}.

Then the EM algorithm has a particularly simple form.

EM algorithm for exponential families

◦ E-step: Estimate the sufficient statistic T = T(Y) by

    T^(k) = E[T(Y) | Y_obs, θ^(k)].

◦ M-step: Find θ^(k+1) by solving the likelihood equations for θ,

    (∂η(θ)/∂θ)^T E[T(Y)|θ] = (∂η(θ)/∂θ)^T T^(k),

  or, if the matrix ∂η(θ)/∂θ is invertible,

    E[T(Y)|θ] = T^(k).


Example: Univariate normal observations

Suppose that only the first m of n values are observed.

◦ E-step:

    T_1^(k) = E[T_1(Y) | Y_obs, μ̂^(k), σ̂^(k)²] = Σ_{i=1}^m Y_i + (n − m) μ̂^(k)

    T_2^(k) = E[T_2(Y) | Y_obs, μ̂^(k), σ̂^(k)²] = Σ_{i=1}^m Y_i² + (n − m)(σ̂^(k)² + μ̂^(k)²)

◦ M-step:

    T_1^(k) = E[T_1(Y)|μ, σ²] = n μ          ⇒  μ̂^(k+1) = (1/n) T_1^(k)

    T_2^(k) = E[T_2(Y)|μ, σ²] = n σ² + n μ²  ⇒  σ̂^(k+1)² = (1/n) T_2^(k) − (1/n²) (T_1^(k))²
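Assuming the missingness carries no information about the values, these two updates can be written as a small Python routine (the function name and starting values are ours); the iteration converges to the ML estimates based on the observed values:

```python
from math import fsum

def em_normal(y_obs, n, iters=300):
    """EM for N(mu, sigma2) when only the first m of n values are observed."""
    m = len(y_obs)
    mu, s2 = 0.0, 1.0                      # arbitrary starting values
    for _ in range(iters):
        # E-step: expected sufficient statistics given y_obs and current (mu, s2)
        t1 = fsum(y_obs) + (n - m) * mu
        t2 = fsum(y * y for y in y_obs) + (n - m) * (s2 + mu * mu)
        # M-step: solve E[T | mu, s2] = T^(k)
        mu = t1 / n
        s2 = t2 / n - mu * mu
    return mu, s2
```

At the fixed point the update for μ gives m μ = Σ y_obs, so the algorithm reproduces the mean and (biased) variance of the observed values.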




Example: t distribution

Suppose that Y_1, ..., Y_n are independently sampled from the density

    f_{Y_i}(y|μ) = (1/(√π Γ(1/2))) (1 + (y − μ)²)^{−1},

i.e. from a t distribution with one degree of freedom (a Cauchy distribution with location μ). Define the complete data as (Y, X), where X_i ~ χ²_1 iid such that

    Y_i | X_i ~ N(μ, X_i^{−1}).

Then the complete-data likelihood is

    L_n(μ|Y, X) = exp{−(1/2) Σ_{i=1}^n X_i (Y_i − μ)²} Π_{i=1}^n √X_i · f_{X_i}(X_i).

Thus

    η(μ) = (μ, −μ²/2)^T  and  T(Y, X) = (Σ_{i=1}^n X_i Y_i, Σ_{i=1}^n X_i)^T.

◦ E-step:

    T_1^(k) = Σ_{i=1}^n E[X_i | Y_i, μ̂^(k)] Y_i = Σ_{i=1}^n 2 Y_i / (1 + (Y_i − μ̂^(k))²)

    T_2^(k) = Σ_{i=1}^n E[X_i | Y_i, μ̂^(k)] = Σ_{i=1}^n 2 / (1 + (Y_i − μ̂^(k))²)

◦ M-step: Note that ∂η(μ)/∂μ = (1, −μ)^T. Thus the ML estimator solves the equation

    n E[X_1 Y_1 | μ] − n μ E[X_1] = T_1^(k) − μ T_2^(k).

  Since E[X_1] = 1 and E[X_1 Y_1 | μ] = μ E[X_1] = μ, the left-hand side vanishes, which yields

    μ̂^(k+1) = T_1^(k) / T_2^(k).
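The whole iteration thus amounts to a reweighted mean that downweights outlying observations. A compact sketch (the function name and starting value are our own choices):

```python
def em_cauchy_location(ys, iters=100):
    """EM for the location mu of t_1 (Cauchy) data via the N(mu, 1/X_i) representation."""
    mu = sorted(ys)[len(ys) // 2]          # start from the sample median
    for _ in range(iters):
        # E-step: w_i = E(X_i | Y_i, mu) = 2 / (1 + (Y_i - mu)^2)
        w = [2.0 / (1.0 + (y - mu) ** 2) for y in ys]
        # M-step: mu = T_1 / T_2, a weighted mean
        mu = sum(wi * y for wi, y in zip(w, ys)) / sum(w)
    return mu
```

For data symmetric about a point, the iteration stays at that center of symmetry, as one would expect of the ML estimator.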

