where
$$\eta(\theta) = \big(\eta_1(\theta), \ldots, \eta_d(\theta)\big)^T, \qquad T(y) = \big(T_1(y), \ldots, T_d(y)\big)^T.$$
A sufficient statistic $T$ for $\theta$ contains all the information in the sample about
$\theta$. Thus, given the value of $T$, we cannot improve our knowledge about
$\theta$ by a more detailed analysis of the data $Y_1, \ldots, Y_n$. In other words, an
estimate based on $T = t$ cannot be improved by using the data $Y_1, \ldots, Y_n$.
Sufficiency and exponential families: Rice (1995), 280-284.
Example:
Consider a sequence of independent Bernoulli trials,
$$Y_i \overset{\text{iid}}{\sim} \mathrm{Bin}(1, \theta).$$
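The following is a minimal sketch of what sufficiency means for Bernoulli trials. It assumes that the statistic of interest is the number of successes, $T(y) = \sum_i y_i$, which is the standard sufficient statistic for this model. The function name `bernoulli_likelihood` and the two samples are hypothetical and chosen only for illustration. Two samples with the same number of successes give the same likelihood for every $\theta$.

```python
from math import isclose


def bernoulli_likelihood(y, theta):
    """Joint likelihood of independent Bin(1, theta) observations y."""
    like = 1.0
    for yi in y:
        like *= theta if yi == 1 else 1.0 - theta
    return like


# Two different samples of size 5 with the same number of successes (3).
sample_a = [1, 1, 0, 0, 1]
sample_b = [0, 1, 1, 1, 0]

for theta in (0.1, 0.5, 0.9):
    la = bernoulli_likelihood(sample_a, theta)
    lb = bernoulli_likelihood(sample_b, theta)
    # The likelihood depends on the data only through sum(y).
    print(theta, isclose(la, lb))
```

Because the likelihood is the same function of $\theta$ for both samples, any likelihood-based estimate depends on the data only through the number of successes.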
Examples
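$$\begin{aligned}
f_Y(y \mid \mu, \sigma^2) &= (2\pi\sigma^2)^{-1/2} \exp\Big(-\frac{1}{2\sigma^2}(y - \mu)^2\Big) \\
&= \exp\Big(-\frac{1}{2\sigma^2}\, y^2 + \frac{\mu}{\sigma^2}\, y - \frac{\mu^2}{2\sigma^2} - \frac{1}{2}\log(2\pi\sigma^2)\Big)
\end{aligned}$$
with
$$\eta(\mu, \sigma^2) = \Big(-\frac{1}{2\sigma^2}, \frac{\mu}{\sigma^2}\Big)^T, \qquad T(y) = (y^2, y)^T.$$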
If $Y = (Y_1, \ldots, Y_n)^T$ with $Y_i \overset{\text{iid}}{\sim} N(\mu, \sigma^2)$, then
$$T(y) = \Big(\sum_{i=1}^n y_i^2, \sum_{i=1}^n y_i\Big)^T.$$
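As a rough check of the exponential-family form, the sketch below evaluates the normal log-likelihood twice: once directly from the density and once from the sufficient statistic alone. It assumes $a(\mu, \sigma^2) = \frac{\mu^2}{2\sigma^2} + \frac{1}{2}\log(2\pi\sigma^2)$ per observation and $b(y) = 1$, as read off from the expansion of the density. All function names and the sample `y` are hypothetical.

```python
from math import isclose, log, pi


def natural_parameter(mu, sigma2):
    """eta(mu, sigma^2) = (-1/(2 sigma^2), mu/sigma^2)."""
    return (-1.0 / (2.0 * sigma2), mu / sigma2)


def sufficient_statistic(y):
    """T(y) = (sum of y_i^2, sum of y_i)."""
    return (sum(v * v for v in y), sum(y))


def log_normaliser(mu, sigma2):
    """a(theta) for a single observation (assumed form, see lead-in)."""
    return mu * mu / (2.0 * sigma2) + 0.5 * log(2.0 * pi * sigma2)


def loglik_from_statistic(y, mu, sigma2):
    """Log-likelihood computed from T(y) only: eta^T T(y) - n a(theta)."""
    eta = natural_parameter(mu, sigma2)
    t = sufficient_statistic(y)
    return eta[0] * t[0] + eta[1] * t[1] - len(y) * log_normaliser(mu, sigma2)


def loglik_direct(y, mu, sigma2):
    """Log-likelihood computed observation by observation."""
    return sum(-0.5 * log(2.0 * pi * sigma2) - (v - mu) ** 2 / (2.0 * sigma2)
               for v in y)


y = [1.2, -0.4, 0.7, 2.1, 0.3]
print(isclose(loglik_from_statistic(y, 0.5, 2.0), loglik_direct(y, 0.5, 2.0)))
```

The two values agree up to rounding, so the pair of sums carries everything the likelihood needs from the sample.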
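The ML estimator of $\theta$ solves the likelihood equations
$$\Big(\frac{\partial \eta(\theta)}{\partial \theta}\Big)^T T(Y) = \frac{\partial a(\theta)}{\partial \theta}.$$
If the matrix $\frac{\partial \eta(\theta)}{\partial \theta}$ is invertible, then this simplifies to
$$E\big(T(Y) \mid \theta\big) = T(Y).$$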
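This follows because
$$\begin{aligned}
E\Big[\Big(\frac{\partial \eta(\theta)}{\partial \theta}\Big)^T T(Y) - \frac{\partial a(\theta)}{\partial \theta}\Big]
&= \int \Big[\Big(\frac{\partial \eta(\theta)}{\partial \theta}\Big)^T T(y) - \frac{\partial a(\theta)}{\partial \theta}\Big] \exp\big(\eta(\theta)^T T(y) - a(\theta)\big)\, b(y)\, dy \\
&= \int \frac{\partial}{\partial \theta} \exp\big(\eta(\theta)^T T(y) - a(\theta)\big)\, b(y)\, dy \\
&= 0,
\end{aligned}$$
where the last step exchanges differentiation and integration and uses that the density integrates to one for every $\theta$.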
Examples
◦ $Y_1, \ldots, Y_n \overset{\text{iid}}{\sim} N(\mu, \sigma^2)$
Note that
$$T(Y) = \Big(\sum_{i=1}^n Y_i, \sum_{i=1}^n Y_i^2\Big)^T.$$
$$E\Big(\sum_{i=1}^n Y_i^2\Big) = n\,\sigma^2 + n\,\mu^2$$
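The sketch below illustrates the equation $E(T(Y) \mid \theta) = T(Y)$ for the normal sample. It uses $E(\sum_i Y_i) = n\mu$ together with $E(\sum_i Y_i^2) = n\sigma^2 + n\mu^2$, solves the two equations for $(\mu, \sigma^2)$, and checks that the expected statistic at the solution reproduces the observed one. The function names and the sample are hypothetical.

```python
from math import isclose


def observed_statistic(y):
    """T(Y) = (sum of Y_i, sum of Y_i^2)."""
    return (sum(y), sum(v * v for v in y))


def expected_statistic(n, mu, sigma2):
    """E(T(Y) | mu, sigma^2) = (n mu, n sigma^2 + n mu^2)."""
    return (n * mu, n * sigma2 + n * mu * mu)


def solve_moment_equations(y):
    """Solve E(T(Y) | mu, sigma^2) = T(Y) for (mu, sigma^2)."""
    n = len(y)
    t1, t2 = observed_statistic(y)
    mu_hat = t1 / n
    sigma2_hat = t2 / n - mu_hat ** 2
    return mu_hat, sigma2_hat


y = [1.2, -0.4, 0.7, 2.1, 0.3]
mu_hat, sigma2_hat = solve_moment_equations(y)
print(mu_hat, sigma2_hat)
print(expected_statistic(len(y), mu_hat, sigma2_hat), observed_statistic(y))
```

The solution is the sample mean and the variance with divisor $n$, which is the usual ML estimator for this model.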
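◦ $Y_1, \ldots, Y_n \overset{\text{iid}}{\sim} \Gamma(\alpha, \lambda)$
Note that
$$T(Y) = \Big(\sum_{i=1}^n Y_i, \sum_{i=1}^n \log(Y_i)\Big)^T.$$
$$E\Big(\sum_{i=1}^n \log(Y_i)\Big) = n\, E\big(\log(Y_1)\big)$$
$$E\big(\log(Y_1)\big) = \frac{\partial \Gamma(\alpha)}{\partial \alpha} \cdot \frac{1}{\Gamma(\alpha)} - \log(\lambda)$$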
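The expression for $E(\log(Y_1))$ can be checked numerically. In the sketch below, the term $\frac{\partial \Gamma(\alpha)}{\partial \alpha} \cdot \frac{1}{\Gamma(\alpha)}$ is approximated by a central difference of `math.lgamma`, since it equals the derivative of $\log \Gamma(\alpha)$, and the result is compared with a simulation. It assumes that $\lambda$ is a rate parameter, which is what the formula implies. The function names, step size and sample size are hypothetical choices.

```python
import random
from math import lgamma, log


def digamma_numeric(alpha, h=1e-5):
    """Central-difference approximation of d/d(alpha) log Gamma(alpha)."""
    return (lgamma(alpha + h) - lgamma(alpha - h)) / (2.0 * h)


def expected_log_gamma(alpha, lam):
    """E log(Y_1) for Y_1 ~ Gamma(alpha, lam), with lam assumed to be a rate."""
    return digamma_numeric(alpha) - log(lam)


def simulated_log_mean(alpha, lam, size=50_000, seed=0):
    """Monte Carlo estimate of E log(Y_1)."""
    rng = random.Random(seed)
    # random.gammavariate takes a shape and a *scale*, so scale = 1 / rate.
    draws = (rng.gammavariate(alpha, 1.0 / lam) for _ in range(size))
    return sum(log(v) for v in draws) / size


print(expected_log_gamma(2.0, 3.0), simulated_log_mean(2.0, 3.0))
```

The two printed values should agree to about two decimal places at this sample size.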
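$$E\big(T(Y) \mid \theta\big) = T^{(k)}.$$
$$T_2^{(k)} = E\big(T_2(Y) \mid Y_{\text{obs}}, \hat\mu^{(k)}, \hat\sigma^{(k)\,2}\big) = \sum_{i=1}^m Y_i^2 + (n - m)\big(\hat\sigma^{(k)\,2} + \hat\mu^{(k)\,2}\big)$$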
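◦ M-step:
$$T_1^{(k)} = E\big(T_1(Y) \mid \mu, \sigma^2\big) = n\,\mu \quad\Rightarrow\quad \hat\mu^{(k+1)} = \frac{1}{n}\, T_1^{(k)}$$
$$T_2^{(k)} = E\big(T_2(Y) \mid \mu, \sigma^2\big) = n\,\big(\sigma^2 + \mu^2\big) \quad\Rightarrow\quad \hat\sigma^{(k+1)\,2} = \frac{1}{n}\, T_2^{(k)} - \frac{1}{n^2}\, T_1^{(k)\,2}$$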
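The E-step and M-step for the normal model can be sketched as follows. The sketch assumes that $Y_1, \ldots, Y_m$ are observed and $Y_{m+1}, \ldots, Y_n$ are missing completely at random, and that the E-step for $T_1$ is $T_1^{(k)} = \sum_{i=1}^m Y_i + (n - m)\hat\mu^{(k)}$, by analogy with the expression for $T_2^{(k)}$. The function name `em_normal_missing`, the starting values and the iteration count are hypothetical.

```python
def em_normal_missing(y_obs, n, mu0=0.0, sigma2_0=1.0, iterations=200):
    """EM for N(mu, sigma^2) when only the first m of n values are observed."""
    m = len(y_obs)
    s1 = sum(y_obs)                  # observed part of T1
    s2 = sum(v * v for v in y_obs)   # observed part of T2
    mu, sigma2 = mu0, sigma2_0
    for _ in range(iterations):
        # E-step: expected sufficient statistics given current estimates.
        t1 = s1 + (n - m) * mu                   # assumed form, see lead-in
        t2 = s2 + (n - m) * (sigma2 + mu * mu)
        # M-step: solve E(T(Y) | mu, sigma^2) = T^(k).
        mu = t1 / n
        sigma2 = t2 / n - (t1 / n) ** 2
    return mu, sigma2


y_obs = [1.2, -0.4, 0.7, 2.1, 0.3, 1.5]
mu_hat, sigma2_hat = em_normal_missing(y_obs, n=10)
print(mu_hat, sigma2_hat)
```

In this simple setting the iteration converges to the mean and the variance (with divisor $m$) of the observed values, so EM is not needed in practice here; the example only makes the two steps concrete.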
Example: t distribution
Suppose that $Y_1, \ldots, Y_n$ are independently sampled from the density
$$f_{Y_i}(y \mid \mu) = \frac{1}{\sqrt{\pi}\, \Gamma\big(\tfrac{1}{2}\big)} \big(1 + (y - \mu)^2\big)^{-1}.$$
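Define the complete data as $(Y, X)$ where $X_i \overset{\text{iid}}{\sim} \chi^2_1$ such that $Y_i \mid X_i \sim N(\mu, 1/X_i)$ independently.
Thus
$$\eta(\mu) = \Big(\mu, -\frac{1}{2}\mu^2\Big)^T \quad\text{and}\quad T(Y) = \Big(\sum_{i=1}^n X_i Y_i, \sum_{i=1}^n X_i\Big)^T.$$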
◦ E-step:
$$T_2^{(k)} = \sum_{i=1}^n E\big(X_i \mid Y_i, \hat\mu^{(k)}\big) = \sum_{i=1}^n \frac{2}{1 + \big(Y_i - \hat\mu^{(k)}\big)^2}$$
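◦ M-step: Note that $\frac{\partial \eta(\theta)}{\partial \theta} = (1, -\mu)^T$. Thus the ML estimator solves
the equation
$$T_1^{(k)} - \mu\, T_2^{(k)} = n\,E(X_1 Y_1) - n\,\mu\,E(X_1),$$
where the right-hand side is zero because $E(X_1 Y_1) = \mu\,E(X_1)$. This yields
$$\hat\mu^{(k+1)} = \frac{T_1^{(k)}}{T_2^{(k)}}.$$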
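The iteration for the $t$ distribution example can be sketched as below. The sketch assumes that the E-step for $T_1$ is $T_1^{(k)} = \sum_{i=1}^n Y_i\, E(X_i \mid Y_i, \hat\mu^{(k)})$ with the same conditional expectation as in $T_2^{(k)}$, so that each update is a weighted mean of the observations. The function names, the data and the starting value are hypothetical. The helper `location_score` is the derivative of the observed-data log-likelihood and is only used to check that the iteration has reached a stationary point.

```python
def em_t_location(y, mu0, iterations=500):
    """EM iteration for the location mu of the density 1 / (pi (1 + (y - mu)^2))."""
    mu = mu0
    for _ in range(iterations):
        # E-step: weights E(X_i | Y_i, mu) = 2 / (1 + (Y_i - mu)^2).
        w = [2.0 / (1.0 + (v - mu) ** 2) for v in y]
        t1 = sum(wi * v for wi, v in zip(w, y))   # assumed form, see lead-in
        t2 = sum(w)
        # M-step: mu = T1 / T2, a weighted mean of the observations.
        mu = t1 / t2
    return mu


def location_score(y, mu):
    """Derivative of the observed-data log-likelihood with respect to mu."""
    return sum(2.0 * (v - mu) / (1.0 + (v - mu) ** 2) for v in y)


y = [-1.3, 0.2, 0.5, 0.9, 1.4, 7.0]
mu_hat = em_t_location(y, mu0=0.7)
print(mu_hat, location_score(y, mu_hat))
```

Observations far from the current estimate receive small weights, which is why the outlying value 7.0 has little influence on the result. The likelihood for this model can have several local maxima, so the starting value matters; a robust choice such as the sample median is a common assumption.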