Estimators
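We show in the following that $E_{X|\theta_0}[Z(\theta)]$ attains its maximum at $\theta = \theta_0$. For all $\theta \neq \theta_0$,

$$
\begin{aligned}
E_{X|\theta_0}[Z(\theta)] - E_{X|\theta_0}[Z(\theta_0)]
&= \int \log f(x|\theta)\, f(x|\theta_0)\,dx - \int \log f(x|\theta_0)\, f(x|\theta_0)\,dx \\
&= \int \log\frac{f(x|\theta)}{f(x|\theta_0)}\, f(x|\theta_0)\,dx \\
&< \int \left(\frac{f(x|\theta)}{f(x|\theta_0)} - 1\right) f(x|\theta_0)\,dx \qquad (\text{as } \log u < u - 1 \text{ for } u \neq 1) \\
&= 0.
\end{aligned}
\tag{2}
$$

So $\theta_0 = \arg\max_\theta E_{X|\theta_0}[Z(\theta)]$.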
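The inequality can be checked numerically on a concrete model. The sketch below assumes an exponential density $f(x|\theta) = \theta e^{-\theta x}$ with a hypothetical true value $\theta_0 = 2$; neither choice comes from the text, and the helper names `expected_z` and `average_z` are illustrative. Under that assumption $Z(\theta) = \log\theta - \theta X$, so $E_{X|\theta_0}[Z(\theta)] = \log\theta - \theta/\theta_0$, and both the closed form and a Monte Carlo average can be maximized over a grid.

```python
import math
import random

def expected_z(theta, theta0):
    """Closed form of E_{X|theta0}[Z(theta)] for the assumed density theta * exp(-theta * x)."""
    # Z(theta) = log(theta) - theta * X, and E[X] = 1 / theta0 under the true rate.
    return math.log(theta) - theta / theta0

def average_z(theta, sample):
    """Monte Carlo counterpart: the sample mean of Z(theta) over draws from f(.|theta0)."""
    return sum(math.log(theta) - theta * x for x in sample) / len(sample)

theta0 = 2.0                               # hypothetical true parameter
grid = [k / 10 for k in range(5, 45)]      # candidate values 0.5, 0.6, ..., 4.4

rng = random.Random(0)                     # fixed seed so the run is repeatable
sample = [rng.expovariate(theta0) for _ in range(20000)]

best_exact = max(grid, key=lambda t: expected_z(t, theta0))
best_mc = max(grid, key=lambda t: average_z(t, sample))
print("argmax of closed form:", best_exact)
print("argmax of Monte Carlo average:", best_mc)
```

The closed-form maximizer on this grid is exactly $\theta_0$; the Monte Carlo maximizer should land on or next to it, with the gap shrinking as the sample grows.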
Next, we take a look at the log-likelihood function

$$
L(\theta) = \log f_n(X_1, \dots, X_n|\theta) = \sum_{i=1}^{n} \log f(X_i|\theta) = \sum_{i=1}^{n} Z_i(\theta). \tag{3}
$$
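By the Law of Large Numbers, for each fixed $\theta$ the average $\frac{1}{n}\sum_{i=1}^{n} Z_i(\theta)$ converges in probability to $E_{X|\theta_0}[Z(\theta)]$. Consequently, the maximizer of $\frac{1}{n}\sum_{i=1}^{n} Z_i(\theta)$ (i.e., the MLE $\hat\theta_n = \arg\max_\theta \frac{1}{n}\sum_{i=1}^{n} Z_i(\theta)$) must be close to the maximizer of $E_{X|\theta_0}[Z(\theta)]$ (which is the true parameter value $\theta_0$), provided the convergence of $\frac{1}{n}\sum_{i=1}^{n} Z_i(\theta)$ to $E_{X|\theta_0}[Z(\theta)]$ is uniform in $\theta$.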
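A small simulation can make the consistency claim tangible. This is a minimal sketch under an assumed exponential model with rate $\theta_0 = 2$ (an illustrative choice, not taken from the text), for which the maximizer of the average log-density has the closed form $\hat\theta_n = n / \sum_i X_i$. The function name `exponential_mle` is hypothetical.

```python
import random

def exponential_mle(sample):
    """Maximizer of sum_i (log(theta) - theta * x_i), i.e. n / sum(x_i)."""
    return len(sample) / sum(sample)

theta0 = 2.0                 # hypothetical true rate
rng = random.Random(1)       # fixed seed so the run is repeatable

errors = {}
for n in (10, 100, 1000, 10000, 100000):
    sample = [rng.expovariate(theta0) for _ in range(n)]
    errors[n] = abs(exponential_mle(sample) - theta0)
    print(n, round(errors[n], 4))
```

The printed errors need not decrease monotonically in a single run, but they should be small for the largest sample sizes.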
Sketch of proof for the second statement.
Assume that we obtain the MLE $\hat\theta_n$ using the first-order condition

$$
\frac{\partial}{\partial\theta} L(\theta)\Big|_{\theta=\hat\theta_n} = 0 \tag{5}
$$
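where $L(\theta) = \sum_{i=1}^{n} \log f(X_i|\theta)$. Expanding the left-hand side around the point $\theta = \theta_0$, we get

$$
0 = \frac{\partial}{\partial\theta} L(\theta)\Big|_{\theta=\theta_0} + (\hat\theta_n - \theta_0)\,\frac{\partial^2}{\partial\theta^2} L(\theta)\Big|_{\theta=\theta_0} + \text{remainder}. \tag{6}
$$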
Since θ̂n is consistent, as n → ∞, the value of θ̂n will be close to θ0 with high probability.
So with high probability we can ignore the remainder term and get
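$$
\hat\theta_n - \theta_0 = -\left[\frac{\partial^2}{\partial\theta^2} L(\theta)\Big|_{\theta=\theta_0}\right]^{-1} \frac{\partial}{\partial\theta} L(\theta)\Big|_{\theta=\theta_0}. \tag{7}
$$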
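As a worked illustration of (7), consider an exponential model $f(x|\theta) = \theta e^{-\theta x}$. This model is an assumption introduced here for concreteness and is not part of the original argument; $\bar X_n$ denotes the sample mean.

```latex
\begin{align*}
L(\theta) &= n\log\theta - \theta\sum_{i=1}^{n} X_i, \\
\frac{\partial}{\partial\theta}L(\theta)\Big|_{\theta=\theta_0}
  &= n\left(\frac{1}{\theta_0} - \bar X_n\right), \qquad
\frac{\partial^2}{\partial\theta^2}L(\theta)\Big|_{\theta=\theta_0}
  = -\frac{n}{\theta_0^2}, \\
-\left[\frac{\partial^2}{\partial\theta^2}L(\theta)\Big|_{\theta=\theta_0}\right]^{-1}
  \frac{\partial}{\partial\theta}L(\theta)\Big|_{\theta=\theta_0}
  &= \theta_0\,(1 - \theta_0\bar X_n), \\
\hat\theta_n - \theta_0
  &= \frac{1}{\bar X_n} - \theta_0
   = \frac{1 - \theta_0\bar X_n}{\bar X_n}.
\end{align*}
```

In this example the exact difference and the right-hand side of (7) differ only by the factor $1/(\theta_0 \bar X_n)$, which tends to 1 in probability because $\bar X_n \to 1/\theta_0$. That is the sense in which the remainder can be ignored for large $n$.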
We now make a detour to present some important results about Fisher information, which are useful for completing the proof of the main theorem.
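Assume that for each value of $x$, the PDF $f(x|\theta)$ is a twice differentiable function of $\theta$. Let $\lambda(x|\theta) = \log f(x|\theta)$, $\lambda'(x|\theta) = \frac{\partial}{\partial\theta} \log f(x|\theta)$ and $\lambda''(x|\theta) = \frac{\partial^2}{\partial\theta^2} \log f(x|\theta)$. The Fisher information in the random variable $X$ is defined as

$$
I(\theta) = E_{X|\theta}\big[\lambda'(X|\theta)^2\big],
$$

which can equivalently be written as

$$
I(\theta) = -E_{X|\theta}\big[\lambda''(X|\theta)\big] = \mathrm{Var}_{X|\theta}\big[\lambda'(X|\theta)\big].
$$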
In the following we show that the three above expressions of $I(\theta)$ are equivalent. By the definition of PDF, we have $\int f(x|\theta)\,dx = 1$. By taking first and second derivatives with respect to $\theta$ inside the integral sign, we obtain $\int f'(x|\theta)\,dx = 0$ and $\int f''(x|\theta)\,dx = 0$. Since $\lambda'(x|\theta) = \frac{f'(x|\theta)}{f(x|\theta)}$, then

$$
E_{X|\theta}[\lambda'(X|\theta)] = \int \lambda'(x|\theta)\, f(x|\theta)\,dx = \int f'(x|\theta)\,dx = 0. \tag{10}
$$
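The zero-mean property in (10), and the agreement of the usual forms of Fisher information, can be checked by simulation. The sketch below again assumes an exponential density $\theta e^{-\theta x}$ with $\theta = 2$ (an illustrative choice), for which $\lambda'(x|\theta) = 1/\theta - x$ and $\lambda''(x|\theta) = -1/\theta^2$. It compares $E[\lambda'^2]$, $\mathrm{Var}[\lambda']$ and $-E[\lambda'']$, the standard expressions for $I(\theta)$; the variable names are hypothetical.

```python
import random
import statistics

theta = 2.0                   # hypothetical parameter value
rng = random.Random(2)        # fixed seed so the run is repeatable
xs = [rng.expovariate(theta) for _ in range(200000)]

# Score and second derivative of the log-density for the assumed exponential model.
score = [1.0 / theta - x for x in xs]
second = [-1.0 / theta ** 2 for _ in xs]   # constant in x for this model

mean_score = statistics.fmean(score)                      # estimates E[lambda'], should be near 0
info_sq = statistics.fmean([s * s for s in score])        # estimates E[lambda'^2]
info_var = statistics.pvariance(score)                    # estimates Var[lambda']
info_curv = -statistics.fmean(second)                     # equals -E[lambda''] = 1 / theta^2

print(mean_score, info_sq, info_var, info_curv)
```

All three information estimates should be close to $1/\theta^2 = 0.25$ under this assumed model, and the average score should be close to zero.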
We look at the two components on the right hand side separately. First, by the Law of
Large Numbers,
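$$
\begin{aligned}
\frac{1}{n}\frac{\partial^2}{\partial\theta^2} L(\theta)\Big|_{\theta=\theta_0}
&= \frac{1}{n}\frac{\partial^2}{\partial\theta^2} \sum_{i=1}^{n} \log f(X_i|\theta)\Big|_{\theta=\theta_0} \\
&= \frac{1}{n}\sum_{i=1}^{n} \frac{\partial^2}{\partial\theta^2} \log f(X_i|\theta)\Big|_{\theta=\theta_0} \\
&\xrightarrow{P} E_{X|\theta_0}[\lambda''(X|\theta_0)] = -I(\theta_0).
\end{aligned}
\tag{14}
$$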
Second, by the Central Limit Theorem,
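$$
\begin{aligned}
\frac{1}{n}\frac{\partial}{\partial\theta} L(\theta)\Big|_{\theta=\theta_0}
&= \frac{1}{n}\frac{\partial}{\partial\theta} \sum_{i=1}^{n} \log f(X_i|\theta)\Big|_{\theta=\theta_0} \\
&= \frac{1}{n}\sum_{i=1}^{n} \frac{\partial}{\partial\theta} \log f(X_i|\theta)\Big|_{\theta=\theta_0} \\
&\xrightarrow{D} N\!\left(E_{X|\theta_0}[\lambda'(X|\theta_0)],\ \frac{\mathrm{Var}_{X|\theta_0}[\lambda'(X|\theta_0)]}{n}\right) = N\!\left(0,\ \frac{I(\theta_0)}{n}\right).
\end{aligned}
\tag{15}
$$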
and

$$
\hat\theta_n \xrightarrow{D} N\!\left(\theta_0,\ \frac{1}{n I(\theta_0)}\right). \tag{17}
$$
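The limiting distribution in (17) can be compared against a simulated sampling distribution. The sketch below is one possible check, assuming an exponential model with $\theta_0 = 2$, so that $I(\theta_0) = 1/\theta_0^2$ and $\hat\theta_n = n/\sum_i X_i$. The model, the sample size `n`, the number of replications `reps`, and the helper `exponential_mle` are all illustrative assumptions.

```python
import math
import random
import statistics

def exponential_mle(sample):
    """MLE of the rate in the assumed exponential model: n / sum(x_i)."""
    return len(sample) / sum(sample)

theta0 = 2.0                          # hypothetical true rate
n = 400                               # sample size per replication
reps = 2000                           # number of simulated data sets
fisher_info = 1.0 / theta0 ** 2       # I(theta0) for the assumed model

rng = random.Random(3)                # fixed seed so the run is repeatable
estimates = [
    exponential_mle([rng.expovariate(theta0) for _ in range(n)])
    for _ in range(reps)
]

empirical_var = statistics.pvariance(estimates)
asymptotic_var = 1.0 / (n * fisher_info)          # variance predicted by (17)

# Standardize with the asymptotic standard deviation and count how often |z| < 1.96.
z = [(est - theta0) * math.sqrt(n * fisher_info) for est in estimates]
coverage = sum(abs(v) < 1.96 for v in z) / reps

print("empirical variance:", empirical_var)
print("asymptotic variance:", asymptotic_var)
print("share of |z| < 1.96:", coverage)
```

If the normal approximation is adequate at this sample size, the two variances should be of similar size and the share of standardized estimates within 1.96 should be roughly 0.95.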