Proof. Since $nF_n(x) \sim \mathrm{Bin}(n, F(x))$, we have
$$\mathbb{E}[F_n(x)] = \frac{1}{n}\,\mathbb{E}[nF_n(x)] = \frac{1}{n}\cdot nF(x) = F(x),$$
so $F_n(x)$ is an unbiased estimator of $F(x)$.
$F_n(x)$ has maximum variance where $F(x) = \tfrac{1}{2}$ (i.e., at the median). To see this, write $p = F(x)$, so that
$$\operatorname{Var}[F_n(x)] = \frac{p(1-p)}{n}.$$
Setting the derivative with respect to $p$ equal to zero,
$$\frac{d}{dp}\,\frac{p(1-p)}{n} = \frac{1-2p}{n} = 0 \quad\Longrightarrow\quad p = \tfrac{1}{2}.$$
Therefore $\operatorname{Var}[F_n(x)]$ reaches a maximum when $F(x) = P(X \le x) = \tfrac{1}{2}$, i.e., at the median of $F$.
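A quick simulation illustrates both results (a sketch that is not part of the notes: the $N(0,1)$ data, the sample size, and the grid of evaluation points are arbitrary choices, and NumPy/SciPy are assumed available):

```python
# Estimate Var[F_n(x)] on a grid of x values and check that it peaks at the
# median, where F(x) = 1/2.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, reps = 50, 20_000
xs = np.linspace(-2, 2, 9)                # grid of evaluation points
samples = rng.standard_normal((reps, n))  # reps i.i.d. samples of size n from N(0, 1)

# F_n(x) for each replicate and each grid point: fraction of the sample <= x
Fn = (samples[:, :, None] <= xs).mean(axis=1)    # shape (reps, len(xs))
print("empirical Var[F_n(x)]:", Fn.var(axis=0))

# Theoretical variance p(1-p)/n with p = F(x), maximal at x = 0 (the N(0,1) median)
p = norm.cdf(xs)
print("theoretical p(1-p)/n: ", p * (1 - p) / n)
```

The empirical variances should track $p(1-p)/n$ and peak at $x = 0$, the median of $N(0,1)$.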
Definition 8.2. Let $X$ be a random variable with density function $f(x|\theta)$. The Fisher information of $X$ is defined as
$$I(\theta) = \operatorname{Var}\left[\frac{\partial}{\partial\theta}\log f(X|\theta)\right].$$
Theorem 8.2 (Lemma A, Rice p. 276). The Fisher information $I(\theta)$ can also be written as
$$I(\theta) = \mathbb{E}\left[\left(\frac{\partial}{\partial\theta}\log f(X|\theta)\right)^2\right]$$
or
$$I(\theta) = -\,\mathbb{E}\left[\frac{\partial^2}{\partial\theta^2}\log f(X|\theta)\right]$$
under appropriate smoothness conditions on $f$.
(Proof) Let $Z = \frac{\partial}{\partial\theta}\log f(X|\theta)$ and recall that
$$I(\theta) = \operatorname{Var}(Z) = \mathbb{E}(Z^2) - [\mathbb{E}(Z)]^2.$$
Since $f$ is a density function, we know that
$$\int_{-\infty}^{\infty} f(x|\theta)\,dx = 1.$$
By taking the derivative w.r.t. $\theta$ on both sides, we obtain
$$\int_{-\infty}^{\infty} \frac{\partial}{\partial\theta} f(x|\theta)\,dx = 0. \tag{8.6}$$
Now note that
$$\frac{\partial}{\partial\theta}\log f(x|\theta)\cdot f(x|\theta) = \frac{\frac{\partial}{\partial\theta} f(x|\theta)}{f(x|\theta)}\, f(x|\theta) = \frac{\partial}{\partial\theta} f(x|\theta).$$
Equation (8.6) above therefore becomes
$$\int_{-\infty}^{\infty} \left[\frac{\partial}{\partial\theta}\log f(x|\theta)\right] f(x|\theta)\,dx = 0. \tag{8.7}$$
The integral in (8.7) is simply the expected value of $Z$, therefore $\mathbb{E}(Z) = 0$ and
$$I(\theta) = \operatorname{Var}(Z) = \mathbb{E}(Z^2) = \mathbb{E}\left[\left(\frac{\partial}{\partial\theta}\log f(X|\theta)\right)^2\right].$$
Taking the derivative w.r.t. $\theta$ in (8.7) again, we obtain
$$0 = \int_{-\infty}^{\infty} \left[\frac{\partial^2}{\partial\theta^2}\log f(x|\theta)\right] f(x|\theta)\,dx + \int_{-\infty}^{\infty} \left[\frac{\partial}{\partial\theta}\log f(x|\theta)\right] \frac{\partial}{\partial\theta} f(x|\theta)\,dx,$$
and writing $\frac{\partial}{\partial\theta} f(x|\theta) = \left[\frac{\partial}{\partial\theta}\log f(x|\theta)\right] f(x|\theta)$ as before gives
$$0 = \mathbb{E}\left[\frac{\partial^2}{\partial\theta^2}\log f(X|\theta)\right] + \mathbb{E}\left[\left(\frac{\partial}{\partial\theta}\log f(X|\theta)\right)^2\right].$$
We therefore have the required result:
$$I(\theta) = -\,\mathbb{E}\left[\frac{\partial^2}{\partial\theta^2}\log f(X|\theta)\right]. \qquad \blacksquare$$
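As a numerical sanity check (not from Rice or the notes), the sketch below verifies the two forms of Lemma A by Monte Carlo for an $\mathrm{Exp}(\theta)$ density, where $I(\theta) = 1/\theta^2$ can be computed by hand; NumPy and the choice $\theta = 2$ are illustrative assumptions.

```python
# For f(x|theta) = theta * exp(-theta * x):
#   score  d/dtheta log f = 1/theta - x,   d^2/dtheta^2 log f = -1/theta^2,
# so I(theta) = 1/theta^2.
import numpy as np

rng = np.random.default_rng(1)
theta = 2.0
x = rng.exponential(scale=1 / theta, size=1_000_000)

score = 1 / theta - x                           # d/dtheta log f(X|theta)
print("Var(score)  :", score.var())             # ~ I(theta) = 0.25
print("E[score^2]  :", (score**2).mean())       # ~ I(theta), since E[score] = 0
print("-E[d2 log f]:", 1 / theta**2)            # exact: 1/theta^2 = 0.25
```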
Theorem 8.6 (Cramér–Rao lower bound; Theorem B, Rice p. 264). Suppose $X_1, X_2, \ldots, X_n$ are i.i.d. with density $f(x|\theta)$, and let $T = t(X_1, X_2, \ldots, X_n)$ be an unbiased estimate of $\theta$. Then, under certain assumptions on $f$ with regard to the derivatives that exist,
$$\operatorname{Var}(T) \ge \frac{1}{nI(\theta)}.$$

(Proof) Let
$$Z = \sum_{i=1}^{n} \frac{\partial}{\partial\theta}\log f(X_i|\theta) = \sum_{i=1}^{n} \frac{\frac{\partial}{\partial\theta} f(X_i|\theta)}{f(X_i|\theta)}.$$
Since $\mathbb{E}(Z) = 0$ and the $X_i$ are independent, $\operatorname{Var}(Z) = nI(\theta)$.
Now note that
$$-1 \le \operatorname{Corr}(Z, T) = \frac{\operatorname{Cov}(Z, T)}{\sqrt{\operatorname{Var}(Z)\operatorname{Var}(T)}} \le 1,$$
that is,
$$|\operatorname{Corr}(Z, T)| = \frac{|\operatorname{Cov}(Z, T)|}{\sqrt{\operatorname{Var}(Z)}\sqrt{\operatorname{Var}(T)}} \le 1,$$
or, equivalently,
$$\operatorname{Cov}(Z, T)^2 \le \operatorname{Var}(Z)\operatorname{Var}(T).$$
Hence,
$$\operatorname{Var}(T) \ge \frac{\operatorname{Cov}(Z, T)^2}{\operatorname{Var}(Z)} = \frac{\operatorname{Cov}(Z, T)^2}{nI(\theta)}.$$
To prove the theorem we now only need to show that $\operatorname{Cov}(Z, T) = 1$. Now, since $T$ is unbiased,
$$\theta = \mathbb{E}(T) = \int_{-\infty}^{\infty}\!\!\cdots\!\int_{-\infty}^{\infty} t(x_1, x_2, \ldots, x_n) \prod_{i=1}^{n} f(x_i|\theta)\,dx_1\cdots dx_n,$$
and since
$$\frac{\partial}{\partial\theta}\prod_{i=1}^{n} f(x_i|\theta) = \left[\sum_{i=1}^{n}\frac{\partial}{\partial\theta}\log f(x_i|\theta)\right]\prod_{j=1}^{n} f(x_j|\theta),$$
it follows that
$$\operatorname{Cov}(Z, T) = \mathbb{E}(ZT) - \mathbb{E}(Z)\mathbb{E}(T) = \mathbb{E}(ZT)$$
$$= \int_{-\infty}^{\infty}\!\!\cdots\!\int_{-\infty}^{\infty} t(x_1, x_2, \ldots, x_n)\left[\sum_{i=1}^{n}\frac{\partial}{\partial\theta}\log f(x_i|\theta)\right]\prod_{j=1}^{n} f(x_j|\theta)\,dx_1\cdots dx_n$$
$$= \int_{-\infty}^{\infty}\!\!\cdots\!\int_{-\infty}^{\infty} t(x_1, x_2, \ldots, x_n)\,\frac{\partial}{\partial\theta}\prod_{i=1}^{n} f(x_i|\theta)\,dx_1\cdots dx_n = \frac{\partial}{\partial\theta}\,\mathbb{E}(T) = \frac{\partial\theta}{\partial\theta} = 1.$$
Therefore
$$\operatorname{Var}(T) \ge \frac{1}{nI(\theta)}. \qquad \blacksquare$$
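The bound can be seen in a small simulation (an illustrative sketch, not part of the notes): for $X_i \sim N(\theta, 1)$ we have $I(\theta) = 1$, the sample mean attains the bound $1/n$, and the sample median, which is also unbiased here by symmetry, does not. The values $\theta = 3$ and $n = 25$ are arbitrary.

```python
# Compare Var of two unbiased estimators of theta with the Cramer-Rao bound
# 1/(n*I(theta)) = 1/n for N(theta, 1) data.
import numpy as np

rng = np.random.default_rng(2)
theta, n, reps = 3.0, 25, 100_000
data = rng.normal(theta, 1.0, size=(reps, n))

print("CR lower bound    :", 1 / n)                          # 0.04
print("Var(sample mean)  :", data.mean(axis=1).var())        # ~ 0.04: attains the bound
print("Var(sample median):", np.median(data, axis=1).var())  # ~ 0.06: above the bound
```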
Theorem 8.6 gives a lower bound for the variance of any unbiased estimator. An unbiased estimator whose variance achieves this lower bound is said to be efficient. Since the asymptotic variance of a maximum likelihood estimator is equal to the Cramér–Rao lower bound, maximum likelihood estimators are said to be asymptotically efficient.
Theorem 8.7 (Rice p. 281). A statistic $T(X_1, X_2, \ldots, X_n)$ is sufficient for a parameter $\theta$ if and only if the joint density function (or mass function) factors in the following form:
$$f(x_1, x_2, \ldots, x_n|\theta) = g(T(x_1, x_2, \ldots, x_n), \theta)\, h(x_1, x_2, \ldots, x_n).$$
(Proof) Suppose that the mass function factors as given in the theorem. Let $X = (X_1, X_2, \ldots, X_n)$ and $x = (x_1, x_2, \ldots, x_n)$. Then
$$P(T = t) = \sum_{\{x:\,T(x) = t\}} P(X = x) = \sum_{\{x:\,T(x) = t\}} g(t, \theta)\,h(x) = g(t, \theta) \sum_{\{x:\,T(x) = t\}} h(x).$$
However,
$$P(X = x \mid T = t) = \frac{P(X = x, T = t)}{P(T = t)} = \frac{g(t, \theta)\,h(x)}{g(t, \theta) \sum_{\{x':\,T(x') = t\}} h(x')} = \frac{h(x)}{\sum_{\{x':\,T(x') = t\}} h(x')}.$$
Therefore, the conditional distribution is independent of $\theta$.
Conversely, suppose that the conditional distribution of $X$ given $T$ is independent of $\theta$, i.e., $P(X = x \mid T = t, \theta) = P(X = x \mid T = t)$. Now, since
$$P(X = x \mid T = t, \theta) = \frac{P(X = x, T = t \mid \theta)}{P(T = t \mid \theta)} = \frac{P(X = x \mid \theta)}{P(T = t \mid \theta)},$$
where the second equality holds because $T = t$ automatically whenever $X = x$ with $T(x) = t$, we have that
$$P(X = x \mid \theta) = P(T = t \mid \theta)\, P(X = x \mid T = t) = g(t, \theta)\,h(x),$$
where
$$g(t, \theta) = P(T = t \mid \theta) \quad\text{and}\quad h(x) = P(X = x \mid T = t). \qquad \blacksquare$$
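A concrete check (illustrative, not from the notes): for Bernoulli($\theta$) data the joint pmf factors as $\theta^{t}(1-\theta)^{n-t}\cdot 1$ with $t = \sum_i x_i$, so $T = \sum_i X_i$ is sufficient. The enumeration below confirms that $P(X = x \mid T = t)$ does not change with $\theta$; the values $n = 3$, $t = 2$ are arbitrary.

```python
# Enumerate all binary sequences of length n and compute P(X = x | T = t)
# for Bernoulli(theta) data; the result is free of theta.
from itertools import product

def conditional(theta, n=3, t=2):
    pmf = {x: theta**sum(x) * (1 - theta)**(n - sum(x))
           for x in product((0, 1), repeat=n)}
    pt = sum(p for x, p in pmf.items() if sum(x) == t)   # P(T = t)
    return {x: p / pt for x, p in pmf.items() if sum(x) == t}

print(conditional(0.2))  # each sequence with two 1s has conditional probability 1/3
print(conditional(0.9))  # same conditional distribution: independent of theta
```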
Proof for the continuous case. Let $C^*$ denote the critical region of the likelihood ratio test, which rejects $H_0$ when
$$\frac{f(x|\theta_0)}{f(x|\theta_1)} < k,$$
and let $C$ be any test such that $M_C(\theta_0) \le M_{C^*}(\theta_0)$, i.e.,
$$P(\text{reject } H_0 \mid \theta_0, C) \le P(\text{reject } H_0 \mid \theta_0, C^*),$$
so that
$$M_{C^*}(\theta_0) - M_C(\theta_0) \ge 0. \tag{9.1}$$
Now we must show that $P(\text{reject } H_0 \mid \theta_1, C^*) \ge P(\text{reject } H_0 \mid \theta_1, C)$, i.e.,
$$M_{C^*}(\theta_1) - M_C(\theta_1) \ge 0. \tag{9.2}$$
Remember that
$$I(x \in B) = \begin{cases} 1 & \text{if } x \in B \\ 0 & \text{if } x \notin B. \end{cases}$$
Likewise,
$$M_C(\theta_1) = \int\!\!\cdots\!\!\int I(x \in C)\, f(x|\theta_1)\,dx_1\cdots dx_n.$$
Hence
$$M_{C^*}(\theta_1) - M_C(\theta_1) = \int\!\!\cdots\!\!\int \left[I(x \in C^*) - I(x \in C)\right] f(x|\theta_1)\,dx_1\cdots dx_n. \tag{9.3}$$
Now, for $x \in C^*$,
$$\frac{f(x|\theta_0)}{f(x|\theta_1)} < k \tag{9.4}$$
and therefore
$$f(x|\theta_1) > \frac{f(x|\theta_0)}{k}. \tag{9.5}$$
It follows from (9.4) and (9.5) that, for $x \in C^*$ (where $I(x \in C^*) - I(x \in C) \ge 0$),
$$\left[I(x \in C^*) - I(x \in C)\right] f(x|\theta_1) \ge \left[I(x \in C^*) - I(x \in C)\right] \frac{f(x|\theta_0)}{k}. \tag{9.6}$$
Similarly, for $x \notin C^*$ it holds that
$$\frac{f(x|\theta_0)}{f(x|\theta_1)} \ge k \tag{9.7}$$
and (with $k$ a positive number)
$$f(x|\theta_1) \le \frac{f(x|\theta_0)}{k}, \tag{9.8}$$
so that, since $I(x \in C^*) - I(x \in C) \le 0$ for $x \notin C^*$, inequality (9.6) holds for all $x$. Integrating both sides of (9.6) and using (9.3) and (9.1),
$$M_{C^*}(\theta_1) - M_C(\theta_1) \ge \frac{1}{k}\left[M_{C^*}(\theta_0) - M_C(\theta_0)\right] \ge 0. \qquad \blacksquare$$
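To see the lemma in action, the following Monte Carlo sketch compares the likelihood ratio test with a two-sided test of the same size; the setup (one observation from $N(\theta, 1)$, $H_0\colon \theta = 0$ vs $H_1\colon \theta = 1$, level $\alpha = 0.05$) is an illustrative assumption, not an example from the notes. Here $f(x|\theta_0)/f(x|\theta_1) = e^{1/2 - x}$, so the LR test rejects for large $X$.

```python
# Power comparison at theta_1 = 1: the LR (one-sided) test beats any other
# test of the same size, as the Neyman-Pearson lemma guarantees.
import numpy as np
from scipy.stats import norm

alpha = 0.05
c, c2 = norm.ppf(1 - alpha), norm.ppf(1 - alpha / 2)   # both tests have size 0.05

rng = np.random.default_rng(3)
x1 = rng.normal(1.0, 1.0, size=1_000_000)              # data under H1
print("power of LR test       :", (x1 > c).mean())           # ~ 0.26
print("power of two-sided test:", (np.abs(x1) > c2).mean())  # ~ 0.17, strictly lower
```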
For this we use the Taylor series expansion
$$f(x) = f(x_0) + (x - x_0) f'(x_0) + \frac{(x - x_0)^2}{2} f''(x_0) + \cdots.$$
Now for the function $f(x) = x\log\left(\frac{x}{x_0}\right)$ we have that
$$f(x_0) = x_0\log(1) = 0,$$
$$f'(x) = \log\left(\frac{x}{x_0}\right) + 1, \qquad f'(x_0) = \log(1) + 1 = 1,$$
and
$$f''(x) = \frac{1}{x}, \qquad f''(x_0) = \frac{1}{x_0}.$$
Therefore the Taylor series expansion of $f(x) = x\log\left(\frac{x}{x_0}\right)$ about $x_0$ is
$$f(x) \approx (x - x_0) + \frac{1}{2x_0}(x - x_0)^2.$$
Hence we can approximate $-2\log\Lambda = 2\sum_{i=1}^{m} O_i \log\left(\frac{O_i}{E_i}\right)$ by
$$-2\log\Lambda \approx 2\sum_{i=1}^{m}\left[(O_i - E_i) + \frac{1}{2E_i}(O_i - E_i)^2\right] = 2\sum_{i=1}^{m}(O_i - E_i) + \sum_{i=1}^{m}\frac{(O_i - E_i)^2}{E_i}.$$
But since $\sum_{i=1}^{m}\hat{p}_i = \sum_{i=1}^{m} p_{0i} = 1$, so that $\sum_{i=1}^{m} O_i = \sum_{i=1}^{m} E_i = n$, the first term becomes 0. Finally we have that
$$-2\log\Lambda \approx \sum_{i=1}^{m}\frac{(O_i - E_i)^2}{E_i},$$
which is Pearson's chi-square statistic.
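The approximation is easy to check numerically. This sketch (illustrative: the cell probabilities, the true sampling probabilities, and $n = 200$ are arbitrary choices, with NumPy assumed) computes both statistics from the same multinomial counts; they should come out close.

```python
# Compare -2 log Lambda = 2 * sum O_i log(O_i/E_i) with Pearson's
# sum (O_i - E_i)^2 / E_i on simulated multinomial counts.
import numpy as np

rng = np.random.default_rng(4)
p0 = np.array([0.25, 0.25, 0.25, 0.25])           # H0 cell probabilities
n = 200
O = rng.multinomial(n, [0.30, 0.25, 0.25, 0.20])  # observed counts (true p != p0)
E = n * p0                                        # expected counts under H0

lr = 2 * np.sum(O * np.log(O / E))     # -2 log Lambda (assumes all O_i > 0)
pearson = np.sum((O - E) ** 2 / E)
print("-2 log Lambda:", lr)
print("Pearson X^2  :", pearson)       # close to the LR statistic
```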