
Hence (recalling that $nF_n(x) \sim \operatorname{Binomial}(n, F(x))$),
\[
E[F_n(x)] = \frac{1}{n}\,E[nF_n(x)] = \frac{1}{n}\,nF(x) = F(x),
\]
which implies that $F_n(x)$ is an unbiased estimator of $F(x)$. Also,
\[
\operatorname{Var}[F_n(x)] = \frac{1}{n^2}\operatorname{Var}[nF_n(x)]
= \frac{1}{n^2}\,nF(x)[1 - F(x)]
= \frac{F(x)[1 - F(x)]}{n}. \tag{10.1}
\]

$F_n(x)$ has a maximum variance where $F(x) = \tfrac{1}{2}$ (i.e., at the median). To see this, denote $\operatorname{Var}[F_n(x)]$ by $\sigma^2$ and $F(x)$ by $p$. Then (10.1) becomes
\[
\sigma^2 = \frac{p(1 - p)}{n} = \frac{p - p^2}{n}.
\]
Taking the derivative w.r.t. $p$ and setting it equal to zero yields
\[
\frac{d\sigma^2}{dp} = \frac{1 - 2p}{n} = 0
\quad\Longrightarrow\quad p = \tfrac{1}{2},
\]
and since $d^2\sigma^2/dp^2 = -2/n < 0$ this is indeed a maximum. Therefore $\operatorname{Var}[F_n(x)]$ reaches a maximum when $F(x) = P(X \le x) = \tfrac{1}{2}$, i.e., at the median of $F$.
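This behaviour is easy to confirm by simulation. The following sketch (Python with NumPy/SciPy; the standard normal population, $n = 50$, and the evaluation points are illustrative choices, not from the notes) compares the simulated variance of $F_n(x)$ with $F(x)[1 - F(x)]/n$ and shows that the variance is largest at the median:

```python
# Monte Carlo check of (10.1): Var[F_n(x)] = F(x)[1 - F(x)]/n, maximal at the median.
# Illustrative sketch; population, n, and evaluation points are arbitrary choices.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, reps = 50, 20_000
xs = [-1.0, 0.0, 1.0]                    # 0.0 is the N(0,1) median

samples = rng.standard_normal((reps, n))
for x in xs:
    Fn = (samples <= x).mean(axis=1)     # empirical CDF at x, one value per replicate
    F = norm.cdf(x)
    print(f"x = {x:+.1f}:  simulated Var = {Fn.var():.5f},  theory = {F * (1 - F) / n:.5f}")
```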
Definition 8.2. Let $X$ be a random variable with density function $f(x|\theta)$. The Fisher information of $X$ is defined as
\[
I(\theta) = \operatorname{Var}\!\left[\frac{\partial}{\partial\theta}\log f(X|\theta)\right].
\]
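As a concrete illustration of the definition (a standard example added here, not part of the surrounding text): if $X \sim \operatorname{Bernoulli}(\theta)$, then $\log f(X|\theta) = X\log\theta + (1 - X)\log(1 - \theta)$, so
\[
\frac{\partial}{\partial\theta}\log f(X|\theta) = \frac{X}{\theta} - \frac{1 - X}{1 - \theta} = \frac{X - \theta}{\theta(1 - \theta)},
\]
and hence $I(\theta) = \operatorname{Var}(X)/[\theta(1 - \theta)]^2 = 1/[\theta(1 - \theta)]$.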
Theorem 8.2 (Lemma A, Rice p. 276). The Fisher information $I(\theta)$ can also be written as
\[
I(\theta) = E\!\left[\left(\frac{\partial}{\partial\theta}\log f(X|\theta)\right)^{2}\right]
\]
or
\[
I(\theta) = -E\!\left[\frac{\partial^2}{\partial\theta^2}\log f(X|\theta)\right]
\]
under appropriate smoothness conditions on $f$.

Proof. Let $Z = \dfrac{\partial}{\partial\theta}\log f(X|\theta)$ and recall that
\[
I(\theta) = \operatorname{Var}(Z) = E(Z^2) - [E(Z)]^2.
\]

Since $f$ is a density function, we know that
\[
\int_{-\infty}^{\infty} f(x|\theta)\,dx = 1.
\]
By taking the derivative w.r.t. $\theta$ on both sides, we obtain
\[
\int_{-\infty}^{\infty} \frac{\partial}{\partial\theta}f(x|\theta)\,dx = 0. \tag{8.6}
\]
Now note that
\[
\left[\frac{\partial}{\partial\theta}\log f(x|\theta)\right] f(x|\theta)
= \frac{\frac{\partial}{\partial\theta}f(x|\theta)}{f(x|\theta)}\,f(x|\theta)
= \frac{\partial}{\partial\theta}f(x|\theta).
\]
Equation (8.6) above therefore becomes
\[
\int_{-\infty}^{\infty} \left[\frac{\partial}{\partial\theta}\log f(x|\theta)\right] f(x|\theta)\,dx = 0. \tag{8.7}
\]
The integral in (8.7) is simply the expected value of $Z$, therefore $E(Z) = 0$ and
\[
I(\theta) = \operatorname{Var}(Z) = E(Z^2)
= E\!\left[\left(\frac{\partial}{\partial\theta}\log f(X|\theta)\right)^{2}\right].
\]
Taking the derivative w.r.t. $\theta$ in (8.7) once more, we obtain
\[
0 = \int_{-\infty}^{\infty} \frac{\partial}{\partial\theta}\left\{\left[\frac{\partial}{\partial\theta}\log f(x|\theta)\right] f(x|\theta)\right\} dx
= \int_{-\infty}^{\infty} \left[\frac{\partial^2}{\partial\theta^2}\log f(x|\theta)\right] f(x|\theta)\,dx
+ \int_{-\infty}^{\infty} \left[\frac{\partial}{\partial\theta}\log f(x|\theta)\right]^{2} f(x|\theta)\,dx,
\]
that is,
\[
0 = E\!\left[\frac{\partial^2}{\partial\theta^2}\log f(X|\theta)\right]
+ E\!\left[\left(\frac{\partial}{\partial\theta}\log f(X|\theta)\right)^{2}\right].
\]
We therefore have the required result:
\[
I(\theta) = E\!\left[\left(\frac{\partial}{\partial\theta}\log f(X|\theta)\right)^{2}\right]
= -E\!\left[\frac{\partial^2}{\partial\theta^2}\log f(X|\theta)\right].
\]

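The equality of the two forms in Theorem 8.2 can be verified symbolically for a specific family. The sketch below (Python with SymPy; the $N(\theta, 1)$ density is an illustrative choice) evaluates both expressions and recovers $I(\theta) = 1$:

```python
# Symbolic check of Theorem 8.2 for the N(theta, 1) density (illustrative family).
import sympy as sp

x, theta = sp.symbols("x theta", real=True)
f = sp.exp(-(x - theta) ** 2 / 2) / sp.sqrt(2 * sp.pi)   # N(theta, 1) density

# E[(d/dtheta log f)^2], integrating against the density
form1 = sp.integrate(sp.diff(sp.log(f), theta) ** 2 * f, (x, -sp.oo, sp.oo))
# -E[d^2/dtheta^2 log f]
form2 = -sp.integrate(sp.diff(sp.log(f), theta, 2) * f, (x, -sp.oo, sp.oo))

print(sp.simplify(form1), sp.simplify(form2))            # both print 1
```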

The large-sample distribution of a maximum likelihood estimator is approximately normal with mean $\theta_0$ and variance $1/[nI(\theta_0)]$, i.e.,
\[
\hat{\theta} \;\dot\sim\; N\!\left(\theta_0,\;\frac{1}{nI(\theta_0)}\right) \quad\text{as } n \to \infty.
\]
In other words, the mle is asymptotically unbiased, and $1/[nI(\theta_0)]$ is the asymptotic variance of the mle.

Theorem 8.3 (Theorem B, Rice p. 264). Under certain assumptions on $f$ with regard to the derivatives that exist,
\[
\frac{\hat{\theta} - \theta_0}{\sqrt{1/[nI(\theta_0)]}} = \sqrt{nI(\theta_0)}\,(\hat{\theta} - \theta_0)
\]
will tend to a standard normal distribution, i.e.,
\[
\sqrt{nI(\theta_0)}\,(\hat{\theta} - \theta_0) \to N(0, 1) \quad\text{as } n \to \infty.
\]
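Theorem 8.3 is easy to see in simulation. The sketch below (an illustrative setup, not from Rice: $X \sim \operatorname{Exponential}$ with rate $\theta$, for which the mle is $1/\bar{X}$ and $I(\theta) = 1/\theta^2$) standardizes the mle and checks that the result looks standard normal:

```python
# Simulation sketch of Theorem 8.3 for the exponential rate parameter.
# Here the MLE is 1/Xbar and I(theta) = 1/theta^2 (standard facts for this family).
import numpy as np

rng = np.random.default_rng(1)
theta, n, reps = 2.0, 500, 10_000

x = rng.exponential(scale=1 / theta, size=(reps, n))
mle = 1 / x.mean(axis=1)                    # MLE of the rate, per replicate
z = np.sqrt(n / theta**2) * (mle - theta)   # sqrt(n I(theta)) * (mle - theta)

print("mean (should be near 0):", z.mean())
print("variance (should be near 1):", z.var())
```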
Theorem 8.6 (Cramér-Rao Inequality, Theorem A, Rice p. 300). Let $X_1, X_2, \ldots, X_n$ be i.i.d. with density function $f(x|\theta)$, and let $T = t(X_1, X_2, \ldots, X_n)$ be an unbiased estimator of $\theta$. Under certain conditions on $f$ it holds that
\[
\operatorname{Var}(T) \ge \frac{1}{nI(\theta)}.
\]
Proof. Let
\[
Z = \sum_{i=1}^{n}\frac{\partial}{\partial\theta}\log f(X_i|\theta)
= \sum_{i=1}^{n}\frac{\frac{\partial}{\partial\theta}f(X_i|\theta)}{f(X_i|\theta)}.
\]
Now note that $-1 \le \operatorname{Corr}(Z, T) \le 1$, that is,
\[
|\operatorname{Corr}(Z, T)| = \frac{|\operatorname{Cov}(Z, T)|}{\sqrt{\operatorname{Var}(Z)}\sqrt{\operatorname{Var}(T)}} \le 1,
\]
or, equivalently,
\[
\operatorname{Cov}(Z, T)^2 \le \operatorname{Var}(Z)\operatorname{Var}(T).
\]
We therefore have that
\[
\operatorname{Var}(T) \ge \frac{\operatorname{Cov}(Z, T)^2}{\operatorname{Var}(Z)}.
\]
We've already shown that $E\!\left[\frac{\partial}{\partial\theta}\log f(X_i|\theta)\right] = 0$ (see equation (8.7)). Consequently,
\[
\operatorname{Var}\!\left[\frac{\partial}{\partial\theta}\log f(X_i|\theta)\right]
= E\!\left[\left(\frac{\partial}{\partial\theta}\log f(X_i|\theta)\right)^{2}\right] = I(\theta).
\]
From the independence of the $X_i$ it follows that
\[
\operatorname{Var}(Z) = \operatorname{Var}\!\left[\sum_{i=1}^{n}\frac{\partial}{\partial\theta}\log f(X_i|\theta)\right] = nI(\theta).
\]
Hence,
\[
\operatorname{Var}(T) \ge \frac{\operatorname{Cov}(Z, T)^2}{nI(\theta)}.
\]
To prove the theorem we now only need to show that $\operatorname{Cov}(Z, T) = 1$. Now,
\[
\operatorname{Cov}(Z, T) = E(ZT) - E(Z)E(T) = E(ZT)
= \int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty} t(x_1, \ldots, x_n)
\left[\sum_{i=1}^{n}\frac{\frac{\partial}{\partial\theta}f(x_i|\theta)}{f(x_i|\theta)}\right]
\prod_{i=1}^{n} f(x_i|\theta)\,dx_1 \cdots dx_n.
\]
Since
\[
\left[\sum_{i=1}^{n}\frac{\frac{\partial}{\partial\theta}f(x_i|\theta)}{f(x_i|\theta)}\right]
\prod_{i=1}^{n} f(x_i|\theta)
= \frac{\partial}{\partial\theta}\prod_{i=1}^{n} f(x_i|\theta),
\]
it follows that
\[
\operatorname{Cov}(Z, T)
= \int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty} t(x_1, \ldots, x_n)\,
\frac{\partial}{\partial\theta}\prod_{i=1}^{n} f(x_i|\theta)\,dx_1 \cdots dx_n
= \frac{\partial}{\partial\theta}\int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty} t(x_1, \ldots, x_n)\prod_{i=1}^{n} f(x_i|\theta)\,dx_1 \cdots dx_n
= \frac{\partial}{\partial\theta}E(T)
= \frac{\partial}{\partial\theta}\,\theta
= 1,
\]
where we have interchanged differentiation and integration (permitted under the regularity conditions) and used the unbiasedness of $T$, i.e., $E(T) = \theta$. We thus have that $\operatorname{Cov}(Z, T) = 1$, and consequently
\[
\operatorname{Var}(T) \ge \frac{1}{nI(\theta)}.
\]
Theorem 8.6 gives a lower bound for the variance of any unbiased estimator. An unbiased estimator whose variance achieves this lower bound is said to be efficient. Since the asymptotic variance of a maximum likelihood estimator is equal to the Cramér-Rao lower bound, maximum likelihood estimators are said to be asymptotically efficient.
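As a numerical illustration of efficiency (a sketch under an assumed $\operatorname{Poisson}(\theta)$ model, for which $I(\theta) = 1/\theta$, so the bound is $\theta/n$; the sample mean is unbiased and attains it exactly):

```python
# The sample mean of Poisson(theta) data is unbiased with variance theta/n,
# which equals the Cramer-Rao bound 1/(n I(theta)) since I(theta) = 1/theta.
import numpy as np

rng = np.random.default_rng(2)
theta, n, reps = 3.0, 40, 50_000

xbar = rng.poisson(theta, size=(reps, n)).mean(axis=1)
print("simulated Var(T):", xbar.var())
print("Cramer-Rao bound:", theta / n)
```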
Theorem 8.7 (Rice p. 281). A statistic $T(X_1, X_2, \ldots, X_n)$ is sufficient for a parameter $\theta$ if and only if the joint density function (or mass function) factors in the following form:
\[
f(x_1, x_2, \ldots, x_n|\theta) = g[T(x_1, x_2, \ldots, x_n), \theta]\,h(x_1, x_2, \ldots, x_n).
\]

Proof. (We give the proof for the discrete case.)

Suppose that the mass function factors as given in the theorem. Let $\mathbf{X} = (X_1, X_2, \ldots, X_n)$ and $\mathbf{x} = (x_1, x_2, \ldots, x_n)$. Then we have that
\[
P(T = t) = \sum_{\{\mathbf{x} : T(\mathbf{x}) = t\}} P(\mathbf{X} = \mathbf{x})
= \sum_{\{\mathbf{x} : T(\mathbf{x}) = t\}} g(t, \theta)\,h(\mathbf{x})
= g(t, \theta)\sum_{\{\mathbf{x} : T(\mathbf{x}) = t\}} h(\mathbf{x}).
\]
However,
\[
P(\mathbf{X} = \mathbf{x} \mid T = t) = \frac{P(\mathbf{X} = \mathbf{x},\, T = t)}{P(T = t)}
= \frac{g(t, \theta)\,h(\mathbf{x})}{g(t, \theta)\sum_{\{\mathbf{x}' : T(\mathbf{x}') = t\}} h(\mathbf{x}')}
= \frac{h(\mathbf{x})}{\sum_{\{\mathbf{x}' : T(\mathbf{x}') = t\}} h(\mathbf{x}')}.
\]
Therefore the conditional distribution is independent of $\theta$.

Conversely, suppose that the conditional distribution of $\mathbf{X}$ given $T$ is independent of $\theta$, i.e., $P(\mathbf{X} = \mathbf{x} \mid T = t, \theta) = P(\mathbf{X} = \mathbf{x} \mid T = t)$. Now, since
\[
P(\mathbf{X} = \mathbf{x} \mid T = t, \theta)
= \frac{P(\mathbf{X} = \mathbf{x},\, T = t \mid \theta)}{P(T = t \mid \theta)}
= \frac{P(\mathbf{X} = \mathbf{x} \mid \theta)}{P(T = t \mid \theta)},
\]
we have that
\[
P(\mathbf{X} = \mathbf{x} \mid \theta) = P(T = t \mid \theta)\,P(\mathbf{X} = \mathbf{x} \mid T = t)
= g(t, \theta)\,h(\mathbf{x}),
\]
where
\[
g(t, \theta) = P(T = t \mid \theta)
\quad\text{and}\quad
h(\mathbf{x}) = P(\mathbf{X} = \mathbf{x} \mid T = t),
\]
which proves the theorem.
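The factorization criterion is easy to see concretely. In the sketch below (an added Bernoulli example: the joint pmf is $\theta^{\sum x_i}(1 - \theta)^{n - \sum x_i} = g[T(\mathbf{x}), \theta]\,h(\mathbf{x})$ with $T(\mathbf{x}) = \sum x_i$ and $h(\mathbf{x}) = 1$), two samples with the same sum yield identical likelihoods for every $\theta$:

```python
# Factorization for Bernoulli(theta): the joint pmf depends on the data only
# through T(x) = sum(x), so samples with equal sums have equal likelihoods.
import numpy as np

def joint_pmf(x, theta):
    t = x.sum()
    return theta**t * (1 - theta) ** (len(x) - t)

x1 = np.array([1, 0, 1, 1, 0])   # T = 3
x2 = np.array([0, 1, 1, 0, 1])   # T = 3 as well
for theta in (0.2, 0.5, 0.8):
    print(theta, joint_pmf(x1, theta), joint_pmf(x2, theta))   # pairs are identical
```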


(To obtain the proof for the continuous case, replace the sums with integrals.)

and the power of the test is
\[
M_{C^*}(\theta_1) = P_{\theta_1}(\text{reject } H_0) = 1 - \beta.
\]
Proof for the continuous case. Let $C$ be any test such that $M_C(\theta_0) \le M_{C^*}(\theta_0)$, i.e.,
\[
P_{\theta_0}(\text{reject } H_0 \text{ using } C) \le P_{\theta_0}(\text{reject } H_0 \text{ using } C^*). \tag{9.1}
\]
Now we must show that $P_{\theta_1}(\text{reject } H_0 \text{ using } C^*) \ge P_{\theta_1}(\text{reject } H_0 \text{ using } C)$, i.e., that
\[
M_{C^*}(\theta_1) - M_C(\theta_1) \ge 0. \tag{9.2}
\]
Remember that
\[
M_C(\theta_1) = P_{\theta_1}(\text{reject } H_0)
= P_{\theta_1}(\mathbf{X} \text{ is in the rejection region } C)
= \int\!\cdots\!\int_{\{\mathbf{x} : \mathbf{x} \in C\}} f(\mathbf{x}|\theta_1)\,dx_1 \cdots dx_n
= \int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty} I(\mathbf{x} \in C)\,f(\mathbf{x}|\theta_1)\,dx_1 \cdots dx_n,
\]
where $I(\mathbf{x} \in B)$ is the indicator function of the event $B$, i.e.,
\[
I(\mathbf{x} \in B) =
\begin{cases}
1 & \text{if } \mathbf{x} \in B \\
0 & \text{if } \mathbf{x} \notin B.
\end{cases}
\]
Likewise,
\[
M_{C^*}(\theta_1) = \int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty} I(\mathbf{x} \in C^*)\,f(\mathbf{x}|\theta_1)\,dx_1 \cdots dx_n.
\]

Hence
\[
M_{C^*}(\theta_1) - M_C(\theta_1)
= \int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty}
[I(\mathbf{x} \in C^*) - I(\mathbf{x} \in C)]\,f(\mathbf{x}|\theta_1)\,dx_1 \cdots dx_n. \tag{9.3}
\]

Now, for $\mathbf{x} \in C^*$ it holds that
\[
I(\mathbf{x} \in C^*) - I(\mathbf{x} \in C) = 1 - I(\mathbf{x} \in C) \ge 0, \tag{9.4}
\]
and, by the definition of the rejection region $C^*$, the likelihood ratio satisfies $f(\mathbf{x}|\theta_0)/f(\mathbf{x}|\theta_1) \le k$ there, so
\[
f(\mathbf{x}|\theta_1) \ge \frac{f(\mathbf{x}|\theta_0)}{k}. \tag{9.5}
\]
It follows from (9.4) and (9.5) that
\[
[I(\mathbf{x} \in C^*) - I(\mathbf{x} \in C)]\,f(\mathbf{x}|\theta_1)
\ge [I(\mathbf{x} \in C^*) - I(\mathbf{x} \in C)]\,\frac{f(\mathbf{x}|\theta_0)}{k}. \tag{9.6}
\]

Similarly, for $\mathbf{x} \notin C^*$ it holds that
\[
I(\mathbf{x} \in C^*) - I(\mathbf{x} \in C) = 0 - I(\mathbf{x} \in C) \le 0, \tag{9.7}
\]
and, since $k$ is a positive number and $f(\mathbf{x}|\theta_0)/f(\mathbf{x}|\theta_1) \ge k$ outside $C^*$,
\[
f(\mathbf{x}|\theta_1) \le \frac{f(\mathbf{x}|\theta_0)}{k}. \tag{9.8}
\]
It follows from (9.7) and (9.8) that
\[
[I(\mathbf{x} \in C^*) - I(\mathbf{x} \in C)]\,f(\mathbf{x}|\theta_1)
\ge [I(\mathbf{x} \in C^*) - I(\mathbf{x} \in C)]\,\frac{f(\mathbf{x}|\theta_0)}{k},
\]
which is equivalent to (9.6).


These inequalities hold for all $\mathbf{x}$, and substitution in (9.3) yields
\[
M_{C^*}(\theta_1) - M_C(\theta_1)
\ge \int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty}
[I(\mathbf{x} \in C^*) - I(\mathbf{x} \in C)]\,\frac{f(\mathbf{x}|\theta_0)}{k}\,dx_1 \cdots dx_n
= \frac{1}{k}\,[M_{C^*}(\theta_0) - M_C(\theta_0)] \ge 0,
\]
where the final inequality follows from (9.1). Therefore $M_{C^*}(\theta_1) - M_C(\theta_1) \ge 0$ and we have proved the theorem.
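The conclusion of the theorem can be observed in a small simulation. The sketch below (an illustrative setup, not from the notes: testing $H_0: N(0, 1)$ against $H_1: N(1, 1)$ with $n = 9$, where the likelihood-ratio test rejects for large $\bar{X}$) compares the likelihood-ratio test with another level-$0.05$ test that uses only $X_1$; the likelihood-ratio test comes out markedly more powerful:

```python
# Neyman-Pearson illustration: H0: N(0,1) vs H1: N(1,1), n = 9, alpha = 0.05.
# The LR test rejects for large Xbar; the competitor uses only X1.
import numpy as np
from scipy.stats import norm

n, alpha, reps = 9, 0.05, 100_000
c_lr = norm.ppf(1 - alpha) / np.sqrt(n)   # Xbar ~ N(0, 1/n) under H0
c_x1 = norm.ppf(1 - alpha)                # X1 ~ N(0, 1) under H0

rng = np.random.default_rng(3)
x = rng.normal(1.0, 1.0, size=(reps, n))  # data generated under H1
print("power of LR test:", (x.mean(axis=1) > c_lr).mean())   # about 0.91
print("power of X1 test:", (x[:, 0] > c_x1).mean())          # about 0.26
```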


Theorem 9.3. Asymptotically (i.e., for large sample sizes) $X^2$ and $-2\log\Lambda$ are equivalent under $H_0$.

We prove this theorem heuristically by means of a Taylor expansion.


Proof. Remember that the Taylor series expansion of $f(x)$ about $x_0$ is given by
\[
f(x) = f(x_0) + (x - x_0)f'(x_0) + \frac{(x - x_0)^2}{2!}f''(x_0) + \cdots.
\]
Now for the function $f(x) = x\log(x/x_0)$ we have that
\[
f(x_0) = x_0\log\!\left(\frac{x_0}{x_0}\right) = x_0\log(1) = 0,
\]
and
\[
f'(x) = \log\!\left(\frac{x}{x_0}\right) + 1, \qquad f'(x_0) = 1,
\]
\[
f''(x) = \frac{1}{x}, \qquad f''(x_0) = \frac{1}{x_0}.
\]
Therefore the Taylor series expansion of $f(x) = x\log(x/x_0)$ about $x_0$ is
\[
f(x) \approx (x - x_0) + \frac{1}{2x_0}(x - x_0)^2.
\]
Hence, taking $x = O_i$ and $x_0 = E_i$, we can approximate $-2\log\Lambda = 2\sum_{i=1}^{m} O_i\log(O_i/E_i)$ by
\[
-2\log\Lambda \approx 2\sum_{i=1}^{m}\left[(O_i - E_i) + \frac{1}{2E_i}(O_i - E_i)^2\right]
= 2\sum_{i=1}^{m}(O_i - E_i) + \sum_{i=1}^{m}\frac{(O_i - E_i)^2}{E_i}.
\]
But since $\sum_{i=1}^{m}\hat{p}_i = \sum_{i=1}^{m} p_{0i} = 1$, so that $\sum_{i=1}^{m} O_i = \sum_{i=1}^{m} E_i = n$, the first term becomes 0. Finally we have that
\[
-2\log\Lambda \approx \sum_{i=1}^{m}\frac{(O_i - E_i)^2}{E_i} = X^2.
\]