
15. Minimum Variance Unbiased Estimation

ECE 830, Spring 2014


Bias-Variance Trade-Off

Recall that

    MSE(θ̂) = Bias²(θ̂) + Var(θ̂).
In general, the minimum MSE estimator has non-zero bias and
non-zero variance.
We can reduce bias only at a potential increase in variance.
Conversely, modifying the estimator to reduce the variance may
lead to an increase in bias.


Example:
Let

    x_n = A + w_n,    w_n ~ N(0, σ²)

    Ã = (α/N) ∑_{n=1}^N x_n

where α is an arbitrary constant. If

    S_N := (1/N) ∑_{n=1}^N x_n,

then

    Ã = α S_N.


Example: (cont.)
Let's find the value of α that minimizes the MSE.

    Var(Ã) = Var(α S_N) = α² Var(S_N) = α² σ²/N

    Bias(Ã) = E[Ã] − A = E[α S_N] − A = αA − A = (α − 1)A

Thus the MSE is

    MSE(Ã) = α² σ²/N + (α − 1)² A².


Aside: alternatively, we could have computed the MSE as follows

    E[x_i x_j] = A² + σ²   if i = j
                 A²        if i ≠ j

    MSE(Ã) = E[(Ã − A)²]
           = E[Ã²] − 2 E[Ã] A + A²
           = E[(α²/N²) ∑_{i,j=1}^N x_i x_j] − 2 E[(α/N) ∑_{n=1}^N x_n] A + A²
           = (α²/N²) ∑_{i,j=1}^N E[x_i x_j] − 2 (α/N) ∑_{n=1}^N E[x_n] A + A²
           = α² A² + α² σ²/N − 2α A² + A²
           = α² σ²/N + (α − 1)² A²,

where the first term is Var(Ã) and the second is Bias²(Ã).
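As a quick sanity check, the decomposition above is easy to verify numerically. The sketch below is only an illustration: the values A = 1, σ = 1, N = 10, α = 0.7 are arbitrary choices, not values from the lecture.

    import numpy as np

    rng = np.random.default_rng(0)

    A, sigma, N, alpha = 1.0, 1.0, 10, 0.7     # arbitrary illustrative values
    trials = 200_000

    x = A + sigma * rng.standard_normal((trials, N))   # x_n = A + w_n
    A_tilde = alpha * x.mean(axis=1)                   # A-tilde = alpha * S_N

    mse_mc = np.mean((A_tilde - A) ** 2)                            # Monte Carlo MSE
    mse_formula = alpha**2 * sigma**2 / N + (alpha - 1)**2 * A**2   # closed form above
    print(mse_mc, mse_formula)   # the two agree to a few decimal places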


So how practical is the MSE as a design criterion?


In the previous example, the MSE is minimized when

    dMSE(Ã)/dα = 2α σ²/N + 2(α − 1) A² = 0
    ⟹ α* = A² / (A² + σ²/N).

The optimal (in an MSE sense) value of α depends on the unknown
parameter A! Therefore, the estimator is not realizable. This
phenomenon occurs for many classes of problems.
We need an alternative to direct MSE minimization.
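A short numerical sketch of why this is a problem: the minimizing α* from the formula above changes with A, so no single, data-independent choice of α is optimal for every parameter value (the values of A and σ²/N below are arbitrary illustrations).

    sigma2_over_N = 0.1                              # sigma^2 / N, arbitrary illustration
    for A in (0.5, 1.0, 2.0):
        alpha_star = A**2 / (A**2 + sigma2_over_N)   # MSE-optimal weight
        print(A, round(alpha_star, 3))               # 0.714, 0.909, 0.976: depends on A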


Note that in the above example, the problematic dependence on
the parameter (A) enters through the Bias component of the MSE.
This occurs in many situations. Thus a reasonable alternative is to
constrain the estimator to be unbiased, and then find the estimator
that produces the minimum variance (and hence provides the
minimum MSE among all unbiased estimators).
Note: Sometimes no unbiased estimator exists and we cannot
proceed at all in this direction.

Definition: Minimum Variance Unbiased Estimator


θ̂ is a minimum variance unbiased estimator (MVUE) for θ if

  1. E[θ̂] = θ
  2. If E[θ̂′] = θ, then Var(θ̂) ≤ Var(θ̂′).


Existence of the Minimum Variance Unbiased Estimator (MVUE)

Does an MVUE estimator exist? Suppose there exist three
unbiased estimators:

    θ̂₁, θ̂₂, θ̂₃

Two possibilities exist.

[Figure: two cases; left panel: "θ̂₃ is MVUE", right panel: "no MVUE exists!"]


Example:
Suppose we observe a single scalar realization x of

    X ~ Unif(0, 1/θ),   θ > 0.

An unbiased estimator of θ does not exist. To see this, note that

    p(x|θ) = θ I_[0, 1/θ](x).

If θ̂ is unbiased, then for all θ > 0,

    θ = E[θ̂] = ∫ θ̂(x) p(x|θ) dx = θ ∫_0^{1/θ} θ̂(x) dx,

i.e., ∫_0^{1/θ} θ̂(x) dx = 1 for every θ > 0. But if this is true for all θ (differentiating
with respect to θ), then we must have θ̂(x) = 0 for almost all x, which is not an
unbiased estimator.

Finding the MVUE Estimator

There is no simple, general procedure for finding the MVUE
estimator. In the next several lectures we will discuss several
approaches:
1. Find a sufficient statistic and apply the Rao-Blackwell theorem
2. Determine the so-called Cramer-Rao Lower Bound (CRLB)
and verify that the estimator achieves it.
3. Further restrict the estimator to a class of estimators (e.g.,
linear or polynomial functions of the data)


Recipe for finding a MVUE


(1) Find a complete sufficient statistic t = T(X).
(2) Find any unbiased estimator θ̂₀ and set

        θ̂(X) := E[θ̂₀(X) | t = T(X)],

    or find a function g such that

        θ̂(X) = g(T(X))

    is unbiased.
These notes answer the following questions:
1. What is a sufficient statistic?
2. What is a complete sufficient statistic?
3. What does step (2) do above?
4. Is this estimator unique?
5. How do we know it's the MVUE?

Definition: Sufficient statistic


Let X be an N-dimensional random vector and let θ denote a
p-dimensional parameter of the distribution of X. The statistic
t := T(X) is a sufficient statistic for θ if and only if the conditional
distribution of X given T(X) is independent of θ.
See lecture 4 for more information on Sufficient Statistics and how
to find them.


Minimal and Complete Sufficient Statistics


Definition: Minimal Sufficient Statistic
A sufficient statistic t is said to be minimal if the dimension of t
cannot be reduced and still be sufficient.

Definition: Complete sufficient statistic


A sufficient statistic t := T(X) is complete if for all real-valued
functions φ which satisfy

    E[φ(t) | θ] = 0   for all θ,

we have

    P[φ(t) = 0 | θ] = 1   for all θ.
Under very general conditions, if t is a complete sufficient statistic,
then t is minimal.

Example: Bernoulli trials


Consider N independent Bernoulli trials

    x_i ~ iid Bernoulli(θ),   θ ∈ [0, 1].

Recall k = ∑_{i=1}^N x_i is sufficient for θ. Now suppose E[φ(k)|θ] = 0
for all θ. But

    E[φ(k)|θ] = ∑_{k=0}^N φ(k) (N choose k) θ^k (1 − θ)^{N−k}
              = poly(θ),

where poly(θ) is an N-th degree polynomial. Then

    poly(θ) = 0 for all θ ∈ [0, 1]
    ⟹ poly(θ) is the zero polynomial
    ⟹ φ(k) = 0 for k = 0, 1, . . . , N.
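The same argument can be mirrored symbolically. The sketch below (using sympy, with N = 3 as an arbitrary example) treats φ(0), ..., φ(N) as unknowns, expands E[φ(k)|θ] as a polynomial in θ, and confirms that forcing every coefficient to zero forces φ ≡ 0.

    import sympy as sp

    N = 3
    theta = sp.symbols('theta')
    phi = sp.symbols('phi0:%d' % (N + 1))    # unknown values phi(0), ..., phi(N)

    # E[phi(k) | theta] = sum_k phi(k) * C(N, k) * theta^k * (1 - theta)^(N - k)
    expectation = sum(phi[k] * sp.binomial(N, k) * theta**k * (1 - theta)**(N - k)
                      for k in range(N + 1))

    # The expectation vanishes for all theta only if every polynomial coefficient is zero
    coeffs = sp.Poly(sp.expand(expectation), theta).all_coeffs()
    print(sp.solve(coeffs, phi))             # {phi0: 0, phi1: 0, phi2: 0, phi3: 0}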

Rao-Blackwell Theorem
Rao-Blackwell Theorem
Let Y , Z be random variables and define the function
g(z) := E[Y |Z = z].
Then
E[g(Z)] = E[Y ]
and
Var(g(Z)) ≤ Var(Y)
with equality iff Y = g(Z) almost surely.
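A toy numerical check of the theorem, using a pair (Y, Z) constructed so that the conditional mean has a simple closed form (Y = Z + W with W independent zero-mean noise, so E[Y|Z = z] = z); the construction is purely illustrative and not from the slides.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 500_000

    Z = rng.standard_normal(n)
    Y = Z + rng.standard_normal(n)     # Y = Z + W, so g(z) = E[Y | Z = z] = z
    g_of_Z = Z

    print(Y.mean(), g_of_Z.mean())     # both near 0:  E[g(Z)] = E[Y]
    print(Y.var(), g_of_Z.var())       # near 2 vs near 1:  Var(g(Z)) <= Var(Y)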
Note that this version of Rao-Blackwell is quite general and has
nothing to do with estimation of parameters. However, we can
apply it to parameter estimation as follows.

Consider X ~ p(x|θ). Let θ̂₁ be an unbiased estimator of θ and let
t = T(x) be a sufficient statistic for θ. Apply Rao-Blackwell with

    Y := θ̂₁(x)
    Z := t = T(x).

Consider the new estimator

    θ̂₂(x) = g(T(x)) = E[θ̂₁(X) | T(X) = t].

(Because t is sufficient, this conditional expectation does not depend
on θ, so θ̂₂ is a legitimate estimator.) Then we may conclude:

1. θ̂₂ is unbiased
2. Var(θ̂₂) ≤ Var(θ̂₁)
In words, if θ̂₁ is any unbiased estimator, then smoothing θ̂₁ with
respect to a sufficient statistic decreases the variance while
preserving unbiasedness.
Therefore, we can restrict our search for the MVUE to functions of
a sufficient statistic.

The Rao-Blackwell Theorem


Rao-Blackwell Theorem, special case
Let X be a random variable with pdf p(X|θ) and let t(X) be a
sufficient statistic. Let θ̂₁(x) be an estimator of θ and define

    θ̂₂(t) := E[θ̂₁(X) | t(X)].

Then

    E[θ̂₂(T)] = E[θ̂₁(X)]

and

    Var(θ̂₂(T)) ≤ Var(θ̂₁(X))

with equality iff θ̂₁(X) = θ̂₂(t(X)) with probability one (almost surely).


Rao-Blackwell Theorem in Action


Suppose we observe 2 independent realizations from a N(θ, σ²)
distribution. Denote these observations x_1 and x_2, with
X = [x_1, x_2]^T. Consider the simple estimator of θ:

    θ̂ = x_1,    E[θ̂] = θ,    Var[θ̂] = σ²

The MSE is therefore σ² (since θ̂ is unbiased).

Intuitively, we expect that the sample mean should be a better
estimator since

    θ̃ = (x_1 + x_2)/2

averages the two observations together.
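A two-line simulation makes the comparison concrete (θ = 2 and σ = 1 are arbitrary choices for the illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    theta, sigma, trials = 2.0, 1.0, 200_000    # arbitrary illustrative values

    x1 = theta + sigma * rng.standard_normal(trials)
    x2 = theta + sigma * rng.standard_normal(trials)

    print(np.var(x1), np.var((x1 + x2) / 2))    # approx sigma^2 = 1 vs sigma^2 / 2 = 0.5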

Is this the best possible estimator?


Let's find a sufficient statistic for θ:

    p(x_1, x_2) = (1/(2πσ²)) e^{−(x_1 − θ)²/(2σ²)} e^{−(x_2 − θ)²/(2σ²)}

                = (1/(2πσ²)) e^{−(x_1² + x_2²)/(2σ²)} · e^{(θ(x_1 + x_2) − θ²)/σ²}

By the Fisher-Neyman factorization theorem, t = x_1 + x_2 is a sufficient statistic for θ.
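The factorization can be double-checked symbolically. Since the 1/(2πσ²) prefactor is common to p(x_1, x_2) and to the proposed data-only factor a(X), it is enough to compare exponents; the sketch below is only a verification of the algebra, not part of the derivation on the slide.

    import sympy as sp

    x1, x2, theta = sp.symbols('x1 x2 theta', real=True)
    sigma = sp.symbols('sigma', positive=True)

    # Exponent of the joint density vs. exponent of the factorization a(X) * b_theta(t)
    joint_exponent = -((x1 - theta)**2 + (x2 - theta)**2) / (2 * sigma**2)
    factored_exponent = -(x1**2 + x2**2) / (2 * sigma**2) \
                        + (theta * (x1 + x2) - theta**2) / sigma**2

    print(sp.simplify(joint_exponent - factored_exponent))   # 0, so t = x1 + x2 is sufficient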


The Rao-Blackwell Theorem states that

    θ̄ = E[θ̂ | t]

is as good as or better than θ̂ in terms of estimator variance. (See
Scharf, p. 94.) What is θ̄? First we need to compute the mean of
the conditional density p(θ̂|t), i.e. p(x_1|t):

    p(x_1|t) = p(x_1, t) / p(t)

    p(x_1, t) = (1/(2πσ²)) e^{−[(x_1 − θ)² + (t − x_1 − θ)²]/(2σ²)}    (since x_2 = t − x_1)

    p(t) = (1/√(4πσ²)) e^{−(t − 2θ)²/(4σ²)}

    E(t) = 2θ,   Var(t) = 2σ²


    p(x_1|t) = p(x_1, t) / p(t)

             = (1/√(πσ²)) exp(−[(x_1 − θ)² + (t − x_1 − θ)² − (t − 2θ)²/2] / (2σ²))

             = (1/√(πσ²)) exp(−[2x_1² − 2x_1 t + t²/2] / (2σ²))

             = (1/√(πσ²)) exp(−(x_1 − t/2)² / σ²),

so that

    x_1 | t ~ N(t/2, σ²/2).

Therefore

    θ̄ = E[θ̂ | t] = E[x_1 | t] = t/2 = (x_1 + x_2)/2
    Var(θ̄) = Var(t)/4 = σ²/2
    MSE(θ̄) = σ²/2.
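The result is also easy to check by simulation: conditioning on t falling in a narrow bin, the empirical mean of x_1 sits near t/2, and the averaged estimator has roughly half the variance of x_1 alone (θ = 2, σ = 1, and the bin center below are arbitrary illustrative values).

    import numpy as np

    rng = np.random.default_rng(0)
    theta, sigma, trials = 2.0, 1.0, 1_000_000   # arbitrary illustrative values

    x1 = theta + sigma * rng.standard_normal(trials)
    x2 = theta + sigma * rng.standard_normal(trials)
    t = x1 + x2

    t0 = 5.0                                     # condition on t near an arbitrary value
    mask = np.abs(t - t0) < 0.02
    print(x1[mask].mean(), t0 / 2)               # both approx 2.5:  E[x1 | t] = t/2

    theta_bar = t / 2                            # the Rao-Blackwellized estimator
    print(theta_bar.var(), sigma**2 / 2)         # approx sigma^2 / 2, half the variance of x1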

The Lehmann-Scheffé Theorem

The Rao-Blackwell Theorem tells us how to decrease the variance
of an unbiased estimator. But when can we know that we get a
MVUE?
Answer: When t is a complete sufficient statistic.

Lehmann-Scheffé Theorem
If t is complete, there is at most one unbiased estimator that is a
function of t.


Proof
Suppose

    E[θ̂₁] = E[θ̂₂] = θ
    θ̂₁(X) := g₁(T(X))
    θ̂₂(X) := g₂(T(X)).

Define

    φ(t) := g₁(t) − g₂(t).

Then

    E[φ(t)] = E[θ̂₁] − E[θ̂₂] = θ − θ = 0   for all θ.

By definition of completeness, we have

    P[φ(t) = 0 | θ] = 1   for all θ.

In other words,

    θ̂₁ = θ̂₂ with probability 1.

Recipe for finding a MVUE

This result suggests the following method for finding a MVUE:


(1) Find a complete sufficient statistic t = T(X).
(2) Find any unbiased estimator θ̂₀ and set

        θ̂(X) := E[θ̂₀(X) | t = T(X)],

    or find a function g such that

        θ̂(X) = g(T(X))

    is unbiased.


Rao-Blackwell and Complete Suff. Stats.


Theorem
If θ̂ is constructed by the recipe above, then θ̂ is the unique MVUE.

Proof: Note that in either construction, θ̂ is a function of t. Let
θ̂₁ be any unbiased estimator. We must show that

    Var(θ̂) ≤ Var(θ̂₁).

Define

    θ̂₂(X) := E[θ̂₁(X) | t = T(X)].

By Rao-Blackwell, it suffices to show

    Var(θ̂) ≤ Var(θ̂₂).


Proof (cont.)

But θ̂ and θ̂₂ are both unbiased and functions of a complete
sufficient statistic, so by the Lehmann-Scheffé Theorem θ̂ = θ̂₂ with
probability 1, and hence Var(θ̂) = Var(θ̂₂) ≤ Var(θ̂₁).

To show uniqueness, in the above argument suppose
Var(θ̂₁) = Var(θ̂). Then the Rao-Blackwell bound holds with
equality, so θ̂₁ = θ̂₂ = θ̂ with probability 1.


Example: Uniform distribution.


Suppose X = [x_1, . . . , x_N]^T where

    x_i ~ iid Unif[0, θ],   i = 1, . . . , N.

What is an unbiased estimator of θ?

    θ̂₁ = (2/N) ∑_{i=1}^N x_i

is unbiased. However, it is not MVUE.


Example: (cont.)
From the Fisher-Neyman factorization theorem,

    p(X|θ) = ∏_{i=1}^N (1/θ) I_[0,θ](x_i)

           = (1/θ^N) I_[max_i x_i, ∞)(θ) · I_(−∞, min_i x_i](0),

where the first factor is b_θ(t) with t = max_i x_i and the second factor is a(X). Thus
we see that

    T = max_i x_i

is a sufficient statistic. It is left as an exercise to show that T is in
fact complete. Since θ̂₁ is not a function of T, it is not MVUE.
However,

    θ̂₂(X) = E[θ̂₁(X) | t = T(X)]

is the MVUE.
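As a closing numerical sketch: the conditional expectation in the last display works out to θ̂₂ = ((N + 1)/N) max_i x_i (a standard closed form, stated here without the derivation the slide leaves to the conditioning step). The simulation below, with arbitrary values θ = 3 and N = 10, checks that both estimators are unbiased and that θ̂₂ has much smaller variance.

    import numpy as np

    rng = np.random.default_rng(0)
    theta, N, trials = 3.0, 10, 200_000          # arbitrary illustrative values

    x = theta * rng.random((trials, N))          # x_i ~ Unif[0, theta]

    theta1 = 2 * x.mean(axis=1)                  # (2/N) * sum_i x_i: unbiased, not a function of T
    theta2 = (N + 1) / N * x.max(axis=1)         # E[theta1 | max_i x_i] = ((N+1)/N) * max_i x_i

    print(theta1.mean(), theta2.mean())          # both approx theta = 3 (unbiased)
    print(theta1.var(), theta2.var())            # approx 0.3 vs approx 0.075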
