
5.3 Sufficient Statistics
Recall that for the estimation of a DC level A in WGN the sample mean

$$\hat{A} = \frac{1}{N}\sum_{n=0}^{N-1} x[n]$$

was the MVU estimator, having minimum variance $\sigma^2/N$. If, on the other hand, we had chosen

$$\check{A} = x[0]$$

as our estimator, it is immediately clear that even though $\check{A}$ is unbiased, its variance is much larger (being $\sigma^2$) than the minimum. Intuitively, the poor performance is a direct result of discarding the data points $\{x[1], x[2], \ldots, x[N-1]\}$, which carry information about A.
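As a quick numerical illustration of this variance penalty, the following is a minimal Monte Carlo sketch (not part of the original text; the values of A, $\sigma$, and N are arbitrary choices for demonstration):

```python
import numpy as np

rng = np.random.default_rng(0)
A, sigma, N, trials = 3.0, 1.0, 20, 100_000   # assumed demo values

# Each row is one realization of x[0..N-1] = A + WGN.
x = A + sigma * rng.standard_normal((trials, N))

A_mean = x.mean(axis=1)   # sample-mean estimator, variance sigma^2 / N
A_first = x[:, 0]         # estimator that keeps only x[0], variance sigma^2

print("var(sample mean):", A_mean.var(), "(theory:", sigma**2 / N, ")")
print("var(x[0])       :", A_first.var(), "(theory:", sigma**2, ")")
```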
A reasonable question to ask is: Which data samples are pertinent to the estimation problem? Or, is there a set of data that is sufficient? The following data sets may be claimed to be sufficient in that they may be used to compute $\hat{A}$:

$$S_1 = \{x[0], x[1], \ldots, x[N-1]\}$$
$$S_2 = \{x[0]+x[1], x[2], x[3], \ldots, x[N-1]\}$$
$$S_3 = \left\{\sum_{n=0}^{N-1} x[n]\right\}.$$

$S_1$ represents the original data set, which, as expected, is always sufficient for the problem. $S_2$ and $S_3$ are also sufficient. It is obvious that for this problem there are many sufficient data sets. The data set that contains the least number of elements is called the minimal one. If we now think of the elements of these sets as statistics, we say that the N statistics of $S_1$ are sufficient, as are the $N-1$ statistics of $S_2$ and the single statistic of $S_3$. This latter statistic, $\sum_{n=0}^{N-1} x[n]$, in addition to being a sufficient statistic, is the minimal sufficient statistic. For estimation of A, once we know $\sum_{n=0}^{N-1} x[n]$, we no longer need the individual data values, since all information has been summarized in the sufficient statistic.
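To make this concrete, here is a small sketch (not in the original text; the parameter values are arbitrary) showing that $\hat{A}$ can be recomputed from any of the claimed sufficient data sets, so nothing is lost by discarding the raw samples once, say, $S_3$ is known:

```python
import numpy as np

rng = np.random.default_rng(1)
A, sigma, N = 1.0, 1.0, 8                  # assumed demo values
x = A + sigma * rng.standard_normal(N)

# S1: the original data set
A_hat_S1 = np.mean(x)

# S2: {x[0]+x[1], x[2], ..., x[N-1]}  (N-1 numbers)
S2 = np.concatenate(([x[0] + x[1]], x[2:]))
A_hat_S2 = np.sum(S2) / N

# S3: the single number sum(x[n])
S3 = np.sum(x)
A_hat_S3 = S3 / N

print(A_hat_S1, A_hat_S2, A_hat_S3)        # identical up to floating-point error
```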

To quantify what we mean by this, consider the PDF of the data

$$p(\mathbf{x}; A) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}\left(x[n]-A\right)^2\right] \qquad (5.1)$$

and assume that $T(\mathbf{x}) = \sum_{n=0}^{N-1} x[n] = T_0$ has been observed. Knowledge of the value of this statistic will change the PDF to the conditional one $p(\mathbf{x} \mid \sum_{n=0}^{N-1} x[n] = T_0; A)$, which now gives the PDF of the observations after the sufficient statistic has been observed. Since the statistic is sufficient for the estimation of A, this conditional PDF should not depend on A. If it did, then we could infer some additional information about A from the data in addition to that already provided by the sufficient statistic. As an example, in Figure 5.1a, if $\mathbf{x} = \mathbf{x}_0$ for an arbitrary $\mathbf{x}_0$, then values of A near $A_0$ would be more likely. This violates our notion that $\sum_{n=0}^{N-1} x[n]$ is a sufficient statistic. On the other hand, in Figure 5.1b, any value of A is as likely as any other, so that after observing $T(\mathbf{x})$ the data may be discarded. Hence, to verify that a statistic is sufficient we need to determine the conditional PDF and confirm that there is no dependence on A.

[Figure 5.1 Sufficient statistic definition. (a) Observations provide information after $T(\mathbf{x})$ observed: $T(\mathbf{x})$ is not sufficient. (b) No information from observations after $T(\mathbf{x})$ observed: $T(\mathbf{x})$ is sufficient.]

Example 5.1 - Verification of a Sufficient Statistic

Consider the PDF of (5.1). To prove that $\sum_{n=0}^{N-1} x[n]$ is a sufficient statistic we need to determine $p(\mathbf{x} \mid T(\mathbf{x}) = T_0; A)$, where $T(\mathbf{x}) = \sum_{n=0}^{N-1} x[n]$. By the definition of the conditional PDF we have

$$p(\mathbf{x} \mid T(\mathbf{x}) = T_0; A) = \frac{p(\mathbf{x}, T(\mathbf{x}) = T_0; A)}{p(T(\mathbf{x}) = T_0; A)}.$$

But note that $T(\mathbf{x})$ is functionally dependent on $\mathbf{x}$, so that the joint PDF $p(\mathbf{x}, T(\mathbf{x}) = T_0; A)$ takes on nonzero values only when $\mathbf{x}$ satisfies $T(\mathbf{x}) = T_0$. The joint PDF is therefore $p(\mathbf{x}; A)\,\delta(T(\mathbf{x}) - T_0)$, where $\delta$ is the Dirac delta function (see also Appendix 5A for a further discussion). Thus, we have that

$$p(\mathbf{x} \mid T(\mathbf{x}) = T_0; A) = \frac{p(\mathbf{x}; A)\,\delta(T(\mathbf{x}) - T_0)}{p(T(\mathbf{x}) = T_0; A)}. \qquad (5.2)$$

Clearly, $T(\mathbf{x}) \sim \mathcal{N}(NA, N\sigma^2)$, so that

$$
p(\mathbf{x}; A)\,\delta(T(\mathbf{x}) - T_0)
= \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}\left(x[n]-A\right)^2\right]\delta(T(\mathbf{x})-T_0)
$$

$$
= \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[-\frac{1}{2\sigma^2}\left(\sum_{n=0}^{N-1}x^2[n] - 2AT(\mathbf{x}) + NA^2\right)\right]\delta(T(\mathbf{x})-T_0)
$$

$$
= \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[-\frac{1}{2\sigma^2}\left(\sum_{n=0}^{N-1}x^2[n] - 2AT_0 + NA^2\right)\right]\delta(T(\mathbf{x})-T_0).
$$

From (5.2) we have

$$
p(\mathbf{x} \mid T(\mathbf{x}) = T_0; A)
= \frac{\dfrac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[-\dfrac{1}{2\sigma^2}\sum_{n=0}^{N-1}x^2[n]\right]\exp\left[-\dfrac{1}{2\sigma^2}\left(-2AT_0 + NA^2\right)\right]}
{\dfrac{1}{\sqrt{2\pi N\sigma^2}}\exp\left[-\dfrac{1}{2N\sigma^2}\left(T_0 - NA\right)^2\right]}\,\delta(T(\mathbf{x}) - T_0)
$$

$$
= \frac{\sqrt{N}}{(2\pi\sigma^2)^{\frac{N-1}{2}}}\exp\left[-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}x^2[n]\right]\exp\left[\frac{T_0^2}{2N\sigma^2}\right]\delta(T(\mathbf{x}) - T_0)
$$

which, as claimed, does not depend on A. Therefore, we can conclude that $\sum_{n=0}^{N-1} x[n]$ is a sufficient statistic for the estimation of A. ◇

This example indicates the procedure for verifying that a statistic is sufficient. For many problems the task of evaluating the conditional PDF is formidable, so that an easier approach is needed. Additionally, in Example 5.1 the choice of $\sum_{n=0}^{N-1} x[n]$ for examination as a sufficient statistic was fortuitous. In general an even more difficult problem would be to identify potential sufficient statistics. The approach of guessing at a sufficient statistic and then verifying it is, of course, quite unsatisfactory in practice. To alleviate the guesswork we can employ the Neyman-Fisher factorization theorem, which is a simple "turn-the-crank" procedure for finding sufficient statistics.
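Before turning to the factorization theorem, the conclusion of Example 5.1 can also be checked empirically. The following rough Monte Carlo sketch (not from the text; the parameter values and rejection tolerance are arbitrary) keeps only realizations whose sum falls near a fixed $T_0$ and then examines $x[0]$; the resulting conditional distribution should be the same whatever value of A generated the data:

```python
import numpy as np

rng = np.random.default_rng(2)
N, sigma, T0, tol, trials = 5, 1.0, 10.0, 0.1, 1_000_000

def conditional_x0_samples(A):
    """Crude rejection sampling: keep x[0] from realizations with sum(x) close to T0."""
    x = A + sigma * rng.standard_normal((trials, N))
    keep = np.abs(x.sum(axis=1) - T0) < tol
    return x[keep, 0]

# Two very different DC levels. Conditioned on the observed sum, x[0] should have
# mean near T0/N = 2.0 and standard deviation near sigma*sqrt(1 - 1/N) ~ 0.89 in both cases.
for A in (1.0, 3.0):
    s = conditional_x0_samples(A)
    print(f"A={A}: kept {s.size}, mean={s.mean():.3f}, std={s.std():.3f}")
```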

5.4 Finding Sufficient Statistics

The Neyman-Fisher factorization theorem is now stated, after which we will use it to find sufficient statistics in several examples.

Theorem 5.1 (Neyman-Fisher Factorization) If we can factor the PDF $p(\mathbf{x}; \theta)$ as

$$p(\mathbf{x}; \theta) = g(T(\mathbf{x}), \theta)\,h(\mathbf{x}) \qquad (5.3)$$

where g is a function depending on $\mathbf{x}$ only through $T(\mathbf{x})$ and h is a function depending only on $\mathbf{x}$, then $T(\mathbf{x})$ is a sufficient statistic for $\theta$. Conversely, if $T(\mathbf{x})$ is a sufficient statistic for $\theta$, then the PDF can be factored as in (5.3).

A proof of this theorem is contained in Appendix 5A. It should be mentioned that at times it is not obvious if the PDF can be factored in the required form. If this is the case, then a sufficient statistic may not exist.
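One practical consequence of (5.3), useful as a sanity check, is that for two data vectors with the same value of $T(\mathbf{x})$ the ratio $p(\mathbf{x}_1; \theta)/p(\mathbf{x}_2; \theta) = h(\mathbf{x}_1)/h(\mathbf{x}_2)$ is free of $\theta$. The sketch below (not from the text; the data vectors are arbitrary) checks this numerically for the PDF of (5.1):

```python
import numpy as np

def gaussian_pdf(x, A, sigma=1.0):
    """PDF (5.1): N independent samples of a DC level A in WGN with known variance."""
    N = len(x)
    return (2 * np.pi * sigma**2) ** (-N / 2) * np.exp(-np.sum((x - A) ** 2) / (2 * sigma**2))

# Two different data vectors that share the same sufficient statistic T(x) = sum(x) = 5.
x1 = np.array([0.5, 1.5, 2.0, 1.0])
x2 = np.array([1.2, 0.3, 2.5, 1.0])
assert np.isclose(x1.sum(), x2.sum())

# Because p(x; A) = g(T(x), A) h(x), this ratio should not change with A.
for A in (-1.0, 0.0, 2.0, 5.0):
    print(A, gaussian_pdf(x1, A) / gaussian_pdf(x2, A))
```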
Some examples are now given to illustrate the use of this powerful theorem.

Example 5.2 - DC Level in WGN

We now reexamine the problem discussed in the previous section. There the PDF was given by (5.1), where we note that $\sigma^2$ is assumed known. To demonstrate that a factorization exists we observe that the exponent of the PDF may be rewritten as

$$\sum_{n=0}^{N-1}\left(x[n]-A\right)^2 = \sum_{n=0}^{N-1}x^2[n] - 2A\sum_{n=0}^{N-1}x[n] + NA^2$$

so that the PDF is factorable as

$$p(\mathbf{x}; A) = \underbrace{\frac{1}{(2\pi\sigma^2)^{N/2}}\exp\left[-\frac{1}{2\sigma^2}\left(NA^2 - 2A\sum_{n=0}^{N-1}x[n]\right)\right]}_{g(T(\mathbf{x}),\,A)}\,\underbrace{\exp\left[-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}x^2[n]\right]}_{h(\mathbf{x})}.$$

Clearly then, $T(\mathbf{x}) = \sum_{n=0}^{N-1}x[n]$ is a sufficient statistic for A. Note that $T'(\mathbf{x}) = 2\sum_{n=0}^{N-1}x[n]$ is also a sufficient statistic for A, and in fact any one-to-one function of $\sum_{n=0}^{N-1}x[n]$ is a sufficient statistic (see Problem 5.12). Hence, sufficient statistics are unique only to within one-to-one transformations. ◇
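To make the remark about one-to-one transformations concrete, here is a small sketch (not from the text; parameter values are arbitrary) in which the estimator of A is formed from the sufficient statistic alone, or from an invertible function of it, with the raw samples discarded:

```python
import numpy as np

rng = np.random.default_rng(3)
A, sigma, N = 2.5, 1.0, 50                 # assumed demo values
x = A + sigma * rng.standard_normal(N)

T = x.sum()                                # sufficient statistic T(x)
T_prime = 2 * T                            # a one-to-one function of T(x), also sufficient

A_hat_from_data = x.mean()                 # estimator using the full data
A_hat_from_T = T / N                       # same estimator, computed from T(x) only
A_hat_from_Tp = T_prime / (2 * N)          # same estimator, computed from T'(x) only

print(A_hat_from_data, A_hat_from_T, A_hat_from_Tp)   # all identical
```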
Example 5.3 - Power of WGN

Now consider the PDF of (5.1) with A = 0 and $\sigma^2$ as the unknown parameter. Then,

$$p(\mathbf{x}; \sigma^2) = \underbrace{\frac{1}{(2\pi\sigma^2)^{N/2}}\exp\left[-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}x^2[n]\right]}_{g(T(\mathbf{x}),\,\sigma^2)}\,\cdot\,\underbrace{1}_{h(\mathbf{x})}.$$

Again it is immediately obvious from the factorization theorem that $T(\mathbf{x}) = \sum_{n=0}^{N-1}x^2[n]$ is a sufficient statistic for $\sigma^2$. See also Problem 5.1. ◇
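A rough numerical companion to this example (not from the text; the noise power and record length are arbitrary): with A = 0 the single number $T(\mathbf{x}) = \sum x^2[n]$ is all that is needed to estimate $\sigma^2$.

```python
import numpy as np

rng = np.random.default_rng(4)
sigma2, N = 4.0, 100_000                   # assumed demo values (A = 0)
x = np.sqrt(sigma2) * rng.standard_normal(N)

T = np.sum(x**2)                           # sufficient statistic for sigma^2 when A = 0
sigma2_hat = T / N                         # estimate formed from T(x) alone

print(sigma2_hat)                          # close to sigma2 = 4.0 for large N
```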
