In Example 6.24 we showed that if $\mathbf{X}$ is a jointly Gaussian random vector then the linear combination $Z = \mathbf{a}^T\mathbf{X}$ is a Gaussian random variable. Suppose that we do not know the joint pdf of $\mathbf{X}$ but we are given that $Z = \mathbf{a}^T\mathbf{X}$ is a Gaussian random variable for any choice of coefficients $\mathbf{a}^T = (a_1, a_2, \dots, a_n)$. This implies that Eqs. (6.48) and (6.49) hold, which together imply Eq. (6.50), that is, $\mathbf{X}$ has the characteristic function of a jointly Gaussian random vector.
The above definition is slightly broader than the definition using the pdf in Eq. (6.44), which requires that the covariance matrix in the exponent be invertible. The definition based on linear combinations leads instead to the characteristic function of Eq. (6.50), which does not require an invertible covariance matrix, and so it also covers cases where the covariance matrix is singular.
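As a quick numerical illustration of this point (a minimal Python sketch, not from the text), consider the degenerate vector $\mathbf{X} = (X_1, X_1)$: its covariance matrix is singular, so no pdf of the form (6.44) exists, yet every linear combination $\mathbf{a}^T\mathbf{X}$ is an ordinary Gaussian random variable.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Degenerate jointly Gaussian vector: X = (X1, X1). Its covariance matrix
# [[1, 1], [1, 1]] has rank 1, so the pdf form of the definition fails...
x1 = rng.standard_normal(100_000)
X = np.column_stack([x1, x1])
print("det(K) ~", np.linalg.det(np.cov(X, rowvar=False)))   # ~0: singular

# ...yet any linear combination a^T X = (a1 + a2) X1 is still Gaussian.
a = np.array([0.3, -1.7])
z = X @ a
print("skew ~", stats.skew(z))                  # ~0 for a Gaussian
print("excess kurtosis ~", stats.kurtosis(z))   # ~0 for a Gaussian
```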
In general we refer to the above estimator for $X$ in terms of $Y$ as the maximum a posteriori (MAP) estimator. The a posteriori probability is given by:
$$P[X = x \mid Y = y] = \frac{P[Y = y \mid X = x]\,P[X = x]}{P[Y = y]}$$
and so the MAP estimator requires that we know the a priori probabilities $P[X = x]$. In some situations we know $P[Y = y \mid X = x]$ but we do not know the a priori probabilities, so we select the estimator value $x$ as the value that maximizes the likelihood of the observed value $Y = y$:
$$\max_x P[Y = y \mid X = x].$$
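To make the contrast concrete, here is a minimal Python sketch with hypothetical priors and likelihoods (the numbers are invented for illustration): the ML estimator maximizes the likelihood alone, while the MAP estimator weights the likelihood by the prior.

```python
# Hypothetical discrete example: X in {0, 1}, observed through a noisy
# channel with known likelihoods P[Y = y | X = x].
prior = {0: 0.9, 1: 0.1}                    # a priori probabilities P[X = x]
likelihood = {(0, 0): 0.8, (1, 0): 0.2,     # keys (y, x): P[Y = y | X = x]
              (0, 1): 0.3, (1, 1): 0.7}

y = 1                                       # the observed value

# ML: maximize P[Y = y | X = x] over x; the prior is ignored.
x_ml = max(prior, key=lambda x: likelihood[(y, x)])

# MAP: maximize P[Y = y | X = x] P[X = x] over x; the denominator
# P[Y = y] does not depend on x and can be dropped.
x_map = max(prior, key=lambda x: likelihood[(y, x)] * prior[x])

print("ML :", x_ml)     # 1 -- y = 1 is most likely under x = 1
print("MAP:", x_map)    # 0 -- the strong prior P[X = 0] = 0.9 dominates
```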
For jointly Gaussian $X$ and $Y$, the conditional pdf of $X$ given $Y = y$ is:
$$f_X(x \mid y) = \frac{\exp\left\{-\dfrac{1}{2(1 - \rho^2)\sigma_X^2}\left(x - \rho\dfrac{\sigma_X}{\sigma_Y}(y - m_Y) - m_X\right)^2\right\}}{\sqrt{2\pi\sigma_X^2(1 - \rho^2)}}$$
which is maximized by the value of $x$ for which the exponent is zero. Therefore
$$\hat{X}_{\mathrm{MAP}} = \rho\frac{\sigma_X}{\sigma_Y}(y - m_Y) + m_X.$$
The conditional pdf of $Y$ given $X$ is:
$$f_Y(y \mid x) = \frac{\exp\left\{-\dfrac{1}{2(1 - \rho^2)\sigma_Y^2}\left(y - \rho\dfrac{\sigma_Y}{\sigma_X}(x - m_X) - m_Y\right)^2\right\}}{\sqrt{2\pi\sigma_Y^2(1 - \rho^2)}}$$
which is also maximized for the value of $x$ for which the exponent is zero:
$$0 = y - \rho\frac{\sigma_Y}{\sigma_X}(x - m_X) - m_Y.$$
Solving for $x$ gives the maximum likelihood estimator:
$$\hat{X}_{\mathrm{ML}} = \frac{\sigma_X}{\rho\sigma_Y}(y - m_Y) + m_X.$$
Therefore we conclude that $\hat{X}_{\mathrm{ML}} \neq \hat{X}_{\mathrm{MAP}}$. In other words, knowledge of the a priori probabilities of $X$ will affect the estimator.
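A quick numerical check of the two closed forms, for hypothetical parameter values:

```python
# Hypothetical parameters of a jointly Gaussian pair (X, Y).
rho, m_x, m_y, s_x, s_y = 0.5, 1.0, 2.0, 1.0, 3.0
y = 4.0

x_map = rho * (s_x / s_y) * (y - m_y) + m_x      # MAP estimator
x_ml = (s_x / (rho * s_y)) * (y - m_y) + m_x     # ML estimator

# 1.333... versus 2.333...: the two estimators agree only when rho^2 = 1.
print(x_map, x_ml)
```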
$$a^* = \frac{\mathrm{COV}(X, Y)}{\mathrm{VAR}(Y)} = \rho_{X,Y}\frac{\sigma_X}{\sigma_Y},$$
where the second equality follows from the orthogonality condition. Note that when $|\rho_{X,Y}| = 1$, the mean square error is zero. This implies that $P[|X - a^*Y - b^*| = 0] = P[X = a^*Y + b^*] = 1$, so that $X$ is essentially a linear function of $Y$.
Linear estimators in general are suboptimal and have larger mean square errors.
In this example the best linear estimator is:
$$\hat{X} = \frac{1}{\sqrt{5}}\,\frac{\sqrt{5}/2}{1/2}\left(Y - \frac{1}{2}\right) + \frac{3}{2} = Y + 1.$$
The best nonlinear estimator is the conditional expectation:
$$E[X \mid y] = \int_y^{\infty} x\,e^{-(x - y)}\,dx = y + 1, \quad\text{and so}\quad E[X \mid Y] = Y + 1.$$
Thus the optimum linear and nonlinear estimators are the same.
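The integral above is easy to verify numerically, assuming the shifted-exponential conditional pdf $f_X(x \mid y) = e^{-(x-y)}$ for $x \ge y$ that it implies:

```python
import numpy as np
from scipy.integrate import quad

# Check that the conditional-mean integral equals y + 1 for several y.
for y in [0.0, 0.5, 2.0]:
    val, _ = quad(lambda x, y=y: x * np.exp(-(x - y)), y, np.inf)
    print(y, val)   # prints y and, up to quadrature error, y + 1
```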
FIGURE 6.2
Comparison of linear and nonlinear estimators.
The best linear estimator of $Y$ in terms of $X$ is:
$$\hat{Y} = \frac{1}{\sqrt{5}}\,\frac{1/2}{\sqrt{5}/2}\left(X - \frac{3}{2}\right) + \frac{1}{2} = (X + 1)/5,$$
and the best nonlinear estimator is:
$$E[Y \mid x] = \int_0^x y\,\frac{e^{-y}}{1 - e^{-x}}\,dy = \frac{1 - e^{-x} - xe^{-x}}{1 - e^{-x}} = 1 - \frac{xe^{-x}}{1 - e^{-x}}.$$
The optimum linear and nonlinear estimators are not the same in this case. Figure 6.2 compares the two estimators. It can be seen that the linear estimator is close to $E[Y \mid x]$ for lower values of $x$, where the joint pdf of $X$ and $Y$ is concentrated, and that it diverges from $E[Y \mid x]$ for larger values of $x$.
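Tabulating the two estimators confirms this behavior (a short Python sketch using the expressions derived above):

```python
import numpy as np

# Nonlinear estimator E[Y | x] versus the linear estimator (x + 1)/5.
for x in [0.5, 1.0, 2.0, 4.0, 8.0]:
    nonlinear = 1 - x * np.exp(-x) / (1 - np.exp(-x))
    linear = (x + 1) / 5
    print(f"x = {x:4.1f}   E[Y|x] = {nonlinear:.3f}   linear = {linear:.3f}")
```

The printed values agree closely for small $x$ and separate as $x$ grows, as in Figure 6.2.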
Example 6.28
Let $X$ be uniformly distributed in the interval $(-1, 1)$ and let $Y = X^2$. Find the best linear estimator for $Y$ in terms of $X$. Compare its performance to the best estimator.
The mean of $X$ is zero, and its correlation with $Y$ is
$$E[XY] = E[X \cdot X^2] = \int_{-1}^{1} x^3\,\frac{1}{2}\,dx = 0.$$
Therefore $\mathrm{COV}(X, Y) = 0$ and the best linear estimator for $Y$ is $E[Y]$ by Eq. (6.55). The mean square error of this estimator is $\mathrm{VAR}(Y)$ by Eq. (6.57). The best estimator is given by Eq. (6.58):
$$E[Y \mid X = x] = E[X^2 \mid X = x] = x^2.$$
The mean square error of this estimator is
$$E[(Y - g(X))^2] = E[(X^2 - X^2)^2] = 0.$$
Thus in this problem, the best linear estimator performs poorly while the nonlinear estimator
gives the smallest possible mean square error, zero.
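A Monte Carlo check of this example (an illustrative sketch, not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)

# X ~ Uniform(-1, 1) and Y = X^2, as in Example 6.28.
x = rng.uniform(-1.0, 1.0, 1_000_000)
y = x**2

print("Cov(X, Y)   ~", np.cov(x, y)[0, 1])            # ~0
print("MSE of E[Y]  :", np.mean((y - y.mean())**2))   # = VAR(Y) ~ 4/45 ~ 0.089
print("MSE of x^2   :", np.mean((y - x**2)**2))       # exactly 0
```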
This is identical to the best linear estimator. Thus for jointly Gaussian random variables the minimum mean square error estimator is linear.
To simplify the discussion we will assume that $X$ and the $Y_i$ have zero means. The same derivation that led to Eq. (6.58) leads to the optimum minimum mean square error estimator:
$$g^*(\mathbf{y}) = E[X \mid \mathbf{Y} = \mathbf{y}]. \tag{6.59}$$
The minimum mean square error is then:
$$E[(X - g^*(\mathbf{Y}))^2] = \int_{\mathbf{R}^n} \mathrm{VAR}[X \mid \mathbf{Y} = \mathbf{y}]\,f_{\mathbf{Y}}(\mathbf{y})\,d\mathbf{y}.$$
We take derivatives with respect to $a_k$ and again obtain the orthogonality conditions:
$$E\left[\left(X - \sum_{k=1}^{n} a_k Y_k\right) Y_j\right] = 0 \quad\text{for } j = 1, \dots, n.$$
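In matrix form these conditions are the normal equations $\mathbf{R}_Y\,\mathbf{a} = E[X\mathbf{Y}]$, where $(\mathbf{R}_Y)_{jk} = E[Y_j Y_k]$. A small simulation sketch (the model $X = Y_1 + 0.5\,Y_2 + \text{noise}$ is hypothetical) solves them and verifies that the resulting error is orthogonal to the observations:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated zero-mean data from a hypothetical model X = Y1 + 0.5*Y2 + noise.
n = 200_000
Y = rng.standard_normal((n, 2))
X = Y[:, 0] + 0.5 * Y[:, 1] + 0.3 * rng.standard_normal(n)

R_Y = (Y.T @ Y) / n             # sample second moments E[Yj Yk]
EXY = (Y.T @ X) / n             # sample cross-correlations E[X Yj]
a = np.linalg.solve(R_Y, EXY)   # normal equations R_Y a = E[X Y]

err = X - Y @ a                 # estimation error X - sum_k a_k Y_k
print("a ~", a)                          # ~ [1.0, 0.5]
print("E[err Yj] ~", (Y.T @ err) / n)    # ~ [0, 0]: orthogonality holds
```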
Now suppose that $X$ has mean $m_X$ and $\mathbf{Y}$ has mean vector $\mathbf{m}_Y$, so our estimator now has the form:
$$\hat{X} = g(\mathbf{Y}) = \sum_{k=1}^{n} a_k Y_k + b = \mathbf{a}^T\mathbf{Y} + b. \tag{6.62}$$
The same argument that led to Eq. (6.53b) implies that the optimum choice for $b$ is:
$$b = E[X] - \mathbf{a}^T\mathbf{m}_Y.$$
Therefore the optimum linear estimator has the form:
$$\hat{X} = g(\mathbf{Y}) = \mathbf{a}^T(\mathbf{Y} - \mathbf{m}_Y) + m_X = \mathbf{a}^T\mathbf{Z} + m_X$$
where $\mathbf{Z} = \mathbf{Y} - \mathbf{m}_Y$ is a random vector with zero mean vector. The mean square error for this estimator is:
$$E[(X - g(\mathbf{Y}))^2] = E[(X - \mathbf{a}^T\mathbf{Z} - m_X)^2] = E[(W - \mathbf{a}^T\mathbf{Z})^2]$$
where $W = X - m_X$ has zero mean. We have reduced the general estimation problem to one with zero-mean random variables, i.e., $W$ and $\mathbf{Z}$, which has the solution given by Eq. (6.61a). Therefore the optimum set of linear predictors is given by:
$$\mathbf{a} = \mathbf{R}_Z^{-1}E[W\mathbf{Z}] = \mathbf{K}_Y^{-1}E[(X - m_X)(\mathbf{Y} - \mathbf{m}_Y)]. \tag{6.63a}$$
The mean square error is:
$$E[(W - \mathbf{a}^T\mathbf{Z})^2] = \mathrm{VAR}(X) - \mathbf{a}^T E[W\mathbf{Z}].$$
This result is of particular importance in the case where $X$ and $\mathbf{Y}$ are jointly Gaussian random variables. In Example 6.23 we saw that the conditional expected value
of $X$ given $\mathbf{Y}$ is a linear function of $\mathbf{Y}$ of the form in Eq. (6.62). Therefore in this case the optimum minimum mean square error estimator corresponds to the optimum linear estimator.
The correlation matrix of the received signals is:
$$\mathbf{R}_Y = \begin{bmatrix} E[Y_1^2] & E[Y_1 Y_2] \\ E[Y_1 Y_2] & E[Y_2^2] \end{bmatrix}$$
The optimum estimator using a single antenna received signal involves solving the $1 \times 1$ version of the above system:
$$\hat{X} = \frac{E[X^2]}{E[X^2] + E[N_1^2]}\,Y_1 = \frac{2}{3}\,Y_1.$$
Using both received signals, the optimum coefficients are:
$$\mathbf{a} = \mathbf{R}_Y^{-1}E[X\mathbf{Y}] = \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}^{-1}\begin{bmatrix} 2 \\ 2 \end{bmatrix} = \frac{1}{5}\begin{bmatrix} 3 & -2 \\ -2 & 3 \end{bmatrix}\begin{bmatrix} 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 0.4 \\ 0.4 \end{bmatrix}$$
and the optimum estimator is:
$$\hat{X} = 0.4Y_1 + 0.4Y_2.$$
The mean square error for the two-antenna estimator is:
$$E[(X - \mathbf{a}^T\mathbf{Y})^2] = \mathrm{VAR}(X) - \mathbf{a}^T E[\mathbf{Y}X] = 2 - [0.4 \;\; 0.4]\begin{bmatrix} 2 \\ 2 \end{bmatrix} = 0.4.$$
As expected, the two-antenna system has a smaller mean square error. Note that the receiver adds the two received signals and scales the result by 0.4. The sum of the signals is:
$$Y_1 + Y_2 = (X + N_1) + (X + N_2) = 2X + N_1 + N_2,$$
so combining the signals keeps the desired signal portion, $X$, constant while averaging the two noise signals $N_1$ and $N_2$. The problems at the end of the chapter explore this topic further.
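The numbers in this example are easy to reproduce (a short numpy check):

```python
import numpy as np

# Second moments from the two-antenna example.
R_Y = np.array([[3.0, 2.0],
                [2.0, 3.0]])    # E[Yj Yk]
EXY = np.array([2.0, 2.0])      # E[X Yj]

a = np.linalg.solve(R_Y, EXY)          # -> [0.4, 0.4]
mse_two = 2.0 - a @ EXY                # VAR(X) - a^T E[YX] -> 0.4
mse_one = 2.0 - (2.0 / 3.0) * 2.0      # single-antenna (2/3) Y1 -> 2/3

print(a, mse_two, mse_one)
```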
For the two-tap predictor of Figure 6.3, the orthogonality conditions lead to:
$$\sigma^2\begin{bmatrix} 1 & r_1 \\ r_1 & 1 \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix} = \sigma^2\begin{bmatrix} r_2 \\ r_1 \end{bmatrix}.$$
Equation (6.61a) gives
$$a = \frac{r_2 - r_1^2}{1 - r_1^2} \quad\text{and}\quad b = \frac{r_1(1 - r_2)}{1 - r_1^2}.$$
FIGURE 6.3
A two-tap linear predictor for processing speech: the predictor forms $\hat{X}_n = bX_{n-1} + aX_{n-2}$ and the prediction error is $E_n = X_n - \hat{X}_n$.
In Problem 6.78, you are asked to show that the mean square error using the above values of $a$ and $b$ is
$$\sigma^2\left\{1 - r_1^2 - \frac{(r_1^2 - r_2)^2}{1 - r_1^2}\right\}. \tag{6.64}$$
Typical values for speech signals are $r_1 = .825$ and $r_2 = .562$. The mean square value of the predictor error is then $.275\sigma^2$. The lower variance of the output $(.275\sigma^2)$ relative to the input variance $(\sigma^2)$ shows that the linear predictor is effective in anticipating the next sample from the two previous samples. The order of the predictor can be increased by using more terms in the linear predictor. Thus a third-order predictor has three terms and involves inverting a $3 \times 3$ correlation matrix, and an $n$th-order predictor involves inverting an $n \times n$ correlation matrix. Linear predictive techniques are used extensively in speech, audio, image, and video compression systems. We discuss linear prediction methods in greater detail in Chapter 10.
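For reference, evaluating the coefficient formulas and Eq. (6.64) at the quoted speech values takes only a few lines:

```python
# Two-tap predictor coefficients and mean square error for the
# typical speech correlations quoted above.
r1, r2 = 0.825, 0.562

a = (r2 - r1**2) / (1 - r1**2)                     # ~ -0.371
b = r1 * (1 - r2) / (1 - r1**2)                    # ~  1.131
mse = 1 - r1**2 - (r1**2 - r2)**2 / (1 - r1**2)    # Eq. (6.64), per sigma^2

print(a, b, mse)                                   # mse ~ 0.275
```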
Define the matrix $\mathbf{\Lambda}^{1/2}$ as the diagonal matrix of square roots of the eigenvalues:
$$\mathbf{\Lambda}^{1/2} \triangleq \begin{bmatrix} \sqrt{\lambda_1} & 0 & \cdots & 0 \\ 0 & \sqrt{\lambda_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{\lambda_n} \end{bmatrix}.$$
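One common use of $\mathbf{\Lambda}^{1/2}$ (a minimal numpy sketch with a hypothetical covariance matrix, assuming $\mathbf{P}$ is the matrix whose columns are the corresponding orthonormal eigenvectors): the matrix $\mathbf{A} = \mathbf{P}\mathbf{\Lambda}^{1/2}$ satisfies $\mathbf{A}\mathbf{A}^T = \mathbf{P}\mathbf{\Lambda}\mathbf{P}^T = \mathbf{K}$, and so maps iid unit-variance Gaussian variables into a vector with covariance $\mathbf{K}$.

```python
import numpy as np

# Hypothetical covariance matrix K (symmetric positive definite).
K = np.array([[2.0, 1.0],
              [1.0, 2.0]])

lam, P = np.linalg.eigh(K)           # eigenvalues, orthonormal eigenvectors
Lam_half = np.diag(np.sqrt(lam))     # the diagonal matrix Lambda^(1/2)
A = P @ Lam_half                     # A A^T = P Lambda P^T = K

print(np.allclose(A @ A.T, K))       # True
```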