Solutions To Chapter 5 Problems
Problem 5.2 Let F denote the event that a student fails (hence F c is the event that the student
passes), and S denote the event that a student studies (hence S c is the event that the student
does not study). We are given that P (F c |S) = 0.9 (hence P (F |S) = 1 − 0.9 = 0.1), P (F |S c ) = 0.9
(hence P (F c |S c ) = 1 − 0.9 = 0.1), and P (S) = 0.7 (hence P (S c ) = 1 − 0.7 = 0.3).
(a) By the law of total probability,
P (F ) = P (F |S)P (S) + P (F |S c )P (S c) = 0.1 × 0.7 + 0.9 × 0.3 = 0.34
(b) By Bayes’ rule, the conditional probability that a student that failed studied for the exam is
given by
P (S|F ) = P (F |S)P (S)/P (F ) = (0.1 × 0.7)/0.34 = 7/34
(c) The conditional probability that a student that failed did not study is P (S c |F ) = 1 − P (S|F ) = 1 − 7/34 = 27/34.
(d) Yes, since conditional probabilities obey the same rule as probabilities, as long as we are
conditioning on the same event.
Remark: On the other hand, we would not expect P (S|F ) and P (S c|F c ) to add up to one,
since we are conditioning on different events. To see this, let us use Bayes’ rule to compute the
probability that a student that passed did not study:
P (S c |F c ) = P (F c |S c )P (S c )/P (F c ) = (0.1 × 0.3)/(1 − 0.34) = 1/22
Adding this to the result of (b) does not have the interpretation of adding the probabilities of
complementary events (and, indeed, gives a result unequal to one), since we are conditioning on
different events.
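As a numerical sanity check, the posteriors above can be reproduced exactly (a quick sketch in Python; the code elsewhere in these solutions is Matlab):

```python
from fractions import Fraction

# Data from Problem 5.2
P_S = Fraction(7, 10)           # P(S): student studies
P_F_given_S = Fraction(1, 10)   # P(F|S)
P_F_given_Sc = Fraction(9, 10)  # P(F|S^c)

# (a) law of total probability
P_F = P_F_given_S * P_S + P_F_given_Sc * (1 - P_S)
# (b) Bayes' rule
P_S_given_F = P_F_given_S * P_S / P_F
# Remark: P(S^c|F^c) conditions on a different event, so it need not
# be the complement of P(S|F)
P_Sc_given_Fc = (1 - P_F_given_Sc) * (1 - P_S) / (1 - P_F)

print(P_F, P_S_given_F, P_Sc_given_Fc)
```

Exact rational arithmetic confirms 7/34, 27/34 and 1/22, and that 7/34 + 1/22 is not one.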
Problem 5.3 We have Y ∼ Exp(1) if 0 sent, Y ∼ Exp(1/10) if 1 sent, and P [0 sent] = 1 − P [1 sent] = 0.6. Recall that the complementary CDF of an exponential random variable is given by P [Exp(µ) > z] = e^{−µz} , z ≥ 0.
(a) P [Y > 5|0 sent] = e−5 .
(b) P [Y > 5|1 sent] = e^{−5/10} = e^{−1/2} .
(c) Using the law of total probability,
P [Y > 5] = P [Y > 5|0 sent]P [0 sent] + P [Y > 5|1 sent]P [1 sent] = e^{−5} × 0.6 + e^{−1/2} × 0.4 = 0.2467
(d) Using Bayes' rule with the conditional densities evaluated at y = 5,
P [0 sent|Y = 5] = (e^{−5} × 0.6) / (e^{−5} × 0.6 + (1/10) e^{−5/10} × 0.4) = 0.1428
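Both numbers are easy to verify (a sketch in Python rather than the Matlab used elsewhere in these solutions):

```python
import math

p0 = 0.6              # prior P[0 sent]
mu0, mu1 = 1.0, 0.1   # exponential rates of Y given 0 or 1 sent

# (c) law of total probability for the tail probability
p_tail = math.exp(-mu0 * 5) * p0 + math.exp(-mu1 * 5) * (1 - p0)

# (d) Bayes' rule with conditional *densities* evaluated at y = 5
f0 = mu0 * math.exp(-mu0 * 5)   # Exp(1) density at y = 5
f1 = mu1 * math.exp(-mu1 * 5)   # Exp(1/10) density at y = 5
post0 = f0 * p0 / (f0 * p0 + f1 * (1 - p0))

print(round(p_tail, 4), round(post0, 4))
```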
Problem 5.4 (a) P [b3 = 0] = P [b1 = 0, b2 = 0] + P [b1 = 1, b2 = 1] = P [b1 = 0]P [b2 = 0] + P [b1 = 1]P [b2 = 1] = 0.8 × 0.1 + 0.2 × 0.9 = 0.26
(b) Denoting pi = P [bi = 0], we have e^{Li} = pi /(1 − pi ). Thus,
e^{L3} = p3 /(1 − p3 ) = (p1 p2 + (1 − p1 )(1 − p2 )) / (p1 (1 − p2 ) + (1 − p1 )p2 )
so that
L3 = log( (e^{L1 +L2} + 1) / (e^{L1} + e^{L2}) )
We can view Li as an expression of our “belief” about the value of bit bi , and the preceding
computation is a key component of “belief propagation” based decoding of channel codes, as
discussed in Chapter 7.
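The combining rule can be checked numerically against the direct computation, e.g. with the values p1 = 0.8, p2 = 0.1 from part (a) (a sketch in Python):

```python
import math

p1, p2 = 0.8, 0.1   # P[b1 = 0], P[b2 = 0], as in part (a)
L1 = math.log(p1 / (1 - p1))
L2 = math.log(p2 / (1 - p2))

# direct: b3 = 0 exactly when b1 = b2
p3 = p1 * p2 + (1 - p1) * (1 - p2)
L3_direct = math.log(p3 / (1 - p3))

# combining rule derived above
L3_rule = math.log((math.exp(L1 + L2) + 1) / (math.exp(L1) + math.exp(L2)))

print(p3, L3_direct, L3_rule)
```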
Problem 5.5 Let Y1 , ..., Yn denote the outputs corresponding to the n channel uses. Then
Z = Y1 + ... + Yn .
(a) Given X = 0, Y1 , ..., Yn are conditionally i.i.d Bernoulli random variables with P [Yi = 1|X =
0] = a = 1 − P [Yi = 0|X = 0], so that their sum Z is conditionally binomial: Z|X=0 ∼ Bin(n, a).
Thus, the conditional pmf is given by
P [Z = z|X = 0] = p(z|0) = (n choose z) a^z (1 − a)^{n−z}
(b) According to the majority rule, the receiver says 1 if Z > ⌊n/2⌋ (assume n odd). Thus, the
conditional probability of error given that 0 is sent is
Pe|0 = P [Z > ⌊n/2⌋|X = 0] = Σ_{z=⌊n/2⌋+1}^{n} (n choose z) a^z (1 − a)^{n−z}
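The binomial tail sum is straightforward to evaluate numerically; a sketch in Python, with illustrative values n = 5, a = 0.1 (the problem leaves n and a symbolic):

```python
from math import comb

def majority_error(n, a):
    """P[Z > floor(n/2) | X = 0] for Z ~ Bin(n, a), with n odd."""
    return sum(comb(n, z) * a**z * (1 - a)**(n - z)
               for z in range(n // 2 + 1, n + 1))

pe = majority_error(5, 0.1)   # illustrative values
print(pe)
```

As expected, the error probability grows with the crossover probability a.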
Figure 1: Stem plot of the posterior probability P [X = 0|Z = m] versus m (equal priors).
(c) Since Z|X=0 ∼ Bin(n, a) and Z|X=1 ∼ Bin(n, 1 − a), the posterior probability from Bayes' rule simplifies to
P [X = 0|Z = m] = a^m (1 − a)^{n−m} / ( a^m (1 − a)^{n−m} + a^{n−m} (1 − a)^m ) = 1 / ( 1 + ((1 − a)/a)^{2m−n} )
assuming equal priors. The stem plot is shown in Figure 1. As expected, smaller values of Z
correspond to a higher posterior probability for X = 0.
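The algebraic simplification and the monotone behavior can both be checked numerically (Python sketch; n = 5 and a = 0.1 are illustrative values, since the problem leaves them symbolic):

```python
n, a = 5, 0.1   # illustrative values; n and a are symbolic in the problem

def posterior(m):
    # direct Bayes computation, equal priors
    num = a**m * (1 - a)**(n - m)
    return num / (num + a**(n - m) * (1 - a)**m)

post = [posterior(m) for m in range(n + 1)]
# simplified closed form derived above
simplified = [1 / (1 + ((1 - a) / a)**(2 * m - n)) for m in range(n + 1)]
print([round(p, 4) for p in post])
```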
(d) When P [X = 0] = 0.9, the posterior probability can be written as
P [X = 0|Z = m] = 0.9 a^m (1 − a)^{n−m} / ( 0.9 a^m (1 − a)^{n−m} + 0.1 a^{n−m} (1 − a)^m ) = 1 / ( 1 + (1/9)((1 − a)/a)^{2m−n} )
Figure 2: Stem plot of the posterior probability P [X = 0|Z = m] versus m, for P [X = 0] = 0.9.
The corresponding stem plot is shown in Figure 2. Comparing with Figure 1, we see that a
higher prior probability for X = 0 leads to a higher posterior probability for X = 0, but for
large values of m, we still have small posterior probabilities for 0 sent (if we get a large enough
number of 1s from the channel, it outweighs our prior information that 0 is significantly more
likely to be sent).
(e) The LLR for a given value of Z is given as
LLR(m) = log( P [X = 0|Z = m] / P [X = 1|Z = m] ) = log( P [X = 0|Z = m] / (1 − P [X = 0|Z = m]) )
The stem plots corresponding to (c)-(d) are shown in Figure 3.
Figure 3: Stem plots of LLR(m) versus m for parts (c) (left) and (d) (right).
Problem 5.6 (a) Using Bayes’ rule and the law of total probability, the posterior probabilities
are given by
P [X = 0|Y = y] = P [Y = y|X = 0]P [X = 0] / P [Y = y] = P [Y = y|X = 0]P [X = 0] / ( P [Y = y|X = 0]P [X = 0] + P [Y = y|X = 1]P [X = 1] )
For 0, 1 equally likely, this reduces to
P [X = 0|Y = y] = P [Y = y|X = 0] / ( P [Y = y|X = 0] + P [Y = y|X = 1] )    (0, 1 equiprobable)
which gives
P [X = 0|Y = y] =
(1 − p − q − r)/(1 − q − r) = 0.917, y = +3
r/(r + q) = 0.75, y = +1
q/(r + q) = 0.25, y = −1
p/(1 − q − r) = 0.083, y = −3
(b) Writing
L(y) = log( P [Y = y|X = 0] / P [Y = y|X = 1] ) + log( P [X = 0] / P [X = 1] )
the LLR is a sum of two terms, one corresponding to the transition probabilities {P [Y = y|X = i]} and one to the prior probabilities {P [X = i]}, where i = 0, 1. For equiprobable priors, the second term is zero, hence the LLRs are determined by the transition probabilities alone.
(c) Since the channel uses are conditionally independent, the required conditional probabilities
are given by
P [Y = y|X = i] = P [Y1 = y1 , Y2 = y2 , Y3 = y3 |X = i]
= P [Y1 = y1 |X = i]P [Y2 = y2 |X = i]P [Y3 = y3 |X = i], i = 0, 1
(d) We replicate the argument in (a) to emphasize that the key ideas apply to vector observations
gathered over multiple channel uses as well. Using Bayes’ rule, we have
P [X = 0|Y = y] = P [Y = y|X = 0] / ( P [Y = y|X = 0] + P [Y = y|X = 1] ) = 0.917    (0, 1 equiprobable)
where we have used the results of (c). As before, the LLR can be written as
L(y) = log( P [X = 0|Y = y] / P [X = 1|Y = y] ) = log( P [Y = y|X = 0] / P [Y = y|X = 1] ) + log( P [X = 0] / P [X = 1] )
Remark: For independent channel uses, we can write the LLR as
L(y) = Σ_{k=1}^{3} log( P [Yk = yk |X = 0] / P [Yk = yk |X = 1] ) + log( P [X = 0] / P [X = 1] )
so that the contributions from the different channel uses and the priors simply add up. This
illustrates why the LLR is an attractive means of combining information from prior probabilities
and observations.
(e) Since the LLR is positive (i.e., the posterior probability of 0 is higher than that of 1), we
would decide on 0 based on the channel output +1,+3,-1.
Problem 5.7 The random variable X ∼ Exp(µ = 1/10) (mean E[X] = 1/µ = 10).
(a) P [X > x] = e−µx for x ≥ 0, hence P [X > 20] = e−20/10 = e−2 = 0.1353.
(b) P [X ≤ x] = P [X < x] = 1 − e−µx for x ≥ 0, hence P [X < 5] = 1 − e−5/10 = 0.3935.
(c) By the definition of conditional probability,
P [X > 20|X > 10] = P [X > 20, X > 10]/P [X > 10] = P [X > 20]/P [X > 10] = e^{−20/10}/e^{−10/10} = e^{−1} = 0.3679
(d) We have
E[e^{−X}] = ∫ e^{−x} p(x) dx = ∫_0^∞ e^{−x} µe^{−µx} dx = µ e^{−(µ+1)x}/(−(µ + 1)) |_0^∞ = µ/(µ + 1) = 1/11
setting µ = 1/10.
(e) We have
E[X^3] = ∫_0^∞ x^3 p(x) dx = ∫_0^∞ x^3 µe^{−µx} dx = (1/µ^3) ∫_0^∞ t^3 e^{−t} dt
substituting t = µx. As discussed in the text, the integral evaluates to Γ(4) = 3! = 6, so that
E[X^3] = 6/µ^3 = 6000
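Both expectations are easy to confirm by simulation (a Monte Carlo sketch in Python; the simulations elsewhere in these solutions are in Matlab):

```python
import math
import random

random.seed(1)
mu = 0.1   # rate; mean 1/mu = 10
xs = [random.expovariate(mu) for _ in range(200_000)]

est_exp = sum(math.exp(-x) for x in xs) / len(xs)   # target mu/(mu+1) = 1/11
est_cube = sum(x**3 for x in xs) / len(xs)          # target 6/mu^3 = 6000
print(est_exp, est_cube)
```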
Problem 5.8 (a) For X = max (U1 , ..., Un ), we have X ≤ x if and only if U1 ≤ x, ..., Un ≤ x. The CDF of X is therefore given by
FX (x) = P [U1 ≤ x, ..., Un ≤ x] = P [U1 ≤ x] · · · P [Un ≤ x] = (FU (x))^n
(b) Similarly, for Y = min (U1 , ..., Un ), we have Y > y if and only if U1 > y, ..., Un > y, so that
P [Y > y] = P [U1 > y, ..., Un > y] = P [U1 > y] · · · P [Un > y] = (1 − FU (y))^n
(c) When the {Ui } are uniform over [0, 1], we have FU (u) = u, 0 ≤ u ≤ 1, hence
FX (x) = x^n , 0 ≤ x ≤ 1
Figure 4: The CDFs of the maximum and minimum of n i.i.d. uniform random variables.
(Of course, FX (x) = 0 for x < 0 and FX (x) = 1 for x ≥ 1, since X lies in [0, 1].) Figure 4(a)
plots the CDF of X. The probability mass shifts towards one as n increases. For 0 < x < 1,
xn → 0 as n → ∞, so that the limiting CDF concentrates all of its probability mass at one:
lim_{n→∞} FX (x) =
0, x < 1
1, x ≥ 1
Thus, in the limit of a large number of continuous random variables, we get a discrete random variable putting all of its probability mass at one.
(d) When the {Ui } are uniform over [0, 1], their minimum Y also lies in [0, 1]. The CDF is given
by
FY (y) = 1 − (1 − y)n , 0 ≤ y ≤ 1
Figure 4(b) plots the CDF of Y . The probability mass shifts towards zero as n increases. For 0 < y ≤ 1, we have 0 ≤ 1 − y < 1 and (1 − y)^n → 0, so that
lim_{n→∞} FY (y) =
0, y ≤ 0
1, y > 0
This limit is actually not a valid CDF, since it is not right-continuous at y = 0. But it does
show a unit jump at y = 0, indicating that all the probability mass is concentrated at zero. We
can prove the latter rigorously using more sophisticated techniques, but we do not attempt to
do this here.
Problem 5.9 Let U1 ∼ Exp(µ1 ) and U2 ∼ Exp(µ2 ) denote two independent exponential random
variables.
(a) We wish to express events involving min (U1 , U2 ) in terms of an intersection of events involving
U1 and U2 , in order to exploit the independence assumption. We have min (U1 , U2 ) > x if and
only if U1 > x and U2 > x. Thus, we have, for x ≥ 0,
P [min (U1 , U2 ) > x] = P [U1 > x, U2 > x] = P [U1 > x]P [U2 > x]
= e−µ1 x e−µ2 x = e−(µ1 +µ2 )x
which is in the form of the complementary CDF of an exponential random variable. That is,
min (U1 , U2 ) ∼ Exp(µ1 + µ2 ), hence the statement is True.
(b) Since max (U1 , U2 ) ≤ x if and only if U1 ≤ x and U2 ≤ x, we have, for x ≥ 0,
P [max (U1 , U2 ) ≤ x] = P [U1 ≤ x]P [U2 ≤ x] = (1 − e^{−µ1 x})(1 − e^{−µ2 x})
which is not in the 1 − e^{−µx} form of an exponential CDF. Thus, the maximum is not an exponential random variable, and the statement is False.
Problem 5.10 (a) This is the same as Problem 5.8(b), but we can see the explicit form here. Conditioning on U = u,
FY (y) = P [V /U ≤ y] = ∫_0^1 P [V ≤ uy] du
where we have used the independence of U, V , and the uniform distribution of U , in the last equality. We know that P [V ≤ v] = v, 0 ≤ v ≤ 1, and P [V ≤ v] = 1 for v ≥ 1. For y ≤ 1, uy ≤ 1, so that
FY (y) = ∫_0^1 uy du = y/2 , 0 ≤ y ≤ 1
For y > 1, uy > 1 and P [V ≤ uy] = 1 for u > 1/y, so that
FY (y) = ∫_0^{1/y} uy du + ∫_{1/y}^1 1 du = 1/(2y) + (1 − 1/y) = 1 − 1/(2y) , y > 1
Method 2: We can also compute the CDF pictorially as shown in Figure 5. The joint distribution of (U, V ) is uniform over the unit square in the (u, v) plane, so that the event V /U ≤ y whose probability we desire to find corresponds to the shaded regions below the line v = yu, and the probability itself is simply the area of these shaded regions. For y < 1, the region V /U ≤ y is triangular, with area y/2. For y > 1, the region V /U ≤ y is the complement of a triangular region of area 1/(2y).

Figure 5: Pictorial computation of CDF of V /U , where U, V are i.i.d. uniform over [0, 1].
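The resulting CDF, FY (y) = y/2 for 0 ≤ y ≤ 1 and 1 − 1/(2y) for y > 1, is easy to check by simulation (a sketch in Python):

```python
import random

random.seed(0)
n = 200_000
samples = []
for _ in range(n):
    u = random.random() or 1e-12   # guard: random() may return exactly 0.0
    v = random.random()
    samples.append(v / u)          # one draw of V/U

def emp_cdf(y):
    return sum(s <= y for s in samples) / n

print(emp_cdf(0.5), emp_cdf(2.0))   # theory: 0.25 and 0.75
```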
Problem 5.11 We have already derived the joint density of R = √(X1^2 + X2^2) = √Z and Θ = tan^{−1}(X2 /X1 ) in Example 5.4.3, so we will just use those results.
(a) We have p(r) = 2r e^{−r^2} I_{r≥0} from Example 5.4.3. Since Z = R^2 and dz/dr = 2r = 2√z, we have
p(z) = p(r)/|dz/dr| |_{r=√z} = e^{−z} I_{z≥0}
showing that Z ∼ Exp(1).
(b) The statement is True. Since R and Θ are independent, so are Z = R2 and Θ.
Problem 5.12 (a) For U uniform over [0, 1] and F a strictly increasing CDF, Y = F^{−1}(U ) satisfies P [Y ≤ y] = P [U ≤ F (y)] = F (y), so that Y has CDF F . (If F is not monotone increasing, then we have to define the inverse carefully, but the preceding result still holds.)
(b) For Y ∼ Exp(1/2) (mean 2), we have F (y) = 1 − e^{−y/2} , y ≥ 0. For F^{−1}(u) = y, we have u = F (y) = 1 − e^{−y/2} so that y = −2 log(1 − u). Thus, the transformation Y = −2 log(1 − U ) works. Since 1 − U is also uniform over [0, 1], we can replace it by U , so Y = −2 log U also works.
(c) The Matlab code for generating the histogram in Figure 6 is given below.
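The Matlab listing itself is not reproduced in this extract; a minimal sketch of the same inverse-CDF procedure (in Python here, rather than the Matlab used in the original) might look like:

```python
import math
import random

random.seed(0)
N = 2000
U = [random.random() for _ in range(N)]
Y = [-2 * math.log(1 - u) for u in U]   # inverse CDF of Exp(1/2); 1-u lies in (0,1]

mean_Y = sum(Y) / N
print(mean_Y)   # should be close to the theoretical mean 2
```

A histogram of Y would then reproduce the exponential shape in Figure 6.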
Figure 6: Histogram for exponential random variable of mean 2, simulated using uniform random
variables as in Problem 5.12.
Problem 5.13: While all the derivations in this problem are there in the text and other problems,
we do them from scratch here in order to reinforce the concepts.
(a) Since U1 and U2 are independent, so are Z = −2 ln U1 and Θ = 2πU2 . Clearly, Θ is uniform
over [0, 2π]. Since U1 takes values in [0, 1], the random variable Z takes values in [0, ∞). The
CDF of Z is given by
FZ (z) = P [−2 ln U1 ≤ z] = P [U1 ≥ e^{−z/2}] = 1 − e^{−z/2} , z ≥ 0
We recognize that Z ∼ Exp(1/2), an exponential random variable with mean E[Z] = 2.
Remark: This provides a good opportunity to emphasize that one does not always have to write
down explicit expressions for the joint CDF or density in order to specify the joint distribution.
Any specification that would allow one to write down such expressions if needed is sufficient. In
the preceding, we provided such a specification by stating that Z and Θ are independent, with
Z ∼ Exp(1/2) and Θ ∼ Unif [0, 2π].
(b) Let us do this from scratch instead of using results from prior examples and problems.
p(z, θ) = p(z)p(θ) = (1/2) e^{−z/2} × (1/(2π)) , z ≥ 0, 0 ≤ θ ≤ 2π    (1)
and
p(x1 , x2 ) = p(z, θ)/|det (J(x1 , x2 ; z, θ))| evaluated at z = x1^2 + x2^2 , θ = tan^{−1}(x2 /x1 )    (2)
Since x1 = √z cos θ and x2 = √z sin θ, we compute det (J(x1 , x2 ; z, θ)) = (1/2) cos^2 θ + (1/2) sin^2 θ = 1/2. Plugging this and (1) into (2), we obtain
that the joint density of X1 , X2 is given by
p(x1 , x2 ) = (1/(2π)) e^{−(x1^2 + x2^2)/2} , −∞ < x1 , x2 < ∞
We recognize that this is a product of two N(0, 1) densities, so that X1 , X2 are i.i.d. N(0, 1)
random variables.
(c) A code fragment using the preceding to generate N(0, 1) random variables is provided below,
and the histogram generated is shown in Figure 7.
Figure 7: Histogram based on 2000 N(0, 1) random variables generated using the method in
Problem 5.13.
N=1000; %number of points per uniform vector (gives 2N = 2000 Gaussian samples)
%generate uniform random variables
U1 = rand(N,1);
U2 = rand(N,1);
Z = -2*log(U1); %exponentials, mean 2
theta=2*pi*U2; % uniform over [0,2 pi]
%transform to standard Gaussian
X1=sqrt(Z).*cos(theta);
X2=sqrt(Z).*sin(theta);
X = [X1;X2];%2N independent N(0,1) random variables
hist(X,100); %histogram with hundred bins
(d) We estimate E[X 2 ] as the empirical mean by adding the following code fragment:
estimated_power = sum(X.^2)/(2*N)
The answer should be close to the theoretical answer E[X 2 ] = var(X) + (E[X])2 = 1 + 02 = 1.
(e) The desired probability P [X 3 + X > 3] can be estimated by adding the following code
fragment.
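The fragment itself is missing from this extract; a sketch of the estimate (in Python, regenerating the 2N samples via Box-Muller as in parts (a)-(c)) might be:

```python
import math
import random

random.seed(0)
N = 100_000
X = []
for _ in range(N):
    # Box-Muller, as in parts (a)-(c)
    z = -2 * math.log(random.random() or 1e-12)   # Exp(1/2) sample; guard u = 0
    theta = 2 * math.pi * random.random()
    X.append(math.sqrt(z) * math.cos(theta))
    X.append(math.sqrt(z) * math.sin(theta))

prob_estimate = sum(x**3 + x > 3 for x in X) / (2 * N)
print(prob_estimate)
```

Since x^3 + x is increasing, the event is equivalent to X exceeding the root of c^3 + c = 3, so the estimate should be near Q(1.213) ≈ 0.11.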
Figure 8: CDF of the Bernoulli random variable Y1 in Problem 5.14. The dots at the jumps at
0 and 1 indicate the right continuity of the CDF.
Problem 5.14 (a) Y1 takes values 0 and 1, with P [Y1 = 0] = P [U1 ≤ 0.7] = 0.7 and P [Y1 = 1] = 1 − P [Y1 = 0] = 0.3. Thus, the CDF is given by
F (y1 ) = P [Y1 ≤ y1 ] =
0, y1 < 0
0.7, 0 ≤ y1 < 1
1, y1 ≥ 1
(b) The plot for n = 20 and p = 0.3 is given in Figure 5.6.
(c)-(e) We skip the histogram, but show how to compute moments using simulation. Since Z is
a sum of n Bernoulli random variables, its first moment is simply
E[Z] = nE[Y1 ] = np
The second and third moments can be computed in a number of ways, including using moment
generating functions. We skip deriving these, but give the expressions in the code below. If we
run the code for n = 20 and p = 0.3, we will get E[Z] = 6, E[Z^2] = 40.2 (this was not asked for) and E[Z^3] = 293.3. Simulations with 10000 runs come very close to these values, but you should
check what happens with fewer runs, say 1000.
n=20;
p=0.3;
runs=10000;%number of simulation runs needed
U=rand(n,runs); %matrix of unif(0,1) random variables
Y=U > 0.7; %threshold to get matrix Bernoulli random variables P[1]=0.3
Z=sum(Y); %add n rows to get Bin(n,p) random variables
%simulation-based moment computations
first_moment_estimate = sum(Z)/runs
second_moment_estimate=sum(Z.^2)/runs
third_moment_estimate = sum(Z.^3)/runs
%analytical computation of moments
first_moment_analytical=n*p
second_moment_analytical=n*(n-1)*p^2+n*p
third_moment_analytical=n*(n-1)*(n-2)*p^3+3*n*(n-1)*p^2+n*p
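The closed-form moment expressions in the last three lines can be verified exactly against the Bin(n, p) pmf (a sketch in Python rather than Matlab):

```python
from math import comb

n, p = 20, 0.3
pmf = [comb(n, z) * p**z * (1 - p)**(n - z) for z in range(n + 1)]

# moments computed directly from the pmf
m1 = sum(z * q for z, q in enumerate(pmf))
m2 = sum(z**2 * q for z, q in enumerate(pmf))
m3 = sum(z**3 * q for z, q in enumerate(pmf))

# closed forms used in the Matlab code above
a1 = n * p
a2 = n * (n - 1) * p**2 + n * p
a3 = n * (n - 1) * (n - 2) * p**3 + 3 * n * (n - 1) * p**2 + n * p
print(m1, m2, m3)
```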
Problem 5.15 (a) The joint density must integrate to one, hence we must have
1 = K ∫_0^∞ ∫_0^∞ e^{−(2x^2 +y^2 )/2} dx dy + K ∫_{−∞}^0 ∫_{−∞}^0 e^{−(2x^2 +y^2 )/2} dx dy = 2K ∫_0^∞ ∫_0^∞ e^{−(2x^2 +y^2 )/2} dx dy
where we have used symmetry. The integrals in x and y separate out, and we have
∫_0^∞ e^{−(2x^2 )/2} dx = √(2πv1^2) ∫_0^∞ (1/√(2πv1^2)) e^{−x^2 /(2v1^2)} dx = √(2πv1^2) × (1/2) = √π/2
massaging the x integrand into an N (0, v1^2) density, with v1^2 = 1/2. Similarly, we can massage the y integrand into an N (0, v2^2) density with v2^2 = 1 to get
∫_0^∞ e^{−y^2 /2} dy = √(2πv2^2) × (1/2) = √(2π)/2
We therefore have 1 = 2K × (√π/2) × (√(2π)/2) = Kπ/√2, or K = √2/π.
(b) The marginal density of X is
p(x) = ∫ p(x, y) dy =
K ∫_0^∞ e^{−(2x^2 +y^2 )/2} dy, x ≥ 0
K ∫_{−∞}^0 e^{−(2x^2 +y^2 )/2} dy, x < 0
By symmetry, the y integrals evaluate to the same answer for the two cases above, so that p(x) ∝ e^{−x^2}. Thus, X ∼ N (0, 1/2) (the constant must evaluate out to whatever is needed for p(x) to integrate to one). A similar reasoning shows that Y ∼ N (0, 1).
(c) The event X 2 + X > 2 can be written as
X 2 + X − 2 = (X + 2)(X − 1) > 0
which happens if X + 2 > 0, X − 1 > 0, or X + 2 < 0, X − 1 < 0. That is, it happens if X > 1
or X < −2. Thus,
P [X^2 + X > 2] = P [X > 1] + P [X < −2] = Q( (1 − 0)/√(1/2) ) + Φ( (−2 − 0)/√(1/2) ) = Q(√2) + Q(2√2)
Plugging into the expression for the joint Gaussian density, we obtain
p(y1 , y2 ) = (1/(2π√27)) exp( −(1/54)[ 7(y1 − 4)^2 − 16(y1 − 4)(y2 + 1) + 13(y2 + 1)^2 ] )
Problem 5.17 (a) Using bilinearity of covariance, we can compute the required means and covariances (I used Matlab, even though I could have computed them by hand). We can now plug into formula (6.10) for the joint Gaussian density.
(c) P [Y2 > 2Y1 −1] = P [Z > 0], where Z = Y2 −2Y1 + 1 = aT Y + 1, where a = (−2, 1)T . Thus, Z
is Gaussian with mean E[Z] = aT mY + 1 = 26 and var(Z) = aT CY a = 425, where I have again
used Matlab. Since Y = AX, we could also express Z in terms of the original random vector
X: Z = aT AX + 1 = (AT a)T X + 1 = aT1 X + 1, where a1 = AT a = (−5, 5)T . We would then
obtain E[Z] = aT1 m + 1 and var(Z) = aT1 Ca1 , which, as can be checked, give the same answers
as before. Now that we know that Z ∼ N(26, 425), we have
P [Z > 0] = Q( (0 − 26)/√425 ) = 1 − Q( 26/√425 ) = 0.8964
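The numerical value follows from the standard Gaussian tail function Q(x) = (1/2) erfc(x/√2) (a sketch in Python):

```python
import math

def Q(x):
    """Gaussian tail probability Q(x) = P[N(0,1) > x]."""
    return 0.5 * math.erfc(x / math.sqrt(2))

mean_Z, var_Z = 26, 425
p = Q((0 - mean_Z) / math.sqrt(var_Z))   # P[Z > 0] for Z ~ N(26, 425)
print(round(p, 4))
```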
(d) We know that Y1 ∼ N(−8, 33) from (b). The desired probability can be written as
P [Y1^2 > 3Y1 + 10] = P [Y1^2 − 3Y1 − 10 > 0] = P [(Y1 − 5)(Y1 + 2) > 0]
= P [Y1 > 5, Y1 > −2] + P [Y1 < 5, Y1 < −2] = P [Y1 > 5] + P [Y1 < −2]
= Q( (5 − (−8))/√33 ) + Φ( (−2 − (−8))/√33 ) = Q(13/√33) + Φ(6/√33)
Problem 5.18 (a)-(c) The densities and contour plots are given by Figures 9-11.
(d) The contour plots are ellipses which are aligned along the x and y axes for ρ = 0, but are rotated for ρ ≠ 0. Deriving the specific relationship between σX^2 , σY^2 and ρ, and the ellipse major and minor axes and their alignment, is left as an exercise.
The code for producing these plots is provided below (the numerical values set for the parameters
are for part (c)).
var_x=4; %variance of X
var_y=1; %variance of Y
rho=0.5; %normalized correlation between X and Y, lies in (-1,1)
r12=rho*sqrt(var_x*var_y);
R=[var_x r12;r12 var_y];
[x,y]=meshgrid(-5:0.1:5);
Rinv=inv(R);
exp_arg= Rinv(1,1)*(x.^2)+Rinv(2,2)*(y.^2)+2*Rinv(1,2)*(x.*y);
normalization = 1/(2*pi*sqrt(det(R)));
z=normalization*exp(-0.5*exp_arg);
%density plot
figure;
surf(x,y,z);
xlabel(’x’);
ylabel(’y’);
zlabel(’Joint Gaussian Density’);
figure;
%contour plot
contour(x,y,z);
xlabel(’x’);
ylabel(’y’);
Figure 9: Joint Gaussian density with σX^2 = 1, σY^2 = 1, ρ = 0.
Figure 10: Joint Gaussian density with σX^2 = 1, σY^2 = 1, ρ = 0.5.
Problem 5.19 Assuming X, Y are zero mean and jointly Gaussian, Z = X − 2Y ∼ N (0, σZ^2), with
σZ^2 = var(Z) = cov(X − 2Y, X − 2Y ) = cov(X, X) − 4cov(X, Y ) + 4cov(Y, Y )
= σX^2 − 4ρσX σY + 4σY^2
(a) For parts (a)-(c) in Problem 5.18, we have σZ^2 = 1 − 0 + 4(1) = 5, σZ^2 = 1 − 4(0.5)(1)(1) + 4(1) = 3, and σZ^2 = 4 − 4(0.5)(2)(1) + 4(1) = 4, respectively.
(b) Z = X − 2Y and X are joint Gaussian, so they are independent if uncorrelated.
cov(Z, X) = cov(X − 2Y, X) = cov(X, X) − 2cov(X, Y ) = σX^2 − 2ρσX σY
Figure 11: Joint Gaussian density with σX^2 = 4, σY^2 = 1, ρ = 0.5.
which equals zero if σX = 2ρσY . This holds only in case (b) in Problem 5.18.
Problem 5.20 (a) We have cov(X, Y ) = ρσX σY = −3/4, so the covariance matrix is given by
CX = ( 1 −3/4 ; −3/4 1 )
where X = (X, Y )T .
(b) Z = a^T X, where a^T = (2 3), so that Z ∼ N (a^T mX = 8, a^T CX a = 4).
(c) In order to compute P [Z 2 − Z > 6] = P [Z 2 − Z − 6 > 0], we factorize
Z 2 − Z − 6 = Z 2 − 3Z + 2Z − 6 = (Z + 2)(Z − 3)
This expression is positive if both factors are positive (Z > −2 and Z > 3, which is equivalent
to Z > 3), or if both factors are negative (Z < −2 and Z < 3, which is equivalent to Z < −2).
These two events are mutually exclusive, hence
P [{Z > 3} or {Z < −2}] = P [Z > 3] + P [Z < −2] = Q( (3 − 8)/√4 ) + Φ( (−2 − 8)/√4 )
= Q(−5/2) + Φ(−10/2) = 1 − Q(5/2) + Q(5) = 0.9938
Problem 5.21 (a), (b) This was worked out in Example 5.4.3 for v 2 = 1. Using the same
reasoning, we obtain that the joint density is
p(r, φ) = ( r/(2πv^2) ) e^{−r^2 /(2v^2)} I_{r≥0} I_{φ∈[0,2π]}
so that R and Φ are independent, with Φ uniform over [0, 2π] and R a Rayleigh random variable
with density
p(r) = (r/v^2) e^{−r^2 /(2v^2)} I_{r≥0}
(c) Z = R^2 takes values in [0, ∞) with
p(z) = p(r)/|dz/dr| |_{r=√z} = [ (r/v^2) e^{−r^2 /(2v^2)} / (2r) ] |_{r=√z} = (1/(2v^2)) e^{−z/(2v^2)} I_{z≥0}
Thus, Z ∼ Exp(1/(2v^2)), i.e., it is exponential with mean 2v^2 .
(d) 20 dB below corresponds to a factor of 0.01. Using the well-known expression for the expo-
nential CDF, we have
P [Z ≤ 0.01(2v^2)] = 1 − e^{−0.01(2v^2)/(2v^2)} = 1 − e^{−0.01} ≈ 0.01
using the approximation ex ≈ 1 + x for |x| small. The answer does not depend on v 2 .
Problem 5.22 (a) The mean function is given by
mX (t) = E [2 sin(20πt + Θ)]
= (1/4)( 2 sin(20πt) + 2 sin(20πt + π/2) + 2 sin(20πt + π) + 2 sin(20πt + 3π/2) )
= (1/4)( 2 sin(20πt) + 2 cos(20πt) − 2 sin(20πt) − 2 cos(20πt) )
= 0
Since 2 sin θ1 sin θ2 = cos(θ1 − θ2 ) − cos(θ1 + θ2 ), the autocorrelation function is given by
RX (t1 , t2 ) = E[X(t1 )X(t2 )] = E [2 sin(20πt1 + Θ) 2 sin(20πt2 + Θ)]
= E [2 cos (20π(t1 − t2 )) − 2 cos (20π(t1 + t2 ) + 2Θ)]
= 2 cos (20π(t1 − t2 ))
since
E [cos (20π(t1 + t2 ) + 2Θ)]
= (1/4)( cos (20π(t1 + t2 )) + cos (20π(t1 + t2 ) + π) + cos (20π(t1 + t2 ) + 2π) + cos (20π(t1 + t2 ) + 3π) )
= (1/4)( cos (20π(t1 + t2 )) − cos (20π(t1 + t2 )) + cos (20π(t1 + t2 )) − cos (20π(t1 + t2 )) )
= 0
(b) X is WSS, since its mean function is constant and its autocorrelation function depends only on the time difference.
(c) A delayed version of X is given by
respectively.
(d) The time averaged mean and autocorrelation function of X can be computed exactly as in
the example of a sinusoid with random phase in the text, and match the ensemble averages in
(a).
(e) Yes, X is ergodic in mean and autocorrelation.
Problem 5.23 The three candidate functions are sketched in Figure 12.
(a) The triangle function is a convolution of two boxes: f1 (τ ) = I[−1/2,1/2] ∗ I[−1/2,1/2] . Its Fourier
transform is F1 (f ) = sinc^2 (f ). The latter is symmetric and nonnegative, and hence a valid PSD, so f1 is a valid autocorrelation function.
(b) The shifted triangle f2 (τ ) is not symmetric, and hence is not a valid autocorrelation function.
(c) Taking the Fourier transform of f3 , we obtain
F3 (f ) = F1 (f ) − (1/2) F1 (f ) ( e^{−j2πf} + e^{j2πf} ) = sinc^2 (f ) (1 − cos 2πf ) ≥ 0
Figure 12: The candidate functions f1 (τ ), f2 (τ ), and f3 (τ ).
since cosine is bounded above by one. Thus, F3 (f ) is symmetric and nonnegative, and hence
f3 (τ ) is a valid autocorrelation function.
Problem 5.24 (a) The mean function is
E [Xp (t)] = E [Xc (t)] cos 2πfc t − E [Xs (t)] sin 2πfc t
Since cosine and sine are linearly independent, the preceding can be constant over t if and only if E [Xc (t)] ≡ 0 and E [Xs (t)] ≡ 0. The autocorrelation function is
RXp (t1 , t2 ) = E [Xp (t1 )Xp (t2 )] = RXc (t1 , t2 ) cos 2πfc t1 cos 2πfc t2 + RXs (t1 , t2 ) sin 2πfc t1 sin 2πfc t2 − RXc ,Xs (t1 , t2 ) cos 2πfc t1 sin 2πfc t2 − RXs ,Xc (t1 , t2 ) sin 2πfc t1 cos 2πfc t2    (3)
Using trigonometric identities we can write these out in terms of t1 − t2 and t1 + t2 . We have
cos 2πfc t1 cos 2πfc t2 = (1/2) cos 2πfc (t1 − t2 ) + (1/2) cos 2πfc (t1 + t2 )
sin 2πfc t1 sin 2πfc t2 = (1/2) cos 2πfc (t1 − t2 ) − (1/2) cos 2πfc (t1 + t2 )
sin 2πfc t1 cos 2πfc t2 = (1/2) sin 2πfc (t1 − t2 ) + (1/2) sin 2πfc (t1 + t2 )
cos 2πfc t1 sin 2πfc t2 = −(1/2) sin 2πfc (t1 − t2 ) + (1/2) sin 2πfc (t1 + t2 )    (4)
so that RXp (t1 , t2 ) = A + B, where
A = (1/2) (RXc (t1 , t2 ) + RXs (t1 , t2 )) cos 2πfc (t1 − t2 ) − (1/2) (RXs ,Xc (t1 , t2 ) − RXc ,Xs (t1 , t2 )) sin 2πfc (t1 − t2 )    (5)
and
B = (1/2) (RXc (t1 , t2 ) − RXs (t1 , t2 )) cos 2πfc (t1 + t2 ) − (1/2) (RXs ,Xc (t1 , t2 ) + RXc ,Xs (t1 , t2 )) sin 2πfc (t1 + t2 )    (6)
In order for the autocorrelation function to depend on t1 − t2 alone, the undesired t1 + t2 terms
in (6) must vanish, which requires that the coefficients of the cosine and sine in the previous
equations must vanish:
RXc (t1 , t2 ) − RXs (t1 , t2 ) = 0, RXs ,Xc (t1 , t2 ) + RXc ,Xs (t1 , t2 ) = 0 (7)
Plugging into (5), we obtain that
A = RXc (t1 , t2 ) cos 2πfc (t1 − t2 ) − RXs ,Xc (t1 , t2 ) sin 2πfc (t1 − t2 )
This depends on t1 − t2 alone if RXc (t1 , t2 ) and RXs ,Xc (t1 , t2 ) depend on t1 − t2 alone. Putting these together with (7), we obtain that Xp is WSS if Xc , Xs are zero mean, jointly WSS with
RXc (τ ) = RXs (τ ) , RXc ,Xs (τ ) = −RXs ,Xc (τ )
where we have rewritten the conditions (7) in terms of τ = t1 − t2 using joint wide sense stationarity.
(b) Under the conditions derived in (a), we obtain that
RXp (τ ) = RXc (τ ) cos 2πfc τ − RXs ,Xc (τ ) sin 2πfc τ
Problem 5.25 As we see from Figure 13, the signal x(t) is periodic with period 2, and hence so
is its time-averaged autocorrelation function
Rx (τ ) = x(t)x(t − τ ) (11)
We can see this by replacing τ by τ + 2 in (11). Hence we only need to compute Rx (τ ) over
a single period, say for τ ∈ [−1, 1]. Furthermore, since Rx (τ ) is even, we can focus on τ ∈ [0, 1].
Figure 14: Autocorrelation function in Problem 5.25(a).

Finally, we only need to average over a period to compute Rx ; we can see this by replacing t by t + 2 in (11). Thus, we can restrict the average over t to a period. Figure 13 shows x(t) and
x(t − τ ), with t ranging over a period, and for an arbitrary τ ∈ [0, 1]. We see that
Rx (τ ) = (1/2) ∫_{one period} x(t)x(t − τ ) dt = (1/2)( −τ + (1 − τ ) − τ + (1 − τ ) ) = 1 − 2τ , 0 ≤ τ ≤ 1
We may now replace τ by its magnitude (Rx is an even function), and invoke periodicity, to
specify the autocorrelation function as follows:
Rx (τ ) = 1 − 2|τ | , |τ | ≤ 1 , Rx (τ ) = Rx (τ + 2)
The autocorrelation function is sketched in Figure 14.
(b) For finding the Fourier series of x(t), we use the differentiation trick to reduce it down to an
impulse train (see Example 2.4.1). The procedure is shown in Figure 15. Since z(t) = dx/dt is
a sum of two interleaved impulse trains, its Fourier series is easily computed as follows:
z(t) = dx/dt ↔ zk = (1/T0 ) ( 2 e^{−j2πkf0 (−1/2)} − 2 e^{−j2πkf0 (1/2)} )
where T0 = 2 is the period and f0 = 1/T0 = 1/2 is the fundamental frequency. Simplifying, we obtain zk = 2j sin(πk/2), and hence, for k ≠ 0,
xk = zk /(j2πkf0 ) = 2 sin(πk/2)/(πk) =
0, k even
±2/(πk), k odd
The PSD is given by
Sx (f ) = Σ_k |xk |^2 δ(f − kf0 ) = Σ_{k odd} ( 4/(π^2 k^2 ) ) δ(f − k/2)
(c) It is left as an exercise to check that the Fourier series of Rx (τ ) derived in (a) is given by
{|xk |2 }, so that the answers in (a) and (b) are consistent.
Problem 5.26: Typo in the problem. We should have set D to be uniform over [0, 2] (i.e., the
period of x(t)) in order to generate a WSS and stationary process X(t). In this case, taking
expectation over D effectively performs averaging over a period. Ensemble averages therefore
give the same answer as time averages of the sort computed in Problem 5.25. Details omitted.
Figure 16: PSDs of x1 and x2 in Problem 5.27.
Problem 5.27 We derive the PSDs below and plot them in Figure 16.
(a) x1 (t) = (n ∗ h)(t), where H(f ) = j2πf ↔ d/dt. Thus,
Sx1 (f ) = |H(f )|^2 Sn (f ) = 4π^2 f^2 I[−1,1] (f )
(b) x2 (t) = (n ∗ g)(t), where the difference filter g has transfer function
G(f ) = ( 1 − e^{−j2πf d} )/d
so that
|G(f )|^2 = G(f )G∗ (f ) = ( (1 − e^{−j2πf d})/d )( (1 − e^{j2πf d})/d ) = (2 − 2 cos 2πf d)/d^2 = (4/d^2 ) sin^2 (πf d) = 4π^2 f^2 ( sin(πf d)/(πf d) )^2 = 4π^2 f^2 sinc^2 (f d)
Thus,
Sx2 (f ) = |G(f )|2Sn (f ) = 4π 2 f 2 sinc2 (f d) I[−1,1] (f )
As d → 0, g(t) tends to the derivative, and Sx2 → Sx1 . For nonzero d, since sinc2 (f d) ≤ 1 for all
f , we have Sx2 (f ) ≤ Sx1 (f ) for all f , with strict inequality everywhere except at f = 0. Thus,
x2 has smaller power than x1 .
(c) The power of x1 is given by
Rx1 (0) = ∫_{−1}^{1} Sx1 (f ) df = ∫_{−1}^{1} 4π^2 f^2 df = 8π^2 /3 ≈ 26.3
The power of x2 (for d = 1/2) is given by
Rx2 (0) = ∫_{−1}^{1} Sx2 (f ) df = ∫_{−1}^{1} 4π^2 f^2 sinc^2 (f /2) df ≈ 16.0
where the integral is evaluated numerically.
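Both powers can be checked by numerical integration (a sketch in Python using a simple trapezoidal rule):

```python
import math

def trapezoid(f, a, b, n=100_000):
    # simple trapezoidal-rule integration of f over [a, b]
    h = (b - a) / n
    return h * (0.5 * f(a) + 0.5 * f(b) + sum(f(a + i * h) for i in range(1, n)))

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

P1 = trapezoid(lambda f: 4 * math.pi**2 * f**2, -1, 1)
P2 = trapezoid(lambda f: 4 * math.pi**2 * f**2 * sinc(f / 2)**2, -1, 1)
print(P1, P2)   # ~ 8*pi^2/3 = 26.3 and ~ 16.0
```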
Problem 5.28: Taking the Fourier transform of the autocorrelation function, we get
SX (f ) = ∫_0^∞ e^{−aτ} e^{−j2πf τ} dτ + ∫_{−∞}^0 e^{aτ} e^{−j2πf τ} dτ
= [ e^{−(a+j2πf )τ} /(−(a + j2πf )) ]_0^∞ + [ e^{(a−j2πf )τ} /(a − j2πf ) ]_{−∞}^0 = 1/(a + j2πf ) + 1/(a − j2πf )
= 2a/(a^2 + 4π^2 f^2 )
When X is passed through an ideal LPF of bandwidth W , i.e., with transfer function I[−W,W ](f ),
the output power is given by
Output Power = ∫_{−W}^{W} SX (f ) df = ∫_{−W}^{W} 2a/(a^2 + 4π^2 f^2 ) df
Make the standard substitution 2πf = a tan θ (so df = (a/(2π)) sec^2 θ dθ) to get
Output Power = ∫_{−tan^{−1}(2πW/a)}^{tan^{−1}(2πW/a)} ( 2a/(a^2 + a^2 tan^2 θ) ) (a/(2π)) sec^2 θ dθ = (2/π) tan^{−1}(2πW/a)
Setting this equal to 0.99 (the total power is RX (0) = 1),
(2/π) tan^{−1}(2πW/a) = 0.99
so that
W0.99 = (a/(2π)) tan(0.99π/2) ≈ 10.13a
which scales linearly with a. This is to be expected: since RX (τ ) = e−a|τ | , and the exponent is
dimensionless, we can think of 1/a as the unit of time, and hence a as the unit of frequency.
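A quick sanity check of the bandwidth formula (a sketch in Python, normalizing a = 1 since the answer scales linearly with a):

```python
import math

a = 1.0   # normalize; W0.99 scales linearly with a
W99 = a * math.tan(0.99 * math.pi / 2) / (2 * math.pi)
# fraction of total power contained in [-W99, W99], per the arctan formula
fraction = (2 / math.pi) * math.atan(2 * math.pi * W99 / a)
print(W99, fraction)
```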
Figure 17: PSD of m in Problem 5.29. The area under it gives the signal power at the channel
input.
Problem 5.29 (a) Signal power at channel input is the area under the PSD shown in Figure 17:
Pm = ∫_{−∞}^{∞} Sm (f ) df = (1/2) × base × height = (1/2) × 4 × 2 = 4
(b) The PSD of the signal at the channel output is given by Sm1 (f ) = Sm (f )|H(f )|2. The signal
power is therefore given by
Pm1 = ∫_{−∞}^{∞} Sm (f )|H(f )|^2 df = 2 ∫_0^2 Sm (f )|H(f )|^2 df
= 2 ∫_0^1 2(1 − f /2) 2^2 df + 2 ∫_1^2 2(1 − f /2) 1^2 df = 16(f − f^2 /4)|_0^1 + 4(f − f^2 /4)|_1^2
= 16 × (1 − 1^2 /4) + 4 × ((2 − 2^2 /4) − (1 − 1^2 /4)) = 13
(c) From (b), we know that the signal power at the equalizer input is Sin = 13, and the noise power there is Nin = 3/2, so that the SNR at the equalizer input is given by
SNRin = Sin /Nin = 13/(3/2) = 26/3 = 8.67
(d) The message sees the cascade of channel and equalizer, which is given by the transfer function
2I[−2,2] (f ), hence the signal power at the equalizer output is given by Sout = 22 Pm = 16. The
noise sees only the equalizer’s transfer function, say G(f ), so the noise power at the equalizer
output is given by
Nout = ∫_{−∞}^{∞} Sn (f )|G(f )|^2 df = 2 ∫_0^1 (1/4)(1)^2 df + 2 ∫_1^2 (1/4)(2)^2 df = 5/2
Note that the SNR at the equalizer output is actually smaller than that at the equalizer input.
Thus, in undoing the distortion of the channel, we have enhanced the noise. Such noise enhancement is also seen in digital communication over dispersive channels, as we see in Chapter 8.
(A more pleasant approach is to draw a picture of the PSD and find the appropriate area under
the curve–try it.)
Problem 5.31 (a) WGN has infinite power, so SNR at filter input is zero.
(b) For B ≤ 1, the signal power is given by
S = 2 ∫_0^B (1 − f ) df = 2(f − f^2 /2)|_0^B = 2B − B^2
SNR = S/N = (2B − B^2 )/(0.002B) = 500(2 − B)
This is decreasing in B, hence the SNR for B = 1/2 is better than for B = 1, even though the former distorts the signal.
Problem 5.32 (a) The autocorrelation function is given by
Ry (τ ) = (N0 /2)(h ∗ hM F )(τ ) = (N0 /2) ∫_{−∞}^{∞} h(t)h(t − τ ) dt
For τ ≥ 0, we obtain
Ry (τ ) = (N0 /2) ∫_τ^∞ e^{−t/T0} e^{−(t−τ )/T0} dt = (N0 /2) e^{τ /T0} ∫_τ^∞ e^{−2t/T0} dt = (N0 T0 /4) e^{−τ /T0} , τ ≥ 0
Since the autocorrelation function is symmetric, we can replace τ by its magnitude to get
Ry (τ ) = (N0 T0 /4) e^{−|τ |/T0}
The power equals Ry (0) = N0 T0 /4. The PSD can be obtained by directly taking the Fourier transform of Ry , but we choose to use the formula
Sy (f ) = (N0 /2) |H(f )|^2
where
H(f ) = ∫_{−∞}^{∞} h(t) e^{−j2πf t} dt = ∫_0^∞ e^{−t/T0} e^{−j2πf t} dt = [ e^{−(j2πf +1/T0 )t} /(−(j2πf + 1/T0 )) ]_0^∞
= 1/(j2πf + 1/T0 ) = T0 /(j2πf T0 + 1)
(b) n, and therefore y, is a Gaussian random process. Thus, y(0) and y(t0 ) − (1/2)y(0) are jointly Gaussian, and are independent if they are uncorrelated. The covariance is given by
cov(y(t0 ) − (1/2)y(0), y(0)) = Ry (t0 ) − (1/2)Ry (0) = (N0 T0 /4)( e^{−|t0 |/T0} − 1/2 )
This equals zero for
|t0 | = T0 ln 2 ≈ 0.69T0
Figure 18: Filter transfer functions for Problem 5.33: (a) a triangle H(f) of height 1 supported on [−1, 1]; (b) triangles of height 1/2 supported on [49, 51] and [−51, −49].
Problem 5.33 (a) The filter transfer function is as shown in Figure 18(a), so that the noise power at the output
is given by

Pn = N0 ∫_0^1 (1 − f)² df = N0/3
(b) The filter transfer function is as shown in Figure 18(b), and the noise power at the output is
given by

Pn = N0 ∫_49^51 |H(f)|² df = N0 ∫_49^51 ((1/2)(1 − |f − 50|))² df = 2N0 (1/2)² ∫_0^1 (1 − f)² df = N0/6
after doing a change of variables to evaluate the integral. We could also have recognized that
||h||² in (b) is half that in (a): the filter in (b) has a passband impulse response whose I component
is the impulse response in (a) (with Q component zero), so that the energy in the impulse
response in (b) is half that in (a).
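Remark: both noise powers can be verified with a midpoint-rule integration (a sketch, with N0 = 1 chosen for illustration):

```python
N0 = 1.0
df = 1e-4
mids = [k * df + df / 2 for k in range(int(1 / df))]  # midpoints on (0, 1)
base = sum((1 - f) ** 2 for f in mids) * df           # int_0^1 (1-f)^2 df = 1/3
Pn_a = (N0 / 2) * 2 * base               # (a): one triangle of height 1 on [-1,1]
Pn_b = (N0 / 2) * 2 * (0.5 * base)       # (b): two bands, each contributing base/2
print(Pn_a, Pn_b)                        # ≈ 1/3 and 1/6
```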
Problem 5.34 (a) The filter is specified as h(t) = I[−1,1](t) ↔ H(f) = 2 sinc(2f). We have

cov(y(m), y(n)) = Ry(m − n)

where Ry(τ) = σ²(h ∗ hMF)(τ) is sketched in Figure 19. Thus, Y = (y(1), y(2), y(3))^T ∼ N(0, C)
where

        [ Ry(0) Ry(1) Ry(2) ]   [ 2 1 0 ]
    C = [ Ry(1) Ry(0) Ry(1) ] = [ 1 2 1 ]
        [ Ry(2) Ry(1) Ry(0) ]   [ 0 1 2 ]
(c) We can write this as Z = a^T Y ∼ N(0, σZ²), where a = (1, −2, 1)^T and σZ² = a^T C a = 4.
Alternatively, we can compute the variance as σZ² = cov(y(1) − 2y(2) + y(3), y(1) − 2y(2) + y(3))
and use the bilinearity of covariance.
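Remark: the quadratic form a^T C a is easily checked numerically:

```python
import numpy as np

C = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])
a = np.array([1.0, -2.0, 1.0])   # Z = y(1) - 2y(2) + y(3)
var_Z = a @ C @ a
print(var_Z)                     # 4.0
```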
Y2 = ∫_1^3 y(t) dt = ∫_1^3 s(t) dt + ∫_1^3 n(t) dt = 2 + N2

where N1 = ∫_0^2 n(t) dt and N2 = ∫_1^3 n(t) dt are jointly Gaussian and zero mean, by the Gaussianity
of n. Thus, Y1 and Y2 are jointly Gaussian, with means E[Y1] = E[Y2] = 2, and covariances

cov(Y1, Y1) = cov(N1, N1) = E[∫_0^2 n(t) dt ∫_0^2 n(u) du] = ∫_0^2 ∫_0^2 E[n(t)n(u)] dt du

Since E[n(t)n(u)] = σ² δ(t − u), this gives cov(Y1, Y1) = ∫_0^2 σ² dt = 2σ². Similarly,
cov(Y2, Y2) = 2σ², and cov(Y1, Y2) = cov(N1, N2) = σ², since the intervals of integration overlap on [1, 2].
Problem 5.36 We have h(t) = p(t) − p(t − 1), where p(t) = I[0,1](t) and

Rz(τ) = σ²(h ∗ hMF)(τ)

Note that hMF(t) = h(−t) = p(−t) − p(−t − 1) = pMF(t) − pMF(t + 1), so that we can break up
the desired convolution as follows:

(h ∗ hMF)(τ) = (p(τ) − p(τ − 1)) ∗ (pMF(τ) − pMF(τ + 1))
= p(τ) ∗ pMF(τ) − p(τ) ∗ pMF(τ + 1) − p(τ − 1) ∗ pMF(τ) + p(τ − 1) ∗ pMF(τ + 1)
= a(τ) − a(τ + 1) − a(τ − 1) + a(τ) = 2a(τ) − a(τ + 1) − a(τ − 1)

where a(τ) = (p ∗ pMF)(τ) is a tent function centered at the origin. This computation, and the
resulting autocorrelation function, are depicted in Figure 20.
Figure 20: Computation of autocorrelation function in Problem 5.36.
(b) z(49) and z(50) are linear transformations of a zero mean Gaussian process, and are therefore
zero mean Gaussian random variables with covariance matrix given by

        [ Rz(|49 − 49|) Rz(|49 − 50|) ]   [  2 −1 ]
    C = [ Rz(|50 − 49|) Rz(|50 − 50|) ] = [ −1  2 ]

(c) As in (b), z(49) and z(52) are zero mean, jointly Gaussian, each with variance Rz(0) = 2,
but their covariance is Rz(|49 − 52|) = Rz(3) = 0, hence they are independent. Thus, they are
i.i.d. N(0, 2).
(d) The required probability can be written as P[Y > 0], where Y = 2z(50) − z(49) + z(51) is
zero mean Gaussian. Hence P[Y > 0] = 1/2 by the symmetry of the Gaussian density around its
mean.
(e) The required probability can be written as P[Y > 2] = Q((2 − 0)/σY). We now compute the
variance of Y using bilinearity of covariance:

var(Y) = cov(2z(50) − z(49) + z(51), 2z(50) − z(49) + z(51))
= 4Rz(0) + Rz(0) + Rz(0) − 4Rz(1) + 4Rz(1) − 2Rz(2)
= 6Rz(0) − 2Rz(2) = 12

Thus, P[Y > 2] = Q(2/√12) = Q(1/√3).
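Remark: the bilinearity computation can be mechanized; the sketch below sums a_s a_t Rz(|s − t|) over all pairs of sample times and also evaluates the resulting tail probability Q(2/√12):

```python
import math

Rz = {0: 2.0, 1: -1.0, 2: 0.0}              # autocorrelation values from (a)
coeffs = {49: -1.0, 50: 2.0, 51: 1.0}       # Y = 2z(50) - z(49) + z(51)
var_Y = sum(a * b * Rz[abs(s - t)]
            for s, a in coeffs.items() for t, b in coeffs.items())

def Q(x):                                   # Gaussian tail probability
    return 0.5 * math.erfc(x / math.sqrt(2))

print(var_Y, Q(2 / math.sqrt(var_Y)))       # 12.0 and ≈ 0.282
```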
Figure 21: Computation of autocorrelation function in Problem 5.37.
(b) The random vector Z = (z(0), z(1), z(2))^T is a linear transformation of a zero mean Gaussian
random process, and is therefore a zero mean Gaussian random vector with covariance matrix
given by

        [ Rz(|0 − 0|) Rz(|0 − 1|) Rz(|0 − 2|) ]      [ 5 2 0 ]
    C = [ Rz(|1 − 0|) Rz(|1 − 1|) Rz(|1 − 2|) ] = σ² [ 2 5 2 ]
        [ Rz(|2 − 0|) Rz(|2 − 1|) Rz(|2 − 2|) ]      [ 0 2 5 ]
(c) We wish to compute P[Y > 4], where Y = z(0) − z(1) + z(2) is zero mean Gaussian with
variance computed as

var(Y) = cov(z(0) − z(1) + z(2), z(0) − z(1) + z(2))
= Rz(|0 − 0|) + Rz(|1 − 1|) + Rz(|2 − 2|) − 2Rz(|0 − 1|) − 2Rz(|1 − 2|) + 2Rz(|0 − 2|)
= 3Rz(0) − 4Rz(1) + 2Rz(2) = 3 × 5 − 4 × 2 + 2 × 0 = 7

We therefore obtain P[Y > 4] = Q(4/√7).
Problem 5.38 (a) The random process z(t) is zero mean Gaussian, being a linear transformation
of the zero mean Gaussian process n. We can therefore compute the covariance of samples of
z(t) as follows:

cov(z(t1), z(t2)) = (N0/2)(h ∗ hMF)(t1 − t2)

since hMF(t) = h∗(−t).
(b) For h(t) = I[0,1](t), the convolution h ∗ hMF is the tent function a(τ) shown in Figure 22.
The samples Z = (z[1], z[2], z[3]) have covariances given by cov(z[m], z[n]) = (N0/2) a((m − n)Ts) =
(N0/2) a(|m − n|Ts). For Ts = 1/2, we get the covariance matrix

               [ a(0)   a(1/2) a(1)   ]          [ 1   1/2 0   ]
    C = (N0/2) [ a(1/2) a(0)   a(1/2) ] = (N0/2) [ 1/2 1   1/2 ]
               [ a(1)   a(1/2) a(0)   ]          [ 0   1/2 1   ]
(c) For Ts = 1, cov(z[m], z[n]) = (N0/2) a(|m − n|) = 0 for m ≠ n, so that the samples are independent
(since they are jointly Gaussian and uncorrelated). As before, var(z[m]) = (N0/2) a(0) = N0/2, so that
{z[m]} are i.i.d. N(0, N0/2).
(d) The samples are independent if cov(z[n], z[m]) = 0 for all m ≠ n, which happens if
(h ∗ hMF)((n − m)Ts) = ∫ h(t)h∗(t − (n − m)Ts) dt = 0 for m ≠ n. Thus, we need q(t) = (h ∗ hMF)(t) ↔
Q(f) = H(f)HMF(f) = H(f)H∗(f) = |H(f)|² to be Nyquist at rate 1/Ts, so that h(t) ↔ H(f) must be
square root Nyquist at rate 1/Ts.
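Remark: the independence conditions in (c) and (d) are easy to check for the tent function of (b): it vanishes at all nonzero integer lags (Ts = 1) but not at half-integer lags (Ts = 1/2):

```python
def a(tau):
    return max(0.0, 1.0 - abs(tau))   # tent (h * h_MF)(tau) for h = I_[0,1]

print([a(k * 1.0) for k in range(3)])   # [1.0, 0.0, 0.0]: uncorrelated at Ts = 1
print([a(k * 0.5) for k in range(3)])   # [1.0, 0.5, 0.0]: correlated at Ts = 1/2
```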
Figure 23: The signal and its matched filter for Problem 5.39.
Figure 24: The convolution of the signal with its matched filter for Problem 5.39(b).
Problem 5.39 (a) The signal s(t) and its matched filter sMF(t) = s(−t) are sketched in Figure
23.
(b) In order to compute s ∗ sMF, we break them into smaller pieces. Specifically, we see from Figure
23 that we can write s(t) = p(t) − p(t − 1) − 2p(t − 2), where p(t) = I[0,1](t). The matched filter
impulse response can therefore be written as sMF(t) = s(−t) = p(−t) − p(−t − 1) − 2p(−t − 2) =
pMF(t) − pMF(t + 1) − 2pMF(t + 2), where pMF(t) = p(−t). The convolution can therefore be
written as
(s ∗ sMF)(t) = (p(t) − p(t − 1) − 2p(t − 2)) ∗ (pMF(t) − pMF(t + 1) − 2pMF(t + 2))
= p(t) ∗ pMF(t) + p(t − 1) ∗ pMF(t + 1) + 4p(t − 2) ∗ pMF(t + 2)
  − p(t) ∗ pMF(t + 1) − 2p(t) ∗ pMF(t + 2) − p(t − 1) ∗ pMF(t) + 2p(t − 1) ∗ pMF(t + 2)
  − 2p(t − 2) ∗ pMF(t) + 2p(t − 2) ∗ pMF(t + 1)
= 6a(t) − a(t + 1) − 2a(t + 2) − a(t − 1) + 2a(t + 1) − 2a(t − 2) + 2a(t − 1)
= 6a(t) + a(t + 1) + a(t − 1) − 2a(t + 2) − 2a(t − 2)

where a(t) = (p ∗ pMF)(t) is a tent function centered at the origin. The computation and the
final waveform are shown in Figure 24.
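Remark: since s is a train of unit boxes with coefficients (1, −1, −2), the convolution s ∗ sMF reduces to the discrete autocorrelation of that coefficient sequence, with each entry weighting a shifted tent a(t − k); this is quickly verified:

```python
import numpy as np

coeffs = np.array([1.0, -1.0, -2.0])        # s = p - p(.-1) - 2 p(.-2)
corr = np.convolve(coeffs, coeffs[::-1])    # weights of a(t - k), k = -2..2
print(corr.tolist())                        # [-2.0, 1.0, 6.0, 1.0, -2.0]
```

The center weight 6 equals ||s||², and the weights ±1, −2 match 6a(t) + a(t ± 1) − 2a(t ± 2) above.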
(c) Using the decomposition in (b), we have pMF(t) = I[−1,0](t) = h(t + 1), so that sMF(t) =
pMF(t) − pMF(t + 1) − 2pMF(t + 2) = h(t + 1) − h(t + 2) − 2h(t + 3). Thus, (x ∗ sMF)(t) =
y(t + 1) − y(t + 2) − 2y(t + 3), where y(t) = (x ∗ h)(t).
Figure 25: Effective correlators corresponding to samples and linear combinations of samples at
filter output.
Figure 26: Approximating correlation with g(t) using three samples at the output of h(t) =
I[0,1] (t).
where

g(t) = Σ_{i=1}^{n} αi h(ti − t) = Σ_{i=1}^{n} αi hMF(t − ti)
(b) The triangular waveform can be approximated by three boxes, as shown in Figure 26, with
sampling times {ti} given by 0, 1/2, 1 and αi ≡ 1.
Figure 27: Sampling at t0 = 1/2 gives the best match between g(t) = h(t0 − t) and s(t).
Figure 28: Choosing three samples and combining them to produce an effective correlator g
which approximates the triangular shape of the signal s.
Problem 5.42 We have a signal corrupted by white noise with PSD N0/2 = σ² = 0.1.
(a) The integrator output can be written as the output of a correlator, ⟨y, g⟩ = ⟨s, g⟩ + ⟨n, g⟩,
where g(t) = I[−1,1](t). In general, the signal contribution at the output of a correlator g is ⟨s, g⟩,
and the noise contribution N = ⟨n, g⟩ ∼ N(0, σ²||g||²). The SNR at the output of a correlator g
is therefore given by

SNR = |⟨s, g⟩|²/E[N²] = |⟨s, g⟩|²/(σ²||g||²) = (1/σ²)|⟨s, g/||g||⟩|²    (12)

For g(t) = I[−1,1](t), we have ⟨s, g⟩ = 1 (area under triangle of base 2 and height 1) and ||g||² = 2.
Plugging into (12) and setting σ² = 0.1, we obtain SNR = 5.
(b) As stated in Theorem 5.7.1, the SNR is maximized by correlating against the signal s(t).
Setting g(t) = s(t) in (12), we have ⟨s, g⟩ = ||s||² and N = ⟨n, s⟩ ∼ N(0, σ²||s||²), which gives

SNRmax = ||s||²/σ² = 20/3 ≈ 6.67

since ||s||² = 2/3 and σ² = 0.1.
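Remark: the SNRs in (a) and (b) can be reproduced numerically from (12) (a sketch using a midpoint-rule discretization of the inner products):

```python
import numpy as np

dt = 1e-4
t = np.arange(-1, 1, dt) + dt / 2     # midpoints on (-1, 1)
s = 1 - np.abs(t)                     # triangular signal
sigma2 = 0.1

def snr(g):                           # formula (12)
    return (np.sum(s * g) * dt) ** 2 / (sigma2 * np.sum(g * g) * dt)

snr_a = snr(np.ones_like(t))          # part (a): g = I_[-1,1]
snr_b = snr(s)                        # part (b): matched correlator g = s
print(snr_a, snr_b)                   # ≈ 5 and 20/3
```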
(c) The filter output at time t0 is given by

(y ∗ h)(t0) = ∫ y(t)h(t0 − t) dt = ⟨y, g⟩

where g(t) = h(t0 − t). From (12) (which repeats the discussion before Theorem 5.7.1), we see
that we should choose the sampling time such that g(t) "best matches" s(t) in its shape, in the
sense of maximizing |⟨s, g/||g||⟩|², subject to whatever constraints we are placing on the choice of g.
If unconstrained, g = s (or any scalar multiple thereof) is optimal, but here we are constraining
g to take the form g(t) = h(t0 − t). From Figure 27, we see that the best match occurs when t0 = 1/2. We
have ||g||² = 1 and

⟨s, g⟩ = ∫_{−1/2}^{1/2} (1 − |t|) dt = 3/4

Plugging into (12), we obtain SNR = (3/4)²/0.1 = 5.625.
(d) If we can now take linear combinations of samples at the output of the filter, we obtain

Σ_i ai (y ∗ h)(ti) = Σ_i ai ∫ y(t)h(ti − t) dt = ⟨y, g⟩

where

g(t) = Σ_i ai h(ti − t)

We now have to choose the sampling times and combination coefficients so that the shape of g
matches up well with that of s. For example, t0 = 0, t1 = 1/2 and t2 = 1 with a0 = a1 = a2 = 1
works well, as shown in Figure 28. For the scaling shown in the figure, we have ||g||² = 5 and
⟨s, g⟩ = 7/4. Plugging into (12), we get SNR = 6.125, which is slightly better than in (c). We
could certainly play with the coefficients to try to get a better matching of shape, but we leave
it at this.
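Remark: the values ⟨s, g⟩ = 7/4, ||g||² = 5 and SNR = 6.125 can be verified numerically (a sketch; the three boxes are h(−t), h(1/2 − t) and h(1 − t) with h = I[0,1]):

```python
import numpy as np

dt = 1e-4
t = np.arange(-1.5, 1.5, dt) + dt / 2
s = np.maximum(0.0, 1 - np.abs(t))              # triangular signal

def box(lo, hi):                                # indicator I_[lo,hi](t)
    return ((t > lo) & (t < hi)).astype(float)

g = box(-1, 0) + box(-0.5, 0.5) + box(0, 1)     # h(-t) + h(1/2-t) + h(1-t)
ip = np.sum(s * g) * dt                         # <s, g>
energy = np.sum(g * g) * dt                     # ||g||^2
print(ip, energy, ip ** 2 / (0.1 * energy))     # ≈ 1.75, 5.0, 6.125
```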
Take-away: Even when implementation constraints prevent us from using the optimal correlator
g = s, we can construct approximations to s within these constraints by trying to match the
shape of s as closely as we can.