
Lecture 1

Capacity of the Gaussian Channel

• Basic concepts in information theory: Appendix B


• Capacity of the Gaussian channel: Appendix B, Ch. 5.1–3


Entropy and Mutual Information I

• Entropy for a discrete random variable X with alphabet 𝒳 and pmf

  p(x) ≜ Pr(X = x), ∀x ∈ 𝒳,

  H(X) ≜ −∑_{x∈𝒳} p(x) log p(x)

• H(X) = the average amount of uncertainty removed when observing X = the information obtained
• It holds that 0 ≤ H(X) ≤ log |𝒳|
• Entropy for an n-tuple X₁ⁿ ≜ (X₁, . . . , Xₙ)

  H(X₁ⁿ) = H(X₁, . . . , Xₙ) = −∑_{x₁ⁿ} p(x₁ⁿ) log p(x₁ⁿ)
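• As a quick numerical illustration (not from the slides; the helper name and example pmf are my own), entropy of a small pmf in Python:

```python
import numpy as np

def entropy_bits(p):
    """H(X) = -sum_x p(x) log2 p(x); zero-probability symbols contribute 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

pmf = [0.5, 0.25, 0.125, 0.125]       # a biased 4-letter alphabet
print(entropy_bits(pmf))              # 1.75 bits
print(entropy_bits([0.25] * 4))       # 2.0 bits = log2|X|, the uniform maximum
```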



Entropy and Mutual Information II
• Conditional entropy of Y given X = x

  H(Y|X = x) ≜ −∑_{y∈𝒴} p(y|x) log p(y|x)

• H(Y|X = x) = the average information obtained when observing Y, when it is already known that X = x
• Conditional entropy of Y given X (on the average)

  H(Y|X) ≜ ∑_{x∈𝒳} p(x) H(Y|X = x)

• Define g(x) = H(Y|X = x). Then H(Y|X) = E[g(X)].


• Chain rule
H(X, Y ) = H(Y |X) + H(X)
(cf. p(x, y) = p(y|x)p(x))


Entropy and Mutual Information III

• Mutual information

  I(X; Y) ≜ ∑_x ∑_y p(x, y) log( p(x, y) / (p(x)p(y)) )

• I(X; Y) = the average information about X obtained when observing Y (and vice versa)



Entropy and Mutual Information IV
[Venn diagram: H(X) and H(Y) as two overlapping circles inside H(X, Y); the overlap is I(X; Y), the remaining parts are H(X|Y) and H(Y|X)]

I(X; Y ) = I(Y ; X)
I(X; Y ) = H(Y ) − H(Y |X) = H(X) − H(X|Y )
I(X; Y ) = H(X) + H(Y ) − H(X, Y )
I(X; X) = H(X)
H(X, Y ) = H(X) + H(Y |X) = H(Y ) + H(X|Y )
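• A small numeric check (my own toy joint pmf, not from the slides) of the identities above:

```python
import numpy as np

def H(p):
    """Entropy in bits of a pmf given as an array (joint or marginal)."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Toy joint pmf p(x, y) on a 2 x 3 alphabet (rows: x, columns: y)
pxy = np.array([[0.10, 0.20, 0.10],
                [0.25, 0.05, 0.30]])
px, py = pxy.sum(axis=1), pxy.sum(axis=0)
HX, HY, HXY = H(px), H(py), H(pxy)

# H(Y|X) = sum_x p(x) H(Y | X = x), from the conditional pmfs p(y|x) = p(x,y)/p(x)
HY_given_X = float(sum(px[i] * H(pxy[i] / px[i]) for i in range(len(px))))

print(HX + HY_given_X, HXY)              # chain rule: H(X) + H(Y|X) = H(X,Y)
print(HX + HY - HXY, HY - HY_given_X)    # I(X;Y) = H(X)+H(Y)-H(X,Y) = H(Y)-H(Y|X)
```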


Entropy and Mutual Information V

• For a continuous random variable X with pdf f(x), the differential entropy is

  h(X) = −∫ f(x) log f(x) dx

• For E[X²] = σ²,

  h(X) ≤ ½ log(2πeσ²)  [bits]

  with equality only for X Gaussian
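• A Monte Carlo sanity check (my own sketch, arbitrary σ²): the sample average of −log₂ f(X) over Gaussian draws approaches ½ log₂(2πeσ²):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 2.0
x = rng.normal(0.0, np.sqrt(sigma2), size=200_000)

# h(X) = E[-log2 f(X)] with f the N(0, sigma2) pdf
log2_f = -0.5 * np.log2(2 * np.pi * sigma2) - x**2 / (2 * sigma2) * np.log2(np.e)
print(np.mean(-log2_f), 0.5 * np.log2(2 * np.pi * np.e * sigma2))   # both ~ 2.55 bits
```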
• Mutual information,

  I(X; Y) = ∬ f(x, y) log( f(x, y) / (f(x)f(y)) ) dx dy



Jensen’s Inequality

• For f : ℝⁿ → ℝ convex and a random X ∈ ℝⁿ,

f (E[X]) ≤ E[f (X)]

• Reverse inequality for f concave


• For f strictly convex (or strictly concave),

f (E[X]) = E[f (X)] =⇒ Pr(X = E[X]) = 1


Fano’s Inequality

• Consider the following estimation problem (discrete RVs):
  X: random variable of interest
  Y: observed random variable
  X̂ = f(Y): estimate of X based on Y
• Define the probability of error as

  Pe = Pr(X̂ ≠ X)

• Fano's inequality lower bounds Pe:

  h(Pe) + Pe log(|𝒳| − 1) ≥ H(X|Y)

  [h(x) = −x log x − (1 − x) log(1 − x) is the binary entropy function]
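• A hedged numerical example (not in the slides; function names and values are my own): for a given H(X|Y) and alphabet size, the smallest Pe consistent with Fano's inequality can be found by a simple grid search:

```python
import numpy as np

def h2(p):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def fano_lower_bound(H_cond, alphabet_size):
    """Smallest Pe (grid search) satisfying h2(Pe) + Pe*log2(|X|-1) >= H(X|Y)."""
    for pe in np.linspace(0.0, 1.0, 100001):
        if h2(pe) + pe * np.log2(alphabet_size - 1) >= H_cond:
            return pe
    return 1.0

# Example: |X| = 8 and H(X|Y) = 1 bit => any estimator errs with probability at least ~0.14.
print(fano_lower_bound(1.0, 8))
```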



The Gaussian Channel I

[Block diagram: ω → encoder α → xₘ → ⊕ (noise wₘ) → yₘ → decoder β → ω̂]

• At time m: transmitted symbol xₘ ∈ 𝒳 = ℝ, received symbol yₘ ∈ 𝒴 = ℝ, noise wₘ ∈ ℝ
• The noise {wₘ} is i.i.d. Gaussian N(0, σ²)
• A memoryless Gaussian transition density (noise variance σ²),

  f(y|x) = (1/√(2πσ²)) exp( −(y − x)²/(2σ²) )


The Gaussian Channel II

• Coding for the Gaussian channel, subject to an average power constraint
• Equally likely information symbols ω ∈ I_M = {1, . . . , M}
• An (M, n) code with power constraint P
  1. Power-limited codebook
     C = { x₁ⁿ(1), . . . , x₁ⁿ(M) },
     with n⁻¹ ∑_{m=1}^n xₘ²(i) ≤ P,  i ∈ I_M
  2. Encoding: ω = i ⇒ x₁ⁿ = α(i) = x₁ⁿ(i) transmitted
  3. Decoding: y₁ⁿ received ⇒ ω̂ = β(y₁ⁿ)

• One symbol → one codeword → n channel uses
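• A minimal end-to-end sketch (my own construction, not from the slides): a random codebook rescaled to meet the power constraint, one transmission, and minimum-distance (ML) decoding, which here stands in for the threshold decoder used in the achievability proof on later slides; all parameter values are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
M, n, P, sigma2 = 16, 128, 1.0, 0.1   # rate R = log2(M)/n = 4/128 bits per channel use

# Codebook: random Gaussian codewords, each rescaled so that n^-1 * sum_m x_m^2(i) = P.
codebook = rng.normal(size=(M, n))
codebook *= np.sqrt(n * P) / np.linalg.norm(codebook, axis=1, keepdims=True)

# Encoding: omega = i  =>  transmit x_1^n(i); the channel adds i.i.d. N(0, sigma2) noise.
omega = 7
y = codebook[omega] + rng.normal(0.0, np.sqrt(sigma2), size=n)

# Decoding: minimum-distance decoding (my substitution for the slides' threshold rule).
omega_hat = int(np.argmin(np.linalg.norm(codebook - y, axis=1)))
print(omega, omega_hat)   # at this low rate and SNR = 10, decoding succeeds w.h.p.
```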



Capacity

• A rate

  R ≜ (log M)/n

  is achievable (subject to the power constraint P) if there exists a sequence of (⌈2^{nR}⌉, n) codes with codewords satisfying the power constraint, and such that the average probability of error

  Pe⁽ⁿ⁾ = Pr(ω̂ ≠ ω)

  tends to 0 as n → ∞.
• The capacity C is the supremum of all achievable rates.


A Lower Bound for C I

• Gaussian random code design: Fix

  f(x) = (1/√(2π(P − ε))) exp( −x²/(2(P − ε)) )

  for a small ε > 0, and draw a codebook Cₙ = { x₁ⁿ(1), . . . , x₁ⁿ(M) } i.i.d. according to f(x₁ⁿ) = ∏ₘ f(xₘ).


• Mutual information: Let

  Iε = ∬ f(y|x)f(x) log( f(y|x) / ∫ f(y|x)f(x)dx ) dx dy = ½ log(1 + (P − ε)/σ²)

• the mutual info between input and output when the channel is
“driven by” f (x) = N (0, P − ε)
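• A Monte Carlo sketch (my own, with arbitrary parameter values) of the quantity above: averaging log₂ f(y|x)/f(y) over samples x ~ N(0, P − ε), y = x + w reproduces ½ log₂(1 + (P − ε)/σ²):

```python
import numpy as np

def log2_gauss_pdf(z, var):
    """log2 of the N(0, var) density evaluated at z."""
    return -0.5 * np.log2(2 * np.pi * var) - z**2 / (2 * var) * np.log2(np.e)

rng = np.random.default_rng(2)
P, eps, sigma2, N = 1.0, 0.01, 0.25, 500_000

x = rng.normal(0.0, np.sqrt(P - eps), size=N)       # x ~ f(x) = N(0, P - eps)
y = x + rng.normal(0.0, np.sqrt(sigma2), size=N)    # channel output

# log2 f(y|x)/f(y); the marginal f(y) induced by f(x) is N(0, P - eps + sigma2)
log_ratio = log2_gauss_pdf(y - x, sigma2) - log2_gauss_pdf(y, P - eps + sigma2)
print(np.mean(log_ratio), 0.5 * np.log2(1 + (P - eps) / sigma2))   # both ~ 1.16 bits
```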



A Lower Bound for C II
• Encoding: A message ω ∈ I_M is encoded as x₁ⁿ(ω)
• Transmission: Received sequence

  y₁ⁿ = x₁ⁿ(ω) + w₁ⁿ

  where the wₘ are i.i.d. zero-mean Gaussian with E[wₘ²] = σ²
• Decoding: For any sequences x₁ⁿ and y₁ⁿ, let

  fₙ = fₙ(x₁ⁿ, y₁ⁿ) = (1/n) log( f(y₁ⁿ|x₁ⁿ) / f(y₁ⁿ) ) = (1/n) ∑_{m=1}^n log( f(yₘ|xₘ) / f(yₘ) )

  and let Tε⁽ⁿ⁾ be the set of (x₁ⁿ, y₁ⁿ) such that fₙ > Iε − ε. Declare ω̂ = i if x₁ⁿ(i) is the only codeword such that (x₁ⁿ(i), y₁ⁿ) ∈ Tε⁽ⁿ⁾ and, in addition,

  n⁻¹ ∑_{m=1}^n xₘ²(i) ≤ P;

  otherwise set ω̂ = 0.


A Lower Bound for C III

• Average probability of error:

  πₙ = Pr(ω̂ ≠ ω) = {symmetry} = Pr(ω̂ ≠ 1 | ω = 1)

  with “Pr” over the random codebook and the noise


• Let

  E₀ = { n⁻¹ ∑ₘ xₘ²(1) > P }

  and

  Eᵢ = { (x₁ⁿ(i), x₁ⁿ(1) + w₁ⁿ) ∈ Tε⁽ⁿ⁾ },

  then

  πₙ = P(E₀ ∪ E₁ᶜ ∪ E₂ ∪ · · · ∪ E_M) ≤ P(E₀) + P(E₁ᶜ) + ∑_{i=2}^M P(Eᵢ)



A Lower Bound for C IV
• For n sufficiently large, we have
  • P(E₀) < ε
  • P(E₁ᶜ) < ε
  • P(Eᵢ) ≤ 2^{−n(Iε−ε)}, i = 2, . . . , M

  that is,

  πₙ ≤ 2ε + 2^{−n(Iε−R−ε)}

  ⇒ For the average code, R < Iε − ε ⇒ πₙ → 0 as n → ∞
  ⇒ There exists at least one code with Pe⁽ⁿ⁾ → 0 for R < Iε − ε

  C ≥ ½ log(1 + P/σ²)


An Upper Bound for C I

• Consider any sequence of codes that can achieve the rate R


• Fano =⇒

  R ≤ (1/n) ∑_{m=1}^n I(xₘ(ω); yₘ) + αₙ

  where αₙ = n⁻¹ + R·Pe⁽ⁿ⁾ → 0 as n → ∞, and where

  I(xₘ(ω); yₘ) = h(yₘ) − h(wₘ) = h(yₘ) − ½ log 2πeσ²

• Since E[yₘ²] = Pₘ + σ², where Pₘ = M⁻¹ ∑_{i=1}^M xₘ²(i), we get

  h(yₘ) ≤ ½ log 2πe(σ² + Pₘ)

  and hence I(xₘ(ω); yₘ) ≤ ½ log(1 + Pₘ/σ²).



An Upper Bound for C II

Thus

  R ≤ (1/n) ∑_{m=1}^n ½ log(1 + Pₘ/σ²) + αₙ ≤ ½ log(1 + n⁻¹∑ₘ Pₘ / σ²) + αₙ
    ≤ ½ log(1 + P/σ²) + αₙ → ½ log(1 + P/σ²) as n → ∞

for all achievable R, due to Jensen and the power constraint =⇒

 
  C ≤ ½ log(1 + P/σ²)
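• A quick numeric check (not from the slides; values are arbitrary) of the Jensen step above: for any per-symbol power profile Pₘ, the average of ½ log₂(1 + Pₘ/σ²) never exceeds the same expression evaluated at the average power:

```python
import numpy as np

rng = np.random.default_rng(3)
sigma2 = 0.5
Pm = rng.uniform(0.0, 2.0, size=1000)            # arbitrary per-symbol powers P_m

lhs = np.mean(0.5 * np.log2(1 + Pm / sigma2))    # (1/n) sum_m (1/2) log2(1 + P_m / sigma^2)
rhs = 0.5 * np.log2(1 + np.mean(Pm) / sigma2)    # (1/2) log2(1 + average power / sigma^2)
print(lhs <= rhs, lhs, rhs)                      # True: log2(1 + x) is concave in x
```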


Coding Theorem for the Gaussian Channel

Theorem
A memoryless Gaussian channel with noise variance σ² and power constraint P has capacity

  C = ½ log(1 + P/σ²)

That is, all rates R < C and no rates R > C are achievable.
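• As a small worked example (numbers are my own): evaluating the theorem's formula for a given SNR, and inverting it to find the SNR needed for a target rate:

```python
import numpy as np

def capacity_bits_per_use(P, sigma2):
    return 0.5 * np.log2(1 + P / sigma2)

def snr_for_rate(R):
    """SNR P/sigma^2 needed for C = R bits per channel use, from C = (1/2) log2(1 + SNR)."""
    return 2 ** (2 * R) - 1

print(capacity_bits_per_use(P=15.0, sigma2=1.0))   # 2.0 bits per channel use
print(snr_for_rate(2.0))                           # 15.0
```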



The Gaussian Waveform Channel I
[Block diagram: x(t), t ∈ (−T/2, T/2) → filter H(f) → ⊕ (noise with spectral density N(f)) → y(t), t ∈ (−T/2, T/2)]

• Linear-filter waveform channel with Gaussian noise,


• independent Gaussian noise with spectral density N (f )
• linear filter H(f )
• input confined to (−T /2, T /2)
• output measured over (−T /2, T /2)

• codebook
  C = { x₁(t), . . . , x_M(t) }
• power constraint
  (1/T) ∫_{−T/2}^{T/2} xᵢ²(t) dt ≤ P
• rate
  R = (log M)/T

The Gaussian Waveform Channel II


• Capacity (in bits per second),

  C = ½ ∫_{F(β)} log( |H(f)|²·β / N(f) ) df

  P = ∫_{F(β)} ( β − N(f)/|H(f)|² ) df

  where

  F(β) = { f : N(f)·|H(f)|⁻² ≤ β }

  for different β ∈ (0, ∞).


• That is, there exist codes such that arbitrarily low error probability is possible as long as

  R = (log M)/T < C

  and as T → ∞. For R > C the error probability is > 0.



The Gaussian Waveform Channel III
• In the achievability proof: Random Gaussian codewords, with spectral density

  S(f) = [ β − N(f)/|H(f)|² ]⁺

  where [x]⁺ = x for x ≥ 0 and [x]⁺ = 0 for x < 0

• The famous special case of a bandlimited AWGN channel:
  • Perfect lowpass filter of bandwidth W:  H(f) = 1 for |f| ≤ W, H(f) = 0 for |f| > W
  • White Gaussian noise, with N(f) = N₀/2


The Gaussian Waveform Channel IV

• The capacity of this channel is (Shannon ’48):

  C = W log( 1 + P/(W N₀) )   [bits per second]

• Fundamental resources: power P and bandwidth W
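• A short numeric sketch (arbitrary values for P and N₀): evaluating Shannon's formula for increasing W illustrates the power/bandwidth trade; with P and N₀ fixed, C grows with W but saturates at P/(N₀ ln 2):

```python
import numpy as np

def awgn_capacity_bps(P, W, N0):
    """Bandlimited AWGN capacity C = W log2(1 + P / (W N0)), in bits per second."""
    return W * np.log2(1 + P / (W * N0))

P, N0 = 1e-3, 1e-9                 # example power (W) and noise spectral density (W/Hz)
for W in [1e3, 1e4, 1e5, 1e6, 1e7]:
    print(W, awgn_capacity_bps(P, W, N0))

# With P and N0 fixed, C increases with W but saturates at P / (N0 * ln 2) bits per second.
print(P / (N0 * np.log(2)))
```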



Waterfilling

• A frequency-selective Gaussian waveform channel (H(f) arbitrary), white Gaussian noise (N(f) = N₀/2) ⇒

  C = ½ ∫_{F(β)} log( β|H(f)|² ) df

  P = (N₀/2) ∫_{F(β)} ( β − 1/|H(f)|² ) df

  where F(β) = { f : β ≥ |H(f)|⁻² }
• Optimal signal spectrum

  S(f) = (N₀/2) [ β − 1/|H(f)|² ]⁺

• “sample in frequency” ⇒ OFDM. . .
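• A discretized waterfilling sketch (my own implementation, not from the slides; the |H(f)|² profile, frequency grid, and function names are arbitrary): bisect on β until the allocated power matches the budget P, then evaluate S(f) and C on the grid:

```python
import numpy as np

def waterfill(H2, df, P, N0, iters=100):
    """Bisect on beta so that (N0/2) * sum over F(beta) of (beta - 1/|H|^2) * df equals P;
    return beta, the spectrum S(f), and the capacity C (bits/s) on the discrete grid."""
    def allocate(beta):
        S = 0.5 * N0 * np.maximum(beta - 1.0 / H2, 0.0)   # S(f) = (N0/2) [beta - 1/|H|^2]^+
        return np.sum(S) * df, S

    lo, hi = 0.0, 1.0
    while allocate(hi)[0] < P:      # grow the upper bracket until the power budget is exceeded
        hi *= 2.0
    for _ in range(iters):          # bisection (allocated power is nondecreasing in beta)
        beta = 0.5 * (lo + hi)
        if allocate(beta)[0] < P:
            lo = beta
        else:
            hi = beta
    _, S = allocate(beta)
    active = S > 0                  # F(beta) = {f : beta >= 1/|H(f)|^2}
    C = 0.5 * np.sum(np.log2(beta * H2[active])) * df
    return beta, S, C

# Example: a lowpass-like |H(f)|^2 on a one-sided grid 0..10 kHz (shape chosen arbitrarily).
f = np.linspace(0.0, 10e3, 2000)
df = f[1] - f[0]
H2 = 1.0 / (1.0 + (f / 3e3) ** 2)
beta, S, C = waterfill(H2, df, P=1e-2, N0=1e-6)
print(beta, C)
```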

