
Lecture 1

Capacity of the Gaussian Channel

• Basic concepts in information theory: Appendix B


• Capacity of the Gaussian channel: Appendix B, Ch. 5.1–3


Entropy and Mutual Information I

• Entropy for a discrete random variable X with alphabet 𝒳 and pmf

  p(x) ≜ Pr(X = x), ∀x ∈ 𝒳,

  H(X) ≜ −∑_{x∈𝒳} p(x) log p(x)

• H(X) = the average amount of uncertainty removed when observing X = the information obtained
• It holds that 0 ≤ H(X) ≤ log |𝒳|
• Entropy for an n-tuple X₁ⁿ ≜ (X₁, . . . , Xₙ)

  H(X₁ⁿ) = H(X₁, . . . , Xₙ) = −∑_{x₁ⁿ} p(x₁ⁿ) log p(x₁ⁿ)
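• As a quick numerical illustration (not from the slides; the helper name and example pmf are my own), entropy of a small pmf in Python:

```python
import numpy as np

def entropy_bits(p):
    """H(X) = -sum_x p(x) log2 p(x); zero-probability symbols contribute 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

pmf = [0.5, 0.25, 0.125, 0.125]       # a biased 4-letter alphabet
print(entropy_bits(pmf))              # 1.75 bits
print(entropy_bits([0.25] * 4))       # 2.0 bits = log2|X|, the uniform maximum
```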



Entropy and Mutual Information II
• Conditional entropy of Y given X = x

  H(Y|X = x) ≜ −∑_{y∈𝒴} p(y|x) log p(y|x)

• H(Y|X = x) = the average information obtained when observing Y, when it is already known that X = x
• Conditional entropy of Y given X (on the average)

  H(Y|X) ≜ ∑_{x∈𝒳} p(x) H(Y|X = x)

• Define g(x) = H(Y|X = x). Then H(Y|X) = E[g(X)].


• Chain rule
H(X, Y ) = H(Y |X) + H(X)
(cf. p(x, y) = p(y|x)p(x))


Entropy and Mutual Information III

• Mutual information

  I(X; Y) ≜ ∑_x ∑_y p(x, y) log( p(x, y) / (p(x)p(y)) )

• I(X; Y) = the average information about X obtained when observing Y (and vice versa)



Entropy and Mutual Information IV
[Venn diagram: H(X) and H(Y) as two overlapping circles inside H(X, Y); the overlap is I(X; Y), the remaining parts are H(X|Y) and H(Y|X)]

I(X; Y ) = I(Y ; X)
I(X; Y ) = H(Y ) − H(Y |X) = H(X) − H(X|Y )
I(X; Y ) = H(X) + H(Y ) − H(X, Y )
I(X; X) = H(X)
H(X, Y ) = H(X) + H(Y |X) = H(Y ) + H(X|Y )
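• A small numeric check (my own toy joint pmf, not from the slides) of the identities above:

```python
import numpy as np

def H(p):
    """Entropy in bits of a pmf given as an array (joint or marginal)."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Toy joint pmf p(x, y) on a 2 x 3 alphabet (rows: x, columns: y)
pxy = np.array([[0.10, 0.20, 0.10],
                [0.25, 0.05, 0.30]])
px, py = pxy.sum(axis=1), pxy.sum(axis=0)
HX, HY, HXY = H(px), H(py), H(pxy)

# H(Y|X) = sum_x p(x) H(Y | X = x), from the conditional pmfs p(y|x) = p(x,y)/p(x)
HY_given_X = float(sum(px[i] * H(pxy[i] / px[i]) for i in range(len(px))))

print(HX + HY_given_X, HXY)              # chain rule: H(X) + H(Y|X) = H(X,Y)
print(HX + HY - HXY, HY - HY_given_X)    # I(X;Y) = H(X)+H(Y)-H(X,Y) = H(Y)-H(Y|X)
```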


Entropy and Mutual Information V

• For a continuous random variable X with pdf f(x), the differential entropy is

  h(X) = −∫ f(x) log f(x) dx

• For E[X²] = σ²,

  h(X) ≤ ½ log(2πeσ²)  [bits]

  with equality only for X Gaussian
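• A Monte Carlo sanity check (my own sketch, arbitrary σ²): the sample average of −log₂ f(X) over Gaussian draws approaches ½ log₂(2πeσ²):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 2.0
x = rng.normal(0.0, np.sqrt(sigma2), size=200_000)

# h(X) = E[-log2 f(X)] with f the N(0, sigma2) pdf
log2_f = -0.5 * np.log2(2 * np.pi * sigma2) - x**2 / (2 * sigma2) * np.log2(np.e)
print(np.mean(-log2_f), 0.5 * np.log2(2 * np.pi * np.e * sigma2))   # both ~ 2.55 bits
```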
• Mutual information,

  I(X; Y) = ∬ f(x, y) log( f(x, y) / (f(x)f(y)) ) dx dy



Jensen’s Inequality

• For f : ℝⁿ → ℝ convex and a random X ∈ ℝⁿ,

f (E[X]) ≤ E[f (X)]

• Reverse inequality for f concave


• For f strictly convex (or strictly concave),

f (E[X]) = E[f (X)] =⇒ Pr(X = E[X]) = 1


Fano’s Inequality

• Consider the following estimation problem (discrete RVs):
  X: random variable of interest
  Y: observed random variable
  X̂ = f(Y): estimate of X based on Y
• Define the probability of error as

  Pe = Pr(X̂ ≠ X)

• Fano's inequality lower bounds Pe:

  h(Pe) + Pe log(|𝒳| − 1) ≥ H(X|Y)

  [h(x) = −x log x − (1 − x) log(1 − x) is the binary entropy function]
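• A hedged numerical example (not in the slides; function names and values are my own): for a given H(X|Y) and alphabet size, the smallest Pe consistent with Fano's inequality can be found by a simple grid search:

```python
import numpy as np

def h2(p):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def fano_lower_bound(H_cond, alphabet_size):
    """Smallest Pe (grid search) satisfying h2(Pe) + Pe*log2(|X|-1) >= H(X|Y)."""
    for pe in np.linspace(0.0, 1.0, 100001):
        if h2(pe) + pe * np.log2(alphabet_size - 1) >= H_cond:
            return pe
    return 1.0

# Example: |X| = 8 and H(X|Y) = 1 bit => any estimator errs with probability at least ~0.14.
print(fano_lower_bound(1.0, 8))
```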



The Gaussian Channel I

[Block diagram: ω → encoder α → xₘ → ⊕ (noise wₘ) → yₘ → decoder β → ω̂]

• At time m: transmitted symbol xₘ ∈ 𝒳 = ℝ, received symbol yₘ ∈ 𝒴 = ℝ, noise wₘ ∈ ℝ
• The noise {wₘ} is i.i.d. Gaussian N(0, σ²)
• A memoryless Gaussian transition density (noise variance σ²),

  f(y|x) = (1/√(2πσ²)) exp( −(y − x)²/(2σ²) )


The Gaussian Channel II

• Coding for the Gaussian channel, subject to an average power constraint
• Equally likely information symbols ω ∈ I_M = {1, . . . , M}
• An (M, n) code with power constraint P
  1. Power-limited codebook
     C = { x₁ⁿ(1), . . . , x₁ⁿ(M) },
     with n⁻¹ ∑_{m=1}^n xₘ²(i) ≤ P,  i ∈ I_M
  2. Encoding: ω = i ⇒ x₁ⁿ = α(i) = x₁ⁿ(i) transmitted
  3. Decoding: y₁ⁿ received ⇒ ω̂ = β(y₁ⁿ)

• One symbol → one codeword → n channel uses
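• A minimal end-to-end sketch (my own construction, not from the slides): a random codebook rescaled to meet the power constraint, one transmission, and minimum-distance (ML) decoding, which here stands in for the threshold decoder used in the achievability proof on later slides; all parameter values are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
M, n, P, sigma2 = 16, 128, 1.0, 0.1   # rate R = log2(M)/n = 4/128 bits per channel use

# Codebook: random Gaussian codewords, each rescaled so that n^-1 * sum_m x_m^2(i) = P.
codebook = rng.normal(size=(M, n))
codebook *= np.sqrt(n * P) / np.linalg.norm(codebook, axis=1, keepdims=True)

# Encoding: omega = i  =>  transmit x_1^n(i); the channel adds i.i.d. N(0, sigma2) noise.
omega = 7
y = codebook[omega] + rng.normal(0.0, np.sqrt(sigma2), size=n)

# Decoding: minimum-distance decoding (my substitution for the slides' threshold rule).
omega_hat = int(np.argmin(np.linalg.norm(codebook - y, axis=1)))
print(omega, omega_hat)   # at this low rate and SNR = 10, decoding succeeds w.h.p.
```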



Capacity

• A rate

  R ≜ (log M)/n

  is achievable (subject to the power constraint P) if there exists a sequence of (⌈2^{nR}⌉, n) codes with codewords satisfying the power constraint, and such that the average probability of error

  Pe⁽ⁿ⁾ = Pr(ω̂ ≠ ω)

  tends to 0 as n → ∞.
• The capacity C is the supremum of all achievable rates.


A Lower Bound for C I

• Gaussian random code design: Fix

  f(x) = (1/√(2π(P − ε))) exp( −x²/(2(P − ε)) )

  for a small ε > 0, and draw a codebook Cₙ = { x₁ⁿ(1), . . . , x₁ⁿ(M) } i.i.d. according to f(x₁ⁿ) = ∏ₘ f(xₘ).


• Mutual information: Let

  Iε = ∬ f(y|x)f(x) log( f(y|x) / ∫ f(y|x)f(x)dx ) dx dy = ½ log(1 + (P − ε)/σ²)

• the mutual info between input and output when the channel is
“driven by” f (x) = N (0, P − ε)
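• A Monte Carlo sketch (my own, with arbitrary parameter values) of the quantity above: averaging log₂ f(y|x)/f(y) over samples x ~ N(0, P − ε), y = x + w reproduces ½ log₂(1 + (P − ε)/σ²):

```python
import numpy as np

def log2_gauss_pdf(z, var):
    """log2 of the N(0, var) density evaluated at z."""
    return -0.5 * np.log2(2 * np.pi * var) - z**2 / (2 * var) * np.log2(np.e)

rng = np.random.default_rng(2)
P, eps, sigma2, N = 1.0, 0.01, 0.25, 500_000

x = rng.normal(0.0, np.sqrt(P - eps), size=N)       # x ~ f(x) = N(0, P - eps)
y = x + rng.normal(0.0, np.sqrt(sigma2), size=N)    # channel output

# log2 f(y|x)/f(y); the marginal f(y) induced by f(x) is N(0, P - eps + sigma2)
log_ratio = log2_gauss_pdf(y - x, sigma2) - log2_gauss_pdf(y, P - eps + sigma2)
print(np.mean(log_ratio), 0.5 * np.log2(1 + (P - eps) / sigma2))   # both ~ 1.16 bits
```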



A Lower Bound for C II
• Encoding: A message ω ∈ I_M is encoded as x₁ⁿ(ω)
• Transmission: Received sequence

  y₁ⁿ = x₁ⁿ(ω) + w₁ⁿ

  where the wₘ are i.i.d. zero-mean Gaussian with E[wₘ²] = σ²
• Decoding: For any sequences x₁ⁿ and y₁ⁿ, let

  fₙ = fₙ(x₁ⁿ, y₁ⁿ) = (1/n) log( f(y₁ⁿ|x₁ⁿ) / f(y₁ⁿ) ) = (1/n) ∑_{m=1}^n log( f(yₘ|xₘ) / f(yₘ) )

  and let Tε⁽ⁿ⁾ be the set of (x₁ⁿ, y₁ⁿ) such that fₙ > Iε − ε. Declare ω̂ = i if x₁ⁿ(i) is the only codeword such that (x₁ⁿ(i), y₁ⁿ) ∈ Tε⁽ⁿ⁾ and, in addition,

  n⁻¹ ∑_{m=1}^n xₘ²(i) ≤ P;

  otherwise set ω̂ = 0.


A Lower Bound for C III

• Average probability of error:

  πₙ = Pr(ω̂ ≠ ω) = {symmetry} = Pr(ω̂ ≠ 1 | ω = 1)

  with “Pr” over the random codebook and the noise


• Let

  E₀ = { n⁻¹ ∑ₘ xₘ²(1) > P }

  and

  Eᵢ = { (x₁ⁿ(i), x₁ⁿ(1) + w₁ⁿ) ∈ Tε⁽ⁿ⁾ },

  then

  πₙ = P(E₀ ∪ E₁ᶜ ∪ E₂ ∪ · · · ∪ E_M) ≤ P(E₀) + P(E₁ᶜ) + ∑_{i=2}^M P(Eᵢ)



A Lower Bound for C IV
• For n sufficiently large, we have
  • P(E₀) < ε
  • P(E₁ᶜ) < ε
  • P(Eᵢ) ≤ 2^{−n(Iε−ε)}, i = 2, . . . , M

  that is,

  πₙ ≤ 2ε + 2^{−n(Iε−R−ε)}

  ⇒ For the average code, R < Iε − ε ⇒ πₙ → 0 as n → ∞
  ⇒ There exists at least one code with Pe⁽ⁿ⁾ → 0 for R < Iε − ε

  C ≥ ½ log(1 + P/σ²)


An Upper Bound for C I

• Consider any sequence of codes that can achieve the rate R


• Fano =⇒

  R ≤ (1/n) ∑_{m=1}^n I(xₘ(ω); yₘ) + αₙ

  where αₙ = n⁻¹ + R·Pe⁽ⁿ⁾ → 0 as n → ∞, and where

  I(xₘ(ω); yₘ) = h(yₘ) − h(wₘ) = h(yₘ) − ½ log 2πeσ²

• Since E[yₘ²] = Pₘ + σ², where Pₘ = M⁻¹ ∑_{i=1}^M xₘ²(i), we get

  h(yₘ) ≤ ½ log 2πe(σ² + Pₘ)

  and hence I(xₘ(ω); yₘ) ≤ ½ log(1 + Pₘ/σ²).



An Upper Bound for C II

Thus

  R ≤ (1/n) ∑_{m=1}^n ½ log(1 + Pₘ/σ²) + αₙ ≤ ½ log(1 + n⁻¹∑ₘ Pₘ / σ²) + αₙ
    ≤ ½ log(1 + P/σ²) + αₙ → ½ log(1 + P/σ²) as n → ∞

for all achievable R, due to Jensen and the power constraint =⇒

 
  C ≤ ½ log(1 + P/σ²)
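• A quick numeric check (not from the slides; values are arbitrary) of the Jensen step above: for any per-symbol power profile Pₘ, the average of ½ log₂(1 + Pₘ/σ²) never exceeds the same expression evaluated at the average power:

```python
import numpy as np

rng = np.random.default_rng(3)
sigma2 = 0.5
Pm = rng.uniform(0.0, 2.0, size=1000)            # arbitrary per-symbol powers P_m

lhs = np.mean(0.5 * np.log2(1 + Pm / sigma2))    # (1/n) sum_m (1/2) log2(1 + P_m / sigma^2)
rhs = 0.5 * np.log2(1 + np.mean(Pm) / sigma2)    # (1/2) log2(1 + average power / sigma^2)
print(lhs <= rhs, lhs, rhs)                      # True: log2(1 + x) is concave in x
```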


Coding Theorem for the Gaussian Channel

Theorem
A memoryless Gaussian channel with noise variance σ² and power constraint P has capacity

  C = ½ log(1 + P/σ²)

That is, all rates R < C and no rates R > C are achievable.
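• As a small worked example (numbers are my own): evaluating the theorem's formula for a given SNR, and inverting it to find the SNR needed for a target rate:

```python
import numpy as np

def capacity_bits_per_use(P, sigma2):
    return 0.5 * np.log2(1 + P / sigma2)

def snr_for_rate(R):
    """SNR P/sigma^2 needed for C = R bits per channel use, from C = (1/2) log2(1 + SNR)."""
    return 2 ** (2 * R) - 1

print(capacity_bits_per_use(P=15.0, sigma2=1.0))   # 2.0 bits per channel use
print(snr_for_rate(2.0))                           # 15.0
```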



The Gaussian Waveform Channel I
[Block diagram: x(t), t ∈ (−T/2, T/2) → filter H(f) → ⊕ (noise with spectral density N(f)) → y(t), t ∈ (−T/2, T/2)]

• Linear-filter waveform channel with Gaussian noise,


• independent Gaussian noise with spectral density N (f )
• linear filter H(f )
• input confined to (−T /2, T /2)
• output measured over (−T /2, T /2)

• codebook
  C = { x₁(t), . . . , x_M(t) }
• power constraint
  (1/T) ∫_{−T/2}^{T/2} xᵢ²(t) dt ≤ P
• rate
  R = (log M)/T

The Gaussian Waveform Channel II


• Capacity (in bits per second),

  C = ½ ∫_{F(β)} log( |H(f)|²·β / N(f) ) df

  P = ∫_{F(β)} ( β − N(f)/|H(f)|² ) df

  where

  F(β) = { f : N(f)·|H(f)|⁻² ≤ β }

  for different β ∈ (0, ∞).


• That is, there exist codes such that arbitrarily low error probability is possible as long as

  R = (log M)/T < C

  and as T → ∞. For R > C the error probability is > 0.



The Gaussian Waveform Channel III
• In the achievability proof: Random Gaussian codewords, with spectral density

  S(f) = [ β − N(f)/|H(f)|² ]⁺

  where [x]⁺ = x for x ≥ 0 and [x]⁺ = 0 for x < 0

• The famous special case of a bandlimited AWGN channel:
  • Perfect lowpass filter of bandwidth W:  H(f) = 1 for |f| ≤ W, H(f) = 0 for |f| > W
  • White Gaussian noise, with N(f) = N₀/2


The Gaussian Waveform Channel IV

• The capacity of this channel is (Shannon ’48):

  C = W log( 1 + P/(W N₀) )   [bits per second]

• Fundamental resources: power P and bandwidth W
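• A short numeric sketch (arbitrary values for P and N₀): evaluating Shannon's formula for increasing W illustrates the power/bandwidth trade; with P and N₀ fixed, C grows with W but saturates at P/(N₀ ln 2):

```python
import numpy as np

def awgn_capacity_bps(P, W, N0):
    """Bandlimited AWGN capacity C = W log2(1 + P / (W N0)), in bits per second."""
    return W * np.log2(1 + P / (W * N0))

P, N0 = 1e-3, 1e-9                 # example power (W) and noise spectral density (W/Hz)
for W in [1e3, 1e4, 1e5, 1e6, 1e7]:
    print(W, awgn_capacity_bps(P, W, N0))

# With P and N0 fixed, C increases with W but saturates at P / (N0 * ln 2) bits per second.
print(P / (N0 * np.log(2)))
```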



Waterfilling

• A frequency-selective Gaussian waveform channel (H(f) arbitrary), white Gaussian noise (N(f) = N₀/2) ⇒

  C = ½ ∫_{F(β)} log( β|H(f)|² ) df

  P = (N₀/2) ∫_{F(β)} ( β − 1/|H(f)|² ) df

  where F(β) = { f : β ≥ |H(f)|⁻² }
• Optimal signal spectrum

  S(f) = (N₀/2) [ β − 1/|H(f)|² ]⁺

• “sample in frequency” ⇒ OFDM. . .
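• A discretized waterfilling sketch (my own implementation, not from the slides; the |H(f)|² profile, frequency grid, and function names are arbitrary): bisect on β until the allocated power matches the budget P, then evaluate S(f) and C on the grid:

```python
import numpy as np

def waterfill(H2, df, P, N0, iters=100):
    """Bisect on beta so that (N0/2) * sum over F(beta) of (beta - 1/|H|^2) * df equals P;
    return beta, the spectrum S(f), and the capacity C (bits/s) on the discrete grid."""
    def allocate(beta):
        S = 0.5 * N0 * np.maximum(beta - 1.0 / H2, 0.0)   # S(f) = (N0/2) [beta - 1/|H|^2]^+
        return np.sum(S) * df, S

    lo, hi = 0.0, 1.0
    while allocate(hi)[0] < P:      # grow the upper bracket until the power budget is exceeded
        hi *= 2.0
    for _ in range(iters):          # bisection (allocated power is nondecreasing in beta)
        beta = 0.5 * (lo + hi)
        if allocate(beta)[0] < P:
            lo = beta
        else:
            hi = beta
    _, S = allocate(beta)
    active = S > 0                  # F(beta) = {f : beta >= 1/|H(f)|^2}
    C = 0.5 * np.sum(np.log2(beta * H2[active])) * df
    return beta, S, C

# Example: a lowpass-like |H(f)|^2 on a one-sided grid 0..10 kHz (shape chosen arbitrarily).
f = np.linspace(0.0, 10e3, 2000)
df = f[1] - f[0]
H2 = 1.0 / (1.0 + (f / 3e3) ** 2)
beta, S, C = waterfill(H2, df, P=1e-2, N0=1e-6)
print(beta, C)
```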

