
Gaussian channel

Information theory 2013, lecture 6

Jens Sjölund

8 May 2013



Outline

1 Definitions

2 The coding theorem for the Gaussian channel

3 Bandlimited channels

4 Parallel Gaussian channels

5 Channels with colored Gaussian noise

6 Gaussian channels with feedback



Introduction

Definition
A Gaussian channel is a time-discrete channel with output $Y_i$, input $X_i$,
and noise $Z_i$ at time $i$ such that

$$ Y_i = X_i + Z_i, \qquad Z_i \sim \mathcal{N}(0, N) $$

where the $Z_i$ are i.i.d. and independent of $X_i$.

The Gaussian channel is the most important continuous-alphabet channel,
modeling a wide range of communication channels.
The validity of this model follows from the central limit theorem.



The power constraint

If the noise variance is zero or the input is unconstrained, the capacity
of the channel is infinite.
The most common limitation on the input is a power constraint

$$ E X_i^2 \le P $$



Information capacity

Theorem
The information capacity of a Gaussian channel with power constraint P and
noise variance N is

$$ C = \max_{f(x):\, EX^2 \le P} I(X; Y) = \frac{1}{2}\log\left(1 + \frac{P}{N}\right) $$

Proof.
Use that $EY^2 = P + N \implies h(Y) \le \frac{1}{2}\log 2\pi e(P + N)$ (Theorem 8.6.5):

$$ I(X; Y) = h(Y) - h(Y|X) = h(Y) - h(X + Z|X) = h(Y) - h(Z|X) $$
$$ = h(Y) - h(Z) \le \frac{1}{2}\log 2\pi e(P + N) - \frac{1}{2}\log 2\pi e N $$
$$ = \frac{1}{2}\log\left(\frac{P + N}{N}\right) = \frac{1}{2}\log\left(1 + \frac{P}{N}\right), $$

which holds with equality iff $X \sim \mathcal{N}(0, P)$.
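A minimal numeric sketch of the theorem, with assumed values of P and N (not from the lecture): the closed-form capacity with a Gaussian input, compared against the mutual information of a binary input, which must fall below C since equality requires $X \sim \mathcal{N}(0, P)$.

```python
import numpy as np

# Hypothetical values, chosen only for illustration: power constraint P, noise variance N.
P, N = 1.0, 0.5

# Capacity with the optimal Gaussian input: C = 1/2 log2(1 + P/N) bits per channel use.
C = 0.5 * np.log2(1 + P / N)

# For comparison: I(X;Y) = h(Y) - h(Z) for an equiprobable binary input X in {-sqrt(P), +sqrt(P)},
# with h(Y) computed by numerical integration of the two-component Gaussian mixture density.
y = np.linspace(-10, 10, 200001)
dy = y[1] - y[0]

def gauss(y, mean, var):
    return np.exp(-(y - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

f_y = 0.5 * gauss(y, np.sqrt(P), N) + 0.5 * gauss(y, -np.sqrt(P), N)   # output density
h_y = -np.sum(f_y * np.log2(f_y + 1e-300)) * dy                        # h(Y) in bits
h_z = 0.5 * np.log2(2 * np.pi * np.e * N)                              # h(Z) in bits

print(f"C with Gaussian input   : {C:.4f} bits/use")
print(f"I(X;Y) with binary input: {h_y - h_z:.4f} bits/use (below C)")
```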


Statement of the theorem

Theorem (9.1.1)
The capacity of a Gaussian channel with power constraint P and noise
variance N is
$$ C = \frac{1}{2}\log\left(1 + \frac{P}{N}\right) $$

Proof.
See the book. Based on random codes and joint typicality decoding
(similar to the proof for the discrete case).



Handwaving motivation

Volume of an n-dimensional sphere of radius $r$ is $\propto r^n$.

Radius of received vectors: $\sqrt{\sum_{i=1}^n Y_i^2} \le \sqrt{n(P + N)}$.

Using decoding spheres of radius $\sqrt{nN}$, the maximum number of
nonintersecting decoding spheres is therefore no more than

$$ M = \frac{\left(\sqrt{n(P + N)}\right)^n}{\left(\sqrt{nN}\right)^n} = \left(1 + \frac{P}{N}\right)^{n/2} = 2^{\frac{n}{2}\log\left(1 + \frac{P}{N}\right)} $$

(note the typo in eq. 9.22 in the book). This corresponds to a rate

$$ R = \frac{\log M}{n} = \frac{\log 2^{\frac{n}{2}\log\left(1 + \frac{P}{N}\right)}}{n} = \frac{1}{2}\log\left(1 + \frac{P}{N}\right) = C $$
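A quick simulation sketch of the sphere-packing picture, with assumed values of n, P, and N: the received vector concentrates on a sphere of radius $\sqrt{n(P+N)}$, the noise on one of radius $\sqrt{nN}$, and the resulting rate equals C.

```python
import numpy as np

# Assumed parameters, for illustration only.
rng = np.random.default_rng(0)
n, P, N = 10_000, 1.0, 0.5

X = rng.normal(0.0, np.sqrt(P), n)   # codeword samples with power ~ P
Z = rng.normal(0.0, np.sqrt(N), n)   # channel noise
Y = X + Z

print("||Y||     =", np.linalg.norm(Y), " vs sqrt(n(P+N)) =", np.sqrt(n * (P + N)))
print("||Y - X|| =", np.linalg.norm(Z), " vs sqrt(nN)     =", np.sqrt(n * N))

# Number of nonintersecting decoding spheres M = (1 + P/N)^(n/2) = 2^{(n/2) log2(1 + P/N)},
# so the rate log2(M)/n equals the capacity C.
rate = (n / 2) * np.log2(1 + P / N) / n
print("rate log2(M)/n =", rate, " = C =", 0.5 * np.log2(1 + P / N))
```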





Bandlimited channels

A bandlimited channel with white noise can be described as

$$ Y(t) = (X(t) + Z(t)) * h(t) $$

where X(t) is the signal waveform, Z(t) is the waveform of the white
Gaussian noise, and h(t) is the impulse response of an ideal bandpass
filter, which cuts out all frequencies greater than W.

Theorem (Nyquist-Shannon sampling theorem)


Suppose that a function f(t) is bandlimited to W. Then the function is
completely determined by samples of the function spaced $1/(2W)$ apart.
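As a small illustration of the sampling theorem (with an assumed test signal, not part of the lecture): a signal bandlimited to W Hz is recovered by sinc interpolation of samples spaced 1/(2W) apart; with a finite window of samples the reconstruction is approximate but very close away from the window edges.

```python
import numpy as np

# Assumed bandwidth and test signal, for illustration only.
W = 4.0                              # bandwidth in Hz
Ts = 1.0 / (2 * W)                   # Nyquist sampling interval, 1/(2W) seconds
t_samples = np.arange(-80, 81) * Ts  # finite window of sample instants

def x(t):
    # tones at 1 Hz and 3 Hz, both below W = 4 Hz, so x is bandlimited to W
    return np.sin(2 * np.pi * 1.0 * t) + 0.5 * np.cos(2 * np.pi * 3.0 * t)

samples = x(t_samples)

def reconstruct(t):
    # Whittaker-Shannon interpolation: x(t) = sum_k x(k*Ts) * sinc((t - k*Ts) / Ts)
    return np.sum(samples * np.sinc((t - t_samples) / Ts))

t0 = 0.3137  # arbitrary point well inside the sampled window
print("true value         :", x(t0))
print("sinc reconstruction:", reconstruct(t0))
```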



Bandlimited channels

Sampling at 2W Hertz, the energy per sample is P/(2W). Assuming a noise
spectral density $N_0/2$, the capacity is

$$ C = \frac{1}{2}\log\left(1 + \frac{P/(2W)}{N_0/2}\right) = \frac{1}{2}\log\left(1 + \frac{P}{N_0 W}\right) \quad \text{bits per sample} $$

$$ = W \log\left(1 + \frac{P}{N_0 W}\right) \quad \text{bits per second,} $$

which is arguably one of the most famous formulas of information theory.
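To see the formula in use, here is a hedged numeric example with assumed values (roughly a telephone-grade channel; the numbers are not from the lecture):

```python
import numpy as np

# Assumed values: bandwidth W and signal-to-noise ratio P / (N0 * W).
W = 3000.0                 # Hz
snr_dB = 30.0              # 10*log10(P / (N0 * W))
snr = 10 ** (snr_dB / 10)

C = W * np.log2(1 + snr)   # bits per second
print(f"C = {C:.0f} bits per second")   # about 3.0e4 bit/s for these values
```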





Parallel Gaussian channels

Consider k independent Gaussian channels in parallel with a common power
constraint:

$$ Y_j = X_j + Z_j, \qquad j = 1, 2, \ldots, k $$
$$ Z_j \sim \mathcal{N}(0, N_j) $$
$$ \sum_{j=1}^{k} E X_j^2 \le P $$

The objective is to distribute the total power among the channels so as to
maximize the capacity.



Information capacity of parallel Gaussian channels

Following the same lines of reasoning as in the single-channel case, one
may show

$$ I(X_1, \ldots, X_k; Y_1, \ldots, Y_k) \le \sum_j \left( h(Y_j) - h(Z_j) \right) \le \sum_j \frac{1}{2}\log\left(1 + \frac{P_j}{N_j}\right) $$

where $P_j = E X_j^2$ and $\sum_j P_j = P$. Equality is achieved by

$$ (X_1, \ldots, X_k) \sim \mathcal{N}\left(0, \begin{pmatrix} P_1 & 0 & \cdots & 0 \\ 0 & P_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & P_k \end{pmatrix}\right) $$




Maximizing the information capacity
The objective is thus to maximize the capacity subject to the constraint
$\sum_j P_j = P$, which can be done using a Lagrange multiplier $\lambda$:

$$ J(P_1, \ldots, P_k) = \sum_j \frac{1}{2}\log\left(1 + \frac{P_j}{N_j}\right) + \lambda \left( \sum_j P_j - P \right) $$

$$ \frac{\partial J}{\partial P_i} = \frac{1}{2(P_i + N_i)} + \lambda = 0 $$

$$ P_i = -\frac{1}{2\lambda} - N_i \stackrel{\Delta}{=} \nu - N_i $$

However, $P_i \ge 0$, so the solution needs to be modified to

$$ P_i = (\nu - N_i)^+ = \begin{cases} \nu - N_i & \text{if } \nu \ge N_i \\ 0 & \text{if } \nu < N_i \end{cases} $$



Water-filling

This gives the solution

$$ C = \sum_{j=1}^{k} \frac{1}{2}\log\left(1 + \frac{(\nu - N_j)^+}{N_j}\right) $$

where $\nu$ is chosen so that $\sum_{j=1}^{k} (\nu - N_j)^+ = P$ (typo in the summary).

As the signal power increases, power is allotted to the channels where the
sum of noise and already-allocated power is lowest. By analogy to the way
water distributes itself in a vessel, this is referred to as water-filling.
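A minimal water-filling sketch for the solution above, assuming example noise levels $N_j$ and total power P (not from the lecture): the water level $\nu$ is found by bisection so that $\sum_j (\nu - N_j)^+ = P$, and the capacity formula is then evaluated.

```python
import numpy as np

def water_filling(noise, P, iters=100):
    """Return the water level nu, the power allocation, and the capacity (bits/use)."""
    noise = np.asarray(noise, dtype=float)
    lo, hi = noise.min(), noise.max() + P          # the water level nu lies in this interval
    for _ in range(iters):                         # bisection on nu
        nu = 0.5 * (lo + hi)
        if np.maximum(nu - noise, 0.0).sum() > P:
            hi = nu
        else:
            lo = nu
    powers = np.maximum(nu - noise, 0.0)           # P_j = (nu - N_j)^+
    C = np.sum(0.5 * np.log2(1 + powers / noise))  # sum_j 1/2 log2(1 + P_j / N_j)
    return nu, powers, C

# Assumed example: three channels with noise levels 1, 2, 4 and total power P = 3.
nu, powers, C = water_filling([1.0, 2.0, 4.0], P=3.0)
print("water level nu   =", nu)       # 3.0: the noisiest channel stays above water
print("power allocation =", powers)   # [2, 1, 0], summing to P
print("capacity C       =", C, "bits per transmission")
```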





Channels with colored Gaussian noise

Consider parallel channels with dependent noise.

$$ Y_i = X_i + Z_i, \qquad Z^n \sim \mathcal{N}(0, K_Z) $$

This can also represent the case of a single channel with Gaussian
noise with memory. For a channel with memory, a block of n
consecutive uses can be considered as n channels in parallel with
dependent noise.



Maximizing the information capacity (1/3)

Let $K_X$ be the input covariance matrix. The power constraint can then be
written

$$ \frac{1}{n} \sum_i E X_i^2 = \frac{1}{n}\,\mathrm{tr}(K_X) \le P $$

As for the independent channels,

$$ I(X_1, \ldots, X_k; Y_1, \ldots, Y_k) = h(Y_1, \ldots, Y_k) - h(Z_1, \ldots, Z_k) $$

The entropy of the output is maximal when Y is normally distributed, and
since the input and noise are independent,

$$ K_Y = K_X + K_Z \implies h(Y_1, \ldots, Y_k) = \frac{1}{2}\log\left( (2\pi e)^n |K_X + K_Z| \right) $$



Maximizing the information capacity (2/3)

Thus, maximizing the capacity amounts to maximizing $|K_X + K_Z|$ subject to
the constraint $\mathrm{tr}(K_X) \le nP$. The eigenvalue decomposition of $K_Z$ gives

$$ K_Z = Q \Lambda Q^T, \quad \text{where } QQ^T = I. $$

Then, with $A = Q^T K_X Q$,

$$ |K_X + K_Z| = |Q|\,|Q^T K_X Q + \Lambda|\,|Q^T| = |Q^T K_X Q + \Lambda| = |A + \Lambda| $$

and

$$ \mathrm{tr}(A) = \mathrm{tr}(Q^T K_X Q) = \mathrm{tr}(K_X Q Q^T) = \mathrm{tr}(K_X) \le nP $$



Maximizing the information capacity (3/3)
From Hadamard’s inequality (Theorem 17.9.2),

$$ |A + \Lambda| \le \prod_i (A_{ii} + \lambda_i) $$

with equality iff A is diagonal. Since

$$ \sum_i A_{ii} \le nP, \qquad A_{ii} \ge 0, $$

it can be shown using the KKT conditions that the optimal solution is
given by

$$ A_{ii} = (\nu - \lambda_i)^+, \qquad \sum_i (\nu - \lambda_i)^+ = nP $$



Resulting information capacity

For a channel with colored Gaussian noise,

$$ C = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{2}\log\left(1 + \frac{(\nu - \lambda_i)^+}{\lambda_i}\right) $$

where $\lambda_i$ are the eigenvalues of $K_Z$ and $\nu$ is chosen so that
$\sum_{i=1}^{n} (\nu - \lambda_i)^+ = nP$ (typo in the summary).
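A sketch of the colored-noise case under an assumed covariance (an AR(1)-like $K_Z$, not from the lecture): diagonalize $K_Z$, water-fill over its eigenvalues so that $\sum_i (\nu - \lambda_i)^+ = nP$, and average the per-mode terms.

```python
import numpy as np

def colored_noise_capacity(K_Z, P, iters=200):
    """Capacity (bits per channel use) of n parallel uses with noise covariance K_Z."""
    lam = np.linalg.eigvalsh(K_Z)                  # noise eigenvalues lambda_i
    n = len(lam)
    lo, hi = lam.min(), lam.max() + n * P          # bisection bracket for the water level nu
    for _ in range(iters):
        nu = 0.5 * (lo + hi)
        if np.maximum(nu - lam, 0.0).sum() > n * P:
            hi = nu
        else:
            lo = nu
    A = np.maximum(nu - lam, 0.0)                  # optimal diagonal A_ii = (nu - lambda_i)^+
    return np.mean(0.5 * np.log2(1 + A / lam))     # (1/n) sum_i 1/2 log2(1 + A_ii / lambda_i)

# Assumed example: correlated noise with K_Z[i, j] = rho^|i - j| (n = 8, rho = 0.7), power P = 1.
n, rho, P = 8, 0.7, 1.0
idx = np.arange(n)
K_Z = rho ** np.abs(idx[:, None] - idx[None, :])
print("C =", colored_noise_capacity(K_Z, P), "bits per channel use")
```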





Gaussian channels with feedback

$$ Y_i = X_i + Z_i, \qquad Z^n \sim \mathcal{N}(0, K_Z^{(n)}) $$

Because of the feedback, $X^n$ and $Z^n$ are not independent; $X_i$ is
causally dependent on the past values of Z.



Capacity with and without feedback

For memoryless Gaussian channels, feedback does not increase capacity.
However, for channels with memory, it does.

Capacity without feedback

$$ C_n = \max_{\mathrm{tr}(K_X) \le nP} \frac{1}{2n}\log\frac{|K_X + K_Z^{(n)}|}{|K_Z^{(n)}|} = \frac{1}{2n}\sum_i \log\left(1 + \frac{(\nu - \lambda_i)^+}{\lambda_i}\right) $$

Capacity with feedback

$$ C_{n,\mathrm{FB}} = \max_{\mathrm{tr}(K_X) \le nP} \frac{1}{2n}\log\frac{|K_{X+Z}|}{|K_Z^{(n)}|} $$



Bounds on the capacity with feedback

Theorem

$$ C_{n,\mathrm{FB}} \le C_n + \frac{1}{2} $$

Theorem (Pinsker’s statement)

$$ C_{n,\mathrm{FB}} \le 2 C_n $$

In words, Gaussian channel capacity is not increased by more than half a bit,
or by more than a factor of 2, when we have feedback.

