
Finite Blocklength Coding for Channels with Side Information at the Receiver

Amir Ingber and Meir Feder
Department of EE-Systems, The Faculty of Engineering
Tel Aviv University, Tel Aviv 69978, ISRAEL
email: {ingber, meir}@eng.tau.ac.il

Abstract—The communication model of a memoryless channel that depends on a random state known at the receiver only is studied. The channel can be thought of as a set of underlying channels with a fixed state, where at each channel use one of these channels is chosen at random, and this selection is known to the receiver only. The capacity of such a channel is known, and is given by the expectation (w.r.t. the random state) of the capacity of the underlying channels.

In this work we examine the finite-length characteristics of this channel and their relation to the characteristics of the underlying channels. We derive error exponent bounds (random coding and sphere packing) for the channel and determine their relation to the corresponding bounds of the underlying channels. We also determine the channel dispersion and its relation to the dispersion of the underlying channels. We show that, both for the error exponent bounds and for the dispersion, the expectation of these quantities is too optimistic w.r.t. the actual value. Examples of such channels are discussed.

I. INTRODUCTION

The communication model of a memoryless channel that depends on a random state is studied. We focus on the case where the random state, also known as channel state information (CSI), is known at the receiver only. The channel, denoted by W, can be thought of as a set of (memoryless) channels W_S, where S is the random state. Such a model appears many times in practice: the ergodic fading channel is an example of such a channel, where the fade value is assumed to be known at the receiver. Sometimes the state S is a result of the communication scheme design and is inserted intentionally (for example, in order to attain symmetry properties).

In this work we study the relationship between the finite blocklength information theoretic properties of the channel W and those of the underlying channels W_S.

The capacity of this channel is well known, and is generally given by the expectation (over S) of the capacity of the underlying channel W_S. We continue by analyzing other information theoretic properties, such as the error exponent and the channel dispersion of the channel W, and comparing them to the expected values of these properties of the channel W_S.

The main results can be summarized as follows:

The random coding and sphere packing error exponent bounds [1] are both given by the expression E_0(ρ) − ρR (optimized w.r.t. ρ), where E_0(ρ) is a function of the channel. We show that the function E_0 for the channel W is given by

    E_0(ρ) = −log E[ 2^{−E_0^{(S)}(ρ)} ],    (1)

where E_0^{(S)} is the corresponding E_0 function for the channel W_S, E[·] denotes expectation (w.r.t. S), and log = log_2.

In [2], error exponents for channels with side information were considered. However, the focus there was on channels with CSI at the transmitter as well, compound channels and more. While the case of CSI known at the receiver only is a special case, the contribution here lies in the simplicity of the relation (1).

We also discuss the channel dispersion (see [3], [4]), which quantifies the speed at which the rate approaches capacity with the block length (when the codeword error rate is fixed). We show the following relationship between the dispersions of W and W_S, denoted V and V_S respectively:

    V = E[V_S] + VAR[C_S].    (2)

Both in the error exponent and in the dispersion case, we show that the expected exponent and the expected dispersion are too optimistic w.r.t. the actual exponent and dispersion.

Finally, we discuss several examples that involve channels with side information at the receiver, such as channel symmetrization, multilevel codes with multi-stage decoding (MLC-MSD) and bit-interleaved coded modulation (BICM).

II. THE GENERAL COMMUNICATION MODEL

A. Channel Model

Let W be a discrete memoryless channel (DMC)¹ with input x ∈ X and output (y, s) ∈ Y × S, where s ∈ S is the channel state, which is independent of the channel input X:

    W(y, s|x) = P_{Y,S|X}(y, s|x) = P_S(s) P_{Y|S,X}(y|s, x).    (3)

Definition 1: Let W_s be the channel W where the state S is fixed to s:

    W_s(y|x) ≜ P_{Y|S,X}(y|s, x).    (4)

¹ Similar results can be derived for continuous-output channels.

978-1-4244-8682-3/10/$26.00 © 2010 IEEE

B. Communication Scheme

The communication scheme is defined as follows. Let n be the codeword length, and let M be a set of 2^{nR} messages. The encoder and decoder are denoted f_n and g_n respectively, where

• f_n : M → X^n is the encoder, which maps the input message m to the channel input x ∈ X^n.
• g_n : Y^n × S^n → M is the decoder, which maps the channel output and the channel state to an estimate m̂ of the transmitted message.
• The considered error probability is the codeword error probability p_e ≜ P(m̂ ≠ m), where the messages m are drawn randomly from M with equal probability.

The communication scheme is depicted in Figure 1.

We shall be interested in the tradeoff between the rate R, the codelength n and the error probability p_e of the best possible codes.

III. INFORMATION THEORETIC ANALYSIS

Here we shall be interested in the performance of the optimal codes for the channel W. We review known results for the capacity, and present the results for the error exponent and the channel dispersion.

A. Capacity

Since the channel W is simply a DMC with a scalar input and a vector output, the capacity can be derived directly (see, e.g., [5]):

    C(W) = max_{p(x)} I(X; Y, S)
         = max_{p(x)} [ I(X; Y|S) + I(X; S) ]
         = max_{p(x)} I(X; Y|S),    (5)

where the last equality holds since X and S are independent. Note that the capacity can also be written as

    max_{p(x)} E_S[ I(X; Y|S = s) ].    (6)

In this paper we limit ourselves to a fixed input distribution (e.g. equiprobable). In this case the capacity is given by

    I(X; Y|S) = E_S[ I(X; Y|S = s) ].    (7)

Recalling the definition of the channel conditioned on the state s, we get

    C(W) = I(X; Y|S)
         = E_S[ I(X; Y|S = s) ]
         = E[ C(W_S) ],    (8)

where C(W_s) is the capacity of the underlying channel W_s. We conclude that the capacity formula can be interpreted as an expectation over the capacities of the underlying channels.

Note that when the CSI is available at the transmitter as well, (8) holds even without the assumption of a fixed prior on X.
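As a minimal numerical sketch of relation (8), the following computes the capacity of a channel whose state selects one of two binary symmetric channels (BSCs), with the state revealed to the receiver. The state distribution and crossover probabilities are made-up illustration values, not from the paper.

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(eps):
    """Capacity (bits/use) of a BSC with crossover probability eps."""
    return 1.0 - h2(eps)

# Two underlying BSCs chosen with equal probability at each channel use;
# the selection (the state) is known to the receiver only.
states = [(0.5, 0.05), (0.5, 0.20)]  # (P_S(s), crossover eps_s)

# Capacity of W with CSI at the receiver: C(W) = E_S[C(W_S)], eq. (8)
C = sum(p_s * bsc_capacity(eps) for p_s, eps in states)
```

With these values C is roughly 0.496 bits per channel use, strictly between the capacities of the good and bad underlying channels.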

B. Error Exponent

The error exponent of a channel is given by [1]

    E(R) ≜ lim_{n→∞} −(1/n) log( p_e(n, R) ),    (9)

where p_e(n, R) is the average codeword error probability for the best code of length n and rate R (assuming that the limit exists).

While the exact characterization of the error exponent is still an open question, two important bounds are known [1]: the random coding and the sphere packing error exponents, which are lower and upper bounds, respectively.

The random coding exponent is given by

    E_r(R) = max_{ρ∈[0,1]} max_{P_X(·)} { E_0(ρ, P_X) − ρR },    (10)

where E_0(ρ, P_X) is given by

    E_0(ρ, P_X) = −log Σ_{y∈Y} [ Σ_{x∈X} P_X(x) W(y|x)^{1/(1+ρ)} ]^{1+ρ}.    (11)

The sphere packing bound E_sp(R) is given by

    E_sp(R) = max_{ρ>0} max_{P_X(·)} { E_0(ρ, P_X) − ρR }.    (12)

The two exponent bounds are similar: they differ only in the optimization region of the parameter ρ, and they coincide at rates beyond a certain rate called the critical rate.

We note that both bounds depend on the function E_0(·). For channels with CSI at the receiver, we derive E_0(·) explicitly. Following the relationship (8), we wish to find the connection between E_0(·) and the corresponding E_0 functions of the conditional channels W_s, denoted E_0^{(s)}.
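Gallager's function (11) is a finite sum for a DMC and is easy to evaluate directly. The sketch below (function names are our own) computes E_0(ρ, P_X) for a BSC; as a sanity check, E_0(0, P_X) = 0 for any channel, since the inner sum at ρ = 0 reduces to the output distribution, which sums to one.

```python
import math

def gallager_E0(rho, P_X, W):
    """Gallager's E0(rho, P_X) in bits for a DMC, eq. (11).

    W[x][y] is the transition probability P(y|x);
    P_X is the input distribution.
    """
    total = 0.0
    for y in range(len(W[0])):
        inner = sum(P_X[x] * W[x][y] ** (1.0 / (1.0 + rho))
                    for x in range(len(P_X)))
        total += inner ** (1.0 + rho)
    return -math.log2(total)

# Example: BSC(0.1) with equiprobable input
eps = 0.1
W = [[1 - eps, eps], [eps, 1 - eps]]
E0 = gallager_E0(1.0, [0.5, 0.5], W)
```

For the equiprobable BSC at ρ = 1 the sum in (11) works out to 2 · (0.5(√0.9 + √0.1))² = 0.8, so E0 = −log₂ 0.8 ≈ 0.322 bits.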

Theorem 1: Let W be a channel with CSI at the receiver. Then the function E_0(·) for this channel is given by

    E_0(ρ, P_X) = −log E[ 2^{−E_0^{(S)}(ρ, P_X)} ].    (13)

Proof: When the channel output is (y, s), we get

    E_0(ρ, P_X)
      = −log Σ_{y∈Y, s∈S} [ Σ_{x∈X} P_X(x) W(y, s|x)^{1/(1+ρ)} ]^{1+ρ}
      = −log Σ_{s∈S} P_S(s) Σ_{y∈Y} [ Σ_{x∈X} P_X(x) P_{Y|S,X}(y|s, x)^{1/(1+ρ)} ]^{1+ρ}
    (a) = −log Σ_{s∈S} P_S(s) 2^{−E_0^{(s)}(ρ, P_X)}
      = −log E[ 2^{−E_0^{(S)}(ρ, P_X)} ],

where (a) follows from the definition of E_0^{(s)}.
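Relation (13) can be checked numerically: build the joint channel with vector output (y, s) explicitly and compare its E_0 with −log E[2^{−E_0^{(S)}}]. The two-state BSC mixture below is an illustration of ours, not an example from the paper.

```python
import math

def E0(rho, P_X, W):
    """Gallager E0 in bits, eq. (11); W[x][y] = P(y|x)."""
    total = 0.0
    for y in range(len(W[0])):
        inner = sum(px * W[x][y] ** (1 / (1 + rho)) for x, px in enumerate(P_X))
        total += inner ** (1 + rho)
    return -math.log2(total)

rho, P_X = 0.7, [0.5, 0.5]
states = [(0.5, 0.02), (0.5, 0.15)]  # (P_S(s), crossover eps_s)

# Right-hand side of (13): -log E[2^{-E0^(S)}]
rhs = -math.log2(sum(
    ps * 2.0 ** (-E0(rho, P_X, [[1 - e, e], [e, 1 - e]]))
    for ps, e in states))

# Left-hand side: E0 of the joint channel with output (y, s),
# W(y, s | x) = P_S(s) * W_s(y | x); rows enumerate (s, y) pairs.
W_joint = [[ps * w for ps, e in states
            for w in ([1 - e, e] if x == 0 else [e, 1 - e])]
           for x in range(2)]
lhs = E0(rho, P_X, W_joint)
```

The agreement is exact (up to floating point), since P_S(s)^{1/(1+ρ)} factors out of the inner sum and is restored by the outer power 1+ρ, exactly as in step (a) of the proof.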


Fig. 1. Communication scheme for channels with CSI at the receiver. (The encoder maps m ∈ M to x ∈ X^n; the channel W produces y ∈ Y^n; the random state s ∈ S^n is available to the decoder, which outputs the estimate m̂.)

As a corollary, we get the random coding and the sphere packing exponents for the channel W according to (10) and (12).

Following (8), one might think that the error exponent bounds (for example, E_r(R)) would be given by the expectation of the exponent function w.r.t. S. This is clearly not the case, as seen in Theorem 1. In addition, the following can be shown:

Theorem 2: Let Ẽ_r(R) be the average of E_r^{(S)} w.r.t. S:

    Ẽ_r(R) ≜ E[ E_r^{(S)}(R) ].    (14)

Then Ẽ_r(R) always overestimates the true random coding exponent of W, E_r(R).

Proof: Let Ẽ_0(ρ, P_X) = E[ E_0^{(S)}(ρ, P_X) ]. Since 2^{−(·)} is convex, it follows from Jensen's inequality and Theorem 1 that

    E_0(ρ, P_X) ≤ Ẽ_0(ρ, P_X).    (15)

We continue with Ẽ_r(R):

    Ẽ_r(R) = E[ E_r^{(S)}(R) ]
           = E[ sup_{P_X; ρ∈[0,1]} { E_0^{(S)}(ρ, P_X) − ρR } ]
           ≥ sup_{P_X; ρ∈[0,1]} E[ E_0^{(S)}(ρ, P_X) − ρR ]
           = sup_{P_X; ρ∈[0,1]} { Ẽ_0(ρ, P_X) − ρR }
         (a) ≥ sup_{P_X; ρ∈[0,1]} { E_0(ρ, P_X) − ρR }
           = E_r(R),    (16)

where (a) follows from (15).

Note that the proof of Theorem 2 holds no matter what the optimization region of ρ is. Therefore the same result for the sphere packing exponent follows similarly. We conclude that the expectation (w.r.t. S) of the error exponent bounds overestimates the true exponent bounds of W (and also the true error exponent, above the critical rate).
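The crux of the proof is the Jensen step (15): for the convex function 2^{−x}, −log₂ E[2^{−X}] ≤ E[X]. A two-line numeric check with made-up per-state exponent values:

```python
import math

# Jensen's inequality for the convex map x -> 2^(-x):
#   -log2 E[2^(-X)] <= E[X]
probs = [0.3, 0.7]          # hypothetical state probabilities
vals = [0.2, 0.9]           # hypothetical per-state E0 values (bits)

lhs = -math.log2(sum(p * 2.0 ** (-v) for p, v in zip(probs, vals)))
rhs = sum(p * v for p, v in zip(probs, vals))
```

Here lhs ≈ 0.652 while rhs = 0.69, so the averaged exponent is indeed the larger (more optimistic) quantity, matching the theorem.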

C. Dispersion

An alternative information theoretic measure for quantifying coding performance at finite block lengths is the channel dispersion. Suppose that a fixed codeword error probability p_e and a codeword length n are given. We can then seek the maximal achievable rate R given p_e and n. It turns out that for fixed p_e and n, the gap to the channel capacity is approximately proportional to Q^{−1}(p_e)/√n (where Q(·) is the complementary Gaussian cumulative distribution function). The proportionality constant (squared) is called the channel dispersion. Formally, define the (operational) channel dispersion as follows [3]:

Definition 2: The dispersion V(W) of a channel W with capacity C is defined as

    V(W) = lim_{p_e→0} limsup_{n→∞} n · ( (C − R(n, p_e)) / Q^{−1}(p_e) )²,    (17)

where R(n, p_e) is the highest achievable rate for codeword error probability p_e and codeword length n.

In 1962, Strassen [4] used the Gaussian approximation to derive the following result for DMCs:

    R(n, p_e) = C − √(V/n) · Q^{−1}(p_e) + O(log n / n),    (18)

where C is the channel capacity, and the new quantity V is the (information-theoretic) dispersion, which is given by

    V ≜ VAR( i(X; Y) ),    (19)

where i(x; y) is the information spectrum, given by

    i(x; y) ≜ log [ P_{XY}(x, y) / (P_X(x) P_Y(y)) ],    (20)

and the distribution of X is the capacity-achieving distribution that minimizes V. Strassen's result proves that the dispersion of DMCs is equal to VAR(i(X; Y)). This result was recently tightened (and extended to the power-constrained AWGN channel) in [3]. It is also known that the channel dispersion and the error exponent are related as follows: for a channel with capacity C and dispersion V, the error exponent can be approximated (for rates close to capacity) by E(R) ≅ (C − R)² / (2V ln 2). See [3] for details on the early origins of this approximation by Shannon.
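For a BSC with uniform input, i(x; y) in (20) takes only two values, so (19) has a closed form, V = ε(1−ε) log₂²((1−ε)/ε), and the normal approximation (18) can be evaluated directly. A sketch (the bisection-based Q-inverse and all parameter values are our own illustration choices; the O(log n / n) term is dropped):

```python
import math

def bsc_dispersion(eps):
    """V = VAR(i(X;Y)) for a BSC with uniform input, eq. (19).

    i(x; y) equals log2(2(1-eps)) w.p. (1-eps) and log2(2*eps)
    w.p. eps, giving a two-point variance in closed form.
    """
    return eps * (1 - eps) * (math.log2((1 - eps) / eps)) ** 2

def inv_Q(p):
    """Inverse of the Gaussian Q-function via bisection."""
    lo, hi = -10.0, 10.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if 0.5 * math.erfc(mid / math.sqrt(2)) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def normal_approx_rate(C, V, n, pe):
    """R(n, pe) ~= C - sqrt(V/n) * Qinv(pe), eq. (18) w/o the O(log n/n) term."""
    return C - math.sqrt(V / n) * inv_Q(pe)

eps = 0.11
C = 1 + eps * math.log2(eps) + (1 - eps) * math.log2(1 - eps)
V = bsc_dispersion(eps)
R = normal_approx_rate(C, V, 2000, 1e-3)
```

At blocklength 2000 and p_e = 10⁻³ the backoff from capacity is already a substantial fraction of C, which is the finite-blocklength effect the dispersion quantifies.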

We now explore the dispersion for the case of channels with side information at the receiver.

Theorem 3: The dispersion of the channel W with CSI at the receiver is given by

    V(W) = E[ V(W_S) ] + VAR[ C(W_S) ],    (21)

where both the expectation and the variance are taken w.r.t. the random state S.

Proof: Since W is nothing but a DMC with a vector output, the proof boils down to the calculation of VAR[i(X; (Y, S))]. The information spectrum in this case is given by

    i(x; y, s) = log [ P_{YSX}(y, s, x) / (P_{YS}(y, s) P_X(x)) ]
             (a) = log [ P_{Y|S,X}(y|s, x) / P_{Y|S}(y|s) ] ≜ i(x; y|s),    (22)

where (a) follows since X and S are independent.

Suppose that s is fixed, i.e. consider the channel W_s. The capacity is given by

    C(W_s) = E[ i(X; Y|S) | S = s ] = I(X; Y|S = s).    (23)

The dispersion of the channel W_s is given by

    V(W_s) = VAR( i(X; Y|S) | S = s )
           = E[ i²(X; Y|S) | S = s ] − E²[ i(X; Y|S) | S = s ]
           = E[ i²(X; Y|S) | S = s ] − C(W_s)².    (24)

Finally, the dispersion of the original channel W is given as follows:

    V(W) = VAR( i(X; Y|S) )
       (a) = E[ VAR[ i(X; Y|S) | S = s ] ] + VAR[ E[ i(X; Y|S) | S = s ] ]
           = E[ V(W_S) ] + VAR[ C(W_S) ],    (25)

where (a) follows from the law of total variance.
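The law-of-total-variance decomposition (25) can be verified numerically for a concrete two-state BSC mixture (our own illustration values): compute VAR(i(X; Y|S)) over the joint distribution of (X, Y, S), and compare it with E[V_S] + VAR[C_S].

```python
import math

# Two BSC states, equiprobable binary input; state known at the receiver.
states = [(0.5, 0.05), (0.5, 0.25)]  # (P_S(s), crossover eps_s)

def moments(states):
    """First and second moments of i(X; Y|S) over the joint (X, Y, S)."""
    m1 = m2 = 0.0
    for ps, e in states:
        for w in (1 - e, e):            # P(no flip), P(flip) for a BSC
            val = math.log2(2 * w)      # i(x; y|s) with uniform input
            m1 += ps * w * val
            m2 += ps * w * val * val
    return m1, m2

m1, m2 = moments(states)
V_total = m2 - m1 * m1                  # VAR(i(X; Y|S)) = V(W)

def C_V(e):
    """Per-state capacity C(W_s) and dispersion V(W_s), eqs. (23)-(24)."""
    vals = [(1 - e, math.log2(2 * (1 - e))), (e, math.log2(2 * e))]
    c = sum(p * v for p, v in vals)
    second = sum(p * v * v for p, v in vals)
    return c, second - c * c

EC = sum(ps * C_V(e)[0] for ps, e in states)     # E[C_S]
EV = sum(ps * C_V(e)[1] for ps, e in states)     # E[V_S]
VarC = sum(ps * C_V(e)[0] ** 2 for ps, e in states) - EC ** 2  # VAR[C_S]

# Eq. (21): V(W) = E[V_S] + VAR[C_S]
```

Since the underlying capacities here differ considerably, VAR[C_S] is a visible extra penalty on top of the averaged dispersion E[V_S].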

A few notes regarding this result:

• Let Ṽ(W) ≜ E[V(W_S)]. As an immediate corollary of Theorem 3, it can be seen that Ṽ(W) underestimates the true dispersion of W, V(W). This fact fits the exponent case: both the expected exponent and the expected dispersion are too optimistic w.r.t. the true exponent and dispersion.

• The term VAR[C(W_S)] can be viewed as a penalty factor over the expected dispersion Ṽ(W), which grows as the capacities of the underlying channels are more spread out.

IV. CODE DESIGN

When designing channel codes, the fact that the output is two-dimensional may complicate the code design. It would therefore be of interest to apply some processing to the outputs Y and S and feed them to the decoder as a single value. We seek a processing method that would not compromise the achievable performance over the modified channel (not only in the capacity sense, but also in the sense of the error probability at finite codelengths).

For binary channels this can be done easily by calculating the log-likelihood ratio (LLR) for each channel output pair (y, s) (see Figure 2). For channel outputs s and y, denote the LLR of x given (y, s) by z:

    z = LLR(y, s) ≜ log [ P_{Y|S,X}(y|s, x = 0) / P_{Y|S,X}(y|s, x = 1) ].    (26)

It is well known that for channels with binary input, the optimal ML decoder can be implemented to operate on the LLR values only. Therefore, by plugging the LLR calculator at the channel output and supplying the decoder with the LLRs only, the performance is not harmed, and we can regard the channel as a simple DMC with input x and output z = LLR(y, s) for code design purposes.
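A minimal sketch of the LLR computation (26), assuming two hypothetical states ('good' and 'bad') with hand-picked transition probabilities. The same received symbol produces a larger-magnitude LLR in the more reliable state, which is exactly the information the decoder gains from the CSI.

```python
import math

def llr(y, s, channels):
    """LLR of the input bit given output y and state s, eq. (26).

    channels[s][x] maps output symbols y to P(y | s, x).
    """
    p0 = channels[s][0][y]
    p1 = channels[s][1][y]
    return math.log(p0 / p1)

# Two BSC-like states over the binary output alphabet {0, 1}
channels = {
    'good': ({0: 0.95, 1: 0.05}, {0: 0.05, 1: 0.95}),
    'bad':  ({0: 0.70, 1: 0.30}, {0: 0.30, 1: 0.70}),
}

z_good = llr(0, 'good', channels)  # strong evidence for x = 0
z_bad = llr(0, 'bad', channels)    # same observation, weaker evidence
```

Collapsing (y, s) into the single real value z this way loses nothing for ML decoding, which is the point of the construction above.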

V. EXAMPLES

A. Symmetrization of binary channels with equiprobable input

In the design of state-of-the-art channel codes, it is usually convenient to have channels that are symmetric. In recent years, methods have been developed for designing very efficient binary codes, such as LDPC codes. When designing LDPC codes, a desired property of a binary channel is that its output be symmetric [6].

Definition 3 (Binary input, output symmetric channels [6]): A memoryless binary channel U with input alphabet {0, 1} and output alphabet R is called output-symmetric if, for all y ∈ R,

    U(y|0) = U(−y|1).    (27)

Consider a general binary channel W with arbitrary output (not necessarily symmetric), and suppose that, for practical reasons, we are interested in coding over this channel with equiprobable input (which may or may not be the capacity-achieving prior for that channel). The fact that we use equiprobable input does not make the channel symmetric according to Definition 3. However, there exists a method for transforming this channel into a symmetric one without compromising the capacity, error exponent or dispersion. First, we add the LLR calculation to the channel and regard it as a part of the channel; this way we get a real-output channel from any arbitrary channel. Second, instead of transmitting the binary codewords on the channel directly, we perform a bit-wise XOR operation with an i.i.d. pseudo-random binary vector. It can be shown that by multiplying the LLR values by −1 wherever the input was flipped, the LLRs are corrected. It can also be shown that the channel with the corrected LLR calculation is symmetric according to Definition 3. In [7], this method (termed 'channel adapters') was used in order to symmetrize the sub-channels of several coded modulation schemes. It is also shown in [7] that the capacity is unchanged by the channel adapters. By using Theorems 1 and 3, it can be verified that the error exponent bounds and the dispersion remain the same as well.
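The scramble-then-correct step can be sketched in a few lines. This toy version (our own construction, shown for a BSC in the noiseless case only) XORs the codeword with a pseudo-random vector before "transmission" and flips the LLR sign wherever a bit was scrambled; the corrected LLRs then point at the original codeword bits regardless of the scrambler.

```python
import math
import random

def llr_bsc(y, eps):
    """LLR for a BSC(eps) output bit y, favoring x = 0 when positive."""
    p0 = (1 - eps) if y == 0 else eps
    return math.log(p0 / (1 - p0))

random.seed(0)
n, eps = 8, 0.1
codeword = [0] * n                                  # all-zero codeword for the sketch
scrambler = [random.randint(0, 1) for _ in range(n)]

tx = [c ^ r for c, r in zip(codeword, scrambler)]   # XOR with pseudo-random vector
rx = tx                                             # channel taken noiseless here

# Correct the LLRs: multiply by -1 wherever the input was flipped
llrs = [llr_bsc(y, eps) * (1 - 2 * r) for y, r in zip(rx, scrambler)]
```

After correction every LLR equals log(0.9/0.1) = log 9 > 0, i.e. each position correctly indicates the transmitted 0, whether or not that position was scrambled.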

B. Multilevel Coding and Multistage Decoding (MLC-MSD)

MLC-MSD is a method for using binary codes in order to achieve capacity on nonbinary channels (see, e.g., [8]). In MLC-MSD, the binary encoders work in parallel over the same block of channel uses, and the decoders work sequentially as follows: the first decoder treats the rest of the codewords as noise and decodes the message from the first encoder. Every other decoder, in its turn, decodes the message from the corresponding encoder assuming that the decoded messages from the previous decoders are correct, and therefore regards these messages as side information. The effective channels between each encoder-decoder pair, called sub-channels, are in fact channels with CSI at the receiver, and can therefore be analyzed by Theorems 1 and 3. For more details on finite-length analysis of MLC-MSD, see [9].

Fig. 2. Incorporating LLR calculation into the channel. (The channel output and the random state feed an LLR calculator, whose single output z ∈ R^n is passed to the decoder.)

C. Bit-Interleaved Coded Modulation (BICM)

BICM [10] is another popular method for channel coding using binary codes over nonbinary channels (for example, a channel with output of size 2^L). It is based on taking a single binary code, feeding it into a long interleaver, and then mapping the interleaved coded bits onto the nonbinary channel alphabet (every L-tuple of consecutive bits is mapped to a symbol in the channel input alphabet of size 2^L). At the receiver, the LLRs of all coded bits are calculated according to the mapping, de-interleaved and fed to the decoder.

By assuming that the interleaver is ideal (i.e. of infinite length), the equivalent channel of BICM is modeled as a binary channel with a random state [10]. The state is chosen uniformly from {1, ..., L}, and represents the index of the input bit in the L-tuple. Since the state is known to the receiver only, this model fits the channel models discussed in this paper.

Finite blocklength analysis of BICM should be done carefully: although the model of a binary channel with a state known at the receiver allows the derivation of an error exponent and a channel dispersion, these do not have the usual meaning of quantifying the performance of BICM at finite block lengths. The reason is the interleaver: how can one rely on the existence of an infinite-length interleaver in order to estimate finite-length performance?

The solution comes in the form of an explicit finite-length interleaver. Recently, an alternative scheme called Parallel BICM was proposed [11], where binary codewords are used in parallel and an interleaver of finite length is used in order to validate the BICM model of a binary channel with a state known at the receiver. This allows the proper use of Theorems 1 and 3 (see [11] for the details).

D. Fading Channels

The Rayleigh fading channel, which is popular in wireless communication, can be modeled as a channel with CSI at the receiver. The state in this setting is the fade value, which is usually estimated, with some version of it available at the receiver. When the fading is fast (a.k.a. ergodic fading), the channel is memoryless and fits the model discussed in this paper, and Theorems 1 and 3 can be applied.

ACKNOWLEDGMENT

A. Ingber is supported by the Adams Fellowship Program of the Israel Academy of Sciences and Humanities.

REFERENCES

[1] Robert G. Gallager, Information Theory and Reliable Communication, John Wiley & Sons, Inc., New York, NY, USA, 1968.
[2] Pierre Moulin and Ying Wang, "Capacity and random-coding exponents for channel coding with side information," IEEE Trans. on Information Theory, vol. 53, pp. 1326–1347, 2007.
[3] Y. Polyanskiy, H. V. Poor, and S. Verdú, "Channel coding rate in the finite blocklength regime," IEEE Trans. on Information Theory, vol. 56, no. 5, pp. 2307–2359, May 2010.
[4] V. Strassen, "Asymptotische Abschätzungen in Shannons Informationstheorie," Trans. Third Prague Conf. Information Theory, Czechoslovak Academy of Sciences, pp. 689–723, 1962.
[5] Thomas M. Cover and Joy A. Thomas, Elements of Information Theory, John Wiley & Sons, 1991.
[6] Thomas J. Richardson, Mohammad Amin Shokrollahi, and Rüdiger L. Urbanke, "Design of capacity-approaching irregular low-density parity-check codes," IEEE Trans. on Information Theory, vol. 47, no. 2, pp. 619–637, 2001.
[7] Jilei Hou, Paul H. Siegel, Laurence B. Milstein, and Henry D. Pfister, "Capacity-approaching bandwidth-efficient coded modulation schemes based on low-density parity-check codes," IEEE Trans. on Information Theory, vol. 49, no. 9, pp. 2141–2155, 2003.
[8] Udo Wachsmann, Robert F. H. Fischer, and Johannes B. Huber, "Multilevel codes: Theoretical concepts and practical design rules," IEEE Trans. on Information Theory, vol. 45, no. 5, pp. 1361–1391, 1999.
[9] Amir Ingber and Meir Feder, "Capacity and error exponent analysis of multilevel coding with multistage decoding," in Proc. IEEE International Symposium on Information Theory, Seoul, South Korea, 2009, pp. 1799–1803.
[10] Giuseppe Caire, Giorgio Taricco, and Ezio Biglieri, "Bit-interleaved coded modulation," IEEE Trans. on Information Theory, vol. 44, no. 3, pp. 927–946, 1998.
[11] Amir Ingber and Meir Feder, "Parallel bit interleaved coded modulation," in Proc. 48th Annual Allerton Conference on Communication, Control and Computing, Allerton, IL, USA, Sep.–Oct. 2010.

000802

We shall be interested in the tradeoff between rate R. We conclude that the capacity formula can be interpreted as an expectation over the capacities of the underlying channels. where n • fn : M → X is the encoder. The communication scheme is depicted in Figure 1.PX ) = − log E 2−E0 (ρ.s∈S x∈X 1+ρ In the paper we limit ourselves to a ﬁxed input distribution (e. In fact.X (y|s. R)) . (s) where (a) follows from the deﬁnition of E0 . I NFORMATION T HEORETIC A NALYSIS Here we shall be interested in the performance of the optimal codes for the channel W . (8) − log s∈S 1+ρ ⎤ ⎦ × y∈Y (a) x∈X PX (x)PY |S. PX ) − ρR}. [5]): C(W ) = = = max I(X. S) max I(X. denoted E0 . S) p(x) p(x) p(x) B. Theorem 1: Let W be a channel with CSI at the receiver. Note that the capacity can also be written as max ES [I(X. We note that both bounds depend on the function E0 (·). Y |S = s)]. Following the relationship (8). Let n be the codeword length. • The considered error probability is the codeword error ˆ probability pe P (m = m). s|x) 1+ρ PS (s)× 1 Recalling the deﬁnition of the channel conditioned on the state s. p) − ρR}. x) PS (s)2−E0 s∈S ( S) ( s) 1 1+ρ where C(Ws ) is the capacity of the underlying channel Ws . PX ) is given by ⎡ − log⎣ y∈Y x∈X 1+ρ 1 1+ρ ⎤ ⎦. The random coding exponent is given by Er (R) = max max{E0 (ρ. e. s). respectively. = − log (ρ. we wish to ﬁnd the connections between E0 (·) and the corresponding E0 functions of the (s) conditional channels Ws . codelength n and error probability pe of the best possible codes. Communication Scheme The communication scheme is deﬁned as follows.PX ) . Y |S). ρ>0 pX (·) (12) max I(X. ρ∈[0. Error Exponent The error exponent of a channel is given by [1] 1 (9) E(R) lim − log (pe (n. The encoder and decoder are denoted fn and gn respectively. (11) PX (x)W (y|x) The sphere packing bound Esp (R) is given by Esp (R) = max max{E0 (ρ. and they coincide at rates beyond a certain rate called the critical rate. 
While the exact characterization of the error exponent is still an open question.g. (7) ⎤ ⎦ PX (x)W (y. the capacity can be simply derived (see. which are a lower and upper bounds. n→∞ n where pe (n. Y |S) ES [I(X. In this case the capacity is given by I(X. R) is the average codeword error probability for the best code of length n and rate R (assuming that the limit exists). PX ) = − log E 2−E0 ( S) (ρ. p(x) It can be seen that both exponent bounds are similar. PX ) = ⎡ = = − log ⎣ y∈Y. Y |S) + I(X. Y |S = s)]. we derive E0 (·) explicitly. (13) (6) Proof: When the channel output is (y. and let M be a set of 2nR messages. Note that when the CSI is available at the transmitter as well. which maps the channel output and the channel state to an estimate m of ˆ the transmitted message. Y |S = s)] E[C(WS )]. where the messages m are drawn randomly from M with equal probability. Capacity Since the channel W is simply a DMC with a scalar input and a vector output. Then the function E0 (·) for this channel is given by E0 (ρ. III. which maps the input message m to the channel input x ∈ X n . two important bounds are known [1]: the random coding and the sphere packing error exponents. (8) holds even without the assumption of a ﬁxed prior on X.g. they only differ in the optimization region of the parameter ρ.1] pX (·) (10) where E0 (ρ. A.B. We review known results for the capacity. we get E0 (ρ. we get C(W ) = = = I(X. and present the results for the error exponent and the channel dispersion. For channels with CSI at the receiver.PX ) . n n • gn : Y × S → M is the decoder. equiprobable). Y. (5) where the last equality holds since X and S are independent. 000799 . Y |S) = ES [I(X.

t.t. Theorem 3: The dispersion of the channel W with CSI at the receiver is given by V(W ) = E[V(WS )] + VAR [C(WS )] . r (14) pe and a codeword length n are given. deﬁne the (operational) channel dispersion as follows [3]: Deﬁnition 2: The dispersion V(W ) of a channel W with capacity C is deﬁned as V(W ) = lim lim sup n · pe →0 n→∞ ˜ Then Er (R) always overestimates the true random coding exponent of W . We conclude that the expectation (w. Communication scheme for channels with CSI at the receiver As a corollary. PX ). We can then seek the maximal achievable rate R given pe and n. y) is the information spectrum. Er (R)) will be given by the expectation of the exponent function w. (18) E E E(S) (R) r PX . It appears that for ﬁxed pe and n. It is also known that the channel dispersion and the error exponent are related as follows. ρ∈[0. ˜ We continue with Er (R): ˜ Er (R) = = ≥ = (a) C − R(n. Note that the proof of Theorem 2 holds no matter what the optimization region of ρ is. which is given by V VAR(i(X. In 1962 . Therefore the same version for the sphere packing exponent follows similarly.1] PX . (S) ˜ Proof: Let E0 (ρ. Formally. This result was recently tightened (and extended to the power-constrained AWGN channel) in [3]. PXY (x. PX ) − ρR ˜ E0 (ρ. over the critical rate). Strassen’s result proves that the dispersion of DMCs is equal to VAR(i(X. pe ) Q−1 (pe ) 2 . Suppose that a ﬁxed codeword error probability and the distribution of X is the capacity-achieving distribution that minimizes V . (17) (15) where R(n. Dispersion An alternative information theoretical measure for quantifying coding performance with ﬁnite block lengths is the channel dispersion.1] sup sup E E0 (ρ.1] sup (S) E0 (ρ.Random state s ∈ Sn m∈M x ∈ Xn W y ∈ Yn Decoder m ˆ Encoder Fig. it follows by the Jensen inequality and Theorem 1 that ˜ E0 (ρ. the error exponent can be approximated (for rates close to the capacity) by 2 E(R) ∼ (C−R) . ρ∈[0. Following (8). 
the gap to the √ channel capacity is approximately proportional to Q−1 (pe )/ n (where Q(·) is the complementary Gaussian cumulative distribution function). as seen in Theorem 1. ρ∈[0. Y )). In addition. pe ) = C − V −1 Q (pe ) + O n log n n . pe ) is the highest achievable rate for codeword error probability pe and codeword length n. PX ) = E E0 (ρ. PX ) ≤ E0 (ρ.1] sup = Er (R). (21) 000800 . For a channel with capacity C and dispersion V . Since 2−(·) is convex. Er (R). C. Strassen [4] used the Gaussian approximation to derive the following result for DMCs: R(n. y) . S.r. PX ) (S) − ρR where C is the channel capacity. PX ) . one might think that the error exponent bounds (for example. S) of the error exponent bounds overestimate the true exponent bounds of W (and also the true error exponent. given by i(x.r. Y )). S: ˜ Er (R) E E(S) (R) . where (a) follow from (15). we get the random coding and the sphere packing exponents for the channel W according to (10) and (12). The proportion constant (squared) is called the channel dispersion. See [3] for details on the early origins of = 2V ln 2 this approximation by Shannon.r. 1. We now explore the dispersion for the case of channels with side information at the receiver. This is clearly not the case. PX ) − ρR] (16) ≥ PX . PX (x)PY (y) (19) where i(x. ρ∈[0. y) log (20) PX . PX ) − ρR [E0 (ρ. and the new quantity V is the (information-theoretic) dispersion. the following can be shown: (S) ˜ Theorem 2: Let Er (R) be the average of Er w.t.

Proof: Since W is nothing but a DMC with the vector output (Y, S), the proof boils down to the calculation of Var[i(X; (Y, S))]. The information spectrum in this case is given by

  i(x; (y, s)) = log( P_YSX(y, s, x) / (P_YS(y, s) P_X(x)) ) =(a) log( P_{Y|S,X}(y|s, x) / P_{Y|S}(y|s) ),   (22)

where (a) follows since X and S are independent. Suppose that s is fixed, and consider the channel W_s. Its capacity is given by

  C(W_s) = E[i(X; Y|S) | S = s] = I(X; Y | S = s).   (23)

The dispersion of the channel W_s is given by

  V(W_s) = Var(i(X; Y|S) | S = s) = E[i²(X; Y|S) | S = s] − C(W_s)².   (24)

Finally, the dispersion of the original channel W is given as follows:

  V(W) = Var(i(X; (Y, S))) =(a) E[Var[i(X; Y|S) | S]] + Var[E[i(X; Y|S) | S]] = E[V(W_S)] + Var[C(W_S)],   (25)

where (a) follows from the law of total variance.

A few notes regarding this result:
• Let Ṽ(W) ≜ E[V(W_S)]. As an immediate corollary of Theorem 3, it can be seen that Ṽ(W) underestimates the true dispersion of W.
• The factor Var[C(W_S)] can be viewed as a penalty factor over the expected dispersion Ṽ(W), one that grows as the capacities of the underlying channels are more spread.
• This fact fits the exponent case: both the expected exponent and the expected dispersion are too optimistic w.r.t. the true exponent and dispersion.

IV. CODE DESIGN

In recent years, methods have been developed for designing very efficient binary codes, such as LDPC codes. When designing channel codes for channels with CSI at the receiver, the fact that the output (Y, S) is two-dimensional may complicate the code design. It would therefore be of interest to apply some processing on the outputs Y and S, and feed them to the decoder as a single value. We seek a processing method that does not compromise the achievable performance over the modified channel (not only in the capacity sense, but in the error probability at finite codelengths sense as well).

It is well known that for channels with binary input, the optimal ML decoder can be implemented to work on the LLR values only. For binary channels such processing can therefore be done easily by calculating the log-likelihood ratio (LLR) for each channel output pair (y, s). For channel outputs s and y, denote the LLR of x given (y, s) by z:

  z = LLR(y, s) = log( P_{YS|X}(y, s | x = 0) / P_{YS|X}(y, s | x = 1) ).   (26)

Therefore, by plugging the LLR calculator at the channel output and supplying the decoder with the LLRs only, the performance is not harmed, and we can regard the channel as a simple DMC with input x and output z = LLR(y, s) (see Figure 2). This way we get a real-output channel from any arbitrary channel.

Symmetrization of binary channels with equiprobable input

In the design of state-of-the-art channel codes, such as LDPC codes, it is usually convenient to have channels that are symmetric. A desired property of a binary channel is that its output be symmetric [6]:

Definition 3 (Binary-input, output-symmetric channels [6]): A memoryless binary channel U with input alphabet {0, 1} and output alphabet R is called output-symmetric if for all y ∈ R,

  U(y|0) = U(−y|1).   (27)

Consider a general binary channel W with arbitrary output (which is not necessarily symmetric), and suppose that, for practical reasons, we are interested in coding over this channel with equiprobable input (which may or may not be the capacity-achieving prior for that channel). The fact that we use equiprobable input does not make the channel symmetric according to Definition 3. However, there exists a method for transforming this channel to a symmetric one, without compromising on the capacity, error exponent or dispersion. First, instead of coding on the channel directly, we add the LLR calculation to the channel and regard it as a part of the channel, as above. Second, before we transmit the binary codewords on the channel, we perform a bit-wise XOR operation with an i.i.d. pseudo-random binary vector. It can be shown that by multiplying the LLR values by −1 wherever the input was flipped, the LLRs are corrected, and that the channel with the corrected LLR calculation is symmetric according to Definition 3. This symmetrized channel, with output z = LLR(y, s), can then be used for code design purposes.

In [7], this method (termed 'channel adapters') was used in order to symmetrize the sub-channels of several coded modulation schemes. It is also shown in [7] that the capacity is unchanged by the channel adapters. By using Theorems 1 and 3, it can be verified that the error exponent bounds and the dispersion remain the same as well.

V. EXAMPLES

A. Multilevel Coding and Multistage Decoding (MLC-MSD)

MLC-MSD is a method for using binary codes in order to achieve capacity on nonbinary channels (see, e.g., [8]).
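The symmetrization procedure described above (LLR front end, pseudo-random scrambling, LLR sign correction) can be sanity-checked numerically. The sketch below is our own illustration: it uses a made-up asymmetric binary channel without a state (with a receiver state s, the LLR would simply be computed from the pair (y, s) instead of y alone), and tabulates the distribution of the corrected LLR given each code bit, so that the output-symmetry condition U(z|0) = U(−z|1) of Definition 3 can be checked.

```python
import math
from collections import defaultdict

# Hypothetical asymmetric binary channel (our example): P(1|0) != P(0|1),
# so the raw channel is not output-symmetric.
P = {0: {0: 0.9, 1: 0.1},   # P(y | x = 0)
     1: {0: 0.3, 1: 0.7}}   # P(y | x = 1)

def llr(y):
    # Front-end LLR calculator: z = log P(y|x=0) / P(y|x=1).
    return math.log(P[0][y] / P[1][y])

def corrected_llr_dist(x):
    # Distribution of the corrected LLR given code bit x: transmit x XOR t for
    # an equiprobable scrambling bit t, then flip the LLR sign wherever t = 1.
    dist = defaultdict(float)
    for t in (0, 1):
        for y in (0, 1):
            z = round(llr(y) * (-1) ** t, 12)   # rounded so equal LLRs merge
            dist[z] += 0.5 * P[x ^ t][y]
    return dict(dist)

d0 = corrected_llr_dist(0)
d1 = corrected_llr_dist(1)
print(d0)
print(d1)
```

For this channel, d0 and d1 are mirror images of each other (d0[z] equals d1[−z] for every LLR value z), i.e. the scrambled channel with corrected LLR output is symmetric even though the underlying channel is not.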

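Theorem 3's decomposition V(W) = E[V(W_S)] + Var[C(W_S)], proved via the law of total variance, can be verified numerically. A minimal sketch, assuming a toy channel of our own choosing (not from the paper): the state S picks uniformly between two BSC crossover probabilities, so equiprobable input is capacity-achieving for every state, as the theorem requires.

```python
import math

# Toy channel with CSI at the receiver (illustrative choice): the state S
# selects, with equal probability, one of two BSC crossover probabilities.
states = {0.05: 0.5, 0.2: 0.5}

def bsc_C_V(d):
    # Capacity and dispersion of BSC(d) with equiprobable input.
    C = 1 + d * math.log2(d) + (1 - d) * math.log2(1 - d)
    V = d * (1 - d) * math.log2((1 - d) / d) ** 2
    return C, V

# Direct computation of V(W) = Var(i(X; (Y,S))) over the joint law. Given S = s
# with crossover d, i(X; Y|S) equals log2(2(1-d)) w.p. 1-d and log2(2d) w.p. d.
mean = 0.0
second = 0.0
for d, ps in states.items():
    for prob, i_val in ((1 - d, math.log2(2 * (1 - d))), (d, math.log2(2 * d))):
        mean += ps * prob * i_val
        second += ps * prob * i_val ** 2
V_direct = second - mean ** 2

# Theorem 3 decomposition: E[V(W_S)] + Var[C(W_S)].
EC = sum(ps * bsc_C_V(d)[0] for d, ps in states.items())
EV = sum(ps * bsc_C_V(d)[1] for d, ps in states.items())
VarC = sum(ps * bsc_C_V(d)[0] ** 2 for d, ps in states.items()) - EC ** 2
print(V_direct, EV + VarC)
```

The two computations agree to machine precision, and the strictly positive Var[C(W_S)] term is exactly the penalty over the expected dispersion Ṽ(W) = E[V(W_S)] discussed above.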
[Figure 2: m → Encoder → W (driven by the random state) → LLR calc. → z ∈ R^n → Decoder → m̂.]
Fig. 2. Incorporating the LLR calculation into the channel.

In MLC-MSD, the binary encoders work in parallel over the same block of channel uses, and the decoders work sequentially as follows: the first decoder assumes that the rest of the codewords are noise and decodes the message from the first encoder. Every other decoder, in its turn, decodes the message from the corresponding encoder assuming that the decoded messages from the previous decoders are correct, and therefore regards these messages as side information. The effective channels between each encoder-decoder pair, called sub-channels, are in fact channels with CSI at the receiver, and can therefore be analyzed by Theorems 1 and 3. For more details on the finite-length analysis of MLC-MSD, see [9].

B. Bit-Interleaved Coded Modulation (BICM)

BICM [10] is another popular method for channel coding using binary codes over nonbinary channels (for example, a channel with input alphabet of size 2^L). It is based on taking a single binary code, feeding it into a long interleaver, and then mapping the interleaved coded bits onto the nonbinary channel alphabet (every L-tuple of consecutive bits is mapped to a symbol in the channel input alphabet of size 2^L). At the receiver, the LLRs of all the coded bits are calculated according to the mapping, de-interleaved and fed to the decoder. By assuming that the interleaver is ideal (i.e., of infinite length), the equivalent channel of BICM is modeled as a binary channel with a random state [10]. The state is chosen uniformly from {1, ..., L}, and represents the index of the input bit in the L-tuple. Since the state is known to the receiver only, this model fits the channel models discussed in the paper, and Theorems 1 and 3 can be applied.

Finite blocklength analysis of BICM should be done carefully: although the model of a binary channel with a state known at the receiver allows the derivation of an error exponent and a channel dispersion, these do not have the usual meaning of quantifying the performance of BICM at finite block lengths. The reason is the interleaver: how can one rely on the existence of an infinite-length interleaver in order to estimate finite-length performance? The solution comes in the form of an explicit finite-length interleaver. Recently an alternative scheme called Parallel BICM was proposed [11], where binary codewords are used in parallel and an interleaver of finite length is used in order to validate the BICM model of a binary channel with a state known at the receiver. This allows the proper use of Theorems 1 and 3 (see [11] for the details).

C. Fading Channels

The Rayleigh fading channel, which is popular in wireless communication, can be modeled as a channel with CSI at the receiver. The state in this setting is the fade value, which is usually estimated, and some version of it is available at the receiver. When the fading is fast (a.k.a. ergodic fading), the channel is memoryless and fits the model discussed in the paper, and Theorems 1 and 3 can be applied.

ACKNOWLEDGMENT

A. Ingber is supported by the Adams Fellowship Program of the Israel Academy of Sciences and Humanities.

REFERENCES

[1] Robert G. Gallager, Information Theory and Reliable Communication, John Wiley & Sons, New York, NY, USA, 1968.
[2] Pierre Moulin and Ying Wang, "Capacity and random-coding exponents for channel coding with side information," IEEE Trans. on Information Theory, vol. 53, pp. 1326-1347, 2007.
[3] Y. Polyanskiy, H. V. Poor, and S. Verdú, "Channel coding rate in the finite blocklength regime," IEEE Trans. on Information Theory, vol. 56, no. 5, pp. 2307-2359, May 2010.
[4] V. Strassen, "Asymptotische Abschätzungen in Shannons Informationstheorie," in Trans. Third Prague Conf. Information Theory, Czechoslovak Academy of Sciences, 1962, pp. 689-723.
[5] Thomas M. Cover and Joy A. Thomas, Elements of Information Theory, John Wiley & Sons, Inc., 1991.
[6] Thomas J. Richardson, Mohammad Amin Shokrollahi, and Rüdiger L. Urbanke, "Design of capacity-approaching irregular low-density parity-check codes," IEEE Trans. on Information Theory, vol. 47, no. 2, pp. 619-637, 2001.
[7] Jilei Hou, Paul H. Siegel, Laurence B. Milstein, and Henry D. Pfister, "Capacity-approaching bandwidth-efficient coded modulation schemes based on low-density parity-check codes," IEEE Trans. on Information Theory, vol. 49, no. 9, pp. 2141-2155, 2003.
[8] Udo Wachsmann, Robert F. H. Fischer, and Johannes B. Huber, "Multilevel codes: Theoretical concepts and practical design rules," IEEE Trans. on Information Theory, vol. 45, no. 5, pp. 1361-1391, 1999.
[9] Amir Ingber and Meir Feder, "Capacity and error exponent analysis of multilevel coding with multistage decoding," in Proc. IEEE International Symposium on Information Theory, Seoul, South Korea, 2009, pp. 1799-1803.
[10] Giuseppe Caire, Giorgio Taricco, and Ezio Biglieri, "Bit-interleaved coded modulation," IEEE Trans. on Information Theory, vol. 44, no. 3, pp. 927-946, 1998.
[11] Amir Ingber and Meir Feder, "Parallel bit interleaved coded modulation," in Proc. 48th Annual Allerton Conference on Communication, Control and Computing, Allerton, IL, USA, September 29 - October 1, 2010.
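To illustrate the BICM equivalent-channel model of a binary channel with a receiver-side state, the following sketch is our own toy construction (not taken from [10] or [11]): L = 2 coded bits are Gray-mapped onto a 4-ary symbol and sent over a made-up 4-ary channel that confuses adjacent symbols with probability q. The bit index plays the role of the state S, and the two sub-channel capacities give E[C(W_S)] together with the spread Var[C(W_S)] that enters the dispersion via Theorem 3.

```python
import math
from itertools import product

# Toy BICM setup (illustrative): L = 2 bits, Gray-mapped to a 4-ary symbol.
q = 0.1
GRAY = {(0, 0): 0, (0, 1): 1, (1, 1): 2, (1, 0): 3}

def p_sym(y, x):
    # 4-ary channel: move to each in-range neighbour w.p. q, otherwise stay.
    if y == x:
        return 1 - q * len([n for n in (x - 1, x + 1) if 0 <= n <= 3])
    return q if abs(y - x) == 1 else 0.0

def subchannel(i):
    # Equivalent binary sub-channel W_i: bit index i is the receiver-side
    # state; the other coded bit is an equiprobable nuisance bit.
    P = {0: [0.0] * 4, 1: [0.0] * 4}   # P(y | b), y in 0..3
    for bits in product((0, 1), repeat=2):
        x = GRAY[bits]
        for y in range(4):
            P[bits[i]][y] += 0.5 * p_sym(y, x)
    return P

def capacity_bits(P):
    # I(B; Y) with equiprobable B for a binary-input channel given as P[b][y].
    py = [0.5 * (P[0][y] + P[1][y]) for y in range(4)]
    return sum(0.5 * P[b][y] * math.log2(P[b][y] / py[y])
               for b in (0, 1) for y in range(4) if P[b][y] > 0)

caps = [capacity_bits(subchannel(i)) for i in range(2)]
EC = sum(caps) / 2                              # BICM capacity per bit: E[C(W_S)]
VarC = sum(c * c for c in caps) / 2 - EC ** 2   # capacity spread across states
print(caps, EC, VarC)
```

With Gray labeling the two bit positions see different reliabilities, so Var[C(W_S)] is strictly positive: by Theorem 3, the equivalent BICM channel has a larger dispersion than the average of its sub-channel dispersions alone.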
