
Entropy provides a numerical measure of how much randomness or uncertainty there is in a random variable. This quantity is the starting point of a fundamental result in information theory, known as the channel coding theorem, which considers the problem of constructing error-free communication over a noisy channel.


FACULTY OF ENGINEERING

Spring 2016

MP 219 Mathematics 9 (Probability and Random Processes)

Dr. Sherif Rabia

Eng. Sara Kamel

Channel Coding Theorem

Team Members
----------------------
Seat Numbers: 72, 73, 149, 178, 214, 236

Index
----------
- Entropy definition
- Source coding
- Mutual information
- Channel capacity
- Channel coding theorem
- Matlab
- Sources

Introduction
----------
Information theory answers two fundamental questions in communication theory: what is the ultimate data compression (answer: the entropy H), and what is the ultimate transmission rate of communication (answer: the channel capacity C)? For this reason some consider information theory to be a subset of communication theory. We argue that it is much more.

Claude Elwood Shannon (1916-2001), American electrical engineer and mathematician, has been called the father of information theory, and was the founder of practical digital circuit design theory.

Channel coding theorem (overview): it is possible to achieve near-perfect communication of information over a noisy channel.

Entropy definition
------------------------
Shannon Information Content

The Shannon information content of an outcome x with probability p(x) is

I(x) = log2( 1 / p(x) )

Entropy

Definition: The entropy is a measure of the average uncertainty in the random variable. It is the average number of bits of Shannon information content required to describe the random variable.

The entropy H(X) of a discrete random variable X is defined by

H(X) = Σx p(x) log2( 1 / p(x) )

Properties:
1- H(X) ≥ 0.
2- H(X) = 0 if X is a deterministic variable (certainty).
3- H(X) is maximum for equiprobable outcomes (maximum uncertainty).

To show these properties, let X be a binary random variable with

P(X = 0) = 1 - p,   P(X = 1) = p.

Then

H(X) = -(1 - p) log2(1 - p) - p log2(p), which we denote H(p).

H(X) depends only on the probability mass function pX, not on the values of the RV X, so we can write H(p).
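These properties can be checked numerically; the sketch below (Python here, though the report's implementation section uses Matlab) computes the binary entropy function H(p):

```python
import math

def binary_entropy(p):
    """H(p) = -p*log2(p) - (1-p)*log2(1-p), taking 0*log2(0) = 0."""
    if p in (0.0, 1.0):
        return 0.0  # certainty: zero entropy
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(binary_entropy(0.5))  # 1.0 -> maximum: equiprobable outcomes
print(binary_entropy(0.0))  # 0.0 -> deterministic variable
```

H(p) peaks at p = 0.5, matching property 3 above.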

Joint entropy

We now extend the definition to a pair of random variables X & Y.


Definition: The joint entropy H(X, Y) of a pair of discrete random variables (X, Y) with a joint distribution p(x, y) is defined as

H(X, Y) = Σx,y p(x, y) log2( 1 / p(x, y) )

Conditional entropy

Definition: The conditional entropy H(Y|X) is defined as

H(Y|X) = Σx,y p(x, y) log2( 1 / p(y|x) )
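As a quick numerical illustration (the joint pmf values below are made up for the example), the chain rule H(X, Y) = H(X) + H(Y|X) can be verified directly:

```python
import math

# Hypothetical joint pmf p(x, y) over X in {0,1}, Y in {0,1} (illustrative values).
p_xy = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.5, (1, 1): 0.0}

def H_joint(p):
    """H(X,Y) = sum over x,y of p(x,y) * log2(1/p(x,y)), skipping zero entries."""
    return sum(v * math.log2(1 / v) for v in p.values() if v > 0)

def H_marginal_X(p):
    """Entropy of the marginal distribution of X."""
    px = {}
    for (x, y), v in p.items():
        px[x] = px.get(x, 0) + v
    return sum(v * math.log2(1 / v) for v in px.values() if v > 0)

H_XY = H_joint(p_xy)
H_X = H_marginal_X(p_xy)
H_Y_given_X = H_XY - H_X   # chain rule: H(X,Y) = H(X) + H(Y|X)
print(H_XY, H_X, H_Y_given_X)
```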

Source coding
----------------------
Introduction

The information coming from the source can be characters, if the information source is a text. It would be pixels if the information source is an image. So if I want to transmit pixels or characters, how could I do that? Well, this is done using a source code.

Source coding is a mapping from (a sequence of) symbols from an information source to a sequence of bits. This is the concept behind data compression:

- Lossless compression: the source can be recovered exactly from the binary bits, e.g. the WinZip software in Windows OS.
- Lossy compression: the source is recovered with some distortion, e.g. JPEG.

Source coding tries to make the code length minimal and helps to get rid of undesired or unimportant extra information.

The channel that will receive the code may not have the capacity to communicate at the source information rate. So, we use source coding to represent the source at a lower rate with some loss of information.

Code length

There are two types of codes: fixed-length codes and variable-length codes.

1. A fixed-length code, as its name clarifies, has symbols that all have the same number of bits.
2. A variable-length code has symbols with different numbers of bits, depending on the probability of each symbol.

A variable-length code is the better solution, as it allows the minimal code length.

The source symbols may have a uniform or non-uniform distribution. A non-uniform distribution of the source symbols may allow an efficient representation of the signal at a lower rate. If I have a symbol S1 that appears regularly (has a large probability), it would be good to encode this symbol into a short code word so that the length of the total code will be short.

The length of the code word is determined by the following formula:

l = log_r( 1 / p )

where r is the radix of the code (2 in the case of a binary code) and p is the probability of the symbol.

Let's take an example to make things clearer.

There's a source that generates three symbols: S1, S2 and S3. The probabilities of S1, S2 and S3 are 0.3, 0.5 and 0.2 respectively.

By applying the formula, we'll obtain the following results:

I1 = 1.7,   I2 = 1   and   I3 = 2.3.

But, as you may notice, these results are theoretical, as there is no such thing as 1.7 bits. So to make them practical we shall round them up. Thus, the results will be as follows:

I1 = 2,   I2 = 1   and   I3 = 3.

As we've mentioned before, notice that the symbol with the largest probability (S2) has the shortest length (only one bit).

So if the information source generates the following sequence:

S1 S2 S1 S3 S2 S2 S1 S2 S3 S2

the source coding will generate 17 bits (2 bits + 1 bit + 2 bits + 3 bits + 1 bit + 1 bit + 2 bits + 1 bit + 3 bits + 1 bit).
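A short sketch of this computation (Python, for illustration; the symbol names follow the example above):

```python
import math

# Symbol probabilities from the example above.
probs = {"S1": 0.3, "S2": 0.5, "S3": 0.2}

# Theoretical code-word length log2(1/p), rounded up to whole bits.
lengths = {s: math.ceil(math.log2(1 / p)) for s, p in probs.items()}
print(lengths)  # {'S1': 2, 'S2': 1, 'S3': 3}

sequence = ["S1", "S2", "S1", "S3", "S2", "S2", "S1", "S2", "S3", "S2"]
total_bits = sum(lengths[s] for s in sequence)
print(total_bits)  # 17
```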

One of the most fundamental properties of a source code is that it must be uniquely decodable.

{0, 010, 01, 10} is an example of a non-uniquely decodable source code. If we receive the following stream of bits, 001010, it can be read as 0 01 010, or 0 010 10, or 0 01 0 10. So confusion arises when we receive this stream.

But if the code were {10, 00, 11, 110} and we received the same stream 001010, it could only be read as 00 10 10. So {10, 00, 11, 110} is an example of a uniquely decodable source code.
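The ambiguity can be demonstrated by enumerating every possible parse of the stream; the helper below is illustrative, not part of the report's Matlab code:

```python
def parses(stream, code, prefix=()):
    """Enumerate all ways to split `stream` into words from `code`."""
    if not stream:
        yield prefix
    for w in code:
        if stream.startswith(w):
            yield from parses(stream[len(w):], code, prefix + (w,))

ambiguous = list(parses("001010", ["0", "010", "01", "10"]))
unique = list(parses("001010", ["10", "00", "11", "110"]))
print(len(ambiguous))  # several distinct parses -> not uniquely decodable
print(unique)          # [('00', '10', '10')] -> exactly one parse
```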

Mutual information
----------------------
Definition

Mutual information is one of many quantities that measures how much one

random variable tells us about another. It can be thought of as the reduction

in uncertainty about one random variable given knowledge of another.

Intuitively, mutual information measures the information that X and Y share:

it measures how much knowing one of these variables reduces uncertainty

about the other. For example, if X and Y are independent, then

knowing X does not give any information about Y and vice versa, so their

mutual information is zero. At the other extreme, if X is a deterministic

function of Y and Y is a deterministic function of X then all information

conveyed by X is shared with Y: knowing X determines the value of Y and vice

versa. High mutual information indicates a large reduction in uncertainty; low

mutual information indicates a small reduction; and zero mutual information

between two random variables means the variables are independent. An

important theorem from information theory says that the mutual information

between two variables is 0 if and only if the two variables are statistically

independent.

For example, suppose X represents the roll of a fair 6-sided die, and Y

represents whether the roll is even (0 if even, 1 if odd). Clearly, the value of Y

tells us something about the value of X and vice versa. That is, these

variables share mutual information.

On the other hand, if X represents the roll of one fair die, and Z represents

the roll of another fair die, then X and Z share no mutual information. The roll

of one die does not contain any information about the outcome of the other

die.
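A numerical sketch of the die-and-parity example (Python, for illustration):

```python
import math
from fractions import Fraction

# X: fair six-sided die roll; Y: 0 if the roll is even, 1 if odd.
p_xy = {(x, x % 2): Fraction(1, 6) for x in range(1, 7)}

def entropy(dist):
    return sum(float(p) * math.log2(1 / float(p)) for p in dist.values() if p)

def marginal(p, axis):
    out = {}
    for k, v in p.items():
        out[k[axis]] = out.get(k[axis], 0) + v
    return out

H_X = entropy(marginal(p_xy, 0))   # log2(6), about 2.585 bits
H_Y = entropy(marginal(p_xy, 1))   # 1 bit
H_XY = entropy(p_xy)               # log2(6), since Y is a function of X
I = H_X + H_Y - H_XY               # mutual information I(X;Y)
print(round(I, 6))  # 1.0 -> the parity tells us exactly 1 bit about the roll
```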

Mathematical representation

If the joint distribution of two discrete random variables X and Y is PXY(x, y), the mutual information between them, denoted I(X;Y), is given by

I(X;Y) = Σx,y PXY(x, y) log[ PXY(x, y) / ( PX(x) PY(y) ) ]

where the marginal distributions are

PX(x) = Σy PXY(x, y)   and   PY(y) = Σx PXY(x, y).

To understand what I(X;Y) actually means, let's rewrite the equation first:

I(X;Y) = H(X) - H(X|Y), where

H(X) = -Σx PX(x) log PX(x)

and

H(X|Y) = -Σy PY(y) Σx PX|Y(x|y) log PX|Y(x|y)

is the uncertainty remaining about X after observing Y.

The focus here is on discrete variables, but most results derived for discrete variables extend very naturally to continuous ones: one simply replaces sums by integrals.

The units of information depend on the base of the logarithm. If base 2 is used (the most common, and the one used here), information is measured in bits.

Channel capacity
-----------------------------
It's the highest rate of reliable (error-free) information transfer that can be achieved through a communication channel; formally, it is the maximum of the mutual information between the channel input and output over all input distributions.

In practice, the capacity of a channel is limited by:

- The attenuation of the channel, which varies with frequency as well as channel length.
- The noise induced into the channel, which increases with distance.
- Non-linear effects such as clipping of the signal.

The channel coding theorem states that if the information rate, R (bits/s) [the information rate is the average entropy per symbol], is equal to or less than the channel capacity C (i.e. R ≤ C), then there is, in principle, a coding technique which enables transmission over the noisy channel with no errors.

The converse is that if R > C, then the probability of error is close to 1 for every symbol.

The channel capacity theorem states that:

C = B log2( 1 + S/N )   bits/s

where

C: channel capacity
B: channel bandwidth
S: signal power
N: noise power

The capacity C increases when the bandwidth B increases, and also when the signal-to-noise ratio S/N increases.

The theorem applies to both analog and data communications, but its application is most common in data communications.

The channel capacity theorem relates three system parameters:
1- Channel bandwidth B
2- Average transmitted signal power S
3- Noise power at the channel N

Hence, for a given average transmitted power S and channel bandwidth B, we can transmit information at rates up to C bits/s without any error.


It's not possible to transmit information at any rate higher than C bits/s without having a definite probability of error. Hence the channel capacity theorem defines the fundamental limit on the rate of error-free transmission for a power-limited, band-limited channel.
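As a sketch (Python, for illustration; the 3000 Hz / 30 dB figures are the classic telephone-line example, not from the report):

```python
import math

def shannon_capacity(bandwidth_hz, snr_db):
    """C = B * log2(1 + S/N), with the signal-to-noise ratio given in dB."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)

# Telephone-line example: 3000 Hz bandwidth, 30 dB SNR.
c = shannon_capacity(3000, 30)
print(round(c))  # 29902 bits/s
```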

------------------------------------
The purpose of channel coding theory is to find codes which transmit quickly, contain many valid code words, and can correct or at least detect many errors. While not mutually exclusive, performance in these areas involves a trade-off, so different codes are optimal for different applications. The needed properties of the code mainly depend on the probability of errors happening during transmission.

Although not a very good code, a simple repeat code can serve as an

understandable example. Suppose we take a block of data bits (representing


sound) and send it three times. At the receiver we will examine the three

repetitions bit by bit and take a majority vote. The twist on this is that we

don't merely send the bits in order. We interleave them. The block of data bits

is first divided into 4 smaller blocks. Then we cycle through the block and

send one bit from the first, then the second, etc. This is done three times to

spread the data out over the surface of the disk. In the context of the simple

repeat code, this may not appear effective. However, there are more powerful

codes known which are very effective at correcting the "burst" error of a

scratch or a dust spot when this interleaving technique is used.
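A toy simulation of this idea (Python, for illustration; it interleaves across the three repetitions rather than using the exact 4-block scheme described above):

```python
def majority(a, b, c):
    """Majority vote over three received copies of one bit."""
    return 1 if a + b + c >= 2 else 0

def flip_burst(stream, start, length):
    """Corrupt `length` consecutive bits starting at `start` (a burst error)."""
    s = list(stream)
    for i in range(start, start + length):
        s[i] ^= 1
    return s

data = [1, 0, 1, 1, 0, 0, 1, 0]
n = len(data)

# Scheme A: send each bit three times in a row -> a 3-bit burst can
# destroy all copies of one data bit.
tripled = [b for b in data for _ in range(3)]
rx = flip_burst(tripled, 3, 3)          # burst wipes out all copies of data[1]
decoded_a = [majority(*rx[3*i:3*i+3]) for i in range(n)]

# Scheme B: interleave -> send copy 1 of every bit, then copy 2, then copy 3,
# so the same burst damages three different bits once each.
interleaved = data * 3
rx = flip_burst(interleaved, 3, 3)
decoded_b = [majority(rx[i], rx[i + n], rx[i + 2*n]) for i in range(n)]

print(decoded_a == data)  # False: the burst defeated the naive repeat code
print(decoded_b == data)  # True: interleaving spread the burst out
```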

A number of algorithms are used for channel coding; we will discuss some of them which are linear. First, let's explain some definitions.

Systematic codes: In a systematic code, the input symbols are embedded unchanged in the encoded output. Conversely, in a non-systematic code the output does not contain the input symbols.

Systematic codes have the advantage that the parity data can simply be appended to the source block, and receivers do not need to recover the original source symbols if they are received correctly. For engineering purposes such as synchronization and monitoring, it is desirable to get reasonably good estimates of the received source symbols without going through the lengthy decoding process, which may be carried out at a remote site at a later time.

The codes we are going to discuss will be systematic codes.

Block codes: In coding theory, a block code is any member of the large and

important family of error-correcting codes that encode data in blocks. There is

a vast number of examples for block codes, many of which have a wide range

of practical applications. Block codes are conceptually useful because they

allow coding theorists, mathematicians, and computer scientists to study the

limitations of all block codes in a unified way. Such limitations often take the

form of bounds that relate different parameters of the block code to each

other, such as its rate and its ability to detect and correct errors.


Cyclic code

A cyclic code is a block code, where the circular shift of each code word

gives another word that belongs to the code. They are error-correcting

codes that have algebraic properties that are convenient for efficient error

detection and correction.

"If 00010111 is a valid code word, applying a right circular shift gives the

string 10001011. If the code is cyclic, then 10001011 is again a valid code

word. In general, applying a right circular shift moves the least significant bit

(LSB) to the leftmost position, so that it becomes the most significant bit

(MSB); the other positions are shifted by 1 to the right"

General definition:

Let C be a linear code over a finite field GF(q) of block length n. C is called a cyclic code if, for every code word c = (c1, ..., cn) from C, the word (cn, c1, ..., cn-1) in GF(q)^n is again a code word. Because one cyclic right shift is equal to n-1 cyclic left shifts, a cyclic code may also be defined via cyclic left shifts. Therefore the linear code C is cyclic precisely when it is invariant under all cyclic shifts.
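The shift in the quoted example can be checked directly (Python, for illustration; the length-3 code used below is an assumed toy example):

```python
def right_cyclic_shift(word):
    """Move the last bit (LSB) to the front (MSB position)."""
    return word[-1] + word[:-1]

print(right_cyclic_shift("00010111"))  # "10001011", as in the example above

def all_shifts(word):
    shifts, w = [], word
    for _ in range(len(word)):
        w = right_cyclic_shift(w)
        shifts.append(w)
    return shifts

# A code is cyclic when every shift of every code word is again a code word;
# e.g. this toy length-3 binary code is closed under cyclic shifts.
code = {"000", "110", "011", "101"}
print(all(s in code for w in code for s in all_shifts(w)))  # True
```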

Parity

Definition:

A parity bit, or check bit, is a bit added to the end of a string of binary code that indicates whether the number of bits in the string with the value one is even or odd. Parity bits are used as the simplest form of error-detecting code.

Parity types:

In the case of even parity, for a given set of bits, the occurrence of bits whose

value is 1 is counted. If that count is odd, the parity bit value is set to 1,

making the total count of occurrences of 1's in the whole set (including the

parity bit) an even number. If the count of 1's in a given set of bits is already

even, the parity bit's value remains 0.

In the case of odd parity, the situation is reversed. For a given set of bits, if the count of bits with a value of 1 is even, the parity bit value is set to 1, making the total count of 1's in the whole set (including the parity bit) an odd number. If the count of bits with a value of 1 is odd, the count is already odd, so the parity bit's value remains 0.

If the parity bit is present but not used, it may be referred to as mark

parity (when the parity bit is always 1) or space parity (the bit is always 0).
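A minimal sketch of parity generation (Python, for illustration):

```python
from functools import reduce
from operator import xor

def parity_bit(bits, even=True):
    """Even parity: make the total number of 1s (incl. the parity bit) even.
    Odd parity reverses the rule."""
    ones = sum(bits)
    return ones % 2 if even else 1 - ones % 2

data = [1, 0, 1, 1, 0, 1, 0]      # 7 data bits, as in 7-bit ASCII + parity
p = parity_bit(data)               # even parity
assert (sum(data) + p) % 2 == 0    # total count of 1s is now even

# XOR-sum formulation: the parity of the set is the XOR of all its bits.
print(reduce(xor, data) == parity_bit(data))  # True
```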

Parity in Mathematics:

In mathematics, parity refers to the evenness or oddness of an integer, which

for a binary number is determined only by the least significant bit. In

telecommunications and computing, parity refers to the evenness or oddness

of the number of bits with value one within a given set of bits, and is thus

determined by the value of all the bits. It can be calculated via an XOR sum of the bits, yielding 0 for even parity and 1 for odd parity. This property of being dependent upon all the bits, and of changing value if any one bit changes, allows for its use in error detection schemes.

Error detection:

If an odd number of bits (including the parity bit) are transmitted incorrectly,

the parity bit will be incorrect, thus indicating that a parity error occurred in

the transmission. The parity bit is only suitable for detecting errors; it cannot

correct any errors, as there is no way to determine which particular bit is

corrupted. The data must be discarded entirely, and re-transmitted from

scratch. On a noisy transmission medium, successful transmission can

therefore take a long time, or even never occur. However, parity has the

advantage that it uses only a single bit and requires only a number of XOR

gates to generate. Hamming code is an example of an error-correcting code.

Parity bit checking is used occasionally for transmitting ASCII characters,

which have 7 bits, leaving the 8th bit as a parity bit.

Hamming code

Hamming(7,4) is a linear error-correcting code that encodes four bits of data into seven bits by adding three parity bits. It is a member of a larger family of Hamming codes.

It can detect up to two-bit errors or correct one-bit errors without detection of uncorrected errors. By contrast, the simple parity code cannot correct errors, and can detect only an odd number of bits in error. Hamming codes are perfect codes; that is, they achieve the highest possible rate for codes with their block length and minimum distance of three.

The goal of Hamming codes is to create a set of parity bits that overlap such that a single-bit error (the bit is logically flipped in value) in a data bit or a parity bit can be detected and corrected. While multiple overlaps can be created, the general method is presented below.

The following table describes which parity bits cover which transmitted bits in the encoded word:

Bit position:   1    2    3    4    5    6    7
Encoded bit:    p1   p2   d1   p3   d2   d3   d4
p1 covers bits: 1, 3, 5, 7
p2 covers bits: 2, 3, 6, 7
p3 covers bits: 4, 5, 6, 7

For example, p2 provides an even parity for bits 2, 3, 6 and 7. Reading down a column shows which parity bits cover each transmitted bit: for example, d1 is covered by p1 and p2 but not p3. This table has a striking resemblance to the parity-check matrix (H) in the next section.
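A minimal sketch of Hamming(7,4) encoding and single-error correction (Python, for illustration; it uses the standard bit layout from the table above):

```python
# Hamming(7,4): encode 4 data bits with 3 parity bits, correct 1-bit errors.
# Bit positions (1-based): 1=p1, 2=p2, 3=d1, 4=p3, 5=d2, 6=d3, 7=d4.

def hamming74_encode(d1, d2, d3, d4):
    p1 = d1 ^ d2 ^ d4          # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4          # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(word):
    w = list(word)
    # Each syndrome bit re-checks one parity group; the combined syndrome
    # is the (1-based) position of a single-bit error, or 0 if none.
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s3 = w[3] ^ w[4] ^ w[5] ^ w[6]
    syndrome = s1 * 1 + s2 * 2 + s3 * 4
    if syndrome:
        w[syndrome - 1] ^= 1   # flip the erroneous bit back
    return w

code = hamming74_encode(1, 0, 1, 1)
corrupted = list(code)
corrupted[4] ^= 1              # single-bit error at position 5
print(hamming74_correct(corrupted) == code)  # True: error located and fixed
```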

Matlab implementation
------------
Hamming Code

Simulation

Huffman Code

Simulation

Sources
-----------
http://coltech.vnu.edu.vn/~thainp/books/Wiley_-_2006__Elements_of_Information_Theory_2nd_Ed.pdf
http://mailhes.perso.enseeiht.fr/documents/SourceCoding_Mailhes.pdf
https://en.wikipedia.org/wiki/Shannon%27s_source_coding_theorem
https://en.wikipedia.org/wiki/Coding_theory#Source_coding
http://www.scholarpedia.org/article/Mutual_information
http://www.ee.ic.ac.uk/hp/staff/dmb/courses/infotheory/info_1.pdf
