
Department of Electronics and Communication

AY 2022-23 Sem-V
Information Theory and Coding

Unit-2
Source Coding

Need for source coding:

The aim of source coding is to represent information as accurately as possible using as few bits as possible; to do so, the redundancy of the source needs to be removed. Source coding therefore reduces redundancy and improves the efficiency of the system. It is the process of arranging an efficient representation of the data generated by the source.
Two approaches:
 Fixed-length coding: all symbols are assigned codewords of equal length.
 Variable-length coding: the codeword length varies in accordance with the probability of occurrence of each symbol.
The coding is performed by the source encoder. The output of a discrete memoryless source has to be represented efficiently, which is an important problem in communications; for this purpose, code words are assigned to the source symbols.
For example, in telegraphy we use the Morse code, in which the letters are denoted by marks and spaces (dots and dashes). The letter E, which is used most often, is denoted by ".", whereas the letter Q, which is used rarely, is denoted by "--.-".

[Figure: block diagram of the source encoder.]

Here s_k is the output of the discrete memoryless source and b_k is the output of the source encoder, represented by 0s and 1s. The encoded sequence must be such that it can be conveniently (uniquely) decoded at the receiver.

Let us assume that the source alphabet has M different symbols, and that the kth symbol s_k occurs with probability p_k, where k = 1, 2, …, M. Let the binary codeword assigned to symbol s_k by the encoder have length l_k, measured in bits. We then define the average codeword length L̄ of the source encoder as

L̄ = Σ_{k=1}^{M} p_k l_k

Example: consider four symbols, each with probability 1/4.

Symbol   bits    Source encoder output   Length l_k
S1       000     0                       1
S2       0100    01                      2
S3       0010    10                      2
S4       0011    11                      2

L̄ = (1/4)(1 + 2 + 2 + 2) = 7/4 bits
L̄ represents the average number of bits per source symbol. If L̄_min denotes the minimum possible value of L̄, then the coding efficiency of the source encoder is defined as

η = L̄_min / L̄

With L̄ ≥ L̄_min we have η ≤ 1. Source coding aims at a compact representation (for example, a symbol that a fixed-length code would spend 7 bits on may be assigned a 1- or 2-bit codeword), so L̄ should be kept as small as possible, subject to L̄ ≥ L̄_min; this is the condition for a good (optimal) code. The value of L̄_min is given by Shannon's first theorem. The source encoder is said to be efficient when η = 1, and for this the value of L̄_min has to be determined.
Let us refer to the definition: "Given a discrete memoryless source of entropy H(X), the average code-word length L̄ for any source encoding is bounded as L̄ ≥ H(X)" (Shannon's first theorem). In simpler words, the encoded representation of a message, for example the Morse code --.- ..- . ..- . for the word QUEUE, must on average contain at least as many code symbols as the information content of the message demands; no lossless code can be shorter, on average, than the source entropy allows. Hence, with L̄_min = H(X), the efficiency of the source encoder in terms of the entropy H(X) may be written as

η = H(X) / L̄
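As a quick numerical illustration of these definitions, here is a minimal Python sketch (using an assumed prefix code whose probabilities and lengths are not taken from the table above) that computes L̄, H(X) and η:

import math

# Illustrative example (assumed): a prefix code with codewords 0, 10, 110, 111.
probs   = [0.5, 0.25, 0.125, 0.125]        # p_k
lengths = [1, 2, 3, 3]                     # l_k

L_bar = sum(p * l for p, l in zip(probs, lengths))      # average codeword length L̄
H     = sum(p * math.log2(1 / p) for p in probs)        # source entropy H(X)
eta   = H / L_bar                                       # efficiency with L̄_min = H(X)
print(L_bar, H, eta)                                    # 1.75 1.75 1.0

Here L̄ = H(X) = 1.75 bits/symbol, so the efficiency is 100%.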
Source Coding theorem: noiseless coding theorem / Shannon’s first theorem.


L̄ ≥ H(X)

Lower and upper limits for L̄:

H(X) ≤ L̄ < H(X) + 1

This source coding theorem is called the noiseless coding theorem because it establishes error-free (lossless) encoding. It is also called Shannon's first theorem.

State and Prove Shannon’s first theorem.


Proof:
We prove the two bounds separately: (i) L̄ ≥ H(X) and (ii) L̄ < H(X) + 1.

Proof that L̄ − H(X) ≥ 0:

First write the expressions for L̄ and H(X):

L̄ − H(X) = Σ_{k=1}^{M} p_k l_k − Σ_{k=1}^{M} p_k log2(1/p_k)

Since log2 2 = 1, we can rewrite l_k = l_k log2 2 = log2 2^l_k. Taking p_k as the common factor:

L̄ − H(X) = Σ_{k=1}^{M} p_k [ log2 2^l_k − log2(1/p_k) ]
          = Σ_{k=1}^{M} p_k [ log2 2^l_k + log2 p_k ]
          = Σ_{k=1}^{M} p_k log2( p_k 2^l_k )

Changing to natural logarithms for convenience (log2 y = ln y / ln 2):

L̄ − H(X) = (1/ln 2) Σ_{k=1}^{M} p_k ln( p_k 2^l_k )

Recall the identity ln x ≤ x − 1 for x > 0; equivalently, ln(1/x) ≥ 1 − x, i.e. ln x ≥ 1 − 1/x.

In our case let x = p_k 2^l_k, so that ln( p_k 2^l_k ) ≥ 1 − 1/( p_k 2^l_k ) = 1 − 2^(−l_k)/p_k. Substituting:

L̄ − H(X) ≥ (1/ln 2) [ Σ_{k=1}^{M} p_k − Σ_{k=1}^{M} 2^(−l_k) ]
          = (1/ln 2) [ 1 − Σ_{k=1}^{M} 2^(−l_k) ]

since Σ p_k = 1 always. By the Kraft inequality, Σ_{k=1}^{M} 2^(−l_k) ≤ 1 for any uniquely decodable (prefix) code, so the bracketed term is non-negative. Hence

L̄ − H(X) ≥ 0, i.e. L̄ ≥ H(X).
Proof that L̄ < H(X) + 1:

Choose the codeword length l_k of symbol s_k as the smallest integer satisfying

2^(−l_k) ≤ p_k        ...(A)

Such lengths always admit a prefix code, because Σ_{k=1}^{M} 2^(−l_k) ≤ Σ_{k=1}^{M} p_k = 1, so the Kraft inequality is satisfied.

Because l_k is the smallest integer satisfying (A), the length l_k − 1 violates it, i.e. p_k < 2^(−l_k + 1). Combining the two statements:

2^(−l_k) ≤ p_k < 2^(−l_k + 1)        ...(B)

Taking log2 of the right-hand inequality in (B):

log2 p_k < (−l_k + 1) log2 2 = −l_k + 1

l_k < 1 − log2 p_k = 1 + log2(1/p_k)

Multiplying by p_k and summing over all k, and using Σ_{k=1}^{M} p_k = 1:

Σ_{k=1}^{M} p_k l_k < Σ_{k=1}^{M} p_k + Σ_{k=1}^{M} p_k log2(1/p_k)

L̄ < 1 + H(X)        Hence proved.

Combining the two parts, H(X) ≤ L̄ < H(X) + 1.
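The second part of the proof suggests a concrete choice, l_k = ⌈log2(1/p_k)⌉. The following sketch (with an arbitrarily assumed probability set) checks that these lengths satisfy the Kraft inequality and that H(X) ≤ L̄ < H(X) + 1:

import math

probs = [0.4, 0.3, 0.2, 0.1]                              # assumed example probabilities

# Smallest integer lengths with 2**-l_k <= p_k, as in the proof.
lengths = [math.ceil(math.log2(1 / p)) for p in probs]

kraft = sum(2.0 ** -l for l in lengths)                   # Kraft sum, must be <= 1
H     = sum(p * math.log2(1 / p) for p in probs)          # entropy H(X)
L_bar = sum(p * l for p, l in zip(probs, lengths))        # average codeword length

print(lengths, kraft)                                     # [2, 2, 3, 4] 0.6875
print(H, L_bar)                                           # ~1.846  2.4
assert kraft <= 1 and H <= L_bar < H + 1                  # the bounds proved above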

Prefix Code :
A prefix code is a type of code system distinguished by its possession of the "prefix property",
which requires that there is no whole code word in the system that is a prefix (initial segment) of
any other code word in the system.
For example, a code with code words {9, 55} has the prefix property; a code consisting of
{9, 5, 59, 55} does not, because "5" is a prefix of "59" and also of "55". A prefix code is
a uniquely decodable code: given a complete and accurate sequence, a receiver can identify each
word without requiring a special marker between words. However, there are uniquely decodable
codes that are not prefix codes; for instance, the reverse of a prefix code is still uniquely
decodable (it is a suffix code), but it is not necessarily a prefix code.
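As a small illustration (hypothetical helper functions, not part of the notes), the prefix property of the two code sets above can be checked, and a prefix code can be decoded greedily by stripping off complete codewords:

def is_prefix_code(codewords):
    # True if no codeword is a prefix (initial segment) of another codeword.
    return not any(a != b and b.startswith(a) for a in codewords for b in codewords)

print(is_prefix_code({"9", "55"}))             # True
print(is_prefix_code({"9", "5", "59", "55"}))  # False: "5" is a prefix of "59" and "55"

def decode(bits, code):
    # Greedy decoding of a prefix code: code maps codeword -> symbol.
    symbols, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in code:                     # a complete codeword has been read
            symbols.append(code[buffer])
            buffer = ""
    return symbols

# Hypothetical binary prefix code used only for this illustration.
print(decode("0101100111", {"0": "A", "10": "B", "110": "C", "111": "D"}))
# ['A', 'B', 'C', 'A', 'D']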


Prefix codes are also known as


 prefix-free codes / comma-free codes,
 prefix condition codes, and
 instantaneous codes.
 Although Huffman coding is just one of many algorithms for deriving prefix codes,
prefix codes are also widely referred to as "Huffman codes", even when the code was not
produced by a Huffman algorithm.
 Using prefix codes, a message can be transmitted as a sequence of concatenated code
words, without any out-of-band markers or (alternatively) special markers between words
to frame the words in the message. The recipient can decode the message unambiguously,
by repeatedly finding and removing sequences that form valid code words. Prefix codes
are not error-correcting codes. In practice, a message might first be compressed with a
prefix code, and then encoded again with channel coding (including error correction)
before transmission.
 For any uniquely decodable code there is a prefix code that has the same code word lengths. Kraft's inequality characterizes the sets of code word lengths that are possible in a uniquely decodable code.
Examples of prefix codes include:

 variable-length Huffman codes


 country calling codes
 Chen–Ho encoding
 the country and publisher parts of ISBNs
 the Secondary Synchronization Codes used in the UMTS W-CDMA 3G Wireless
Standard


 VCR Plus+ codes


 Unicode Transformation Format, in particular the UTF-8 system for
encoding Unicode characters, which is both a prefix-free code and a self-
synchronizing code
 variable-length quantity

Kraft McMillan Inequality property – KMI


In coding theory, the Kraft–McMillan inequality gives a necessary and sufficient condition for the
existence of a prefix code[1] (in Leon G. Kraft's version) or a uniquely decodable code (in Brockway
McMillan's version) for a given set of codeword lengths.
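In other words, binary codeword lengths l_1, …, l_M can be realised by a prefix (or uniquely decodable) code if and only if Σ 2^(−l_k) ≤ 1. A one-line check, with illustrative length sets assumed for the example:

def kraft_sum(lengths, q=2):
    # Kraft-McMillan sum for codeword lengths over a q-ary code alphabet.
    return sum(q ** -l for l in lengths)

print(kraft_sum([1, 2, 3, 3]))   # 1.0  -> realisable, e.g. by the prefix code 0, 10, 110, 111
print(kraft_sum([1, 1, 2]))      # 1.25 -> no uniquely decodable binary code has these lengths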

Shannon Fano Algorithm:

The Shannon-Fano algorithm is an entropy-encoding technique for lossless data compression of multimedia. Named after Claude Shannon and Robert Fano, it assigns a code to each symbol based on its probability of occurrence. It is a variable-length encoding scheme, that is, the codes assigned to the symbols are of varying length.
HOW DOES IT WORK?
The steps of the algorithm are as follows:
1. Create a list of probabilities or frequency counts for the given set of symbols so that the
relative frequency of occurrence of each symbol is known.
2. Sort the list of symbols in decreasing order of probability, the most probable ones to
the left and least probable to the right.
3. Split the list into two parts, with the total probability of both the parts being as close to
each other as possible.
4. Assign the value 0 to the left part and 1 to the right part.
5. Repeat the steps 3 and 4 for each part, until all the symbols are split into individual
subgroups.
The Shannon-Fano code obtained in this way is valid as long as the codeword of each symbol is unique.
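A compact recursive sketch of steps 1-5 is given below (an illustrative implementation with a hypothetical function name; the tie-breaking used when choosing the split point may differ from the hand construction in the worked example):

def shannon_fano(prob):
    # prob: dict mapping symbol -> probability; returns dict symbol -> codeword.
    symbols = sorted(prob, key=prob.get, reverse=True)    # step 2: decreasing probability
    codes = {s: "" for s in symbols}

    def split(group):
        if len(group) <= 1:                               # step 5: stop at single symbols
            return
        total, running, cut, best = sum(prob[s] for s in group), 0.0, 1, float("inf")
        for i in range(1, len(group)):                    # step 3: most balanced split
            running += prob[group[i - 1]]
            if abs(total - 2 * running) < best:
                best, cut = abs(total - 2 * running), i
        left, right = group[:cut], group[cut:]
        for s in left:                                    # step 4: 0 to the left part,
            codes[s] += "0"
        for s in right:                                   #         1 to the right part
            codes[s] += "1"
        split(left)
        split(right)

    split(symbols)
    return codes

print(shannon_fano({"A": 0.22, "B": 0.28, "C": 0.15, "D": 0.30, "E": 0.05}))
# {'D': '00', 'B': '01', 'A': '10', 'C': '110', 'E': '111'}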
EXAMPLE:
1. Using the Shannon-Fano lossless compression technique, find the codewords, average codeword length, code efficiency, entropy and information rate for the symbols A, B, C, D, E with probabilities 0.22, 0.28, 0.15, 0.30 and 0.05, if the Nyquist rate is 1000 samples/sec.


[Step table and coding tree: see the figures in the worked solution below.]

Solution: use the following relations.

Average codeword length: L̄ = Σ_{k=1}^{M} p_k l_k

Entropy: H(S) = Σ_{i=1}^{M} p_i log2(1/p_i)

r = Nyquist rate = 1000 samples/sec

Information rate: R = H(S) · r

Code efficiency: η = H(S) / L̄


Let P(x) be the probability of occurrence of symbol x.

1. Arrange the symbols in decreasing order of probability: D (0.30), B (0.28), A (0.22), C (0.15), E (0.05).

2. Split the list and compare the group probabilities:

P(D) + P(B) = 0.30 + 0.28 = 0.58
P(A) + P(C) + P(E) = 0.22 + 0.15 + 0.05 = 0.42

Since this split divides the total probability almost equally, the table is divided into {D, B} and {A, C, E}, which are assigned the values 0 and 1 respectively.

3. In the {D, B} group, P(D) = 0.30 and P(B) = 0.28, i.e. P(D) ≈ P(B), so {D, B} is divided into {D} and {B}, with 0 assigned to D and 1 to B.


[Step table and coding tree at this stage: see figure.]

4. In the {A, C, E} group, P(A) = 0.22 and P(C) + P(E) = 0.20, so the group is divided into {A} and {C, E}, which are assigned the values 0 and 1 respectively.


5. In the {C, E} group, P(C) = 0.15 and P(E) = 0.05, so it is divided into {C} and {E}, with 0 assigned to C and 1 to E.

[Final step table and coding tree: see figure.]

The Shannon-Fano codes for the set of symbols are:

Symbol   Probability   Codeword   Length l_k
D        0.30          00         2
B        0.28          01         2
A        0.22          10         2
C        0.15          110        3
E        0.05          111        3


The average codeword length:

L̄ = Σ_{k=1}^{5} p_k l_k = 0.22(2) + 0.28(2) + 0.15(3) + 0.30(2) + 0.05(3) = 2.2 bits/symbol

r = Nyquist rate = 1000 samples/sec

Entropy:

H(S) = Σ_{i=1}^{5} p_i log2(1/p_i)
     = 0.22 log2(1/0.22) + 0.28 log2(1/0.28) + 0.15 log2(1/0.15) + 0.30 log2(1/0.30) + 0.05 log2(1/0.05)
     = 2.142 bits/symbol

Information rate:

R = H(S) · r = 2.142 × 1000 = 2142 bits/sec


Code efficiency:

With L̄_min = H(S) = 2.142 bits/symbol,

η = H(S) / L̄ = 2.142 / 2.2 ≈ 0.974 = 97.4%

Redundancy = 1 − η ≈ 1 − 0.974 = 0.026
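The numbers above can be re-checked with a short sketch (it simply recomputes L̄, H(S), the information rate and the efficiency from the probabilities and codeword lengths of this example):

import math

probs   = {"A": 0.22, "B": 0.28, "C": 0.15, "D": 0.30, "E": 0.05}
lengths = {"A": 2, "B": 2, "C": 3, "D": 2, "E": 3}          # from the code table above
r = 1000                                                     # Nyquist rate, samples/sec

L_bar = sum(probs[s] * lengths[s] for s in probs)            # 2.2 bits/symbol
H     = sum(p * math.log2(1 / p) for p in probs.values())    # ~2.142 bits/symbol
print(L_bar, H, H * r)                                       # 2.2  ~2.142  ~2142
print(H / L_bar, 1 - H / L_bar)                              # ~0.974  ~0.026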


Huffman coding algorithm:


1. The source symbols are listed in order of decreasing probability.
2. The two source symbols of lowest probability are assigned a 0 and a 1. This part of the step is referred to as the splitting stage.
3. These two source symbols are then regarded as combined into a new source symbol with probability equal to the sum of the two original probabilities. The probability of the new symbol is placed in the list in accordance with its value.
4. The procedure is repeated until we are left with a final list of source statistics of only two, to which a 0 and a 1 are assigned.
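These steps can be sketched with a priority queue as below (an illustrative implementation; the heap's tie-breaking plays the role of deciding where an equal combined probability is placed, so the individual codewords, though not the average length, may differ from the hand construction):

import heapq

def huffman(prob):
    # prob: dict symbol -> probability; returns dict symbol -> codeword.
    # Heap entries: (probability, unique tie-breaker, {symbol: partial codeword}).
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(prob.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)        # lowest probability: prepend bit 0
        p1, i, c1 = heapq.heappop(heap)        # second lowest:      prepend bit 1
        for s in c0:
            c0[s] = "0" + c0[s]
        for s in c1:
            c1[s] = "1" + c1[s]
        heapq.heappush(heap, (p0 + p1, i, {**c0, **c1}))   # combined symbol
    return heap[0][2]

codes = huffman({"S0": 0.4, "S1": 0.2, "S2": 0.2, "S3": 0.1, "S4": 0.1})
print(codes)            # one valid Huffman code; the average length is 2.2 bits either way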

Example: A discrete memoryless source emits symbols with probabilities P(S0) = 0.4, P(S1) = 0.2, P(S2) = 0.2, P(S3) = 0.1 and P(S4) = 0.1. Construct the Huffman code and calculate the efficiency of the source code.

[Figure: Huffman coding tree for this source, with each combined probability placed as high as possible in the re-ordered list; the codeword bits are read from the MSB side of the tree, and the average codeword length is then computed from the resulting codeword lengths.]

With L̄_min = H(S), the code efficiency is η = H(S)/L̄ and the redundancy is 1 − η.

[Figure: alternative Huffman coding tree, with each combined probability placed as low as possible in the re-ordered list; the resulting code is tabulated below.]

Symbol   Probability   Codeword
S0       0.4           1
S1       0.2           01
S2       0.2           000
S3       0.1           0010
S4       0.1           0011
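A quick check of this code (a sketch that recomputes the average length, entropy and efficiency from the table):

import math

probs = {"S0": 0.4, "S1": 0.2, "S2": 0.2, "S3": 0.1, "S4": 0.1}
code  = {"S0": "1", "S1": "01", "S2": "000", "S3": "0010", "S4": "0011"}   # table above

L_bar = sum(probs[s] * len(code[s]) for s in probs)            # 2.2 bits/symbol
H     = sum(p * math.log2(1 / p) for p in probs.values())      # ~2.122 bits/symbol
print(L_bar, H, H / L_bar, 1 - H / L_bar)                      # 2.2  ~2.122  ~0.965  ~0.035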


The two constructions assign different codewords (and different codeword-length variances) to the same source while giving the same average length, which clearly reflects that the Huffman encoding process is not unique.


[A further worked Huffman example whose figures were not recovered; the stated results are H(S) = 2.422 bits/symbol, information rate 2422 bits/sec, code efficiency 96.8%, and redundancy 0.032.]


Extended Huffman Coding:
In applications where the alphabet size is large, p_max (the largest symbol probability) is generally quite small, and the deviation of the code rate from the entropy, especially as a percentage of the rate, is quite small.
However, in cases where the alphabet is small and the probabilities of the different letters are skewed, p_max can be quite large and the Huffman code can become rather inefficient compared to the entropy.
To overcome this inefficiency we use extended Huffman coding, i.e. we code blocks of source symbols instead of single symbols, as illustrated by the following example:
Consider a source that puts out i.i.d. letters from the alphabet A = {a1, a2, a3} with the probability model P(a1) = 0.8, P(a2) = 0.02 and P(a3) = 0.18. The entropy of this source is 0.816 bits/symbol. A Huffman code for this source is shown in Table 1 below.
TABLE 1: The Huffman code.

Letter   Codeword
a1       0
a2       11
a3       10

The average length for this code is 1.2 bits/symbol. The difference between the average
code length and the entropy, or the redundancy, for this code is 0.384 bits/symbol,
which is 47% of the entropy. This means that to code this sequence we would need
47% more bits than the minimum required.


Now, for the source described in the example above, instead of generating a codeword for every symbol we generate a codeword for every two symbols. If we look at the source sequence two at a time, the number of possible symbol pairs, i.e. the size of the extended alphabet, is 3² = 9. The extended alphabet, its probability model and the corresponding Huffman code are shown in Table 2 below.
TABLE 2: The extended alphabet and corresponding Huffman code.

Letter   Probability   Code
a1a1     0.64          0
a1a2     0.016         10101
a1a3     0.144         11
a2a1     0.016         101000
a2a2     0.0004        10100101
a2a3     0.0036        1010011
a3a1     0.1440        100
a3a2     0.0036        10100100
a3a3     0.0324        1011

The average codeword length for this extended code is 1.7228 bits per extended symbol. However, each symbol in the extended alphabet corresponds to two symbols of the original alphabet. Therefore, in terms of the original alphabet, the average codeword length is 1.7228/2 = 0.8614 bits/symbol.
The redundancy is now about 0.045 bits/symbol, which is only about 5.5% of the entropy.
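The gain from extending the alphabet can be reproduced with a short sketch (it reuses a heap-based Huffman routine analogous to the one given earlier; only the average codeword length, which is the same for every Huffman code of a given source, matters here):

import heapq, itertools

def huffman_lengths(prob):
    # Returns dict symbol -> codeword length of a binary Huffman code.
    heap = [(p, i, {s: 0}) for i, (s, p) in enumerate(prob.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)
        p1, i, c1 = heapq.heappop(heap)
        merged = {s: l + 1 for s, l in {**c0, **c1}.items()}   # every member gains one bit
        heapq.heappush(heap, (p0 + p1, i, merged))
    return heap[0][2]

single = {"a1": 0.8, "a2": 0.02, "a3": 0.18}
pairs  = {x + y: single[x] * single[y] for x, y in itertools.product(single, repeat=2)}

len1, len2 = huffman_lengths(single), huffman_lengths(pairs)
L1 = sum(single[s] * len1[s] for s in single)      # ~1.2    bits/symbol
L2 = sum(pairs[s]  * len2[s] for s in pairs)       # ~1.7228 bits per pair
print(L1, L2, L2 / 2)                              # 1.2  ~1.7228  ~0.8614 bits/symbol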
Advantage of extended Huffman coding
We see that by coding blocks of symbols together we can reduce the redundancy of
Huffman codes.

Dr. Tanuja S. Dhope (Bharati Vidyapeeth (DU), COE, Pune)