AY 2022-23 Sem-V
Information Theory and Coding
Unit-2
Source Coding
Dr. Tanuja S. Dhope, Department of Electronics and Communication, Bharati Vidyapeeth (DU), COE, Pune
Here Sk is the output of the discrete memoryless source and bk is the corresponding output of the source encoder, represented by 0s and 1s. The encoded sequence is chosen so that it can be conveniently decoded at the receiver.
Let us assume that the source has an alphabet with M different symbols, and that the kth symbol Sk occurs with probability Pk, where k = 1, 2, …, M. Let the binary code word assigned to symbol Sk by the encoder have length lk, measured in bits. Hence, we define the average code word length L̄ of the source encoder as

$$\bar{L} = \sum_{k=1}^{M} P_k l_k$$
Consider: assume the probability of each symbol is 1/4.

Symbol   Bits    Source encoder output (b_k)   Length (l_k)
S1       000     0                             1
S2       0100    01                            2
S3       0010    10                            2
S4       0011    11                            2

Average code word length: L̄ = (1/4)(1 + 2 + 2 + 2) = 7/4 bits
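The same calculation in a few lines of Python (probabilities and code lengths taken from the table above):

```python
# Average code word length for the table above:
# four equiprobable symbols with code lengths 1, 2, 2, 2.
probs = [0.25, 0.25, 0.25, 0.25]
lengths = [1, 2, 2, 2]

avg_len = sum(p * l for p, l in zip(probs, lengths))
print(avg_len)  # 1.75, i.e. 7/4 bits per symbol
```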
L̄ represents the average number of bits per source symbol.

If L̄min is the minimum possible value of L̄, then the coding efficiency of the source encoder can be defined as

$$\eta = \frac{\bar{L}_{min}}{\bar{L}}$$

With L̄ ≥ L̄min we will have η ≤ 1. (Source coding aims to give a compact representation: for example, a symbol that would otherwise take 7 bits should ideally be represented with only 1 or 2 bits. L̄ should be kept as small as possible, while still satisfying L̄ ≥ L̄min; that is the condition for a good or optimal code. The value of L̄min is given by Shannon's first theorem.)

The source encoder is considered efficient when η = 1. For this, the value of L̄min has to be determined.
Let us refer to the definition: "Given a discrete memoryless source of entropy H(X), the average code-word length L̄ for any source encoding is bounded as L̄ ≥ H(X)" (Shannon's first theorem). In simpler words, the code word is at least as long as the information in the source message demands: for example, the Morse code for the word QUEUE, −−·− ··− · ··− ·, contains more symbols than the word QUEUE has letters; the number of symbols in the code word is greater than or equal to the number of letters in the source word.

Hence, with L̄min = H(X), the efficiency of the source encoder in terms of the entropy H(X) may be written as

$$\eta = \frac{H(X)}{\bar{L}}$$
Source coding theorem (noiseless coding theorem / Shannon's first theorem):

$$\bar{L} \ge H(X)$$
To prove L̄ ≥ H(X), we must show that

$$\bar{L} - H(X) = \sum_{k=1}^{M} P_k l_k \log_2 2 \;-\; \sum_{k=1}^{M} P_k \log_2 \frac{1}{P_k} \;\ge\; 0$$

Rewriting (using log₂2 = 1):

$$\bar{L} - H(X) = \sum_{k=1}^{M} P_k \left( l_k \log_2 2 + \log_2 P_k \right) = \sum_{k=1}^{M} P_k \left( \log_2 2^{l_k} + \log_2 P_k \right) = \frac{1}{\ln 2} \sum_{k=1}^{M} P_k \ln\!\left( P_k 2^{l_k} \right)$$

i.e. the last sum is nothing but L̄ − H(X). Applying the inequality ln y ≥ 1 − 1/y (valid for every y > 0) with y = P_k 2^{l_k}:

$$\bar{L} - H(X) \;\ge\; \frac{1}{\ln 2} \sum_{k=1}^{M} P_k \;-\; \frac{1}{\ln 2} \sum_{k=1}^{M} P_k \cdot \frac{1}{P_k 2^{l_k}} \;=\; \frac{1}{\ln 2}\left( 1 - \sum_{k=1}^{M} 2^{-l_k} \right)$$

since ∑ P_k = 1 (always). For any uniquely decodable code the Kraft-McMillan inequality holds:

$$\sum_{k=1}^{M} 2^{-l_k} \le 1$$

so the bracketed term is non-negative, and therefore L̄ − H(X) ≥ 0, i.e. L̄ ≥ H(X).
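As a numerical illustration, the short Python sketch below (with an assumed prefix code of lengths {1, 2, 3, 3} and matching probabilities) checks the Kraft-McMillan sum and the bound L̄ ≥ H(X):

```python
import math

# Assumed source probabilities and the lengths of a prefix code
# such as {0, 10, 110, 111}.
probs = [0.5, 0.25, 0.125, 0.125]
lengths = [1, 2, 3, 3]

kraft_sum = sum(2 ** -l for l in lengths)             # <= 1 for any uniquely decodable code
avg_len = sum(p * l for p, l in zip(probs, lengths))  # L-bar
entropy = sum(p * math.log2(1 / p) for p in probs)    # H(X)

print(f"Kraft sum = {kraft_sum:.3f}")   # 1.000
print(f"L-bar     = {avg_len:.3f}")     # 1.750
print(f"H(X)      = {entropy:.3f}")     # 1.750 (equality holds for these dyadic probabilities)
print("L-bar >= H(X):", avg_len >= entropy)
```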
To obtain an upper bound on L̄, choose each code-word length l_k as the smallest integer satisfying

$$2^{-l_k} \le P_k \qquad \text{(A)}$$

With this choice we also have

$$P_k < 2^{-l_k + 1} \qquad \text{(B)}$$

(we are inferring B from A together with the minimality of l_k: if B failed, l_k − 1 would already satisfy A). Taking log₂ of (B):

$$\log_2 P_k < \log_2 2^{-l_k + 1} = (-l_k + 1)\log_2 2 = -l_k + 1$$

$$\log_2 P_k - 1 < -l_k$$

$$l_k < 1 - \log_2 P_k = 1 + \log_2 \frac{1}{P_k}$$

Multiplying by P_k, summing over all symbols and using ∑ P_k = 1:

$$\sum_{k=1}^{M} P_k l_k < \sum_{k=1}^{M} P_k \left( 1 + \log_2 \frac{1}{P_k} \right) = \sum_{k=1}^{M} P_k + \sum_{k=1}^{M} P_k \log_2 \frac{1}{P_k}$$

that is, L̄ < 1 + H(X). Combining this with the lower bound, Shannon's first theorem gives

$$H(X) \le \bar{L} < H(X) + 1$$
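A minimal sketch of this length assignment, l_k = ⌈log₂(1/P_k)⌉ (the smallest integer satisfying condition A), using an assumed probability set, confirms that the resulting average length satisfies H(X) ≤ L̄ < H(X) + 1:

```python
import math

# Assumed source probabilities (any valid distribution can be used).
probs = [0.30, 0.28, 0.22, 0.15, 0.05]

# Smallest integer l_k with 2**(-l_k) <= p_k, i.e. l_k = ceil(log2(1/p_k)).
lengths = [math.ceil(math.log2(1 / p)) for p in probs]

avg_len = sum(p * l for p, l in zip(probs, lengths))
entropy = sum(p * math.log2(1 / p) for p in probs)

print("lengths:", lengths)                              # [2, 2, 3, 3, 5]
print(f"H(X) = {entropy:.3f}, L-bar = {avg_len:.3f}")
print("H(X) <= L-bar < H(X) + 1:", entropy <= avg_len < entropy + 1)
```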
Prefix Code:
A prefix code is a type of code system distinguished by its possession of the "prefix property",
which requires that there is no whole code word in the system that is a prefix (initial segment) of
any other code word in the system.
For example, a code with code words {9, 55} has the prefix property; a code consisting of
{9, 5, 59, 55} does not, because "5" is a prefix of "59" and also of "55". A prefix code is
a uniquely decodable code: given a complete and accurate sequence, a receiver can identify each
word without requiring a special marker between words. However, there are uniquely decodable
codes that are not prefix codes; for instance, the reverse of a prefix code is still uniquely
decodable (it is a suffix code), but it is not necessarily a prefix code.
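For illustration, a small Python helper (hypothetical, names are illustrative) that tests the prefix property of a set of code words, applied to the two examples above:

```python
def is_prefix_code(codewords):
    """Return True if no code word is a prefix of another (the prefix property)."""
    for a in codewords:
        for b in codewords:
            if a != b and b.startswith(a):
                return False
    return True

print(is_prefix_code(["9", "55"]))             # True:  a prefix code
print(is_prefix_code(["9", "5", "59", "55"]))  # False: "5" is a prefix of "59" and "55"
```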
The Shannon-Fano algorithm is an entropy-encoding technique for lossless data compression of multimedia. Named after Claude Shannon and Robert Fano, it assigns a code to each symbol based on its probability of occurrence. It is a variable-length encoding scheme; that is, the codes assigned to the symbols will be of varying length.
HOW DOES IT WORK?
The steps of the algorithm are as follows:
1. Create a list of probabilities or frequency counts for the given set of symbols so that the
relative frequency of occurrence of each symbol is known.
2. Sort the list of symbols in decreasing order of probability, the most probable ones to
the left and least probable to the right.
3. Split the list into two parts, with the total probability of both the parts being as close to
each other as possible.
4. Assign the value 0 to the left part and 1 to the right part.
5. Repeat the steps 3 and 4 for each part, until all the symbols are split into individual
subgroups.
The Shannon-Fano code is valid only if each symbol is assigned a unique code word; the splitting procedure above guarantees this, and the resulting code is a prefix code. A sketch of the procedure is given below.
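A minimal recursive sketch of these steps in Python (function and variable names are illustrative; the probabilities in the demo call are the ones used in Example 1 below):

```python
def shannon_fano(symbols):
    """symbols: list of (symbol, probability) pairs. Returns {symbol: code word}."""
    # Steps 1-2: sort the symbols in decreasing order of probability.
    items = sorted(symbols, key=lambda sp: sp[1], reverse=True)
    codes = {s: "" for s, _ in items}

    def split(group):
        if len(group) <= 1:          # Step 5: stop when a group holds a single symbol.
            return
        total = sum(p for _, p in group)
        # Step 3: find the split point where the two parts are closest in probability.
        running, best_i, best_diff = 0.0, 1, float("inf")
        for i in range(1, len(group)):
            running += group[i - 1][1]
            diff = abs(running - (total - running))
            if diff < best_diff:
                best_diff, best_i = diff, i
        left, right = group[:best_i], group[best_i:]
        # Step 4: append 0 to the left part and 1 to the right part.
        for s, _ in left:
            codes[s] += "0"
        for s, _ in right:
            codes[s] += "1"
        split(left)
        split(right)

    split(items)
    return codes

# Example 1 source (see below): {'D': '00', 'B': '01', 'A': '10', 'C': '110', 'E': '111'}
print(shannon_fano([("A", 0.22), ("B", 0.28), ("C", 0.15), ("D", 0.30), ("E", 0.05)]))
```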
EXAMPLE 1:
A discrete memoryless source emits five symbols A, B, C, D and E with probabilities 0.22, 0.28, 0.15, 0.30 and 0.05 respectively. Using the Shannon-Fano lossless compression technique, find the code words, the average code word length, the code efficiency, the entropy and the information rate if the Nyquist rate is 1000 samples/sec.
Solution:

Steps 1-2: arrange the symbols in decreasing order of probability: D (0.30), B (0.28), A (0.22), C (0.15), E (0.05).

Steps 3-4: split the list so that the total probabilities of the two parts are as close as possible: {D, B} (0.58) and {A, C, E} (0.42). Assign 0 to the {D, B} part and 1 to the {A, C, E} part.

In the {D, B} group, P(D) ≈ P(B), so divide {D, B} into {D} and {B} and assign 0 to D and 1 to B.

In the {A, C, E} group, split into {A} (0.22) and {C, E} (0.20); assign 0 to A and 1 to {C, E}. Finally, in the {C, E} group, assign 0 to C and 1 to E.

The resulting Shannon-Fano code:

Symbol   Probability   Code word   Length (l_k)
D        0.30          00          2
B        0.28          01          2
A        0.22          10          2
C        0.15          110         3
E        0.05          111         3

The average code word length:

$$\bar{L} = \sum_{k=1}^{M} P_k l_k = 0.22(2) + 0.28(2) + 0.15(3) + 0.30(2) + 0.05(3) = 2.2 \text{ bits/symbol}$$

Entropy:

$$H(S) = \sum_{k=1}^{5} P_k \log_2 \frac{1}{P_k} = 2.142 \text{ bits/symbol}$$

Information rate, with r = Nyquist rate = 1000 samples/sec:

$$R = H(S) \times r = 2.142 \times 1000 = 2142 \text{ bits/sec}$$
Code efficiency (with L̄min = H(S)):

$$\eta = \frac{\bar{L}_{min}}{\bar{L}} = \frac{H(S)}{\bar{L}} = \frac{2.142}{2.2} \approx 0.974$$

The efficiency should be as high as possible.

Redundancy = 1 − code efficiency ≈ 1 − 0.974 = 0.026. The redundancy should be as low as possible.
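The Example 1 figures can be checked directly with the same formulas; a short Python sketch (code lengths as found above):

```python
import math

# Example 1: probabilities and Shannon-Fano code lengths found above.
probs   = {"A": 0.22, "B": 0.28, "C": 0.15, "D": 0.30, "E": 0.05}
lengths = {"A": 2,    "B": 2,    "C": 3,    "D": 2,    "E": 3}

avg_len = sum(probs[s] * lengths[s] for s in probs)          # L-bar
entropy = sum(p * math.log2(1 / p) for p in probs.values())  # H(S)
rate    = entropy * 1000                                     # r = 1000 samples/sec
eff     = entropy / avg_len
redund  = 1 - eff

print(f"L-bar      = {avg_len:.2f} bits/symbol")    # 2.20
print(f"H(S)       = {entropy:.4f} bits/symbol")    # 2.1425
print(f"R          = {rate:.1f} bits/sec")          # about 2142
print(f"efficiency = {eff:.3f}, redundancy = {redund:.3f}")
```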
A second example: a five-symbol source with the following variable-length code.

Symbol   Probability   Code word
S0       0.4           1
S1       0.2           01
S2       0.2           000
S3       0.1           0010
S4       0.1           0011
For this code, L̄ = 0.4(1) + 0.2(2) + 0.2(3) + 0.1(4) + 0.1(4) = 2.2 bits/symbol and H(S) ≈ 2.12 bits/symbol, giving a code efficiency of about 0.96 and a redundancy of about 0.04.
Consider a source with letters a1, a2 and a3 whose probabilities are P(a1) = 0.8, P(a2) = 0.02 and P(a3) = 0.18, Huffman-coded as follows:

Letter   Probability   Code word
a1       0.8           0
a2       0.02          11
a3       0.18          10
The average length for this code is 1.2 bits/symbol. The difference between the average
code length and the entropy, or the redundancy, for this code is 0.384 bits/symbol,
which is 47% of the entropy. This means that to code this sequence we would need
47% more bits than the minimum required.
Now, for the source described in the above example, instead of generating a codeword for every symbol, we will generate a codeword for every two symbols. If we look at the source sequence two at a time, the number of possible symbol pairs, or the size of the extended alphabet, is 3² = 9. The extended alphabet, probability model, and Huffman code for this example are shown in Table 2 below.
TABLE 2: The extended alphabet and corresponding Huffman code.

Letter pair   Probability   Code word
a1a1          0.64          0
a1a3          0.144         11
The average codeword length for this extended code is 1.7228 bits/symbol. However,
each symbol in the extended alphabet corresponds to two symbols from the original
alphabet.
Therefore, in terms of the original alphabet, the average codeword length is 1.7228/2 =
0.8614 bits/symbol.
The redundancy (the difference between this average length and the entropy) is about 0.045 bits/symbol, which is only about 5.5% of the entropy.
Advantage of extended Huffman coding
We see that by coding blocks of symbols together we can reduce the redundancy of
Huffman codes.
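As an illustrative sketch of this advantage (Python; a generic Huffman-length construction with the probabilities assumed above, P(a1) = 0.8, P(a2) = 0.02, P(a3) = 0.18), the code below builds Huffman codes first over single letters and then over letter pairs, and compares the average number of bits per original symbol with the entropy:

```python
import heapq, itertools, math

def huffman_lengths(probs):
    """probs: dict symbol -> probability. Returns dict symbol -> Huffman code length."""
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    lengths = {s: 0 for s in probs}
    counter = len(heap)                    # tie-breaker so tuples never compare lists
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)
        p2, _, group2 = heapq.heappop(heap)
        for s in group1 + group2:          # every merge adds one bit to all symbols below it
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, group1 + group2))
        counter += 1
    return lengths

probs = {"a1": 0.8, "a2": 0.02, "a3": 0.18}          # assumed single-letter model
entropy = sum(p * math.log2(1 / p) for p in probs.values())

# Huffman code over single letters.
len1 = huffman_lengths(probs)
avg1 = sum(probs[s] * len1[s] for s in probs)

# Huffman code over the extended alphabet of 3**2 = 9 letter pairs.
pairs = {x + y: probs[x] * probs[y] for x, y in itertools.product(probs, repeat=2)}
len2 = huffman_lengths(pairs)
avg2 = sum(pairs[s] * len2[s] for s in pairs) / 2    # bits per original symbol

print(f"entropy               = {entropy:.4f} bits/symbol")   # about 0.816
print(f"single-letter Huffman = {avg1:.4f} bits/symbol")      # 1.2
print(f"pair (extended) code  = {avg2:.4f} bits/symbol")      # about 0.8614
```

With this model the cost per original symbol drops from 1.2 bits to about 0.8614 bits, matching the figures quoted above.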