
Module 3

ELEMENTS OF ENCODING

Topics to be covered:
3.1 Purpose of encoding; Separable binary codes
3.2 Shannon-Fano encoding
3.3 Necessary and sufficient conditions for noiseless coding
3.4 Average length of encoded messages; Shannon's binary encoding
3.5 Huffman's minimum redundancy codes
3.6 Lossy and lossless data compression techniques
Huffman Coding
• Huffman coding results in an optimal code: among prefix codes for a given source, it is the code with the highest efficiency.
• The Huffman coding procedure is as follows:
• 1. List the source symbols in order of decreasing probability.
• 2. Combine the probabilities of the two symbols having the lowest probabilities and reorder the resultant probabilities; this step is called reduction 1. Repeat the procedure until only two ordered probabilities remain.
• 3. Start encoding with the last reduction, which consists of exactly two ordered probabilities. Assign 0 as the first digit in the code words for all the source symbols associated with the first probability; assign 1 to the second probability.
• 4. Now go back and assign 0 and 1 to the second digit for the two probabilities that were combined in the previous reduction step, retaining all assignments made in step 3.
• 5. Keep regressing this way until the first column is reached.
• 6. The code word for each symbol is obtained by tracing back from right to left (see the sketch after this list).
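The procedure can be sketched in a few lines of Python. This is a minimal illustration, not part of the original module; it merges the two lowest probabilities at each step, as in step 2, and prepends a binary digit to every symbol on each side of the merge. The 0/1 assignment convention may differ from the slides', but the codeword lengths come out the same.

import heapq

def huffman_code(probs):
    # probs: dict mapping symbol -> probability.
    # Heap entries: (probability, tiebreaker, symbols in this subtree).
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    code = {s: "" for s in probs}
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)   # lowest probability
        p2, _, syms2 = heapq.heappop(heap)   # second lowest
        for s in syms1:                      # prepend the new binary digit
            code[s] = "0" + code[s]
        for s in syms2:
            code[s] = "1" + code[s]
        heapq.heappush(heap, (p1 + p2, id(syms1), syms1 + syms2))
    return code

For the seven-symbol source used later in these slides (probabilities 0.4, 0.2, 0.12, 0.08, 0.08, 0.08, 0.04), this yields an average length of 2.48 binary digits per symbol, matching the M = 2 Huffman column in the table further on.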

Huffman Encoding - Example
[The symbol probabilities and the Huffman tree for this example appeared as a figure on the slide.]

H(X) = 2.36 b/symbol, L = 2.38 b/symbol

η = H(X)/L = 2.36/2.38 ≈ 0.99
Shannon-Fano Code vs Huffman Code
[Worked construction tables for the Shannon-Fano and Huffman codes appeared as figures on these slides.]
The source coding theorem

• The source coding theorem states that for a DMS X with entropy H(X), the average code word length L per symbol is bounded as L ≥ H(X).
• L can be made as close to H(X) as desired for some suitably chosen code.
• Thus, with Lmin = H(X), the code efficiency can be written as η = H(X)/L.
• For an M-ary code this generalizes to η = H(X)/(L log₂ M), where M is the size of the code alphabet (M = 2 for binary codes, so log₂ M = 1).
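As a quick numerical illustration of these definitions (hypothetical helper functions, not from the slides):

from math import log2

def entropy(probs):
    # H(X) = -sum(p * log2(p)), in bits per symbol.
    return -sum(p * log2(p) for p in probs)

def efficiency(probs, lengths, M=2):
    # eta = H(X) / (L * log2(M)); for binary codes log2(2) = 1,
    # so this reduces to eta = H(X) / L.
    L = sum(p * n for p, n in zip(probs, lengths))
    return entropy(probs) / (L * log2(M))

For a source with H(X) = 2.36 b/symbol and a code with L = 2.38 b/symbol, this gives η ≈ 0.99, as in the earlier example.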

Huffman as an optimal code
• What is an optimal code?
• 1. The code efficiency is maximum.
• 2. The code gives the lowest possible average codeword length for a given M, which results in maximum efficiency and minimum redundancy.
• 3. "Compression" is maximum for an efficient source encoder.
• Moreover, in an optimal code, symbols that occur more frequently (have a higher probability of occurrence) have shorter codewords than symbols that occur less frequently, as the check below shows.
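This monotonicity can be verified directly on the binary Huffman code tabulated later in these slides (an illustrative check only):

# Probabilities in decreasing order and their Huffman codewords (from the M = 2 table).
p  = [0.4, 0.2, 0.12, 0.08, 0.08, 0.08, 0.04]
cw = ["0", "111", "101", "1101", "1100", "1001", "1000"]
# A more probable symbol must never have a longer codeword than a less probable one.
assert all(len(a) <= len(b) for a, b in zip(cw, cw[1:]))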

Shannon-Fano vs Huffman Coding

• In general, Shannon-Fano and Huffman coding produce codes of similar average length.
• However, Huffman coding always at least equals the efficiency of the Shannon-Fano method, and exceeds it in some cases.
• For example, consider the codes below.

Shannon-Fano and Huffman Code

Try encoding the message AAABE with each code.

Symbol   Count   S-F code length   Huffman code length
A        14      2                 1
B         7      2                 3
C         5      2                 3
D         5      3                 3
E         4      3                 3

Encoding the full message (all 35 symbols, with the counts above):
ASCII (fixed-length code, 8 bits or 1 byte per symbol): 280 bits
Shannon-Fano code (variable-length): 79 bits
Huffman code (variable-length): 77 bits
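These totals follow directly from the counts and codeword lengths above; a short check (illustrative only):

counts  = {"A": 14, "B": 7, "C": 5, "D": 5, "E": 4}   # 35 symbols in total
sf_len  = {"A": 2, "B": 2, "C": 2, "D": 3, "E": 3}    # Shannon-Fano lengths
huf_len = {"A": 1, "B": 3, "C": 3, "D": 3, "E": 3}    # Huffman lengths

ascii_bits = 8 * sum(counts.values())                     # 8 bits/symbol -> 280
sf_bits    = sum(counts[s] * sf_len[s]  for s in counts)  # -> 79
huf_bits   = sum(counts[s] * huf_len[s] for s in counts)  # -> 77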
Role of M (size of the code alphabet)

P      S-F code (M=2)   Huffman code (M=2)   S-F code (M=3)   Huffman code (M=3)
0.4    00               0                    0                0
0.2    01               111                  10               2
0.12   100              101                  11               11
0.08   101              1101                 20               12
0.08   110              1100                 21               100
0.08   1110             1001                 220              101
0.04   1111             1000                 221              102

L      2.52             2.48                 1.72             1.60

H(X) = 2.42 bits/symbol in every case.
Efficiency = ? for M = 2; for M = 3, η = H(X)/(L log₂ 3) gives 88.7% (S-F) and 95.4% (Huffman).
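The averages and efficiencies in this table can be reproduced from the listed probabilities and codewords (an illustrative check):

from math import log2

p = [0.4, 0.2, 0.12, 0.08, 0.08, 0.08, 0.04]
H = -sum(x * log2(x) for x in p)   # H(X) = 2.42 bits/symbol

codes = {
    "S-F, M=2":     ["00", "01", "100", "101", "110", "1110", "1111"],
    "Huffman, M=2": ["0", "111", "101", "1101", "1100", "1001", "1000"],
    "S-F, M=3":     ["0", "10", "11", "20", "21", "220", "221"],
    "Huffman, M=3": ["0", "2", "11", "12", "100", "101", "102"],
}
for name, cw in codes.items():
    M = 3 if "M=3" in name else 2
    L = sum(x * len(c) for x, c in zip(p, cw))
    # eta = H(X) / (L * log2(M))
    print(f"{name}: L = {L:.2f}, efficiency = {H / (L * log2(M)):.1%}")

Running this gives 96.0% and 97.6% for M = 2, and reproduces the 88.7% and 95.4% quoted for M = 3.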
Construction of the M = 3 codes from the previous slide.

Shannon-Fano code (M = 3), built digit by digit:

P      Code
0.4    0
0.2    1 0
0.12   1 1
0.08   2 0
0.08   2 1
0.08   2 2 0
0.04   2 2 1

Huffman code (M = 3), successive reductions (each step merges the three lowest probabilities):

Original        Reduction 1     Reduction 2
0.4 → 0         0.4 → 0         0.4 → 0
0.2 → 2         0.2 → 2         0.4 → 1
0.12 → 11       0.2 → 10        0.2 → 2
0.08 → 12       0.12 → 11
0.08 → 100      0.08 → 12
0.08 → 101
0.04 → 102
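The ternary reduction generalizes to any M: pad the symbol list with zero-probability dummies until (n - 1) is divisible by (M - 1), then merge the M lowest probabilities at each step. A minimal sketch of this idea (mine, not the slides'; tie-breaking may produce different codewords than the table above, but the average length, 1.6 ternary digits/symbol here, is the same):

import heapq

def huffman_mary(probs, M=3):
    # probs: dict mapping symbol -> probability; M: code alphabet size.
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probs.items())]
    # Pad with zero-probability dummies so every reduction merges exactly M nodes.
    while (len(heap) - 1) % (M - 1) != 0:
        heap.append((0.0, len(heap), []))
    heapq.heapify(heap)
    code = {s: "" for s in probs}
    while len(heap) > 1:
        total, merged = 0.0, []
        for digit in range(M):
            p, _, syms = heapq.heappop(heap)
            for s in syms:                 # prepend the new M-ary digit
                code[s] = str(digit) + code[s]
            total += p
            merged += syms
        heapq.heappush(heap, (total, id(merged), merged))
    return code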

Practice problem
Take M = 4 and construct the Shannon-Fano and Huffman codes for the following source:

P      S-F code   Huffman code
0.2
0.2
0.15
0.15
0.1
0.1
0.05
0.05
Thank you
