
Department of Electrical Engineering

Indian Institute of Technology Jodhpur


EE 325: Contemporary Communication Systems
2019-20 Second Semester (January - May 2020)

Solutions for Tutorial 2


Information Source Coding
23 January 2020

Question T2.1
An information source has six possible outputs with probabilities as shown in the table. Codes A, B, C, D, E and F, as given in the table, are considered.
1) Which of these codes are uniquely decodable?
2) Which are instantaneously decodable codes?
3) Find the average codeword length L for all the uniquely decodable codes.

Output   P(s_i)   A     B        C        D      E      F
s1       1/2      000   0        0        0      0      0
s2       1/4      001   01       10       10     10     100
s3       1/16     010   011      110      110    1100   101
s4       1/16     011   0111     1110     1110   1101   110
s5       1/16     100   01111    11110    1101   1110   111
s6       1/16     101   011111   111110   1011   1111   001

Solution T2.1
Recall the definition of a uniquely decodable code: a block code is said to be uniquely decodable if, and only if, the nth extension of the code is nonsingular for every finite n. A block code is said to be nonsingular if all the words of the code are distinct. Also note that a necessary and sufficient condition for a code to be instantaneous is that no complete codeword of the code be a prefix of some other codeword. We make use of these definitions to classify the given codes.

Code A - It is a prefix-free code and hence it is both instantaneously decodable and uniquely decodable. The average length is L = \sum_{i=1}^{q} P_i l_i = 3 bits/symbol.

Code B - It is not a prefix-free code, as the codeword for s1 is a prefix of all other codewords, and hence it is not instantaneously decodable. To verify unique decodability we use the fact that bit 0 is the start bit of every codeword, which allows unique decoding by marking the codeword boundaries in the binary stream. The average length is L = 2.125 bits/symbol.

Code C - It is a prefix-free code and hence it is both instantaneously decodable and uniquely decodable. The average length is L = \sum_{i=1}^{q} P_i l_i = 2.125 bits/symbol.

Code D - It is not a prefix-free code, as the codeword for s2 is a prefix of the codeword for s6, and hence it is not instantaneously decodable. To verify unique decodability we consider the second extension of Code D: the codewords for s3 s3 and s5 s2 are identical (110110), hence the second extension of Code D is singular and Code D is not uniquely decodable.

Code E - It is a prefix-free code and hence it is both instantaneously decodable and uniquely decodable. The average length is L = \sum_{i=1}^{q} P_i l_i = 2 bits/symbol.

Code F - It is not a prefix-free code, as the codeword for s1 is a prefix of the codeword for s6, and hence it is not instantaneously decodable. To verify unique decodability we consider the third extension of Code F: the codewords for s1 s1 s2 and s6 s1 s1 are identical (00100), hence the third extension of Code F is singular and Code F is not uniquely decodable.

Note: If two codewords taken from different extensions of a code happen to be identical, that alone does not make the code non-uniquely decodable. Any valid argument that a code is not uniquely decodable must show that at least one extension of the code is singular, i.e., that two distinct symbol sequences of the same length encode to the same binary string.
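
These checks are easy to mechanize. Below is a minimal Python sketch (the function and variable names are our own, not part of the tutorial) that tests the prefix-free condition directly, and tests for a singular nth extension by brute force, looking for two distinct symbol sequences with the same encoding:

    from itertools import product

    def is_prefix_free(words):
        # no codeword may be a prefix of another codeword
        return not any(i != j and words[j].startswith(words[i])
                       for i in range(len(words)) for j in range(len(words)))

    def nth_extension_singular(words, n):
        # True if two distinct length-n symbol sequences encode identically
        seen = set()
        for seq in product(range(len(words)), repeat=n):
            enc = "".join(words[i] for i in seq)
            if enc in seen:
                return True
            seen.add(enc)
        return False

    code_D = ["0", "10", "110", "1110", "1101", "1011"]
    print(is_prefix_free(code_D))             # False: 10 is a prefix of 1011
    print(nth_extension_singular(code_D, 2))  # True: s3 s3 and s5 s2 collide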

Question T2.2
A zero-memory source emits seven messages with probabilities 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, and 1/64,
respectively. Obtain the compact binary codes and find the average length of the code word. Determine
the efficiency and the redundancy of the code.

Solution T2.2
All the symbol probabilities of this source are of the form (1/r)^{α_i}, for r = 2, with α_i being some integer. The compact code is obtained by setting the codeword lengths equal to 1, 2, 3, 4, 5, 6 and 6, respectively.

TABLE I
COMPACT BINARY CODE FOR THE GIVEN SOURCE

Messages   Symbol Probability   Code
s1         1/2                  1
s2         1/4                  01
s3         1/8                  001
s4         1/16                 0001
s5         1/32                 00001
s6         1/64                 000001
s7         1/64                 000000

We calculate the entropy of this source:

H_2(S) = \sum_{i=1}^{7} P_i \log_2(1/P_i) = 1.96875 bits/symbol.

The average word length of this code is L = \sum_{i=1}^{7} P_i l_i = 1.96875 bits/symbol, and so the efficiency is

η = H_2(S) / L = 1,

and the redundancy, γ, of the source code is

γ = 1 − η = 0.
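
These quantities are easy to check numerically. The following small Python sketch (our own helper, not part of the tutorial) computes the entropy, average length, efficiency and redundancy of a code from its probabilities and codeword lengths:

    import math

    def code_stats(probs, lengths, r=2):
        # entropy in base r, average codeword length, efficiency, redundancy
        H = sum(p * math.log(1 / p, r) for p in probs)
        L = sum(p * l for p, l in zip(probs, lengths))
        eta = H / L
        return H, L, eta, 1 - eta

    probs = [1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/64]
    lengths = [1, 2, 3, 4, 5, 6, 6]
    print(code_stats(probs, lengths))  # H = L = 1.96875, eta = 1, gamma = 0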

Question T2.3
A zero-memory source emits seven messages with probabilities 1/3, 1/3, 1/9, 1/9, 1/27, 1/27, and 1/27,
respectively. Obtain the compact 3-ary codes and find the average length of the code word. Determine
the efficiency and the redundancy of the code.
Solution T2.3
All the symbol probabilities of this source are of the form (1/r)^{α_i}, for r = 3, with α_i being some integer. The compact ternary code is obtained by setting the codeword lengths equal to 1, 1, 2, 2, 3, 3 and 3, respectively.
TABLE II
COMPACT TERNARY CODE FOR THE GIVEN SOURCE

Messages   Symbol Probability   Code
s1         1/3                  0
s2         1/3                  1
s3         1/9                  20
s4         1/9                  21
s5         1/27                 220
s6         1/27                 221
s7         1/27                 222

We calculate the entropy of this source:

H_3(S) = \sum_{i=1}^{7} P_i \log_3(1/P_i) = 13/9 3-ary digits/symbol.

The average word length of this code is L = \sum_{i=1}^{7} P_i l_i = 13/9 3-ary digits/symbol, and so the efficiency is

η = H_3(S) / L = 1,

and the redundancy, γ, of the source code is

γ = 1 − η = 0.
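
Using the code_stats helper sketched after Solution T2.2, with the base changed to r = 3, reproduces these numbers:

    probs = [1/3, 1/3, 1/9, 1/9, 1/27, 1/27, 1/27]
    lengths = [1, 1, 2, 2, 3, 3, 3]
    print(code_stats(probs, lengths, r=3))  # H = L = 13/9, eta = 1, gamma = 0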

Question T2.4
A source emits three equiprobable messages randomly and independently.
1) Find the source entropy.
2) Find a compact ternary code, the average length of the code word, the code efficiency and the
redundancy.
3) Repeat (2) for a compact binary code.
4) To improve the efficiency of the binary code we now encode the second extension of the source.
Find a compact binary code, the average length of the code word, the code efficiency and the
redundancy.

Solution T2.4
1) The entropy of the given source is

H_3(S) = \sum_{i=1}^{3} P_i \log_3(1/P_i) = 1 3-ary digit/symbol.

2) All the symbol probabilities of this source are of the form (1/r)^{α_i}, for r = 3, with α_i being some integer. The compact ternary code is obtained by setting the codeword lengths equal to 1, 1, and 1, respectively.
TABLE III
COMPACT TERNARY CODE FOR THE GIVEN SOURCE

Messages   Symbol Probability   Code
s1         1/3                  0
s2         1/3                  1
s3         1/3                  2

The average word length of this code is L = \sum_{i=1}^{3} P_i l_i = 1 3-ary digit/symbol, and so the efficiency is

η = H_3(S) / L = 1,

and the redundancy, γ, of the source code is

γ = 1 − η = 0.
3) To find a compact binary code we use Huffman coding (a runnable sketch of the Huffman procedure follows this solution).
TABLE IV
COMPACT BINARY CODE FOR THE GIVEN SOURCE

Messages   Symbol Probability   Code
s1         1/3                  1
s2         1/3                  00
s3         1/3                  01

The average word length of this code is L = \sum_{i=1}^{3} P_i l_i = 5/3 bits/symbol. The entropy of the given source is

H_2(S) = \sum_{i=1}^{3} P_i \log_2(1/P_i) = 1.585 bits/symbol,

so the efficiency is

η = H_2(S) / L = 0.951,

and the redundancy, γ, of the source code is

γ = 1 − η = 0.049.
4) The compact binary code for the second extension of the source is given by
TABLE V
COMPACT BINARY CODE FOR THE SECOND EXTENSION OF THE SOURCE

Messages   Symbol Probability   Code
s1 s1      1/9                  001
s1 s2      1/9                  010
s1 s3      1/9                  011
s2 s1      1/9                  100
s2 s2      1/9                  101
s2 s3      1/9                  110
s3 s1      1/9                  111
s3 s2      1/9                  0000
s3 s3      1/9                  0001

The average word length of this code is L_2 = \sum_{i=1}^{9} P_i l_i = 29/9 bits/symbol. The entropy of the second extension of the source is

H_2(S^2) = \sum_{i=1}^{9} P_i \log_2(1/P_i) = 3.17 bits/symbol,

so the efficiency is

η = H_2(S^2) / L_2 = 0.984,

and the redundancy, γ, of the code is

γ = 1 − η = 0.016.
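
For reference, here is a minimal binary Huffman construction in Python (a sketch with our own function names, not the tutorial's code). It repeatedly merges the two least probable groups of symbols and prepends one bit to every codeword in each merged group:

    import heapq
    from itertools import count

    def huffman_binary(probs):
        # heap entries: (probability, tie-breaker, symbol indices in this group)
        ticket = count()
        heap = [(p, next(ticket), [i]) for i, p in enumerate(probs)]
        heapq.heapify(heap)
        codes = [""] * len(probs)
        while len(heap) > 1:
            p0, _, g0 = heapq.heappop(heap)  # least probable group
            p1, _, g1 = heapq.heappop(heap)  # second least probable group
            for i in g0:
                codes[i] = "0" + codes[i]
            for i in g1:
                codes[i] = "1" + codes[i]
            heapq.heappush(heap, (p0 + p1, next(ticket), g0 + g1))
        return codes

    # second extension of the source: nine messages, each with probability 1/9
    print(huffman_binary([1/9] * 9))  # seven 3-bit and two 4-bit codewords

Tie-breaking may assign the bits differently from Table V, but any Huffman code for these probabilities achieves the same average length, 29/9 bits.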

Question T2.5
A zero-memory source emits six messages with probabilities 0.3, 0.25, 0.15, 0.13, 0.1, and 0.07, respec-
tively. Find the 4-ary Huffman code. Determine its average word length, the efficiency and the redundancy.

Solution T2.5

The 4-ary Huffman code is shown in Table VI. Since an r-ary Huffman code requires the number of symbols to be congruent to 1 modulo (r − 1), a dummy symbol s7 with probability 0 is added before the reductions. The average word length of this code is L = \sum_{i=1}^{6} P_i l_i = 1.3 4-ary digits/symbol. The entropy of the given source is

H_4(S) = \sum_{i=1}^{6} P_i \log_4(1/P_i) = 1.21 4-ary digits/symbol,
TABLE VI
4-ARY HUFFMAN CODE FOR THE GIVEN SOURCE

Messages     Symbol Probability   Code
s1           0.3                  1
s2           0.25                 2
s3           0.15                 3
s4           0.13                 00
s5           0.1                  01
s6           0.07                 02
s7 (dummy)   0                    03

so the efficiency is

η = H_4(S) / L = 0.929,

and the redundancy, γ, of the source code is

γ = 1 − η = 0.071.
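
The binary sketch after Solution T2.4 extends to an r-ary Huffman code; the only subtlety is padding with dummy zero-probability symbols so that every reduction merges exactly r groups. A hedged Python sketch (our own helper, reusing heapq and count from the binary version):

    def huffman_rary(probs, r):
        q = len(probs)
        pad = (r - 1 - (q - 1) % (r - 1)) % (r - 1)  # dummy symbols needed
        probs = list(probs) + [0.0] * pad
        ticket = count()
        heap = [(p, next(ticket), [i]) for i, p in enumerate(probs)]
        heapq.heapify(heap)
        codes = [""] * len(probs)
        while len(heap) > 1:
            groups = [heapq.heappop(heap) for _ in range(r)]  # r least probable
            merged = []
            for digit, (p, _, g) in enumerate(groups):
                for i in g:
                    codes[i] = str(digit) + codes[i]
                merged.extend(g)
            heapq.heappush(heap, (sum(pp for pp, _, _ in groups),
                                  next(ticket), merged))
        return codes[:q]  # drop the dummy codewords

    print(huffman_rary([0.3, 0.25, 0.15, 0.13, 0.1, 0.07], r=4))

The digit labels may come out permuted relative to Table VI; any relabeling of the branches gives an equivalent code with the same codeword lengths.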

Question T2.6
A zero-memory source emits s1 and s2 with probabilities 0.85 and 0.15, respectively. Find the optimum
(Huffman) binary code for this source as well as its second and third order extensions. Determine the
code efficiencies in each case.

Solution T2.6

TABLE VII
HUFFMAN BINARY CODE FOR THE GIVEN SOURCE

Messages   Symbol Probability   Code
s1         0.85                 1
s2         0.15                 0

The average word length of this code is L = \sum_{i=1}^{2} P_i l_i = 1 bit/symbol. The entropy of the given source is

H_2(S) = \sum_{i=1}^{2} P_i \log_2(1/P_i) = 0.61 bits/symbol,

so the efficiency is

η = H_2(S) / L = 0.61.

Now we consider the second extension of the source for Huffman coding:

TABLE VIII
HUFFMAN BINARY CODE FOR THE SECOND EXTENSION OF THE SOURCE

Messages   Symbol Probability   Code
s1 s1      0.7225               0
s1 s2      0.1275               10
s2 s1      0.1275               110
s2 s2      0.0225               111

The average word length of this code is L_2 = \sum_{i=1}^{4} P_i l_i = 1.4275 bits/symbol. The entropy of the second extension of the source is

H_2(S^2) = \sum_{i=1}^{4} P_i \log_2(1/P_i) = 1.22 bits/symbol,

so the efficiency is

η = H_2(S^2) / L_2 = 0.855.

Now we consider the third extension of the source for Huffman coding:

TABLE IX
HUFFMAN BINARY CODE FOR THE THIRD EXTENSION OF THE SOURCE

Messages   Symbol Probability   Code
s1 s1 s1   0.614125             0
s1 s1 s2   0.108375             100
s1 s2 s1   0.108375             101
s2 s1 s1   0.108375             110
s1 s2 s2   0.019125             11100
s2 s1 s2   0.019125             11101
s2 s2 s1   0.019125             11110
s2 s2 s2   0.003375             11111

The average word length of this code is L_3 = \sum_{i=1}^{8} P_i l_i = 1.89325 bits/symbol. The entropy of the third extension of the source is

H_2(S^3) = \sum_{i=1}^{8} P_i \log_2(1/P_i) = 1.83 bits/symbol,

so the efficiency is

η = H_2(S^3) / L_3 = 0.966.
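
These extension results can be reproduced with the huffman_binary sketch from Solution T2.4: build the product distribution of the nth extension and measure the average codeword length per block. A short illustrative snippet (assuming that helper is in scope):

    from itertools import product

    p = {"s1": 0.85, "s2": 0.15}
    probs3 = [p[a] * p[b] * p[c] for a, b, c in product(p, repeat=3)]
    codes = huffman_binary(probs3)
    L3 = sum(q * len(c) for q, c in zip(probs3, codes))
    print(L3)  # ~1.893 bits per block of three source symbols

Because every Huffman code is optimal, L3 comes out to 1.89325 regardless of how ties are broken.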

Question T2.7
Consider a source with alphabet {s1, s2, s3, s4, s5, s6} and associated symbol probabilities
{1/3, 1/6, 1/6, 1/6, 1/12, 1/12}.
For this source, find the 3-ary Huffman code and compute its expected code length L.

Solution T2.7

TABLE X
3-ARY HUFFMAN CODE FOR THE GIVEN SOURCE

Messages     Symbol Probability   Code
s1           1/3                  0
s2           1/6                  10
s3           1/6                  11
s4           1/6                  12
s5           1/12                 20
s6           1/12                 21
s7 (dummy)   0                    22

The average word length of this code is L = \sum_{i=1}^{6} P_i l_i = 5/3 3-ary digits/symbol.
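
The huffman_rary sketch from Solution T2.5 reproduces this code structure, up to a relabeling of the ternary digits:

    probs = [1/3, 1/6, 1/6, 1/6, 1/12, 1/12]
    codes = huffman_rary(probs, r=3)
    print(codes, sum(p * len(c) for p, c in zip(probs, codes)))
    # codeword lengths 1, 2, 2, 2, 2, 2 as in Table X, so L = 5/3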

Question T2.8
Consider an information source with four symbols, A, B, C, D, and associated symbol probabilities pA ≥ pB ≥ pC ≥ pD. Assume that Huffman coding is used to compress the source data and that two different tree constructions with the following properties are possible:
First tree: the length of the longest path from the root to a symbol (i.e., the longest codeword) is 3.
Second tree: the length of the longest path from the root to a symbol is 2.
Derive one constraint relating the symbol probabilities which ensures the feasibility of both tree constructions.
1) Write down the constraint.
2) Show the resulting codes.

Solution T2.8
In the first reduction the two least probable symbols, C and D, are always merged. The first tree (codeword lengths 1, 2, 3, 3) requires the next reduction to merge pB with pC + pD, which needs pC + pD ≤ pA; the second tree (codeword lengths 2, 2, 2, 2) requires the next reduction to merge pA with pB, which needs pC + pD ≥ pA. Hence the condition pC + pD = pA ensures the feasibility of both tree constructions.

TABLE XI
FIRST HUFFMAN CODE

Messages   Symbol Probability   First Reduced Source   Second Reduced Source   Huffman Code
A          pA                   pA                     pA                      0
B          pB                   pC + pD                pB + pC + pD            10
C          pC                   pB                                             110
D          pD                                                                  111

TABLE XII
SECOND HUFFMAN CODE

Messages   Symbol Probability   First Reduced Source   Second Reduced Source   Huffman Code
A          pA                   pC + pD                pA + pB                 00
B          pB                   pA                     pC + pD                 01
C          pC                   pB                                             10
D          pD                                                                  11
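
A quick numeric check of the constraint, using a hypothetical probability assignment of our own choosing that satisfies pC + pD = pA:

    p = [1/3, 1/3, 1/6, 1/6]   # pA, pB, pC, pD with pC + pD = pA
    deep = [1, 2, 3, 3]        # codeword lengths of the first tree
    flat = [2, 2, 2, 2]        # codeword lengths of the second tree
    L_deep = sum(pi * li for pi, li in zip(p, deep))
    L_flat = sum(pi * li for pi, li in zip(p, flat))
    print(L_deep, L_flat)      # both equal 2.0, so both trees are optimal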

Question T2.9
Consider the pseudo-code for the LZW decoder given below:
initialize TABLE[0 to 255] = code for individual bytes
CODE = read next code from encoder
STRING = TABLE[CODE]
output STRING
while there are still codes to receive:
    CODE = read next code from encoder
    if TABLE[CODE] is not defined:
        ENTRY = STRING + STRING[0]
    else:
        ENTRY = TABLE[CODE]
    output ENTRY
    add STRING + ENTRY[0] to TABLE
    STRING = ENTRY
Suppose that this decoder has received the following five codes from the LZW encoder (these are the
first five codes from a longer compression run):
97 – index of ’a’ in the translation table
98 – index of ’b’ in the translation table
257 – index of second addition to the translation table
256 – index of first addition to the translation table
258 – index of third addition to the translation table
After it has finished processing the fifth code, what are the entries in TABLE and what is the cumulative
output of the decoder?

Solution T2.9

TABLE XIII
DECODER DICTIONARY

Input Code   Entry Added to TABLE   Decoding
97           None                   a
98           TABLE[256] = ab        b
257          TABLE[257] = bb        bb
256          TABLE[258] = bba       ab
258          TABLE[259] = abb       bba

Note that processing the fifth code also adds an entry, TABLE[259] = STRING + ENTRY[0] = 'ab' + 'b' = abb.

The decoded output is 'abbbabbba'.
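
The pseudo-code translates almost line for line into runnable Python. This sketch (the naming is ours) reproduces the table above:

    def lzw_decode(codes):
        table = {i: chr(i) for i in range(256)}      # codes 0-255: single bytes
        string = table[codes[0]]
        output = [string]
        for code in codes[1:]:
            # the tricky case: the code refers to an entry not yet in the table
            entry = table[code] if code in table else string + string[0]
            output.append(entry)
            table[len(table)] = string + entry[0]    # next free index is len(table)
            string = entry
        return "".join(output)

    print(lzw_decode([97, 98, 257, 256, 258]))  # -> 'abbbabbba'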


Question T2.10
A particular source uses the Lempel-Ziv-Welch algorithm to communicate with a receiver. The message
that the source wishes to communicate is made up of just three symbols: a, b, c. These symbols are
respectively stored in positions 1, 2 and 3 in the dictionary at both the source and the receiver. The
subsequent dictionary entries, as they are built by the LZW algorithm, are assigned to positions numbered
4, 5, 6, ... Suppose the transmitted sequence is
2, 4, 5, 1, 3, 4, 7, 10, 11, 6, 10
Decode the sequence, and write down the receiver’s entire dictionary at the end of the transmission.

Solution T2.10

TABLE XIV
DECODER DICTIONARY

Input Code   Entry Added         Decoding
2            None                b
4            Table[4] = bb       bb
5            Table[5] = bbb      bbb
1            Table[6] = bbba     a
3            Table[7] = ac       c
4            Table[8] = cb       bb
7            Table[9] = bba      ac
10           Table[10] = aca     aca
11           Table[11] = acaa    acaa
6            Table[12] = acaab   bbba
10           Table[13] = bbbaa   aca

The decoded output is 'bbbbbbacbbacacaacaabbbaaca'.
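
The same decoder, adjusted so that the dictionary starts at position 1 with the alphabet a, b, c as in this question (a sketch; the names are ours), confirms the output:

    def lzw_decode_alpha(codes, alphabet):
        table = {i + 1: s for i, s in enumerate(alphabet)}  # positions 1, 2, 3, ...
        string = table[codes[0]]
        output = [string]
        for code in codes[1:]:
            entry = table[code] if code in table else string + string[0]
            output.append(entry)
            table[len(table) + 1] = string + entry[0]       # next free position
            string = entry
        return "".join(output)

    print(lzw_decode_alpha([2, 4, 5, 1, 3, 4, 7, 10, 11, 6, 10], "abc"))
    # -> 'bbbbbbacbbacacaacaabbbaaca'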
