Question T2.1
An information source has six possible outputs with probabilities as shown in the table. Codes A, B, C,
D, E and F, as given in the table, are considered.
1) Which of these codes are uniquely decodable?
2) Which are instantaneous decodable codes?
3) Find the average codeword length L for all the uniquely decodable codes.
Output  P(si)  A    B       C       D     E     F
s1      1/2    000  0       0       0     0     0
s2      1/4    001  01      10      10    10    100
s3      1/16   010  011     110     110   1100  101
s4      1/16   011  0111    1110    1110  1101  110
s5      1/16   100  01111   11110   1101  1110  111
s6      1/16   101  011111  111110  1011  1111  001
Solution T2.1
Recall the definition of a uniquely decodable code: a block code is said to be uniquely decodable if, and
only if, the nth extension of the code is nonsingular for every finite n. A block code is said to be
nonsingular if all the words of the code are distinct. Also note that a necessary and sufficient condition
for a code to be instantaneous is that no complete codeword of the code be a prefix of some other codeword.
We make use of these definitions to classify the given codes.
Code A - It is a fixed-length code whose six 3-bit codewords are all distinct, hence it is both
instantaneous and uniquely decodable. The average length is L = 3 bits/symbol.
Code B - It is not a prefix-free code, as the codeword for s1 is a prefix of every other codeword, and
hence it is not an instantaneous code. It is nevertheless uniquely decodable: the bit 0 occurs only as
the first bit of a codeword, so every 0 in the received stream marks a codeword boundary. The average
length is L = \sum_{i=1}^{q} P_i l_i = 2.125 bits/symbol.
Code C - It is a prefix-free code, and hence it is both instantaneous and uniquely decodable.
The average length is L = \sum_{i=1}^{q} P_i l_i = 2.125 bits/symbol.
Code D - It is not a prefix-free code, as the codeword for s2 is a prefix of the codeword for s6, and
hence it is not an instantaneous code. To check unique decodability we consider the second extension of
Code D: the codewords for s3 s3 and s5 s2 are both 110110, hence the second extension of Code D is
singular and Code D is not uniquely decodable.
Code E - It is a prefix-free code, and hence it is both instantaneous and uniquely decodable. The
average length is L = 2 bits/symbol.
Code F - It is not a prefix-free code, as the codeword for s1 is a prefix of the codeword for s6, and
hence it is not an instantaneous code. To check unique decodability we consider the third extension of
Code F: the codewords for s1 s1 s2 and s6 s1 s1 are both 00100, hence the third extension of Code F is
singular and Code F is not uniquely decodable.
Note: If two codewords from different extensions of a code happen to be identical, that alone does not
make the code non-uniquely decodable. A valid proof that a code is not uniquely decodable must show that
at least one extension of the code is singular.
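These classifications can also be verified mechanically. The Sardinas–Patterson test (a standard algorithm, used here as an illustrative sketch; it is not part of the original solution) decides unique decodability, and a simple prefix scan decides the instantaneous property. Running it on the six codes reproduces the results above, including L = 3 for A, 2.125 for B and C, and 2 for E.

```python
from math import isclose

def is_prefix_free(words):
    """Instantaneous (prefix) condition: no codeword is a prefix of another."""
    return not any(a != b and b.startswith(a) for a in words for b in words)

def is_uniquely_decodable(words):
    """Sardinas-Patterson test: track the 'dangling suffixes' left over when
    one parse of a stream runs ahead of another; the code is NOT uniquely
    decodable iff some dangling suffix is itself a codeword."""
    words = set(words)
    dangling = {b[len(a):] for a in words for b in words if a != b and b.startswith(a)}
    seen = set()
    while dangling:
        if dangling & words:
            return False
        seen |= dangling
        nxt = set()
        for s in dangling:
            for w in words:
                if w.startswith(s):
                    nxt.add(w[len(s):])
                if s.startswith(w):
                    nxt.add(s[len(w):])
        dangling = (nxt - {""}) - seen
    return True

codes = {
    "A": ["000", "001", "010", "011", "100", "101"],
    "B": ["0", "01", "011", "0111", "01111", "011111"],
    "C": ["0", "10", "110", "1110", "11110", "111110"],
    "D": ["0", "10", "110", "1110", "1101", "1011"],
    "E": ["0", "10", "1100", "1101", "1110", "1111"],
    "F": ["0", "100", "101", "110", "111", "001"],
}
p = [1/2, 1/4, 1/16, 1/16, 1/16, 1/16]
for name, ws in codes.items():
    L = sum(pi * len(w) for pi, w in zip(p, ws))
    print(name, is_uniquely_decodable(ws), is_prefix_free(ws), L)
```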
Question T2.2
A zero-memory source emits seven messages with probabilities 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, and 1/64,
respectively. Obtain the compact binary codes and find the average length of the code word. Determine
the efficiency and the redundancy of the code.
Solution T2.2
All the symbol probabilities of this source are of the form (1/r)^{α_i}, for r = 2 with α_i being some integer.
The compact code is obtained by setting the code word lengths equal to 1, 2, 3, 4, 5, 6 and 6, respectively.
TABLE I
Compact Binary Code for the Given Source
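Because every probability is a power of 1/2, the lengths l_i = \log_2(1/P_i) are integers and the code is absolutely optimal. A short check (the numerical values below are computed here, not stated in the original):

```python
import math

p = [1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/64]
lengths = [round(-math.log2(pi)) for pi in p]    # alpha_i: 1, 2, 3, 4, 5, 6, 6
L = sum(pi * li for pi, li in zip(p, lengths))   # average length: 1.96875 bits/symbol
H = sum(pi * math.log2(1 / pi) for pi in p)      # entropy equals L for a dyadic source
eta = H / L                                      # efficiency 1, so redundancy 0
print(lengths, L, eta)
```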
Question T2.3
A zero-memory source emits seven messages with probabilities 1/3, 1/3, 1/9, 1/9, 1/27, 1/27, and 1/27,
respectively. Obtain the compact 3-ary codes and find the average length of the code word. Determine
the efficiency and the redundancy of the code.
Solution T2.3
All the symbol probabilities of this source are of the form (1/r)^{α_i}, for r = 3 with α_i being some integer.
The compact ternary code is obtained by setting the code word lengths equal to 1, 1, 2, 2, 3, 3 and 3,
respectively.
TABLE II
Compact Ternary Code for the Given Source
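The same check works in base 3: the lengths l_i = \log_3(1/P_i) are integers, giving L = 13/9 ternary digits/symbol with efficiency 1 and redundancy 0 (values computed here, not stated in the original):

```python
import math

p = [1/3, 1/3, 1/9, 1/9, 1/27, 1/27, 1/27]
lengths = [round(math.log(1 / pi, 3)) for pi in p]   # alpha_i: 1, 1, 2, 2, 3, 3, 3
L = sum(pi * li for pi, li in zip(p, lengths))       # 13/9 ternary digits/symbol
H3 = sum(pi * math.log(1 / pi, 3) for pi in p)       # entropy in ternary units equals L
eta = H3 / L
print(lengths, L, eta)
```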
Question T2.4
A source emits three equiprobable messages randomly and independently.
1) Find the source entropy.
2) Find a compact ternary code, the average length of the code word, the code efficiency and the
redundancy.
3) Repeat (2) for a compact binary code.
4) To improve the efficiency of the binary code we now encode the second extension of the source.
Find a compact binary code, the average length of the code word, the code efficiency and the
redundancy.
Solution T2.4
1) The entropy of the given source is
H_3(S) = \sum_{i=1}^{3} P_i \log_3(1/P_i) = 1 3-ary digit/symbol.
2) All the symbol probabilities of this source are of the form (1/r)^{α_i}, for r = 3 with α_i being some
integer. The compact ternary code is obtained by setting the code word lengths equal to 1, 1, and
1, respectively.
TABLE III
Compact Ternary Code for the Given Source
Messages  Symbol Probability  Code
s1        1/3                 0
s2        1/3                 1
s3        1/3                 2
The average word length of this code is L = \sum_{i=1}^{3} P_i l_i = 1 3-ary digit/symbol, and so the
efficiency is
η = H_3(S)/L = 1,
and the redundancy, γ, of the source code is
γ = 1 − η = 0.
3) To find a compact binary code we use Huffman coding.
TABLE IV
Compact Binary Code for the Given Source
Messages  Symbol Probability  Code
s1        1/3                 1
s2        1/3                 00
s3        1/3                 01
The average word length of this code is L = \sum_{i=1}^{3} P_i l_i = 5/3 bits/symbol. The entropy of the
given source is
H_2(S) = \sum_{i=1}^{3} P_i \log_2(1/P_i) = 1.585 bits/symbol,
so the efficiency is
η = H_2(S)/L = 0.951,
and the redundancy, γ, of the source code is
γ = 1 − η = 0.049.
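The Huffman construction of part 3 can be sketched with a small generic routine (an illustrative implementation, not part of the original solution): repeatedly merge the two least-probable nodes, and count, for each symbol, how many merges sit above it.

```python
import heapq
import itertools
import math

def huffman_lengths(p):
    """Binary Huffman coding: repeatedly merge the two least-probable
    nodes; a symbol's code length equals the number of merges above it."""
    counter = itertools.count()          # tie-breaker for equal probabilities
    heap = [(pi, next(counter), [i]) for i, pi in enumerate(p)]
    heapq.heapify(heap)
    lengths = [0] * len(p)
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)
        p2, _, syms2 = heapq.heappop(heap)
        for i in syms1 + syms2:          # everything under the new node gets one bit deeper
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(counter), syms1 + syms2))
    return lengths

lengths = huffman_lengths([1/3, 1/3, 1/3])   # one length-1 and two length-2 words
L = sum(l / 3 for l in lengths)              # 5/3 bits/symbol
eta = math.log2(3) / L                       # ≈ 0.951, matching the solution
print(sorted(lengths), L, eta)
```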
4) The compact binary code for the second extension of the source is given by
TABLE V
Compact Binary Code for the Second Extension of the Source
Messages  Symbol Probability  Code
s1 s1     1/9                 001
s1 s2     1/9                 010
s1 s3     1/9                 011
s2 s1     1/9                 100
s2 s2     1/9                 101
s2 s3     1/9                 110
s3 s1     1/9                 111
s3 s2     1/9                 0000
s3 s3     1/9                 0001
The average word length of this code is L_2 = \sum_{i=1}^{9} P_i l_i = 29/9 bits/symbol. The entropy of the
second extension of the source is
H_2(S^2) = \sum_{i=1}^{9} P_i \log_2(1/P_i) = 3.17 bits/symbol,
so the efficiency is
η = H_2(S^2)/L_2 = 0.984,
and the redundancy, γ, of the code is
γ = 1 − η = 0.016.
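The gain from encoding pairs can be checked arithmetically from the lengths in Table V (seven 3-bit words and two 4-bit words, all pairs equiprobable); a short sketch:

```python
import math

lengths = [3] * 7 + [4] * 2                     # codeword lengths from Table V
assert sum(2.0 ** -l for l in lengths) == 1.0   # Kraft inequality holds with equality
L2 = sum(l / 9 for l in lengths)                # 29/9 bits per source pair
H2 = math.log2(9)                               # entropy of S^2 = 2 log2(3)
eta = H2 / L2                                   # ≈ 0.984
print(L2 / 2)                                   # bits per single source symbol, below the 5/3 of part 3
```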
Question T2.5
A zero-memory source emits six messages with probabilities 0.3, 0.25, 0.15, 0.13, 0.1, and 0.07, respec-
tively. Find the 4-ary Huffman code. Determine its average word length, the efficiency and the redundancy.
Solution T2.5
TABLE VI
4-ary Huffman Code for the Given Source
The average word length of this code is L = \sum_{i=1}^{6} P_i l_i = 1.3 4-ary digits/symbol. The entropy
of the given source is
H_4(S) = \sum_{i=1}^{6} P_i \log_4(1/P_i) = 1.21 4-ary digits/symbol,
so the efficiency is
η = H_4(S)/L = 0.929,
and the redundancy, γ, of the source code is
γ = 1 − η = 0.071.
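For r-ary Huffman coding the source is first padded with zero-probability dummy symbols until the symbol count is congruent to 1 modulo r − 1, so that every merge can absorb exactly r nodes; here one dummy is added to the six symbols. A generic sketch (an illustrative implementation, not from the original):

```python
import heapq
import itertools
import math

def rary_huffman_lengths(p, r):
    """r-ary Huffman: pad with zero-probability dummies until the node count
    is congruent to 1 mod (r-1), then repeatedly merge the r least-probable
    nodes; a symbol's length is the number of merges above it."""
    q = len(p)
    counter = itertools.count()              # tie-breaker for equal probabilities
    heap = [(pi, next(counter), [i]) for i, pi in enumerate(p)]
    heap += [(0.0, next(counter), []) for _ in range((1 - q) % (r - 1))]
    heapq.heapify(heap)
    lengths = [0] * q
    while len(heap) > 1:
        merged_p, merged_syms = 0.0, []
        for _ in range(r):
            pi, _, syms = heapq.heappop(heap)
            merged_p += pi
            merged_syms += syms
        for i in merged_syms:
            lengths[i] += 1
        heapq.heappush(heap, (merged_p, next(counter), merged_syms))
    return lengths

p = [0.3, 0.25, 0.15, 0.13, 0.1, 0.07]
lengths = rary_huffman_lengths(p, 4)              # three length-1 and three length-2 words
L = sum(pi * li for pi, li in zip(p, lengths))    # 1.3 4-ary digits/symbol
H4 = sum(pi * math.log(1 / pi, 4) for pi in p)    # entropy in 4-ary units
print(sorted(lengths), L, H4 / L)
```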
Question T2.6
A zero-memory source emits s1 and s2 with probabilities 0.85 and 0.15, respectively. Find the optimum
(Huffman) binary code for this source as well as its second and third order extensions. Determine the
code efficiencies in each case.
Solution T2.6
TABLE VII
Huffman Binary Code for the Given Source
The average word length of this code is L = \sum_{i=1}^{2} P_i l_i = 1 bit/symbol. The entropy of the
given source is
H_2(S) = \sum_{i=1}^{2} P_i \log_2(1/P_i) = 0.61 bits/symbol,
so the efficiency is
η = H_2(S)/L = 0.61.
Now we consider the second extension of the source for Huffman coding.
TABLE VIII
Huffman Binary Code for the Second Extension of the Source
The average word length of this code is L_2 = \sum_{i=1}^{4} P_i l_i = 1.4275 bits/symbol. The entropy of
the second extension of the source is
H_2(S^2) = \sum_{i=1}^{4} P_i \log_2(1/P_i) = 1.22 bits/symbol,
so the efficiency is
η = H_2(S^2)/L_2 = 0.855.
Now we consider the third extension of the source for Huffman coding.
TABLE IX
Huffman Binary Code for the Third Extension of the Source
The average word length of this code is L_3 = \sum_{i=1}^{8} P_i l_i = 1.89325 bits/symbol. The entropy of
the third extension of the source is
H_2(S^3) = \sum_{i=1}^{8} P_i \log_2(1/P_i) = 1.83 bits/symbol,
so the efficiency is
η = H_2(S^3)/L_3 = 0.966.
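The rising efficiency can be reproduced with a compact routine (an illustrative sketch, not from the original) that returns the optimal average length directly, using the identity that the expected Huffman length equals the sum of the probabilities of all merged internal nodes:

```python
import heapq
import math
from itertools import product

def huffman_avg_len(p):
    """Average length of an optimal binary code: sum the probability of
    every internal node created while merging the two smallest entries."""
    heap = list(p)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        heapq.heappush(heap, merged)
        total += merged
    return total

H = -(0.85 * math.log2(0.85) + 0.15 * math.log2(0.15))   # ≈ 0.610 bits/symbol
for n in (1, 2, 3):
    probs = [math.prod(c) for c in product([0.85, 0.15], repeat=n)]
    Ln = huffman_avg_len(probs)
    print(n, Ln, n * H / Ln)   # efficiency rises toward 1 as n grows
```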
Question T2.7
Consider a source with alphabet {s0 , s1 , s2 , s3 , s4 , s5 } and associated symbol probabilities
{1/3, 1/6, 1/6, 1/6, 1/12, 1/12}.
For this source, find a 3-ary Huffman code and compute its expected code length L.
Solution T2.7
TABLE X
3-ary Huffman Code for the Given Source
The average word length of this code is L = \sum_{i=1}^{6} P_i l_i = 5/3 3-ary digits/symbol.
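With six symbols and r = 3, one zero-probability dummy symbol is needed (the symbol count must be congruent to 1 mod r − 1 = 2). Exact fraction arithmetic confirms L = 5/3 via the merged-node identity (an illustrative sketch, not from the original):

```python
import heapq
from fractions import Fraction as F

def rary_avg_len(p, r):
    """Average length of an optimal r-ary code, via the identity that L equals
    the sum of the probabilities of all merged nodes; zero-probability dummies
    pad the list so that every merge can take exactly r nodes."""
    heap = list(p) + [F(0)] * ((1 - len(p)) % (r - 1))
    heapq.heapify(heap)
    total = F(0)
    while len(heap) > 1:
        merged = sum(heapq.heappop(heap) for _ in range(r))
        heapq.heappush(heap, merged)
        total += merged
    return total

p = [F(1, 3), F(1, 6), F(1, 6), F(1, 6), F(1, 12), F(1, 12)]
print(rary_avg_len(p, 3))   # 5/3
```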
Question T2.8
Consider an information source with four symbols, A, B, C, D, and associated symbol probabilities
pA ≥ pB ≥ pC ≥ pD. Assume that Huffman coding is used to compress the source data and that two
different tree constructions with the following properties are possible:
First tree: the length of the longest path from the root to a symbol (the longest codeword) is 3.
Second tree: the length of the longest path from the root to a symbol is 2.
Derive one constraint relating the symbol probabilities which ensures feasibility of both tree
constructions.
1) Write down the constraint.
2) Show the resulting codes.
Solution T2.8
The condition pC + pD = pA ensures feasibility of both tree constructions: after C and D are merged into
a node of probability pC + pD = pA, Huffman's next merge may pair pB either with that node (giving the
first tree) or with the symbol A (giving the second tree).
TABLE XI
First Huffman Code
Messages  Symbol Probability  First Reduced Source  Second Reduced Source  Huffman Code
A         pA                  pA                    pA                     0
B         pB                  pC + pD               pB + pC + pD           10
C         pC                  pB                                           110
D         pD                                                               111
TABLE XII
Second Huffman Code
Messages  Symbol Probability  First Reduced Source  Second Reduced Source  Huffman Code
A         pA                  pC + pD               pA + pB                00
B         pB                  pA                    pC + pD                01
C         pC                  pB                                           10
D         pD                                                              11
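Both codes can be checked with a concrete, hypothetical distribution satisfying the constraint, e.g. pA = 0.4 and pB = pC = pD = 0.2. Note that pA = pC + pD together with normalization gives 2pA + pB = 1, so both average lengths equal L1 = pA + 2pB + 3(pC + pD) = 4pA + 2pB = 2 and L2 = 2:

```python
# hypothetical probabilities chosen to satisfy pA >= pB >= pC >= pD and pA = pC + pD
p = {"A": 0.4, "B": 0.2, "C": 0.2, "D": 0.2}
assert abs(p["A"] - (p["C"] + p["D"])) < 1e-12

first  = {"A": "0",  "B": "10", "C": "110", "D": "111"}   # Table XI
second = {"A": "00", "B": "01", "C": "10",  "D": "11"}    # Table XII
L1 = sum(p[s] * len(c) for s, c in first.items())
L2 = sum(p[s] * len(c) for s, c in second.items())
print(L1, L2)   # both average lengths come out to 2 bits/symbol
```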
Question T2.9
Consider the pseudo-code for the LZW decoder given below:
initialize TABLE[0 to 255] = code for individual bytes
CODE = read next code from encoder
STRING = TABLE[CODE]
output STRING
while there are still codes to receive:
    CODE = read next code from encoder
    if TABLE[CODE] is not defined:
        ENTRY = STRING + STRING[0]
    else:
        ENTRY = TABLE[CODE]
    output ENTRY
    add STRING + ENTRY[0] to TABLE
    STRING = ENTRY
Suppose that this decoder has received the following five codes from the LZW encoder (these are the
first five codes from a longer compression run):
97 – index of ’a’ in the translation table
98 – index of ’b’ in the translation table
257 – index of second addition to the translation table
256 – index of first addition to the translation table
258 – index of third addition to the translation table
After it has finished processing the fifth code, what are the entries in TABLE and what is the cumulative
output of the decoder?
Solution T2.9
TABLE XIII
Decoder Dictionary
After the fifth code the decoder has added the entries 256 = ab, 257 = bb, 258 = bba and 259 = abb to
TABLE, and the cumulative output is abbbabbba.
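The pseudo-code transcribes directly to Python; running it on the five received codes produces the table additions and the cumulative output (the contents Table XIII would list):

```python
def lzw_decode(codes):
    """Direct transcription of the LZW decoder pseudo-code: TABLE starts with
    the 256 single-byte strings; each received code after the first appends
    one new table entry."""
    table = {i: chr(i) for i in range(256)}
    string = table[codes[0]]
    output = [string]
    next_code = 256
    for code in codes[1:]:
        # the 'TABLE[CODE] is not defined' case: ENTRY must be STRING + STRING[0]
        entry = table.get(code, string + string[0])
        output.append(entry)
        table[next_code] = string + entry[0]   # add STRING + ENTRY[0] to TABLE
        next_code += 1
        string = entry
    return "".join(output), {k: v for k, v in table.items() if k >= 256}

out, added = lzw_decode([97, 98, 257, 256, 258])
print(out)     # abbbabbba
print(added)   # {256: 'ab', 257: 'bb', 258: 'bba', 259: 'abb'}
```

Note that the third code, 257, arrives before entry 257 exists; this is exactly the case the `not defined` branch handles.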
Solution T2.10
TABLE XIV
Decoder Dictionary