Professional Documents
Culture Documents
Lect 10
Lect 10
Motivation Uniquely decipherable codes Prefix codes Huffman code construction Extensions and applications
Motivation
Suppose we want to store and transmit very large files (messages) consisting of strings (words) constructed over an alphabet of characters (letters). Representing each character with a fixed-length code will not result in the shortest possible file! Example: 8-bit ASCII code for characters
some characters are much more frequent than others using shorter codes for frequent characters and longer ones for infrequent ones will result in a shorter file
Example
a
Frequency (%) 45 Fixed-length 000 Variable-length 0 Message: abadef
b
13 001 101
c
12 010 100
Terminology
Let w, p, and s be words over the alphabet C. For w = ps, p is the prefix and s is the suffix of w. Let t be a non-empty word. t is called a tail iff there exist two messages c1c2cm and c1c2cn such that:
ci and cj are code-words and c1 c1 (1 i n, 1 j m) t is a suffix of cn c1c2cmt = c1c2cn
The length of a word w is l(w). w is non-empty when l(w) > 0. l is the maximum length of a code-word in C.
000001000011100101 0101011111011100 A file of 100,000 characters takes: 3100,000 = 300,000 bits with fixed-length code (.451 + .133 + .123+ .163 + .094 + .054) 100,000 = 224,000 bits on average with variable-length code (25% less)
Represent the characters from an input alphabet using a variable-length code alphabet C, taking into account the occurrence frequency of the characters. Desired properties: The code must be uniquely decipherable: every message can be decoded in only one way. The code must be optimal with respect to the input probability distribution. The code must be efficiently decipherable prefix code: no string is a prefix of another.
j) do:
Example 1
C = {00,10,11,100,110} 1. Tails: 10.0 = 100 t = 0 11.0 = 110 t = 0 2. Tails 0.0 = 00 t=0
Example 2
C = {1,00,101,010} 1. Tails: 1.01 = 101 t = 01 2. Tails 01.0 = 010 t = 0 0.10 = 010 t = 10 10.1 = 101 t = 1 1 is a code-word! C is not UD Counter-example: 10100101 has two meanings: 1.010.010.1 101.00.101
C is UD
Prefix codes
We consider only prefix codes: no code-word is a prefix of another code-word. Prefix codes are uniquely decipherable by definition. A binary prefix code can be represented as a binary tree:
leaves are a code-words and their frequency (%) internal nodes are binary decision points: 0 means go to the left, 1 means go to the right of a character. They include the sum of frequencies of their children. The path from the root to the code-word is the binary representation of the code-word.
Data Structures, Spring 2004 L. Joskowicz
Message: 000.001.000.011.100.101
c b 001 010
1
14
a 0
0
58
1
28
0
14
1 1 0 d 0 f 1100 1 e 1101
1 f 5
b 101
a 45 b: 13 c 12
d 16 e 9
111
Frequency (%)
100
1
55
0
25
1 1 0
14 30
0 c 12 Frequency (%)
Data Structures, Spring 2004 L. Joskowicz
1 d 16
b: 13 0 f 5
1 e 9
B (T )
c C
f (c )dT (c )
f ( c )dT ( c )
Start: Step 1:
f 5 c 12
e 9 b: 13
c 12 b: 13 d 16
a 45 a 45
14
d 16 1 f 5
0 e 9 0 e 9
Data Structures, Spring 2004 L. Joskowicz
1 f 5
0 b: 13
1 c 12
0
14
1 d 16 1 f 5
0 e 9
100
0 a 45 0
25
1 0 1 c 12 101
1 0
14 30
0 b: 13 100
Data Structures, Spring 2004 L. Joskowicz
0 e 9 1100
1 f 5 1101
return Extract-Min(Q)
Data Structures, Spring 2004 L. Joskowicz
d 16 111
55
b: 13
c 12
0 e 9
1 f 5
Step 3:
25
30
Step 4: a 45
a 45
25
0 0 1
55
1 0
14 30
1 d 16
Step 2:
14
d 16 0
25
a 45 1 c 12
b: 13
T first transformation y x b a
a b x y
f (c)dT (c)
c C
[ f ( y)dT ' ( y) f (b)dT ' (b)] [ f ( y)dT '' ( y) f (b)dT ' (b)] [ f ( y)dT ' ( y) f (b)dT ' (b)] [ f ( y)dT ' (b) f (b)dT ' ( y)] ( f (b) f ( y))(dT ' (b) dT ' ( y)) 0
because 0 f (b) f (y) and 0 (dT(b) dT(y)) Since T is optimal, B(T) = B(T)
[ f (x)dT ( x) f (a)dT (a)] [ f ( x)dT ' (x) f (a)dT ' (a)] [ f ( x)dT ( x) f (a)dT (a)] [ f ( x)dT (a) f (a)dT ( x)] ( f (a) f ( x))(dT (a) dT ( x)) 0
because 0 f (a) f (x) and 0 (dT(a) dT(x)) Since T is optimal, B(T) = B(T)
B(S) = B(S) [f (x)dS(x) + f (y)dS(y)] + f (z)dS(z) Since dS(x) = dS(y) = dS(z) + 1, we get: B(S) = B(S) f (x) f (y) and similarly, B(T) = B(T) f (x) f (y)