Professional Documents
Culture Documents
Huffman Coding
Huffman Coding
CLRS- 16.3
Huffman coding.
Frequency in 000s
Freq in 000s
a fixed-length
a variable-length
!
bits,
and
" '$&)*'%'$&+(
Encoding
Given a code (corresponding to some alphabet ) and
a message it is easy to encode the message. Just
replace the characters by the codewords.
Example:
If the code is
Decoding
Prefix-Codes
Fixed-length codes are always uniquely decipherable
(why).
We saw before that these do not always give the best
compression so we prefer to use variable length codes.
Prefix Code: A code is called a prefix (free) code if
no codeword is a prefix of another one.
Example:
a prefix code.
is
a
bits,
saving.
n4/100
e/45
n3/55
n2/35
n1/20
a/20
b/15
c/5
d/15
n4/100
e/45
n3/55
n2/35
a/20
b/15
c/5
d/15
n1/20
to the length,
of the codeword in code
The sum
is the weighted external
pathlength of tree .
11
Huffman Coding
Step 2: Set frequency
.
Remove
and add creating new alphabet
.
Note that
12
be the
Let
alphabet and its frequency distribution.
In the first step Huffman coding merges
and .
a/20
b/15
e/45
n1/20
c/5
Alphabet is now
d/15
.
13
Alphabet is now
Algorithm merges and
a/20
.
and ).
n2/35
e/45
n1/20
b/15
New alphabet is
c/5
d/15
.
14
and
.
.
e/45
55
0
n2/35
a/20
n1/20
b/15
New alphabet is
c/5
d/15
.
15
Current alphabet is
.
and finishes.
Algorithm merges and
n4/100
e/45
n3/55
n2/35
n1/20
a/20
b/15
c/5
d/15
16
e/45
n3/55
n2/35
n1/20
a/20
b/15
c/5
d/15
Huffman code is
, ,
, ,
.
This is the optimum (minimum-cost) prefix code for
this distribution.
17
18
19
Lemma: Consider the two letters, and with the smallest frequencies. There is an optimal code tree in which these two letters are sibling leaves in the tree in the lowest level.
Proof: Let
be an optimum prefix code tree, and let and
be two siblings at the maximum depth of the tree (must exist
is
full).
Assume
without
loss
of
generality
that
because
and
(if this is not true, then rename these
and
have the two smallest frequencies
characters). Since
it
follows that
(they may be equal) and
(may be equal). Because
are at the deepest
level of the
and
tree we know that
and
.
Now switch the positions of and in the tree resulting in a different tree
and see how the cost changes. Since is optimum,
, that is, is an optimum tree. By
Therefore,
switching with we get a new tree which by a similar argument is optimum. The final tree satisfies the statement of the
claim.
20
21
22