Huffman Codes
Uniquely Decodable Codes
A variable length code assigns a bit string (codeword)
of variable length to every symbol
e.g. a = 1, b = 01, c = 101, d = 011
What if you get the sequence of bits
1011 ?
Is it aba, ca, or ad?
A uniquely decodable code is a variable length code in
which every bit string can be decomposed into
codewords in at most one way.
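The ambiguity above can be checked mechanically. A minimal brute-force sketch (the helper name and code table are my own, not from the slides):

```python
# The code a = 1, b = 01, c = 101, d = 011 is NOT uniquely decodable:
# the bit string "1011" has several decompositions into codewords.
CODE = {"a": "1", "b": "01", "c": "101", "d": "011"}

def parses(bits):
    """Return every way to split `bits` into codewords of CODE."""
    if bits == "":
        return [""]
    results = []
    for symbol, word in CODE.items():
        if bits.startswith(word):
            results += [symbol + rest for rest in parses(bits[len(word):])]
    return results

print(sorted(parses("1011")))  # ['aba', 'ad', 'ca']
```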
Prefix Codes
A prefix code is a variable length code in which no
codeword is a prefix of another codeword
e.g. a = 0, b = 110, c = 111, d = 10
Can be viewed as a binary tree with symbol values at
the leaves and 0s and 1s on the edges:

        *
      0/ \1
      a   *
        0/ \1
        d   *
          0/ \1
          b   c
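The prefix property is easy to test directly. A small sketch (the helper name is assumed, not from the slides):

```python
def is_prefix_code(codewords):
    """True iff no codeword is a prefix of another codeword."""
    for u in codewords:
        for v in codewords:
            if u != v and v.startswith(u):
                return False
    return True

print(is_prefix_code(["0", "110", "111", "10"]))   # True  (the code above)
print(is_prefix_code(["1", "01", "101", "011"]))   # False (the earlier example)
```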
Some Prefix Codes for Integers
n Binary Unary
1 ..001 0
2 ..010 10
3 ..011 110
4 ..100 1110
5 ..101 11110
6 ..110 111110
Many other fixed prefix codes:
Golomb, phased-binary, subexponential, ...
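The unary column of the table can be generated as follows (a sketch; the function name is my own):

```python
def unary(n):
    """Unary code for n >= 1: (n - 1) ones followed by a terminating zero."""
    return "1" * (n - 1) + "0"

print([unary(n) for n in range(1, 7)])
# ['0', '10', '110', '1110', '11110', '111110']
```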
Average Length
For a code C with associated probabilities p(c), the
average code length is defined as

  ACL(C) = ∑_{c∈C} p(c) · l(c)

where l(c) is the length of the codeword c (a positive
integer).
We say that a prefix code C is optimal if
ACL(C) ≤ ACL(C') for all prefix codes C'.
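As a concrete check of the definition, here is ACL computed for the prefix code a=0, b=110, c=111, d=10, assuming the probabilities used later in the Huffman example:

```python
# Assumed probabilities (the ones from the Huffman example below).
p = {"a": 0.1, "b": 0.2, "c": 0.2, "d": 0.5}
code = {"a": "0", "b": "110", "c": "111", "d": "10"}

# ACL(C) = sum over symbols of p(c) * l(c)
acl = sum(p[s] * len(code[s]) for s in p)
print(acl)  # 0.1*1 + 0.2*3 + 0.2*3 + 0.5*2 ≈ 2.3
```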
A property of optimal codes
Theorem: If C is an optimal prefix code for the
probabilities {p1, …, pn}, then pi < pj
implies l(ci) ≥ l(cj).
Proof: (by contradiction)
Assume l(ci) < l(cj). Consider switching codewords ci and
cj. If ACL is the average length of the original
code, the average length of the new code is
  ACL' = ACL + pj·(l(ci) − l(cj)) + pi·(l(cj) − l(ci))
       = ACL + (pj − pi)·(l(ci) − l(cj))
       < ACL
since pj − pi > 0 and l(ci) − l(cj) < 0. This contradicts
the assumption that C is optimal.
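The exchange identity ACL' = ACL + (pj − pi)(l(ci) − l(cj)) can be checked numerically; toy values assumed:

```python
# Toy values: p_i < p_j, and the likelier symbol j has the LONGER codeword.
pi, pj = 0.1, 0.4   # probabilities, p_i < p_j
li, lj = 2, 5       # codeword lengths, l(c_i) < l(c_j)

# Change in average length when the two codewords are swapped.
delta = (pj - pi) * (li - lj)
print(delta < 0)  # True: giving the shorter word to the likelier symbol helps
```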
Huffman Codes
Invented by Huffman as a class assignment in 1950.
Used in many, if not most, compression algorithms
• gzip, bzip, jpeg (as option), fax compression,…
Properties:
– Generates optimal prefix codes
– Cheap to generate codes
– Cheap to encode and decode
Huffman Algorithm
• Start with a forest of trees each consisting of a
single vertex corresponding to a symbol s and with
weight p(s)
• Repeat until a single tree remains:
– Select the two trees whose roots have minimum
weights p1 and p2
– Join them into a single tree by adding a new root
with weight p1 + p2
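The merge loop above can be transcribed almost directly. A simple sort-based sketch (helper names are my own; a heap-based version belongs under Implementation). The exact bits depend on tie-breaking, but for this example the codeword lengths match the slides:

```python
def huffman_tree(probs):
    """probs: dict symbol -> probability. Returns (weight, tree), where a
    tree is either a symbol (str) or a (left, right) pair."""
    forest = [(p, s) for s, p in probs.items()]
    while len(forest) > 1:
        forest.sort(key=lambda t: t[0])              # two minimum-weight roots first
        (p1, t1), (p2, t2) = forest[0], forest[1]
        forest = forest[2:] + [(p1 + p2, (t1, t2))]  # join under a new root
    return forest[0]

def codes(tree, prefix=""):
    """Read codewords off the tree: 0 = left edge, 1 = right edge."""
    if isinstance(tree, str):
        return {tree: prefix}
    left, right = tree
    return {**codes(left, prefix + "0"), **codes(right, prefix + "1")}

_, tree = huffman_tree({"a": 0.1, "b": 0.2, "c": 0.2, "d": 0.5})
print({s: len(w) for s, w in sorted(codes(tree).items())})
# {'a': 3, 'b': 3, 'c': 2, 'd': 1}
```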
Example
p(a) = .1, p(b) = .2, p(c ) = .2, p(d) = .5
a(.1)   b(.2)   c(.2)   d(.5)

Step 1: join a(.1) and b(.2) into (.3)
Step 2: join (.3) and c(.2) into (.5)
Step 3: join (.5) and d(.5) into (1.0)

            (1.0)
           0/   \1
         (.5)    d(.5)
        0/   \1
      (.3)    c(.2)
     0/   \1
   a(.1)   b(.2)
a=000, b=001, c=01, d=1
Encoding and Decoding
Encoding: Start at the leaf of the Huffman tree for the
symbol and follow the path to the root. Reverse the
order of the bits and send.
Decoding: Start at the root of the Huffman tree and take
the branch for each bit received. On reaching a leaf,
output its symbol and return to the root.
There are even faster methods that can process 8 or
32 bits at a time.
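The procedures described above can be sketched over an explicit code table and tree, using the codes from the example (table lookup for encoding, a bit-by-bit tree walk for decoding):

```python
# Codes from the example: a=000, b=001, c=01, d=1.
CODE = {"a": "000", "b": "001", "c": "01", "d": "1"}
# The matching tree as nested pairs (left, right); leaves are symbols.
TREE = ((("a", "b"), "c"), "d")

def encode(text):
    return "".join(CODE[s] for s in text)

def decode(bits):
    out, node = [], TREE
    for bit in bits:
        node = node[int(bit)]        # take the branch for this bit
        if isinstance(node, str):    # reached a leaf
            out.append(node)
            node = TREE              # return to the root
    return "".join(out)

msg = "badcab"
print(decode(encode(msg)) == msg)  # True
```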
Huffman codes are optimal
Lemma 1: There is an optimal prefix code tree
where the two symbols with smallest probabilities
are siblings in the last level.
Proof.
• An optimal tree is a full binary tree: every internal
node has two children (otherwise a child could be
promoted, shortening its codeword).
• The two symbols with smallest probability must be
in the last level.
• Probabilities in the same level can be interchanged
without affecting the cost of the tree, so the two
smallest can be made siblings.
Notation
ABL(T): the average code length of the prefix code
induced by tree T, i.e. ABL(T) = ∑_{x∈S} p(x)·Depth_T(x).
Lemma 2: Let S' be the alphabet obtained from S by
replacing its two symbols of smallest probability,
say y and z, by a new symbol w with
p(w) = p(y) + p(z). In addition, let T (resp. T') be the
tree constructed by the Huffman algorithm for the
alphabet S (resp. S'). Then ABL(T) = ABL(T') + p(w).
Proof.
  ABL(T) = ∑_{x∈S} p(x)·Depth_T(x)
         = p(y)·Depth_T(y) + p(z)·Depth_T(z)
             + ∑_{x∈S, x≠y,z} p(x)·Depth_T(x)
         = p(w)·(Depth_T'(w) + 1)
             + ∑_{x∈S, x≠y,z} p(x)·Depth_T(x)
         = p(w) + ∑_{x∈S'} p(x)·Depth_T'(x)
         = p(w) + ABL(T')
since y and z are siblings at depth Depth_T'(w) + 1, and
every other symbol has the same depth in T and T'.
Theorem: The Huffman algorithm constructs an
optimal prefix code.
Proof. By induction on the size of the alphabet.
For |S| = 2 the result holds: both codewords have
length 1, which is optimal.
Assume by induction that the result holds whenever
|S| < k.
Induction step: |S| = k. Let T be the tree built by the
Huffman algorithm for S. For the sake of contradiction,
assume the result does not hold. Then there is a tree Z
such that ABL(Z) < ABL(T).
By Lemma 1, we can assume that the two symbols x
and y with smallest probabilities are siblings at
the last level of Z.
Let Z' be the tree obtained from Z by removing these
two siblings and replacing their parent by a leaf w
with probability p(w) = p(x) + p(y).
The computation in Lemma 2 applies to Z as well, so
ABL(Z) = ABL(Z') + p(w). Since
  ABL(Z) = ABL(Z') + p(w) < ABL(T) = ABL(T') + p(w),
it follows that
  ABL(Z') < ABL(T'),
which contradicts the induction hypothesis (T' is the
Huffman tree for the smaller alphabet S').
Implementation
• Maintain a min-heap storing the weights
(probabilities) of the roots of the active trees.
• Each of the n − 1 merges takes O(log n) time for the
heap operations, so the running time is O(n log n).
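A sketch of this heap-based implementation (function and variable names are my own, not from the slides). A running counter breaks weight ties so the heap never compares trees:

```python
import heapq
from itertools import count

def huffman_lengths(probs):
    """probs: dict symbol -> probability.
    Returns dict symbol -> codeword length (depth in the Huffman tree)."""
    tiebreak = count()
    # Heap entries: (weight, tie-break id, {symbol: depth}).
    heap = [(p, next(tiebreak), {s: 0}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, t1 = heapq.heappop(heap)   # two minimum-weight roots
        p2, _, t2 = heapq.heappop(heap)
        # Joining under a new root deepens every symbol by 1.
        merged = {s: d + 1 for s, d in {**t1, **t2}.items()}
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

print(dict(sorted(huffman_lengths({"a": 0.1, "b": 0.2,
                                   "c": 0.2, "d": 0.5}).items())))
# {'a': 3, 'b': 3, 'c': 2, 'd': 1}
```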