
Huffman Coding

Algorithm
Data Compression and Data Retrieval
Encoding
Decoding
Prefix-Codes
Fixed-length vs. variable-length coding
Introduction
• Popular method for data compression.
• Developed by David Huffman in 1951, while he was a student in an information theory class at MIT.
• Works by building a binary tree.
• It produces an optimal prefix code, which is a variable-length code.
• Blocks with higher probabilities are assigned shorter codewords, and blocks with lower probabilities are assigned longer codewords.
• A code tree is generated, and the Huffman code is obtained by labelling the branches of that tree.
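To illustrate why giving shorter codewords to likelier blocks helps, the sketch below compares the average number of bits per symbol for a fixed-length code and a variable-length code. The four symbols, their probabilities, and both code tables are hypothetical values chosen for illustration; they are not taken from these slides.

```python
# Hypothetical probabilities and code tables, chosen only for illustration.
probabilities = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}
fixed_code = {"A": "00", "B": "01", "C": "10", "D": "11"}
variable_code = {"A": "0", "B": "10", "C": "110", "D": "111"}


def average_length(probs, code):
    """Expected codeword length in bits per symbol: sum of p(s) * len(code[s])."""
    return sum(p * len(code[sym]) for sym, p in probs.items())


fixed_avg = average_length(probabilities, fixed_code)
variable_avg = average_length(probabilities, variable_code)
print(fixed_avg, variable_avg)
```

With these assumed numbers the variable-length table needs fewer bits per symbol on average, because the most probable symbol costs a single bit.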
Cont…
• Purpose: to construct a minimum-redundancy code.
• Key feature: variable-length codewords can be packed together with no separators between them.
• In a Huffman-encoded data stream, each character can have a variable number of bits. How do we separate one character from the next?
• The Huffman codes must be chosen so that no codeword is a prefix of another; this is what enables the correct separation.
• Ex.: the characters A to G occur in the original data stream with probabilities A=0.154, B=0.110, C=0.072, D=0.063 and so on, and are assigned the codes A=1, B=01, … , G=000011.
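The separation question above can be made concrete with a short sketch. It checks that a code table is prefix-free and then splits a packed bit string back into symbols. The function names and the four-symbol table are hypothetical; the table is not the A to G table from the slide, whose middle entries are not given.

```python
def is_prefix_free(code):
    """Return True if no codeword is a prefix of another codeword."""
    words = sorted(code.values())
    # In sorted order, a codeword that prefixes another one is always
    # directly followed by a word that starts with it.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


def split_codewords(bits, code):
    """Split a packed bit string into symbols using the code table."""
    lookup = {word: sym for sym, word in code.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:  # a complete codeword has been read
            symbols.append(lookup[current])
            current = ""
    if current:
        raise ValueError("bit string ends in the middle of a codeword")
    return "".join(symbols)


# Hypothetical table for illustration only.
code = {"A": "1", "B": "01", "C": "001", "D": "000"}
encoded = "".join(code[s] for s in "ABACD")
print(encoded, split_codewords(encoded, code))
```

Because no codeword is a prefix of another, the first match found while reading bits is always the right one, so no separator bits are needed.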
Algorithm
• Input: a set of symbols and their probabilities (or frequencies)
• Output: a prefix-free binary code with minimum expected codeword length
• Algorithm:
1. Begin with a list of all symbols and their associated frequencies.
2. Find the two symbols with the lowest frequencies.
3. Create a new symbol and link it to these two symbols as their parent.
4. Remove the two original symbols from the list.
5. Give the new symbol the combined frequency of the two removed symbols.
6. Add the new symbol to the list.
7. Repeat steps 2 to 6 until only one symbol remains in the list; it is the root of the code tree.
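The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: it assumes symbols are strings, uses a binary heap (`heapq`) in place of the plain list so that the two lowest frequencies are easy to find, and labels left branches 0 and right branches 1. The function name `huffman_code` and the sample frequencies are hypothetical.

```python
import heapq
from itertools import count


def huffman_code(frequencies):
    """Build a prefix-free code from a {symbol: frequency} mapping."""
    if not frequencies:
        raise ValueError("at least one symbol is required")
    tie = count()  # tie-breaker so the heap never has to compare trees
    # Heap entries are (frequency, tie, tree); a tree is either a symbol
    # (leaf) or a (left, right) tuple (merged node).
    heap = [(freq, next(tie), sym) for sym, freq in frequencies.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a lone symbol still needs a one-bit codeword
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)  # lowest frequency
        f2, _, t2 = heapq.heappop(heap)  # second lowest
        heapq.heappush(heap, (f1 + f2, next(tie), (t1, t2)))

    code = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")  # left branch
            walk(tree[1], prefix + "1")  # right branch
        else:
            code[tree] = prefix

    walk(heap[0][2], "")
    return code


# Hypothetical frequencies for illustration only.
freqs = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
code = huffman_code(freqs)
total_bits = sum(freqs[s] * len(code[s]) for s in freqs)
print(code, total_bits)
```

The exact bit patterns depend on how ties and left/right placement are handled, but the codeword lengths, and therefore the total encoded size, are what the algorithm minimises.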
Example (slide figures not reproduced)
Decoding Huffman Code
• As we read bits from the input stream, we traverse the tree beginning at the root, taking the left-hand path if we read 0 and the right-hand path if we read 1. When we hit a leaf, we have found the symbol for that codeword; we output it and return to the root for the next codeword.
• Advantages:
• Easy to implement.
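• Produces lossless compression.
• Disadvantages:
• Compression is comparatively slow, since the symbol frequencies must be gathered and the code tree built before any data can be encoded.
• Codewords have variable length, so the decoder cannot tell where a codeword ends without following the bits through the code tree.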
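The tree-traversal decoding described above can be sketched as follows. The sketch assumes symbols are strings and represents the tree as nested dictionaries keyed by the bits "0" (left) and "1" (right); the helper names and the code table are hypothetical and serve only as an illustration.

```python
def build_tree(code):
    """Rebuild the code tree from a {symbol: codeword} table.

    Internal nodes are dicts mapping '0'/'1' to children; leaves are symbols.
    """
    root = {}
    for sym, word in code.items():
        node = root
        for bit in word[:-1]:
            node = node.setdefault(bit, {})
        node[word[-1]] = sym
    return root


def decode(bits, root):
    """Walk the tree bit by bit, emitting a symbol at every leaf."""
    out, node = [], root
    for bit in bits:
        node = node[bit]  # '0' takes the left path, '1' the right path
        if not isinstance(node, dict):  # reached a leaf
            out.append(node)
            node = root  # start again at the root for the next codeword
    if node is not root:
        raise ValueError("input ends in the middle of a codeword")
    return "".join(out)


# Hypothetical prefix-free table for illustration only.
table = {"A": "1", "B": "01", "C": "001", "D": "000"}
tree = build_tree(table)
print(decode("1011001000", tree))
```

The final check reflects the end-of-stream issue: if the bit stream is padded or truncated, the decoder ends up partway down the tree and needs some external convention, such as a stored symbol count, to know where the data stops.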
Thank You
