ADAPTIVE HUFFMAN CODING
Definition
Adaptive Huffman coding is a method of entropy encoding used for
data compression. Unlike traditional Huffman coding, where the
frequency of each symbol is known upfront and remains static
throughout the encoding process, Adaptive Huffman coding
dynamically updates the encoding scheme as symbols are processed.
This makes it suitable for scenarios where the frequency distribution of
symbols is not known in advance, such as streaming data or data with
unknown characteristics.
Key Concepts and Theory
1. Huffman Coding Recap:
1. Huffman coding is a variable-length prefix code: each symbol is assigned a
codeword whose length depends on its frequency, and no codeword is a prefix of another.
2. More frequent symbols are assigned shorter codes, and less frequent
symbols are assigned longer codes.
3. A Huffman tree (binary tree) is constructed from the symbol frequencies, and
it is used to assign codes.
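As a rough illustration of the recap above, the following Python sketch builds a static Huffman code table from symbol counts. The function name huffman_codes and the tie-breaking rule are assumptions made for this example; other tie-breaking choices give different but equally valid codes.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Return {symbol: bitstring} for a static Huffman code over `text`."""
    counts = Counter(text)
    if len(counts) == 1:
        # A one-symbol alphabet still needs a non-empty codeword.
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: partial_code}).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees under a new parent node.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

codes = huffman_codes("aaaabbc")
for symbol, code in sorted(codes.items()):
    print(symbol, code)
```

For the input "aaaabbc", the most frequent symbol a receives a 1-bit code, while b and c each receive 2-bit codes.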
2. Adaptive Nature:
1. In adaptive Huffman coding, the tree is built and updated as the data stream
is processed. The frequency table is not static, and the Huffman tree can
change dynamically as new symbols are observed.
2. Adaptive Huffman coding is particularly useful when the data is streaming or
when it's impossible to get the full dataset upfront (e.g., real-time encoding
of data).
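To make the changing tree concrete, this small Python sketch recomputes Huffman code lengths from a running count table after each symbol and records the length of the code for b. The helper code_lengths and the sample stream "abcbb" are assumptions made for illustration; a real adaptive coder updates the tree incrementally instead of recomputing it.

```python
import heapq
from collections import Counter

def code_lengths(counts):
    """Huffman code length per symbol for the given counts (needs >= 2 symbols)."""
    heap = [(w, i, [s]) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    depth = dict.fromkeys(counts, 0)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, g1 = heapq.heappop(heap)
        w2, _, g2 = heapq.heappop(heap)
        for s in g1 + g2:
            depth[s] += 1  # each merge pushes these symbols one level deeper
        heapq.heappush(heap, (w1 + w2, tie, g1 + g2))
        tie += 1
    return depth

counts = Counter()
b_lengths = []
for ch in "abcbb":
    counts[ch] += 1
    if len(counts) >= 2 and "b" in counts:
        b_lengths.append(code_lengths(counts)["b"])

print(b_lengths)  # [1, 2, 1, 1]
```

The code for b grows to 2 bits when c first appears, then shrinks back to 1 bit once b becomes the most frequent symbol.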
Example
• Faller-Gallager-Knuth (FGK) Algorithm:
• This is an efficient algorithm used to build an adaptive Huffman tree.
The algorithm works as follows:
1. Initialization:
1. The algorithm starts with an empty tree or a minimal tree that contains only
the escape symbol E (often called the NYT, "not yet transmitted", node).
2. Processing Symbols:
1. For each symbol x in the input stream:
1. If x is not yet in the tree (i.e., it's the first occurrence of x), the code for the escape symbol is
emitted to signal a new symbol, followed by x in a fixed-length form, and a leaf for x is added to the
tree. Otherwise, the current code for x is emitted.
2. The frequency of the symbol x is incremented.
3. The tree is adjusted so that more frequent symbols sit closer to the root and the Huffman
(sibling) property is maintained.
3. Tree Balancing:
1. After processing each symbol, the tree is adjusted to reflect the updated
frequencies and ensure that the tree remains efficient.
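The steps above can be sketched in Python as follows. This is a simplified stand-in and not the FGK algorithm itself: instead of FGK's incremental node swaps, it rebuilds the code table from the running counts before every symbol, which is slower but keeps encoder and decoder in step in the same way. The names ESC, build_codes, encode, and decode are hypothetical, and the sketch assumes every input character fits in 8 bits.

```python
import heapq
from collections import Counter

ESC = "<ESC>"  # hypothetical marker standing in for the escape symbol E

def build_codes(counts):
    """Build a prefix code {symbol: bitstring} from a count table."""
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

def encode(text):
    counts = Counter({ESC: 1})  # minimal model: only the escape symbol
    bits = []
    for ch in text:
        codes = build_codes(counts)
        if ch in counts:
            bits.append(codes[ch])
        else:
            # First occurrence: escape code, then the raw 8-bit symbol.
            bits.append(codes[ESC] + format(ord(ch), "08b"))
        counts[ch] += 1  # update the model after coding the symbol
    return "".join(bits)

def decode(bits):
    counts = Counter({ESC: 1})  # decoder starts from the same model
    out, pos = [], 0
    while pos < len(bits):
        inverse = {c: s for s, c in build_codes(counts).items()}
        end = pos + 1
        while bits[pos:end] not in inverse:  # extend until a codeword matches
            end += 1
        sym = inverse[bits[pos:end]]
        pos = end
        if sym == ESC:
            sym = chr(int(bits[pos:pos + 8], 2))
            pos += 8
        out.append(sym)
        counts[sym] += 1  # mirror the encoder's update
    return "".join(out)

message = "abracadabra"
encoded = encode(message)
print(len(encoded), "bits;", decode(encoded) == message)
```

Because the decoder applies the same count updates as the encoder, no frequency table has to be transmitted. FGK reaches the same effect with far less work per symbol by swapping nodes in place rather than rebuilding the tree.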
Advantages of Adaptive Huffman Coding
• No Predefined Frequency Table: Since the frequencies are updated
dynamically, there's no need for a predefined frequency table, which
is useful for real-time or streaming data.
• Efficient for Unknown Data: It works well when the data distribution
is not known beforehand and changes over time.
• Adaptable: The tree continuously adapts to the input data, keeping the
codes matched to the symbol frequencies observed so far.
Disadvantages of Adaptive Huffman Coding
• Complexity: The algorithm is more complex than static Huffman
coding because it involves dynamically updating the tree and
rebalancing it after each symbol.
• Slightly Lower Compression: In some cases, adaptive Huffman coding
may result in slightly less compression efficiency compared to static
Huffman coding since the tree starts with less information.
• Real-Time Adjustments: Rebalancing the tree after each symbol may
introduce some overhead, especially with larger data sets or high-
frequency updates.
Applications of Adaptive Huffman Coding
• Real-Time Compression: Adaptive Huffman coding is useful in real-
time compression scenarios, such as streaming video or audio data,
where the entire data set is not available upfront.
• Data Streams: It is ideal for situations where data is continuously
generated and needs to be compressed in real-time, such as network
protocols and data transmission.