Step-01:
•Create a leaf node for each distinct character of the text.
•The leaf node of a character stores that character's frequency of occurrence.
Step-02:
•Arrange all the nodes in increasing order of their frequency value.
Step-03:
Taking the two nodes with minimum frequency,
•Create a new internal node.
•The frequency of this new node is the sum of the frequencies of those two nodes.
•Make the first node the left child and the other node the right child of the newly created node.
Step-04:
•Keep repeating Step-02 and Step-03 until all the nodes form a single tree.
•The tree finally obtained is the desired Huffman Tree.
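The four steps above can be sketched with a min-heap, which keeps the two minimum-frequency nodes at the front (the representation below, nested tuples with characters at the leaves, is an illustrative choice, not prescribed by the text):

```python
import heapq
from collections import Counter

def build_huffman_tree(text):
    # Steps 01-02: one leaf per distinct character, kept in increasing
    # frequency order by a min-heap; entries are (freq, tie_breaker, payload).
    heap = [(freq, i, ch)
            for i, (ch, freq) in enumerate(sorted(Counter(text).items()))]
    heapq.heapify(heap)
    tie = len(heap)
    # Steps 03-04: merge the two minimum-frequency nodes until one tree remains.
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, tie, (left, right)))
        tie += 1
    return heap[0][2]   # payload of the root

def code_table(node, prefix="", table=None):
    # Walk the finished tree: 0 marks a left move, 1 a right move.
    if table is None:
        table = {}
    if isinstance(node, tuple):
        code_table(node[0], prefix + "0", table)
        code_table(node[1], prefix + "1", table)
    else:
        table[node] = prefix or "0"   # a lone-symbol input still gets a code
    return table

codes = code_table(build_huffman_tree("abracadabra"))
```

Because the two lowest-frequency nodes are always merged first, frequent characters end up close to the root and receive the shortest codes.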
Adaptive Huffman coding
The benefit of the one-pass procedure is that the source can be encoded in real time,
though it becomes more sensitive to transmission errors, since a single bit loss can ruin
the whole code.
Tree Manipulation
Each node has a sibling
Nodes with higher weights have higher orders
On each level, the node farthest to the right has the highest order, even when other nodes on that level have equal weight
Leaf nodes contain character values, except the Not Yet Transmitted (NYT) node, which is the node at which all new characters are
added
Internal nodes contain weights equal to the sum of their children's weights
All nodes of the same weight will be in consecutive order.
Every tree contains a root and a NYT node, where the NYT node is the node with the lowest order in the tree.
When a character is read, check whether the tree already contains that character. If it doesn't, the NYT node spawns two new
nodes. The node to its right is a new leaf containing the character, and the new left node becomes the new NYT node. If the
character is already in the tree, you simply update the weight of that particular tree node. In some cases, when the node is not
the highest-ordered node in its weight class, you will need to swap this node so that it satisfies the property that nodes with
higher weight have higher orders. To do this, before you update the node's weight, search the tree for all nodes of equal weight
and swap the soon-to-be-updated node with the highest-ordered node of equal weight. Finally, update the weight.
However, in both cases of inserting values, the weight of a leaf changes, and this change affects all nodes above it.
Therefore, after you insert a node, you must update the parents above it following the same procedure you followed when
updating already-seen values. Check whether the node in question is the highest-ordered node in its weight class prior to
updating. If not, swap it with the node that is the highest-ordered, making sure to reassign only the pointers of the two nodes
being swapped.
Tree Manipulation Procedure: Be sure to notice the key verbs here: insert new value, give birth to new nodes, update weight, check if max in weight class, swap, isRoot, move to parent. Not all of these will be functions, but these actions will form the basis of a tree manipulation class for encoding and decoding.
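A minimal sketch of such a class, built around those key verbs, might look as follows. This is one possible layout, not the text's own: node orders are represented implicitly by position in a list (highest order first), and the update rule is the standard FGK one of swapping with the highest-ordered node of equal weight before incrementing.

```python
class Node:
    """One tree node; `symbol` is None for internal nodes and for the NYT."""
    def __init__(self, weight=0, symbol=None, parent=None):
        self.weight, self.symbol, self.parent = weight, symbol, parent
        self.left = self.right = None

class AdaptiveHuffmanTree:
    def __init__(self):
        self.root = self.nyt = Node()
        self.nodes = [self.root]      # implicit orders: highest order first
        self.leaves = {}              # symbol -> its leaf node

    def insert(self, symbol):
        """Insert a new value, or update the weight of an already-seen one."""
        leaf = self.leaves.get(symbol)
        if leaf is None:
            # NYT gives birth: it becomes internal, with the new NYT on the
            # left and the character leaf on the right (the two lowest orders).
            old = self.nyt
            self.nyt = Node(parent=old)
            leaf = self.leaves[symbol] = Node(weight=1, symbol=symbol, parent=old)
            old.left, old.right, old.weight = self.nyt, leaf, 1
            self.nodes += [leaf, self.nyt]
            node = old.parent         # weight updates resume above the old NYT
        else:
            node = leaf               # seen before: start updating at its leaf
        while node is not None:
            # check if max in weight class; if not, swap (never with a parent)
            top = next(n for n in self.nodes if n.weight == node.weight)
            if top is not node and top is not node.parent:
                self._swap(node, top)
            node.weight += 1          # update weight, then move to parent
            node = node.parent

    def _swap(self, a, b):
        """Exchange two subtrees: their orders and their parent links only."""
        i, j = self.nodes.index(a), self.nodes.index(b)
        self.nodes[i], self.nodes[j] = b, a
        ap, bp = a.parent, b.parent
        a_slot = "left" if ap.left is a else "right"
        b_slot = "left" if bp.left is b else "right"
        setattr(ap, a_slot, b)
        setattr(bp, b_slot, a)
        a.parent, b.parent = bp, ap

t = AdaptiveHuffmanTree()
for ch in "abracadabra":
    t.insert(ch)
```

The linear search for the highest-ordered node of equal weight is O(n) per step; production implementations index nodes by weight class instead, but the sketch keeps the swap-before-increment logic easy to follow.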
Encoding Procedure
Once you have the functions of your tree manipulation working correctly, it is relatively easy to complete the encoding and decoding parts of adaptive Huffman coding. To encode, you simply read through the file to be compressed one character at a time. If you have seen the character before, you write to the output file the root-to-leaf path, with a 1 marking a move right and a 0 marking a move left, the same as you would in static Huffman coding. If the character is new, write out the root-to-leaf path of the NYT node to alert the decoder that a new character follows. Then write out the new character itself. Use nine bits in anticipation of the PSEUDO_EOF. Finally, update the tree by calling the appropriate insert function with the new value. Read through the entire file in this manner, and when you are done, manually write out the final root-to-NYT path followed by the PSEUDO_EOF character.
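The path-writing step can be illustrated on a fixed toy tree (the tree layout and the NYT sentinel below are illustrative assumptions; in adaptive coding the tree changes after every symbol):

```python
# Toy static tree: internal nodes are (left, right) pairs, leaves are values.
NYT = "<NYT>"
tree = ("a", ("b", NYT))

def path_to(node, target, path=""):
    """Root-to-leaf bit path: a 0 marks a move left, a 1 a move right."""
    if not isinstance(node, tuple):
        return path if node == target else None
    return (path_to(node[0], target, path + "0")
            or path_to(node[1], target, path + "1"))

def encode_symbol(tree, seen, ch):
    """Seen character: emit its path. New character: emit the NYT path,
    then the raw character in nine bits (nine leaves room for PSEUDO_EOF)."""
    if ch in seen:
        return path_to(tree, ch)
    return path_to(tree, NYT) + format(ord(ch), "09b")
```

For this tree, a seen "a" costs one bit, while a brand-new character costs the two-bit NYT path plus nine literal bits.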
Decoding Procedure
The decoding procedure is very similar to the encoding procedure and should be easy to work out from the information in the previous section. To uncompress the compressed file, read it in one bit at a time, traversing the tree as it stands at that point. Eventually, you will come to a leaf. If that leaf has a character value, write the eight-bit translation of the character to the uncompressed file and then update the count of that character in the tree, making sure all necessary changes are made to the tree as a whole. If the leaf is the NYT node, read in the next nine bits and write out the eight-bit translation of that character. Then insert the new character into the tree. It is extremely important to remember that the compressor and decompressor, although reading input in different manners, must construct exactly the same trees. At any given point in a file, in either operation, the trees would be identical if compared. Do not confuse this with saying that the compressor and decompressor run simultaneously. They don't. But they do construct the same trees after reading the same information. You might even say that they use the same "Adaptive Huffman Tree class" - but that might be just a tad presumptuous.
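The bit-by-bit traversal can be sketched on the same kind of toy static tree (again, the layout and NYT sentinel are illustrative assumptions; a real adaptive decoder would update the tree after every symbol, mirroring the encoder):

```python
NYT = "<NYT>"
tree = ("a", ("b", NYT))   # internal nodes are (left, right); leaves are values

def decode(bits, tree, nyt):
    """Traverse the tree one bit at a time: 0 -> left child, 1 -> right child.
    Reaching the NYT leaf means the next nine bits are a literal character."""
    out, node, i = [], tree, 0
    while i < len(bits):
        node = node[0] if bits[i] == "0" else node[1]
        i += 1
        if isinstance(node, tuple):
            continue                      # still at an internal node
        if node == nyt:                   # NYT leaf: a literal follows
            out.append(chr(int(bits[i:i + 9], 2)))
            i += 9
        else:
            out.append(node)              # ordinary leaf: emit its character
        node = tree                       # restart at the root
    return "".join(out)
```

Feeding the decoder the bits the toy encoder would produce for "a", "b", then a new character "c" recovers the original text, which is exactly the encoder/decoder symmetry the paragraph above describes.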
Application
Applications of Huffman Encoding-
•Huffman encoding is widely used in
compression formats like GZIP, PKZIP (WinZip) and
BZIP2.
•Multimedia codecs like JPEG, PNG and MP3 use
Huffman encoding (more precisely, prefix
codes).
•Huffman encoding still dominates the
compression industry, since newer arithmetic
and range coding schemes have been avoided due
to their patent issues.