You are on page 1of 6

Hamming and Huffman Coding Tutorial

By Tom S. Lee Encoding: Hamming and Huffman codes are completely different tools used by computers. Hamming code is an error detecting and correcting tool, while Huffman is a compression tool. Both can be used together, however, to both compress and error detect a message. The following tutorial will show both methods used together. Let's begin by compressing the simple message "codes_are_cool". The first thing we need to do is detect which letters are being used in the message, and their frequency. The list is as follows: c o d e s _ a r l 2/14 3/14 1/14 2/14 1/14 2/14 1/14 1/14 1/14 14/14

We then make a tree graph, beginning at the bottom and working our way up. We start with the lowest frequency numbers and combine them to one node. The graph is as follows:

Next, we read our codes from the top down (to keep it prefix free) and encode our characters: c o d e s _ a r l 011 11 0000 100 0001 010 0010 0011 101

We can now convert our code:

0111100001000001010001000111000100111111101 Now this is our Huffman code, but what about hamming? we must now break this code into byte sized chunks. I will break it down into 4 bit sections: 0111 1000 0100 0001 0100 0100 0111 0001 0011 1111 101 For each section, I need to find the check bits. First set up a table like the following: 0 0 0 1 0 0 1 0 0 0 1 1 0 1 0 0 0 1 0 1 0 1 1 0 0 1 1 1 1 0 0 0

and every colomn that has a single 1, put a check bit. Everywhere else, put a data bit: 0 0 0 1 C1 0 0 1 0 C2 0 0 1 1 D1 0 1 0 0 C3 0 1 0 1 D2 0 1 1 0 D3 0 1 1 1 D4 1 0 0 0 C4

Now input the first data into our table: 0 0 0 1 C1 0 0 1 0 C2 0 0 1 1 D1 0 0 1 0 0 C3 0 1 0 1 D2 1 0 1 1 0 D3 1 0 1 1 1 D4 1 1 0 0 0 C4

We must now figure out what our check bits are. Let's leave off C4 for simplicity. Look in the row where C1 has a 1. Every place that has a 1 needs to be xor'd together. This will give us the value of that check bit. We do this in each row. For example: C1 = D1 xor D2 xor D4 C1 = 0 xor 1 xor 1 C1 = 0 C2 = D1 xor D3 xor D4 C2 = 0 xor 1 xor 1 C2 = 0 C3 = D2 xor D3 xor D4 C3 = 1 xor 1 xor 1 C3 = 1

So our Hamming code of 0111 becomes: 0001111 And thats all there is to it. Lets try the second code: 0 0 0 1 C1 0 0 1 0 C2 0 0 1 1 D1 1 0 1 0 0 C3 0 1 0 1 D2 0 0 1 1 0 D3 0 0 1 1 1 D4 0 1 0 0 0 C4

C1 = D1 xor D2 xor D4 C1 = 1 xor 0 xor 0 C1 = 1 C2 = D1 xor D3 xor D4 C2 = 1 xor 0 xor 0 C2 = 1 C3 = D2 xor D3 xor D4 C3 = 0 xor 0 xor 0 C3 = 0 Our resulting code is: 1110000 We can go through all of these in the same manner and obtain a correct code for each, but I will leave that as an exercise for the reader. Decoding: For this part of the tutorial, I will use a different example, as the previous is quite an exhaustive amount of work. Here is the Huffman code that is given: d g o _ 10 110 0 111

The hamming codes that were given are as follows (ignore the 8th bit in each): 0 0 0 1 1 1 1 1 0 0 1 0 1 0 0 0 1 1 1 0 0 1 1 1 0 1 0 0 0 0 0 0

So we create the same table that we did in the encoding part and place the first byte of code in it: 0 0 0 1 0 0 1 0 0 0 1 1 0 1 0 0 0 1 0 1 0 1 1 0 0 1 1 1 1 0 0 0

C1 0

C2 1

D1 0

C3 1

D2 1

D3 0

D4 0

C4 0

Now similarly to encoding, we must xor all bits whose row corresponds to the check bit in question. The difference is, this time we must also xor in the check bit. The result should be zero. If not, an error has occurred in transmission. A1 = C1 xor D1 xor D2 xor D4 A1 = 0 xor 0 xor 1 xor 0 A1 = 1 A2 = C2 xor D1 xor D3 xor D4 A2 = 1 xor 0 xor 0 xor 0 A2 = 1 A3 = C3 xor D2 xor D3 xor D4 A3 = 1 xor 1 xor 0 xor 0 A3 = 0 We can see that all three A's are not equal to zero. This lets us know that there is an error in the code. It also tells us where exactly the error is. If you read the errors in reverse order, you will have the binary location of the error. Looking at this error, we can see there is an error in location three. We simply flip the bit, and we get the Huffman code: 1100 Now we try the second byte. 0 0 0 1 C1 0 0 0 1 0 C2 1 0 0 1 1 D1 0 0 1 0 0 C3 0 0 1 0 1 D2 1 0 1 1 0 D3 1 0 1 1 1 D4 1 1 0 0 0 C4 0

A1 = C1 xor D1 xor D2 xor D4 A1 = 0 xor 0 xor 1 xor 1 A1 = 0 A2 = C2 xor D1 xor D3 xor D4 A2 = 1 xor 0 xor 1 xor 1 A2 = 1 A3 = C3 xor D2 xor D3 xor D4 A3 = 0 xor 1 xor 1 xor 1 A3 = 1 We can see there is an error in location 6 so we flip the bit and get the huffman code: 0101 Now we try byte three 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 1 1 0 1 1 1 0 0

1 C1 0

0 C2 1

1 D1 1

0 C3 0

1 D2 1

0 D3 1

1 D4 0

0 C4 0

A1 = C1 xor D1 xor D2 xor D4 A1 = 0 xor 1 xor 1 xor 0 A1 = 0 A2 = C2 xor D1 xor D3 xor D4 A2 = 1 xor 1 xor 1 xor 0 A2 = 1 A3 = C3 xor D2 xor D3 xor D4 A3 = 0 xor 1 xor 1 xor 0 A3 = 0 So there is an error in bit two. Bit two is a check bit, so it has no effect on the huffman code, which is: 1110 For the last bit, we get: 0 0 0 1 C1 1 0 0 1 0 C2 1 0 0 1 1 D1 0 0 1 0 0 C3 0 0 1 0 1 D2 0 0 1 1 0 D3 1 0 1 1 1 D4 0 1 0 0 0 C4 0

A1 = C1 xor D1 xor D2 xor D4 A1 = 1 xor 0 xor 0 xor 0 A1 = 1 A2 = C2 xor D1 xor D3 xor D4 A2 = 1 xor 0 xor 1 xor 0 A2 = 0 A3 = C3 xor D2 xor D3 xor D4 A3 = 0 xor 0 xor 1 xor 0 A3 = 1 So there is an error in bit 5. Simply by flipping the bit, we get the huffman code: 0110 Now that we have corrected all the errors in the code, we can begin to decompress it. The code reads: 1100 0101 1110 0110 Now looking at the huffman chart, we can decipher this code: d g o 10 110 0

_ 110 0 0 10 111 10 0 110

111 g o o d _ d o g

So our message reads: good_dog

You might also like