Professional Documents
Culture Documents
7 ITCT Huffman Coding
7 ITCT Huffman Coding
Entropy
H = p i log 2
i = 2, p1 = 1,
1 pi
H =0 H = 1 bits / symbol
p2 = 0
p1 = 1 2 , p2 = 1 2
i = 4, p1 = p2 = p3 = p4 = 1 4 H = 2 bits / symbol
From information theory, the average number of bits needed to encode the symbols in a source S is always bounded by the entropy of S.
Huffman Coding
Given the statistical distribution of the gray levels Generate a code that is as close as possible to the minimum bound, entropy A variable length code
Huffman Coding
Five steps
1.
2.
3. 4. 5.
Find the gray-level probabilities for the image by finding the histogram Order the input probabilities (histogram magnitudes)from smallest to largest Combine the smallest two by addition GOTO step2,until only two probabilities are left By working backward along the tree,generate code by alternating assignment of 0 and 1
Example
Number of pixels
40 30 20 10
g = 100 = 0.2
0
20
g = 100= 0.3
1
30
g = 100 = 0.1
2
10
g = 100 = 0.4
3
40
g g g g
00 01 10 11
Entropy =
i =0
p log( p )
i i
= [(0.2) log(0.2) + (0.3) log(0.3) + (0.1) log(0.1) + (0.4) log(0.4)] 1.846 bits/pixel
ave
l p
i= 0 i
L 1
= 3 ( 0 .2 ) + 2 ( 0 .3 ) + 3 ( 0 .1 ) + 1( 0 .4 )
=1.9 bits/pixel
Example
Relative frequency of characters in a message text. A larger number indicates higher frequency of use. These are the characters for which we will want the shortest code.
Group Lowest
log2(1/P)
Information content To achieve optimum compression, each value should be represented with exactly log2(1/P) bits
Huffman Codes
Huffman Code
High probability data items are assigned short codes No code contains any other code as a prefix Reading from the left-hand bit, each code is uniquely decodable
Huffman Codes
To achieve optimum compression, a separate code table is required for each of the two sequences Carphone and Claire The loss of potential compression efficiency due to the requirement for integral length codes is very obvious for vector 0 in the Claire sequence
The image and video coding standards define sets of codewords based on the probability distributions of a large range of video material Generic table Compression efficiency is lower than that obtained by pre-analysing the data to be encoded Advantage: not requiring to calculate and transmit individual probability tables
Unconstrained Length
Constrained Length
Shortened Huffman code
7 7 7 7 7 7 7 7