
Data Compression Seminar John Kieffer

2 Huffman Codes

Suppose one wants to design a memoryless prefix code for compressing a data vector x over a k-ary alphabet
A. Let F = (F1 , F2 , . . . , Fk ) denote the vector in which Fi is the number of times that the i-th most frequent
symbol in A appears in x. Suppose we were to encode x using a memoryless prefix code for which the
underlying Kraft vector is L = (L1 , L2 , . . . , Lk ). Then the length of the encoder output bitstring B(x) is

F • L = F1 L1 + F2 L2 + . . . + Fk Lk
The best prefix code would be one for which the length of B(x) is minimized. To design such a code, one finds a Kraft
vector L for which L • F is minimized. Exhaustive search is not a very good way to find such a Kraft vector
for large k. (For example, for k = 11, you’d have to search through 89 possibilities for L.) The Huffman
algorithm gives us an efficient way to find an optimal L for F , that is, a Kraft vector for which the dot
product L • F is minimized. (The example below shows that there can be more than one optimal L for a
given F .) A code designed using an optimal L for F shall be called a Huffman code.
EXAMPLE. We consider the data vector x = (a, a, a, a, b, b, c, c, d, e) over the alphabet A = {a, b, c, d, e}.
The vector of frequencies F for x is F = (4, 2, 2, 1, 1). Only three Kraft vectors need to be considered in
defining a memoryless prefix code for x (they are the proper Kraft vectors of length 5; see Section 2.4), namely,

L = (1, 2, 3, 4, 4)
L = (1, 3, 3, 3, 3)
L = (2, 2, 2, 3, 3)
The reader can check that each of these three choices gives the dot product L • F = 22. These Kraft vectors
give rise to three distinct memoryless prefix codes for x, each of which yields an encoder output codeword
length of B(x) = 22 codebits. Since this is the smallest possible length for B(x), each of these three codes
is a Huffman code for x.
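
The dot products in this example are easy to check numerically. The following is a minimal MATLAB sketch, not part of the seminar's m-files; the variable names F, Ls, lengths and kraft_sums are chosen here for illustration only.

```matlab
% Frequency vector from the example (most frequent symbol first)
F = [4 2 2 1 1];

% The three candidate Kraft vectors, one per row
Ls = [1 2 3 4 4;
      1 3 3 3 3;
      2 2 2 3 3];

% Each row of Ls dotted with F gives the length of B(x) in codebits
lengths = Ls * F';

% Left side of Kraft's inequality for each row
kraft_sums = sum(2.^(-Ls), 2);

disp(lengths');      % every entry is 22
disp(kraft_sums');   % every entry is 1
```

All three rows give 22 codebits, in agreement with the text.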

2.1 Huffman algorithm


In this section, we give a simple description of the Huffman algorithm and illustrate it with some examples.
If M = (M1 , M2 , . . . , Mr ) and N = (N1 , N2 , . . . , Ns ) are vectors with integer components, we define M ♦ N to
be the vector

M ♦ N = (M1 + N1 , M2 + 1, M3 + 1, . . . , Mr + 1, N2 + 1, N3 + 1, . . . , Ns + 1)
For example, (3, 1, 2, 2) ♦ (5, 1, 2, 3, 4, 4) = (8, 2, 3, 3, 2, 3, 4, 5, 5).
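
As a quick check of the definition, the operation can be written as a one-line MATLAB anonymous function. The name diamond is a hypothetical helper introduced here; it is not one of the seminar's m-files.

```matlab
% diamond(M, N): add the initial components, then append the remaining
% components of M and of N, each increased by 1
diamond = @(M, N) [M(1) + N(1), M(2:end) + 1, N(2:end) + 1];

U = diamond([3 1 2 2], [5 1 2 3 4 4]);
disp(U);   % 8 2 3 3 2 3 4 5 5
```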
Suppose one has a list of vectors with integer components

U^1, U^2, . . . , U^j    (1)
where j ≥ 2. We perform what we call a pruning operation on the list (1) to obtain a list of vectors

V^1, V^2, . . . , V^(j−1)    (2)
First, choose any two vectors U^i1, U^i2 (i1 ≠ i2) from list (1) whose initial components are the two smallest
among all the initial components of vectors in list (1). Form the vector U = U^i1 ♦ U^i2.
Then strike out U^i1 and U^i2 from the list (1), and append the new vector U to the end of the list. The resulting
list is the list (2). For example, if list (1) is the list ...
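
A single pruning operation can be sketched in MATLAB as follows. This is only an illustration under stated assumptions: the starting list is hypothetical (it is not the example from the text), the list is stored as a cell array of row vectors, and diamond is a helper name introduced here for the ♦ operation.

```matlab
% Hypothetical starting list (1): five vectors with integer components
list = {[4 0], [2 0], [2 0], [1 0], [1 0]};

% The diamond operation M <> N defined above
diamond = @(M, N) [M(1) + N(1), M(2:end) + 1, N(2:end) + 1];

% Initial components of all vectors in the list
firsts = cellfun(@(v) v(1), list);

% Positions of the two smallest initial components (ties broken by position)
[~, order] = sort(firsts);
i1 = order(1);
i2 = order(2);

% Merge the two chosen vectors, strike them out, append the merged vector
U = diamond(list{i1}, list{i2});
list([i1 i2]) = [];
list{end + 1} = U;

% list now holds j-1 = 4 vectors; the last one is the merged vector [2 1 1]
disp(list{end});
```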

codeword assignments
symbol   codeword
1        1
2        01
3        000
0        001

(Notice that we have assigned the shortest codeword to the most frequent symbol in x and the longest
codeword to the least frequent symbol; if we do not do this, the codeword B(x) assigned to x below becomes
longer.) Replacing each sample in x with its assigned codeword, we get the sequence of codewords

000, 000, 01, 1, 000, 01, 01, 1, 01, 01, 1, 001, 1, 1, 1, 001 (1)

Concatenating these together, we see that the encoder output B(x) in response to the encoder input x is:

B(x) = (0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1)

The fact that S is a prefix set allows the decoder to obtain the decomposition (1) from B(x), whence the
decoding of B(x) into x can be completed using the above table.
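
The decoding step can be sketched in MATLAB as follows. This is a minimal illustration rather than one of the seminar's m-files: bitstrings are held as character strings, and the names symbols, codewords, B, buf and x are chosen here for the example.

```matlab
% Codeword table from the text: symbols and their assigned codewords
symbols   = [1 2 3 0];
codewords = {'1', '01', '000', '001'};

% Encoder output B(x), written as a character string
B = '0000000110000101101011001111001';

x = [];        % decoded samples
buf = '';      % bits read since the last complete codeword
for bit = B
    buf = [buf bit];
    hit = find(strcmp(buf, codewords));
    if ~isempty(hit)
        % No codeword is a prefix of another, so the first match is final
        x(end + 1) = symbols(hit);
        buf = '';
    end
end
disp(x);   % 3 3 2 1 3 2 2 1 2 2 1 0 1 1 1 0
```

The decoder never has to look ahead: as soon as the buffered bits equal a codeword, that codeword can be emitted.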

2.2 Structure of Prefix Sets


The trivial prefix sets are the three sets {0}, {1}, {0, 1}. All other prefix sets are termed nontrivial. Every
nontrivial prefix set can be reduced to a simpler prefix set. To see how this is done, we introduce two
notations: if w is a bitstring, then w̄ is the bitstring obtained by complementing the last entry in w, and
w̃ is the bitstring obtained by deleting the last entry in w. (Thus, if w = 00101, then w̄ = 00100, and
w̃ = 0010.) The nontrivial prefix sets can be classified into two types. We shall say that a nontrivial prefix
set {w1 , w2 , . . . , wk } is of Type I if k ≥ 2 and wk−1 = w̄k . All the other nontrivial prefix sets we shall call
Type II prefix sets. The reader can easily see that the following is true:

• If {w1 , w2 , . . . , wk } is of Type I, then {w1 , w2 , . . . , wk−2 , w̃k } is a prefix set.


• If {w1 , w2 , . . . , wk } is of Type II, then {w1 , w2 , . . . , wk−1 , w̃k } is a prefix set.

(To see this, first argue that if {w1 , w2 , . . . , wk−1 , w̃k } fails to be a prefix set, then the prefix set {w1 , w2 , . . . , wk }
is of Type I.) These facts motivate one to define the following two operations on prefix sets:

• A Type I operation on a prefix set S generates a new prefix set in which a longest or second longest
bitstring w in S is replaced by the two bitstrings w0 and w1.
• A Type II operation on a prefix set S generates a new prefix set in which a longest or second longest
bitstring w in S is replaced by either the bitstring w0 or the bitstring w1.

Thus, if S1 is a Type I prefix set, there exists a simpler prefix set S2 such that S1 is obtained from S2 via a
Type I operation; if S1 is a Type II prefix set, there exists a prefix set S2 such that S1 is obtained from S2
via a Type II operation.
EXAMPLE. The prefix set S1 = {0, 11, 101} is of Type II. It can be reduced to the prefix set S2 =
{0, 10, 11}. S1 is obtained from S2 via a Type II operation (replacing 10 with 101). The prefix set S1 =
{0, 10, 110, 111} is of Type I. It can be reduced to the prefix set S2 = {0, 10, 11}. S1 is obtained from S2 via
a Type I operation (replacing 11 with 110 and 111).
If S and S′ are prefix sets, let us write S → S′ to denote that S′ is obtained from S via either a Type I
or Type II operation. Then, in view of the preceding, the following must be true:

• If S is a nontrivial prefix set, then there must exist prefix sets S1 , S2 , . . . , Sj such that S1 is trivial and

S1 → S2 → S3 → . . . → Sj = S

EXAMPLE. Let S be the prefix set

S = {0, 10, 1100, 1101, 1111, 11100}


Through successive reductions, we obtain the prefix sets

S1 = {0, 10, 1100, 1101, 1110, 1111}

S2 = {0, 10, 111, 1100, 1101}

S3 = {0, 10, 110, 111}

S4 = {0, 10, 11}

S5 = {0, 1}
Notice that S5 is trivial and S5 → S4 → S3 → S2 → S1 → S.
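
The successive reductions in this example can be reproduced with a short MATLAB loop. The sketch below makes some assumptions of its own: bitstrings are stored as character strings in a cell array, the set is kept ordered by length (ties broken lexicographically) so that the last entry plays the role of wk, and the names key, sibling, parent and steps are introduced here for illustration.

```matlab
% Ordering key: shorter strings first, ties broken lexicographically
key = @(S) cellfun(@(w) 2^numel(w) - 1 + bin2dec(w), S);

S = {'0', '10', '1100', '1101', '1111', '11100'};
[~, idx] = sort(key(S));
S = S(idx);

steps = 0;
while any(cellfun(@numel, S) > 1)        % stop once the set is trivial
    wk = S{end};                          % a longest bitstring
    sibling = wk;                         % wk with its last bit complemented
    sibling(end) = char('0' + '1' - wk(end));
    parent = wk(1:end-1);                 % wk with its last bit deleted

    if any(strcmp(S, sibling))
        % Type I set: wk and its sibling collapse into their common parent
        keep = ~strcmp(S, wk) & ~strcmp(S, sibling);
        S = [S(keep), {parent}];
    else
        % Type II set: wk is shortened by one bit
        S{end} = parent;
    end

    [~, idx] = sort(key(S));
    S = S(idx);
    steps = steps + 1;
    disp(strjoin(S, ', '));               % prints S1, S2, ..., one per line
end
```

For the set S above, the loop runs five times and ends at {0, 1}.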

2.3 Kraft’s Inequality


If S = {w1 , w2 , . . . , wk } is a prefix set, then we define v(S) to be the vector

v(S) = (L1 , L2 , . . . , Lk )
where Li is the length of wi . Let us call a vector (L1 , L2 , . . . , Lk ) having nondecreasing positive integer
components a Kraft vector if the inequality

$2^{-L_1} + 2^{-L_2} + \cdots + 2^{-L_k} \le 1$


holds. (This inequality is called Kraft’s inequality.) In view of our preceding section, one can easily deduce
the following property:

• If S is any prefix set, then v(S) is a Kraft vector

(In other words, the lengths of the bitstrings in a prefix set satisfy Kraft’s inequality.) Here is how you can
see that this statement is true. First, it is true for the trivial prefix sets. Secondly, suppose the statement
holds for a prefix set S, and that v(S) = (L1 , L2 , . . . , Lk ). If you perform a Type I or Type II operation on
S to get the prefix set S′, then in forming v(S′) you replace some component Li of v(S) with Li + 1 (Type
II operation), or else you replace Li with two components Li + 1, Li + 1 (Type I operation). Since

$2^{-(L_i+1)} < 2^{-(L_i+1)} + 2^{-(L_i+1)} = 2^{-L_i},$


Kraft’s inequality must hold for v(S′) if it holds for v(S). As any prefix set can be obtained by doing finitely
many operations of Type I or Type II starting from a trivial prefix set, Kraft’s inequality must hold for any
prefix set.
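
The property is easy to test numerically on the prefix sets that appeared in Section 2.2. The following MATLAB sketch is an illustration only; the names sets and ksums are chosen here.

```matlab
% Prefix sets taken from the examples of Section 2.2
sets = { {'0','1'}, ...
         {'0','10','11'}, ...
         {'0','10','110','111'}, ...
         {'0','11','101'}, ...
         {'0','10','1100','1101','1111','11100'} };

ksums = zeros(1, numel(sets));
for s = 1:numel(sets)
    v = sort(cellfun(@numel, sets{s}));   % v(S): codeword lengths, nondecreasing
    ksums(s) = sum(2.^(-v));              % left side of Kraft's inequality
    fprintf('%s -> %g\n', mat2str(v), ksums(s));
end
```

None of the sums exceeds 1. The first three sets give exactly 1, and the last two give 0.875 and 0.96875.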
What is perhaps more remarkable (and this has important implications for code design), the converse of
this property is true, namely, given any Kraft vector (L1 , L2 , . . . , Lk ), there must exist a prefix set S such
that v(S) = (L1 , L2 , . . . , Lk ). To show this, we partition the nontrivial Kraft vectors into two types, the
proper Kraft vectors and the improper Kraft vectors:

• A proper Kraft vector is a Kraft vector (L1 , L2 , . . . , Lk ) for which

$2^{-L_1} + 2^{-L_2} + \cdots + 2^{-L_k} = 1$

• An improper Kraft vector is a Kraft vector (L1 , L2 , . . . , Lk ) for which

$2^{-L_1} + 2^{-L_2} + \cdots + 2^{-L_k} < 1$    (2)

The following can easily be shown:

• If (L1 , L2 , . . . , Lk ) is a proper Kraft vector, then k ≥ 2 and Lk = Lk−1 .


• If (L1 , L2 , . . . , Lk ) is a proper Kraft vector and Lk > 1, then (L1 , L2 , . . . , Lk−2 , Lk − 1) is a proper
Kraft vector.
• If (L1 , L2 , . . . , Lk ) is an improper Kraft vector and Lk > 1, then (L1 , L2 , . . . , Lk−1 , Lk − 1) is a Kraft
vector.

To see these things, observe that

$2^{-L_1} + 2^{-L_2} + \cdots + 2^{-L_{k-1}} = 1 - 2^{-L_k}$


holds for a proper Kraft vector (L1 , L2 , . . . , Lk ). Multiplying both sides by $2^{L_k}$, you get

$2^{L_k - L_{k-1}} I = J$
where I is an integer and J = $2^{L_k} - 1$ is an odd integer. Since the left side of the preceding equation must be odd,
we must have $2^{L_k} = 2^{L_{k-1}}$, rather than $2^{L_k} > 2^{L_{k-1}}$. The second statement above follows from the first
statement. To see that the third statement is true, observe that if (2) holds, then

$2^{-L_1} + 2^{-L_2} + \cdots + 2^{-L_k} = I / 2^{L_k}$


where I is an integer less than $2^{L_k}$. Therefore

$2^{-L_1} + 2^{-L_2} + \cdots + 2^{-L_{k-1}} + 2^{-(L_k - 1)} = (I + 1) / 2^{L_k} \le 1$

EXAMPLE. (1, 2, 4, 4, 4, 5) is an improper Kraft vector. So, (1, 2, 4, 4, 4, 4) is a Kraft vector. Since
it is proper, (1, 2, 3, 4, 4) is also a proper Kraft vector. (2, 2, 2, 2) is a proper Kraft vector. Consequently, so is
(1, 2, 2), and then so is (1, 1).
From the preceding example, it can be seen that any proper Kraft vector eventually reduces to the vector
(1, 1) through a chain of proper Kraft vectors. Also, any improper Kraft vector reduces to either the vector
(1) or the vector (1, 1) through a chain of Kraft vectors.
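
The reduction rules can be applied mechanically. The MATLAB sketch below is a minimal illustration, not one of the seminar's m-files: it starts from the improper Kraft vector (1, 2, 4, 4, 4, 5), applies the proper or improper reduction as appropriate at each step, and records the chain. The names L and chain are chosen here.

```matlab
L = [1 2 4 4 4 5];          % starting Kraft vector (improper)
chain = {L};

while L(end) > 1
    if sum(2.^(-L)) == 1
        % proper: replace the last two components L_k, L_k by L_k - 1
        L = sort([L(1:end-2), L(end) - 1]);
    else
        % improper: replace the last component L_k by L_k - 1
        L = sort([L(1:end-1), L(end) - 1]);
    end
    chain{end + 1} = L;
end

for c = 1:numel(chain)
    disp(mat2str(chain{c}));
end
```

The chain produced is (1, 2, 4, 4, 4, 5), (1, 2, 4, 4, 4, 4), (1, 2, 3, 4, 4), (1, 2, 3, 3), (1, 2, 2), (1, 1). The sums of negative powers of 2 are computed exactly in floating point for lengths this small, so the test for equality with 1 is safe here.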
Let v be a fixed Kraft vector. Combining this section with the previous one, we obtain a method for
constructing a prefix set S for which v(S) = v: Starting with v, do a chain of reductions until you obtain (1)
or (1, 1). Then, starting with the prefix set {0} or {0, 1}, you can go back up the chain doing appropriate
Type I or Type II operations until you wind up with the desired prefix set S.
EXAMPLE. Let’s construct a prefix set S for which v(S) = (1, 2, 4, 4, 4, 5, 6, 6). Doing a chain of reductions,
we get the Kraft vectors

(1,2,4,4,4,5,6,6) (proper)
(1,2,4,4,4,5,5) (proper)
(1,2,4,4,4,4) (proper)
(1,2,3,4,4) (proper)
(1,2,3,3) (proper)
(1,2,2) (proper)
(1,1)

Start with the prefix set S0 = {0, 1} and go back up the chain. To realize each Kraft vector going back up the chain,
we apply an appropriate Type I operation to the prefix set obtained at the preceding step:
S0 → S1 = {0, 10, 11} and v(S1 ) = (1, 2, 2)
S1 → S2 = {0, 10, 110, 111} and v(S2 ) = (1, 2, 3, 3)
S2 → S3 = {0, 10, 110, 1110, 1111} and v(S3 ) = (1, 2, 3, 4, 4)
S3 → S4 = {0, 10, 1100, 1101, 1110, 1111} and v(S4 ) = (1, 2, 4, 4, 4, 4)
S4 → S5 = {0, 10, 1100, 1101, 1110, 11110, 11111} and v(S5 ) = (1, 2, 4, 4, 4, 5, 5)
S5 → S6 = {0, 10, 1100, 1101, 1110, 11110, 111110, 111111} and v(S6 ) = (1, 2, 4, 4, 4, 5, 6, 6)
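
For comparison, a prefix set with a given Kraft vector can also be built directly, without going through the chain. The MATLAB sketch below uses a different technique from the one described above, the standard "canonical" assignment, in which codewords are assigned in order of nondecreasing length by counting upward in binary. It is offered only as an illustration; the names L, S and code are chosen here.

```matlab
L = [1 2 4 4 4 5 6 6];            % Kraft vector, nondecreasing
S = cell(1, numel(L));

code = 0;                          % integer value of the next codeword
for i = 1:numel(L)
    S{i} = dec2bin(code, L(i));    % write code using L(i) bits
    if i < numel(L)
        % move to the next unused string, then pad out to the next length
        code = (code + 1) * 2^(L(i+1) - L(i));
    end
end

disp(strjoin(S, ', '));
```

For this particular vector the canonical assignment produces the same prefix set S6 as the chain construction. The same loop also works for an improper Kraft vector.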

2.4 Compact Codes


A prefix code is said to be compact if its Kraft vector is proper. For example, a prefix code employing the
set of codewords {0, 10, 110, 111} is compact because

$2^{-1} + 2^{-2} + 2^{-3} + 2^{-3} = 1$


Why are the compact codes important? Using the reduction for improper Kraft vectors of Section 2.3 (the
Kraft-vector counterpart of undoing a Type II operation of Section 2.2) finitely many times,
one can convert a code having an improper Kraft vector into a new code having a proper Kraft vector, i.e.,
a compact code. If the original Kraft vector is (L1 , L2 , . . . , Lk ) and the new Kraft vector is (L′1 , L′2 , . . . , L′k ),
then one will have

L′1 ≤ L1 , L′2 ≤ L2 , . . . , L′k ≤ Lk    (3)


and at least one of these inequalities will be a strict inequality. Suppose that the original code and the new
code are both used to encode a data vector over an alphabet of size k. Then, in view of (3), the new code
will yield no more codebits at the encoder output than the original code, and strictly fewer whenever a
symbol whose codeword was shortened occurs in the data vector.
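
The saving can be seen on a small numerical example. In the MATLAB sketch below, the frequency vector F is hypothetical and the improper Kraft vector L is chosen for illustration; the loop shortens a longest codeword length by one until the Kraft sum reaches 1.

```matlab
F = [9 7 6 5 4 2 1];        % hypothetical symbol counts, most frequent first
L = [2 3 3 3 4 5 5];        % improper Kraft vector (Kraft sum 0.75)

Lnew = L;
while sum(2.^(-Lnew)) < 1
    [~, pos] = max(Lnew);    % position of a largest component (first, on ties)
    Lnew(pos) = Lnew(pos) - 1;
end

bits_old = F * L';           % codebits with the original code
bits_new = F * Lnew';        % codebits with the compact code

disp(Lnew);                  % 2 3 3 3 3 3 3
fprintf('%d codebits before, %d after\n', bits_old, bits_new);
```

Here the encoder output shrinks from 103 codebits to 93.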
The argument in the preceding paragraph allows us to restrict ourselves to compact codes in prefix code
design. In view of the importance of the compact codes in code design, let us classify the compact codes. We
shall not distinguish between two compact codes that have the same Kraft vector. The following example
illustrates why we don’t make this distinction. The prefix code based on the codeword set {0, 10, 110, 111}
and the prefix code based on the codeword set {1, 01, 000, 001} both have the Kraft vector (1, 2, 3, 3). The
number of codebits generated in encoding a data vector x via the first code coincides with the number of
codebits generated in encoding x via the second code. (The number of codebits generated at the encoder
output in prefix coding depends on the code only through its Kraft vector.) Hence, these two codes are
operationally equivalent.
In view of the discussion in the preceding paragraph, we will speak of the (1, 2, 3, 3) compact code. More
generally, given any proper Kraft vector (L1 , L2 , . . . , Lk ), we speak of the (L1 , L2 , . . . , Lk ) compact code. A
k-ary compact code is a compact code in which the alphabet is of size k. There is only one 2-ary compact
code, namely, the (1, 1) code. There is only one 3-ary compact code, namely the (1, 2, 2) code. There are
only two 4-ary compact codes, namely the (1, 2, 3, 3) code and the (2, 2, 2, 2) code. How can one catalog all
the k-ary compact codes for a given k? In view of the “collapsing” operations on Kraft vectors described in
Section 2.3, one can “expand” a (k − 1)-ary (L1 , L2 , . . . , Lk−1 ) compact code in two ways to obtain a k-ary
compact code:

• Replace one component Li of largest value by Li + 1, Li + 1


• Replace one component Li of second largest value (if there is one) by Li + 1, Li + 1

All k-ary compact codes arise as the result of the above expansion operations performed on (k − 1)-ary
compact codes; a code that arises more than once is listed only once. Let us obtain the 5-ary compact codes this way:

(1, 2, 3, 3) → (1, 2, 3, 4, 4), (1, 3, 3, 3, 3)


(2, 2, 2, 2) → (2, 2, 2, 3, 3)

There are thus three 5-ary compact codes, the (1, 2, 3, 4, 4), (1, 3, 3, 3, 3), and (2, 2, 2, 3, 3) codes.
Let us now classify the 6-ary compact codes. First, we generate them from the 5-ary codes:

(1, 2, 3, 4, 4) → (1, 2, 3, 4, 5, 5), (1, 2, 4, 4, 4, 4)


(1, 3, 3, 3, 3) → (1, 3, 3, 3, 4, 4), (2, 2, 3, 3, 3, 3)
(2, 2, 2, 3, 3) → (2, 2, 2, 3, 4, 4)
There are thus five 6-ary compact codes, the (1, 2, 3, 4, 5, 5), (1, 2, 4, 4, 4, 4), (1, 3, 3, 3, 4, 4), (2, 2, 3, 3, 3, 3),
and (2, 2, 2, 3, 4, 4) codes.
Let us now generate and classify the 7-ary compact codes:

(1, 2, 3, 4, 5, 5) → (1, 2, 3, 4, 5, 6, 6), (1, 2, 3, 5, 5, 5, 5)


(1, 2, 4, 4, 4, 4) → (1, 2, 4, 4, 4, 5, 5), (1, 3, 3, 4, 4, 4, 4)
(1, 3, 3, 3, 4, 4) → (1, 3, 3, 3, 4, 5, 5)
(2, 2, 3, 3, 3, 3) → (2, 2, 3, 3, 3, 4, 4), (2, 3, 3, 3, 3, 3, 3)
(2, 2, 2, 3, 4, 4) → (2, 2, 2, 3, 4, 5, 5), (2, 2, 2, 4, 4, 4, 4)
There are thus nine 7-ary compact codes, the (1, 2, 3, 4, 5, 6, 6), (1, 2, 3, 5, 5, 5, 5), (1, 2, 4, 4, 4, 5, 5), (1, 3, 3, 4, 4, 4, 4),
(1, 3, 3, 3, 4, 5, 5), (2, 2, 3, 3, 3, 4, 4), (2, 3, 3, 3, 3, 3, 3), (2, 2, 2, 3, 4, 5, 5), and (2, 2, 2, 4, 4, 4, 4) codes.
One can continue with this process indefinitely. The reader can verify the correctness of the following
table which gives the number of k-ary compact codes for k = 2 through k = 11:

compact code tabulation


k # of k-ary codes
2 1
3 1
4 2
5 3
6 5
7 9
8 16
9 28
10 50
11 89

For large k, it has been determined that the number of k-ary compact codes grows like (1.7941472)^k, up to a constant factor.
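
The tabulation can be checked by carrying out the expansion process in MATLAB. The sketch below is an illustration only: it stores the k-ary compact codes as the rows of a matrix, expands one component of largest value and one of second largest distinct value in each row, and discards duplicates. The names codes, counts, next, vals and cand are chosen here.

```matlab
codes = [1 1];                 % the single 2-ary compact code
counts = 1;                    % counts(k-1) = number of k-ary compact codes

for k = 3:11
    next = zeros(0, k);
    for r = 1:size(codes, 1)
        L = codes(r, :);
        vals = unique(L);                  % distinct lengths, ascending
        n = numel(vals);
        cand = vals(max(1, n-1):n);        % second largest and largest values
        for c = cand
            pos = find(L == c, 1, 'last'); % one component with that value
            next(end + 1, :) = sort([L(1:pos-1), c + 1, c + 1, L(pos+1:end)]);
        end
    end
    codes = unique(next, 'rows');          % list each code only once
    counts(end + 1) = size(codes, 1);
end

disp(counts);
```

The printed counts should agree with the table, ending with 89 for k = 11.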

2.5 MATLAB functions


We introduce four new MATLAB functions:

• concat

• prefix_encode
• compactcodeP
• compactcodeI

The function concat is a utility (a function used to build other functions), the function prefix_encode is
an encoding function, and the functions compactcodeP and compactcodeI are code design functions.

2.5.1 concat
The m-file concat.m is given below:

%This m-file is called concat.m
%I is a vector of indices for two or more bitstrings
%Command y=concat(I); creates in memory the bitstring
%y consisting of the left to right concatenation of the
%bitstrings from I, and prints this string to the screen
%
function y = concat(I)
N=length(I);
y=[];
for i=1:N
    A=index_to_bitstring(I(i));   %bits of the i-th bitstring
    y=[y A];                      %append them on the right
end
j=bitstring_to_index(y);          %index of the concatenated bitstring
print_bitstrings(j);              %display it on the screen

The function “concat” created by this m-file operates as follows. Let I be the vector of indices of two or
more bitstrings. Then y = concat(I) is the bitstring obtained by concatenating together from left to right
the bitstrings whose indices are in I. For example, the vector of indices of the bitstrings 1, 01, 000, 001 is [2,
4, 7, 8]. Executing the MATLAB line

y = concat([2 4 7 8]);

you will see the bitstring 101000001 displayed on the screen. (This is the bitstring obtained by
concatenating together the bitstrings 1, 01, 000, 001.) The vector [1,0,1,0,0,0,0,0,1] will also be stored in
MATLAB memory.
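
The helper functions index_to_bitstring and bitstring_to_index are not listed in this section. The MATLAB sketch below gives hypothetical stand-ins for them, under the assumption (inferred from the example above, where 1, 01, 000, 001 have indices 2, 4, 7, 8) that bitstrings are numbered in order of length and then lexicographically: 0, 1, 00, 01, 10, 11, 000, and so on. The names idx2bits and bits2idx are introduced here and are not the seminar's own functions.

```matlab
% Assumed numbering: bitstrings of length n occupy indices 2^n - 1 .. 2^(n+1) - 2
idx2bits = @(i) dec2bin(i + 1 - 2^floor(log2(i + 1)), floor(log2(i + 1))) - '0';
bits2idx = @(b) 2^numel(b) - 1 + sum(b .* 2.^(numel(b)-1:-1:0));

% Concatenate the bitstrings with indices 2, 4, 7, 8, as concat does
I = [2 4 7 8];
y = [];
for i = 1:numel(I)
    y = [y idx2bits(I(i))];
end

disp(char(y + '0'));     % 101000001 (nine bits)
disp(bits2idx(y));       % index of the concatenated bitstring
```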

2.5.2 prefix_encode

The m-file prefix_encode.m is given by:

%This m-file is called prefix_encode.m
%It creates an encoding function
%x denotes a data vector, with alphabet {0,1,...,k-1}
%W denotes the vector of indices of a prefix set
%Command y=prefix_encode(W,x); yields in memory
%the encoder output bitstring y in response to x for the
%memoryless prefix code induced by W, and prints y
%to the screen
%
function y = prefix_encode(W,x)
k=length(W);
f=zeros(1,k);
for i=1:k
    f(i)=sum(x==i-1);       %number of times symbol i-1 appears in x
end
[u,r]=sort(f);              %r lists the symbols from least to most frequent
z=zeros(1,k);
for i2=1:k
    z(r(i2))=W(k-i2+1);     %least frequent symbol gets the last entry of W
end
n=length(x);
q=zeros(1,n);
for i3=1:n
    q(i3)=z(x(i3)+1);       %index of the codeword for the i3-th sample
end
y=concat(q);

The MATLAB function “prefix_encode” created by this m-file accomplishes memoryless prefix encoding.
Let x denote the data vector to be encoded, and suppose that the alphabet of this data vector is the set
{0, 1, . . . , k − 1} for some positive integer k. Let W denote the vector of indices of the prefix set of bitstrings
{w1 , w2 , . . . , wk } for the memoryless prefix code that is to be employed. Executing the MATLAB line

y=prefix_encode(W,x);

creates the encoder output bitstring y in MATLAB memory and this bitstring is also printed out on the
screen. For example, let x = [3, 3, 2, 1, 3, 2, 2, 1, 2, 2, 1, 0, 1, 1, 1, 0] be the data vector, and let the prefix set
for the code be {1, 01, 000, 001}. Converting this set to integer form, we get W = [2, 4, 7, 8]. Typing in the
MATLAB command “prefix_encode(W,x);”, you will see the sequence

0000000110000101101011001111001
printed out on the screen. This is the encoder output.
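
The same encoding can be reproduced without the index helpers by working with character strings. The MATLAB sketch below is a self-contained illustration, not a replacement for prefix_encode: the names S, f, r, z and y are chosen here, and ties between equally frequent symbols (there are none in this example) may be broken differently than in the m-file.

```matlab
x = [3 3 2 1 3 2 2 1 2 2 1 0 1 1 1 0];   % data vector over {0,1,2,3}
S = {'1', '01', '000', '001'};            % prefix set, shortest codeword first
k = numel(S);

% Frequency of each symbol 0..k-1
f = arrayfun(@(s) sum(x == s), 0:k-1);

% Rank the symbols from most to least frequent and hand out the codewords
[~, r] = sort(f, 'descend');
z = cell(1, k);
z(r) = S;                                 % z{s+1} is the codeword of symbol s

% Replace each sample by its codeword and concatenate
y = [z{x + 1}];
disp(y);   % 0000000110000101101011001111001
```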

2.5.3 compactcodeP
This is the m-file “compactcodeP.m”:

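%This m-file is called compactcodeP.m
%Given a proper Kraft vector L, command y=compactcodeP(L)
%designs a vector y of indices of bitstrings in
%a prefix set having Kraft vector L, and prints this
%prefix set on the screen
%
function y=compactcodeP(x)
%Reduction pass: collapse the last two components until (1,1) remains.
%z records, for each step, whether the merged length ties the largest
%remaining length (1) or lies below it (0).
z=[];
N=length(x);
while N>2
    r=[x(1:N-2) x(N)-1];
    if r(N-2)==r(N-1)
        z=[1 z];
    else
        z=[0 z];
    end
    x=sort(r);
    N=N-1;
end
%Expansion pass: start from {0,1} and undo the reductions with Type I
%operations. u holds the codeword lengths, v the bitstring indices.
n=length(z);
u=[1 1];
v=[1 2];
for i=1:n
    if z(i)==1
        %split the last (longest) bitstring
        u=[u(1:i) u(i+1)+1 u(i+1)+1];
        v=[v(1:i) 2*v(i+1)+1 2*v(i+1)+2];
    else
        %split the last bitstring that is shorter than the longest one
        t=find(u<u(i+1));
        j=length(t);
        u=[u(1:j-1) u(j)+1 u(j)+1 u(j+1:i+1)];
        v=[v(1:j-1) 2*v(j)+1 2*v(j)+2 v(j+1:i+1)];
    end
end
y=v;
print_bitstrings(y);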

We describe what the MATLAB function “compactcodeP” does. Let L be a proper Kraft vector (L1 , L2 , . . . , Lk ).
Executing the MATLAB command compactcodeP(L) prints out on the screen a prefix set {w1 , w2 , . . . , wk }
of a compact code whose Kraft vector equals L. For example, executing the line

compactcodeP([1 2 3 4 4]);

results in the appearance on the screen of the following prefix set for a (1, 2, 3, 4, 4) compact code:

0
10
110
1110
1111

2.5.4 compactcodeI
The following is the m-file “compactcodeI.m”:

%This m-file is called compactcodeI.m
%It creates a code design function
%L denotes an improper Kraft vector
%command y =compactcodeI(L); stores vector y
%of indices of a prefix set of a compact code
%having Kraft vector <= L, and prints this prefix set
%to the screen
%
function y = compactcodeI(L)
k=length(L);
if k~=1
    %shorten a longest codeword by one bit until the Kraft sum reaches 1
    while sum(2.^(-L)) < 1
        [largest,position]=max(L);
        L(position)=L(position)-1;
    end
    y=compactcodeP(L);
else
    %a single codeword: use the bitstring 0, whose index is 1
    y=1;
    print_bitstrings(y);
end

We describe what the MATLAB function “compactcodeI” does. Let L be an improper Kraft vector
(L1 , L2 , . . . , Lk ). Executing the MATLAB command compactcodeI(L) prints out on the screen a prefix
set {w1 , w2 , . . . , wk } of a compact code whose Kraft vector (L′1 , L′2 , . . . , L′k ) satisfies L′i ≤ Li for each i. For
example, executing the line

compactcodeI([2 3 3 3 4 5 5]);

you will see the following prefix set appear on the screen:

00
010
011
100
101
110
111

The Kraft vector for this prefix set is the proper Kraft vector [2, 3, 3, 3, 3, 3, 3], each of whose components is
less than or equal to the corresponding component of the improper Kraft vector [2, 3, 3, 3, 4, 5, 5]. The above
prefix set therefore yields a (2, 3, 3, 3, 3, 3, 3) compact code.
