Chapter 5 VARIABLE-LENGTH CODING

---- INFORMATION THEORY RESULTS (II)

5.1 Some Fundamental Results

5.1.1 Coding an Information Source
• Consider an information source, represented by a source alphabet S:

S = {s1, s2, ..., sm},        (5.1)

where the si's are source symbols.

• The terms source symbol, source message, and information message are used interchangeably in the literature.


• We like to consider that an information message can be a single source symbol or a combination of source symbols.
• We denote the code alphabet by A and

A = {a1, a2, ..., ar},

where the aj's are code symbols.

A message code is a sequence of code symbols that represents a given information message.

• In the simplest case, a message consists of only a source symbol. Encoding is then a procedure to assign a codeword to the source symbol. Namely,

si → Ai = (ai1, ai2, ..., aik),

where the codeword Ai is a string of k code symbols assigned to the source symbol si.

• The term message ensemble is defined as the entire set of messages.


• A code, also known as an ensemble code, is defined as a mapping of all the possible sequences of symbols of S (the message ensemble) into sequences of symbols in A.
• In binary coding, the number of code symbols r is equal to 2, since there are only two code symbols available: the binary digits "0" and "1".

Example 5.1
• Consider an English article and the ASCII code.
• In this context, the source alphabet consists of all the English letters in both lower and upper cases and all the punctuation marks.
• The code alphabet consists of the binary symbols 1 and 0.
• There are a total of 128 seven-bit binary codewords.
• From Table 5.1, we see that the codeword assigned to the capital letter A (a source symbol) is 1000001 (the codeword).


Table 5.1 Seven-bit American Standard Code for Information Interchange (ASCII)

Bits 7 6 5:     000   001   010   011   100   101   110   111
Bits 4 3 2 1
0 0 0 0         NUL   DLE   SP    0     @     P     `     p
0 0 0 1         SOH   DC1   !     1     A     Q     a     q
0 0 1 0         STX   DC2   "     2     B     R     b     r
0 0 1 1         ETX   DC3   #     3     C     S     c     s
0 1 0 0         EOT   DC4   $     4     D     T     d     t
0 1 0 1         ENQ   NAK   %     5     E     U     e     u
0 1 1 0         ACK   SYN   &     6     F     V     f     v
0 1 1 1         BEL   ETB   '     7     G     W     g     w
1 0 0 0         BS    CAN   (     8     H     X     h     x
1 0 0 1         HT    EM    )     9     I     Y     i     y
1 0 1 0         LF    SUB   *     :     J     Z     j     z
1 0 1 1         VT    ESC   +     ;     K     [     k     {
1 1 0 0         FF    FS    ,     <     L     \     l     |
1 1 0 1         CR    GS    -     =     M     ]     m     }
1 1 1 0         SO    RS    .     >     N     ^     n     ~
1 1 1 1         SI    US    /     ?     O     _     o     DEL

(A codeword is read as bits 7 6 5 4 3 2 1; for example, column 100 and row 0001 give 1000001, the codeword for the letter A.)

Control-character abbreviations used in the table:
NUL  Null, or all zeros           DC1  Device control 1
SOH  Start of heading             DC2  Device control 2
STX  Start of text                DC3  Device control 3
ETX  End of text                  DC4  Device control 4
EOT  End of transmission          NAK  Negative acknowledgment
ENQ  Enquiry                      SYN  Synchronous idle
ACK  Acknowledge                  ETB  End of transmission block
BEL  Bell, or alarm               CAN  Cancel
BS   Backspace                    EM   End of medium
HT   Horizontal tabulation        SUB  Substitution
LF   Line feed                    ESC  Escape
VT   Vertical tabulation          FS   File separator
FF   Form feed                    GS   Group separator
CR   Carriage return              RS   Record separator
SO   Shift out                    US   Unit separator
SI   Shift in                     SP   Space
DLE  Data link escape             DEL  Delete
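As a quick programmatic check of the table, the 7-bit ASCII codeword of a character can be obtained directly; the following is a minimal Python sketch (the helper name ascii_codeword is ours, not part of any standard library):

```python
# Minimal sketch: look up the 7-bit ASCII codeword of a character.
# For the capital letter A this reproduces the codeword 1000001 cited above.
def ascii_codeword(ch: str) -> str:
    """Return the 7-bit binary codeword assigned to a single ASCII character."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not a 7-bit ASCII character")
    return format(code, "07b")

print(ascii_codeword("A"))  # -> 1000001
```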

Example 5.2
• Table 5.2 lists what is known as the (5,2) code. It is a linear block code.
• In this example, the source alphabet consists of the four (2^2) source symbols listed in the left column of the table: 00, 01, 10 and 11.
• The code alphabet consists of the binary symbols 1 and 0.
• There are four codewords listed in the right column of the table.
• From the table, we see that the code assigns a 5-bit codeword to each source symbol.

Table 5.2 A (5,2) linear block code
Source Symbol   Codeword
S1 (00)         00000
S2 (01)         10100
S3 (10)         01111
S4 (11)         11011

5.1.2 Some Desired Characteristics

5.1.2.1 Block Code
• A code is said to be a block code if it maps each source symbol in S into a fixed codeword in A.
• Obviously, the codes listed in the above two examples are block codes.

5.1.2.2 Uniquely Decodable Code
• A code is uniquely decodable if it can be unambiguously decoded.
• Hence, a code has to be uniquely decodable if it is to be of use.

Example 5.3

• Table 5.3 gives a code that is not uniquely decodable, since two of its codewords are identical.

Table 5.3 A not uniquely decodable code
Source Symbol   Codeword
S1              00
S2              10
S3              00
S4              11

Nonsingular Code
• A block code is nonsingular if all the codewords are distinct.

Table 5.4 A nonsingular code
Source Symbol   Codeword
S1              1
S2              11
S3              00
S4              01

Example 5.4
• Table 5.4 gives a nonsingular code, since all four codewords are distinct.
• If a code is not nonsingular, i.e., if at least two codewords are identical, then the code is not uniquely decodable.
• Notice, however, that a nonsingular code does not guarantee unique decodability.
• The code shown in Table 5.4 is such an example: it is nonsingular, yet it is not uniquely decodable (because once the binary string "11" is received, we do not know whether the source symbols transmitted are s1 followed by s1 or simply s2).

The nth Extension of a Block Code
• The nth extension of a block code, which maps the source symbol si into the codeword Ai, is a block code that maps the sequences of source symbols si1 si2 ... sin into the sequences of codewords Ai1 Ai2 ... Ain.

A Necessary and Sufficient Condition of Block Codes' Unique Decodability
• A block code is uniquely decodable if and only if the nth extension of the code is nonsingular for every finite n.

Example 5.5

Table 5.5 The second extension of the nonsingular block code shown in Example 5.4
Source Symbol   Codeword      Source Symbol   Codeword
S1 S1           11            S3 S1           001
S1 S2           111           S3 S2           0011
S1 S3           100           S3 S3           0000
S1 S4           101           S3 S4           0001
S2 S1           111           S4 S1           011
S2 S2           1111          S4 S2           0111
S2 S3           1100          S4 S3           0100
S2 S4           1101          S4 S4           0101

• Since the codewords for S1S2 and S2S1 are identical (both 111), the second extension is not nonsingular; this confirms that the code of Table 5.4 is not uniquely decodable.
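The condition above suggests a simple computational check: build the nth extension by concatenating codewords and test whether it is nonsingular. The sketch below (function names are illustrative, not from the text) applies it with n = 2 to the code of Table 5.4; note that a check at a single finite n can only expose non-unique decodability, since the condition must hold for every finite n.

```python
from itertools import product

def nth_extension(code: dict, n: int) -> dict:
    """Map every length-n source-symbol sequence to the concatenation of its codewords."""
    return {seq: "".join(code[s] for s in seq)
            for seq in product(code.keys(), repeat=n)}

def is_nonsingular(code: dict) -> bool:
    """A block code is nonsingular if all codewords are distinct."""
    return len(set(code.values())) == len(code)

# The nonsingular code of Table 5.4.
table_5_4 = {"s1": "1", "s2": "11", "s3": "00", "s4": "01"}

print(is_nonsingular(table_5_4))                    # True
print(is_nonsingular(nth_extension(table_5_4, 2)))  # False: s1s2 and s2s1 both map to 111
```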

5.1.2.3 Instantaneous Codes

Definition of Instantaneous Codes
• A uniquely decodable code is said to be instantaneous if it is possible to decode each codeword in a code symbol sequence without knowing the succeeding codewords.

Example 5.6
• Table 5.6 lists three uniquely decodable codes. The first one is in fact a two-bit natural binary code (NBC).

Table 5.6 Three uniquely decodable codes
Source Symbol   Code 1   Code 2   Code 3
S1              00       1        1
S2              01       01       10
S3              10       001      100
S4              11       0001     1000

• The first code is instantaneous: in decoding we can immediately tell which source symbols are transmitted, since each codeword has the same length.
• In the second code, the code symbol "1" functions like a comma. Whenever we see a "1", we know it is the end of the codeword.
• The third code is different from the previous two codes in that if we see a "10" string we are not sure whether it corresponds to s2 until we see a succeeding "1". Specifically, if the next code symbol is "0", we still cannot tell whether it is s3, since the one after that may be "0" (hence s4) or "1" (hence s3). In this example, the next "1" belongs to the succeeding codeword. Therefore we see that code 3 is uniquely decodable; it is not instantaneous, however.

Definition of the jth Prefix
• Assume a codeword Ai = ai1 ai2 ... aik. Then the sequence of code symbols ai1 ai2 ... aij with 1 ≤ j ≤ k is the jth order prefix of the codeword Ai.

Example 5.7
• If a codeword is 11001, it has the following five prefixes: 1, 11, 110, 1100, 11001. The first order prefix is 1, while the fifth order prefix is 11001.

A Necessary and Sufficient Condition of Being Instantaneous Codes
• A code is instantaneous if and only if no codeword is a prefix of some other codeword.
• This condition is called the prefix condition. Hence, the instantaneous code is also called the prefix condition code, or sometimes simply the prefix code.
• In many applications, we need a block code that is nonsingular, uniquely decodable, and instantaneous.
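The prefix condition is easy to test directly. A minimal sketch with an illustrative function name, applied to Codes 2 and 3 of Table 5.6:

```python
def is_prefix_condition_code(codewords) -> bool:
    """Return True if no codeword is a prefix of another codeword (instantaneous code)."""
    words = list(codewords)
    for i, a in enumerate(words):
        for j, b in enumerate(words):
            if i != j and b.startswith(a):
                return False
    return True

# Code 2 of Table 5.6 satisfies the prefix condition; Code 3 does not.
print(is_prefix_condition_code(["1", "01", "001", "0001"]))  # True
print(is_prefix_condition_code(["1", "10", "100", "1000"]))  # False
```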

.5. the symbols generated by the source are independent of each other.3 Discrete Memoryless Sources • The simplest model of an information source.1 Coding An information Source 13 5.1. the source is memoryless or it has a zeromemory.1. ♦ A compact code is also referred to as a minimum-redundancy code. • In this model.2. or an optimum code. That is. 5.4 Compact Code ♦ A uniquely decodable code is said to be compact if its average length is the minimum among all other uniquely decodable codes based on the same source alphabet S and code alphabet A.

• Consider the information source expressed in Equation 5.1 as a discrete memoryless source.
• The occurrence probabilities of the source symbols can be denoted by p(s1), p(s2), ..., p(sm).
• The lengths of the codewords can be denoted by l1, l2, ..., lm.
• The average length of the code is then equal to

Lavg = Σ_{i=1}^{m} li p(si).

5.1.4 Extensions of a Discrete Memoryless Source
• Instead of coding each source symbol in a discrete source alphabet, it is often useful to code blocks of symbols.

5.1.4.1 Definition
• Consider the zero-memory source alphabet S = {s1, s2, ..., sm}.
• If n symbols are grouped into a block, then there are a total of m^n blocks. Each block is considered as a new source symbol.
• These m^n blocks thus form an information source alphabet, called the nth extension of the source S, which is denoted by S^n.

5.1.4.2 Entropy
• Let each block be denoted by βi, where

βi = (si1, si2, ..., sin).        (5.2)

• Because of the memoryless assumption, we have

p(βi) = Π_{j=1}^{n} p(sij).        (5.3)

• Hence,

H(S^n) = n · H(S).        (5.4)

5.1.4.3 Noiseless Source Coding Theorem
• For a discrete zero-memory information source S, there exists a variable-length code whose average length is bounded below by the entropy of the source (that is encoded) and bounded above by the entropy plus 1; that is,

H(S) ≤ Lavg < H(S) + 1.        (5.5)

• Since the nth extension of the source alphabet, S^n, is itself a discrete memoryless source, we can apply the above result to it:

H(S^n) ≤ L^n_avg < H(S^n) + 1,        (5.6)

where L^n_avg is the average codeword length of a variable-length code for S^n.

• Since L^n_avg = n · Lavg and H(S^n) = n · H(S), we have

H(S) ≤ Lavg < H(S) + 1/n.        (5.7)

• Therefore, when coding blocks of n source symbols, the noiseless source coding theorem states that for an arbitrary positive number ε there is a variable-length code which satisfies

H(S) ≤ Lavg < H(S) + ε,        (5.8)

as n is large enough.
• To make ε arbitrarily small, we have to make the block size n large enough. In most cases, however, the price paid turns out to be small even when n is only moderately large.
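To see the bound of Equation 5.8 numerically, the sketch below assigns each block of n symbols a codeword of length ⌈−log2 p(block)⌉. These Shannon-style lengths are used here only because such lengths always satisfy the Kraft inequality and are therefore achievable by some prefix code; this particular construction is not discussed in the text, and the probabilities are an arbitrary illustrative choice.

```python
from itertools import product
from math import ceil, log2

def entropy(p):
    """Entropy H(S) in bits per symbol."""
    return -sum(pi * log2(pi) for pi in p)

def per_symbol_length(p, n):
    """Average length per source symbol when blocks of n symbols are coded with
    codeword lengths ceil(-log2 p(block)); such lengths satisfy the Kraft
    inequality, so a prefix code with these lengths exists."""
    total = 0.0
    for block in product(range(len(p)), repeat=n):
        pb = 1.0
        for i in block:
            pb *= p[i]
        total += pb * ceil(-log2(pb))
    return total / n

p = [0.7, 0.2, 0.1]           # an arbitrary illustrative source
print(entropy(p))             # H(S) is about 1.157 bits/symbol
for n in (1, 2, 4):
    # per-symbol length is always below H(S) + 1/n and approaches H(S) as n grows
    print(n, per_symbol_length(p, n))
```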

5.2 Huffman Codes
• In many cases, we need a direct encoding method that is optimum and instantaneous (hence uniquely decodable) for an information source with finitely many source symbols in the source alphabet S.
• The Huffman code is the first such optimum code [huffman 1952], and it is the most frequently used at present.
• It can be used for r-ary encoding with r > 2. For notational brevity, however, we discuss only Huffman coding for the binary case here.

5.2.1 Required Rules for Optimum Instantaneous Codes
• Consider an information source

S = (s1, s2, ..., sm).        (5.9)

• Without loss of generality, assume that the occurrence probabilities of the source symbols are as follows:

p(s1) ≥ p(s2) ≥ ... ≥ p(sm−1) ≥ p(sm).        (5.10)

• Since we are seeking the optimum code for S, the lengths of the codewords assigned to the source symbols should satisfy

l1 ≤ l2 ≤ ... ≤ lm−1 ≤ lm.        (5.11)

• Based on the requirements of the optimum and instantaneous code, Huffman derived the following rules (restrictions):

1. l1 ≤ l2 ≤ ... ≤ lm−1 = lm.
2. The codewords of the two least probable source symbols should be the same except for their last bits.
3. Each possible sequence of length lm − 1 bits must be used either as a codeword or must have one of its prefixes used as a codeword.

5.2.2 Huffman Coding Algorithm
• Based on these three rules, we see that the two least probable source symbols have equal-length codewords. These two codewords are identical except for their last bits, the binary 0 and 1, respectively.
• Therefore, these two source symbols can be combined to form a single new symbol. Its occurrence probability is the sum of the two probabilities, p(sm−1) + p(sm). Its codeword is the common prefix of order lm − 1 of the two codewords assigned to sm and sm−1.
• The new set of source symbols thus generated is referred to as the first auxiliary source alphabet, which has one source symbol less than the original source alphabet.
• In the first auxiliary source alphabet, we can rearrange the source symbols according to a nonincreasing order of their occurrence probabilities.

• The same procedure can then be applied to this newly created source alphabet. The second auxiliary source alphabet will again have one source symbol less than the first auxiliary source alphabet.
• The procedure continues. At some step, the resultant source alphabet will have only two source symbols. At this point, we combine them to form a single source symbol with a probability of 1. The coding is then complete.

Example 5.9

Table 5.7 The source alphabet and Huffman codes in Example 5.9
Source symbol   Occurrence probability   Codeword assigned   Length of codeword
S1              0.3                      00                  2
S2              0.1                      101                 3
S3              0.2                      11                  2
S4              0.05                     1001                4
S5              0.1                      1000                4
S6              0.25                     01                  2

Figure 5.1 Huffman coding procedure in Example 5.9.
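A compact sketch of the merging procedure described in Section 5.2.2, applied to the source alphabet of Example 5.9. Because ties between equal probabilities can be broken in different ways, the individual codewords produced may differ from Table 5.7, but the codeword lengths and the average length (2.4 bits per symbol) agree.

```python
import heapq

def huffman_code(probs: dict) -> dict:
    """Binary Huffman code: repeatedly merge the two least probable symbols."""
    # Each heap entry: (probability, tie-breaker, {symbol: partial codeword})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p0, _, code0 = heapq.heappop(heap)   # least probable group
        p1, _, code1 = heapq.heappop(heap)   # second least probable group
        # Prepend one more bit: 0 for one group, 1 for the other.
        merged = {s: "0" + c for s, c in code0.items()}
        merged.update({s: "1" + c for s, c in code1.items()})
        counter += 1
        heapq.heappush(heap, (p0 + p1, counter, merged))
    return heap[0][2]

probs = {"s1": 0.3, "s2": 0.1, "s3": 0.2, "s4": 0.05, "s5": 0.1, "s6": 0.25}
code = huffman_code(probs)
avg = sum(probs[s] * len(code[s]) for s in probs)
print(code)
print(avg)   # about 2.4 bits/symbol, matching Table 5.7
```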

5.2.2.1 Applications
• Recall that the Huffman code has been used in differential coding and transform coding (TC).
• In TC, the magnitudes of the quantized nonzero transform coefficients and the run-lengths of zeros in the zigzag scan are encoded by using the Huffman code.

5.2.3 Modified Huffman Codes
• When the occurrence probabilities are skewed and the number of less probable source symbols is large, the required codebook memory will be large.
• The modified Huffman code can reduce the memory requirement while keeping almost the same optimality.

5.3 Arithmetic Codes
• Arithmetic coding is quite different from Huffman coding and is gaining increasing popularity.

5.3.1 Limitations of Huffman Coding
• Huffman coding is optimum for block encoding of a source alphabet, with each source symbol having an occurrence probability.
• The average codeword length achieved by Huffman coding satisfies the following inequality [gallagher 1978]:

H(S) ≤ Lavg < H(S) + pmax + 0.086,        (5.22)

where pmax is the maximum occurrence probability in the set of source symbols.

• In the case where the probability distribution among source symbols is skewed (some probabilities are small, while some are quite large), the upper bound may be large, implying that the coding redundancy may not be small.
• Consider an extreme situation: there are only two source symbols, one with a very small probability and the other with a very large probability (very close to 1).
• The entropy of the source alphabet is close to 0, since the uncertainty is very small.
• Using Huffman coding, however, we need two one-bit codewords: one for each symbol. The average codeword length is therefore 1, and the redundancy η ≈ 1. This agrees with Equation 5.22.
• This inefficiency is due to the fact that Huffman coding always encodes a source symbol with an integer number of bits.
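A quick numerical check of the extreme case just described (the value 0.99 is an arbitrary illustrative probability):

```python
from math import log2

p = [0.99, 0.01]                  # one symbol very likely, the other very unlikely
H = -sum(x * log2(x) for x in p)  # entropy of the source
L_avg = 1.0                       # Huffman assigns one bit to each of the two symbols
print(H, L_avg - H)               # H is about 0.081 bits; redundancy is about 0.92 bits/symbol
```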

• The fundamental idea behind Huffman coding is block coding. That is, some codeword having an integral number of bits is assigned to each source symbol. A message may be encoded by cascading the relevant codewords. It is this block-based approach that is responsible for the limitations of Huffman codes.
• Another limitation is that, when encoding a message that consists of a sequence of source symbols, nth extension Huffman coding needs to enumerate all possible sequences of source symbols having the same length, as discussed in coding the nth extended source alphabet. This is not computationally efficient.
• Quite different from Huffman coding, arithmetic coding is stream-based. It overcomes the drawbacks of Huffman coding.

• A string of source symbols is encoded as a string of code symbols. Arithmetic coding is hence free of the integral-bits-per-source-symbol restriction and is more efficient.
• Arithmetic coding may reach the theoretical bound on coding efficiency specified in the noiseless source coding theorem for any information source.

5.3.2 The Principle of Arithmetic Coding

Example 5.12
• The same source alphabet as that used in Example 5.9 is considered.
• In this example, however, a string of source symbols s1s2s3s4s5s6 is encoded.

5.3.2.1 Dividing the Interval [0, 1) into Subintervals
• The cumulative probability (CP) used here is slightly different from the cumulative distribution function (CDF) in probability theory,

CDF(si) = Σ_{j=1}^{i} p(sj).        (5.12)

• The cumulative probability is defined as

CP(si) = Σ_{j=1}^{i−1} p(sj),        (5.13)

where CP(s1) = 0.
• Each subinterval has its lower end point located at CP(si).
• The width of each subinterval is equal to the probability of the corresponding source symbol.
• A subinterval can be completely defined by its lower end point and its width. Alternatively, it is determined by its two end points: the lower and upper end points (sometimes also called the left and right end points).

Table 5.8 Source alphabet, occurrence probabilities, cumulative probabilities, and associated subintervals in Example 5.12
Source symbol   Occurrence probability   CP      Associated subinterval
S1              0.3                      0       [0, 0.3)
S2              0.1                      0.3     [0.3, 0.4)
S3              0.2                      0.4     [0.4, 0.6)
S4              0.05                     0.6     [0.6, 0.65)
S5              0.1                      0.65    [0.65, 0.75)
S6              0.25                     0.75    [0.75, 1.0)

Figure 5.2 Arithmetic coding working on the same source alphabet as that in Example 5.9. The encoded symbol string is S1 S2 S3 S4 S5 S6. (Parts (a) through (f) show the successive subintervals [0, 0.3), [0.09, 0.12), [0.102, 0.108), [0.1056, 0.1059), [0.105795, 0.105825), and [0.1058175, 0.1058250).)

5.3.2.2 Encoding

Encoding the First Source Symbol
• Refer to Part (a) of Figure 5.2. Since the first symbol is s1, we pick up its subinterval [0, 0.3).
• Picking up the subinterval [0, 0.3) means that any real number in the subinterval, i.e., any real number equal to or greater than 0 and smaller than 0.3, can be a pointer to the subinterval, thus representing the source symbol s1. This can be justified by considering that all six subintervals are disjoint.

Encoding the Second Source Symbol
• Refer to Part (b) of Figure 5.2. We use the same procedure as in Part (a) to divide the interval [0, 0.3) into six subintervals. Since the second symbol to be encoded is s2, we pick up its subinterval [0.09, 0.12).
• Notice that the subintervals are recursively generated from Part (a) to Part (b). It is known that an interval may be completely specified by its lower end point and its width. Hence, the subinterval recursion in the arithmetic coding procedure is equivalent to the following two recursions: the end point recursion and the width recursion.
• The lower end point recursion:

L_new = L_current + W_current · CP_new,        (5.14)

where L denotes the lower end point, W the width of the subinterval, the subscript "current" refers to the current recursion step, "new" to the new recursion step, and CP_new is the CP of the symbol being encoded.
• The width recursion:

W_new = W_current · p(si).

Encoding the Third Source Symbol
• Refer to Part (c) of Figure 5.2. The same recursion is applied to [0.09, 0.12); since the third symbol is s3, the subinterval [0.102, 0.108) is obtained.

Encoding the Fourth, Fifth and Sixth Source Symbols
• Continuing in the same way (Parts (d), (e) and (f) of Figure 5.2), the resulting subinterval [0.1058175, 0.1058250) can represent the source symbol string s1s2s3s4s5s6.
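The two recursions above translate directly into a few lines of code. The sketch below reproduces the subintervals of Example 5.12 for the alphabet of Table 5.8; exact rational arithmetic (Fraction) is used only so that the printed end points are not blurred by floating-point rounding.

```python
from fractions import Fraction as F

# Occurrence probabilities and cumulative probabilities (CP) from Table 5.8.
prob = {"s1": F(3, 10), "s2": F(1, 10), "s3": F(2, 10),
        "s4": F(1, 20), "s5": F(1, 10), "s6": F(1, 4)}
cp, acc = {}, F(0)
for s, p in prob.items():
    cp[s] = acc          # CP(s_i) = sum of probabilities of the preceding symbols
    acc += p

def encode(symbols):
    """Return the final subinterval [low, low + width) for a symbol string."""
    low, width = F(0), F(1)
    for s in symbols:
        low = low + width * cp[s]      # lower end point recursion (5.14)
        width = width * prob[s]        # width recursion
    return low, width

low, width = encode(["s1", "s2", "s3", "s4", "s5", "s6"])
print(float(low), float(low + width))  # 0.1058175  0.105825  ->  [0.1058175, 0.1058250)
```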

5.3.2.3 Decoding
• Theoretically, any real number in the final interval can be the code string for the input symbol string, since all subintervals are disjoint. Often, however, the lower end of the final subinterval is used as the code string.
• The decoder knows the encoding procedure and therefore has the information contained in Part (a) of Figure 5.2.
• Now let us examine how the decoding process is carried out with the lower end of the final subinterval, 0.1058175.
• It is first determined that 0 < 0.1058175 < 0.3. Hence the symbol s1 is decoded first.
• Once the first symbol is decoded, the decoder may know the partition of subintervals shown in Part (b) of Figure 5.2. Since 0.09 < 0.1058175 < 0.12, the lower end is contained in the subinterval corresponding to the symbol s2. That is, s2 is the second decoded symbol.
• The procedure repeats itself until all six symbols are decoded.
• Note that a terminal symbol is necessary to inform the decoder to stop decoding.
• The above procedure gives us an idea of how decoding works. The decoding process, however, does not need to construct Parts (b), (c), (d), (e) and (f) of Figure 5.2. Instead, the decoder only needs the information contained in Part (a) of Figure 5.2.
• Decoding can be split into the following three steps: comparison, readjustment (subtraction), and scaling [langdon 1984].

• Back to the example: after the symbol s1 is first decoded, the lower end of its corresponding interval, 0, is subtracted from 0.1058175, resulting in 0.1058175. This value is then divided by p1 = 0.3, resulting in 0.352725.
• Comparing this value with Part (a) of Figure 5.2, we decode s2, since 0.3 < 0.352725 < 0.4.
• We can decode the rest in the same way.
• In summary, considering the way in which Parts (b), (c), (d), (e) and (f) of Figure 5.2 are constructed, we see that the three steps discussed in the decoding process (comparison, readjustment and scaling) exactly "undo" what the encoding procedure has done.
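The three decoding steps can be sketched as follows. The probability and CP tables are rebuilt from Table 5.8, and a known message length is passed in place of the terminal symbol mentioned above; this is only an illustrative sketch, not a production decoder.

```python
from fractions import Fraction as F

prob = {"s1": F(3, 10), "s2": F(1, 10), "s3": F(2, 10),
        "s4": F(1, 20), "s5": F(1, 10), "s6": F(1, 4)}
cp, acc = {}, F(0)
for s, p in prob.items():
    cp[s] = acc
    acc += p

def decode(value, n_symbols):
    """Decode by repeated comparison, readjustment (subtraction) and scaling (division)."""
    out = []
    for _ in range(n_symbols):
        for s, p in prob.items():             # comparison against the Part (a) subintervals
            if cp[s] <= value < cp[s] + p:
                out.append(s)
                value = (value - cp[s]) / p   # readjustment, then scaling
                break
    return out

print(decode(F(1058175, 10**7), 6))  # ['s1', 's2', 's3', 's4', 's5', 's6']
```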

5.3.2.4 Observations
• Both encoding and decoding involve only arithmetic operations (addition and multiplication in encoding, subtraction and division in decoding). This explains the name arithmetic coding.
• We see that an input source symbol string s1s2s3s4s5s6, via encoding, corresponds to a subinterval [0.1058175, 0.1058250). Any number in this interval can be used to denote the string of source symbols.
• We also observe that arithmetic coding can be carried out in an incremental manner. That is, source symbols are fed into the encoder one by one, and the final subinterval is refined continually; in other words, the code string is generated continually.
• Furthermore, this is done in a manner called first in first out (FIFO). That is, the source symbol encoded first is decoded first.
• It is obvious that the width of the final subinterval becomes smaller and smaller as the length of the source symbol string becomes larger and larger. This causes what is known as the precision problem. It is this problem that prohibited arithmetic coding from practical usage for quite a long period of time. Only after this problem was solved in the late 1970s did arithmetic coding become an increasingly important coding technique.
• It is necessary to have a termination symbol at the end of an input source symbol string, so that an arithmetic coding system is able to know when to terminate decoding.
• To encode the same source symbol string, Huffman coding can be implemented in two different ways. One way is to construct a fixed codeword for each source symbol, as was done in Example 5.9. Since Huffman coding is instantaneous, we can cascade the corresponding codewords to form the output: the 17-bit code string 00.101.11.1001.1000.01, where the five periods are used to indicate the different codewords for easy reading.
• Another way is to form the 6th extension of the source alphabet as discussed in Section 5.1.4: treat each group of six source symbols as a new source symbol, calculate its occurrence probability by multiplying the six related probabilities, and then apply the Huffman coding algorithm to the 6th extension of the discrete memoryless source. This is called the 6th extension Huffman block code.
• As we can see, for the same source symbol string, the final subinterval obtained by using arithmetic coding is [0.1058175, 0.1058250). It is noted that the 15-bit binary decimal 0.000110111111111 is equal to the decimal 0.1058211962, which falls into the final subinterval representing the string s1s2s3s4s5s6.
• This indicates that arithmetic coding is more efficient than Huffman coding in this example.

• In other words, in order to encode the source string s1s2s3s4s5s6, (the 6th extension of) Huffman coding encodes all of the 6^6 = 46656 codewords in the 6th extension of the source alphabet. This implies a high complexity in implementation and a large codebook, and hence is not efficient.
• Similar to the case of Huffman coding, arithmetic coding is also applicable to r-ary encoding with r > 2.

5.3.3 Implementation Issues
• The growing precision problem has been resolved, and finite precision arithmetic is now used in arithmetic coding. This advance is due to the incremental implementation of arithmetic coding.

5.3.3.1 Incremental Implementation
• Refer to Figure 5.2 and Table 5.9. We observe that after the third symbol, s3, is encoded, the resultant subinterval is [0.102, 0.108).
• That is, the two most significant decimal digits are the same, and they remain the same for the rest of the encoding process. We can transmit these two digits without affecting the final code string.
• After the fourth symbol s4 is encoded, the resultant subinterval is [0.1056, 0.1059). One more digit, 5, can be transmitted. The cumulative output is 0.105.
• After the sixth symbol is encoded, the final subinterval is [0.1058175, 0.1058250). The cumulative output is 0.1058.
• This important observation reveals that we are able to incrementally transmit output (the code symbols) and receive input (the source symbols that need to be encoded).

Table 5.9 Final subintervals and cumulative output in Example 5.12
Source symbol   Lower end     Upper end     Cumulative output
S1              0             0.3
S2              0.09          0.12
S3              0.102         0.108         0.10
S4              0.1056        0.1059        0.105
S5              0.105795      0.105825      0.105
S6              0.1058175     0.1058250     0.1058

5.3.3.2 Other Issues
• Eliminating multiplication.
• The carry-over problem.

5.3.4 History
• The idea of encoding by using cumulative probability in some ordering, and decoding by comparison of the magnitude of a binary fraction, was introduced in Shannon's celebrated paper [shannon 1948].
• The recursive implementation of arithmetic coding was devised by Elias (another member of Fano's first information theory class at MIT). This unpublished result was first introduced by Abramson as a note in his book on information theory and coding [abramson 1963].
• The result was further developed by Jelinek in his book on information theory [jelinek 1968].
• The growing precision problem, however, prevented arithmetic coding from practical usage. The proposal of using finite precision arithmetic was made independently by Pasco [pasco 1976] and Rissanen [rissanen 1976].
• Practical arithmetic coding was developed by several independent groups [rissanen 1979, rubin 1979, guazzo 1980].

• A well-known tutorial paper on arithmetic coding appeared in [langdon 1984].
• The tremendous efforts made at IBM led to a new form of adaptive binary arithmetic coding known as the Q-coder [pennebaker 1988].
• Based on the Q-coder, the activities of JPEG and JBIG combined the best features of the various existing arithmetic coders and developed the binary arithmetic coding procedure known as the QM-coder [pennebaker 1992].

5.3.5 Applications
• Arithmetic coding is becoming popular.
• Note that in text and bilevel image applications there are only two source symbols (black and white), and the occurrence probability is skewed. Therefore binary arithmetic coding achieves high coding efficiency.

• It has been successfully applied to bilevel image coding [langdon 1981] and adopted by the international standard for bilevel image compression, JBIG.
• It has also been adopted by JPEG.

5.4 References

[abramson 1963] N. Abramson, Information Theory and Coding, New York: McGraw-Hill, 1963.
[bell 1990] T. C. Bell, J. G. Cleary and I. H. Witten, Text Compression, Englewood Cliffs, NJ: Prentice Hall, 1990.
[blahut 1986] R. E. Blahut, Principles and Practice of Information Theory, Reading, MA: Addison-Wesley, 1986.
[fano 1949] R. M. Fano, "The transmission of information," Technical Report 65, Research Laboratory of Electronics, MIT, Cambridge, MA, 1949.
[gallagher 1978] R. G. Gallagher, "Variations on a theme by Huffman," IEEE Transactions on Information Theory, vol. IT-24, no. 6, pp. 668-674, November 1978.
[guazzo 1980] M. Guazzo, "A general minimum-redundancy source-coding algorithm," IEEE Transactions on Information Theory, vol. IT-26, no. 1, pp. 15-25, January 1980.
[hankamer 1979] M. Hankamer, "A modified Huffman procedure with reduced memory requirement," IEEE Transactions on Communications, vol. COM-27, no. 6, pp. 930-932, June 1979.
[huffman 1952] D. A. Huffman, "A method for the construction of minimum-redundancy codes," Proceedings of the IRE, vol. 40, pp. 1098-1101, September 1952.

[jelinek 1968] F. Jelinek, Probabilistic Information Theory, New York: McGraw-Hill, 1968.
[langdon 1981] G. G. Langdon, Jr. and J. Rissanen, "Compression of black-white images with arithmetic coding," IEEE Transactions on Communications, vol. COM-29, no. 6, pp. 858-867, June 1981.
[langdon 1982] G. G. Langdon, Jr. and J. Rissanen, "A simple general binary source code," IEEE Transactions on Information Theory, vol. IT-28, p. 800, 1982.
[langdon 1984] G. G. Langdon, Jr., "An introduction to arithmetic coding," IBM Journal of Research and Development, vol. 28, no. 2, pp. 135-149, March 1984.
[nelson 1996] M. Nelson and J.-L. Gailly, The Data Compression Book, second edition, New York: M&T Books, 1996.
[pasco 1976] R. Pasco, Source Coding Algorithms for Fast Data Compression, Ph.D. dissertation, Stanford University, 1976.
[pennebaker 1988] W. B. Pennebaker, J. L. Mitchell, G. G. Langdon, Jr. and R. B. Arps, "An overview of the basic principles of the Q-coder adaptive binary arithmetic coder," IBM Journal of Research and Development, vol. 32, no. 6, pp. 717-726, November 1988.
[pennebaker 1992] W. B. Pennebaker and J. L. Mitchell, JPEG: Still Image Data Compression Standard, New York: Van Nostrand Reinhold, 1992.
[rissanen 1976] J. J. Rissanen, "Generalized Kraft inequality and arithmetic coding," IBM Journal of Research and Development, vol. 20, pp. 198-203, May 1976.
[rissanen 1979] J. J. Rissanen and G. G. Langdon, Jr., "Arithmetic coding," IBM Journal of Research and Development, vol. 23, no. 2, pp. 149-162, March 1979.
[rubin 1979] F. Rubin, "Arithmetic stream coding using fixed precision registers," IEEE Transactions on Information Theory, vol. IT-25, no. 6, pp. 672-675, November 1979.
[sayood 1996] K. Sayood, Introduction to Data Compression, San Francisco, CA: Morgan Kaufmann Publishers, 1996.
[shannon 1948] C. E. Shannon, "A mathematical theory of communication," Bell System Technical Journal, vol. 27, pp. 379-423 (Part I), July 1948; pp. 623-656 (Part II), October 1948.