Huffman Coding

Vida Movahedi

October 2006

Contents

- A simple example
- Definitions
- Huffman Coding Algorithm
- Image Compression

A simple example
Suppose we have a message of 10 characters drawn from an alphabet of 5 symbols. How can we encode this message in 0/1 so that the coded message has minimum length (for transmission or storage)? With 5 symbols, a fixed-length code needs at least 3 bits per symbol, so a simple encoding of the message takes 10 × 3 = 30 bits.

A simple example (cont.)
Intuition: symbols that are more frequent should get shorter codes. Yet since the codewords then differ in length, there must be a way to tell where each codeword ends; Huffman codes guarantee this. With such a code, the length of the encoded message is 3×2 + 3×2 + 2×2 + 3 + 3 = 24 bits.

Definitions
An ensemble X is a triple (x, Ax, Px), where
- x is the value of a random variable,
- Ax = {a1, a2, ..., aI} is the set of possible values,
- Px = {p1, p2, ..., pI} gives the probability of each value: P(x = ai) = pi, with pi > 0 and Σ pi = 1.

Shannon information content of x:  h(x) = log2(1 / P(x))

Entropy of X:  H(X) = Σ_{x ∈ Ax} P(x) log2(1 / P(x))

Example (letter probabilities in English text):

  i    ai   pi      h(pi)
  1    a    .0575    4.1
  2    b    .0128    6.3
  3    c    .0263    5.2
  ...
  26   z    .0007   10.4
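Both quantities are easy to compute directly; a minimal Python sketch, using letter probabilities from the slide's table:

```python
import math

def info_content(p):
    """Shannon information content: h(x) = log2(1/P(x)), in bits."""
    return math.log2(1.0 / p)

def entropy(px):
    """Entropy of an ensemble: H(X) = sum over x of P(x) * log2(1/P(x))."""
    return sum(p * math.log2(1.0 / p) for p in px.values() if p > 0)

# Rarer symbols carry more information.
print(round(info_content(0.0575), 1))  # a -> 4.1 bits
print(round(info_content(0.0007), 1))  # z -> 10.5 bits (the table rounds down)

# A fair coin has exactly 1 bit of entropy.
print(entropy({"heads": 0.5, "tails": 0.5}))  # 1.0
```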

Source Coding Theorem
There exists a variable-length encoding C of an ensemble X such that the average length of an encoded symbol satisfies

  L(C, X) ∈ [H(X), H(X) + 1)

The Huffman coding algorithm produces optimal symbol codes.

Symbol Codes
Notations:
- A^N: all strings of length N, e.g. {0,1}^3 = {000, 001, 010, 011, 100, 101, 110, 111}
- A^+: all strings of finite length, e.g. {0,1}^+ = {0, 1, 00, 01, 10, 11, 000, 001, ...}
- c(x): the codeword for x; l(x): the length of that codeword

A symbol code C for an ensemble X is a mapping from Ax (the range of x values) to {0,1}^+.

Example
Ensemble X: Ax = {a, b, c, d}, Px = {1/2, 1/4, 1/8, 1/8}

C0:
  ai   c(ai)   li
  a    1000    4
  b    0100    4
  c    0010    4
  d    0001    4

c(a) = 1000
c+(acd) = 100000100001 (c+ is called the extended code)

Any encoded string must have a unique decoding. A code C(X) is uniquely decodable if, under the extended code c+, no two distinct strings have the same encoding, i.e.

  ∀ x, y ∈ Ax+ : x ≠ y ⇒ c+(x) ≠ c+(y)

The symbol code must be easy to decode. If it is possible to identify the end of a codeword as soon as it arrives, then no codeword can be a prefix of another codeword. A symbol code with this property is called a prefix code (also called a prefix-free code, instantaneous code, or self-punctuating code).

The code should achieve as much compression as possible. The expected length L(C, X) of symbol code C for ensemble X is

  L(C, X) = Σ_{x ∈ Ax} P(x) l(x) = Σ_{i=1}^{|Ax|} pi li
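The formula is straightforward to evaluate; a sketch using the ensemble and the prefix code C1 from the following example:

```python
def expected_length(px, code):
    """L(C, X) = sum over symbols of P(x) * l(x)."""
    return sum(px[s] * len(code[s]) for s in px)

Px = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
C1 = {"a": "0", "b": "10", "c": "110", "d": "111"}
print(expected_length(Px, C1))  # 1.75 bits/symbol
```

For this dyadic ensemble the result, 1.75 bits/symbol, equals the entropy H(X) exactly, so C1 meets the lower bound of the source coding theorem.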

Example
Ensemble X: Ax = {a, b, c, d}, Px = {1/2, 1/4, 1/8, 1/8}

C1:
  ai   c(ai)   li
  a    0       1
  b    10      2
  c    110     3
  d    111     3

c+(acd) = 0110111 (7 bits, compared with 12 bits under C0). Is C1 a prefix code?
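The prefix property is what makes instantaneous decoding possible; a small sketch of a decoder that emits a symbol the moment a complete codeword arrives:

```python
def decode_prefix(bits, code):
    """Decode a prefix code by accumulating bits until they match a
    codeword. This works precisely because no codeword is a prefix of
    another: a match can never be the beginning of a longer codeword."""
    inverse = {w: s for s, w in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:        # a complete codeword has arrived
            out.append(inverse[buf])
            buf = ""
    return "".join(out)

C1 = {"a": "0", "b": "10", "c": "110", "d": "111"}
print(decode_prefix("0110111", C1))  # "acd", inverting c+(acd)
```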

The Huffman Coding Algorithm: History
In 1951, David Huffman and his MIT information theory classmates were given the choice of a term paper or a final exam. Huffman hit upon the idea of using a frequency-sorted binary tree and quickly proved this method the most efficient. In doing so, the student outdid his professor, who had worked with information theory inventor Claude Shannon to develop a similar code. Huffman built the tree from the bottom up instead of from the top down.

Huffman Coding Algorithm
1. Take the two least probable symbols in the alphabet. They will be given the two longest codewords, which have equal length and differ only in the last digit.
2. Combine these two symbols into a single symbol, and repeat.
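The two steps above can be sketched with a priority queue; this is one common way to implement the bottom-up merging, not necessarily the presentation's own formulation:

```python
import heapq
from itertools import count

def huffman_code(px):
    """Build a Huffman code bottom-up: repeatedly merge the two least
    probable nodes, prefixing one side's codewords with 0 and the
    other's with 1 (so the merged pair differs only in the last digit)."""
    tiebreak = count()  # unique tags so the heap never compares dicts
    heap = [(p, next(tiebreak), {s: ""}) for s, p in px.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # least probable node
        p2, _, c2 = heapq.heappop(heap)   # second least probable node
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

code = huffman_code({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
print(code)  # codeword lengths 1, 2, 3, 3 -- the lengths of code C1
```

Exact codewords depend on how ties between equal probabilities are broken, but the codeword lengths (and hence the expected length) are optimal either way.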

Example
Ax = {a, b, c, d, e}, Px = {0.25, 0.25, 0.2, 0.15, 0.15}

[Huffman tree] Merging the two least probable nodes at each step: d + e → 0.3; c + b → 0.45; a + (d,e) → 0.55; 0.45 + 0.55 → 1.0. The resulting codewords:

  ai   pi     c(ai)
  a    0.25   00
  b    0.25   10
  c    0.2    11
  d    0.15   010
  e    0.15   011

Statements
- The lower bound on expected length is H(X).
- There is no better symbol code for a source than the Huffman code.
- Constructing a binary tree top-down is suboptimal.

Disadvantages of the Huffman Code
- Changing ensemble: if the ensemble changes, the frequencies and probabilities change, so the optimal coding changes. In text compression, for example, symbol frequencies vary with context. Re-computing the Huffman code means running through the entire file in advance, and the code itself must also be saved or transmitted.
- Does not consider blocks of symbols: after seeing "strings_of_ch", the next nine symbols ("aracters_") are predictable, yet bits are still spent on them without conveying any new information.

Variations
- n-ary Huffman coding: uses the code alphabet {0, 1, ..., n-1}, not just {0, 1}.
- Adaptive Huffman coding: calculates frequencies dynamically based on recent actual frequencies.
- Huffman template algorithm: generalizes probabilities to arbitrary weights, and addition to an arbitrary function; can solve other minimization problems, e.g. minimizing max [wi + length(ci)].

Image Compression
A 2-stage coding technique:
1. A linear predictor such as DPCM, or some other linear predicting function, to decorrelate the raw image data.
2. A standard coding technique, such as Huffman coding or arithmetic coding.

Lossless JPEG:
- version 1: DPCM with arithmetic coding
- version 2: DPCM with Huffman coding
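The two-stage idea can be sketched end to end. Both helpers below are simplified stand-ins (a hypothetical pixel row, a previous-sample predictor, and a Huffman length calculation), not the lossless JPEG specification:

```python
import heapq
from collections import Counter
from itertools import count

def dpcm_residuals(pixels):
    """Stage 1: predict each pixel as the previous one; keep residuals."""
    prev, out = 0, []
    for p in pixels:
        out.append(p - prev)
        prev = p
    return out

def huffman_lengths(freqs):
    """Stage 2: Huffman codeword lengths via bottom-up merging of the
    two lightest nodes (every symbol in a merged node gains one bit)."""
    tiebreak = count()
    heap = [(f, next(tiebreak), {s: 0}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        merged = {s: l + 1 for s, l in {**d1, **d2}.items()}
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

row = [100, 101, 101, 102, 101, 101, 100, 100]  # hypothetical scanline
res = dpcm_residuals(row)                        # mostly small values
lengths = huffman_lengths(Counter(res))
bits = sum(lengths[r] for r in res)
print(res, bits)
```

The decorrelation step shrinks the residual alphabet to a few frequent values, which is exactly the situation where the entropy coder pays off.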

DPCM: Differential Pulse Code Modulation
DPCM is an efficient way to encode highly correlated analog signals into binary form suitable for digital transmission, storage, or input to a digital computer. Patented by Cutler (1952).
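A minimal DPCM sketch using the simplest predictor (the previous sample); real DPCM systems also quantize the residual and may use richer linear predictors:

```python
def dpcm_encode(samples):
    """Predict each sample as the previous one; transmit the residual.
    Highly correlated signals give small residuals that code cheaply."""
    prev, residuals = 0, []
    for s in samples:
        residuals.append(s - prev)
        prev = s
    return residuals

def dpcm_decode(residuals):
    """Invert the encoder by accumulating residuals."""
    prev, out = 0, []
    for r in residuals:
        prev += r
        out.append(prev)
    return out

row = [100, 102, 101, 105, 107]   # a hypothetical row of pixel values
res = dpcm_encode(row)
print(res)                         # [100, 2, -1, 4, 2]
assert dpcm_decode(res) == row     # lossless round trip
```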

[DPCM block diagram]

Huffman Coding Algorithm for Image Compression
Step 1. Build a Huffman tree by sorting the histogram and successively combining the two bins of lowest value until only one bin remains.
Step 2. Encode the Huffman tree and save it with the coded values.
Step 3. Encode the residual image.

Huffman Coding of the Most-Likely Magnitude (MLM Method)
1. Compute the residual histogram H: H(x) = number of pixels having residual magnitude x.
2. Compute the symmetric histogram S: S(y) = H(y) + H(-y), y > 0.
3. Find the range threshold R, where N is the number of pixels and P is the desired proportion of most-likely magnitudes:

  Σ_{j=0}^{R-1} S(j) ≤ P × N < Σ_{j=0}^{R} S(j)
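The three steps can be sketched directly. Treating S(0) as H(0) is an interpretation (the slide defines S only for y > 0), and the residual data below is invented for illustration:

```python
from collections import Counter

def symmetric_histogram(residuals):
    """S(0) = H(0); S(y) = H(y) + H(-y) for y > 0 (assumed handling
    of zero, which the slide leaves implicit)."""
    H = Counter(residuals)
    ymax = max(abs(r) for r in residuals)
    return [H[0]] + [H[y] + H[-y] for y in range(1, ymax + 1)]

def range_threshold(S, P, N):
    """Smallest R with sum_{j=0}^{R-1} S(j) <= P*N < sum_{j=0}^{R} S(j)."""
    cum = 0
    for R, s in enumerate(S):
        if cum <= P * N < cum + s:
            return R
        cum += s
    return len(S)

# Hypothetical residuals: 100 pixels, heavily concentrated near zero.
residuals = [0]*50 + [1]*20 + [-1]*10 + [2]*6 + [-2]*4 + [3]*6 + [-3]*4
S = symmetric_histogram(residuals)
R = range_threshold(S, 0.85, len(residuals))
print(S, R)  # [50, 30, 10, 10] 2
```

With P = 0.85, magnitudes below R = 2 cover at least 85% of the pixels; only the rarer, larger magnitudes fall outside the most-likely range.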

References
(1) MacKay, D.J.C., Information Theory, Inference, and Learning Algorithms, Cambridge University Press, 2003.
(2) Wikipedia, http://en.wikipedia.org/wiki/Huffman_coding
(3) Hu, Y.C. and Chang, C.C., "A new lossless compression scheme based on Huffman coding scheme for image compression".
(4) O'Neal.
