
Compression

ECE 302 Spring 2012

Purdue University, School of ECE

Prof. Ilya Pollak

Reducing the file size without compromising the quality of the data stored in the file too much (lossy compression) or at all (lossless compression).

With compression, you can fit higher-quality data (e.g., higher-resolution pictures or video) into a file of the same size as required for lower-quality uncompressed data.

Our appetite for data (high-resolution pictures, HD video, audio, documents, etc.) seems to always significantly outpace hardware capabilities for storage and transmission.

If the data is continuous-time (e.g., audio) or continuous-space (e.g., a picture), it first needs to be discretized.

Sampling is typically done nowadays during signal acquisition (e.g., by a digital camera for pictures or by audio recording equipment for music and speech).

We will not study sampling. It is studied in ECE 301, ECE 438, and ECE 440. We will consider compressing discrete-time or discrete-space data.

Example: compression of grayscale images

An eight-bit grayscale image is a rectangular array of integers between 0 (black) and 255 (white). Each site in the array is called a pixel. It takes one byte (eight bits) to store one pixel value, since it can be any number between 0 and 255. It would take 25 bytes to store a 5x5 image. Can we do better?

Consider the following 5x5 image:

255 255 255 255 255
255 255 255 255 255
200 200 200 200 200
200 200 200 200 200
200 200 200 200 100

Idea #1: Transform the data to create lots of zeros.

For example, we could rasterize the image, compute the differences between consecutive values, and store the top-left value along with the 24 differences (in reality, other transforms are used, but they work in a similar fashion):

255,0,0,0,0,0,0,0,0,0,-55,0,0,0,0,0,0,0,0,0,0,0,0,0,-100

(The difference -55 occurs where the pixel value drops from 255 to 200, and -100 where it drops from 200 to 100.)

This seems to make things worse: now the numbers can range from -255 to 255, and therefore we need two bytes per pixel!

Idea #2: when encoding the data, spend fewer bits on frequently occurring numbers and more bits on rare numbers.
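To make Idea #1 concrete, here is a minimal Python sketch (an illustration of mine, not code from the slides) of the rasterize-and-difference transform and its exact inverse; the cumulative sum undoes the differencing, so the transform itself is lossless:

```python
# Illustrative sketch: difference transform of a rasterized image and its inverse.

def difference_transform(pixels):
    """Store the first value, then differences of consecutive values."""
    out = [pixels[0]]
    for k in range(1, len(pixels)):
        out.append(pixels[k] - pixels[k - 1])
    return out

def inverse_transform(diffs):
    """Cumulative sum recovers the original pixel values exactly."""
    pixels = [diffs[0]]
    for d in diffs[1:]:
        pixels.append(pixels[-1] + d)
    return pixels

# The 5x5 example image from the slides, rasterized row by row.
image = [255] * 10 + [200] * 14 + [100]
diffs = difference_transform(image)
assert diffs == [255] + [0] * 9 + [-55] + [0] * 13 + [-100]
assert inverse_transform(diffs) == image
```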

Entropy coding

Suppose we are encoding realizations of a discrete random variable X such that

value of X    probability
0             22/25
255           1/25
-55           1/25
-100          1/25

One possible encoder assigns a fixed-length, two-bit codeword to each value:

value of X    codeword
0             00
255           01
-55           10
-100          11

This costs two bits per number: 50 bits for the 25 numbers.

Now consider the following encoder:

value of X    codeword
0             1
255           01
-55           000
-100          001

For a file with 25 numbers, E[file size] = 25 · (1·(22/25) + 2·(1/25) + 3·(1/25) + 3·(1/25)) = 30 bits!
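A quick numeric check of this expected-length computation (my own sanity check, not from the slides):

```python
# Expected file size for 25 numbers under the variable-length code above.
probs_and_lengths = [(22 / 25, 1), (1 / 25, 2), (1 / 25, 3), (1 / 25, 3)]
bits_per_symbol = sum(p * n for p, n in probs_and_lengths)
print(25 * bits_per_symbol)  # -> 30.0 bits, versus 50 bits for the fixed-length code
```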

Entropy coding

A similar encoding scheme can be devised for a random variable of pixel differences which takes values between -255 and 255, to result in a smaller average file size than two bytes per pixel.

Another commonly used idea: run-length coding. I.e., instead of encoding each 0 individually, encode the length of each string of zeros.
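As an illustration of the run-length idea, here is a minimal sketch (assumptions mine: runs are represented as (0, run_length) pairs, one of several possible conventions):

```python
# Minimal run-length sketch: replace each run of zeros with a (0, run_length) pair.
def run_length_encode_zeros(values):
    encoded, k = [], 0
    while k < len(values):
        if values[k] == 0:
            run = 0
            while k < len(values) and values[k] == 0:
                run += 1
                k += 1
            encoded.append((0, run))
        else:
            encoded.append(values[k])
            k += 1
    return encoded

diffs = [255] + [0] * 9 + [-55] + [0] * 13 + [-100]
print(run_length_encode_zeros(diffs))  # [255, (0, 9), -55, (0, 13), -100]
```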

To summarize, the first variable-length encoder:

value of X    probability    codeword
0             22/25          1
255           1/25           01
-55           1/25           000
-100          1/25           001

What about this alternative encoder?

value of X    probability    codeword
0             22/25          0
255           1/25           01
-55           1/25           10
-100          1/25           11

Is there anything wrong with this encoder?

This alternative code is NOT uniquely decodable! For example, the bit string 010 could have been produced by 255 followed by 0, or by 0 followed by -55.

Therefore, this code is unusable!

It turns out that the first code is uniquely decodable.

Which histograms are amenable to entropy coding?

[Figure: two example probability mass functions over a small alphabet, probability axis ticks from 0 to 0.7; each panel is annotated "two bits per symbol".]

Conclusion: the transform procedure should be such that the numbers fed into the entropy coder have a highly concentrated histogram (a few very likely values, most values unlikely). Also, if we are encoding each number individually, they should be independent or approximately independent.

What if we are willing to lose some information?

Original 5x5 image:

253 253 255 254 255
254 254 254 255 254
252 255 255 254 252
253 253 254 254 254
252 255 253 252 253

Quantization: replace every pixel with 253.5, the midpoint of the range of occurring values {252,...,255}:

253.5 253.5 253.5 253.5 253.5
253.5 253.5 253.5 253.5 253.5
253.5 253.5 253.5 253.5 253.5
253.5 253.5 253.5 253.5 253.5
253.5 253.5 253.5 253.5 253.5

[Figure: an image quantized with two different sets of quantization bins. In the first version, pixel values in the five displayed regions come from (left to right): {252,253,254,255}, {188,189,190,191}, {125,126,127,128}, {61,62,63,64}, {0,1,2,3}. In the second version: {240,...,255}, {176,...,191}, {113,...,128}, {49,...,64}, {0,...,15}.]

Converting continuous-valued to discrete-valued signals

Many real-world signals are continuous-valued:
- audio signal a(t): both the time argument t and the intensity value a(t) are continuous;
- image u(x,y): both the spatial location (x,y) and the image intensity value u(x,y) are continuous;
- video v(x,y,t): x, y, t, and v(x,y,t) are all continuous.

Discretizing the arguments (i.e., sampling) is studied in ECE 301, 438, and 440. However, in addition to discretizing the argument values, the signal values must be discretized as well in order to be digitally stored.

Quantization:
- digitizing a continuous-valued signal into a discrete and finite set of values;
- converting a discrete-valued signal into another discrete-valued signal, with fewer possible discrete values.

Suppose data X(1),…,X(N) is quantized using two quantizers, to result in Y1(1),…,Y1(N) and Y2(1),…,Y2(N). Suppose both Y1(1),…,Y1(N) and Y2(1),…,Y2(N) can be encoded with the same number of bits. Which quantization is better? The one that results in less distortion. But how to measure distortion?

In general, measuring and modeling perceptual image similarity and similarity of audio are open research problems. Some useful things are known about human audio and visual systems that inform the design of quantizers.

[Figures: sensitivity of the human visual system to contrast changes, as a function of frequency, and examples of intricacies in the way the human visual system computes similarity.]

Measuring distortion is complicated, because measuring image fidelity is complicated. Often, very simple distortion measures are used, such as mean-square error.

[Figure: a pair of values (r,s), each in [0,255]. One option is to quantize each value separately by simple thresholding (e.g., at 127); a more complex option quantizes the pair jointly.]

When is a pair of variables amenable to scalar quantization?

[Figure: the (r,s) plane, 0 to 255 on each axis, with a green square covering the whole range and a narrow yellow region along the diagonal.]

If (r,s) are jointly uniform over the green square (or, more generally, independent), knowing r does not tell us anything about s. Best thing to do: make quantization decisions independently.

If (r,s) are jointly uniform over the yellow region, knowing r tells us a lot about s. Best thing to do: make quantization decisions jointly.

Conclusion: the transform procedure should be such that the coefficients fed into the quantizer are independent (or at least uncorrelated, or almost uncorrelated), in order to enable the simpler scalar quantization.

Does it make sense to do scalar quantization with different quantization bins for different variables?

No reason to do this if we are quantizing grayscale pixel values. However, if we can decompose the image into components that are less perceptually important and more perceptually important, we should use larger quantization bins for the less important components.

[Figure: the (r,s) plane with ticks at 127 and 255 on both axes.]

Compression Algorithm for Audio, Images, or Video

data → transform → quantization → entropy coding → compressed bitstream

(Various transforms are considered in ECE 301 and ECE 438.)

Source (e.g., image, video, speech signal) → sequence of discrete or continuous random variables X(1),…,X(N) (e.g., transformed image pixel values) → Quantizer → sequence of random variables Y(1),…,Y(N), each distributed over a finite set of values (quantization levels).

Distortion of quantizers

Suppose data X(1),…,X(N) are quantized, to result in Y(1),…,Y(N). Let D(n) = (X(n) − Y(n))². The expected total squared error is

$$E\left[\sum_{n=1}^{N}\bigl(X(n)-Y(n)\bigr)^{2}\right]=\sum_{n=1}^{N}E\bigl[D(n)\bigr].$$

If D(1),…,D(N) are identically distributed, this is the same as N·E[D(n)], for any n.

Uniform quantization: use quantization intervals (bins) of equal size, [x_1,x_2), [x_2,x_3), …, [x_L,x_{L+1}]. Quantization levels are q_1, q_2, …, q_L. Each quantization level is in the middle of the corresponding quantization bin: q_k = (x_k + x_{k+1})/2. If the quantizer input X is in [x_k, x_{k+1}), the corresponding quantized value is Y = q_k.
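A minimal uniform-quantizer sketch (my own illustration, with an arbitrary choice of range and number of bins):

```python
def uniform_quantize(x, x_min, x_max, L):
    """Map x to the midpoint of one of L equal-size bins covering [x_min, x_max)."""
    width = (x_max - x_min) / L
    k = min(int((x - x_min) // width), L - 1)  # bin index, clamped to the last bin
    return x_min + (k + 0.5) * width           # quantization level: bin midpoint

print(uniform_quantize(253.0, 252.0, 256.0, L=2))  # -> 253.0
print(uniform_quantize(255.0, 252.0, 256.0, L=2))  # -> 255.0
```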

Uniform vs non-uniform quantization

Uniform quantization is not a good strategy for distributions which significantly differ from uniform. If the distribution is non-uniform, it is better to spend more quantization levels on more probable parts of the distribution and fewer quantization levels on less probable parts.

X = source random variable with a known distribution. We assume it to be a continuous r.v. with PDF f_X(x) > 0. The results can be extended to discrete or mixed random variables, and to continuous random variables whose density can be zero for some x.

Set x_1 = −∞ and x_{L+1} = +∞, with

$$-\infty < q_1 < x_2 \le q_2 < x_3 \le q_3 < \cdots \le q_{L-1} < x_L \le q_L < +\infty,$$

i.e., q_k lies in the k-th quantization interval. The quantizer output Y is then a discrete random variable with L possible outcomes, q_1, q_2, …, q_L, defined by

$$Y = Y(X) = \begin{cases} q_1 & \text{if } X < x_2 \\ q_2 & \text{if } x_2 \le X < x_3 \\ \;\vdots \\ q_{L-1} & \text{if } x_{L-1} \le X < x_L \\ q_L & \text{if } X \ge x_L. \end{cases}$$

Given the pdf f_X(x) of the source r.v. X and the desired number L of quantization levels, find the quantization interval endpoints x_2,…,x_L and the quantization levels q_1,…,q_L to minimize the mean-square error, E[(Y−X)²]. To do this, express the mean-square error in terms of the quantization interval endpoints and quantization levels, and find the minimum (or minima) through differentiation.

$$E\bigl[(Y-X)^2\bigr]=\int_{-\infty}^{\infty}\bigl(y(x)-x\bigr)^2 f_X(x)\,dx=\sum_{k=1}^{L}\int_{x_k}^{x_{k+1}}\bigl(y(x)-x\bigr)^2 f_X(x)\,dx=\sum_{k=1}^{L}\int_{x_k}^{x_{k+1}}\bigl(q_k-x\bigr)^2 f_X(x)\,dx$$

Minimize with respect to q_k:

$$\frac{\partial}{\partial q_k}E\bigl[(Y-X)^2\bigr]=\int_{x_k}^{x_{k+1}}2\,(q_k-x)\,f_X(x)\,dx=0$$

$$\Rightarrow\quad q_k\int_{x_k}^{x_{k+1}}f_X(x)\,dx=\int_{x_k}^{x_{k+1}}x\,f_X(x)\,dx,\quad\text{therefore}\quad q_k=\frac{\int_{x_k}^{x_{k+1}}x\,f_X(x)\,dx}{\int_{x_k}^{x_{k+1}}f_X(x)\,dx}.$$

This is a minimum, since

$$\frac{\partial^2}{\partial q_k^2}E\bigl[(Y-X)^2\bigr]=\int_{x_k}^{x_{k+1}}2\,f_X(x)\,dx>0.$$

Minimize with respect to x_k (only two terms of the sum depend on x_k):

$$\frac{\partial}{\partial x_k}E\bigl[(Y-X)^2\bigr]=\frac{\partial}{\partial x_k}\left[\int_{x_{k-1}}^{x_k}\bigl(q_{k-1}-x\bigr)^2 f_X(x)\,dx+\int_{x_k}^{x_{k+1}}\bigl(q_k-x\bigr)^2 f_X(x)\,dx\right]=\bigl(q_{k-1}-x_k\bigr)^2 f_X(x_k)-\bigl(q_k-x_k\bigr)^2 f_X(x_k)=0.$$

Since q_{k−1} ≤ x_k ≤ q_k, this gives x_k − q_{k−1} = q_k − x_k, i.e.,

$$x_k=\frac{q_{k-1}+q_k}{2},\quad\text{for }k=2,\ldots,L.$$

This is a minimum, since

$$\frac{\partial^2}{\partial x_k^2}E\bigl[(Y-X)^2\bigr]=2\,\bigl(q_k-q_{k-1}\bigr)f_X(x_k)>0.$$

To summarize, the optimal (Lloyd-Max) quantizer satisfies

$$q_k=\frac{\int_{x_k}^{x_{k+1}}x\,f_X(x)\,dx}{\int_{x_k}^{x_{k+1}}f_X(x)\,dx}=E\bigl[X\mid X\in k\text{-th quantization interval}\bigr],\quad\text{for }k=1,\ldots,L,$$

$$x_k=\frac{q_{k-1}+q_k}{2},\quad\text{for }k=2,\ldots,L.$$

E.g., if X is uniform, then the Lloyd-Max quantizer = uniform quantizer. In general, the two conditions are solved with an iterative algorithm (e.g., the lloyds command in Matlab). For real data, typically the PDF is not given and therefore needs to be estimated using, for example, histograms constructed from the observed data.
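To make the fixed-point structure of these two conditions concrete, here is a small iterative sketch (my own illustration, assuming the PDF is supplied as a Python function on a finite interval; this is not the slides' code and not Matlab's lloyds):

```python
import numpy as np

def lloyd_max(pdf, lo, hi, L, iters=200):
    """Alternate the two conditions: q_k = conditional mean of its bin,
    x_k = midpoint between adjacent levels."""
    grid = np.linspace(lo, hi, 100_001)
    f = pdf(grid)
    q = np.linspace(lo, hi, L + 2)[1:-1]  # initial levels inside [lo, hi]
    for _ in range(iters):
        x = np.concatenate(([lo], (q[:-1] + q[1:]) / 2, [hi]))  # bin endpoints
        for k in range(L):
            mask = (grid >= x[k]) & (grid <= x[k + 1])
            mass = np.trapz(f[mask], grid[mask])
            if mass > 0:  # conditional mean of X given X in the k-th bin
                q[k] = np.trapz(grid[mask] * f[mask], grid[mask]) / mass
    return x, q

# For a uniform density, the iteration reproduces the uniform quantizer:
x, q = lloyd_max(lambda t: np.ones_like(t), 0.0, 1.0, L=4)
print(np.round(q, 3))  # -> approximately [0.125, 0.375, 0.625, 0.875]
```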

X = (X(1),…,X(N)) = source random vector with a given joint distribution. L = a desired number of quantization points.

We would like to find:
(1) L events A_1,…,A_L that partition the joint sample space of X(1),…,X(N), and
(2) L quantization points q_1 ∈ A_1, …, q_L ∈ A_L,

such that the quantizer defined by Y = q_k if X ∈ A_k, for k = 1,…,L, minimizes the mean-square error,

$$E\bigl[\|\mathbf{Y}-\mathbf{X}\|^2\bigr]=E\left[\sum_{n=1}^{N}\bigl(Y(n)-X(n)\bigr)^2\right].$$

Difficulty: we cannot differentiate with respect to a set A_k, and so unless the set of all allowed partitions is somehow restricted, this cannot be solved.

This should give you some idea about various issues involved in quantization. And now, on to entropy coding:

data → transform → quantization → entropy coding → compressed bitstream

Problem statement

Source (e.g., image, video, speech signal, or quantizer output) → sequence of discrete random variables X(1),…,X(N) (e.g., transformed image pixel values), assumed to be independent and identically distributed over a finite alphabet {a1,…,aM} → Encoder: mapping between source symbols and binary strings (codewords) → binary string.

Requirements:
- minimize the expected length of the binary string;
- the binary string needs to be uniquely decodable, i.e., we need to be able to infer X(1),…,X(N) from it!

Since X(1),…,X(N) are independent and identically distributed, we encode each of them separately. Each can assume any value among {a1,…,aM}. Therefore, our code will consist of M codewords, one for each symbol a1,…,aM:

symbol    codeword
a1        w1
…         …
aM        wM

Unique Decodability

symbol    codeword
a         0
b         1
c         00
d         01

Suppose we receive the bit string 0001. It could be aaab or aad or acb or cab or cd. Not uniquely decodable!

A sufficient condition for unique decodability

Prefix condition: no codeword in the code is a prefix for any other codeword. If the prefix condition is satisfied, then the code is uniquely decodable.

Proof. Take a bit string W that corresponds to two different strings of symbols, A and B. If the first symbols in A and B are the same, discard them and the corresponding portion of W. Repeat until either there are no bits left in W (in this case A=B) or the first symbols in A and B are different. Then one of the codewords corresponding to these two symbols is a prefix for the other.

Visualizing binary strings: form a binary tree where each branch is labeled 0 or 1. Each codeword w can be associated with the unique node of the tree such that the string of 0s and 1s on the path from the root to the node forms w. The prefix condition holds if and only if all the codewords are leaves of the binary tree, i.e., if no codeword is a descendant of another codeword.

Example: no unique decodability, one word is not a leaf

symbol    codeword
a         0
b         1
c         00
d         01

[Binary tree: wa=0 and wb=1 are the two depth-1 nodes; wc=00 and wd=01 are the children of wa, so the codeword wa is not a leaf.]

Example: all codewords are leaves

symbol    codeword
a         1
b         01
c         000
d         001

[Binary tree: wa=1; wb=01; wc=000; wd=001; every codeword is a leaf.]

No path from the root to a codeword contains another codeword. This is equivalent to saying that the prefix condition holds.

All codewords are leaves => unique decodability

To decode a bit string, follow the corresponding path from the root of the binary tree. Each time a leaf is reached, output the codeword and go back to the root.

How to decode the following string?

000001101

Walk the tree from the root: 000 → output c; 001 → output d; 1 → output a; 01 → output b. Final output: cdab.
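This root-to-leaf walk is easy to express in code. A sketch of mine (not from the slides), using the {a: 1, b: 01, c: 000, d: 001} code above; the greedy loop is unambiguous precisely because no codeword is a prefix of another:

```python
# Decode a bit string with a prefix-condition code.
CODE = {"a": "1", "b": "01", "c": "000", "d": "001"}
DECODE = {w: s for s, w in CODE.items()}

def decode(bits):
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in DECODE:            # reached a leaf: emit the symbol,
            out.append(DECODE[buf])  # then go back to the root
            buf = ""
    if buf:
        raise ValueError("dangling bits: " + buf)
    return "".join(out)

print(decode("000001101"))  # -> "cdab"
```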

There are uniquely decodable codes which do not satisfy the prefix condition (e.g., {0, 01}). For any such code, a prefix condition code can be constructed with an identical set of codeword lengths. (E.g., {0, 10} for {0, 01}.) For this reason, we can consider just prefix condition codes.

Entropy coding

Given a discrete random variable X with M possible outcomes (symbols or letters) a1,…,aM and with PMF pX, what is the lowest achievable expected codeword length among all the uniquely decodable codes? The answer depends on pX; Shannon's source coding theorem provides bounds.

How can we construct a uniquely decodable code with the smallest expected codeword length? Answer: Huffman code.

Huffman code

Consider a discrete r.v. X with M possible outcomes a1,…,aM and with PMF pX. Assume that pX(a1) ≤ pX(a2) ≤ … ≤ pX(aM). (If this condition is not satisfied, reorder the outcomes so that it is satisfied.)

Consider the aggregate outcome a12 = {a1,a2} and a discrete r.v. X' such that

$$X'=\begin{cases}a_{12}&\text{if }X=a_1\text{ or }X=a_2\\ X&\text{otherwise}\end{cases}
\qquad
p_{X'}(a)=\begin{cases}p_X(a_1)+p_X(a_2)&\text{if }a=a_{12}\\ p_X(a)&\text{if }a=a_3,\ldots,a_M\end{cases}$$

Suppose we have a tree, T', for an optimal prefix condition code for X'. A tree T for an optimal prefix condition code for X can be obtained from T' by splitting the leaf a12 into two leaves corresponding to a1 and a2. We won't prove this.

Example

letter    pX(letter)
a1        0.10
a2        0.10
a3        0.25
a4        0.25
a5        0.30

Step 1: combine the two least likely letters, a1 and a2, into the aggregate letter a12. New alphabet:

letter    pX(letter)
a12       0.20
a3        0.25
a4        0.25
a5        0.30

Step 2: combine the two least likely letters from the new alphabet, a12 and a3, into a123. New alphabet:

letter    pX(letter)
a123      0.45
a4        0.25
a5        0.30

Step 3: combine the two least likely letters, a4 and a5, into a45. New alphabet:

letter    pX(letter)
a123      0.45
a45       0.55

Step 4: combine the two remaining letters into the root a12345. Done!

[Tree for X: the root a12345 has children a45 (branch 0) and a123 (branch 1); a45 has children a5 (branch 0) and a4 (branch 1); a123 has children a3 (branch 0) and a12 (branch 1); a12 has children a2 (branch 0) and a1 (branch 1).]

The codeword for each leaf is the sequence of branch labels on the path from the root to that leaf:

letter    pX(letter)    codeword
a1        0.10          111
a2        0.10          110
a3        0.25          10
a4        0.25          01
a5        0.30          00

Expected codeword length: 3(0.1) + 3(0.1) + 2(0.25) + 2(0.25) + 2(0.3) = 2.2 bits.
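The greedy merge procedure is naturally implemented with a priority queue. Below is a minimal sketch of mine (not the course's code); depending on tie-breaking it may assign different codewords than the tree above, but the codeword lengths, and hence the expected length, come out the same:

```python
import heapq

def huffman(pmf):
    """Build a Huffman code for a dict {symbol: probability}."""
    # Heap entries: (probability, tiebreak id, {symbol: partial codeword}).
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(pmf.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p0, _, code0 = heapq.heappop(heap)  # the two least likely "letters"
        p1, _, code1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in code0.items()}  # prepend branch labels
        merged.update({s: "1" + w for s, w in code1.items()})
        heapq.heappush(heap, (p0 + p1, count, merged))
        count += 1
    return heap[0][2]

pmf = {"a1": 0.10, "a2": 0.10, "a3": 0.25, "a4": 0.25, "a5": 0.30}
code = huffman(pmf)
print(code)
print(sum(pmf[s] * len(w) for s, w in code.items()))  # -> 2.2 bits
```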

Self-information

Consider again a discrete random variable X with M possible outcomes a1,…,aM and with PMF pX.

Self-information of outcome am is I(am) = −log2 pX(am) bits.

E.g., if pX(am) = 1 then I(am) = 0. The occurrence of am is not at all informative, since it had to occur. The smaller the probability of an outcome, the larger its self-information.

Self-information of X is I(X) = −log2 pX(X) and is a random variable.

Entropy of X is the expected value of its self-information:

$$H(X)=E[I(X)]=-\sum_{m=1}^{M}p_X(a_m)\log_2 p_X(a_m).$$
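For instance, the entropy of the four-value difference distribution from the beginning of these notes can be computed directly (my own quick check):

```python
from math import log2

def entropy(probs):
    """H(X) = -sum of p*log2(p), in bits."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Distribution {22/25, 1/25, 1/25, 1/25} from the 5x5 image example:
print(entropy([22 / 25, 1 / 25, 1 / 25, 1 / 25]))  # about 0.72 bits per symbol
```

The code used earlier spent 30/25 = 1.2 bits per symbol, consistent with the bound below.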

Source coding theorem (Shannon): for any uniquely decodable code, the expected codeword length is ≥ H(X). Moreover, there exists a prefix condition code for which the expected codeword length is < H(X) + 1.

Example

Suppose that X has M = 2^K possible outcomes a1,…,aM, and suppose that X is uniform, i.e., pX(a1) = … = pX(aM) = 2^{−K}. Then

$$H(X)=-\sum_{k=1}^{2^K}2^{-K}\log_2\bigl(2^{-K}\bigr)=K.$$

There are 2^K distinct K-bit sequences. Thus, a fixed-length code for X that uses all these 2^K K-bit sequences as codewords for all the 2^K outcomes of X will have expected codeword length of K. I.e., for this particular random variable, this fixed-length code achieves the entropy of X, which is the lower bound given by the source coding theorem. Therefore, the K-bit fixed-length code is optimal for this X.

A lemma used in proving the source coding theorem

Lemma 1: log2 x ≤ (x−1) log2 e for x > 0.

Proof: differentiate g(x) = (x−1) log2 e − log2 x and show that g(1) = 0 is its minimum.

Kraft inequality

If integers d1,…,dM satisfy the inequality

$$\sum_{m=1}^{M}2^{-d_m}\le 1,\qquad(1)$$

then there exists a prefix condition code whose codeword lengths are these integers. Conversely, the codeword lengths of any prefix condition code satisfy this inequality.

Two facts about binary trees:

- A full binary tree of depth D has 2^D leaves. (Here, depth is D=4 and the number of leaves is 2^4=16.)
- A node at depth d has 2^{D−d} leaf descendants. (Here, D=4, the red node is at depth d=2, and so it has 2^{4−2} = 4 leaf descendants.)

[Figure: a full binary tree of depth D = 4 with a red node at depth 2.]

Suppose d1 ≤ … ≤ dM satisfy (1). Consider the full binary tree of depth dM, and consider all its nodes at depth d1. Assign one of these nodes to symbol a1. Consider all the nodes at depth d2 which are not a1 and not descendants of a1. Assign one of them to symbol a2. Iterate like this M times.

If we have run out of tree nodes to assign after r < M iterations, it means that every leaf in the full binary tree of depth dM is a descendant of one of the first r symbols, a1,…,ar. But note that every node at depth dm has 2^{dM−dm} leaf descendants. Note also that the full tree has 2^{dM} leaves. Therefore, if every leaf in the tree is a descendant of a1,…,ar, then

$$\sum_{m=1}^{r}2^{d_M-d_m}=2^{d_M},\quad\text{i.e.,}\quad\sum_{m=1}^{r}2^{-d_m}=1.$$

Therefore,

$$\sum_{m=1}^{M}2^{-d_m}=\sum_{m=1}^{r}2^{-d_m}+\sum_{m=r+1}^{M}2^{-d_m}=1+\sum_{m=r+1}^{M}2^{-d_m}>1,$$

which contradicts (1). Thus, our procedure can in fact go on for M iterations. After the M-th iteration, we will have constructed a prefix condition code with codeword lengths d1,…,dM.

Suppose d1 ≤ … ≤ dM, and suppose we have a prefix condition code with these codeword lengths. Consider the binary tree corresponding to this code. Complete this tree to obtain a full tree of depth dM. Again use the following facts:
- the full tree has 2^{dM} leaves;
- the number of leaf descendants of the codeword of length dm is 2^{dM−dm}.

The combined number of all leaf descendants of all codewords must be less than or equal to the total number of leaves in the full tree:

$$\sum_{m=1}^{M}2^{d_M-d_m}\le 2^{d_M},\quad\text{i.e.,}\quad\sum_{m=1}^{M}2^{-d_m}\le 1.$$
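A two-line numeric check of inequality (1) for the codes used earlier (my own sanity check):

```python
def kraft_sum(lengths):
    """Left-hand side of the Kraft inequality (1)."""
    return sum(2.0 ** -d for d in lengths)

print(kraft_sum([1, 2, 3, 3]))     # code {1, 01, 000, 001}: 1.0, satisfies (1)
print(kraft_sum([3, 3, 2, 2, 2]))  # Huffman example above: 1.0, satisfies (1)
```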

Why is H(X) ≤ E[C]?

Let dm be the codeword length for am, and let the random variable C be the codeword length for X. Then

$$H(X)-E[C]=-\sum_{m=1}^{M}p_X(a_m)\log_2 p_X(a_m)-\sum_{m=1}^{M}p_X(a_m)\,d_m=\sum_{m=1}^{M}p_X(a_m)\left[\log_2\frac{1}{p_X(a_m)}-\log_2 2^{d_m}\right]$$

$$=\sum_{m=1}^{M}p_X(a_m)\log_2\frac{1}{p_X(a_m)\,2^{d_m}}\le\sum_{m=1}^{M}p_X(a_m)\left(\frac{1}{p_X(a_m)\,2^{d_m}}-1\right)\log_2 e\quad\text{(by Lemma 1)}$$

$$=\left[\sum_{m=1}^{M}2^{-d_m}-\sum_{m=1}^{M}p_X(a_m)\right]\log_2 e=\left[\sum_{m=1}^{M}2^{-d_m}-1\right]\log_2 e\le 0.$$

By the Kraft inequality, this holds for any prefix condition code. But it is also true for any uniquely decodable code.

Why is there a prefix condition code with E[C] < H(X)+1?

Choose dm = ⌈−log2 pX(am)⌉ (where ⌈x⌉ stands for the smallest integer which is ≥ x). Then

$$d_m\ge-\log_2 p_X(a_m)\;\Rightarrow\;-d_m\le\log_2 p_X(a_m)\;\Rightarrow\;2^{-d_m}\le p_X(a_m)\;\Rightarrow\;\sum_{m=1}^{M}2^{-d_m}\le\sum_{m=1}^{M}p_X(a_m)=1.$$

Therefore, the Kraft inequality is satisfied, and we can construct a prefix condition code with codeword lengths d1,…,dM. Also, by construction,

$$d_m-1<-\log_2 p_X(a_m)\;\Rightarrow\;d_m<-\log_2 p_X(a_m)+1$$

$$\Rightarrow\;p_X(a_m)\,d_m<-p_X(a_m)\log_2 p_X(a_m)+p_X(a_m)$$

$$\Rightarrow\;E[C]=\sum_{m=1}^{M}p_X(a_m)\,d_m<-\sum_{m=1}^{M}p_X(a_m)\log_2 p_X(a_m)+\sum_{m=1}^{M}p_X(a_m)=H(X)+1.$$

The expected codeword length can be far from the entropy

Let X have two outcomes, a1 and a2, with probabilities 1−2^{−d} and 2^{−d}, respectively.

Huffman code: 0 for a1; 1 for a2. Expected codeword length: 1.

Entropy: −(1−2^{−d}) log2(1−2^{−d}) + d·2^{−d} → 0 for large d. For example, if d=20, this is about 0.0000204493.

Problem: no codeword can have a fractional number of bits! If we have a source which produces independent random variables X1, X2, …, all identically distributed to X, a single Huffman code can be constructed for several of them, effectively resulting in fractional numbers of bits per random variable.

Example

(X1,X2) will have four outcomes, (a1,a1), (a1,a2), (a2,a1), (a2,a2), with probabilities 1−2^{−d+1}+2^{−2d}, 2^{−d}−2^{−2d}, 2^{−d}−2^{−2d}, and 2^{−2d}, respectively.

Huffman code: 0 for (a1,a1); 10 for (a1,a2); 110 for (a2,a1); 111 for (a2,a2).

Expected codeword length per random variable:

[1−2^{−d+1}+2^{−2d} + 2(2^{−d}−2^{−2d}) + 3(2^{−d}−2^{−2d}) + 3(2^{−2d})]/2

This is about 0.500001 for d=20. Even better compression can be achieved by jointly encoding longer sequences of Xk's.
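A quick numeric check of this per-symbol expected length (my own, with d = 20):

```python
d = 20
p = 2.0 ** -d  # probability of outcome a2
pairs = [((1 - p) ** 2, 1), (p * (1 - p), 2), (p * (1 - p), 3), (p * p, 3)]
print(sum(prob * length for prob, length in pairs) / 2)  # -> about 0.5000014
```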

Source coding theorem for blocks of independent, identically distributed random variables

Suppose we are jointly encoding independent, identically distributed discrete random variables X1,…,XN, each taking values in {a1,…,aM}. For any uniquely decodable code, the expected codeword length per symbol is ≥ H(Xn). Moreover, there exists a prefix condition code for which the expected codeword length per symbol is < H(Xn) + 1/N.

Proof of the source coding theorem for iid sequences

Consider the random vector X = (X1,…,XN). The self-information of its outcome x = (x1,…,xN) is

$$I(\mathbf{x})=-\log_2 p_{X_1,\ldots,X_N}(x_1,\ldots,x_N)=-\log_2\prod_{n=1}^{N}p_{X_n}(x_n)=\sum_{n=1}^{N}\bigl(-\log_2 p_{X_n}(x_n)\bigr)=\sum_{n=1}^{N}I(x_n).$$

Therefore, the entropy of X is

$$H(\mathbf{X})=E\bigl[I(\mathbf{X})\bigr]=E\left[\sum_{n=1}^{N}I(X_n)\right]=\sum_{n=1}^{N}H(X_n)=N\,H(X_n).$$

Therefore, applying the single-symbol source coding theorem to X, we have:

$$H(\mathbf{X})\le E[C_N]<H(\mathbf{X})+1,$$

$$N\,H(X_n)\le E[C_N]<N\,H(X_n)+1,$$

$$H(X_n)\le E[C]<H(X_n)+\frac{1}{N},$$

where E[C_N] is the expected codeword length for the optimal uniquely decodable code for X, and E[C] = E[C_N]/N is the corresponding expected codeword length per symbol.

Arithmetic coding

Another form of entropy coding. More amenable to coding long sequences of symbols than Huffman coding. Can be used in conjunction with on-line learning of conditional probabilities to encode dependent sequences of symbols:
- Q-coder in JPEG (JPEG also has a Huffman coding option)
- QM-coder in JBIG
- MQ-coder in JPEG-2000
- CABAC coder in H.264/MPEG-4 AVC
