You are on page 1of 68

Introduction to JPEG and

MPEG
Ingemar J. Cox
University College London
Outline
 Elementary information theory
 Lossless compression
 Quantization

UCL Adastral Park Postgraduate Campus


 Fundamentals of images
 Discrete Cosine Transform (DCT)
 JPEG
 MPEG-1, MPEG-2
Nov 27th 2006 Ingemar J. Cox 2
Bibliography
 D. MacKay, “Information Theory, Inference and learning
Algorithms”, Cambridge University Press, 2003.
http://www.inference.phy.cam.ac.uk/itprnn/book.html
 W. B. Pennebaker and J. L. Mitchell, “JPEG Still Image

UCL Adastral Park Postgraduate Campus


Data Compression Standard”, Chapman Hall, 1993
(ISBN 0-442-01272-1).
 G. K. Wallace, “The JPEG Still-Picture Compression
Standard”, IEEE Trans. On Consumer Electronics, 38,
1, 18-34, 1992.
 http://en.wikipedia.org/wiki/JPEG
Nov 27th 2006 Ingemar J. Cox 3
Bibliography
 http://en.wikipedia.org/wiki/MPEG-2
 T. Sikora, “MPEG Digital Video-Coding
Standards”, IEEE Signal Processing Magazine,

UCL Adastral Park Postgraduate Campus


82-100, September 1997

Nov 27th 2006 Ingemar J. Cox 4


Elementary Information Theory
Elementary Information Theory
 How much information does a symbol convey?
 Intuitively, the more unpredictable or surprising
it is, the more information is conveyed.

UCL Adastral Park Postgraduate Campus


 Conversely, if we strongly expected something,
and it occurs, we have not learnt very much

Nov 27th 2006 Ingemar J. Cox 6


Elementary Information Theory
 If p is the probability that a symbol will occur
 Then the amount of information, I, conveyed is:
1

UCL Adastral Park Postgraduate Campus


I  log 2  
 p
 The information, I, is measured in bits
 It is the optimum code length for the symbol

Nov 27th 2006 Ingemar J. Cox 7


Elementary Information Theory
 The entropy, H, is the average information per
symbol

1
H   p( s) log 2 (

UCL Adastral Park Postgraduate Campus


)
s p( s)

 Provides a lower bound on the compression


that can be achieved

Nov 27th 2006 Ingemar J. Cox 8


Elementary Information theory
 A simple example. Suppose we need to
transmit four possible weather conditions:
1. Sunny

UCL Adastral Park Postgraduate Campus


2. Cloudy
3. Rainy
4. Snowy

 If all conditions are equally likely, p(s)=0.25,


and H=2
 i.e. we need a minimum of 2 bits per symbol

Nov 27th 2006 Ingemar J. Cox 9


Elementary information theory
 Suppose instead that it is:
1. Sunny 0.5 of the time
2. Cloudy 0.25 of the time

UCL Adastral Park Postgraduate Campus


3. Rainy 0.125 of the time, and
4. Snowy 0.125 of the time

 Then the entropy is


1 1 1
H  0.5 log 2  0.25 log 2  2  0.125 log 2
0.5 0.25 0.125
H  0.5 1  0.25  2  2  0.125  3
H  0.5  0.5  0.75  1.75

Nov 27th 2006 Ingemar J. Cox 10


Elementary Information Theory
 Variable length codewords
 Huffman code – integer code lengths
 Arithmetic codes – non-integer code lengths

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 11
Elementary Information Theory
 Huffman code

Weather Probability Information Integer code

UCL Adastral Park Postgraduate Campus


Sunny 0.5 1 0
Cloudy 0.25 2 10
Rainy 0.125 3 110
Snowy 0.125 3 111

Nov 27th 2006 Ingemar J. Cox 12


Elementary Information Theory
 Previous illustration is an example of a lossless
code
 I.e. we are able to recover the information exactly

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 13
Elementary Information Theory
 Note that we have assumed that each symbol
is independent of the other symbols
 I.e. the current symbol provides no information
regarding the next symbol

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 14
Quantization
 Quantization is the process of approximating a
continuous (or range of values) by a (much)
smaller range of values

UCL Adastral Park Postgraduate Campus


 x  0.5 
Q( x, )  Round  
  
 Where Round(y) rounds y to the nearest integer
  is the quantization stepsize

Nov 27th 2006 Ingemar J. Cox 15


Quantization
 Example: =2

-5 -4 -3 -2 -1 0 1 2 3 4 5

UCL Adastral Park Postgraduate Campus


-2 -1 0 1 2

-4 -2 0 2 4

Nov 27th 2006 Ingemar J. Cox 16


Quantization
 Quantization plays an important role in lossy
compression
 This is where the loss happens

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 17
Fundamentals of Images
Fundamentals of images
 An image consists of pixels (picture elements)
 Each pixel represents luminance (and colour)
 Typically, 8-bits per pixel

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 19
Fundamentals of images
 Colour
 Colour spaces (representations)
 RGB (red-green-blue)
 CMY (cyan-magenta-yellow)

UCL Adastral Park Postgraduate Campus


 YUV
• Y = 0.3R+0.6G+0.1B (luminance)
• U=R-Y
• V=B-Y

 Greyscale
 Binary
Nov 27th 2006 Ingemar J. Cox 20
Fundamentals of images
 A TV frame is about 640x480 pixels
 If each pixels is represented by 8-bits for each
colour, then the total image size is

UCL Adastral Park Postgraduate Campus


 640×480*3=921,600 bytes or 7.4Mbits

 At 30 frames per second, this would be


  220Mbits/second

Nov 27th 2006 Ingemar J. Cox 21


Fundamentals of images
 Do we need all these bits?

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 22
Fundamentals of images
 Here is an image represented with 8-bits per
pixel

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 23
Fundamentals of images
 Here is the same image at 7-bits per pixel

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 24
Fundamentals of images
 And at 6-bits per pixel

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 25
Fundamentals of images
 And at 5-bits per pixel

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 26
Fundamentals of images
 And at 4-bits per pixel

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 27
Fundamentals of images
 Do we need all these bits?
 No!

 The previous example illustrated the eye’s

UCL Adastral Park Postgraduate Campus


sensitivity to luminance
 We can build a perceptual model
 Only code what is important to the human visual
system (HVS)
 Usually a function of spatial frequency

Nov 27th 2006 Ingemar J. Cox 28


Fundamentals of Images
 Just as audio has temporal frequencies
 Images have spatial frequencies
 Transforms

UCL Adastral Park Postgraduate Campus


 Fourier transform
 Discrete cosine transform
 Wavelet transform
 Hadamard transform

Nov 27th 2006 Ingemar J. Cox 29


Discrete cosine transform
 Forward DCT

C (u )N 1
 u 
S (u )   s(n) cos (n  0.5) 

UCL Adastral Park Postgraduate Campus


2 n 0  8 
 Inverse DCT

C (u ) N 1  u 
s ( n)  
2 u 0
S (u ) cos
 8
(n  0.5) 

Nov 27th 2006 Ingemar J. Cox 30
Basis functions
 DC term

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 31
Basis functions
 First term

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 32
Basis functions
 Second term

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 33
Basis functions
 Third term

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 34
Basis functions
 Fourth term

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 35
Basis functions
 Fifth term

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 36
Basis functions
 Sixth term

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 37
Basis functions
 Seventh term

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 38
DCT Example
Example
 Signal

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 40
Example
 DCT coefficients are:
 4.2426
 0

UCL Adastral Park Postgraduate Campus


-3.1543
 0
 0
 0
 -0.2242
 0

Nov 27th 2006 Ingemar J. Cox 41


Example: DCT decomposition
 DC term

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 42
Example: DCT decomposition
 2nd AC term

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 43
Example: DCT decomposition
 6th AC term

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 44
Example: summation of DCT terms
 First two non-zero coefficients

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 45
Example: summation of DCT terms
 All 3 non-zero coefficients

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 46
Example
 What if we quantize DCT coefficients?
 =1

 Quantized DCT coefficients are:


UCL Adastral Park Postgraduate Campus


4
 0
 -3
 0
 0
 0
 0
 0

Nov 27th 2006 Ingemar J. Cox 47


Example
 Approximate reconstruction

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 48
Example
 Exact reconstruction

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 49
2-D DCT Transform
 Let i(x,y) represent an image with N rows and
M columns
 Its DCT I(u,v) is given by

UCL Adastral Park Postgraduate Campus


1 M N  (2 x  1)u   (2 y  1)v 
I (u, v)  C (u )C (v)  i( x, y) cos  cos 
4  x 1 y 1  16   16 

 where
1 C (u )  1
C (0) 
2
Nov 27th 2006 Ingemar J. Cox 50
Fundamentals of images
 Discrete cosine transform
 Coefficients are approximately uncorrelated
 Except DC term
 C.f. original 8×8 pixel block

UCL Adastral Park Postgraduate Campus


 Concentrates more power in the low frequency
coefficients
 Computationally efficient

 Block-based DCT
 Compute DCT on 8×8 blocks of pixels

Nov 27th 2006 Ingemar J. Cox 51


Fundamentals of images
 Basis functions for the 8×8 DCT (courtesy
Wikipedia)

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 52
Fundamentals of JPEG
Fundamentals of JPEG

Encoder

DCT Quantizer Entropy coder

UCL Adastral Park Postgraduate Campus


Compressed
image data

IDCT Dequantizer Entropy


decoder

Decoder

Nov 27th 2006 Ingemar J. Cox 54


Fundamentals of JPEG
 JPEG works on 8×8 blocks
 Extract 8×8 block of pixels
 Convert to DCT domain

UCL Adastral Park Postgraduate Campus


 Quantize each coefficient
 Different stepsize for each coefficient
 Based on sensitivity of human visual system

 Order coefficients in zig-zag order


 Entropy code the quantized values
Nov 27th 2006 Ingemar J. Cox 55
Fundamentals of JPEG
 A common quantization table is
16 11 10 16 24 40 51 61
12 12 14 19 26 58 60 55

UCL Adastral Park Postgraduate Campus


14 13 16 24 40 57 69 56
14 17 22 29 51 87 80 62
18 22 37 56 68 109 103 77
24 35 55 64 81 104 113 92
49 64 78 87 103 121 120 101
72 92 95 98 112 100 103 99
Nov 27th 2006 Ingemar J. Cox 56
Fundamentals of JPEG
 Zig-zag ordering
0 1 5 6 14 15 27 28
2 4 7 13 16 26 29 42

UCL Adastral Park Postgraduate Campus


3 8 12 17 25 30 41 43
9 11 18 24 31 40 44 53
10 19 23 32 39 45 52 54
20 22 33 38 46 51 55 60
21 34 37 47 50 56 59 61
35 36 48 49 57 58 62 63
Nov 27th 2006 Ingemar J. Cox 57
Fundamentals of JPEG
 Entropy coding
 Run length encoding followed by
 Huffman
 Arithmetic

UCL Adastral Park Postgraduate Campus


 DC term treated separately
 Differential Pulse Code Modulation (DPCM)

 2-step process
1. Convert zig-zag sequence to a symbol sequence
2. Convert symbols to a data stream
Nov 27th 2006 Ingemar J. Cox 58
Fundamentals of JPEG
 Modes
 Sequential
 Progressive
 Spectral selection

UCL Adastral Park Postgraduate Campus


• Send lower frequency coefficients first
 Successive approximation
• Send lower precision first, and subsequently refine
 Lossless
 Hierarchical
 Send low resolution image first

Nov 27th 2006 Ingemar J. Cox 59


Fundamentals of MPEG-1/2
Fundamentals of MPEG
 A sequence of 2D images
 Temporal correlation as well as spatial
correlation

UCL Adastral Park Postgraduate Campus


 TV broadcast
 Frame-based
 Field-based

Nov 27th 2006 Ingemar J. Cox 61


MPEG
 Moving Picture Experts Group
 Standard for video compression
 Similarities with JPEG

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 62
MPEG
 Design is a compromise between
 Bit rate
 Encoder/decoder complexity
 Random access capability

UCL Adastral Park Postgraduate Campus


Nov 27th 2006 Ingemar J. Cox 63
MPEG
 Images
 Spatial redundancy
 Perceptual redundancy

UCL Adastral Park Postgraduate Campus


 Video
 Spatial redundancy
 Intraframe coding
 Temporal redundancy
 Interframe coding
 Perceptual redundancy

Nov 27th 2006 Ingemar J. Cox 64


MPEG
 Consider a sequence of n frames of video.
 It consists of:
 I-frames

UCL Adastral Park Postgraduate Campus


 P-frames
 B-frames

 A sequence of one I-frame followed by P- and


B-frames is known as a GOP
 Group of Pictures
 E.g. IBBPBBPBBPBBP

Nov 27th 2006 Ingemar J. Cox 65


MPEG
 I-frames
 Intraframe coded
 No motion compensation

 P-frames

UCL Adastral Park Postgraduate Campus


 Interframe coded
 Motion compensation
• Based on past frames only

 B-frames
 Interframe coded
 Motion compensation
• Based on past and future frames

Nov 27th 2006 Ingemar J. Cox 66


MPEG
 Motion-compensated prediction
 Divide current frame, i, into disjoint 16×16
macroblocks
 Search a window in previous frame, i-1, for closest

UCL Adastral Park Postgraduate Campus


match
 Calculate the prediction error
 For each of the four 8×8 blocks in the macroblock,
perform DCT-based coding
 Transmit motion vector + entropy coded prediction
error (lossy coding)

Nov 27th 2006 Ingemar J. Cox 67


MPEG
 Like JPEG, the DC term is treated separately
 DPCM

 B-frame compression high

UCL Adastral Park Postgraduate Campus


 Need buffer and delay

Nov 27th 2006 Ingemar J. Cox 68

You might also like