
Discrete Cosine Transform (DCT)

“Many people would sooner die than think. In fact they do.”
- Bertrand Russell
Review
1. Convert from RGB to YCbCr
2. Pixels are grouped into 8x8 blocks called
data units
3. The Discrete Cosine Transform (DCT) is
applied to each data unit to create an
8x8 map of frequency components
that represent the average pixel value
and successively higher-frequency
changes within the data unit
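The first two review steps can be sketched as follows. This is a minimal illustration, not the slides' own code: the conversion constants are the commonly used JFIF full-range values (an assumption, since the slides do not list them), and the helper names `rgb_to_ycbcr` and `split_into_data_units` are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr (JFIF full-range constants, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def split_into_data_units(plane, size=8):
    """Split a 2D list into size x size blocks, row by row.

    Assumes the height and width of `plane` are multiples of `size`.
    """
    units = []
    for top in range(0, len(plane), size):
        for left in range(0, len(plane[0]), size):
            units.append([row[left:left + size] for row in plane[top:top + size]])
    return units


# A made-up 16x16 plane of values yields four 8x8 data units.
plane = [[(x + y) % 256 for x in range(16)] for y in range(16)]
units = split_into_data_units(plane)
print(len(units), "data units of", len(units[0]), "x", len(units[0][0]))
```

Each data unit would then be passed to the DCT on its own.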
Overview
 What are image transforms?
 What are they used for?
 Wave transforms (cos, sin)
 DCT
What is an image transform?
 Computers store images as an NxN
matrix of values that represent pixels
 For example
 In a 256-level grayscale image, each pixel is stored
as a value between 0 and 255
 0 = black pixel
 255 = white pixel
 Values in between are shades of gray
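As a small illustration of this representation, a grayscale image can be held as a list of rows. The 4x4 pixel values and the helper `describe` are made up for the example.

```python
# A hypothetical 4x4 grayscale image: each entry is one pixel,
# 0 = black, 255 = white, anything in between is a shade of gray.
image = [
    [0, 85, 170, 255],
    [0, 85, 170, 255],
    [0, 85, 170, 255],
    [0, 85, 170, 255],
]


def describe(value):
    """Name the shade that a single pixel value stands for."""
    if value == 0:
        return "black"
    if value == 255:
        return "white"
    return "gray"


labels = [describe(value) for value in image[0]]
print(labels)
```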
What are they used for?
 We can apply mathematical functions to
the matrices to rotate, skew,
compress…in other words TRANSFORM
an image
 Remember quad trees?
 Let's look at an example
Image Transform Example

M =  1  2  3 -4  5
     2  3 -4  5  1
     4 -5  2  1  7

*This is obviously not the real matrix for this image; just pretend for the sake of the example
Image Transform Example

Suppose ƒ(M) = transform function applied to the matrix M

In this case it's an invert function: each row is reversed, which mirrors the image left to right

ƒ(M) =  5 -4  3  2  1
        1  5 -4  3  2
        7  1  2 -5  4
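The transform in this example amounts to reversing every row of the matrix. A minimal sketch, with the hypothetical helper name `mirror_rows`:

```python
def mirror_rows(matrix):
    """Return a copy of `matrix` with every row reversed (a left-to-right flip)."""
    return [row[::-1] for row in matrix]


M = [
    [1, 2, 3, -4, 5],
    [2, 3, -4, 5, 1],
    [4, -5, 2, 1, 7],
]

flipped = mirror_rows(M)
for row in flipped:
    print(row)
```

Applying the same function a second time restores the original matrix, so this particular transform loses no information.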
More practical example
M =   4  7  6  9
      6  9  3  6
      5  4  7  6
      2  4  5  9

M’ =   8.5  11.5  10.5  15
       1.5   3.5  -1.5   0
      -2.5  -0.5   0.5   3
       0.5  -0.5   2.5   0

Apply some arbitrary transform to M to obtain M’
Notice how the larger values (low frequency) are now positioned toward the top and the smaller values (high frequency) are positioned toward the bottom
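The slides do not name the transform used here. One candidate that comes close is multiplying M on the left by a 4x4 matrix of +1/2 and -1/2 entries (a Walsh-Hadamard-style matrix). Treat that as an assumption, not as the author's stated method. With M as printed, columns 1, 3 and 4 of the product match M’ exactly. Column 2 matches only if the second entry in row 2 of M is 8 rather than 9, which suggests a typo in one of the two matrices.

```python
# Assumed transform matrix: rows ordered by increasing number of sign changes.
W = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, -0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


M = [
    [4, 7, 6, 9],
    [6, 9, 3, 6],
    [5, 4, 7, 6],
    [2, 4, 5, 9],
]

M_prime = matmul(W, M)  # M exactly as printed on the slide

# Same matrix with the suspect entry changed from 9 to 8.
M_alt = [row[:] for row in M]
M_alt[1][1] = 8
M_alt_prime = matmul(W, M_alt)

for row in M_alt_prime:
    print(row)
```

The first row of W averages each column (the low-frequency part). The remaining rows measure progressively faster sign changes, which is why the large values collect in the first row of the result.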
Wave Transforms
 DCT and Fourier transforms convert
images from the time domain (for images,
the spatial domain) to the frequency
domain to decorrelate pixels
 Time-domain
 x-axis = time, y-axis = amplitude
 Frequency-domain
 x-axis = frequency, y-axis = amplitude
Wave Transforms
[Figure: plot with axes labeled Amplitude and Frequency]
DCT: One Dimensional

1 n 1
 ( 2t  1 f ) 
Gf  Cf  pt cos
 

2 t 0
 2 n 
where
 1  n = size
 , f  0
Cf   2  p = pixel
1, f  0 
  G = coefficients
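The one-dimensional formula can be sketched directly in code: each coefficient G_f is 1/2 times C_f times the sum over t of p_t cos((2t+1)fπ/2n), with C_f = 1/√2 for f = 0 and 1 otherwise. The function name `dct_1d` is hypothetical. Note that the fixed factor 1/2 gives an orthonormal transform only for n = 8; for other sizes the usual factor is sqrt(2/n).

```python
import math


def dct_1d(pixels):
    """1D DCT written term by term from the formula (factor 1/2 assumes n = 8)."""
    n = len(pixels)
    coefficients = []
    for f in range(n):
        c_f = 1 / math.sqrt(2) if f == 0 else 1.0
        total = sum(
            pixels[t] * math.cos((2 * t + 1) * f * math.pi / (2 * n))
            for t in range(n)
        )
        coefficients.append(0.5 * c_f * total)
    return coefficients


# Eight identical pixels: everything ends up in G_0, the "average" term.
flat = dct_1d([100] * 8)
print([round(value, 3) for value in flat])
```

A row of identical pixels has no variation, so only the f = 0 coefficient is nonzero (up to floating-point error).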
DCT: 2D
n 1 n 1
1  (2 y  1) j 
Gij 
2n
C iC j
x 0 y 0
pxy cos
 2 n


 (2 x  1)i 
cos 
 2n 
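A direct, unoptimized sketch of the two-dimensional formula follows. It reads the leading factor as 1/sqrt(2n), which equals 1/4 for n = 8. That reading is an assumption, chosen because it agrees with the one-dimensional factor of 1/2 applied once per dimension. The function name `dct_2d` is hypothetical.

```python
import math


def dct_2d(block):
    """2D DCT of an n x n block, computed straight from the double sum."""
    n = len(block)

    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    scale = 1 / math.sqrt(2 * n)
    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (
                        block[x][y]
                        * math.cos((2 * y + 1) * j * math.pi / (2 * n))
                        * math.cos((2 * x + 1) * i * math.pi / (2 * n))
                    )
            result[i][j] = scale * c(i) * c(j) * total
    return result


# A uniform 8x8 data unit: only the top-left coefficient is nonzero.
uniform = [[128] * 8 for _ in range(8)]
G = dct_2d(uniform)
print(round(G[0][0], 3))
```

This version does n^4 multiplications per block, which is fine for illustrating n = 8. Practical encoders use faster factorizations.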
DCT
 Remember that JPEG breaks an image
into 8x8 units
 So for DCT n = 8
 Each pixel is scanned and the transform
is applied
 Just like our example at the beginning, we
get a matrix of new values (the DCT coefficients)
DCT: Frequency Distro
DCT: Why does it do this?
 DCT takes advantage of redundancies
in the data by concentrating the information
of correlated pixels into a few
low-frequency coefficients
 Higher frequencies = smaller coefficient values (typically)
 Lower frequencies = larger coefficient values (typically)
 If lossy compression is acceptable, then
each coefficient in a data unit can be divided by a
quantization coefficient (QC) and rounded
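The quantization step can be sketched as a divide-and-round pass over the coefficients. For simplicity this example uses one QC for the whole block, and the coefficient values are made up. JPEG itself uses a table with a separate value per coefficient position. The names `quantize` and `dequantize` are hypothetical.

```python
def quantize(coefficients, qc):
    """Divide every coefficient by the quantization coefficient and round."""
    return [[round(value / qc) for value in row] for row in coefficients]


def dequantize(quantized, qc):
    """Approximate the original coefficients by multiplying back."""
    return [[value * qc for value in row] for row in quantized]


# Made-up 2x2 corner of a coefficient matrix, large value at the top left.
coefficients = [
    [1024.0, 37.0],
    [-21.0, 3.0],
]

quantized = quantize(coefficients, 16)
restored = dequantize(quantized, 16)
print(quantized)
print(restored)
```

The small high-frequency value becomes 0 and the others come back only approximately, which is where the loss in lossy compression comes from.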
DCT (cont)
Summary
 What are image transforms?
 What are they used for?
 Wave transforms (cos, sin)
 DCT
Sources
 Salomon D. A Guide to Data Compression Methods, 2002
 Salomon D. Data Compression: The Complete Reference, 2000
 Lehar S. “An Intuitive Explanation of Fourier Theory”. http://cns-alumni.bu.edu/~slehar/fourier/fourier.html February 2005
 Marshall D. “Discrete Cosine Transform”. http://www.cs.cf.ac.uk/Dave/Multimedia/node231.html#DCTbasis February 2005
 Cabeen K & Gent P. “Image Compression and the Discrete Cosine Transform”. http://online.redwoods.cc.ca.us/instruct/darnold/laproj/Fall98/PKen/dct.pdf February 2005
Questions?
