You are on page 1of 49

JPEG Compression

JPEG Compression
Steps:

YUV stores images in terms of luminance(brightness) and chrominance(hue) to give better


compression rate
Zigzag:
JPEG - Steps

1. Divide image into 8x8 subimages. [-128, 127] (i.e., DCT spectrum)
The low frequency
components are around the
For each subimage do: upper-left corner of the
2. Shift the gray-levels in the range [-128, 127] spectrum (not centered!).
(i.e., reduces the dynamic range requirements of DCT)

3. Apply DCT; yields 64 coefficients


1 DC coefficient: F(0,0)
63 AC coefficients: F(u,v)
4. Quantize the coefficients (i.e., reduce the amplitude of coefficients that do
not contribute a lot).
Q(u,v): quantization table

• Quantization Table Q[i][j]


Cq(u,v)

C(u,v)

Quantization

Q(u,v) Small magnitude coefficients


have been truncated to zero!

“quality” controls how many of


them will be truncated!
5. Order the coefficients using zig-zag ordering: Creates long runs of zeros (i.e.,
ideal for run-length encoding)

6. Encode coefficients:
6.1 Form “intermediate” symbol sequence.

6.2 Encode “intermediate” symbol sequence into


a binary sequence.
Note: DC coefficient is encoded differently from AC coefficients
Intermediate Symbol Sequence – DC
coeff

symbol_1 (SIZE) symbol_2 (AMPLITUDE)


(6) (61)

SIZE: # bits need to encode


the amplitude
DC Coefficient Encoding

symbol_1 symbol_2
(SIZE) (AMPLITUDE)

predictive
coding:

The DC coefficient of blocks other than the first


block is substituted by the difference between the
DC coefficient of the current block and that of the
previous block.
Intermediate Symbol Sequence – AC
coeff

symbol_1 (RUN-LENGTH, SIZE) symbol_2 (AMPLITUDE) end of block

RUN-LENGTH: run of zeros preceding coefficient


SIZE: # bits for encoding the amplitude of coefficient
Note: If RUN-LENGTH > 15, use symbol (15,0), e.g.,
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2  (15, 0)(3, 2)(2)
Example: AC Coefficients Encoding Symbol_2
(Variable Length Integer (VLI)
pre-computed codes)

# bits

Idea: smaller (and


more common)
values use fewer
bytes and take up
less space than larger
(and less common)
values.

DC coefficients are
encoded similarly.
Symbol_1 (1,4) (12)  (111110110 1100)
(Variable Length Code (VLC)
pre-computed Huffman codes)
VLC VLI
Final Symbol Sequence
Why another standard?
• Low bit-rate compression: At low bit-rates (e.g. below 0.25 bpp for highly detailed gray-
level images) the distortion in JPEG becomes unacceptable.
• Lossless and lossy compression: Need for standard, which provide lossless and lossy
compression in one codestream.
• Large images: JPEG doesn't compress images greater then 64x64K without tiling.
• Single decompression architecture: JPEG has 44 modes, many of them are application
specific and not used by the majority of the JPEG decoders.
• Transmission in noisy environments: in JPEG quality suffers dramatically, when bit
errors are encountered.
• Computer generated imaginary: JPEG is optimized for natural images and performs
badly on computer generated images
• Compound documents: JPEG fails to compress bi-level (text) imagery.
JPEG2000
• The source image is decomposed into components (up to 256) which are (optionally)
decomposed into rectangular tiles (basic unit of the original or reconstructed image)
• A wavelet transform is applied on each tile which is decomposed into different
resolution levels.
• The decomposition levels are made up of subbands of coefficients that describe the
frequency characteristics of local areas of the tile components, rather than across the
entire image component.
• The sub-bands of coefficients are quantized and collected into rectangular arrays of code
blocks.
• The bit planes of the coefficients in a code block (i.e. the bits of equal significance across
the coefficients in a code block) are entropy coded.
• The encoding can be done in such a way that certain regions of interest (ROI) can be coded
at a higher quality than the background.
• Markers are added to the bit stream to allow for error resilience.
• The code stream has a main header at the beginning that describes the original image
and the various decomposition and coding styles that are used to locate, extract, decode
and reconstruct the image with the desired resolution, fidelity, region of interest or other
characteristics.
Pre-Processing
• 1. Image tiling:
• Image may be quite large in comparison to the amount
of memory available to the codec.
• Partition of the original image into rectangular non-
overlapping blocks (tiles), to be compressed independently

2. DC-level shifting:
The codec expects its input sample data to have a
nominal dynamic range that is approximately centered
about zero (0 -- 255 -> -128 -- 128)
If the sample values are unsigned, the nominal dynamic
range of the samples is adjusted by subtracting a
bias from each of the sample values ( 2 P-1 ,
P is the component’s precision)
28
Pre-Processing - Tiling
• All operations, including
component mixing, wavelet
transform, quantization and
entropy coding are
performed independently on
the image tiles.
• Tiling affects the image
quality both subjectively
and objectively
• Smaller tiles create
more tiling artifacts

29
Pre-Processing (cont'd)
• 3. Components transformation:
• Maps data from RGB to YCrCb (Y, Cr, Cb - less statistically
dependent; compress better); serves to reduce the correlation
between components, leading to improved coding efficiency.
There are reversible and irreversible transforms.

Inverse reversible component transform


Forward reversible component transform 30
Pre-Processing - Component Transformations
• Component transformations
improve compression and
allow visually relevant
quantization:
• Irreversible component
transformation (ICT):
 Floating point
 For use with irreversible
(floating point 9/7) wavelet
• Reversible component
transformation (RCT) :
 Integer approximation
 For use with reversible
(integer 5/3) wavelet
31
Wavelet Transform
• Floating point 9/7 wavelet filter for lossy compression
 Best performance at low bit rate
 High implementation complexity, especially for hardware
• Integer 5/3 wavelet filter for lossless coding
 Integer arithmetic, low implementation complexity

• We filter each row and column


with a high pass and low
pass filter, followed by
downsampling by 2 (to keep the
sample rate).
• Now we have divided the tile to sub-bands.
All info (index, position, precincts,
etc.), regarding the single tile, is put
together in a contiguous stream of data 32
Code-blocks, precincts and packets

33
Wavelet Transform
• Two filtering modes:
 Convolution based: performing a series of dot products
between the two filter masks and the extended 1-D signal.
 Lifting based: sequence of very simple filtering operations
for which alternately odd sample values of the signal are
updated with a weighted sum of even sample values, and
vise versa. Lossless 1D DWT

Lossy 1D DWT

P and U stand for Prediction and Update.


34
a= 1.586,  = 0.052,  = 0.882,  = 0.443, K = 1.230
Wavelet Transform
• Symmetric extension:
• To ensure that for the filtering operations that take place at
both boundaries of the signal, one signal sample exists and
spatially corresponds to each co-efficient of the filter mask.

35
DWT (example)
• In JPEG2000 multiple stages
of the DWT are performed.
JPEG2000
supports from 0 to 32 stages.
For natural
images, usually between 4 to 8
stages are used.

36
Quantization
• The wavelet coefficients are quantized using a uniform
quantizer with deadzone. For each subband b, a basic
quantizer step size Δb is used to quantize all the coefficients
in that subband according to:

• Example:
Given a quantizer step of 10 and an encoder input value
of21.82, the quantizer index is determined as shown:

37
Coefficient Bit Modeling
• Wavelet coefficients are associated with different sub-
bands arising from the 2D separable transform applied.
• These coefficients are then arranged into rectangular
blocks within each sub-band, called code-blocks.

38
Coefficient Bit Modeling (cont'd)
• Code-blocks are then coded a bit-plane at a time
starting from the Most Significant Bit-Plane to the
Least Significant Bit-Plane (if some MSB-planes
contain no 1s, the MSB-plane is set to the top most bit-
plane, with at least one 1, the number of bit-planes
which are skipped is then encoded in a header.)

3 6
=
5 1
MSB-plane LSB-plane 39
Coefficient Bit Modeling (cont'd)
• For each bit-plane in a code-block, a special code-
block scan pattern is used for each of three coding
passes.

40
3 Passes Scanning
1. Significance Propagation Pass
• If a bit is insignificant (=0) but at least one of it's eight
neighbors is significant (=1), then it is encoded.
• If the bit at the same time is a 1, it's significance flag is set to 1
and the sign of the symbol is encoded.

2. Magnitude Refinement Pass:


• Samples which are significant and were not coded in the
significance propagation pass.
3. Clean-up Pass:
• It codes all bits which were passed over by the previous two
coding passes (insignificant bits). It is
the first pass for MSB plane.
The encoding is done by the MQ-coder,
a low 41
Quality layers
• The resulting organization
bit streams for each code-block are organized into
quality layers. A quality layer is a collection of some
• consecutive bit-plane coding passes from each tile. Each code-
block can contribute an arbitrary number of bit-plane coding
passes to a layer, but not all coding passes must be assigned to a
quality layer. Every additional layer increases the image quality.

42
Rate• Rate
Control
control is the process by which the code-stream is
altered so that a target bit rate can be reached.
• Once the entire image has been compressed, a post-
processing operation passes over all the
compressed blocks and determines the extent to
which each block's embedded bit stream should be
truncated in order to achieve the target bit rate.
• The ideal truncation strategy is one that minimizes
distortion while still reaching the target bit-rate.
• The code-blocks are compressed independently, so any
bit stream truncation policy can be used.

43
Bit stream organization
• In bit stream organization, the compressed data from the bit-
plane coding passes are separated into packets.
• Then, the packets are multiplexed together in an ordered
manner to form one code-stream.

• Each precinct
generates one
packet, even if
the packet is
empty. A packet
is composed of
a header and
the compressed
data. 44
Bit stream organization

45
Bit stream organization (cont'd)
• There are 5 ways to order the packets, called progressions,
where position refers to the precinct number:
• Quality: layer, resolution, component, position
• Resolution 1: resolution, layer, component, position
• Resolution 2: resolution, position, component, layer
• Position: position, component, resolution, layer
• Component: component, position, resolution, layer

• The sorting mechanisms are ordered from most significant to


least significant. It is also possible for the progression order to
change arbitrarily in the code-stream.

46
Code stream organization (diagram)

47
Decoding
• The decoder basically performs the opposite of the encoder:

• The code-stream is received by the decoder according to the


progression order stated in the header. The coefficients in
the packets are then decoded and dequantized, and the
reverse-ICT is performed:

• In the case of irreversible compression, the decompression


results in loss of data. The resulting image is not exactly like
the original. 48
Compress once, decompress many ways
• In JPEG2000, the compressor decides the maximum
resolution and maximum image quality to be used.
• It is also possible to perform random access by
decompressing only a certain region of the image or a
specific component of the image (e.g. the grayscale
component of a color image). Both can be performed with
varying qualities and resolutions.
• In each case it is
possible to locate,
extract, and decode
the bytes required
for the desired image
product without
decoding the entire 49
code-stream.

You might also like