
Data Compression

ANISHA M. LAL
Data Compression

 Data compression aims to reduce the amount of data required to represent a
given quantity of information while preserving as much information as possible.
Image Compression
 The goal of image compression is to reduce the
amount of data required to represent a digital image.
Types of Compression

 Lossless
 Information preserving
 Low compression ratios

 Lossy
 Not information preserving
 High compression ratios

 Trade-off: image quality vs compression ratio


Compression Ratio

 Compression ratio: C = n1 / n2, where n1 is the number of information-carrying
units (e.g., bits) in the original representation and n2 is the number in the
compressed representation.
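A quick numeric illustration with hypothetical sizes (not from the slides):

# A 512x512, 8-bit grayscale image compressed to 32,768 bytes:
n1 = 512 * 512 * 1          # bytes in the original representation
n2 = 32_768                 # bytes in the compressed representation
ratio = n1 / n2             # compression ratio C = n1 / n2
print(ratio)                # 8.0, i.e. an 8:1 compression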
Data Redundancy

 Interpixel Redundancy
 Coding Redundancy
 Psychovisual Redundancy

 Compression attempts to reduce one or more of these redundancy types.
Redundancy Types

 Inter-pixel Redundancy: redundancy arising from statistical dependencies
among pixels, especially between neighbouring pixels.

 Coding Redundancy: an uncompressed image is usually coded with a fixed
number of bits per pixel. For example, an image with 256 gray levels is
represented by an array of 8-bit integers.

 Psycho-visual Redundancy: redundancy arising from the human eye's unequal
sensitivity to different image information. Eliminating some of the visually
less important information is therefore acceptable.
Lossless Compression Techniques

 Interpixel Redundancy
 Based upon frequency of occurrences
 Run length encoding
 Diatomic encoding
 Bit plane encoding

 Coding Redundancy
 Based upon probability of occurrences
 Huffman encoding
 Arithmetic encoding
Run length Encoding (Interpixel
Redundancy)
 Encodes repeating string of symbols (i.e., runs) using a
few bytes: (symbol, count)

1 1 1 1 1 0 0 0 0 0 0 1 → (1,5) (0,6) (1,1)

a a a b b b b b b c c → (a,3) (b,6) (c,2)

 Can compress any type of data, but does not achieve compression ratios as
high as other compression methods.
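A minimal run-length encoder/decoder sketch in Python (illustrative; the function names are mine, not from the slides):

def rle_encode(seq):
    """Encode a sequence as (symbol, run_length) pairs."""
    runs = []
    for s in seq:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([s, 1])       # start a new run
    return [(s, n) for s, n in runs]

def rle_decode(runs):
    """Expand (symbol, run_length) pairs back to the original sequence."""
    out = []
    for s, n in runs:
        out.extend([s] * n)
    return out

# Example from the slide: 1 1 1 1 1 0 0 0 0 0 0 1 -> (1,5) (0,6) (1,1)
print(rle_encode([1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1]))   # [(1, 5), (0, 6), (1, 1)]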
Bit-plane Encoding (Interpixel
Redundancy)
 Process each bit plane individually:

(1) Decompose the image into a series of binary images (one per bit plane).
(2) Compress each binary image (e.g., using run-length coding).
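A sketch of the decomposition step in Python/NumPy (assuming an 8-bit grayscale image; compression of each plane is left out):

import numpy as np

def bit_planes(img):
    """Split an 8-bit grayscale image into 8 binary bit-plane images.

    img: 2-D uint8 array; returns a list of 8 binary arrays,
    bit 0 (least significant) first.
    """
    return [(img >> b) & 1 for b in range(8)]

img = np.random.randint(0, 256, (4, 4), dtype=np.uint8)
planes = bit_planes(img)
# Each plane could now be compressed separately, e.g. with run-length coding.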
Huffman Encoding (Coding
Redundancy)

 A variable-length coding technique.
 Symbols are encoded one at a time.
 There is a one-to-one correspondence between source symbols and code words.
 Optimal code (i.e., minimizes the number of code symbols per source symbol).
Huffman Encoding Technique
• Forward Pass
1. Sort probabilities per symbol
2. Combine the lowest two probabilities
3. Repeat Step 2 until only two probabilities remain.
Cont…
 Backward Pass
Assign code symbols going backwards
Huffman Decoding
 Coding/decoding can be implemented using a look-up
table.
 Decoding can be done unambiguously.
Cont…

 For example, a data stream contains only the five symbols A, B, C, D, E with
the following probabilities:
 P(A) = 0.16
 P(B) = 0.51
 P(C) = 0.09
 P(D) = 0.13
 P(E) = 0.11
Cont…

Symbol   Code
A        011
B        1
C        000
D        010
E        001
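A minimal Huffman construction sketch in Python for the probabilities above (illustrative; the exact 0/1 labels can differ from the slide's table depending on tie-breaking, but the code lengths agree):

import heapq, itertools

def huffman_codes(probs):
    """Build Huffman codes for a {symbol: probability} dict.

    Repeatedly merge the two least-probable nodes; the code tree is
    represented implicitly by growing each symbol's code string.
    """
    counter = itertools.count()                       # tie-breaker for equal probabilities
    heap = [[p, next(counter), {sym: ""}] for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)               # least probable node
        p2, _, c2 = heapq.heappop(heap)               # second least probable node
        for sym in c1:
            c1[sym] = "0" + c1[sym]
        for sym in c2:
            c2[sym] = "1" + c2[sym]
        heapq.heappush(heap, [p1 + p2, next(counter), {**c1, **c2}])
    return heap[0][2]

codes = huffman_codes({"A": 0.16, "B": 0.51, "C": 0.09, "D": 0.13, "E": 0.11})
print(codes)   # B gets a 1-bit code, the other four symbols get 3-bit codes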
Arithmetic Encoding (Coding
Redundancy)

 Sequences of source symbols are encoded together (instead of one at a time).

 No one-to-one correspondence between source symbols and code words.

 Slower than Huffman coding but typically achieves better compression.
Arithmetic Coding (cont’d)

 A sequence of source symbols is assigned a single arithmetic code word which
corresponds to a sub-interval in [0, 1).

Example: α1 α2 α3 α3 α4 → [0.06752, 0.0688), coded for instance as 0.068

 We start with the interval [0, 1); as the number of symbols in the message
increases, the interval used to represent it becomes smaller.
 Smaller intervals require more information units (i.e., bits) to be represented.
Arithmetic Coding (cont’d)

Encode message: α1 α2 α3 α3 α4

1) Start with the interval [0, 1)

2) Subdivide [0, 1) according to the probabilities of the symbols αi

3) Update (narrow) the interval as each source symbol is processed
Example

Encode: α1 α2 α3 α3 α4 → [0.06752, 0.0688), or 0.068
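A sketch of the interval-narrowing computation in Python. The probabilities P(α1) = P(α2) = P(α4) = 0.2 and P(α3) = 0.4 are an assumption (they are not shown in this extract), chosen because they reproduce the interval [0.06752, 0.0688):

def arithmetic_interval(message, probs):
    """Return the [low, high) sub-interval that encodes a message.

    probs: dict mapping each symbol to its probability; cumulative
    ranges are taken in the dict's iteration order.
    """
    # Pre-compute each symbol's cumulative [start, end) range inside [0, 1).
    ranges, start = {}, 0.0
    for sym, p in probs.items():
        ranges[sym] = (start, start + p)
        start += p

    low, high = 0.0, 1.0
    for sym in message:
        width = high - low
        s, e = ranges[sym]
        low, high = low + width * s, low + width * e   # narrow the interval
    return low, high

probs = {"a1": 0.2, "a2": 0.2, "a3": 0.4, "a4": 0.2}
print(arithmetic_interval(["a1", "a2", "a3", "a3", "a4"], probs))
# approximately (0.06752, 0.0688); any number in this interval, e.g. 0.068, encodes the message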
Lossy Compression
 Transform the image into a domain where compression
can be performed more efficiently (i.e., reduce interpixel
redundancies).
JPEG Compression

 Accepted as an international image compression standard in 1992.
 It uses the DCT for handling interpixel redundancy.

 Modes of operation:
(1) Sequential DCT-based encoding
(2) Progressive DCT-based encoding
(3) Lossless encoding
(4) Hierarchical encoding
JPEG Compression
(Sequential DCT-based encoding)

[Block diagram: 8×8 blocks → forward DCT → quantizer → entropy encoder → compressed data;
decoding reverses the chain: entropy decoder → dequantizer → inverse DCT]
JPEG Steps
1. Divide the image into 8x8 subimages;

For each subimage do:

2. Shift the gray levels to the range [-128, 127]
- the DCT requires the range to be centered around 0

3. Apply the DCT → 64 coefficients
- 1 DC coefficient: F(0,0)
- 63 AC coefficients: F(u,v)
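A sketch of steps 2 and 3 for a single 8×8 block, using a direct NumPy implementation of the orthonormal 2-D DCT-II (real codecs use fast transforms):

import numpy as np

def dct2(block):
    """2-D DCT-II of an 8x8 block (orthonormal), computed from the definition."""
    N = block.shape[0]
    n = np.arange(N)
    # 1-D DCT-II basis matrix C, with C[u, x] = alpha(u) * cos((2x+1) u pi / 2N)
    C = np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
    C *= np.sqrt(2.0 / N)
    C[0, :] /= np.sqrt(2.0)
    return C @ block @ C.T            # separable: transform rows, then columns

block = np.random.randint(0, 256, (8, 8)).astype(float)
shifted = block - 128                  # step 2: centre the data around 0
coeffs = dct2(shifted)                 # step 3: 64 coefficients
dc = coeffs[0, 0]                      # the DC coefficient F(0,0)
ac = coeffs.reshape(-1)[1:]            # the 63 AC coefficients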
Example

[Figure: an 8×8 sub-image shifted to [-128, 127] and its DCT coefficients (non-centered spectrum)]
JPEG Steps

4. Quantize the coefficients (i.e., reduce the amplitude of coefficients that do
not contribute much):

Fq(u,v) = round( F(u,v) / Q(u,v) )

Q(u,v): quantization table
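A sketch of step 4 in Python. The table below is the widely published baseline luminance quantization table; any valid Q(u,v) table could be substituted, and F stands for the 8×8 block of DCT coefficients from step 3:

import numpy as np

# Baseline luminance quantization table (values as commonly published for JPEG).
Q = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

def quantize(F, Q):
    """Step 4: divide each DCT coefficient by its table entry and round."""
    return np.round(F / Q).astype(int)

def dequantize(Fq, Q):
    """Decoder side: multiply back; the rounding error is where information is lost."""
    return Fq * Q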


Example

[Figure: the quantization table Q[i][j] and the quantized DCT coefficients of the example block]
JPEG Steps (cont’d)
5. Order the coefficients using zig-zag ordering
- places low-frequency coefficients (which are typically non-zero) first
- creates long runs of zeros (i.e., ideal for run-length encoding)

[Figure: zig-zag scan order over the 8×8 coefficient block]
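One way to generate and apply the zig-zag scan in Python (a sketch; the traversal direction alternates along each anti-diagonal):

import numpy as np

def zigzag(block):
    """Step 5: read an 8x8 coefficient block in zig-zag order (low to high frequency)."""
    n = block.shape[0]
    # Group indices by anti-diagonal u+v; reverse the direction on every other diagonal.
    order = sorted(((u, v) for u in range(n) for v in range(n)),
                   key=lambda p: (p[0] + p[1],
                                  p[0] if (p[0] + p[1]) % 2 else p[1]))
    return np.array([block[u, v] for u, v in order])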
JPEG Steps (cont’d)

6. Encode coefficients:

6.1 Form “intermediate” symbol sequence

6.2 DC coefficients: predictive encoding

6.3 AC coefficients: variable length coding


DC Coefficients Encoding

The DC coefficient is represented by two intermediate symbols:

symbol_1 = (SIZE): the number of bits needed to encode the amplitude
symbol_2 = (AMPLITUDE): the value itself

Predictive coding: instead of the DC value itself, encode the difference from
the DC coefficient of the previous block.
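A sketch of the DC predictive step in Python (the initial predictor is taken as 0 here, and the subsequent entropy coding of the (SIZE, AMPLITUDE) pairs is omitted):

def dc_intermediate_symbols(dc_values):
    """Encode each block's DC coefficient as the difference from the previous
    block's DC, then as a (SIZE, AMPLITUDE) pair.  SIZE is the number of bits
    needed for the magnitude of the difference."""
    symbols, prev = [], 0
    for dc in dc_values:
        diff = dc - prev
        prev = dc
        size = 0 if diff == 0 else abs(diff).bit_length()
        symbols.append((size, diff))          # symbol_1 = SIZE, symbol_2 = AMPLITUDE
    return symbols

# e.g. DC values 15, 12, 18 -> differences 15, -3, 6 -> [(4, 15), (2, -3), (3, 6)]
print(dc_intermediate_symbols([15, 12, 18]))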
