Multimedia System: Chapter Eight: Multimedia Data Compression
Introduction
Compression: the process of coding that effectively reduces the total number of bits needed to represent certain information.
The Need for Compression?
Fundamentals
Three approaches:
Reduce CODING Redundancy: use fewer bits to represent frequently occurring symbols.
Reduce INTERPIXEL / INTERFRAME Redundancy: neighboring pixels (and neighboring frames) have similar values.
Reduce PSYCHOVISUAL Redundancy: the human visual system cannot simultaneously distinguish all colors.
Coding Redundancy
Use fewer bits to represent frequently occurring symbols.
Let pr(rk) = nk / n, k = 0, 1, ..., L-1, where L is the number of gray levels.
Let rk be represented by l(rk) bits.
Therefore the average number of bits required to represent each pixel is
Lavg = sum over k = 0, ..., L-1 of l(rk) pr(rk)    (A)
Consider equation (A): it makes sense to assign fewer bits to those rk for which pr(rk) is large, in order to reduce the sum.
This achieves data compression and results in a variable-length code.
More probable gray levels are assigned fewer bits.
Compute the average code length for the
following code:
Symbol(Xi) Frequency(Xi) code
A 5 0000
B 8 0001
C 11 001
D 20 01
E 26 10
F 30 11
Symbol (Xi)   Probability P(Xi)   Code Length L(Xi)   P(Xi)*L(Xi)
A             0.05                4                   0.20
B             0.08                4                   0.32
C             0.11                3                   0.33
D             0.20                2                   0.40
E             0.26                2                   0.52
F             0.30                2                   0.60
The average code word length E(L) per source symbol is 2.37 bits.
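The table above is easy to check with a short script; a minimal sketch, using the frequencies and code lengths from the example:

```python
# Frequencies and assigned code lengths from the example table.
freqs = {'A': 5, 'B': 8, 'C': 11, 'D': 20, 'E': 26, 'F': 30}
lengths = {'A': 4, 'B': 4, 'C': 3, 'D': 2, 'E': 2, 'F': 2}

total = sum(freqs.values())  # 100 symbols in all
# E(L) = sum over all symbols of P(Xi) * L(Xi)
avg_len = sum(freqs[s] / total * lengths[s] for s in freqs)
print(round(avg_len, 2))  # 2.37
```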
Example: Variable-length coding (reversible)
Code 1: Natural 3-bit code (the slide's code-comparison table, showing what fraction of the data is redundant, is not reproduced)
Psychovisual Redundancy
Question: which image looks more different from the original? (the original and processed comparison images are not reproduced)
Types of Compression
Lossless compression
the original can be recovered exactly; higher quality, bigger files
used for legal and medical documents, computer programs
exploits only data redundancy
also called error-free compression
Lossy compression
only an approximation of the original can be recovered; lower quality, smaller files
used for digital audio, images, and video, where some error or loss can be tolerated
exploits both data redundancy and human perception properties
also called error-containing compression
Lossless Compression
Common methods to remove redundancy
Run Length Coding
Huffman Coding
Dictionary-Based Coding
Arithmetic Coding, etc.
Run Length Coding (RLC)
Run-length coding is a very widely used and simple
compression technique
In this method we replace runs of symbols with (run-length, symbol) pairs
Example:
Input symbols: 7,7,7,7,7,90,9,9,9,1,1,1 require 12 bytes
Using RLC (a symbol that occurs only once is written without a run length): 5,7,90,3,9,3,1 = 7 bytes
Compression ratio: 12/7 ≈ 1.7
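A minimal run-length coder along these lines; note it always emits the (run, symbol) pair, so the lone 90 costs two values instead of the slide's one:

```python
def rle_encode(symbols):
    """Replace each run of equal symbols with a (run_length, symbol) pair."""
    runs = []
    for s in symbols:
        if runs and runs[-1][1] == s:
            runs[-1][0] += 1          # extend the current run
        else:
            runs.append([1, s])       # start a new run
    return [(n, s) for n, s in runs]

def rle_decode(pairs):
    """Invert rle_encode: expand each (run_length, symbol) pair."""
    out = []
    for n, s in pairs:
        out.extend([s] * n)
    return out

data = [7, 7, 7, 7, 7, 90, 9, 9, 9, 1, 1, 1]
encoded = rle_encode(data)
print(encoded)  # [(5, 7), (1, 90), (3, 9), (3, 1)]
assert rle_decode(encoded) == data
```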
Huffman Coding
Assigns fewer bits to symbols that appear more
often and more bits to the symbols that appear less
often
Efficient when occurrence probabilities vary
widely
It constructs a binary tree in a bottom-up manner, then uses the tree to find the codeword for each symbol.
Huffman Coding-Algorithm
1. Put all symbols on a list sorted according to their
frequency counts.
2. Repeat until the list has only one symbol left:
a. From the list pick the two symbols with the lowest frequency counts. Form a Huffman subtree that has these two symbols as child nodes and create a parent node.
b. Assign the sum of the children's frequency counts to the parent
and insert it into the list such that the order is maintained.
c. Delete the children from the list.
3. Assign a codeword to each leaf based on the path from the root, assigning 0 to one branch and 1 to the other at each internal node.
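The steps above can be sketched in Python; a minimal version that tracks only codeword lengths (tie-breaking among equal counts can change the individual codes, but not the weighted total). The counts are taken from the example that follows:

```python
import heapq
from itertools import count

def huffman_code_lengths(freqs):
    """Repeatedly merge the two lowest-count entries (steps 1-2);
    each merge adds one bit to every symbol inside the merged subtree."""
    tiebreak = count()  # keeps heap comparisons away from the dicts
    heap = [(f, next(tiebreak), {sym: 0}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]  # symbol -> codeword length

# Occurrence counts from the example below.
freqs = {'S1': 30, 'S2': 10, 'S3': 20, 'S4': 5, 'S5': 10, 'S6': 25}
lengths = huffman_code_lengths(freqs)
total_bits = sum(freqs[s] * lengths[s] for s in freqs)
print(total_bits)  # 240
```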
Huffman Coding-Example
Source Symbol   Number of occurrences   Codeword assigned   Length of codeword
S1              30                      00                  2
S2              10                      101                 3
S3              20                      11                  2
S4              5                       1001                4
S5              10                      1000                4
S6              25                      01                  2
Tree construction (repeatedly merge the two lowest-probability entries; at each merge the higher-probability branch is assigned 0 and the lower 1):
S5 (0.10) + S4 (0.05) -> S5,4 (0.15)
S5,4 (0.15) + S2 (0.10) -> S5,4,2 (0.25)
S5,4,2 (0.25) + S3 (0.20) -> S5,4,2,3 (0.45)
S1 (0.30) + S6 (0.25) -> S1,6 (0.55)
S1,6 (0.55) + S5,4,2,3 (0.45) -> S (1.0)
How many bits are needed to transfer this coded message?
Total number of bits (variable-length code)
= (30*2) + (25*2) + (20*2) + (10*3) + (10*4) + (5*4) = 240 bits
Average code length?
Lavg = (0.30*2) + (0.25*2) + (0.20*2) + (0.10*3) + (0.10*4) + (0.05*4) = 2.4 bits per symbol
Generate the code sequence for the symbols "S4S6S2":
S4S6S2 = 1001 01 101 = 100101101
Decode the sequence "100101101":
100101101 = S4S6S2
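Because the Huffman code is prefix-free, decoding is a single left-to-right scan; a small sketch using the codebook from the example table:

```python
# Codewords from the example table.
CODEBOOK = {'S1': '00', 'S2': '101', 'S3': '11',
            'S4': '1001', 'S5': '1000', 'S6': '01'}

def encode(symbols, codebook):
    """Concatenate the codeword of each symbol."""
    return ''.join(codebook[s] for s in symbols)

def decode(bits, codebook):
    """Scan left to right; in a prefix-free code the first
    codeword match is always the right one."""
    inverse = {code: sym for sym, code in codebook.items()}
    symbols, buf = [], ''
    for b in bits:
        buf += b
        if buf in inverse:
            symbols.append(inverse[buf])
            buf = ''
    return symbols

print(encode(['S4', 'S6', 'S2'], CODEBOOK))  # 100101101
print(decode('100101101', CODEBOOK))         # ['S4', 'S6', 'S2']
```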
Dictionary-Based Coding (LZW)
LZW coding stands for Lempel-Ziv-Welch
Works by building a dictionary of phrases from the
input stream
A token or code is used to identify each distinct phrase
The number of entries in the dictionary determines the number of bits required for each code
The LZW encoder and decoder build up the same dictionary dynamically while processing the data.
P = currently recognized sequence, C = pixel being processed, output = encoded output,
code = dictionary location (codeword), string = dictionary entry
LZW Compression-Algorithm
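The slide's algorithm listing is not reproduced here; as a sketch, standard LZW compression in Python, run on the example string and initial dictionary given below:

```python
def lzw_compress(text, dictionary):
    """Greedily grow the currently recognized sequence s; when s + c is
    no longer in the dictionary, output the code for s and add s + c."""
    dictionary = dict(dictionary)           # copy: we extend it as we go
    next_code = max(dictionary.values()) + 1
    output, s = [], ''
    for c in text:
        if s + c in dictionary:
            s += c                          # keep extending the match
        else:
            output.append(dictionary[s])
            dictionary[s + c] = next_code   # new phrase learned
            next_code += 1
            s = c
    if s:
        output.append(dictionary[s])        # flush the final sequence
    return output

codes = lzw_compress('ABABBABCABABBA', {'A': 1, 'B': 2, 'C': 3})
print(codes)  # [1, 2, 4, 5, 2, 3, 4, 6, 1]
```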
LZW Compression-Example
We will compress the string
"ABABBABCABABBA"
Initially the dictionary is the following
Code String
1 A
2 B
3 C
LZW Compression-Example
S=Currently Recognized Sequence
OUTPUT=Encoded Output
STRING=Dictionary Entry
LZW Decompression-Algorithm
LZW Decompression-Example
Decompress the code sequence 1, 2, 4, 5, 2, 3, 4, 6, 1 using the LZW method
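A minimal decompression sketch under the same assumptions (initial dictionary 1=A, 2=B, 3=C); the branch for a code not yet in the dictionary handles the classic cScSc corner case, where the decoder receives a code the encoder created only one step earlier:

```python
def lzw_decompress(codes, dictionary):
    """Rebuild the encoder's dictionary on the fly: after each code, the
    previous entry plus the new entry's first symbol is a new phrase."""
    dictionary = dict(dictionary)      # copy: code -> string
    next_code = max(dictionary) + 1
    prev = dictionary[codes[0]]
    out = [prev]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:                          # cScSc case: code was just created
            entry = prev + prev[0]
        dictionary[next_code] = prev + entry[0]
        next_code += 1
        out.append(entry)
        prev = entry
    return ''.join(out)

print(lzw_decompress([1, 2, 4, 5, 2, 3, 4, 6, 1], {1: 'A', 2: 'B', 3: 'C'}))
# ABABBABCABABBA
```

Note that the recovered string matches the input to the compression example, confirming that encoder and decoder build identical dictionaries.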