
MULTIMEDIA SYSTEM

Chapter Eight: Multimedia Data Compression

Introduction
Compression: the process of coding that will
effectively reduce the total number of bits needed to
represent certain information.

The Need for Compression?

To reduce storage requirements (image, audio, video)

To reduce the bandwidth required for transmission

Fundamentals

The objective is to get rid of redundant data
(data is not the same as information).
Compression ratio: CR = n1 / n2
Relative data redundancy: RD = 1 − 1/CR
n1 = amount of data in data set 1 (# bits, uncompressed)
n2 = amount of data in data set 2 (# bits, compressed)
n1 and n2 represent approximately the same information.
If n2 = n1, then CR = 1 and RD = 0 (no redundancy removed).
If n1 : n2 = 10 : 1, then CR = 10 and RD = 0.9 ⇒ 90% of the data is redundant.
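As a quick check of these formulas, a minimal Python sketch (the function names are illustrative, not from the original):

```python
def compression_ratio(n1: int, n2: int) -> float:
    """CR = n1 / n2, where n1 = bits before and n2 = bits after compression."""
    return n1 / n2

def relative_redundancy(cr: float) -> float:
    """RD = 1 - 1/CR."""
    return 1 - 1 / cr

# 10:1 compression, e.g. 1,000,000 bits reduced to 100,000 bits
cr = compression_ratio(1_000_000, 100_000)
print(cr)                        # 10.0
print(relative_redundancy(cr))   # 0.9 -> 90% of the data is redundant
```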
How can we save space?

Three approaches:
Reduce CODING redundancy: use fewer bits to represent frequent symbols.
Reduce INTERPIXEL / INTERFRAME redundancy: neighboring pixels (or frames) have similar values.
Reduce PSYCHOVISUAL redundancy: the human visual system cannot simultaneously distinguish all colors.

Coding Redundancy
Use fewer bits to represent frequently occurring symbols.
Let pr(rk) = nk / n, for k = 0, 1, 2, ..., L-1, where L is the number of gray levels.
Let rk be represented by l(rk) bits.
Therefore the average number of bits required to represent each pixel is

Lavg = Σk l(rk) · pr(rk), summed over k = 0, ..., L-1        (A)

Coding Redundancy
Consider equation (A): it makes sense to assign fewer bits to those rk for which pr(rk) is large, in order to reduce the sum.
This achieves data compression and results in a variable-length code.
More probable gray levels are assigned fewer bits.

Compute the average code length for the following code:

Symbol (Xi)   Frequency   Code
A             5           0000
B             8           0001
C             11          001
D             20          01
E             26          10
F             30          11
Symbol (Xi)   Probability P(Xi)   Code Length L(Xi)   P(Xi) * L(Xi)
A             0.05                4                   0.20
B             0.08                4                   0.32
C             0.11                3                   0.33
D             0.20                2                   0.40
E             0.26                2                   0.52
F             0.30                2                   0.60

The average code word length E(L) per source symbol is Σ P(Xi) * L(Xi) = 2.37 bits.
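The same computation as a minimal Python sketch (the `table` layout is illustrative):

```python
# (frequency, code) for each symbol, as given in the example above
table = {
    "A": (5, "0000"), "B": (8, "0001"), "C": (11, "001"),
    "D": (20, "01"),  "E": (26, "10"),  "F": (30, "11"),
}

total = sum(freq for freq, _ in table.values())  # 100

# E(L) = sum over symbols of P(Xi) * L(Xi)
avg_len = sum((freq / total) * len(code) for freq, code in table.values())
print(avg_len)  # 2.37 bits per symbol
```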

Example: Variable-length coding (reversible)

Code 1: natural 3-bit code.
Code 2: variable-length code.
[Comparison table of the two codes not reproduced; with the variable-length code, a fraction of the Code 1 data is redundant.]
Objective: when pr(rk) is large, l2(rk) should be small, so that the average length is reduced.

Psychovisual Redundancy
Question: which one looks more different from the original?

[Figure: original image; A. brightness adjusted slightly; B. colour adjusted slightly]

Psychovisual: the human eye is more sensitive to brightness detail than to fine color detail.

Types of Compression
Lossless compression
The original can be recovered exactly. Higher quality, bigger files.
Used where no loss is acceptable: legal and medical documents, computer programs.
Exploits only data redundancy.
Also called error-free compression.
Lossy compression
Only an approximation of the original can be recovered. Lower quality, smaller files.
Used for digital audio, images, and video, where some error or loss can be tolerated.
Exploits both data redundancy and human perception properties.
Also called error-containing compression.
Lossless Compression
Common methods to remove redundancy:
Run-Length Coding
Huffman Coding
Dictionary-Based Coding
Arithmetic Coding, etc.

Run-Length Coding (RLC)
Run-length coding is a very widely used and simple compression technique.
In this method we replace runs of symbols with (run-length, symbol) pairs.
Example:
Input symbols: 7,7,7,7,7,90,9,9,9,1,1,1 require 12 bytes.
Using RLC: 5,7, 90, 3,9, 3,1 = 7 bytes (the isolated 90 is emitted without a count).
Compression ratio: 12/7.
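A minimal run-length encoder in Python (names illustrative); note that, unlike the slide's 7-value output, this strict (run-length, symbol) version also pairs a count of 1 with the isolated 90:

```python
from itertools import groupby

def rle_encode(symbols):
    """Replace each run of equal symbols with a (run_length, symbol) pair."""
    return [(len(list(run)), sym) for sym, run in groupby(symbols)]

def rle_decode(pairs):
    """Expand (run_length, symbol) pairs back into the original sequence."""
    return [sym for count, sym in pairs for _ in range(count)]

data = [7, 7, 7, 7, 7, 90, 9, 9, 9, 1, 1, 1]
encoded = rle_encode(data)
print(encoded)                      # [(5, 7), (1, 90), (3, 9), (3, 1)]
assert rle_decode(encoded) == data  # RLC is lossless
```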

Huffman Coding
Assigns fewer bits to symbols that appear more often and more bits to symbols that appear less often.
Efficient when the occurrence probabilities vary widely.
It constructs a binary tree in a bottom-up manner, then uses the tree to find the codeword for each symbol.

Huffman Coding-Algorithm
1. Put all symbols on a list sorted according to their frequency counts.
2. Repeat until the list has only one symbol left:
   a. From the list, pick the two symbols with the lowest frequency counts. Form a Huffman subtree that has these two symbols as child nodes, and create a parent node.
   b. Assign the sum of the children's frequency counts to the parent and insert it into the list such that the order is maintained.
   c. Delete the children from the list.
3. Assign a codeword to each leaf based on the path from the root (assigning 0 and 1 to the branches along the way).
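A compact sketch of this algorithm in Python, using a min-heap as the sorted list (names are illustrative; with tied frequencies the exact codewords may differ from the example below, but every Huffman tree achieves the same minimal average length):

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Build Huffman codewords from a {symbol: frequency} map."""
    tick = count()  # tie-breaker so the heap never compares subtrees
    # Heap entry: (frequency, tie_breaker, tree); a leaf is just its symbol.
    heap = [(f, next(tick), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # two lowest-frequency nodes
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tick), (left, right)))
    # Walk the final tree: 0 for the left branch, 1 for the right.
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):         # internal node
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                               # leaf symbol
            codes[node] = prefix or "0"
    walk(heap[0][2], "")
    return codes

print(huffman_codes({"S1": 30, "S2": 10, "S3": 20, "S4": 5, "S5": 10, "S6": 25}))
```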

Huffman Coding-Example
Source Symbol   Number of occurrences   Codeword assigned   Length of codeword
S1              30                      00                  2
S2              10                      101                 3
S3              20                      11                  2
S4              5                       1001                4
S5              10                      1000                4
S6              25                      01                  2
Source reduction (at each stage the two lowest-probability entries are merged; within each merge the upper branch is assigned 0 and the lower branch 1, which gives the codewords above):

S1 (0.30)   S1 (0.30)    S1 (0.30)       S5,4,2,3 (0.45)   S1,6 (0.55): 0      S (1.0)
S6 (0.25)   S6 (0.25)    S6 (0.25)       S1 (0.30)         S5,4,2,3 (0.45): 1
S3 (0.20)   S3 (0.20)    S5,4,2 (0.25)   S6 (0.25)
S2 (0.10)   S5,4 (0.15)  S3 (0.20)
S5 (0.10)   S2 (0.10)
S4 (0.05)
How many bits are needed to transfer this coded message?
Variable-length code (total no. of bits)
= (30*2) + (25*2) + (20*2) + (10*3) + (10*4) + (5*4) = 240 bits
Average length of code?
Lavg = (0.30*2) + (0.25*2) + (0.20*2) + (0.10*3) + (0.10*4) + (0.05*4) = 2.4 bits per symbol
Generate the code sequence for the symbols "S4S6S2":
S4S6S2 = 1001 01 101 = 100101101
Decode the sequence "100101101":
100101101 = 1001|01|101 = S4S6S2
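A small sketch of encoding and decoding with this code table (names illustrative); because Huffman codes are prefix-free, a greedy left-to-right scan decodes them unambiguously:

```python
codes = {"S1": "00", "S2": "101", "S3": "11",
         "S4": "1001", "S5": "1000", "S6": "01"}

def encode(symbols):
    return "".join(codes[s] for s in symbols)

def decode(bits):
    inverse = {v: k for k, v in codes.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:      # a complete codeword has been read
            out.append(inverse[buf])
            buf = ""
    return out

print(encode(["S4", "S6", "S2"]))   # 100101101
print(decode("100101101"))          # ['S4', 'S6', 'S2']
```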

Dictionary-Based Coding (LZW)
LZW stands for Lempel-Ziv-Welch.
It works by building a dictionary of phrases from the input stream.
A token or code is used to identify each distinct phrase.
The number of entries in the dictionary determines the number of bits required for each code.
The LZW encoder and decoder build up the same dictionary dynamically while receiving the data.

Notation: P = currently recognized sequence, C = pixel being processed, OUTPUT = encoded output, CODE = dictionary location (codeword), STRING = dictionary entry.
LZW Compression-Algorithm
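In the P/C notation above, a minimal Python sketch of the compression procedure (the function name and the `alphabet` parameter are illustrative, not from the original):

```python
def lzw_compress(data, alphabet=("A", "B", "C")):
    """LZW: emit the code for the longest dictionary match P, then add P + C."""
    dictionary = {sym: i + 1 for i, sym in enumerate(alphabet)}  # initial entries
    next_code = len(dictionary) + 1
    output = []
    p = ""                        # P = currently recognized sequence
    for c in data:                # C = symbol being processed
        if p + c in dictionary:
            p = p + c             # keep extending the match
        else:
            output.append(dictionary[p])    # OUTPUT the code for P
            dictionary[p + c] = next_code   # new STRING at location CODE
            next_code += 1
            p = c
    if p:
        output.append(dictionary[p])        # flush the final sequence
    return output
```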

LZW Compression-Example
We will compress the string "ABABBABCABABBA".
Initially the dictionary is the following:

Code   String
1      A
2      B
3      C
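Running the compression sketch on the example string reproduces the encoding; the dictionary entries created along the way are shown as comments (they follow from the algorithm, since the original trace table is not reproduced here):

```python
encoded = lzw_compress("ABABBABCABABBA")
print(encoded)   # [1, 2, 4, 5, 2, 3, 4, 6, 1]
# New dictionary entries created while encoding:
#   4 = AB, 5 = BA, 6 = ABB, 7 = BAB, 8 = BC, 9 = CA, 10 = ABA, 11 = ABBA
```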

LZW Decompression-Algorithm
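A matching sketch of the decompression side, which rebuilds the same dictionary from the codes alone (names again illustrative; the `prev + prev[0]` branch handles the one case where a code refers to the entry currently being built):

```python
def lzw_decompress(codes, alphabet=("A", "B", "C")):
    """Rebuild the dictionary while decoding; mirrors the compressor."""
    dictionary = {i + 1: sym for i, sym in enumerate(alphabet)}
    next_code = len(dictionary) + 1
    prev = dictionary[codes[0]]
    out = [prev]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:                       # code not yet in dictionary (KwKwK case)
            entry = prev + prev[0]
        out.append(entry)
        dictionary[next_code] = prev + entry[0]   # same entry the encoder made
        next_code += 1
        prev = entry
    return "".join(out)
```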

LZW Decompression-Example
Decompress the code 124523461 (i.e., the code sequence 1,2,4,5,2,3,4,6,1) using the LZW method.
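Applying the sketch above to this code sequence recovers the string from the compression example:

```python
print(lzw_decompress([1, 2, 4, 5, 2, 3, 4, 6, 1]))   # ABABBABCABABBA
```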

