
Data Compression (IT603N)

ColleGPT | Get Prepared Together
Prepared and Edited by: Divya Kaurani
Designed by: Kussh Prajapati

www.collegpt.com | collegpt@gmail.com
Unit - 4 : Arithmetic Coding
Arithmetic Coding :

Arithmetic coding does away with the idea of replacing each input symbol with a specific code.
Instead, it replaces an entire stream of input symbols with a single floating-point number between 0 and 1.

Algorithm
For Encoding:
low = 0.0 ; high = 1.0 ;
while not EOF do
range = high - low ;
read(c) ;
high = low + range * high_range(c) ;
low = low + range * low_range(c) ;
end do
output(low);

Where,
‘low’ and ‘high’ represent the current interval being encoded.
‘low_range(c)’ and ‘high_range(c)’ represent the range of probabilities assigned to
symbol c.
‘range’ represents the width of the current interval.
When encoding each symbol ‘c’, the interval [low, high) is narrowed to the sub-interval corresponding to the
probability range of c.
After encoding all symbols, the final ‘low’ value is output as the encoded fraction; any value inside the
final interval would identify the same message.
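
To make the encoding loop concrete, here is a minimal Python sketch of the pseudocode above. The three-symbol probability model and the function name are illustrative assumptions, not part of the original algorithm description.

# Minimal sketch of the encoding pseudocode, assuming a fixed (hypothetical)
# probability model: a -> [0.0, 0.5), b -> [0.5, 0.75), c -> [0.75, 1.0).
RANGES = {'a': (0.0, 0.5), 'b': (0.5, 0.75), 'c': (0.75, 1.0)}

def encode(message):
    low, high = 0.0, 1.0
    for c in message:
        rng = high - low              # width of the current interval
        low_c, high_c = RANGES[c]     # probability range of symbol c
        high = low + rng * high_c     # shrink [low, high) to the
        low = low + rng * low_c       # sub-interval assigned to c
    return low                        # any value in the final interval identifies the message

print(encode("bac"))   # 0.59375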

For Decoding:
r = input_number
Repeat
Search c such that r falls in its range
output(c);
r = r - low_range(c);
r = r / (high_range(c) - low_range(c));
Until EOF or the length of the message is reached.

Where,
r represents the value being decoded.
In each step, the algorithm finds the symbol c whose probability range contains r.
Symbol c is output.
r is shifted by subtracting low_range(c) and then rescaled by the width of c's range, so that it again
lies in [0, 1) and can be used to decode the next symbol.
This process repeats until the end of the encoded message or the desired message
length is reached.
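
The decoding loop can be sketched the same way. The model below is the same hypothetical one used in the encoding sketch, and the message length is passed in explicitly (standing in for the pseudocode's "length of the message" condition).

# Minimal sketch of the decoding pseudocode, using the same hypothetical model.
RANGES = {'a': (0.0, 0.5), 'b': (0.5, 0.75), 'c': (0.75, 1.0)}

def decode(r, length):
    out = []
    for _ in range(length):
        # search for the symbol c whose range [low_c, high_c) contains r
        for c, (low_c, high_c) in RANGES.items():
            if low_c <= r < high_c:
                out.append(c)
                r = (r - low_c) / (high_c - low_c)   # shift and rescale r into [0, 1)
                break
    return "".join(out)

print(decode(0.59375, 3))   # prints "bac", the message encoded above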

Let's understand this with an example:
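
The following walkthrough uses the same small hypothetical three-symbol model as the sketches above; the model and the message are chosen only for illustration.

Suppose a: [0.0, 0.5), b: [0.5, 0.75), c: [0.75, 1.0), and the message is "bac".
- Start with the interval [0.0, 1.0).
- Read 'b': the interval narrows to [0.5, 0.75).
- Read 'a': the interval narrows to [0.5, 0.625).
- Read 'c': the interval narrows to [0.59375, 0.625).
The encoder outputs low = 0.59375.
Decoding starts with r = 0.59375: it lies in b's range, so 'b' is output and r is rescaled to 0.375; 0.375 lies in a's range, so 'a' is output and r is rescaled to 0.75; 0.75 lies in c's range, so 'c' is output. The original message "bac" is recovered.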


PYQ : Difference between arithmetic and Huffman coding.

Feature | Huffman Coding | Arithmetic Coding
Type | Variable-length codes | Non-binary, fractional-length code
Algorithm | Greedy; builds a tree based on symbol frequencies | Statistical; assigns intervals based on symbol probabilities
Output | Codewords with integer bit lengths | A single number representing the entire data
Compression Ratio | Good; approaches entropy for stationary data | Potentially better; can approach the theoretical limit of compression
Complexity | Simpler to understand and implement | More complex; requires specialized algorithms and data structures
Decodability | Guaranteed | Requires precise calculations; more prone to errors during decoding
Error Resilience | More resilient; decoding often resynchronizes after an error | Less resilient; a single error can corrupt the rest of the message
Adaptation | Static codewords; not ideal for non-stationary data | Can adapt to changing symbol frequencies dynamically
Memory Usage | Lower; only the codebook needs to be stored | Higher; internal state must be maintained for interval calculations
Applications | General-purpose compression, file compression, network protocols | Multimedia compression, high-fidelity data compression
Suitable for | Stationary data, where simplicity and efficiency are prioritized | Non-stationary data, where high compression ratios are crucial
Examples | ZIP, GZIP, BZIP2 | JPEG 2000, CABAC in HEVC video coding
Theoretical Limit | Entropy of the data | Entropy of the data
Uniqueness of Codewords | Guaranteed (prefix-free codes) | Not applicable; uses an interval-based representation
All the Best

"Enjoyed these notes? Feel free to share them with your friends and provide valuable feedback in your review. If you come across any inaccuracies, don't hesitate to reach out to the author for clarification. Your input helps us improve!"

Visit: www.collegpt.com

