Technical Seminar Presentation 2005

NM Institute of Engineering and Technology National Institute of Science and Technology

A Review of Data Compression Techniques
Presented by Sudeepta Mishra, Roll# CS200117052
At NIST, Berhampur
Under the guidance of Mr. Rowdra Ghatak
Edited by: Priyabrata Nayak, Lecturer, Dept. of CSE



Introduction
• Data compression is the process of encoding data so that it takes less storage space or less transmission time than it would if it were not compressed.
• Compression is possible because most real-world data is highly redundant.



Different Compression Techniques
• There are mainly two types of data compression techniques.
– Lossless compression. Useful for compressing spreadsheets, text, and executable programs.
– Lossy compression. Used for compressing images, movies, and sounds.


Types of Lossless Data Compression
• Dictionary coders.
– Zip (file format).
– Lempel Ziv.
• Entropy encoding.
– Huffman coding (simple entropy coding).
• Run-length encoding.

Dictionary-Based Compression
• Dictionary-based algorithms do not encode single symbols as variable-length bit strings; they encode variable-length strings of symbols as single tokens.
• The tokens form an index into a phrase dictionary.
• If the tokens are smaller than the phrases they replace, compression occurs.
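The idea of replacing phrases with tokens can be sketched with a small static dictionary. This is only an illustration: the phrase list `PHRASES` and the helper names `encode` and `decode` are hypothetical and do not come from the slides.

```python
# Hypothetical phrase dictionary, assumed to be known to both encoder and decoder.
PHRASES = ["data ", "compression ", "dictionary ", "the "]


def encode(text):
    """Greedily replace known phrases with their index; copy other characters as-is."""
    tokens = []
    i = 0
    while i < len(text):
        for index, phrase in enumerate(PHRASES):
            if text.startswith(phrase, i):
                tokens.append(index)      # one token stands for the whole phrase
                i += len(phrase)
                break
        else:
            tokens.append(text[i])        # no phrase matched: emit a literal character
            i += 1
    return tokens


def decode(tokens):
    """Expand integer tokens back into phrases; literals pass through unchanged."""
    return "".join(PHRASES[t] if isinstance(t, int) else t for t in tokens)


text = "the dictionary compression of data "
tokens = encode(text)
print(len(text), "characters ->", len(tokens), "tokens")
```

Whether this saves space depends on how the tokens are stored: compression occurs only if each token takes fewer bits than the phrase it replaces.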

Types of Dictionary
• Static Dictionary.
• Semi-Adaptive Dictionary.
• Adaptive Dictionary.
– Lempel Ziv algorithms belong to this category of dictionary coders.
– The dictionary is built in a single pass, while at the same time encoding the data.
– The decoder can build up the dictionary in the same way as the encoder while decompressing the data.

Dictionary-Based Compression: Example
• Using an English dictionary, the string: “A good example of how dictionary based compression works”
• Gives: 1/1 822/3 674/4 1343/60 928/75 550/32 173/46 421/2
• Using the dictionary as a lookup table, each word is coded as x/y, where x gives the page number and y gives the number of the word on that page. If the dictionary has 2,200 pages with fewer than 256 entries per page, then x requires 12 bits and y requires 8 bits, i.e., 20 bits per word (2.5 bytes per word).
• Using ASCII coding the above string requires 48 bytes, whereas our encoding requires only 20 (= 2.5 * 8) bytes: better than 50% compression.
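The bit arithmetic of this example can be checked with a few lines of Python. The variable names are illustrative. The ASCII size is measured from the string as quoted rather than hard-coded, and a direct character count of that string gives 56 rather than the 48 bytes stated on the slide, so the exact ratio should be read with that caveat.

```python
text = "A good example of how dictionary based compression works"
codes = ["1/1", "822/3", "674/4", "1343/60", "928/75", "550/32", "173/46", "421/2"]

MAX_PAGE = 2200   # page numbers run up to 2,200
MAX_WORD = 255    # fewer than 256 entries per page

bits_x = MAX_PAGE.bit_length()   # bits needed for the page number x
bits_y = MAX_WORD.bit_length()   # bits needed for the word number y
bits_per_word = bits_x + bits_y

encoded_bytes = len(codes) * bits_per_word // 8
ascii_bytes = len(text)          # one byte per character in ASCII

print(f"x: {bits_x} bits, y: {bits_y} bits, {bits_per_word} bits per word")
print(f"encoded: {encoded_bytes} bytes, ASCII: {ascii_bytes} bytes")
```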

Lempel Ziv
• It is a family of algorithms, stemming from the two algorithms proposed by Jacob Ziv and Abraham Lempel in their landmark papers in 1977 and 1978.
• Members of the family: LZ77, LZ78, LZJ, LZR, LZFG, LZW, LZT, LZMW, LZSS, LZH, LZB, LZC.

LZW Algorithm
• It is an improved version of the LZ78 algorithm.
• Published by Terry Welch in 1984.
• A dictionary that is indexed by “codes” is used. The dictionary is assumed to be initialized with 256 entries (indexed with ASCII codes 0 through 255) representing the ASCII table.

The LZW Algorithm (Compression)

    W = NIL;
    while (there is input) {
        K = next symbol from input;
        if (WK exists in the dictionary) {
            W = WK;
        } else {
            output (index(W));
            add WK to the dictionary;
            W = K;
        }
    }
    output (index(W));    /* no input left: output the code for the final W */
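The compression loop can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `lzw_compress` and its `alphabet` parameter are assumptions, and codes start at 1 over a four-symbol alphabet to match the worked example in these slides rather than the 256-entry ASCII table.

```python
def lzw_compress(data, alphabet):
    """Return the list of LZW codes for `data`.

    Assumes every symbol of `data` occurs in `alphabet`.
    """
    # Initial dictionary: one entry per symbol, numbered from 1.
    dictionary = {symbol: i for i, symbol in enumerate(alphabet, start=1)}
    next_code = len(dictionary) + 1

    w = ""
    output = []
    for k in data:
        wk = w + k
        if wk in dictionary:
            w = wk                        # keep extending the current phrase
        else:
            output.append(dictionary[w])  # emit the code for the longest match
            dictionary[wk] = next_code    # learn the new phrase
            next_code += 1
            w = k
    if w:
        output.append(dictionary[w])      # no input left: emit the final phrase
    return output


codes = lzw_compress("abdabdac", "abcd")
print(codes)
```

For the input a b d a b d a c with the initial dictionary a=1, b=2, c=3, d=4, this yields the codes 1 2 4 5 7 3.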

The LZW Algorithm (Compression) Flow Chart
[Flow chart: START -> W = NULL -> IS EOF? If YES: STOP. If NO: K = NEXT INPUT -> IS WK FOUND? If YES: W = WK. If NO: OUTPUT INDEX OF W, ADD WK TO DICTIONARY, W = K. Both branches return to IS EOF?]

The LZW Algorithm (Compression) Example
• Input string is: a b d a b d a c
• The initial dictionary contains the symbols a, b, c, d with index values 1, 2, 3, 4 respectively.
• Now the input string is read from left to right, starting from a.

The LZW Algorithm (Compression) Example
• W = Null
• K = a
• WK = a is in the dictionary, so W = a.
Input: a b d a b d a c (K is the 1st symbol)
Dictionary: a 1, b 2, c 3, d 4

The LZW Algorithm (Compression) Example
• K = b
• WK = ab is not in the dictionary.
• Add WK to the dictionary.
• Output code for a.
• Set W = b
Input: a b d a b d a c (K is the 2nd symbol)
Output so far: 1
Dictionary: a 1, b 2, c 3, d 4, ab 5

The LZW Algorithm (Compression) Example
• K = d
• WK = bd is not in the dictionary.
• Add bd to the dictionary.
• Output code for b.
• Set W = d
Input: a b d a b d a c (K is the 3rd symbol)
Output so far: 1 2
Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6

The LZW Algorithm (Compression) Example
• K = a
• WK = da is not in the dictionary.
• Add it to the dictionary.
• Output code for d.
• Set W = a
Input: a b d a b d a c (K is the 4th symbol)
Output so far: 1 2 4
Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6, da 7

The LZW Algorithm (Compression) Example
• K = b
• WK = ab is in the dictionary, so W = ab.
Input: a b d a b d a c (K is the 5th symbol)
Output so far: 1 2 4
Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6, da 7

The LZW Algorithm (Compression) Example
• K = d
• WK = abd is not in the dictionary.
• Add WK to the dictionary.
• Output code for W (ab = 5).
• Set W = d
Input: a b d a b d a c (K is the 6th symbol)
Output so far: 1 2 4 5
Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6, da 7, abd 8

The LZW Algorithm (Compression) Example
• K = a
• WK = da is in the dictionary, so W = da.
Input: a b d a b d a c (K is the 7th symbol)
Output so far: 1 2 4 5
Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6, da 7, abd 8

The LZW Algorithm (Compression) Example
• K = c
• WK = dac is not in the dictionary.
• Add WK to the dictionary.
• Output code for W (da = 7).
• Set W = c
• No input left, so output code for W.
Input: a b d a b d a c (K is the 8th symbol)
Output so far: 1 2 4 5 7
Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6, da 7, abd 8, dac 9

The LZW Algorithm (Compression) Example
• The final output string is 1 2 4 5 7 3.
• Stop.
Input: a b d a b d a c
Output: 1 2 4 5 7 3
Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6, da 7, abd 8, dac 9
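The step-by-step example can be reproduced with a small tracing variant of the encoder. This is a sketch for checking the walkthrough by hand; the function name `lzw_trace` and the tuple layout of each step are assumptions made for illustration.

```python
def lzw_trace(data, alphabet):
    """Run LZW compression and record (K, WK, emitted code, new entry) per step."""
    dictionary = {s: i for i, s in enumerate(alphabet, start=1)}
    w = ""
    steps = []
    for k in data:
        wk = w + k
        if wk in dictionary:
            steps.append((k, wk, None, None))          # WK found: nothing emitted
            w = wk
        else:
            emitted = dictionary[w]
            dictionary[wk] = len(dictionary) + 1       # next free index
            steps.append((k, wk, emitted, (wk, dictionary[wk])))
            w = k
    if w:
        steps.append((None, w, dictionary[w], None))   # end of input: flush W
    return steps, dictionary


steps, dictionary = lzw_trace("abdabdac", "abcd")
for k, wk, emitted, new_entry in steps:
    print(f"K={k!s:5} WK={wk:4} output={emitted!s:5} added={new_entry}")

output_codes = [emitted for _, _, emitted, _ in steps if emitted is not None]
```

Each printed row corresponds to one slide of the example, and the last row is the final flush of W.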

LZW Decompression Algorithm

    read a character k;
    output k;
    w = k;
    while ( read a character k )    /* k could be a character or a code. */
    {
        entry = dictionary entry for k;
        output entry;
        add w + entry[0] to dictionary;
        w = entry;
    }
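A Python sketch of the decoder is given below. The name `lzw_decompress` and the `alphabet` parameter are assumptions, and codes start at 1 to match the worked example. The sketch also handles one case the pseudocode above does not cover: a code that refers to the dictionary entry the decoder is just about to create, which is treated here as `w + w[0]`.

```python
def lzw_decompress(codes, alphabet):
    """Rebuild the original string from a list of LZW codes."""
    # The decoder starts from the same initial dictionary as the encoder.
    dictionary = {i: symbol for i, symbol in enumerate(alphabet, start=1)}
    next_code = len(dictionary) + 1
    if not codes:
        return ""

    w = dictionary[codes[0]]
    output = [w]
    for k in codes[1:]:
        if k in dictionary:
            entry = dictionary[k]
        elif k == next_code:
            # Special case (not on the slide): the encoder used a phrase
            # it had only just added, so it must be w plus w's first symbol.
            entry = w + w[0]
        else:
            raise ValueError(f"invalid LZW code: {k}")
        output.append(entry)
        dictionary[next_code] = w + entry[0]   # same entry the encoder added
        next_code += 1
        w = entry
    return "".join(output)


decoded = lzw_decompress([1, 2, 4, 5, 7, 3], "abcd")
print(decoded)
```

No dictionary is transmitted: the decoder reconstructs every entry from the codes alone.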

LZW Decompression Algorithm Flow Chart
[Flow chart: START -> K = INPUT -> Output K -> W = K -> IS EOF? If YES: STOP. If NO: K = NEXT INPUT -> ENTRY = DICTIONARY INDEX (K) -> Output ENTRY -> ADD W + ENTRY[0] TO DICTIONARY -> W = ENTRY -> back to IS EOF?]

The LZW Algorithm (Decompression) Example
• K = 1
• Output K (i.e. a)
• W = K
Input codes: 1 2 4 5 7 3 (K is the 1st code)
Output so far: a
Dictionary: a 1, b 2, c 3, d 4

The LZW Algorithm (Decompression) Example
• K = 2
• entry = b
• Output entry
• Add W + entry[0] to dictionary
• W = entry (i.e. b)
Input codes: 1 2 4 5 7 3 (K is the 2nd code)
Output so far: a b
Dictionary: a 1, b 2, c 3, d 4, ab 5

The LZW Algorithm (Decompression) Example
• K = 4
• entry = d
• Output entry
• Add W + entry[0] to dictionary
• W = entry (i.e. d)
Input codes: 1 2 4 5 7 3 (K is the 3rd code)
Output so far: a b d
Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6

The LZW Algorithm (Decompression) Example
• K = 5
• entry = ab
• Output entry
• Add W + entry[0] to dictionary
• W = entry (i.e. ab)
Input codes: 1 2 4 5 7 3 (K is the 4th code)
Output so far: a b d a b
Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6, da 7

The LZW Algorithm (Decompression) Example
• K = 7
• entry = da
• Output entry
• Add W + entry[0] to dictionary
• W = entry (i.e. da)
Input codes: 1 2 4 5 7 3 (K is the 5th code)
Output so far: a b d a b d a
Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6, da 7, abd 8

The LZW Algorithm (Decompression) Example
• K = 3
• entry = c
• Output entry
• Add W + entry[0] to dictionary
• W = entry (i.e. c)
Input codes: 1 2 4 5 7 3 (K is the 6th code)
Output: a b d a b d a c
Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6, da 7, abd 8, dac 9

Advantages
• As LZW is adaptive dictionary coding, there is no need to transfer the dictionary explicitly.
• It will be created at the decoder side.
• LZW can be made really fast; it grabs a fixed number of bits from input, so bit parsing is very easy, and table lookup is automatic.
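The point about grabbing a fixed number of bits can be illustrated by packing codes at a fixed width. The 12-bit default and the helper names `pack_codes` and `unpack_codes` are assumptions chosen for this sketch; the slides do not specify a code width.

```python
def pack_codes(codes, width=12):
    """Pack integer codes into bytes, `width` bits each, first code in the highest bits."""
    buffer = 0
    for code in codes:
        if not 0 <= code < (1 << width):
            raise ValueError(f"code {code} does not fit in {width} bits")
        buffer = (buffer << width) | code
    total_bits = width * len(codes)
    padding = (-total_bits) % 8            # zero bits added to reach a byte boundary
    buffer <<= padding
    return buffer.to_bytes((total_bits + padding) // 8, "big")


def unpack_codes(data, count, width=12):
    """Read `count` codes of `width` bits each back out of `data`."""
    buffer = int.from_bytes(data, "big")
    total_bits = len(data) * 8
    mask = (1 << width) - 1
    return [(buffer >> (total_bits - width * (i + 1))) & mask for i in range(count)]


codes = [1, 2, 4, 5, 7, 3]
packed = pack_codes(codes)
print(len(packed), "bytes for", len(codes), "codes")
restored = unpack_codes(packed, len(codes))
```

Because every code has the same width, the reader never has to search for code boundaries: it shifts and masks, then uses the result directly as a table index.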

Problems with the encoder
• What if we run out of space?
– Keep track of unused entries and replace them using an LRU (Least Recently Used) policy.
– Monitor compression performance and flush the dictionary when performance is poor.
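One simple way to handle a full dictionary is sketched below. Note that it uses a plainer policy than either option on the slide: it flushes the dictionary as soon as it is full, instead of tracking LRU entries or monitoring compression performance. The names `compress`, `decompress`, and `max_size` are assumptions, and the decoder must apply exactly the same rule to stay in step with the encoder.

```python
def fresh(alphabet):
    """Initial dictionary size and symbol tables, numbered from 1."""
    return len(alphabet), {s: i for i, s in enumerate(alphabet, start=1)}


def compress(data, alphabet, max_size):
    """LZW encoder that flushes its dictionary once it holds `max_size` entries."""
    _, dictionary = fresh(alphabet)
    w, output = "", []
    for k in data:
        wk = w + k
        if wk in dictionary:
            w = wk
            continue
        output.append(dictionary[w])
        if len(dictionary) < max_size:
            dictionary[wk] = len(dictionary) + 1   # room left: learn the phrase
        else:
            _, dictionary = fresh(alphabet)        # full: start over
        w = k
    if w:
        output.append(dictionary[w])
    return output


def decompress(codes, alphabet, max_size):
    """Decoder mirroring `compress`, including the flush rule."""
    dictionary = {i: s for i, s in enumerate(alphabet, start=1)}
    if not codes:
        return ""
    w = dictionary[codes[0]]
    output = [w]
    for k in codes[1:]:
        if len(dictionary) >= max_size:
            # The encoder flushed after emitting the previous code.
            dictionary = {i: s for i, s in enumerate(alphabet, start=1)}
            entry = dictionary[k]
        else:
            # A code not yet in the table can only be the entry being created now.
            entry = dictionary[k] if k in dictionary else w + w[0]
            dictionary[len(dictionary) + 1] = w + entry[0]
        output.append(entry)
        w = entry
    return "".join(output)


codes = compress("abdabdac", "abcd", max_size=6)
print(codes, decompress(codes, "abcd", max_size=6))
```

With `max_size=6` the dictionary fills after two new phrases, so the example string is encoded mostly with single-symbol codes; larger limits trade memory for better compression.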

Conclusion
• LZW has opened new directions for the development of compression techniques.
• It has been implemented in well-known formats such as Acrobat PDF and in many other compression packages.
• In combination with other compression techniques, many further techniques have been developed, such as LZMS.


Thank You
