You are on page 1of 5

Dictionary Coding CSE 391 Introduction to Data Compression

Spring 2003

Lecture 5: Dictionary Coding: LZW

Does not use statistical knowledge of data. Encoder: As the input is processed develop a dictionary and transmit the index of strings found in the dictionary. Decoder: As the code is processed reconstruct the dictionary to invert the process of encoding. Examples: LZW, LZ77, Sequitur, Applications: Unix Compress, gzip, GIF
CSE 391 - Lecture 5 - Spring 2003 2

LZW Encoding Algorithm


Repeat find the longest match w in the dictionary output the index of w put wa in the dictionary where a was the unmatched symbol

LZW Encoding Example (1)


Dictionary 0 a 1 b ababababa

CSE 391 - Lecture 5 - Spring 2003

CSE 391 - Lecture 5 - Spring 2003

LZW Encoding Example (2)


Dictionary 0 a 1 b 2 ab ababababa 0

LZW Encoding Example (3)


Dictionary 0 1 2 3 a b ab ba ababababa 01

CSE 391 - Lecture 5 - Spring 2003

CSE 391 - Lecture 5 - Spring 2003

LZW Encoding Example (4)


Dictionary 0 1 2 3 4 a b ab ba aba ababababa 01 2

LZW Encoding Example (5)


Dictionary 0 1 2 3 4 5 a b ab ba aba abab ababababa 01 2 4

CSE 391 - Lecture 5 - Spring 2003

CSE 391 - Lecture 5 - Spring 2003

LZW Encoding Example (6)


Dictionary 0 1 2 3 4 5 a b ab ba aba abab ababababa 01 2 4 3

LZW Decoding Algorithm


Emulate the encoder in building the dictionary. Decoder is slightly behind the encoder.
initialize dictionary; decode first index to w; put w? in dictionary; repeat decode the first symbol s of the index; complete the previous dictionary entry with s; finish decoding the remainder of the index; put w? in the dictionary where w was just decoded;

CSE 391 - Lecture 5 - Spring 2003

CSE 391 - Lecture 5 - Spring 2003

10

LZW Decoding Example (1)


Dictionary 0 a 1 b 2 a? 012436 a

LZW Decoding Example (2a)


Dictionary 0 a 1 b 2 ab 012436 a b

CSE 391 - Lecture 5 - Spring 2003

11

CSE 391 - Lecture 5 - Spring 2003

12

LZW Decoding Example (2b)


Dictionary 0 1 2 3 a b ab b? 012436 a b

LZW Decoding Example (3a)


Dictionary 0 1 2 3 a b ab ba 012436 a ba

CSE 391 - Lecture 5 - Spring 2003

13

CSE 391 - Lecture 5 - Spring 2003

14

LZW Decoding Example (3b)


Dictionary 0 1 2 3 4 a b ab ba ab? 012436 a b ab

LZW Decoding Example (4a)


Dictionary 0 1 2 3 4 a b ab ba aba 012436 a b ab a

CSE 391 - Lecture 5 - Spring 2003

15

CSE 391 - Lecture 5 - Spring 2003

16

LZW Decoding Example (4b)


Dictionary 0 1 2 3 4 5 a b ab ba aba aba? 012436 a b ab aba

LZW Decoding Example (5a)


Dictionary 0 1 2 3 4 5 a b ab ba aba abab 012436 a b ab aba b

CSE 391 - Lecture 5 - Spring 2003

17

CSE 391 - Lecture 5 - Spring 2003

18

LZW Decoding Example (5b)


Dictionary 0 1 2 3 4 5 6 a b ab ba aba abab ba? 012436 a b ab aba ba

LZW Decoding Example (6a)


Dictionary 0 1 2 3 4 5 6 a b ab ba aba abab bab 012436 a b ab aba ba b

CSE 391 - Lecture 5 - Spring 2003

19

CSE 391 - Lecture 5 - Spring 2003

20

LZW Decoding Example (6b)


Dictionary 0 1 2 3 4 5 6 7 a b ab ba aba abab bab bab? 012436 a b ab aba ba bab

Trie Data Structure for Encoders Dictionary


Fredkin (1960)
0 1 2 3 4 5 6 7 8 a b c d r ab br ra ac 9 ca 10 ad 11 da 12 abr 13 raa 14 abra 0 1 2 3 4 a b c d r b 5 r 12 a 14 c 8 d 10 r 6 a 9 a 11 a 7 a 13

CSE 391 - Lecture 5 - Spring 2003

21

CSE 391 - Lecture 5 - Spring 2003

22

Encoder Uses a Trie (1)


0 1 2 3 4 a b c d r b 5 r 12 a 14 abracadabraabracadabra 0 1 4 0 2 0 3 5 7 12 c 8 d 10 r 6 a 9 a 11 a 7 a 13 b 5

Encoder Uses a Trie (2)


0 1 2 3 4 a b c d r c 8 a 15 d 10 r 6 a 9 a 11 a 7 a 13

r 12 a 14

abracadabraabracadabra 0 1 4 0 2 0 3 5 7 12 8

CSE 391 - Lecture 5 - Spring 2003

23

CSE 391 - Lecture 5 - Spring 2003

24

Decoders Data Structure


Simply an array of strings
0 1 2 3 4 5 6 7 8 a b c d r ab br ra ac 9 ca 10 ad 11 da 12 abr 13 raa 14 abr?

Bounded Size Dictionary


Bounded Size Dictionary
n bits of index allows a dictionary of size 2n Doubtful that long entries in the dictionary will be useful.

0 1 4 0 2 0 3 5 7 12 8 ... a b r a c a d ab ra abr

Strategies when the dictionary reaches its limit.


1. 2. 3. 4. Dont add more, just use what is there. Throw it away and start a new dictionary. Double the dictionary, adding one more bit to indices. Throw out the least recently visited entry to make room for the new entry.
CSE 391 - Lecture 5 - Spring 2003 26

CSE 391 - Lecture 5 - Spring 2003

25

Implementing the LRV Strategy


Least Recent

Implementing the LRV Strategy


Least Recent

0 1 2 3 4 a b c d r r 6 a 9 a 11 a 7 a 13 b 5

0 1 2 3 4 a b c d r a 9 a 11 a 7 a 13

b 5 r 12 a 14

c 8

d 10

c 8

d 10

r 12 a 6 a 14

Most Recent

abracadabraabracadabra 0 1 4 0 2 0 3 5 7 12

Most Recent

abracadabraabracadabra 0 1 4 0 2 0 3 5 7 12 8

CSE 391 - Lecture 5 - Spring 2003

27

CSE 391 - Lecture 5 - Spring 2003

28

Notes on LZW
Extremely effective when there are repeated patterns in the data that are widely spread. Negative: Creates entries in the dictionary that may never be used. Applications:
Unix compress, GIF, V.42 bis modem standard

CSE 391 - Lecture 5 - Spring 2003

29

You might also like