The document describes arithmetic coding and Lempel-Ziv-Welch (LZW) compression algorithms. Arithmetic coding assigns a single codeword to a string of characters by dividing the numeric range from 0 to 1 based on character probabilities. LZW uses an adaptive dictionary to assign fixed-length codewords to variable-length strings, allowing it to efficiently encode repeated patterns. The document provides examples of how both algorithms assign codes and examples of the compression ratios achieved.
• Arithmetic coding is a more modern coding method that
usually outperforms Huffman coding in practice.
• Arithmetic coding yields a single codeword for each encoded
string of characters
• The first step is to divide the numeric range from 0 to 1 into a number of segments, one for each different character present in the message to be sent.
• The size of each segment is determined by the probability of the related character.
13 January 2020 UST, YEMEN
• A single code is given for each string of characters.
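The interval-narrowing steps listed above can be sketched in a few lines of Python. The symbol probabilities and the message "went." are assumptions (the slide's own table is not reproduced in this text); they were chosen because they lead to the final range 0.81602 to 0.81620 quoted in this section. The helper names `build_segments`, `encode_interval` and `binary_fraction` are hypothetical, and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

# Assumed model (not given in the text): symbol -> probability.
PROBS = {
    "e": Fraction(3, 10),
    "n": Fraction(3, 10),
    "t": Fraction(2, 10),
    "w": Fraction(1, 10),
    ".": Fraction(1, 10),  # termination character
}


def build_segments(probs):
    """Split [0, 1) into one segment per symbol, sized by its probability."""
    segments = {}
    start = Fraction(0)
    for symbol, p in probs.items():
        segments[symbol] = (start, start + p)
        start += p
    return segments


def encode_interval(message, probs):
    """Narrow [0, 1) once per symbol and return the final (low, high) range."""
    segments = build_segments(probs)
    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        width = high - low
        seg_low, seg_high = segments[symbol]
        low, high = low + width * seg_low, low + width * seg_high
    return low, high


def binary_fraction(bits):
    """Value of the binary fraction 0.b1b2b3... given as a string of bits."""
    return sum(Fraction(int(b), 2 ** (i + 1)) for i, b in enumerate(bits))


low, high = encode_interval("went.", PROBS)  # assumed message
codeword = binary_fraction("1101000011101")  # the 13-bit codeword from the text
print(float(low), float(high))
print(float(codeword), low <= codeword < high)
```

Any number inside the final range identifies the whole message, which is why the last line only checks that the quoted 13-bit codeword falls between `low` and `high`.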
When the termination character '.' is encoded, the segment range is from 0.81602 to 0.81620, and hence the codeword for the complete string is any number within the range:
$$0.81602 \le \text{codeword} < 0.81620$$
For example, $\text{codeword} = (0.81604004)_{10} = (0.1101000011101)_2$. Compared with ASCII code:
$$\text{compression ratio } c_r = \frac{13\ \text{bits}}{7\ \text{bits/symbol} \times 5\ \text{symbols}} \times 100\% = 37.14\%$$
• The Lempel-Ziv-Welch (LZW) algorithm employs an adaptive, dictionary-based compression technique.
• Unlike variable-length coding, in which the lengths of the codewords are different, LZW uses fixed-length codewords to represent variable-length strings of symbols/characters that commonly occur together, such as words in English text.
• The LZW encoder and decoder build up the same dictionary dynamically while receiving the data.
• LZW proceeds by placing longer and longer repeated entries into a dictionary, then emitting the code for an element rather than the string itself, if the element has already been placed in the dictionary.
LZW Encoding Algorithm
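A minimal sketch of the encoding loop, assuming the dictionary is initialised with all 256 single-character strings (codes 0 to 255) so that new entries start at code 256. That initial dictionary and the function name `lzw_encode` are assumptions made for illustration; the text does not fix them.

```python
def lzw_encode(text):
    """Return the list of LZW codes for `text` (assumed 8-bit characters)."""
    # Assumption: the dictionary starts with every single character.
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    current = ""  # longest match found so far
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            # Keep extending the match while it is still a known entry.
            current = candidate
        else:
            # Emit the code of the longest known match, then learn the
            # new, one-character-longer string.
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = ch
    if current:
        codes.append(dictionary[current])  # flush the final match
    return codes


print(lzw_encode("ABABABA"))
```

The encoder never transmits the dictionary itself; it only emits codes, and each new entry is the previous match plus one more character.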
Example …
The following test string is to be compressed with the help of the LZW compression algorithm:
ABRAKADABRAABRAKADABRA
a) Show the compression process in a table.
b) What is the compression ratio compared with ASCII code (8 bits/character)?
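One way to produce the step-by-step table for part a) and the bit counts for part b) is sketched below. It again assumes an initial dictionary of the 256 single characters with new codes starting at 256, and 12 bits per emitted code; `lzw_trace` is a hypothetical helper name.

```python
def lzw_trace(text):
    """Return rows (matched string, emitted code, new entry, new code)."""
    dictionary = {chr(i): i for i in range(256)}  # assumed initial dictionary
    next_code = 256
    current = ""
    rows = []
    for ch in text:
        if current + ch in dictionary:
            current += ch
            continue
        rows.append((current, dictionary[current], current + ch, next_code))
        dictionary[current + ch] = next_code
        next_code += 1
        current = ch
    if current:
        # The last match is emitted without adding a new entry.
        rows.append((current, dictionary[current], None, None))
    return rows


text = "ABRAKADABRAABRAKADABRA"
rows = lzw_trace(text)
for matched, code, new_entry, new_code in rows:
    print(f"{matched:<6}{code:<6}{new_entry or '':<7}{new_code or ''}")

b0 = len(text) * 8   # original size: 8 bits per character
b1 = len(rows) * 12  # compressed size: 12 bits per code (assumed width)
ratio = b1 / b0 * 100
print(b0, b1, round(ratio, 2))
```

Each printed row corresponds to one emitted code, so the number of rows is the number of codes in the compressed output.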
The original character string needs 176 bits (22 characters of 8 bits each in ASCII code), so $B_0 = 22 \times 8 = 176$ bits. After LZW compression, only 13 codes remain. Each code must be represented by 12 bits, so $B_1 = 13 \times 12 = 156$ bits. (Note: 9 bits per code would be enough here, but the computer works with hexadecimal digits, so 12 bits, i.e. three hexadecimal digits, are used.)
$$\text{compression ratio} = \frac{B_1}{B_0} \times 100\% = \frac{156}{176} \times 100\% = 88.64\%$$
LZW Decoding Algorithm
When an LZW-compressed file is decoded, the dictionary can be reconstructed gradually. This is because the output of the LZW algorithm always contains only dictionary entries that were already in the dictionary. In compression, each new dictionary entry begins with the last character of the previously added dictionary entry. Conversely, the last character of a new dictionary entry added during decompression is at the same time the first character of the string currently being output. The following table illustrates the sequence of LZW decompression. The first column contains the sequence of codes to be decoded. The second column shows the output of decompression, and the third column contains the dictionary entry added at that step, consisting of a character string and its associated code.
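The reconstruction described above can be sketched as follows, under the same assumption of an initial dictionary holding the 256 single characters (codes 0 to 255). The name `lzw_decode` is hypothetical. Because the decoder adds each entry one step later than the encoder, the sketch also handles the case where a received code is the entry that is about to be added.

```python
def lzw_decode(codes):
    """Rebuild the text from a list of LZW codes."""
    dictionary = {i: chr(i) for i in range(256)}  # assumed initial dictionary
    next_code = 256
    if not codes:
        return ""
    previous = dictionary[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # The code refers to the entry being built right now:
            # it must be the previous string plus its own first character.
            entry = previous + previous[0]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        output.append(entry)
        # New entry = previous output + first character of the current output.
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(output)


codes = [65, 66, 82, 65, 75, 65, 68, 256, 258, 263, 259, 261, 265]
print(lzw_decode(codes))
```

The code list used here is the one obtained by encoding the example string ABRAKADABRAABRAKADABRA with new codes starting at 256, so decoding it should give that string back.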
Assignment
Given an initial dictionary:
Index  Entry
1      a
2      b
3      h
4      i
5      s
6      t

and the output of an LZW encoder:

6 3 4 5 1 3 1 6 2 9 11 16

decode the above sequence (which is not intended to represent meaningful English).
Just For Thinking