- Save storage
Front Compression
Rear Compression
Hierarchic Compression
Huffman Coding
Front Compression
Given the first four employee name entries:
ROBERTON
ROBERTSON
ROBERTSTONE
ROBINSON
Suppose employee names are 12 characters long (b = blank)
ROBERTONbbbb
ROBERTSONbbb
ROBERTSTONEb
ROBINSONbbbb
Replace characters at the front of each entry that are the same as
those in the previous entry by a corresponding count.
ROBERTONbbbb
ROBERTSONbbb
ROBERTSTONEb
ROBINSONbbbb
Front compression
0 - ROBERTONbbbb
6 - SONbbb
7 - TONEb
3 - INSONbbbb
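The replacement rule above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names `front_compress` and `front_expand` are hypothetical, and the letter `b` stands in for a blank, as in the slides.

```python
def front_compress(entries):
    """Replace the prefix shared with the previous entry by its length."""
    compressed = []
    previous = ""
    for entry in entries:
        # Count the leading characters shared with the previous entry.
        count = 0
        for a, b in zip(previous, entry):
            if a != b:
                break
            count += 1
        compressed.append((count, entry[count:]))
        previous = entry
    return compressed


def front_expand(compressed):
    """Rebuild the full entries from (count, suffix) pairs."""
    entries = []
    previous = ""
    for count, suffix in compressed:
        entry = previous[:count] + suffix
        entries.append(entry)
        previous = entry
    return entries


names = ["ROBERTON", "ROBERTSON", "ROBERTSTONE", "ROBINSON"]
# Pad to 12 characters; "b" is the slides' placeholder for a blank.
padded = [name.ljust(12, "b") for name in names]

for count, suffix in front_compress(padded):
    print(count, "-", suffix)
```

Because each entry is rebuilt from the one before it, expansion has to proceed sequentially from the first entry; `front_expand` shows that the original entries are recovered without loss.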
Rear Compression
This compression can be achieved by dropping all characters to the
right of the one required to distinguish the entry in question from
its two immediate neighbors.
ROBERTONbbbb
ROBERTSONbbb
ROBERTSTONEb
ROBINSONbbbb
Rear compression
1 - ROBERTO
1 - ROBERTSO
3 - ROBERTST
4 - ROBI
With the count giving the number of characters dropped from each
12-character entry:
5 - ROBERTO
4 - ROBERTSO
4 - ROBERTST
8 - ROBI
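A minimal Python sketch of the rear-compression rule follows. It keeps each entry up to the first character that differs from both immediate neighbors and pairs the kept prefix with the number of characters dropped. That pairing is an assumption made here; it matches the listing with counts 5, 4, 4, 8. The names `common_prefix_len` and `rear_compress` are hypothetical.

```python
def common_prefix_len(a, b):
    """Number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def rear_compress(entries):
    """Keep only the characters needed to tell an entry from its neighbors."""
    result = []
    for i, entry in enumerate(entries):
        neighbors = []
        if i > 0:
            neighbors.append(entries[i - 1])
        if i + 1 < len(entries):
            neighbors.append(entries[i + 1])
        # One character past the longest prefix shared with a neighbor.
        shared = max((common_prefix_len(entry, n) for n in neighbors), default=0)
        keep = min(shared + 1, len(entry))
        # Assumed output format: (characters dropped, kept prefix).
        result.append((len(entry) - keep, entry[:keep]))
    return result


# "b" is the slides' placeholder for a blank.
padded = ["ROBERTONbbbb", "ROBERTSONbbb", "ROBERTSTONEb", "ROBINSONbbbb"]

for dropped, prefix in rear_compress(padded):
    print(dropped, "-", prefix)
```

Unlike front compression, the dropped characters cannot be rebuilt from the compressed form, so this technique suits index entries that only need to be distinguished from one another, not reproduced in full.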
Hierarchic Compression
ROBERTONbbbb
ROBERTSONbbb
ROBERTSTONEb
ROBINSONbbbb
Intra-file
Hierarchic Compression (inter-file)
Huffman Coding
- Bit string encodings are assigned to represent characters.
- Different characters are represented by strings of different lengths.
- Most commonly occurring characters are represented by the shortest
string.
Character   Frequency   Code
E           35%         1
A           30%         01
D           20%         001
C           10%         0001
B            5%         0000
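The code lengths in the table (1, 2, 3, 4, 4 bits) are consistent with the standard Huffman construction, which repeatedly merges the two least frequent nodes. The sketch below computes only the code lengths. The function name `huffman_code_lengths` is hypothetical, and ties are broken in favor of already-merged subtrees, a choice made here only so that the result matches the table. Other tie-breaks give different lengths with the same average.

```python
import heapq


def huffman_code_lengths(freqs):
    """Return the code length of each symbol under a Huffman code."""
    # Heap items are (weight, order, symbols). Leaves get order 0, 1, 2, ...
    # and merged nodes get -1, -2, ..., so a merged node wins a weight tie.
    heap = [(w, i, [sym]) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = {sym: 0 for sym in freqs}
    merged = 0
    while len(heap) > 1:
        w1, _, syms1 = heapq.heappop(heap)
        w2, _, syms2 = heapq.heappop(heap)
        # Every symbol under the new parent moves one level deeper.
        for sym in syms1 + syms2:
            lengths[sym] += 1
        merged += 1
        heapq.heappush(heap, (w1 + w2, -merged, syms1 + syms2))
    return lengths


freqs = {"E": 35, "A": 30, "D": 20, "C": 10, "B": 5}  # percentages from the table
lengths = huffman_code_lengths(freqs)
average = sum(freqs[s] * lengths[s] for s in freqs) / 100

print(lengths)
print(average, "bits per character on average")
```

For comparison, a fixed-length code for five characters needs 3 bits per character, so the variable-length code saves space whenever the frequencies are as skewed as in the table.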
Example: decoding the bit string 00110001010011
001 1 0001 01 001 1
D   E C    A  D   E
Result: DECADE
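The example bit string can be decoded mechanically because no code in the table is a prefix of another. The sketch below uses the table's codes as given. The names `CODES`, `encode`, and `decode` are hypothetical, and bits are held as a text string of "0" and "1" characters for readability only.

```python
# Codes copied from the table above.
CODES = {"E": "1", "A": "01", "D": "001", "C": "0001", "B": "0000"}


def encode(text, codes=CODES):
    """Concatenate the code of each character."""
    return "".join(codes[ch] for ch in text)


def decode(bits, codes=CODES):
    """Read bits left to right, emitting a character whenever a code matches."""
    lookup = {code: ch for ch, code in codes.items()}
    out = []
    buffer = ""
    for bit in bits:
        buffer += bit
        if buffer in lookup:
            out.append(lookup[buffer])
            buffer = ""
    if buffer:
        raise ValueError("trailing bits do not form a complete code")
    return "".join(out)


print(decode("00110001010011"))  # DECADE
```

The six characters take 14 bits here, against 18 bits with a fixed 3-bit code.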