1 Introduction
In data compression, one wishes to give a compact representation of data generated by a
data source. Depending upon the source of the data, the data could be of various types,
such as text data, image data, speech data, audio data, video data, etc. Data compression is
performed in order to more easily store the data or to more easily transmit the data. It is the
job of the data compression practitioner to design a data compression system for compressing
the data. Here is a block diagram illustrating a general data compression system:
In this diagram, we have termed the data generated by the data source as the source data,
and we have termed the compact representation of the source data the compressed data. The
data compression system consists of encoder and decoder. The encoder converts the source
data into the compressed data, and the decoder attempts to reconstruct the source data
from the compressed data. The reconstructed data generated by the decoder either coincides
with the source data or is perceptually indistinguishable from it.
The data compression practitioner would need to evaluate how good a potential data
compression system is. The compression ratio is a figure of merit via which this can be done.
By a compression ratio of r to 1, the data compression practitioner means that
r = (size of source data in bits) / (size of compressed data in bits)
Thus a compression ratio of 2 to 1 means that the compressed data is half the size of the
source data. The higher the compression ratio, the better the compression system is.
EXAMPLE. This example illustrates how data compression can assist one in storing
data or in transmitting data. Suppose the data source generates an arbitrary 512 × 512
digital image consisting of 256 colors. Each color is represented by an intensity from the
set {0, 1, 2, . . . , 255}. Mathematically, this image is a 512 × 512 matrix, each of whose
elements comes from the set {0, 1, 2, . . . , 255}. (These elements are called pixel elements.)
The intensity with which each pixel element is designated can be represented using 8 bits.
Thus, the size of the source data in bits is 8 × 512 × 512 = 2^21, which is about 2.1 megabits.
A hard disk holding one gigabit (10^9 bits) could thus store only about 476 of these images,
without compression.
Suppose, however, that one can compress each such image at a compression ratio of 8 to
1. Then, one can store about 3800 images on this hard disk! Suppose now that one wants
to transmit an uncompressed image over a telephone channel that can transmit 30,000 bits
per second. Computing 2^21/30000, one sees that it would take about 70 seconds to do this.
With the 8 to 1 compression ratio, the compressed image could be transmitted in under 9
seconds!
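The arithmetic in this example can be checked with a short script (shown in Python for illustration; the 10^9-bit disk capacity is an assumption chosen to match the figure of about 476 uncompressed images):

```python
# Size in bits of a 512 x 512 image at 8 bits per pixel.
bits = 8 * 512 * 512
assert bits == 2**21  # 2,097,152 bits

# Images fitting on a 10^9-bit disk, uncompressed and at 8-to-1 compression.
uncompressed = 10**9 // bits        # about 476
compressed = 10**9 // (bits // 8)   # about 3814, i.e. roughly 3800

# Seconds to send one uncompressed image over a 30,000 bit/s channel.
seconds = bits / 30000              # about 70
print(uncompressed, compressed, round(seconds))
```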
There are two varieties of data compression systems:
• lossless compression systems
• lossy compression systems
1.1 Lossless compression
In a lossless data compression system, the decoder is able to perfectly reconstruct the source
data. Thus, the block diagram of a lossless compression system looks like:
Let X = (x1, x2, . . . , xn) denote the source data, consisting of n samples from a finite
alphabet A, and let B(X) = (b1, b2, . . . , bk) denote the binary codeword of k codebits that
the encoder assigns to X. The compression ratio is then

r = (n log2 |A|) / k

since n log2 |A| is the number of bits needed to represent the n source samples without
coding, and |A| is the size of the alphabet A. Equivalently, we can use the compression rate as our
figure of merit. The compression rate R is defined by
R = k/n
The units of the compression rate R are “codebits per data sample”. Finding a compression
system with large compression ratio is the same as finding a compression system with small
compression rate.
EXAMPLE. We take as our data to be compressed the following 4 × 4 image, in the
four colors R = “red”, O = “orange”, Y = “yellow”, G = “green”:
R R O Y
R O O Y
O O Y G
Y Y Y G
Assign numbers to the colors via R = 3, O = 2, Y = 1, G = 0, so that, scanning the image
row by row, the data to be compressed is X = (3, 3, 2, 1, 3, 2, 2, 1, 2, 2, 1, 0, 1, 1, 1, 0).
Use the encoder given by the encoder table:
3 001
2 01
1 1
0 000
To use the encoder table, replace each symbol in X with the binary codeword for that symbol
given in the table. The compressed data is then
B(X) = (0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0)
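The encoding above can be reproduced with a short script (Python here, for illustration), assuming the color-to-number assignment R = 3, O = 2, Y = 1, G = 0 used later in the lossy example:

```python
# Codeword table from the text: symbol -> codeword bits.
table = {3: [0, 0, 1], 2: [0, 1], 1: [1], 0: [0, 0, 0]}

# The 4 x 4 image scanned row by row, with R = 3, O = 2, Y = 1, G = 0.
X = [3, 3, 2, 1, 3, 2, 2, 1, 2, 2, 1, 0, 1, 1, 1, 0]

# Concatenate the codewords to form the compressed data B(X).
B = [b for x in X for b in table[x]]
print(len(B))           # 31 codebits for the 16 samples
print(len(B) / len(X))  # compression rate R = 31/16 = 1.9375
```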
1.2 Lossy compression
As in the lossless case, we let X = (x1 , x2 , . . . , xn ) denote the source data. The reconstructed
data is denoted X̂ = (x̂1 , x̂2 , . . . , x̂n ). Notice the “quantizer” present in the lossy system
that was not present in the lossless system. The quantizer, in response to the data input
X, produces the output X̂ close enough to X (with respect to Euclidean distance or some
other distance function) that there will be no perceptual difference between X and X̂. The
quantizer works by alphabet reduction. (A simple quantizer is the “rounding off” function
— this quantizer would produce X̂ = (1, 3, 4, 2, 2) in response to X = (1.1, 2.6, 4.4, 2.3, 1.7),
thereby reducing the alphabet from {1.1, 2.6, 4.4, 2.3, 1.7} to {1, 2, 3, 4}.) Notice that the
encoder acts losslessly on X̂, the vector into which the vector X has been “quantized”
— the vector X̂ is assigned a unique binary codeword B(X̂) = (b1 , b2 , . . . , bk ), and then
the decoder reconstructs X̂ from B(X̂). However, the overall system is lossy because two
distinct X can be quantized into the same X̂.
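The rounding-off quantizer mentioned in the parenthetical remark can be sketched as follows (Python, for illustration):

```python
import math

def quantize(X):
    # Round each sample to the nearest integer (halves round up),
    # reducing a large alphabet to a small one.
    return [math.floor(x + 0.5) for x in X]

X = [1.1, 2.6, 4.4, 2.3, 1.7]
Xhat = quantize(X)
print(Xhat)               # [1, 3, 4, 2, 2]
print(sorted(set(Xhat)))  # reduced alphabet: [1, 2, 3, 4]
```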
In addition to the compression rate R = k/n, there is another figure of merit via which
the lossy compression system should be judged, namely, the distortion D induced by the
system, defined by
D = (1/n) ∑_{i=1}^{n} d(xi , x̂i )
where d is some distance function. If you have a lossy compression system with rate and
distortion R1 , D1 , and a second compression system with rate and distortion R2 , D2 , then
the first system is the better one if R1 < R2 and D1 < D2 . Unfortunately, however, it is not
possible to design a lossy compression system for which R and D are simultaneously small,
since these two parameters are inversely related. Instead, one’s design goal should be to find
a compression system that yields the smallest R for a fixed D, or the smallest D for a fixed
R. The theory detailing the R, D trade-offs that are possible in compression system design
is called rate distortion theory.
EXAMPLE. We lossily compress the same 4 × 4 image which was losslessly compressed
above. We quantize X = (3, 3, 2, 1, 3, 2, 2, 1, 2, 2, 1, 0, 1, 1, 1, 0) into X̂ = (3, 3, 2, 1, 3, 2, 2, 1, 2, 2, 1, 1, 1, 1, 1, 1).
(The quantizer operates sample-by-sample, leaving 3, 2, 1 unchanged and converting 0 into
1, thereby reducing the alphabet from {0, 1, 2, 3} to {1, 2, 3}.) Use the encoder given by the
encoding table:
encoder table
3 00
2 01
1 1
B(X̂) = (0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1)
The compression rate is R = 24/16 = 1.5 codebits per data sample, and the distortion is
D = (1/16) ∑_{i=1}^{16} |xi − x̂i | = 0.125
The reconstructed image is
R R O Y
R O O Y
O O Y Y
Y Y Y Y
Here is a question for the reader: Is there a lossy compression system for which R = 1.5 and
D < 0.125?
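The rate and distortion computed in this example can be checked with a short script (Python, for illustration):

```python
# Source data and sample-by-sample quantizer (0 -> 1, others unchanged).
X = [3, 3, 2, 1, 3, 2, 2, 1, 2, 2, 1, 0, 1, 1, 1, 0]
Xhat = [1 if x == 0 else x for x in X]

# Lossless code for the reduced alphabet {1, 2, 3}.
table = {3: [0, 0], 2: [0, 1], 1: [1]}
B = [b for x in Xhat for b in table[x]]

R = len(B) / len(X)                                      # 24/16 = 1.5
D = sum(abs(x - xh) for x, xh in zip(X, Xhat)) / len(X)  # 2/16 = 0.125
print(R, D)
```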
Whether one performs lossless or lossy compression depends on the type of data one has.
In text compression, for example, one would want to do lossless compression because one
would want perfect reconstruction of the text from its compressed version. In image
compression, lossy compression is frequently used because the reconstructed image can appear to
be perceptually the same as the original image, without coinciding with the original image.
Lossy compression, where feasible, yields a gain in compression over lossless compression.
For example, in the compression of images a compression ratio of 2 to 1 may be the best one
can do with lossless compression, whereas a compression ratio of 8 to 1 may be feasible for
lossy compression.
1.3 MATLAB m-files
We present some MATLAB functions that will be useful to us later on. MATLAB (unlike
LISP and Mathematica, which are list processing languages) cannot handle binary sequences
(bit strings) of varying lengths very well. The MATLAB programs we present here allow
MATLAB to deal with bitstrings by converting them to integer indices, using the following
list of all bitstrings, ordered by length and, within each length, in increasing binary order:
0, 1, 00, 01, 10, 11, 000, 001, 010, 011, 100, 101, 110, 111, 0000, . . .
To find the position of a bitstring x in this list, convert the bitstring 1x to integer form
and then subtract 1. For example, the bitstring 0001 is the 16-th bitstring in the list
because when you convert 10001 to integer form you get 17, and 17 − 1 = 16.
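The m-file itself is not reproduced here, but the indexing rule is short enough to sketch (in Python, with bitstrings written as strings, an illustrative assumption):

```python
def bitstring_to_index(x):
    # Prepend '1', read the result as a binary integer, and subtract 1.
    return int('1' + x, 2) - 1

print(bitstring_to_index('0001'))  # 16, since 10001 in binary is 17
print(bitstring_to_index('0'))     # 1, the first bitstring in the list
```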
If you execute the MATLAB line
bitstring_to_index([0 0 0 1])
you will see that MATLAB returns the index 16.
The function “index_to_bitstring” is the inverse of the function “bitstring_to_index”. Thus
if you execute the MATLAB line
index_to_bitstring(16)
you will see that MATLAB will give you the vector [0 0 0 1].
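The inverse map can be sketched the same way (Python, with string bitstrings; the actual m-file works with numeric vectors):

```python
def index_to_bitstring(i):
    # Add 1, write in binary, and drop the leading '0b1'.
    return bin(i + 1)[3:]

print(index_to_bitstring(16))  # '0001'
print(index_to_bitstring(5))   # '10'
```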
The MATLAB function “print_bitstrings” is used for printing bitstrings stored in MATLAB memory to the
screen. For example, if you execute the MATLAB line
print_bitstrings([16 17 18 19])
you will see the following bitstrings printed on the screen:
0001
0010
0011
0100
These are the 16-th, 17-th, 18-th, and 19-th bitstrings in the list of all bitstrings.
The MATLAB function “archive_bitstrings” is very much like “print_bitstrings”, except that
the bitstrings are printed to a file named “bitstring.txt”. For example, executing the
MATLAB line
archive_bitstrings([16 17 18 19])
will put the binary strings 0001, 0010, 0011, 0100 in the file “bitstring.txt”. Try it!
As we have seen, the function “bitstring_to_index” can be used to enter one bitstring into
MATLAB memory. To enter two or more bitstrings into MATLAB memory, you use the
function “input_bitstrings”. You form one big vector whose components are the components
of the bitstrings to go in memory, with the components of each bitstring separated from the
components of the next bitstring by a component of “2”. For example, suppose you want to
store the bitstrings 110, 000, 10, 1 in MATLAB memory. Form the vector
x = [1 1 0 2 0 0 0 2 1 0 2 1]
and then execute the MATLAB line
input_bitstrings(x)
You will then see that MATLAB returns the vector [13 7 5 2], indicating that 110, 000, 10, 1
are the 13-th, 7-th, 5-th, and 2-nd bitstrings in the list of all bitstrings. To check this result,
execute the MATLAB line
print_bitstrings([13 7 5 2])
and you will see the bitstrings 110, 000, 10, 1 printed out on the screen.
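The separator scheme of “input_bitstrings” can be sketched as follows (Python, for illustration), combining the split-at-2 rule with the prepend-1-and-subtract-1 indexing rule described earlier:

```python
def input_bitstrings(v):
    # Join the components, split at each separator 2, and index
    # each bitstring by the prepend-1-and-subtract-1 rule.
    s = ''.join(str(c) for c in v)
    return [int('1' + piece, 2) - 1 for piece in s.split('2')]

# The bitstrings 110, 000, 10, 1 separated by components of 2.
x = [1, 1, 0, 2, 0, 0, 0, 2, 1, 0, 2, 1]
print(input_bitstrings(x))  # [13, 7, 5, 2]
```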