
CHAPTER 1

INTRODUCTION

1.1 Data compression

In information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation.
Any particular compression is either lossy or lossless. Lossless compression reduces
bits by identifying and eliminating statistical redundancy. No information is lost in
lossless compression. Lossy compression reduces bits by removing unnecessary or
less important information. Typically, a device that performs data compression is
referred to as an encoder, and one that performs the reversal of the process
(decompression) as a decoder.

The process of reducing the size of a data file is often referred to as data
compression. In the context of data transmission, it is called source coding; encoding
done at the source of the data before it is stored or transmitted. Source coding should
not be confused with channel coding, which is used for error detection and correction,
or with line coding, the means for mapping data onto a signal. Compression is useful because it reduces
the resources required to store and transmit data. Computational resources are
consumed in the compression and decompression processes. Data compression is
subject to a space–time complexity trade-off. For instance, a compression scheme for
video may require expensive hardware for the video to be decompressed fast enough
to be viewed as it is being decompressed, and the option to decompress the video in
full before watching it may be inconvenient or require additional storage. The design
of data compression schemes involves trade-offs among various factors, including the
degree of compression, the amount of distortion introduced (when using lossy data
compression), and the computational resources required to compress and decompress
the data.

1.1.1 Lossless compression
Lossless data compression algorithms usually exploit statistical
redundancy to represent data without losing any information, so that the process is
reversible. Lossless compression is possible because most real-world data exhibits
statistical redundancy. For example, an image may have areas of color that do not
change over several pixels; instead of coding "red pixel, red pixel, ..." the data may be
encoded as "279 red pixels". This is a basic example of run-length encoding; there are
many schemes to reduce file size by eliminating redundancy.

The Lempel–Ziv (LZ) compression methods are among the most popular
algorithms for lossless storage. DEFLATE is a variation on LZ optimized for
decompression speed and compression ratio, but compression can be slow. In the mid-
1980s, following work by Terry Welch, the Lempel–Ziv–Welch (LZW) algorithm
rapidly became the method of choice for most general-purpose compression systems.
LZW is used in GIF images, programs such as PKZIP, and hardware devices such as
modems. LZ methods use a table-based compression model where table entries are
substituted for repeated strings of data. For most LZ methods, this table is generated
dynamically from earlier data in the input. The table itself is often Huffman encoded.
Grammar-based codes take a different approach: their basic task is constructing a
context-free grammar deriving a single string, and they can compress highly repetitive
input extremely effectively, for instance, a biological data collection of the same or
closely related species, a huge versioned document collection, or internet archives.
Practical grammar compression algorithms include Sequitur and Re-Pair.

The strongest modern lossless compressors use probabilistic models, such as
prediction by partial matching. The Burrows–Wheeler transform can also be viewed
as an indirect form of statistical modelling. In a further refinement of the direct use of
probabilistic modelling, statistical estimates can be coupled to an algorithm called
arithmetic coding. Arithmetic coding is a more modern coding technique that uses the
mathematical calculations of a finite-state machine to produce a string of encoded bits
from a series of input data symbols. It can achieve superior compression compared to
other techniques such as the better-known Huffman algorithm. It uses an internal
memory state to avoid the need to perform a one-to-one mapping of individual input
symbols to distinct representations that use an integer number of bits, and it clears out
the internal memory only after encoding the entire string of data symbols. Arithmetic
coding applies especially well to adaptive data compression tasks where the statistics
vary and are context-dependent, as it can be easily coupled with an adaptive model of
the probability distribution of the input data. An early example of the use of
arithmetic coding was in an optional (but not widely used) feature of the JPEG image
coding standard. It has since been applied in various other designs including H.263,
H.264/MPEG-4 AVC and HEVC for video coding.

1.1.2 Lossy compression


In the early 1990s, lossy compression methods began to be widely used. In
these schemes, some loss of information is accepted as dropping nonessential detail
can save storage space. There is a corresponding trade-off between preserving
information and reducing size. Lossy data compression schemes are designed by
research on how people perceive the data in question. For example, the human eye is
more sensitive to subtle variations in luminance than it is to the variations in color.
JPEG image compression works in part by rounding off nonessential bits of
information. Several popular compression formats exploit these perceptual
differences, including psychoacoustics for sound, and psychovisuals for images and
video.

Most forms of lossy compression are based on transform coding, especially
the discrete cosine transform (DCT). It was first proposed in 1972 by Nasir Ahmed,
who then developed a working algorithm with T. Natarajan and K. R. Rao in 1973,
before introducing it in January 1974. DCT is the most widely used lossy compression
method and is used in multimedia formats for images (such as JPEG and HEIF), video
(such as MPEG, AVC and HEVC) and audio (such as MP3, AAC and Vorbis).

Lossy image compression is used in digital cameras to increase storage
capacities. Similarly, DVDs, Blu-ray, and streaming video use lossy video coding
formats. Lossy compression is extensively used in video. In lossy audio compression,
methods of psychoacoustics are used to remove non-audible (or less audible)
components of the audio signal. Compression of human speech is often performed
with even more specialized techniques; speech coding is distinguished as a separate
discipline from general-purpose audio compression. Speech coding is used in internet
telephony, while general-purpose audio compression is used, for example, for CD
ripping, where it is decoded by the audio players.

Lossy compression is widely used to reduce psychovisual redundancies in
image and video compression. It is also used in audio compression, where a loss of
quality is permitted up to an extent.

1.1.3 Basic data compression model


 Removal or reduction of data redundancy is achieved by transforming the original
data from one form or representation to another.
 The next step leads to the reduction of entropy. The entropy-reduction phase is a
non-reversible process; it is achieved by dropping insignificant information using
quantization techniques.
 The last phase is Entropy Coding (EC), which compresses the data efficiently.
 The input data might be of any form, i.e., text, images, video, audio, etc., and the
output is a compressed data stream which, when decompressed, gives us the
original data stream or an equivalent representation.

Fig 1.1: A basic data compression model.

1.1.4 Compression ratio
Data compression ratio, also known as compression power, is a measurement
of the relative reduction in size of data representation produced by a data compression
algorithm. It is typically expressed as the division of uncompressed size by
compressed size.

Compression ratio = (size of original file) / (size of compressed file)

Space saving = 1 − (size of compressed file) / (size of original file)
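
As an illustration, both metrics can be computed directly from the sizes of the files on disk. A minimal MATLAB sketch follows; the file names are hypothetical placeholders:

% Compute compression ratio and space saving from two files on disk.
orig = dir('original.raw');        % hypothetical uncompressed file
comp = dir('compressed.bin');      % hypothetical compressed file
ratio  = orig.bytes / comp.bytes;             % compression ratio
saving = 1 - comp.bytes / orig.bytes;         % space saving
fprintf('ratio = %.2f, space saving = %.1f%%\n', ratio, 100 * saving);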

1.1.5 Redundancies with respect to images


Data redundancy is the existence of data that is additional to the actual data. In
images, the redundant information takes one of three forms: coding redundancy,
psycho-visual redundancy, and interpixel redundancy. Image compression algorithms
work on images by exploiting these qualities.

1.2 Aim of the project

To achieve better compression ratios for medical images using the BWT as the
main transformation, and to reduce the time taken to compress data using the BWT in
a different setup compared to the conventional one.

1.3 Methodology

In this project, we explore the Burrows-Wheeler Transform (BWT), which was
originally designed for text compression. We experiment with various stages of the
initially proposed Burrows-Wheeler Compression Algorithm (BWCA) described by
Michael Burrows and D. J. Wheeler in their 1994 paper titled “A block-sorting
lossless data compression algorithm”.[2] We obtain the compression ratio according
to the formula given in section 1.1.4. We implement the project in MATLAB, since it
is well suited to handling arrays of multiple dimensions, and the project relies heavily
on array manipulation.

1.4 Significance of the work

Nowadays, digital image compression has become a crucial factor of modern
telecommunication systems. Image compression is the process of reducing the total
bits required to represent an image by reducing redundancies while preserving the
image quality as much as possible. Various applications, including the internet,
multimedia, satellite imaging, and medical imaging, use image compression. Image
compression saves storage space and transmission bandwidth. The most neglected
among the above-mentioned fields is compression for medical imaging, primarily for
historic reasons. However, with advances in medical technology, many new scans
and imaging techniques have been developed, and data in the field of medicine is
growing at a higher-than-normal rate. We therefore need compression algorithms
tailored to the particular needs of medical images, exploiting the redundancies
specific to them. In our project we focus on the field of medical imaging and develop
a lossless compression technique using the BWT. Since most present-day medical
images, such as Computed Tomography (CT), Magnetic Resonance Imaging (MRI),
and X-radiation (X-ray) scans, are in grayscale, we present our model for compressing
grayscale images.

1.5 Organization of thesis

 Chapter 2 contains the details of the literature survey conducted regarding our
project.
 Chapter 3 contains a formal introduction to the Burrows-Wheeler Transform
and focuses on its use in text compression, how the actual transformation
takes place, and some analysis of the algorithm.
 Chapter 4 deals with the proposed model and explains all the algorithms used
in the project using a running example. It also covers the fundamentals of
an image and formally defines an image.
 Chapter 5 contains a brief text about MATLAB, its features, and
specifications. We focus mainly on the implementation of our project in
MATLAB.
 Chapters 6 and 7 contain the results obtained and conclusions drawn by the
members from the presented work.

CHAPTER 2
LITERATURE SURVEY

2.1 Introduction

The Burrows–Wheeler transform (BWT, also called block-sorting lossless
compression or the combinatorial transform) rearranges a character string into runs of
similar characters. This is useful for compression, since it tends to be easy to
compress a string that has runs of repeated characters by techniques such as move-to-
front transform and run-length encoding. More importantly, the transformation
is reversible, without needing to store any additional data except the position of the
first original character. The BWT is thus a "free" method of improving the efficiency
of text compression algorithms, costing only some extra computation.

The Burrows–Wheeler transform is an algorithm used to prepare data for use
with data compression techniques such as bzip2.[1] It was invented by Michael
Burrows and David Wheeler in 1994 while Burrows was working at DEC Systems
Research Center in Palo Alto, California.[2] It is based on a previously unpublished
transformation discovered by Wheeler in 1983. The algorithm can be implemented
efficiently using a suffix array, thus reaching linear time complexity.[3]

2.2 Original scheme of BWT in text compression

A typical scheme of the Burrows-Wheeler Compression Algorithm (BWCA)
was introduced by Abel.[5] It consists of four stages, as shown in Fig 2.1. Each stage
is a transformation of the input data and passes its output to the next stage. The
stages are processed sequentially from left to right. The first stage is the BWT itself.
It sorts the data in such a way that symbols with a similar context are grouped closely
together, and it keeps the number of symbols constant during the transformation.[4]
The second stage is the Global Structure Transform (GST), which transforms the
local context of the symbols to a global context. A typical representative of a GST
stage is the Move-To-Front Transform (MTF); Burrows and Wheeler introduced it in
their original publication, and it was the first algorithm used as a GST stage in the
original BWCA scheme. The MTF stage is a List Update Algorithm (LUA), which
replaces the input symbols with corresponding ranking values. Just like the BWT
stage, the LUA stage does not alter the number of symbols.

Input text → BWT → GST → DC → Compressed text
(the BWT and GST stages perform no compression; only the final DC stage does)

Fig 2.1: Typical scheme of the Burrows-Wheeler Compression Algorithm.

The last stage is the Entropy Coding (EC) stage, which compresses the
symbols by using an adapted model. We focus on lossless compression due to the
intended applications in the medical field; nevertheless, this scheme can be considered
for lossless image compression as well as for lossy image compression. In the lossy
configuration, a DCT-based pre-processing step is added before compression.

2.3 Addition of RLE

The main function of the RLE is to support the probability estimation of the
next stage. Long runs of identical values tend to overestimate the global symbol
probability, which leads to lower compression. Balkenhol and Shtarkov call this
phenomenon "the pressure of runs" [6]. The RLE stage helps to decrease this
pressure. To improve the probability estimation of the EC stage, common BWCA
schemes position the RLE stage directly in front of the EC stage.

One common RLE stage for BWT-based compressors is Run Length
Encoding Zero (RLE-0). Wheeler suggested coding only the runs of the 0 symbol and
no runs of other symbols, since 0 is the symbol with the most runs. To this end, an
offset of 1 is added to symbols greater than 0. The run length is incremented by one,
and all bits of its binary representation except the most significant bit – which is
always 1 – are stored with the symbols 0 and 1. Some authors have suggested an RLE
stage before the BWT stage for speed optimization and for reducing the BWT input,
but such a stage generally deteriorates the compression ratio. Instead, specific sorting
algorithms are used to arrange the runs of symbols in practically linear time.

Another type of Run Length Encoding is RLE-2s, which has been used by Abel [1].
The RLE-2s stage replaces all runs of two or more symbols by a run consisting of
exactly two symbols. In contrast to other approaches, the length of the run is not
placed behind the two symbols inside the symbol stream but transmitted in a
separate data stream, so the length information does not disturb the context of the
main data stream.

2.4 Improvement of Global Structure Transform

Most GST stages use a recency ranking scheme for the List Update problem,
like the Move-To-Front (MTF) algorithm used in the original BWCA approach of
Burrows and Wheeler. Many authors have presented improved MTF stages based on
a delayed behaviour, such as the MTF-1 and MTF-2 approaches of Balkenhol et al.
or a sticky version by Fenwick [5]. Another approach, which achieves a much better
compression ratio than MTF stages, is the Weighted Frequency Count (WFC) stage
presented by Deorowicz [7]; this scheme, however, has a very high computational
cost. Other GST schemes, like Inversion Frequencies (IF) [6], use a distance
measurement between the occurrences of the same symbol. Similar to the WFC stage
of Deorowicz, Abel presented a list-of-counters stage, the Incremental Frequency
Count (IFC) [1]; it differs from the WFC stage in that it minimizes computation.

2.5 Improvement of entropy coding

The very first proposition of Burrows and Wheeler was to use the Huffman
coder as the last stage; it is fast and simple, but the arithmetic coder is a better choice
for achieving a better compression ratio. Abel modified the arithmetic coding because
the coding of the IFC output inside the EC stage has a strong influence on the
compression rate; indeed, it is not sufficient to compress the index stream just by a
simple arithmetic coder with a common order-n context. The index frequency of the
IFC output has a nonlinear decay. Even after the use of an RLE-2 stage, the index 0 is
still the most common index symbol on average.

In their 1994 paper titled “A block-sorting lossless data compression
algorithm” [2], Burrows and Wheeler proposed a coding scheme based on Huffman
coding, a variable-length code that encodes highly probable symbols with minimal-
length codes and less probable symbols with maximal-length codes. These codes
follow the prefix property, which means that no code word assigned to a symbol is a
prefix of another code word. It is also known as the prefix-free property.

2.6 Conclusion
In this chapter we discussed the conventional use of the BWT in text
compression and shed some light on some improvements and additions to the
classical BWCA.

CHAPTER 3
BURROWS-WHEELER TRANSFORM

3.1 Introduction

The most widely used data compression algorithms are based on the sequential
data compressors of Lempel and Ziv. Statistical modelling techniques may produce
superior compression but are significantly slower. In this chapter, we present a
technique that achieves compression within a percent or so of that achieved by
statistical modelling techniques, but at speeds comparable to those of algorithms
based on Lempel and Ziv’s.

Our algorithm does not process its input sequentially, but instead processes a
block of text as a single unit. The idea is to apply a reversible transformation to a
block of text to form a new block that contains the same characters, but is easier to
compress by simple compression algorithms. The transformation tends to group
characters together so that the probability of finding a character close to another
instance of the same character is increased substantially. Text of this kind can easily
be compressed with fast locally-adaptive algorithms, such as move-to-front coding in
combination with Huffman or arithmetic coding.

Briefly, our algorithm transforms a string S of N characters by forming the N
rotations (cyclic shifts) of S, sorting them lexicographically, and extracting the last
character of each of the rotations. A string L is formed from these characters, where
the ith character of L is the last character of the ith sorted rotation. In addition to L,
the algorithm computes the index I of the original string S in the sorted list of
rotations. Surprisingly, there is an efficient algorithm to compute the original string S
given only L and I.

The sorting operation brings together rotations with the same initial characters.
Since the initial characters of the rotations are adjacent to the final characters,
consecutive characters in L are adjacent to similar strings in S. If the context of a
character is a good predictor for the character, L will be easy to compress with a
simple locally-adaptive compression algorithm.

In the following sections, we describe the transformation in more detail, and
show that it can be inverted. We explain more carefully why this transformation tends
to group characters to allow a simple compression algorithm to work more
effectively. We then describe efficient techniques for implementing the
transformation and its inverse, allowing this algorithm to be competitive in speed with
Lempel-Ziv-based algorithms, but achieving better compression. Finally, we give the
performance of our implementation of this algorithm and compare it with well-
known compression programs.

3.2 The reversible transformation

This section describes two sub-algorithms: Algorithm 1, which performs the
reversible transformation that we apply to a block of text before compressing it, and
Algorithm 2, which performs the inverse operation. A later section suggests a method
for compressing the transformed block of text.

In the description below, we treat strings as vectors whose elements are characters.

3.2.1 Algorithm 1: Compression transformation


This algorithm takes as input a string S of N characters S [0], …, S [N – 1]
selected from an ordered alphabet X of characters. To illustrate the technique, we also
give a running example, using the string S = ‘a b r a c a d a b r a a b r a c a d a b r a’,
N = 22, and the alphabet X is the ASCII table taken in hexadecimal.

So, here our string S translates to its hexadecimal equivalent, given by taking
the value mapped to each character of S from the ASCII table and converting it to
hexadecimal. Now, S = 61 62 72 61 63 61 64 61 62 72 61 61 62 72 61 63 61 64 61 62
72 61.

Step 1: [sort rotations]

Form a conceptual N x N matrix M whose elements are the characters, and
whose rows are the rotations (cyclic shifts) of S, sorted in lexicographical order. At
least one of the rows of M contains the original string S. Let I be the index of the first
such row, numbering from zero.

In our example, the matrix M is of order N x N, i.e., 22 x 22.

Table 3.1: Matrix M of the cyclic rotations.

position a b r a c a d a b r a a b r a c a d a b r a
0 a b r a c a d a b r a a b r a c a d a b r a
1 b r a c a d a b r a a b r a c a d a b r a a
2 r a c a d a b r a a b r a c a d a b r a a b
3 a c a d a b r a a b r a c a d a b r a a b r
4 c a d a b r a a b r a c a d a b r a a b r a
5 a d a b r a a b r a c a d a b r a a b r a c
6 d a b r a a b r a c a d a b r a a b r a c a
7 a b r a a b r a c a d a b r a a b r a c a d
8 b r a a b r a c a d a b r a a b r a c a d a
9 r a a b r a c a d a b r a a b r a c a d a b
10 a a b r a c a d a b r a a b r a c a d a b r
11 a b r a c a d a b r a a b r a c a d a b r a
12 b r a c a d a b r a a b r a c a d a b r a a
13 r a c a d a b r a a b r a c a d a b r a a b
14 a c a d a b r a a b r a c a d a b r a a b r
15 c a d a b r a a b r a c a d a b r a a b r a
16 a d a b r a a b r a c a d a b r a a b r a c
17 d a b r a a b r a c a d a b r a a b r a c a
18 a b r a a b r a c a d a b r a a b r a c a d
19 b r a a b r a c a d a b r a a b r a c a d a
20 r a a b r a c a d a b r a a b r a c a d a b
21 a a b r a c a d a b r a a b r a c a d a b r

Now we need to sort the rows of this matrix in lexicographical order, i.e., in
alphabetical order, considering each row as a word formed by its characters. We
observe that each row and column of M is a permutation of S. In the table below, we
present M after the rows are sorted.

Table 3.2: Matrix M after sorting the rows.

position F … L
10 a a b r a c a d a b r a a b r a c a d a b r
21 a a b r a c a d a b r a a b r a c a d a b r
18 a b r a a b r a c a d a b r a a b r a c a d
7 a b r a a b r a c a d a b r a a b r a c a d
0 a b r a c a d a b r a a b r a c a d a b r a
11 a b r a c a d a b r a a b r a c a d a b r a
3 a c a d a b r a a b r a c a d a b r a a b r
14 a c a d a b r a a b r a c a d a b r a a b r
5 a d a b r a a b r a c a d a b r a a b r a c
16 a d a b r a a b r a c a d a b r a a b r a c
8 b r a a b r a c a d a b r a a b r a c a d a
19 b r a a b r a c a d a b r a a b r a c a d a
1 b r a c a d a b r a a b r a c a d a b r a a
12 b r a c a d a b r a a b r a c a d a b r a a
4 c a d a b r a a b r a c a d a b r a a b r a
15 c a d a b r a a b r a c a d a b r a a b r a
6 d a b r a a b r a c a d a b r a a b r a c a
17 d a b r a a b r a c a d a b r a a b r a c a
9 r a a b r a c a d a b r a a b r a c a d a b
20 r a a b r a c a d a b r a a b r a c a d a b
2 r a c a d a b r a a b r a c a d a b r a a b
13 r a c a d a b r a a b r a c a d a b r a a b

Step 2: [find last characters of rotations]

Let the string L be the last column of M, with characters L[0], …, L[N – 1]
(equal to M [0, N – 1], …, M [N – 1, N – 1]). The output of the transformation is the
pair (L, I).

In our example, L = ‘r r d d a a r r c c a a a a a a a a b b b b’, which translates
to the string L = 72 72 64 64 61 61 72 72 63 63 61 61 61 61 61 61 61 61 62 62 62 62,
and I = 4.
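
The transformation can be prototyped directly in MATLAB. The following is a naive sketch of our own (not from the original paper): it builds all N rotations explicitly, so it is only practical for small blocks.

function [L, I] = bwt_forward(S)
% Naive forward BWT: build the N x N matrix of cyclic rotations of S,
% sort its rows lexicographically, and return the last column L
% together with the 0-based index I of the row holding the original S.
S = S(:).';                           % ensure S is a row vector
N = numel(S);
M = repmat(S, N, 1);
for k = 2:N
    M(k, :) = circshift(S, -(k - 1)); % k-th cyclic left shift of S
end
[M, order] = sortrows(M);             % sort the rotations
I = find(order == 1) - 1;             % where the original string ended up
L = M(:, N).';                        % last character of each sorted rotation
end

For the running example, [L, I] = bwt_forward('abracadabraabracadabra') returns L = 'rrddaarrccaaaaaaaabbbb' and I = 4. Practical implementations avoid the N x N matrix by sorting a suffix array instead.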

3.2.2 Algorithm 2: Decompression transformation


We use the example and notation introduced in Algorithm 1. Algorithm 2 uses
the output of Algorithm 1, i.e., (L, I) to reconstruct its input, the string S of length N.

Step 1: [find first characters of rotations]

This step calculates the first column F of the matrix M of Algorithm 1. This is
done by sorting the characters of L to form F. We observe that any column of the
matrix M is a permutation of the original string S, and therefore of one another.
Furthermore, because the rows of M are sorted, and F is the first column of M, the
characters in F are also sorted.

In our example, F = ‘a a a a a a a a a a b b b b c c d d r r r r’, which when
translated into hexadecimal ASCII gives

F = 61 61 61 61 61 61 61 61 61 61 62 62 62 62 63 63 64 64 72 72 72 72.

Step 2: [build a list of predecessor characters]

To assist our explanation, we describe this step in terms of the contents of the
matrix M. The reader should remember that the complete matrix is not available to the
decompressor; only the strings F, L, and the index I (from the input) are needed by
this step.

Consider the rows of the matrix M that start with some given character ch.
Algorithm 1 ensured that the rows of matrix M are sorted lexicographically, so the
rows that start with ch are ordered lexicographically.

Table 3.3: The sorted matrix M with rows renumbered from 0 to N – 1.

position F … L
0 a a b r a c a d a b r a a b r a c a d a b r
1 a a b r a c a d a b r a a b r a c a d a b r
2 a b r a a b r a c a d a b r a a b r a c a d
3 a b r a a b r a c a d a b r a a b r a c a d
4 a b r a c a d a b r a a b r a c a d a b r a
5 a b r a c a d a b r a a b r a c a d a b r a
6 a c a d a b r a a b r a c a d a b r a a b r
7 a c a d a b r a a b r a c a d a b r a a b r
8 a d a b r a a b r a c a d a b r a a b r a c
9 a d a b r a a b r a c a d a b r a a b r a c
10 b r a a b r a c a d a b r a a b r a c a d a
11 b r a a b r a c a d a b r a a b r a c a d a
12 b r a c a d a b r a a b r a c a d a b r a a
13 b r a c a d a b r a a b r a c a d a b r a a
14 c a d a b r a a b r a c a d a b r a a b r a
15 c a d a b r a a b r a c a d a b r a a b r a
16 d a b r a a b r a c a d a b r a a b r a c a
17 d a b r a a b r a c a d a b r a a b r a c a
18 r a a b r a c a d a b r a a b r a c a d a b
19 r a a b r a c a d a b r a a b r a c a d a b
20 r a c a d a b r a a b r a c a d a b r a a b
21 r a c a d a b r a a b r a c a d a b r a a b

Like M, each row of M ‘ is a rotation of S, and for each row of M there is a
corresponding row in M ‘. We constructed M ‘ from M so that the rows of M ‘ are
sorted lexicographically starting with their second character. So, if we consider only
those rows in M ‘ that start with a character ch, they must appear in lexicographical
order relative to one another; they have been sorted lexicographically starting with
their second characters, and their first characters are all the same and so do not affect
the sort order. Therefore, for any given character ch, the rows in M that begin with ch
appear in the same order as the rows in M ‘ that begin with ch.

Table 3.4: Matrix M ‘ formed using M.

position
0 r a a b r a c a d a b r a a b r a c a d a b
1 r a a b r a c a d a b r a a b r a c a d a b
2 d a b r a a b r a c a d a b r a a b r a c a
3 d a b r a a b r a c a d a b r a a b r a c a
4 a a b r a c a d a b r a a b r a c a d a b r
5 a a b r a c a d a b r a a b r a c a d a b r
6 r a c a d a b r a a b r a c a d a b r a a b
7 r a c a d a b r a a b r a c a d a b r a a b
8 c a d a b r a a b r a c a d a b r a a b r a
9 c a d a b r a a b r a c a d a b r a a b r a
10 a b r a a b r a c a d a b r a a b r a c a d
11 a b r a a b r a c a d a b r a a b r a c a d
12 a b r a c a d a b r a a b r a c a d a b r a
13 a b r a c a d a b r a a b r a c a d a b r a
14 a c a d a b r a a b r a c a d a b r a a b r
15 a c a d a b r a a b r a c a d a b r a a b r
16 a d a b r a a b r a c a d a b r a a b r a c
17 a d a b r a a b r a c a d a b r a a b r a c
18 b r a a b r a c a d a b r a a b r a c a d a
19 b r a a b r a c a d a b r a a b r a c a d a
20 b r a c a d a b r a a b r a c a d a b r a a
21 b r a c a d a b r a a b r a c a d a b r a a

M ‘ is defined as the matrix formed by rotating each row of M one character to
the right, so for each i = 0, …, N – 1, and each j = 0, …, N – 1,

M ‘ [i, j] = M [i, ( j - 1) mod N]

In our example, this fact is demonstrated by the rows that begin with ‘a’. The
rows 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9 in M correspond to the rows 4, 5, 10, 11, 12, 13, 14,
15, 16, and 17 in M ‘.

Using F and L, the first columns of M and M ‘ respectively, we calculate a
vector T that indicates the correspondence between the rows of the two matrices, in
the sense that for each j = 0, …, N – 1, row j of M ‘ corresponds to row T [ j] of M.

If L[ j] is the kth instance of ch in L, then T [ j] = i where F [i] is the kth
instance of ch in F. Note that T represents a one-to-one correspondence between
elements of F and elements of L, and F [T [ j]] = L [ j].

In our example, T is: (18, 19, 16, 17, 0, 1, 20, 21, 14, 15, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 13).

Step 3: [form output S]

Now, for each i = 0, …, N – 1, the characters L[i] and F[i] are the last and
the first characters of the row i of M. Since each row is a rotation of S, the character
L[i] cyclicly precedes the character F[i] in S. From the construction of T, we have
F[T[ j]] = L[ j]. Substituting i = T[ j], we see that L[T[ j]] cyclicly precedes L[ j] in S.

The index I is defined by Algorithm 1 such that row I of M is S. Thus, the last
character of S is L[I]. We use the vector T to give the predecessors of each character:

for each i = 0, …, N – 1: S[N – 1 – i] = L[T^i[I]],

where T^0[x] = x and T^(i+1)[x] = T[T^i[x]]. This yields S, the original input to the
compressor. In our example, S = ‘a b r a c a d a b r a a b r a c a d a b r a’.

We could have defined T so that the string S would be generated from front to
back, rather than the other way around.

The sequence T^i[I] for i = 0, …, N – 1 is not necessarily a permutation of the
numbers 0, …, N – 1. If the original string is of the form Z^p for some substring Z and
some p > 1, then the sequence T^i[I] for i = 0, …, N – 1 will also be of the form Z1^p
for some subsequence Z1. That is, the repetitions in S will be generated by visiting the
same elements of T repeatedly. For example, if S = ‘c a n c a n’, Z = ‘c a n’ and p =
2, the sequence T^i[I] for i = 0, …, N – 1 will be (2, 4, 0, 2, 4, 0).

Step 4: [alternate representation of step 3]

Now, after step 2, to obtain S, instead of using step 3, which gives S in an
inverted manner, we define T in a different manner to generate S from front to back.
We calculate T such that if F[ j] is the kth instance of ch in F, then T [ j] is the index
of the kth instance of ch in L.

Table 3.5: Mapping between F and L.

j = Position L = Input F = Context T = Link
0 r a 4
1 r a 5
2 d a 10
3 d a 11
4 a a 12 (← I)
5 a a 13
6 r a 14
7 r a 15
8 c a 16
9 c a 17
10 a b 18
11 a b 19
12 a b 20
13 a b 21
14 a c 8
15 a c 9
16 a d 2
17 a d 3
18 b r 0
19 b r 1
20 b r 6
21 b r 7
And we have L [T [ j]] = F [ j].

In our example, T is: (4, 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 8, 9, 2,
3, 0, 1, 6, 7).

The index I is defined by Algorithm 1 such that row I of M is S. Thus, the
first character of S is F[I]. We use the vector T to get the successors of each character:

for each i = 0, …, N – 1: S[i] = F[T^i[I]],

where T^0[x] = x and T^(i+1)[x] = T[T^i[x]]. This yields S, the original input to the
compressor. In our example, S = ‘a b r a c a d a b r a a b r a c a d a b r a’.
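
This front-to-back formulation maps directly onto MATLAB. Below is a minimal sketch of our own (indices are shifted by one, since MATLAB arrays are 1-based):

function S = bwt_inverse(L, I)
% Inverse BWT via the successor vector T of Step 4.
N = numel(L);
[F, idx] = sort(L);   % F is the first column of M; MATLAB's sort is
                      % stable, so the k-th instance of ch in F maps to
                      % the k-th instance of ch in L -- exactly T
T = idx;
S = L;                % preallocate output with the same type as L
p = I + 1;            % convert the 0-based index I to 1-based
for i = 1:N
    S(i) = F(p);      % S[i] = F[T^i[I]] in the 0-based notation above
    p = T(p);
end
end

With the output of Algorithm 1, bwt_inverse('rrddaarrccaaaaaaaabbbb', 4) reproduces ‘abracadabraabracadabra’.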

3.3 Benefits of the transformed string

Algorithm 1 sorts the rotations of an input string S, and generates the string L
consisting of the last character of each rotation.

To see why this might lead to effective compression, consider the effect on a
single letter in a common word in a block of English text. We will use the example of
the letter ‘t’ in the word ‘the’, and assume an input string containing many instances
of ‘the’.

When the list of rotations of the input is sorted, all the rotations starting with
‘he ’ will sort together; a large proportion of them are likely to end in ‘t’. One region
of the string L will therefore contain a disproportionately large number of ‘t’
characters, intermingled with other characters that can precede ‘he ’ in English, such
as space, ‘s’, ‘T’, and ‘S’.

The same argument can be applied to all characters in all words, so any
localized region of the string L is likely to contain a large number of a few distinct
characters. The overall effect is that the probability that a given character ch will occur
at a given point in L is very high if ch occurs near that point in L, and is low
otherwise. This property is exactly the one needed for effective compression by a
move-to-front coder, which encodes an instance of character ch by the count of
distinct characters seen since the most recent previous occurrence of ch. When applied
to the string L, the output of a move-to-front coder will be dominated by low numbers,
which can be efficiently encoded with a Huffman or arithmetic coder.

3.4 Conclusion

In this chapter, we introduced the forward and reverse Burrows-Wheeler
Transform and illustrated their working using an example.

CHAPTER 4
IMAGE COMPRESSION USING BWT

4.1 Introduction

This chapter deals with the fundamentals of an image and image compression
in general; then we present an example using our algorithm for lossless image
compression. The chapter title consists of two terms, ‘image’ and ‘compression’.
We have already dealt with the second term in the first chapter of this document.
Now we will take a look at what an image is and define an image formally.

4.1.1 What is an image?


An image (from Latin: imago) is an artifact that depicts visual perception,
such as a photograph or other two-dimensional picture, that resembles a subject—
usually a physical object—and thus provides a depiction of it. In the context of signal
processing, an image is a distributed amplitude of color(s).

Images may be two-dimensional, such as a photograph or screen display, or
three-dimensional, such as a statue or hologram. They may be captured by optical
devices – such as cameras, mirrors, lenses, telescopes, microscopes, etc. – or by
natural objects and phenomena, such as the human eye or water.

Fig 4.1: An image of the human eye.[8]

4.1.2 Formal definition of a grayscale digital image


A digital image is an image composed of picture elements, also known as
pixels, each with finite, discrete quantities of numeric representation for its intensity
or gray level that is an output from its two-dimensional functions fed as input by its
spatial coordinates denoted with x, y on the x-axis and y-axis, respectively. Depending
on whether the image resolution is fixed, it may be of vector or raster type.

A 2D grayscale image can be defined by the two-dimensional function f of two
variables x and y as follows:

f (x, y) = g,

where g is the gray-level intensity of the image at the spatial co-ordinate (x, y).
The number of bits required to store a single pixel depends on the gray-level
resolution of the image. Gray-level resolution of an image is the number of distinct
gray-levels that a single pixel of that image could be assigned. For example, a
standard grayscale image having 256 distinct gray-levels for each pixel requires 8 bits
per pixel, this is because 8 bits are required to encode 256 different levels.

Fig 4.2: A grayscale image and its representation with gray-level values. [9]

Raster images have a finite set of digital values, called picture elements or
pixels. The digital image contains a fixed number of rows and columns of pixels.
Pixels are the smallest individual elements in an image, holding quantized values that
represent the brightness of a given color at any specific point. Typically, the pixels are
stored in computer memory as a raster image or raster map, a two-dimensional array
of small integers.

4.1.3 Image compression


Image compression is a type of data compression applied to digital images, to
reduce their cost for storage or transmission. Algorithms may take advantage of visual
perception and the statistical properties of image data to provide superior results
compared with generic data compression methods which are used for other digital
data.

Image compression may be lossy or lossless. Lossless compression is
preferred for archival purposes and often for medical imaging, technical drawings,
clip art, or comics. Lossy compression methods, especially when used at low bit rates,
introduce compression artifacts. Lossy methods are especially suitable for natural
images such as photographs in applications where minor (sometimes imperceptible)
loss of fidelity is acceptable to achieve a substantial reduction in bit rate. Lossy
compression that produces negligible differences may be called visually lossless.

As we can see in Fig 4.3 (b) and (c), the compressed image is different when
compared to the original (a). This is due to the fact that the image is compressed using
a lossy algorithm. Such algorithms exploit the psychovisual redundancy in the
images and concentrate primarily on obtaining a better compression ratio.
However, there is a trade-off between the compression ratio and the quality of the
image. Obtaining a high compression ratio often leads to the image losing its ability to
please or convince the observer visually, as in (c), whereas a low compression ratio
means most of the image quality is preserved except for a very few changes compared
to the original image, i.e., the image is visually lossless as in (b).

Coming to lossless compression, one cannot simply view the compressed
image since the form in which it is stored is not a direct representation of the original
image. One needs to apply a decompression algorithm which reverts the compressed
data into a presentable form, and then one can view the image as it was before
compression, i.e., lossless both visually and literally.

4.2 Proposed model

In this section we propose a lossless image compression model based on the
Burrows-Wheeler Transform as a pre-processing step before the actual compression.

4.2.1 Block diagram


Here we present the block diagram of our model that basically consists of
three blocks, the compressor, compressed data, and the decompressor.

Store an image: Input image → Compressor → Compressed file
Retrieve an image: Compressed file → Decompressor → Output image

Fig 4.4: Block diagram of the system.

When we have an input image to store, we feed the image into the compressor,
obtain a compressed stream of data as the output, and store that data stream instead
of the original data. This saves storage space, as well as transmission bandwidth
when we need to send the data wirelessly.

Similarly, when we want to look at the image, we retrieve it by decompressing
the stored compressed data stream using the decompressor. The output of the
decompressor can be ultimately compared to the original data to prove the fact that
the whole process is lossless. The compression ratio and space saving are calculated
as specified in the section 1.1.4 to get a grasp of the efficiency and effectiveness of
the compression algorithms used in the compressor.

4.3 Compressor

The order of the algorithms applied to the data in order to compress it is
presented in the form of a flowchart below.

Input image → Reordering → BWT → MTF → RLE → Huffman coding → Compressed data

Fig 4.5: Program flow of the compressor block.


Initially, the input image is fed to the program. The program then processes the
input in a step-by-step manner by applying the required algorithms to the data, and
finally the result is stored in a folder. For the image to reach this state, it needs to pass
through five stages. We now describe the functionality and significance of each stage
individually.
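
Assuming the stage functions sketched in Chapter 3 and in the subsections that follow (the function names are our own helpers, not a standard API), the whole compressor chain can be expressed in a few lines of MATLAB:

img = imread('ct_slice.png');      % hypothetical grayscale input image
S = reorder_lr(img);               % stage 1: 2-D image -> 1-D array
[L, I] = bwt_forward(S);           % stage 2: Burrows-Wheeler transform
R = mtf_encode(L, 0:255);          % stage 3: MTF over the byte alphabet
E = rle2s_encode(R);               % stage 4: RLE-2s
% stage 5: Huffman coding of E (see Section 4.4.5)

Note that the naive bwt_forward sketch builds an N x N matrix, so in practice the image would be processed in small blocks.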

4.4 Explanation of compressor block

In this section, we explain the block diagram of the compressor and the
algorithms used in each block.

4.4.1 Stage 1: Reordering


This is the first stage of the algorithm. Basically, images as we know them and
as defined in section 3.1.2 are inherently two-dimensional. As we have seen in chapter
2, BWT is a text compression algorithm, i.e., it processes linear, one-dimensional data
structures such as strings or arrays. So, it is apparent to us that we need to somehow
come up with a reversible one-dimensional representation of the image in order to
apply the Burrows-Wheeler transform on it.

This is where reordering comes into the picture. The conversion process from
a two-dimensional representation to a one-dimensional one is called reordering or
path scanning. Based on the type of path taken to cover the spatial plane, we have a
plethora of reordering techniques. The most popular and effective ones are shown in
the figure below.

Fig 4.6: Some of the reordering techniques available: (a) left, (b) left-right,
(c) up-down, (d) spiral, and (e) zig-zag.

The left scan simply scans the entire image row after row and appends the rows
together. The left-right scan scans odd-numbered rows from left to right and
even-numbered rows from right to left. The up-down method is similar to the
left-right method, but instead of rows it scans columns: odd-numbered columns are
scanned from top to bottom and even-numbered columns from bottom to top. In the
spiral technique, the scan starts from one of the four extreme positions and spirals
inward. In the zig-zag method, the scan follows a zig-zag path from the start to the
end of the image.

In conclusion, the input to this stage is a two-dimensional image and the
output is a one-dimensional array; let us refer to it as S.
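
As a concrete illustration, the left-right (boustrophedon) scan of Fig 4.6 takes only a few lines of MATLAB; this is our own sketch, not a built-in routine:

function S = reorder_lr(img)
% Left-right scan: odd-numbered rows are read left to right,
% even-numbered rows right to left, and the rows are concatenated.
for r = 2:2:size(img, 1)
    img(r, :) = fliplr(img(r, :));   % reverse every even row
end
S = reshape(img.', 1, []);           % flatten row by row into 1-D
end

The scan is trivially reversible: apply the same row reversals after reshaping S back into the original image dimensions.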

4.4.2 Stage 2: Burrows-Wheeler transform


The output of the reordering stage, a one-dimensional array representation, S,
of the image is given to this stage as the input. Let us consider S in this chapter as the
S initialized in the section 2.2.1 and follow it through various stages of the program.
The algorithm is applied on the input array as specified in the section 2.2.1 and the
outputs are a one-dimensional array, L, and an integer, I representing the position of
the original array in the matrix M of sorted rotations.

It is worth noting that no actual compression of data happens in this stage.
This is because the BWT is not a compression algorithm: the output array contains
the same number of elements as the input array, but presumably in a different order.
Nonetheless, the output is just a permutation of the input array. It is permuted in
such a way that the probability of occurrence of an element el at a position k + 1,
given that el occurred at position k in the output array, is increased drastically. As a
result, we end up with a sequence that has long continuous repetitions of certain
elements. This quality of the BWT-encoded array is exploited in the upcoming stages.

So, one can say that the Burrows-Wheeler Transform is a pre-processing stage
before the actual compression, whose objective is to turn the input array into a more
compression-friendly form. Finally, the outputs of this stage are L and I.

4.4.3 Stage 3: Move-to-Front coding

The move-to-front algorithm belongs to a family of algorithms called the
Global Structure Transforms. It transforms a local structure redundancy into a global
structure redundancy, which can be exploited further by RLE and Huffman coding.

Algorithm 3: MTF

This algorithm encodes the output L of stage 2, which is a string of length
N. It uses the move-to-front technique on each of the individual characters. The
move-to-front (MTF) transform is an encoding of data (typically a stream of bytes)
designed to improve the performance of entropy encoding techniques of compression.
When efficiently implemented, it is fast enough that its benefits usually justify
including it as an extra step in a data compression algorithm. We now define a vector
of integers, R[0], …, R[N – 1], which are the codes for the characters L[0], …, L[N – 1].

Now, initialize a list Y of characters to contain each character in the alphabet
X exactly once. For each i = 0, …, N – 1 in turn, set R[i] to the number of characters
preceding character L[i] in the list Y, then move the character L[i] to the front of the
list Y.

Continuing our example from section 3.2.1, we have L = ‘r r d d a a r r c c a a
a a a a a a b b b b’ and I = 4. Here Y = {‘a’, ‘b’, ‘c’, ‘d’, ‘r’} initially, and we now
compute R. We construct the table below by performing the steps of the algorithm in
order to get the encoded output.

Table 4.1: MTF encoding output.

List L: r r d d a a r r c c a a a a a a a a b b b b
List R: 4 0 4 0 2 0 2 0 4 0 2 0 0 0 0 0 0 0 4 0 0 0

(The full table also tracks the list Y, initially {‘a’, ‘b’, ‘c’, ‘d’, ‘r’}, whose state
changes after each encoded symbol as that symbol is moved to the front.)

As shown in the table, Y is updated after encoding each value of L, and the
output is stored in R.

For example, consider the first two elements, L[0] and L[1]. Y is initially as
given above; L[0] is ‘r’, which appears at index 4 in Y, so the output R[0] is 4 and
the list Y is updated by moving ‘r’ to the start of the list. So, now Y = {‘r’, ‘a’, ‘b’,
‘c’, ‘d’}. Now for L[1] = ‘r’, the output R[1] is the position of ‘r’ in the updated list
Y, i.e., 0. In this way, every repetition of length j appears as j – 1 zeros in the output.
This transformation of redundancy in local areas of the data to global areas is key
in GST algorithms.

So, the main distinction between the input and the output is not the size of the
array, but the way repetitions of characters, or runs as we call them, are portrayed
in the array. As we can see, in the output array R all runs of different characters have
been reduced to runs of zeros. This drastically increases the probability of lower
indices and decreases the probability of higher indices, which comes in handy when
applying data coding algorithms like Huffman or arithmetic coding to reduce the
average word length of the data.

The input to this stage is L, the data stream that has been transformed using
the Burrows-Wheeler transform, and the output of this stage is R, the MTF-encoded array.
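
A direct MATLAB sketch of Algorithm 3 follows (our own helper; Y is the initial alphabet list, e.g. ‘abcdr’ in the running example, or 0:255 for bytes):

function R = mtf_encode(L, Y)
% Move-to-front encoding: R(i) is the 0-based rank of L(i) in the
% list Y, after which L(i) is moved to the front of Y.
N = numel(L);
R = zeros(1, N);
for i = 1:N
    k = find(Y == L(i), 1);             % current position of the symbol
    R(i) = k - 1;                       % report a 0-based ranking value
    Y = [Y(k), Y(1:k-1), Y(k+1:end)];   % move the symbol to the front
end
end

Calling mtf_encode('rrddaarrccaaaaaaaabbbb', 'abcdr') reproduces R = 4 0 4 0 2 0 2 0 4 0 2 0 0 0 0 0 0 0 4 0 0 0 from Table 4.1.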

4.4.4 Stage 4: Run-Length Encoding


Run-length encoding (RLE) is a form of lossless data compression in which
runs of data (sequences in which the same data value occurs in many consecutive data
elements) are stored as a single data value and count, rather than as the original run.
This is most efficient on data that contains many such runs, for example, simple
graphic images such as icons, line drawings, Conway's Game of Life, and animations.
For files that do not have many runs, RLE could increase the file size.

Algorithm 4: RLE
This algorithm encodes the output R of stage 3, which is a string of length N.
It encodes the runs in the sequence by replacing them with their frequencies. The
main aim of this step is to shrink long runs of the same symbol. We now define E to be
an empty array.

Now we iterate over the array R and encode it in the following way. Starting
at the beginning of R, we append the first character, ch, of R to the array E, count the
subsequent occurrences of ch, and append the count to E. We then move on to the
next character, ch1 (say), and continue the process until the end of R is reached.
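
A minimal MATLAB sketch of this plain RLE stage (our own helper, used here only to demonstrate the expansion problem discussed next):

function E = rle_encode(R)
% Plain RLE (Algorithm 4): every run, even of length 1, becomes a
% (symbol, count) pair, so E may end up longer than R.
E = [];
i = 1;
while i <= numel(R)
    c = 1;
    while i + c <= numel(R) && R(i + c) == R(i)
        c = c + 1;                 % measure the length of this run
    end
    E = [E, R(i), c];              %#ok<AGROW> append the (symbol, count) pair
    i = i + c;                     % jump to the start of the next run
end
end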

Continuing the running example, we have R = ‘4 0 4 0 2 0 2 0 4 0 2 0 0 0 0 0
0 0 4 0 0 0’. The even indices in E represent the characters in R, and the odd indices
represent the subsequent count of the character preceding that index in R.

At the index i = 0, R[i] = 4. Let us define c to be the count of R[i]; as of
now c = 1, and we update c as we iterate over R. Now R[i + 1] = 0, so the run of
character R[i] is terminated at the index i + 1 with c = 1. So now E[0] = 4 and E[1] =
1. As soon as a run is terminated, we update E and reset c to accommodate the count
of the next run, continuing from the element c positions after i.

We repeat the above process until the array R is exhausted. The output of this
stage, E, is as follows:

ORIGINAL STRING: 4 0 4 0 2 0 2 0 4 0 2 0 0 0 0 0 0 0 4 0 0 0 = R, length = 22,
where r1 = ‘4’ (the first run), r2 = the run of seven 0s, and r3 = the final run of three 0s.

OUTPUT STRING: 4 1 0 1 4 1 0 1 2 1 0 1 2 1 0 1 4 1 0 1 2 1 0 7 4 1 0 3 = E, length = 28,
where e1 = ‘4 1’, e2 = ‘0 7’, and e3 = ‘0 3’ encode r1, r2, and r3 respectively.

As we see in the above description, in the original string R, r1 = ‘4’, r2 = ‘0 0
0 0 0 0 0’, and r3 = ‘0 0 0’. In the output string E, r1 is encoded as the string e1,
which has a length of 2. This increases the length of the stream, contrary to our
requirement of compressing the data. Even though r2, a string of length 7, is encoded
by e2 of length 2, and r3, a string of length 3, is transformed to e3 of length 2, the
overall length of the output stream is greater than that of the input stream. This is
because of the excess of unit-length runs. To solve this problem, we introduce a
variation of the RLE scheme in which unit-length runs are not expanded.

It is important to observe that the length of the output sequence of this stage is
not the same as the length of the input sequence. So, this is the first stage in which
actual compression of data is supposed to happen. But in our case, instead of
compression, the length of the data increased. To rectify this issue, we use a variant
of RLE called RLE-2s to encode the output of stage 3. The working of RLE-2s is
described below.

Algorithm 4.1: Run-Length Encoding – 2 symbols

This algorithm encodes the output R of stage 3, which is a string of length N.
It encodes the runs in the sequence by replacing them with their frequencies. The
main aim of this step is to shrink long runs of the same symbol. We now define E to
be an empty array.

Now we iterate over the array R and encode it in the following way. Starting
at the beginning of R, we append the first character, ch, of R to the array E and count
the subsequent occurrences of ch. If the count is greater than 1, we append ch to E
once more and then append the count. If the count is 1, i.e., it is a unit-length run, we
move on to the next run.

Continuing the running example, we have R = ‘4 0 4 0 2 0 2 0 4 0 2 0 0 0 0 0
0 0 4 0 0 0’. To encode R[i] and its subsequent repetitions, we initialize a variable
c = 1 and move through R, incrementing i and checking the value at that index in R,
until we reach some position j where R[ j] is not equal to its predecessor. Then, if c is
greater than 1, we append R[ j – 1] two times to E and then the count c, and repeat
the process. If the count c = 1, we append R[ j – 1] to E once, move to position j in R,
and continue the process until the end of the array R.

For example, if i = 0, then R[i] = 4 and R[i + 1] = 0, so here the count c = 1,
and since c = 1 we append R[0] to E and move on to R[1]. Now at index i = 11,
R[i] = 0, and as we increment i the count c grows; at index i = 17, R[i] = 0 and
R[i + 1] = 4, with c = 7. So here, since the count c is greater than 1, we append R[17]
to E twice, then append c = 7, and move on to R[18].

The output of this algorithm for the given input R is as follows.

ORIGINAL STRING: 4 0 4 0 2 0 2 0 4 0 2 0 0 0 0 0 0 0 4 0 0 0 = R, length = 22.

OUTPUT STRING: 4 0 4 0 2 0 2 0 4 0 2 0 0 7 4 0 0 3 = E, length = 18,
where e1 = ‘4’, e2 = ‘0 0 7’, and e3 = ‘0 0 3’ encode the runs r1, r2, and r3 marked earlier.

As we can see, the length of the output sequence is 18, which is less than that
of the input sequence. The input r1 = ‘4’ is encoded as e1 = ‘4’ of length 1, r2 = ‘0 0
0 0 0 0 0’ is encoded as e2 = ‘0 0 7’ of length 3, and r3 = ‘0 0 0’ is encoded as e3 =
‘0 0 3’ of length 3. So, one can confidently say that compression has been achieved,
because runs of unit length are now encoded with a single character.

The input to this stage is R, the MTF-encoded output, and the output of this
stage is E, the data stream encoded using the RLE-2s algorithm.
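
The RLE-2s variant differs from plain RLE only in how a run is emitted. Below is a sketch of Algorithm 4.1 in the same style (our own helper, with the run length kept in-line after the doubled symbol, as in the running example):

function E = rle2s_encode(R)
% RLE-2s (Algorithm 4.1): a run of c >= 2 identical symbols becomes
% symbol, symbol, c; a unit-length run is copied through unchanged.
E = [];
i = 1;
while i <= numel(R)
    c = 1;
    while i + c <= numel(R) && R(i + c) == R(i)
        c = c + 1;                 % measure the length of this run
    end
    if c == 1
        E(end + 1) = R(i);         %#ok<AGROW> unit run: single symbol
    else
        E = [E, R(i), R(i), c];    %#ok<AGROW> doubled symbol + count
    end
    i = i + c;
end
end

For R from the running example, rle2s_encode(R) returns the 18-element stream 4 0 4 0 2 0 2 0 4 0 2 0 0 7 4 0 0 3.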

4.4.5 Stage 5: Huffman coding

In computer science and information theory, a Huffman code is a particular
type of optimal prefix code that is commonly used for lossless data compression. The
process of finding or using such a code proceeds by means of Huffman coding, an
algorithm developed by David A. Huffman while he was a Sc.D. student at MIT, and
published in the 1952 paper "A Method for the Construction of Minimum-
Redundancy Codes".

The output from Huffman's algorithm can be viewed as a variable-length code
table for encoding a source symbol (such as a character in a file). The algorithm
derives this table from the estimated probability or frequency of occurrence (weight)
for each possible value of the source symbol. As in other entropy encoding methods,
more common symbols are generally represented using fewer bits than less common
symbols. Huffman's method can be efficiently implemented, finding a code in time
linear to the number of input weights if these weights are sorted.

Algorithm 5: Huffman coding

This stage encodes the output of the RLE stage, E, and gives a data stream with
reduced average word length as the output. In our running example, the input to this
stage is E = ‘4 0 4 0 2 0 2 0 4 0 2 0 0 7 4 0 0 3’. Storing this data in a device using
ASCII encoding means that each symbol is encoded using 8 bits, so a data stream of
length M = 18 would require a storage space of 144 bits. When the average word
length drops, it means that the conventional representation, which uses 8 bits per
symbol, is transformed into some other representation that uses fewer than 8 bits per
symbol on average.

Huffman coding generally consists of two steps:

1. Build a Huffman tree from the input characters.
2. Traverse the Huffman tree and assign codes to characters.

We now take a look at the first step using our running example. We first
obtain a one-to-one mapping between symbols and their frequencies in E. We
initialize an array F containing all the unique characters in E, F = {0, 2, 3, 4, 7}. Now
we define P such that P[i] = frequency of F[i] in E, giving P = {9, 3, 1, 4, 1}. We
create a leaf node for each unique character and build a min-heap of all leaf nodes
(the min-heap is used as a priority queue; the value of the frequency field is used to
compare two nodes, and initially the least frequent character is at the root). We then
extract the two nodes with the minimum frequency from the min-heap, create a new
internal node with a frequency equal to the sum of the two nodes' frequencies, make
the first extracted node its left child and the other extracted node its right child, and
add this node back to the min-heap. We repeat these steps until the heap contains only
one node. The remaining node is the root node, and the tree is complete.

We now apply the above steps to our example. Initially, all the nodes in the
tree are leaves, and the least frequent character is at the root. This is shown below.
Each node is divided into two parts: on the left we have the symbol and on the right
the frequency. Internal nodes have the left side empty because internal nodes do not
represent any symbol.

Leaf nodes (symbol | frequency), least frequent first: (3 | 1), (7 | 1), (2 | 3), (4 | 4), (0 | 9)

Fig 4.7: Initial condition of the tree.

Now a new parent node, called an internal node, is created from the two
least-frequent leaves of the tree, and a min-heap is formed from the new node and the
remaining nodes, excluding the two children of the new node. The process is repeated
until we get a single node whose value in the frequency field is the length of the array
E. We make the first extracted node the left child. The value in the frequency field of
the new node is given by the sum of its two children. The state of the binary tree after
forming the first internal node is given below.

[Internal node I1 (frequency 2) is formed from the leaves (3, 1) and (7, 1); the
remaining leaves are (2, 3), (4, 4), and (0, 9).]

Fig 4.8: Tree after formation of first internal node.

Now we have the node {I1, 2} at the root of the heap along with the other
nodes. We form a new parent node using the two nodes with the least frequency
values, i.e., node I1 and the node with ‘2’ in its symbol field. After this, the tree looks
as follows.

[Internal node I2 (frequency 5) is formed from I1 and the leaf (2, 3); the remaining
leaves are (4, 4) and (0, 9).]

Fig 4.9: Tree after the formation of I2.

We continue this until the root node holds the length of the sequence, M, in its
frequency field. After this, we label the path from a parent node to its left child as ‘0’
and the path to its right child as ‘1’. Now, to obtain the code word for F[i], we
traverse from the root node to the leaf containing F[i] in its symbol field and append
all the labels on the paths followed. The final form of the Huffman tree is given
below.

[The root is I4 (frequency 18). Its right child is the leaf (0, 9), code = 1; its left child
is I3 (9). I3’s right child is the leaf (4, 4), code = 01; its left child is I2 (5). I2’s right
child is the leaf (2, 3), code = 001; its left child is I1 (2), whose children are the
leaves (3, 1), code = 0000, and (7, 1), code = 0001.]

Fig 4.10: The generated Huffman tree.

Given below is the table of symbols in F mapped to the code words obtained
from the Huffman tree.

Table 4.2: Mapping between symbols, codewords and their lengths.

F = Symbol    P = Frequency    Code    C = Code length
0             9                1       1
2             3                001     3
3             1                0000    4
4             4                01      2
7             1                0001    4
To obtain the output data stream, we define an empty array H. We replace the
8-bit ASCII representation of each symbol with the code word obtained from the
Huffman tree and write the result serially into H. According to the code words
obtained from the Huffman tree, the sequence H in its binary form is H = {0 1 1 0 1 1
0 0 1 1 0 0 1 1 0 1 1 0 0 1 1 1 0 0 0 1 0 1 1 1 0 0 0 0}.

We now define an array C such that C[i] = code length of the symbol F[i], so
C = {1, 3, 4, 2, 4}. The word length of a symbol is the number of bits in its Huffman
coded representation. The average codeword length, awl, is defined as the total
number of bits in the input data coded according to the Huffman tree divided by the
length of the input data stream, M. It is given by the formula

awl = Σ(P[i] * C[i]) / M

So now, in our example,

awl = (9*1 + 3*3 + 1*4 + 4*2 + 1*4) / 18 = 34 / 18 ≈ 1.8889
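The same arithmetic can be reproduced in MATLAB using the arrays defined above:

P = [9, 3, 1, 4, 1];            % symbol frequencies
C = [1, 3, 4, 2, 4];            % code lengths from the Huffman tree
M = sum(P);                     % length of the input stream, 18
totalBits = sum(P .* C)         % 34 bits for the coded data
awl = totalBits / M             % 1.8889 bits per symbol on average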

The total number of bits required to represent this information is
Σ(P[i] * C[i]) = 9*1 + 3*3 + 1*4 + 4*2 + 1*4 = 34. To make this stage reversible, we
need the mapping between symbols and their frequencies, so we also store P and F
along with the output data stream. Here, we have 5 symbols and 5 frequencies, one
per symbol. The highest value among them is ‘9’, which needs 4 bits to encode in
binary, so 40 bits are enough to encode the 10 values (5 symbols and 5 frequencies).
Finally, the input to this stage is encoded using 34 + 40 = 74 bits.

We can see that the output stream requires far fewer bits than the input
stream, meaning the input to this stage has been compressed successfully. The outputs
of this stage are P, F, and H. These outputs constitute the compressed data block in
the flowchart of section 3.2.2.

4.5 Compressed File

This is a quiescent stage in the whole process, meaning that no active work
takes place here. The compressed data from the Huffman coding stage of the
compressor block is given as input to this block, and the same data is reproduced as
its output, which in turn is the input to the decompressor block.

This stage is included in the block diagram to represent a permanent storage in
which the compressed data, i.e., P, F, and H, are stored. When we need to view the
image, we simply invoke the decompressor block by giving the compressed data as
the input and the decompressor block processes this data and produces the original
data as its output.

4.6 Decompressor

Its duty is to retrieve the original data stream from the compressed data. The
order in which the algorithms are applied to the compressed data in order to
decompress it is presented in the form of a flowchart in this section.

Compressed data → Huffman decoding → RLE-2s decoding → MTF decoding →
Inverse BWT → Undo reordering → Output image

Fig 4.11: Program flow of the decompressor block.

The input to the decompressor block is the output of the compressor block,
which for our example are the arrays P, F, and H. The output of this block is a
decoded data stream obtained from the Huffman tree constructed using P and F.

4.7 Explanation of decompressor block

In this section, we explain the block diagram of the decompressor and


the algorithms used in each block.

4.7.1 Stage 1: Huffman decoding


The inputs to this stage are the compressed data elements stored in our file
system. The output is a data stream obtained by decoding the Huffman encoded data
from the last stage of the compressor block. The data streams P, F, and H are inputs
to this stage. To decode a Huffman encoded data stream, we need the information
regarding the symbols and their frequencies, which is contained in P and F. So, as in
the Huffman coding stage of the compressor block, we construct a Huffman tree,
which looks as follows.

[The tree is identical to Fig 4.10: root I4 (18) with right child (0, 9), code = 1;
I3 (9) with right child (4, 4), code = 01; I2 (5) with right child (2, 3), code = 001;
I1 (2) with children (3, 1), code = 0000, and (7, 1), code = 0001.]

Fig 4.12: The generated Huffman tree.
Table 4.3: Mapping between symbols and code words.

F = Symbol    P = Frequency    Code
0             9                1
2             3                001
3             1                0000
4             4                01
7             1                0001

The mapping between symbols and their corresponding codes from the
Huffman tree is tabulated as shown above. Now, given H = {0 1 1 0 1 1 0 0 1 1 0 0 1
1 0 1 1 0 0 1 1 1 0 0 0 1 0 1 1 1 0 0 0 0} as input, we need to decode it using the
above table and collect the output in an array. Let us define an empty array IH to
store the output of this stage. Since the P and F arrays here are the same as the ones
used in the construction of the Huffman tree in the last stage of the compressor block,
and since we use the same fixed rules for constructing a Huffman tree consistently
throughout the program, the Huffman tree constructed here is identical to the one
generated in the Huffman coding stage of the compressor block.

Algorithm 6: Huffman decoding


We now describe an algorithm to decode the Huffman coded data using Table
4.3 and the input array H. We iterate over H and maintain a string s to temporarily
accumulate bits until s forms a code defined in Table 4.3. Once the code in s is
recognised as existing in the code column of Table 4.3, we append the corresponding
symbol to IH, reset s to its initial state, i.e., an empty string, and move on to the next
index in H.
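A minimal MATLAB sketch of this table-driven decoder is given below, assuming
the code words of Table 4.3; the variable names are illustrative.

symbols = {'0', '2', '3', '4', '7'};
codes   = {'1', '001', '0000', '01', '0001'};
codebook = containers.Map(codes, symbols);  % code word -> symbol

H = '0110110011001101100111000101110000';   % bit stream as a char array
IH = {};                                    % decoded output
s = '';                                     % current partial code word

for i = 1 : length(H)
    s = [s, H(i)];                          % extend the partial code word
    if isKey(codebook, s)                   % a full code word is recognised
        IH{end + 1} = codebook(s);
        s = '';                             % reset for the next code word
    end
end

disp(strjoin(IH, ' '));  % 4 0 4 0 2 0 2 0 4 0 2 0 0 7 4 0 0 3

Because the code is prefix-free, no code word is a prefix of another, so the greedy
match above is unambiguous.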

For our running example, the decoding is done as shown below.

H  = 0 1 1 0 1 1 0 0 1 1 0 0 1 1 0 1 1 0 0 1 1 1 0 0 0 1 0 1 1 1 0 0 0 0

IH = 4 0 4 0 2 0 2 0 4 0 2 0 0 7 4 0 0 3
At index i = 0, s = ‘0’, and since ‘0’ is not a defined code word, we move on to
the next index, i = 1, where s becomes ‘01’. We now observe that the symbol ‘4’ is
mapped to the code word represented by s according to the constructed Huffman tree.
So we append ‘4’ to IH, reset s, and then move to the index i = 2. At i = 2, s = ‘1’,
and ‘1’ is a defined code word corresponding to the symbol ‘0’ in Table 4.3, so we
append ‘0’ to IH and move to the next index. The process continues until the end of
the array H is reached.

Finally, the output of this stage is an array IH obtained by decoding H with the
help of P and F. We observe that the output of this stage is the same as the input given
to the Huffman coding stage in the compressor block. This demonstrates that Huffman
coding is a lossless, reversible process, provided that the mapping between the
original symbols and their frequencies is readily available to the Huffman decoding
stage of the decompressor block.

4.7.2 Stage 2: RLE-2s decoding


The input to this stage is the array IH, the output of the Huffman decoding
stage of the decompressor. We now show the steps to decode the array IH, which is
the same as the array E, the RLE-2s encoded array at the compressor block.

Algorithm 7: Run-Length Encoding – 2 symbols decoding


According to the RLE-2s encoding scheme, we encode a run of a character by
appending the character twice to the output stream followed by the length of the run
if the length of the run is greater than 1; otherwise, we encode it by appending the
character once to the output stream. So, for a given array encoded using the RLE-2s
algorithm, we have two possible interpretations for each character: the first is that it
belongs to a run whose length is greater than 1, and the second is that it is a run of
unit length. We now define an empty array IE to store the output of this stage.

From the above stated encoding rules, we can deduce that if we encounter two
consecutive identical characters at indices i and i + 1 of the array IH, then we need to
append that character, IH[i], to the output stream IE, IH[i + 2] times, and move to the
index i + 3 of the array IH.

But if two consecutive elements of IH differ, the run of the character IH[i] is
of unit length. So, we append IH[i] to the output array IE and move to the index i + 1
in the array IH.

We continue the above stated process until we reach the end of the array IH.
Now, we apply these steps to our running example. Here, IH = ‘4 0 4 0 2 0 2 0 4 0 2 0
0 7 4 0 0 3’. The process is shown below.

ORIGINAL STRING: 4 0 4 0 2 0 2 0 4 0 2 0 0 7 4 0 0 3 = IH, length = 18
(e1 marks the unit run ‘4’ at the start, e2 the segment ‘0 0 7’, and e3 the segment ‘0 0 3’.)

OUTPUT STRING: 4 0 4 0 2 0 2 0 4 0 2 0 0 0 0 0 0 0 4 0 0 0 = IE, length = 22
(r1, r2, and r3 mark the decoded runs corresponding to e1, e2, and e3.)

We iterate over IH and decode the string in a run-by-run fashion. At the
beginning, i = 0 and we have e1 = IH[0] = ‘4’. We now check the value of IH[1]:
IH[1] = ‘0’. We note that e1 is not identical to the element to its right in the array IH,
so we have a case of a character with a run of unit length. Therefore, the character is
to be appended to IE only once. We append ‘4’ to IE and then move to index i + 1,
equal to 1 here.

When the index i reaches the value 11, IH[11] = ‘0’ and the next element,
IH[12], is also ‘0’. This implies that the string e2 corresponds to a run of the character
e2[0] with length e2[2]; in our case, e2 represents a run of the character ‘0’ of length
7. So, we append IH[11] to IE, IH[13] = 7 times. We then move to index i + 3, equal
to 14 here.

We continue the process until the end of array IH. Here the output of this stage
is the array IE, which for our example is given by IE = ‘4 0 4 0 2 0 2 0 4 0 2 0 0 0 0 0
0 0 4 0 0 0’ of length 22. We note that the array IE is equivalent to the array R which
is given as the input to the RLE-2s encoding stage of the compressor block. So, we
can say that RLE-2s is a reversible transformation.
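A minimal MATLAB sketch of this decoding rule is given below, assuming
well-formed input and the convention of the worked example above, where the
stored count equals the run length (the appendix implementation stores the count
minus one and compensates on decode).

IH = [4 0 4 0 2 0 2 0 4 0 2 0 0 7 4 0 0 3];
IE = [];                                    % decoded output
i = 1;
while i <= numel(IH)
    if i < numel(IH) && IH(i) == IH(i + 1)
        IE = [IE, repmat(IH(i), 1, IH(i + 2))]; % run of length IH(i + 2)
        i = i + 3;                          % skip the pair and the count
    else
        IE = [IE, IH(i)];                   % run of unit length
        i = i + 1;
    end
end
disp(IE);  % 4 0 4 0 2 0 2 0 4 0 2 0 0 0 0 0 0 0 4 0 0 0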

4.7.3 Stage 3: MTF decoding
The input to this stage is the array IE, the output of the RLE-2s decoding
stage. We now need to decode this data, which has been shown to be equivalent to R,
the output of the MTF coding stage in the compressor block. MTF decoding mirrors
MTF encoding: it maintains the same list Y and applies the same move-to-front
update, but instead of searching Y for each symbol, it reads off the symbol at the
given index. So we traverse IE and invert the mapping step by step.

To successfully decode the data stream, we need the character array defined in
section 3.2.2.3.1, Y = {‘a’, ‘b’, ‘c’, ‘d’, ‘r’}. Provided Y, we can successfully decode
the data encoded by the MTF algorithm.

Algorithm 8: Move-To-Front decoding


The move-to-front (MTF) transform is an encoding of data (typically a stream
of bytes) designed to improve the performance of entropy encoding techniques. When
efficiently implemented, it is fast enough that its benefits usually justify including it
as an extra step in a data compression algorithm.

Now, we have a list Y containing each character of the alphabet X exactly
once. We have the MTF encoded array IE = ‘4 0 4 0 2 0 2 0 4 0 2 0 0 0 0 0 0 0 4 0 0
0’, which we decode by following the steps given below. We first define an empty
array IR to store the output of this stage.

We iterate over the array IE starting at i = 0; each time, we append the value at
index IE[i] of the character array Y to the output IR. We then move that element of Y
to the front of the array.

List IE: 4 0 4 0 2 0 2 0 4 0 2 0 0 0 0 0 0 0 4 0 0 0
List IR: r r d d a a r r c c a a a a a a a a b b b b
(List Y starts as {a, b, c, d, r} and is updated by a move-to-front step after each
decoded value.)

Table 4.4: MTF decoding output.

As shown in the table, Y is updated after decoding each value of IE, and the
output is stored in IR.

For example, consider the first two elements, IE[0] and IE[1]. With Y initially
as given above, IE[0] is ‘4’, which means we append Y[4] = ‘r’ to the output array IR,
and the list Y is updated by moving Y[4], ‘r’, to the start of the list, so Y = {‘r’, ‘a’,
‘b’, ‘c’, ‘d’}. Now, for IE[1] = ‘0’, the output IR[1] is the element of Y at index 0,
i.e., ‘r’. In this way, we decode until we reach the end of the array IE.
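A minimal MATLAB sketch of this decoding loop for the running example is given
below; the appendix function MTF_inverse applies the same idea over the full
256-symbol alphabet.

Y  = {'a', 'b', 'c', 'd', 'r'};            % initial symbol list
IE = [4 0 4 0 2 0 2 0 4 0 2 0 0 0 0 0 0 0 4 0 0 0];
IR = '';                                   % decoded output

for i = 1 : numel(IE)
    p = IE(i) + 1;                         % MATLAB indices start at 1
    IR(end + 1) = Y{p};                    % emit the symbol at that index
    Y = [Y(p), Y(1 : p - 1), Y(p + 1 : end)];  % move it to the front
end

disp(IR);  % rrddaarrccaaaaaaaabbbb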

The output of this stage is the array IR = ‘r r d d a a r r c c a a a a a a a a b b b
b’, and we move to the next stage. As we can see, the array IR is the same as the
array L which was given as input to the MTF coding stage at the compressor block.
This shows that the MTF transformation is in fact reversible and lossless.

4.7.4 Stage 4: Inverse BWT


The input to this stage is the array IR obtained by decoding the MTF encoded
data. We find that the input to this stage is the same as the output of the BWT stage of
the compressor block. The inverse transformation is performed as presented in section
2.2.2, and the result is stored in an array named IL. According to section 2.2.2, the
output is IL = ‘a b r a c a d a b r a a b r a c a d a b r a’.

We observe that IL is identical to the vector S which was given as input to the
BWT stage in section 3.2.2.2 of the compressor block. This means the input data has
been fully recovered, and the whole process, as inferred from the stage-to-stage
equivalence between the compressor and decompressor blocks, is lossless and
completely reversible.
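For reference, a compact MATLAB sketch of the inverse transform is given below;
the appendix function undo_BWT implements the same idea with an explicit
first-occurrence map. Here I is assumed to index the row of the sorted rotation matrix
that ends with the special character, and the recovered string still carries that
character, which the decompressor strips.

function S = inverse_bwt_sketch(L, I)
    % L : BWT output (last column), I : primary index
    [~, order] = sort(L);        % stable sort pairs L with the first column F
    T = zeros(1, numel(L));
    T(order) = 1 : numel(L);     % T(i) = position of L(i) in F
    S = L;                       % preallocate with the same type and size
    for k = numel(L) : -1 : 1    % walk the cycle backwards from the marker
        S(k) = L(I);
        I = T(I);
    end
end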

4.7.5 Stage 5: Undo reordering


The input to this stage is the array IL, which is the same as the output of the
reordering stage in the compressor block. Depending on the reordering technique used
in the reordering stage of the compressor block, the inverse reordering algorithm
selects the path in which the two-dimensional output array, RI (Reconstructed
Image), is to be filled.

Once the array RI is filled, we can verify the sanity of our results by
comparing RI with the input image given to the reordering stage in the compressor
block. We find that both the arrays are identical. This means that the compression is
performed in a lossless manner.
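For the up-down reordering used in our implementation, the inverse is a sketch like
the following: columns are filled downwards for odd column indices and upwards for
even ones, mirroring the forward pass (the appendix function undo_up_down
additionally restores the zeros that were replaced by the least frequent element).

function RI = undo_up_down_sketch(v, nr, nc)
    % v  : 1 x (nr*nc) linear data from the inverse BWT stage
    % RI : nr x nc reconstructed image
    RI = zeros(nr, nc, 'uint8');
    for c = 1 : nc
        seg = v((c - 1) * nr + 1 : c * nr);
        if mod(c, 2)                    % odd column: filled top to bottom
            RI(:, c) = seg;
        else                            % even column: filled bottom to top
            RI(:, c) = seg(end : -1 : 1);
        end
    end
end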

4.8 Conclusion

We described, using a running example, all the forward and inverse algorithms
used in both the compressor and decompressor blocks of the block diagram.

CHAPTER 5
IMPLEMENTATION

In this chapter, we discuss the implementation of the model presented in
Chapter 3. We implement our project in MATLAB (an abbreviation of “MATrix
LABoratory”). We now discuss MATLAB in general and its use in our project.

5.1 Introduction

MATLAB is a proprietary multi-paradigm programming language and
numeric computing environment developed by MathWorks. MATLAB allows
matrix manipulations, plotting of functions and data, implementation of algorithms,
creation of user interfaces, and interfacing with programs written in other languages.

Fig 5.1: A 3-D plot in MATLAB. [11]

Although MATLAB is intended primarily for numeric computing, an optional
toolbox uses the MuPAD (Multi Processing Algebra Data) symbolic engine, allowing
access to symbolic computing abilities. An additional package, Simulink, adds
graphical multi-domain simulation and model-based design for dynamic and
embedded systems. As of 2020, MATLAB has more than 4 million users worldwide,
coming from various backgrounds in engineering, science, and economics.

5.1.1 Features of MATLAB


MATLAB combines a desktop environment tuned for iterative analysis and
design processes with a programming language that expresses matrix and array
mathematics directly. It includes the Live Editor for creating scripts that combine
code, output, and formatted text in an executable notebook. Its functionality can be
extended across many disciplines through sets of add-on functions called toolboxes.
MATLAB also has a workspace for programmers to conveniently view the variables
in use in the current program.

The MATLAB application is built around the MATLAB programming
language. Common usage involves using the "Command Window" as an interactive
mathematical shell or executing text files containing MATLAB code. It has extensive
support for graphing, plotting, and signal processing, and it also includes built-in
applications for various uses such as colour pickers, image segmenters, image
thresholders, and Simulink tools.

5.2 Specifications

This section deals with the software and hardware specifications required to
run the MATLAB code. It also covers the constraints on the input images.

The whole model is divided into two programs, one for compression and the
other for decompression. The input to the compression program is an image. The
input image must be an 8-bit grayscale image encoded in ‘TIFF’ (Tag Image File
Format, or tiff), and each of its dimensions must be less than 400 pixels. The output
of the compression program is data embedded in the form of tiff images in a folder in
the file system. We call this data compressed data.

The inputs to the decompression program are the outputs of the compression
program. And the output is an image constructed using the compressed data. The
output image is then, if needed, compared to the original image before compression,
to verify that the compression is lossless.

Coming to the system specifications, as far as software goes, any version of
MATLAB newer than R2016a is sufficient to run both programs. As for hardware,
the system needs 8 GB of RAM and 20 GB of storage to install and run MATLAB.
The operating system should be Windows 7 or newer.

5.3 Functional description

In this section, we describe in a step-by-step manner the inputs, outputs and
arguments of our compression and decompression programs.

5.3.1 Compression program


The input to the compression program is an 8-bit grayscale tiff image. The
outputs are two 8-bit grayscale tiff files. The first one is a file containing essential
information required to reconstruct the image from its compressed form and the other
is the compressed image itself.

The essential image file consists of data such as the dimensions of the image,
required by the last stage of the decompressor, i.e., the undo reordering stage, to
correctly arrange the linear data into a two-dimensional array, which is then encoded
as a tiff image. It also contains the primary index required by the inverse BWT stage
of the decompression, as well as the symbols and their frequencies in the data stream
before Huffman encoding is applied in the compressor program; the latter is required
by the Huffman decoding function in the decompressor block to construct the
Huffman tree and decode the Huffman encoded data stream.

Finally, at the end of the compression program, the essential data and
compressed data are converted to 8-bit tiff images and stored in a folder created by
the program. We also display the compression ratio, computed as the input image file
size divided by the total size of the output images.

5.3.2 Decompression program


The inputs to the decompression program are the file locations of the essential
and compressed image pair. The output is a tiff image which should be an equivalent
representation of the original image given as input to the compression program.

The output is converted into an 8-bit tiff image and stored in a folder created
by the program. Finally, we cross-check the output image against the original image
given as input to the compression program, which verifies that the compression has
been done in a lossless fashion.

5.4 Conclusion

We have presented the implementation of our project in MATLAB. We have
described the input, output and other software requirements and specifications.

CHAPTER 6
RESULTS

6.1 Working of the compression program

We now give an 8-bit tiff image as input to the compression program; the
results are as follows. The name of the image is ‘lukas_2d_8_head_0_t.tiff’.

Fig 6.1: Input image to the compressor.

The compression ratio is the key metric that indicates the efficiency of our
program. We now give the compressed image and essential image files as inputs to
the decompression program and the results are as follows.

Fig 6.2: Images in compressed form.

Fig 6.3: Workspace of the compression program.

Figure 6.3 shows the workspace of our compression program. We now walk
through the variables created at various stages of the program. At the start of the
program, we import the input tiff image by setting the variable ImFileName to the
name of the image file, here ‘lukas_2d_8_head_0_t’, and then we convert it into a
2-dimensional array named orgImg, which here has the dimensions 270x207. This
array is now given as an input to the reordering
function. The output is a 1-Dimensional array named reordImg which has dimensions
1x55890. Now this is taken as input by the forward BWT function, and the outputs
are pidx whose value is 53255 for this image and BWTEncData which is the input
encoded using BWT, whose dimensions are 1x55891. This array goes as input to the
MTF coding block, and the output is a 1D array named MTFEncData whose
dimensions are 1x55891. Now this is the input to RLE-2s encoding function, and the
output of this is the array RLEncData whose dimensions are 1x28411. And finally,
this is given as an input to the Huffman coding block, whose outputs are
HuffmanDict, a 256x2 cell that stores the mapping between the symbols (0-255) and
their frequencies in RLEncData, used to construct the Huffman tree for Huffman
decoding, and HuffmanEncData, which is the array RLEncData encoded using
Huffman coding. The variable avgwl is the average word length calculated using the
formula presented in section 4.4.5. The dimensions of HuffmanEncData are much
larger than those of the input to that stage because its elements are individual bits, so
eight of them can be packed into a single byte. We therefore convert the binary array
HuffmanEncData into a byte array, and the output is the array named HuffmanRed
with dimensions 1x20588. Now
finally we create a folder named ‘<image file name>_comp’ and write HuffmanRed
into a tiff file named ‘<image file name>_HuffmanRed.tiff’ and the data in
HuffmanDict into a tiff file called ‘<image file name>_Data.tiff’. For this image, the
folder name is ‘lukas_2d_8_head_0_t_comp’ and the HuffmanRed file is
‘lukas_2d_8_head_0_t_HuffmanRed.tiff’ and the file containing the HuffmanDict is
‘lukas_2d_8_head_0_t_Data.tiff’. And finally, the compression ratio is calculated by
the formula,

Compression ratio = (size of original file) / (size of compressed file)

Here, the size of the original file is given by the size of the 2D array orgImg =
270x207 bytes = 55890 bytes. The compressed file size is given by the sum of the
sizes of the HuffmanRed array and the HuffmanDict cell = 20588 + 780 = 21368
bytes.

Compression ratio = 55890 / 21368 = 2.6156 ≈ 2.62

So, the CompressionRatio variable is set to 2.6156. We can summarise the
whole explanation using a simple table as given below.

Table 6.1: Summary of the output.

Size of input image          55890 bytes
Size of compressed image     20588 bytes
Size of essential image      780 bytes
Compression ratio            2.62


6.2 Working of the decompression program
The input to this program is the name of the folder containing the compressed
data of the image we want to decompress, without the ‘_comp’ suffix.

Fig 6.4: Inputs to the decompression program.

The output is an image that is an exact copy of the image which was given to
the compression program.

Fig 6.5 Output of the decompression program.

We now take a look at the workspace of the decompression program and
analyse its variables. The workspace of the program for this input is shown in figure
6.6.

Fig 6.6: Workspace of the decompression program.

In the figure above, the string inp_img is given as the input; here, it is
‘lukas_2d_8_head_0_t’. The program generates the string ‘<image file
name>_comp’ from the variable inp_img, here ‘lukas_2d_8_head_0_t_comp’, which
is the name of the folder in which the compressed data is stored. We then create the
variables HuffmanRed and HuffmanDict with the contents of the files
‘lukas_2d_8_head_0_t_HuffmanRed’ and ‘lukas_2d_8_head_0_t_Data’ respectively,
give them as inputs to the Huffman decoding function, and obtain the array
HuffmanDecData with dimensions 1x28411. Now this array is given as the
input to the RLE-2s decoding function, and the output is the array RLDecData which
has dimensions 1x55891. This is given to the MTF decoding function as the input and
the output is the array MTFDecData of dimensions 1x55891. Now this along with the
pidx obtained from the data file are given to the inverse BWT function as inputs and
the output is the array BWTDecData of dimensions 1x55890. Now the dimensions of
the original image stored in the data file along with the BWTDecData are given as
inputs to the undo reordering function, and the output is a 2D array whose dimensions
are 270x207. The array is then written to a tiff file, in a folder created by the program
named ‘<image file name>_comp_decomp’ and the file is named as ‘<image file
name>_comp_orgimg.tiff’. Here, for this input, the folder name is
‘lukas_2d_8_head_0_t_comp_decomp’ and the name of the output file is
‘lukas_2d_8_head_0_t_comp_orgimg.tiff’.

We can compare this image to the original image from which the compressed
folder was formed and make sure that the image has been retrieved successfully. The
original image is imported into the variable RefImg in the form of a 2D array. In our
program, the variable named check indicates whether the output image is the same as
the image given as input to the compression program: check = 1 means that the
retrieval has been successful, and any other value means it has not.

In the below table we summarize the results of the decompression program for
this input.

Table 6.2: Summary of the decompression program.

Size of reference image          55890 bytes
Size of compressed image         20588 bytes
Size of output image             55890 bytes
Value of the variable ‘check’    1 (True)

We now present a few sample cases with different input images and finally
tabulate the results obtained. We have explained the first sample input case in detail
so that the reader knows how to interpret the program's output; from now on, we
present only the input and output of the program for each test case.

CHAPTER 7
CONCLUSION

7.1 Conclusion

We presented the Burrows-Wheeler Transform (BWT), a state-of-the-art
technique in the field of text compression, and proposed a lossless medical image
compression scheme based on this transform. Our project mainly focuses on the use
of BWT as a pre-processing stage and on the addition of RLE-2s to the BWCA for the
compression of medical images, in order to exploit the redundancies in them and
provide a good compression ratio. Generally, a compression ratio of 2 or above is
considered excellent for lossless compression. In our project, most of the sample
images achieve a compression ratio of 2 or higher, and for the few images where it is
lower, the ratio is still very close to 2. This project is application-specific: it cannot be
used as a general-purpose compression algorithm for all images, since other
algorithms may do a better job than the presented one.

7.2 Future scope

There is room for further improvement. The block sorting algorithm (BWT)
can be adjusted to work in linear time and space using newer suffix array algorithms.
We can introduce a supervised machine learning model and train it to optimize the
selection of various reordering techniques, global structure transforms and entropy
coding techniques and their combinations to maximize the compression ratios
depending upon the parameters of the input image. We can also extend it further to
compress RGB images. We can use segmentation and perform lossless compression
on a specific ROI (Region Of Interest) within the image.

REFERENCES

[1] J. Abel, "Improvements to the Burrows-Wheeler compression algorithm:
after BWT stages", ACM Trans. Computer Systems, submitted for
publication, 2003.
[2] M. Burrows and D.J. Wheeler, "A Block-sorting lossless data
compression", SRC Research Report 124, Digital systems research center,
Palo Alto, 1994.
[3] Y. Wiseman, "Burrows-Wheeler based JPEG", Data Science Journal, Vol.
6, 2007, pp. 19-27.
[4] A. Andersson and S. Nilsson, "Implementing radix sort", ACM Journal of
Experimental Algorithmic Vol. 3, 1998, pp. 7-22.
[5] P. Fenwick, "Block sorting text compression–final report", Technical
reports 130, University of Auckland, New Zealand, Department of
Computer Science. 1996.
[6] B. Balkenhol and Y. Shtarkov, "One attempt of a compression algorithm
using the BWT", SFB343: Discrete Structures in Math., Faculty of Math.,
Univ. of Bielefeld, Germany, 1999.
[7] S. Deorowicz, "Second step algorithms in the Burrows-Wheeler
compression algorithm", Software Practice and Experience 32(2), 2002,
pp. 99-111.
[8] An image of the human eye,
https://i0.wp.com/post.medicalnewstoday.com/wp-content/uploads/sites/
3/2021/06/GettyImages-1265194139_header-1024x575.jpg?w=1575.
[9] Image representation using pixels, https://cdn.analyticsvidhya.com/wp-
content/uploads/2021/03/Screenshot-from-2021-03-16-10-56-56.png.
[10] Comparison of various compression ratios in lossless image
processing, https://www.dspguide.com/graphics/F_27_15.gif.
[11] A 3D plot in MATLAB,
https://undocumentedmatlab.com/images/Toolstrip_basic_controls.png.

APPENDIX

Compression program
clc;
close all;
clear;
% changing current working directory to our images folder
cd 'C:\Users\gvsrl\Documents\Academics\Literary_Documents\Major_Project\References\Datasets\lukas_2d_8_tif\Resized\'

% _________________________set input image____________________________

% Taking an image of skull x-ray (resized).


ImFileName = "lukas_2d_8_head_1_t_resize"; % need to input the image filename here...!
orgImg = imread(ImFileName, 'tiff');
% 186x200 uint8

imshow(orgImg);

% size() used to return array dimensions.


[no_of_rows, no_of_cols] = size(orgImg);
% 186, 200
% using up-down method for reordering
reordImg = up_down(orgImg);
% dimensions = 1 x 37201 uint8

% ________________________pre BWT steps_______________________________

% "0" is the special character, work on the special character being the
% least frequent one in the reordImg

ks = 0 : 255;
vs = zeros(1, 256);

D = containers.Map(ks, vs);

for i = 1 : size(reordImg, 2)
    D(reordImg(i)) = D(reordImg(i)) + 1;
end

leastFreqEle = 1; % assume 1 is the least frequent element

for i = 1 : 255 % no use replacing 0 with 0

    if D(leastFreqEle) == 0 % if an element does not occur, we can break
        break
    end

    if D(i) < D(leastFreqEle)
        leastFreqEle = i;
    end

end

Splmtr = []; % store occurrences of lfe in reordImg

for i = 1 : size(reordImg, 2)

    if reordImg(i) == 0
        reordImg(i) = leastFreqEle;

    else

        if reordImg(i) == leastFreqEle
            Splmtr(end + 1) = i; % may need static allocation for large arrays
        end

    end

end

clear i vs ks D; % clearing temporary variables from the workspace

% __________________function calls for compression_____________________

[pidx, BWTEncData] = Burr_Whee(reordImg);
% [1x1 double, 1x37201 uint8]

MTFEncData = MTF_coding(BWTEncData);
% 1x37201 uint8

[MaxEleRLE, RLEncData] = RL2SEncoding(MTFEncData);
% [1x1 uint16, 1x22282 uint8]

% RLEDecdata = RLDecoding(RLEncData, no_of_rows, no_of_cols);

[OccCount, HuffmanRed, HuffmanDict, HuffmanEncData, avgwl] = HuffmanEnc(RLEncData);
% [256x1 Map, 1x18868 uint8, 256x2 cell, 1x150937 uint8, 1x1 double]

CompressionRatio = StoreCompData(ImFileName, HuffmanRed, pidx, OccCount, no_of_rows, no_of_cols, leastFreqEle, Splmtr);

clear Splmtr OccCount leastFreqEle MaxEleRLE; % create a presentable workspace

% _______________________reordering up-down__________________________

function [r] = up_down(im)


[nr, nc] = size(im);

r = uint8(zeros(1, nr * nc));

for i = 1 : nc

if mod(i, 2) % odd columns


r((i - 1) * nr + 1 : i * nr) = im(1 : nr, i);
% down for odd cols

else % even columns


r((i - 1) * nr + 1 : i * nr) = im(nr : -1 : 1, i);
% up for even cols

end

end

end

%______________________Burrows Wheeler Transform_______________________

function [pi, L] = Burr_Whee(ri)


ri(end + 1) = 0; % special character
s = ''; % to store ri as a string of ASCII characters

% need to declare size(ri, 2) in a variable for easier access

for i = 1 : size(ri, 2)
s = append(s, char(ri(i)));
% char(p) --> character mapped to value p in the ASCII table
end

% array of strings for rotations matrix


l2 = strings(1, size(ri, 2));

for i = 1 : size(ri, 2)
    l2(i) = append(s(i : end), s(1 : i - 1)); % rotate from left to right
end

l2 = sort(l2); % MATLAB sorts an array of strings in lexicographical order

L = uint8(zeros(1, size(ri, 2))); % to store the last column of l2, the BWT output
pi = 0; % to store the primary index

for i = 1 : size(ri, 2)
s1 = char(l2(i));

if s1(size(ri, 2)) == char(0)


pi = i;
end

L(i) = double(s1(size(ri, 2)));
        % double(p) --> value mapped to character p in the ASCII table
    end

end

%______________________Move To Front Coding___________________________

function [R] = MTF_coding(L)


R = zeros(1, size(L, 2)); % MTF output
Y = 0 : 255; % list of unique symbols in L

for i = 1 : size(L, 2)
% find(Arr == val) returns array of all indices of val in arr
R(i) = find(Y == L(i)); % 1 to 256

% updating Y to get 0s when runs occur


Y = cat(2, Y(R(i)), Y(1 : R(i) - 1), Y(R(i) + 1 : end));

% decrementing R(i) to make it uint8 without losing data

R(i) = R(i) - 1; % 0 to 255
end

R = uint8(R);

end

%________________________Run-Length Encoding__________________________

function [Mx, R] = RLEncoding(M)


M = uint16(M); % to match with R initially
R = []; % R needs to be uint16 in order to store run lengths > 256
c = 0;

for i = 1 : size(M, 2) - 1

if M(i) == M(i + 1)
c = c + 1;

else
R = [R, M(i), c];
c = 0;
end

end

% the last element of M is not appended in the above loop if it is
    % not the same as its previous element
    if M(end) ~= M(end - 1)
        R = [R, M(end), 0]; % append the final unit run (count 0 = run length 1)
    end

for i = 2 : 2 : size(R, 2)

if R(i) > 255


c1 = mod(R(i), 256);
c2 = idivide(R(i), 256);
% c1 = power 0 place, c2 = power 1 place

R = [R(1 : i - 1), c1, R(i - 1), c2, R(i + 1 : end)];


end

end

Mx = max(R); % to check the effectiveness of the above conversion code

end

%___________________Run-Length Encoding 2-symbols_____________________

% RLE 2S is better than RLE because we code it with only one symbol for
% unit length runs and 3 symbols for others.
% So, we code 1 to 255 as runlengths 2 to 256.
function [Mx, R] = RL2SEncoding(M)
M = double(M); % to store -1 as termination character
M(end + 1) = -1; % character to terminate the count of last run

R = []; % to store RLE2S sequence


c = 0; % c = runlength - 1

for i = 1 : size(M, 2) - 1

if M(i) == M(i + 1) % run continuity check


c = c + 1;

else

if c == 0
                R = [R, M(i)]; % we append only the element if runlength = 1

            else
                R = [R, M(i), M(i), c]; % we append the element twice and the count if runlength > 1

            end

c = 0; % reset count after each run ends

end

end

R = uint16(R); % converting from double to make it suitable for idivide

    xtra = 3 * (sum(R > 255 * ones(1, size(R, 2)))); % 3 bytes each needed for runs > 255
    Res = zeros(1, size(R, 2) + xtra); % preallocate for speed

k = 1; % res loop variable

for i = 1 : size(R, 2)

if R(i) > 255

            c1 = mod(R(i), 256);
            c2 = idivide(R(i), 256);
            % c1 = power 0 place, c2 = power 1 place

            % when the count c of an element e is > 256 we write it into two
            % separate runs as [e, e, c0, e, e, c1], where c1c0 = count in base 256
            Res(k) = c1;
            Res(k + 1) = R(i - 1);
            Res(k + 2) = R(i - 1);
            Res(k + 3) = c2;
            k = k + 4;

else
Res(k) = R(i);
k = k + 1;

end

end

Mx = max(Res); % confirming that all runs > 256 are converted


R = Res;
R = uint8(R); % converting from uint16 to uint8 to save space

end

%_________________________Huffman coding______________________________

function [D, R, Hfd, H, avl] = HuffmanEnc(R)


ks = 0 : 255;
vs = zeros(1, 256);
prbs = zeros(1, 256); % storing probabilities of the words of the MTF encoded data

D = containers.Map(ks, vs);

for i = 1 : size(R, 2)
D(R(i)) = D(R(i)) + 1; % frequency of elements in M
end

for i = 0 : 255
        prbs(i + 1) = D(i) / size(R, 2); % probability = favourable / total
    end

% huffmandict(ks, vs) gives a huffman dictionary, a mapping between
    % symbols and their Huffman codes generated using a Huffman tree
    [Hfd, avl] = huffmandict(ks, prbs);

% huffmanenco(sig, huffmandict) gives the huffman encoded output for
    % the signal stream sig based on the probabilities in huffmandict
H = huffmanenco(R, Hfd);

H = uint8(H);

sz = uint32(size(H, 2)); % sz may need to hold values > 65535
    wl = uint32(8); % wl needs to match the type of sz for idivide

    R = uint8(zeros(1, idivide(sz, wl) + 1)); % reusing R to store the Huffman reduced output

if H(end) == 0
% storing the change from right most to help while decoding
H(end + 1) = 2 ^ (8 - mod(sz, wl)) - 1;

else
H(end + 1) = 0;

end

for i = 1 : size(R, 2) - 1

for j = 1 : 8
            offs = (i - 1) * 8;
            R(i) = R(i) + H(offs + j) * (2 ^ (8 - j)); % converting binary to decimal
        end % write full 8 bit notations while reversing...

end

pw = 7; % power of 2 at position 8 in binary

for i = 1 : mod(sz, wl)
        R(end) = R(end) + H(idivide(sz, wl) * 8 + i) * 2 ^ pw; % converting the remaining bits
        pw = pw - 1;
    end

R(end) = R(end) + H(end); % adding the change bits at the end to help while decoding

for i = 1 : 256
        Hfd{i, 2} = uint8(Hfd{i, 2}); % converting Hfd to uint8 for efficient storage
    end

H = H(1 : end - 1); % removing the change bits from H

end

% _____________________storing compressed data________________________

function [compRatio] = StoreCompData(Name, M, pridx, Oc, nr, nc, lfe, Splmtr)
% convert to base 256, 3 digits each for nr, nc, lfe, pridx
% order of interpretation is Hfd, pridx, nr, nc, lfe, Splmtr

res = uint8(zeros(260, 3)); % the tiff file that needs to store the additional info

% for primary index, nr, nc, lfe


res(257, :) = Dec2Byte3Digits(pridx);
res(258, :) = Dec2Byte3Digits(nr);
res(259, :) = Dec2Byte3Digits(nc);
res(260, :) = Dec2Byte3Digits(lfe);

% frequency count of gray levels for reconstruction of Hfd


for i = 0 : 255
res(i + 1, :) = Dec2Byte3Digits(Oc(i));
end

% positions of lfe in Reordimg


for i = 1 : size(Splmtr, 2)
res(260 + i, :) = Dec2Byte3Digits(Splmtr(i));
end

compRatio = nr * nc / (size(res, 1) * 3 + size(M, 2));

fprintf("Compression ratio = %0.2f", compRatio); % rounded to two decimal places

% make a folder to store compressed image


Fldr = append(Name, '_comp');
mkdir(Fldr); % Fldr is an argument to the mkdir function
cd(Fldr);

DataFle = append(Name, '_Data.tiff');


HuffmanRedFle = append(Name, '_HuffmanRed.tiff');

% store as tiff images


imwrite(res, DataFle, 'tiff');
imwrite(M, HuffmanRedFle, 'tiff');

cd .. % one step backward

end

% _____________________conversion to base 256__________________________

function [b] = Dec2Byte3Digits(n)

b = zeros(1, 3);
rdx = 256; % the base is 256

for i = 3 : -1 : 1
b(i) = mod(n, rdx); % accumulate the remainder
n = floor(n / rdx); % n = quotient
end

b = uint8(b);

end

Decompression program
clc;
close all;
clear;

% changing current working directory to our images folder


cd 'C:\Users\gvsrl\Documents\Academics\Literary_Documents\Major_Project\References\Datasets\lukas_2d_8_tif\Resized\'

RefImg = imread("lukas_2d_8_head_1_t_resize.tiff", 'tiff'); % input the reference image here

% _______________extract data from compressed files___________________

ImFldr = "lukas_2d_8_head_1_t_resize_comp"; % need to input the folder containing the compressed image files
cd(ImFldr);

m = dir; % 4x1 struct of file attributes in the folder

[~, ~, DataFileName, HuffmanRedFileName] = m.name; % ~ is assigned to unused values

DataFileContents = imread(DataFileName, 'tiff');


% 260x3 uint8

pidx = Byte2Dec(DataFileContents(257, :));
no_of_rows = Byte2Dec(DataFileContents(258, :));
no_of_cols = Byte2Dec(DataFileContents(259, :));
leastFreqEle = Byte2Dec(DataFileContents(260, :));

ks = 0 : 255;
vs = zeros(1, 256);
prbs = double(zeros(1, 256)); % double to be allowed in huffmandict(symbols, probabilities)

for i = 1 : 256
vs(i) = Byte2Dec(DataFileContents(i, :));
end

D = containers.Map(ks, vs);

% size = nr * nc + 1 (special symbol in BWT)
Tot = sum(vs); % vs contains the count of each element
% forming the probability vector
for i = 0 : 255
prbs(i + 1) = D(i) / Tot;
end

[HuffmanDict, avglen] = huffmandict(ks, prbs);

Splmtr = [];

for i = 261 : size(DataFileContents, 1)
    Splmtr(end + 1) = Byte2Dec(DataFileContents(i, :)); % each entry is a 3-byte row
end

% Given as input to HuffmanOrg to produce HuffmanDecData


HuffmanRed = imread(HuffmanRedFileName, 'tiff');
% 1x16302 uint8

% _________________function calls for decompression___________________

HuffmanDecData = HuffmanOrg(HuffmanRed, HuffmanDict);
% 1x22284 uint8

RLDecData = RLDecoding(HuffmanDecData, no_of_rows, no_of_cols);
% 1x37201 uint8

MTFDecData = MTF_inverse(RLDecData);
% 1x37201 uint8

BWTDecData = undo_BWT(MTFDecData, pidx);
% 1x37200 uint8

decompressedImg = undo_up_down(ImFldr, BWTDecData, no_of_rows, no_of_cols, Splmtr, leastFreqEle);
% 186x200 uint8

% clearing out the clutter in the workspace


clear D i Splmtr m ks vs prbs DataFileName HuffmanRedFileName ImFldr;

% _________________________check the results___________________________

tfs = isequal(RefImg, decompressedImg);

if tfs
disp("Original image is retrieved");

else
disp("Decompression unsuccessful...!!");

end

% _________________________Huffman decoding__________________________

function [H] = HuffmanOrg(M, Hfd)


H = uint8(zeros(1, 8 * (size(M, 2) - 1))); % initial array that stores the unpacked bit stream

for i = 1 : size(M, 2)
decnum = M(i); % the decimal number to be converted is decnum

% convert from decimal to binary


for j = 0 : 7
H(i * 8 - j) = mod(decnum, 2);
decnum = idivide(decnum, 2);
end

end

wsb = 1; % the number of change bits is at least 1

for i = 1 : 7

if H(end) == H(end - i) % detect the string of change bits


wsb = wsb + 1;

else
break % break when change is detected

end

end

H = H(1 : end - wsb); % remove change bits at the end

for i = 1 : 256
        Hfd{i, 2} = double(Hfd{i, 2}); % convert to double since huffmandeco takes only double
    end

H = double(H); % huffmandeco takes double
H = huffmandeco(H, Hfd);

H = uint8(H); % convert to uint8 to save space

end

% _______________Run-Length Encoding - 2symbol decoding________________

function [M] = RLDecoding(R, nr, nc)


M = zeros(1, nr * nc + 1); % to store the decoded RLE sequence
R = double(R); % to accomodate -1s as terminators
R = [R, -1, -1, -1, -1];
i = 1; % variable to loop through R
j = 1; % variable to loop through M

while i <= size(R, 2) - 4

% check if run length > 256
        if R(i) == R(i + 1) && R(i + 1) == R(i + 3) && R(i + 3) == R(i + 4)
            runlen = R(i + 5) * 256 + R(i + 2) + 1;
            M(j : j + runlen - 1) = R(i);
            i = i + 6;

% 1 < runlen < 256


elseif R(i) == R(i + 1)
runlen = R(i + 2) + 1;
M(j : j + runlen - 1) = R(i);
i = i + 3;

% runlen = 1
else
runlen = 1;
M(j : j + runlen - 1) = R(i);
i = i + 1;

end

j = j + runlen; % point j to the next empty spot in M

end

M = uint8(M);

end

% _______________________Move-To-Front decoding_______________________

function [L] = MTF_inverse(R)
Y = 0 : 255; % same as in MTF encoding stage
L = uint8(zeros(1, size(R, 2)));

R = double(R); % convert R to double in order to fit 1 to 256 into it as positions in Y

for i = 1 : size(R, 2)
        R(i) = R(i) + 1; % incrementing to undo the effect of the MTF encoding stage
    end

for i = 1 : size(R, 2)
L(i) = Y(R(i));
% constructing L(i) from Y and R(i)

Y = cat(2, Y(R(i)), Y(1 : R(i) - 1), Y(R(i) + 1 : end));


% updating Y every step of the way
end

end

% ________________inverse Burrows-Wheeler transform___________________

function [S] = undo_BWT(L, I)


F = sort(L); % the first column of the sorted rotations matrix

T = zeros(1, size(L, 2)); % T[i] = position of L[i] in F, i.e., F[T[i]] = L[i]

ks = [0]; % keyset for D


ps = [1]; % valueset for D
psi = 2; % temp var to hold start index of each char in F

for i = 2 : size(F, 2)

if F(i - 1) ~= F(i)
            ks(end + 1) = F(i); % growth can be ignored since ks is of limited length
            ps(psi) = i;
            psi = psi + 1;
        end

end

D = containers.Map(ks, ps); % D(i) --> position of the first occurrence of i in F

for i = 1 : size(L, 2)
T(i) = D(L(i));
D(L(i)) = D(L(i)) + 1;
end

S = uint8(zeros(1, size(L, 2)));

for i = 1 : size(L, 2)
S(i) = F(T(I)); % We get S in the reverse order
I = T(I);
end

% we can finally discard the special character appended at the BWT stage
    S = S(end : -1 : 2); % correct the order of S and remove the special character

end

% __________________1d to 2d conversion and storing____________________

function [ni] = undo_up_down(Name, ri, nr, nc, spl, lfe)

for i = 1 : size(ri, 2)

% to replace lfes at positions of zeros


if ri(i) == lfe && sum(find(spl == i)) == 0
ri(i) = 0;
end

end

ni = uint8(zeros(nr, nc)); % output 2D matrix

for i = 1 : nc

if mod(i, 2) % odd columns


ni(1 : nr, i) = ri((i - 1) * nr + 1 : i * nr);

else % even columns


ni(1 : nr, i) = ri(i * nr : -1 : (i - 1) * nr + 1);

end

end

% specifying the folder and file name to store it as a tiff file
Fldr = append(Name, '_decomp');
mkdir(Fldr);
cd(Fldr);

ImFileName = append(Name, '_orgimg.tiff');


imwrite(ni, ImFileName);

imshow(ni);

end

% _________________base 256 to base 10 conversion____________________

function [n] = Byte2Dec(b)


b = uint32(b); % classes of n and b need to match to do arithmetic
operations.

n = zeros(1, 1);
rdx = 256;

for i = 1 : 3
n = n + b(i) * (rdx ^ (3 - i)); % convert byte data to decimal
end

end
