CHAPTER 1
INTRODUCTION
1.1 OVERVIEW
Image processing is a technique used in various applications to enhance raw
images received from cameras/sensors placed on satellites, space probes, and
aircraft, or pictures taken in everyday life. The computational analysis of
objects in images is a very challenging issue, as it usually involves automatic tasks
for segmentation. This includes the detection of the objects represented,
extraction of representative features from the objects, matching between images,
rigid and non-rigid alignment of images, temporal tracking and motion analysis of
features in image sequences. Hence, image processing methodology has been
largely preferred. It is a form of signal processing for which the input is an image,
and the output may either be an image or a set of characteristics or parameters
related to the picture. Most image-processing techniques treat the image as a
two-dimensional signal. The quality of the input images plays a crucial role in the
success of any computational image analysis task: the higher their quality, the
easier and simpler the task. Thus, to improve the original quality of the input
images, suitable methods of computational image processing, such as noise removal,
geometric correction, edge and contrast enhancement, and illumination correction
or homogenization, are required.
Essential topics within the field of image processing include image restoration,
image enhancement, image compression, etc.
Image processing deals with the manipulation and analysis of images using
computer algorithms, improving the pictorial information for clear understanding.
This area is characterized by the need for extensive experimental work to
establish the viability of proposed solutions to a given problem. Thus, image
processing manipulates images to extract information, to emphasize or
de-emphasize certain aspects of the information contained in the image, or to
perform image analysis to obtain hidden information.
In some sense, "image processing" dates back to the earliest use of graphics
by humans. With the cost of processing being relatively high, it changed in the
1970s when digital image processing proliferated as cheaper computers and
dedicated hardware became available. The fast computers and signal processors
were available in the 2000s making the digital image processing as the most
common form of image processing. Subsequently, it is used because it is not only
the most versatile method but also the cheapest. The conventional methodologies
3
used median filteringin many research papers since the authors considered it as an
efficient methodology for removing the noise.
In the early days, topics like median filtering existed in many research
papers since the authors considered it as an efficient methodology for removing
the noise.
photographic tools. The shades and variations are all registered when the picture is
taken, and this information is converted into an image by the program. Digital
image processing methods deliver big advantages such as flexibility,
reproducibility and preservation of the original data accuracy.
A. Image representation
B. Image pre-processing
C. Image enhancement
D. Image restoration
E. Image analysis
F. Image reconstruction
G. Image data compression
1. Scaling
Image scaling (interpolation) resizes an image from one resolution
(dimension) to another resolution without losing the visual content in the picture.
Image interpolation algorithms can be grouped in two categories namely non-
adaptive and adaptive. In non-adaptive algorithms, computational logic is fixed
irrespective of the input image features, whereas in adaptive algorithms
computational logic is dependent upon the intrinsic image features and contents of
the input image. When the image is interpolated from a higher resolution to a
lower resolution, it is called image down-scaling or down-sampling. On the
other hand, when the image is interpolated from a lower resolution to a higher
resolution, it is referred to as image up-scaling or up-sampling. Image interpolation
has a variety of applications in the areas of computer graphics, editing, medical
image reconstruction. For instance, scaling up is used to enlarge images for HDTV
or medical image displays while scaling down is applied to shrink images to fit
mini-size LCD panel in portable instruments. It is also a part of many commercial
image processing tools or freeware graphic viewers such as Adobe Photoshop CS2
software, IrfanView, Fast Stone Photo Resizer, Photo PosPro, XnConvert etc.
Numerous digital image scaling techniques have been presented, of which the
most popular methods are the pixel-replication-based nearest neighbor replacement
algorithm, pixel-interpolation-based bilinear, and filter/kernel-based cubic, bicubic,
B-spline, box, triangle and Lanczos methods.
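To make the two interpolation families concrete, the following is a minimal sketch (in Python/NumPy rather than the MATLAB environment used elsewhere in this work) of two non-adaptive scaling methods, nearest-neighbor replacement and bilinear interpolation; the function names and the toy 4×4 image are illustrative assumptions, not part of the original work.

```python
import numpy as np

def nearest_neighbor_resize(img, out_h, out_w):
    """Non-adaptive scaling: each output pixel copies its nearest source pixel."""
    in_h, in_w = img.shape
    rows = (np.arange(out_h) * in_h / out_h).astype(int)
    cols = (np.arange(out_w) * in_w / out_w).astype(int)
    return img[rows][:, cols]

def bilinear_resize(img, out_h, out_w):
    """Non-adaptive scaling: each output pixel is a weighted mean of 4 neighbours."""
    in_h, in_w = img.shape
    y = np.linspace(0, in_h - 1, out_h)
    x = np.linspace(0, in_w - 1, out_w)
    y0, x0 = np.floor(y).astype(int), np.floor(x).astype(int)
    y1, x1 = np.minimum(y0 + 1, in_h - 1), np.minimum(x0 + 1, in_w - 1)
    wy, wx = (y - y0)[:, None], (x - x0)[None, :]
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

if __name__ == "__main__":
    img = np.arange(16, dtype=float).reshape(4, 4)    # toy 4x4 "image"
    print(nearest_neighbor_resize(img, 8, 8).shape)   # up-sampling  -> (8, 8)
    print(bilinear_resize(img, 2, 2))                 # down-sampling -> 2x2 result
```

An adaptive algorithm would instead change this fixed computational logic based on local image content, for example by interpolating along detected edges.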
2. Rotation
3. Mosaic
(C) Image Enhancement
The value of a pixel with coordinates (x, y) in the enhanced image is the
result of performing some operation on the pixels in the neighbourhood of (x, y) in
the input image, F. Neighbourhoods can be of any shape, but usually they are
rectangular.
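As an illustration of a neighbourhood operation of this kind, the sketch below (a Python/NumPy assumption, not an implementation from this thesis) computes each enhanced pixel as the mean of a rectangular 3×3 neighbourhood of the input image F.

```python
import numpy as np

def mean_filter_3x3(F):
    """Each enhanced pixel G(x, y) is the mean of the 3x3 neighbourhood of (x, y) in F.
    The border is handled by edge replication so the output keeps the input size."""
    padded = np.pad(F.astype(float), 1, mode="edge")
    G = np.zeros_like(F, dtype=float)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            G += padded[1 + dy : 1 + dy + F.shape[0], 1 + dx : 1 + dx + F.shape[1]]
    return G / 9.0

if __name__ == "__main__":
    F = np.array([[10, 10, 10],
                  [10, 100, 10],    # a single bright outlier
                  [10, 10, 10]])
    print(mean_filter_3x3(F))       # the outlier is smoothed toward its neighbours
```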
(Figure: block diagram of an image compression system — the digital image is encoded, passed through transmission, storage and archiving, and decoded to give the reconstructed digital image.)
Lossless compression is the compression of data such that, when decompressed,
the data is an exact replica of the original. Here, the binary data get compressed,
and these data are required to be reproduced exactly when they are decompressed
again. On the contrary, images and music need not be reproduced 'exactly.' A
resemblance of the actual image is sufficient for most purposes, as long as the
error between the actual and compressed image is avoidable or tolerable.
This type of compression is also noiseless, as it never adds noise to the signal
or image. It is also termed entropy coding, and it uses the following techniques:
(Figure: lossless compression system — compressor → channel → decompressor.)
1. Run-length encoding
2. Huffman encoding
3. LZW coding
4. Area coding
1. Run-length encoding:
This is a very simple procedure used for sequential data and is very useful
for redundant data. The method replaces sequences of identical symbols
(pixels), called runs, by shorter symbols. The run-length code for a
grayscale image is represented by a sequence {V, R}, where V is the intensity of a
pixel and R is the number of consecutive pixels with the intensity V. An
example is shown in Figure 1.5 below:
1 1 1 1 0 0 2 2 2 2
Step 3: Read the next character or symbol; if the character is the last in the string, then
exit; otherwise:
A: If the next symbol is the same as the previous symbol, then give it the same
unique value as previously.
B: Else, if the next symbol is not the same, give it a new value that does not match
the previous one.
Step 5: Go to step 3 until a non-matching value has been assigned to every symbol that
differs from its previous one.
Step 6: Display the result, that is, the count of occurrences of each symbol, together with
that particular symbol.
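A minimal sketch of the run-length idea described above, using the example row of Figure 1.5 as test data; the Python function names are illustrative assumptions rather than the thesis's implementation.

```python
def rle_encode(pixels):
    """Replace each run of identical symbols by a (value V, run length R) pair."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1          # same symbol as before: extend the current run
        else:
            runs.append([p, 1])       # new symbol: start a new run
    return [(v, r) for v, r in runs]

def rle_decode(runs):
    """Expand each (V, R) pair back into R copies of V."""
    out = []
    for v, r in runs:
        out.extend([v] * r)
    return out

if __name__ == "__main__":
    row = [1, 1, 1, 1, 0, 0, 2, 2, 2, 2]      # the example row from Figure 1.5
    code = rle_encode(row)
    print(code)                               # [(1, 4), (0, 2), (2, 4)]
    assert rle_decode(code) == row            # lossless: decoding restores the row
```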
2. Huffman encoding
(Worked example read from the figure: the symbols l, k, j, h occur with frequencies 8, 3, 20 and 15 respectively; sorted by frequency this gives k (3), l (8), h (15), j (20), and the two smallest entries, k and l, are taken first.)
Step 4: Merge the two lowest-frequency entries together and update the data.
(Merging k (3) and l (8) gives a node of weight 11; merging that node with h (15) gives 26; the final merge with j (20) completes the code tree.)
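The merging procedure sketched in the worked example above can be written compactly as follows. This is a generic Huffman construction in Python (an illustrative assumption, not the thesis's code), fed with the example frequencies k = 3, l = 8, h = 15, j = 20 as read from the figure.

```python
import heapq

def huffman_codes(freqs):
    """Build prefix codes by repeatedly merging the two least probable nodes."""
    # each heap entry: (frequency, tie-breaker, {symbol: code-so-far})
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)     # least probable node
        f2, _, right = heapq.heappop(heap)    # second least probable node
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        count += 1
        heapq.heappush(heap, (f1 + f2, count, merged))
    return heap[0][2]

if __name__ == "__main__":
    # frequencies from the worked example: k=3, l=8, h=15, j=20
    freqs = {"k": 3, "l": 8, "h": 15, "j": 20}
    codes = huffman_codes(freqs)
    print(codes)   # the rare symbols k and l get the longest code-words
    avg_len = sum(freqs[s] * len(codes[s]) for s in freqs) / sum(freqs.values())
    print("average code length:", avg_len, "bits per symbol")
```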
3. LZW Coding
LZW (Lempel-Ziv-Welch) coding is fully dictionary-based. It is divided
into two subcategories, namely static and dynamic. In static coding, the dictionary is fixed
during the encoding and decoding processes. In dynamic, the dictionary is updated
if the change is needed. LZW compression replaces strings of characters with
single codes. It does not perform any analysis of the incoming text. Instead, it just
adds every new string of characters it sees to a table of strings. The code that the
LZW algorithm outputs can be of any arbitrary length, but it must have more bits
in it than a single character. LZW compression works best for files containing lots
of repetitive data. LZW compression maintains a dictionary. In this dictionary all
the stream entry and code are stored as shown in Figure 1.6.
(Figure 1.6: LZW dictionary example — the input stream 7 8 3 22 5 7 8 3 9 22 5 10 is encoded as C1 C2 C1 9 C2 10, where the repeated string 7 8 3 receives the unique code C1 and the repeated string 22 5 receives the unique code C2.)
LZW codes (C1 and C2) can be obtained by starting the encoding process
with an empty dictionary. The first phrase entered into the dictionary has
code value 1, which can be represented using only a single bit. The 2nd
and 3rd phrases entered into the dictionary have code values 2 and 3
respectively and can be represented by 2 bits. Similarly, 3 bits are required for the 4th
to 7th phrases entered into the dictionary, and so on. This is in contrast to
standard LZW coding, where the first 256 phrases entered initially into the dictionary are
Step 2: Initialize the dictionary to contain an entry for each character of the stream.
Step 3: Read the stream; if the current byte is the end of the stream, then exit.
Step 4: Otherwise, read the next character and produce a new code. If a group of
characters occurs frequently, give it a unique code.
Step 5: Read the next input character of the stream and look it up in the dictionary; if
there is no such character in the dictionary, then:
C: Go to step 4.
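The following is a minimal sketch of LZW encoding consistent with the steps above, written in Python as an illustration (the per-phrase code-width handling discussed earlier is omitted); the dictionary is initialized from the distinct characters of the stream, as in Step 2.

```python
def lzw_encode(data):
    """Dictionary-based coding: replace growing strings of symbols by single codes."""
    # start with one dictionary entry per distinct character of the stream (Step 2)
    dictionary = {ch: i for i, ch in enumerate(sorted(set(data)))}
    next_code = len(dictionary)
    current, output = "", []
    for ch in data:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate                # keep extending the matched string
        else:
            output.append(dictionary[current])
            dictionary[candidate] = next_code  # add the new string to the dictionary
            next_code += 1
            current = ch
    if current:
        output.append(dictionary[current])
    return output, dictionary

if __name__ == "__main__":
    stream = "ABABABA"                         # repetitive data compresses well
    codes, table = lzw_encode(stream)
    print(codes)    # [0, 1, 2, 4] with A=0, B=1, AB=2, BA=3, ABA=4
    print(table)
```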
4. Area coding
This method can be highly effective, but it bears the problem of being a nonlinear
method, which cannot be implemented in hardware. Therefore, the performance in
terms of compression time is not competitive, although the compression ratio is high.
1.4.2 LOSSY
1. Transformation coding.
2. Vector quantization.
3. Fractal coding.
4. Block truncation coding.
5. Subband coding.
6. Chroma subsampling.
1. Transformation Coding
Transforms such as the DFT and DCT are used to change the pixels of
the original image into frequency-domain coefficients. These coefficients have
several useful properties; one is the energy compaction property, which is the
basis for achieving compression.
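A small sketch of the compaction property: the orthonormal DCT-II matrix is built explicitly (Python/NumPy, an illustrative assumption) and applied to a smooth 8×8 block, after which most of the energy is found in the few lowest-frequency coefficients.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix: row k holds the k-th cosine basis vector."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    C = np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    C[0, :] *= 1 / np.sqrt(2)
    return C * np.sqrt(2.0 / n)

if __name__ == "__main__":
    C = dct_matrix(8)
    block = np.tile(np.linspace(50, 200, 8), (8, 1))   # a smooth 8x8 ramp block
    coeffs = C @ block @ C.T                           # 2-D DCT of the block
    energy = coeffs ** 2
    top_left = energy[:2, :2].sum() / energy.sum()
    print(f"energy held by the 4 lowest-frequency coefficients: {top_left:.4f}")
```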
2. Vector Quantization
3. Fractal Coding
In this method, the image is first divided into blocks of pixels. Then, a
threshold and reconstruction values are found for each block; a bitmap of the
block is derived, and every pixel whose value is greater than or equal to the
threshold is replaced by 1, and the remaining pixels by 0.
5. Subband Coding
6. Chroma Subsampling
This method exploits the human visual system's lower acuity for color
differences. The technique is basically used in video and image encoding, for
example JPEG encoding. Chroma sub-sampling stores the color information at a
lower resolution than the intensity (luma) information. The overwhelming
majority of graphics programs perform 2×2 chroma sub-sampling, which breaks
the image into 2×2 pixel blocks and stores only the average color information for
each 2×2 pixel group. The entire procedure is exhibited in Fig 1.8.
(Figure 1.8: 2×2 chroma sub-sampling of the Cb and Cr planes.)
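A minimal sketch of 2×2 chroma sub-sampling applied to a single chroma plane (Python/NumPy, illustrative only); the Y plane would be kept at full resolution, while the Cb and Cr planes are reduced to one average per 2×2 block.

```python
import numpy as np

def subsample_chroma_2x2(Cb):
    """Keep only the average chroma value of every 2x2 pixel block (4:1 reduction);
    the full-resolution luma (Y) channel would be left untouched."""
    h, w = Cb.shape
    blocks = Cb[: h - h % 2, : w - w % 2].reshape(h // 2, 2, w // 2, 2)
    return blocks.mean(axis=(1, 3))

def upsample_chroma_2x2(Cb_small):
    """Reconstruction simply repeats each stored average over its 2x2 block."""
    return np.repeat(np.repeat(Cb_small, 2, axis=0), 2, axis=1)

if __name__ == "__main__":
    Cb = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 chroma plane
    small = subsample_chroma_2x2(Cb)                # 2x2 plane of block averages
    print(small)
    print(upsample_chroma_2x2(small))               # approximate reconstruction
```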
There are hundreds of image file types of which PNG, JPEG, and GIF
formats are the most often used to display images on the internet.
1. JPEG/JFIF Format
In this format, the degree of compression can be specified, and this amount of
compression affects the visual quality of the result as well.
3. Exif Format
4. TIFF Format
5. RAW Format
6. GIF Format
7. BMP Format
The BMP file format handles graphics files within the Microsoft Windows
OS. Typically, BMP files are uncompressed and hence large; their advantages
are simplicity and wide acceptance in Windows programs.
8. PNG Format
PNG supports truecolor images with and without an alpha channel, while GIF
supports only 256 colors and a single transparent color.
Data compression is a method that takes input data D and generates
data C(D) with a lower number of bits compared to the input data. The reverse
process is called decompression, which takes the compressed data C(D) and
reconstructs the data D', as shown in Figure 1.9.
Figure 1.9 Compression Algorithms: D → C(D) → D'
Uses of the DCT range from the lossy compression of audio (for example, MP3)
and images (for example, JPEG), in which small high-frequency components may
be discarded, to spectral methods for the numerical solution of partial differential
equations. The use of cosine rather than sine functions is critical in these
applications: for compression, it turns out that cosine functions are much more
effective, as fewer functions are needed to approximate a typical signal, whereas
for the differential equations the cosines express a particular choice of boundary
conditions.
JPEG image compression works in part by rounding off the non-essential bits of
information. There is a corresponding trade-off between information loss and size
reduction. A number of popular compression techniques exploit these perceptual
differences, including those used for music files, video, and images. Hence,
JPEG's lossy encoding tends to be very prudent with the gray-scale (luminance)
portion of the image and much less careful with the color.
The DCT separates the image into parts of differing frequencies; the less
significant frequencies are discarded through the quantization process, while the
more significant frequencies are used to retrieve the image during decompression.
The main steps are as follows (a small sketch after the list illustrates them):
The pixel values of a black-and-white image range over [0, 255], but the DCT
expects values in the range [-128, 127]; so each block is first shifted from
[0, 255] to [-128, 127].
The DCT is applied to each block, working from left to right and top to bottom.
Each block is compressed through quantization.
The quantized matrix is entropy encoded.
The compressed image is reconstructed through the reverse process, which
uses the inverse Discrete Cosine Transform (IDCT).
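The sketch below strings the listed steps together for a single 8×8 block (Python/NumPy, illustrative only): level shift, 2-D DCT, quantization, and the reverse path through the IDCT. A uniform quantization matrix is used as a stand-in for the standard JPEG tables, and the entropy-coding stage is omitted.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix used for the forward and inverse transforms."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    C[0, :] /= np.sqrt(2)
    return C

Q = np.full((8, 8), 16.0)    # uniform stand-in for a JPEG quantization table
C = dct_matrix()

def compress_block(block):
    """Level-shift to [-128, 127], apply the 2-D DCT, then quantize."""
    shifted = block.astype(float) - 128.0
    coeffs = C @ shifted @ C.T
    return np.round(coeffs / Q)

def decompress_block(q):
    """Dequantize, apply the inverse DCT, and undo the level shift."""
    coeffs = q * Q
    shifted = C.T @ coeffs @ C
    return np.clip(np.round(shifted + 128.0), 0, 255)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    block = rng.integers(100, 140, size=(8, 8))       # a fairly smooth block
    restored = decompress_block(compress_block(block))
    print(np.abs(restored - block).max())             # non-zero: quantization is lossy
```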
The general idea of the Huffman encoding algorithm is to allocate very short code-
words to those input blocks with high probabilities and long code-words to those
with low probabilities.
Huffman coding is based on the two observations mentioned below:
Symbols that occur more frequently have shorter code-words than symbols
that occur less frequently.
The two least frequent symbols have code-words of equal length.
The Huffman code is prepared by combining the two least probable
characters and repeating this process until only one character remains. A
code tree is thus prepared, and the Huffman code is then generated by labelling
the code tree. It is the optimal prefix code generated from the set of
probabilities and has been used in different compression applications.
The generated code-words have different lengths, each using an integral
number of bits. This reduces the average code length, and hence the overall size
of the compressed data becomes smaller, providing a solution to the problem of
constructing codes with minimum redundancy.
The present research has been carried out in stages. The main
objectives of stages 1, 2 and 3 are stated below:
1.8 SUMMARY
Image processing is the study of representation and manipulation of
pictorial information. Digital image processing is performed on digital computers
that manipulate images as arrays or matrices of numbers. High computational
speed, high video resolution, more efficient computer languages to process the data,
and more efficient and reliable computer vision algorithms are some of the factors
that have allowed fields such as medical diagnosis, industrial quality control, robotic
vision, astronomy, and intelligent vehicle/highway systems to join the large list of
applications that use computer vision analysis to achieve their goals.
More and more complex techniques have been developed to achieve new goals
unthinkable in the past.
CHAPTER 2
LITERATURE SURVEY
2.1 INTRODUCTION
it in the transmission process. This paper points out Lossy as well as Lossless
compression techniques as used in the fields of image processing.
algorithm and fixes the quality degradation problem. Replacing the lossy compression
technique gives a better compression ratio without degradation of quality. To
solve the problem, a Huffman coding approach to lossless image compression is
used. Using the lossless Huffman coding compression algorithm, the image
is compressed up to 40.08%. Image compression plays a vital role in saving
memory and transmission time, and the Huffman algorithm is comparatively
better among the overall compression techniques.
Rachit Patel et al. (2016) proposed the Image Compression using the
Huffman coding technique as a simpler and the easiest compression technique.
Compression of the image is an important task, as its implementation is easy and
it requires less memory. The objective of this paper is to analyze the Huffman coding
technique which is basically used to remove the redundant bits in data. This is
achieved by analyzing different characteristics or specification like Peak Signal to
Noise Ratio (PSNR), Mean Square Error (MSE) Bits Per Pixel (BPP) and
Compression Ratio (CR) for various input images of different sizes. Further, the
paper shows the devising of a new method of splitting an input image into equal
rows and columns. The final stage sums up all the individual compressed images,
which not only provides better results but also secures the information content. An
image compression technique has various advantages in the field of image analysis
and also for the security purpose for the image.
transformation, and Entropy encoding. The Color Space Conversion (CSC) is one
of the most important processes and can influence the compression and quality of
the image. JPEG XR uses a CSC, which is different from the traditional image
compression algorithms. JPEG XR also supports many image input formats like
RGB, YUV, CMYK, Monochrome, and arbitrary N-channel color formats. JPEG
XR uses the same Frequency Transformation and entropy encoding for all the
input image formats. In the case of JPEG XR, there is a flexibility of using
different CSC for different input image formats like RGB and CMYK. The JPEG
XR bypasses the CSC for images having uncorrelated data like YUV. There are
a lot of existing works on the comparison of JPEG XR with other image
compression algorithms like JPEG, but there is very little research or literature on
comparing the image compression ratios and quality of images in different formats
using JPEG XR compression algorithm. In this study, an analysis and comparison
of the compression ratios of the images of different input formats in particular to
RGB input format and YUV 444 format has been carried out to explore the effects
of CSC on the image compression while using the JPEG XR. An analysis of
effective compression (better compression ratio) is also carried out on various
images of unique visual characteristics in different input formats when processed
using JPEG XR.
Vijaya Kumar et al. (2017) deals with the main objective of source coding
to represent the symbols or messages generated from an information source in a
suitable form so that the size of the data is reduced. In image compression, JPEG
is normally used, where huge number of zeros are generated in medium and high
frequency region of the transformed image using the combination of DCT
(Discrete Cosine Transform) and quantization. This is required to reduce the run-
length coding of an image. The process is lossy compression but provides good
illusion at a glance. In this paper, the Haar wavelet for image compression is
adopted. Haar Transformation is employed with an idea to minimize the
computational requirements by applying different compression thresholds for the
wavelet coefficients; the results are obtained in a fraction of a second and
improve the quality of the reconstructed image. The results show that the
reconstructed images have high compression rates and better image quality.
of the DWT coefficients. The quality of the compressed images has been evaluated
using some factors like the Compression Ratio (CR) and Peak Signal to Noise
Ratio (PSNR). Experimental results demonstrate that the proposed technique
provides a sufficiently higher compression ratio compared to other compression
thresholding techniques.
technique that uses tetrominoes for image compression with less edge blurring.
The reduction in file size allows more images to be stored in a given amount of
disk or memory space.
Hidayah Rahmalan et al. (2010) came out with an analysis of orthogonal
moments, which are known to be better moment functions than non-orthogonal
moments. Among all the orthogonal moments, the Tchebichef Moment
appears to be the most recent moment function, that still attracts interest among
the computer vision researchers. The author proposes a novel approach based on
discrete orthogonal Tchebichef Moment for efficient image compression. The
image compression is useful in many applications, especially related to images
that need to be seen on small devices such as mobile phones. Meanwhile,
various orders of different discrete orthogonal moments are examined. Finally, the results
are obtained by the reconstruction of three color images with different families of
orthogonal moments, and an error analysis is carried out to compare their descriptive
capacity and the efficiency of the proposed transform for image and video coding. Furthermore,
Xilinx Virtex-6 FPGA based hardware realization shows a 44.9% reduction in
dynamic power consumption and 64.7% lower area when compared to the
literature.
Hameed et al. (2016) deals with the idea of the evolutionary optimized
coefficients of discrete orthogonal Tchebichef moment transform (TMT). The
objective of using this transform in this study is to ameliorate the quality of the
traditional moment-based image compression methods. Most of the existing
methods compute moment-transform coefficients for the input image and then
select the coefficients sequentially downward to a certain order, based on the
desired compression ratio. However, the proposed method divides the input image
into nonoverlapping square blocks of a specific size in order to circumvent the
problem of numerical instability and then computes the TMT coefficients for each
block. In this work, a real-coded genetic algorithm is employed to optimize the
TMT coefficients of each block, which produces reconstructed images of better
quality for the desired compression ratio. Here, the optimization is carried out by
minimizing the mean square error function. Standard test images of two different
sizes (128×128 and 256 ×256) have been subjected to the proposed compression
method for the block sizes (4×4 and 8×8) in order to assess its performance. The
results reveal that the proposed real-coded genetic algorithm-based method
outperforms others, namely the conventional sequential selection method and
simple random optimization method, for the chosen input images in terms of the
task of compression.
Signal to Noise Ratio), and the number of bits required to encode the coefficients
for both DCT and DTT are verified. It has been demonstrated that DTT requires a
lesser number of bits to encode the coefficients than DCT for a given compression
ratio.
G. A. Papakostas et al. (2005) present a new method for extracting feature sets with
improved reconstruction and classification performance in computer vision
applications. The main idea is to propose a procedure for obtaining surrogates
of the compressed versions of very reliable feature sets without affecting their
reconstruction and recognition properties significantly. The surrogate feature
vector is of lower dimensionality and is thus more appropriate for pattern
recognition tasks. The proposed Feature Extraction Method
(FEM) combines the advantages of the multiresolution analysis, which is based on
the wavelet theory, with the high discriminative nature of Zernike moment sets.
Ashlin Deepa et al. (2014) note that character recognition is one of the most important
areas in the field of pattern recognition. Recently, Indian handwritten character
recognition has been getting much more attention, and researchers are contributing a lot
in this field. But Malayalam, a South Indian language, has very few works in
this area and needs further attention. Malayalam OCR is a complex task owing to
the various character scripts available and, more importantly, the difference in the
ways in which the characters are written. The dimensions are never the same and
may never be mapped onto a square grid, unlike English characters. The selection
of a feature extraction method is the most important factor in achieving high
recognition performance in character recognition systems. Different feature
extraction methods are designed for different representations of characters. As an
important component of pattern recognition, Feature Extraction has been paid
close attention by many scholars, and currently it has become one of the research
hot spots in the field of pattern recognition.
research is focused on developing the zigzag scan algorithm with the mapping
method. Zigzag Scan with mapping method is a sorting process of DCT-quantized
data result according to the position sequence determined in a zigzag. The study
exhibits the implementation of the Zigzag Scan mapping method into FPGA using
ROM that serves as zigzag address generator, and RAM that serves read and write
data according to a Zig-zag address generated by ROM. The efficiency of the Zig-
zag Scan with the mapping method has been successfully developed. It is able to
accelerate the sorting process of the DCT-quantized coefficients period because
input data can be immediately located in sequence position, which has been
determined without any value comparison and repetition process. The Zig-zag Scan
with mapping method operates at 250 MHz, i.e., approximately 4 ns per byte of data
(12 ns per pixel), with a delay time (latency) of 64 clocks. It means that the
generated IC Zig-zag Scan prototype can be operated in real-time JPEG/MPEG
compression with a maximum of 3 megapixels per frame for video at 25 fps
(frames per second). The generated IC Zig-zag Scan component (IP Core)
needs ten slices of Flip-flop (30 times less than Arafa method and 2 times less than
Ketul method) and LUT (look up table) as many as 39 slices (20 times less than
Arafa method and 2 times less than Ketul method).
nonedge blocks are compressed with high compression. The analysis results
indicate that the performance of the suggested method is much better, where the
constructed images are less distorted and compressed with a higher factor.
Gunapriya et al. (2014) present the compression of color medical images with
different color spaces. Even though multimedia data storage and communication
technologies have attained rapid growth, compression of color medical images
remains a challenging task. In the proposed method, color medical images are
converted to different color spaces such as YCbCr, NTSC and HSV. Then,
decomposition of different color space images is done using the curvelet
transform. The decomposed images are then compressed using the Huffman
coding. The results obtained for different color spaces are compared in terms of
compression ratio and bits per pixel.
Amit Kumar Mandal et al. (2013) describe an experimental study in which a system has been
developed that uses threshold values to segment an image. An image is treated as a
one-dimensional array of pixel values. In this work, the segmentation is performed
in the YCbCr color space. In this experiment, the segmented image will have 2
different colors, which are black and white, and for this reason, the segmentation
is done using local thresholding value for the Cb component of YCbCr. A mask is
used to determine the neighbors of each pixel in the image. The mask also
determines an operation to be applied to the neighborhood of every pixel in the
image. Now the mask and the operations are used to determine the local threshold
for every pixel in the image. For each pixel location, the threshold will be
different. This value is compared with the color value of the pixel. If the value of
the pixel in this location is greater than or equal to the specified threshold for the
pixel, then it is labeled as 1. Otherwise, if it is smaller, then it is labeled as 0. In
this way, an image with two color values is achieved.
H.B. Kekre et al. (2013) discussed image compression using a hybrid wavelet
transform. A hybrid wavelet transform matrix is generated using two-component
transform matrices. One of the component transform matrices contributes to global
properties, whereas the second one contributes to local properties of an image.
Different sizes of component transform matrix can be used to generate hybrid
transform matrix so that its size is the same as the size of the image. Different colour
images of size 256×256 are used for experimentation. The proposed hybrid
wavelet transform is applied on red, green and blue planes of image separately.
Then, in each plane transformed coefficients are sorted in descending order of
their energy and the lowest energy coefficients are eliminated. Root mean square
error between original image and reconstructed image is calculated to check the
performance at different compression ratios. By varying the size of pair of
component transform matrices, hybrid transform matrix is constructed and results
are observed. Also, by changing the component matrix which contributes to the local
properties of the image and varying its size, results are observed and compared. It
has been observed that if more focus is given on local features, then better results
can be produced due to the implementation of the Hybrid Wavelet Transform.
Focusing on local features can be done by selecting larger size of orthogonal
component transform that contributes to local properties.
Athira B. Kaimal et al. (2013) deal with the fact that the increasing attractiveness
of and trust in digital photography have given rise to its use for visual
communication. This requires the storage of large quantities of data. Due to limited
bandwidth and storage capacity, images must be compressed before storing and
transmitting. Many techniques are available for compressing the images, yet, in
some cases these techniques will reduce the quality and originality of image. As a
result of compression, some traces like blocking artefacts, and transformation
fingerprints are also introduced into the reconstructed image. It mainly affects the
medical imaging by reducing the fidelity and thereby introducing diagnostic
errors. Many hybrid techniques are also developed to overcome these problems.
This paper addresses various compression techniques as they are applicable to
various fields of image processing.
Mehala et al. (2013) proposed the New Image Compression Algorithm Using
Haar Wavelet Transformation technique. The proposed 8×8 transform matrix can
be obtained by appropriate inserting of some 0’s and 1/2’s into the Haar Wavelet.
The proposed Haar Wavelet algorithm is based on integers and yields a sufficiently
sparse orthogonal transform matrix. A Haar Wavelet algorithm for fast
computation is developed, besides calculating various measures like Compression
Ratio, PSNR, Threshold Value and Reconstructed Normalization. The proposed
algorithm has been implemented in MATLAB.
using some factors like Compression Ratio (CR), Peak Signal to Noise Ratio
(PSNR), Mean Opinion Score (MOS), Picture Quality Scale (PQS) etc.
B. Penna et al. (2007) deal with the point that transform-based lossy
compression has a huge potential for hyperspectral data reduction. Hyperspectral
data are 3-D, and the nature of their correlation is different in each dimension.
This variation calls for a careful design of the 3-D transform to be used for
compression. In this paper, the transform design and rate allocation stage for lossy
compression of hyperspectral data are examined. Firstly, a set of 3-D transforms
is selected, which is obtained by combining in various ways wavelets, wavelet
packets, the discrete cosine transform, and the Karhunen-Loève transform
(KLT). Then, the coding efficiency of these combinations is evaluated. Secondly,
a low-complexity version of the KLT is proposed, in which complexity and
performance can be balanced in a scalable way, allowing one to design the
transform that better matches a specific application. Thirdly, both these
algorithms, as well as other existing transforms are integrated in the framework of
Part 2 of the Joint Photographic Experts Group (JPEG) 2000 standard, by taking
advantage of the high coding efficiency of JPEG 2000, and exploiting the
interoperability of an international standard. An evaluation framework based on
both reconstruction fidelity and impact on image exploitation is introduced, to
evaluate the proposed algorithm by applying this framework to AVIRIS scenes. It
is shown that the scheme based on the proposed low-complexity KLT significantly
outperforms previous schemes as to rate-distortion performance. As for impact on
exploitation, multiclass hard classification, spectral unmixing, binary
classification, and anomaly detection are considered as benchmark applications.
Ghadah Al-Khafaji et al. (2017) analyzed that lossy image compression
has become an increasingly intensive research area. This is due to the importance
given to daily visual media applications, including TV, video film, the internet etc.
These applications are basically based on losing some unrecognized or unwanted
information, and managing non-noticeable distortion quality changes is traded off
against high compression ratios. A lossy image compression is introduced, based
on utilizing three techniques of wavelet, polynomial prediction and block
truncation coding, in which each technique is exploited according to the redundancy
present. The test results showed promising performance achievement in terms of
higher compression with lower noticeable error or degradation.
CHAPTER 3
3.1 OBJECTIVES
3.2 INTRODUCTION
Image compression techniques fall into two broad categories: information-
preserving (lossless) compression and lossy compression. A simple
purpose of lossy image compression is to minimize the bit rate (i.e. the number of
bits required to represent each pixel). This conserves time and capacity in
digital image processing while preserving a reasonable image quality.
The process of decreasing the size of a data file is commonly termed
data compression, though its formal name is source coding, that is, coding
done at the source of the data before it gets stored or sent. In these methods, some
loss of information is acceptable. Dropping non-essential information from the data
source can save storage space. Lossy data-compression methods are guided
by research on how people perceive the data in question. As an example,
the human eye is more sensitive to slight variations in luminance than to
variations in color. The lossy image compression technique is used
in digital cameras to increase storage capacity with minimal degradation of
picture quality. Similar is the case of DVDs, which use the lossy MPEG-2
Video codec technique for the compression of the video. In the lossy audio
compression, the techniques of psychoacoustics have been used to eliminate the
non-audible or less audible components of the signal. There are many algorithms
developed for image data compression. Each Algorithm is based on a basic
principle of redundancy reduction of the given image. Based on this principle, the
algorithm can be divided into two basic techniques, namely, Spatial Coding
Technique and Transform Coding technique. Spatial Coding Technique operates
directly on the pixels of an image.
are proposed for further quantization and entropy encoding. The decoder
reconstructs the original image by applying the inverse transform. Transform
Coding is used to map the image into a set of transform coefficients, which are
then quantized and coded.
Wavelets have come into the picture and have become attractive techniques for image
compression, as they give time and frequency analysis of data. Wavelet transform
can be directly applied to the whole image without blocking it. Wavelet-based
Coding is more robust under transmission and decoding errors. The Multi-
resolution property of Wavelet transforms helps to view the image at different
scales. A recent trend is to use a hybrid technique for image compression in which
one transform is combined with another transform to incorporate the advantages of
both transforms.
efficiency while reducing its complexity. The Mixed Approach discussed in this
chapter is a combination of the Transform and Spatial coding technique, which is a
simple yet efficient algorithm.
The suitability of the algorithm is tested for different types of images, such
as textured and untextured images. Performance measures such as
Compression Ratio (CR), Maximum Compression Achieved (MCA), Signal to
Noise Ratio (SNR) and Error Rate (ER) are evaluated. Compression Ratio is defined
as the ratio between the number of bits required to represent the compressed image
and the number of bits required to represent the original image. The proposed coding
scheme employs wavelet decomposition and then converts the quantized wavelet
coefficients into a plane consisting of 1's and 0's using Average Window Coding.
Wavelets are preferred in image compression because they have a lot of advantages:
Image compression using DWT gives a multi-resolution analysis of an
image. Therefore, an image can be compressed at different levels of resolution and
can be consecutively processed from low resolution to high resolution.
It is robust to common image processing.
Real-time images are space and band limited. Wavelets are localized in
time and frequency domains so as to capture local features in an image.
The main advantage of a wavelet basis is the support for a multi-resolution
analysis of the image. Compared to the Fourier Transform, the windowing problem is
reduced in wavelet-based decomposition by the use of varying window sizes, which
allows the image to be analyzed at different resolutions.
Figure 3.2 Block Diagram of transform-based image coder: (a) Encoder, (b) Decoder
The basic purpose of the DWT is to separate the image into slowly varying
and rapidly varying frequency content. The separation process is performed in
two orthogonal directions separately. The smooth component, derived from a low-
pass filter, represents the average pixel information, while the detail component,
derived from a high-pass filter, represents the differences between adjacent pixels.
Images are mathematically transformed to extract details from them; these
transforms expose details that are not readily apparent in the original representation.
Captured images are mostly time-domain signals in nature, yet this is not the best
way to represent them for most image processing applications.
In many cases, the most distinguishing information is hidden in the frequency
content of the signal and is not readily visible in its raw representation. Intuitively,
frequency is associated with the rate of change of information. Rapidly changing
information is considered high frequency, whereas information that does not change
rapidly, i.e., changes smoothly, is considered low-frequency information. If the
information does not change at all, it can be referred to as having zero frequency,
or no frequency. The Fourier Transform gives the details of the frequency content
that exists in the image taken for analysis. In some applications, where the time
localization of the spectral content is needed, a transform giving both the time and
frequency values of the image is required. The transform that gives such detail is
the Wavelet transform; it represents the information in both time and frequency.
Normally, the Wavelet Transform is used to analyze signals that are non-stationary
in nature.
It can be used to extract and encode edge information, which provides
significant visual cues, in separate sub-images.
The coefficients of the wavelet decomposition give information that is
independent of the original image resolution.
Thus, the image compression method based on the DWT enables analysis of the
image at different resolutions. The wavelet coefficients can be calculated in time
linear in the image size; by themselves, however, they do not reduce the size of the
data required to represent the image. Algorithms based on vector and scalar
quantization fall into the category of lossy compression methods, which require a
minimal amount of data to represent the image.
and 4 - (-1) = 5. Thus, the original image is decomposed into a lower resolution
(two-pixel) version and a pair of detail coefficients. Table 3.1 displays the full
decomposition achieved when this process is repeated recursively on the averages.
Table 3.1 Full Decomposition
Resolution Averages Detail Coefficients
4 [9 7 3 5] [2,-2,3,-3]
2 [8 4] [1 -1]
1 [6] [2]
Thus, for the one-dimensional Haar basis, the wavelet transform of the original
four-pixel image is given by [6 2 1 -1]. The method used to compute the wavelet
transform by recursively averaging and differencing coefficients is called a filter
bank. The image can be reconstructed to any resolution by recursively adding and
subtracting the detail coefficients from the lower resolution versions.
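The averaging-and-differencing filter bank described above can be sketched in a few lines of Python (an illustration, not the thesis's code); run on the four-pixel image [9 7 3 5] it reproduces the transform [6 2 1 -1] of Table 3.1 and reconstructs the original exactly.

```python
def haar_decompose(pixels):
    """Recursively replace pairs by (average, detail) until one average remains.
    Returns [overall average, coarsest detail, ..., finest details]."""
    coeffs = []
    data = list(pixels)
    while len(data) > 1:
        averages = [(a + b) / 2 for a, b in zip(data[0::2], data[1::2])]
        details  = [(a - b) / 2 for a, b in zip(data[0::2], data[1::2])]
        coeffs = details + coeffs          # keep the finer details at the end
        data = averages
    return data + coeffs

def haar_reconstruct(coeffs):
    """Undo the filter bank: average + detail and average - detail restore each pair."""
    data = coeffs[:1]
    rest = coeffs[1:]
    while rest:
        details, rest = rest[: len(data)], rest[len(data):]
        data = [v for a, d in zip(data, details) for v in (a + d, a - d)]
    return data

if __name__ == "__main__":
    image = [9, 7, 3, 5]
    transformed = haar_decompose(image)
    print(transformed)                      # [6, 2, 1, -1], as in Table 3.1
    print(haar_reconstruct(transformed))    # [9, 7, 3, 5]
```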
$$k = 2^{p} + q - 1 \tag{3.1}$$
where $k$ is in the range $k = 0, 1, 2, \ldots, N-1$. When $k = 0$, the Haar function is defined as the constant $h_0(t) = 1/\sqrt{N}$; when $k > 0$, the Haar function is defined as
$$h_k(t) = \frac{1}{\sqrt{N}} \begin{cases} 2^{p/2}, & (q-1)/2^{p} \le t < (q-0.5)/2^{p} \\ -2^{p/2}, & (q-0.5)/2^{p} \le t < q/2^{p} \\ 0, & \text{otherwise} \end{cases} \tag{3.2}$$
From the above equation, it can be seen that p determines the amplitude and width
of the non-zero part of the function, while q determines the position of the non-
zero part of the Haar function. The discrete Haar functions form the basis of the
Haar matrix H:
$$H_{2N} = \begin{bmatrix} H_N \otimes [1, \; 1] \\ I_N \otimes [1, \; -1] \end{bmatrix} \tag{3.3}$$
with $H(0) = 1$, where
$$I_N = \begin{bmatrix} 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \\ 0 & 0 & \cdots & 0 & 1 \end{bmatrix} \tag{3.4}$$
$$A \otimes B = \begin{bmatrix} a_{11}B & \cdots & a_{1n}B \\ \vdots & \ddots & \vdots \\ a_{m1}B & \cdots & a_{mn}B \end{bmatrix} \tag{3.5}$$
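The recursion of Eq. (3.3) translates directly into code; the following Python/NumPy sketch (an illustrative assumption, not the thesis's implementation) builds the un-normalized Haar matrix and, for N = 8, reproduces the matrix of Eq. (3.7) below.

```python
import numpy as np

def haar_matrix(N):
    """Build the (un-normalized) Haar matrix by the recursion
    H_2N = [ H_N kron [1, 1] ; I_N kron [1, -1] ], starting from H_1 = [1]."""
    H = np.array([[1.0]])
    while H.shape[0] < N:
        n = H.shape[0]
        top = np.kron(H, [1.0, 1.0])
        bottom = np.kron(np.eye(n), [1.0, -1.0])
        H = np.vstack([top, bottom])
    return H

if __name__ == "__main__":
    print(haar_matrix(8).astype(int))   # reproduces the 8x8 matrix of Eq. (3.7)
```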
$$H_N = \begin{bmatrix} \phi \\ h_{0,0} \\ h_{1,0} \\ h_{1,1} \\ \vdots \\ h_{k-1,0} \\ h_{k-1,1} \\ \vdots \\ h_{k-1,\,2^{k-1}-1} \end{bmatrix} \tag{3.6}$$
$$H[m,n] = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & -1 & -1 & -1 & -1 \\
1 & 1 & -1 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & -1 & -1 \\
1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & -1
\end{bmatrix} \tag{3.7}$$
From the definition of the Haar matrix H, it can be observed that, unlike the
Fourier transform, the H matrix has only real elements and is non-symmetric.
The first row of the H matrix measures the average value, and the second row
measures a low-frequency component of the input vector. The next two
rows are sensitive to the first and the second half of the input vector respectively,
$$D[1,1] = 2^{-k} \tag{3.10}$$
$$D[n,n] = 2^{-k+p}, \quad \text{if } 2^{p} < n \le 2^{p+1} \tag{3.11}$$
$$D = \begin{bmatrix}
1/8 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1/8 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1/4 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1/4 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1/2 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1/2 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1/2 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1/2
\end{bmatrix} \tag{3.12}$$
The Haar wavelet is constructed from the Haar wavelet transformation, which is
generated by the scaling function $\varphi = \chi_{(0,1)}(x)$, for $j, k \in \mathbb{Z}$:
$$\varphi(x) = \begin{cases} 1, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases} \tag{3.13}$$
$$\varphi(2^{j}x - k) = \begin{cases} 1, & k\,2^{-j} \le x < (k+1)\,2^{-j} \\ 0, & \text{otherwise} \end{cases} \tag{3.15}$$
This collection can be introduced as the system of Haar scaling functions.
Definition 3.2 (Haar Function)
Let $\psi(x) = \chi_{(0,1/2)}(x) - \chi_{(1/2,1)}(x)$ be the Haar function:
$$\psi(x) = \begin{cases} 1, & 0 \le x \le 1/2 \\ -1, & 1/2 \le x \le 1 \\ 0, & \text{otherwise} \end{cases} \tag{3.16}$$
$$\varphi_{j,k}(x) = 2^{j/2}\, \chi_{I_{j,k}}(x) \tag{3.18}$$
and
$$\psi_{j,k}(x) = 2^{j/2} \left( \chi_{I_{j+1,2k}}(x) - \chi_{I_{j+1,2k+1}}(x) \right) \tag{3.19}$$
This means that they do not vanish on $I_{j,k}$.
It is obvious that simple functions are dense in $L^{2}(\mathbb{R})$. Hence, the set of all finite
linear combinations of characteristic functions of intervals is also dense in $L^{2}(\mathbb{R})$.
It is also known that for every interval $I$, the characteristic function $\chi_I$ can be
approximated by functions in $\bigcup_{j \in \mathbb{Z}} V_j$. So
$$\overline{\operatorname{span}}\,\{V_j : j \in \mathbb{Z}\} = L^{2}(\mathbb{R}) \tag{3.21}$$
constant on arbitrarily small intervals, and $f \in L^{2}(\mathbb{R})$; therefore $f = 0$. In the next step of
the proof, the vector space $W_j$ is defined by
$$W_j = \overline{\operatorname{span}}\,\{\psi_{j,k} : k \in \mathbb{Z}\} \tag{3.22}$$
$$\bigoplus_{j \in \mathbb{Z}} W_j = L^{2}(\mathbb{R}) \tag{3.23}$$
By definition,
$$\chi_{[0,1)}(x) = \tfrac{1}{2}\,\chi_{[0,2)}(x) + \tfrac{1}{2}\left( \chi_{[0,1)}(x) - \chi_{[1,2)}(x) \right) \tag{3.25}$$
$$= \tfrac{1}{\sqrt{2}}\,\varphi_{-1,0}(x) + \tfrac{1}{\sqrt{2}}\,\Psi_{-1,0}(x) \in V_{-1} + W_{-1}$$
$$\phi_{0,k}(x) = \tfrac{1}{\sqrt{2}}\,\phi_{-1,k}(x) + \tfrac{1}{\sqrt{2}}\,\psi_{-1,k}(x) \in V_{-1} + W_{-1}. \tag{3.26}$$
So $V_0 \subseteq V_{-1} + W_{-1}$. Also, by the above relation, $V_{-1} \subseteq V_0$ and $W_{-1} \subseteq V_0$, which
means $V_{-1} + W_{-1} \subseteq V_0$.
Hence
$$V_0 = V_{-1} + W_{-1} \tag{3.27}$$
As it is known that $V_{-1} \perp W_{-1}$, the above sum is a direct sum:
$$V_0 = V_{-1} \oplus W_{-1}. \tag{3.28}$$
$$V_N = V_{N-1} \oplus W_{N-1}, \tag{3.29}$$
$$V_N = \bigoplus_{j=-\infty}^{N-1} W_j . \tag{3.31}$$
Letting $N \to +\infty$, we have
$$L^{2}(\mathbb{R}) = \bigoplus_{j \in \mathbb{Z}} W_j . \tag{3.32}$$
$$\bar{X} = \frac{1}{N} \sum_{i=0}^{N-1} X_i \tag{3.33}$$
The reconstructed image is obtained by applying mean values for all non-
overlapping windows.
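How the non-overlapping window averaging of Eq. (3.33) might operate on a sub-band of coefficients is sketched below (Python/NumPy). This is only an interpretation of the Average Window Coding step, with a 2×2 window size assumed purely for illustration.

```python
import numpy as np

def window_average(coeffs, win=2):
    """Replace every non-overlapping win x win window by its mean value (Eq. 3.33)."""
    h, w = coeffs.shape
    blocks = coeffs[: h - h % win, : w - w % win].reshape(h // win, win, w // win, win)
    return blocks.mean(axis=(1, 3))

def window_expand(means, win=2):
    """Reconstruction: repeat each stored mean over its win x win window."""
    return np.repeat(np.repeat(means, win, axis=0), win, axis=1)

if __name__ == "__main__":
    coeffs = np.arange(16, dtype=float).reshape(4, 4)   # stand-in wavelet sub-band
    stored = window_average(coeffs)                     # 4x fewer values to store
    print(stored)
    print(window_expand(stored))                        # approximate reconstruction
```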
(Figure: compression scheme — Input Image → Wavelet Transform Coding → Transformed Image → Window Average Coding → Storage.)
(Figure: decompression scheme — Storage → Inverse Window Average Coding → Wavelet Code Expansion → Inverse Wavelet Transform → Reconstructed Image.)
Step 7: Reconstruction takes place by applying mean values for all non-
overlapping windows.
Step 8: The wavelet coefficients are passed through inverse window average coding.
Step 9: The reconstructed image of size N×N is obtained.
Figure 3.8 - Original (Input) and reconstructed images obtained using various
Image Compression algorithms on “Lena” image
Table 3.2 – Performance Measures obtained for “Lena” image using different
algorithms
Performance Measures

Algorithm                               MCA      E.R.     SNR      C.R.
Transform Coding using Wavelet          75       0.0195   71.453   4
Spatial Coding using Window Average     75       0.0195   71.453   4
Mixed Approach [Transform + Spatial]    93.75    0.0338   66.775   16

(MCA = Maximum Compression Achieved)
(Figure: graphical representation of C.R., SNR and E.R. for the Transform, Spatial and Mixed approaches on the "Lena" image.)
For textured image "Calfskin", 8 bits per pixel image is given as the input,
and the reconstructed image obtained using different Algorithms are shown in
Figure 3.10 and the Performance Measures evaluated are listed in Table 6.2.
Graphical representation of Compression Ratio (CR), Signal to Noise Ratio
(SNR), and Error Rate (ER) for Transform, Spatial, and Mixed Approaches are
shown in Figure 3.11.
Figure 3.10 - Original (Input) and reconstructed images obtained using various
Image Compression algorithms on "Calfskin" image
Performance Measures

Algorithm                               MCA      E.R.     SNR      C.R.
Transform Coding using Wavelet          75       0.0474   67.028   4
Spatial Coding using Window Average     75       0.0474   67.028   4
Figure 3.11 Graphical Representation of C.R. and SNR for textured image
3.6. SUMMARY
image and also offers a high value of MCA. When tested with different sizes of
images, this algorithm gives the same type of result, showing its robustness.
CHAPTER 4
4.1 OBJECTIVES
4.2 INTRODUCTION
As the medical field is advancing, medical images are today stored in many
places and are used for the future diagnosis of patients. Thus, a large volume of
images is generated and reused today. As the production of medical images on a
huge scale has become essential, it is also essential to undertake the process of
compression before storing or transmitting medical images through the internet. By
compression, the transmission time will be reduced. Thus, the compression has an
important role for efficient storage and transmission. Among different
compression methods practiced now, the wavelet compression technique is more
used in modern medical image compression. The wavelet technique is becoming
more popular because of exceptional image quality at a high compression rate.
But, since 3D images have also been introduced in medical imaging, 3D wavelet
encoders are used for the compression instead of the standard 2D wavelet.
Transformation: The discrete cosine transform cuts the image into blocks
of 64 pixels (8×8) and processes each block independently, shifting and
simplifying the colours so that there is less information to encode.
(Figure: general image compression model — the input image f(x,y) passes through the encoder; the decoder, consisting of a symbol decoder and an inverse mapper, produces the reconstructed image f(x,y).)
JPEG 2000 is the new standard for wavelet compression issued by the
JPEG Committee. It arose out of the need to harmonize the wavelet compression
algorithm. JPEG 2000 uses a multilevel DWT with octave-scaled decompositions.
JPEG 2000 is the new ISO standard for image compression commonly used to
compress medical images. Version 1 of the standard provides the core coding
system, specifying both lossy and lossless compression. Version 2 provides
extensions to the standard for use in a variety of applications. For 3-dimensional data
sets, there are Version 2 extensions that allow the use of several types of
decorrelating transformations in the third dimension. Specifically, wavelet
transforms, linear transforms, and dependency transforms are all classified under
Part 2. These multi-component transformations in Version 2 of JPEG 2000 can be
effective in compressing volumetric datasets because the correlation between
adjacent images can be exploited to achieve better compression than other existing
methodology.
In transform coding, the codes are transferred based on cosine bases and wavelet
bases, but in fractal coding the codes are transferred using affine transformations.
Also, the redundancy (unwanted information) is removed by the compression
technique. Redundancy indicates duplication, while irrelevancy refers to the part of
the image information that is not noticed by the Human Visual System (HVS).
In the medical image compression, lossless compression is preferred, whereas, for
multimedia applications, lossy compression can also be used. The self-similarity
of the images is used in the fractal image compression (FIC) technique.
decoded at any scale. Monochrome images can be encoded because color images
are typically extensions of the grayscale representations of an image.
terabytes of space. Using wavelets, the FBI obtained a compression ratio of about
20:1. It can be noted that the storage of a fingerprint with a grayscale of 8 tones
and with a resolution of 500 dots per inch occupies 10 megabytes.
The compression ratio of the lossy compression is 50:1 or more. But the
original images cannot be completely recovered. Hence, by using the lossless
compression, original images can be completely recovered. The compression ratio
of lossless compression is around 2:1. In the medical industry, compression
without loss is normal. This is because the retrieval of the whole original signal is
essential for the diagnosis process.
Figure 4.2 shows the block diagram of the proposed system. The main parts
of the block diagram are the input image, the compression module, the compressed
image, the inverse transformation, and the reconstructed image. Tile matching and
rearrangement turn the Haar model into the Tetrolet transformation. The input image
is divided into 4×4 blocks, and for each block a tetromino partition is assigned that
is adapted to the image geometry in that block.
(Figure 4.2: block diagram of the proposed system — pre-processing, identification of the matching tile, tetrolet transformation, encoding, and the compressed image.)
where CDsum is the current detail. The new tiles are selected using a succeeding
tile. By using the Haar transform, new details are obtained and stored in
NDsum. After evaluating all the tiles, the process is terminated; after that, the
matching tiles are selected.
The tetrolet transform, proposed by Jens Krommweh, is based on a Haar wavelet
transform over four-square (tetromino) tilings. According to the local geometric
features of the image, it adaptively selects the tetromino tiling of each square area
that gives the sparsest representation. Compared with traditional multi-scale
transforms such as the wavelet, curvelet and contourlet, it can obtain better
reconstructed image quality using the same number of transform coefficients.
considered that the total block size is N = 4, and the matching tetromino tiles of the
sub-images are stored in the level-1 Haar approximation. For all the image
transformation groups (approximation, vertical, horizontal, and diagonal coefficients),
the approximation matrix is set as the base image and the process is repeated.
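A highly simplified sketch of the adaptive tiling choice is given below (Python/NumPy, illustrative only): for each candidate tiling of a 4×4 block, Haar-type coefficients are computed per tetromino and the tiling with the sparsest detail part (smallest l1 norm) is kept. Only two of the 117 possible tilings are included, and the 2-D Haar coefficient matrix used here is an assumption about the per-tile transform, not the thesis's exact formulation.

```python
import numpy as np

# 2-D Haar-type coefficient matrix: the first row gives the low-pass value,
# the remaining rows give three detail coefficients for a 4-pixel group.
W = 0.5 * np.array([[1,  1,  1,  1],
                    [1,  1, -1, -1],
                    [1, -1,  1, -1],
                    [1, -1, -1,  1]], dtype=float)

# two (of the 117 possible) tilings of a 4x4 block into four tetrominoes,
# written as lists of (row, col) index sets
TILINGS = [
    # square tetrominoes (this is the ordinary block-wise Haar case)
    [[(0, 0), (0, 1), (1, 0), (1, 1)], [(0, 2), (0, 3), (1, 2), (1, 3)],
     [(2, 0), (2, 1), (3, 0), (3, 1)], [(2, 2), (2, 3), (3, 2), (3, 3)]],
    # horizontal I-tetrominoes (whole rows)
    [[(0, 0), (0, 1), (0, 2), (0, 3)], [(1, 0), (1, 1), (1, 2), (1, 3)],
     [(2, 0), (2, 1), (2, 2), (2, 3)], [(3, 0), (3, 1), (3, 2), (3, 3)]],
]

def tetrolet_step(block):
    """Pick the tiling whose detail coefficients are sparsest (smallest l1 norm)."""
    best = None
    for tiling in TILINGS:
        lows, details = [], []
        for tile in tiling:
            values = np.array([block[r, c] for r, c in tile])
            coeffs = W @ values
            lows.append(coeffs[0])
            details.append(coeffs[1:])
        cost = np.abs(np.concatenate(details)).sum()
        if best is None or cost < best[0]:
            best = (cost, np.array(lows), np.array(details))
    return best

if __name__ == "__main__":
    # alternating bright/dark rows: square tiles leave large details,
    # row-wise tiles describe the block with zero detail coefficients
    block = np.array([[10.0] * 4, [200.0] * 4, [10.0] * 4, [200.0] * 4])
    cost, lows, details = tetrolet_step(block)
    print("sparsest tiling cost:", cost)   # 0.0: the row-wise tiling wins
```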
Wavelet transformation
Wavelet families
Haar Wavelet
$$\Psi(t) = \begin{cases} 1, & 0 < t < \tfrac{1}{2} \\ -1, & \tfrac{1}{2} < t < 1 \\ 0, & \text{otherwise} \end{cases} \tag{4.2}$$
Daubechies wavelet
Morlet wavelet
$$\Psi(t) = \exp(j w_0 t)\, \exp\!\left(\frac{-t^{2}}{2}\right) \tag{4.3}$$
$$\Psi(w) = \sqrt{\frac{\pi}{2}}\left( \exp\!\left(\frac{-(w - w_0)^{2}}{2}\right) + \exp\!\left(\frac{-(w + w_0)^{2}}{2}\right) \right) \tag{4.4}$$
$$\Psi(t) = (1 - t^{2})\, \exp\!\left(\frac{-t^{2}}{2}\right) \tag{4.5}$$
$$\Psi(w) = -w^{2} \exp\!\left(\frac{-w^{2}}{2}\right) \tag{4.6}$$
Shannon wavelet
This has poor time resolution, whereas the frequency localization is excellent.
Thus, a new technique based on the Haar wavelet transform, called the Tetrolet
transform, is presented in this chapter. This technique is used to provide an efficient
image representation.
The basic notations and concepts of the Tetrolet transform are described here.
Consider two-dimensional square data sets; the index set of a digital image is
$$I \subset \mathbb{Z}^{2} \tag{4.8}$$
An index that lies at the boundary of the image has 3 neighbors, and an index at a
vertex of the image has 2 neighbors.
In this chapter, disjoint partitions E of the index set I that satisfy two conditions
are considered.
These subsets Iν are called tetrominoes, since the tiling problem of the square
[0, N)² with tetrominoes is a well-known problem that is closely related to partitions
of the index set I. For a simple one-dimensional ordering of the four components in
one tetromino set Iν, a bijective mapping J is applied that orders the values
J(i1, j1), ..., J(i4, j4) by size and maps them so that the smallest index is identified
with zero.
The tetrominoes are shaped from 4 unit squares that are connected by edges, not
merely at their corners. Disregarding rotations and reflections, there are five
different shapes, referred to as free tetrominoes, as shown in Figure 4.4. By
considering the isometries, it is clear that every square [0, N)² can be covered by
tetrominoes if and only if N is even. Larsson showed that there are 117 solutions
for a disjoint covering of a 4 × 4 board with four tetrominoes. For an 8 × 8 board,
one obtains 117⁴ > 10⁸ as a rough estimate of the number of potential tilings.
Thus, to keep the number of solutions manageable, it is reasonable to restrict to an
image partition into 4 × 4 squares. As pictured in Figure 4.5, there are 22 basic
solutions within the 4×4 board (disregarding rotations and reflections).
$$k = 2A + H - D \tag{4.13}$$
Based on the above formula, p and q are unambiguously defined for every k > 0,
with the condition that 2^p is the largest power of two contained in k.
In general,
$$\text{Haar Transform} = \begin{bmatrix} A & V \\ H & D \end{bmatrix} \tag{4.14}$$
$$= \begin{bmatrix} \begin{bmatrix} A & V \\ H & D \end{bmatrix} & V \\ H & D \end{bmatrix} \tag{4.15}$$
For Level 3 (N=8), the approximation matrix contains 15 coefficients in addition
to the approximation pass:
$$\text{Haar Transform} = \begin{bmatrix} \begin{bmatrix} \begin{bmatrix} A & V \\ H & D \end{bmatrix} & V \\ H & D \end{bmatrix} & V \\ H & D \end{bmatrix} \tag{4.16}$$
Noise Addition and Thresholding Methods: normally, in medical and test images,
random, Gaussian, and Rician noise are applied. In order to get an efficient image
representation, a new adaptive Haar wavelet transform, called the Tetrolet Transform,
is implemented. Tetrolets are Haar-type wavelets whose supports are tetrominoes,
which are shapes made by connecting four equal-sized squares.
Numerical results show the strong efficiency of the tetrolet transform for image
approximation.
where sigma (σ) indicates the noise ratio, the factor used to indicate the amount
of noise. The random noise (Nr) is applied using the Gaussian equation:
$$P(x) = \frac{1}{\sigma \sqrt{2\pi}}\; e^{\frac{-(x-\mu)^{2}}{2\sigma^{2}}} \tag{4.18}$$
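A minimal sketch of applying zero-mean Gaussian noise of the form of Eq. (4.18) to a test image (Python/NumPy, illustrative only; the function and parameter names are assumptions).

```python
import numpy as np

def add_gaussian_noise(image, sigma=10.0, mu=0.0, seed=0):
    """Add Gaussian noise; sigma controls the amount of noise (Eq. 4.18)."""
    rng = np.random.default_rng(seed)
    noisy = image.astype(float) + rng.normal(mu, sigma, size=image.shape)
    return np.clip(noisy, 0, 255)           # keep the result in the valid pixel range

if __name__ == "__main__":
    clean = np.full((4, 4), 128.0)           # toy constant test image
    print(add_gaussian_noise(clean, sigma=10.0))
```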
Peak-signal-to-noise-ratio (PSNR)
The PSNR is computed from the mean square error (MSE) of the reconstructed
image. The MSE is defined from the difference between the reconstructed image and
the original image; the expressions are given by:
$$\mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left[\, y(i,j) - x(i,j) \,\right]^{2} \tag{4.19}$$
$$\mathrm{PSNR} = 10 \log \frac{L^{2}}{\mathrm{MSE}} \tag{4.20}$$
where L is the peak pixel value and M, N are the image dimensions.
$$\mathrm{CR} = \frac{\text{size of the original image}}{\text{size of the compressed image}} \tag{4.21}$$
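The three quality measures can be sketched directly from Eqs. (4.19)-(4.21); the Python functions below are illustrative (the experiments in this work use MATLAB), and the logarithm in the PSNR is taken to base 10, as is standard.

```python
import numpy as np

def mse(original, reconstructed):
    """Mean square error between the reconstructed and original images (Eq. 4.19)."""
    diff = reconstructed.astype(float) - original.astype(float)
    return np.mean(diff ** 2)

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB for a given peak pixel value L (Eq. 4.20)."""
    e = mse(original, reconstructed)
    return float("inf") if e == 0 else 10.0 * np.log10(peak ** 2 / e)

def compression_ratio(original_bytes, compressed_bytes):
    """Size of the original image divided by the size of the compressed image (Eq. 4.21)."""
    return original_bytes / compressed_bytes

if __name__ == "__main__":
    x = np.arange(64, dtype=float).reshape(8, 8)
    y = x + 2.0                               # a reconstruction with small error
    print(mse(x, y), psnr(x, y))              # MSE = 4.0, PSNR ~ 42.1 dB
    print(compression_ratio(65536, 8192))     # CR = 8.0
```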
Entropy
This is used to check the similarity of the digital image to the original image. The
entropy can be calculated using MATLAB. The statistical measure of randomness
is termed entropy; the expression is given by:
(Figure 4.6: flow chart of the tetrolet-transform-based compression — start; ...; change the low-pass and high-pass coefficients of each block into a 2×2 block; ...; stop.)
Figure 4.6 shows the flow chart of the tetrolet transform-based image
compression. The basic operation of the proposed method is the conversion of the
input into the sparsest tetrolet representation. After that, the low-pass and the
high-pass coefficients are changed. Finally, the coefficients are stored and aligned.
Figure 4.7 shows the input images and the compressed images for three test images.
The optimum decomposition level is evaluated using the tetrolet transform; the
optimal scale number is taken as four for the MRI, CT, and normal images.
Table 4.2 shows the simulation result of the compression ratio, and Table 4.3 shows
the simulation result of the encoding time.
Table 4.2 shows the simulation result of the compression ratio. The analysis
is performed on three different images. From the table, it is clear that for all
three images (MRI, CT, and the normal image) the proposed tetrolet transform has
a low compression ratio. If the compression ratio is high, the quality of the images
will be low; similarly, the quality of the image will be high if the compression
ratio is low. Thus, the proposed tetrolet transform is capable of generating the
highest quality compressed image as compared to the existing FQT. Figure 4.8 and
Table 4.3 show the statistical analyses of the compression ratio. From the graph, it
can be seen that the proposed tetrolet transform has a low compression ratio.
From the analysis of the compression ratio for the MRI, CT, and normal images, it is
clear that the proposed tetrolet transform yields different values for each; using
the tetrolet transform, the normal image can be compressed with high quality.
Table 4.3 gives an analysis of the encoding time parameter. The encoding
time is the time taken for the compression of the image. Table 4.4 shows three
different techniques used for compression. From the table, it is clear that the
proposed tetrolet transform yields a lower encoding time, which shows that images
can be compressed faster using the tetrolet transform. Thus, the proposed tetrolet
transform has a faster response compared to the existing FQT technique.
Figure 4.9 shows the statistical analysis of the encoding time. The graph also shows
that, using the tetrolet transform, the CT image can be encoded faster than the
other image sets.
4.6 SUMMARY
With the proposed tetrolet transform-based method, efficient image compression
can be achieved. The simulation was carried out in MATLAB. From the analysis,
it is clear that the proposed tetrolet transform performs better than the other
existing techniques. This compression technique can be used for image
denoising as well.
CHAPTER 5
5.1 OBJECTIVES
With the development of CT, MRI, EBCT, SMRI, etc., the scanning rate
and resolving power of imaging equipment have improved significantly. Using
compression techniques, medical images can be processed to a profound degree
by de-noising, enhancement, edge extraction, etc., to make good use of the image
information and to improve diagnosis. Since medical images are in digital format,
more time-efficient and cost-effective image compression technologies are to be
developed to reduce the mass volume of image data. This chapter proposes the use
of orthogonal moment transform for fast and higher compression rates. This
method incorporates a simplified mathematical approach using a sub-block
reconstruction scheme that eliminates numerical instabilities at higher moment
orders. Hence, Orthogonal Moment performs better for both real digital images
and graphically generated images.
5.2 INTRODUCTION
Modern medical imaging tools have a significant impact on the diagnosis of
diseases and preparation for surgery. As medical images have moved to
digital formats such as DICOM, optimal settings for image compression are
needed to meet long-term mass storage requirements. Also, with the increased
use of medical imaging in clinical practice and the growing dimensions of data
set, the management of digital medical image data sets requires high compression
rates. The medical enterprise depends on a system that makes diagnostic
images available for radiologic interpretation, that transmits images to physicians
throughout the system, and efficiently stores images pending retrieval for future
medical or legal purposes. Computerized medical imaging generates large, data-
rich electronic files. To speed up electronic transfer and minimize computer
storage space, medical images often undergo compression into smaller digital
files. The level of diagnostic detail needed for clinical interpretation of medical
images varies according to modality. In general, nuclear medicine scans require
less detail than Computed Tomography (CT) or Magnetic Resonance (MR). As
interpretation of mammography and radiography depends on high spatial
resolution, these images demand more detail than CT or MR, which, in turn, need
high contrast resolution for diagnostic analysis. The need for compressing medical
images thus arises from the storage and transmission requirements discussed above.
The Tchebichef moment transform proposed in this chapter has a discrete domain of
definition that matches the image coordinate space exactly, and the absence of
numerical approximation terms allows a more accurate representation of image
features than is possible using conventional transforms.
5.3 PROBLEM DEFINITION
5.4.1 Orthogonal moments
Definition
M_{pq}^{(f)} = \iint_{D} P_{pq}(x, y)\, f(x, y)\, dx\, dy  (5.1)

where p_{00}(x, y), p_{10}(x, y), \ldots, p_{kj}(x, y), \ldots are polynomial basis functions defined on D. This set of Cartesian moments follows the orthogonality condition

\int y_m(x)\, y_n(x)\, dx = 0, \quad m \neq n  (5.2)
5.4.1.1 TCHEBICHEF MOMENTS
T_{mn} = \frac{1}{\rho(m, s)\, \rho(n, s)} \sum_{i=0}^{s-1} \sum_{j=0}^{s-1} t_m(i)\, t_n(j)\, f(i, j)  (5.3)

with t_0(x) = 1, \quad t_1(x) = \frac{2x + 1 - s}{s}, and

t_n(x) = \frac{(2n - 1)\, t_1(x)\, t_{n-1}(x) - (n - 1)\left(1 - \frac{(n-1)^2}{s^2}\right) t_{n-2}(x)}{n}  (5.4)

The scale factor is

\beta(n, S) = S^n  (5.5)

and the squared norm is

\rho(n, S) = \sum_{i=0}^{S-1} \left[ t_n(i) \right]^2  (5.6)
\rho(n, S) = \frac{S \left(1 - \frac{1^2}{S^2}\right)\left(1 - \frac{2^2}{S^2}\right)\left(1 - \frac{3^2}{S^2}\right) \cdots \left(1 - \frac{n^2}{S^2}\right)}{2n + 1}  (5.7)

t_0(x) = 1, \quad t_1(x) = \frac{2x + 1 - S}{S}  (5.8)

t_n(x) = \frac{(2n - 1)\, t_1(x)\, t_{n-1}(x) - (n - 1)\left(1 - \frac{(n-1)^2}{S^2}\right) t_{n-2}(x)}{n}  (5.9)
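The recurrence (5.8)–(5.9) can be checked numerically. The sketch below evaluates the scaled polynomials for a block size S and verifies the discrete orthogonality corresponding to (5.2); it assumes the standard normalization of the recurrence by n and is for illustration only:

import numpy as np

def tcheb_poly(order, S):
    """Scaled Tchebichef polynomials t_0 ... t_order evaluated at x = 0..S-1,
    using the three-term recurrence of (5.8)-(5.9)."""
    x = np.arange(S, dtype=float)
    t = np.zeros((order + 1, S))
    t[0] = 1.0
    if order >= 1:
        t[1] = (2.0 * x + 1.0 - S) / S
    for n in range(2, order + 1):
        t[n] = ((2 * n - 1) * t[1] * t[n - 1]
                - (n - 1) * (1.0 - ((n - 1) ** 2) / S ** 2) * t[n - 2]) / n
    return t

if __name__ == "__main__":
    S = 8
    t = tcheb_poly(7, S)
    gram = t @ t.T                     # discrete analogue of (5.2): off-diagonals ~ 0
    print(np.round(gram, 6))           # diagonal entries are the squared norms rho(n, S)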
\begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix} =
\begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.1687 & -0.3313 & 0.5 \\ 0.5 & -0.4187 & -0.0813 \end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix} +
\begin{bmatrix} 0 \\ 128 \\ 128 \end{bmatrix}  (5.10)

\begin{bmatrix} R \\ G \\ B \end{bmatrix} =
\begin{bmatrix} 1 & 0 & 1.4021 \\ 1 & -0.34414 & -0.71414 \\ 1 & 1.7718 & 0 \end{bmatrix}
\left( \begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix} -
\begin{bmatrix} 0 \\ 128 \\ 128 \end{bmatrix} \right)  (5.11)
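A minimal sketch of the conversion in (5.10)–(5.11), applied to a whole RGB image with NumPy, is shown below; it is an illustration only, and the small round-trip error comes from the rounded matrix entries:

import numpy as np

# Forward matrix of (5.10) and inverse matrix of (5.11) (JPEG/JFIF convention).
RGB2YCBCR = np.array([[ 0.299,    0.587,    0.114 ],
                      [-0.1687,  -0.3313,   0.5   ],
                      [ 0.5,     -0.4187,  -0.0813]])
OFFSET = np.array([0.0, 128.0, 128.0])
YCBCR2RGB = np.array([[1.0,  0.0,      1.4021 ],
                      [1.0, -0.34414, -0.71414],
                      [1.0,  1.7718,   0.0    ]])

def rgb_to_ycbcr(rgb):
    """Apply (5.10) to an H x W x 3 RGB image."""
    return rgb.astype(float) @ RGB2YCBCR.T + OFFSET

def ycbcr_to_rgb(ycbcr):
    """Apply (5.11); the 128 offsets on Cb and Cr are removed first."""
    return (ycbcr.astype(float) - OFFSET) @ YCBCR2RGB.T

if __name__ == "__main__":
    rgb = np.random.randint(0, 256, size=(4, 4, 3)).astype(float)
    back = ycbcr_to_rgb(rgb_to_ycbcr(rgb))
    print(np.abs(back - rgb).max())    # small round-trip error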
The JPEG committee needed a standard color space for compression, and YCbCr
was adopted as the default color space for JPEG. The choice of color space has a
considerable effect on the compressibility of color images; the XYZ and YCbCr
color spaces are better suited than the RGB color space for image compression.
YCbCr is a color space used as part of the color image pipeline in video and
digital photography systems. YCbCr is not an absolute color space. It is a way of
encoding RGB information. The actual color displayed depends on the actual RGB
primaries used to display the signal. It represents color in terms of one luminance
component (Y) and two chrominance components (Cb and Cr), where Cb is the
chrominance-blue component, and Cr is the chrominance-red component.
The YCbCr image can be converted to/from the RGB image. To convert
from RGB to YCbCr, one variant of this color space (according to ITU-R BT.709)
is considered as following:
Y = (0.2126 ×red)+(0.7152×green)+(0.0722×blue)
Cb = 0.5389×(blue-Y)
Cr = 0.6350×(red-Y)
in better performance.
3. YCbCr is broadly utilized in video compression standards such as MPEG
and JPEG.
4. Y can be stored with high resolution or transmitted at high bandwidth, and
two chrominance components (Cb and Cr) that can be bandwidth-reduced,
sub-sampled, compressed or otherwise treated separately for improved
system efficiency.
According to the Human Visual System (HVS), human eyes are more
sensitive to the luminance than the chrominance. Therefore, the accuracy of the
chrominance is reduced to achieve data compaction through sub-sampling, and the
human eyes do not easily perceive the difference. There are four frequently used
YCbCr sub-sampling formats, and all of them are shown in Figure 5.2. It is worth
noting that the name of the format is not always related to the sub-sampling ratio.
The image matrices are partitioned into 2×2 pixel blocks, and the orthogonal
Tchebichef moments are calculated independently for each block. The block size N
is taken as 2. Based on the orthogonal moments, the kernel matrix (K, 2×2) is given as:

K = \begin{bmatrix} t(0) & t(1) \\ t(2) & t(3) \end{bmatrix}  (5.12)

The image block matrix (F, 2×2), with f(x, y) denoting the intensity value of a pixel, is:

F = \begin{bmatrix} f(0,0) & f(0,1) \\ f(1,0) & f(1,1) \end{bmatrix}  (5.13)
There are 64 two-dimensional TMT basis functions, generated by multiplying a
horizontal set with a vertical set of one-dimensional 8-point TMT basis functions.
In Figure 5.3, neutral gray represents zero, white represents positive amplitudes,
and black represents negative amplitudes. An image contains low-, medium-, and
high-frequency components. The low-frequency content corresponds to slowly
varying color, whereas the high frequencies represent the detail within the image.
Intuitively, the low frequencies are more important for creating a good
representation of an image, and the higher frequencies can largely be ignored to a
certain degree. The human eye is more sensitive to low-frequency distortions than
to high-frequency ones. The process of image
reconstruction from its moments is provided as follows:
\hat{f}(x, y) = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} k_m(x)\, T_{mn}\, k_n(y)  (5.14)
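The forward moment computation and the reconstruction (5.14) can be written compactly with an orthonormal kernel k_n(x) = t_n(x)/√ρ(n, S). The sketch below is an illustrative NumPy version under that normalization assumption, not the thesis implementation:

import numpy as np

def tcheb_kernel(S):
    """Orthonormal Tchebichef kernel K with K[n, x] = k_n(x) = t_n(x)/sqrt(rho(n, S))."""
    x = np.arange(S, dtype=float)
    t = np.zeros((S, S))
    t[0] = 1.0
    t[1] = (2.0 * x + 1.0 - S) / S
    for n in range(2, S):
        t[n] = ((2 * n - 1) * t[1] * t[n - 1]
                - (n - 1) * (1.0 - ((n - 1) ** 2) / S ** 2) * t[n - 2]) / n
    rho = (t ** 2).sum(axis=1)                # squared norms, cf. (5.6)/(5.7)
    return t / np.sqrt(rho)[:, None]

def forward_moments(block):
    """Moment matrix T with T[m, n] = sum_x sum_y k_m(x) k_n(y) f(x, y)."""
    K = tcheb_kernel(block.shape[0])
    return K @ block.astype(float) @ K.T

def reconstruct(T):
    """Equation (5.14): f(x, y) = sum_m sum_n k_m(x) T[m, n] k_n(y)."""
    K = tcheb_kernel(T.shape[0])
    return K.T @ T @ K

if __name__ == "__main__":
    block = np.random.randint(0, 256, size=(4, 4)).astype(float)
    T = forward_moments(block)
    print(np.abs(reconstruct(T) - block).max())   # ~0: exact for full-order moments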
5.3.5 QUANTIZATION
T_q(u, v) = \mathrm{Round}\!\left(\frac{T(u, v)}{Q(u, v)}\right), \qquad Q(x) = \left\lfloor \frac{x}{\Delta} + 0.5 \right\rfloor  (5.15)

where \Delta is the quantization step. If the quantization step size is 1, the quantizer
output function is as shown in Figure 5.4; the number of elements in the output
domain is finite. On the other hand, it is easy to see that the larger the quantization
step size, the larger the loss of precision.
[Figure 5.4: Staircase quantizer output function Q(x), plotted for inputs between −3.5 and 3.5 and outputs between −4.0 and 4.0]
For each layer, the moment coefficients are quantized separately. The
quantization process plays a key role in JPEG compression: it removes the high
frequencies present in the original image, because the eye is much more sensitive
to lower spatial frequencies than to higher ones. This is achieved by dividing the
values at high indexes in the coefficient vector (the amplitudes of the higher
frequencies) by larger values than those used for the amplitudes of the lower
frequencies. The standard JPEG luminance and chrominance quantization tables
Q_L and Q_R, respectively, are given below:
Q_L = \begin{bmatrix} 16 & 11 & 10 & 16 \\ 12 & 12 & 14 & 19 \\ 14 & 13 & 16 & 24 \\ 14 & 17 & 22 & 29 \end{bmatrix}, \qquad
Q_R = \begin{bmatrix} 17 & 18 & 24 & 47 \\ 18 & 21 & 26 & 66 \\ 24 & 26 & 56 & 99 \\ 47 & 66 & 99 & 99 \end{bmatrix}  (5.16)
The two-dimensional DCT performed on the 4×4 sub-blocks of the image data
generates 4-bit gains, so the quantization table for luminance starts with 2^4 = 16.
The Tchebichef moment, however, generates only 2-bit gains, so its quantization
tables should start with 2^2 = 4. The proposed luminance and chrominance
quantization tables Q_{ML} and Q_{MR} below are used for Tchebichef moment
compression. These tables can be generated mathematically in a simpler manner.
Q_{ML} = \begin{bmatrix} 4 & 6 & 8 & 16 \\ 6 & 8 & 16 & 32 \\ 8 & 16 & 32 & 64 \\ 16 & 32 & 64 & 128 \end{bmatrix}, \qquad
Q_{MR} = \begin{bmatrix} 4 & 8 & 16 & 32 \\ 8 & 16 & 32 & 64 \\ 16 & 32 & 64 & 128 \\ 32 & 64 & 128 & 256 \end{bmatrix}  (5.17)
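The following sketch applies (5.15)-style quantization with the proposed luminance table Q_ML of (5.17) to a block of moment coefficients; the sample coefficient values are illustrative only, not taken from the thesis:

import numpy as np

# Proposed 4x4 luminance quantization table Q_ML from (5.17).
Q_ML = np.array([[ 4,  6,  8,  16],
                 [ 6,  8, 16,  32],
                 [ 8, 16, 32,  64],
                 [16, 32, 64, 128]], dtype=float)

def quantize(T, Q=Q_ML):
    """Tq(u, v) = Round(T(u, v) / Q(u, v)), cf. (5.15)."""
    return np.rint(T / Q).astype(int)

def dequantize(Tq, Q=Q_ML):
    """Approximate the coefficients back: T(u, v) ~= Tq(u, v) * Q(u, v)."""
    return Tq.astype(float) * Q

if __name__ == "__main__":
    T = np.array([[420.0, -30.5, 12.2, 3.1],
                  [-25.4,  10.0, -4.7, 1.2],
                  [  9.8,  -3.3,  2.1, 0.4],
                  [  2.6,   1.1,  0.3, 0.1]])
    Tq = quantize(T)
    print(Tq)
    print(dequantize(Tq))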
Eighty widely used test images were selected for the basic image compression
experiments. These images are categorized into 40 real images and 40 graphical
images. All the images are raw RGB 3-layer images of size 512×512 pixels.
[Figure 5.7 content: the raster-order input indices 0, 1, 2, ..., 63 of an 8×8 block are mapped to the zig-zag positions 0, 1, 5, 6, 14, 15, 27, 28, 2, 4, 7, 13, 16, 26, 29, 42, ..., 62, 63]
Figure 5.7 Diagram of the Zig-Zag scan with mapping method
The mapping process runs between the input data sequence and the zig-zag
position sequence; the mapping is carried out element by element between these
two vectors (see the mapping arrow in Figure 5.7). All 64 DCT coefficients of each
8×8-pixel block must be available before the scanning process, and reading or
saving the DCT coefficient input could slow the scanning process down. The
architecture of the zig-zag scan is shown in Figure 5.8.
[Figure 5.8: Zig-zag scan architecture: ZIGZAG_IN register, two 2:1 multiplexers, a memory holding the scanning order, and a counter (0–63), with CLOCK and RESET inputs and ZIGZAG_OUT / READY_OUT outputs]
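A software counterpart of this scan, generating the zig-zag visiting order for an N×N block and flattening a coefficient block accordingly, is sketched below for illustration; the hardware of Figure 5.8 realises the same ordering with registers and multiplexers:

import numpy as np

def zigzag_order(n=8):
    """Return the (row, col) visiting order of the zig-zag scan on an n x n block."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[1] if (rc[0] + rc[1]) % 2 == 0 else rc[0]))

def zigzag_scan(block):
    """Flatten a 2-D coefficient block into the 1-D zig-zag sequence."""
    return np.array([block[r, c] for r, c in zigzag_order(block.shape[0])])

if __name__ == "__main__":
    block = np.arange(64).reshape(8, 8)
    print(zigzag_scan(block)[:16])   # starts 0, 1, 8, 16, 9, 2, 3, 10, ...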
The input image is processed row- and column-wise to compress the whole image,
and the output is produced as a compressed image.
ALGORITHM
Step 1: Read the image through the image box control for the compression process.
Step 2: Call the function that sorts the pixels by rows and columns. The function
prioritizes the pixels based on the frequency count of each pixel value in the image.
Step 3: Call the function that creates the initial heap. Re-heap the tree according to
the occurrence count of each node, so that the lowest occurrences sit at the top of
the heap. Create a new node whose left child is the lowest entry in the sorted list
and whose right child is the second lowest.
Step 4: Build the Huffman tree from the prioritized list. Remove those two elements
from the sorted list, since they are now parts of one node, and add their
probabilities; the result is the probability of the new node.
Step 5: Perform an insertion sort on the list with the new node.
Step 7: Traverse the tree to generate the code table. This determines the code for
each element of the tree in the following way: the code for each symbol is obtained
by tracing the path from the root of the tree to that symbol, assigning 1 to a branch
in one direction and 0 to a branch in the other direction. For example, a symbol
reached by branching towards 1 twice and then towards 0 once is represented by
the pattern '110'. Figure 5.10 depicts the codes for the nodes of a sample tree.
[Figure 5.10: sample tree node codes (0), (1), (10), (11), (110), (111)]
Step 8: Once the Huffman tree is built, Huffman codes that require minimum
information to rebuild may be generated as follows: initialize the current code as
all zeros and assign code values to the symbols from the longest to the shortest code.
Step 9: Encode the pixels. Once the Huffman code has been generated, the data are
encoded simply by replacing each symbol with its code.
Step 11: For decoding, generate a Huffman tree equivalent to the encoding tree.
Given the Huffman code for some encoded data, decoding can be accomplished by
reading the encoded data one bit at a time.
Step 12: Read the input bit by bit and move left or right down the tree until a leaf
element is reached.
Step 13: Output the character encoded in the leaf, return to the root, and continue
Step 12 until the codes of all corresponding symbols are known.
Figure 5.11 shows a simple Huffman coding example. First, the symbols are
ordered with their corresponding probabilities. Second, the symbols a3 and a5 are
combined to form a new symbol whose probability is the sum of the probabilities
of the combined symbols. A similar procedure is repeated until the final
probability becomes 1.0. After that, codewords are assigned to each symbol by
working backwards, and finally the codeword for each symbol is obtained.
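The build-and-encode procedure of the steps above can be sketched compactly. The following Python version keeps a heap of symbol frequencies, merges the two lowest-frequency nodes, and traces 0/1 paths from the root; it is an illustration under those assumptions, not the thesis code:

import heapq
from collections import Counter

def huffman_codes(symbols):
    """Build a Huffman code table {symbol: bitstring} from an iterable of symbols
    (e.g. the pixel values of an image), following Steps 2-7 above."""
    freq = Counter(symbols)
    # Heap entries: (count, tie-breaker, tree); a tree is a symbol or a (left, right) pair.
    heap = [(count, i, sym) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    if len(heap) == 1:                      # degenerate case: a single distinct symbol
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        c1, _, left = heapq.heappop(heap)   # the two lowest-frequency nodes ...
        c2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (c1 + c2, next_id, (left, right)))   # ... become one node
        next_id += 1
    codes = {}
    def walk(node, prefix):                 # trace the path from the root (Step 7)
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix
    walk(heap[0][2], "")
    return codes

def encode(symbols, codes):
    """Replace every symbol by its codeword (Step 9)."""
    return "".join(codes[s] for s in symbols)

if __name__ == "__main__":
    pixels = [3, 3, 3, 7, 7, 9, 12, 3, 7, 3]
    table = huffman_codes(pixels)
    print(table)
    print(encode(pixels, table))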
E(s) = \frac{1}{3MN} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \sum_{k=0}^{2} \left| g(i, j, k) - f(i, j, k) \right|  (5.18)
Other measurements that represent the reconstruction accuracy are the Mean
Squared Error (MSE), the average of the squared error,

MSE = \frac{1}{3MN} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \sum_{k=0}^{2} \left| g(i, j, k) - f(i, j, k) \right|^2  (5.19)

and the Peak Signal-to-Noise Ratio,

PSNR\,(dB) = 20 \log_{10} \frac{Max_i}{\sqrt{MSE}}  (5.20)

where Max_i is the maximum possible pixel value.
AD is defined as the average difference between the original image and the
reconstructed image, while MD measures the maximum difference between the
original image and the reconstructed image. With I(i, j, k) denoting the difference
between the original and reconstructed images, the formulae are defined as:

AD = \frac{1}{MNR} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \sum_{k=0}^{R-1} \left| I(i, j, k) \right|  (5.21)

MD = \max_{0 \le i < M,\; 0 \le j < N,\; 0 \le k < R} \left| I(i, j, k) \right|  (5.22)
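A small sketch for computing AD and MD of (5.21)–(5.22) on a three-layer image is given below, taking I(i, j, k) as the difference between the original and reconstructed images; it is an illustrative NumPy version:

import numpy as np

def average_difference(original, reconstructed):
    """AD, cf. (5.21): mean absolute difference over all pixels and colour layers."""
    diff = original.astype(float) - reconstructed.astype(float)
    return np.mean(np.abs(diff))

def maximum_difference(original, reconstructed):
    """MD, cf. (5.22): largest absolute difference over all pixels and colour layers."""
    diff = original.astype(float) - reconstructed.astype(float)
    return np.max(np.abs(diff))

if __name__ == "__main__":
    f = np.random.randint(0, 256, size=(8, 8, 3))
    g = np.clip(f + np.random.randint(-3, 4, size=f.shape), 0, 255)
    print(average_difference(f, g), maximum_difference(f, g))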
5.6 SUMMARY
CHAPTER 6
PERFORMANCE ANALYSIS
6.1 INTRODUCTION
Figure 6.1 - Original (Input) and reconstructed images obtained using various
Image Compression algorithms on “Lena” image
Table 6.1 – Performance Measures obtained for “Lena” image using different
algorithms

Algorithm                               MCA (%)   ER       SNR (dB)   CR
Transform Coding using Wavelet          75        0.0195   71.453     4
Spatial Coding using Window Average     75        0.0195   71.453     4
Mixed Approach [Transform + Spatial]    93.75     0.0338   66.775     16
For the textured image “Calfskin”, an 8-bits-per-pixel image is given as the input;
the reconstructed images obtained using the different algorithms are shown in
Figure 6.3, and the Performance Measures evaluated are listed in Table 6.2. A
graphical representation of the Compression Ratio (CR), Signal-to-Noise Ratio
(SNR), and Error Rate (ER) for the Transform, Spatial, and Mixed approaches is
shown in Figure 6.4.
Figure 6.3 - Original (Input) and reconstructed images obtained using various
Image Compression algorithms on “Calfskin” image
Algorithm                               MCA (%)   ER       SNR (dB)   CR
Transform Coding using Wavelet          75        0.0474   67.028     4
[Figure 6.4: Bar chart of C.R, SNR, and E.R for the Transform, Spatial, and Mixed approaches: SNR = 67.02, 67.02, 62.18; C.R = 4, 4, 16; E.R = 0.04, 0.04, 0.08]
Table 6.4 shows the analysis of the compression ratio. The analysis is
performed on three different images. From the table, it is clear that for all three
images (MRI, CT, and the standard image) the proposed tetrolet transform has a
low compression ratio. When the compression ratio is high, the quality of the
images will be low; similarly, the quality of the image will be high if the
compression ratio is low. Thus, the proposed tetrolet transform is capable of
generating the highest quality compressed image as compared to the existing FQT.
Figure 6.5 shows the statistical analysis of the compression ratio. From the graph,
it can be seen that the proposed tetrolet transform has a low compression ratio.
From the analysis of the compression ratio for the MRI, CT, and normal images, it
is clear that the proposed tetrolet transform yields different values for each; using
the tetrolet transform, the normal image can be compressed with high quality.
Table 6.5 gives the result analysis of the encoding time parameter. The encoding
time is the time taken for the compression of the image. Here, three different
techniques are used for compression. From the table, it is clear that the proposed
tetrolet transform exhibits a lower encoding time, which shows that images can be
compressed faster using the tetrolet transform. Thus, the proposed tetrolet
transform shows a faster response when compared to the existing FQT technique.
Figure 6.8 shows the statistical analysis of the encoding time. The graph also shows
that, using the tetrolet transform, the CT image can be encoded faster than the
other image sets.
CHAPTER 7
Efficient image compression techniques are becoming vital in areas like
pattern recognition, image processing, system modeling, data mining, etc.
Compression has become one of the most active areas in the field of computer
science. Image compression is a technique for efficiently coding a digital image so
as to reduce the number of bits required to represent it. The image compression
problem arises when there is limited storage and bandwidth for the transmission
of images, and it remains a challenge for researchers because compression
introduces artifacts and causes blurring of the images. This research discusses
compression techniques and transformations. It is observed that comparing
compression techniques is complicated unless identical data sets and performance
measures are used. After analysis of all the methods, it is found that lossless image
compression techniques preserve image quality better than lossy techniques, while
lossy compression provides a higher compression ratio than lossless compression.
Image compression plays a vital role in saving memory storage space and saving
time while transmitting images over the network; it aims to remove redundant and
repeated pixel information. The first stage of this research discusses the mixed
approach, which is a combination of spatial and transform coding techniques.
Using this approach, textured and untextured images were tested, and the
Compression Ratio (CR), Maximum Compression Achieved (MCA), and Signal-to-
Noise Ratio (SNR) were evaluated.
When the transform coding and spatial coding algorithms were used individually
for the image compression application, the estimated MCA was 75% and the
Compression Ratio was 4:1. But when the techniques were combined in the mixed
approach, the MCA increased to 93.75% with a Compression Ratio of 16:1. Even
though there is a small variation in the ER and SNR values, the increased MCA,
compared with the fall in image quality reflected by the ER and SNR, is an
acceptable compromise. The Mixed Algorithm thus offers a trade-off between
image quality and a high value of MCA. When tested on images of different sizes,
the algorithm gives the same type of result, showing its robustness.
The third stage of the proposed work aims at a faster and higher compression
rate. For this, the Tchebichef Moment Transform (TMT) can be used as an
equivalent to the DCT for image compression and reconstruction applications.
The set of Tchebichef polynomials has the potential to work better for both
real-world imagery and high-end graphics. Two important features of Tchebichef
moments are (i) a discrete domain of definition that matches the image coordinate
space exactly, and (ii) the absence of numerical approximation errors, which
allows better reconstruction. The experimental results also prove that the proposed
algorithm efficiently reduces the time taken to transform images of different sizes.
At the same time, it has lower computational complexity since, unlike JPEG
compression, it does not require any special primitive algorithms. This
improvement makes the implementation more practical and more elementary for
both software and hardware developers.
FUTURE SCOPE
Video data can be viewed as a sequence of data frames, where a data frame is the
set of all pixels that correspond to a single time moment. With the advancements
in compression technology, it is now straightforward and efficient to compress
video files. Thus, the proposed concepts may be applied to video, and the research
work can be extended further in this direction.
LIST OF PUBLICATION
International Journal
1. S. UmaMaheswari and Dr. V. SrinivasaRaghavan, “Lossless Medical Image
Compression Algorithm Using Tetrolet Transformation”, Journal of Ambient
Intelligence and Humanized Computing, DOI:
https://doi.org/10.1007/s12652-020-01792-8 (Accepted).