A
DISSERTATION
ON
IMPROVEMENT AND PERFORMANCE ANALYSIS OF
TEXT EXTRACTION ALGORITHMS

Submitted in partial fulfillment of the requirements for the award of the
Degree of
MASTER OF TECHNOLOGY
in
Electronics Engineering
(Digital Electronics & Systems)

By
Digvijay Pandey
(Roll No. 139404)

Under the Supervision of
Prof. Y.K. Mishra

Department of Electronics Engineering
Kamla Nehru Institute of Technology
Sultanpur-228118 (U.P.)
Affiliated to
DR. A.P.J. ABDUL KALAM TECHNICAL UNIVERSITY, LUCKNOW, INDIA
JULY, 2016
Table of Contents

TOPIC                                                              PAGE NO.

Certificate
Acknowledgement                                                    ii
Abstract                                                           iii
LIST OF ABBREVIATIONS
LIST OF FIGURES
LIST OF TABLES

1. INTRODUCTION
   1.1  Motivation
   1.2  Digital Image
   1.3  Fundamentals of Digital Image Processing
   1.4  Types of Noise
   1.5  Image Filtering Algorithms
        1.5.1  Trimmed Average Filter
        1.5.2  Median Filter
        1.5.3  Adaptive Filter
        1.5.4  Wiener Filter
        1.5.5  Maximum and Minimum Filter                          10
        1.5.6  Average Filter                                      11
   1.6  Local Image Contrast and Gradient                          12
   1.7  Adaptive Image Contrast                                    13
   1.8  Image Binarization Technique                               14
        1.8.1  Global Thresholding                                 14
               1.8.1.1  Otsu's Method                              15
        1.8.2  Local Thresholding                                  15
        1.8.3  Hybrid Thresholding                                 15
   1.9  Canny Edge Detection                                       15
   1.10 Research Aim and Objectives                                16
   1.11 Organization of Dissertation                               17

2. REVIEW OF LITERATURE
   2.1  General                                                    18
   2.2  Summary                                                    28

3. METHODOLOGY
   3.1  Image Processing Toolbox                                   29
   3.2  Research Methodology                                       30
   3.3  Simulation Modeling                                        33
        3.3.1  Construct Model                                     33
        3.3.2  Execute Model                                      33
        3.3.3  Analyze Model                                      33
   3.4  Proposed Algorithm                                         34
        3.4.1  Flow Chart of the Proposed Algorithm                34
        3.4.2  Algorithm for Text Extraction through Binarization  35

4. RESULT AND DISCUSSION
   4.1  General                                                    47
   4.2  Performance Metrics                                        47
        4.2.1  Peak Signal to Noise Ratio                          47
        4.2.2  Negative Rate Metric                                47
        4.2.3  Accuracy                                            48
        4.2.4  Distance Reciprocal Distortion Metric               48
        4.2.5  Precision                                           48
        4.2.6  F-Measure                                           48
        4.2.7  Specificity                                         49
   4.3  Simulation Result and Analysis                             49
        4.3.1  Performance Analysis of Filter for First Input Image  49
               4.3.1.1  Analysis of Calculated Values              55
        4.3.2  Performance Analysis of Second Input Image          58
               4.3.2.1  Analysis of Calculated Values              63
        4.3.3  Performance Analysis of Third Input Image           67
               4.3.3.1  Analysis of Calculated Values              70
        4.3.4  Performance Analysis of Fourth Input Image          74
        4.3.5  Performance Analysis for Fifth Input Image          81
               4.3.5.1  Analysis of Calculated Values              87

5. CONCLUSION
   5.1  Summary                                                    92
   5.2  Conclusion                                                 92
   5.3  Future Scope                                               93

6. REFERENCES                                                      94

Electronics Engineering Department
Kamla Nehru Institute of Technology,
Sultanpur-228118 (U.P.)
Certificate

This is to certify that Mr. Digvijay Pandey (Roll No. 139404) has carried out the
dissertation work presented in the report entitled IMPROVEMENT AND PERFORMANCE
ANALYSIS OF TEXT EXTRACTION ALGORITHMS, submitted in partial fulfillment of the
requirements for the award of the degree of Master of Technology in Electronics
Engineering with specialization in Digital Electronics & Systems in the Department of
Electronics Engineering, Kamla Nehru Institute of Technology, Sultanpur-228118 (U.P.),
under my supervision during the academic session 2015-16.

(Prof. Y.K. Mishra)
(Associate Professor)
Head of Department/Supervisor

Date: July, 2016
Place: Sultanpur (U.P.)

Acknowledgement

I wish to express my heartfelt gratitude to my project guide,
Prof. Yogesh Kumar Mishra, Department of Electronics Engg., K.N.I.T. Sultanpur-228118,
U.P., for his active interest, constructive guidance, and advice during every stage of this
work. His valuable guidance, coupled with an active and timely review of my work, provided
the necessary motivation for me to work on and successfully complete this dissertation.
I am also thankful to Prof. Yogesh Kumar Mishra (HOD), Electronics Engineering
Department, K.N.I.T. Sultanpur-228118, and all the faculty & staff members of the
Electronics Engineering department for their kind support and help.
It is the contribution of many persons that makes a work successful. I wish to
express my gratitude to the individuals who have contributed their ideas, time and energy
to this work.
Words are insufficient to express my gratitude to my family members for their
inspiration, blessings and support.
I thank God, who has supported me at every moment.

Date: July 2016
Place: Sultanpur (U.P.)

(Digvijay Pandey)
Roll No. 139404

Abstract

With the rapid growth of the Internet, the amount of image and video data is increasing
exponentially. The text data present in images and videos is useful for automatic
annotation, indexing and structuring of images. There is a huge increase in online image
and video databases. In such databases, there is a need to fetch, explore and inspect the
images and videos. Text extraction plays a major role in finding vital and valuable
information. Noise is an important factor that influences the quality of an image; it is
mainly produced in the processes of image acquisition and transmission. An image can
be contaminated by noise such as salt and pepper noise, random-valued impulse noise,
speckle noise and Gaussian noise.
For the removal of noise from images, filtering algorithms such as the adaptive filter,
average filter, maximum filter, median filter, minimum filter, trimmed filter and Wiener
filter are used. After removing noise from the input complex image, the text is extracted in
binary form through the proposed algorithm. The proposed method uses the techniques of
local contrast, local gradient, adaptive contrast map and Canny edge detection for the
detection of text strokes, and Otsu thresholding for the calculation of the threshold value.
On the basis of the calculated threshold value, the pixels are classified into background and
foreground. A comparative study of some popular existing filtering methods is done for text
extraction from complex images. The proposed method is simulated in MATLAB to verify
and validate the performance analysis.

Chapter 1
Introduction

1.1 Motivation
With the rapid growth of the Internet, the amount of image and video data is increasing
exponentially. In some image categories (e.g. natural scenes) and video categories (news,
documents) there is often text information. This information can be used as a semantic
feature, in addition to visual features such as colors and shapes, to improve the retrieval of
relevant images and videos. The text data present in images and videos is useful for
automatic annotation, indexing and structuring of images Jung et al. (2011). There is a huge
increase in online image and video databases. In such databases, there is a need to fetch,
explore and inspect the images and videos. Text extraction plays a major role in finding vital
and valuable information Sumathi et al. (2012). As most search engines are text
based, manual keyword annotations have been traditionally used. However, this process is
laborious and inconsistent, i.e. two users may choose different keywords for the same image
or video. An alternative approach is to generate the keywords from the text that appears in the
image. These keywords can be used as a semantic feature to improve the retrieval of the
relevant images and videos. Other applications of text extraction from images include
sign translation, robots, video skimming and navigation aids for the visually impaired.
Therefore, there is an increasing demand for text extraction from images and videos.
Although several methods have been proposed over the past years, text extraction
from images is still a problem because of the almost unconstrained text appearance, i.e. text
can vary drastically in font, color, size and alignment; in addition, low image contrast and
complex backgrounds make the problem of automatic text extraction extremely challenging
Jung et al. (2011).

1.2 Digital Images

A digital image is a two-dimensional mathematical representation of an image.
Digital images are formed of pixels, i.e. picture elements. For black and white photos, each
pixel represents the gray level at a single point in the image, so a pixel can be thought of as a
tiny dot of a single color. By recording the color of an image at a large number of points,
we can produce a digital approximation of the image from which a replica of the original
can be recreated. Pixels are like grain particles in a conventional photographic image: they
are organized in a regular pattern of rows and columns, and each stores slightly different
data. A digital image is a rectangular arrangement of pixels, sometimes called a bitmap.

Fig. 1.1: Representation of an image as pixels

The study of various noise models and filtering techniques Kamboj et al. (2013) in
image processing, noise reduction and image restoration is expected to improve the
qualitative inspection of an image and the performance criteria of quantitative image
analysis techniques.
Image optimization, compression, recompression, resizing and restoration are the basic
problems in image processing. These are the causes of disturbance to the image environment.
To improve image quality while maintaining its fidelity, clarity and speedy processing,
various types of filters are used in research. Different filters such as the trimmed average,
minimum-maximum, median, inverse, adaptive and Wiener filters can be used for removing
the induced noises in image enhancement. Filtering maintains the intensity, luminance and
density of the image.

A digital image is prone to heterogeneous noise, which affects its quality. The purpose
of image denoising is to restore the detail of the original image to the greatest extent. The
criterion of the noise removal problem depends on the type of noise corrupting the image
Kamboj et al. (2013). Different methods for noise reduction and image enhancement have
been considered.
Chan et al. (2005) put forward a two-phase scheme for removing salt and pepper
noise. In the first phase, an adaptive median filter is used to identify pixels that are
likely to be infected by noise. In the second phase, the image is restored using a
specialized regularization method that applies only to those selected noise candidates. In
terms of edge preservation and noise elimination, their restored images represent a
significant enhancement compared to those restored by applying just nonlinear filters or
regularization methods alone. The strategy can remove salt-and-pepper noise with a noise
level as high as 90%.
Nair et al. (2008) evinced an improved decision-based algorithm for the restoration of
gray-scale and color images that are highly contaminated by salt and pepper noise, which
efficiently removes the noise while preserving all details. The algorithm utilizes formerly
processed neighboring pixel values to obtain better image quality than the one utilizing only
the just previously processed pixel value. The proposed algorithm is faster and also produces
better results than a Standard Median Filter (SMF). Its advantage lies in replacing only the
noisy pixel, either by the median value or by the mean of the previously processed
neighboring pixel values.
Singh et al. (2013) proposed the removal of high-density salt and pepper noise in
noisy color images using a projected median filter. The performance of the improved
median filter is good at lower noise density levels. The mean filter removes little noise and
gives the worst results. The enhanced median filter removes most of the noise effectively
while preserving colored image details. The performance of the algorithm is analyzed in
terms of Peak Signal to Noise Ratio (PSNR), Mean Square Error (MSE) and Image
Enhancement Factor (IEF).
Luo et al. (2006) noted that images are often corrupted by a noise known as salt and
pepper noise. The standard median filter has been established as a reliable method to
remove salt and pepper noise without harming the edge features. However, the major
problem of the standard Median Filter (MF) is that it is effective only at low noise
densities.

1.3 Fundamentals of Digital Image Processing

A major portion of the information received by a human being from the environment
is visual. Hence, processing visual information by computer has been drawing very
significant attention from researchers over the last few decades. The process of receiving
and analyzing visual information by the human species is referred to as sight, perception and
understanding. Similarly, the process of receiving and analyzing visual information by a
digital computer is called digital image processing Kamboj et al. (2013).
Image processing begins with an image acquisition process. Two elements are
required to acquire digital images. The first one is a sensor: a physical device that is
sensitive to the energy radiated by the object to be imaged. The second is called a
digitizer: a device for converting the output of the sensing device into digital form.
For example, in a digital camera the sensors yield an electrical output proportional to light
intensity, and the digitizer converts these outputs to digital data.
The aim of digital image processing is to improve the potential information for
human interpretation and to process image data for storage, transmission and
representation for autonomous machine perception Hemalatha (2014). The attributes of an
image deteriorate due to contamination by various types of noise: additive white Gaussian
noise, Rayleigh noise, impulse noise, etc. contaminate an image during the processes of
acquisition, transmission, reception, storage and retrieval. For meaningful and useful
processing such as image segmentation and object recognition, and to have a very good
visible display in applications like television, photo-phone, etc., the acquired image signal
must be made noise-free and deblurred Tripathi (2012). Image deblurring and image
denoising are the two sub-areas of image restoration. In the present research work, attempts
are made to propose efficient filters that suppress the noise and preserve the text edges and
fine details of an image as far as possible over a wide range of noise densities.

1.4 Types of Noise

The principal origin of noise in digital images arises during image acquisition and/or
transmission. The performance of image sensors is affected by a variety of aspects, such as
environmental conditions during image acquisition and the quality of the sensing elements
themselves. Images are corrupted during transmission principally due to electromagnetic
interference in the channel employed for transmission. For example, an image transmitted
over a wireless network might be corrupted because of lightning or other atmospheric
disturbances Eng et al. (2001).
When an analog image signal is transmitted through a linear dispersive channel, the
image edges (step-like or pulse-like signals) get blurred and the image signal gets
contaminated with AWGN, since no practical channel is noise free. If the channel is so poor
that the noise variance is high enough to make the signal excurse to very high positive or
very low negative values, then the thresholding operation at the front end of the receiver
will contribute saturated max and min values. Such noisy pixels appear as white and black
spots in the image; therefore, this type of noise is known as salt-and-pepper noise (SPN).
So, if an analog image signal is transmitted, the noise could be AWGN (additive white
Gaussian noise), SPN (salt-and-pepper noise) or a mixture of the two.

Fig. 1.2: Original image and image affected by Gaussian noise

Fig. 1.3: An image degraded by salt and pepper noise

If the image signal is transmitted in digital form through a channel, then inter-symbol
interference (ISI) takes place. In addition to this, the AWGN in a practical channel also
comes into the picture. This makes the situation very critical. Due to ISI and AWGN, it may
so happen that a 1 is recognized as a 0 and vice versa. Under such circumstances, the image
pixel values are changed to random values at random positions in the image frame. Such a
type of noise is known as random-valued impulse noise (RVIN).

Fig. 1.4: A 40% random-valued impulse noisy image

Another type of noise that may degrade an image signal is speckle noise (SN). In

some biomedical applications like ultrasonic imaging and a few engineering applications
like synthetic aperture radar imaging, such noise is encountered. SN is a signal-dependent
noise: if the image pixel magnitude is high, then the noise is also high. The noise is
multiplicative because a transmitting system first transmits a signal to the object and then
records the reflected signal. When the signal is transmitted, it may get contaminated with
additive noise in the channel. Due to the varying reflectance of the surface of the object,
the reflected signal magnitude varies, and so does the noise, since the noise is also
reflected by the surface of the object. The noise magnitude is therefore higher when the
signal magnitude is higher. Thus, speckle noise is multiplicative in nature.

Fig. 1.5: An image contaminated by speckle noise

1.5 Image Filtering Algorithms

The basic problem in image processing is the enhancement and restoration of images
in a noisy environment. For enhancing the quality of images, various filtering techniques
available in image processing can be used. There are various filters that can remove noise
from images while preserving image details and enhancing the quality of the image. Filters
are special tools designed to take an image as input, apply a mathematical algorithm to it,
and return the image in a modified form. Filters are used to remove the noise from images.
Earlier, linear filters were used for removing noise from images Tripathi et al. (2012), but
linear filters perform poorly in the presence of noise that is not additive, as well as in
systems with non-linearities or non-Gaussian statistics. Linear filters have the advantage of
fast processing but the disadvantage of not preserving edges. Conversely, non-linear filters
have the advantage of preserving text edges but the disadvantage of slower processing
Patidar et al. (2010).
1.5.1 Trimmed Average Filter

In order to calculate the α-trimmed filter, the data within the window are sorted from
low to high and the central part of the ordered array is averaged. The number of input data
values dropped from the average is controlled by the trimming parameter α. It is well
known that the average filter suppresses additive white Gaussian noise better than the
median filter, while the median filter is better at preserving edges and rejecting impulses
Pitas et al. (1992). The α-trimmed mean filter Bednar et al. (1987) was proposed as the best
choice, taking advantage of both the average and the median filter. The α-trimmed mean
filter rejects the smallest and largest observations, the number depending on the value of
α. In order to perform the analysis, different image metrics and complexity are considered.

Fig. 1.6: A trimmed average filtered image
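The sort-trim-average procedure described above can be sketched as follows. This is an illustrative NumPy sketch rather than the MATLAB implementation used in this work; the 3×3 window, edge-replicated borders and the definition of the trimming fraction are assumptions:

```python
import numpy as np

def alpha_trimmed_mean_filter(img, ksize=3, alpha=0.2):
    """Alpha-trimmed mean filter over a ksize x ksize window.

    alpha is the total fraction of sorted samples trimmed
    (alpha/2 from each end of the ordered array).
    """
    pad = ksize // 2
    padded = np.pad(img.astype(float), pad, mode='edge')
    out = np.empty_like(img, dtype=float)
    n = ksize * ksize
    trim = int(alpha * n / 2)            # samples dropped from each end
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            window = np.sort(padded[i:i + ksize, j:j + ksize], axis=None)
            kept = window[trim:n - trim]  # central part of the ordered array
            out[i, j] = kept.mean()
    return out
```

With alpha = 0 the filter reduces to the plain average filter, while as alpha approaches 1 it behaves like the median filter, which is the trade-off noted above.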

1.5.2 Median Filter

Median filtering is a common image enhancement technique for removing salt and
pepper noise. Because this filtering is less sensitive than linear techniques to extreme
changes in pixel values, it can remove salt and pepper noise without significantly reducing
the sharpness of an image. Median filtering is a nonlinear operation used in image
processing to reduce "salt and pepper" noise. The median is much less sensitive than the
mean to extreme values, so median filtering is better able to remove these outliers without
reducing the sharpness of the image. The median filter removes impulse noise, but it also
smoothes all edges and boundaries and may erase details of the image.

Fig. 1.7: A median filtered image

The mean filter replaces each pixel with the mean of the neighboring pixel values, but
it does not preserve image details; some details are removed by the mean filter Varghese
(2014). In the median filter, by contrast, we do not replace the pixel value with the mean of
the neighboring pixel values; we replace it with the median of those values. The median is
calculated by first sorting all the pixel values from the surrounding neighbourhood into
numerical order and then replacing the pixel being considered with the middle pixel value.
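The sort-and-replace rule just described can be sketched as follows (again an illustrative NumPy sketch, not the dissertation's MATLAB code; window size and border handling are assumptions):

```python
import numpy as np

def median_filter(img, ksize=3):
    """Replace each pixel with the median of its ksize x ksize neighbourhood."""
    pad = ksize // 2
    padded = np.pad(img.astype(float), pad, mode='edge')
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            # np.median sorts the neighbourhood and takes the middle value
            out[i, j] = np.median(padded[i:i + ksize, j:j + ksize])
    return out
```

An isolated impulse (a single very bright or very dark pixel) never reaches the middle of the sorted neighbourhood, which is why the median filter rejects it without blurring the surrounding values.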

1.5.3 Adaptive Filter

Adaptive filters are commonly used in image processing to enhance or restore data
by removing noise without significantly blurring the structures in the image Westin et al.
(2000). The adaptive filter is applied to the degraded image, which contains the original
image plus noise. The local mean and variance, computed over a defined m×n window
region, are the two statistical measures on which a local adaptive filter depends. Adaptive
filters can be thought of as self-adjusting digital filters. They find widespread use in
countering the effects of "speckle" noise, which afflicts coherent imaging systems like
ultrasound. With these imaging techniques, scattered waves interfere with one another to
contaminate an acquired image with multiplicative speckle noise.

Fig. 1.8: An adaptive filtered image

1.5.4 Wiener Filter

Wiener theory, formulated by Norbert Wiener in 1940, forms the foundation of
data-dependent linear least square error filters. Wiener filters play a central role in a wide
range of applications such as linear prediction, echo cancellation, signal restoration, channel
equalization and system identification. The main aim of this technique is to filter out noise
that has corrupted the signal. It is a kind of statistical approach: to design the filter, one
should know the spectral properties of the original signal and the noise, and the linear
time-invariant filter output should be as close to the original signal as possible Kaur (2015).
The Wiener filter minimizes the mean square error between the estimated random process
and the desired process. The Wiener filter low-pass filters an intensity image that has been
degraded by constant-power additive noise. It uses a pixel-wise adaptive Wiener method
based on statistics estimated from a local neighbourhood of each pixel: the image I is
filtered using neighbourhoods of size M-by-N to estimate the local image mean and
standard deviation. If the [M N] argument is omitted, M and N default to 3.

Fig. 1.9: Wiener filtered image
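The pixel-wise adaptive rule described above (as implemented by MATLAB's wiener2, which this section paraphrases) can be sketched roughly as follows. Estimating the noise power as the mean of all local variances is an assumption carried over from wiener2's default behaviour:

```python
import numpy as np

def adaptive_wiener(img, m=3, n=3):
    """Pixel-wise adaptive Wiener filtering with an m x n local window.

    Output pixel: mu + max(var - noise, 0) / max(var, noise) * (pixel - mu),
    where mu and var are the local mean and variance, and the noise power
    is estimated as the average of all the local variances.
    """
    img = img.astype(float)
    pad_m, pad_n = m // 2, n // 2
    padded = np.pad(img, ((pad_m, pad_m), (pad_n, pad_n)), mode='edge')
    mu = np.empty_like(img)
    var = np.empty_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            w = padded[i:i + m, j:j + n]
            mu[i, j] = w.mean()
            var[i, j] = w.var()
    noise = var.mean()                    # noise power estimate
    var_safe = np.maximum(var, noise)     # avoid amplifying flat regions
    return mu + (var_safe - noise) / np.maximum(var_safe, 1e-12) * (img - mu)
```

In flat regions (local variance near the noise level) the gain term goes to zero and the output falls back to the local mean, while near strong edges (variance well above the noise) the original pixel is passed through, which is how the filter smooths noise without blurring structure.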

1.5.5 Maximum & Minimum Filter

Minimum and maximum filters, also called erosion and dilation filters respectively,
are morphological filters that work by examining a neighbourhood around each pixel. From
the list of neighbouring pixels, the minimum or maximum value is found and stored as the
corresponding resulting value. Finally, each pixel in the image is replaced by the resulting
value generated for its associated neighbourhood. If max and min filters are applied
alternately, they can remove certain kinds of noise, such as salt-and-pepper noise, very
efficiently Kaur (2015).

Fig. 1.10: A maximum filtered image

Fig. 1.11: A minimum filtered image
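The neighbourhood min/max rule can be sketched in a few lines (an illustrative NumPy sketch; the square 3×3 structuring element is an assumption):

```python
import numpy as np

def minmax_filter(img, ksize=3, mode='max'):
    """Morphological min (erosion) or max (dilation) filter."""
    pad = ksize // 2
    padded = np.pad(img.astype(float), pad, mode='edge')
    op = np.max if mode == 'max' else np.min   # pick the neighbourhood reducer
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = op(padded[i:i + ksize, j:j + ksize])
    return out
```

A max pass removes isolated dark ("pepper") pixels and a min pass removes isolated bright ("salt") pixels, which is why applying the two alternately suppresses salt-and-pepper noise.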

1.5.6 Average Filter

The average or mean filter is a simple filter that replaces the center value in the
window with the average (mean) of all the pixel values in the local window. The window,
or kernel, is usually square but can be any shape or matrix size.
The idea of mean filtering is simply to replace each pixel value in an image with the
mean (`average') value of its neighbours, including itself. This has the effect of removing
pixel values which are unrepresentative of their surroundings. Mean filtering is usually
thought of as a convolution filter. Like other convolutions it is based around a kernel,
which represents the shape and size of the neighbourhood to be sampled when calculating
the mean.
The result is not a significant improvement in noise reduction and, furthermore, the
image becomes very blurred. The two main problems with mean filtering are:

A single pixel with a very unrepresentative value can significantly affect the mean
value of all the pixels in its neighbourhood.

When the filter neighbourhood straddles an edge, the filter will interpolate new
values for pixels on the edge and so will blur that edge. This may be a problem if
sharp edges are required in the output.

Both of these problems are tackled by the median filter, which is often a better filter
for reducing noise than the mean filter, but it takes longer to compute. In general the mean
filter acts as a low-pass frequency filter and, therefore, reduces the spatial intensity
derivatives present in the image.

Fig. 1.12: An average filtered image

1.6 Local Image Contrast and Gradient

The local image contrast and the local image gradient are very useful features for
text extraction from complex images and documents (historical and degraded). They are
very effective and have been used in many document image binarization techniques Su et
al. (2010). In Bernsen's paper Bernsen et al. (1986), the local contrast is defined as follows:

    C(i, j) = Imax(i, j) - Imin(i, j)                                   (1)

where C(i, j) denotes the contrast of an image pixel (i, j), and Imax(i, j) and Imin(i, j)
denote the maximum and minimum intensities within a local neighbourhood window of
(i, j), respectively. If the local contrast C(i, j) is smaller than a threshold, the pixel is set as
background directly. Otherwise it is classified into text or background by comparing with
the mean of Imax(i, j) and Imin(i, j). Bernsen's method is simple, but cannot work properly
on images with a complex document background. Therefore a new binarization method is
used for calculating the local image contrast:

    C(i, j) = (Imax(i, j) - Imin(i, j)) / (Imax(i, j) + Imin(i, j) + ε)  (2)

where ε is a positive but infinitely small number that is added in case the local maximum
is equal to 0. Compared with Bernsen's contrast, the new local image contrast introduces a
normalization factor (the denominator) to compensate for the image variation within the
document background.
The image gradient has been widely used for edge detection Ziou et al. (1988), and it
can effectively detect the text stroke edges of document images that have a uniform
document background. On the other hand, it often detects many non-stroke edges in the
background of a degraded document, which often contains certain image variations due to
noise, uneven lighting, bleed-through, etc. To extract only the stroke edges properly, the
image gradient needs to be normalized to compensate for the image variation within the
document background Su et al. (2010).
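The normalized local contrast described above can be sketched as follows (an illustrative NumPy sketch; the 3×3 window and the numerical value of ε are assumptions):

```python
import numpy as np

def local_contrast(img, ksize=3, eps=1e-8):
    """Normalized local contrast: (Imax - Imin) / (Imax + Imin + eps)."""
    pad = ksize // 2
    padded = np.pad(img.astype(float), pad, mode='edge')
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            w = padded[i:i + ksize, j:j + ksize]
            imax, imin = w.max(), w.min()
            out[i, j] = (imax - imin) / (imax + imin + eps)
    return out
```

On a flat background the numerator is zero regardless of the background brightness, which is the compensation effect the denominator provides over Bernsen's unnormalized contrast.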

1.7 Adaptive Image Contrast

In pre-processing, an adaptive contrast map is applied to the input image. In the
adaptive contrast map, a combination of the local image contrast and the local image
gradient is applied to the input image. The image gradient alone detects many non-stroke
edges in the background of an image that contains image variations due to noise, uneven
lighting, bleed-through, etc. To extract only the stroke edges properly, the image gradient
needs to be normalized to compensate for the image variation within the document
background. The purpose of the contrast image construction is to detect the stroke edge
pixels of the document text properly Su et al. (2010):

    Ca(i, j) = α C(i, j) + (1 - α) (Imax(i, j) - Imin(i, j))            (3)

where C(i, j) denotes the local contrast in Eq. (2) and (Imax(i, j) - Imin(i, j)) represents
the local image gradient normalized to [0, 1]. The local window size is set to 3. α is the
weight between the local contrast and the local gradient, which is controlled by the image
statistical information:

    α = (Std / 128)^γ                                                    (4)

where Std is the document image intensity standard deviation and γ is a predefined
parameter in [0, ∞).
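The construction of the adaptive contrast map described above can be sketched as follows. This is an illustrative NumPy sketch of the weighted combination, not the dissertation's MATLAB code; the gradient normalization by its global maximum and the default γ = 1 are assumptions:

```python
import numpy as np

def adaptive_contrast_map(img, ksize=3, gamma=1.0, eps=1e-8):
    """Adaptive contrast Ca = a*C + (1-a)*G, with a = (Std/128)^gamma.

    C is the normalized local contrast and G is the local gradient
    (Imax - Imin), itself normalized to [0, 1].
    """
    img = img.astype(float)
    pad = ksize // 2
    padded = np.pad(img, pad, mode='edge')
    h, w = img.shape
    contrast = np.empty((h, w))
    gradient = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            win = padded[i:i + ksize, j:j + ksize]
            imax, imin = win.max(), win.min()
            contrast[i, j] = (imax - imin) / (imax + imin + eps)
            gradient[i, j] = imax - imin
    gradient /= max(gradient.max(), eps)     # normalize gradient to [0, 1]
    alpha = (img.std() / 128.0) ** gamma     # weight from image statistics
    return alpha * contrast + (1.0 - alpha) * gradient
```

For a high-variation (high Std) image the weight α leans on the normalized local contrast, which tolerates background variation; for a clean, uniform document it leans on the raw gradient, which localizes strokes sharply.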
1.8 Image Binarization Technique

Binarization is a pre-processing task which is very useful for document analysis
systems Stathis et al. (2008). It is a process in which an image is converted into a bi-level
form such that foreground information is represented by the black pixel value and
background information is represented by the white pixel value. A number of
methodologies have been proposed by several researchers on image segmentation using
binarization and its applications towards moving object detection and human gait
recognition Chaki et al. (2014). Thresholding is the well-known technique which is used
for binarization of images. The basic idea of thresholding is to select an optimal gray
level threshold value for separating objects of interest in an image from the background
based on their gray level distribution Vala et al. (2013). Based on the calculation of the
threshold value, there are three types of thresholding methods:

1.8.1 Global Thresholding

The global thresholding technique computes an optimal threshold for the entire
image Kaur et al. (2014). It works well for simple cases, but fails for images with complex
backgrounds and uneven illumination. Global methods are generally based on histogram
analysis Kasar et al. (2007), and they work well for images with well-separated foreground
and background classes.

1.8.1.1 Otsu's Method

Otsu's method is a global thresholding method which is used to automatically
perform clustering-based image thresholding Sankur et al. (2001), or the reduction of a
gray level image to a binary image. The algorithm assumes that the image contains two
classes of pixels following a bi-modal histogram (foreground pixels and background
pixels); it then calculates the optimum threshold separating the two classes so that their
combined spread (intra-class variance) is minimal, or equivalently (because the sum of
pairwise squared distances is constant), so that their inter-class variance is maximal
Otsu (1979).
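The inter-class variance maximization can be sketched as an exhaustive search over candidate thresholds (an illustrative NumPy sketch assuming 8-bit gray levels; MATLAB's graythresh provides the equivalent functionality used in this work):

```python
import numpy as np

def otsu_threshold(img):
    """Return the gray level maximizing inter-class variance (Otsu, 1979)."""
    hist, _ = np.histogram(img.ravel(), bins=256, range=(0, 256))
    prob = hist / hist.sum()
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()    # class probabilities
        if w0 == 0 or w1 == 0:
            continue                               # one class is empty
        mu0 = (np.arange(t) * prob[:t]).sum() / w0          # class means
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1
        between = w0 * w1 * (mu0 - mu1) ** 2       # inter-class variance
        if between > best_var:
            best_var, best_t = between, t
    return best_t
```

Pixels at or above the returned level are then classified as one class (e.g. background) and the rest as the other, which is the foreground/background split used by the proposed algorithm.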

1.8.2 Local Thresholding

Local thresholding binarization sets a different threshold for different target pixels
depending on their neighborhood/local information. These techniques are sensitive to
background noise due to its large variance in the case of poorly illuminated images. These
approaches are generally window based, and the local threshold for a pixel is computed
from the gray values of the pixels within a window centered at that particular pixel Kasar
et al. (2007).

1.8.3 Hybrid Thresholding

Hybrid methods use both global and local information to decide the pixel label
Sauvola et al. (1999). A first step consists of carrying out a global threshold to classify a
part of the background of the document image and keep only the part containing
foreground. A second step aims to refine the image obtained from the previous step in
order to obtain a better result Kaur et al. (2014).
1.9 Canny Edge Detection

The Canny edge detector is an edge detection operator that uses a multi-stage
algorithm to detect a wide range of edges in images. The Canny edge detector has a good
localization property in that it can mark edges close to the real edge locations in the image
being examined Jagtap et al. (2015). In addition, the Canny edge detector uses two adaptive
thresholds and is more tolerant to different imaging artifacts such as shading.
The Canny edge detection algorithm runs in 4 separate steps:

Smoothing:
Blurring of the image to remove noise.

Finding gradients:
The edges should be marked where the gradients of the image have large magnitudes.

Non-maximum suppression:
Only local maxima should be marked as edges.

Double thresholding:
Potential edges are determined by thresholding. Final edges are determined by
suppressing all edges that are not connected to a very certain (strong) edge.
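The four steps above can be sketched end to end as follows. This is an illustrative NumPy sketch, not the MATLAB edge detector used in this work; the 3×3 Gaussian kernel, Sobel gradients, and the particular low/high threshold values are assumptions:

```python
import numpy as np

def conv2(img, k):
    """'Same'-size 2-D convolution with edge padding."""
    ph, pw = k.shape[0] // 2, k.shape[1] // 2
    p = np.pad(img, ((ph, ph), (pw, pw)), mode='edge')
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = (p[i:i + k.shape[0], j:j + k.shape[1]] * k).sum()
    return out

def canny(img, low=20.0, high=60.0):
    img = img.astype(float)
    # 1. Smoothing: 3x3 Gaussian blur to remove noise
    g = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 16.0
    smooth = conv2(img, g)
    # 2. Finding gradients: Sobel operators give magnitude and direction
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    gx, gy = conv2(smooth, kx), conv2(smooth, kx.T)
    mag = np.hypot(gx, gy)
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180
    # 3. Non-maximum suppression along the quantized gradient direction
    nms = np.zeros_like(mag)
    for i in range(1, mag.shape[0] - 1):
        for j in range(1, mag.shape[1] - 1):
            a = angle[i, j]
            if a < 22.5 or a >= 157.5:
                n1, n2 = mag[i, j - 1], mag[i, j + 1]
            elif a < 67.5:
                n1, n2 = mag[i - 1, j + 1], mag[i + 1, j - 1]
            elif a < 112.5:
                n1, n2 = mag[i - 1, j], mag[i + 1, j]
            else:
                n1, n2 = mag[i - 1, j - 1], mag[i + 1, j + 1]
            if mag[i, j] >= n1 and mag[i, j] >= n2:
                nms[i, j] = mag[i, j]
    # 4. Double thresholding with hysteresis: keep weak edges only if
    #    they are connected to a strong edge
    strong = nms >= high
    weak = (nms >= low) & ~strong
    edges = strong.copy()
    changed = True
    while changed:
        changed = False
        for i in range(1, edges.shape[0] - 1):
            for j in range(1, edges.shape[1] - 1):
                if weak[i, j] and not edges[i, j] \
                        and edges[i - 1:i + 2, j - 1:j + 2].any():
                    edges[i, j] = True
                    changed = True
    return edges
```

The hysteresis step is what makes the two thresholds "adaptive" in effect: weak responses along a genuine stroke survive because they touch strong responses, while isolated weak responses from shading or noise are discarded.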
1.10

Research Aim and Objectives

It has been observed that over the years different algorithms have been implemented for
text extraction from documents and images, but perfection has not been achieved. These
algorithms have undergone vast alterations and modifications. The algorithms have been
modified by exploring the best methods for obtaining the threshold values, by applying the
best techniques for edge detection, and by integrating image gradient and image contrast
enhancement for improving accuracy, but the application of filtering techniques for
removing noise from images at the initial stage of text extraction has not been addressed
and explored much.


These are the objectives of my dissertation:

1. To develop an algorithm to analyze the effect of different filters applied to the
images from which the text has to be extracted through the Otsu binarization method.
2. The algorithm uses approaches like local image contrast, local image gradient, the
combination of local image contrast and gradient, and the Canny edge detector for
detecting text in images efficiently.
3. Calculation and comparison of the simulated PSNR, MSE, NRM, DRD, F-Measure,
Precision, Specificity and Accuracy.
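For reference, the Otsu binarization named in objective 1 selects the global threshold that maximizes the between-class variance of the gray-level histogram. A minimal pure-Python sketch, equivalent in spirit to MATLAB's `graythresh` but not the toolbox code itself:

```python
def otsu_threshold(img):
    """Return the Otsu threshold t (0-255) for a grayscale image given
    as a list of lists; pixels > t are background, pixels <= t are text.

    Otsu's method maximizes the between-class variance
    w0 * w1 * (mu0 - mu1)**2 over all candidate thresholds t.
    """
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    total_sum = sum(i * hist[i] for i in range(256))
    best_t, best_var, w0, sum0 = 0, -1.0, 0, 0.0
    for t in range(256):
        w0 += hist[t]                 # pixels at or below t
        if w0 == 0:
            continue
        w1 = total - w0               # pixels above t
        if w1 == 0:
            break
        sum0 += t * hist[t]
        mu0 = sum0 / w0
        mu1 = (total_sum - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

On a cleanly bimodal histogram (dark text, bright background) the maximum lands between the two modes, which is why denoising the image first, as objective 1 investigates, directly affects the quality of the chosen threshold.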

1.11 Dissertation Organization


This dissertation consists of five chapters. Chapter 1 presents an introduction to this
work as well as the purpose of this dissertation; the contents of the different chapters that
form the dissertation are also outlined. In Chapter 2, a literature review of current research
on text extraction from images, degraded document images and historical documents is
presented. This chapter also provides a study of various binarization and filtering techniques
and algorithms. Chapter 3 gives a brief overview of the simulation tools used in this
dissertation, with whose help the overall simulation is performed. In this chapter, a
description of the proposed methodology designed to complete this dissertation is given.
The methodology gives a complete overview of the work done from start to end. After this,
the proposed algorithm to study the effects of filters on text extraction from complex images
through binarization is implemented. Chapter 4 explains the various performance
measurement parameters for comparing the performance of different filters on text
extraction from complex images. Based on the parameters explained in this chapter, the
analysis of the filtering is done. The simulation results and the comparisons are also
presented in this chapter. Chapter 5 briefly summarizes the key outcomes of the proposed
schemes and provides some suggestions for future work of this thesis.


Chapter 2
Review of Literature

2.1 General
The literature survey chapter should demonstrate a systematic knowledge of the area

and provide arguments to support the study focus. A literature survey helps to locate and
summarize the background study of a specific topic. Reliable sources such as IEEE, ACM
and books on binarization and filters were used for detailed literature review in order to
acquire relevant data.
Shaomin et al. (1995) proposed work on image enhancement using α-trimmed
mean filters. Image enhancement is the most important and challenging pre-processing step for
almost all applications of image processing. By now, various methods such as the median filter,
the α-trimmed mean filter, etc. have been suggested. It was shown that the α-trimmed mean
filter is a modification of the median and mean filters. The proposed algorithm has shown
excellent performance in suppressing noise.
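The α-trimmed mean idea can be summarized in a few lines: sort the window, discard the α smallest and α largest samples, and average the remainder. A Python sketch of the general idea (parameter naming is illustrative, not taken from the paper):

```python
def alpha_trimmed_mean(window, alpha):
    """Alpha-trimmed mean of a flat list of pixel values.

    Sort the samples, drop the `alpha` smallest and `alpha` largest
    (the likely noise outliers), and average the remainder. With
    alpha = 0 this reduces to the mean filter; as alpha approaches
    (n - 1) // 2 it approaches the median filter.
    """
    s = sorted(window)
    trimmed = s[alpha:len(s) - alpha] if alpha > 0 else s
    return sum(trimmed) / len(trimmed)
```

For a 3x3 window [0, 90, 100, 100, 100, 100, 110, 255, 255] and alpha = 2, the 0 and the two 255 impulses are discarded and only the five central values are averaged, which is why the filter sits between the mean and median filters in behavior.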
Sauvola et al. (1999) this paper proposes a method for adaptive document image
binarization, where the page is considered as a collection of subcomponents such as text,
background and picture. He proposes a new method that first performs a rapid classification
of the local contents of a page to background, pictures and text. Two different approaches
are then applied to define a threshold for each pixel: a soft decision method (SDM) for
background and pictures, and a specialized text binarization method (TBM) for textual and
line drawing areas. The SDM includes noise filtering and signal tracking capabilities, while
the TBM is used to separate text components from the background in bad conditions caused by
uneven illumination or noise. Finally, the outcomes of these algorithms are combined.
Sobottka et al. (2000) the automatic retrieval of indexing information from colored
paper documents is a challenging problem. In order to build up bibliographic databases,
editing by humans is usually necessary to provide information about title, authors and
keywords. For automating the indexing process, the identification of text elements is
essential. In this paper an approach to automatic text extraction from colored book and
journal covers is proposed. Two methods have been developed for extracting text. The
results of both methods are combined to robustly distinguish between text and non-text
elements.
Yuan et al. (2001) present a well-designed method that makes use
of edge information to extract textual blocks from grayscale document images. It aims at
detecting textual regions in newspaper images heavily infected with noise and separating them
from graphical regions. The algorithm traces the feature points in different entities and then
groups the edge points of textual regions. Using the techniques of line approximation and
layout categorization, it can successfully retrieve directionally placed text blocks. Finally,
feature-based connected component merging was introduced to gather homogeneous textual
regions together within the scope of their bounding rectangles. The proposed method has been
tested on a large group of newspaper images with multiple page layouts, and promising results
proved the effectiveness of the method.

Chen et al. (2001), the paper presents a fast and robust algorithm to identify text in
image or video frames with complex backgrounds and compression effects. The algorithm
first extracts the candidate text line on the basis of edge analysis, baseline location and
heuristic constraints. Support Vector Machine is then used to identify text line from the
candidates in edge-based distance map feature space. Experiments based on a large amount
of images and video frames from different sources showed the advantages of this algorithm
compared to conventional methods in both identification quality and computation time.
Tsai et al. (2002) this paper presents a novel binarization algorithm for color
document images. Conventional thresholding methods do not produce satisfactory
binarization results for documents with close or mixed foreground colors and background
colors. Initially, statistical image features are extracted from the luminance distribution.
Then, a decision-tree based binarization method is proposed, which selects various color
features to binarize color document images. First, if the document image colors are
concentrated within a limited range, saturation is employed. Second, if the image
foreground colors are significant, luminance is adopted. Third, if the image background
colors are concentrated within a limited range, luminance is also applied. Fourth, if the total
number of pixels with low luminance (less than 60) is limited, saturation is applied; else
both luminance and saturation are employed. Our experiments include 519 color images,
most of which are uniform invoice and name-card document images. The proposed
binarization method generates better results than other available methods in shape and
connected-component measurements. Also, the binarization method obtains higher
recognition accuracy in a commercial OCR system than other comparable methods.
Gllavata et al. (2003) text detection in images or videos is an important step to
achieve multimedia content retrieval. In this paper, an efficient algorithm which can
automatically detect, localize and extract horizontally aligned text in images (and digital
videos) with complex backgrounds is presented. The proposed approach is based on the
application of a color reduction technique, a method for edge detection, and the localization
of text regions using projection profile analyses and geometrical properties. The outputs of
the algorithm are text boxes with a simplified background, ready to be fed into an OCR
engine for subsequent character recognition. Our proposal is robust with respect to different
font sizes, font colors, languages and background complexities. The performance of the
approach is demonstrated by presenting promising experimental results for a set of images
taken from different types of video sequences.
Kim et al. (2003) the current paper presents a novel texture-based method for
detecting texts in images. A support vector machine is used to analyze the textural properties
of texts. No external texture feature extraction module is used, but rather the intensities of
the raw pixels that make up the textural pattern are fed directly to the SVM, which works
well even in high-dimensional spaces. Next, text regions are identified by applying a
continuously adaptive mean shift algorithm to the results of the texture analysis. The
combination of CAMSHIFT and SVMs produces both robust and efficient text detection, as
time-consuming texture analyses for less relevant pixels are restricted, leaving only a small
part of the input image to be texture-analyzed.
Jung et al. (2004) text data present in images and video contain useful information
for automatic annotation, indexing, and structuring of images. Extraction of this information
involves detection, localization, tracking, extraction, enhancement, and recognition of the
text from a given image. However, variations of text due to differences in size, style,
orientation, and alignment, as well as low image contrast and complex background make the
problem of automatic text extraction extremely challenging. While comprehensive surveys
of related problems such as face detection, document analysis, and image & video indexing
can be found, the problem of text information extraction is not well surveyed.
Ye et al. (2005) text in images and video frames carries important information for
visual content understanding and retrieval. In this paper, by using multiscale wavelet
features, we propose a novel coarse-to-fine algorithm that is able to locate text lines even
under complex background. First, in the coarse detection, after the wavelet energy feature is
calculated to locate all possible text pixels, a density-based region growing method is
developed to connect these pixels into regions which are further separated into candidate
text lines by structural information. Secondly, in the fine detection, with four kinds of
texture features extracted to represent the texture pattern of a text line, a forward search
algorithm is applied to select the most effective features. Finally, an SVM classifier is used
to identify true text from the candidates based on the selected features. Experimental results
show that this approach can quickly and robustly detect text lines under various conditions.

Shui et al. (2005) Local Wiener filtering in the wavelet domain is an effective image
denoising method of low complexity. In this paper, they propose a doubly local Wiener
filtering algorithm, where elliptic directional windows are used for the different oriented
subbands in order to estimate the signal variances of the noisy wavelet coefficients, and the
two procedures of local Wiener filtering are performed on the noisy image. The experimental
results show that the proposed algorithm improves the denoising performance significantly.

Gatos et al. (2006) this paper presents a new adaptive approach for the binarization
and enhancement of degraded documents. The proposed method does not require any
parameter tuning by the user and can deal with degradations which occur due to shadows,
non-uniform illumination, low contrast, large signal-dependent noise, smear and strain. We
follow several distinct steps: a pre-processing procedure using a low-pass Wiener filter, a
rough estimation of foreground regions, a background surface calculation by interpolating
neighboring background intensities, a thresholding by combining the calculated background
surface with the original image while incorporating image up-sampling, and finally a post-processing step in order to improve the quality of text regions and preserve stroke
connectivity.

Badekas et al. (2007) present a new method for the binarization of
color document images. Initially, the colors of the document image are reduced to a
small number using a new color reduction technique. Specifically, this technique
estimates the dominant colors and then assigns the original image colors to them so
that the background and text components become uniform. Each dominant color
defines a color plane in which the connected components (CCs) are extracted. Next, in
each color plane a CC filtering procedure is applied, followed by a grouping
procedure. At the end of this stage, blocks of CCs are constructed, which are then
redefined by obtaining the direction of connection (DOC) property for each CC. Using
the DOC property, the blocks of CCs are classified as text or non-text. The identified text
blocks are binarized properly using suitable binarization techniques, considering the rest
of the pixels as background. The final result is a binary image which always contains
black characters on a white background, independently of the original colors of each text
block. The proposed document binarization approach can also be used for binarization of
noisy color (or gray-scale) document images.
Nikholas et al. (2008) Binarization methods are applied to document images for
discriminating the text from the background based on pure thresholding and filtering
combined with image processing algorithms. The proposed binarization procedure consists
of five discrete steps in image processing, for different classes of document images. A
refinement technique enhances further the image quality. Results on Byzantine historical
manuscripts are discussed and potential applications and further research are proposed. The
main contribution of this paper is to propose a simple and robust binarization procedure for
pre-filtered historical manuscripts images, and simulation results are also presented.
Stathis et al. (2008) propose a new technique for the validation of
document binarization algorithms. The method is simple in its implementation and can be
performed on any binarization algorithm since it doesn't require anything more than the
binarization stage. The technique is appropriate for document images that are difficult to
evaluate by techniques based on segmentation or recognition of the text. In this paper, a
survey of binarization algorithm performance is done, in which a comparison ranging
from the oldest algorithms to the newest ones is made and some conclusions are presented.
Experiments are performed on artificial historical documents that imitate the common
problems of historical documents, made by using image mosaicing techniques and combining
old blank document pages with noise-free PDF documents. This way, after applying the
binarization algorithms to the synthetic images, it is easy to evaluate the results by
comparing the resulting image, pixel by pixel, with the original document.
Ebenezer et al. (2009) proposed a novel decision-based trimmed median filter
algorithm for restoring highly corrupted grayscale and color images. It replaces a noisy
pixel with the trimmed median value when pixel values other than 0 and 255 are
present in the selected window; when all the pixel values are 0s and 255s, the
noisy pixel is replaced by the mean value of all the elements present in the selected window. It
can handle color as well as gray images.
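The decision rule just described can be sketched for a single pixel as follows (an illustration of the stated rule for salt-and-pepper noise, not the authors' code):

```python
def decision_trimmed_median(window, center):
    """Sketch of the decision rule for one pixel under salt-and-pepper
    noise, following the description above (not the authors' code).

    `window` is the flat list of pixel values around the pixel; `center`
    is the pixel itself. A pixel is treated as noisy only if it is 0 or
    255. The trimmed median ignores the 0/255 extremes; when nothing but
    extremes remains, the plain mean of the window is used instead.
    """
    if center not in (0, 255):
        return center                     # signal pixel: keep unchanged
    trimmed = sorted(v for v in window if v not in (0, 255))
    if trimmed:                           # median of the non-extreme values
        n = len(trimmed)
        if n % 2:
            return trimmed[n // 2]
        return (trimmed[n // 2 - 1] + trimmed[n // 2]) / 2
    return sum(window) / len(window)      # all 0s/255s: fall back to mean
```

The "decision" is that pixels which are not impulse extremes are left untouched, which is what lets the filter preserve uncorrupted detail even at high noise densities.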
Hedjamg et al. (2010) in this work, a robust segmentation method for text

extraction from the historical document images is presented. The method is based on
Markovian-Bayesian clustering on local graphs on both pixel and regional scales. It consists
of three steps. In the first step, an over-segmented map of the input image is created. The
resulting map provides a rich and accurate semi-mosaic fragments. The map is processed in
the second step, similar and adjoining sub-regions are merged together to form accurate text
shapes. The output of the second step, which contains accurate shapes, is processed in the
final step in which, using clustering with fixed number of classes, the segmentation will be
obtained. The method employs significantly the local and spatial correlation and coherence
on both the image and between the stroke parts, and therefore is very robust with respect to
the degradation. The resulting segmented text is smooth, and weak connections and loops
are preserved thanks to robust nature of the method. The output can be used in succeeding
skeletonization processes which require preservation of the text topology for achieving high
performance. The method is tested on real degraded document images with promising
results.
Wang et al. (2010) has proposed a novel improved median filter algorithm for the
images highly corrupted with salt-and-pepper noise. Firstly all the pixels are classified into
signal pixels and noisy pixels by using the Max-Min noise detector. The noisy pixels are
then separated into three classes, which are low-density, moderate-density, and high-density
noises, based on the local statistical information. Finally the weighted 8-neighborhood
similarity function filter, the 5×5 median filter and the 4-neighborhood mean filter are
adopted to remove the noise for the low, moderate and high level cases, respectively. The
validation results show that the proposed algorithm has better performance in terms of
noise removal capability, adaptivity, and detail preservation, and is especially effective when
the images are extremely highly corrupted.
Liu et al. (2012) have proposed an improved image filtering algorithm which is
based on median filtering algorithm and medium filtering algorithm according to the

simpleness of median filtering algorithm and the significant de-noising effect of medium
filtering algorithm. The new algorithm combines the two algorithms and thus gets a better
filtering effect. The simulation was performed using MATLAB, and the objective evaluation
was done using the classical PSNR method. Simulation results showed that the new algorithm has
a better de-noising effect than the medium filtering algorithm and reduces the denoising
time as well. Thus the improved algorithm has a better practicability.
Rongzhu et al. (2012) demonstrated the application of an improved median filter
to image processing. The median filter is the most common method of clearing image noise.
This report proposes an improved median filter algorithm to remove salt and pepper noise
from an image. According to the characteristics of salt and pepper noise, the algorithm
detects image noise and establishes a noise-marked matrix, without processing the pixels
marked as signal. Pixels marked as signal are left untreated, while noisy pixels are filtered
with a weighted mean filter whose window size and weights are chosen according to the
noise pollution in their neighborhood, as determined from the local histogram. MATLAB
experiments show that the improved median filter can greatly reduce the time needed to
clear image noise, and it performs better than median filters on noise reduction while
retaining the edges of an image.


Wang et al. (2012) have done a comparative study of research work done in the field of
image filtering. Image filtering processes are applied to images to remove the different
types of noise that are either present in the image during capture or introduced into the
image during transmission. Salt & pepper (impulse) noise is one type of noise which
occurs during transmission of the images or due to bit errors or dead pixels in the image
contents. Images are blurred due to object movement or camera displacement at capture
time. This paper deals with removing the impulse noise and blurredness simultaneously
from the images. The hybrid filter is a combination of the Wiener filter and the median
filter.

Singh et al. (2013) image segmentation is the process of partitioning an image into
multiple segments. This paper describes a novel approach to image segmentation by
performing some steps over the detected edges of all the objects present in the foreground
or background. In some applications, like image recognition, compression and watermarking,
it is likely to be inefficient and impractical to process the whole image. In that case it is
necessary to segment the image before recognizing, compressing or embedding some
watermark. For this, several image segmentation approaches are available to segment the
image, to change the representation of the image or to simplify the image to make it more
meaningful and easy to analyze. This approach will also be very helpful in digital image
watermarking applications for more efficient embedding of the watermark.

Bolan et al. (2013) propose a novel document image binarization technique that
addresses these issues by using adaptive image contrast. The adaptive image contrast is a
combination of the local image contrast and the local image gradient that is tolerant to text
and background variation caused by different types of document degradations. In the
proposed technique, an adaptive contrast map is first constructed for an input degraded
document image. The contrast map is then binarized and combined with Canny's edge map
to identify the text stroke edge pixels. The document text is further segmented by a local
threshold that is estimated based on the intensities of the detected text stroke edge pixels
within a local window. The proposed method is simple, robust, and involves minimum
parameter tuning. It has been tested on three public datasets that were used in the recent
document image binarization contests (DIBCO) 2009 & 2011 and handwritten DIBCO 2010,
and achieves accuracies of 93.5%, 87.8%, and 92.03%, respectively.
Singh et al. (2013) this paper presents a survey on different image filtering
techniques. Image filtering is a crucial part of vision processing as it can remove noise from
noisy images. There are many filtering techniques to filter an image. Each filtering
technique has its own benefits to filter an image. The overall objective of this paper is to
explore the benefits and limits of existing techniques. It is found that the hybrid median filter
and the alpha-trimmed filter have some potential benefits over existing filters when reducing
salt and pepper noise.
Kaur (2014) focused on the different image binarization techniques.
Existing research has shown that no technique is perfect for every case. Several
researchers have used image filters to reduce the noise in the image; however, the
utilization of the guided filter (the best edge-preserving filter) is not found. It may increase
the accuracy of the present binarization strategies. In most of the techniques the contrast
enhancement is either done in the traditional way or not done at all; therefore adaptive
contrast enhancement is required. Most of the strategies have neglected the utilization of
the edge map, which has the capability to map the precise character in a proficient manner.
This paper has proposed a new technique which has the ability to binarize documents in a
more efficient manner. The proposed method integrates the image gradients and the image
contrast enhancement to improve the accuracy of document image binarization. The proposed
technique also utilizes the guided image filter to improve the accuracy rate further. The
comparative analysis has shown that the proposed algorithm provides quite significant
improvement over the available algorithms.
Kumari et al. (2014) survey the filtering techniques for denoising images in digital
image processing. In image denoising techniques, image filtering algorithms are applied to
images to remove the different types of noise that are either present in the image during
capture or injected into the image during transmission. Certain image denoising filters are
based on the median filter. The authors have explored a variety of methods to remove noise
from digital images, such as Gaussian filtering, Wiener filtering etc. Due to certain
assumptions made about the frequency content of the image, many of these algorithms
remove fine details from images in addition to the noise.
Sehad (2014) proposes to estimate texture information based on Gabor filters for
ancient degraded documents. First, the dominant slant angle of the document image script
is computed by using the Fourier transform. Then, this dominant angle is used within a
weighted sum of angles in a Gabor filter bank in order to capture the document image
foreground (text) more efficiently. This information, combined with the variance and the
mean extracted from the spatial and frequency domains respectively, is used for estimating
the binarization threshold. Three variants are used for evaluating the performance of the
Gabor filter bank, based on Niblack's, Sauvola's, and Wolf's thresholds.
Qixiang et al. (2015) analyze, compare, and contrast the technical challenges,
methods, and performance of text detection and recognition research in color imagery.
The paper summarizes the fundamental problems and enumerates factors that should be
considered when addressing these problems. Existing techniques are categorized as either
stepwise or integrated, and sub-problems are highlighted, including text localization,
verification, segmentation and recognition. Special issues associated with the enhancement
of degraded text and the processing of video text, multi-oriented, perspectively distorted
and multilingual text are also addressed. The categories and sub-categories of text are
illustrated, benchmark datasets are enumerated, and the performance of the most
representative approaches is compared.
Ranganathan et al. (2015) present a simple and efficient binarization method to
binarize degraded document images. The proposed technique is able to tolerate the high
inter- and intra-intensity variation in the degraded document image. Document image
binarization is a process of converting the document image into a binary image containing
text as foreground and plain white as background, or vice versa. Characters from the
document image should be extracted from the binarized image in order to recognize them,
so the performance of the character recognition system completely depends on the
binarization quality. The proposed method is based on spatial domain techniques: the
Laplacian operator, the adaptive bilateral filter and the Gaussian filter, and works well
for degraded documents and palm leaf manuscript images.

2.2 Summary

It is inferred from the above literature that most of the researchers have made great
contributions to the study of filtering algorithms and text extraction from colored images and
degraded historical document images. Filters are used for denoising the images, but
sometimes it may happen that while removing noise, the filters are not able to preserve the
edges of the text present in the image. Therefore, it is necessary to analyze how different
filters behave and perform while extracting text from images. For this purpose, the main
focus of this thesis is to find out the performance of filters on colored images with different
background colors, intensity and illumination and text with different font sizes, shapes and
alignments. The quality of the images is degraded and contaminated by noise like Gaussian
noise, speckle noise and salt and pepper noise.


Chapter 3
Methodology

This chapter discusses the various components of the simulation environment. It
also includes a detailed description of the various simulation parameters and analyses used in
the study.

3.1 Image Processing Toolbox

Image Processing Toolbox provides a comprehensive set of reference-standard
algorithms, functions, and apps for image processing, analysis, visualization, and
algorithm development.


Key Features:

1. Image analysis, including segmentation, morphology, statistics, and measurement
2. Image enhancement, filtering, and deblurring
3. Geometric transformations and intensity-based image registration methods
4. Large image workflows, including block processing, tiling, and multiresolution display
5. Visualization apps, including Image Viewer and Video Viewer
6. Multicore- and GPU-enabled functions and C-code generation support.
Color transformation functions
isbw(A) returns value 1 if the image is black & white, and value 0 otherwise
isgray(A) returns value 1 if the image is grayscale, and value 0 otherwise
isrgb(A) returns value 1 if the image is RGB, and value 0 otherwise
im2bw converts an RGB image into black and white
rgb2gray converts an RGB image into grayscale
Spatial transformation functions
imresize(A, scale) resizes the image to the desired scale
imrotate(A, angle) rotates the image by the desired angle
imcrop cuts the image to the selected rectangle
Filtering transformation functions
wiener2(A) removal of Gaussian noise through the adaptive Wiener filter
imfilter(A, fspecial('average', [3 3])) removal of noise through the average filter
medfilt2(A) removal of noise through the median filter
ordfilt2(A, 1, ones(3,3)) removal of noise through the minimum filter
ordfilt2(A, 9, ones(3,3)) removal of noise through the maximum filter
wthresh(im2double(A), thr) removal of noise through wavelet denoising
Binarization transformation functions
graythresh(A) Otsu thresholding function
edge(A, 'canny') detection of text edges in images
Noise inducing functions
imnoise() for inducing noise in images

3.2 Research Methodology

Research methodology simply means the methods I intend to use in my thesis. It
can be used for resolving problems and better implementation of my research work.
The research activities consist of the following phases:
1. The first phase of my work starts with the selection of the dissertation topic. Text data
present in images and video contain useful information for automatic annotation,
indexing and structuring of images. Extraction of this information involves detection,
localization, tracking, extraction, enhancement and recognition of the text from the
degraded image. Text extraction from images is concerned with extracting the
relevant text data from a collection of images, and with how noise such as salt and
pepper noise and Gaussian noise present in those images affects the text extraction.
2. In the second phase of my dissertation, I have done a broad study of my topic. In the
literature review, a study of existing text extraction algorithms applicable in digital
image processing is done. Existing research has shown that no technique is perfect
for every case. After studying the literature, I concluded that I needed to design a
highly efficient algorithm to ensure text extraction from degraded documents or
images by removing the noise at the pre-processing stage.

3. In the third phase, I found out research gaps, which include exploring the concept
of text extraction through binarization from complex images and studying the
performance of filters on images before applying binarization algorithms. The
concept of image gradients and image contrast enhancement is used to improve the
accuracy of document image binarization. After this I started installing the MATLAB
R2013a version and learned how to use it, and also learned about its syntax.

and Understanding
Basic
Concepts
4. FourthSelection
phase of
of Topic
dissertation
starts withthethe
design
of the proposed model
which include designing and implementation of my algorithm in the matlab toolbox.
5. In the fifth phase, I performed the overall methods of my work: I implemented
my filtering and text extraction algorithms. After this I performed validation of my
proposed algorithm and compared the performance of the filters in terms of PSNR, MSE,
DRD, MPM, NRM, Recall and Accuracy.

6. In the final phase of my dissertation, I discussed the results and found out
which filter has better performance for my proposed algorithms under different
simulation parameter values.

Fig. 3.1: Research Methodology (flowchart: selection of topic and understanding basic
concepts, literature survey and discussion with mentor, broad study of topic, required
tools installation, observation of existing solutions, identifying research gaps, design of
the conceptual model for simulation, validation, results and analysis, mentor checks,
data collection and synthesis, final report)

3.3

Simulation Modeling

Simulation is a technique for solving problems by observing the performance, over
time, of a dynamic model of the system Raczynski et al. (2006). Simulation represents
the relationship between a system and the model. A system is defined as a collection of
components that interact in such a manner that it distinguishes the system from its
environment. A model is a simplified representation of the system intended to predict
what happens if certain actions are taken Raczynski et al. (2006). Simulation
development is an iterative process in which the construction, execution and analysis of
a model are repeatedly performed.

Fig. 3.2: Modeling Process

3.3.1 Construct Model


A model is constructed for text extraction from images. First, the noise that is present in
the images and degrades their quality is removed with the help of image filters, and
these filtered images act as input for the text extraction algorithms.

3.3.2 Execute Model


Each model is executed and it is ensured that the simulation is carried out in the way
that is expected. Any obstruction in this regard is taken care of.

3.3.3 Analyze Model


Analysis of the model is done with respect to the various performance metrics already
defined, and conclusions are then drawn based on it.

3.4

Proposed Algorithm

3.4.1 Flow Chart of the proposed algorithm

Choose the input images → Select the filter → Convert it into a gray scale image →
Gradient image → Contrast image → Adaptive Contrast Map → Canny Edge Detector →
Threshold Estimation → Otsu Threshold → Binarized image

Fig. 3.3: Flow chart for proposed algorithm


3.4.2 Algorithm for Text Extraction through Binarization


Step 1: First of all, images will be taken for the experimental purpose.
Step 2: Apply the filtering techniques for smoothing, removing noise and
restoring the input images.
Step 3: If the input image is a color image, it will be converted into a gray scale
image.
Step 4: Gradient image and contrast image techniques are applied to the input
image.
Step 5: The adaptive contrast map will come into action to improve the contrast of the
input image.
Step 6: The edges of the text strokes are detected through the Canny edge detector.
Step 7: Calculate the threshold.
Step 8: Apply Otsu thresholding to convert the image into the final binarized image.

3.4.3 Different filtering techniques

We can consider a noisy image to be modelled as follows:


g(x, y) = f(x, y) + η(x, y)   (1)

where f(x, y) is the original image pixel, η(x, y) is the noise term and g(x, y) is
the resulting noisy pixel.


If we can estimate the noise model we can figure out how to restore the image
There are many different models for the image
noise term (x, y):

- Gaussian: the most common model
- Rayleigh
- Erlang (Gamma)
- Exponential
- Uniform
- Impulse: salt-and-pepper noise
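The dissertation's experiments are implemented in MATLAB; purely as an illustrative NumPy sketch (not the author's code), the additive model of Eq. (1) and the two noise types used later in the experiments can be reproduced as follows. The flat grey test image and the parameter values are assumptions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(f, sigma=10.0):
    """g(x, y) = f(x, y) + eta(x, y) with eta ~ N(0, sigma^2), per Eq. (1)."""
    eta = rng.normal(0.0, sigma, size=f.shape)
    return np.clip(f + eta, 0, 255)

def add_salt_pepper_noise(f, p=0.05):
    """Impulse noise: with probability p a pixel is driven to 0 or 255."""
    g = f.astype(float).copy()
    u = rng.random(f.shape)
    g[u < p / 2] = 0                      # pepper
    g[(u >= p / 2) & (u < p)] = 255       # salt
    return g

f = np.full((64, 64), 128.0)   # hypothetical flat grey test image
g1 = add_gaussian_noise(f)
g2 = add_salt_pepper_noise(f)
```

With these degraded images in hand, the filters of the following subsections can be compared on how well they recover f.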
The arithmetic mean filter is a very simple one and is calculated as follows:
f̂(x, y) = (1 / mn) Σ_(s,t)∈Sxy g(s, t)   (2)

3.4.3.1 Trimmed Average Filter

In order to calculate the α-trimmed filter, the data should be sorted from low to high
and the central part of the ordered array summed. The number of input data values
dropped from the average is controlled by the trimming parameter α. It is well
known that the average filter suppresses additive white Gaussian noise better than the
median filter, while the median filter is better at preserving edges and rejecting impulses
Pitas et al. (1992). The α-trimmed mean filter was proposed as the best choice, taking
advantage of both the average and the median filter Bednar et al. (1987). The α-trimmed
mean filter rejects the smallest and the largest observation data depending on the value of
α. In order to perform analysis, different image metrics and complexity are
considered.

Fig. 1.6: A trimmed average filtered image

f̂(x, y) = (1 / (mn − d)) Σ_(s,t)∈Sxy g_r(s, t)   (3)

We delete the d/2 lowest and the d/2 highest grey levels, so g_r(s, t) represents the
remaining mn − d pixels.
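As a sketch of Eq. (3) (again in NumPy rather than the MATLAB used in this work), the trimming behaviour on a single window can be shown as follows; the sample window values are hypothetical:

```python
import numpy as np

def alpha_trimmed_mean(window, d):
    """Sort the window, drop the d/2 lowest and d/2 highest values,
    and average the remaining mn - d pixels (Eq. 3)."""
    s = np.sort(window, axis=None)
    trimmed = s[d // 2 : s.size - d // 2]
    return trimmed.mean()

# d = 0 reduces to the arithmetic mean of Eq. (2);
# as d approaches mn - 1 the filter approaches the median.
window = np.array([[3, 200, 5],
                   [4,   6, 5],
                   [0,   5, 7]])
mean_value = alpha_trimmed_mean(window, 0)      # plain mean, outlier 200 included
trimmed_value = alpha_trimmed_mean(window, 4)   # drops 0, 3 and 7, 200
```

Note how the trimmed value ignores the impulse-like outlier 200, which would dominate the plain mean.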

3.4.3.2 Median Filter


Median filtering is a common image enhancement technique for removing salt-and-pepper
noise. It is a nonlinear operation, and because the median is much less sensitive than the
mean to extreme changes in pixel values, it can remove such outliers without significantly
reducing the sharpness of the image. The median filter removes impulse noise, but it
also smooths edges and boundaries and may erase fine details of the image.

Fig.1.7: Median filter image

The mean filter replaces each pixel with the mean of the neighbouring pixel values, but it
does not preserve image details; some details are removed with the mean filter Varghese
(2014). In the median filter, we do not replace the pixel value with the mean of the
neighbouring pixel values; we replace it with the median of those values. The median is
calculated by first sorting all the pixel values from the surrounding neighbourhood into
numerical order and then replacing the pixel being considered with the middle pixel value.
f̂(x, y) = median{g(s, t) : (s, t) ∈ Sxy}   (4)
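A minimal NumPy sketch of the median filter of Eq. (4), assuming edge-replicating padding (one of several possible border strategies; the MATLAB implementation used in this work may handle borders differently):

```python
import numpy as np

def median_filter(img, k=3):
    """Replace each pixel with the median of its k x k neighbourhood (Eq. 4).
    Borders are handled by replicating the edge pixels."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.median(padded[i:i + k, j:j + k])
    return out

img = np.full((5, 5), 10.0)
img[2, 2] = 255.0            # a single salt impulse
clean = median_filter(img)   # the impulse is removed entirely
```

A single impulse never occupies the majority of any 3×3 window, so the median rejects it completely, which is exactly why this filter suits salt-and-pepper noise.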

3.4.3.3 Adaptive Filter


Adaptive filters are commonly used in image processing to enhance or restore data
by removing noise without significantly blurring the structures in the image Westin et al.
(2000). The adaptive filter is performed on the degraded image that contains the original
image plus noise. The mean and variance are the two statistical measures on which a local
adaptive filter depends, within a defined m×n window region. They can be thought of as self-adjusting digital
filters. Adaptive filters find widespread use in countering the effects of "speckle" noise,
which afflicts coherent imaging systems like ultrasound. With these imaging techniques,

scattered waves interfere with one another to contaminate an acquired image with
multiplicative speckle noise.

Fig. 1.8: An adaptive filtered image


The transfer function has only one adaptive coefficient, and our objective is to minimize
the power of the last subfilter output.

3.4.3.4 Wiener Filter


Wiener theory, formulated by Norbert Wiener in 1940, forms the foundation of
data-dependent linear least-square-error filters. Wiener filters play a central role in a wide
range of applications such as linear prediction, echo cancellation, signal restoration, channel
equalization and system identification. The main aim of this technique is to filter out noise
that has corrupted the signal. It is a kind of statistical approach. To design this filter
one should know the spectral properties of the original signal and the noise, and the linear
time-invariant filter whose output should be as close to the original as possible Kaur (2015).

The Wiener filter minimizes the mean square error between the estimated random process
and the desired process. The Wiener filter low-pass filters an intensity image that has been
degraded by constant-power additive noise. It uses a pixel-wise adaptive Wiener method
based on statistics estimated from a local neighbourhood of each pixel, filtering the image
using neighbourhoods of size M-by-N to estimate the local image mean and standard
deviation. If the [M N] argument is omitted, M and N default to 3.

Fig.1.9: Wiener filtered image

H(e^jω) = S_s(e^jω) / ( S_s(e^jω) + (1/M) S_v(e^jω) )   (7)

The power spectra of the signal (s) and noise (v) are Fourier transforms of the correlation
functions r(k). If there is no noise, then H = 1; if there is no signal, then H = 0; for
stationary processes, always 0 < H < 1.
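The limiting behaviour of the Wiener gain in Eq. (7) can be checked numerically; the spectra below are hypothetical examples chosen for illustration, not taken from this work:

```python
import numpy as np

def wiener_gain(S_s, S_v, M=1.0):
    """Frequency-domain Wiener gain of Eq. (7):
    H = S_s / (S_s + (1/M) * S_v)."""
    return S_s / (S_s + S_v / M)

w = np.linspace(0, np.pi, 5)
S_s = np.exp(-w)                # hypothetical decaying signal power spectrum
S_v = np.full_like(w, 0.1)      # hypothetical flat (white) noise spectrum

H = wiener_gain(S_s, S_v)
# no noise -> H = 1; no signal -> H = 0; otherwise 0 < H < 1
```

The gain attenuates frequencies where the noise dominates the signal and passes frequencies where the signal dominates, which is the sense in which the filter is optimal in the mean-square sense.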

3.4.3.5 Maximum & Minimum Filter


Minimum and maximum filters, also called erosion and dilation filters
respectively, are morphological filters that work by examining a neighbourhood around each
pixel. From the list of neighbour pixels, the minimum or maximum value is found and
stored as the corresponding resulting value. Finally, each pixel in the image is replaced by
the resulting value generated for its associated neighbourhood. If we apply max and min
filters alternately they can remove certain kinds of noise, such as salt-and-pepper noise, very
efficiently Kaur (2015).

Fig. 1.10: A maximum filtered image

Fig. 1.11: A minimum filtered image

Max Filter:
f̂(x, y) = max{g(s, t) : (s, t) ∈ Sxy}   (8)

Min Filter:
f̂(x, y) = min{g(s, t) : (s, t) ∈ Sxy}   (9)
Max filter is good for pepper noise and Min filter is good for salt noise.
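A compact NumPy sketch of Eqs. (8)-(9); the test image with one salt and one pepper impulse is an assumed illustration:

```python
import numpy as np

def minmax_filter(img, k=3, mode="max"):
    """Dilation (max) or erosion (min) over a k x k neighbourhood (Eqs. 8-9)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    op = np.max if mode == "max" else np.min
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = op(padded[i:i + k, j:j + k])
    return out

img = np.full((5, 5), 100.0)
img[1, 1] = 0.0       # pepper impulse
img[3, 3] = 255.0     # salt impulse
no_pepper = minmax_filter(img, mode="max")   # max filter removes the pepper pixel
no_salt = minmax_filter(img, mode="min")     # min filter removes the salt pixel
```

Each filter removes one polarity of impulse while spreading the other, which is why the text recommends applying them alternately.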

3.4.3.6 Average Filter


The average or mean filter is a simple filter that replaces the center value in the
window with the average (mean) of all the pixel values in the local window. The window, or
kernel, is usually square but can be any shape or of any matrix size.
The idea of mean filtering is simply to replace each pixel value in an image with the
mean (`average') value of its
neighbours, including itself. This has the effect of removing pixel values which are
unrepresentative of their surroundings. Mean filtering is usually thought of as a convolution

filter. Like other convolutions it is based around a kernel, which represents the shape and
size of the neighbourhood to be sampled when calculating the mean.
This result is not a significant improvement in noise reduction and, furthermore, the
image is now very blurred. There are two main problems with mean filtering:

A single pixel with a very unrepresentative value can significantly affect the mean
value of all the pixels in its neighbourhood.

When the filter neighbourhood straddles an edge, the filter will interpolate new
values for pixels on the edge and so will blur that edge. This may be a problem if
sharp edges are required in the output.
Both of these problems are tackled by the median filter, which is often a better filter

for reducing noise than the mean filter, but it takes longer to compute. In general the mean
filter acts as a low pass frequency filter and, therefore, reduces the spatial intensity
derivatives present in the image.

Fig. 1.12: An average filtered image

f̂(x, y) = (1 / mn) Σ_(s,t)∈Sxy g(s, t)   (10)

3.4.4 Local Image Contrast and Gradient


The local image contrast and the local image gradient are very useful features for
text extraction from the complex images and documents (historical and degraded). They are
very effective and have been used in many document image binarization techniques Su et al.
(2010). In Bernsen's paper Bernsen et al. (1986), the local contrast is defined as follows:

C(i, j) = Imax(i, j) − Imin(i, j)   (11)

where C(i, j) denotes the contrast of an image pixel (i, j), and Imax(i, j) and Imin(i, j)
denote the maximum and minimum intensities within a local neighbourhood window of
(i, j), respectively. If the local contrast C(i, j) is smaller than a threshold, the pixel is set as
background directly. Otherwise it will be classified into text or background by comparison
with the mean of Imax(i, j) and Imin(i, j). Bernsen's method is simple, but cannot
work properly on images with a complex document background. Therefore a new
binarization method is used for calculating the local image contrast:

C(i, j) = ( Imax(i, j) − Imin(i, j) ) / ( Imax(i, j) + Imin(i, j) + ε )   (12)

where ε is a positive but infinitely small number that is added in case the local maximum is
equal to 0. Compared with Bernsen's contrast, the new local image contrast introduces a
normalization factor (the denominator) to compensate for the image variation within the
document background.
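The normalized contrast of Eq. (12) can be sketched in NumPy as follows; the window size k = 3 matches the text, while the test image (a dark stroke on a bright background) is an assumed illustration:

```python
import numpy as np

def local_contrast(img, k=3, eps=1e-6):
    """Normalized local contrast of Eq. (12):
    C = (Imax - Imin) / (Imax + Imin + eps) over a k x k window."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    C = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            w = padded[i:i + k, j:j + k]
            C[i, j] = (w.max() - w.min()) / (w.max() + w.min() + eps)
    return C

# a dark vertical stroke (10) on a bright background (200)
img = np.full((5, 5), 200.0)
img[:, 2] = 10.0
C = local_contrast(img)   # near 1 along the stroke edges, near 0 elsewhere
```

Because the denominator normalizes by local brightness, a stroke edge on a dim background yields roughly the same contrast as one on a bright background, which is the property that makes this measure robust to uneven illumination.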
The image gradient has been widely used for edge detection Ziou et al. (1988) and it
can be used to detect the text stroke edges of the document images effectively that have a
uniform document background. On the other hand, it often detects many non stroke edges
from the background of degraded document that often contains certain image variations due
to noise, uneven lighting, bleed-through, etc. To extract only the stroke edges properly, the
image gradient needs to be normalized to compensate the image variation within the
document background Su et al. (2010).


3.4.5 Adaptive Image Contrast


In the pre-processing stage an adaptive contrast map is applied to the input image. In an
adaptive contrast map, a combination of the local image contrast and the local image
gradient is applied to the input image. The gradient alone detects many non-stroke edges
from the background of an image that often contains image variations due to noise, uneven
lighting, bleed-through, etc. To extract only the stroke edges properly, the image gradient
needs to be normalized to compensate for the image variation within the document
background. The purpose of the contrast image construction is to detect the stroke edge
pixels of the document text properly Su et al. (2010).

Ca(i, j) = α C(i, j) + (1 − α) E(i, j)   (13)

where C(i, j) denotes the local contrast of Eq. (12) and E(i, j) represents the local gradient
normalized to [0, 1]. The local window size is set to 3. α is the weight between local
contrast and local gradient, which is controlled by the image statistical information:

α = (Std / 128)^γ   (14)

where Std is the document image intensity standard deviation and γ is a predefined
parameter in [0, ∞).

3.4.6 Image Binarization Technique

Binarization is a pre-processing task which is very useful for document analysis


system Stathis et al. (2008). It is a process in which an image is converted into a bi-level
form such that foreground information is represented by black pixel values and
background information is represented by white pixel values. A number of
methodologies have been proposed by several researchers on image segmentation using


binarization and its application towards moving object detection and human gait
recognition Chaki et al. (2014). Thresholding is a well-known technique used
for binarization of images. The basic idea of thresholding is to select an optimal gray-level
threshold value for separating objects of interest in an image from the background, based on
their gray-level distribution Vala et al. (2013). Based on the calculation of the threshold
value, there are three types of thresholding methods:

3.4.6.1 Global Thresholding
The global thresholding technique computes an optimal threshold for the entire
image Jagroop Kaur et al. (2014). It works well for simple cases, but fails for
images with complex backgrounds and uneven illumination. These methods are generally
based on histogram analysis Kasar et al. (2007) and work well for images with
well-separated foreground and background classes.

3.4.6.1.1 Otsu's Method


Otsu's method is a global thresholding method which is used to automatically
perform clustering-based image thresholding Sankur et al. (2001), or the reduction of a
gray-level image to a binary image. The algorithm assumes that the image contains two
classes of pixels following a bi-modal histogram (foreground pixels and background
pixels); it then calculates the optimum threshold separating the two classes so that their
combined spread (intra-class variance) is minimal, or equivalently (because the sum of
pairwise squared distances is constant) so that their inter-class variance is maximal
Otsu (1979).
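Otsu's criterion described above can be sketched as an exhaustive search over grey levels (a NumPy illustration, not the MATLAB implementation used in this work; the bimodal test image is an assumption):

```python
import numpy as np

def otsu_threshold(img):
    """Search the grey level t that maximizes the inter-class variance
    of the two classes it separates (Otsu, 1979)."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = p[:t].sum(), p[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * p[:t]).sum() / w0           # class 0 mean
        mu1 = (np.arange(t, 256) * p[t:]).sum() / w1      # class 1 mean
        var_between = w0 * w1 * (mu0 - mu1) ** 2          # inter-class variance
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

# bimodal test image: background around 40, text strokes around 210
img = np.concatenate([np.full(500, 40), np.full(100, 210)]).astype(np.uint8)
t = otsu_threshold(img.reshape(20, 30))
binary = img >= t
```

Maximizing the inter-class variance is equivalent to minimizing the intra-class variance, so the search needs only the histogram, not the full image.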

3.4.6.2 Local Thresholding


Local thresholding binarization sets different thresholds for different target pixels
depending on their neighbourhood/local information. These techniques are sensitive to
background noise due to large variance in the case of poorly illuminated images. These
approaches are generally window based, and the local threshold for a pixel is computed
from the gray values of the pixels within a window centered at that particular pixel Kasar
et al. (2007).

3.4.6.3 Hybrid Thresholding


Hybrid methods use both global and local information to decide the pixel label
Sauvola et al. (1999). A first step consists in carrying out a global threshold to classify a
part of the background of the document image and keep only the part containing
foreground. A second step aims to refine the image obtained from the previous step in
order to obtain a better result Kaur et al. (2014).

3.4.7 Canny Edge Detection


The Canny edge detector is an edge detection operator that uses a multi-stage
algorithm to detect a wide range of edges in images. The Canny edge detector has a good
localization property in that it can mark edges close to the real edge locations in the
detected image Jagtap et al. (2015). In addition, the Canny edge detector uses two
adaptive thresholds and is more tolerant to different imaging artifacts such as shading.

The Canny Edge detection algorithm runs in 4 separate steps:


1. Smoothing: blurring of the image to remove noise.
2. Finding gradients: the edges should be marked where the gradients of the image have
large magnitudes.
3. Non-maximum suppression: only local maxima should be marked as edges.
4. Double thresholding: potential edges are determined by thresholding; final edges are
determined by suppressing all edges that are not connected to a very certain (strong) edge.
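The double-thresholding step can be sketched as an iterative hysteresis pass. This is a simplified NumPy illustration that omits the smoothing and non-maximum suppression stages; the gradient magnitudes and thresholds below are assumed values:

```python
import numpy as np

def hysteresis(mag, lo, hi):
    """Double thresholding of the Canny detector: keep weak edge pixels
    (> lo) only if they connect to a strong pixel (> hi)."""
    strong = mag > hi
    weak = mag > lo
    edges = strong.copy()
    while True:
        # grow the edge set by one pixel in each direction (4-connectivity)
        grown = edges.copy()
        grown[1:, :] |= edges[:-1, :]
        grown[:-1, :] |= edges[1:, :]
        grown[:, 1:] |= edges[:, :-1]
        grown[:, :-1] |= edges[:, 1:]
        grown &= weak                 # never grow outside the weak support
        if np.array_equal(grown, edges):
            return edges
        edges = grown

mag = np.array([[0, 0, 0, 0],
                [5, 9, 5, 0],
                [0, 0, 0, 5]])
edges = hysteresis(mag, lo=3, hi=8)
# the weak pixels adjacent to the strong pixel (value 9) survive;
# the isolated weak pixel in the corner is suppressed
```

This is why Canny output tends to produce continuous strokes: a chain of weak edge responses is kept as long as it touches one confident edge pixel somewhere.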


Chapter 4
Result and Discussion

4.1

General

This chapter includes the results obtained from the simulation. The proposed
algorithm is used to provide more clarity than the previous work. In this chapter the results
of all the intermediate steps of the proposed method are highlighted. Implementation is
done using the MATLAB simulation tool. Experimental results of the intermediate steps
show the efficiency of the proposed approach.

4.2

Performance Metrics

In my study I want to check the performance of my proposed algorithm with different
filters. In the end a comparison between the different filters is made on the basis of
different performance metrics, and then we evaluate our results. These metrics are as
follows:

4.2.1 Peak Signal to Noise Ratio


It is defined as the ratio between the maximum possible power of a signal and the
power of corrupting noise that affects the fidelity of its representation. Because many
signals have a very wide dynamic range, PSNR is usually expressed in terms of
the logarithmic decibel scale. The PSNR (in dB) is defined as:
PSNR = 10 log10( MAX_I² / MSE )   (1)

where MAX_I is the maximum possible pixel value of the image and MSE is the mean
squared error between the input and result images.
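A direct NumPy sketch of the PSNR definition above, assuming 8-bit images with MAX_I = 255 (the test images are hypothetical):

```python
import numpy as np

def psnr(reference, test, max_val=255.0):
    """PSNR = 10 * log10(MAX^2 / MSE), in dB (Eq. 1)."""
    mse = np.mean((reference.astype(float) - test.astype(float)) ** 2)
    if mse == 0:
        return float("inf")   # identical images have infinite PSNR
    return 10.0 * np.log10(max_val ** 2 / mse)

ref = np.full((8, 8), 100.0)
noisy = ref + 10.0            # every pixel off by 10 -> MSE = 100
value = psnr(ref, noisy)      # 10 * log10(255^2 / 100) ≈ 28.13 dB
```

Because of the logarithm, each halving of the MSE adds about 3 dB, which is why small PSNR differences between filters in the tables below still reflect meaningful error reductions.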

4.2.2 Negative Rate Metric


Negative rate metric measures pixel mismatch rate between the input image and
result image.
NRM = (NR_FN + NR_FP) / 2, with NR_FN = FN / (FN + TP) and NR_FP = FP / (FP + TN)   (2)

where TP, FP, TN and FN denote the number of true positives, false positives, true
negatives and false negatives respectively.

4.2.3 Accuracy
Accuracy is defined as the ratio of correctly recognized characters to the sum of correctly
and incorrectly detected and recognized characters, including false positives and false
negatives. It is used to describe the closeness of a measurement to the true value; an
accuracy of 100% means that the measured values are exactly the same as the given
values.
Accuracy = (TP + TN) / (TP + TN + FP + FN)   (3)

4.2.4 Distance reciprocal distortion metric


The Distance Reciprocal Distortion Metric (DRD) has been used to measure the
visual distortion in image. It properly correlates with the human visual perception and it
measures the distortion for all the S flipped pixels as follows:
DRD = ( Σ_{k=1..S} DRD_k ) / NUBN   (4)

where DRD_k equals the weighted sum of the pixels in the 5×5 block of the input image
that differ from the k-th flipped pixel at (x, y) in the binarized result image, and NUBN is
the number of non-uniform (not all black or white) 8×8 blocks in the input image.

4.2.5 Precision
Precision in digital image retrieval is the fraction of the documents that are relevant
to the query that are successfully retrieved.
Precision = TP / (TP + FP)   (5)

where TP is the true positive count and FP is the false positive count.

4.2.6 F-Measure
The F-Measure is the harmonic mean of the recall (R) and precision (P) values,
F = 2RP / (R + P). This metric measures how well the proposed algorithm can retrieve the
desired pixels.


4.2.7 Specificity
The specificity is the number of true negative results divided by the sum of the
numbers of true negative and false positive results, Specificity = TN / (TN + FP).
Here a true positive is correctly identified data in the image, a false positive is
incorrectly identified data, a true negative is correctly rejected data, and a false
negative is incorrectly rejected data.
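The confusion-count metrics of this section can be collected in one helper. This is an illustrative sketch; the NRM form follows the definition of Eq. (2), and the counts below are hypothetical:

```python
def binarization_metrics(TP, FP, TN, FN):
    """Pixel-level metrics of Sections 4.2.2-4.2.7, computed from the
    confusion counts of a binarized image against its ground truth."""
    recall = TP / (TP + FN)
    precision = TP / (TP + FP)
    return {
        "accuracy": (TP + TN) / (TP + TN + FP + FN),        # Eq. (3)
        "precision": precision,                              # Eq. (5)
        "recall": recall,
        "f_measure": 2 * recall * precision / (recall + precision),
        "specificity": TN / (TN + FP),
        "nrm": (FN / (FN + TP) + FP / (FP + TN)) / 2,        # Eq. (2)
    }

m = binarization_metrics(TP=90, FP=10, TN=880, FN=20)
```

Note that with a large text-free background, TN dominates and accuracy stays high even for poor extractions, which is why the tables below report several complementary metrics rather than accuracy alone.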

4.3

Simulation Result and Analysis


In this section, the performance of the proposed algorithm for text extraction from
complex images is evaluated using different filters in terms of the performance metrics.

4.3.1 Performance Analysis of filter for First Input Image


The first input image is a vehicle license plate which contains rich edge and texture
information. The image contains a bright red background and uniform text which is a
combination of numbers and alphabets. Firstly we will remove most of the background
noise with an effective filtering algorithm, and finally search for the text in the plate
region with the proposed algorithm and extract the text from the vehicle license plate.
Fig. 4.1 shows the input image.


Fig. 4.1: Input image is a vehicle license plate

Following are the output images which are obtained by applying the different
filtering methods on the proposed algorithm for the input image.

Fig. 4.2: Text extraction from vehicle license plate using adaptive filter


Fig. 4.2 shows the text extracted from the input image by using adaptive filter method on
proposed algorithm.

Fig. 4.3: Text extraction from vehicle license plate using average filter

Fig. 4.3 shows the text extracted from the input image by using average filter method
on proposed algorithm.

Fig. 4.4: Text extraction from vehicle license plate using maximum filter

Fig. 4.4 shows the text extracted from the input image by using maximum filter
method on proposed algorithm.

Fig. 4.5: Text extraction from vehicle license plate after applying median filter

Fig. 4.5 shows the text extracted from the input image by using median filter method
on proposed algorithm.

Fig. 4.6: Text extraction from vehicle license plate using minimum filter

Fig. 4.6 shows the text extracted from the input image by using minimum filter
method on proposed algorithm.

Fig. 4.7: Text extraction from vehicle license plate using trimmed filter

Fig. 4.7 shows the text extracted from the input image by using trimmed filter
method on proposed algorithm.

Fig. 4.8: Text extraction from vehicle license plate using wiener filter


Fig. 4.8 shows the text extracted from the input image by using wiener filter method
on proposed algorithm.
Table 4.1: Summary of the result for input image of vehicle license plate

Filtering Methods   Accuracy   DRD       F-Measure   NRM      Precision   PSNR       Specificity
Adaptive Filter     0.9572     13.4257   0.5143      0.4987   0.8056      13.6855    1.0000
Average Filter      0.9547     12.1345   0.2518      0.4994   0.3191      13.4358    0.9999
Maximum Filter      0.9540     11.2602   0.3804      0.4991   0.3966      13.3770    0.9999
Median Filter       0.9559     11.1076   0.1898      0.4995   0.7456      13.5521    1.0000
Minimum Filter      0.9618     11.2403   1.6104      0.4960   0.8454      14.1767    0.9999
Trimmed Filter      0.9522     12.7740   0.3974      0.4991   0.2717      13.2055    0.9997
Wiener Filter       0.9523     12.1693   0.1914      0.4996   0.2727      13.21114   0.9999

Table 4.1 shows the performance values calculated for multiple filters for the given input
image of a vehicle license plate. These values are further analyzed with the help of graphs.


4.3.1.1 Analysis of calculated values


In the following figures the numbers 1, 2, 3, 4, 5, 6 and 7 represent the adaptive filter,
average filter, maximum filter, median filter, minimum filter, trimmed filter and wiener
filter respectively.

Fig. 4.9: Accuracy of different filters for vehicle license plate

Fig. 4.9 shows that in terms of accuracy the minimum filter gives the best result and the
trimmed filter the worst result for the input image of the vehicle license plate. After the
minimum filter, the filter which shows the better output in comparison with the other
filters is the adaptive filter.

Fig. 4.10: Precision values of different filters for vehicle license plate

Fig. 4.10 shows that the precision values for the minimum filter are closest to 1. This
implies that for the given input image of the vehicle license plate the minimum filter is
more efficient for text extraction in binary form compared with the other filters.

Fig. 4.11: Negative rate metric of different filters for vehicle license plate

Fig. 4.11 shows that for the input image of the vehicle license plate there is the least pixel
mismatch for the minimum filter and the greatest pixel mismatch for the wiener filter.

Fig. 4.12: Distance reciprocal distortion metric of different filters for vehicle license plate


Fig. 4.12 shows that the median filter provides the best visual quality and maintains good
text strokes for the input image, while the adaptive filter gives the highest value of DRD,
which means that the adaptive filter is not able to detect text properly in the input image.

Fig. 4.13: F-Measure values of different filters for vehicle license plate

Fig. 4.13 shows that the minimum filter possesses the highest value of the F-Measure,
indicating that the binarized image and the input image are equivalent. It also implies that
the precision and recall values of the binarized image are high.

Fig. 4.14: PSNR values of different filters for vehicle license plate


Fig. 4.14 shows that for the given input image of the vehicle license plate the minimum
filter gives a high PSNR value, which indicates good image quality and little error
introduced in the output image, whereas the trimmed filter gives the lowest PSNR value,
which means the quality of the input image is most degraded compared with the other
filters.

Fig. 4.15: Specificity values of different filters for vehicle license plate

Fig. 4.15 shows that for the adaptive filter and the median filter the proposed algorithm is
able to reject the pixels of the input image which do not contain text data more accurately
than with the other filters.

4.3.2 Performance Analysis of Second Input Image


The second input image is the logo of the western union bank, which contains texture
information. The image has a dull background and uniform text of different sizes and
fonts. Firstly we will remove most of the noise with an effective filtering algorithm, and
finally search for the text in the logo with the proposed algorithm and extract the text
from the logo of the western union bank. Fig. 4.16 shows the input image.


Fig. 4.16: Input image is logo of western union bank

Following are the output images which are obtained by applying the different

filtering methods on the proposed algorithm for the input image.

Fig. 4.17 Text extraction from logo of western union bank after applying adaptive filter

Fig. 4.17 shows the text extracted from the input image by using adaptive filter
method on proposed algorithm.


Fig. 4.18: Text extraction from input image after applying average filter

Fig. 4.18 shows the text extracted from the input image after using the average filter
method on the proposed algorithm.

Fig. 4.19: Text extraction from input image after applying maximum filter

Fig. 4.19 shows the text extracted from the input image after using maximum filter
method on proposed algorithm.

Fig. 4.20: Text extraction from logo of western union bank image after applying median filter

Fig. 4.20 shows the text extracted from the input image after using median filter
method on proposed algorithm.

Fig. 4.21: Text extraction from logo of western union bank image after applying minimum filter

Fig. 4.21 shows the text extracted from the input image after using minimum filter
method on proposed algorithm.

Fig. 4.22: Text extraction from logo of western union bank image after applying trimmed filter

Fig. 4.22 shows the text extracted from the input image after using trimmed filter
method on proposed algorithm.

Fig. 4.23: Text extraction from logo of western union bank image after applying wiener filter


Fig. 4.23 shows the text extracted from the input image after using wiener filter
method on proposed algorithm.
Table 4.2: Summary of the result for input image of logo of western union bank

Filtering Methods   Accuracy   DRD       F-Measure   NRM      Precision   PSNR       Specificity
Adaptive Filter     0.9260     16.0210   0.0825      0.4995   0.5479      11.3098    1.0000
Average Filter      0.9190     19.5042   0.3233      0.4992   0.5758      10.4877    0.9999
Maximum Filter      0.9183     19.7458   0.8218      0.4980   0.5657      10.8766    0.9998
Median Filter       0.9272     16.9677   0.1256      0.4997   1.0000      11.3803    1.0000
Minimum Filter      0.9484     15.1737   1.7071      0.4815   1.0000      10.8774    1.0000
Trimmed Filter      0.9193     14.9850   0.5854      0.4837   0.4026      10.9826    0.9996
Wiener Filter       0.9273     16.5714   0.3765      0.4991   0.5000      11.3864    0.9999

4.3.2.1 Analysis of calculated values


In the following figures the numbers 1, 2, 3, 4, 5, 6 and 7 represent the adaptive filter,
average filter, maximum filter, median filter, minimum filter, trimmed filter and wiener
filter respectively.

Fig. 4.24: Accuracy of different filters for logo of Western union bank

Fig. 4.24 shows that in terms of accuracy the minimum filter gives the best result and the
maximum filter the worst result for the input image of the Western union bank logo. As
can be seen from the output images, the minimum filter extracts the complete text present
in the image, whereas with the maximum filter most of the text is not extracted.

Fig. 4.25: Distance reciprocal distortion metric of different filters for logo of Western union bank

Fig. 4.25 shows that the trimmed filter provides the best visual quality and maintains good
text strokes for the input image, while the maximum filter gives the highest value of DRD,
which means that the maximum filter is not able to detect text properly in the input image.

Fig. 4.26: F-Measure of different filters for logo of Western union bank

Fig. 4.26 shows that the minimum filter possesses the highest value of the F-Measure,
indicating that the binarized image and the input image are equivalent. It also implies that
the precision and recall values of the binarized image are high.

Fig. 4.27: Negative rate metric of different filters for logo of Western union bank

Fig. 4.27 shows that for the input image of the Western union bank logo there is the least
pixel mismatch for the minimum filter and the greatest pixel mismatch for the median
filter.

Fig. 4.28: Precision of different filters for logo of Western union bank

Fig. 4.28 shows that the precision values for the minimum and median filters are equal to 1.
This implies that for the given input image the minimum and median filters are more
efficient for text extraction in binary form compared with the other filters.

Fig. 4.29: PSNR of different filters for logo of Western union bank

Fig. 4.29 shows that for the given input image the wiener filter gives a high PSNR value,
which indicates good image quality and little error introduced in the output image,
whereas the average filter gives the lowest PSNR value, which means the quality of the
input image is most degraded compared with the other filters.

Fig. 4.30: Specificity of different filters for logo of Western union bank

Fig. 4.30 shows that for the adaptive, median and minimum filters the proposed algorithm is able to reject the pixels of the input image which do not contain text data more accurately in comparison with the other filters.

4.3.3 Performance Analysis of Third Input Image

The third input image is a DVD cover which contains a large amount of text information. The image has a background of different colors and text of different font sizes, alignments and shapes. The input image is corrupted and its quality is degraded by Gaussian noise. Firstly we remove most of the noise by an effective filtering algorithm, and finally search for the text in the cover by the proposed algorithm and extract the text from the cover.
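The filter-then-binarize pipeline described above can be sketched in Python with NumPy (the thesis work itself was done in MATLAB). The 3x3 median filter and Otsu's global threshold below are illustrative stand-ins for the filtering and binarization stages, and the function names are hypothetical:

```python
import numpy as np

def median_filter(img, k=3):
    """Simple k x k median filter; borders are handled by edge padding."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.median(padded[i:i + k, j:j + k])
    return out

def otsu_threshold(img):
    """Global Otsu threshold: pick t maximizing between-class variance."""
    hist = np.bincount(img.ravel().astype(int), minlength=256).astype(float)
    total = hist.sum()
    sum_all = np.dot(np.arange(256), hist)
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0.0
    for t in range(256):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def extract_text(noisy):
    """Filter first, then binarize: 1 = background, 0 = text."""
    filtered = median_filter(noisy, 3)
    t = otsu_threshold(filtered)
    return (filtered > t).astype(np.uint8)
```

The same structure applies for every filter compared in this chapter; only the filtering stage is swapped out.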

Fig. 4.31: Input image is DVD cover having Gaussian noise

Following are the output images of text extraction which are obtained by applying
the different filtering methods on the proposed algorithm for the input image.

Fig. 4.32: Text extraction from DVD cover using adaptive filter

Fig. 4.32 shows the text extracted from the input image by using the adaptive filter method on the proposed algorithm.

Fig. 4.33: Text extraction from DVD cover using average filter

Fig. 4.33 shows the text extracted from the input image by using the average filter method on the proposed algorithm.

Fig. 4.34: Text extraction from DVD cover using maximum filter

Fig. 4.34 shows the text extracted from the input image by using the maximum filter method on the proposed algorithm.

Fig. 4.35: Text extraction from DVD cover after applying median filter

Fig. 4.35 shows the text extracted from the input image by using the median filter method on the proposed algorithm.

Fig. 4.36: Text extraction from DVD cover using minimum filter

Fig. 4.36 shows the text extracted from the input image by using the minimum filter method on the proposed algorithm.

Fig. 4.37: Text extraction from DVD cover using trimmed filter

Fig. 4.37 shows the text extracted from the input image by using the trimmed filter method on the proposed algorithm.

Fig. 4.38: Text extraction from DVD cover using wiener filter

Fig. 4.38 shows the text extracted from the input image by using the wiener filter method on the proposed algorithm.
Table 4.3: Summary of the result for DVD cover having Gaussian noise

Filtering Methods   Accuracy   DRD        F-Measure   NRM      Precision   PSNR      Specificity
Adaptive Filter     0.6967     20.4257    3.8455      0.5477   0.0423      11.0592   0.7309
Average Filter      0.9216     8.5363     0.0195      0.4999   0.2857      12.3968   1.0000
Maximum Filter      0.9424     8.5363     0.0265      0.4997   0.1642      12.3968   0.8997
Median Filter       0.9482     11.0706    0.1618      0.4997   0.1642      12.8593   0.9998
Minimum Filter      0.8952     18.7968    4.2457      0.5045   0.0532      9.7968    0.9557
Trimmed Filter      0.9434     12.1343    0.0538      0.4991   0.3077      12.4681   1.0000
Wiener Filter       0.9410     12.28860   0.1806      0.4997   0.1429      12.2891   0.9895

Table 4.3 shows the performance values calculated for the different filters for the given input image of the DVD cover contaminated by Gaussian noise. These values are further analyzed with the help of the graphs.

4.3.3.1 Analysis of Calculated values

In the following figures the numbers 1, 2, 3, 4, 5, 6 and 7 represent the adaptive filter, average filter, maximum filter, median filter, minimum filter, trimmed filter and wiener filter respectively.
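The confusion-matrix-based metrics reported in these tables (accuracy, precision, F-Measure, NRM, specificity) and PSNR can be computed from a binarized output and its ground truth as below. This is a minimal NumPy sketch with hypothetical function names; DRD is omitted because it additionally requires a weighted neighbourhood distance matrix:

```python
import numpy as np

def binarization_metrics(gt, out):
    """gt, out: binary arrays where 1 = text (foreground), 0 = background."""
    gt = gt.astype(bool)
    out = out.astype(bool)
    tp = np.sum(gt & out)      # text pixels correctly detected
    tn = np.sum(~gt & ~out)    # background correctly rejected
    fp = np.sum(~gt & out)     # background wrongly marked as text
    fn = np.sum(gt & ~out)     # text pixels missed
    accuracy = (tp + tn) / gt.size
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    specificity = tn / (tn + fp) if tn + fp else 0.0
    # NRM: mean of the false-negative and false-positive rates (lower is better)
    nr_fn = fn / (fn + tp) if fn + tp else 0.0
    nr_fp = fp / (fp + tn) if fp + tn else 0.0
    nrm = (nr_fn + nr_fp) / 2
    return dict(accuracy=accuracy, precision=precision,
                f_measure=f_measure, nrm=nrm, specificity=specificity)

def psnr(ref, img, peak=255.0):
    """Peak signal-to-noise ratio in dB between two grayscale images."""
    mse = np.mean((ref.astype(float) - img.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)
```

With these definitions a higher accuracy, precision, F-Measure, specificity and PSNR are better, while a lower NRM and DRD are better, which is how the graphs below are read.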

Fig. 4.39: Accuracy of different filters for cover of DVD having Gaussian noise

Fig 4.39 shows that in terms of accuracy the median filter gives the best result in preserving the edges of text and the adaptive filter gives the worst result for the input image of the DVD cover. After the median filter, the filter which shows the next best output is the trimmed filter in comparison with the other filters.

Fig. 4.40: Distance reciprocal distortion metric of different filters for cover of DVD having Gaussian noise

Fig. 4.40 shows that the maximum filter provides the best visual quality and detects as much text as possible for the input image, whereas the adaptive filter gives the highest value of DRD, which means that it is not able to detect the text properly in the input image.

Fig. 4.41: F-Measure of different filters for cover of DVD having Gaussian noise

Fig. 4.41 shows that the minimum filter possesses the highest value of the F-Measure, which indicates that the binarized image and the input image are close to equivalent. It also implies that the precision and recall values of the binarized image are high.

Fig. 4.42: Negative rate metric of different filters for cover of DVD having Gaussian noise

Fig. 4.42 shows that for the input image of the DVD cover there is least pixel mismatch for the trimmed filter and maximum pixel mismatch for the adaptive filter.

Fig. 4.43: Precision of different filters for cover of DVD having Gaussian noise

Fig. 4.43 shows that the precision value for the trimmed filter is the highest. It implies that for the given input image the trimmed filter is more efficient for text extraction in binary form in comparison with the other filters.

Fig. 4.44: PSNR values of different filters for cover of DVD having Gaussian noise

Fig. 4.44 shows that for the given input image the median filter gives the highest PSNR value, which indicates good image quality, but in the process of removing the noise the median filter was not able to preserve the edges of the text in the output image.

Fig. 4.45: Specificity of different filters for cover of DVD having Gaussian noise

Fig. 4.45 shows that for the average, median and trimmed filters the proposed algorithm is able to reject the pixels of the input image which do not contain text data more accurately in comparison with the other filters.

4.3.4 Performance Analysis of Fourth Input Image


The fourth input image is a DVD cover which contains text information. The image has a background of different colors and text of different font sizes, alignments and shapes. The input image is corrupted and its quality is degraded by speckle noise (SN). Firstly we remove most of the noise by an effective filtering algorithm, and finally search for the text in the cover by the proposed algorithm and extract the text from the cover. Fig. 4.46 shows the input image.


Fig. 4.46: The input image having Speckle noise
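Speckle noise is multiplicative: each pixel is perturbed in proportion to its own intensity, J = I + I*n. A minimal NumPy sketch of this degradation model follows; MATLAB's imnoise(I,'speckle') uses uniformly distributed noise with variance 0.05, so the Gaussian variant and the function name below are assumptions for illustration:

```python
import numpy as np

def add_speckle(img, var=0.05, seed=0):
    """Multiplicative (speckle) noise: J = I + I*n with n ~ N(0, var).

    The image is scaled to [0, 1], degraded, clipped back into range,
    and returned as 8-bit grayscale.
    """
    rng = np.random.default_rng(seed)
    x = img.astype(float) / 255.0
    n = rng.normal(0.0, np.sqrt(var), size=x.shape)
    y = np.clip(x + x * n, 0.0, 1.0)
    return (y * 255).astype(np.uint8)
```

Because the perturbation scales with intensity, dark regions are barely affected while bright regions are strongly corrupted, which is why speckle behaves differently from additive Gaussian noise under the same filters.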

Following are the output images of text extraction which are obtained by applying
the different filtering methods on the proposed algorithm for the input image.

Fig. 4.47: Text extraction from input image having SN using adaptive filter

Fig. 4.47 shows the text extracted from the input image by using the adaptive filter method on the proposed algorithm.

Fig. 4.48: Text extraction from input image having SN using average filter

Fig. 4.48 shows the text extracted from the input image by using the average filter method on the proposed algorithm.

Fig. 4.49: Text extraction from input image having SN using maximum filter

Fig. 4.49 shows the text extracted from the input image by using the maximum filter method on the proposed algorithm.

Fig. 4.50: Text extraction from input image having SN using median filter

Fig. 4.50 shows the text extracted from the input image by using the median filter method on the proposed algorithm.

Fig. 4.51: Text extraction from input image having SN using minimum filter

Fig. 4.51 shows the text extracted from the input image by using the minimum filter method on the proposed algorithm.

Fig. 4.52: Text extraction from input image having SN using trimmed filter

Fig. 4.52 shows the text extracted from the input image by using the trimmed filter method on the proposed algorithm.

Fig. 4.53: Text extraction from input image having SN using wiener filter

Fig. 4.53 shows the text extracted from the input image by using the wiener filter method on the proposed algorithm.
Table 4.4: Summary of the result for input image having SN

Filtering Methods   Accuracy   DRD       F-Measure   NRM      Precision   PSNR      Specificity
Adaptive Filter     0.7132     12.2243   0.0124      0.5161   0.6385      5.4237    0.7351
Average Filter      0.9308     12.4802   0.0110      0.5000   0.7178      11.6019   1.0000
Maximum Filter      0.9594     9.4423    0.0453      0.4997   0.9134      12.9623   1.0000
Median Filter       0.9583     9.4602    0.0183      0.5000   0.9032      13.7968   0.9998
Minimum Filter      0.9460     9.8490    0.0389      0.5234   0.5321      13.7968   0.9998
Trimmed Filter      0.9529     12.1343   0.0434      0.5000   0.9130      13.2699   1.0000
Wiener Filter       0.9580     9.8500    0.0307      0.4997   0.8645      13.0413   0.9999

Table 4.4 shows the performance values calculated for the different filters for the given input image whose quality is degraded by SN. These values are further analyzed with the help of the graphs.

4.3.4.1 Analysis of Calculated values

In the following figures the numbers 1, 2, 3, 4, 5, 6 and 7 represent the adaptive filter, average filter, maximum filter, median filter, minimum filter, trimmed filter and wiener filter respectively.

Fig. 4.54: Accuracy of different filters for input image having SN

Fig 4.54 shows that in terms of accuracy the maximum filter gives the best result in preserving the edges of text and the adaptive filter gives the worst result for the input image. After the maximum filter, the filter which shows the next best output is the median filter in comparison with the other filters.

Fig. 4.55: Distance reciprocal distortion metric of different filters for input image having SN

Fig. 4.55 shows that the maximum filter provides the best visual quality and detects as much text as possible for the input image, whereas the adaptive filter gives among the highest values of DRD, which means that it is not able to detect the text properly in the input image.

Fig. 4.56: F-Measure of different filters for input image having SN

Fig. 4.56 shows that the maximum filter possesses the highest value of the F-Measure, which indicates that the binarized image and the input image are close to equivalent. It also implies that the precision and recall values of the binarized image are high.

Fig. 4.57: Negative rate metric of different filters for input image having SN

Fig. 4.57 shows that for the input image there is least pixel mismatch for the maximum filter and maximum pixel mismatch for the minimum filter.

Fig. 4.58: Precision of different filters for input image having SN

Fig. 4.58 shows that the values of precision for the maximum and trimmed filters are the highest. It implies that for the given input image the maximum and trimmed filters are more efficient for text extraction in binary form in comparison with the other filters.

Fig. 4.59: PSNR values of different filters for input image having SN

Fig. 4.59 shows that for the given input image the minimum filter gives a high PSNR value, which indicates good image quality, but in the process of removing the noise the minimum filter was not able to preserve the edges of the text in the output image.

Fig. 4.60: Specificity of different filters for input image having SN

Fig. 4.60 shows that for the average, maximum and trimmed filters the proposed algorithm is able to reject the pixels of the input image which do not contain text data more accurately in comparison with the other filters.

4.3.5 Performance Analysis for Fifth Input Image

In the fifth input image the text is written over a background which contains different colors. The input image is corrupted and its quality is degraded by salt and pepper noise (SPN). Firstly we remove most of the noise by an effective filtering algorithm, and finally search for the text in the image by the proposed algorithm and extract it.
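Salt and pepper noise replaces randomly chosen pixels with the extreme values 0 (pepper) and 255 (salt), which is why an order-statistic filter such as the median handles it well: an isolated impulse never reaches the middle of the sorted window. A minimal NumPy sketch of the degradation follows; the function name and density parameter are illustrative assumptions:

```python
import numpy as np

def add_salt_pepper(img, density=0.05, seed=0):
    """Flip roughly a `density` fraction of pixels to 0 or 255.

    One uniform random draw per pixel decides its fate: the lowest
    density/2 become pepper, the next density/2 become salt, and the
    rest are left untouched.
    """
    rng = np.random.default_rng(seed)
    out = img.copy()
    mask = rng.random(img.shape)
    out[mask < density / 2] = 0                            # pepper
    out[(mask >= density / 2) & (mask < density)] = 255    # salt
    return out
```

Because only a small fraction of pixels carry the corruption, averaging-type filters smear the impulses into their neighbourhoods, while rank-based filters can discard them outright.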

Fig. 4.61: The input image having salt and pepper noise

Following are the output images of text extraction which are obtained by applying
the different filtering methods on the proposed algorithm for the input image.

Fig. 4.62: Text extraction from input image using adaptive filter

Fig. 4.62 shows the text extracted from the input image by using the adaptive filter method on the proposed algorithm.

Fig. 4.63: Text extraction from input image using average filter

Fig. 4.63 shows the text extracted from the input image by using the average filter method on the proposed algorithm.

Fig. 4.64: Text extraction from input image using maximum filter

Fig. 4.64 shows the text extracted from the input image by using the maximum filter method on the proposed algorithm.

Fig. 4.65: Text extraction from input image using median filter

Fig. 4.65 shows the text extracted from the input image by using the median filter method on the proposed algorithm.

Fig. 4.66: Text extraction from input image using minimum filter

Fig. 4.66 shows the text extracted from the input image by using the minimum filter method on the proposed algorithm.

Fig. 4.67: Text extraction from input image using trimmed filter

Fig. 4.67 shows the text extracted from the input image by using the trimmed filter method on the proposed algorithm.

Fig. 4.68: Text extraction from input image using wiener filter

Fig. 4.68 shows the text extracted from the input image by using the wiener filter method on the proposed algorithm.
Table 4.5: Summary of the result for input image having SPN

Filtering Methods   Accuracy   DRD       F-Measure   NRM      Precision   PSNR      Specificity
Adaptive Filter     0.5868     26.4197   0.0744      0.6215   0.0402      3.8387    0.6850
Average Filter      0.9293     10.0979   0.0108      0.5000   0.9543      11.5052   0.7689
Maximum Filter      0.8229     25.2479   0.0086      0.5000   0.7654      7.5176    0.8999
Median Filter       0.9649     9.1618    0.1572      0.4996   1.000       13.0567   1.0000
Minimum Filter      0.7189     27.7578   0.0217      0.4999   0.6654      5.5108    0.4563
Trimmed Filter      0.9505     9.6301    0.0462      0.4999   0.9995      13.0567   1.0000
Wiener Filter       0.8313     17.1161   0.0045      0.5000   0.7655      7.72891   0.9992

Table 4.5 shows the performance values calculated for the different filters for the given input image whose quality is degraded by SPN. These values are further analyzed with the help of the graphs.

4.3.5.1 Analysis of calculated values

In the following figures the numbers 1, 2, 3, 4, 5, 6 and 7 represent the adaptive filter, average filter, maximum filter, median filter, minimum filter, trimmed filter and wiener filter respectively.

Fig. 4.69: Accuracy of different filters for input image having SPN

Fig 4.69 shows that in terms of accuracy the median filter gives the best result in preserving the edges of text and the adaptive filter gives the worst result for the input image. After the median filter, the filter which shows the next best output is the trimmed filter in comparison with the other filters.

Fig. 4.70: Distance reciprocal distortion metric of different filters for input image having SPN

Fig. 4.70 shows that the median filter provides the best visual quality and detects as much text as possible for the input image, and the minimum filter gives the highest value of DRD, which means that the minimum filter is not able to detect the text properly in the input image and is not effective for removing the noise.

Fig. 4.71: F-Measure of different filters for input image having SPN

Fig. 4.71 shows that the median filter possesses the highest value of the F-Measure, which indicates that the binarized image and the input image are nearly equal. The wiener filter possesses the lowest value of the F-Measure, which means the text is not clearly identified in it.

Fig. 4.72: Negative rate metric of different filters for input image having salt and pepper noise

Fig. 4.72 shows that for the input image there is least pixel mismatch for the median filter and maximum pixel mismatch for the adaptive filter.

Fig. 4.73: PSNR values of different filters for input image having SPN

Fig. 4.73 shows that for the given input image the median filter gives a high PSNR value, which indicates good image quality, but in the process of removing the noise the median filter was not able to preserve and detect the edges of small text in the output image.

Fig. 4.74: Specificity of different filters for input image having SPN

Fig. 4.74 shows that for the median and trimmed filters the proposed algorithm is able to reject the pixels of the input image which do not contain text data more accurately in comparison with the other filters.

Fig. 4.75: Precision of different filters for input image having SPN

Fig. 4.75 shows that the values of precision for the median and trimmed filters are the highest. It implies that for the given input image the median and trimmed filters are more efficient for text extraction in binary form in comparison with the other filters.

Chapter 5

Conclusion

This chapter briefly summarizes the key outcomes of the proposed scheme, the objectives that have been achieved, and some suggestions for the future scope of this work. The first section of the chapter provides a brief summary of everything covered so far. The second section describes the conclusions drawn from the achieved results, and the last section provides some suggestions for the future scope of this work.
5.1

Summary

This thesis considered various issues and challenges of digital image processing that prevent its widespread use in various applications. Denoising and extraction of text from images are the main issues of digital image processing, so this study concentrated on overcoming these issues in digital images. To achieve this objective, various design issues were studied and the previous research works in this direction were analyzed. A comparative study of various filtering algorithms such as the wiener, adaptive, average, minimum, maximum and median filters is also provided, which would be useful for other researchers working in the same field. After analyzing all the previous work, an algorithm is proposed for text extraction from images in which noise is removed from the input images using different filters and then the text is extracted from the filtered image. The output image contains the text in black and the background in white. The proposed algorithm has been simulated in MATLAB. Various experiments were conducted to analyze the performance of the filters on the input images, and the results have been calculated in accordance with the defined performance metrics.
5.2

Conclusion

In this thesis the analysis of filters on complex digital images for text extraction is based on the parameters accuracy, DRD, F-Measure, NRM, precision, PSNR and specificity. The simulation results from chapter 4 show that with the proposed algorithm the filters behave differently for different input images. For the input image with a bright background the minimum filter shows the best result for text extraction, whereas for the input image with a dull background the minimum filter shows the best output but the trimmed filter shows a better output for the simulation parameter DRD. The other three images are noise-affected images whose quality is degraded by Gaussian noise, SN and SPN respectively. For the image with Gaussian noise, the median filter shows an effective result. The maximum filter shows the best result for the image whose quality is degraded by SN. For the image with SPN, the median filter shows the best performance in terms of the simulation parameters.

The proposed method retains the useful textual information more accurately and thus has a wider range of applications compared with other conventional methods.

5.3

Future Scope

The results in this thesis provide a strong foundation for future work on the hardware design. All of the analysis presented in this thesis involved exhaustive simulations; the algorithm can be realized in a hardware implementation as future work.

This work has been limited to the wiener, adaptive, average, minimum, maximum and median filters, although the set of filters can be extended to other types, such as the hybrid filter, which is a combination of the wiener filter and the median filter, and the bilateral filter. Special attention must be paid to the threshold-calculating algorithm. A global threshold method has been used in this research; further work can be done using local and hybrid threshold methods. Better text extraction can be obtained if the best methods are selected for noise removal from the image and for threshold estimation.
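As an illustration of the hybrid filter suggested above, the following NumPy sketch cascades a local-statistics (Lee-type) Wiener stage, which suppresses Gaussian-like noise, with a 3x3 median stage, which removes residual impulses. All function names are hypothetical, and estimating the noise variance as the mean of the local variances is an assumption for illustration rather than the method used in this thesis:

```python
import numpy as np

def _local_stats(x, k):
    """Local mean and variance over a k x k window with edge padding."""
    pad = k // 2
    p = np.pad(x, pad, mode="edge")
    mean = np.empty_like(x)
    var = np.empty_like(x)
    h, w = x.shape
    for i in range(h):
        for j in range(w):
            win = p[i:i + k, j:j + k]
            mean[i, j] = win.mean()
            var[i, j] = win.var()
    return mean, var

def wiener_filter(x, k=3):
    """Adaptive (local-statistics) Wiener filter; the noise variance is
    estimated as the mean of the local variances, as in Lee's filter."""
    mean, var = _local_stats(x, k)
    noise = var.mean()
    gain = np.where(var > noise, (var - noise) / np.maximum(var, 1e-12), 0.0)
    return mean + gain * (x - mean)

def median3(x):
    """3 x 3 median filter with edge padding."""
    p = np.pad(x, 1, mode="edge")
    out = np.empty_like(x)
    h, w = x.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.median(p[i:i + 3, j:j + 3])
    return out

def hybrid_filter(img):
    """Wiener stage for Gaussian-like noise, then a median stage for
    residual impulse noise."""
    return median3(wiener_filter(np.asarray(img, dtype=float)))
```

The two stages are complementary: the Wiener stage adapts its smoothing to the local signal-to-noise ratio, and the median stage cleans up the impulses that a linear stage cannot remove.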
MATLAB is used for simulating all the results of the proposed algorithm. The same work can also be simulated in OpenCV-Python for further analysis and comparison.

