Document Image Segmentation Based On Gray Level Co-Occurrence Matrices and Feed Forward Neural Network

S. Audithan
Research Scholar, Dept of CSE, Annamalai University, Annamalai Nagar, Tamil Nadu, India
sarabar36@rediffmail.com

Dr. RM. Chandrasekaran
Professor, Dept of CSE, Annamalai University, Annamalai Nagar, Tamil Nadu, India
aurmc@sify.com
Abstract—This paper presents a new method for extracting the text region from document images by combining gray level co-occurrence matrices (GLCM) and artificial neural networks (ANN). GLCM features are used quantitatively to evaluate textural parameters and representations and to determine which parameter values and representations are best for extracting the text region. Text regions are detected by extracting statistical features from the GLCM of the document image; these features are then used as the input of a neural network for classification. Experimental results show that our method extracts text better than other methods.

Keywords: Document segmentation, GLCM, ANN, Haralick features
 
I. INTRODUCTION
The extraction of textual information from document images provides many useful applications in document analysis and understanding, such as optical character recognition, document retrieval, and compression. To date, many effective techniques have been developed for extracting characters from monochromatic document images. Document image segmentation is an important component of document image understanding.

An efficient and computationally fast method for segmenting the text and graphics parts of document images based on textural cues is presented in [1]. The segmentation method uses multi-scale wavelet analysis and statistical pattern recognition. M-band wavelets are used, which decompose an image into M x M band-pass channels.

Information from table of contents (TOC) pages can be extracted into a document database for effective retrieval of the required pages. Fully automatic identification and segmentation of TOC pages from scanned documents is discussed in [2].

Character segmentation is the first step of an OCR system, which seeks to decompose a document image into a sequence of sub-images of individual character symbols. Segmentation of monochromatic document images into four classes (background, photograph, text, and graph) is presented in [3]. The features used for classification are based on the distribution patterns of wavelet coefficients in high-frequency bands.

A probabilistic latent semantic analysis (pLSA) model is presented in [4]. The pLSA model was originally developed for topic discovery in text analysis using a "bag-of-words" document representation. The model is useful for image analysis with a "bag-of-visual-words" image representation. The performance of the method depends on the visual vocabulary generated by feature extraction from the document image.

Kernel-based methods have demonstrated excellent performance in a variety of pattern recognition problems. The application of kernel-based methods and Gabor wavelets to the segmentation of document images is presented in [5]. The feature images are derived from Gabor-filtered images. Taking the computational complexity into account, the sampled feature image is subjected to a Spectral Clustering Algorithm (SCA). The clustering results serve as training samples to train a Support Vector Machine (SVM).

The steerable pyramid transform is presented in [6]. The features extracted from the pyramid sub-bands serve to locate and classify regions into text and non-text in noise-infected, deformed, multilingual, multi-script document images. These documents contain tabular structures, logos, stamps, handwritten text blocks, photos, etc.

A novel scheme for the extraction of textual areas of an image using globally matched wavelet filters is presented in [7]. A clustering-based technique has been devised for estimating globally matched wavelet filters using a collection of ground-truth images. This work is extended to text extraction for the segmentation of document images into text, background, and picture components.

A classical approach to the segmentation of canonical syllables in Telugu document images is proposed in [8]. The model consists of zone separation and component extraction phases as independent parts. The relation between zones and components is established in the segmentation process of canonical syllables, and the segmentation efficiency of the proposed model is evaluated with respect to the canonical groups.

A new method for extracting characters from various real-life complex document images is presented in [9]. It applies a multi-plane segmentation technique to separate homogeneous objects, including text blocks, non-text graphical objects, and background textures, into individual object planes. It consists of two stages: automatic localized multilevel thresholding, and multi-plane region matching and assembling.

Two novel approaches for document image segmentation are presented in [10]. For text line segmentation a Viterbi algorithm is proposed, while an SVM-based metric is adopted to locate words in each text line.

Gray-level co-occurrence matrices (GLCM) are used in [11] to quantitatively evaluate textural parameters and representations and to determine which parameter values and representations are best for mapping sea ice texture. In addition, [11] presents three GLCM implementations and evaluates them with a supervised Bayesian classifier on sea ice textural contexts. Texture is one of the important characteristics used in identifying objects or regions of interest in an image, whether the image is a photomicrograph, an aerial photograph, or a satellite image. Some easily computable textural features are presented in [12].

(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 5, August 2010, p. 263, http://sites.google.com/site/ijcsis/, ISSN 1947-5500
 
II. METHODOLOGY

A. GLCM
The gray level co-occurrence matrix (GLCM) is the basis for the Haralick texture features [12]. This matrix is square with dimension Ng, where Ng is the number of gray levels in the image. Element [i, j] of the matrix is generated by counting the number of times a pixel with value i is adjacent to a pixel with value j, and then dividing the entire matrix by the total number of such comparisons made. Each entry is therefore the probability that a pixel with value i will be found adjacent to a pixel of value j.

Since adjacency can be defined to occur in each of four directions in a 2D, square-pixel image (horizontal, vertical, left and right diagonals), as shown in Figure 1, four such matrices can be calculated.

Figure 1: Four directions of adjacency as defined for calculation of the Haralick texture features.

The Haralick statistics are calculated for the co-occurrence matrices generated using each of these directions of adjacency. In our proposed system, 10 features based on the gray level co-occurrence matrix are selected for text extraction.
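As a concrete illustration (a sketch, not the authors' code), a minimal GLCM computation over the four adjacency directions of Figure 1 might look like the following, assuming the input image is already quantized so that every pixel value is below `levels`:

```python
import numpy as np

def glcm(img, dx, dy, levels):
    """Count co-occurrences of gray levels i and j at offset (dx, dy),
    then divide by the total number of comparisons so each entry is a
    probability. img must contain integers in [0, levels)."""
    m = np.zeros((levels, levels), dtype=float)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                m[img[y, x], img[y2, x2]] += 1
    total = m.sum()
    return m / total if total else m

# The four adjacency directions of Figure 1, as (dx, dy) offsets:
# 0 (horizontal), 45 and 135 (diagonals), 90 (vertical).
DIRECTIONS = {0: (1, 0), 45: (1, -1), 90: (0, -1), 135: (-1, -1)}
```

In practice one would compute a GLCM per direction and average the Haralick statistics, or keep the four matrices separate as directional features.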
B. ANN
An artificial neuron is a computational model inspired by natural neurons. Natural neurons receive signals through synapses located on the dendrites or membrane of the neuron. When the signals received are strong enough (surpassing a certain threshold), the neuron is activated and emits a signal through the axon. This signal might be sent to another synapse, and might activate other neurons. The complexity of real neurons is highly abstracted when modeling artificial neurons. These basically consist of inputs (like synapses), which are multiplied by weights (the strength of the respective signals) and then computed by a mathematical function that determines the activation of the neuron. Another function (which may be the identity) computes the output of the artificial neuron (sometimes depending on a certain threshold). ANNs combine artificial neurons in order to process information.
 
A single layer network has severe restrictions: the class of tasks that can be accomplished is very limited. This limitation is overcome by the two layer feed forward network. The central idea behind this solution is that the errors for the units of the hidden layer are determined by back-propagating the errors of the units of the output layer. For this reason the method is often called the back propagation learning rule. Back propagation can also be considered a generalization of the delta rule for nonlinear activation functions and multi layer networks.

A feed forward network has a layered structure. Each layer consists of units which receive their input from units in the layer directly below and send their output to units in the layer directly above; there are no connections within a layer. The N_i inputs are fed into the first layer of N_h,1 hidden units. The input units are merely "fan out" units; no processing takes place in these units. The activation of a hidden unit is a function F_i of the weighted inputs plus a bias. The output of the hidden units is distributed over the next layer of N_h,2 hidden units, and so on, until the last layer of hidden units, whose outputs are fed into a layer of N_o output units, as shown in Figure 2. In our proposed method, a feed forward neural network with one input layer, two hidden layers and one output layer is used.

Figure 2: A multi layer network with m layers of units (N_i inputs, hidden layers N_h,1 ... N_h,m-1, and N_o outputs).
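A minimal sketch of such a feed forward network with two hidden layers, trained by back-propagating the squared error. The layer sizes, learning rate, and sigmoid activation here are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class FeedForwardNet:
    """Fully connected feed forward net: sizes = [inputs, hidden1,
    hidden2, outputs]. Trained by back propagation (gradient descent
    on squared error)."""
    def __init__(self, sizes, lr=0.5, seed=0):
        rng = np.random.default_rng(seed)
        self.lr = lr
        self.W = [rng.normal(0.0, 0.5, (a, b))
                  for a, b in zip(sizes[:-1], sizes[1:])]
        self.b = [np.zeros(b) for b in sizes[1:]]

    def forward(self, x):
        # Input units only "fan out"; each later layer applies the
        # activation to the weighted inputs plus a bias.
        acts = [np.asarray(x, dtype=float)]
        for W, b in zip(self.W, self.b):
            acts.append(sigmoid(acts[-1] @ W + b))
        return acts

    def train_step(self, x, y):
        acts = self.forward(x)
        # Output-layer error term, then propagate it backwards.
        delta = (acts[-1] - y) * acts[-1] * (1.0 - acts[-1])
        for l in range(len(self.W) - 1, -1, -1):
            grad_W = np.outer(acts[l], delta)
            if l > 0:
                # Use the pre-update weights to back-propagate.
                delta_prev = (delta @ self.W[l].T) * acts[l] * (1.0 - acts[l])
            self.W[l] -= self.lr * grad_W
            self.b[l] -= self.lr * delta
            if l > 0:
                delta = delta_prev
```

Repeated calls to `train_step` drive the squared error on the training samples down, which is all the back propagation rule guarantees.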
III. PROPOSED SYSTEM
The block diagram of the proposed text extraction from document images is shown in Figure 3, where Figure 3(a) depicts the feature extraction phase and Figure 3(b) shows the classification phase.

Figure 3: Our proposed text extraction method. (a) Feature Extraction Phase (b) Classification Phase
 
A. Preprocessing

Image pre-processing is the term for operations on images at the lowest level of abstraction. The aim of pre-processing is an improvement of the image data that suppresses undesired distortions or enhances image features relevant for further processing and analysis. First, the given document image is converted into a gray scale image. Then Adaptive Histogram Equalization (AHE) is applied to enhance the contrast of the image. AHE computes the histogram of a local window centered at a given pixel to determine the mapping for that pixel, which provides a local contrast enhancement.
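A rough sketch of this preprocessing step. For brevity this version equalizes fixed tiles independently rather than a window centered at every pixel, so it only approximates true AHE (block artifacts can appear at tile borders); the tile size is an assumption:

```python
import numpy as np

def to_gray(rgb):
    """Luminance conversion using ITU-R BT.601 weights."""
    return (0.299 * rgb[..., 0] + 0.587 * rgb[..., 1]
            + 0.114 * rgb[..., 2]).astype(np.uint8)

def equalize(tile):
    """Ordinary histogram equalization of one tile."""
    hist = np.bincount(tile.ravel(), minlength=256)
    cdf = hist.cumsum() / tile.size
    return (cdf[tile] * 255).astype(np.uint8)

def adaptive_hist_eq(gray, tile=64):
    """Crude AHE: equalize each tile independently. Full AHE (or CLAHE)
    instead interpolates between neighboring tile mappings."""
    out = np.empty_like(gray)
    h, w = gray.shape
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            out[y:y+tile, x:x+tile] = equalize(gray[y:y+tile, x:x+tile])
    return out
```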
B. Feature Extraction

Feature extraction is an essential pre-processing step in pattern recognition and machine learning problems. It is often decomposed into feature construction and feature selection. In our approach, GLCM features are used as a feature vector to extract the text region from the document images. The following section gives an overview of feature extraction for the text region and the graphics/image part.

The input to the feature extraction module is the document image containing text and graphics/image parts. A 20 x 20 non-overlapping window is used to extract the features. The GLCM features are extracted from the text regions and the graphics parts and stored separately for the training phase in the classifier stage.

The following 10 GLCM features are selected for the feature extraction phase. Let P(i, j) be the (i, j)th entry in a normalized GLCM. The means and standard deviations for the rows and columns of the matrix are

μ_i = Σ_{i,j=0}^{N−1} i · P(i, j),    μ_j = Σ_{i,j=0}^{N−1} j · P(i, j)
σ_i = [ Σ_{i,j=0}^{N−1} (i − μ_i)² · P(i, j) ]^(1/2),    σ_j = [ Σ_{i,j=0}^{N−1} (j − μ_j)² · P(i, j) ]^(1/2)        (2)

where N is the number of gray levels in the image.
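The row/column statistics of Eq. (2), together with the 20 x 20 windowing, can be sketched as follows (illustrative only; it assumes a normalized GLCM `P` as defined above):

```python
import numpy as np

def glcm_stats(P):
    """Row/column means and standard deviations of a normalized GLCM
    (Eq. 2), as used by the cluster features below."""
    n = P.shape[0]
    i = np.arange(n)[:, None]   # row index, broadcast over columns
    j = np.arange(n)[None, :]   # column index, broadcast over rows
    mu_i = (i * P).sum()
    mu_j = (j * P).sum()
    sd_i = np.sqrt(((i - mu_i) ** 2 * P).sum())
    sd_j = np.sqrt(((j - mu_j) ** 2 * P).sum())
    return mu_i, mu_j, sd_i, sd_j

def windows(gray, size=20):
    """Yield the non-overlapping size x size blocks (20 x 20 in the
    paper) from which per-window GLCM features are extracted."""
    h, w = gray.shape
    for y in range(0, h - size + 1, size):
        for x in range(0, w - size + 1, size):
            yield gray[y:y+size, x:x+size]
```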
1) Contrast

To emphasize a large amount of contrast, the weights are chosen so that the calculation yields a larger value when there is great contrast:

Contrast = Σ_{i,j=0}^{N−1} (i − j)² · P(i, j)        (3)

When i and j are equal, the cell is on the diagonal and (i − j) = 0. These values represent pixels entirely similar to their neighbors, so they are given a weight of 0. If i and j differ by 1, there is a small contrast, and the weight is 1. If i and j differ by 2, contrast is increasing and the weight is 4. The weights continue to increase quadratically, as the square of (i − j).
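A direct transcription of Eq. (3), assuming a normalized GLCM `P`; the diagonal contributes weight 0, and cells one and two steps off the diagonal contribute weights 1 and 4:

```python
import numpy as np

def contrast(P):
    """Contrast (Eq. 3): weight each GLCM entry by the squared
    distance (i - j)^2 from the diagonal, i.e. 0, 1, 4, 9, ..."""
    n = P.shape[0]
    i, j = np.indices((n, n))
    return ((i - j) ** 2 * P).sum()
```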
2) Cluster Prominence

Cluster prominence represents the peakedness or flatness of the graph of the co-occurrence matrix with respect to values near the mean value:

Prominence = Σ_{i,j=0}^{N−1} (i + j − μ_i − μ_j)⁴ · P(i, j)        (4)

3) Cluster Shade

Cluster shade represents the lack of symmetry in an image and is defined by (5):

Shade = Σ_{i,j=0}^{N−1} (i + j − μ_i − μ_j)³ · P(i, j)        (5)
4) Dissimilarity

In the contrast measure, weights increase quadratically (0, 1, 4, 9, etc.) as one moves away from the diagonal. In the dissimilarity measure, weights increase linearly (0, 1, 2, 3, etc.):

Dissimilarity = Σ_{i,j=0}^{N−1} |i − j| · P(i, j)        (6)
5) Energy

Angular second moment (ASM) and energy use each P(i, j) as a weight for itself. High values of ASM or energy occur when the window is very orderly:

ASM = Σ_{i,j=0}^{N−1} P(i, j)²        (7)

The square root of the ASM is sometimes used as a texture measure, and is called energy:

Energy = sqrt(ASM)        (8)
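The remaining features above (Eqs. 4 to 8) reduce to a few array operations on the normalized GLCM; this is an illustrative sketch, not the authors' implementation:

```python
import numpy as np

def haralick_features(P):
    """Cluster prominence and shade, dissimilarity, ASM and energy
    (Eqs. 4-8) computed from a normalized GLCM P."""
    n = P.shape[0]
    i, j = np.indices((n, n))
    mu_i = (i * P).sum()
    mu_j = (j * P).sum()
    d = i + j - mu_i - mu_j
    prominence = (d ** 4 * P).sum()            # Eq. 4: peakedness
    shade = (d ** 3 * P).sum()                 # Eq. 5: asymmetry
    dissimilarity = (np.abs(i - j) * P).sum()  # Eq. 6: linear weights
    asm = (P ** 2).sum()                       # Eq. 7
    energy = np.sqrt(asm)                      # Eq. 8
    return prominence, shade, dissimilarity, asm, energy
```

For a symmetric GLCM the shade term vanishes, matching its role as a symmetry measure.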
[Figure 3 blocks: (a) Document Image -> Preprocessing -> GLCM Feature Extraction -> Feature Vector; (b) Document Image -> Preprocessing -> GLCM Feature Extraction -> Compare the Features with the Feature Vector -> Extracted Text Region]