You are on page 1of 13

See

discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/45898220

Kannada Character Recognition System A Review

Article · January 2010


Source: arXiv

CITATIONS READS

3 73

2 authors, including:

S. Sethu Selvi
M.S. Ramaiah Institute of Technology
10 PUBLICATIONS 57 CITATIONS

SEE PROFILE

All in-text references underlined in blue are linked to publications on ResearchGate, Available from: S. Sethu Selvi
letting you access and read them immediately. Retrieved on: 10 May 2016
Kannada Character Recognition System: A Review
1
K. Indira, 2S. Sethu Selvi
1
Assistant Professor, indm_1@yahoo.com
2
Professor and Head, selvi_selvan@yahoo.com
1,2
Department of Electronics and Communication Engineering
M. S. Ramaiah Institute of Technology, Bangalore – 560 054

Abstract
Intensive research has been done on optical character recognition ocr and a large number of articles have
been published on this topic during the last few decades. Many commercial OCR systems are now available
in the market, but most of these systems work for Roman, Chinese, Japanese and Arabic characters.
There are no sufficient number of works on Indian language character recognition especially Kannada
script among 12 major scripts in India. This paper presents a review of existing work on printed Kannada
script and their results. The characteristics of Kannada script and Kannada Character Recognition
System kcr are discussed in detail. Finally fusion at the classifier level is proposed to increase the
recognition accuracy.

Keywords: Kannada Script, Skew detection, Segmentation, Feature Extraction, Nearest neighbour
classifier, Bayesian classifier .

1. INTRODUCTION
Kannada, the official language of the south 16 = 19600. If each akshara is considered as a
Indian state of Karnataka, is spoken by about 48 separate category to be recognized, building a
million people. The Kannada alphabets were classifier to handle these classes is difficult. Most of
developed from the Kadamba and Chalaukya scripts, the aksharas are similar and differ only in additional
descendents of Brahmi which were used between the strokes, it is feasible to break the aksharas into their
5th and 7th centuries A.D. The basic structure of constituents and recognize these constituents
Kannada script is distinctly different from the Roman independently. The block diagram of a KCR System is
script. Unlike many north Indian languages, Kannada shown in Figure 1.
characters do not have shirorekha (a line that connects
all the characters of any word) and hence all the 2. PRE PROCESSING
characters in a word are isolated. This creates a In a KCR system, the sequences of preprocessing
difficulty in word segmentation. Kannada script is steps are as follows:
more complicated than English due to the presence of
compound characters. However, the concept of
upper/lower case characters is absent in this script. 2.1 Binarization
Modern Kannada has 51 base characters, called The page of text is scanned through a flat
as Varnamale. There are 16 vowels and 35 bed scanner at 300dpi resolution and binarized using
consonants. (Table 1). Consonants take modified a global threshold computed automatically based on a
shapes when added with vowels. When a consonant specific image [1]. A binary image is obtained by
character is used alone, it results in a dead consonant considering the character as ON pixels and the
(mula vyanjana). Vowel modifiers can appear to the background as OFF pixels. The binarized image is
right, top or at the bottom of the base consonant. processed to remove any skew so that text lines are
Table II shows a consonant modified by all the 16 aligned horizontally in the image.
vowels. Such consonant-vowel combinations are called
live consonant (gunithakshra). When two or more 2.2 Skew Detection
consonant conjuncts appear in the input they make a
consonant conjunct (Table III). The first consonant Methods to detect and correct the skew for the
takes the full form and the following consonant document containing Kannada script has not been
becomes half consonant. In addition, two, three or reported in any of the recognition methods for
four characters can generate a new complex shape Kannada script. However, the methods discussed here
called a compound character. briefly work on the document containing Kannada
The number of possible Consonant–Vowel script.
combinations is 35 x 16 = 560. The number of
consonant -consonant vowel combinations is 35 x 35 x 2.2.1 Projection profiles
32 Kannada Character Recognition System …

A projection profile is a histogram of the number Skewed document images are decomposed by the
of ON pixels accumulated along parallel sample lines wavelet transform [5]. The matrix containing the
taken through the document. The profile may be at any absolute values of horizontal subband coefficients
angle, but often it is taken horizontally along rows or which preserves the text horizontal structure, is then
vertically along columns, and these are called the rotated through a range of angles. A projection profile
horizontal and vertical projection profiles is computed at each angle, and the angle that
respectively. For a document whose text lines span maximizes a criterion function is regarded as the
horizontally, the horizontal projection profile will skew angle. This algorithm performs well on
have peaks whose widths are equal to the character document images of various layouts and is robust to
height and minimum height valleys whose widths different languages and hence can be used for
are equal to the between line spacing. The most documents containing Kannada script.
straight forward use of the projection profile for skew
detection is to compute it at a number of angles close 2.2.4 Wavelet decomposition and Hough Transform
to the expected orientations [2]. For each angle, a [3, 5]
measure is made of the variation in the bin heights The document image is decomposed by wavelet
along the profile, and the one with the maximum transform and the LL subband which preserves the
variation gives the skew angle. At the correct skew original image is rotated through a range of angles
angle, since scan lines are aligned to text lines the and Hough Transform is calculated at each angle. The
projection profile has maximum height peaks for text angle that maximizes the highest number of counts
and valleys between line spacing. Some of the corresponds to the skew angle of the text. This method
Indian scripts, such as Devanagiri, have a dark top is suitable for Kannada text scanned at 300dpi and is
line linking all the characters in a word. This strong faster compared to other methods.
linear feature can be used for projection profile based
method. In Kannada, such a line linking all 2.3 Skew Correction
characters of the word is not present. However, a
short horizontal line can usually be seen on top of Skew correction i s performed by rotating the
most of the characters. This feature can be exploited document through an angle –θ with respect to the
for skew estimation. horizontal line, where the detected angle of skew is
θ. In order to prevent the image being rotated off the
image plane, the skewed image is first translated to the
2.2.2 Hough Transform center and the new image dimensions are computed.
This transform [3] maps points from (x, y)
domain to curves in (ρ, θ) domain, where ρ is the 3. SEGMENTATION
perpendicular distance of the line from the origin and
θ is the angle between the perpendicular line and Segmentation is the process of extracting
horizontal axis in the (x, y) plane. Crossing curves in objects of interest from an image. The f i r s t
(ρ, θ) domain is the result of collinear pixels in (x, y) s t e p i n segmentation is detecting lines. The
plane. θ is varied between 0 and 1800 for each black subsequent steps are detecting the words in each line
pixel in a document image and ρ is calculated in and the individual characters in each word.
Hough space. The maximum value in (ρ, θ) space
is considered as t h e skew angle of t h e document 3.1 Line Segmentation
image. This method is time consuming due to the
mapping operation from (x, y) plane to (ρ, θ) plane for Kunte and Samuel describe horizontal and
all black pixels, especially for images containing non vertical projection profiles (HPP and VPP) [7] for line
text dominant area (i.e. pictures, graphs etc.). Le et and word detection respectively. The horizontal
al [4] improved the computational time by applying projection profile is the histogram of the number of
Hough transform only on t h e bottom pixels of ON pixels along every row of the image. White space
connected components belonging to the dominant text between text lines is used to segment the text lines.
Figure 2 shows a sample Kannada document along
area. This method gives an accuracy of 0.50 to the with its HPP. The projection profile has valleys of
original skew angle of the image. Hough transform zero height between the text lines. Line segmentation is
based methods are robust enough to estimate skew done at these points.
angles between -150 to 150. But they are Kumar and Ramakrishnan have reported
computationally expensive and sensitive to noise. segmentation techniques for Kannada script. The
bottom conjuncts of a line overlap with top matras
2.2.3 Wavelet decomposition and projection [8] of the following text lines in the projection
profiles profile. This results in nonzero valleys in the HPP.
InterJRI Science and Technology, Vol. 1, Issue 2, July 2009 33

These lines are called Kerned Text lines. To segment width of the gaps in a line. Hence, the threshold is
such lines, the statistics of the height of the lines are not fixed and i t has to be changed after performing
found from the HPP. Then the threshold is fixed at the histogram of a line. This method may not work for
1.6 times the average line height. This threshold is all types of document.
chosen based on experimentation of segmentation on a
large number of Kannada documents. Nonzero valleys 3.3 Character Segmentation
below the threshold indicate the locations of the text
line and those above the threshold correspond to the 3.3.1 Three Stage Character Segmentation
location of kerned text lines. The midpoint of a Kumar and Ramakrishnan [8] described a three-
nonzero valley of a kerned text line is the separator of stage character segmentation for separating Kannada
the line. Ashwin and Shastry have solved the problem characters from the segmented word. Three line
of overlapping of consonant conjunct of one line with segmentation of character involves the division of
the vowel modifier of the next line by extracting the each into three segments: Top zone consists of top
minima of the horizontal projection profile smoothed matras, middle zone consists of base and compound
by a simple moving average filter. The line breaks characters and bottom zone consists of consonant
obtained sometimes segment inside a line. Such false conjuncts. Head line and base line information is
breaks are removed by using statistics of line widths extracted from the Horizontal Projection Profile. Head
and the separation between lines. line refers to the index corresponding to maximum in
the top half of the profile base line refers to the index
3.2 Word Segmentation corresponding to maximum in the bottom half of the
profile. Using the baseline information, text region in
Kunte and Samuel have proposed word the middle and top zones of a word is extracted and its
segmentation by taking the vertical projection profile VPP is obtained. Zero valued valleys of this profile
of an input text line. For Kannada script, spacing are the separators for the characters as in Figure 5.
between the words is greater than the spacing between Sometimes, the part of a consonant conjunct in the
characters in a word. The spacing between the words middle zone is segmented as a separate symbol then,
is found by taking the vertical projection profile of an split the segmented character into a base character and
input text line. The width of zero valued valleys is a vowel modifier (top or right matra). The consonant
more between the words in line as compared to the conjuncts are segmented separately based the on
width of zero valued valleys that exist between connected component analysis (CCA).
characters in a word. This information is used to
separate words from the input text lines. Figure 3
shows a text line with its VPP. 3.3.2.1 Consonant Conjunct Segmentation
Kumar and Ramakrishnan have described that Knowledge based approach is used to separate
Kannada words do not have shirorekha and all the the consonant conjuncts. The spacing to the next
characters in a word are isolated. Further, the character in the middle zone is more for characters
character spacing is non-uniform due to the having consonant conjuncts than for t h e others. To
presence of consonant conjuncts. Thus, spacing detect the presence of conjuncts, a block of partial
between the base characters in the middle zone image in the bottom zone corresponding to the gap
becomes comparable to the word spacing. This could between adjacent characters in the middle zone is
affect the accuracy of word segmentation. Hence, considered. If the number of ON pixels in the
morphological dilation is used to connect all the partial image exceeds a threshold (for examples 15
characters in a word before performing word pixels), a consonant conjunct is detected. Sometimes,
segmentation. Each ON pixel in the original image is a part of the conjunct enters the middle zone between
dilated with a structuring element then VPP of the the adjacent characters such parts will be lost if the
dilated image is determined. The zero valued valleys conjunct is segmented only in the bottom zone. Thus,
in the profile of the dilated image separates the words in order to extract the entire conjunct CCA is used.
in the original image. Figure 4 illustrates the dilated However, in some cases the conjunct is connected to
image and VPP of a line. The accuracy of word the character in the middle zone causing difficulty in
segmentation depends upon the structuring element using CCA for segmenting the conjunct alone. This
and the type of structuring element has to be decided problem is solved by removing the character in the
by performing experiments on different text with middle zone before applying CCA. But the consonant
various fonts. Ashwin and Shastry have suggested to conjunct segmentation described in this paper will
adapt a threshold for each line of text to separate have discrepancies for some of the words as
interword gaps from inter-character gaps. Threshold depicted in Figure 6. The normal segmentation
has to be obtained by analyzing the histogram of the technique described in this paper would falsely
34 Kannada Character Recognition System …

recognize the consonant conjunct for both the


characters. This discrepancy can be solved by Stage 2:
isolating the bottom matra from the rest of the
characters. The height and width of the consonant a. Remaining characters from the word are
conjunct is determined. The position of the extreme extracted using VPP.
right ON pixel is marked and the next character is b. If subscripts are not present in a word then
scanned from the next column. By d o i n g this, the the characters from the word are extracted
consonant conjunct is recognized by the first using VPP in one stage itself.
character and not w i t h the second.
Thus for character segmentation it is first necessary
3.3.2.2 Vowel Modifier Segmentation to check whether there are any subscripts in a word.
For this, a Kannada word is divided into different
This includes segmentation of the top and right horizontal zones as described in Section 3.3.2.1
matras. The part of the character above the headline
in the top zone, is the top matra. Since the headline
and baseline of each character is known, if the 3.3.2.1 Zones in a Kannada word
aspect ratio of the segmented character in the A Kannada word can be divided into different
combined top and middle zone is more than 0.95, then horizontal zones. Two different cases are considered, a
it is checked for the presence of the right matra. word without subscripts as in Figure 7 and a word
For right matra segmentation, three subimages of with subscripts as in Figure 8. Consider the sample
the character is considered: whole character, head and word as in Figure 7 which does not have a subscript
tail images. The head image is the segment containing character. The imaginary horizontal line that passes
five rows of pixels starting from the head line through the top most pixel of the word is the top
downwards similarly, the tail image contains five line. Similarly, the horizontal line that is passing
rows downwards from the baseline. T h e VPP for through the bottom most pixel of the main character is
each of these images are determined. The index the base line. The horizontal line passing through the
corresponding to the maximum profile of the character first peak in the profile is the head line. The word can
image is determined say (P). The indices be divided into top and middle zones. Top zone is the
corresponding to the first zero values, immediately portion between the top and head line and the middle
after the index (P) in the profiles of head and tail zone is the portion between the head line and base line.
images, say b1 and b2 respectively, are determined. For words with conjunct-consonant characters, it
The break point is selected as the smaller of b1 and b2. is divided into three horizontal zones as in Figure 8 for
a sample word with subscripts. The word is divided
3.3.2 Two Stage Character Segmentation into top, middle and bottom zones. The top and
Kunte and Samuel have proposed a two stage middle zones are chosen similar to that of the word
method for segmentation of Kannada characters. As without subscripts. A bottom portion is chosen between
Kannada is a non-cursive script, the individual the baseline and the bottom line. The bottom line is
characters in a word are isolated. Spacing between the horizontal line passing through the bottom most
the characters can be used for segmentation. But pixel of the word.
sometimes in VPP of a word, there will be no zero Before character segmentation it is first
valued valleys, due to the presence of conjunct- necessary to find out whether the segmented word
consonant (subscripts) characters. The subscript has a subscript or not. This can be detected as follows:
character position overlaps with the two adjacent main
characters in vertical direction. i. In the horizontal projection profile as in Figure 7,
In these cases the usual method of vertical there are two peaks of approximately equal size
projection profile to separate characters is not possible. in t h e top and middle zones of the word.
In these cases the following two stage [7] approach is The absence of the third peak after the
used, second peak indicates that there are no
subscripts in the word.
Stage 1: ii. In the HPP as in Figure 8, there are two peaks
of approximately equal size in t h e top and
a. Check for the presence of subscripts in a word. middle zones of the word. Also, there is an
b. If subscripts are present, they are extracted occurrence of third peak after the second peak
first from the word using Connected in the bottom zone of the word, which is due to
Component method. the subscripts in the word.
InterJRI Science and Technology, Vol. 1, Issue 2, July 2009 35

Stage 2 Main character segmentation:


Thus, by checking the presence or absence of the The output of the first stage converts the word
third peak in the bottom zone of the horizontal with subscripts into a plain word without any
projection profile of the segmented Kannada word, it is subscript characters. Hence during the second stage,
possible to find out whether the segmented word has a the same method used for segmenting the characters
subscript or not. (for a word without subscripts) is followed for
segmenting the main characters.
3.3.2.2 Character Segmentation of a word
without Subscripts
Consider a Kannada word which does not have any
subscripts. There is zero valued valleys in the VPP of
the word which makes the character separation easier. 3.3.3. Over Segmentation and Merge approach
The portion of the image which lies between two
Ashwin and Sastry [1] have described the
successive zero valued valleys of the VPP is assumed to
segmentation and merge approach for segmentation of
be as a separate character and separated out.
a Kannada word. In this method the words are
vertically segmented into three zones. This
3.3.2.3 Character Segmentation of a word segmentation is achieved by analyzing the HPP of a
having subscripts word. Separating the middle zone from the bottom
Consider a sample Kannada word as in Figure zone is easier as the consonant conjuncts are
8, which contains subscripts. If VPP of this word is disconnected from the base consonant. Separating the
considered, then there will be no zero valued top zone from the middle zone is difficult and there
are situations where the top zone may contain some
valleys between the first character , its subscript of the consonant or the middle zone may contain a
character and also for the third character little bit of the top vowel modifier. These inaccuracies
and its subscript. Hence, just the zero valued valleys are taken care by training the pattern classifier. Then
of the vertical projection do not determine the the three zones are segmented horizontally. The
character separation. The individual characters in middle zone is first over segmented by extracting
this case are separated in two stages as follows: points in the vertical projection showing valleys in the
In the first stage, the subscripts of the word are histogram exceeding a fixed threshold. The threshold
separated. In the second stage, from the plain word the is kept low so that a large number of segments are
individual characters are extracted using VPP. obtained. This segmentation does not give consistent
segments, these segments are merged using heuristic
Stage 1 Subscript Character Segmentation merging algorithm and recognition based algorithm.
Consider a sample word as shown in Figure 8. In the three stage segmentation technique
The total height of the word in terms of number of proposed in [8], a character is decomposed into
rows (H) is calculated. The columns of the word are base character, vowel modifier and consonant
scanned from left to right. Every column is scanned conjuncts. Features are extracted from individual
from bottom to top to find the presence of an ON pixel constituents of a character and each of the
P. When such an ON pixel is found, the number of constituents are classified and then merged to form
rows that has gone up (L) is counted. If L is less a class label. With this segmentation approach, the
than or equal to some threshold value, the pixel P is design of the classifier is complex. But the number of
assumed to be one of the points of the subscript classes required to classify a character is 102
character. Then using P as initial point, (vowels (16), consonants (35), vowel modifier
connected component algorithm [9] is applied to (16), consonant conjuncts (35), 16+35+16+35 =
extract the subscript character at that position. 102).
Threshold value is calculated by finding the position In the two stage segmentation procedure
of the valley between the second peak and the third proposed in [7], the segmentation algorithm is simple
peak which is below the base line in the bottom zone of and the design of the classifier is also simple as there
the word. The scanning process is repeated till the end is no need to merge the segmented symbols to form a
of the word (right most column) to extract all the class label. With this approach the number of
subscript characters present in the word. At the end of classes required to classify a character is 646
Stage 1, after separating subscripts what remains is a (vowels (16), consonants (35), consonant vowel
plain word without having any subscript characters modifier (35 x 16 = 560), and consonant conjuncts
as in Figure 7. (35)).
In the over-segmentation and merge approach
36 Kannada Character Recognition System …

[1], the segmentation algorithm does not give


consistent segments and the segments may differ
depending on the font and size of the characters and
also gives a large number of small segments and K – Nearest Neighbour Classification [13]:
cannot be merged using the merging algorithm. Given a set of prototype vectors,
T XY = {(x1, y1), (x2, y2),… (xl, yl)},

4. FEATURE EXTRACTION The input vectors being x i X R and corresponding


Features are a set of numbers that capture the targets being y i Y {1,2,...c} Let Rn ( x) {x':|| x x'|| r
salient characteristics of the segmented image. As font 2
} be a ball centered in the vector x in which K
and size independent recognition is required, template n
prototype vectors x i , i {1,2,...K } lie ie. | xi : x i R
matching is not advisable for recognition of
( x) | K . The K nearest neighbour classification rule
segments. Segmented symbols are unequal in size and
they are normalized to one size so that the classifier is q: X Y is define as q( x) arg max v( x, y) , where
immune to size changes in the characters. There are v(x,y) is the number of prototype vectors xi with
different features proposed for character recognition targets yi =y, which lie in the ball xi R ( x)
[10]. In spatial domain method Hu’s invariant
moments and Zernike moments [11] are used to Back Propagation Network [BPN]: This network is a
represent the segmented Kannada character. In multilayer perceptron with input layer, one or more
frequency domain methods [8] Discrete Cosine hidden layer and output layer. BPN [8] is trained in
Transform (DCT), Discrete Wavelet Transform batch mode using supervised learning, employing log
(DWT) and Karhunen Louve Transform (KLT) are sigmoidal activation function. The input is normalized
used as features for recognizing Kannada characters. to a range of 0 to 1 to meet the requirements of the
Ashwin and Sastry [1] have used a set of features activation function before training.
by splitting each segment image into a number of
zones. Kumar and Ramakrishnan describe a subspace Radial Basis function Network (RBF): This is a three
projection for base characters. This subspace layer network consisting of input, hidden and output
projection represents higher dimensional data in layers. The radial basis functions are centered on
lower dimensional space by projecting them onto the each training pattern and the layer biases are all
subspace spanned by the eigen vectors corresponding kept constant depending on the spread of the Gaussian.
to significant eigen values of the covariance matrix.
Since characters in Kannada have a rounded Support Vector Machine (SVM): The SVM
appearance, the distribution of pixels in the radial and classifier is a two class classifier based on the
angular directions is considered. Table IV summarizes discriminant functions. A discriminant function
the various feature representations for Kannada represents a surface, which separates the patterns as
characters. two classes. For OCR applications a number of two
class classifiers are trained with each one
5. CHARACTER CLASSIFICATION distinguishing one class from the other. Each class
label has an associated SVM and a test example is
The feature vector extracted from the segmented assigned to the label of the class whose SVM gives
and normalized character has to be assigned a label the largest positive output. The example is rejected if no
using a character classifier. Methods for designing SVM gives a positive output.
a character classifier are Bayes classifier based on Table V summarises the nearest neighbour
density estimation, nearest neighbour classifier based classifier on various features. This classifier
on a prototype, linear discriminant functions and performance is 92.86% for spatial domain features
neural networks. The data set is divided into training set and 98.83% for frequency domain features using
and test set for each character. DWT (Haar) for the base character. Thus the
recognition rate of a nearest neighbour classifier for
Nearest Neighbour Classifier: frequency domain features is higher compared to
The Euclidean distance [13] of the test pattern spatial domain features and is true for vowel modifier
to all the vectors in the training pattern is computed and consonant conjuncts. Table VI summarizes the
and the test pattern is assigned to the class of the neural network classifier on various features.
sample that has the minimum Euclidean Distance. Performance of BPN with frequency domain features
is 95.07 % on an average for base character. BPN
network is not used for spatial domain features. Thus,
InterJRI Science and Technology, Vol. 1, Issue 2, July 2009 37

the performance of nearest neighbour classifier is presegmented characters and hence a complete system
better than BPN network. RBF [8] with Zernike for recognition of Kannada text is not designed.
moments, the recognition rate is 96.8%. With DWT In [6] and [8], to increase the recognition accuracy
and with structural features the recognition rate is structural features are included for disambiguating
99.1% for the base character. RBF is used for spatial confused characters. Results presented in [11] and [14]
domain and frequency domain features. Thus are based only on base characters and consonant
performance of RBF network is higher than NN and conjuncts. A complete KCR system is not proposed.
BPN classifier and is true for vowel modifier and Curvelets give very good representation of edges
consonant conjuncts. SVM [1] with Zernike features, in an image, it has high directional sensitivity and are
the recognition rate is 92.6%. With modified structural highly anisotropic. Hence, can be used to represent
features, the recognition rate is 93.3% for the base Kannada characters for classification. Different types
character. Performance of SVM with spatial domain of features can be used to represent a character.
features is higher than NN classifier for the base Each of the feature vector is analysed using a
character. Comparing the work proposed in [1], [8] classifier. Multiple categories [17] of the features
and [14] RBF gives highest recognition rate of 99% are combined into a single feature vector and a
using Haar features. final classifier provides the decision of the class of the
test character.
6. CONCLUSION AND FUTURE DIRECTION Table 1. Vowels and Consonant
In this paper, a complete character recognition
system for printed Kannada documents is
d i s c u s s e d in detail. Different segmentation
techniques and various classifiers with different
features are also discussed. Different segmentation
methods lead to different classifier design, as it Table 2. Consonant – Vowel combination
depends on the number of classes at the output stage of
a classifier.
Results presented in [1] are based on scanning
Kannada texts from magazines and textbooks at 300
dpi. The training and test patterns were generated
from the same text and was ensured that these
patterns are disjoint. SVM classifier is used in the
recognition stage, and this type of two class
classifier becomes complicated for a multi-class
problem. This multi-class classification problem can
be solved [16] by using pair wise based SVM
classifier, fuzzy pair wise SVM and directed acyclic
graph (DAG).
To improve recognition performance, a
recognition based segmentation method uses dynamic
wide length to provide segmentation points which
are confirmed by the recognition stage. In [8] and
[14] DWT, DCT, KLT, Hu’s invariant moments
and Zernike moments are used to represent each
character and is classified using NN, BPN and RBF
network. Results presented in [8] are based on
38 Kannada Character Recognition System …

Table 3. Consonant Conjunct

KANNADA
TEXT
DOCUMENT Recognized o/p

PRE LINE, WORD FEATURE CLASSI


PROCESSING CHARACTER EXTRACTION FIER
SEGMENTATION

Figure 1. Block Diagram of a KCR system

Figure 2. Text lines Figure 3. Text lines Figure 4. Dilated image


with HPP with VPP and VPP of line 1
InterJRI Science and Technology, Vol. 1, Issue 2, July 2009 39

Figure 5 (a) Text line (b) VPP (c) Middle and top zone (d) VPP of (c)

Figure 6. Discrepancy in Character Figure 7. Two horizontal zones in a


Segmentation word without subscripts

Figure 8. Three horizontal zones in a word with subscripts

Table 4. Different Feature Extraction Techniques

Sl
Publication Feature Extraction
No
Kumar and DCT, DWT
1.
Ramakrishnan [8] ( Haar, Db2), KLT.
Ashwin and Division of segments into
2. tracks and sectors.
Sastry [1]
Kunte and Hu’s invariant moments,
3. Zernike moments.
Samuel [11]
Kunte and Fourier and
4. Wavelet Descriptors
Samuel [11]
Kunte and Contour extraction and
5. wavelet transform.
Samuel [11]
Kumar and Subspace
6.
Ramakrishnan [6] projection.
40 Kannada Character Recognition System …

Table 5: Nearest Neighbor classifier on various features

%
Sl. Publi Feature Recognition
Symbol Features Recognition
No cation dimension time(min)
Rate
16 (4 x 4) 93.80 0.60
DCT 64 (8 x 8) 98.70 1.23
144 (12 x 12) 98.27 2.11
Base 40 98.70 0.92
1 [8]
character KLT 50 98.55 1.13
60 98.77 1.32
DWT (Haar) 64 (8 x 8) 98.83 2.65
DWT (db2) 100 (10 x 10) 98.55 3.53
Zernike 7 92.66 -
Structural 48 92.76 -
(equally spaced
radial
Tracks & sectors)
Base Modified 48 93.28
2 [1]
character structural
(adaptively
spaced for
Equal ON
Pixels in each
annular region)
Haar (db1) 64 94.20 0.56
Top and
right matra DCT 64 93.04 0.43
3 [8]
Consonant Haar (db1) 64 96.61 0.37
conjuncts DCT 64 96.7 0.21
Zern like 7 88.13 -
Top matra Structural 48 86.91 -
Modified 48 88.28
4 [1] Consonant structural
Zernike 7 92.22
conjuncts Structural 48 89.27
Modified 48 92.76
5 [6] Base structural
Subspace 60 94.5 1.69
character Projection
InterJRI Science and Technology, Vol. 1, Issue 2, July 2009 41

Table 6. Neural Network classifier on various features

Features Feature No of Spread of Type of % Type of % Recog


Sl. Publi Dimen Hidden Gaus sian Neural Recog neural nition
Symbol
No cation sion Neurons Network nition network rate
rate
RBF
with
DWT 64 200 10 RBF 96 structur 97.5
al
Base features
1 [6]
character RBF
with
64 1200 10 RBF 97.3 structur 99.1
al
features
Haar 20 4 96.1 69.2
16 25 11 96.3 99.0
2 [8] RBF
DCT 16 20 4 BPN 95.9 52.9
Base 25 11 95.7 98.9
character Zernike 7 92.6
Structural 48 SVM 93.8
3 [1]
Modified 48 93.3
structural
Haar 64 - - - 96.8
Top
DCT 64 -- 10 --- --- RBF 96.8
and Right
-
Matras
[8] Zernike 7 88.4
4 Structural 48 SVM
88.0
Top matra _
Modified 48
structural 87.2
Haar 64 50 10 95.1 95.7
DCT 64 50 10 BPN 93.8 RBF 95.5
Consonant Zernike 7 91.9
5 [8] SVM
Conjunct Structural 48 93.8
Modified 48 94.9
Structural

Equal to
number
Hu’s moments 7 of RBF 82
training
Base samples
6 [11]
Character
Zernike 7 RBF 91.8
moments

10 96.8

Base
DWT 120 60 BPN 92
Character
7 [14] --- ----
Consonant
94
conjuncts
42 Kannada Character Recognition System …

REFERENCES Recognition of Handwritten Kannada numerals”,


[1] Ashwin T.V and P.S Sastry, “A font and size Proceedings of World Academy of Science Eng
independent OCR system for printed Kannada and Technology, vol. 32, Aug 2008.
using SVM”, Sadhana, vol. 27, Part 1, February 2002,
pp. 35–58. BIOGRAPHY
[2] Postl. W, “Detection of linear oblique structures and Dr. S. Sethu Selvi is a Professor and Head of department of
skew scan in digitized documents”, ICPR, IEEE CS, Electronics and Communication Engineering, MSRIT,
1986. Bangalore. She has published 15 papers in International,
[3] Srihari S. N and Govindraju. V. “Analysis of National journals and conference proceedings. Her areas of
textual images using the Hough transform”, interest are Digital Signal Processing, Digital Image
Technical Report, Dept of Computer Science, SUNY Processing, and Digital Signal Compression.
Buffalo, New York, April 1988.
[4] Le D.S, Thomas G.R and Wechster. H. “Automated K. Indira is an Assistant Professor in t h e department o f
page orientation and skew angle determination for Electronics and Communication Engineering, MSRIT,
binary document images”, Pattern Recognition, vol. Bangalore. Pursuing Ph.D. in Image Processing under VTU.
27, pp. 1325-1344, 1994. She has published 6 papers in National and International
[5] Shutao Li, Qinghuashen, Junsun, “Skew detection Conference Proceedings. Her areas of interests are Digital
using wavelet decomposition and projection profile Signal Processing and Digital Image Processing.
analysis”, PR letters, vol. 28, pp. 555-562, 2007.
[6] B. Vijay Kumar and A. G. Ramakrishnan, “Radial
Basis Function and subspace approach for Printed
Kannada Text Recognition”, ICASSP-2004, pp. 321-
324.
[7] R. Sanjeev Kunte, Sudhaker Samuel R. D “A Two
stage Character Segmentation Technique for Printed
Kannada Text”, GVIP Special Issue on Image
Sampling and Segmentation, March 2006.
[8] B. Vijay Kumar and A. G. Ramakrishnan, “Machine
Recognition of Printed Kannada text”
[9] Gonzalez and Woods “Digital Image Processing”
[10] Anil. K. Jain, “Feature Extraction methods for
Character Recognition – A survey”
[11] R. Sanjeev Kunte and R. D. Sudhaker Samuel, “A
simple and efficient optical character recognition
system for basic symbols in Printed Kannada Text”,
Sadhana, vol. 32, Part 5, October 2007, pp. 521–533.
[12] R. Sanjeev Kunte, Sudhaker Samuel R. D,
“Fourier and wavelet shape descriptors for
efficient Character Recognition”, First International
Conference on Signal and Image Processing
[13] B. V. Dasarathy, “Nearest neighbor pattern
classification techniques”, IEEE Computer Society
Press, New York 1991.
[14] R. Sanjeev Kunte, Sudhaker Samuel R. D, “An OCR
system for printed Kannada Text using two stage
Multi network Classification approach
employing Wavelet features”, International
Conference on Computational Intelligence and
Multimedia Applications 2007, pp. 349 -355.
[15] A. Cheung M. Bennamoun and N. W. Bergmann,
“An Arabic Optical Character Recognition system
using Recognition based Segmentation”, Pattern
Recognition, vol. 34, no. 2, pp. 215-233, Feb 2001.

[16] Guangzhou, “Hand written Similar Chinese character


recognition based on multiclass pair wise SVM”,
Proceedings of Fourth International Conference on
Machine Learning and Cybernetics, Aug 2005.
[17] Dinesh Acharya U, N. V. Subba Reddy, Krishna
Moorth Makkithaja, “Multilevel classifiers in the

You might also like