
ISSN: 2312-7694

Asma et al. / International Journal of Computer and Communication System Engineering (IJCCSE)

Extraction of Arabic Text Regions From Images


Asma Andleeb
asmaandleeb77@gmail.com

Mehreen Sirshar
m.sirshar@gmail.com

Abstract: This paper presents a methodology for extracting Arabic text in Khat-e-Nastaleeq from images. Important and useful information is present in text-containing images, and text extraction from images can be used in different applications such as document retrieval and object identification. In this paper we employ the connected component algorithm for the robust extraction of text from images. A grayscale or colored image can be taken as input, but colored images require preprocessing. Processing is performed on only one ayah (Bismillah), which must be in Khat-e-Nastaleeq. The algorithm works across different colors, font sizes and background complexities. Extraction of the information includes edge detection, localization, enhancement and segmentation.

I. INTRODUCTION

In the era of digital technology, databases are comprised of multimedia data that usually contains textual information along with images and videos. Textual information is very important for fully describing an image and increasing its understandability. Great interest has been shown in the field of text extraction in recent years, but little work has been done on Arabic text extraction. It is very complex to deal with the different writing styles of Arabic. Arabic has different texture characteristics compared to Chinese, Latin and English. The existence of a baseline is the distinctive feature of Arabic script, as most of the information is present in this section. In this paper we deal with one writing style, i.e. Khat-e-Nastaleeq. Text extraction can be used for indexing and classification purposes. J. Gllavata et al. [1] present a connected component approach for text extraction; their methodology works on the Y channel of the YUV image and applies different morphological operations to the image for text extraction. K. Jung et al. [5] state that images and videos contain embedded text that is difficult to extract due to different font styles, colors, orientations and complex backgrounds. N. Gupta et al. [6] present text extraction using a discrete wavelet transform methodology. M. B. Halima et al. [4] describe the extraction of Arabic text from news videos using text enhancement and normalization. S. Yousfi et al. [3] provide a comparison of three techniques for the extraction of Arabic text from videos and images; these methods are based on machine learning algorithms. C. P. Sumathi et al. [2] present a survey of various approaches to text extraction and compare them to find the most effective one. X. Liu et al. [7] propose an algorithm that uses the attention mechanism of visual perception for text region detection.
2014, IJCCSE All Rights Reserved
Vol. 02 No.01 February 2015
www.ijccse.com

The rest of the paper is organized as follows. Section II describes the methodology used in the project, Section III presents the results, and Section IV concludes the paper and outlines future work.

II. METHODOLOGY

In this section the processing steps of the connected component algorithm for the extraction of Arabic text are explained. We work under the assumptions that the text is in Khat-e-Nastaleeq and that the orientation is frontal, and we process a single ayah (Bismillah). The steps of the algorithm are:
1. In preprocessing, convert the input RGB image into the YUV color space. The Y channel carries luminance, while the U and V channels contain the color information. Since the text region has high contrast compared to the rest of the image, the Y channel (luminance) is extracted from the image and used for further processing. The resultant image is grayscale.

Figure 1: Y-Channel
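Step 1 can be sketched in a few lines. This is a minimal sketch, not the paper's implementation: the paper does not specify the RGB-to-YUV conversion matrix, so the standard BT.601 luma weights are assumed here.

```python
import numpy as np

def y_channel(rgb):
    """Extract the luminance (Y) channel from an RGB image.

    Assumes the BT.601 RGB->YUV weights (Y = 0.299 R + 0.587 G + 0.114 B);
    the paper does not state which conversion matrix it uses.
    rgb: H x W x 3 array with values in [0, 255].
    Returns an H x W grayscale (luminance) array.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
```

A pure red pixel, for instance, maps to a luminance of about 76 out of 255, reflecting how little red contributes to perceived brightness.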

2. Convert the grayscale image into an edge image and sharpen it using sharpening filters. For conversion to the edge image, every pixel is assigned a weight with respect to its neighbors in the left, upper and upper-right directions, and the resultant image is sharpened to increase the contrast of the text against the background, making the text easier to extract. The algorithm [1] for computing the edge image is:

    Initialize left, upper, upper-right = 0
    For all pixels (x, y) in the grayscale image G:
        left        = G(x, y) - G(x-1, y)
        upper       = G(x, y) - G(x, y-1)
        upper-right = G(x, y) - G(x+1, y-1)
        Edge(x, y)  = max(left, upper, upper-right)
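The pseudocode above translates directly into Python. This is a sketch of the scheme in [1] as described here; the handling of border pixels (where a neighbor is missing) is an assumption, since the paper does not say how borders are treated.

```python
import numpy as np

def edge_image(gray):
    """Edge image per step 2: for each pixel, take the maximum of the
    differences to its left, upper and upper-right neighbors.

    Border pixels whose neighbors fall outside the image are left at 0
    (an assumption; the paper does not specify border handling).
    gray: 2-D grayscale array G, indexed as gray[row, col].
    """
    g = np.asarray(gray, dtype=np.float64)
    h, w = g.shape
    edge = np.zeros_like(g)
    for y in range(1, h):            # skip the first row (no upper neighbor)
        for x in range(1, w - 1):    # skip first/last column
            left = g[y, x] - g[y, x - 1]
            upper = g[y, x] - g[y - 1, x]
            upper_right = g[y, x] - g[y - 1, x + 1]
            edge[y, x] = max(left, upper, upper_right)
    return edge
```

A single bright pixel on a dark background yields an edge response equal to its intensity difference from the neighbors, which is exactly the contrast the subsequent sharpening step amplifies.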

Figure 2: Sharpened Edge Image

3. Using appropriate threshold values, compute horizontal and vertical projections through histograms in order to localize the text. The horizontal projection is computed by adding all the pixels in each row, while the vertical projection is computed by adding all the pixels in each column. A histogram is plotted to compute the projections; as the text is at high contrast, high values are expected in the horizontal projection. After computing the horizontal and vertical projection images, add both images to obtain the localized text image. The threshold values are given by the following equations:
a. Tx = Mean(horizontal projection profile) / 20
b. Ty = Mean(vertical projection profile) + Max(vertical projection profile) / 10

Figure 3: Horizontal Projection Histogram and localized text image

4. For the elimination of non-text regions, geometric properties of the text are used. Regions having minor and major axes less than 10 are taken for further processing. To enhance the text region against a plain background, the edge image is binarized.

Figure 4: Binary image

5. If any pixel in the binary image is surrounded by black pixels in the vertical, horizontal and diagonal directions, it is substituted with the background value [1]. This is known as gap filling. The image is then subjected to morphological operations to eliminate non-text areas from the output image.

Figure 5: Final result

III. EXPERIMENTAL RESULTS

The algorithm gives 60% precision for Khat-e-Nastaleeq, with an error rate of 40%, while the recall rate is 70%. The program is not applicable to other Arabic writing styles, as it does not give accurate results for them. The metrics are defined as:
Precision rate = Correct Detected / (Correct Detected + False Positives) [1]
Recall rate = Correct Detected / (Correct Detected + Missed Lines) [1]
where false positives (FP), or false alarms, are regions in the image which are not actually characters of text but have been detected by the algorithm as text [2].

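As a rough illustration, the projection-based localization of step 3 and the precision/recall definitions above can be sketched in Python. This is a simplified sketch, not the paper's implementation: only the Tx/Ty threshold equations come from the text, so the summation and the rule for combining the two projections into a mask are assumptions, and the counts in the example are illustrative rather than the paper's data.

```python
import numpy as np

def localize_text(edge):
    """Localize text via projection profiles (sketch of step 3).

    edge: 2-D edge-image array.  Returns a boolean mask keeping pixels
    that lie in both a high-valued row and a high-valued column.  The
    combination rule is an assumption; the paper specifies only the
    Tx/Ty threshold equations.
    """
    hp = edge.sum(axis=1)            # horizontal projection (row sums)
    vp = edge.sum(axis=0)            # vertical projection (column sums)
    tx = hp.mean() / 20              # Tx = Mean(horizontal profile) / 20
    ty = vp.mean() + vp.max() / 10   # Ty = Mean(vertical) + Max(vertical)/10
    return np.outer(hp > tx, vp > ty)

def precision_rate(correct_detected, false_positives):
    # Precision = Correct Detected / (Correct Detected + False Positives)
    return correct_detected / (correct_detected + false_positives)

def recall_rate(correct_detected, missed_lines):
    # Recall = Correct Detected / (Correct Detected + Missed Lines)
    return correct_detected / (correct_detected + missed_lines)
```

With illustrative counts, 6 correct detections against 4 false positives yield the reported 60% precision, and 7 correct detections against 3 missed lines yield the 70% recall.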



IV. CONCLUSION AND FUTURE WORK

In this paper, a simple and effective algorithm for Arabic text detection is presented. The algorithm extracts text automatically from images with complex backgrounds and provides a robust approach to extraction across different colors and alignments. The main reason previous techniques fail is that they are not effective for very small characters or characters that are not well aligned. The program is written for the detection of only one Arabic style, namely Khat-e-Nastaleeq.

In this paper Arabic text is extracted from images; in the future, the extraction of text from videos can be addressed, and existing OCR techniques could then use the extracted text as input without any preprocessing. Likewise, only Khat-e-Nastaleeq is processed here, while text extraction for other writing styles of Arabic can be done in future work.


References
[1] Julinda Gllavata, Ralph Ewerth and Bernd Freisleben, "A Robust Algorithm for Text Detection in Images", 2007.
[2] C. P. Sumathi, T. Santhanam and G. Gayathri Devi, "A Survey on Various Approaches to Text Extraction in Images", August 2012.
[3] Sonia Yousfi, Sid-Ahmed Berrani and Christophe Garcia, "Arabic Text Detection from Videos", 2013.
[4] Mohamed Ben Halima, Hichem Karray and Adel M. Alimi, "Arabic Text Recognition in Video Sequences", 2010.
[5] Keechul Jung, Kwang In Kim and Anil K. Jain, "Text Information Extraction in Images and Video: A Survey", 2004.
[6] Neha Gupta and V. K. Banga, "Image Segmentation for Text Extraction", 2012.
[7] Xiaoqing Liu and Jagath Samarabandu, "Multiscale Edge-Based Text Extraction from Complex Images", 2006.

