

Interactive Knowledge Discovery for Baseline
Estimation and Word Segmentation in
Handwritten Arabic Text

Jawad H AlKhateeb, Jianmin Jiang, Jinchang Ren and Stan Ipson
University of Bradford
United Kingdom

1. Introduction

Electronic document management systems provide great benefits to society. Software tools such as word processors are used in the generation, storage, and retrieval of documents in specific formats. Using such tools, documents can be edited, printed, or distributed electronically across networks. With paper documents, however, these tasks cannot be accomplished by computers, so there is a need to extract the information in such documents and store it in a computerized format. The solution for this task lies in the branch of pattern recognition known as Document Analysis and Recognition (DAR). The main aim here is to imitate the human ability to read text with high speed and accuracy. Optical Character Recognition (OCR) is the most crucial part of DAR (Khorsheed 2002).

As time passes, computers become more powerful, and tasks can be done more quickly. It is still necessary, however, to make computers more versatile by enabling them to carry out tasks that are natural to humans, such as reading machine printed or handwritten text. The automatic recognition of a document requires transferring the text into an image file. This process causes the system to lose any temporal information relating to the text (Khorsheed 2002).

Automatic recognition has enabled many applications such as office automation, cheque verification in banking, data entry, and mailing services based on post/zip codes (Lorigo 2006). In such applications, the interaction between man and machine can be improved by implementing character recognition systems (Amin 1997).
Handwritten text recognition has significant potential for such applications. More importantly, it may be used as a natural form of human-computer interaction. In general, this task can be divided into online and offline systems. Recognition in online systems is based on pen movements, that is, the dynamics of writing. Recognition in offline systems, by contrast, is based solely on the image of the written text. Offline recognition is the more difficult of the two because it cannot make use of additional information available to online systems, such as the strength and sequential order of the writing [1]. A large number of research papers have been written on Latin, Chinese, and Japanese handwriting. On the other hand, relatively little research has been done on Arabic handwriting. This is due to the complexity of Arabic text and to a lack of Arabic databases. The automated methods for the recognition of Arabic text are at an early stage compared to the methods for recognizing Latin, Chinese, and Japanese texts. In addition, the cursive nature of the data is a major challenge for Arabic writing recognition systems. In this chapter, the focus is on offline recognition of handwritten Arabic text.

Recent Advances in Technologies
www.intechopen.com
Arabic is written by more than 250 million people (Amin 1997). By nature, Arabic text is cursive, which makes its recognition rate lower than that of printed Latin. Like English, Arabic writing uses letters. The Arabic alphabet consists of 28 letters, and text is written from right to left in a cursive way. Each Arabic letter has either two or four shapes depending on its position in the text. The shapes are classified by position, which can be start, middle, end, or alone (Amin 1996). Table 1 shows each shape for each letter. For example, the letter Ayn (ع) has the following shapes: start عـ, middle ـعـ, end ـع, and alone ع. In addition, the Arabic language uses diacritical markings such as fattha, dumma, kasra, hamza (zigzag), shadda, or madda. The presence or absence of vowel diacritics indicates a different meaning (Amin 1998). For example, some words are written in the same way but differ in meaning, such as مدرسة, which can be school or teacher, كلية, which can be college or kidney, and حب, which can be love or seeds. Normally, the diacritical markings are not written in handwriting, but if the words are isolated, diacritical markings are essential to differentiate between the possible meanings. The use of dots makes some Arabic letters special (Amin 1998; Lorigo 2006; Amin 1997) as follows:
- Ten Arabic letters have one dot (ب ج خ ذ ز ض ظ غ ف ن)
- Three Arabic letters have two dots (ت ق ي)
- Two Arabic letters have three dots (ث ش)
- Several Arabic letters present a loop

It is worth knowing that removal of any of these dots will lead to a misrepresentation of the character. Efficient pre-processing techniques therefore have to deal with these dots without removing them and changing the identity of the character. There are six letters which are not connected from the left, resulting in the separation of the word into sub-words or pieces of Arabic words (PAW) (Lorigo 2006). Figure 1 shows examples of Arabic words with one, two, and three sub-words.



Fig. 1. Words with one, two, and three sub-words
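The effect of the six non-connecting letters can be illustrated on plain text. The following is a minimal sketch, not part of the chapter's image-based method: it assumes undiacritised Unicode input, and the names `NON_CONNECTING` and `split_sub_words` are hypothetical.

```python
from typing import List

# The six letters that never join to the letter that follows them.
# Hamza-carrying variants (for example أ or ؤ) are left out of this sketch.
NON_CONNECTING = set("ادذرزو")


def split_sub_words(word: str) -> List[str]:
    """Split an undiacritised Arabic word into its sub-words (PAWs)."""
    pieces: List[str] = []
    current = ""
    for letter in word:
        current += letter
        if letter in NON_CONNECTING:  # the stroke ends after this letter
            pieces.append(current)
            current = ""
    if current:
        pieces.append(current)
    return pieces


# Words with one, two, and three sub-words respectively.
for word in ("كتب", "باب", "ورد"):
    print(word, len(split_sub_words(word)))
```

In a handwritten image the same breaks appear as small horizontal gaps inside a word, which is why word segmentation has to distinguish them from the larger gaps between words.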

Generally, handwritten text is written on a page divided into lines which are further divided into words. There are spaces between the lines, and there are spaces between the words. The spaces between the words define the word boundaries. Normally, the space between sub-words is one third of the space between words. This is done consistently in printed text, but it varies in handwritten text (Amin 2000).


To this end, a range of attempts have been reported in the literature on Arabic text recognition. Almuallim and Yamaguchi (Almuallim 1987) proposed a structural recognition technique for Arabic handwritten words. Their method thinned and segmented the words into strokes. They used coordinates to represent the continuous stroke curves. Their method extracted each stroke's start and end points. Finally, the strokes were classified based on their topological and geometrical features. They tested their method on a database of 400 words written by two persons. However, their system failed in most cases due to incorrect segmentation of words.

This chapter focuses on the pre-processing phase of Arabic handwritten text recognition and introduces new methods for baseline estimation and word segmentation. The rest of this chapter is structured in four sections: section 2 describes the state of the art; section 3 describes the proposed methods, including baseline detection, extracting connected components and sub-words, and segmentation of words; section 4 presents experimental results and discussion; and finally, section 5 provides concluding remarks.

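Name: alone (isolated), start, middle, end
Alif: ا (alone), ـا (end)
Baa: ب (alone), بـ (start), ـبـ (middle), ـب (end)
Taa: ت (alone), تـ (start), ـتـ (middle), ـت (end)
Thaa: ث (alone), ثـ (start), ـثـ (middle), ـث (end)
Jeem: ج (alone), جـ (start), ـجـ (middle), ـج (end)
Haa: ح (alone), حـ (start), ـحـ (middle), ـح (end)
Khaa: خ (alone), خـ (start), ـخـ (middle), ـخ (end)
Daal: د (alone), ـد (end)
Dhaal: ذ (alone), ـذ (end)
Raa: ر (alone), ـر (end)
Zaay: ز (alone), ـز (end)
Seen: س (alone), سـ (start), ـسـ (middle), ـس (end)
Sheen: ش (alone), شـ (start), ـشـ (middle), ـش (end)
Saad: ص (alone), صـ (start), ـصـ (middle), ـص (end)
Daad: ض (alone), ضـ (start), ـضـ (middle), ـض (end)
TTaa: ط (alone), طـ (start), ـطـ (middle), ـط (end)
Dhaa: ظ (alone), ظـ (start), ـظـ (middle), ـظ (end)
Ayn: ع (alone), عـ (start), ـعـ (middle), ـع (end)
Ghayn: غ (alone), غـ (start), ـغـ (middle), ـغ (end)
Faa: ف (alone), فـ (start), ـفـ (middle), ـف (end)
Qaaf: ق (alone), قـ (start), ـقـ (middle), ـق (end)
Kaaf: ك (alone), كـ (start), ـكـ (middle), ـك (end)
Laam: ل (alone), لـ (start), ـلـ (middle), ـل (end)
Meem: م (alone), مـ (start), ـمـ (middle), ـم (end)
Noon: ن (alone), نـ (start), ـنـ (middle), ـن (end)
Haa: ه (alone), هـ (start), ـهـ (middle), ـه (end)
Waw: و (alone), ـو (end)
Yaa: ي (alone), يـ (start), ـيـ (middle), ـي (end)
Table 1. Arabic letter shapes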


2. Previous work
Amin and Al-Sadoun (Amin and Al-Sadoun 1992) proposed a new technique for segmenting hand printed Arabic text using binary trees and a parallel thinning algorithm (Guo 1989) for producing the skeleton of the image. They traced the thinned image from right to left using a 3×3 window and recorded the structure of the traced parts. They used the Freeman code (Freeman 1961) to describe the primitives. A binary tree consisting of several nodes is constructed using specified rules. Each node is used to describe the shape of part of a connected component. After construction of the binary tree, smoothing is done in order to minimize the number of nodes, the length of the Freeman code string, and any noise in the thinned image. Finally, they implemented segmentation by dividing the binary tree into several sub-trees in which each sub-tree represents a character. The advantages of their technique are its ability to segment overlapping characters and characters which have a short connection between them.

Motawa et al. (Motawa, Amin et al. 1997) introduced an algorithm for segmenting Arabic words into characters by applying mathematical morphological techniques. Several pre-processing tasks were performed on the input images including binarization, slant correction and connected components construction. The slant correction process first detected the slope using a single erosion operation before correcting it. Finally, connected components were found and contours applied to extract sub-words and the complementary characters. The segmentation algorithm first performed a filtering operation for noise removal. This was done by two successive morphological operations (closing followed by opening). Second, singularities were found by applying opening to the word image. Third, regularities were found by subtracting the singularities from the original image. For the recognition process, hidden Markov models (HMMs) were used to test the algorithm on a few hundred words, resulting in a recognition rate of 81.88%.

Abuhaiba et al. (Abuhaiba 1996) dealt with several problems in the processing of binary images of handwritten text documents. First, applying the distance transform to the thinned image, they created an algorithm which extracts the straight lines of a textual stroke. The goal of this method is to identify the spurious points in the thinned images. The extracted straight lines keep the structural information of the original pattern. Second, a threshold is calculated in order to remove outlying pixels whose distance exceeds the threshold. Finally, a method is developed to extract lines from pages of handwritten text by finding the shortest spanning tree of a graph formed from the set of main strokes. The main strokes of the extracted lines are then arranged in an order similar to their written order by following the path in which they are contained. Then, every secondary stroke is assigned to the closest main stroke. The result is a list of main strokes with their relevant secondary strokes, that is, a combination of main-secondary strokes. Each element in the list can be the input to the classifier. Their method proved to be powerful and suitable for variable handwriting.

Al-Badr and Haralick (Al-Badr and Haralick 1995) introduced a holistic recognition system which recognizes machine printed Arabic words without segmentation. Their system is based on describing the shape primitives as symbols. The instances of the predefined shape primitives are detected by applying the erosion operation to the word image. The system locates the best spatial arrangement of symbol models by applying a state space search. The detected primitives are matched with symbol models. The system was tested on a lexicon of 42000 words, and the recognition rate achieved was 99.4% on noise free text and 73% on scanned text.
Alma'adeed et al. (Alma'adeed 2002) introduced a system for classifying Arabic handwritten words based on HMM. First, the word images were normalized by removing variations which did not affect the identity of the word. The normalization covered stroke width, slope, and letter height, yielding a uniform height and a stroke one pixel wide. Second, the skeleton of the image was constructed, and 29 features were extracted. Finally, a classification process based on the HMM was used. Since there was no standard Arabic database, this system was tested on a special database (Al-Ma'adeed, Elliman et al. 2002) of 4700 handwritten words written by 100 writers. The recognition rate achieved was 45% because some words conflict with each other.

Alma'adeed et al. (Alma'adeed 2004) introduced a system for unconstrained Arabic handwritten word recognition based on multiple HMMs. First, pre-processing tasks were performed similar to the work in (Alma'adeed 2002). In order to improve the recognition rate in (Alma'adeed 2002), global features, such as the numbers of upper dots, lower dots, segments, ascenders and descenders, were used to differentiate the words from each other. By using these features and multiple HMMs, in which each HMM used a different set of features, the system removes all the variation in the images. Second, the skeleton of the image was constructed, and 29 features were extracted. Finally, a classification process based on the HMM was used. This system was tested on a database (Al-Ma'adeed, Elliman et al. 2002) of 100 handwritten words written by 1000 writers. The recognition rate achieved was 60% before using post processing. The codebook size was chosen after testing, and different words were selected for each group. There were eight groups, where the first group had 90 words, the second had 100 words, the third had 80 words, the fourth had 90 words, and the eighth had 120 words. The recognition rate was different for each group. The first group had a 97% recognition rate, while the eighth group had only a 60% recognition rate.

Alma'adeed (Alma'adeed 2006) introduced a system for unconstrained Arabic handwritten word recognition using a neural network (NN) classifier. This system used the pre-processing and features in (Alma'adeed 2002; Alma'adeed 2004). The NN had 8 neurons in the input layer, 40 neurons in the middle layer, and 70 neurons in the output layer, since the NN classifier used 70 different words. The accuracy achieved was 63%.
Khorsheed and Clocksin (Khorsheed and Clocksin 1999) presented a technique for extracting the structural features from Arabic cursive text. Several pre-processing tasks were performed, including thinning based on Stentiford's algorithm (Parker 1997) and skeleton centroid calculation to find a reference point relative to all segment locations. The features were extracted in three steps. First, segment extraction was done using the skeleton graph of the word image, which consists of a number of segments. There are feature points where a segment starts and ends. Second, loop extraction was performed, in which the loops are divided into three categories: a simple loop, a complex loop, and a double loop. The loops are checked during the segment extraction. Third, segment transformation is done after extracting the segments and the loops. The Viterbi algorithm (Rabiner 1986) is used to form a codebook by partitioning the training samples into several classes, and the codebook includes 76 symbols. The technique was tested using the HMM with a lexicon of 294 words acquired from different text sources, and recognition rates of up to 97% were achieved.
Khorsheed and Clocksin (Khorsheed and Clocksin 2000) presented a holistic recognition system for recognizing Arabic cursive words. First, Fourier coefficients are extracted from a word image after converting it into a normalized polar image. Using the average coefficient values over the training samples, each word was represented by a template. Recognition was done by comparing word templates using the Euclidean distance and assigning the unknown word to the closest word template. The recognition rate achieved was over 90%. However, this system fails for many fonts.

Khorsheed (Khorsheed 2003) presented another holistic recognition system for recognizing Arabic handwritten words. The pre-processing tasks included using the Zhang-Suen thinning algorithm (Zhang 1984) to generate the skeleton graph. Structural features for the handwritten script were extracted after skeletonization by decomposing the word skeleton into a sequence of links with an order similar to the word writing order. Using line approximation (Parker 1997), each line was broken into small line segments, which were transformed into a sequence of discrete symbols by using vector quantization (VQ) (Gray 1989). With this system, the HMM recognizer was applied with image skeletonization to the recognition of an old Arabic manuscript which can be found in (Khorsheed 2000). One HMM was formed from 32 character HMMs, each with no restriction on the jump margin. The system was tested on 405 character samples of a single font extracted from a single manuscript. The recognition rates achieved were 87% and 72% with and without spell checking respectively.

Khorsheed (Khorsheed 2007) presented a recognition system based on the HMM to recognize Arabic text. Pre-processing was performed, using a slow median filter, to reduce salt and pepper noise. Statistical features were extracted from the text image and fed to the recognizer. The recognizer was built on the HMM toolkit (HTK) (Young, Evermann et al. 2001). The advantage of this system is its lexicon free approach, which offers open vocabulary recognition. The system was able to learn complicated ligatures and overlaps. Different text images with different fonts were tested, and the recognition rate achieved was up to 92.4%. A tri-model implementation showed better system performance than a mono-model implementation.
In comparison with existing work, our proposed methods offer significant advantages, which can be highlighted as follows: (i) by using knowledge of the potential positions of the baseline, an improved projection based method is employed for baseline detection; (ii) the statistical distribution of the distances between words and between sub-words is analysed to determine an optimal threshold for word segmentation; (iii) a component-based method for word segmentation is used to provide a practical way of accurately segmenting words from the text line, rather than segmenting the words into characters or relying on segmentation free systems.

3. Proposed Methods
3.1 Baseline Detection
Previous work on baseline detection can be summarized as follows. Pechwitz and Margner (Pechwitz and Margner 2002) approximated the skeleton by a piecewise linear curve and detected the baseline as the line that best fits the edges. Farooq et al. (Farooq, Venu et al. 2005) used the IFN/ENIT database and placed its entries into documents in order to simulate skew, line separation, and other features. Their method is based on the local minima points of words. It generally works well but fails to find the baseline in situations where the diacritics are large relative to the word. The removal of diacritics is suggested as a potential solution. Al-Rashaideh (Al-Rashaideh 2006) found the baseline based on the assumptions that the baseline is rotated from the horizontal within a range of angles between +20 and -20 degrees and that the maximum number of pixels is located along the baseline. Syiam et al. (Syiam, Nazmy et al. 2006) presented a complete Arabic OCR system which uses a histogram clustering method for segmenting the Arabic word.

In the present research, the baseline is detected by using a horizontal projection of the input images. This is defined as the number of foreground pixels in each row (summed over $x$) and is represented by the vector $H(y)$ of size $N$. Let $p(x, y)$, $x \in [1, M]$, $y \in [1, N]$, denote one input image; its horizontal projection is defined as follows:

$$H(y) = \sum_{x} p(x, y) \tag{1}$$
where $H(y)$ denotes the number of effective pixels when $p$ is a binary image. Normally, the position of the baseline is indicated by a peak in $H(y)$. For most cases, this simple rule works in determining the baseline. However, it fails in some cases, as illustrated in Figure 2, where the global peak in $H(y)$ is not the baseline. To solve this problem, we apply knowledge based constraints. We know that the baseline should appear below the middle line of the image. Therefore, we modify the algorithm to find the peak in $H(y)$ only in the bottom half of the images, i.e.

Fig. 2. An example showing failure of baseline detection when using the peak of the horizontal projection of the image

$$b = \arg\max_{y \in [N/4,\, N/2]} H(y) \tag{2}$$

With the modification applied, the corresponding baseline is successfully located, as shown in Figure 3.




Fig. 3. Detected baseline from the image in Figure 2 using the knowledge-based modified algorithm
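The constrained peak search of equations (1) and (2) can be sketched in a few lines. This is a hedged illustration rather than the authors' implementation: the image is assumed to be a list of rows of 0/1 values with row 0 at the top, and `lo_frac` and `hi_frac` are hypothetical parameters that select the band of rows to search. The chapter does not state whether $y$ is counted from the top or the bottom of the image, so the default band (the lower half in array coordinates) is an assumption that follows the phrase "bottom half".

```python
from typing import List


def horizontal_projection(image: List[List[int]]) -> List[int]:
    """H(y): number of foreground (1) pixels in each row y."""
    return [sum(row) for row in image]


def detect_baseline(image: List[List[int]],
                    lo_frac: float = 0.5,
                    hi_frac: float = 1.0) -> int:
    """Row with the largest projection value inside the band
    [lo_frac * N, hi_frac * N), rows counted from the top."""
    h = horizontal_projection(image)
    n = len(h)
    lo = int(n * lo_frac)
    hi = min(n, max(lo + 1, int(n * hi_frac)))
    return max(range(lo, hi), key=lambda y: h[y])


# Toy line image: a long upper stroke (row 1) outweighs the true baseline (row 5).
image = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

print(detect_baseline(image, 0.0, 1.0))  # unconstrained global peak
print(detect_baseline(image))            # peak restricted to the lower band
```

On this toy image the unconstrained search picks the upper stroke, while the restricted search picks the row that plays the role of the baseline, which mirrors the failure case of Figure 2 and its correction in Figure 3.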

3.2 Extracting connected components and sub-words
Segmentation is an essential step which separates the text image objects for the recognition phase. The typical segmentation of a binary document is based on histogram projection analysis and regrouping of the connected components (Amin 1998; Lorigo 2006).

Arabic writing is cursive, and words are separated by spaces. However, a word may contain several sub-words, which are portions of the word including one or more connected letters. The connected components (CCs) of the line image must be determined. Each CC is enclosed in a minimum sized rectangular box. The objective of the CCs phase is to form rectangles around all the connected objects in the image. The algorithm used to obtain the CCs is an iterative procedure which checks each black pixel for connectivity with another. Bounding rectangles are extended to enclose any grouping of connected black pixels.

In our system, the 8-neighbours are used for extracting the connected components by scanning the image pixel by pixel and checking for pixel connectivity. For two or more pixels to be considered connected, their values must be in the same set $V$, where $V = \{1\}$. For a pixel $p$ at $(x, y)$, the 8-neighbours are defined by

$$N_8(p) = N_4(p) \cup N_D(p) \tag{3}$$

where $N_4(p) = \{(x+1, y), (x-1, y), (x, y+1), (x, y-1)\}$ and $N_D(p) = \{(x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1)\}$.
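A small sketch may help make the 8-neighbour rule of equation (3) concrete. The chapter only says that an iterative procedure is used, so the breadth-first flood fill below is a swapped-in standard technique, not the authors' code; the function name `connected_components` and the box layout `(x_min, y_min, x_max, y_max)` are assumptions.

```python
from collections import deque

# N8(p) = N4(p) ∪ ND(p), written as (dx, dy) offsets.
N4 = [(1, 0), (-1, 0), (0, 1), (0, -1)]
ND = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
N8 = N4 + ND


def connected_components(image):
    """Bounding box (x_min, y_min, x_max, y_max) of every 8-connected
    group of foreground (1) pixels, in raster-scan order."""
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for y in range(rows):
        for x in range(cols):
            if image[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                # Extend the bounding rectangle to enclose this pixel.
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dx, dy in N8:
                    nx, ny = cx + dx, cy + dy
                    if (0 <= nx < cols and 0 <= ny < rows
                            and image[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


image = [
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
print(connected_components(image))
```

The two pixels at the top left touch only diagonally, so they form one component under 8-connectivity; with $N_4$ alone they would be reported as two.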

Figure 4(a) shows the identified CCs for some example images. Starting from the extracted connected components, sub-words are segmented as follows. Firstly, small parts like dots in the image are temporarily ignored, as shown in Figure 4(b). Secondly, components whose coordinates overlap in the $x$ direction are merged to produce a combined large component, namely a sub-word. Thirdly, the distance between each pair of consecutive sub-words is obtained, which is used to segment words in the next section.
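The three steps above can be sketched on bounding boxes alone. This is an illustrative outline under stated assumptions: boxes are `(x_min, y_min, x_max, y_max)` tuples, "small parts" are approximated by a hypothetical `min_area` threshold on the box area (the chapter does not give its criterion), and the box values are made up.

```python
def merge_into_sub_words(boxes, min_area=5):
    """Return the (x_min, x_max) span of each sub-word, left to right."""
    spans = sorted(
        (x0, x1)
        for x0, y0, x1, y1 in boxes
        if (x1 - x0 + 1) * (y1 - y0 + 1) >= min_area  # step 1: skip dot-sized parts
    )
    merged = []
    for x0, x1 in spans:
        if merged and x0 <= merged[-1][1]:            # step 2: overlap in x
            merged[-1] = (merged[-1][0], max(merged[-1][1], x1))
        else:
            merged.append((x0, x1))
    return merged


def gaps_between(spans):
    """Step 3: horizontal distance in pixels between consecutive sub-words."""
    return [b[0] - a[1] - 1 for a, b in zip(spans, spans[1:])]


boxes = [
    (0, 2, 9, 12),    # main stroke
    (4, 0, 5, 1),     # a dot, ignored by the area filter
    (8, 3, 15, 11),   # overlaps the first stroke in x
    (20, 2, 30, 12),
    (45, 1, 60, 12),
]
spans = merge_into_sub_words(boxes)
print(spans)
print(gaps_between(spans))
```

The resulting gap list is the input to the word segmentation step: small gaps are expected between sub-words of one word and larger gaps between words.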


Fig. 4. Examples of extracted connected components (a), sub-words of combined components (b), and detected words (c).


3.3 Segmentation of Words
Basically, there are two categories of systems for the recognition of Arabic script: character-based and word-based systems. In the first category, words need to be further segmented into characters or letters, and these characters are then used for recognition. The second category does not need such segmentation, and whole words are used for recognition. In both categories, segmentation of words from the text is necessary.

Several algorithms have been presented for the segmentation of Latin cursive script. However, Arabic script segmentation has not received as much attention. In 1992, Amin and Al-Sadoun (Amin and Al-Sadoun 1992) proposed a segmentation technique for Arabic text using the binary tree. In 1995, Al-Badr and Haralick (Al-Badr and Haralick 1995) presented a system which recognizes a machine printed Arabic word without prior segmentation. In 1997, Motawa et al. (Motawa, Amin et al. 1997) presented an automatic segmentation of Arabic words using mathematical morphology tools. Their algorithm is based on the assumption that characters are usually connected by horizontal lines. In 2005, Lorigo and Govindaraju (Lorigo and Govindaraju 2005) presented a segmentation system which used derivative information in a region around the baseline to over-segment the words.

Segmenting a line of text into words is known as word separation. In the machine printed case, word separation is easier than in the handwriting case because the space between words is uniform and larger than the space between sub-words. In the handwriting case, the space between words is not always uniform and, moreover, the same amount of space may be present between words and between sub-words on a line.

In our system, each image is segmented into words using vertical histograms. Words have varying length; therefore, after taking the vertical histogram as shown in Figure 5, the line can be classified into words and sub-words depending on the distances between groups of peaks along the $x$ axis.


Fig. 5. Vertical Histogram


The vertical projection is defined as the number of foreground pixels in each column. It is represented by the vector $v_j$ of size $M$ (one entry per column), defined by

$$v(j) = \sum_{i} p(i, j) \tag{4}$$

where $p(i, j)$ is a pixel of the binary image of the script and is either 0 or 1, $i$ refers to rows and $j$ refers to columns.

Arabic writing is cursive; words and sub-words are therefore separated by spaces, so word boundaries are always represented by a space. However, six letters can be connected from the right side only. Using this knowledge and the vertical histogram, spaces can be detected by calculating the zero distances (gaps) on the $x$ axis, as shown in Figure 5. The distances between words are generally larger than the distances between sub-words. This distance is used to decide the number of words in the image based on a threshold.
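As an illustration of equation (4) and the gap detection just described, the sketch below computes the vertical projection of a tiny binary image and lists the runs of empty columns. It is a minimal outline under assumptions: the image is a list of rows of 0/1 values, the helper names are hypothetical, and empty margins at the left and right edges are deliberately not reported as gaps.

```python
def vertical_projection(image):
    """v(j): number of foreground (1) pixels in each column j."""
    return [sum(column) for column in zip(*image)]


def zero_runs(projection):
    """(start_column, length) of every run of empty columns that lies
    between two strokes; leading and trailing margins are skipped."""
    runs = []
    start = None
    for j, value in enumerate(projection):
        if value == 0:
            if start is None:
                start = j
        elif start is not None:
            if start > 0:  # a run starting at column 0 is the left margin
                runs.append((start, j - start))
            start = None
    return runs  # a run still open at the end is the right margin


image = [
    [0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0],
]
v = vertical_projection(image)
print(v)
print(zero_runs(v))
```

Each reported length is a candidate distance that is later compared with the threshold to decide between a word boundary and a sub-word boundary.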
To determine a suitable threshold, the Bayesian criterion of minimum classification error is employed as follows. Given a distance $d$, the probabilities that it represents a separation of words or of sub-words are denoted as $p_w(d)$ and $p_{sw}(d)$, respectively. These two conditional probabilities were obtained by manually analyzing over 100 images containing more than 250 words. Taking $p_w(d)$ for example, we found all possible distances separating words, calculated their histogram and estimated $p_w(d)$ from this histogram. Illustrations of both $p_w(d)$ and $p_{sw}(d)$ are given in Figure 6.




Fig. 6. Illustrations of $p_w(d)$ (green dotted line) and $p_{sw}(d)$ (red solid line).



Finally, an optimal distance $d_0$ is obtained under the Bayesian minimum classification error criterion:

$$d_0 = \arg\min_{d} \big(err(d)\big) \tag{5}$$

$$err(d) = \int_{d}^{\infty} p_{sw}(x)\,dx + \int_{0}^{d} p_w(x)\,dx \tag{6}$$

The segmentation of words is completed by simply comparing the distance $d$ with this optimal distance or threshold $d_0$. The case $d > d_0$ identifies two words, and the alternative case identifies two sub-words.
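A discrete version of equations (5) and (6) can be sketched by replacing the integrals with sums over histogram bins. The gap samples below are made up for illustration and are not the chapter's measurements; the function names are hypothetical, and integer pixel distances are assumed.

```python
from collections import Counter


def normalised_histogram(samples):
    """Relative frequency of each observed distance."""
    counts = Counter(samples)
    total = len(samples)
    return {d: c / total for d, c in counts.items()}


def classification_error(d, p_w, p_sw):
    """Discrete err(d): sub-word gaps above d plus word gaps at or below d."""
    wrong_sub = sum(p for x, p in p_sw.items() if x > d)    # taken for a word gap
    wrong_word = sum(p for x, p in p_w.items() if x <= d)   # taken for a sub-word gap
    return wrong_sub + wrong_word


def optimal_threshold(word_gaps, sub_word_gaps):
    """d0 = argmin_d err(d) over integer candidates (smallest d on ties)."""
    p_w = normalised_histogram(word_gaps)
    p_sw = normalised_histogram(sub_word_gaps)
    candidates = range(0, max(word_gaps + sub_word_gaps) + 1)
    return min(candidates, key=lambda d: classification_error(d, p_w, p_sw))


# Illustrative, hand-made gap measurements in pixels.
sub_word_gaps = [2, 3, 3, 4, 5, 6, 9]
word_gaps = [8, 12, 14, 15, 18, 20]

d0 = optimal_threshold(word_gaps, sub_word_gaps)
print(d0)
print(["word" if d > d0 else "sub-word" for d in (4, 14)])
```

Because the two sample sets overlap (a sub-word gap of 9 and a word gap of 8), no threshold reaches zero error, which is the situation the minimum-error criterion is meant to handle. Like equation (6), the sketch weights both error terms equally.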


4. Experimental Results and Discussions

Text recognition systems can be classified into two areas, printed text and handwritten text. Printed texts have similar shapes even when printed many times using different devices, whereas handwritten texts have high variability due to different writing styles. The main goal of a handwriting recognition system is to determine the class of the character or word. It is a more difficult task to design a recognition system which can recognize the handwriting of many people instead of just the handwriting of a single writer. Also, in order to evaluate handwriting recognition systems, the accuracy and the speed have to be measured and compared to those of an average human reader. In the literature, some recognition systems were reported with high recognition rates. This is generally due to their testing data, which consisted of a small set of words written by few writers rather than a standard database. Any recognition system needs a large database to train and test the system. Real data from banks or the postal service are confidential and inaccessible for non commercial research. Although some work has been conducted on Arabic handwritten words, this generally used the authors' own small databases or databases which were unavailable to the public. Most recognition systems have been developed for certain applications such as the reading of postal addresses or cheques. An example of a large standard English database suitable for the development, training and testing of recognition systems is the one created by Hull (Hull 1994). This was developed for the Center of Excellence for Document Analysis and Recognition (CEDAR) at the State University of New York at Buffalo and consists of 5000 city names, 5000 state names, 10000 ZIP codes, and 50000 alphanumeric characters. The National Institute of Standards and Technology (NIST) has also provided a handwritten database which includes English letters in lower and upper case and digits, and the Computer and Communication Research Laboratory of the Industrial Technology Research Institute in Taiwan has released a handwritten Chinese character database written by 2000 writers (Huang 1993).
The data set for the experiments is the IFN/ENIT database, released by Pechwitz et al. (Pechwitz, Maddouri et al. 2002). Early work on Arabic handwritten words generally used small individual databases or presented results on databases which were unavailable to the public. Consequently, there was no benchmark against which researchers could compare their results. This situation changed in 2002 when the IFN/ENIT database (www.ifnenit.com) became freely available for non commercial research. The IFN/ENIT database is very important in this context and has been used as a standard test set [9].

The database consists of 937 Tunisian town/village names together with their postcodes. In total, more than 1000 different writers took part. Each writer was asked to write their name and to fill in one or more forms with handwritten pre-selected names of Tunisian towns/villages and the corresponding postcodes. All the forms were scanned at 300 dpi and converted to binary images. The images are divided into five sets so that researchers can use some of them for training and some for testing. Some pre-processing tasks, including noise removal, text block segmentation, binarization and word segmentation, were carried out during the development of the IFN/ENIT database to make cropped binary images of the names of towns and villages available.
Corresponding to the test data sets described above, we design two phases of experiments to evaluate the proposed algorithms: Phase-1, experiments to evaluate the performance of baseline estimation, and Phase-2, experiments to evaluate the performance of connected component analysis and of word segmentation.

In phase-1, our experiments focus on baseline estimation. Because the baseline is part of the ground truth of the IFN/ENIT database, it is possible to evaluate the estimated baseline. Figure 7 shows the phase-1 experimental results on baseline estimation, from which it can be seen that the proposed algorithm worked well when applied to 4000 images from four different sets (the first 1000 images from each of sets a, b, c, and d were selected). The results for baseline estimation reach an average accuracy of 97.675%, which shows that the proposed algorithm is effective in estimating a word baseline. Table 2 summarizes the experimental results for the proposed baseline estimation algorithm.

Set              a      b      c      d      Average
Percentage (%)   97.4   97.8   97.9   97.6   97.675
Table 2. Performance of the baseline estimation algorithm

In comparison with the existing work, our baseline algorithm performs better in estimating
the baseline. Table 3 summarizes the results of our algorithm compared to the results of
existing work.

Method           Hough Projection [31]   Skeleton Based [31]   Proposed Algorithm
Percentage (%)   88                      88                    88.9
Table 3. Performance of proposed algorithm vs. other methods

In general, the baseline error is calculated to estimate the baseline quality. The error is
calculated as the area, in pixels, between the ground truth baseline and the estimated
baseline. Figure 8 shows an example of calculating the baseline error, while Figure 9 shows
the relation between the estimated baseline and the ground truth baseline.
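
The area-based error described above can be sketched as follows. This is only an illustrative sketch, not the chapter's implementation: it assumes that both baselines are straight lines given by their heights at the left and right edges of the word image, and the function names and the (left_y, right_y) representation are hypothetical.

```python
def baseline_y(x, left_y, right_y, width):
    """Height of a straight baseline at column x, given its heights at the
    left edge (x = 0) and the right edge (x = width - 1) of the word image.
    Assumption: baselines are straight lines between these two heights."""
    if width == 1:
        return float(left_y)
    return left_y + (right_y - left_y) * x / (width - 1)


def baseline_error(gt, est, width):
    """Approximate area (in pixels) between two straight baselines.

    gt and est are (left_y, right_y) pairs. The area is accumulated as the
    absolute vertical gap between the two lines in each pixel column.
    """
    return sum(
        abs(baseline_y(x, est[0], est[1], width) - baseline_y(x, gt[0], gt[1], width))
        for x in range(width)
    )


# An estimate lying 2 pixels below the ground truth across a 100-column image.
print(baseline_error((50, 50), (52, 52), 100))  # 200.0
```

A smaller value means the estimated baseline lies closer to the ground truth; an error of zero means the two lines coincide in every column.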



Fig. 7. Example baseline estimation results




Fig. 8. Baseline error



Fig. 9. The relation between the estimated and the ground truth baseline

To complete the phase-2 experiments, vertical histogram and connected component analysis
are carried out for word segmentation. Word segmentation approaches are generally based on
the assumption that the text lines are straight. This works well for machine-printed
documents, but it fails on handwritten documents with curvilinear text lines. Here, the
distances between sub-words are measured and compared to an optimal threshold to determine
whether each distance corresponds to a separation between two words. The segmentation
algorithm searches for horizontal gaps between the connected components using this
pre-estimated threshold. In comparison with the existing work, our word segmentation
algorithm shows a significant advantage: in the case of mis-spaced words, where the
algorithm failed to determine the bounding boxes, spaces were adjusted automatically
rather than manually using graphical tools.
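
The gap test between connected components can be illustrated with a short sketch. It assumes that each connected component (sub-word) has already been reduced to the horizontal extent of its bounding box, and that the threshold is supplied by the caller; the function name and the (x_min, x_max) representation are hypothetical and are not taken from the chapter.

```python
def group_into_words(boxes, threshold):
    """Group sub-word bounding boxes into words by horizontal gap.

    boxes: list of (x_min, x_max) column extents, one per connected component.
    threshold: gap width in pixels above which a gap separates two words.
    Returns a list of words, each a list of boxes, ordered right to left
    (the reading direction of Arabic).
    """
    if not boxes:
        return []
    ordered = sorted(boxes, key=lambda b: b[1], reverse=True)  # rightmost first
    words = [[ordered[0]]]
    left_edge = ordered[0][0]  # leftmost column reached by the current word
    for box in ordered[1:]:
        gap = left_edge - box[1] - 1  # blank columns between word and next box
        if gap > threshold:
            words.append([box])       # gap is wide enough: start a new word
            left_edge = box[0]
        else:
            words[-1].append(box)     # small or negative gap: same word
            left_edge = min(left_edge, box[0])
    return words


# Four sub-words; only the 19-column gap in the middle exceeds the threshold.
print(group_into_words([(0, 10), (14, 30), (50, 70), (74, 90)], threshold=10))
```

Overlapping boxes produce a negative gap and are therefore always merged into the current word, which is the behaviour one would expect for diacritics and overlapping strokes.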
In general, several types of error occur during the process of segmentation, whatever
approach is used. These errors can be summarized as:
1) Over-segmentation, when the number of segments is greater than the actual
number.
2) Under-segmentation, when the number of segments is less than the actual number.
3) Misplaced segmentation, when the number of segments is right but the limits are
wrong.
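
The three error types listed above can be expressed as a small classification routine. The sketch below assumes that a segmentation is represented as a grouping of connected component indices into words, mirroring the comparison of bounding box groupings against the ground truth; this representation and the function name are illustrative assumptions only.

```python
from collections import Counter


def classify_segmentation(predicted, truth):
    """Label a word segmentation result against the ground truth.

    predicted, truth: lists of words, each word a list of component ids.
    Returns "correct", "over", "under" or "misplaced".
    """
    if len(predicted) > len(truth):
        return "over"        # more segments than actual words
    if len(predicted) < len(truth):
        return "under"       # fewer segments than actual words
    same = {frozenset(w) for w in predicted} == {frozenset(w) for w in truth}
    return "correct" if same else "misplaced"  # right count, wrong limits


truth = [[0, 1], [2, 3]]
outcomes = [
    classify_segmentation([[0, 1], [2, 3]], truth),
    classify_segmentation([[0], [1], [2, 3]], truth),
    classify_segmentation([[0, 1, 2, 3]], truth),
    classify_segmentation([[0, 1, 2], [3]], truth),
]
print(Counter(outcomes))
```

Tallying the labels over a test set in this way yields the kind of breakdown that is reported per error type.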
We have tested our techniques on a test set of 500 images, and the results are compared to
the ground truth based on the grouping of the bounding boxes into words. Table 4
summarizes the word segmentation results; some of the results are presented in Figure 10.
From Table 4 we can see that the correct segmentation rate achieved for the images is 85%.
The segmentation error of 15% is due to the variations in handwriting, especially irregular
spaces between sub-words and words, such as spaces between words that are too small (which
will lead to under-segmentation by incorrectly merging two words together) or spaces
between sub-words that are too large (which may be wrongly taken as separating two words
and lead to over-segmentation). Examples of these errors are illustrated in Figure 11.
It is difficult to compare our work directly with the existing work in [35], since they
used different criteria and chose 200 images. They did not mention which 200 images of the
database were used, so our algorithm cannot be applied to the same data. Moreover, in the
case of mis-spaced words, where the algorithm failed to determine the bounding boxes, our
algorithm performs better since it reduces the number of such errors. The distances
between words and sub-words were normalized automatically by using knowledge of the Arabic
language, rather than adjusted manually using graphical tools, which allows the word case
to be determined. For example, the word in Figure 11 (a) was over-segmented, but after
distance normalization the word image is segmented correctly, as shown in Figure 12.
In addition, the rules of Arabic writing can be exploited and applied to the distance
normalization. The original image is scanned from right to left, column by column, and the
white (blank) columns are detected and adjusted in size in order to reduce the distances
between sub-words. An illustration of distance normalization is given in Figure 12. After
applying the distance normalization, the Arabic words are correctly segmented. Since each
handwritten image has Ground Truth (GT) information for evaluation purposes, the results
are compared with the IFN/ENIT GT information.
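
The column scan used for distance normalization can be sketched as below. The chapter does not state how much each run of blank columns is reduced, nor how the knowledge of Arabic selects the runs to adjust, so this sketch makes a simplifying assumption: every run of blank columns is capped at a hypothetical max_gap width. The function name and the list-of-rows image representation are likewise assumptions.

```python
def shrink_blank_runs(image, max_gap):
    """Return a copy of a binary image in which every run of blank columns
    is shortened to at most max_gap columns.

    image: list of equal-length rows, 1 = ink pixel, 0 = background.
    Assumption: a uniform cap stands in for the language-based rule,
    which is not reproduced here.
    """
    if not image:
        return []
    width = len(image[0])
    keep = []  # indices of the columns to keep, collected right to left
    run = 0    # length of the current run of blank columns
    for x in range(width - 1, -1, -1):  # scan from right to left
        blank = all(row[x] == 0 for row in image)
        run = run + 1 if blank else 0
        if not blank or run <= max_gap:
            keep.append(x)
    keep.reverse()  # restore left-to-right order for the output image
    return [[row[x] for x in keep] for row in image]


# A 4-column gap is reduced to 2 columns; the 1-column gap is left unchanged.
print(shrink_blank_runs([[1, 0, 0, 0, 0, 1, 0, 1]], max_gap=2))
```

In a fuller system the cap would presumably be applied only to gaps judged to lie inside a word, so that genuine word gaps remain above the segmentation threshold.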

No. of    Correct            Under              Over               Misplaced
Images    segmentation (%)   segmentation (%)   segmentation (%)   segmentation (%)
500       85                 9                  4                  2
Table 4. Overall segmentation results



Fig. 10. Example successful word segmentation results







Fig. 11. Example failed word segmentation results

[Figure 12 panels: original image; distance normalized image in double (1); distance
normalized image in double (2); distance normalized image in binary]

Fig. 12. Example failed word segmentation results

5. Conclusion
Arabic handwriting recognition depends on accurate pre-processing and segmentation. This
chapter proposes a robust method for baseline estimation and a statistical analysis to
determine an optimal threshold for word segmentation. By using knowledge of potential
positions of the baseline, more accurate results are obtained in comparison with those
without knowledge support. In addition, the optimal threshold obtained is found to be very
effective for robustly segmenting words in Arabic text.
A component-based method is introduced to segment words from handwritten Arabic texts.
Since much previous work has emphasized either segmentation-free methods or letter- or
stroke-based approaches, word segmentation has not been well addressed. Here, our work provides
a practical way of accurately segmenting words from the text. This is useful and more
flexible than segmentation-free approaches, as it can make good use of the component parts
of images in further recognition. Also, this approach is simpler and more robust than
letter-based methods because the latter have much difficulty in effectively segmenting
arbitrary handwritten characters. We have found that distance information is very useful
for segmenting words, but improvements are still desirable. A distance normalization
technique making use of knowledge of the language was applied to reduce the numbers of
over- and under-segmentation errors. Further investigations will aim to improve word
segmentation further by using language knowledge for validation.

6. References
Abuhaiba, I. S. I., M. J. J. Holt, et al. (1996). "Processing of binary images of handwritten text
documents." Pattern Recognition 29(7): 1161-1177.
Al-Badr, B. and R. M. Haralick (1995). Segmentation-free word recognition with application
to Arabic. Proceedings of the Third International Conference on Document
Analysis and Recognition.
Al-Ma'adeed, S., D. Elliman, et al. (2002). A data base for Arabic handwritten text
recognition research. Eighth International Workshop on Frontiers in Handwriting
Recognition.
Al-Rashaideh, H. (2006). "Preprocessing phase for Arabic Word Handwritten Recognition."
Information Transmissions in Computer Networks 6: 11-19.
Alma'adeed, S. (2006). Recognition of Off-Line Handwritten Arabic Words Using Neural
Network. Geometric Modeling and Imaging--New Trends.
Alma'adeed, S., C. Higgins, et al. (2002). "Recognition of Off-Line Handwritten Arabic
Words Using Hidden Markov Model Approach." 16th International Conference on
Pattern Recognition (ICPR'02) 3: 481-484.
Alma'adeed, S., C. Higgins, et al. (2004). "Off-line recognition of handwritten Arabic words
using multiple hidden Markov models." Knowledge-Based Systems 17(2-4): 75-79.
Almuallim, H. and S. Yamaguchi (1987). "A method of recognition of Arabic cursive
handwriting." IEEE Transactions on Pattern Analysis and Machine Intelligence 9(5):
715-722.
Amin, A. (1997). Off line Arabic character recognition: a survey. Fourth International
Conference on Document Analysis and Recognition.
Amin, A. (1998). "Off-line Arabic character recognition: the state of the art." Pattern
Recognition 31(5): 517-530.
Amin, A. (2000). "Recognition of printed Arabic text based on global features and decision
tree learning techniques." Pattern Recognition 33(8): 1309-1323.
Amin, A., H. Al-Sadoun, et al. (1996). "Hand-printed Arabic character recognition system
using an artificial network." Pattern Recognition 29(4): 663-675.
Amin, A. and H. B. Al-Sadoun (1992). A new segmentation technique of Arabic text.
Proceedings, 11th IAPR International Conference on Pattern Recognition, 1992.
Vol. II. Conference B: Pattern Recognition Methodology and Systems.
Farooq, F., G. Venu, et al. (2005). Pre-processing methods for handwritten Arabic
documents. Proceedings Eighth International Conference on Document Analysis
and Recognition.
Freeman, H. (1961). "On the encoding of arbitrary geometric configuration." IEEE Trans.
Electronic Computer 10: 260-268.
Gray, R. M. (1989). "Vector quantization." IEEE Trans. ASSP (1): 4-29.
Guo, Z. and R. W. Hall (1989). "Parallel thinning with two-subiteration algorithms."
Communications of the ACM 32(3): 359-373.
Huang, J. S. (1993). Optical handwritten Chinese character recognition. Handbook of
Pattern Recognition and Computer Vision, World Scientific Publishing
Co., Inc: 595-624.
Hull, J. J. (1994). "A database for handwritten text recognition research." Pattern Analysis
and Machine Intelligence, IEEE Transactions on 16(5): 550-554.
Khorsheed, M. S. (2000). Automatic Recognition of Words in Arabic Manuscripts. Computer
Laboratory, University of Cambridge. Ph.D: 220.
Khorsheed, M. S. (2002). "Off-Line Arabic Character Recognition - A Review." Pattern
Analysis & Applications 5(1): 31-45.
Khorsheed, M. S. (2003). "Recognising handwritten Arabic manuscripts using a single
hidden Markov model." Pattern Recognition Letters 24(14): 2235-2242.
Khorsheed, M. S. (2007). "Offline recognition of omnifont Arabic text using the HMM
ToolKit (HTK)." Pattern Recognition Letters 28(12): 1563-1571.
Khorsheed, M. S. and W. F. Clocksin (1999). Structural Features Of Cursive Arabic Script.
The Tenth British Machine Vision Conference, The University of Nottingham, UK.
Khorsheed, M. S. and W. F. Clocksin (2000). Multi-font Arabic word recognition using
spectral features. Proceedings 15th International Conference on Pattern
Recognition, 2000.
Lorigo, L. and V. Govindaraju (2005). Segmentation and pre-recognition of Arabic
handwriting. Eighth International Conference on Document Analysis and
Recognition.
Lorigo, L. M. and V. Govindaraju (2006). "Offline Arabic handwriting recognition: a survey."
Pattern Analysis and Machine Intelligence, IEEE Transactions on 28(5): 712-724.
Motawa, D., A. Amin, et al. (1997). Segmentation of Arabic cursive script. The Fourth
International Conference on Document Analysis and Recognition.
Parker, J. R. (1997). Algorithms For Image Processing and Computer Vision. John Wiley and
Sons, Inc.
Pechwitz, M., S. S. Maddouri, et al. (2002). IFN/ENIT - Database of Arabic Handwritten
words. Colloque International Francophone sur l'Ecrit et le Document (CIFED).
Pechwitz, M. and V. Margner (2002). Baseline estimation for Arabic handwritten words.
Eighth International Workshop on Frontiers in Handwriting Recognition.
Rabiner, L. and B. Juang (1986). "An introduction to hidden Markov models." ASSP
Magazine, IEEE [see also IEEE Signal Processing Magazine] 3(1): 4-16.
Syiam, M., T. M. Nazmy, et al. (2006). Histogram clustering and hybrid classifier for
handwritten Arabic characters recognition. Proceedings of the 24th IASTED
international conference on Signal processing, pattern recognition, and applications.
Young, S., G. Evermann, et al. (2001). The HTK Book, Cambridge University Engineering
Department.
Zhang, T. Y. and C. Y. Suen (1984). "A fast parallel algorithm for thinning digital patterns."
Communications of the ACM 27(3): 236-239.
Recent Advances in Technologies
Edited by Maurizio A Strangio
ISBN 978-953-307-017-9
Hard cover, 636 pages
Publisher InTech
Published online 01, November, 2009
Published in print edition November, 2009
The techniques of computer modelling and simulation are increasingly important in many fields of science
since they allow quantitative examination and evaluation of the most complex hypothesis. Furthermore, by
taking advantage of the enormous amount of computational resources available on modern computers
scientists are able to suggest scenarios and results that are more significant than ever. This book brings
together recent work describing novel and advanced modelling and analysis techniques applied to many
different research areas.
How to reference
In order to correctly reference this scholarly work, feel free to copy and paste the following:
Jawad H AlKhateeb, Jianmin Jiang, Jinchang Ren and Stan Ipson (2009). Interactive Knowledge Discovery for
Baseline Estimation and Word Segmentation in Handwritten Arabic Text, Recent Advances in Technologies,
Maurizio A Strangio (Ed.), ISBN: 978-953-307-017-9, InTech, Available from:
http://www.intechopen.com/books/recent-advances-in-technologies/interactive-knowledge-discovery-for-
baseline-estimation-and-word-segmentation-in-handwritten-arabic-
