IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 22, NO. 1, JANUARY 2000

On-Line and Off-Line Handwriting Recognition: A Comprehensive Survey

Réjean Plamondon, Fellow, IEEE, and Sargur N. Srihari, Fellow, IEEE

Abstract: Handwriting has continued to persist as a means of communication and recording information in day-to-day life even with the introduction of new technologies. Given its ubiquity in human transactions, machine recognition of handwriting has practical significance, as in reading handwritten notes in a PDA, in postal addresses on envelopes, in amounts in bank checks, in handwritten fields in forms, etc. This overview describes the nature of handwritten language, how it is transduced into electronic data, and the basic concepts behind written language recognition algorithms. Both the on-line case (which pertains to the availability of trajectory data during writing) and the off-line case (which pertains to scanned images) are considered. Algorithms for preprocessing, character and word recognition, and performance with practical systems are indicated. Other fields of application, like signature verification, writer authentification, and handwriting learning tools are also considered.

Index Terms: Handwriting recognition, on-line, off-line, written language, signature verification, cursive script, handwriting learning tools, writer authentification.

R. Plamondon is with École Polytechnique de Montréal, C.P. 6079, Succursale Centre-Ville, Montréal, Qc, H3C 3A7. S.N. Srihari is with the Center of Excellence for Document Analysis and Recognition (CEDAR), Department of Computer Science, State University of New York at Buffalo, Amherst, NY 14228. Manuscript received 23 July 1999; accepted Nov. 1999. Recommended for acceptance by K. Bowyer.

1 INTRODUCTION

1.1 The Nature of Handwriting

Handwriting is a skill that is personal to individuals. Fundamental characteristics of handwriting are threefold. It consists of artificial graphical marks on a surface; its purpose is to communicate something; this purpose is achieved by virtue of the mark's conventional relation to language [33]. Writing is considered to have made possible much of culture and civilization. Each script has a set of icons, which are known as characters or letters, that have certain basic shapes. There are rules for combining letters to represent shapes of higher level linguistic units. For example, there are rules for combining the shapes of individual letters so as to form cursively written words in the Latin alphabet.

1.2 Survival of Handwriting

Copybooks and various writing methods, like the Palmer method, handwriting analysis, and autograph collecting, are words that conjure up a lost world in which people looked to handwriting as both a lesson in conformity and a talisman of the individual [231]. The reason that handwriting persists in the age of the digital computer is the convenience of paper and pen as compared to keyboards for numerous day-to-day situations.

Handwriting was developed a long time ago as a means to expand human memory and to facilitate communication. At the beginning of the new millennium, technology has once again brought handwriting to a crossroads. Nowadays, there are numerous ways to expand human memory as well as to facilitate communication and, in this perspective, one might ask: Will handwriting be threatened with extinction, or will it enter a period of major growth?

Handwriting has changed tremendously over time and, so far, each technology push has contributed to its expansion. The printing press and typewriter opened up the world to formatted documents, increasing the number of readers that, in turn, learned to write and communicate. Computer and communication technologies such as word processors, fax machines, and e-mail are having an impact on literacy and handwriting. Newer technologies such as personal digital assistants (PDAs) and digital cellular phones will also have an impact.

All these inventions have led to the fine-tuning and reinterpreting of the role of handwriting and handwritten messages. Each time, the niche occupied by handwriting has become more clearly defined and popularized.
As a general rule, it seems that as the length of handwritten messages decreases, the number of people using handwriting increases [165].

Widespread acceptance of digital computers seemingly challenges the future of handwriting. However, in numerous situations, a pen together with paper or a small notepad is much more convenient than a keyboard. For example, students in a classroom are still not typing on a notebook computer. They store language, equations, and graphs with a pen. This typical paradigm has led to the concept of pen computing [12], where the keyboard is an expensive and nonergonomic component to be replaced by a pentip position sensitive surface superimposed on a graphic display that generates electronic ink. The ultimate handwriting computer will have to process electronic handwriting in an unconstrained environment, deal with many writing styles and languages, work with arbitrary user-defined alphabets, and understand any handwritten message by any writer.

Fig. 1. (a) Off-line word. The image of the word is converted into gray-level pixels using a scanner. (b) On-line word. The x, y coordinates of the pentip are recorded as a function of time with a digitizer.

1.3 Recognition, Interpretation, and Identification

Several types of analysis, recognition, and interpretation can be associated with handwriting. Handwriting recognition is the task of transforming a language represented in its spatial form of graphical marks into its symbolic representation. For English orthography, as with many languages based on the Latin alphabet, this symbolic representation is typically the 8-bit ASCII representation of characters.
The characters of most written languages of the world are representable today in the form of 16-bit Unicode [232]. Handwriting interpretation is the task of determining the meaning of a body of handwriting, e.g., a handwritten address. Handwriting identification is the task of determining the author of a sample of handwriting from a set of writers, assuming that each person's handwriting is individualistic. Signature verification is the task of determining whether or not the signature is that of a given person. Identification and verification [171], which have applications in forensic analysis, are processes that determine the special nature of the writing of a specific writer [15], while handwriting recognition and interpretation are processes whose objectives are to filter out the variations so as to determine the message. The task of reading handwriting is one involving specialized human skills. Knowledge of the subject domain is essential as, for example, in the case of the notorious physician's prescription, where a pharmacist uses knowledge of drugs.

1.4 Handwriting Input

Handwriting data is converted to digital form either by scanning the writing on paper or by writing with a special pen on an electronic surface such as a digitizer combined with a liquid crystal display. The two approaches are distinguished as off-line and on-line handwriting, respectively. In the on-line case, the two-dimensional coordinates of successive points of the writing as a function of time are stored in order, i.e., the order of strokes made by the writer is readily available. In the off-line case, only the completed writing is available as an image. The on-line case deals with a spatio-temporal representation of the input, whereas the off-line case involves analysis of the spatio-luminance of an image. Fig. 1 shows typical input signals that can be analyzed in both cases. The raw data storage requirements are widely different.
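As a rough check on how far apart the two storage requirements are, the sketch below estimates raw sizes from the sampling figures quoted in this section (100 samples per second on-line, 300 dots per inch off-line). The function names, the one byte per gray-scale pixel, and the four bytes per pen sample are illustrative assumptions, not values taken from the survey.

```python
def offline_image_bytes(width_in, height_in, dpi=300, bytes_per_pixel=1):
    """Raw size of a scanned gray-scale image (assumes 1 byte per pixel)."""
    cols = round(width_in * dpi)
    rows = round(height_in * dpi)
    return cols * rows * bytes_per_pixel


def online_trace_bytes(duration_s, rate_hz=100, bytes_per_sample=4):
    """Raw size of a pen trajectory (assumes 2 bytes each for x and y)."""
    return round(duration_s * rate_hz) * bytes_per_sample


page = offline_image_bytes(8.5, 11)   # letter-size page scanned at 300 dpi
word = online_trace_bytes(1.0)        # a word written in about one second
print(f"page image: {page / 1e6:.1f} MB, on-line word: {word} bytes")
```

Under these assumptions a scanned page comes to about 8.4 megabytes, while a one-second pen trace needs only a few hundred bytes.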
The data requirements for an average cursively written word are: in the on-line case (Fig. 1b), a few hundred bytes, typically sampled at 100 samples per second, and in the off-line case (Fig. 1a), a few hundred kilobytes, typically sampled at 300 dots per inch.

From a global perspective, paper documents, which are an inherently analog medium, can be converted into digital form by a process of scanning and digitization. This process yields a digital image. For instance, a typical 8.5 x 11 inch page is scanned at a resolution of 300 dots per inch to create a gray-scale image of 8.4 megabytes. The resolution is dependent on the smallest font size that needs reliable recognition, as well as the bandwidth needed for transmission and storage of the image.

The recognition rates reported are much higher for the on-line case in comparison with the off-line case. For example, for the off-line, unconstrained handwritten word recognition problem, recognition rates of 95 percent, 85 percent, and 78 percent have been reported for the top choice with lexicon sizes of 10, 100, and 1,000, respectively [216]. In the on-line case, larger lexicons are possible for the same accuracy; a top choice recognition rate of 80 percent with pure cursive words and a 21,000 word lexicon has been reported [204]. Higher performance numbers have been achieved in recent years; however, all recognition performance numbers are dependent on the particular test set.

1.5 The State of the Art

The state of the art of automatic recognition of handwriting at the dawn of the new millennium is that, as a field, it is no longer an esoteric topic on the fringes of information technology, but a mature discipline that has found many commercial uses. On-line systems for handwriting recognition are available in handheld computers such as PDAs. The performance of PDAs is acceptable for processing handprinted symbols, and, when combined with keyboard entry, a powerful method for data entry has been created.
Off-line systems are less accurate than on-line systems. However, they are now good enough that they have a significant economic impact for specialized domains such as interpreting handwritten postal addresses on envelopes and reading courtesy amounts on bank checks.

The success of on-line systems makes it attractive to consider developing off-line systems that first estimate the trajectory of the writing from off-line data and then use on-line recognition algorithms [151]. However, the difficulty of recreating the temporal data [13], [46], [275] has led to few such feature extraction systems so far.

The objective of this paper is to present a comprehensive review of the state of the art in the automatic processing of handwriting. It reports many recent advances and changes that have occurred in this field, particularly over the last decade. Various psychophysical aspects of the generation and perception of handwriting are first presented to highlight the different sources of variability that make handwriting processing so difficult. Major successes and promising applications of both on-line and off-line approaches are indicated here. Finally, attempts to incorporate contextual knowledge, particularly from linguistics, to improve system performance are presented. Due to space limitations, we mostly limit our survey of this topic to applications dealing with the Latin alphabet.
Moreover, in many subtopics, previous surveys have been done to highlight, among other things, how the problem attack was launched, what the major milestones of development in the field were, etc. In these cases, we refer specifically to the papers and build up our report upon those.

2 HANDWRITING GENERATION AND PERCEPTION

The study of handwriting covers a very broad field dealing with numerous aspects of this very complex task. It involves research concepts from several disciplines: experimental psychology, neuroscience, physics, engineering, computer science, anthropology, education, forensic document examination, etc. [56], [161], [170], [208], [208], [235], [236], [237], [241].

From a generation point of view, handwriting involves several functions. Starting from a communication intention, a message is prepared at the semantic, syntactic, and lexical levels and converted somehow into a set of allographs (letter shape models) and graphs (specific instances) made up of strokes so as to generate a pentip trajectory that can be recorded on-line with a digitizer or an instrumented pen. In many cases, the trajectory is just recorded on paper and the resulting document can be read later with an off-line system.

The understanding of handwriting generation is important in the development of both on-line and off-line recognition systems, particularly in accounting for the variability of handwriting. So far, numerous models have been proposed to study and analyze handwriting. These models are generally divided into two major classes: top-down and bottom-up models [173]. Top-down models refer to approaches that focus on high-level information processing, from semantics to basic motor control problems. Bottom-up models are concerned
with the analysis and synthesis of low-level neuromuscular processes involved in the production of a single stroke, going upward to the generation of graphs, allographs, words, etc.

Most of the top-down models have been developed for language processing purposes. They are not exclusively dedicated to handwriting and deal with the integration of lexical, syntactic, and semantic information to process a message. We will come back to some of these in Section 5. The bottom-up models are generally divided into two groups: oscillatory [87] and discrete [3] models. The former consider oscillation as a basic movement, and the generation of complex movements results from the control of the amplitude, phase, and frequency of a fundamental wave function [26], [3], [59], [298], [233]. Discrete models consider complex movements as the result of a temporal superimposition of a set of simple, discontinuous strokes [20], [143], [144], [187]. In the oscillatory approach, a single stroke is seen as a specific case of an abrupt, interrupted oscillation, while in the discrete case, continuous movements emerge from the time-overlap of discontinuous strokes.

Fig. 2 summarizes and illustrates a typical discrete model [167]. This model describes a single stroke as resulting from the coactivation of two neuromuscular systems, one agonist and the other antagonist, that control the velocity of the pentip. The magnitude of the velocity as a function of time is described by a delta-lognormal function [Ul], and each stroke is represented by nine parameters reflecting the instantiation and amplitude of the input command $(t_0, D_1, D_2)$, the time delays and response times of the two systems $(\mu_1, \mu_2, \sigma_1, \sigma_2)$, as well as some basic postural information $(C_0, \theta_0)$. In this context, the generation of handwriting is described as the vector summation of discontinuous strokes.
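The speed profile of a single stroke in this kind of model can be sketched as follows. This is a minimal illustration that assumes the usual form of a delta-lognormal profile, namely an agonist lognormal weighted by D1 minus an antagonist lognormal weighted by D2, both starting at t0. The function names and the parameter values are hypothetical, and the two postural parameters are not needed to compute the speed magnitude.

```python
import math


def lognormal(t, t0, mu, sigma):
    """Lognormal impulse response starting at t0 (zero for t <= t0)."""
    if t <= t0:
        return 0.0
    x = t - t0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi) * x)


def stroke_speed(t, t0, d1, d2, mu1, mu2, sigma1, sigma2):
    """Agonist lognormal minus antagonist lognormal (assumed form)."""
    agonist = d1 * lognormal(t, t0, mu1, sigma1)
    antagonist = d2 * lognormal(t, t0, mu2, sigma2)
    return agonist - antagonist


# Hypothetical parameter values for one stroke (time in seconds).
params = dict(t0=0.1, d1=10.0, d2=2.0, mu1=-1.5, mu2=-1.2,
              sigma1=0.3, sigma2=0.3)

dt = 0.001
times = [i * dt for i in range(5000)]
speeds = [stroke_speed(t, **params) for t in times]
distance = sum(speeds) * dt  # area under the speed profile
print(f"peak speed {max(speeds):.2f}, area under profile {distance:.2f}")
```

Because each lognormal integrates to one, the area under the profile approaches D1 minus D2, which gives a quick sanity check on any set of parameters.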
The fluency of the trajectory emerges from the time-superimposition of strokes due to an anticipatory effect. In other words, and according to this kinematic theory [164], once a stroke is initiated to reach a target, a subject knows how long it will take to reach that target and with what spatial precision. This allows the subject to start a new stroke prior to the end of the previous one. The immediate consequence of this anticipation phenomenon is that any observable signal from this trajectory at a given time is affected by at least both the previous and the successive strokes.

Fig. 2a depicts the block diagram of the model. Fig. 2b shows a typical action plan described by a sequence of virtual targets (diamonds) linked by circular strokes (truncated lines). Once this action plan is activated, it is fed through the neuromuscular agonist and antagonist systems to produce a trajectory that leaves, for example, a handwritten trace on a piece of paper (continuous line). Fig. 2c, Fig. 2d, and Fig. 2e show the typical executions of this action plan with increasing anticipatory effects. As seen in Fig. 2e, too much anticipation greatly degrades the legibility of the message. Similar problems can emerge from the variability of any of the nine stroke parameters of this model.

Using nonlinear regression, a set of individual strokes and stroke parameters can be recovered from the shape and the velocity data of a handwritten trace, and both the velocity signal and the handwritten word can be reconstructed (see Fig. 3a and Fig. 3b for examples). Each of the recovered strokes can be analyzed for the purpose of word segmentation and recognition [74], [167]. From this perspective, bottom-up models provide information about neuromotor processes that are involved, at the lowest level of abstraction, in handwriting recognition. Many cues about
Many cues about 6 IEEE TRANBACTIONS ON PATTEAW ANALYSIS AND MKCHINE INTELLIGENCE, YOL.22, NO, 1, JANUARY 2609 eta tos: ot antspaion 1036) Original word, extracted strokes and vial largots Fl. 2 (A noncrtng generation mods.) A ype aston pign mado vp of a sequonce of vital tetas) ed with ces Stokoe {dott Se. Ths intorraton fas been exkacied far a saci star of tha word (coninuous ines} sing the xa of Fg. 28. (eh) Iroomporaing aniciosloneffoc, whieh It activating the ces mone before the competion ef the presew one, moins the ganea!shapa ot | word. (2) shows the ofc of creasing the covtexa)enkioary phenomenon leters detection and word recognition have emerged from similar sts. Teom an opposite point-oview, the reading ofa hand written document ees on a basio knowledge about perception [18 222] Foychologieal experiments in he qnan charactor secogaition show io effess: 1) a chaeacter thet either oscurs frequently, cr has 9 simple siuctute Js processed as a single ant without aay decomposition of the character stuctore into simpler units nod 2) wilh infrequently occuring chareters, and those wih eoaplex struct, te aanount ome tke o secogriae a charset increases as is nuraberof stokes ieresves (10), [226] [228], [259]. The former metho of recognition is referze to a Iolite ana ehe latter a8 ray, both of which are discussed further in Section 4.3 The perceptual processes involved in retding have been discussed extensively in the cognitive psychology literature {101 [226 1223}, Such studies ae pertinent in hat they ean form the basis for algrihins thet emilate hurnan perior mance in rending [18], [36] of try to do better [224 Although much of this literature refers to the reading of ‘machine-printec text. some conclusions are equally Val for handwritten text Tor instance, the saccades feye movement) fxs at disercte points on the text, and st cach fixation the brain uses the visual peripheral field to infer the sae ofthe lext. 
Algorithmically, this again leads to the holistic approach to recognition.

3 ON-LINE HANDWRITING RECOGNITION

As previously mentioned, on-line recognition refers to methods and techniques dealing with the automatic processing of a message as it is written using a digitizer or an instrumented stylus that captures information about the pentip, generally its position, velocity, or acceleration as a function of time (see Fig. 4a, Fig. 4b, Fig. 5a, and Fig. 5b for examples of typical signals).

Fig. 3. (a) Original (continuous line) and reconstructed (dotted line) curvilinear velocity of the sample word. (b) Original (continuous line) and reconstructed (dotted line) trace of the sample word.

This problem has been a research challenge since the beginning of the sixties, when the first attempts to recognize isolated handprinted characters were performed [52], [54]. Since then, numerous methods and approaches have been proposed and tested; many have already been summarized in a few exhaustive survey papers [152], [172], [227], [2am].

Over the years, these research projects have evolved from being academic exercises to developing technology-driven applications. We will focus on three of these technical domains in this section: pen-based computers, signature verifiers, and developmental tools. The first group refers to the recognition of handwritten messages and gesture commands to interact with pen computing platforms. The second deals with signatures, a very specific type of well-learned handwriting, with the purpose of verifying the identity of a person. The third class incorporates various systems that exploit the neuromotor characteristics of handwriting to design systems for education and rehabilitation purposes.

Fig. 4. (a) Values of the x coordinate of the pentip as a function of time, x(t), for the word depicted in Fig. 1. (b) Values of the y coordinate of the pentip as a function of time, y(t), for the word depicted in Fig. 1.

3.1 Pen-Based Computers

The concept of a pen computer was first proposed by Kay in 1968 [37]. Since then, many research teams have been working on the implementation of the "Dynabook" concept [155], trying to integrate into a single light and ergonomic system a transparent position sensing device with a graphical display, under the control of a powerful microcomputer. The ultimate goal here is to mimic and extend the pen and paper metaphor by the automatic processing of electronic ink. Apart from the numerous hardware problems that still have to be solved [129], the use of electronic penpads mostly relies on the on-line recognition of command gestures and handwritten messages [5S], although most of the systems do not process the full timing information available from the signal but only the stroke order.

Prior to any recognition, the acquired data is generally preprocessed to reduce spurious noise, to normalize the various aspects of the trace, and to segment the signal into meaningful units [73], [152], [172], [227]. The noise originates from several sources: the quantization noise of the digitizer as well as the digitizing process itself, erratic hand or finger movements (see Section 2), the inaccuracies of the pen-up/pen-down indicator, etc. The main approaches to noise reduction deal with data smoothing, signal filtering, dehooking, and break corrections [152].

Fig. 5. (a) Values of the magnitude of the pentip velocity, $|v(t)| = \sqrt{v_x^2(t) + v_y^2(t)}$, as a function of time for the word depicted in Fig. 1. (b) Values of the acceleration of the pen, $a(t)$, as a function of time for the word depicted in Fig. 1.

Many recognition algorithms, which are based on the use of standardized allographs and shapes of a cursive word, first require that a handprinted character or a command gesture be normalized. Other approaches try to absorb some of these distortions [163].
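Two of the preprocessing steps named above, data smoothing and normalization of the trace, can be sketched on a list of pen samples. The moving-average filter and the height-based scaling used here are simple illustrative choices, not the specific methods of the cited systems, and the function names are hypothetical.

```python
def smooth(points, half_window=1):
    """Moving-average smoothing of (x, y) samples; the window shrinks at the ends."""
    out = []
    n = len(points)
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = points[lo:hi]
        out.append((sum(p[0] for p in window) / len(window),
                    sum(p[1] for p in window) / len(window)))
    return out


def normalize_height(points, target=1.0):
    """Translate the trace to the origin and scale it to a fixed height."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    height = max(ys) - min(ys)
    if height == 0:
        return list(points)  # flat trace: nothing to scale
    scale = target / height
    x0, y0 = min(xs), min(ys)
    return [((x - x0) * scale, (y - y0) * scale) for x, y in points]


raw = [(0, 0), (1, 3), (2, 0)]      # a spike that smoothing should damp
print(smooth(raw))
print(normalize_height([(0, 0), (2, 4)]))
```

A wider window removes more digitizer noise but also rounds off genuine sharp turns in the trace, so the window size is a trade-off that would need tuning for a given device.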
Common normalization procedures involve correction of baseline drift [19], compensation of writing slant [21], [126], and adjustment of the script size [52].

Segmentation refers to the different operations that must be performed to get a representation of the various basic units that the recognition algorithm will have to process. It generally works at two levels. The first level deals with the whole message and focuses, for example, on line detection [$5], [242], word segmentation [227], as well as separating contextual inputs (gesture commands [186], [243], [247], handwriting style [238], equations [43], diagrams [243], and diacritics [212]) from text. At this level, the goal is to define spatial zones or temporal windows, or both, that allow the extraction of disjoint basic units. At the second level, the methodology focuses on the segmentation of the input into individual characters or even into subcharacter units, such as strokes. This operation is among the most challenging, particularly for the recognition of cursive script [172]. In most cases, this segmentation is tentative and is corrected later during classification. In some systems, this step is totally avoided by working at the word level [50], [51], [157]. However, this approach generally makes sense only for small vocabulary applications where a lexicon search is fast enough to accommodate a real-time system. Some methods combine holistic recognizers with segmentation-based algorithms [177]. This is generally performed at the shape level, at the lexical level (using a word-shape based lexicon), or at the level of output word lists.

The major problem with character segmentation is the difficulty of determining the beginning and ending of individual characters.
The most common approaches used nowadays, unsupervised learning [82], [128] and data-driven knowledge-based methods [84], [166], are still insufficient for most applications. Some strategies start bottom-up, directly from the basic strokes that have been used to write a specific character. These strokes are generally hidden in the signal due to anticipation or time-superimposition effects (see Fig. 2b, Fig. 2c, Fig. 2d, and Fig. 2e) [144], [168]. Several operational approaches have been proposed to define and represent these basic strokes: segmentation at the point of maximum curvature [116], [ni], at vertical velocity zero crossings [98], at minima of the y coordinates [61], at minima of absolute velocity [197]. Some methods use a scale-space approach [94] or a
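One of the stroke segmentation criteria listed above, cutting at vertical velocity zero crossings, can be sketched as follows. The sketch assumes uniformly sampled pen points, so that the sign of the vertical velocity is the sign of the difference between successive y values. The function names are hypothetical and no particular cited system is reproduced.

```python
def vertical_velocity_zero_crossings(ys):
    """Indices where the vertical direction of pen movement reverses."""
    cuts = []
    prev_sign = 0
    for i in range(1, len(ys)):
        dy = ys[i] - ys[i - 1]
        sign = (dy > 0) - (dy < 0)
        if sign == 0:
            continue  # flat step: keep the last known direction
        if prev_sign != 0 and sign != prev_sign:
            cuts.append(i - 1)  # sample where the direction turned
        prev_sign = sign
    return cuts


def split_at(points, cuts):
    """Split a trace into segments that share their boundary samples."""
    bounds = [0] + cuts + [len(points) - 1]
    return [points[a:b + 1] for a, b in zip(bounds, bounds[1:])]


ys = [0, 1, 2, 3, 2, 1, 0, 1, 2]      # up, down, up
points = list(zip(range(len(ys)), ys))
cuts = vertical_velocity_zero_crossings(ys)
segments = split_at(points, cuts)
print(cuts, [len(s) for s in segments])
```

On noisy data the sign of the vertical velocity flips spuriously, so a criterion like this would normally be applied after smoothing the trace.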
