
1. Outlines are analysed and stored.
2. Outlines are gathered together as blobs.
3. Blobs are organized into text lines.
4. Text lines are broken into words.
5. The first pass of the recognition process attempts to recognize each word in turn.
6. Satisfactory words are passed to the adaptive trainer.
7. Lessons learned by the adaptive trainer are employed in a second pass, which attempts to recognize the words that were not recognized satisfactorily in the first pass.
8. Fuzzy spaces are resolved and the text is checked for small caps.
9. Digital text is output.

The process uses a blob-finding algorithm.

Step 1: Line & word finding

A. Line finding
In essence, line finding consists of blob filtering and line construction.
Blob filtering: compute the median height. Blobs whose height is below this value are likely to be punctuation marks, other diacritical marks, or noise.
Line construction: group the blobs together to form text lines.

B. Baseline fitting
Baselines must be fitted to improve accuracy on documents whose text lines are curved.

C. Fixed pitch detection and chopping
Detect the fixed pitch and chop the text into characters according to this pitch.

D. Proportional word finding
Handles text that is not fixed pitch.
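The blob filtering and line construction described under A. Line finding can be illustrated with a minimal sketch. It is not the engine's actual code: the `Blob` type, the `fraction` threshold and the "vertical overlap" grouping rule are all assumptions made for illustration.

```python
from statistics import median
from typing import NamedTuple


class Blob(NamedTuple):
    """Hypothetical bounding box of one blob (y grows downward)."""
    x: int  # left edge
    y: int  # top edge
    w: int  # width
    h: int  # height


def filter_blobs(blobs, fraction=0.5):
    """Split blobs into (text_sized, small) relative to the median height.

    `fraction` is an assumed tuning knob; the notes only say that blobs
    below the typical height are probably punctuation, diacritics or noise.
    """
    threshold = fraction * median(b.h for b in blobs)
    text_sized = [b for b in blobs if b.h >= threshold]
    small = [b for b in blobs if b.h < threshold]
    return text_sized, small


def build_lines(blobs):
    """Group blobs into lines: scan left to right and attach each blob to
    the first line whose last blob it overlaps vertically."""
    lines = []
    for b in sorted(blobs, key=lambda blob: blob.x):
        for line in lines:
            last = line[-1]
            overlap = min(b.y + b.h, last.y + last.h) - max(b.y, last.y)
            if overlap > 0:
                line.append(b)
                break
        else:
            lines.append([b])  # no existing line fits, start a new one
    lines.sort(key=lambda line: line[0].y)  # top line first
    return lines
```

In this sketch the small blobs are set aside rather than discarded, so that punctuation and diacritics could be reattached to their lines once the lines are known.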

Step 2: Word recognition
Recognition is based on the outline of each character.
Associator: uses a search algorithm to try candidate segmentations.
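The associator idea in Step 2 (searching over candidate segmentations) could look roughly like the sketch below. The notes do not name the search algorithm, so the best-first search used here is an assumption, and `score` is a hypothetical stand-in for a character classifier that returns a non-negative cost for a group of adjacent pieces.

```python
import heapq


def best_segmentation(pieces, score):
    """Best-first search over ways of grouping adjacent pieces.

    pieces: sequence of chopped fragments (strings in this toy example).
    score:  hypothetical callable, group (tuple of pieces) -> cost >= 0,
            lower meaning a better match to some character.
    Returns (total_cost, groups) for the cheapest full segmentation.
    """
    n = len(pieces)
    # Heap entries: (cost so far, index of next unused piece, groups so far)
    heap = [(0.0, 0, ())]
    cheapest = {}  # next-piece index -> lowest cost already expanded
    while heap:
        cost, i, groups = heapq.heappop(heap)
        if i == n:
            return cost, list(groups)  # first complete path is the cheapest
        if cheapest.get(i, float("inf")) <= cost:
            continue  # this position was already expanded more cheaply
        cheapest[i] = cost
        for j in range(i + 1, n + 1):
            group = tuple(pieces[i:j])
            heapq.heappush(heap, (cost + score(group), j, groups + (group,)))
    return None  # only reached when the heap empties without a result


# Toy usage: fragments "c" and "l" match better when joined (as in a "d").
toy_costs = {("c",): 0.6, ("l",): 0.6, ("c", "l"): 0.2, ("a",): 0.1}
total, groups = best_segmentation(
    ["c", "l", "a"], lambda g: toy_costs.get(g, 1.0)
)
```

With the toy costs above, the search joins the first two fragments into one character and keeps the third on its own.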

Step 3: Linguistic analysis
Once the characters have been recognized, they are assembled into words. Linguistic analysis is run once each time a word is assembled. The priority order is: top frequent word, top dictionary word, top numeric word, top UPPER case word, top lower case word, top classifier word. Is the chosen word the one with the smallest distance across all of the above categories?
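One way to read the selection rule in Step 3 is sketched below, assuming each category proposes a single candidate word with a distance and the smallest distance wins. The category names, the `(word, distance)` layout and the use of the listed priority order as a tie-breaker are assumptions, since the notes leave the exact rule as an open question.

```python
# Priority order taken from the list in the notes; the short names are hypothetical.
PRIORITY = ["frequent", "dictionary", "numeric", "upper", "lower", "classifier"]


def choose_word(candidates):
    """Pick the candidate with the smallest distance.

    candidates: dict mapping a category name from PRIORITY to a
                (word, distance) pair.
    Ties on distance are broken by the position of the category in PRIORITY.
    Returns (category, word).
    """
    distance, _, category, word = min(
        (dist, PRIORITY.index(cat), cat, w)
        for cat, (w, dist) in candidates.items()
    )
    return category, word


# Toy usage: the dictionary word and the raw classifier output tie on
# distance, so the higher-priority dictionary word is kept.
choice = choose_word({
    "dictionary": ("form", 0.8),
    "classifier": ("forrn", 0.8),
    "numeric": ("f0rm", 2.5),
})
```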
