Đorđević D., Mihajlov D., Josifovski Lj., Computer System for Support of Humans With Damaged Sight: Subsystem for Optical Character Recognition of Printed Cyrillic Text, Simpozijum o računarskim naukama i informatici, Zbornik radova YU Info '95, Knjiga 3, pp. 240-243, Brezovica, Srbija, 1995.

Computer System for Support of Humans With Damaged Sight: Subsystem for Optical Character Recognition of Printed Cyrillic Text

Dejan Đorđević, Dragan Mihajlov, Ljubomir Josifovski

Faculty of Electrical Engineering - Skopje
Faculty of Mechanical Engineering - Skopje

Abstract - A subsystem for optical character recognition of printed Macedonian Cyrillic text is presented as a part of a system for supporting people with damaged sight. The system recognizes printed Cyrillic text and either prints it on a Braille printer or synthesizes it as speech. Various commercial programs for optical character recognition were studied, but they proved unsuitable for use in the system. Therefore a dedicated program for optical character recognition was developed. Several types of classifiers were tested: simple overlapping of binary matrices, a neural-network-based perceptron, a back-propagation network, and adaptive logic networks. A comparative analysis of their accuracy, speed, resource requirements, and sensitivity to noise and deformations was carried out.

Key words: Cyrillic text, optical character recognition, adaptive logic network, neural network

1. INTRODUCTION

People use their senses of sight, hearing, touch, taste and smell to receive information from their surroundings (Fig. 1). If one information input is disabled, the information from that input can be redirected to the others after it has been reconstructed in a form that the other senses can perceive. People with damaged sight, who cannot receive information by the sense of sight or can do so only to a reduced degree, are trained to receive information by other senses, primarily touch and hearing. For their needs, materials are printed in Braille writing or recorded on audio tapes. Materials prepared this way are limited in number and not always up to date.

Figure 1. Model of human as a system (senses shown: sight, hearing, touch)

As a result of a collaboration between the Faculty of Electrical Engineering and the Department for Rehabilitation of Children and Youngsters with Damaged Sight "Dimitar Vlahov" in Skopje, a system for helping people with damaged sight is under development. The system includes automatic reading of printed Macedonian Cyrillic text, and its printing on a Braille printer or its synthesis by a speech synthesizer [1]. This paper addresses the subsystem for optical character recognition of printed Cyrillic text, as a part of the system.

2. SYSTEM DESCRIPTION

A schematic structure of the system for helping people with damaged sight is given in Fig. 2. The system includes a personal computer, a graphical scanner, a Braille printer and a speech synthesizer.

Figure 2. System for helping people with damaged sight (components shown: scanner, computer, Braille printer, speaker)

Printed text is scanned with a scanner, and the graphical image data is stored in the computer in the form of a bit-mapped file. An optical character recognition program analyzes the picture of the scanned text, recognizes the printed characters, and stores them in a text file. This text file can then be printed on a Braille printer, or synthesized as speech which can be heard from the loudspeaker. Once converted to text form, documents can be stored on diskettes and reused for printing or speech synthesis (Fig. 3).

Several commercial products for optical character recognition have been tested, but they appear to be rather inappropriate for recognizing Macedonian Cyrillic letters and for incorporation in the system. Some commercial programs can be trained for foreign characters, but training them to recognize Cyrillic letters is tricky, because they offer no possibility of redefining the recognition of those Cyrillic letters that have the shape of Latin letters. This is the reason why a special program for optical character recognition (OCR) was developed.

Several methods for character recognition have been tried. The work started with a simple matrix overlapping method which, as expected, was not very satisfactory. Other methods, such as contour tracing, have been tried too, but they appear to be slow and very complicated to implement. A neural network model was adopted as the pattern classifier, as it appears to be very flexible. Several neural network architectures, namely a two-layer perceptron, a multilayer back-propagation neural network, and an adaptive logic network, have been realized.
All of them have shown satisfactory results.

Figure 3. Recognizing text, printing, archiving and speech synthesis (stages shown: printed text, optical scanner, scanning, bitmap, OCR, text file, Braille writing, speech synthesis, speech, archiving in digital form)

3. SUBSYSTEM REALIZATION

A program for optical character recognition of Cyrillic text has been developed. The program was written in C and works on a PC compatible computer. It consists of three parts. In the first part the scanned picture of the text is analyzed and the characters are separated (Fig. 4). The second part performs the recognition of the characters, while the third part is used for training the network for recognition. The second and the third part of the program are realized in several versions using different neural networks and teaching strategies.

Figure 4. Character separation (steps shown: TIFF file, analyzing the picture, locating text lines, locating words, locating characters, separating characters in individual matrices, filtering, scaling, to pattern classifier)

The first part of the program uses a TIFF file holding the scanned text as input. It analyzes the picture and locates the lines of text, then the words in a line, and separates the characters at the end. Horizontal projections of the pixels are used to locate the lines of text in the picture (Fig. 5). Similarly, the vertical projections of the pixels for every line are calculated and used for locating the words in that line (Fig. 6). At the end the characters in the word are located.

Figure 5. Horizontal projections of a scanned text (sample: a scanned passage of Macedonian text)

Figure 6. Vertical projection in a single line of scanned text

The binary matrices obtained in this way are of different dimensions.
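The projection-based location of lines, words and characters described above can be illustrated with a minimal sketch. It is written in Python for brevity (the program described here was written in C), and every name in it (`runs`, `segment`, the `word_gap` threshold) is a hypothetical choice made for illustration, not taken from the original program. It assumes a binary image given as a list of rows of 0/1 pixels, and it assumes that a gap of at least `word_gap` blank columns separates two words, a rule the text does not specify. For compactness the sketch finds the character column spans first and then groups them into words, rather than locating words before characters.

```python
def runs(profile, threshold=0):
    """Return (start, end) pairs of consecutive entries above threshold."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i                      # a run of ink begins
        elif value <= threshold and start is not None:
            spans.append((start, i))       # the run ended at i (exclusive)
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def horizontal_projection(image):
    """Number of black pixels in every row."""
    return [sum(row) for row in image]


def vertical_projection(image, top, bottom):
    """Number of black pixels in every column of rows top..bottom-1."""
    width = len(image[0])
    return [sum(image[y][x] for y in range(top, bottom)) for x in range(width)]


def segment(image, word_gap=3):
    """Locate text lines, then words, then characters inside each word.

    Assumption: blank gaps narrower than word_gap columns separate
    characters, wider gaps separate words.
    """
    lines = []
    for top, bottom in runs(horizontal_projection(image)):
        words = []
        for span in runs(vertical_projection(image, top, bottom)):
            if words and span[0] - words[-1][-1][1] < word_gap:
                words[-1].append(span)     # same word, next character
            else:
                words.append([span])       # a new word starts here
        lines.append(((top, bottom), words))
    return lines


# Tiny hand-made "page": two text lines, the first holding two words.
page = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 1, 1],
    [1, 1, 0, 1, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0, 0],
]
layout = segment(page)
```

Here `layout` holds one entry per text line: its row span and, for every word, the column spans of its characters. Each span can then be cut out of the image as an individual binary matrix.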

Each matrix is therefore copied into a square matrix, where some filtering is performed. After the filtering, the sample is scaled to a binary square matrix of constant dimensions (16×16), which is sent to the neural network for recognition.

The second part of the program is an emulation of a neural network, which actually does the pattern recognition and classification. In order to determine the abilities of different neural network architectures to be used for character recognition, the second and the third part of the program have been realized in several versions, each using a different network architecture.

In the first version a simple two-layer perceptron-type neural network was used (Fig. 7). The first layer consists of 256 neurons, each connected to one element of the binary matrix holding the character sample for recognition. The number of neurons in the second layer corresponds to the number of letters which should be recognized. The network was trained using the simple delta rule. This network appeared to be rather fast during training and recognition, but not accurate enough.

Figure 7. Two-layer perceptron (labels: bitmap, output vector, letter)

In the second approach a multilayer neural network was used. We have tested 3 and 4 layer networks with 50 to 300 neurons in the hidden layers. Back propagation was used for training. Training of this network was extremely slow. Once trained it gives good results in recognition, but the recognition was rather slow.

In the third approach an adaptive logic network (ALN) was used. An adaptive logic network is a special case of the familiar multilayer perceptron (MLP) feedforward network. It is commonly organized in the form of a binary tree with nodes of two types: adaptive elements and leaves (Adaptive TREE). Each element has two inputs and can operate as an AND, OR, LEFT, or RIGHT gate. Binary inputs and their complements are randomly connected to the leaves, while the output is received at the root of the tree (Fig. 8). Its unique architecture, utilizing only the logical functions AND and OR (or logical gates if realized in hardware), enables the basic pattern classification computations on binary information to be performed at extremely high speed.

Figure 8. Adaptive logic network (labels: input binary vector, complements, random connections, multiple parallel trees)

Our implementation of ALN uses one or more trees per output class. Every tree is trained to recognize a single class, i.e. to answer whether the sample belongs to its class or not. Several trees (usually an odd number) are provided for every class and they vote for the decision against the trees of the other classes (Fig. 9). The maximum selector decides in favor of the character with the largest number of votes. We have tried networks with 512 to 2048 leaves and realizations of 1 to 9 voters. ALN appeared to be relatively fast during training, and extremely fast during recognition. It gives very good results in recognition, especially when using more voters.

4. COMPARISON ON APPLIED NEURAL NETWORK ARCHITECTURES

The ability of neural networks to be trained to recognize samples and, once trained, to generalize, i.e. to give appropriate responses to other possible input vectors not presented during training, makes them an excellent candidate for implementation in OCR systems. Several neural network architectures have been realized and compared in their abilities for character recognition.

Adaptive logic networks are very convenient for hardware implementation, and can evaluate extremely fast. Even emulated in software they are much faster than other neural network models. On the other hand, as the dimension of the problem increases, the complexity of the ALN increases relatively slowly, unlike, for example, that of the MLP. Training an ALN is also much faster than training an MLP. Another feature of ALNs is that all the available data can be used to train the network, without first selecting the data needed for making the decision. Once trained, the ALN has a built-in capacity to generalize and maintains excellent noise immunity [2].
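The tree structure, the voting and the maximum selection described above can be sketched as follows. This is a minimal illustration in Python rather than the authors' C implementation. The data layout (a dict with `leaves` and `levels`), the function names and the tie-breaking rule are assumptions made for the example. Only evaluation is shown: the adaptation procedure that trains the node functions is not described in the text and is not reproduced here.

```python
import random

AND, OR, LEFT, RIGHT = "AND", "OR", "LEFT", "RIGHT"


def node_output(function, left, right):
    """Output of one two-input adaptive element for binary inputs."""
    if function == AND:
        return left & right
    if function == OR:
        return left | right
    if function == LEFT:
        return left
    return right  # RIGHT


def random_tree(n_inputs, n_leaves, rng):
    """Untrained tree whose leaves are wired at random to inputs or complements.

    Assumption: n_leaves is a power of two, so the tree is complete.
    """
    leaves = [rng.randrange(2 * n_inputs) for _ in range(n_leaves)]
    levels, width = [], n_leaves // 2
    while width >= 1:
        levels.append([rng.choice([AND, OR, LEFT, RIGHT]) for _ in range(width)])
        width //= 2
    return {"leaves": leaves, "levels": levels}


def evaluate_tree(tree, bits):
    """Propagate a binary input vector from the leaves up to the root."""
    extended = list(bits) + [1 - b for b in bits]   # inputs, then complements
    values = [extended[i] for i in tree["leaves"]]
    for level in tree["levels"]:                    # bottom level first
        values = [node_output(f, values[2 * k], values[2 * k + 1])
                  for k, f in enumerate(level)]
    return values[0]                                # output at the root


def classify(forest, bits):
    """Each class owns several trees (voters); the class with most votes wins.

    Ties go to the class listed first in the forest (an arbitrary choice).
    """
    votes = {label: sum(evaluate_tree(t, bits) for t in trees)
             for label, trees in forest.items()}
    return max(votes, key=votes.get)


# Hand-built two-input example with three voters per class.
only_first = {"leaves": [0, 3], "levels": [[AND]]}    # b0 AND (NOT b1)
only_second = {"leaves": [1, 2], "levels": [[AND]]}   # b1 AND (NOT b0)
first = {"leaves": [0, 1], "levels": [[LEFT]]}        # b0
second = {"leaves": [0, 1], "levels": [[RIGHT]]}      # b1
forest = {"A": [only_first, first, first], "B": [only_second, second, second]}
winner = classify(forest, [1, 0])                     # "A" collects all three votes
```

For 16×16 samples the input vector would have 256 bits, and `random_tree(256, 512, random.Random())` would give an untrained tree of the smallest size mentioned in the text. Because evaluation uses only AND, OR and pass-through operations on single bits, the cost per tree is one pass over its nodes, which is consistent with the high recognition speed reported for the ALN.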

Figure 9. Pattern recognition using adaptive logic trees (labels: sample (binary matrix 16x16), 5 voters per class, maximum selector, class)

Summary of the features of different methods commonly used for optical character recognition is given in Table 1.

Table 1 (columns: Overlapping, Contour, 2L Percep., MLP (BP), ALN; rows: sensitivity on rotation, noise, translation and different fonts; resources required (memory, time); complexity; recognition). [The cell ratings are not recoverable from this copy.]

Training of the networks is performed by a dedicated program for automatic training, to which a set of character samples is presented together with the classes they correspond to. After the training, the weights of the neurons, or the functions in the nodes of the ALN, are saved in a file which is later used during the recognition.

Summary of the speed of different neural network architectures during training and recognition is given in Table 2.

Table 2 (columns: 2L Perceptron, MLP (BP), ALN; rows: time during training, time during recognition). [The cell ratings are not recoverable from this copy.]

5. CONCLUSION

A subsystem for optical character recognition of printed Macedonian Cyrillic text is realized. The subsystem is implemented in a system for support of people with damaged sight. An analysis of the properties of different types of neural network architectures, considering their suitability for implementation in an OCR system, has been carried out. The simple two-layer perceptron, although relatively fast and with small resource requirements, showed rather poor results during recognition. Multilayer neural networks have shown quite acceptable accuracy in recognition, but they appear to be very slow. Adaptive logic networks appeared to be the most flexible. They are very fast during recognition and training, and have shown satisfactory accuracy. Considering the fact that the studies have been made on a simulation example, the comments of the end users are intended to serve as feedback for further improvement of the system. Once in use, this system would help people with damaged sight to follow the daily press and current literature.

LITERATURE

[1] D. Mihajlov, D. Đorđević, N. Kotevska, "Computer System for Support of Humans with Damaged Sight", ETAI, Ohrid, 1993.
[2] G. v. Bochmann, W. Armstrong, "Properties of Boolean Functions with a Tree Decomposition", BIT 13, 1974, pp. 1-13.
[3] W. Armstrong and G. Godbout, "Properties of Binary Trees of Flexible Elements Useful in Pattern Recognition", IEEE 1975 International Conf. on Cybernetics and Society, San Francisco, 1975, IEEE Cat. No. 75 CHO 997-7 SMC, pp. 447-449.
[4] D. Đorđević, "Opticko prepoznavanje na znaci so upotreba na adaptivna logicka mreža", seminarska rabota po predmetot nevronski mreži i paralelno distribuirano procesiranje, Elektrotehnički fakultet - Skopje, 1995.
[5] J. L. McClelland, D. E. Rumelhart, Exploration in Parallel Distributed Processing: A Handbook of Models, Programs, and Exercises, MIT Press, 1988.
