Professional Documents
Culture Documents
Abstract — The purpose of this article is to recognize letters, and especially fonts, in images containing text. To perform the recognition process, the text in the image is first divided into letters. Then each letter is sent to the recognition system. Results are filtered according to the vowels that are most used in Turkish texts, and as a result the font of the text is obtained. To separate the letters from the text, an algorithm that we developed has been used. This algorithm has been designed with Turkish characters that have dots or accents, such as i, j, ü, ö and ğ, in mind, and helps these characters to be perceived by the system as a whole. To provide recognition of Turkish characters, all possibilities were created for each of these characters and the algorithm was formed accordingly. After each character is recognized, these individual parts are sent to a pre-trained deep convolutional neural network. In addition, a data set has been created for this pre-trained network. The data set contains nearly 13 thousand letter images of size 227*227*3, created with different point sizes, fonts and letters. As a result, 100 percent success has been attained in training, while %79.08 letter and %75 font success has been attained in the tests.

Keywords—deep learning, convolutional neural networks, font recognition, letter recognition

I. INTRODUCTION

In recent years, humanity has been trying to move all operations of daily life onto digital systems, reducing and automating human labor. This automation requirement enabled the creation of intelligent systems and provided an environment for the application of the systems called artificial intelligence and machine learning [18]. Many studies on deep learning have been made and continue to be done [19, 20]. Deep learning is a method that simulates the structure of the human brain [21]. This method is a series of algorithms for finding a hierarchical representation of the input data by simulating the way the human brain senses the important parts of the sensory data it is exposed to at all times [1]. The idea at the basis of deep learning emerged in the 1950s with the definition of the perceptron, the first machine with the ability to learn. In the 1980s, the multilayer perceptron structure was identified, but the perceptron still had limited learning ability. Thus, the proposal of neural networks with many layers emerged in the 2000s. The structure that came with this proposal has been better capable of learning. These multiple layers are the infrastructure of deep learning [2, 3].

Convolutional neural networks, which show high performance and achievements in a variety of studies on images, are known as a deep learning model that delivers enhanced results. This model is useful for finding patterns in images to recognize objects, faces and scenes. These networks learn directly from image data, using patterns to classify images and eliminating the need for manual feature extraction. The model has a multilayer structure, each layer comprising a plurality of two-dimensional planes and a plurality of neurons in each plane [9]. These layers can be examined in three sections: input layers, hidden layers and output layers. While all the complex processes required for learning take place in the hidden layers, the data enters the system through the input layer and the result is obtained from the output layer. This network, consisting of more than one layer, allows the neurons to perform the learning action in parallel. In addition, whereas in classical machine learning the answer is only 1 or 0, the output of studies using this network structure can take values between 0 and 1, such as 0.2 and 0.7. This makes it easier to solve the problem in a more detailed way, increases the success in learning and provides better results. After features are learned in many layers, the next part of a convolutional neural network is classification. The next-to-last layer is a fully connected layer that outputs a vector of x dimensions, where x is the number of classes that the network will be able to predict. This vector contains the probabilities for each class of any image being classified. The final layer of the convolutional neural network architecture uses a classification layer to provide the classification output.

This paper has been divided into five parts. The first section gives brief background on deep learning. The second section reviews the related works. The third section is concerned with the methodology used for this study. The fourth section presents the findings of the research. Finally, we provide the conclusion.

II. RELATED WORKS

In recent years, much more information has become available on deep learning, and a considerable amount of literature has been published on this topic. Even though convolutional neural networks were improved in the 1990s, they have gained popularity only in the last decade. Increasing amounts of data, some unsolved problems in image and video processing, and the insufficiency of earlier learning methods increased the popularity of deep learning, so most of the studies are based on the learning efficiency of convolutional neural networks [14, 15]. Today, with their very successful learning abilities, deep learning networks have been used for classification, detection and diagnosis [4, 6, 8, 11, 12, 13].

Letter recognition in digitized documents is very important for archiving. Some digitized documents contain both images and text. So an analog neural network processor
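The classification stage described in the introduction — a fully connected layer producing one raw score per class, followed by a final layer that turns those scores into probabilities between 0 and 1 — can be sketched as follows. This is a minimal NumPy illustration, not the network used in this study; the feature size, the random weights and the 29-class letter setup are assumptions for the example.

```python
import numpy as np

def fully_connected(features, weights, bias):
    # Next-to-last layer: one raw score (logit) per class.
    return features @ weights + bias

def softmax(scores):
    # Final classification layer: scores -> probabilities in (0, 1)
    # that sum to 1, instead of a hard 0/1 answer.
    shifted = scores - scores.max()   # subtract max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

rng = np.random.default_rng(0)
num_classes = 29                      # e.g. the 29 letters of the Turkish alphabet
features = rng.standard_normal(128)   # learned features for one input image
weights = rng.standard_normal((128, num_classes))
bias = np.zeros(num_classes)

probs = softmax(fully_connected(features, weights, bias))
predicted = int(np.argmax(probs))     # index of the most probable class
```

Because the output is a full probability vector rather than a single hard decision, later stages (such as the probability-based font filtering used in this study) can reason about how confident the network is in each class.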
IBIGDELFT2018 61
International Congress on Big Data, Deep Learning and Fighting Cyber Terrorism Ankara, Turkey, 3-4 Dec, 2018
A. DataSet
Since no prepared data set was found in the literature, a new data set was created for this study. In the first part, the data set was categorized by letters, and in the second part it was categorized by fonts. In the letter categorization, images have been created for each of the 29 letters. Only letters have been included; digits have been excluded in this study. Since images may contain letters of different sizes, three point sizes have been selected. In this way, three letter images (big, medium and small) have been created for each lowercase and uppercase letter: 72 point as big, 20 point as medium and 8 point as small. After the preparation of all the letter images, the images have been resized to 227*227 for the deep convolutional neural network. In this way, 228 letter images for one letter and 6612 letter images for the whole alphabet have been created.

Another data set has been created by selecting 38 fonts that support Turkish letters and using these fonts for each of the 29 letters in the Turkish alphabet. Here the letters have been categorized according to fonts. Upper and lower cases have been included; 38 fonts have been used for one letter, giving 174 letter images for one font and 6612 letter images for all fonts in total. Thus, the preparation part has been completed. Some examples from the database are given below.

Fig. 2. Training process scheme

C. Preprocessing of Image
The image has been processed before being sent to the network. These operations are listed below.

• In the first step, the image has been converted to intensity image format.

• In the second step, the image has been converted to binary format.

• In the third step, the image has been converted to its complement, because of the morphological image processing.

• In the final step, morphological image processing has been used to find the location of each letter. We have used a function that returns the label matrix containing labels for the 8-connected (horizontal, vertical and diagonal) objects in the image.
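The four preprocessing steps above can be sketched as follows. The paper relies on a built-in labeling function for the last step; the sketch below is a simplified NumPy illustration of the same pipeline, in which the 0.5 binarization threshold and the tiny test image are assumptions made for the example.

```python
import numpy as np
from collections import deque

def to_intensity(rgb):
    # Step 1: RGB image -> intensity (grayscale) image, values in [0, 1].
    return rgb.mean(axis=2)

def to_binary(gray, threshold=0.5):
    # Step 2: intensity -> binary image (True = bright background pixel).
    return gray > threshold

def complement(binary):
    # Step 3: invert, so the dark letters become the foreground objects.
    return ~binary

def label_8_connected(mask):
    # Step 4: build a label matrix for the 8-connected (horizontal,
    # vertical and diagonal) objects via breadth-first flood fill.
    labels = np.zeros(mask.shape, dtype=int)
    h, w = mask.shape
    count = 0
    for i in range(h):
        for j in range(w):
            if mask[i, j] and labels[i, j] == 0:
                count += 1
                queue = deque([(i, j)])
                labels[i, j] = count
                while queue:
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and mask[ny, nx] and labels[ny, nx] == 0):
                                labels[ny, nx] = count
                                queue.append((ny, nx))
    return labels, count

# Two dark blobs on a white background; (1,1) and (2,2) touch only
# diagonally, so 8-connectivity merges them into a single object.
img = np.ones((5, 5, 3))
img[1, 1] = 0.0
img[2, 2] = 0.0
img[4, 4] = 0.0
mask = complement(to_binary(to_intensity(img)))
labels, count = label_8_connected(mask)
```

Each distinct label in the resulting matrix then gives the pixel locations of one letter, which can be cropped and sent to the network individually.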
TABLE III. PROPERTIES OF TEST IMAGES

Fonts            Points
Arial            8, 20, 72
Bahnschrift      8, 20, 72
Century Gothic   8, 20, 72
Juice ITC        8, 20, 72

Test results have been given in Table IV.

TABLE IV. RESULTS OF TESTING

                                    Points
Fonts            8                     20               72                  Result
Arial            14/26: True           23/44: True      11/21: True         %100
Bahnschrift      3/32: False           9/44: False      7/21: True          %33
                 (Franklin Gothic)     (Arial)          (Bahnschrift)
Century Gothic   17/30: True           24/44: True      13/21: True         %100
Juice ITC        3/9: False            34/43: True      20/21: True         %66
                 (Courier New)

Total Average Success: %75

IV. RESULTS AND CONCLUSION

In this study, the aim has been to develop a deep network recognizing both fonts and letters in Turkish. With this aim, a pre-trained network has been trained with nearly 13 thousand images. The letter recognition training accuracy was %100, while the font recognition training accuracy was %73.44 because of the similarity of the fonts. In order to increase the font recognition percentage, a probability calculation has been applied after the network output has been obtained. Even though the first test image's font accuracy is 14/26, because the probability is bigger than 0.5, it has been accepted as Arial. In this way, the recognition performance has been increased a bit more. The network has been tested with 12 images. These images contain all letters. According to the results, letter recognition with this network reaches nearly %100, but the accuracy of font recognition is low, as can be seen from Table IV. Using the probability, however, the font recognition percentage has been increased. In future studies, a GUI will be developed, the number of tests will be increased, and the possible fonts will be shown with their respective probabilities.

REFERENCES

[1] E. Bati, "Deep convolutional neural networks with an application towards geospatial object recognition," Diss. Middle East Technical University, Ankara, 2014.
[2] O. Elitez, "Handwritten digit string segmentation and recognition using deep learning," Diss. Middle East Technical University, Ankara, 2015.
[3] M. U. Oner, "Metastasis detection and localization in lymph nodes by using convolutional neural networks," Diss. Middle East Technical University.
[4] M. Oquab, L. Bottou, I. Laptev and J. Sivic, "Learning and transferring mid-level image representations using convolutional neural networks," 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, pp. 1717-1724.
[5] C. Riley, P. Work and R. Miller, "Visual entity identification: a neural network approach," IJCNN-91-Seattle International Joint Conference on Neural Networks, Seattle, WA, USA, 1991, pp. 909 vol. 2.
[6] A. Caliskan, H. Badem, A. Basturk and M. E. Yuksel, "A comparative study on classification by deep learning," 2016 National Conference on Electrical, Electronics and Biomedical Engineering (ELECO), Bursa, 2016, pp. 503-506.
[7] Q. Wu, Y. L. Cun, L. D. Jackel and B. Jeng, "On-line recognition of limited-vocabulary Chinese character using multiple convolutional neural networks," 1993 IEEE International Symposium on Circuits and Systems, Chicago, IL, 1993, pp. 2435-2438 vol. 4.
[8] Y. Le Cun and Y. Bengio, "Word-level training of a handwritten word recognizer based on convolutional neural networks," Proceedings of the 12th IAPR International Conference on Pattern Recognition, vol. 3 – Conference C: Signal Processing (Cat. No.94CH3440-5), Jerusalem, Israel, 1994, pp. 88-92 vol. 2.
[9] P. Kuang, W. Cao and Q. Wu, "Preview on structures and algorithms of deep learning," 2014 11th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), Chengdu, 2014, pp. 176-179.
[10] B. E. Boser, E. Sackinger, J. Bromley, Y. LeCun, R. E. Howard and L. D. Jackel, "An analog neural network processor and its application to high-speed character recognition," IJCNN-91-Seattle International Joint Conference on Neural Networks, Seattle, WA, USA, 1991, pp. 415-420 vol. 1.
[11] Y. Bar, I. Diamant, L. Wolf, S. Lieberman, E. Konen and H. Greenspan, "Chest pathology detection using deep learning with non-medical training," 2015 IEEE International Symposium on Biomedical Imaging (ISBI), New York, NY, 2015, pp. 294-297.
[12] Y. Yuan, L. Mou and X. Lu, "Scene recognition by manifold regularized deep learning architecture," IEEE Transactions on Neural Networks and Learning Systems, vol. 26, no. 10, pp. 2222-2233, Oct. 2015.
[13] Z. Hu, J. Tang, Z. Wang, K. Zhang, L. Zhang and Q. Sun, "Deep learning for image-based cancer detection and diagnosis: a survey," Pattern Recognition, vol. 83, 2018, pp. 134-149, ISSN 0031-3203.
[14] M. A. Abbas, "Improving deep learning performance using random forest HTM cortical learning algorithm," 2018 First International Workshop on Deep and Representation Learning (IWDRL), Cairo, 2018, pp. 13-18.
[15] Y. Wang, "Cognitive foundations of knowledge science and deep knowledge learning by cognitive robots," 2017 IEEE 16th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC), Oxford, 2017, pp. 5-5.
[16] A. Koyun, E. Afsin, "2D optical character recognition based on deep learning," Journal of Turkey Informatics Foundation of Computer Science and Engineering, 2017, vol. 10, no. 1, pp. 11-14.
[17] S. I. Ilkin, M. Akin, "Attacking Turkish texts encrypted by homophonic," Proceedings of the 10th WSEAS International Conference on Electronics, Hardware, 2011.
[18] S. H. Tajmir, T. K. Alkasab, "Toward augmented radiologists: changes in radiology education in the era of machine learning and artificial intelligence," Academic Radiology, 2018, vol. 25, pp. 747-750.
[19] J. M. Valin, "A hybrid DSP/deep learning approach to real-time full-band speech enhancement," 2017.
[20] Y. Zhou, O. Tuzel, "VoxelNet: end-to-end learning for point cloud based 3D object detection," 2017.
[21] R. Wason, "Deep learning: evolution and expansion," Cognitive Systems Research, 2018, vol. 52, pp. 701-708.
[22] X. Ding, L. Chen, T. Wu, "Character independent font recognition on a single Chinese character," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, vol. 29, no. 2, pp. 195-204.
[23] F. K. Jaiem, F. Slimane, M. Kherallah, "Arabic font recognition system applied to different text entity level analysis," 2017 International Conference on Smart, Monitored and Controlled Cities (SM2C), Sfax, 2017, pp. 36-40.