
HANDWRITTEN TEXT RECOGNITION USING DEEP LEARNING

M HARINI G SUBHIKSHA R SRIHARINI V ABINAYA


ABSTRACT

The system is built to recognize handwritten text and convert it into digital form using deep learning,
a technique that can improve efficiency and approach human-level prediction. Handwriting recognition is
a technology for recognizing handwritten characters, and the handwritten text is supplied as images. In
this system we use convolutional neural networks to predict handwritten text in real time because these
networks are well suited to analysing images. To predict handwritten text, Optical Character Recognition
(OCR) is performed with a Convolutional Recurrent Neural Network (CRNN) model. OCR is a type of
image-based sequence recognition problem. Recurrent Neural Networks (RNN) are best suited to sequence
recognition problems, while Convolutional Neural Networks (CNN) are best suited to image-based
problems, so to cope with OCR problems we need to combine CNN and RNN. Deep learning gives higher
recognition accuracy than conventional classification techniques. The aim of our project is to build an
application that can recognize handwriting using deep learning concepts. We approach the problem with
CNNs because they provide better accuracy on such tasks. Image processing is the manipulation of images
within computer vision, and with the advent of technology there are many techniques for manipulating
photographs.
PROBLEM STATEMENT
● In this system we use a convolutional neural network to predict
handwritten digits in real time because these networks are well
suited to analyzing images.
● A convolutional neural network is built from convolutional layers,
pooling layers, dropout layers, a flatten layer, fully connected
layers and activation functions.
PROPOSED SYSTEM
● Introduces a novel approach using deep learning for handwritten text recognition and
conversion into digital form.
● Aims to enhance efficiency and achieve human-level prediction accuracy by leveraging
advancements in deep learning.
● Integrates Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) in a
Convolutional Recurrent Neural Network Model.
● Utilizes Optical Character Recognition (OCR) algorithm, leveraging the strengths of CNN for
image-based problems and RNN for sequence recognition.
EXISTING SYSTEM
● Relies on conventional classification techniques and methods for handwritten digit recognition.
● Acknowledges the progress made in recognizing handwritten digits but highlights the
limitations in accuracy impacting work efficiency.
● Utilizes a two-layer CNN network with two fully connected layers.
● Employs the ReLU function to mitigate gradient disappearance and saturation challenges.
OCR ALGORITHM
● OCR (Optical Character Recognition): OCR is a technology used to recognize text within images,
including scanned documents and photos. It converts various types of text images (typed, handwritten,
or printed) into machine-readable text data.
● OCR Process: OCR involves converting digital or hand-written text images into machine-readable text
that computers can process, store, and edit. This enables the manipulation of text as part of data entry
and processing software.
● Feature Extraction Methods: There are two main methods for extracting features in OCR: one
evaluates characters based on lines and strokes, while the other identifies entire characters through
pattern recognition.
● Pattern-Matching Algorithms: OCR software uses pattern-matching algorithms to compare text
images character by character with its internal database. If the system matches the text word by word,
it's called optical word recognition. OCR software essentially "reads" text and converts it into digital
form.
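The character-by-character matching described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 3x3 binary templates, the `TEMPLATES` dictionary and the function names are hypothetical and stand in for the much larger internal database a real OCR package would use.

```python
# Hypothetical 3x3 binary templates standing in for an OCR character database.
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}


def match_score(glyph, template):
    """Count the pixels on which the glyph and the template agree."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize_char(glyph):
    """Return the label of the template with the highest pixel agreement."""
    return max(TEMPLATES, key=lambda label: match_score(glyph, TEMPLATES[label]))


def recognize_text(glyphs):
    """Match a sequence of glyph bitmaps character by character."""
    return "".join(recognize_char(glyph) for glyph in glyphs)


noisy_l = ["100", "100", "110"]  # an "L" with one pixel missing
print(recognize_text([TEMPLATES["T"], noisy_l, TEMPLATES["I"]]))
```

Because the score counts agreeing pixels rather than requiring an exact match, a glyph with a small defect can still be assigned to its nearest template.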
SYSTEM ARCHITECTURE
MODULES

● Preprocessing
● Convolutional Layer
● Recurrent Layer
● Transcription Layer
PREPROCESSING

● The preprocessing unit in the architecture diagram prepares input data for the
neural network model.

● It includes resizing, normalization, noise reduction, contrast enhancement, and
segmentation. These steps ensure the input images are standardized, clean, and
optimized for effective recognition by the neural network.
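Two of the steps listed above, resizing and normalization, can be sketched as follows. This is a minimal sketch, assuming a grayscale image stored as a list of pixel rows. The function names, the nearest-neighbour resize and the 4x8 target size are illustrative choices, not details taken from the system itself.

```python
def resize_nearest(image, new_h, new_w):
    """Resize a grayscale image (list of pixel rows) by nearest-neighbour sampling."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def normalize(image):
    """Min-max scale pixel values to the range [0, 1]."""
    flat = [pixel for row in image for pixel in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero for a constant image
    return [[(pixel - lo) / span for pixel in row] for row in image]


def preprocess(image, height=4, width=8):
    """Bring an input image to a fixed size and a fixed value range."""
    return normalize(resize_nearest(image, height, width))


sample = [[0, 255], [128, 64]]  # a tiny 2x2 grayscale image
prepared = preprocess(sample)
print(len(prepared), len(prepared[0]))
```

Noise reduction, contrast enhancement and segmentation are omitted here. In practice they would be applied in the same pipeline before the image reaches the network.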
CONVOLUTIONAL LAYER

● This layer is used for image feature extraction. In the CRNN model, the convolutional
component is built from convolutional and max pooling layers, and it extracts a
sequential feature representation from the input image.
● The first layer of a Convolutional Neural Network is always a Convolutional layer.
Convolutional layers apply a convolution operation to the input, passing the result to
the next layer. A convolution converts all the pixels in its receptive field into a single
value.
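The convolution and max pooling operations described above can be written out directly. This is a sketch for illustration only: the hand-picked edge kernel and the toy image are assumptions, whereas in a trained network the kernel values are learned.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Each output value is the weighted sum of the pixels in one receptive field.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


image = [[0, 0, 1, 1, 1] for _ in range(5)]  # dark left side, bright right side
kernel = [[-1, 1]]                           # responds to a dark-to-bright step
features = conv2d(image, kernel)
pooled = max_pool(features)
print(features[0], pooled)
```

The feature map is non-zero only at the column where the brightness changes, and pooling shrinks the map while keeping that response.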
RECURRENT LAYER
1. Bi-directional RNN for Sequence Labelling:
● Bi-directional RNNs are used on top of convolutional layers to label
sequences.
● They capture information from both directions, enhancing sequence
understanding.
2. Fully Connected Layer:
● Connects every neuron from the previous layer to every neuron in the next.
● The number of units determines the output dimensionality.
● Typically uses the hyperbolic tangent (tanh) activation function.
3. Recurrent Layer:
● Composed of recurrent units that process the current input and the previous hidden
state to produce an output.
● Output can be further processed or sent to subsequent layers.
● Captures temporal dependencies within sequences, aiding pattern recognition.
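The recurrent update and the bi-directional pass described above can be sketched as follows. This is a deliberately small illustration: it assumes a single tanh unit and scalar features, and the weights are made-up numbers, whereas a CRNN would typically use learned weights and many units per direction.

```python
import math


def rnn_pass(inputs, w_in, w_rec, bias):
    """Run a one-unit tanh RNN over a sequence of scalar features."""
    hidden = 0.0
    outputs = []
    for x in inputs:
        # The new hidden state combines the current input with the previous state.
        hidden = math.tanh(w_in * x + w_rec * hidden + bias)
        outputs.append(hidden)
    return outputs


def bidirectional_rnn(inputs, fwd_params, bwd_params):
    """Pair a left-to-right pass with a right-to-left pass for every time step."""
    forward = rnn_pass(inputs, *fwd_params)
    backward = rnn_pass(inputs[::-1], *bwd_params)[::-1]
    return list(zip(forward, backward))


sequence = [1.0, 0.0, -1.0]   # e.g. one feature per image column
params = (0.5, 0.8, 0.0)      # hypothetical (w_in, w_rec, bias)
states = bidirectional_rnn(sequence, params, params)
print(states)
```

Each time step ends up with two values: one summarising everything to its left and one summarising everything to its right, which is what lets the layer use context from both directions.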
TRANSCRIPTION LAYER
Transcription Process:
● The transcription layer converts per-frame predictions made by the RNN into a sequence of labels or text. This process
is crucial for transforming the output of the neural network into readable text.
● Connectionist Temporal Classification (CTC) is a commonly used technique in the transcription process. It helps
decode the output from the RNN and convert it into text labels.
Role of Transcription Layer:
● The transcription layer operates after the recognition model, taking the output probabilities from the model.
● Its primary function is to convert these probabilities into a sequence of recognized text or characters. This involves
applying decoding algorithms to determine the most probable sequence based on the output probabilities.
Mapping Probabilities to Symbols:
● The transcription layer maps the continuous probability distributions generated by the recognition model into discrete
symbols, such as characters or words, representing the recognized text.
● By converting probabilities into discrete symbols, the transcription layer enables the neural network to output
readable text that accurately represents the input sequence.
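The mapping from per-frame probabilities to text can be illustrated with greedy (best path) CTC decoding: take the most probable symbol in each frame, merge consecutive repeats, then drop the blank symbol. This is a minimal sketch of one decoding strategy. The four-symbol alphabet, the "-" blank marker and the probabilities are assumptions made up for the example, and other decoders such as beam search exist.

```python
BLANK = "-"                        # assumed marker for the CTC blank symbol
ALPHABET = [BLANK, "a", "c", "t"]  # hypothetical output alphabet


def ctc_greedy_decode(frame_probs, alphabet=ALPHABET, blank=BLANK):
    """Decode per-frame probability rows into text by best-path CTC decoding."""
    # 1. Pick the most probable symbol in every frame.
    best_path = [
        alphabet[max(range(len(probs)), key=probs.__getitem__)]
        for probs in frame_probs
    ]
    # 2. Merge consecutive repeats, then 3. drop blanks.
    decoded = []
    previous = None
    for symbol in best_path:
        if symbol != previous and symbol != blank:
            decoded.append(symbol)
        previous = symbol
    return "".join(decoded)


frames = [
    [0.1, 0.1, 0.7, 0.1],  # c
    [0.1, 0.1, 0.7, 0.1],  # c (repeat, merged)
    [0.7, 0.1, 0.1, 0.1],  # blank
    [0.1, 0.7, 0.1, 0.1],  # a
    [0.1, 0.7, 0.1, 0.1],  # a (repeat, merged)
    [0.7, 0.1, 0.1, 0.1],  # blank
    [0.1, 0.1, 0.1, 0.7],  # t
]
print(ctc_greedy_decode(frames))
```

The blank symbol is what allows genuinely doubled letters: two "a" frames separated by a blank decode to "aa", while two adjacent "a" frames decode to a single "a".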
CONCLUSION
● An adaptive method is proposed for handwritten text recognition, in which the dataset is
pre-processed and then used to train CNN and RNN layers consecutively.
● The input word images are processed and fed into the neural network model layers during
recognition.
● The output of the CNN layers is further processed by the RNN layers. The results demonstrate
the potential of using CNN and RNN consecutively to improve accuracy steadily.
FUTURE SCOPE
● In future we plan to extend this study by considering different embedding models on a
larger variety of datasets.
● We aim to enhance the work by implementing online recognition and extending it to different
languages. We could also extend the system to recognize degraded text or broken
characters.
