Plagiarism Checker X Report

Plagiarism Checker X Originality Report
Similarity Found: 26%
Date: Wednesday, May 08, 2019

Statistics: 2517 words Plagiarized / 9657 Total words
Remarks: Medium Plagiarism Detected - Your Document needs Selective Improvement.
-------------------------------------------------------------------------------------------
CHAPTER 1 INTRODUCTION
Converting handwritten historical documents into digital form connects the past with the
present and makes these documents available to scholars worldwide. Extracting text from
images of handwriting is a difficult task with several practical applications.
In contrast to character recognition for scanned documents, recognizing text in images
of handwriting is difficult because of large variations in backgrounds, textures,
fonts, and lighting conditions. As a result, many text detection and recognition systems
rely on carefully hand-engineered features to extract the text.
Images of handwritten material usually contain a mixture of writing styles, with
unfamiliar font types and sizes and with multi-directional, skewed, and curved text lines.
1.1 Overview:
A great deal of knowledge is preserved in ancient documents but remains hidden for lack of
methods to access or read them. Such knowledge should not be left unused merely because
the documents are hard to access. We live in an era in which machines are learning to
perform hundreds of tasks.
Computers are acquiring artificial knowledge to solve some very complicated problems.
Techniques such as Machine Learning and Artificial Intelligence hold the key to many
future accomplishments. One such application of Machine Learning is handwritten text
detection.
Handwritten text detection is a technique in which the machine is trained on an existing
set of images and learns to recognize a whole new set of images and extract the text
written in them. This method helps to connect the past with the present by converting
handwritten historical documents into digital form and making them available to scholars
worldwide. The proposed system makes use of artificial neural networks (ANNs).
Multiple Convolutional Neural Network (CNN) layers are trained to extract relevant
features from the input image. These layers output a 1D or 2D feature map that is
handed over to the Recurrent Neural Network (RNN) layers. The RNN propagates information
through the sequence. Afterwards, the output of the RNN is mapped onto a matrix that
contains a score for every character at each element of the sequence.
Because the ANN is trained using a specific encoding scheme, a decoding algorithm must
be applied to the RNN output to get the final text. Training and decoding from this
matrix are done by the Connectionist Temporal Classification (CTC) operation. Decoding
can take advantage of a Language Model (LM). There are two reasons for preprocessing:
making the problem easier for the classifier and data augmentation.
Data augmentation is implemented by applying random transformations to the dataset images.
The decoded text can contain spelling mistakes, so a text post-processing technique is
applied to account for them.
1.2 Problem Statement:
Text detection has typically relied on Optical Character Recognition (OCR), whose accuracy
on images containing handwriting has been very low.
This is mainly due to variables such as background variation, font size, and style, which
are a key part of any document containing handwriting. Our goal is to develop a system
that uses neural networks to give accurate results and that can be trained on a CPU
instead of having to rely on a GPU.
1.3 Objectives:
Develop a Machine Learning model that overcomes the limitations of OCR and detects
handwritten text contained in images with very high accuracy.
Develop a model that can be trained on a CPU and later be extended to accept any user
input that is not in the original dataset.
CHAPTER 2 LITERATURE SURVEY
Various books and other materials on text detection and machine learning were studied
to gather the information required for this project. The key points extracted from them
are as follows.
2.1 Tong He, Weilin Huang, Text-Attentional Convolutional Neural Network for Scene Text
Detection
This paper presents a new system for scene text detection that introduces a novel
Text-CNN classifier and a newly developed CE-MSERs detector. The Text-CNN is designed to
compute discriminative text features from an image component.
Highly supervised text information, including the text region mask, character label, and
binary text/non-text information, is leveraged. They formulate the training of Text-CNN
as a multi-task learning problem that effectively incorporates interactions of
multi-level supervision. They show that informative multi-level supervision is of
particular importance for learning a powerful Text-CNN that can robustly discriminate
ambiguous text from complicated backgrounds. In addition, a contrast-enhancement
mechanism that enhances the region stability of text patterns improves on the existing
MSER detector.
The results demonstrate state-of-the-art performance on a variety of benchmarks. In the
paper, the training process is formulated as a multi-task learning problem in which
low-level supervised information greatly facilitates the main task of text/non-text
classification. In addition, a powerful low-level detector called contrast-enhancement
maximally stable extremal regions (CE-MSERs) is developed, which extends the widely used
MSERs by enhancing the intensity contrast between text patterns and background.
2.2 Tao Wang, David J. Wu, Adam Coates, Andrew Y. Ng, End-to-End Text Recognition with
Convolutional Neural Networks
In this paper, they consider a novel approach to end-to-end text recognition. By using
large, multi-layer CNNs, they train powerful and robust text detection and recognition
modules. Because of this increase in representational power, they are able to use simple
non-maximal suppression and beam search techniques to construct a complete system.
This represents a departure from previous systems, which have usually relied on complex
graphical models or elaborately hand-engineered systems. As evidence of the power of this
approach, they demonstrate state-of-the-art results in character recognition as well as
lexicon-driven cropped word recognition and end-to-end recognition.
Moreover, they can easily extend their model to the general setting by leveraging
conventional open-source spell checkers and, in doing so, achieve performance comparable
to the state of the art. For low-level data representation, the paper uses an
unsupervised feature learning algorithm that can automatically extract features from the
given data. They integrate these learned features into a large, discriminatively trained
convolutional neural network (CNN).
CNNs have enjoyed many successes in similar problems such as handwriting recognition,
visual object recognition, and character recognition. By leveraging the representational
power of these networks, they are able to train highly accurate text detection and
character recognition modules.
2.3 Yoshito Nagaoka, Tomo Miyazaki, Yoshihiro Sugaya, Shinichiro Omachi, Text Detection
by Faster R-CNN with Multiple Region Proposal Networks
In this paper they propose an end-to-end, consistently trainable text detection method
based on Faster R-CNN.
The original Faster R-CNN is an end-to-end CNN for fast and accurate object detection. By
considering the characteristics of text, a novel architecture that makes use of its
object detection ability is proposed. Whereas the original Faster R-CNN generates regions
of interest (RoIs) with a single region proposal network (RPN) using the feature map of
the last convolutional layer, the proposed method generates RoIs with multiple RPNs
using the feature maps of multiple convolutional layers.
This method uses multiresolution feature maps to detect text of various sizes
simultaneously. To aggregate the RoIs, the authors introduce an RoI-merge layer, which
makes it possible to select valid RoIs from multiple RPNs effectively. In addition, a
training strategy is proposed for realizing end-to-end training and making each RPN
specialize in a text region size. In conclusion, they present a multi-RPN Faster R-CNN
model for text detection in which multiple RPNs make use of multiresolution feature maps.
They also showed that the proposed training strategy was more effective than the
conventional strategy of Faster R-CNN. The multi-RPN model is end-to-end trainable and
has a simple architecture like Faster R-CNN. Therefore, its detection speed is nearly as
fast as that of the original Faster R-CNN, and it can detect text consistently.
With these changes, the proposed model improved detection scores dramatically. However,
they concluded that the multi-RPN model cannot detect all types of text and is restricted
to horizontal text.
2.4 Zheng Zhang, Chengquan Zhang, Wei Shen, Cong Yao, Wenyu Liu, Multi-Oriented Text
Detection with Fully Convolutional Networks
In this paper, they propose a novel approach for text detection in natural images. Both
local and global cues are taken into account for localizing text lines in a
coarse-to-fine procedure.
First, a Fully Convolutional Network (FCN) model is trained to predict the salient map of
text regions in a holistic manner. Then, text line hypotheses are estimated by
combining the salient map and character components. Finally, another FCN classifier is
used to predict the centroid of each character, in order to remove the false
hypotheses.
The framework is general for handling text in multiple orientations, languages, and fonts.
The proposed method consistently achieves state-of-the-art performance. First, they
present a novel way of computing the text salient map by learning a strong text labeling
model with an FCN. The text labeling model is trained and tested in a holistic manner,
is highly stable under the large scale and orientation variations of scene text, and is
quite efficient for localizing text blocks at the coarse level. In addition, it is also
applicable to multi-script text.
Second, they propose an efficient method for extracting bounding boxes of text line
candidates in multiple orientations. The main idea, integrating semantic labeling by FCN
with MSER, provides a natural solution for handling multi-oriented text.
2.5 Berat Barakat, Ahmad Droby, Majeed Kassis and Jihad El-Sana, Text Line Segmentation
for Challenging Handwritten Document Images Using Fully Convolutional Network
In this paper they present a method for text line segmentation of challenging historical
manuscript images.
These manuscript images contain narrow interline spaces with touching components,
interpenetrating vowel signs, and inconsistent font types and sizes. In addition, they
contain curved, multi-skewed, and multi-directed side note lines within a complex
page layout. Therefore, bounding polygon labeling would be very difficult and time
consuming.
Instead they rely on line masks that connect the components on the same text line.
These line masks are then predicted using a Fully Convolutional Network (FCN).
FCN has been successfully used for text line segmentation of regular handwritten document
images.
The paper shows that FCN is useful with challenging manuscript images too. Using a
new evaluation metric that is sensitive to over-segmentation as well as under-segmentation,
testing results on a publicly available challenging handwritten dataset are comparable
with the results of a previous work on the same dataset.
In this paper they conclude that line mask labeling is less cumbersome for challenging
handwritten documents and is a proper way to train an FCN. They also define a new
evaluation metric based on the concept of connectivity components. This metric is
sensitive to both over- and under-segmentation. The new metric is used to validate the
proposed method on the challenging handwritten dataset.
2.6 J. Pradeep, E. Srinivasan and S. Himavathi, Diagonal based feature extraction for
handwritten alphabets recognition system using neural network
In this paper they discuss how scanning the characters of an image diagonally yields a
smaller number of distinctive, high-quality, easily predictable features, because the
feature vectors are formed from the glyph strokes that lie along each diagonal. It
also states that features along the diagonals are rarely empty.
2.7 T. M. Rath and R. Manmatha, Word spotting for historical documents (IEEE)
This paper describes how points of interest in handwritten characters, such as the
highest point in each column, the lowest point, and the average point, can be mapped
onto a grid, and how these profiles can then be used to verify the originality of a
signature or handwriting.
2.8 Classification Techniques
Classification falls under the Supervised Learning domain of Machine Learning, in which a
computer program learns from the input given to it and trains itself on that input. Using
this knowledge, it classifies new observations given to it.
The task may be binary (identifying whether an animal is a cat or a dog, or whether a
mail is spam or non-spam) or multi-class (such as classifying handwritten numerals into
their digits). In our case, the classifier is a multi-class classifier, as it has to
assign each part of the input to one of many character classes. Some examples where
classification plays a vital role are speech recognition, document classification, and
biometric identification.
There are various types of classification algorithms in Machine Learning:
1. Linear Classifiers: Logistic Regression, Naïve Bayes Classifier
2. Neural Networks
3. Support Vector Machines
4. Decision Trees
5. Nearest Neighbors
6. Boosted Trees
7. Random Forests
2.8.1 Logistic Regression
This is a predictive learning model in which a statistical method is used to analyse a
dataset in which one or more independent variables determine the outcome.
The aim of logistic regression is to find the best-fitting model that describes the
relationship between the dependent and independent variables.
2.8.2 Naïve Bayes Classifier
This is a generative learning model in which the classification technique is based on
Bayes' theorem. The Naïve Bayes classifier assumes that each feature is independent of
every other feature, given the class. The classification is based on conditional
probability.
This model is useful for large datasets. Although it is a simple model, it can in some
cases outperform more sophisticated classification methods.
2.8.3 Neural Networks
Neural networks consist of neurons arranged in layers, which take a vector as input and
convert it into some output. Each neuron takes an input, applies a function to it, and
then passes the output to the next layer.
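As a rough illustration of how a vector passes through layers of neurons, the following sketch computes the output of two small fully connected layers in plain Python. The weight and bias values and the choice of a sigmoid activation are hypothetical and are not taken from the project's model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases, activation=sigmoid):
    """One fully connected layer: each neuron computes activation(w . x + b)."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(z))
    return outputs


x = [1.0, 2.0]  # input vector
# hidden layer with two neurons (hand-picked, untrained weights)
hidden = dense_layer(x, weights=[[0.5, -0.5], [1.0, 1.0]], biases=[0.0, -3.0])
# output layer with one neuron, fed by the hidden layer
output = dense_layer(hidden, weights=[[1.0, -1.0]], biases=[0.0])
print(hidden, output)
```

Training would adjust the numbers in `weights` and `biases`; the forward computation itself stays the same.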
Weights are applied to the signals passing from one neuron to another, and these
weights are tuned during training to adapt the network to the problem at hand.
2.8.4 Support Vector Machines
A Support Vector Machine (SVM) is a discriminative classifier defined by a separating
hyperplane. Given labelled training data (supervised learning), the algorithm outputs an
optimal hyperplane that categorizes new examples.
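The way a separating hyperplane categorizes a new example can be sketched as follows. The sketch shows only the prediction step; the weight vector `w` and offset `b` are chosen by hand for illustration, whereas a real SVM would obtain them from training.

```python
def svm_predict(x, w, b):
    """Classify x by the side of the hyperplane w . x + b = 0 on which it falls."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


w, b = [1.0, -1.0], 0.0  # hypothetical hyperplane: the line x1 = x2
print(svm_predict([3.0, 1.0], w, b))  # prints 1
print(svm_predict([1.0, 3.0], w, b))  # prints -1
```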
In two-dimensional space this hyperplane is a line dividing the plane into two parts,
with each class lying on one side.
2.8.5 Decision Trees
Decision trees build classification and regression models in the form of a tree. The
dataset is broken down into smaller and smaller subsets while the associated tree is
developed incrementally. The final result is a tree with decision nodes and leaf nodes.
A decision node has two or more branches, and a leaf node represents a classification or
decision.
2.8.6 Nearest Neighbors
The k-nearest-neighbors algorithm is a supervised classification algorithm. It takes a
set of labelled points and uses them to label new points.
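A minimal sketch of k-nearest-neighbors classification in plain Python is given here. The Euclidean distance, the value k = 3, and the toy points are assumptions made for illustration.

```python
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    distances = []
    for point, label in zip(train_points, train_labels):
        # Euclidean distance between the training point and the query
        dist = sum((p - q) ** 2 for p, q in zip(point, query)) ** 0.5
        distances.append((dist, label))
    distances.sort()
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(points, labels, (0.5, 0.5)))  # prints "a"
print(knn_predict(points, labels, (5.5, 5.5)))  # prints "b"
```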
To label a new point, it looks at the point's nearest neighbors and assigns the label
that is most common among them.
CHAPTER 3 REQUIREMENTS
3.1 Functional Requirements
· The user should be able to access the text image to be detected in an easy manner.
· The user should be able to detect handwritten text from various sources of images.
· The user should get an accurate result after the image is uploaded.
3.2 Non-Functional Requirements
· The user interface should be easily accessible.
· The text detection process should work effectively and efficiently.
· The model should never fail in the middle of an operation.
· The algorithmic model should work consistently across the various web platforms.
3.3 Software Requirements:
Software requirements are divided into development requirements and user requirements.
3.3.1 Development Requirements
· Any 64-bit Linux distribution / Windows / macOS
· Python with Conda / Miniconda installed
· TensorFlow library 1.12.0
· OpenCV library
3.3.2 User Requirements
· Computer
· Web browser
3.4 Hardware Requirements:
Hardware requirements are divided into training requirements and testing requirements.
3.4.1 Training Requirements
· CPU: 2.3 GHz
· RAM: 16 GB
· Disk: 360 GB
3.4.2 Testing Requirements
· CPU: any dual-core processor or above
· RAM: > 2 GB
· Disk: > 4 GB
· GPU: not mandatory, can run on CPU
3.5 Dataset
The IAM Handwriting Database contains forms of handwritten English text which can be
used to train and test handwritten text recognizers and to perform writer identification
and verification experiments.
Fig 3.1: Properties of IAM dataset
Fig 3.2: Ground Truth text
Fig 3.3: Example of the dataset
CHAPTER 4 SYSTEM DESIGN AND ANALYSIS
4.1 Analysis
Analysis is the process of breaking a complex topic or substance into smaller parts to
gain a better understanding of the problem. Analysts in the field of engineering look at
requirements, structures, mechanisms, and systems dimensions. Analysis is an
exploratory activity.
The Analysis Phase is where the project lifecycle begins. The Analysis Phase is where we
break down the deliverables in the high-level Project Charter into the more detailed
requirements. The Analysis Phase is also the part of the project where we identify the
overall direction that the project will take through the creation of the project strategy.
4.1.1 Product Perspective:
· Allowing users to scan images in order to detect the text.
· Allowing users to detect text from handwritten images.
· Allowing business users to use our API via a TensorFlow Serving model.
· Allowing users to integrate the system into a web app.
· Allowing users to integrate the system into both mobile and desktop apps.
4.1.2 Product functions
The core application functionality, such as detecting text from handwritten images and
serving the model API, will be implemented using TensorFlow, which is a Python library.
The design and styling of the user interface will be implemented using React.js, which is
a JavaScript library. Data will be stored on the local system. In the case of browsers,
data will be held in the browser's in-memory storage.
4.2 Components of the project
4.2.1 Software Components
4.2.1.1 Developmental Components
Operating System (Windows/macOS/Linux): An operating system (OS) is system-level
software that manages both computer hardware and software resources. It provides
common services for computer programs.
It allocates resources such as processor time, mass storage, printing, and many others.
It also performs memory allocation and deallocation. Many devices, ranging from cellular
phones and video game consoles to web servers and supercomputers, contain an operating
system.
Python 3.6: Python is a high-level programming language that can be used for data
science. Some of the libraries it supports include Pandas, NumPy, and TensorFlow.
Its programming model allows highly scalable applications to be built. Python is an
efficient language that provides garbage collection and supports object-oriented and
procedural programming.
Anaconda: Anaconda is an open-source distribution of the Python and R programming
languages. It comes with many packages pre-installed.
These packages support scientific computing for data science, machine learning, deep
learning, large-scale data processing, predictive analytics, statistics, probability, and
more. Anaconda simplifies package management and deployment. Any Python package that is
not pre-installed can be added with the conda tool, which ships with Anaconda.
It is available for Windows, Linux, and macOS.
Spyder IDE: Spyder is an open-source, cross-platform integrated development environment
(IDE). It integrates a number of Python packages used mainly for scientific computing,
such as NumPy, SciPy, and Matplotlib. Spyder ships with Anaconda. It is available for
Windows, Linux, and macOS.
TensorFlow: TensorFlow is a free and open-source software library used for dataflow
programming across a range of tasks. It is essentially a math library whose functions
operate on a computational graph, and it is widely used for machine learning and neural
network tasks.
OpenCV: OpenCV is a library of programming functions for computer vision; here it is
mainly used to access and modify image properties.
4.2.1.2 User Interface Application Components
1. HTML: HyperText Markup Language is a markup language used to create the front-end
view of web applications. Web browsers retrieve HTML pages from the requested server and
render them as web pages. HTML consists of tags that describe the structure of web pages;
attributes, given as name-value pairs, add details to those tags.
2. CSS: Cascading Style Sheets is a style sheet language. It consists of property-value
pairs that can be applied to any tags present in the HTML. CSS can adapt the formatting
when web pages are accessed from different platforms and devices. CSS provides more
flexibility and control over the presentation of specific parts of web pages.
3. JavaScript: JavaScript is a scripting language that also supports an object-oriented
paradigm. JavaScript can be used to build the front-end part of web applications. Many
web tools, such as Node.js and jQuery, are based on JavaScript. JavaScript is used in
both front-end and back-end tools.
4. React: React is a JavaScript library used in the front-end part of the application.
It simplifies the development of user interfaces by introducing the concept of state.
4.2.2 Hardware Components
Intel i7 Processor @ 2.3 GHz: Core i7 is a family of high-end 64-bit x86-64 processors
designed by Intel for high-end desktops and laptops. Core i7 was introduced in 2008
following the retirement of the Core 2 Quad family.
Core i7 microprocessors are the high-end brand of the Core family, positioned above
both the Core i5 and the Core i3.
16 GB DDR4 RAM: 16 GB of Double Data Rate 4 Synchronous Dynamic Random-Access Memory
(DDR4 SDRAM) was used for this application. DDR4 was released in 2014 as the successor
to DDR2 and DDR3.
DISK (360 GB): A hard disk drive is a data storage device used for storage and retrieval
of digital information. This application requires a minimum of 5 GB of disk space just to
store the dataset and train the model.
4.3 System Design
Systems design is the process of defining the architecture, components, modules,
interfaces, and data for a system to satisfy specified requirements.
Systems design can be seen as the application of systems theory to product development.
The objective of the design phase is to produce the overall design of the software. It
aims to identify the modules that should be in the system to fulfill all the system
requirements in an efficient manner.
The design contains the specification of all these modules, their interaction with other
modules, and the desired output from each module. The output of the design process is
a description of the software architecture. The design phase is followed by two
sub-phases:
· High-Level Design
· Detailed-Level Design
4.3.1 System Architecture Diagram
A system architecture is a conceptual model that defines the structure, behavior, and
other views of a system. An architecture description is a formal description and
representation of a system, organized in a way that supports reasoning about the
structures and behaviors of the system. A system architecture can comprise system
components that work together to implement the overall system.
The figure below shows a general block diagram describing the internal operations
performed by this project.
Fig 4.1: System architecture diagram
The key phases of this project are:
· CNN
· RNN
· CTC
Fig 4.2: System block diagram
Fig 4.3: TensorFlow Serving API used with React
4.4 HIGH-LEVEL DESIGN
4.1 Introduction
The high-level design depicts the proposed functional and non-functional requirements
of the software. An overall solution architecture is developed that can handle those
needs. This chapter covers the following:
· Design consideration
· Data flow diagram
4.2 Design consideration
Several design considerations need to be addressed or resolved before designing a
complete solution for the system.
4.2.1 Assumptions and Dependencies
Training the model requires a powerful processor. For the model to serve third parties,
there must be a proper internet connection. A modern browser is needed to support the
UI, which is written in React.js.
Internet connectivity is also needed because Python and TensorFlow download their
dependent libraries via the internet.
4.2.2 Data Flow Diagram
A data flow diagram (DFD) is a graphical representation of the flow of data through an
information system. A DFD is very useful in understanding a system and can be used
effectively during analysis.
It views a system as a function that transforms inputs into desired outputs. A complex
system will not perform this transformation in a single step; data typically undergoes
a series of transformations before it becomes the output. With a data flow diagram,
users can visualize how the system will operate, what the system will accomplish, and
how the system will be implemented. Data flow diagrams of an old system can be drawn up
and compared with those of a new system in order to implement a more efficient system.
Data flow diagrams can be used to give the end user a physical idea of where the data
they input ultimately has an effect on the structure of the whole system. From the
perspective of application development, a data flow diagram is a special chart type
that graphically illustrates the "flow" of data through the various application
components.
Data flow diagrams can therefore be used to visualize data processing or structured
design, to create an overview of the application system, to explore the high-level
design in terms of data flows, and to document the major data flows.
Fig 4.4: High level overview diagram
Fig 4.5: Module wise flow diagram
Pre-processing Module
The images from the dataset do not have a fixed size, so each image is resized without
distortion until it has either a width of 128 or a height of 32. The image is then
copied into a white target image of size 128×32. Finally, the gray values of the image
are normalized, which simplifies the task for the NN.
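The pre-processing steps above can be sketched in plain Python as follows. The project itself lists OpenCV for image handling; this version uses nested lists only to stay self-contained. The nearest-neighbour interpolation, the top-left placement, and the zero-mean, unit-variance normalization are assumptions, since the text does not specify them.

```python
TARGET_W, TARGET_H = 128, 32


def preprocess(img):
    """img: list of rows, each a list of gray values in 0..255."""
    h, w = len(img), len(img[0])
    # one scale factor for both axes, so the image is not distorted
    scale = min(TARGET_W / w, TARGET_H / h)
    new_w = max(1, min(TARGET_W, int(w * scale)))
    new_h = max(1, min(TARGET_H, int(h * scale)))
    # nearest-neighbour resize (assumed interpolation)
    resized = [
        [img[min(h - 1, int(y / scale))][min(w - 1, int(x / scale))]
         for x in range(new_w)]
        for y in range(new_h)
    ]
    # copy into a white 128x32 target image, aligned to the top-left corner
    target = [[255] * TARGET_W for _ in range(TARGET_H)]
    for y in range(new_h):
        target[y][:new_w] = resized[y]
    # normalize gray values to zero mean and unit variance (assumed scheme)
    pixels = [p for row in target for p in row]
    mean = sum(pixels) / len(pixels)
    std = (sum((p - mean) ** 2 for p in pixels) / len(pixels)) ** 0.5
    std = std if std > 0 else 1.0
    return [[(p - mean) / std for p in row] for row in target]


sample = [[0] * 64 for _ in range(64)]  # hypothetical 64x64 all-black image
normalized = preprocess(sample)
print(len(normalized), len(normalized[0]))  # prints "32 128"
```

In this example the 64×64 image is scaled to 32×32 (the height limit is reached first) and the remaining 96 columns of the target stay white.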
Data augmentation can easily be integrated by copying the image to random positions
instead of aligning it to the left, or by randomly resizing the image.
Access Data Model: It reads samples, puts them into batches, and provides an iterator
interface to go through the data.
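A data-access component of this kind could look roughly like the sketch below. The class name `BatchLoader`, the (image, text) sample layout, and the placeholder sample values are hypothetical and are not taken from the project code.

```python
class BatchLoader:
    """Hypothetical loader: holds (image, text) samples and yields them in batches."""

    def __init__(self, samples, batch_size):
        self.samples = list(samples)
        self.batch_size = batch_size

    def __iter__(self):
        # step through the samples one batch at a time
        for start in range(0, len(self.samples), self.batch_size):
            batch = self.samples[start:start + self.batch_size]
            images = [img for img, _ in batch]
            texts = [txt for _, txt in batch]
            yield images, texts


# strings stand in for image arrays to keep the example short
loader = BatchLoader([("img0", "the"), ("img1", "cat"), ("img2", "sat")], batch_size=2)
batches = list(loader)
for images, texts in batches:
    print(images, texts)
```

The last batch is simply smaller when the number of samples is not a multiple of the batch size.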
Mathematical operational module
CNN
A Convolutional Neural Network is a Deep Learning algorithm that takes an input image,
assigns importance (learnable weights and biases) to various aspects or objects in the
image, and is able to differentiate one from the other. The pre-processing required by
a convolutional neural network is much lower than for other classification algorithms.
While in primitive methods filters are hand-engineered, Convolutional Neural Networks
can, with enough training, learn these filters themselves. Convolutional networks
perceive images as volumes, i.e. three-dimensional objects, rather than flat canvases
measured only by width and height. That is because digital color images have a
red-green-blue (RGB) encoding, mixing those three colors to produce the color spectrum
humans perceive.
For each pixel of an image, the intensities of R, G, and B are each expressed by a
number, and that number is an element in one of the three stacked two-dimensional
matrices that together form the image volume. The input image is fed into the CNN
layers. These layers are trained to extract relevant features from the image.
Every layer consists of three operations.
First, the convolution operation applies a filter kernel of size 5×5 in the first two
layers and 3×3 in the last three layers to the input. Then, the non-linear ReLU function
is applied. Finally, a pooling layer summarizes image regions and outputs a downsized
version of the input. While the image height is downsized by 2 in each layer, feature
maps (channels) are added, so that the output feature map (or sequence) has a size of
32×256.
ReLU Layer
The Rectified Linear Unit returns 0 if it receives any negative input.
For any positive value of x, it returns that value itself; that is, f(x) = max(0, x).
Two major benefits of ReLUs are sparsity and a reduced likelihood of vanishing
gradients. Recall that the definition of a ReLU is h = max(0, a). One major benefit is
the reduced likelihood of the gradient vanishing. This arises when a > 0.
In this regime the gradient has a constant value. In contrast, the gradient of sigmoids
becomes increasingly small as the absolute value of the input increases. The constant
gradient of ReLUs results in faster learning. The other benefit of ReLUs is sparsity.
Sparsity arises when a ≤ 0, since the unit then outputs exactly zero. The more such
units that exist in a layer, the more sparse the resulting representation.
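The two properties discussed here, a constant gradient for positive inputs and exact zeros for non-positive inputs, can be checked numerically with a small sketch. The input values below are arbitrary examples.

```python
import math


def relu(a):
    return max(0.0, a)


def relu_grad(a):
    # derivative of max(0, a): 1 for a > 0, otherwise 0
    return 1.0 if a > 0 else 0.0


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def sigmoid_grad(a):
    s = sigmoid(a)
    return s * (1.0 - s)


pre_activations = [-2.0, -0.5, 0.0, 1.0, 10.0]
relu_outputs = [relu(a) for a in pre_activations]
zero_count = sum(1 for h in relu_outputs if h == 0.0)  # units contributing to sparsity

print(relu_outputs, zero_count)
print(relu_grad(10.0), sigmoid_grad(10.0))  # ReLU gradient stays 1; sigmoid gradient is tiny
```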
Sigmoids, on the other hand, are always likely to generate some non-zero value,
resulting in dense representations. Sparse representations seem to be more beneficial
than dense representations.
Max Pooling
It is common to periodically insert a pooling layer between successive convolutional
layers in a ConvNet architecture. Its function is to progressively reduce the spatial
size of the representation in order to reduce the number of parameters and the amount
of computation in the network, and hence also to control overfitting. The pooling layer
operates independently on every depth slice of the input and resizes it spatially, using
the MAX operation. The most common form is a pooling layer with filters of size 2×2
applied with a stride of 2, which downsamples every depth slice in the input by 2 along
both width and height, discarding 75% of the activations.
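A 2×2 max-pooling step with stride 2 on a single depth slice can be sketched as follows. The function assumes an even height and width, and the 4×4 input values are arbitrary.

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on one depth slice (a list of rows)."""
    pooled = []
    for y in range(0, len(feature_map) - 1, 2):
        row = []
        for x in range(0, len(feature_map[0]) - 1, 2):
            # keep only the largest of the four values in this 2x2 window
            row.append(max(
                feature_map[y][x], feature_map[y][x + 1],
                feature_map[y + 1][x], feature_map[y + 1][x + 1],
            ))
        pooled.append(row)
    return pooled


depth_slice = [
    [1, 1, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]
print(max_pool_2x2(depth_slice))  # prints [[6, 8], [3, 4]]
```

Sixteen input values become four outputs, which is the 75% reduction in activations mentioned above.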
Every MAX operation would in this case take the maximum over four numbers (a small 2×2
region in some depth slice).
Fig 4.6: Max-pool example
RNN
Recurrent Neural Networks (RNNs) are a powerful and robust type of neural network and
are among the most promising algorithms available at the moment because they have an
internal memory. RNNs are relatively old, like many other deep learning algorithms.
They were first created in the 1980s, but have only shown their real potential in the last
few years, owing to the increase in available computational power, the massive amounts of
data that we have today, and the invention of LSTM in the 1990s. Because of their internal
memory, RNNs are able to remember important things about the input they received, which
enables them to be very precise in predicting what comes next.

In a feed-forward neural network, the information only moves in one direction, from the
input layer, through the hidden layers, to the output layer. The information moves straight
through the network, so it never touches a node twice. Feed-forward neural networks have no
memory of the input they received previously and are therefore bad at predicting what comes
next. Because a feed-forward network only considers the current input, it has no notion of
order in time.
It simply cannot remember anything about what happened in the past, except its training. In
a Recurrent Neural Network, by contrast, the information cycles through a loop. When it
makes a decision, it takes into consideration the current input and also what it has learned
from the inputs it received previously.

Fig 4.7 : Recurrent Neural Network
Fig 4.8 : Feedforward Neural Network

An RNN is trained with backpropagation. Backpropagation Through Time (BPTT) is essentially
just a fancy term for doing backpropagation on an unrolled Recurrent Neural Network.
Unrolling is a visualization and conceptual tool that helps us understand what is happening
within the network.
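Unrolling can be made concrete with a toy example. The sketch below assumes a single scalar
hidden state and hand-picked weights (w_x, w_h and b are illustrative values, not trained
parameters). It applies the same recurrent step once per time step, which is the chain of
operations that BPTT differentiates through.

```python
import math


def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One recurrent step: the new state mixes the current input with the previous state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)


def unrolled_forward(inputs, w_x=0.5, w_h=0.8, b=0.0, h0=0.0):
    """Unroll the recurrence over the whole sequence and return every hidden state."""
    states = []
    h = h0
    for x_t in inputs:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states


# Two sequences that end with the same input but start differently.
states_a = unrolled_forward([1.0, 0.0, 0.0])
states_b = unrolled_forward([0.0, 0.0, 0.0])

print("final state after [1, 0, 0]:", states_a[-1])
print("final state after [0, 0, 0]:", states_b[-1])
```

Although both sequences present the same input at the last time step, the final states
differ, because the state still carries a trace of the first input. A feed-forward network
given only the last input could not tell the two cases apart.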
Most of the time, when we implement a Recurrent Neural Network in the common programming
frameworks, they automatically take care of the backpropagation. It is still useful to
understand how it works, since that allows us to troubleshoot problems that come up during
the development process. There are, however, several limitations when it comes to the
classic RNN, so we used Long Short-Term Memory (LSTM), which is a variant of the RNN.
Long Short-Term Memory networks, usually just called "LSTMs", are a special kind of RNN,
capable of learning long-term dependencies. LSTMs are explicitly designed to avoid the
long-term dependency problem. Remembering information for long periods of time is
practically their default behavior, not something they struggle to learn.

All recurrent neural networks have the form of a chain of repeating modules of neural
network. In standard RNNs, this repeating module has a very simple structure, such as a
single tanh layer. The key to LSTMs is the cell state, the horizontal line running through
the top of the diagram. The cell state is somewhat like a conveyor belt. It runs straight
down the entire chain, with only some minor linear interactions. It is very easy for
information to simply flow along it unchanged.

Fig 4.9 : Recurrent Neural Network
Fig 4.10 : LSTM cell

CHAPTER 5 IMPLEMENTATION

5.1 Introduction

Implementation is the realization of an application, or the execution of a plan, idea,
model, design, specification, standard, algorithm, or policy.
In other words, an implementation is a realization of a technical specification or
algorithm as a program, software component, or other computer system through programming
and deployment. Many implementations may exist for a given specification or standard.
Implementation is one of the most important phases of the Software Development Life Cycle
(SDLC). It encompasses all the processes involved in getting new software or hardware
operating properly in its environment, including installation, configuration, running,
testing, and making necessary changes. Specifically, it involves coding the system using a
particular programming language and transferring the design into an actual working system.

5.2 Overview of System Implementation

This project is implemented considering the following aspects: the usability aspect and the
technical aspect.

5.2.1 Usability Aspect

The user interface, a web application, is implemented using a web stack: HTML, CSS,
JavaScript, Node.js, and React.

5.2.2 Technical Aspect

The model for text detection in images is implemented using Python and TensorFlow, and is
turned into a command-line tool using a shell script.
Python is a high-level programming language that is widely used for data science. The
libraries available for it include Pandas, NumPy, and TensorFlow. It supports several
programming paradigms, including object-oriented and procedural programming, manages memory
through garbage collection, and helps in building scalable applications.

NumPy is a Python library for constructing multi-dimensional arrays and matrices, with
high-level functions to manipulate array elements. NumPy is built on a combination of
Python and C and targets the CPython implementation. The reason for using C is that Python
code runs much slower than compiled equivalents. NumPy addresses this by providing arrays
whose operations are implemented in compiled code, so that the inner loops of a function can
be vectorized instead of being executed by the interpreter.

OpenCV (Open Source Computer Vision) is a library used to perform computer vision tasks in
Python. It was initially developed by Intel and later open-sourced. It can be used together
with deep learning frameworks such as PyTorch and TensorFlow, and can be applied to
different areas of image and video processing. In this project it is used for resizing and
cropping the images.

TensorFlow is a library based on computational graphs that is available as an open-source
Python library. It is commonly used by researchers because of its easy-to-understand syntax
and GPU acceleration support. Like NumPy, TensorFlow offers computation on N-dimensional
arrays; in addition, it supports Nvidia CUDA. Tensors are N-dimensional arrays that can be
processed in parallel.

5.2.2.1 The project is implemented by using Anaconda

Anaconda is a free and open-source distribution of the Python and R programming languages
for scientific computing (data science, machine learning applications, large-scale data
processing, predictive analytics, and so on) that aims to simplify package management and
deployment. Package versions are managed by the package management system conda. The
Anaconda distribution is used by more than 13 million users and includes more than 1400
popular data science packages suitable for Windows, Linux, and macOS.

Anaconda Navigator is a desktop graphical user interface (GUI) included in the Anaconda
distribution that allows users to launch applications and manage conda packages,
environments, and channels without using command-line commands.
Navigator can search for packages on Anaconda Cloud or in a local Anaconda Repository,
install them in an environment, run the packages, and update them. It is available for
Windows, macOS, and Linux. The following applications are available by default in
Navigator:

JupyterLab
Jupyter Notebook
QtConsole
Spyder
Glueviz
Orange
RStudio
Visual Studio Code

Advanced Code Completion

Both Jupyter and Spyder feature the typical Python code auto-completion.
However, we usually found that code completion is better in Spyder than in Jupyter, which
seems to get confused at times and often does not provide precise results. The more time a
programmer spends writing code, the more valuable code completion becomes.

User Interface (UI)

React (also known as React.js or ReactJS) is a JavaScript library for building user
interfaces. It is maintained by Facebook and a community of individual developers and
companies. React can be used as a base in the development of single-page or mobile
applications, as it is well suited to fetching rapidly changing data that needs to be
recorded.
However, fetching data is only the beginning of what happens on a web page, which is why
complex React applications usually require the use of additional libraries for state
management, routing, and interaction with an API.

System stability

Spyder is a powerful scientific environment written in Python, for Python, and designed by
and for scientists, engineers, and data analysts.
It offers a unique combination of the advanced editing, analysis, debugging, and profiling
functionality of a comprehensive development tool with the data exploration, interactive
execution, deep inspection, and visualization capabilities of a scientific package. Beyond
its many built-in features, its abilities can be extended even further via its plugin system
and API. Furthermore, Spyder can also be used as a PyQt5 extension library, allowing
developers to build upon its functionality and embed its components, such as the interactive
console, in their own PyQt software.

5.3 Implementation Support
5.3.1 Installation of Anaconda

Following are the requirements for installation of Anaconda on the Windows operating
system:

Microsoft Windows 7/8/10 (32-bit or 64-bit)
GB RAM minimum, 8 GB RAM recommended
GB of available disk space minimum, 4 GB recommended
1280x800 minimum screen resolution

The Anaconda installer has to be downloaded first. To install Anaconda on Windows, proceed
as follows:

1. Select the operating system you are on.
2. From the Python 3.6 section, choose the 32-bit or 64-bit option. The download should
   begin after this. It is a large file, so it may take some time to download.
3. Go through the installation procedure with the installer.
4. After the installation is finished, search for Anaconda Navigator in the Start menu. A
   window similar to Fig 5.1 appears.

Fig 5.1 : Overview of Anaconda navigator

CHAPTER 6 PSEUDOCODE

Pseudocode uses the structural conventions of a standard programming language, but is
intended for human reading rather than machine reading. Pseudocode typically omits details
that are essential for machine understanding of the algorithm, such as variable
declarations, system-specific code, and some subroutines.
The programming language is augmented with natural language description details, where
convenient, or with compact mathematical notation. The purpose of using pseudocode is that
it is easier for people to understand than conventional programming language code. It is
commonly used in textbooks and scientific publications that document various algorithms,
and also in the planning of computer program development, for sketching out the structure of
the program before the actual coding takes place.

“Text detection in images using Machine Learning” has 4 programs written in Python. They
are:

Pre-process image.py
Access document and load data.py
Mathematical operation.py
User input.py

The pseudocode for these Python programs is:

6.1 Pre-process image.py

START:

Step 1: Import the required libraries for reading the image files and processing them.

import random
The random module lets us generate random numbers and randomize data.

import numpy as np
The numpy package supports scientific computing with Python. It contains an N-dimensional
array object and tools for integrating C/C++ and Fortran code; it also contains useful
linear algebra, Fourier transform, and random number capabilities.

import cv2
The cv2 package provides the necessary building blocks for image analysis. Here, we use it
to process the input images into the images that we need as targets.

Step 2: Define a function to preprocess the image.
def preprocess(img, imageSize, dataAugmentation = False):

This function converts the input image img into a target image of size imageSize.
dataAugmentation can be set to True if the training set is small; because our training set
has an ample number of images, we set it to False. If an image in the dataset (i.e. IAM) is
damaged, we use a black image instead, which can be created using the zeros function of the
numpy library. If dataAugmentation is set to True, we increase or decrease the size of the
image by applying random stretches to it.

Step 3: Perform target transformation.
    (width, height) = imageSize
    (h1, w1) = img.shape
    fx = w1 / width
    fy = h1 / height
    f = max(fx, fy)
    newSize = (max(min(width, int(w1 / f)), 1), max(min(height, int(h1 / f)), 1))

These are the most crucial operations in the pre-process image module. Using the functions
of OpenCV we extract the current dimensions of the image and scale them, keeping the aspect
ratio, to fit the size of the target image, i.e. 128x32. Our goal is to create an empty
image of size 128x32, which is achieved using np.ones, and to fit the resized input image
into it, which is achieved using cv2.resize. Once this is complete, we normalize the gray
values of the input image to make training easier for the neural network.
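The size computation can be tried out without OpenCV. The sketch below repeats the
arithmetic of the pseudocode in plain Python, assuming Python 3 division and the 128x32
target size; fit_size is a name introduced here for illustration, and the actual resizing
with cv2.resize is left out.

```python
def fit_size(src_w, src_h, target_w=128, target_h=32):
    """Return the largest (width, height) that fits the target and keeps the aspect ratio."""
    fx = src_w / target_w
    fy = src_h / target_h
    # The larger factor is the dimension that limits the scaling.
    f = max(fx, fy)
    new_w = max(min(target_w, int(src_w / f)), 1)
    new_h = max(min(target_h, int(src_h / f)), 1)
    return new_w, new_h


# Illustrative input sizes (made up): a wide word image and a tall one.
wide = fit_size(400, 50)
tall = fit_size(60, 90)

print("400x50 ->", wide)
print("60x90  ->", tall)
```

In both cases one side reaches the target size exactly and the other side stays within it,
so the resized image always fits into the 128x32 target; the remaining area is left to the
background of the empty target image.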
Step 4: Normalize gray values.

While normalizing the gray values, our goal is to reduce the [0,255] interval to a small
range such as [0,1]. This is done with the cv2.meanStdDev function, which returns the mean
and the standard deviation of the image as a tuple; subtracting the mean and dividing by the
standard deviation normalizes the values. Note that this yields values with zero mean and
unit variance rather than values strictly within [0,1].

STOP

6.2 Access document and load data.py

START:

Step 1: Import the required libraries for reading and accessing the image files and
processing them.

import random
The random module lets us generate random numbers and randomize data.

import numpy as np
The numpy package supports scientific computing with Python.

import cv2
The cv2 package provides the necessary building blocks for image analysis. Here, we use it
to process the input images into the images that we need as targets.

import preprocess
The preprocess function from the previously defined pre-process image module processes the
original image according to the requirements of the target image.

Step 2: Define a class called sample, which contains a ground truth text and the file path
belonging to that ground truth text.

Step 3: Define a class called batch, in which the dataset is split into subsets of multiple
images. The numpy.stack function can be used to stack a number of images along an axis.
Step 4: Define a class called dataLoader, which loads the dataset from a given location and
pre-processes the images according to the given parameters. Here we use Python's built-in
open function to open words.txt, which is located in the same file path as this module. We
define a list of known bad samples in the dataset so that the machine does not learn from
them. Once words.txt is opened, our goal is to extract the ground truth text from each line,
which starts at the 9th position in the line, and then to put all the characters of gtText
into the set chars; using a set keeps the characters unique.
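The parsing described in Step 4 might look roughly as follows. This is a sketch under
assumptions: the input lines are made-up examples in a space-separated layout where the
transcription starts at the 9th field, and parse_words_line, bad_samples and the sample ids
are hypothetical names, not the project's own.

```python
def parse_words_line(line):
    """Return (sample_id, ground_truth_text) for one line, or None if it carries no sample."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # skip blank lines and comment lines
    fields = line.split(" ")
    if len(fields) < 9:
        return None  # malformed line: no transcription field
    # The transcription starts at the 9th field and may itself contain spaces.
    gt_text = " ".join(fields[8:])
    return fields[0], gt_text


# Made-up lines for illustration only.
lines = [
    "# this is a comment",
    "w-0001 ok 154 408 768 27 51 AT A",
    "w-0002 ok 154 507 766 213 48 NN MOVE",
    "w-0003 err 154 796 764 70 50 TO to",
]
bad_samples = {"w-0003"}  # hypothetical ids of damaged images to skip

samples = []
chars = set()
for line in lines:
    parsed = parse_words_line(line)
    if parsed is None or parsed[0] in bad_samples:
        continue
    samples.append(parsed)
    chars.update(parsed[1])  # a set keeps each character only once

char_list = sorted(chars)
print(samples)
print(char_list)
```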
Step 5: Once the samples have been selected, we split them so that 95% of them belong to
the training set and the remaining 5% to the testing (validation) set, for which we use list
indexing. We repeat the same split for the ground truth texts as well.
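The 95%/5% split of Step 5 comes down to slicing a list at one index. A minimal sketch,
assuming the samples are held in a Python list of (file path, ground truth) pairs;
split_samples and the generated file names are hypothetical:

```python
def split_samples(samples, train_percent=95):
    """Split a list into a training part and a validation part at a fixed index."""
    # Integer arithmetic avoids floating-point rounding in the split index.
    split_idx = len(samples) * train_percent // 100
    return samples[:split_idx], samples[split_idx:]


# 200 made-up samples standing in for the dataset.
samples = [("img_%03d.png" % i, "word%d" % i) for i in range(200)]

train_samples, val_samples = split_samples(samples)

# The same indexing is applied to the ground truth texts.
train_texts = [text for _, text in train_samples]
val_texts = [text for _, text in val_samples]

print(len(train_samples), "training samples,", len(val_samples), "validation samples")
```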
Step 6: Because we use a Connectionist Temporal Classification (CTC) layer to decode the
output of the RNN, we need to ensure that it can find a mapping between the text label and
the input sequence. If the label is too long for the sequence, an infinite gradient is
returned, so over-long labels have to be avoided.

Step 7: Define the training and validation sets. In these functions we choose the number of
images that go into each set. This concept is called bootstrapping: we randomly choose a
subset of the data to avoid overfitting. We also define a validation set that is evaluated
after each round of training.

Step 8: Define the function getIterInfo to keep track of the next batch to be trained.

STOP
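For Step 6, one way to keep labels short enough is to count how many time steps CTC needs
for a label: one per character, plus one extra blank between two identical neighbouring
characters. The sketch below cuts a label off once that count exceeds the available sequence
length. It is an assumption about how over-long labels could be handled, and truncate_label
is a name introduced here.

```python
def truncate_label(text, max_text_len):
    """Shorten text so that CTC can align it to max_text_len time steps."""
    cost = 0
    for i, ch in enumerate(text):
        if i != 0 and ch == text[i - 1]:
            cost += 2  # repeated character: needs a separating blank as well
        else:
            cost += 1
        if cost > max_text_len:
            return text[:i]
    return text


print(truncate_label("hello", 32))
print(truncate_label("abcdef", 3))
print(truncate_label("aaaa", 4))
```

A label such as "aaaa" is cut down to "aa" for four time steps, because "aaa" would already
need five of them (three characters and two blanks).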
6.3 Mathematical operation.py

START:

Step 1: Import the required libraries for reading the image files and applying machine
learning operations on them.

import sys
The sys module lets us access various system parameters.

import numpy as np
The numpy package supports scientific computing with Python.

import tensorflow as tf
TensorFlow is the most crucial part of this code. It is a processing framework that allows
users to define computation graphs to perform various machine learning tasks.

Step 2: Define a class Model, in which the decoder type determines the type of decoding
TensorFlow performs on the letters recognized from an image. We define our model,
consisting of CNN and RNN layers, in this class. We also define some model constants, which
ensure uniformity in training. The __init__() function initializes the arguments charList,
decoderType, mustRestore, and snapId for each instance of the class.
Step 3: We set up an optimizer called RMSPropOptimizer, whose goal is to reduce the loss
calculated by the CTC layer as training proceeds.

Step 4: Define a function setupCNN(). In this function we set up the Convolutional Neural
Network. The TensorFlow operation expand_dims adds a dimension of size one to the input, so
that the batch of grayscale images has the four-dimensional shape that the convolution
layers take as input.
Here we define the parameters of the CNN. Because we are using 5 CNN layers, we define
lists containing the values that configure each layer.

Each CNN layer is characterized by 3 operations:

Convolution: a kernel of a given size is passed over the entire image to generate features.
In our case we generate 256 features for each image at the end of the 5 CNN layers.
ReLU: an activation function (Rectified Linear Unit), which ensures that any negative values
after the convolution operation are set to zero.
Max pooling: reduces the size of the output of each layer.

Step 5: Define a function setupRNN(). In this function we set up the Recurrent Neural
Network. The TensorFlow operation squeeze removes the dimension of size one from the CNN
output, so that the RNN receives a sequence of feature vectors.
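The sizes quoted in the text can be reproduced with simple bookkeeping. The sketch below
assumes that the convolutions preserve the spatial size and that only pooling shrinks it;
the pooling factors and the intermediate feature counts are assumed values, chosen so that a
128x32 input ends as a 32x256 sequence, and are not taken from the project code.

```python
# Assumed (width, height) pooling factors for the 5 layers.
POOL_SIZES = [(2, 2), (2, 2), (1, 2), (1, 2), (1, 2)]
# Assumed feature counts per layer; only the final 256 is stated in the text.
FEATURES = [32, 64, 128, 128, 256]


def trace_cnn_shapes(width, height, pool_sizes, features):
    """Return the (width, height, channels) shape after the input and after every layer."""
    shapes = [(width, height, 1)]  # grayscale input has one channel
    for (pool_w, pool_h), channels in zip(pool_sizes, features):
        width //= pool_w
        height //= pool_h
        shapes.append((width, height, channels))
    return shapes


def squeeze_height(shape):
    """Drop the height dimension once it has shrunk to one, as a squeeze operation would."""
    width, height, channels = shape
    if height != 1:
        raise ValueError("height must be 1 before it can be squeezed away")
    return (width, channels)


shapes = trace_cnn_shapes(128, 32, POOL_SIZES, FEATURES)
sequence_shape = squeeze_height(shapes[-1])

for layer, shape in enumerate(shapes):
    print("after layer", layer, ":", shape)
print("sequence fed to the RNN:", sequence_shape)
```

Under these assumptions the image is reduced to 32 time steps with 256 features each, which
is the sequence the recurrent layers then process.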
Here we create RNN cells with 256 units each and stack them on one another, for both the
forward and the backward direction. We create a bidirectional RNN from them, so that the
input sequence is traversed from front to back and from back to front. The result is two
output sequences, fw and bw, both of size 32x256, matching the output of the CNN. This
output is then concatenated along the feature axis to create a 32x512 sequence, which is
compressed to a size of 32x80 before it is fed into the CTC layer.

Step 6: Define a function setupCTC(). In this function we set up the Connectionist Temporal
Classification layer.
The TensorFlow operation transpose reorders the dimensions of the RNN output into the layout
that the CTC operations take as input. To calculate the loss, the CTC operation is fed both
the output matrix and the ground truth text. The ground truth text is encoded as a sparse
tensor.

Step 7: Define a function convtoSparse. This operation converts the ground truth texts into
a sparse matrix so that they can be compared with the matrix that is the output of the RNN
layer. The for loop walks over the ground truth texts, converts their characters into
entries of the sparse matrix, and makes sure that the sparse matrix is of size 80x32.

Step 8: Define a function decoderOutput. This helper function determines the text contained
in the decoded output of the CTC layer.
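Steps 7 and 8 can be read as inverse operations. The sketch below assumes the sparse form is
an (indices, values, shape) triple, as used by TensorFlow's SparseTensor; CHAR_LIST,
to_sparse and from_sparse are hypothetical names, and in the real model the triple handed to
the second function would come from the CTC decoding operation rather than from to_sparse.

```python
# Hypothetical character set; index in this list is the label of a character.
CHAR_LIST = list(" abcdefghijklmnopqrstuvwxyz")


def to_sparse(texts, char_list):
    """Encode a batch of texts as (indices, values, shape) of a sparse label matrix."""
    indices, values = [], []
    max_len = 0
    for batch_idx, text in enumerate(texts):
        labels = [char_list.index(c) for c in text]
        max_len = max(max_len, len(labels))
        for pos, label in enumerate(labels):
            indices.append([batch_idx, pos])
            values.append(label)
    shape = [len(texts), max_len]
    return indices, values, shape


def from_sparse(indices, values, shape, char_list):
    """Turn a sparse label matrix back into one string per batch element."""
    label_rows = [[] for _ in range(shape[0])]
    for (batch_idx, _pos), label in zip(indices, values):
        label_rows[batch_idx].append(label)
    return ["".join(char_list[label] for label in row) for row in label_rows]


texts = ["cat", "a dog"]
indices, values, shape = to_sparse(texts, CHAR_LIST)
restored = from_sparse(indices, values, shape, CHAR_LIST)

print(shape, values)
print(restored)
```

Only the positions that actually hold a character are stored, which is why texts of
different lengths can share one matrix without padding values.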
It works on the same sparse matrix form as the ground truth text and, based on it, gives out
the text recognized in the image.

STOP

6.4 User Input.py

START:

Step 1: Import the required libraries for parsing command-line arguments, measuring
recognition errors, and accessing the data loader and the model.

import argparse
argparse is a library that helps to parse command-line arguments.

import editdistance
editdistance is a library that provides an implementation of the Levenshtein distance.

import DataLoader, Batch
We import DataLoader and Batch from our second module.

import Model
We import Model from our third module.
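Since the evaluation later reports the character error rate and the word accuracy, it may
help to see how the Levenshtein distance feeds into both. The sketch below uses a
plain-Python distance function as a stand-in for the editdistance package; the word pairs
are made up for illustration.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rates(pairs):
    """Return (character error rate, word accuracy) for (recognized, truth) pairs."""
    char_errors = 0
    char_total = 0
    words_ok = 0
    for recognized, truth in pairs:
        char_errors += levenshtein(recognized, truth)
        char_total += len(truth)
        words_ok += int(recognized == truth)
    return char_errors / char_total, words_ok / len(pairs)


# Made-up (recognized, ground truth) pairs.
pairs = [("kitten", "sitting"), ("word", "word")]
char_error_rate, word_accuracy = error_rates(pairs)

print("character error rate:", char_error_rate)
print("word accuracy:", word_accuracy)
```

The character error rate divides the summed edit distances by the total number of ground
truth characters, while the word accuracy only counts exact matches.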
Step 2: Define a function train. In this function we start the training of the model.
Initially the number of epochs is zero, and we keep increasing it as long as the character
accuracy keeps improving. Because we train on multiple batches, the batches are selected
randomly. We save the state of the model only when the accuracy improves from the previous
epoch to the current epoch. If there is no improvement over the last 5 epochs, we stop the
training.
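The stopping rule of Step 2 can be sketched independently of the model. In the sketch below
the per-epoch character error rates are made-up numbers standing in for real validation
results, and train_with_early_stopping is a hypothetical helper; saving the model is
represented only by recording the epoch number.

```python
def train_with_early_stopping(error_per_epoch, patience=5):
    """Run through the epochs and stop after `patience` epochs without improvement."""
    best_error = float("inf")
    epochs_without_improvement = 0
    saved_epochs = []
    epoch = 0
    for error in error_per_epoch:
        epoch += 1
        if error < best_error:
            best_error = error
            epochs_without_improvement = 0
            saved_epochs.append(epoch)  # stands in for saving the model state
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            break
    return epoch, best_error, saved_epochs


# Made-up validation error rates, one per epoch.
errors = [0.40, 0.31, 0.25, 0.26, 0.25, 0.27, 0.26, 0.28, 0.24, 0.20]
stopped_at, best_error, saved_epochs = train_with_early_stopping(errors)

print("stopped after epoch", stopped_at)
print("best error", best_error, "saved at epochs", saved_epochs)
```

With these numbers training stops after epoch 8, five epochs after the last improvement, so
the better values at epochs 9 and 10 are never reached. This is the trade-off of a fixed
patience value.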
Step 3: Define a function validate. After each epoch of training, this function validates
our model against the 5% validation set, which has also been split into batches in keeping
with the principle of bootstrapping.

Step 4: After the validation we print the results, i.e. the character error rate and the
word accuracy. Once the model has been trained completely, we accept user input and infer
the text contained in that image. When running this code on the command line, we can call it
with long-form arguments to either train the model, validate the model, or train the model
with the beam search decoder.

STOP

CHAPTER 7 TESTING

7.1 TESTING

We tested the application using the unit testing method.

7.1.1 Unit Testing

Unit testing is a level of software testing where individual units/components of a software
are tested. The purpose is to validate that each unit of the software performs as designed.
A unit is the smallest testable part of any software. It usually has one or a few inputs and
usually a single output. In procedural programming, a unit may be an individual program,
function, procedure, etc.
In object-oriented programming, the smallest unit is a method, which may belong to a
base/super class, abstract class, or derived/child class. Some treat a module of an
application as a unit. This is to be discouraged, as there will probably be many individual
units within that module.
Unit testing frameworks, drivers, stubs, and mock/fake objects are used to assist in unit
testing.

7.1.2 Test Cases:

Table 7.1: Test Cases for Pre-processing Module

Test Case Id | Description         | Input State           | Expected O/P                      | Actual O/P                        | Result
1            | Un-cropped images   | Application is active | Cropped image in proper dimension | Cropped image in proper dimension | Pass
2            | Low contrast images | Application is active | Provides high contrast image      | Provides high contrast image      | Pass

Table 7.2: Test Cases for CNN, RNN Modules

Test Case Id | Description                         | Input State           | Expected O/P  | Actual O/P    | Result
1            | Processed handwritten images        | Application is active | Text detected | Text detected | Pass
2            | Processed handwritten unique images | Application is active | Text detected | Text detected | Pass

CHAPTER 8 RESULTS

The model performs well on different sets of handwritten images and outputs the result
accurately. Figure 8.1 shows the input that is fed in by the user. The input image is then
pre-processed in the Pre-process Image module. Figure 8.2 shows the pre-processed image. The
pre-processed image then passes through the CNN and RNN layers described above to give the
desired output. Figure 8.3 shows the output.

Fig 8.4 : Multiple epochs of training and validation
Table 8.1 : Accuracy increase with the number of epochs
Fig 8.5 : Graphical user interface

CONCLUSION AND FUTURE SCOPE

The handwriting recognition model in this paper provides a higher accuracy rate than
previously available handwriting recognition models that use optical character recognition.
The model described in this paper is also more efficient, because the input passes through
five layers of convolutional neural networks followed by two layers of recurrent neural
networks and a final layer of Connectionist Temporal Classification.
The model is robust, as it is trained for several hours until the accuracy rate of the last
two epochs is constant. This model is also less expensive than most available handwriting
recognition models, as it uses only the CPU instead of a GPU for training.

Future Scope

This model can easily be extended to recognize lines of handwritten text using an
appropriate word segmentation algorithm.
With steps including image binarization and enhancement, connected component extraction,
partitioning of the connected component domain into three spatial sub-domains, and average
character height estimation, entire lines can be detected in handwritten images. This model
can also be extended to recognize languages other than English when provided with an
appropriate dataset.
The Google Vision text detection API can also be integrated with it to give it a more
robust endpoint. This model can be made more efficient when trained on GPUs and other
accelerated hardware. For example, with GPUs provided by Google Cloud Platform (GCP), the
model could be trained more accurately and would provide more efficient results than the
present one. This model can be deployed on various cloud services such as GCP and AWS so
that it is available globally. GCP provides hardware for training models at a low cost,
along with several tools for maintaining the models. Kubernetes is one such tool, which
helps in maintaining the different clusters of the model. Integrating TensorFlow Serving
with the model makes it easier for third-party applications to make use of the model from a
business point of view.
Third-party applications such as websites, desktop applications, mobile applications, and
games can use our model in the backend for text detection.
<1% - www.cs.umd.edu/sites/default/files/scholarly...
<1% - www.nap.edu/read/9799/chapter/5
<1% - circle.visual-paradigm.com/.../data-flow-diagram
<1% - www.freetutes.com/systemanalysis/sa5-data-flow...
<1% - www.slideshare.net/chiranjeeviadi/e-learning...
<1% - www.coursehero.com/file/ph6qk9/With-a-data-flow...
1% - answers.yahoo.com/question/index?qid=...
<1% - www.conceptdraw.com/examples/images-for-dataflow...
<1% - www.academia.edu/31052965/ER_Diagrams_Design...
<1% - www.conceptdraw.com/examples/level-1-and-level-2...
<1% - mc.ai/build-a-handwritten-text-recognition...
<1% - towardsdatascience.com/build-a-handwritten-text...
<1% - www.youtube.com/watch?v=Jy9-aGMB_TE
1% - towardsdatascience.com/a-comprehensive-guide-to...
1% - troindia.in/journal/ijcesr/vol4iss4/80-85.pdf
<1% - gmatclub.com/forum/the-positive-value-of-x-that...
<1% - persagen.com/files/ml-implementation_notes.html
1% - zhuanlan.zhihu.com/p/26463994
<1% - www.quora.com/What-are-the-advantages-of-ReLU...
<1% - stats.stackexchange.com/questions/126238/what-are...
<1% - cs231n.github.io/convolutional-networks
<1% - support.google.com/analytics/answer/1033861?hl=en
<1% - github.com/aikorea/cs231n/pull/39/commits/1fdd9...
<1% - analytixon.com/category/arxiv-papers
<1% - ich.vscht.cz/~svozil/pubs/nn_intro.pdf
<1% - skymind.ai/wiki/lstm.html
<1% - iq.opengenus.org/recurrent-neural-networks
<1% - www.techleer.com/articles/185-backpropagation...
<1% - quizlet.com/16200873/cognitive-psych-chs-6-9-all...
<1% - colah.github.io/posts/2015-08-Understanding-LSTMs
<1% - www.quora.com/What-does-implement-mean-in-coding
<1% - www.primidi.com/implementation/computer_science
<1% - en.wikipedia.org/wiki/Implementation
<1% - www.techopedia.com/definition/22193/software...
<1% - www.coursehero.com/file/p41p3bl/1-Question-is...
<1% - study.com/academy/lesson/systems-development...
<1% - www.reddit.com/r/Python/comments/5mwn6a/python...
<1% - www.datacamp.com/.../face-detection-python-opencv
<1% - it was initially made by intel but then open source
<1% - www.salemhitterdal.org/index.php/download_file/203/136
<1% - stackoverflow.com/questions/43535202/confusion...
<1% - docs.anaconda.com/anaconda/navigator/glossary
<1% - www.xda-developers.com/install-adb-windows-macos...
<1% - blog.csdn.net/weixin_39278265/article/details/...
<1% - www.quora.com/Is-Android-Studio-better-than...
<1% - www.bacancytechnology.com/blog/best-ui...
<1% - www.termpaperwarehouse.com/essay-on/Pseudo-Code/...
<1% - www.revolvy.com/topic/Pseudocode&item_type=topic
<1% - www.reddit.com/r/learnprogramming/comments/8m3...
<1% - www.wou.edu/las/cs/csclasses/cs160/160-Lab06BKG.pdf
<1% - www.analyticsvidhya.com/blog/2018/09/deep...
<1% - hackernoon.com/numpy-with-python-for-data...
<1% - stackoverflow.com/tags/numpy/info
<1% - www.mathworks.com/help/images/ref/imresize.html
<1% - stackoverflow.com/questions/10168686/image...
<1% - www.analyticsvidhya.com/blog/2018/11/neural...
<1% - pythonhealthcare.org/2018/04/18/74-using-numpy...
<1% - open function to open words.text contained in the same file path that contains
this module.
<1% - www.mathworks.com/help/deeplearning/examples/...
<1% - towardsdatascience.com/first-step-in-data...
<1% - github.com/tensorflow/tensorflow/issues/3486
<1% - stackoverflow.com/questions/20661448/python...
<1% - adventuresinmachinelearning.com/convolutional...
<1% - jhui.github.io/2017/03/16/CNN-Convolutional...
<1% - www.ais.uni-bonn.de/papers/icann2010_maxpool.pdf
<1% - www.datacamp.com/community/tutorials/cnn-tensor...
<1% - github.com/Martinsos/edlib
<1% - towardsdatascience.com/a-guide-to-an-efficient...
<1% - stats.stackexchange.com/questions/19048/what-is...
<1% - quizlet.com/72306757/chap-7-8-quizzes-flash-cards
<1% - softwaretestingfundamentals.com/unit-testing
<1% - www.chegg.com/homework-help/questions-and...
<1% - www.coursehero.com/file/pc63gg/Object-oriented...
<1% - parneetk.github.io/blog/neural-networks-in-keras
<1% - blog.acolyer.org/2017/03/20/convolutional-neural...
<1% - www.researchgate.net/publication/228603132_A...
<1% - www.cde.ca.gov/ci/cr/cf/documents/foreignlangfrmwrk.pdf
<1% - techcrunch.com/2016/11/15/googles-cloud-platform...
<1% - whuang.org
<1% - leogrady.net/wp-content/uploads/2017/01/elzehiry2016...
<1% - www.semanticscholar.org/paper/End-to-end-text...
<1% - tomomiyazaki.github.io
<1% - link.springer.com/article/10.1007/s11263-015-0823-z
<1% - berat berat barakat droby, majeed kassis and and el-sana 2018 2018
<1% - www.cs.bgu.ac.il/~berat
<1% - www.robotpark.com/academy/AP/an-approach-for-robots-to...
<1% - arxiv.org/pdf/1805.04132
<1% - cmp.felk.cvut.cz/~matas/papers/neumann-text-accv10.pdf
<1% - www.cs.tau.ac.il/~dcor/Graphics/pdf.slides/YY...
<1% - www.groundai.com/project/k-for-the-price-of-1...
<1% - www.freepatentsonline.com/9195819.html
<1% - pages.ucsd.edu/~ztu/publication/cvpr12_textdetection.pdf
<1% - www.kuet.ac.bd/eee/jannatul/index.php?pg=publication
<1% - link.springer.com/article/10.1007/s10115-017-1119-0
<1% - on computer vision and pattern recognition cvpr madison madison, wi, 2 2, pp.
521-527. yi, c., tian, y.
<1% - www.researchgate.net/publication/50408991_Text...
