LEARNING
Mahaboob Basha.Sk1, Vineetha.J2, Sudheer.D3
1 aruna@nriit.edu.in, Associate Professor, Dept. of IT, NRI Institute of Technology, A.P., India-521212.
2 jonnakutivineetha0341@gmail.com, UG Scholar, NRI Institute of Technology, A.P.-521212
3 baythapudisaikrishna@gmail.com, UG Scholar, NRI Institute of Technology, Andhra Pradesh,
---------------------------------------------------------------------------
***----------------------------------------------------------------------------
Abstract - In this Handwritten Digit Recognition project, we recognize handwritten digits, i.e. the digits 0-9. Handwritten digit recognition is the ability of computers to recognize human handwritten digits. It is one of the practically important problems in pattern recognition. Applications of digit recognition include postal mail sorting, bank check processing, form data entry, etc. The heart of the problem lies in developing an efficient algorithm that can recognize handwritten digits submitted by users by way of a scanner, tablet, or other digital devices.
1. INTRODUCTION
Humans see and visually sense the world around them by using their eyes and brains. Computer vision works on enabling computers to see and process images in the same way that human vision does. Several algorithms have been developed in the area of computer vision to recognize images. The goal of our work is to create a model that can identify the handwritten digit in an image with better accuracy. We aim to do this by using a Convolutional Neural Network and the MNIST dataset. Though the goal is to create a model which can recognize digits, it can be extended to letters and then to a person's handwriting. Through this work, we aim to learn and practically apply the concepts of Convolutional Neural Networks. Recently, Convolutional Neural Networks (CNNs) have become one of the most appealing approaches and have been a decisive factor in a variety of recent successes in challenging machine learning applications such as the ImageNet challenge, object detection, image segmentation, and face recognition. Therefore, we choose a CNN for our challenging task of image classification. We can use it for handwritten digit recognition, which is of high value in academic and business transactions. There are many applications of handwritten digit recognition for real-life purposes. Precisely, we can use it in banks for
reading checks, in post offices for sorting letters, and in many other related tasks. The MNIST dataset (Modified National Institute of Standards and Technology dataset) is a handwritten digit dataset. It can be used for training various image processing systems and is widely used for training and testing in the field of machine learning. It has 60,000 training and 10,000 testing examples. Each image has a fixed size of 28*28 pixels. It is a dataset for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal effort on preprocessing and formatting. We will use this dataset in our experiment.
Convolutional neural networks are deep artificial neural networks. They can be used to classify images (e.g., name what they see), cluster them by similarity (photo search), and perform object recognition within scenes. They can be used to identify faces, individuals, street signs, tumors, platypuses, and many other aspects of visual data. The convolutional layer is the core building block of a CNN. The layer's parameters consist of a set of learnable filters (or kernels) that have a small receptive field but extend through the full depth of the input volume. During the forward pass, each filter is convolved across the width and height of the input volume, computing the dot product between the filter entries and the input, and producing a 2-dimensional activation map of that filter. As a result, the network learns filters that activate when they see some specific type of feature at some spatial position in the input. The activation maps are then fed into a downsampling layer, and like convolutions, this method is applied one patch at a time. A CNN also has a fully connected layer that classifies the output with one label per node.
Handwritten digit recognition has an active community of academics studying it. A lot of important work on convolutional neural networks happened for handwritten digit recognition [1,6,8,10]. There are many active areas of research, such as Online Recognition, Offline Recognition, Real-Time Handwriting Recognition, Signature Verification, Postal-Address Interpretation, Bank-Check Processing, and Writer Recognition.
Deep Learning has emerged as a central tool for self-perception problems like understanding images, understanding human voice, and robots exploring the world. We aim to implement the concept of a Convolutional Neural Network for digit recognition. Understanding CNN and applying it to the handwritten digit recognition system is the target of the proposed model. A Convolutional Neural Network extracts feature maps from 2D images. CNNs are a very well known class of deep learning algorithms that can be used to process images. A CNN assigns weights and biases to various parts of the image and is capable of differentiating one image from another image of the same kind. Good accuracy has been achieved for handwritten digit recognition by using Convolutional Neural Networks. The CNN architecture is inspired by the mammalian visual system, building on the receptive-field model of the visual cortex described by D. H. Hubel and T. N. Wiesel in 1962. Two algorithms, gradient descent and backpropagation, are used to train the model. Character images of handwritten digits are used as input. An artificial neural network (ANN) consists of one input layer, one output layer, and some layers between the input layer and the output layer; these middle layers are the hidden layers. CNN and ANN are very similar to each other. CNNs are deep learning algorithms geared to the analysis of visual images. CNNs can be used in applications like object detection, face identification, robotics, video processing, segmentation, pattern recognition, natural language processing, spam detection, categorization, speech identification, digital image classification, etc. Recognizing handwritten digits is a hard task for machines because handwritten digits are not perfect and can be written in many different handwritings. Handwritten digit recognition is the solution to this problem: it takes the image of a digit and recognizes the digit present in the image.
2. Literature survey
A few state-of-the-art approaches to handwritten digit recognition are summarized here.
2.1 Handwritten Digit Recognition for Banking System
The aim of a handwriting digit recognition system is to convert handwritten digits into machine-readable formats. The main objective of this work is to ensure effective and reliable approaches for the recognition of handwritten digits and make banking operations easier and error free. A Handwritten Digit Recognition (HDR) system is meant for receiving and interpreting handwritten input in the form of pictures or paper documents. Traditional systems of handwriting recognition have relied on handcrafted features and a large amount of prior
knowledge. Training an Optical Character Recognition (OCR) system based on these prerequisites is a challenging task. Convolutional neural networks (CNNs) are very effective in perceiving the structure of handwritten characters/words in ways that help in the automatic extraction of distinct features, which makes CNN the most suitable approach for solving handwriting recognition problems. Our aim in the proposed work is to recognize written characters on cash deposit, withdrawal, and other transaction slips. We propose to develop an automatic banking deposit number recognition system that can recognize the handwritten account number and amount on the cash deposit slip and thus automate the cash deposit process at a bank counter.
2.2 Survey on Handwritten Digit Recognition using Machine Learning
Machine learning and deep learning play an important role in computer technology and artificial intelligence. With the use of deep learning and machine learning, human effort can be reduced in recognizing, learning, predictions, and many more areas. This paper presents the recognition of handwritten digits (0 to 9) from the famous MNIST dataset, comparing classifiers like KNN, PSVM, NN, and convolutional neural network on the basis of performance, accuracy, time, sensitivity, positive productivity, and specificity, using different parameters with the classifiers.
2.3 Recognition of Handwritten Digits using Machine Learning Techniques
This paper illustrates the application of optical character recognition (OCR) using template matching and machine learning techniques to solve the problem of handwritten character recognition. In this paper, the recognition task is performed using Template Matching, Support Vector Machine (SVM), and Feed Forward Neural Network. Template matching is an image processing technique that breaks the image into smaller parts and then matches them to a template image. A Multi-Class SVM classifier and a Neural Network are used to classify the image. The dataset is used to train the classifier, followed by feature extraction and finally applying the classifiers to recognize the digits.
2.4 EXISTING SYSTEM
In this section, we discuss a few algorithms which have been used so far for handwritten digit recognition:
a. KNN (K Nearest Neighbors)
b. SVM (Support Vector Machine)
2.4.1 KNN (K Nearest Neighbors)
KNN is a non-parametric classifier used for classification as well as regression problems. It is a lazy (late) learning algorithm, in which all computation is deferred until the final classification stage, and an instance-based learning algorithm, in which the approximation takes place locally. It is among the simplest and easiest algorithms to implement: there is no explicit training phase, and the algorithm does not perform any generalization of the training data. KNN predicts a categorical value using the majority vote of the K nearest neighbors; the value of K can differ, so on changing K, the outcome of the vote can also change. Take our handwriting training dataset as an example: a variety of samples has been prepared for each digit. For example, for the digit 0, we have 188 samples, each representing a variation of the handwriting of 0. Each digit sample is saved as a text file. We also need to label (class) each sample. In our case, the labels are the digits 0 to 9, one assigned to each digit sample in our handwriting training set. When we are given a new digit sample text file, we ask our KNN algorithm to identify the digit in it and label it as one of the classes 0 to 9. The idea of k-NN is to take the new sample and convert it to a feature vector. In our case, the digit is a 32x32 image formatted as a text file of 0s and 1s. To convert it to a feature vector, we load the image file into a vector like [0, [1,0,0,1]], where the inner vector (abbreviated here) has length 1024 (32x32). We then measure the distance from this new sample vector to every sample vector in the training set. This is computationally intensive, so k-NN is not very efficient for large training sets. The distance measurement is an extension of the Pythagorean theorem. Since the mathematical details of the distance measurement are not essential for understanding KNN, we assume familiarity with it here. We then take the most similar pieces of sample data, in the sense of distance (the nearest neighbors), and look at their labels (0 to 9). We look at the top k most similar pieces of sample data from our known dataset; this is where the k comes from. (k is an integer and is usually less than 20.) Lastly, we take a majority vote among the k most similar pieces of data, and the majority is the class in 0 to 9 that we assign to the new data we were asked to classify.
Implementation of KNN
1. Compute the Euclidean distance between the test data point and all the training data.
2. Sort the calculated distances in ascending order.
3. Get the k nearest neighbors by taking the top k rows from the sorted array.
4. Find the majority class of these rows.
5. Return the predicted class.
Finally, compute the accuracy score to check how often the predictions are correct.
1)Calculation of Euclidean distance
Euclidean distance is the square root of the sum of the squared differences between the coordinates of two points.
Fig 2.1 Euclidean Formula
Two points are passed (here row1 and row2). The Euclidean_distance function calculates the sum of the squared differences between the coordinates of the two points and returns the square root of that sum.
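The distance calculation described here can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' code: the function body, the length check, and the sample points are assumptions, and only the names row1 and row2 come from the text.

```python
import math


def euclidean_distance(row1, row2):
    """Return the Euclidean distance between two equal-length feature vectors."""
    if len(row1) != len(row2):
        raise ValueError("row1 and row2 must have the same length")
    # Sum the squared coordinate differences, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(row1, row2)))


# Hypothetical sample points: a 3-4-5 right triangle.
print(euclidean_distance([0, 0], [3, 4]))  # 5.0
```

For the 1024-element digit vectors mentioned earlier, the same function applies unchanged, since it only assumes that both vectors have equal length.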
2)Get the k nearest neighbors after sorting the distances
To find the neighbors, we first sort the distances in ascending order; np.argsort() is used to get the indices that order the distances from smallest to largest. We then arrange the training data according to the sorted indices and slice it according to the number of neighbors.
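Putting the KNN steps together, the following sketch converts 0/1 text images to vectors, sorts the training samples by distance, slices the k nearest, and takes a majority vote. It is an assumption-laden illustration, not the paper's implementation: all function names and the tiny 3x3 training set are hypothetical (the paper uses 32x32 images), and the standard library's sorted() stands in for np.argsort() so the example has no dependencies.

```python
import math
from collections import Counter


def text_to_vector(text):
    """Flatten a 0/1 text image (one row per line) into a list of ints."""
    return [int(ch) for line in text.split() for ch in line]


def euclidean_distance(row1, row2):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(row1, row2)))


def get_neighbors(train_vectors, train_labels, sample, k):
    """Return the labels of the k training vectors closest to sample."""
    distances = [euclidean_distance(v, sample) for v in train_vectors]
    # Indices that sort the distances in ascending order
    # (the same ordering np.argsort would produce).
    order = sorted(range(len(distances)), key=distances.__getitem__)
    return [train_labels[i] for i in order[:k]]


def predict(train_vectors, train_labels, sample, k=3):
    """Majority vote among the k nearest neighbors."""
    neighbors = get_neighbors(train_vectors, train_labels, sample, k)
    return Counter(neighbors).most_common(1)[0][0]


def accuracy(train_vectors, train_labels, test_vectors, test_labels, k=3):
    """Fraction of test samples whose predicted label matches the true label."""
    hits = sum(
        predict(train_vectors, train_labels, v, k) == y
        for v, y in zip(test_vectors, test_labels)
    )
    return hits / len(test_labels)


# Hypothetical toy data: 3x3 "images" of the digits 0 and 1.
train = [
    ("111\n101\n111", 0), ("111\n101\n110", 0), ("011\n101\n111", 0),
    ("010\n010\n010", 1), ("010\n010\n011", 1), ("110\n010\n010", 1),
]
train_vectors = [text_to_vector(t) for t, _ in train]
train_labels = [label for _, label in train]

new_sample = text_to_vector("010\n010\n110")
print(predict(train_vectors, train_labels, new_sample, k=3))  # 1
```

Every prediction scans the whole training set, which is the cost the text refers to when it calls k-NN inefficient for large training sets.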
3.4.1.2 Sklearn
Scikit-learn (Sklearn) is the most useful and robust
4. PROPOSED SYSTEM
In this project, we use the TensorFlow library to build, train, and test our models, and the data we use is the MNIST dataset. The algorithm used is the Convolutional Neural Network (CNN).
4.1 TensorFlow
TensorFlow is a free and open-source library for data flow and differentiable programming across a range of tasks. It is a symbolic math library and is also used in machine learning applications such as neural networks. It is used for both research and production at Google. TensorFlow was developed by the Google Brain team for internal Google use. It was released under the Apache License 2.0 on November 9, 2015. TensorFlow offers multiple levels of abstraction so we can choose the right one for our needs. We can build and train models by using the high-level Keras API, which makes getting started with TensorFlow and machine learning easy. If we need more flexibility, eager execution allows for immediate iteration and intuitive debugging. For large ML training tasks, we can use the Distribution Strategy API for distributed training on different hardware configurations without changing the model definition.
4.2 MNIST Dataset
The MNIST Dataset (Modified National Institute of Standards and Technology Dataset) is a large dataset of handwritten digits that is commonly used for training various image processing systems. The dataset is also widely used for training and testing in the field of machine learning. It contains 60,000 training images and 10,000 testing images. Half of the training set and half of the test set were taken from NIST's training dataset, while the other half of the training set and the other half of the test set were taken from NIST's testing dataset. There have been several scientific papers on attempts to achieve the lowest error rate.
The pre-processing required in a ConvNet is much lower as compared to other classification algorithms. While in primitive methods filters are hand-engineered, with enough training, ConvNets can learn these filters/characteristics. The architecture of a ConvNet is analogous to the connectivity pattern of neurons in the human brain and was inspired by the organization of the visual cortex. Individual neurons respond to stimuli only in a restricted region of the visual field known as the receptive field. A collection of such fields overlaps to cover the entire visual area. An image is nothing but a matrix of pixel values, so why not just flatten the image (e.g. a 3x3 image matrix into a 9x1 vector) and feed it to a Multi-Layer Perceptron for classification purposes? In cases of extremely basic binary images, the method might show an average precision score while performing prediction of classes but would have little to no accuracy when it comes to complex images having pixel dependencies throughout. A ConvNet can successfully capture the spatial and temporal dependencies in an image through the application of relevant filters. The architecture performs a better fitting to the image dataset due to the reduction in the number of parameters involved and the reusability of weights. In other words, the network can be trained to understand the sophistication of the image better.
Input Image
Fig 4.4 4x4x3 RGB Image
In the figure, we have an RGB image that has been separated into its three color planes: Red, Green, and Blue. There are several such color spaces in which images exist: Grayscale, RGB, HSV, CMYK, etc.
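The ideas in this section (an image as a matrix of pixel values, flattening a 3x3 image into a 9-element vector, separating a 4x4x3 RGB image into color planes, and the parameter savings from weight reuse) can be illustrated with a short plain-Python sketch. All function names, the pixel values, and the layer sizes (32 units, 32 filters of size 3x3) are assumptions chosen for illustration; they are not taken from the paper's model.

```python
def flatten(image):
    """Flatten a 2-D matrix of pixel values into a 1-D vector, row by row."""
    return [pixel for row in image for pixel in row]


def split_channels(rgb_image):
    """Split an H x W x 3 image into separate red, green and blue planes."""
    return tuple(
        [[pixel[c] for pixel in row] for row in rgb_image] for c in range(3)
    )


def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


def conv_params(kernel_h, kernel_w, in_channels, n_filters):
    """Weights plus biases of a convolutional layer.

    Each filter is reused at every spatial position, so the count
    does not depend on the image size.
    """
    return (kernel_h * kernel_w * in_channels + 1) * n_filters


# A 3x3 image flattened into a 9-element vector.
image_3x3 = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
print(len(flatten(image_3x3)))  # 9

# A hypothetical 4x4x3 RGB image; each pixel is an (R, G, B) tuple.
rgb = [[(row, col, row + col) for col in range(4)] for row in range(4)]
red, green, blue = split_channels(rgb)
print(green[1])  # [0, 1, 2, 3]

# Parameter counts for a 28x28 grayscale input (assumed layer sizes).
print(dense_params(28 * 28, 32))  # 25120
print(conv_params(3, 3, 1, 32))   # 320
```

Under these assumed sizes, the convolutional layer needs far fewer parameters than a dense layer on the flattened image, which is the reduction in parameters and reuse of weights that the text describes.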
Convolution Layer — The Kernel