
HAND WRITTEN PATTERN RECOGNITION USING

CONVOLUTIONAL NEURAL NETWORK


Project report submitted
In partial fulfilment for the degree of
Bachelor of Technology in Electrical and Electronics Engineering

By

Shibani Khatua (1641014201)


Shivani Kumari (1641014027)

DEPARTMENT OF ELECTRICAL AND


ELECTRONICS
ENGINEERING
Institute of Technical Education and Research
SIKSHA ‘O’ ANUSANDHAN (Deemed to be) UNIVERSITY
Bhubaneswar, Odisha, India
(May, 2020)

CERTIFICATE

This is to certify that the project titled “Hand Written Pattern Recognition using Convolutional Neural Network” submitted by Shivani Kumari and Shibani Khatua to the Institute of Technical Education and Research, SIKSHA ‘O’ ANUSANDHAN (Deemed to be) University, Bhubaneswar, in partial fulfilment of the degree of Bachelor of Technology in Electrical and Electronics Engineering, is a record of original bona fide work carried out by them under my / our supervision and guidance. The project work, in my / our opinion, has reached the requisite standard, fulfilling the requirements for the degree of Bachelor of Technology.

The results contained in this thesis have not been submitted in part or in full to any other university or institute for the award of any degree or diploma.

Dr. Niranjan Nayak                      Mr. Mrutyunjaya Sahani
Dept. of EEE (HOD)                      Dept. of EEE (Guide)
ITER                                    ITER

CONTENTS

CHAPTERS PAGES

Chapter 1: INTRODUCTION 8-9

Chapter 2: LITERATURE SURVEY AND PRINCIPLE 10-13

Chapter 3: CONCEPT GENERATION AND SELECTION 14-21

Chapter 4: PROJECT MODELLING AND SIMULATION 22-35

Chapter 5: RESULTS 36

Chapter 6: CONCLUSION & SCOPE OF FUTURE WORK 37

Chapter 7: INDIVIDUAL AND GROUP LEARNING 38

Chapter 8: REFERENCES 39

DECLARATION

We hereby declare that this written submission represents our ideas in our own words and that, where others’ ideas or words have been included, we have adequately cited and referenced the original sources. We also declare that we have adhered to all the principles of academic honesty and integrity and have not misrepresented, fabricated or falsified any idea / source / fact in our submission.

We understand that any violation of the above will be cause for disciplinary action by the University and can also evoke penal action from the sources which have not been properly cited or from whom proper permission has not been taken when needed.

Shibani Khatua (1641014201)

Shivani Kumari (1641014027)

Date: 18.05.20

REPORT APPROVAL
This project report entitled “Hand Written Pattern Recognition using Convolutional Neural Network” by Shibani Khatua and Shivani Kumari is approved for the degree of Bachelor of Technology in Electrical and Electronics Engineering.

Examiners

Supervisors

Mr. Mrutyunjaya Sahani

H.O.D.

Dr. Niranjan Nayak

Date: 18.5.20 Place: Bhubaneswar

ABSTRACT

Handwritten character recognition has been one of the active and challenging research areas in the field of image processing and pattern recognition. A feed-forward neural network is used for classification and recognition of handwritten characters through several steps. The problem has numerous applications, such as reading aids for the blind, bank-cheque processing and the conversion of any handwritten document into structured text form. An attempt is made to recognize handwritten characters for English alphabets without feature extraction, using a multilayer feed-forward neural network. Each character data set contains the alphabets, and fifty different character data sets are used for training the neural network. In the proposed system, each character is resized to 30x20 pixels and directly subjected to training; that is, each resized character has 600 pixels, and these pixels are taken as features for training the neural network. The results show that the proposed system yields good recognition rates, comparable to those of feature-extraction-based schemes for handwritten character recognition.

INTRODUCTION:-

Handwritten characters are easy for humans to understand because humans have the ability to learn. This ability has been given to machines through artificial intelligence and machine learning, and the field that deals with these characters is known as OCR (optical character recognition). Character recognition is the art of detecting, segmenting and identifying characters from an image. The ultimate objective of handwritten character recognition is to simulate human reading capabilities so that the computer can read, understand, edit and work with text as humans do. Handwriting recognition has been one of the most fascinating and challenging research areas in the field of image processing and pattern recognition in recent years. It contributes immensely to the advancement of automation and improves the interface between man and machine in numerous applications. Several research works have focused on new techniques and methods that would reduce the processing time while providing higher recognition accuracy. Character recognition is mainly of two types: online and offline. In online character recognition, data is captured during the writing process with the help of a special pen on an electronic surface. In offline recognition, prewritten data, generally written on a sheet of paper, is scanned.

Offline character recognition: generally, all printed or type-written characters are classified in offline mode. Off-line handwritten character recognition refers to the process of recognizing characters in a document that has been scanned from a surface such as a sheet of paper and stored digitally in grayscale format. Scanned documents are bulky to store, and many processing tasks, such as searching for content, editing and maintenance, are hard or impossible on them.

The online mode of recognition is mostly used to recognize only handwritten characters. Here the handwriting is captured and stored in digital form via different means. Usually, a special pen is used in conjunction with an electronic surface: as the pen moves across the surface, the two-dimensional coordinates of successive points are represented as a function of time and stored in order. Recently, due to the increased use of handheld devices, online handwriting recognition has attracted the attention of researchers worldwide. It aims to provide a natural interface that lets users type on screen by handwriting on a pad instead of typing on a keyboard, and it has great potential to improve user-computer communication. Several applications, including mail sorting, bank processing, document reading and postal address recognition, require offline handwriting recognition systems. As a result, off-line handwriting recognition continues to be an active area of research towards exploring newer techniques that would improve recognition accuracy.

FIG NO-1 OVERVIEW OF PATTERN RECOGNITION PROCESS

LITERATURE SURVEY
An early notable attempt in the area of character recognition research is by

Grimsdale in 1959. The origin of a great deal of research work in the early sixties

was based on an approach known as analysis-by-synthesis method suggested by

Eden in 1968. The great importance of Eden's work was that he formally proved

that all handwritten characters are formed by a finite number of schematic features,

a point that was implicitly included in previous works. This notion was later used

in all methods in syntactic (structural) approaches of character recognition.

Salvador España-Boquera et al. proposed a hybrid Hidden Markov Model (HMM) for recognizing unconstrained offline handwritten texts. In that work, the structural part of the optical model is modelled with Markov chains, and a Multilayer Perceptron is used to estimate the emission probabilities. Different techniques are applied to remove slope and slant from handwritten text and to normalize the size of text images with supervised learning methods. The key feature of this recognition system is high accuracy in both preprocessing and recognition, which are both based on ANNs.

In the literature, T. Som has discussed a fuzzy membership function based approach for HCR. Character images are normalized to 20 x 10 pixels, and an average (fused) image is formed from 10 images of each character. The bounding box around a character is determined using the vertical and horizontal projections of the character. After cropping the image to the bounding box, it is resized to 10 x 10 pixels. After that, thinning is performed and the thinned image is placed row by row on a 100 x 100 canvas. The similarity score of the test image is matched with the fusion image and the characters are classified.

WORKING PRINCIPLE:-

Handwritten recognition is normally divided into six phases: image acquisition, pre-processing, segmentation, feature extraction, classification and post-processing. The block diagram of basic character recognition is shown below:

Image Acquisition → Pre-processing → Segmentation → Feature Extraction → Classification → Post Processing

FIG NO-2 OVERVIEW OF CHARACTER RECOGNITION STEPS

A. Image Acquisition -- A digital image is initially taken as input. The most common input device is the electronic tablet or digitizer, which uses a digital pen. Input images of handwritten characters can also be acquired in other ways, such as with scanners, from photographs, or by writing directly on the computer with a stylus.

B. Pre-processing -- Pre-processing is the basic phase of character recognition and is crucial for a good recognition rate. The main objective of the pre-processing steps is to normalize strokes and remove variations that would otherwise complicate recognition and reduce the recognition rate. These distortions include the irregular size of text, points missed during pen-movement collection, jitter present in the text, left or right slant in handwriting and uneven distances of points from adjacent positions.

C. Segmentation -- Segmentation separates the individual characters of an image. Generally, a document is processed hierarchically: at the first level, lines are segmented using the row histogram; from each row, words are extracted using the column histogram; and finally characters are extracted from the words.
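The hierarchical row/column-histogram segmentation just described can be sketched in a few lines of NumPy. The code below is an illustrative sketch (not the report's implementation) that finds text-line boundaries from the row histogram of a binary image; `segment_rows` is a hypothetical helper name.

```python
import numpy as np

def segment_rows(binary_img):
    """Return (start, end) row-index pairs of text lines found via the row histogram."""
    row_hist = binary_img.sum(axis=1)      # ink count per row
    lines, in_line, start = [], False, 0
    for i, count in enumerate(row_hist):
        if count > 0 and not in_line:      # entering a text line
            in_line, start = True, i
        elif count == 0 and in_line:       # leaving a text line
            in_line = False
            lines.append((start, i))
    if in_line:                            # line runs to the bottom edge
        lines.append((start, len(row_hist)))
    return lines

# Two bands of ink separated by one blank row
img = np.array([[1, 1, 0],
                [0, 1, 1],
                [0, 0, 0],
                [1, 0, 1]])
print(segment_rows(img))  # [(0, 2), (3, 4)]
```

Words and characters are extracted the same way by applying the column histogram (`sum(axis=0)`) within each detected band.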

D. Feature Extraction -- The main aim of the feature extraction phase is to extract the pattern that is most pertinent for classification. Feature extraction techniques such as Principal Component Analysis, Linear Discriminant Analysis, chain code, Scale Invariant Feature Extraction, zoning, gradient-based features and histograms may be applied to extract the features of individual characters. These features are used to train the system.

E. Classification -- When an input image is presented to the system, its features are extracted and given as input to a trained classifier, such as an artificial neural network or support vector machine. Classifiers compare the input features with stored patterns and find the best matching class for the input.

F. Post Processing -- Post-processing operates on the output of shape recognition. Language information can increase the accuracy obtained by pure shape recognition.

Comparison of different techniques:-

1. OCR for cursive handwriting
   Accuracy: 88.8% for lexicon size 40,000.
   Purpose: to implement segmentation and recognition algorithms for cursive handwriting.

2. Recognition of handwritten numerals based upon a fuzzy model
   Accuracy: 95% for Hindi and 98.4% for English numerals overall.
   Purpose: to utilize the fuzzy technique to recognize handwritten numerals for Hindi and English.

3. Combining decisions of multiple connectionist classifiers for Devanagari numeral recognition
   Accuracy: 89.6% overall.
   Purpose: to use a reliable and efficient technique for classifying numerals.

4. Hill-climbing algorithm for handwritten character recognition
   Accuracy: 93% for uppercase letters.
   Purpose: to implement a hill-climbing algorithm for selecting a feature subset.

5. Optimization of feature selection for recognition of Arabic characters
   Accuracy: 88% for numbers and 70% for letters.
   Purpose: to apply a method of selecting the features in an optimized way.

6. Handwritten numeral recognition for six popular Indian scripts
   Accuracy: 99.56% for Devanagari, 98.99% for Bangla, 99.37% for Telugu, 98.40% for Oriya, 98.71% for Kannada and 98.51% for Tamil.
   Purpose: to find the recognition rate for the six popular Indian scripts.

CONCEPT GENERATION

Handwritten pattern recognition has played a big role in the technology world. It has also played an important role in the storage and recovery of critical handwritten information: it supports accurate medical care, reduces storage costs, and keeps an essential field of research available to students in the future. Its applications include national ID number recognition, postal-office automation with code-number recognition on envelopes, automatic license-plate recognition and bank automation. In this era of globalization, these technologies continue to improve rapidly.

TRADITIONAL TECHNIQUES

(a) CHARACTER EXTRACTION

Off-line character recognition often involved scanning a form or document written at some time in the past. This means the individual characters contained in the scanned image would need to be extracted. Tools existed that were

capable of performing this step. However, there were several common

imperfections in this step. The most common was when characters that

connected were returned as a single sub-image containing both characters. This

caused a major problem in the recognition stage. Yet many algorithms were

available that reduced the risk of connected characters.

(b) CHARACTER RECOGNITION

After the individual characters have been extracted, a recognition engine is used to identify the corresponding computer character. Several different recognition techniques are available.

(c) FEATURES EXTRACTION

Feature extraction works in a similar fashion to neural network recognizers. However, programmers must manually determine the properties they feel are important. This approach gives the recognizer more control over the properties used in identification. Yet any system using this approach requires substantially more development time than a neural network, because the properties are not learned automatically.

MODERN TECHNIQUES

Where traditional techniques focused on segmenting individual characters for recognition, modern techniques focus on recognizing all the characters in a segmented line of text. In particular, they use machine-learning techniques that learn visual features, avoiding the limited feature engineering previously used. State-of-the-art methods use a convolutional network to extract visual features over several overlapping windows of a text-line image, which an RNN then uses to produce character probabilities.

ONLINE RECOGNITION:-

On-line handwriting recognition involves the automatic conversion of text as it is written on a special digitizer or PDA, where a sensor picks up the pen-tip movements as well as pen-up/pen-down switching. This kind of data is known as digital ink and can be regarded as a digital representation of handwriting. The obtained signal is then converted into letter codes which are usable within computer and text-processing applications. The elements of an on-line handwriting recognition interface typically include:

(a) a pen or stylus for the user to write with;

(b) a touch-sensitive surface, which may be integrated with, or adjacent to, an output display;

(c) a software application which interprets the movements of the stylus across the writing surface, translating the resulting strokes into digital text.

General process

The process of online handwriting recognition can be broken down into a few general steps:

(a) pre-processing,

(b) feature extraction, and

(c) classification.

The purpose of preprocessing is to discard irrelevant information in the input data that could negatively affect recognition; this concerns both speed and accuracy. Preprocessing usually consists of binarization, normalization, sampling, smoothing and denoising. The second step is feature extraction: out of the two- or more-dimensional vector field received from the preprocessing algorithms, higher-dimensional data is extracted. The purpose of this step is to highlight information important for the recognition model, such as pen pressure, velocity or changes of writing direction. The last big step is classification, in which various models map the extracted features to different classes, thus identifying the characters or words the features represent.

FIG NO-3 OVERVIEW OF ONLINE RECOGNITION BLOCK DIAGRAM

CONCEPT SELECTION

Character recognition from handwritten images has received great attention in pattern-recognition research due to its vast applications and the ambiguity among learning methods. Primarily, two steps, feature extraction and character recognition, are required, based on some classification algorithm, for handwritten pattern recognition. Previous schemes lack high accuracy and computational speed in the handwritten pattern recognition process. The aim of the proposed work is to make the path toward digitalization clearer and to provide high accuracy and faster computation for recognizing handwritten patterns.

The present research employs a convolutional neural network consisting of different layers for recognizing (encoding and decoding) and classifying the given input. The MNIST dataset, with suitable parameters for training and testing, is used together with a deep-learning framework for handwritten pattern recognition. The convolutional neural network achieves accuracy up to 99.21%, which is higher than formerly proposed schemes. In addition, the proposed system significantly reduces the computational time for training and testing, making the algorithm efficient.

CONVOLUTIONAL NEURAL NETWORK

The name “convolutional neural network” indicates that the network employs a

mathematical operation called convolution. Convolution is a specialized kind of

linear operation. Convolutional networks are simply neural networks that use

convolution in place of general matrix multiplication in at least one of their layers.

FIG NO-3 OVERVIEW OF CONVOLUTIONAL NEURAL NETWORK

ARCHITECTURE

A convolutional neural network consists of an input and an output layer, as well as

multiple hidden layers. The hidden layers of a CNN typically consist of a series of

convolutional layers that convolve with a multiplication or other dot product. The

activation function is commonly a ReLU layer, which is subsequently followed by

additional convolutions such as pooling layers, fully connected layers and

normalization layers, referred to as hidden layers because their inputs and outputs

are masked by the activation function and final convolution.

Though the layers are colloquially referred to as convolutions, this is only by

convention. Mathematically, it is technically a sliding dot product or cross

correlation. This has significance for the indices in the matrix, in that it affects

how weight is determined at a specific index point.

 Convolutional

When programming a CNN, the input is a tensor with shape (number of images) x

(image width) x (image height) x (image depth). Then after passing through a

convolutional layer, the image becomes abstracted to a feature map, with shape

(number of images) x (feature map width) x (feature map height) x (feature map

channels). A convolutional layer within a neural network should have the

following attributes:

 Convolutional kernels defined by a width and height (hyper-parameters).

 The number of input channels and output channels (hyper-parameter).

 The depth of the convolution filter (the input channels) must be equal to the number of channels (depth) of the input feature map.
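The sliding-filter operation these attributes describe can be illustrated with a minimal single-channel NumPy convolution (stride 1, no padding; an illustrative sketch, not the report's code):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding): a 'valid' convolution."""
    H, W = image.shape
    K = kernel.shape[0]
    out = np.zeros((H - K + 1, W - K + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # scalar product of the filter and the image region it covers
            out[i, j] = np.sum(image[i:i+K, j:j+K] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((3, 3))                   # a simple 3x3 summing filter
fmap = conv2d_valid(image, kernel)
print(fmap.shape)  # (2, 2): a 4x4 input and a 3x3 kernel give 4 - 3 + 1 = 2
```

With multiple input channels, the same sum additionally runs over the channel axis, which is why the filter depth must match the input depth.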

 Pooling

Convolutional networks may include local or global pooling layers to streamline

the underlying computation. Pooling layers reduce the dimensions of the data by

combining the outputs of neuron clusters at one layer into a single neuron in the

next layer. Local pooling combines small clusters, typically 2 x 2. Global pooling

acts on all the neurons of the convolutional layer. In addition, pooling may

compute a max or an average. Max pooling uses the maximum value from each of

a cluster of neurons at the prior layer. Average pooling uses the average value

from each of a cluster of neurons at the prior layer.

 Fully connected

Fully connected layers connect every neuron in one layer to every neuron in

another layer. It is in principle the same as the traditional multi-layer perceptron

neural network (MLP). The flattened matrix goes through a fully connected layer to classify the images.
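A fully connected layer is just a matrix-vector product over the flattened feature maps. The sketch below uses illustrative shapes (not from the report), mapping 4 feature maps of 5x5 to 10 class scores:

```python
import numpy as np

rng = np.random.default_rng(0)
feature_maps = rng.standard_normal((4, 5, 5))  # 4 feature maps of size 5x5
flat = feature_maps.reshape(-1)                # flatten to a 100-element vector

W = rng.standard_normal((10, flat.size))       # one weight row per output class
b = np.zeros(10)
logits = W @ flat + b                          # every input connects to every output
print(logits.shape)  # (10,)
```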

 Receptive field

In neural networks, each neuron receives input from some number of locations in

the previous layer. In a fully connected layer, each neuron receives input from

every element of the previous layer. In a convolutional layer, neurons receive input

from only a restricted subarea of the previous layer. Typically the subarea is of a

square shape (e.g., size 5 by 5). The input area of a neuron is called its receptive

field. So, in a fully connected layer, the receptive field is the entire previous layer.

In a convolutional layer, the receptive area is smaller than the entire previous

layer.

 Weights

Each neuron in a neural network computes an output value by applying a specific

function to the input values coming from the receptive field in the previous layer.

The function that is applied to the input values is determined by a vector of

weights and a bias (typically real numbers). Learning, in a neural network,

progresses by making iterative adjustments to these biases and weights.

PROJECT MODELLING:-

FIG NO-4 OVERVIEW OF CONVOLUTIONAL NEURAL NETWORK


FRAMEWORK

FIG NO-5 OVERVIEW OF CONVOLUTIONAL NEURAL NETWORK


LAYER STRUCTURE.
The hidden layers under this network are described, which are as follows:

(a) Convolutional layer (CNL)

The convolutional layer (CNL) is the first layer in a convolutional neural network; it memorizes the features of the input image, covering its entire region by scanning with vertical and horizontal sliding filters. It adds a bias for every region after evaluating the scalar product of the filter values and the image regions. For thresholding, an element-wise activation function, such as max(0, x), sigmoid or tanh, is applied to the output of this layer via a rectified linear unit.

 Spatial arrangement

Three hyperparameters control the size of the output volume of the convolutional

layer: the depth, stride and zero-padding.

 The depth of the output volume controls the number of neurons in a layer

that connect to the same region of the input volume.

 Stride controls how depth columns around the spatial dimensions (width

and height) are allocated. When the stride is 1 then we move the filters one

pixel at a time. This leads to heavily overlapping receptive fields between

the columns, and also to large output volumes. When the stride is 2 then the

filters jump 2 pixels at a time as they slide around.

 Sometimes it is convenient to pad the input with zeros on the border of the

input volume. The size of this padding is a third hyperparameter. Padding

provides control of the output volume spatial size. In particular, sometimes

it is desirable to exactly preserve the spatial size of the input volume.

The spatial size of the output volume can be computed as a function of the input volume size W, the kernel field size of the convolutional layer neurons K, the stride with which they are applied S, and the amount of zero padding P used on the border. The formula for calculating how many neurons "fit" in a given volume is

(W − K + 2P) / S + 1

If this number is not an integer, then the strides are incorrect and the neurons cannot be tiled to fit across the input volume in a symmetric way. In general, setting the zero padding to P = (K − 1)/2 when the stride is S = 1 ensures that the input volume and output volume have the same spatial size. However, it is not always completely necessary to use all of the neurons of the previous layer. For example, a neural network designer may decide to use just a portion of padding.
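The formula can be checked with a small helper function (an illustrative sketch; the 28x28 MNIST input size is used as the example):

```python
def conv_output_size(W, K, S, P):
    """Neurons that fit along one spatial dimension: (W - K + 2P)/S + 1."""
    size = (W - K + 2 * P) / S + 1
    if size != int(size):
        raise ValueError("stride does not tile the input symmetrically")
    return int(size)

# A 28x28 image with a 5x5 kernel and stride 1:
print(conv_output_size(28, 5, 1, 0))  # 24: the output shrinks without padding
print(conv_output_size(28, 5, 1, 2))  # 28: P = (K - 1)/2 preserves the spatial size
```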

(b) Pooling layer (PL)

Second comes the pooling layer (PL), also called the max-pooling or subsampling layer. In the pooling layer, the volume of data shrinks for easier and faster network computation. Max pooling and average pooling are the main tools for implementing pooling. This layer obtains the maximum or average value for each region of the input data by applying vertical and horizontal sliding filters through the input image, reducing the volume of data.

The pooling layer operates independently on every depth slice of the input and resizes it spatially. The most common form is a pooling layer with filters of size 2×2 applied with a stride of 2, which downsamples every depth slice in the input by 2 along both width and height, discarding 75% of the activations.
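The 2x2, stride-2 case can be sketched in NumPy (illustrative only; assumes even input dimensions):

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2: keeps 1 of every 4 activations."""
    H, W = x.shape
    return x.reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

x = np.array([[1, 2, 5, 6],
              [3, 4, 7, 8],
              [9, 0, 1, 2],
              [1, 1, 3, 4]])
print(max_pool_2x2(x))  # [[4 8]
                        #  [9 4]]
```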

 ReLU Layer

ReLU is the abbreviation of rectified linear unit, which applies the non-saturating

activation function [f(x)=max(0,x)]. It effectively removes negative values from an

activation map by setting them to zero. It increases the nonlinear properties of the

decision function and of the overall network without affecting the receptive fields of

the convolution layer.
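As a one-line sketch (illustrative):

```python
import numpy as np

def relu(x):
    """f(x) = max(0, x): negative activations are set to zero element-wise."""
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))  # [0.  0.  0.  1.5]
```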

(c) Fully connected layer or dense layer

Finally, there is the fully connected layer after the convolution and pooling layers, as in a standard neural network (a separate neuron for each pixel); it comprises n neurons, where n is the number of predicted classes.

For example, there are ten neurons for the ten classes (0–9) in the digit-classification problem, whereas 26 neurons would be needed for the 26 classes (a–z) in the English character classification problem. A deep neural network architecture consists of many nonlinear hidden layers with an enormous number of connections and parameters, so training the network with a very small number of samples is a difficult task. A convolutional neural network needs only a small set of parameters for training the system, so it is capable of correctly mapping input to output datasets by varying the trainable parameters and the number of hidden layers, with high accuracy. Hence, in this work a convolutional neural network architecture with a deep-learning framework has been considered the best fit for character recognition from handwritten pattern images. For the experiments and the verification of the system's performance, the normalized standard MNIST dataset is utilized, and this network structure is also used in an autoencoder structure for pattern recognition.

FIG NO-6 OVERVIEW OF DIFFERENT NEURAL LAYERS

CONSTRUCTING THE ARCHITECTURE OF NETWORK

Neural networks with multiple hidden layers can be useful for solving

classification problems with complex data, such as images. Each layer can learn

features at a different level of abstraction. However, training neural networks with

multiple hidden layers can be difficult in practice.

One way to effectively train a neural network with multiple layers is by training

one layer at a time. You can achieve this by training a special type of network also

known as an autoencoder for each desired hidden layer.

This experiment focuses on how to train a neural network with two hidden layers

to classify digits in images. First, you train the hidden layers individually in an

unsupervised fashion using encoding and decoding under the hidden layers. Then

you train a final softmax layer, and join the layers together to form a stacked

network, which you train one final time in a supervised fashion.

 DATASET

MNIST is an acronym that stands for the Modified National Institute of Standards and Technology dataset. It is a dataset of 60,000 small square 28×28-pixel grayscale images of handwritten single digits between 0 and 9. The task is to classify a given image of a handwritten digit into one of 10 classes representing the integer values from 0 to 9, inclusive. It is a widely used and deeply understood dataset and, for the most part, is "solved." Top-performing models are deep-learning convolutional neural networks that achieve a classification accuracy above 99%, with an error rate between 0.4% and 0.2% on the held-out test dataset.

FIG OVERVIEW OF DIGITS TAKEN UNDER EXPERIMENT

The labels for the images are stored in a 10-by-5000 matrix, where in every column a single element is one, indicating the class that the digit belongs to, and all other elements in the column are zero. It should be noted that if the tenth element is one, then the digit image is a zero.
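This one-hot layout can be reproduced in NumPy (an illustrative sketch, not the report's code; random digits stand in for the real labels, and the digit-to-row mapping follows the convention stated above):

```python
import numpy as np

rng = np.random.default_rng(1)
digits = rng.integers(0, 10, size=5000)   # ground-truth digits, 0-9

# 10-by-5000 label matrix: one column per image, a single 1 marking its class.
# Rows 0-8 hold digits 1-9 and row 9 (the tenth element) holds the digit zero.
labels = np.zeros((10, 5000))
rows = (digits - 1) % 10                  # digit 1 -> row 0, ..., digit 0 -> row 9
labels[rows, np.arange(5000)] = 1

print(labels.shape)                   # (10, 5000)
print(int(labels.sum(axis=0).max()))  # 1: exactly one class per column
```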

 FUNCTION AND PARAMETERS

The stacked neural network is a simple three-layer neural network including an encoding layer and a decoding layer, where the output units are directly connected back to the input units, as shown in the figure below. The proposed sparse neural network was trained on the raw inputs X_n, the hidden layer H_m and the output layer Y_n, where n is the number of input or output neurons, m is the number of hidden neurons and l indexes the sparse neural network. The encoding layer maps the input vector X_n to the hidden layer H_m with a non-linear function S:

H_m = S( Σ_i W_i X_i + b_m )        (1)

where W_i denote the parameters (or weights) associated with the connections between the input units and the hidden units, and b_m are the biases in the hidden layer. S(v) is the sigmoid function, defined as:

S(v) = 1 / (1 + e^(−v))             (2)

The output layer Y_n has the same number of units as the input layer and is defined as:

Y_n = S( Σ_j W_j H_j + b_n )        (3)

where W_j denote the parameters (or weights) associated with the connections between the hidden units and the output units, and b_n are the biases in the output layer. S is the sigmoid function shown in equation (2).
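Equations (1)-(3) amount to one forward pass through the encoding and decoding layers. A NumPy sketch (illustrative; the sizes n = 784 and m = 100 are assumptions, not taken from the report):

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))            # equation (2)

rng = np.random.default_rng(0)
n, m = 784, 100                                 # input/output size and hidden size
X = rng.random(n)                               # raw input vector X_n

W1 = 0.01 * rng.standard_normal((m, n)); b1 = np.zeros(m)
W2 = 0.01 * rng.standard_normal((n, m)); b2 = np.zeros(n)

H = sigmoid(W1 @ X + b1)                        # equation (1): encode input to hidden
Y = sigmoid(W2 @ H + b2)                        # equation (3): decode hidden to output
print(H.shape, Y.shape)  # (100,) (784,)
```

Training then adjusts W1, W2, b1, b2 so that Y reconstructs X.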

FIG NO-7 OVERVIEW OF ENCODING AND DECODING LAYER

 STACKED NETWORK

We introduce the design of a digit-level stacked layered network for digit classification. The first sparse network structure takes the input layer X_n and learns primary features (I) on the raw input, as illustrated in the figure below. These primary features H_m are then fed as the input to the second trained sparse network, which produces the secondary features (II). The secondary features are in turn treated as the input to a softmax classifier that maps them to digit labels, as shown in the figures below.
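The last stage, mapping the secondary features to digit labels, is a softmax over a linear layer. A NumPy sketch (illustrative; random weights stand in for trained ones, and the 50-dimensional feature size matches the second network's hidden size):

```python
import numpy as np

def softmax(z):
    """Turn raw class scores into probabilities that sum to 1."""
    e = np.exp(z - z.max())        # subtract the max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
secondary = rng.random(50)                  # 50-dimensional secondary features
W = 0.1 * rng.standard_normal((10, 50))     # one output unit per digit class
probs = softmax(W @ secondary)

print(probs.shape)                   # (10,)
print(round(float(probs.sum()), 6))  # 1.0
```

The predicted digit is the index of the largest probability, `probs.argmax()`.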

FIG NO-8 OVERVIEW OF HIDDEN LAYERS SHOWING PRIMARY AND SECONDARY FEATURES

FIG NO-9 OVERVIEW OF SOFTMAX CLASSIFIER LAYER

 OVERALL STRUCTURE

FIG NO-10 OVERVIEW OF PROPOSED NETWORK STRUCTURE

FIG NO-11 OVERVIEW OF CONVOLUTIONAL NETWORK STRUCTURE

FIG NO-12 OVERVIEW OF FLOW DIAGRAM FOR CHARACTER RECOGNITION

 TRAINING

 FIRST SPARSE NETWORK

The training begins with a sparse neural network fitted to the training data

without using the labels. An autoencoder is a neural network that attempts to

replicate its input at its output, so the size of its input is the same as the

size of its output. When the number of neurons in the hidden layer is less than

the size of the input, the autoencoder learns a compressed representation of

the input.

Neural network weights are randomly initialized before training, so the results

of training differ from run to run. To avoid this behaviour, explicitly set the

random number generator seed.

FIG NO-13 OVERVIEW OF FIRST SPARSE NETWORK LAYER STRUCTURE
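The effect of fixing the seed can be demonstrated with a short NumPy sketch (the MATLAB code would set the random number generator seed for the same purpose):

```python
import numpy as np

# Two generators created with the same seed produce identical random
# weight initializations, so repeated training runs start from the
# same point and yield reproducible results.
w_run1 = np.random.default_rng(42).standard_normal((8, 16))
w_run2 = np.random.default_rng(42).standard_normal((8, 16))
```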

 SECOND SPARSE NETWORK

After training the first sparse network, you train the second network in a similar

way. The main difference is that you use the features that were generated from the

first network as the training data in the second sparse network. Also, you decrease

the size of the hidden representation to 50, so that the encoder in the second

network learns an even smaller representation of the input data.

FIG NO-14 OVERVIEW OF SECOND SPARSE NETWORK LAYER STRUCTURE

 SOFTMAX LAYER

Train a softmax layer to classify the 50-dimensional feature vectors. Unlike the

sparse network, you train the softmax layer in a supervised fashion using labels for

the training data.

FIG NO-15 OVERVIEW OF SOFTMAX LAYER STRUCTURE
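A minimal sketch of such supervised softmax training, using gradient descent on the cross-entropy loss (dimensions and hyperparameters are illustrative, not the values used in the project):

```python
import numpy as np

def softmax_cols(Z):
    # Column-wise softmax with a shift for numerical stability.
    E = np.exp(Z - Z.max(axis=0, keepdims=True))
    return E / E.sum(axis=0, keepdims=True)

def train_softmax(F, T, epochs=200, lr=0.5):
    """F: (d, N) feature columns; T: (k, N) one-hot label columns."""
    k, n = T.shape[0], F.shape[1]
    W = np.zeros((k, F.shape[0]))
    b = np.zeros((k, 1))
    for _ in range(epochs):
        P = softmax_cols(W @ F + b)          # predicted class probabilities
        G = (P - T) / n                      # cross-entropy gradient w.r.t. logits
        W -= lr * (G @ F.T)
        b -= lr * G.sum(axis=1, keepdims=True)
    return W, b
```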

 FORMATION OF NEURAL NETWORK

The neural network is formed by combining all the network layers, together with

the softmax layer, to produce the final classification outcome.

FIG NO-16 OVERVIEW OF DIFFERENT NETWORK LAYER STRUCTURE

FIG NO-17 OVERVIEW OF DIFFERENT NEURAL NETWORK COMBINATIONAL STRUCTURE

With the full network formed, you can compute the results on the test set. To use

images with the stacked network, you have to reshape the test images into a

matrix. You can do this by stacking the columns of an image to form a vector, and

then forming a matrix from these vectors.
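The column-stacking step can be sketched in NumPy (28-by-28 images assumed, as is typical for digit datasets):

```python
import numpy as np

# Flatten each image column-wise (order='F' stacks the columns) into a
# 784-element vector, then collect the vectors as columns of a single
# matrix: one test example per column, as the stacked network expects.
images = [np.arange(784).reshape(28, 28) for _ in range(3)]  # dummy images
X = np.column_stack([img.flatten(order='F') for img in images])
```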

 RESULTS :-

These are the outputs produced by the MATLAB code we propose, visualized as a
confusion matrix :-

FIG NO-18 OVERVIEW OF CONFUSION MATRIX OUTPUT AND GRAPHICAL RESULT

CONCLUSION

The aim of our project is to build an interface that can recognize handwritten

characters entered by a user. We approached the problem using convolutional

neural networks in order to achieve higher accuracy. Using modern techniques

such as neural networks and deep learning to solve basic tasks that any human

performs in the blink of an eye, such as text recognition, only scratches the

surface of the potential behind machine learning; the possibilities and

applications of this technology are nearly limitless. Traditional optical

character recognizers worked much like biometric devices: photo-sensor

technology gathered the match points of physical attributes and then converted

them into a database of known types.

With the help of modern techniques such as convolutional neural networks,

however, we are able to scan and understand words with an accuracy never seen

before.

Handwritten pattern recognition using convolutional neural networks has

numerous applications, such as reading postal addresses, bank cheque amounts,

and forms. Convolutional neural networks have also accomplished astonishing

achievements across a variety of domains, including medical research, and

increasing interest has emerged in radiology.

INDIVIDUAL AND GROUP LEARNING:-

Through this project I learned, as an individual team member, that:

 I should actively participate in meetings and share knowledge, expertise,

ideas and information.

 I should be enthusiastic.

 I should work carefully.

 I should respect others' contributions.

 I should be committed to the team objectives.

 I should carry out assignments between meetings, such as collecting data,

observing processes, charting data and writing reports.

Through this project I also learned as a team:

 When a group of individuals works together, compared to one person working

alone, they produce work more efficiently and complete tasks faster, because

many minds are focused on the same goals and objectives. Working in a team

enables us to learn from one another's mistakes: one can avoid future errors,

gain insight from differing perspectives, and learn new concepts from more

experienced teammates.

REFERENCES :-

 Surya Nath R S M, Afseena, "Handwritten Recognition - A Review",

International Journal of Scientific and Research Publications.

 Anita Pal and Davashankar Singnh, "Handwritten English Character Recognition

Using Neural Network", International Journal of Computer Science and

Communication.

 Manoj Sonkusare and Narendra Sahu, "A Survey on Handwritten Character

Recognition (HCR) Techniques for English Alphabets", Advances in Vision

Computing: An International Journal (AVC).

 Kalchbrenner N, Grefenstette E, Blunsom P, "A Convolutional Neural Network

for Modelling Sentences", arXiv:1404.2188, Apr 8, 2014.

 Ciresan DC, Meier U, Gambardella LM, Schmidhuber J, "Convolutional Neural

Network Committees for Handwritten Character Classification", pp. 1135-1139,

Sep 18, 2011, IEEE.

 Nisha Sharma et al., "Recognition for Handwritten English Letters: A

Review", International Journal of Engineering and Innovative Technology

(IJEIT), Volume 2, Issue 7, January 2013.
