on
BACHELOR OF TECHNOLOGY
DEGREE
Session 2020-21
in
Information Technology
By:
AFFILIATED TO
DR. A.P.J. ABDUL KALAM TECHNICAL UNIVERSITY, U.P., LUCKNOW
(Formerly UPTU)
Acknowledgement
We would like to acknowledge, with respect and gratitude, the contributions and support of our project
guide, Mr. Manish Kumar Sharma, designation, IT Department, whose expertise, guidance, encouragement,
and enthusiasm have made this report possible. We are also thankful to Prof. (Dr.) Amit Sinha, H.O.D. of
the Information Technology Department, for his constant encouragement, valuable suggestions, moral
support, and blessings.
Although it is not possible to name them individually, we shall ever remain indebted to the
faculty members of ABES Engineering College, Ghaziabad, for their persistent support.
This acknowledgement would remain incomplete if we failed to express our deep sense of
obligation to our parents and to God for their consistent blessings and encouragement.
We hereby declare that the work presented in this report, entitled “Optical Character
System in Banking”, is an authentic record of our own work carried out under the
supervision of Mr. Manish Kumar Sharma, Designation, Information Technology.
The matter embodied in this report has not been submitted by us for the award of any other
degree.
Date:
This is to certify that the above statement made by the candidate(s) is correct to the best of
my knowledge.
(Name: ) (Name:............................. )
Acknowledgement
Student’s Declaration
Abstract
References
List of figures
Serial No. Figure name
In today’s world, there is a growing demand for software that can recognize characters when information is
scanned from paper documents, since a great many books and newspapers on many subjects exist only in print.
A simple way to bring the information in these paper documents into a computer system is to scan the
documents and store them as images. As a result, however, the computer cannot identify the characters in
what it has stored. This processing requires software known as a character recognition system.
Our aim is therefore to develop a character recognition system that performs document image analysis, so that
documents can be transformed from paper format to electronic format. The conversion of paper documents into
electronic format is an ongoing task in many organizations, especially in research and development (R&D),
in large businesses and enterprises, in government institutions, and so on.
Character recognition and conversion is the electronic or mechanical conversion of images of handwritten,
typewritten, or printed text into machine-encoded text. It is the method by which printed texts are digitized
so that they can be searched electronically and stored more compactly. Information is scanned from paper
documents, stored as images, and then processed into text format. This process is difficult for computers,
because any scanned document is a graphical file, that is, a pattern of pixels. Once the text has been
recognized, it becomes feasible to extract the useful information, and text in machine-readable form can then
be used for different purposes. The character recognition and conversion system is based on a grid
infrastructure and performs processes such as image analysis and the processing of electronic documents
converted from paper formats. The aim of this project is to detect, extract, and recognize text from images,
particularly bank forms, using an OCR algorithm. For example, the system extracts the cheque number, the
amount, and other data from a bank cheque.
OCR is used to find and search for text in electronic documents and papers, and to place that text on
websites. It is also used in connection with CAPTCHA systems and in optical music recognition.
Another use of OCR is in computer vision research and in the design of systems that can detect and analyse
computer-printed documents and handwritten text.
This work deals with the combined features of OCR and focuses mainly on applications such as an image
Sudoku solver, car license plate detection and recognition, and the recognition of handwritten and
computer-printed documents. A further task was to present a data structure of OCR information to an operator
through an interface built with HTML and JavaScript. The data structure holds the OCR result together with a
confidence level for every field and letter.
3.3 Segmentation
This is the third step. In this process, an image of a group of characters is split into meaningful
sub-images. The pre-processed input image is segmented into isolated characters by assigning a number to
every character through a labelling process.
Fig 3 : Identifying location of the word
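The labelling step described above can be sketched as connected-component labelling on a binarized image. This is a minimal illustration only, not the report's implementation: it assumes the image is a list of rows holding 0 (background) and 1 (ink), uses 4-connectivity, and the function name `label_components` is hypothetical.

```python
from collections import deque


def label_components(binary):
    """Number each 4-connected region of 1-pixels as 1, 2, 3, ...

    Returns the label grid and a dict mapping each label to its
    bounding box (top, left, bottom, right), all inclusive.
    """
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    boxes = {}
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or labels[r][c] != 0:
                continue  # background, or already labelled
            next_label += 1
            labels[r][c] = next_label
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and labels[ny][nx] == 0):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            boxes[next_label] = (top, left, bottom, right)
    return labels, boxes


# Toy 3x5 image with two separate strokes (made-up data).
image = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 0, 1],
]
labels, boxes = label_components(image)
```

Each bounding box can then be cropped out as one sub-image. Characters made of several disconnected strokes, such as "i", would come out as more than one region and would need a merging rule that this sketch does not attempt.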
3.5 Classification
Every character is assigned to a specific category. The characters can then be passed to a neural network
(NN), which is trained to recognize them. Classification is done using a technique called the k-nearest
neighbours (KNN) classifier.
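As a rough illustration of the KNN idea mentioned above, the sketch below labels a query feature vector by majority vote among its k closest training vectors. The two-dimensional features, the labels, the value k = 3, and the function name `knn_predict` are all assumptions made for the example; the report does not specify them.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Pair every training point with its Euclidean distance to the query.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    # Count the labels of the k closest points and pick the most frequent.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Made-up feature vectors for two character classes.
train_features = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
                  (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_labels = ["A", "A", "A", "B", "B", "B"]

near_a = knn_predict(train_features, train_labels, (0.15, 0.15))
near_b = knn_predict(train_features, train_labels, (0.95, 1.0))
```

KNN needs no training phase beyond storing the labelled examples, but every prediction scans the whole training set, so the cost grows with the number of stored characters.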
OCR technology provides fast, automated information capture, which can save considerable time and labor
costs in the banking system.
The system offers several benefits: it automates tedious tasks, reduces time complexity, reduces the size of
the database, and adapts better to untrained inputs while needing only a small number of attributes to compute.
The suggested system can be implemented using software such as MATLAB. The project is built on an ML-based
OCR algorithm. The scanned image is taken as input and a feed-forward architecture is applied.
The neural network consists of an input layer with fifty-four inputs, two hidden layers with one hundred
neurons each, and an output layer with twenty-six neurons. The network is trained by gradient descent
backpropagation with momentum and an adaptive learning rate, using a log-sigmoid transfer function. The
neural network was trained on a known dataset. The number of input nodes is chosen on the basis of the
number of attributes.
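The architecture described above can be sketched in plain Python as follows. Only the layer sizes (54, 100, 100, 26) and the log-sigmoid transfer function come from the report. The weight initialisation range, the momentum coefficient, the learning-rate factors, the mapping of the 26 outputs to the letters A to Z, and all function names are illustrative assumptions, and the gradient computation itself (backpropagation) is omitted.

```python
import math
import random

LAYER_SIZES = [54, 100, 100, 26]  # layer sizes stated in the report


def logsig(x):
    """Log-sigmoid transfer function, written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def init_network(sizes, seed=0):
    """Create (weights, biases) per layer with small random weights (assumed range)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, inputs):
    """Feed-forward pass: every layer applies logsig(W . a + b)."""
    activations = list(inputs)
    for weights, biases in layers:
        activations = [
            logsig(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


def momentum_step(weight, gradient, velocity, lr, momentum=0.9):
    """One gradient-descent update with momentum for a single weight."""
    velocity = momentum * velocity - lr * gradient
    return weight + velocity, velocity


def adapt_learning_rate(lr, new_error, old_error,
                        inc=1.05, dec=0.7, tolerance=1.04):
    """Grow lr while the error falls; shrink it if the error rises too much."""
    if new_error > old_error * tolerance:
        return lr * dec
    if new_error < old_error:
        return lr * inc
    return lr


network = init_network(LAYER_SIZES)
features = [0.5] * 54                  # placeholder 54-attribute input
scores = forward(network, features)    # 26 output activations
# Assumed mapping: output index 0..25 corresponds to letters A..Z.
predicted_letter = chr(ord("A") + scores.index(max(scores)))
```

With untrained random weights the predicted letter is meaningless. The sketch only shows how 54 attributes flow through two hidden layers to 26 outputs, and how a momentum update and an adaptive learning rate would be applied once gradients are available.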
V. CONCLUSION
We have chosen Optical Character Recognition as the fundamental method for recognizing characters.
Converting documents from paper format to electronic format is a tedious job that is also error-prone
when done manually. A bank handles many documents every day whose information must be kept in the bank's
database. We have therefore suggested a structure that applies the Optical Character Recognition
algorithm to automate this task and make it effortless.
This project will be useful in the banking field, especially in these times of pandemic and even later.
OCR should prove to be a robust tool for future data entry applications in the banking system.
The Optical Character Recognition software can be enhanced in the future in several ways: training and
recognition speeds can be improved, and the software can be made more user-friendly. This project will
serve as a stepping stone for such work in the near future.