
A

PROJECT REPORT

ON

HANDWRITTEN PAPER CHECKER USING ML

Submitted by

DHYEY D. BADHEKA (19IT450)


HARSH K. MAJITHIYA (19IT453)
KIRTAN TANK (19IT455)
NIKHIL D. VAYA (19IT456)
For Partial Fulfillment of the Requirements for Bachelor of Technology in Information
Technology

Guided by
PROF. TRUSHNA B. PATEL

December, 2022

Information Technology Department


Birla Vishvakarma Mahavidyalaya Engineering College
(An Autonomous Institution)
Vallabh Vidyanagar – 388120
Gujarat, INDIA
Birla Vishvakarma Mahavidyalaya Engineering College
(An Autonomous Institution)
Information Technology Department
AY: 2022-23, Semester I

CERTIFICATE

This is to certify that the project work entitled HANDWRITTEN PAPER CHECKER USING ML has
been successfully carried out by DHYEY D. BADHEKA (19IT450), HARSH K. MAJITHIYA (19IT453),
KIRTAN TANK (19IT455) and NIKHIL D. VAYA (19IT456) for the subject Project-I (4IT31)
during the academic year 2022-23, Semester-I for partial fulfilment of Bachelor of
Technology in Information Technology. The work carried out during the semester
is satisfactory.

Prof. Trushna B. Patel          Dr. Keyur Brahmbhatt
IT Department,                  IT Department Head,
BVM                             BVM
ACKNOWLEDGEMENT
First and foremost, we want to give thanks and praise to God, the Almighty, for His gifts that
helped us finish our project successfully.

Inspiration and motivation during presentations have always been crucial to the
accomplishment of any endeavor.

Prof. Dr. Indrajit N. Patel, Principal of Birla Vishvakarma Mahavidyalaya, has our deepest
gratitude. We have been greatly inspired by his enthusiasm, vision, genuineness, and
motivation.

We also express our sincere gratitude to Dr. Keyur Nayankumar Brahmbhatt, Head of the
Information Technology Department at Birla Vishvakarma Mahavidyalaya, and Dr. Zankhana
Shah, for giving us the support we needed and the chance to work on the project. They have
taught us how to conduct research and present the results in the most understandable way.
Working on and completing our project under their direction was a great honor and privilege.

Additionally, we want to express our gratitude to our mentor, Prof. Trushna B. Patel, whose
valuable advice and gracious supervision during the course helped shape the current work as
it is presented.

We owe a great deal to our friends for their inspiring encouragement, supportive advice, and
gracious oversight of the execution of our project.

 
 
  
 
ABSTRACT

Handwriting recognition is a new technology that will help mankind in the 21st century. One
application of handwriting recognition is the processing of large volumes of paper documents
such as answer sheets. Handwriting recognition is a type of optical character recognition
(OCR). OCR is the recognition of printed or handwritten text. With OCR, a document is
captured by a camera as an image and can be converted to the desired format, such as PDF.
The file is then sent to a paper-checking algorithm. This greatly reduces human involvement
in the paper verification process.
Table of Contents
Chapter 1: Introduction
1.1 Brief overview of the work
1.2 Objective
1.3 Scope
1.4 Project Modules
1.5 Project Hardware/Software Requirements

Chapter 2: Literature Review

Chapter 3: System Analysis & Design
3.1 Comparison of Existing Applications with your Project with merits and demerits
3.2 Project Feasibility Study
3.3 Project Timeline chart
3.4 Detailed Modules Description
3.5 Project SRS
3.5.1 Use Case Diagrams
3.5.2 Class diagram
3.5.3 Entity Relationship Diagrams
3.5.4 Sequence Diagrams
3.5.5 State Diagram
3.6 Data Dictionary

Chapter 4: Implementation and Testing
4.1 User Interface and Snapshot
4.2 Testing using Use Cases

Chapter 5: Conclusion & Future work

Chapter 6: References

INDEX FOR FIGURES

Fig 1: Economic Feasibility
Fig 2: Project Timeline chart
Fig 3: Use Case Diagram
Fig 4: Class Diagram
Fig 5: Entity Relationship Diagram
Fig 6: Sequence Diagram
Fig 7: State Diagram
Fig 8: Home Page
Fig 9: About Us Page
Fig 10: Login Page
Fig 11: Registration Page
Fig 12: Contact Page
Fig 13: Main Page
Fig 14: Google Vision testing
Fig 15: Excel Sheet of the result generated by Model

INDEX FOR TABLES


Table 1: Literature Review

Table 2: Comparison of the existing model with our model

Table 3: Data Dictionary for Admin

Table 4: Data Dictionary for faculty

Table 5: Data Dictionary for Student


Chapter 1: Introduction
1.1 Brief overview of the work
In the field of Machine Learning, object recognition is among the most in-demand tasks.
Examples of object recognition include face recognition, handwriting recognition, and
disease detection. All of these rely on large image data sets that contain both positive and
negative examples for the domain, which helps the algorithm classify unknown data more
accurately. Handwriting recognition is a technology that will be useful and handy in the
21st century. It can serve as a basic building block for new requirements as they emerge.
One application of handwritten character recognition is processing large sets of paper
documents such as answer papers. With the help of handwriting recognition and AI, answer
sheets can be evaluated without human intervention. For this scenario, handwriting
recognition is the base problem to be solved. Handwriting recognition is a type of Optical
Character Recognition (OCR). OCR is the identification of text, which may be printed or
handwritten. With OCR, the document is captured as an image by a camera and can be
converted into a desired format such as PDF. The file is then fed to the character
recognition algorithm. This can drastically reduce human involvement in the paper
verification process.

1.2 Objective
Develop a system using Machine Learning and Natural Language Processing to help teachers
in the paper-checking process and save time and cost.

1.3 Scope
To realize the objective of this project, a few areas of scope have been identified. Natural
language processing in machine learning is a modern technology that can help teachers ease
the repetitive work of checking papers. This improves on the conventional methods of paper
checking, which have numerous flaws and loopholes. Conventional paper checking is very
tedious and time-consuming for faculty. It becomes hectic, and checking a large number of
papers can affect a faculty member's attitude toward students and result in negative bias.
Also, with automated checking there is no possibility for students to bribe the faculty or
for the faculty to favour any student. Thus, Automatic Handwritten Paper Checking aims to
overcome the limitations of conventional methods and gives a better, stronger, and improved
approach to the proper checking of answer sheets.

1.4 Project Modules

• Login & Registration for Admin, Faculty, and Students
• Model Answer Sheet of Faculty and Students Answer sheets uploaded for processing
• Conversion of PDF to Image and text
• Autocorrect words using NLP
• Keywords Matching and Answer Summarisation Matching
• Length-wise matching of answers
• Paper Evaluation according to their assigned weights
• Generate Results by Faculty

1.5 Project Hardware/Software Requirements

Front End Technologies
• HTML
• CSS
• TypeScript
• Flask
• Angular JS
• Bootstrap

Back End Technologies
• Python
• Python Libraries
• Google Cloud Vision
• Firebase Database
• JavaScript
Chapter 2: Literature Review

Sr. No.: 1
Paper: Subjective Answer Evaluation Using Machine Learning
Year: 2018
Technology used: Natural language processing
Accuracy: Up to 90% accuracy
Conclusions:
• The project works with the same factors that a human evaluator considers, such as the
length of the answer, the presence of keywords, and the context of the keywords.
• Natural Language Processing coupled with robust classification techniques checks not only
for keywords but also for question-specific things.
• Students have a certain degree of freedom while writing the answer, as the system checks
for the presence of keywords, synonyms, right word context, and coverage of all concepts.
• It is concluded that using ML techniques will give satisfactory results due to holistic
evaluation.
• The accuracy of the evaluation can be increased by feeding it a huge and accurate
training dataset.
• As the technicality of the subject matter changes, different classifiers can be employed.
• Taking feedback from all the stakeholders, such as students and teachers, can improve the
system further.

Sr. No.: 2
Paper: Handwritten Text Recognition using Deep Learning
Year: 2020
Technology used: Convolutional Neural Network (CNN)
Accuracy: Up to 90.3% accurate
Conclusions:
• This algorithm provides both efficient and effective results for recognition.
• The project gives the best accuracy for text that has less noise.
• The accuracy depends completely on the dataset; increasing the data gives more accuracy.
• Avoiding cursive writing also gives better results.

Sr. No.: 3
Paper: Subjective Answers Evaluation
Year: 2021
Technology used: Machine Learning and Natural language processing
Accuracy: Up to 88% accurate
Conclusions:
• The experimentation results show that, on average, the word2vec approach performs better
than traditional word embedding techniques as it keeps the semantics intact.
• Furthermore, Word Mover's Distance performs better than Cosine Similarity in most cases
and helps train the machine learning model faster.
• With enough training, the model can stand on its own and predict scores without the need
for any semantics checking.

Sr. No.: 4
Paper: Semantic analysis of the long answer
Year: 2021
Technology used: Deep Neural Network Algorithm
Accuracy: Up to 78% accuracy
Conclusions:
• As of now, the NLP techniques available for understanding textual data are still
primitive and computing-resource intensive.
• Simple operations like comparing two sentences, representing a sentence in vector form,
summarizing paragraphs, and evaluating long written text require considerable computing
power.
• This research aimed to apply already existing techniques while being resource efficient.
• Normal systems are not as accurate as the DAN model and are usually keyword based, unlike
this research, which looks for comparable meaning rather than words.
• To reduce the workload for the neural network (DAN), the sentences are broken down into
the simplest and smallest meaningful form.
• These simple sentences are then encoded to high-dimension vectors and compared with the
key points.

Table 1: Literature Review
Chapter 3: System Analysis & Design
3.1 Comparison of Existing Applications with your Project with merits and demerits

Existing projects                                  Handwritten paper checker
They are used to check MCQs.                       It is not used to check MCQs.
They cannot check handwritten papers.              This model can check handwritten papers.
They can check papers that are typed.              This model can check both typed and
                                                   handwritten papers.
They do not give marks for individual questions.   This model provides marks for each
                                                   individual question and the total marks.

Table 2: Comparison of the existing model with our model

3.2 Project Feasibility Study


The feasibility study is a crucial part of the project. It covers the factors and reasons
that would lead the system to be accepted by the public and by other firms and corporations.

The following are the main feasibility concerns of this project:

Technical Feasibility:
The automatic handwritten paper checker system uses `Google Cloud Vision`, an
industry-leading technology that provides highly accurate text extraction using strong
machine learning algorithms. It also applies text correction with the Natural Language
Processing library `TextBlob` to improve the results. NLP is further used to summarize
answers, which is useful for the marking scheme. These technologies make the project
possible, reliable, and technically feasible.

Market Feasibility:
Handwritten examinations are a very important part of the education system and are
conducted on a large scale, so there is a huge demand for systems like this one, which save
time and cost and minimize human effort. These factors make the project feasible in the
market.

Time-Based Feasibility:
This system can provide results around 10x faster than a human paper checker, and this can
be improved further with optimization. It also minimizes human interaction by automating
the major processes of detection and marking, which makes the system feasible in terms of
time.

Economic Feasibility:
The project uses cloud vision technology which charges only for usage with no upfront
commitments. The estimated cost for 10,000 units of text detection is shown below:

Fig 1: Economic Feasibility
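
The pay-per-use arithmetic behind such an estimate can be sketched as follows. This is only an illustration: the free allowance and the rate per 1,000 units are assumed placeholder values, not figures taken from this report, and should be replaced with the current Cloud Vision price list before drawing any conclusion.

```python
def estimate_ocr_cost(units, free_units=1000, price_per_1000=1.50):
    """Estimate the cost (in USD) of `units` text-detection requests.

    free_units and price_per_1000 are ASSUMED placeholder values,
    not prices quoted in this report.
    """
    billable = max(0, units - free_units)  # only units above the free tier are billed
    return billable / 1000 * price_per_1000


print(estimate_ocr_cost(10_000))
```

Keeping the rates as parameters makes it easy to redo the estimate when the price list or the expected volume changes.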


3.3 Project Timeline chart

Fig 2: Project Timeline chart

3.4 Detailed Modules Description


Login & Registration for Admin, Faculty, and Students
All users (faculty and students) must register on the website and log in to their
accounts.

Model Answer Sheet of Faculty and Students Answer sheets uploaded for processing
Faculty will upload scanned model answer sheets on the website.

Students will upload their scanned answer sheets to the website for the evaluation process.

Conversion of PDF to Image and text

Using Python libraries such as PyMuPDF, pdf2image, and Google Vision, we extract images and
text from the scanned papers uploaded by students and store the text in a text file.
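
A minimal sketch of this conversion step is given below, assuming the pdf2image and google-cloud-vision packages are installed and Cloud Vision credentials are configured. The function names (render_pdf_pages, ocr_with_cloud_vision, pdf_to_text_file) are hypothetical and are not taken from the project code. The renderer and the OCR step are passed in as parameters so that either can be replaced or tested separately.

```python
import io
from typing import Callable, List


def render_pdf_pages(pdf_path: str, dpi: int = 300) -> List[bytes]:
    """Render each PDF page to PNG bytes with pdf2image (requires poppler)."""
    from pdf2image import convert_from_path  # imported lazily: optional dependency

    pages = []
    for page in convert_from_path(pdf_path, dpi=dpi):
        buffer = io.BytesIO()
        page.save(buffer, format="PNG")
        pages.append(buffer.getvalue())
    return pages


def ocr_with_cloud_vision(image_bytes: bytes) -> str:
    """Send one page image to Google Cloud Vision and return the detected text."""
    from google.cloud import vision  # needs google-cloud-vision and credentials

    client = vision.ImageAnnotatorClient()
    response = client.document_text_detection(image=vision.Image(content=image_bytes))
    if response.error.message:
        raise RuntimeError(response.error.message)
    return response.full_text_annotation.text


def pdf_to_text_file(
    pdf_path: str,
    out_path: str,
    render: Callable[[str], List[bytes]] = render_pdf_pages,
    ocr: Callable[[bytes], str] = ocr_with_cloud_vision,
) -> str:
    """Run OCR on every page and store the combined text in a text file."""
    text = "\n".join(ocr(page) for page in render(pdf_path))
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return text
```

A call such as pdf_to_text_file("answer_sheet.pdf", "answer_sheet.txt") would then produce the text file that the later modules read. Both file names are placeholders.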

Autocorrect words using NLP

Python libraries such as TextBlob and FuzzyWuzzy read the extracted text file, autocorrect
all the misspelled words, and overwrite the file.
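
The idea of autocorrection can be illustrated with a small sketch. To keep it runnable without extra packages, it swaps in the standard library's difflib and a tiny hand-written vocabulary, both of which are assumptions made for illustration. With TextBlob installed, the comparable call would be TextBlob(text).correct().

```python
import difflib
import re


def autocorrect(text, vocabulary, cutoff=0.8):
    """Replace each word with its closest vocabulary entry, if one is close enough."""
    vocab = [word.lower() for word in vocabulary]

    def fix(match):
        word = match.group(0)
        candidates = difflib.get_close_matches(word.lower(), vocab, n=1, cutoff=cutoff)
        # Keep the original word when nothing in the vocabulary is similar enough.
        return candidates[0] if candidates else word

    return re.sub(r"[A-Za-z]+", fix, text)


# Hypothetical subject vocabulary, for illustration only.
vocabulary = ["photosynthesis", "chlorophyll", "sunlight", "plants"]
print(autocorrect("plants use sunlight for photosynthsis", vocabulary))
```

Note that a corrected word is returned in lower case. A fuller implementation would preserve the original capitalisation.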
Keywords Matching and Answer Summarisation Matching
In this module, the keywords are extracted from the text file by removing stopwords (such as
"to" and "the") and are matched between the students' answer sheets and the faculty's model
answer sheet. A particular weight is assigned to this factor, and marks are evaluated
accordingly.
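
Keyword matching of this kind can be sketched as below. The short stopword list and the scoring rule (the fraction of model-answer keywords found in the student's answer) are assumptions chosen for illustration. The report does not specify the exact rule.

```python
import re

# Tiny illustrative stopword list (an assumption); a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "are", "in", "is", "it", "of", "the", "to"}


def extract_keywords(text):
    """Lower-case the text, split it into words, and drop the stopwords."""
    return {word for word in re.findall(r"[a-z]+", text.lower()) if word not in STOPWORDS}


def keyword_score(student_answer, model_answer):
    """Fraction of the model answer's keywords that the student answer contains."""
    model_keywords = extract_keywords(model_answer)
    if not model_keywords:
        return 0.0
    matched = model_keywords & extract_keywords(student_answer)
    return len(matched) / len(model_keywords)


model = "Photosynthesis is the process of making food in plants"
student = "Plants make food by photosynthesis"
print(keyword_score(student, model))
```

Because the match is exact, "make" does not count as a match for "making". Stemming or synonym handling would be a natural refinement.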

Also, a summary of the faculty's answer to a particular question and a summary of each
student's answer are generated and compared using the fuzz and process modules of the
FuzzyWuzzy library. This factor is also considered in paper evaluation, and marks for the
summary are counted as per the assigned weights.
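
The comparison of two summaries can be sketched as a string-similarity score between 0 and 100. The sketch below uses difflib.SequenceMatcher from the standard library as a stand-in so that it runs without extra packages. With FuzzyWuzzy installed, fuzz.ratio(a, b) plays the same role. The summarisation step itself is not shown, and the function name is hypothetical.

```python
from difflib import SequenceMatcher


def summary_similarity(student_summary, model_summary):
    """Return a similarity score between 0 and 100 for two summaries."""
    # Normalise case and whitespace so that formatting differences are ignored.
    a = " ".join(student_summary.lower().split())
    b = " ".join(model_summary.lower().split())
    return round(100 * SequenceMatcher(None, a, b).ratio())


print(summary_similarity(
    "Plants make food using sunlight",
    "Plants make their food using sunlight",
))
```

A character-level ratio rewards similar wording rather than similar meaning, so it works best when both summaries are produced by the same summarisation method.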

Paper Evaluation according to their assigned weights

The question-wise marks generated from the different parameters and their weights, as
discussed in the above modules, are aggregated into total marks.
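
One way to read this aggregation is as a weighted average of the per-factor scores, scaled to the maximum marks of each question. The sketch below follows that reading. The weights and the sample scores are assumed values for illustration and are not taken from the project.

```python
def question_marks(scores, weights, max_marks):
    """Combine per-factor scores (each in [0, 1]) into marks for one question."""
    total_weight = sum(weights.values())
    weighted = sum(weights[name] * scores[name] for name in weights)
    return round(max_marks * weighted / total_weight, 2)


# Assumed example weights for the three factors named in the module list.
WEIGHTS = {"keywords": 0.5, "summary": 0.3, "length": 0.2}

# Each entry: (factor scores for one question, maximum marks for that question).
paper = [
    ({"keywords": 0.6, "summary": 0.8, "length": 1.0}, 5),
    ({"keywords": 1.0, "summary": 0.5, "length": 0.5}, 10),
]

per_question = [question_marks(scores, WEIGHTS, max_marks) for scores, max_marks in paper]
total = sum(per_question)
print(per_question, total)
```

Dividing by the sum of the weights keeps the result within the maximum marks even if the chosen weights do not add up to exactly 1.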

Generate Results by Faculty

The faculty will get an Excel sheet containing the question-wise marks as well as the
overall marks of all students and can release it at their convenience.
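
The layout of such a result sheet can be sketched with the standard library's csv module, whose output opens directly in Excel. Using CSV instead of a native Excel file is a simplification made here, and the student ids and marks below are made-up sample values.

```python
import csv
import io


def write_results(rows, num_questions, handle):
    """Write one row per student: id, marks per question, and the total."""
    writer = csv.writer(handle)
    header = ["stu_id"] + [f"Q{i}" for i in range(1, num_questions + 1)] + ["total"]
    writer.writerow(header)
    for stu_id, marks in rows:
        writer.writerow([stu_id, *marks, sum(marks)])


# Made-up sample data: (student id, marks for each question).
results = [("S001", [4, 7]), ("S002", [3, 9])]

buffer = io.StringIO()
write_results(results, num_questions=2, handle=buffer)
print(buffer.getvalue())
```

To write a file instead of an in-memory buffer, pass a handle opened with open("results.csv", "w", newline="", encoding="utf-8").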

3.5 Project SRS


3.5.1 Use Case Diagrams

Fig 3: Use Case Diagram


3.5.2 Class diagram

Fig 4: Class Diagram

3.5.3 Entity Relationship Diagrams

Fig 5: Entity Relationship Diagram


3.5.4 Sequence Diagrams

Fig 6: Sequence Diagram

3.5.5 State Diagram

Fig 7: State Diagram


3.6 Data Dictionary
Table name: Admin
Primary Key: admin_id

Sr no.  Name          Data type    Constraints  Description
1       admin_id      Varchar(10)  Primary Key  To store the id of the admin
2       admin_name    Char(100)    Not null     To store the name of the admin
3       admin_email   Varchar(50)  Not null     To store the email of the admin account
4       admin_pass    Varchar(10)  Not null     To store the password of the admin account
5       admin_mobile  Number(10)   Not null     To store the mobile number of the admin

Table 3: Data Dictionary for Admin

Table name: faculty
Primary Key: faculty_id

Sr no.  Name               Data type    Constraints  Description
1       faculty_id         Varchar(10)  Primary Key  To store the id of the faculty
2       faculty_name       Char(100)    Not null     To store the name of the faculty
3       faculty_email      Varchar(50)  Not null     To store the email of the faculty account
4       faculty_pass       Varchar(10)  Not null     To store the password of the faculty account
5       Subject_name       Varchar(10)  Not null     To store the name of the subject related to the faculty
6       Student_ans_sheet  Clob         Not null     To store the uploaded answer sheet of the student
7       Model_ans_sheet    Clob         Not null     To store the uploaded model answer sheet

Table 4: Data Dictionary for faculty

Table name: Student
Primary Key: stu_id

Sr no.  Name       Data type    Constraints  Description
1       stu_id     Varchar(10)  Primary Key  To store the id of the student
2       stu_name   Char(100)    Not null     To store the name of the student
3       stu_email  Varchar(50)  Not null     To store the email of the student account
4       stu_pass   Varchar(10)  Not null     To store the password of the student account

Table 5: Data Dictionary for Student


Chapter 4: Implementation and Testing
4.1 User Interface and Snapshot
Home Page

Fig 8: Home Page

About Us Page

Fig 9: About Us Page


Login Page

Fig 10: Login Page

Registration Page

Fig 11: Registration Page


Contact Page

Fig 12: Contact Page

Main Page

Fig 13: Main Page


4.2 Testing using Use Cases
Text extraction (on different handwritings):
Fig 14: Google Vision testing
The paper-checking model was tested on different test cases against this model answer sheet.

The result (question-wise marksheet) generated is:

Fig 15: Excel Sheet of the result generated by Model


Chapter 5: Conclusion & Future work
Conclusion:
Natural language processing in machine learning is a new technology that can help teachers
ease the monotonous work of reviewing and checking papers. This automatic handwritten
paper-checking project addresses the shortcomings of the traditional methods and provides a
better approach to checking answer sheets correctly.

Future Work:
• To evaluate papers containing images and tables.
• To add student profiles to the front end so that students can submit their answer sheets
on our website and view their results themselves.
Chapter 6: References
[1] S. Singh, Y. Shah, Y. Vajani, and S. Dholay, “Automated Paper Evaluation System for
Subjective Handwritten Answers,” in 2021 12th International Conference on Computing
Communication and Networking Technologies (ICCCNT), Kharagpur, India, Jul. 2021, pp.
1–6. doi: 10.1109/ICCCNT51525.2021.9579912.

[2] N. S. Yerramilli, N. J. Johnson, O. S. R. Y, P. Monika, and P. S, “Automatic Exam Answer
Checker using Optical Character Recognition and Sentence Embedding,” in 2021
International Conference on Disruptive Technologies for Multi-Disciplinary Research and
Applications (CENTCON), Nov. 2021, vol. 1, pp. 253–256. doi:
10.1109/CENTCON52345.2021.9688008.

[3] “Automatic Exam Answer Checker using Optical Character Recognition and Sentence
Embedding | IEEE Conference Publication | IEEE Xplore.”
https://ieeexplore.ieee.org/document/9688008 (accessed Nov. 22, 2022).

[4] M. Supic, K. Brkić, T. Hrkać, Z. Mihajlovic, and Z. Kalafatic, “Automatic recognition of
handwritten corrections for multiple-choice exam answer sheets,” May 2014, pp. 1136–1141.
doi: 10.1109/MIPRO.2014.6859739.

[5] S. Manchala, J. Kinthali, K. Kotha, and J. Kumar, “Handwritten Text Recognition using
Deep Learning with TensorFlow,” International Journal of Engineering Research and, vol.
V9, May 2020, doi: 10.17577/IJERTV9IS050534.

[6] V. Tanwar, “Machine Learning based Automatic Answer Checker Imitating Human Way of
Answer Checking,” International Journal of Engineering Research & Technology, vol. 10,
no. 12, Dec. 2021, doi: 10.17577/IJERTV10IS120063.

[7] “Machine Learning based Automatic Answer Checker Imitating Human Way of Answer
Checking – IJERT.”
https://www.ijert.org/machine-learning-based-automatic-answer-checker-imitating-human-way-of-answer-checking
(accessed Nov. 22, 2022).

[8] “(PDF) Automatic recognition of handwritten corrections for multiple-choice exam answer
sheets.”
https://www.researchgate.net/publication/269291332_Automatic_recognition_of_handwritten_corrections_for_multiple-choice_exam_answer_sheets
(accessed Nov. 22, 2022).

[9] P. Patil, S. Patil, V. Miniyar, and A. Bandal, “Subjective Answer Evaluation Using Machine
Learning,” p. 13.
