Image to Text App: A Mini Project

IMAGE TO TEXT APP
A Mini Project
Submitted in partial fulfillment of the
Requirements for the award of the Degree of
BACHELOR OF TECHNOLOGY
In
COMPUTER SCIENCE AND ENGINEERING

By
E.B. MEGHANA G. MAHALAKSHMI

(1610116) (1610121)
Under the esteemed guidance of
Mrs. D.GOUSIYA BEGUM, M.Tech(CSE)

Lecturer,dept.of CSE
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

SRI KRISHNADEVARAYA UNIVERSITY
COLLEGE OF ENGINEERING AND TECHNOLOGY
ANANTAPUR – 515003
ANDHRA PRADESH
(2016-2020)
1|Page
SRI KRISHNADEVARAYA UNIVERSITY
COLLEGE OF ENGINEERING AND TECHNOLOGY
ANANTAPUR – 515003
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
CERTIFICATE
Certified that this is a bonafide record of the dissertation work entitled, “Image to Text App”,
done by E.B.MEGHANA bearing Admn. No: 1610116 and G.MAHALAKSHMI bearing
Admn. No: 1610121 submitted to the faculty of Computer Science and Engineering, in partial
fulfillment of the requirements for the Degree of BACHELOR OF TECHNOLOGY with
specialization in COMPUTER SCIENCE AND ENGINEERING from Sri Krishnadevaraya
University College of Engineering and Technology, Anantapur.
Signature of the Guide Signature of the Head of the Department

Mrs. D.GOUSIYA BEGUM, M. Tech(CSE) Mr. P.R.RAJESH KUMAR,M.Tech(CSE)
Lecturer Hod & Lecturer
Signature of the Coordinator

Mrs. A.Renuka Devi, M.Tech(CSE).,(Ph.D)
Lecturer
2|Page
DECLARATION
We hereby declare that the project report entitled “IMAGE TO TEXT APP” submitted to the
Department of Computer Science and Engineering, Sri Krishnadevaraya University, Anantapuramu for
the partial fulfilment of the academic requirement for the degree for Bachelor of Technology in Computer
Science and Engineering is an authentic record of our work carried out during the final year under the
esteemed guidance of Mrs. D.GOUSIYA BEGUM, M.Tech(CSE), Lecturer of Computer
Science and Engineering Department, College of Engineering and Technology, Sri Krishnadevaraya
University, Ananthapuramu.
Signature of the Students
1.
2.
3|Page
ACKNOWLEDGEMENT
The satisfaction and euphoria that accompany the successful completion of any
task would be incomplete without the mention of people who made it possible whose constant
guidance and encouragement crowned our efforts with success. It is a pleasant aspect that I have
now the opportunity to express my gratitude for all of them.
It is with immense pleasure that I would like to express my indebted gratitude to my

Mrs. D.GOUSIYA BEGUM, M.Tech (CSE) in CSE Department, who has guided me a lot and
encouraged me in every step of the project work. His/her valuable moral support and guidance
throughout the project helped me to a greater extent. I thank his/her for his/her stimulating
guidance, constant encouragement and constructive criticism which have made possible to
bring out this project work.
Our heartfelt thanks to Smt. A.RENUKA DEVI, M.Tech., (Ph.D.) project

coordinator, Lecturer of the CSE Department, for her assistance and guidance which help me in
completing this project successfully.
I wish to express my deep sense of gratitude to P.R.RAJESH KUMAR, M.Tech, Lecturer

and Head of the Department (I/C) of Computer Science and Engineering, for giving me the
opportunity of doing the project and for providing a great support in completing my project
work. I feel elated to thank him for inspiring me all the way by providing good lab facilities and
helping me in providing such good environment.
I wish to convey my acknowledgment to Dr. R.RAMACHANDRA, M.Tech., Ph.D.,

Principal, SKU College of Engineering and Technology, Anantapuramu, for providing such a
good environment and facilities.
My special thanks to the faculty of CSE Department for giving the required information in
doing my project work. Not to forget, I thank all the non-teaching staff, my friends and class
mates who had directly or indirectly helped and supported me in completing my project in time.
4|Page
Finally I wish to convey my gratitude to my parents who fostered all the requirements and
facilities that I need.
E.B.MEGHANA (1610116)
G.MAHALAKSHMI (1610121)
5|Page
ABSTRACT
Most of the information available today is either on paper or in the form of still
photographs and videos. To build digital libraries, this large volume of information needs to be
digitized into images and the text converted to ASCII for storage, retrieval, and easy
manipulation.
Text recognition in images is an active research area which attempts to develop a

computer application with the ability to automatically read the text from images. Now a days
there is a huge demand of storing the information available on paper documents in to a computer
readable form for later use. One simple way to store information from these paper documents in
to computer system is to first scan the documents and then store them as images. However to
reuse this information it is very difficult to read the individual contents and searching the
contents form these documents line-by-line and word-by-word.
Text recognition is a technique that recognizes text from the paper document in the
desired format (such as .doc or .txt). “Image to Text App” is the mobile application that
describes techniques for converting the textual content of a paper document into a machine-
readable format so that user can easily copy, edit the output text, and can paste and reuse
wherever he wants.
6|Page
Contents
Chapter-1: PROJECT OVERVIEW
1.1 Introduction
1.2 Objective
1.3 Literature Survey
1.4 Motivation
1.5 Applications
1.6 Advantages
Chapter-2: REVIEW OF LITERATURE

2.1 Literature Reviews
2.2 Relation to Past Work
2.3 In English Scripter Character Recognition
2.4. Arabic Scripter Character Recognition
2.5 Devanagari Scripter Character Recognition
Chapter-3: TEXT RECOGNITION SYSTEM

3.1 Modules
3.2 Task Involved in OCR
3.3 Basic System Architecture
3.4 Optical Character Recognition
3.5 Preprocessing
Chapter-4: REQUIREMENT AND ANALYSYS

4.1 Platform Requirement
4.1.1 Supportive Operating System
4.2 Software Requirement
4.3 Hardware Requirement
Chapter-5: IMPLEMENTATION
5.1 Programming Languages
5.1.1 Java
5.2 Source Code
5.2.1 MainActivity.java
5.2.2 activity_main.xml
5.3 Steps for implementing
7|Page
5.3.1 Enable USB debugging on your android phone
5.3.2 Get started with remote debugging android devices
Chapter-6: RESULTS AND ANALYSYS

6.1 Initial View
6.2 Selecting Image
6.3 Cropping Image
6.4 Output
6.5 Further using of data
6.6 Sample Outputs
Chapter-7: CONCLUSION AND FUTURE SCOPE

7.1 Conclusion
7.2 Scope for future work
LIST OF FIGURES
ABBREVATIONS
REFERENCES
APPENDICES
STUDENT BIODATA
8|Page
Chapter 1
PROJECT OVERVIEW
1.1 Introduction
Now-a-days, there is a growing demand for the software systems to recognize

characters in computer when information is scanned through paper documents as we
know that there are number of historical, mythological books and newspaperswhich
are in printed format. Day by day due to atmospheric changes or due to improper
handling they get damaged. Therefore, nowadays there is a huge demand in
“storing the information available in these paper documents in to a computer
storage disk and then later reusing this information by searching process”. One simple
way to store information in these paper documents in to computer system is to first
scan the documents.
Whenever we scan the documents through the scanner, the documents are
stored as images in the computersystem. These images contain text that cannot be
edited by the user. But to reuse this information it is very difficult for the computer
system to read the individual contents and search the contents form these documents
line-by-line and word-by word. The reason for this difficulty is the font
characteristics of the characters in paper documents are different to font of the
characters in computer system.
As a result, computer is unable to recognize the characters while reading

them. This concept of storing the contents of paper documents in computer storage
place and then reading and searching the content is called document processing.
Sometimes in this document processing we need to process the information that is
related to languages other than the English in the world. This process is also
called Document Image Analysis (DIA).To handle DIA in recent years many
approaches have been proposed by researchers, each approach has its own
advantages and limitation which is discussed in detail.
As we read the words, our eyes and brain continuously carry out optical character
recognition in such a way that we are not able to recognize it. Our eyes are recognizing
the luminous pattern of printed character and our brain is using this to figure out what we
are trying to say. Apart from humans , nowadays even the computer are capable of
performing this task using the technique called OCR.
9|Page
OCR helps in bringing the text available in analog format into a digital
form.Nowadays many organizations are depending on OCR systems to eliminate the
human interactions for better performance and efficiency. The objective of the paper is to
utilize this feature of the computer through an android app. This visual capability is
brought out using a android mobile phone working on Tesseract OCR engine. The
android app provides the user to recognize the text from either an image stored in the
gallery , image taken with a camera, from a stored document in mobile or allows to store
a name of the locations from the map application available in mobiles. This app can be
used for automatic number plate recognition, extracting business card information into
the contact list, Automatic insurance documents key information extraction, the
converted text can then be fed to the text to speech application and can be used as a
assertive technology for visually impaired users.
Character recognition is one of the most interesting areas of pattern recognition

and artificial intelligence. The application extracts the relevant information and
automatically enters it into electronic database instead of the conventional way of
manually retyping the text. Optical Character Recognition is a vast field with a number of
varied applications such as invoice imaging, legal industry, banking, health care industry
etc.
This is also widely used in many other fields like Captcha, Institutional
repositories and digital libraries, Optical Music Recognition without any human
correction or human effort, Automatic number plate recognition and Handwritten
Recognition. It contributes immensely to the advancement of an automation process and
can improve the interface between man and machine in numerous applications. Several
research works have been focusing on new techniques and methods that would reduce the
processing time while providing higher recognition accuracy. Now it is possible to scan
documents as an image and to make it editable and searchable for further information
processing.
1.2 Objective
Our objective is to utilize the visual capabilities of the Android mobile phone to
extract information from a business card. We use the camera features of the Android to
capture data. Extracting information from the business card requires accurate recognition
of the text of the business card. Any camera image of the business card would be subject
to several environmental conditions, such as variable lighting, reflection, rotation, and
scaling (we would desire the same data to be extracted from the business card regardless
of the distance from the camera), among others.
10 | P a g e
To achieve high speed in data processing it is necessary to convert the analog data
into digital data. Storage of hard copy of any document occupies large space and
retrieving of information from that document is time consuming. Optical character
recognition system is an effective way in recognition of printed character. It provides an
easy way to recognize and convert the printed text on image into the editable text. It also
increases the speed of data retrieval from the image. The image which contains characters
can be scanned through scanner and then recognition engine of the OCR system interpret
the images and convert images of printed characters into machine-readable characters .It
improving the interface between man and machine in many applications.
The objective of OCR software is to recognize the text and then convert it to editable
form. Thus, developing computer algorithms to identify the character in the text is the principal
task of OCR. A document is first scanned by an optical scanner, which produces an image form
of it that is not editable. Optical character recognition involves. Translation of this text image
into editable character codes such as ASCII . The below diagram shows the processing
mechanism of OCR system as shown in figure 1.1.
Fig 1.1: Image recognition process
11 | P a g e
1.3 Literature Survey
Benjamin Z. Yao, Xiong Yang, Liang Lin, Mun Wai Lee and Song-Chun Zhu [1]
proposed an image parsing to text description that generates text for images and video
content. Image parsing and text description are the two major tasks of his framework. It
computes a graph of most probable interpretations of an input image. This parse graph
includes a tree structured decomposition contents of scene, pictures or parts that cover all
pixels of image.
Over past decade many researchers form computer vision and Content Based
Image Retrieval (CBIR) domain have been actively investigating possible ways of
retrieving images and videos based on features such as color, shape and
objects[2][3][4][5][6].
Paper [7] introduced by Yi-Ren Yeh, Chun-Hao Huang, and Yu-Chiang Frank
Wang presents a novel domain adaptation approach for solving cross domain pattern
recognition problem where data and features to be processed and recognized are collected
for different domains,
S. Shahnawaz Ahmed, Shah Muhammed Abid Hussain and Md. Sayeed Salam
[8] introduced a model of image to text conversion for electricity meter reading of units
in kilo-watts by capturing its image and sending that image in the form of Multimedia
Message Service (MMS) to the server. The server will process the received image using
sequential steps: 1) read the image and convert it into a three dimensional array of pixels,
2) convert the image from color to black and white, 3) removal of shades caused due to
nonuniform light, 4) turning black pixels into white ones and vice versa, 5) threshold the
image to eliminate pixels which are neither black nor white, 6) removal of small
components, 7) conversion to text.
In [10] Fan-Chieh Cheng, Shih-Chia Huang, and Shanq-Jang Ruan gave the
technique of eliminating background model form video sequence to detect foreground
and objects from any applications such as traffic security, human machine interaction,
object recognition and so on. Accordingly, motion detection approaches can be broadly
classified in three categories: temporal flow, optical flow and background subtraction.
Iasonas Kokkinos and Petros Maragos [11] formulate the interaction between
image segmentation and object recognition using Expectation-Maximization (EM)
algorithm. These two tasks are performed iteratively, simultaneously segmenting an
image and reconstructing it in terms of objects. Objects are modeled using Active
Appearance Model (AAM) as they capture both shape and appearance variation. During
12 | P a g e
the E-step, the fidelity of the AAM predictions to the image is used to decide about
assigning observations to the object. Firstly start with oversegmentation of image and
then softly assign segments to objects. Secondly uses curve evolution to minimize
criterion derived from variational interpretation of EM and introduces AAMs as shape
priories.
Mina Makar, Vijay Chandrasekhar, Sam S. Tsai, David Chen and Bernd Girod
[13] proposed that streaming mobile augmented reality applications requires both real-
time recognition and tracking of objects of interest in a video sequence. A temporally
coherent keypoint detector and design efficient interframe predictive coding techniques
for canonical patches, feature descriptors and keypoint locations. Mobile Augmented
Reality (MAR) Systems are more important with growing interest in applications that use
image based retrieval on mobile devices. Streaming MAR applications require real-time
recognition and tracking of objects of interest.
1.3.1 Generations of OCR systems
First generation OCR systems The first commercialized OCR of this generation
was IBM 1418, which was designed to read a special IBM font407. The recognition
method was template matching, which compares the character image with a library of
prototype images for each character of each font [14].
Second generation OCR systems Next generation machines were able to

recognize regular machine-printed and hand printed characters. The character set was
limited to numerals and a few letters and symbols. Such machines appeared in the middle
of 1960s to early 1970s [14].
Third generation OCR systems For the third generation of OCR systems, the
challenges were documents of poor quality and large printed and hand-written character
sets. Low cost and high performance were also important objectives. Commercial OCR
systems with such capabilities appeared during the decade 1975 to 1985[14].
OCRs Today (Fourth generation OCR systems) The fourth generation can be
characterized by the OCR of complex documents intermixing with text, graphics, tables and
mathematical symbols, unconstrained handwritten characters, color documents, low-quality noisy
documents, etc. Among the commercial products, postal address readers, and reading aids for the
blind are available in the market [14].
13 | P a g e
1.4 Motivation
As we can see in our daily lives , people take images of some documents when
they have no other source to take that document with them , but later they have to read
each and every word from it. So we thought to make a project in which we just take an
image and process it to extract the text present in the image. It saves a lot of time to read
the text from an image.
The purpose of this project is to take English handwritten documents as input,

recognize the text and modify the handwriting such that it is a beautified version of the
input. Thus the project comprises of two parts - handwriting recognition and
beautification.
1.5 Applications
Text recognition technology may be apply throughout the entire spectrum of

industries, revolutionizing the document management process. This technology enable
scan documents to become more than just image files, turning into fully searchable
documents with text content that is recognized by computers. With the help of this
technology, people no longer need for manually retype important documents when
entering them into electronic databases. Instead, Text recognition system extracts
relevant information and enters it automatically. The result is accurate, efficient
information processing in less time. In the following, we overview some applications
of text recognition system.
1. Banking -
The uses of image text recognition vary across different fields. One widely
known application is in banking, it is used to process checks without human
involvement.
A check can be inserted into a machine, the writing on it is scanned instantly,
and the correct amount of money is transferred. This technology has nearly been
perfected for printed checks, and is fairly accurate for handwritten checks as well,
though it occasionally requires manual confirmation. Overall, this reduces wait times
in many banks.
14 | P a g e
2. Legal -
In the legal industry, there has also been a significant movement to digitize
paper documents. In order to save space and eliminate the need to sift through boxes
of paper files, documents are being scanned and entered into computer databases.
Image text recognition further simplifies the process by making documents

text-searchable, so that they are easier to locate and work with once in the database.
Legal professionals now have fast, easy access to a huge library of documents in
electronic format, which they can find simply by typing in a few keywords.
3. Healthcare –
Healthcare also use of image text recognition technology to process

paperwork. Healthcare professionals always have to deal with large volumes of forms
for each patient, including insurance forms as well as general health forms. To keep
up with all of this information, it is useful to input relevant data into an electronic
database that can be accessed as necessary.
By using image recognition technology they are able to extract information from
forms and put it into databases, so that every patient's data is promptly recorded. As a
result, healthcare providers can focus on delivering the best possible service to every
patient.
4. Image –
Text recognition in Other Industries Image text recognition technology is widely

used in many other fields, including education, finance, and government agencies.
This technology has made countless texts available online, saving money for students
and allowing knowledge to be shared. Invoice imaging applications are used in many
businesses to keep track of financial records and prevent a backlog of payments
from piling up.
In government agencies and independent organizations, image text

recognition technology simplifies data collection and analysis, among other
processes.
As the technology continues to develop, more and more applications are

found for technology, including increased use of handwriting recognition.
15 | P a g e
Optical character recognition has been applied to a number of applications. Some
of them have been explained below.
5.Legal Industry-
OCR is used in Legal industry for digitize documents, and directly entered to
computer database. Legal professionals can further search documents required from huge
databases by simply typing a few keywords [15].
6.Healthcare –
Healthcare professionals always have to deal with large volumes of forms for each
patient, including insurance forms as well as general health forms. To keep up with all of this
information, it is useful to input relevant data into an electronic database that can be accessed as
necessary. Form processing tools, powered by OCR, are able to extract information from forms
and put it into databases, so that every patient's data is promptly recorded [15].
7.Optical Music Recognition-
Initially it was aimed towards recognizing printed sheets which can be edited into
playable form with the help of electronic methods. It has many applications like processing of
different classes of music, large scale digitization of musical data and also it can be used for
diversity in musical notation [15].
8.Automatic Number Recognition
Automatic number plate recognition is used as a technique making use of optical

character recognition on images to identify vehicle registration plates. They are used by various
police forces and as a method of electronic toll collection on pay-per-use roads and cataloging
the movements of traffic or individuals [15].
9.Handwriting Recognition
It is the ability of a computer system which scans the image of handwritten text by
scanner and extracts only handwritten character from that image [16].
16 | P a g e
1.6 Advantages
❖ Retrieving invoice data is easy with the use of OCR. There’s no need to manually
input all of the data into digital format when the technology can do the job for
you!
❖ Storage of important information is important for many businesses, that’s why

OCR is used by many public services for exactly this reason.
❖ Extracting tables from documents can be a very lengthy task. One of the best
benefits of OCR is that you can swap hours of computer work for a 1-minute task.
❖ Save Your Fingers With Typing
Typing, typing, typing. Is that all you seem to do? OCR can help when a retyping
task is on your to-do list. There’s no need to spend hours at your computer desk
when the technology is capable of whizzing through the text for you. Save time,
effort and benefit from a fully-searchable digital document at the end.
❖ Documents Will Be Text Searchable
Do you want to spend hours searching for data? Bulky files are gone thanks to the
use of OCR. You can scan all the documents that you need and they’ll be text
searchable too. Simply enter a keyword into the ‘searchable PDF’ next time you
need a piece of data quickly.
❖ Digital Can Be Safer
You won’t need to worry about physical storage, documents can be found in one
place – digitally. Digital versions are far easier to backup and you won’t risk any
tea spillages either. Control exactly where your data is saved and access it
whenever you need to.
❖ Edit Documents Easily
Editing scanned documents can cause a headache. Therefore one of the best
benefits is it makes this task a breeze. It’ll swiftly convert them and allow you to
make the changes you need to in the format of your choice.
❖ Less Stress Overall
17 | P a g e
This Project Stop keeping clients on hold and benefit from running a simple
search on your computer. Get the data that you need quickly. Avoid the filing
cabinets and keep your customers engaged.
18 | P a g e
Chapter 2
REVIEW OF LITERATURE
2.1 Literature Reviews
As discussed earlier text recognition from images is still an active research in the
field of pattern recognition. To address the issues related to text recognition many
researchers have proposed different technologies, each approach or technology tries to
address the issues in different why. In this section we present a detailed survey of
approaches proposed to handle the issues related to text recognition.
Yang has proposed a novel adaptive binarization method based on wavelet

filter is proposed. This approach was processes faster, so that it is more suitable for
real-time processing and applicable for mobile devices.
Rhead has considered real world UK number plates and relates these to
ANPR. It considers aspects of the relevant legislation and standards when applying
them to real world number plates. The varied manufacturing techniques and varied
specifications of component parts are also noted. The varied fixing methodologies
and fixing locations are discussed as well as the impact on image capture.
Jawahar has proposed a recognition scheme for the indian script of

devanagari. They used approach does not require word to character segmentation,
which is one of the most common reason for high word error rate. They have been
reported a reduction of more than 20% in word error rate and over 9% reduction in
character error rate while comparing with the best available OCR system.
Ntirogiannis has studied that the document image binarization is of great

importance in the document image analysis and recognition pipeline since it affects
further stages of the recognition process. The evaluation of a binarization method
aids in studying its algorithmic behavior, as well as verifying its effectiveness, by
providing qualitative and quantitative indication of its performance. They proposed
a pixel-based binarization evaluation methodology for historical handwritten/machine-
printed document images.
Malakar has described that extraction of text lines from document images is
one of the important steps in the process of an Optical Character Recognition (OCR)
system. In case of handwritten document images, presence of skewed, touching or
overlapping text line(s) makes this process a real challenge to the researcher.
19 | P a g e
Tirthraj Dash et al have discussed HCR using associative memory net
(AMN) in their paper . They have directly worked at pixel level. Dataset was
designed in MS paint 6.1 with normal Arial font of size 28. Dimension of image
was kept 31 X 39.
Pradeep have proposed neural network based classification of handwritten

character recognition system. Each individual character is resized to 30 X 20
pixels for processing. They are using binary features to train neural network.
However such features are not robust.
2.1 Relation to past work:
[Vaidya 1999] use a feature-based approach for numeral recognition. They have
used a statistical method by assigning weights to each feature and assessing the numerals
using these weights. We also use the feature-based approach to recognize the handwritten
words. Our approach is different from theirs as words may be written in cursive writing
whereas numerals aren't. It is not always possible to break words into characters. Hence
we have to use a continuous process of matching the set of features to a database while
accounting for the permutations as new features come into view and old ones are
discarded. Also, the set of features in the case of alphabets is larger than that in the case
of numerals.
[Nicchiotti/Scagliola 1999] have shown some good examples of normalization by

removing unwanted variations in cursive handwriting using estimation and compensation
of reference lines and slant angle. In the preprocessing of documents for the purpose of
recognition and beautification, normalization is an important step to facilitate feature
extraction.
[Spitz 1998] use character shape coding process for typed word recognition. They
have a small dictionary to which all the words in the document belong. After scanning the
words, they are classified on the basis of the regions that they occupy (extending above
middle-line, extending below bottom-line or completely between the two). This narrows
down the range of possibilities for the word which is then matched against all these
possibilities. We had considered this approach but it would have been highly inefficient
in our case which is more general as ours is not restricted to a small fraction of a
dictionary nor is it restricted to typed documents where the characters are easily
distinguishable.
2.1 In English Scripter Character Recognition:
20 | P a g e
In 2004 N. M. Noor, M. Razaz and P. Manley-Cooke Proposed system using global
geometrics feature extraction and geometric density classifier for feature extraction then
neural fuzzy logic used for classification.
Evaluation of the system has achieve for Geometric Density 77.89% and Geometric
Feature 76.44% accuracy rate [6]. In 2010 Dewi Nasien, Habibollah Haron and Siti Sophiayati
Yuhaniz This studies Take three datasets from NIST database considered lowercase letters
189,411, uppercase letters 217,812 and combination of uppercase and Lowercase letters
407,223 sample are used.
Those Samples are divided into 80% for training and 20% for testing. For feature
extraction used Freeman chain code (FCC). Support vector system (SVM) is selected for
recognition step show nearest neighbor achieve 61.53% accuracy when neural network gives
57.69%. Math lab tool was used for features extracted and recognition. The evaluation
outcome suggests Nearest Neighbor is a better recognizer comparing with artificial neural
network when implemented to English Characters.
2.3. Arabic Scripter Character Recognition
In 2002 Majid M. Altuwaijri and Magdy A. Bayoumi They develop system to

recognize Arabic text using neural network used set of moment invariants descriptors (under
shift, scaling and rotation) and artificial neural network (ANN) used for classification The
study has shown 90% of a high accuracy rate .
In 2015 Ashraf Abdel Raouf, Colin A. Higgins, Tony Pridmore and Mah-moud I. Khalil
Haar studied approach for recognizing Arabic character using Haar Cascade Classiﬁer (HCC)
These classifiers were trained and tested on some 2,000 images. To extract feature Haar-like
feature extraction used and boosting of a classifier cascade. The system was tested with real text
image and produces 87% accuracy rate for Arabic character recognition.
In 2017 N. Lamghari, · M. E. H. Charaf and · S. Raghay On this research the data are
divided into three parts. From 34,000 character 70% are used for training, 15% for testing phase
and 15% for validation. To extract feature hybrid feature extraction used (pixel density,
resize, freeman code, structural features, invariant) for recognition used feed forward-back
propagation neural network. The system has achieved 98.27% high recognition rate.
21 | P a g e
In 2018 Noor A. Jebrila, Hussein R. Al-Zoubib and Qasem Abu Al-Haijac In addition
to the preprocessing step includes in particular three levels. In the primary section, they
employed word segmentation to extract characters.
In the second one section, Histograms of Oriented Gradient (HOG) are used for feature
extraction. The very last phase employed Support Vector Machine (SVM) for classifying
characters.
They have carried out the proposed method for the recognition of Jordanian
metropolis, city, and village names as a case examine, similarly to many other phrases
that offers the characters shapes that aren't included with Jordan cites. The set has cautiously
been selected to include each Arabic character in its all forms. To the conclusion, they have
got constructed their own dataset inclusive of greater than 43.000 handwritten Arabic
phrases (30000 used for training and 13000 used for testing stage). Recognition result show 99%
rate of accuracy.
2.4 Devanagari Scripter Character Recognition
In 2011 Gyanendra K. Verma, Shitala Prasad, and Piyush Kumar Curvelet present in
approach for Hindi handwritten character recognition using curvelet transformer.
The study are used dataset that contain 200 images of character (each image contains all
Hindi characters). Feature extract using curvelet transform and for recognition k-nearest
neighbor the experiment result show more than 90% accuracy .
In 2013 Divakar Yadav, Sonia Sánchez-Cuadrado and Jorge Morato develop optical
character recognition system using neural network for Hindi characters and trained with 1000
dataset. Feature extraction technique is histogram of projection based on mean distance,
on pixel values and vertical zero crossing.
Then classify using back-propagation neural network with two hidden layers.
Experimental result show 98.5% correct recognition.
22 | P a g e
In 2015 Akanksha Gaur and Sunita Yadav this system extract feature using k-means
clustering and classified used support vector machine using linear kernel and Euclidean
distance. The evaluation show that SVM has better results using linear kernel than
Euclidean distance.
Maximum achieved using Euclidean distance is 81.7% accuracy. Using linear

kernel giving 95.86% result.
In 2018 Nikita Singh present system with the title “An Efficient Approach for
Handwritten Devanagari Character Recognition Based on Artificial Neural Network” for
recognition hind character. For feature extraction they used histogram oriented gradients (HOG)
and recognition used artificial neural network (ANN) classifier. The system get 97.06% high
accuracy .
23 | P a g e
Chapter 3
TEXT RECOGNISTION SYSTEM
In this section we briefly describe the overall architecture of text recognition system
as shown in figure 3.1.
Fig 3.1 : Architecture of text recognition
24 | P a g e
A text recognition system receives an input in the form of image which contains
some text information. The output of this system is in electronic format i.e. text information
in image are stored in computer readable form.
The text recognition system can be divided in following modules:
(A) pre-processing
(B) text recognition
(C) post-processing.
Each module is further described in detail as bellow:
A. Pre-processing Module
The paper document is generally scanned by the optical scanner and is converted in
to the form of a picture. A picture is the combinations of picture elements which are also
known as pixels.
At this stage we have the data in the form of image and this image can be further
analyzed so that’s the important information can be retrieved. So to improve quality of the
input image, few operation are performed for enhancement of image such as noise removal,
normalization, binarization etc.
1) Noise Removal
Noise removal is one of the most important process. Due to this quality of the
image will increase and it will effect recognition process for better text recognition in
images. And it results in generation of more accurate output at the end of text recognition
processing. There are many methods for image noise removal such as mean filter, min-
max filter, Gaussian filter etc.
2) Normalization
Normalization is one of the important pre-processing operation for text recognition.

The normalization is applied to obtain characters of uniform size, slant and rotation.
25 | P a g e
3) Binarization
Binarization is one of the important pre-processing operation for text recognition.

A printed document is first scanned and is converted into a gray scale image.
Binarization is a technique by which the gray scale images are converted to binary
images. This separation of text from background that is required for some operations such as
segmentation. Figure 3.2 shows a colour image
(a) gray image
(b), and binary image
(c) of a text image
Fig 3.2 Architecture of text recognition
26 | P a g e
B. Text Recognition Module
This module can be used for text recognition in output image of pre-
processing model and give output data which are in computer understandable form.
Hence in this module following techniques are used.
1) Segmentation
In text recognition module, the segmentation is the most important process.

Segmentation is done to make the separation between the individual characters of an
image.
2) Feature Extraction
Feature extraction is the process to retrieve the most important data from
the raw data. The most important data means that’s on the basis of that’s the
characters can be represented accurately.
To store the different features of a character, the different classes are

made. There are many technique used for feature extraction like
Principle Component Analysis (PCA),
Linear Discriminate Analysis (LDA),
Independent Component Analysis (ICA),
Chain Code (CC),
zoning,
Gradient Based features,
Histogram etc.
3) Classification
The classification is the process of identifying each character and assigning to

it the correct character class, so that texts in images are converted in to computer
understandable form.
27 | P a g e
This process used extracted feature of text image for classification i.e. input
to this stage is output of the feature extraction process.
Classifiers compare the input feature with stored pattern and find out best
matching class for input. There are many technique used for classification such as
Artificial Neural Network (ANN),
Template Matching,
Support Vector Matching (SVM) etc.
C. Post-processing Module
The output of text recognition module is in the form text data which is understand by
computer, So there need to store it in to some proper format( i.e. text or MS-Word )for farther
use such as editing or searching in that data
3.2 Task Involoved in OCR
Fig 3.3 OCR Processing
28 | P a g e
The above figure 3.3 shows different processes which are done in OCR system (Fig. 3.3).
1. Image acquisition
Input image for OCR system might be acquire by scanning document or by capturing
photograph of document. This is also known as digitization process [17].
2. Preprocessing
Preprocessing consist series of operations and it used to enhance an image and make it
suitable for segmentation. Noise get introduced during document generation. So Proper filter like
mean filter, min-max filter, Gaussian filter etc. may be applied to remove noise from document.
. Binarization process converts gray scale or colored image to black and white image. To
enhance visibility and structural information of character Binary morphological operations like
opening, closing, thinning, hole filling etc. may be applied on image.
If scanned image is not be perfectly aligned, so we need to align it by performing slant

angle correction. Input document may be resized if it is too large in size to reduce dimensions to
improve speed of processing [17].
3. Segmentation
Character segmentation performs an operation of decomposition of an image into Sub images of

individual symbols. It is one of the decision processes in a system for optical character
recognition (OCR). Its decision that a pattern isolated from the image is that of a character or
some other identifiable unit.
Generally document is processed in hierarchical way. At first level lines are segmented
using row histogram. From each row, words are extracted using column histogram and finally
29 | P a g e
characters are extracted from words. Accuracy of final result is highly depends on accuracy of
segmentation [17].
4. Feature extraction
Feature extraction is the important part of any pattern recognition application. Feature
extraction techniques like Linear Discriminant Analysis (LDA), Principle Component Analysis
(PCA),Independent Component Analysis (ICA), Chain Code (CC), Scale Invariant Feature
Extraction (SIFT),Gradient based features, Histogram might be applied to extract the features of
individual characters. These features are used to train the system [17].
5. Classification
When image is provided as input to OCR system, its features are extracted and given as
an input to the trained classifier like artificial neural network or support vector machine.
Classifiers compare the input feature with stored pattern and find out the best matching class for
input [17].
6. Post processing
This step is not compulsory; it helps to improve the accuracy of recognition. Syntax
analysis, semantic analysis kind of higher level concepts might be applied to check the context of
recognized character [17]
Our proposed framework goes through various phases
1) Pre-Processing.
2) Feature Extraction.
3) Image Segmentation.
4) Text Conversion.
30 | P a g e
5) Text-to-Speech synthesis.
1. Edge Detection
A set of connected pixels that forms a boundary between two disjoint regions is known as
an edge. The task of segmenting an image into regions of discontinuity is done using edge
detection. Edges usually occur on the boundary of two different boundaries in an image. Edge
detection helps to clearly identify the changes in region of an image where gray scale and texture
change in the regions of an image.
There are many available edge detection techniques for extracting edges from images
such as Robert, Prewitt and Sobel which were not much efficient. Then in 1986 John. F. Canny
developed an algorithm which provided high probability of edge detection and error rate.
2. Canny Algorithm
This algorithm focuses mainly on three main aims of low error rate, minimize distance
between real edge and detected edge and minimum response i.e. one detector response per edge
to detect the edges in an image.
3. Image Segmentation
Image segmentation is another important aspect necessarily required to divide an image

into regions or categories which then helps to identify correctly the object in an image.
Segmentation functions on the properties shown by the pixels in an image, every pixel which
belongs to same category has similar gray scale value whereas pixels of different categories have
dissimilar values.
Segmentation is often one of the critical steps in analyzing the images because additional
overhead of moving to each new pixel of an image while working with object in an image. Once
image segmentation is done successfully, the other stages in image analysis are much easier.
31 | P a g e
While considering a fully automatic conversion algorithm, the success of image
segmentation is partial and sometimes requires manual intervention. Segmentation mainly has
two main objectives:
1) divide or decompose the image into parts for further processing,
2) perform change in organizing the pixels of image into higher-level units so that the
objects become more meaningful.
3.3 Basic Architecture System
The system architecture of image to text as well as speech conversion system is

illustrated in figure 3.4. There are various phases in which our system will work.
Fig 3.4 : System architecture of image to text as well as speech conversion
32 | P a g e
They are:
1)Input image:
This image is the image entered as input image to the system.
2)Pre-processing:
In this phase pre-requesit processing on the input images such as removing noise and making it
more usable to be recognizable for the system are carried out.
3)Feature extraction:
This phase is one of the important one. Extracting priliminary features and dividing them into
geometric elements like arc, line and circle and comparing these elements with known set of
characters which are store in the database.
4)Matching with database:
After feature extraction, system requires assistance of database in order to recognize the objects
in the image, so matching is done.
5)Generate text:
After successful recognition of objects, it is now important task of the system to generate
appropriate text for every input image.
6)Speech output:
33 | P a g e
The appropriate speech output for the generated text is given in the final phase.
3.4 Optical Character Recognition
Optical character recognition or optical character reader (OCR) is the electronic or

mechanical conversion of images of typed, handwritten or printed text into machine-encoded
text, whether from a scanned document, a photo of a document, a scene-photo (for example the
text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image
(for example: from a television broadcast).
Widely used as a form of data entry from printed paper data records – whether passport
documents, invoices, bank statements, computerized receipts, business cards, mail, printouts of
static-data, or any suitable documentation – it is a common method of digitizing printed texts so
that they can be electronically edited, searched, stored more compactly, displayed on-line, and
used in machine processes such as cognitive computing, machine translation, (extracted) text-to-
speech, key data and text mining. OCR is a field of research in pattern recognition, artificial
intelligence and computer vision.
Early versions needed to be trained with images of each character, and worked on one
font at a time. Advanced systems capable of producing a high degree of recognition accuracy for
most fonts are now common, and with support for a variety of digital image file format inputs.
Some systems are capable of reproducing formatted output that closely approximates the
original page including images, columns, and other non-textual components.
3.5. Preprocessing
The preprocessing is a fundamental stage that is proceeding to the stage of feature

extraction; it regulates the appropriateness of the outcomes for the consecutive stages.
The OCR success rate is contingent on the success percentage of each stage.
34 | P a g e
A. Factors Affecting the Text Recognition Quality
Many factors influence the precision of character recognized using OCR. The factors are
scan resolution, scanned image quality, printed documents category either photocopied or laser
printer, quality of the paper, and linguistic complexities. The uneven illumination and
watermarks are few factors faced in OCR system that influence the accuracy of OCR.
B. Significance of Preprocessing in Text Recognition
The preprocessing step is necessary to obtain better text recognition rate, using efficient
algorithms of preprocessing creates the text recognition method robust using
noise removal,
image enhancing process,
image threshold process,
skewing correction,
page and text segmentation,
text normalization and morphological operations.
C. Preprocessing methods
The majority of OCR application uses binary / gray images. The images mayhave
watermarks and/or non-uniform background that make recognition process difficult without
performing the preprocessing stage. There are several steps needed to achieve this.
1. The initial step is to adjust the contrast or to eliminate the noise from the image called
as the image enhancement technique.
2. The next step is to do thresholding for removing the watermarks and/or noise
followed by the page segmentation for isolating the graphics from the text.
3. The next step is text segmentation to individual character separation followed by
morphological processing.
35 | P a g e
The morphological processing is required to addpixels if the preprocessed image has
eroded parts in the characters.
D. Techniques involved in Image
Enhancement Image enhancement increasesimage quality for perception of humans by

increasing contrast, minimizing blurring and removing noise (Nithyananda et al. 2016).
E. Spatial Image Filtering
The filters are applied to defeat the high or low frequencypresent in the image.
Eliminating the high frequencies in the image is smoothing, and the low frequencyelimination is
enhancing or edge detection in the image.
The following figure 3.4 shows the original image and 3.5 and 3.6 shows the images
applied with Prewitt and Canny edge detection methods. These filtering techniques may give
effective text detection from images available in natural scene.
Fig 3.5: Edge detection orginal image
36 | P a g e
Fig 3.6 : Edge detection Prewitt method
Fig 3.7: Edge detection Canny method
F. Techniques of Point processing
1. Global thresholding:
Image thresholding is the method of isolating the information from its background.
Hence, this method is usednormallyto grey-level, or scanned color images and itis categorized as
global and localthresholding.
37 | P a g e
Globalmethod of thresholding chooses avalue of threshold for the completeimage from
the intensity histogram. Global thresholdingautomatically reduces a greylevel image to a binary
image. The local adaptive thresholding method for each pixel it uses different values based on
the information of local area.
The figure 3.8 shows the global threshold applied using Otsu’s method.
Fig 3.8: Global Threshold using Otsu’s method
38 | P a g e
CHAPTER-4
SYSTEM REQUIREMENTS
4.1 Platform Requirement:
Platform requirements of these project is categorized by

- supportive operating system,
- software requirements and
- software requirements.
4.1.1 Supportive Operating System:
The supported Operating System for client include:
I. Android
Android is a powerful operating system and it supports a large number of

applications in Smartphones. These applications are more comfortable and advanced for
users. The hardware that supports android software is based on the ARM architecture
platform.
The android is an open-source operating system that means that it’s free and anyone can
use it. The android has got millions of apps available that can help you manage your life
one or other way and it is available to low cost in the market for that reason android is
very popular.
Android Architecture:
The android is an operating system and is a stack of software components which is

divided into five sections and four main layers that is
• Linux kernel
• Libraries
39 | P a g e
• Android runtime
Linux kernel:
The android uses the powerful Linux kernel and it supports a wide range of hardware
drivers. The kernel is the heart of the operating system that manages input and output requests
from the software. This provides basic system functionalities like process management, memory
management, device management like camera, keypad, display etc the kernel handles all the
things. Linux is really good at networking and it is not necessary to interface it to the peripheral
hardware. The kernel itself does not interact directly with the user but rather interacts with the
shell and other programs as well as with the hardware devices on the system.
Libraries:
The on top of a Linux kennel there is a set of libraries including open-source web
browsers such as WebKit, library libc. These libraries are used to play and record audio and
video. The SQLite is a database that is useful for the storage and sharing of application data. The
SSL libraries are responsible for internet security etc.
Android Runtime:
The android runtime provides a key component called Dalvik Virtual Machine which is a
kind of java virtual machine. It is specially designed and optimized for android. The Dalvik VM
is the process virtual machine in the android operating system. It is a software that runs apps on
android devices.
40 | P a g e
Application framework:
The application framework layer provides many higher-level services to applications such as
windows manager, view system, package manager, resource manager, etc. The application
developers are allowed to make use of these services in their application.
Applications and Features:
You will find all the android applications at the top layer and you will write your application and
install it on this layer. Examples of such applications are contacts, books, browsers, services, etc.
Each application performs a different role in the overall applications.
4.2 Software Requirement:
1. ANDROID STUDIO
Android Studio is the official integrated development environment (IDE) for Android
application development. It is based on the IntelliJ IDEA, a Java integrated development
environment for software, and incorporates its code editing and developer tools.
To support application development within the Android operating system, Android

Studio uses a Gradle-based build system, emulator, code templates, and Github integration.
Every project in Android Studio has one or more modalities with source code and resource files.
These modalities include Android app modules, Library modules, and Google App Engine
modules.
Android Studio uses an Instant Push feature to push code and resource changes to a
running application. A code editor assists the developer with writing code and offering code
completion, refraction, and analysis. Applications built in Android Studio are then compiled into
the APK format for submission to the Google Play Store.
41 | P a g e
4.3 Hardware Requirement:
1. RAM: 4GB minuimum or more.
RAM is short for “random access memory” and while it might sound mysterious, RAM is one of
the most fundamental elements of computing. RAM is the super-fast and temporary data storage
space that a computer needs to access right now or in the next few moments.
2. Processor: Intel i3 miniumum core.
A processor, or "microprocessor," is a small chip that resides in computers and other

electronic devices. Its basic job is to receive input and provide the appropriate output. While this
may seem like a simple task, modern processors can handle trillions of calculations per second.
42 | P a g e
Chapter 5
IMPLEMENTATION
Implementation is the process of converting a new system into an operational one. The
designed system is converted one using suitable programming languages. Implementation
includes all those activities that takes place to convert old system to new.
For implementation of this project we used Android studio IDE and Java Programming
language.
5.1 Programming Language
A programming language is our way of communicating with software. The people who
use programming languages are often called programmers or developers. The things we tell
software using a programming language could be to make a mobile application look a certain
way, or to make an object on the page move if the human user takes a certain action.
5.1.1 JAVA
Java is a general-purpose programming language that is class-based, object-oriented, and

designed to have as few implementation dependencies as possible. It is intended to let application
developers write once, run anywhere (WORA), meaning that compiled Java code can run on all
platforms that support Java without the need for recompilation. Java applications are typically
compiled to bytecode that can run on any Java virtual machine (JVM) regardless of the
underlying computer architecture. The syntax of Java is similar to C and C++, but it has fewer
low-level facilities than either of them.
Java is a programming language that has been around a lot longer than Android. It is an
object-oriented language. This means it uses the concept of reusable programming objects. If this
sounds like technical jargon, another analogy will help. Java enables us and others (like the
Android development team) to write Java code that can be structured based on real-world things,
and here is the important part: it can be reused.
43 | P a g e
5.2 Source Code:
5.2.1 MainActivity.java
public class MainActivity extends AppCompatActivity {
EditText mResultEt;
ImageView mPreviewIv;
private static final int CAMERA_REQUEST_CODE = 200;
private static final int STORAGE_REQUEST_CODE = 400:
private static final int IMAGE_ PICK_ GALLERY CODE =1000;
private static final int IMAGE_ PICK_CAMERA_CODE = 1001;
String cameraPermission[];
String storagePermission[];
Uri image_uri;
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_ main);
actionBar.setSubtitle(“Click + button to insert Image”);
mResultEt = findViewilyId(R.id.resultit);
mProvIewIv = findViewElyId(R.id.imageIv);
cameraPermIssion = new String()(Manifest.permission.CAMERA,

Manifest.permiasion.WRITE_EXTERNAL_STORAGE);
storagePermission = new String[] (Manifezt.permission.WRITE_ EXTERNAL_STORAGE);
44 | P a g e
@Override
public boolean onCreateOptionsMenu(Menu menu) {
getMenuInflater().inflate(R.menu.menu_main, menu);
return true;
@Override
public Boolean onOptionsItemSelected(MenuItem item)
int id = item.getItemId();
if (id == R.id.addImage){
showImageImportDialog();
if (id == R.id.settings){
Toast.makeText(context: this, text "Settings", Toast.LENGTH SHORT).show()
return super.onOptionsItemSelected(item);
private void showImageImportDialog() {
String() items - (" Camera", " Gallery");
AlertDialog.Builder dialog = new AlertDialog.Builder(this);
dialog.setTitle("Select Image");
dialog.setltems(items, new DialogInterface.OnClickListener()
@Override public void onClick(DialogInterface dialog, int which) {
if (which== 0)(
//camera option clicked
if (!checkCameraPermission()) {
//camera permission not allowed, request it
requestCameraPermission();
45 | P a g e
}
else (
//permission allowed, take picture
pickCamera();
if (which ==1)({
//gallery option clicked
if (!checkStoragePermission()){
//Storage permission not allowed, request it
requestStoragePermission();
else{
pickGallery();
}});
dialog.create().show();
private void pickGallery(){
//intent to pick image from gallery
Intent intent = new Intent(Intent.ACTION_PICK);
//set intent type to image
intent.setType(“image/*");
startActivityForReault(intent, IMAGE_PICK_GALLERY_CODE);
private void pickCamera(){
46 | P a g e
ContentValues values = new ContentValues();
Values.put (MediaStore. Images.Media.TITLE, "NewPic");

Values.put(MediaStore.Images.Media.DESCRIPTION, "Image To text");
image_uri =
getContentResolver().insert(MediaStore.Images.Media.EXTERNAL_CONTENT_URI, values),
Intent cameraIntent = new Intent(MediaStore.ACTION_IMAGE_CAPTURE);

cameraIntent.putExtra(MediaStore.EXTRA_OUTPUT, image_uri);
startActivityForResult(cameraIntent, IMAGE_PICK_CAMERA_CODE):
private void requestStoragePermission() {
ActivityCompat.requestPermissios( this, storagePermission, STORAGE_REQUEST_CODE);
private boolean checkStoragePermission() {
boolean result = ContextCompat.checkSelfPermission(Context: this,

Manifest.permission.WRITE_EXTERNAL_STORAGE) == (PackageManager.PERMISSION
GRANTED);
return results ;
private void requestCameraPermission() {
ActivityCompat.requestPermissions(this, cameraPermission, CAMERA_REQUEST_CODE;
private boolean checkCameraPermission() I{
/*Check camera permission and return the result *In order to get high quality image we have to
save image to external storage first before inserting to image view that's why storl- : will also be
required*/ boolean result = ContextCompat.checkSelfPermission( this,
Manifest.permission.CAMERA) ==(PackageManager.PERMSSION GRANTED):
boolean result1= ContextCompat.checkSelfPermission(Context: this,

Manifest.permission.WRITE_EXTERNAL_ STORAGE) ==
(PackageManager.PERMISSION_GRANTED); return result && resultl;
47 | P a g e
public void onRequestPermissionsResult (int reguestCode, String[] permissions, @NonNull)
switch (requestCode){
case CAMERA_REQUEST_ CODE:
if (grantResults.length >0){
boolean cameraAccepted = grantResults[0] == PackageManager.PERMISSION_GRANTED;
boolean writeStorageAccepted = grantRezults [0] ==

PackageManager.PERMISSION_GRANTED;
if (cameraAccepted && writeStorageAccepted){
pickCamera();
else {
Toast.makeText(this,”permissiondenied”,Toast.LENGTH SHORT).show();
} break;
case STORAGE_REQUEST_CODE:
if (grantResults.length >0){
boolean writeStorageAccepted = grantResults[0] ==

PackageManager.PERMISSION_GRANTED;
if (writeStorageAccepted){
pickGallery();
else {
Toast.makeText(this,”permissiondenied”,Toast.LENGTH SHORT).show();
@Override
protected void onActivityResult(int requestCode, int resultCode, @Nullable Intent data)
if (resultCode == RESULT_OK){
48 | P a g e
if (requestCode ==IMAGE_PICK_GALLERY_CODE){
CropImage.activity(data.getData()) .setGuidelines(CropImageView.Guidelines.ON)
.start(activity this);
if (reques tCode ==IMAGI_PICK_CAMERA_CODE){
CropImage.activity(image_uri) .setGuidelines(CropImageView.Guidelines.ON) .start( activity:

this);
if (requestCode == CropImage.CROP IMAGE_ACTIVITY_REQUEST_CODE)
{ Croplmage.ActivityResult = CropImage.getActivityResult(data);
5.2.2 Activity_main.xml
<xml version=”1.0” encoding="utf-8"?>
<RelativeLayout
xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:tools="http://schemas.android.com/tools"
x mlns:app="http://schemas.android.com/apk/res-auto"
android:layout_width="match_parent"
android:layout_height="match_parent"
tools:context="MainActivity"›
49 | P a g e
<ScrollView
android:layout_width=”match_parent”
android:layout_height="wrap_content”>
<LinearLayout
android:layout_height="wrap_content"
android:orientation="vertical>
<android.support.v7.widget.CardView
android.layout_width=”match-parent”
androld:layout_height=”wrap_content”
app:cardBackgroundColor="fff”
app:cardUseCompatPadding="true"
app:cardCornerRadius="3dp"
app:cardElevation="3dp”>
<LinearLayout
android:orientation="vertical"
android:padding=”5dp”>
<TextView
android:text="Result"
android:textColor="@color/colorPrimary"
android:textSize="20sp"
50 | P a g e
android:layout_height="wrap_content”/>
<EditText
Android:id=”@+id/resultEt”
android:hint="Click + button to add image"
android:autoLink="all"
android:background=”@null”
android:padding="5dp”
android:textColor="#000”
android:layout _width=”match_parent"
android:layout_height= "wrap content" />
</LinearLayout>
</android.support.v7.widget.CardView>
<android.support.v7.widget.CardView
android.layout_width=”match-parent”
androld:layout_height=”wrap_content”
app:cardBackgroundColor="fff”
app:cardUseCompatPadding="true"
app:cardCornerRadius="3dp"
app:cardElevation="3dp”>
<LinearLayout
51 | P a g e
android:orientation="vertical"
android:padding=”5dp”>
<TextView
android:text="Image Preview "
android:textColor="@color/colorPrimary"
android:textSize="20sp"
android:layout_height="wrap_content”/>
<ImageText
android:id=”@+id/imageIV”
android:layout _width=”wrap_parent"
android:layout_height= "wrap_ content" />
</LinearLayout>
</ScrollView>
</RelativeLayout>
5.3 Steps for Implementing
5.3.1 Enable USB debugging on your Android phone

On Android 4.1 and lower, the Developer options screen is available by default.
On Android 4.2 and higher, do the following:
1. Open the Settings app.
2. Select System.
3. Scroll to the bottom and select About phone.
4. Scroll to the bottom and tap Build number 7 times.
5. Return to the previous screen to find Developer options near the bottom.
6. Scroll down and enable USB debugging.
52 | P a g e
5.3.2 Get Started with Remote Debugging Android Devices
• Set up your Android device for remote debugging, and discover it from your
development machine.
• Inspect and debug live content on your Android device from your development machine.
• Screencast content from your Android device onto a DevTools instance on your
Fig5.1: Remote Debugging lets you inspect a page running on an Android device from your
Step 1: Discover your Android device
1. Open the Developer Options screen on your Android.
2. Select Enable USB Debugging.

3. On your development machine, open Chrome.
4. Open DevTools.
5. In DevTools, click the Main Menu Main Menu then select More tools > Remote devices.
53 | P a g e
Fig 5.2: Opening the Remote Devices tab via the Main Menu
6. In DevTools, open the Settings tab.

7. Make sure that the Discover USB devices checkbox is enabled.
Fig 5.3: The Discover USB Devices checkbox is enabled
54 | P a g e
8. Connect your Android device directly to your development machine using a USB cable.
The first time you do this, you usually see that DevTools has detected an unknown
device. If you see a green dot and the text Connected below the model name of your
Android device, then DevTools has successfully established the connection to your
device. Continue to Step 2.
Fig5.4: The Remote Devices tab has successfully detected an unknown device that is
pending authorization
9. If your device is showing up as Unknown, accept the Allow USB Debugging permission
prompt on your Android device.
Step 2: Debug content on your Android device from your development machine
1. Open Chrome on your Android device.
55 | P a g e
2. In the Remote Devices tab, click the tab that matches your Android device model name.
At the top of this page, you see your Android device's model name, followed by its serial
number. Below that, you can see the version of Chrome that's running on the device, with
the version number in parentheses. Each open Chrome tab gets its own section. You can
interact with that tab from this section. If there are any apps using WebView, you see a
section for each of those apps, too. In Figure 5 there are no tabs or WebViews open.
Fig5.5: A connected remote device

3. In the New tab text box, enter a URL and then click Open. The page opens in a new tab
on your Android device.
4. Click Inspect next to the URL that you just opened. A new DevTools instance opens.
The version of Chrome running on your Android device determines the version of
DevTools that opens on your development machine. So, if your Android device is
running a very old version of Chrome, the DevTools instance may look very different
than what you're used to.
Inspect elements
Go to the Elements panel of your DevTools instance, and hover over an element to highlight it
in the viewport of your Android device.
56 | P a g e
You can also tap an element on your Android device screen to select it in
the Elements panel. Click Select Element on your DevTools instance, and then tap the element
on your Android device screen. Note that Select Element is disabled after the first touch, so you
need to re-enable it every time you want to use this feature.
Screencast your Android screen to your development machine:
Click Toggle Screencast to view the content of your Android device in your DevTools
instance.
You can interact with the screencast in multiple ways:
• Clicks are translated into taps, firing proper touch events on the device.
• Keystrokes on your computer are sent to the device.
• To simulate a pinch gesture, hold Shift while dragging.
• To scroll, use your trackpad or mouse wheel, or fling with your mouse pointer.
Some notes on screencasts:
• Screencasts only display page content. Transparent portions of the screencast represent
device interfaces, such as the Chrome address bar, the Android status bar, or the Android
keyboard.
• Screencasts negatively affect frame rates. Disable screencasting while measuring scrolls
or animations to get a more accurate picture of your page's performance.
• If your Android device screen locks, the content of your screencast disappears. Unlock
your Android device screen to automatically resume the screencast.
57 | P a g e
Chapter 6
RESULTS AND ANALASYS
6.1 Initial View
This is an initial view of the app.
We can see
1. Image Preview tab below which gives the preview of the image and
2. Result Tab top to it which shows output text and we can see a icon in the
top right corner, it is used to select the Image.
Fig 6.1 Initial view of the aplication
58 | P a g e
6.2 Selecting Image
If we click on that icon, these are the otions we can see. We can directly take photo of a
printed copy by using Camera button; or take the photo from the Gallery.
Fig 6.2: Selecting image of application
59 | P a g e
6.3 Cropping Image
After selecting the image, we can see crop it to which we want .
We have 3 options in the top right corner:
1. to rotate the image

2. to mirror the image
3. after we had done with cropping we wll click on crop button
Fig 6.3: Cropping Image of application
60 | P a g e
6.4 Output
Finally we get the ouput.In the result top we are showing with the ouput text.In the bottom we
can see the image that we have cropped
Fig 6.4: Output of the application
61 | P a g e
6.5 Further usage of data
The result Ouput is editable. As we can see the options there, we can cut,copy,paste and
share the selected data
Fig 6.5: Further using of data
62 | P a g e
6.6 Sample outputs :
The application is observed on different types of images, posters, handwritten texts,
webpages screen shots etc., those outputs of the application are given below.
(a) (b)
(c) (d)
63 | P a g e
(e) (f)
(g) (h)
Fig 6.6 Outputs on different types of images
64 | P a g e
Chapter 7
CONCLUSION AND FUTURE SCOPE
7.1Conclusion
Nowadays, a lot of documents are produced in paper form but it is obvious, that
automatic data recognition systems are very popular. The document is repeatedly copied and
changed during subsequent processing steps, so it exists in many different copies. In some
applications they can successfully help humans, but in some cases they are useles.
By handwritten character recognition one means the recognition of single and

unconstrained hand drawn characters, i.e. numerals, upper-case and lowercase characters of a
particular alphabet. However, the frontiers of character recognition have now moved to the
recognition of cursive script that is the recognition of characters which may be connected or
written in calligraphy.
The processing of images is faster and more cost-effective. One needs less timefor processing,
as well as less film and other photographing equipment. It is more ecological to process images.
No processing or fixing chemicals areneeded to take and process digital images.
In this way we surveyed many techniques which are necessary to implement image to
text as well as speech system. Our contribution towards this work will surely be helpful for blind
as well as physically disabled people of our society. This is a small help form our side for such
people to make them more interact able with real world. More focus is on recognition of object
in an image. This will ultimately result in identifying important objects from an image. This
paper contains an abstract view of various technique proposed in recent past year for image to
text conversion and text to speech conversion.
7.2Future Work
We can further extend the project to recognize other languages scripts. That will be helpful
to make a lot of very sacred and important Ancient books, Upanishads, Novels, Holy books etc can be
easily digitized.
The project can be implemented on intranet in future. Project can be updated in near
future as and when requirement for the same arises, as it is very flexible in terms of expansion.
With the proposed software of database Space Manager ready and fully functional the client is
now able to manage and hence run the entire work in a much better, accurate and error free
manner.
65 | P a g e
The work reported in this thesis can be extended in the following directions. 1. Font
Independent OCR An Optical Character Recognition system could be developed by considering
the multiple font style in use. Our approach is very much useful for the font independent case.
Because, for font or character size, it finds the string and the strings are parsed to recognize the
character. Once character is identified, the corresponding character could be ejected through an
efficient editor. Efforts have been taken to develop a compatible editor for Tamil and English
. OCR for Tamil and other Indian Languages Except Bangla and Hindi, all other Indian
languages require development of an OCR for printed characters, and for handwritten characters,
OCR has to be developed for all languages (including Bangla and Hindi). Of course, OCR for
printed characters are easy when compared to the handwritten cursive scripts. Even for the
printed document recognition, an OCR should be able to perform the all features besides
character recognition, such as spell check, sentence and grammar check, Also an editor with key
board encoding and font encoding is required. With this approach the printed and handwritten
characters are recognizable easily with less effort and more accuracy. A module for Skew
correction and line separation, word and character separation along with an editor with spell
checker and grammar checker could be designed for ‘developing a complete OCR. Further, with
a little fine tuning on the modules, such as, skew correction and line separation, word and
character separation, a complete OCR could be designed for handwritten scripts of any language
for that matter. It is proposed to apply the approach to all manuscripts recognition of South-
Indian languages. Since some of the characters in some of the languages have similar characters
viz, Tamil and Malayalam have similar features among few characters, and Telugu and Kannada
have similarity among most of the characters, our approach could be applied for these languages
and could be extended to all other languages.
Cursive Characters OCR There is heavy demand for an OCR system which recognizes
cursive scripts and manuscripts like Palm Leaves. This actually avoids keyboard typing and font
encoding too. Steps have been taken in our laboratory to develop an OCR for handwritten Tamil
characters. 94 4. Language Converter through OCR Once a complete OCR has been developed
for two languages with font encoding, spell checker and grammatical sentence check, then a
converter could be implemented to convert sentences from one language to another language.
66 | P a g e
LIST OF FIGURES
Fig 3.1: Architecture of Text Recognition

Fig 3.2: Architecture of Text Recognition
Fig 3.3: OCR Processing
Fig 3.4:System architecture of image to text as well as speech conversion
Fig 3.5: Edge Detection Original Image
Fig3.6: Edge detection preview Method
Fig 3.7: Edge Detection Canny Method
Fig 3.8: Global Threshold Using a tsas’s Method
Fig5.1: Remote Debugging lets you inspect a page running on an Android device
from your development machine.
Fig 5.2: Opening the Remote Devices tab via the Main Menu
Fig 53: The Discover USB Devices Checkbox is enabled
Fig5.4: The Remote Devices tab has successfully detected an unknown device
that is pending authorization
Fig5.5: A connected remote device
Fig 6.1 :Intial View Of the Application
Fig 6.2: Selecting Image Of the Application
Fig 6.3: Cropping Image Of the Application
Fig6.4: Output Of the Application
Fig 6.5: Further Usage Of Data
Fig 6.6 Outputs on different types of images
67 | P a g e
ABBREVATIONS
1.ASN:hiArtificial Neural Network
2.CC: Chain Code
3.CPU: Central Processing Unit
4.IDE:integrated Development Environment
5.LDA: Linear Discriminate Analysis
6.OCR : Optical Character Recognition
7.PCA: Principle Component Analysis
8.RAM :Ran dom Access Memory
9.ROM: Read Only Memory
10.SVM : Support Vector Matching
68 | P a g e
REFERENCES
[1] Benjamin Z. Yao, Xiong Yang, Liang Lin, Mun Wai Lee
and Song-Chun Zhu, “I2T: Image Parsing to Text
Description” IEEE Conference on Image Processing,
2008 .
[2] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta
and R. Jain, “Content-based image retrieval at the end of
the early years,” IEEE Transactions PAMI, vol 22, no.
12, 2000.
[3] Y. Rui, T. S. Huang, and S. F. Chang, “Image retrieval:
Current techniques, promising directions, and open
issues,” Journal of Visual Communication and Image
Representation, vol. 10,1999.
[4] M. S. Lew, N. Sebe, C. Djeraba, and R. Jain, “Contentbased multimedia information
retrieval: State of the art
and challenges,” ACM Transactions on Multimedia
Computing, Communications, and Applications, vol. 2,
no. 1, pp. 1-19, Feb. 2006.
[5] C. Snoek and M. Worring, “Multimedia video indexing:
A review of the state-of-the-art,” Multimedia Tools
Appl, vol. 25, no. 1, 2005.
[6] R. Datta, D. Joshi, J. Li, and J. Z. Wang, “Image
retrieval: Ideas, influences, and trends of the new age,”
ACM Computing Surveys, vol. 40, no. 2, pp. 1-60, Apr.
2008.
[7] Yi-Ren Yeh, Chun-Hao Huang, and Yu-Chiang Frank
Wang, “Heterogeneous Domain Adaptation and
Classification by Exploiting the Correlation Subspace,”
IEEE Transactions on Image Processing, vol. 23, no. 5,
May 2014.
[8] S. Shahnawaz Ahmed, Shah Muhammed Abid Hussain
and Md. Sayeed Salam, “A Novel Substitute for the
Meter Readersin a Resource Constrained Electricity
Utility” IEEE Trans. On Smart Grid, vol. 4, no. 3, Sept.
69 | P a g e
2013.
[9] A. Abdollahi, M. Dehghani and N. Zamanzadeh, “SMSbased reconfigurable
automatic meter reading system,”
in Proc. 16th IEEE Int. Conf. Control Appl. Part IEEE
Multi-Conf. Sust. Control Singapore, Oct. 1-3, 2007, pp.
1103-1107.
[10]Fan-Chieh Cheng, Shih-Chia Huang and Shanq-Jang,
“Illumination-Sensitive Background Modeling Approach
for Accurate Moving Object Detection,” IEEE Trans. On
Broadcasting, vol. 57, no. 4, Dec 2011.
[11]Iasonas Kokkinos and Petros Maragos, “Synergy
between Object Recognition and Image Segmentation
using the Expectation-Maximization Algorithm”, IEEE
Transactions on Pattern Analysis and Machine
Intelligence, vol. 31, no. 8, Aug. 2009.
[12]T. Cootes, G. J. Edwards and C. Taylor, “Active
Appearance Models,” IEEE Trans. Pattern Analysis and
Machine Intelligence, vol. 23, no. 6, pp. 681-685, June
2001.
[13]Mina Makar, Vijay Chandrasekhar, Sam S. Tsai, David
Chen and Bernd Girod, “Interframe Coding of Feature
Description for Mobile Augmented Reality” IEEE
Trans. Image Processing, vol. 23, no. 8, Aug 2014.
[14] OCR System: A Literature Survey

[15] “Survey of OCR Applications” by Amarjot
Singh, ketan bacchuwar, Akshay bhasin.
[16] M.D. Ganis, C.L. Wilson, J.L. Blue, “Neural
network-based systems for handprint OCR
applications” in IEEE Transactions on Image
Processing, 1998, Vol: 7, Issue: 8, p.p. 1097 –
1112.
[17] “A Literature Review on Hand Written
70 | P a g e
Character Recognition” by Mansi shah &
Gordhan B Jethava Department of Computer
Science & Engineering Parul Institute of
Technology, Gujarat, India. Information
Technology Department Parul Institute of
Engg. & Technology, Gujarat, India.
71 | P a g e
Appendices
A. https://www.slideshare.net/IAMINURHEARTS1/ocr-ppt-35272335
B. https://www.sciencedirect.com/topics/engineering/image-processing
C. https://shodhganga.inflibnet.ac.in/bitstream/10603/36771/16/16_chepter%207.pd
f
D. https://shodhganga.inflibnet.ac.in/bitstream/10603/9849/11/11_chapter%206.pdf
E. https://mail.google.com/mail/u/1/#sent/QgrcJHsbgZXbhtDcSmXrbzQHVWVwQLSjZ
xl?projectr=1&messagePartId=0.1https://mail.google.com/mail/u/1/#sent/QgrcJHs
bgZXbhtDcSmXrbzQHVWVwQLSjZxl?projector=1&messagePartId=0.1
F. https://www.slideshare.net/avisek_roy91/digital-image-processing-
12632314#:~:text=Conclusion%20The%20processing%20of%20images,take%20and
%20process%20digital%20images.
G. http://cas.sdss.org/DR6/en/proj/advanced/processing/conclusion.asp
H. https://shodhganga.inflibnet.ac.in/bitstream/10603/176215/13/14_chapter%205.p
df
72 | P a g e
Student Bio-Data :
1.
Name: E.B. Meghana
Father Name: E.B. Ravi teja goud
Roll. No: 1610116
Date of Birth: 26/09/1998
Nationality: Indian
Communication Address:
Town/Village: Veldurthi Mandal: Veldurhi District: Kurnool
PIN Code: 518216
Ph. No: 8341200889
e-mail: meghanabashyam@gmail.com
Permanent Address:
PIN Code: 518216
Ph. No: 8341200889
e-mail: meghanabashyam@gmail.com
Qualifications:
Degree: Bachelor of Technology
Branch: Computer Science and Engineering
Technical Skills:
Languages : C, Java, Html, Css
Softwares : Android Studio, Eclipse IDE, Star UML, R Studio
Basic Computer Skills : MS Office, Power point, DB
Management
Advanced Computer Skills : Web Development, Data
Structures, Coding, Debugging
Area of Interest:
Big Data, Data Base Managemnet, Web Development
73 | P a g e
2.
Name: G. Mahalakshmi
Father Name: G. Ramakrishna
Roll. No: 1610121
Date of Birth: 19/07/1999
Nationality: Indian
Communication Address:
PIN Code: 518216
Ph. No: 7386053083
e-mail: mahalakshmig1967@gmail.com
Permanent Address:
PIN Code: 518216
Ph. No: 7386053083
e-mail: mahalakshmig1967@gmail.com
Qualifications:
Degree: Bachelor of Technology
Branch: Computer Science and Engineering
Technical Skills:
Languages : C, Java, Html, Css
Softwares : Android Studio, Eclipse IDE, Star UML, R
Studio
Basic Computer Skills : MS Office, Power point, DB
Management
Advanced Computer Skills : Web Development, Data
Structures, Coding, Debugging
Area of Interest:
Big Data, Data Base Managemnet, Web Development.
74 | P a g e
DECLARATION
We hereby declare that the project report entitled “IMAGE TO TEXT APP” submitted to the
Department of Computer Science and Engineering, Sri Krishnadevaraya University, Anantapuramu for
the partial fulfilment of the academic requirement for the degree for Bachelor of Technology in Computer
Science and Engineering is an authentic record of our work carried out during the final year under the
esteemed guidance of Mrs. D.GOUSIYA BEGUM, M.Tech(CSE), Lecturer of Computer
Science and Engineering Department, College of Engineering and Technology, Sri Krishnadevaraya
University, Ananthapuramu.
Signature of the Students
1.
2.
75 | P a g e

Image to Text App: A Mini Project

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Image to Text App: A Mini Project

Uploaded by

Copyright:

Available Formats

IMAGE TO TEXT APP

COMPUTER SCIENCE AND ENGINEERING

E.B. MEGHANA G. MAHALAKSHMI

Under the esteemed guidance of

Mrs. D.GOUSIYA BEGUM, M.Tech(CSE)

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

Signature of the Guide Signature of the Head of the Department

Signature of the Coordinator

Signature of the Students

It is with immense pleasure that I would like to express my indebted gratitude to my

Our heartfelt thanks to Smt. A.RENUKA DEVI, M.Tech., (Ph.D.) project

I wish to express my deep sense of gratitude to P.R.RAJESH KUMAR, M.Tech, Lecturer

I wish to convey my acknowledgment to Dr. R.RAMACHANDRA, M.Tech., Ph.D.,

Text recognition in images is an active research area which attempts to develop a

Chapter-2: REVIEW OF LITERATURE

Chapter-3: TEXT RECOGNITION SYSTEM

Chapter-4: REQUIREMENT AND ANALYSYS

Chapter-6: RESULTS AND ANALYSYS

Chapter-7: CONCLUSION AND FUTURE SCOPE

Now-a-days, there is a growing demand for the software systems to recognize

As a result, computer is unable to recognize the characters while reading

Character recognition is one of the most interesting areas of pattern recognition

Fig 1.1: Image recognition process

1.3.1 Generations of OCR systems

Second generation OCR systems Next generation machines were able to

The purpose of this project is to take English handwritten documents as input,

Text recognition technology may be apply throughout the entire spectrum of

Image text recognition further simplifies the process by making documents

Healthcare also use of image text recognition technology to process

Text recognition in Other Industries Image text recognition technology is widely

In government agencies and independent organizations, image text

As the technology continues to develop, more and more applications are

7.Optical Music Recognition-

8.Automatic Number Recognition

Automatic number plate recognition is used as a technique making use of optical

❖ Storage of important information is important for many businesses, that’s why

❖ Save Your Fingers With Typing

❖ Documents Will Be Text Searchable

❖ Digital Can Be Safer

❖ Edit Documents Easily

❖ Less Stress Overall

2.1 Literature Reviews

Yang has proposed a novel adaptive binarization method based on wavelet

Jawahar has proposed a recognition scheme for the indian script of

Ntirogiannis has studied that the document image binarization is of great

Pradeep have proposed neural network based classification of handwritten

2.1 Relation to past work:

[Nicchiotti/Scagliola 1999] have shown some good examples of normalization by

2.1 In English Scripter Character Recognition:

2.3. Arabic Scripter Character Recognition

In 2002 Majid M. Altuwaijri and Magdy A. Bayoumi They develop system to

2.4 Devanagari Scripter Character Recognition

Maximum achieved using Euclidean distance is 81.7% accuracy. Using linear

Fig 3.1 : Architecture of text recognition

The text recognition system can be divided in following modules:

(B) text recognition

Each module is further described in detail as bellow:

Normalization is one of the important pre-processing operation for text recognition.

Binarization is one of the important pre-processing operation for text recognition.

(a) gray image

(b), and binary image

(c) of a text image

Fig 3.2 Architecture of text recognition