
International Journal of Pure and Applied Mathematics
Volume 119 No. 15 2018, 111-117
ISSN: 1314-3395 (on-line version)
url: http://www.acadpubl.eu/hub/
Special Issue

IMPLEMENTATION OF OCR USING RASPBERRY PI
FOR VISUALLY IMPAIRED PERSON

1V. Mahalakshmi, 2Dr. M. Anto Bennet, 3Hemaladha R, 4Jenitta J, 5Vijayabharathi K.
1,2Professor, Department of Electronics and Communication Engineering, VelTech, Chennai, India
3,4,5UG Scholar, Department of Electronics and Communication Engineering, VelTech, Chennai, India
*Corresponding author's Email: mahalakshmi@gmail.com

ABSTRACT
Optical character recognition (OCR) is the identification of printed characters using photoelectric
devices and computer software. It converts images of typed, handwritten or printed text into
machine-encoded text, whether from a scanned document or from subtitle text superimposed on an
image. In this project, text images are converted into audio output. OCR is used in machine
processes such as cognitive computing, machine translation, text-to-speech, key data and text
mining. It is mainly a field of research in character recognition, artificial intelligence and
computer vision. In this project, recognition is performed on a Raspberry Pi: characters are
recognized using the Tesseract engine and Python programming, and the result is played back as
audio. To use OCR for pattern recognition and perform document image analysis (DIA), we use
information in grid format in a virtual digital library's design and construction. This work
mainly focuses on an OCR-based automatic book reader for the visually impaired using the
Raspberry Pi.
Keywords: Optical character recognition (OCR), Document image analysis (DIA), Speech
synthesis (TTS)

1. INTRODUCTION

Blind people are unable to perform visual tasks. For instance, text reading requires the use of a
braille reading system or a digital speech synthesizer (if the text is available in digital
format). The majority of published printed works do not include braille or audio versions, and
digital versions are still a minority. Thus, the development of a mobile application that can
perform image-to-speech conversion, whether the text is written on a wall, on a sheet of paper or
on another support, has great potential and utility. The technology of optical character
recognition (OCR) enables the recognition of text from image data. This technology has been
widely used on scanned or photographed documents, converting them into electronic copies, which
one can edit, search, play back and easily carry. The technology of speech synthesis (TTS)
enables a text in digital format to be synthesized into a human voice and played through an audio
system. The objective of TTS is the automatic conversion of unrestricted sentences into spoken
discourse in a natural language, resembling the spoken form of the same text by a native speaker
of the language. This technology has made significant progress over the last decade, with many
systems able to generate synthetic speech very close to the natural voice. Research in the area
of speech synthesis has grown as a result of its increasing importance in many new
applications [1-5].


PROPOSED SYSTEM

Fig.1 Optical character recognition (OCR) with a portable scanner.

Optical character recognition (also optical character reader, OCR), shown in fig.1, is the
mechanical or electronic conversion of images of typed, handwritten or printed text into
machine-encoded text, whether from a scanned document, a photo of a document, a scene photo (for
example the text on signs) or from subtitle text superimposed on an image (for example from a
television broadcast). It is widely used as a form of information entry from printed paper data
records, whether passport documents, invoices, bank statements, computerized receipts, business
cards, mail, printouts of static data, or any suitable documentation. Digitizing printed text is
a common way to make it electronically editable, searchable, more compact to store and easy to
display on-line, and it also feeds machine processes such as cognitive computing, machine
translation, (extracted) text-to-speech, key data and text mining. OCR is a field of research in
pattern recognition, artificial intelligence and computer vision. Early versions had to be
trained with images of each character and worked on only one font at a time. Advanced systems
achieve a high degree of recognition accuracy for most fonts and support a variety of digital
image file formats as input. Some systems can reproduce formatted output that closely matches the
original page, including images, columns and other non-textual components.

CHARACTER RECOGNITION:

There are two basic types of core OCR algorithm, either of which may produce a ranked list of
candidate characters.
MATRIX MATCHING:
Matrix matching is also known as "pattern matching", "pattern recognition", or "image
correlation". It involves comparing an image to a stored glyph on a pixel-by-pixel basis; it
relies on the input glyph being correctly isolated from the rest of the image and on the stored
glyph being in a similar font and at the same scale. This technique works best with typewritten
text and does not work well when new fonts are encountered. This is the technique the early
physical photocell-based OCR implemented.
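As a toy illustration of the idea (a sketch, not the paper's implementation; the 5x5 glyph grids below are invented for the example), matrix matching amounts to counting pixel agreements between the binarized input glyph and each stored template:

```python
# Toy sketch of matrix matching: compare a binarized input glyph to stored
# glyph templates pixel by pixel and return candidates ranked by similarity.
# The 5x5 glyph grids are illustrative, not taken from any real font.

TEMPLATES = {
    "I": [
        "..#..",
        "..#..",
        "..#..",
        "..#..",
        "..#..",
    ],
    "L": [
        "#....",
        "#....",
        "#....",
        "#....",
        "#####",
    ],
}

def match_score(glyph, template):
    """Fraction of pixels on which glyph and template agree."""
    total = agree = 0
    for g_row, t_row in zip(glyph, template):
        for g_px, t_px in zip(g_row, t_row):
            total += 1
            if g_px == t_px:
                agree += 1
    return agree / total

def recognize(glyph):
    """Return candidate characters ranked by similarity, best first."""
    return sorted(TEMPLATES,
                  key=lambda ch: match_score(glyph, TEMPLATES[ch]),
                  reverse=True)

noisy_L = [
    "#....",
    "#....",
    "#....",
    "#....",
    "####.",   # one pixel differs from the stored "L"
]
print(recognize(noisy_L)[0])  # -> L
```

Because matching is strictly pixel-aligned, even a small change of font, scale or position lowers the score sharply, which is why the technique suits fixed-pitch typewritten text.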

FEATURE EXTRACTION:
Feature extraction decomposes glyphs into "features" such as lines, closed loops, line direction,
and line intersections. Extracting features reduces the dimensionality of the representation and
makes the recognition process computationally efficient. These features are compared with an
abstract vector-like representation of a character, which might reduce to one or more glyph
prototypes. General techniques of feature detection in computer vision are applicable to this
type of OCR, which is commonly seen in "intelligent" handwriting recognition and indeed most
modern OCR software. Nearest-neighbor classifiers such as the k-nearest neighbors algorithm are
used to compare image features with stored glyph features and choose the nearest match. OCR
software such as Cuneiform and Tesseract uses a two-pass approach to character recognition. The
second pass, "adaptive recognition", uses the letter shapes recognized with high confidence on
the first pass to better recognize the remaining letters on the second pass. This method is
beneficial where the font is distorted (e.g. blurred or faded) or of low quality. OCR results are
commonly stored in the standardized ALTO format, a dedicated XML schema maintained by the United
States Library of Congress.
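A minimal sketch of the feature-based approach (the feature vectors below are invented for illustration, not measured from real glyphs): each glyph is reduced to a few counted features, and a nearest-neighbor rule picks the closest stored prototype:

```python
import math

# Illustrative feature-based recognition: each glyph is reduced to a small
# feature vector, and a nearest-neighbor classifier picks the closest stored
# prototype. Feature order: (straight lines, closed loops, intersections).
STORED_FEATURES = {
    "L": (2, 0, 1),   # two lines, no loop, one corner
    "O": (0, 1, 0),   # one closed loop
    "B": (1, 2, 2),   # one line, two loops, two junctions
}

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(features):
    """Return the stored character whose feature vector is closest."""
    return min(STORED_FEATURES,
               key=lambda ch: distance(features, STORED_FEATURES[ch]))

print(nearest((0, 1, 0)))  # a glyph with one closed loop -> O
```

Unlike matrix matching, the comparison happens in feature space, so moderate changes of font or scale leave the feature vector (and therefore the classification) largely unchanged.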

Fig.2. Block diagram of the proposed system

Camera:
Camera is used to capture the page or the screen of the monitor shown in fig 2.
Pre-processing:
A program that processes its input data to produce output that is used as input to another
program, such as a compiler.
Background subtraction:

Background subtraction, also known as foreground detection, is a technique in the fields of image
processing and computer vision wherein an image's foreground is extracted for further processing
(e.g. text recognition).
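The simplest form of this is frame differencing, sketched below in plain Python on tiny grayscale grids (a toy example; practical systems, such as the statistical background models in OpenCV, are more robust but follow the same idea):

```python
# Minimal background subtraction by frame differencing: pixels that differ
# from a reference background frame by more than a threshold are marked as
# foreground.

def subtract_background(frame, background, threshold=30):
    """Return a binary mask: 1 where the pixel belongs to the foreground."""
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(f_row, b_row)]
        for f_row, b_row in zip(frame, background)
    ]

background = [
    [200, 200, 200],
    [200, 200, 200],
]
frame = [
    [200,  40, 200],   # dark text pixels appear in the middle column
    [200,  45, 200],
]
print(subtract_background(frame, background))  # -> [[0, 1, 0], [0, 1, 0]]
```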

Power unit:
The power supply is rated at about 5.25 V and 2.5 A.

Ultrasonic sensor:
It is used to measure, without contact, the distance between the page and the sensor.
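The arithmetic behind such a sensor is simple round-trip timing; a hedged sketch (assuming a common trigger/echo module such as an HC-SR04, which the paper does not name):

```python
# An ultrasonic sensor emits a ping and reports the echo pulse duration,
# i.e. the round-trip time of sound, so:
#     one-way distance = pulse_duration * speed_of_sound / 2

SPEED_OF_SOUND_CM_PER_S = 34300  # approximate, in air at room temperature

def pulse_to_distance_cm(pulse_duration_s):
    """Convert an echo pulse duration (seconds) to one-way distance (cm)."""
    return pulse_duration_s * SPEED_OF_SOUND_CM_PER_S / 2

# A 1 ms echo corresponds to roughly 17.15 cm between sensor and page.
print(pulse_to_distance_cm(0.001))
```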
Raspberry Pi:
It is the single-board computer that runs the device. It is capable of doing the entire job of an
average desktop computer.
OpenCV:
OpenCV is a library of programming functions for real-time computer vision. It is built to
provide a common infrastructure for computer vision applications and to accelerate the use of
machine perception in commercial products.
Neuro-OCR:
It performs the mechanical or electronic conversion of images of typed, handwritten or printed
text into machine-encoded text.
Speech synthesizer:
It produces an artificial rendering of the human voice.
Audio Amplifier:
It amplifies a low-power electronic audio signal.
Headset:
It is used to get the audio output.

METHODOLOGY:

Fig3.Methodology

Image acquisition:
In this step, images of the text are captured by the inbuilt camera. The quality of the captured
image depends on the camera used. We are using the Raspberry Pi's 5 MP camera with a resolution
of 2592x1944, as shown in fig.3.
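A quick sanity check that the stated resolution is consistent with the 5 MP rating:

```python
# Pixel count of the Raspberry Pi camera at its stated still resolution.
width, height = 2592, 1944
pixels = width * height
print(pixels, round(pixels / 1_000_000, 2))  # 5038848 pixels, about 5.04 MP
```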

Image pre-processing:

This step consists of color-to-grayscale conversion, edge detection, noise removal, warping and
cropping, and thresholding. The image is converted to grayscale because many OpenCV functions
require a grayscale image as the input parameter. A bilateral filter is used for noise removal.
For better detection of the contours, Canny edge detection is performed on the grayscale image.
The warping and cropping of the image are performed according to the contours. This enables us to
detect and extract only the region that contains text and to remove the unwanted background.
Finally, thresholding is done so that the image looks like a scanned document, which allows the
OCR to convert the image to text efficiently.
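Two of these steps can be sketched in plain Python for clarity (the actual pipeline would use OpenCV calls such as cv2.cvtColor, cv2.bilateralFilter, cv2.Canny and cv2.threshold on real images):

```python
# Sketch of grayscale conversion and global thresholding, the first and last
# steps of the pre-processing pipeline described above.

def to_grayscale(pixel_rgb):
    """Luminosity-weighted RGB -> gray conversion (ITU-R BT.601 weights)."""
    r, g, b = pixel_rgb
    return 0.299 * r + 0.587 * g + 0.114 * b

def binarize(gray_image, threshold=128):
    """Global thresholding: make the image look like a scanned document."""
    return [[255 if px > threshold else 0 for px in row] for row in gray_image]

rgb_image = [[(250, 250, 250), (10, 10, 10)]]   # one light, one dark pixel
gray = [[to_grayscale(px) for px in row] for row in rgb_image]
print(binarize(gray))  # -> [[255, 0]]
```

In the real pipeline an adaptive threshold usually works better than a single global one, because lighting across a photographed page is rarely uniform.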

Fig.4.Text to audio conversion

Image to text conversion:

Fig.4 shows the flow of text-to-speech conversion. The first block comprises the image
pre-processing modules and the OCR. It converts the pre-processed image, which is in .png form,
to a .txt file. We are using Tesseract OCR.
Text to speech conversion:
The second block is the voice-processing module. It converts the .txt file to an audio output.
Here, the text is converted to speech using a speech synthesizer called Festival TTS. The
Raspberry Pi has an on-board audio jack; the on-board audio is generated by a PWM output.

EXPERIMENTAL RESULT
The screen readers of existing systems have several disadvantages; to overcome them, we propose
an OCR dictating system which converts text to audible speech. Our idea is to implement a
hand-held computer for reading aloud to the blind. It is implemented using a Raspberry Pi with an
HD camera and a Bluetooth headset. The camera captures the text image in front of it, converts it
to audio and plays the sound through the headset. Every word in the text line is identified via
OCR and heard via TTS. The system can read printed texts such as books and papers. It can also
read blurred, low-contrast and over-exposed images, as shown in fig 5.


Fig 5: Output of the proposed system

CONCLUSION

This work presents the implementation of OCR using a Raspberry Pi for blind people, covering the
OCR and TTS stages, to create an application that was gradually improved and refined over the
course of the work. An analysis was made of the OCR and TTS technologies used in the development
of the application, in order to know the methods behind them and to understand in greater detail
the mechanisms that perform optical character recognition on images and speech synthesis of
texts. The work consisted of the construction of an application composed of several parts,
integrating the system of image capture by the web camera, an OCR framework for recognition of
the captured text, and a TTS process that synthesizes the result. Optimizations carried out to
improve outcomes resulted in a more efficient application, capable of responding to the challenge
set by the theme of the work: a camera that reads texts for the blind.

REFERENCES
[1] Dr. AntoBennet, M, Sankaranarayanan S, Ashokram S, Dinesh Kumar T R, "Testing of Error
Containment Capability in CAN Network", International Journal of Applied Engineering Research,
Volume 9, Number 19 (2014), pp. 6045-6054.
[2] Dr. AntoBennet, M, SankarBabu G, Natarajan S, "Reverse Room Techniques for Irreversible Data
Hiding", Journal of Chemical and Pharmaceutical Sciences 08(03): 469-475, September 2015.
[3] Dr. AntoBennet, M, Sankaranarayanan S, SankarBabu G, "Performance & Analysis of Effective
Iris Recognition System Using Independent Component Analysis", Journal of Chemical and
Pharmaceutical Sciences 08(03): 571-576, August 2015.
[4] Dr. AntoBennet, M, Sankaranarayanan S, SankarBabu G, "Performance & Analysis of Effective
Iris Recognition System Using Independent Component Analysis", Journal of Chemical and
Pharmaceutical Sciences 08(03): 571-576, August 2015.
[5] Dr. AntoBennet, M, Suresh R, Mohamed Sulaiman S, "Performance & Analysis of Automated
Removal of Head Movement Artifacts in EEG Using Brain Computer Interface", Journal of Chemical
and Pharmaceutical Research 07(08): 291-299, August 2015.
