
© 2018 IJRAR July 2018, Volume 5, Issue 3 www.ijrar.org (E-ISSN 2348-1269, P-ISSN 2349-5138)

Voice Assisted Text Reading System for Visually Impaired Persons
1 Tejashree Bagayatkar, 2 Siddhesh Shetye, 3 Hrushikesh Tamhankar
1, 2, 3 Department of Electronics and Telecommunication Engineering,
SSPM’s College of Engineering, Kankavli, Maharashtra, India

________________________________________________________________________________________________________

Abstract: Speech and text are the main media of human communication. People with poor vision, however, can gather
information from voice. This project proposes a camera-based assistive text reading system that helps a visually impaired person read
the text present in a captured image. A Logitech camera captures the image. The captured image is converted into a scanned image
using OpenCV (Open Source Computer Vision) software. The scanned image is then converted into text with the help of the Tesseract
OCR (Optical Character Recognition) software. A TTS (Text to Speech) engine transforms the text into speech.

Keywords – Logitech Camera, Raspberry Pi, Open Source Computer Vision (OpenCV), Optical Character Recognition (OCR), Text to
Speech (TTS) Engine.

________________________________________________________________________________________________________
I. INTRODUCTION
About 2-3 percent of the world's population is blind or has low vision [6]. Blind people have their own script, Braille,
which is somewhat difficult to learn. This paper proposes a voice assisted text reading system for visually impaired
persons. This technology can help the millions of people in the world who experience a significant loss of vision.

II. BLOCK DIAGRAM

[Block diagram: Image Capturing → Image Processing and Preprocessing → Text Extraction → Text to Speech Converter → Speech Output]

Figure 1: Basic Block Diagram

IJRAR1601009 International Journal of Research and Analytical Reviews (IJRAR) www.ijrar.org 25



2.1 Image Capturing

In this step, an image of the text is captured using a high-resolution camera. The captured image is then passed to the image
processing stage.

2.2 Image Processing and Preprocessing

In image processing, unwanted noise is removed by applying appropriate techniques such as thresholding, bending, cutting,
dilation, filtering, and edge detection.

• Resizing Images
Images can be easily scaled up and down using OpenCV. This operation is useful for training deep learning models when we
need to convert images to the model’s input shape.

• Filtering
Filtering is a technique for modifying or enhancing an image. It includes smoothing, sharpening, and edge
enhancement of the image, and it is implemented as part of image processing.

• Simple Thresholding
It is an image segmentation method. It compares each pixel value with a threshold value and updates the pixel accordingly. OpenCV
supports multiple variations of thresholding.

• Adaptive Thresholding
In adaptive thresholding, different threshold values are used for different parts of the image. This method gives better
results for images with varying lighting conditions.

• Edge Detection
Edges are the points in an image where the image brightness changes sharply or has discontinuities. Edges are very useful features
of an image that can be used for different applications like classification of objects in the image and localization.

• Image Filtering
In image filtering, a pixel value is updated using its neighboring values.

• Smoothing Images
Images are blurred with various low-pass filters. Custom-made filters can also be applied to images through 2D convolution.

• Geometric Transformations of Images
Geometric transformations such as translation, rotation, and affine transformation can be applied to images.
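The difference between simple and adaptive thresholding can be sketched in a few lines. The example below is a minimal illustration in plain Python, not the code used in this project: the function names, the window radius, and the offset are assumptions chosen for the example, and the image is a small list of grayscale values. OpenCV provides the same operations as cv2.threshold and cv2.adaptiveThreshold, which handle image borders differently from this sketch.

```python
# Illustrative sketch only: plain-Python simple and adaptive (mean) thresholding.
# An image is a list of rows; each row is a list of grayscale values in 0..255.


def simple_threshold(image, thresh, max_value=255):
    """Set pixels brighter than `thresh` to `max_value` and all others to 0."""
    return [[max_value if px > thresh else 0 for px in row] for row in image]


def local_mean(image, r, c, radius):
    """Mean of the window centred on (r, c), clipped at the image border."""
    rows, cols = len(image), len(image[0])
    window = [
        image[i][j]
        for i in range(max(0, r - radius), min(rows, r + radius + 1))
        for j in range(max(0, c - radius), min(cols, c + radius + 1))
    ]
    return sum(window) / len(window)


def adaptive_threshold(image, radius=1, offset=0, max_value=255):
    """Compare each pixel with its own neighbourhood mean minus `offset`."""
    return [
        [
            max_value if px > local_mean(image, r, c, radius) - offset else 0
            for c, px in enumerate(row)
        ]
        for r, row in enumerate(image)
    ]


if __name__ == "__main__":
    # One scan line whose lighting grows from dim (left) to bright (right).
    # The values 10 (index 2) and 120 (index 8) stand for dark ink pixels.
    page = [[40, 40, 10, 40, 80, 120, 160, 200, 120, 200, 200]]
    print(simple_threshold(page, 127))
    print(adaptive_threshold(page, radius=1, offset=5))
```

With a single global threshold of 127, the whole dim side of the line becomes black, so the ink pixel at index 2 cannot be told apart from its background. The adaptive version marks only the two ink pixels as black, because each pixel is compared with the brightness of its own neighbourhood.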

2.3 Text Extraction

In text extraction, the scanned image is converted into text or editable data. The Tesseract OCR engine is used for this purpose.
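One possible way to call Tesseract from a program is through its command-line tool. The sketch below is an assumption about how this step could be wired up, not the implementation used in this project: the helper names (build_tesseract_command, clean_ocr_text, extract_text) and the file name are hypothetical. It relies only on the documented command form "tesseract <image> stdout -l <language>", and it requires the tesseract program to be installed when extract_text is actually called.

```python
# Illustrative sketch only: running the Tesseract command-line tool from Python.
import subprocess


def build_tesseract_command(image_path, lang="eng"):
    """Command line that makes Tesseract print the recognized text to stdout."""
    return ["tesseract", image_path, "stdout", "-l", lang]


def clean_ocr_text(raw):
    """Collapse line breaks, form feeds and repeated spaces into single spaces."""
    return " ".join(raw.split())


def extract_text(image_path, lang="eng"):
    """Run Tesseract on a preprocessed image and return the cleaned text."""
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode the output as text
        check=True,           # raise an error if Tesseract fails
    )
    return clean_ocr_text(result.stdout)


if __name__ == "__main__":
    # "page.png" is a hypothetical file name used only for illustration.
    print(build_tesseract_command("page.png"))
```

Cleaning the whitespace before the text reaches the speech stage avoids unnatural pauses at the line breaks of the printed page.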

2.4 Text to Speech Converter

This stage converts the extracted text into voice or audio form. The eSpeak TTS engine is used to convert text to
speech.
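The text-to-speech step can be sketched in the same way with the eSpeak command-line tool. This is a hedged example rather than the project's code: the helper names, the speaking rate of 140 words per minute, and the voice "en" are assumptions for illustration. The -v (voice) and -s (speed in words per minute) options are standard espeak options, and the espeak program must be installed for speak to produce sound.

```python
# Illustrative sketch only: speaking text sentence by sentence with eSpeak.
import re
import subprocess


def split_sentences(text):
    """Split text after '.', '!' or '?' so each sentence is spoken on its own."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def build_espeak_command(sentence, words_per_minute=140, voice="en"):
    """Command line that makes eSpeak read one sentence aloud."""
    return ["espeak", "-v", voice, "-s", str(words_per_minute), sentence]


def speak(text):
    """Read the text aloud, one sentence at a time."""
    for sentence in split_sentences(text):
        subprocess.run(build_espeak_command(sentence), check=True)


if __name__ == "__main__":
    for sentence in split_sentences("This is page one. Turn the page!  Done"):
        print(build_espeak_command(sentence))
```

Speaking one sentence at a time keeps the delay before the first audio short, since the engine does not have to process a whole page before it starts.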

2.5 Speech Output

The output of the TTS engine is amplified by an audio amplifier and then given to the speaker or Bluetooth headset.


III. PROPOSED METHODOLOGY

[Block diagram: Power Supply, Camera, Raspberry Pi (OpenCV, Image Processing, OCR, TTS), Speaker]

Figure 2: Proposed Block Diagram

3.1 Logitech Camera

Figure 3: Logitech Camera


Specifications:
• Resolution: 5 megapixels
• Focus type: Fixed
• Connectivity: USB
• Frame rate: Up to 30 frames per second

3.2 Raspberry Pi

Figure 4: Raspberry Pi

Specifications:
• Quad-core 1.2 GHz Broadcom BCM2837 64-bit CPU
• 1 GB RAM
• 40-pin extended GPIO
• 4 USB 2.0 ports
• Full-size HDMI
• CSI camera port for connecting a Raspberry Pi camera
• Micro SD port for loading the operating system and storing data
• Upgraded switched Micro USB power source, up to 2.5 A

3.3 OpenCV (Open Source Computer Vision)

OpenCV is a programming library for computer vision. In this system it is used to scan the captured images. It also performs
image processing operations such as filtering, thresholding, edge detection, dilation, enhancement, and restoration. OpenCV
provides interfaces for a variety of programming languages such as C, C++, Python, Java, and MATLAB. It is available on platforms
such as Windows, Linux, Android, and macOS.

3.4 OCR (Optical Character Recognition)


OCR is the electronic conversion of images of typed, handwritten, or printed text, for example from a scanned document, into
machine-encoded text. It can convert scanned images into editable text. For this project we use the Tesseract OCR engine to extract
the recognized text.

3.5 TTS (Text to Speech)


TTS technology is used to help blind people and people with low vision. The extracted text is converted into speech by a
speech synthesizer called a TTS engine. We use the eSpeak TTS engine to convert the text into speech in the form of sound or
audio.


3.6 Power Supply


In this system a power bank supplies power to the Raspberry Pi. A power bank with a capacity of 16,000 mAh is used, which
makes the model portable. The power bank lasts for a maximum of 8 hours on one charge. After the battery is discharged, the
user can recharge the power bank. The power bank provides a steady voltage of 5 V and a current of 2 A. It makes the system
more flexible and compact. The user can use this system anywhere and at any time, because the energy stored in the power
bank serves as the power source for the system.
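The quoted figures are consistent with each other under an idealized model: a 16,000 mAh bank that delivers a constant 2 A (2,000 mA) is empty after 8 hours. The short sketch below reproduces this arithmetic. It is an assumption-laden estimate, not a measurement: the function names are hypothetical, the full rated capacity is assumed to be available at the output, and conversion losses are ignored, so the real runtime would be lower at a constant 2 A load.

```python
# Illustrative sketch only: idealized runtime and power figures for the power bank.


def runtime_hours(capacity_mah, load_ma):
    """Ideal runtime in hours: rated capacity divided by a constant load current."""
    if load_ma <= 0:
        raise ValueError("load current must be positive")
    return capacity_mah / load_ma


def output_power_w(voltage_v, current_a):
    """Electrical power delivered at the output, in watts."""
    return voltage_v * current_a


if __name__ == "__main__":
    print(runtime_hours(16000, 2000))  # capacity in mAh, load in mA
    print(output_power_w(5, 2))        # 5 V output at 2 A
```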

Figure 5: Flowchart of Proposed Methodology

IV. APPLICATION
1. Blind people can read books and documents with the help of this device.
2. Speech is often more powerful than a written message. Speech synthesizers can therefore be used in measurement
and control systems.
3. The text-to-speech system supports man-machine communication.
4. Text-to-speech synthesizers have unique features that make them excellent laboratory tools for basic and applied research.

V. CONCLUSION
The text-to-speech converter device converts the text in a captured image into audio or voice output. This project is useful for
blind people and people with low vision.


REFERENCES

[1] S. I. Shirke and S. V. Patil, “Portable camera based text reading of objects for blind persons,” International Journal of Applied Engineering
Research, vol. 13, no. 17, pp. 12995–12999, 2018.
[2] A. Goel, A. Sehrawat, A. Patil, P. Chougule, and S. Khatavkar, “Raspberry Pi based reader for blind people,” 2018.
[3] M. Rajesh, B. K. Rajan, A. Roy, K. A. Thomas, A. Thomas, T. B. Tharakan, and C. Dinesh, “Text recognition and face detection aid for
visually impaired person using Raspberry Pi,” in 2017 International Conference on Circuit, Power and Computing Technologies (ICCPCT).
IEEE, 2017, pp. 1–5.
[4] S. Aaron James, S. Sanjana, and M. Monisha, “OCR based automatic book reader for the visually impaired using Raspberry Pi,” International
Journal of Innovative Research in Computer and Communication, vol. 4, no. 7, 2016.
[5] R. Ani, E. Maria, J. J. Joyce, V. Sakkaravarthy, and M. Raja, “Smart specs: Voice assisted text reading system for visually impaired persons
using TTS method,” in 2017 International Conference on Innovations in Green Energy and Healthcare Technologies (IGEHT). IEEE, 2017,
pp. 1–6.
[6] Srinivas M, Gangaram D, Varun Reddy R, AtulKewat K, “An Interactive Smart Glass,” in International Journal of Advanced Networking &
Applications (IJANA).
