
BSc PROJECT

CS DEPARTMENT
PROJECT ID: UQU-CS-2022F-06

ABSIR: Image Processing-Based And Machine Learning-Based
Application For Blind And Color-Blind People

Team members
Amal Ahmed Alhutaly 439001134

Amal Ali Alyatimi 439008504

Raghad Raed Abuhanoda 439003837

Razan Abdulrahman Alsulami 439009828

Ruba Abed Al-lehibi 439001527

Supervised by:
Dr. Seereen Mohammedtaher Noorwali

Department of Computer Science

Faculty of Computer and Information Systems

Umm Al-Qura University, KSA


Contact Information

This project report is submitted to the Department of Computer Science at Umm Al-Qura
University in partial fulfillment of the requirements for the degree of Bachelor of Science in
Computer Science.

Authors:
Raghad Abuhanoda
raghadraedabu@gmail.com
Razan Alsulami
razanasalsulami@gmail.com
Amal Alhutaly
amal_almrzoqi303@gmail.com
Amal Alyatimi
amal.alyatimi@gmail.com
Ruba Al-lehibi
roro-2000@outlook.com

University Supervisor:
Seereen Noorwali
Computer Science

Department of Computer Science


Faculty of Computer and Information Systems
Umm Al-Qura University
Kingdom of Saudi Arabia
Internet: https://uqu.edu.sa

Intellectual Property Right Declaration

This is to declare that the work under the supervision of Seereen Noorwali, having the title "ABSIR:
machine learning based android application for blind and color-blind people," carried out in partial
fulfillment of the requirements of the Bachelor of Science in Computer Science, is the sole property of
Umm Al-Qura University and the respective supervisor and is protected under intellectual property
right laws and conventions. It can only be considered/used for purposes such as extension for further
enhancement, product development, or adoption for commercial/organizational usage with the
University's and the respective supervisor's permission.

The above statement applies to all students and faculty members.

Date:

Authors:

Name: Raghad Abu Hanoda Signature:___________________

Name: Razan Alsulami Signature:___________________

Name: Amal Alhutaly Signature:___________________

Name: Amal Alyatimi Signature:___________________

Name: Ruba Al-lehibi Signature:___________________

Anti-Plagiarism Declaration

This is to declare that the above publication produced under the supervision of Seereen Noorwali
having the title of "ABSIR: machine learning based android application for blind and color-blind
people" is the sole contribution of the authors and no part hereof has been reproduced illegally
(cut and paste) which can be considered as Plagiarism. All referenced parts have been used to
argue the idea and have been cited properly. We will be responsible and liable for any
consequence if a violation of this declaration is proven.

Date:

Authors:

Name: Raghad Abu Hanoda Signature:___________________

Name: Razan Alsulami Signature:___________________

Name: Amal Alhutaly Signature:___________________

Name: Amal Alyatimi Signature:___________________

Name: Ruba Al-lehibi Signature:___________________

Acknowledgments

First, we thank Almighty Allah for granting us success and facilitating the completion of this
report. Second, we express our gratitude to our supervisor, Dr. Seereen Noorwali, for her guidance,
suggestions, and assistance during our work on the graduation project. Finally, we thank the members
of the project team for their effort and cooperation.

Abstract

Visually impaired individuals often encounter various difficulties in their daily routines. ABSIR,
however, offers a comprehensive solution to overcome these challenges and assist in their daily
lives. The application has been developed to identify objects for the visually impaired, offering them
a valuable everyday tool. One of its prominent features is customizable color-filtering options
designed to meet the specific needs of color-blind individuals.

In addition to addressing practical challenges, ABSIR prioritizes the emotional and mental
well-being of its users through a supportive and encouraging voice. This personalized touch adds
value to the application and serves as a reliable and convenient tool to help visually impaired
individuals navigate daily tasks with increased ease and safety.

ABSIR has the potential to significantly enhance the quality of life for visually impaired
individuals by providing the necessary resources and support to effectively handle daily
challenges, whether it be recognizing objects, navigating unfamiliar spaces, or managing daily
stress.

In conclusion, ABSIR is a comprehensive solution that addresses the difficulties faced by
visually impaired individuals and offers innovative features with a personalized approach. This
application provides independence, safety, and overall well-being in daily life, making it a
valuable resource for anyone looking to improve their daily experiences and lead a more
fulfilling life.

Keywords: Machine Learning, Computer Vision, Image Processing, Artificial Intelligence.

List Of Contents

Chapter 1: Introduction........………………...…………….........................….....14

1.1 Project Motivation..................................................................…….……...…...…….............................15

1.2 Purpose Of This Project.............................................................…………………...........….................15

1.3 Purpose Of This Document.............................................................………….......………....................16

1.4 Overview Of This Document....................................................…….……………..............…..............16

1.5 Background....................................................................................…….…….………..........................17

1.6 Brand Story.........................................................................................…..………………..…................18

1.7 Existing System.............................................................…………..……..........………….....................19

1.7.1 Related Work.....................................................................................……….………..…...................19

1.7.1.1 Software..….....................................................................................................................................20

1.7.1.2 Hardware........…..............................................................................................................................22

1.8 Literature Review...................................................................................................................................26

Chapter 2: Dataset....................................…………………...….……......…........29

2.1 Introduction To Dataset.......................................………..............………………........…............….....30

2.2 COCO Dataset…….............................................………..............………………........…............….....30

2.3 Examples And Samples.......................................………..............………………........…............….....31

Chapter 3: System Analysis.....................................…………………......…........33

3.1 System Requirements..........................................………..............………………........…............….....34

3.1.1 Clients, Customers, and Users.................................……….…………………..................................34

3.1.2 Functional and Data Requirements........................…...….……………………..........…...................34

3.1.3 Non-Functional Requirements............................…...…….……………….......…….........................35

3.1.3.1 Look And Feel Requirements..................…………..............…………...….....………..................35

3.1.3.2 Usability Requirements............................……......……..................…….................……..….........35

3.1.3.3 Security Requirements............................…….........……........…………..........………..................35

3.1.3.4 Performance Requirements..........….......…….......……....………………..............……................36

3.1.3.5 Portability Requirements...........................……..…..................……………......………….............36

3.1.3.6 Availability Requirements.....................…….….................……....................………….................36

3.2 Proposed Solution.…………………………………………….....……..……...…………..….............36

Chapter 4: System Design.....……….…………....……...….………......…............. 37

4.1 Design Constraints.………………………..…………....……..……………..…….........…………….38

4.1.1 Hardware and Software Environment.……………....……………………..…..…...............……….38

4.2 Architectural Strategies.……………………………....……..……………………….......…….……...38

4.2.1 Reuse of Existing Software Components.………....……………………...………..............………..38

4.2.3 Development Method..……………....……..…………..………………….….…………..............…39

4.3 Use Case Diagram.………………..……....…….………………..…..………………….........……….40

4.4 Use Case Tables.………………………....…….…………….….………………..………......……….41

4.5 Data Flow Diagram - Context diagram...........……......……..…….………..........................................46

4.5.1 Data Flow Diagram - Level 1 diagram................…..........…….....…………………........................47

4.5.2 Data Flow Diagram - Level 2 diagrams..............…….........…..………….........................................48

4.6 Class Diagram.........................................…….......…..............…………..............................................52

4.7 Activity Diagram..............................……............….....................…………………….........................53

4.8 ER Diagram……..............................……............….....................…………………….........................54

4.9 Sequence Diagrams.............….........……..........................…..……..…................................................55

4.10 Architecture Diagram....................……..................................……………….....................................57

4.11 Prototype.....…....……...................…….………...........................…………………………………..58

Chapter 5: Implementation……………………….……………………………..61

5.1 Machine Learning..................................................................................................................................62

5.1.1 Object Detection……………………………………………………………………………………..62

5.1.2 YOLO Algorithm……………………………………………………………………………………63

5.1.3 Packages and Libraries Used In Machine Learning…………………………………………………63

5.2 Image Processing...................................................................................................................................64

5.2.1 Application Programming Interface (API)..........................................................................................64

5.2.2 Packages and Libraries Used In Image Processing.…………………………………………………65

5.3 ABSIR Interfaces...................................................................................................................................66

5.4 Future Work....................……................................................................................................................70

5.5 Conclusion................................................................……………..……................................……........70

References………………………………………………………………………………………………....72

Appendix…………………………………………………………………………………………………..75

List Of Figures

Figure 1: Process Of The ABSIR Application…………………………………………………………….16

Figure 2: ABSIR Logo………………………………………………………………..…………………...18

Figure 3: KNFB Reader…………………………………………………………………………………...20

Figure 4: Seeing AI Application…………………………………………………………………………..20

Figure 5:NoonGil Application…………….………………………………………………………………21

Figure 6: Be My Eyes Application……………………………………………………………………......21

Figure 7: TapTapSee Application………………………………………………………………………....22

Figure 8: Orcam Device…………………………………………………...………………………...……23

Figure 9: Maptic Device……………………………………………………………………………......…23

Figure 10: WeWALK Device……………………………………………………………………………...24

Figure 11: Mobiles Sound Beacon………………………………………………………………………...24

Figure 12: Examples Of The Dataset Images……………………………………………………………..31

Figure 13: Categories In The COCO Dataset……………………………………………………………..32

Figure 14: Use Case Diagram……………………………………………………………………………..40

Figure 15: DFD Context Level………….….….………………………………………………………….46

Figure 16: DFD Level 1…………………………………………………………………………………...47

Figure 17: DFD Level 2. Object Recognition……………………………………………………………..48

Figure 18: DFD Level 2. Read Menu By QR……………………………………………………………..49

Figure 19: DFD Level 2. Color Filter (Image Processing)..........................................................................50

Figure 20: DFD Level 2. Recording Relative’s Voice…………………………………………………….51

Figure 21: Class Diagram…………………………………………………………………………………52

Figure 22: Activity Diagram………………………………………………………………………………53

Figure 23: ER Diagram……………………………………………………………………………………54

Figure 24: Object Recognition Sequence Diagram…………………………………………………….…55

Figure 25: Filter (Image Processing) Sequence Diagram…………………………………………………55

Figure 26: QR Reader Sequence Diagram………………………………………………………………...56

Figure 27: Register/Login Sequence Diagram…………………………………………………………….56

Figure 28: Record Voice Sequence Diagram……………………………………………………………...57

Figure 29: Architectural Diagram…………………………………………………………………………57

Figure 30: Welcome Interfaces...………………………………………………………………………….58

Figure 31: (a) Register/Login Interface, (b) Register Interface, (c) OTP Interface.………….….….…….58

Figure 32: Voice Recording Interfaces …...……………………………………………………………….59

Figure 33: (a) Login Interface, (b) Forget Password Interfaces………..………………………………….59

Figure 34: (a) Settings Interface, (b) Voice Settings Interface…………………………………………….60

Figure 35: Home Screen Interfaces….…………………………………………………………………….60

Figure 36: How the API works……………………………………………………………………………64

Figure 37: How each type of color-blind sees……………………………………………………………..65

Figure 38: Welcome Interfaces….………...................................................................................................66

Figure 39: Register/Login Interface…..………...........................................................................................66

Figure 40: Register Interface…....................................................................................................................67

Figure 41: Login Interface………………..................................................................................….............67

Figure 42: Object Recognition Interface..............................................................................…....................68

Figure 43: Color Filtering Interfaces………….......................................................….................................68

Figure 44: Voice Recording Interfaces…..…...............................................................................................69

Figure 45: OTP, settings, and voice options Interfaces..........................................................................69

List Of Tables

Table 1: Sign up Use Case………...............................................................................................................41

Table 2: Log in Use Case............................................................................................................................41

Table 3: Filter Color Use Case……………………………...………..........................................................42

Table 4: Recognize Object Use Case...........................................................................................................42

Table 5: Read QR Use Case …………........................................................................................................43

Table 6: Select Voice Use Case…................................................................................................................43

Table 7: Record Relative’s Voice Use Case….....................................................................…....................44

Table 8: Default Use Case…..…………………………………..…............................................................44

Table 9: Admin Processing Use Case….……………………….…............................................................45

Glossary

Chapter 1: Introduction

1.1 Project Motivation


1.2 Purpose Of This Project
1.3 Purpose Of This Document
1.4 Overview Of This Document
1.5 Background
1.6 Brand Story
1.7 Existing System
1.7.1 Related Work
1.7.1.1 Software
1.7.1.2 Hardware
1.8 Literature Review

1.1 Project Motivation
Allah has blessed us with countless blessings, and one of the greatest of these is the ability to see.
Sight allows a person to recognize what might harm him, and it is essential to daily life. Yet many
people lack this great blessing and face great difficulty in carrying out their daily tasks, which is why
an application such as ABSIR is needed to make their lives easier.

The dilemma visually impaired people face is the danger that may surround them: a sharp tool,
the edge of a staircase, or a closed door they may walk into. Any of these can harm a visually
impaired person.

Color-blind people also face difficulties, having lost an essential part of the blessing of sight:
color discrimination. Their daily challenges include identifying the colors of traffic lights, which
increases the risk of accidents, as well as choosing clothes and other tasks that reduce their quality
of life.

As application developers, we met a blind user who shared with us the difficulties she faced in her
daily life, such as the inability to recognize colors, fruits, clothes, and the ingredients of food
products. She also mentioned that she had difficulty recognizing the texts accompanying images, which
made it challenging for her to fully use and benefit from many apps and websites. These experiences
highlighted the importance of accessibility and the need for applications to consider the needs of all
users, including those with visual impairments. We are committed to incorporating these considerations
into the development of ABSIR and providing features that can help improve the accessibility and
usability of our app for blind and visually impaired users.

1.2 Purpose Of This Project


● Reduce the danger people with disabilities (blind and color blind) may face.
● Helping people with disabilities (blind and color blind) to know the objects, colors, and
texts around them.
● Helping the blind to know restaurant menu items by the menu's QR code.
● Providing a friendly voice for the blind to support their mental and emotional side.
● Motivate blind people who are reluctant to use such applications to adopt and benefit from them.

Therefore, an application had to be developed to reduce the danger facing these groups. By using
artificial intelligence, computer vision, and machine learning, ABSIR can become their eyes and
provide them with a safe and easy path to a better life. From this dilemma, the idea of our
graduation project, "ABSIR," came about.

1.3 Purpose Of This Document

This document contains the details of the project and an explanation of its requirements; all
the work is ordered and documented sequentially across all phases. The material is presented
in different ways, in both written and schematic form.

1.4 Overview Of This Document

This document includes details about the project and contains the system analysis, which consists
of the functional, non-functional, and data requirement specifications. (Figure.1) shows the process
of the ABSIR application.

1.5 Background

There is no doubt that modern technology is the engine behind the expansion and advancement
of facilities in various specializations worldwide. It positively impacts people's standard of
living, whether traditional or with special needs. Additionally, people with disabilities have
greatly benefited from modern technology, particularly in education and learning. Consequently,
it helped find new jobs for people with disabilities to be more productive and artistic in the
workplace.

For this reason, smartphones are fundamental tools for many valuable functions. They are
equipped with various sensors that provide data on movement, position, and surroundings,
making them suitable as navigation aids for blind and partially sighted people [19].

The results of an online survey were used to understand the requirements and challenges blind
and visually impaired people face in their daily lives regarding the availability and use of digital
devices. The survey was conducted among the blind and visually impaired in Saudi Arabia using
digital forms.

A total of 164 people responded to the survey, most of them using the VoiceOver function.
People were asked about the use of intelligent devices, special devices, operating systems, object
recognition apps, indoor and outdoor navigation apps, virtual digital assistive apps, the purpose
(navigation, education, etc.) of and difficulty in using these apps, the type of assistance needed,
the reliance on others in using the assistive technologies, and the level of satisfaction from the
existing assistive technologies. Most participants were 18 – 65 years old, with 13% under 18 and
3% above 65. Sixty-five percent of the participants were graduates or postgraduates, and the rest
only had secondary education. White Cane, mobile phones, Apple iOS, Envision, Seeing AI,
VoiceOver, and Google Maps were the participants' most used devices, technologies, and apps.
Navigation, at 39.6%, was the most reported purpose of the particular devices, followed by
education (34.1%) and office jobs (12.8%).[20]

1.6 Brand Story
The ABSIR application serves individuals who experience visual impairments, including
blindness, color blindness, and impaired vision. To enhance the user experience, the ABSIR logo
(Figure 2) has been designed using the Braille language and features the first letter of ABSIR,
represented by the Braille alphabet. The color selected for the logo has been verified as visible
for individuals with color blindness through the use of the Adobe potential color conflict website
[22]. In addition, the user experience has been thoroughly tested by a person who experiences
color blindness to ensure the highest level of accessibility for all users.

Figure.2: ABSIR Logo

1.7 Existing System

1.7.1 Related Work


Navigation is a vital part of every person's life. People navigate for work, education, shopping,
and many other reasons. Vision plays an essential role in navigation because it guides movement from
one spot to another. It is easy to imagine getting around without sight in familiar environments, such
as a room in one's house or a workspace; navigating unfamiliar locations, however, is far more
demanding.

World Health Organization (WHO) records indicate that about 2.2 billion people live with some form
of vision impairment globally.

Being blind or visually impaired does not imply losing the independence of traveling to and from
places whenever one wants. People with no vision or limited vision can travel independently every day
using the methods best suited to them.

One of the most important challenges is that independence for visually impaired people depends on
safe and efficient navigation. To facilitate it, they must acquire travel skills and use sources of
non-visual environmental information that people who rely on their vision rarely consider. Even so,
visually impaired people still face many challenges in day-to-day navigation. Reaching a destination
safely involves additional difficulties, such as detecting pits in front of the path, hanging
obstacles, stairs, traffic junctions, signposts on the pavement, wet floors indoors, and greasy or
slippery doorways [1].

Smartphones have revolutionized how blind or visually impaired people interact with and use
technology. Screen-reading and magnifying software, like that used on computers, lets them use these
devices independently. What is special about mobile phones and tablets, however, is that functions
which previously required standalone devices and software have now moved into apps.

1.7.1.1 Software

● KNFB Reader Application

Nowadays, many of the tasks that formerly required special software or devices can be done easily
on a phone. Using an app called the KNFB Reader (Figure.3), a printed letter can be photographed,
and the smartphone will read it out loud within seconds [12].

Figure.3: KNFB Reader

● Seeing AI Application

Seeing AI is an application for visually impaired people; it does not support the Arabic language.
It contains a set of services that help them do their work, namely:

- Identifying short text and documents, which are types of Optical Character Recognition (OCR).

- Scanning the barcode that appears on most products to identify the product and other details,
such as directions and ingredients.

The application (Figure.4) has a rating of 4.4 out of 5 in the App Store preview [1].

Figure.4: Seeing AI Application

● Noongil Application

Noongil is an app for visually impaired people. Although it is not fully implemented yet, it has
ideas we wanted to make use of in our app. It supports object identification: by raising the phone
over a book, the Noongil application (Figure.5) reads the book's text to the user (text recognition).
Noongil can also recognize shapes, colors, and people (point the phone at an object, and Noongil will
tell you what it is), and it provides voice commands to guide users in their daily tasks [2].

Figure.5: NoonGil Application

● Be My Eyes Application

It is an application for visually impaired people and does not support dealing with AI. The app
pairs the blind or low-vision user with a sighted volunteer based on language and time zone. The
first volunteer to answer the request is connected to that specific user and receives a live video
feed from the rear-facing camera of the user's smartphone.

The application (Figure.6) has a 4.8 out of 5 in the App Store preview [3].

Figure.6: Be My Eyes Application

● TapTapSee Application

This mobile camera application (Figure.7) is designed specifically for blind and visually impaired
users. TapTapSee utilizes the device's camera and VoiceOver functions to take a picture or video of
anything and identify it out loud; for example, it can help the user find out the color of an item of
clothing, with VoiceOver speaking the identification aloud.

It is an online solution and cannot work without an Internet connection. The main drawback of this
application is that recognition is very slow; in some cases, it may take more than 30 seconds. The
app has a rating of 4.2 out of 5 in the App Store preview [12].

Figure.7: TapTapSee Application

● 3D Sound Maps

For a sighted person, walking alongside the road can imply taking in each element surrounding
them. Microsoft Soundscape replicates that conduct by constructing an in-depth audio map
related to the area around someone with vision impairment.

It creates layers of context and element by drawing on region data and sound beacons and
synthesizes 3-d stereo sound to construct a continuously updating 3-d sound map of the
encompassing world[21].

1.7.1.2 Hardware

● Orcam Device

This device (Figure.8) is a pair of glasses that allows reading everything a finger indicates, is
easy to read texts, and can distinguish people's faces. However, it works only in English and
addresses a severe difficulty for all blind people who do not speak English. Moreover it does not
support providing the service for free, as the cost of the first version available in the market is
estimated at 2500 $ [10].

Figure.8: Orcam Device

● GPS Maptic Device

Maptic is a device created by Emilios Farrington-Arenas of Brunel University in London. It is a
visual sensor that blind people can wear around the neck (Figure.9), along with a series of feedback
sensors placed on clothing or around the wrist like a bracelet. The sensors connect to a
voice-controlled smartphone app that uses GPS to guide the user's movements through vibrations on the
left or right side of the body. However, this project is not for sale and is not in production [11].

Figure.9: Maptic Device

● WeWALK Device

WeWALK is an innovative smart white cane with a built-in touchpad and speaker (Figure.10). Taps
provide information and help the user spot obstructions, know when they reach the pavement or climb
stairs, or notice when somebody is standing in front of them. Using ultrasound, WeWALK can detect
obstacles above chest level, such as tree branches, phone poles, and traffic lights, and alert the
user with a vibration. However, handling a white cane in one hand while using a smartphone can be
tricky. The device is also considered expensive, as it costs $2,500 [13].

Figure.10: WeWALK Device

● Mobiles Sound Beacon

The sound beacon is a navigation device for blind or visually impaired people (Figure.11). The
device emits a periodic sound that allows a person with limited vision to locate an object (for
example, the entrance to a building). These devices can be attached to building doorways, public
transport cabin doors, traffic lights, etc. The beacon can also produce voice messages specifying the
property name, address, and next point of interest [23].

Figure.11: Mobiles Sound Beacon

Given the lack of applications that provide suitable interfaces for the blind, the lack of Arabic
language support, and the slow response times, ABSIR will address these shortcomings: it will respond
faster, use convenient and accessible user interfaces, and add features that are not available in
current applications, including:

1- Supports the Arabic language.

2- Reading menus' contents by scanning the QR code.

3- Supports the psychological aspect of the blind using a voice relative to the blind.

4- Recognizing objects and reading texts that appear in the camera view, supporting interaction with
the surroundings through artificial intelligence.

5- Warning alerts when the blind user approaches a dangerous object, providing safety of movement
while using the application.

6- The application will be provided for free to all users.

1.8 Literature Review

Many types of research have been published on systems that provide solutions to the challenges
faced by the colorblind and have been developed to help them. In this section, we focus on
discussing other research that has a similar idea to ABSIR.

A study presented in 2021 [15] describes a system that can eliminate the need for physical support.
It allows visually impaired people to easily navigate indoor and outdoor scenarios by loading
previously recorded virtual paths and guiding them through speech and sound feedback.

Furthermore, a study conducted by Nayak and C.B. in April 2020 showed that challenges faced by
visually impaired people can be addressed by simple, low-cost assistive software that runs on phones,
with everything controlled by voice commands.

There is also research conducted by Kuriakose, Shrestha, and Sandnes in 2022 [6], who developed
DeepNAVI, a navigation assistance system built on deep learning. The system provides information
about an obstacle's type, position, and distance from the user, as well as scene information. The
information in this navigation assistant is captured from the navigation environment and processed on
the local device to ensure complete data privacy; no Wi-Fi or external data network is needed to
operate the system. The system uses the "EfficientDet-Lite4" model for obstacle detection. Its dataset
comprises 20 different obstacles relevant to indoor and outdoor navigation environments, such as
benches and bookcases, and was created by collecting images from various sources, including Google
Open Images, ImageNet, and the system's own pictures.

There was also a review conducted by Manoharan in 2019 [9], describing E-Speak, an efficient and
innovative system for text-to-speech conversion. Pen-computing software achieves 80–90% accuracy on
handwriting. In this system, an image is scanned with the help of a handheld scanner and sent to an
Android phone over Bluetooth. People with reading difficulties due to a visual impairment, dyslexia,
or being pre-literate or illiterate can benefit from this system. Based on phonetics and other pre-set
instructions, a speech synthesizer converts the textual data into audio output. The system provides an
accuracy of 97% in identifying, processing, and converting the image input to textual and audio
outputs.

Speaking of blindness, blindness is not only a total lack of vision; color blindness is another form
of visual impairment. Baswaraju Swathi, Koushalya R., Vishal Roshan J., and Gowtham M. N. conducted
research in 2020 [14]. Color blindness affects approximately 1 in 12 men and 1 in 200 women worldwide.
Most people with a color vision disorder can see things as clearly as others, but they cannot see red,
green, or blue light. Some diseases, such as diabetes and multiple sclerosis, cause color blindness.
Red-green color blindness is classified into two types, protanopia and deuteranopia, while tritanopia
is known as blue-yellow color blindness. The work covers color-blind simulators such as Vischeck, the
CVD simulator (Coblis), Color Oracle, and Color Blindness Simulate Correct, as well as color-correction
techniques used for dichromacy: the LMS Daltonization Algorithm, the Color-Blind Filter Service
Algorithm, the LAB Color Corrector, and the Shifting Color Algorithm. The aim is to present a
smartphone-based experimental comparison of color correction algorithms for dichromacy viewers.

Based on the research done by L. A. Elrefaei in 2018 [18], most people with color blindness can see
things as clearly as others, but they cannot see red, green, or blue light; in sporadic cases, people
cannot see any color at all. There are different causes of color blindness. The work presents a
smartphone-based experimental comparison of color correction algorithms for dichromacy viewers. The
LMS Daltonization algorithm performs better at converting colors into colors that colorblind people
can distinguish. For people with protanopia, the LMS algorithm only changes the color of confusing
areas, with no change in brightness. The LAB Color Corrector algorithm is implemented only for
deuteranopia viewers.

There is also research conducted by Marcos Barata, Afan Galih Salman, Ikhtiar Faahakhododo, and Bayu
Kanigoro in 2018 [17]. They developed Intelligent Software Assistant, an application that helps
visually impaired or blind people access Android devices and use library resources on them. The
techniques used include the Scrum method and speech-to-text and text-to-speech methods. Because it
includes text-to-speech and speech-to-text conversion, the application can meet the needs of visually
impaired users and help them be more efficient in their daily lives. It was created as a background
application so that it continues to run while the device is running, and it includes features that
enable the blind to use it quickly and comfortably.

Some studies publish empirical feedback from users on their satisfaction with the apps they use.
One such study was provided by Suraj Senjam in 2021 [16]. Such user input enables developers to
design user-centered, accessible apps that are friendly and easy to use. Usability testing with people
who have visual impairments is critical to achieving high acceptability and adoption of apps.
Mastering the use of a smartphone is a challenging task for a person with a visual impairment; as a
result, there may be a need to develop user-friendly guidelines for using accessible features and
apps, particularly for people with visual impairments.

Chapter 2: Dataset
2.1 Introduction To Dataset

2.2 COCO Dataset

2.3 Examples And Samples

2.1 Introduction To Dataset

A dataset is a collection of data, typically organized for use in computing and analysis. The
concept of a dataset has a long history, with early examples dating back to the 1800s when
scientists and researchers began collecting and analyzing data for various purposes. In the early
20th century, the development of computers and the rise of big data led to the creation of larger
and more complex datasets. Today, datasets are used in a wide range of fields, including
business, science, and government, and are an essential part of many data-driven
decision-making processes. Datasets can be stored and accessed in various formats, including flat
files, databases, and online repositories, and can be analyzed using a variety of tools and
techniques, such as statistical software and machine learning algorithms.

2.2 COCO Dataset

COCO (Common Objects in Context) is a large-scale object detection dataset created by Microsoft
Research in 2014. The goal of COCO is to provide a comprehensive and diverse set of images that
contain various common objects in various contexts and scales. COCO is widely used for training
object detection models, as it provides a large number of high-quality annotated images.

As its main data collection, ABSIR uses the COCO dataset, a large-scale (and imbalanced) object
detection, segmentation, and captioning dataset. COCO has several features:

● Object segmentation
● Recognition in context
● Superpixel stuff segmentation
● 330K images (>200K labeled)
● 1.5 million object instances
● 80 object categories
● 91 stuff categories
● 5 captions per image
● 250,000 people with key points

COCO is considered one of the most challenging datasets for object detection due to its diverse set of
objects, large number of images, and high-quality annotations. The dataset is designed to support the
development of models that can generalize well to real-world images, and it has become a benchmark for
comparing different object detection algorithms.

Compared to other popular object detection datasets, such as PASCAL VOC and ImageNet, COCO
provides a more comprehensive and diverse set of images, making it a better choice for training models
that can handle a wide range of objects in real-world scenarios. Additionally, COCO includes more
complex scenes, such as scenes with multiple objects and overlapping objects, which can help train
models to handle more difficult detection scenarios.
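As an illustration of how such a dataset can be explored, the following Python sketch uses the pycocotools library to load a COCO annotation file and inspect its categories and annotations. The local file path is an assumption for illustration; the annotation files must first be downloaded from the COCO website.

```python
# Minimal sketch: exploring COCO annotations with pycocotools (illustrative only).
# Assumes pycocotools is installed and the annotation file below exists locally.
from pycocotools.coco import COCO

coco = COCO("annotations/instances_val2017.json")  # hypothetical local path

# List the object category names
cats = coco.loadCats(coco.getCatIds())
print([c["name"] for c in cats])

# Find all images that contain at least one "person" instance
person_ids = coco.getCatIds(catNms=["person"])
img_ids = coco.getImgIds(catIds=person_ids)
print(f"{len(img_ids)} images contain people")

# Load the annotations (bounding boxes, segmentations) for one of those images
ann_ids = coco.getAnnIds(imgIds=img_ids[0], catIds=person_ids)
anns = coco.loadAnns(ann_ids)
print(anns[0]["bbox"])  # [x, y, width, height]
```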

2.3 Examples And Samples

Figure.12: Examples Of The Dataset Images

Figure.13: Categories In The COCO Dataset

Chapter 3: System Analysis

3.1 System Requirements


3.1.1 Clients, Customers, and Users
3.1.2 Functional and Data Requirements
3.1.3 Non-Functional Requirements
3.1.3.1 Look And Feel Requirements
3.1.3.2 Usability Requirements.
3.1.3.3 Security Requirements
3.1.3.4 Performance Requirements
3.1.3.5 Portability Requirements
3.1.3.6 Availability Requirements
3.2 Proposed Solution

3.1 User and System Requirements

3.1.1 Clients, Customers, and Users

The users will be blind and color-blind people of all ages and genders in the Arabic world.

The participation and inclusion of this group in Arab society is important: the percentage of those
who suffer from visual disabilities reached 46.02% of the total Saudi population. To be specific, the
degree of severity is distributed as follows: light (67.8%), severe (28.5%), and extreme (3.7%) [4].

3.1.2 Functional and Data Requirements

1- User requirement: The user can create an account in the ABSIR application if he\she wants
to benefit from the service of hearing voices from one of his or her relatives.

System requirement: The system will create an account for users who sign up or let the user in
as a guest.

2- User requirement: The user will listen to the names of objects around him\her and recognize
them.

System requirement: The application should pronounce the names of objects loudly (machine
voice or blind relative person's voice).

The application must recognize the objects that appear on the camera.

3- User requirement: The user can use the application without an internet connection.

System requirement: The application works without an internet connection.

4- User requirement: The user can learn the content of barcodes and QR codes.

System requirement: The application must provide the service of reading barcode/QR content, such as a restaurant menu.

5- User requirement: Relative people can record their voices using the application.

System Requirement: The user will have to log in so he can record his voice via the Audio
recording interface.

6- The application provides alerts to the user if the camera access feature or vibrations are
closed.

7- The system uses the Arabic language to support Arabic-speaking blind people.

3.1.3 Non-Functional Requirements

3.1.3.1 Look and Feel Requirements

The application will have some properties that will make the users feel comfortable:

● Application interfaces will be user-friendly and blind-friendly as they will be simple,


straightforward, and have smooth movements between them.
● Interfaces will be compatible in colors, icon size, and font size.

3.1.3.2 Usability Requirements

● The application should be easy to use by blind and color-blind people by providing clear
and simple instructions on how to use the application.
● The application will be organized to save users time and effort.
● The application should be clear and smooth for users by not requiring lots of instructions.
● The system should describe the objects to the user correctly and in a clear voice.

3.1.3.3 Security Requirements

● Each user is required to enter a unique username and email to register in the application.
● The users' information and the voices of the relatives will be encrypted and protected
from breaches and diffusion.

3.1.3.4 Performance Requirements

● The application interface is expected to appear as soon as possible.


● The response to each action will be quick so the response time will be less than a second.
So the system should be fast in processing images, recognizing objects and colors, and
converting text to speech.

3.1.3.5 Portability Requirements

The system shall work on the Android and iOS platforms.

3.1.3.6 Availability Requirements

The application will be installed on the user's phone and will work without an internet connection.

3.2 Proposed Solution

The proposed application, "ABSIR," runs on the Android and iOS platforms and will, we hope, help
users know the surrounding objects and texts by recognizing and describing them, and avoid dangerous
objects by notifying the user of hazards. The application also helps users choose items from a menu
by scanning the menu's QR code and reading out the items.

Main services of "ABSIR" application:

● Free to download and use.


● Arabic language support.
● Using the voice of a relative or friend of the blind user so that they feel safe and familiar;
this helps them psychologically and increases their comfort.
● Use an alert such as vibrating when a dangerous object is around the blind.
● Providing the ability to read the menu in restaurants by the menu's QR code.

Chapter 4: System Design

4.1 Design Constraints


4.1.1 Hardware And Software Environment
4.2 Architectural Strategies
4.2.1 Reuse Of Existing Software Components
4.2.3 Development Method
4.3 Use Case Diagram
4.4 Use Case Tables
4.5 Data Flow Diagrams (DFDs)
4.6 Class Diagram
4.7 Activity Diagram
4.8 ER Diagram
4.9 Sequence Diagram
4.10 Architecture Diagram

4.1 Design Constraints

4.1.1 Hardware And Software Environment

Hardware Environment

"ABSIR" will be developed to run on different types of smartphone devices.

Software Environment

● Operating System:

Android and iOS operating systems.

● Flutter:

Flutter is a cross-platform UI software development kit (SDK) for building both Android and iOS
applications from a single codebase; it will be used to accelerate the development process.

● Visual Studio Code

Visual Studio Code is used as an editor to write the application's code.

● Android Studio

Android Studio is used as an editor to write and run the application code.

4.2 Architectural Strategies

4.2.1 Reuse Of Existing Software Components

- COCO dataset for object detection.


- An existing API that connects our app to a server running Python code for image processing,
specifically correcting the colors in pictures for the color-blind.

4.2.3 Development Method

Given the constraints of limited time and budget, the Waterfall Model has been
determined to be the most appropriate methodology for the ABSIR project [5]. This is
due to its clear and straightforward structure, as well as the well-defined requirements
and scope of the project. The Waterfall Model provides a systematic approach that
effectively balances these constraints, making it the ideal choice for this project.

4.3 Use Case Diagram

Figure.14: Use Case Diagram

Figure 14 shows the relationship between the users and the different use cases and how they interact
with the application. It displays the services provided by the ABSIR application: color filtering,
object recognition, QR reading, and voice selection (either recording a relative's voice or using the
default voice). The user accesses these services after signing up and then logging in.

4.4 Use Case Tables

4.5 Data Flow Diagram

4.5.1 Data Flow Diagram: Context Level

Figure.15: DFD Context Level

4.5.2 Data Flow Diagram: Level 1

Figure.16: DFD Level 1

Figure.15 and Figure.16 show how the user interacts with the application's processes:

● The User can register in the application and log in


● The user can access and control the settings
● The user can choose the process of recognizing an object
● The user can choose the process of reading the menu by QR
● The user can choose a filter process to adjust the colors of the images to suit the type of
color blindness
● The user can choose the process of recording the voice of one of his relatives
● The user will be able to recognize texts

4.5.3 DFD Level 2. Object Recognition

Figure.17: DFD Level 2 Object Recognition

Figure.17 shows the process of object recognition

● The user can take a picture of the object


● The system will be able to recognize the object and express it with voice and text
● The system will be able to alert the user by sound and vibration when it recognizes an
object that could harm the user

4.5.4 DFD Level 2. Read Menu By QR

Figure.18: DFD Level 2. Read Menu By QR

Figure.18 shows the process of reading the menu by QR:

● The user can take a picture of the QR


● The system will be able to read the menu to the user and express it by voice

4.5.5 DFD Level 2. Color Filter (Image Processing)

Figure.19: DFD Level 2. Color Filter (Image Processing)

Figure.19 shows the process of Image Processing (Filter)

● The user can take an image of the object


● The user can choose the type of color blindness
● The system will be able to adjust the image colors to suit the type of color blindness
chosen by the user

4.5.6 DFD Level 2. Recording The Relative’s Voice

Figure.20: DFD Level 2. Recording The Relative’s Voice

Figure.20 shows the process of recording the voice of relatives

● The user can record the voice of one of his relatives


● The system will be able to save the voice and use it in the application instead of the
automatic response

4.6 Class Diagram

Figure.21: Class Diagram

This figure shows the classes in the ABSIR application and their relationships with each other.
First, the Sign-Up and Login classes are connected to the Home class. The Home class can redirect to
the QR Code Reader, Color-Blind, Object Recognition, and Settings classes. The Settings class has a
Voice class to record the voice of the user's relative, which can then be used in the Object
Recognition class to announce the objects' names.

4.7 Activity Diagram

Figure.22: Activity Diagram

The diagram showcases the various services offered by the app, including QR code scanning,
color blindness filtering, object detection, and audio recording, providing an intuitive and
streamlined experience for users.

4.8 ER Diagram

Figure.23: ER Diagram for the relationship between users and services in the application

4.9 Sequence Diagram

4.9.1 Object Recognition Sequence Diagram

Figure.24: Object Recognition Sequence Diagram

This figure describes the object recognition process in the ABSIR application, from clicking the
button to telling the user the captured object's name.

4.9.2 Image Processing (Filter) Sequence Diagram

Figure.25: Filter (Image Processing) Sequence Diagram

The figure shows how the user activates the color filtering service in the application: the user
chooses the type of color blindness, captures the scene, and clicks send to transmit the captured
picture to the server, which modifies it according to the chosen color blindness.

4.9.3 QR Reader Sequence Diagram

Figure.26: QR Reader Sequence Diagram

4.9.4 Register/Login Sequence Diagram

Figure.27: Register/Login Sequence Diagram

The figure describes the account creation and login processes, including entering the data and
checking the entered data.

4.9.5 Record Voice Sequence Diagram

Figure.28: Record Voice Sequence Diagram

This figure shows the process of recording the voice of the relative to use it during the object
recognition service.

4.10 Architecture Diagram

Figure.29: The two primary services of ABSIR and the methods used to implement them.

4.11 Prototype

Figure.30: Welcome Interfaces

The first interface is displayed to the user when the application is opened. The other interfaces
will be displayed to the user with ABSIR’s features.

(a) (b) (c)

Figure.31: (a) Register/Login Interface, (b) Register Interface, (c) OTP Interface.

The first interface (a) lets the user choose between registering, logging in, or continuing as a
guest. The middle interface (b) is displayed when the user chooses the register option; after the
user enters the registration information, he/she is redirected to the OTP interface (c) to validate
the email.

Figure.32: Voice Recording Interfaces

After registration is completed, the user has to choose either to record a voice or to use the
default voice. If the user chooses to record a voice, the recording screen is displayed so the user
can record the voice and confirm whether the spoken sentence is correct; the voice is then saved to be
used later in the object recognition feature.

(a) (b)

Figure.33: (a) Login Interface, (b) Forget Password Interfaces

The first interface is displayed if the user chooses the Login option in Figure.31 (a). If the user
clicks the "forget password" button, he/she is redirected to the (b) interfaces to set up a new
password.

(a) (b)

Figure.34: (a) Settings Interface, (b) Voice Settings Interface

This interface displays the settings for the user and can be accessed through the home screen.
The other screen is for the voice editing settings.

Figure.35: Home Screen Interfaces

This interface is displayed after creating an account, logging in, or continuing as a guest. The
camera is pointed at the object and the capture button is pressed; the object is then captured, and
the app tells the user the captured object's name, color, and danger level. If the user chooses the
QR Reader button, the system detects and scans any QR code and displays its contents.

Chapter 5:
Implementation
5.1 Machine Learning
5.1.1 Object Detection

5.1.2 YOLO Algorithm

5.1.3 Packages and Libraries

5.2 Image Processing


5.2.1 API

5.2.2 Packages and Libraries

5.3 ABSIR Application


5.4 Future Work
5.5 Conclusion

5.1 Machine Learning

Machine learning is a subset of AI that includes algorithms that allow software applications to
become more accurate in predicting outcomes without being explicitly programmed.

Machine learning is a technology that is used to train machines to perform various actions such
as predictions, recommendations, estimates, etc., based on historical data or experience. One of
its main advantages is the ability to process and analyze large amounts of data, extract
meaningful insights, and make predictions with high accuracy.

There are several types of machine learning algorithms, including supervised learning, which is
used in the ABSIR implementation of the object detection service.

Supervised learning is the type of machine learning in which machines are trained using well
"labeled" training data and, based on that data, predict the output. Labeled data means that the
input data is already tagged with the correct output.

In supervised learning, the training data provided to the machine works as a supervisor that teaches
the machine to predict the output correctly; it applies the same concept as a student learning under
the supervision of a teacher.
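To make the idea concrete, the following small Python sketch (using scikit-learn, which is not part of ABSIR and serves only as an illustration) trains a classifier on a handful of labeled examples and then predicts the label of unseen inputs:

```python
# Minimal supervised learning sketch: labeled inputs train a model that
# then predicts outputs for unseen inputs (illustrative only, not ABSIR code).
from sklearn.tree import DecisionTreeClassifier

# Each sample is [width_cm, height_cm]; the labels are the "correct outputs".
X_train = [[10, 10], [12, 11], [30, 5], [32, 6]]
y_train = ["box", "box", "ruler", "ruler"]

model = DecisionTreeClassifier()
model.fit(X_train, y_train)        # the labeled data acts as the "supervisor"

print(model.predict([[11, 12]]))   # expected: ['box']
print(model.predict([[31, 5]]))    # expected: ['ruler']
```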

5.1.1 Object Detection

Object detection is a computer vision technique for locating instances of objects in images or
videos. Object detection algorithms typically leverage machine learning or deep learning to
produce meaningful results.
Humans can quickly identify and find objects of interest when viewing photos or videos. Using a
computer, object detection attempts to simulate this intelligence.
Our app features object detection technology that provides audio descriptions to enhance the
experience of blind and visually impaired users. The object detection system analyzes the camera
view in real time and provides audio descriptions of the objects and people in the scene.
This technology greatly improves the independence and quality of life of our blind and visually
impaired users by allowing them to better understand their surroundings, avoid obstacles, and
identify objects and people. The audio descriptions are designed to be clear, concise, and easy to
understand, providing users with the information they need to navigate their environment with
greater confidence.

In summary, our app supports object detection with audio descriptions, offering a unique and
valuable experience for blind and visually impaired users.

5.1.2 YOLO Algorithm
YOLO, short for "You Only Look Once," is one of the most popular object detection algorithms because
of its speed and accuracy. It is used to detect traffic lights, people, cars, animals, and more.

This algorithm improves object detection speed, provides accurate results with minimum errors,
and has excellent learning capabilities that enable it to learn object representations and apply
them to object detection.

The object detection function in our application leverages machine learning (ML) techniques and is
based on the YOLOv2 algorithm trained on the COCO dataset described in Chapter 2. This technology
gives our application the ability to detect objects in images accurately, providing reliable
information and robust performance. In terms of speed, YOLO is one of the best object detection
models, capable of processing frames at up to 150 FPS with smaller networks. In terms of accuracy,
the original YOLO was not the strongest model of its generation, but it still achieved a respectable
mean average precision (mAP) of about 63%. YOLOv2 is better, faster, and stronger: at 67 FPS, YOLOv2
returns an mAP of 76.8%, and at 40 FPS an mAP of 78.6% [34].
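To make the detection pipeline concrete, the sketch below runs a pre-trained YOLOv2 network with OpenCV's DNN module. It is an illustrative pipeline rather than ABSIR's exact code: the yolov2.cfg, yolov2.weights, and coco.names files and the 0.5 confidence threshold are assumptions.

```python
# Illustrative YOLOv2 inference with OpenCV's DNN module (not ABSIR's exact code).
# Assumes yolov2.cfg, yolov2.weights, and coco.names have been downloaded locally.
import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov2.cfg", "yolov2.weights")
classes = open("coco.names").read().strip().split("\n")

img = cv2.imread("scene.jpg")          # hypothetical input image
h, w = img.shape[:2]

# YOLO expects a square, normalized input blob
blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
detections = net.forward()             # rows: [cx, cy, bw, bh, objectness, class scores...]

for det in detections:
    scores = det[5:]
    class_id = int(np.argmax(scores))
    confidence = float(scores[class_id])
    if confidence > 0.5:
        cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
        print(f"{classes[class_id]}: {confidence:.2f} at "
              f"({cx - bw / 2:.0f}, {cy - bh / 2:.0f}, {bw:.0f}, {bh:.0f})")
```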

5.1.3 Packages and Libraries Used In Machine Learning

- Tflite package:

TensorFlow Lite is a set of tools that enables on-device machine learning by helping developers
run their models on mobile, embedded, and edge devices. The tflite package is optimized for
on-device machine learning and addresses five key constraints: latency (there is no round trip to
a server), privacy (no personal data leaves the device), connectivity (an internet connection is
not required), size (reduced model and binary size), and power consumption (efficient inference
and no network connections). It also provides end-to-end examples for common machine
learning tasks such as image classification, object detection, pose estimation, question
answering, and text classification on multiple platforms [24].

- Speech_to_text:

A library that exposes device-specific speech recognition capability. This plugin contains a set of
classes that make it easy to use the speech recognition capabilities of the underlying platform in
Flutter. It supports Android, iOS, and the web. The target use cases for this library are commands
and short phrases, not continuous speech conversion or always-on listening.
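For illustration, a minimal Dart sketch of the typical usage pattern (initialize once, then listen
for a short phrase); the function name and callback body are illustrative rather than taken from
ABSIR's code.

    import 'package:speech_to_text/speech_to_text.dart';

    final SpeechToText _speech = SpeechToText();

    // Recognize one short phrase and print the words as they are recognized.
    Future<void> recognizeShortPhrase() async {
      final bool available = await _speech.initialize();
      if (!available) return; // microphone or speech service unavailable

      await _speech.listen(onResult: (result) {
        print('Recognized: ${result.recognizedWords}');
      });
    }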

5.2 Image Processing
Image processing (IP) is a subfield of computer science that deals with the manipulation,
analysis, and interpretation of images. It involves techniques and algorithms for processing and
transforming digital images, and it has a wide range of applications, including computer vision,
medical imaging, and multimedia processing. Filtering, for example, is a technique for modifying
or enhancing an image.

5.2.1 Application Programming Interface (API)

An Application Programming Interface (API) is the middleman between an application and a
web server, and it allows applications to interact with each other. Every time software
communicates with other software or with an online web server, it uses an API to request the
information it needs. Whenever we use an application on our mobile phones, it sends data to and
receives data from a remote server via the Internet. [26]

At the server's end, the application retrieves the data, interprets it, performs the required actions,
and sends a response back to the phone. The application on our mobile phone then interprets the
incoming data and displays it to us in a human-readable format. [26]

Figure.36: How the API works

In our application, we have incorporated IP to address the challenges faced by color-blind
individuals. Our application features a color correction technique that adjusts the colors in
captured images to make them more distinguishable and enhances the visual experience for
color-blind users. We used an off-the-shelf server that processes images for the four supported
types of color blindness:

- Deuteranopia: green weakness.
- Protanopia: red weakness.
- Tritanopia: blue weakness (extremely rare).
- Hybrid: a combination of red-green color blindness and blue-yellow color blindness.

Figure.37: How each type of color-blind sees

In the hybrid case, for example, the individual has difficulty seeing certain colors in the red-green
range as well as certain colors in the blue-yellow range.

5.2.2 Packages and Libraries Used In Image Processing

- HTTP package:

This package contains a set of high-level functions and classes that make it easy to consume
HTTP resources. It's multi-platform and supports mobile, desktop, and browser. Also, the http
package provides the simplest way to fetch data from the internet [25].
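For illustration, a minimal Dart example of fetching data with the http package; the URL below
is only a placeholder.

    import 'package:http/http.dart' as http;

    // Fetch data from a placeholder URL with a simple GET request.
    Future<void> fetchExample() async {
      final response = await http.get(Uri.parse('https://example.com/data'));
      if (response.statusCode == 200) {
        print(response.body); // raw response text
      }
    }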

- OpenCV

OpenCV is one of the most popular computer vision libraries. It is a large open-source library for
computer vision, machine learning, and image processing, and it supports a wide variety of
programming languages such as Python, C++, and Java. It can process images and videos to
identify objects, faces, and even human handwriting, and it is used in areas such as facial and
gesture recognition, human-computer interaction, mobile robotics, and object identification. [32]

5.3 ABSIR Application Interfaces

Figure.38: Welcome Interfaces

The Welcome UI serves as an overview of the app's salient features: Object Detection, Color
Blind Filters, and friendly Audio. The Object Detection function enables users to detect and
identify objects in real time using their device's camera. The Color Blind Filters feature helps
individuals with color blindness better distinguish colors in their surroundings. The friendly
Audio feature provides a more interactive and immersive audio experience for users.

The user interface of the app offers three options for accessing the app:
Login, Sign Up and Join as a Guest. The Login option allows users to log
in to their existing account, the Sign-Up option enables users to create a
new account, and the Join as a Guest option allows users to access the app
without creating an account. These options cater to different user needs
and preferences, providing flexibility and choice in accessing the app. The
interface is user-friendly and intuitive.

Figure.39: Register/Login Interface

The Signup screen is where users can create a new account for the app. The screen asks users to
enter their email, username, and password, and to confirm their password. The secure design of
the Signup screen protects users' information and keeps it confidential. After filling out the
necessary information, users can create their accounts by clicking the "Sign Up" button. The user
information is stored in a local SQLite database via the sqflite package, ensuring the privacy and
security of users' information.

Figure.40: Register Interface
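Assuming the app uses the sqflite plugin (as the report suggests), a simplified sketch of local
account storage could look like the following. The database file name, table layout, and function
names are illustrative, and a real application should hash passwords rather than store them in
plain text.

    import 'package:path/path.dart';
    import 'package:sqflite/sqflite.dart';

    // Open (or create) a local database with a simple users table.
    Future<Database> openUserDb() async {
      final dbPath = join(await getDatabasesPath(), 'absir_users.db');
      return openDatabase(
        dbPath,
        version: 1,
        onCreate: (db, version) => db.execute(
          'CREATE TABLE users(id INTEGER PRIMARY KEY, email TEXT, username TEXT, password TEXT)',
        ),
      );
    }

    // Store a new account locally; a real app should hash the password first.
    Future<void> signUp(String email, String username, String password) async {
      final db = await openUserDb();
      await db.insert('users', {
        'email': email,
        'username': username,
        'password': password,
      });
    }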

The Login screen allows users to access their existing account by entering their username and
password. The screen is user-friendly and straightforward, enabling users to easily log in to the
app. If a user does not have an existing account, they can create a new one by clicking the
"Sign Up" button.

Figure.41: Login Interface

The object recognition interface helps blind users know what is around them. It relies on the
tflite package running the YOLOv2 algorithm trained on the COCO dataset: the user points the
camera at an object, and the application describes the object by displaying its name as text and
announcing it aloud, with the help of the Speech_to_text library.

Figure.42: Object Recognition Interface
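As a rough sketch of this flow, the example below streams camera frames into the detector using
the camera and tflite packages. The parameter values and function name are assumptions, and
UI code, permission handling, and the audio announcement step are omitted.

    import 'package:camera/camera.dart';
    import 'package:tflite/tflite.dart';

    // Stream camera frames into the YOLO detector; UI code is omitted.
    Future<void> startLiveDetection(CameraDescription description) async {
      final controller = CameraController(description, ResolutionPreset.medium);
      await controller.initialize();

      controller.startImageStream((CameraImage image) async {
        final recognitions = await Tflite.detectObjectOnFrame(
          bytesList: image.planes.map((plane) => plane.bytes).toList(),
          model: "YOLO",
          imageHeight: image.height,
          imageWidth: image.width,
          imageMean: 0.0,
          imageStd: 255.0,
          threshold: 0.3,
        );
        // Each recognition carries the detected class name, which the app
        // can display as text and then announce aloud.
        print(recognitions);
      });
    }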

The server exposes two fields: a field named "type", which takes the type of color blindness, and
another field named "image", which takes the path of the captured image. Through the camera
controller method, the user takes a picture, and the picture is saved at a specific path. From the
list of color blindness types, the user then chooses the type to apply. A request is sent to the
server to process the image based on the image path and the selected type of color blindness,
and the server returns the processing result. [30][31]

Figure.43: Color Filtering Interfaces
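As a rough sketch of this request, the Dart code below sends the selected color-blindness type
and the captured image to the server as a multipart POST. The endpoint URL and function name
are placeholders; the field names follow the description above.

    import 'package:http/http.dart' as http;

    // Send the captured image and the chosen color-blindness type to the
    // image-processing server; the endpoint below is only a placeholder.
    Future<void> requestColorCorrection(String imagePath, String type) async {
      final uri = Uri.parse('https://example-server.com/correct');

      final request = http.MultipartRequest('POST', uri);
      request.fields['type'] = type; // e.g. "deuteranopia"
      request.files.add(await http.MultipartFile.fromPath('image', imagePath));

      final streamed = await request.send();
      final response = await http.Response.fromStream(streamed);
      print('Server returned ${response.statusCode}'); // processed result
    }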

The Voice Selection interface allows recording the sentences that a blind user wants the
application to recognize. The voice recording interface is reached through the settings interface,
where the blind user can record the desired sentences. If the recording completes correctly, the
sound is saved in the application via the flutter_sound package and the Speech_to_text library.

Figure.44: Voice Recording Interfaces
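A minimal sketch of recording a short clip with the flutter_sound package; the file name,
duration, and helper name are illustrative, microphone permission handling is omitted, and the
exact session-handling calls can vary between plugin versions.

    import 'package:flutter_sound/flutter_sound.dart';

    final FlutterSoundRecorder _recorder = FlutterSoundRecorder();

    // Record a short clip to a file, then stop and release the recorder.
    Future<void> recordSentence() async {
      await _recorder.openRecorder();
      await _recorder.startRecorder(toFile: 'sentence.aac');
      await Future.delayed(const Duration(seconds: 5)); // record for a while
      await _recorder.stopRecorder();
      await _recorder.closeRecorder();
    }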

Figure.45: OTP, settings, and voice options Interfaces

5.4 Future Work

Despite the limitations imposed by the limited time frame, there remain opportunities for further
improvement and expansion of ABSIR's capabilities. In the near future, we are looking to
integrate text recognition and reading features, which will enable the app to produce full image
descriptions. Furthermore, GPS integration is planned to provide visually impaired individuals
with navigational support. Lastly, our team is working to improve the color identification
capabilities, to further enhance the user experience for those who rely on ABSIR. These
developments will further solidify ABSIR's position as a comprehensive and user-friendly
solution for those in need.

5.5 Conclusion
The ABSIR application represents a significant advancement in the field of assistive technology
for visually impaired individuals. The integration of machine learning and image processing
techniques creates a comprehensive solution for helping blind and color-blind individuals
understand their surroundings and connect emotionally with their loved ones.

The machine learning component, utilizing the YOLOv2 model and the COCO dataset through
the Flutter tflite package, is capable of detecting and recognizing objects in real time.

The object recognition service provides a description of the objects around the user: the name of
the recognized object appears as text in Arabic and is also pronounced aloud. To achieve this, we
built a machine learning model by loading the YOLO model and its dataset through the tflite
library so that object recognition can be performed, and the name of the detected object is then
displayed in Arabic. The spoken output in Arabic is produced with the help of the flutter_sound
package and the Speech_to_text library.

The image processing component, utilizing an API and a server connection running Python code,
allows for the correction of color blindness. This gives individuals with color blindness the
ability to better distinguish colors, allowing for a more accurate perception of their surroundings.

In addition to these technical features, the ABSIR application also allows for personalized audio
messages from loved ones. This emotional connection provides a sense of comfort and support to
the visually impaired user, further enhancing the overall effectiveness of the application.

Overall, the ABSIR application is a powerful tool that has the potential to greatly improve the
quality of life of visually impaired individuals. Its compatibility with both iOS and Android
operating systems, combined with its ease of development using the Flutter framework and the
Dart programming language, makes it a highly accessible and user-friendly solution for this
population.

References
[1] Microsoft Corporation. "Seeing AI". USA. 2017.

[2] Yusuf Suhair. "Noongil". December 2020.

[3] Be My Eyes. "Be My Eyes". October 2017.

[4] General Authority Of Statistics. Visited in April 2022.

[5] Khneisser, Awad, ElHaddad & Mahmoud. "Intelligent Eye". April 2018.

[6] Kuriakose, Shrestha & Sandnes. "DeepNAVI". August 2022.

[7] Obe, Olumide & Akinwonmi. Smart Application for the Visually Impaired. March 2021.

[8] Nayak & C. B. Assistive Mobile Application for Visually Impaired People. India. April 2020.

[9] Manoharan. A Smart Image Processing Algorithm for Text Recognition, Information
Extraction, and Vocalization for the Visually Challenged. India. 2019.

[10] ORCAM. "ORCAM". USA. 2019.

[11] E. Farrington-Arnas. "MAPTIC". London. 2020.

[12] KNFB Reader. Visited in September 2022.

[13] WeWALK Company. "WeWALK". 2020. Visited in September 2022.

[14] Baswaraju Swathi, Koushalya R, Vishal Roshan J & Gowtham M N. Color Blindness
Algorithm Comparison for Developing an Android Application. IRJET Journal. May 2020.

[15] A Navigation and Augmented Reality System for Visually Impaired People. 2021.

[16] Suraj S Senjam. The Current Advances in Human-Smartphone User Interface Design: An
Opportunity for People with Vision Loss. September 2021.

[17] Marcos Barata, Afan Galih Salman, Ikhtiar Faahakhododo and Bayu Kanigoro.
Android-Based Voice Assistant for Blind People. July 2018.

[18] L. A. Elrefaei. Smartphone-Based Image Color Correction for Color Blindness. Int. J.
Interact. Mob. Technol., vol. 12, no. 3, pp. 104-119, July 2018.

[19] Kuriakose, B.; Shrestha, R.; Sandnes, F. E. Smartphone Navigation Support for Blind and
Visually Impaired People: A Comprehensive Analysis of Potentials and Opportunities. Lect.
Notes Comput. Sci. 2020.

[20] Sahar Busaeed, Rashid Mehmood, and Iyad Katib. Requirements, Challenges, and Use of
Digital Devices and Apps for Blind and Visually Impaired. Saudi Arabia. 2022.

[21] Microsoft Soundscape: A Map Delivered in 3D Sound. January 11, 2019.

[22] Adobe Color, Accessibility Tools. "Color Blind Safe Colors on the Color Wheel". Visited in
November 2022.

[23] MOBILIS SOUNDS BEACON.

[24] TensorFlow Lite. Visited in February 2023.

[25] pub.dev, http | Dart Package. Visited in February 2023.

[26] Tray.io, Inc. "Tray.io". 2023. Visited in February 2023.

[27] MathWorks. What Is Image Filtering in the Spatial Domain? Visited in February 2023.

[28] Ed Burns, "What Is Machine Learning and Why Is It Important?", TechTarget, Enterprise
AI, 2021. Visited in January 2023.

[29] Vipin Tyagi. Understanding Digital Image Processing. 2018.

[30] GitHub, Joerg Dietrich / Daltonize. Visited in February 2023.

[31] GitHub, (tsarjak) Sarjak Thakkar. Simulate-Correct-Colorblindness. Visited in January 2023.

[32] Color Blindness API Service, New York, 2021. Visited in January 2023.

[33] Gursimar Singh, "Python in Plain English", 2021. Visited in February 2023.

[34] GeeksforGeeks, December 6, 2022. YOLO v2 - Object Detection. Visited in February 2023.

Appendix A

Survey:

The survey was conducted between February 2021 and September 2021 among people who are
at least 15 years old and suffer from some kind of vision loss. A total of 164 visually impaired
people participated in the survey, representing different age groups and varying educational
backgrounds. Out of the 164 respondents, the youth (15 to 35 years of age) were the largest
group at 78%, followed by the 36-45 age group at 15%. This indicates that these groups use
technology more frequently in their everyday lives than the other groups, since the survey was
conducted using Google Forms. [20]

1. Types Of Assistive Devices:

The responses showed that the visually impaired use different types of devices. Nonetheless, the
majority use white canes (48%), followed by braille displays at 29%. It is interesting to see that
25% of the respondents do not use any device at all. Those could depend on relatives to guide
them when needed. Some, however, could be reluctant to use assistance from people or devices;
instead, they use their other senses and may not have lost all their sight, as shown in Figure 2. [20]

Figure.: The Use Of Different Assistive Technologies By The Survey Participants

2. Different Assistive Technologies

This shows that most respondents use equipment to help them walk and navigate (40%), in
addition to assisting them in achieving their educational goals (34%). In addition, 13% of the
respondents indicated that they use it for office tasks, while 7% use the device for object
recognition, as shown in Figure 3. [20]

Figure.:The Purpose of Assistive Devices According To The Survey Participants.

3. Dependency In Indoor Environments:

In indoor environments, the visually impaired had more than one means of assistance. White
canes and mobiles were the primary tools used by the visually impaired, at 49% and 48%,
respectively, followed by an accompanying person at 42%. This can be attributed to their
familiarity with the surroundings (inside their homes). There is a 13% dependence on special
devices for the visually impaired, while 31% rely on their remaining vision, as shown in
Figure 4. [20]

Figure: Survey Participants’ Dependency In Indoor Environments.

4. Dependency In Outdoor Environments:

As shown, respondents depend more on accompanying persons (53%) to navigate outdoor
environments than indoor environments (42%). This is expected given that the visually impaired
may not be familiar with these outdoor environments. Furthermore, respondents use mobiles and
white canes equally in these conditions, as shown in Figure 5. [20]

Figure: Dependency In Outdoor Environment

5. Use A Smartphone Without Any Assistance:

The survey shows how easy it has become for the visually impaired to use a mobile phone on
their own, without asking for assistance from others. Only 3% rely entirely on others when using
mobiles, while 22% need partial help, as shown in Figure 6. [20]

6. Object Recognition:

Over half of the people surveyed use object recognition apps on their mobiles, which is
encouraging. The remaining 44% said they do not use any object recognition apps. Percentages
in the following discussions are limited to the respondents who use object recognition apps
(92/164; 56%), as shown in Figure 7. [20]

Figure. : The Distribution Of Survey Participants Using Object Recognition Technologies.

7. Popular Object Recognition Apps:

"Envision" is the most used object recognition app on mobiles among the participants (80%),
and "Seeing AI" comes second. The Envision app helps the blind and the visually impaired read
text and documents, recognize faces, and find objects. Seeing AI is the second most popular
object recognition app (22%), mainly because it is only available on iOS, whereas "Envision" is
available on both iOS and Android. "Be My Eyes" was the third most common app used by the
respondents (10%); it connects blind and visually impaired individuals with sighted volunteers
via a live video call. The surveyed individuals equally use money recognition apps and
TapTapSee (4% each), as shown in Figure 8. [20]

8. Navigation:

A little more than half of the surveyed people use navigation apps on their mobiles. Percentages
in the following discussions are limited to the respondents who use navigation apps (89/164;
54%), as shown in Figure 9. [20]

Almost all the respondents using navigation apps indicated that their preferred app is Google
Maps. "Be My Eyes" is the second most popular choice among the respondents at 30%, followed
by "Ariadne GPS" (8%). While uncommon, "Seeing Assistance," "Blind Square," and "I move
around" are used by some of the people surveyed, at 7%, 4%, and 3%, respectively, as shown in
Figure 10. [20]
