Abstract—Assistive technology is necessary to establish two-way communication among deaf, mute, and normal people in which none of them needs to acquire knowledge of sign languages. In this paper, we propose a novel communication aid system that assists the deaf and mute in communicating independently without the use of sign language. When none of them knows a sign language, deaf, mute, and normal people usually communicate with each other using visual references and simple sentences. These small talks are mainly based on contexts or keywords that can be delineated visually. These keywords can further be classified by syllables for the purpose of vibrotactile output, so that the speech is easily comprehended. Our proposed system therefore emphasizes both visual and vibrotactile feedback while communicating. We developed an Android app that follows a multimodal approach: it converts speech to visual contexts and vibrations and, conversely, converts the contexts and vibrations to speech. To validate the program, the system was tested by six deaf and mute participants, two normal persons, and two examiners. The study of our experiments reflects the effectiveness and usability of the system.

Index Terms—Deaf and Mute, Assistive Technology, Vibrotactile Feedback, Visual Feedback, Multimodal Approach.

I. INTRODUCTION

According to the World Federation of the Deaf (WFD), approximately 360 million people suffer from hearing loss and 70 million people are deaf and mute [1]. People who can only hear sounds in the range from 41 dB to 55 dB fall into the category of moderate hearing loss and usually have difficulty listening without a hearing aid. In addition, people who can only hear sounds above 91 dB fall into the category of profound hearing loss; they usually rely on lip-reading or sign languages for communication [2]. The most common communication method for deaf and mute people is sign language. However, sign languages vary from region to region, and there are about 300 sign languages worldwide [3], so deaf and mute people find it very difficult to communicate in new environments. Furthermore, to learn a sign language, the participants (deaf, mute, and normal persons) have to undergo training sessions. This practice is common among almost all deaf and mute people, but normal persons do not usually tend to learn sign language. Therefore, the deaf and mute face challenges in talking to someone who does not know sign language. To solve this problem, numerous studies have proposed communication systems for the deaf and mute based on sensors, image processing, or smartphones. Most of these works are limited to sign language or gesture detection only, and for this reason they cannot be used as a bi-directional communication aid. Since these approaches depend heavily on sign language detection, their users must learn sign language beforehand. To this end, we propose a system that uses a smartphone for communication, requires no sign language, and is very easy to handle.

The objective of our system is to provide a low-cost and easily accessible communication aid for the deaf and mute that does not require learning any sign language. While studying different communication approaches of deaf and mute people, we observed that different visual references are used in addition to sign language. If any of the communicators does not know sign language, communication can still be established using visual cues from the environment. Therefore, instead of using sign language, our system communicates through different kinds of visual feedback.

Implementing the system was quite challenging because it does not use any sign language detection. The system can generate different visual representations from given input speech or gestures. For example, when a normal person communicates with a deaf and mute person, the user gives speech input through our system. The system processes the input speech and determines contexts and keywords from it. Based on these contexts and keywords, the output is shown to the deaf and mute user through visual images and vibrations. Conversely, when a deaf and mute user tries to communicate through our system, they give input as vibrations or gestures. These vibrations or gestures are associated with predefined words, which are then conveyed to the normal user as automated speech.

This paper is divided into five sections, including this introductory section. The rest of the paper is organized as follows. Section II discusses research relevant to our system. Section III describes the proposed system design. Section IV presents the experimental analysis and user study of the system. Lastly, Section V concludes the paper with the future scope of this study.
International Seminar on Application for Technology of Information and Communication (iSemantic)
Fig. 1. Overview of two-way communication between normal people and deaf & mute.
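The two-way flow sketched in Fig. 1 hinges on pulling a context keyword and a command keyword out of the recognized speech. As a rough illustration of that step only (the vocabularies and the function below are our own sketch, not the authors' code; the example words are taken from the paper's sample conversations):

```python
# Sketch of the context/command extraction step: scan the recognized
# transcript for the first word that appears in each predefined
# vocabulary. CONTEXTS and COMMANDS are hypothetical example tables.

CONTEXTS = {"tired", "food", "hungry", "bed"}
COMMANDS = {"feel", "eat", "not", "go"}

def extract_context_and_command(transcript):
    """Return (context, command), with None for anything not found."""
    words = [w.strip(".,!?").lower() for w in transcript.split()]
    context = next((w for w in words if w in CONTEXTS), None)
    command = next((w for w in words if w in COMMANDS), None)
    return context, command

print(extract_context_and_command("I feel tired."))  # → ('tired', 'feel')
```

The app would then map the resulting pair to a visual image and a vibration pattern for the deaf and mute user.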
II. RELATED WORKS

In recent years, there has been extensive research on developing suitable assistive technologies for the deaf and mute. The authors of [4] divide these assistive technologies into three categories: translation of speech to text, translation of speech or text to sign language, and translation of sign language to speech or text. The same paper [4] identifies that most of these solutions deal with one-way communication only. In our research, we categorized the related work based on the communication medium of the users, as discussed in the following subsections.

A. Sensor-Based Technologies

The sensor-based assistive technologies mainly use external devices, such as handheld or finger-worn devices, to detect various gestures. In [5], the authors developed a finger-worn device called Magic Ring that can translate a predefined gesture into information for communication. The studies in [6] and [7] are based on a micro-controller and flex sensors that detect various finger movements and show the output as text or speech. In [8], a hand gesture recognition glove was developed that can translate sign language into text and show it on a portable device. Sensor-based devices are quite accurate in detecting gestures, but they can be costly and cumbersome to use for communication purposes.

B. Vision-Based Technologies

The vision-based assistive technologies for deaf and mute users mainly use image processing techniques to detect sign languages or gestures from input images or videos. Several studies [9]–[13] detect sign language using the Microsoft Kinect camera and interpret the sign language into information. In [14], a communication system for the deaf and mute was developed that can recognize gestures from an input image. The authors of [15] developed a full-duplex communication system in which hand gestures are processed from input video and converted to text or speech, and the speech of a normal person likewise has a corresponding gesture. In [16], a device called the Digital Dactylology Converter (DDC) was developed, which processes sign language from input images and converts it to voice signals and text messages. The authors of [17] developed a multi-modality-based Arabic sign language detection system using a huge image dataset. Vision-based assistive technologies are very good at detecting sign language but are complex for communication purposes.

C. Smartphone-Based Technologies

The popularity and usability of smart devices such as smartphones and smartwatches have led to the development of many assistive technologies. El-Gayyar et al. [18] developed a cloud-supported mobile application that can translate Egyptian Arabic speech into a visualized 3D avatar. The work in [19] is also based on speech recognition and avatar visualization, but additionally supports two-way communication. In [20], a smartphone-based hearing aid was developed to help hearing-impaired people. In [21], a real-time emergency assistance system called iHelp was developed, in which information is sent through GPS and text messages. The work in [22] presents a smartwatch-based assistive device that triggers a vibration to the user when
Fig. 2. Different portions of the user interface of our system when a normal person starts communicating with a deaf and mute person. (a) The normal user is required to provide voice input to start the conversation. (b) The voice input is converted into text. (c) The context and command are shown as visual output to the deaf and mute user, and the command produces vibration output on the basis of syllables.
Fig. 3. Different portions of the user interface of our system when a deaf and mute person starts a conversation with a normal person using gesture input. (a) The deaf and mute user provides the gesture of the context by drawing the first letter of the relevant word. (b) The deaf and mute user provides the command gesture in the same way. (c) The context and command are shown as visual output, and relevant sentences can be selected from it. (d) The output is delivered to the normal user in the form of speech.
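The matching of a recognized context/command pair to a stored sentence and image (Figs. 2c and 3c) is backed, per the implementation description, by an SQLite database of predefined commands and contexts. A minimal sketch of such a lookup follows; the schema, column names, and sample rows are our own illustration, not the authors' actual database layout:

```python
import sqlite3

# Hypothetical table of predefined conversation entries: each pairs a
# context and command keyword with a visual cue and a full sentence.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE phrases (
                    context  TEXT,
                    command  TEXT,
                    image    TEXT,     -- path to the visual cue
                    sentence TEXT)""")
conn.executemany(
    "INSERT INTO phrases VALUES (?, ?, ?, ?)",
    [("food", "eat", "img/food.png", "Eat some food"),
     ("bed",  "go",  "img/bed.png",  "Go to bed")])

# Matching a detected context/command pair to its stored entry:
row = conn.execute(
    "SELECT sentence, image FROM phrases WHERE context=? AND command=?",
    ("food", "eat")).fetchone()
print(row)  # → ('Eat some food', 'img/food.png')
```

A query that finds no row would correspond to the limitation noted later: the system cannot respond to words outside its fixed database.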
was passed successfully. We used an SQLite database in both applications for storing the predefined commands and the contexts (images). The Google STT API and the Google TTS API were used for transforming speech to text and vice versa. With this proposed system, users do not need to remember sign languages to communicate. Rather, the system is designed in such a way that it maintains two-way communication with a multi-modal approach.

IV. USER STUDY

To examine the feasibility and usefulness of the system, we tested it with six different deaf and mute people and two normal people. Initially, we installed the application on each of their Android smartphones. We gave them three tasks to conduct and then examined the time and accuracy. Afterwards, we arranged a two-hour training session in which we taught them how to operate the system. The training session was conducted by the design experts and was assisted by an expert sign language trainer, who translated the speech into sign language for the understanding of the deaf and mute. Later, they were given three separate tasks to conduct, and the time and accuracy were examined again. Accuracy was obtained by dividing the number of correct outputs by the total number of given tasks, yielding a value from 0 to 100 (%), and time was measured in seconds. The main goal of the user study is to get a statistical view of the efficiency of the system as
Fig. 4. Different portions of the user interface when a deaf and mute person communicates with a normal user using vibrotactile input. (a) The user is asked to enter a context and selects the words on the basis of syllables. (b) In the same way, the user provides the command input and chooses the appropriate command from the list. (c) The relevant command and context are matched and the visual output is shown. (d) After choosing the appropriate sentence, the output is received by the normal user in the form of speech.
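The accuracy and average figures reported in the tables that follow come from the simple arithmetic defined in the user study: correct outputs divided by total given tasks, expressed as a percentage, and each questionnaire row averaged over the eight participants. A quick check (the 2-of-3 example is hypothetical):

```python
# Accuracy as defined in the user study: correct outputs divided by
# the total number of given tasks, as a percentage.
def accuracy(correct, total):
    return 100 * correct / total

# Average rating for one questionnaire row (question 1 of Table VII):
q1 = [5, 5, 4, 4, 3, 5, 5, 5]
avg = sum(q1) / len(q1)

print(round(accuracy(2, 3), 1))  # e.g. 2 of 3 tasks correct → 66.7
print(avg)                       # → 4.5, matching the reported 4.50
```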
TABLE III. A portion of the conversation list provided before the training session

Users: Conversations                      | Context | Commands
Normal Person: I feel tired.              | tired   | feel
Deaf and Mute: Eat some food              | food    | eat
Normal Person: Not hungry                 | hungry  | not
Deaf and Mute: Go to bed                  | bed     | go
Normal Person: Okay                       | Okay    | —

TABLE IV. A portion of the conversation list provided after the training session

Users: Conversations                      | Context | Commands
Deaf and Mute: I want to visit the park   | park    | visit
Normal Person: I will go on Sunday        | Sunday  | go
Deaf and Mute: Okay                       | Okay    | —
Normal Person: I will travel by bus       | bus     | travel
Deaf and Mute: Okay                       | Okay    | —

as before. Then they were asked to start the conversations as soon as they were provided with the three sets of conversation lists mentioned in Table IV. Like Table III, Table IV shows only a portion of the whole three conversations. Next, the examiner noted down the time and accuracy of each conversation as earlier.

The overall conversation results were then compared, which indicated that the results obtained after the training session improved noticeably: the accuracy increased and the time decreased for each of the conversations. The results were compared using the average values of time and accuracy for each conversation, as depicted in Table V.

TABLE V. Overall comparison of time duration and accuracy before and after the training session

Conversation | Before Training              | After Training
             | Avg. Time (s) | Accuracy (%) | Avg. Time (s) | Accuracy (%)
No. 1        | 93            | 100          | 21            | 100
No. 2        | 127           | 81           | 39            | 85
No. 3        | 153           | 45           | 65            | 83

C. Users and Experts Feedback

User feedback was taken after the evaluation was fully completed. Each user was given a set of questionnaire items to score; the items are listed in Table VI. The questionnaire was set based on [25]–[28] because of the underlying research studies, standardized models, and high acceptability in UX design. Almost all of these questionnaires have undergone some type of psychometric qualification, with evaluation of dependability, rationality, and sensitivity, making them important tools for users. Each item was rated out of 5 by the users. The average of the ratings reflects the drawbacks and usefulness of the system. The ratings, including the averages, are exhibited in Table VII. The average provides one central value of the data: all the individual scores are reduced to one uniform value, which helps us to identify and judge the opinions of various people about the overall system in a generic way.

TABLE VI. Questions for user feedback

No. | Questions
1   | The device helps me to communicate properly.
2   | The device is portable and highly usable.
3   | The device works properly and fast.
4   | The device gives correct output.
5   | The device is easy to use and its functionality can be remembered.
6   | The functionality of the device can be learned quickly.
7   | The device is recommended as a communication tool.

TABLE VII. Ratings from the users

Ques. | ID 1 | ID 2 | ID 3 | ID 4 | ID 5 | ID 6 | ID 7 | ID 8 | Avg
1     | 5    | 5    | 4    | 4    | 3    | 5    | 5    | 5    | 4.50
2     | 5    | 5    | 5    | 5    | 4    | 5    | 5    | 5    | 4.87
3     | 4    | 4    | 5    | 3    | 3    | 3    | 4    | 4    | 3.75
4     | 4    | 5    | 5    | 3    | 4    | 4    | 5    | 5    | 4.37
5     | 5    | 5    | 5    | 4    | 3    | 4    | 5    | 5    | 4.50
6     | 5    | 5    | 5    | 5    | 4    | 4    | 5    | 5    | 4.75
7     | 5    | 5    | 4    | 4    | 4    | 5    | 5    | 5    | 4.62

A heuristic evaluation of the system against a few parameters helped to obtain the feedback of domain experts on the system's usability and acceptability. Five experts in the relevant fields rated the overall application, and each parameter was rated out of 5. The checklist is designed based on criteria such as: users should always be informed of system operations through easy-to-understand and highly visible status displayed on the screen within a reasonable amount of time; users should be offered a digital space where backward steps are possible, including undoing and redoing previous actions; interface designers should ensure that both the graphic elements and the terminology are maintained across similar platforms; cognitive load should be minimized by keeping task-relevant information within the display while users explore the interface; clutter should be kept to a minimum; and the design should endeavour to mirror the language and concepts users would find in the real world, based on who the target users are [29]. The parameters and the experts' evaluations are depicted in Table VIII, where 'Exp' indicates a domain expert.

TABLE VIII. Heuristic evaluation of the system

Heuristic Checklist                 | Exp 1 | Exp 2 | Exp 3 | Exp 4 | Exp 5 | Avg
Visibility                          | 4     | 5     | 5     | 3     | 2     | 3.8
Match between System and Real World | 4     | 4     | 5     | 3     | 3     | 3.8
User Control                        | 5     | 5     | 4     | 4     | 4     | 4.4
Consistency and Standard            | 3     | 4     | 5     | 4     | 4     | 4.0
Recognition rather than Recall      | 5     | 4     | 3     | 5     | 3     | 4.0
Flexibility                         | 3     | 5     | 4     | 3     | 4     | 3.8
Minimal Design                      | 3     | 5     | 4     | 4     | 4     | 4.0

D. Findings and Limitations

From our survey and the recommendations of the users and experts, we found some limitations of our system. Currently, the overall application works in a static environment. The database is fixed with a limited number of commands and
contexts. The system fails to communicate if the words are not in the database or if the two devices are not connected to each other. The speech-to-text conversion should also be improved; natural language processing can be a helping hand here. Besides, the system is built only for Android smartphones, not for smartwatches or for phones running other platforms.

V. CONCLUSION & FUTURE WORK

The work in this paper has presented smart-device-based two-way communication among normal people and the deaf and mute. The system is multi-modal in the sense that it integrates speech, text, vibration, images, and gestures. The system focuses on smart devices as the medium of communication because they are available around the world and are less expensive than other communication devices. To judge the usefulness and feasibility of the system, feedback from the users and the experts was taken, which rendered a positive report of our overall model.

Still, there is much scope to improve our overall model and design, and the suggestions given by the users and experts were carefully considered. At present, the information (words and images) stored in the database is static. A fully adaptive model is planned for more flexibility, in which new words that are not yet available in the database will be stored automatically using natural language processing (NLP) and artificial intelligence (AI) by segmenting the command and context from the input speech. For easy learning and better acquaintance with the system, a game is planned through which users can understand the whole system in a recreational way. Developing sound recognition and translating text and speech into various languages are further promising ways to improve the existing model.

ACKNOWLEDGEMENT

This work has been partially supported by the Green University of Bangladesh research fund.

REFERENCES

[1] World Health Organization (WHO), "Deafness and hearing loss," https://www.who.int/news-room/fact-sheets/detail/deafness-and-hearing-loss (accessed 04/12/2019).
[2] American Speech-Language-Hearing Association (ASHA), "Degree of hearing loss," https://www.hear-it.org/Defining-hearing-loss, 2019 (accessed 04/07/2019).
[3] United Nations (UN), "International day of sign languages," https://www.un.org/en/events/signlanguagesday/, 23 September 2018 (accessed 04/12/2019).
[4] S. Hermawati and K. Pieri, Assistive technologies for severe and profound hearing loss: Beyond hearing aids and implants. Assistive Technology, The Official Journal of RESNA, DOI: 10.1080/10400435.2018.1522524, 2019.
[5] Z. C. Z. L. Y. Z. L. J. Kengo Kuroki, Yiming Zhou, A remote conversation support system for deaf-mute persons based on bimanual gestures recognition using finger-worn devices. IEEE, Electronic ISBN: 978-1-4799-8425-1, 2015.
[6] T. L. Kathawut Rojanasaroch, Communication Aid Device for Illness Deaf-Mute. IEEE, Electronic ISBN: 978-1-4799-7961-5, 2015.
[7] F. N. H. Al-Nuaimy, Design and implementation of deaf and mute people interaction system. IEEE, Electronic ISBN: 978-1-5386-1949-0, 2017.
[8] P.-J. Y. S.-J. W. Lih-Jen Kau, Wan-Lin Su, A Real-time Portable Sign Language Translation System. IEEE, Electronic ISBN: 978-1-4673-6558-1, 2015.
[9] G. A. Zahid Halim, A Kinect-Based Sign Language Hand Gesture Recognition System for Hearing- and Speech-Impaired: A Pilot Study of Pakistani Sign Language. RESNA, DOI: 10.1080/10400435.2014.952845, 2015.
[10] M. K. T. Anant Agarwal, Sign Language Recognition using Microsoft Kinect. IEEE, Electronic ISBN: 978-1-4799-0192-0, 2013.
[11] E. G. Kin Fun Li, Kylee Lothrop and S. Lau, A Web-Based Sign Language Translator Using 3D Video Processing. IEEE, Print ISBN: 978-1-4577-0789-6, 2011.
[12] M. B. Simon Lang and R. Rojas, Sign Language Recognition Using Kinect. ICAISC, Artificial Intelligence and Soft Computing, pp. 394-402, 2012.
[13] S. G. Fakhteh Soltani, Fatemeh Eskandari, Developing a gesture-based game for deaf/mute people using Microsoft Kinect. IEEE, DOI: 10.1109/CISIS.2012.55, 2012.
[14] A. M. Anchal Sood, AAWAAZ: A Communication System for Deaf and Dumb. IEEE, Electronic ISBN: 978-1-5090-1489-7, 2016.
[15] U. G. Surbhi Rathi, Development of Full Duplex Intelligent Communication System for Deaf and Dumb People. IEEE, Electronic ISBN: 978-1-5090-3519-9, 2017.
[16] S. T. H. R.-M. J. A. Z. I. Muhammad Yaqoob Javed, Muhammad Majid Gulzar, Implementation of Image Processing Based Digital Dactylology Converser for Deaf-Mute Persons. IEEE, Electronic ISBN: 978-1-4673-8753-8, 2016.
[17] M. E. H. A. S. A. S. M. G. Marwa Elpeltagy, Moataz Abdelwahab, Multi-modality-based Arabic sign language recognition. IET, ISSN: 1751-9632, 2018.
[18] M. W. Mahmoud M. El-Gayyar, Amira S. Ibrahim, Translation from Arabic speech to Arabic Sign Language based on cloud computing. Egyptian Informatics Journal, Volume 17, Issue 3, pp. 295-303, 2016.
[19] T. S. A. R. M. R. M. A. Kanwal Yousaf, Zahid Mehmood and Z. Shuguang, A Novel Technique for Speech Recognition and Visualization Based Mobile Application to Support Two-Way Communication between Deaf-Mute and Normal Peoples. Wireless Communications and Mobile Computing, Article ID: 1013234, 2018.
[20] S. K. G. Ayan Banerjee, Model Based Code Generation for Medical Cyber Physical Systems. ACM, ISBN: 978-1-4503-3190-6, 2014.
[21] W.-J. C. Y.-M. C. Liang-Bi Chen, Chia-Wei Tsai and K. S.-M. Li, A Real-Time Mobile Emergency Assistance System for Helping Deaf-Mute People/Elderly Singletons. IEEE, Electronic ISBN: 978-1-4673-8364-6, 2016.
[22] M. Mielke and R. Bruck, AUDIS Wear: A Smartwatch-Based Assistive Device for Ubiquitous Awareness of Environmental Sounds. IEEE, Electronic ISBN: 978-1-4577-0220-4, 2016.
[23] Google, "Cloud speech-to-text basics," https://cloud.google.com/speech-to-text/docs/basics (accessed 04/12/2019).
[24] Google, "Cloud text-to-speech api basics," https://cloud.google.com/text-to-speech/docs/basics (accessed 04/12/2019).
[25] J. Sauro and J. R. Lewis, Quantifying the User Experience: Practical Statistics for User Research. Morgan Kaufmann, ISBN: 0128023082, 2016.
[26] R. Hartson and P. Pyla, The UX Book. Morgan Kaufmann, eBook ISBN: 9780128010624, 2018.
[27] W. Albert and T. Tullis, Measuring the User Experience. Morgan Kaufmann, eBook ISBN: 9780080558264, 2008.
[28] "Computer system usability questionnaire," https://garyperlman.com/quest/quest.cgi (accessed 04/12/2019).
[29] "Heuristic evaluation: How to conduct a heuristic evaluation — Interaction Design Foundation," https://www.interaction-design.org/literature/article/heuristic-evaluation-how-to-conduct-a-heuristic-evaluation (accessed 04/07/2019).