PROJECT DOCUMENTATION
Submitted by:
JIMMIE MUNYI
TLCM/MG/1094/05/17
Table of Contents
Declaration
Dedication
Acknowledgment
ABSTRACT
CHAPTER ONE
INTRODUCTION
1.3 Objectives
1.4 Justification
CHAPTER TWO
LITERATURE REVIEW
2.1 Introduction
2.2.2 Smart Home Design for Disabled People Based on Neural Networks
2.3.5 Transfer Learning
CHAPTER THREE
PROJECT DESIGN
CHAPTER FOUR
SYSTEM DESIGN
CHAPTER FIVE
CONCLUSIONS AND RECOMMENDATIONS
5.1 DISCUSSION
5.2 RECOMMENDATION
5.3 CONCLUSION
REFERENCES
APPENDIX A
Questionnaire
APPENDIX B
Estimated budget
Work Plan
Declaration
I, Jimmie Munyi, hereby declare that this research project is my original work and has not been presented for the award of a degree or any similar purpose in any other institution. I certify that the intellectual content of this project is the product of my own work and that all assistance received in preparing it, and all sources used, have been acknowledged.
Student: Supervisor:
Dedication
This project is dedicated with profound admiration and appreciation to God Almighty for giving me strength and breath, and to my beloved parents and siblings for their moral support. I also thank our lecturers and project supervisors for giving me the opportunity to learn so much, and for their great patience while teaching me.
Acknowledgment
I would like to express my special gratitude to the Lord my God for how far we have walked together, and for His divine guidance, care and enabling throughout the project period. Special regards to my parents for funding my studies and providing all my necessities throughout this period. I would also like to extend my sincere gratitude to everyone who helped bring the project to successful completion.
ABSTRACT
This research project aims to use Deep Learning and Computer Vision to teach a computer to understand sign language through its different alphabets, and then to create a real-time, webcam-based application that translates sign language, so that deaf people can input information into a system using the method of communication they are already familiar with. The real-time webcam application serves as the simulation of the capabilities of my project. Over the last 10 years, Deep Learning and Artificial Intelligence have exploded, being applied everywhere, with research papers updated every time something new is discovered in the area. However, most of the research and progress in Deep Learning concerns the architectures themselves, and most effort in building Deep Learning applications is concentrated in the current hype areas, like self-driving cars and Transformer-based Natural Language Processing. To make sure that progress in Deep Learning is enjoyed by all human beings, we should ensure that we do not overlook any group. One group that is commonly left out is people living with disabilities, and little research is ongoing to make life easier for them using Deep Learning. I will design the research approach through mixed research techniques. The general design of the research and the methods used for data collection will primarily be observation and the utilization of currently available datasets. The first part of this document highlights the dissertation design, the second part discusses qualitative and quantitative data collection methods, and the last part illustrates the general research framework.
CHAPTER ONE
INTRODUCTION
Over the last 10 years the use cases of Deep Learning have exploded, and almost everyone is adopting some form of Deep Learning in their work and companies. The concepts of Deep Learning and Neural Networks have been around since 1943, but researchers did not invest much in them until 2012, when a group of researchers won the ImageNet challenge using neural networks. Another reason that has greatly influenced the explosion is that we now have powerful computers and Graphical Processing Units (GPUs), as well as numerous large datasets that are vital for training Neural Networks.
However, one drawback of the explosion is that much Deep Learning research is centered on a few areas that people believe are vital, like self-driving cars; this is justified, because breakthroughs in these areas can revolutionize how we live as a species. But if we want the benefits of Deep Learning to be enjoyed by all, we should be careful to broaden the research into all areas. One group commonly left out of technological improvement, not just AI, is people living with disabilities. This research tries to bridge that gap, focusing on one particular group: deaf people.
Deaf people communicate using sign language. Deep Learning has already proved that it can
understand images and videos using Computer Vision. Applications such as Face Recognition
are common now. We would only have to extend this area, and instead of teaching the computer
to understand faces, we could teach it to understand sign language.
This research project investigates whether we can bring the benefits gained from these advancements to the deaf.
I envision a system that uses deep learning and computer vision to teach a computer to understand sign language through its different alphabets, helping deaf people input information into the system using the method of communication they are already familiar with. The system will have real-time inference capability, recognizing sign language from the webcam.
1.3 Objectives
To deploy the computer system that has been taught to classify sign language in a real-time system that classifies signs on the go, as soon as they are illustrated.
1.4 Justification
The system will be efficient enough for practical use because it can classify images in real time, allowing communication with a deaf person to occur.
The system will aid the deaf in working with computers. It can be further developed so that they input data into a computer using Sign Language, which they are already comfortable with, instead of other methods that can prove challenging to them.
The system can be implemented even on a tight budget: once the Convolutional Neural Network is trained, all that is needed is a camera device that captures the sign language, and translation can be done. Even a cheap computer webcam can be used, which is in fact what this project uses.
For this project, I cover the design of the Convolutional Neural Network, training it, and using it to create the real-time inference system. The model will live on a remote server, and the final real-time system will communicate with it through a REST API.
CHAPTER TWO
LITERATURE REVIEW
2.1 Introduction
This section presents a review of literature related to the study, as sourced from various scholars, publications and relevant professional journals. Many research reports, articles and books have been written on understanding the use of academic information sharing in higher learning institutions across the world, and studies have been done to explore the challenges of adopting Neural Networks for various Computer Vision tasks.
Since Sign Language Classification is a special application of Neural Networks in Computer Vision, this Literature Review presents literature related to the study of Neural Networks in general, Convolutional Neural Networks and Computer Vision. The review has been done in accordance with the research objectives.
2.2.2 Smart Home Design for Disabled People based on Neural Networks
In 2014, Ali Hussein, Mehdi Adda and Mima Atieh published this research paper, which looks into how smart homes can be designed for people living with disabilities.
2.3 Review of Theoretical Literature
2.3.1 Deep Learning
Deep learning is a computer technique to extract and transform data, with use cases ranging from human speech recognition to animal imagery classification, by using multiple layers of neural networks. Each of these layers takes its inputs from previous layers and progressively refines them. The layers are trained by algorithms that minimize their errors and improve their accuracy. In this way, the network learns to perform a specified task.
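The idea of layers progressively refining their inputs can be sketched in a few lines of Python. This is a toy illustration of the layered structure, not code from this project:

```python
import numpy as np

def relu(x):
    """Non-linearity applied after each layer."""
    return np.maximum(x, 0.0)

def forward(x, layers):
    """Each layer refines the previous layer's output: out = relu(W @ in + b)."""
    for W, b in layers:
        x = relu(W @ x + b)
    return x
```

During training, an optimization algorithm adjusts each `W` and `b` so that the network's final output gets closer to the desired answer.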
Deep learning has power, flexibility, and simplicity. That is why it should be applied across many disciplines, including the social and physical sciences, the arts, medicine, finance, scientific research, and many more.
Here is an example of the many tasks in different areas at which deep learning, or methods heavily using deep learning, is now the best in the world:
Robotics: handling objects that are challenging to locate (e.g., transparent, shiny, or lacking texture) or hard to pick up.
What is remarkable is that deep learning has such varied applications, yet nearly all of deep learning is based on a single type of model: the neural network.
2.3.3 Convolutional Neural Networks
Convolutional Neural Networks are the de facto and most utilized type of Neural Network in Computer Vision and classification. They are modeled with inspiration from how the human eye operates. They remove the need for the manual, time-consuming feature extraction by practitioners that was required before Convolutional Neural Networks were conceived, and they provide a more scalable approach to image classification, leveraging ideas from matrix multiplication in linear algebra to identify patterns in images. They require a lot of computational power, which is why we normally train them using Graphical Processing Units (GPUs). They usually have three main layers that enable them to do their work: a Convolution Layer, a Pooling Layer and a Fully Connected Layer. The Convolution Layer is where most of the computation occurs and is the core building block of the network. The Pooling Layer performs dimensionality reduction on its input. Finally, the Fully Connected Layer performs the classification based on the features extracted by the previous layers and their different filters. There are many types of Convolutional Neural Networks, but the one I am going to utilize in this project is the Deep Residual Network, or ResNet.
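To make the first two layer types concrete, here is a minimal sketch of the convolution and pooling operations in plain NumPy. It is a toy single-channel version; real libraries implement these operations far more efficiently:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (strictly, cross-correlation, as in CNN libraries):
    slide the kernel over the image and take a weighted sum at each position."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def maxpool2x2(x):
    """2x2 max-pooling: keep the strongest response in each block,
    halving both spatial dimensions."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))
```

A Fully Connected Layer then maps the flattened, pooled features to one score per class.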
2.3.5 Transfer Learning
Transfer Learning is the reuse of a pre-trained model on a new problem. Since training Neural Networks can take a lot of time, transfer learning helps when creating new models because it leverages knowledge gained from previous training for the current project. Transfer Learning is going to be vital in this research project.
CHAPTER THREE
PROJECT DESIGN
A number of approaches are used in this research design. The purpose of this chapter is to describe the methodology of the research approach through mixed research techniques; the research approach also guides the researcher in arriving at the result findings. In this chapter, the general design of the research and the methods used for data collection are explained in detail, in three main parts. The first part highlights the collection of data and how the data is manipulated to make it more useful. The second part discusses the design of the Convolutional Neural Network and how the collected data is fed into the model. The last part illustrates the creation of the final system, which uses the trained model to predict new sign language signals from the webcam. This last system will be the simulation of my Research Project.
3.1.1 Data collection from participants
Willing participants will have pictures taken of their hands as they illustrate Sign Language with their fingers. Before participating, however, they need to sign the Permission to Participate in Project questionnaire available in Appendix A. Signing this document shows that they are willing for the pictures taken of their hands to be used in the project. The pictures will be taken against a standard white background to prevent differing backgrounds from affecting the quality of the model.
3.1.2 Use of currently available datasets
On top of the data I collect from people, I will use the American Sign Language Dataset from Kaggle, which can be found here: https://www.kaggle.com/grassknoted/asl-alphabet. It contains 87,000 images of 200x200 pixels in 29 classes: 26 classes representing the letters A to Z, and 3 classes representing space, delete and nothing. The nothing class will be useful for the model to predict when no sign language is being shown. The dataset is licensed under GPLv2, meaning we can use it for free in our projects.
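The dataset ships as one folder per class, so the class labels can be discovered directly from the directory layout. A small sketch; the folder names shown in the comment are assumptions based on the dataset description:

```python
from pathlib import Path

def list_classes(dataset_dir):
    """Return the sorted class labels, one per sub-folder
    (e.g. 'A'..'Z' plus 'space', 'del', 'nothing' for this dataset)."""
    return sorted(p.name for p in Path(dataset_dir).iterdir() if p.is_dir())
```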
3.1.3 Data Augmentation
Data Augmentation comprises techniques that increase the amount of data by adding slightly modified copies of already-existing data, or newly created synthetic data derived from existing data. It acts as a regularizer and helps to reduce overfitting when training a Deep Learning model. It is going to be particularly useful for making our system universal: even when trained on pictures from right-handed people, it can generalize well to people who use their left hand to illustrate sign language.
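The left-hand/right-hand case above corresponds to a horizontal flip. As a minimal NumPy illustration (real augmentation pipelines, such as fastai's `aug_transforms`, also apply rotations, zooms and lighting changes):

```python
import numpy as np

def flip_horizontal(image: np.ndarray) -> np.ndarray:
    """Mirror an image left-to-right, turning a right-hand sign
    into its left-hand counterpart."""
    return image[:, ::-1].copy()
```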
3.2 DESIGN OF THE CONVOLUTIONAL NEURAL NETWORK
The design of this Convolutional Neural Network will be implemented in fastai, PyTorch and Python.
3.3 DESIGN OF THE FINAL SYSTEM
The final system will be used to simulate the capabilities of this project. When the code is run, it uses the webcam to record a person illustrating sign language, queries the trained model discussed in the previous part through a REST API, and interprets the sign language being illustrated in real time. The final system will be built using the OpenCV framework and Python.
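The exact API contract is an assumption, but the client side of such a REST call can be sketched with the standard library: base64-encode the captured frame and POST it to the model's endpoint. The URL and the `image`/`label` field names are illustrative, not the project's actual values:

```python
import base64
import json
from urllib import request

API_URL = "http://localhost:8000/predict"  # hypothetical endpoint for the hosted model

def build_payload(jpeg_bytes: bytes) -> bytes:
    """JSON body with the frame base64-encoded (assumed contract)."""
    return json.dumps({"image": base64.b64encode(jpeg_bytes).decode("ascii")}).encode()

def predict(jpeg_bytes: bytes) -> str:
    """POST one frame to the model API and return the predicted label."""
    req = request.Request(API_URL, data=build_payload(jpeg_bytes),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["label"]
```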
[Figure: system architecture, with the trained model hosted in cloud storage.]
3.4 SYSTEM REQUIREMENTS
CHAPTER FOUR
SYSTEM DESIGN
4.0 INTRODUCTION
This chapter presents the analysis and representation of the data collected for my Sign Language System. The purpose is to describe the data collected and to see how it can be used to build the final system. The findings of the analysis relate to the research questions.
To build this project, I utilized the public American Sign Language Dataset from Kaggle described in Chapter Three (https://www.kaggle.com/grassknoted/asl-alphabet): 87,000 images of 200x200 pixels in 29 classes, covering the letters A to Z plus space, delete and nothing.
Here is an image of all the letters in the American Sign Language alphabet and their corresponding alphabetic representation.
Here are some of the images that were used to train my model, coming from the public dataset as well as from data collected from local sources.
4.2 Creating and Training the Convolution Neural Network
Using the collected data, which looks like the sample shown above, I trained a ResNet-18 model using Transfer Learning, which enables us to get good results with less training data and compute power.
The model was trained using the free GPUs provided by Google Colab.
The main training loop and the results achieved were as follows:
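In outline, the loop follows the standard fastai pattern. The following is a sketch rather than the exact notebook code; the data path and validation split are assumptions:

```python
def train(data_dir="data/asl_alphabet_train", epochs=4):
    # fastai is imported lazily so this sketch stays importable without it installed.
    from fastai.vision.all import (ImageDataLoaders, Resize, aug_transforms,
                                   vision_learner, resnet18, accuracy)
    dls = ImageDataLoaders.from_folder(
        data_dir, valid_pct=0.2, seed=42,   # hold out 20% of images for validation
        item_tfms=Resize(200),              # dataset images are 200x200 pixels
        batch_tfms=aug_transforms(),        # data augmentation, including flips
    )
    learn = vision_learner(dls, resnet18, metrics=accuracy)  # ImageNet-pretrained ResNet-18
    learn.fine_tune(epochs)                 # transfer learning: train the head, then unfreeze
    return learn
```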
After only 4 epochs of training with transfer learning, I achieved a validation accuracy of 99.99%, which shows the benefits of using transfer learning instead of training from scratch.
4.2.2 Model Testing
Validation accuracy alone is usually not the best measure of a model's effectiveness, because the model sees the validation data after every epoch and may start memorizing it. To solve this, we usually use a separate test set that is put aside and used only once, after we have trained our model to our liking.
After achieving a high accuracy of 99.99% in training, I tested my now trained model against a
test dataset that I had set aside.
This method takes the test images, the model we want to test and the labels of the images, runs inference, and returns the accuracy on the test set as a percentage.
I then ran inference on my trained model and got the following results:
Again, I got a high accuracy of 100% on the test set, which suggests that the model actually learned to distinguish different sign language symbols rather than memorizing them.
After I was satisfied with my training results, it was time to build my final system, which would take video input from the user and give a real-time interpretation.
For the final system, I utilized OpenCV, which can capture streaming video from the webcam so that the system can classify which sign language is being illustrated.
Some basic setup for the environment and the model to run:
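The setup amounts to a handful of constants. The values below are illustrative assumptions, not the exact ones used in the project:

```python
import string

API_URL = "http://localhost:8000/predict"  # REST endpoint serving the trained model (assumed)
BOX = (100, 100, 300, 300)                 # x1, y1, x2, y2 of the blue capture box (assumed)
CLASSES = list(string.ascii_uppercase) + ["space", "del", "nothing"]  # the 29 dataset classes
```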
The main loop that runs the model on the webcam and gives a prediction based on the sign
language illustrated:
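A sketch of such a loop with OpenCV; the box coordinates are assumptions, and the call to the model's REST API is abbreviated to a comment:

```python
import numpy as np

BOX = (100, 100, 300, 300)  # x1, y1, x2, y2 of the blue capture box (assumed)

def extract_roi(frame: np.ndarray, box=BOX) -> np.ndarray:
    """Crop the region inside the blue box, where the user signs."""
    x1, y1, x2, y2 = box
    return frame[y1:y2, x1:x2]

def main():
    import cv2  # imported here so the helper above stays importable without OpenCV
    cap = cv2.VideoCapture(0)                   # default webcam
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        roi = extract_roi(frame)
        # ...send `roi` to the model's REST API and read back the predicted label...
        label = "?"
        x1, y1, x2, y2 = BOX
        cv2.rectangle(frame, (x1, y1), (x2, y2), (255, 0, 0), 2)  # blue box (BGR)
        cv2.putText(frame, label, (x1, y1 - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.0, (255, 0, 0), 2)
        cv2.imshow("Sign Language", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):   # press q to quit
            break
    cap.release()
    cv2.destroyAllWindows()
```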
4.4 System Output
The system predicts which sign is illustrated while it runs on the webcam. One places his or her hand in the blue box shown so that the model can predict what is being illustrated.
The frame above predicts nothing because no Sign Language is illustrated in the blue box.
Here are some outputs when I illustrate different Sign Language letters in the blue box:
A video version of the system working can be seen in this YouTube Video I uploaded to
demonstrate my system: https://www.youtube.com/watch?v=-nggi8EwfOA
CHAPTER FIVE
CONCLUSIONS AND RECOMMENDATIONS
5.1 DISCUSSION
This chapter discusses the objectives of the system stipulated in earlier chapters, the limitations of the system, and the conclusion and recommendations of the study.
5.2 RECOMMENDATION
A recommendation of the study is that further work be done to improve the Sign Language System. The system is ideal for the intended purpose; however, it would perform better if the following recommendations and suggestions are considered:
I. Improving the system to work on letters and numbers instead of just letters.
II. Porting the system to other convenient platforms, like a mobile app, which would be more accessible to everyone.
III. Improving the robustness of the system, which experiences some challenges when run against dark backgrounds.
5.3 CONCLUSION
Recognition of Sign Language using Computer Vision can prove very useful. One use-case that comes to mind: when video-calling a person who uses Sign Language to communicate, a system could run in the background that recognizes the sign language being illustrated and translates it into English words for the person on the other end. This would be beneficial to both parties.
In general, using advancements in technology such as Deep Learning to help people living with disabilities should be an area of concern for researchers and practitioners.
REFERENCES
1. Howard, Jeremy, and Sylvain Gugger. 2020. Deep Learning for Coders with Fastai and
PyTorch: AI Applications Without a PhD. 1st ed. O’Reilly Media, Inc.
2. Kinsley, Harrison, and Daniel Kukiela. Neural Networks from Scratch in Python.
3. Bradski, G. 2000. “The OpenCV Library.” Dr. Dobb’s Journal of Software Tools.
4. Clark, Alex, and Contributors. n.d. “Python Imaging Library (Pillow Fork).”
https://github.com/python-pillow/Pillow.
5. Deng, J., W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. 2009. “ImageNet: A Large-
Scale Hierarchical Image Database.” In CVPR09.
6. Elkins, Andrew, Felipe F. Freitas, and Veronica Sanz. 2019. “Developing an App to
Interpret Chest X-Rays to Support the Diagnosis of Respiratory Pathology with Artificial
Intelligence.”
7. He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2015. “Deep Residual Learning
for Image Recognition.” CoRR abs/1512.03385. http://arxiv.org/abs/1512.03385.
8. He, Tong, Zhi Zhang, Hang Zhang, Zhongyue Zhang, Junyuan Xie, and Mu Li. 2018. “Bag
of Tricks for Image Classification with Convolutional Neural Networks.” CoRR
abs/1812.01187. http://arxiv.org/abs/1812.01187.
9. Huang, Gao, Zhuang Liu, and Kilian Q. Weinberger. 2016. “Densely Connected
Convolutional Networks.” CoRR abs/1608.06993. http://arxiv.org/abs/1608.06993.
10. Jeremy Howard, Sylvain Gugger, and contributors. 2019. SwiftAI. fast.ai, Inc.
https://github.com/fastai/swiftai.
11. Kluyver, Thomas, Benjamin Ragan-Kelley, Fernando Pérez, Brian Granger, Matthias
Bussonnier, Jonathan Frederic, Kyle Kelley, et al. 2016. “Jupyter Notebooks – a Publishing
Format for Reproducible Computational Workflows.” Edited by F. Loizides and B.
Schmidt. IOS Press.
APPENDIX A
Permission to participate in Project
The following refers to a study into the implementation of my project. The questions are intended for the user to give permission for the data I collect from them to be used in the project. I will be collecting hand images of people performing Sign Language in order to use them to train the Convolutional Neural Network.
1. Gender
Male Female
2. Age
3. Name:
4. Are you willing to let the images taken of your hand be used in the following research
project? (Tick the appropriate response)
Yes
No
5. Signature:
We promise to ensure anonymity of the data we collect from you and to respect your privacy.
APPENDIX B
Estimated budget
Item Quantity Price (ksh)
Printing 10 100
TOTAL 4600
Work Plan
This is the work plan used throughout the project; it is estimated to take 15 weeks.
Implementation 3 weeks D
Documentation 2 weeks E