
An Application for Hand Gesture Detection and

Control of A Computerized System


1st Kr Baruah Swagat 2nd Konwar Utkarsh 3rd Mahanta Rajpratim
Computer Science Engineering Computer Science Engineering Computer Science Engineering
DUIET, Dibrugarh University DUIET, Dibrugarh University DUIET, Dibrugarh University
Dibrugarh, India Dibrugarh, India Dibrugarh, India
17swagat@gmail.com utkarshkonwar@gmail.com rajpratimboko@gmail.com

4th Sarma Debajit


Technology Innovation Hub
Indian Institute of Technology, Guwahati
Guwahati, India
s.debajit@iitg.ac.in

Abstract—A systematic approach to develop a hand-sign recognition software system aimed at detecting and interpreting user hand-signs for specific functionalities, enabling an easier UI experience. The process unfolds through several sequential steps. Firstly, the data collection phase involves employing a custom Python script integrated with OpenCV, capturing a diverse range of hand-sign images crucial for model training. Subsequently, the model training step encompasses extracting hand landmark data using mediapipe and leveraging scikit-learn in conjunction with the Random Forest Classifier. Once the model is trained, a Python script assigns distinct functions to recognized hand-signs, achieved by mapping gestures to keyboard shortcuts. Lastly, the software's functionality is encapsulated within a tkinter GUI, yielding a cohesive user experience. This approach achieves efficient and accurate hand-sign recognition, enabling seamless interaction and activation of functions through gesture-based input.

Index Terms—Hand-sign recognition, machine learning, Random Forest, hand-gestures, GUI.

I. INTRODUCTION

The notion of remotely controlling laptops through hand gestures emerges as an appealing solution, potentially enhancing user experience and mitigating multiple inconveniences. Gesture recognition is an important area of research in computer vision. In the framework of human–computer non-verbal communication, the visual interface establishes communication channels for inferring intentions from human behaviour, including facial expressions and hand gestures [1].

The research work in hand gesture recognition has been developing for more than 38 years [2]. This paper explores the concept of remote laptop control using hand gestures, focusing on the development of an intuitive and accessible software program. The motivation for this endeavor arises from the authors' recognition of the aforementioned challenges. Current solutions, while presenting promising capabilities, are often accompanied by significant drawbacks. A gesture recognition system depends on several subsystems connected in sequence; given this arrangement, the overall performance of the framework is reliant on the precision of every subsystem [3]. The requirement for specialized hardware, such as intricate cameras for detecting hand movements, adds a barrier to entry. Moreover, existing software solutions tend to be either limited in scope or complex in execution, leaving non-programming users at a disadvantage. Image processing algorithms are used to detect and track a user's hand signs and facial expressions. This is easier compared to other approaches because wearing additional hardware is not necessary here [4]. Also, from the computer's perspective, the performance of recognizing hand gestures is affected by the environmental surroundings (such as light, background, distance, and skin color) and by the position and direction of the hand [5]. Consequently, this paper aims to propose a software-driven approach that empowers users to manipulate specific laptop features through hand gestures, ensuring user-friendliness and accessibility as central design principles.

The subsequent sections of this paper delve into the intricacies of the proposed software program's development, encompassing data collection, model training, gesture-to-function mapping, and Graphical User Interface (GUI) integration. Through a systematic exploration, this research seeks to offer a viable and user-centric solution that bridges the gap between hand gesture recognition technology and everyday laptop interactions.

II. LITERATURE REVIEW

Hand gesture frame acquisition is the process of capturing the human hand gesture with the computer [6]. In the landscape of human-computer interaction, hand gesture recognition has emerged as a pivotal area of research and development. Numerous studies have contributed to the evolution of gesture-based interfaces, aiming to bridge the gap between users and technology. Hand gesture recognition is a type of perceptual computing user interface used in HCI to give computers the capability of capturing and interpreting hand gestures, and executing commands based on an understanding made on a certain gesture [7]. In
artificial intelligence, machine learning enables computers to learn without being explicitly programmed [8]. Previous models have explored various approaches, ranging from vision-based techniques to machine learning algorithms. Developing hand gesture recognition systems such as sign language applications is extremely important to overcome the communication barrier with people who are unfamiliar with sign language [9].

This could be done using vision-based recognition, where no special gadgets are required and a web camera or a depth camera is used [10]. These models often relied on predefined hand features and specific lighting conditions, limiting their adaptability and accuracy across diverse scenarios. Sensors were also proposed for the same purpose, but elderly people suffering from chronic disease conditions that result in loss of muscle function may be unable to wear and take off gloves, causing them discomfort and constraining them if used for long periods [11].

Gesture recognition is akin to a problem of feature extraction and pattern recognition, in which a movement is labeled as belonging to a given class [12]. Machine learning approaches have gained prominence due to their ability to adapt to varying gestures and contexts.

In 1977, a system that detects the number of fingers bending using a hand glove was proposed by Zimmerman [13]. Furthermore, Gary Grimes in 1983 developed a system for determining whether the thumb is touching another part of the hand or fingers [14]. In 1990, despite the limited computer processing power, the systems developed then gave promising results [15].

III. PROCESSING WORKFLOW OF THE MODEL

Fig. 1. Workflow of Model

A. Data Collection

The initial phase involves utilizing a custom Python script to record specific hand signs, saving the captured images to designated directories, as depicted in the first column of Fig. 1. Each hand sign encompasses around 300 images stored in corresponding folders, covering the range of desired detections.

B. Training the Model

Following data collection, hand sign images undergo hand landmark extraction, generating hand landmark data used in a custom Python script for model training, as shown in the second column of Fig. 1. The hand landmarks are stored in a .csv file and subsequently fed into a hand sign classifier script.

C. Mapping Each Hand Sign

Hand signs are mapped to user-defined keys or key combinations based on individual preferences, as shown in the third column of Fig. 1. This mapping process is executed by the model.py file, facilitating tailored interaction.

IV. BENEFITS OF THE PROPOSED MODEL

The envisioned hand-sign recognition software system brings forth a range of advantages that transcend conventional laptop interaction paradigms. This section highlights several key benefits that underscore the potential of the model to revolutionize user-computer engagement.

• Optimized GUI for Ease of Layman Use: The integration of the software into a Graphical User Interface (GUI) fosters an environment that is intuitive and accessible to users of varying technical backgrounds. The graphical representation of hand-signs and their corresponding functionalities provides an elegant and straightforward interface that minimizes the learning curve, catering to both novices and seasoned users.

• Self-Programmable Hand Gesture Options: A distinguishing feature of the software is its flexibility in allowing users to associate specific hand gestures with personalized functions. This self-programmable aspect empowers users to define their own gestures for actions they commonly perform, fostering a sense of ownership and customization over their computing experience.

• Seamless Camera Operability with Multiple Cameras: The software's compatibility with multiple cameras extends its reach beyond the limitations of a single camera setup. This adaptability facilitates diverse usage scenarios, whether in laptops, external webcams, or integrated devices, ensuring an encompassing user experience regardless of the camera type.

• More Datasets Will Lead to More Efficient Values: The software's reliance on datasets for training implies that an increase in the diversity and volume of collected hand-sign images will contribute to a more refined recognition model. A larger dataset encompasses a broader spectrum of hand configurations, enhancing the software's ability to recognize a wider array of gestures with heightened accuracy.

• Dual Hand Gesture Detection for More Capabilities: By incorporating dual hand gesture detection, the
software capitalizes on the dexterity of both hands, expanding its capabilities beyond single-handed gestures. This facilitates the execution of intricate commands that involve simultaneous hand movements, augmenting the software's utility in performing complex tasks.

The proposed hand-sign recognition software system transcends conventional laptop interactions by offering an array of advantages that enhance user engagement. The optimized GUI, self-programmable gestures, multi-camera compatibility, dataset refinement, and dual hand gesture detection collectively forge a path toward an evolved and efficient mode of computer interaction.

V. SYSTEM IMPLEMENTATION AND VISUAL DEMONSTRATIONS

This section delves into the practical implementation of the hand-sign recognition software system, offering visual demonstrations that provide a tangible understanding of the software's functionality and its real-world application. The sub-sections within this segment present a comprehensive overview, beginning with dataset images that illuminate the training process, followed by visual representations of the working model in action. The culmination of the section presents real-life scenarios through working example images, showcasing the software's efficacy and its potential to reshape conventional user-computer interactions.

A. Dataset Images

In this section, we present a selection of dataset images utilized during the training phase of the hand-sign recognition model. These images represent a diverse range of hand-sign configurations, showcasing the variability inherent in real-world user interactions. The dataset serves as the foundation upon which the model's accuracy and robustness are cultivated.

Fig. 2. Training Models

As evident from Fig. 2, we have used a total of 12 datasets comprising 300 images each to train the model, and the model becomes more efficient the more images it is fed.

B. Working Model Images

Fig. 3 depicts snapshots of the working model in action, demonstrating the software's interface and functionality.

Fig. 3. GUI of the Model

The remaining datasets are similarly assigned different functions, and all of the above-mentioned gestures can be reconfigured by the end-user as needed to perform any necessary task that can be programmed or is pre-determined in the GUI itself.

C. Working Example Images

This section offers practical insights into the software's capabilities through real-world application scenarios. The following images depict a change of the computer's sound volume levels using hand gesturing.
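The volume-control behaviour in the following figures is driven by the distance between the thumb tip and the index fingertip. The paper does not publish this routine, so the sketch below is an illustrative reconstruction: it assumes MediaPipe's standard 21-point landmark indexing (4 = thumb tip, 8 = index fingertip), and the calibration bounds used to normalize the fingertip distance into a 0-100 volume percentage are hypothetical.

```python
import math

# MediaPipe hand-landmark indices (standard 21-point hand model).
THUMB_TIP = 4
INDEX_TIP = 8

# Hypothetical calibration bounds for the normalized fingertip distance.
MIN_DIST, MAX_DIST = 0.03, 0.30

def distance(a, b):
    """Euclidean distance between two (x, y) landmark tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def volume_percent(landmarks):
    """Map the thumb-to-index fingertip distance onto a 0-100 volume level."""
    d = distance(landmarks[THUMB_TIP], landmarks[INDEX_TIP])
    t = (d - MIN_DIST) / (MAX_DIST - MIN_DIST)  # normalize to [0, 1]
    return round(100 * max(0.0, min(1.0, t)))   # clamp and scale

# Example: a wider pinch yields a higher volume level.
lm = {THUMB_TIP: (0.40, 0.50), INDEX_TIP: (0.40, 0.23)}
print(volume_percent(lm))
```

In the full system this percentage would drive the operating-system volume; the calibration constants would need tuning per camera and seating distance.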
Fig. 4. Volume Control at 60

In Fig. 4, the volume level is detected to be at 60 percent, and the detected hand gesture is used to set the volume of the computer. By shortening and lengthening the distance between the thumb and the index finger, we can control the volume level of the computer.

Fig. 5. Volume Control at 40

As evident from Fig. 5, as we shortened the distance between the thumb and the index finger, the volume level of the computer subsequently decreased from 60 to 40, thereby proving the model effective.

VI. TECHNOLOGY STACK AND TOOLS

The realization of the hand-sign recognition software system hinges upon a cohesive blend of versatile technologies and tools. The fusion of these components contributes to the system's functionality, usability, and effectiveness. The key tools employed in the development of the model include:

• Python: Serving as the software's foundation, Python offers a versatile programming language with extensive libraries, facilitating seamless integration with various tools.

• OpenCV: The software leverages OpenCV for capturing images and processing video streams, enabling image manipulation and hand-sign extraction, pivotal in the workflow.

• Tkinter: Tkinter's integration supports the creation of an accessible GUI, providing an intuitive interface for user interaction, enhancing navigation and control.

• Mediapipe: At the core of hand landmark extraction, the mediapipe library enables intricate hand configuration recognition, crucial for accurate hand-sign classification.

• Scikit-Learn: Serving as a robust machine learning library, Scikit-Learn contributes to various model training phases. Specifically, the Random Forest classification algorithm is employed, pivotal to the software's capabilities.

This amalgamation of Python, OpenCV, Tkinter, Mediapipe, and Scikit-Learn constitutes a synergistic technology stack. This stack underpins the software's architecture, enabling efficient data collection, accurate model training, intuitive user interaction, and sophisticated gesture recognition. In particular, the utilization of Scikit-Learn's Random Forest classification stands as a pivotal step in realizing the software's core capability to identify and interpret hand gestures, thus forming the cornerstone of its training process.

VII. RESULTS

The implementation of the hand-sign recognition software system culminates in a series of compelling results that affirm its efficacy in transforming user-computer interaction. In its current state, the software exhibits robust performance, particularly under well-lit conditions, where accurate hand landmark extraction and gesture recognition prevail. This compatibility with optimal lighting environments underscores the software's ability to function seamlessly in everyday settings.

Dataset  Model          Accuracy
0        Random Forest  100.0
1        Random Forest  100.0
2        Random Forest  100.0
3        Random Forest  100.0
4        Random Forest  100.0
5        Random Forest  100.0
6        Random Forest  100.0
7        Random Forest  100.0
8        Random Forest  100.0
9        Random Forest  100.0
10       Random Forest  100.0
11       Random Forest  100.0
TABLE I
ACCURACY DATA OF MODEL
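The per-dataset accuracies reported above come from scikit-learn's Random Forest classifier trained on the extracted hand-landmark features. The paper does not list the training script, so the following is a minimal sketch of the described pipeline under stated assumptions: the landmark rows that would normally be read from the extraction step's .csv file are simulated here with synthetic, well-separated data, and the 80/20 train/test split matches the test size ratio of 0.2 reported in this paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the hand-landmark .csv: 12 hand-sign classes,
# 300 samples each, 42 features (21 landmarks x 2 coordinates).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.1, size=(300, 42)) for c in range(12)])
y = np.repeat(np.arange(12), 300)

# 80/20 split, matching the test_size ratio of 0.2 used in the paper.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"Accuracy: {accuracy * 100:.1f}")
```

With real landmark data, the accuracy naturally varies with dataset size, the variation within each hand sign, and the diversity of testing images, as discussed below.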
Dataset  Model          Accuracy
0        Random Forest  99
1        Random Forest  100.0
2        Random Forest  100.0
3        Random Forest  100.0
4        Random Forest  99
5        Random Forest  100.0
6        Random Forest  93
7        Random Forest  91
8        Random Forest  100.0
9        Random Forest  91
TABLE II
ACCURACY DATA OF COMPARISON MODEL

Adapted from "Hand sign recognition from depth images with multi-scale density features for deaf mute persons," by Sahana T, et al., 2020, Procedia Computer Science, Volume 167, pp. 2043-2050, https://doi.org/10.1016/j.procs.2020.03.243.

The accuracy score was derived from our custom dataset of 11 distinct hand signs, each encompassing 300 diverse images. The dataset was divided using a test size ratio of 0.2, indicating that 20% of the images were allocated for testing the model, while the remaining 80% were employed for training. It is noteworthy to emphasize that the accuracy score is subject to variation based on factors such as the total dataset size, the number of variations within the same hand sign, and the diversity of testing images. Given our dataset's relatively modest scale, we achieved an accuracy score of 100%. This high accuracy underscores the effectiveness of our model in accurately recognizing hand signs and its suitability for the specific dataset used in this study.

The software's effectiveness in recognizing an array of gestures, coupled with its adaptability to diverse lighting conditions and angles, highlights its potential to revolutionize how users engage with their laptops. The presented results underscore the software's capacity to bridge the gap between traditional input methods and novel gesture-based interactions.

VIII. GUI-BASED OPEN-SOURCE DEPLOYMENT

For those interested in further exploration and utilization of the hand-sign recognition software system, an open-source project website has been established.

This website serves as a hub for accessing the software, its documentation, and relevant resources. The website also provides insights into ongoing developments, including the stable GUI currently under development.

The website, Wave, can be accessed at https://wavegesture.web.app/.

IX. CONCLUSION

In this paper, we comprehensively explore a hand-sign recognition software system that transforms user-computer interaction. Through a systematic approach covering data collection, model training, gesture mapping, and GUI integration, we demonstrate the software's empowerment of users to control laptops intuitively with hand gestures.

The integration of Python, OpenCV, Tkinter, Mediapipe, and Scikit-Learn forms a cohesive technology stack, with Scikit-Learn's Random Forest classification showcasing machine learning's pivotal role. While acknowledging current limitations, we actively pursue optimizations to enhance efficiency. With self-programmable gestures, camera compatibility, and comprehensive functionalities, the software pioneers gesture-based computing.

In conclusion, this software fosters a harmonious blend of technology and human interaction, redefining user-computer engagement through natural gestures.

REFERENCES

[1] Chakraborty, B. K., Sarma, D., Bhuyan, M. K., MacDorman, K. F. (2018). Review of constraints on vision-based gesture recognition for human-computer interaction. IET Computer Vision, 12(1), 3-15.
[2] Prashan, P. (2014). Historical development of hand gesture recognition. In Cognitive Science and Technology book series (CSAT), pp. 5-29. Springer, Singapore.
[3] Sarma, D., Bhuyan, M. K. (2021). Methods, databases and recent advancement of vision-based hand gesture recognition for HCI systems: A review. SN Computer Science, 2(6), 436.
[4] Sahana, T., Paul, S., Basu, S., Mollah, A. F. (2020). Hand sign recognition from depth images with multi-scale density features for deaf mute persons. Procedia Computer Science, 167, 2043-2050. http://dx.doi.org/10.1016/j.procs.2020.03.243.
[5] Vaibhavi, S. G., Akshay, A. K., Sanket, N. R., Vaishali, A. T., Shabnam, S. S. (2014). A review of various gesture recognition techniques. International Journal of Engineering and Computer Science, 3, 8202-8206.
[6] Ananya, C., Anjan Kumar, T., Kandarpa Kumar, S. (2015). A review on vision-based hand gesture recognition and applications. IGI Global, 11, 261-286.
[7] Meenakshi, P., Pawan Singh, M. (2011). Hand gesture recognition for human computer interaction. International Conference on Image Information Processing, Piscataway, pp. 1-7.
[8] Margaret, R. (2016). Analytics tools help make sense of big data. AWS. [Accessed 24 May 2018].
[9] Mohamed, N., Mustafa, M. B., Jomhari, N. (2021). A review of the hand gesture recognition system: Current progress and future directions. IEEE Access, 9, 157422-157436. doi: 10.1109/ACCESS.2021.3129650.
[10] Yasen, M., Jusoh, S. (2019). A systematic review on hand gesture recognition techniques, challenges and applications. PeerJ Computer Science, 5, e218. doi: 10.7717/peerj-cs.218.
[11] Oudah, M., Al-Naji, A., Chahl, J. (2020). Hand gesture recognition based on computer vision: A review of techniques. Journal of Imaging, 6(8), 73. doi: 10.3390/jimaging6080073.
[12] Nogales, R. E., Benalcázar, M. E. (2021). Hand gesture recognition using machine learning and infrared information: A systematic literature review. International Journal of Machine Learning and Cybernetics, 12, 2859-2886.
[13] Praveen, K. S., Shreya, S. (2015). Evolution of hand gesture recognition: A review. International Journal of Engineering and Computer Science, 4, 9962-9965.
[14] Laura, D., Angelo, M. S., Paolo, D. (2008). A survey of glove-based systems and their applications. IEEE Transactions on Systems, Man, and Cybernetics, 38, 461-480.
[15] Prashan, P. (2014). Historical development of hand gesture recognition. In Cognitive Science and Technology book series (CSAT), pp. 5-29. Springer, Singapore.
