
A PYTHON-BASED APPROACH FOR SIGN LANGUAGE RECOGNITION

Baskar.D (Asst. Prof.), Gowtham.M, Deivakumar.S, Akhillan.S, Arivarasu.T
Department of Information Technology
V.S.B ENGINEERING COLLEGE
Karur, India

ABSTRACT

The process of converting sign language hand motions into spoken or written language is known as sign language recognition. It involves analysing and deciphering hand and finger gestures using computer vision and machine learning algorithms, then mapping those gestures to corresponding words or phrases. There are various uses for sign language recognition, including facilitating communication between hearing and deaf persons, increasing accessibility for those with hearing loss, and improving sign language interpreting education and training programmes. This technology could fundamentally alter how we interact with those who use sign language as their primary form of communication. There are various methods for recognising sign language, including sensor-based methods, computer vision methods, and hybrid methods that integrate both. Computer vision-based algorithms identify hand gestures and movements by evaluating visual data from cameras. Sensor-based approaches use sensors attached to the hands or fingers to track motion and record data. In this work, we examine different approaches to sign language recognition and identify a superior approach with a higher accuracy rate.

KEYWORDS: Artificial intelligence, Hand gestures, Sign language, Deep learning, Neural networks

1. INTRODUCTION

Sign language recognition is a transformative technology that bridges communication barriers between the Deaf or hard of hearing community and the hearing world. Sign language is a rich and expressive form of communication using hand gestures, facial expressions, and body movements, and it serves as the primary means of communication for many individuals [11][12]. However, this linguistic diversity presents unique challenges in automatic sign language recognition. In recent years, advances in computer vision and machine learning have paved the way for the development of sign language recognition systems [13][14]. These systems hold immense promise in enabling the Deaf community to interact with technology, access educational resources, and participate more fully in a world that predominantly communicates using spoken language. This technology has the potential to enhance inclusivity, foster equal opportunities, and revolutionize how we communicate, making it an exciting and essential field of research and development. The development of sign language recognition systems is underpinned by computer vision and machine learning techniques [15][16]. These systems use cameras to capture and analyse the movements of a signer's hands, face, and body, and then translate these gestures into textual or auditory output [17][18]. While still an evolving field, the technology has made significant progress, offering applications in diverse areas such as:

Education: Sign language recognition can be integrated into educational tools and platforms to create a more inclusive learning environment [19][20]. It enables Deaf students to access content and communicate with teachers and peers effectively.

Accessibility: In public spaces, such as transportation hubs and government offices, sign language recognition systems can facilitate communication, ensuring that Deaf individuals can access information and services independently [21][22].

Healthcare: In healthcare settings, the technology can be used to bridge the communication gap between Deaf patients and medical professionals, allowing for more accurate and empathetic care [23][24].

Telecommunication: Sign language recognition can be applied to video conferencing and real-time communication platforms, enabling Deaf individuals to participate in virtual meetings and conversations [25][26].

Assistive Devices: Wearable devices with sign language recognition capabilities can serve as personal interpreters, improving everyday communication for Deaf individuals [27].

As research and development in sign language recognition continue to advance, there is an increasing focus on enhancing the accuracy and robustness of these systems and on supporting various sign languages and regional dialects [28]. Furthermore, creating affordable and user-friendly applications that integrate seamlessly into daily life remains a critical goal [29]. Figure 1 [Sign Language MNIST on Kaggle] shows the gestures for the alphabet.

Figure 1: Sign language for alphabets.


2. RELATED WORK

Farman Shah et al. [1] proposed a method for the recognition of thirty-six static alphabets using bare hands. The dataset is obtained from sign language videos. At a later step, four vision-based features are extracted, i.e., local binary patterns, a histogram of oriented gradients, an edge-oriented histogram, and speeded-up robust features. The extracted features are individually classified using multiple kernel learning (MKL) in a support vector machine (SVM). A one-to-all approach is then employed to extend the basic binary SVM into a multi-class SVM. A voting scheme is adopted for the final recognition, and the performance of the proposed technique is measured in terms of accuracy, precision, recall, and F-score. The simulation results are promising compared with existing approaches. Sign language is the way of communication and interaction for deaf people all around the world. This kind of
communication is accomplished through hand gestures, facial expressions, or movements of the arm and body. The sign language recognition system aims to enable the deaf community to communicate with the rest of society appropriately. It is a highly structured symbolic set that supports human-computer interaction (HCI). Sign language is very beneficial as a communication tool, and every day millions of deaf people around the world use sign language to communicate and express their ideas. This facilitation and assistance enable and encourage deaf persons to be a healthy part of society and to integrate into it.

Shikhar Sharma et al. [2] addressed the recognition of sign language, a structured form of hand gestures involving visual motions and signs that is used as a communication system. For the deaf and speech-impaired community, sign language serves as a useful tool for daily interaction. Sign language involves the use of different parts of the body, namely fingers, hand, arm, head, body and facial expression, to deliver information. However, sign language is not common among the hearing community, and few are able to understand it. This poses a genuine communication barrier between the deaf community and the rest of society, a problem yet to be fully solved to this day. There is a growing number of emerging technologies, such as EMG, LMC, and Kinect, which capture gesture information more readily. The common pre-processing methods used are median and Gaussian filters as well as downsizing of images prior to subsequent stages. Skin colour segmentation is one of the most commonly used segmentation methods. Colour spaces that are generally more robust to illumination conditions are CIE Lab, YCbCr and HSV. More recent research utilizes combinations of several other spatial features and modelling approaches to improve segmentation performance.
Nikolaos Adaloglou et al. [3] presented a study providing insights into sign language recognition, focusing on mapping non-segmented video streams to glosses. For this task, two new sequence training criteria, known from the fields of speech and scene text recognition, are introduced. Furthermore, a plethora of pre-training schemes is thoroughly discussed. Finally, a new RGB+D dataset for Greek sign language is created. Sign Language Recognition (SLR) can be defined as the task of inferring glosses performed by a signer from video captures. Even though there is a significant amount of work in the field of SLR, the lack of a complete experimental study is profound. Moreover, most publications do not report results on all available datasets or share their code. Thus, experimental results in the field of SL are rarely reproducible and lack interpretation. Spoken languages make use of the "vocal-auditory" channel, as they are articulated with the mouth and perceived with the ear. All writing systems also derive from, or are representations of, spoken languages. Sign languages (SLs) are different as they make use of the "corporal-visual" channel, produced with the body and perceived with the eyes. SLs are not international, and they are widely used by the communities of the Deaf. They are natural languages, since they develop spontaneously wherever the Deaf have the opportunity to congregate and communicate mutually. In this paper, a comparative experimental assessment of computer vision-based methods for sign language recognition is conducted. By implementing the most recent deep neural network methods in this field, a thorough evaluation on multiple publicly available datasets is performed.

Agelos Kratimenos et al. [4] investigated the extraction of 3D body pose, face and hand features for the task of Sign Language Recognition. These features are compared with OpenPose key-points, the most popular method for extracting 2D skeleton parameters, and with features from raw RGB frames and their optical flow fed into a state-of-the-art deep learning architecture used in action and sign language recognition. The experiments revealed the superiority of SMPL-X features due to the detailed and qualitative feature extraction in the three aforementioned regions of interest. Moreover, SMPL-X is exploited to point out the significance of combining all three regions for optimal results in SLR. Future work on 3D body, face and hand extraction for SLR includes further experiments on different independent datasets with more signers and varying environments. Furthermore, the authors strongly believe that applying SMPL-X to continuous SLR, where facial expressions and body structure are even more crucial, will give further prominence to this method. Finally, applying SMPL-X to different action recognition tasks is an interesting experiment to examine the universality of
SMPL-X success. In this work, SMPL-X, a contemporary parametric model that enables joint extraction of 3D body shape, face and hand information from a single image, is employed. This holistic 3D reconstruction is used for SLR, demonstrating that it leads to higher accuracy than recognition from raw RGB images and their optical flow fed into a state-of-the-art I3D-type network for 3D action recognition, and from 2D OpenPose skeletons fed into a Recurrent Neural Network.

Ilham Elouariachi et al. [5] identified three components that constitute a sign gesture: manual features, which are gestures made with the hands; non-manual features such as facial expressions and body posture; and finger-spelling, where words are spelt out letter by letter. In this context, the importance of fingerspelling can be noticed when a concept lacks a specific sign, such as names, technical terms, or foreign words. Sign recognition is a difficult task due to the complexity of its composition, which uses signs of different levels, words, facial expression, body posture and finger-spelling to convey meaning. With the development of recent technologies, such as the Kinect sensor, new opportunities have emerged in the field of human-computer interaction and sign language, allowing the capture of both RGB and Depth (RGB-D) information. With regard to feature extraction, traditional methods process the RGB and Depth images independently. In this paper, the authors propose a robust static finger-spelling sign language recognition system adopting Quaternion algebra, which provides a more robust and holistic representation based on fusing RGB images and Depth information simultaneously. Indeed, they propose, for the first time, new sets of Quaternion Krawtchouk Moments (QKMs) and Explicit Quaternion Krawtchouk Moment Invariants (EQKMIs). The proposed system is evaluated on three well-known fingerspelling datasets, demonstrating the performance of the novel method compared with other methods in the literature under geometrical distortion, noisy conditions and complex backgrounds, and indicating that it could be highly effective for many other computer vision applications.

Dongxu Li et al. [6] propose a new method to improve the performance of sign language recognition models by leveraging cross-domain knowledge in subtitled sign news videos. Isolated signs and news signs are coarsely aligned by joint training, and class centroids are stored in a prototypical memory for online training and offline inference. The model then learns a domain-invariant descriptor for each isolated sign. Based on the domain-invariant descriptor, a temporal attention mechanism is employed to emphasize class-specific features while suppressing those shared by different classes. In this way, the classifier focuses on learning features from class-specific representations without being distracted. Benefiting from the domain-invariant descriptor learning, the classifier not only outperforms the state of the art but can also localize sign words from sentences automatically, significantly reducing the laborious labelling procedure. External memory equips a deep neural network with the capability of leveraging contextual information. Such memories were originally proposed for document-level question answering (QA) problems in natural language processing. Recently, external memory mechanisms have been applied to visual tracking, image captioning, image classification and movie comprehension. In general, external memory serves as a source providing additional offline information to the model during training and testing.

Ozge Mercanoglu Sincan et al. [7] summarised the ChaLearn LAP Large Scale Signer Independent Isolated SLR Challenge. The challenge attracted more than 190 participants in two computational tracks (RGB and RGB+D), who made more than 1.5K submissions in total. The authors analysed and discussed the challenge design, top winning solutions and results. Interestingly, the challenge activity and results on the different tracks showed that the research community in this field is currently paying more attention to RGB information than to RGB+Depth data. Top winning methods combined hand/face detection, pose estimation, transfer learning, external data, fusion/ensembles of modalities and different strategies to model spatio-temporal information. In some signs, there is only a subtle difference in the position of the hand, e.g., government and Ataturk: while in the sign for government the index finger touches under the eye, in the sign for Ataturk it touches the cheek. In some signs, there is only a subtle difference in the movement of the hands, e.g., school and soup; in the sign for soup, the hand moves a little more from the bottom up. In some signs, facial expression
also contains an important clue to the meaning of the sign, e.g., heavy. Although the nature of the problems in this field is broadly similar to the action recognition domain, some peculiarities of sign languages make this domain especially challenging; for instance, for some pairs of signs, hand motion trajectories look very similar, yet local hand gestures look slightly different. On the other hand, for some pairs, hand gestures look almost the same, and signs are identified only by differences in the non-manual features, i.e., facial expressions. In some cases, a very similar hand gesture can carry a different meaning depending on the number of repetitions. Another challenge is the variation of signs when performed by different signers, i.e., body and pose variations, duration variations, etc. Also, variation in illumination and background makes the problem harder, which is inherently problematic in computer vision.

Boháček and Hrúz [8] explored the application of the Transformer to the task of isolated SLR. Previous works tackling this problem have frequently used computationally heavy approaches to obtain sensible results or relied on pre-trained backbones, which the authors avoid by using handcrafted pose feature representations and hence reducing the dimensionality. Furthermore, previous systems could not fully utilize normalization or augmentations beyond the standard ones applied to visual data. The authors propose a novel approach of utilizing the Transformer for this task. Since the model operates on top of body-pose sequence representations, knowledge from SL linguistics is applied to create a robust normalization technique as well as new data augmentation techniques specific to SL. The approach is validated on two datasets for isolated SLR, achieving overall state-of-the-art results on LSA64 and state-of-the-art results in the pose-based model category on WLASL. A performance study comparing the model with the I3D baseline showed that the newly proposed architecture is substantially less demanding and generalizes well even on very small training sets.

Katoch et al. [9] presented a methodology to build a large, assorted and robust real-time alphabet (A-Z) and digit (0–9) recognition system for Indian Sign Language. Instead of using high-end technologies like gloves or the Kinect, the authors recognised signs from images accessed from a webcam. The accuracy obtained is also discussed in the paper. Real-time, accurate and efficient ISL sign recognition is required to bridge the communication gap between the abled and the hearing- or speech-impaired. In future, the dataset can be expanded by adding more signs from the languages of various countries, thereby achieving a more effective framework for real-time applications. The method can be extended to the formation of simple words and expressions for both continuous and isolated recognition tasks. The secret to true real-time applications is enhancing the response time.

Kothadiya et al. [10] presented a study to develop a system that recognizes sign language and can be used offline. A vision-based approach was developed to obtain the data from the signer. One of the main features of this study is the ability of the system to identify and recognize the words included in IISL2020 (the authors' customized dataset). The IISL2020 dataset consists of 11 words; for each word, there are about 1100 video samples of 16 research participants, including males and females. The IISL2020 dataset was created under natural conditions, without extra brightness, orientation or background adjustments, gloves, etc. Unlike most sign language datasets, the authors did not use any external aids, such as sensors or smart gloves, to detect hand movements. In future research, the model's accuracy could be improved by developing different datasets under ideal conditions, changing the orientation of the camera, and even using wearable devices. Currently, the developed models work on isolated signs; the approach could be utilized for interpreting continuous sign language leading to syntax generation, especially in the context of ISL. The use of vision transformers can lead to more accurate results than those of feedback-based learning models.

3. EXISTING METHODOLOGIES

Sign language is widely used by deaf and dumb people as a medium of communication [30]. A sign language is composed of various gestures formed by different shapes of the hand, its movements and orientations, as well as facial expressions [31]. There are around 466 million people
worldwide with hearing loss, and 34 million of these are children. 'Deaf' people have very little or no hearing ability and use sign language for communication. People use different sign languages in different parts of the world [32]; compared with spoken languages, they are far fewer in number. In existing systems, the lack of datasets, along with the variance of sign language across localities, has resulted in restrained efforts in finger gesture detection [33][34]. The existing project aims at taking the basic step of bridging the communication gap between normal people and deaf and dumb people using Indian Sign Language [50]. An effective extension of the existing work to words and common expressions may not only let deaf and dumb people communicate faster and more easily with the outside world, but also provide a boost in developing autonomous systems for understanding and aiding them [35][36]. Indian Sign Language lags behind its American counterpart, as research in this field is hampered by the lack of standard datasets [37].

Figure 2: Sign language-based approaches: vision-based, sensor-based, and hybrid.


Vision-based systems use cameras as the primary tools to obtain the necessary input data. The main advantage of using a camera is that it removes the need for the sensors in sensory gloves and reduces the building cost of the system [38][39]. Cameras are quite cheap, and most laptops already include one, although a higher-specification camera may be needed because of the blur caused by a basic web camera [40][41].

The use of a certain type of instrumented glove fitted with various sensors, namely flexion (or bend) sensors, accelerometers (ACCs), proximity sensors, and abduction sensors, is an alternative approach with which to acquire gesture-related data [42][43]. These sensors are used to measure the bend angles of the fingers, the abduction between fingers, and the orientation (roll, pitch, and yaw) of the wrist. The degrees of freedom (DoF) that can be realized with such gloves vary from 5 to 22, depending on the number of sensors embedded in the glove [44][45].

The third method of collecting raw gesture data employs a hybrid approach that combines glove- and camera-based systems. This approach uses mutual error elimination to enhance the overall accuracy and precision [46][47]. However, not much work has been carried out in this direction due to the cost and computational overheads of the entire setup. Nevertheless, augmented reality systems produce promising results when used with a hybrid tracking methodology [48][49].

4. PROPOSED METHODOLOGIES

Sign language is a crucial means of communication for the Deaf and Hard of Hearing community. Developing a robust sign language recognition system using CNNs can facilitate better communication and accessibility. This proposed work aims to leverage deep learning techniques to recognize and interpret sign language gestures for real-time translation and communication. Sign language serves as a vital mode of communication for this community, and the project's central objective is to develop an accurate and efficient Convolutional Neural Network (CNN) model capable of recognizing and interpreting sign language gestures in real time. The work will involve the creation of a
user-friendly application that allows for the immediate recognition and translation of sign language gestures. Additionally, the project will explore the adaptability of the model to various sign languages and signers, ensuring it can serve a broader user base. To enhance the model's performance, techniques such as data augmentation and fine-tuning will be employed. The long-term goal includes examining opportunities to deploy the system in assistive technology and educational contexts, ultimately contributing to improved accessibility and communication for the Deaf and Hard of Hearing community.

CNN ALGORITHM STEPS

Data Collection:
 Collect a dataset of sign language gestures. This dataset should consist of images or video frames of individuals making various sign language signs. Ensure that the dataset is diverse and well-labeled.

Data Preprocessing:
 Preprocess the dataset by resizing images to a consistent size, normalizing pixel values, and potentially applying data augmentation techniques like rotation, scaling, and brightness adjustment to increase dataset diversity.

Data Splitting:
 Divide the dataset into training, validation, and testing sets. The training set is used to train the model, the validation set is used for hyperparameter tuning, and the testing set is for evaluating the final model performance.

Model Architecture:
 Design the CNN architecture. It typically consists of convolutional layers for feature extraction and fully connected layers for classification. The architecture should be deep enough to capture complex spatial patterns.

Convolutional Layers:
 Implement convolutional layers with appropriate filter sizes and strides. These layers learn to detect features and patterns within the sign language gestures.

Activation Functions:
 Apply activation functions, such as ReLU (Rectified Linear Unit), after each convolutional layer to introduce non-linearity into the model.

Pooling Layers:
 Insert pooling layers, like max-pooling, to reduce the spatial dimensions of the feature maps while retaining essential information. Pooling helps make the model more robust to variations in hand positions.

Flatten Layer:
 After the convolutional and pooling layers, add a flatten layer to convert the 2D feature maps into a 1D vector for input to the fully connected layers.

Fully Connected Layers:
 Incorporate fully connected layers for classification. The output layer should have neurons equal to the number of distinct signs to recognize, with an appropriate activation function (e.g., softmax for multi-class classification).

Loss Function:
 Choose a suitable loss function, such as categorical cross-entropy, to measure the error between the predicted sign and the actual sign in the training data.

Optimization Algorithm:
 Select an optimization algorithm (e.g., Adam, SGD) to minimize the loss function. Set hyperparameters like the learning rate and momentum.

Training:
 Train the CNN model using the training dataset. This involves forward and backward passes to update the model's weights over multiple epochs.

Validation:
 Validate the model's performance on the validation set during training. Monitor metrics like accuracy and loss to tune hyperparameters and prevent overfitting.

Testing:
 After training, evaluate the model on the testing set to assess its generalization and accuracy.

Fine-Tuning (Optional):
 Fine-tune hyperparameters or the model architecture to improve performance if necessary.

Deployment:
 Deploy the trained CNN for real-time sign language recognition, possibly integrating it into an application or device with a user-friendly interface.

These steps provide a framework for building a CNN for sign language recognition, and they can be adjusted based on the specific requirements of the task; a minimal sketch following these steps is given below.
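As a concrete illustration of the steps listed above, the following sketch builds and trains a small CNN with Keras/TensorFlow. It assumes a directory of labelled gesture images (e.g., data/train with one sub-folder per sign); the image size, batch size, number of classes, augmentation settings and file names are illustrative assumptions rather than fixed design choices of this work.

```python
# Minimal sketch of the CNN pipeline described above (assumed Keras/TensorFlow setup).
import tensorflow as tf
from tensorflow.keras import layers, models

IMG_SIZE = (64, 64)   # assumed input resolution
NUM_CLASSES = 26      # assumed number of sign classes (e.g., A-Z)

# Data collection / splitting: load labelled gesture images from a directory tree
# (one sub-folder per sign); an 80/20 train-validation split is used here.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train", image_size=IMG_SIZE, batch_size=32,
    validation_split=0.2, subset="training", seed=42)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train", image_size=IMG_SIZE, batch_size=32,
    validation_split=0.2, subset="validation", seed=42)

# Preprocessing and augmentation: normalize pixels, apply rotation/zoom/brightness jitter.
augment = models.Sequential([
    layers.Rescaling(1.0 / 255),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
    layers.RandomBrightness(0.2),
])

# Model architecture: stacked Conv+ReLU+MaxPooling blocks, flatten, dense classifier.
model = models.Sequential([
    layers.Input(shape=IMG_SIZE + (3,)),
    augment,
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

# Loss function and optimizer: cross-entropy (integer labels) minimized with Adam.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Training and validation: monitor validation accuracy/loss across epochs.
model.fit(train_ds, validation_data=val_ds, epochs=20)

# Testing/deployment: the trained model can be saved and reused for real-time inference.
model.save("sign_cnn.keras")
```

In practice, the number of convolutional blocks, the augmentation strengths and the number of output classes would be tuned to the particular sign language dataset being used.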

Figure 3: Proposed block diagram.


In this diagram, the proposed work can be split into two phases: a training phase and a testing phase. In the training phase, feature vector files are trained using the CNN algorithm. In the testing phase, the user's hand is captured from the camera and the hand and finger regions are detected. Finally, the gesture values are classified and a voice alert about the recognized sign is provided.
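As a rough sketch of this testing phase, the loop below captures webcam frames, crops an assumed fixed hand region of interest, classifies it with the trained CNN and speaks the predicted sign. The model file name, the ROI coordinates, the confidence threshold and the use of the pyttsx3 text-to-speech library are assumptions made for illustration; the actual system may localize the hand and finger regions differently.

```python
# Rough sketch of the testing phase: webcam capture -> hand ROI -> CNN -> voice alert.
import cv2
import numpy as np
import tensorflow as tf
import pyttsx3  # assumed offline text-to-speech engine

model = tf.keras.models.load_model("sign_cnn.keras")   # model trained as sketched above
labels = [chr(ord("A") + i) for i in range(26)]        # assumed A-Z sign classes
speaker = pyttsx3.init()

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Assume a fixed region of interest where the signer places the hand.
    roi = frame[100:300, 100:300]
    # Match training preprocessing: convert BGR to RGB, resize, add a batch dimension
    # (pixel rescaling happens inside the model's Rescaling layer).
    rgb = cv2.cvtColor(roi, cv2.COLOR_BGR2RGB)
    inp = cv2.resize(rgb, (64, 64)).astype("float32")[np.newaxis, ...]
    probs = model.predict(inp, verbose=0)[0]
    sign = labels[int(np.argmax(probs))]
    # Overlay the prediction and give a voice alert for confident predictions only.
    cv2.rectangle(frame, (100, 100), (300, 300), (0, 255, 0), 2)
    cv2.putText(frame, sign, (100, 90), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    if probs.max() > 0.9:
        speaker.say(sign)
        speaker.runAndWait()
    cv2.imshow("Sign recognition", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```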
5. CONCLUSION

Everyone can very persuasively convey their thoughts and understand each other through speech. Our initiative intends to close this gap by including a low-cost computer in the communication chain, allowing sign language to be captured, recognised, and translated into speech for the benefit of hearing-impaired individuals. An image processing technique is employed in this paper to recognise the hand movements. This application is used to present a modern integrated planned system
for hearing-impaired people. The camera-based zone of interest can aid in the user's data collection. From this survey, the vision-based real-time streaming approach with the CNN algorithm provides improved efficiency compared with the existing sensor-based and hybrid approaches. The sensor-based approach has a high cost, and the hybrid approach is difficult to implement.

REFERENCES

1. Shah, Farman, et al. "Sign language recognition using multiple kernel learning: A case study of Pakistan sign language." IEEE Access 9 (2021): 67548-67558.

2. Sharma, Shikhar, and Krishan Kumar. "ASL-3DCNN: American sign language recognition technique using 3-D convolutional neural networks." Multimedia Tools and Applications 80.17 (2021): 26319-26331.

3. Adaloglou, Nikolaos M., et al. "A comprehensive study on deep learning-based methods for sign language recognition." IEEE Transactions on Multimedia (2021).

4. Kratimenos, Agelos, Georgios Pavlakos, and Petros Maragos. "Independent sign language recognition with 3d body, hands, and face reconstruction." ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021.

5. Elouariachi, Ilham, et al. "Explicit quaternion krawtchouk moment invariants for finger-spelling sign language recognition." 2020 28th European Signal Processing Conference (EUSIPCO). IEEE, 2021.

6. Li, Dongxu, et al. "Transferring cross-domain knowledge for video sign language recognition." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.

7. Sincan, Ozge Mercanoglu, et al. "ChaLearn LAP large scale signer independent isolated sign language recognition challenge: Design, results and future research." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.

8. Boháček, Matyáš, and Marek Hrúz. "Sign pose-based transformer for word-level sign language recognition." Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2022.

9. Katoch, Shagun, Varsha Singh, and Uma Shanker Tiwary. "Indian Sign Language recognition system using SURF with SVM and CNN." Array 14 (2022): 100141.

10. Kothadiya, Deep, et al. "Deepsign: Sign language detection and recognition using deep learning." Electronics 11.11 (2022): 1780.

11. A. Wadhawan and P. Kumar, "Deep learning-based sign language recognition system for static signs," Neural Comput. Appl., vol. 32, no. 12, pp. 7957–7968, Jun. 2020.

12. M. Jiang, J. Dong, D. Ma, J. Sun, J. He, and L. Lang, "Inception spatial temporal graph convolutional networks for skeleton-based action recognition," in Proc. Int. Symp. Control Eng. Robot. (ISCER), Feb. 2022, pp. 208–213.

13. P. Das and A. Ortega, "Symmetric sub-graph spatio-temporal graph convolution and its application in complex activity recognition," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), Jun. 2021, pp. 3215–3219.

14. Z. Huang, X. Shen, X. Tian, H. Li, J. Huang, and X.-S. Hua, "Spatio-temporal inception graph convolutional networks for skeleton-based action recognition," in Proc. 28th ACM Int. Conf. Multimedia, Oct. 2020, pp. 2122–2130.

15. A. A. Hosain, P. Selvam Santhalingam, P. Pathak, H. Rangwala, and J. Kosecka, "Hand pose guided 3D pooling for word-level sign language recognition," in Proc. IEEE Winter Conf. Appl. Comput. Vis. (WACV), Jan. 2021, pp. 3429–3439.

16. J. Li, C. Xu, Z. Chen, S. Bian, L. Yang, and C. Lu, "HybrIK: A hybrid analytical-neural inverse kinematics solution for 3D human pose and shape estimation," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2021, pp. 3383–3393.

17. D. Li, C. R. Opazo, X. Yu, and H. Li, "Word-level deep sign language recognition from video: A new large-scale dataset and methods comparison," in Proc. IEEE Winter Conf. Appl. Comput. Vis. (WACV), Mar. 2020, pp. 1459–1469.

18. A. Tunga, S. V. Nuthalapati, and J. Wachs, "Pose-based sign language recognition using GCN and BERT," in Proc. IEEE Winter Conf. Appl. Comput. Vis. Workshops (WACVW), Jan. 2021, pp. 31–40.
