
Human Posture Detection on Lightweight DCNN and SVM in a Digitalized Healthcare System


Roseline Oluwaseun Ogundokun
Department of Multimedia Engineering, Kaunas University of Technology, Kaunas, Lithuania
Department of Computer Science, Landmark University Omu Aran, Nigeria
SDG 11, Sustainable Cities and Communities, Landmark University Omu Aran, Nigeria
rosogu@ktu.lt, [0000-0002-2592-2824]

Rytis Maskeliūnas
Department of Multimedia Engineering, Kaunas University of Technology, Kaunas, Lithuania
rytis.maskeliunas@ktu.lt

Robertas Damaševičius
Faculty of Informatics, Kaunas University of Technology, Kaunas, Lithuania
robertas.damasevicius@ktu.lt
2023 3rd International Conference on Applied Artificial Intelligence (ICAPAI) | 979-8-3503-2892-9/23/$31.00 ©2023 IEEE | DOI: 10.1109/ICAPAI58366.2023.10194156

Abstract: Human Posture Classification (HPC) has received significant emphasis in the domains of Human Posture (HP) and artificial intelligence (AI). By recognizing the standing, sitting, and walking positions of aged people, HPC could effectively monitor their health condition and ensure their well-being. Detecting various postures is difficult owing to the need for sufficient datasets and a classification framework. MobileNet is a lightweight Deep Convolutional Neural Network (DCNN) with a better recognition rate and fewer parameters. To further increase the generalization ability while reducing the number of model hyperparameters, techniques such as regularization, Transfer Learning (TL), and neural architecture search can be employed. To address the shortage of annotated data and improve the effectiveness of the approach, TL and Data Augmentation (DA) are applied to MobileNet and Xception in this study. A Support Vector Machine (SVM) replaces the final Fully Connected Layer (FCL) to boost performance, and TL and DA are also used with the SVM. The investigation compared the efficacy of the suggested approach with that of other cutting-edge image classification techniques, including ResNet50V2, InceptionV3, and DenseNet201, and found that the presented methodology performs better. Our recommended method captures temporal and depth characteristics from the image separately and incorporates them into classification computations. The suggested approach, based on MobileNet hybridized with SVM, attains the top performance with 92.12% test accuracy (Acc), 95% area under the curve (AUC), 92% recall (Rec), 93% precision (Prec), 92% F1 score, and a computational time of 3974 seconds.

Keywords: Human Posture detection, Deep Learning, Transfer learning, Image classification, Human Posture, Artificial Intelligence

I. INTRODUCTION

Healthcare systems, tracking, virtual worlds, indoor and outdoor tracking, and virtual environments for animation and entertainment are just a few of the uses for HPC. Additionally, the design of a residential interface may take advantage of posture detection (PD) [1, 2]. Given the growing aging population and the limited availability of medical resources, it is critical to propose a technique that supports real-time monitoring of elderly individuals with mental health issues so that they can live relatively independently [3]. Proper posture is crucial to living a balanced life. Posture refers to how the body is held and how the extremities are arranged. With the advancement of the internet, individuals have opted for a lifestyle that results in reduced exercise and daily fitness [4]. In [5], posture recognition was investigated using a combination of microcontrollers and inertial devices. Garg's team [6] used a CNN to classify yoga poses in real time and formalized the basic skeletonization for human keypoint recognition using the MediaPipe framework. The principle of object recognition was used in [7] to recognize yoga positions in real time. InceptionV3 and SVM were used to develop a hybrid model for human position identification [2]. Sitting for prolonged durations while studying or at work causes muscle pain. A healthy lifestyle positively influences a person's health; however, neglecting good posture or adopting incorrect posture could lead to discomfort in the spine, lumbar region, and limbs. Hence, monitoring people's postures is crucial to preserving their safety and well-being when they study or work. To enhance the effectiveness of posture identification, we constructed a deep learning-based model in this study. We compared it with the InceptionV3 and InceptionResNetV2 approaches, two of the most widely used image classification techniques. Standing and sitting are acknowledged forms of action. Because people can monitor their activities whether seated for a prolonged period or standing, the sitting and standing postures are crucial to identify.

ML is a subfield of AI that focuses on learning patterns from massive datasets using mathematical and quantitative approaches, and on making judgments or predicting future outcomes. Several computer vision applications, including segmentation, classification, and object tracking, have exhibited satisfactory performance using deep learning (DL) algorithms [8]. With promising findings from convolutional neural networks (CNNs) in computer vision, the medical imaging research community has shifted its focus to DL-based methodologies for developing computer-aided detection (CAD) systems to diagnose cancer.

TL is described as applying a previously trained architecture to address a different problem [9]. Building a large computational model requires tremendous data and computing resources. Data augmentation (DA) [10], an additional fundamental notion for this research, enlarges the limited training data set by generating new data through random transformations of the current data. It offers several benefits, such as accelerating convergence, reducing overfitting, and enhancing generalization ability. This study uses the following DA techniques: rotation range, width and height shift, horizontal flip, zoom, and shear range.

As illustrated in Table V, our suggested approaches provide the best performance in terms of Acc, AUC, Rec, Prec, F1-score, and computational time compared with previous research. This study presents unique ways of classifying images of human posture. The significant contribution of this research is the application of 2 lightweight DCNN approaches, MobileNet and Xception, in which an SVM is introduced in place of the FCL to obtain adequate detection accuracy (Acc) in HPC. The DA method is applied to the MPII human posture imagery to compensate for the shortage of data. In addition, the TL method is used to save cost, time, and computing resources.

The article is structured as follows. Section II addresses the relevant studies on the development of posture recognition. Section III describes the suggested approach's structure, essential activities, and hyperparameters. Section IV presents the results of the experiments. Section V provides a summary of the findings and recommendations for further investigation on this subject.

II. RELATED WORK

The study of HPC has increased dramatically with the introduction of image analysis and identification in 2D and 3D space [11]. Chen [11] comprehensively reviewed image-based monocular body position estimation using DL-based approaches. They stated that several posture estimation approaches are available, including body model-based and model-free approaches, pixel-level assessment, body joint position mapping, and heatmap modeling. The justification for utilizing DL for HPC is supported by the capacity of CNNs to deliver state-of-the-art results, and AlexNet demonstrated its usefulness by initiating a renaissance in image categorization [12]. Faisal [13] built on Chen's [11] study of body-joint assessment, noting current work in which body-joint coordinates are recognized utilizing gyroscope and angular position techniques for detecting joint point angles and fusing different sensors. The researchers in [14] performed a pose-based human motion estimation, following the work of Faisal and Chen [11, 13]. In a study on HPC [15], template-based, generative, and discriminative model-based methods were established as the principal HPC methodologies. Liaqat et al. [16] developed a hybrid posture identification method by combining classic ML approaches with deep neural networks (DNN). The final output prediction is generated by integrating the learned weights of the DL technique with the prediction of the classical approach. In a separate study [17], the researchers utilized OpenPose for keypoint extraction, followed by a CNN-LSTM hybrid layer for categorization. The technique was developed from 88 recordings of 6 yoga positions.

The study by Byeon and his group [18] demonstrates the application of ensemble deep methods in challenging home contexts with diverse origins. They produced several intriguing findings on the practicality of fitness devices in home settings for elderly persons and on engaging in physical activity with the convenience of home comfort, consistent with the post-COVID period's mindset. Kulikajevas and his colleagues [19] suggested a Deep Recurrent Hierarchical Network based on MobileNetV2 for assessing the sitting position and obtained an Acc of 91.47 percent. In [20], human physical motions relying on generative design techniques were investigated. Furthermore, the use of SVM and boosting for posture identification is studied in [21], and a particular focus on yoga is investigated in [22] by Nagalakshmi.

III. MATERIALS AND METHODS

In this section, the authors discuss the dataset and approach utilized in this study.

A. Dataset

We used the MPII human posture dataset [23] to run our suggested models. The authors selected four human activity poses for classification: Bicycling, Dancing, Running, and Walking. The MPII Human Pose Dataset is a current standard benchmark for validating human posture detection. It contains about 25K photos with about 40K labeled human bodies. A human behavior taxonomy was used to collect the photographs; the dataset covers 410 human activities, and each picture carries an activity label. Each photo was extracted from a YouTube video, and the frames before and after it are provided without annotation [23]. Selecting these four activities is difficult because there is less information to learn from. The dataset is imbalanced, with 2014 human position images, and it was balanced with data augmentation, as seen in Tables I and II. The required preprocessing was performed so that every picture has identical RGB channel dimensions. The dataset was divided into 70% training, 15% validation, and 15% testing. All four classes were represented in the training, validation, and testing sets.

B. Suggested Approach

In this study, 2 lightweight DCNN architectures, MobileNet and Xception, are used to categorize the MPII human posture dataset into the four posture categories of bicycling, dancing, running, and walking using a variety of classification methodologies. These suggested architectures are implemented with and without DA. The DA technique was employed to address the scarcity of labeled MPII images. A total of 2787 photos of balanced human posture data were extracted from the MPII images studied in [23] to show the efficacy of the suggested methodologies. The MPII data is obtained from YouTube. This study employs DA together with TL, which transfers the learned variables of pre-trained algorithms, to minimize overfitting, reduce resource consumption, and increase the efficiency of the approach. The suggested approach is illustrated in Figure 1. Additionally, the dataset is divided into training, validation, and test sets, with the test set used to assess the efficacy of the suggested approaches. Furthermore, to improve the functionality of the recommended system, the last FCL of the DCNN architecture is substituted with an SVM.
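The division into training, validation, and test sets can be sketched numerically. The paper states only a 70/15/15 split; the two-stage hold-out below is an assumption, chosen because it happens to reproduce the counts reported in Table II (2012/356/419 with DA and 1454/257/303 without). The function name `two_stage_split_sizes` and the rounding rule are hypothetical and are not taken from the authors' code.

```python
import math


def two_stage_split_sizes(n_images, holdout=0.15):
    """Return (train, validation, test) sizes for a two-stage hold-out.

    Assumption (not stated in the paper): the test set is `holdout` of all
    images, the validation set is `holdout` of what remains, and each
    hold-out size is rounded up.
    """
    n_test = math.ceil(n_images * holdout)
    remaining = n_images - n_test
    n_val = math.ceil(remaining * holdout)
    n_train = remaining - n_val
    return n_train, n_val, n_test


if __name__ == "__main__":
    # 2787 images after DA, 2014 images before DA (totals from the paper).
    for label, total in [("with DA", 2787), ("without DA", 2014)]:
        train, val, test = two_stage_split_sizes(total)
        print(f"{label}: train={train}, validation={val}, test={test}")
```

Under this assumption the effective proportions are roughly 72/13/15 rather than exactly 70/15/15, which may explain why the reported counts do not match a literal 70/15/15 division.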

Authorized licensed use limited to: ANNA UNIVERSITY. Downloaded on November 13,2023 at 18:42:59 UTC from IEEE Xplore. Restrictions apply.
[Figure 1: flowchart] Data pre-processing (such as data normalization) -> Data augmentation (such as rotation, horizontal shift, shear range, etc.) -> TL approaches on the MobileNet and Xception methods -> Classification using MobileNet and Xception -> Results, which can be either bicycling, dancing, running, or walking

Fig. 1. Suggested Approach: MPII Human Posture Detection

TABLE I. TOTAL NUMBER OF IMAGES FOR EACH CLASS BEFORE AND AFTER DA

Dataset      Before DA   After DA
Bicycling    516         697
Dancing      697         697
Running      291         697
Walking      511         696
Total        2015        2787

TABLE II. NUMBER OF IMAGES FOR DATA SPLIT BEFORE AND AFTER DA

Split Set    Without DA (WDA)   With DA (DA)
Training     1454               2012
Validation   257                356
Testing      303                419
Total        2015               2787

C. Transfer Learning (TL)

Transfer learning becomes beneficial when working with highly scarce datasets, such as photographs of human posture in the healthcare industry. These datasets are more challenging to obtain in large quantities than other datasets. Instead of building a DCNN from scratch, it is frequently more efficient to employ a pre-trained network and fine-tune it [32]. The study utilized the pre-trained MobileNet and Xception approaches for transfer learning.

D. Data Augmentation (DA)

Typically, learning on a vast number of training instances yields excellent results and higher accuracy. Human posture databases contain unbalanced data samples because of their small volume. Consequently, DA is used as a strategy for expanding the input data by creating extra samples from the existing input data [24]. There are numerous techniques for augmenting data; the effect of those utilized in this study is seen in Table I.

E. DCNN

MobileNet and Xception are among the most effective lightweight DCNN approaches for computer vision categorization. MobileNet and Xception consist of 28 and 71 layers, respectively. They comprise depthwise convolutional layers that calculate the output of neurons linked to local regions in the input. Each neuron computes a dot product between its weights and the local region to which it is attached. In addition, they have pooling layers that perform downsampling along the spatial dimensions and an FCL that calculates the output vector.

Both methods have been pre-trained using 1.3 million images from the ImageNet dataset, which represent real-world scenarios across 1000 classes. These photos have a 224 x 224-pixel patch size and three color channels. In this approach, the final layer of each network is substituted with a new layer for categorizing four classes: bicycling, dancing, running, and walking. To begin the fine-tuning procedure for MPII photographs, it is essential to specify the hyperparameters. First, the MobileNet approach's initial learning rate is set to 0.010. The number of epochs is 50 without DA and TL and also 50 with DA and TL, while the momentum and weight decay factors are set to 0.9 and 0.005, respectively. In our investigation, we employ the rectified linear activation function (ReLU) because it is simpler to train and often provides superior results. To prevent the approach from overfitting during training, a dropout of 0.5 and L1 and L2 regularization were employed. The Xception approach is then retrained with an initial learning rate of 0.010. The number of epochs equals 50 without DA and TL and 50 with DA and TL. The momentum and weight decay factors are set to 0.9 and 0.005, respectively. A dropout of 0.5 and L1 and L2 regularization were employed once more to prevent the approach from overfitting. These settings are intended to ensure that the parameters are fine-tuned for human posture in the digitized healthcare domain.

F. Support Vector Machine (SVM)

SVM is an ML technique that analyzes data for categorization. It is a supervised learning algorithm that categorizes data. It provides a robust and effective learning method by finding separating hyperplanes in a high-dimensional space. There are several hyperplanes capable of separating the data into the 4 classes. The optimal hyperplane to choose is the one that provides the greatest margin. The margin is described as the width by which the boundary may expand before hitting a data point. The support vectors are the data points that the margin pushes up against. Consequently, the objective of the SVM is to identify the optimal hyperplane that separates groups of target vectors on opposite sides of the plane [25].
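The maximum-margin idea can be illustrated with a small sketch. It is not the authors' implementation: the 2-D points and the two candidate hyperplanes are made up for illustration, and the helper `geometric_margin` is hypothetical. For a hyperplane w·x + b = 0, it computes the smallest signed distance from any labelled point to the hyperplane; an SVM prefers the separating hyperplane for which this value is largest, and the points that attain it are the support vectors.

```python
import math


def geometric_margin(w, b, points, labels):
    """Smallest signed distance y * (w.x + b) / ||w|| over all points.

    A positive value means the hyperplane separates the two classes;
    a larger value means a wider margin.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    distances = [
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    ]
    return min(distances)


# Toy, linearly separable data (illustrative only); labels are +1 / -1.
points = [(1.0, 1.0), (2.0, 1.0), (4.0, 4.0), (5.0, 4.0)]
labels = [-1, -1, +1, +1]

# Two candidate separating hyperplanes, each given as (w, b).
candidates = {
    "x1 + x2 = 5.5": ((1.0, 1.0), -5.5),
    "x1 = 3": ((1.0, 0.0), -3.0),
}

margins = {
    name: geometric_margin(w, b, points, labels)
    for name, (w, b) in candidates.items()
}
best = max(margins, key=margins.get)  # hyperplane with the widest margin
```

Both candidates separate the toy data, but the diagonal one leaves more room on each side and would be preferred. With four classes, an SVM is typically built from several such binary separators (for example one-vs-rest or one-vs-one); the paper does not say which scheme was used.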

G. Performance Measures

A classifier may be evaluated using a variety of metrics, notably Acc, Rec, area under the curve (AUC), Prec, and F1 score. These are outlined below [26, 27]:

a. Acc: This measures the correctness of a model's predictions and indicates the model's overall performance capability. It is defined as [26, 27]

$$\mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (1)$$

TP, FN, TN, and FP signify True Positive, False Negative, True Negative, and False Positive, respectively.

b. Rec: Rec is the proportion of accurately predicted positive occurrences to all actual positive observations [26].

$$\mathrm{Rec} = \frac{TP}{TP + FN} \qquad (2)$$

c. Prec: Prec is the proportion of accurately predicted positive occurrences to the total number of predicted positive events.

$$\mathrm{Prec} = \frac{TP}{TP + FP} \qquad (3)$$

d. F1 score: The F1 score is the harmonic mean of Prec and Rec. It is employed as a mathematical metric to evaluate the approach's effectiveness. Furthermore, this score takes both FP and FN into account.

$$F1 = \frac{2\,(\mathrm{Prec} \times \mathrm{Rec})}{\mathrm{Prec} + \mathrm{Rec}} \qquad (4)$$

IV. EXPERIMENTAL OUTCOMES AND DISCUSSION

MobileNet and Xception are evaluated on categorizing the MPII dataset as bicycling, dancing, running, or walking. To show the efficacy of the suggested methodologies, a sample of 2787 MPII human photos (697 bicycling, 697 dancing, 697 running, and 696 walking) is examined [23]. To employ MPII human photos as input to our DL technique, the MPII pose images are scaled to 224x224 pixels and cropped. The suggested algorithms are applied to the MPII photos, with each photo assigned to one of 4 classes: bicycling, dancing, running, or walking. The number of training and test instances for MPII pictures is shown in Table II. To boost performance, the training data is enlarged by balancing the imbalanced dataset using DA techniques such as rotation and horizontal flip. During the DA process, images are transformed with a rotation range of 20 degrees, width and height shift ranges of 0.3, a shear range of 30, a zoom range of 0.2, and a horizontal flip setting of 0.2. 70% of our MPII dataset is used for training (2012 pictures), 15% for validation (356 photos), and 15% for testing (419 images).

TABLE III. DETECTION RESULTS USING MOBILENET AND XCEPTION WITH DA AND TL

Approach            Acc (%)   AUC    Rec    Prec   F1-score   TP
MobileNet+DA+TL     92.12     0.95   0.92   0.93   0.92       386
Xception+DA+TL      91.41     0.94   0.92   0.92   0.92       383

TABLE IV. DETECTION RESULTS USING MOBILENET AND XCEPTION WITHOUT DA AND TL

Approach            Acc (%)   AUC    Rec    Prec   F1-score   TP
MobileNet+WDA+WTL   89.44     0.91   0.89   0.86   0.87       271
Xception            90.10     0.92   0.89   0.88   0.88       273

Tables III and IV describe the effect of DA and TL on the classification performance of MobileNet and Xception with an integrated SVM. Table III describes the classifier performance with SVM when DA and TL are applied, while Table IV describes the classifier performance with SVM but without DA and TL. MobileNet combined with an SVM using a linear kernel function obtains the highest values. The recommended method yields 92.12% Acc, an AUC of 0.95, 92% Rec, 93% Prec, and a 92% F1 score. Table V compares our suggested approach with classification techniques relying on a few CNN approaches and datasets. The computational time is used to explore the recommended approach's processing efficiency and to evaluate its usability in real-time deployment. Table VI displays the computational cost, where the total computational time for the MobileNet+DA+TL and Xception+DA+TL approaches is approximately 3974 seconds and 15419 seconds, respectively. MobileNet+DA+TL achieved the shortest computational time of 3974 seconds. The MobileNet and Xception approaches were implemented in Python on an Anaconda Jupyter Notebook platform (Core i7 and 16 GB RAM).

TABLE V. COMPARATIVE ANALYSIS OF THE SUGGESTED APPROACH WITH EXISTING SYSTEMS

Authors                         Method            Dataset                       Acc (%)   Prec (%)   Rec (%)   F1-score (%)
Chollet [28]                    Xception          JFT images                    89.83     90         90        90
He, Zhang, Ren, & Sun [29]      ResNet50          ILSVRC 2015 and COCO images   91.52     92         92        91
Proposed Approach               MobileNet+DA+TL   MPII Human images             92.12     93         92        92

TABLE VI. THE SUGGESTED SYSTEMS IMPLEMENTATION TIME

Approach            Time (seconds)
MobileNet+DA+TL     3,974
Xception+DA+TL      15,419
MobileNet+WDA+WTL   4,884
Xception+WDA+WTL    17,349
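As a quick consistency check on the reported results, the sketch below recomputes the accuracies from the TP column and the F1 score from the reported Prec and Rec using Eq. (4). It rests on an interpretation that the paper does not state explicitly: that the TP column counts correctly classified test images, out of the test-set sizes in Table II (419 with DA, 303 without). The function names are hypothetical.

```python
def accuracy_percent(correct, total):
    """Accuracy as a percentage: correctly classified / all test images."""
    return 100.0 * correct / total


def f1_score(precision, recall):
    """F1 as the harmonic mean of precision and recall (Eq. 4)."""
    return 2.0 * precision * recall / (precision + recall)


# (value from the TP column, assumed test-set size from Table II)
reported = {
    "MobileNet+DA+TL": (386, 419),
    "Xception+DA+TL": (383, 419),
    "MobileNet+WDA+WTL": (271, 303),
    "Xception (without DA and TL)": (273, 303),
}

if __name__ == "__main__":
    for name, (correct, total) in reported.items():
        print(f"{name}: Acc = {accuracy_percent(correct, total):.2f}%")
    # F1 for MobileNet+DA+TL from the reported Prec = 0.93 and Rec = 0.92.
    print(f"F1 = {f1_score(0.93, 0.92):.2f}")
```

Under that interpretation the recomputed values match the reported 92.12%, 91.41%, 89.44%, and 90.10% accuracies and the 0.92 F1 score, which suggests the TP column is the number of correct test predictions.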

According to the results, MobileNet and Xception are the two best architectures. Figures 2 and 3 show, for example, how many epochs each training method takes for MobileNet+DA+TL and MobileNet+WDA+WTL. The MobileNet+DA+TL approach's training progress is quicker and more efficient than existing CNN designs. Figures 4 and 5 show that the Xception+DA+TL and Xception+WDA+WTL approaches achieve performance equivalent to the MobileNet+DA+TL and MobileNet+WDA+WTL approaches. However, the MobileNet and Xception methods for posture categorization are compared critically across different parameter settings. Figs. 2 and 3 indicate that the training Acc for MobileNet without DA and TL is 98.90% at 30 epochs, whereas MobileNet with SVM, DA, and TL obtains an Acc of 99.80% at 30 epochs. In addition, when employing Xception, the Acc is 90.54% at 450 epochs without DA and TL and jumps to 96.54% at epoch 40 when DA and TL are applied to the MPII dataset, as seen in Figures 4 and 5. The confusion matrices for the MobileNet and Xception techniques are shown in Figures 6 and 7. The performance of the suggested approach grows rapidly and steadily to an Acc above 90%.

Fig. 2. MobileNet+DA+TL Loss and Acc

Fig. 3. MobileNet+WDA+WTL Loss and Acc

Fig. 4. Xception+DA+TL Loss and Acc

Fig. 5. Xception+WDA+WTL Loss and Acc

Fig. 6. Confusion Matrix for MobileNet Approach, panels (a) and (b)

Fig. 7. Confusion Matrix for Xception Approach, panels (a) and (b)

V. CONCLUSION

Different pre-trained DCNN approaches, notably MobileNet and Xception, are explored in this work for HPC, and the findings indicate that the MobileNet and Xception designs deliver good results. The study comprehensively assesses both proposed approaches and finds that the MobileNet architecture is more efficient and requires fewer computing resources than the Xception approach. This study demonstrates that effective detection of MPII pictures is possible using end-to-end DL approaches without needing prior or subsequent processing. It presents a retraining scenario in which training is initialized with the weights of an architecture previously trained on another dataset. The DA approach addresses data shortage, balances datasets, and adds variety to the dataset, enhancing the pre-trained model's generalization capability and reducing overfitting. The usage of SVM-hybridized approaches contributes to the excellent efficiency of the suggested system. Compared to other techniques, MobileNet hybridized with SVM, DA, and TL has the best testing Acc (92.1%). In addition, the Acc and AUC of the suggested approaches are superior to those of existing systems found in the literature.

REFERENCES

[1] Liaqat, S., Dashtipour, K., Arshad, K., Assaleh, K., & Ramzan, N. (2021). A hybrid posture detection framework: Integrating machine learning and deep neural networks. IEEE Sensors Journal, 21(7), 9515-9522.
[2] Ogundokun, R. O., Maskeliūnas, R., Misra, S., & Damasevicius, R. (2022). Hybrid InceptionV3-SVM-Based Approach for Human Posture Detection in Health Monitoring Systems. Algorithms, 15(11), 410.
[3] Koubâa, A., Ammar, A., Benjdira, B., Al-Hadid, A., Kawaf, B., Al-Yahri, S. A., ... & Ras, M. B. (2020, March). Activity monitoring of Islamic prayer (salat) postures using deep learning. In 2020 6th Conference on Data Science and Machine Learning Applications (CDMA) (pp. 106-111). IEEE.
[4] Jiang, F., Dashtipour, K., & Hussain, A. (2019, August). A survey on deep learning for the routing layer of computer network. In 2019 UK/China Emerging Technologies (UCET) (pp. 1-4). IEEE.
[5] Devi, K. N., Anand, J., Kothai, R., Krishna, J. A., & Muthurampandian, R. (2022). Sensor-based posture detection system. Materials Today: Proceedings, 55, 359-364.
[6] Garg, S., Saxena, A., & Gupta, R. (2022). Yoga pose classification: a CNN and MediaPipe inspired deep learning approach for real-world application. Journal of Ambient Intelligence and Humanized Computing, 1-12.
[7] Sharma, A., Shah, Y., Agrawal, Y., & Jain, P. (2022). Real-time Recognition of Yoga Poses using Computer Vision for Smart Health Care. arXiv preprint arXiv:2201.07594.
[8] Han, J., Zhang, D., Cheng, G., Liu, N., & Xu, D. (2018). Advanced deep-learning techniques for salient and category-specific object detection: a survey. IEEE Signal Processing Magazine, 35(1), 84-100.
[9] Shin, H. C., Roth, H. R., Gao, M., Lu, L., Xu, Z., Nogues, I., ... & Summers, R. M. (2016). Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics, and transfer learning. IEEE Transactions on medical imaging, 35(5), 1285-1298.
[10] Ogundokun, R. O., Maskeliūnas, R., & Damaševičius, R. (2022). Human posture detection using image augmentation and hyperparameter-optimized transfer learning algorithms. Applied Sciences, 12(19), 10156.
[11] Chen, Y., Tian, Y., & He, M. (2020). Monocular human pose estimation: A survey of deep learning-based methods. Computer Vision and Image Understanding, 192, 102897.
[12] Alom, M. Z., Taha, T. M., Yakopcic, C., Westberg, S., Sidike, P., Nasrin, M. S., ... & Asari, V. K. (2018). The history began from alexnet: A comprehensive survey on deep learning approaches. arXiv preprint arXiv:1803.01164.
[13] Faisal, A. I., Majumder, S., Mondal, T., Cowan, D., Naseh, S., & Deen, M. J. (2019). Monitoring methods of human body joints: State-of-the-art and research challenges. Sensors, 19(11), 2629.
[14] Boualia, S. N., & Amara, N. E. B. (2019, June). Pose-based human activity recognition: a review. In 2019 15th international wireless communications & mobile computing conference (IWCMC) (pp. 1468-1475). IEEE.
[15] Chowdhury, A. I., Ashraf, M., Islam, A., Ahmed, E., Jaman, M. S., & Rahman, M. M. (2020, October). hActNET: an improved neural network-based method in recognizing human activities. In 2020 4th international symposium on multidisciplinary studies and innovative technologies (ISMSIT) (pp. 1-6). IEEE.
[16] Liaqat, S., Dashtipour, K., Shah, S. A., Rizwan, A., Alotaibi, A. A., Althobaiti, T., ... & Ramzan, N. (2021). Novel ensemble algorithm for multiple activity recognition in elderly people exploiting ubiquitous sensing devices. IEEE Sensors Journal, 21(16), 18214-18221.
[17] Kumar, D., & Sinha, A. (2020). Yoga pose detection and classification using deep learning. LAP LAMBERT Academic Publishing.
[18] Byeon, Y. H., Lee, J. Y., Kim, D. H., & Kwak, K. C. (2020). Posture recognition using ensemble deep models under various home environments. Applied Sciences, 10(4), 1287.
[19] Kulikajevas, A., Maskeliunas, R., & Damaševičius, R. (2021). Detection of sitting posture using hierarchical image composition and deep learning. PeerJ computer science, 7, e442.
[20] Albu, F., Nicolau, M., Pirvan, F., & Hagiescu, D. (2018, February). A Sonification Method using human body movements. In Proceedings of the 10th International Conference on Creative Content Technologies (pp. 18-22).
[21] Panigrahy, D., Sahu, P. K., & Albu, F. (2021). Detection of ventricular fibrillation rhythm by using boosted support vector machine with an optimal variable combination. Computers & Electrical Engineering, 91, 107035.
[22] Nagalakshmi, C., & Mukherjee, S. (2021). Classification of yoga asanas from a single image by learning the 3d view of human poses. Digital Techniques for Heritage Presentation and Preservation, 37-49.
[23] Andriluka, M., Pishchulin, L., Gehler, P., & Schiele, B. (2014). 2d human pose estimation: New benchmark and state of the art analysis. In Proceedings of the IEEE Conference on computer Vision and Pattern Recognition (pp. 3686-3693).
[24] Ogundokun, R. O., Maskeliūnas, R., Misra, S., & Damasevicius, R. (2022). A Novel Deep Transfer Learning Approach Based on Depth-Wise Separable CNN for Human Posture Detection. Information, 13(11), 520.
[25] Salama, W. M., & Aly, M. H. (2021). Prostate cancer detection based on deep convolutional neural networks and support vector machines: a novel concern level analysis. Multimedia Tools and Applications, 80, 24995-25007.
[26] Flach, P. (2019). Performance evaluation in machine learning: The good, the bad, the ugly, and the way forward. In Proceedings of the AAAI Conference on Artificial Intelligence, 33 (pp. 9808-9814).
[27] Ogundokun, R. O., Misra, S., Akinrotimi, A. O., & Ogul, H. (2023). MobileNet-SVM: A Lightweight Deep Transfer Learning Model to Diagnose BCH Scans for IoMT-Based Imaging Sensors. Sensors, 23(2), 656.
[28] Chollet, F. (2017). Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1251-1258).
[29] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770-778).
