Abstract—Human posture recognition is extensively studied for identifying human activities in a wide range of applications. The task remains difficult because of complex backgrounds, changing body positions, self- and object occlusion, poor image resolution, and varying lighting. To address these limitations and improve performance, a model for estimating human posture is developed that makes use of deep convolutional neural networks. The proposed method uses a bottom-up parsing technique to identify crucial locations on the human body. Moreover, it employs a non-parametric description of the vector field of key-point associations, allowing the anatomical key points of each individual to be grouped. The accuracy of the localized key points is further improved through multiple stages of refined prediction. The proposed method is trained and tested on the MPII Human Pose dataset and demonstrates superior accuracy compared to the latest state-of-the-art techniques. Moreover, by including an occlusion network in the feature representation process, it efficiently discovers occluded key points, resulting in a mean average precision of 90.4%.

Index Terms—Bottom-up parsing, occlusion, pose estimation, skeletal key points.

I. INTRODUCTION

The task of human posture estimation in computer vision involves recognizing and determining the positions and postures of humans in images and videos. The objective is to detect the positions and angles of crucial body joints or distinctive landmarks, ranging from the head, elbows, shoulders, wrists, hips, and knees to the ankles. It has diverse applications across areas such as robotics, motion capture, human-computer interaction, sports analysis, healthcare, and augmented reality, and new techniques for estimating human posture are introduced each year. Early methods identified the posture of a single person in an image containing only that person: a person detector is applied and, for each detection, single-person pose estimation is performed. These methods identify each constituent part separately before joining them to form the pose. Estimating the poses of multiple people in a photograph poses a greater challenge than the single-person case, because the number and positions of the individuals in the picture are unknown beforehand, which makes accurate inference harder for the system. Multi-person posture estimation typically follows either a top-down or a bottom-up strategy to address this problem [1].

The "human posture" problem refers to the difficulty of localizing human joints. It can be challenging to detect connections hidden by masking and to understand the scene when the body is only partially visible. Estimating a pose accurately is therefore a complex task that requires searching a vast space of potential postures, including partially hidden ones, which is especially challenging given the sheer number of possible positions and the need for repeated trial and error. The goal of this work is to put forward a model that can estimate multi-person poses and accurately locate, orient, and localize body components; it is used to estimate anatomically important points, or "parts," of individual subjects in the MPII dataset [2].

This work is an improved version of the joint learning structure [3]. The proposed model improves the output, detects occluded and hidden key points, and is more accurate than the existing cutting-edge approach. By incorporating an enhanced feature representation, hidden critical points can be revealed. Detecting, localizing, and estimating human posture from multi-stage feature information is not without challenges, and the model requires careful implementation to ensure accurate results. These difficulties with correct pose estimation are the focus of this work. The main contributions are:

• Incorporating features from multiple stages in a parallel structure, leading to a high accuracy of 90.4% average PCKh on the MPII benchmark.

• Extending branch 1's fundamental architecture with an occlusion branch to accurately detect obscured key locations.
II. LITERATURE REVIEW

Multi-person pose estimation has undergone significant progress in recent times, owing to the growing adoption of convolutional neural networks. In [4], the authors introduced a novel convolutional neural network (CNN) technique for posture estimation. The method employed multiple CNN pose regressors in a cascade structure, enabling a holistic approach to gathering information and analyzing poses. This technique offered the advantage of anticipating difficulties in localization and pose estimation.

In [5] and [6], exceptional results were obtained by using advanced convolutional neural networks to produce confidence scores for crucial key points. In [6], the authors introduced a U-shaped network, also referred to as an hourglass module, in which multiple hourglass modules were stacked to generate a prediction. This approach proved highly effective and led to significant improvements in accuracy.

The key-point association constraint network (KACNet), presented in [7], is a recently developed network designed to evaluate and analyze human posture efficiently. The first channel of this network identifies critical locations, with a distance loss function serving as a constraint on the system. Channel 2 is constrained by an association loss function that plays a crucial role in deciding the relationships between key points. False associations in both regular and unusual stances may limit this approach.

The authors of [8] offer ThermalPose, a portable neural network system. They developed a tool for processing thermal images and applied existing vision algorithms that had been trained to recognize human poses in RGB images. Compared to the vision-based pose estimator, the model is 26 percent smaller. Some restrictions remain, such as the dependence of thermal monitoring on the heat radiated by the person's body and the fact that thermal radiation cannot pass through obstacles.

The human pose estimation problem with mismatched data between distinct postures is addressed in [9]. The authors utilize the K-means clustering approach to identify and analyze unusual poses without extensive prior knowledge about them. To address limited data availability, they propose three solutions: duplicating rare poses, generating synthetic data based on rare poses, and adding uncommon poses to the dataset. These techniques aim to overcome data scarcity; however, given the small number of typical posture data points, the overall improvement in performance is unlikely to be significant.
III. METHODOLOGY

The main objective of human posture estimation is to identify and determine the 2D locations of a total of 16 key points (referred to as K) in photographs for each person. This involves accurately detecting and localizing specific joints and parts of the body. The proposed method, like [10], uses a bottom-up pipeline: first, a set of key points is identified for each person, and then all of those body parts are connected or grouped to generate individual instances of human posture.

A. Overview of the Proposed Workflow

The method for determining the poses of several individuals is represented visually in a schematic layout. The fundamental steps of the proposed multi-person pose estimation workflow are shown in Fig. 1.

The initial step in the pipeline involves preprocessing the input image to make it ready for subsequent processing; during this stage the input photos are resized, scaled, and normalized. The processed image is then fed into a CNN, which creates a collection of feature maps. Once the features have been extracted, they are sent along three different branches. Branch 1 produces a collection of confidence maps, one for each of the predetermined key points, which estimate the probability of each key-point location; branch 2 creates a set of cluster maps specific to each person featured in the photo; and, for the association, branch 3 generates a collection of affinity fields between pairs of body parts. The parsing process then yields a collection of bipartite matches for associating body-part candidates. Finally, every person in the image is assigned a full-body posture using the matched set of parts.

B. Detailed Explanation of Our Proposed Methodology

The steps of multi-person pose estimation and their implementation are described in the following subsections.

1) Data Preprocessing: Images used for deep learning training need to be of similar size. On a larger image the network must process more than four times as many pixels as on a smaller one, which increases training time, while shrinking an image preserves the important details. An image size of 256 x 256 is used, chosen according to the network and GPU capacity.

Because of gradient propagation concerns, the pixel values are rescaled to a predetermined range. Pixels originally range from 0 to 255, which can make computations challenging for neural networks; normalizing the data to a range of 0 to 1 reduces computational complexity.

In addition, a standardization step is applied, which is crucial for preventing issues during training. Non-standardized data can make training difficult and slow learning down. When numerical inputs have very different magnitudes, the proportional importance of each input during training becomes uncertain, and high values tend to outweigh smaller ones.
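To make the preprocessing step concrete, the following is a minimal Python sketch using OpenCV and NumPy; the 256 x 256 target size follows the paper, while the mean and standard-deviation values used for standardization are placeholder (ImageNet-style) assumptions, since the paper does not report them.

import cv2
import numpy as np

def preprocess(image_bgr, size=(256, 256),
               mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
    """Resize, rescale to [0, 1], and standardize an input image.

    mean/std are placeholder (ImageNet-style) channel statistics; the paper
    does not specify which values were used for standardization.
    """
    # Resize so every training image has the same spatial dimensions.
    resized = cv2.resize(image_bgr, size, interpolation=cv2.INTER_LINEAR)
    # Rescale pixel intensities from [0, 255] to [0, 1].
    scaled = resized.astype(np.float32) / 255.0
    # Standardize each channel so all inputs have comparable magnitudes.
    return (scaled - np.array(mean, dtype=np.float32)) / np.array(std, dtype=np.float32)

if __name__ == "__main__":
    dummy = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
    print(preprocess(dummy).shape)  # (256, 256, 3)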
each person. This involves accurately detecting and localizing volutional neural networks are evaluated to extract spatial
specific joints and parts of the body. This proposed method, characteristics from each image. Several designs, including
like [10], uses a bottom-up pipeline: first, a set of key points ResNet versions 50, 101, and 152, and HRNet versions w32,
is identified for each person, and then all of those body parts and w48, are used in this work. Both ResNet and HRNet
Fig. 1: Methodology of our proposed workflow.
Both ResNet and HRNet have unique characteristics that are worth noting. ResNet simplifies the training of deep neural networks and avoids the vanishing-gradient problem through identity mappings. In contrast, HRNet preserves high-resolution representations throughout the entire process, with repeated information exchange across resolutions, ensuring that the final output is accurate both spatially and semantically. This makes it a valuable architecture for image-processing tasks.
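As an illustration of how such a backbone can serve as the shared feature extractor, the sketch below wraps a pre-trained torchvision ResNet-50 and keeps only its convolutional stages; this is an assumption about the wiring, not the authors' implementation, and HRNet-w48 (the backbone ultimately adopted) would be plugged in the same way. A recent torchvision version is assumed for the weights argument.

import torch
import torch.nn as nn
from torchvision import models

class ResNetFeatureExtractor(nn.Module):
    """Wraps a pre-trained ResNet-50 and returns its spatial feature maps
    (the output of the last convolutional stage, before pooling)."""

    def __init__(self):
        super().__init__()
        backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
        # Drop the global average pool and the classification head so the
        # spatial layout of the features is preserved.
        self.features = nn.Sequential(*list(backbone.children())[:-2])

    def forward(self, x):
        return self.features(x)  # (N, 2048, H/32, W/32)

if __name__ == "__main__":
    extractor = ResNetFeatureExtractor().eval()
    with torch.no_grad():
        feats = extractor(torch.randn(1, 3, 256, 256))
    print(feats.shape)  # torch.Size([1, 2048, 8, 8])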
3) Part Detection, Grouping, and Part Association: An iterative prediction architecture is built using three branches to deliver more accurate predictions. Fig. 2 shows the structural layout of the branches.

a) Part Detection for Confidence Maps: The feature map is the input to Branch 1. This branch creates a set of heat maps, a matrix representation in which each pixel stores a key-point confidence score. The locations of the annotated 2D body joints are used to create the ground-truth confidence map S^{\prime *}_{j,k}(p) defined in (1).

S^{\prime *}_{j,k}(p) = \exp\left( -\frac{\| p - x_{j,k} \|_2^2}{\sigma^2} \right)    (1)

When the body part is visible, a peak appears for the specific visible part j of each individual k, and the spread of this peak is controlled by σ. The ground-truth location of that body part for person k in the picture is denoted x_{j,k}. The objective is to identify the point at which the map attains its peak value, which is why the greatest confidence score is required. At test time, non-maximum suppression (NMS) is used to extract the local maxima. A least-squares error loss is applied between the predicted and ground-truth confidence maps.
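A small worked example of building the ground-truth confidence maps of (1) is sketched below; the heatmap resolution and the value of σ are illustrative assumptions. At test time the peaks of the predicted maps would be recovered with non-maximum suppression.

import numpy as np

def confidence_maps(joints, visible, height=64, width=64, sigma=2.0):
    """Ground-truth confidence maps following Eq. (1).

    joints:  array of shape (K, 2) with (x, y) joint positions of one person
             on the heatmap grid.
    visible: boolean array of shape (K,) marking annotated/visible joints.
    Returns an array of shape (K, height, width); the map of an invisible
    joint is left as zeros.
    """
    K = joints.shape[0]
    ys, xs = np.mgrid[0:height, 0:width]
    maps = np.zeros((K, height, width), dtype=np.float32)
    for j in range(K):
        if not visible[j]:
            continue
        x, y = joints[j]
        d2 = (xs - x) ** 2 + (ys - y) ** 2   # squared distance ||p - x_{j,k}||^2
        maps[j] = np.exp(-d2 / sigma ** 2)   # Gaussian peak centred on the joint
    return maps

# When several people are present, per-person maps are usually combined with a
# pixel-wise maximum so each individual peak is preserved.
def merge_people(per_person_maps):
    return np.max(np.stack(per_person_maps, axis=0), axis=0)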
Fig. 2: The multistage architecture represents the structure of the branches. The parameters of the convolutional layers are denoted as "Conv⟨kernel size⟩-⟨number of output channels⟩".

In order to address the issue of missing information at critical points caused by occlusion, the Branch 1 architecture has been enhanced, as visualized in Fig. 3. This updated design includes a specialized occlusion-net branch that helps identify these hidden areas, allowing for a more complete and precise representation of the human body. By sharing information between the two levels and using a transposed convolution to split the representation, the network can effectively manage occlusions and provide additional information for subsequent processing, which leads to improved accuracy in identifying both exposed and hidden critical points.
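Since the paper describes the occlusion branch only at a high level, the following PyTorch module is a speculative sketch: a small branch that shares the backbone feature maps with Branch 1, upsamples them with a transposed convolution, and predicts one occlusion map per key point. The layer sizes and channel counts are assumptions, not the authors' configuration.

import torch
import torch.nn as nn

class OcclusionNet(nn.Module):
    """Hypothetical occlusion branch: shares the backbone features with
    Branch 1 and predicts per-keypoint maps for occluded joints."""

    def __init__(self, in_channels=256, num_keypoints=16):
        super().__init__()
        self.refine = nn.Sequential(
            nn.Conv2d(in_channels, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            # The transposed convolution doubles the spatial resolution,
            # splitting the shared representation into a finer occlusion map.
            nn.ConvTranspose2d(128, 128, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        self.head = nn.Conv2d(128, num_keypoints, kernel_size=1)

    def forward(self, shared_features):
        return self.head(self.refine(shared_features))

if __name__ == "__main__":
    branch = OcclusionNet()
    out = branch(torch.randn(1, 256, 64, 64))
    print(out.shape)  # torch.Size([1, 16, 128, 128])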
Fig. 3: An extension for branch detection, otherwise known as the Occlusion net, has been developed in order to locate key points that are obstructed or hidden from view.

b) Cluster Maps for Part Grouping: The Branch 2 component takes a feature map as its input and generates multiple cluster maps consisting of probability scores. These scores are instrumental in identifying the body parts belonging to each person in the image; this branch also yields the total number of people in the input image. A ground-truth cluster map Q^{\prime *}_{d,j}, defined in (2), is generated from the locations of the annotated 2D body joints.

Q^{\prime *}_{d,j} = \| x^{*}_{j,k} - x_{j,k} \|_2^2    (2)

Here x^{*}_{j,k} represents the ground-truth position of the neck key point of each person in the image, and x_{j,k} denotes the annotated 2D coordinates of the remaining body joints. To determine the key points associated with a particular individual, the minimum Euclidean (straight-line) distance between the neck and the other key points is calculated. Key points are assumed to belong to a specific individual, and by determining the set of key points associated with that individual, their identity can be established: the distances between the neck and the other key points of a body are computed, and the key points are grouped accordingly, which identifies each individual.
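A minimal sketch of the neck-distance grouping rule described above is given below: each detected key point is assigned to the person whose neck is nearest in Euclidean distance. This is an illustrative reconstruction, not the authors' code.

import numpy as np

def group_by_neck(necks, keypoints):
    """Assign each detected key point to the nearest neck.

    necks:     array of shape (P, 2), one (x, y) neck position per person.
    keypoints: array of shape (M, 2), candidate key points of any type.
    Returns a list of length M with the index of the person each key point
    is grouped with (minimum Euclidean distance to a neck).
    """
    assignments = []
    for kp in keypoints:
        dists = np.linalg.norm(necks - kp, axis=1)  # straight-line distances
        assignments.append(int(np.argmin(dists)))
    return assignments

if __name__ == "__main__":
    necks = np.array([[50.0, 40.0], [150.0, 42.0]])
    candidates = np.array([[55.0, 90.0], [148.0, 95.0], [160.0, 30.0]])
    print(group_by_neck(necks, candidates))  # [0, 1, 1]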
c) Affinity Fields for Part Association: The third branch of the model, Branch 3, receives the feature map as input and provides a non-parametric representation of the connections between key joints, encoding both the spatial location and the orientation of the limbs. The part affinity field is a 2D vector field that represents the connection of each limb. To train the model, the association loss function is applied to the predicted and ground-truth part affinity fields. The ground-truth part affinity vector field at each point p in the image is represented as L^{*}_{c}(p), defined in (3). Overall, this approach provides a detailed and informative understanding of key-point connections and limb positioning while effectively minimizing errors and inaccuracies during training.

L^{*}_{c}(p) = \begin{cases} \dfrac{x_{j_2,k} - x_{j_1,k}}{\| x_{j_2,k} - x_{j_1,k} \|_2} & \text{if the limb exists at } p \\ 0 & \text{otherwise} \end{cases}    (3)

In the above equation, (x_{j_2,k} - x_{j_1,k}) / \| x_{j_2,k} - x_{j_1,k} \|_2 is a unit vector directed along the limb. The ground-truth part affinity field is created by averaging the individual affinity fields of all people. To determine the confidence of a part association, the predicted part affinity field L^{\prime *}_{c}(p) is computed as defined in (4).

L^{\prime *}_{c}(p) = \frac{1}{n(p)} \sum_{k} L^{*}_{c}(p)    (4)

In this context, n(p) refers to the number of non-zero vectors at point p across the individuals depicted in the image.
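The sketch below illustrates equations (3) and (4): for one limb the ground-truth field stores the unit vector along the limb at the points lying on it, and the fields of several people are combined by dividing the sum by n(p). The grid size and limb width are illustrative assumptions.

import numpy as np

def limb_paf(p1, p2, height=64, width=64, limb_width=1.0):
    """Ground-truth part affinity field of one limb (Eq. (3)).

    p1, p2: (x, y) endpoints x_{j1,k} and x_{j2,k} of the limb.
    Returns an array of shape (2, height, width): the unit vector along the
    limb at points on the limb, zero elsewhere.
    """
    v = np.asarray(p2, dtype=np.float32) - np.asarray(p1, dtype=np.float32)
    length = np.linalg.norm(v)
    unit = v / (length + 1e-8)                 # (x_{j2,k} - x_{j1,k}) / ||.||_2
    ys, xs = np.mgrid[0:height, 0:width]
    rel_x, rel_y = xs - p1[0], ys - p1[1]
    # Distance along the limb and perpendicular to it.
    along = rel_x * unit[0] + rel_y * unit[1]
    perp = np.abs(rel_x * unit[1] - rel_y * unit[0])
    on_limb = (along >= 0) & (along <= length) & (perp <= limb_width)
    field = np.zeros((2, height, width), dtype=np.float32)
    field[0][on_limb], field[1][on_limb] = unit[0], unit[1]
    return field

def average_pafs(fields):
    """Combine per-person fields of one limb type (Eq. (4)): divide the sum
    by n(p), the number of non-zero vectors at each point."""
    stacked = np.stack(fields, axis=0)                        # (people, 2, H, W)
    summed = stacked.sum(axis=0)
    n_p = (np.linalg.norm(stacked, axis=1) > 0).sum(axis=0)   # n(p) per pixel
    return summed / np.maximum(n_p, 1)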
4) Post-processing:

a) Greedy Parsing Algorithm: High-quality parsed body poses are generated by a greedy parsing technique that finds the best match between the candidates of two part types. This is the well-known assignment problem of bipartite graph matching, in which the connections that maximize the total score must be found. Because the most evident and immediate association is required for posture estimation, key points are matched so as to create accurate connections. The greedy technique is applied to every limb and determines the weight for connecting that limb with other limbs or body parts. Furthermore, because the Branch 1 model provides the highest probability score for each joint, maximum-weight matching combined with greedy parsing is used to obtain correct pairwise connections between key joints.
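The greedy bipartite matching can be illustrated as follows: candidate detections of the two joint types of a limb are scored against each other, the pairs are sorted by score, and each candidate is used at most once. The score matrix here is a placeholder; in the proposed method it would come from the predicted affinity field weighted by the Branch 1 confidences.

import numpy as np

def greedy_match(scores):
    """Greedy assignment between two candidate sets for one limb type.

    scores: matrix of shape (A, B) where scores[a, b] is the association score
            between candidate a of the first joint type and candidate b of the
            second (e.g. an integrated PAF score weighted by confidences).
    Returns a list of (a, b, score) connections, each candidate used once.
    """
    pairs = [(a, b, scores[a, b]) for a in range(scores.shape[0])
                                  for b in range(scores.shape[1])]
    pairs.sort(key=lambda t: t[2], reverse=True)   # highest scores first
    used_a, used_b, connections = set(), set(), []
    for a, b, s in pairs:
        if a in used_a or b in used_b or s <= 0:
            continue
        connections.append((a, b, float(s)))
        used_a.add(a)
        used_b.add(b)
    return connections

if __name__ == "__main__":
    # Two shoulder candidates vs. three elbow candidates (toy scores).
    s = np.array([[0.9, 0.1, 0.2],
                  [0.2, 0.8, 0.3]])
    print(greedy_match(s))  # [(0, 0, 0.9), (1, 1, 0.8)]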
b) Merging: Merging is the final stage of converting detected connections into finished skeletons. Each connection is initially treated as belonging to a different person, forming sets; whenever sets share a body part, they are merged into one connected set. The output image shows the estimated skeleton of each individual and the locations of their major body parts, which aids in analyzing and understanding the movement and pose of each person in the image.
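A sketch of the merging step is shown below, treating each connection as a pair of parts and merging sets that share a part. The union-find structure is an implementation choice on our side, not something specified in the paper.

def merge_connections(connections):
    """Merge limb connections into per-person skeletons.

    connections: list of ((joint_type_a, candidate_a), (joint_type_b, candidate_b))
                 pairs produced by the parsing step.
    Returns a list of sets; each set holds the parts of one person.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:                  # path splitting for efficiency
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    for part_a, part_b in connections:         # connections sharing a part end
        union(part_a, part_b)                  # up in the same set (skeleton)

    skeletons = {}
    for part in parent:
        skeletons.setdefault(find(part), set()).add(part)
    return list(skeletons.values())

if __name__ == "__main__":
    conns = [(("neck", 0), ("shoulder", 0)), (("shoulder", 0), ("elbow", 1)),
             (("neck", 1), ("shoulder", 1))]
    print(merge_connections(conns))
    # two skeletons: {neck 0, shoulder 0, elbow 1} and {neck 1, shoulder 1}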
IV. EXPERIMENTS AND RESULTS

Evaluating human pose estimation performance is difficult because many important factors must be taken into account. The Max-Planck Institute for Informatics (MPII) Human Pose dataset, used in a number of well-established and openly accessible benchmarks, is adopted here. This collection contains around 25K photos of individuals with about 40K annotated body joints.

TABLE I: The detection rate of key points observed with various optimizers and learning rates.

Optimizer Learning Rate Mean
RMSProp 0.0001 31.8
Adam 0.01 79.1
Adam 0.001 89.2
Adam 0.0001 90.4
SGD 0.0001 15.6
TABLE II: Model performance on the MPII dataset using pre-trained models.
Pretrained Model Head Shoul. Elbow Wrist Hip Knee Ankle Parameters GFLOPs Mean
ResNet 50 96.1 94.3 86.9 80.4 86.8 80.6 76.3 5.64M 90.45 86.5
ResNet 101 96.7 95.3 89.4 84.4 88.5 84.3 80.1 5.64M 90.45 88.9
ResNet 152 96.9 95.4 89.4 84.4 88.5 84.3 80.1 6.77M 91.5 88.9
HRNet-w32 96.7 95.9 90.3 84.1 89.1 85.1 80.7 2.93M 7.25 89.2
HRNet-w48 97.9 95.6 90.5 85.8 89.2 85.1 80.8 17.04M 11.34 90.4
The usefulness of the proposed model is tested by experiments that compare it to other methods.

The Adam optimizer is used to train all posture estimation models, with an initial learning rate of 1e-4, a weight decay of 1e-5, and a training batch size of 4. It usually takes about 1.5 days to train the ResNet 50, 101, and 152 and HRNet-w32 and -w48 based models, which have been pre-trained on the ImageNet dataset; NVIDIA GTX 1080 GPUs are used to carry out the 250 training epochs. All human key points are evaluated in terms of mAP (mean average precision) based on the PCKh threshold; on the MPII dataset, PCKh (head-normalized probability of correct keypoint) is the widely used metric for keypoint detection.
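For concreteness, the sketch below sets up the reported optimizer configuration and a simplified PCKh computation; `model` is a placeholder module, and the head-size handling is an assumption (PCKh normalizes the joint error by a fraction of the annotated head segment length).

import numpy as np
import torch

# Optimizer configuration reported in the paper: Adam, initial learning rate
# 1e-4, weight decay 1e-5, batch size 4. `model` is a placeholder module.
model = torch.nn.Conv2d(3, 16, kernel_size=3, padding=1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-5)
batch_size = 4

def pckh(pred, gt, head_sizes, threshold=0.5):
    """Simplified PCKh: a predicted joint is correct when its distance to the
    ground truth is below `threshold` times the head segment length.

    pred, gt:   arrays of shape (N, K, 2) with joint coordinates.
    head_sizes: array of shape (N,) with per-person head segment lengths.
    Returns the fraction of correctly localized joints.
    """
    dists = np.linalg.norm(pred - gt, axis=2)            # (N, K)
    correct = dists <= threshold * head_sizes[:, None]   # normalize per person
    return correct.mean()

if __name__ == "__main__":
    gt = np.zeros((1, 16, 2))
    pred = gt + np.array([3.0, 4.0])                     # every joint off by 5 px
    print(pckh(pred, gt, head_sizes=np.array([20.0])))   # 1.0 (5 <= 0.5 * 20)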
In our experiments, a number of learning rates and optimizers are also examined. A study of learning rate and optimizer performance on the MPII dataset is shown in Table I. The table demonstrates that the Adam optimizer, with a learning rate of 0.0001, performs more effectively than the others and yields the highest detection rate. Only the MPII dataset, trained for more than 100 epochs, is used for this analysis, and ResNet50 is employed due to hardware limitations: it trains more quickly, fits better on the available hardware, and is smaller. Thanks to the ResNet "identity shortcut connection," the model can skip one or more layers, which allows networks of hundreds of layers to be trained without a loss in performance.
Since our model has multiple stages, a number of designs have been examined as the basic framework for those stages. Various backbone models, namely ResNet versions 50, 101, and 152, HRNet-w32, and HRNet-w48, are examined. As can be seen in Table II, ResNet 101 outperforms ResNet 50, ResNet 152 outperforms ResNet 101 very slightly, HRNet-w32 outperforms ResNet 152, and HRNet-w48 outperforms the others.
As more layers are added to an architecture, its performance steadily increases. Despite having the most parameters and the second-highest GFLOPs among the alternatives, HRNet-w48 achieves the highest accuracy. Therefore, HRNet-w48 is the recommended choice for the backbone architecture of the proposed model, owing to its exceptional accuracy and its ability to identify vital obscured locations.

We also investigated the improved HRNet-w48 network and tested several configurations for fine-tuning. Model II achieves the highest mAP of 90.4%, according to Table III. Table IV shows that performance increases with the number of stages in the multi-stage estimation process: the fifth-stage model improves performance from 88.0% to 90.4% compared to the first stage, demonstrating the effectiveness of the multi-stage refinement.
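As a hedged illustration of the fine-tuning configurations of Table III, the sketch below freezes every backbone parameter whose name does not match one of the stages selected for training; the stage names are placeholders, since the paper does not give the parameter names of its HRNet-w48 implementation.

import torch.nn as nn

def freeze_except(model: nn.Module, trainable_keywords):
    """Freeze every parameter whose name contains none of the given keywords.

    trainable_keywords: e.g. ("stage4",) for Model I (train only the last
    stage) or ("stage3", "stage4") for Model II (train the last two stages);
    these names are placeholders for the real module names.
    """
    trainable = 0
    for name, param in model.named_parameters():
        param.requires_grad = any(k in name for k in trainable_keywords)
        if param.requires_grad:
            trainable += param.numel()
    return trainable  # number of trainable parameters, as reported in Table III

if __name__ == "__main__":
    # Tiny stand-in model with "stages" in its parameter names.
    toy = nn.Sequential()
    toy.add_module("stage3", nn.Conv2d(8, 8, 3, padding=1))
    toy.add_module("stage4", nn.Conv2d(8, 8, 3, padding=1))
    print(freeze_except(toy, ("stage4",)))  # trains only the last stage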
In our method for estimating poses, the Branch 1 architecture has been improved by incorporating an occlusion net, and the purpose here is to investigate the effect of recognizing occluded key points. To detect occluded joints, reliable ground-truth annotations are crucial for the network. Table V shows the difference between including and excluding the occlusion net: the method performs considerably worse without it, whereas the proposed method achieves exceptional accuracy when the occlusion net is used. Although the proposed method without the occlusion net requires fewer parameters and GFLOPs, it has lower accuracy and weaker detection ability on occluded key points. Fig. 4 visualizes the comparison between the proposed method with and without the occlusion net. Notably, the results show that the proposed method effectively detects both occluded and visible key points when trained with the occlusion net.

Table VI shows how the proposed approach for 2D human posture estimation compares to previous 2D methods. The data indicate that our work improves significantly upon prior research. To attain these outcomes, a pre-trained transfer model is employed to extract crucial features, and the parallel branch structure of the architecture further increases its effectiveness.

TABLE III: Different arrangements of the optimized models.

Models  Backbone Network & Trainable Layers  Trainable Parameters  GFLOPs  mAP
Model I  Freeze HRNet-w48 layers except the last stage  1,536,016  3.19  88.7
Model II  Freeze HRNet-w48 layers except the last 2 stages  17,037,328  11.34  90.4

TABLE IV: A comparison of multi-stage branch networks for HRNet-w48 training on the MPII dataset.

Total stages 1 2 3 4 5
mAP 88.0 89.1 89.5 90.0 90.4
TABLE V: The impact of adding or not adding Occlusion net to branch extension.
Occlusion Net. Head Shoul. Elbow Wrist Hip Knee Ankle Parameters GFLOPs Mean
Excluded 95.2 94.3 86.6 81.5 86.0 81.2 75.3 8.19M 9.07 86.9
Included 97.9 95.6 90.5 85.8 89.2 85.1 80.8 17.04M 11.34 90.4
TABLE VI: Comparing performance with other models on MPII’s test set using the PCKh@0.5.
Methods Head Shoul. Elbow Wrist Hip Knee Ank. Mean
Tompson et al. 95.8 90.3 80.5 74.3 77.6 69.7 62.8 79.6
Carreira et al. 95.7 91.7 81.7 72.4 82.8 73.2 66.4 81.3
Tompson et al. 96.1 91.9 83.9 77.8 80.9 72.3 64.8 82
Pishchulin et al. 94.1 90.2 83.4 77.3 82.6 75.7 68.6 82.4
Wei et al. 97.8 95 88.7 84.0 88.4 82.8 79.4 88.5
Newell et al. 97.6 95.4 90.0 85.2 88.7 85 80.6 89.4
Chen et al. 68.7 77.6 63.2 90.2 68.2 58.6 83.2 65.0
Our Model 97.9 95.6 90.5 85.8 89.2 85.1 80.8 90.4