
SALAT POSE DETECTION USING DEEP LEARNING ARCHITECTURE

KEYPOINTS:
MediaPipe library:
MediaPipe is a cross-platform framework by Google that allows users to create machine learning (ML)
solutions for mobile and web applications, edge devices, and IoT.

3D CNN Architecture:
A 3D Convolutional Neural Network (3D CNN) is a neural network whose convolution kernels slide along
three dimensions (for video: width, height, and time), with multiple layers that learn hierarchical
representations of the data. Each layer learns increasingly complex spatiotemporal features. These
representations can then be used for various tasks such as classification, regression, or generation.
Each type of layer serves a specific function.
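To make the third dimension concrete, here is a minimal sketch of a single 3D convolution step over a video volume indexed as (time, height, width). It is pure Python with a hypothetical toy input, not any particular framework's API; because the kernel also slides along the time axis, the layer can respond to motion between frames, which is what makes 3D CNNs suitable for posture transitions.

```python
# Minimal sketch of one 3D convolution (valid padding, stride 1).
# The input volume is indexed as (time, height, width); the kernel
# slides along all three axes, so it can capture motion as well as
# spatial patterns.

def conv3d(volume, kernel):
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        frame = []
        for i in range(H - kh + 1):
            row = []
            for j in range(W - kw + 1):
                s = 0.0
                for dt in range(kt):
                    for di in range(kh):
                        for dj in range(kw):
                            s += volume[t + dt][i + di][j + dj] * kernel[dt][di][dj]
                row.append(s)
            frame.append(row)
        out.append(frame)
    return out

# Hypothetical toy input: 2 frames of 3x3 pixels, filtered by a
# 2x2x2 averaging kernel (each weight is 1/8).
video = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
         [[9, 8, 7], [6, 5, 4], [3, 2, 1]]]
mean_kernel = [[[0.125] * 2] * 2] * 2
print(conv3d(video, mean_kernel))  # -> [[[5.0, 5.0], [5.0, 5.0]]]
```

A real 3D CNN stacks many such layers (with learned kernels, nonlinearities, and pooling), but the sliding-window arithmetic is exactly this.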

What is object detection?

Object detection is a computer vision task that involves identifying and locating objects in images or
videos. It is an important part of many applications, such as surveillance, self-driving cars, or robotics.
Object detection algorithms can be divided into two main categories: single-shot detectors and
two-stage detectors.
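Both detector families score candidate boxes against ground truth using intersection over union (IoU), the standard overlap metric in object detection. A small sketch of that computation (the (x1, y1, x2, y2) box format is an assumption for illustration):

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Overlap rectangle, clamped to zero width/height when boxes are disjoint.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175, about 0.143
```

Detectors typically count a prediction as correct when its IoU with a ground-truth box exceeds a threshold such as 0.5.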

Media Pipe Computer Vision Library:

Google MediaPipe is an open-source computer vision library meant to help developers create machine
learning and computer vision applications.
Its components cover a wide range of computer vision tasks. It excels at human pose
estimation, allowing the tracking of important body landmarks in applications such as fitness tracking,
gesture recognition, and augmented reality experiences. It also offers hand tracking, which detects and
tracks hand gestures and is useful for interactive applications, sign language recognition, and virtual
hand control.
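As a sketch of how MediaPipe's pose output is typically consumed: the Pose solution returns 33 body landmarks with normalized (x, y) coordinates that downstream code indexes by position. The landmark indices below follow MediaPipe's published topology; the helper function and sample data are hypothetical stand-ins so the snippet is self-contained without the library installed.

```python
# A few of MediaPipe Pose's 33 landmark indices (per its published topology).
LEFT_SHOULDER, RIGHT_SHOULDER = 11, 12
LEFT_HIP, RIGHT_HIP = 23, 24
LEFT_KNEE, RIGHT_KNEE = 25, 26

# With the library installed, landmarks would come from roughly:
#   import mediapipe as mp
#   results = mp.solutions.pose.Pose().process(rgb_frame)
#   landmarks = results.pose_landmarks.landmark
# Here a hypothetical list of normalized (x, y) tuples stands in for that.

def midpoint(landmarks, i, j):
    """Midpoint of two landmarks, each a normalized (x, y) tuple."""
    return ((landmarks[i][0] + landmarks[j][0]) / 2,
            (landmarks[i][1] + landmarks[j][1]) / 2)

fake = [(0.0, 0.0)] * 33
fake[LEFT_HIP], fake[RIGHT_HIP] = (0.25, 0.5), (0.75, 0.5)
print(midpoint(fake, LEFT_HIP, RIGHT_HIP))  # (0.5, 0.5), the pelvis center
```

Derived points like the pelvis center are a common starting feature for posture classification, since they are stable across camera positions.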

Body key-point estimation


It is an important computer vision task that entails precisely identifying and tracking specific anatomical
landmarks, or key points, on the human body within images or video frames. These points generally
correspond to joints, limb endpoints, and other critical body parts. The basic goal of
body key-point estimation is to detect and locate these key points reliably, allowing for thorough
tracking of body poses, movements, and gestures.
Pose detection
The ML Kit Pose Detection API is a lightweight, versatile solution for app developers to detect the pose of
a subject's body in real time from a continuous video or static image. A pose describes the body's
position at one moment in time with a set of skeletal landmark points. The landmarks correspond to
different body parts such as the shoulders and hips. The relative positions of landmarks can be used to
distinguish one pose from another.
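One common way to exploit those relative positions is to compute joint angles from the landmarks. The sketch below is an illustration, not part of any particular API: the (x, y) coordinate convention and the bowing threshold of 120 degrees are assumptions, and it distinguishes only a standing posture from a bowing (ruku) posture via the hip angle.

```python
import math

def angle(a, b, c):
    """Angle at point b (in degrees) formed by points a-b-c, each an (x, y) pair."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return math.degrees(math.acos(dot / (math.hypot(*v1) * math.hypot(*v2))))

def classify(shoulder, hip, knee, bow_threshold=120.0):
    # Hip angle near 180 degrees: torso in line with the legs (standing).
    # Well below the (assumed) threshold: torso folded forward (bowing).
    return "standing" if angle(shoulder, hip, knee) > bow_threshold else "bowing"

# Hypothetical normalized coordinates; y grows downward, as in image space.
print(classify((0.5, 0.2), (0.5, 0.6), (0.5, 0.9)))  # standing (angle ~ 180)
print(classify((0.9, 0.6), (0.5, 0.6), (0.5, 0.9)))  # bowing (angle ~ 90)
```

A full Salat classifier would combine several such angles (hips, knees, elbows) per frame, or feed the landmark sequence to a temporal model.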

What is OpenPose?

At its core, OpenPose is a groundbreaking pose estimation tool. It uses advanced neural networks to
detect human bodies, hands, and facial keypoints in images and videos. Imagine a system that can track
every movement of a dancer or the subtle expressions of a speaker – that's OpenPose in action. To
make it more relatable, think of it as teaching computers to understand and interpret human body
language in a way that was never possible before.
OpenPose is a real-time multi-person keypoint detection library for body, face, and hand estimation. It is
capable of detecting 135 keypoints.
It is a deep learning-based approach that can infer the 2D location of key body joints (such as elbows,
knees, shoulders, and hips), facial landmarks (such as eyes, nose, mouth), and hand keypoints (such as
fingertips, wrist, and palm) from RGB images or videos.
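In practice, OpenPose emits each detected keypoint as an (x, y, confidence) triple, and downstream code usually filters out low-confidence points before using them. The filtering helper, threshold, and sample data below are hypothetical, not OpenPose's own API:

```python
# OpenPose-style keypoints: (x, y, confidence) triples in pixel coordinates.
# Undetected joints typically come back with confidence near zero, so a
# confidence filter is the usual first post-processing step.

def visible(keypoints, min_conf=0.3):
    """Keep (x, y) for keypoints whose confidence clears min_conf (assumed threshold)."""
    return [(x, y) for (x, y, c) in keypoints if c >= min_conf]

pose = [(120.0, 80.0, 0.95),   # e.g. nose, confidently detected
        (118.0, 150.0, 0.90),  # e.g. neck, confidently detected
        (0.0, 0.0, 0.05)]      # occluded joint, effectively undetected
print(visible(pose))  # [(120.0, 80.0), (118.0, 150.0)]
```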
GAP:

Personalized Feedback and Guidance:

While existing approaches focus on detecting Salat postures, a unique direction could involve providing
personalized feedback and guidance to individuals based on their performance. This could involve
integrating additional sensors or data sources (such as wearable devices or depth sensors) to capture
more detailed information about the user's movements and postures during Salat. By leveraging this
data along with deep learning techniques, you could develop a system that not only detects postures
but also provides real-time feedback to users on their form, alignment, and consistency with traditional
Salat practices. This personalized feedback could help individuals improve their prayer practice over
time and adapt to any physical limitations or variations in their movements. Additionally, you could
explore incorporating educational resources or interactive features into the system to provide users
with guidance on correct postures and techniques, enhancing their overall prayer experience.

This approach would not only contribute to the field of human-computer interaction and assistive
technology but also serve as a valuable tool for individuals seeking to deepen their understanding and
practice of Salat. It would require interdisciplinary collaboration between experts in deep learning,
sensor technology, Islamic studies, and user experience design to develop an effective and culturally
sensitive solution.
