THESIS

CHAPTER 1: INTRODUCTION
1.0 Introduction About the Project
In this project, our focus extends beyond posture detection alone; we also
explore how real-time deep learning can enhance the overall user experience and
functionality of the system. By leveraging PoseNet's capabilities alongside other
deep learning models, we aim to develop features such as real-time feedback
mechanisms that alert users to incorrect postures, personalized exercise
recommendations based on posture analysis, and gamification elements that
make posture correction more engaging and motivating.
The project also involves extensive data collection and model training to ensure
the accuracy and reliability of the posture detection system. We integrate data
augmentation, transfer learning, and model optimization techniques to handle
diverse body types, clothing variations, and environmental conditions. This
approach improves the system's performance and its adaptability to different
usage scenarios and user demographics.
Additionally, we are exploring the integration of additional sensors, such as
accelerometers and gyroscopes, to complement PoseNet's vision-based approach.
This multi-modal fusion can provide richer insights into posture dynamics,
stability, and movement patterns, making the system more comprehensive and
valuable for applications such as sports performance analysis, injury
prevention, and rehabilitation tracking.
Overall, the project combines computer vision, deep learning, and sensor fusion,
with the goal of creating a posture detection system that promotes better health,
performance, and quality of life for users across diverse domains.
1.1 Problem Statement
■ The project began with the idea of combining the state-of-the-art PoseNet
model with deep learning to transform real-time posture analysis for better
health.
■ The system should be effortless to use, so that even a novice can operate
it. By empowering individuals with real-time posture insights, it serves as a
proactive health guardian, preventing musculoskeletal issues and promoting
well-being.
■ The main purpose of the project is to revolutionize posture awareness,
proactively enhance health, prevent musculoskeletal disorders, and foster
well-being through real-time insights.
1.2 Innovative Ideas Of Project
■ Interactive Posture Dashboard: Create a user-friendly interface for
real-time posture analysis, recommendations, and engagement.
■ Cross-Platform Compatibility: Ensure usability across devices and
operating systems.
■ Scalable Infrastructure: Build a scalable system for unlimited client
support and seamless performance.
1.3 Project Objective
■ Adjusting Posture Dynamically: The main goal is to promote proactive
health habits by dynamically correcting and improving posture in real-time.
■ Effective Human-Machine Communication: The project's goal is to create
a smooth interface between people and technology while fostering creative
teamwork to improve well-being.
1.4 Scope of the Project
■ Health and Fitness Monitoring: Utilize PoseNet for real-time posture
detection during exercises, yoga, and daily activities to provide users with
feedback on their posture.
■ Implement posture detection to alert individuals about incorrect sitting
positions and improve ergonomics in office settings.
■ Utilize real-time posture detection to track progress, ensure correct exercise
execution, and aid therapists in adjusting treatment plans for rehabilitation
and physical therapy.
■ Enhanced Human-Computer Interaction: Incorporate posture detection
into gaming and virtual reality systems for more intuitive user control based
on body movements and postures.
1.5 Features
The posture detection project, built on PoseNet and real-time deep learning,
offers the following features:
■ Real-time Key Point Detection
■ Multi-Person Pose Tracking
■ Joint Angle Calculation
■ Posture Assessment and Feedback
■ User-Friendly Interface
■ Cross-Platform Compatibility
■ Data Privacy and Security (Encryption)
■ Integration with Wearable Devices.
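The Joint Angle Calculation feature can be illustrated with a minimal sketch. It assumes each key point is available as an (x, y) pixel coordinate; the function name `joint_angle` and the example coordinates are hypothetical and not taken from the project code.

```python
import math

def joint_angle(a, b, c):
    """Return the angle at joint b (in degrees) formed by points a-b-c.

    Each point is an (x, y) tuple, e.g. shoulder, elbow, wrist.
    """
    # Vectors from the joint to its two neighbouring key points.
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0 or n2 == 0:
        raise ValueError("key points must not coincide with the joint")
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against floating-point values slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))

# Hypothetical shoulder, elbow, and wrist positions in pixels.
elbow_angle = joint_angle((0, 100), (0, 0), (100, 0))
```

An angle near 180 degrees indicates a fully extended limb, while smaller values indicate a bent joint; posture rules can then be expressed as thresholds on such angles.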
1.6 System Requirements
The system is designed to run correctly with modest resources. The minimum
requirements are:
■ The system needs a minimum of 2 GB of RAM to run all the options and a
minimum 1.3 GHz processor to run smoothly.
■ Everything else depends on the user's usage and hardware. For security,
installing anti-virus software is suggested.
RAM: At least 256 MB of RAM. The amount of RAM needed depends on the
number of concurrent client connections, and whether the server and multiplexor
are deployed on the same host.
Disk Space: Approximately 300 MB is required for Instant Messaging Server
software.
Processor: Minimum 1.3 gigahertz (GHz) x86 or x64 dual-core processor with the
SSE2 instruction set; recommended 3.3 GHz or faster 64-bit dual-core processor
with the SSE2 instruction set.
Memory: Minimum 2 GB RAM; 4 GB RAM or more is recommended.
The system is built correctly, and all testing has been done as per the
requirements. The rest depends on the user, and no one can harm the data or the
software if proper care is taken.
CHAPTER 2: LITERATURE SURVEY
PAPER 1:
Title: Yoga Pose Detection Using Posenet and k-NN
Author: Diwakar Shah, Vidya Rautela, Chirag Sharma, and Angelin Florence A
Year: 2021
Algorithm/Methodology Used:
PoseNet for pose estimation and k-NN for classifying and correcting poses based on
detected key points of human limbs.
Outcomes: Successful utilization of advanced technologies for video-based analysis.
PAPER 2:
Title: Body Posture Detection and Comparison between OpenPose, MoveNet and
PoseNet
Author: Prakarsh Kaushik, Bhanu Prakash Lohani, Anil Thakur, Amardeep Gupta
Year: 2023
Algorithm/Methodology Used:
This paper compares three CNN-based detectors for human pose estimation. OpenPose
detects 2D poses of multiple people, while PoseNet and MoveNet focus on
key-point-based body part detection.
Outcomes: Aims to develop a creative content-generating chatbot, evaluate its
performance, and suggest directions.
PAPER 3:
Title: DETECTING GYM POSE USING HUMAN POSTURE RECOGNITION
Author: Avadhut Jagde, Aaditya Mane, Tanishq Hawaldar, Sumit Mundhe, Sandesh.
Year: 2023
Algorithm/Methodology Used:
The project introduces a human-oriented graph method and updates features via a
graph convolutional network (GCN), surpassing CNN models on PoseNet.
Outcomes: Developed a Pose Trainer system using pose estimation and machine
learning for posture feedback.
PAPER 4:
Title: Classification of Yoga Posture Using POSENET
Author: Krushnkant Somwanshi, Nayan Jagtap, Abhishek Badadal, Shantanu Nikam.
Year: 2022
Algorithm/Methodology Used:
This project utilizes CNN and LSTM for real-time pose detection and pose
classification to enhance the feedback for poses.
Outcomes: Improved posture and real-time posture feedback.
CHAPTER 3: SYSTEM ARCHITECTURE
3.1 Existing system
● Incorporate the PoseNet model seamlessly into the existing deep learning
system. Ensure compatibility with the chosen framework, for example
TensorFlow.js for web-based real-time applications.
● Capture real-time video frames, commonly sourced from a webcam using
libraries like OpenCV. Preprocess each frame to match the input
requirements of the PoseNet model, which may involve resizing or
normalization.
● Apply the pre-trained PoseNet model to detect key points (landmarks)
representing various body parts in each video frame. Extract these key points
for further analysis.
● Optimize the code and model for performance, considering hardware
acceleration, and test the system with various poses.
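The preprocessing step described above (resizing and normalization) can be sketched as follows. This is a minimal, dependency-free illustration that assumes a frame is a nested list of RGB pixels with values in 0..255 and that the model expects inputs scaled to [-1, 1]; in practice a library such as OpenCV or TensorFlow.js would do this work, and the exact scaling depends on the model variant. The function names are hypothetical.

```python
def resize_nearest(frame, out_h, out_w):
    """Resize a frame (list of rows of [r, g, b] pixels) by nearest-neighbour sampling."""
    in_h, in_w = len(frame), len(frame[0])
    return [
        [frame[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]

def normalize(frame):
    """Scale 8-bit channel values from [0, 255] to [-1.0, 1.0] (assumed model input range)."""
    return [[[c / 127.5 - 1.0 for c in pixel] for pixel in row] for row in frame]

# A tiny 2x2 test frame standing in for a webcam image.
frame = [
    [[0, 0, 0], [255, 255, 255]],
    [[255, 0, 0], [0, 0, 255]],
]
model_input = normalize(resize_nearest(frame, 4, 4))
```

Nearest-neighbour sampling is used here only because it is easy to read; bilinear interpolation is the more common choice for camera frames.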
3.1.1 Disadvantages
■ Integration Complexity: Integrating PoseNet adds complexity, requiring
compatibility checks and potential conflicts.
■ Real-time Overhead: Processing webcam frames and preprocessing for
PoseNet can strain system performance.
■ Maintenance Challenges: Keeping PoseNet updated and bug-free demands
ongoing effort and testing.
■ Hardware Dependency: System performance varies based on hardware
capabilities, impacting user experience.
■ Detection Limitations: Accuracy issues may arise, especially with complex
poses or varying conditions.
3.2 Proposed System
■ Explore or design a PoseNet architecture that dynamically adapts to different
body shapes, clothing, and environmental conditions. This could involve
incorporating attention mechanisms or using pose refinement networks to
enhance accuracy.
■ Integrate multi-modal inputs, such as depth data or additional sensors, to
enhance the robustness of posture detection. Fusion of RGB images with
depth information can provide a more comprehensive understanding of the
user's pose, especially in challenging scenarios.
■ Implement a real-time feedback mechanism that not only identifies incorrect
postures but also provides guidance on corrective actions. This could involve
overlaying suggested adjustments on the user interface or providing haptic
feedback through wearables.
■ Apply Neural Architecture Search techniques to automatically discover
optimal architectures for posture detection.
■ Focus on a human-centric user experience (UX) design, ensuring that the
feedback provided is intuitive, and easy to understand, and encourages users
to actively participate in improving their posture.
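The real-time feedback mechanism proposed above could avoid flickering alerts by waiting until a posture has been incorrect for several consecutive frames. The sketch below illustrates that idea only; the class name `PostureAlert`, the default of 15 frames, and the message text are hypothetical choices, not part of the project.

```python
class PostureAlert:
    """Raise an alert only after the posture stays incorrect for several frames."""

    def __init__(self, frames_required=15):
        # Hypothetical tuning value: roughly half a second at 30 frames per second.
        self.frames_required = frames_required
        self.bad_streak = 0

    def update(self, posture_ok, hint=""):
        """Feed one frame's result; return a message when an alert is due, else None."""
        if posture_ok:
            self.bad_streak = 0
            return None
        self.bad_streak += 1
        # Fire once, exactly when the streak reaches the threshold.
        if self.bad_streak == self.frames_required:
            return f"Incorrect posture detected. {hint}".strip()
        return None

alert = PostureAlert(frames_required=3)
messages = [alert.update(False, "Straighten your back.") for _ in range(3)]
```

Because the alert fires only when the streak first reaches the threshold, the user is not notified again until the posture has been corrected and then becomes incorrect once more.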
3.2.1 Advantages
■ Adaptive Architecture: Dynamic PoseNet adapts to diverse body shapes
and environments, boosting accuracy and user satisfaction.
■ Multi-modal Fusion: Combining depth data and sensors improves posture
detection reliability, especially in challenging scenarios.
■ Real-time Feedback: Interactive guidance for correcting posture, through
visuals or haptic feedback, encourages user engagement and improvement.
CHAPTER 4: SYSTEM REQUIREMENTS AND TOOLS OF PROJECT
4.1 Hardware Requirements
■ RAM: At least 256 MB of RAM. The amount of RAM needed depends on
the number of concurrent client connections, and whether the server and
multiplexor are deployed on the same host.
■ Disk Space: Approximately 300 MB is required for Instant Messaging
Server software.
■ Processor: Minimum 1.3 gigahertz (GHz) x86 or x64 dual-core
processor with the SSE2 instruction set; recommended 3.3 GHz or faster
64-bit dual-core processor with the SSE2 instruction set.
■ Memory: Minimum 2 GB RAM; 4 GB RAM or more is recommended.
The system is built correctly, and all testing has been done as per the
requirements. The rest depends on the user, and no one can harm the data or
the software if proper care is taken.
4.2 Languages And Libraries
■ HTML:
HTML (Hypertext Markup Language) is the code that is used to structure a
web page and its content. For example, content could be structured within a
set of paragraphs, a list of bulleted points, or using images and data tables.
■ CSS:
CSS (Cascading Style Sheets) is used to style and lay out web pages, for
example, to alter the font, color, size, and spacing of your content, split it
into multiple columns, or add animations and other decorative features.
■ JavaScript:
JavaScript is a text-based programming language used both on the client
side and server side to allow you to make web pages interactive. Where
HTML and CSS are languages that give structure and style to web pages,
JavaScript gives web pages interactive elements that engage a user.
● ReactJS:
ReactJS is one of the most popular JavaScript front-end libraries, with a
strong foundation and a large community. It is a declarative, efficient, and
flexible JavaScript library for building reusable UI components. The main
objective of ReactJS is to develop User Interfaces (UI) that improve the
speed of apps. It uses a virtual DOM (a JavaScript object), which improves
the performance of the app.
● PoseNet:
PoseNet is a real-time pose detection technique that detects human poses in
images or video. It works in two modes: single-pose detection (one person)
and multi-pose detection (multiple people). In simple terms, PoseNet is a
deep learning TensorFlow model that estimates human pose by
detecting body parts such as elbows, hips, wrists, knees, and ankles, and
forms a skeleton structure of the pose by joining these points.
● Python:
Python is a high-level, general-purpose programming language. Its design
philosophy emphasizes code readability with the use of significant
indentation. Python is dynamically typed and garbage-collected. It supports
multiple programming paradigms, including structured, object-oriented, and
functional programming.
● TensorFlow.js:
TensorFlow.js is a JavaScript library for training and deploying machine
learning models on web applications and in Node.js. You can develop the
machine learning models from scratch using tensorflow.js or can use the
APIs provided to train your existing models in the browser or on your
Node.js server.
4.3 Software Requirement

■ VS Code:
Visual Studio Code is a streamlined code editor with support for
development operations like debugging, task running, and version control. It
aims to provide just the tools a developer needs for a quick
code-build-debug cycle and leaves more complex workflows to fuller
featured IDEs, such as Visual Studio IDE.
■ Google Colab:
Colab is a hosted Jupyter Notebook service that requires no setup to use and
provides free access to computing resources, including GPUs and CPUs.
Colab is especially well suited to machine learning, data science, and
education.
■ Jupyter Notebook:
The Jupyter Notebook App is a server-client application that allows editing
and running notebook documents via a web browser. The Jupyter Notebook
App can be executed on a local desktop requiring no Internet access or can
be installed on a remote server and accessed through the Internet.
CHAPTER 5: ARCHITECTURE DIAGRAM
5.1 PoseNet Architecture: How We Approach it?
The figure shows the architecture of PoseNet, a neural network architecture for
human pose estimation.
Fig 1: Architecture of PoseNet.

The architecture for real-time posture detection using PoseNet involves several
key components and steps. It begins with capturing input from a camera source,
which could be a webcam or another camera. This input is then preprocessed,
which may include resizing or normalization, to prepare it for the PoseNet model.
PoseNet, a convolutional neural network (CNN), is then used to estimate human
poses, providing key point predictions such as joint positions. Throughout this
process, optimization for real-time performance is crucial, often leveraging
hardware acceleration and efficient algorithms.
The CNN serves as the foundation for the design of the PoseNet pose detection
model. To extract useful information, it takes an input image and runs it through
several convolutional layers.
These convolutional layers capture the image's patterns and structures. The
single-person pose estimation method used by PoseNet focuses on estimating the
pose key points of one person.
The 2D coordinates of body key points can be directly regressed using the CNN
architecture. This means that, during training, the model learns to predict the
X and Y coordinates of body joints such as the wrists, elbows, knees, and ankles.
Pose estimation is fast because of the PoseNet architecture's simplicity,
making it well suited to applications with constrained processing resources, such
as web browsers or smartphones. It offers a quick and simple way to determine a
person's pose in an image or video.
CHAPTER 6: REQUIREMENTS SPECIFICATION AND DESIGN
6.1 Requirements
6.2 Use case Diagram

6.3 Activity Diagram
6.4 Class Diagram

6.5 Entity Relationship Diagram
6.6 Data Flow Diagram

CHAPTER 7: MODULES
7.1 Project Perspective
■ The system uses data and AI to customize content for users, improving
engagement. Think of tailored recommendations on streaming platforms or
personalized product suggestions in e-commerce.
■ Ethical Guidelines: Consider the ethical implications of
AI-generated content, such as misinformation and biases. Establishing
standards for responsible content generation is essential.
7.2 Functional and Non-Functional Requirements
7.2.1 Functional Requirements
■ Real-Time Pose Detection: The system should accurately detect and track
key points of the human body (such as joints and limbs) in real-time using
PoseNet or a similar deep learning model.
■ Posture Classification: It should be able to classify detected postures into
predefined categories (e.g., sitting, standing, bending) based on the positions
of key body points.
■ User Interface Integration: The system should integrate with a user
interface to display the detected postures in a user-friendly manner,
providing real-time feedback and visualizations.
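The Posture Classification requirement above can be illustrated with a simple rule-based sketch that labels a pose as sitting, standing, or bending from the hip and knee angles. The thresholds, the function names, and the choice of a rule-based approach are assumptions made for illustration; a trained classifier could be used instead.

```python
import math

def angle_at(a, b, c):
    """Angle in degrees at point b formed by the segments b-a and b-c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(v1[0], v1[1]) * math.hypot(v2[0], v2[1])
    if norm == 0:
        raise ValueError("key points must be distinct")
    cos_theta = max(-1.0, min(1.0, (v1[0] * v2[0] + v1[1] * v2[1]) / norm))
    return math.degrees(math.acos(cos_theta))

def classify_posture(shoulder, hip, knee, ankle):
    """Classify a side-view pose from four (x, y) key points.

    The angle thresholds below are hypothetical and would need tuning on real data.
    """
    hip_angle = angle_at(shoulder, hip, knee)
    knee_angle = angle_at(hip, knee, ankle)
    if hip_angle >= 160 and knee_angle >= 160:
        return "standing"
    if hip_angle < 130 and knee_angle < 130:
        return "sitting"
    if hip_angle < 130 and knee_angle >= 150:
        return "bending"
    return "unknown"

# Image coordinates: x grows to the right, y grows downward.
label = classify_posture(shoulder=(0, 0), hip=(0, 1), knee=(0, 2), ankle=(0, 3))
```

Returning "unknown" for poses that match no rule keeps the interface from showing a confident label when the key points are ambiguous.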
7.2.2 Non-Functional Requirements
■ Performance: The system should have low latency and high throughput to
ensure real-time processing of video streams or live camera feeds without
significant delays.
■ Accuracy: The posture detection and classification algorithms should have a
high accuracy rate, minimizing false positives and negatives to provide
reliable results.
■ Scalability: The system should be scalable to handle varying loads and
accommodate multiple users simultaneously, especially in scenarios where
multiple cameras or sensors are used for posture detection.
7.3 Modules of Algorithm
Fig 2: Single-person pose detector pipeline using PoseNet

The PoseNet model is image size invariant, which means it can predict pose
positions in the same scale as the original image regardless of whether the image is
downscaled. This means PoseNet can be configured to have a higher accuracy at
the expense of performance by setting the output stride at runtime.
The output stride determines how much we're scaling down the output relative to
the input image size. It affects the size of the layers and the model outputs. The
higher the output stride, the smaller the resolution of the layers in the network
and of the outputs, and correspondingly the lower their accuracy.
Fig 3: The output stride determines how much we’re scaling down the output relative to the input image size. A higher output stride is faster but
results in lower accuracy.
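The effect of the output stride can be made concrete with a small calculation. The sketch below assumes the relation commonly given for PoseNet, resolution = ((input size - 1) / output stride) + 1, and the stride values 8, 16, and 32; the function name `output_resolution` is hypothetical.

```python
def output_resolution(input_size, output_stride):
    """Side length of the model output grid for a square input image.

    Assumes resolution = ((input_size - 1) / output_stride) + 1.
    """
    if output_stride not in (8, 16, 32):
        raise ValueError("output stride is assumed to be 8, 16, or 32")
    return (input_size - 1) // output_stride + 1

# Compare the output grid size for each stride on a 225-pixel-wide input.
for stride in (8, 16, 32):
    print(stride, output_resolution(225, stride))
```

Under these assumptions, a smaller stride yields a finer output grid (more positions at which key points can be located) at the cost of more computation per frame.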
7.4 Modules of Posenet

■ PoseNet Model:
  ■ Obtain a pre-trained PoseNet model. TensorFlow.js provides a PoseNet
  model that you can use directly in the browser for real-time pose
  estimation.
■ Webcam Input:
  ■ Set up code to access the user's webcam. You can use browser APIs
  like `navigator.mediaDevices.getUserMedia` to get access to the
  webcam stream.
■ Image Preprocessing:
  ■ Resize the webcam frames to the input size required by PoseNet (e.g.,
  224x224 pixels).
  ■ Convert the frames to the appropriate format (e.g., RGB, normalized).
■ Deep Learning Inference:
  ■ Use the PoseNet model to perform inference on each frame from the
  webcam stream.
  ■ Process the output of the model to extract key points of the detected
  pose (e.g., coordinates of joints like shoulders, elbows, etc.).
■ Pose Visualization:
  ■ Draw the detected pose on the webcam frame in real time. You can
  use HTML canvas or WebGL for efficient rendering.
  ■ Connect the key points to form a skeleton representing the pose.
■ User Interface:
  ■ Create a user interface to display the webcam feed and the rendered
  pose.
  ■ Add controls for starting/stopping the pose detection, adjusting
  settings, etc.
■ Performance Optimization:
  ■ Implement optimizations for performance, such as using web workers
  for inference, batching inference requests, or using hardware
  acceleration if available (e.g., WebGL for GPU acceleration).
■ Integration:
  ■ Integrate the pose detection module into your larger application or
  system if needed. This could involve communication with other
  modules or APIs.
■ Testing and Validation:
  ■ Test the real-time pose detection system with different poses, lighting
  conditions, and camera angles to ensure robustness and accuracy.
Each of these modules plays a crucial role in creating a functional and efficient
real-time pose detection system using PoseNet and deep learning techniques.
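The Pose Visualization module connects key points into a skeleton. A minimal sketch of that step is shown below: it selects the line segments to draw, skipping any key point whose confidence score is too low. The key point names and the list of connected pairs follow the naming commonly used for PoseNet's body key points but should be treated as assumptions here, as should the 0.5 score threshold and the function name.

```python
# Pairs of key points joined to draw the skeleton (assumed adjacency list).
SKELETON_EDGES = [
    ("leftShoulder", "rightShoulder"),
    ("leftShoulder", "leftElbow"), ("leftElbow", "leftWrist"),
    ("rightShoulder", "rightElbow"), ("rightElbow", "rightWrist"),
    ("leftShoulder", "leftHip"), ("rightShoulder", "rightHip"),
    ("leftHip", "rightHip"),
    ("leftHip", "leftKnee"), ("leftKnee", "leftAnkle"),
    ("rightHip", "rightKnee"), ("rightKnee", "rightAnkle"),
]

def skeleton_segments(keypoints, min_score=0.5):
    """Return the line segments to draw for one detected pose.

    keypoints maps a part name to (x, y, score). A segment is kept only
    when both of its end points meet the minimum confidence score.
    """
    segments = []
    for part_a, part_b in SKELETON_EDGES:
        if part_a in keypoints and part_b in keypoints:
            ax, ay, a_score = keypoints[part_a]
            bx, by, b_score = keypoints[part_b]
            if a_score >= min_score and b_score >= min_score:
                segments.append(((ax, ay), (bx, by)))
    return segments

# Hypothetical detections: the wrist is too uncertain to be drawn.
pose = {
    "leftShoulder": (100, 100, 0.9),
    "leftElbow": (110, 150, 0.8),
    "leftWrist": (120, 200, 0.2),
}
segments = skeleton_segments(pose)
```

Each returned segment can then be drawn on the frame, for example as a line on an HTML canvas in the browser.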
CHAPTER 8: SYSTEM TESTING
