MOVEMENT ANALYSIS
Prepared By
Madhav Goel 1806810167
2021
Supervisor
DR. MUKESH RAWAT
1. INTRODUCTION
PROBLEM STATEMENT
WHY IS THE PARTICULAR TOPIC CHOSEN?
PURPOSE OF SYSTEM REQUIREMENTS SPECIFICATION (SRS)
SCOPE
DEFINITIONS, ACRONYMS, AND ABBREVIATIONS
REFERENCES
DOCUMENT OVERVIEW
2. OVERALL DESCRIPTION
MAJOR TASKS
METHODOLOGY
i. DEEP LEARNING V/S MACHINE LEARNING
ii. CONVOLUTIONAL NEURAL NETWORK
WHOLE PROCESS
DATA FLOW DIAGRAM
3. SPECIFIC REQUIREMENTS
FUNCTIONAL REQUIREMENTS
i. MOTION DETECTION
ii. FACE DETECTION
iii. GENDER AND AGE DETECTION
iv. STORING INTO DATABASE
v. CREATE VISUALIZATION
Use case
NON-FUNCTIONAL REQUIREMENTS
i. RELIABILITY
ii. REAL-TIME
iii. EASY TO USE
iv. SCALABLE
4. SUPPORTING INFORMATION
Problem statement
Over the past years, many methods have been proposed to solve the age and
gender classification problem. Many of these methods rely on handcrafted
features and perform unsatisfactorily when predicting age and gender from
unconstrained, in-the-wild images. These conventional hand-engineered
methods rely on differences in the dimensions of facial features and on face
descriptors, which cannot handle the degree of variation observed under
such challenging, unconstrained imaging conditions. Images of this kind
vary in appearance, noise, pose, and lighting, all of which can reduce the
ability of manually designed computer vision methods to accurately
classify the age and gender of the people in the images.
Purpose of System Requirements Specification (SRS)
Scope
Facial analysis from images has gained a lot of interest because it helps with several
different problems, such as better ad targeting for customers, better content
recommendation systems, security surveillance, and other fields as well.
Age and gender are very important facial attributes, and identifying them is
a basic part of facial analysis and a required step for such tasks. Many companies
use these kinds of tools for different purposes, making it easier for them to work
with customers, cater to their needs better, and create a great experience for them. It
is easier to identify and predict the needs of people based on their gender and age.
This project follows the same path but extends the functionality: it will not only
detect age and gender but will also build a database from the results and present
it in a way that lets a non-technical authority understand the current status and
make better decisions towards growth.
Definitions, Acronyms, and Abbreviations
2. Training Data: A training dataset is the dataset used to train the algorithm or
model so that it can accurately predict the outcome.
3. Test Data: A test dataset is a set that is independent of the training dataset
but follows the same probability distribution as the training dataset.
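The relationship between the two definitions above can be illustrated with a minimal sketch of a random split. This is only an illustration under stated assumptions: the function name split_train_test, the 20% test fraction, and the annotation rows are hypothetical, and the standard-library random module stands in for whatever splitting utility the project actually uses.

```python
import random

def split_train_test(samples, test_fraction=0.2, seed=42):
    """Shuffle the samples once and split them into training and test lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(samples)                  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)     # fixed seed keeps the split reproducible
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical annotation rows: (image file name, gender label, age-group label)
annotations = [(f"face_{i}.jpg", "F" if i % 2 else "M", "25-32") for i in range(10)]
train, test = split_train_test(annotations)
print(len(train), len(test))
```

Because both parts are drawn at random from the same pool, they follow the same distribution while sharing no samples.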
References
2) https://www.analyticsvidhya.com/blog/2020/02/cnn-vs-rnn-vs-mlp-analyzing-3-types-of-neural-networks-in-deep-learning
[Author: Arvind Pillai, Date: 17/02/2020], [Accessed on: 04/10/2021]
3) Age and Gender Classification using Convolutional Neural Networks, Gil Levi and Tal
Hassner, Date: 20/08/2012, [Accessed on: 14/09/2021]
Document Overview
Methodology
Deep learning v/s machine learning
Deep learning extracts features from an image automatically and learns high-level
features directly from the data. This removes the need for domain expertise and
hand-crafted feature extraction. Deep learning techniques tend to solve a problem end
to end, whereas machine learning (ML) needs the problem to be broken down into parts
that are solved one by one, with the results merged afterwards. The best thing is that a
CNN learns its filters automatically, without their being specified explicitly.
Convolutional neural network
There are many methods we can use to solve this problem. Traditional algorithms such
as “Fisherfaces” and “Eigenfaces”, which were created for face recognition, and
feature-relation methods do not work as well as needed. We can build better solutions
using a CNN (convolutional neural network), a deep learning model that has emerged as
the most preferred model for computer vision tasks. CNNs have proven to be highly
effective when dealing with image datasets and are at the heart of most machine
learning computer vision models.
A Convolutional Neural Network is a deep neural network (DNN) widely used for the
purposes of image recognition and processing and NLP. Also known as a ConvNet, a
CNN has input and output layers, and multiple hidden layers, many of which are
convolutional. In a way, CNNs are regularized multilayer perceptrons.
It is a multilayered architecture. Three types of layers make up a CNN:
CONVOLUTIONAL LAYER
POOLING LAYER
FULLY CONNECTED LAYER
Convolutional layer –
This is the first layer of a CNN. It applies filters to the input image in order to
extract various features from it.
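A minimal sketch of what a single filter does is given below. It is an illustration only: the function name convolve2d, the tiny 3x4 image, and the hand-written vertical-edge filter are assumptions, and plain nested lists are used so the example needs no libraries. In a trained CNN the filter values are learned rather than written by hand, and the operation shown (no padding, stride 1, no kernel flipping) is the cross-correlation that deep learning libraries commonly call convolution.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            # Multiply the kernel with the image patch under it and sum the products.
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map

# Hypothetical grayscale image: dark on the left, bright on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# Hand-written filter that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, vertical_edge)
print(feature_map)  # [[0, 18, 0], [0, 18, 0]]
```

The feature map is large only where the edge lies, which is the sense in which the layer "extracts" a feature.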
Pooling layer –
The pooling layer comes after the convolutional layer. It decreases the size of the
feature maps, which in turn reduces the number of parameters to learn and the
amount of computation performed in the network.
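The size reduction can be seen in a small sketch of max pooling, one common pooling variant. The source does not say which variant the project uses, so the choice of max pooling, the 2x2 window, and the function name max_pool2d are assumptions made for illustration.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            # Keep only the largest value inside each size x size window.
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Hypothetical 4x4 feature map produced by a convolutional layer.
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 8],
    [1, 1, 7, 5],
]
pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 2], [2, 9]]
```

Sixteen values become four, so every layer that follows has a quarter as many inputs to process.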
Insights can be drawn from the visualizations, which will eventually help in making
better decisions.
Data Flow Diagram
3. SPECIFIC REQUIREMENTS
Functional Requirements
Motion detection
A video can be treated as a stack of pictures called frames. Each frame is compared
with the first frame, which should be static (no movement initially). Two images are
compared by comparing the intensity value of each pixel. Once motion is detected in
a frame, the system adds an entry to the log file.
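The comparison described above can be sketched as follows. This is a simplified illustration, not the project's implementation: frames are small grayscale grids held in nested lists, the thresholds (pixel_threshold, min_changed_fraction) and function names are hypothetical, and an in-memory list stands in for the log file.

```python
from datetime import datetime

def motion_detected(reference, frame, pixel_threshold=25, min_changed_fraction=0.01):
    """Return True when enough pixels differ from the static reference frame."""
    changed = 0
    total = 0
    for ref_row, cur_row in zip(reference, frame):
        for ref_px, cur_px in zip(ref_row, cur_row):
            total += 1
            # A pixel counts as changed when its intensity moved past the threshold.
            if abs(cur_px - ref_px) > pixel_threshold:
                changed += 1
    return total > 0 and changed / total >= min_changed_fraction

def log_motion(log_lines, frame_index, when=None):
    """Append one timestamped entry; a list stands in for the log file here."""
    when = when or datetime.now()
    log_lines.append(f"{when.isoformat(timespec='seconds')} motion detected in frame {frame_index}")

# Hypothetical 8x8 frames: a static reference, a frame with slight sensor noise,
# and a frame in which a bright object has entered the scene.
reference = [[10] * 8 for _ in range(8)]
still = [[12] * 8 for _ in range(8)]
moving = [row[:] for row in reference]
for r in range(2, 5):
    for c in range(2, 5):
        moving[r][c] = 200

log = []
for index, frame in enumerate([still, moving], start=1):
    if motion_detected(reference, frame):
        log_motion(log, index)
print(log)
```

The per-pixel threshold keeps small lighting or sensor fluctuations from being reported as motion, and the minimum fraction avoids triggering on a handful of noisy pixels.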
Face detection
Once motion is detected and a frame is captured, the system detects faces in the image.
If a face is identified, it is cropped from the image.
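The cropping step can be sketched on its own, leaving the detector aside. The sketch assumes the detector returns one (x, y, w, h) box per face, which is the convention used by OpenCV's detectMultiScale; the function name crop_faces, the toy frame, and the boxes are hypothetical.

```python
def crop_faces(image, boxes):
    """Cut each (x, y, w, h) box out of the image, clamping boxes to the image bounds."""
    height, width = len(image), len(image[0])
    faces = []
    for x, y, w, h in boxes:
        x0, y0 = max(0, x), max(0, y)
        x1, y1 = min(width, x + w), min(height, y + h)
        if x1 > x0 and y1 > y0:              # skip boxes that fall entirely outside
            faces.append([row[x0:x1] for row in image[y0:y1]])
    return faces

# Hypothetical 5x6 grayscale frame whose pixel value encodes its position (row * 10 + column).
frame = [[r * 10 + c for c in range(6)] for r in range(5)]
# Hypothetical detector output; the second box partly leaves the frame.
boxes = [(1, 1, 3, 2), (4, 3, 5, 5)]
faces = crop_faces(frame, boxes)
print(faces)
```

Clamping matters in practice because detectors can report boxes that extend slightly past the frame edge.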
Gender and age detection
The face image produced by the face detection phase is fed to the trained CNN
model, which categorizes it by gender and age.
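How the raw outputs of such a model might be turned into categories is sketched below. Everything specific here is an assumption: the source does not list the classes, so GENDER_LABELS and AGE_LABELS are hypothetical, the score lists are made-up stand-ins for the model's output, and a softmax followed by picking the highest probability is one common way to decode a classifier, not necessarily the one used in the project.

```python
import math

# Hypothetical label sets; the source does not list the classes the model predicts.
GENDER_LABELS = ["male", "female"]
AGE_LABELS = ["0-12", "13-19", "20-39", "40-59", "60+"]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)                       # subtract the peak for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def decode(scores, labels):
    """Return the most probable label and its probability."""
    if len(scores) != len(labels):
        raise ValueError("one score per label is required")
    probs = softmax(scores)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs[best]

# Made-up raw outputs standing in for what the trained CNN would produce for one face.
gender_scores = [0.3, 2.1]
age_scores = [-1.0, 0.2, 3.0, 1.1, -0.5]
gender, gender_conf = decode(gender_scores, GENDER_LABELS)
age_group, age_conf = decode(age_scores, AGE_LABELS)
print(gender, round(gender_conf, 2), age_group, round(age_conf, 2))
```

Keeping the probability alongside the label makes it possible to discard low-confidence predictions before they are stored.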
Storing into database
Once gender and age detection is done and a result is produced, it is stored in a database.
A database is an organized collection of data stored and accessed electronically from a
computer system. Where databases are more complex, they are often developed using
formal design and modeling techniques.
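A minimal sketch of this step using Python's built-in sqlite3 module is given below. The table name detections, its columns, and the helper names are assumptions made for illustration, and an in-memory database is used so the example leaves no file behind.

```python
import sqlite3
from datetime import datetime

def open_database(path=":memory:"):
    """Open the database and create the (hypothetical) detections table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS detections (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               detected_at TEXT NOT NULL,
               gender TEXT NOT NULL,
               age_group TEXT NOT NULL
           )"""
    )
    return conn

def store_detection(conn, gender, age_group, detected_at=None):
    """Insert one prediction; the connection context manager commits or rolls back."""
    detected_at = detected_at or datetime.now().isoformat(timespec="seconds")
    with conn:
        conn.execute(
            "INSERT INTO detections (detected_at, gender, age_group) VALUES (?, ?, ?)",
            (detected_at, gender, age_group),
        )

conn = open_database()
store_detection(conn, "female", "20-39")
store_detection(conn, "male", "40-59")
rows = conn.execute("SELECT gender, age_group FROM detections ORDER BY id").fetchall()
print(rows)
```

Passing the values as query parameters (the ? placeholders) rather than formatting them into the SQL string keeps the insert safe against malformed input.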
Create visualizations
Using the stored information about the customers, this functionality creates
visualizations so that anyone, even a non-technical authority, can learn more about the
target audience.
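The aggregation behind such a visualization can be sketched as follows. The records, the function names, and the plain-text bar chart are illustrative assumptions; the text chart is only a dependency-free stand-in for a plotted bar chart built from the same counts.

```python
from collections import Counter

def summarize(records):
    """Count stored detections per gender and per age group."""
    by_gender = Counter(gender for gender, _ in records)
    by_age = Counter(age for _, age in records)
    return by_gender, by_age

def text_bar_chart(counts, width=20):
    """Render counts as horizontal bars scaled to the largest count."""
    if not counts:
        return ""
    largest = max(counts.values())
    lines = []
    for label, count in sorted(counts.items()):
        bar = "#" * max(1, round(width * count / largest))
        lines.append(f"{label:<8}{bar} {count}")
    return "\n".join(lines)

# Hypothetical (gender, age group) rows as they might be read back from the database.
records = [("female", "20-39"), ("male", "20-39"), ("female", "40-59"), ("female", "20-39")]
by_gender, by_age = summarize(records)
print(text_bar_chart(by_gender))
print(text_bar_chart(by_age))
```

Summaries of this kind (counts per group rather than individual rows) are what make the stored data readable for a non-technical audience.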
Non-Functional Requirements
Reliability
The approach used to build this project gives more reliable results. What makes the
project different is its use of deep learning instead of traditional machine learning.
Deep learning extracts features from an image automatically and learns high-level
features directly from the data. This removes the need for domain expertise and
hand-crafted feature extraction. Deep learning techniques tend to solve a problem end
to end, whereas machine learning (ML) needs the problem to be broken down into parts
that are solved one by one, with the results merged afterwards. The best thing is that a
CNN learns its filters automatically, without their being specified explicitly.
Real-Time
Real-time data (RTD) is information that is delivered immediately after collection. There is
no delay in the timeliness of the information provided. Real-time data is often used for
navigation or tracking. Such data is usually processed using real-time computing although
it can also be stored for later or off-line data analysis.
Easy to use
The whole system is automated, so no manual intervention is required. It carries out the
whole process on its own: it takes its input and processes it by itself.
Scalable
Python3
We will be using Python 3 as the programming language for the project. Python is a
general-purpose, interpreted, interactive, object-oriented, high-level programming
language. Guido van Rossum began developing it in the late 1980s and first released it
in 1991. Python source code is available under the Python Software Foundation License,
an open-source license that is compatible with the GNU General Public License (GPL).
Keras
We will be using Keras for designing the architecture of our model, which also provides
some helper functions to load, train, test, and evaluate the model.
Keras is an API designed for human beings, not machines. Keras follows best practices
for reducing cognitive load: it offers consistent & simple APIs, it minimizes the number of
user actions required for common use cases, and it provides clear & actionable error
messages. It also has extensive documentation and developer guides.
We are using the TensorFlow backend for Keras; TensorFlow v2 or above is recommended.
Numpy v1.75.0 or above and pandas v1.2 will be used for loading the annotations CSV file
and for cleaning and handling the data.
Matplotlib
Matplotlib v3.3 and seaborn will be used for displaying and plotting information about the
data, OpenCV v4 and Pillow v8 or above for working with images, and finally scikit-learn
(sklearn) 0.20 for creating the training and test split of the data.
SQLite3
PySQLite provides a standardized interface to the SQLite database that complies with
the Python DB-API 2.0 specification. Because drivers for other databases such as MySQL,
PostgreSQL, and Oracle follow the same specification, PySQLite is a good choice if the
application may later need to support those databases as well.
PySQLite has been part of the Python standard library, as the sqlite3 module, since
Python version 2.5.
Hardware used
Webcam:
The webcam is used to capture video; it is where the whole system begins its work.
A webcam is a video camera that feeds or streams an image or video in real time to or
through a computer network, such as the internet. Webcams are typically small cameras
that sit on a desk, attach to a user's monitor, or are built into the hardware. Webcams can
be used during a video chat session involving two or more people, with conversations that
include live audio and video.