
GENDER AND AGE DETECTION SYSTEM FOR CUSTOMER

MOVEMENT ANALYSIS

SOFTWARE REQUIREMENTS SPECIFICATION (SRS)

Prepared By
Madhav Goel 1806810167

Mansi Jain 1806810174

Nainsi Jain 1806810204

2021

Supervisor
DR. MUKESH RAWAT

MEERUT INSTITUTE OF ENGINEERING & TECHNOLOGY, MEERUT


Affiliated to
Dr. APJ ABDUL KALAM UNIVERSITY



All rights reserved. No part of this publication may be reprinted, reproduced, stored
in a retrieval system or transmitted, in any form or by any means, without the prior
permission in writing from the owners.
CONTENTS

1. INTRODUCTION

• PROBLEM STATEMENT
• WHY IS THE PARTICULAR TOPIC CHOSEN?
• PURPOSE OF SYSTEM REQUIREMENTS SPECIFICATION (SRS)
• SCOPE
• DEFINITIONS, ACRONYMS, AND ABBREVIATIONS
• REFERENCES
• DOCUMENT OVERVIEW

2. OVERALL DESCRIPTION

• MAJOR TASKS
• METHODOLOGY
   i. DEEP LEARNING V/S MACHINE LEARNING
   ii. CONVOLUTIONAL NEURAL NETWORK
• WHOLE PROCESS
• DATA FLOW DIAGRAM

3. SPECIFIC REQUIREMENTS

• FUNCTIONAL REQUIREMENTS
   i. MOTION DETECTION
   ii. FACE DETECTION
   iii. GENDER AND AGE DETECTION
   iv. STORING INTO DATABASE
   v. CREATE VISUALIZATION
• USE CASE
• NON-FUNCTIONAL REQUIREMENTS
   i. RELIABILITY
   ii. REAL-TIME
   iii. EASY TO USE
   iv. SCALABLE
• SOFTWARE AND HARDWARE USED
   i. PYTHON3
   ii. KERAS
   iii. MATPLOTLIB
   iv. SQLITE3
   v. HARDWARE USED

4. SUPPORTING INFORMATION

1 INTRODUCTION

Problem statement
Over the past years, many methods have been proposed to solve the age and
gender classification problem. Many of them are handcrafted and perform
unsatisfactorily on unconstrained, in-the-wild images. These conventional
hand-engineered methods rely on differences in the dimensions of facial
features and on face descriptors, which cannot handle the degree of
variation observed under such challenging, unconstrained imaging conditions.
In-the-wild images vary in appearance, noise, pose, and lighting, all of
which can reduce the ability of manually designed computer vision methods
to accurately classify the age and gender of the people in them.
Purpose of System Requirements Specification (SRS)

The purpose of this document is to present a detailed description of the
GENDER AND AGE DETECTION SYSTEM FOR CUSTOMER MOVEMENT ANALYSIS. It explains
the purpose and features of the system, its interfaces, what the system will
do, the constraints under which it must operate, and how it will react to
external stimuli. This document is intended for both the project managers and
the developers of the system and will be submitted to the college project
department for approval.

Scope

Facial analysis from images has attracted a lot of interest because it helps with
several different problems, such as better ad targeting for customers, better content
recommendation systems, and security surveillance.

Age and gender are very important facial attributes, and identifying them is the very
basis of facial analysis and a required step for such tasks. Many companies use these
kinds of tools for different purposes, making it easier for them to work with
customers, cater to their needs better, and create a great experience for them. It is
easier to identify and predict the needs of people based on their gender and age.

This project follows the same path but extends the functionality: it will not only
detect age and gender but also build a database from the results and present it in a
way that lets non-technical decision makers understand the current status and make
better decisions towards growth.
Definitions, Acronyms, and Abbreviations

S.No  TERMS                  DEFINITION

1.    Dataset                A collection of data pieces that can be treated by a computer
                             as a single unit for analytic and prediction purposes.

2.    Training Data          The dataset used to train an algorithm or model so that it can
                             accurately predict the outcome.

3.    Test Data              A dataset that is independent of the training dataset but
                             follows the same probability distribution as the training
                             dataset.

4.    Visualization          The representation of data or information in a graph, chart,
                             or other visual format.

5.    Database               An organized collection of structured information, or data,
                             typically stored electronically in a computer system.

6.    Programming Language   An artificial language that can be used to control the
                             behaviour of a machine, particularly a computer.

7.    Deep Learning          A subset of machine learning that uses neural networks with
                             many layers to learn features directly from data.
References

1) https://techvidvan.com/tutorials/gender-age-detection-ml-keras-opencv-cnn/, [Author: Pankaj
Bhardwaj, Date: 27/06/2019], [Accessed on: 01/10/2021]

2) https://www.analyticsvidhya.com/blog/2020/02/cnn-vs-rnn-vs-mlp-analyzing-3-types-of-neural-networks-in-deep-learning,
[Author: Arvind Pillai, Date: 17/02/2020], [Accessed on: 04/10/2021]

3) Age and Gender Classification using Convolutional Neural Networks, Gil Levi and Tal
Hassner, 2015, [Accessed on: 14/09/2021]

4) Grokking Deep Learning, Andrew W. Trask, 23/01/2019, [Accessed on: 11/09/2021]

5) Introduction to Machine Learning with Python, Andreas C. Müller and Sarah Guido, 2016,
[Accessed on: 11/09/2021]

Document Overview

The document is organized in the following structure:

Chapter 1 - Introduction - This section gives an overview of the SRS, the scope
covered, and the acronyms used throughout this document.

Chapter 2 - Overall Description - This section describes the high-level
requirements that provide a mechanism for users to describe their expectations
of the system.

Chapter 3 - Specific Requirements - This section identifies the functional and
non-functional requirements of the system.

Chapter 4 - Supporting Information - This section describes all supporting
information that makes SRS easy to
2 OVERALL DESCRIPTION

Our final project will perform four major tasks:

• Real-time face detection.
• Classification on the basis of gender and age.
• Create an object and add it to a database.
• Use that data to create visualisations that will eventually help to make decisions.

Methodology

We have used deep learning instead of traditional machine learning.

Deep learning extracts features from an image automatically and tries to learn high-level
features directly from the data. This removes the need for domain expertise and
hand-crafted feature extraction. Deep learning techniques also tend to solve a problem
end to end, whereas traditional ML requires the problem to be broken down into parts that
are solved one by one, with the results merged afterwards. Best of all, a CNN learns its
filters automatically, without their being specified explicitly.
Convolutional neural network

There are many methods we can use to solve this problem. Traditional algorithms such as
"Fisherfaces" and "Eigenfaces", which were created for face recognition, and feature
relation methods exist, but they do not work as well as needed. We can create better
solutions using a CNN (convolutional neural network), a deep learning model that has
emerged as the most preferred model for computer vision tasks. CNNs have proven to be
highly effective when dealing with image datasets and are at the heart of most machine
learning computer vision models.

A Convolutional Neural Network is a deep neural network (DNN) widely used for image
recognition and processing and for natural language processing (NLP). Also known as a
ConvNet, a CNN has input and output layers and multiple hidden layers, many of which are
convolutional. In a way, CNNs are regularized multilayer perceptrons.
A CNN is a multilayered architecture. Three types of layers make up a CNN:

CONVOLUTIONAL LAYER

POOLING LAYER

FULLY CONNECTED LAYER

Convolutional layer –
This is the first layer of a CNN. It transforms the input image in order to extract
various features from it.

Pooling layer –
The pooling layer comes after the convolutional layer. It decreases the size of the
feature maps, which reduces the number of parameters to learn and the amount of
computation performed in the network.

Fully connected layer –
These are the layers in which every input from one layer is connected to every
activation unit of the next layer.
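
The three layer types can be sketched in a few lines of plain Python. This is a minimal illustration and not the project's Keras model: the function names, the toy image, the kernel, and the weights are all assumptions chosen for the example.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Keep the largest value of each non-overlapping size x size block."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def fully_connected(inputs, weights, biases):
    """Connect every input to every output unit (weighted sum plus bias)."""
    return [
        sum(x * w for x, w in zip(inputs, unit_weights)) + b
        for unit_weights, b in zip(weights, biases)
    ]


# Toy 5x5 grayscale image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical hand-written filter that responds to dark-to-bright edges.
# In a real CNN these values are learned during training.
kernel = [[-1, 1], [-1, 1]]

features = conv2d(image, kernel)          # 4x4 feature map
pooled = max_pool2d(features)             # 2x2 after pooling
flat = [value for row in pooled for value in row]
scores = fully_connected(
    flat,
    weights=[[0.5, 0, 0.5, 0], [0, 1, 0, 1]],  # two output units, made-up weights
    biases=[0.0, 1.0],
)
print(features[0], pooled, scores)
```

The feature map is non-zero only where the edge lies, pooling halves each dimension, and the fully connected layer turns the flattened result into one score per output unit.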
The Whole Process

• Capture frames from a video source.

• Identify a face in the frame.

• Feed the identified face image to a previously trained CNN classification model,
which performs the classification.

• Create an object from the classification result and store it in a database.

• Create visualizations from that data.

• Use the visualizations to draw insights that will eventually help in making better
decisions.
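
The steps above can be sketched as a small driver loop. This is only an outline under stated assumptions: `run_pipeline` and the three stand-in functions are hypothetical names, and the real system would plug in a camera source, a face detector, the trained CNN, and the database writer in their place.

```python
def run_pipeline(frames, detect_face, classify, store):
    """Pass each frame through detection, classification, and storage.

    Returns the number of detections that were stored.
    """
    stored = 0
    for frame in frames:
        face = detect_face(frame)
        if face is None:
            continue  # no face in this frame, so nothing to classify
        gender, age_group = classify(face)
        store(gender, age_group)
        stored += 1
    return stored


# Stand-ins so the sketch runs without a camera or a trained model.
def fake_detect_face(frame):
    return frame if frame.startswith("face:") else None


def fake_classify(face):
    return ("Female", "25-32")  # a real classifier would use the CNN here


records = []
frames = ["face:customer-1", "empty", "face:customer-2"]
stored = run_pipeline(
    frames,
    fake_detect_face,
    fake_classify,
    lambda gender, age_group: records.append((gender, age_group)),
)
print(stored, records)
```

Passing the stages in as functions keeps the loop independent of any particular detector, model, or database.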
Data Flow Diagram
3 SPECIFIC REQUIREMENTS

Functional requirements

Motion detection
A video can be treated as a stack of pictures called frames. The system compares
each frame to the first frame, which should be static (no movement initially). Two
images are compared by comparing the intensity value of each pixel.

Once motion is detected in a frame, the system adds an entry to the log file.
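
The comparison described above can be sketched as simple frame differencing. The thresholds, the function names, and the in-memory list standing in for the log file are assumptions made for this illustration. Frames are represented as nested lists of grayscale intensities (0 to 255).

```python
def frame_difference(reference, frame):
    """Absolute per-pixel intensity difference between two grayscale frames."""
    return [
        [abs(a - b) for a, b in zip(ref_row, row)]
        for ref_row, row in zip(reference, frame)
    ]


def motion_detected(reference, frame, pixel_threshold=30, min_changed_pixels=2):
    """Report motion when enough pixels differ strongly from the reference.

    Both thresholds are illustrative values, not tuned settings.
    """
    diff = frame_difference(reference, frame)
    changed = sum(1 for row in diff for value in row if value > pixel_threshold)
    return changed >= min_changed_pixels


reference = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]   # static first frame
noisy = [[12, 10, 11], [10, 9, 10], [10, 10, 13]]        # sensor noise only
moved = [[10, 200, 200], [10, 10, 10], [10, 10, 10]]     # something entered

motion_log = []  # stands in for the log file
for index, frame in enumerate([noisy, moved], start=1):
    if motion_detected(reference, frame):
        motion_log.append(f"motion detected in frame {index}")
print(motion_log)
```

The pixel threshold keeps small lighting and sensor fluctuations from being reported as movement.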

Face detection
Once motion is detected and a frame is captured, the system will try to detect a face in
the image. If a face is identified, it is cropped from the image.
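
The cropping step can be sketched as follows. We assume here that the face detector reports a bounding box as (x, y, width, height), which is the form returned by OpenCV's cascade detectors. The detector itself is not shown, and `crop_face` is a hypothetical helper name.

```python
def crop_face(image, box):
    """Return the part of the image inside box = (x, y, width, height)."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


# Toy 4x4 image whose pixel value encodes its position (row * 10 + column).
image = [[r * 10 + c for c in range(4)] for r in range(4)]
face = crop_face(image, (1, 1, 2, 2))
print(face)
```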
Gender and age detection
The face image produced by the face detection phase is fed to the trained CNN
model, which categorizes it by gender and age.
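
A classification model typically outputs one probability per class, which then has to be mapped back to a label. The sketch below shows that decoding step only. The label order and the age groups are assumptions for illustration and must match whatever the trained model was actually trained on.

```python
# Hypothetical label sets; the real ones depend on the training data.
GENDER_LABELS = ["Male", "Female"]
AGE_GROUPS = ["0-2", "4-6", "8-12", "15-20", "25-32", "38-43", "48-53", "60-100"]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def decode_prediction(gender_probs, age_probs):
    """Turn the model's probability vectors into readable labels."""
    return GENDER_LABELS[argmax(gender_probs)], AGE_GROUPS[argmax(age_probs)]


# Made-up model outputs for a single face.
gender_probs = [0.2, 0.8]
age_probs = [0.01, 0.02, 0.05, 0.10, 0.60, 0.15, 0.05, 0.02]
prediction = decode_prediction(gender_probs, age_probs)
print(prediction)
```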

Storing into database

Once gender and age detection is done and a result has been created, the result is stored
in a database. A database is an organized collection of data stored and accessed
electronically from a computer system. Where databases are more complex, they are often
developed using formal design and modeling techniques.
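
Storing a result could look like the sketch below, which uses Python's built-in `sqlite3` module. The table name, the column names, and the helper functions are assumptions, since the schema is not fixed in this document. An in-memory database is used so that the example leaves no file behind.

```python
import sqlite3
from datetime import datetime


def create_table(conn):
    """Create the (hypothetical) detections table if it does not exist."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS detections (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               detected_at TEXT NOT NULL,
               gender TEXT NOT NULL,
               age_group TEXT NOT NULL
           )"""
    )


def store_detection(conn, gender, age_group, detected_at=None):
    """Insert one classification result, timestamped now unless given."""
    detected_at = detected_at or datetime.now().isoformat(timespec="seconds")
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO detections (detected_at, gender, age_group) "
            "VALUES (?, ?, ?)",
            (detected_at, gender, age_group),
        )


conn = sqlite3.connect(":memory:")
create_table(conn)
store_detection(conn, "Female", "25-32", "2021-10-01T10:15:00")
store_detection(conn, "Male", "38-43", "2021-10-01T10:16:30")
rows = conn.execute(
    "SELECT gender, age_group FROM detections ORDER BY id"
).fetchall()
print(rows)
```

The `?` placeholders let SQLite handle quoting, so values are never pasted into the SQL string by hand.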
Create visualizations

Using the stored information about the customers, this functionality will create
visualizations so that anyone, even a non-technical decision maker, can learn more about
the target audience.
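
Before anything can be plotted, the stored records have to be aggregated. The sketch below counts records per category and renders a text bar chart as a stand-in for a real plot. The record layout and function names are assumptions, and in the actual system the counts would be handed to a plotting library instead.

```python
from collections import Counter


def count_by(records, key_index):
    """Count how many records fall into each category of one field."""
    return Counter(record[key_index] for record in records)


def text_bar_chart(counts):
    """Render the counts as one '#' bar per category, sorted by label."""
    return "\n".join(
        f"{label:<8} {'#' * count} ({count})"
        for label, count in sorted(counts.items())
    )


# Made-up (gender, age_group) records as they might come from the database.
records = [
    ("Female", "25-32"),
    ("Male", "25-32"),
    ("Female", "15-20"),
    ("Female", "25-32"),
]
gender_counts = count_by(records, 0)
age_counts = count_by(records, 1)
chart = text_bar_chart(gender_counts)
print(chart)
```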
Non-Functional Requirements

Reliability

The approach we have used to build this project gives much more reliable results. What
makes our project different is the use of deep learning instead of traditional machine
learning.

Deep learning extracts features from an image automatically and tries to learn high-level
features directly from the data. This removes the need for domain expertise and
hand-crafted feature extraction. Deep learning techniques also tend to solve a problem
end to end, whereas traditional ML requires the problem to be broken down into parts that
are solved one by one, with the results merged afterwards. Best of all, a CNN learns its
filters automatically, without their being specified explicitly.
Real-Time

Our system will provide real-time results.

Real-time data (RTD) is information that is delivered immediately after collection, with
no delay in the timeliness of the information provided. Real-time data is often used for
navigation or tracking. Such data is usually processed using real-time computing, although
it can also be stored for later or offline analysis.
Easy to use

The whole system is automated, so no manual intervention is required. It runs the whole
process on its own: it acquires its input and processes it by itself.
Scalable

As its main function is to provide visualizations, the system can be used for many
business-related purposes, such as making better decisions and understanding the current
status of customers. It can also be connected to many other applications.
Software and tools used

Python3

We will be using Python 3 as the programming language for the project. Python is a
general-purpose, interpreted, interactive, object-oriented, high-level programming
language. It was created by Guido van Rossum and first released in 1991. Python's source
code is available under the Python Software Foundation License, an open-source license
that is compatible with the GNU General Public License (GPL).

Keras

We will be using Keras for designing the architecture of our model, which also provides
some helper functions to load, train, test, and evaluate the model.
Keras is an API designed for human beings, not machines. Keras follows best practices
for reducing cognitive load: it offers consistent & simple APIs, it minimizes the number of
user actions required for common use cases, and it provides clear & actionable error
messages. It also has extensive documentation and developer guides.

We are using the TensorFlow backend for Keras; TensorFlow v2 or above is recommended.
NumPy v1.75.0 or above and pandas v1.2 will be used for loading the annotations CSV file
and for cleaning and handling the data.

Matplotlib

Matplotlib v3.3 and seaborn will be used for displaying and plotting information about the
data, OpenCV v4 and Pillow v8 or above for working with images, and finally scikit-learn
(sklearn) 0.20 for creating the training and test split of the data.
SQLite3
PySQLite provides a standardized, Python DB-API 2.0 compliant interface to the SQLite
database. Because the interface follows the DB-API standard, it is a good choice if your
application needs to support not only SQLite but also other databases such as MySQL,
PostgreSQL, and Oracle.
PySQLite has been part of the Python standard library, as the sqlite3 module, since
Python version 2.5.

Hardware used

Webcam:

A webcam is used to capture video; it is where the whole system begins its work.
A webcam is a video camera that feeds or streams an image or video in real time to or
through a computer network, such as the internet. Webcams are typically small cameras
that sit on a desk, attach to a user's monitor, or are built into the hardware. Webcams
can be used during a video chat session involving two or more people, with conversations
that include live audio and video.

Google Colab GPU:

Google Colaboratory is an online, cloud-based Jupyter notebook environment that allows us
to train our machine learning and deep learning models on CPUs, GPUs, and TPUs.
It does not matter which computer you have, what its configuration is, or how old it
might be: Google Colab only requires a Google account and a web browser. It also provides
free access to GPUs such as the Tesla K80, and even to a TPU.
