
COMMERCIAL SURVEILLANCE

SYSTEM USING COMPUTER VISION

Report submitted in partial fulfillment of the requirement of the degree


of

Bachelor of
Technology In
Computer Science & Engineering

By
Ananyaa Bansal
(01315002718)
To

MAHARAJA SURAJMAL INSTITUTE OF TECHNOLOGY


Affiliated to Guru Gobind Singh Indraprastha
University C-4, Janakpuri, New Delhi-58

2018-2022

CANDIDATE'S CERTIFICATE

I, Ananyaa Bansal, Roll Number 01315002718, B. Tech (Semester-5) of the Maharaja


Surajmal Institute of Technology, New Delhi hereby declare that the report entitled
"COMMERCIAL SURVEILLANCE SYSTEM USING COMPUTER VISION AND
PYTHON” is an original work and data provided in the project is authentic to the best
of my knowledge. This report has not been submitted to any other institution for the
award of any other degree in this university or any other University of India or
abroad. Due acknowledgement has been made in the text to all the material used.

ANANYAA BANSAL
(01315002718)
Date: 27/09/2020

CERTIFICATE BY ORGANISATION

ACKNOWLEDGEMENT

First and foremost, I would like to thank God Almighty for making us capable and
fortunate enough that we have been given a chance to pursue an innovative and
knowledgeable course like B.Tech. from one of the most reputed universities in our
country.
I am very grateful to our director Dr. K.P. Chaudhary for giving us the fruitful
opportunity of working on this project and to the other faculty members, related to the
project, for providing their valuable time and attention to help us in the project.
I am also thankful to Dr. Sapna Malik for her most valuable and significant guidance
in our understanding of the project and in elaborating our view of studying the
project’s details.
I would also like to thank our Head of Department, Dr. Koyel Datta Gupta, H.O.D.
(Computer Science Department), Maharaja Surajmal Institute of Technology, who
provided a wide vision of the scope of the project.
At last, I would like to thank all the faculty members and lab in-charges who were
directly or indirectly involved in the development of this project by providing much-
needed support and cooperation.
ABSTRACT

The project comprises an integrated camera setup with a distinct role for each of
the 4 cameras or input streams, implementing Computer Vision and other
technologies to devise a useful General Maintenance system for any Commercial
institution.
This report presents the six tasks completed while making this project, drawing on
(but not restricted to) the knowledge gained from the online training done at
Coursera. The tasks are listed below:
1) Collecting and/or shooting sample images and videos to be used for various
parts and features of the project.
2) Using these elements to perform detection of various kinds from Camera 1
through Camera 3 including Face, Mouth, Car, Car Plate and Person.
3) Using live date and time from the system to ascertain the entry time for
employees as part of the Attendance System which is part of Camera 2.
4) Writing the derived results into image as well as csv files in the working
directory of the project for further operation.
5) Designing a Motion Detector for Camera 4 which detects the slightest motion,
and an Alarm system which sounds a coded message on being triggered by
motion.
6) The duration and starting and ending times of the intrusion are recorded in a
csv file and later made into a visual graph for analysis.

OpenCV has been used as the primary Computer Vision library to perform most of
the detections in the project.
Recognition of various entities is left to the future scope of this project, where
different Machine Learning models can be trained on datasets and improved for
accuracy (for example, a CNN for face recognition).
Data Visualization has been used for plotting a graph from a csv file.
Image slicing and extraction has been done to give easier view of the data.
All the tasks have been completed successfully, and the finished project runs with
no syntactical errors.
TABLE OF CONTENTS

Content Page No.


Title I
Candidate’s Certificate II
Certificate by Organization III
Acknowledgement IV
Abstract V
Table Of Contents VI

Chapter 1: Introduction 1
1.1 Needs 1
1.2 Objectives 2
1.3 Methodology 3
1.4 Software Used 4
1.4.1 Software Required-Anaconda 3 4
1.4.2 Jupyter Notebook 5
1.4.3 Python 3 5
1.5 Hardware Used 6
1.6 About Organization 6

Chapter 2: Project Design 7


2.1 Concepts Used 7
2.1.1 Python for Data Science 7
2.1.2 Introduction to Computer Vision 8
2.1.3 Types of Computer Vision 9
2.1.4 How Computer Vision Works 10
2.1.5 Analysis of a Visual 10
2.1.6 OpenCV for Computer Vision 13
2.1.7 Using Haar Classifiers 14
2.2 Block Diagram 20

Chapter 3: Implementation 21
3.1 Major Libraries and Functions used 21
3.2 Implementation Blocks 26
3.2.1 CAMERA 1 26
3.2.2 CAMERA 2 28
3.2.3 CAMERA 3 31
3.2.4 CAMERA 4 32

Chapter 4: Results and Discussion 35


4.1 Outputs 35
4.1.1 Block 1 35
4.1.2 Block 2 38
4.1.3 Block 3 40
4.1.4 Block 4 41

Chapter 5: Future Scope & Conclusion 42


5.1 Future Scope 42
5.2 Conclusion 43

References 44

List of Figures

Fig No. Name Page No.

Fig 1 Pixelated grayscale image 11


Fig 2 Luminance values to image 11
Fig 3 Matrix of pixel values 11
Fig 4 Pixels stored in memory 12
Fig 5 Edge features 15
Fig 6 Line features 15
Fig 7 Four Rectangle features 15
Fig 8 Haar features represented numerically 15
Fig 9 Sample b/w image 16
Fig 10 Haar features applied to photo 16
Fig 11 0 and 1 ideal values 16
Fig 12 Normalised greyscale values for real cases 16
Fig 13 Formula for Delta Calculation 17
Fig 14 Input image and Integral Image 17
Fig 15 Using Classifier Cascading for example of face 19
Fig 16 Feature selection for Vehicle detection 19
Fig 17 Flowchart for capturing videos using OpenCV 29
Fig 18 Block Diagram for Motion Detection 32
Fig 19 Test 1 for Traffic Deduction 35
Fig 20 Test 2 for Traffic Deduction 35
Fig 21 Test 1 for Car Plate Record 36
Fig 22 Extracted car plate for Test 1 Car Plate Record 36
Fig 23 Test 2 for Car Plate Record 37
Fig 24 Extracted car plate for Test 2 Car Plate Record 37
Fig 25 Test 1 for Face Attendance Record 38
Fig 26 Test 2 for Face Attendance Record 38
Fig 27 Face Mask detection Camera 2-Wearing Mask 39
Fig 28 Face Mask detection Camera 2-Not Wearing Mask 39
Fig 29 Test 1 for Intrusion Detection 40
Fig 30 Window snippet for frame of Intrusion Detection 40
Fig 31 CSV file created for live Motion Detection 41
Fig 32 Observed graph for live Motion Detection 41

List of Tables

Table No. Name Page No.

Table 1 Major libraries and their classes, functions used 21


Chapter 1: Introduction

Needs and Objectives:-

1.1 Needs:

1. The need for this project arises from the fact that the COVID-19 pandemic has caused
institutions across the globe to resort to contactless services involving minimal
manpower.
Social distancing norms have been put in place at most locations, which has greatly
expedited the need for automated systems.
2. This project, which is a comprehensive integration of various features, aims to
achieve minimum human contact while carrying out the services of an institution
smoothly and safely.
3. These times call for facial masks to protect against the virus.
Hence, one of the features of the project determines whether a person will be
allowed inside the institution on the basis of whether he/she is wearing a mask.
4. The popular Biometric System for attendance, although very advanced and
accurate, poses a high risk of virus contraction through touch.
So, another feature of the project focuses on Face detection and consequent
Attendance record to be analyzed by the designated employee.
5. Needless to say, Crime rate in Delhi is at an all-time high. In such times, a system
for Intrusion Detection becomes imperative to institutions, especially those that
house valuable resources such as Banks or other Expensive Stores.
6. How the sales or market of a particular institution grows or thrives depends a lot
on the time at which people visit the place.
This can be gauged by the Traffic situation in front of the said institution at
different times of the day.
So a system that approximates the vehicle numbers at various hours can prove to
be useful.

7. Automated record keeping of vehicle plate numbers aids the parking system and
minimizes human contact through automation.
8. So this project becomes useful for any institution looking to make their
compound less vulnerable to COVID-19 as well as to unexplained intrusions such
as thefts or break-ins at times when no one is physically guarding the premises.

1.2 Objectives:

1. Collection and storage of data in forms of images and videos from the web or
by manual picture clicking and video shooting so as to test the program before
using it for commercial purposes.

2. Using Cascade Classifiers to load the various Haarcascade [1] xml files which
have been pre-trained on both positive and negative images.

3. Detecting the various elements in streams for the first part of the project.

4. Saving the processed and edited versions of these files in the working
directory for further operation and analysis.

5. Finding the exact frame where detection has been performed and keeping that
in the directory, thus rejecting others.

6. For the second part of the project, design and implement the program for
Motion Detection using a live video stream.

7. Designing a customized Alarm System for motion detection.

8. Saving a csv file of the duration of intrusion into the working directory.

9. Make a graph of the same for analysis.


1.3 Methodology

The methodology used to make this project and accomplish all the objectives is as
follows:

 Phase -1 : Initiation
The initiation of a project or a system begins when a need or a problem is
identified. In this project, the problem identified was a real-life one: providing a
largely automated management and surveillance system for Commercial setups in
this pandemic situation.

 Phase -2 : Data Collection


In this phase, data was collected as soon as the problem was identified. The
data then needed to be stored and manipulated according to the tasks.

 Phase -3 : Framework Designing


The physical characteristics of the system are specified during this phase and a
detailed design is prepared. The requirements and functionalities needed by the
user are laid out, and the relevant concepts are researched and applied.

 Phase -4 : Development
In this phase the detailed specifications produced during the design phase are
translated into hardware, communication, and executable software. The code of
the project is prepared according to the framework designed in the previous
phase.

 Phase -5 : Testing
In this phase the software produced during the development phase is tested and all
problems are noted down. The analysis showed that the system fulfilled the demands
in most cases.
1.4 Software Used

Software Used:-
 For Programming - Jupyter Notebook[2]
 Software Required - Anaconda 3[3]
 Language Used - Python[4]

1.4.1 Software Required – Anaconda 3

Anaconda is a free and open-source distribution of the programming languages
Python and R for scientific computing that aims to simplify package management
and deployment. It is used for:
 Data Science,
 Machine Learning applications,
 large-scale Data Processing and Pre-processing,
 Predictive Analytics.
The distribution includes data-science packages suitable for all of the 3 platforms:
1. Windows, 2. Linux, 3. macOS
It is developed and maintained by Anaconda, Inc. As an Anaconda, Inc. product, it is
also known as Anaconda Distribution or Anaconda Individual Edition.
Package versions in Anaconda are managed by Package Management System conda:
Conda is an open source, cross-platform, language-agnostic package manager and
environment management system that installs, runs, and updates packages and their
dependencies.
Example of a conda command for installing the library ‘Pandas’:
(base) C:\Users\91997>conda install pandas
Anaconda supports Python 2.7, 3.4, 3.5 and 3.6. The default is Python 2.7 or 3.6,
depending on which installer you used:

 For the installers “Anaconda” and “Miniconda,” the default is 2.7.


 For the installers “Anaconda3” or “Miniconda3,” the default is 3.6.
For the purpose of this project, we have used Anaconda 3.
1.4.2 Programming Platform – Jupyter Notebook

The Jupyter Notebook is an open source web application that you can use to create
and share documents that contain live code, equations, visualizations, and text.
Jupyter Notebook is maintained by the people at Project Jupyter.
Anaconda comes with many scientific libraries preinstalled, including the Jupyter
Notebook, so once Anaconda is installed, no other measures need to be taken.
The Jupyter Notebook is an interactive computing environment that enables users
to author notebook documents that include live code, interactive widgets, Plots,
Narrative text, Equations, Images and Video.

The Jupyter Notebook combines three components:

1. The notebook web application: An interactive web application for writing


and running code interactively and authoring notebook documents.
2. Kernels: Separate processes started by the notebook that run users’ code in a
given language and return output back to the notebook web application.
3. Notebook documents: Self-contained documents that contain a representation
of all content visible in the notebook web application. Each notebook
document has its own kernel.

1.4.3 Language Used – Python

Version of Python used in project: Python 3.7.6


Python is a high-level, interpreted, interactive and object-oriented scripting language.
Python is designed to be highly readable.

Following are important characteristics of Python Programming:

1. It supports functional and structured programming methods as well as OOP.


2. It can be used as a scripting language or can be compiled to byte-code for
building large applications.
3. It provides very high-level dynamic data types and supports dynamic checking.
4. It can be easily integrated with C, C++, COM, ActiveX, CORBA, and Java.
New and advanced Python 3.7 has the following new features:

1. Easier access to debuggers through a new breakpoint() built-in


2. Simple class creation using data classes
3. Customized access to module attributes
4. Improved support for type hinting
5. Higher precision timing functions
6. More importantly, Python 3.7 is fast.

1.5 Hardware Used

Hardware Used

 Solid State Disk : 256 GB


 RAM : 8 GB
 OS : Windows 10

Hardware Required

 Hard disk : 100 MB


 Ram : 512 MB
 OS : Platform Independent

1.6 About the Organisation

Coursera[5] is an online education provider that offers online courses, popularly


known as MOOCs or Massive Open Online Courses, from top universities around the
world. Currently it has over 200 partners from 48 countries.

Coursera works with universities to offer online courses, certifications, and degrees in
a variety of subjects, such as engineering, data science, machine
learning, mathematics, business, finance, computer science, digital
marketing, humanities, medicine, biology and social sciences, with more than 3,000
courses giving students a very broad range of information and experience in different
fields. Paid courses provide additional quizzes and projects as well as a shareable
Course Certificate upon completion, and they are 100% online.
Chapter 2: Project Design

2.1 Concepts Used


2.1.1 Python for Data Science

According to recent studies, Python is the preferred programming language for data
scientists. They need an easy-to-use language that has decent library availability and
great community participation. Data science involves extrapolating useful
information from massive stores of statistics, registers, and data. These data are
usually unsorted and difficult to correlate with any meaningful accuracy.
The 4 steps involved in the Data Science problem-solving process[6] and Python
packages for data mining are:

1. Data collection & cleansing

With Python, one can deal with almost all sorts of data that are available in different
formats such as CSV, TSV, or JSON sourced from the web.
Whether you want to import SQL tables directly into your code or need to scrape any
website, Python helps you achieve these tasks easily with its dedicated libraries. After
extracting and replacing values, you would also need to take care of missing data sets
during the data cleansing phase and replace non-values accordingly.
Libraries provided in Python for this: PyMySQL, BeautifulSoup, Scrapy.

2. Data exploration

Once the data is collected and tidied up, it is ensured that it is standardised across all
the data collected. Data is explored to identify its properties and segregate it into
different types such as numerical, ordinal, nominal, categorical, etc., in order to
provide it the required treatments.
Once data is categorised as per its types, the data analysis Python libraries are used to
unleash insights from the data by allowing you to manipulate it easily and efficiently.
The libraries provided in Python for this are: NumPy, Pandas, Dabl, Missingno.
3. Data modeling
Data modeling is the process of producing a descriptive diagram of relationships
between various types of information that are to be stored in a database.
The ability to think clearly and systematically about the key data points to be stored
and retrieved, and how they should be grouped and related, is what the data modeling
component of data science is all about. The libraries provided in Python are: SciKit-
Learn, SciPy, Keras, TensorFlow, XGBoost, PyTorch.

4. Data visualization & interpretation

Data visualisation is the graphical representation of information and data. By using


visual elements like charts, graphs and maps, data visualisation tools provide an
accessible way to see and understand trends, outliers and patterns in data.
In the world of big data, data visualisation tools and technologies are essential for
analysing massive amounts of information and making data-driven decisions.
Libraries provided in Python for this are: Matplotlib, Seaborn, Bokeh, Plotly, Pydot.

2.1.2 Introduction to Computer Vision

Computer vision is a field of Artificial Intelligence that trains computers to interpret


and understand the visual world.
Today, a number of factors have converged to bring about a renaissance in
computer vision:
1. Mobiles with built-in cameras have saturated the world with photos and videos.
2. Computing power has become more affordable and easily accessible.
3. Hardware designed for computer vision and analysis is more widely available.
4. New algorithms like CNN can take advantage of the hardware and software.

Its goal is to extract meaning from pixels.


From the engineering point of view, computer vision aims to build autonomous
systems which could perform some of the tasks which the human visual system can
perform. Automated computer vision is not yet reliable enough to detect anomalies
or track objects. Hence, humans are still in the analysis loop to assess the situation.
Challenges to Computer Vision:
 Privacy and Ethics - Vision powered surveillance systems pose a great risk
on privacy and ethics by recording and monitoring visuals.
 Lack of explainability — Modern neural network based algorithms are still a
black box to some extent. So when a model classifies an image as something,
we don’t actually know what caused it to do so.
 Deep Fakes - Using deep learning, now anybody with a powerful GPU and
training data can create a believable fake image or videos with DeepFakes.
 Adversarial attacks - Adversarial examples are inputs to machine learning
models that an attacker has intentionally designed to cause the model to make
a mistake.

2.1.3 Types of Computer Vision[7]

There are many types of Computer Vision that are used in different ways:

1. Image segmentation – It partitions an image into multiple regions or pieces to


be examined separately.
2. Object detection – It identifies a specific object in an image. Advanced object
detection recognizes many objects in a single image. These models use an (X,
Y) coordinate to create a bounding box and identify everything inside the box.
3. Facial recognition – It is an advanced type of object detection that not only
recognizes a human face in an image, but identifies a specific individual.
4. Edge detection - It is a technique used to identify the outside edge of an
object or landscape to better identify what is in the image.
5. Pattern detection – It is a process of recognizing repeated shapes, colours and
other visual indicators in images.
6. Image classification – It groups images into different categories.
7. Feature matching – It is a type of pattern detection that matches similarities
in images to help classify them.

Simple applications of computer vision may only use one of these techniques, but
more advanced uses, like computer vision for self-driving cars, rely on multiple
techniques to accomplish their goal.
2.1.4 How Computer Vision Works

Since it is not settled how the brain and eyes process images biologically, it’s
difficult to say how well the algorithms used in production approximate our own
internal mental processes.
On a certain level, Computer vision is all about pattern recognition. So one way to
train a computer how to understand visual data is to feed it thousands of labelled
images, and then subject those to various software techniques, or algorithms, that
allow the computer to hunt down patterns in all the elements that relate to those
labels.

For example,
If you feed a computer a million images of, say, aeroplanes, it will subject them all to
algorithms that analyse the colours, the shapes, the distances between the shapes,
where objects border each other, and so on, so that it identifies a profile of what
‘aeroplane’ means. When it’s done, the computer will (in theory) be able to use its
experience if fed other unlabelled images to find the ones that are of an aeroplane.
Instead of training computers to look for nose, tail, wings and lights to recognize an
aeroplane, programmers upload millions of photos of planes, and then the model
learns on its own the different features that make up an aeroplane.

Computer vision works in three basic steps:


1. Acquiring an image - Images, even large sets, can be acquired in real-time
through video, photos or 3D technology for analysis.
2. Processing the image - Deep learning models automate much of this process,
but the models are often trained by first being fed thousands of labeled images.
3. Understanding the image - The final step is the interpretative step, where an
object is identified or classified.

2.1.5 Analysis of a Visual:


Below is an illustration of the grayscale image buffer which stores an image of
Abraham Lincoln.[8] Each pixel’s brightness is represented by a single 8-bit number,
whose range is from 0 (representing black) to 255 (representing white):
Fig 1. Pixelated grayscale image Fig 2. Adding luminance values to each pixel

Fig 3. Creating a matrix out of all pixel values

The data from the image above is stored in a manner similar to this long list of
unsigned chars. Pixel values are almost universally stored, at the hardware level, in a
one- dimensional array:
{157, 153, 174, 168, 150, 152, 129, 151, 172, 161, 155, 156,
155, 182, 163, 74, 75, 62, 33, 17, 110, 210, 180, 154,
180, 180, 50, 14, 34, 6, 10, 33, 48, 106, 159, 181,
206, 109, 5, 124, 131, 111, 120, 204, 166, 15, 56, 180,
194, 68, 137, 251, 237, 239, 239, 228, 227, 87, 71, 201,
172, 105, 207, 233, 233, 214, 220, 239, 228, 98, 74, 206,
188, 88, 179, 209, 185, 215, 211, 158, 139, 75, 20, 169,
189, 97, 165, 84, 10, 168, 134, 11, 31, 62, 22, 148,
199, 168, 191, 193, 158, 227, 178, 143, 182, 106, 36, 190,
205, 174, 155, 252, 236, 231, 149, 178, 228, 43, 95, 234,
190, 216, 116, 149, 236, 187, 86, 150, 79, 38, 218, 241,
190, 224, 147, 108, 227, 210, 127, 102, 36, 101, 255, 224,
190, 214, 173, 66, 103, 143, 96, 50, 2, 109, 249, 215,
187, 196, 235, 75, 1, 81, 47, 0, 6, 217, 255, 211,
183, 202, 237, 145, 0, 0, 12, 108, 200, 138, 243, 236,
195, 206, 123, 207, 177, 121, 123, 200, 175, 13, 96, 218};

Fig 4. How pixels are stored in the memory

Computer memory consists simply of an ever-increasing linear list of address


spaces. Had we used a coloured image instead of a gray one, the dimensions would
have been different. Computers read color as a series of 3 values - red, green, and
blue (RGB) - on that same 0–255 scale.
Now, each pixel has 3 values for the computer to store in addition to its position. If
we were to colorize the sample image, that would lead to 12 x 16 x 3 values, or 576
numbers, where
12 – Rows
16 – Columns
3 – Colour pixel values
That’s a lot of memory to require for one image, and a lot of pixels for an algorithm
to iterate over. This is why most image analysis is preceded by image processing
where it is converted to a grayscale image. This is known as Changing Color
Space. To train a model with meaningful accuracy, we usually need tens of
thousands of images, hence memory optimization.
The RGB Color Model[9] - In this, each color appears as a combination of red,
green, and blue. The primary colors can be added to produce the secondary colors of
light - magenta, cyan and yellow. The combination of red, green, and blue at full
intensities makes white.
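As a short illustration of changing colour space and the memory difference involved, the sketch below (assuming an image file named sample.jpg exists in the working directory; the name is only a placeholder) reads a colour image, prints its dimensions and converts it to grayscale:

import cv2

# Read the image in colour (BGR) mode; shape is (rows, columns, 3)
img = cv2.imread("sample.jpg", 1)
print("Colour image shape:", img.shape)      # three values per pixel

# Convert to a single-channel grayscale image; shape is (rows, columns)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print("Grayscale image shape:", gray.shape)  # one value per pixel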

2.1.6 OpenCV for Computer Vision[10]

OpenCV is a cross-platform library using which we can develop real-time


computer vision applications. It mainly focuses on image processing, video
capture and analysis including features like face detection and object detection.

Using OpenCV library, you can −

 Read and write images


 Capture and save videos
 Process images (filter, transform)
 Perform feature detection
 Detect specific objects such as faces, eyes, cars, in the videos or images.
 Analyse the video, i.e., estimate the motion in it, subtract the background, and
track objects in it.

OpenCV was originally developed in C++, and Python and Java bindings were provided later.
OpenCV-Python is essentially a wrapper around the original C++ library for use from Python.
Advantages:
 First and foremost, OpenCV is available free of cost
 Since OpenCV library is written in C/C++ it is quite fast
 Low RAM usage (approximately 60 –70 MB)
 It is portable as OpenCV can run on any device that can run C

Disadvantages:
 Doesn’t provide same ease of use when compared to MATLAB.
 OpenCV has a flann library of its own. This causes conflict issues when
you try to use OpenCV library with the PCL library.

2.1.7 Using the Haar Classifiers[11]

Haar-like features are digital image features used in object detection. They owe their
name to their intuitive similarity with Haar wavelets and were used in the first real-
time face detector. This classifier is widely used for various detection tasks in computer
vision industry.
Haar cascade classifier employs a machine learning approach for visual object
detection which is capable of processing images extremely rapidly and achieving high
detection rates.
This can be attributed to three main reasons:

1. Haar classifier employs 'Integral Image' concept which allows the features
used by the detector to be computed very quickly.
2. The learning algorithm is based on AdaBoost.[12] It selects a small number of
important features from a large set and gives highly efficient classifiers.
3. More complex classifiers are combined to form a 'cascade' which discard any
non-apt regions in an image, thereby spending more computation on
promising object-like regions.
Working of the Algorithm:

1. Haar features extraction:


After the tremendous amount of training data (in the form of images) is fed into the
system, the classifier begins by extracting Haar features from each image. Haar
Features are kind of convolution kernels which primarily detect whether a suitable
feature is present on an image or not.
Haar features are mentioned below:

Fig 5. Edge Features Fig 6. Line Features Fig 7. Four-rectangle features

Fig 8. Haar Features represented numerically

These Haar Features are like windows and are placed upon images to compute a
single feature. The feature is essentially a single value obtained by subtracting the
sum of the pixels under the white region and that under the black. The process can be
easily visualized in the example below:
Fig 9. Sample b/w image[13] Fig 10. Haar features applied on
relevant parts of face

To detect an eyebrow, we use a Haar feature like image (1), because the forehead and
eyebrow form a lighter pixel–darker pixel pattern. Similarly, to detect lips we use a
Haar-like feature such as image (3), with lighter-darker-lighter pixels. To detect the
nose, we might use a darker-lighter Haar-like feature from image (1). And so on.

Fig 11. 0 and 1 for ideal cases Fig 12. Normalised greyscale values for
real cases

According to the Viola-Jones algorithm, to detect a Haar-like feature present in an image,
the formula below should give a result close to 1. The closer the value is to 1, the greater
the chance of detecting the Haar feature in the image.
Fig 13. Formula for Delta Calculation[14]

Ideal case: Delta = (1/8)*(8) - (1/8)*0 = 1

Real case: Delta = (1/8)*(5.9) - (1/8)*(1.3) = 0.575

(For a greyscale image, assume the White-Dark threshold is set to 0.3, meaning pixels
with a value less than or equal to 0.3 are considered white and anything greater than
0.3 is considered dark.)
The delta threshold (different from the White-Dark threshold) is assumed to be set to
0.5: any delta value greater than 0.5 detects the Haar feature. In this fashion we can
apply the features and detect the most relevant ones on a given image.

2. Integral Images Concept:

The algorithm proposed by Viola Jones uses a 24x24 base window size, and that
would result in more than 180,000 features being calculated in this window. The
solution devised for this computationally intensive process is to go for the Integral
Image concept. The integral image means that to find the sum of all pixels under any
rectangle, we simply need the four corner values.
This means, to calculate the sum of pixels in any feature window, we do not need to
sum them up individually. All we need is to calculate the integral image using the
4 corner values.

Fig 14. Input image and Integral Image[15]


Sum of all pixels in D
= 1+4-(2+3)
= A+(A+B+C+D)-(A+C+A+B)
=D
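A minimal NumPy sketch of this idea (the array values are arbitrary and purely illustrative): the integral image is the cumulative sum over rows and columns, and the sum of any rectangle comes from its four corner values.

import numpy as np

# Small example image (values chosen arbitrarily)
img = np.array([[5, 2, 3, 4],
                [1, 5, 4, 2],
                [2, 2, 1, 3],
                [4, 3, 5, 1]])

# Integral image: cumulative sum down the rows and across the columns
ii = img.cumsum(axis=0).cumsum(axis=1)

# Sum of the 2x2 rectangle with top-left corner (1, 1) and bottom-right corner (2, 2)
r1, c1, r2, c2 = 1, 1, 2, 2
total = ii[r2, c2] - ii[r1 - 1, c2] - ii[r2, c1 - 1] + ii[r1 - 1, c1 - 1]
print(total)        # 12, same as img[1:3, 1:3].sum(), using only four corner values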

3. Adaboost- to improve Classifier accuracy:

More than 180,000 feature values result within a 24x24 window.


However, not all features are useful for identifying a given object. To only select the
best feature out of the entire chunk, a machine learning algorithm called Adaboost is
used. What it essentially does is that it selects only those features that help to improve
the classifier accuracy. It does so by constructing a strong classifier which is a linear
combination of a number of weak classifiers. This reduces the number of features
drastically to around 6000 from around 180,000.

Strong classifier = linear sum of weak classifiers:
F(x) = ∑ ( αᵢ * fᵢ(x) )
Here, αᵢ is the weight corresponding to each weak classifier fᵢ(x).

4. Using Cascade of Classifiers:

The cascade classifier essentially consists of stages where each stage consists of a
strong classifier. This is beneficial since it eliminates the need to apply all features at
once on a window. Rather, it groups the features into separate sub-windows and the
classifier at each stage determines whether or not the sub-window is a face. In case
it is not, the sub-window is discarded along with the features in that window. If the
sub-window moves past the classifier, it continues to the next stage where the second
stage of features is applied.
Each stage of the classifier labels the region defined by the current location of the
sliding window as either positive or negative. Positive indicates that an object was
found and negative indicates no objects were found. If the label is negative, the
classification of this region is complete, and the detector slides the window to the next
location. If the label is positive, the classifier passes the region to the next stage. The
detector reports an object found at the current window location when the final stage
classifies the region as positive.

Fig 15. Using Classifier Cascading for example of face[16]

How feature selection works for car detection:

Fig 16. Feature Selection for vehicle detection[17]


2.2 Block structure for Project

CAMERA 1
• Vehicle detection and counting at set intervals of time to approximately ascertain
the parking and traffic situation
• Record keeping of incoming cars' number plates to ensure contactless parking record
and safety

CAMERA 2
• Attendance photo record with incoming time for employees of the organisation,
eliminating the need for a Biometric system
• Conditional entry by face mask detection at the time of entry

CAMERA 3
• Intrusion Detection by clicking and recording various frames along with exact
time for future action (person detection)

CAMERA 4
• Motion Detector for slightest disturbance in the same room as Camera 3
• Alarm System triggered by the motion detector which sounds the desired message out loud
• Motion vs Time visualisation observed using the csv file saved.
Chapter 3: Implementation

3.1 Libraries and corresponding major functions used

Table 1. Major libraries and their classes, functions used

SNO. LIBRARY/MODULE FUNCTIONS USED

1 Cv2 cv2.imread( )
cv2.imshow( )
cv2.resize( )
cv2.waitKey( )
cv2.destroyAllWindows( )
cv2.cvtColor( )
cv2.CascadeClassifier( )
cv2.rectangle( )
cv2.putText( )
cv2.imwrite( )
cv2.VideoCapture( )
cv2.GaussianBlur( )
cv2.absdiff( )
cv2.threshold( )
cv2.dilate( )
cv2.findContours( )
cv2.contourArea( )

2 NumPy[18] .shape( )

3 Pandas[19] pd.DataFrame( )
pd.read_csv( )

4 Time[20] time.append( )
5 Datetime[21] datetime.datetime
datetime.now( )

6 Os[22] os.path.exists( )
os.makedirs( )

7 Random[23] random.choice( )

8 Pyttsx3[24] pyttsx3.init( )
.getProperty( )
.setProperty( )
.stop( )
.say( )

9 Threading[25] threading.Thread( )
.start( )

1. Cv2
OpenCV is an open source cross-platform computer vision and machine learning
software library. Functions and classes used:

cv2.imread(path, flag) – Used to read images into the code where path is the path of
image file and flag refers to the way in which the image is to be read(1/0/-1).
We are reading our images in UNCHANGED color mode.

cv2.VideoCapture(video_path or device index ) - Method used to read videos and


start live streaming.
1. video_path: Path of video in your system in string form with extension.
2. device index: The number to specify the camera with values as either 0 or -1.

cv2.resize(src, dsize[, dst[, fx[, fy[, interpolation]]]]) - Used for resizing images by
changing their dimensions, width alone, height alone or both, while preserving the
aspect ratio of the original image.
cv2.imshow(window name, media) - Used to display an image in a window.
The window automatically fits to the image size. We use this method of cv2 instead of
Matplotlib as we need to view our outputs in separate windows and not within the
console.

cv2.waitKey(n) and cv2.destroyAllWindows( ) -


cv2.waitKey() is a keyboard binding function. n is time in milliseconds.
waitKey(0) displays the window infinitely until keypress(for image display).
waitKey(1) displays a frame for 1 ms, afterwards display automatically closes.
cv2.destroyAllWindows() simply destroys all the windows we created.

cv2.imwrite(filename, image) - Used to save an image to any storage device. We are


using it to save records in our current working directory.

cv2.cvtColor(input image, flag) - Used for conversion between color spaces.

This is required as OpenCV reads images in BGR format. Here, the flag determines the
type of conversion.
For example, BGR to Gray conversion uses cv2.COLOR_BGR2GRAY as the flag value.

cv2.putText(image, text, org, font, fontScale, color[, thickness[, lineType[,


bottomLeftOrigin]]]) - Used to draw a text string on any image at any desired
position, font, colour, size or background.
We have used FONT_HERSHEY_SIMPLEX font.

cv2.rectangle(image, start, end, color, thickness) - used to draw a rectangle on any


image or video.

cv2.CascadeClassifier( filename ) -
CascadeClassifier::CascadeClassifier
Loads a classifier from a file.
cv2.CascadeClassifier([filename]) → <CascadeClassifier object>
filename – Name of the file from which the classifier is loaded.
The following files have been used in the first 3 segments of the project:
1. haarcascade_car.xml
2. haarcascade_carplate.xml
3. haarcascade_frontalface_default.xml
4. haarcascade_mcs_mouth.xml
5. haarcascade_fullbody.xml

CascadeClassifier::detectMultiScale
Detects objects of different sizes in the input image. The detected objects are
returned as a list of rectangles.
cv2.CascadeClassifier.detectMultiScale(image[, scaleFactor[, minNeighbors[,
flags[, minSize[, maxSize]]]]]) → objects
1. scaleFactor – Specifies how much image size is reduced at each image
scale.
2. minNeighbors – Specifies how many neighbors each candidate rectangle
should have to retain it.
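As a brief illustration of how these two calls are typically combined (the file names and parameter values below are placeholders, not necessarily those used in the project):

import cv2

# Load a pre-trained Haar cascade and a sample image
face_classifier = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
img = cv2.imread("sample.jpg", 1)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Detect faces; each detection is returned as (x, y, width, height)
faces = face_classifier.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imshow("Detections", img)
cv2.waitKey(0)
cv2.destroyAllWindows()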

2. Pandas
Pandas is the data manipulation and analysis library of python. It is widely used in data
science for data processing and visualization as well.
In this project, I have used Pandas to write into, read from and perform analysis on
csv(comma separated values) files in order to form a visualization at the end.
1. pandas.DataFrame(data, index, columns, dtype, copy)
Creates 2D size-mutable, heterogeneous tabular data structure with label axes.
2. pd.read_csv(filepath, sep = ’, ‘,..,…) - Used to read csv file into the dataframe.
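A small sketch of how these Pandas calls serve the motion-interval records (the column names follow the Camera 4 description and are illustrative; to_csv is the standard DataFrame method used to write the file):

import pandas as pd

# Create a DataFrame of motion intervals and write it to a CSV file
df = pd.DataFrame({"Start": ["10:15:02", "10:31:40"],
                   "End":   ["10:15:09", "10:31:55"]})
df.to_csv("time_of_movements.csv", index=False)

# Read the CSV file back for analysis
records = pd.read_csv("time_of_movements.csv")
print(records)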

3. Datetime
The datetime module supplies classes for manipulating dates and times. Date as well
as time are objects and not variables in python.
1. datetime.datetime – One of the 6 total classes, it’s a combination of date and time
along with the attributes year, month, day, hour, minute, second, microsecond, and
tzinfo.
2. datetime.now( ) - now() function returns the current local date and time.
4. Os
This module provides a portable way of using operating system dependent
functionality. Os and os.path both come under Python’s standard utility modules.
Functions used:
os.path.exists(path) - Used to check whether the specified path exists or not. Returns
a bool value True if exists and False if it doesn’t.
os.makedirs(directory path) - Used to create a directory recursively. While making
leaf directory, if any intermediate-level directory is missing, they are all created.
I have used this module to create Attendance and Intrusion Detection Record folders
in the working directory.
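A minimal sketch of how these two functions create a record folder (the folder name matches the Attendance example described above):

import os

folder = "Attendance"
# Create the folder only if it does not already exist
if not os.path.exists(folder):
    os.makedirs(folder)
    print("Created directory:", folder)
else:
    print("Directory already present:", folder)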

5. pyttsx3
pyttsx3 is a text-to-speech conversion library in Python. Functions used:
1. pyttsx3.init( ) → object(Engine) - Gets an engine instance for speech synthesis.
2. .getProperty(name : string) → object - Gets the current value of an engine property.
3. .setProperty(name, value) → None - Queues a command to set an engine property.
4. .say(text : unicode, name : string) → None - Queues a command to speak an
utterance. The output is the speech uttered in the Alarm System.
3.2 Implementation Blocks
The project is divided into four segments representing the 4 cameras being
programmed for. The procedures involved, with detailed steps and corresponding
algorithms are explained below:

3.2.1 CAMERA 1
This camera is positioned on the outermost boundary of the infrastructure at an
appropriate height above the ground. This segment performs 2 kinds of tasks –

1. Road Traffic Deduction


Different numbers of vehicles are present in front of the building at different times of
the day. An approximate count of the vehicles clustering there could be useful in
driving the profits of the said organization and could even help in parking management.

Workflow:
1. First, ‘haarcascade_car.xml’ file is loaded into the classifier variable called
car_classifier using the CascadeClassifier method of OpenCV.
2. Color space conversion is done from BGR to Gray for the given resized image.
3. The corrected image is then passed into the function where the car(s) is/are
detected using the detectMultiScale method.
4. We are using a counter (initialised to 0) to count the number of cars detected.
5. Current date and time of the detection is also recorded to be later printed onto
the frame. This is done using the datetime.now function.
6. As the four coordinates of the object are detected, they are used to draw a
rectangle around it using the rectangle function with the desired formatting.
7. For every set of unique coordinate sets, the counter is incremented by 1 to
finally give us the number of detections performed at the end.
8. The current date and time are displayed at the top of the frame while the
approximate value of cars is mentioned at the bottom.
9. The processed image is finally displayed on the screen using the imshow
method.
10. At the end, this final image is stored into the working directory with a unique
name using the imwrite function.
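A condensed sketch of the workflow above, assuming a sample frame named traffic.jpg in the working directory (the file name, resize dimensions and detection parameters are illustrative):

import cv2
from datetime import datetime

car_classifier = cv2.CascadeClassifier("haarcascade_car.xml")
frame = cv2.resize(cv2.imread("traffic.jpg", 1), (640, 480))
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

cars = car_classifier.detectMultiScale(gray, 1.1, 2)
count = 0
for (x, y, w, h) in cars:
    cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)
    count += 1                                   # one increment per detected vehicle

stamp = datetime.now().strftime("%d/%m/%Y %H:%M:%S")
cv2.putText(frame, stamp, (10, 25), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
cv2.putText(frame, "Approx. cars: " + str(count), (10, 460),
            cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)

cv2.imshow("Traffic", frame)
cv2.imwrite("traffic_count_" + datetime.now().strftime("%H%M%S") + ".jpg", frame)
cv2.waitKey(0)
cv2.destroyAllWindows()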
2. Car Plate Detection and Recording
In order to achieve contactless Parking, this segment of the project focuses on
detecting the plates of cars appearing in front of it one by one and processing each frame
to extract the number plate from it. These car plates are then stored in the
working directory, thus keeping a record of all the vehicles entering the premises of
the building in a day.

Workflow:
1. The particular frame is resized as per requirement.
2. Choice function of the random library is used to fetch random filenames to
prevent redundancy in naming.
3. Then, ‘haarcascade_carplate.xml’ file is loaded into the classifier variable
called carplate_classifier using the CascadeClassifier method of OpenCV.
4. The corrected image is then passed into the function where the car plate is
detected using the detectMultiScale method.
5. Current date and time of the detection is also recorded to be later printed onto
the frame. This is done using the datetime.now function.
6. As the four coordinates of the object are detected, they are used to draw a
rectangle around it using the rectangle function.
7. The date and time of entry of the said vehicle is also displayed at the bottom of
the image.
8. The image is sliced to extract only the car plate from the entire image.
9. This extracted portion is finally saved into the working directory with a unique
name.
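A minimal sketch of the plate detection and slicing steps above (the input file name and detection parameters are illustrative; the random library supplies a simple unique-name scheme as described):

import cv2
import random

carplate_classifier = cv2.CascadeClassifier("haarcascade_carplate.xml")
frame = cv2.resize(cv2.imread("car_entry.jpg", 1), (640, 480))
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

plates = carplate_classifier.detectMultiScale(gray, 1.1, 4)
for (x, y, w, h) in plates:
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    plate = frame[y:y + h, x:x + w]                       # slice out only the number plate
    name = "plate_" + str(random.choice(range(10000))) + ".jpg"
    cv2.imwrite(name, plate)                              # record saved in the working directory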

3.2.2 CAMERA 2
This camera is positioned at the entry gate of the building at persons’ height from the
ground. This segment also performs two kinds of tasks:
1. Facial Attendance Record (No recognition)
Usually, institutions have security guards at the entrances to manage the entry of
employees into the compound. Due to the pandemic, it is optimal that we follow
social distancing norms and eliminate the need for security personnel to ask
employees to take off their masks and show their faces. Each employee will now have
to show their face into the camera and have their attendance recorded at the current
date and time. Later, with all the data of the day, their attendances can be marked into
spreadsheets. This eliminates the need to touch the biometric system and ensures
social distancing. We do this using a live video stream.

Workflow:
1. Firstly, it is checked if the folder named ‘Attendance’ is present or not. If it is
not, an exception is thrown with the apt message and the said directory is
created.
2. Then, 'haarcascade_frontalface_default.xml' file is loaded into the classifier
variable called face_classifier using the CascadeClassifier method of OpenCV.
3. Now, we start capturing the video with 0 as the parameter to specify that we
are taking a live video stream.
4. While the video is open, a frame is read one by one and stored as a NumPy
array.
5. This frame is now used for passing into the detectMultiScale method, where
face detection is performed and desired coordinates are stored.
6. While Python is able to read the VideoCapture object, following operations
are performed:
i. Current date and time of the detection is recorded. This is done using the
datetime.now function.
ii. The four detected coordinates are used to draw a rectangle around the
object.
iii. The live video is shown in a separate window, with rectangles and date
and time printed on it in real time.
iv. The end frame, triggered by a keypress, is saved into the ‘Attendance’
directory that was previously created, using imwrite method.
7. In the end, the loop is broken and the video is released.
Capturing videos using OpenCV-

START

vid = VideoCapture(0)

No
Is video Opened

Yes
ret(bool), frame(array)
= vid.read( )

Yes Perform processing tasks on video


Is ret == true

No

vid.release( ) END

Fig 17. Flowchart for capturing videos using OpenCV
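A hedged sketch of the capture loop shown in the flowchart, applied to the attendance task described above (window name, key binding and detection parameters are illustrative, and the Attendance folder is assumed to exist already):

import cv2
from datetime import datetime

face_classifier = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
vid = cv2.VideoCapture(0)                        # 0 selects the live camera stream

while vid.isOpened():
    ret, frame = vid.read()                      # ret is False when no frame is returned
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_classifier.detectMultiScale(gray, 1.3, 5):
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    stamp = datetime.now().strftime("%d/%m/%Y %H:%M:%S")
    cv2.putText(frame, stamp, (10, 25), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
    cv2.imshow("Attendance", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):        # key press ends the stream
        cv2.imwrite("Attendance/entry_" + datetime.now().strftime("%H%M%S") + ".jpg", frame)
        break

vid.release()
cv2.destroyAllWindows()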


2. Mouth Detection for Mask Check

The ongoing COVID-19 pandemic, or any other widespread disease, calls for
protection through the use of face masks. Adding to our aim of contactless operations,
this segment of the project accepts or denies entry into the building on the basis of
whether the person is wearing a facial mask covering their mouth. A message
regarding the status of their entry will be displayed, hence they can be reprimanded if
seen inside the premises after being denied entry.

Workflow:
(The code is constructed on the principle that the failure to detect a mouth
on the face of a person is a direct indicator that the mouth is covered by an
opaque shield or cover, which in our case is a facial mask.)
1. The image is first read and resized as per the scale factor required for
computation.
2. Then, ‘haarcascade_mcs_mouth.xml’ file is loaded into the classifier variable
called mouth_classifier using the CascadeClassifier method of OpenCV.
3. The corrected image is then passed into the function where the mouth is
detected using the detectMultiScale method.
4. We initialise a counter to 0 which tracks whether a mouth has been detected or
not.
5. If the classifier is able to detect a mouth, a rectangle will be drawn around it
using the coordinates received from the classifier.
6. This will also increment the counter.
7. A message saying “wear mask” will be displayed on the image(Red colour).
8. Else, no rectangle will be drawn and a message saying “Thanks for wearing
mask” will be displayed on the image(Green colour).
9. The image is finally displayed with the appropriate caption.
10. All windows are destroyed on key press.
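A condensed sketch of the mask-check logic described above (the image name, resize dimensions and detection parameters are illustrative; the displayed messages follow the workflow text):

import cv2

mouth_classifier = cv2.CascadeClassifier("haarcascade_mcs_mouth.xml")
img = cv2.resize(cv2.imread("visitor.jpg", 1), (480, 640))
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

mouths = mouth_classifier.detectMultiScale(gray, 1.5, 11)
counter = 0
for (x, y, w, h) in mouths:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 2)
    counter += 1                                 # mouth visible -> no mask

if counter > 0:
    cv2.putText(img, "Wear Mask", (10, 40), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)
else:
    cv2.putText(img, "Thanks for wearing mask", (10, 40),
                cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)

cv2.imshow("Mask Check", img)
cv2.waitKey(0)
cv2.destroyAllWindows()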
3.2.3 CAMERA 3

For buildings or commercial setups that house confidential information, valuable


belongings, and costly consumables, it is imperative that a security system be in place
that detects human intrusion where it is not programmed to be, for instance, a locker
room at night. Using full person detection and processing of video frames, this
segment clicks and records numerous frames of the entire duration of the activity and
informs about the exact frame where detection took place. The recorded frames will
be helpful when analyzing intrusion and crime solving.
Workflow:
1. Firstly, it is checked if the folder named ‘CAM3’ is present or not. If it is not,
an exception is thrown with the apt message and the said directory is created.
2. Then, 'haarcascade_fullbody.xml' file is loaded into the classifier variable
called body_classifier using the CascadeClassifier method of OpenCV.
3. Now, we start capturing the video with 0 as the parameter to specify that we
are taking a live video stream.
4. While the video is open, a frame is read one by one and stored as a NumPy
array.
5. This frame is now passed into the detectMultiScale method, where full-body
(person) detection is performed and the desired coordinates are stored.
6. While Python is able to read the VideoCapture object, following operations
are performed:
i. Current date and time of the detection is recorded. This is done using
the datetime.now function.
ii. The four detected coordinates are used to draw a rectangle around the
object.
iii. The video is shown in a separate window, with rectangles and date and
time printed on it in real time.
iv. All captured frames(up to 75) are stored in the CAM3 folder.
7. It is also specified which exact frame contains the detection and time of
intrusion.
8. All of this goes on till key press due to the waitKey(1) function.
9. In the end, the loop is broken and the video is released.
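A brief sketch of the frame-recording part of this workflow (the 75-frame limit and the CAM3 folder follow the description above; other parameters are illustrative, and the folder is assumed to have been created already):

import cv2
from datetime import datetime

body_classifier = cv2.CascadeClassifier("haarcascade_fullbody.xml")
vid = cv2.VideoCapture(0)
frame_no = 0

while vid.isOpened() and frame_no < 75:          # store up to 75 frames
    ret, frame = vid.read()
    if not ret:
        break
    frame_no += 1
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    bodies = body_classifier.detectMultiScale(gray, 1.1, 3)
    for (x, y, w, h) in bodies:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 0, 255), 2)
    if len(bodies) > 0:
        print("Intrusion detected in frame", frame_no, "at", datetime.now())
    cv2.imwrite("CAM3/frame_" + str(frame_no) + ".jpg", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

vid.release()
cv2.destroyAllWindows()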
3.2.4 CAMERA 4

This camera is situated in the same room as Camera 3. This is a Motion Detector plus
Alarm System which will be triggered on the slightest detection of motion. This
camera is placed very close to the thing being protected, so only the stealing motion is
recorded with minimum background. The alarm rings as soon as the motion detector
triggers it, causing it to sound the required text out loud, for example,
"ALERT, intrusion detected".

1. Motion Detection
Consider the following block diagram:

Fig 18. Block diagram for Motion Detection[26]

Workflow:
1. Initially we save the image in a particular frame.
2. The next step involves converting the image to a Gaussian blur image. This is
done to ensure we calculate a palpable difference between the blurred image
and the actual image.
3. At this point, the image is still not an object. We define a threshold to remove
blemishes, such as shadows and other noises in the image.
4. The absdiff( ) function is used to calculate the difference between the first
frame and the rest of the frames.
5. The threshold( ) function applies a threshold value of 30 to this difference:
i. If the difference value < 30, those pixels are converted to black.
ii. Else, those pixels are converted to white.
The dilate( ) function is then used to fill in gaps in the white regions.
6. Then we add the borders by defining the contour area of the moving object
using the findContours( ) function of OpenCV.
7. The contourArea( ) function removes the noise and the shadows: to put it
simply, only contours whose area is greater than the 10000 pixels we have
defined are kept.
8. Finally, we make a Green coloured rectangle around the moving object in the
frame.
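A minimal sketch of the detection steps listed above, using the threshold and contour-area values mentioned in the text (OpenCV 4's two-value findContours return and a live stream from device 0 are assumed; boundingRect is used here only to draw the rectangle):

import cv2

vid = cv2.VideoCapture(0)
first_frame = None

while True:
    ret, frame = vid.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (21, 21), 0)        # blur to suppress noise
    if first_frame is None:
        first_frame = gray                            # reference frame of the empty room
        continue

    delta = cv2.absdiff(first_frame, gray)                        # difference from reference
    thresh = cv2.threshold(delta, 30, 255, cv2.THRESH_BINARY)[1]  # <30 -> black, else white
    thresh = cv2.dilate(thresh, None, iterations=2)

    contours, _ = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for contour in contours:
        if cv2.contourArea(contour) < 10000:          # ignore small blemishes and shadows
            continue
        (x, y, w, h) = cv2.boundingRect(contour)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 3)

    cv2.imshow("Motion Detector", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

vid.release()
cv2.destroyAllWindows()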

2. Calculating the Time


The time is calculated as intervals of motion detection. There is a start value and an end
value of every interval of motion and they are all entered into a Data Frame.

Workflow:
1. We make use of DataFrame( ) function of Pandas to store the time values
during which object detection and movement appear in the frame.
2. A status variable is initialised to 0 indicating that the object is not visible.(This
is set after the video has started recording)
3. This status is incremented once the object is visible.
4. A list of status called motion_list is maintained for every frame and status
values are appended in it after every detection.
5. Now we record the date and time using datetime in a separate list if and where
a change occurs. This is done in the following manner:
i. If last value in motion_list ==1 and second last value == 0 then:
Start time of the motion is recorded
ii. Else if, last value of motion_list == 0 and second last value == 1 then:
End time for the motion is recorded.
6. These times of motion are then appended into the DataFrame.
7. This DataFrame is then finally written into a CSV file which is saved in the
working directory itself.
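A sketch of the interval bookkeeping described above (record_status is assumed to be called once per frame from inside the motion-detection loop; the function and variable names are illustrative):

import pandas as pd
from datetime import datetime

motion_list = [None, None]   # status per frame: 1 = motion visible, 0 = not visible
times = []                   # timestamps recorded whenever the status changes

# Called once per frame; status comes from the motion detector above
def record_status(status):
    motion_list.append(status)
    if motion_list[-1] == 1 and motion_list[-2] == 0:
        times.append(datetime.now())          # start of a motion interval
    if motion_list[-1] == 0 and motion_list[-2] == 1:
        times.append(datetime.now())          # end of a motion interval

# After the video is released, pair the timestamps and write them to a CSV file
def save_intervals():
    rows = [{"Start": times[i], "End": times[i + 1]} for i in range(0, len(times) - 1, 2)]
    df = pd.DataFrame(rows, columns=["Start", "End"])
    df.to_csv("time_of_movements.csv", index=False)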

3. Alarm System
For playing the audio, we will be using “pyttsx3” Python library to convert text to
speech. You can choose your voice (male/female), the speed of the speech delivery
and the volume. This alarm is triggered as soon as the first instance of motion is
encountered.

Workflow:
1. We first begin by creating an Engine instance for the speech synthesis by
using the init method.
2. To read the current values of the engine properties, we use the getProperty
function.
3. Before we add what the alarm is supposed to speak out, we set some voice
properties using the setProperty function.
We set the volume of the speech and the speech rate to 150.
4. Now as soon as motion is detected using the motion detection code explained
above, the Thread method of the threading library calls the alarm bell function.
5. Inside the function, say( ) method is used to voice out a “ALERT” through the
system.
6. At the end the engine terminates by engine.stop( ).
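A short sketch of the alarm wired to a separate thread as described above (property values are illustrative; runAndWait is the standard pyttsx3 call that actually speaks the queued text):

import pyttsx3
import threading

def alarm_bell():
    engine = pyttsx3.init()                      # engine instance for speech synthesis
    default_rate = engine.getProperty('rate')    # read the current speech rate
    engine.setProperty('rate', 150)              # speed of speech delivery
    engine.setProperty('volume', 1.0)            # full volume
    engine.say("ALERT, intrusion detected")      # queue the spoken message
    engine.runAndWait()                          # speak the queued message
    engine.stop()

# Called from the motion-detection loop the first time motion is seen
def trigger_alarm():
    threading.Thread(target=alarm_bell).start()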

4. Formation of Graph
Using the csv file formed in the previous segment, we observe a graph between motion
and the motion intervals.
On the vertical axis we have 0 and 1, 1 indicating motion.
On the horizontal axis we have time in milliseconds.
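A minimal plotting sketch using the CSV file saved above (Matplotlib is assumed here, consistent with the visualization libraries listed in Section 2.1.1; the step-style plot simply draws 1 between the Start and End of each interval and 0 elsewhere):

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("time_of_movements.csv", parse_dates=["Start", "End"])

# Build a step-like motion signal: 1 between Start and End of each interval, 0 otherwise
fig, ax = plt.subplots()
for _, row in df.iterrows():
    ax.plot([row["Start"], row["Start"], row["End"], row["End"]], [0, 1, 1, 0], color="blue")

ax.set_xlabel("Time")
ax.set_ylabel("Motion (0 = none, 1 = motion)")
ax.set_title("Motion vs Time")
plt.show()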
Chapter 4: Result and Discussion

4.1 Outputs
4.1.1 CAMERA 1

Traffic deduction:

Fig 19. Test 1 for Traffic Deduction Camera 1[27]

Fig 20. Test 2 for Traffic Deduction Camera 1[28]


Discussion:
Above are two pictures as outputs from Camera 1 where the number of cars has
roughly been estimated. As we can see, in Trial 1 we have been able to detect 9
vehicles at a particular time of the day (printed on the output image), while in Trial 2
we have detected 2 vehicles at a completely different hour. Both of them have been
saved into the working directory in greyscale format for future analysis and
processing.
This is, however, not the most accurate method, but it can definitely give an estimate
of the traffic and commotion in front of the building.

Car Plate Record Keeping:

Fig 21. Test 1 for Car Plate Record Camera 1[29]


Fig 22. Extracted car plate for Test 1 Car Plate Record Camera 1
Fig 23. Test 2 for Car Plate Record Camera 1[30]

Fig 24. Extracted car plate for Test 2 Car Plate Record Camera 1

Discussion:

Above are two pictures as outputs of the second segment of Camera 1.


The first one is where Car Plate Detection has been performed and the current time of
entry of the vehicle is noted.
The second one is the extracted portion of just the Car plate, which is saved or written
as an image file into the working directory. This will result in easy and contactless
management of parking records.
4.1.2 CAMERA 2
Face-based Attendance

Fig 25. Test 1 for Face Attendance Record Camera 2

Fig 26. Test 2 for Face Attendance Record Camera 2


Discussion:
Above are 2 Test runs of the first segment of Camera 2. Video is captured using a live
video stream. Upon detection, the live date and time is printed on the output image
which results in an attendance being recorded with the entry time of the person
printed on his/her photo and that image being saved in the folder called Attendance in
the working directory.

Face Mask based Conditional Entry

Fig 27. Face Mask detection Camera 2 Fig 28. Face Mask detection Camera 2
Wearing Mask Not wearing Mask

Discussion:

Here, mouth detection is being performed to ascertain the presence of mask on the
face of the person. If the mouth is being detected, it means that the person is not
wearing a mask. The message “Wear Mask” is displayed.

While if the detection task fails, it means the mouth is covered implying the presence
of a mask. The message “You May Enter” is displayed.
4.1.3 CAMERA 3

Intrusion Detection and Recording

Fig 29. Test 1 for Intrusion Detection Camera 3[31]

Fig 30. Code window snippet for exact frame of Intrusion Detection Camera 3

Camera starts clicking and storing multiple frames of the live video stream as soon as
it detects a person. It also informs about the exact frame number where detection
actually took place which can then be accessed through the working directory( in this
case, frame 65). This output image contains the detection as well as the date and time
of intrusion.
4.1.4 CAMERA 4
Motion Detector and Motion analysis

Fig 31. CSV file created for live Motion Detection Camera 4

Fig 32. Observed Graph for live Motion Detection Camera 4

Discussion:

Motion triggered Alarm is spoken out loud.


As motion is detected, the list is appended and the time interval data is saved into the
csv file called time_of_movements.csv. The CSV file is shown above in Fig 31. It has
columns for the starting as well as the ending time of every instance of motion.

We have then analysed this data by plotting a Motion vs Time graph. The interval
ranges from seconds to milliseconds.
Chapter 5: Future Scope and Conclusion

5.1 Future Scope

In an ever-changing and evolving world of AI advancement, to say that the project is
fully optimal with no scope for improvement would be highly erroneous.

Data Science based projects, such as mine, always have room for amendments as they
are based on training and testing of models with samples. The classifier accuracy, the
precision of detection, the variation in datasets and the number of models trained before
arriving at the final one are all variables which can change based on approach and
technology.

Following upgrades could be made in the future in the project-

1. A self-trained model can be used in place of haar cascade classifiers for more
personalized results.
2. YOLO (“You Only Look Once”)[32], an effective real-time object recognition
algorithm, can be used to exploit its various advantages such as anchor boxes,
multiscale predictions, etc.
3. A Convolutional Neural Network(CNN) based model can be created to
perform Recognition after detection for Face Attendance.
4. The frame captured during Intrusion Detection can be sent via email to the
concerned authority for a more organised management.
5. A clean front-end for the entire project can be created and it can be deployed
on the web using Django.
6. Extraction of Car Plate numbers as pure characters can be done.
7. A better model for Mask Detection can be used that also checks for the nose
being covered.
8. Visualization for the dependency of traffic outside the building to the Profits
of the organization can be made and analysed.
5.2 Conclusion

All aforementioned tasks in the objectives have been completed in the project and
corresponding results and outputs have been attached.

The source of inspiration and skill building required for the project was inclusive of,
yet not limited to the knowledge gained as part of the Data Science course undertaken
as part of the Summer Training.

OpenCV has been used as the primary Computer Vision library to perform most of
the detections in the project and it can be concluded that this library has huge
potential in the field of Computer Vision.
COVID-19 pandemic has caused institutions across the globe to resort to contactless
services, involving least manpower. Social Distancing norms have been placed at
most places which has expedited the need for automated systems to a great extent.
This project, that is a comprehensive integration of various features, was a step
towards achieving minimum human contact to carry out services of an institution
smoothly and safely.
With great scope for the future and optimization for AI and ML additions, this project
can be ready to be used by Commercial setups.
References

[1] https://github.com/opencv/opencv/tree/master/data/haarcascades

[2] https://jupyter-notebook.readthedocs.io/en/stable/

[3] https://www.anaconda.com/products/individual

[4] https://www.python.org/

[5] https://www.coursera.org/

[6] https://www.kdnuggets.com/2020/01/python-preferred-languages-data-science.html

[7] https://towardsdatascience.com/everything-you-ever-wanted-to-know-about-computer-vision-heres-a-look-why-it-s-so-awesome-e8a58dfb641e

[8] https://towardsdatascience.com/everything-you-ever-wanted-to-know-about-computer-vision-heres-a-look-why-it-s-so-awesome-e8a58dfb641e

[9] https://www.educba.com/rgb-color-model/

[10] https://docs.opencv.org/master/

[11] https://medium.com/analytics-vidhya/haar-cascade-face-identification-aa4b8bc79478

[12] https://machinelearningmastery.com/boosting-and-adaboost-for-machine-learning/

[13] https://medium.com/analytics-vidhya/what-is-haar-features-used-in-face-detection-a7e531c8332b

[14] https://medium.com/analytics-vidhya/what-is-haar-features-used-in-face-detection-a7e531c8332b

[15] https://www.mathworks.com/help/images/integral-image.html

[16] https://medium.com/analytics-vidhya/haar-cascade-face-identification-aa4b8bc79478

[17] https://www.sciencedirect.com/science/article/pii/S0020025514010238

[18] https://numpy.org/doc/

[19] https://pandas.pydata.org/docs/

[20] https://docs.python.org/3/library/time.html

[21] https://docs.python.org/3/library/datetime.html

[22] https://docs.python.org/3/library/os.html

[23] https://docs.python.org/3/library/random.html

[24] https://pyttsx3.readthedocs.io/en/latest/

[25] https://docs.python.org/3/library/threading.html

[26] https://dzone.com/articles/opencv-python-tutorial-computer-vision-using-openc

[27] https://driversed.com/trending/the-best-defense-preparing-to-share-the-road-with-self-driving-cars

[28] https://www.insurancejournal.com/news/national/2011/01/27/182276.htm

[29] https://stackoverflow.com/questions/981378/how-to-recognize-vehicle-license-number-plate-anpr-from-an-image

[30] https://www.researchgate.net/post/How_do_I_detect_and_recognize_vehicle_number_plates_from_images_taken_arbitrarily

[31] https://www.gettyimages.in/detail/video/front-porch-security-camera-home-front-porch-burglary-stock-footage/1084227724?adppopup=true

[32] https://appsilon.com/object-detection-yolo-algorithm/
