
PRO VISION

A MAJOR PROJECT REPORT

Submitted in partial fulfillment for the award of the degree of

BACHELOR OF COMPUTER APPLICATION

in

COMPUTER APPLICATIONS

Under The Guidance of

Dr. Sanjeev Solanki


(Professor)
Submitted by

1. Ashish Singh Rawat 236229285022


2. Chandra Shekar 236229285033
3. Chand Lokesh Jhalak 236229285031
4. Aryan Gusain 236229285021
5. Anshika Pundir 236229285012

Submitted To

DEPARTMENT OF COMPUTER APPLICATIONS


TULA'S INSTITUTE DEHRADUN
MAY-2025
DEPARTMENT OF COMPUTER
APPLICATIONS

TULA'S INSTITUTE, DEHRADUN

DECLARATION

We hereby declare that the major project entitled “PRO VISION”
submitted for the BCA (Bachelor of Computer Application) degree at
Tula's Institute, Dehradun, is our original work and that this major
project has not formed the basis for the award of any degree,
associateship, fellowship or any other similar title.

Place: Signature of the Student


Date:

Ashish Singh Rawat 236229285022

Chandra Shekar 236229285033

Chand Lokesh Jhalak 236229285031

Aryan Gusain 236229285021

Anshika Pundir 236229285012


DEPARTMENT OF COMPUTER
APPLICATIONS

TULA'S INSTITUTE, DEHRADUN

CERTIFICATE

This is to certify that the major project “PRO VISION” is the bonafide work
carried out by Ashish Singh Rawat, Chandra Shekar, Chand Lokesh Jhalak,
Aryan Gusain and Anshika Pundir, roll numbers 236229285022, 236229285033,
236229285031, 236229285021 and 236229285012, students of BCA, Tula's
Institute, Dehradun, during the year 2024-25, in partial fulfillment of the
requirement for the award of the degree of Bachelor of Computer Application,
and that the major project has not previously formed the basis for the award
of any degree, diploma, associateship, fellowship or any other similar title.

Place: Signature of the Guide


Date: Dr. Sanjeev Solanki
(Professor)
ACKNOWLEDGMENT

We would like to express our sincere gratitude to all the individuals who
have supported and contributed to the successful completion of our BCA
project. Their guidance, assistance, and encouragement have been
invaluable throughout this journey.

First and foremost, we would like to extend our heartfelt appreciation to
our project guide, Dr. Sanjeev Solanki, for his unwavering support and
expertise. His continuous guidance, valuable insights, and constructive
feedback have been instrumental in shaping this project and enhancing its
quality.

We are also grateful to the faculty members of Tula's Institute,
Dehradun, especially our HOD, Dr. Priya Matta, and our project
coordinator, Md Mursleen, for their valuable inputs, suggestions, and
encouragement. Their vast knowledge and expertise in the field of
computer science have been instrumental in broadening our
understanding of the subject matter.

Finally, we would like to acknowledge the support and encouragement
from our family. Their unwavering belief in our abilities and their
continuous support throughout our academic journey have been a
constant source of motivation.

In conclusion, we are deeply grateful to all the individuals mentioned
above and to anyone else who has played a part, no matter how big or
small, in the completion of this BCA project. Their contributions have
been indispensable, and without their support, this project would not have
been possible.
Thank you.

Ashish Singh Rawat


Chandra Shekar
Chand Lokesh Jhalak
Aryan Gusain
Anshika Pundir

Date:
ABSTRACT

Pro Vision is a smart surveillance system developed to overcome the
limitations of conventional CCTV setups by integrating advanced
computer vision techniques and real-time data processing. The system is
designed to perform intelligent monitoring through various automated
features, including motion detection, object monitoring, face recognition,
visitor entry/exit tracking, and real-time video recording. Built using
Python and supported by libraries such as OpenCV, NumPy, and Tkinter,
Pro Vision offers a modular and user-friendly graphical interface that
allows users to access all functionalities seamlessly.

At the core of the system, technologies like the Structural Similarity
Index (SSIM) are employed to detect changes in a scene, while the LBPH
(Local Binary Pattern Histogram) algorithm enables accurate face
recognition. Motion detection is handled by analyzing frame differences
and drawing contours around detected movement, ensuring high
responsiveness to activity within the monitored area. The system stores
important events in organized folders, categorizing them into incidents
like 'in', 'out', 'stolen', and general recordings, allowing for easy review.
By combining these intelligent modules, Pro Vision not only improves
the efficiency of surveillance but also reduces the need for constant
human supervision. It is well suited for homes, offices, and
institutions seeking low-cost, automated security solutions.
TABLE OF CONTENTS

Declaration
Certificate
Acknowledgement
Abstract
Chapter 1 Introduction
1.1 Background
1.2 Objectives
1.3 Scope
Chapter 2 Literature Review
Chapter 3 System Requirement
3.1 Software Requirements
3.2 Hardware Requirements
Chapter 4 System Design
4.1 Block Diagram
4.2 Architecture
4.3 Component Description
4.4 Model
Chapter 5 Coding
5.1 Main.py
5.2 Find_noise.py
5.3 Spot_diff.py
5.4 Identify.py
5.5 In_out.py
5.6 Motion.py
5.7 Finding noises in rectangle
5.8 Date and Time
Chapter 6 Implementation
6.1 Data Flow
6.2 Modules and Functionality
6.3 User Interface
Chapter 7 Future Scope
Chapter 8 Conclusion
References
Appendix
Chapter 1
INTRODUCTION

1.1 Background
With the rapid growth of urban infrastructure and increasing security
concerns, surveillance systems have become a crucial component in
maintaining safety and monitoring environments. Traditional CCTV
systems, while useful, often require manual intervention and lack
intelligent features. The rise of Computer Vision and Machine
Learning has enabled the development of smarter, more autonomous
security systems. This project, Pro Vision, is a smart CCTV surveillance
system designed to go beyond passive video capture and introduce
real-time intelligent monitoring features such as object theft detection,
motion analysis, face recognition, and visitor counting.

1.2 Objectives
The main objectives of this project are:

 To develop a Python-based GUI application that utilizes a webcam for real-time surveillance.
 To integrate intelligent features such as:
 Object theft detection using image similarity.
 Noise/motion detection to trigger alerts.
 Face recognition using the LBPH algorithm for identifying known individuals.
 Visitor tracking to count entries and exits in a monitored room.
 To ensure cross-platform functionality (Windows/Linux/Mac).

1.3 Scope
This system is intended for use in small-scale environments like homes,
offices, or labs where real-time monitoring with intelligent alerts can
enhance security. The scope includes:

Building a desktop application using Python and OpenCV.

Implementing basic Computer Vision techniques without the need for deep
learning models.

Allowing modular expansion in future iterations, such as weapon detection
or fire alerts using deep learning models.
CHAPTER 2
LITERATURE REVIEW
In the landscape of modern security, conventional closed-circuit
television (CCTV) systems are steadily becoming obsolete. These legacy
systems primarily rely on passive monitoring and continuous human
supervision, making them limited in their responsiveness and prone to
human error. The rise in demand for intelligent, automated surveillance
systems has led to the integration of advanced technologies such as
computer vision, machine learning, and real-time image processing.
Systems like Pro Vision represent a major leap in this direction by
incorporating a wide range of intelligent monitoring features that work
autonomously and respond to events as they occur.

One of the foundational elements in Pro Vision is the Structural
Similarity Index (SSIM), a perceptual metric used to compare the visual
similarity between two images. Unlike traditional methods that rely
solely on pixel-by-pixel comparison, SSIM analyzes three crucial
components of visual perception: luminance, contrast, and structure.
This approach allows the system to detect changes in a scene that may
otherwise go unnoticed in basic difference calculations. In surveillance
contexts, this is particularly useful for identifying when an object has
been removed or tampered with. For instance, when a valuable item
disappears from the camera's field of view, the system captures two
frames — one before and one after motion is detected — and applies
SSIM to determine if a significant structural change has occurred. The
use of the Python skimage library simplifies this complex computation,
offering high-level methods to apply SSIM in real-time monitoring
applications.
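As a minimal sketch of that comparison step (assuming two already-captured BGR frames; the 0.9 threshold is an illustrative value, not one prescribed by the system):

import cv2
from skimage.metrics import structural_similarity

def scene_changed(frame_before, frame_after, threshold=0.9):
    # SSIM is computed on grayscale versions of the two frames.
    g1 = cv2.cvtColor(frame_before, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(frame_after, cv2.COLOR_BGR2GRAY)
    # full=True also returns a per-pixel similarity map, useful for
    # locating exactly where the scene changed.
    score, diff_map = structural_similarity(g1, g2, full=True)
    return score < threshold, score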

Another core function within Pro Vision is face detection and
recognition, which allows the system not just to see but to understand
who is present in the monitored area. Face detection is implemented
using Haar Cascade Classifiers, a method introduced by Viola and Jones
that revolutionized real-time object detection. Haar cascades use simple
rectangular features trained through machine learning to rapidly scan
images and detect faces, even under varying conditions. Once a face is
detected, Pro Vision employs the Local Binary Pattern Histogram (LBPH)
algorithm to recognize the individual. LBPH works by dividing a face
image into smaller regions, computing binary patterns for each region,
and generating histograms that collectively serve as a facial signature.
This signature is then compared to a pre-trained dataset to identify
known individuals. The strength of LBPH lies in its robustness against
lighting variations and facial expressions, as well as its low
computational cost, making it ideal for real-time surveillance on devices
without high-end GPUs.
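The basic LBP operator at the heart of LBPH is easy to illustrate. The sketch below is a plain NumPy illustration of the idea rather than OpenCV's exact internals: each pixel's 3x3 neighbourhood is thresholded against the centre value to produce an 8-bit code, and histograms of these codes form the facial signature.

import numpy as np

def lbp_code(gray, r, c):
    # 8-bit local binary pattern for pixel (r, c) of a 2-D grayscale array.
    center = gray[r, c]
    # the eight neighbours, visited clockwise from the top-left
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if gray[r + dr, c + dc] >= center:
            code |= 1 << bit
    return code  # value in 0..255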

Motion detection is another essential feature integrated into Pro Vision,
offering the capability to monitor dynamic activity in the camera’s field
of view. The technique used here is frame differencing, one of the
simplest yet effective methods for motion analysis. By computing the
absolute pixel-wise difference between two consecutive frames, the
system can isolate areas where movement occurs. OpenCV’s contour
detection tools are then applied to highlight the regions of interest.
When motion is detected, these contours are outlined and labeled
within the video stream. This real-time feedback allows the system to
identify events such as intrusions, unauthorized movements, or
environmental changes. It also acts as a triggering mechanism for other
system responses, such as object monitoring or visitor tracking.
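Condensed to its essence (the full version appears in the motion module of Chapter 5), the frame-differencing step looks like this:

import cv2

def motion_regions(frame1, frame2, min_area=25):
    # absolute per-pixel difference between consecutive frames
    diff = cv2.absdiff(cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY),
                       cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY))
    _, thresh = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # bounding boxes of the regions that actually moved
    return [cv2.boundingRect(c) for c in contours
            if cv2.contourArea(c) > min_area]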

Among the more innovative capabilities of Pro Vision is its directional
visitor tracking feature. This function allows the system to determine
whether a person is entering or leaving a monitored space, using spatial
and temporal analysis of motion. By defining specific left and right zones
within the video frame, the system monitors the trajectory of movement.
If motion starts on the left side and ends on the right, the system
classifies the event as an "entry," and vice versa for an "exit." This simple
yet effective technique enables automated visitor counting and activity
logging without requiring biometric scanning or manual input. Such
functionality has been increasingly discussed in recent literature on
smart buildings and occupancy analytics, as it provides valuable data for
space usage and security management.
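A toy version of that zone logic is shown below; the 200/500 pixel thresholds are the ones used in the project's in_out module, and the entry/exit mapping follows the description above (the real module additionally keeps state across frames):

def classify_direction(start_x, end_x, left_edge=200, right_edge=500):
    # Classify a movement track from its first and last x-coordinates.
    if start_x < left_edge and end_x > right_edge:
        return "entry"  # started on the left, ended on the right
    if start_x > right_edge and end_x < left_edge:
        return "exit"   # started on the right, ended on the left
    return "unknown"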

The entire Pro Vision system is built on the Python programming
language, which was selected for its versatility, ease of use, and rich
ecosystem of libraries. Python has emerged as a dominant language in
computer vision and machine learning due to frameworks such as
OpenCV, NumPy, Pillow, and Tkinter. These tools collectively handle
tasks ranging from GUI rendering and image processing to mathematical
computations and real-time visual feedback. The integration of these
libraries allows for rapid prototyping and deployment across multiple
platforms, including Windows and Linux operating systems. Moreover,
Python’s open-source nature and active community support have
significantly reduced the development cycle and simplified debugging,
making it an ideal choice for both academic projects and commercial
applications.

From a broader perspective, the design choices in Pro Vision align with
current trends in smart surveillance. Research in the field has
increasingly emphasized the need for modular, automated, and locally-
processed security systems. While cloud-based surveillance offers
scalability, it introduces latency, privacy concerns, and dependency on
internet connectivity. In contrast, systems like Pro Vision, which process
data locally and store it securely on the device, offer faster response
times and greater control over sensitive information. This approach has
been highlighted in studies advocating edge computing for surveillance,
especially in privacy-sensitive or resource-constrained environments
such as homes and small businesses.

In conclusion, the literature strongly supports the methods and
technologies implemented in Pro Vision. The use of SSIM for structural
change detection, LBPH for efficient face recognition, and motion
analysis through frame differencing reflects a thoughtful integration of
reliable and well-supported computer vision techniques. These choices
ensure that Pro Vision delivers accurate, responsive, and meaningful
surveillance while remaining lightweight and accessible. As smart
surveillance continues to evolve, systems like Pro Vision provide a
practical foundation that can be extended with additional features such
as anomaly detection, night vision, and even deep learning capabilities
— offering a scalable path from academic prototype to real-world
deployment.

CHAPTER 3
SYSTEM REQUIREMENT
3.1 SOFTWARE REQUIREMENT

The software stack for Pro Vision is built on cross-platform tools and open-source
libraries to ensure maximum accessibility and flexibility. Below are the essential
software requirements:

Operating System: Windows

Programming Language: Python 3.x

Required Python Libraries:

OpenCV – for computer vision and camera handling

NumPy – for efficient array and numerical operations

skimage – for Structural Similarity Index and image comparison

Tkinter – for building the graphical user interface (GUI)

Pillow – for image display and processing in GUI

IDE/Text Editor: VS Code
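Assuming a standard Python 3 installation (Tkinter ships with the official installers), the third-party libraries can be installed with pip. Note that opencv-contrib-python is needed rather than plain opencv-python, because the LBPH recognizer used for face recognition lives in the cv2.face contrib module:

pip install opencv-contrib-python numpy scikit-image Pillow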

3.2 HARDWARE REQUIREMENT

The following hardware specifications are recommended:

Processor: Intel Core i5 or equivalent (Dual Core or higher)

RAM: Minimum 4 GB (8 GB recommended for smooth performance)

Storage: At least 1 GB free space for recorded data and images

Camera: In-built or external webcam with basic driver support

Display: Minimum 1366x768 resolution for GUI rendering

Optional: LED/Flashlight for low-light or night-time monitoring

CHAPTER 4
SYSTEM DESIGN
The system design of Pro Vision outlines how various components—both
hardware and software—interact to create a functional smart surveillance
solution. It includes the architectural flow, module design, and
component responsibilities that drive the core functionality of the system.

4.1 Block Diagram


The block diagram of Pro Vision represents the interaction between the
user interface, processing modules, hardware, and storage components.
At the core is the webcam that captures real-time video input, which is
processed through various modules based on the user’s selection via the
GUI. The results are either displayed in the interface or stored locally as
images or video recordings.

Block Diagram (described):

Input: Webcam Feed

GUI (Tkinter): Provides access to features like Monitor, Identify,
Noise, In/Out, Record

Processing Modules:

Motion Detection

SSIM Comparison

Face Detection & Recognition

Direction Analysis (In/Out)

Video Recording

Output:

Display (real-time GUI)

Image Storage (Screenshots for events)

Video Storage (Recordings)

Fig.1 BLOCK DIAGRAM
4.2 Architecture
Pro Vision follows a layered and modular architecture to separate
concerns and enhance maintainability. The architecture consists of the
following layers:

Presentation Layer: The user interface, developed using Tkinter, serves
as the access point for users to control surveillance operations.

Processing Layer: This layer contains all computer vision logic.
Depending on the selected feature, it handles real-time image processing,
frame differencing, face recognition, and similarity checks.

Data Layer: Responsible for storing outputs such as detected frames,
visitor snapshots, and recorded videos. This ensures evidence is logged
for later analysis.

Hardware Interaction Layer: Interfaces with the webcam and handles
frame capture using OpenCV.

The separation of logic into these layers allows for easy updates and
future integration with more advanced systems such as IoT modules or
deep learning algorithms.
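Concretely, this layering maps onto the project's file layout roughly as follows (module and folder names are taken from the source code in Chapter 5; note the motion-trigger module appears under both the names find_noise.py and find_motion.py in this report):

main.py        - presentation layer: the Tkinter GUI, one button per feature
find_noise.py  - find_motion(): motion trigger for the Monitor feature
spot_diff.py   - spot_diff(): SSIM comparison of before/after frames
identify.py    - face data collection, LBPH training and recognition
in_out.py      - directional visitor tracking
motion.py      - noise(): frame-differencing motion detection
rect_noise.py  - motion detection inside a user-selected rectangle
record.py      - timestamped video recording
haarcascade_frontalface_default.xml
persons/       - captured face samples used for training
stolen/        - frames saved by the Monitor feature
visitors/in/ and visitors/out/ - entry and exit snapshots
recordings/    - AVI recordings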

4.3 Component Description


Webcam: Captures continuous frames to be analyzed by the system.

Tkinter GUI: Allows users to initiate and interact with features using
buttons and icons.

Motion Detection Module: Detects movement using frame differencing
and contour analysis.

Monitor Module: Uses SSIM to compare frames and identify missing or
altered objects.

Face Recognition Module: Combines Haar cascade classifiers and the
LBPH algorithm to detect and recognize faces.

In/Out Module: Tracks direction of movement to identify room entries
and exits.

Recording Module: Records live video feed with timestamps and stores
it in AVI format.

Storage: Captures and stores event-specific images and full recordings
for evidence or review.

4.4 Model
The development of Pro Vision followed the Waterfall Model. This
model was chosen due to its simplicity and structured approach, making it
suitable for a compact, self-contained project like this. The phases
followed are:

Requirement Analysis: Deciding on the features and behavior of the
system.

Design: Creating architecture and deciding on algorithms (SSIM, LBPH,
etc.).

Implementation: Writing the code in Python using OpenCV, Tkinter,
and other libraries.

Testing: Verifying each module (monitoring, motion, recognition) across
multiple OS platforms.

Deployment: Running the complete GUI application and storing results.

Maintenance: Enhancing functionality and fixing minor bugs during
testing.

The Waterfall Model provided a clear development path and ensured that
every part of the system—from face recognition to recording—was tested
and working before moving to the next phase.

The Waterfall Model is a traditional software development methodology
that follows a linear and sequential approach. It divides the project life
cycle into distinct phases, where each phase must be completed before the
next one begins. For the Pro Vision smart surveillance system, the
Waterfall Model was selected due to its simplicity and suitability for
small to medium-sized projects where the requirements are well-
understood from the beginning.

Fig.2 WATERFALL MODEL

CHAPTER 5
CODING
The Pro Vision system was implemented in Python due to its simplicity,
portability, and extensive library support for computer vision and GUI
development. The system follows a modular programming approach,
where each feature is implemented as a separate Python module and
integrated into a unified GUI.

The main file (main.py) acts as the central controller, invoking different
functionalities like monitoring, noise detection, in/out tracking, face
identification, and recording based on user interaction through the GUI.
Below is an overview of each key module in the project.

5.1 Main.py

import tkinter as tk
import tkinter.font as font
from in_out import in_out
from motion import noise
from rect_noise import rect_noise
from record import record
from PIL import Image, ImageTk
from find_motion import find_motion
from identify import maincall

window = tk.Tk()
window.title("Smart cctv")
# The icon/image file names below are placeholders: the originals were lost during extraction.
window.iconphoto(False, tk.PhotoImage(file='icons/icon.png'))
window.geometry('1080x700')
frame1 = tk.Frame(window)

label_title = tk.Label(frame1, text="Smart cctv Camera")
label_font = font.Font(size=35, weight='bold', family='Helvetica')
label_title['font'] = label_font
label_title.grid(pady=(10, 10), column=2)

# Image.ANTIALIAS is named Image.LANCZOS in newer Pillow releases.
icon = Image.open('icons/icon.png')
icon = icon.resize((150, 150), Image.ANTIALIAS)
icon = ImageTk.PhotoImage(icon)
label_icon = tk.Label(frame1, image=icon)
label_icon.grid(row=1, pady=(5, 10), column=2)

btn1_image = Image.open('icons/btn1.png')
btn1_image = btn1_image.resize((50, 50), Image.ANTIALIAS)
btn1_image = ImageTk.PhotoImage(btn1_image)

btn2_image = Image.open('icons/btn2.png')
btn2_image = btn2_image.resize((50, 50), Image.ANTIALIAS)
btn2_image = ImageTk.PhotoImage(btn2_image)

btn5_image = Image.open('icons/btn5.png')
btn5_image = btn5_image.resize((50, 50), Image.ANTIALIAS)
btn5_image = ImageTk.PhotoImage(btn5_image)

btn3_image = Image.open('icons/btn3.png')
btn3_image = btn3_image.resize((50, 50), Image.ANTIALIAS)
btn3_image = ImageTk.PhotoImage(btn3_image)

btn6_image = Image.open('icons/btn6.png')
btn6_image = btn6_image.resize((50, 50), Image.ANTIALIAS)
btn6_image = ImageTk.PhotoImage(btn6_image)

btn4_image = Image.open('icons/btn4.png')
btn4_image = btn4_image.resize((50, 50), Image.ANTIALIAS)
btn4_image = ImageTk.PhotoImage(btn4_image)

btn7_image = Image.open('icons/btn7.png')
btn7_image = btn7_image.resize((50, 50), Image.ANTIALIAS)
btn7_image = ImageTk.PhotoImage(btn7_image)

# --------------- Buttons -------------------#

btn_font = font.Font(size=25)
btn1 = tk.Button(frame1, text='Monitor', height=90, width=180,
                 fg='green', command=find_motion, image=btn1_image, compound='left')
btn1['font'] = btn_font
btn1.grid(row=3, pady=(20, 10))

btn2 = tk.Button(frame1, text='Rectangle', height=90, width=180, fg='orange',
                 command=rect_noise, compound='left', image=btn2_image)
btn2['font'] = btn_font
btn2.grid(row=3, pady=(20, 10), column=3, padx=(20, 5))

btn3 = tk.Button(frame1, text='Noise', height=90, width=180, fg='green',
                 command=noise, image=btn3_image, compound='left')
btn3['font'] = btn_font
btn3.grid(row=5, pady=(20, 10))

btn4 = tk.Button(frame1, text='Record', height=90, width=180, fg='orange',
                 command=record, image=btn4_image, compound='left')
btn4['font'] = btn_font
btn4.grid(row=5, pady=(20, 10), column=3)

btn6 = tk.Button(frame1, text='In Out', height=90, width=180, fg='green',
                 command=in_out, image=btn6_image, compound='left')
btn6['font'] = btn_font
btn6.grid(row=5, pady=(20, 10), column=2)

btn5 = tk.Button(frame1, height=90, width=180, fg='red', command=window.quit,
                 image=btn5_image)
btn5['font'] = btn_font
btn5.grid(row=6, pady=(20, 10), column=2)

btn7 = tk.Button(frame1, text="identify", fg="orange", command=maincall,
                 compound='left', image=btn7_image, height=90, width=180)
btn7['font'] = btn_font
btn7.grid(row=3, column=2, pady=(20, 10))

frame1.pack()
window.mainloop()

5.2 find_noise.py

import cv2
from spot_diff import spot_diff
import time
import numpy as np

def find_motion():
    motion_detected = False
    is_start_done = False
    cap = cv2.VideoCapture(0)
    check = []
    print("waiting for 2 seconds")
    time.sleep(2)
    frame1 = cap.read()  # reference frame, kept as the full (ret, frame) tuple for spot_diff
    _, frm1 = cap.read()
    frm1 = cv2.cvtColor(frm1, cv2.COLOR_BGR2GRAY)

    while True:
        _, frm2 = cap.read()
        frm2 = cv2.cvtColor(frm2, cv2.COLOR_BGR2GRAY)

        # difference of consecutive frames isolates moving regions
        diff = cv2.absdiff(frm1, frm2)
        _, thresh = cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY)
        contors = cv2.findContours(thresh, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)[0]
        contors = [c for c in contors if cv2.contourArea(c) > 25]

        if len(contors) > 5:
            cv2.putText(thresh, "motion detected", (50, 50),
                        cv2.FONT_HERSHEY_SIMPLEX, 2, 255)
            motion_detected = True
            is_start_done = False
        elif motion_detected and len(contors) < 3:
            if not is_start_done:
                start = time.time()
                is_start_done = True
            end = time.time()
            print(end - start)
            # once the scene has been quiet for 4 seconds, compare the
            # before/after frames with SSIM to see what changed
            if (end - start) > 4:
                frame2 = cap.read()
                cap.release()
                cv2.destroyAllWindows()
                x = spot_diff(frame1, frame2)
                if x == 0:
                    print("running again")
                    return
                else:
                    print("found motion, sending mail")
                    return
        else:
            cv2.putText(thresh, "no motion detected", (50, 50),
                        cv2.FONT_HERSHEY_SIMPLEX, 2, 255)

        cv2.imshow("winname", thresh)
        _, frm1 = cap.read()
        frm1 = cv2.cvtColor(frm1, cv2.COLOR_BGR2GRAY)

        if cv2.waitKey(1) == 27:
            break
    return
5.3 SPOT_DIFF.PY

import cv2
import time
from skimage.metrics import structural_similarity
from datetime import datetime

def spot_diff(frame1, frame2):
    frame1 = frame1[1]  # unpack the (ret, frame) tuples from cap.read()
    frame2 = frame2[1]

    g1 = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)
    g1 = cv2.blur(g1, (2, 2))
    g2 = cv2.blur(g2, (2, 2))

    (score, diff) = structural_similarity(g2, g1, full=True)
    print("Image similarity:", score)
    diff = (diff * 255).astype("uint8")

    thresh = cv2.threshold(diff, 100, 255, cv2.THRESH_BINARY_INV)[1]
    contors = cv2.findContours(thresh, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)[0]
    contors = [c for c in contors if cv2.contourArea(c) > 50]

    if len(contors):
        for c in contors:
            x, y, w, h = cv2.boundingRect(c)
            cv2.rectangle(frame1, (x, y), (x + w, y + h), (0, 255, 0), 2)
    else:
        print("nothing stolen")
        return 0

    cv2.imshow("diff", thresh)
    cv2.imshow("win1", frame1)
    # timestamped file name; ':' and '%-' strftime codes are avoided for Windows compatibility
    cv2.imwrite("stolen/" + datetime.now().strftime('%y-%m-%d-%H-%M-%S') + ".jpg", frame1)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    return 1
5.4 Identify.py

import cv2
import os
import numpy as np
import tkinter as tk
import tkinter.font as font

def collect_data():
    name = input("Enter name of person : ")
    count = 1
    ids = input("Enter ID: ")
    cap = cv2.VideoCapture(0)
    filename = "haarcascade_frontalface_default.xml"
    cascade = cv2.CascadeClassifier(filename)

    while True:
        _, frm = cap.read()
        gray = cv2.cvtColor(frm, cv2.COLOR_BGR2GRAY)
        faces = cascade.detectMultiScale(gray, 1.4, 1)

        for x, y, w, h in faces:
            cv2.rectangle(frm, (x, y), (x + w, y + h), (0, 255, 0), 2)
            roi = gray[y:y + h, x:x + w]
            # save each face crop as name-count-id.jpg for later training
            cv2.imwrite(f"persons/{name}-{count}-{ids}.jpg", roi)
            count = count + 1
            cv2.putText(frm, f"{count}", (20, 20), cv2.FONT_HERSHEY_PLAIN,
                        2, (0, 255, 0), 3)
            cv2.imshow("new", roi)

        cv2.imshow("identify", frm)

        if cv2.waitKey(1) == 27 or count > 200:
            cv2.destroyAllWindows()
            cap.release()
            train()
            break

def train():
    print("training part initiated !")
    recog = cv2.face.LBPHFaceRecognizer_create()  # needs opencv-contrib-python
    dataset = 'persons'
    paths = [os.path.join(dataset, im) for im in os.listdir(dataset)]
    faces = []
    ids = []
    labels = []
    for path in paths:
        # file names follow the name-count-id.jpg convention used above
        fname = os.path.basename(path)
        labels.append(fname.split('-')[0])
        ids.append(int(fname.split('-')[2].split('.')[0]))
        faces.append(cv2.imread(path, 0))
    recog.train(faces, np.array(ids))
    recog.save('model.yml')  # model file name was elided in the original; 'model.yml' assumed
    return

def identify():
    cap = cv2.VideoCapture(0)
    filename = "haarcascade_frontalface_default.xml"

    paths = [os.path.join("persons", im) for im in os.listdir("persons")]
    labelslist = []
    for path in paths:
        if os.path.basename(path).split('-')[0] not in labelslist:
            labelslist.append(os.path.basename(path).split('-')[0])
    print(labelslist)

    recog = cv2.face.LBPHFaceRecognizer_create()
    recog.read('model.yml')
    cascade = cv2.CascadeClassifier(filename)

    while True:
        _, frm = cap.read()
        gray = cv2.cvtColor(frm, cv2.COLOR_BGR2GRAY)
        faces = cascade.detectMultiScale(gray, 1.4, 1)

        for x, y, w, h in faces:
            cv2.rectangle(frm, (x, y), (x + w, y + h), (0, 255, 0), 2)
            roi = gray[y:y + h, x:x + w]
            label = recog.predict(roi)  # returns (id, confidence)
            cv2.putText(frm, f"{labelslist[label[0]]}", (x, y),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 3)

        cv2.imshow("identify", frm)

        if cv2.waitKey(1) == 27:
            cv2.destroyAllWindows()
            cap.release()
            break

def maincall():
    root = tk.Tk()
    root.geometry("480x100")
    root.title("identify")

    label = tk.Label(root, text="Select below buttons ")
    label.grid(row=0, columnspan=2)
    label_font = font.Font(size=35, weight='bold', family='Helvetica')
    label['font'] = label_font
    btn_font = font.Font(size=25)

    button1 = tk.Button(root, text="Add Member ", command=collect_data,
                        height=2, width=20)
    button1.grid(row=1, column=0, pady=(10, 10), padx=(5, 5))
    button1['font'] = btn_font

    button2 = tk.Button(root, text="Start with known ", command=identify,
                        height=2, width=20)
    button2.grid(row=1, column=1, pady=(10, 10), padx=(5, 5))
    button2['font'] = btn_font

    root.mainloop()
    return

5.5 IN_OUT.PY

import cv2
from datetime import datetime

def in_out():
    cap = cv2.VideoCapture(0)

    right, left = "", ""

    while True:
        _, frame1 = cap.read()
        frame1 = cv2.flip(frame1, 1)
        _, frame2 = cap.read()
        frame2 = cv2.flip(frame2, 1)

        diff = cv2.absdiff(frame2, frame1)
        diff = cv2.blur(diff, (5, 5))
        gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
        _, threshd = cv2.threshold(gray, 40, 255, cv2.THRESH_BINARY)
        contr, _ = cv2.findContours(threshd, cv2.RETR_TREE,
                                    cv2.CHAIN_APPROX_SIMPLE)

        x = 300
        if len(contr) > 0:
            max_cnt = max(contr, key=cv2.contourArea)
            x, y, w, h = cv2.boundingRect(max_cnt)
            cv2.rectangle(frame1, (x, y), (x + w, y + h), (0, 255, 0), 2)
            cv2.putText(frame1, "MOTION", (10, 80),
                        cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 255, 0), 2)

        # remember on which side the motion started; when it crosses to the
        # other side, classify the event and save a snapshot
        if right == "" and left == "":
            if x > 500:
                right = True
            elif x < 200:
                left = True
        elif right:
            if x < 200:
                print("to left")
                x = 300
                right, left = "", ""
                cv2.imwrite(f"visitors/in/{datetime.now().strftime('%y-%m-%d-%H-%M-%S')}.jpg", frame1)
        elif left:
            if x > 500:
                print("to right")
                x = 300
                right, left = "", ""
                cv2.imwrite(f"visitors/out/{datetime.now().strftime('%y-%m-%d-%H-%M-%S')}.jpg", frame1)

        cv2.imshow("", frame1)

        k = cv2.waitKey(1)
        if k == 27:
            cap.release()
            cv2.destroyAllWindows()
            break
5.6 Motion.py

import cv2

def noise():
    cap = cv2.VideoCapture(0)

    while True:
        _, frame1 = cap.read()
        _, frame2 = cap.read()

        # difference of two consecutive frames isolates moving regions
        diff = cv2.absdiff(frame2, frame1)
        diff = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
        diff = cv2.blur(diff, (5, 5))

        _, thresh = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
        contr, _ = cv2.findContours(thresh, cv2.RETR_TREE,
                                    cv2.CHAIN_APPROX_SIMPLE)

        if len(contr) > 0:
            max_cnt = max(contr, key=cv2.contourArea)
            x, y, w, h = cv2.boundingRect(max_cnt)
            cv2.rectangle(frame1, (x, y), (x + w, y + h), (0, 255, 0), 2)
            cv2.putText(frame1, "MOTION", (10, 80),
                        cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 255, 0), 2)
        else:
            cv2.putText(frame1, "NO-MOTION", (10, 80),
                        cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 0, 255), 2)

        cv2.imshow("esc. to exit", frame1)

        if cv2.waitKey(1) == 27:
            cap.release()
            cv2.destroyAllWindows()
            break
5.7 FINDING NOISES IN RECTANGLE

import cv2

donel = False
doner = False
x1, y1, x2, y2 = 0, 0, 0, 0

def select(event, x, y, flag, param):
    global x1, x2, y1, y2, donel, doner
    # left click sets the top-left corner, right click the bottom-right
    if event == cv2.EVENT_LBUTTONDOWN:
        x1, y1 = x, y
        donel = True
    elif event == cv2.EVENT_RBUTTONDOWN:
        x2, y2 = x, y
        doner = True
    print(doner, donel)

def rect_noise():
    global x1, x2, y1, y2, donel, doner
    cap = cv2.VideoCapture(0)

    cv2.namedWindow("select_region")
    cv2.setMouseCallback("select_region", select)

    # first loop: show the live feed until the region has been selected
    while True:
        _, frame = cap.read()
        cv2.imshow("select_region", frame)

        if cv2.waitKey(1) == 27 or doner == True:
            cv2.destroyAllWindows()
            print("gone--")
            break

    # second loop: watch only the selected rectangle for motion
    while True:
        _, frame1 = cap.read()
        _, frame2 = cap.read()

        frame1only = frame1[y1:y2, x1:x2]
        frame2only = frame2[y1:y2, x1:x2]

        diff = cv2.absdiff(frame2only, frame1only)
        diff = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
        diff = cv2.blur(diff, (5, 5))

        _, thresh = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
        contr, _ = cv2.findContours(thresh, cv2.RETR_TREE,
                                    cv2.CHAIN_APPROX_SIMPLE)

        if len(contr) > 0:
            max_cnt = max(contr, key=cv2.contourArea)
            x, y, w, h = cv2.boundingRect(max_cnt)
            # offset the box back into full-frame coordinates
            cv2.rectangle(frame1, (x + x1, y + y1), (x + w + x1, y + h + y1),
                          (0, 255, 0), 2)
            cv2.putText(frame1, "MOTION", (10, 80),
                        cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 255, 0), 2)
        else:
            cv2.putText(frame1, "NO-MOTION", (10, 80),
                        cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 0, 255), 2)

        cv2.rectangle(frame1, (x1, y1), (x2, y2), (0, 0, 255), 1)
        cv2.imshow("esc. to exit", frame1)

        if cv2.waitKey(1) == 27:
            cap.release()
            cv2.destroyAllWindows()
            break
5.8 DATE AND TIME

import cv2
from datetime import datetime

def record():
    cap = cv2.VideoCapture(0)

    fourcc = cv2.VideoWriter_fourcc(*'XVID')
    out = cv2.VideoWriter(f'recordings/{datetime.now().strftime("%H-%M-%S")}.avi',
                          fourcc, 20.0, (640, 480))

    while True:
        _, frame = cap.read()

        # stamp the current date and time onto each frame
        cv2.putText(frame, datetime.now().strftime("%d-%m-%y %H:%M:%S"),
                    (50, 50), cv2.FONT_HERSHEY_COMPLEX, 0.6, (255, 255, 255), 2)

        out.write(frame)
        cv2.imshow("esc. to stop", frame)

        if cv2.waitKey(1) == 27:
            out.release()
            cap.release()
            cv2.destroyAllWindows()
            break
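One practical note on the modules above: cv2.imwrite and cv2.VideoWriter do not create directories, so the output folders used throughout (persons, stolen, visitors/in, visitors/out, recordings) must already exist. A small helper such as the following, our addition rather than part of the original scripts, can be called once at startup:

import os

def ensure_output_dirs():
    # cv2.imwrite simply returns False when the target folder is missing,
    # so create every output folder up front.
    for d in ("persons", "stolen", "visitors/in", "visitors/out", "recordings"):
        os.makedirs(d, exist_ok=True)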

Chapter-6
IMPLEMENTATION
The implementation of Pro Vision was carried out using Python,
leveraging libraries such as OpenCV for image processing, Tkinter for
GUI development, NumPy for numerical operations, and skimage for
structural similarity analysis. The system was designed in a modular
format to ensure scalability and ease of maintenance. Each core
functionality—monitoring, face recognition, motion detection, in/out
tracking, and video recording—was implemented as an independent
Python script and later integrated through a central graphical interface.
The GUI was designed to be user-friendly, featuring labeled buttons and
icons that allow users to trigger specific modules with a single click.
Features such as face detection were implemented using Haar cascade
classifiers, while face recognition was performed using the LBPH (Local
Binary Pattern Histogram) algorithm. Object monitoring was achieved by
comparing frames using the Structural Similarity Index (SSIM), and
direction-based tracking was coded by analyzing motion vectors across
frame boundaries. Once the backend logic was in place, the application
was tested across different platforms, including Windows and Linux, to
ensure platform independence and real-time performance.

6.1 Data Flow


In Pro Vision, data flow begins with the webcam feed, which
continuously captures real-time video frames. These frames are passed
into the system via the Graphical User Interface (GUI), where the user
selects the desired function such as Monitor, Noise Detection, Face
Identification, In/Out Tracking, or Video Recording.

Step 1: Data Capture


The webcam captures live video and sends individual frames to the
system in real-time using OpenCV.

Step 2: User Input via GUI


The user interacts with the Tkinter-based GUI, choosing one of the
surveillance features. This input controls which module processes the
video data.

Step 3: Data Processing in Modules
Based on the selected function:

Monitor Mode compares frames using SSIM to detect object removal.

Noise Detection uses frame differencing and contour detection for
motion.

Face Identification applies Haar cascades for detection and LBPH for
recognition.

In/Out Tracking analyzes the direction of movement based on object
position changes.

Recording Mode continuously writes video frames to a file with
timestamp overlays.

Step 4: Output Generation


After processing, the system either:

 Displays results in the GUI (e.g., face name, motion alert),
 Stores images (e.g., detected faces, motion snapshots), or
 Saves full video recordings to the system’s storage.

This structured flow ensures that data moves logically and efficiently
through each stage—from input acquisition to intelligent processing and
evidence storage—maintaining system responsiveness and reliability.

6.2 Modules And Functionality

For this project we have used various recent technologies, which are
explained in this chapter along with the reasons they were chosen.
We will divide this explanation of the technology based on the
modules/features in the project.
But first, let us look at the language used in this project.

Language Used:
We have used the Python language, as it is modern and comes with many
capabilities: we can do Machine Learning and Computer Vision, and also
build GUI applications with ease.

Reasons for selecting this language:

1 – Short and concise language.
2 – Easy to learn and use.
3 – Good technical support over the Internet.
4 – Many packages for different tasks.
5 – Runs on any platform.
6 – Modern and OOP language.
These are just the main points from our side; Python is a lot more
than this.

Some more notes about Python:
Python is a widely used general-purpose, high-level programming
language. It was created by Guido van Rossum in 1991 and further
developed by the Python Software Foundation. It was designed with an
emphasis on code readability, and its syntax allows programmers to
express their concepts in fewer lines of code. Python is a programming
language that lets you work quickly and integrate systems more
effectively. There are two major Python versions: Python 2 and Python 3.
Some specific features of Python are as follows:

 An interpreted (as opposed to compiled) language. Contrary to e.g.
C or Fortran, one does not compile Python code before executing it.
In addition, Python can be used interactively: many Python
interpreters are available, from which commands and scripts can be
executed.
 A free software released under an open-source license: Python can
be used and distributed free of charge, even for building
commercial software.
 Multi-platform: Python is available for all major operating
systems, Windows, Linux/Unix, MacOS X, most likely your
mobile phone OS, etc.
 A very readable language with clear non-verbose syntax
 A language for which a large variety of high-quality packages are
available for various applications, from web frameworks to
scientific computing.
 A language very easy to interface with other languages, in
particular C and C++.
 Some other features of the language are illustrated just below. For
example, Python is an object-oriented language, with dynamic
typing (the same variable can contain objects of different types
during the course of a program).
Below are the different features which can be performed using this
project:
1. Monitor
2. Identify the family member
3. Detect for Noises
4. Visitors in room detection

So let us see each feature one by one.

Monitor Feature:
This feature is used to find which object has been stolen from
the frame visible to the webcam. It constantly monitors the
frames and checks which object in the frame has been taken
away by a thief.

This uses Structural Similarity to find the differences between two frames.
The two frames are captured: the first before the noise (motion) happened,
and the second after the noise stopped happening in the frame.

SSIM is used as a metric to measure the similarity between two given
images. As this technique has been around since 2004, a lot of material
exists explaining the theory behind SSIM, but very few resources go deep
into the details, especially for a gradient-based implementation, as SSIM
is often used as a loss function.

The Structural Similarity Index (SSIM) metric extracts 3 key features
from an image:
 Luminance
 Contrast
 Structure

The comparison between the two images is performed on the basis of
these 3 features.
This system calculates the Structural Similarity Index between 2 given
images which is a value between -1 and +1. A value of +1 indicates that
the 2 given images are very similar or the same while a value of -1
indicates the 2 given images are very different. Often these values are
adjusted to be in the range [0, 1], where the extremes hold the same
meaning.

Luminance: Luminance is measured by averaging over all the pixel
values. It is denoted by μ (mu).

Contrast: Contrast is measured by taking the standard deviation of the
pixel values. It is denoted by σ (sigma).

Structure: The structural comparison is done by using a consolidated
formula, but in essence we divide the input signal by its standard
deviation, so that the result has unit standard deviation, which allows
for a more robust comparison.
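Following the standard definitions from Wang et al. (2004) (Reference 1), for an image window $x$ with $N$ pixels:

$$\mu_x = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \sigma_x = \left(\frac{1}{N-1}\sum_{i=1}^{N}(x_i-\mu_x)^2\right)^{1/2}$$

and the combined index for two windows $x$ and $y$ is

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$$

where $\sigma_{xy}$ is the covariance of the two windows and $C_1$, $C_2$ are small constants that stabilize the division when the denominators approach zero.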

Luckily, thanks to the skimage package in Python, we do not have to
replicate all of this mathematical calculation ourselves, since skimage
has a pre-built function that does all of these tasks for us.

We just have to feed in the two images/frames captured earlier, and it
gives us back the masked difference image along with the score.

Identify the Family Member Feature

This is a very useful feature of our project. It is used to find out
whether the person in the frame is known or not. It does this in two steps:

1 – Find the faces in the frames.
2 – Use the LBPH face recognizer algorithm to predict the person from an
already trained model.

So let us divide this into the following categories.

1 – Detecting faces in the frames

This is done via Haar cascade classifiers, which are built into
the openCV module of Python.
A cascade classifier, namely a cascade of boosted classifiers working
with Haar-like features, is a special case of ensemble learning called
boosting. It typically relies on AdaBoost classifiers (and other models
such as Real AdaBoost, Gentle AdaBoost or LogitBoost).

Cascade classifiers are trained on a few hundred sample images that
contain the object we want to detect, and on other images that do not
contain it.

There are some common features that we find on most human faces:
 a dark eye region compared to the upper cheeks
 a bright nose-bridge region compared to the eyes
 specific locations of the eyes, mouth and nose

These characteristics are called Haar features. Haar features are similar
to convolution kernels, which are used to detect the presence of a given
feature in the image. For doing all of this, the openCV module in Python
has an in-built class called CascadeClassifier, which we have used to
detect faces in the frame.
2 – Using LBPH for face recognition

Now that we have detected faces in the frame, it is time to identify
them and check whether they are in the dataset we used to train our
LBPH model.

The LBPH uses 4 parameters:


 Radius: the radius is used to build the circular local binary pattern
and represents the radius around the central pixel. It is usually set
to 1.
 Neighbors: the number of sample points to build the circular local
binary pattern. Keep in mind: the more sample points you include,
the higher the computational cost. It is usually set to 8.
 Grid X: the number of cells in the horizontal direction. The more
cells, the finer the grid, the higher the dimensionality of the
resulting feature vector. It is usually set to 8.
 Grid Y: the number of cells in the vertical direction. The more
cells, the finer the grid, the higher the dimensionality of the
resulting feature vector. It is usually set to 8.
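In OpenCV's contrib module these four parameters are passed directly when the recognizer is created; a minimal sketch using the usual default values described above:

import cv2

# requires the opencv-contrib-python package
recog = cv2.face.LBPHFaceRecognizer_create(
    radius=1,     # radius of the circular local binary pattern
    neighbors=8,  # sample points on the circle
    grid_x=8,     # horizontal cells in the histogram grid
    grid_y=8,     # vertical cells in the histogram grid
)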

The first computational step of LBPH is to create an intermediate
image that describes the original image in a better way, by highlighting
the facial characteristics. To do so, the algorithm uses the concept of a
sliding window, based on the parameters radius and neighbors.

Extracting the Histograms: Now, using the image generated in the last
step, we can use the Grid X and Grid Y parameters to divide the image
into multiple grids. A histogram is extracted from each grid cell, and the
concatenated histograms form the face's signature.

After all of this the model is trained; later, when we want to make
predictions, the same steps are applied to the new image and its
histograms are compared with the already trained model. This is how the
feature works.
3 – Detect for Noises in the frame

This feature is used to find the noises (motion) in the frames. This
is something you would find in most CCTVs, but in this module we will
see how it works.

Putting it simply, all the frames are continuously analyzed and checked
for noise. Noise is checked across consecutive frames: we take the
absolute difference between the two frames, analyze the difference image,
and detect contours (the boundaries of the motion). If there are no
boundaries there is no motion, and if there are any, there is motion.
Every pixel carries a brightness value, and the differencing is performed
pixel by pixel.

We simply take the absolute difference, because a negative difference
would make no sense at all.
4 – Visitors in room detection

This is the feature which can detect whether someone has entered the
room or gone out.

It works using the following steps:

1 – It first detects noise (motion) in the frame.
2 – If any motion happens, it finds from which side it happened,
either left or right.
3 – Last, it checks: if motion started from the left and ended on the
right, it is detected as an entry and the frame is captured, and vice
versa for an exit.

So there is no complex mathematics going on in this specific feature.

Basically, to know from which side the motion happened, we first detect
the motion, then draw a rectangle over the noise, and the last step is to
check the coordinates: if those points lie on the left side, it is
classified as left motion.
6.3 USER INTERFACE

# Feature 1 – Monitor
Shown here is the use of the first feature; you can consider it the
output from feature 1. As you can see, it detects that the speaker has
been stolen, which is true.

# Feature 2 – Noise Detection
This is the captured output for NO-MOTION and MOTION being detected by
the application.

# Feature 3 – In/Out Detection
It has detected me entering the room, classified the event as entered,
and saved the image locally.

# Feature 4 – Face Identification
Since I have trained my model for sidhu, it predicts sidhu correctly.
It does not always predict correctly; sometimes it makes bad predictions
as well.
CHAPTER 7

Future Scope

As technology continues to advance, surveillance systems are evolving
from basic video capture tools into sophisticated, intelligent security
solutions. Pro Vision is built on a modular, scalable architecture that
positions it well for future enhancement and expansion. While the
current implementation provides efficient motion detection, face
recognition, and monitoring features, there is significant potential to
further develop the system to meet the rising demands of modern
security challenges.

One of the most promising directions for future improvement is the
development of portable CCTV systems. With the increasing availability
of compact single-board computers like Raspberry Pi and Jetson Nano, it
is now feasible to run lightweight versions of Pro Vision on small,
battery-powered devices. These portable units can be deployed in
remote or temporary locations where traditional CCTV infrastructure is
impractical. Combined with wireless connectivity and local data
processing, these devices could provide on-the-go security without
reliance on constant internet or power supply.

Another important area for enhancement is the integration of night
vision capability. Most basic webcams and consumer cameras struggle
in low-light environments, limiting their effectiveness during nighttime.
Future versions of Pro Vision could incorporate infrared (IR) sensors or
thermal imaging cameras to enable 24/7 surveillance regardless of
lighting conditions. In addition, implementing low-light image
enhancement algorithms can significantly improve visibility and accuracy
during poor illumination, allowing the system to detect objects and faces
even in complete darkness.

As computational power continues to improve, Pro Vision can be
enhanced by incorporating deep learning (DL) models to unlock more
advanced features. Currently, the system relies on classical computer
vision methods that are efficient and fast, but limited in complexity.
Deep learning models, particularly Convolutional Neural Networks
(CNNs), can provide significantly better accuracy in tasks like object
detection, face recognition, and anomaly identification. With DL
integration, Pro Vision could support features such as:

 Deadly weapon detection, which would identify firearms or knives in
real-time video streams.
 Accident detection, useful in public spaces or transportation hubs to
detect collisions, falls, or injuries.
 Fire and smoke detection, enabling the system to act as a basic fire
alarm when flames or smoke are visually detected.
 Suspicious behavior analysis, where DL models track posture,
movement speed, or activity duration to flag unusual patterns.

These features would transform Pro Vision into a more intelligent and
context-aware system, capable of not just recording events, but
understanding and responding to them in real-time.

Furthermore, efforts can be made to convert the system into a fully
standalone application, eliminating the need for Python installation or
manual setup. By packaging the project using tools like PyInstaller,
Electron-Python, or cross-platform frameworks, the entire application
can be distributed as an executable file (.exe for Windows, .app for
macOS), ready to run with a single click. This would greatly enhance
usability for non-technical users and make the system more appealing
for deployment in offices, schools, and homes.
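For example, a one-file build with PyInstaller could look like the following (assuming main.py is the entry point; data files such as the icons and the Haar cascade XML must be bundled with --add-data or shipped alongside the executable):

pyinstaller --onefile --windowed main.py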

On a more advanced level, the project could evolve into a standalone
surveillance device. This would involve creating custom hardware with
an embedded operating system, integrated camera module, and pre-
installed software. With the addition of a small touchscreen or wireless
interface, users could operate Pro Vision without needing an external
computer. These compact surveillance boxes could serve as plug-and-
play security units, configurable via mobile or desktop apps, and
networked together for large-scale deployments.

The growing field of Internet of Things (IoT) opens further possibilities
for remote monitoring and control. A future version of Pro Vision could
be linked to a cloud dashboard, enabling users to monitor multiple
locations through a centralized interface. Real-time alerts could be
pushed to smartphones through apps or SMS when anomalies like
intrusions or fire are detected. Although the current version focuses on
local processing, future iterations could offer cloud sync as an optional
add-on for those who require remote access and backup.

In addition to technological enhancements, attention should be paid to
privacy, security, and legal compliance. As surveillance systems grow
more intelligent, ensuring that they operate ethically and protect user
data becomes critical. Future updates could include encrypted storage,
access control via authentication, and user activity logging. Moreover,
the system could include customizable privacy zones, where specific
areas of the camera feed are ignored or blurred to respect privacy
concerns in shared spaces.

Finally, Pro Vision holds strong potential as an educational tool. With
modular code, rich visual feedback, and real-time interaction, it can be
used to teach students and developers about computer vision, GUI
development, and real-world AI applications. Adding documentation,
training datasets, and even a graphical setup wizard could help make Pro
Vision a part of classroom labs or developer toolkits.

In summary, Pro Vision is not just a project: it is a platform. With future
upgrades such as deep learning integration, portable device deployment,
night vision support, and standalone application packaging, it can grow
into a powerful, commercial-grade smart surveillance solution. As
technology advances, the opportunities for expanding the system’s
capabilities will only increase, making Pro Vision a forward-looking
investment in security, innovation, and accessibility.

Chapter 8
Conclusion

The Pro Vision surveillance system was conceptualized and developed as
a response to the increasing need for smarter, more autonomous security
systems that reduce the burden of human monitoring while providing
real-time insights. As traditional CCTV systems often lack the ability to
detect, respond to, or analyze ongoing activity, Pro Vision fills a crucial
gap by integrating multiple intelligent features within a lightweight, user-
friendly application. The project stands as a successful demonstration of
how modern computer vision techniques can transform conventional
video surveillance into a proactive security solution.

Throughout the development process, a strong focus was placed on
modularity, ease of use, and real-time performance. Built using Python
and key libraries such as OpenCV, NumPy, Tkinter, and scikit-image, the
system showcases the power of open-source technologies to create
complex applications without the need for expensive proprietary tools.
Each core feature — motion detection, face recognition, object
monitoring, directional tracking, and video recording — was designed as
an independent module and integrated through a central graphical
interface. This structure not only enhanced maintainability and flexibility
but also ensured that each function could be developed and tested in
isolation, reducing the risk of bugs or performance issues in the overall
application.

One of the most valuable aspects of Pro Vision is its ability to detect and
react to environmental changes. The use of the Structural Similarity Index
(SSIM) allows the system to recognize subtle differences in image
content, making it possible to detect theft or tampering by comparing
frames captured before and after movement is detected. Meanwhile, the
motion detection algorithm based on frame differencing and contour
analysis enables the system to alert users to dynamic activity in its field
of view. These capabilities allow the system to act not only as a recorder
but as a smart observer capable of identifying potential threats in real
time.

Another key achievement of the system is the integration of face
detection and recognition using Haar cascade classifiers and the LBPH
(Local Binary Pattern Histogram) algorithm. This feature enables the
application to identify known individuals and log entry events accurately.

The inclusion of directional tracking further expands its usefulness,
particularly in access-controlled areas or rooms where occupancy
tracking is important. Together, these features transform the system from
a passive video recorder into an interactive security assistant that logs,
identifies, and organizes surveillance data for easy review and analysis.

From a development perspective, Pro Vision was also a valuable learning
experience in software design, modular coding, GUI development, and
applied artificial intelligence. It presented opportunities to explore best
practices in user interface design, cross-platform compatibility, code
optimization, and real-time video processing. By building and testing the
application on both Windows and Linux operating systems, the team
ensured the system would be accessible to a wide range of users and
hardware configurations. The application’s performance on basic
hardware platforms shows that effective surveillance need not require
high-end infrastructure — a modest computer and webcam are sufficient
to run the complete system efficiently.

Moreover, Pro Vision holds significant promise for future extension. Its
architecture was intentionally designed to support the integration of more
advanced features as technology evolves. Potential enhancements include
deep learning–based object detection, real-time behavioral analysis,
integration with IoT devices for smart home automation, and even mobile
or cloud connectivity for remote monitoring. These additions could
elevate Pro Vision from a basic surveillance assistant to a comprehensive
security platform suitable for deployment in offices, campuses, homes, or
public environments. The modular codebase and clean UI further support
the idea of Pro Vision as an educational tool, allowing students and
developers to learn from and build upon the existing system.

The successful implementation of Pro Vision demonstrates that practical,
intelligent security solutions can be achieved using accessible and well-
documented tools. It also reinforces the idea that innovation does not
always require expensive hardware or complex cloud services — well-
designed software, when backed by strong logic and thoughtful
integration, can significantly improve everyday technologies. In addition,
the system provides a foundation for further academic or commercial
development, acting as a stepping stone toward more complex
surveillance platforms powered by artificial intelligence and edge
computing.

In conclusion, Pro Vision achieves its core objective of transforming
traditional CCTV into a smart, responsive, and user-friendly surveillance
system. It effectively merges multiple technologies into a cohesive
product that is as practical as it is educational. The success of this project
underscores the value of modular software development, the potential of
open-source libraries, and the impact of automation in the field of
security. With its strong foundation and future-ready design, Pro Vision is
not just a complete project—it is the beginning of a much larger
opportunity in smart surveillance innovation.

References
1. Wang, Z., Bovik, A. C., Sheikh, H. R., & Simoncelli, E. P. (2004). Image quality
assessment: From error visibility to structural similarity. IEEE Transactions on
Image Processing, 13(4), 600–612.
2. Viola, P., & Jones, M. (2001). Rapid object detection using a boosted cascade of
simple features. Proceedings of the 2001 IEEE Computer Society Conference on
Computer Vision and Pattern Recognition (CVPR), 1, I-511–I-518.
3. Ahonen, T., Hadid, A., & Pietikäinen, M. (2006). Face description with local binary
patterns: Application to face recognition. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 28(12), 2037–2041.
4. Bradski, G. (2000). The OpenCV library. Dr. Dobb's Journal of Software Tools.
5. Shapiro, L., & Stockman, G. (2001). Computer Vision. Prentice Hall.
6. Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection.
In Proceedings of the IEEE Computer Society Conference on Computer Vision and
Pattern Recognition (CVPR), 1, 886–893.
7. Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You Only Look Once:
Unified, real-time object detection. Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), 779–788.
8. Zhang, K., Zhang, Z., Li, Z., & Qiao, Y. (2016). Joint face detection and alignment
using multitask cascaded convolutional networks. IEEE Signal Processing Letters,
23(10), 1499–1503.
9. King, D. E. (2009). Dlib-ml: A machine learning toolkit. Journal of Machine
Learning Research, 10, 1755–1758.
10. Russakovsky, O., Deng, J., Su, H., Krause, J., et al. (2015). ImageNet Large Scale
Visual Recognition Challenge. International Journal of Computer Vision, 115(3),
211–252.
11. Huang, G., Liu, Z., van der Maaten, L., & Weinberger, K. Q. (2017). Densely
Connected Convolutional Networks. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), 4700–4708.
12. Szeliski, R. (2010). Computer Vision: Algorithms and Applications. Springer.
13. Rosebrock, A. (2020). Practical Python and OpenCV + Case Studies.
PyImageSearch.
14. OpenCV Development Team. (2023). OpenCV-Python Tutorials.
15. van Rossum, G., & Drake, F. L. (2009). Python 3 Reference Manual. CreateSpace.
16. Lundh, F. (2001). An Introduction to Tkinter. Pythonware.

17. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., et al. (2011). Scikit-learn:
Machine Learning in Python. Journal of Machine Learning Research, 12, 2825–2830.
18. Brownlee, J. (2022). Introduction to Computer Vision in Python. Machine Learning
Mastery.
19. Singh, A., & Sharma, R. (2021). Smart Surveillance System Using OpenCV and
Python. International Journal of Computer Applications, 174(10), 8–14.
20. Raut, A., & Jadhav, N. (2020). IoT Based Smart Surveillance System Using
Raspberry Pi. International Research Journal of Engineering and Technology
(IRJET), 7(4), 2399–2404.
21. Jain, R., & Gupta, A. (2022). An Efficient CCTV System Using Face Recognition and
Object Tracking. International Journal of Engineering Research & Technology
(IJERT), 11(02), 452–456.
22. Chandra, R., & Bose, D. (2019). AI-Powered Security Systems: Trends and
Challenges. ACM Computing Surveys, 52(5), Article 87.
23. Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.
24. Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with
deep convolutional neural networks. Advances in Neural Information Processing
Systems, 25, 1097–1105.
25. Anwar, S., & Raychowdhury, A. (2020). Edge AI: Machine learning at the edge in
IoT. Proceedings of the IEEE, 108(12), 2561–2574.
26. Zhao, Z., Zheng, P., Xu, S., & Wu, X. (2019). Object Detection With Deep Learning:
A Review. IEEE Transactions on Neural Networks and Learning Systems, 30(11),
3212–3232.
27. Anil, K., & Lavanya, R. (2019). A Review on Real Time Video Surveillance System
Using Deep Learning. International Journal of Innovative Technology and Exploring
Engineering, 8(11), 3540–3543.
28. Chakraborty, T., & Saha, A. (2020). A Comparative Study on Face Recognition
Algorithms. International Journal of Advanced Computer Science and Applications,
11(4), 215–220.
29. Rajalakshmi, K., & Sangeetha, R. (2021). Real-Time Object Detection Using
OpenCV and Python. International Journal of Engineering and Technology (IJET),
13(1), 45–4.

Appendix

The appendix includes supplementary materials that support the development and
understanding of the Pro Vision system. It contains source code snippets, screenshots of
the user interface, sample output images from different modules (such as motion
detection, face recognition, and in/out tracking), and configuration details for software
dependencies. These artifacts provide additional insight into the technical implementation
and help validate the functionality of each component. The appendix also includes
references to third-party libraries used during development, such as OpenCV, NumPy,
and Tkinter, along with their installation instructions, ensuring the system can be
replicated or extended by other developers or researchers.
