
MOTORCYCLISTS HELMET DETECTION

Major Project submitted in partial fulfillment of the requirement for the award of the
degree of

BACHELOR OF TECHNOLOGY

IN

COMPUTER SCIENCE AND ENGINEERING

Under the esteemed guidance of

Dr. Madhuri Gupta


Associate Professor

By

G.V.MYTHILI (20R11A0517)
AFIFA NILOUFER (20R11A0503)
B.V.V. SURYA VINAY (20R11A0507)

Department of Computer Science and Engineering


Accredited by NBA

Geethanjali College of Engineering and Technology


(UGC Autonomous)
(Affiliated to J.N.T.U.H, Approved by AICTE, New Delhi)
Cheeryal (V), Keesara (M), Medchal.Dist.-501 301.

April-2024
Geethanjali College of Engineering & Technology
(UGC Autonomous)
(Affiliated to JNTUH, Approved by AICTE, New Delhi)
Cheeryal (V), Keesara(M), Medchal Dist.-501 301.

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


Accredited by NBA

CERTIFICATE

This is to certify that the B.Tech Major Project report entitled
“MOTORCYCLISTS HELMET DETECTION” is a bonafide work done by
G.V. MYTHILI (20R11A0517), AFIFA NILOUFER (20R11A0503), and B.V.V.
SURYA VINAY (20R11A0507), in partial fulfillment of the requirements for the
award of the degree of Bachelor of Technology in “Computer Science and
Engineering” from Jawaharlal Nehru Technological University, Hyderabad, during the
year 2023-2024.

Internal Guide HOD - CSE

Dr. Madhuri Gupta Dr. A. Sree Lakshmi


Associate Professor Professor

External Examiner
Geethanjali College of Engineering & Technology
(UGC Autonomous)
(Affiliated to JNTUH Approved by AICTE, New Delhi)
Cheeryal (V), Keesara(M), Medchal Dist.-501 301.

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


Accredited by NBA

DECLARATION BY THE CANDIDATE

We, G.V. MYTHILI (20R11A0517), AFIFA NILOUFER (20R11A0503), and
B.V.V. SURYA VINAY (20R11A0507), hereby declare that the project report entitled
“MOTORCYCLISTS HELMET DETECTION”, done under the guidance
of Dr. Madhuri Gupta, Associate Professor, Department of Computer Science and
Engineering, Geethanjali College of Engineering and Technology, is submitted in partial
fulfillment of the requirements for the award of the degree of Bachelor of Technology in
Computer Science and Engineering.

This is a record of bonafide work carried out by us in Geethanjali College of Engineering


and Technology and the results embodied in this project have not been reproduced or
copied from any source. The results embodied in this project report have not been
submitted to any other University or Institute for the award of any other degree or diploma.

G.V. MYTHILI (20R11A0517),
AFIFA NILOUFER(20R11A0503),
B.V.V.SURYA VINAY (20R11A0507),
Department of CSE, Geethanjali College of Engineering and Technology, Cheeryal.
ACKNOWLEDGEMENT

We would like to express our sincere thanks to Dr. A. Sree Lakshmi, Professor, Head of
the Department of Computer Science, Geethanjali College of Engineering and Technology,
Cheeryal, whose motivation in the field of software development helped us overcome
all hardships during the course of study and complete the project successfully.

We would like to express our profound sense of gratitude to all those who helped us in
completing this dissertation. We would like to express our heartfelt gratitude and sincere
thanks to our guide Dr. Madhuri Gupta, Associate Professor, Department of Computer
Science, Geethanjali College of Engineering and Technology, Cheeryal, for her skillful
guidance, timely suggestions and encouragement in completing this project.

We would like to express our sincere gratitude to our Principal Prof. Dr. S. Udaya Kumar
for providing the necessary infrastructure to complete our project. We are also thankful to
our Secretary Mr. G.R. Ravinder Reddy for providing an interdisciplinary & progressive
environment.

Finally, we would like to express our heartfelt thanks to our parents who were very
supportive both financially and mentally and for their encouragement to achieve our set
goals.

G.V. MYTHILI(20R11A0517),

AFIFA NILOUFER(20R11A0503),

B.V.V.SURYA VINAY(20R11A0507).

ABSTRACT
This major project endeavors to design and implement an advanced helmet detection
system utilizing the YOLOv8 (You Only Look Once version 8) object detection
framework. The primary objective is to enhance road safety by detecting non-compliant
helmet usage among bike and cycle riders in various road environments.

The project begins with the collection of a diverse dataset comprising images captured
from different road scenarios. These images are meticulously annotated using LabelImg,
with bounding boxes delineating the presence of helmets on bike and cycle riders.
Concurrently, YAML files are generated to organize the dataset and specify the paths to the
annotated images.
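A YOLOv8 dataset YAML of this kind is small; the sketch below shows its usual shape. The directory names and class labels here are illustrative assumptions, not the project's actual files:

```yaml
# Hypothetical dataset config for YOLOv8 (paths and names are illustrative)
path: datasets/helmet      # dataset root directory
train: images/train        # training images, relative to 'path'
val: images/val            # validation images, relative to 'path'
names:
  0: helmet
  1: no_helmet
```

Each image in `images/train` is paired with a same-named `.txt` label file holding the bounding boxes drawn in LabelImg.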

Subsequently, the annotated dataset is utilized to train the YOLOv8 model, leveraging
transfer learning techniques to adapt the pre-trained model to the task of helmet detection.
The training process involves optimizing the model's parameters based on annotated
examples, with a focus on achieving high accuracy and robustness in real-world scenarios.
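In YOLO format, LabelImg stores each bounding box as one text line, `class cx cy w h`, with all coordinates normalized to [0, 1]. A minimal decoding sketch (the helper name is our own; the line format itself is the standard YOLO annotation format):

```python
def yolo_label_to_pixels(line, img_w, img_h):
    """Convert one YOLO-format label line ('class cx cy w h',
    normalized to [0, 1]) into (class_id, x1, y1, x2, y2)
    in pixel coordinates for an img_w x img_h image."""
    cls, cx, cy, w, h = line.split()
    cx, cy = float(cx) * img_w, float(cy) * img_h   # box centre in pixels
    w, h = float(w) * img_w, float(h) * img_h       # box size in pixels
    x1, y1 = cx - w / 2, cy - h / 2                 # top-left corner
    return int(cls), int(x1), int(y1), int(x1 + w), int(y1 + h)

# A centred box covering half the frame in each dimension:
print(yolo_label_to_pixels("0 0.5 0.5 0.5 0.5", 640, 480))  # (0, 160, 120, 480, 360)
```

Because the stored coordinates are normalized, the same label file remains valid if the images are resized during training.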

The YOLO model identifies all the objects in an input image by passing the image
through a single convolutional neural network only once. Unlike two-stage detectors,
it does not first propose potential object locations with one neural network and then
use a second network to verify whether an object is present in each proposed region.

As a result, YOLO is very fast compared with other computer-vision models, and its
efficient architecture allows real-time detection. For these reasons, we use the YOLO
model for helmet detection in our project.
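The speed argument can be made concrete by counting forward passes. A classic two-stage detector such as the original R-CNN ran its classifier once per region proposal (roughly 2,000 selective-search proposals per image), while a single-stage detector makes one pass per image. The counts below are an illustration of that contrast, not a benchmark:

```python
def two_stage_passes(num_proposals):
    """One pass to generate proposals, plus one classifier pass
    per proposed region (classic R-CNN style)."""
    return 1 + num_proposals

def yolo_passes():
    """A single convolutional pass predicts all boxes at once."""
    return 1

# With ~2000 region proposals, typical of classic R-CNN:
print(two_stage_passes(2000), "vs", yolo_passes())  # 2001 vs 1
```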

The anticipated outcomes of this major project include the development of a reliable and
efficient helmet detection system capable of real-time deployment for road safety
monitoring. The system is expected to provide valuable insights for law enforcement
agencies, policymakers, and transportation authorities, aiding in the enforcement of helmet
usage regulations and the reduction of head injuries among bike riders.

LIST OF FIGURES

S.No Diagram Name Page No

1 YOLOV8 Architecture 17

2 System Architecture 18

3 Use Case Diagram 20

4 Class Diagram 21

5 State chart Diagram 23

6 Activity Diagram 24

7 Preparing Dataset Using LabelImg 28

8 Dataset 28

9 Corresponding Coordinates to helmet_0 29

10 Corresponding Coordinates to helmet_1 29

11 Output Screen-1 37

12 Output Screen-2 37

13 Output Screen-3 38

14 Output Screen-4 38

15 Plagiarism checker 45

LIST OF TABLES

S.No Table Name Page No

1 Test cases 35

LIST OF ABBREVIATIONS

S.No Acronym Abbreviation

1 ML Machine Learning

2 CNN Convolutional Neural Network

3 YOLOV8 You Only Look Once Version 8

TABLE OF CONTENTS

S.No Contents Page no


Abstract ii
List of Figures iii
List of Tables iv
List of Symbols & Abbreviations v
1. Introduction

1.1 About the project 1

1.2 Objective 2

2. System Analysis

2.1 Existing System 3

2.2 Proposed System 3

2.3 Feasibility Study

2.3.1 Details 4

2.3.2 Impact on Environment 4

2.3.3 Safety 4

2.3.4 Ethics 5

2.3.5 Cost 5

2.3.6 Type 5

2.4 Scope of the Project 6

2.5 Modules 7

2.6 System Configuration 8

3. Literature Overview 9

4. System Design

4.1 System Architecture 17

4.2 UML Diagrams

4.2.1 Use Case Diagram 20


4.2.2 Class Diagram 21
4.2.3 State chart Diagram 23
4.2.4 Activity Diagram 24

4.3 System Design

4.3.1 Modular Design 26


4.3.2 Database Design 27

5. Implementation

5.1 Implementation 30

5.2 Sample code 31

6. Testing

6.1 Testing 35

6.2 Test cases 35

7. Output Screens 37

8. Conclusion

8.1 Conclusion 39

8.2 Further Enhancements 40

9. Bibliography

9.1 Books References 42

9.2 Websites References 42

9.3 Technical Publication References 42

10 Appendices

A. SW used 43

B. Methodologies used 43

11 Plagiarism Report 45

1. INTRODUCTION

1.1 ABOUT THE PROJECT


In response to the critical issue of road safety for bike and cycle riders, this project aims
to develop an innovative helmet detection system using advanced deep learning
techniques. The system will be designed to identify instances of non-compliant helmet
usage among cyclists in real-time, thereby contributing to the enhancement of road
safety measures.

The project begins by ingesting video footage depicting road scenes, which serves as its
primary input. This footage captures the dynamic and diverse environment of a road,
showcasing various scenarios encountered by motorcyclists. Through meticulous
processing and analysis, the project endeavors to enhance road safety by discerning
instances of helmet usage among motorcyclists, a pivotal aspect in mitigating head
injuries during potential accidents.

Utilizing cutting-edge deep learning techniques, the YOLO model is deployed as the
cornerstone for helmet detection within the project framework. The model operates
dynamically, scanning each frame of the video feed with precision and speed,
identifying motorcyclists within the scene. Upon detection, the model meticulously
evaluates each motorcyclist's head region, discerning whether a helmet is present or
absent with utmost accuracy.

In the resulting output video, each motorcyclist recognized by the YOLO model is
distinctly marked with a bounding box, elegantly encompassing their presence within
the scene. Moreover, within these bounding boxes, an informative annotation is
included, succinctly denoting the model's determination: whether the motorcyclist is
wearing a helmet or not. This amalgamation of visual cues and textual annotations
provides viewers with a comprehensive understanding of the helmet compliance status
for each motorcyclist observed in the video feed.
Furthermore, accompanying each bounding box annotation is a confidence level,
serving as a measure of the model's certainty regarding its helmet detection. This metric
offers valuable insights into the reliability of the detection process, aiding viewers and
stakeholders in assessing the robustness of the system's performance.
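The label-plus-confidence annotation described above can be sketched as a small helper that drops low-confidence detections and formats the caption text. The 0.5 threshold and the exact label strings are our assumptions for illustration:

```python
def format_annotations(detections, conf_threshold=0.5):
    """Turn raw detections (label, confidence, box) into caption
    strings such as 'no helmet 0.87', discarding detections whose
    confidence falls below the threshold."""
    captions = []
    for label, conf, box in detections:
        if conf >= conf_threshold:
            captions.append((f"{label} {conf:.2f}", box))
    return captions

dets = [("helmet", 0.91, (10, 10, 60, 60)),
        ("no helmet", 0.34, (80, 15, 130, 70))]
print(format_annotations(dets))  # the low-confidence box is dropped
```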

1.2 OBJECTIVE

The primary goal of this project is to develop a robust system capable of real-time
helmet detection when provided with live footage of road scenes captured by CCTV
cameras. Leveraging the YOLO model, the system dynamically analyzes each frame,
swiftly identifying motorcyclists and discerning whether they are wearing helmets or
not. This real-time capability is crucial for enabling prompt intervention and ensuring
adherence to helmet regulations, thereby enhancing road safety measures.

Efforts are also directed towards establishing a seamless pipeline for real-time
monitoring and analysis of helmet compliance on roadways. By processing live CCTV
footage with minimal latency, the system enables prompt identification of non-
compliant motorcyclists, empowering authorities to intervene swiftly and mitigate
potential safety risks. This integration with existing surveillance infrastructure
streamlines the deployment and scalability of helmet compliance monitoring efforts
across diverse road environments.
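One common way to keep latency low on a live feed is to analyze only every Nth frame. A dependency-free sketch of that idea follows; in a real deployment the frame source would be something like OpenCV's `cv2.VideoCapture`, and the skip interval would be tuned to the hardware:

```python
def sample_frames(frames, every_nth=3):
    """Yield every Nth frame from a stream so the detector can
    keep pace with a live CCTV feed (simple frame skipping)."""
    for i, frame in enumerate(frames):
        if i % every_nth == 0:
            yield frame

# Stand-in for a CCTV stream: frame indices instead of images.
selected = list(sample_frames(range(10), every_nth=3))
print(selected)  # [0, 3, 6, 9]
```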

2. SYSTEM ANALYSIS

In this section, we delve into the analysis of both the existing system and the proposed
system, followed by a feasibility study encompassing various aspects such as
environmental impact and safety considerations.

2.1 EXISTING SYSTEM

The current state of helmet detection systems for monitoring road safety often relies on
manual observation or static surveillance cameras. These methods lack the ability to
provide real-time insights and are susceptible to human error. Manual observation is
labor-intensive and prone to inconsistencies, while static surveillance cameras offer
limited coverage and cannot adapt to dynamic road environments effectively.
Consequently, existing systems fall short in ensuring comprehensive and timely
monitoring of helmet compliance among motorcyclists.

2.2 PROPOSED SYSTEM

In contrast, the proposed system aims to revolutionize helmet detection for road safety
through the integration of advanced computer vision techniques and real-time analysis
capabilities. Leveraging the YOLOv8 (You Only Look Once version 8) object detection
framework, the system can dynamically analyze live footage from CCTV cameras. This
approach allows for swift identification of motorcyclists and precise detection of
helmets in real-time. Upon detection, the system annotates the video feed with
bounding boxes around the detected helmets. Additionally, each annotation includes a
label indicating whether the motorcyclist is wearing a helmet ("helmet") or not ("no
helmet"), accompanied by a confidence level. This comprehensive approach not only
enhances the accuracy and efficiency of helmet detection but also enables prompt
intervention and enforcement measures, thereby significantly improving road safety
outcomes.
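At its core, annotating a frame with a bounding box means marking the box's border pixels. The dependency-free sketch below does this on a tiny grid of numbers standing in for an image; a real implementation would call OpenCV's `cv2.rectangle` on the video frame instead:

```python
def draw_box(frame, x1, y1, x2, y2, value=1):
    """Mark the border of the box (x1, y1)-(x2, y2) in a 2D grid
    'frame' (rows indexed by y, columns by x), mimicking what
    cv2.rectangle does on an image array."""
    for x in range(x1, x2 + 1):   # top and bottom edges
        frame[y1][x] = value
        frame[y2][x] = value
    for y in range(y1, y2 + 1):   # left and right edges
        frame[y][x1] = value
        frame[y][x2] = value
    return frame

frame = [[0] * 6 for _ in range(5)]   # a blank 6x5 "image"
draw_box(frame, 1, 1, 4, 3)           # border cells become 1, interior stays 0
```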

2.3 FEASIBILITY STUDY

2.3.1 DETAILS

From a technological standpoint, the system's feasibility is high as it operates
autonomously on the device itself, requiring minimal additional resources. Leveraging
the YOLOv8 object detection framework, which is computationally efficient, the
system can run on a wide range of hardware configurations. Thus, the device hosting
the system only needs to support Python for its functioning. This simplicity reduces
implementation costs and accelerates the system's time-to-market, making it an
attractive solution for road safety monitoring initiatives. Overall, the system's self-
contained nature and minimal resource requirements enhance its feasibility as a cost-
effective solution for real-time helmet detection.

2.3.2 IMPACT ON ENVIRONMENT

The system's operation as a Python program on the device minimizes negative
environmental impacts, as it requires no additional hardware. Its lightweight design and
continuous operation ensure minimal resource consumption and energy efficiency.
Moreover, by aiding users in managing traffic schedules according to real-time data, the
system has the potential to reduce traffic congestion and associated emissions, thereby
contributing to environmental sustainability. This aligns with broader goals of
promoting sustainable transportation practices and mitigating the environmental
impacts of road traffic.

2.3.3 SAFETY

The system's safety is further bolstered by its limited interaction with external devices
and its lack of access to collect or store sensitive information. As it operates
independently without the need for external connections, the probability of a
cyberattack is significantly reduced. Additionally, since the system does not produce
sensitive data, there is minimal risk of compromising user privacy or security.
Consequently, the system can be considered highly safe to use, providing users with
peace of mind regarding potential cybersecurity threats.

2.3.4 ETHICS

The system's ethical integrity is upheld by its limited interaction with external devices
and its stringent data handling policies. As it does not interact with external devices, nor
does it collect, store, or produce sensitive data, the likelihood of ethical clashes is
negligible. This design ensures that user privacy and data security are upheld,
mitigating concerns regarding potential ethical dilemmas. By prioritizing user
confidentiality and data protection, the system demonstrates a commitment to ethical
principles, fostering trust and accountability in its utilization.

2.3.5 COST

The system's cost-effectiveness is notable, as it relies solely on Python and its requisite
modules. Installation expenses are virtually nonexistent, given that Python is an open-
source programming language. Moreover, the system incurs no direct costs for its
installation, making it accessible to a wide range of users. While Wi-Fi connectivity is
necessary during the initial setup for installing Python and the required modules,
ongoing operational costs are minimal. This affordability enhances the system's
accessibility and scalability, ensuring that cost constraints do not impede its adoption
and deployment.

2.3.6 TYPE

The system is classified as a Python application, necessitating the presence of Python 3
on the device for proper functionality. Additionally, the system relies on several open-
source Python modules to execute its tasks effectively. This classification underscores
the system's compatibility with a wide range of devices that support Python, ensuring
versatility in deployment. By adhering to standard programming practices and

leveraging open-source technologies, the system maintains flexibility and accessibility
for users across various platforms.

2.4 SCOPE OF THE PROJECT

The project's scope extends beyond helmet detection to include the identification of
motorcyclists and subsequent integration with a number plate detection system. Once
the helmet detection phase is completed and motorcyclists are identified, the system
will proceed to identify the associated motorcycles.

Upon successfully identifying motorcycles, the project aims to seamlessly integrate
with a number plate detection system. This system will be responsible for accurately
recognizing and extracting number plate information from the motorcycles captured in
the video feed. Leveraging advanced optical character recognition (OCR) techniques,
the system will extract alphanumeric characters from the number plates with high
accuracy.
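Raw OCR output usually needs a cleanup pass before it can be matched against a vehicle registry. The sketch below is our illustration of such a post-processing step, not part of the project's code; the OCR engine itself (e.g., Tesseract) is not shown, and the plate format assumed is an Indian-style string such as "TS09AB1234":

```python
import re

def normalize_plate(raw_ocr_text):
    """Clean raw OCR output into a canonical plate string:
    uppercase, with spaces, hyphens and other noise removed."""
    return re.sub(r"[^A-Z0-9]", "", raw_ocr_text.upper())

print(normalize_plate("ts 09-ab 1234"))  # TS09AB1234
```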

Once the number plate information is obtained, the system will utilize it to
automatically generate and issue traffic violation notices, commonly known as challans.
These notices will be sent directly to the respective individuals via messages, notifying
them of any detected violations.

This holistic approach not only enhances road safety through helmet compliance
monitoring but also enables efficient enforcement of traffic regulations by automating
the detection and issuance of challans for violations such as riding without a helmet or
improper number plate display. By seamlessly integrating multiple components, the
project aims to streamline the enforcement process, thereby contributing to improved
road safety and compliance with traffic laws.

2.5 MODULES

In the project, several modules are essential for its successful implementation. Two key
modules include the YOLO module for object detection and the display module for
visualizing the output.

The YOLO module plays a crucial role in detecting objects, particularly helmets and
motorcyclists, within the video footage. It utilizes the YOLOv8 object detection
framework, which is renowned for its accuracy and efficiency in real-time object
detection tasks. This module is responsible for processing each frame of the video feed,
identifying motorcyclists, and discerning whether they are wearing helmets.

On the other hand, the display module is responsible for visualizing the output of the
helmet detection system. It facilitates the annotation of the video feed with bounding
boxes around detected helmets and motorcyclists, as well as annotations indicating
"helmet wearing" or "no helmet" along with confidence levels. Additionally, the display
module ensures the seamless presentation of the output in a user-friendly format,
enabling stakeholders to easily interpret the results.

These modules work synergistically to deliver a comprehensive and intuitive user
experience, enabling efficient monitoring of helmet compliance among motorcyclists
and facilitating informed decision-making by stakeholders. Together, they form integral
components of the project, contributing to its overall functionality and effectiveness in
enhancing road safety.
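The interaction between the two modules can be sketched as two small functions composed into a pipeline. The stub detector below merely stands in for the real YOLO module, and all names are our own illustration:

```python
def detect_stub(frame):
    """Stand-in for the YOLO module: returns a list of
    (label, confidence, bounding box) tuples for one frame."""
    return [("helmet", 0.93, (12, 8, 64, 70))]

def render(frame_id, detections):
    """Stand-in for the display module: pairs each detection
    with its frame index and a printable caption."""
    return [(frame_id, f"{label} {conf:.2f}", box)
            for label, conf, box in detections]

def pipeline(frames, detector=detect_stub):
    """Run the detector on every frame and collect annotations."""
    results = []
    for frame_id, frame in enumerate(frames):
        results.extend(render(frame_id, detector(frame)))
    return results
```

Keeping detection and display behind separate functions like this is what lets the stub be swapped for a trained YOLOv8 model without touching the display code.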
2.6 SYSTEM CONFIGURATION

Hardware Requirements

i. A computer with sufficient processing power or an edge device capable of
running Python-based applications.
ii. A webcam or CCTV camera for capturing live video footage of road scenes.
iii. Internet connectivity for initial setup and installation of Python and required
modules (if not pre-installed).

Software Requirements

i. Compatible with Windows, macOS, or Linux operating systems.
ii. Python 3.x installed on the system. Ensure that the appropriate version of
Python is compatible with the YOLOv8 object detection framework and other
required modules.
iii. Install the YOLOv8 object detection framework, along with its dependencies,
for real-time object detection tasks.
iv. OpenCV library for image processing and video analysis tasks.
v. Install additional Python modules required for system functionality, such as
NumPy, Matplotlib, and PyTorch (if using for training).
vi. Choose a text editor or IDE for coding and development purposes, such as
Visual Studio Code, PyCharm, or Jupyter Notebook.
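The software requirements above map to a short one-time setup. As a sketch: `ultralytics` is the package that ships YOLOv8 (and pulls in PyTorch as a dependency); exact versions are left unpinned here and should be chosen to match the target hardware:

```shell
# One-time setup of the dependencies listed above
pip install ultralytics opencv-python numpy matplotlib
```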

3. LITERATURE OVERVIEW

3.1 Real-time Traffic Monitoring System based on Deep Learning and YOLOv8
Article in ARO [The Scientific Journal of Koya University, November 2023]

The proposed system provided traffic monitoring information in real-time using a
combination of cutting-edge deep learning technology and state-of-the-art
algorithms. The system gave excellent results in all of its stages compared with
previous works, with high accuracy and low error rates. The system was
implemented with various adjustable parameters such as frame skipping, enabling
and disabling segmentation, and device-agnostic code that made the system run on
different hardware configurations and GPUs, which made the system flexible and
configurable for any video input type, resolution, or weather condition. The
classification stage in the proposed system is useful in enforcing different speed
limits based on the vehicle class. The counting process can also count vehicles in
general or count them based on their classes for more specific information. The
proposed system estimates the vehicle’s size with a novel approach based on
segmentation masks generated from the YOLOv8 segmentation model, fixing one
dimension (the vehicle’s width) and calculating the other two (the vehicle’s length
and height) from the polygon points of the segmentation mask. The vehicle size
information is useful for traffic control, as speed limits differ for different vehicle
sizes. For example, trucks have speed limits that are different from those of saloon
cars. Size information can also help detect large vehicles that are not allowed to
drive on certain roads at certain times of the day. The vehicle height is also
important for driving in tunnels and under bridges.
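The paper's size-estimation idea, fixing the vehicle's real-world width and scaling the other dimensions from the segmentation mask's pixel extents, can be sketched with plain coordinates. The function and numbers below are our simplified illustration (one scaled dimension instead of the paper's two), not the authors' code:

```python
def estimate_size(polygon, real_width_m):
    """Given a segmentation polygon (pixel (x, y) points) and the
    vehicle's known real-world width in metres, estimate another
    dimension by scaling the polygon's pixel extents."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    px_w = max(xs) - min(xs)           # pixel width of the mask
    px_h = max(ys) - min(ys)           # pixel height of the mask
    scale = real_width_m / px_w        # metres per pixel
    return px_h * scale                # estimated height in metres

# A 200x120 px mask of a car known to be 1.8 m wide:
poly = [(0, 0), (200, 0), (200, 120), (0, 120)]
print(round(estimate_size(poly, 1.8), 2))  # 1.08
```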

Link: https://www.researchgate.net/publication/375675748_Real-time_Traffic_Monitoring_System_based_on_Deep_Learning_and_YOLOv8
3.2 Object detection using YOLO: challenges, architectural successors, datasets
and applications [Tausif Diwan, G. Anirudh & Jitendra V. Tembhurne]

Object detection is one of the predominant and challenging problems in computer
vision. Over the decade, with the expeditious evolution of deep learning,
researchers have extensively experimented and contributed in the performance
enhancement of object detection and related tasks such as object classification,
localization, and segmentation using underlying deep models. Broadly, object
detectors are classified into two categories viz. two stage and single stage object
detectors. Two stage detectors mainly focus on selective region proposals strategy
via complex architecture; however, single stage detectors focus on all the spatial
region proposals for the possible detection of objects via relatively simpler
architecture in one shot. Generally, the detection accuracy of two stage detectors
outperforms single stage object detectors. However, the inference time of single
stage detectors is better compared to its counterparts. Moreover, with the advent of
YOLO (You Only Look Once) and its architectural successors, the detection
accuracy is improving significantly and sometimes it is better than two stage
detectors. YOLO models are adopted in various applications mainly for their faster
inference rather than for their detection accuracy. For example, detection accuracies
are 63.4 and 70 for YOLO and Fast R-CNN respectively; however, inference is around
300 times faster in the case of YOLO. This paper presents a comprehensive review of
single-stage object detectors, especially YOLOs: their regression formulation,
architectural advancements, and performance statistics. The authors also summarize
comparative illustrations between two-stage and single-stage object detectors, among
different versions of YOLOs, and applications based on two-stage detectors and
different versions of YOLOs, along with future research directions.

Link : https://link.springer.com/article/10.1007/s11042-022-13644-y

3.3 Research on Helmet Wearing Detection Based on Improved YOLOv8
Algorithm [Lu Nannan, Wu Liuai]

In response to the current challenges of low detection accuracy in traditional safety
helmet detection network models, this paper introduces a novel model called
C2f_Faster_EMA_YOLOv8 based on YOLOv8. This innovative model design
incorporates three key improvements aimed at enhancing detection performance and
real-time capabilities. Firstly, we introduce the FasterBlock from FasterNet to replace
certain Bottleneck components in the original C2f. This modification leads to the
creation of the entirely new C2f_Faster module, significantly boosting the model's real-
time detection speed, making it more suitable for rapid monitoring requirements in
practical scenarios. Secondly, we incorporate an EMA attention mechanism module
into the Neck part of the model. This module aids in capturing fine-grained details,
enabling the model to focus more on training safety helmet-related target features and
thereby enhancing detection accuracy. Lastly, we adopt the MPDIoU loss function to
replace the original loss function. This improvement effectively enhances the model's
bounding box regression performance, further strengthening detection accuracy.
Through experiments conducted on the SHWD safety helmet dataset, we observed that
the improved model achieved a 2.3% increase in the mean Average Precision (mAP)
compared to the original model. Additionally, we successfully reduced the model's
parameter size and overall size, reducing them by 2.62G and 1.5MB, respectively. This
innovative model not only improves detection accuracy but also reduces model
complexity, outperforming comparative algorithms and demonstrating significant
potential for practical applications.

Link: https://ieeexplore.ieee.org/document/10408800

3.4 Real-Time Multi-Class Helmet Violation Detection Using Few-Shot Data Sampling
Technique and YOLOv8 [Armstrong Aboah, Bin Wang, Ulas Bagci, Yaw Adu-Gyamfi]

Traffic safety is a major global concern. Helmet usage is a key factor in preventing head
injuries and fatalities caused by motorcycle accidents. However, helmet usage
violations continue to be a significant problem. To identify such violations, automatic
helmet detection systems have been proposed and implemented using computer vision
techniques. Real-time implementation of such systems is crucial for traffic surveillance
and enforcement, however, most of these systems are not real-time. This study proposes
a robust real-time helmet violation detection system. The proposed system utilizes a
unique data processing strategy, referred to as few-shot data sampling, to develop a
robust model with fewer annotations, and a single-stage object detection model,
YOLOv8 (You Only Look Once Version 8), for detecting helmet violations in real-time
from video frames. Our proposed method won 7th place in the 2023 AI City Challenge,
Track 5, with an mAP score of 0.5861 on experimental validation data. The
experimental results demonstrate the effectiveness, efficiency, and robustness of the
proposed system.

Link : https://ieeexplore.ieee.org/document/10208778

3.5 Safety Helmet Detection Using YOLO V8 [Krunal Patel, Vrajesh Patel, Vikrant
Prajapati, Darshak Chauhan, Adil Haji]

Ensuring safety in the workplace is crucial to the wellbeing of workers and the success
of organizations. One essential aspect of workplace safety is the use of safety helmets in
hazardous environments. Safety helmets protect workers from head injuries caused by
falling objects, electric shocks, and other hazards. In recent years, computer vision-
based safety helmet detection systems have gained popularity as a means of ensuring
compliance with safety regulations and reducing accidents. This study proposes a safety
helmet detection system based on the You Only Look Once (YOLO) V8 algorithm,
which is a state-of-the-art object detection algorithm that has shown superior
performance in detecting small objects in real-time. The proposed system involves
training the YOLO V8 algorithm on a dataset of images containing workers with and
without safety helmets. The dataset was carefully curated to include various lighting
conditions, camera angles, and helmet types. The trained model was then evaluated on a
separate test set to measure its performance. Experimental results demonstrate that the
proposed approach achieves high accuracy in detecting safety helmets, with an average
precision of 0.99 and a recall of 0.99. The model also demonstrated robustness to
variations in lighting and camera angles, making it suitable for real-world deployment.

Link: https://ieeexplore.ieee.org/abstract/document/10266244

3.6 Real-Time Flying Object Detection with YOLOv8 [Dillon Reis, Jordan Kupec,
Jacqueline Hong, Ahmad Daoudi]

This paper presents a generalized model for real-time detection of flying objects that can be
used for transfer learning and further research, as well as a refined model that is ready for
implementation. We achieve this by training our first generalized model on a data set
containing 40 different classes of flying objects, forcing the model to extract abstract
feature representations. We then perform transfer learning with these learned parameters on
a data set more representative of real world environments (i.e., higher frequency of
occlusion, small spatial sizes, rotations, etc.) to generate our refined model. Object
detection of flying objects remains challenging due to large variance in object spatial
sizes/aspect ratios, rate of speed, occlusion, and clustered backgrounds. To address some of
the presented challenges while simultaneously maximizing performance, we utilize the
current state of the art single-shot detector, YOLOv8, in an attempt to find the best tradeoff
between inference speed and mAP. While YOLOv8 is being regarded as the new state-of-
the-art, an official paper has not been provided. Thus, we provide an in-depth explanation
of the new architecture and functionality that YOLOv8 has adapted. Our final generalized
model achieves an mAP50-95 of 0.685 and average inference speed on 1080p videos of 50
fps. Our final refined model maintains this inference speed and achieves an improved
mAP50-95 of 0.835.

Link: https://arxiv.org/abs/2305.09972

3.7 You Only Look Once: Unified, Real-Time Object Detection [Joseph Redmon, Santosh
Divvala, Ross Girshick, Ali Farhadi]
Humans glance at an image and instantly know what objects are in the image, where they
are, and how they interact. The human visual system is fast and accurate, allowing us to
perform complex tasks like driving with little conscious thought. Fast, accurate algorithms
for object detection would allow computers to drive cars without specialized sensors,
enable assistive devices to convey real-time scene information to human users, and unlock
the potential for general purpose, responsive robotic systems. Current detection systems
repurpose classifiers to perform detection. To detect an object, these systems take a
classifier for that object and evaluate it at various locations and scales in a test image.
Systems like deformable parts models (DPM) use a sliding window approach where the
classifier is run at evenly spaced locations over the entire image.

Link : https://arxiv.org/abs/1506.02640

3.8 A Comprehensive Review Of YOLO: From YOLOV1 TO YOLOV8 And Beyond
[Under Review In ACM Computing Surveys]

Real-time object detection has emerged as a critical component in numerous applications,
spanning various fields such as autonomous vehicles, robotics, video surveillance, and
augmented reality. Among the various object detection algorithms, the YOLO (You Only
Look Once) framework has stood out for its remarkable balance of speed and accuracy,
enabling the rapid and reliable identification of objects in images. Since its inception, the
YOLO family has evolved through multiple iterations, each building upon the previous
versions to address limitations and enhance performance. This paper aims to provide a
comprehensive review of the YOLO framework’s development, from the original YOLOv1
to the latest YOLOv8, elucidating the key innovations, differences, and improvements
across each version.
The paper begins by exploring the foundational concepts and architecture of the original
YOLO model, which set the stage for the subsequent advances in the YOLO family.
Following this, we delve into the refinements and enhancements introduced in each version,
ranging from YOLOv2 to YOLOv8. These improvements encompass various aspects such
as network design, loss function modifications, anchor box adaptations, and input
resolution scaling. By examining these developments, we aim to offer a holistic
understanding of the YOLO framework’s evolution and its implications for object
detection.

Link : https://arxiv.org/pdf/2304.00501v1

4. SYSTEM DESIGN

4.1 SYSTEM ARCHITECTURE

You Only Look Once (YOLOv8) proposes using an end-to-end neural network that
makes predictions of bounding boxes and class probabilities all at once. It differs from
the approach taken by previous object detection algorithms, which repurposed
classifiers to perform detection.

4.1.1 YOLOv8 Architecture

4.1.2 System Architecture

In our project, the input to the model is a traffic video, which serves as a real-world
representation of road scenes captured by CCTV cameras. This video stream is passed
through the YOLOv8 (You Only Look Once version 8) model, which has been
meticulously trained to recognize objects of interest, specifically helmets and non-
helmet objects. The YOLOv8 model performs real-time object detection, swiftly
analyzing each frame of the video feed.

During the object detection process, the YOLOv8 model identifies and marks the
detected objects with their respective classes, such as "helmet" or "no helmet."
Additionally, the model provides the probability scores associated with each detected
object, indicating the likelihood of it belonging to a particular class. This nuanced
analysis allows for precise determination of whether a person is wearing a helmet or
not, providing valuable insights into compliance with safety regulations.

The display system plays a pivotal role in presenting the results obtained from the
YOLOv8 model. Continuously capturing frames from the traffic video, the display
system feeds each frame into the YOLOv8 model for real-time evaluation.
Subsequently, the results generated by the model are presented on the screen in a clear
and intuitive manner. This seamless integration of the display system with the YOLOv8
model enables continuous monitoring and evaluation of helmet compliance among
individuals depicted in the traffic video.

By harnessing the power of advanced object detection techniques and real-time
analysis, our project facilitates continuous assessment of helmet usage in dynamic road
environments. This iterative process of capturing, analyzing, and presenting video
frames ensures a comprehensive and timely evaluation of safety compliance, ultimately
contributing to enhanced road safety measures and the promotion of responsible riding
practices.
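The frame-by-frame flow described above can be sketched as a small, self-contained pipeline. This is an illustrative sketch only: the `detect_fn` stub stands in for the trained YOLOv8 model, and all names here are assumptions rather than the project's actual code.

```python
def run_pipeline(frames, detect_fn):
    """Feed each frame through the detector and collect per-frame results.

    detect_fn stands in for a trained YOLOv8 model: it takes one frame and
    returns a list of (label, confidence) pairs.
    """
    results = []
    for idx, frame in enumerate(frames):
        detections = detect_fn(frame)
        results.append({
            "frame": idx,
            "detections": detections,
            # count riders detected without a helmet in this frame
            "violations": sum(1 for label, _ in detections if label == "no helmet"),
        })
    return results
```

In the real system, `detect_fn` would wrap a call to the YOLOv8 model and the per-frame dictionaries would feed the display system for annotation.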

4.2 UML DIAGRAMS

4.2.1 USE CASE DIAGRAM

4.2.1.1 Use Case Diagram

The use case diagram for the helmet detection system illustrates the sequential steps
involved in its operation. Initially, the user opens the application, initiating the process.
Subsequently, they upload a traffic video to be analyzed for helmet detection. Once the
video is uploaded, the system begins to extract features by training the YOLOv8 model
with a dataset containing labeled images of helmets and non-helmet objects. This
training process enables the model to learn and recognize relevant features necessary
for accurate helmet detection.

Following the training phase, the system proceeds to analyze each frame of the
uploaded video, determining whether individuals depicted are wearing helmets or not.
This analysis involves processing the video frames through the trained YOLOv8 model,
which assigns confidence scores to each detection, indicating the likelihood of helmet
presence. Finally, the results of the helmet detection analysis are displayed to the user,
typically in the form of annotated video frames showcasing helmet-wearing status and
associated confidence levels. This comprehensive approach enables users to effectively
monitor and assess helmet compliance in traffic videos, contributing to improved road
safety measures.

4.2.2 CLASS DIAGRAM

4.2.2.1 Class Diagram

Firstly, the ‘LaptopUser’ class encapsulates the attributes and methods related to the
user's interaction with the system. It includes a method to generate a unique laptop ID
and a method for initiating the helmet detection process.

Next, the ‘DetectionSystem’ class serves as the core component responsible for
capturing videos and performing helmet detection. Within this class, the Webcam
attribute represents the webcam used for capturing video footage. It contains methods
for capturing videos from the webcam and implementing helmet detection algorithms.

The ‘AnalysisSystem’ class contains functionalities related to image processing and
feature extraction. It includes an ImageProcessor attribute responsible for processing
images captured from the webcam. The class also features a method for extracting
features from the images, which is crucial for subsequent helmet analysis.

Lastly, the ‘DecisionSystem’ class handles the final decision-making process based on
the helmet analysis results. It contains methods for determining whether a helmet is
present or not based on the analyzed features. These methods enable the system to
classify detected objects as either helmet or no helmet, providing valuable insights for
safety enforcement measures.

Overall, the class diagram outlines the key components and interactions within the
helmet detection system, facilitating the seamless analysis of video footage to ensure
compliance with helmet safety regulations.

4.2.3 STATE CHART DIAGRAM

4.2.3.1 State Chart Diagram

The state chart diagram for the helmet detection system outlines the sequential stages
involved in processing video data for helmet detection. It begins with the "Load Model"
state, where the system initializes and loads the pre-trained YOLOv8 model
architecture, essential for subsequent processing. Once the model is loaded, the system transitions
to the "Train with Dataset" state, where it refines the model's performance by training it
with a labeled dataset containing images of helmets and non-helmet objects. This
training process allows the model to learn and adapt to the specific characteristics of
helmet detection.

After training with the dataset, the system progresses to the "Pass Video" state, where it
receives the input video stream for analysis. This stage involves the continuous flow of
video frames through the system for real-time processing. As the video stream is passed
through the model, the system enters the "Extract Frames" state, where individual
frames are extracted for further analysis.

Subsequently, in the "Detect" state, the system performs helmet detection on each
extracted frame using the trained YOLOv8 model. This stage involves analyzing each
frame to identify and localize instances of helmets within the video stream. Following
helmet detection, the system transitions to the "Decision" state, where it evaluates the
detected objects and makes decisions based on predefined criteria.

Finally, in the "Display" state, the system presents the analyzed results to the user,
typically in the form of annotated video frames highlighting detected helmets. This
stage involves visualizing the analysis outcomes, allowing users to review and interpret
the detected objects and their associated attributes.

4.2.4 ACTIVITY DIAGRAM

4.2.4.1 Activity Diagram

The data required for training the model is prepared. This dataset is pre-processed by
marking the images with the required labels (helmet, noHelmet) using the open-source
labelImg tool. The labelled dataset is then used to train the model. After every iteration
of training, the model is evaluated and improved in the next iteration, for about 100
epochs. The trained model is stored and used for making predictions.
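For reference, labelImg in YOLO mode writes one text file per image, with one line per bounding box. A minimal parser for such a line might look like the following; the class names are assumed to match those in the project's data.yaml.

```python
def parse_yolo_label(line, class_names=("helmet", "noHelmet")):
    """Parse one line of a YOLO-format label file.

    Format: "<class_id> <x_center> <y_center> <width> <height>", with all
    four box values normalized to [0, 1] relative to the image size.
    """
    parts = line.split()
    cls = class_names[int(parts[0])]
    x, y, w, h = (float(v) for v in parts[1:5])
    return cls, x, y, w, h
```

Normalized coordinates make the labels independent of image resolution, which is why the same annotations work when imgsz is changed at training time.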

4.3 SYSTEM DESIGN

4.3.1 MODULAR DESIGN

Yolo Module

The YOLO (You Only Look Once) module is integral to our helmet detection system,
serving as the backbone for real-time object detection tasks. YOLOv8, specifically, is
employed due to its efficiency and accuracy in detecting helmets within traffic videos.
This module encompasses the architecture, implementation, and fine-tuning of the
YOLOv8 model. It includes functionalities for loading the model architecture,
initializing weights, and executing object detection on video frames. Additionally, the
YOLO module incorporates techniques such as anchor boxes, feature pyramid networks
(FPNs), and skip connections to optimize detection performance. Through modular
design, the YOLO module ensures flexibility, scalability, and maintainability of the
object detection pipeline.

Display Module

The Display module is responsible for visualizing the results of helmet detection
analysis to users in a clear and intuitive manner. It includes components for rendering
video frames with annotations indicating detected helmets, confidence scores, and
classification results. The Display module facilitates seamless interaction with the
analysis outcomes, allowing users to review, interpret, and take appropriate actions
based on the detected objects. Furthermore, the Display module ensures compatibility
with various display devices and interfaces, enhancing user accessibility and usability.
Through modular design, the Display module promotes modifiability and extensibility,
enabling easy integration with other system components and potential future
enhancements.
4.3.2 DATABASE DESIGN

In the database design for the helmet detection system, the focus is on organizing the
structure to efficiently handle and manage the various data entities involved. The design
includes entities such as users, videos, frames, detected objects, and analysis results,
each serving a specific purpose within the system.

The user entity represents individuals interacting with the system, and it includes
attributes like username, email, and password for user authentication and access control.
Videos uploaded to the system are stored with attributes such as video ID and path,
allowing for the tracking and management of video data.

Frames extracted from the uploaded videos are stored as entities with attributes like
frame number and image data. These frames serve as the basis for object detection and
analysis within the system. Detected objects, such as helmets, are represented with
attributes describing their class labels, confidence scores, and bounding box coordinates
within the frames.

The analysis results entity stores the outcomes of helmet detection for each frame,
indicating whether a helmet is detected and the associated confidence level. These
results enable users to review and interpret the system's performance in detecting
helmets accurately.

Relationships between entities are established based on their interactions within the
system, although no explicit keys are utilized. Instead, the relationships are implied
based on the connections between entities and their attributes. This simplified approach
to database design allows for a straightforward organization of data entities without the
need for complex key structures.
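As a sketch only (the report does not prescribe a concrete storage engine), the entities above could be laid out in SQLite roughly as follows. Table and column names are illustrative assumptions, and, matching the design described, no explicit primary/foreign keys are declared.

```python
import sqlite3

# Illustrative schema for the entities described above; names are assumptions.
SCHEMA = """
CREATE TABLE users      (username TEXT, email TEXT, password TEXT);
CREATE TABLE videos     (video_id TEXT, path TEXT);
CREATE TABLE frames     (video_id TEXT, frame_number INTEGER, image_path TEXT);
CREATE TABLE detections (video_id TEXT, frame_number INTEGER,
                         class_label TEXT, confidence REAL,
                         x1 REAL, y1 REAL, x2 REAL, y2 REAL);
"""

def init_db(path=":memory:"):
    """Create an in-memory (or on-disk) database with the sketched schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

The analysis results described above would then be simple queries over the detections table, e.g. counting "noHelmet" rows per video.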

4.3.2.1 Preparation Of Dataset Using LabelImg

4.3.2.2 Sample Dataset

4.3.2.3 Coordinates For Helmet_0 Image

4.3.2.4 Coordinates For Helmet_1 Image

5. IMPLEMENTATION

The project's implementation begins with accessing files stored in Google Drive, crucial
for dataset management and model storage. By mounting Google Drive within the
Colab environment, seamless access to dataset files and trained model weights is
ensured, facilitating efficient model training and inference processes.

Subsequently, the dataset required for training and testing the helmet detection model is
downloaded and extracted into a designated directory. This dataset likely contains
images or videos capturing various traffic scenes, with annotations specifying the
presence or absence of helmets. This step ensures that the model is trained on diverse
and representative data, enhancing its ability to generalize and accurately detect helmets
in real-world scenarios.

Once the dataset is prepared, the YOLOv8 model is trained using the dataset with
specified parameters such as the number of epochs and image size. Training the model
involves optimizing its parameters to learn and recognize patterns associated with
helmet presence within the provided images or videos. This phase is critical for the
model to achieve high accuracy and robustness in helmet detection.

After training, the model is loaded for inference, where it is applied to a video source to
detect helmets in real-time. The results of the helmet detection process are then
visualized and analyzed, allowing users to evaluate the system's performance and
accuracy. This step is essential for assessing the model's effectiveness in identifying
helmets within the video stream, providing valuable insights for further optimization
and improvement.

Additionally, preprocessing of video data is performed using OpenCV, a popular
computer vision library. Frames from the specified video source are captured, resized,
displayed for visual inspection, and saved to disk for further analysis. This
preprocessing step ensures that the video data is appropriately formatted and processed
for accurate helmet detection, contributing to the overall effectiveness of the system.

In summary, the project's implementation involves dataset preparation, model training,
inference, and preprocessing of video data, all integrated to create a robust helmet
detection system. By following this systematic approach, the system demonstrates the
capability to accurately identify helmets in real-world traffic scenarios, thereby
contributing to enhanced road safety measures and accident prevention efforts.

5.2 SAMPLE CODE

Img.py

import cv2
import time

cpt = 0           # counter for saved frames
maxFrames = 100   # number of frames to capture

cap = cv2.VideoCapture('2wheelertraffic.webm')

while cpt < maxFrames:
    ret, frame = cap.read()
    if not ret:   # stop if the video ends before maxFrames are captured
        break
    frame = cv2.resize(frame, (1080, 500))
    time.sleep(0.01)
    cv2.imshow("test window", frame)
    cv2.imwrite(r"C:\Users\surya\Downloads\yolov8helmetdetection-main\yolov8helmetdetection-main\images\helmet_%d.jpg" % cpt, frame)
    cpt += 1
    if cv2.waitKey(5) & 0xFF == 27:   # Esc key exits early
        break

cap.release()
cv2.destroyAllWindows()

main.py

import os

HOME = os.getcwd()
print(HOME)

!pip install ultralytics==8.1.34

import ultralytics
ultralytics.checks()

from ultralytics import YOLO
from IPython.display import display, Image

from google.colab import drive
drive.mount('/content/gdrive')
!ln -s /content/gdrive/My\ Drive/ /mydrive
!ls /mydrive

!mkdir {HOME}/datasets
%cd {HOME}/datasets
!unzip /content/gdrive/MyDrive/MajorProjectNew.zip

%cd {HOME}
!yolo task=detect mode=predict model=yolov8n.pt conf=0.25 source='https://media.roboflow.com/notebooks/examples/dog.jpeg' show=True

%cd {HOME}
!yolo task=detect mode=train model=yolov8s.pt data='/content/datasets/MajorProjectNew/data.yaml' epochs=100 imgsz=800 plots=True

data.yaml

nc: 2

names: ['helmet', 'noHelmet']

driver.py

from ultralytics import YOLO

model = YOLO('bestNew.pt')        # trained weights produced by the training run
source = '2wheelertraffic.webm'
results = model(source, show=True)

6. TESTING

6.1 TESTING

Testing involves running the Python code multiple times, with each run exercising a
different aspect of the code in a real-time scenario. The parameters of the project are
checked individually until all are relatively cohesive. The code is then integrated into
a single program; as the pieces are combined, integration testing is performed once unit
testing has been successfully completed on the smaller parts of the code. Testing plays
a major role in this project because the system's effectiveness depends on the accuracy
of its output: the more accurate the output, the better the project.

6.2 TEST CASES

S.NO | TEST CASE | EXPECTED RESULT | TEST RESULT

1 | Provide video footage containing various scenes with bikes and other vehicles. | The model accurately identifies individuals wearing helmets within the video, marking them with bounding boxes and labels indicating "helmet" or "no helmet". | SUCCESSFUL

2 | Provide video footage containing scenes with varying lighting conditions, including bright sunlight, shadows, and low light. | The model demonstrates robustness against varied lighting conditions, accurately identifying individuals wearing helmets irrespective of lighting challenges. | SUCCESSFUL

3 | Provide video footage captured from various camera perspectives, including front-facing, side-view, and overhead angles. | The model demonstrates versatility in detecting helmets from different camera perspectives, maintaining accuracy and consistency across various viewing angles. | SUCCESSFUL

Table 6.2.1 Test Cases

7. OUTPUT SCREENS

7.1 Output Screen 1

7.2 Output Screen 2

7.3 Output Screen 3

7.4 Output Screen 4

8. CONCLUSION

8.1 CONCLUSION

The project has made significant progress in developing a robust helmet detection
system using the YOLOv8 model. Through meticulous dataset preparation, model
training, and implementation of detection algorithms, the system demonstrates
promising capabilities in accurately identifying individuals wearing helmets in various
real-world scenarios.

With successful training and testing phases, the model showcases its ability to handle
diverse challenges such as varied lighting conditions, different camera perspectives, and
the presence of multiple vehicles. This ensures that the system can effectively
contribute to road safety measures by providing reliable helmet detection capabilities.

Moving forward, the project aims to further refine the model's performance, optimize
computational efficiency, and enhance the user interface for seamless integration into
real-time monitoring systems. Additionally, ongoing evaluation and validation efforts
will continue to assess the system's accuracy and reliability across diverse datasets and
scenarios.

In conclusion, the progress achieved thus far underscores the project's potential to make
a significant impact on road safety initiatives by providing an advanced helmet
detection solution. With continued development and refinement, the system is poised to
become a valuable tool in promoting safer road practices and reducing the risks
associated with inadequate helmet usage.

8.2 FURTHER ENHANCEMENTS

Further enhancements to the project could significantly bolster its effectiveness and
usability in promoting road safety through improved helmet detection capabilities. One
avenue for enhancement lies in optimizing the YOLOv8 model to strike a balance
between accuracy and computational efficiency. Techniques such as model
quantization, pruning, or architecture modifications can be explored to streamline
inference speed and reduce computational resource requirements, enabling smoother
real-time operation of the system on edge devices or embedded systems. By enhancing
the model's efficiency, the system can deliver faster and more responsive helmet
detection results, contributing to timely enforcement of helmet usage regulations and
improved safety outcomes on the roads.

Integrating multi-object tracking capabilities into the system represents another
significant enhancement opportunity. By extending the system to incorporate
multi-object tracking, it can track individuals wearing helmets over consecutive frames,
facilitating continuous monitoring of helmet usage patterns and behaviors in dynamic
environments. This enhancement is particularly valuable in scenarios where multiple
individuals or vehicles are present, allowing for more comprehensive analysis and
enforcement of helmet usage regulations. Additionally, multi-object tracking enhances
the system's ability to provide real-time feedback and alerts to relevant stakeholders,
enabling proactive intervention in case of detected violations or safety incidents.

Improving the user interface of the system is essential to enhance usability and
accessibility for end-users. A user-friendly interface with intuitive controls, informative
visualizations, and real-time feedback capabilities can streamline system operation and
analysis for users, including traffic authorities, law enforcement agencies, and other
stakeholders. By developing a web-based dashboard or mobile application for remote
monitoring, configuration, and analysis of helmet detection results, the system becomes
more accessible and adaptable to various deployment scenarios and user requirements.
Furthermore, establishing a framework for continuous evaluation and improvement of
the system's performance is crucial for its long-term success. Regular retraining of the
model with updated datasets, monitoring of key performance metrics, and solicitation of
feedback from end-users can help identify areas for enhancement and refinement. By
continuously iterating on the system's design and functionality, it can adapt to evolving
road safety challenges and regulatory requirements, ensuring its continued effectiveness
in promoting helmet usage and reducing the incidence of preventable accidents and
injuries on the roads.

9. BIBLIOGRAPHY

9.1 BOOKS REFERENCES

“Computer Vision: Algorithms and Applications” by Richard Szeliski - This
comprehensive book covers a wide range of topics in computer vision, including object
detection, image segmentation, and feature extraction, which are fundamental to
understanding and implementing helmet detection systems.

"Deep Learning for Computer Vision" by Rajalingappaa Shanmugamani - This book
covers various deep learning techniques for computer vision tasks, including object
detection, with practical examples and implementations that may include YOLOv8 or
similar models.

9.2 WEBSITE REFERENCES

https://docs.ultralytics.com/yolov5

https://www.researchgate.net/publication/375675748_Real-time_Traffic_Monitoring_System_based_on_Deep_Learning_and_YOLOv8

https://link.springer.com/article/10.1007/s11042-022-13644-y

9.3 TECHNICAL PUBLICATIONS

M. Hussain and R. Hill, "Custom lightweight convolutional neural network architecture
for automated detection of damaged pallet racking in warehousing & distribution
centers", IEEE Access, vol. 11, pp. 58879-58889, 2023.

M. F. Talu, K. Hanbay and M. H. Varjovi, "CNN-based fabric defect detection system
on loom fabric inspection", Tekstil Konfeksiyon, vol. 32, no. 3, pp. 208-219, Sep. 2022.

10. APPENDICES

A. SOFTWARE USED

Python

Python is a popular programming language. Python can be used on a server to create
web applications. Python can be used alongside software to create workflows. Python
can connect to database systems. It can also read and modify files. Python works on
different platforms. Python has a simple syntax similar to the English language. Python
has syntax that allows developers to write programs with fewer lines than some other
programming languages. Python runs on an interpreter system, meaning that code can
be executed as soon as it is written. This means that prototyping can be very quick.

Object Detection Model YOLOv8

YOLOv8 is the newest state-of-the-art YOLO model that can be used for object
detection, image classification, and instance segmentation tasks. YOLOv8 was
developed by Ultralytics, who also created the influential and industry-defining
YOLOv5 model. YOLOv8 includes numerous architectural and developer experience
changes and improvements over YOLOv5.

LabelImg

Utilized LabelImg, a popular annotation tool, for labeling and annotating datasets.
LabelImg allows for the efficient annotation of images with bounding boxes, which is
essential for training object detection models like YOLOv8. By incorporating LabelImg
into your workflow, you have ensured the availability of accurately labeled data for
training your object detection model, thereby enhancing its performance and
effectiveness.

B. METHODOLOGIES USED

The project involves collecting a dataset of images or videos containing scenes of
people riding bikes or motorcycles, both with and without helmets. These data serve as
the basis for training and testing the helmet detection model. In the provided code,
video footage containing traffic scenes is utilized as input for the detection system.

The YOLOv8 model is chosen as the object detection framework for its efficiency and
accuracy. The provided code trains the YOLOv8 model using the collected dataset to
learn the features and patterns associated with helmet presence. The model is trained
with specified parameters such as the number of epochs and image size to optimize its
performance.

The code may involve experimenting with different hyperparameters such as the
learning rate, batch size, and image size to optimize the performance of the YOLOv8
model during training. Fine-tuning these parameters helps improve the model's
accuracy and convergence speed.

Following training, the performance of the trained model is evaluated and validated
using separate test data to assess its accuracy and generalization ability. Metrics such as
precision, recall, and F1 score may be calculated to measure the model's performance.
In the provided code, the model's performance can be visually assessed by running it on
test videos and inspecting the detection results.
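The metrics mentioned above follow directly from the counts of true positives (TP), false positives (FP), and false negatives (FN) on the test data. A small helper, shown here with purely illustrative numbers, makes the relationship concrete:

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall, and F1 score from detection counts.

    precision = TP / (TP + FP): how many predicted helmets were real
    recall    = TP / (TP + FN): how many real helmets were found
    F1        = harmonic mean of precision and recall
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1
```

For example, 8 correct helmet detections with 2 false alarms and 2 misses would give precision 0.8, recall 0.8, and F1 0.8.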

Post-processing techniques such as non-maximum suppression (NMS) may be applied
to refine the detection results and eliminate redundant bounding boxes. This helps
improve the localization accuracy of detected helmets and reduce false positives in the
detection output.
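Greedy IoU-based NMS, as described above, can be sketched in a few lines. Boxes here are assumed to be (x1, y1, x2, y2) corner coordinates, and the threshold value is an example, not the project's tuned setting:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [j for j in order if iou(boxes[best], boxes[j]) < iou_thresh]
    return keep
```

In practice the Ultralytics pipeline applies NMS internally, but the logic is the same: two detections of the same helmet collapse to the single highest-confidence box.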

Once validated, the trained model can be deployed and integrated into the target
application or system for real-world use. This involves optimizing the model for
inference on different hardware platforms and integrating it with existing software
infrastructure for seamless operation.
11. PLAGIARISM REPORT
