
COUNTING AND CLASSIFYING THE VEHICLES

USING DEEP NEURAL NETWORKS

INTERNSHIP TRAINING REPORT

Submitted by
S.KEERTHIVASAN
Register No: 1919101018

DEPARTMENT OF CIVIL ENGINEERING


FINAL YEAR / VII - SEMESTER
BATCH: 2019-2023

SONA COLLEGE OF TECHNOLOGY


(AUTONOMOUS)
SALEM -636005
DECEMBER 2022
SONA COLLEGE OF TECHNOLOGY
(AUTONOMOUS)
SALEM -636005

Department of Civil Engineering

INTERNSHIP TRAINING REPORT

DECEMBER 2022

This is to certify that the internship report entitled COUNTING AND
CLASSIFYING THE VEHICLES USING DEEP NEURAL NETWORKS
is the bona fide record of internship work done by

S.KEERTHIVASAN
Register No: 1919101018

for Internship Training during the year 2022-2023.

Industry / Company Incharge Head of the Department

Submitted for the Project Viva-Voce examination held on -------------------------

Internal Examiner External Examiner



SUMMER FELLOWSHIP 2022


REPORT

COUNTING AND CLASSIFYING THE
VEHICLES USING DEEP NEURAL
NETWORKS

SUBMITTED BY:
Name :Keerthivasan.S

Course :B.E Civil engineering

University/College :Sona College Of Technology

Register No :19CIVBE064

Year of study :III (2019-2023)

E-mail id :keerthivasan9162001@gmail.com

Ph.No :8072930969

TO:
Dr. Gitakrishnan Ramadurai
Associate Professor in the Transportation Engineering Division, Department of Civil
Engineering, Indian Institute of Technology Madras.

TABLE OF CONTENTS

Title

• Offer letter
• Acknowledgement
• Introduction
• Advantages of counting and classifying the vehicles
• Machine learning vs deep learning
• Deep learning
• Different computer vision tasks
• Types of detectors
• Image annotating tools
• LabelImg
• General architecture of the vehicle counting system
• Knowledge acquired during internship
• My responsibilities/position in internship
• Some pictures of my working
• Conclusion
• Reference
OFFER LETTER

Acknowledgement

I, Keerthivasan S, would first like to thank Professor Dr. R. G. Robinson, Head of
the Department of Civil Engineering, Indian Institute of Technology Madras, for
giving me the opportunity to do an internship and to explore the department
and the organization.

I am also heartily thankful to Dr. Gitakrishnan Ramadurai, Associate
Professor in the Transportation Engineering Division, Department of Civil
Engineering, IIT Madras, and a core faculty member of the Robert Bosch Center
for Data Science and AI at IIT Madras, for giving me the opportunity to work
under his guidance on his project.

I would like to convey my gratitude to the Department of Civil Engineering, Indian
Institute of Technology Madras, for giving me a platform to interact and learn.

I extend my warm gratitude and regards to everyone who helped me during my
Summer Fellowship Programme 2022 internship.

Keerthivasan S

• INTRODUCTION
Vehicle detection and counting in highway monitoring video are of
considerable significance to intelligent traffic management and highway
control. With the widespread installation of traffic surveillance cameras,
a vast database of traffic video footage is available for analysis.
A camera mounted at a high viewing angle can cover a more distant stretch
of road surface. However, the apparent size of a vehicle changes greatly at
this viewing angle, and the detection accuracy for small objects far down
the road is low. In complex camera scenes, it is essential to solve these
problems effectively and to put the results to use. We focus on these
issues to propose a viable solution, and we apply the vehicle detection
results to multi-object tracking and vehicle counting.

• ADVANTAGES OF COUNTING AND CLASSIFYING THE VEHICLES
Helps traffic police: A vehicle detection and counting system benefits the
traffic police because they can monitor everything from one place, such as
how many vehicles have crossed a toll and which vehicles they were.

Maintaining records: It is difficult for a person to record every vehicle
manually because the vehicles pass by in real time. Unlike a recorded video,
live traffic cannot be paused to take notes. An automated counting system
removes this limitation.

Traffic surveillance control: This application can be installed almost anywhere,
since it only requires a camera and some wiring (to establish connectivity with the
central system). If traffic is heavy at one place, an officer monitoring that
area can forward the information to the next toll officer so that they can
prepare beforehand.

• MACHINE LEARNING VS DEEP LEARNING


• DEEP LEARNING
Deep learning is a subset of machine learning. A deep learning model is
essentially a neural network with three or more layers.

• Different Computer Vision Tasks

The area of Computer Vision deals broadly with tasks that humans perform
through sight and visual perception.

Image Classification is a task where an image is assigned to one or more
classes, depending on the task.

Image/Object Localization is a regression problem where the output is the x and y
coordinates of a bounding box drawn around the object of interest.

Object Detection is the ability to correctly identify the objects in a given image
along with their spatial positions, in the form of rectangular boxes (known as
bounding boxes) that enclose each object.

• TYPES OF DETECTORS

• IMAGE ANNOTATING TOOLS


1. Labelbox
2. Scale AI
3. SuperAnnotate
4. Dataloop
5. Playment
6. Supervise.ly
7. Hive Data
8. CVAT
9. LabelMe
10. LabelImg
11. VoTT
12. Img Lab

• LabelImg
LabelImg is a graphical image annotation tool. It is written in Python and
uses Qt for its graphical interface. Annotations are saved as XML files in
PASCAL VOC format, the format used by ImageNet. It also supports the
YOLO and CreateML formats.
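
To illustrate the PASCAL VOC format mentioned above, the following minimal sketch reads one annotation with Python's standard xml.etree.ElementTree module and converts each box to the normalized YOLO form (class index, box centre, width and height as fractions of the image size). The sample XML, the file name in it and the class order are assumptions made up for this example; they are not data from the internship.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the PASCAL VOC layout that LabelImg writes.
SAMPLE_XML = """
<annotation>
  <filename>frame_0001.jpg</filename>
  <size><width>1280</width><height>720</height><depth>3</depth></size>
  <object>
    <name>car</name>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>300</xmax><ymax>400</ymax></bndbox>
  </object>
  <object>
    <name>bus</name>
    <bndbox><xmin>640</xmin><ymin>90</ymin><xmax>1280</xmax><ymax>450</ymax></bndbox>
  </object>
</annotation>
"""

# Assumed class order; the five labels are the ones used in this report.
CLASSES = ["car", "motorbike", "truck", "auto", "bus"]


def parse_voc(xml_text):
    """Return (image width, image height, [(label, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text)
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        corners = [int(obj.findtext(f"bndbox/{tag}"))
                   for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((label, *corners))
    return width, height, boxes


def voc_to_yolo(width, height, box):
    """Convert a corner-format box to (class id, x centre, y centre, w, h), normalized."""
    label, xmin, ymin, xmax, ymax = box
    return (
        CLASSES.index(label),
        (xmin + xmax) / 2 / width,
        (ymin + ymax) / 2 / height,
        (xmax - xmin) / width,
        (ymax - ymin) / height,
    )


width, height, boxes = parse_voc(SAMPLE_XML)
yolo_rows = [voc_to_yolo(width, height, b) for b in boxes]
for row in yolo_rows:
    print(row)
```

For a real dataset, the same two functions would be applied to the text of each saved ".xml" file in turn.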

• General architecture of the vehicle counting system.



• KNOWLEDGE ACQUIRED DURING INTERNSHIP

Goal:

Counting and classifying vehicles using a deep neural network.

What we use

Deep learning methods use data to train neural networks
to perform a variety of machine learning tasks, such as
classifying objects into different classes. Convolutional neural
networks are deep learning algorithms that are particularly powerful
for image analysis.

Convolutional neural networks (CNN)

A CNN is a powerful algorithm for image processing. CNNs
are currently among the best algorithms available for the automated
processing of images. Many companies use them for tasks such as
identifying and tracking the objects in an image.

THREE LAYERS OF CNN:

CNNs are mainly used in image analysis tasks such as image
recognition, object detection and segmentation.

There are three types of layers in a convolutional neural network:

1) Convolutional layer

2) Pooling layer

3) Fully connected layer

FLOWCHART OF WORKING OF CNN

INPUT -> CONVOLUTION -> MAXIMUM POOLING -> CONVOLUTION -> MAXIMUM POOLING
-> FLATTENED -> FULLY CONNECTED LAYER -> OUTPUT

Note: The convolution and pooling layers may repeat ‘n’ times.
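
The flow above can be made concrete with a small sketch. The code below pushes a tiny 6x6 grayscale "image" through one convolution, one 2x2 maximum pooling step, a flatten step and one fully connected layer, using only plain Python lists. The image, the kernel and the weights are hand-picked values for illustration only. A real network learns these values during training and would normally be built with an optimized deep learning library.

```python
def conv2d(image, kernel):
    """2-D convolution as used in CNNs (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Non-overlapping maximum pooling with a size x size window."""
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


def flatten(feature_map):
    """Turn a 2-D feature map into a flat list of values."""
    return [v for row in feature_map for v in row]


def fully_connected(vector, weights, bias):
    """One output score per row of weights."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


# Toy 6x6 image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hand-picked 3x3 kernel that responds to a dark-to-bright vertical edge.
kernel = [[-1, 0, 1]] * 3

features = max_pool(conv2d(image, kernel))  # 6x6 -> 4x4 -> 2x2
vector = flatten(features)                  # 2x2 -> 4 values
# Made-up weights for two output scores ("edge present", "edge absent").
scores = fully_connected(vector, [[0.25] * 4, [-0.25] * 4], [0.0, 1.0])
print(features, vector, scores)
```

Repeating the convolution and pooling steps, as the note says, would simply mean calling conv2d and max_pool again on the pooled feature map before flattening.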



YOLO Algorithm
YOLO is an abbreviation for ‘You Only Look Once’. It is an
algorithm that detects and recognizes various objects in a picture
in real time. The YOLO algorithm employs convolutional neural
networks (CNN) to do this.

Working of YOLOv3
The YOLOv3 algorithm first divides an image into a grid. Each grid
cell predicts a fixed number of bounding boxes, each derived from a
predefined anchor box, around objects that score highly for one of the
predefined classes.

Each bounding box has a confidence score that expresses how accurate
the model expects the prediction to be, and each bounding box detects
only one object. The anchor boxes are generated by clustering the
dimensions of the ground truth boxes from the original dataset to find
the most common shapes and sizes.
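
The clustering step described above can be sketched as follows. A common way to do it is k-means on the (width, height) pairs of the ground truth boxes, using 1 - IoU (intersection over union) as the distance so that large and small boxes are treated fairly. The box sizes below are made-up values, and the starting points are chosen in a simple deterministic way for this example; this is a sketch under those assumptions, not the training code used in the project.

```python
def iou_wh(box, anchor):
    """IoU of two boxes given only (width, height), as if they shared one corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iterations=50):
    """Cluster (width, height) pairs into k anchor shapes; highest IoU = nearest."""
    # Deterministic start: k boxes spread evenly through the list sorted by area.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    anchors = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        # Assign every box to the anchor it overlaps most.
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            clusters[best].append(box)
        # Move each anchor to the mean width and height of its cluster.
        updated = [
            (sum(b[0] for b in c) / len(c), sum(b[1] for b in c) / len(c)) if c else a
            for c, a in zip(clusters, anchors)
        ]
        if updated == anchors:
            break
        anchors = updated
    return sorted(anchors, key=lambda a: a[0] * a[1])


# Made-up ground truth box sizes in pixels: three small and three large vehicles.
boxes = [(20, 40), (22, 38), (18, 42), (120, 80), (130, 90), (110, 70)]
anchors = kmeans_anchors(boxes, k=2)
print(anchors)
```

With these sample sizes the two resulting anchors correspond to one small, tall shape and one large, wide shape, which is the kind of "most common shapes and sizes" summary the detector starts from.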

Other comparable algorithms that can carry out the same objective are
R-CNN (Region-based Convolutional Neural Networks, introduced in 2014),
Fast R-CNN (an improvement on R-CNN, introduced in 2015) and Mask
R-CNN (introduced in 2017).

However, unlike two-stage systems such as R-CNN and Fast R-CNN, which
first propose candidate regions and then classify them, YOLO predicts
the classes and the bounding boxes in a single pass over the image.

• MY RESPONSIBILITIES/POSITION IN INTERNSHIP

Workings
1. Images were collected from traffic cameras at various
locations.
2. The images were divided into two folders, namely “test” and
“train”.
3. The images were annotated/labelled using a software tool. The
annotations fall into 5 categories, namely “car”, “motorbike”,
“truck”, “auto” and “bus”. Labels identify the components in the
data that the model is trained to recognize in datasets that are
not labelled. High-quality datasets are essential for computer
vision and for building a high-performing model. Creating computer
vision models follows the “garbage in, garbage out” philosophy,
which means that labelling images carefully and accurately is
important.
4. For this annotation, a popular open-source tool known as
“labelImg” was used. We fed the collected images into this tool and
annotated/labelled them. After annotation, the labels for each image
were saved in the preferred location in “.xml” format. The tool is
written in Python and uses Qt for its graphical interface.
5. In this way, all the frames/images were annotated.

6. The annotated images were then used to train an existing deep
learning model.

7. Once training was finished, the model was evaluated. If testing is
done on the same data used for training, the accuracy obtained is not
a reliable measure, so unseen data (i.e. the test data) is used here.
Evaluating on test data gives an accurate measure of the performance
and speed of the model.

8. Finally, we fed in the input data (traffic camera video), and the
output was obtained with the vehicles classified accordingly.
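
The final step can be sketched as follows. Assuming the detector has already produced one (class label, confidence score) pair per unique vehicle, the counting itself is a simple tally per class. The detections and the confidence threshold of 0.5 below are hypothetical values for illustration. On real video, each vehicle would also have to be tracked across frames so that it is counted only once.

```python
from collections import Counter

# The five annotation categories used in this report.
CLASSES = ("car", "motorbike", "truck", "auto", "bus")


def count_vehicles(detections, min_confidence=0.5):
    """Count detections per class, skipping low-confidence and unknown labels."""
    counts = Counter({name: 0 for name in CLASSES})
    for label, confidence in detections:
        if label in counts and confidence >= min_confidence:
            counts[label] += 1
    return counts


# Hypothetical detector output: (class label, confidence score) per vehicle.
detections = [
    ("car", 0.91), ("car", 0.88), ("bus", 0.76),
    ("auto", 0.64), ("motorbike", 0.42), ("truck", 0.81),
]
counts = count_vehicles(detections)
for name in CLASSES:
    print(f"{name}: {counts[name]}")
```

In this sample, the motorbike detection falls below the threshold and is therefore not counted.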

Required system configuration for better performance:


Dual-core Intel Xeon processor (2.20 GHz) with 256 GB of DDR4 RAM and
one Titan X graphics processing unit (GPU)

SOME PICTURES OF MY WORKINGS



• CONCLUSION:
From my Summer Fellowship Programme 2022 internship, I hope that
this project will be very helpful to the Chennai City Police in
data analysis and also for road safety. I gained a better
understanding of how deep learning works, its applications and how
effective it is. I enjoyed working with the team. Overall, I found the
internship experience to be positive, and I am sure I will be able to use
the skills I learned later in my career.

REFERENCE
1. Training a deep learning architecture for vehicle detection using limited
heterogeneous traffic data
Deepak Mittal; Avinash Reddy; Gitakrishnan Ramadurai; Kaushik Mitra; Balaraman Ravindran

Link: https://ieeexplore.ieee.org/abstract/document/8328279

2. Detecting Vehicles on the Edge: Knowledge Distillation To Improve
Performance in Heterogeneous Road Traffic
Manoj Bharadhwaj; Gitakrishnan Ramadurai; Balaraman Ravindran. Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2022, pp. 3192-3198

Link: https://openaccess.thecvf.com/content/CVPR2022W/AICity/html/Bharadhwaj_Detecting_Vehicles_on_the_Edge_Knowledge_Distillation_To_Improve_Performance_CVPRW_2022_paper.html
