Intelligent Billing System Using Object Detection
Project report submitted to
Visvesvaraya National Institute of Technology, Nagpur in
partial fulfillment of the requirements for the award of
the degree
Bachelor of Technology
in
“Electronics and Communication Engineering”
by
2022
Declaration
We, Neeraj Chidella, N Kalyan Reddy, N Sai Dheeraj Reddy and Maddi
Mohan, hereby declare that this project work titled “Intelligent Billing System Using Object Detection”
Date:
Certificate
This is to certify that the project titled “Intelligent Billing System Using Object Detection”
Dr. A. G. Keskar
Head, Department of Electronics and Communication Engineering
VNIT, Nagpur
Date:
ACKNOWLEDGEMENT
The completion of this project could not have been possible without the support
and participation of a number of people who have always given their valuable
suggestions. We sincerely appreciate the constant guidance, support and inspiration of all
those who are involved in bringing success to this project.
We wish to take this opportunity to acknowledge, with a deep sense of gratitude
and respect, our Project Guide, Dr. Joydeep Sengupta, Assistant Professor, Department
of Electronics and Communication Engineering, VNIT, Nagpur, for his patient guidance,
constructive suggestions and constant encouragement throughout the period of this work.
Also, we would like to thank Dr. A. G. Keskar, Professor and Head of Department,
Department of Electronics and Communication Engineering, VNIT, Nagpur and Dr. P.M.
Padole, Director, VNIT, Nagpur for giving us the golden opportunity to work on this
wonderful project on the topic “Intelligent Billing System Using Object Detection”,
which helped us in doing a lot of research. We sincerely thank them for providing
every facility required for the successful completion of the project, despite the
difficult circumstances of the pandemic.
We would also like to thank our family and friends, since no attempt at any
level can be satisfactorily completed without their support and encouragement.
Finally, we are deeply indebted to everyone mentioned above for their moral support
throughout the project.
Neeraj Chidella
N Kalyan Reddy
N Sai Dheeraj Reddy
Maddi Mohan
ABSTRACT
LIST OF FIGURES
1.1. People waiting in shopping malls and road side shops ………… 2
1.2. Barcode Scanners which are used to scan the barcodes. ……... 2
4.3. List of all classes and their accuracies using YOLOv4 …………………… 22
4.8. Results obtained using YOLOv5 along with confidence scores ………. 25
4.9. Result showing actual class and predicted class using CNN ………… 25
LIST OF ACRONYMS
LIST OF PUBLICATIONS
Abstract - With rapid advances in machine learning, deep learning and artificial
intelligence, improving the billing system is an effective means of reducing wasted
time. Although barcode scanners have become faster than ever, fruits and vegetables
still have to be entered manually into the computer, which is a time-consuming and
hectic process. Vegetable and fruit markets have become an integral part of our lives,
so the environment in such places must be hassle-free and, more importantly, the
billing should be less laborious and more efficient. To overcome the problems
associated with barcodes and RFID tags, we propose an automatic billing system that
detects fruits and vegetables and then displays the final bill. The main objective of
this project is to detect the fruits, display the detected items and then bill them.
To achieve this, we used two different approaches: a fine-tuned Convolutional Neural
Network built from a base model, and, to increase accuracy for real-time object
detection and to display bounding boxes, the state-of-the-art YOLO family implemented
in PyTorch, since YOLO predicts bounding boxes and detects objects faster than other
detection algorithms and is more reliable.
INDEX
ABSTRACT …………………………………………………………… i
LIST OF FIGURES …………………………………………………... ii
LIST OF ACRONYMS ………………………………………………. iv
LIST OF PUBLICATIONS ………………………………………….. v
CHAPTER 1
INTRODUCTION
1.1. Objective
Wastage of time has been a major issue for many years. The world is constantly
evolving, and people are continuously competing with each other to be more productive
in less time. Automation of everyday processes has made life easier for mankind;
monotonous jobs are steadily being replaced by machines powered by artificial
intelligence and machine learning.
Fig 1.1. People waiting in shopping malls and road side shops
Nowadays, people want to spend more time with family and friends and maintain their
peace of mind instead of wasting time on monotonous tasks. One such task is the
billing process currently followed in India.
Billing in India today is mostly based on barcode scanners and RFID tags. This
process is acceptable in sparsely populated areas where malls and markets are not
crowded. In metropolitan cities, fruit and vegetable markets, and other densely
populated areas, however, scanning the barcode of every item in the checkout bag and
then waiting for the final bill takes a long time. This leads to long queues in
supermarkets, which in turn increases waiting times and decreases customer
satisfaction.
Fig 1.2. Barcode Scanners which are used to scan the barcodes
As the market for daily products, vegetables and fruits is huge, unsatisfied
customers will easily change where they shop if this problem persists.
1.2. Problem Flow
● Joseph Redmon, Santosh Divvala, Ross Girshick [1] proposed the algorithm of
YOLO (You Only Look Once) for the purpose of object detection. They used
bounding-box technique to detect objects in an image. This algorithm divides the
image into small grids and then checks whether the object is present in that
particular grid or not.
● Md Jan Nordin, Norshakirah Aziz, Ooi Wei Xin [4] proposed a model to apply
object detection using CNN on grocery objects. They proposed a model for
billing the obtained objects after object detection. They also created a website for
calculating the total bill of the detected objects.
● Chengji Liu, Yufan Tao, Jiawei Liang, Kai Li and Yihang Chen [5] developed an
object detection model using the YOLO algorithm. They proposed a method for
tackling real-world image-capture problems such as blurring and noise using image
degradation models based on YOLO.
● Xiaofeng Ning, Wen Zhu, Shifeng Chen [7] developed a model of image
recognition, object detection and segmentation for images of white background.
They used faster R-CNN algorithm for detection of objects.
● Kavan Patel [8] proposed a model to detect fruits and vegetables using YOLO
algorithm and developed a self-checking portal for the customers.
● Huimin Yuan, Ming Yan [10] proposed an intelligent food identification model
based on Cascade R-CNN and computer vision techniques, achieving good results with
high accuracy.
● Suraj Chopade, Prof. Smita Palnitkar, Sujit Chavan, Anirudha Deshpande [11]
implemented the automated billing system using image processing techniques.
This model detects the objects using image processing and the detected objects
are sent for billing.
● E. K. Jose and Veni. S [12] developed a YOLO based model for finding the open
parking space. They developed this model by detecting multiple objects in the
area using YOLO and hence open parking spaces were found.
● J. Redmon and A. Farhadi [13] introduced the YOLO9000 algorithm, which can
detect over 9000 object categories. They used the COCO dataset for
implementing this algorithm and obtained good results.
● G. M. Farinella, D. Allegra, M. Moltisanti [14] developed a model for
understanding the food items present in an image based on various computer
vision techniques. This model monitors the food intake of a person based on the
images of the food he takes.
● Zhuang-Zhuang Wang, Kai Xie, Xin-Yu Zhang, Hua-Quan Chen, Chang Wen,
Jian-Biao He [15] proposed a model to detect small objects present in an image
using YOLO and Dense Block. They used the Image Super Resolution technique.
1.4. Approach
For simplicity and ease of understanding, the functioning of the system has been
divided into major tasks.
1. First, a dataset should be found or created according to the model being
trained, then pre-processed and annotated, with techniques such as rotation and
augmentation applied.
2. Object detection and recognition of the different classes of objects is the
second task, performed using an object detection model: a CNN, You Only Look Once
version 4 (YOLOv4) and YOLOv5.
3. The third task is to supervise the results obtained after training all three
types of models.
4. The fourth task is to apply the trained weights to a set of detection images
and check the results, confidence scores and mAP (mean average precision) of the
model.
5. The fifth task is to assess the model's performance, make changes to improve
its accuracy, check whether the model satisfies our objectives and verify the
integrity of the detected objects.
6. The sixth task is to map the detected objects to their prices through Python
code.
7. The seventh task is to integrate the model after mapping the detected objects
to prices, producing a final bill by summing over all detected objects.
8. The last task is to build a webpage using PyTorch and Flask, integrating the
billing model through a Python script.
1.5. Datasets
1. The Fruits-360 dataset was used for training the model with the CNN algorithm.
This dataset is taken from Kaggle.
This dataset was chosen because it has a very large set of images, which helps in
achieving high accuracy and hence reliable detection.
● 90483 total images, split into train and test sets. The train set
consists of 67692 images and the test set consists of 22688 images.
2. A Custom Dataset was created in YOLOv4 format by collecting the images from
the Open Images Dataset.
● 3850 total images which are split into train and test sets. The train set
consists of 3650 images and the test set consists of 200 images.
Here are some of the images from this dataset
3. Another custom dataset was created by collecting images from two different
datasets on Kaggle and converting them into YOLO format using Roboflow, as the
images are pre-annotated.
● 1705 total images with 1654 images in the train set and 51 images in the
test set.
CHAPTER 2
METHODOLOGY
Convolution Layer
In this step, important features are extracted from an image. Many filters in a
convolution layer perform the convolution operation. Every image is seen as a matrix
of pixel values.
ReLU layer
The rectified linear unit is abbreviated as ReLU. After the feature maps have been
extracted, they are passed to a ReLU layer. ReLU goes through each element one by
one, converting all negative pixel values to zero. This makes the network non-linear,
and the result is a rectified feature map. For feature detection, the original image
is scanned with numerous convolution and ReLU layers.
Pooling Layer
The pooling layer down-samples each rectified feature map, reducing its
dimensionality while retaining the most important features. The overall pipeline is:
• The image is processed with many convolution and ReLU layers to locate
features.
• Pooling layers with various filters are used to identify specific parts of the
image.
• The pooled feature map is flattened and sent to a fully connected layer to
classify the image.
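The three operations above (convolution, ReLU, max pooling) can be illustrated with
a minimal NumPy sketch. This is a didactic toy, not the project's fine-tuned CNN;
real training used a deep learning framework:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (really cross-correlation, as in most CNN libraries)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Set every negative activation to zero, keeping positives unchanged."""
    return np.maximum(x, 0)

def max_pool(x, size=2):
    """Down-sample by taking the maximum over non-overlapping size x size blocks."""
    h, w = x.shape
    return x[:h - h % size, :w - w % size] \
        .reshape(h // size, size, w // size, size).max(axis=(1, 3))

# Tiny demo: a 4x4 "image" pooled down to 2x2 block maxima.
img = np.arange(16, dtype=float).reshape(4, 4)
pooled = max_pool(img)  # -> [[5, 7], [13, 15]]
```

Stacking `conv2d`, `relu` and `max_pool` in sequence mirrors the layer order
described in this section.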
2.1.2. Object Detection Using YOLO
● Residual blocks
● Bounding box regression
● Intersection Over Union (IOU)
Residual blocks
The image is divided into an S x S grid, and bounding boxes and class probabilities
are predicted for each grid cell. Image classification and object localization are
applied to each cell, and each cell is assigned a label Y. The algorithm then goes
through the cells one by one, marking the labels of cells that contain objects along
with their bounding boxes; cells that contain no object are labelled zero.
pc: indicates whether an object is present in the grid cell; pc = 1 if present,
else pc = 0.
bx, by, bh, bw: the coordinates of the bounding box of the object (if present).
c: the class of the object (for example, person, fruit, etc.).
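The label vector just described can be sketched as a small helper. The function name,
the three-class setup and the sample coordinates are hypothetical, chosen only to
illustrate the layout y = [pc, bx, by, bh, bw, c1..cn]:

```python
def make_grid_label(object_present, box=None, class_id=None, num_classes=3):
    """Build the YOLO-style label y = [pc, bx, by, bh, bw, c1..cn] for one grid cell.

    box is (bx, by, bh, bw) in grid-relative coordinates; class_id indexes the
    one-hot class vector.
    """
    if not object_present:
        return [0.0] * (5 + num_classes)  # pc = 0; the remaining values are ignored
    one_hot = [0.0] * num_classes
    one_hot[class_id] = 1.0
    return [1.0, *box, *one_hot]

# Cell containing an object of class 1, centred in the cell.
y = make_grid_label(True, box=(0.5, 0.5, 0.2, 0.3), class_id=1)
# Empty cell: pc = 0 and every other entry is "don't care".
y_empty = make_grid_label(False)
```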
The concept of intersection over union (IOU) describes how boxes overlap in object
detection. YOLO uses IOU to produce an output box that properly surrounds the
object. Each grid cell predicts bounding boxes and their confidence scores. The IOU
is 1 if the predicted and actual bounding boxes are identical. This approach
removes predicted bounding boxes that do not match the actual box.
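IOU as described above is straightforward to compute. A minimal sketch, assuming
boxes given as (x1, y1, x2, y2) corner coordinates:

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2) corners."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp to zero so disjoint boxes give no intersection area.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

iou((0, 0, 2, 2), (0, 0, 2, 2))  # identical boxes -> 1.0
iou((0, 0, 2, 2), (1, 1, 3, 3))  # partial overlap -> 1/7
```

Identical boxes give IOU = 1, exactly as stated above, and the score falls toward 0
as the overlap shrinks.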
ACCURACY IMPROVEMENT
When plain bounding boxes are used for object detection, each grid cell can detect
only one object. To detect several objects, anchor boxes are used; any number of
anchor boxes can be used for a single image to detect multiple objects.
2.1.3. Object Detection Using YOLOv5
One of the object detection techniques that uses regression is You Only Look
Once (YOLO). By running the method a single time, all the objects of the various
classes in the image are detected, and bounding boxes are drawn around them. YOLO is
one of the most efficient object detection techniques. YOLOv4 is one of the fastest,
though less precise, object detection algorithms; YOLOv5, the most recent version,
has very high accuracy. It has been trained to recognise objects from 80 different
categories.
Architecture of YOLOv5
Figure 2.4 depicts the Yolov5 network architecture. Yolov5 was chosen as our initial
learner for three reasons. To begin, Yolov5 combined the cross stage partial network
(CSPNet) into Darknet, resulting in the creation of CSPDarknet as the network's
backbone. CSPNet solves the problem of recurrent gradient information in large-scale
backbones by including gradient changes into the feature map, reducing model
parameters and FLOPS (floating-point operations per second), ensuring inference speed
and accuracy while simultaneously reducing model size. In the detection of fruits and
vegetables or grocery, speed and accuracy are critical, and the size of the model impacts
its inference efficiency on resource-limited edge devices. Second, to improve
information flow, the Yolov5 used a path aggregation network (PANet) as its neck.
PANet uses a new feature pyramid network (FPN) topology with an improved bottom-up
approach to improve low-level feature propagation. Simultaneously, adaptive feature
pooling, which connects the feature grid to all feature levels, is employed to ensure that
meaningful information from each feature level reaches the next subnetwork. PANet
improves the use of precise localization signals in lower layers, which can significantly
improve the object's location accuracy. Finally, Yolov5's head, the Yolo layer, generates
three various sizes of feature maps (18 18, 36 36, 72 72) to provide multi-scale
prediction, allowing the model to handle tiny, medium, and large objects.
Fig.2.4. Architecture of YOLOv5 Showing the various layers
The main advantages of using YOLOv5 over earlier YOLO algorithms are:
1) Reduced size: the model is approximately 80 percent smaller than earlier YOLO
models.
2) Speed: YOLOv5, the latest YOLO algorithm, is reported to be up to 150 percent
faster than earlier YOLO versions.
2.2. Detecting grocery objects with CNN
As the fruits-360 dataset from kaggle was ordered, pre annotated and well built
dataset, A basic convolutional neural network was implemented using pre annotation
techniques like Rotation, Augmentation and Splitting. After training the dataset, an
accuracy of 98 percent was achieved on the split verification set. An API model was
incorporated to test the results with outdoor real - life images, but the CNN model wasn't
able to give good results with these images.
Basic CNN was not able to give good results with real life images so after a
thorough research, Yolo (You only look once) detection algorithm is chosen and a proper
dataset is formed using BBox annotation. As images are gathered randomly from Google
images and proper Bbox annotations were given. Though an accuracy of 70 percent was
achieved, many images were found to be unrelated to the dataset hence is the reason for
70 percent.
CHAPTER 3
SOFTWARE USED
3.1. Dataset Creation Tools
After importing the images from the Open Images dataset, to train the YOLO model we
need a .txt file for each image containing information about the bounding box of
each and every object inside that particular image. This annotation is done using
the BBox labelling tool.
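The resulting annotation is one plain-text file per image, one line per object, in
the standard Darknet/YOLO format. The class ids and coordinates below are
hypothetical examples; all coordinates are normalized to the image width and height:

```text
# <class_id> <x_center> <y_center> <width> <height>
0 0.512 0.448 0.310 0.275
2 0.143 0.702 0.095 0.120
```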
· Darknet:
Darknet is an open-source neural network framework used to train our custom model
on our custom datasets using YOLO. The framework contains the source files that
help in training the model on the provided dataset, with the configuration settings
required for the given dataset and model.
· Google Colab:
Colab is a coding environment that runs entirely on the cloud. It helps us execute
our Python code, and is used to clone the Darknet framework and use it with our
custom dataset and model. It provides GPUs and TPUs free of cost for training.
Our trained model is then deployed into a web app built using Flask.
Here, OpenCV is used to read the given model weights and predict the output for
the given input image. A user-friendly GUI, developed using the above-mentioned
software framework, takes an image as input and gives the final bill of all items
as output. The Atom code editor was used for writing all of this code.
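The web app just described can be sketched with Flask. This is a minimal outline
under assumptions: the template names follow Appendix A (page.html, success.html),
while `run_detector` is a hypothetical placeholder standing in for the OpenCV/YOLO
inference code, not the project's actual function:

```python
from flask import Flask, request, render_template

app = Flask(__name__)

def run_detector(image_bytes):
    # Placeholder: in the real app this would decode the image with OpenCV,
    # run the trained YOLO weights, and return the detected class names.
    return []

@app.route("/", methods=["GET"])
def index():
    # Landing page with the image-upload form.
    return render_template("page.html")

@app.route("/bill", methods=["POST"])
def bill():
    # Receive the uploaded cart image, detect items, and render the bill.
    image = request.files["image"].read()
    items = run_detector(image)
    return render_template("success.html", items=items)

if __name__ == "__main__":
    app.run(debug=True)
```

The detected item list returned here would feed the price-mapping step described in
the Approach section to produce the final bill.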
CHAPTER 4
RESULTS AND DISCUSSIONS
Fig 4.3. List of all classes and their accuracies using YOLOv4
4.3. Website Results
4.4. Results of Object Detection using YOLOv5
Fig 4.8. Results obtained using YOLOv5 along with confidence scores
Fig 4.9. Result showing actual class and predicted class using CNN
4.6. Model training metrics
CHAPTER 5
CONCLUSION AND FUTURE SCOPE
5.1. Conclusion
Vegetables, fruits and groceries are daily commodities, so the experience of
purchasing them should be hassle-free. Considering the advances in deep learning and
computer vision, we hope our idea can transform the billing process in India and
bring change to the existing methods.
In this project, we have proposed a novel solution to reduce wait times, long queues
and the required workforce in supermarkets. An image of the final cart with all the
products bought by a customer is fed as input to the website; every fruit, vegetable
or grocery item is identified and tallied, and a final bill is produced for the
customer after summing. Each product no longer needs to be scanned with a barcode or
RFID tag; instead, an image of all the cart products can be fed as input and the
customer can directly access the bill.
The proposed web application is designed to make billing easier for everyone by
creating a hassle-free experience and reducing wait times. It is also hands-free, so
the risk of transmitting any kind of virus decreases, which is good for customers'
health. The number of people working at the counters can be reduced with this
method, as fewer counters would be needed due to less demand for billing. Prices of
vegetables and fruits could also be standardized if this method were employed, as
everyone would follow the same dataset and could use the same prices, although the
dataset can also be modified according to the shopkeeper's needs.
The main objective of reducing wait times and speeding up the billing process is
achieved through this project.
5.2. Future Scope
In the coming years, we would like to fix some of the shortcomings of the project
and improve the detection accuracy of our model. We would also like to create a
separate, properly annotated dataset with a greater number of classes and many more
images per class, so that the model can learn distinguishing features more easily
and give more accurate results.
We would also like to extend this project by adding a few more features, such as
video input, which would take a video feed and give live billing output. We would
also like to integrate available offers, discounts and taxes into the final bill.
An app integrating the detection and billing system is proposed for the near
future. The dataset also needs improvement, which would in turn improve the model's
accuracy.
Once its accuracy is improved and it is integrated with mobile applications, this
model can be taken to market for business purposes.
In our observation, one of the major problems during training is computational
time. As technology improves, the time taken to train and test the model should
decrease substantially; we could then incorporate many more classes with a huge
dataset and train more complex and improved versions of the YOLO algorithms, making
the model more accurate while remaining usable with fewer resources. This project
could then be used efficiently even on small devices, for example in small stores
where the owner cannot afford RFID tags or barcode scanners and can directly use a
mobile phone to bill the items with a simple photo.
APPENDIX A
The GUI created is integrated with the trained model using a Python (app.py)
backend. The templating of the GUI is done using HTML (success.html, page.html),
and styling is done using CSS (styles.css) and Bootstrap. All the code files are
listed below.
app.py
page.html
success.html
styles.css
REFERENCES
[1] Joseph Redmon, Santosh Divvala, Ross Girshick. “You Only Look Once:
Unified, Real-Time Object Detection”. The IEEE Conference on Computer Vision and
Pattern Recognition (CVPR), 2016, pp. 779-788.
[3] Marcus Klasson, Cheng Zhang, Hedvig Kjellstrom. “A hierarchical grocery store
image dataset with visual and semantic labels”.
[4] Md Jan Nordin, Norshakirah Aziz, Ooi Wei Xin. “Food image recognition for
price calculation using convolutional neural network”.
[5] Chengji Liu, Yufan Tao, Jiawei Liang, Kai Li and Yihang Chen. “Object detection
based on YOLO network”. 2018 IEEE 4th Information Technology and Mechatronics
Engineering Conference (ITOEC 2018).
[7] Xiaofeng Ning, Wen Zhu, Shifeng Chen. “Recognition, Object Detection and
Segmentation of white background photos based on Deep Learning”.
[8] Kavan Patel. “Fruits and vegetable detection for POS with Deep Learning”.
[10] Huimin Yuan, Ming Yan. “Food object recognition and intelligent billing system
based on Cascade R-CNN”. 2020 International Conference on Culture-oriented Science
and Technology (ICCST).
[11] Suraj Chopade, Prof. Smita Palnitkar, Sujit Chavan, Anirudha Deshpande.
“Automated Super Shop using image processing (Python)”. International Journal of
Future Generation Communication and Networking Vol. 13, No. 2s, (2020), pp.
382–388.
[12] E. K. Jose and Veni. S. “YOLO classification with multiple object tracking for
vacant parking lot detection”. Journal of Advanced Research in Dynamical and Control
Systems, vol. 10, pp. 683-689, 2018.
[13] J. Redmon and A. Farhadi. “YOLO9000: Better, Faster, Stronger”. 2017 IEEE
Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, 2017,
pp. 6517-6525.
[15] Zhuang-Zhuang Wang, Kai Xie, Xin-Yu Zhang, Hua-Quan Chen, Chang Wen,
Jian-Biao He. “Small-Object Detection Based on YOLO and Dense Block via Image
Super-Resolution”. IEEE Access, vol. 9, pp. 56416-56429.