
Big Data Frameworks

2D Object Detection With YOLO and RDD


Introduction

The You Only Look Once (YOLO) algorithm, renowned for its high accuracy and real-time performance, is a popular method for object detection.

Distributed computing frameworks such as Apache Spark can significantly speed up processing when YOLO is scaled up to large datasets.

Combining YOLO with Spark and its Resilient Distributed Dataset (RDD) abstraction therefore provides a strong method for locating objects in sizable image and video datasets.
Problem Statement

Traditional object detection systems are often computationally intensive and cannot process large-scale datasets in real time.

The problem is to develop an efficient and scalable system for detecting objects in large-scale image datasets using YOLO, Spark, and RDD.

The proposed system combines deep learning-based object detection with YOLO, distributed computing with Spark and RDD, and efficient data pre-processing techniques to achieve high accuracy, scalability, and efficiency in object detection tasks.
Objective

The aim of 2D object detection using YOLO, Spark, and RDD is to identify and locate items of interest in a given picture or video frame with high accuracy and efficiency.

Bounding boxes and class probabilities are generated for recognised objects by the YOLO model after it processes input pictures or video frames.

Large-scale datasets are distributed and processed in parallel using the Spark framework.

The RDD data structure is used to partition data across a cluster of nodes efficiently and fault-tolerantly, so that the data is processed in a distributed and parallel way.
Why RDD?

RDDs are immutable distributed collections of objects that can be processed in parallel across a cluster of machines.

RDDs process data in a fault-tolerant manner: lost partitions can be recomputed from their lineage, so the system can recover from machine failures without data loss.

The split() function is applied to each line in the RDD, splitting the line at the comma delimiter and creating a list of labels.

The map() function extracts the first element (the class name) from each list in the RDD. The count() function then returns the number of elements in the resulting RDD; to count only the unique class names, distinct() must be applied before count().
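
The label-counting steps above can be tried on a small in-memory list. This is a minimal sketch in plain Python rather than Spark itself: the sample lines, their column layout, and the helper names `split_line` and `class_name` are hypothetical, and the comment shows the RDD chain these steps would roughly correspond to if the labels were stored in a comma-separated text file.

```python
# Hypothetical label lines in "class,x,y,width,height" form (an assumed layout).
lines = [
    "car,10,20,50,40",
    "person,5,5,20,60",
    "car,100,80,45,38",
    "dog,30,30,25,25",
]

def split_line(line):
    """Split one label line at the comma delimiter."""
    return line.split(",")

def class_name(fields):
    """Take the first element of a split line as the class name."""
    return fields[0]

# With PySpark, the same steps would read roughly:
#   sc.textFile(path).map(split_line).map(class_name).distinct().count()
names = [class_name(split_line(line)) for line in lines]
total_labels = len(names)          # what count() alone would return
unique_classes = len(set(names))   # what distinct().count() would return

print(total_labels, unique_classes)
```

The gap between the two numbers shows why a distinct step is needed when the goal is the number of classes rather than the number of labelled objects.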
Methodology
Data preparation is the first step, which usually entails gathering and labelling
a sizable dataset of images or videos.

The YOLO model must now be trained using the prepared dataset. The model
architecture must be configured, the training parameters must be established,
and the training procedure must be carried out. 

By adjusting its weights in response to a set of labelled examples, the YOLO model gains the ability to recognise objects in images during training.

After training, the YOLO model can be integrated with Spark: the model is loaded into memory and the processing of input data is spread across a cluster of computers.

The data is represented as RDDs, and the parallel computations are carried out through the RDD abstraction.

After the YOLO model and the Spark-RDD integration have been set up, object detection is run on the input data.

In this step, the YOLO model processes the input data and produces a set of bounding boxes.
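
One common way to run a model over distributed data is to load it once per partition rather than once per record. The sketch below illustrates that pattern under stated assumptions: `load_model` is a hypothetical stand-in that returns fixed placeholder detections instead of running a real YOLO network, and the file names are invented. The function is applied to a plain Python iterator here; the comment shows how it could be handed to Spark.

```python
def load_model():
    """Hypothetical stand-in for loading trained YOLO weights.

    Returns a callable that maps an image path to a list of
    (class_name, confidence, x, y, width, height) tuples.
    """
    def detect(image_path):
        # Placeholder output; a real model would run a forward pass here.
        return [("object", 0.9, 0.5, 0.5, 0.2, 0.2)]
    return detect

def detect_partition(image_paths):
    """Run detection over one partition, loading the model only once."""
    model = load_model()
    for path in image_paths:
        yield path, model(path)

# With PySpark this function could be passed to mapPartitions, e.g.
#   sc.parallelize(paths, numSlices=4).mapPartitions(detect_partition)
# Here it is applied to a plain iterator to show the shape of the output.
results = list(detect_partition(iter(["img_001.jpg", "img_002.jpg"])))
print(results)
```

Loading the model inside the per-partition function keeps the expensive setup cost proportional to the number of partitions, not the number of images.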
Model

The model used for 2D object detection with YOLO, Spark, and RDD is built on the You Only Look Once (YOLO) method, a deep learning model for object recognition.

For each input image, the YOLO method predicts a set of bounding boxes together with class probabilities in a single pass through the network.

The YOLO model is a deep neural network made up of numerous convolutional layers.

During training, the YOLO model is tuned to reduce the sum of squared errors between the predicted bounding boxes and the labelled ground-truth boxes.
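
The squared-error idea can be illustrated on box coordinates alone. This is a simplified sketch, not the full training objective: it assumes each predicted box has already been paired with a labelled box, the function name `box_sse` and the sample values are hypothetical, and the complete YOLO loss also contains confidence and class terms that are left out here.

```python
def box_sse(predicted, target):
    """Sum of squared errors over paired (x, y, width, height) boxes."""
    total = 0.0
    for pred_box, true_box in zip(predicted, target):
        for p, t in zip(pred_box, true_box):
            total += (p - t) ** 2
    return total

# Two predicted boxes and their assumed ground-truth counterparts.
predicted = [(0.5, 0.5, 0.2, 0.2), (0.1, 0.3, 0.4, 0.4)]
target = [(0.5, 0.5, 0.2, 0.2), (0.2, 0.3, 0.4, 0.2)]

loss = box_sse(predicted, target)
print(loss)
```

The first pair matches exactly and contributes nothing; the second pair differs in x and height, so only those two terms add to the loss.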

Once trained, the YOLO model may be combined with Apache Spark and RDD to distribute and parallelize the processing of huge datasets.

During inference, the YOLO model runs an input picture or video frame through the neural network to produce a collection of bounding boxes.

The output may be post-processed to remove false positives and to display the detected objects.
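
The slides do not say which post-processing is used. A common choice for detectors of this kind is a confidence threshold followed by non-maximum suppression (NMS), sketched below under that assumption. The function names, the threshold values, and the sample boxes are hypothetical, boxes are taken to be in corner format (x1, y1, x2, y2), and the suppression is class-agnostic for brevity.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def filter_detections(detections, min_conf=0.5, max_iou=0.45):
    """Drop low-confidence boxes, then suppress overlapping duplicates.

    detections: list of (confidence, (x1, y1, x2, y2)) tuples.
    """
    kept = []
    candidates = sorted(
        (d for d in detections if d[0] >= min_conf),
        key=lambda d: d[0],
        reverse=True,
    )
    for conf, box in candidates:
        # Keep a box only if it does not overlap a stronger kept box too much.
        if all(iou(box, kept_box) <= max_iou for _, kept_box in kept):
            kept.append((conf, box))
    return kept

detections = [
    (0.90, (10, 10, 50, 50)),
    (0.80, (12, 12, 52, 52)),       # overlaps the first box heavily
    (0.30, (200, 200, 240, 240)),   # below the confidence threshold
    (0.75, (100, 100, 140, 140)),
]
kept = filter_detections(detections)
print(kept)
```

In this sample, the second box is suppressed as a duplicate of the first and the third is discarded for low confidence, leaving two detections.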
Architecture
The input to YOLOv3 is an image of size 416x416 pixels.

The YOLOv3 architecture consists of 106 layers, which include the Darknet-53 backbone network and the detection layers.

The exact number of layers can vary depending on the specific configuration of the model.

Darknet-53, a variation on the residual network (ResNet) architecture, serves as the YOLOv3 backbone network. The YOLOv3 authors report that Darknet-53 reaches accuracy comparable to the deeper ResNet-101 and ResNet-152 backbones while requiring fewer operations and running faster.

The YOLOv3 architecture has three detection layers, each of which detects objects at a different scale. The detection layers use anchor boxes to help predict the location and size of objects.
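
The three scales can be made concrete with a little arithmetic. The sketch below assumes the standard YOLOv3 configuration, in which the detection layers operate at strides of 32, 16, and 8 and each grid cell uses three anchor boxes; these values are not stated in the slides and may differ in other configurations.

```python
INPUT_SIZE = 416          # input resolution given for YOLOv3
STRIDES = (32, 16, 8)     # assumed downsampling factor at each detection scale
ANCHORS_PER_SCALE = 3     # assumed anchor boxes per grid cell

# Grid side length at each scale, and the boxes predicted at that scale.
grid_sizes = [INPUT_SIZE // s for s in STRIDES]
boxes_per_scale = [g * g * ANCHORS_PER_SCALE for g in grid_sizes]
total_boxes = sum(boxes_per_scale)

print(grid_sizes)    # [13, 26, 52]
print(total_boxes)   # 10647
```

The coarse 13x13 grid is suited to large objects and the fine 52x52 grid to small ones, which is why most candidate boxes come from the finest scale.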
The output of YOLOv3 is a list of bounding boxes and their corresponding class probabilities.

Each bounding box is represented by four coordinates (x, y, width, height), and each class probability represents the likelihood of the object belonging to a particular class.
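
For drawing or for overlap calculations, boxes in (x, y, width, height) form are often converted to corner coordinates. The sketch below assumes that x and y denote the box centre and that all four values are normalised to the range 0 to 1, which is a common YOLO convention but is not stated in the slides; the function name `center_to_corners` is hypothetical.

```python
def center_to_corners(box, image_width, image_height):
    """Convert a normalised (x, y, width, height) box to pixel corners.

    Assumes x and y are the box centre and all four values lie in [0, 1].
    """
    x, y, w, h = box
    x1 = (x - w / 2) * image_width
    y1 = (y - h / 2) * image_height
    x2 = (x + w / 2) * image_width
    y2 = (y + h / 2) * image_height
    return x1, y1, x2, y2

# A box centred in a 416x416 image, a quarter as wide and half as tall.
corners = center_to_corners((0.5, 0.5, 0.25, 0.5), 416, 416)
print(corners)
```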
Conclusion

2D object detection using YOLO, Spark, and RDD is a potent method for finding and identifying items of interest in huge picture collections.

The approach can achieve high accuracy, scalability, and efficiency in object detection tasks, enabling a wide range of applications in computer vision.
