Khadijas996chp1 1
1.1 Introduction
This chapter gives a general introduction to the project. It covers the background of the study,
statement of the problem, project questions, aims and objectives of the study, significance of
the study, project artifacts, scope and limitations of the study, methodology, project structure
and definition of terms.
still a challenge for real-time performance, which requires minimal computational complexity.
Humans glance at an image and instantly know what objects are in the image, where they are,
and how they interact. The human visual system is fast and accurate, allowing us to perform
complex tasks like driving with little conscious thought.
The same visual system can also identify multiple objects and detect obstacles with little
conscious thought. With the availability of large datasets, faster GPUs, and better algorithms,
we can now train computers to detect and classify multiple objects within an image with high
accuracy. To do so, we need to understand terms such as object detection, object localization,
and the loss function for object detection and localization, and then explore an object
detection algorithm known as “You Only Look Once” (YOLO). More recent approaches such as R-CNN
use region proposals to generate candidate boxes before making predictions. Unlike sliding-window
and region proposal-based techniques, YOLO sees the entire image during training and test time,
so it implicitly encodes contextual information about classes as well as their appearance. Fast
R-CNN, a top detection method, mistakes background patches in an image for objects because it
cannot see the larger context. YOLO makes less than half the number of background errors
compared to Fast R-CNN. YOLO also learns generalizable representations of objects. When trained
on natural images and tested on artwork, YOLO outperforms top detection methods like DPM and
R-CNN by a wide margin. Since YOLO is highly generalizable, it is less likely to break down
when applied to new domains or unexpected inputs. However, YOLO still lags behind
state-of-the-art detection systems in accuracy. While it can quickly identify objects in images,
it struggles to precisely localize some objects, especially smaller ones.
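One common way to quantify how precisely a detector localizes an object is intersection over union (IoU), which compares a predicted bounding box with a reference box. The source does not name this metric, so the sketch below is only an illustration; the function name `iou` and the corner-coordinate box format `(x_min, y_min, x_max, y_max)` are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap, clamped to zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    # Guard against degenerate (zero-area) boxes.
    return intersection / union if union > 0 else 0.0


# Hypothetical predicted and reference boxes, in pixels.
predicted = (0, 0, 2, 2)
reference = (1, 1, 3, 3)
overlap = iou(predicted, reference)
```

A value close to 1 means the predicted box nearly coincides with the reference box, while a value of 0 means the two boxes do not overlap at all. For a small object, a shift of a few pixels removes a large share of the overlap, which is one way to see why small objects are harder to localize.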
partial occlusions. Therefore, this project will use the YOLO model to identify objects in
real time among several other objects.
The main aim of this project is to develop a system that identifies objects in real time
using a suitable deep learning model.
The following objectives have been identified to fulfill the stated aim of this project work:
1. To investigate and analyze the problems of real-time object detection.
2. To implement a system based on the YOLO model to handle the real-time object detection
problem.
3. To test the new system using appropriate data.
1.8 Scope and Limitation of the Study
The scope of this project is limited to locating important items in an image, drawing
rectangular bounding boxes around them, and determining the class of each item discovered. Other
decision parameters, such as results on and comparisons with other datasets, are not covered.
1.9 Methodology
This simple object detection project was carried out using the YOLO model.
The model was studied through academic journals, conference papers, textbooks and other
online sources. The overall development approach used YOLO together with an iterative
model.
YOLO, an abbreviation of You Only Look Once, is an algorithm that uses neural networks
to provide real-time object detection. The algorithm is popular because of its speed and
accuracy. It has been used in various applications to detect traffic signals, people,
parking meters and animals. The algorithm detects and recognizes various objects in a
picture in real time. Object detection in YOLO is framed as a regression problem, and the
algorithm provides the class probabilities of the detected objects.
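To make the regression framing concrete, the sketch below decodes the output for a single grid cell. It assumes the layout of the original YOLO formulation, in which each cell of an S x S grid predicts a fixed number of boxes as (x, y, w, h, confidence) followed by one probability per class. The source does not say which YOLO version is used, so the function name `decode_cell`, the vector layout and the sample numbers are all illustrative assumptions.

```python
def decode_cell(pred, row, col, grid_size, num_boxes, class_names):
    """Decode one grid cell's prediction vector into its best-scoring detection.

    Assumed layout of pred (hypothetical, for illustration only):
      [x, y, w, h, conf] repeated num_boxes times, then one probability per class.
    x and y are offsets inside the cell in [0, 1]; w and h are fractions of the image.
    """
    class_probs = pred[num_boxes * 5:]
    # Index of the most likely class for this cell.
    best_class = max(range(len(class_names)), key=lambda i: class_probs[i])

    best = None
    for b in range(num_boxes):
        x, y, w, h, conf = pred[b * 5:(b + 1) * 5]
        # Class-specific score: box confidence times class probability.
        score = conf * class_probs[best_class]
        if best is None or score > best["score"]:
            # Convert the cell-relative centre into image-relative coordinates.
            cx = (col + x) / grid_size
            cy = (row + y) / grid_size
            best = {
                "label": class_names[best_class],
                "score": score,
                "box": (cx, cy, w, h),
            }
    return best


# Hypothetical output for the centre cell of a 7 x 7 grid, 2 boxes, 3 classes.
cell_output = [0.5, 0.5, 0.2, 0.4, 0.9,   # box 1
               0.1, 0.1, 0.3, 0.3, 0.2,   # box 2
               0.1, 0.8, 0.1]             # class probabilities
detection = decode_cell(cell_output, row=3, col=3, grid_size=7,
                        num_boxes=2, class_names=["person", "car", "dog"])
```

Because every cell is decoded from the same forward pass, box coordinates and class probabilities for the whole image come out of a single evaluation of the network. A full detector would additionally discard low-scoring boxes and merge overlapping ones, which this sketch leaves out.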
The model was implemented using the Python programming language. Python is a
general-purpose, interpreted, interactive, object-oriented, high-level programming
language. Guido van Rossum began developing it in the late 1980s and first released it in
1991. It is widely used in the area of artificial intelligence because of its rich set of
suitable libraries.
1.10 Project Structure
This project is structured in five chapters. The first chapter includes the introduction,
background of the study, statement of the problem, project questions, project structure, aims
and objectives, significance of the study, project artifacts, scope and limitations, and
methodology. The second chapter contains the literature review and references. The third
chapter contains the chapter introduction, a description of the existing system, a description
of the proposed system, the dataflow diagram, architectural design, system flowchart and system
interface design. The fourth chapter contains the chapter introduction, system design,
implementation and testing. Finally, chapter five contains the summary, conclusion and
recommendations of the project work.
1.11 Definition of Terms
Object Detection is a computer vision technique that allows us to identify and locate objects in
an image or video. With this kind of identification and localization, object detection can be used
to count objects in a scene and determine and track their precise locations, all while accurately
labeling them.
Object Recognition is a computer vision technique for identifying objects in images or
videos. Object recognition is a key output of deep learning and machine learning algorithms.
When humans look at a photograph or watch a video, we can readily spot people, objects, scenes,
and visual details; object recognition aims to give computers a comparable ability.
Deep learning is a subset of machine learning in artificial intelligence that uses multi-layer
neural networks capable of learning from data, including data that is unstructured or
unlabeled. It is also known as deep neural learning or deep neural networks.
Object is anything that has a fixed shape or form that you can touch or see.
YOLO Algorithm is an algorithm based on regression. Instead of selecting interesting regions
of an image, it predicts classes and bounding boxes for the whole image in one run of the
algorithm.
Artificial intelligence (AI) refers to the simulation of human intelligence in machines that are
programmed to think like humans and mimic their actions. The term may also be applied to any
machine that exhibits traits associated with a human mind such as learning and problem-solving.
Machine learning is a method of data analysis that automates analytical model building. It is a
branch of artificial intelligence based on the idea that systems can learn from data, identify
patterns and make decisions with minimal human intervention.
Region Based Convolutional Neural Networks (R-CNN) are a family of machine learning
models for computer vision and specifically object detection.
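As an illustration of the counting use mentioned in the Object Detection definition above, the sketch below tallies detections per class. The detection format `(label, score, box)`, the function name `count_objects` and the 0.5 confidence threshold are assumptions made for this example, not details taken from the project.

```python
from collections import Counter


def count_objects(detections, min_score=0.5):
    """Count detections per class label, ignoring low-confidence ones.

    Each detection is assumed to be a (label, score, box) tuple.
    """
    return Counter(label for label, score, _box in detections if score >= min_score)


# Hypothetical detector output for one frame.
frame_detections = [
    ("person", 0.91, (10, 20, 110, 220)),
    ("person", 0.78, (150, 30, 240, 230)),
    ("car", 0.66, (300, 120, 520, 260)),
    ("dog", 0.31, (40, 200, 90, 250)),  # below the threshold, so not counted
]
counts = count_objects(frame_detections)
```

Raising `min_score` trades missed objects for fewer false counts, so a suitable threshold depends on the application.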