Object Recognition System Report

OBJECT RECOGNITION SYSTEM
MINOR PROJECT
SUBMITTED in PARTIAL FULFILLMENT of
THE REQUIREMENTS FOR THE AWARD OF THE DEGREE of
BACHELOR OF TECHNOLOGY
(INFORMATION TECHNOLOGY)
SUBMITTED BY
ANKUSH BHALLA – 04596303113
SAMBHAV JAIN – 00296303113
SHYAM PRASAD GUPTA – 02696303113
Under the Guidance of

(Ms. NISHTHA JATANA)
Department of Information Technology

MAHARAJA SURAJMAL INSTITUTE OF TECHNOLOGY,
JANAKPURI DELHI-58
GURU GOBIND SINGH INDRAPRASTHA UNIVERSITY
DELHI, INDIA
November 2016
ACKNOWLEDGEMENT
We gratefully acknowledge the cooperation and help of Ms. Nishtha Jatana, Assistant
Professor, Department of Computer Science and Engineering, Maharaja Surajmal Institute
of Technology, C-4 Janakpuri, Delhi, Guru Gobind Singh Indraprastha University, Delhi.
She has been a constant source of guidance throughout the course of this project and in
our understanding of the Object Recognition System. We are also thankful to our friends
and family, whose silent support helped us complete our project.
(Signature)
Ankush Bhalla (04596303113)
(Signature)
Sambhav Jain (00296303113)
(Signature)
Shyam Prasad Gupta (02696303113)
CERTIFICATE
This is to certify that the project entitled OBJECT RECOGNITION SYSTEM is a bona
fide work carried out by Mr. Ankush Bhalla, Mr. Sambhav Jain and Mr. Shyam Prasad
Gupta under my guidance and supervision and submitted in partial fulfillment of B.Tech
degree in Information Technology of Guru Gobind Singh Indraprastha University, Delhi.
The work embodied in this project has not been submitted for any other degree or diploma.
(Ms. Nishtha Jatana)

(Assistant Professor)
Dr. Tripti Sharma

Head, Department of Information Technology
MSIT, Delhi.
ABSTRACT
Computer vision has become one of the most appealing domains of recent research. It is
concerned with using computers to mimic the human visual system. Pattern recognition is
considered to be an important problem in the field of image analysis and computer vision.
Automatic (machine) recognition, description, classification, and grouping of patterns are
the major activities in a variety of engineering and scientific disciplines such as biology,
medicine, marketing, computer vision, artificial intelligence, and remote sensing.
The project concentrates on the technique of recognizing objects in digital images. We
focus on a feature extraction approach for the description and classification of
objects. Useful information is extracted from a digital image by analyzing it, and this
information is used to find an object of interest in the image from the properties of the
object and to provide it to the machine. The purpose is to give the machine the ability to
detect, describe and understand an object from an image or camera. Image processing
is among the rapidly growing technologies today, and it forms a core research area within
engineering and computer science disciplines. The proposed technique is simple, easy to
implement, and could be used for real-time applications.
LIST OF FIGURES
Figure 2.1: Multiple object detection
Figure 2.2: Computing gradient
Figure 3.1: Training process
Figure 3.2: Multiple detection at lower threshold
Figure 3.3: Correct detection at optimum threshold
Figure 3.4: DFD level 0
Figure 3.5: Flow Chart
Figure 3.6: System Flow Diagram
Figure 4.1: GUI
Figure 4.2: Upload Image
Figure 4.3: Detect tree
Figure 4.4: Tree detected
Figure 4.5: Detect apple
Figure 4.6: Apple detected
Figure 4.7: Detect car
Figure 4.8: Car detected
LIST OF TABLES
Table 3.1: Parameter Consideration
Table 3.2: Apple detector
Table 3.3: Car detector
Table 3.4: Tree detector
TABLE OF CONTENTS
ACKNOWLEDGEMENT
CERTIFICATE
ABSTRACT
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1
INTRODUCTION TO MACHINE LEARNING AND OBJECT RECOGNITION
1.1 LEARNING TYPES
1.2 OBJECT DETECTION
1.3 OBJECT RECOGNITION
1.4 RECOGNITION PROCESS
1.5 APPLICATIONS
CHAPTER 2
SOFTWARE REQUIREMENT SPECIFICATION
2.1 INTRODUCTION
2.1.1 PURPOSE
2.1.2 SCOPE
2.1.3 OPERATING ENVIRONMENT
2.1.4 METHODOLOGY
2.1.5 DESIGN AND IMPLEMENTATION CONSTRAINTS
2.2 SPECIFIC REQUIREMENTS
2.2.1 USER INTERFACE REQUIREMENTS
2.3 LITERATURE SURVEY
2.3.1 ABSTRACT OF PAPERS REFERRED
2.3.2 FINDINGS FROM THE RESEARCH PAPERS
2.3.3 PROBLEM STATEMENT
2.4 SOLUTION PROPOSED
2.4.1 OBJECTIVES TO ACHIEVE
2.4.2 METHODOLOGY TO ACHIEVE OBJECTIVE
CHAPTER 3
IMPLEMENTATION AND RESULT
3.1 SOFTWARE DESIGN
3.1.1 DFD LEVEL - 0
3.1.2 FLOW DIAGRAM
3.1.3 SYSTEM FLOW DIAGRAM
3.2 RESULT
CHAPTER 4
SCREENSHOTS
4.1 GUI
4.2 UPLOAD IMAGE
4.3 DETECT TREE
4.4 DETECT APPLE
4.5 DETECT CAR
CHAPTER 5
CONCLUSION AND FUTURE SCOPE
5.1 CONCLUSION
5.2 FUTURE SCOPE
CHAPTER 6
REFERENCES
CHAPTER 1
INTRODUCTION TO MACHINE LEARNING AND OBJECT RECOGNITION
Machine learning is a type of artificial intelligence (AI) that provides computers with the
ability to learn without being explicitly programmed. Machine learning focuses on the
development of computer programs that can teach themselves to grow and change when
exposed to new data.
Over the past two decades, machine learning has become one of the mainstays of
information technology and, with that, a rather central, albeit usually hidden, part of our
lives. With ever-increasing amounts of data becoming available, there is good reason to
believe that smart data analysis will become even more pervasive as a necessary ingredient
of technological progress [1].
The process of machine learning is similar to that of data mining. Both systems search
through data to look for patterns. However, instead of extracting data for human
comprehension -- as is the case in data mining applications -- machine learning uses that
data to detect patterns in data and adjust program actions accordingly. Machine learning
algorithms are often categorized as being supervised or unsupervised. Supervised
algorithms can apply what has been learned in the past to new data. Unsupervised
algorithms can draw inferences from datasets.
Facebook's News Feed uses machine learning to personalize each member's feed. If a
member frequently stops scrolling in order to read or "like" a particular friend's posts, the
News Feed will start to show more of that friend's activity earlier in the feed [2].
1.1 LEARNING TYPES

• Supervised learning
Supervised learning deals with learning a function from available training data.
A supervised learning algorithm analyzes the training data and produces an
inferred function, which can be used for mapping new examples. Common
examples of supervised learning include:
  • classifying e-mails as spam,
  • labeling webpages based on their content, and
  • voice recognition.
There are many supervised learning algorithms, such as neural networks,
Support Vector Machines (SVMs), and Naive Bayes classifiers. Mahout
implements a Naive Bayes classifier.
• Unsupervised learning
Unsupervised learning makes sense of unlabeled data without having any
predefined dataset for its training. Unsupervised learning is an extremely
powerful tool for analyzing available data and looking for patterns and trends. It is
most commonly used for clustering similar inputs into logical groups. Common
approaches to unsupervised learning include:
  • k-means,
  • self-organizing maps, and
  • hierarchical clustering.
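As an illustration of the first approach listed above, the following is a minimal sketch of k-means. It assumes one-dimensional points and hand-picked initial centers; both are illustrative choices and are not taken from the project.

```python
def kmeans_1d(points, centers, iterations=10):
    """Cluster 1-D points around the given initial centers (Lloyd's algorithm)."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


# Toy data: two obvious groups, one near 1 and one near 9.5.
centers, clusters = kmeans_1d([1.0, 1.2, 0.8, 9.0, 9.5, 10.0], centers=[0.0, 5.0])
```

No labels are supplied anywhere; the grouping emerges from the data alone, which is what distinguishes this from the supervised case.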
1.2 OBJECT DETECTION

• Where is the object in the image?
• Input: a clear image of an object, or some kind of model of an object (e.g. a duck),
and an image (possibly) containing the object of interest.
• Output: the position, or a bounding box, of the input object if it exists in the image
(e.g. the duck is in the upper left corner of the image).
1.3 OBJECT RECOGNITION
• Which object is depicted in the image?
• Input: an image containing unknown object(s). Possibly, the position of the object
is marked in the input, or the input might be only a clear image of a (non-occluded)
object.
• Output: position(s) and label(s) (names) of the objects in the image. The positions
of objects are either acquired from the input or determined based on the input image.
1.4 RECOGNITION PROCESS
The steps involved in the recognition process are as follows:
• PREPROCESSING
  • Cropping the images and converting them to the same size.
  • Feature extraction, using extractors such as LBP (Local Binary Patterns) and HOG
(Histograms of Oriented Gradients, built into MATLAB).
  • Dimension reduction (only if necessary).
• TRAINING
  • Training cases are given to the system (the same object has the same label number).
  • The features of the training cases are plotted in a multidimensional space.
  • Objects are differentiated by drawing a curve (decision boundary) in the plot.
  • The curve is drawn by the classifier, e.g. an SVM.
• TESTING
  • A test case is provided to the system.
  • The classifier plots the features of the test object.
  • The object is recognized by checking which region of the space it belongs to.
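The training and testing steps above can be sketched as follows. This is only an illustration: it swaps the SVM for a much simpler nearest-centroid classifier, and the two-dimensional feature vectors are hypothetical stand-ins for real HOG or LBP features.

```python
def train(features, labels):
    """Training: average the feature vectors of each label into one centroid."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def classify(model, vec):
    """Testing: return the label whose centroid is closest to the test vector."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, vec))
    return min(model, key=lambda label: sq_dist(model[label]))


# Hypothetical 2-D feature vectors; real HOG/LBP vectors have many more dimensions.
train_features = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
train_labels = [1, 1, 2, 2]  # the same object has the same label number
model = train(train_features, train_labels)
predicted = classify(model, [0.7, 0.3])
```

Here the decision boundary is implicit: it is the set of points equally far from both centroids. An SVM instead chooses the boundary that maximizes the margin between the classes.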
1.5 APPLICATIONS
• Biometric recognition: Biometric technology uses human physical or behavioral
traits to recognize an individual for security and authentication. Biometrics is the
identification of an individual based on distinguishing biological features such as
fingerprints, hand geometry, retina and iris patterns, DNA, etc.
• Surveillance: Objects can be recognized and tracked in video surveillance
systems. Object recognition is required so that, for example, a suspected person or
vehicle can be tracked.
• Human-computer interaction: Human gestures can be stored in the system and
used for recognition in a real-time environment, letting the computer
interact with humans. The system can be any application on a mobile phone,
an interactive game, etc.
• Intelligent vehicle systems: Intelligent vehicle systems need traffic sign
detection and recognition, and especially vehicle detection and tracking [3]. In
one such system, the detection phase uses a color-based segmentation method
to scan the scene and quickly establish regions of interest (ROIs). Sign
candidates within the ROIs are detected by a set of Haar wavelet features obtained from
AdaBoost training.
CHAPTER 2
SOFTWARE REQUIREMENT SPECIFICATION
2.1 INTRODUCTION
• The object recognition system gives the machine the ability to detect, describe
and understand an object from an image.
• Given some knowledge of how certain objects may appear and an image of a scene
possibly containing those objects, the system reports which objects are present in the
scene and where.
• The input is an image containing unknown object(s). Possibly, the position of the
object is marked in the input, or the input might be only a clear image of a
(non-occluded) object.
• The output is the position(s) and label(s) (names) of the objects in the image. The
positions of objects are either acquired from the input or determined based on the
input image.
• When labeling objects, there is usually a set of categories/labels which the system
"knows" and between which the system can differentiate (e.g. the object is either an
apple, a car, a horse or a tree).
2.1.1 PURPOSE
• This project report seeks to provide requirement specifications for the
‘Object Recognition System’.
• It provides a broader understanding of the working of this system and
of its features, along with its requirements.
• This document will guide the developers in the development process and
help to reduce ambiguity in the requirements provided by the end user.
• This document will help to narrow the gap between the requirements of the
user and the perspective of the developer. Finally, it will serve as a criterion
for assessing the quality of the final project.
2.1.2 SCOPE
Image processing is being applied in many fields in today’s world:
• Automotive sector: It is used in developing advanced driver assistance for
semi-autonomous cars and is heavily used in autonomous/driverless cars.
• Surveillance: Objects can be recognized and tracked in video
surveillance systems. Object recognition is required so that, for example, a suspected
person or vehicle can be tracked.
• Manufacturing & Logistics: To identify defects in objects.
• Robotics: Research on autonomous robots has been one of the most important
topics in recent years. The humanoid robot soccer competition is very
popular. The robot soccer players rely heavily on their vision systems
when they are in unpredictable and dynamic environments. The vision
system helps the robot collect various kinds of environment information, which
provide the data needed for functions such as robot localization, robot tactics and
barrier avoidance. Recognizing the critical objects in the contest field by object
features, which can be obtained easily with object recognition techniques, decreases
the computing effort [3].
• Gaming: Advanced gaming systems such as Xbox Kinect use image
processing for motion analysis of the human player.
• Human-machine interface: Machines are made smart by adding gestural
interfaces, or human action response interfaces, which decode the actions of
the human user to perform certain tasks.
A camera simulates the human eye, one of the main sensors
of the human body, on which the brain bases its decisions. A camera is relatively cheap,
and decoding the image from it can give an enormous amount of information, which
can be used to perform actions and tasks. Image processing is therefore an
emerging tool with a lot of future scope.
2.1.3 OPERATING ENVIRONMENT
• The software can be installed on 32-bit or 64-bit computing environments running
different versions of Windows with MATLAB installed.
• The minimum hardware requirement is 2 GB of RAM.
• The image should not be very large; the maximum size is 1 MB.
• The picture should not be heavily distorted.
• Noise reduction will be limited.
2.1.4 METHODOLOGY
Object detection is a computer technology, connected to image processing
and computer vision, that deals with detecting instances of objects of a certain class
in digital images and videos. Object detection is a challenging problem in
vision-based computer applications. It is used to identify whether an object is present
in a scene or image.
A single image may contain one or multiple objects. If all the objects in
an image need to be detected, the method shown in Figure 2.1 can be used.
The method trains different object detectors with individual objects: as shown
in the figure, there are N object detectors, which are trained to detect N different
objects. Any of the object recognition techniques discussed in this section can be used,
depending upon the application area. An image is provided as input to the
system, and the same image is given as input to all object detectors. Each detector
determines whether its object is present or not. We propose to use an object detector
along with a boundary detector. If the object is present, the detector finds its
boundary and tags the object name in the image. So, after the image has passed
through all the detectors, all objects are detected, and each object's boundary and
tag are displayed in the output image. Also, when the output image is displayed,
the cursor can be moved over the image; the tag shown for an object remains the
same anywhere inside the complete boundary of the object. Such multi-object detection
can greatly improve the performance of content-based image
retrieval systems. The performance can be improved further by letting the
object detectors run in parallel.
Figure 2.1: Multiple object detection
The multi-component object detection method is both discriminative and
powerful.
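The multi-detector scheme of Figure 2.1 can be sketched as follows. This is an illustrative sketch only: the toy detectors, which "find" an object wherever pixels equal a given value, and the helper names make_detector and detect_all are hypothetical stand-ins for trained detectors.

```python
from concurrent.futures import ThreadPoolExecutor


def make_detector(name, target_value):
    """Build a toy detector that 'finds' its object where pixels equal target_value."""
    def detect(image):
        hits = [(r, c) for r, row in enumerate(image)
                for c, v in enumerate(row) if v == target_value]
        if not hits:
            return None  # object absent from this image
        rows = [r for r, _ in hits]
        cols = [c for _, c in hits]
        # Tag plus bounding box as (top, left, bottom, right), inclusive.
        return name, (min(rows), min(cols), max(rows), max(cols))
    return detect


def detect_all(image, detectors):
    """Give the same image to every detector and collect the tagged boxes."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda d: d(image), detectors)
        return [r for r in results if r is not None]


image = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 2],
]
detectors = [make_detector("apple", 1), make_detector("car", 2), make_detector("tree", 3)]
found = detect_all(image, detectors)
```

Because the detectors are independent of one another, they can be run in parallel without coordination; the thread pool here only illustrates that structure.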
The basic design of the recognition systems described above makes use of
three categories of methodologies: (i) feature based, (ii) SVM based, and (iii)
neural network based.
2.1.4.a Feature Based
The central idea of feature-based object recognition algorithms lies in
finding interest points that are invariant to changes in scale,
illumination and affine transformation. The Scale-Invariant Feature
Transform (SIFT) descriptor, proposed by Lowe, is one of the most
widely used feature representation schemes for vision applications.
2.1.4.b SVM Based
Support Vector Machines (SVMs) have been suggested as a
technique for pattern recognition. For binary classification, a linear
SVM tries to find the hyperplane in the N-dimensional feature space that
optimally separates the two classes. The points closest to the hyperplane are
called the support vectors. If the input data are not linearly separable,
a non-linear transformation can be applied which maps the data points
of the input space into a high-dimensional feature space.
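A minimal sketch of a linear SVM for the binary case is given below. It assumes training by stochastic subgradient descent on the regularized hinge loss, which is one common way to fit such a model; the learning rate, regularization strength and toy data are illustrative assumptions, and the non-linear (kernel) case is not covered.

```python
def train_linear_svm(points, labels, epochs=200, lr=0.1, reg=0.01):
    """Fit w, b for the hyperplane w.x + b = 0 by subgradient descent on hinge loss."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):  # y is +1 or -1
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Point is inside the margin or misclassified: move the plane away.
                w = [wi + lr * (y * xi - reg * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Point is safely classified: only shrink w (regularization).
                w = [wi - lr * reg * wi for wi in w]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane x falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Toy, linearly separable data in two dimensions.
points = [[-2, -1], [-1, -2], [-2, -2], [2, 1], [1, 2], [2, 2]]
labels = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(points, labels)
predictions = [predict(w, b, x) for x in points]
```

Only points with a margin below 1 change the hyperplane's direction during training; these play the role of the support vectors described above.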
2.1.5 DESIGN AND IMPLEMENTATION CONSTRAINTS
• Noise in the image should be limited for detection.
• A minimum of 10 negative images is required to start the training process.
• The system is not available in any language other than English.
• No deformation of the object in the image is acceptable; the object is treated as a
single whole.
• The system works best for objects whose aspect ratio does not change much.
• To handle different shapes of the same object, the system needs to be trained with
images from different viewpoints.
• There should be no occlusion in the image.
2.2 SPECIFIC REQUIREMENTS
2.2.1 USER INTERFACE REQUIREMENTS
The system is convenient and easy to use. The GUI shows 4 buttons with which
the user can easily detect the object of interest. The buttons are:
• Upload – Opens a new window through which the user can choose the
image for detection.
• Detect Tree – Detects the tree in the uploaded image and returns a new
window showing the image with a bounding box on the object and the label
"tree" written on it.
• Detect Apple – Detects the apple in the uploaded image and returns a new
window showing the image with a bounding box on the object and the label
"apple" written on it.
• Detect Car – Detects the car in the uploaded image and returns a new
window showing the image with a bounding box on the object and the label
"car" written on it.
2.3 LITERATURE SURVEY
The literature contains extensive studies in the field of object recognition.
In this project, we have reviewed some of the techniques available in the
literature for the recognition of objects in digital images.
2.3.1 ABSTRACT OF PAPERS REFERRED
1. Techniques for Object Recognition in Images and Multi-Object Detection,
by Khushboo Khurana and Reetu Awasthi:
The modern world is enclosed with gigantic masses of digital visual
information. Increase in the images has urged for the development of robust and
efficient object recognition techniques. Most work reported in the literature
focuses on competent techniques for object recognition and its applications. A
single object can be easily detected in an image. Multiple objects in an image
can be detected by using different object detectors simultaneously. The paper
discusses various techniques for object recognition and a method for multiple
object detection in an image.
2. Rapid Object Detection using a Boosted Cascade of Simple Features, by
Paul Viola and Michael Jones:
This paper describes a machine learning approach for visual object detection
which is capable of processing images extremely rapidly and achieving high
detection rates. This work is distinguished by three key contributions. The first
is the introduction of a new image representation called the “Integral Image”
which allows the features used by our detector to be computed very quickly.
The second is a learning algorithm, based on AdaBoost, which selects a small
number of critical visual features from a larger set and yields extremely efficient
classifiers. The third contribution is a method for combining increasingly more
complex classifiers in a “cascade” which allows background regions of the
image to be quickly discarded while spending more computation on promising
object-like regions. The cascade can be viewed as an object specific focus-of-
attention mechanism which unlike previous approaches provides statistical
guarantees that discarded regions are unlikely to contain the object of interest.
In the domain of face detection the system yields detection rates comparable to
the best previous systems. Used in real-time applications, the detector runs at
15 frames per second without resorting to image differencing or skin color
detection.
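The "Integral Image" mentioned in this abstract can be sketched as follows. This is an illustrative sketch with hypothetical helper names, not the authors' code: each entry of the integral image stores the sum of all pixels above and to the left, so the sum over any rectangle needs only four lookups.

```python
def integral_image(image):
    """Return ii where ii[r][c] is the sum of image[0..r-1][0..c-1] (zero-padded)."""
    rows, cols = len(image), len(image[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            ii[r + 1][c + 1] = image[r][c] + ii[r][c + 1] + ii[r + 1][c] - ii[r][c]
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom and columns left..right (inclusive)."""
    return (ii[bottom + 1][right + 1] - ii[top][right + 1]
            - ii[bottom + 1][left] + ii[top][left])


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(image)
total = box_sum(ii, 0, 0, 2, 2)        # whole image
lower_right = box_sum(ii, 1, 1, 2, 2)  # 5 + 6 + 8 + 9
```

Rectangle features of the kind used by the detector are differences of such box sums, which is why they can be evaluated in constant time at any position and scale.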
3. Histograms of Oriented Gradients for Human Detection, by Navneet Dalal
and Bill Triggs:
We study the question of feature sets for robust visual object recognition,
adopting linear SVM based human detection as a test case. After reviewing
existing edge and gradient based descriptors, we show experimentally that grids
of Histograms of Oriented Gradient (HOG) descriptors significantly outperform
existing feature sets for human detection. We study the influence of each stage
of the computation on performance, concluding that fine-scale gradients,
fine orientation binning, relatively coarse spatial binning, and high-quality local
contrast normalization in overlapping descriptor blocks are all important for
good results. The new approach gives near-perfect separation on the original
MIT pedestrian database, so we introduce a more challenging dataset containing
over 1800 annotated human images with a large range of pose variations and
backgrounds.
4. A Review Paper on Object Detection for Improve the Classification
Accuracy and Robustness using different Techniques, by Divya Patel and
Pankaj Kumar Gautam:
Object detection is a computer technology, connected to image processing
and computer vision, that deals with detecting instances of objects of a certain class in
digital images and videos. Object detection is a challenging problem in vision-based
computer applications. It is used to identify whether an object is present in a scene or
image. In this review paper, we are going to present
different techniques and methods for detecting or recognizing objects with
various benefits like efficiency, accuracy, robustness etc.
5. The Concept of Object Recognition, by Astha Gautam, Anjana Kumari and
Pankaj Singh:
Object recognition [7] is a process of detecting the object present in an image
or a video sequence with the help of some recognition technique or method.
Object recognition is one of the techniques of digital image processing, in which
an image is processed by applying some operations. The choice of technique
depends on what sort of output the user needs; based on that,
one can apply a particular technique.
2.3.2 FINDINGS FROM THE RESEARCH PAPERS
The techniques stated in [4] range from very basic algorithms to state-of-the-art
published techniques, categorized by speed, memory requirements and
accuracy. The authors used methods such as the frame difference technique, real-time
background subtraction and shadow detection, and an adaptive background
mixture model for real-time tracking. The algorithms span
varying levels of accuracy and computational complexity. Some of them
can also deal with real-time challenges such as snow, rain, moving branches,
overlapping objects, changes in light intensity and slow-moving objects.
In [5], different super-resolution based methods that can enhance
efficiency and robustness are reviewed. An object recognition system finds objects in
the real world from an image. The object recognition problem can be defined as a
labeling problem based on models of known objects. Formally, given an image
containing one or more objects of interest and a set of labels corresponding to a
set of models known to the system, the system should assign correct labels to
regions, or a set of regions, in the image. For better results on occlusion patterns, a
Shadow c-means approach can be used in the future; it would be very useful for
improving object detection performance.
In [6], a cascade-of-rejecters approach is combined with Histograms of Oriented
Gradients features to achieve a fast and accurate human detection system. The features
used are Histograms of Oriented Gradients of variable-size blocks that capture salient
features of humans automatically. Using an algorithm for feature selection, the system
identifies the appropriate set of blocks from a large set of possible blocks. It uses
the integral image representation and a rejection cascade, which significantly
speed up the computation. The system can process 5 to 30 frames
per second, depending on the density with which it scans the image, while
maintaining an accuracy level similar to existing methods.
In [8], the author presents an extensive survey of different algorithms for detecting
objects in images and classifies the detection approaches into different
categories. As the identification of objects in video has improved
significantly in the past few years, human motion and behavior interpretation
have naturally become the next step. The paper covers various
approaches and steps that have been used by different researchers for object
detection, such as the SIFT and SURF techniques.
2.3.3 PROBLEM STATEMENT
Until now, every machine, regardless of how advanced and complex it is,
has required a human to guide it, operate it and let it interact with its environment. In
manufacturing units, humans detect defects in products, which is a slow
process. Surveillance cameras and machines need to be smart enough to detect faces
and other objects. So there is a need to give the machine the ability to detect and
recognize an object so that it can interact accordingly, because every object requires
a different kind of attention and interaction.
2.4 SOLUTION PROPOSED

Object recognition requires Artificial Intelligence (AI) as its core part. The technique
has different phases, which include preprocessing, training and classification. In this
project we have used a cascade object classifier and HOG (Histograms of Oriented
Gradients) features. The input is an image containing the object, and the trained system
detects the object of interest and returns a bounding box on the object with a label
giving its name.
2.4.1 OBJECTIVES TO ACHIEVE
• Effectively recognize the object of interest with minimum false positive
detections.
• Train the system for objects such as Apple, Tree and Car.
• Optimize the threshold for maximum true detections.
2.4.2 METHODOLOGY TO ACHIEVE OBJECTIVE
2.4.2.a PREPROCESSING
• Cropping the images and converting them to the same size.
• HOG: HOG stands for Histograms of Oriented Gradients [9]. A
gradient vector can be computed for every pixel of an image. It is simply
a measure of the change in pixel values along the x-direction and the
y-direction around each pixel. HOG is a type of “feature descriptor”. The
intent of a feature descriptor is to generalize the object in such a way
that the same object produces as close as possible to the same feature
descriptor when viewed under different conditions. This makes the
classification task easier. The essential idea behind the histogram of
oriented gradients descriptor is that local object appearance and shape
within an image can be described by the distribution of intensity
gradients or edge directions. The image is divided into small connected
regions called cells, and for the pixels within each cell, a histogram of
gradient directions is compiled.
The first step of the calculation is the computation of the gradient values.
The most common method is to apply the 1-D centered point discrete
derivative mask, [-1, 0, 1], in one or both of the horizontal and vertical directions.
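The gradient step can be sketched as follows, assuming the centered mask [-1, 0, 1] and a grayscale image stored as a list of rows. Skipping the border pixels is a simplification of this sketch, not part of the method.

```python
import math


def gradients(image):
    """Apply the centered [-1, 0, 1] mask horizontally and vertically.

    Returns {(row, col): (magnitude, unsigned_angle_in_degrees)} for interior
    pixels; border pixels are skipped in this sketch.
    """
    rows, cols = len(image), len(image[0])
    result = {}
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = image[r][c + 1] - image[r][c - 1]  # change along x
            gy = image[r + 1][c] - image[r - 1][c]  # change along y
            magnitude = math.hypot(gx, gy)
            # Unsigned orientation, folded into [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            result[(r, c)] = (magnitude, angle)
    return result


# A vertical edge: intensity jumps from 0 to 10 in the last column.
image = [
    [0, 0, 10],
    [0, 0, 10],
    [0, 0, 10],
]
grads = gradients(image)
```

For the vertical edge in this toy image the gradient at the center pixel points along the x-direction, so its orientation is 0 degrees.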
The second step of the calculation is creating the cell histograms. Each
pixel within the cell casts a weighted vote for an orientation-based
histogram channel based on the values found in the gradient
computation. The cells themselves can be either rectangular or radial in
shape, and the histogram channels are evenly spread over 0 to 180
degrees or 0 to 360 degrees, depending on whether the gradient is
“unsigned” or “signed”.
Figure 2.2: Computing Gradient
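The two steps described above can be sketched roughly as follows. This is a minimal Python illustration, not the project's implementation: the function names, the border handling, the choice of 9 bins and the toy 4x4 cell are all assumptions made for the example. It applies the centered [-1, 0, 1] difference and lets every pixel vote, weighted by its gradient magnitude, into an unsigned (0 to 180 degree) histogram.

```python
import math

def gradients(img):
    """Centered-difference gradients using the 1-D mask [-1, 0, 1].

    img is a list of rows of grayscale values. Border pixels reuse the
    nearest valid neighbour (an assumption; other border rules exist).
    """
    h, w = len(img), len(img[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            left, right = img[y][max(x - 1, 0)], img[y][min(x + 1, w - 1)]
            up, down = img[max(y - 1, 0)][x], img[min(y + 1, h - 1)][x]
            gx[y][x] = float(right - left)
            gy[y][x] = float(down - up)
    return gx, gy

def cell_histogram(gx, gy, bins=9):
    """Unsigned orientation histogram (0 to 180 degrees) for one cell.

    Each pixel votes for the bin containing its gradient angle, weighted
    by its gradient magnitude (no interpolation between bins).
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for row_x, row_y in zip(gx, gy):
        for dx, dy in zip(row_x, row_y):
            magnitude = math.hypot(dx, dy)
            angle = math.degrees(math.atan2(dy, dx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist

# A 4x4 cell containing a vertical edge: dark on the left, bright on the right.
cell = [[0, 0, 10, 10] for _ in range(4)]
gx, gy = gradients(cell)
hist = cell_histogram(gx, gy)
```

For this toy cell all gradient energy falls into the first bin (horizontal gradient direction), which is what a purely vertical edge should produce.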
To account for changes in illumination and contrast, the gradient
strengths must be locally normalized, which requires grouping the cells
together into larger, spatially connected blocks. The HOG descriptor is
then the concatenated vector of the components of the normalized cell
histograms from all of the block regions. These blocks typically
overlap, meaning that each cell contributes more than once to the final
descriptor. Two main block geometries exist: rectangular R-HOG
blocks and circular C-HOG blocks. R-HOG blocks are generally square
grids, represented by three parameters: the number of cells per block,
the number of pixels per cell, and the number of channels per cell
histogram.
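The block normalization step can be illustrated with a short sketch. It assumes square R-HOG blocks of 2x2 cells that slide one cell at a time and an L2 norm with a small epsilon; these parameter choices and the name block_descriptor are illustrative assumptions, not values taken from the project.

```python
import math

def block_descriptor(cell_hists, block=2, eps=1e-6):
    """Concatenate L2-normalized histograms of overlapping blocks.

    cell_hists[r][c] is the orientation histogram of the cell in row r,
    column c. Blocks of block x block cells slide one cell at a time, so
    interior cells contribute to several blocks.
    """
    rows, cols = len(cell_hists), len(cell_hists[0])
    descriptor = []
    for r in range(rows - block + 1):
        for c in range(cols - block + 1):
            # Gather every histogram value inside this block.
            values = [v
                      for dr in range(block)
                      for dc in range(block)
                      for v in cell_hists[r + dr][c + dc]]
            norm = math.sqrt(sum(v * v for v in values) + eps ** 2)
            descriptor.extend(v / norm for v in values)
    return descriptor

# 3x3 grid of cells with 2-bin histograms (toy values).
cells = [[[1.0, 0.0], [2.0, 0.0], [0.0, 1.0]],
         [[0.0, 3.0], [1.0, 1.0], [2.0, 2.0]],
         [[1.0, 2.0], [0.0, 0.0], [4.0, 0.0]]]
desc = block_descriptor(cells)

# Doubling every gradient strength (a contrast change) leaves the
# normalized descriptor practically unchanged.
brighter = [[[2 * v for v in hist] for hist in row] for row in cells]
desc_brighter = block_descriptor(brighter)
```

The 3x3 grid yields four overlapping blocks, so the centre cell appears four times in the descriptor, and the two descriptors agree to within rounding, which is the point of normalizing locally.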
2.4.2.b TRAINING
In the training phase, we use a dataset of images. The dataset contains
positive and negative images of a specific object. The images are
provided to the classifier for classification as object and non-object. For
this purpose we use the Viola–Jones algorithm. The problem to be solved is
the detection of objects in an image [10]. A human can do this easily, but a
computer needs precise instructions and constraints. To make the task
more manageable, Viola–Jones requires full views of the object. Thus, in
order to be detected, the entire object must point towards the camera and
should not be tilted to either side. While it seems these constraints could
diminish the algorithm's utility somewhat, in practice these limits on pose
are quite acceptable, because the detection step is most often followed by
a recognition step.
The characteristics of the Viola–Jones algorithm which make it a good
detection algorithm are:
• Robust – very high detection rate (true-positive rate) and very low
false-positive rate.
• Real time – for practical applications, at least 2 frames per second must
be processed.
The algorithm has four stages:
1. HOG feature selection
2. Creating an integral image - An image representation called the
integral image allows rectangular features to be evaluated in constant
time, which gives them a considerable speed advantage over more
sophisticated alternative features.
3. AdaBoost training - The speed with which features may be evaluated
does not adequately compensate for their number. Thus, the object
detection framework employs a variant of the learning algorithm
AdaBoost to both select the best features and to train classifiers that
use them. This algorithm constructs a “strong” classifier as a linear
combination of weighted simple “weak” classifiers.
4. Cascading classifier - The cascade training process involves two types
of tradeoffs. In most cases classifiers with more features will achieve
higher detection rates and lower false positive rates. At the same time
classifiers with more features require more time to compute. In
principle one could define an optimization framework in which: i) the
number of classifier stages, ii) the number of features in each stage,
and iii) the threshold of each stage, are traded off in order to minimize
the expected number of evaluated features. Each stage in the cascade
reduces the false positive rate and decreases the detection rate. A
target is selected for the minimum reduction in false positives and the
maximum decrease in detection. Each stage is trained by adding
features until the target detection and false positive rates are met [10].
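As an illustration of the integral image described in stage 2, the following minimal Python sketch builds the table once and then sums any rectangle with four lookups, independent of the rectangle's size. The function names and the 3x3 test image are assumptions made for the example; the project's own code is not shown in this report.

```python
def integral_image(img):
    """Return ii with one extra row and column of zeros, where
    ii[y][x] is the sum of img over rows < y and columns < x."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of the rectangle using four table lookups, whatever its size."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
ii = integral_image(image)
total = rect_sum(ii, 0, 0, 3, 3)          # whole image: 45
lower_right = rect_sum(ii, 1, 1, 2, 2)    # 5 + 6 + 8 + 9 = 28
```

Padding the table with a zero row and column avoids special cases for rectangles that touch the top or left border.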
Advantages of the Viola–Jones algorithm:
1. Efficient feature selection
2. Scale and location invariant detector
3. Instead of scaling the image itself (e.g. pyramid-filters), we scale the
features.
4. Such a generic detection scheme can be trained for detection of other
types of objects (e.g. cars, hands)
Disadvantages of the Viola–Jones algorithm:
1. The detector is most effective only on frontal images of faces.
2. It can hardly cope with a 45° face rotation around either the vertical
or the horizontal axis.
3. It is sensitive to lighting conditions.
4. We might get multiple detections of the same face, due to overlapping
sub-windows.
2.4.2.c TESTING
In this phase we give test cases to the system, which checks for the
object in the image. We create a System object that detects objects using
the Viola-Jones algorithm. A bounding box is returned for the detected
object and an annotation with the class label is placed on the bounding box.
CHAPTER 3
IMPLEMENTATION AND RESULT
In this project we have taken 3 classes of objects, namely CAR, TREE and APPLE. We
have trained the system for each object separately. Separate cascade classifiers have to be
trained for every rotation that is not in the image plane, and a classifier has to be retrained
or run on rotated features for every rotation that is in the image plane. For the Tree class we
have taken around 150 positive images and around 350 negative images for training. Each
positive image is pre-processed and resized to specific pixel dimensions. The object in each
positive image is labelled using the Training Image Labeler. The same was done for the
Apple class. Both classes have colored positive and negative samples.
The Car class has positive samples in grayscale format with dimensions of 100x50. The
number of positive images is around 600 and the number of negative images is around
25,000. The cascade classifier for Car is trained up to 12 stages, and those for Apple and
Tree up to 8 stages.
Figure 3.1: Training process
The objects are chosen such that their aspect ratio does not change much. The cascade
classifier consists of stages, where each stage is an ensemble of weak learners. The weak
learners are simple classifiers called decision stumps. Each stage is trained using a
technique called boosting. Boosting provides the ability to train a highly accurate classifier
by taking a weighted average of the decisions made by the weak learners.
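To make the idea of boosting decision stumps concrete, here is a minimal sketch of discrete AdaBoost on one-dimensional toy data. It is an illustration under stated assumptions, not the training routine used in the project: the data, the number of rounds and all function names are invented for the example, and a real detector would boost over image features rather than a single number.

```python
import math

def train_stump(xs, ys, weights):
    """Pick the threshold and polarity with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x >= threshold else -polarity for x in xs]
            error = sum(w for p, y, w in zip(preds, ys, weights) if p != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best

def adaboost(xs, ys, rounds=3):
    """Return a list of (alpha, threshold, polarity) weak learners."""
    n = len(xs)
    weights = [1.0 / n] * n
    learners = []
    for _ in range(rounds):
        error, threshold, polarity = train_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)   # avoid log(0)
        alpha = 0.5 * math.log((1 - error) / error)
        learners.append((alpha, threshold, polarity))
        # Raise the weight of misclassified samples, lower the rest.
        for i, (x, y) in enumerate(zip(xs, ys)):
            pred = polarity if x >= threshold else -polarity
            weights[i] *= math.exp(-alpha * y * pred)
        total = sum(weights)
        weights = [w / total for w in weights]
    return learners

def predict(learners, x):
    """Sign of the weighted vote of all stumps."""
    score = sum(alpha * (pol if x >= thr else -pol)
                for alpha, thr, pol in learners)
    return 1 if score >= 0 else -1

# Toy 1-D data that no single threshold separates: + + - - + +
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
model = adaboost(xs, ys, rounds=3)
predictions = [predict(model, x) for x in xs]
```

No single stump classifies this data correctly, but the weighted vote of three stumps does, which is the sense in which weak learners are combined into a stronger classifier.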
Each stage of the classifier labels the region defined by the current location of the sliding
window as either positive or negative. Positive indicates that an object was found and
negative indicates no objects were found. If the label is negative, the classification of this
region is complete, and the detector slides the window to the next location. If the label is
positive, the classifier passes the region to the next stage. The detector reports an object
found at the current window location when the final stage classifies the region as positive.
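The stage-by-stage rejection described above can be sketched as follows. This is a simplified illustration: the "image" is a one-dimensional list, and the two stage tests (a cheap sum check followed by a stricter minimum check) are hypothetical stand-ins for trained boosted stages.

```python
def passes_cascade(window, stages):
    """Return True only if every stage labels the window positive."""
    for stage in stages:
        if not stage(window):
            return False      # rejected: later stages are never evaluated
    return True

def detect(signal, width, stages):
    """Slide a window over the signal and report accepted start positions."""
    hits = []
    for start in range(len(signal) - width + 1):
        if passes_cascade(signal[start:start + width], stages):
            hits.append(start)
    return hits

# Hypothetical stages: a cheap coarse test first, a stricter test second.
stages = [lambda w: sum(w) >= 6, lambda w: min(w) >= 2]
signal = [0, 0, 3, 3, 3, 0, 0, 5, 1, 0]
hits = detect(signal, 3, stages)
```

Only the window starting at index 2 passes both stages. Several other windows pass the first stage and are discarded by the second, while most are rejected immediately, which is why a cascade spends little time on background regions.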
• Terminology:
i. True positive: When a positive sample is correctly classified.
ii. False positive: When a negative sample is mistakenly classified as positive.
iii. False negative: When a positive sample is mistakenly classified as negative.
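The three outcomes defined above can be counted from a list of ground-truth labels and a list of classifier decisions. The sketch below is only an illustration; the label lists are made-up values, not results from the project.

```python
def count_outcomes(actual, predicted):
    """Count true positives, false positives and false negatives.

    actual[i] is True when sample i really contains the object;
    predicted[i] is True when the classifier labelled it positive.
    """
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a and p)
    false_pos = sum(1 for a, p in pairs if not a and p)
    false_neg = sum(1 for a, p in pairs if a and not p)
    return true_pos, false_pos, false_neg

# Hypothetical labels for five samples.
actual = [True, True, True, False, False]
predicted = [True, False, True, True, False]
tp, fp, fn = count_outcomes(actual, predicted)
```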
Condition | Consideration
A large training set (in the thousands). | Increase the number of stages and set a higher false positive rate for each stage.
A small training set. | Decrease the number of stages and set a lower false positive rate for each stage.
To reduce the probability of missing an object. | Increase the true positive rate. However, a high true positive rate can prevent you from achieving the desired false positive rate per stage, making the detector more likely to produce false detections.
To reduce the number of false detections. | Increase the number of stages or decrease the false alarm rate per stage.
Table 3.1: Parameter considerations
We have trained the detectors with a false alarm rate of 0.2 to obtain fewer false
detections; this makes the training phase more time consuming.
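A short calculation shows why more stages and a lower false alarm rate reduce false detections. It assumes that the 0.2 value is a per-stage target and that a window must pass every stage independently, so the overall target is the per-stage rate raised to the number of stages; the stage counts 12 (Car) and 8 (Apple, Tree) are the ones reported above. The function name is an illustrative assumption.

```python
def overall_false_alarm_rate(per_stage_rate, num_stages):
    """Target false alarm rate of the whole cascade, assuming every stage
    just meets its per-stage target and stages act independently."""
    return per_stage_rate ** num_stages

car_rate = overall_false_alarm_rate(0.2, 12)     # about 4.1e-9
other_rate = overall_false_alarm_rate(0.2, 8)    # about 2.6e-6
```

These are training targets rather than measured rates; the rate observed on real images depends on how representative the negative samples are.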
The merge threshold is chosen such that it removes multiple detections of the same object in
the image. Groups of collocated detections that meet the threshold are merged to produce one
bounding box around the target object. When this property is set to 0, all detections are
returned without performing any thresholding or merging operation. This property is tunable.
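The effect of the merge threshold can be approximated with the sketch below. It is not the detector's actual grouping algorithm: the overlap measure (intersection over union with a 0.5 cut-off), the averaging of grouped boxes and all names are assumptions chosen to illustrate the behaviour described above, including the special case of a threshold of 0.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def merge_detections(boxes, merge_threshold, min_iou=0.5):
    """Group overlapping boxes and keep groups with enough members.

    Each kept group is replaced by the average of its boxes.
    """
    if merge_threshold == 0:
        return list(boxes)        # no thresholding or merging at all
    groups = []
    for box in boxes:
        for group in groups:
            if iou(box, group[0]) >= min_iou:
                group.append(box)
                break
        else:
            groups.append([box])
    merged = []
    for group in groups:
        if len(group) >= merge_threshold:
            n = len(group)
            merged.append(tuple(sum(b[i] for b in group) / n for i in range(4)))
    return merged

raw = [(10, 10, 20, 20), (11, 10, 20, 20), (12, 11, 20, 20),  # same object
       (60, 60, 20, 20)]                                       # isolated hit
merged = merge_detections(raw, merge_threshold=3)
```

With a threshold of 3 the three overlapping boxes collapse into one averaged box and the isolated detection is dropped; lowering the threshold to 1 would keep the isolated detection as well, which mirrors the multiple detections seen at a low threshold.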
Figure 3.2: Multiple detection at lower threshold
Figure 3.3: Correct detection at optimum threshold
3.1 SOFTWARE DESIGN
3.1.1 DFD LEVEL - 0
Figure 3.4: DFD level 0
3.1.2 FLOW DIAGRAM
Figure 3.5: Flow Chart
3.1.3 SYSTEM FLOW DIAGRAM
Figure 3.6: System Flow Diagram
3.2 RESULT
The aim of our project is to find the object of interest in the given images. A user
inputs any image, and our system processes the image, applies the feature
extraction and classification algorithms to it and shows the result. If the object is
present in the image, the system surrounds it with a yellow rectangular box labeled
with the object's class name.
We have tested our system on various images from the 3 object classes mentioned
above. The system efficiently recognizes instances of the objects in the images, but
in certain cases, due to noise, an unclear image or occlusion, it is not able to
recognize the object or produces multiple detections. Our recognition technique can
be applied to both gray-scale and color images. The tables below show the
efficiency of our project in different circumstances.
Apple Detection
Condition | Efficiency
Bright image | 79.8
Dull image with noise | 56.1
Table 3.2: Apple detector
Car Detection (only side view)
Condition | Efficiency
Bright image | 82.9
Table 3.3: Car detector
Tree Detection
Condition | Efficiency
Bright image | 74.6
Table 3.4: Tree detector
CHAPTER 4
SCREENSHOTS
4.1 GUI
This is the user interface of the application, with four buttons, namely:
• Upload
• Detect Tree
• Detect Car
• Detect Apple
Figure 4.1: GUI
4.2 UPLOAD IMAGE
This button lets the user select an image for detection.
Figure 4.2: Upload Image
4.3 DETECT TREE
This detects the tree in the image and opens a new window with the tree labeled.
Figure 4.3: Detect Tree
Figure 4.4: Tree Detected
4.4 DETECT APPLE
This detects the apple in the image and opens a new window with the apple labeled.
Figure 4.5: Detect Apple
Figure 4.6: Apple Detected
4.5 DETECT CAR
This detects the car in the image and opens a new window with the car labeled.
Figure 4.7: Detect Car
Figure 4.8: Car Detected
CHAPTER 5
CONCLUSION AND FUTURE SCOPE
5.1 CONCLUSION
Object recognition is a relatively new and very promising technology. An object
recognition system can give machines the ability to see and recognize objects as
humans do. In this project the technique of object recognition has been
successfully carried out. The object recognition problem can be defined as a labeling
problem based on models of known objects. Formally, given an image containing one
or more objects of interest and a set of labels corresponding to a set of models known
to the system, the system should assign correct labels to the object instances. There is
considerable scope to increase the efficiency of the system by tuning the
parameters and by training the system with a larger image database.
5.2 FUTURE SCOPE
• Object detection systems can be applied in many fields and industries, for example
in manufacturing to detect imperfections, in computer vision to build intelligent
robots, and in smart surveillance systems.
• The system can also be installed at traffic lights, where it can be used to fine
violators.
• A highly trained recognition system can be used in driverless cars and metro trains.
• It can be used in smartphones to detect historical monuments and provide their
details directly on the user’s smartphone. This will help tourists and promote
tourism.
CHAPTER 6
REFERENCES
[1] Alex Smola and S.V.N. Vishwanathan, “An Introduction to Machine Learning”,
Cambridge University Press, UK, 2008.
[2] http://whatis.techtarget.com/definition/Facebook-Like-button
[3] Khushboo Khurana and Reetu Awasthi, “Techniques for Object Recognition in
Images and Multi-Object Detection”, (IJARCET) Volume 2, Issue 4, April 2013.
[4] Deepjoy Das and Sarat Saharia, “Implementation and Performance
Evaluation of Background Subtraction Algorithms”, International Journal on
Computational Sciences & Applications (IJCSA) Vol. 4, No. 2, April 2014.
[5] Divya Patel and Pankaj Kumar Gautam, “A Review Paper on Object Detection for
Improve the Classification Accuracy and Robustness using different Techniques”,
International Journal of Computer Applications (0975 – 8887) Volume 112 –
No 11, February 2015.
[6] Qiang Zhu, Shai Avidan, Mei-Chen Yeh and Kwang-Ting Cheng, “Fast Human
Detection Using a Cascade of Histograms of Oriented Gradients”, June 2006.
[7] Astha Gautam, Anjana Kumari and Pankaj Singh, “The Concept of Object
Recognition”, Volume 5, Issue 3, March 2015.
[8] Vinay V Kulkarni and S. D. Lokhande, “Object Identification Using Image
Processing”, International Journal of Innovative Research in Science, Engineering
and Technology, Vol. 5, Issue 6, June 2016.
[9] http://mccormickml.com/2013/05/09/hog-person-detector-tutorial/
[10] P. Viola and M. J. Jones, “Rapid Object Detection using a Boosted Cascade of
Simple Features”, Proceedings of the 2001 IEEE Computer Society Conference,
Volume 1, 15 April 2001, pp. I-511–I-518.