
Object Detection using Python OpenCV

Saturday, March 30, 2019 8:33 PM

Clipped from: https://circuitdigest.com/tutorial/object-detection-using-python-opencv

Object Detection using Python & OpenCV

We started with learning the basics of OpenCV, then did some basic image
processing and manipulation on images, followed by image segmentation and
many other operations, using OpenCV and the Python language. Here, in this
section, we will perform some simple object detection techniques using
template matching. We will find an object in an image and then we will
describe its features. Features are common attributes of an image such
as corners, edges etc. We will also take a look at some common and popular
object detection algorithms such as SIFT, SURF, FAST, BRIEF & ORB.

As explained in the previous tutorials, OpenCV is the Open Source Computer
Vision Library, which has C++, Python and Java interfaces and supports
Windows, Linux, Mac OS, iOS and Android. So it can be easily installed
on a Raspberry Pi with a Python and Linux environment. And a Raspberry Pi with
OpenCV and an attached camera can be used to create many real-time image
processing applications like face detection, face lock, object tracking, car
number plate detection, home security systems etc.

Object detection and recognition form the most important use cases for
computer vision; they are used to do powerful things such as:

• Labelling scenes
• Robot Navigation
• Self-driving cars
• Body recognition (Microsoft Kinect)
• Disease and cancer detection
• Facial recognition
• Handwriting recognition
• Identifying objects in satellite images

Object Detection VS Recognition

Object recognition is the second level of object detection, in which the
computer is able to recognize an object from multiple objects in an image
and may be able to identify it.

Now, we will perform some image processing functions to find an object
from an image.

Finding an Object from an Image

Here we will use template matching to find a character/object in an image,
using OpenCV's cv2.matchTemplate() function.

import cv2
import numpy as np

Load the input image and convert it to grayscale

image=cv2.imread('WaldoBeach.jpg')
cv2.imshow('people',image)
cv2.waitKey(0)
gray=cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)

Load the template image

template=cv2.imread('waldo.jpg',0)

#result of template matching of object over an image
result=cv2.matchTemplate(gray,template,cv2.TM_CCOEFF)
min_val, max_val, min_loc, max_loc=cv2.minMaxLoc(result)

Create bounding box

top_left=max_loc
#increasing the size of bounding rectangle by 50 pixels
bottom_right=(top_left[0]+50,top_left[1]+50)
cv2.rectangle(image, top_left, bottom_right, (0,255,0),5)

cv2.imshow('object found',image)
cv2.waitKey(0)
cv2.destroyAllWindows()



In cv2.matchTemplate(gray, template, cv2.TM_CCOEFF), we pass in the grayscale
image in which to find the object, along with the template itself. Then we apply
the template matching method to find the object in the image; here cv2.TM_CCOEFF
is used.

The function returns an array, stored in result, which holds the outcome of
the template matching procedure.

And then we use cv2.minMaxLoc(result), which gives the coordinates of the
bounding box where the object was found in the image. Once we have those
coordinates we draw a rectangle over it, stretching the dimensions of the
box a little so the object fits easily inside the rectangle.

There are a variety of methods to perform template matching, and in this case
we are using cv2.TM_CCOEFF, which stands for correlation coefficient.
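
As an aside (a sketch, not part of the original tutorial, reusing the gray and template variables loaded above), switching the matching mode changes how the result is read; for squared-difference methods the best match is the minimum score, not the maximum:

# With cv2.TM_SQDIFF (or TM_SQDIFF_NORMED), LOWER values mean better matches,
# so the top-left corner of the best match comes from min_loc, not max_loc
result = cv2.matchTemplate(gray, template, cv2.TM_SQDIFF)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)
top_left = min_loc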

cv2.matchTemplate takes a “sliding window” of the object and slides it over
the image from left to right and top to bottom, one pixel at a time. Then for
each location, we compute the correlation coefficient to determine how
“good” or “bad” the match is.

Regions with sufficiently high correlation can be considered matches; from
there, all we need is a call to cv2.minMaxLoc to find where the good
matches are.
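
As a further sketch (an addition beyond the original tutorial, continuing the earlier example's image, gray and template variables), if you want every location scoring above a threshold rather than only the single best one, a normalized method plus numpy's np.where works well:

import cv2
import numpy as np

# Find ALL locations whose normalized correlation exceeds a chosen threshold
result = cv2.matchTemplate(gray, template, cv2.TM_CCOEFF_NORMED)
h, w = template.shape
locations = np.where(result >= 0.8)  # 0.8 is an illustrative threshold
for y, x in zip(*locations):
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)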


Feature Description Theory

In template matching we slide a template image across a source image until
a match is found. But it is not the best method for object recognition, as it
has severe limitations; this method isn't very resilient.

The following factors make template matching a bad choice for object
detection:

• Rotation renders this method ineffective.
• Size changes (known as scaling) affect this as well.
• Photometric changes (e.g. brightness, contrast, hue etc.)
• Distortion from viewpoint changes (affine).

One solution to this problem is image features.

Image features are interesting areas of an image that are somewhat
unique to that specific image. They are also called key point features or
interest points.

The sky is an uninteresting feature, whereas certain keypoints (marked in
red circles) can be used for the detection of the above image (interesting
features). The image shown above clearly shows the difference between an
interesting feature and an uninteresting feature.

Importance of feature detection



Features are important as they can be used to analyze, describe and match
images. They have extensive use in:

• Image alignment – e.g. panorama stitching (finding corresponding
matches so we can stitch images together)
• 3D reconstruction
• Robot navigation
• Object recognition
• Motion tracking
• And more!

What defines the interest points?

Interesting areas carry a lot of distinct and unique information about an
area. Typically, they are areas of high change of intensity, such as corners
or edges. But always be careful, as noise can appear “informative” when it
is not! So try to blur the image so as to reduce noise.
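
For instance, a light Gaussian blur before feature detection (an illustrative sketch, not from the original tutorial) helps keep noise from masquerading as features:

import cv2

image = cv2.imread('chess.jpg')
# A mild 5x5 Gaussian blur suppresses pixel noise that might otherwise
# look like "informative" high-frequency detail to a feature detector
smoothed = cv2.GaussianBlur(image, (5, 5), 0)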

Characteristic of Good or Interesting Features

Repeatable – They can be found in multiple pictures of the same scene.

Distinctive – Each feature is somewhat unique and different from other
features of the same scene.

Compactness/Efficiency – Significantly fewer features than pixels in the
image.

Locality – A feature occupies a small area of the image and is robust to
clutter and occlusion.



Corners as features

Corners are identified when shifting a window in any direction over that
point gives a large change in intensity.

Corners are not the best features for identifying images, but they certainly
have good use cases which make them handy to use.

So to identify corners in your image, imagine the green window we are
looking at, with the black one being the image we want to find corners in.
When we move the window only inside the black box, we see there is no
change in intensity; hence the image is flat, i.e. no corners are identified.

Now when we move the window in one direction, we see that there is a change
of intensity in one direction only; hence it's an edge, not a corner.

When we move the window to the corner, no matter in what direction we move
the window there is a change in intensity, and this is identified as a
corner.

So let's identify corners with the help of the Harris Corner Detection
algorithm, developed in 1988 for corner detection; it works fairly well.

The following OpenCV function is used for the detection of the corners.

cv2.cornerHarris(input image, block size, ksize, k)

Input image - Should be grayscale and of float32 type.

blockSize - The size of the neighborhood considered for corner detection.

ksize - Aperture parameter of the Sobel derivative used.

k - Harris detector free parameter in the equation.

Output – Array of corner locations (x, y)

Also, an important thing to note is that the Harris corner detection
algorithm requires a float32 image datatype, i.e. the image should be a
grayscale image of float32 type.

import cv2
import numpy as np

Load image then grayscale

image = cv2.imread('chess.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

The cornerHarris function requires the array datatype to be float32

gray = np.float32(gray)
harris_corners = cv2.cornerHarris(gray, 3, 3, 0.05)

We use dilation of the corner points to enlarge them

kernel = np.ones((7,7),np.uint8)
harris_corners = cv2.dilate(harris_corners, kernel, iterations = 2)

Threshold for an optimal value; it may vary depending on the image

image[harris_corners > 0.025 * harris_corners.max()] = [255, 127, 127]

cv2.imshow('Harris Corners', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

cornerHarris returns the locations of the corners; to visualize these tiny
locations we use dilation, which adds pixels to the edges of the corners.
To enlarge the corners we run the dilation twice, and then we do some
thresholding to change the colors of the corners.

The following function can also be used for corner detection, with the
parameters mentioned below:

cv2.goodFeaturesToTrack(input image, maxCorners, qualityLevel, minDistance)

• Input Image - 8-bit or floating-point 32-bit, single-channel image.
• maxCorners – Maximum number of corners to return. If there are
more corners than are found, the strongest of them are returned.
• qualityLevel – Parameter characterizing the minimal accepted quality
of image corners. The parameter value is multiplied by the best corner
quality measure (smallest eigenvalue). Corners with a quality measure
less than the product are rejected. For example, if the best corner has
a quality measure of 1500 and qualityLevel = 0.01, then all corners
with a quality measure less than 15 are rejected.
• minDistance – Minimum possible Euclidean distance between the
returned corners.

import cv2
import numpy as np

img = cv2.imread('chess.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

We specify the top 100 corners

corners = cv2.goodFeaturesToTrack(gray, 100, 0.01, 15)

for corner in corners:
    x, y = corner[0]
    x = int(x)
    y = int(y)
    cv2.rectangle(img, (x-10, y-10), (x+10, y+10), (0,255,0), 2)

cv2.imshow("Corners Found", img)


cv2.waitKey()
cv2.destroyAllWindows()

It also returns an array of corner locations like the previous method, so
we iterate through each corner position and plot a rectangle over it.



Problems with corners as features

Corner matching in images is tolerant of, or corner detection doesn't have
any problem with, images that are:

• Rotated
• Translated (i.e. shifted in the image)
• Subject to slight photometric changes, e.g. brightness or affine intensity

However, it is intolerant of:

• Large changes in intensity or photometric changes
• Scaling (i.e. enlarging or shrinking) – see the sketch below
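
To make the scaling problem concrete, here is a small illustrative sketch (an addition, not from the original tutorial) that counts corners at two scales; the counts typically differ noticeably:

import cv2

image = cv2.imread('chess.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Detect up to 500 corners at full size and at half size; corner counts and
# positions usually change with scale, showing corners are not scale invariant
for scale in (1.0, 0.5):
    resized = cv2.resize(gray, None, fx=scale, fy=scale)
    corners = cv2.goodFeaturesToTrack(resized, 500, 0.01, 10)
    count = 0 if corners is None else len(corners)
    print("scale =", scale, "corners found:", count)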



SIFT, SURF, FAST, BRIEF & ORB Algorithms

Scale Invariant Feature Transform (SIFT)

Corner detectors like the Harris corner detection algorithm are rotation
invariant, which means even if the image is rotated we can still get the
same corners. This is obvious, as corners remain corners in a rotated image
too. But when we scale the image, a corner may no longer be a corner, as
shown in the image above.

SIFT is used to detect interesting keypoints in an image using the difference
of Gaussians method; these are the areas of the image where variation
exceeds a certain threshold, and they are better than an edge descriptor.

Then we create a vector descriptor for these interesting areas. Scale
invariance is achieved via the following process:

i. Interest points are scanned at several different scales.
ii. The scale at which we meet a specific stability criterion is then
selected and encoded by the vector descriptor. Therefore, regardless of the
initial size, the more stable scale is found, which allows us to be scale
invariant.
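
To illustrate the difference-of-Gaussians idea in isolation (a rough sketch under simplifying assumptions, not SIFT's full octave/pyramid scheme):

import cv2
import numpy as np

gray = cv2.imread('paris.jpg', cv2.IMREAD_GRAYSCALE).astype(np.float32)

# Blur at a few increasing sigmas and subtract neighbouring blur levels;
# extrema across the resulting DoG layers are candidate keypoints at that scale
sigmas = [1.0, 1.6, 2.56, 4.1]  # illustrative scale steps
blurred = [cv2.GaussianBlur(gray, (0, 0), s) for s in sigmas]
dog = [blurred[i + 1] - blurred[i] for i in range(len(blurred) - 1)]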

Rotation invariance is achieved by obtaining the orientation assignment of
the keypoint using image gradient magnitudes. Once we know the 2D direction,
we can normalize this direction.

A full paper on SIFT can be read here:
http://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf

You can also find a tutorial on the official OpenCV website.



Speeded Up Robust Features (SURF)

SURF is the speeded-up version of SIFT, as SIFT is quite computationally
expensive.

SURF was developed to improve the speed of a scale-invariant feature
detector. Instead of using the difference-of-Gaussians approach, SURF uses a
Hessian matrix approximation to detect interesting points and uses the sum
of Haar wavelet responses for orientation assignment.

A full paper on SURF can be read here:
http://www.vision.ee.ethz.ch/~surf/eccv06.pdf

Alternatives of SIFT and SURF

As SIFT and SURF are patented, they are not freely available for commercial
use; however, there are alternatives to these algorithms, which are
explained in brief here.

Features from Accelerated Segment Test (FAST)

• Key point detection only (no descriptor; we can use SIFT or SURF to
compute that)
• Used in real-time applications

Here you can find the papers on FAST

https://www.edwardrosten.com/work/rosten_2006_machine.pdf

Binary Robust Independent Elementary Features (BRIEF)

• Computes descriptors quickly (instead of using SIFT or SURF)
• It is quite fast.

Here you can find the paper on BRIEF

http://cvlabwww.epfl.ch/~lepetit/papers/calonder_pami11.pdf

Oriented FAST and Rotated BRIEF (ORB)

• Developed out of OpenCV Labs (not patented, so free to use!)
• Combines both FAST and BRIEF
Here you can find the paper on ORB

http://www.willowgarage.com/sites/default/files/orb_final.pdf



Using SIFT, SURF, FAST, BRIEF & ORB in OpenCV

Flow process for SIFT, SURF, FAST, BRIEF & ORB

Feature Detection implementation

The SIFT & SURF algorithms are patented by their respective creators, and
while they are free to use in academic and research settings, you should
technically be obtaining a license/permission from the creators if you are
using them in a commercial (i.e. for-profit) application.

Below we are explaining programming examples of all the algorithms
mentioned above.

SIFT

import cv2
import numpy as np

image = cv2.imread('paris.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)



Create SIFT Feature Detector object
sift = cv2.xfeatures2d.SIFT_create()

#Detect key points
keypoints = sift.detect(gray, None)
print("Number of keypoints Detected: ", len(keypoints))

Draw rich key points on input image

image = cv2.drawKeypoints(image, keypoints, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

cv2.imshow('Feature Method - SIFT', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

Console Output:

Number of keypoints Detected: 1893

Here the keypoints are (x, y) coordinates extracted using the SIFT detector
and drawn over the image using the cv2.drawKeypoints function.
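
Note that sift.detect only finds keypoint locations; to also get the 128-dimensional SIFT descriptors used for matching, you can call detectAndCompute instead (a small sketch continuing the example above):

keypoints, descriptors = sift.detectAndCompute(gray, None)
# Each SIFT descriptor is a 128-dimensional vector, one row per keypoint
print("Descriptor array shape: ", descriptors.shape)  # (n_keypoints, 128)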



SURF

import cv2
import numpy as np

image = cv2.imread('paris.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

Create SURF Feature Detector object; here we set the Hessian threshold to 500

# Only features whose Hessian is larger than hessianThreshold are retained by the detector
# You can increase the Hessian threshold value to decrease the number of keypoints
surf = cv2.xfeatures2d.SURF_create(500)

keypoints, descriptors = surf.detectAndCompute(gray, None)
print ("Number of keypoints Detected: ", len(keypoints))

Draw rich key points on input image

image = cv2.drawKeypoints(image, keypoints, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

cv2.imshow('Feature Method - SURF', image)
cv2.waitKey()
cv2.destroyAllWindows()

Console Output:

Number of keypoints Detected: 1548



FAST

import cv2
import numpy as np

image = cv2.imread('paris.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

Create FAST Detector object

fast = cv2.FastFeatureDetector_create()
# Obtain key points; by default non-max suppression is on
# (to turn it off, call fast.setNonmaxSuppression(False))
keypoints = fast.detect(gray, None)
print ("Number of keypoints Detected: ", len(keypoints))

Draw rich keypoints on input image

image = cv2.drawKeypoints(image, keypoints, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

cv2.imshow('Feature Method - FAST', image)
cv2.waitKey()
cv2.destroyAllWindows()

Console Output:


Number of keypoints Detected: 8960
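
As a quick usage note (an illustrative continuation, reusing the fast and gray objects from above), turning non-max suppression off keeps many adjacent responses and inflates the count considerably:

# Disable non-max suppression and re-detect; expect far more raw keypoints
fast.setNonmaxSuppression(False)
keypoints = fast.detect(gray, None)
print("Without non-max suppression: ", len(keypoints))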

BRIEF

import cv2
import numpy as np

image = cv2.imread('paris.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

Create FAST detector object

fast = cv2.FastFeatureDetector_create()

Create BRIEF extractor object

brief = cv2.xfeatures2d.BriefDescriptorExtractor_create()

# Determine key points
keypoints = fast.detect(gray, None)

Obtain descriptors and new final keypoints using BRIEF

keypoints, descriptors = brief.compute(gray, keypoints)
print ("Number of keypoints Detected: ", len(keypoints))

Draw rich keypoints on input image

image = cv2.drawKeypoints(image, keypoints, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

cv2.imshow('Feature Method - BRIEF', image)
cv2.waitKey()
cv2.destroyAllWindows()

Console Output:

Number of keypoints Detected: 8735
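
A follow-up worth noting (an added aside, reusing the descriptors from above): BRIEF produces compact binary descriptors, 32 bytes per keypoint by default, which is what makes it so quick to compute and match:

# Each BRIEF descriptor is a 32-byte binary string by default
print("Descriptor array shape: ", descriptors.shape)  # (n_keypoints, 32)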

ORB

import cv2
import numpy as np

image = cv2.imread('paris.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

Create ORB object; we can specify the number of key points we desire

orb = cv2.ORB_create()

# Determine key points
keypoints = orb.detect(gray, None)

Obtain the descriptors

keypoints, descriptors = orb.compute(gray, keypoints)
print("Number of keypoints Detected: ", len(keypoints))

Draw rich keypoints on input image

image = cv2.drawKeypoints(image, keypoints, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

cv2.imshow('Feature Method - ORB', image)
cv2.waitKey()
cv2.destroyAllWindows()

Console Output:

Number of keypoints Detected: 500

We can specify the maximum number of keypoints to detect (the nfeatures
argument to cv2.ORB_create); the default value is 500, i.e. ORB
automatically detects the best 500 keypoints if no value is specified.
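
To show how these keypoints and descriptors are actually used for recognition, here is a short sketch (an addition with hypothetical filenames, not from the original tutorial) that matches ORB features between two images with a brute-force Hamming matcher:

import cv2

# Load a query object and a scene image (hypothetical filenames)
query = cv2.imread('object.jpg', cv2.IMREAD_GRAYSCALE)
scene = cv2.imread('scene.jpg', cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(query, None)
kp2, des2 = orb.detectAndCompute(scene, None)

# Hamming distance suits ORB's binary descriptors; crossCheck keeps mutual best matches
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)

# Draw the 20 best matches between the two images
output = cv2.drawMatches(query, kp1, scene, kp2, matches[:20], None, flags=2)
cv2.imshow('ORB Matches', output)
cv2.waitKey(0)
cv2.destroyAllWindows()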

So this is how object detection takes place in OpenCV; the same programs
can also be run on a Raspberry Pi with OpenCV installed and used as a
portable device, like smartphones with Google Lens.
