
HOG detectMultiScale parameters explained
by Adrian Rosebrock on November 16, 2015


Last week we discussed how to use OpenCV and Python to perform pedestrian detection.

To accomplish this, we leveraged the built-in HOG + Linear SVM detector that OpenCV ships with, allowing us to detect people in images.

However, one aspect of the HOG person detector we did not discuss in detail is the detectMultiScale function; specifically, how the parameters of this function can:

1. Increase the number of false-positive detections (i.e., reporting that a location in an image contains a person when in reality it does not).

2. Result in missing a detection entirely.

3. Dramatically affect the speed of the detection process.

In the remainder of this blog post I am going to break down each of the detectMultiScale parameters to the Histogram of Oriented Gradients descriptor and SVM detector.

I’ll also explain the trade-off between speed and accuracy that we must make if we want our pedestrian detector to run in real-time. This trade-off is especially important if you want to run the pedestrian detector in real-time on resource-constrained devices such as the Raspberry Pi.

Accessing the HOG detectMultiScale parameters

To view the parameters to the detectMultiScale function, just fire up a shell, import OpenCV, and use the help function:


1. $ python
2. >>> import cv2
3. >>> help(cv2.HOGDescriptor().detectMultiScale)
Figure 1: The available parameters to the detectMultiScale function.

You can use the built-in Python help method on any OpenCV function to get a full
listing of parameters and returned values.

HOG detectMultiScale parameters explained

Before we can explore the detectMultiScale parameters, let’s first create a simple Python script (based on our pedestrian detector from last week) that will allow us to easily experiment:

1. # import the necessary packages
2. from __future__ import print_function
3. import argparse
4. import datetime
5. import imutils
6. import cv2
7.
8. # construct the argument parse and parse the arguments
9. ap = argparse.ArgumentParser()
10. ap.add_argument("-i", "--image", required=True,
11. help="path to the input image")
12. ap.add_argument("-w", "--win-stride", type=str, default="(8, 8)",
13. help="window stride")
14. ap.add_argument("-p", "--padding", type=str, default="(16, 16)",
15. help="object padding")
16. ap.add_argument("-s", "--scale", type=float, default=1.05,
17. help="image pyramid scale")
18. ap.add_argument("-m", "--mean-shift", type=int, default=-1,
19. help="whether or not mean shift grouping should be used")
20. args = vars(ap.parse_args())

Since most of this script is based on last week’s post, I’ll do a quicker overview of the code.

Lines 9-20 handle parsing our command line arguments. The --image switch is the path to our input image that we want to detect pedestrians in. The --win-stride is the step size in the x and y direction of our sliding window. The --padding switch controls the amount of pixels the ROI is padded with prior to HOG feature vector extraction and SVM classification. To control the scale of the image pyramid (allowing us to detect people in images at multiple scales), we can use the --scale argument. And finally, --mean-shift can be specified if we want to apply mean-shift grouping to the detected bounding boxes.



22. # evaluate the command line arguments (using the eval function like
23. # this is not good form, but let's tolerate it for the example)
24. winStride = eval(args["win_stride"])
25. padding = eval(args["padding"])
26. meanShift = True if args["mean_shift"] > 0 else False
27.
28. # initialize the HOG descriptor/person detector
29. hog = cv2.HOGDescriptor()
30. hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
31.
32. # load the image and resize it
33. image = cv2.imread(args["image"])
34. image = imutils.resize(image, width=min(400, image.shape[1]))

Now that we have our command line arguments parsed, we need to extract their tuple and boolean values respectively on Lines 24-26. Using the eval function, especially on command line arguments, is not good practice, but let’s tolerate it for the sake of this example (and for the ease of allowing us to play with different --win-stride and --padding values).
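If you’d rather avoid eval altogether, a safer drop-in is ast.literal_eval, which only accepts Python literals and rejects arbitrary expressions. This is a sketch, not part of the original script; parse_tuple is a hypothetical helper:

```python
# Parse "(8, 8)"-style strings without the security risk of eval.
# ast.literal_eval only evaluates literals (tuples, numbers, strings, etc.),
# so a malicious argument such as "__import__('os').system('...')" raises
# ValueError instead of executing.
import ast

def parse_tuple(s):
    value = ast.literal_eval(s)
    if not (isinstance(value, tuple) and len(value) == 2):
        raise ValueError("expected a 2-tuple, got: {!r}".format(value))
    return value

winStride = parse_tuple("(8, 8)")
padding = parse_tuple("(16, 16)")
print(winStride, padding)
```

You could swap this in for the eval calls on Lines 24 and 25 with no other changes to the script.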

Lines 29 and 30 initialize the Histogram of Oriented Gradients detector and set the Support Vector Machine detector to be the default pedestrian detector included with OpenCV.

From there, Lines 33 and 34 load our image and resize it to have a maximum width
of 400 pixels — the smaller our image is, the faster it will be to process and detect
people in it.



36. # detect people in the image
37. start = datetime.datetime.now()
38. (rects, weights) = hog.detectMultiScale(image, winStride=winStride,
39. padding=padding, scale=args["scale"], useMeanshiftGrouping=meanShift)
40. print("[INFO] detection took: {}s".format(
41. (datetime.datetime.now() - start).total_seconds()))
42.
43. # draw the original bounding boxes
44. for (x, y, w, h) in rects:
45. cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
46.
47. # show the output image
48. cv2.imshow("Detections", image)
49. cv2.waitKey(0)

Lines 37-41 detect pedestrians in our image using the detectMultiScale function and the parameters we supplied via command line arguments. We start and stop a timer on Lines 37 and 41, allowing us to determine how long it takes to process a single image for a given set of parameters.

Finally, Lines 44-49 draw the bounding box detections on our image and display
the output to our screen.

To get a default baseline in terms of object detection timing, just execute the
following command:



1. $ python detectmultiscale.py --image images/person_010.bmp

On my MacBook Pro, the detection process takes a total of 0.09s, implying that I
can process approximately 10 images per second:

Figure 2: On my system, it takes approximately 0.09s to process a single image using the default
parameters.

In the rest of this lesson we’ll explore the parameters to detectMultiScale in detail, along with the implications these parameters have on detection timing.

img (required)
This parameter is pretty obvious — it’s the image that we want to detect objects (in this case, people) in. This is the only required argument to the detectMultiScale function. The image we pass in can either be color or grayscale.

hitThreshold (optional)
The hitThreshold parameter is optional and is not used by default in the detectMultiScale function.

When I looked at the OpenCV documentation for this function, the only description for the parameter was: “Threshold for the distance between features and SVM classifying plane”.

Given the sparse documentation of the parameter (and the strange behavior of it
when I was playing around with it for pedestrian detection), I believe that this
parameter controls the maximum Euclidean distance between the input HOG
features and the classifying plane of the SVM. If the Euclidean distance exceeds
this threshold, the detection is rejected. However, if the distance is below this
threshold, the detection is accepted.

My personal opinion is that you shouldn’t bother playing around with this parameter unless you are seeing an extremely high rate of false-positive detections in your image. In that case, it might be worth trying to set this parameter. Otherwise, just let non-maxima suppression take care of any overlapping bounding boxes, as we did in the previous lesson.

winStride (optional)
The winStride parameter is a 2-tuple that dictates the “step size” in both the x and y location of the sliding window.

Both winStride and scale are extremely important parameters that need to be set properly. These parameters have tremendous implications on not only the accuracy of your detector, but also the speed at which your detector runs.

In the context of object detection, a sliding window is a rectangular region of fixed width and height that “slides” across an image, just like in the following figure:

Figure 3: An example of applying a sliding window to an image for face detection.

At each stop of the sliding window (and for each level of the image pyramid, discussed in the scale section below), we (1) extract HOG features and (2) pass these features on to our Linear SVM for classification. The process of feature extraction and classifier decision is an expensive one, so we would prefer to evaluate as few windows as possible if our intention is to run our Python script in near real-time.

The smaller winStride is, the more windows need to be evaluated (which can quickly turn into quite the computational burden):



1. $ python detectmultiscale.py --image images/person_010.bmp --win-stride="(4, 4)"

Figure 4: Decreasing the winStride increases the amount of time it takes to process each image.

Here we can see that decreasing the winStride to (4, 4) has actually increased our detection time substantially to 0.27s.
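A little back-of-the-envelope arithmetic shows why. Assuming a single 400×300 pyramid layer and OpenCV’s default 64×128 pedestrian window (sizes here are illustrative):

```python
# Count sliding-window positions for a given layer size, window size,
# and stride: one window per stride step in each direction.
def num_windows(layer_w, layer_h, win_w=64, win_h=128, stride=(8, 8)):
    steps_x = (layer_w - win_w) // stride[0] + 1
    steps_y = (layer_h - win_h) // stride[1] + 1
    return steps_x * steps_y

# Halving the stride roughly quadruples the number of windows to evaluate.
print(num_windows(400, 300, stride=(8, 8)))   # 43 x 22 = 946
print(num_windows(400, 300, stride=(4, 4)))   # 85 x 44 = 3740
```

And that is for one pyramid layer; the cost multiplies across every layer of the pyramid.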

Similarly, the larger winStride is, the fewer windows need to be evaluated (allowing us to dramatically speed up our detector). However, if winStride gets too large, then we can easily miss out on detections entirely:



1. $ python detectmultiscale.py --image images/person_010.bmp --win-stride="(16, 16)"

Figure 5: Increasing the winStride can reduce our pedestrian detection time (from 0.09s down to 0.06s), but as you can see, we miss out on detecting the boy in the background.

I tend to start off using a winStride value of (4, 4) and increase the value until I obtain a reasonable trade-off between speed and detection accuracy.

padding (optional)
The padding parameter is a tuple which indicates the number of pixels in both the
x and y direction in which the sliding window ROI is “padded” prior to HOG feature
extraction.

As suggested by Dalal and Triggs in their 2005 CVPR paper, Histograms of Oriented Gradients for Human Detection, adding a bit of padding surrounding the image ROI prior to HOG feature extraction and classification can actually increase the accuracy of your detector.

Typical values for padding include (8, 8), (16, 16), (24, 24), and (32, 32).

scale (optional)
An image pyramid is a multi-scale representation of an image:

Figure 6: An example image pyramid.

At each layer of the image pyramid the image is downsized and (optionally) smoothed via a Gaussian filter.

The scale parameter controls the factor by which our image is resized at each layer of the image pyramid, ultimately influencing the number of levels in the image pyramid.

A smaller scale will increase the number of layers in the image pyramid and
increase the amount of time it takes to process your image:



1. $ python detectmultiscale.py --image images/person_010.bmp --scale 1.01

Figure 7: Decreasing the scale to 1.01

The amount of time it takes to process our image has significantly jumped to 0.3s.
We also now have an issue of overlapping bounding boxes. However, that issue
can be easily remedied using non-maxima suppression.

Meanwhile a larger scale will decrease the number of layers in the pyramid as well
as decrease the amount of time it takes to detect objects in an image:



1. $ python detectmultiscale.py --image images/person_010.bmp --scale 1.5

Figure 8: Increasing our scale allows us to process nearly 20 images per second — at the
expense of missing some detections.

Here we can see that we performed pedestrian detection in only 0.02s, implying that we can process nearly 50 images per second. However, this comes at the expense of missing some detections, as evidenced by the figure above.

Finally, if you decrease both winStride and scale at the same time, you’ll
dramatically increase the amount of time it takes to perform object detection:



1. $ python detectmultiscale.py --image images/person_010.bmp --scale 1.03 \
2. --win-stride="(4, 4)"

Figure 9: Decreasing both the scale and window stride.

We are able to detect both people in the image — but it’s taken almost half a
second to perform this detection, which is absolutely not suitable for real-time
applications.

Keep in mind that for each layer of the pyramid a sliding window with winStride steps is moved across the entire layer. While it’s important to evaluate multiple layers of the image pyramid, allowing us to find objects in our image at different scales, it also adds a significant computational burden, since each layer implies a series of sliding windows, HOG feature extractions, and decisions by our SVM.

Typical values for scale are normally in the range [1.01, 1.5]. If you intend on running detectMultiScale in real-time, this value should be as large as possible without significantly sacrificing detection accuracy.
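To build intuition for how scale controls pyramid depth, here is a simplified model: keep shrinking a 400×300 image until it can no longer contain the 64×128 detection window. This ignores OpenCV’s internal details (such as nlevels and rounding), so treat the counts as approximate:

```python
# Count image pyramid levels: a level exists as long as the downscaled
# image can still contain a 64x128 detection window.
def num_pyramid_levels(width, height, scale, min_w=64, min_h=128):
    levels = 0
    factor = 1.0
    while width / factor >= min_w and height / factor >= min_h:
        levels += 1
        factor *= scale
    return levels

print(num_pyramid_levels(400, 300, 1.05))  # 18 levels
print(num_pyramid_levels(400, 300, 1.5))   # 3 levels
```

Each of those levels carries its own full pass of sliding windows, which is exactly why a scale of 1.01 is so much slower than 1.5.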

Again, along with winStride, scale is the most important parameter for you to tune in terms of detection speed.

finalThreshold (optional)
I honestly can’t even find finalThreshold inside the OpenCV documentation (specifically for the Python bindings) and I have no idea what it does. I assume it has some relation to the hitThreshold, allowing us to apply a “final threshold” to the potential hits, weeding out potential false positives, but again, that’s simply speculation based on the argument name.

If anyone knows what this parameter controls, please leave a comment at the
bottom of this post.

useMeanShiftGrouping (optional)
The useMeanShiftGrouping parameter is a boolean indicating whether or not mean-shift grouping should be performed to handle potential overlapping bounding boxes. This value defaults to False and, in my opinion, should never be set to True — use non-maxima suppression instead; you’ll get much better results.

When using HOG + Linear SVM object detectors you will undoubtedly run into the issue of multiple, overlapping bounding boxes, where the detector has fired numerous times in regions surrounding the object we are trying to detect:

Figure 10: An example of detecting multiple, overlapping bounding boxes.

To suppress these multiple bounding boxes, Dalal suggested using mean shift (Slide 18). However, in my experience mean shift performs sub-optimally and should not be used as a method of bounding box suppression, as evidenced by the image below:

Figure 11: Applying mean-shift to handle overlapping bounding boxes.

Instead, utilize non-maxima suppression (NMS). Not only is NMS faster, it obtains much more accurate final detections:

Figure 12: Instead of applying mean-shift, utilize NMS instead. Your results will be much better.
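For reference, a minimal sketch of the Malisiewicz-style NMS covered in the earlier post. The 0.65 overlap threshold is an assumption you should tune, and note the box format: (x1, y1, x2, y2) corners, not the (x, y, w, h) tuples detectMultiScale returns:

```python
import numpy as np

def non_max_suppression(boxes, overlap_thresh=0.65):
    # boxes: array of (x1, y1, x2, y2); returns the boxes that survive NMS.
    if len(boxes) == 0:
        return boxes
    boxes = boxes.astype("float")
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    area = (x2 - x1 + 1) * (y2 - y1 + 1)
    idxs = np.argsort(y2)  # process boxes bottom-up
    pick = []
    while len(idxs) > 0:
        last = len(idxs) - 1
        i = idxs[last]
        pick.append(i)
        # intersection of the current box with all remaining boxes
        xx1 = np.maximum(x1[i], x1[idxs[:last]])
        yy1 = np.maximum(y1[i], y1[idxs[:last]])
        xx2 = np.minimum(x2[i], x2[idxs[:last]])
        yy2 = np.minimum(y2[i], y2[idxs[:last]])
        w = np.maximum(0, xx2 - xx1 + 1)
        h = np.maximum(0, yy2 - yy1 + 1)
        overlap = (w * h) / area[idxs[:last]]
        # drop the current box and any remaining box overlapping it too much
        idxs = np.delete(idxs, np.concatenate(([last],
            np.where(overlap > overlap_thresh)[0])))
    return boxes[pick].astype("int")
```

To use it on detectMultiScale output, first convert the boxes, e.g. np.array([(x, y, x + w, y + h) for (x, y, w, h) in rects]).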

Tips on speeding up the object detection process

Whether you’re batch processing a dataset of images or looking to get your HOG detector to run in real-time (or as close to real-time as feasible), these three tips should help you milk as much performance out of your detector as possible:

1. Resize your image or frame to be as small as possible without sacrificing detection accuracy. Prior to calling the detectMultiScale function, reduce the width and height of your image. The smaller your image is, the less data there is to process, and thus the detector will run faster.

2. Tune your scale and winStride parameters. These two arguments have a tremendous impact on your object detector speed. Both scale and winStride should be as large as possible, again, without sacrificing detector accuracy.

3. If your detector still is not fast enough… you might want to look into re-implementing your program in C/C++. Python is great and you can do a lot with it. But sometimes you need the compiled binary speed of C or C++ — this is especially true for resource-constrained environments.


Summary
In this lesson we reviewed the parameters to the detectMultiScale function of the HOG descriptor and SVM detector. Specifically, we examined these parameter values in the context of pedestrian detection. We also discussed the speed and accuracy trade-offs you must consider when utilizing HOG detectors.

If your goal is to apply HOG + Linear SVM in (near) real-time applications, you’ll first want to start by resizing your image to be as small as possible without sacrificing detection accuracy: the smaller the image is, the less data there is to process. You can always keep track of your resizing factor and multiply the returned bounding boxes by this factor to obtain the bounding box sizes in relation to the original image size.
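That bookkeeping can be sketched like this (rescale_boxes is a hypothetical helper; boxes follow detectMultiScale’s (x, y, w, h) format):

```python
# Detect on a downsized copy, then map boxes back to original coordinates.
def rescale_boxes(rects, resize_ratio):
    # resize_ratio = original_width / resized_width
    return [(int(x * resize_ratio), int(y * resize_ratio),
             int(w * resize_ratio), int(h * resize_ratio))
            for (x, y, w, h) in rects]

# e.g. the original frame was 800px wide, but detection ran at 400px:
print(rescale_boxes([(50, 60, 100, 200)], 800 / 400))
# [(100, 120, 200, 400)]
```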

Secondly, be sure to play with your scale and winStride parameters. These values can dramatically affect the detection accuracy (as well as the false-positive rate) of your detector.

Finally, if you still are not obtaining your desired frames per second (assuming you
are working on a real-time application), you might want to consider re-implementing
your program in C/C++. While Python is very fast (all things considered), there are
times you cannot beat the speed of a binary executable.




94 responses to: HOG detectMultiScale parameters explained

Anuj Pahuja
November 16, 2015 at 1:18 pm

Hi Adrian,

Thanks again for the informative blog post. I had to use HOGDescriptor in OpenCV for one of my projects and it was a pain to use because of no clear documentation. So this was much needed.

The ‘finalThreshold’ parameter is mainly used to select the clusters that have at least ‘finalThreshold + 1’ rectangles. This parameter is passed as an argument to groupRectangles() or groupRectangles_meanShift() (when meanShift is enabled), which rejects the small clusters containing less than or equal to ‘finalThreshold’ rectangles, computes the average rectangle size for the rest of the accepted clusters, and adds those to the output rectangle list.

These should help:

1. http://code.opencv.org/projects/opencv/repository/entry/modules/objdetect/src/hog.cpp?rev=2.4.9#L1057

2. http://docs.opencv.org/2.4/modules/objdetect/doc/cascade_classification.html#void%20groupRectangles%28vector%3CRect%3E&%20rectList,%20int%20groupThreshold,%20double%20eps%29

Cheers,
Anuj

Adrian Rosebrock
November 16, 2015 at 1:52 pm

Thanks so much for sharing the extra details Anuj!

Nrupatunga
November 17, 2015 at 1:41 am

Dear Adrian,
Very informative post on HOG detectMultiScale parameters. In fact I appreciate this post very much. I couldn’t find such a detailed post on the net with such examples.

Recently, I trained HOG features (90×160) manually using SVMlight. I had a hard time making detectMultiScale work with these parameters.

I would like to share a few observations from experimentation:

1. Doing a hard train reduced my false positives.

2. finalThreshold and useMeanShiftGrouping: setting useMeanShiftGrouping to false gave me a good detection bounding box around the person in the image, and increasing the final threshold reduced the number of detections (number of bounding boxes).

I am still working on improving the detection rate. I have many images where I still couldn’t detect the person in the image.

I have reduced false positives. I wanted to increase my detection rate as well. Any inputs on this? I would really appreciate your inputs.

Thanks a lot for this post.

Correction: if I am not mistaken, I think there should be a modification in the code

“python detectmultiscale.py --image images/person_010.bmp --scale 1.03”

after the statement

“Meanwhile a larger scale will decrease the number of layers in the pyramid as well as decrease the amount of time it takes to detect objects in an image:”

Adrian Rosebrock
November 17, 2015 at 6:12 am

Hey Nrupatunga, thanks for the comment and all the added details, I appreciate it. If you ended up using mean shift grouping, I would suggest applying non-maxima suppression instead; you’ll likely end up getting even better results.

In order to improve your detection rate, be sure to check the ‘C’ parameter of your SVM. Normally this value should be very small, such as C=0.01. This will create a “soft classifier” and help with your detection rate.

Another neat little trick you can do to create more training data is to “mirror” your training images. I’m not sure about your particular case, but for pedestrian detection, the horizontal mirror of an image is still a person, thus you can use that as additional training data as well.

ngapweitham
November 18, 2015 at 11:21 pm

Thanks for the brilliant explanations. It is much easier to understand than the OpenCV documentation.

Has anyone tried to use dlib to do pedestrian detection? There is a video showing the results (https://www.youtube.com/watch?v=wpmY_5gNbEY); I cannot tell whether the result is good or bad with my knowledge.

Adrian Rosebrock
November 19, 2015 at 6:19 am

Thanks for sharing Tham!

Sebastian
November 19, 2015 at 5:26 pm

Hi Adrian, thanks for the post

I’m trying to do HOG detection but in real time from video camera. I’m
working in a raspberry pi 2 board and the code works but the frame rate is
too slow.

How can i make the process faster?


Do you think is possible to get good results working with the raspberry pi 2?

Thanks

Adrian Rosebrock
November 20, 2015 at 6:28 am

Hey Sebastian — please see the “Tips on speeding up the object detection process” section. This section of the post details tricks you can use to speed up the detection process, especially related to the Raspberry Pi 2.

Cam
May 5, 2016 at 2:29 am

Hi Sebastian,

could you help me with the people detection in real time, please? I’ve been trying but it doesn’t work. Can you give me some ideas for the code, or send me that part of the code? I really appreciate that.

Thank you.

Adrian Rosebrock
May 5, 2016 at 6:43 am

Hey Camilo — please see this followup blog post on tuning detectMultiScale parameters for real-time detection.

Rish
February 7, 2017 at 5:38 am

Hey Adrian – the link leads to the same post. I’m really
trying hard to do real time detection. I’m hoping to
achieve 20fps (or 25fps if I can get really lucky). I’ve
implemented a tracking algorithm that helps quite a bit.
However, any tips to speed up the detectMultiScale
function as such would be really helpful.

As mentioned in the blog post, changing scale from 1.20 to 1.05 increases time per 640×480 frame from 55ms to 98ms, however accuracy reduces significantly.

Adrian Rosebrock
February 7, 2017 at 8:59 am

Just to clarify, you are trying to obtain 20-25 FPS on the Raspberry Pi?

CodeNinja
December 30, 2017 at 10:36 am

Hi Rish… You can pass alternate frames to the classifier. Pass the 0th frame to the classifier. For the 1st frame do only tracking (on the output obtained from the classifier) and skip the classifier part. Pass the 2nd frame to the classifier… and this cycle repeats.

Vivek
December 3, 2015 at 11:35 pm

Adrian,

The scale parameter also works with another input, nLevels. The way it works is this:

the image size is decreased over nLevels iterations.

If nLevels=16 and scale=1.05, the loop runs 16 times, each time decreasing the size by scale (starting at 1) *= scale.

So nLevels defines the number of loops, not the scale.

Adrian Rosebrock
December 4, 2015 at 6:26 am

Thanks for sharing Vivek. So just to clarify, nLevels is used to control the maximum number of layers of the image pyramid? Also, does nLevels work for the Python + OpenCV bindings or just for C++?

Vivek
December 4, 2015 at 7:57 am

Hi Adrian,
if you look at the hog.cpp file, it will show how nlevels is used. Our test in Python shows that it does work the way it is defined. If you set nlevels too low, say 4, and scale 1.01, you will see that no small figures will be detected. Experiment by changing the nlevels and scale to see how it works.

Here is a code snippet from hog.cpp. I did not see any default value, so the loop will continue until the size of the image becomes smaller than the window.

for( levels = 0; levels < nlevels; levels++ )
{
    levelScale.push_back(scale);
    if( cvRound(imgSize.width/scale) < winSize.width ||
        cvRound(imgSize.height/scale) < winSize.height ||
        scale0 <= 1 )
        break;
    scale *= scale0;
}

Adrian Rosebrock
December 4, 2015 at 8:25 am

Thanks for the clarification! I’ll be sure to play with this parameter as well. It seems like a nice way to complement the scale.

Bob Estes
September 30, 2017 at 6:00 pm

I don’t see that nlevels is exposed at the Python level. If so, you’d have to recompile OpenCV to change it, right?

Ulrich
March 14, 2016 at 12:25 pm

Hello Adrian,

Thanks a lot for this blog. I read it carefully and tried out your code with my own pictures and videos. It works great!
Do you also have a Python HOG implementation which is rotation invariant? I am trying to detect pedestrians which are not upright (due to a moving/tilted camera).

Adrian Rosebrock
March 14, 2016 at 3:16 pm

By definition, HOG is not meant to be rotation invariant. If you know how the camera is tilted, simply deskew the image by rotating it by theta degrees.

Maya2
October 3, 2017 at 12:09 pm

Please, how do I apply this code to videos?

Ulrich
March 15, 2016 at 9:05 am

Hello Adrian,

Thanks for your answer; this is a good idea and helps in many cases. But in some special cases I do not know what the angle theta is, due to sliding camera motion.
I guess another possibility could be to train the SVM with tilted example images. But for that I think a quadratic window would be better than the normally used upright 128:64 window.

Is it possible to change the window size in the OpenCV SVM database? Is it possible to add HOG descriptors which are analyzed from tilted examples (pedestrians) into the existing SVM database? Or is it necessary to create your own SVM?

Do you have a blog post on how to expand an SVM or how to create your own SVM? Is this possibly described in one of your books?

It would be great if you can give me some answers and hints regarding my problem.

Best regards, Ulrich

Adrian Rosebrock
March 15, 2016 at 4:31 pm

If you decide to create additional data samples by tilting your images, then you’ll need to train a HOG detector for each set of rotations. Keep in mind that HOG, by definition, is sensitive to rotation.

The pedestrian detector that OpenCV ships with is pre-trained, so you can’t adjust it. In your case, you would need to train your own SVM.

I detail the steps involved in how to train a custom object detector in this post. You can find the source code implementation of the HOG + Linear SVM detector inside the PyImageSearch Gurus course.

Anthony
April 20, 2016 at 5:07 am

Hi Adrian, I really need to thank you for all those amazing posts, it really is a great job!

Because I liked your posts, I tried to reproduce it at home, using a Raspberry Pi 2 with Python 2.7.9 and OpenCV 3.1.0.

I’m doing real-time person detection with this RPi and it’s working well. My problem is that I can’t find a way to count the number of people when performing hog.detectMultiScale. The return values of this function give us the locations and weights of detected people but not the exact number of these people.

Do you have any idea of implementing it?

Adrian Rosebrock
April 20, 2016 at 6:01 pm

The value returned by hog.detectMultiScale is just a list. The length of this list represents the number of people in the image. Therefore, to get the total number of people in the image, just take len(rects).

Anthony
April 22, 2016 at 10:11 am

Thank you very much, this helped me a lot!

Keep updating this blog, it's wonderful!

Arpit Solanki
May 25, 2016 at 5:54 am

Thank you for this great post. With your approach I observed that it has a lot
of false positives. One example of a false positive: I tested it on a photo
with a man and a dog (front view) and it detected both of them as a person. Can
you please help me solve this kind of issue?

Adrian Rosebrock
May 25, 2016 at 3:19 pm

Since the classifier is pre-trained, you unfortunately cannot apply
hard-negative mining as in the HOG + Linear SVM pipeline. Instead,
you'll need to try tuning the parameters of detectMultiScale. To
start, I would work with the scale factor and try to get it as large as
possible without hurting true-positive detection.
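The effect of raising the scale factor can be sketched with a back-of-the-envelope count of image pyramid layers. This is only an illustration (the 128-pixel window height and the layer formula are assumptions, not OpenCV's exact internals):

```python
import math

def pyramid_levels(img_h, win_h=128, scale=1.05):
    # number of pyramid layers evaluated before the detection
    # window no longer fits vertically in the downscaled image
    if img_h < win_h:
        return 0
    return int(math.log(img_h / float(win_h)) / math.log(scale)) + 1

# a larger scale factor skips more layers, trading accuracy for speed
print(pyramid_levels(600, scale=1.05))  # 32 layers
print(pyramid_levels(600, scale=1.20))  # 9 layers
```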

Imran
September 19, 2016 at 3:45 am

Just wondering how to incorporate detection of different postures (sitting,
crawling, etc.) in the framework of HOG descriptors? Any work done in this
regard?

Adrian Rosebrock
September 19, 2016 at 1:01 pm

You would essentially need to train a separate HOG + Linear SVM
detector for each of the postures you wanted to detect.

Paulo
September 27, 2016 at 5:34 pm

Hi Adrian,

Based on your experience, what technique would you recommend to detect the
pattern of heads and shoulders from a top view?

Best regards Paulo

Adrian Rosebrock
September 28, 2016 at 10:41 am

It really depends on your dataset. I would first examine the images in
the dataset and determine how much variance in appearance the
heads and shoulders have. Depending on how much variance there
is, I'd make a decision. For similar images with low variance I'd likely
use HOG + Linear SVM. For lots of variance I'd start to consider more
advanced approaches like CNNs.

Paulo
September 28, 2016 at 3:17 pm

Hi Adrian, Thanks for answering.

In my dataset there are between 2 and 6 people walking together to
pass through a door with a width of 60 in (1.5m). I tested the
Hough transform to detect heads, but the result was not
satisfactory. If I use a CNN (Convolutional Neural Network),
which of your posts do you recommend to start with?

Best regards Paulo

Adrian Rosebrock
September 30, 2016 at 6:49 am

Is your camera fixed and non-moving? And I assume all
images are coming from the same camera sensor? If so,
I think HOG + Linear SVM would likely be enough here.

Paulo
October 4, 2016 at 11:01 pm

Thanks Adrian!

My camera is fixed.

From this code, how do I adapt it for offline
training? What should I change?

Thanks…

Adrian Rosebrock
October 6, 2016 at 6:59 am

I demonstrate how to train HOG + Linear SVM
detectors from scratch inside the
PyImageSearch Gurus course. I would suggest
starting there.

Wanderson
September 27, 2016 at 11:53 pm

Dear Adrian,

I would like to ask a newbie question. Whenever I see the HOG
detector discussed, the SVM classifier is involved. Must the HOG descriptor
always be paired with a classifier? Or can I detect foreground objects with
blob analysis?

Thanks,

Wanderson

Adrian Rosebrock
September 28, 2016 at 10:38 am

You typically see SVMs, in particular Linear SVMs, because they are
very fast. You could certainly use a different classifier if you wished.

James Brown
October 24, 2016 at 2:29 am

Hello, Adrian.
Your post is really amazing.
I have a question about HOG.
Is it possible to extract the human features by removing the background in a
selected rectangular area?
I am researching a human body recognition project and I really hope you can
guide me.
Thank you very much.
You are super!

Adrian Rosebrock
October 24, 2016 at 8:26 am

You would actually train a custom object detector using the HOG +
Linear SVM method. OpenCV actually ships with a pre-trained
human detector that you can try out.

John Beale
October 24, 2016 at 2:34 pm

Thank you for this great blog. In your examples showing the foreground
woman and background child, the green bounding box cuts through the
woman's forehead, so I'm assuming the HOG detector found her legs and
torso, but missed her head(?) In other cases, the box is well centered and
completely contains all the pixels showing the human figure, but also includes
considerable extra background. Is this algorithm intrinsically that "fuzzy"
about the precise outline, or can it be tuned to more closely match the
actual boundaries of the person? Thanks again!

Adrian Rosebrock
November 1, 2016 at 9:47 am

The algorithm itself isn’t “fuzzy”, it’s simply the step size of the sliding
window and the size of the image pyramid. I would suggest reading
this post on the HOG + Linear SVM detector to better understand
how the detector works.

Yonatan
January 3, 2017 at 6:23 am

A comment regarding the hitThreshold parameter:
it should represent the minimum Euclidean distance between the input HOG
features and the classifying plane of the SVM, meaning that the detection is
positive only if the SVM result exceeds this threshold (and if you set this
threshold to small negative values, you get a lot of false-positive windows).
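The decision rule described above can be sketched in a few lines. This is a toy illustration with made-up weights and features, not OpenCV's implementation:

```python
def svm_score(features, weights, bias):
    # signed distance of the HOG feature vector from the SVM hyperplane
    return sum(f * w for f, w in zip(features, weights)) + bias

def is_detection(features, weights, bias, hit_threshold=0.0):
    # positive only when the score clears hit_threshold; a small negative
    # threshold admits more windows, including more false positives
    return svm_score(features, weights, bias) > hit_threshold

w, b = [0.5, -0.25, 1.0], -0.1   # made-up hyperplane
x = [0.2, 0.4, 0.1]              # made-up feature vector
print(is_detection(x, w, b))                      # False
print(is_detection(x, w, b, hit_threshold=-0.5))  # True
```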

Saeed
February 16, 2017 at 3:05 pm

Hi Adrian,
I read the "perform pedestrian detection" post and the current post, and there
is one point that I cannot understand.
Your sliding window's size is fixed at 128×64 and all the features are
obtained from this window at any scale. However, when the targets are
detected the boxes have different sizes. I believe all the boxes should be
128×64, but they are not. Could you please describe what causes this?

Thank you in advance for your comment.

Adrian Rosebrock
February 20, 2017 at 8:05 am

You can have different sized bounding boxes in scale space due to
image pyramids. Image pyramids allow you to detect objects at
varying scales of the image, but as you'll notice, each bounding box
has the same aspect ratio.

leo
February 20, 2017 at 8:39 pm

Hi, can you explain [ -i ] and [ --images ] to me please? I'm new in this area.

Please help me.

import argparse

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--images", required=True, help="path to images directory")
args = vars(ap.parse_args())

Adrian Rosebrock
February 22, 2017 at 1:42 pm

Hi Leo — I would highly suggest that you spend some time reading
up on command line arguments and how they work.

Ashutosh
February 24, 2017 at 12:24 am

Dear Adrian,

Very useful blog. Thank you for writing the content so precisely.

I was checking the GPU version of detectMultiScale at
http://docs.opencv.org/2.4/modules/gpu/doc/object_detection.html#gpu-hogdescriptor-detectmultiscale

But I could not understand why padding is (0,0):

"padding – Mock parameter to keep the CPU interface compatibility. It must
be (0,0)."

In that case, how do we detect pedestrians at the edge of the frame?

Thanks in advance.

Adrian Rosebrock
February 24, 2017 at 11:25 am

The GPU functionality of OpenCV is (unfortunately) only for C/C++.
There are no Python bindings for it, so I unfortunately haven't had a
chance to play around with the GPU functions and do not have any
insight there.

Rob
March 19, 2017 at 12:34 pm

Hi Adrian,

If I want to use HOG + SVM as a traffic sign detector, how should I do it?
Should I train a detector for each sign, or should I build a general detector
for all signs and then distinguish the signs with another method? I want to
do it in real time; do more detectors increase the computational effort
proportionally?

Thanks in advance.

Adrian Rosebrock
March 21, 2017 at 7:31 am

You would want to train a detector for each sign. More detectors will
increase the amount of time it takes to classify a given image, but the
benefit is that your detections will be more accurate. Please see the
PyImageSearch Gurus course for an example of building a traffic
sign detector.

Rob
March 28, 2017 at 8:02 am

Is there a way to use detectMultiScale to distinguish between
several object classes? I also tried to implement my own scaling
and sliding windows and then use predict() to recognize the
object. For this purpose I trained a linear SVM and labeled the
data. It's working fine, but it's so much slower than
detectMultiScale.

Adrian Rosebrock
March 28, 2017 at 12:49 pm

HOG + Linear SVM detectors work best when they are
binary (i.e., detecting one class label for each detector).
The detectMultiScale function in OpenCV only works
with one class. You can implement your own method (as
you've done), but it will be much slower in Python.

Rob
March 29, 2017 at 12:04 pm

Okay, thank you, I'm trying that. When I run
detectMultiScale or predict, the code uses only
25% of my processor (on an RPi 3). Is
multiprocessing possible with these methods?
How can I achieve this?

Adrian Rosebrock
March 31, 2017 at 1:58 pm

As far as detectMultiScale goes, unfortunately
there aren't many optimizations on the Python
side of things. If you wanted to code in C++, you
could access the GPU via detectMultiScale for
added speed.

Nashwan
March 28, 2017 at 2:01 pm

Hi Adrian;
When I run this code it gives me this error:

detectmultiscale.py: error: argument -i/--image is required

I use OpenCV 3.0 and Python 2.7 on Windows 10.

I'm waiting for your help…

Adrian Rosebrock
March 31, 2017 at 2:08 pm

Please read up on command line arguments and how to use them
before you continue.

mukesh
April 4, 2017 at 4:47 pm

Hi Adrian,
I tried

(rects, weights) = hog.detectMultiScale(image, winStride=winStride,
padding=padding, scale=args["scale"], useMeanshiftGrouping=meanShift)

and when I printed rects and weights I got empty tuples.
I'm a beginner and need some help.
Waiting for your help.

Adrian Rosebrock
April 5, 2017 at 11:54 am

If you did not obtain any bounding boxes then the parameters to
.detectMultiScale need some tuning. Your image might also
contain poses that are not suitable for the pre-trained pedestrian
detector provided with OpenCV.

ramdan
May 8, 2017 at 4:13 am

Hi Adrian

How do I train the HOG descriptor?

Adrian Rosebrock
May 8, 2017 at 12:17 pm

I detail the steps to train a HOG + Linear SVM detector here. I then
demonstrate how to implement the HOG + Linear SVM detector
inside the PyImageSearch Gurus course.

Sunil
June 12, 2017 at 7:50 am

Nice effort put into the article, Adrian. Is there any relation between the
minimum and maximum detectable object size and the parameters of the HOG/SVM
detector in OpenCV?

Adrian Rosebrock
June 13, 2017 at 11:01 am

I'm not sure what you mean by minimum/maximum size. Are you
referring to the object you're trying to detect? The HOG window?
Keep in mind that we use image pyramids to find objects at varying
scales in an image. You might need to upscale your image before
applying the image pyramid + sliding window to detect very small
objects in the background.

Sunil
June 14, 2017 at 3:03 am

Yeah, sorry for not being clear. I was wondering if there is a
relation between the max/min object size which can be
detected in a given image and the size of the HOG window used.
Actually I am trying to see whether increasing the resolution by
a factor of two in each dimension has a positive effect on
object detection.

Adrian Rosebrock
June 16, 2017 at 11:33 am

Increasing the resolution will enable you to detect
objects that would otherwise be too small for the sliding
window to capture. The downside is that the HOG +
Linear SVM detector now has more data to process,
thus making it substantially slower.
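That slowdown can be estimated by counting sliding-window positions across the image pyramid. A rough sketch only: the 64×128 window, 8-pixel stride, and 1.05 scale are assumed defaults, not OpenCV's exact evaluation schedule:

```python
def num_windows(img_w, img_h, win_w=64, win_h=128, stride=8, scale=1.05):
    # total sliding-window evaluations summed over all pyramid layers
    total = 0
    w, h = img_w, img_h
    while w >= win_w and h >= win_h:
        total += ((w - win_w) // stride + 1) * ((h - win_h) // stride + 1)
        w, h = int(w / scale), int(h / scale)
    return total

# doubling the resolution in each dimension multiplies the work several-fold,
# since every layer grows and the pyramid gains extra large layers on top
print(num_windows(400, 300))
print(num_windows(800, 600))
```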

alberto
June 12, 2017 at 11:59 am

Hello,

I've trained my own HOG detector using the command
"opencv_traincascade" with the "-featureType HOG" flag on, and it successfully
generated a .xml file as a HOG detector.

How can I use my own XML file with the functions
"cv2.HOGDescriptor()" and "hog.setSVMDetector()", so I can test my
HOG detector in action?
I have only found working examples of the default people detector
"cv2.HOGDescriptor_getDefaultPeopleDetector()".

Thanks,
Alberto

Claude
June 25, 2017 at 2:58 pm

Hi Alberto,

I have my SVM in YML, and then use:

hog = cv2.HOGDescriptor(IMAGE_SIZE, BLOCK_SIZE, BLOCK_STRIDE, CELL_SIZE, NR_BINS)
svm = cv2.ml.SVM_load("trained_svm2.yml")
hog.setSVMDetector(svm.getSupportVectors())

Maybe it will "just work" with an XML file as well.

Mauro
July 2, 2017 at 11:28 am

Hi Adrian, nice job!
I've used your script to detect humans in pictures saved by Motion on Debian.
It works very well, but sometimes HOG detects a cat in the image; I've tried
some combinations of values for scale, padding and winStride, but without
success.
Is there a way to set hog.detectMultiScale to ignore objects smaller than a
given size?
I've played with hitThreshold and finalThreshold, but they don't do exactly
that.
Thanks a lot,

Best Regards
Mauro

Adrian Rosebrock
July 5, 2017 at 6:27 am

I would suggest looping over the weights and rects together and
discarding any weight scores that are too low (you'll likely have to
define what "too small" is yourself).
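A sketch of that filtering, with both a confidence and a size check (min_weight and min_area are made-up thresholds you would tune for your own data, and the sample rects/weights are hypothetical detectMultiScale output):

```python
def filter_detections(rects, weights, min_weight=0.5, min_area=2000):
    # keep only boxes whose SVM confidence and pixel area clear the thresholds
    kept = []
    for (x, y, w, h), score in zip(rects, weights):
        if score >= min_weight and w * h >= min_area:
            kept.append((x, y, w, h))
    return kept

# hypothetical output: one strong person detection, one weak tiny hit
rects = [(40, 20, 64, 128), (300, 210, 24, 48)]
weights = [1.7, 0.3]
print(filter_detections(rects, weights))  # [(40, 20, 64, 128)]
```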

ziyi
October 18, 2017 at 1:32 am

Hi, I am running the HOG descriptor on the exact same image as in your
example, with default params and the default people descriptor,
but my speed is very slow, 800+ms per frame.
My PC is an i7 with 4 cores. Is there anything wrong I am doing? I can't see
why yours took under 0.09 seconds.

Adrian Rosebrock
October 19, 2017 at 4:55 pm

It sounds like you may have compiled and installed OpenCV without
optimization libraries such as BLAS. How did you install OpenCV on
your system? Did you follow one of my tutorials here on
PyImageSearch?

Irum Zahra
December 11, 2017 at 1:30 pm

Hi Adrian! This tutorial is so helpful. You are doing a wonderful job. But I
have a question: can you guide me on how to increase the distance from
which it can detect pedestrians? What parameters should I change? I want to
detect pedestrians from 30 feet.

Tjado
December 12, 2017 at 10:44 am

Hey Adrian, very nice tutorial, thanks!

I tried to create a function for locating faces, which calls the
detectMultiScale method and returns the result.
It should be callable with different parameters, but if I try to hand over
the minSize tuple as a default parameter the following exception occurs:
TypeError: Argument given by name ('flags') and position (4)

def locate_faces(image, scaleFactor=2, minNeighbors=4, minSize=(50, 50)):
    face_one = faceDet.detectMultiScale(image, scaleFactor, minNeighbors,
        minSize, flags=cv2.CASCADE_SCALE_IMAGE)
    return face_one

Why?
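The error above arises because minSize is passed positionally into the slot the cv2 binding reserves for flags. The parameter order assumed here, (image, scaleFactor, minNeighbors, flags, minSize, maxSize), follows the OpenCV docs; the stand-in below reproduces the collision and the keyword-argument fix without needing cv2:

```python
# Stand-in mirroring the assumed parameter ORDER of the cv2 binding:
# (image, scaleFactor, minNeighbors, flags, minSize, maxSize)
def detect_multi_scale(image, scaleFactor=1.1, minNeighbors=3,
                       flags=0, minSize=None, maxSize=None):
    return (scaleFactor, minNeighbors, flags, minSize)

# Passing minSize positionally drops it into the *flags* slot, so also
# giving flags= by keyword raises the TypeError from the comment above:
try:
    detect_multi_scale("img", 2, 4, (50, 50), flags=1)
except TypeError as err:
    print("TypeError:", err)

# The fix: pass everything by keyword so each value lands where intended
out = detect_multi_scale("img", scaleFactor=2, minNeighbors=4,
                         minSize=(50, 50), flags=1)
print(out)  # (2, 4, 1, (50, 50))
```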

Roberto O
February 13, 2018 at 3:03 am

Hi Adrian, first of all, thanks for sharing your expertise and knowledge with
everyone. In my own experience, learning about computer vision and
OpenCV can be quite a challenge and very frustrating when you can't find
useful information about some topic. Once again: thanks a lot!

So… I have a question about real-time implementation. I've been "playing
around" with the winStride and scale parameters and I managed to get real-time
video feedback; nevertheless, I can't get ANY detection in any frame. I think I
have stumbled upon a wall and I can't figure out how to get this working. If you
could give me some tips about tuning those parameters in a way that
pedestrian detection can be accomplished for my real-time application, I
would appreciate it A LOT. Thanks in advance. See ya!

Adrian Rosebrock
February 13, 2018 at 9:30 am

Hey Roberto — thanks for the comment, I'm glad you're finding the
PyImageSearch blog helpful!

To address your question:

Are you using the OpenCV HOG pedestrian detector covered in this
post? Or are you using your own model?

Keep in mind that the window stride and scale are important not only
for speed, but for obtaining the actual detections as well. You may
need to sacrifice speed for accuracy.

GabriellaK
February 26, 2018 at 4:12 pm

Hi, your posts are very helpful for me.
I have a question. I'm trying to detect people from a real-time webcam and I'm
using this code https://programtalk.com/vs2/python/3138/football-stats/test_scripts/people_detect.py/ , but it's not as good as you said 🙁
But what do you mean by "resizing your image to be as small as possible"?
What function do I have to use?
imutils.resize()?

Hope you will answer soon 😀

Adrian Rosebrock
February 27, 2018 at 11:36 am

You can use either imutils.resize or cv2.resize to resize your image.
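For reference, the aspect-ratio-preserving arithmetic behind that resize can be sketched as below (only the dimension computation; the actual pixel resampling is done by cv2.resize, which imutils.resize wraps):

```python
def resize_dims(w, h, width=400):
    # fix the new width and scale the height by the same ratio,
    # in the style of imutils.resize when only width= is given
    r = width / float(w)
    return width, int(h * r)

print(resize_dims(1920, 1080, width=400))  # (400, 225)
```

A smaller frame means far fewer sliding-window positions for detectMultiScale to evaluate, which is why this is the first speed lever to pull.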

Jason
March 12, 2018 at 10:20 pm

Hi all,

I've learned a lot from these posts and I've spent some time trying to write
my own implementations of SVM and HOG to gain a better understanding of
the concepts. Some parts failed, some work but slowly, and some actually
end up being better than the reference I'm using. So I'd like to share with
you the "better" part: a vectorized implementation of HOG feature
extraction using only numpy+scipy.

To put it short: it returns HOG feature vectors for all sliding windows on an
image in one go. Tested on a 512×512 image with a window size of
200×200, this is 20~30 times faster than a naive sliding window + skimage's
hog function. For a single image, one can just set the window size the same
as the image size, and still there is a 20-30% speed gain over skimage.

The link to the git repo: https://github.com/Xunius/HOG_python.
Any feedback is appreciated.

Adrian Rosebrock
March 14, 2018 at 12:52 pm

Thanks for sharing, Jason!

David Wilson Romero Guzman
May 9, 2018 at 2:54 am

Hey Adrian,

Thank you very much for your post! Really good 🙂

I have a problem with the detectMultiScale() function. I am developing an
object recognizer for multiple objects. For this, I trained n binary SVMs
(object_i/no_object_i). On the test set (with patches of the same size) I get
accuracy of around 90%, which is quite OK. However, when I use them to
detect objects in a bigger image (i.e. with detectMultiScale()), regardless of
the model I use, I get a pretty window right in the middle of the image :/

– Do you have any idea of what could be the issue here?
– When I use the detect() function I get the exact opposite
situation: squares everywhere.

Best Regards,

David

Adrian Rosebrock
May 9, 2018 at 9:29 am

Congrats on training your own model David, that’s great. However, I’m
not sure why your detector would be falsely reporting a detection in
the middle of the image each and every time. That may be an issue
with your training data but unfortunately I’m not sure what the root
cause is.

Alex
May 20, 2018 at 5:09 pm

Hi Adrian,
I've tried it with an mp4 video, but it doesn't work. What should I change to
detect people in video? Maybe I should do something with
imutils.resize()?

Adrian Rosebrock
May 22, 2018 at 6:07 am

Hey Alex — could you be a bit more specific when you say "it doesn't
work"? What specifically does not work? Are people not detected?
Does the code throw an error?

Alex
May 22, 2018 at 4:24 pm

people not detected

Adrian Rosebrock
May 23, 2018 at 7:14 am

Haar cascades and HOG + Linear SVM detectors are
not very good at handling changes in rotation and
orientation. My guess is that your input images/frames
differ significantly from what the model was trained on.
You may want to try a deep learning based object
detector.

kritesh
August 10, 2018 at 9:37 am

Here, the code works well when the person is vertical, but when the person
is sleeping or horizontal it shows zero detections.
What algorithm satisfies both cases?

Adrian Rosebrock
August 15, 2018 at 9:24 am

There is no such thing as a "perfect detection algorithm". Deep
learning-based object detectors may help you, though.

Vikran
August 13, 2018 at 8:22 am

Hi Adrian,

I am trying to detect objects using the TensorFlow API by training the models,
and once an object is detected, I am trying to blur that particular detected
part. Can we pass the detected object as input to
cv2.CascadeClassifier?

Adrian Rosebrock
August 15, 2018 at 8:52 am

Hey Vikran — I'm a bit confused by your comment. If you're using the
TensorFlow Object Detection API to detect an object, why would you
want to further classify it? You already have the detected object's
bounding box and label.

roja
December 12, 2018 at 2:23 am

Hi Adrian,
I have some code that works well if I resize the image. If I don't resize
the image it doesn't work properly. Is it necessary to resize the image
before giving it to HOG?

Adrian Rosebrock
December 13, 2018 at 9:07 am

Yes, you must resize the image prior to passing it into HOG. HOG
requires fixed image/ROI dimensions; otherwise your output feature
vector dimensions could be different (depending on your
implementation).
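The dependency between window size and feature length can be sketched with standard HOG bookkeeping (8×8 cells, 2×2 blocks, and 9 orientation bins are assumed, matching the usual pedestrian-detector defaults; the 64×128 window yields the familiar 3,780-dimensional vector):

```python
def hog_feature_length(win_w=64, win_h=128, cell=8, block=2, bins=9):
    # blocks slide one cell at a time across the window; each block
    # contributes block * block * bins histogram values
    blocks_x = win_w // cell - (block - 1)
    blocks_y = win_h // cell - (block - 1)
    return blocks_x * blocks_y * block * block * bins

print(hog_feature_length(64, 128))  # 3780
print(hog_feature_length(96, 128))  # a different window -> a different length
```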

silver
March 22, 2019 at 4:20 pm

Is there any way to improve accuracy when the size of the person is very small
in the image? It works very well with a clear and reasonably sized
person; however, my image has low quality and the person is very
tiny.

Levi
June 20, 2019 at 7:48 am

Is there a way to combine HOG with the last layers of a YOLO network to
perform object detection?

Adrian Rosebrock
June 26, 2019 at 1:55 pm

No, and there’s not really a reason to do that either. Use one or the
other.

