
TRAIN YOUR OWN OPENCV HAAR CLASSIFIER

22 July 2013, posted by Thorsten Ball

Open this page, allow it to access your webcam and see your face getting recognized
by your browser using JavaScript and OpenCV, an “open source computer vision
library”. That’s pretty cool! But recognizing faces in images is not something terribly
new and exciting. Wouldn’t it be great if we could tell OpenCV to recognize something
of our choice, something that is not a face? Let’s say… a banana?
That is totally possible! What we need is a “cascade classifier for Haar features” to point OpenCV at. A cascade classifier basically tells OpenCV
what to look for in images. In the example above a classifier for face features was
being used. There are a lot of cascade classifiers floating around on the internet and
you can easily find a different one and use it. But most of them are for recognizing
faces, eyes, ears and mouths, and it would be great if we could tell OpenCV to
recognize an object of our choice. We need a cascade classifier that tells OpenCV
how to recognize a banana.
Here’s the good news: we can generate our own cascade classifier for Haar features.
Over in computer vision land that’s called “training a cascade classifier”. Even better
news: it’s not really difficult. And by “not really” I mean it takes time and a certain
amount of willingness to dig through the internet to find the relevant information and
tutorials on how to do it, but you don’t need a PhD and a lab.
But now for the best of news: keep on reading! We’ll train our own cascade classifier in
the following paragraphs and you just need to follow the steps described here.
The following instructions are heavily based on Naotoshi Seo’s immensely
helpful notes on OpenCV haartraining and make use of the scripts and resources he
released under the MIT license. This is an attempt to provide a more up-to-date step-
by-step guide for OpenCV 2.x that wouldn’t be possible without his work — Thanks
Naotoshi!

LET’S GET STARTED


The first thing you need to do is clone the repository on GitHub I made for this post. It
includes some empty directories (which we’ll fill later on), some utilities and, most
importantly, a level playing field.

git clone https://github.com/mrnugget/opencv-haar-classifier-training

You’ll also need OpenCV on your system. The preferred installation methods differ
between operating systems so I won’t go into them, but be sure to get at least OpenCV
2.4.5, since that’s the version this post is based on and OpenCV 2.x had some major
API changes. Build OpenCV with TBB enabled, since that allows us to make use of
multiple threads while training the classifier.
If you’re on OS X and use homebrew it’s as easy as this:

brew tap homebrew/science

brew install --with-tbb opencv
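
To double-check which version you ended up with (assuming pkg-config can see OpenCV, which it should after the homebrew install, since this guide relies on pkg-config later anyway), a quick sanity check:

pkg-config --modversion opencv

Anything from 2.4.5 upwards should work for the steps below.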

Another thing we need is the OpenCV source code corresponding to our installed
version. So if your preferred installation method doesn’t provide you with access to it,
go and download it. If you get a compiler error further down, be sure to try it with
OpenCV 2.4.5.
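
If your installation method doesn’t ship the sources, here is one possible way to fetch them; the GitHub tag archive is an assumption on my part, so adjust the URL to wherever you prefer getting OpenCV from:

cd ~
wget https://github.com/opencv/opencv/archive/2.4.5.tar.gz
tar -xzf 2.4.5.tar.gz  # unpacks into ~/opencv-2.4.5, the path used further down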

SAMPLES
In order to train our own classifier we need samples, which means we need a lot of
images that show the object we want to detect (positive samples) and even more
images without the object (negative samples).
How many images do we need? The numbers depend on a variety of factors, including
the quality of the images, the object you want to recognize, the method to generate the
samples, the CPU power you have and probably some magic.
Training a highly accurate classifier takes a lot of time and a huge number of samples.
The classifiers made for face recognition are great examples: they were created by
researchers with thousands of good images. The TechnoLabsz blog has a great
post that provides some information based on their experience:

It is unclear exactly how many of each kind of image are needed. For Urban
Challenge 2008 we used 1000 positive and 1000 negative images whereas the
previous project Grippered Bandit used 5000. The result for the Grippered Bandit
project was that their classifier was much more accurate than ours.

This post here is just an introduction and getting a large number of good samples is
harder than you might think, so we’ll just settle on the right amount that gives us
decent results and is not too hard to come by:
I’ve had success with the following numbers for small experiments: 40 positive
samples and 600 negative samples. So let’s use those!

POSITIVE IMAGES
Now we need to either take photos of the object we want to detect, look for them on
the internet, extract them from a video or take some Polaroid pictures and then scan
them: whatever it takes! We need 40 of them, which we can then use to generate
positive samples OpenCV can work with. It’s also important that they differ in
lighting and background.
Once we have the pictures, we need to crop them so that only our desired object
is visible. Keep an eye on the ratios of the cropped images; they shouldn’t differ too
much. The best results come from positive images that look exactly like the ones you’d
want to detect the object in, except that they are cropped so only the object is visible.
Again: a lot of this depends on a variety of factors, and a big part of it is probably black
computer science magic, but since I’ve had pretty good results by cropping the images
and keeping their ratio nearly the same, let’s do that now.
To give you an idea, here are some scaled down positive images for the banana
classifier:
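
If your crops drifted apart ratio-wise, a hedged one-liner with ImageMagick (assuming it’s installed; note that mogrify overwrites the files in place, and the ! forces the exact geometry, distorting images whose ratio is far off) can normalize everything to the 2:1 ratio we’ll train with later:

mogrify -resize 160x80! ./positive_images/*.jpg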

Take the positive, cropped images and put them in the ./positive_images directory of
the cloned repository.
Then, from the root of the repository, run this command in your shell:

find ./positive_images -iname "*.jpg" > positives.txt

NEGATIVE IMAGES
Now we need the negative images, the ones that don’t show a banana. In the best
case, if we were to train a highly accurate classifier, we would have a lot of negative
images that look exactly like the positive ones, except that they don’t contain the object
we want to recognize. If you want to detect stop signs on walls, the negative images
would ideally be a lot of pictures of walls. Maybe even with other signs.
We need at least 600 of them. And yes, collecting them by hand takes a long
time. I know, I’ve been there. But again: you could take a video file and extract the
frames as images. That way you’d get 600 pictures pretty fast.
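
If you take the video route, something along these lines does the job, assuming ffmpeg is installed (the input filename here is just a placeholder; -vf fps=2 grabs two frames per second):

ffmpeg -i banana_background.mov -vf fps=2 ./negative_images/frame_%04d.jpg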
For the banana classifier I used random photos from my iPhoto library and some
photos of the background where I photographed the banana earlier, since the classifier
should be able to tell OpenCV about a banana in pretty much any picture.
Once we have the images, we put all of them in the negative_images folder of the
repository and use find to save the list of relative paths to a file:

find ./negative_images -iname "*.jpg" > negatives.txt
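
A quick sanity check that both lists contain roughly what we expect before moving on:

wc -l positives.txt negatives.txt  # should report about 40 and 600 lines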

CREATING SAMPLES
With our positive and negative images in place, we are ready to generate samples out
of them, which we will use for the training itself. We need positive and negative
samples. Luckily we already have the negative samples in place. To quote the
OpenCV documentation about negative samples:

"Negative samples are taken from arbitrary images. These images must not contain

detected objects. Negative samples are enumerated in a special file. It is a

text file in which each line contains an image filename (relative to the

directory of the description file) of negative sample image."

That means our negatives.txt will serve as a list of negative samples. But we still
need positive samples, and there are a lot of different ways to get them, which all lead
to different results regarding the accuracy of your trained classifier. Be sure to read the
reference about creating samples in Naotoshi Seo’s tutorial to know what it’s all about.
We’re going to use a method that doesn’t need a lot of preparation or a large number
of positive or negative images. We’ll use a tool OpenCV gives
us: opencv_createsamples. This tool offers several options as to how to generate samples
out of input images and gives us a *.vec file which we can then use to train our
classifier.
opencv_createsamples generates a large number of positive samples from our positive
images by applying transformations and distortions. Since one can only transform an
image so much before it stops being a meaningfully different version, we need a little help to
get a larger number of samples out of our relatively small number of input images.
Naotoshi Seo wrote some really useful scripts that help a lot when generating
samples. The first one we’ll use is createsamples.pl, a small Perl script, to get 1500
positive samples, by combining each positive image with a random negative image
and then running them through opencv_createsamples.

So, let’s make sure we’re in the root directory of the repository and fire this up:

perl bin/createsamples.pl positives.txt negatives.txt samples 1500 \
  "opencv_createsamples -bgcolor 0 -bgthresh 0 -maxxangle 1.1 \
  -maxyangle 1.1 -maxzangle 0.5 -maxidev 40 -w 80 -h 40"

This shouldn’t take too long. There is a lot of information
about opencv_createsamples available online, so be sure to read up on it if you want to
tune the parameters. What you need to pay attention to are -w and -h: they should
have the same ratio as your positive input images.
The next thing we need to do is to merge the *.vec files we now have in
the samples directory. In order to do so, we need to get a list of them and then use
Naotoshi Seo’s mergevec.cpp tool, which I’ve included in the src directory of the
repository, to combine them into one *.vec file.

In order to use it, we need to copy it into the OpenCV source directory and compile it
there with other OpenCV files:

cp src/mergevec.cpp ~/opencv-2.4.5/apps/haartraining
cd ~/opencv-2.4.5/apps/haartraining

g++ `pkg-config --libs --cflags opencv` -I. -o mergevec mergevec.cpp \
  cvboost.cpp cvcommon.cpp cvsamples.cpp cvhaarclassifier.cpp \
  cvhaartraining.cpp \
  -lopencv_core -lopencv_calib3d -lopencv_imgproc -lopencv_highgui -lopencv_objdetect

Then we can go back to the repository, bring the executable mergevec with us and use
it:

find ./samples -name '*.vec' > samples.txt

./mergevec samples.txt samples.vec

We can now use the resulting samples.vec to start the training of our classifier.
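
Before starting the long training run it’s worth eyeballing the merged file. opencv_createsamples doubles as a viewer when handed just a *.vec file and the sample dimensions; step through the samples with any key:

opencv_createsamples -vec samples.vec -w 80 -h 40 -show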

TRAINING THE CLASSIFIER


OpenCV offers two different applications for training a Haar
classifier: opencv_haartraining and opencv_traincascade. We are going to
use opencv_traincascade since it allows the training process to be multi-threaded,
reducing the time it takes to finish, and is compatible with the newer OpenCV 2.x API.
Whenever you run into a problem when loading your classifier, make sure the
application knows how to handle the format of the cascade file, since cascade files
generated by opencv_haartraining and opencv_traincascade differ in format.

Many tools and libraries out there (including jsfeat, which is used in the example in the
first paragraph) only accept classifiers in the old format. The reason for this is most
likely that the majority of the easily available classifiers (e.g. the ones that ship with
OpenCV) still use that format. So if you want to use these libraries and tools you
should use opencv_haartraining to train your classifier.
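
If you’re ever unsure which format a given XML file is in, a rough heuristic (based on my reading of the 2.4.x output, so treat it as an assumption) is to look at the type_id attribute near the top: the old haartraining format declares type_id="opencv-haar-classifier", while opencv_traincascade output contains a <cascade> node instead:

head -5 my_cascade.xml  # my_cascade.xml is a placeholder name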

So let’s point opencv_traincascade at our positive samples (samples.vec) and negative
images, tell it to write its output into the classifier directory of our repository, and give it
the sample size (-w and -h). -numNeg specifies how many negative samples there are, and
-precalcValBufSize and -precalcIdxBufSize specify how much memory to use while training.
-numPos should be lower than the number of positive samples we generated.

opencv_traincascade -data classifier -vec samples.vec -bg negatives.txt \
  -numStages 20 -minHitRate 0.999 -maxFalseAlarmRate 0.5 -numPos 1000 \
  -numNeg 600 -w 80 -h 40 -mode ALL -precalcValBufSize 1024 \
  -precalcIdxBufSize 1024

Read at least this post on the OpenCV answer board and the official
documentation about opencv_traincascade to get a better understanding of the
parameters in use here. Especially the -numPos parameter can cause some problems.

This is going to take a lot of time. And I don’t mean the old “get a coffee and come
back” taking-a-lot-of-time, no. Running this took a couple of days on my mid-2011
MacBook Air. It will also use a lot of memory and CPU. Do something else while this
runs and come back once you notice that it finished. Even better: use an EC2 box to
run this on.
You don’t have to keep the process running without any interruptions though: you can
stop and restart it at any time and it will proceed from the latest training stage it
finished.
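
Since the training is resumable anyway, it’s convenient to run it detached so it survives a closed terminal or a dropped SSH connection (plain nohup here; screen or tmux work just as well):

nohup opencv_traincascade -data classifier -vec samples.vec -bg negatives.txt \
  -numStages 20 -minHitRate 0.999 -maxFalseAlarmRate 0.5 -numPos 1000 \
  -numNeg 600 -w 80 -h 40 -mode ALL -precalcValBufSize 1024 \
  -precalcIdxBufSize 1024 > training.log 2>&1 &

tail -f training.log  # watch the progress; Ctrl-C only detaches from the log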

OUTPUT DURING OPENCV TRAINING


Thank you Kevin Hughes for providing this chapter.
After starting the training program it will print back its parameters and then start
training. Each stage will print out some analysis as it is trained:

===== TRAINING 0-stage =====
<BEGIN
POS count : consumed   1000 : 1000
NEG count : acceptanceRatio    600 : 1
Precalculation time: 11
+----+---------+---------+
|  N |    HR   |    FA   |
+----+---------+---------+
|   1|        1|        1|
+----+---------+---------+
|   2|        1|        1|
+----+---------+---------+
|   3|        1|        1|
+----+---------+---------+
|   4|        1|        1|
+----+---------+---------+
|   5|        1|        1|
+----+---------+---------+
|   6|        1|        1|
+----+---------+---------+
|   7|        1| 0.711667|
+----+---------+---------+
|   8|        1|     0.54|
+----+---------+---------+
|   9|        1|    0.305|
+----+---------+---------+
END>
Training until now has taken 0 days 3 hours 19 minutes 16 seconds.

Each row represents a feature that is being trained and contains some output about its
HitRatio (HR) and FalseAlarm (FA) ratio. If a training stage only selects a few features (e.g. N =
2) then it’s possible something is wrong with your training data.
At the end of each stage the classifier is saved to a file and the process can be
stopped and restarted. This is useful if you are tweaking a machine/settings to
optimize training speed.
When the process is finished we’ll find a file called classifier.xml in
the classifier directory. This is it: our classifier, which we can now use to detect
bananas with OpenCV! The sky is the limit now.

USING OUR OWN CLASSIFIER


NODE.JS AND OPENCV
Let’s give our classifier a shot by using Node.js and the node-opencv module. We’ll
use one of the examples in the repository and modify it to scan multiple image files.

var cv = require('opencv');

var color = [0, 255, 0];
var thickness = 2;
var cascadeFile = './my_cascade.xml';

var inputFiles = [
  './recognize_this_1.jpg', './recognize_this_2.jpg', './recognize_this_3.jpg',
  './recognize_this_4.jpg', './recognize_this_5.jpg', './recognize_this_6.jpg'
];

inputFiles.forEach(function(fileName) {
  cv.readImage(fileName, function(err, im) {
    im.detectObject(cascadeFile, {neighbors: 2, scale: 2}, function(err, objects) {
      console.log(objects);

      // Draw a green rectangle around every detected object
      for (var k = 0; k < objects.length; k++) {
        var object = objects[k];
        im.rectangle(
          [object.x, object.y],
          [object.x + object.width, object.y + object.height],
          color,
          thickness
        );
      }

      im.save(fileName.replace(/\.jpg/, 'processed.jpg'));
    });
  });
});

This code is pretty straightforward: change inputFiles so that it contains the paths to
the files that include the object we want to detect and mark, then run it with node
recognize_this.js. It will read in the specified inputFiles with OpenCV and try to
detect objects with our cascade classifier.
Now open all the files ending in *processed.jpg and see if your classifier works: if
OpenCV detected one or more objects in one of the input files it should have marked
them with a green rectangle. If the results are not to your liking, try playing around with
the neighbors and scale options passed to detectObject.

The banana classifier works quite well:

OPENCV’S FACEDETECT
Now let’s use our webcam to detect a banana! OpenCV ships with a lot of samples
and one of them is facedetect.cpp, which allows us to detect objects in video frames
captured by a webcam. Compiling and using it is pretty straightforward:

cd ~/opencv-2.4.5/samples/c

chmod +x build_all.sh

./build_all.sh

./facedetect --scale=2 --cascade="/Users/mrnugget/banana_classifier.xml"

Be sure to play around with the --scale parameter, especially if the results are not
what you expected. If all goes well, you should see something like this:

IN CLOSING
OpenCV, Haar classifiers and image detection are vast topics that are nearly
impossible to cover in a blog post of this size, but I hope this post helps you to get your
feet wet and gives you an idea of what’s possible. If you want to get a more thorough
understanding start reading through the references linked below.
Be sure to dig through the samples folder of the OpenCV source and play around with other
language bindings and libraries. With your own classifier it’s even more fun than “just”
detecting faces. And if your classifier is not yet finished, try playing around with the
banana classifier I put in the trained_classifier directory of the repository. I’m sure
there are thousands of super practical use cases for it, so go ahead and use it. And if
you want to add your classifier to the directory just send a pull request!
If there are any questions, problems or remarks: leave a comment!

http://opencvuser.blogspot.mx/2011/08/creating-haar-cascade-classifier-aka.html

http://note.sonots.com/SciSoftware/haartraining.html

Tutorial: OpenCV haartraining (Rapid Object Detection With A Cascade of Boosted Classifiers Based on Haar-like Features)
Table of Contents

- Objective
- Data Preparation
  - Positive (Face) Images
  - Negative (Background) Images
  - Natural Test (Face in Background) Images
  - How to Crop Images Manually Fast
- Create Samples (Reference)
  - 1. Create training samples from one
  - 2. Create training samples from some
  - 3. Create test samples
  - 4. Show images
  - EXTRA: random seed
- Create Samples
  - Create Training Samples
  - Create Testing Samples
- Training
  - Haar Training
  - Generate an XML File
- Testing
  - Performance Evaluation
  - Fun with a USB camera
- Experiments
  - PIE Experiment 1
  - PIE Experiment 2
  - PIE Experiment 3
  - PIE Experiment 4
  - PIE Experiment 5
  - PIE Experiment 6
  - UMIST Experiment 1
  - UMIST Experiment 2
  - CBCL Experiment 1
  - haarcascade_frontalface_alt2.xml
- Discussion
- Download
  - How to enable OpenMP
- References

Objective
The OpenCV library provides a very interesting demonstration of face detection.
Furthermore, it provides the programs (or functions) that were used to train classifiers for
its face detection system, called HaarTraining, so that we can create our own object
classifiers using these functions. It is interesting.

However, I could not follow exactly how the OpenCV developers performed the haartraining
for their face detection system, because they did not provide certain information, such as
which images and parameters they used for training. The objective of this report is to
provide step-by-step procedures for people who follow.

My working environment is Visual Studio + cygwin on Windows XP, or Linux. The cygwin
is required because I use several UNIX commands. I am sure that you will use cygwin
(especially, I mean, the UNIX commands) not only for this haartraining but also for other
tasks in the future if you are an engineering or science person.

FYI: I recommend working on the haartraining concurrently with something else, because
you have to wait many days during training (it could possibly take one week). I
typically experimented as follows: 1. run haartraining on Friday, 2. forget about it completely,
3. see results the next Friday, 4. run another haartraining (loop).

A picture from the OpenCV website


History

- 10/16/2008 - Additional experimental results.
- 08/28/2008 - Revised entirely.
- 06/05/2007 - opencv-1.0.0
- 03/12/2006 - First Edition (opencv-0.9.7)

Tag: SciSoftware ComputerVision FaceDetection OpenCV

Data Preparation
FYI: There are database lists on Face Recognition Homepage - Databases
and Computer Vision Test Images.

Positive (Face) Images


We need to collect positive images that contain only objects of interest, e.g., faces.

Kuranov et al. [3] mention that they used 5000 positive frontal face patterns, and that these
5000 positive frontal face patterns were derived from 1000 original faces. I describe how to
increase the number of samples in a later chapter.

Previously, I downloaded and used The UMIST Face Database (Dead Link) because cropped
face images were available there. The UMIST Face Database has video-like image
sequences from side faces to frontal faces. I thought training with such images would
generate a face detector robust to facial pose. However, the generated face
detector did not work well. Probably, I dreamed too much. That was a story from 2006.

I obtained a cropped frontal face database based on the CMU PIE Database. I use it too.
This dataset has large illumination variations, which could result in the same bad
outcome as in the case of the UMIST Face Database, which had large variations in pose.
#Sorry, it looks like redistribution (of modifications) of the PIE database is not allowed. I made only
a generated (distorted and diminished) .vec file available at the Download section. The
PIE database is free (send a request e-mail), but it does not include the cropped faces
originally.

MIT CBCL Face Data is another choice. It has 2,429 frontal faces with few
illumination variations and pose variations. This data would be good for haartraining.
However, the image size is originally small, 19 x 19, so we cannot perform experiments
to determine good sizes.

Probably, the OpenCV developers used the FERET database. It appears that the FERET
database became available to download over the internet from Jan. 31, 2008(?).

Negative (Background) Images


We need to collect negative images that do not contain the objects of interest, e.g., faces, to
train the haarcascade classifier.

Kuranov et al. [3] state that they used 3000 negative images.

Fortunately, I found http://face.urtho.net/ (Negatives sets, Set 1 - Various negatives),
which had about 3500 images (Dead Link). However, this collection was used for eye detection
and includes some faces in some pictures. Therefore, I deleted all suspicious images which
looked like they included faces. About 2900 images remained, and I added 100 more.
The number should be enough.
The collection is available at the Download section (but it may take forever to download.)

Natural Test (Face in Background) Images


We can synthesize testing image sets using the createsamples utility, but having a natural
testing image dataset is still good.

There is a CMU-MIT Frontal Face Test Set that the OpenCV developers used for their
experiments. This dataset has a ground truth text file including the locations of
eyes, noses, and lip centers and tips; however, it does not have the locations of faces
expressed as the rectangular regions the haartraining utilities require by default.
I created a simple script to compute facial regions from the given ground truth information. My
computation works as follows:

1. Get a margin as nose height - mouth height.
   The lower boundary is located the margin below the mouth.
   The upper boundary is located the margin above the eyes.

2. Get a margin as left mouth tip - right mouth tip.
   The right boundary is located the margin to the right of the right eye.
   The left boundary is located the margin to the left of the left eye.

This was not perfect, but it looked okay.

The generated ground truth text and image dataset are available at the Download section;
you may download only the ground truth text. By the way, I converted GIF to PNG
because OpenCV does not support GIF. The mogrify (ImageMagick) command is
useful for such conversions of image types:
$ mogrify -format png *.gif

How to Crop Images Manually Fast


To collect positive images, you may have to crop a lot of images by hand.

I created a multi-platform software, imageclipper, to help do this. This software is not
only for haartraining but also for other computer vision/machine learning research. Its
characteristics are as follows:

- You can open images in a directory sequentially
- You can open a video file too, frame by frame
- Clipping and moving to the next image can be done with one button (SPACE)
- You select a region to clip by dragging the left mouse button
- You can move or resize your selected region by dragging the right mouse button
- Your selected region is shown on the next image too.

Create Samples (Reference)


We can create training samples and testing samples with the createsamples utility. In this
section, I describe the functionalities of the createsamples software, because the Tutorial [1]
did not explain them clearly to me (but please see the Tutorial [1] for further options too).

Below is the list of options. There are mainly four functions, and the meanings of the
options differ between functions, which is confusing.
Usage: ./createsamples
  [-info <description_file_name>]
  [-img <image_file_name>]
  [-vec <vec_file_name>]
  [-bg <background_file_name>]
  [-num <number_of_samples = 1000>]
  [-bgcolor <background_color = 0>]
  [-inv] [-randinv] [-bgthresh <background_color_threshold = 80>]
  [-maxidev <max_intensity_deviation = 40>]
  [-maxxangle <max_x_rotation_angle = 1.100000>]
  [-maxyangle <max_y_rotation_angle = 1.100000>]
  [-maxzangle <max_z_rotation_angle = 0.500000>]
  [-show [<scale = 4.000000>]]
  [-w <sample_width = 24>]
  [-h <sample_height = 24>]

1. Create training samples from one


The 1st function of the createsamples utility is to create training samples from one image
by applying distortions. This function (cvhaartraining.cpp#cvCreateTrainingSamples) is
launched when the options -img, -bg, and -vec are specified.

- -img <one_positive_image>
- -bg <collection_file_of_negatives>
- -vec <name_of_the_output_file_containing_the_generated_samples>

For example,

$ createsamples -img face.png -num 10 -bg negatives.dat -vec samples.vec \
  -maxxangle 0.6 -maxyangle 0 -maxzangle 0.3 -maxidev 100 -bgcolor 0 \
  -bgthresh 0 -w 20 -h 20

This generates <num> samples from one <positive_image> by applying
distortions. Be careful that only the first <num> negative images in the
<collection_file_of_negatives> are used.

The format of the <collection_file_of_negatives> is as follows:

[filename]
[filename]
[filename]
...

such as

img/img1.jpg
img/img2.jpg

Let me call this file format the collection file format.
How to create a collection file
This format can easily be created with the find command as

$ cd [your working directory]
$ find [image dir] -name '*.[image ext]' > [description file]

such as

$ find ../../data/negatives/ -name '*.jpg' > negatives.dat

2. Create training samples from some


The 2nd function is to create training samples from some images without applying
distortions. This function (cvhaartraining.cpp#cvCreateTestSamples) is launched when
options, -info, and -vec were specified.

 -info <description_file_of_samples>
 -vec <name_of_the_output_file_containing_the_generated_samples>
For example,

$ createsamples -info samples.dat -vec samples.vec -w 20 -h 20

This generates samples without applying distortions. You may think this function as a file
format conversion function.

The format of the <description_file_of_samples> is as follows:

[filename] [# of objects] [[x y width height] [... 2nd object] ...]

[filename] [# of objects] [[x y width height] [... 2nd object] ...]

[filename] [# of objects] [[x y width height] [... 2nd object] ...]

...

where (x,y) is the left-upper corner of the object where the origin (0,0) is the left-upper
corner of the image such as

img/img1.jpg 1 140 100 45 45

img/img2.jpg 2 100 200 50 50 50 30 25 25


img/img3.jpg 1 0 0 20 20

Let me call this format as a description file format against the collection file format
although the manual [1] does not differentiate them.
This function crops regions specified and resize these images and convert into .vec format,
but (let me say again) this function does not generate many samples from one
image (one cropped image) applying distortions. Therefore, you may use this 2nd function
only when you have already sufficient number of natural images and their ground truths
(totally, 5000 or 7000 would be required).
Note that the option -num is used only to restrict the number of samples to generate, not
to increase number of samples applying distortions in this case.

How to create a description file


I describe here how to create a description file when already-cropped image files are available,
because some people have asked how to create one on the OpenCV forum. Note that my
tutorial steps do not require you to perform this.

For such a situation, you can use the find command and the identify command (cygwin
should have the identify (ImageMagick) command) to create a description file as

$ cd <your working directory>
$ find <dir> -name '*.<ext>' -exec identify -format '%i 1 0 0 %w %h' \{\} \; > <description_file>

such as

$ find ../../data/umist_cropped -name '*.pgm' -exec identify -format '%i 1 0 0 %w %h' \{\} \; > samplesdescription.dat

If all images have the same size, it becomes simpler and faster,

$ find <dir> -name '*.<ext>' -exec echo \{\} 1 0 0 <width> <height> \; > <description_file>

such as

$ find ../../data/umist_cropped -name '*.pgm' -exec echo \{\} 1 0 0 20 20 \; > samplesdescription.dat

How to automate cropping images? If you can do that, you do not need haartraining: you
already have an object detector!
3. Create test samples
The 3rd function is to create test samples and their ground truth from a single image by
applying distortions. This function (cvsamples.cpp#cvCreateTrainingSamplesFromInfo) is
triggered when the options -img, -bg, and -info are specified.

- -img <one_positive_image>
- -bg <collection_file_of_negatives>
- -info <generated_description_file_for_the_generated_test_images>

In this case, -w and -h are used to determine the minimal size of positives to be embedded
in the test images.

$ createsamples -img face.png -num 10 -bg negatives.dat -info test.dat \
  -maxxangle 0.6 -maxyangle 0 -maxzangle 0.3 -maxidev 100 -bgcolor 0 \
  -bgthresh 0

Be careful that only the first <num> negative images in the <collection_file_of_negatives>
are used.

This generates tons of jpg files. The output image filename format is
<number>_<x>_<y>_<width>_<height>.jpg,
where x, y, width and height are the coordinates of the placed object's bounding rectangle.

Also, this generates the <description_file_for_test_samples> in the description file
format (the same format as the <description_file_of_samples> of the 2nd function).

4. Show images
The 4th function is to show the images within a vec file. This function
(cvsamples.cpp#cvShowVecSamples) is triggered when only the option -vec is specified
(no -info, -img, -bg). For example,

$ createsamples -vec samples.vec -w 20 -h 20


EXTRA: random seed
The createsamples software applies the same sequence of distortions to each image. We
may want to apply a different sequence of distortions to each image because, otherwise,
the resulting detector may work only for the specific distortions.

This can be done by modifying createsamples slightly:

Add the following at the top:

#include <time.h>

Add the following in the main function:

srand(time(NULL));

The modified source code is available at svn:createsamples.cpp

Create Samples
Create Training Samples
Kuranov et al. [3] mention that they used 5000 positive frontal face patterns and 3000
negatives for training, and that the 5000 positive frontal face patterns were derived from 1000
original faces.
However, you may have noticed that none of the 4 functions of the createsamples utility
provides a way to generate 5000 positive images from 1000 images in one batch. We
have to use the 1st function of createsamples to generate 5 (or some) positives from 1
image, repeat the procedure 1000 (or some) times, and finally merge the generated
output vec files. *1
I wrote a program, mergevec.cpp, to merge vec files. I also wrote a
script, createtrainsamples.pl, to repeat the procedure 1000 (or some) times. I
specified 7000 instead of 5000 as the default because the Tutorial [1] states that "the
reasonable number of positive samples is 7000." Please modify the path to createsamples
and its option parameters, which are written directly in the file.
The input format of createtrainsamples.pl is

$ perl createtrainsamples.pl <positives.dat> <negatives.dat> <vec_output_dir> [<totalnum = 7000>] [<createsample_command_options = "./createsamples -w 20 -h 20...">]

And the input format of mergevec is

$ mergevec <collection_file_of_vecs> <output_vec_file_name>

A collection file (a file containing a list of filenames) can be generated as

$ find [dir_name] -name '*.[ext]' > [collection_file_name]

Example)

$ cd HaarTraining/bin
$ find ../../data/negatives/ -name '*.jpg' > negatives.dat
$ find ../../data/umist_cropped/ -name '*.pgm' > positives.dat
$ perl createtrainsamples.pl positives.dat negatives.dat samples 7000 \
  "./createsamples -bgcolor 0 -bgthresh 0 -maxxangle 1.1 -maxyangle 1.1 \
  -maxzangle 0.5 -maxidev 40 -w 20 -h 20"
$ find samples/ -name '*.vec' > samples.dat # to create a collection file for vec files
$ mergevec samples.dat samples.vec
$ # createsamples -vec samples.vec -show -w 20 -h 20 # Extra: If you want to see inside

Kuranov et al. [3] state that a 20x20 sample size achieved the highest hit rate.
Furthermore, they state that "For 18x18 four split nodes performed best, while for 20x20
two nodes were slightly better." Thus, -w 20 -h 20 would be good.

Create Testing Samples


Testing samples are images which contain positives placed in negative background images,
where the locations of the positives in the images are known. It is possible to create such
testing images by hand. We can also use the 3rd function of createsamples to synthesize them.
But since we can specify only one image at a time, a script to repeat the procedure
helps. The script is available at svn:createtestsamples.pl. Please modify the
path to createsamples and its option parameters directly in the file.
The input format of createtestsamples.pl is

$ perl createtestsamples.pl <positives.dat> <negatives.dat> <output_dir> [<totalnum = 1000>] [<createsample_command_options = "./createsamples -w 20 -h 20...">]

This generates lots of jpg files and info.dat in the <output_dir>. The jpg filename format is
<number>_<x>_<y>_<width>_<height>.jpg, where x, y, width and height are the
coordinates of the placed object's bounding rectangle.

Example)

$ # cd HaarTraining/bin
$ # find ../../data/negatives/ -name '*.jpg' > negatives.dat
$ # find ../../data/umist_cropped/ -name '*.pgm' > positives.dat
$ perl createtestsamples.pl positives.dat negatives.dat tests 1000 \
  "./createsamples -bgcolor 0 -bgthresh 0 -maxxangle 1.1 -maxyangle 1.1 \
  -maxzangle 0.5 -maxidev 40"
$ find tests/ -name 'info.dat' -exec cat \{\} \; > tests.dat # merge info files
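
Incidentally, since the coordinates are encoded in the generated filenames, the same information can be rebuilt from the filenames alone if info.dat ever gets lost. A purely illustrative bash sketch, not part of the original tutorial:

$ for f in tests/*.jpg; do
    b=$(basename "$f" .jpg)            # e.g. 0001_0153_0005_0020_0020
    IFS=_ read -r num x y w h <<< "$b" # split on underscores
    echo "$f 1 $x $y $w $h"
  done > tests_rebuilt.dat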

Training
Haar Training
Now, we train our own classifier using the haartraining utility. Here is the usage of
haartraining:

Usage: ./haartraining
  -data <dir_name>
  -vec <vec_file_name>
  -bg <background_file_name>
  [-npos <number_of_positive_samples = 2000>]
  [-nneg <number_of_negative_samples = 2000>]
  [-nstages <number_of_stages = 14>]
  [-nsplits <number_of_splits = 1>]
  [-mem <memory_in_MB = 200>]
  [-sym (default)] [-nonsym]
  [-minhitrate <min_hit_rate = 0.995000>]
  [-maxfalsealarm <max_false_alarm_rate = 0.500000>]
  [-weighttrimming <weight_trimming = 0.950000>]
  [-eqw]
  [-mode <BASIC (default) | CORE | ALL>]
  [-w <sample_width = 24>]
  [-h <sample_height = 24>]
  [-bt <DAB | RAB | LB | GAB (default)>]
  [-err <misclass (default) | gini | entropy>]
  [-maxtreesplits <max_number_of_splits_in_tree_cascade = 0>]
  [-minpos <min_number_of_positive_samples_per_cluster = 500>]

Kuranov et al. [3] state that a 20x20 sample size achieved the highest hit rate.
Furthermore, they state that "For 18x18 four split nodes performed best, while for 20x20
two nodes were slightly better. The difference between weak tree classifiers with 2, 3 or 4
split nodes is smaller than their superiority with respect to stumps."
Furthermore, there was a description: "20 stages were trained. Assuming that my test
set is representative for the learning task, I can expect a false alarm rate
about 0.5^20 ≈ 9.6e-07 and a hit rate about 0.999^20 ≈ 0.98."
Therefore, using a 20x20 sample size with nsplits = 2, nstages = 20, minhitrate = 0.999
(default: 0.995), maxfalsealarm = 0.5 (default: 0.5), and weighttrimming = 0.95 (default:
0.95) would be good, such as

$ haartraining -data haarcascade -vec samples.vec -bg negatives.dat \
  -nstages 20 -nsplits 2 -minhitrate 0.999 -maxfalsealarm 0.5 -npos 7000 \
  -nneg 3019 -w 20 -h 20 -nonsym -mem 512 -mode ALL

The "-nonsym" option is used when the object class does not have vertical (left-right)
symmetry. If object class has vertical symmetry such as frontal faces, "-sym (default)"
should be used. It will speed up processing because it will use only the half (the centered
and either of the left-sided or the right-sided) haar-like features.
The "-mode ALL" uses Extended Sets of Haar-like Features [2]. Default is BASIC and it
uses only upright features, while ALL uses the full set of upright and 45 degree rotated
feature set [1].
The "-mem 512" is the available memory in MB for precalculation [1]. Default is 200MB, so
increase if more memory is available. We should not specify all system RAM because this
number is only for precalculation, not for all. The maximum possible number to be specified
would be 2GB because there is a limit of 4GB on the 32bit CPU (2^32 ≒ 4GB), and it
becomes 2GB on Windows (kernel reserves 1GB and windows does something more).
There are other options that [1] does not list, such as

[-bt <DAB | RAB | LB | GAB (default)>]
[-err <misclass (default) | gini | entropy>]
[-maxtreesplits <max_number_of_splits_in_tree_cascade = 0>]
[-minpos <min_number_of_positive_samples_per_cluster = 500>]

Please see my modified version of the haartraining document [5] for details.


#Even if you increase the number of stages, the training may finish at an intermediate
stage once it exceeds your desired minimum hit rate or false alarm rate, because more
cascading will decrease these rates for sure (0.99 up to the current stage * 0.99 for the next
= 0.9801 up to the next). Or, the training may finish because all samples were rejected. In
that case, you must increase the number of training samples.

#You can use OpenMP (multi-processing) with compilers such as the Intel C++ compiler and
MS Visual Studio 2005 Professional Edition or better. See the How to enable OpenMP section.
#One training took three days.

Generate an XML File

The haartraining utility generates an xml file when the process has completely finished (since
OpenCV beta5).

If you want to convert an intermediate haartraining output directory tree into an xml file,
there is a program at OpenCV/samples/c/convert_cascade.c (that is, in your installation
directory). Compile it.

The input format is

$ convert_cascade --size="<sample_width>x<sample_height>" <haartraining_output_dir> <output_file>

Example)

$ convert_cascade --size="20x20" haarcascade haarcascade.xml

Testing
Performance Evaluation
We can evaluate the performance of the generated classifier using the performance utility.
Here is the usage of the performance utility:

Usage: ./performance
  -data <classifier_directory_name>
  -info <collection_file_name>
  [-maxSizeDiff <max_size_difference = 1.500000>]
  [-maxPosDiff <max_position_difference = 0.300000>]
  [-sf <scale_factor = 1.200000>]
  [-ni]
  [-nos <number_of_stages = -1>]
  [-rs <roc_size = 40>]
  [-w <sample_width = 24>]
  [-h <sample_height = 24>]

Please see my modified version of the haartraining document [5] for details of the options.


I cite how the performance utility works here:

During detection, a sliding window was moved pixel by pixel over the picture at each scale. Starting
with the original scale, the features were enlarged by 10% and 20%, respectively (i.e., representing a
rescale factor of 1.1 and 1.2, respectively) until exceeding the size of the picture in at least one
dimension. Often multiple faces are detected at nearby locations and scales at an actual face location.
Therefore, multiple nearby detection results were merged. Receiver Operating Curves (ROCs) were
constructed by varying the required number of detected faces per actual face before merging into a
single detection result. During experimentation only one parameter was changed at a time. The best
mode of a parameter found in an experiment was used for the subsequent experiments. [3]
Execute the performance utility as

$ performance -data haarcascade -w 20 -h 20 -info tests.dat -ni

or

$ performance -data haarcascade.xml -info tests.dat -ni

Be careful that you have to specify the size of the training samples when you pass a classifier
directory, even though the classifier xml file includes that information inside *2.
The -ni option suppresses the creation of result image files of the detection. By default, the
performance utility creates result image files of the detection and stores them in
directories named by adding the prefix 'det-' to the test image directories. When you want to use this
function, you have to create the destination directories beforehand yourself. Execute the next
command to create the destination directories:

$ cat tests.dat | perl -pe 's!^(.*)/.*$!det-$1!g' | xargs mkdir -p

where tests.dat is the collection file for testing images which you created in the
createtestsamples.pl step. Now you can execute the performance utility without the '-ni' option.

An output of the performance utility is as follows:

+================================+======+======+======+
|            File Name           | Hits |Missed| False|
+================================+======+======+======+
|tests/01/img01.bmp/0001_0153_005|     0|     1|     0|
+--------------------------------+------+------+------+
....
+--------------------------------+------+------+------+
|                           Total|   874|   554|    72|
+================================+======+======+======+
Number of stages: 15
Number of weak classifiers: 68
Total time: 115.000000
15
  874  72  0.612045  0.050420
  874  72  0.612045  0.050420
  360   2  0.252101  0.001401
  115   0  0.080532  0.000000
   26   0  0.018207  0.000000
    8   0  0.005602  0.000000
    4   0  0.002801  0.000000
    1   0  0.000700  0.000000
....

'Hits' shows the number of correct detections. 'Missed' shows the number of missed
detections or false negatives (the object truly exists, but the detector failed to detect it).
'False' shows the number of false alarms or false positives (the object does not exist, but
the detector claimed that it does).

The latter table is for the ROC plot. Please see my modified version of the haartraining
document [5] for more.

Fun with a USB camera

Have fun with a USB camera, or with some image files, using the facedetect utility:

$ facedetect --cascade=<xml_file> [filename(image or video)|camera_index]

I modified facedetect.c slightly because the facedetect utility did not work in the same
manner as the performance utility. I added options to change parameters on the command
line. The source code is available at the Download section (or via the direct link facedetect.c).
Now the usage is as follows:

Usage: facedetect --cascade="<cascade_xml_path>" or -c <cascade_xml_path>
  [ -sf < scale_factor = 1.100000 > ]
  [ -mn < min_neighbors = 1 > ]
  [ -fl < flags = 0 > ]
  [ -ms < min_size = 0 0 > ]
  [ filename | camera_index = 0 ]

See also: cvHaarDetectObjects() about the option parameters.

FYI: The original facedetect.c used min_neighbors = 2, although performance.cpp uses
min_neighbors = 1. This affected face detection results considerably.

Experiments
PIE Experiment 1
The PIE dataset has only frontal faces with big illumination variations. [Sample images from the dataset: 1st, 10th, 21st]

- List of Commands: haarcascade_frontalface_pie1.sh
  - I used -w 18 -h 20 because the original images were not square but rectangular, with a ratio of about 18:20. I applied little distortion in this experiment.
  - The training took 3 days on an Intel Xeon 2GHz machine with 1GB memory.
- Performance evaluation with pie_tests (synthesized tests): haarcascade_frontalface_pie1.performance_pie_tests.txt

+================================+======+======+======+
|            File Name           | Hits |Missed| False|
+================================+======+======+======+
|                           Total|   847|   581|    67|
+================================+======+======+======+
Number of stages: 16
Number of weak classifiers: 113
Total time: 123.000000
16
  847  67  0.593137  0.046919
  847  67  0.593137  0.046919
  353   2  0.247199  0.001401
  110   0  0.077031  0.000000
   15   0  0.010504  0.000000
    1   0  0.000700  0.000000

- Performance evaluation with cmu_tests (natural tests): haarcascade_frontalface_pie1.performance_cmu_tests.txt

+================================+======+======+======+
|            File Name           | Hits |Missed| False|
+================================+======+======+======+
|                           Total|    20|   491|     9|
+================================+======+======+======+
Number of stages: 16
Number of weak classifiers: 113
Total time: 5.830000
16
   20   9  0.039139  0.017613
   20   9  0.039139  0.017613
    2   0  0.003914  0.000000

PIE Experiment 2
- haarcascade_frontalface_pie2.sh
  - Tried the -nonsym option
  - The training took 3 days on an Intel Xeon 3GHz machine with 1GB memory.
- haarcascade_frontalface_pie2.performance_pie_tests.txt
  | Total| 777| 651| 53|
- haarcascade_frontalface_pie2.performance_cmu_tests.txt
  | Total| 12| 499| 3|

PIE Experiment 3
- haarcascade_frontalface_pie3.sh
  - Increased the number of training samples from 7000 to 10000. Still -nonsym
  - The training took 4 days on an Intel Xeon 2GHz machine with 1GB memory.
  - This was the best among the PIE experiments
- haarcascade_frontalface_pie3.performance_pie_tests.txt
  | Total| 874| 554| 72|
- haarcascade_frontalface_pie3.performance_cmu_tests.txt
  | Total| 12| 499| 15|

PIE Experiment 4
- haarcascade_frontalface_pie4.sh
  - Tried changing -minhitrate from 0.999 to 0.995
  - The training took 5 days on an Intel Xeon 3GHz machine with 1GB memory.
- haarcascade_frontalface_pie4.performance_pie_tests.txt
  | Total| 527| 901| 27|
- haarcascade_frontalface_pie4.performance_cmu_tests.txt
  | Total| 3| 508| 1|

PIE Experiment 5
- haarcascade_frontalface_pie5.sh
  - -sym for experiment 3
  - The training took 4 days on an Intel Xeon 2GHz machine with 1GB memory.
- haarcascade_frontalface_pie5.performance_pie_tests.txt
  | Total| 737| 691| 46|
- haarcascade_frontalface_pie5.performance_cmu_tests.txt
  | Total| 10| 501| 6|

PIE Experiment 6
- haarcascade_frontalface_pie5.sh
  - -maxtreesplits 4
  - The training took 3 weeks
- haarcascade_frontalface_pie5.performance_pie_tests.txt
  | Total| 766| 662| 45|
- haarcascade_frontalface_pie5.performance_cmu_tests.txt
  | Total| 8| 503| 7|

UMIST Experiment 1
The UMIST is a multi-view face dataset. [Sample images: 0th frame, 21st frame, 33rd frame]

- haarcascade_profileface_umist1.sh
  - -w 18 -h 22
  - -sym generates images flipped left-right, thus -sym helps for profile faces too
  - took 5 days
- haarcascade_profileface_umist1.performance_umist_tests.txt
  | Total| 96| 1054| 13|
- haarcascade_profileface_umist1.performance_cmu_tests.txt
  | Total| 0| 511| 5|

UMIST Experiment 2
- haarcascade_profileface_umist2.sh
  - -maxtreesplits 4
  - training took 2 weeks
- haarcascade_profileface_umist2.performance_umist_tests.txt
  | Total| 734| 416| 276|
- haarcascade_profileface_umist2.performance_cmu_tests.txt
  | Total| 1| 510| 57|

CBCL Experiment 1
- haarcascade_frontalface_cbcl1.sh
  - -maxtreesplits 4
  - The training took 5 weeks
- haarcascade_frontalface_cbcl1.performance_cbcl_tests.txt
  | Total| 306| 694| 20|
- haarcascade_frontalface_cbcl1.performance_cmu_tests.txt
  | Total| 127| 384| 7|

haarcascade_frontalface_alt2.xml
- haarcascade_frontalface_alt2.performance_pie_tests.txt
  | Total| 820| 608| 2099|
- haarcascade_frontalface_alt2.performance_umist_tests.txt
  | Total| 263| 887| 1839|
- haarcascade_frontalface_alt2.performance_cbcl_tests.txt
  | Total| 109| 891| 1490|
- haarcascade_frontalface_alt2.performance_cmu_tests.txt
  | Total| 274| 237| 534|

Discussion
The created detectors outperformed the OpenCV default xml on the synthesized test
samples created from training samples. This shows that the training was performed
successfully. However, the detectors did not work well on general test samples. This might
mean that the detectors were over-trained or over-fitted to the specific training samples. I
still don't know good parameters or training samples to generalize detectors well.

The false alarm rates of all of my generated detectors were pretty low compared with the
OpenCV default detector. I don't know which parameters are especially different. I set the false
alarm rate to 0.5, which makes sense theoretically. I don't know.

Training illumination-varying faces in one detector gave pretty poor results. The generated
detector became sensitive to illumination rather than robust to it. This detector
does not detect non-illuminated normal frontal faces. This makes sense because not many
normal frontal faces existed in the training set. Training multi-view faces in one detector
resulted in the same thing.

We should train a different detector for each face pose or illumination state to construct a
multi-view or illumination-varied face detector, as in Fast Multi-view Face Detection. Viola
and Jones extended their work to multi-view by training 12 separate face pose
detectors. To achieve rapidness, they further constructed a pose estimator using a C4.5 decision
tree re-using the haar-like features, and then cascaded the pose estimator and face
detector (of course, this means that if pose estimation fails, the face detection also fails).
Theory behind
The advantage of haar-like features is their rapidness in the detection phase, not accuracy.
We can of course construct another face detector which achieves better accuracy using,
e.g., PCA or LDA, although it becomes slow in the detection phase. Use such features when you
do not require rapidness. PCA does not require training AdaBoost, so the training phase would
finish quickly. I am pretty sure that such face detection methods exist already,
although I did not search (I do not search because I am sure).

Download
The files are available at https://github.com/sonots/tutorial-haartraining
Directory Tree

- HaarTraining - haartraining
  - src - Source code; haartraining and my additional C++ sources are here.
  - src/Makefile - Makefile for Linux, please read the comments inside
  - bin - Binaries for Windows are ready; my perl scripts are also here. This directory would be a working directory.
  - make - Visual Studio Project Files
- data - The collected Image Datasets
- result - Generated Files (vec and xml etc.) and results

This is an svn repository, so you can download the files in one go if you have an svn client (you
should have one on cygwin or Linux). For example,

$ svn co https://github.com/sonots/tutorial-haartraining/blob/master/ tutorial-haartraining

Sorry, but downloading (checking out) the image datasets may take forever.... I created a zip file
once, but the google code repository did not allow me to upload such a big file (100MB). I
recommend you check out only the HaarTraining directory first, as

$ svn co https://github.com/sonots/tutorial-haartraining/blob/master/HaarTraining/ HaarTraining

Here is the list of my additional utilities (I put them in the HaarTraining/src and HaarTraining/bin
directories):

- mergevec.cpp † - Further Details (How to Use, Compile)
- vec2img.cpp - Further Details (How to Use, Compile)
- createtrainsamples.pl †
- createtestsamples.pl †

The following additional utilities can be obtained from OpenCV/samples/c in your OpenCV
install directory (I also put them in the HaarTraining/src directory).

- convert_cascade.c
- facedetect.c - This is my slightly modified version

How to enable OpenMP

I bundled Windows binaries in the Download section, but I did not enable OpenMP (multi-
processing) support. Therefore, I describe here how to compile the haartraining utility with
OpenMP using Visual Studio 2005 Professional Edition, based on my distribution files
(the procedure should be the same for the originals too, but I did not verify it).

The solution file is in HaarTraining\make\haartraining.sln. Open it.

Right click the cvhaartraining project > Properties.
(Reference: http://www.codeproject.com/KB/cpp/BeginOpenMP.aspx)
Follow Configuration Properties > C/C++ > Language and change 'OpenMP Support' to 'Yes
(/openmp)'. If you cannot see this option, your environment probably does not support OpenMP.

Build cvhaartraining only (Right click the project > Project Only > Rebuild only
cvhaartraining) and do the same procedure (enable OpenMP) for the haartraining project. Now
haartraining.exe should work with OpenMP.

You may use Process Explorer to verify whether it is utilizing OpenMP or not.
Run Process Explorer > View > Show Lower Pane (Ctrl+L) > choose the 'haartraining.exe'
process and see the Lower Pane. If you can see two threads instead of one, it is utilizing
OpenMP.

References

- [1] HaarTraining doc. This document can be obtained from OpenCV/apps/HaarTraining/doc in your OpenCV install directory.
- [2] Rainer Lienhart and Jochen Maydt. An Extended Set of Haar-like Features for Rapid Object Detection. IEEE ICIP 2002, Vol. 1, pp. 900-903, Sep. 2002. http://www.lienhart.de/ICIP2002.pdf
- [3] Alexander Kuranov, Rainer Lienhart, and Vadim Pisarevsky. An Empirical Analysis of Boosting Algorithms for Rapid Objects With an Extended Set of Haar-like Features. Intel Technical Report MRL-TR-July02-01, 2002. http://www.lienhart.de/Publications/DAGM2003.pdf
- [4] Paul Viola and Michael J. Jones. Rapid Object Detection using a Boosted Cascade of Simple Features. IEEE CVPR, 2001. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.3.7597
- [5] Modified HaarTraining doc. This is my modified version of [1].

https://codeyarns.com/2014/09/01/how-to-train-opencv-cascade-classifier/

How to train OpenCV cascade classifier


OpenCV ships with an application that can be used to train a cascade
classifier. The steps to prepare your data and train the classifier can be
quite elaborate. I have detailed below the steps that I used to train the
classifier to identify an object (say, a car):

- Two OpenCV programs, opencv_createsamples and opencv_traincascade, will be used for this process. They should be installed along with your OpenCV. You can also compile them while compiling OpenCV from source. Make sure to enable the option BUILD_APPS in the CMake GUI. This is the option that builds these applications.
- Prepare images that have one or more instances of the object of interest. These will be used to generate positive samples later. Also, prepare images that do not have the object of interest. These will be used to generate negative samples later.
- It is mandatory to use opencv_createsamples to generate the positive samples for opencv_traincascade. The output of this program is a .vec file that is used as input to opencv_traincascade.
- If you have image patches of the object and need to generate positive sample images by applying transformations (rotations) on them in 3 dimensions and then superimposing them on a background, you can do that using opencv_createsamples. See its documentation for details.
- In my case, I already had one or more instances of the object captured with its actual background. So, what I instead had to do was to indicate the patch rectangles in each image where my object was located. You can do this by hand, or by writing a simple OpenCV program that displays each image to you, lets you mark out the rectangles on the objects and stores these values in a text file. The format of this text file required by opencv_createsamples can be seen in its documentation.
- To create the positive samples file, I invoked opencv_createsamples as:

$ opencv_createsamples -info obj-rects.txt -w 50 -h 50 -vec pos-samples.vec

Here, obj-rects.txt is a text file that has the information of the rectangles where the object is located in each image. See the step above for details. The output of this program is stored in pos-samples.vec.

- To view the positive samples that have been created by this program:

$ opencv_createsamples -vec pos-samples.vec -w 50 -h 50

Note that it switches to viewing mode when you provide only these three parameters, and their values should match what you provided to create the positive samples.

- Now we can train the cascade classifier. The details of its parameters can be seen in its documentation. I used this invocation:

$ opencv_traincascade -data obj-classifier -vec pos-samples.vec -bg neg-filepaths.txt -precalcValBufSize 2048 -precalcIdxBufSize 2048 -numPos 200 -numNeg 2000 -numStages 20 -minhitrate 0.999 -maxfalsealarm 0.5 -w 50 -h 50 -nonsym -baseFormatSave

obj-classifier is a directory where we are asking the classifier files to be stored. Note that this directory should already have been created by you. pos-samples.vec is the file we generated in the step above. neg-filepaths.txt is a file with a list of paths to the negative sample files. 2048 is the amount of memory in MB we are requesting the program to use; the more memory, the faster the training. 200 is the number of positive samples in pos-samples.vec. This number is also reported by opencv_createsamples when it finishes its execution. 2000 is the number of negative sample image paths we have specified in neg-filepaths.txt. 20 is the number of stages we wish the classifier to have. 50x50 is the size of the object in these images. This should be the same as what was specified with opencv_createsamples. 0.999 and 0.5 are self-explanatory. Details on all these parameters can be found in the documentation.
Related: See my tips and other posts on using OpenCV Cascade Classifier.
Tried with: OpenCV 2.4.9 and Ubuntu 14.04
