
CS Club Ekaterinburg on 17-18 February 2011

What Computer Vision with OpenCV Can and Cannot Do

Denis S. Perevalov
Ural Federal University / Institute of Mathematics and Mechanics UB RAS
1. What is computer vision
2. Cameras for computer vision
3. Introduction to OpenCV
4. OpenCV integration into multimedia projects
5. Possibilities and limitations of simple computer vision tasks
6. Possibilities and limitations of complex computer vision tasks
7. New applications of computer vision

- The lecture intent

- The lecture is for…

Back to Contents
The lecture intent
This lecture is about:
- Computer vision,
- The OpenCV library,
- The possibilities and limitations of computer vision that arise when solving applied problems of image analysis.

The lecture intent
We will be interested in:

1) Algorithms that solve image analysis problems in (almost) real time, that is, where processing one frame takes no more than 1-10 sec.

2) Concerns and observations about the applicability of such algorithms.

We will not be interested in:

1) Accelerating algorithms using the GPU.

2) Neural networks and artificial intelligence.

The lecture is for…

- Those interested in computer vision who want to learn more about its current capabilities and new areas of application.

- Those who have no experience with OpenCV but wish to get it as soon as possible.

- Those seriously engaged in computer vision who want to learn more about the bottlenecks and problems that may occur when using the best (to date) computer vision algorithms.
1. What is
computer vision
- Definition
- First characteristic of computer vision tasks
- Second characteristic of computer vision tasks
- Examples of computer vision tasks
- An example of NOT computer vision task

Back to Contents
(From Wikipedia)

Computer vision is the theory and technology of creating machines that can see.

As a scientific discipline, computer vision relates to the theory and technology of creating artificial systems that obtain information from images. ...

As a technological discipline, computer vision seeks to apply the theories and models of computer vision to create computer vision systems. ...

Computer vision can also be described as a complement (but not necessarily the opposite) of biological vision.

Biology studies the visual perception of humans and various animals, resulting in models of such systems in terms of physiological processes. Computer vision, on the other hand, studies and describes artificial vision systems implemented in hardware or software. Interdisciplinary exchange between biological and computer vision has been very productive for both fields.
Topics in computer vision include:

- Action recognition,

- Event detection,

- Tracking,

- Pattern recognition,

- Image restoration.
First characteristic of computer vision tasks

The input data is a two-dimensional array of data,

- i.e., an "image".

The data can also be:
- Video, that is, a sequence of images,
- 3D data: point clouds from 3D scanners or other devices.
Sample images

Ordinary light, radio waves, ultrasound: all of these are sources of images:

1. Color images of the visible spectrum

2. Infrared images
3. Ultrasound images
4. Radar images
5. Images with depth data
(Sample images were shown for each type: a visible-spectrum color photo, an infrared image, a side-scan sonar image, a radar snapshot, and a depth image with video.)
First characteristic of computer vision tasks

The input data is a two-dimensional array of data, i.e., an "image".

But two-dimensional data arrays are used not only in computer vision; other disciplines also deal with 2D data.
Second characteristic of computer vision tasks

The goal of processing is the extraction and use of color and geometric structures in the image.
Disciplines dealing with 2D images

1. Signal and image processing
Low-level data processing, usually without detailed analysis of the image content.
Objectives: restoration, noise removal, data compression, improving characteristics (sharpness, contrast, ...).

2. Computer vision
Mid-level data analysis: separating objects in the image and measuring their parameters.

3. Pattern recognition
High-level data analysis: determining the type of an object. The input data usually must be presented as a set of features. These features are often computed using 1 and 2.
Examples of computer vision tasks

- Segmentation: partitioning an image into regions that are "homogeneous" in some sense.

- Detection of interesting objects in the image, and calculation of their size and other characteristics.

- Tracking: following objects of interest across a sequence of frames.

- Virtual reality gloves: recognition of the colors and patterns on the hands (an MIT project, prototype).

- Marker search (used for marker-based augmented reality).

An example of a NOT computer vision task
Finding a path in a maze.
(Although the input data is an image, the task is not to find objects in it but to solve a combinatorial search problem.)
2. Cameras
for computer vision
- Key features
- Examples of good cameras

Back to Contents
Key features

Different real-time processing tasks require different cameras.

Their main features are:

1. Resolution

2. The number of frames per second

3. Type of data obtained

4. Way to transfer data into the computer

Resolution

This is the image size in pixels obtained from the camera.

320 x 240: accuracy when observing a 1 m object: 3.13 mm; size of 30 frames: 6.6 MB
640 x 480: accuracy when observing a 1 m object: 1.56 mm; size of 30 frames: 26.4 MB
1280 x 1024: accuracy when observing a 1 m object: 0.97 mm; size of 30 frames: 112.5 MB
The number of frames per second
This is the number of images obtained from the camera per second.

30 fps: time between frames 33 ms
60 fps: time between frames 16 ms
150 fps: time between frames 6 ms; can be used, for example, for musical applications
Type of data obtained
What data we get from the camera for processing.

- Color or grayscale image of the visible spectrum
- Infrared image (using invisible infrared illumination, such a camera can see in a dark room)
- Color image + depth (information about the distance to objects)
Way to transfer data into the computer

- Analog cameras
- Webcams (USB cameras)
- FireWire cameras (IEEE-1394)
- Network cameras (IP cameras)
- Smart cameras

Analog cameras

Historically the first to appear; the signal is transmitted as an analog TV-format signal.

(+) Transmit data over long distances (100 m), albeit with interference
(+) Easy to install, small size

(-) Feeding the signal into a computer requires a special capture card or TV tuner, which usually consumes a lot of computing resources.
(-) "Interlacing" makes the image very difficult to analyze if there is motion (actually 2 half-frames arrive, each 50 times/sec).
Webcams (USB-camera)

Appeared around 2000; they transmit data via the USB protocol, uncompressed or compressed as JPEG.

(+) Easy to connect to a computer, simple software support

(+) Cheap, widely available

(-) Overhead: decoding JPEG requires computing resources.

(-) The cheapest models usually have bad optics and sensors (noisy images)
(-) Because of USB bandwidth limits, no more than 2 cameras can be connected to a single USB hub, though a PC usually has 2-3 USB hubs.
Firewire-camera (IEEE-1394)

Cameras that transmit the signal over the FireWire protocol, usually in a dust- and moisture-proof case; typically these are cameras for industrial applications.

(+) Transfer of uncompressed video in excellent quality at high speed

(+) You can connect multiple cameras
(+) Tend to have excellent optics

(-) High price

(-) Require external power, which sometimes makes connecting them to laptops difficult
Network (IP-camera)

Cameras that transmit data over a network (wired or wireless) channel. They are now rapidly gaining popularity in all areas.

(+) Easy connection to a PC

(+) Ease of installation
(+) The possibility of transferring data over an unlimited distance, which allows you to construct a network of cameras covering a building or area, attached to an airship, etc.
(+) Control: you can rotate the camera and adjust the zoom

(-) May have latency problems

(-) Still a relatively high price
(-) Not yet portable (2011)
"Smart" cameras (Smart cameras)

Cameras with a computer located in the camera body.
These cameras are fully functional vision systems, transmitting detected objects etc. as output over various protocols.

(+) Compact.
(+) Scalability: it is easy to build a network of such cameras.

(-) Often require adapting existing projects.

(-) Affordable models are rather slow, so they do a good job only with relatively simple image analysis tasks.
Separate type: Infrared Camera

Constructed from ordinary cameras by adding an infrared filter and, often, an infrared illuminator.

(+) IR rays are almost invisible to humans (in the dark a faint red glow can be seen), so they are often used to simplify the analysis of objects in the field of view.

(-) Specialized infrared cameras suitable for machine vision are not a mass product, so they usually need to be ordered.
Examples of good cameras
Sony PS3 Eye

320 x 240: 150 FPS

640 x 480: 60 FPS

Data Types:
visible light
IR (requires removing the IR filter)

Price: $ 50.

Examples of good cameras
Point Grey Flea3
648 x 488: 120 FPS

Data Type:
- Visible light,
- IR (?)

Price: $ 600.

Model FL3-FW-03S1C-C
IEEE 1394b, CCD
Examples of good cameras
Microsoft Kinect
640 x 480: 30 FPS

Data Type:
visible light + depth

Price: $ 150.

(Depth is computed by stereo vision using a laser infrared illuminator, which is why it does not work in sunlight)
Examples of good cameras
Point Grey BumbleBee2
640 x 480: 48 FPS

Data Type:
visible light + depth

Price: $ 2000.

(Depth is computed by stereo vision with two cameras)

IEEE 1394b, CCD
3. Introduction to OpenCV

- What is OpenCV
- The first project on OpenCV
- Mat class
- Image processing functions

Back to Contents
What is OpenCV

"Open Computer Vision Library"

An open-source library with a set of functions for image processing, analysis, and recognition, written in C/C++.
What is OpenCV

2000 - first alpha version, support from Intel, C interface

2006 - version 1.0

2008 - support from Willow Garage (a robotics lab)

2009 - version 2.0, C++ classes

2010 - version 2.2, GPU support implemented

The first project on OpenCV
1. Creating a Project
We assume that Microsoft Visual C++ 2008 Express Edition
and OpenCV 2.1 are already installed.

1. Run VS2008

2. Create a console project

File - New - Project - Win32 Console Application,
enter Project1 in the Name field, click OK.

3. Set up the paths

Alt+F7 opens the project properties.
Configuration Properties - C/C++ - General - Additional Include Directories:
set the value "C:\Program Files\OpenCV2.1\include\opencv";

Linker - General - Additional Library Directories: set the value

C:\Program Files\OpenCV2.1\lib\

Linker - Input - Additional Dependencies:

cv210.lib cvaux210.lib cxcore210.lib cxts210.lib highgui210.lib for Release,
cv210d.lib cvaux210d.lib cxcore210d.lib cxts210.lib highgui210d.lib for Debug
The first project on OpenCV
2. Reading an image and displaying it on screen
1. Prepare the input data:
place an image at C:\green_apple.jpg

2. Write in Project1.cpp:

#include "stdafx.h"
#include "cv.h"
#include "highgui.h"
using namespace cv;

int main(int argc, const char** argv)
{
    Mat image = imread("C:\\green_apple.jpg"); // load image from disk
    imshow("image", image);                    // show image
    waitKey(0);                                // wait for a keystroke
    return 0;
}

3. Press F7 to compile, F5 to run.

The program will show the image in a window and will exit when any key is pressed.
The first project on OpenCV
3. Linear operations on images

Replace the text of main from the previous example with:

int main(int argc, const char** argv)
{
    Mat image = imread("C:\\green_apple.jpg");

    // image1 is pixel-by-pixel equal to 0.3 * image
    Mat image1 = 0.3 * image;
    imshow("image", image);
    imshow("image1", image1);
    waitKey(0);
    return 0;
}
The first project on OpenCV
4. Working with rectangular subimages

Replace the text of main from the previous example with:

int main(int argc, const char** argv)
{
    Mat image = imread("C:\\green_apple.jpg");

    // Cut out a part of the picture
    Rect rect = Rect(100, 100, 200, 200); // rectangle
    Mat image3;
    image(rect).copyTo(image3);           // copy that part of the image
    imshow("image3", image3);

    // Change the part of the picture inside the picture
    image(rect) *= 2;
    imshow("image changed", image);

    waitKey(0);
    return 0;
}
Mat class
Mat is the basic OpenCV class for storing images.
Mat class
Single- and multi-channel images

An image is a matrix of pixels.

Each pixel can store some data.
If a pixel stores a vector of data, the dimension of this vector is the number of image channels.

1-channel images are also called grayscale.

3-channel images typically consist of three components (Red, Green, Blue).

OpenCV can also work with 2- and 4-channel images.

Mat class
Creating images
1) An empty image without a specified size or type:

Mat imageEmpty;

2) An image of w x h pixels with values 0..255

(8U means "unsigned 8 bit", C1 means "one channel"):

int w = 150; int h = 100;

Mat imageGray(cv::Size(w, h), CV_8UC1);
Mat class
Creating images

3) A 1-channel image with floating-point values

(32F means "float 32 bit"):

Mat imageFloat(cv::Size(w, h), CV_32FC1);

4) A 3-channel image with values 0..255 in each channel:

Mat imageRGB(cv::Size(w, h), CV_8UC3);

Mat class
Memory management
1. Memory for an image is allocated and freed automatically

That is, OpenCV itself creates an image of the required size and type if the image is an output parameter of a function:

Mat imageFloat;
imageGray.convertTo(imageFloat, CV_32FC1, 1.0 / 255.0);

- here OpenCV allocates imageFloat itself.

Importantly, if your image already has the right size and type, no memory allocation is performed.

2. The assignment operator does not copy the data (as std::vector does), nor does it simply copy pointers; it uses a reference-counting mechanism.
Mat class
Memory management
The reference-counting mechanism (compare shared_ptr in C++; in Java all object references work this way) works like this:

{
    Mat A(cv::Size(100, 100), CV_8UC1);
    // Memory for the image is allocated, and it is recorded
    // that this memory is used by a single image.
    {
        Mat B = A;
        // No memory is allocated here; B simply points
        // to the same area in memory.
        // Therefore, if we change B, A changes too.
        // The reference count of the image increased to 2.
    }
    // Here B went out of scope, and the reference count
    // decreased back to 1.
}
// Here A went out of scope, the reference count became 0,
// and the memory allocated for it is automatically freed.
Mat class
Memory management

Since the operation

Mat B = A;

does not copy image A into B, then to create a copy of an image for subsequent independent use you must use the explicit commands copyTo and clone:

image1.copyTo(image2);
image3 = image1.clone();
Mat class
Memory management

1) The assignment Mat B = A; is very fast: it does not copy the data, but only adjusts pointers in a special way. This allows you to pass Mat to functions directly, without pointers or references, and it will not cause unwanted copying of the Mat onto the stack (as would happen with std::vector).

Although, of course, const Mat& will be passed faster still.

2) To copy images, use the explicit commands copyTo and clone.

Mat class
Per-pixel access to images

OpenCV has several ways of accessing image pixels. They vary in safety (type checking and bounds checking), speed, and convenience.

Wherever possible, you should try to avoid direct pixel access and instead use OpenCV functions, since they usually work faster and make the code more understandable.

One way to access the pixels of an image of known type is the at member function. For a single-channel image with values 0...255:

// get a value
int value = image.at<uchar>(y, x);

// set a value
image.at<uchar>(y, x) = 100;

Note that y comes before x in the call.

Mat class
Type conversion

When displaying floating-point images on screen using OpenCV, bear in mind that they are displayed on the assumption that their values lie in [0,1]. Therefore, when converting an 8-bit image to a float image, apply a scaling factor of 1.0 / 255.0.

To convert images between bit depths (float and unsigned char), use the convertTo member function.
Its second argument is the type of the output image.

imageGray.convertTo(imageFloat, CV_32FC1, 1.0 / 255.0);

The number of channels of the input and output must match!

Mat class
Type conversion
To convert between color spaces, use the function cvtColor. If necessary, it can change the number of channels.

For example, converting a 3-channel RGB image to grayscale:

cvtColor(inputRGB, outputGray, CV_BGR2GRAY);

And back:
cvtColor(inputGray, outputRGB, CV_GRAY2BGR);
Mat class
Splitting into channels
The split function divides a multi-channel image into channels.
The merge function stitches single-channel images together into a multi-channel one.

void split(const Mat& mtx,        // source color image
           vector<Mat>& mv);      // resulting set of 1-channel images

void merge(const vector<Mat>& mv, // source set of 1-channel images
           Mat& dst);             // resulting color image

Most often they are used to process each color channel separately, as well as for various manipulations of the channels.
Mat class
Splitting into channels
int main(int argc, const char** argv)
{
    Mat image = imread("C:\\green_apple.jpg");

    // Split the source image into three channels:
    // channels[0], channels[1], channels[2]
    vector<Mat> channels;
    split(image, channels);

    // Show the channels in separate windows.
    // Note that the red channel is number 2, not 0.
    imshow("Red", channels[2]);
    imshow("Green", channels[1]);
    imshow("Blue", channels[0]);
    waitKey(0);
    return 0;
}
Image processing functions

(Figure: the original image and the image smoothed with an 11 x 11 box filter.)

The GaussianBlur function smooths an image with a Gaussian filter.

Most often, smoothing with a small filter is applied to remove fine noise from an image before subsequent analysis.
Image processing functions

The threshold function performs threshold processing of an image.

Most often it is used to select the pixels of objects of interest in the image.
Image processing functions
Fill areas

The floodFill function fills a region starting from pixel (x, y), with specified cut-off thresholds, using 4- or 8-adjacency of pixels.

Important: it modifies the original image (it performs the filling in place).

Most often it is used to select the regions found by threshold processing, for subsequent analysis.
Image processing functions
Contour extraction

The contour of an object is the line representing the edge of the object's shape.
Edge points are extracted with the Sobel operator; edge curves with the Canny detector.

Contours are used for:
1. Recognition. From the contour one can often determine the type of the observed object.

2. Measurement. From the contour one can accurately estimate the object's size, rotation, and location.
4. OpenCV integration into
multimedia projects

- Low-level Libraries
- Middle-level Platforms
- High-level Environments

Back to Contents
Low-level Libraries

- OpenCV (Open Computer Vision Library): image processing, analysis, and recognition
- OpenCL (Open Computing Language): parallelization and acceleration of calculations, in particular by means of the GPU
- OpenGL (Open Graphics Library): high-speed graphics
- OpenAL (Open Audio Library): audio
- Box2D: 2D physics engine
- Bullet: 3D physics engine
- Web servers
(Video 1, Video 2, and so on ...)
Middle-level Platforms
These are platforms for "creative coding"; each includes a large set of functions and libraries integrated for convenient use.

- Processing (language: Java). For computer vision Java is slow.
- openFrameworks (language: C/C++).
- Cinder (language: C/C++). Recently appeared, gaining popularity.

(Video 1, Video 2, Video 3)

High-level Environments
"Visual programming" environments, which allow implementing projects without actual programming. Importantly, they can be extended with plugins made with low-level libraries.

- Max/MSP: focused on audio
- VVVV: focused on visual effects
- Unity3D: focused on high-quality 3D
5. Possibilities and limitations
of simple computer vision tasks
- The principal possibilities
- Source of problems
- "The Problem of Boundaries"
- "The Problem of Texture"
- Segmentation
- Optical flow
- Applications of Optical Flow
- Methods for calculating the optical flow
- Optical flow problems

Back to Contents
The principal possibilities

In principle, computer vision can be used to measure any parameters of physical processes, provided they are expressed in mechanical motion or in changes of shape or color.
Source of problems
The main source of algorithmic problems in image analysis is that there is no simple relationship between

pixel values and objects in the scene.

In simple cases where such a relationship exists, computer vision algorithms work very well :)
"The problem of boundaries"
Similar colors of adjacent pixels do not mean that they belong to the same object.

Similarly, a strong difference between the colors of adjacent pixels does not mean that the pixels belong to different objects.
"The problem of boundaries"

How to separate the shadows from the trees?
"The problem of boundaries"
To overcome this problem, we need to build algorithms that obtain and use contextual information about the location of objects in the scene.
"The problem of boundaries"

How to find the fish?
"The problem of texture"
Objects carry textures that prevent treating them as uniformly colored, yet these textures are difficult to model and describe.
"The problem of texture"

How to differentiate the zebras?
"The problem of boundaries" and "the problem of texture" manifest themselves most clearly in the problem of segmentation.

Segmentation: partitioning an image into regions that are "homogeneous" in some sense.

The purpose of segmentation is to build a "simple" description of the original image, which can then be used for further analysis of the image.
There are many methods of segmentation, based on a variety of ideas:

- "growing regions" on the basis of brightness,

- the "snake" method: iterative movement of curves,

- construction of region boundaries,

- search for a partition into regions minimizing an "entropy",

- use of a priori information about the shape of regions,

- use of multiple scales to refine the boundaries.

One of the best algorithms is the GrabCut method.
(Though it works quite slowly.)

The method is based on the approximate construction of a minimum cut of a special graph built on the pixels.

We construct a weighted graph G = {V, E}:
image pixels correspond to the vertices V,
and the geometric, brightness, and texture proximity of two pixels i, j corresponds to the weight S_ij of the edge between them
(S from "similarity": the proximity of the pixels).

Then the problem of segmentation into two regions can be formulated as the problem of partitioning the set of vertices into two parts. Such a partition is called a cut.

The sum of the weights of the edges crossing the cut is the value of the cut.

The minimum cut (i.e., the cut with the minimum value) is declared the solution of the segmentation problem.
By complementing the minimum-cut method with a manual initial marking of regions, one can obtain very good results (although not fully automatic):

James Malcolm, Yogesh Rathi, Allen Tannenbaum, "A Graph Cut Approach to Image Segmentation in Tensor Space".

Adding the idea of multiresolution analysis (a fully automatic result):

Eitan Sharon, "Segmentation by Weighted Aggregation", CVPR'04.
Optical Flow
Optical flow is the vector field of the apparent motion of objects, surfaces, and edges in a visual scene, caused by relative motion between the observer and the scene.
Optical Flow
Usually one considers the optical flow arising between two frames of a video.

For each pixel (x, y), the optical flow is a vector (f(x, y), g(x, y)) characterizing the shift:
Optical Flow
Application of optical flow

1. To determine the direction in which objects move in the frame.

2. For segmentation of moving regions in the frame, for further analysis.

3. To reconstruct the shape of a three-dimensional object near which the camera moves.

4. As an auxiliary method for increasing the stability of object detection algorithms when the objects are not detected in every frame.
For example, for the problems of finding faces, markers, etc.
Application of optical flow
Enhancing the stability of face detection.

(Video: the green circle is the result of the detector alone; the purple rectangles combine the results.)
Methods of calculating the optical flow

(I) Block methods ("naive" methods)

For each point, we search for the shift that minimizes the difference over a local window.

(II) Differential methods (the most used)

Based on estimating the derivatives with respect to x, y, t.
1. Lucas-Kanade: very fast.
2. Farneback: good enough, but slow.
3. CLG: high quality, but not yet implemented in OpenCV.
4. Pyramidal Lucas-Kanade, calculated only at "points of interest".
5. Horn-Schunck: not very robust to noise.

(III) Methods based on discrete optimization (computationally intensive)

The solution is constructed using min-cut/max-flow methods, linear programming, and belief propagation.
Methods of calculating the optical flow
Today OpenCV implements several algorithms; the best of them is Farneback's.
(Gunnar Farnebäck, "Two-Frame Motion Estimation Based on Polynomial Expansion", Proceedings of the 13th Scandinavian Conference on Image Analysis, pages 363-370, June-July 2003.)
Methods of calculating the optical flow
The idea is to approximate the brightness of the pixels in the neighborhood of each pixel by a quadratic polynomial in both frames.
From the coefficients of the polynomials we can compute the shift, which is declared the value of the optical flow at the given pixel.
Optical flow problems

The optical flow and the actual motion field may not coincide, and may even be perpendicular to each other:
Optical flow problems
The aperture problem: an ambiguity in determining motion caused by observing the motion only locally (without analyzing the edges of the object).

It is particularly acute
- in weakly textured scenes,
- in scenes with combinatorial patterns such as stripes and chessboards.

Homework 1 of 2
To get credit for these lectures "automatically":

Build a picture with objects consisting of black stripes and checkered patterns on a white background, then a second picture where these objects are shifted by a distance greater than the stripe width. Then calculate and display the resulting optical flow.

Send the results to

6. Possibilities and limitations
of complex problems of
Computer Vision
- Face detection – Viola-Jones algorithm
- Pedestrian detection – HOG algorithm

Back to Contents
Face detection – Viola-Jones algorithm

The Viola-Jones algorithm is now the baseline method for detecting frontal faces in a frame.
Face detection – Viola-Jones algorithm
The algorithm uses a set of "Haar-like features".
Face detection – Viola-Jones algorithm

At the training stage, boosting is used to construct a set of classifiers from the redundant feature set.
Face detection – Viola-Jones algorithm

- Works well for frontal faces.

- Does not work for faces in profile, because of hairstyles.
- For the whole person or the upper body, I was not able to achieve recognition.

This is apparently because the algorithm can be trained well to recognize internal contours, which are virtually immutable, but does not work very well with changing external contours.
Face detection – Viola-Jones algorithm

Application to search for objects in a fenced grass.

Face detection – Viola-Jones algorithm
Application to search for objects in a fenced grass.

• miss rate: 0.158 (158 among 1000 positive examples);

• false alarm rate: 0.049 (40 among 1000 positive examples and 58 among 1000 negative examples);
• thus, the correct detection rate was 84.2%, and the false alarm rate 4.9%.
Pedestrian detection - HOG algorithm

HOG = Histogram of Oriented Gradients.

Pedestrian detection - HOG algorithm
How it works: the image is divided into cells, in which gradient directions are computed. These directions are accumulated into histograms. The resulting vector is used for pattern recognition (by an SVM).
Pedestrian detection - HOG algorithm

The algorithm can reliably detect cars, motorcycles, and bicycles.

Judging by the demo video, the algorithm does not work well with people in skirts; or perhaps it was trained to recognize people from a different viewpoint.
7. New applications of
computer vision
- Interactive multimedia systems
- 3D-illusion
- Projection mapping
- Dynamic projection mapping

Back to Contents
Interactive multimedia system

Funky Forest (T. Watson)

Interactive multimedia system

Body paint

Interactive multimedia system
Projection onto the hands
of spectators

Yoko Ishii and Hiroshi Homura, It's fire, you can touch it, 2007.
Interactive multimedia system
Floor games

Video: an outdoor ping-pong championship,
held at the conference "Modern Problems of Mathematics" (42nd National Youth School-Conference),
"Chrustalnaya" hotel, 2 February 2011

3D-illusion
This is the illusion of perceiving a volumetric body on a (flat) surface, achieved by accurately simulating the geometry, light, and shade of the body as seen from the point where the viewer stands.
Creating the illusion of 3D by tracking the head and eyes is now being embedded in portable gaming consoles equipped with a camera.

Projection mapping
Mapping (projection mapping) is video projection not onto special screens but onto other objects, in order to "animate" them.

Projection mapping
Today, mapping onto buildings (architectural projection mapping) is popular.

Dynamic projection mapping

The idea: use the techniques and tracking methods of markerless AR to synchronize a moving object and the image from the projector.
Dynamic projection mapping

A more radical idea: track the movement of multiple objects (bouncing balls, falling sheets of paper) and project onto them.
Homework 2 of 2
To get credit for these lectures "automatically":
Implement tracking (position detection) of a falling ball that then bounces and jumps.

Shoot a video showing the ball, with the ball's position found by the computer overlaid on top. Put the video on YouTube and send the link to
Outside view and physical graphics:

- Water musical instrument (sounds start waves)
- Aleatoric water musical instrument

- Literature
- Friendly lectures and seminars
- Partners
- Contacts

Back to Contents
Computer vision
1. R. Gonzalez, R. Woods, Digital Image Processing.
2. L. Shapiro, G. Stockman, Computer Vision.

OpenCV
1. OpenCV C++ documentation.

2. G. Bradski, A. Kaehler, Learning OpenCV: Computer Vision with the OpenCV Library
- unfortunately, it covers the C version of OpenCV, not C++.

3. My lectures on OpenCV for C++ (Fall 2010).
Friendly lectures and seminars

A course on openFrameworks with elements of OpenCV,

Mathematics and Mechanics Faculty of USU, spring semester 2011.

The course program and lessons will be published on the site.
Friendly lectures and seminars
Ural Federal University - College of Arts and Culture
Ekaterinburg branch of the National Centre for Contemporary Art (NCCA)

The program "Art, Science, Technology"
Friendly lectures and seminars
The program "Art, Science, Technology"
March 2, 18:30
The body as interface: Wearable Computing

You are invited, with a 10-15 minute presentation or without one.

Please indicate your intention and the topic of your talk to Ksenia Fedorova.
The seminar will be held at the USU Library (Lenin St. 51, 4th floor, room 413a).
Details -
Friendly lectures and seminars
The program "Art, Science, Technology"
April 21-22
2nd International Workshop
"Theory and Practice of Media Art"

Chair of the organizing committee: Ksenia Fedorova, curator of the Ekaterinburg branch of NCCA.
Information about the workshop will be published on usu.ru and other sites.
Partners (Interactive Systems)

LLC "Business Frame" (CV consulting)

LLC "5th Dimension" (interactive systems)

(A network of information-business portals)

Animation studio "Mult-On"

Animation studio "Animatech"


Denis Perevalov

This lecture is published at (Russian)