
openFrameworks lectures

Introduction to Computer Vision. Grabbing and processing camera images

Denis Perevalov perevalovds@gmail.com

See in-depth details in my book Mastering openFrameworks. The book's examples are free, see masteringof.wordpress.com

What is computer vision

Definition
(From Wikipedia)

Computer vision is the theory and technology of creating machines that can see.

http://the-gadgeteer.com/wp-content/uploads/2009/12/mr-robot-head-game.jpg

Definition
Topics of computer vision include:
- Recognition of actions
- Detection of events
- Tracking
- Pattern recognition
- Restoration of images

Image examples
Ordinary light, radio waves, ultrasound: they are all sources of images:

1. Color images of the visible spectrum
2. Infrared images
3. Ultrasound images
4. Radar images
5. Depth images

Image examples
1. Color images of the visible spectrum

http://rkc.oblcit.ru/system/files/images/%D0%9F%D1%80%D0%B8%D1%80%D0%BE%D0%B4%D0%B013.preview.jpg http://imaging.geocomm.com/gallery/san_francisco_IOD032102.jpg

Image examples
2. Infrared images

http://lh6.ggpht.com/_Wy2U3qKMO8k/SSyB6BTdg8I/AAAAAAAACd8/Iai_3QZIjrI/Australia+5+dollars+B+se.jpg http://i367.photobucket.com/albums/oo117/syquest/acrylic_no_filter.jpg

Image examples
3. Ultrasound images. An image from a side-scan sonar:

http://ess.ru/publications/2_2003/sedov/ris6.jpg

Image examples
4. Radar images. A radar snapshot:

http://cdn.wn.com/pd/b1/3a/abd9ebc81d9a3be0ba7c4a3dfc28_grande.jpg

Image examples
5. Depth images

http://opencv.willowgarage.com/documentation/c/_images/disparity.png

Video

http://www.youtube.com/watch?v=pk_cQVjqFZ4

First sign of computer vision tasks


The input data is a two-dimensional array of data, i.e. an "image".

But two-dimensional arrays of data are used not only in computer vision:

Second sign of computer vision tasks


The goal of processing is the extraction and use of color and geometric structures in the image.

http://www.tyvek.ru/construction/images/structure.jpg

Disciplines dealing with 2D images


1. Signal and image processing
Low-level data processing, usually without a detailed study of the image content. Goals: restoration, noise removal, data compression, improving characteristics (sharpness, contrast, ...).
2. Computer vision
Middle-level data analysis: separating objects in the image and measuring their parameters.
3. Pattern recognition
High-level data analysis: determining the type of an object. The input data usually must be presented as a set of features. Often the features are computed using 1 and 2.

Camera for computer vision


- Key features
- Examples of good cameras

Key Features
For different real-time processing tasks, different cameras are needed. Their main features are:
1. Resolution
2. Number of frames per second
3. Type of data obtained
4. Way of transferring data to the computer

Resolution
This is the size, in pixels, of the image obtained from the camera.

320 x 240: accuracy when observing a 1 m object: 3.13 mm; size of 30 frames: 6.6 MB

640 x 480: accuracy when observing a 1 m object: 1.56 mm; size of 30 frames: 26.4 MB

1280 x 1024: accuracy when observing a 1 m object: 0.97 mm; size of 30 frames: 112.5 MB
http://www.mtlru.com/images/klubnik1.jpg
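These figures follow from simple arithmetic. A minimal sketch (not from the original slides) that reproduces them, assuming 3 bytes per pixel, 30 uncompressed frames, and 1 MB = 1024 * 1024 bytes:

#include <cstdio>

// Prints the accuracy (mm per pixel when a 1 m object fills the frame)
// and the memory needed for 30 uncompressed RGB frames.
void printCameraStats(int w, int h) {
    float accuracyMm = 1000.0f / w;                        // 1 m spread over w pixels
    float framesMB = w * h * 3 * 30 / (1024.0f * 1024.0f); // 3 bytes/pixel, 30 frames
    printf("%d x %d: %.2f mm, %.1f MB\n", w, h, accuracyMm, framesMB);
}

int main() {
    printCameraStats(320, 240);   // about 3.13 mm, 6.6 MB
    printCameraStats(640, 480);   // about 1.56 mm, 26.4 MB
    printCameraStats(1280, 1024); // about 0.78 mm, 112.5 MB (the slide's 0.97 mm
                                  // apparently divides by the 1024-pixel side)
    return 0;
}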

The number of frames per second


This is the number of images obtained from the camera per second.

30 fps
Time between frames: 33 ms

60 fps
Time between frames: 16 ms

150 fps
Time between frames: 6 ms. Can be used for a musical instrument:

http://www.youtube.com/watch?v=7iEvQIvbn8o

Type of data obtained


What data we get from the camera for processing:

- Color or grayscale image of the visible spectrum
- Infrared image
- Color image + depth (information about the distance to objects)

Using invisible infrared illumination, such a camera can see in a dark room (useful at performances).

Way to transfer data into the computer


- Analog cameras
- Webcams (USB cameras)
- FireWire cameras (IEEE 1394)
- Network (IP) cameras
- "Smart" cameras

Analog

Historically the first type; the signal is transmitted as an analog TV signal.

(+) Transmits data over long distances, albeit with interference (100 m)
(+) Easy to install, small size
(-) Inputting the signal into the computer requires a special capture card or TV tuner, and these usually consume a lot of computing resources.
(-) Interlacing makes the image very difficult to analyze if there is movement.

Webcams (USB-camera)

Appeared around 2000; they transmit data via the USB protocol, uncompressed or JPEG-compressed.
(+) Easy to connect to a computer and software
(+) Cheap, widely available
(-) Overhead: decoding JPEG requires computing resources.
(-) The cheapest models usually have bad optics and sensors (noisy images).
(-) Because of USB bandwidth limitations, no more than 2 cameras can be connected to a single USB hub; a PC usually has 2-3 USB hubs.

Firewire-camera (IEEE-1394)

Cameras that transmit a signal over the FireWire protocol, usually in a dust- and moisture-proof case; these are typically cameras for industrial applications.
(+) Transfer of uncompressed video in excellent quality at high speed
(+) You can connect multiple cameras
(+) Tend to have excellent optics
(-) High price
(-) Require power, which is sometimes difficult to provide with laptops

Network (IP-camera)

Cameras that transmit data over a network (wired or wireless) channel. They are now rapidly gaining popularity in all areas.
(+) Easy connection to a PC
(+) Easy installation
(+) Data can be transferred over an unlimited distance, which allows building a network of cameras covering a building or an area, attached to an airship, etc.
(+) Controllable: the camera can be rotated and the zoom adjusted
(-) May have problems with response speed
(-) Still a relatively high price
(-) Not yet portable (as of 2011)

"Smart" cameras (Smart cameras)


Cameras with a computer located in the camera body. Such cameras are fully functional vision systems that transmit the detected objects etc. over various protocols.
(+) Compact.
(+) Scalability: it is easy to build a network of such cameras.
(-) Often require adaptation of existing projects.
(-) Low-cost models are rather slow, so they only do a good job on relatively simple image-analysis tasks.

A separate type: infrared cameras

Constructed from ordinary cameras by adding an infrared filter and, often, an infrared illuminator.
+ IR rays are almost invisible to humans (in the dark they can be seen as a faint red glow), so they are often used to simplify the analysis of objects in the field of view.
- Specialized infrared cameras suitable for machine vision are not a mass product, so they usually need to be ordered.

Examples of good cameras


Sony PS3 Eye
320 x 240: 150 FPS
640 x 480: 60 FPS
Data types: visible light, IR (requires removing the IR filter)

Price: $50

USB, CCD

Examples of good cameras


Point Grey Flea3
648 x 488: 120 FPS
Data types: visible light, IR (?)
Price: $600

Model FL3-FW-03S1C-C
IEEE 1394b, CCD

Examples of good cameras


Microsoft Kinect
640 x 480: 30 FPS
Data type: visible light + depth
Price: $150

(Depth: stereo vision using a laser infrared illuminator, which is why it does not work in sunlight)
USB, CMOS

Examples of good cameras


Point Grey BumbleBee2
640 x 480: 48 FPS
Data type: visible light + depth
Price: $2000

(Depth: stereo vision with two cameras)
IEEE 1394b, CCD

What if you have no webcam?


1. Get the program SplitCam (http://www.splitcamera.com/). It can simulate a webcam, taking an arbitrary video file (usually avi) as input.
2. Load an avi file into SplitCam, and then run the CameraTest project, see below.
Attention: even when SplitCam is off, it is the 0th (default) camera in the system. Therefore, if you turn on a webcam, your project may still show black frames from the camera. Solution: select camera 1 in the project's grabber settings, or uninstall SplitCam.
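To select camera 1 from code, ofVideoGrabber can be pointed at a specific device before initialization. A minimal sketch; the device index 1 is an assumption, check the console listing on your system:

ofVideoGrabber grabber;

void testApp::setup() {
    grabber.listDevices();  // print the available cameras to the console
    grabber.setDeviceID(1); // select camera 1 instead of the default camera 0
    grabber.initGrabber(320, 240);
}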

Getting images from a camera in openFrameworks

Receiving and displaying a frame in openFrameworks: the CameraTest project

Preparation of the project:

In the folder openFrameworks/apps/examples, take the example emptyProject and copy it to apps/myApps/CameraTest

Project CameraTest: testApp.cpp (1)


#include "testApp.h"

// Declare variables

// Video grabber for "capturing" video frames:
ofVideoGrabber grabber;
int w; // width of the frame
int h; // height of the frame

// Initialize
void testApp::setup() {
    w = 320;
    h = 240;
    grabber.initGrabber(w, h);   // connect the camera
    ofBackground(255, 255, 255); // set the background color
}

Project CameraTest: testApp.cpp (2)


// Update the state
void testApp::update() {
    grabber.grabFrame(); // grab a frame
}

// Draw
void testApp::draw() {
    grabber.draw(0, 0); // output the frame
}
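The slides show only testApp.cpp; since the variables are declared at file scope there, the matching header stays minimal. A sketch of what the testApp.h from emptyProject looks like, approximately (assumed for the openFrameworks version used in these lectures):

#pragma once

#include "ofMain.h"

class testApp : public ofBaseApp {
public:
    void setup();  // called once at start
    void update(); // called every frame before draw()
    void draw();   // called every frame to render
};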

Threshold

Way of storing images in memory


The image is usually stored in memory by writing its pixels sequentially, row by row. (JPEG, PNG, etc. are packed formats and store images in a fundamentally different way.) Depending on the type of image, one pixel may consist of a different number of bytes:
1 byte: black-and-white (grayscale) image,
3 bytes: color (Red, Green, Blue),
4 bytes: color with transparency (Red, Green, Blue, Alpha).
Modern GUIs use 4-byte images for pictures, icons, etc. Input from the camera comes in a 3-byte format. Image analysis at the object-extraction stages is often conducted with 1-byte images.
Important: the direction of the OY axis and the RGBA byte order may vary depending on the file format.

Way of storing images in memory


Let unsigned char *image; be an image with k bytes per pixel and size w x h pixels. Then the components of pixel (x, y) are accessed as:
image[k * (x + w * y) + 0]
image[k * (x + w * y) + 1]
...
image[k * (x + w * y) + k-1]

For example, for a pixel (x, y) of an RGB image:
image[3 * (x + w * y) + 0]: Red
image[3 * (x + w * y) + 1]: Green
image[3 * (x + w * y) + 2]: Blue
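As a small illustration (a hypothetical helper, not from the slides), the addressing formula can be wrapped in a function:

// Returns a pointer to the first byte (Red) of pixel (x, y)
// in a w x h RGB image stored row by row, 3 bytes per pixel.
unsigned char* pixelAt(unsigned char* image, int w, int x, int y) {
    return image + 3 * (x + w * y);
}

// Usage:
//   unsigned char* p = pixelAt(image, w, 10, 20);
//   p[0] is Red, p[1] is Green, p[2] is Blue of pixel (10, 20).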

Threshold
Thresholding finds the pixels whose brightness, i.e. (0.2989 * Red + 0.5870 * Green + 0.1140 * Blue), or a single color component (Red, Green, or Blue), is greater than some threshold value.
What is needed to perform the processing:
- access to the pixels of a frame for analysis,
- a way to analyze the pixels and display the result on the screen.
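For the brightness variant, a minimal sketch using the weights above (an assumed helper; the project below thresholds the Blue channel instead):

// Returns 255 if the perceptual brightness of an RGB pixel
// exceeds the threshold, otherwise 0.
int thresholdBrightness(int r, int g, int b, float threshold) {
    float brightness = 0.2989f * r + 0.5870f * g + 0.1140f * b;
    return (brightness > threshold) ? 255 : 0;
}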

Threshold
++++++++ Add to "Declare variables":

// Bytes of the processed image
unsigned char *outImage;
// Texture to display the processed image
ofTexture outImageTexture;

++++++++ Add to setup():

// Allocate memory for image analysis
outImage = new unsigned char[w * h * 3];
// Create a texture to display the result on the screen
outImageTexture.allocate(w, h, GL_RGB);

Threshold
// Update the state
void testApp::update() {
    grabber.grabFrame(); // grab a frame
    if (grabber.isFrameNew()) { // if a new frame has arrived
        // Pixels of the input image:
        unsigned char *input = grabber.getPixels();
        // Loop through them
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Input pixel (x, y):
                int r = input[3 * (x + w * y) + 0];
                int g = input[3 * (x + w * y) + 1];
                int b = input[3 * (x + w * y) + 2];
                // Threshold by the Blue channel
                int result = (b > 100) ? 255 : 0;
                // The output pixel will be black or white:
                outImage[3 * (x + w * y) + 0] = result;
                outImage[3 * (x + w * y) + 1] = result;
                outImage[3 * (x + w * y) + 2] = result;
            }
        }
        // Write to a texture for subsequent output to the screen
        outImageTexture.loadData(outImage, w, h, GL_RGB);
    }
}

Threshold
// Draw
void testApp::draw() {
    grabber.draw(0, 0);               // output the frame
    outImageTexture.draw(w, 0, w, h); // output the processing result
}

Search for color labels


We solve the problem of finding the coordinates of a blue object in the input frame.
First we find the blue pixels. These are pixels whose Blue channel is substantially greater than their Red and Green channels. To do this:
>>>>>>>>> Change the line int result ... to:
int result = (b > r + 100 && b > g + 100) ? 255 : 0;

Search for color labels


Thus we have labeled the "blue" pixels. Now we find the coordinates of their center. For simplicity, we assume there is one blue object in the frame. Then we can take the center of gravity of the labeled pixels.

Search for color labels


++++++++ Add to "Declare variables":
ofPoint pos; // coordinates of the object

++++++++ Add to update(): calculation of the center of gravity of the labeled pixels
pos = ofPoint(0, 0);
int n = 0; // number of pixels found
for (int y = 0; y < h; y++) {
    for (int x = 0; x < w; x++) {
        // Look at the processed image
        int b = outImage[3 * (x + w * y) + 2];
        if (b == 255) { // we previously labeled the blue points this way
            pos.x += x;
            pos.y += y;
            n++;
        }
    }
}
// Compute the average
if (n > 0) {
    pos.x /= n;
    pos.y /= n;
}

Search for color labels


// Draw
void testApp::draw() {
    // This must be added (hard to say exactly why), otherwise the texture is drawn incorrectly:
    ofSetColor(255, 255, 255);

    grabber.draw(0, 0);               // output the frame
    outImageTexture.draw(w, 0, w, h); // output the processing result

    // Display a circle around the object
    ofSetColor(0, 255, 0);          // green
    ofNoFill();                     // turn off the fill
    ofCircle(pos.x, pos.y, 20);     // draw a circle on the original frame
    ofCircle(pos.x + w, pos.y, 20); // draw a circle on the processed frame
}

Search for color labels


Result:

These coordinates can be used to control something. By the way, n can be used to check that the frame contains an object of interest to us.
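For example, a sketch of such a check; minPixels is an assumed value to tune for your camera and lighting:

// Consider the object "present" only if enough pixels were labeled;
// this filters out stray noise pixels when no blue object is in the frame.
const int minPixels = 50; // assumed value, tune experimentally
if (n > minPixels) {
    // the object was found: use pos.x, pos.y to control something
}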

Homework: "Instinct2"
Implement the following interactive project:
1. Take the color-label-finding project and build an image intens, consisting of pixels characterizing the intensity of blue, without the threshold processing:

int result = b - (r + g) / 2; // variants: b - min(r, g), b - max(r, g)
result = max(result, 0);      // result must be in [0..255]
intens[3 * (x + w * y) + 0] = result;
intens[3 * (x + w * y) + 1] = result;
intens[3 * (x + w * y) + 2] = result;

Draw this image on the screen.

Homework: "Instinct2"
2. Place on the screen 20-50 colored "creatures" whose initial positions and colors are given randomly. They have mass and velocity. Let there be a variety of colors and sizes. Let the sizes pulsate. The sprites can be drawn in Photoshop with translucent brushes of different colors:

Homework: "Instinct2"
3. Let the creatures move in the direction of maximum intensity of blue. To this end, for a creature at coordinates (x0, y0), compute the center of mass of intens in some neighborhood of that point (a sketch of the steering step follows this code):

float mx = 0;
float my = 0;
float sum = 0;
int rad = 100; // radius of the neighborhood; you may want it to depend
               // on the current size of the creature
for (int y = -rad; y <= rad; y++) {
    for (int x = -rad; x <= rad; x++) {
        if (x + x0 >= 0 && x + x0 < w && y + y0 >= 0 && y + y0 < h // inside the screen
            && x * x + y * y <= rad * rad) {                       // inside a circle of radius rad
            float value = intens[3 * ((x + x0) + w * (y + y0)) + 0];
            mx += value * x;
            my += value * y;
            sum += value;
        }
    }
}
if (sum > 0) { mx /= sum; my /= sum; }

Then (mx, my) is the offset toward which the creature must be directed. Now apply Newton's second law, specifying the desired acceleration, so that the creature moves in the right direction.
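A sketch of that steering step under assumptions: vel and pos are hypothetical fields of a creature, k is an arbitrary steering constant, and dt is computed with the timer described on the next slide.

// Steer the creature toward the local center of blue intensity.
// (mx, my) is an offset relative to the creature's position (x0, y0).
float k = 5.0f;              // assumed steering strength (per unit mass)
ofPoint acc(k * mx, k * my); // desired acceleration (Newton's 2nd law with unit mass)
vel.x += acc.x * dt;         // integrate acceleration into velocity
vel.y += acc.y * dt;
pos.x += vel.x * dt;         // integrate velocity into position
pos.y += vel.y * dt;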

How to make the physics speed in your program independent of the computer's performance
To make the simulation speed of the physics in the program independent of the power of the computer, use a timer:

// Declare variables
float time0 = 0; // time of the last entry into update()

// In update():
float time = ofGetElapsedTimef(); // time since the start of the program, in seconds
float dt = min(time - time0, 0.1f);
time0 = time;
// use the dt value in your physics!

// We take the min because if for some reason update() is delayed
// (for example, the user moves the window on the screen),
// this protects us from dt becoming too large
// (a large dt can "blow up" the physics objects).
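Put together, a usage sketch of dt inside update(); vel, pos and acc are hypothetical physics variables of your creatures:

float time0 = 0; // time of the last update() call

void testApp::update() {
    float time = ofGetElapsedTimef(); // seconds since the program started
    float dt = time - time0;
    if (dt > 0.1f) dt = 0.1f; // cap dt so a delayed frame cannot break the physics
    time0 = time;

    // scale every physics step by dt, for example:
    // vel.x += acc.x * dt; pos.x += vel.x * dt;
}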

A note about Release/Debug


Do not forget to enable "Release" when compiling the finished project; it will increase the speed of the program.

Note
Tasks such as threshold processing, noise removal, object detection, contour tracking and others are easier to solve with ready-made procedures implemented in the OpenCV library, which can be connected to openFrameworks.
