Working with 3D-cameras Microsoft Kinect and Asus Xtion
Lectures Contacts: www.uralvision.blogspot.com firstname.lastname@example.org
UFU / IMM fall 2011
What is a 3D-camera
A camera that can measure the distance, in meters, from the camera to the scene point seen by each pixel.
"Cheap" 3D cameras work on the principle of stereovision with active IR illumination. A laser emits in the infrared range, projecting a special dot pattern onto the objects. The IR camera analyzes the pattern and recovers the distance to points on the objects.
Pros:
- Works relatively stably in dark rooms (compared to conventional stereo systems).
Cons:
- Does not work outdoors in bright light, because the camera cannot see the dots from the IR laser (compared to conventional stereo systems).
- Cannot see glass and mirrors (compared to sonar).
- Lower accuracy (compared to Time-of-Flight cameras).
Which cameras exist
Microsoft Kinect: 30 fps at 640x480, plus an RGB camera and 4 microphones. Long cord (3 m), but big and heavy.
Asus Xtion Pro: 30 fps at 640x480, 60 fps at 320x240.
Asus Xtion Pro Live: like the Asus Xtion Pro, plus an RGB camera.
All cameras connect over USB 2.0.
Measurement range: the cameras measure distances from 80 cm to 500 cm, but work reliably from 80 cm to 350-400 cm.
Camera coverage: from a distance of 200 cm the camera sees a horizontal area approximately 200 cm wide.
Measurement accuracy: distance measurement in these cameras is based on parallax, so accuracy degrades as the distance to the object grows. At a distance of 100-200 cm the accuracy is on average about 1 cm.
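The parallax relationship behind the accuracy figures can be sketched as follows. This is a minimal illustration, not the lecture's calibration: the depth formula z = f·b/d implies that one disparity quantization step corresponds to a depth step that grows quadratically with distance.

```cpp
#include <cmath>

// Parallax-based depth sensing: z = f*b/d (focal length f in pixels,
// baseline b, disparity d in pixels). Differentiating, a change of one
// disparity step dd corresponds to a depth change of roughly
// z^2 * dd / (f*b), so the quantization step grows quadratically with z.
double depthStepCm(double z_cm, double focal_px, double baseline_cm,
                   double disparity_step_px) {
    return z_cm * z_cm * disparity_step_px / (focal_px * baseline_cm);
}
```

With assumed illustrative values (focal length 580 px, IR-camera-to-laser baseline 7.5 cm, 1/8-pixel disparity step), the step is about 0.3 cm at 100 cm and about 4.6 cm at 400 cm; doubling the distance quadruples the step.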
Safety in use
The manufacturers state that the infrared radiation produced by the laser is safe for the eyes, especially if you do not look at it from close range.
Low level - Freenect etc.
Low-level access to the camera: obtaining depth maps and controlling the motor. See also PCL (Point Cloud Library), an analog of OpenCV for the analysis of point clouds.
OpenNI
The SDK from PrimeSense, the company that invented and first produced devices of the Kinect and Xtion class. Provides: depth maps; people tracking (labelling the pixels belonging to each person, or to any other moving object of roughly human size, with that person's "id"); skeleton tracking (requires calibration in the Ψ-pose); gesture recognition.
Microsoft Kinect SDK
Depth maps, people tracking (?), skeleton tracking (no calibration required), plus access to the microphones and the motor.
Comparison of OpenNI and the Microsoft Kinect SDK for skeleton tracking
OpenNI. Big minus: calibration is required (the user must stand in the Ψ-pose). Pros: much more stable skeleton tracking; can identify more than two skeletons in the frame; runs on Windows, Mac and Linux.
Microsoft Kinect SDK. Big plus: recognizes the skeleton immediately, without calibration. Cons: tracking is not very stable, and the tracker handles at most two skeletons in the frame; it works only under Windows 7, and only with the Kinect.
Neither SDK recognizes the position of the legs very well. Also, neither works well for people in skirts or loose clothing.
1. Using depth maps
- Experiments in visualizing and transforming point clouds.
- A simple gesture analyzer using a depth threshold.
- Floor games.
- Reconstruction of a 3D map by analyzing the camera's movement and integrating the resulting point clouds.
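The depth-threshold gesture analyzer from the list above can be sketched as follows. The threshold and blob-size parameters are illustrative, not from any SDK: the idea is simply that a hand reaching toward the camera produces a blob of pixels closer than everything else.

```cpp
#include <cstdint>
#include <vector>

// Counts depth pixels that are valid (non-zero) and closer than a threshold.
// A hand stretched toward the camera produces a blob of such "near" pixels.
int countNearPixels(const std::vector<uint16_t>& depth_mm,
                    uint16_t threshold_mm) {
    int n = 0;
    for (uint16_t d : depth_mm)
        if (d > 0 && d < threshold_mm) ++n;
    return n;
}

// Fires when the near-pixel blob is big enough (a "push"-style trigger).
bool handDetected(const std::vector<uint16_t>& depth_mm,
                  uint16_t threshold_mm, int min_pixels) {
    return countNearPixels(depth_mm, threshold_mm) >= min_pixels;
}
```

In a real application the depth buffer would come from the camera each frame, and the threshold would sit between the user's body and the interaction zone.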
http://www.georgetoledo.com/2010_11_01_archive.html http://spryflash.com/blog/?p=32 http://www.youtube.com/watch?v=iUkWgSXbz40
2. Using the OpenNI gesture analyzers
- Controlling devices such as TVs with gestures.
- Creating touch-free information stands.
When I tested the OpenNI gesture recognizer, the Push and Wave gestures were recognized well, though only when performed exactly as the instructions prescribe. Tracking of the hand tip is good. Circular gestures and left/right swipes were recognized very poorly; perhaps I simply do not know how to perform them correctly.
3. Using OpenNI people detection
- Creating interactive systems that do not need to know where individual body parts are located: entertainment, lighting, dance installations, etc.
- Analyzing the people in a room for security systems and interactive applications.
For this, the center of mass of the pixels belonging to a person is computed; this point characterizes the person's position. Its local pixel coordinates (x pixels, y pixels, z mm) are then transformed into world coordinates (X mm, Y mm, Z mm), giving the person's position relative to the camera.
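The centroid-and-reprojection step just described can be sketched as follows. The pinhole intrinsics (focal length 580 px, principal point at the image center) are assumed placeholder values, not a real calibration:

```cpp
#include <cstdint>
#include <vector>

struct Point3 { double x, y, z; };  // millimetres, camera-relative

// Projects a pixel with known depth into camera-space millimetres using a
// simple pinhole model (fx, fy in pixels; cx, cy is the principal point).
Point3 pixelToWorld(int px, int py, uint16_t z_mm,
                    double fx, double fy, double cx, double cy) {
    return { (px - cx) * z_mm / fx, (py - cy) * z_mm / fy, (double)z_mm };
}

// Center of mass of all pixels carrying a given user label, converted to
// world coordinates: a single point characterising the person's position.
Point3 userCentroid(const std::vector<uint8_t>& labels,
                    const std::vector<uint16_t>& depth_mm,
                    int w, int h, uint8_t userId) {
    const double fx = 580, fy = 580;            // assumed intrinsics
    const double cx = w / 2.0, cy = h / 2.0;    // assumed principal point
    double sx = 0, sy = 0, sz = 0;
    int n = 0;
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            int i = y * w + x;
            if (labels[i] == userId && depth_mm[i] > 0) {
                Point3 p = pixelToWorld(x, y, depth_mm[i], fx, fy, cx, cy);
                sx += p.x; sy += p.y; sz += p.z; ++n;
            }
        }
    return n ? Point3{ sx / n, sy / n, sz / n } : Point3{ 0, 0, 0 };
}
```

OpenNI also offers its own conversion from projective to real-world coordinates; the hand-rolled version above just makes the geometry explicit.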
4. Using skeleton tracking: OpenNI, Kinect SDK
- Games.
- Entertainment.
- Robot control (minus: the lack of tactile feedback makes such control not very convenient).
Using the cameras in your projects
- Choose the SDK that suits you (OpenNI, Kinect SDK, Freenect, etc.).
- If you want to work with the 3D points themselves and display them, link the appropriate libraries into your project.
- If you only need the result of analyzing the 3D points (for example, "the position of the feet on the floor", the positions of the skeleton joints, or the coordinates of the people in the room), you can take the examples that ship with the SDK, modify them, and transmit the needed data to your program. The easiest way is via the OSC protocol (for Unity3D, openFrameworks, Cinder) or XmlSockets (for Flash).
This last approach is especially relevant for the Microsoft Kinect SDK, since in my opinion it is the hardest to integrate into projects directly. OSC can also be used to send the skeleton data to another computer (often a Mac) for rendering.
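If you do not want to pull in an OSC library, the wire format is simple enough to assemble by hand: address and type-tag strings are null-terminated and padded to 4-byte boundaries, and float arguments are big-endian IEEE 754. A minimal sketch (any address such as "/skeleton/head" is a made-up example, not a standard path):

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Pads an OSC string: at least one null byte, up to a multiple of 4.
static void padTo4(std::vector<uint8_t>& buf) {
    do { buf.push_back(0); } while (buf.size() % 4 != 0);
}

// Builds an OSC 1.0 message with float arguments, ready to send as one
// UDP datagram.
std::vector<uint8_t> oscMessage(const std::string& address,
                                const std::vector<float>& args) {
    std::vector<uint8_t> buf(address.begin(), address.end());
    padTo4(buf);
    buf.push_back(',');                      // type-tag string: ",fff..."
    for (size_t i = 0; i < args.size(); ++i) buf.push_back('f');
    padTo4(buf);
    for (float f : args) {                   // big-endian float32 payload
        uint32_t bits;
        std::memcpy(&bits, &f, 4);
        buf.push_back(bits >> 24); buf.push_back(bits >> 16);
        buf.push_back(bits >> 8);  buf.push_back(bits);
    }
    return buf;
}
```

The returned buffer is one complete OSC message; sending it as a single UDP datagram (e.g. via sendto()) is enough for the OSC receivers in Unity3D or openFrameworks to parse it.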
Use with OpenCV
If you have a depth image, it is usually a 16-bit single-channel image:
unsigned short* depthData = ...; // size w x h
An OpenCV image can then be built on top of it:
Mat depth = Mat(cv::Size(w, h), CV_16UC1, depthData);
Note that in this case OpenCV does not allocate memory for depth: the Mat simply wraps the pointer, so depthData must remain valid while depth is in use (call depth.clone() if you need an independent copy).
Achieving a fast frame rate in your application
A camera delivering 30 frames per second is not enough for smooth rendering, which usually requires at least 60 frames per second.
There are two ways to solve this: 1. Capture and process the 3D camera frames in a separate thread; the main thread simply reads the current state of the 3D camera data.
2. Perform the 3D camera analysis in a separate application, as discussed earlier.
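Option 1 can be sketched as a shared "latest frame" buffer (all names here are illustrative, not from any SDK): the capture thread overwrites the latest frame under a lock, and the render thread copies it out whenever it needs one, so rendering at 60 fps is never blocked waiting on the 30 fps camera.

```cpp
#include <cstdint>
#include <mutex>
#include <vector>

// Shared buffer between a camera-capture thread and a render thread.
// The capture thread calls publish() once per camera frame; the render
// thread calls latest() once per render tick and always gets the most
// recent frame without waiting for the camera.
class FrameBuffer {
public:
    void publish(const std::vector<uint16_t>& frame) {
        std::lock_guard<std::mutex> g(m_);
        latest_ = frame;
        ++version_;  // lets the reader detect whether a new frame arrived
    }

    // Copies out the current frame and returns its version counter.
    uint64_t latest(std::vector<uint16_t>& out) const {
        std::lock_guard<std::mutex> g(m_);
        out = latest_;
        return version_;
    }

private:
    mutable std::mutex m_;
    std::vector<uint16_t> latest_;
    uint64_t version_ = 0;
};
```

The version counter lets the render thread skip reprocessing when no new camera frame has arrived since the last tick.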