
Computer Vision System Toolbox

Design and simulate computer vision and video processing systems


Overview
Computer Vision System Toolbox provides algorithms and tools for the design and simulation of computer
vision and video processing systems. The toolbox includes algorithms for feature extraction, motion detection,
object detection, object tracking, stereo vision, video processing, and video analysis. Tools include video file I/O,
video display, drawing graphics, and compositing. Capabilities are provided as MATLAB functions, MATLAB
System objects, and Simulink blocks. For rapid prototyping and embedded system design, the system toolbox
supports fixed-point arithmetic and C code generation.
Key Features
Feature detection, including FAST, Harris, Shi & Tomasi, SURF, and MSER detectors
Feature extraction and putative feature matching
Object detection and tracking, including Viola-Jones detection and CAMShift tracking
Motion estimation, including block matching, optical flow, and template matching
RANSAC-based estimation of geometric transformations or fundamental matrices
Video processing, video fle I/O, video display, graphic overlays, and compositing
Block library for use in Simulink
Feature Detection and Extraction
A feature is an interesting part of an image, such as a corner, blob, edge, or line. Feature extraction enables you to
derive a set of feature vectors, also called descriptors, from a set of detected features. Computer Vision System
Toolbox offers capabilities for feature detection and extraction that include:
Corner detection, including Shi & Tomasi, Harris, and FAST methods
SURF and MSER detection for blobs and regions
Extraction of simple pixel neighborhood and SURF descriptors
Visualization of feature location, scale, and orientation
Additionally, the system toolbox provides functionality to match two sets of feature vectors and visualize the
results. When combined into a single workflow, feature detection, extraction, and matching can be used to solve
many computer vision design challenges, such as registration, stereo vision, object detection, and tracking.
SURF (left), MSER (center), and corner detection (right) with Computer Vision System Toolbox. Using the same image,
the three different feature types are detected and results are plotted over the original image.
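As a rough illustration of how a corner detector scores image locations, the Harris response can be computed from local gradient statistics. This is a plain-NumPy sketch, not the toolbox's own implementation:

```python
import numpy as np

def harris_response(img, k=0.04):
    """Harris corner response for a grayscale float image (conceptual sketch)."""
    # Image gradients via central differences.
    Iy, Ix = np.gradient(img.astype(float))

    def box(a, r=1):
        # Sum each pixel's (2r+1)x(2r+1) neighborhood via shifted copies.
        out = np.zeros_like(a)
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                out += np.roll(np.roll(a, dy, axis=0), dx, axis=1)
        return out

    # Second-moment matrix entries, summed over a small window.
    Sxx, Syy, Sxy = box(Ix * Ix), box(Iy * Iy), box(Ix * Iy)
    # Harris measure: det(M) - k * trace(M)^2 per pixel.
    return (Sxx * Syy - Sxy ** 2) - k * (Sxx + Syy) ** 2

# A white square on black: corners should outscore straight edges.
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
R = harris_response(img)
```

Corners of the square (where gradients exist in both directions) get a strongly positive response, while edge midpoints get a near-zero or negative one.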
1
Registration and Stereo Vision
Computer Vision System Toolbox supports automatic image registration by providing algorithms that use
features to estimate the geometric relationships between images or video frames. Typical uses include video
mosaicking, video stabilization, image fusion, and stereo vision.
Feature-Based Registration
Feature detection, extraction, and matching are the first steps in the feature-based registration workflow. With a
pair of images, you can detect and extract features in each image, using one of several feature types available in the
system toolbox. You can then determine putative matches between the two sets of features and visualize the
matches. Typically, this workfow produces many interest points with matches that include outliers. You can
remove the outliers with statistically robust methods such as RANSAC or least median of squares to compute a
similarity, affine, or projective transformation. You can then apply the geometric transformation to align the two
images.
Feature-based registration, used for video stabilization. The system toolbox detects interest points in two sequential
video frames using corner features (top); the putative matches are determined with numerous outliers (bottom left), and
outliers are removed using the RANSAC method (bottom right).
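The outlier-removal step above can be sketched as a minimal RANSAC loop for an affine fit. This is illustrative NumPy, not the toolbox API, and the demo data at the end is synthetic:

```python
import numpy as np

def ransac_affine(src, dst, iters=200, tol=1.0, seed=0):
    """Estimate a 2-D affine transform from putative matches with RANSAC.
    src, dst: (N, 2) matched points. Returns (A, inlier_mask), where
    [x y 1] @ A approximates the destination point."""
    rng = np.random.default_rng(seed)
    n = len(src)
    best_A, best_mask = None, np.zeros(n, dtype=bool)
    X = np.hstack([src, np.ones((n, 1))])           # homogeneous source points
    for _ in range(iters):
        idx = rng.choice(n, size=3, replace=False)  # minimal sample: 3 pairs
        try:
            A = np.linalg.solve(X[idx], dst[idx])   # exact fit to the sample
        except np.linalg.LinAlgError:
            continue                                # degenerate (collinear) sample
        err = np.linalg.norm(X @ A - dst, axis=1)   # reprojection error per match
        mask = err < tol
        if mask.sum() > best_mask.sum():
            best_A, best_mask = A, mask
    # Refit on all inliers with least squares for the final estimate.
    if best_mask.sum() >= 3:
        best_A, *_ = np.linalg.lstsq(X[best_mask], dst[best_mask], rcond=None)
    return best_A, best_mask

# Synthetic demo: 25 true matches under a known transform, 5 outliers.
rng = np.random.default_rng(1)
src = rng.uniform(0, 100, (30, 2))
M = np.array([[0.8, -0.6], [0.6, 0.8], [5.0, -3.0]])  # rotation + translation row
dst = np.hstack([src, np.ones((30, 1))]) @ M
dst[:5] += rng.uniform(50, 100, (5, 2))               # corrupt 5 matches
A, inliers = ransac_affine(src, dst)
```

Because the contaminated matches cannot be fit by any single affine transform along with the true ones, the consensus set recovers exactly the uncorrupted pairs.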
Stereo Image Rectification
Stereo image rectification transforms a pair of stereo images so that a corresponding point in one image can be
found in the corresponding row in the other image. You can rectify a pair of stereo images with the system
toolbox by determining a set of matched interest points, estimating the fundamental matrix, and then deriving
two projective transformations. This process reduces the 2D stereo correspondence problem to a 1D problem,
which simplifies the process of determining the depth of each point in the scene from the camera.
Results from stereo image rectification. Non-overlapping areas are shown in red and cyan.
Stereo Vision
Stereo vision is the process of reconstructing a 3D scene from two or more views of the scene. Using the system
toolbox, you can perform uncalibrated stereo image rectification on a pair of stereo images and match individual
pixels along epipolar lines to compute a disparity map.
Reconstructing a scene using a pair of stereo images. To visualize the disparity, the right channel is combined with the
left channel to create a composite (top left); also shown are a disparity map of the scene (top right) and a 3D
rendering of the scene (bottom).
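Matching pixels along epipolar lines can be illustrated with a brute-force sum-of-absolute-differences (SAD) search over rectified images. A conceptual NumPy sketch (the toolbox's disparity computation is more sophisticated):

```python
import numpy as np

def disparity_map(left, right, max_disp=8, block=3):
    """Dense disparity by SAD block matching along rows of a rectified pair.
    Sketch only: integer disparities, no sub-pixel refinement or validation."""
    h, w = left.shape
    r = block // 2
    disp = np.zeros((h, w), dtype=int)
    for y in range(r, h - r):
        for x in range(r + max_disp, w - r):
            patch = left[y - r:y + r + 1, x - r:x + r + 1]
            # Cost of each candidate disparity d: SAD against the shifted patch.
            costs = [np.abs(patch - right[y - r:y + r + 1,
                                          x - d - r:x - d + r + 1]).sum()
                     for d in range(max_disp + 1)]
            disp[y, x] = int(np.argmin(costs))
    return disp
```

On a synthetic pair where the right image is the left shifted by a known amount, the recovered disparity equals that shift in the valid interior.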
Object Detection, Motion Estimation, and Tracking
Object detection is the identification of an object in an image or video. Computer Vision System Toolbox supports
several approaches to object detection, including template matching, blob analysis, and the Viola-Jones algorithm.
Template matching uses a small image, or template, to find matching regions in a larger image. Blob analysis uses
segmentation and blob properties to identify objects of interest. The Viola-Jones algorithm uses Haar-like features
and a cascade of classifiers to identify pretrained objects, including faces, noses, eyes, and other body parts.
Face detection using the Viola-Jones algorithm.
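Template matching as described above reduces to a sliding-window comparison. A minimal sketch in NumPy using the sum-of-squared-differences (SSD) criterion (illustrative only, not the toolbox implementation):

```python
import numpy as np

def match_template_ssd(image, template):
    """Slide a template over an image and return the (row, col) of the
    minimum sum-of-squared-differences match. Brute-force sketch."""
    ih, iw = image.shape
    th, tw = template.shape
    best, best_pos = np.inf, (0, 0)
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            ssd = ((image[y:y + th, x:x + tw] - template) ** 2).sum()
            if ssd < best:
                best, best_pos = ssd, (y, x)
    return best_pos
```

If the template is cut directly from the image, the search returns the position it was cut from, where the SSD cost is exactly zero.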
Motion estimation is the process of determining the movement of blocks between adjacent video frames. The
system toolbox provides a variety of motion estimation algorithms, such as optical flow, block matching, template
matching, and background estimation using Gaussian mixture models (GMMs). These algorithms create motion
vectors, which relate to the whole image, blocks, arbitrary patches, or individual pixels. For block and template
matching, the evaluation metrics for finding the best match include MSE, MAD, MaxAD, SAD, and SSD.
Detecting moving objects using a stationary camera. In this series of video frames, optical flow is calculated and
detected motion is shown by overlaying the flow field on top of each frame.
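The block- and template-matching metrics named above are all simple functions of the per-pixel difference between two blocks. For reference (NumPy, illustrative):

```python
import numpy as np

def block_metrics(a, b):
    """The standard matching metrics for two equal-size blocks."""
    d = a.astype(float) - b.astype(float)
    return {
        "MSE":   np.mean(d ** 2),      # mean squared error
        "MAD":   np.mean(np.abs(d)),   # mean absolute difference
        "MaxAD": np.max(np.abs(d)),    # maximum absolute difference
        "SAD":   np.sum(np.abs(d)),    # sum of absolute differences
        "SSD":   np.sum(d ** 2),       # sum of squared differences
    }
```

SAD and MAD differ only by the block size normalization, as do SSD and MSE; MaxAD instead reports the single worst pixel.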
Video tracking is the process of determining the movement of objects between video frames. For video tracking,
Computer Vision System Toolbox provides the continuously adaptive mean shift (CAMShift) algorithm, which
uses the histogram of pixel values to identify the object. For example, you can use this algorithm with the
Viola-Jones algorithm to detect and track faces in live video. Additionally, DSP System Toolbox provides Kalman
filtering to predict the movement of an object in upcoming video frames or the association of objects with
individual tracks.
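To illustrate the prediction step that Kalman filtering contributes, here is a generic 1-D constant-velocity filter in NumPy. This is a textbook sketch, not the toolbox API; the state is (position, velocity) and only position is measured:

```python
import numpy as np

def kalman_track(measurements, dt=1.0, q=1e-3, r=1.0):
    """Constant-velocity Kalman filter over 1-D position measurements.
    Returns the filtered position estimates, one per frame."""
    F = np.array([[1, dt], [0, 1]])   # state transition for (pos, vel)
    H = np.array([[1.0, 0.0]])        # we observe position only
    Q = q * np.eye(2)                 # process noise covariance
    R = np.array([[r]])               # measurement noise covariance
    x = np.array([[measurements[0]], [0.0]])
    P = np.eye(2)
    out = []
    for z in measurements:
        # Predict where the object will be in this frame.
        x = F @ x
        P = F @ P @ F.T + Q
        # Correct the prediction with the new measurement.
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
        x = x + K @ (np.array([[z]]) - H @ x)
        P = (np.eye(2) - K @ H) @ P
        out.append(x[0, 0])
    return out
```

Fed a noiseless constant-velocity track, the filter learns the velocity from the measurements and its position estimate converges to the true trajectory.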
Video Processing, Display, and Graphics
Computer Vision System Toolbox provides algorithms and tools for video processing. You can read and write
from common video formats, apply common video processing algorithms such as deinterlacing and
chroma resampling, and display results with text and graphics burned into the video. Video processing in
MATLAB uses System objects, which avoid excessive memory use by streaming the data and processing one frame
at a time.
Video deinterlacing in MATLAB.
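Line averaging is one common deinterlacing scheme: one field's scan lines are kept and the other's are interpolated from their vertical neighbors. A minimal NumPy sketch (the toolbox offers several deinterlacing methods; this is not its implementation):

```python
import numpy as np

def deinterlace_linear(frame, top_field_first=True):
    """Line-averaging deinterlace: keep one field, rebuild the other
    field's lines from the kept lines above and below."""
    out = frame.astype(float).copy()
    h = frame.shape[0]
    start = 1 if top_field_first else 0   # rows belonging to the dropped field
    for y in range(start, h, 2):
        above = out[y - 1] if y > 0 else out[y + 1]
        below = out[y + 1] if y < h - 1 else out[y - 1]
        out[y] = (above + below) / 2.0    # reads only kept (unmodified) rows
    return out
```

On a frame whose kept field is a smooth vertical ramp and whose dropped field is corrupted by combing, the interpolated lines land back on the ramp.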
Video I/O
Computer Vision System Toolbox can read and write multimedia files in a wide range of formats, including AVI,
MPEG, and WMV. You can stream video to and from MMS sources over the Internet or a local network. You can
acquire video directly from Web cameras, frame grabbers, DCAM-compatible cameras, and other imaging devices
using Image Acquisition Toolbox. Simulink users can use the MATLAB workspace as a video source or sink.
Video Display
The system toolbox includes a video viewer that lets you:
View video streams in-the-loop as the data is being processed
View any video signal within your code or block diagram
Use multiple video viewers at the same time
Freeze the display and evaluate the current frame
Display pixel information for a region in the frame
Pan and zoom for closer inspection as the simulation is running
Start, stop, pause, and step through Simulink simulations one frame at a time
Model with viewers for four videos: (from left) original, estimated background, foreground pixels, and results of
tracking.
Graphics
Adding graphics to video helps with visualizing extracted information or debugging a system design. You can
insert text to display the number of objects or to keep track of other key information. You can insert graphics,
such as markers, lines, and polygons, to mark found features, delineate objects, or highlight other key features. The
system toolbox functionality fuses text and graphics into the image or video itself rather than maintaining a
separate layer. You can combine two video sources in a composite that can highlight objects or a key region.
Images with text and graphics inserted. Adding these elements can help you visualize extracted information and
debug your design.
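Fusing graphics or a second source into the frame itself amounts to a masked alpha blend. A minimal NumPy sketch (illustrative, not the toolbox's insertion or compositing functions):

```python
import numpy as np

def composite(base, overlay, mask, alpha=0.6):
    """Blend `overlay` onto `base` where `mask` is True, with opacity alpha.
    The result is a single image; there is no separate graphics layer."""
    base = base.astype(float)
    out = base.copy()
    blend = alpha * overlay.astype(float) + (1 - alpha) * base
    out[mask] = blend[mask]
    return out
```

Pixels outside the mask are untouched; pixels inside are a weighted mix of the two sources, which is how a highlighted region stands out while the rest of the frame is preserved.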
Stream Processing in MATLAB and Simulink
Computer Vision System Toolbox supports a stream processing architecture in both MATLAB and Simulink. In a
stream processing architecture, one or more video frames from a continuous stream are processed at a time. This
type of processing is appropriate for analysis of large video files or systems with live video.
In MATLAB, stream processing is enabled by System objects, which use MATLAB objects to represent time-based
and data-driven algorithms, sources, and sinks. System objects implicitly manage many details of stream
processing, such as data indexing, buffering, and the management of algorithm state. You can mix System objects
with standard MATLAB functions and operators. Most System objects have corresponding Simulink blocks that
provide the same capabilities.
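The idea of an object that manages algorithm state across per-frame calls can be mimicked in a few lines of Python. Here is a hypothetical `RunningBackground` class, named for illustration only (it is not a real System object), that keeps a running background estimate and returns a foreground residual each frame:

```python
import numpy as np

class RunningBackground:
    """A tiny stream-processing object in the spirit of System objects:
    it holds state (a background estimate) between per-frame step calls."""

    def __init__(self, learning_rate=0.1):
        self.lr = learning_rate
        self.bg = None                     # algorithm state, kept across calls

    def step(self, frame):
        frame = frame.astype(float)
        if self.bg is None:
            self.bg = frame.copy()         # first frame initializes the state
        else:
            self.bg = (1 - self.lr) * self.bg + self.lr * frame
        return np.abs(frame - self.bg)     # per-frame foreground residual
```

The caller feeds one frame at a time and never manages buffering or indexing itself; the object's internal state carries everything needed between calls.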
Simulink handles stream processing implicitly by managing the flow of data through the blocks that make up a
Simulink model. It includes a library of general-purpose, predefined blocks to represent algorithms, sources, sinks,
and system hierarchy. Computer Vision System Toolbox provides a library of blocks specifically for the design of
computer vision and video processing systems.
An abandoned object detection model (top). The three viewers (bottom) show the process of detecting and tracking an
abandoned object in a live video stream from a camera in a train station.
System Design and Implementation
Computer Vision System Toolbox supports the creation of system-level test benches, fixed-point modeling, and
code generation within both MATLAB and Simulink. This support lets you integrate algorithm development with
rapid prototyping, implementation, and verification workflows.
Fixed-Point Modeling
Many real-time systems use hardware that requires fixed-point representation of your algorithm. Computer
Vision System Toolbox supports fixed-point modeling in most blocks and System objects, with dialog boxes and
object properties that help you with configuration.
System toolbox support for fixed point includes:
Word sizes from 1 to 128 bits
Arbitrary binary-point placement
Overflow handling methods (wrap or saturation)
Rounding methods, including ceiling, convergent, floor, nearest, round, simplest, and zero
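The effect of these settings can be emulated to build intuition. A NumPy sketch of quantizing one value with a given word size, binary-point placement, and overflow mode (illustrative only, not the toolbox's fixed-point engine; rounding here is nearest with ties to even, one of several modes listed above):

```python
import numpy as np

def to_fixed(x, word_bits=8, frac_bits=4, overflow="saturate"):
    """Quantize a float to signed fixed point: `word_bits` total bits with the
    binary point `frac_bits` from the right."""
    scaled = np.round(x * (1 << frac_bits))           # nearest (ties to even)
    lo, hi = -(1 << (word_bits - 1)), (1 << (word_bits - 1)) - 1
    if overflow == "saturate":
        scaled = min(max(scaled, lo), hi)             # clamp to representable range
    else:                                             # two's-complement wrap
        span = 1 << word_bits
        scaled = (int(scaled) - lo) % span + lo
    return scaled / (1 << frac_bits)
```

With 8-bit words and 4 fractional bits, the representable range is [-8, 7.9375] in steps of 0.0625; saturation clamps an overflowing value to the range limit, while wrap makes it reappear at the opposite end, just as two's-complement hardware would.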
Code Generation Support
Most System objects, functions, and blocks in Computer Vision System Toolbox can generate ANSI/ISO C code
using MATLAB Coder, Simulink Coder, or Embedded Coder. You can select optimizations for specific
processor architectures and integrate legacy C code with the generated code to leverage existing intellectual
property. You can generate C code for both floating-point and fixed-point data types.
Simulink model designed to create code for a specific hardware target. This model generates C code for a video
stabilization system and embeds the algorithm into a digital signal processor (DSP).
Image Processing Primitives
Computer Vision System Toolbox includes image processing primitives that support fixed-point data types and C
code generation. These System objects and Simulink blocks include:
2D spatial and frequency filtering
Image pre- and postprocessing algorithms
Morphological operators
Geometric transformations
Color space conversions
Resources
Online User Community
www.mathworks.com/matlabcentral
Training Services
www.mathworks.com/training
Third-Party Products and Services
www.mathworks.com/connections
Worldwide Contacts
www.mathworks.com/contact
© 2012 The MathWorks, Inc. MATLAB and Simulink are registered trademarks of The MathWorks, Inc. See www.mathworks.com/trademarks
for a list of additional trademarks. Other product or brand names may be trademarks or registered trademarks of their respective holders.