OpenCV 2.2 C++ Reference

Introduction
Starting from OpenCV 2.0, a new, modern C++ interface has been introduced. It is crisp (less typing is needed to code the same thing), type-safe (no more CvArr* a.k.a. void*) and, in general, more convenient to use. Here is a short example of what it looks like:

    //
    // Simple retro-style photo effect done by adding noise to
    // the luminance channel and reducing intensity of the chroma channels
    //

    // include standard OpenCV headers, same as before
    #include "cv.h"
    #include "highgui.h"

    // all the new API is put into the "cv" namespace. Export its content
    using namespace cv;

    // enable/disable use of mixed API in the code below.
    #define DEMO_MIXED_API_USE 1

    int main( int argc, char** argv )
    {
        const char* imagename = argc > 1 ? argv[1] : "lena.jpg";
    #if DEMO_MIXED_API_USE
        // Ptr<T> is a safe ref-counting pointer class
        Ptr<IplImage> iplimg = cvLoadImage(imagename);

        // cv::Mat replaces the CvMat and IplImage, but it's easy to convert
        // between the old and the new data structures
        // (by default, only the header is converted and the data is shared)
        Mat img(iplimg);
    #else
        // the newer cvLoadImage alternative with MATLAB-style name
        Mat img = imread(imagename);
    #endif

        if( !img.data ) // check if the image has been loaded properly
            return -1;

        Mat img_yuv;
        // convert image to YUV color space.
        // The output image will be allocated automatically
        cvtColor(img, img_yuv, CV_BGR2YCrCb);

        // split the image into separate color planes
        vector<Mat> planes;
        split(img_yuv, planes);

        // another Mat constructor; allocates a matrix of the specified
        // size and type
        Mat noise(img.size(), CV_8U);
        // fills the matrix with normally distributed random values;
        // there is also randu() for uniformly distributed random numbers.
        // Scalar replaces CvScalar, Scalar::all() replaces cvScalarAll().
        randn(noise, Scalar::all(128), Scalar::all(20));

        // blur the noise a bit, kernel size is 3x3 and both sigmas
        // are set to 0.5
        GaussianBlur(noise, noise, Size(3, 3), 0.5, 0.5);

        const double brightness_gain = 0;
        const double contrast_gain = 1.7;
    #if DEMO_MIXED_API_USE
        // it's easy to pass the new matrices to the functions that
        // only work with IplImage or CvMat:
        // step 1) - convert the headers, data will not be copied
        IplImage cv_planes_0 = planes[0], cv_noise = noise;
        // step 2) call the function; do not forget unary "&" to form pointers
        cvAddWeighted(&cv_planes_0, contrast_gain, &cv_noise, 1,
                      -128 + brightness_gain, &cv_planes_0);
    #else
        addWeighted(planes[0], contrast_gain, noise, 1,
                    -128 + brightness_gain, planes[0]);
    #endif

        const double color_scale = 0.5;
        // Mat::convertTo() replaces cvConvertScale.
        // One must explicitly specify the output matrix type
        // (we keep it intact, i.e. pass planes[1].type())
        planes[1].convertTo(planes[1], planes[1].type(),
                            color_scale, 128*(1-color_scale));

        // alternative form of convertTo if we know the datatype
        // at compile time ("uchar" here).
        // This expression will not create any temporary arrays
        // and should be almost as fast as the above variant
        planes[2] = Mat_<uchar>(planes[2]*color_scale + 128*(1-color_scale));

        // Mat::mul replaces cvMul(). Again, no temporary arrays are
        // created in the case of simple expressions.
        planes[0] = planes[0].mul(planes[0], 1./255);

        // now merge the results back
        merge(planes, img_yuv);
        // and produce the output RGB image
        cvtColor(img_yuv, img, CV_YCrCb2BGR);

        // this is the counterpart for cvNamedWindow
        namedWindow("image with grain", CV_WINDOW_AUTOSIZE);
    #if DEMO_MIXED_API_USE

        // this is to demonstrate that img and iplimg really share the data;
        // the result of the above processing is stored in img and thus
        // in iplimg too.
        cvShowImage("image with grain", iplimg);
    #else
        imshow("image with grain", img);
    #endif
        waitKey();

        return 0;
        // all the memory will automatically be released
        // by vector<>, Mat and Ptr<> destructors.
    }

Following the summary "cheatsheet" below, the rest of the introduction will discuss the key features of the new interface in more detail.

C++ Cheatsheet
This section is just a summary "cheatsheet" of common things you may want to do with cv::Mat. The code snippets below all assume the correct namespaces are used:

    using namespace cv;
    using namespace std;

Convert an IplImage or CvMat to a cv::Mat, and a cv::Mat to an IplImage or CvMat:

    // Assuming somewhere IplImage *iplimg; exists
    // and has been allocated, and cv::Mat Mimg has been defined
    Mat imgMat(iplimg); // Construct a Mat image "imgMat" out of an IplImage
    Mimg = iplimg;      // Or just set the header of the pre-existing cv::Mat
                        // Mimg to iplimg's data (no copying is done)

    // Convert to IplImage or CvMat, no data copying
    IplImage ipl_img = img;
    CvMat cvmat = img;  // convert cv::Mat -> CvMat

A very simple way to operate on a rectangular sub-region of an image (ROI - "Region of Interest"):

    // Make a rectangle
    Rect roi(10, 20, 100, 50);
    // Point a cv::Mat header at it (no allocation is done)
    Mat image_roi = image(roi);

A bit advanced, but should you want to efficiently sample from a circular region in an image (below, instead of sampling, we just draw into a BGR image):

    // the function returns x boundary coordinates of
    // the circle for each y. RxV[y1] = x1 means that
    // when y=y1, -x1 <= x <= x1 is inside the circle
    void getCircularROI(int R, vector<int>& RxV)
    {
        RxV.resize(R+1);
        for( int y = 0; y <= R; y++ )
            RxV[y] = cvRound(sqrt((double)R*R - y*y));
    }

    // This draws a circle in the green channel
    // (note the "[1]" for a BGR image;
    // blue and red channels are not modified),
    // but is really an example of how to *sample* from a circular region.
    void drawCircle(Mat &image, int R, Point center)
    {
        vector<int> RxV;
        getCircularROI(R, RxV);

        Mat_<Vec3b>& img = (Mat_<Vec3b>&)image; // 3-channel view of the image
        for( int dy = -R; dy <= R; dy++ )
        {
            int Rx = RxV[abs(dy)];
            for( int dx = -Rx; dx <= Rx; dx++ )
                img(center.y+dy, center.x+dx)[1] = 255;
        }
    }
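A hypothetical call could look like this (the canvas size, radius and center are made up for illustration):

    Mat canvas(240, 320, CV_8UC3, Scalar::all(0)); // black BGR canvas
    drawCircle(canvas, 50, Point(160, 120));       // draws a filled green disk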

Namespace cv and Function Naming
All the newly introduced classes and functions are placed into the cv namespace. Therefore, to access this functionality from your code, use the cv:: specifier or the "using namespace cv;" directive:

    #include "cv.h"

    ...
    cv::Mat H = cv::findHomography(points1, points2, cv::RANSAC, 5);
    ...

or

    #include "cv.h"

    using namespace cv;

    ...
    Mat H = findHomography(points1, points2, RANSAC, 5);
    ...

It is probable that some of the current or future OpenCV external names conflict with STL or other libraries; in this case, use explicit namespace specifiers to resolve the name conflicts:

    Mat a(100, 100, CV_32F);
    randu(a, Scalar::all(1), Scalar::all(std::rand()));
    cv::log(a, a);
    a /= std::log(2.);

For most of the C functions and structures from OpenCV 1.x you may find direct counterparts in the new C++ interface. The name is usually formed by omitting the cv or Cv prefix and turning the first letter to lowercase (unless it's a proper name, like Canny, Sobel etc.). In case there is no new-style counterpart, it's possible to use the old functions with the new structures, as shown in the first sample in the chapter.

Memory Management
When using the new interface, most of the memory deallocation and even memory allocation operations are done automatically when needed.

First of all, Mat, SparseMat and other classes have destructors that deallocate the memory buffers occupied by the structures when needed. Secondly, this "when needed" means that the destructors do not always deallocate the buffers; they take into account possible data sharing. That is, in a destructor the reference counter associated with the underlying data is decremented, and the data is deallocated if and only if the reference counter becomes zero, that is, when no other structures refer to the same buffer. When such a structure containing a reference counter is copied, usually just the header is duplicated, while the underlying data is not; instead, the reference counter is incremented to memorize that there is another owner of the same data. Also, some structures, such as Mat, can refer to user-allocated data. In this case the reference counter is a NULL pointer and no reference counting is done: the data is not deallocated by the destructors and should be deallocated manually by the user. We saw this scheme in the first example in the chapter:

    // allocates an IplImage and wraps it into a shared pointer class.
    Ptr<IplImage> iplimg = cvLoadImage(...);

    // constructs Mat header for IplImage data;
    // does not copy the data;
    // the reference counter will be NULL
    Mat img(iplimg);
    ...
    // in the end of the block img destructor is called,
    // which does not try to deallocate the data because
    // of the NULL pointer to the reference counter.
    //
    // Then Ptr<IplImage> destructor is called that decrements
    // the reference counter and, as the counter becomes 0 in this case,
    // the destructor calls cvReleaseImage().

The copying semantics were mentioned in the above paragraph, but deserve a dedicated discussion. By default, the new OpenCV structures implement shallow, so-called O(1) (i.e. constant-time) assignment operations. This gives the user the possibility to pass quite big data structures to functions (though, e.g., passing const Mat& is still faster than passing Mat), return them (e.g. see the example with findHomography above), store them in OpenCV and STL containers etc., and do all of this very efficiently. On the other hand, most of the new data structures provide a clone() method that creates a full copy of an object. Here is the sample:

    // create a big 8MB matrix
    Mat A(1000, 1000, CV_64F);

    // create another header for the same matrix;
    // this is an instant operation, regardless of the matrix size.
    Mat B = A;
    // create another header for the 3-rd row of A; no data is copied either
    Mat C = B.row(3);
    // now create a separate copy of the matrix
    Mat D = B.clone();
    // copy the 5-th row of B to C, that is, copy the 5-th row of A
    // to the 3-rd row of A.
    B.row(5).copyTo(C);
    // now let A and D share the data; after that the modified version
    // of A is still referenced by B and C.
    A = D;
    // now make B an empty matrix (which references no memory buffers),

    // but the modified version of A will still be referenced by C,
    // despite that C is just a single row of the original A
    B.release();

    // finally, make a full copy of C. As a result, the big modified
    // matrix will be deallocated, since it's not referenced by anyone
    C = C.clone();

Memory management of the new data structures is automatic and thus easy. If, however, your code uses IplImage, CvMat or other C data structures a lot, memory management can still be automated without immediate migration to Mat by using the already mentioned template class Ptr, similar to shared_ptr from Boost and C++ TR1. It wraps a pointer to an arbitrary object, provides transparent access to all the object fields and associates a reference counter with it. An instance of the class can be passed to any function that expects the original pointer. For correct deallocation of the object, you should specialize the Ptr<T>::delete_obj() method. Such specialized methods already exist for the classical OpenCV structures, e.g.:

    // cxoperations.hpp:
    ...
    template<> inline Ptr<IplImage>::delete_obj()
    { cvReleaseImage(&obj); }
    ...

See the Ptr description for more details and other usage scenarios.
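Since such specializations ship with OpenCV for the other classical structures too, CvMat pointers, for instance, can be managed the same way; a minimal sketch:

    // Ptr<CvMat> relies on the predefined delete_obj() specialization,
    // which calls cvReleaseMat() when the reference counter drops to zero
    Ptr<CvMat> m = cvCreateMat(10, 10, CV_32F);
    cvSet(m, cvScalarAll(1.)); // Ptr<CvMat> converts to plain CvMat*
    // no explicit cvReleaseMat() call is needed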

Memory Management Part II. Automatic Data Allocation
With the new interface, not only is explicit memory deallocation not needed anymore, but the memory allocation is often done automatically too. That was demonstrated in the example in the beginning of the chapter when cvtColor was called, and here are some more details. Mat and other array classes provide a method create that allocates a new buffer for the array data if and only if the currently allocated array is not of the required size and type. If a new buffer is needed, the previously allocated buffer is released (by engaging the reference counting mechanism described in the previous section). Now, since it is very quick to check whether the needed memory buffer is already allocated, most new OpenCV functions that have arrays as output parameters call the create method, and this is how the automatic data allocation concept is implemented. Here is the example:

    #include "cv.h"
    #include "highgui.h"

    using namespace cv;

    int main(int, char**)
    {
        VideoCapture cap(0);
        if(!cap.isOpened()) return -1;

        Mat edges;
        namedWindow("edges", 1);
        for(;;)
        {
            Mat frame;
            cap >> frame;
            cvtColor(frame, edges, CV_BGR2GRAY);
            GaussianBlur(edges, edges, Size(7, 7), 1.5, 1.5);
            Canny(edges, edges, 0, 30, 3);
            imshow("edges", edges);
            if(waitKey(30) >= 0) break;
        }
        return 0;
    }

The matrix edges is allocated during the first frame processing and, unless the resolution suddenly changes, the same buffer will be reused for every next frame's edge map.

Anyway, a vast majority of the new-style array processing functions call create for each of the output arrays, with just a few exceptions like mixChannels, RNG::fill and some others. In many cases the output array type and size can be inferred from the input arrays' respective characteristics, but not always. In these rare cases the corresponding functions take separate input parameters that specify the data type and/or size of the output arrays.

Note that this output array allocation semantics is only implemented in the new functions. If you want to pass the new structures to some old OpenCV function, you should first allocate the output arrays using the create method, then make CvMat or IplImage headers and after that call the function.

Algebraic Operations

Just like in v1.x, OpenCV 2.x provides some basic functions operating on matrices, like add, subtract, gemm etc. In addition, it introduces overloaded operators that give the user a convenient algebraic notation, which is nearly as fast as using the functions directly. For example, here is how the least squares problem Ax = b can be solved using normal equations:

    Mat x = (A.t()*A).inv()*(A.t()*b);

The complete list of overloaded operators can be found in Matrix Expressions.

Fast Element Access

Historically, OpenCV provided many different ways to access image and matrix elements, and none of them was both fast and convenient. With the new data structures, OpenCV 2.x introduces a few more alternatives, hopefully more convenient than before. For a detailed description of the operations, please check the Mat and Mat_ descriptions. Here is part of the retro-photo-styling example rewritten (in simplified form) using the element access operations:

    ...
    // split the image into separate color planes
    vector<Mat> planes;
    split(img_yuv, planes);

    // method 1. process Y plane using an iterator
    MatIterator_<uchar> it = planes[0].begin<uchar>(),
                        it_end = planes[0].end<uchar>();
    for(; it != it_end; ++it)
    {
        double v = *it*1.7 + rand()%21-10;
        *it = saturate_cast<uchar>(v*v/255.);
    }

    // method 2. process the first chroma plane using pre-stored row pointer;
    // method 3. process the second chroma plane using
    //           individual element access operations
    for( int y = 0; y < img_yuv.rows; y++ )
    {
        uchar* Uptr = planes[1].ptr<uchar>(y);
        for( int x = 0; x < img_yuv.cols; x++ )
        {
            Uptr[x] = saturate_cast<uchar>((Uptr[x]-128)/2 + 128);
            uchar& Vxy = planes[2].at<uchar>(y, x);
            Vxy = saturate_cast<uchar>((Vxy-128)/2 + 128);
        }
    }

    merge(planes, img_yuv);
    ...

Saturation Arithmetics

In the above sample you may have noticed the saturate_cast operator, and that's how all the pixel processing is done in OpenCV. When a result of an image operation is an 8-bit image with pixel values ranging from 0 to 255, each output pixel value is clipped to this available range:

    I(x,y) = min(max(round(r), 0), 255)

and similar rules are applied to 8-bit signed and 16-bit signed and unsigned types. This "saturation" semantics (different from the usual C language "wrapping" semantics, where the lowest bits are taken) is implemented in every image processing function, from the simple cv::add to cv::cvtColor, cv::resize, cv::filter2D etc. It is not a new feature of OpenCV v2.x; it was there from the very beginning. In the new version this special template operator saturate_cast is introduced to simplify implementation of this semantics in your own functions.
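For illustration, a minimal sketch of the clipping behavior (the values are arbitrary):

    uchar a = saturate_cast<uchar>(-100); // becomes 0
    uchar b = saturate_cast<uchar>(300);  // becomes 255
    uchar c = saturate_cast<uchar>(27.5); // floating-point values are rounded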

Error handling

The modern error handling mechanism in OpenCV uses exceptions, as opposed to the manual stack unrolling used in previous versions. If you want to check some conditions in your code and raise an error if they are not satisfied, use the CV_Error(errcode, description) macro:

    if( scaleValue < 0 || scaleValue > 1000 )
        CV_Error(CV_StsOutOfRange, "The scale value is out of range");

There is also the CV_Assert(condition) macro that raises an exception if the condition is not satisfied, e.g.:

    CV_Assert(mymat.type() == CV_32FC1);

and CV_DbgAssert(condition), which yields no code in the Release configuration.

To handle the errors, use the standard exception handling mechanism:

    try
    {
        ...
    }
    catch( cv::Exception& e )
    {
        const char* err_msg = e.what();
        ...
    }

Instead of cv::Exception you can write std::exception, since the former is derived from the latter.

Then, obviously, to make it all work and not worry about object destruction, try not to use IplImage*, CvMat* and plain pointers in general. Use cv::Mat, std::vector<>, Ptr<> etc. instead.

Threading and Reenterability

OpenCV uses OpenMP to run some time-consuming operations in parallel. Threading can be explicitly controlled by the setNumThreads function. Also, functions and "const" methods of the classes are generally re-enterable, that is, they can be called from different threads asynchronously.
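A minimal sketch of explicit thread control (assuming an OpenMP-enabled build; the thread count is arbitrary):

    cv::setNumThreads(2);        // limit the internal parallelism to 2 threads
    int n = cv::getNumThreads(); // query the currently used number of threads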

The Core Functionality

Basic Structures

DataType

Template "traits" class for other OpenCV primitive data types

    template<typename _Tp> class DataType
    {
        // value_type is always a synonym for _Tp.
        typedef _Tp value_type;

        // intermediate type used for operations on _Tp.
        // it is int for uchar, signed char, unsigned short, signed short and int,
        // float for float, double for double, ...
        typedef <...> work_type;
        // in the case of multi-channel data it is the data type of each channel
        typedef <...> channel_type;
        enum
        {
            // CV_8U ... CV_64F
            depth = DataDepth<channel_type>::value,
            // 1 ...
            channels = <...>,
            // '1u', '4i', '3f', '2d' etc.
            fmt = <...>,
            // CV_8UC3, CV_32FC2 ...
            type = CV_MAKETYPE(depth, channels)
        };
    };

The template class DataType is a descriptive class for OpenCV primitive data types and other types that comply with the following definition. A primitive OpenCV data type is one of unsigned char, bool, signed char, unsigned short, signed short, int, float, double or a tuple of values of one of these types, where all the values in the tuple have the same type. If you are familiar with the OpenCV CvMat type notation, CV_8U ... CV_32FC3, CV_64FC2 etc., then a primitive type can be defined as a type for which you can give a unique identifier in the form CV_<bit-depth>{U|S|F}C<number_of_channels>. A universal OpenCV structure able to store a single instance of such a primitive data type is Vec. Multiple instances of such a type can be stored in a std::vector, Mat, Mat_, SparseMat, SparseMat_ or any other container that is able to store Vec instances.

The class DataType is basically used to provide some description of such primitive data types without adding any fields or methods to the corresponding classes (and it is actually impossible to add anything to primitive C/C++ data types). This technique is known in C++ as class traits. It's not DataType itself that is used, but its specialized versions, such as:

    template<> class DataType<uchar>
    {
        typedef uchar value_type;
        typedef int work_type;
        typedef uchar channel_type;
        enum { channel_type = CV_8U, channels = 1, fmt='u', type = CV_8U };
    };
    ...
    template<typename _Tp> DataType<std::complex<_Tp> >
    {
        typedef std::complex<_Tp> value_type;
        typedef std::complex<_Tp> work_type;
        typedef _Tp channel_type;
        // DataDepth is another helper trait class
        enum { depth = DataDepth<_Tp>::value, channels=2,
               fmt=(channels-1)*256+DataDepth<_Tp>::fmt,
               type=CV_MAKETYPE(depth, channels) };
    };
    ...

The main purpose of the classes is to convert compile-time type information to an OpenCV-compatible data type identifier, for example:

    // allocates 30x40 floating-point matrix
    Mat A(30, 40, DataType<float>::type);

    Mat B = Mat_<std::complex<double> >(3, 3);
    // the statement below will print 6, 2 /* i.e. depth == CV_64F, channels == 2 */
    cout << B.depth() << ", " << B.channels() << endl;

That is, such traits are used to tell OpenCV which data type you are working with, even if such a type is not native to OpenCV (the matrix B initialization above compiles because OpenCV defines the proper specialized template class DataType<complex<_Tp> >). Also, this mechanism is useful (and used in OpenCV this way) for generic algorithm implementations.

Point_

Template class for 2D points

    template<typename _Tp> class Point_
    {
    public:
        typedef _Tp value_type;

        Point_();
        Point_(_Tp _x, _Tp _y);
        Point_(const Point_& pt);
        Point_(const CvPoint& pt);
        Point_(const CvPoint2D32f& pt);
        Point_(const Size_<_Tp>& sz);
        Point_(const Vec<_Tp, 2>& v);
        Point_& operator = (const Point_& pt);
        template<typename _Tp2> operator Point_<_Tp2>() const;
        operator CvPoint() const;
        operator CvPoint2D32f() const;
        operator Vec<_Tp, 2>() const;

        // computes dot-product (this->x*pt.x + this->y*pt.y)
        _Tp dot(const Point_& pt) const;
        // computes dot-product using double-precision arithmetics
        double ddot(const Point_& pt) const;
        // returns true if the point is inside the rectangle "r".
        bool inside(const Rect_<_Tp>& r) const;

        _Tp x, y;
    };

The class represents a 2D point, specified by its coordinates x and y. An instance of the class is interchangeable with the C structures CvPoint and CvPoint2D32f. There is also a cast operator to convert point coordinates to the specified type. The conversion from floating-point coordinates to integer coordinates is done by rounding; in the general case the conversion uses a saturate_cast operation on each of the coordinates. Besides the class members listed in the declaration above, the following operations on points are implemented:

    pt1 = pt2 + pt3;
    pt1 = pt2 - pt3;
    pt1 = pt2 * a;
    pt1 = a * pt2;
    pt1 += pt2; pt1 -= pt2; pt1 *= a;
    double value = norm(pt); // L2 norm
    pt1 == pt2; pt1 != pt2;

For user convenience, the following type aliases are defined:

    typedef Point_<int> Point2i;
    typedef Point2i Point;
    typedef Point_<float> Point2f;
    typedef Point_<double> Point2d;

Here is a short example:

    Point2f a(0.3f, 0.f), b(0.f, 0.4f);
    Point pt = (a + b)*10.f;
    cout << pt.x << ", " << pt.y << endl;

Point3_

Template class for 3D points

    template<typename _Tp> class Point3_
    {
    public:
        typedef _Tp value_type;

        Point3_();
        Point3_(_Tp _x, _Tp _y, _Tp _z);
        Point3_(const Point3_& pt);
        explicit Point3_(const Point_<_Tp>& pt);
        Point3_(const CvPoint3D32f& pt);
        Point3_(const Vec<_Tp, 3>& v);
        Point3_& operator = (const Point3_& pt);
        template<typename _Tp2> operator Point3_<_Tp2>() const;
        operator CvPoint3D32f() const;
        operator Vec<_Tp, 3>() const;

        _Tp dot(const Point3_& pt) const;
        double ddot(const Point3_& pt) const;

        _Tp x, y, z;
    };

The class represents a 3D point, specified by its coordinates x, y and z. An instance of the class is interchangeable with the C structure CvPoint3D32f. Similarly to Point_, the 3D points' coordinates can be converted to another type, and the vector arithmetic and comparison operations are also supported. The following type aliases are available:

    typedef Point3_<int> Point3i;
    typedef Point3_<float> Point3f;
    typedef Point3_<double> Point3d;
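For example, a small sketch of 3D point arithmetic (the values are arbitrary):

    Point3f u(1.f, 2.f, 3.f), v(4.f, 5.f, 6.f);
    Point3f s = u + v;  // component-wise sum: (5, 7, 9)
    float d = u.dot(v); // 1*4 + 2*5 + 3*6 = 32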

Size_

Template class for specifying image or rectangle size.

    template<typename _Tp> class Size_
    {
    public:
        typedef _Tp value_type;

        Size_();
        Size_(_Tp _width, _Tp _height);
        Size_(const Size_& sz);
        Size_(const CvSize& sz);
        Size_(const CvSize2D32f& sz);
        Size_(const Point_<_Tp>& pt);
        Size_& operator = (const Size_& sz);
        _Tp area() const;

        operator Size_<int>() const;
        operator Size_<float>() const;
        operator Size_<double>() const;
        operator CvSize() const;
        operator CvSize2D32f() const;

        _Tp width, height;
    };

The class Size_ is similar to Point_, except that the two members are called width and height instead of x and y. The structure can be converted to and from the old OpenCV structures CvSize and CvSize2D32f. The same set of arithmetic and comparison operations as for Point_ is available.

OpenCV defines the following type aliases:

    typedef Size_<int> Size2i;
    typedef Size2i Size;
    typedef Size_<float> Size2f;

Rect_

Template class for 2D rectangles

    template<typename _Tp> class Rect_
    {
    public:
        typedef _Tp value_type;

        Rect_();
        Rect_(_Tp _x, _Tp _y, _Tp _width, _Tp _height);
        Rect_(const Rect_& r);
        Rect_(const CvRect& r);
        // (x, y) <- org, (width, height) <- sz
        Rect_(const Point_<_Tp>& org, const Size_<_Tp>& sz);
        // (x, y) <- min(pt1, pt2), (width, height) <- max(pt1, pt2) - (x, y)
        Rect_(const Point_<_Tp>& pt1, const Point_<_Tp>& pt2);

        Rect_& operator = ( const Rect_& r );
        // returns Point_<_Tp>(x, y)
        Point_<_Tp> tl() const;
        // returns Point_<_Tp>(x+width, y+height)
        Point_<_Tp> br() const;

        // returns Size_<_Tp>(width, height)
        Size_<_Tp> size() const;
        // returns width*height
        _Tp area() const;

        operator Rect_<int>() const;
        operator Rect_<float>() const;
        operator Rect_<double>() const;
        operator CvRect() const;

        // x <= pt.x && pt.x < x + width &&
        // y <= pt.y && pt.y < y + height ? true : false
        bool contains(const Point_<_Tp>& pt) const;

        _Tp x, y, width, height;
    };

The rectangle is described by the coordinates of the top-left corner (which is the default interpretation of Rect_::x and Rect_::y in OpenCV; though, in your algorithms you may count x and y from the bottom-left corner), the rectangle width and height.

Another assumption OpenCV usually makes is that the top and left boundary of the rectangle are inclusive, while the right and bottom boundaries are not. For example, the method Rect_::contains returns true if

    x <= pt.x < x + width,
    y <= pt.y < y + height

And virtually every loop over an image ROI in OpenCV (where the ROI is specified by Rect_<int>) is implemented as:

    for(int y = roi.y; y < roi.y + roi.height; y++)
        for(int x = roi.x; x < roi.x + roi.width; x++)
        {
            // ...
        }

In addition to the class members, the following operations on rectangles are implemented:

• rect = rect + point, rect = rect - point (shifting rectangle by a certain offset)
• rect = rect + size, rect = rect - size (expanding or shrinking rectangle by a certain amount)
• rect += point, rect -= point, rect += size, rect -= size (augmenting operations)
• rect = rect1 & rect2 (rectangle intersection)
• rect = rect1 | rect2 (minimum area rectangle containing rect1 and rect2)
• rect &= rect1, rect |= rect1 (and the corresponding augmenting operations)
• rect == rect1, rect != rect1 (rectangle comparison)

Example. Here is how the partial ordering on rectangles (rect1 is contained in rect2) can be established:

    template<typename _Tp> inline bool
    operator <= (const Rect_<_Tp>& r1, const Rect_<_Tp>& r2)
    {
        return (r1 & r2) == r1;
    }

For user convenience, the following type alias is available:

    typedef Rect_<int> Rect;
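For instance, a short sketch of these operations (the coordinates are arbitrary):

    Rect r1(0, 0, 100, 100), r2(50, 50, 100, 100);
    Rect ri = r1 & r2; // intersection: Rect(50, 50, 50, 50)
    Rect ru = r1 | r2; // minimal rectangle containing both
    bool b = r1.contains(Point(99, 99)); // true, the boundaries are half-open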

RotatedRect

Possibly rotated rectangle

    class RotatedRect
    {
    public:
        // constructors
        RotatedRect();
        RotatedRect(const Point2f& _center, const Size2f& _size, float _angle);
        RotatedRect(const CvBox2D& box);

        // returns minimal up-right rectangle that contains the rotated rectangle
        Rect boundingRect() const;
        // backward conversion to CvBox2D
        operator CvBox2D() const;

        // mass center of the rectangle
        Point2f center;
        // size
        Size2f size;
        // rotation angle in degrees
        float angle;
    };

The class RotatedRect replaces the old CvBox2D and is fully compatible with it.

TermCriteria

Termination criteria for iterative algorithms

    class TermCriteria
    {
    public:
        enum { COUNT=1, MAX_ITER=COUNT, EPS=2 };

        // constructors
        TermCriteria();
        // type can be MAX_ITER, EPS or MAX_ITER+EPS.
        // type = MAX_ITER means that only the number of iterations does matter;
        // type = EPS means that only the required precision (epsilon) does matter
        //   (though, most algorithms put some limit on the number of iterations anyway)
        // type = MAX_ITER + EPS means that the algorithm stops when
        // either the specified number of iterations is made,
        // or when the specified accuracy is achieved, whatever happens first.
        TermCriteria(int _type, int _maxCount, double _epsilon);
        TermCriteria(const CvTermCriteria& criteria);
        operator CvTermCriteria() const;

        int type;
        int maxCount;
        double epsilon;
    };

The class TermCriteria replaces the old CvTermCriteria and is fully compatible with it.

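For example, a minimal construction sketch (the iteration count and accuracy are arbitrary):

    // stop after 30 iterations or when the achieved accuracy
    // drops below 0.01, whatever happens first
    TermCriteria criteria(TermCriteria::MAX_ITER + TermCriteria::EPS, 30, 0.01);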
Matx

Template class for small matrices

    template<typename T, int m, int n> class Matx
    {
    public:
        typedef T value_type;
        enum { depth = DataDepth<T>::value, channels = m*n,
               type = CV_MAKETYPE(depth, channels) };

        // various methods
        ...

        T val[m*n];
    };

    typedef Matx<float, 1, 2> Matx12f;
    typedef Matx<double, 1, 2> Matx12d;
    ...
    typedef Matx<float, 1, 6> Matx16f;
    typedef Matx<double, 1, 6> Matx16d;

    typedef Matx<float, 2, 1> Matx21f;
    typedef Matx<double, 2, 1> Matx21d;
    ...
    typedef Matx<float, 6, 1> Matx61f;
    typedef Matx<double, 6, 1> Matx61d;

    typedef Matx<float, 2, 2> Matx22f;
    typedef Matx<double, 2, 2> Matx22d;
    ...
    typedef Matx<float, 6, 6> Matx66f;
    typedef Matx<double, 6, 6> Matx66d;

The class represents small matrices, whose type and size are known at compile time. The elements of a matrix M are accessible using the M(i,j) notation, and most of the common matrix operations (see also Matrix Expressions) are available. If you need to do some operation on Matx that is not implemented, it is easy to convert the matrix to Mat and backwards:

    Matx33f m(1, 2, 3,
              4, 5, 6,
              7, 8, 9);
    cout << sum(Mat(m*m.t())) << endl;

If you need a more flexible type, use Mat.

Vec

Template class for short numerical vectors

    template<typename T, int cn> class Vec : public Matx<T, cn, 1>
    {
    public:
        typedef T value_type;
        enum { depth = DataDepth<T>::value, channels = cn,
               type = CV_MAKETYPE(depth, channels) };

        // various methods
        ...
    };

    typedef Vec<uchar, 2> Vec2b;
    typedef Vec<uchar, 3> Vec3b;
    typedef Vec<uchar, 4> Vec4b;

    typedef Vec<short, 2> Vec2s;
    typedef Vec<short, 3> Vec3s;
    typedef Vec<short, 4> Vec4s;

    typedef Vec<int, 2> Vec2i;
    typedef Vec<int, 3> Vec3i;
    typedef Vec<int, 4> Vec4i;

    typedef Vec<float, 2> Vec2f;
    typedef Vec<float, 3> Vec3f;
    typedef Vec<float, 4> Vec4f;
    typedef Vec<float, 6> Vec6f;

    typedef Vec<double, 2> Vec2d;
    typedef Vec<double, 3> Vec3d;
    typedef Vec<double, 4> Vec4d;
    typedef Vec<double, 6> Vec6d;

Vec is a partial case of Matx. It is possible to convert Vec<T,2> to/from Point_, Vec<T,3> to/from Point3_, and Vec<T,4> to CvScalar or Scalar. The elements of Vec are accessed using operator[]. All the expected vector operations are implemented too:

• v1 = v2 + v3, v1 = v2 - v3
• v1 = v2 * scale, v1 = scale * v2, v1 = -v2
  (plus the corresponding augmenting operations; note that these operations apply to each computed vector component)
• v1 == v2, v1 != v2
• norm(v1) (L2 norm)

The class Vec is commonly used to describe pixel types of multi-channel arrays, see the Mat_ description.

Scalar_

4-element vector

    template<typename _Tp> class Scalar_ : public Vec<_Tp, 4>
    {
    public:
        Scalar_();
        Scalar_(_Tp v0);
        Scalar_(_Tp v0, _Tp v1, _Tp v2=0, _Tp v3=0);
        Scalar_(const CvScalar& s);

        static Scalar_<_Tp> all(_Tp v0);
        operator CvScalar() const;

        template<typename T2> operator Scalar_<T2>() const;

        Scalar_<_Tp> mul(const Scalar_<_Tp>& t, double scale=1 ) const;
        template<typename T2> void convertTo(T2* buf, int channels,
                                             int unroll_to=0) const;
    };

    typedef Scalar_<double> Scalar;

The template class Scalar_ and its double-precision instantiation Scalar represent a 4-element vector. Being derived from Vec<_Tp, 4>, they can be used as typical 4-element vectors, but in addition they can be converted to/from CvScalar. The type Scalar is widely used in OpenCV for passing pixel values, and it is a drop-in replacement for CvScalar that was used for the same purpose in the earlier versions of OpenCV.
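For instance, a minimal sketch of using Vec3b as a pixel type together with Scalar (the image size and colors are arbitrary):

    Mat img(240, 320, CV_8UC3, Scalar(255, 0, 0)); // solid blue BGR image
    Vec3b& pix = img.at<Vec3b>(10, 10);            // reference to a single pixel
    pix[2] = 255;                                  // also set its red channel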

Range

Specifies a continuous subsequence (a.k.a. slice) of a sequence.

    class Range
    {
    public:
        Range();
        Range(int _start, int _end);
        Range(const CvSlice& slice);
        int size() const;
        bool empty() const;
        static Range all();
        operator CvSlice() const;

        int start, end;
    };

The class is used to specify a row or column span in a matrix (Mat), and for many other purposes. Range(a, b) is basically the same as a:b in Matlab or a..b in Python. As in Python, start is the inclusive left boundary of the range, and end is the exclusive right boundary of the range. Such a half-opened interval is usually denoted as [start, end).

The static method Range::all() returns some special variable that means "the whole sequence" or "the whole range", just like " : " in Matlab or " ... " in Python. All the methods and functions in OpenCV that take Range support this special Range::all() value; but of course, in the case of your own custom processing you will probably have to check and handle it explicitly:

    void my_function(..., const Range& r, ...)
    {
        if(r == Range::all()) {
            // process all the data
        }
        else {
            // process [r.start, r.end)
        }
    }
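For example, a short sketch of selecting a row span (the matrix size is arbitrary):

    Mat A = Mat::zeros(10, 10, CV_32F);
    Mat middle = A(Range(4, 6), Range::all()); // rows 4-5, all columns; no copy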

Ptr

A template class for smart reference-counting pointers

    template<typename _Tp> class Ptr
    {
    public:
        // default constructor
        Ptr();
        // constructor that wraps the object pointer
        Ptr(_Tp* _obj);
        // destructor: calls release()
        ~Ptr();
        // copy constructor; increments ptr's reference counter
        Ptr(const Ptr& ptr);
        // assignment operator; decrements own reference counter
        // (with release()) and increments ptr's reference counter
        Ptr& operator = (const Ptr& ptr);
        // increments reference counter
        void addref();
        // decrements reference counter; when it becomes 0,
        // delete_obj() is called
        void release();
        // user-specified custom object deletion operation.
        // by default, "delete obj;" is called
        void delete_obj();
        // returns true if obj == 0
        bool empty() const;

        // provide access to the object fields and methods
        _Tp* operator -> ();
        const _Tp* operator -> () const;

        // return the underlying object pointer;
        // thanks to the methods, the Ptr<_Tp> can be
        // used instead of _Tp*
        operator _Tp* ();
        operator const _Tp*() const;
    protected:
        // the encapsulated object pointer
        _Tp* obj;
        // the associated reference counter
        int* refcount;
    };

The class Ptr<_Tp> is a template class that wraps pointers of the corresponding type. It is similar to shared_ptr that is a part of the Boost library (http://www.boost.org/doc/libs/1_40_0/libs/smart_ptr/shared_ptr.htm) and also a part of the C++0x standard.

By using this class you can get the following capabilities:

• default constructor, copy constructor and assignment operator for an arbitrary C++ class or a C structure. For some objects, like files, windows, mutexes, sockets etc., copy constructors are absent and not easy to implement. For some other objects, like complex classifiers in OpenCV, copy constructors are difficult to define. Finally, some of the complex OpenCV and your own data structures may have been written in C. However, copy constructors and default constructors can simplify programming a lot; besides, they are often required (e.g. by STL containers). By wrapping a pointer to such a complex object TObj into Ptr<TObj> you will automatically get all of the necessary constructors and the assignment operator.

• automatic destruction, even for C structures. See the example below with FILE*.

• heterogeneous collections of objects. The standard STL and most other C++ and OpenCV containers can only store objects of the same type and the same size. The classical solution to store objects of different types in the same container is to store pointers to the base class base_class_t* instead, but then you lose the automatic memory management. By using Ptr<base_class_t>() instead of the raw pointers, you can solve the problem.

• all the above-mentioned operations running very fast, regardless of the data size, i.e. as "O(1)" operations. Indeed, while some structures, like std::vector, provide a copy constructor and an assignment operator, the operations may take considerable time if the data structures are big. But if the structures are put into Ptr<>, the overhead becomes small and independent of the data size.

The class Ptr treats the wrapped object as a black box: the reference counter is allocated and managed separately. The only thing the pointer class needs to know about the object is how to deallocate it. This knowledge is encapsulated in the Ptr::delete_obj() method, which is called when the reference counter becomes 0. If the object is a C++ class instance, no additional coding is needed, because the default implementation of this method calls "delete obj;". However, if the object is deallocated in a different way, then the specialized method should be created. For example, if you want to wrap FILE, the delete_obj may be implemented as follows:

    template<> inline void Ptr<FILE>::delete_obj()
    {
        fclose(obj); // no need to clear the pointer afterwards,
                     // it is done externally.
    }
    ...

    // now use it:
    Ptr<FILE> f(fopen("myfile.txt", "r"));
    if(f.empty())
        throw ...;
    fprintf(f, ...);
    ...
    // the file will be closed automatically by the Ptr<FILE> destructor.

Note: The reference increment/decrement operations are implemented as atomic operations, and therefore it is normally safe to use the classes in multi-threaded applications. The same is true for Mat and other C++ OpenCV classes that operate on the reference counters.

Mat

OpenCV C++ n-dimensional dense array class

    class CV_EXPORTS Mat
    {
    public:
        // ... a lot of methods ...
        ...

        /*! includes several bit-fields:
             - the magic signature
             - continuity flag
             - depth
             - number of channels
         */
        int flags;
        //! the array dimensionality, >= 2
        int dims;
        //! the number of rows and columns or (-1, -1) when the array
        //! has more than 2 dimensions
        int rows, cols;
        //! pointer to the data
        uchar* data;

        //! pointer to the reference counter;
        //! when array points to user-allocated data, the pointer is NULL
        int* refcount;

        // other members
        ...
    };

The class Mat represents an n-dimensional dense numerical single-channel or multi-channel array. It can be used to store real or complex-valued vectors and matrices, grayscale or color images, voxel volumes, vector fields, point clouds, tensors, histograms (though, very high-dimensional histograms may be better stored in a SparseMat). The data layout of the array M is defined by the array M.step[], so that the address of element (i_0, ..., i_{M.dims-1}), where 0 <= i_k < M.size[k], is computed as:

    addr(M_{i_0,...,i_{M.dims-1}}) = M.data + M.step[0]*i_0 + M.step[1]*i_1
                                     + ... + M.step[M.dims-1]*i_{M.dims-1}

In the case of a 2-dimensional array the above formula is reduced to:

    addr(M_{i,j}) = M.data + M.step[0]*i + M.step[1]*j

Note that M.step[i] >= M.step[i+1] (in fact, M.step[i] >= M.step[i+1]*M.size[i+1]), that is, 2-dimensional matrices are stored row-by-row, 3-dimensional matrices are stored plane-by-plane etc. M.step[M.dims-1] is minimal and always equal to the element size M.elemSize().

That is, the data layout in Mat is fully compatible with the CvMat, IplImage and CvMatND types from OpenCV 1.x, as well as with the majority of dense array types from the standard toolkits and SDKs, such as Numpy (ndarray), Win32 (independent device bitmaps) etc., that is, with any other array that uses "steps", a.k.a. "strides", to compute the position of a pixel. Because of such compatibility, it is possible to make a Mat header for user-allocated data and process it in-place using OpenCV functions.

There are many different ways to create a Mat object. Here are some popular ones:

• using the create(nrows, ncols, type) method or the similar constructor Mat(nrows, ncols, type[, fillValue]). A new array of the specified size and specified type will be allocated. type has the same meaning as in cvCreateMat(), e.g. CV_8UC1 means an 8-bit single-channel array, CV_32FC2 means a 2-channel (i.e. complex) floating-point array etc:

    // make 7x7 complex matrix filled with 1+3j.
    cv::Mat M(7, 7, CV_32FC2, Scalar(1, 3));
    // and now turn M to 100x60 15-channel 8-bit matrix.
    // The old content will be deallocated
    M.create(100, 60, CV_8UC(15));

As noted in the introduction of this chapter, create() will only allocate a new array when the current array shape or type are different from the specified.

• similarly to the above, you can create a multi-dimensional array:

    // create 100x100x100 8-bit array
    int sz[] = {100, 100, 100};
    cv::Mat bigCube(3, sz, CV_8U, Scalar::all(0));

Note that if you pass the number of dimensions = 1 to the Mat constructor, the created array will be 2-dimensional, with the number of columns set to 1. That's why Mat::dims is always >= 2 (it can also be 0 when the array is empty).

• by using a copy constructor or assignment operator, where on the right side it can be an array or an expression, see below. Again, as noted in the introduction, array assignment is an O(1) operation because it only copies the header and increases the reference counter. The Mat::clone() method can be used to get a full (a.k.a. deep) copy of the array when you need it.

• by constructing a header for a part of another array. It can be a single row, single column, several rows, several columns, a rectangular region in the array (called a minor in algebra) or a diagonal. Such operations are also O(1), because the new header will reference the same data. You can actually modify a part of the array using this feature, e.g.:

    // add the 5-th row, multiplied by 3, to the 3rd row
    M.row(3) = M.row(3) + M.row(5)*3;

    // now copy the 7-th column to the 1-st column
    // M.col(1) = M.col(7); // this will not work
    Mat M1 = M.col(1);
    M.col(7).copyTo(M1);

    // create new 320x240 image
    cv::Mat img(Size(320, 240), CV_8UC3);
    // select a roi
    cv::Mat roi(img, Rect(10, 10, 100, 100));
    // fill the ROI with (0,255,0) (which is green in RGB space);
    // the original 320x240 image will be modified
    roi = Scalar(0, 255, 0);

Thanks to the additional datastart and dataend members, it is possible to compute the relative sub-array position in the main "container" array using locateROI():

    Mat A = Mat::eye(10, 10, CV_32S);
    // extracts A columns, 1 (inclusive) to 3 (exclusive).
    Mat B = A(Range::all(), Range(1, 3));
    // extracts B rows, 5 (inclusive) to 9 (exclusive),
    // that is, C ~ A(Range(5, 9), Range(1, 3))
    Mat C = B(Range(5, 9), Range::all());
    Size size; Point ofs;
    C.locateROI(size, ofs);
    // size will be (width=10,height=10) and the ofs will be (x=1, y=5)

As in the case of whole matrices, if you need a deep copy, use the clone() method of the extracted sub-matrices.

• by making a header for user-allocated data. It can be useful for processing "foreign" data using OpenCV (e.g. when you implement a DirectShow filter or a processing module for gstreamer etc.), e.g.:

    void process_video_frame(const unsigned char* pixels,
                             int width, int height, int step)
    {
        cv::Mat img(height, width, CV_8UC3, (void*)pixels, step);
        cv::GaussianBlur(img, img, cv::Size(7, 7), 1.5, 1.5);
    }

Partial yet very common cases of this "user-allocated data" case are conversions from CvMat and IplImage to Mat. For this purpose there are special constructors taking pointers to CvMat or IplImage and the optional flag indicating whether to copy the data or not.

Backward conversion from Mat to CvMat or IplImage is provided via the cast operators Mat::operator CvMat() const and Mat::operator IplImage(). The operators do not copy the data:

    IplImage* img = cvLoadImage("greatwave.jpg", 1);
    Mat mtx(img);       // convert IplImage* -> cv::Mat
    CvMat oldmat = mtx; // convert cv::Mat -> CvMat
    CV_Assert(oldmat.cols == img->width && oldmat.rows == img->height &&
              oldmat.data.ptr == (uchar*)img->imageData &&
              oldmat.step == img->widthStep);

• for quick initialization of small matrices and/or super-fast element access:

    double m[3][3] = {{a, b, c}, {d, e, f}, {g, h, i}};
    cv::Mat M = cv::Mat(3, 3, CV_64F, m).inv();

• by using MATLAB-style array initializers, zeros(), ones(), eye(), e.g.:

    // create a double-precision identity matrix and add it to M.
    M += Mat::eye(M.rows, M.cols, CV_64F);

• by using a comma-separated initializer:

    // create 3x3 double-precision identity matrix
    Mat M = (Mat_<double>(3, 3) << 1, 0, 0, 0, 1, 0, 0, 0, 1);

Here we first call the constructor of the Mat_ class (that we describe further) with the proper parameters, and then we just put the << operator followed by comma-separated values that can be constants, variables, expressions etc. Also, note the extra parentheses that are needed to avoid compiler errors.

Once the array is created, it will be automatically managed using the reference-counting mechanism (unless the array header is built on top of user-allocated data, in which case you should handle the data by yourself). The array data will be deallocated when no one points to it; if you want to release the data pointed to by an array header before the array destructor is called, use Mat::release().

The next important thing to learn about the array class is element access. Earlier it was shown how to compute the address of each array element. Normally, you do not need to use the formula directly in your code. If you know the array element type (which can be retrieved using the method Mat::type()), you can access an element M_{ij} of a 2-dimensional array as:

    M.at<double>(i, j) += 1.f;

assuming that M is a double-precision floating-point array. There are several variants of the method at for a different number of dimensions.

If you need to process a whole row of a 2D array, the most efficient way is to get the pointer to the row first, and then just use the plain C operator []:

    // compute sum of positive matrix elements
    // (assuming that M is double-precision matrix)
    double sum=0;
    for(int i = 0; i < M.rows; i++)
    {
        const double* Mi = M.ptr<double>(i);
        for(int j = 0; j < M.cols; j++)
            sum += std::max(Mi[j], 0.);
    }

Some operations, like the above one, do not actually depend on the array shape; they just process the elements of an array one by one (or elements from multiple arrays that have the same coordinates, e.g. array addition). Such operations are called element-wise, and it makes sense to check whether all the input/output arrays are continuous, i.e. have no gaps in the end of each row, and if yes, process them as a single long row:

    // compute sum of positive matrix elements, optimized variant
    double sum=0;
    int cols = M.cols, rows = M.rows;
    if(M.isContinuous())
    {
        cols *= rows;
        rows = 1;
    }
    for(int i = 0; i < rows; i++)
    {
        const double* Mi = M.ptr<double>(i);
        for(int j = 0; j < cols; j++)
            sum += std::max(Mi[j], 0.);
    }

In the case of a continuous matrix the outer loop body will be executed just once, so the overhead will be smaller, which will be especially noticeable in the case of small matrices.

Finally, there are STL-style iterators that are smart enough to skip gaps between successive rows:

    // compute sum of positive matrix elements, iterator-based variant
    double sum=0;
    MatConstIterator_<double> it = M.begin<double>(), it_end = M.end<double>();
    for(; it != it_end; ++it)
        sum += std::max(*it, 0.);

The matrix iterators are random-access iterators, so they can be passed to any STL algorithm, including std::sort().

Matrix Expressions

This is a list of implemented matrix operations that can be combined in arbitrarily complex expressions (here A, B stand for matrices (Mat), s for a scalar (Scalar), alpha for a real-valued scalar (double)):

• addition, subtraction, negation: A+B, A-B, A+s, A-s, s+A, s-A, -A
• scaling: A*alpha, A/alpha
• per-element multiplication and division: A.mul(B), A/B, alpha/A
• matrix multiplication: A*B
• transposition: A.t()
• matrix inversion and pseudo-inversion, solving linear systems and least-squares problems: A.inv([method]), A.inv([method])*B
• comparison: A cmpop B, A cmpop alpha, alpha cmpop A, where cmpop is one of >, >=, ==, !=, <=, <. The result of comparison is an 8-bit single channel mask, whose elements are set to 255 (if the particular element or pair of elements satisfy the condition) and 0 otherwise.
• bitwise logical operations: A & B, A & s, A | B, A | s, A ^ B, A ^ s, ~A
• element-wise minimum and maximum: min(A, B), min(A, alpha), max(A, B), max(A, alpha)
• element-wise absolute value: abs(A)
• cross-product, dot-product: A.cross(B), A.dot(B)
• any function of matrix or matrices and scalars that returns a matrix or a scalar, such as norm(), mean(), sum(), countNonZero(), trace(), determinant(), repeat() etc.
• matrix initializers (eye(), zeros(), ones()), matrix comma-separated initializers, matrix constructors and operators that extract sub-matrices (see the Mat description).
• Mat_<destination_type>() constructors to cast the result to the proper type.

Note, however, that comma-separated initializers and probably some other operations may require additional explicit Mat() or Mat_<T>() constructor calls to resolve possible ambiguity.
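For instance, a short sketch combining several of the listed operations (the matrices are arbitrary):

    Mat A = Mat::eye(3, 3, CV_64F), B = Mat::ones(3, 3, CV_64F);
    Mat C = 2*A + B.t();      // scaling, addition, transposition
    Mat mask = C > 1.5;       // 8-bit mask: 255 where the condition holds
    double total = sum(C)[0]; // a scalar-returning function of a matrix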

Below is the formal description of the Mat methods.

cv::Mat::Mat

    (1) Mat::Mat()
    (2) Mat::Mat(int rows, int cols, int type)
    (3) Mat::Mat(Size size, int type)
    (4) Mat::Mat(int rows, int cols, int type, const Scalar& s)
    (5) Mat::Mat(Size size, int type, const Scalar& s)
    (6) Mat::Mat(const Mat& m)
    (7) Mat::Mat(int rows, int cols, int type, void* data, size_t step=AUTO_STEP)
    (8) Mat::Mat(Size size, int type, void* data, size_t step=AUTO_STEP)
    (9) Mat::Mat(const Mat& m, const Range& rowRange, const Range& colRange)
    (10) Mat::Mat(const Mat& m, const Rect& roi)
    (11) Mat::Mat(const CvMat* m, bool copyData=false)
    (12) Mat::Mat(const IplImage* img, bool copyData=false)
    (13) template<typename T, int n> explicit Mat::Mat(const Vec<T, n>& vec, bool copyData=true)
    (14) template<typename T, int m, int n> explicit Mat::Mat(const Matx<T, m, n>& vec, bool copyData=true)
    (15) template<typename T> explicit Mat::Mat(const vector<T>& vec, bool copyData=false)
    (16) Mat::Mat(const MatExpr& expr)
    (17) Mat::Mat(int ndims, const int* sizes, int type)
    (18) Mat::Mat(int ndims, const int* sizes, int type, const Scalar& s)
    (19) Mat::Mat(int ndims, const int* sizes, int type, void* data, const size_t* steps=0)
    (20) Mat::Mat(const Mat& m, const Range* ranges)

Various array constructors

Parameters:

• ndims – The array dimensionality
• rows – The number of rows in a 2D array
• cols – The number of columns in a 2D array
• size – The 2D array size: Size(cols, rows). Note that in the Size() constructor the number of rows and the number of columns go in the reverse order.
• sizes – The array of integers, specifying the n-dimensional array shape
• type – The array type. Use CV_8UC1, ..., CV_64FC4 to create 1-4 channel matrices, or CV_8UC(n), ..., CV_64FC(n) to create multi-channel (up to CV_MAX_CN channels) matrices
• s – The optional value to initialize each matrix element with. To set all the matrix elements to the particular value after the construction, use the assignment operator Mat::operator=(const Scalar& value).
• data – Pointer to the user data. Matrix constructors that take data and step parameters do not allocate matrix data. Instead, they just initialize the matrix header that points to the specified data, i.e. no data is copied. This operation is very efficient and can be used to process external data using OpenCV functions. The external data is not automatically deallocated; the user should take care of it.
• step – The data buddy. This optional parameter specifies the number of bytes that each matrix row occupies. The value should include the padding bytes in the end of each row, if any. If the parameter is missing (set to cv::AUTO_STEP), no padding is assumed and the actual step is calculated as cols*elemSize(), see Mat::elemSize().
• steps – The array of ndims-1 steps in the case of a multi-dimensional array (the last step is always set to the element size). If not specified, the matrix is assumed to be continuous.
• m – The array that (in whole, or partly) is assigned to the constructed matrix. No data is copied by these constructors. Instead, the header pointing to m data, or its sub-array, is constructed and the associated reference counter, if any, is incremented. That is, when you modify the matrix formed using such a constructor, you will also modify the corresponding elements of m. If you want to have an independent copy of the sub-array, use Mat::clone().
• img – Pointer to the old-style IplImage image structure. By default, the data is shared between the original image and the new matrix, but when copyData is set, the full copy of the image data is created.
• vec – STL vector, whose elements will form the matrix. The matrix will have a single column and the number of rows equal to the number of vector elements. The type of the matrix will match the type of vector elements. The constructor can handle arbitrary types, for which there is a properly declared DataType, i.e. the vector elements must be primitive numbers or uni-type numerical tuples of numbers. Mixed-type structures are not supported, of course. Note that the corresponding constructor is explicit, meaning that STL vectors are not automatically converted to Mat instances; you should write Mat(vec) explicitly. Another obvious note: unless you copied the data into the matrix (copyData=true), no new elements should be added to the vector, because it can potentially yield vector data reallocation, and thus the matrix data pointer will become invalid.
• copyData – Specifies whether the underlying data of the STL vector, or the old-style CvMat or IplImage, should be copied to (true) or shared with (false) the newly constructed matrix. When the data is copied, the allocated buffer will be managed using Mat's reference counting mechanism. While when the data is shared, the reference counter will be NULL, and you should not deallocate the data until the matrix is destructed.
• rowRange – The range of the m's rows to take. As usual, the range start is inclusive and the range end is exclusive. Use Range::all() to take all the rows.
• colRange – The range of the m's columns to take. Use Range::all() to take all the columns.
• ranges – The array of selected ranges of m along each dimensionality
• expr – Matrix expression. See Matrix Expressions.

These are various constructors that form a matrix. As noticed in the assignment operator description, often the default constructor is enough, and the proper matrix will be allocated by an OpenCV function. The constructed matrix can further be assigned to another matrix or matrix expression, in which case the old content is dereferenced, or be allocated with Mat::create.

cv::Mat::~Mat

    Mat::~Mat()

Matrix destructor

The matrix destructor calls Mat::release().

cv::Mat::operator =

    Mat& Mat::operator = (const Mat& m)
    Mat& Mat::operator = (const MatExpr_Base& expr)
    Mat& operator = (const Scalar& s)

Matrix assignment operators

Parameters:

• m – The assigned, right-hand-side matrix. Matrix assignment is an O(1) operation, that is, no data is copied. Instead, the data is shared and the reference counter, if any, is incremented. Before assigning new data, the old data is dereferenced via Mat::release.
• expr – The assigned matrix expression object. As opposed to the first form of the assignment operation, the second form can reuse an already allocated matrix if it has the right size and type to fit the matrix expression result. It is automatically handled by the real function that the matrix expression is expanded to. For example, C=A+B is expanded to cv::add(A, B, C), and add() will take care of the automatic C reallocation.
• s – The scalar, assigned to each matrix element. The matrix size or type is not changed.

These are the available assignment operators, and they all are very different, so, please, look at the operator parameters description.

cv::Mat::operator MatExpr

    Mat::operator MatExpr_<Mat, Mat>() const

Mat-to-MatExpr cast operator

The cast operator should not be called explicitly. It is used internally by the Matrix Expressions engine.

cv::Mat::row

    Mat Mat::row(int i) const

Makes a matrix header for the specified matrix row

Parameters:

• i – the 0-based row index

The method makes a new header for the specified matrix row and returns it. This is an O(1) operation, regardless of the matrix size. The underlying data of the new matrix will be shared with the original matrix. Here is the example of one of the classical basic matrix processing operations, axpy, used by LU and many other algorithms:

    inline void matrix_axpy(Mat& A, int i, int j, double alpha)
    {
        A.row(i) += A.row(j)*alpha;
    }

Important note. In the current implementation the following code will not work as expected:

    Mat A;
    ...
    A.row(i) = A.row(j); // will not work

This is because A.row(i) forms a temporary header, which is further assigned another header. Remember, each of these operations is O(1), that is, no data is copied. Thus, the above assignment will have absolutely no effect, while you may have expected the j-th row being copied to the i-th row. To achieve that, you should either turn this simple assignment into an expression, or use the Mat::copyTo method:

    Mat A;
    ...
    // works, but looks a bit obscure.
    A.row(i) = A.row(j) + 0;

    // this is a bit longer, but the recommended method.
    Mat Ai = A.row(i);
    A.row(j).copyTo(Ai);

cv::Mat::col

    Mat Mat::col(int j) const

Makes a matrix header for the specified matrix column

Parameters:

• j – the 0-based column index

The method makes a new header for the specified matrix column and returns it. This is an O(1) operation, regardless of the matrix size. The underlying data of the new matrix will be shared with the original matrix. See also the Mat::row description.

cv::Mat::rowRange

    Mat Mat::rowRange(int startrow, int endrow) const
    Mat Mat::rowRange(const Range& r) const

Makes a matrix header for the specified row span

Parameters:

• startrow – the 0-based start index of the row span
• endrow – the 0-based ending index of the row span
• r – The Range() structure containing both the start and the end indices

The method makes a new header for the specified row span of the matrix. Similarly to Mat::row() and Mat::col(), this is an O(1) operation.

cv::Mat::colRange

    Mat Mat::colRange(int startcol, int endcol) const
    Mat Mat::colRange(const Range& r) const

Makes a matrix header for the specified column span

Parameters:

• startcol – the 0-based start index of the column span
• endcol – the 0-based ending index of the column span
• r – The Range() structure containing both the start and the end indices

The method makes a new header for the specified column span of the matrix. Similarly to Mat::row() and Mat::col(), this is an O(1) operation.

cv::Mat::diag

    Mat Mat::diag(int d) const
    static Mat Mat::diag(const Mat& matD)

Extracts a diagonal from a matrix, or creates a diagonal matrix.

Parameters:

• d – index of the diagonal, with the following meaning:
    • d=0 the main diagonal
    • d>0 a diagonal from the lower half, e.g. d=1 means the diagonal immediately below the main one
    • d<0 a diagonal from the upper half, e.g. d=-1 means the diagonal immediately above the main one
• matD – single-column matrix that will form the diagonal matrix

The method makes a new header for the specified matrix diagonal. The new matrix will be represented as a single-column matrix. Similarly to Mat::row() and Mat::col(), this is an O(1) operation.
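For example, a small sketch of both forms (the matrix values are arbitrary):

    Mat M = (Mat_<double>(3, 3) << 1, 2, 3,
                                   4, 5, 6,
                                   7, 8, 9);
    Mat d = M.diag();     // header for the main diagonal (1, 5, 9); no copy
    Mat D = Mat::diag(d); // builds a 3x3 diagonal matrix from the column d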

Parameters:
o d – index of the diagonal, with the following meaning:
    d=0  the main diagonal
    d>0  a diagonal from the lower half, e.g. d=1 means the diagonal immediately below the main one
    d<0  a diagonal from the upper half, e.g. d=-1 means the diagonal immediately above the main one
o matD – a single-column matrix that will form the diagonal matrix
The method makes a new header for the specified matrix diagonal. The new matrix is represented as a single-column matrix. Similarly to Mat::row() and Mat::col(), this is an O(1) operation; no data is copied.

cv::Mat::copyTo
void Mat::copyTo(Mat& m) const
void Mat::copyTo(Mat& m, const Mat& mask) const
Copies the matrix to another one.
Parameters:
o m – The destination matrix. If it does not have a proper size or type before the operation, it will be reallocated
o mask – The operation mask. Its non-zero elements indicate which matrix elements need to be copied
The method copies the matrix data to another matrix. Before copying the data, the method invokes

m.create(this->size(), this->type());

so that the destination matrix is reallocated if needed. While m.copyTo(m); works flawlessly, the function does not handle the case of a partial overlap between the source and the destination matrices.
When the operation mask is specified, and the Mat::create call shown above reallocated the matrix, the newly allocated matrix is initialized with all 0's before copying the data.

cv::Mat::clone
Mat Mat::clone() const
Creates a full copy of the array and the underlying data.
The method creates a full copy of the array. The original step[] is not taken into account, so the array copy is a continuous array occupying total()*elemSize() bytes.

cv::Mat::convertTo
void Mat::convertTo(Mat& m, int rtype, double alpha=1, double beta=0) const
Converts an array to another datatype with optional scaling.
Parameters:
o m – The destination matrix. If it does not have a proper size or type before the operation, it will be reallocated
o rtype – The desired destination matrix type, or rather, the depth (since the number of channels will be the same as in the source). If rtype is negative, the destination matrix will have the same type as the source
o alpha – The optional scale factor
o beta – The optional delta, added to the scaled values
The method converts source pixel values to the target datatype:

m(x,y) = saturate_cast<rtype>( alpha*(*this)(x,y) + beta );

saturate_cast<> is applied in the end to avoid possible overflows.

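A minimal usage sketch for convertTo, assuming an 8-bit input image (the file name is hypothetical):

Mat img8u = imread("lena.jpg", 0); // 8-bit single-channel image
Mat imgf;
// change the depth to 32-bit float and rescale 0..255 to 0..1
img8u.convertTo(imgf, CV_32F, 1./255);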
cv::Mat::assignTo
void Mat::assignTo(Mat& m, int type=-1) const
Functional form of convertTo
Parameters:
o m – The destination array
o type – The desired destination array depth (or -1 if it should be the same as the source one)
This is an internal-use method called by the Matrix Expressions engine.

cv::Mat::setTo
Mat& Mat::setTo(const Scalar& s, const Mat& mask=Mat())
Sets all or some of the array elements to the specified value.
Parameters:
o s – Assigned scalar, which is converted to the actual array type
o mask – The operation mask of the same size as *this
This is the advanced variant of the Mat::operator=(const Scalar& s) operator.

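A short sketch of setTo with a mask, e.g. zeroing out all elements above a threshold (the values are ours):

Mat A(3, 3, CV_8U, Scalar(10));
Mat mask = A > 5;          // 255 where the condition holds, 0 elsewhere
A.setTo(Scalar(0), mask);  // sets the selected elements of A to 0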
cv::Mat::reshape
Mat Mat::reshape(int cn, int rows=0) const
Changes the 2D matrix's shape and/or the number of channels without copying the data.
Parameters:
o cn – The new number of channels. If the parameter is 0, the number of channels remains the same
o rows – The new number of rows. If the parameter is 0, the number of rows remains the same
The method makes a new matrix header for *this elements. The new matrix may have a different size and/or a different number of channels. Any combination is possible, as long as:
1. No extra elements are included into the new matrix and no elements are excluded. Consequently, the product rows*cols*channels() must stay the same after the transformation.
2. No data is copied, i.e. this is an O(1) operation. Consequently, if you change the number of rows, or the operation changes the elements' row indices in some other way, the matrix must be continuous. See Mat::isContinuous().
Here is a small example. Assume there is a set of 3D points stored as an STL vector, and you want to represent the points as a 3xN matrix. Here is how it can be done:

std::vector<cv::Point3f> vec;
...
Mat pointMat = Mat(vec) // convert vector to Mat, an O(1) operation
    .reshape(1)         // make Nx3 1-channel matrix out of Nx1 3-channel.
                        // Also, an O(1) operation
    .t();               // finally, transpose the Nx3 matrix.
                        // This involves copying of all the elements

cv::Mat::t
MatExpr Mat::t() const
Transposes the matrix
The method performs matrix transposition by means of matrix expressions. It does not perform the actual transposition, but returns a temporary "matrix transposition" object that can be further used as a part of more complex matrix expressions or be assigned to a matrix:

Mat A1 = A + Mat::eye(A.size(), A.type())*lambda;
Mat C = A1.t()*A1; // compute (A + lambda*I)^t * (A + lambda*I)

cv::Mat::inv
MatExpr Mat::inv(int method=DECOMP_LU) const
Inverses the matrix
Parameters:
o method – The matrix inversion method, one of:
    DECOMP_LU  LU decomposition. The matrix must be non-singular.
    DECOMP_CHOLESKY  Cholesky decomposition, for symmetrical positively defined matrices only. About twice faster than LU on big matrices.
    DECOMP_SVD  SVD decomposition. The matrix can be singular or even non-square; then the pseudo-inverse is computed.
The method performs matrix inversion by means of matrix expressions, i.e. a temporary "matrix inversion" object is returned by the method, and it can further be used as a part of a more complex matrix expression or be assigned to a matrix.

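Because inv() is evaluated lazily inside matrix expressions, it combines naturally with "*" for solving small linear systems. A minimal sketch (the 2x2 system is ours):

Mat A = (Mat_<float>(2, 2) << 2, 1,
                              1, 3);
Mat b = (Mat_<float>(2, 1) << 5, 10);
// least-squares solution of A*x = b; DECOMP_SVD also handles
// singular or non-square A
Mat x = A.inv(DECOMP_SVD)*b;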
cv::Mat::mul
MatExpr Mat::mul(const Mat& m, double scale=1) const
MatExpr Mat::mul(const MatExpr& m, double scale=1) const
Performs element-wise multiplication or division of the two matrices
Parameters:
o m – Another matrix, of the same type and the same size as *this, or a matrix expression
o scale – The optional scale factor
The method returns a temporary object encoding per-element array multiplication, with an optional scale. Note that this is not a matrix multiplication, which corresponds to the simpler "*" operator. Here is an example:

Mat C = A.mul(5/B); // equivalent to divide(A, B, C, 5)

cv::Mat::cross
Mat Mat::cross(const Mat& m) const
Computes cross-product of two 3-element vectors
Parameters:
o m – Another cross-product operand
The method computes the cross-product of the two 3-element vectors. The vectors must be 3-element floating-point vectors of the same shape and the same size. The result is another 3-element vector of the same shape and the same type as the operands.

cv::Mat::dot
double Mat::dot(const Mat& m) const
Computes dot-product of two vectors
Parameters:
o m – Another dot-product operand
The method computes the dot-product of the two matrices. If the matrices are not single-column or single-row vectors, the top-to-bottom left-to-right scan ordering is used to treat them as 1D vectors. The vectors must have the same size and the same type. If the matrices have more than one channel, the dot products from all the channels are summed together.

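A tiny sketch of dot() and cross() on the standard basis vectors:

Mat a = (Mat_<float>(1, 3) << 1, 0, 0);
Mat b = (Mat_<float>(1, 3) << 0, 1, 0);
Mat c = a.cross(b);   // c = (0, 0, 1)
double d = a.dot(b);  // d = 0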
cv::Mat::zeros
static MatExpr Mat::zeros(int rows, int cols, int type)
static MatExpr Mat::zeros(Size size, int type)
static MatExpr Mat::zeros(int ndims, const int* sizes, int type)
Returns zero array of the specified size and type
Parameters:
o ndims – The array dimensionality
o rows – The number of rows
o cols – The number of columns
o size – Alternative matrix size specification: Size(cols, rows)
o sizes – The array of integers, specifying the array shape
o type – The created matrix type
The method returns a Matlab-style zero array initializer. It can be used to quickly form a constant array and use it as a function parameter, as a part of a matrix expression, or as a matrix initializer:

Mat A;
A = Mat::zeros(3, 3, CV_32F);

Note that in the above sample a new matrix will be allocated only if A is not a 3x3 floating-point matrix; otherwise the existing matrix A will be filled with 0's.

cv::Mat::ones
static MatExpr Mat::ones(int rows, int cols, int type)
static MatExpr Mat::ones(Size size, int type)
static MatExpr Mat::ones(int ndims, const int* sizes, int type)
Returns array of all 1's of the specified size and type
Parameters:
o ndims – The array dimensionality
o rows – The number of rows
o cols – The number of columns
o size – Alternative matrix size specification: Size(cols, rows)
o sizes – The array of integers, specifying the array shape
o type – The created matrix type
The method returns a Matlab-style ones' array initializer, similarly to Mat::zeros(). Note that using this method you can initialize an array with an arbitrary value, using the following Matlab idiom:

Mat A = Mat::ones(100, 100, CV_8U)*3; // make 100x100 matrix filled with 3.

The above operation will not form a 100x100 matrix of ones and then multiply it by 3. Instead, it will just remember the scale factor (3 in this case) and use it when actually invoking the matrix initializer.

cv::Mat::eye
static MatExpr Mat::eye(int rows, int cols, int type)
static MatExpr Mat::eye(Size size, int type)
Returns identity matrix of the specified size and type
Parameters:
o rows – The number of rows
o cols – The number of columns
o size – Alternative matrix size specification: Size(cols, rows)
o type – The created matrix type
The method returns a Matlab-style identity matrix initializer, similarly to Mat::zeros(). Similarly to Mat::ones, you can use a scale operation to create a scaled identity matrix efficiently:

// make a 4x4 diagonal matrix with 0.1's on the diagonal.
Mat A = Mat::eye(4, 4, CV_32F)*0.1;

cv::Mat::create
void Mat::create(int rows, int cols, int type)
void Mat::create(Size size, int type)
void Mat::create(int ndims, const int* sizes, int type)
Allocates new array data if needed.
Parameters:
o ndims – The new array dimensionality
o rows – The new number of rows
o cols – The new number of columns
o size – Alternative new matrix size specification: Size(cols, rows)
o sizes – The array of integers, specifying the new array shape
o type – The new matrix type
This is one of the key Mat methods. Most new-style OpenCV functions and methods that produce arrays call this method for each output array. The method uses the following algorithm:
1. if the current array shape and the type match the new ones, return immediately
2. otherwise, dereference the previous data by calling Mat::release()
3. initialize the new header
4. allocate the new data of total()*elemSize() bytes
5. allocate the new reference counter associated with the data and set it to 1
Such a scheme makes the memory management robust and efficient at the same time, and it also saves quite a bit of typing for the user, i.e. usually there is no need to explicitly allocate output arrays. That is, instead of writing:

Mat color;
...
Mat gray(color.rows, color.cols, color.depth());
cvtColor(color, gray, CV_BGR2GRAY);

you can simply write:

Mat color;
...
Mat gray;
cvtColor(color, gray, CV_BGR2GRAY);

because cvtColor, as well as most of the OpenCV functions, calls Mat::create() for the output array internally.

cv::Mat::addref
void Mat::addref()
Increments the reference counter
The method increments the reference counter associated with the matrix data. If the matrix header points to an external data (see Mat::Mat()), the reference counter is NULL, and the method has no effect in this case. Normally, to avoid memory leaks, the method should not be called explicitly. It is called implicitly by the matrix assignment operator. The reference counter increment is an atomic operation on the platforms that support it; thus it is safe to operate on the same matrices asynchronously in different threads.

cv::Mat::release
void Mat::release()
Decrements the reference counter and deallocates the matrix if needed
The method decrements the reference counter associated with the matrix data. When the reference counter reaches 0, the matrix data is deallocated and the data and the reference counter pointers are set to NULL's. If the matrix header points to an external data (see Mat::Mat()), the reference counter is NULL, and the method has no effect in this case. This method can be called manually to force the matrix data deallocation. But since this method is automatically called in the destructor, or by any other method that changes the data pointer, it is usually not needed. The reference counter decrement and the check for 0 is an atomic operation on the platforms that support it; thus it is safe to operate on the same matrices asynchronously in different threads.

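The implicit reference counting described above can be illustrated with a short sketch:

Mat A(1000, 1000, CV_8U); // data allocated, reference counter set to 1
Mat B = A;                // O(1) assignment; the counter becomes 2
A.release();              // the counter drops to 1; B still owns the data
// the buffer is finally deallocated when B is destructed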
cv::Mat::resize
void Mat::resize(size_t sz) const
Changes the number of matrix rows
Parameters:
o sz – The new number of rows
The method changes the number of matrix rows. If the matrix is reallocated, the first min(Mat::rows, sz) rows are preserved. The method emulates the corresponding method of the STL vector class.

cv::Mat::push_back
template<typename T> void Mat::push_back(const T& elem)
template<typename T> void Mat::push_back(const Mat_<T>& elem)
Adds elements to the bottom of the matrix
Parameters:
o elem – The added element(s)
The methods add one or more elements to the bottom of the matrix. They emulate the corresponding method of the STL vector class. When elem is Mat, its type and the number of columns must be the same as in the container matrix.

cv::Mat::pop_back
template<typename T> void Mat::pop_back(size_t nelems=1)
Removes elements from the bottom of the matrix.
Parameters:
o nelems – The number of rows removed. If it is greater than the total number of rows, an exception is thrown
The method removes one or more rows from the bottom of the matrix.

cv::Mat::locateROI
void Mat::locateROI(Size& wholeSize, Point& ofs) const
Locates matrix header within a parent matrix
Parameters:
o wholeSize – The output parameter that will contain the size of the whole matrix, which *this is a part of
o ofs – The output parameter that will contain the offset of *this inside the whole matrix
After you have extracted a submatrix from a matrix using Mat::row(), Mat::col(), Mat::rowRange(), Mat::colRange() etc., the resulting submatrix will point just to the part of the original big matrix. However, each submatrix contains some information (represented by the datastart and dataend fields), using which it is possible to reconstruct the original matrix size and the position of the extracted submatrix within the original matrix. The method locateROI does exactly that.

cv::Mat::adjustROI
Mat& Mat::adjustROI(int dtop, int dbottom, int dleft, int dright)
Adjusts the submatrix size and position within the parent matrix
Parameters:
o dtop – The shift of the top submatrix boundary upwards
o dbottom – The shift of the bottom submatrix boundary downwards
o dleft – The shift of the left submatrix boundary to the left
o dright – The shift of the right submatrix boundary to the right

The method is complementary to Mat::locateROI(). Indeed, the typical use of these functions is to determine the submatrix position within the parent matrix and then shift the position somehow. Typically it can be needed for filtering operations, when pixels outside of the ROI should be taken into account. When all the method's parameters are positive, it means that the ROI needs to grow in all directions by the specified amount, i.e.

A.adjustROI(2, 2, 2, 2);

increases the matrix size by 4 elements in each direction and shifts it by 2 elements to the left and 2 elements up, which brings in all the necessary pixels for filtering with a 5x5 kernel. It is the user's responsibility to make sure that adjustROI does not cross the parent matrix boundary. If it does, the function will signal an error. The function is used internally by the OpenCV filtering functions, like filter2D(), morphological operations etc.
See also copyMakeBorder().

cv::Mat::operator()
Mat Mat::operator()( Range rowRange, Range colRange ) const
Mat Mat::operator()( const Rect& roi ) const
Mat Mat::operator()( const Range* ranges ) const
Extracts a rectangular submatrix
Parameters:
o rowRange – The start and the end row of the extracted submatrix. The upper boundary is not included. To select all the rows, use Range::all()
o colRange – The start and the end column of the extracted submatrix. The upper boundary is not included. To select all the columns, use Range::all()
o roi – The extracted submatrix specified as a rectangle
o ranges – The array of selected ranges along each array dimension
The operators make a new header for the specified sub-array of *this. They are the most generalized forms of Mat::row(), Mat::col(), Mat::rowRange() and Mat::colRange(). For example, A(Range(0, 10), Range::all()) is equivalent to A.rowRange(0, 10). Similarly to all of the above, the operators are O(1) operations, i.e. no matrix data is copied.

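A short sketch of operator() for in-place processing of a rectangular region:

Mat A = Mat::zeros(100, 100, CV_8U);
Mat roi = A(Rect(25, 25, 50, 50)); // central 50x50 submatrix; no data copied
roi.setTo(Scalar(255));            // paints the corresponding region of A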
cv::Mat::operator CvMat
Mat::operator CvMat() const
Creates CvMat header for the matrix
The operator makes a CvMat header for the matrix without copying the underlying data. The reference counter is not taken into account by this operation; thus you should make sure that the original matrix is not deallocated while the CvMat header is used. The operator is useful for intermixing the new and the old OpenCV API's, e.g:

Mat img(Size(320, 240), CV_8UC3);
...
CvMat cvimg = img;
mycvOldFunc( &cvimg, ... );

where mycvOldFunc is some function written to work with OpenCV 1.x data structures.

cv::Mat::operator IplImage
Mat::operator IplImage() const
Creates IplImage header for the matrix
The operator makes an IplImage header for the matrix without copying the underlying data. You should make sure that the original matrix is not deallocated while the IplImage header is used. Similarly to Mat::operator CvMat, the operator is useful for intermixing the new and the old OpenCV API's.

cv::Mat::total
size_t Mat::total() const
Returns the total number of array elements.
The method returns the number of array elements (e.g. the number of pixels if the array represents an image).

cv::Mat::isContinuous
bool Mat::isContinuous() const
Reports whether the matrix is continuous or not
The method returns true if the matrix elements are stored continuously, i.e. without gaps in the end of each row, and false otherwise. Obviously, 1x1 or 1xN matrices are always continuous. Matrices created with Mat::create() are always continuous, but if you extract a part of the matrix using Mat::col(), Mat::diag() etc., or constructed a matrix header for externally allocated data, such matrices may no longer have this property.
The continuity flag is stored as a bit in the Mat::flags field, and is computed automatically when you construct a matrix header, thus the continuity check is a very fast operation, though it could be done, in theory, as follows:

// alternative implementation of Mat::isContinuous()
bool myCheckMatContinuity(const Mat& m)
{
    //return (m.flags & Mat::CONTINUOUS_FLAG) != 0;
    return m.rows == 1 || m.step == m.cols*m.elemSize();
}

The method is used in quite a few of OpenCV functions, and you are welcome to use it as well. The point is that element-wise operations (such as arithmetic and logical operations, math functions, alpha blending, color space transformations etc.) do not depend on the image geometry, and thus, if all the input and all the output arrays are continuous, the functions can process them as very long single-row vectors. Here is an example of how an alpha-blending function can be implemented:

template<typename T>
void alphaBlendRGBA(const Mat& src1, const Mat& src2, Mat& dst)
{
    const float alpha_scale = (float)std::numeric_limits<T>::max(),
                inv_scale = 1.f/alpha_scale;

    CV_Assert( src1.type() == src2.type() &&
               src1.type() == CV_MAKETYPE(DataType<T>::depth, 4) &&
               src1.size() == src2.size() );
    Size size = src1.size();
    dst.create(size, src1.type());

    // here is the idiom: check the arrays for continuity and,
    // if this is the case,
    // treat the arrays as 1D vectors
    if( src1.isContinuous() && src2.isContinuous() && dst.isContinuous() )
    {
        size.width *= size.height;

        size.height = 1;
    }
    size.width *= 4;

    for( int i = 0; i < size.height; i++ )
    {
        // when the arrays are continuous,
        // the outer loop is executed only once
        const T* ptr1 = src1.ptr<T>(i);
        const T* ptr2 = src2.ptr<T>(i);
        T* dptr = dst.ptr<T>(i);

        for( int j = 0; j < size.width; j += 4 )
        {
            float alpha = ptr1[j+3]*inv_scale, beta = ptr2[j+3]*inv_scale;
            dptr[j] = saturate_cast<T>(ptr1[j]*alpha + ptr2[j]*beta);
            dptr[j+1] = saturate_cast<T>(ptr1[j+1]*alpha + ptr2[j+1]*beta);
            dptr[j+2] = saturate_cast<T>(ptr1[j+2]*alpha + ptr2[j+2]*beta);
            dptr[j+3] = saturate_cast<T>((1 - (1-alpha)*(1-beta))*alpha_scale);
        }
    }
}

This trick, while being very simple, can boost the performance of a simple element-operation by 10-20 percent, especially if the image is rather small and the operation is quite simple.
Also, note that we use another OpenCV idiom in this function - we call Mat::create() for the destination array instead of checking that it already has the proper size and type. And while the newly allocated arrays are always continuous, we still check the destination array, because create() does not always allocate a new matrix.

cv::Mat::elemSize
size_t Mat::elemSize() const
Returns matrix element size in bytes
The method returns the matrix element size in bytes. For example, if the matrix type is CV_16SC3, the method will return 3*sizeof(short) or 6.

cv::Mat::elemSize1
size_t Mat::elemSize1() const
Returns size of each matrix element channel in bytes
The method returns the matrix element channel size in bytes, that is, it ignores the number of channels. For example, if the matrix type is CV_16SC3, the method will return sizeof(short) or 2.

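A two-line check of the relationship between elemSize and elemSize1:

Mat A(10, 10, CV_16SC3);
CV_Assert( A.elemSize() == 6 && A.elemSize1() == 2 ); // 3*sizeof(short), sizeof(short)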
cv::Mat::type
int Mat::type() const
Returns matrix element type
The method returns the matrix element type, an id compatible with the CvMat type system, like CV_16SC3 for a 16-bit signed 3-channel array etc.

cv::Mat::depth
int Mat::depth() const
Returns matrix element depth
The method returns the matrix element depth id, i.e. the type of each individual channel. For example, for a 16-bit signed 3-channel array the method will return CV_16S. The complete list of matrix types:
• CV_8U - 8-bit unsigned integers ( 0..255 )
• CV_8S - 8-bit signed integers ( -128..127 )
• CV_16U - 16-bit unsigned integers ( 0..65535 )
• CV_16S - 16-bit signed integers ( -32768..32767 )
• CV_32S - 32-bit signed integers ( -2147483648..2147483647 )
• CV_32F - 32-bit floating-point numbers ( -FLT_MAX..FLT_MAX, INF, NAN )
• CV_64F - 64-bit floating-point numbers ( -DBL_MAX..DBL_MAX, INF, NAN )

cv::Mat::channels
int Mat::channels() const
Returns the number of matrix channels
The method returns the number of matrix channels.

cv::Mat::step1
size_t Mat::step1() const
Returns normalized step
The method returns the matrix step, divided by Mat::elemSize1(). It can be useful for fast access to an arbitrary matrix element.

cv::Mat::size
Size Mat::size() const
Returns the matrix size
The method returns the matrix size: Size(cols, rows).

cv::Mat::empty
bool Mat::empty() const
Returns true if the array has no elements
The method returns true if Mat::total() is 0 or if Mat::data is NULL. Because of the pop_back() and resize() methods, M.total() == 0 does not imply that M.data == NULL.

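A quick sketch of the property-query methods on a 2-channel floating-point matrix:

Mat A(30, 40, CV_32FC2);
CV_Assert( A.type() == CV_32FC2 && A.depth() == CV_32F );
CV_Assert( A.channels() == 2 && A.size() == Size(40, 30) && !A.empty() );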
cv::Mat::ptr
uchar* Mat::ptr(int i=0)
const uchar* Mat::ptr(int i=0) const
template<typename _Tp> _Tp* Mat::ptr(int i=0)
template<typename _Tp> const _Tp* Mat::ptr(int i=0) const
Return pointer to the specified matrix row
Parameters:
o i – The 0-based row index
The methods return uchar* or a typed pointer to the specified matrix row. See the sample in Mat::isContinuous() on how to use these methods.

cv::Mat::at
template<typename T> T& Mat::at(int i)
template<typename T> const T& Mat::at(int i) const
template<typename T> T& Mat::at(int i, int j)
template<typename T> const T& Mat::at(int i, int j) const
template<typename T> T& Mat::at(Point pt)
template<typename T> const T& Mat::at(Point pt) const
template<typename T> T& Mat::at(int i, int j, int k)
template<typename T> const T& Mat::at(int i, int j, int k) const
template<typename T> T& Mat::at(const int* idx)
template<typename T> const T& Mat::at(const int* idx) const
Return reference to the specified array element
Parameters:
o i, j, k – Indices along the dimensions 0, 1 and 2, respectively
o pt – The element position specified as Point(j, i)
o idx – The array of Mat::dims indices
The template methods return a reference to the specified array element. Note that the variants with a single index (i) can be used to access elements of single-row or single-column 2-dimensional arrays. That is, if, for example, A is a 1 x N floating-point matrix and B is an M x 1 integer matrix, you can simply write A.at<float>(k+4) and B.at<int>(2*i+1) instead of A.at<float>(0, k+4) and B.at<int>(2*i+1, 0), respectively.
For the sake of higher performance the index range checks are only performed in the Debug configuration.
Here is an example of the initialization of a Hilbert matrix:

Mat H(100, 100, CV_64F);
for(int i = 0; i < H.rows; i++)
    for(int j = 0; j < H.cols; j++)
        H.at<double>(i,j) = 1./(i+j+1);

cv::Mat::begin
template<typename _Tp> MatIterator_<_Tp> Mat::begin()
template<typename _Tp> MatConstIterator_<_Tp> Mat::begin() const
Return the matrix iterator, set to the first matrix element
The methods return the matrix read-only or read-write iterators. The use of matrix iterators is very similar to the use of bi-directional STL iterators. Here is the alpha blending function rewritten using the matrix iterators:

template<typename T>
void alphaBlendRGBA(const Mat& src1, const Mat& src2, Mat& dst)
{
    typedef Vec<T, 4> VT;

    const float alpha_scale = (float)std::numeric_limits<T>::max(),
                inv_scale = 1.f/alpha_scale;

    Size size = src1.size();
    CV_Assert( src1.type() == src2.type() &&
               src1.type() == DataType<VT>::type &&
               src1.size() == src2.size() );
    dst.create(size, src1.type());

    MatConstIterator_<VT> it1 = src1.begin<VT>(), it1_end = src1.end<VT>();
    MatConstIterator_<VT> it2 = src2.begin<VT>();
    MatIterator_<VT> dst_it = dst.begin<VT>();

    for( ; it1 != it1_end; ++it1, ++it2, ++dst_it )
    {
        VT pix1 = *it1, pix2 = *it2;
        float alpha = pix1[3]*inv_scale, beta = pix2[3]*inv_scale;
        *dst_it = VT(saturate_cast<T>(pix1[0]*alpha + pix2[0]*beta),
                     saturate_cast<T>(pix1[1]*alpha + pix2[1]*beta),
                     saturate_cast<T>(pix1[2]*alpha + pix2[2]*beta),
                     saturate_cast<T>((1 - (1-alpha)*(1-beta))*alpha_scale));
    }
}

cv::Mat::end
template<typename _Tp> MatIterator_<_Tp> Mat::end()
template<typename _Tp> MatConstIterator_<_Tp> Mat::end() const
Return the matrix iterator, set to the after-last matrix element
The methods return the matrix read-only or read-write iterators, set to the point following the last matrix element.

Mat_
Template matrix class derived from Mat

template<typename _Tp> class Mat_ : public Mat
{
public:
    // ... some specific methods
    // and
    // no new extra fields
};

The class Mat_<_Tp> is a "thin" template wrapper on top of the Mat class. It does not have any extra data fields, nor do Mat_ or Mat have any virtual methods, and thus references or pointers to these two classes can be freely converted one to another, e.g.:

// create 100x100 8-bit matrix
Mat M(100, 100, CV_8U);
// this will compile fine. no any data conversion will be done.
Mat_<float>& M1 = (Mat_<float>&)M;
// the program will likely crash at the statement below
M1(99, 99) = 1.f;

While Mat is sufficient in most cases, Mat_ can be more convenient if you use a lot of element access operations and if you know the matrix type at compile time. Note that Mat::at<_Tp>(int y, int x) and Mat_<_Tp>::operator ()(int y, int x) do absolutely the same thing and run at the same speed, but the latter is certainly shorter:

Mat_<double> M(20, 20);
for(int i = 0; i < M.rows; i++)
    for(int j = 0; j < M.cols; j++)
        M(i, j) = 1./(i+j+1);
Mat E, V;
eigen(M, E, V);
cout << E.at<double>(0, 0)/E.at<double>(M.rows-1, 0);

How to use ``Mat_`` for multi-channel images/matrices? This is simple - just pass Vec as a Mat_ parameter:

// allocate 320x240 color image and fill it with green (in RGB space)
Mat_<Vec3b> img(240, 320, Vec3b(0, 255, 0));
// now draw a diagonal white line
for(int i = 0; i < 100; i++)
    img(i, i) = Vec3b(255, 255, 255);
// and now scramble the 2nd (red) channel of each pixel
for(int i = 0; i < img.rows; i++)
    for(int j = 0; j < img.cols; j++)
        img(i, j)[2] ^= (uchar)(i ^ j);

NAryMatIterator
n-ary multi-dimensional array iterator

class CV_EXPORTS NAryMatIterator
{
public:
    //! the default constructor
    NAryMatIterator();
    //! the full constructor taking arbitrary number of n-dim matrices
    NAryMatIterator(const Mat** arrays, Mat* planes, int narrays=-1);
    //! the separate iterator initialization method
    void init(const Mat** arrays, Mat* planes, int narrays=-1);
    //! proceeds to the next plane of every iterated matrix
    NAryMatIterator& operator ++();
    //! proceeds to the next plane of every iterated matrix (postfix increment operator)
    NAryMatIterator operator ++(int);

    //! the iterated matrices
    const Mat** arrays;
    //! the current planes
    Mat* planes;
    //! the total number of planes
    int nplanes;
};

The class is used for implementation of unary, binary and, generally, n-ary element-wise operations on multi-dimensional arrays. Some of the arguments of an n-ary function may be continuous arrays, some may be not. It is possible to use conventional MatIterator's for each array, but it can be a big overhead to increment all of the iterators after each small operation. That's where NAryMatIterator can be used. Using it, you can iterate through several matrices simultaneously as long as they have the same geometry (dimensionality and all the dimension sizes are the same). On each iteration it.planes[0], it.planes[1], ... will be the slices of the corresponding matrices.
Here is an example of how you can compute a normalized and thresholded 3D color histogram:

void computeNormalizedColorHist(const Mat& image, Mat& hist, int N, double minProb)
{
    const int histSize[] = {N, N, N};

    // make sure that the histogram has proper size and type
    hist.create(3, histSize, CV_32F);

    // and clear it
    hist = Scalar(0);

    // the loop below assumes that the image
    // is 8-bit 3-channel, so let's check it.
    CV_Assert(image.type() == CV_8UC3);
    MatConstIterator_<Vec3b> it = image.begin<Vec3b>(),
                             it_end = image.end<Vec3b>();
    for( ; it != it_end; ++it )
    {
        const Vec3b& pix = *it;
        hist.at<float>(pix[0]*N/256, pix[1]*N/256, pix[2]*N/256) += 1.f;
    }

    minProb *= image.rows*image.cols;
    Mat plane;
    // the iterator is named itN here to avoid clashing with
    // the MatConstIterator_ "it" declared above
    NAryMatIterator itN(&hist, &plane, 1);
    double s = 0;
    // iterate through the matrix. on each iteration
    // itN.planes[*] (of type Mat) will be set to the current plane.
    for(int p = 0; p < itN.nplanes; p++, ++itN)
    {
        threshold(itN.planes[0], itN.planes[0], minProb, 0, THRESH_TOZERO);
        s += sum(itN.planes[0])[0];
    }

    s = 1./s;
    itN = NAryMatIterator(&hist, &plane, 1);
    for(int p = 0; p < itN.nplanes; p++, ++itN)
        itN.planes[0] *= s;
}

SparseMat
Sparse n-dimensional array.

class SparseMat

{
public:
    typedef SparseMatIterator iterator;
    typedef SparseMatConstIterator const_iterator;

    // internal structure - sparse matrix header
    struct Hdr
    {
        ...
    };

    // sparse matrix node - element of a hash table
    struct Node
    {
        size_t hashval;
        size_t next;
        int idx[CV_MAX_DIM];
    };

    ////////// constructors and destructor //////////
    // default constructor
    SparseMat();
    // creates matrix of the specified size and type
    SparseMat(int dims, const int* _sizes, int _type);
    // copy constructor
    SparseMat(const SparseMat& m);
    // converts dense array to the sparse form.
    // if try1d is true and matrix is a single-column matrix (Nx1),
    // then the sparse matrix will be 1-dimensional.
    SparseMat(const Mat& m, bool try1d=false);
    // converts old-style sparse matrix to the new-style.
    // all the data is copied, so that "m" can be safely
    // deleted after the conversion
    SparseMat(const CvSparseMat* m);
    // destructor
    ~SparseMat();

    ///////// assignment operations ///////////
    // this is O(1) operation; no data is copied
    SparseMat& operator = (const SparseMat& m);
    // (equivalent to the corresponding constructor with try1d=false)
    SparseMat& operator = (const Mat& m);

    // creates full copy of the matrix
    SparseMat clone() const;

    // copy all the data to the destination matrix.
    // the destination will be reallocated if needed.
    void copyTo( SparseMat& m ) const;
    // converts 1D or 2D sparse matrix to dense 2D matrix.
    // If the sparse matrix is 1D, then the result will
    // be a single-column matrix.

    void copyTo( Mat& m ) const;
    // converts arbitrary sparse matrix to dense matrix.
    // multiplies all the matrix elements by the specified scalar
    void convertTo( SparseMat& m, int rtype, double alpha=1 ) const;
    // converts sparse matrix to dense matrix with optional type conversion and scaling.
    // When rtype=-1, the destination element type will be the same
    // as the sparse matrix element type.
    // Otherwise rtype will specify the depth and
    // the number of channels will remain the same as in the sparse matrix
    void convertTo( Mat& m, int rtype, double alpha=1, double beta=0 ) const;

    // not used now
    void assignTo( SparseMat& m, int type=-1 ) const;

    // reallocates sparse matrix.
    // If it was already of the proper size and type,
    // it is simply cleared with clear(), otherwise,
    // the old matrix is released (using release()) and the new one is allocated.
    void create(int dims, const int* _sizes, int _type);
    // sets all the matrix elements to 0, which means clearing the hash table.
    void clear();
    // manually increases reference counter to the header.
    void addref();
    // decreases the header reference counter; when it reaches 0,
    // the header and all the underlying data are deallocated.
    void release();

    // converts sparse matrix to the old-style representation.
    // all the elements are copied.
    operator CvSparseMat*() const;

    // size of each element in bytes
    // (the matrix nodes will be bigger because of
    // element indices and other SparseMat::Node elements).
    size_t elemSize() const;
    // elemSize()/channels()
    size_t elemSize1() const;

    // the same as in Mat
    int type() const;
    int depth() const;
    int channels() const;

    // returns the array of sizes and 0 if the matrix is not allocated
    const int* size() const;
    // returns i-th size (or 0)
    int size(int i) const;
    // returns the matrix dimensionality
    int dims() const;
    // returns the number of non-zero elements
    size_t nzcount() const;

    // compute element hash value from the element indices:
    // 1D case
    size_t hash(int i0) const;

    // 2D case
    size_t hash(int i0, int i1) const;
    // 3D case
    size_t hash(int i0, int i1, int i2) const;
    // n-D case
    size_t hash(const int* idx) const;

    // low-level element-access functions,
    // special variants for 1D, 2D, 3D cases and the generic one for n-D case.
    //
    // return pointer to the matrix element.
    // if the element is there (it's non-zero), the pointer to it is returned;
    // if it's not there and createMissing=false, NULL pointer is returned;
    // if it's not there and createMissing=true, then the new element
    // is created and initialized with 0. Pointer to it is returned.
    // If the optional hashval pointer is not NULL, the element hash value is
    // not computed, but *hashval is taken instead.
    uchar* ptr(int i0, bool createMissing, size_t* hashval=0);
    uchar* ptr(int i0, int i1, bool createMissing, size_t* hashval=0);
    uchar* ptr(int i0, int i1, int i2, bool createMissing, size_t* hashval=0);
    uchar* ptr(const int* idx, bool createMissing, size_t* hashval=0);

    // higher-level element access functions:
    // ref<_Tp>(i0,...[,hashval]) - equivalent to *(_Tp*)ptr(i0,...,true[,hashval]);
    //    always returns a valid reference to the element.
    //    If it did not exist, it is created and initialized with 0.
    // find<_Tp>(i0,...[,hashval]) - equivalent to (const _Tp*)ptr(i0,...,false[,hashval]);
    //    returns pointer to the element or NULL pointer if the element is not there.
    // value<_Tp>(i0,...[,hashval]) - equivalent to
    //    { const _Tp* p = find<_Tp>(i0,...[,hashval]); return p ? *p : _Tp(); }
    //    that is, 0 is returned when the element is not there.
    // note that _Tp must match the actual matrix type -
    // the functions do not do any on-fly type conversion

    // 1D case
    template<typename _Tp> _Tp& ref(int i0, size_t* hashval=0);
    template<typename _Tp> _Tp value(int i0, size_t* hashval=0) const;
    template<typename _Tp> const _Tp* find(int i0, size_t* hashval=0) const;

    // 2D case
    template<typename _Tp> _Tp& ref(int i0, int i1, size_t* hashval=0);
    template<typename _Tp> _Tp value(int i0, int i1, size_t* hashval=0) const;
    template<typename _Tp> const _Tp* find(int i0, int i1, size_t* hashval=0) const;

    // 3D case
    template<typename _Tp> _Tp& ref(int i0, int i1, int i2, size_t* hashval=0);
    template<typename _Tp> _Tp value(int i0, int i1, int i2, size_t* hashval=0) const;
    template<typename _Tp> const _Tp* find(int i0, int i1, int i2, size_t* hashval=0) const;

    // n-D case
    template<typename _Tp> _Tp& ref(const int* idx, size_t* hashval=0);
    template<typename _Tp> _Tp value(const int* idx, size_t* hashval=0) const;
    template<typename _Tp> const _Tp* find(const int* idx, size_t* hashval=0) const;

    // erase the specified matrix element.
    // When there is no such element, the methods do nothing
    void erase(int i0, int i1, size_t* hashval=0);
    void erase(int i0, int i1, int i2, size_t* hashval=0);
    void erase(const int* idx, size_t* hashval=0);

    // return the matrix iterators,
    // pointing to the first sparse matrix element, ...
    SparseMatIterator begin();
    SparseMatConstIterator begin() const;
    template<typename _Tp> SparseMatIterator_<_Tp> begin();
    template<typename _Tp> SparseMatConstIterator_<_Tp> begin() const;
    // ... or to the point after the last sparse matrix element
    SparseMatIterator end();
    SparseMatConstIterator end() const;
    template<typename _Tp> SparseMatIterator_<_Tp> end();
    template<typename _Tp> SparseMatConstIterator_<_Tp> end() const;

    // return value stored in the sparse matrix node;
    // _Tp must match the actual matrix type.
    template<typename _Tp> _Tp& value(Node* n);
    template<typename _Tp> const _Tp& value(const Node* n) const;

    ////////////// some internal-use methods ///////////////
    ...

    // pointer to the sparse matrix header
    Hdr* hdr;
};

The class SparseMat represents multi-dimensional sparse numerical arrays. Such a sparse array can store elements of any type that Mat can store. "Sparse" means that only non-zero elements are stored (though, as a result of operations on a sparse matrix, some of its stored elements can actually become 0. It's up to the user to detect such elements and delete them using SparseMat::erase). The non-zero elements are stored in a hash table that grows when it's filled enough, so that the search time is O(1) in average (regardless of whether an element is there or not). Elements can be accessed using the following methods:
1. Query operations (SparseMat::ptr and the higher-level SparseMat::ref, SparseMat::value and SparseMat::find), e.g.:

const int dims = 5;
int size[] = {10, 10, 10, 10, 10};
SparseMat sparse_mat(dims, size, CV_32F);
for(int i = 0; i < 1000; i++)
{
    int idx[dims];
    for(int k = 0; k < dims; k++)
        idx[k] = rand() % sparse_mat.size(k);
    sparse_mat.ref<float>(idx) += 1.f;
}

2. Sparse matrix iterators. Like Mat iterators and unlike MatND iterators, the sparse matrix iterators are STL-style, that is, the iteration loop is familiar to C++ users:

// prints elements of a sparse floating-point matrix
// and the sum of elements.
SparseMatConstIterator_<float>
    it = sparse_mat.begin<float>(),
    it_end = sparse_mat.end<float>();
double s = 0;
int dims = sparse_mat.dims();
for(; it != it_end; ++it)
{
    // print element indices and the element value
    const SparseMat::Node* n = it.node();
    printf("(");
    for(int i = 0; i < dims; i++)
        printf("%3d%c", n->idx[i], i < dims-1 ? ',' : ')');
    printf(": %f\n", *it);
    s += *it;
}
printf("Element sum is %g\n", s);

If you run this loop, you will notice that the elements are not enumerated in any logical order (lexicographical etc.); they come in the same order as they are stored in the hash table, i.e. semi-randomly. You may collect pointers to the nodes and sort them to get the proper ordering. Note, however, that pointers to the nodes may become invalid when you add more elements to the matrix; this is because of possible buffer reallocation.
3. A combination of the above 2 methods, when you need to process 2 or more sparse matrices simultaneously. E.g. this is how you can compute unnormalized cross-correlation of the 2 floating-point sparse matrices:

double cross_corr(const SparseMat& a, const SparseMat& b)
{
    const SparseMat *_a = &a, *_b = &b;
    // if b contains less elements than a,
    // it's faster to iterate through b
    if(_a->nzcount() > _b->nzcount())
        std::swap(_a, _b);
    SparseMatConstIterator_<float> it = _a->begin<float>(),
                                   it_end = _a->end<float>();
    double ccorr = 0;
    for(; it != it_end; ++it)
    {
        // take the next element from the first matrix
        float avalue = *it;
        const SparseMat::Node* anode = it.node();
        // and try to find element with the same index in the second matrix.
        // since the hash value depends only on the element index,
        // we reuse the hash value stored in the node
        float bvalue = _b->value<float>(anode->idx, (size_t*)&anode->hashval);
        ccorr += avalue*bvalue;
    }
    return ccorr;
}

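As a sanity check that only non-zero elements are stored, a dense/sparse round trip can be sketched like this:

Mat dense = Mat::eye(5, 5, CV_32F);
SparseMat sparse(dense);   // only the 5 ones on the diagonal are stored
CV_Assert( sparse.nzcount() == 5 );
Mat dense2;
sparse.copyTo(dense2);     // convert back to a dense 2D matrix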
SparseMat_
Template sparse n-dimensional array class derived from SparseMat

template<typename _Tp> class SparseMat_ : public SparseMat
{
public:
    typedef SparseMatIterator_<_Tp> iterator;
    typedef SparseMatConstIterator_<_Tp> const_iterator;

    // constructors;
    // the created matrix will have data type = DataType<_Tp>::type
    SparseMat_();
    SparseMat_(int dims, const int* _sizes);
    SparseMat_(const SparseMat& m);
    SparseMat_(const SparseMat_& m);
    SparseMat_(const Mat& m);
    SparseMat_(const CvSparseMat* m);
    // assignment operators; data type conversion is done when necessary
    SparseMat_& operator = (const SparseMat& m);
    SparseMat_& operator = (const SparseMat_& m);
    SparseMat_& operator = (const Mat& m);
    SparseMat_& operator = (const MatND& m);

    // equivalent to the corresponding parent class methods
    SparseMat_ clone() const;
    void create(int dims, const int* _sizes);
    operator CvSparseMat*() const;

    // overridden methods that do extra checks for the data type
    int type() const;
    int depth() const;
    int channels() const;

    // more convenient element access operations.
    // ref() is retained (but the <_Tp> specification is not needed anymore);
    // operator () is equivalent to SparseMat::value<_Tp>
    _Tp& ref(int i0, size_t* hashval=0);
    _Tp operator()(int i0, size_t* hashval=0) const;
    _Tp& ref(int i0, int i1, size_t* hashval=0);
    _Tp operator()(int i0, int i1, size_t* hashval=0) const;
    _Tp& ref(int i0, int i1, int i2, size_t* hashval=0);
    _Tp operator()(int i0, int i1, int i2, size_t* hashval=0) const;
    _Tp& ref(const int* idx, size_t* hashval=0);
    _Tp operator()(const int* idx, size_t* hashval=0) const;

    // iterators
    SparseMatIterator_<_Tp> begin();
    SparseMatConstIterator_<_Tp> begin() const;
    SparseMatIterator_<_Tp> end();
    SparseMatConstIterator_<_Tp> end() const;
};

SparseMat_ is a thin wrapper on top of SparseMat, made in the same way as Mat_. It simplifies

notation of some operations, and that's it:

int sz[] = {10, 20, 30};
SparseMat_<double> M(3, sz);
...
M.ref(1, 2, 3) = M(4, 5, 6) + M(7, 8, 9);

Operations on Arrays
cv::abs
MatExpr<...> abs(const Mat& src)
MatExpr<...> abs(const MatExpr<...>& src)
Computes absolute value of each matrix element
Parameters:
o src – a matrix or a matrix expression
abs is a meta-function that is expanded to one of the absdiff() forms:
• C = abs(A-B) is equivalent to absdiff(A, B, C)
• C = abs(A) is equivalent to absdiff(A, Scalar::all(0), C)
• C = Mat_<Vec<uchar,n> >(abs(A*alpha + beta)) is equivalent to convertScaleAbs(A, C, alpha, beta)
The output matrix will have the same size and the same type as the input one (except for the last case, where C will be depth=CV_8U).
See also: Matrix Expressions, absdiff()

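A one-line sketch of the first two forms:

Mat A = (Mat_<float>(1, 3) << -1, 2, -3);
Mat B = abs(A);  // expands to absdiff(A, Scalar::all(0), B); B = (1, 2, 3)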
cv::absdiff
void absdiff(const Mat& src1, const Mat& src2, Mat& dst)
void absdiff(const Mat& src1, const Scalar& sc, Mat& dst)
void absdiff(const MatND& src1, const MatND& src2, MatND& dst)
void absdiff(const MatND& src1, const Scalar& sc, MatND& dst)
Computes per-element absolute difference between 2 arrays or between an array and a scalar.
Parameters:
o src1 – The first input array
o src2 – The second input array; must be the same size and same type as src1
o sc – Scalar; the second input parameter
o dst – The destination array; it will have the same size and same type as src1; see Mat::create
The functions absdiff compute the absolute difference between two arrays:

dst(I) = saturate( |src1(I) - src2(I)| )

or the absolute difference between an array and a scalar:

dst(I) = saturate( |src1(I) - sc| )

where I is a multi-dimensional index of the array elements. In the case of multi-channel arrays each channel is processed independently.
See also: abs()

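A small worked example, where the absolute difference avoids negative intermediate values:

Mat a = (Mat_<uchar>(1, 3) << 10, 200, 30);
Mat b = (Mat_<uchar>(1, 3) << 40, 50, 60);
Mat d;
absdiff(a, b, d);  // d = (30, 150, 30)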
cv::add

void add(const Mat& src1, const Mat& src2, Mat& dst)
void add(const Mat& src1, const Mat& src2, Mat& dst, const Mat& mask)
void add(const Mat& src1, const Scalar& sc, Mat& dst, const Mat& mask=Mat())
void add(const MatND& src1, const MatND& src2, MatND& dst)
void add(const MatND& src1, const MatND& src2, MatND& dst, const MatND& mask)
void add(const MatND& src1, const Scalar& sc, MatND& dst, const MatND& mask=MatND())
Computes the per-element sum of two arrays or an array and a scalar.
Parameters:
o src1 – The first source array
o src2 – The second source array. It must have the same size and same type as src1
o sc – Scalar; the second input parameter
o dst – The destination array; it will have the same size and same type as src1; see Mat::create
o mask – The optional operation mask, an 8-bit single channel array; it specifies the elements of the destination array to be changed
The functions add compute the sum of two arrays:

dst(I) = saturate( src1(I) + src2(I) )  if mask(I) != 0

or the sum of an array and a scalar:

dst(I) = saturate( src1(I) + sc )  if mask(I) != 0

where I is a multi-dimensional index of the array elements. The first function in the above list can be replaced with matrix expressions:

dst = src1 + src2;
dst += src1; // equivalent to add(dst, src1, dst);

In the case of multi-channel arrays each channel is processed independently.
See also: subtract(), addWeighted(), scaleAdd(), convertScale(), Matrix Expressions

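A short sketch of the masked scalar form:

Mat a = Mat::ones(3, 3, CV_8U)*100;
Mat mask = Mat::zeros(3, 3, CV_8U);
mask.at<uchar>(1, 1) = 1;     // enable the central element only
add(a, Scalar(50), a, mask);  // a(1,1) becomes 150; the rest stays 100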
cv::addWeighted
void addWeighted(const Mat& src1, double alpha, const Mat& src2, double beta, double gamma, Mat& dst)
void addWeighted(const MatND& src1, double alpha, const MatND& src2, double beta, double gamma, MatND& dst)
Computes the weighted sum of two arrays.
Parameters:
o src1 – The first source array
o alpha – Weight for the first array elements
o src2 – The second source array; it must have the same size and same type as src1
o beta – Weight for the second array elements
o gamma – Scalar, added to each sum
o dst – The destination array; it will have the same size and same type as src1
The functions addWeighted calculate the weighted sum of two arrays as follows:

dst(I) = saturate( src1(I)*alpha + src2(I)*beta + gamma )

where I is a multi-dimensional index of the array elements. The first function can be replaced with a matrix expression:

dst = src1*alpha + src2*beta + gamma;

In the case of multi-channel arrays each channel is processed independently.
See also: add(), subtract(), scaleAdd(), convertScale(), Matrix Expressions

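A typical use is cross-fading two same-sized images (the file names are hypothetical):

Mat img1 = imread("a.jpg"), img2 = imread("b.jpg"), blend;
addWeighted(img1, 0.5, img2, 0.5, 0, blend); // simple 50/50 blend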
cv::bitwise_and
void bitwise_and(const Mat& src1, const Mat& src2, Mat& dst, const Mat& mask=Mat())
void bitwise_and(const Mat& src1, const Scalar& sc, Mat& dst, const Mat& mask=Mat())
void bitwise_and(const MatND& src1, const MatND& src2, MatND& dst, const MatND& mask=MatND())
void bitwise_and(const MatND& src1, const Scalar& sc, MatND& dst, const MatND& mask=MatND())
Calculates the per-element bit-wise conjunction of two arrays, or of an array and a scalar.
Parameters:
o src1 – The first source array
o src2 – The second source array. It must have the same size and same type as src1
o sc – Scalar; the second input parameter
o dst – The destination array; it will have the same size and same type as src1; see Mat::create
o mask – The optional operation mask, an 8-bit single channel array; it specifies the elements of the destination array to be changed
The functions bitwise_and compute the per-element bit-wise logical conjunction of two arrays:

dst(I) = src1(I) & src2(I)  if mask(I) != 0

or of an array and a scalar:

dst(I) = src1(I) & sc  if mask(I) != 0

In the case of floating-point arrays their machine-specific bit representations (usually IEEE754-compliant) are used for the operation, and in the case of multi-channel arrays each channel is processed independently.
See also: bitwise_or(), bitwise_xor(), bitwise_not()

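A short sketch of the scalar form, masking off the high bits of an 8-bit image (the file name is hypothetical):

Mat img = imread("lena.jpg", 0), low;
bitwise_and(img, Scalar(15), low); // keep only the 4 least significant bits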
cv::bitwise_not
void bitwise_not(const Mat& src, Mat& dst)
void bitwise_not(const MatND& src, MatND& dst)
Inverts every bit of an array
Parameters:
o src – The source array
o dst – The destination array; it is reallocated to be of the same size and the same type as src; see Mat::create
The functions bitwise_not compute the per-element bit-wise inversion of the source array:

dst(I) = ~src(I)

In the case of a floating-point source array its machine-specific bit representation (usually IEEE754-compliant) is used for the operation. In the case of multi-channel arrays each channel is processed independently.
See also: bitwise_and(), bitwise_or(), bitwise_xor()

cv::bitwise_or

void bitwise_or(const Mat& src1, const Mat& src2, Mat& dst, const Mat& mask=Mat())
void bitwise_or(const Mat& src1, const Scalar& sc, Mat& dst, const Mat& mask=Mat())
void bitwise_or(const MatND& src1, const MatND& src2, MatND& dst, const MatND& mask=MatND())
void bitwise_or(const MatND& src1, const Scalar& sc, MatND& dst, const MatND& mask=MatND())
Calculates the per-element bit-wise disjunction of two arrays, or of an array and a scalar.
Parameters:
o src1 – The first source array
o src2 – The second source array. It must have the same size and same type as src1
o sc – Scalar; the second input parameter
o dst – The destination array; it is reallocated to be of the same size and the same type as src1; see Mat::create
o mask – The optional operation mask, an 8-bit single channel array; it specifies the elements of the destination array to be changed
The functions bitwise_or compute the per-element bit-wise logical disjunction of two arrays:

dst(I) = src1(I) | src2(I)  if mask(I) != 0

or of an array and a scalar:

dst(I) = src1(I) | sc  if mask(I) != 0

In the case of floating-point arrays their machine-specific bit representations (usually IEEE754-compliant) are used for the operation. In the case of multi-channel arrays each channel is processed independently.
See also: bitwise_and(), bitwise_xor(), bitwise_not()

cv::bitwise_xor
void bitwise_xor(const Mat& src1, const Mat& src2, Mat& dst, const Mat& mask=Mat())
void bitwise_xor(const Mat& src1, const Scalar& sc, Mat& dst, const Mat& mask=Mat())
void bitwise_xor(const MatND& src1, const MatND& src2, MatND& dst, const MatND& mask=MatND())
void bitwise_xor(const MatND& src1, const Scalar& sc, MatND& dst, const MatND& mask=MatND())
Calculates the per-element bit-wise "exclusive or" operation on two arrays, or on an array and a scalar.
Parameters:
o src1 – The first source array
o src2 – The second source array. It must have the same size and same type as src1
o sc – Scalar; the second input parameter
o dst – The destination array; it is reallocated to be of the same size and the same type as src1; see Mat::create
o mask – The optional operation mask, an 8-bit single channel array; it specifies the elements of the destination array to be changed
The functions bitwise_xor compute the per-element bit-wise logical "exclusive or" operation on two arrays:

dst(I) = src1(I) ^ src2(I)  if mask(I) != 0

or on an array and a scalar:

dst(I) = src1(I) ^ sc  if mask(I) != 0

In the case of floating-point arrays their machine-specific bit representations (usually IEEE754-compliant) are used for the operation. In the case of multi-channel arrays each channel is processed independently.
See also: bitwise_and(), bitwise_or(), bitwise_not()

cv::calcCovarMatrix
void calcCovarMatrix(const Mat* samples, int nsamples, Mat& covar, Mat& mean, int flags, int ctype=CV_64F)
void calcCovarMatrix(const Mat& samples, Mat& covar, Mat& mean, int flags, int ctype=CV_64F)
Calculates the covariance matrix of a set of vectors
Parameters:
o samples – The samples, stored as separate matrices, or as rows or columns of a single matrix
o nsamples – The number of samples when they are stored separately
o covar – The output covariance matrix; it will have type=ctype and square size
o mean – The input or output (depending on the flags) array - the mean (average) vector of the input vectors
o flags – The operation flags, a combination of the following values:
    CV_COVAR_SCRAMBLED  The output covariance matrix is calculated as

        covar = scale * [v_0 - mean, v_1 - mean, ...]^T * [v_0 - mean, v_1 - mean, ...]

    that is, the covariance matrix will be nsamples x nsamples. Such an unusual covariance matrix is used for fast PCA of a set of very large vectors (see, for example, the EigenFaces technique for face recognition). Eigenvalues of this "scrambled" matrix will match the eigenvalues of the true covariance matrix, and the "true" eigenvectors can be easily calculated from the eigenvectors of the "scrambled" covariance matrix.
    CV_COVAR_NORMAL  The output covariance matrix is calculated as

        covar = scale * [v_0 - mean, v_1 - mean, ...] * [v_0 - mean, v_1 - mean, ...]^T

    that is, covar will be a square matrix of the same size as the total number of elements in each input vector. One and only one of CV_COVAR_SCRAMBLED and CV_COVAR_NORMAL must be specified.
    CV_COVAR_USE_AVG  If the flag is specified, the function does not calculate mean from the input vectors; instead, it uses the passed mean vector. This is useful if mean has been pre-computed or is known a-priori, or if the covariance matrix is calculated by parts - in this case, mean is not the mean vector of the input sub-set of vectors, but rather the mean vector of the whole set.
    CV_COVAR_SCALE  If the flag is specified, the covariance matrix is scaled. In the "normal" mode scale is 1./nsamples; in the "scrambled" mode scale is the reciprocal of the total number of elements in each input vector. By default (if the flag is not specified) the covariance matrix is not scaled (i.e. scale=1).
    CV_COVAR_ROWS  [Only useful in the second variant of the function] The flag means that all the input vectors are stored as rows of the samples matrix. mean should be a single-row vector in this case.
    CV_COVAR_COLS  [Only useful in the second variant of the function] The flag means that all the input vectors are stored as columns of the samples matrix. mean should be a single-column vector in this case.
o ctype – The type of the output covariance matrix
The functions calcCovarMatrix calculate the covariance matrix and, optionally, the mean vector of the set of input vectors.
See also: PCA(), mulTransposed(), Mahalanobis()

cv::cartToPolar
void cartToPolar(const Mat& x, const Mat& y, Mat& magnitude, Mat& angle, bool angleInDegrees=false)
Calculates the magnitude and angle of 2D vectors.
Parameters:
o x – The array of x-coordinates; it must be a single-precision or double-precision floating-point array
o y – The array of y-coordinates; it must have the same size and same type as x
o magnitude – The destination array of magnitudes of the same size and same type as x
o angle – The destination array of angles of the same size and same type as x. The angles are measured in radians (0 to 2*pi) or in degrees (0 to 360 degrees)
o angleInDegrees – The flag indicating whether the angles are measured in radians, which is the default mode, or in degrees
The function cartToPolar calculates either the magnitude, angle, or both of every 2D vector (x(I), y(I)):

magnitude(I) = sqrt( x(I)^2 + y(I)^2 )
angle(I) = atan2( y(I), x(I) )  (converted to degrees if angleInDegrees=true)

The angles are calculated with about 0.3 degree accuracy. For the (0, 0) point, the angle is set to 0.

cv::checkRange
bool checkRange(const Mat& src, bool quiet=true, Point* pos=0, double minVal=-DBL_MAX, double maxVal=DBL_MAX)
bool checkRange(const MatND& src, bool quiet=true, int* pos=0, double minVal=-DBL_MAX, double maxVal=DBL_MAX)
Checks every element of an input array for invalid values.
Parameters:
o src – The array to check
o quiet – The flag indicating whether the functions quietly return false when the array elements are out of range, or whether they throw an exception
o pos – The optional output parameter, where the position of the first outlier is stored. In the second function pos, when not NULL, must be a pointer to an array of src.dims elements
o minVal – The inclusive lower boundary of the valid values range
o maxVal – The exclusive upper boundary of the valid values range
The functions checkRange check that every array element is neither NaN nor +-infinity. When minVal > -DBL_MAX and maxVal < DBL_MAX, the functions also check that each value is between minVal and maxVal. If some values are out of range, the position of the first outlier is stored in pos (when pos is not NULL), and then the functions either return false (when quiet=true) or throw an exception.

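A typical cartToPolar use is converting image derivatives to gradient magnitude and orientation; a sketch assuming a grayscale input image (the file name is hypothetical):

Mat img = imread("lena.jpg", 0), gx, gy, mag, ang;
Sobel(img, gx, CV_32F, 1, 0);
Sobel(img, gy, CV_32F, 0, 1);
cartToPolar(gx, gy, mag, ang, true); // per-pixel gradient magnitude and
                                     // orientation in degrees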
cv::compare
void compare(const Mat& src1, const Mat& src2, Mat& dst, int cmpop)
void compare(const Mat& src1, double value, Mat& dst, int cmpop)
void compare(const MatND& src1, const MatND& src2, MatND& dst, int cmpop)
void compare(const MatND& src1, double value, MatND& dst, int cmpop)
Performs per-element comparison of two arrays or of an array and a scalar value.
Parameters:
o src1 – The first source array
o src2 – The second source array; it must have the same size and same type as src1
o value – The scalar value to compare each array element with
o dst – The destination array; it will have the same size as src1 and type=CV_8UC1
o cmpop – The flag specifying the relation between the elements to be checked:
    CMP_EQ  src1(I) is equal to src2(I) or value
    CMP_GT  src1(I) is greater than src2(I) or value
    CMP_GE  src1(I) is greater than or equal to src2(I) or value
    CMP_LT  src1(I) is less than src2(I) or value
    CMP_LE  src1(I) is less than or equal to src2(I) or value
    CMP_NE  src1(I) is not equal to src2(I) or value
The functions compare compare each element of src1 with the corresponding element of src2 or with the real scalar value. When the comparison result is true, the corresponding element of the destination array is set to 255, otherwise it is set to 0:

dst(I) = src1(I) cmpop src2(I) ? 255 : 0
dst(I) = src1(I) cmpop value ? 255 : 0

The comparison operations can be replaced with the equivalent matrix expressions:

Mat dst1 = src1 >= src2;
Mat dst2 = src1 < 8;
...

See also: checkRange(), min(), max(), threshold(), Matrix Expressions

cv::completeSymm
Comments from the Wiki
void completeSymm(Mat& mtx, bool lowerToUpper=false)
Copies the lower or the upper half of a square matrix to another half.
Parameters:
    o mtx– Input-output floating-point square matrix
    o lowerToUpper– If true, the lower half is copied to the upper half; otherwise the upper half is copied to the lower half
The function completeSymm copies one half of a square matrix to its another half; the matrix diagonal remains unchanged:
    • mtx(i,j) = mtx(j,i) for i > j if lowerToUpper=false
    • mtx(i,j) = mtx(j,i) for i < j if lowerToUpper=true
See also: flip() , transpose()

cv::convertScaleAbs
Comments from the Wiki
void convertScaleAbs(const Mat& src, Mat& dst, double alpha=1, double beta=0)
Scales, computes absolute values and converts the result to 8-bit.
Parameters:
    o src– The source array
    o dst– The destination array
    o alpha– The optional scale factor
    o beta– The optional delta added to the scaled values
On each element of the input array the function convertScaleAbs performs 3 operations sequentially: scaling, taking absolute value, conversion to unsigned 8-bit type:
    dst(I) = saturate_cast<uchar>(|src(I)*alpha + beta|)
In the case of multi-channel arrays the function processes each channel independently. When the output is not 8-bit, the operation can be emulated by calling the Mat::convertTo method (or by using matrix expressions) and then by computing the absolute value of the result, for example:
    Mat_<float> A(30,30);
    randu(A, Scalar(-100), Scalar(100));
    Mat_<float> B = A*5 + 3;
    B = abs(B);
    // Mat_<float> B = abs(A*5+3) will also do the job,
    // but it will allocate a temporary matrix
See also: Mat::convertTo() , abs()

cv::countNonZero
Comments from the Wiki
int countNonZero(const Mat& mtx)
int countNonZero(const MatND& mtx)
Counts non-zero array elements.
Parameters:
    o mtx– Single-channel array
The function countNonZero returns the number of non-zero elements in mtx.
See also: mean() , meanStdDev() , norm() , minMaxLoc() , calcCovarMatrix()

cv::cubeRoot
Comments from the Wiki
float cubeRoot(float val)
Computes cube root of the argument
Parameters:
    o val– The function argument
The function cubeRoot computes the cube root of val. Negative arguments are handled correctly. NaN and ±∞ are not handled. The accuracy approaches the maximum possible accuracy for single-precision data.

cv::cvarrToMat
Comments from the Wiki
Mat cvarrToMat(const CvArr* src, bool copyData=false, bool allowND=true, int coiMode=0)
Converts CvMat, IplImage or CvMatND to cv::Mat.
Parameters:
    o src– The source CvMat, IplImage or CvMatND
    o copyData– When it is false (default value), no data is copied, only the new header is created. In this case the original array should not be deallocated while the new matrix header is used, and the user has to preserve the data until the new header is destructed. When the parameter is true, all the data is copied, and the user may deallocate the original array right after the conversion
    o allowND– When it is true (default value), CvMatND is converted to Mat if it's possible (e.g. when the data is contiguous). If it's not possible, or when the parameter is false, the function will report an error
    o coiMode– The parameter specifies how the IplImage COI (when set) is handled. If coiMode=0, the function will report an error if COI is set. If coiMode=1, the function will never report an error; instead it returns the header to the whole original image and the user will have to check and process COI manually, see extractImageCOI()
The function cvarrToMat converts a CvMat, IplImage or CvMatND header to a Mat() header, and optionally duplicates the underlying data. The constructed header is returned by the function. When copyData=false, the conversion is done really fast (in O(1) time) and the newly created matrix header will have refcount=0, which means that no reference counting is done for the matrix data. Otherwise, when copyData=true, the new buffer will be allocated and managed as if you created a new matrix from scratch and copied the data there.
Normally, the function is used to convert an old-style 2D array (CvMat or IplImage) to Mat. The reverse transformation, from Mat() to CvMat or IplImage, can be done by simple assignment:
    CvMat* A = cvCreateMat(10, 10, CV_32F);
    cvSetIdentity(A);
    IplImage A1; cvGetImage(A, &A1);
    Mat B = cvarrToMat(A);
    Mat B1 = cvarrToMat(&A1);
    IplImage C = B;
    CvMat C1 = B1;
    // now A, A1, B, B1, C and C1 are different headers
    // for the same 10x10 floating-point array.
    // note, that you will need to use "&"
    // to pass C & C1 to OpenCV functions, e.g.:
    printf("%g\n", cvNorm(&C1, 0, CV_L2));
The function provides a uniform way of supporting the CvArr paradigm in the code that is migrated to use new-style data structures internally.

The function can also take CvMatND on input and create Mat() for it, if it's possible. For a CvMatND A it is possible if and only if A.dim[i].size*A.dim[i].step == A.dim[i-1].step for all, or for all but one, i, 0 < i < A.dims; that is, the matrix data should be continuous or it should be representable as a sequence of continuous matrices. By using this function in this way, you can process CvMatND using an arbitrary element-wise function. But for more complex operations, such as filtering functions, it will not work, and you need to convert CvMatND to MatND() using the corresponding constructor of the latter.
The last parameter, coiMode, specifies how to react on an image with COI set: by default it's 0, and then the function reports an error when an image with COI comes in; coiMode=1 means that no error is signaled - the user has to check COI presence and handle it manually. The modern structures, such as Mat() and MatND(), do not support COI natively. To process an individual channel of a new-style array, you will need either to organize a loop over the array (e.g. using matrix iterators) where the channel of interest will be processed, or extract the COI using mixChannels() (for new-style arrays) or extractImageCOI() (for old-style arrays), process this individual channel and insert it back to the destination array if needed (using mixChannels() or insertImageCOI(), respectively).
See also: cvGetImage() , cvGetMat() , cvGetMatND() , extractImageCOI() , insertImageCOI() , mixChannels()

cv::dct
Comments from the Wiki
void dct(const Mat& src, Mat& dst, int flags=0)
Performs a forward or inverse discrete cosine transform of 1D or 2D array
Parameters:
    o src– The source floating-point array
    o dst– The destination array; will have the same size and same type as src
    o flags– Transformation flags, a combination of the following values:
        DCT_INVERSE - do an inverse 1D or 2D transform instead of the default forward transform.
        DCT_ROWS - do a forward or inverse transform of every individual row of the input matrix. This flag allows user to transform multiple vectors simultaneously and can be used to decrease the overhead (which is sometimes several times larger than the processing itself), to do 3D and higher-dimensional transforms and so forth.
The function dct performs a forward or inverse discrete cosine transform (DCT) of a 1D or 2D floating-point array:

Forward Cosine transform of 1D vector of N elements:
    Y = C^(N) · X,
where C^(N)_jk = sqrt(alpha_j/N) · cos(pi·(2k+1)·j/(2N)) and alpha_0 = 1, alpha_j = 2 for j > 0.

Inverse Cosine transform of 1D vector of N elements:
    X = (C^(N))^-1 · Y = (C^(N))^T · Y
(since C^(N) is an orthogonal matrix, C^(N) · (C^(N))^T = I)

Forward Cosine transform of 2D M×N matrix:
    Y = C^(M) · X · (C^(N))^T

Inverse Cosine transform of 2D M×N matrix:
    X = (C^(M))^T · Y · C^(N)

The function chooses the mode of operation by looking at the flags and size of the input array:
    • if (flags & DCT_INVERSE) == 0, the function does a forward 1D or 2D transform; otherwise it is an inverse 1D or 2D transform.
    • if (flags & DCT_ROWS) ≠ 0, the function performs 1D transform of each row.
    • otherwise, if the array is a single column or a single row, the function performs a 1D transform.
    • otherwise it performs a 2D transform.
Important note: currently cv::dct supports even-size arrays (2, 4, 6, ...). For data analysis and approximation you can pad the array when necessary.
Also, the function's performance depends very much, and not monotonically, on the array size, see getOptimalDFTSize(). In the current implementation DCT of a vector of size N is computed via DFT of a vector of size N/2, thus the optimal DCT size can be computed as:
    size_t getOptimalDCTSize(size_t N) { return 2*getOptimalDFTSize((N+1)/2); }
See also: dft() , getOptimalDFTSize() , idct()
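A minimal roundtrip sketch: since the DCT matrix is orthogonal, a forward transform followed by an inverse transform restores the original signal up to rounding error (the signal length 64 below is an arbitrary even size):

    #include <opencv2/core/core.hpp>
    #include <cstdio>
    using namespace cv;

    int main()
    {
        // even-size 1D signal, as required by the current dct implementation
        Mat sig(1, 64, CV_32F), spec, rec;
        randu(sig, Scalar(-1), Scalar(1));

        dct(sig, spec);               // forward transform
        dct(spec, rec, DCT_INVERSE);  // inverse transform restores the signal
        printf("roundtrip error: %g\n", norm(sig, rec, NORM_INF));
        return 0;
    }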

cv::dft
Comments from the Wiki
void dft(const Mat& src, Mat& dst, int flags=0, int nonzeroRows=0)
Performs a forward or inverse Discrete Fourier transform of 1D or 2D floating-point array.
Parameters:
    o src– The source array, real or complex
    o dst– The destination array, whose size and type depend on the flags
    o flags– Transformation flags, a combination of the following values:
        DFT_INVERSE - do an inverse 1D or 2D transform instead of the default forward transform.
        DFT_SCALE - scale the result: divide it by the number of array elements. Normally, it is combined with DFT_INVERSE.
        DFT_ROWS - do a forward or inverse transform of every individual row of the input matrix. This flag allows the user to transform multiple vectors simultaneously and can be used to decrease the overhead (which is sometimes several times larger than the processing itself), to do 3D and higher-dimensional transforms and so forth.
        DFT_COMPLEX_OUTPUT - the function performs a forward transformation of a 1D or 2D real array; the result, though being a complex array, has complex-conjugate symmetry (CCS), see the description below. Such an array can be packed into a real array of the same size as input, which is the fastest option and which is what the function does by default. However, you may wish to get the full complex array (for simpler spectrum analysis etc.); pass the flag to tell the function to produce a full-size complex output array.
        DFT_REAL_OUTPUT - the function performs an inverse transformation of a 1D or 2D complex array; the result is normally a complex array of the same size. However, if the source array has conjugate-complex symmetry (for example, it is a result of a forward transformation with the DFT_COMPLEX_OUTPUT flag), then the output is a real array. While the function itself does not check whether the input is symmetrical or not, you can pass the flag and then the function will assume the symmetry and produce the real output array. Note that when the input is a packed real array and an inverse transformation is executed, the function treats the input as a packed complex-conjugate symmetrical array, so the output will also be a real array.

    o nonzeroRows– When the parameter ≠ 0, the function assumes that only the first nonzeroRows rows of the input array (when DFT_INVERSE is not set) or only the first nonzeroRows rows of the output array (when DFT_INVERSE is set) contain non-zeros, thus the function can handle the rest of the rows more efficiently and save some time. This technique is very useful for computing array cross-correlation or convolution using DFT.

Forward Fourier transform of 1D vector of N elements:
    Y = F^(N) · X,
where F^(N)_jk = exp(-2·pi·i·j·k/N) and i = sqrt(-1)

Inverse Fourier transform of 1D vector of N elements:
    X' = (F^(N))^-1 · Y = (F^(N))* · Y,  X = (1/N) · X',
where (F^(N))* denotes the element-wise complex conjugate of F^(N)

Forward Fourier transform of 2D vector of M×N elements:
    Y = F^(M) · X · F^(N)

Inverse Fourier transform of 2D vector of M×N elements:
    X' = (F^(M))* · Y · (F^(N))*,  X = (1/(M·N)) · X'

In the case of real (single-channel) data, the output spectrum of the forward Fourier transform, or the input spectrum of the inverse Fourier transform, can be represented in the packed format called CCS (complex-conjugate-symmetrical) that was borrowed from IPL and is used to represent the result of a forward Fourier transform or the input for an inverse Fourier transform. Here is how the 2D CCS spectrum of an M×N real array looks like:

    [ ReY(0,0)     ReY(0,1)   ImY(0,1)  ... ReY(0,N/2-1)  ImY(0,N/2-1)  ReY(0,N/2)
      ReY(1,0)     ReY(1,1)   ImY(1,1)  ... ReY(1,N/2-1)  ImY(1,N/2-1)  ReY(1,N/2)
      ImY(1,0)     ReY(2,1)   ImY(2,1)  ... ReY(2,N/2-1)  ImY(2,N/2-1)  ImY(1,N/2)
      ...
      ReY(M/2-1,0) ReY(M-3,1) ImY(M-3,1) ...                            ReY(M/2-1,N/2)
      ImY(M/2-1,0) ReY(M-2,1) ImY(M-2,1) ...                            ImY(M/2-1,N/2)
      ReY(M/2,0)   ReY(M-1,1) ImY(M-1,1) ...                            ReY(M/2,N/2) ]

In the case of a 1D transform of a real vector, the output looks like the first row of the above matrix.
So, the function chooses the operation mode depending on the flags and size of the input array:
    • if DFT_ROWS is set, or the input array has a single row or single column, the function performs a 1D forward or inverse transform (of each row of a matrix when DFT_ROWS is set, of the whole array otherwise); otherwise it performs a 2D transform.
    • if the input array is real and DFT_INVERSE is not set, the function does a forward 1D or 2D transform:
        o when DFT_COMPLEX_OUTPUT is set, the output will be a complex matrix of the same size as input.
        o otherwise the output will be a real matrix of the same size as input. In the case of 2D transform it will use the packed format as shown above; in the case of a single 1D transform it will look like the first row of the above matrix; in the case of multiple 1D transforms (when using the DFT_ROWS flag) each row of the output matrix will look like the first row of the above matrix.

    • otherwise, if the input array is complex and either DFT_INVERSE or DFT_REAL_OUTPUT is not set, the output will be a complex array of the same size as input, and the function will perform the forward or inverse 1D or 2D transform of the whole input array or each row of the input array independently, depending on the flags DFT_INVERSE and DFT_ROWS.
    • otherwise, i.e. when DFT_INVERSE is set, the input array is real, or it is complex but DFT_REAL_OUTPUT is set, the output will be a real array of the same size as input, and the function will perform a 1D or 2D inverse transformation of the whole input array or each individual row, depending on the flags DFT_INVERSE and DFT_ROWS.
The scaling is done after the transformation if DFT_SCALE is set.
Unlike dct(), the function supports arrays of arbitrary size, but only those arrays are processed efficiently whose sizes can be factorized in a product of small prime numbers (2, 3 and 5 in the current implementation). Such an efficient DFT size can be computed using the getOptimalDFTSize() method.
Here is the sample on how to compute DFT-based convolution of two 2D real arrays:
    void convolveDFT(const Mat& A, const Mat& B, Mat& C)
    {
        // reallocate the output array if needed
        C.create(abs(A.rows - B.rows)+1, abs(A.cols - B.cols)+1, A.type());
        Size dftSize;
        // compute the size of DFT transform
        dftSize.width = getOptimalDFTSize(A.cols + B.cols - 1);
        dftSize.height = getOptimalDFTSize(A.rows + B.rows - 1);

        // allocate temporary buffers and initialize them with 0's
        Mat tempA(dftSize, A.type(), Scalar::all(0));
        Mat tempB(dftSize, B.type(), Scalar::all(0));

        // copy A and B to the top-left corners of tempA and tempB, respectively
        Mat roiA(tempA, Rect(0, 0, A.cols, A.rows));
        A.copyTo(roiA);
        Mat roiB(tempB, Rect(0, 0, B.cols, B.rows));
        B.copyTo(roiB);

        // now transform the padded A & B in-place;
        // use "nonzeroRows" hint for faster processing
        dft(tempA, tempA, 0, A.rows);
        dft(tempB, tempB, 0, B.rows);

        // multiply the spectrums;
        // the function handles packed spectrum representations well
        mulSpectrums(tempA, tempB, tempA, 0);

        // transform the product back from the frequency domain.
        // Even though all the result rows will be non-zero,
        // we need only the first C.rows of them, and thus we
        // pass nonzeroRows == C.rows
        dft(tempA, tempA, DFT_INVERSE + DFT_SCALE, C.rows);

        // now copy the result back to C.
        tempA(Rect(0, 0, C.cols, C.rows)).copyTo(C);

        // all the temporary buffers will be deallocated automatically
    }

What can be optimized in the above sample?
    • Since we passed nonzeroRows ≠ 0 to the forward transform calls, and since we copied A / B to the top-left corners of tempA / tempB, respectively, it's not necessary to clear the whole tempA and tempB; it is only necessary to clear the tempA.cols - A.cols (tempB.cols - B.cols) rightmost columns of the matrices.
    • This DFT-based convolution does not have to be applied to the whole big arrays, especially if B is significantly smaller than A or vice versa. Instead, we can compute convolution by parts. For that we need to split the destination array C into multiple tiles, and for each tile estimate which parts of A and B are required to compute convolution in this tile. If the tiles in C are too small, the speed will decrease a lot because of repeated work - in the ultimate case, when each tile in C is a single pixel, the algorithm becomes equivalent to the naive convolution algorithm. If the tiles are too big, the temporary arrays tempA and tempB become too big, and there is also a slowdown because of bad cache locality. So there is an optimal tile size somewhere in the middle.
    • If the convolution is done by parts, the loop can be threaded, since different tiles in C can be computed in parallel.
All of the above improvements have been implemented in matchTemplate() and filter2D(); therefore, by using them, you can get even better performance than with the above theoretically optimal implementation (though, those two functions actually compute cross-correlation, not convolution, so you will need to "flip" the kernel or the image around the center using flip()).
See also: dct() , getOptimalDFTSize() , mulSpectrums() , filter2D() , matchTemplate() , flip() , cartToPolar() , magnitude() , phase()

cv::divide
Comments from the Wiki
void divide(const Mat& src1, const Mat& src2, Mat& dst, double scale=1)
void divide(double scale, const Mat& src2, Mat& dst)
void divide(const MatND& src1, const MatND& src2, MatND& dst, double scale=1)
void divide(double scale, const MatND& src2, MatND& dst)
Performs per-element division of two arrays or a scalar by an array.
Parameters:
    o src1– The first source array
    o src2– The second source array; should have the same size and same type as src1
    o scale– Scale factor
    o dst– The destination array; will have the same size and same type as src2
The functions divide divide one array by another:
    dst(I) = saturate(src1(I)*scale / src2(I))
or a scalar by an array, when there is no src1:
    dst(I) = saturate(scale / src2(I))
The result will have the same type as src1. When src2(I)=0, dst(I)=0 too.
See also: multiply() , add() , subtract() , Matrix Expressions
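A short sketch of the two forms of divide and the equivalent matrix expression (the matrix contents below are arbitrary):

    #include <opencv2/core/core.hpp>
    using namespace cv;

    int main()
    {
        Mat num(2, 2, CV_32F, Scalar(10)), den(2, 2, CV_32F, Scalar(4));

        Mat q1, q2;
        divide(num, den, q1, 2); // q1(I) = 2*num(I)/den(I) = 5
        divide(1.0, den, q2);    // q2(I) = 1/den(I) = 0.25 (reciprocal form)

        // the first call can also be written as a matrix expression:
        Mat q3 = 2*num/den;
        return 0;
    }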

cv::determinant
Comments from the Wiki
double determinant(const Mat& mtx)
Returns determinant of a square floating-point matrix.
Parameters:
    o mtx– The input matrix; must have CV_32FC1 or CV_64FC1 type and square size
The function determinant computes and returns the determinant of the specified matrix. For small matrices (mtx.cols=mtx.rows<=3) the direct method is used; for larger matrices the function uses LU factorization. For symmetric positive-determined matrices, it is also possible to compute SVD(): mtx = U·W·V^T and then calculate the determinant as a product of the diagonal elements of W.
See also: SVD() , trace() , invert() , solve() , Matrix Expressions

cv::eigen
Comments from the Wiki
bool eigen(const Mat& src, Mat& eigenvalues, int lowindex=-1, int highindex=-1)
bool eigen(const Mat& src, Mat& eigenvalues, Mat& eigenvectors, int lowindex=-1, int highindex=-1)
Computes eigenvalues and eigenvectors of a symmetric matrix.
Parameters:
    o src– The input matrix; must have CV_32FC1 or CV_64FC1 type, square size and be symmetric: src^T = src
    o eigenvalues– The output vector of eigenvalues of the same type as src; the eigenvalues are stored in the descending order
    o eigenvectors– The output matrix of eigenvectors; it will have the same size and the same type as src; the eigenvectors are stored as subsequent matrix rows, in the same order as the corresponding eigenvalues
    o lowindex– Optional index of largest eigenvalue/-vector to calculate. (See below.)
    o highindex– Optional index of smallest eigenvalue/-vector to calculate. (See below.)
The functions eigen compute just eigenvalues, or eigenvalues and eigenvectors of the symmetric matrix src:
    src*eigenvectors(i,:)' = eigenvalues(i)*eigenvectors(i,:)' (in MATLAB notation)
If either low- or highindex is supplied, the other is required, too. Indexing is 0-based. Example: to calculate the largest eigenvector/-value, set lowindex = highindex = 0. For legacy reasons this function always returns a square matrix the same size as the source matrix with eigenvectors, and a vector the length of the source matrix with eigenvalues. The selected eigenvectors/-values are always in the first highindex - lowindex + 1 rows.
See also: SVD() , completeSymm() , PCA()
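A minimal sketch of the second form; the 3×3 symmetric matrix values below are arbitrary:

    #include <opencv2/core/core.hpp>
    #include <cstdio>
    using namespace cv;

    int main()
    {
        Mat A = (Mat_<double>(3,3) <<
                 4, 1, 0,
                 1, 3, 1,
                 0, 1, 2);

        Mat evals, evecs;
        eigen(A, evals, evecs); // all eigenvalues, in descending order

        // eigenvectors are the rows of evecs:
        // A * evecs.row(i).t() == evals.at<double>(i) * evecs.row(i).t()
        for( int i = 0; i < evals.rows; i++ )
            printf("lambda_%d = %g\n", i, evals.at<double>(i));
        return 0;
    }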

cv::exp
Comments from the Wiki
void exp(const Mat& src, Mat& dst)
void exp(const MatND& src, MatND& dst)
Calculates the exponent of every array element.
Parameters:
    o src– The source array
    o dst– The destination array; will have the same size and same type as src
The function exp calculates the exponent of every element of the input array:
    dst(I) = e^src(I)
The maximum relative error is about 7e-6 for single-precision and less than 1e-10 for double-precision input. Currently, the function converts denormalized values to zeros on output. Special values (NaN, ±∞) are not handled.
See also: log() , cartToPolar() , polarToCart() , phase() , pow() , sqrt() , magnitude()

cv::extractImageCOI
Comments from the Wiki
void extractImageCOI(const CvArr* src, Mat& dst, int coi=-1)
Extract the selected image channel
Parameters:
    o src– The source array. It should be a pointer to CvMat or IplImage
    o dst– The destination array; will have single-channel, and the same size and the same depth as src
    o coi– If the parameter is >=0, it specifies the channel to extract. If it is <0, src must be a pointer to IplImage with a valid COI set; then the selected COI is extracted
The function extractImageCOI is used to extract the image COI from an old-style array and put the result to the new-style C++ matrix. As usual, the destination matrix is reallocated using Mat::create if needed. To extract a channel from a new-style matrix, use mixChannels() or split().
See also: mixChannels() , split() , merge() , cvarrToMat() , cvSetImageCOI() , cvGetImageCOI()

cv::fastAtan2
Comments from the Wiki
float fastAtan2(float y, float x)
Calculates the angle of a 2D vector in degrees
Parameters:
    o x– x-coordinate of the vector
    o y– y-coordinate of the vector
The function fastAtan2 calculates the full-range angle of an input 2D vector. The angle is measured in degrees and varies from 0 to 360 degrees. The accuracy is about 0.3 degrees.

cv::flip
Comments from the Wiki
void flip(const Mat& src, Mat& dst, int flipCode)
Flips a 2D array around vertical, horizontal or both axes.
Parameters:
    o src– The source array
    o dst– The destination array; will have the same size and same type as src
    o flipCode– Specifies how to flip the array: 0 means flipping around the x-axis, positive (e.g., 1) means flipping around the y-axis, and negative (e.g., -1) means flipping around both axes. See also the discussion below for the formulas
The function flip flips the array in one of three different ways (row and column indices are 0-based):
    dst(i,j) = src(src.rows-i-1, j)             if flipCode = 0
    dst(i,j) = src(i, src.cols-j-1)             if flipCode > 0
    dst(i,j) = src(src.rows-i-1, src.cols-j-1)  if flipCode < 0
The example scenarios of function use are:
    • vertical flipping of the image (flipCode = 0) to switch between top-left and bottom-left image origin, which is a typical operation in video processing in Windows.
    • horizontal flipping of the image with subsequent horizontal shift and absolute difference calculation to check for a vertical-axis symmetry (flipCode > 0)
    • simultaneous horizontal and vertical flipping of the image with subsequent shift and absolute difference calculation to check for a central symmetry (flipCode < 0)
    • reversing the order of 1d point arrays (flipCode = 0 or flipCode > 0)
See also: transpose() , repeat() , completeSymm()
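A minimal sketch of the three flip modes, including the vertical-axis symmetry check mentioned above (the image size is arbitrary and the content is left black for brevity):

    #include <opencv2/core/core.hpp>
    using namespace cv;

    int main()
    {
        Mat img(240, 320, CV_8UC3, Scalar::all(0));

        Mat flippedV, flippedH, rotated180;
        flip(img, flippedV, 0);    // around the x-axis (vertical flip)
        flip(img, flippedH, 1);    // around the y-axis (horizontal flip)
        flip(img, rotated180, -1); // both axes: same as a 180-degree rotation

        // vertical-axis symmetry check: compare the image with its mirror
        Mat diff;
        absdiff(img, flippedH, diff);
        return 0;
    }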

cv::gemm
Comments from the Wiki
void gemm(const Mat& src1, const Mat& src2, double alpha, const Mat& src3, double beta, Mat& dst, int flags=0)
Performs generalized matrix multiplication.
Parameters:
    o src1– The first multiplied input matrix; should have CV_32FC1, CV_64FC1, CV_32FC2 or CV_64FC2 type
    o src2– The second multiplied input matrix; should have the same type as src1
    o alpha– The weight of the matrix product
    o src3– The third optional delta matrix added to the matrix product; should have the same type as src1 and src2
    o beta– The weight of src3
    o dst– The destination matrix. It will have the proper size and the same type as the input matrices
    o flags– Operation flags:
        GEMM_1_T - transpose src1
        GEMM_2_T - transpose src2
        GEMM_3_T - transpose src3
The function performs generalized matrix multiplication similar to the corresponding functions *gemm in BLAS level 3. For example, gemm(src1, src2, alpha, src3, beta, dst, GEMM_1_T + GEMM_3_T) corresponds to
    dst = alpha·src1^T·src2 + beta·src3^T
The function can be replaced with a matrix expression, e.g. the above call can be replaced with:
    dst = alpha*src1.t()*src2 + beta*src3.t();
See also: mulTransposed() , transform() , Matrix Expressions

cv::getConvertElem
Comments from the Wiki
ConvertData getConvertElem(int fromType, int toType)
ConvertScaleData getConvertScaleElem(int fromType, int toType)
typedef void (*ConvertData)(const void* from, void* to, int cn)
typedef void (*ConvertScaleData)(const void* from, void* to, int cn, double alpha, double beta)
Returns conversion function for a single pixel
Parameters:
    o fromType– The source pixel type
    o toType– The destination pixel type
    o from– Callback parameter: pointer to the input pixel
    o to– Callback parameter: pointer to the output pixel
    o cn– Callback parameter: the number of channels; can be arbitrary: 1, 100, 100000, ...
    o alpha– ConvertScaleData callback optional parameter: the scale factor
    o beta– ConvertScaleData callback optional parameter: the delta or offset
The functions getConvertElem and getConvertScaleElem return pointers to the functions for converting individual pixels from one type to another. While the main function purpose is to convert single pixels (actually, for converting sparse matrices from one type to another), you can use them to convert the whole row of a dense matrix or the whole matrix at once, by setting cn = matrix.rows*matrix.cols*matrix.channels() if the matrix data is continuous.
See also: Mat::convertTo() , MatND::convertTo() , SparseMat::convertTo()

cv::getOptimalDFTSize
Comments from the Wiki
int getOptimalDFTSize(int vecsize)
Returns optimal DFT size for a given vector size.
Parameters:
    o vecsize– Vector size
DFT performance is not a monotonic function of a vector size; therefore, when you compute the convolution of two arrays or do a spectral analysis of an array, it usually makes sense to pad the input data with zeros to get a bit larger array that can be transformed much faster than the original one. Arrays whose size is a power-of-two (2, 4, 8, 16, 32, ...) are the fastest to process; though, the arrays whose size is a product of 2's, 3's and 5's (e.g. 300 = 5*5*3*2*2) are also processed quite efficiently.
The function getOptimalDFTSize returns the minimum number N that is greater than or equal to vecsize, such that the DFT of a vector of size N can be computed efficiently. In the current implementation N = 2^p * 3^q * 5^r, for some integer p, q, r.
The function returns a negative number if vecsize is too large (very close to INT_MAX).
While the function cannot be used directly to estimate the optimal vector size for the DCT transform (since the current DCT implementation supports only even-size vectors), it can be easily computed as getOptimalDFTSize((vecsize+1)/2)*2.
See also: dft() , dct() , idft() , idct() , mulSpectrums()

cv::idct
Comments from the Wiki
void idct(const Mat& src, Mat& dst, int flags=0)
Computes inverse Discrete Cosine Transform of a 1D or 2D array
Parameters:
    o src– The source floating-point single-channel array
    o dst– The destination array; will have the same size and same type as src
    o flags– The operation flags
idct(src, dst, flags) is equivalent to dct(src, dst, flags | DCT_INVERSE). See dct() for details.
See also: dct() , dft() , idft() , getOptimalDFTSize()

cv::idft
Comments from the Wiki
void idft(const Mat& src, Mat& dst, int flags=0, int nonzeroRows=0)
Computes inverse Discrete Fourier Transform of a 1D or 2D array
Parameters:
    o src– The source floating-point real or complex array
    o dst– The destination array, whose size and type depend on the flags
    o flags– The operation flags. See dft()
    o nonzeroRows– The number of dst rows to compute. The rest of the rows will have undefined content. See the convolution sample in the dft() description
idft(src, dst, flags) is equivalent to dft(src, dst, flags | DFT_INVERSE). See dft() for details.
Note that none of dft and idft scale the result by default. Thus, you should pass DFT_SCALE to one of dft or idft explicitly to make these transforms mutually inverse.
See also: dft() , dct() , idct() , mulSpectrums() , getOptimalDFTSize()

cv::inRange
Comments from the Wiki
void inRange(const Mat& src, const Mat& lowerb, const Mat& upperb, Mat& dst)
void inRange(const Mat& src, const Scalar& lowerb, const Scalar& upperb, Mat& dst)
void inRange(const MatND& src, const MatND& lowerb, const MatND& upperb, MatND& dst)
void inRange(const MatND& src, const Scalar& lowerb, const Scalar& upperb, MatND& dst)
Checks if array elements lie between the elements of two other arrays.
Parameters:
    o src– The first source array
    o lowerb– The inclusive lower boundary array of the same size and type as src
    o upperb– The exclusive upper boundary array of the same size and type as src
    o dst– The destination array; will have the same size as src and CV_8U type
The functions inRange do the range check for every element of the input array:
    dst(I) = lowerb(I) <= src(I) < upperb(I)
for single-channel arrays; for two-channel (and multi-channel) arrays the check is performed for every channel, and dst(I) is non-zero only when all channels are within their ranges. dst(I) is set to 255 (all 1-bits) if src(I) is within the specified range and 0 otherwise.
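A minimal sketch of the scalar-bounds form, producing a mask of pixels inside a given per-channel range (the bounds below are arbitrary example values):

    #include <opencv2/core/core.hpp>
    using namespace cv;

    int main()
    {
        // a 3-channel image with arbitrary content
        Mat img(240, 320, CV_8UC3);
        randu(img, Scalar::all(0), Scalar::all(256));

        // select pixels whose channels all lie in the given ranges;
        // remember that the upper boundary is exclusive
        Mat mask;
        inRange(img, Scalar(0, 64, 64), Scalar(180, 256, 256), mask);
        // mask is CV_8U: 255 where the pixel is inside the range, 0 elsewhere
        return 0;
    }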

cv::invert
Comments from the Wiki
double invert(const Mat& src, Mat& dst, int method=DECOMP_LU)
Finds the inverse or pseudo-inverse of a matrix
Parameters:
    o src– The source floating-point M×N matrix
    o dst– The destination matrix; will have N×M size and the same type as src
    o method– The inversion method:
        DECOMP_LU - Gaussian elimination with optimal pivot element chosen
        DECOMP_SVD - Singular value decomposition (SVD) method
        DECOMP_CHOLESKY - Cholesky decomposition; the matrix must be symmetrical and positively defined
The function invert inverts matrix src and stores the result in dst. When the matrix src is singular or non-square, the function computes the pseudo-inverse matrix, i.e. the matrix dst, such that ||src·dst - I|| is minimal.
In the case of the DECOMP_LU method, the function returns the src determinant (src must be square). If it is 0, the matrix is not inverted and dst is filled with zeros.
In the case of the DECOMP_SVD method, the function returns the inversed condition number of src (the ratio of the smallest singular value to the largest singular value), and 0 if src is singular. The SVD method calculates a pseudo-inverse matrix if src is singular.
Similarly to DECOMP_LU, the DECOMP_CHOLESKY method works only with non-singular square matrices. In this case the function stores the inverted matrix in dst and returns non-zero; otherwise it returns 0.
See also: solve() , SVD()
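A minimal sketch using the DECOMP_LU return value as a singularity check (the 2×2 matrix values below are arbitrary):

    #include <opencv2/core/core.hpp>
    #include <cstdio>
    using namespace cv;

    int main()
    {
        Mat A = (Mat_<float>(2,2) << 2, 1,
                                     1, 3);
        Mat Ainv;
        double d = invert(A, Ainv, DECOMP_LU); // returns det(A) for DECOMP_LU
        if( d == 0 )
            printf("matrix is singular\n");
        else
        {
            // A*Ainv should be close to the identity matrix
            Mat P = A*Ainv;
            printf("|A*Ainv - I| = %g\n",
                   norm(P, Mat::eye(2, 2, CV_32F), NORM_INF));
        }
        return 0;
    }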

cv::log
Comments from the Wiki
void log(const Mat& src, Mat& dst)
void log(const MatND& src, MatND& dst)
Calculates the natural logarithm of every array element.
Parameters:
    o src– The source array
    o dst– The destination array; will have the same size and same type as src
The function log calculates the natural logarithm of the absolute value of every element of the input array:
    dst(I) = log|src(I)| if src(I) ≠ 0, and dst(I) = C otherwise,
where C is a large negative number (about -700 in the current implementation). The maximum relative error is about 7e-6 for single-precision input and less than 1e-10 for double-precision input. Special values (NaN, ±∞) are not handled.
See also: exp() , cartToPolar() , polarToCart() , phase() , pow() , sqrt() , magnitude()

cv::LUT
Comments from the Wiki
void LUT(const Mat& src, const Mat& lut, Mat& dst)
Performs a look-up table transform of an array.
Parameters:
    o src– Source array of 8-bit elements
    o lut– Look-up table of 256 elements. In the case of a multi-channel source array, the table should either have a single channel (in this case the same table is used for all channels) or the same number of channels as in the source array
    o dst– Destination array; will have the same size and the same number of channels as src, and the same depth as lut
The function LUT fills the destination array with values from the look-up table. Indices of the entries are taken from the source array. That is, the function processes each element of src as follows:
    dst(I) = lut(src(I) + d),
where d = 0 for an 8-bit unsigned source and d = 128 for an 8-bit signed source.
See also: convertScaleAbs() , Mat::convertTo

cv::magnitude
Comments from the Wiki
void magnitude(const Mat& x, const Mat& y, Mat& magnitude)
Calculates magnitude of 2D vectors.
Parameters:
    o x– The floating-point array of x-coordinates of the vectors
    o y– The floating-point array of y-coordinates of the vectors; must have the same size as x
    o magnitude– The destination array of magnitudes; will have the same size and same type as x
The function magnitude calculates the magnitude of 2D vectors formed from the corresponding elements of the x and y arrays:
    dst(I) = sqrt(x(I)^2 + y(I)^2)
See also: cartToPolar() , polarToCart() , phase() , sqrt()

cv::Mahalanobis
Comments from the Wiki
double Mahalanobis(const Mat& vec1, const Mat& vec2, const Mat& icovar)
Calculates the Mahalanobis distance between two vectors.
Parameters:
    o vec1– The first 1D source vector
    o vec2– The second 1D source vector
    o icovar– The inverse covariance matrix
The function Mahalanobis calculates and returns the weighted distance between two vectors:
    d(vec1, vec2) = sqrt( sum_{i,j} icovar(i,j) · (vec1(i) - vec2(i)) · (vec1(j) - vec2(j)) )
The covariance matrix may be calculated using the calcCovarMatrix() function and then inverted using the invert() function (preferably using the DECOMP_SVD method, as the most accurate).
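A minimal sketch tying the last three steps together: estimate a covariance matrix from samples, invert it with DECOMP_SVD as recommended above, and compute a Mahalanobis distance (the sample data below is random and purely illustrative):

    #include <opencv2/core/core.hpp>
    #include <cstdio>
    using namespace cv;

    int main()
    {
        // 100 random 3-dimensional samples, one per row
        Mat samples(100, 3, CV_64F);
        randu(samples, Scalar(0), Scalar(1));

        Mat covar, mean;
        calcCovarMatrix(samples, covar, mean,
                        CV_COVAR_NORMAL | CV_COVAR_ROWS | CV_COVAR_SCALE);

        // invert the covariance matrix with the SVD method
        Mat icovar;
        invert(covar, icovar, DECOMP_SVD);

        double d = Mahalanobis(samples.row(0), samples.row(1), icovar);
        printf("Mahalanobis distance = %g\n", d);
        return 0;
    }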

cv::max
Comments from the Wiki
Mat_Expr<...> max(const Mat& src1, const Mat& src2)
Mat_Expr<...> max(const Mat& src1, double value)
Mat_Expr<...> max(double value, const Mat& src1)
void max(const Mat& src1, const Mat& src2, Mat& dst)
void max(const Mat& src1, double value, Mat& dst)
void max(const MatND& src1, const MatND& src2, MatND& dst)
void max(const MatND& src1, double value, MatND& dst)
Calculates per-element maximum of two arrays or array and a scalar
Parameters:
    o src1– The first source array
    o src2– The second source array of the same size and type as src1
    o value– The real scalar value
    o dst– The destination array; will have the same size and type as src1
The functions max compute the per-element maximum of two arrays:
    dst(I) = max(src1(I), src2(I))
or array and a scalar:
    dst(I) = max(src1(I), value)
In the second variant, when the source array is multi-channel, each channel is compared with value independently.
The first 3 variants of the function listed above are actually a part of Matrix Expressions; they return the expression object that can be further transformed, or assigned to a matrix, or passed to a function etc.
See also: min() , compare() , inRange() , minMaxLoc() , Matrix Expressions

cv::mean
Comments from the Wiki
Scalar mean(const Mat& mtx)
Scalar mean(const Mat& mtx, const Mat& mask)
Scalar mean(const MatND& mtx)
Scalar mean(const MatND& mtx, const MatND& mask)
Calculates average (mean) of array elements
Parameters:
    o mtx– The source array; it should have 1 to 4 channels (so that the result can be stored in Scalar())
    o mask– The optional operation mask
The functions mean compute the mean value M of array elements, independently for each channel, and return it:
    N = sum over I with mask(I) ≠ 0 of 1
    M_c = ( sum over I with mask(I) ≠ 0 of mtx(I)_c ) / N
When all the mask elements are 0's, the functions return Scalar::all(0).
See also: countNonZero() , meanStdDev() , norm() , minMaxLoc()

cv::meanStdDev
Comments from the Wiki
void meanStdDev(const Mat& mtx, Scalar& mean, Scalar& stddev, const Mat& mask=Mat())
void meanStdDev(const MatND& mtx, Scalar& mean, Scalar& stddev, const MatND& mask=MatND())
Calculates mean and standard deviation of array elements
Parameters:
    o mtx– The source array; it should have 1 to 4 channels (so that the results can be stored in Scalar()'s)
    o mean– The output parameter: computed mean value
    o stddev– The output parameter: computed standard deviation
    o mask– The optional operation mask
The functions meanStdDev compute the mean and the standard deviation of array elements, independently for each channel, and return the results via the output parameters:
    N = sum over I with mask(I) ≠ 0 of 1
    mean_c = ( sum over I with mask(I) ≠ 0 of src(I)_c ) / N
    stddev_c = sqrt( ( sum over I with mask(I) ≠ 0 of (src(I)_c - mean_c)^2 ) / N )
When all the mask elements are 0's, the functions return mean=stddev=Scalar::all(0).
Note that the computed standard deviation is only the diagonal of the complete normalized covariance matrix. If the full matrix is needed, you can reshape the multi-channel array to a single-channel array (only possible when the matrix is continuous) and then pass the matrix to calcCovarMatrix().
See also: countNonZero() , mean() , norm() , minMaxLoc() , calcCovarMatrix()

cv::merge
Comments from the Wiki
void merge(const Mat* mv, size_t count, Mat& dst)
void merge(const vector<Mat>& mv, Mat& dst)
void merge(const MatND* mv, size_t count, MatND& dst)
void merge(const vector<MatND>& mv, MatND& dst)
Composes a multi-channel array from several single-channel arrays.
Parameters:
    o mv– The source array or vector of the single-channel matrices to be merged. All the matrices in mv must have the same size and the same type
    o count– The number of source matrices when mv is a plain C array; must be greater than zero
    o dst– The destination array; will have the same size and the same depth as mv[0], the number of channels will match the number of source matrices
The functions merge merge several single-channel arrays (or rather interleave their elements) to make a single multi-channel array:
    dst(I)_c = mv[c](I)
The function split() does the reverse operation, and if you need to merge several multi-channel images or shuffle channels in some other advanced way, use mixChannels().
See also: mixChannels() , split() , reshape()
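A minimal split/merge roundtrip sketch (the image size and the choice of channel to clear are arbitrary):

    #include <opencv2/core/core.hpp>
    #include <vector>
    using namespace cv;

    int main()
    {
        Mat img(240, 320, CV_8UC3);
        randu(img, Scalar::all(0), Scalar::all(256));

        // split into single-channel planes, zero one plane, merge back
        std::vector<Mat> planes;
        split(img, planes);
        planes[2] = Scalar(0); // clear the third channel
        merge(planes, img);
        return 0;
    }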

cv::min
Comments from the Wiki
Mat_Expr<...> min(const Mat& src1, const Mat& src2)
Mat_Expr<...> min(const Mat& src1, double value)
Mat_Expr<...> min(double value, const Mat& src1)
void min(const Mat& src1, const Mat& src2, Mat& dst)
void min(const Mat& src1, double value, Mat& dst)
void min(const MatND& src1, const MatND& src2, MatND& dst)
void min(const MatND& src1, double value, MatND& dst)
Calculates per-element minimum of two arrays or array and a scalar
Parameters:
    o src1– The first source array
    o src2– The second source array of the same size and type as src1
    o value– The real scalar value
    o dst– The destination array; will have the same size and type as src1
The functions min compute the per-element minimum of two arrays:
    dst(I) = min(src1(I), src2(I))
or array and a scalar:
    dst(I) = min(src1(I), value)
In the second variant, when the source array is multi-channel, each channel is compared with value independently.
The first 3 variants of the function listed above are actually a part of Matrix Expressions; they return the expression object that can be further transformed, or assigned to a matrix, or passed to a function etc.
See also: max() , compare() , inRange() , minMaxLoc() , Matrix Expressions

cv::minMaxLoc
Comments from the Wiki
void minMaxLoc(const Mat& src, double* minVal, double* maxVal=0, Point* minLoc=0, Point* maxLoc=0, const Mat& mask=Mat())
void minMaxLoc(const MatND& src, double* minVal, double* maxVal, int* minIdx=0, int* maxIdx=0, const MatND& mask=MatND())
void minMaxLoc(const SparseMat& src, double* minVal, double* maxVal, int* minIdx=0, int* maxIdx=0)
Finds global minimum and maximum in a whole array or sub-array
Parameters:
    o src– The source single-channel array
    o minVal– Pointer to returned minimum value; NULL if not required
    o maxVal– Pointer to returned maximum value; NULL if not required
    o minLoc– Pointer to returned minimum location (in the 2D case); NULL if not required
    o maxLoc– Pointer to returned maximum location (in the 2D case); NULL if not required
    o minIdx– Pointer to returned minimum location (in the nD case); NULL if not required, otherwise must point to an array of src.dims elements; the coordinates of the minimum element in each dimension will be stored there sequentially
    o maxIdx– Pointer to returned maximum location (in the nD case); NULL if not required
    o mask– The optional mask used to select a sub-array
The functions minMaxLoc find the minimum and maximum element values and their positions. The extremums are searched across the whole array, or, if mask is not an empty array, in the specified array region. In the case of a sparse matrix the minimum is found among non-zero elements only.
The functions do not work with multi-channel arrays. If you need to find minimum or maximum elements across all the channels, use reshape() first to reinterpret the array as single-channel. Or you may extract the particular channel using extractImageCOI(), mixChannels() or split().
See also: max() , min() , compare() , inRange() , extractImageCOI() , mixChannels() , split() , reshape()
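A minimal sketch of the 2D form (the image content is random, so the reported values will vary from run to run):

    #include <opencv2/core/core.hpp>
    #include <cstdio>
    using namespace cv;

    int main()
    {
        Mat img(240, 320, CV_8U);
        randu(img, Scalar(0), Scalar(256));

        double minVal, maxVal;
        Point minLoc, maxLoc;
        minMaxLoc(img, &minVal, &maxVal, &minLoc, &maxLoc);
        printf("min %g at (%d,%d), max %g at (%d,%d)\n",
               minVal, minLoc.x, minLoc.y, maxVal, maxLoc.x, maxLoc.y);
        return 0;
    }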

cv::mixChannels
Comments from the Wiki
void mixChannels(const Mat* srcv, int nsrc, Mat* dstv, int ndst, const int* fromTo, size_t npairs)
void mixChannels(const MatND* srcv, int nsrc, MatND* dstv, int ndst, const int* fromTo, size_t npairs)
void mixChannels(const vector<Mat>& srcv, vector<Mat>& dstv, const int* fromTo, int npairs)
void mixChannels(const vector<MatND>& srcv, vector<MatND>& dstv, const int* fromTo, int npairs)
Copies specified channels from input arrays to the specified channels of output arrays
Parameters:
    o srcv– The input array or vector of matrices. All the matrices must have the same size and the same depth
    o nsrc– The number of elements in srcv
    o dstv– The output array or vector of matrices. All the matrices must be allocated; their size and depth must be the same as in srcv[0]
    o ndst– The number of elements in dstv
    o fromTo– The array of index pairs, specifying which channels are copied and where. fromTo[k*2] is the 0-based index of the input channel in srcv and fromTo[k*2+1] is the index of the output channel in dstv. Here the continuous channel numbering is used: the first input image channels are indexed from 0 to srcv[0].channels()-1, the second input image channels are indexed from srcv[0].channels() to srcv[0].channels() + srcv[1].channels()-1 etc., and the same scheme is used for the output image channels. As a special case, when fromTo[k*2] is negative, the corresponding output channel is filled with zero
    o npairs– The number of index pairs in fromTo
The functions mixChannels provide an advanced mechanism for shuffling image channels. split(), merge() and some forms of cvtColor() are partial cases of mixChannels.
As an example, this code splits a 4-channel RGBA image into a 3-channel BGR (i.e. with R and B channels swapped) and a separate alpha channel image:
    Mat rgba( 100, 100, CV_8UC4, Scalar(1,2,3,4) );
    Mat bgr( rgba.rows, rgba.cols, CV_8UC3 );
    Mat alpha( rgba.rows, rgba.cols, CV_8UC1 );

    // forming array of matrices is quite efficient operation,
    // because the matrix data is not copied, only the headers
    Mat out[] = { bgr, alpha };
    // rgba[0] -> bgr[2], rgba[1] -> bgr[1],
    // rgba[2] -> bgr[0], rgba[3] -> alpha[0]
    int from_to[] = { 0,2, 1,1, 2,0, 3,3 };
    mixChannels( &rgba, 1, out, 2, from_to, 4 );
Note that, unlike many other new-style C++ functions in OpenCV (see the introduction section and Mat::create()), mixChannels requires the destination arrays to be pre-allocated before calling the function.
See also: split() , merge() , cvtColor()

cv::mulSpectrums
Comments from the Wiki
void mulSpectrums(const Mat& src1, const Mat& src2, Mat& dst, int flags, bool conj=false)
Performs per-element multiplication of two Fourier spectrums.
Parameters:
    o src1– The first source array
    o src2– The second source array; must have the same size and the same type as src1
    o dst– The destination array; will have the same size and the same type as src1
    o flags– The same flags as passed to dft(); only the flag DFT_ROWS is checked for
    o conj– The optional flag that conjugates the second source array before the multiplication (true) or not (false)
The function mulSpectrums performs per-element multiplication of the two CCS-packed or complex matrices that are results of a real or complex Fourier transform.
The function, together with dft() and idft(), may be used to calculate convolution (pass conj=false) or correlation (pass conj=true) of two arrays rapidly. When the arrays are complex, they are simply multiplied (per-element) with optional conjugation of the second array elements. When the arrays are real, they are assumed to be CCS-packed (see dft() for details).

cv::multiply
Comments from the Wiki
void multiply(const Mat& src1, const Mat& src2, Mat& dst, double scale=1)
void multiply(const MatND& src1, const MatND& src2, MatND& dst, double scale=1)
Calculates the per-element scaled product of two arrays
Parameters:
    o src1– The first source array
    o src2– The second source array of the same size and the same type as src1
    o dst– The destination array; will have the same size and the same type as src1
    o scale– The optional scale factor
The function multiply calculates the per-element product of two arrays:
    dst(I) = saturate(scale · src1(I) · src2(I))
There is also a Matrix Expressions-friendly variant of the first function, see Mat::mul(). If you are looking for a matrix product, not a per-element product, see gemm().
See also: add() , subtract() , divide() , Matrix Expressions , scaleAdd() , addWeighted() , accumulate() , accumulateProduct() , accumulateSquare() , Mat::convertTo()

cv::mulTransposed
Comments from the Wiki
void mulTransposed(const Mat& src, Mat& dst, bool aTa, const Mat& delta=Mat(), double scale=1, int rtype=-1)
Calculates the product of a matrix and its transposition.
Parameters:
    o src– The source matrix
    o dst– The destination square matrix
    o aTa– Specifies the multiplication ordering, see the description below
    o delta– The optional delta matrix, subtracted from src before the multiplication. When the matrix is empty (delta=Mat()), it's assumed to be zero, i.e. nothing is subtracted; otherwise, if it has the same size as src, it's simply subtracted; otherwise it is "repeated" (see repeat()) to cover the full src and then subtracted. The type of the delta matrix, when it's not empty, must be the same as the type of the created destination matrix, see the rtype description
    o scale– The optional scale factor for the matrix product
    o rtype– When it's negative, the destination matrix will have the same type as src; otherwise it will have type=CV_MAT_DEPTH(rtype), which should be either CV_32F or CV_64F
The function mulTransposed calculates the product of src and its transposition:
    dst = scale · (src - delta)^T · (src - delta) if aTa=true
and
    dst = scale · (src - delta) · (src - delta)^T otherwise.
The function is used to compute the covariance matrix, and with zero delta it can be used as a faster substitute for the general matrix product A·B when B = A^T.
See also: calcCovarMatrix() , gemm() , repeat() , reduce()

cv::norm
Comments from the Wiki
double norm(const Mat& src1, int normType=NORM_L2)
double norm(const Mat& src1, const Mat& src2, int normType=NORM_L2)
double norm(const Mat& src1, int normType, const Mat& mask)
double norm(const Mat& src1, const Mat& src2, int normType, const Mat& mask)
double norm(const MatND& src1, int normType=NORM_L2, const MatND& mask=MatND())
double norm(const MatND& src1, const MatND& src2, int normType=NORM_L2, const MatND& mask=MatND())
double norm(const SparseMat& src, int normType)
Calculates absolute array norm, absolute difference norm, or relative difference norm.
Parameters:
    o src1– The first source array
    o src2– The second source array of the same size and the same type as src1
    o normType– Type of the norm, see the discussion below
    o mask– The optional operation mask
The functions norm calculate the absolute norm of src1 (when there is no src2):
    norm = max_I |src1(I)|            if normType = NORM_INF
    norm = sum_I |src1(I)|            if normType = NORM_L1
    norm = sqrt(sum_I src1(I)^2)      if normType = NORM_L2
or an absolute or relative difference norm if src2 is there: the same expressions are computed over src1(I) - src2(I), and, when the NORM_RELATIVE flag is combined with one of the above types, the result is additionally divided by the corresponding norm of src2.
The functions norm return the calculated norm.
When there is a mask parameter, and it is not empty (then it should have type CV_8U and the same size as src1), the norm is computed only over the region specified by the mask.
Multiple-channel source arrays are treated as single-channel, that is, the results for all channels are combined.
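A minimal sketch of the absolute and difference forms; a relative difference norm can also be obtained by dividing explicitly, as shown (the vector contents are random):

    #include <opencv2/core/core.hpp>
    #include <cstdio>
    using namespace cv;

    int main()
    {
        Mat a(1, 4, CV_32F), b(1, 4, CV_32F);
        randu(a, Scalar(-1), Scalar(1));
        randu(b, Scalar(-1), Scalar(1));

        printf("L2(a)       = %g\n", norm(a, NORM_L2));
        printf("Linf(a-b)   = %g\n", norm(a, b, NORM_INF));
        // relative L2 difference norm, computed explicitly
        printf("rel L2(a,b) = %g\n", norm(a, b, NORM_L2)/norm(b, NORM_L2));
        return 0;
    }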

cv::normalize
Comments from the Wiki
void normalize(const Mat& src, Mat& dst, double alpha=1, double beta=0, int normType=NORM_L2, int rtype=-1, const Mat& mask=Mat())
void normalize(const MatND& src, MatND& dst, double alpha=1, double beta=0, int normType=NORM_L2, int rtype=-1, const MatND& mask=MatND())
void normalize(const SparseMat& src, SparseMat& dst, double alpha, int normType)
Normalizes array's norm or the range
Parameters:
    o src– The source array
    o dst– The destination array; will have the same size as src
    o alpha– The norm value to normalize to, or the lower range boundary in the case of range normalization
    o beta– The upper range boundary in the case of range normalization; not used for norm normalization
    o normType– The normalization type, see the discussion
    o rtype– When the parameter is negative, the destination array will have the same type as src; otherwise it will have the same number of channels as src and the depth =CV_MAT_DEPTH(rtype)
    o mask– The optional operation mask
The functions normalize scale and shift the source array elements, so that
    ||dst||_Lp = alpha
(where p = ∞, 1 or 2 when normType = NORM_INF, NORM_L1 or NORM_L2, respectively), or so that
    min_I dst(I) = alpha, max_I dst(I) = beta
when normType = NORM_MINMAX (for dense arrays only).
The optional mask specifies the sub-array to be normalized, that is, the norm or min-n-max are computed over the sub-array, and then this sub-array is modified to be normalized. If you want to only use the mask to compute the norm or min-max, but modify the whole array, you can use norm() and Mat::convertScale() / MatND::convertScale() / SparseMat::convertScale() separately.
In the case of sparse matrices, only the non-zero values are analyzed and transformed. Because of this, the range transformation for sparse matrices is not allowed, since it can shift the zero level.
See also: norm() , Mat::convertScale() , MatND::convertScale() , SparseMat::convertScale()
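A minimal sketch of the two most common uses: unit-norm scaling and range stretching (the source values are arbitrary):

    #include <opencv2/core/core.hpp>
    using namespace cv;

    int main()
    {
        Mat src(1, 5, CV_32F), dst1, dst2;
        randu(src, Scalar(-10), Scalar(10));

        // scale so that the L2 norm of the result is 1
        normalize(src, dst1, 1.0, 0.0, NORM_L2);
        // stretch the value range to [0, 255], converting to 8-bit,
        // e.g. to visualize floating-point data
        normalize(src, dst2, 0, 255, NORM_MINMAX, CV_8U);
        return 0;
    }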

PCA
Comments from the Wiki
Class for Principal Component Analysis
    class PCA
    {
    public:
        // default constructor
        PCA();
        // computes PCA for a set of vectors stored as data rows or columns
        PCA(const Mat& data, const Mat& mean, int flags, int maxComponents=0);
        // computes PCA for a set of vectors stored as data rows or columns
        PCA& operator()(const Mat& data, const Mat& mean, int flags, int maxComponents=0);
        // projects vector into the principal components space
        Mat project(const Mat& vec) const;
        void project(const Mat& vec, Mat& result) const;
        // reconstructs the vector from its PC projection
        Mat backProject(const Mat& vec) const;
        void backProject(const Mat& vec, Mat& result) const;

        // eigenvectors of the PC space, stored as the matrix rows
        Mat eigenvectors;
        // the corresponding eigenvalues; not used for PCA compression/decompression
        Mat eigenvalues;
        // mean vector, subtracted from the projected vector
        // or added to the reconstructed vector
        Mat mean;
    };
The class PCA is used to compute the special basis for a set of vectors. The basis will consist of eigenvectors of the covariance matrix computed from the input set of vectors. And also the class PCA can transform vectors to/from the new coordinate space, defined by the basis. Usually, in this new coordinate system each vector from the original set (and any linear combination of such vectors) can be quite accurately approximated by taking just the first few of its components, corresponding to the eigenvectors of the largest eigenvalues of the covariance matrix. Geometrically it means that we compute the projection of the vector to a subspace formed by a few eigenvectors corresponding to the dominant eigenvalues of the covariance matrix. And usually such a projection is very close to the original vector. That is, we can represent the original vector from a high-dimensional space with a much shorter vector consisting of the projected vector's coordinates in the subspace. Such a transformation is also known as Karhunen-Loeve Transform, or KLT. See http://en.wikipedia.org/wiki/Principal_component_analysis
The following sample is the function that takes two matrices. The first one stores the set of vectors (a row per vector) that is used to compute PCA; the second one stores another "test" set of vectors (a row per vector) that are first compressed with PCA, then reconstructed back, and then the reconstruction error norm is computed and printed for each vector.

    PCA compressPCA(const Mat& pcaset, int maxComponents,
                    const Mat& testset, Mat& compressed)
    {
        PCA pca(pcaset, // pass the data
                Mat(), // we do not have a pre-computed mean vector,
                       // so let the PCA engine compute it
                CV_PCA_DATA_AS_ROW, // indicate that the vectors
                                    // are stored as matrix rows
                                    // (use CV_PCA_DATA_AS_COL if the vectors are
                                    // the matrix columns)
                maxComponents // specify how many principal components to retain
                );
        // if there is no test data, just return the computed basis, ready-to-use
        if( !testset.data )
            return pca;
        CV_Assert( testset.cols == pcaset.cols );

        compressed.create(testset.rows, maxComponents, testset.type());

        Mat reconstructed;
        for( int i = 0; i < testset.rows; i++ )
        {
            Mat vec = testset.row(i), coeffs = compressed.row(i);
            // compress the vector, the result will be stored
            // in the i-th row of the output matrix
            pca.project(vec, coeffs);
            // and then reconstruct it
            pca.backProject(coeffs, reconstructed);
            // and measure the error
            printf("%d. diff = %g\n", i, norm(vec, reconstructed, NORM_L2));
        }
        return pca;
    }
See also: calcCovarMatrix() , mulTransposed() , SVD() , dft() , dct()

cv::PCA::PCA
Comments from the Wiki
PCA::PCA()
PCA::PCA(const Mat& data, const Mat& mean, int flags, int maxComponents=0)
PCA constructors
Parameters:
    o data– the input samples, stored as the matrix rows or as the matrix columns
    o mean– the optional mean value. If the matrix is empty (Mat()), the mean is computed from the data
    o flags– operation flags. Currently the parameter is only used to specify the data layout:
        CV_PCA_DATA_AS_ROW - indicates that the input samples are stored as matrix rows.
        CV_PCA_DATA_AS_COL - indicates that the input samples are stored as matrix columns.
    o maxComponents– The maximum number of components that PCA should retain. By default, all the components are retained
The default constructor initializes an empty PCA structure. The second constructor initializes the structure and calls PCA::operator().

cv::PCA::operator ()
Comments from the Wiki
PCA& PCA::operator()(const Mat& data, const Mat& mean, int flags, int maxComponents=0)
Performs Principal Component Analysis of the supplied dataset.
Parameters:
    o data– the input samples, stored as the matrix rows or as the matrix columns
    o mean– the optional mean value. If the matrix is empty (Mat()), the mean is computed from the data
    o flags– operation flags. Currently the parameter is only used to specify the data layout:
        CV_PCA_DATA_AS_ROW - indicates that the input samples are stored as matrix rows.
        CV_PCA_DATA_AS_COL - indicates that the input samples are stored as matrix columns.
    o maxComponents– The maximum number of components that PCA should retain. By default, all the components are retained
The operator performs PCA of the supplied dataset. It is safe to reuse the same PCA structure for multiple datasets. That is, if the structure has been previously used with another dataset, the existing internal data is reclaimed and the new eigenvalues, eigenvectors and mean are allocated and computed. The computed eigenvalues are sorted from the largest to the smallest, and the corresponding

eigenvectors are stored as PCA::eigenvectors rows.

cv::PCA::project
Comments from the Wiki
Mat PCA::project(const Mat& vec) const
void PCA::project(const Mat& vec, Mat& result) const
Project vector(s) to the principal component subspace
Parameters:
    o vec– the input vector(s). They have to have the same dimensionality and the same layout as the input data used at PCA phase. That is, if CV_PCA_DATA_AS_ROW was specified, then vec.cols==data.cols (that's the vectors' dimensionality) and vec.rows is the number of vectors to project; and similarly for the CV_PCA_DATA_AS_COL case
    o result– the output vectors, where each vector projection is represented by coefficients in the principal component basis. In the case of CV_PCA_DATA_AS_COL the output matrix will have as many columns as the number of input vectors, i.e. result.cols==vec.cols, and the number of rows will match the number of principal components (e.g. the maxComponents parameter passed to the constructor)
The methods project one or more vectors to the principal component subspace, where each vector projection is represented by coefficients in the principal component basis. The first form of the method returns the matrix that the second form writes to result. So the first form can be used as a part of an expression, while the second form can be more efficient in a processing loop.

cv::PCA::backProject
Comments from the Wiki
Mat PCA::backProject(const Mat& vec) const
void PCA::backProject(const Mat& vec, Mat& result) const
Reconstruct vectors from their PC projections.
Parameters:
    o vec– Coordinates of the vectors in the principal component subspace. The layout and size are the same as of PCA::project output vectors
    o result– The reconstructed vectors. The layout and size are the same as of PCA::project input vectors
The methods are inverse operations to PCA::project(). They take PC coordinates of projected vectors and reconstruct the original vectors. Of course, unless all the principal components have been retained, the reconstructed vectors will be different from the originals; but typically the difference will be small if the number of components is large enough (but still much smaller than the original vector dimensionality) - that's why PCA is used after all.

cv::perspectiveTransform
Comments from the Wiki
void perspectiveTransform(const Mat& src, Mat& dst, const Mat& mtx)
Performs perspective matrix transformation of vectors.
Parameters:
    o src– The source two-channel or three-channel floating-point array; each element is a 2D/3D vector to be transformed
    o dst– The destination array; it will have the same size and same type as src
    o mtx– 3×3 or 4×4 transformation matrix
The function perspectiveTransform transforms every element of src, by treating it as a 2D or 3D vector, in the following way (here 3D vector transformation is shown; in the case of 2D vector transformation the z component is omitted):
    (x, y, z) → (x'/w, y'/w, z'/w),
where
    (x', y', z', w') = mtx · (x, y, z, 1)
and
    w = w' if w' ≠ 0, and w = ∞ otherwise.
Note that the function transforms a sparse set of 2D or 3D vectors. If you want to transform an image using a perspective transformation, use warpPerspective(). If you have an inverse task, i.e. you want to compute the most probable perspective transformation out of several pairs of corresponding points, you can use getPerspectiveTransform() or findHomography().
See also: transform() , warpPerspective() , getPerspectiveTransform() , findHomography()
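A minimal sketch of transforming a sparse set of 2D points; the 3×3 homography values below are arbitrary illustration values:

    #include <opencv2/core/core.hpp>
    #include <vector>
    using namespace cv;

    int main()
    {
        // a sparse set of 2D points, viewed as a 2-channel floating-point Mat
        std::vector<Point2f> pts;
        pts.push_back(Point2f(0, 0));
        pts.push_back(Point2f(100, 0));
        pts.push_back(Point2f(100, 100));

        Mat H = (Mat_<double>(3, 3) << 1, 0.1,   5,
                                       0,   1,  10,
                                       0, 0.001, 1);

        Mat dst;
        perspectiveTransform(Mat(pts), dst, H);
        // dst is a 3x1 2-channel array of the transformed points
        return 0;
    }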

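As an illustration (a sketch added here, not from the original text), the following fragment transforms four 2D points with a hand-built 3x3 matrix; the matrix values are arbitrary:

// scale by 2 and shift by (1, 1): [x, y] -> [2x+1, 2y+1]
Mat H = (Mat_<double>(3, 3) << 2, 0, 1,
                               0, 2, 1,
                               0, 0, 1);
Mat src(1, 4, CV_32FC2), dst;
src.at<Point2f>(0, 0) = Point2f(0, 0);
src.at<Point2f>(0, 1) = Point2f(1, 0);
src.at<Point2f>(0, 2) = Point2f(1, 1);
src.at<Point2f>(0, 3) = Point2f(0, 1);
perspectiveTransform(src, dst, H); // dst holds (1,1), (3,1), (3,3), (1,3)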
cv::pow
Comments from the Wiki
void pow(const Mat& src, double p, Mat& dst)
void pow(const MatND& src, double p, MatND& dst)
Raises every array element to a power.
Parameters:
    o src – The source array
    o p – The exponent of power
    o dst – The destination array; will have the same size and the same type as src
The function pow raises every element of the input array to p:

    dst(I) = src(I)^p if p is integer, |src(I)|^p otherwise

That is, for a non-integer power exponent the absolute values of the input array elements are used. However, it is possible to get true values for negative values using some extra operations, as the following example, computing the 5th root of array src, shows:

Mat mask = src < 0;
pow(src, 1./5, dst);
subtract(Scalar::all(0), dst, dst, mask);

For some values of p, such as integer values, 0.5, and -0.5, specialized faster algorithms are used.
See also: sqrt(), exp(), log(), cartToPolar(), polarToCart()

RNG
Comments from the Wiki
RNG
Random number generator class.

class CV_EXPORTS RNG
{
public:
    enum { A=4164903690U, UNIFORM=0, NORMAL=1 };

    // constructors
    RNG();
    RNG(uint64 state);

    // returns 32-bit unsigned random number
    unsigned next();

    // return random numbers of the specified type
    operator uchar();
    operator schar();
    operator ushort();
    operator short();
    operator unsigned();
    operator int();
    operator float();
    operator double();
    unsigned operator()();
    // returns a random integer sampled uniformly from [0, N)
    unsigned operator()(unsigned N);

    // returns a random number sampled uniformly from [a, b) range
    int uniform(int a, int b);
    float uniform(float a, float b);
    double uniform(double a, double b);

    // returns Gaussian random number with zero mean
    double gaussian(double sigma);

    // fills array with random numbers sampled from the specified distribution
    void fill( Mat& mat, int distType, const Scalar& a, const Scalar& b );
    void fill( MatND& mat, int distType, const Scalar& a, const Scalar& b );

    // internal state of the RNG (could change in the future)
    uint64 state;
};

The class RNG implements the random number generator. It encapsulates the RNG state (currently, a 64-bit integer) and has methods to return scalar random values and to fill arrays with random values. Currently it supports uniform and Gaussian (normal) distributions. The generator uses the Multiply-With-Carry algorithm, introduced by G. Marsaglia ( http://en.wikipedia.org/wiki/Multiply-with-carry ). Gaussian-distribution random numbers are generated using the Ziggurat algorithm ( http://en.wikipedia.org/wiki/Ziggurat_algorithm ), introduced by G. Marsaglia and W. W. Tsang.

cv::RNG::RNG
Comments from the Wiki
RNG::RNG()
RNG::RNG(uint64 state)
RNG constructors
Parameters:
    o state – the 64-bit value used to initialize the RNG
These are the RNG constructors. The first form sets the state to some pre-defined value, equal to 2**32-1 in the current implementation. The second form sets the state to the specified value. If the user passed state=0, the constructor uses the above default value instead, to avoid the singular random number sequence, consisting of all zeros.

cv::RNG::next
Comments from the Wiki
unsigned RNG::next()
Returns the next random number.
The method updates the state using the MWC algorithm and returns the next 32-bit random number.

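A small usage sketch (an illustration added here; it assumes the cv namespace is imported, as elsewhere in this manual):

RNG rng(12345);               // explicitly seeded generator
unsigned r = rng.next();      // raw 32-bit MWC output
int i = rng;                  // whole int range, via operator int()
double g = rng.gaussian(2.);  // sample from N(0, 2)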
cv::RNG::operator T
Comments from the Wiki
RNG::operator uchar()
RNG::operator schar()
RNG::operator ushort()
RNG::operator short()
RNG::operator unsigned()
RNG::operator int()
RNG::operator float()
RNG::operator double()
Returns the next random number of the specified type.
Each of the methods updates the state using the MWC algorithm and returns the next random number of the specified type. In the case of integer types the returned number is from the whole available value range for the specified type. In the case of floating-point types the returned value is from the [0, 1) range.

cv::RNG::operator ()
Comments from the Wiki
unsigned RNG::operator()()
unsigned RNG::operator()(unsigned N)
Returns the next random number.
Parameters:
    o N – The upper non-inclusive boundary of the returned random number
The methods transform the state using the MWC algorithm and return the next random number. The first form is equivalent to RNG::next(); the second form returns the random number modulo N, i.e. the result will be in the range [0, N).

cv::RNG::uniform
Comments from the Wiki
int RNG::uniform(int a, int b)
float RNG::uniform(float a, float b)
double RNG::uniform(double a, double b)
Returns the next random number sampled from the uniform distribution.
Parameters:
    o a – The lower inclusive boundary of the returned random numbers
    o b – The upper non-inclusive boundary of the returned random numbers
The methods transform the state using the MWC algorithm and return the next uniformly-distributed random number of the specified type, deduced from the input parameter type, from the range [a, b). There is one nuance, illustrated by the following sample:

cv::RNG rng;

// will always produce 0
double a = rng.uniform(0, 1);

// will produce double from [0, 1)
double a1 = rng.uniform((double)0, (double)1);

// will produce float from [0, 1)
double b = rng.uniform(0.f, 1.f);

// will produce double from [0, 1)
double c = rng.uniform(0., 1.);

// will likely cause compiler error because of ambiguity:
// RNG::uniform(0, (int)0.999999)? or RNG::uniform((double)0, 0.999999)?
double d = rng.uniform(0, 0.999999);

That is, the compiler does not take into account the type of the variable that you assign the result of RNG::uniform to; the only thing that matters to it is the type of the a and b parameters. So if you want a floating-point random number, but the range boundaries are integer numbers, either put dots in the end, if they are constants, as in the a1 initialization above, or use explicit type cast operators.

cv::RNG::gaussian
Comments from the Wiki
double RNG::gaussian(double sigma)
Returns the next random number sampled from the Gaussian distribution.
Parameters:
    o sigma – The standard deviation of the distribution
The method transforms the state using the MWC algorithm and returns the next random number from the Gaussian distribution N(0, sigma), i.e. the mean value of the returned random numbers will be zero and the standard deviation will be the specified sigma.

cv::RNG::fill
Comments from the Wiki
void RNG::fill(Mat& mat, int distType, const Scalar& a, const Scalar& b)
void RNG::fill(MatND& mat, int distType, const Scalar& a, const Scalar& b)
Fills arrays with random numbers.
Parameters:
    o mat – 2D or N-dimensional matrix. Currently matrices with more than 4 channels are not supported by the methods. Use reshape() as a possible workaround.
    o distType – The distribution type, RNG::UNIFORM or RNG::NORMAL
    o a – The first distribution parameter. In the case of the uniform distribution this is the inclusive lower boundary. In the case of the normal distribution this is the mean value.
    o b – The second distribution parameter. In the case of the uniform distribution this is the non-inclusive upper boundary. In the case of the normal distribution this is the standard deviation.
Each of the methods fills the matrix with random values from the specified distribution. In the case of multiple-channel images every channel is filled independently. As the new numbers are generated, the RNG state is updated accordingly. RNG cannot generate samples from a multi-dimensional Gaussian distribution with a non-diagonal covariation matrix directly. To do that, first generate a matrix from the N(0, I) distribution, i.e. a Gaussian distribution with zero mean and identity covariation matrix, and then transform it using transform() and the specific covariation matrix.

cv::randu
Comments from the Wiki
template<typename _Tp> _Tp randu()
void randu(Mat& mtx, const Scalar& low, const Scalar& high)
Generates a single uniformly-distributed random number or an array of random numbers.
Parameters:
    o mtx – The output array of random numbers. The array must be pre-allocated and have 1 to 4 channels
    o low – The inclusive lower boundary of the generated random numbers
    o high – The exclusive upper boundary of the generated random numbers
The template functions randu generate and return the next uniformly-distributed random value of the specified type; randu<int>() is equivalent to (int)theRNG() etc. See the RNG() description.
The second, non-template variant of the function fills the matrix mtx with uniformly-distributed random numbers from the specified range:

    low <= mtx(I) < high

See also: RNG(), randn(), theRNG()

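As a brief sketch of RNG::fill in use (an illustration added here, with arbitrary sizes and ranges):

RNG& rng = theRNG();
Mat m(4, 4, CV_32F);
// uniform on [0, 1)
rng.fill(m, RNG::UNIFORM, Scalar::all(0), Scalar::all(1));
// normal with mean 0 and standard deviation 1
rng.fill(m, RNG::NORMAL, Scalar::all(0), Scalar::all(1));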
cv::randn
Comments from the Wiki
void randn(Mat& mtx, const Scalar& mean, const Scalar& stddev)
Fills an array with normally distributed random numbers.
Parameters:
    o mtx – The output array of random numbers. The array must be pre-allocated and have 1 to 4 channels
    o mean – The mean value (expectation) of the generated random numbers
    o stddev – The standard deviation of the generated random numbers
The function randn fills the matrix mtx with normally distributed random numbers with the specified mean and standard deviation. saturate_cast<> is applied to the generated numbers (i.e. the values are clipped).
See also: RNG(), randu()

cv::randShuffle
Comments from the Wiki
void randShuffle(Mat& mtx, double iterFactor=1., RNG* rng=0)
Shuffles the array elements randomly.
Parameters:
    o mtx – The input/output numerical 1D array
    o iterFactor – The scale factor that determines the number of random swap operations. See the discussion below
    o rng – The optional random number generator used for shuffling. If it is zero, theRNG()() is used instead
The function randShuffle shuffles the specified 1D array by randomly choosing pairs of elements and swapping them. The number of such swap operations will be mtx.rows*mtx.cols*iterFactor.
See also: RNG(), sort()

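A minimal sketch of both functions (an illustration added here; the sizes and parameters are arbitrary):

// 8-bit noise image; generated values are saturated to [0, 255]
Mat noise(240, 320, CV_8U);
randn(noise, Scalar::all(128), Scalar::all(20));

// shuffle a 1D array of indices
Mat deck(1, 52, CV_32S);
for( int i = 0; i < deck.cols; i++ )
    deck.at<int>(0, i) = i;
randShuffle(deck);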
cv::reduce
Comments from the Wiki
void reduce(const Mat& mtx, Mat& vec, int dim, int reduceOp, int dtype=-1)
Reduces a matrix to a vector.
Parameters:
    o mtx – The source 2D matrix
    o vec – The destination vector. Its size and type is defined by the dim and dtype parameters
    o dim – The dimension index along which the matrix is reduced. 0 means that the matrix is reduced to a single row and 1 means that the matrix is reduced to a single column
    o reduceOp – The reduction operation, one of:
        CV_REDUCE_SUM - The output is the sum of all of the matrix's rows/columns.
        CV_REDUCE_AVG - The output is the mean vector of all of the matrix's rows/columns.
        CV_REDUCE_MAX - The output is the maximum (column/row-wise) of all of the matrix's rows/columns.
        CV_REDUCE_MIN - The output is the minimum (column/row-wise) of all of the matrix's rows/columns.
    o dtype – When it is negative, the destination vector will have the same type as the source matrix; otherwise its type will be CV_MAKE_TYPE(CV_MAT_DEPTH(dtype), mtx.channels())
The function reduce reduces the matrix to a vector by treating the matrix rows/columns as a set of 1D vectors and performing the specified operation on the vectors until a single row/column is obtained. For example, the function can be used to compute horizontal and vertical projections of a raster image. In the case of CV_REDUCE_SUM and CV_REDUCE_AVG the output may have a larger element bit-depth to preserve accuracy. Multi-channel arrays are also supported in these two reduction modes.
See also: repeat()

cv::repeat
Comments from the Wiki
void repeat(const Mat& src, int ny, int nx, Mat& dst)
Mat repeat(const Mat& src, int ny, int nx)
Fills the destination array with repeated copies of the source array.
Parameters:
    o src – The source array to replicate
    o dst – The destination array; will have the same type as src
    o ny – How many times src is repeated along the vertical axis
    o nx – How many times src is repeated along the horizontal axis
The functions repeat() duplicate the source array one or more times along each of the two axes:

    dst(i, j) = src(i mod src.rows, j mod src.cols)

The second variant of the function is more convenient to use with Matrix Expressions.
See also: reduce(), Matrix Expressions

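For instance (a short sketch added here for illustration):

Mat m = (Mat_<float>(2, 3) << 1, 2, 3,
                              4, 5, 6);
Mat colSums, rowMeans;
reduce(m, colSums, 0, CV_REDUCE_SUM);  // 1x3 row: [5, 7, 9]
reduce(m, rowMeans, 1, CV_REDUCE_AVG); // 2x1 column: [2; 5]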
saturate_cast
Comments from the Wiki
template<typename _Tp> inline _Tp saturate_cast(unsigned char v)
template<typename _Tp> inline _Tp saturate_cast(signed char v)
template<typename _Tp> inline _Tp saturate_cast(unsigned short v)
template<typename _Tp> inline _Tp saturate_cast(signed short v)
template<typename _Tp> inline _Tp saturate_cast(int v)
template<typename _Tp> inline _Tp saturate_cast(unsigned int v)
template<typename _Tp> inline _Tp saturate_cast(float v)
template<typename _Tp> inline _Tp saturate_cast(double v)
Template function for accurate conversion from one primitive type to another.
Parameters:
    o v – The function parameter
The functions saturate_cast resemble the standard C++ cast operations, such as static_cast<T>() etc. They perform an efficient and accurate conversion from one primitive type to another. "saturate" in the name means that when the input value v is out of range of the target type, the result will not be formed just by taking the low bits of the input, but instead the value will be clipped. For example:

uchar a = saturate_cast<uchar>(-100); // a = 0 (UCHAR_MIN)
short b = saturate_cast<short>(33333.33333); // b = 32767 (SHRT_MAX)

Such clipping is done when the target type is unsigned char, signed char, unsigned short or signed short; for 32-bit integers no clipping is done. When the parameter is a floating-point value and the target type is an integer (8-, 16- or 32-bit), the floating-point value is first rounded to the nearest integer and then clipped if needed (when the target type is 8- or 16-bit).
This operation is used in most simple or complex image processing functions in OpenCV.
See also: Mat::convertTo()

cv::scaleAdd
Comments from the Wiki
void scaleAdd(const Mat& src1, double scale, const Mat& src2, Mat& dst)
void scaleAdd(const MatND& src1, double scale, const MatND& src2, MatND& dst)
Calculates the sum of a scaled array and another array.
Parameters:
    o src1 – The first source array
    o scale – Scale factor for the first array
    o src2 – The second source array; must have the same size and the same type as src1
    o dst – The destination array; will have the same size and the same type as src1
The function scaleAdd is one of the classical primitive linear algebra operations, known as DAXPY or SAXPY in BLAS. It calculates the sum of a scaled array and another array:

    dst(I) = scale*src1(I) + src2(I)

The function can also be emulated with a matrix expression, for example:

Mat A(3, 3, CV_64F);
...
A.row(0) = A.row(1)*2 + A.row(2);

See also: add(), addWeighted(), subtract(), Mat::dot(), Mat::convertTo(), Matrix Expressions

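A direct call looks like this (a small sketch added for illustration, with arbitrary values):

Mat A = (Mat_<float>(2, 2) << 1, 2, 3, 4);
Mat B = Mat::ones(2, 2, CV_32F), C;
scaleAdd(A, 2.0, B, C);    // C = 2*A + B = [[3, 5], [7, 9]]
Mat C2 = A*2.0 + B;        // the equivalent matrix expression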
cv::setIdentity
Comments from the Wiki
void setIdentity(Mat& dst, const Scalar& value=Scalar(1))
Initializes a scaled identity matrix.
Parameters:
    o dst – The matrix to initialize (not necessarily square)
    o value – The value to assign to the diagonal elements
The function setIdentity() initializes a scaled identity matrix:

    dst(i, j) = value if i == j, 0 otherwise

The function can also be emulated using the matrix initializers and the matrix expressions:

Mat A = Mat::eye(4, 3, CV_32F)*5;
// A will be set to
// [[5, 0, 0], [0, 5, 0], [0, 0, 5], [0, 0, 0]]

See also: Mat::zeros(), Mat::ones(), Matrix Expressions, Mat::setTo(), Mat::operator=()

cv::solve
Comments from the Wiki
bool solve(const Mat& src1, const Mat& src2, Mat& dst, int flags=DECOMP_LU)
Solves one or more linear systems or least-squares problems.
Parameters:
    o src1 – The input matrix on the left-hand side of the system
    o src2 – The input matrix on the right-hand side of the system
    o dst – The output solution
    o flags – The solution (matrix inversion) method:
        DECOMP_LU - Gaussian elimination with the optimal pivot element chosen
        DECOMP_CHOLESKY - Cholesky factorization; the matrix src1 must be symmetrical and positively defined
        DECOMP_EIG - Eigenvalue decomposition; the matrix src1 must be symmetrical
        DECOMP_SVD - Singular value decomposition (SVD) method; the system can be over-defined and/or the matrix src1 can be singular
        DECOMP_QR - QR factorization; the system can be over-defined and/or the matrix src1 can be singular
        DECOMP_NORMAL - While all the previous flags are mutually exclusive, this flag can be used together with any of the previous. It means that the normal equations are solved instead of the original system
The function solve solves a linear system or a least-squares problem (the latter is possible with SVD or QR methods, or by specifying the flag DECOMP_NORMAL). If DECOMP_LU or DECOMP_CHOLESKY method is used, the function returns 1 if src1 (or its normal-equations counterpart) is non-singular and 0 otherwise; in the latter case dst is not valid. Other methods find some pseudo-solution in the case of a singular left-hand side part.
Note that if you want to find a unity-norm solution of an under-defined singular system, the function solve will not do the work. Use SVD::solveZ() instead.
See also: invert(), SVD(), eigen()

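For example (a short sketch added here; the system is arbitrary):

// solve { x + 2y = 5, 3x + 4y = 11 }
Mat A = (Mat_<double>(2, 2) << 1, 2,
                               3, 4);
Mat b = (Mat_<double>(2, 1) << 5, 11);
Mat x;
bool ok = solve(A, b, x, DECOMP_LU); // ok == true, x ~ [1; 2]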
cv::solveCubic
Comments from the Wiki
void solveCubic(const Mat& coeffs, Mat& roots)
Finds the real roots of a cubic equation.
Parameters:
    o coeffs – The equation coefficients, an array of 3 or 4 elements
    o roots – The destination array of real roots, which will have 1 or 3 elements
The function solveCubic finds the real roots of a cubic equation:

    coeffs[0]*x^3 + coeffs[1]*x^2 + coeffs[2]*x + coeffs[3] = 0   (if coeffs is a 4-element vector)

or

    x^3 + coeffs[0]*x^2 + coeffs[1]*x + coeffs[2] = 0   (if coeffs is a 3-element vector)

The roots are stored to the roots array.

cv::solvePoly
Comments from the Wiki
void solvePoly(const Mat& coeffs, Mat& roots, int maxIters=20, int fig=100)
Finds the real or complex roots of a polynomial equation.
Parameters:
    o coeffs – The array of polynomial coefficients
    o roots – The destination (complex) array of roots
    o maxIters – The maximum number of iterations the algorithm does
    o fig – The desired precision (number of figures)
The function solvePoly finds real and complex roots of a polynomial equation:

    coeffs[n]*x^n + coeffs[n-1]*x^(n-1) + ... + coeffs[1]*x + coeffs[0] = 0

cv::sort
Comments from the Wiki
void sort(const Mat& src, Mat& dst, int flags)
Sorts each row or each column of a matrix.
Parameters:
    o src – The source single-channel array
    o dst – The destination array of the same size and the same type as src
    o flags – The operation flags, a combination of the following values:
        CV_SORT_EVERY_ROW - Each matrix row is sorted independently
        CV_SORT_EVERY_COLUMN - Each matrix column is sorted independently. This flag and the previous one are mutually exclusive
        CV_SORT_ASCENDING - Each matrix row is sorted in ascending order
        CV_SORT_DESCENDING - Each matrix row is sorted in descending order. This flag and the previous one are also mutually exclusive
The function sort sorts each matrix row or each matrix column in ascending or descending order. If you want to sort matrix rows or columns lexicographically, you can use the STL std::sort generic function with the proper comparison predicate.
See also: sortIdx(), randShuffle()

cv::sortIdx
Comments from the Wiki

void sortIdx(const Mat& src, Mat& dst, int flags)
Sorts each row or each column of a matrix.
Parameters:
    o src – The source single-channel array
    o dst – The destination integer array of the same size as src
    o flags – The operation flags, a combination of the following values:
        CV_SORT_EVERY_ROW - Each matrix row is sorted independently
        CV_SORT_EVERY_COLUMN - Each matrix column is sorted independently. This flag and the previous one are mutually exclusive
        CV_SORT_ASCENDING - Each matrix row is sorted in ascending order
        CV_SORT_DESCENDING - Each matrix row is sorted in descending order. This flag and the previous one are also mutually exclusive
The function sortIdx sorts each matrix row or each matrix column in ascending or descending order. Instead of reordering the elements themselves, it stores the indices of the sorted elements in the destination array. For example:

Mat A = Mat::eye(3, 3, CV_32F), B;
sortIdx(A, B, CV_SORT_EVERY_ROW + CV_SORT_ASCENDING);
// B will probably contain
// (because of equal elements in A some permutations are possible):
// [[1, 2, 0], [0, 2, 1], [0, 1, 2]]

See also: sort(), randShuffle()

cv::split
Comments from the Wiki
void split(const Mat& mtx, Mat* mv)
void split(const Mat& mtx, vector<Mat>& mv)
void split(const MatND& mtx, MatND* mv)
void split(const MatND& mtx, vector<MatND>& mv)
Divides a multi-channel array into several single-channel arrays.
Parameters:
    o mtx – The source multi-channel array
    o mv – The destination array or vector of arrays. The number of arrays must match mtx.channels(). The arrays themselves will be reallocated if needed
The functions split split a multi-channel array into separate single-channel arrays:

    mv[c](I) = mtx(I)_c

If you need to extract a single channel or do some other sophisticated channel permutation, use mixChannels().
See also: merge(), mixChannels(), cvtColor()

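For instance (a small sketch added for illustration):

Mat bgr(100, 100, CV_8UC3, Scalar(255, 0, 0)); // a solid-blue image
vector<Mat> planes;
split(bgr, planes);
// planes[0] (blue) is all 255; planes[1] and planes[2] are all 0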
cv::sqrt
Comments from the Wiki
void sqrt(const Mat& src, Mat& dst)
void sqrt(const MatND& src, MatND& dst)
Calculates the square root of array elements.
Parameters:
    o src – The source floating-point array
    o dst – The destination array; will have the same size and the same type as src
The functions sqrt calculate the square root of each source array element. In the case of multi-channel arrays each channel is processed independently. The function accuracy is approximately the same as of the built-in std::sqrt.
See also: pow(), magnitude()

cv::subtract
Comments from the Wiki
void subtract(const Mat& src1, const Mat& src2, Mat& dst)
void subtract(const Mat& src1, const Mat& src2, Mat& dst, const Mat& mask)
void subtract(const Mat& src1, const Scalar& sc, Mat& dst, const Mat& mask=Mat())
void subtract(const Scalar& sc, const Mat& src2, Mat& dst, const Mat& mask=Mat())
void subtract(const MatND& src1, const MatND& src2, MatND& dst)
void subtract(const MatND& src1, const MatND& src2, MatND& dst, const MatND& mask)
void subtract(const MatND& src1, const Scalar& sc, MatND& dst, const MatND& mask=MatND())
void subtract(const Scalar& sc, const MatND& src2, MatND& dst, const MatND& mask=MatND())
Calculates the per-element difference between two arrays or between an array and a scalar.
Parameters:
    o src1 – The first source array
    o src2 – The second source array; it must have the same size and same type as src1
    o sc – Scalar; the first or the second input parameter
    o dst – The destination array; it will have the same size and same type as src1; see Mat::create
    o mask – The optional operation mask, 8-bit single channel array; specifies elements of the destination array to be changed
The functions subtract compute the difference between two arrays:

    dst(I) = saturate(src1(I) - src2(I)) if mask(I) != 0

the difference between an array and a scalar:

    dst(I) = saturate(src1(I) - sc) if mask(I) != 0

or the difference between a scalar and an array:

    dst(I) = saturate(sc - src2(I)) if mask(I) != 0

where I is a multi-dimensional index of the array elements.
The first function in the above list can be replaced with matrix expressions:

dst = src1 - src2;
dst -= src2; // equivalent to subtract(dst, src2, dst);

See also: add(), addWeighted(), scaleAdd(), convertScale(), Matrix Expressions

SVD
Comments from the Wiki
SVD
Class for computing Singular Value Decomposition.

class SVD
{
public:
    enum { MODIFY_A=1, NO_UV=2, FULL_UV=4 };

    // default empty constructor
    SVD();
    // decomposes A into u, w and vt: A = u*w*vt;
    // u and vt are orthogonal, w is diagonal
    SVD( const Mat& A, int flags=0 );
    // decomposes A into u, w and vt.
    SVD& operator ()( const Mat& A, int flags=0 );

    // finds such vector x, norm(x)=1, so that A*x = 0,
    // where A is singular matrix
    static void solveZ( const Mat& A, Mat& x );
    // does back-subsitution:
    // x = vt.t()*inv(w)*u.t()*rhs ~ inv(A)*rhs
    void backSubst( const Mat& rhs, Mat& x ) const;

    Mat u;  // the left orthogonal matrix
    Mat w;  // vector of singular values
    Mat vt; // the right orthogonal matrix
};

The class SVD is used to compute the Singular Value Decomposition of a floating-point matrix and then use it to solve least-square problems, under-determined linear systems, invert matrices, compute condition numbers etc.
For a bit faster operation you can pass flags=SVD::MODIFY_A|... to modify the decomposed matrix when it is not necessary to preserve it. If you want to compute the condition number of a matrix or the absolute value of its determinant - you do not need u and vt, so you can pass flags=SVD::NO_UV|... . Another flag FULL_UV indicates that full-size u and vt must be computed, which is not necessary most of the time.
See also: invert(), solve(), eigen(), determinant()

cv::SVD::SVD
Comments from the Wiki
SVD::SVD()
SVD::SVD(const Mat& A, int flags=0)
SVD constructors
Parameters:
    o A – The decomposed matrix
    o flags – Operation flags:
        SVD::MODIFY_A - The algorithm can modify the decomposed matrix. It can save some space and speed-up processing a bit
        SVD::NO_UV - Indicates that only the vector of singular values w is to be computed, while u and vt will be set to empty matrices
        SVD::FULL_UV - When the matrix is not square, by default the algorithm produces u and vt matrices of sufficiently large size for the further A reconstruction. If, however, the FULL_UV flag is specified, u and vt will be full-size square orthogonal matrices.
The first constructor initializes an empty SVD structure. The second constructor initializes an empty SVD structure and then calls SVD::operator().

cv::SVD::operator ()

Comments from the Wiki
SVD& SVD::operator()(const Mat& A, int flags=0)
Performs SVD of a matrix.
Parameters:
    o A – The decomposed matrix
    o flags – Operation flags:
        SVD::MODIFY_A - The algorithm can modify the decomposed matrix. It can save some space and speed-up processing a bit
        SVD::NO_UV - Only singular values are needed. The algorithm will not compute the u and vt matrices
        SVD::FULL_UV - When the matrix is not square, by default the algorithm produces u and vt matrices of sufficiently large size for the further A reconstruction. If, however, the FULL_UV flag is specified, u and vt will be full-size square orthogonal matrices.
The operator performs the singular value decomposition of the supplied matrix. The u, vt and the vector of singular values w are stored in the structure. The same SVD structure can be reused many times with different matrices. Each time, if needed, the previous u, vt and w are reclaimed and the new matrices are created, which is all handled by Mat::create().

cv::SVD::solveZ
Comments from the Wiki
static void SVD::solveZ(const Mat& A, Mat& x)
Solves an under-determined singular linear system.
Parameters:
    o A – The left-hand-side matrix
    o x – The found solution
The method finds a unit-length solution x of the under-determined system A*x = 0. Theory says that such a system has an infinite number of solutions, so the algorithm finds the unit-length solution as the right singular vector corresponding to the smallest singular value (which should be 0). In practice, however, because of round errors and limited floating-point accuracy, the input matrix can appear to be close-to-singular rather than just singular. So, strictly speaking, the algorithm solves the following problem:

    x = argmin over {x: norm(x)=1} of norm(A*x)

cv::SVD::backSubst
Comments from the Wiki
void SVD::backSubst(const Mat& rhs, Mat& x) const
Performs singular value back substitution.
Parameters:
    o rhs – The right-hand side of the linear system being solved, where A is the matrix passed to SVD::SVD() or SVD::operator()
    o x – The found solution of the system
The method computes back substitution for the specified right-hand side:

    x = vt^T * inv(w) * u^T * rhs ~ inv(A) * rhs

Using this technique you can either get a very accurate solution of a convenient linear system, or the best (in the least-squares terms) pseudo-solution of an over-determined linear system. Note that explicit SVD with the further back substitution only makes sense if you need to solve many linear systems with the same left-hand side (e.g. A). If all you need is to solve a single system (possibly with multiple rhs immediately available), simply call solve() and pass DECOMP_SVD there - it will do absolutely the same thing.

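A small sketch of the decompose-once, solve-many pattern (an illustration added here; the matrix is arbitrary):

Mat A = (Mat_<double>(2, 2) << 2, 0,
                               0, 3);
SVD svd(A);              // decompose once ...
Mat b = (Mat_<double>(2, 1) << 4, 9), x;
svd.backSubst(b, x);     // ... then solve: x ~ [2; 3]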
cv::sum
Comments from the Wiki
Scalar sum(const Mat& mtx)
Scalar sum(const MatND& mtx)
Calculates the sum of array elements.
Parameters:
    o mtx – The source array; must have 1 to 4 channels
The functions sum calculate and return the sum of array elements, independently for each channel.
See also: countNonZero(), mean(), meanStdDev(), norm(), minMaxLoc(), reduce()

cv::theRNG
Comments from the Wiki
RNG& theRNG()
Returns the default random number generator.
The function theRNG returns the default random number generator. For each thread there is a separate random number generator, so you can use the function safely in multi-thread environments. If you just need to get a single random number using this generator or initialize an array, you can use randu() or randn() instead. But if you are going to generate many random numbers inside a loop, it will be much faster to use this function to retrieve the generator and then use RNG::operator _Tp().
See also: RNG(), randu(), randn()

cv::trace
Comments from the Wiki
Scalar trace(const Mat& mtx)
Returns the trace of a matrix.
Parameters:
    o mtx – The source matrix
The function trace returns the sum of the diagonal elements of the matrix mtx:

    trace(mtx) = sum over i of mtx(i, i)

cv::transform
Comments from the Wiki
void transform(const Mat& src, Mat& dst, const Mat& mtx)
Performs matrix transformation of every array element.
Parameters:
    o src – The source array; must have as many channels (1 to 4) as mtx.cols or mtx.cols-1
    o dst – The destination array; will have the same size and depth as src and as many channels as mtx.rows
    o mtx – The transformation matrix

The function transform performs matrix transformation of every element of the array src and stores the results in dst:

    dst(I) = mtx * src(I)        (when mtx.cols == src.channels()), or
    dst(I) = mtx * [src(I); 1]   (when mtx.cols == src.channels()+1)

That is, every element of an N-channel array src is considered as an N-element vector, which is transformed using an M x N or M x (N+1) matrix mtx into an element of the M-channel array dst.
The function may be used for geometrical transformation of N-dimensional points, arbitrary linear color space transformation (such as various kinds of RGB <-> YUV transforms), shuffling the image channels and so forth.
See also: perspectiveTransform(), getAffineTransform(), estimateRigidTransform(), warpAffine(), warpPerspective()

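For example, a channel shuffle can be written as follows (a sketch added here for illustration; the pixel values are arbitrary):

// swap the first and third channels of each pixel (BGR -> RGB)
Mat bgr(2, 2, CV_8UC3, Scalar(10, 20, 30)), rgb;
Mat m = (Mat_<float>(3, 3) << 0, 0, 1,
                              0, 1, 0,
                              1, 0, 0);
transform(bgr, rgb, m); // every pixel of rgb is (30, 20, 10)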
cv::transpose
Comments from the Wiki
void transpose(const Mat& src, Mat& dst)
Transposes a matrix.
Parameters:
    o src – The source array
    o dst – The destination array of the same type as src
The function transpose() transposes the matrix src:

    dst(i, j) = src(j, i)

Note that no complex conjugation is done in the case of a complex matrix; it should be done separately if needed.

Drawing Functions¶
Drawing functions work with matrices/images of arbitrary depth. The boundaries of the shapes can be rendered with antialiasing (implemented only for 8-bit images for now). All the functions include the parameter color that uses an RGB value (that may be constructed with CV_RGB or the Scalar constructor) for color images and brightness for grayscale images. For color images the channel order is normally Blue, Green, Red - this is what imshow(), imread() and imwrite() expect - so if you form a color using the Scalar constructor, it should look like:

    Scalar(blue_component, green_component, red_component[, alpha_component])

If you are using your own image rendering and I/O functions, you can use any channel ordering; the drawing functions process each channel independently and do not depend on the channel order or even on the used color space. The whole image can be converted from BGR to RGB or to a different color space using cvtColor().
If a drawn figure is partially or completely outside the image, the drawing functions clip it. Also, many drawing functions can handle pixel coordinates specified with sub-pixel accuracy, that is, the coordinates can be passed as fixed-point numbers, encoded as integers. The number of fractional bits is specified by the shift parameter and the real point coordinates are calculated as Point(x, y) -> Point2f(x*2^-shift, y*2^-shift). This feature is especially effective when rendering antialiased shapes.
Also, note that the functions do not support alpha-transparency - when the target image is 4-channel, the color[3] is simply copied to the repainted pixels. Thus, if you want to paint semi-transparent shapes, you can paint them in a separate buffer and then blend it with the main image.

cv::circle

Comments from the Wiki
void circle(Mat& img, Point center, int radius, const Scalar& color, int thickness=1, int lineType=8, int shift=0)
Draws a circle.
Parameters:
    o img – Image where the circle is drawn
    o center – Center of the circle
    o radius – Radius of the circle
    o color – Circle color
    o thickness – Thickness of the circle outline if positive; negative thickness means that a filled circle is to be drawn
    o lineType – Type of the circle boundary, see line() description
    o shift – Number of fractional bits in the center coordinates and radius value
The function circle draws a simple or filled circle with a given center and radius.

cv::clipLine
Comments from the Wiki
bool clipLine(Size imgSize, Point& pt1, Point& pt2)
bool clipLine(Rect imgRect, Point& pt1, Point& pt2)
Clips the line against the image rectangle.
Parameters:
    o imgSize – The image size; the image rectangle will be Rect(0, 0, imgSize.width, imgSize.height)
    o imgRect – The image rectangle
    o pt1 – The first line point
    o pt2 – The second line point
The functions clipLine calculate a part of the line segment which is entirely within the specified rectangle. They return false if the line segment is completely outside the rectangle and true otherwise.

cv::ellipse
Comments from the Wiki
void ellipse(Mat& img, Point center, Size axes, double angle, double startAngle, double endAngle, const Scalar& color, int thickness=1, int lineType=8, int shift=0)
void ellipse(Mat& img, const RotatedRect& box, const Scalar& color, int thickness=1, int lineType=8)
Draws a simple or thick elliptic arc, or fills an ellipse sector.
Parameters:
    o img – The image
    o center – Center of the ellipse
    o axes – Length of the ellipse axes
    o angle – The ellipse rotation angle in degrees
    o startAngle – Starting angle of the elliptic arc in degrees
    o endAngle – Ending angle of the elliptic arc in degrees
    o box – Alternative ellipse representation via a RotatedRect, i.e. the function draws an ellipse inscribed in the rotated rectangle
    o color – Ellipse color
    o thickness – Thickness of the ellipse arc outline if positive, otherwise this indicates that a filled ellipse sector is to be drawn
    o lineType – Type of the ellipse boundary, see line() description
    o shift – Number of fractional bits in the center coordinates and axes' values

The functions ellipse with fewer parameters draw an ellipse outline, a filled ellipse, an elliptic arc, or a filled ellipse sector. A piecewise-linear curve is used to approximate the elliptic arc boundary. If you need more control over the ellipse rendering, you can retrieve the curve using ellipse2Poly() and then render it with polylines() or fill it with fillPoly(). If you use the first variant of the function and want to draw the whole ellipse, not an arc, pass startAngle=0 and endAngle=360. The picture below explains the meaning of the parameters.

[Figure: Parameters of Elliptic Arc]

cv::ellipse2Poly
Comments from the Wiki
void ellipse2Poly(Point center, Size axes, int angle, int startAngle, int endAngle, int delta, vector<Point>& pts)
Approximates an elliptic arc with a polyline.
Parameters:
    o center – Center of the arc
    o axes – Half-sizes of the arc; see ellipse()
    o angle – Rotation angle of the ellipse in degrees; see ellipse()
    o startAngle – Starting angle of the elliptic arc in degrees
    o endAngle – Ending angle of the elliptic arc in degrees
    o delta – Angle between the subsequent polyline vertices; it defines the approximation accuracy
    o pts – The output vector of polyline vertices
The function ellipse2Poly computes the vertices of a polyline that approximates the specified elliptic arc. It is used by ellipse().

cv::fillConvexPoly
Comments from the Wiki
void fillConvexPoly(Mat& img, const Point* pts, int npts, const Scalar& color, int lineType=8, int shift=0)
Fills a convex polygon.
Parameters:
    o img – Image
    o pts – The polygon vertices
    o npts – The number of polygon vertices
    o color – Polygon color
    o lineType – Type of the polygon boundaries, see line() description

    o shift – The number of fractional bits in the vertex coordinates
The function fillConvexPoly draws a filled convex polygon. This function is much faster than the function fillPoly and can fill not only convex polygons but any monotonic polygon without self-intersections, i.e. a polygon whose contour intersects every horizontal line (scan line) twice at the most (though its top-most and/or bottom edge could be horizontal).

cv::fillPoly
Comments from the Wiki
void fillPoly(Mat& img, const Point** pts, const int* npts, int ncontours, const Scalar& color, int lineType=8, int shift=0, Point offset=Point())
Fills the area bounded by one or more polygons.
Parameters:
    o img – Image
    o pts – Array of polygons, each represented as an array of points
    o npts – The array of polygon vertex counters
    o ncontours – The number of contours that bind the filled region
    o color – Polygon color
    o lineType – Type of the polygon boundaries, see line() description
    o shift – The number of fractional bits in the vertex coordinates
The function fillPoly fills an area bounded by several polygonal contours. The function can fill complex areas, for example, areas with holes, contours with self-intersections (some of their parts), and so forth.

cv::getTextSize
Comments from the Wiki
Size getTextSize(const string& text, int fontFace, double fontScale, int thickness, int* baseLine)
Calculates the width and height of a text string.
Parameters:
    o text – The input text string
    o fontFace – The font to use; see putText()
    o fontScale – The font scale; see putText()
    o thickness – The thickness of lines used to render the text; see putText()
    o baseLine – The output parameter - y-coordinate of the baseline relative to the bottom-most text point
The function getTextSize calculates and returns the size of the box that contains the specified text. That is, the following code will render some text, the tight box surrounding it, and the baseline:

// Use "y" to show where the baseLine is
string text = "Funny text inside the box";
int fontFace = FONT_HERSHEY_SCRIPT_SIMPLEX;
double fontScale = 2;
int thickness = 3;

Mat img(600, 800, CV_8UC3, Scalar::all(0));

int baseline = 0;
Size textSize = getTextSize(text, fontFace, fontScale,
                            thickness, &baseline);
baseline += thickness;

// center the text
Point textOrg((img.cols - textSize.width)/2,
              (img.rows + textSize.height)/2);

// draw the box
rectangle(img, textOrg + Point(0, baseline),
          textOrg + Point(textSize.width, -textSize.height),
          Scalar(0, 0, 255));
// ... and the baseline first
line(img, textOrg + Point(0, thickness),
     textOrg + Point(textSize.width, thickness),
     Scalar(0, 0, 255));
// then put the text itself
putText(img, text, textOrg, fontFace, fontScale,
        Scalar::all(255), thickness, 8);

cv::line
Comments from the Wiki
void line(Mat& img, Point pt1, Point pt2, const Scalar& color, int thickness=1, int lineType=8, int shift=0)
Draws a line segment connecting two points.
Parameters:
    o img – The image
    o pt1 – First point of the line segment
    o pt2 – Second point of the line segment
    o color – Line color
    o thickness – Line thickness
    o lineType – Type of the line:
        8 (or omitted) - 8-connected line
        4 - 4-connected line
        CV_AA - antialiased line
    o shift – Number of fractional bits in the point coordinates
The function line draws the line segment between the pt1 and pt2 points in the image. The line is clipped by the image boundaries. For non-antialiased lines with integer coordinates the 8-connected or 4-connected Bresenham algorithm is used. Thick lines are drawn with rounded endings. Antialiased lines are drawn using Gaussian filtering. To specify the line color, the user may use the macro CV_RGB(r, g, b).

LineIterator
Comments from the Wiki LineIterator

Class for iterating over pixels on a raster line.

class LineIterator
{
public:
    // creates iterators for the line connecting pt1 and pt2
    // the line will be clipped on the image boundaries
    // the line is 8-connected or 4-connected
    // If leftToRight=true, then the iteration is always done
    // from the left-most point to the right-most,
    // not depending on the ordering of pt1 and pt2 parameters
    LineIterator(const Mat& img, Point pt1, Point pt2,
                 int connectivity=8, bool leftToRight=false);
    // returns pointer to the current line pixel
    uchar* operator *();
    // move the iterator to the next pixel
    LineIterator& operator ++();
    LineIterator operator ++(int);

    // internal state of the iterator
    uchar* ptr;
    int err, count;
    int minusDelta, plusDelta;
    int minusStep, plusStep;
};

The class LineIterator is used to get each pixel of a raster line. It can be treated as a versatile implementation of the Bresenham algorithm, where you can stop at each pixel and do some extra processing, for example, grab pixel values along the line, or draw a line with some effect (e.g. with XOR operation). The number of pixels along the line is stored in LineIterator::count.

// grabs pixels along the line (pt1, pt2)
// from 8-bit 3-channel image to the buffer
LineIterator it(img, pt1, pt2, 8);
vector<Vec3b> buf(it.count);

for(int i = 0; i < it.count; i++, ++it)
    buf[i] = *(const Vec3b*)*it;

cv::rectangle
Comments from the Wiki
void rectangle(Mat& img, Point pt1, Point pt2, const Scalar& color, int thickness=1, int lineType=8, int shift=0)
Draws a simple, thick, or filled up-right rectangle.
Parameters:
    o img – Image
    o pt1 – One of the rectangle's vertices
    o pt2 – Opposite to pt1 rectangle vertex
    o color – Rectangle color or brightness (grayscale image)
    o thickness – Thickness of lines that make up the rectangle. Negative values, e.g. CV_FILLED, mean that the function has to draw a filled rectangle.
    o lineType – Type of the line, see line() description

    o shift – Number of fractional bits in the point coordinates

The function rectangle draws a rectangle outline or a filled rectangle whose two opposite corners are pt1 and pt2.

cv::polylines
Comments from the Wiki
void polylines(Mat& img, const Point** pts, const int* npts, int ncontours, bool isClosed, const Scalar& color, int thickness=1, int lineType=8, int shift=0)
Draws several polygonal curves.
Parameters:
    o img – The image
    o pts – Array of polygonal curves
    o npts – Array of polygon vertex counters
    o ncontours – The number of curves
    o isClosed – Indicates whether the drawn polylines are closed or not. If they are closed, the function draws the line from the last vertex of each curve to its first vertex
    o color – Polyline color
    o thickness – Thickness of the polyline edges
    o lineType – Type of the line segments, see line() description
    o shift – The number of fractional bits in the vertex coordinates
The function polylines draws one or more polygonal curves.

cv::putText
Comments from the Wiki
void putText(Mat& img, const string& text, Point org, int fontFace, double fontScale, Scalar color, int thickness=1, int lineType=8, bool bottomLeftOrigin=false)
Draws a text string.
Parameters:
    o img – The image
    o text – The text string to be drawn
    o org – The bottom-left corner of the text string in the image
    o fontFace – The font type, one of FONT_HERSHEY_SIMPLEX, FONT_HERSHEY_PLAIN, FONT_HERSHEY_DUPLEX, FONT_HERSHEY_COMPLEX, FONT_HERSHEY_TRIPLEX, FONT_HERSHEY_COMPLEX_SMALL, FONT_HERSHEY_SCRIPT_SIMPLEX or FONT_HERSHEY_SCRIPT_COMPLEX, where each of the font IDs can be combined with FONT_HERSHEY_ITALIC to get the slanted letters
    o fontScale – The font scale factor that is multiplied by the font-specific base size
    o color – The text color
    o thickness – Thickness of the lines used to draw the text
    o lineType – The line type; see line() for details
    o bottomLeftOrigin – When true, the image data origin is at the bottom-left corner, otherwise it is at the top-left corner
The function putText renders the specified text string in the image. Symbols that cannot be rendered using the specified font are replaced by question marks. See getTextSize() for a text rendering code example.
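To tie the drawing primitives of this section together, here is a short sketch (an illustration added here, not from the original text; it assumes highgui.h is also included for display):

Mat canvas(400, 600, CV_8UC3, Scalar::all(0));
line(canvas, Point(50, 50), Point(550, 50), Scalar(0, 255, 0), 2, CV_AA);
rectangle(canvas, Point(50, 100), Point(250, 200), Scalar(0, 0, 255), CV_FILLED);
circle(canvas, Point(450, 150), 50, Scalar(255, 0, 0), 3);
putText(canvas, "Hello", Point(60, 300), FONT_HERSHEY_SIMPLEX,
        2, Scalar::all(255), 3);
imshow("drawing", canvas);
waitKey();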

XML/YAML Persistence¶
FileStorage
Comments from the Wiki
FileStorage
The XML/YAML file storage class.

class FileStorage
{
public:
    enum { READ=0, WRITE=1, APPEND=2 };
    enum { UNDEFINED=0, VALUE_EXPECTED=1, NAME_EXPECTED=2, INSIDE_MAP=4 };
    // the default constructor
    FileStorage();
    // the constructor that opens the file for reading
    // (flags=FileStorage::READ) or writing (flags=FileStorage::WRITE)
    FileStorage(const string& filename, int flags);
    // wraps the already opened CvFileStorage*
    FileStorage(CvFileStorage* fs);
    // the destructor; closes the file if needed
    virtual ~FileStorage();

    // opens the specified file for reading (flags=FileStorage::READ)
    // or writing (flags=FileStorage::WRITE)
    virtual bool open(const string& filename, int flags);
    // checks if the storage is opened
    virtual bool isOpened() const;
    // closes the file
    virtual void release();

    // returns the first top-level node
    FileNode getFirstTopLevelNode() const;
    // returns the root file node
    // (it's the parent of the first top-level node)
    FileNode root(int streamidx=0) const;
    // returns the top-level node by name
    FileNode operator[](const string& nodename) const;
    FileNode operator[](const char* nodename) const;
    // returns the underlying CvFileStorage*
    CvFileStorage* operator *() { return fs; }
    const CvFileStorage* operator *() const { return fs; }

    // writes the certain number of elements of the specified format
    // (see DataType) without any headers
    void writeRaw( const string& fmt, const uchar* vec, size_t len );
    // writes an old-style object (CvMat, CvMatND etc.)
    void writeObj( const string& name, const void* obj );

    // returns the default object name from the filename
    // (used by cvSave() with the default object name etc.)

    static string getDefaultObjectName(const string& filename);

    Ptr<CvFileStorage> fs;
    string elname;
    vector<char> structs;
    int state;
};
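A typical round-trip looks like the following sketch (an illustration added here; the streaming operators are declared elsewhere in cxcore, and the file name and keys are arbitrary):

// writing
FileStorage fs("test.yml", FileStorage::WRITE);
Mat cameraMatrix = Mat::eye(3, 3, CV_64F);
fs << "frameCount" << 5;
fs << "cameraMatrix" << cameraMatrix;
fs.release();

// reading
FileStorage fs2("test.yml", FileStorage::READ);
int frameCount = (int)fs2["frameCount"];
Mat M;
fs2["cameraMatrix"] >> M;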

FileNode
Comments from the Wiki
FileNode
The XML/YAML file node class.

class CV_EXPORTS FileNode
{
public:
    enum { NONE=0, INT=1, REAL=2, FLOAT=REAL, STR=3, STRING=STR,
           REF=4, SEQ=5, MAP=6, TYPE_MASK=7,
           FLOW=8, USER=16, EMPTY=32, NAMED=64 };
    FileNode();
    FileNode(const CvFileStorage* fs, const CvFileNode* node);
    FileNode(const FileNode& node);
    FileNode operator[](const string& nodename) const;
    FileNode operator[](const char* nodename) const;
    FileNode operator[](int i) const;
    int type() const;
    int rawDataSize(const string& fmt) const;
    bool empty() const;
    bool isNone() const;
    bool isSeq() const;
    bool isMap() const;
    bool isInt() const;
    bool isReal() const;
    bool isString() const;
    bool isNamed() const;
    string name() const;
    size_t size() const;
    operator int() const;
    operator float() const;
    operator double() const;
    operator string() const;

    FileNodeIterator begin() const;
    FileNodeIterator end() const;

    void readRaw( const string& fmt, uchar* vec, size_t len ) const;
    void* readObj() const;

    // do not use wrapper pointer classes for better efficiency
    const CvFileStorage* fs;
    const CvFileNode* node;
};

FileNodeIterator
Comments from the Wiki
FileNodeIterator
The XML/YAML file node iterator class.

class CV_EXPORTS FileNodeIterator
{
public:
    FileNodeIterator();
    FileNodeIterator(const CvFileStorage* fs, const CvFileNode* node, size_t ofs=0);
    FileNodeIterator(const FileNodeIterator& it);
    FileNode operator *() const;
    FileNode operator ->() const;

    FileNodeIterator& operator ++();
    FileNodeIterator operator ++(int);
    FileNodeIterator& operator --();
    FileNodeIterator operator --(int);
    FileNodeIterator& operator += (int);
    FileNodeIterator& operator -= (int);

    FileNodeIterator& readRaw( const string& fmt, uchar* vec,
                               size_t maxCount=(size_t)INT_MAX );

    const CvFileStorage* fs;
    const CvFileNode* container;
    CvSeqReader reader;
    size_t remaining;
};

Clustering
cv::kmeans
Comments from the Wiki
double kmeans(const Mat& samples, int clusterCount, Mat& labels, TermCriteria termcrit, int attempts, int flags, Mat* centers)
Finds the centers of clusters and groups the input samples around the clusters.
Parameters:
    o samples – Floating-point matrix of input samples, one row per sample
    o clusterCount – The number of clusters to split the set by
    o labels – The input/output integer array that will store the cluster indices for every sample
    o termcrit – Specifies the maximum number of iterations and/or accuracy (distance the centers can move by between subsequent iterations)
    o attempts – How many times the algorithm is executed using different initial labelings. The algorithm returns the labels that yield the best compactness (see the last function parameter)
    o flags – It can take the following values:
        KMEANS_RANDOM_CENTERS - Random initial centers are selected in each attempt
        KMEANS_PP_CENTERS - Use the kmeans++ center initialization by Arthur and Vassilvitskii
        KMEANS_USE_INITIAL_LABELS - During the first (and possibly the only) attempt, the function uses the user-supplied labels instead of computing them from the initial centers. For the second and further attempts, the function will use the random or semi-random centers (use one of the KMEANS_*_CENTERS flags to specify the exact method)
    o centers – The output matrix of the cluster centers, one row per each cluster center
The function kmeans implements a k-means algorithm that finds the centers of clusterCount clusters and groups the input samples around the clusters. On output, labels[i] contains a 0-based cluster index for the sample stored in the i-th row of the samples matrix.
The function returns the compactness measure, which is computed as the sum, over all samples, of the squared distance between each sample and the center of its assigned cluster, evaluated after every attempt; the best (minimum) value is chosen and the corresponding labels and the compactness value are returned by the function. Basically, the user can use only the core of the function: set the number of attempts to 1, initialize labels each time using some custom algorithm and pass them with the (flags = KMEANS_USE_INITIAL_LABELS) flag, and then choose the best (most-compact) clustering.

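A compact usage sketch (an illustration added here; the point set and parameter values are arbitrary):

// cluster 100 random 2D points into 3 groups
Mat samples(100, 2, CV_32F), labels, centers;
randu(samples, Scalar::all(0), Scalar::all(100));
double compactness = kmeans(samples, 3, labels,
    TermCriteria(TermCriteria::MAX_ITER + TermCriteria::EPS, 10, 1.0),
    3, KMEANS_PP_CENTERS, &centers);
// labels.at<int>(i) now holds the cluster index of samples.row(i)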
cv::partition
Comments from the Wiki
template<typename _Tp, class _EqPredicate> int partition(const vector<_Tp>& vec, vector<int>& labels, _EqPredicate predicate=_EqPredicate())
Splits an element set into equivalency classes.
Parameters:
    o vec – The set of elements stored as a vector
    o labels – The output vector of labels; it will contain as many elements as vec. Each label labels[i] is a 0-based cluster index of vec[i]
    o predicate – The equivalence predicate, i.e. a pointer to a boolean function of two arguments, or an instance of a class that has the method bool operator()(const _Tp& a, const _Tp& b). The predicate returns true when the elements are certainly in the same class, and false if they may or may not be in the same class
The generic function partition implements an algorithm for splitting a set of elements into one or more equivalency classes, as described in http://en.wikipedia.org/wiki/Disjoint-set_data_structure . The function returns the number of equivalency classes.

Utility and System Functions and Macros¶

cv::alignPtr

Comments from the Wiki
template<typename _Tp> _Tp* alignPtr(_Tp* ptr, int n=sizeof(_Tp))
Aligns a pointer to the specified number of bytes.
Parameters:
    o ptr – The pointer to align
    o n – The alignment size; must be a power of two
The function returns the aligned pointer of the same type as the input pointer:

    (_Tp*)(((size_t)ptr + n-1) & -n)

cv::alignSize
Comments from the Wiki
size_t alignSize(size_t sz, int n)
Aligns a buffer size to the specified number of bytes.
Parameters:
    o sz – The buffer size to align
    o n – The alignment size; must be a power of two
The function returns the minimum number that is greater or equal to sz and is divisible by n:

    (sz + n-1) & -n

cv::allocate
Comments from the Wiki
template<typename _Tp> _Tp* allocate(size_t n)
Allocates an array of elements.
Parameters:
    o n – The number of elements to allocate
The generic function allocate allocates a buffer for the specified number of elements. For each element the default constructor is called.

cv::deallocate
Comments from the Wiki
template<typename _Tp> void deallocate(_Tp* ptr, size_t n)
Deallocates an array of elements.
Parameters:
    o ptr – Pointer to the deallocated buffer
    o n – The number of elements in the buffer
The generic function deallocate deallocates the buffer allocated with allocate(). The number of elements must match the number passed to allocate().

CV_Assert
Comments from the Wiki
CV_Assert(expr)
Checks a condition at runtime.

#define CV_Assert( expr ) ...
#define CV_DbgAssert( expr ) ...

Parameters:
    o expr – The checked expression
The macros CV_Assert and CV_DbgAssert evaluate the specified expression and if it is 0, the macros raise an error (see error()). The macro CV_Assert checks the condition in both Debug and Release configurations, while CV_DbgAssert is only retained in the Debug configuration.

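A typical use is input validation at the top of a function, as in this sketch (an illustration added here; the function itself is hypothetical):

float sampleAt(const Mat& m, int i, int j)
{
    // the check stays active in both Debug and Release builds
    CV_Assert( m.type() == CV_32FC1 &&
               (unsigned)i < (unsigned)m.rows &&
               (unsigned)j < (unsigned)m.cols );
    return m.at<float>(i, j);
}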
cv::error
Comments from the Wiki
void error(const Exception& exc)
#define CV_Error( code, msg ) <...>
#define CV_Error_( code, args ) <...>
Signals an error and raises the exception.
Parameters:
    o exc – The exception to throw
    o code – The error code, normally a negative value. The list of pre-defined error codes can be found in cxerror.h
    o msg – Text of the error message
    o args – printf-like formatted error message in parentheses
The function and the helper macros CV_Error and CV_Error_ call the error handler. Currently, the error handler prints the error code (exc.code), the context (exc.file, exc.line and exc.func) and the error message exc.err to the standard error stream stderr. In Debug configuration it then provokes a memory access violation, so that the execution stack and all the parameters can be analyzed in a debugger. In Release configuration the exception exc is thrown.
The macro CV_Error_ can be used to construct the error message on-fly to include some dynamic information, for example:

// note the extra parentheses around the formatted text message
CV_Error_(CV_StsOutOfRange,
    ("the matrix element (%d,%d)=%g is out of range",
     i, j, mtx.at<float>(i,j)))

Exception
Comments from the Wiki
Exception
The exception class passed to error.

class Exception
{
public:
    // various constructors and the copy operation
    Exception() { code = 0; line = 0; }
    Exception(int _code, const string& _err,
              const string& _func, const string& _file, int _line);
    Exception(const Exception& exc);
    Exception& operator = (const Exception& exc);

    // the error code
    int code;
    // the error text message
    string err;
    // function name where the error happened
    string func;
    // the source file name where the error happened
    string file;
    // the source file line where the error happened
    int line;
};

The class Exception encapsulates all or almost all the necessary information about the error that happened in the program. The exception is usually constructed and thrown implicitly via the CV_Error and CV_Error_ macros, see error().

cv::fastFree
Comments from the Wiki
void fastFree(void* ptr)
Deallocates a memory buffer.
Parameters:
    o ptr – Pointer to the allocated buffer
The function deallocates the buffer allocated with fastMalloc(). If NULL pointer is passed, the function does nothing.

cv::fastMalloc
Comments from the Wiki
void* fastMalloc(size_t size)
Allocates an aligned memory buffer.
Parameters:
    o size – The allocated buffer size
The function allocates a buffer of the specified size and returns it. When the buffer size is 16 bytes or more, the returned buffer is aligned on 16 bytes.

cv::format
Comments from the Wiki
string format(const char* fmt, ...)
Returns a text string formatted using a printf-like expression.
Parameters:
    o fmt – The printf-compatible formatting specifiers
The function acts like sprintf, but forms and returns an STL string. It can be used to form the error message in the Exception() constructor.

cv::getNumThreads
Comments from the Wiki
int getNumThreads()
Returns the number of threads used by OpenCV.
The function returns the number of threads that is used by OpenCV.
See also: setNumThreads(), getThreadNum()

cv::getThreadNum

g. See also the tick frequency. bool accumulate=false) void calcHist(const Mat* arrays. const int* histSize.. MatND& hist.t)/getTickFrequency(). int dims. const float** ranges. which is usually equal to the number of the processing cores.Comments from the Wiki int getThreadNum() Returns index of the currently executed thread The function returns 0-based index of the currently executed thread. cv::setNumThreads Comments from the Wiki void setNumThreads(int nthreads) Sets the number of threads used by OpenCV Parameters: o nthreads– The number of threads used by OpenCV The function sets the number of threads used by OpenCV in parallel OpenMP regions. const int* channels. t = ((double)getTickCount() . See also: getNumThreads() . getThreadNum() imgproc. cv::getTickFrequency Comments from the Wiki double getTickFrequency() Returns the number of ticks per second The function returns the number of ticks per second. bool uniform=true. If nthreads=0 . int narrays.. int narrays. When OpenCV is built without OpenMP support. // do something . SparseMat& hist. const Mat& mask. double t = (double)getTickCount(). cv::getTickCount Comments from the Wiki int64 getTickCount() Returns the number of ticks The function returns the number of ticks since the certain event (e. The function is only valid inside a parallel OpenMP region. It can be used to initialize RNG() or to measure a function execution time by reading the tick count before and after the function call. when the machine was turned on). That is. const Mat& mask. See also: setNumThreads() . . the function will use the default number of threads. Image Processing Histograms¶ cv::calcHist Comments from the Wiki void calcHist(const Mat* arrays. const int* channels. getNumThreads() . the following code computes the execution time in seconds. the function always returns 0.

Calculates histogram of a set of arrays
Parameters: o arrays– Source arrays. They all should have the same depth, CV_8U or CV_32F , and the same size. Each of them can have an arbitrary number of channels
o narrays– The number of source arrays
o channels– The list of dims channels that are used to compute the histogram. The first array channels are numerated from 0 to arrays[0].channels()-1 , the second array channels are counted from arrays[0].channels() to arrays[0].channels() + arrays[1].channels()-1 etc.
o mask– The optional mask. If the matrix is not empty, it must be an 8-bit array of the same size as arrays[i] . The non-zero mask elements mark the array elements that are counted in the histogram
o hist– The output histogram, a dense or sparse dims -dimensional array
o dims– The histogram dimensionality; must be positive and not greater than CV_MAX_DIMS (=32 in the current OpenCV version)
o histSize– The array of histogram sizes in each dimension
o ranges– The array of dims arrays of the histogram bin boundaries in each dimension. When the histogram is uniform ( uniform =true), then for each dimension i it's enough to specify the lower (inclusive) boundary L_0 of the 0-th histogram bin and the upper (exclusive) boundary U_{histSize[i]-1} for the last histogram bin histSize[i]-1 . That is, in the case of uniform histogram each of ranges[i] is an array of 2 elements. When the histogram is not uniform ( uniform=false ), then each of ranges[i] contains histSize[i]+1 elements: L_0, U_0=L_1, U_1=L_2, ..., U_{histSize[i]-2}=L_{histSize[i]-1}, U_{histSize[i]-1} . The array elements, which are not between L_0 and U_{histSize[i]-1} , are not counted in the histogram
o uniform– Indicates whether the histogram is uniform or not, see above
o accumulate– Accumulation flag. If it is set, the histogram is not cleared in the beginning (when it is allocated). This feature allows user to compute a single histogram from several sets of arrays, or to update the histogram in time
The functions calcHist calculate the histogram of one or more arrays. The elements of a tuple that is used to increment a histogram bin are taken at the same location from the corresponding input arrays. The sample below shows how to compute 2D Hue-Saturation histogram for a color image:

#include <cv.h>
#include <highgui.h>

using namespace cv;

int main( int argc, char** argv )
{
    Mat src, hsv;
    if( argc != 2 || !(src=imread(argv[1], 1)).data )
        return -1;

    cvtColor(src, hsv, CV_BGR2HSV);

    // let's quantize the hue to 30 levels
    // and the saturation to 32 levels
    int hbins = 30, sbins = 32;
    int histSize[] = {hbins, sbins};
    // hue varies from 0 to 179, see cvtColor
    float hranges[] = { 0, 180 };
    // saturation varies from 0 (black-gray-white) to
    // 255 (pure spectrum color)
    float sranges[] = { 0, 256 };
    const float* ranges[] = { hranges, sranges };
    MatND hist;
    // we compute the histogram from the 0-th and 1-st channels
    int channels[] = {0, 1};

    calcHist( &hsv, 1, channels, Mat(), // do not use mask
              hist, 2, histSize, ranges,
              true, // the histogram is uniform
              false );
    double maxVal=0;
    minMaxLoc(hist, 0, &maxVal, 0, 0);

    int scale = 10;
    Mat histImg = Mat::zeros(sbins*scale, hbins*10, CV_8UC3);

    for( int h = 0; h < hbins; h++ )
        for( int s = 0; s < sbins; s++ )
        {
            float binVal = hist.at<float>(h, s);
            int intensity = cvRound(binVal*255/maxVal);
            rectangle( histImg, Point(h*scale, s*scale),
                       Point( (h+1)*scale - 1, (s+1)*scale - 1),
                       Scalar::all(intensity),
                       CV_FILLED );
        }

    namedWindow( "Source", 1 );
    imshow( "Source", src );

    namedWindow( "H-S Histogram", 1 );
    imshow( "H-S Histogram", histImg );

    waitKey();
}

cv::calcBackProject
Comments from the Wiki
void calcBackProject(const Mat* arrays, int narrays, const int* channels, const MatND& hist, Mat& backProject, const float** ranges, double scale=1, bool uniform=true)
void calcBackProject(const Mat* arrays, int narrays, const int* channels, const SparseMat& hist, Mat& backProject, const float** ranges, double scale=1, bool uniform=true)
Calculates the back projection of a histogram.
Parameters: o arrays– Source arrays. They all should have the same depth, CV_8U

or CV_32F , and the same size. Each of them can have an arbitrary number of channels
o narrays– The number of source arrays
o channels– The list of channels that are used to compute the back projection. The number of channels must match the histogram dimensionality. The first array channels are numerated from 0 to arrays[0].channels()-1 , the second array channels are counted from arrays[0].channels() to arrays[0].channels() + arrays[1].channels()-1 etc.
o hist– The input histogram, a dense or sparse
o backProject– Destination back projection array; will be a single-channel array of the same size and the same depth as arrays[0]
o ranges– The array of arrays of the histogram bin boundaries in each dimension. See calcHist()
o scale– The optional scale factor for the output back projection
o uniform– Indicates whether the histogram is uniform or not, see above
The functions calcBackProject calculate the back project of the histogram. That is, similarly to calcHist , at each location (x, y) the function collects the values from the selected channels in the input images and finds the corresponding histogram bin. But instead of incrementing it, the function reads the bin value, scales it by scale and stores in backProject(x, y) . In terms of statistics, the function computes probability of each element value in respect with the empirical probability distribution represented by the histogram. Here is how, for example, you can find and track a bright-colored object in a scene:
1 Before the tracking, show the object to the camera so that it covers almost the whole frame. Calculate a hue histogram. The histogram will likely have strong maxima, corresponding to the dominant colors in the object.
2 During the tracking, calculate back projection of a hue plane of each input video frame using that pre-computed histogram. Threshold the back projection to suppress weak colors. It may also make sense to suppress pixels with insufficient color saturation and too dark or too bright pixels.
3 Find connected components in the resulting picture and choose, for example, the largest component.
That is the approximate algorithm of CAMShift() color object tracker.
See also: calcHist()
cv::compareHist
Comments from the Wiki
double compareHist(const MatND& H1, const MatND& H2, int method)
double compareHist(const SparseMat& H1, const SparseMat& H2, int method)
Compares two histograms
Parameters: o H1– The first compared histogram
o H2– The second compared histogram of the same size as H1
o method– The comparison method, one of the following:
CV_COMP_CORREL Correlation
CV_COMP_CHISQR Chi-Square
CV_COMP_INTERSECT Intersection
CV_COMP_BHATTACHARYYA Bhattacharyya distance
The functions compareHist compare two dense or two sparse histograms using the specified method:

Correlation (method=CV_COMP_CORREL):

d(H1,H2) = sum_I (H1(I) - mean(H1))*(H2(I) - mean(H2)) /
           sqrt( sum_I (H1(I) - mean(H1))^2 * sum_I (H2(I) - mean(H2))^2 )

where mean(Hk) = (1/N) * sum_J Hk(J) and N is the total number of histogram bins.

Chi-Square (method=CV_COMP_CHISQR):

d(H1,H2) = sum_I (H1(I) - H2(I))^2 / H1(I)

Intersection (method=CV_COMP_INTERSECT):

d(H1,H2) = sum_I min(H1(I), H2(I))

Bhattacharyya distance (method=CV_COMP_BHATTACHARYYA):

d(H1,H2) = sqrt( 1 - (1/sqrt(mean(H1)*mean(H2)*N^2)) * sum_I sqrt(H1(I)*H2(I)) )

The function returns d(H1, H2) .
While the function works well with 1-, 2-, 3-dimensional dense histograms, it may not be suitable for high-dimensional sparse histograms, where, because of aliasing and sampling problems the coordinates of non-zero histogram bins can slightly shift. To compare such histograms or more general sparse configurations of weighted points, consider using the calcEMD() function.
cv::equalizeHist
Comments from the Wiki
void equalizeHist(const Mat& src, Mat& dst)
Equalizes the histogram of a grayscale image.
Parameters: o src– The source 8-bit single channel image
o dst– The destination image; will have the same size and the same type as src
The function equalizes the histogram of the input image using the following algorithm:
1 calculate the histogram H for src .
2 normalize the histogram so that the sum of histogram bins is 255.
3 compute the integral of the histogram: H'(i) = sum_{0 <= j < i} H(j)
4 transform the image using H' as a look-up table: dst(x, y) = H'(src(x, y))
The algorithm normalizes the brightness and increases the contrast of the image.
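A minimal usage sketch (not part of the original reference; the file name is arbitrary):

// load the image as grayscale and equalize its histogram
Mat gray = imread("lena.jpg", 0), equalized;
if( gray.data )
{
    equalizeHist(gray, equalized);
    imshow("equalized", equalized);
    waitKey();
}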

Image Filtering
Functions and classes described in this section are used to perform various linear or non-linear filtering operations on 2D images (represented as Mat() 's), that is, for each pixel location (x, y) in the source image some its (normally rectangular) neighborhood is considered and used to compute the response. In case of a linear filter it is a weighted sum of pixel values, in case of morphological operations it is the minimum or maximum etc. The computed response is stored to the destination image at the same location (x, y) . It means, that the output image will be of the same size as the input image. Normally, the functions support multi-channel arrays, in which case every channel is processed independently, therefore the output image will also have the same number of channels as the input one.
Another common feature of the functions and classes described in this section is that, unlike simple arithmetic functions, they need to extrapolate values of some non-existing pixels. For example, if we want to smooth an image using a Gaussian 3x3 filter, then during the processing of the left-most pixels in each row we need pixels to the left of them, i.e. outside of the image. We can let those pixels be the same as the left-most image pixels (i.e. use "replicated border" extrapolation method), or assume that all the non-existing pixels are zeros ("constant border" extrapolation method) etc. OpenCV lets the user specify the extrapolation method; see the function borderInterpolate() and the discussion of the borderType parameter in various functions below.
BaseColumnFilter
Comments from the Wiki
BaseColumnFilter
Base class for filters with single-column kernels

class BaseColumnFilter
{
public:
    virtual ~BaseColumnFilter();

    // To be overriden by the user.
    //
    // runs filtering operation on the set of rows,
    // "dstcount + ksize - 1" rows on input,
    // "dstcount" rows on output,
    // each input and output row has "width" elements
    // the filtered rows are written into "dst" buffer.
    virtual void operator()(const uchar** src, uchar* dst, int dststep,
                            int dstcount, int width) = 0;
    // resets the filter state (may be needed for IIR filters)
    virtual void reset();

    int ksize;  // the aperture size
    int anchor; // position of the anchor point,
                // normally not used during the processing
};

The class BaseColumnFilter is the base class for filtering data using single-column kernels. The filtering does not have to be a linear operation. In general, it could be written as following:

dst(x, y) = F(src(x, y), src(x, y+1), ..., src(x, y+ksize-1))

where F is the filtering function, but, as it is represented as a class, it can produce any side effects, memorize previously processed data etc. The class only defines the interface and is not used directly. Instead, there are several functions in OpenCV (and you can add more) that return pointers to the derived classes that implement specific filtering operations. Those pointers are then passed to FilterEngine() constructor. While the filtering operation interface uses uchar type, a particular implementation is not limited to 8-bit data.
See also: BaseRowFilter() , BaseFilter() , FilterEngine() , getColumnSumFilter() , getLinearColumnFilter() , getMorphologyColumnFilter()

BaseFilter
Comments from the Wiki
BaseFilter
Base class for 2D image filters

class BaseFilter
{
public:
    virtual ~BaseFilter();

    // To be overriden by the user.
    //
    // runs filtering operation on the set of rows,
    // "dstcount + ksize.height - 1" rows on input,
    // "dstcount" rows on output,
    // each input row has "(width + ksize.width-1)*cn" elements
    // each output row has "width*cn" elements.
    // the filtered rows are written into "dst" buffer.
    virtual void operator()(const uchar** src, uchar* dst, int dststep,
                            int dstcount, int width, int cn) = 0;
    // resets the filter state (may be needed for IIR filters)
    virtual void reset();

    Size ksize;
    Point anchor;
};

The class BaseFilter is the base class for filtering data using 2D kernels. The filtering does not have to be a linear operation. In general, it could be written as following:

dst(x, y) = F( src(x, y), src(x+1, y), ..., src(x+ksize.width-1, y),
               src(x, y+1), src(x+1, y+1), ..., src(x+ksize.width-1, y+1),
               ...,
               src(x, y+ksize.height-1), ..., src(x+ksize.width-1, y+ksize.height-1) )

where F is the filtering function. The class only defines the interface and is not used directly. Instead, there are several functions in OpenCV (and you can add more) that return pointers to the derived classes that implement specific filtering operations. Those pointers are then passed to FilterEngine() constructor. While the filtering operation interface uses uchar type, a particular implementation is not limited to 8-bit data.
See also: BaseColumnFilter() , BaseRowFilter() , FilterEngine() , getLinearFilter() , getMorphologyFilter()
BaseRowFilter
Comments from the Wiki
BaseRowFilter
Base class for filters with single-row kernels

class BaseRowFilter
{
public:
    virtual ~BaseRowFilter();

    // To be overriden by the user.
    //
    // runs filtering operation on the single input row
    // of "width" elements, each element has "cn" channels.
    // the filtered row is written into "dst" buffer.
    virtual void operator()(const uchar* src, uchar* dst,
                            int width, int cn) = 0;

    int ksize, anchor;
};

The class BaseRowFilter is the base class for filtering data using single-row kernels. The filtering does not have to be a linear operation. In general, it could be written as following:

dst(x, y) = F(src(x, y), src(x+1, y), ..., src(x+ksize-1, y))

where F is the filtering function. The class only defines the interface and is not used directly. Instead, there are several functions in OpenCV (and you can add more) that return pointers to the derived classes that implement specific filtering operations. Those pointers are then passed to FilterEngine() constructor. While the filtering operation interface uses uchar type, a particular implementation is not limited to 8-bit data.
See also: BaseColumnFilter() , BaseFilter() , FilterEngine() , getLinearRowFilter() , getMorphologyRowFilter() , getRowSumFilter()
FilterEngine
Comments from the Wiki
FilterEngine
Generic image filtering class

class FilterEngine
{
public:
    // empty constructor
    FilterEngine();
    // builds a 2D non-separable filter (!_filter2D.empty()) or
    // a separable filter (!_rowFilter.empty() && !_columnFilter.empty());
    // the input data type will be "srcType", the output data type will be "dstType",
    // the intermediate data type is "bufType".
    // _rowBorderType and _columnBorderType determine how the image
    // will be extrapolated beyond the image boundaries.
    // _borderValue is only used when _rowBorderType and/or _columnBorderType
    // == cv::BORDER_CONSTANT
    FilterEngine(const Ptr<BaseFilter>& _filter2D,
                 const Ptr<BaseRowFilter>& _rowFilter,
                 const Ptr<BaseColumnFilter>& _columnFilter,
                 int srcType, int dstType, int bufType,
                 int _rowBorderType=BORDER_REPLICATE,
                 // use _rowBorderType by default
                 int _columnBorderType=-1,
                 const Scalar& _borderValue=Scalar());
    virtual ~FilterEngine();
    // separate function for the engine initialization
    void init(const Ptr<BaseFilter>& _filter2D,
              const Ptr<BaseRowFilter>& _rowFilter,
              const Ptr<BaseColumnFilter>& _columnFilter,
              int srcType, int dstType, int bufType,

              int _rowBorderType=BORDER_REPLICATE, int _columnBorderType=-1,
              const Scalar& _borderValue=Scalar());
    // starts filtering of the ROI in an image of size "wholeSize".
    // returns the starting y-position in the source image.
    virtual int start(Size wholeSize, Rect roi, int maxBufRows=-1);
    // alternative form of start that takes the image
    // itself instead of "wholeSize". Set isolated to true to pretend that
    // there are no real pixels outside of the ROI
    // (so that the pixels will be extrapolated using the specified border modes)
    virtual int start(const Mat& src, const Rect& srcRoi=Rect(0,0,-1,-1),
                      bool isolated=false, int maxBufRows=-1);
    // processes the next portion of the source image,
    // "srcCount" rows starting from "src" and
    // stores the results to "dst".
    // returns the number of produced rows
    virtual int proceed(const uchar* src, int srcStep, int srcCount,
                        uchar* dst, int dstStep);
    // higher-level function that processes the whole
    // ROI or the whole image with a single call
    virtual void apply( const Mat& src, Mat& dst,
                        const Rect& srcRoi=Rect(0,0,-1,-1),
                        Point dstOfs=Point(0,0),
                        bool isolated=false);
    bool isSeparable() const { return filter2D.empty(); }
    // how many rows from the input image are not yet processed
    int remainingInputRows() const;
    // how many output rows are not yet produced
    int remainingOutputRows() const;
    ...
    // the starting and the ending rows in the source image
    int startY, endY;

    // pointers to the filters
    Ptr<BaseFilter> filter2D;
    Ptr<BaseRowFilter> rowFilter;
    Ptr<BaseColumnFilter> columnFilter;
};

The class FilterEngine can be used to apply an arbitrary filtering operation to an image. It contains all the necessary intermediate buffers, it computes extrapolated values of the "virtual" pixels outside of the image etc. Pointers to the initialized FilterEngine instances are returned by various create*Filter functions, see below, and they are used inside high-level functions such as filter2D() , erode() , dilate() etc., that is, the class is the workhorse in many of OpenCV filtering functions.
This class makes it easier (though, maybe not very easy yet) to combine filtering operations with other operations, such as color space conversions, thresholding, arithmetic operations, etc. By combining several operations together you can get much better performance because your data will stay in cache. For example, below is the implementation of Laplace operator for floating-point images, which is a simplified implementation of Laplacian() :

void laplace_f(const Mat& src, Mat& dst)
{
    CV_Assert( src.type() == CV_32F );
    // the aperture size, coefficient type and border mode are fixed here
    const int ksize = 5, ktype = CV_32F, borderType = BORDER_DEFAULT;
    dst.create(src.size(), src.type());

    // get the derivative and smooth kernels for d2I/dx2.
    // for d2I/dy2 we could use the same kernels, just swapped
    Mat kd, ks;
    getSobelKernels( kd, ks, 2, 0, ksize, false, ktype );

    // let's process 10 source rows at once
    int DELTA = std::min(10, src.rows);
    Ptr<FilterEngine> Fxx = createSeparableLinearFilter(src.type(),
        dst.type(), kd, ks, Point(-1,-1), 0, borderType, borderType, Scalar() );
    Ptr<FilterEngine> Fyy = createSeparableLinearFilter(src.type(),
        dst.type(), ks, kd, Point(-1,-1), 0, borderType, borderType, Scalar() );

    int y = Fxx->start(src), dsty = 0, dy = 0;
    Fyy->start(src);
    const uchar* sptr = src.data + y*src.step;

    // allocate the buffers for the spatial image derivatives;
    // the buffers need to have more than DELTA rows, because at the
    // last iteration the output may take max(kd.rows-1,ks.rows-1)
    // rows more than the input.
    Mat Ixx( DELTA + kd.rows - 1, src.cols, dst.type() );
    Mat Iyy( DELTA + kd.rows - 1, src.cols, dst.type() );

    // inside the loop we always pass DELTA rows to the filter
    // (note that the "proceed" method takes care of possibe overflow, since
    // it was given the actual image height in the "start" method)
    // on output we can get:
    //  * < DELTA rows (the initial buffer accumulation stage)
    //  * = DELTA rows (settled state in the middle)
    //  * > DELTA rows (then the input image is over, but we generate
    //    "virtual" rows using the border mode and filter them)
    // this variable number of output rows is dy.
    // dsty is the current output row.
    // sptr is the pointer to the first input row in the portion to process
    for( ; dsty < dst.rows; sptr += DELTA*src.step, dsty += dy )
    {
        Fxx->proceed( sptr, (int)src.step, DELTA, Ixx.data, (int)Ixx.step );
        dy = Fyy->proceed( sptr, (int)src.step, DELTA, Iyy.data, (int)Iyy.step );
        if( dy > 0 )
        {
            Mat dstripe = dst.rowRange(dsty, dsty + dy);
            add(Ixx.rowRange(0, dy), Iyy.rowRange(0, dy), dstripe);
        }
    }
}

If you do not need that much control of the filtering process, you can simply use the FilterEngine::apply method. Here is how the method is actually implemented:

void FilterEngine::apply(const Mat& src, Mat& dst,
    const Rect& srcRoi, Point dstOfs, bool isolated)
{
    // check matrix types
    CV_Assert( src.type() == srcType && dst.type() == dstType );

    // handle the "whole image" case
    Rect _srcRoi = srcRoi;
    if( _srcRoi == Rect(0,0,-1,-1) )
        _srcRoi = Rect(0,0,src.cols,src.rows);

    // check if the destination ROI is inside the dst;
    // FilterEngine::start will check if the source ROI is inside src.
    CV_Assert( dstOfs.x >= 0 && dstOfs.y >= 0 &&
        dstOfs.x + _srcRoi.width <= dst.cols &&
        dstOfs.y + _srcRoi.height <= dst.rows );

    // start filtering
    int y = start(src, _srcRoi, isolated);

    // process the whole ROI. Note that "endY - startY" is the total number
    // of the source rows to process
    // (including the possible rows outside of srcRoi but inside the source image)
    proceed( src.data + y*src.step,
             (int)src.step, endY - startY,
             dst.data + dstOfs.y*dst.step + dstOfs.x*dst.elemSize(),
             (int)dst.step );
}

Unlike the earlier versions of OpenCV, now the filtering operations fully support the notion of image ROI, that is, pixels outside of the ROI but inside the image can be used in the filtering operations. For example, you can take a ROI of a single pixel and filter it - that will be a filter response at that particular pixel (however, it's possible to emulate the old behavior by passing isolated=false to FilterEngine::start or FilterEngine::apply ). You can pass the ROI explicitly to FilterEngine::apply , or construct new matrix headers:

// compute dI/dx derivative at src(x,y)

// method 1:
// form a matrix header for a single value
float val1 = 0;
Mat dst1(1, 1, CV_32F, &val1);

Ptr<FilterEngine> Fx = createDerivFilter(CV_32F, CV_32F,
                                         1, 0, 3, BORDER_REFLECT_101);
Fx->apply(src, dst1, Rect(x, y, 1, 1), Point());

// method 2:
// form a matrix header for a single value
float val2 = 0;
Mat dst2(1, 1, CV_32F, &val2);

Mat pix_roi(src, Rect(x, y, 1, 1));
Sobel(pix_roi, dst2, dst2.type(), 1, 0, 3, 1, 0, BORDER_REFLECT_101);

printf("method1 = %g, method2 = %g\n", val1, val2);

Note on the data types. As it was mentioned in BaseFilter() description, the specific filters can process data of any type, despite that Base*Filter::operator() only takes uchar pointers and no information

about the actual types. To make it all work, the following rules are used:
• in case of separable filtering FilterEngine::rowFilter is applied first. It transforms the input image data (of type srcType ) to the intermediate results stored in the internal buffers (of type bufType ). Then these intermediate results are processed as single-channel data with FilterEngine::columnFilter and stored in the output image (of type dstType ). Thus, the input type for rowFilter is srcType and the output type is bufType ; the input type for columnFilter is CV_MAT_DEPTH(bufType) and the output type is CV_MAT_DEPTH(dstType) .
• in case of non-separable filtering bufType must be the same as srcType . The source data is copied to the temporary buffer if needed and then just passed to FilterEngine::filter2D . That is, the input type for filter2D is srcType (= bufType ) and the output type is dstType .
See also: BaseColumnFilter() , BaseFilter() , BaseRowFilter() , createBoxFilter() , createDerivFilter() , createGaussianFilter() , createLinearFilter() , createMorphologyFilter() , createSeparableLinearFilter()
cv::bilateralFilter
Comments from the Wiki
void bilateralFilter(const Mat& src, Mat& dst, int d, double sigmaColor, double sigmaSpace, int borderType=BORDER_DEFAULT)
Applies bilateral filter to the image
Parameters: o src– The source 8-bit or floating-point, 1-channel or 3-channel image
o dst– The destination image; will have the same size and the same type as src
o d– The diameter of each pixel neighborhood, that is used during filtering. If it is non-positive, it's computed from sigmaSpace
o sigmaColor– Filter sigma in the color space. Larger value of the parameter means that farther colors within the pixel neighborhood (see sigmaSpace ) will be mixed together, resulting in larger areas of semi-equal color
o sigmaSpace– Filter sigma in the coordinate space. Larger value of the parameter means that farther pixels will influence each other (as long as their colors are close enough, see sigmaColor ). When d>0 , it specifies the neighborhood size regardless of sigmaSpace , otherwise d is proportional to sigmaSpace
The function applies bilateral filtering to the input image, as described in http://www.dai.ed.ac.uk/CVonline/LOCAL_COPIES/MANDUCHI1/Bilateral_Filtering.html
cv::blur
Comments from the Wiki
void blur(const Mat& src, Mat& dst, Size ksize, Point anchor=Point(-1, -1), int borderType=BORDER_DEFAULT)
Smoothes image using normalized box filter
Parameters: o src– The source image
o dst– The destination image; will have the same size and the same type as src
o ksize– The smoothing kernel size
o anchor– The anchor point. The default value Point(-1,-1) means that the anchor is at the kernel center
o borderType– The border mode used to extrapolate pixels outside of the image
The function smoothes the image using the kernel:

K = 1/(ksize.width*ksize.height) * ones(ksize.height, ksize.width)

The call blur(src, dst, ksize, anchor, borderType) is equivalent to boxFilter(src, dst, src.type(), ksize, anchor, true, borderType) .
See also: boxFilter() , bilateralFilter() , GaussianBlur() , medianBlur() .
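For example, the following sketch (not from the original text; the kernel size and file name are arbitrary) smoothes an image with a 5x5 normalized box filter:

Mat img = imread("lena.jpg"), smoothed;
// anchor at the kernel center, default border mode
blur(img, smoothed, Size(5, 5));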

cv::borderInterpolate
Comments from the Wiki
int borderInterpolate(int p, int len, int borderType)
Computes source location of extrapolated pixel
Parameters: o p– 0-based coordinate of the extrapolated pixel along one of the axes, likely <0 or >= len
o len– length of the array along the corresponding axis
o borderType– the border type, one of the BORDER_* , except for BORDER_TRANSPARENT and BORDER_ISOLATED . When borderType==BORDER_CONSTANT the function always returns -1, regardless of p and len
The function computes and returns the coordinate of the donor pixel, corresponding to the specified extrapolated pixel when using the specified extrapolation border mode. For example, if we use BORDER_WRAP mode in the horizontal direction, BORDER_REFLECT_101 in the vertical direction and want to compute value of the "virtual" pixel Point(-5, 100) in a floating-point image img , it will be

float val = img.at<float>(borderInterpolate(100, img.rows, BORDER_REFLECT_101),
                          borderInterpolate(-5, img.cols, BORDER_WRAP));

Normally, the function is not called directly; it is used inside FilterEngine() and copyMakeBorder() to compute tables for quick extrapolation.
See also: FilterEngine() , copyMakeBorder()
cv::boxFilter
Comments from the Wiki
void boxFilter(const Mat& src, Mat& dst, int ddepth, Size ksize, Point anchor=Point(-1, -1), bool normalize=true, int borderType=BORDER_DEFAULT)
Smoothes image using box filter
Parameters: o src– The source image
o dst– The destination image; will have the same size and the same type as src
o ksize– The smoothing kernel size
o anchor– The anchor point. The default value Point(-1,-1) means that the anchor is at the kernel center
o normalize– Indicates, whether the kernel is normalized by its area or not
o borderType– The border mode used to extrapolate pixels outside of the image
The function smoothes the image using the kernel:

K = alpha * ones(ksize.height, ksize.width)

where alpha = 1/(ksize.width*ksize.height) when normalize=true , and alpha = 1 otherwise.
Unnormalized box filter is useful for computing various integral characteristics over each pixel neighborhood, such as covariation matrices of image derivatives (used in dense optical flow algorithms, etc.). If you need to compute pixel sums over variable-size windows, use integral() .
See also: blur() , bilateralFilter() , GaussianBlur() , medianBlur() , integral() .
cv::buildPyramid
Comments from the Wiki
void buildPyramid(const Mat& src, vector<Mat>& dst, int maxlevel)
Constructs Gaussian pyramid for an image
Parameters: o src– The source image; check pyrDown() for the list of supported types
o dst– The destination vector of maxlevel+1 images of the same type as src ; dst[0] will be the same as src , dst[1] is the next pyramid layer, a smoothed and down-sized src etc.
o maxlevel– The 0-based index of the last (i.e. the smallest) pyramid layer; it must be non-negative
The function constructs a vector of images and builds the gaussian pyramid by recursively applying pyrDown() to the previously built pyramid layers, starting from dst[0]==src .
cv::copyMakeBorder
Comments from the Wiki
void copyMakeBorder(const Mat& src, Mat& dst, int top, int bottom, int left, int right, int borderType, const Scalar& value=Scalar())
Forms a border around the image
Parameters: o src– The source image
o dst– The destination image; will have the same type as src and the size Size(src.cols+left+right, src.rows+top+bottom)
o top, bottom, left, right– Specify how much pixels in each direction from the source image rectangle one needs to extrapolate, e.g. top=1, bottom=1, left=1, right=1 mean that 1 pixel-wide border needs to be built
o borderType– The border type; see borderInterpolate()
o value– The border value if borderType==BORDER_CONSTANT
The function copies the source image into the middle of the destination image. The areas to the left, to the right, above and below the copied source image will be filled with extrapolated pixels. This is not what FilterEngine() or based on it filtering functions do (they extrapolate pixels on-fly), but what other more complex functions, including your own, may do to simplify image boundary handling. The function supports the mode when src is already in the middle of dst . In this case the function does not copy src itself, but simply constructs the border, e.g.:

// let border be the same in all directions
int border=2;
// constructs a larger image to fit both the image and the border
Mat gray_buf(rgb.rows + border*2, rgb.cols + border*2, rgb.depth());
// select the middle part of it w/o copying data
Mat gray(gray_buf, Rect(border, border, rgb.cols, rgb.rows));
// convert image from RGB to grayscale
cvtColor(rgb, gray, CV_RGB2GRAY);
// form a border in-place
copyMakeBorder(gray, gray_buf, border, border,
               border, border, BORDER_REPLICATE);
// now do some custom filtering ...

See also: borderInterpolate()
cv::createBoxFilter
Comments from the Wiki
Ptr<FilterEngine> createBoxFilter(int srcType, int dstType, Size ksize, Point anchor=Point(-1, -1), bool normalize=true, int borderType=BORDER_DEFAULT)
Ptr<BaseRowFilter> getRowSumFilter(int srcType, int sumType, int ksize, int anchor=-1)
Ptr<BaseColumnFilter> getColumnSumFilter(int sumType, int dstType, int ksize, int anchor=-1, double scale=1)
Returns box filter engine
Parameters: o srcType– The source image type
o sumType– The intermediate horizontal sum type, must have as many channels as srcType
o dstType– The destination image type, must have as many channels as srcType
o ksize– The aperture size
o anchor– The anchor position with the kernel; negative values mean that the anchor is at the kernel center
o normalize– Whether the sums are normalized or not, see boxFilter()
o scale– Another way to specify normalization in lower-level getColumnSumFilter
o borderType– Which border type to use, see borderInterpolate()
The function is a convenience function that retrieves the horizontal sum primitive filter with getRowSumFilter() , the vertical sum filter with getColumnSumFilter() , constructs new FilterEngine() and passes both of the primitive filters there. The constructed filter engine can be used for image filtering with normalized or unnormalized box filter. The function itself is used by blur() and boxFilter() .
See also: FilterEngine() , blur() , boxFilter() .
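A small sketch (an assumption for illustration, not from the original reference) of using the returned engine directly; blur() and boxFilter() do essentially the same internally:

Mat src = imread("lena.jpg"), dst;
dst.create(src.size(), src.type());
// a normalized 5x5 box filter engine for the given image type
Ptr<FilterEngine> f = createBoxFilter(src.type(), src.type(), Size(5, 5));
f->apply(src, dst);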

cv::createDerivFilter
Comments from the Wiki
Ptr<FilterEngine> createDerivFilter(int srcType, int dstType, int dx, int dy, int ksize, int borderType=BORDER_DEFAULT)
Returns engine for computing image derivatives
Parameters: o srcType– The source image type
o dstType– The destination image type; must have as many channels as srcType
o dx– The derivative order in respect with x
o dy– The derivative order in respect with y
o ksize– The aperture size, see getDerivKernels()
o borderType– Which border type to use, see borderInterpolate()
The function createDerivFilter() is a small convenience function that retrieves linear filter coefficients for computing image derivatives using getDerivKernels() and then creates a separable linear filter with createSeparableLinearFilter() . The function is used by Sobel() and Scharr() .
See also: createSeparableLinearFilter() , getDerivKernels() , Scharr() , Sobel() .
cv::createGaussianFilter
Comments from the Wiki
Ptr<FilterEngine> createGaussianFilter(int type, Size ksize, double sigmaX, double sigmaY=0, int borderType=BORDER_DEFAULT)
Returns engine for smoothing images with a Gaussian filter
Parameters: o type– The source and the destination image type
o ksize– The aperture size, see getGaussianKernel()
o sigmaX– The Gaussian sigma in the horizontal direction, see getGaussianKernel()
o sigmaY– The Gaussian sigma in the vertical direction; if 0, then it is set to be equal to sigmaX
o borderType– Which border type to use, see borderInterpolate()
The function createGaussianFilter() computes Gaussian kernel coefficients and then returns a separable linear filter for that kernel. The function is used by GaussianBlur() . Note that while the function takes just one data type, both for input and output, you can get around this limitation by calling getGaussianKernel() and then createSeparableLinearFilter() directly.
See also: createSeparableLinearFilter() , getGaussianKernel() , GaussianBlur() .
cv::createLinearFilter
Comments from the Wiki
Ptr<FilterEngine> createLinearFilter(int srcType, int dstType, const Mat& kernel, Point _anchor=Point(-1, -1), double delta=0, int rowBorderType=BORDER_DEFAULT, int columnBorderType=-1, const Scalar& borderValue=Scalar())
Ptr<BaseFilter> getLinearFilter(int srcType, int dstType, const Mat& kernel, Point anchor=Point(-1, -1), double delta=0, int bits=0)
Creates non-separable linear filter engine
Parameters: o srcType– The source image type
o dstType– The destination image type, must have as many channels as srcType
o kernel– The 2D array of filter coefficients
o anchor– The anchor point within the kernel; the special value Point(-1,-1) means that the anchor is at the kernel center
o delta– The value added to the filtered results before storing them
o bits– When the kernel is an integer matrix representing fixed-point filter coefficients, the parameter specifies the number of the fractional bits
o rowBorderType, columnBorderType– The pixel extrapolation methods in the horizontal and the vertical directions, see borderInterpolate()
o borderValue– Used in case of constant border

The function returns pointer to 2D linear filter for the specified kernel, the source array type and the destination array type. The function is a higher-level function that calls getLinearFilter and passes the retrieved 2D filter to FilterEngine() constructor.
See also: createSeparableLinearFilter() , FilterEngine() , filter2D() .
cv::createMorphologyFilter
Comments from the Wiki
Ptr<FilterEngine> createMorphologyFilter(int op, int type, const Mat& element, Point anchor=Point(-1, -1), int rowBorderType=BORDER_CONSTANT, int columnBorderType=-1, const Scalar& borderValue=morphologyDefaultBorderValue())
Ptr<BaseFilter> getMorphologyFilter(int op, int type, const Mat& element, Point anchor=Point(-1, -1))
Ptr<BaseRowFilter> getMorphologyRowFilter(int op, int type, int esize, int anchor=-1)
Ptr<BaseColumnFilter> getMorphologyColumnFilter(int op, int type, int esize, int anchor=-1)
static inline Scalar morphologyDefaultBorderValue() { return Scalar::all(DBL_MAX); }
Creates engine for non-separable morphological operations
Parameters: o op– The morphology operation id, MORPH_ERODE or MORPH_DILATE
o type– The input/output image type
o element– The 2D 8-bit structuring element for the morphological operation. Non-zero elements indicate the pixels that belong to the element
o esize– The horizontal or vertical structuring element size for separable morphological operations
o anchor– The anchor position within the structuring element; negative values mean that the anchor is at the center
o rowBorderType, columnBorderType– The pixel extrapolation methods in the horizontal and the vertical directions, see borderInterpolate()
o borderValue– The border value in case of a constant border. The default value, morphologyDefaultBorderValue , has the special meaning: it is transformed to +inf for the erosion and to -inf for the dilation, which means that the minimum (maximum) is effectively computed only over the pixels that are inside the image.
The functions construct primitive morphological filtering operations or a filter engine based on them. Normally it's enough to use createMorphologyFilter() or even higher-level erode() , dilate() or morphologyEx() . Note, that createMorphologyFilter() analyses the structuring element shape and builds a separable morphological filter engine when the structuring element is square.
See also: erode() , dilate() , morphologyEx() , FilterEngine()

cv::createSeparableLinearFilter
Comments from the Wiki
Ptr<FilterEngine> createSeparableLinearFilter(int srcType, int dstType, const Mat& rowKernel, const Mat& columnKernel, Point anchor=Point(-1, -1), double delta=0, int rowBorderType=BORDER_DEFAULT, int columnBorderType=-1, const Scalar& borderValue=Scalar())
Ptr<BaseColumnFilter> getLinearColumnFilter(int bufType, int dstType, const Mat& columnKernel, int anchor, int symmetryType, double delta=0, int bits=0)
Ptr<BaseRowFilter> getLinearRowFilter(int srcType, int bufType, const Mat& rowKernel, int anchor, int symmetryType)
Creates engine for separable linear filter
Parameters: o srcType– The source array type
o dstType– The destination image type, must have as many channels as srcType
o bufType– The intermediate buffer type, must have as many channels as srcType
o rowKernel– The coefficients for filtering each row
o columnKernel– The coefficients for filtering each column
o anchor– The anchor position within the kernel; negative values mean that anchor is positioned at the aperture center
o delta– The value added to the filtered results before storing them
o bits– When the kernel is an integer matrix representing fixed-point filter coefficients, the parameter specifies the number of the fractional bits
o rowBorderType, columnBorderType– The pixel extrapolation methods in the horizontal and the vertical directions, see borderInterpolate()
o borderValue– Used in case of a constant border
o symmetryType– The type of each of the row and column kernel, see getKernelType()
The functions construct primitive separable linear filtering operations or a filter engine based on them. Normally it's enough to use createSeparableLinearFilter() or even higher-level sepFilter2D() . The function createSeparableLinearFilter() is smart enough to figure out the symmetryType for each of the two kernels, the intermediate bufType , and, if the filtering can be done in integer arithmetics, the number of bits to encode the filter coefficients. If it does not work for you, it's possible to call getLinearColumnFilter , getLinearRowFilter directly and then pass them to FilterEngine() constructor.
See also: sepFilter2D() , createLinearFilter() , FilterEngine() , getKernelType() .
cv::dilate
Comments from the Wiki
void dilate(const Mat& src, Mat& dst, const Mat& element, Point anchor=Point(-1, -1), int iterations=1, int borderType=BORDER_CONSTANT, const Scalar& borderValue=morphologyDefaultBorderValue())
Dilates an image by using a specific structuring element.
Parameters: o src– The source image
o dst– The destination image. It will have the same size and the same type as src
o element– The structuring element used for dilation. If element=Mat() , a 3x3 rectangular structuring element is used
o anchor– Position of the anchor within the element. The default value means that the anchor is at the element center
o iterations– The number of times dilation is applied
o borderType– The pixel extrapolation method, see borderInterpolate()
o borderValue– The border value in case of a constant border. The default value has a special meaning, see createMorphologyFilter()
The function dilates the source image using the specified structuring element that determines the shape of a pixel neighborhood over which the maximum is taken:

dst(x, y) = max over {(x', y'): element(x', y') != 0} of src(x + x', y + y')

The function supports the in-place mode. Dilation can be applied several ( iterations ) times. In the case of multi-channel images each channel is processed independently.
See also: erode() , morphologyEx() , createMorphologyFilter()
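A usage sketch (not from the original text; the element size and iteration count are arbitrary), combining dilate() with getStructuringElement() , which is described further below:

Mat src = imread("lena.jpg"), dilated;
// 3x3 rectangular structuring element, dilation applied twice
Mat element = getStructuringElement(MORPH_RECT, Size(3, 3));
dilate(src, dilated, element, Point(-1, -1), 2);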

cv::erode
Comments from the Wiki
void erode(const Mat& src, Mat& dst, const Mat& element, Point anchor=Point(-1, -1), int iterations=1, int borderType=BORDER_CONSTANT, const Scalar& borderValue=morphologyDefaultBorderValue())
Erodes an image by using a specific structuring element.
Parameters: o src– The source image
o dst– The destination image. It will have the same size and the same type as src
o element– The structuring element used for erosion. If element=Mat() , a 3x3 rectangular structuring element is used
o anchor– Position of the anchor within the element. The default value means that the anchor is at the element center
o iterations– The number of times erosion is applied
o borderType– The pixel extrapolation method, see borderInterpolate()
o borderValue– The border value in case of a constant border. The default value has a special meaning, see createMorphologyFilter()
The function erodes the source image using the specified structuring element that determines the shape of a pixel neighborhood over which the minimum is taken:

dst(x, y) = min over {(x', y'): element(x', y') != 0} of src(x + x', y + y')

The function supports the in-place mode. Erosion can be applied several ( iterations ) times. In the case of multi-channel images each channel is processed independently.
See also: dilate() , morphologyEx() , createMorphologyFilter()
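Analogously to the dilation sketch above (again an illustration, not from the original reference):

Mat src = imread("lena.jpg"), eroded;
// element=Mat() selects the default 3x3 rectangular element
erode(src, eroded, Mat(), Point(-1, -1), 2);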

cv::filter2D
Comments from the Wiki
void filter2D(const Mat& src, Mat& dst, int ddepth, const Mat& kernel, Point anchor=Point(-1, -1), double delta=0, int borderType=BORDER_DEFAULT)
Convolves an image with the kernel
Parameters: o src– The source image
o dst– The destination image. It will have the same size and the same number of channels as src
o ddepth– The desired depth of the destination image. If it is negative, it will be the same as src.depth()
o kernel– Convolution kernel (or rather a correlation kernel), a single-channel floating point matrix. If you want to apply different kernels to different channels, split the image into separate color planes using split() and process them individually
o anchor– The anchor of the kernel that indicates the relative position of a filtered point within the kernel. The anchor should lie within the kernel. The special default value (-1,-1) means that the anchor is at the kernel center
o delta– The optional value added to the filtered pixels before storing them in dst
o borderType– The pixel extrapolation method, see borderInterpolate()
The function applies an arbitrary linear filter to the image. In-place operation is supported. When the aperture is partially outside the image, the function interpolates outlier pixel values according to the specified border mode. The function actually computes correlation, not the convolution:

dst(x, y) = sum over {0 <= x' < kernel.cols, 0 <= y' < kernel.rows} of
            kernel(x', y') * src(x + x' - anchor.x, y + y' - anchor.y)

That is, the kernel is not mirrored around the anchor point. If you need a real convolution, flip the kernel using flip() and set the new anchor to (kernel.cols - anchor.x - 1, kernel.rows - anchor.y - 1) . The function uses a DFT-based algorithm in case of sufficiently large kernels (~11x11 and larger) and the direct algorithm (that uses the engine retrieved by createLinearFilter() ) for small kernels.
See also: sepFilter2D() , createLinearFilter() , dft() , matchTemplate()
cv::GaussianBlur
Comments from the Wiki
void GaussianBlur(const Mat& src, Mat& dst, Size ksize, double sigmaX, double sigmaY=0, int borderType=BORDER_DEFAULT)
Smoothes image using a Gaussian filter
Parameters: o src– The source image
o dst– The destination image; will have the same size and the same type as src
o ksize– The Gaussian kernel size; ksize.width and ksize.height can differ, but they both must be positive and odd. Or, they can be zero's, then they are computed from sigma*
o sigmaX, sigmaY– The Gaussian kernel standard deviations in X and Y direction. If sigmaY is zero, it is set to be equal to sigmaX . If they are both zeros, they are computed from ksize.width and ksize.height , respectively, see getGaussianKernel() . To fully control the result regardless of possible future modification of all this semantics, it is recommended to specify all of ksize , sigmaX and sigmaY
o borderType– The pixel extrapolation method, see borderInterpolate()
The function convolves the source image with the specified Gaussian kernel. In-place filtering is supported.
See also: sepFilter2D() , filter2D() , blur() , boxFilter() , bilateralFilter() , medianBlur()
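For instance (a sketch, not from the original text; the kernel size and sigma are chosen arbitrarily):

Mat img = imread("lena.jpg"), blurred;
// 5x5 Gaussian kernel; sigmaY=0 means it is set equal to sigmaX
GaussianBlur(img, blurred, Size(5, 5), 1.5);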

cv::getDerivKernels
Comments from the Wiki
void getDerivKernels(Mat& kx, Mat& ky, int dx, int dy, int ksize, bool normalize=false, int ktype=CV_32F)
Returns filter coefficients for computing spatial image derivatives
Parameters: o kx– The output matrix of row filter coefficients; will have type ktype
o ky– The output matrix of column filter coefficients; will have type ktype
o dx– The derivative order in respect with x
o dy– The derivative order in respect with y
o ksize– The aperture size. It can be CV_SCHARR , 1, 3, 5 or 7
o normalize– Indicates, whether to normalize (scale down) the filter coefficients or not. In theory the coefficients should have the denominator 2^(ksize*2 - dx - dy - 2) . If you are going to filter floating-point images, you will likely want to use the normalized kernels. But if you compute derivatives of an 8-bit image, store the results in a 16-bit image and wish to preserve all the fractional bits, you may want to set normalize=false
o ktype– The type of filter coefficients. It can be CV_32f or CV_64F
The function computes and returns the filter coefficients for spatial image derivatives. When ksize=CV_SCHARR , the Scharr kernels are generated, see Scharr() . Otherwise, Sobel kernels are generated, see Sobel() . The filters are normally passed to sepFilter2D() or to createSeparableLinearFilter() .
cv::getGaussianKernel
Comments from the Wiki
Mat getGaussianKernel(int ksize, double sigma, int ktype=CV_64F)
Returns Gaussian filter coefficients
Parameters: o ksize– The aperture size. It should be odd and positive
o sigma– The Gaussian standard deviation. If it is non-positive, it is computed from ksize as sigma = 0.3*(ksize/2 - 1) + 0.8
o ktype– The type of filter coefficients. It can be CV_32f or CV_64F
The function computes and returns the matrix of Gaussian filter coefficients:

G_i = alpha * exp( -(i - (ksize-1)/2)^2 / (2*sigma^2) ), i = 0 .. ksize-1,

where alpha is the scale factor chosen so that sum_i G_i = 1 .
Two of such generated kernels can be passed to sepFilter2D() or to createSeparableLinearFilter() that will automatically detect that these are smoothing kernels and handle them accordingly. Also you may use the higher-level GaussianBlur() .
See also: sepFilter2D() , createSeparableLinearFilter() , getDerivKernels() , getStructuringElement() , GaussianBlur() .
cv::getKernelType
Comments from the Wiki
int getKernelType(const Mat& kernel, Point anchor)
Returns the kernel type
Parameters: o kernel– 1D array of the kernel coefficients to analyze
o anchor– The anchor position within the kernel
The function analyzes the kernel coefficients and returns the corresponding kernel type:
• KERNEL_GENERAL Generic kernel - when there is no any type of symmetry or other properties
• KERNEL_SYMMETRICAL The kernel is symmetrical: kernel_i == kernel_{ksize-i-1} , and the anchor is at the center
• KERNEL_ASYMMETRICAL The kernel is asymmetrical: kernel_i == -kernel_{ksize-i-1} , and the anchor is at the center
• KERNEL_SMOOTH All the kernel elements are non-negative and sum to 1. E.g. the Gaussian kernel is both smooth kernel and symmetrical, so the function will return KERNEL_SMOOTH | KERNEL_SYMMETRICAL
• KERNEL_INTEGER All the kernel coefficients are integer numbers. This flag can be combined with KERNEL_SYMMETRICAL or KERNEL_ASYMMETRICAL

cv::getStructuringElement
Comments from the Wiki
Mat getStructuringElement(int shape, Size esize, Point anchor=Point(-1, -1))
Returns the structuring element of the specified size and shape for morphological operations
Parameters: o shape– The element shape, one of:
MORPH_RECT - rectangular structuring element: E(i, j) = 1
MORPH_ELLIPSE - elliptic structuring element, i.e. a filled ellipse inscribed into the rectangle Rect(0, 0, esize.width, esize.height)
MORPH_CROSS - cross-shaped structuring element: E(i, j) = 1 if i == anchor.y or j == anchor.x , 0 otherwise
o esize– Size of the structuring element
o anchor– The anchor position within the element. The default value means that the anchor is at the center. Note that only the cross-shaped element's shape depends on the anchor position; in other cases the anchor just regulates by how much the result of the morphological operation is shifted
The function constructs and returns the structuring element that can be then passed to createMorphologyFilter() , erode() , dilate() or morphologyEx() . But also you can construct an arbitrary binary mask yourself and use it as the structuring element.
cv::medianBlur
Comments from the Wiki
void medianBlur(const Mat& src, Mat& dst, int ksize)
Smoothes image using median filter
Parameters: o src– The source 1-, 3- or 4-channel image. When ksize is 3 or 5, the image depth should be CV_8U , CV_16U or CV_32F . For larger aperture sizes it can only be CV_8U
o dst– The destination array; will have the same size and the same type as src
o ksize– The aperture linear size. It must be odd and more than 1, i.e. 3, 5, 7 ...
The function smoothes image using the median filter with ksize x ksize aperture. Each channel of a multi-channel image is processed independently. In-place operation is supported.
See also: bilateralFilter() , blur() , boxFilter() , GaussianBlur()
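For example (a sketch, not from the original reference), the median filter is often used to suppress salt-and-pepper noise:

Mat noisy = imread("lena.jpg"), filtered;
// 5x5 median filter; each channel is processed independently
medianBlur(noisy, filtered, 5);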

cv::morphologyEx
Comments from the Wiki
void morphologyEx(const Mat& src, Mat& dst, int op, const Mat& element, Point anchor=Point(-1, -1), int iterations=1, int borderType=BORDER_CONSTANT, const Scalar& borderValue=morphologyDefaultBorderValue())
Performs advanced morphological transformations
Parameters: o src– Source image
o dst– Destination image. It will have the same size and the same type as src
o element– Structuring element
o op– Type of morphological operation, one of the following:
MORPH_OPEN opening
MORPH_CLOSE closing
MORPH_GRADIENT morphological gradient
MORPH_TOPHAT "top hat"
MORPH_BLACKHAT "black hat"
o iterations– Number of times erosion and dilation are applied
o borderType– The pixel extrapolation method, see borderInterpolate()
o borderValue– The border value in case of a constant border. The default value has a special meaning, see createMorphologyFilter()
The function can perform advanced morphological transformations using erosion and dilation as basic operations.
Opening: dst = open(src, element) = dilate(erode(src, element), element)
Closing: dst = close(src, element) = erode(dilate(src, element), element)
Morphological gradient: dst = morph_grad(src, element) = dilate(src, element) - erode(src, element)
"Top hat": dst = tophat(src, element) = src - open(src, element)
"Black hat": dst = blackhat(src, element) = close(src, element) - src
Any of the operations can be done in-place.
See also: dilate() , erode() , createMorphologyFilter()
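A usage sketch (not from the original text) performing morphological opening, which removes small bright speckles while keeping the overall shapes intact:

Mat src = imread("lena.jpg"), opened;
Mat element = getStructuringElement(MORPH_ELLIPSE, Size(5, 5));
// opening is erosion followed by dilation
morphologyEx(src, opened, MORPH_OPEN, element);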

cv::Laplacian
Comments from the Wiki
void Laplacian(const Mat& src, Mat& dst, int ddepth, int ksize=1, double scale=1, double delta=0, int borderType=BORDER_DEFAULT)
Calculates the Laplacian of an image
Parameters: o src– Source image
o dst– Destination image; will have the same size and the same number of channels as src
o ddepth– The desired depth of the destination image
o ksize– The aperture size used to compute the second-derivative filters, see getDerivKernels() . It must be positive and odd
o scale– The optional scale factor for the computed Laplacian values (by default, no scaling is applied, see getDerivKernels() )
o delta– The optional delta value, added to the results prior to storing them in dst
o borderType– The pixel extrapolation method, see borderInterpolate()
The function calculates the Laplacian of the source image by adding up the second x and y derivatives calculated using the Sobel operator:

dst = Laplacian(src) = d2(src)/dx2 + d2(src)/dy2

This is done when ksize > 1 . When ksize == 1 , the Laplacian is computed by filtering the image with the following 3x3 aperture:

| 0  1  0 |
| 1 -4  1 |
| 0  1  0 |

See also: Sobel() , Scharr()
cv::pyrDown
Comments from the Wiki
void pyrDown(const Mat& src, Mat& dst, const Size& dstsize=Size())
Smoothes an image and downsamples it.
Parameters: o src– The source image
o dst– The destination image. It will have the specified size and the same type as src
o dstsize– Size of the destination image. By default it is computed as Size((src.cols+1)/2, (src.rows+1)/2) . But in any case the following conditions should be satisfied:

|dstsize.width*2 - src.cols| <= 2
|dstsize.height*2 - src.rows| <= 2

The function performs the downsampling step of the Gaussian pyramid construction. First it convolves the source image with the kernel:

1/256 * | 1  4  6  4  1 |
        | 4 16 24 16  4 |
        | 6 24 36 24  6 |
        | 4 16 24 16  4 |
        | 1  4  6  4  1 |

and then downsamples the image by rejecting even rows and columns.
cv::pyrUp
Comments from the Wiki
void pyrUp(const Mat& src, Mat& dst, const Size& dstsize=Size())
Upsamples an image and then smoothes it
Parameters: o src– The source image
o dst– The destination image. It will have the specified size and the same type as src
o dstsize– Size of the destination image. By default it is computed as Size(src.cols*2, src.rows*2) . But in any case the following conditions should be satisfied:

|dstsize.width - src.cols*2| <= (dstsize.width mod 2)
|dstsize.height - src.rows*2| <= (dstsize.height mod 2)

The function performs the upsampling step of the Gaussian pyramid construction (it can actually be used to construct the Laplacian pyramid). First it upsamples the source image by injecting even zero rows and columns and then convolves the result with the same kernel as in pyrDown() , multiplied by 4.

cv::sepFilter2D
Comments from the Wiki
void sepFilter2D(const Mat& src, Mat& dst, int ddepth, const Mat& rowKernel, const Mat& columnKernel, Point anchor=Point(-1, -1), double delta=0, int borderType=BORDER_DEFAULT)
Applies separable linear filter to an image
Parameters: o src– The source image
o dst– The destination image; will have the same size and the same number of channels as src
o ddepth– The destination image depth
o rowKernel– The coefficients for filtering each row
o columnKernel– The coefficients for filtering each column
o anchor– The anchor position within the kernel. The default value means that the anchor is at the kernel center
o delta– The value added to the filtered results before storing them
o borderType– The pixel extrapolation method, see borderInterpolate()
The function applies a separable linear filter to the image. That is, first, every row of src is filtered with 1D kernel rowKernel . Then, every column of the result is filtered with 1D kernel columnKernel and the final result shifted by delta is stored in dst .
See also: createSeparableLinearFilter() , filter2D() , Sobel() , GaussianBlur() , boxFilter() , blur() .

cv::Sobel
Comments from the Wiki
void Sobel(const Mat& src, Mat& dst, int ddepth, int xorder, int yorder, int ksize=3, double scale=1, double delta=0, int borderType=BORDER_DEFAULT)
Calculates the first, second, third or mixed image derivatives using an extended Sobel operator
Parameters: o src– The source image
o dst– The destination image; will have the same size and the same number of channels as src
o ddepth– The destination image depth
o xorder– Order of the derivative x
o yorder– Order of the derivative y
o ksize– Size of the extended Sobel kernel, must be 1, 3, 5 or 7
o scale– The optional scale factor for the computed derivative values (by default, no scaling is applied, see getDerivKernels() )
o delta– The optional delta value, added to the results prior to storing them in dst
o borderType– The pixel extrapolation method, see borderInterpolate()
In all cases except 1, a separable ksize x ksize kernel will be used to calculate the derivative. When ksize = 1 , a 3x1 or 1x3 kernel will be used (i.e. no Gaussian smoothing is done). ksize = 1 can only be used for the first or the second x- or y- derivatives.
There is also the special value ksize = CV_SCHARR (-1) that corresponds to a 3x3 Scharr filter that may give more accurate results than a 3x3 Sobel. The Scharr aperture is

| -3   0   3 |
| -10  0  10 |
| -3   0   3 |

for the x-derivative or transposed for the y-derivative.
The function calculates the image derivative by convolving the image with the appropriate kernel:

dst = d^(xorder+yorder) src / (dx^xorder dy^yorder)

The Sobel operators combine Gaussian smoothing and differentiation, so the result is more or less resistant to the noise. Most often, the function is called with ( xorder = 1, yorder = 0, ksize = 3) or ( xorder = 0, yorder = 1, ksize = 3) to calculate the first x- or y- image derivative. The first case corresponds to a kernel of:

| -1  0  1 |
| -2  0  2 |
| -1  0  1 |

and the second one corresponds to a kernel of:

| -1 -2 -1 |
|  0  0  0 |
|  1  2  1 |

See also: Scharr() , Laplacian() , sepFilter2D() , filter2D() , GaussianBlur()
cv::Scharr
Comments from the Wiki
void Scharr(const Mat& src, Mat& dst, int ddepth, int xorder, int yorder, double scale=1, double delta=0, int borderType=BORDER_DEFAULT)
Calculates the first x- or y- image derivative using Scharr operator
Parameters: o src– The source image
o dst– The destination image; will have the same size and the same number of channels as src
o ddepth– The destination image depth
o xorder– Order of the derivative x
o yorder– Order of the derivative y
o scale– The optional scale factor for the computed derivative values (by default, no scaling is applied, see getDerivKernels() )
o delta– The optional delta value, added to the results prior to storing them in dst
o borderType– The pixel extrapolation method, see borderInterpolate()
The function computes the first x- or y- spatial image derivative using the Scharr operator. The call Scharr(src, dst, ddepth, xorder, yorder, scale, delta, borderType) is equivalent to Sobel(src, dst, ddepth, xorder, yorder, CV_SCHARR, scale, delta, borderType) .
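A typical usage sketch (not from the original reference) that computes both first derivatives of an 8-bit image into 16-bit arrays to avoid overflow:

Mat img = imread("lena.jpg", 0), dx, dy;
// first x- and y- derivatives using the 3x3 Sobel aperture
Sobel(img, dx, CV_16S, 1, 0, 3);
Sobel(img, dy, CV_16S, 0, 1, 3);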

Geometric Image Transformations
The functions in this section perform various geometrical transformations of 2D images. That is, they do not change the image content, but deform the pixel grid, and map this deformed grid to the destination image. In fact, to avoid sampling artifacts, the mapping is done in the reverse order, from destination to the source. That is, for each pixel (x, y) of the destination image, the functions compute coordinates of the corresponding "donor" pixel in the source image and copy the pixel value, that is:

dst(x, y) = src(f_x(x, y), f_y(x, y))

In the case when the user specifies the forward mapping, <g_x, g_y>: src -> dst , the OpenCV functions first compute the corresponding inverse mapping, <f_x, f_y>: dst -> src , and then use the above formula.
The actual implementations of the geometrical transformations, from the most generic Remap and to the simplest and the fastest Resize , need to solve the 2 main problems with the above formula:
1 extrapolation of non-existing pixels. Similarly to the filtering functions, described in the previous section, for some (x, y) one of f_x(x, y) or f_y(x, y) , or they both, may fall outside of the image, in which case some extrapolation method needs to be used. OpenCV provides the same selection of the extrapolation methods as in the filtering functions, but also an additional method BORDER_TRANSPARENT , which means that the corresponding pixels in the destination image will not be modified at all.
2 interpolation of pixel values. Usually f_x(x, y) and f_y(x, y) are floating-point numbers (i.e. <f_x, f_y> can be an affine or perspective transformation, or radial lens distortion correction etc.), so a pixel value at fractional coordinates needs to be retrieved. In the simplest case the coordinates can be just rounded to the nearest integer coordinates and the corresponding pixel used, which is called nearest-neighbor interpolation. However, a better result can be achieved by using more sophisticated interpolation methods, where a polynomial function is fit into some neighborhood of the computed pixel, and then the value of the polynomial at the computed coordinates is taken as the interpolated pixel value. In OpenCV you can choose between several interpolation methods, see Resize .
cv::convertMaps
Comments from the Wiki
void convertMaps(const Mat& map1, const Mat& map2, Mat& dstmap1, Mat& dstmap2, int dstmap1type, bool nninterpolation=false)
Converts image transformation maps from one representation to another
Parameters: o map1– The first input map of type CV_16SC2 or CV_32FC1 or CV_32FC2
o map2– The second input map of type CV_16UC1 or CV_32FC1 or none (empty matrix), respectively
o dstmap1– The first output map; will have type dstmap1type and the same size as src
o dstmap2– The second output map
o dstmap1type– The type of the first output map; should be CV_16SC2 , CV_32FC1 or CV_32FC2
o nninterpolation– Indicates whether the fixed-point maps will be used for nearest-neighbor or for more complex interpolation
The function converts a pair of maps for remap() from one representation to another.

options ( (map1.type(), map2.type()) -> (dstmap1.type(), dstmap2.type()) ) are supported:

1. (CV_32FC1, CV_32FC1) -> (CV_16SC2, CV_16UC1). This is the most frequently used conversion operation, in which the original floating-point maps (see remap()) are converted to a more compact and much faster fixed-point representation. The first output array will contain the rounded coordinates and the second array (created only when nninterpolation=false) will contain indices in the interpolation tables.

2. (CV_32FC2) -> (CV_16SC2, CV_16UC1). The same as above, but the original maps are stored in one 2-channel matrix.

3. The reverse conversion. Obviously, the reconstructed floating-point maps will not be exactly the same as the originals.

See also: remap(), undistort(), initUndistortRectifyMap()

cv::getAffineTransform
Comments from the Wiki
Mat getAffineTransform(const Point2f src[], const Point2f dst[])
Calculates the affine transform from 3 pairs of the corresponding points
Parameters:
o src – Coordinates of a triangle vertices in the source image
o dst – Coordinates of the corresponding triangle vertices in the destination image

The function calculates the 2x3 matrix of an affine transform such that:

    [x'_i; y'_i] = map_matrix * [x_i; y_i; 1]

where

    dst(i) = (x'_i, y'_i), src(i) = (x_i, y_i), i = 0, 1, 2

See also: warpAffine(), transform()

cv::getPerspectiveTransform
Comments from the Wiki
Mat getPerspectiveTransform(const Point2f src[], const Point2f dst[])
Calculates the perspective transform from 4 pairs of the corresponding points
Parameters:
o src – Coordinates of a quadrangle vertices in the source image
o dst – Coordinates of the corresponding quadrangle vertices in the destination image

The function calculates the 3x3 matrix of a perspective transform such that:

    [t_i x'_i; t_i y'_i; t_i] = map_matrix * [x_i; y_i; 1]

where

    dst(i) = (x'_i, y'_i), src(i) = (x_i, y_i), i = 0, 1, 2, 3

See also: findHomography(), warpPerspective(), perspectiveTransform()
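Here is a short sketch of how getPerspectiveTransform() is typically paired with warpPerspective() (documented later in this section) to rectify a quadrangle; the corner coordinates and file name are illustrative only:

#include "cv.h"
#include "highgui.h"

using namespace cv;

int main()
{
    Mat img = imread("building.jpg"); // illustrative input
    if( !img.data ) return -1;

    // made-up quadrangle corners in the source image ...
    Point2f srcQuad[] = { Point2f(50, 60), Point2f(400, 80),
                          Point2f(430, 300), Point2f(30, 280) };
    // ... mapped to an axis-aligned 400x240 rectangle
    Point2f dstQuad[] = { Point2f(0, 0), Point2f(400, 0),
                          Point2f(400, 240), Point2f(0, 240) };

    Mat M = getPerspectiveTransform(srcQuad, dstQuad);
    Mat rectified;
    warpPerspective(img, rectified, M, Size(400, 240));

    imshow("rectified", rectified);
    waitKey();
    return 0;
}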

cv::getRectSubPix
Comments from the Wiki
void getRectSubPix(const Mat& image, Size patchSize, Point2f center, Mat& dst, int patchType=-1)
Retrieves the pixel rectangle from an image with sub-pixel accuracy
Parameters:
o src – Source image
o patchSize – Size of the extracted patch
o center – Floating point coordinates of the extracted rectangle center within the source image. The center must be inside the image
o dst – The extracted patch; will have the size patchSize and the same number of channels as src
o patchType – The depth of the extracted pixels. By default they will have the same depth as src

The function getRectSubPix extracts pixels from src:

    dst(x, y) = src(x + center.x - (dst.cols - 1)*0.5, y + center.y - (dst.rows - 1)*0.5)

where the values of the pixels at non-integer coordinates are retrieved using bilinear interpolation. Every channel of multiple-channel images is processed independently. While the rectangle center must be inside the image, parts of the rectangle may be outside. In this case, the replication border mode (see borderInterpolate()) is used to extrapolate the pixel values outside of the image.

See also: warpAffine(), warpPerspective()

cv::getRotationMatrix2D
Comments from the Wiki
Mat getRotationMatrix2D(Point2f center, double angle, double scale)
Calculates the affine matrix of 2d rotation.
Parameters:
o center – Center of the rotation in the source image
o angle – The rotation angle in degrees. Positive values mean counter-clockwise rotation (the coordinate origin is assumed to be the top-left corner)
o scale – Isotropic scale factor

The function calculates the following matrix:

    [ alpha  beta   (1 - alpha)*center.x - beta*center.y ]
    [ -beta  alpha  beta*center.x + (1 - alpha)*center.y ]

where

    alpha = scale * cos(angle), beta = scale * sin(angle)

The transformation maps the rotation center to itself. If this is not the purpose, the shift should be adjusted.

See also: getAffineTransform(), warpAffine(), transform()

cv::invertAffineTransform
Comments from the Wiki
void invertAffineTransform(const Mat& M, Mat& iM)
Inverts an affine transformation
Parameters:
o M – The original 2x3 affine transformation
o iM – The output reverse 2x3 affine transformation

The function computes the inverse affine transformation represented by the 2x3 matrix M.
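For instance, rotating an image by 30 degrees about its center combines getRotationMatrix2D() with warpAffine(), documented below (a minimal sketch; the angle and file name are illustrative):

#include "cv.h"
#include "highgui.h"

using namespace cv;

int main()
{
    Mat img = imread("lena.jpg"); // illustrative input
    if( !img.data ) return -1;

    Point2f center(img.cols*0.5f, img.rows*0.5f);
    Mat R = getRotationMatrix2D(center, 30, 1.0); // 30 deg CCW, no scaling

    Mat rotated;
    warpAffine(img, rotated, R, img.size());

    // invertAffineTransform() recovers the matrix that maps back
    Mat Rinv;
    invertAffineTransform(R, Rinv);

    imshow("rotated", rotated);
    waitKey();
    return 0;
}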

The result will also be a matrix of the same type as M.

cv::remap
Comments from the Wiki
void remap(const Mat& src, Mat& dst, const Mat& map1, const Mat& map2, int interpolation, int borderMode=BORDER_CONSTANT, const Scalar& borderValue=Scalar())
Applies a generic geometrical transformation to an image.
Parameters:
o src – Source image
o dst – Destination image. It will have the same size as map1 and the same type as src
o map1 – The first map of either (x,y) points or just x values having type CV_16SC2, CV_32FC1 or CV_32FC2. See convertMaps() for converting floating point representation to fixed-point for speed.
o map2 – The second map of y values having type CV_16UC1, CV_32FC1 or none (empty map if map1 is (x,y) points), respectively
o interpolation – The interpolation method, see resize(). The method INTER_AREA is not supported by this function
o borderMode – The pixel extrapolation method, see borderInterpolate(). When the borderMode=BORDER_TRANSPARENT, it means that the pixels in the destination image that corresponds to the "outliers" in the source image are not modified by the function
o borderValue – A value used in the case of a constant border. By default it is 0

The function remap transforms the source image using the specified map:

    dst(x, y) = src(map_x(x, y), map_y(x, y))

Where values of pixels with non-integer coordinates are computed using one of the available interpolation methods. map_x and map_y can be encoded as separate floating-point maps in map1 and map2 respectively, or as interleaved floating-point maps of (x, y) in map1, or as fixed-point maps made by using convertMaps(). The reason you might want to convert from floating to fixed-point representations of a map is that they can yield much faster (~2x) remapping operations. In the converted case, map1 contains pairs (cvFloor(x), cvFloor(y)) and map2 contains indices in a table of interpolation coefficients.

This function can not operate in-place.
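To make the map machinery concrete, here is a small sketch that builds floating-point maps performing a horizontal flip, converts them to the fixed-point representation with convertMaps(), and applies them with remap(). The flip itself could of course be done more simply; it just keeps the maps easy to verify:

#include "cv.h"
#include "highgui.h"

using namespace cv;

int main()
{
    Mat img = imread("lena.jpg"); // illustrative input
    if( !img.data ) return -1;

    // floating-point maps: dst(x,y) = src(cols-1-x, y), i.e. a horizontal flip
    Mat mapx(img.size(), CV_32FC1), mapy(img.size(), CV_32FC1);
    for( int y = 0; y < img.rows; y++ )
        for( int x = 0; x < img.cols; x++ )
        {
            mapx.at<float>(y, x) = (float)(img.cols - 1 - x);
            mapy.at<float>(y, x) = (float)y;
        }

    // convert to the compact fixed-point representation for faster remapping
    Mat fixedMap, interpTab;
    convertMaps(mapx, mapy, fixedMap, interpTab, CV_16SC2);

    Mat flipped;
    remap(img, flipped, fixedMap, interpTab, INTER_LINEAR);

    imshow("flipped", flipped);
    waitKey();
    return 0;
}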

cv::resize
Comments from the Wiki
void resize(const Mat& src, Mat& dst, Size dsize, double fx=0, double fy=0, int interpolation=INTER_LINEAR)
Resizes an image
Parameters:
o src – Source image
o dst – Destination image. It will have size dsize (when it is non-zero) or the size computed from src.size() and fx and fy. The type of dst will be the same as of src
o dsize – The destination image size. If it is zero, then it is computed as:

    dsize = Size(round(fx*src.cols), round(fy*src.rows))

Either dsize or both fx and fy must be non-zero.
o fx – The scale factor along the horizontal axis. When 0, it is computed as (double)dsize.width/src.cols
o fy – The scale factor along the vertical axis. When 0, it is computed as (double)dsize.height/src.rows
o interpolation – The interpolation method:
  INTER_NEAREST – nearest-neighbor interpolation
  INTER_LINEAR – bilinear interpolation (used by default)
  INTER_AREA – resampling using pixel area relation. It may be the preferred method for image decimation, as it gives moire-free results. But when the image is zoomed, it is similar to the INTER_NEAREST method
  INTER_CUBIC – bicubic interpolation over 4x4 pixel neighborhood
  INTER_LANCZOS4 – Lanczos interpolation over 8x8 pixel neighborhood

The function resize resizes an image src down to or up to the specified size. Note that the initial dst type or size are not taken into account. Instead the size and type are derived from src, dsize, fx and fy. If you want to resize src so that it fits the pre-created dst, you may call the function as:

// explicitly specify dsize=dst.size(); fx and fy will be computed from that
resize(src, dst, dst.size(), 0, 0, interpolation);

If you want to decimate the image by factor of 2 in each direction, you can call the function this way:

// specify fx and fy and let the function compute the destination image size
resize(src, dst, Size(), 0.5, 0.5, interpolation);

See also: warpAffine(), warpPerspective(), remap().

cv::warpAffine
Comments from the Wiki
void warpAffine(const Mat& src, Mat& dst, const Mat& M, Size dsize, int flags=INTER_LINEAR, int borderMode=BORDER_CONSTANT, const Scalar& borderValue=Scalar())
Applies an affine transformation to an image.
Parameters:
o src – Source image
o dst – Destination image; will have size dsize and the same type as src
o M – 2x3 transformation matrix
o dsize – Size of the destination image
o flags – A combination of interpolation methods, see resize(), and the optional flag WARP_INVERSE_MAP that means that M is the inverse transformation (dst -> src)
o borderMode – The pixel extrapolation method, see borderInterpolate(). When the borderMode=BORDER_TRANSPARENT, it means that the pixels in the destination image that corresponds to the "outliers" in the source image are not modified by the function
o borderValue – A value used in case of a constant border. By default it is 0

The function warpAffine transforms the source image using the specified matrix:

    dst(x, y) = src(M_11 x + M_12 y + M_13, M_21 x + M_22 y + M_23)

when the flag WARP_INVERSE_MAP is set. Otherwise, the transformation is first inverted with invertAffineTransform() and then put in the formula above instead of M. The function can not operate in-place.

See also: warpPerspective(), resize(), remap(), getRectSubPix(), transform()

cv::warpPerspective
Comments from the Wiki
void warpPerspective(const Mat& src, Mat& dst, const Mat& M, Size dsize, int flags=INTER_LINEAR, int borderMode=BORDER_CONSTANT, const Scalar& borderValue=Scalar())
Applies a perspective transformation to an image.
Parameters:
o src – Source image
o dst – Destination image; will have size dsize and the same type as src
o M – 3x3 transformation matrix
o dsize – Size of the destination image
o flags – A combination of interpolation methods, see resize(), and the optional flag WARP_INVERSE_MAP that means that M is the inverse transformation (dst -> src)
o borderMode – The pixel extrapolation method, see borderInterpolate(). When the borderMode=BORDER_TRANSPARENT, it means that the pixels in the destination image that corresponds to the "outliers" in the source image are not modified by the function
o borderValue – A value used in case of a constant border. By default it is 0

The function warpPerspective transforms the source image using the specified matrix:

    dst(x, y) = src( (M_11 x + M_12 y + M_13)/(M_31 x + M_32 y + M_33), (M_21 x + M_22 y + M_23)/(M_31 x + M_32 y + M_33) )

when the flag WARP_INVERSE_MAP is set. Otherwise, the transformation is first inverted with invert() and then put in the formula above instead of M. The function can not operate in-place.

See also: warpAffine(), resize(), remap(), getRectSubPix(), perspectiveTransform()

Miscellaneous Image Transformations

cv::adaptiveThreshold
Comments from the Wiki
void adaptiveThreshold(const Mat& src, Mat& dst, double maxValue, int adaptiveMethod, int thresholdType, int blockSize, double C)
Applies an adaptive threshold to an array.
Parameters:
o src – Source 8-bit single-channel image
o dst – Destination image; will have the same size and the same type as src
o maxValue – The non-zero value assigned to the pixels for which the condition is satisfied. See the discussion
o adaptiveMethod – Adaptive thresholding algorithm to use: ADAPTIVE_THRESH_MEAN_C or ADAPTIVE_THRESH_GAUSSIAN_C (see the discussion)
o thresholdType – Thresholding type; must be one of THRESH_BINARY or THRESH_BINARY_INV
o blockSize – The size of a pixel neighborhood that is used to calculate a threshold value for the pixel: 3, 5, 7, and so on
o C – The constant subtracted from the mean or weighted mean (see the discussion); normally, it's positive, but may be zero or negative as well

The function transforms a grayscale image to a binary image according to the formulas:

THRESH_BINARY
    dst(x, y) = maxValue if src(x, y) > T(x, y), 0 otherwise

THRESH_BINARY_INV
    dst(x, y) = 0 if src(x, y) > T(x, y), maxValue otherwise

where T(x, y) is a threshold calculated individually for each pixel:

1. For the method ADAPTIVE_THRESH_MEAN_C the threshold value T(x, y) is the mean of a blockSize x blockSize neighborhood of (x, y), minus C.

2. For the method ADAPTIVE_THRESH_GAUSSIAN_C the threshold value T(x, y) is the weighted sum (i.e. cross-correlation with a Gaussian window) of a blockSize x blockSize neighborhood of (x, y), minus C. The default sigma (standard deviation) is used for the specified blockSize, see getGaussianKernel().

The function can process the image in-place.

See also: threshold(), blur(), GaussianBlur()
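A typical use is binarizing a document photo with uneven lighting. Here is a minimal sketch; the block size and offset are illustrative starting values, not prescriptions:

#include "cv.h"
#include "highgui.h"

using namespace cv;

int main()
{
    Mat gray = imread("page.png", 0); // illustrative input, forced grayscale
    if( !gray.data ) return -1;

    Mat bw;
    // threshold = (Gaussian-weighted mean of an 11x11 block) - 10
    adaptiveThreshold(gray, bw, 255, ADAPTIVE_THRESH_GAUSSIAN_C,
                      THRESH_BINARY, 11, 10);

    imshow("binarized", bw);
    waitKey();
    return 0;
}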

cv::cvtColor
Comments from the Wiki
void cvtColor(const Mat& src, Mat& dst, int code, int dstCn=0)
Converts image from one color space to another
Parameters:
o src – The source image, 8-bit unsigned, 16-bit unsigned (CV_16UC...) or single-precision floating-point
o dst – The destination image; will have the same size and the same depth as src
o code – The color space conversion code, see the discussion
o dstCn – The number of channels in the destination image; if the parameter is 0, the number of the channels will be derived automatically from src and the code

The function converts the input image from one color space to another. In the case of transformation to-from RGB color space the ordering of the channels should be specified explicitly (RGB or BGR).

The conventional ranges for R, G and B channel values are:
• 0 to 255 for CV_8U images
• 0 to 65535 for CV_16U images and
• 0 to 1 for CV_32F images.

Of course, in the case of linear transformations the range does not matter, but in the non-linear cases the input RGB image should be normalized to the proper value range in order to get the correct results, e.g. for the RGB to L*u*v* transformation. For example, if you have a 32-bit floating-point image directly converted from an 8-bit image without any scaling, then it will have the 0..255 value range, instead of the 0..1 range assumed by the function. So, before calling cvtColor, you need first to scale the image down:

img *= 1./255;
cvtColor(img, img, CV_BGR2Luv);

The function can do the following transformations:

• Transformations within RGB space like adding/removing the alpha channel, reversing the channel order, conversion to/from 16-bit RGB color (R5:G6:B5 or R5:G5:B5), as well as conversion to/from grayscale using:

    RGB[A] to Gray: Y = 0.299 R + 0.587 G + 0.114 B
    Gray to RGB[A]: R = Y, G = Y, B = Y, A = 0

The conversion from a RGB image to gray is done with:

cvtColor(src, bwsrc, CV_RGB2GRAY);

Some more advanced channel reordering can also be done with mixChannels().

• RGB <-> CIE XYZ.Rec 709 with D65 white point (CV_BGR2XYZ, CV_RGB2XYZ, CV_XYZ2BGR, CV_XYZ2RGB):

    [X; Y; Z] = [0.412453 0.357580 0.180423; 0.212671 0.715160 0.072169; 0.019334 0.119193 0.950227] * [R; G; B]
    [R; G; B] = [3.240479 -1.53715 -0.498535; -0.969256 1.875991 0.041556; 0.055648 -0.204043 1.057311] * [X; Y; Z]

X, Y and Z cover the whole value range (in the case of floating-point images Z may exceed 1).

• RGB <-> YCrCb JPEG (a.k.a. YCC) (CV_BGR2YCrCb, CV_RGB2YCrCb, CV_YCrCb2BGR, CV_YCrCb2RGB):

    Y = 0.299 R + 0.587 G + 0.114 B
    Cr = (R - Y) * 0.713 + delta
    Cb = (B - Y) * 0.564 + delta
    R = Y + 1.403 (Cr - delta)
    G = Y - 0.344 (Cb - delta) - 0.714 (Cr - delta)
    B = Y + 1.773 (Cb - delta)

where delta = 128 for 8-bit images, 32768 for 16-bit images and 0.5 for floating-point images. Y, Cr and Cb cover the whole value range.

• RGB <-> HSV (CV_BGR2HSV, CV_RGB2HSV, CV_HSV2BGR, CV_HSV2RGB). In the case of 8-bit and 16-bit images R, G and B are converted to floating-point format and scaled to fit the 0 to 1 range:

    V = max(R, G, B)
    S = (V - min(R, G, B))/V if V != 0, 0 otherwise
    H = 60 (G - B)/(V - min(R, G, B)) if V = R
    H = 120 + 60 (B - R)/(V - min(R, G, B)) if V = G
    H = 240 + 60 (R - G)/(V - min(R, G, B)) if V = B

If H < 0 then H = H + 360. On output 0 <= V <= 1, 0 <= S <= 1, 0 <= H <= 360. The values are then converted to the destination data type:
- 8-bit images: V = 255 V, S = 255 S, H = H/2 (to fit to 0 to 255)
- 16-bit images (currently not supported): V = 65535 V, S = 65535 S, H = H
- 32-bit images: H, S,

V are left as is

• RGB <-> HLS (CV_BGR2HLS, CV_RGB2HLS, CV_HLS2BGR, CV_HLS2RGB). In the case of 8-bit and 16-bit images R, G and B are converted to floating-point format and scaled to fit the 0 to 1 range:

    V_max = max(R, G, B), V_min = min(R, G, B)
    L = (V_max + V_min)/2
    S = (V_max - V_min)/(V_max + V_min) if L < 0.5
    S = (V_max - V_min)/(2 - (V_max + V_min)) if L >= 0.5
    H is computed as in the HSV conversion above

If H < 0 then H = H + 360. On output 0 <= L <= 1, 0 <= S <= 1, 0 <= H <= 360. The values are then converted to the destination data type:
- 8-bit images: L = 255 L, S = 255 S, H = H/2
- 16-bit images (currently not supported)
- 32-bit images: H, L, S are left as is

• RGB <-> CIE L*a*b* (CV_BGR2Lab, CV_RGB2Lab, CV_Lab2BGR, CV_Lab2RGB). In the case of 8-bit and 16-bit images R, G and B are converted to floating-point format and scaled to fit the 0 to 1 range. X, Y and Z are computed as in the RGB <-> CIE XYZ conversion above, then:

    L = 116 (Y/Yn)^(1/3) - 16 for Y/Yn > 0.008856
    L = 903.3 Y/Yn for Y/Yn <= 0.008856
    a = 500 (f(X/Xn) - f(Y/Yn)) + delta
    b = 200 (f(Y/Yn) - f(Z/Zn)) + delta

where f(t) = t^(1/3) for t > 0.008856 and f(t) = 7.787 t + 16/116 otherwise, and delta = 128 for 8-bit images and 0 for floating-point images. On output 0 <= L <= 100, -127 <= a <= 127, -127 <= b <= 127. The values are then converted to the destination data type:
- 8-bit images: L = L*255/100, a = a + 128, b = b + 128
- 16-bit images (currently not supported)
- 32-bit images: L, a, b are left as is

• RGB <-> CIE L*u*v* (CV_BGR2Luv, CV_RGB2Luv, CV_Luv2BGR, CV_Luv2RGB). In the case of 8-bit and 16-bit images R, G and B are converted to floating-point format and scaled to fit the 0 to 1 range. X, Y and Z are computed as in the RGB <-> CIE XYZ conversion above, then:

    L = 116 Y^(1/3) - 16 for Y > 0.008856, L = 903.3 Y otherwise
    u' = 4X/(X + 15Y + 3Z), v' = 9Y/(X + 15Y + 3Z)
    u = 13 L (u' - 0.19793943)
    v = 13 L (v' - 0.46831096)

On output 0 <= L <= 100, -134 <= u <= 220, -140 <= v <= 122. The values are then converted to the destination data type:
- 8-bit images: L = 255/100 L, u = 255/354 (u + 134), v = 255/256 (v + 140)
- 16-bit images (currently not supported)
- 32-bit images: L, u, v are left as is

The above formulas for converting RGB to/from various color spaces have been taken from multiple sources on the Web, primarily from the Charles Poynton site http://www.poynton.com/ColorFAQ.html

• Bayer -> RGB (CV_BayerBG2BGR, CV_BayerGB2BGR, CV_BayerRG2BGR, CV_BayerGR2BGR, CV_BayerBG2RGB, CV_BayerGB2RGB, CV_BayerRG2RGB, CV_BayerGR2RGB). The Bayer pattern is widely used in CCD and CMOS cameras. It allows one to get color pictures from a single plane where R, G and B pixels (sensors of a particular component) are interleaved like this:

    R G R G R
    G B G B G
    R G R G R
    G B G B G

The output RGB components of a pixel are interpolated from 1, 2 or 4 neighbors of the pixel having
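Here is a short sketch tying the pieces together: convert an image to grayscale and to HSV, then pull out the hue plane (the file name is illustrative):

#include "cv.h"
#include "highgui.h"

using namespace cv;

int main()
{
    Mat img = imread("fruits.jpg"); // illustrative input (BGR channel order)
    if( !img.data ) return -1;

    Mat gray, hsv;
    cvtColor(img, gray, CV_BGR2GRAY);
    cvtColor(img, hsv, CV_BGR2HSV);

    // split off the hue plane; for 8-bit images hue is stored as H/2 (0..180)
    vector<Mat> planes;
    split(hsv, planes);

    imshow("hue", planes[0]);
    waitKey();
    return 0;
}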

the same color. There are several modifications of the above pattern that can be achieved by shifting the pattern one pixel left and/or one pixel up. The two letters C1 and C2 in the conversion constants CV_BayerC1C22BGR and CV_BayerC1C22RGB indicate the particular pattern type - these are components from the second row, second and third columns, respectively. For example, the above pattern has the very popular "BG" type.

cv::distanceTransform
Comments from the Wiki
void distanceTransform(const Mat& src, Mat& dst, int distanceType, int maskSize)
void distanceTransform(const Mat& src, Mat& dst, Mat& labels, int distanceType, int maskSize)
Calculates the distance to the closest zero pixel for each pixel of the source image.
Parameters:
o src – 8-bit, single-channel (binary) source image
o dst – Output image with calculated distances; will be a 32-bit floating-point, single-channel image of the same size as src
o distanceType – Type of distance; can be CV_DIST_L1, CV_DIST_L2 or CV_DIST_C
o maskSize – Size of the distance transform mask; can be 3, 5 or CV_DIST_MASK_PRECISE (the latter option is only supported by the first of the functions). In the case of the CV_DIST_L1 or CV_DIST_C distance type the parameter is forced to 3, because a 3x3 mask gives the same result as a 5x5 or any larger aperture.
o labels – The optional output 2d array of labels - the discrete Voronoi diagram; will have type CV_32SC1 and the same size as src. See the discussion

The functions distanceTransform calculate the approximate or precise distance from every binary image pixel to the nearest zero pixel. (For zero image pixels the distance will obviously be zero.)

When maskSize == CV_DIST_MASK_PRECISE and distanceType == CV_DIST_L2, the function runs the algorithm described in Felzenszwalb04. In other cases the algorithm Borgefors86 is used: for each pixel the function finds the shortest path to the nearest zero pixel consisting of basic shifts: horizontal, vertical, diagonal or knight's move (the latest is available for a 5x5 mask). The overall distance is calculated as a sum of these basic distances. Because the distance function should be symmetric, all of the horizontal and vertical shifts must have the same cost (that is denoted as a), all the diagonal shifts must have the same cost (denoted b), and all knight's moves must have the same cost (denoted c). For the CV_DIST_C and CV_DIST_L1 types the distance is calculated precisely, whereas for CV_DIST_L2 (Euclidean distance) the distance can be calculated only with some relative error (a 5x5 mask gives more accurate results). For a, b and c OpenCV uses the values suggested in the original paper:

CV_DIST_C (3x3): a = 1, b = 1
CV_DIST_L1 (3x3): a = 1, b = 2
CV_DIST_L2 (3x3): a = 0.955, b = 1.3693
CV_DIST_L2 (5x5): a = 1, b = 1.4, c = 2.1969

Typically, for a fast, coarse distance estimation CV_DIST_L2, a 3x3 mask is used, and for a more accurate distance estimation CV_DIST_L2, a 5x5 mask or the precise algorithm is used. Note that both the precise and the approximate algorithms are linear on the number of pixels.

The second variant of the function does not only compute the minimum distance for each pixel (x, y), but it also identifies the nearest connected component consisting of zero pixels. The index of the component is stored in labels(x, y). The connected components of zero pixels are also found and marked by the function. In this mode the complexity is still linear. That is, the function provides a very fast way to compute the Voronoi diagram for the binary image. Currently, this second variant can only use the approximate distance transform algorithm.
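For instance, visualizing the distance field of a thresholded shape is a one-liner once the image is binary (a minimal sketch; the display scaling is arbitrary):

#include "cv.h"
#include "highgui.h"

using namespace cv;

int main()
{
    Mat gray = imread("shape.png", 0); // illustrative input
    if( !gray.data ) return -1;

    // non-zero pixels form the shape; distances are measured to zero pixels
    Mat bw;
    threshold(gray, bw, 128, 255, THRESH_BINARY);

    Mat dist;
    distanceTransform(bw, dist, CV_DIST_L2, 5);

    // normalize to 0..1 just so the floating-point result is viewable
    normalize(dist, dist, 0, 1, NORM_MINMAX);
    imshow("distance transform", dist);
    waitKey();
    return 0;
}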

cv::floodFill
Comments from the Wiki
int floodFill(Mat& image, Point seed, Scalar newVal, Rect* rect=0, Scalar loDiff=Scalar(), Scalar upDiff=Scalar(), int flags=4)
int floodFill(Mat& image, Mat& mask, Point seed, Scalar newVal, Rect* rect=0, Scalar loDiff=Scalar(), Scalar upDiff=Scalar(), int flags=4)
Fills a connected component with the given color.
Parameters:
o image – Input/output 1- or 3-channel, 8-bit or floating-point image. It is modified by the function unless the FLOODFILL_MASK_ONLY flag is set (in the second variant of the function; see below)
o mask – (For the second function only) Operation mask, should be a single-channel 8-bit image, 2 pixels wider and 2 pixels taller than image. The function uses and updates the mask, so the user takes responsibility of initializing the mask content. Flood-filling can't go across non-zero pixels in the mask; for example, an edge detector output can be used as a mask to stop filling at edges. It is possible to use the same mask in multiple calls to the function to make sure the filled areas do not overlap. Note: because the mask is larger than the filled image, a pixel (x, y) in image will correspond to the pixel (x+1, y+1) in the mask
o seed – The starting point
o newVal – New value of the repainted domain pixels
o loDiff – Maximal lower brightness/color difference between the currently observed pixel and one of its neighbors belonging to the component, or a seed pixel being added to the component
o upDiff – Maximal upper brightness/color difference between the currently observed pixel and one of its neighbors belonging to the component, or a seed pixel being added to the component
o rect – The optional output parameter that the function sets to the minimum bounding rectangle of the repainted domain
o flags – The operation flags. Lower bits contain connectivity value, 4 (by default) or 8. Connectivity determines which neighbors of a pixel are considered. Upper bits can be 0 or a combination of the following flags:
  FLOODFILL_FIXED_RANGE – if set, the difference between the current pixel and the seed pixel is considered, otherwise the difference between neighbor pixels is considered (i.e. the range is floating)
  FLOODFILL_MASK_ONLY – (for the second variant only) if set, the function does not change the image (newVal is ignored), but fills the mask

The functions floodFill fill a connected component starting from the seed point with the specified color. The connectivity is determined by the color/brightness closeness of the neighbor pixels. The pixel at (x, y) is considered to belong to the repainted domain if:

• grayscale image, floating range:
    src(x', y') - loDiff <= src(x, y) <= src(x', y') + upDiff

• grayscale image, fixed range:
    src(seed.x, seed.y) - loDiff <= src(x, y) <= src(seed.x, seed.y) + upDiff

• color image, floating range:
    src(x', y')_c - loDiff_c <= src(x, y)_c <= src(x', y')_c + upDiff_c for each channel c

• color image, fixed range:
    src(seed.x, seed.y)_c - loDiff_c <= src(x, y)_c <= src(seed.x, seed.y)_c + upDiff_c for each channel c

where src(x', y') is the value of one of the pixel neighbors that is already known to belong to the component.
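Here is a minimal sketch of the first variant: repaint the region around a seed point and retrieve its bounding box (the seed position, color and tolerances are illustrative):

#include "cv.h"
#include "highgui.h"

using namespace cv;

int main()
{
    Mat img = imread("fruits.jpg"); // illustrative input
    if( !img.data ) return -1;

    // repaint the component around (100,100) in red;
    // loDiff/upDiff of 20 per channel define the color closeness
    Rect box;
    int area = floodFill(img, Point(100, 100), Scalar(0, 0, 255), &box,
                         Scalar(20, 20, 20), Scalar(20, 20, 20),
                         4 | FLOODFILL_FIXED_RANGE);

    rectangle(img, box.tl(), box.br(), Scalar(255, 0, 0));
    imshow("flood fill", img);
    waitKey();
    return area > 0 ? 0 : -1;
}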

That is, to be added to the connected component, a pixel's color/brightness should be close enough to the:

• color/brightness of one of its neighbors that are already referred to the connected component in the case of floating range
• color/brightness of the seed point in the case of fixed range.

By using these functions you can either mark a connected component with the specified color in-place, or build a mask and then extract the contour, or copy the region to another image etc. Various modes of the function are demonstrated in the floodfill.c sample.

See also: findContours()

cv::inpaint
Comments from the Wiki
void inpaint(const Mat& src, const Mat& inpaintMask, Mat& dst, double inpaintRadius, int flags)
Inpaints the selected region in the image.
Parameters:
o src – The input 8-bit 1-channel or 3-channel image
o inpaintMask – The inpainting mask, 8-bit 1-channel image. Non-zero pixels indicate the area that needs to be inpainted
o dst – The output image; will have the same size and the same type as src
o inpaintRadius – The radius of a circular neighborhood of each point inpainted that is considered by the algorithm
o flags – The inpainting method, one of the following:
  INPAINT_NS – Navier-Stokes based method
  INPAINT_TELEA – The method by Alexandru Telea Telea04

The function reconstructs the selected image area from the pixels near the area boundary. The function may be used to remove dust and scratches from a scanned photo, or to remove undesirable objects from still images or video. See http://en.wikipedia.org/wiki/Inpainting for more details.

cv::integral
Comments from the Wiki
void integral(const Mat& image, Mat& sum, int sdepth=-1)
void integral(const Mat& image, Mat& sum, Mat& sqsum, int sdepth=-1)
void integral(const Mat& image, Mat& sum, Mat& sqsum, Mat& tilted, int sdepth=-1)
Calculates the integral of an image.
Parameters:
o image – The source image, W x H, 8-bit or floating-point (32f or 64f)
o sum – The integral image, (W+1) x (H+1), 32-bit integer or floating-point (32f or 64f)
o sqsum – The integral image for squared pixel values, (W+1) x (H+1), double precision floating-point (64f)

o tilted – The integral for the image rotated by 45 degrees, (W+1) x (H+1), with the same data type as sum
o sdepth – The desired depth of the integral and the tilted integral images, CV_32S, CV_32F or CV_64F

The functions integral calculate one or more integral images for the source image as following:

    sum(X, Y) = sum over x<X, y<Y of image(x, y)
    sqsum(X, Y) = sum over x<X, y<Y of image(x, y)^2
    tilted(X, Y) = sum over y<Y, |x-X+1|<=y-1 of image(x, y)

Using these integral images, one may calculate the sum, mean and standard deviation over a specific up-right or rotated rectangular region of the image in a constant time, for example:

    sum over x1<=x<x2, y1<=y<y2 of image(x, y) = sum(x2, y2) - sum(x1, y2) - sum(x2, y1) + sum(x1, y1)

It makes possible to do a fast blurring or fast block correlation with a variable window size, for example. In the case of multi-channel images, sums for each channel are accumulated independently.

As a practical example, the next figure shows the calculation of the integral of a straight rectangle Rect(3,3,3,2) and of a tilted rectangle Rect(5,1,2,3). The selected pixels in the original image are shown, as well as the relative pixels in the integral images sum and tilted.

[Figure: integral image calculation for a straight and a tilted rectangle.]
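Here is a sketch of the constant-time box sum that the formulas above enable (the rectangle coordinates are illustrative):

#include "cv.h"
#include "highgui.h"
#include <stdio.h>

using namespace cv;

int main()
{
    Mat img = imread("lena.jpg", 0); // illustrative 8-bit input
    if( !img.data ) return -1;

    Mat sum;
    integral(img, sum, CV_32S); // sum is (rows+1) x (cols+1)

    // sum of the pixels inside Rect(x, y, w, h) via 4 table lookups
    Rect r(30, 40, 50, 60);
    int s = sum.at<int>(r.y + r.height, r.x + r.width)
          - sum.at<int>(r.y,            r.x + r.width)
          - sum.at<int>(r.y + r.height, r.x)
          + sum.at<int>(r.y,            r.x);
    printf("box sum = %d, mean = %.2f\n", s, (double)s/(r.width*r.height));
    return 0;
}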

cv::threshold
Comments from the Wiki
double threshold(const Mat& src, Mat& dst, double thresh, double maxVal, int thresholdType)
Applies a fixed-level threshold to each array element
Parameters:
o src – Source array (single-channel, 8-bit or 32-bit floating point)
o dst – Destination array; will have the same size and the same type as src
o thresh – Threshold value
o maxVal – Maximum value to use with the THRESH_BINARY and THRESH_BINARY_INV thresholding types
o thresholdType – Thresholding type (see the discussion)

The function applies fixed-level thresholding to a single-channel array. The function is typically used to get a bi-level (binary) image out of a grayscale image (compare() could be also used for this purpose) or for removing a noise, i.e. filtering out pixels with too small or too large values. There are several types of thresholding that the function supports that are determined by thresholdType:

THRESH_BINARY
    dst(x, y) = maxVal if src(x, y) > thresh, 0 otherwise

THRESH_BINARY_INV
    dst(x, y) = 0 if src(x, y) > thresh, maxVal otherwise

THRESH_TRUNC
    dst(x, y) = thresh if src(x, y) > thresh, src(x, y) otherwise

THRESH_TOZERO
    dst(x, y) = src(x, y) if src(x, y) > thresh, 0 otherwise

THRESH_TOZERO_INV
    dst(x, y) = 0 if src(x, y) > thresh, src(x, y) otherwise

Also, the special value THRESH_OTSU may be combined with one of the above values. In this case the function determines the optimal threshold value using Otsu's algorithm and uses it instead of the specified thresh. Currently, Otsu's method is implemented only for 8-bit images. The function returns the computed threshold value.

See also: adaptiveThreshold(), findContours(), compare(), min(), max()
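Here is a minimal sketch of Otsu binarization (the input file name is illustrative):

#include "cv.h"
#include "highgui.h"
#include <stdio.h>

using namespace cv;

int main()
{
    Mat gray = imread("coins.png", 0); // illustrative input
    if( !gray.data ) return -1;

    Mat bw;
    // thresh=0 is ignored here: THRESH_OTSU picks the level automatically
    double t = threshold(gray, bw, 0, 255, THRESH_BINARY | THRESH_OTSU);

    printf("Otsu threshold: %.0f\n", t);
    imshow("binary", bw);
    waitKey();
    return 0;
}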

cv::watershed
Comments from the Wiki
void watershed(const Mat& image, Mat& markers)
Does marker-based image segmentation using the watershed algorithm
Parameters:
o image – The input 8-bit 3-channel image
o markers – The input/output 32-bit single-channel image (map) of markers. It should have the same size as image

The function implements one of the variants of watershed, non-parametric marker-based segmentation

algorithm, described in Meyer92. Before passing the image to the function, the user has to outline roughly the desired regions in the image markers with positive (>0) indices, i.e. every region is represented as one or more connected components with the pixel values 1, 2, 3 etc (such markers can be retrieved from a binary mask using findContours() and drawContours(), see the watershed.cpp demo). The markers will be "seeds" of the future image regions. All the other pixels in markers, whose relation to the outlined regions is not known and should be defined by the algorithm, should be set to 0's. On the output of the function, each pixel in markers is set to one of the values of the "seed" components, or to -1 at boundaries between the regions.

Note that it is not necessary that every two neighbor connected components are separated by a watershed boundary (-1's pixels), for example, in case when such tangent components exist in the initial marker image. Visual demonstration and usage example of the function can be found in the OpenCV samples directory; see the watershed.cpp demo.

See also: findContours()

cv::grabCut
Comments from the Wiki
void grabCut(const Mat& image, Mat& mask, Rect rect, Mat& bgdModel, Mat& fgdModel, int iterCount, int mode)
Runs the GrabCut algorithm
Parameters:
o image – The input 8-bit 3-channel image
o mask – The input/output 8-bit single-channel mask. Its elements may have one of four values. The mask is initialized when mode==GC_INIT_WITH_RECT:
  GC_BGD – Certainly a background pixel
  GC_FGD – Certainly a foreground (object) pixel
  GC_PR_BGD – Likely a background pixel
  GC_PR_FGD – Likely a foreground pixel
o rect – The ROI containing the segmented object. The pixels outside of the ROI are marked as "certainly a background". The parameter is only used when mode==GC_INIT_WITH_RECT
o bgdModel, fgdModel – Temporary arrays used for segmentation. Do not modify them while you are processing the same image
o iterCount – The number of iterations the algorithm should do before returning the result. Note that the result can be refined with further calls with mode==GC_INIT_WITH_MASK or mode==GC_EVAL
o mode – The operation mode:
  GC_INIT_WITH_RECT – The function initializes the state and the mask using the provided rectangle. After that it runs iterCount iterations of the algorithm
  GC_INIT_WITH_MASK – The function initializes the state using the provided mask. Note that GC_INIT_WITH_RECT and GC_INIT_WITH_MASK can be combined; then all the pixels outside of the ROI are automatically initialized with GC_BGD
  GC_EVAL – The value means that the algorithm should just resume

The function implements the GrabCut image segmentation algorithm. See the sample grabcut.cpp on how to use the function.
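Here is a minimal sketch of rectangle-initialized GrabCut; the ROI coordinates and file name are illustrative:

#include "cv.h"
#include "highgui.h"

using namespace cv;

int main()
{
    Mat img = imread("person.jpg"); // illustrative input
    if( !img.data ) return -1;

    Mat mask, bgdModel, fgdModel;
    Rect roi(50, 50, 200, 300); // rough box around the object

    grabCut(img, mask, roi, bgdModel, fgdModel, 5, GC_INIT_WITH_RECT);

    // keep the pixels labeled as certain or likely foreground
    Mat fgd = (mask == GC_FGD) | (mask == GC_PR_FGD);
    Mat result;
    img.copyTo(result, fgd);

    imshow("segmented", result);
    waitKey();
    return 0;
}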

Structural Analysis and Shape Descriptors¶

cv::moments
Comments from the Wiki
Moments moments(const Mat& array, bool binaryImage=false)
Calculates all of the moments up to the third order of a polygon or rasterized shape, where the class Moments is defined as:

class Moments
{
public:
    Moments();
    Moments(double m00, double m10, double m01, double m20, double m11,
            double m02, double m30, double m21, double m12, double m03);
    Moments( const CvMoments& moments );
    operator CvMoments() const;

    // spatial moments
    double m00, m10, m01, m20, m11, m02, m30, m21, m12, m03;
    // central moments
    double mu20, mu11, mu02, mu30, mu21, mu12, mu03;
    // central normalized moments
    double nu20, nu11, nu02, nu30, nu21, nu12, nu03;
};

Parameters:
o array – A raster image (single-channel, 8-bit or floating-point 2D array) or an array (1 x N or N x 1) of 2D points (Point or Point2f)
o binaryImage – (For images only) If it is true, then all the non-zero image pixels are treated as 1's

The function computes moments, up to the 3rd order, of a vector shape or a rasterized shape. In case of a raster image, the spatial moments are computed as:

    m_ji = sum over x,y of (array(x, y) * x^j * y^i)

the central moments are computed as:

    mu_ji = sum over x,y of (array(x, y) * (x - x_bar)^j * (y - y_bar)^i)

where (x_bar, y_bar) is the mass center:

    x_bar = m10/m00, y_bar = m01/m00

and the normalized central moments are computed as:

    nu_ji = mu_ji / m00^((i+j)/2 + 1)

Note that mu00 = m00, nu00 = 1, nu10 = mu10 = mu01 = nu01 = 0, hence the values are not stored.

The moments of a contour are defined in the same way, but computed using Green's formula (see http://en.wikipedia.org/wiki/Green_theorem ); therefore, because of a limited raster resolution, the moments computed for a contour will be slightly different from the moments computed for the same contour rasterized.

See also: contourArea(), arcLength()

cv::HuMoments
Comments from the Wiki
void HuMoments(const Moments& moments, double h[7])
Calculates the seven Hu invariants.
Parameters:
o moments – The input moments, computed with moments()
o h – The output Hu invariants

The function calculates the seven Hu invariants, see http://en.wikipedia.org/wiki/Image_moment , that are defined as:

    h[0] = eta20 + eta02
    h[1] = (eta20 - eta02)^2 + 4 eta11^2
    h[2] = (eta30 - 3 eta12)^2 + (3 eta21 - eta03)^2
    h[3] = (eta30 + eta12)^2 + (eta21 + eta03)^2
    h[4] = (eta30 - 3 eta12)(eta30 + eta12)[(eta30 + eta12)^2 - 3(eta21 + eta03)^2] + (3 eta21 - eta03)(eta21 + eta03)[3(eta30 + eta12)^2 - (eta21 + eta03)^2]
    h[5] = (eta20 - eta02)[(eta30 + eta12)^2 - (eta21 + eta03)^2] + 4 eta11 (eta30 + eta12)(eta21 + eta03)
    h[6] = (3 eta21 - eta03)(eta30 + eta12)[(eta30 + eta12)^2 - 3(eta21 + eta03)^2] - (eta30 - 3 eta12)(eta21 + eta03)[3(eta30 + eta12)^2 - (eta21 + eta03)^2]

where eta_ji stands for the normalized central moment nu_ji. These values are proved to be invariant to the image scale, rotation, and reflection except the seventh one, whose sign is changed by reflection. Of course, this invariance was proved with the assumption of infinite image resolution. In case of raster images the computed Hu invariants for the original and transformed images will be a bit different.

See also: matchShapes()

cv::findContours
Comments from the Wiki
void findContours(const Mat& image, vector<vector<Point> >& contours, vector<Vec4i>& hierarchy, int mode, int method, Point offset=Point())
void findContours(const Mat& image, vector<vector<Point> >& contours, int mode, int method, Point offset=Point())
Finds the contours in a binary image.
Parameters:
o image – The source, an 8-bit single-channel image. Non-zero pixels are treated as 1's, zero pixels remain 0's - the image is treated as binary. You can use compare(), inRange(), threshold(), adaptiveThreshold(), Canny() etc. to create a binary image out of a grayscale or color one. The function modifies the image while extracting the contours
o contours – The detected contours. Each contour is stored as a vector of points
o hierarchy – The optional output vector that will contain information about the image topology. It will have as many elements as the number of contours. For each contour contours[i], the elements hierarchy[i][0], hierarchy[i][1], hierarchy[i][2], hierarchy[i][3] will be set to 0-based indices in contours of the next and previous contours at the same hierarchical level, the first child contour and the parent contour, respectively. If for some contour i there are no next, previous, parent or nested contours, the corresponding elements of hierarchy[i] will be negative
o mode – The contour retrieval mode:
  CV_RETR_EXTERNAL – retrieves only the extreme outer contours. It will set hierarchy[i][2]=hierarchy[i][3]=-1 for all the

contours
  CV_RETR_LIST – retrieves all of the contours without establishing any hierarchical relationships
  CV_RETR_CCOMP – retrieves all of the contours and organizes them into a two-level hierarchy: on the top level are the external boundaries of the components, on the second level are the boundaries of the holes. If inside a hole of a connected component there is another contour, it will still be put on the top level
  CV_RETR_TREE – retrieves all of the contours and reconstructs the full hierarchy of nested contours. This full hierarchy is built and shown in the OpenCV contours.c demo
o method – The contour approximation method:
  CV_CHAIN_APPROX_NONE – stores absolutely all the contour points. That is, every 2 points of a contour stored with this method are 8-connected neighbors of each other
  CV_CHAIN_APPROX_SIMPLE – compresses horizontal, vertical, and diagonal segments and leaves only their end points. E.g. an up-right rectangular contour will be encoded with 4 points
  CV_CHAIN_APPROX_TC89_L1, CV_CHAIN_APPROX_TC89_KCOS – applies one of the flavors of the Teh-Chin chain approximation algorithm; see TehChin89
o offset – The optional offset, by which every contour point is shifted. This is useful if the contours are extracted from the image ROI and then they should be analyzed in the whole image context

The function retrieves contours from the binary image using the algorithm Suzuki85. The contours are a useful tool for shape analysis and object detection and recognition. See squares.c in the OpenCV sample directory. Note: the source image is modified by this function.

cv::drawContours
Comments from the Wiki
void drawContours(Mat& image, const vector<vector<Point> >& contours, int contourIdx, const Scalar& color, int thickness=1, int lineType=8, const vector<Vec4i>& hierarchy=vector<Vec4i>(), int maxLevel=INT_MAX, Point offset=Point())
Draws contours' outlines or filled contours.
Parameters:
o image – The destination image
o contours – All the input contours. Each contour is stored as a point vector
o contourIdx – Indicates the contour to draw. If it is negative, all the contours are drawn
o color – The contours' color
o thickness – Thickness of lines the contours are drawn with. If it is negative (e.g. thickness=CV_FILLED), the contour interiors are drawn
o lineType – The line connectivity; see the line() description
o hierarchy – The optional information about hierarchy. It is only needed if you want to draw only some of the contours (see maxLevel)
o maxLevel – Maximal level for drawn contours. If 0, only the specified contour is drawn. If 1, the function draws the contour(s) and all

the nested contours. If 2, the function draws the contours, all the nested contours and all the nested into nested contours etc. This parameter is only taken into account when there is hierarchy available
o offset – The optional contour shift parameter. Shift all the drawn contours by the specified offset

The function draws contour outlines in the image if thickness >= 0, or fills the area bounded by the contours if thickness < 0. Here is the example on how to retrieve connected components from the binary image and label them:

#include "cv.h"
#include "highgui.h"

using namespace cv;

int main( int argc, char** argv )
{
    Mat src;
    // the first command line parameter must be file name of binary
    // (black-n-white) image
    if( argc != 2 || !(src=imread(argv[1], 0)).data)
        return -1;

    Mat dst = Mat::zeros(src.rows, src.cols, CV_8UC3);

    src = src > 1;
    namedWindow( "Source", 1 );
    imshow( "Source", src );

    vector<vector<Point> > contours;
    vector<Vec4i> hierarchy;

    findContours( src, contours, hierarchy,
        CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE );

    // iterate through all the top-level contours,
    // draw each connected component with its own random color
    int idx = 0;
    for( ; idx >= 0; idx = hierarchy[idx][0] )
    {
        Scalar color( rand()&255, rand()&255, rand()&255 );
        drawContours( dst, contours, idx, color, CV_FILLED, 8, hierarchy );
    }

    namedWindow( "Components", 1 );
    imshow( "Components", dst );
    waitKey(0);
    return 0;
}

cv::approxPolyDP
Comments from the Wiki
void approxPolyDP(const Mat& curve, vector<Point>& approxCurve, double epsilon, bool closed)
void approxPolyDP(const Mat& curve, vector<Point2f>& approxCurve, double epsilon, bool closed)
Approximates polygonal curve(s) with the specified precision.
Parameters:
o curve – The polygon or curve to approximate. Must be a 1 x N or N x 1 matrix of type CV_32SC2 or CV_32FC2. You can also convert vector<Point> or vector<Point2f> to the matrix by calling the Mat(const vector<T>&) constructor
o approxCurve – The result of the approximation. The type should match the type of the input curve
o epsilon – Specifies the approximation accuracy. This is the maximum distance between the original curve and its approximation
o closed – If true, the approximated curve is closed (i.e. its first and last vertices are connected), otherwise it's not

The functions approxPolyDP approximate a curve or a polygon with another curve/polygon with less vertices, so that the distance between them is less or equal to the specified precision. It uses the Douglas-Peucker algorithm http://en.wikipedia.org/wiki/Ramer-Douglas-Peucker_algorithm

cv::arcLength
Comments from the Wiki
double arcLength(const Mat& curve, bool closed)
Calculates a contour perimeter or a curve length.
Parameters:
o curve – The input vector of 2D points, represented by a CV_32SC2 or CV_32FC2 matrix, or by vector<Point> or vector<Point2f> converted to a matrix with the Mat(const vector<T>&) constructor
o closed – Indicates whether the curve is closed or not

The function computes the curve length or the closed contour perimeter.

cv::boundingRect
Comments from the Wiki
Rect boundingRect(const Mat& points)
Calculates the up-right bounding rectangle of a point set.
Parameters:
o points – The input 2D point set, represented by a CV_32SC2 or CV_32FC2 matrix, or by vector<Point> or vector<Point2f> converted to the matrix using the Mat(const vector<T>&) constructor

The function calculates and returns the minimal up-right bounding rectangle for the specified point set.
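Here is a small sketch chaining these helpers: simplify the first detected contour, then measure it. Choosing epsilon as a fraction of the perimeter is a common heuristic, not a requirement:

#include "cv.h"
#include "highgui.h"
#include <stdio.h>

using namespace cv;

int main()
{
    Mat bw = imread("shape.png", 0); // illustrative binary input
    if( !bw.data ) return -1;

    vector<vector<Point> > contours;
    findContours(bw, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
    if( contours.empty() ) return -1;

    double perim = arcLength(Mat(contours[0]), true);
    vector<Point> poly;
    approxPolyDP(Mat(contours[0]), poly, 0.02*perim, true); // 2% tolerance
    Rect box = boundingRect(Mat(contours[0]));

    printf("perimeter %.1f, %d vertices after approximation, box %dx%d\n",
           perim, (int)poly.size(), box.width, box.height);
    return 0;
}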

cv::estimateRigidTransform
Comments from the Wiki
Mat estimateRigidTransform(const Mat& srcpt, const Mat& dstpt, bool fullAffine)
Computes an optimal affine transformation between two 2D point sets
Parameters:
o srcpt – The first input 2D point set
o dstpt – The second input 2D point set of the same size and the same type as srcpt
o fullAffine – If true, the function finds the optimal affine transformation with no additional restrictions (i.e. there are 6 degrees of freedom); otherwise, the class of transformations to choose from is limited to combinations of translation, rotation and uniform scaling (i.e. there are 5 degrees of freedom)

The function finds the optimal affine transform [A|b] (a 2 x 3 floating-point matrix) that approximates best the transformation from srcpt_i to dstpt_i:

    [A*|b*] = arg min over [A|b] of sum_i || dstpt_i - A srcpt_i - b ||^2

where [A|b] can be either arbitrary (when fullAffine=true) or have the form

    [ a -b t_x; b a t_y ]

when fullAffine=false.

See also: getAffineTransform(), getPerspectiveTransform(), findHomography()

cv::estimateAffine3D
Comments from the Wiki
int estimateAffine3D(const Mat& srcpt, const Mat& dstpt, Mat& out, vector<uchar>& outliers, double ransacThreshold = 3.0, double confidence = 0.99)
Computes an optimal affine transformation between two 3D point sets
Parameters:
o srcpt – The first input 3D point set
o dstpt – The second input 3D point set
o out – The output 3D affine transformation matrix
o outliers – The output vector indicating which points are outliers
o ransacThreshold – The maximum reprojection error in the RANSAC algorithm to consider a point an inlier
o confidence – The confidence level, between 0 and 1, with which the matrix is estimated

The function estimates the optimal 3D affine transformation between two 3D point sets using the RANSAC algorithm.

cv::contourArea
Comments from the Wiki
double contourArea(const Mat& contour)
Calculates the contour area
Parameters:
o contour – The contour vertices, represented by a CV_32SC2 or CV_32FC2 matrix, or by vector<Point> or vector<Point2f> converted to the matrix using the Mat(const vector<T>&) constructor

The function computes the contour area. Similarly to moments(), the area is computed using the Green formula. Thus, the returned area and the number of non-zero pixels, if you draw the contour using drawContours() or fillPoly(), can be different. Here is a short example:

vector<Point> contour;
contour.push_back(Point2f(0, 0));
contour.push_back(Point2f(10, 0));
contour.push_back(Point2f(10, 10));
contour.push_back(Point2f(5, 4));

double area0 = contourArea(contour);
vector<Point> approx;
approxPolyDP(contour, approx, 5, true);
double area1 = contourArea(approx);

cout << "area0 =" << area0 << endl <<
        "area1 =" << area1 << endl <<
        "approx poly vertices" << approx.size() << endl;

See also: moments()

cv::convexHull
Comments from the Wiki
void convexHull(const Mat& points, vector<int>& hull, bool clockwise=false)
void convexHull(const Mat& points, vector<Point>& hull, bool clockwise=false)
void convexHull(const Mat& points, vector<Point2f>& hull, bool clockwise=false)
Finds the convex hull of a point set.
Parameters:
o points – The input 2D point set, represented by a CV_32SC2 or CV_32FC2 matrix, or by vector<Point> or vector<Point2f> converted to the matrix using the Mat(const vector<T>&) constructor
o hull – The output convex hull. It is either a vector of points that form the hull (must have the same type as the input points), or a vector of 0-based point indices of the hull points in the original array (since the set of convex hull points is a subset of the original point set)
o clockwise – If true, the output convex hull will be oriented clockwise, otherwise it will be oriented counter-clockwise. Here, the usual screen coordinate system is assumed - the origin is at the top-left corner, the x axis is oriented to the right, and the y axis is oriented downwards

The functions find the convex hull of a 2D point set using Sklansky's algorithm Sklansky82 that has O(N logN) or O(N) complexity (where N is the number of input points), depending on how the initial sorting is implemented (currently it is O(N logN)). See the OpenCV sample convexhull.c that demonstrates the use of the different function variants.

cv::fitEllipse
Comments from the Wiki
RotatedRect fitEllipse(const Mat& points)
Fits an ellipse around a set of 2D points.
Parameters:
o points – The input 2D point set, represented by a CV_32SC2 or CV_32FC2 matrix, or by vector<Point> or vector<Point2f> converted to the matrix using the Mat(const vector<T>&) constructor

The function calculates the ellipse that fits best (in least-squares sense) a set of 2D points. It returns the rotated rectangle in which the ellipse is inscribed.
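Here is a sketch applying both functions to a random point cloud (the cloud, its size and the canvas size are all made up for the example):

#include "cv.h"
#include "highgui.h"

using namespace cv;

int main()
{
    // a made-up point cloud
    vector<Point> pts;
    RNG rng(12345);
    for( int i = 0; i < 100; i++ )
        pts.push_back(Point(rng.uniform(100, 300), rng.uniform(100, 300)));

    vector<Point> hull;
    convexHull(Mat(pts), hull);
    RotatedRect ell = fitEllipse(Mat(pts));

    Mat canvas = Mat::zeros(400, 400, CV_8UC3);
    for( size_t i = 0; i < hull.size(); i++ )
        line(canvas, hull[i], hull[(i+1) % hull.size()], Scalar(0,255,0));
    ellipse(canvas, ell, Scalar(0,0,255));

    imshow("hull and ellipse", canvas);
    waitKey();
    return 0;
}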

cv::fitLine
Comments from the Wiki
void fitLine(const Mat& points, Vec4f& line, int distType, double param, double reps, double aeps)
void fitLine(const Mat& points, Vec6f& line, int distType, double param, double reps, double aeps)
Fits a line to a 2D or 3D point set.
Parameters:
o points – The input 2D or 3D point set, represented by a CV_32SC2 or CV_32FC2 matrix, or by vector<Point>, vector<Point2f>, vector<Point3i> or vector<Point3f> converted to the matrix by the Mat(const vector<T>&) constructor
o line – The output line parameters. In the case of a 2d fitting, it is a vector of 4 floats (vx, vy, x0, y0) where (vx, vy) is a normalized vector collinear to the line and (x0, y0) is some point on the line. In the case of a 3D fitting it is a vector of 6 floats (vx, vy, vz, x0, y0, z0) where (vx, vy, vz) is a normalized vector collinear to the line and (x0, y0, z0) is some point on the line
o distType – The distance used by the M-estimator (see the discussion)
o param – Numerical parameter (C) for some types of distances; if 0 then some optimal value is chosen
o reps, aeps – Sufficient accuracy for the radius (distance between the coordinate origin and the line) and angle, respectively; 0.01 would be a good default value for both

The functions fitLine fit a line to a 2D or 3D point set by minimizing sum_i rho(r_i), where r_i is the distance between the i-th point and the line, and rho(r) is a distance function, one of:

distType=CV_DIST_L2
    rho(r) = r^2/2 (the simplest and the fastest least-squares method)

distType=CV_DIST_L1
    rho(r) = r

distType=CV_DIST_L12
    rho(r) = 2 (sqrt(1 + r^2/2) - 1)

distType=CV_DIST_FAIR
    rho(r) = C^2 (r/C - log(1 + r/C)), where C = 1.3998

distType=CV_DIST_WELSCH
    rho(r) = C^2/2 (1 - exp(-(r/C)^2)), where C = 2.9846

distType=CV_DIST_HUBER
    rho(r) = r^2/2 if r < C, C (r - C/2) otherwise, where C = 1.345

The algorithm is based on the M-estimator ( http://en.wikipedia.org/wiki/M-estimator ) technique, that iteratively fits the line using the weighted least-squares algorithm, and after each iteration the weights w_i are adjusted to be inversely proportional to rho(r_i).

cv::isContourConvex
Comments from the Wiki
bool isContourConvex(const Mat& contour)
Tests contour convexity.
Parameters:
o contour – The tested contour, a matrix of type CV_32SC2 or CV_32FC2, or vector<Point> or vector<Point2f> converted to the matrix using the Mat(const vector<T>&) constructor

The function tests whether the input contour is convex or not. The contour must be simple, i.e. without self-intersections, otherwise the function output is undefined.
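For instance, fitting a robust line through noisy samples (the synthetic points and the outlier are made up for the example):

#include "cv.h"
#include <stdio.h>

using namespace cv;

int main()
{
    // made-up points along y = 0.5x + 10, with one gross outlier
    vector<Point2f> pts;
    for( int x = 0; x < 100; x += 5 )
        pts.push_back(Point2f((float)x, 0.5f*x + 10));
    pts.push_back(Point2f(50, 300)); // outlier

    Vec4f line;
    // the Huber distance down-weights the outlier; 0.01 accuracy defaults
    fitLine(Mat(pts), line, CV_DIST_HUBER, 0, 0.01, 0.01);

    printf("direction (%.3f, %.3f), point (%.1f, %.1f)\n",
           line[0], line[1], line[2], line[3]);
    return 0;
}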

cv::minAreaRect
Comments from the Wiki
RotatedRect minAreaRect(const Mat& points)
Finds the minimum area rotated rectangle enclosing a 2D point set.
Parameters:
o points – The input 2D point set, represented by a CV_32SC2 or CV_32FC2 matrix, or by vector<Point> or vector<Point2f> converted to the matrix using the Mat(const vector<T>&) constructor

The function calculates and returns the minimum area bounding rectangle (possibly rotated) for the specified point set. See the OpenCV sample minarea.c

cv::minEnclosingCircle
Comments from the Wiki
void minEnclosingCircle(const Mat& points, Point2f& center, float& radius)
Finds the minimum area circle enclosing a 2D point set.
Parameters:
o points – The input 2D point set, represented by a CV_32SC2 or CV_32FC2 matrix, or by vector<Point> or vector<Point2f> converted to the matrix using the Mat(const vector<T>&) constructor
o center – The output center of the circle
o radius – The output radius of the circle

The function finds the minimal enclosing circle of a 2D point set using an iterative algorithm. See the OpenCV sample minarea.c

cv::matchShapes
Comments from the Wiki
double matchShapes(const Mat& object1, const Mat& object2, int method, double parameter=0)
Compares two shapes.
Parameters:
o object1 – The first contour or grayscale image
o object2 – The second contour or grayscale image
o method – Comparison method: CV_CONTOURS_MATCH_I1, CV_CONTOURS_MATCH_I2 or CV_CONTOURS_MATCH_I3 (see the discussion below)
o parameter – Method-specific parameter (is not used now)

The function compares two shapes. The 3 implemented methods all use Hu invariants (see HuMoments()) as following (A denotes object1, B denotes object2):

method=CV_CONTOURS_MATCH_I1
    I_1(A, B) = sum over i=1..7 of | 1/m_i^A - 1/m_i^B |

method=CV_CONTOURS_MATCH_I2
    I_2(A, B) = sum over i=1..7 of | m_i^A - m_i^B |

method=CV_CONTOURS_MATCH_I3
    I_3(A, B) = max over i=1..7 of | m_i^A - m_i^B | / | m_i^A |

where

    m_i^A = sign(h_i^A) * log h_i^A
    m_i^B = sign(h_i^B) * log h_i^B

and h_i^A, h_i^B are the Hu moments of A and B, respectively.

cv::pointPolygonTest
Comments from the Wiki
double pointPolygonTest(const Mat& contour, Point2f pt, bool measureDist)
Performs a point-in-contour test.
Parameters:
o contour – The input contour
o pt – The point tested against the contour
o measureDist – If true, the function estimates the signed distance from the point to the nearest contour edge; otherwise, the function only checks if the point is inside or not

The function determines whether the point is inside a contour, outside, or lies on an edge (or coincides with a vertex). It returns a positive (inside), negative (outside) or zero (on an edge) value, correspondingly. When measureDist=false, the return value is +1, -1 and 0, respectively. Otherwise, the return value is a signed distance between the point and the nearest contour edge.

[Figure: sample output of the function, where each image pixel is tested against the contour.]

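Here is a minimal sketch of both modes of the test (the triangle and the query point are made up for the example):

#include "cv.h"
#include <stdio.h>

using namespace cv;

int main()
{
    // a made-up triangle contour
    vector<Point2f> tri;
    tri.push_back(Point2f(0, 0));
    tri.push_back(Point2f(100, 0));
    tri.push_back(Point2f(50, 100));

    Mat contour(tri); // wrap the vector as a CV_32FC2 matrix

    // inside/outside flag only
    printf("inside?  %g\n", pointPolygonTest(contour, Point2f(50, 30), false));
    // signed distance to the nearest edge
    printf("distance %g\n", pointPolygonTest(contour, Point2f(50, 30), true));
    return 0;
}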
Motion Analysis and Object Tracking¶

cv::accumulate
Comments from the Wiki
void accumulate(const Mat& src, Mat& dst, const Mat& mask=Mat())
Adds an image to the accumulator.
Parameters:
o src – The input image, 1- or 3-channel, 8-bit or 32-bit floating point
o dst – The accumulator image with the same number of channels as the input image, 32-bit or 64-bit floating-point
o mask – Optional operation mask

The function adds src, or some of its elements, to dst:

    dst(x, y) = dst(x, y) + src(x, y) if mask(x, y) != 0

The function supports multi-channel images; each channel is processed independently.

The functions accumulate* can be used, for example, to collect statistics of the background of a scene, viewed by a still camera, for the further foreground-background segmentation.

See also: accumulateSquare(), accumulateProduct(), accumulateWeighted()

cv::accumulateSquare
Comments from the Wiki
void accumulateSquare(const Mat& src, Mat& dst, const Mat& mask=Mat())
Adds the square of the source image to the accumulator.
Parameters:
o src – The input image, 1- or 3-channel, 8-bit or 32-bit floating point
o dst – The accumulator image with the same number of channels as the input image, 32-bit or 64-bit floating-point
o mask – Optional operation mask

The function adds the input image src or its selected region, raised to power 2, to the accumulator dst:

    dst(x, y) = dst(x, y) + src(x, y)^2 if mask(x, y) != 0

The function supports multi-channel images; each channel is processed independently.

See also: accumulate(), accumulateProduct(), accumulateWeighted()

cv::accumulateProduct
Comments from the Wiki
void accumulateProduct(const Mat& src1, const Mat& src2, Mat& dst, const Mat& mask=Mat())
Adds the per-element product of two input images to the accumulator.
Parameters:
o src1 – The first input image, 1- or 3-channel, 8-bit or 32-bit floating point
o src2 – The second input image of the same type and the same size as src1
o dst – Accumulator with the same number of channels as the input images, 32-bit or 64-bit floating-point
o mask – Optional operation mask

The function adds the product of 2 images or their selected regions to the accumulator dst:

    dst(x, y) = dst(x, y) + src1(x, y) * src2(x, y) if mask(x, y) != 0

The function supports multi-channel images; each channel is processed independently.

See also: accumulate(), accumulateSquare(), accumulateWeighted()

cv::accumulateWeighted
Comments from the Wiki
void accumulateWeighted(const Mat& src, Mat& dst, double alpha, const Mat& mask=Mat())
Updates the running average.
Parameters:
o src – The input image, 1- or 3-channel, 8-bit or 32-bit floating point
o dst – The accumulator image with the same number of channels as the input image, 32-bit or 64-bit floating-point
o alpha – Weight of the input image
o mask – Optional operation mask

The function calculates the weighted sum of the input image src and the accumulator dst so that dst becomes a running average of a frame sequence:

    dst(x, y) = (1 - alpha) * dst(x, y) + alpha * src(x, y) if mask(x, y) != 0

that is, alpha regulates the update speed (how fast the accumulator "forgets" about earlier images). The function supports multi-channel images; each channel is processed independently.

See also: accumulate(), accumulateSquare(), accumulateProduct()
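Here is a sketch of the classic use: maintaining a running background estimate from a camera (the camera index and learning rate are illustrative):

#include "cv.h"
#include "highgui.h"

using namespace cv;

int main()
{
    VideoCapture cap(0); // illustrative: the default camera
    if( !cap.isOpened() ) return -1;

    Mat frame, acc;
    for(;;)
    {
        cap >> frame;
        if( !frame.data ) break;
        if( !acc.data )
            frame.convertTo(acc, CV_32F); // initialize the accumulator

        accumulateWeighted(frame, acc, 0.05); // slow-moving average

        Mat bg;
        acc.convertTo(bg, CV_8U);
        imshow("background estimate", bg);
        if( waitKey(30) >= 0 ) break;
    }
    return 0;
}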

Feature Detection¶

cv::Canny
Comments from the Wiki
void Canny(const Mat& image, Mat& edges, double threshold1, double threshold2, int apertureSize=3, bool L2gradient=false)
Finds edges in an image using the Canny algorithm.
Parameters:
o image – Single-channel 8-bit input image
o edges – The output edge map. It will have the same size and the same type as image
o threshold1 – The first threshold for the hysteresis procedure
o threshold2 – The second threshold for the hysteresis procedure
o apertureSize – Aperture size for the Sobel() operator
o L2gradient – Indicates whether the more accurate L2 norm sqrt((dI/dx)^2 + (dI/dy)^2) should be used to compute the image gradient magnitude (L2gradient=true), or whether the faster default L1 norm |dI/dx| + |dI/dy| is enough (L2gradient=false)

The function finds edges in the input image image and marks them in the output map edges using the Canny algorithm. The smallest value between threshold1 and threshold2 is used for edge linking, the largest value is used to find the initial segments of strong edges, see http://en.wikipedia.org/wiki/Canny_edge_detector

cv::cornerEigenValsAndVecs
Comments from the Wiki
void cornerEigenValsAndVecs(const Mat& src, Mat& dst, int blockSize, int apertureSize, int borderType=BORDER_DEFAULT)
Calculates eigenvalues and eigenvectors of image blocks for corner detection.
Parameters:
o src – Input single-channel 8-bit or floating-point image
o dst – Image to store the results. It will have the same size as src and the type CV_32FC(6)
o blockSize – Neighborhood size (see discussion)
o apertureSize – Aperture parameter for the Sobel() operator
o borderType – Pixel extrapolation method, see borderInterpolate()

For every pixel p, the function cornerEigenValsAndVecs considers a blockSize x blockSize neighborhood S(p). It calculates the covariation matrix of derivatives over the neighborhood as:

    M = [ sum over S(p) of (dI/dx)^2        sum over S(p) of (dI/dx)(dI/dy) ]
        [ sum over S(p) of (dI/dx)(dI/dy)   sum over S(p) of (dI/dy)^2      ]

where the derivatives are computed using the Sobel() operator. After that it finds the eigenvectors and eigenvalues of M and stores them into the destination image in the form (lambda_1, lambda_2, x_1, y_1, x_2, y_2), where

lambda_1, lambda_2 are the eigenvalues of M, not sorted
(x_1, y_1) is the eigenvector corresponding to lambda_1
(x_2, y_2) is the eigenvector corresponding to lambda_2

The output of the function can be used for robust edge or corner detection.

See also: cornerMinEigenVal(), cornerHarris(), preCornerDetect()
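Here is a minimal Canny sketch; the light pre-blur and the 1:3 low/high threshold ratio are common starting points, not prescriptions:

#include "cv.h"
#include "highgui.h"

using namespace cv;

int main()
{
    Mat img = imread("building.jpg", 0); // illustrative input
    if( !img.data ) return -1;

    Mat edges;
    // a small Gaussian blur suppresses noise before edge detection
    GaussianBlur(img, edges, Size(3, 3), 1.5);
    Canny(edges, edges, 50, 150, 3);

    imshow("edges", edges);
    waitKey();
    return 0;
}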

cv::cornerHarris
Comments from the Wiki
void cornerHarris(const Mat& src, Mat& dst, int blockSize, int apertureSize, double k, int borderType=BORDER_DEFAULT)
Harris edge detector.
Parameters:
o src – Input single-channel 8-bit or floating-point image
o dst – Image to store the Harris detector responses; will have type CV_32FC1 and the same size as src
o blockSize – Neighborhood size (see the discussion of cornerEigenValsAndVecs())
o apertureSize – Aperture parameter for the Sobel() operator
o k – Harris detector free parameter. See the formula below
o borderType – Pixel extrapolation method, see borderInterpolate()

The function runs the Harris edge detector on the image. Similarly to cornerMinEigenVal() and cornerEigenValsAndVecs(), for each pixel (x, y) it calculates a 2x2 gradient covariation matrix M(x,y) over a blockSize x blockSize neighborhood. Then, it computes the following characteristic:

    dst(x, y) = det M(x,y) - k * (tr M(x,y))^2

Corners in the image can be found as the local maxima of this response map.

cv::cornerMinEigenVal
Comments from the Wiki
void cornerMinEigenVal(const Mat& src, Mat& dst, int blockSize, int apertureSize=3, int borderType=BORDER_DEFAULT)
Calculates the minimal eigenvalue of gradient matrices for corner detection.
Parameters:
o src – Input single-channel 8-bit or floating-point image
o dst – Image to store the minimal eigenvalues; will have type CV_32FC1 and the same size as src
o blockSize – Neighborhood size (see the discussion of cornerEigenValsAndVecs())
o apertureSize – Aperture parameter for the Sobel() operator
o borderType – Pixel extrapolation method, see borderInterpolate()

The function is similar to cornerEigenValsAndVecs() but it calculates and stores only the minimal eigenvalue of the covariation matrix of derivatives, i.e. min(lambda_1, lambda_2) in terms of the formulae in the cornerEigenValsAndVecs() description.

cv::cornerSubPix
Comments from the Wiki
void cornerSubPix(const Mat& image, vector<Point2f>& corners, Size winSize, Size zeroZone, TermCriteria criteria)
Refines the corner locations.
Parameters:
o image – Input image
o corners – Initial coordinates of the input corners; refined coordinates on output
o winSize – Half of the side length of the search window. For example, if winSize=Size(5,5), then a

(5*2+1) x (5*2+1) = 11 x 11 search window would be used
o zeroZone – Half of the size of the dead region in the middle of the search zone over which the summation in the formula below is not done. It is used sometimes to avoid possible singularities of the autocorrelation matrix. The value of (-1,-1) indicates that there is no such size
o criteria – Criteria for termination of the iterative process of corner refinement. That is, the process of corner position refinement stops either after a certain number of iterations or when a required accuracy is achieved. The criteria may specify either of or both the maximum number of iterations and the required accuracy

The function iterates to find the sub-pixel accurate location of corners, or radial saddle points, as shown on the picture below.

[Figure: sub-pixel corner refinement within a search window.]

The sub-pixel accurate corner locator is based on the observation that every vector from the center q to a point p located within a neighborhood of q is orthogonal to the image gradient at p, subject to image and measurement noise. Consider the expression:

    epsilon_i = DI_{p_i}^T * (q - p_i)

where DI_{p_i} is the image gradient at one of the points p_i in a neighborhood of q. The value of q is to be found such that epsilon_i is minimized. A system of equations may be set up with epsilon_i set to zero:

    (sum_i DI_{p_i} DI_{p_i}^T) q = sum_i (DI_{p_i} DI_{p_i}^T p_i)

where the gradients are summed within a neighborhood ("search window") of q. Calling the first gradient term G and the second gradient term b gives:

    q = G^{-1} b

The algorithm sets the center of the neighborhood window at this new center q and then iterates until the center keeps within a set threshold.

cv::goodFeaturesToTrack
Comments from the Wiki
void goodFeaturesToTrack(const Mat& image, vector<Point2f>& corners, int maxCorners, double qualityLevel, double minDistance, const Mat& mask=Mat(), int blockSize=3, bool useHarrisDetector=false, double k=0.04)
Determines strong corners on an image.
Parameters:
o image – The input 8-bit or floating-point 32-bit, single-channel image
o corners – The output vector of detected corners
o maxCorners – The maximum number of corners to return. If more corners than that are found, the strongest of them will be returned
o qualityLevel – Characterizes the minimal accepted quality of image corners; the value of the parameter is multiplied by the best corner quality measure (which is the minimal eigenvalue, see cornerMinEigenVal(), or the Harris function response, see cornerHarris()). The corners whose quality measure is less than the product will be rejected. For example, if the best corner has the quality measure = 1500 and qualityLevel=0.01, then all the corners whose quality measure is less than 15 will be rejected
o minDistance – The minimum possible Euclidean distance between the returned corners
o mask – The optional region of interest. If the image is not empty (then it needs to have the type CV_8UC1 and the same size as image), it will specify the region in which the corners are detected
o blockSize – Size of the averaging block for computing the derivative covariation matrix over each pixel neighborhood, see cornerEigenValsAndVecs()
o useHarrisDetector – Indicates whether to use the Harris operator (cornerHarris()) or cornerMinEigenVal()
o k – Free parameter of the Harris detector
The function finds the most prominent corners in the image or in the specified image region, as described in Shi94:
1 the function first calculates the corner quality measure at every source image pixel using cornerMinEigenVal() or cornerHarris()
2 then it performs non-maxima suppression (only the local maxima in a 3x3 neighborhood are retained)
3 the next step rejects the corners whose quality measure is less than qualityLevel multiplied by the best (maximal) quality measure over the image
4 the remaining corners are then sorted by the quality measure in descending order
5 finally, the function throws away each corner if there is a stronger corner at a distance less than minDistance from it
The function can be used to initialize a point-based tracker of an object. Note that if the function is called with different values A and B of the parameter qualityLevel, and A > B, the vector of returned corners with qualityLevel=A will be the prefix of the output vector with qualityLevel=B.
See also: cornerMinEigenVal(), cornerHarris(), calcOpticalFlowPyrLK(), estimateRigidMotion(), PlanarObjectDetector(), OneWayDescriptor()
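A typical pattern is to detect corners with goodFeaturesToTrack() and then refine them to sub-pixel accuracy with cornerSubPix(); in this minimal sketch the image name and the parameter values (100 corners, 0.01 quality, 10-pixel distance, 30 iterations) are common illustrative choices, not prescribed ones:

Mat gray = imread("scene.jpg", 0);
vector<Point2f> corners;
goodFeaturesToTrack(gray, corners, 100, 0.01, 10);
// refine each corner within an 11x11 search window
cornerSubPix(gray, corners, Size(5,5), Size(-1,-1),
             TermCriteria(TermCriteria::MAX_ITER+TermCriteria::EPS, 30, 0.01));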

cv::HoughCircles
Comments from the Wiki
void HoughCircles(Mat& image, vector<Vec3f>& circles, int method, double dp, double minDist, double param1=100, double param2=100, int minRadius=0, int maxRadius=0)
Finds circles in a grayscale image using a Hough transform.
Parameters:
o image – The 8-bit, single-channel, grayscale input image
o circles – The output vector of found circles. Each circle is encoded as a 3-element floating-point vector (x, y, radius)
o method – Currently, the only implemented method is CV_HOUGH_GRADIENT, which is basically 21HT, described in Yuen90
o dp – The inverse ratio of the accumulator resolution to the image resolution. For example, if dp=1, the accumulator will have the same resolution as the input image; if dp=2, the accumulator will have half as big width and height, etc.
o minDist – Minimum distance between the centers of the detected circles. If the parameter is too small, multiple neighbor circles may be falsely detected in addition to a true one. If it is too large, some circles may be missed
o param1 – The first method-specific parameter. In the case of CV_HOUGH_GRADIENT it is the higher threshold of the two passed to the Canny() edge detector (the lower one will be twice smaller)
o param2 – The second method-specific parameter. In the case of CV_HOUGH_GRADIENT it is the accumulator threshold at the circle center detection stage. The smaller it is, the more false circles may be detected. Circles corresponding to the larger accumulator values will be returned first
o minRadius – Minimum circle radius
o maxRadius – Maximum circle radius
The function finds circles in a grayscale image using some modification of the Hough transform. Here is a short usage example:

#include <cv.h>
#include <highgui.h>
#include <math.h>

using namespace cv;

int main(int argc, char** argv)
{
    Mat img, gray;
    if( argc != 2 || !(img=imread(argv[1], 1)).data)
        return -1;
    cvtColor(img, gray, CV_BGR2GRAY);
    // smooth it, otherwise a lot of false circles may be detected
    GaussianBlur( gray, gray, Size(9, 9), 2, 2 );
    vector<Vec3f> circles;
    HoughCircles(gray, circles, CV_HOUGH_GRADIENT,
                 2, gray.rows/4, 200, 100 );
    for( size_t i = 0; i < circles.size(); i++ )
    {
        Point center(cvRound(circles[i][0]), cvRound(circles[i][1]));
        int radius = cvRound(circles[i][2]);
        // draw the circle center
        circle( img, center, 3, Scalar(0,255,0), -1, 8, 0 );
        // draw the circle outline
        circle( img, center, radius, Scalar(0,0,255), 3, 8, 0 );
    }
    namedWindow( "circles", 1 );
    imshow( "circles", img );
    waitKey();
    return 0;
}

Note that usually the function detects the circles' centers well, but it may fail to find the correct radii. You can assist the function by specifying the radius range (minRadius and maxRadius) if you know it, or you may ignore the returned radius, use only the center, and find the correct radius using some additional procedure.
See also: fitEllipse(), minEnclosingCircle()

cv::HoughLines
Comments from the Wiki
void HoughLines(Mat& image, vector<Vec2f>& lines, double rho, double theta, int threshold, double srn=0, double stn=0)
Finds lines in a binary image using the standard Hough transform.
Parameters:
o image – The 8-bit, single-channel, binary source image. The image may be modified by the function
o lines – The output vector of lines. Each line is represented by a two-element vector (rho, theta). rho is the distance from the coordinate origin (the top-left corner of the image) and theta is the line rotation angle in radians
o rho – Distance resolution of the accumulator in pixels
o theta – Angle resolution of the accumulator in radians
o threshold – The accumulator threshold parameter. Only those lines are returned that get enough votes ( > threshold )
o srn – For the multi-scale Hough transform it is the divisor for the distance resolution rho. The coarse accumulator distance resolution will be rho and the accurate accumulator resolution will be rho/srn. If both srn=0 and stn=0 then the classical Hough transform is used, otherwise both these parameters should be positive
o stn – For the multi-scale Hough transform it is the divisor for the angle resolution theta
The function implements the standard or standard multi-scale Hough transform algorithm for line detection. See HoughLinesP() for the code example.

cv::HoughLinesP
Comments from the Wiki
void HoughLinesP(Mat& image, vector<Vec4i>& lines, double rho, double theta, int threshold, double minLineLength=0, double maxLineGap=0)
Finds line segments in a binary image using the probabilistic Hough transform.
Parameters:
o image – The 8-bit, single-channel, binary source image. The image may be modified by the function
o lines – The output vector of lines. Each line is represented by a 4-element vector (x1, y1, x2, y2), where (x1, y1) and (x2, y2) are the ending points of each line segment detected
o rho – Distance resolution of the accumulator in pixels
o theta – Angle resolution of the accumulator in radians
o threshold – The accumulator threshold parameter. Only those lines are returned that get enough votes ( > threshold )
o minLineLength – The minimum line length. Line segments shorter than that will be rejected
o maxLineGap – The maximum allowed gap between points on the same line to link them
The function implements the probabilistic Hough transform algorithm for line detection, described in Matas00. Below is a line detection example:

/* This is a standalone program. Pass an image name as a first parameter
of the program. Switch between standard and probabilistic Hough transform
by changing "#if 1" to "#if 0" and back */
#include <cv.h>
#include <highgui.h>
#include <math.h>

using namespace cv;

int main(int argc, char** argv)
{
    Mat src, dst, color_dst;
    if( argc != 2 || !(src=imread(argv[1], 0)).data)
        return -1;

    Canny( src, dst, 50, 200, 3 );
    cvtColor( dst, color_dst, CV_GRAY2BGR );

#if 0
    vector<Vec2f> lines;
    HoughLines( dst, lines, 1, CV_PI/180, 100 );

    for( size_t i = 0; i < lines.size(); i++ )
    {
        float rho = lines[i][0];
        float theta = lines[i][1];
        double a = cos(theta), b = sin(theta);
        double x0 = a*rho, y0 = b*rho;
        Point pt1(cvRound(x0 + 1000*(-b)),
                  cvRound(y0 + 1000*(a)));
        Point pt2(cvRound(x0 - 1000*(-b)),
                  cvRound(y0 - 1000*(a)));
        line( color_dst, pt1, pt2, Scalar(0,0,255), 3, 8 );
    }
#else
    vector<Vec4i> lines;
    HoughLinesP( dst, lines, 1, CV_PI/180, 80, 30, 10 );
    for( size_t i = 0; i < lines.size(); i++ )
    {
        line( color_dst, Point(lines[i][0], lines[i][1]),
              Point(lines[i][2], lines[i][3]), Scalar(0,0,255), 3, 8 );
    }
#endif
    namedWindow( "Source", 1 );
    imshow( "Source", src );

    namedWindow( "Detected Lines", 1 );
    imshow( "Detected Lines", color_dst );

    waitKey(0);
    return 0;
}

This is the sample picture the function parameters have been tuned for, and this is the output of the above program in the case of the probabilistic Hough transform.

cv::preCornerDetect
Comments from the Wiki
void preCornerDetect(const Mat& src, Mat& dst, int apertureSize, int borderType=BORDER_DEFAULT)
Calculates the feature map for corner detection.
Parameters:
o src – The source single-channel 8-bit or floating-point image
o dst – The output image; will have type CV_32F and the same size as src
o apertureSize – Aperture size of Sobel()
o borderType – The pixel extrapolation method; see borderInterpolate()
The function calculates the complex spatial derivative-based function of the source image:

dst = (I_x)^2 I_{yy} + (I_y)^2 I_{xx} - 2 I_x I_y I_{xy}

where I_x, I_y are the first image derivatives, I_{xx}, I_{yy} are the second image derivatives and I_{xy} is the mixed derivative. The corners can be found as the local maximums of the function, as shown below:

Mat corners, dilated_corners;
preCornerDetect(image, corners, 3);
// dilation with 3x3 rectangular structuring element
dilate(corners, dilated_corners, Mat());
Mat corner_mask = corners == dilated_corners;

Object Detection
cv::matchTemplate
Comments from the Wiki
void matchTemplate(const Mat& image, const Mat& templ, Mat& result, int method)
Compares a template against overlapped image regions.
Parameters:
o image – Image where the search is running; should be 8-bit or 32-bit floating-point
o templ – Searched template; must be not greater than the source image and have the same data type
o result – A map of comparison results; will be single-channel 32-bit floating-point. If image is W x H and templ is w x h then result will be (W-w+1) x (H-h+1)
o method – Specifies the comparison method (see below)
The function slides through image, compares the overlapped patches of size w x h against templ using the specified method and stores the comparison results to result. Here are the formulas for the available comparison methods (I denotes the image, T the template, R the result). The summation is done over the template and/or the image patch: x' = 0..w-1, y' = 0..h-1:

method=CV_TM_SQDIFF
R(x,y) = \sum_{x',y'} (T(x',y') - I(x+x',y+y'))^2

method=CV_TM_SQDIFF_NORMED
R(x,y) = \sum_{x',y'} (T(x',y') - I(x+x',y+y'))^2 / \sqrt{\sum_{x',y'} T(x',y')^2 \cdot \sum_{x',y'} I(x+x',y+y')^2}

method=CV_TM_CCORR
R(x,y) = \sum_{x',y'} T(x',y') \cdot I(x+x',y+y')

method=CV_TM_CCORR_NORMED
R(x,y) = \sum_{x',y'} T(x',y') \cdot I(x+x',y+y') / \sqrt{\sum_{x',y'} T(x',y')^2 \cdot \sum_{x',y'} I(x+x',y+y')^2}

method=CV_TM_CCOEFF
R(x,y) = \sum_{x',y'} T'(x',y') \cdot I'(x+x',y+y')
where
T'(x',y') = T(x',y') - \frac{1}{w \cdot h}\sum_{x'',y''} T(x'',y'')
I'(x+x',y+y') = I(x+x',y+y') - \frac{1}{w \cdot h}\sum_{x'',y''} I(x+x'',y+y'')

method=CV_TM_CCOEFF_NORMED
R(x,y) = \sum_{x',y'} T'(x',y') \cdot I'(x+x',y+y') / \sqrt{\sum_{x',y'} T'(x',y')^2 \cdot \sum_{x',y'} I'(x+x',y+y')^2}

After the function finishes the comparison, the best matches can be found as global minimums (when CV_TM_SQDIFF was used) or maximums (when CV_TM_CCORR or CV_TM_CCOEFF was used) using the minMaxLoc() function. In the case of a color image, template summation in the numerator and each sum in the denominator is done over all of the channels (and separate mean values are used for each channel). That is, the function can take a color template and a color image; the result will still be a single-channel image, which is easier to analyze.
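For illustration, here is a minimal sketch of locating the best match with minMaxLoc(); the file names are placeholders:

Mat img = imread("scene.jpg", 1), templ = imread("logo.jpg", 1);
Mat result;
matchTemplate(img, templ, result, CV_TM_CCOEFF_NORMED);
double maxVal; Point maxLoc;
minMaxLoc(result, 0, &maxVal, 0, &maxLoc);
// the best match's top-left corner is at maxLoc;
// the matched region is Rect(maxLoc, templ.size())
rectangle(img, Rect(maxLoc, templ.size()), Scalar(0,255,0), 2);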

Feature Detection and Descriptor Extraction
Feature detection and description
cv::FAST
Comments from the Wiki
void FAST(const Mat& image, vector<KeyPoint>& keypoints, int threshold, bool nonmaxSupression=true)
Detects corners using the FAST algorithm by E. Rosten (''Machine learning for high-speed corner detection'', ECCV 2006).
Parameters:
o image – The image. Keypoints (corners) will be detected on this
o keypoints – Keypoints detected on the image
o threshold – Threshold on the difference between the intensity of the center pixel and the pixels on a circle around this pixel. See the description of the algorithm.
o nonmaxSupression – If it is true then non-maximum suppression will be applied to the detected corners (keypoints)
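A minimal sketch of running the detector (the image name and the threshold of 30 are arbitrary illustrative values):

Mat gray = imread("scene.jpg", 0);
vector<KeyPoint> keypoints;
FAST(gray, keypoints, 30, true);
// each KeyPoint carries the corner position in KeyPoint::pt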

MSER
Comments from the Wiki
MSER
Maximally-Stable Extremal Region Extractor

class MSER : public CvMSERParams
{
public:
    // default constructor
    MSER();
    // constructor that initializes all the algorithm parameters
    MSER( int _delta, int _min_area, int _max_area,
          float _max_variation, float _min_diversity,
          int _max_evolution, double _area_threshold,
          double _min_margin, int _edge_blur_size );
    // runs the extractor on the specified image; returns the MSERs,
    // each encoded as a contour (vector<Point>, see findContours)
    // the optional mask marks the area where MSERs are searched for
    void operator()( const Mat& image, vector<vector<Point> >& msers,
                     const Mat& mask ) const;
};

The class encapsulates all the parameters of the MSER extraction algorithm (see http://en.wikipedia.org/wiki/Maximally_stable_extremal_regions ).

StarDetector
Comments from the Wiki
StarDetector
Implements the Star keypoint detector

class StarDetector : CvStarDetectorParams
{
public:
    // default constructor
    StarDetector();
    // the full constructor that initializes all the algorithm parameters:
    // maxSize - maximum size of the features. The following
    //  values of the parameter are supported:
    //  4, 6, 8, 11, 12, 16, 22, 23, 32, 45, 46, 64, 90, 128
    // responseThreshold - threshold for the approximated laplacian,
    //  used to eliminate weak features. The larger it is,
    //  the less features will be retrieved
    // lineThresholdProjected - another threshold for the laplacian to
    //  eliminate edges
    // lineThresholdBinarized - another threshold for the feature
    //  size to eliminate edges.
    // The larger the 2nd threshold, the more points you get.
    StarDetector(int maxSize, int responseThreshold,
                 int lineThresholdProjected,
                 int lineThresholdBinarized,
                 int suppressNonmaxSize);

    // finds keypoints in an image
    void operator()(const Mat& image, vector<KeyPoint>& keypoints) const;
};

The class implements a modified version of the CenSurE keypoint detector described in Agrawal08.

SIFT
Comments from the Wiki
SIFT
Class for extracting keypoints and computing descriptors using the approach named Scale Invariant Feature Transform (SIFT).

class CV_EXPORTS SIFT
{
public:
    struct CommonParams
    {
        static const int DEFAULT_NOCTAVES = 4;
        static const int DEFAULT_NOCTAVE_LAYERS = 3;
        static const int DEFAULT_FIRST_OCTAVE = -1;
        enum { FIRST_ANGLE = 0, AVERAGE_ANGLE = 1 };

        CommonParams();
        CommonParams( int _nOctaves, int _nOctaveLayers,
                      int _firstOctave, int _angleMode );
        int nOctaves, nOctaveLayers, firstOctave;
        int angleMode;
    };

    struct DetectorParams
    {
        static double GET_DEFAULT_THRESHOLD()
        { return 0.04 / SIFT::CommonParams::DEFAULT_NOCTAVE_LAYERS / 2.0; }
        static double GET_DEFAULT_EDGE_THRESHOLD() { return 10.0; }

        DetectorParams();
        DetectorParams( double _threshold, double _edgeThreshold );
        double threshold, edgeThreshold;
    };

    struct DescriptorParams
    {
        static double GET_DEFAULT_MAGNIFICATION() { return 3.0; }
        static const bool DEFAULT_IS_NORMALIZE = true;
        static const int DESCRIPTOR_SIZE = 128;

        DescriptorParams();
        DescriptorParams( double _magnification, bool _isNormalize,
                          bool _recalculateAngles );
        double magnification;
        bool isNormalize;
        bool recalculateAngles;
    };

    SIFT();
    //! sift-detector constructor
    SIFT( double _threshold, double _edgeThreshold,
          int _nOctaves=CommonParams::DEFAULT_NOCTAVES,
          int _nOctaveLayers=CommonParams::DEFAULT_NOCTAVE_LAYERS,
          int _firstOctave=CommonParams::DEFAULT_FIRST_OCTAVE,
          int _angleMode=CommonParams::FIRST_ANGLE );
    //! sift-descriptor constructor
    SIFT( double _magnification, bool _isNormalize=true,
          bool _recalculateAngles = true,
          int _nOctaves=CommonParams::DEFAULT_NOCTAVES,
          int _nOctaveLayers=CommonParams::DEFAULT_NOCTAVE_LAYERS,
          int _firstOctave=CommonParams::DEFAULT_FIRST_OCTAVE,
          int _angleMode=CommonParams::FIRST_ANGLE );
    SIFT( const CommonParams& _commParams,
          const DetectorParams& _detectorParams = DetectorParams(),
          const DescriptorParams& _descriptorParams = DescriptorParams() );

    //! returns the descriptor size in floats (128)
    int descriptorSize() const { return DescriptorParams::DESCRIPTOR_SIZE; }
    //! finds the keypoints using SIFT algorithm
    void operator()(const Mat& img, const Mat& mask,
                    vector<KeyPoint>& keypoints) const;
    //! finds the keypoints and computes descriptors for them using SIFT algorithm.
    //! Optionally it can compute descriptors for the user-provided keypoints
    void operator()(const Mat& img, const Mat& mask,
                    vector<KeyPoint>& keypoints,
                    Mat& descriptors,
                    bool useProvidedKeypoints=false) const;

    CommonParams getCommonParams () const { return commParams; }
    DetectorParams getDetectorParams () const { return detectorParams; }
    DescriptorParams getDescriptorParams () const { return descriptorParams; }
protected:
    ...
};

SURF
Comments from the Wiki
SURF
Class for extracting Speeded Up Robust Features from an image.

class SURF : public CvSURFParams
{
public:
    // default constructor
    SURF();
    // constructor that initializes all the algorithm parameters
    SURF(double _hessianThreshold, int _nOctaves=4,
         int _nOctaveLayers=2, bool _extended=false);
    // returns the number of elements in each descriptor (64 or 128)
    int descriptorSize() const;
    // detects keypoints using fast multi-scale Hessian detector
    void operator()(const Mat& img, const Mat& mask,
                    vector<KeyPoint>& keypoints) const;
    // detects keypoints and computes the SURF descriptors for them;
    // output vector "descriptors" stores elements of descriptors and has size
    // equal descriptorSize()*keypoints.size() as each descriptor is
    // descriptorSize() elements of this vector.
    void operator()(const Mat& img, const Mat& mask,
                    vector<KeyPoint>& keypoints,
                    vector<float>& descriptors,
                    bool useProvidedKeypoints=false) const;
};

The class SURF implements the Speeded Up Robust Features descriptor Bay06. There is a fast multi-scale Hessian keypoint detector that can be used to find the keypoints (which is the default option), but the descriptors can also be computed for user-specified keypoints. The function can be used for object tracking and localization, image stitching, etc. See the find_obj.cpp demo in the OpenCV samples directory.
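A minimal sketch of detecting keypoints and computing SURF descriptors in one call (the image name and the Hessian threshold of 400 are illustrative values):

SURF surf(400);
Mat img = imread("scene.jpg", 0);
vector<KeyPoint> keypoints;
vector<float> descriptors;
surf(img, Mat(), keypoints, descriptors);
// descriptors.size() == keypoints.size()*surf.descriptorSize()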

RandomizedTree
Comments from the Wiki
RandomizedTree
The class contains the base structure for the RTreeClassifier

class CV_EXPORTS RandomizedTree
{
public:
    friend class RTreeClassifier;

    RandomizedTree();
    ~RandomizedTree();

    void train(std::vector<BaseKeypoint> const& base_set,
               cv::RNG &rng, int depth, int views,
               size_t reduced_num_dim, int num_quant_bits);
    void train(std::vector<BaseKeypoint> const& base_set,
               cv::RNG &rng, PatchGenerator &make_patch, int depth,
               int views, size_t reduced_num_dim, int num_quant_bits);

    // following two funcs are EXPERIMENTAL
    // (do not use unless you know exactly what you do)
    static void quantizeVector(float *vec, int dim, int N, float bnds[2],
                               int clamp_mode=0);
    static void quantizeVector(float *src, int dim, int N, float bnds[2],
                               uchar *dst);

    // patch_data must be a 32x32 array (no row padding)
    float* getPosterior(uchar* patch_data);
    const float* getPosterior(uchar* patch_data) const;
    uchar* getPosterior2(uchar* patch_data);

    void read(const char* file_name, int num_quant_bits);
    void read(std::istream &is, int num_quant_bits);
    void write(const char* file_name) const;
    void write(std::ostream &os) const;

    int classes() { return classes_; }
    int depth() { return depth_; }

    void discardFloatPosteriors() { freePosteriors(1); }

    inline void applyQuantization(int num_quant_bits)
    { makePosteriors2(num_quant_bits); }

private:
    int classes_;
    int depth_;
    int num_leaves_;
    std::vector<RTreeNode> nodes_;
    float **posteriors_;  // 16-bytes aligned posteriors
    uchar **posteriors2_; // 16-bytes aligned posteriors
    std::vector<int> leaf_counts_;

    void createNodes(int num_nodes, cv::RNG &rng);
    void allocPosteriorsAligned(int num_leaves, int num_classes);
    void freePosteriors(int which); // which: 1=posteriors_, 2=posteriors2_, 3=both
    void init(int classes, int depth, cv::RNG &rng);
    void addExample(int class_id, uchar* patch_data);
    void finalize(size_t reduced_num_dim, int num_quant_bits);
    int getIndex(uchar* patch_data) const;
    inline float* getPosteriorByIndex(int index);
    inline const float* getPosteriorByIndex(int index) const;
    inline uchar* getPosteriorByIndex2(int index);
    void convertPosteriorsToChar();
    void makePosteriors2(int num_quant_bits);
    void compressLeaves(size_t reduced_num_dim);
    void estimateQuantPercForPosteriors(float perc[2]);
};

cv::RandomizedTree::train
Comments from the Wiki
void train(std::vector<BaseKeypoint> const& base_set, cv::RNG &rng, int depth, int views, size_t reduced_num_dim, int num_quant_bits)
Trains a randomized tree using the input set of keypoints.
void train(std::vector<BaseKeypoint> const& base_set, cv::RNG &rng, PatchGenerator &make_patch, int depth, int views, size_t reduced_num_dim, int num_quant_bits)
Parameters:
o base_set – Vector of the BaseKeypoint type. Contains the keypoints from the image that are used for training
o rng – Random-number generator used for training
o make_patch – Patch generator used for training
o depth – Maximum tree depth
o reduced_num_dim – Number of dimensions used in the compressed signature
o num_quant_bits – Number of bits used for quantization

cv::RandomizedTree::read
Comments from the Wiki
read(const char* file_name, int num_quant_bits)
Reads a pre-saved randomized tree from a file or stream.
read(std::istream &is, int num_quant_bits)
Parameters:
o file_name – Name of the file that contains randomized tree data
o is – Input stream associated with the file that contains randomized tree data
o num_quant_bits – Number of bits used for quantization

cv::RandomizedTree::write
Comments from the Wiki
void write(const char* file_name) const
Writes the current randomized tree to a file or stream.
void write(std::ostream &os) const
Parameters:
o file_name – Name of the file where the randomized tree data will be stored
o os – Output stream associated with the file where the randomized tree data will be stored

cv::RandomizedTree::applyQuantization
Comments from the Wiki
void applyQuantization(int num_quant_bits)
Applies quantization to the current randomized tree.
Parameters:
o num_quant_bits – Number of bits used for quantization

RTreeNode
Comments from the Wiki
RTreeNode
The class contains the base structure for the RandomizedTree

struct RTreeNode
{
    short offset1, offset2;

    RTreeNode() {}
    RTreeNode(uchar x1, uchar y1, uchar x2, uchar y2)
      : offset1(y1*PATCH_SIZE + x1),
        offset2(y2*PATCH_SIZE + x2)
    {}

    //! Left child on 0, right child on 1
    inline bool operator() (uchar* patch_data) const
    {
        return patch_data[offset1] > patch_data[offset2];
    }
};

RTreeClassifier
Comments from the Wiki
RTreeClassifier
The class contains the RTreeClassifier. It represents the Calonder descriptor, which was originally introduced by Michael Calonder.

class CV_EXPORTS RTreeClassifier
{
public:
    static const int DEFAULT_TREES = 48;
    static const size_t DEFAULT_NUM_QUANT_BITS = 4;

    RTreeClassifier();

    void train(std::vector<BaseKeypoint> const& base_set,
               cv::RNG &rng,
               int num_trees = RTreeClassifier::DEFAULT_TREES,
               int depth = DEFAULT_DEPTH,
               int views = DEFAULT_VIEWS,
               size_t reduced_num_dim = DEFAULT_REDUCED_NUM_DIM,
               int num_quant_bits = DEFAULT_NUM_QUANT_BITS,
               bool print_status = true);
    void train(std::vector<BaseKeypoint> const& base_set,
               cv::RNG &rng,
               PatchGenerator &make_patch,
               int num_trees = RTreeClassifier::DEFAULT_TREES,
               int depth = DEFAULT_DEPTH,
               int views = DEFAULT_VIEWS,
               size_t reduced_num_dim = DEFAULT_REDUCED_NUM_DIM,
               int num_quant_bits = DEFAULT_NUM_QUANT_BITS,
               bool print_status = true);

    // sig must point to a memory block of at least
    // classes()*sizeof(float|uchar) bytes
    void getSignature(IplImage *patch, uchar *sig);
    void getSignature(IplImage *patch, float *sig);
    void getSparseSignature(IplImage *patch, float *sig, float thresh);

    static int countNonZeroElements(float *vec, int n, double tol=1e-10);
    static inline void safeSignatureAlloc(uchar **sig, int num_sig=1,
                                          int sig_len=176);
    static inline uchar* safeSignatureAlloc(int num_sig=1, int sig_len=176);

    inline int classes() { return classes_; }
    inline int original_num_classes() { return original_num_classes_; }

    void setQuantization(int num_quant_bits);
    void discardFloatPosteriors();

    void read(const char* file_name);
    void read(std::istream &is);
    void write(const char* file_name) const;
    void write(std::ostream &os) const;

    std::vector<RandomizedTree> trees_;

private:
    int classes_;
    int num_quant_bits_;
    uchar **posteriors_;
    ushort *ptemp_;
    int original_num_classes_;
    bool keep_floats_;
};

cv::RTreeClassifier::train
Comments from the Wiki
void train(std::vector<BaseKeypoint> const& base_set, cv::RNG &rng, int num_trees = RTreeClassifier::DEFAULT_TREES, int depth = DEFAULT_DEPTH, int views = DEFAULT_VIEWS, size_t reduced_num_dim = DEFAULT_REDUCED_NUM_DIM, int num_quant_bits = DEFAULT_NUM_QUANT_BITS, bool print_status = true)
Trains a randomized tree classifier using the input set of keypoints.
void train(std::vector<BaseKeypoint> const& base_set, cv::RNG &rng, PatchGenerator &make_patch, int num_trees = RTreeClassifier::DEFAULT_TREES, int depth = DEFAULT_DEPTH, int views = DEFAULT_VIEWS, size_t reduced_num_dim = DEFAULT_REDUCED_NUM_DIM, int num_quant_bits = DEFAULT_NUM_QUANT_BITS, bool print_status = true)
Parameters:
o base_set – Vector of the BaseKeypoint type. Contains the keypoints from the image that are used for training
o rng – Random-number generator used for training
o make_patch – Patch generator used for training
o num_trees – Number of randomized trees used in the RTreeClassifier
o depth – Maximum tree depth
o reduced_num_dim – Number of dimensions used in the compressed signature
o num_quant_bits – Number of bits used for quantization
o print_status – Print the current status of training on the console

cv::RTreeClassifier::getSignature
Comments from the Wiki
void getSignature(IplImage *patch, uchar *sig)
Returns a signature for an image patch.
void getSignature(IplImage *patch, float *sig)
Parameters:
o patch – Image patch to calculate the signature for
o sig – Output signature (array dimension is reduced_num_dim)

cv::RTreeClassifier::getSparseSignature
Comments from the Wiki
void getSparseSignature(IplImage *patch, float *sig, float thresh)
The function is similar to getSignature, but uses a threshold for removing all signature elements below it, so that the signature is compressed.
Parameters:
o patch – Image patch to calculate the signature for
o sig – Output signature (array dimension is reduced_num_dim)
o thresh – The threshold that is used for compressing the signature

cv::RTreeClassifier::countNonZeroElements
Comments from the Wiki
static int countNonZeroElements(float *vec, int n, double tol=1e-10)
The function returns the number of non-zero elements in the input array.
Parameters:
o vec – Input vector containing float elements
o n – Input vector size
o tol – The threshold used for counting elements. All elements smaller than tol are treated as zero elements

cv::RTreeClassifier::read
Comments from the Wiki
read(const char* file_name)
Reads a pre-saved RTreeClassifier from a file or stream.
read(std::istream &is)

Parameters:
o file_name – Name of the file that contains randomized tree data
o is – Input stream associated with the file that contains randomized tree data

cv::RTreeClassifier::write
Comments from the Wiki
void write(const char* file_name) const
Writes the current RTreeClassifier to a file or stream.
void write(std::ostream &os) const
Parameters:
o file_name – Name of the file where the randomized tree data will be stored
o os – Output stream associated with the file where the randomized tree data will be stored

cv::RTreeClassifier::setQuantization
Comments from the Wiki
void setQuantization(int num_quant_bits)
Applies quantization to the current randomized tree.
Parameters:
o num_quant_bits – Number of bits used for quantization

Below is an example of RTreeClassifier usage for feature matching. There are test and train images, and features are extracted from both with SURF. The output is the best_corr and best_corr_idx arrays, which keep the best probabilities and the corresponding feature indexes for every train feature.

// train_image, test_image and n_points are assumed to be defined elsewhere
CvMemStorage* storage = cvCreateMemStorage(0);
CvSeq *objectKeypoints = 0, *objectDescriptors = 0;
CvSeq *imageKeypoints = 0, *imageDescriptors = 0;
CvSURFParams params = cvSURFParams(500, 1);
cvExtractSURF( test_image, 0, &imageKeypoints,
               &imageDescriptors, storage, params );
cvExtractSURF( train_image, 0, &objectKeypoints,
               &objectDescriptors, storage, params );

cv::RTreeClassifier detector;
int patch_width = cv::PATCH_SIZE;
int patch_height = cv::PATCH_SIZE;
vector<cv::BaseKeypoint> base_set;
int i=0;
CvSURFPoint* point;
for (i=0; i<(n_points > 0 ? n_points : objectKeypoints->total); i++)
{
    point = (CvSURFPoint*)cvGetSeqElem(objectKeypoints, i);
    base_set.push_back(
        cv::BaseKeypoint(point->pt.x, point->pt.y, train_image));
}

//Detector training
cv::RNG rng( cvGetTickCount() );
cv::PatchGenerator gen(0, 255, 2, false, 0.7, 1.3,
                       -CV_PI/3, CV_PI/3, -CV_PI/3, CV_PI/3);

printf("RTree Classifier training...\n");
detector.train(base_set, rng, gen, 24, cv::DEFAULT_DEPTH,
               2000, (int)base_set.size(),
               detector.DEFAULT_NUM_QUANT_BITS);
printf("Done\n");

float* signature = new float[detector.original_num_classes()];
float* best_corr;
int* best_corr_idx;
if (imageKeypoints->total > 0)
{
    best_corr = new float[imageKeypoints->total];
    best_corr_idx = new int[imageKeypoints->total];
}

for(i=0; i < imageKeypoints->total; i++)
{
    point = (CvSURFPoint*)cvGetSeqElem(imageKeypoints, i);
    float prob = 0.0f;
    int part_idx = -1;

    CvRect roi = cvRect((int)(point->pt.x) - patch_width/2,
                        (int)(point->pt.y) - patch_height/2,
                        patch_width, patch_height);
    cvSetImageROI(test_image, roi);
    roi = cvGetImageROI(test_image);
    if(roi.width != patch_width || roi.height != patch_height)
    {
        best_corr_idx[i] = part_idx;
        best_corr[i] = prob;
    }
    else
    {
        cvSetImageROI(test_image, roi);
        IplImage* roi_image =
            cvCreateImage(cvSize(roi.width, roi.height),
                          test_image->depth, test_image->nChannels);
        cvCopy(test_image, roi_image);

        detector.getSignature(roi_image, signature);
        for (int j = 0; j < detector.original_num_classes(); j++)
        {
            if (prob < signature[j])
            {
                part_idx = j;
                prob = signature[j];
            }
        }
        best_corr_idx[i] = part_idx;
        best_corr[i] = prob;

        if (roi_image)
            cvReleaseImage(&roi_image);
    }
    cvResetImageROI(test_image);
}

KeyPoint
Comments from the Wiki
KeyPoint
Data structure for salient point detectors.

class KeyPoint
{
public:
    // the default constructor
    KeyPoint() : pt(0,0), size(0), angle(-1), response(0),
                 octave(0), class_id(-1) {}
    // the full constructor
    KeyPoint(Point2f _pt, float _size, float _angle=-1,
             float _response=0, int _octave=0, int _class_id=-1)
      : pt(_pt), size(_size), angle(_angle), response(_response),
        octave(_octave), class_id(_class_id) {}
    // another form of the full constructor
    KeyPoint(float x, float y, float _size, float _angle=-1,
             float _response=0, int _octave=0, int _class_id=-1)
      : pt(x, y), size(_size), angle(_angle), response(_response),
        octave(_octave), class_id(_class_id) {}

    // converts vector of keypoints to vector of points
    static void convert(const std::vector<KeyPoint>& keypoints,
                        std::vector<Point2f>& points2f,
                        const std::vector<int>& keypointIndexes=std::vector<int>());
    // converts vector of points to the vector of keypoints, where each
    // keypoint is assigned the same size and the same orientation
    static void convert(const std::vector<Point2f>& points2f,
                        std::vector<KeyPoint>& keypoints,
                        float size=1, float response=1,
                        int octave=0, int class_id=-1);

    // computes overlap for pair of keypoints;
    // overlap is a ratio between area of keypoint regions intersection and
    // area of keypoint regions union (now keypoint region is circle)
    static float overlap(const KeyPoint& kp1, const KeyPoint& kp2);

    Point2f pt;     // coordinates of the keypoints
    float size;     // diameter of the meaningful keypoint neighborhood
    float angle;    // computed orientation of the keypoint (-1 if not applicable)
    float response; // the response by which the most strong keypoints
                    // have been selected. Can be used for further sorting
                    // or subsampling
    int octave;     // octave (pyramid layer) from which the keypoint has been extracted
    int class_id;   // object class (if the keypoints need to be clustered by
                    // an object they belong to)
};

// reads vector of keypoints from the specified file storage node
void read(const FileNode& node, CV_OUT vector<KeyPoint>& keypoints);
// writes vector of keypoints to the file storage
void write(FileStorage& fs, const string& name,
           const vector<KeyPoint>& keypoints);

Common Interfaces of Feature Detectors
Feature detectors in OpenCV have wrappers with a common interface that enables switching easily between different algorithms solving the same problem. All objects that implement keypoint detectors inherit the FeatureDetector() interface.

FeatureDetector
Comments from the Wiki
FeatureDetector
Abstract base class for 2D image feature detectors.

class CV_EXPORTS FeatureDetector
{
public:
    virtual ~FeatureDetector();

    void detect( const Mat& image, vector<KeyPoint>& keypoints,
                 const Mat& mask=Mat() ) const;
    void detect( const vector<Mat>& images,
                 vector<vector<KeyPoint> >& keypoints,
                 const vector<Mat>& masks=vector<Mat>() ) const;

    virtual void read(const FileNode&);
    virtual void write(FileStorage&) const;

    static Ptr<FeatureDetector> create( const string& detectorType );

protected:
    ...
};

cv::FeatureDetector::detect
Comments from the Wiki
void FeatureDetector::detect(const Mat& image, vector<KeyPoint>& keypoints, const Mat& mask=Mat()) const
Detect keypoints in an image (first variant) or image set (second variant).
Parameters:
o image – The image.
o keypoints – The detected keypoints.
o mask – Mask specifying where to look for keypoints (optional). Must be a char matrix with non-zero values in the region of interest.
void FeatureDetector::detect(const vector<Mat>& images, vector<vector<KeyPoint> >& keypoints, const vector<Mat>& masks=vector<Mat>()) const

Parameters:
o images – Image set.
o keypoints – Collection of keypoints detected in the input images. keypoints[i] is the set of keypoints detected in images[i].
o masks – Masks for each input image specifying where to look for keypoints (optional). masks[i] is the mask for images[i]. Each element of the masks vector must be a char matrix with non-zero values in the region of interest.

cv::FeatureDetector::read
Comments from the Wiki
void FeatureDetector::read(const FileNode& fn)
Read feature detector object from file node.
Parameters:
o fn – File node from which the detector will be read.

cv::FeatureDetector::write
Comments from the Wiki
void FeatureDetector::write(FileStorage& fs) const
Write feature detector object to file storage.
Parameters:
o fs – File storage in which the detector will be written.

cv::FeatureDetector::create
Comments from the Wiki
Ptr<FeatureDetector> FeatureDetector::create(const string& detectorType)
Feature detector factory that creates a FeatureDetector() of the given type with default parameters.
Parameters:
o detectorType – Feature detector type. The following types are supported: "FAST" (FastFeatureDetector()), "STAR" (StarFeatureDetector()), "SIFT" (SiftFeatureDetector()), "SURF" (SurfFeatureDetector()), "MSER" (MserFeatureDetector()), "GFTT" (GfttFeatureDetector()), "HARRIS" (HarrisFeatureDetector()). A combined format is also supported: an adapter name ("Grid" for GridAdaptedFeatureDetector(), "Pyramid" for PyramidAdaptedFeatureDetector()) followed by a detector name, e.g. "GridFAST" or "PyramidSTAR".

FastFeatureDetector
Comments from the Wiki
FastFeatureDetector
Wrapping class for feature detection using the FAST() method.

class FastFeatureDetector : public FeatureDetector
{
public:
    FastFeatureDetector( int threshold=1, bool nonmaxSuppression=true );
    virtual void read( const FileNode& fn );
    virtual void write( FileStorage& fs ) const;
protected:
    ...
};
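For illustration, here is a minimal sketch of the common detector interface, using the factory together with a wrapped detector; the type string and image name are placeholders:

Ptr<FeatureDetector> detector = FeatureDetector::create("FAST");
Mat img = imread("scene.jpg", 0);
vector<KeyPoint> keypoints;
detector->detect(img, keypoints);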

GoodFeaturesToTrackDetector
Comments from the Wiki
GoodFeaturesToTrackDetector
Wrapping class for feature detection using the goodFeaturesToTrack() function.

class GoodFeaturesToTrackDetector : public FeatureDetector
{
public:
    class Params
    {
    public:
        Params( int maxCorners=1000, double qualityLevel=0.01,
                double minDistance=1., int blockSize=3,
                bool useHarrisDetector=false, double k=0.04 );
        void read( const FileNode& fn );
        void write( FileStorage& fs ) const;

        int maxCorners;
        double qualityLevel;
        double minDistance;
        int blockSize;
        bool useHarrisDetector;
        double k;
    };

    GoodFeaturesToTrackDetector( const GoodFeaturesToTrackDetector::Params& params=
                                 GoodFeaturesToTrackDetector::Params() );
    GoodFeaturesToTrackDetector( int maxCorners, double qualityLevel,
                                 double minDistance, int blockSize=3,
                                 bool useHarrisDetector=false, double k=0.04 );
    virtual void read( const FileNode& fn );
    virtual void write( FileStorage& fs ) const;
protected:
    ...
};

MserFeatureDetector
Comments from the Wiki
MserFeatureDetector
Wrapping class for feature detection using the MSER() class.

class MserFeatureDetector : public FeatureDetector
{
public:
    MserFeatureDetector( CvMSERParams params=cvMSERParams() );
    MserFeatureDetector( int delta, int minArea, int maxArea,
                         double maxVariation, double minDiversity,
                         int maxEvolution, double areaThreshold,
                         double minMargin, int edgeBlurSize );
    virtual void read( const FileNode& fn );
    virtual void write( FileStorage& fs ) const;
protected:
    ...
};

SiftFeatureDetector
Comments from the Wiki
SiftFeatureDetector
Wrapping class for feature detection using the SIFT() class.

class SiftFeatureDetector : public FeatureDetector
{
public:
    SiftFeatureDetector(
        const SIFT::DetectorParams& detectorParams=SIFT::DetectorParams(),
        const SIFT::CommonParams& commonParams=SIFT::CommonParams() );
    SiftFeatureDetector( double threshold, double edgeThreshold,
                         int nOctaves=SIFT::CommonParams::DEFAULT_NOCTAVES,
                         int nOctaveLayers=SIFT::CommonParams::DEFAULT_NOCTAVE_LAYERS,
                         int firstOctave=SIFT::CommonParams::DEFAULT_FIRST_OCTAVE,
                         int angleMode=SIFT::CommonParams::FIRST_ANGLE );
    virtual void read( const FileNode& fn );
    virtual void write( FileStorage& fs ) const;
protected:
    ...
};

StarFeatureDetector
Comments from the Wiki
StarFeatureDetector
Wrapping class for feature detection using the StarDetector() class.

class StarFeatureDetector : public FeatureDetector
{
public:
    StarFeatureDetector( int maxSize=16, int responseThreshold=30,
                         int lineThresholdProjected=10,
                         int lineThresholdBinarized=8,
                         int suppressNonmaxSize=5 );
    virtual void read( const FileNode& fn );
    virtual void write( FileStorage& fs ) const;
protected:
    ...
};

SurfFeatureDetector
Comments from the Wiki
SurfFeatureDetector
Wrapping class for feature detection using the SURF() class.

class SurfFeatureDetector : public FeatureDetector
{
public:
    SurfFeatureDetector( double hessianThreshold = 400., int octaves = 3,
                         int octaveLayers = 4 );
    virtual void read( const FileNode& fn );
    virtual void write( FileStorage& fs ) const;
protected:
    ...
};

GridAdaptedFeatureDetector
Comments from the Wiki
GridAdaptedFeatureDetector
Adapts a detector to partition the source image into a grid and detect points in each cell.

class GridAdaptedFeatureDetector : public FeatureDetector
{
public:
    /*
     * detector           Detector that will be adapted.
     * maxTotalKeypoints  Maximum count of keypoints detected on the image.
     *                    Only the strongest keypoints will be kept.
     * gridRows           Grid row count.
     * gridCols           Grid column count.
     */
    GridAdaptedFeatureDetector( const Ptr<FeatureDetector>& detector,
                                int maxTotalKeypoints,
                                int gridRows=4, int gridCols=4 );
    virtual void read( const FileNode& fn );
    virtual void write( FileStorage& fs ) const;
protected:
    ...
};

PyramidAdaptedFeatureDetector
Comments from the Wiki
PyramidAdaptedFeatureDetector
Adapts a detector to detect points over multiple levels of a Gaussian pyramid. Useful for detectors that are not inherently scaled.

class PyramidAdaptedFeatureDetector : public FeatureDetector
{
public:
    PyramidAdaptedFeatureDetector( const Ptr<FeatureDetector>& detector,
                                   int levels=2 );
    virtual void read( const FileNode& fn );
    virtual void write( FileStorage& fs ) const;
protected:
    ...
};

DynamicAdaptedFeatureDetector
Comments from the Wiki
DynamicAdaptedFeatureDetector
An adaptively adjusting detector that iteratively detects until the desired number of features are found.

If the detector is persisted, it will "remember" the parameters used on the last detection. In this way, the detector may be used for consistent numbers of keypoints in sets of images that are temporally related, such as video streams or panorama series.

The DynamicAdaptedFeatureDetector uses another detector such as FAST or SURF to do the dirty work, with the help of an AdjusterAdapter. After a detection, if an unsatisfactory number of features was detected, the AdjusterAdapter adjusts the detection parameters so that the next detection results in more or fewer features. This is repeated until either the desired number of features is found or the parameters are maxed out. Adapters can easily be implemented for any detector via the AdjusterAdapter interface.

Beware that this is not thread-safe, as the adjustment of parameters breaks the const of the detection routine. Also, at each iteration the detector is rerun, so many iterations can get time consuming.

class DynamicAdaptedFeatureDetector: public FeatureDetector
{
public:
    DynamicAdaptedFeatureDetector( const Ptr<AdjusterAdapter>& adjaster,
                                   int min_features=400,
                                   int max_features=500,
                                   int max_iters=5 );
    ...
};

Here is a sample of how to create a DynamicAdaptedFeatureDetector:

//sample usage:
//will create a detector that attempts to find
//100 - 110 FAST Keypoints, and will at most run
//FAST feature detection 10 times until that
//number of keypoints are found
Ptr<FeatureDetector> detector(new DynamicAdaptedFeatureDetector(
    new FastAdjuster(20, true), 100, 110, 10));

cv::DynamicAdaptedFeatureDetector::DynamicAdaptedFeatureDetector
Comments from the Wiki
DynamicAdaptedFeatureDetector::DynamicAdaptedFeatureDetector(const Ptr<AdjusterAdapter>& adjaster, int min_features, int max_features, int max_iters)
DynamicAdaptedFeatureDetector constructor.
Parameters:
o adjaster – An AdjusterAdapter() that will do the detection and parameter adjustment
o min_features – The minimum desired number of features.
o max_features – The maximum desired number of features.
o max_iters – The maximum number of times to try to adjust the feature detector parameters. For the FastAdjuster() this number can be high, but with Star or Surf many iterations can get time consuming, so keep this in mind when choosing this value.

AdjusterAdapter
Comments from the Wiki
AdjusterAdapter
A feature detector parameter adjuster interface. This is used by the DynamicAdaptedFeatureDetector() and is a wrapper for FeatureDetector() that allows them to be adjusted after a detection. See FastAdjuster(), StarAdjuster(), SurfAdjuster() for concrete implementations.

class AdjusterAdapter: public FeatureDetector
{
public:
    virtual ~AdjusterAdapter() {}
    virtual void tooFew(int min, int n_detected) = 0;
    virtual void tooMany(int max, int n_detected) = 0;
    virtual bool good() const = 0;
};

cv::AdjusterAdapter::tooFew
Comments from the Wiki
virtual void tooFew(int min, int n_detected) = 0
Too few features were detected, so adjust the detector parameters accordingly, so that the next detection detects more features.
Parameters:
o min – The minimum desired number of features.
o n_detected – The actual number detected last run.
An example implementation of this is

void FastAdjuster::tooFew(int min, int n_detected)
{
    thresh_--;
}

cv::AdjusterAdapter::tooMany
Comments from the Wiki
virtual void tooMany(int max, int n_detected) = 0
Too many features were detected, so adjust the detector parameters accordingly, so that the next detection detects fewer features.
Parameters:
o max – The maximum desired number of features.
o n_detected – The actual number detected last run.
An example implementation of this is

void FastAdjuster::tooMany(int max, int n_detected)
{
    thresh_++;
}

cv::AdjusterAdapter::good
Comments from the Wiki
virtual bool good() const = 0

Are the parameters maxed out or still valid? Returns false if the parameters can't be adjusted any more.
An example implementation of this is

bool FastAdjuster::good() const
{
    return (thresh_ > 1) && (thresh_ < 200);
}

FastAdjuster
Comments from the Wiki
FastAdjuster
An AdjusterAdapter() for the FastFeatureDetector(). This will basically decrement or increment the threshold by 1.

class FastAdjuster: public AdjusterAdapter
{
public:
    FastAdjuster(int init_thresh = 20, bool nonmax = true);
    ...
};

StarAdjuster
Comments from the Wiki
StarAdjuster
An AdjusterAdapter() for the StarFeatureDetector(). This adjusts the responseThreshold of StarFeatureDetector.

class StarAdjuster: public AdjusterAdapter
{
    StarAdjuster(double initial_thresh = 30.0);
    ...
};

SurfAdjuster
Comments from the Wiki
SurfAdjuster
An AdjusterAdapter() for the SurfFeatureDetector(). This adjusts the hessianThreshold of SurfFeatureDetector.

class SurfAdjuster: public AdjusterAdapter
{
    SurfAdjuster();
    ...
};

Common Interfaces of Descriptor Extractors
Extractors of keypoint descriptors in OpenCV have wrappers with a common interface that enables switching easily between different algorithms solving the same problem. This section is devoted to computing descriptors that are represented as vectors in a multidimensional space. All objects that implement ''vector'' descriptor extractors inherit the DescriptorExtractor() interface.

DescriptorExtractor
Comments from the Wiki
DescriptorExtractor
Abstract base class for computing descriptors for image keypoints.

class CV_EXPORTS DescriptorExtractor
{
public:
    virtual ~DescriptorExtractor();

    void compute( const Mat& image, vector<KeyPoint>& keypoints,
                  Mat& descriptors ) const;
    void compute( const vector<Mat>& images,
                  vector<vector<KeyPoint> >& keypoints,
                  vector<Mat>& descriptors ) const;

    virtual void read( const FileNode& );
    virtual void write( FileStorage& ) const;

    virtual int descriptorSize() const = 0;
    virtual int descriptorType() const = 0;

    static Ptr<DescriptorExtractor> create( const string& descriptorExtractorType );

protected:
    ...
};

In this interface we assume a keypoint descriptor can be represented as a dense, fixed-dimensional vector of some basic type. Most descriptors used in practice follow this pattern, as it makes it very easy to compute distances between descriptors. Therefore we represent a collection of descriptors as a Mat(), where each row is one keypoint descriptor.

cv::DescriptorExtractor::compute
Comments from the Wiki
void DescriptorExtractor::compute(const Mat& image, vector<KeyPoint>& keypoints, Mat& descriptors) const
Compute the descriptors for a set of keypoints detected in an image (first variant) or image set (second variant).
Parameters:
o image – The image.
o keypoints – The keypoints. Keypoints for which a descriptor cannot be computed are removed.
o descriptors – The descriptors. Row i is the descriptor for keypoint i.
void DescriptorExtractor::compute(const vector<Mat>& images, vector<vector<KeyPoint> >& keypoints, vector<Mat>& descriptors) const
Parameters:
o images – The image set.
o keypoints – Input keypoints collection. keypoints[i] are the keypoints detected in images[i]. Keypoints for which a descriptor cannot be computed are removed.
o descriptors – Descriptor collection. descriptors[i] are the descriptors computed for the set keypoints[i].
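A minimal sketch combining a detector with an extractor through the common interfaces; the type strings and image name are placeholders:

Ptr<FeatureDetector> detector = FeatureDetector::create("SURF");
Ptr<DescriptorExtractor> extractor = DescriptorExtractor::create("SURF");
Mat img = imread("scene.jpg", 0);
vector<KeyPoint> keypoints;
detector->detect(img, keypoints);
Mat descriptors; // one row per surviving keypoint
extractor->compute(img, keypoints, descriptors);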

cv::DescriptorExtractor::read
Comments from the Wiki
void DescriptorExtractor::read(const FileNode& fn)
Read descriptor extractor object from file node.
Parameters:
o fn – File node from which the extractor will be read.

cv::DescriptorExtractor::write
Comments from the Wiki
void DescriptorExtractor::write(FileStorage& fs) const
Write descriptor extractor object to file storage.
Parameters:
o fs – File storage in which the extractor will be written.

cv::DescriptorExtractor::create
Comments from the Wiki
Ptr<DescriptorExtractor> DescriptorExtractor::create(const string& descriptorExtractorType)
Descriptor extractor factory that creates a DescriptorExtractor() of the given type with default parameters.
Parameters:
o descriptorExtractorType – Descriptor extractor type. The following types are supported: "SIFT" (SiftDescriptorExtractor()), "SURF" (SurfDescriptorExtractor()), "BRIEF" (BriefDescriptorExtractor()). A combined format with the "Opponent" prefix (OpponentColorDescriptorExtractor()) is also supported, e.g. "OpponentSIFT".

SiftDescriptorExtractor
Comments from the Wiki
SiftDescriptorExtractor
Wrapping class for descriptor computing using the SIFT() class.

class SiftDescriptorExtractor : public DescriptorExtractor
{
public:
    SiftDescriptorExtractor(
        const SIFT::DescriptorParams& descriptorParams=SIFT::DescriptorParams(),
        const SIFT::CommonParams& commonParams=SIFT::CommonParams() );
    SiftDescriptorExtractor( double magnification, bool isNormalize=true,
        bool recalculateAngles=true,
        int nOctaves=SIFT::CommonParams::DEFAULT_NOCTAVES,
        int nOctaveLayers=SIFT::CommonParams::DEFAULT_NOCTAVE_LAYERS,
        int firstOctave=SIFT::CommonParams::DEFAULT_FIRST_OCTAVE,
        int angleMode=SIFT::CommonParams::FIRST_ANGLE );

    virtual void read (const FileNode &fn);
    virtual void write (FileStorage &fs) const;
    virtual int descriptorSize() const;
    virtual int descriptorType() const;
protected:
    ...
};

SurfDescriptorExtractor
Comments from the Wiki
SurfDescriptorExtractor
Wrapping class for descriptor computing using the SURF() class.

class SurfDescriptorExtractor : public DescriptorExtractor
{
public:
    SurfDescriptorExtractor( int nOctaves=4, int nOctaveLayers=2,
                             bool extended=false );
    virtual void read (const FileNode &fn);
    virtual void write (FileStorage &fs) const;
    virtual int descriptorSize() const;
    virtual int descriptorType() const;
protected:
    ...
};

CalonderDescriptorExtractor
Comments from the Wiki
CalonderDescriptorExtractor
Wrapping class for descriptor computing using the RTreeClassifier() class.

template<typename T>
class CalonderDescriptorExtractor : public DescriptorExtractor
{
public:
    CalonderDescriptorExtractor( const string& classifierFile );
    virtual void read( const FileNode &fn );
    virtual void write( FileStorage &fs ) const;
    virtual int descriptorSize() const;
    virtual int descriptorType() const;
protected:
    ...
};

OpponentColorDescriptorExtractor
Comments from the Wiki
OpponentColorDescriptorExtractor
Adapts a descriptor extractor to compute descriptors in the Opponent Color Space (refer to van de Sande et al., CGIV 2008, "Color Descriptors for Object Category Recognition"). The input RGB image is transformed to the Opponent Color Space. Then the unadapted descriptor extractor (set in the constructor)

computes descriptors on each of the three channels and concatenates them into a single color descriptor.

class OpponentColorDescriptorExtractor : public DescriptorExtractor
{
public:
    OpponentColorDescriptorExtractor( const Ptr<DescriptorExtractor>& dextractor );

    virtual void read( const FileNode& );
    virtual void write( FileStorage& ) const;

    virtual int descriptorSize() const;
    virtual int descriptorType() const;
protected:
    ...
};

BriefDescriptorExtractor
Comments from the Wiki
BriefDescriptorExtractor
Class for computing BRIEF descriptors described in the paper of Calonder M., Lepetit V., Strecha C., Fua P.: ''BRIEF: Binary Robust Independent Elementary Features.'' 11th European Conference on Computer Vision (ECCV), Heraklion, Crete. LNCS Springer, September 2010.

class BriefDescriptorExtractor : public DescriptorExtractor
{
public:
    static const int PATCH_SIZE = 48;
    static const int KERNEL_SIZE = 9;

    // bytes is a length of descriptor in bytes. It can be equal 16, 32 or 64 bytes.
    BriefDescriptorExtractor( int bytes = 32 );

    virtual void read( const FileNode& );
    virtual void write( FileStorage& ) const;

    virtual int descriptorSize() const;
    virtual int descriptorType() const;

protected:
    ...
};
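A minimal sketch of computing BRIEF descriptors for FAST keypoints; the image name and the FAST threshold of 30 are illustrative values:

Mat img = imread("scene.jpg", 0);
vector<KeyPoint> keypoints;
FAST(img, keypoints, 30);
BriefDescriptorExtractor brief(32); // 32-byte descriptors
Mat descriptors; // CV_8U matrix, one 32-byte row per keypoint
brief.compute(img, keypoints, descriptors);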

Common Interfaces of Descriptor Matchers
Matchers of keypoint descriptors in OpenCV have wrappers with a common interface that enables switching easily between different algorithms solving the same problem. This section is devoted to matching descriptors that are represented as vectors in a multidimensional space. All objects that implement ''vector'' descriptor matchers inherit the DescriptorMatcher() interface.

DMatch
Comments from the Wiki
DMatch
Match between two keypoint descriptors: it stores the query descriptor index, the train descriptor index, the train image index, and the distance between the descriptors.

struct DMatch
{
    DMatch() : queryIdx(-1), trainIdx(-1), imgIdx(-1),
               distance(std::numeric_limits<float>::max()) {}
    DMatch( int _queryIdx, int _trainIdx, float _distance ) :
            queryIdx(_queryIdx), trainIdx(_trainIdx), imgIdx(-1),
            distance(_distance) {}
    DMatch( int _queryIdx, int _trainIdx, int _imgIdx, float _distance ) :
            queryIdx(_queryIdx), trainIdx(_trainIdx), imgIdx(_imgIdx),
            distance(_distance) {}

    int queryIdx; // query descriptor index
    int trainIdx; // train descriptor index
    int imgIdx;   // train image index

    float distance; // less is better

    bool operator<( const DMatch &m ) const;
};

DescriptorMatcher
Comments from the Wiki
DescriptorMatcher
Abstract base class for matching keypoint descriptors. It has two groups of match methods: for matching descriptors of one image with another image or with an image set.

class DescriptorMatcher
{
public:
    virtual ~DescriptorMatcher();

    virtual void add( const vector<Mat>& descriptors );
    const vector<Mat>& getTrainDescriptors() const;
    virtual void clear();
    bool empty() const;
    virtual bool isMaskSupported() const = 0;

    virtual void train();

    /*
     * Group of methods to match descriptors from image pair.
     */
    void match( const Mat& queryDescriptors, const Mat& trainDescriptors,
                vector<DMatch>& matches, const Mat& mask=Mat() ) const;
    void knnMatch( const Mat& queryDescriptors, const Mat& trainDescriptors,
                   vector<vector<DMatch> >& matches, int k,
                   const Mat& mask=Mat(), bool compactResult=false ) const;
    void radiusMatch( const Mat& queryDescriptors, const Mat& trainDescriptors,
                      vector<vector<DMatch> >& matches, float maxDistance,
                      const Mat& mask=Mat(), bool compactResult=false ) const;

    /*
     * Group of methods to match descriptors from one image to image set.
     */
    void match( const Mat& queryDescriptors, vector<DMatch>& matches,
                const vector<Mat>& masks=vector<Mat>() );
    void knnMatch( const Mat& queryDescriptors, vector<vector<DMatch> >& matches,
                   int k, const vector<Mat>& masks=vector<Mat>(),
                   bool compactResult=false );
    void radiusMatch( const Mat& queryDescriptors, vector<vector<DMatch> >& matches,
                      float maxDistance, const vector<Mat>& masks=vector<Mat>(),
                      bool compactResult=false );

    virtual void read( const FileNode& );
    virtual void write( FileStorage& ) const;

    virtual Ptr<DescriptorMatcher> clone( bool emptyTrainData=false ) const = 0;

    static Ptr<DescriptorMatcher> create( const string& descriptorMatcherType );

protected:
    vector<Mat> trainDescCollection;
    ...
};

cv::DescriptorMatcher::add
Comments from the Wiki
void add(const vector<Mat>& descriptors)
Add descriptors to the train descriptor collection. If the collection trainDescCollection is not empty, the new descriptors are added to the existing train descriptors.
Parameters:
o descriptors – Descriptors to add. Each descriptors[i] is a set of descriptors from the same (one) train image.

cv::DescriptorMatcher::getTrainDescriptors
Comments from the Wiki
const vector<Mat>& getTrainDescriptors() const
Returns a constant reference to the train descriptor collection (i.e. trainDescCollection).

cv::DescriptorMatcher::clear
Comments from the Wiki
void DescriptorMatcher::clear()
Clear the train descriptor collection.

cv::DescriptorMatcher::empty
Comments from the Wiki
bool DescriptorMatcher::empty() const
Returns true if there are no train descriptors in the collection.

cv::DescriptorMatcher::isMaskSupported
Comments from the Wiki
bool DescriptorMatcher::isMaskSupported()
Returns true if the descriptor matcher supports masking of permissible matches.

cv::DescriptorMatcher::train
Comments from the Wiki
void DescriptorMatcher::train()
Train the descriptor matcher (e.g. train the flann index). In all match methods, train() is run every time before matching. Some descriptor matchers (e.g. BruteForceMatcher) have an empty implementation of this method; other matchers actually train their inner structures (e.g. FlannBasedMatcher trains flann::Index).

cv::DescriptorMatcher::match
Comments from the Wiki
void DescriptorMatcher::match(const Mat& queryDescriptors, const Mat& trainDescriptors, vector<DMatch>& matches, const Mat& mask=Mat()) const
Find the best match for each descriptor from a query set with the train descriptors. It is supposed that the query descriptors are descriptors of keypoints detected on the same query image. In the first variant of this method the train descriptors are passed as an input argument and are supposed to be descriptors of keypoints detected on the same train image. In the second variant of the method the train descriptor collection that was set using the add method is used. An optional mask (or masks) can be set to describe which descriptors can be matched: queryDescriptors[i] can be matched with trainDescriptors[j] only if mask.at<uchar>(i,j) is non-zero.
void DescriptorMatcher::match(const Mat& queryDescriptors, vector<DMatch>& matches, const vector<Mat>& masks=vector<Mat>())
Parameters:
o queryDescriptors – Query set of descriptors.
o trainDescriptors – Train set of descriptors. This will not be added to the train descriptors collection stored in the class object.
o matches – Matches. If a query descriptor is masked out in mask, no match will be added for this descriptor. So the matches size may be less than the query descriptors count.
o mask – Mask specifying permissible matches between the input query and train matrices of descriptors.
o masks – The set of masks. Each masks[i] specifies permissible matches between the input query descriptors and the stored train descriptors from the i-th image (i.e. trainDescCollection[i]).
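A minimal sketch of matching two descriptor sets; queryDescriptors and trainDescriptors are assumed to be CV_32F matrices computed elsewhere (e.g. with SurfDescriptorExtractor()), and the "BruteForce" type string refers to the matcher described below:

Ptr<DescriptorMatcher> matcher = DescriptorMatcher::create("BruteForce");
vector<DMatch> matches;
matcher->match(queryDescriptors, trainDescriptors, matches);
// matches[i].queryIdx / matches[i].trainIdx index the two descriptor sets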

cv::DescriptorMatcher::knnMatch
void DescriptorMatcher::knnMatch(const Mat& queryDescriptors, const Mat& trainDescriptors, vector<vector<DMatch> >& matches, int k, const Mat& mask=Mat(), bool compactResult=false) const
void DescriptorMatcher::knnMatch(const Mat& queryDescriptors, vector<vector<DMatch> >& matches, int k, const vector<Mat>& masks=vector<Mat>(), bool compactResult=false)
Finds the k best matches for each descriptor from a query set. The found k (or fewer, if not possible) matches are returned in increasing order of distance. See DescriptorMatcher::match() for details about the query and train descriptors.
Parameters:
o queryDescriptors, trainDescriptors, mask, masks – See DescriptorMatcher::match().
o matches – Matches. Each matches[i] contains k or fewer matches for the same query descriptor.
o k – Count of best matches found per each query descriptor (or fewer if it is not possible).
o compactResult – Used when the mask (or masks) is not empty. If compactResult is false, the matches vector has the same size as the queryDescriptors row count. If compactResult is true, the matches vector does not contain matches for fully masked out query descriptors.
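knnMatch with k=2 is commonly used for the distance-ratio test. A sketch continuing the example above (the 0.6 ratio is an arbitrary threshold, not part of the API):

// keep a match only if the best distance is clearly smaller than the second best
vector<vector<DMatch> > knnMatches;
matcher.knnMatch(desc1, desc2, knnMatches, 2);

vector<DMatch> goodMatches;
for( size_t i = 0; i < knnMatches.size(); i++ )
{
    if( knnMatches[i].size() == 2 &&
        knnMatches[i][0].distance < 0.6f*knnMatches[i][1].distance )
        goodMatches.push_back(knnMatches[i][0]);
}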

cv::DescriptorMatcher::radiusMatch
void DescriptorMatcher::radiusMatch(const Mat& queryDescriptors, const Mat& trainDescriptors, vector<vector<DMatch> >& matches, float maxDistance, const Mat& mask=Mat(), bool compactResult=false) const
void DescriptorMatcher::radiusMatch(const Mat& queryDescriptors, vector<vector<DMatch> >& matches, float maxDistance, const vector<Mat>& masks=vector<Mat>(), bool compactResult=false)
Finds the best matches for each query descriptor that have a distance less than the given threshold. The found matches are returned in increasing order of distance. See DescriptorMatcher::match() for details about the query and train descriptors.
Parameters:
o queryDescriptors, trainDescriptors, mask, masks – See DescriptorMatcher::match().
o matches, compactResult – See DescriptorMatcher::knnMatch().
o maxDistance – The threshold on the distance of the found matches.

cv::DescriptorMatcher::clone
Ptr<DescriptorMatcher> DescriptorMatcher::clone(bool emptyTrainData) const Clones the matcher.
Parameters:
o emptyTrainData – If emptyTrainData is false, the method creates a deep copy of the object, i.e. copies both the parameters and the train data. If emptyTrainData is true, the method creates an object copy with the current parameters but with empty train data.

cv::DescriptorMatcher::create
Ptr<DescriptorMatcher> DescriptorMatcher::create(const string& descriptorMatcherType) Creates a descriptor matcher of the given type with the default parameters (using the default constructor).
Parameters:
o descriptorMatcherType – Descriptor matcher type, e.g. "BruteForce" (it uses L2), "BruteForce-L1", "BruteForce-Hamming", "BruteForce-HammingLUT" or "FlannBased".
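A short sketch of factory-based creation (the type string must be one of the names listed above):

// choose the matcher implementation at run time instead of hard-coding the class
Ptr<DescriptorMatcher> dmatcher = DescriptorMatcher::create("BruteForce");
vector<DMatch> dmatches;
dmatcher->match(desc1, desc2, dmatches);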

BruteForceMatcher

Brute-force descriptor matcher. For each descriptor in the first set, this matcher finds the closest descriptor in the second set by trying each one. This descriptor matcher supports masking permissible matches between descriptor sets.

template<class Distance>
class BruteForceMatcher : public DescriptorMatcher
{
public:
    BruteForceMatcher( Distance d = Distance() );
    virtual ~BruteForceMatcher();

    virtual bool isMaskSupported() const;
    virtual Ptr<DescriptorMatcher> clone( bool emptyTrainData=false ) const;
protected:
    ...
}

For efficiency, BruteForceMatcher is templated on the distance metric. For float descriptors, a common choice would be L2<float>. The supported distance classes are:

template<typename T>
struct Accumulator
{
    typedef T Type;
};

template<> struct Accumulator<unsigned char>  { typedef unsigned int Type; };
template<> struct Accumulator<unsigned short> { typedef unsigned int Type; };
template<> struct Accumulator<char>  { typedef int Type; };
template<> struct Accumulator<short> { typedef int Type; };

/*
 * Squared Euclidean distance functor
 */
template<class T>
struct L2
{
    typedef T ValueType;
    typedef typename Accumulator<T>::Type ResultType;

    ResultType operator()( const T* a, const T* b, int size ) const;
};

/*
 * Manhattan distance (city block distance) functor
 */
template<class T>
struct CV_EXPORTS L1
{
    typedef T ValueType;
    typedef typename Accumulator<T>::Type ResultType;

    ResultType operator()( const T* a, const T* b, int size ) const;
    ...
};

/*
 * Hamming distance functor
 */
struct HammingLUT
{
    typedef unsigned char ValueType;
    typedef int ResultType;

    ResultType operator()( const unsigned char* a, const unsigned char* b, int size ) const;
    ...
};

struct Hamming
{
    typedef unsigned char ValueType;
    typedef int ResultType;

    ResultType operator()( const unsigned char* a, const unsigned char* b, int size ) const;
    ...
};

FlannBasedMatcher
Flann-based descriptor matcher. This matcher trains flann::Index() on the train descriptor collection and calls its nearest-search methods to find the best matches, so it may be faster than the brute-force matcher when matching to a large train collection. FlannBasedMatcher does not support masking permissible matches between descriptor sets, because flann::Index() does not support this.

class FlannBasedMatcher : public DescriptorMatcher
{
public:
    FlannBasedMatcher(
      const Ptr<flann::IndexParams>& indexParams=new flann::KDTreeIndexParams(),
      const Ptr<flann::SearchParams>& searchParams=new flann::SearchParams() );

    virtual void add( const vector<Mat>& descriptors );
    virtual void clear();

    virtual void train();
    virtual bool isMaskSupported() const;

    virtual Ptr<DescriptorMatcher> clone( bool emptyTrainData=false ) const;
protected:
    ...
};
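A sketch of using the FLANN-based matcher with the float descriptors desc1 and desc2 from the earlier example:

// approximate nearest-neighbor matching; pays off for large train collections
FlannBasedMatcher flannMatcher;
vector<DMatch> flannMatches;
flannMatcher.match(desc1, desc2, flannMatches);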

Common Interfaces of Generic Descriptor Matchers
Matchers of keypoint descriptors in OpenCV have wrappers with a common interface that enables switching easily between different algorithms solving the same problem. This section is devoted to matching descriptors that can not be represented as vectors in a multidimensional space. GenericDescriptorMatcher is a more generic interface for descriptors: it does not make any assumptions about the descriptor representation. Every descriptor with the DescriptorExtractor() interface has a wrapper with the GenericDescriptorMatcher interface (see VectorDescriptorMatcher() ). There are descriptors, such as the One-way descriptor and Ferns, that have the GenericDescriptorMatcher interface implemented but do not support DescriptorExtractor() .

GenericDescriptorMatcher
Abstract interface for keypoint descriptor extracting and matching. There are DescriptorExtractor() and DescriptorMatcher() for these purposes too, but their interfaces are intended for descriptors represented as vectors in a multidimensional space. GenericDescriptorMatcher has two groups of match methods: for matching keypoints of one image with another image or with an image set.

class GenericDescriptorMatcher
{
public:
    GenericDescriptorMatcher();
    virtual ~GenericDescriptorMatcher();

    virtual void add( const vector<Mat>& images,
                      vector<vector<KeyPoint> >& keypoints );

    const vector<Mat>& getTrainImages() const;
    const vector<vector<KeyPoint> >& getTrainKeypoints() const;
    virtual void clear();

    virtual void train() = 0;

    virtual bool isMaskSupported() = 0;

    void classify( const Mat& queryImage, vector<KeyPoint>& queryKeypoints,
                   const Mat& trainImage, vector<KeyPoint>& trainKeypoints ) const;
    void classify( const Mat& queryImage, vector<KeyPoint>& queryKeypoints );

    /*
     * Group of methods to match keypoints from an image pair.
     */
    void match( const Mat& queryImage, vector<KeyPoint>& queryKeypoints,
                const Mat& trainImage, vector<KeyPoint>& trainKeypoints,
                vector<DMatch>& matches, const Mat& mask=Mat() ) const;
    void knnMatch( const Mat& queryImage, vector<KeyPoint>& queryKeypoints,
                   const Mat& trainImage, vector<KeyPoint>& trainKeypoints,
                   vector<vector<DMatch> >& matches, int k,
                   const Mat& mask=Mat(), bool compactResult=false ) const;
    void radiusMatch( const Mat& queryImage, vector<KeyPoint>& queryKeypoints,
                      const Mat& trainImage, vector<KeyPoint>& trainKeypoints,
                      vector<vector<DMatch> >& matches, float maxDistance,
                      const Mat& mask=Mat(), bool compactResult=false ) const;
    /*
     * Group of methods to match keypoints from one image to an image set.
     */
    void match( const Mat& queryImage, vector<KeyPoint>& queryKeypoints,
                vector<DMatch>& matches, const vector<Mat>& masks=vector<Mat>() );
    void knnMatch( const Mat& queryImage, vector<KeyPoint>& queryKeypoints,
                   vector<vector<DMatch> >& matches, int k,
                   const vector<Mat>& masks=vector<Mat>(), bool compactResult=false );
    void radiusMatch( const Mat& queryImage, vector<KeyPoint>& queryKeypoints,
                      vector<vector<DMatch> >& matches, float maxDistance,
                      const vector<Mat>& masks=vector<Mat>(), bool compactResult=false );

    virtual void read( const FileNode& );
    virtual void write( FileStorage& ) const;

    virtual Ptr<GenericDescriptorMatcher> clone( bool emptyTrainData=false ) const = 0;

protected:
    ...
};

cv::GenericDescriptorMatcher::add
void GenericDescriptorMatcher::add(const vector<Mat>& images, vector<vector<KeyPoint> >& keypoints) Adds images and the keypoints detected on them to the train collection (descriptors are supposed to be calculated here). If the train collection is not empty, the new images and keypoints are added to the existing data.
Parameters:
o images – Image collection.
o keypoints – Point collection. It is assumed that keypoints[i] are keypoints detected in the image images[i].

cv::GenericDescriptorMatcher::getTrainImages
const vector<Mat>& GenericDescriptorMatcher::getTrainImages() const Returns the train image collection.

cv::GenericDescriptorMatcher::getTrainKeypoints
const vector<vector<KeyPoint> >& GenericDescriptorMatcher::getTrainKeypoints() const Returns the train keypoint collection.

cv::GenericDescriptorMatcher::clear
void GenericDescriptorMatcher::clear() Clears the train data (images and keypoints).

cv::GenericDescriptorMatcher::train
void GenericDescriptorMatcher::train() Trains the matcher's inner structures on the added train data.

cv::GenericDescriptorMatcher::isMaskSupported
bool GenericDescriptorMatcher::isMaskSupported() Returns true if the generic descriptor matcher supports masking of permissible matches.

cv::GenericDescriptorMatcher::classify
void GenericDescriptorMatcher::classify(const Mat& queryImage, vector<KeyPoint>& queryKeypoints, const Mat& trainImage, vector<KeyPoint>& trainKeypoints) const
void GenericDescriptorMatcher::classify(const Mat& queryImage, vector<KeyPoint>& queryKeypoints)
Classifies query keypoints under the keypoints of one train image given as an input argument (the first version of the method) or under the train image collection that was set using GenericDescriptorMatcher::add() (the second version).
Parameters:
o queryImage – The query image.
o queryKeypoints – Keypoints from the query image.
o trainImage – The train image.
o trainKeypoints – Keypoints from the train image.

cv::GenericDescriptorMatcher::match
void GenericDescriptorMatcher::match(const Mat& queryImage, vector<KeyPoint>& queryKeypoints, const Mat& trainImage, vector<KeyPoint>& trainKeypoints, vector<DMatch>& matches, const Mat& mask=Mat()) const
void GenericDescriptorMatcher::match(const Mat& queryImage, vector<KeyPoint>& queryKeypoints, vector<DMatch>& matches, const vector<Mat>& masks=vector<Mat>())
Finds the best match for the query keypoints. In the first version of the method, a train image and the keypoints detected on it are input arguments. In the second version, the query keypoints are matched to the train collection that was set using GenericDescriptorMatcher::add(). As in DescriptorMatcher::match(), a mask (or masks) can be set.
Parameters:
o queryImage – Query image.
o queryKeypoints – Keypoints detected in queryImage.
o trainImage – Train image. It is not added to the train image collection stored in the class object.
o trainKeypoints – Keypoints detected in trainImage. They are not added to the train point collection stored in the class object.
o matches – Matches. If a query descriptor (keypoint) is masked out in mask, no match is added for this descriptor, so the size of matches may be less than the query keypoint count.
o mask – Mask specifying permissible matches between the input query and train keypoints.
o masks – The set of masks. Each masks[i] specifies permissible matches between the input query keypoints and the stored train keypoints from the i-th image.

cv::GenericDescriptorMatcher::knnMatch
void GenericDescriptorMatcher::knnMatch(const Mat& queryImage, vector<KeyPoint>& queryKeypoints, const Mat& trainImage, vector<KeyPoint>& trainKeypoints, vector<vector<DMatch> >& matches, int k, const Mat& mask=Mat(), bool compactResult=false) const
void GenericDescriptorMatcher::knnMatch(const Mat& queryImage, vector<KeyPoint>& queryKeypoints, vector<vector<DMatch> >& matches, int k, const vector<Mat>& masks=vector<Mat>(), bool compactResult=false)
Finds the k best matches for each keypoint from a query set. The found k (or fewer, if not possible) matches are returned in increasing order of distance. For details see GenericDescriptorMatcher::match() and DescriptorMatcher::knnMatch().

cv::GenericDescriptorMatcher::radiusMatch
void GenericDescriptorMatcher::radiusMatch(const Mat& queryImage, vector<KeyPoint>& queryKeypoints, const Mat& trainImage, vector<KeyPoint>& trainKeypoints, vector<vector<DMatch> >& matches, float maxDistance, const Mat& mask=Mat(), bool compactResult=false) const
void GenericDescriptorMatcher::radiusMatch(const Mat& queryImage, vector<KeyPoint>& queryKeypoints, vector<vector<DMatch> >& matches, float maxDistance, const vector<Mat>& masks=vector<Mat>(), bool compactResult=false)
Finds the best matches for each query keypoint that have a distance less than the given threshold. The found matches are returned in increasing order of distance. For details see GenericDescriptorMatcher::match() and DescriptorMatcher::radiusMatch().

cv::GenericDescriptorMatcher::read
void GenericDescriptorMatcher::read(const FileNode& fn) Reads a matcher object from a file node.

cv::GenericDescriptorMatcher::write
void GenericDescriptorMatcher::write(FileStorage& fs) const Writes a matcher object to a file storage.

cv::GenericDescriptorMatcher::clone
Ptr<GenericDescriptorMatcher> GenericDescriptorMatcher::clone(bool emptyTrainData) const Clones the matcher.
Parameters:
o emptyTrainData – If emptyTrainData is false, the method creates a deep copy of the object, i.e. copies both the parameters and the train data. If emptyTrainData is true, the method creates an object copy with the current parameters but with empty train data.

OneWayDescriptorMatcher
Wrapping class for computing, matching and classification of descriptors using the OneWayDescriptorBase() class.

class OneWayDescriptorMatcher : public GenericDescriptorMatcher
{
public:
    class Params
    {
    public:
        static const int POSE_COUNT = 500;
        static const int PATCH_WIDTH = 24;
        static const int PATCH_HEIGHT = 24;
        static float GET_MIN_SCALE() { return 0.7f; }
        static float GET_MAX_SCALE() { return 1.5f; }
        static float GET_STEP_SCALE() { return 1.2f; }

        Params( int poseCount = POSE_COUNT,
                Size patchSize = Size(PATCH_WIDTH, PATCH_HEIGHT),
                string pcaFilename = string(),
                string trainPath = string(),
                string trainImagesList = string(),
                float minScale = GET_MIN_SCALE(),
                float maxScale = GET_MAX_SCALE(),
                float stepScale = GET_STEP_SCALE() );

        int poseCount;
        Size patchSize;
        string pcaFilename;
        string trainPath;
        string trainImagesList;
        float minScale, maxScale, stepScale;
    };

    OneWayDescriptorMatcher( const Params& params=Params() );
    virtual ~OneWayDescriptorMatcher();

    void initialize( const Params& params,
                     const Ptr<OneWayDescriptorBase>& base=Ptr<OneWayDescriptorBase>() );

    // Clears keypoints stored in the collection and OneWayDescriptorBase
    virtual void clear();

    virtual void train();

    virtual bool isMaskSupported();

    virtual void read( const FileNode &fn );
    virtual void write( FileStorage& fs ) const;

    virtual Ptr<GenericDescriptorMatcher> clone( bool emptyTrainData=false ) const;
protected:
    ...
};

FernDescriptorMatcher
Wrapping class for computing, matching and classification of descriptors using the FernClassifier() class.

class FernDescriptorMatcher : public GenericDescriptorMatcher
{
public:
    class Params
    {
    public:
        Params( int nclasses=0,
                int patchSize=FernClassifier::PATCH_SIZE,
                int signatureSize=FernClassifier::DEFAULT_SIGNATURE_SIZE,
                int nstructs=FernClassifier::DEFAULT_STRUCTS,
                int structSize=FernClassifier::DEFAULT_STRUCT_SIZE,
                int nviews=FernClassifier::DEFAULT_VIEWS,
                int compressionMethod=FernClassifier::COMPRESSION_NONE,
                const PatchGenerator& patchGenerator=PatchGenerator() );

        Params( const string& filename );

        int nclasses;
        int patchSize;
        int signatureSize;
        int nstructs;
        int structSize;
        int nviews;
        int compressionMethod;
        PatchGenerator patchGenerator;

        string filename;
    };

    FernDescriptorMatcher( const Params& params=Params() );
    virtual ~FernDescriptorMatcher();

    virtual void clear();

    virtual void train();

    virtual bool isMaskSupported();

    virtual void read( const FileNode &fn );
    virtual void write( FileStorage& fs ) const;

    virtual Ptr<GenericDescriptorMatcher> clone( bool emptyTrainData=false ) const;

protected:
    ...
};

VectorDescriptorMatcher
Class used for matching descriptors that can be described as vectors in a finite-dimensional space.

class CV_EXPORTS VectorDescriptorMatcher : public GenericDescriptorMatcher
{
public:
    VectorDescriptorMatcher( const Ptr<DescriptorExtractor>& extractor,
                             const Ptr<DescriptorMatcher>& matcher );
    virtual ~VectorDescriptorMatcher();

    virtual void add( const vector<Mat>& imgCollection,
                      vector<vector<KeyPoint> >& pointCollection );
    virtual void clear();
    virtual void train();
    virtual bool isMaskSupported();

    virtual void read( const FileNode& fn );
    virtual void write( FileStorage& fs ) const;

    virtual Ptr<GenericDescriptorMatcher> clone( bool emptyTrainData=false ) const;

protected:
    ...
};

Example of creating:

VectorDescriptorMatcher matcher( new SurfDescriptorExtractor,
                                 new BruteForceMatcher<L2<float> > );

Drawing Function of Keypoints and Matches
cv::drawMatches
void drawMatches(const Mat& img1, const vector<KeyPoint>& keypoints1, const Mat& img2, const vector<KeyPoint>& keypoints2, const vector<DMatch>& matches1to2, Mat& outImg, const Scalar& matchColor=Scalar::all(-1), const Scalar& singlePointColor=Scalar::all(-1), const vector<char>& matchesMask=vector<char>(), int flags=DrawMatchesFlags::DEFAULT)
void drawMatches(const Mat& img1, const vector<KeyPoint>& keypoints1, const Mat& img2, const vector<KeyPoint>& keypoints2, const vector<vector<DMatch> >& matches1to2, Mat& outImg, const Scalar& matchColor=Scalar::all(-1), const Scalar& singlePointColor=Scalar::all(-1), const vector<vector<char> >& matchesMask=vector<vector<char> >(), int flags=DrawMatchesFlags::DEFAULT)
This function draws the matches of keypoints from two images onto the output image. A match is a line connecting two keypoints (circles).
Parameters:
o img1 – First source image.
o keypoints1 – Keypoints from the first source image.
o img2 – Second source image.
o keypoints2 – Keypoints from the second source image.
o matches1to2 – Matches from the first image to the second one, i.e. keypoints1[i] has a corresponding point keypoints2[matches[i]].
o outImg – Output image. Its content depends on the flags value, i.e. what is drawn in the output image.
o matchColor – Color of the matches (lines and connected keypoints). If matchColor==Scalar::all(-1), the color is generated randomly.
o singlePointColor – Color of single keypoints (circles), i.e. keypoints not having matches. If singlePointColor==Scalar::all(-1), the color is generated randomly.
o matchesMask – Mask determining which matches will be drawn. If the mask is empty, all matches are drawn.
o flags – Each bit of flags sets some feature of drawing. Possible flags bit values are defined by DrawMatchesFlags, see below.

struct DrawMatchesFlags
{
    enum
    {
        DEFAULT = 0,
            // Output image matrix will be created (Mat::create),
            // i.e. existing memory of output image may be reused.
            // Two source images, matches and single keypoints will be drawn.
            // For each keypoint only the center point will be drawn
            // (without the circle around the keypoint with keypoint size
            // and orientation).
        DRAW_OVER_OUTIMG = 1,
            // Output image matrix will not be created (Mat::create).
            // Matches will be drawn on existing content of output image.
        NOT_DRAW_SINGLE_POINTS = 2,
            // Single keypoints will not be drawn.
        DRAW_RICH_KEYPOINTS = 4
            // For each keypoint the circle around the keypoint with
            // keypoint size and orientation will be drawn.
    };
};

cv::drawKeypoints
void drawKeypoints(const Mat& image, const vector<KeyPoint>& keypoints, Mat& outImg, const Scalar& color=Scalar::all(-1), int flags=DrawMatchesFlags::DEFAULT) Draws keypoints.
Parameters:
o image – Source image.
o keypoints – Keypoints from the source image.
o outImg – Output image. Its content depends on the flags value, i.e. what is drawn in the output image.
o color – Color of the keypoints.
o flags – Each bit of flags sets some feature of drawing. Possible flags bit values are defined by DrawMatchesFlags; see above in drawMatches().
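A sketch that visualizes matches computed earlier (img1, kp1, img2, kp2 and matches are assumed from the DescriptorMatcher::match() example above):

// draw the matched keypoint pairs side by side and show the result
Mat outImg;
drawMatches(img1, kp1, img2, kp2, matches, outImg);
namedWindow("matches", CV_WINDOW_AUTOSIZE);
imshow("matches", outImg);
waitKey();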

Object Categorization
Some approaches based on local 2D features and used for object categorization are described in this section. See, for example, "Visual Categorization with Bags of Keypoints" by Gabriella Csurka, Christopher R. Dance, Lixin Fan, Jutta Willamowski and Cedric Bray, 2004.

BOWTrainer
Abstract base class for training a "bag of visual words" vocabulary from a set of descriptors.

class BOWTrainer
{
public:
    BOWTrainer(){}
    virtual ~BOWTrainer(){}

    void add( const Mat& descriptors );
    const vector<Mat>& getDescriptors() const;
    int descripotorsCount() const;

    virtual void clear();

    virtual Mat cluster() const = 0;
    virtual Mat cluster( const Mat& descriptors ) const = 0;

protected:
    ...
};

cv::BOWTrainer::add
void BOWTrainer::add(const Mat& descriptors) Adds descriptors to the training set. The training set is clustered using the cluster method to construct the vocabulary.
Parameters:
o descriptors – Descriptors to add to the training set. Each row of the

descriptors matrix is one descriptor.

cv::BOWTrainer::getDescriptors
const vector<Mat>& BOWTrainer::getDescriptors() const Returns the training set of descriptors.

cv::BOWTrainer::descripotorsCount
int BOWTrainer::descripotorsCount() const Returns the count of all descriptors stored in the training set.

cv::BOWTrainer::cluster
Mat BOWTrainer::cluster() const
Mat BOWTrainer::cluster(const Mat& descriptors) const
Clusters train descriptors. The vocabulary consists of the cluster centers, so this method returns the vocabulary. In the first variant of the method, the train descriptors stored in the object are clustered; in the second variant, the input descriptors are clustered.
Parameters:
o descriptors – Descriptors to cluster. Each row of the descriptors matrix is one descriptor. The descriptors are not added to the inner train descriptor set.

BOWKMeansTrainer
kmeans() based class to train a visual vocabulary using the "bag of visual words" approach.

class BOWKMeansTrainer : public BOWTrainer
{
public:
    BOWKMeansTrainer( int clusterCount, const TermCriteria& termcrit=TermCriteria(),
                      int attempts=3, int flags=KMEANS_PP_CENTERS );
    virtual ~BOWKMeansTrainer(){}

    // Returns trained vocabulary (i.e. cluster centers).
    virtual Mat cluster() const;
    virtual Mat cluster( const Mat& descriptors ) const;

protected:
    ...
};

To gain an understanding of the constructor parameters, see the kmeans() function arguments.
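A sketch of training a vocabulary with BOWKMeansTrainer (the SURF detector/extractor and the trainImages collection are assumptions; the vocabulary size of 100 is an arbitrary choice):

// collect SURF descriptors from all training images and cluster them into 100 words
SurfFeatureDetector detector;
SurfDescriptorExtractor extractor;
BOWKMeansTrainer bowTrainer(100);
for( size_t i = 0; i < trainImages.size(); i++ )
{
    vector<KeyPoint> kp;
    Mat desc;
    detector.detect(trainImages[i], kp);
    extractor.compute(trainImages[i], kp, desc);
    if( !desc.empty() )
        bowTrainer.add(desc);
}
Mat vocabulary = bowTrainer.cluster(); // each row is one visual word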

Class to compute an image descriptor using the "bag of visual words". In a few words, such computing consists of the following steps: 1. Compute descriptors for the given image and its keypoint set. 2. Find the nearest visual words from the vocabulary for each keypoint descriptor. 3. The image descriptor is a normalized histogram of the vocabulary words encountered in the image, i.e. the i-th bin of the histogram is the frequency of the i-th word of the vocabulary in the given image.

class BOWImgDescriptorExtractor
{
public:
    BOWImgDescriptorExtractor( const Ptr<DescriptorExtractor>& dextractor,
                               const Ptr<DescriptorMatcher>& dmatcher );
    virtual ~BOWImgDescriptorExtractor(){}

    void setVocabulary( const Mat& vocabulary );
    const Mat& getVocabulary() const;
    void compute( const Mat& image, vector<KeyPoint>& keypoints,
                  Mat& imgDescriptor,
                  vector<vector<int> >* pointIdxsOfClusters=0,
                  Mat* descriptors=0 );
    int descriptorSize() const;
    int descriptorType() const;

protected:
    ...
};

cv::BOWImgDescriptorExtractor::BOWImgDescriptorExtractor
BOWImgDescriptorExtractor::BOWImgDescriptorExtractor(const Ptr<DescriptorExtractor>& dextractor, const Ptr<DescriptorMatcher>& dmatcher) Constructor.
Parameters:
o dextractor – Descriptor extractor that is used to compute descriptors for the input image and its keypoints.
o dmatcher – Descriptor matcher that is used to find the nearest word of the trained vocabulary for each keypoint descriptor of the image.

cv::BOWImgDescriptorExtractor::setVocabulary
void BOWImgDescriptorExtractor::setVocabulary(const Mat& vocabulary) Sets the visual vocabulary.
Parameters:
o vocabulary – Vocabulary (can be trained using an inheritor of BOWTrainer() ). Each row of the vocabulary is one visual word (cluster center).

cv::BOWImgDescriptorExtractor::getVocabulary
const Mat& BOWImgDescriptorExtractor::getVocabulary() const Returns the set vocabulary.

i. o keypoints– Keypoints detected in the input image. and 0 otherwise. computed image descriptor. flann. cv::flann::Index_ Comments from the Wiki cv::flann::Index_ The FLANN nearest neighbor index class. vector<vector<int> >* pointIdxsOfClusters=0. if vocabulary was set.cluster (word of vocabulary) (returned if it is not 0. cv::BOWImgDescriptorExtractor::compute Comments from the Wiki void BOWImgDescriptorExtractor::compute(const Mat& image. More information about FLANN can be found in muja_flann_2009 . Parameters: o image– The image. Mat* descriptors=0) Compute image descriptor using set visual vocabulary. o imgDescriptor– This is output.) o descriptors– Descriptors of the image keypoints (returned if it is not 0.) cv::BOWImgDescriptorExtractor::descriptorSize Comments from the Wiki int BOWImgDescriptorExtractor::descriptorSize() const Returns image discriptor size. This class is templated with the type of elements for which the index is built. cv::BOWImgDescriptorExtractor::descriptorType Comments from the Wiki int BOWImgDescriptorExtractor::descriptorType() const Returns image descriptor type.e. pointIdxsOfClusters[i]is keypoint indices which belong to the i. vector<KeyPoint>& keypoints. Mat& imgDescriptor. namespace cv { namespace flann . i. o pointIdxsOfClusters– Indices of keypoints which belong to the cluster.Comments from the Wiki const Mat& BOWImgDescriptorExtractor::getVocabulary() const Returns set vocabulary. Clustering and Search in Multi-Dimensional Spaces Fast Approximate Nearest Neighbor Search This section documents OpenCV’s interface to the FLANN library. Image descriptor will be computed for this. FLANN (Fast Library for Approximate Nearest Neighbors) is a library that contains a collection of algorithms optimized for fast nearest neighbor search in large datasets and for high dimensional features.e.

flann. Clustering and Search in Multi-Dimensional Spaces
Fast Approximate Nearest Neighbor Search
This section documents OpenCV's interface to the FLANN library. FLANN (Fast Library for Approximate Nearest Neighbors) is a library that contains a collection of algorithms optimized for fast nearest neighbor search in large datasets and for high dimensional features. More information about FLANN can be found in muja_flann_2009 .

cv::flann::Index_
The FLANN nearest neighbor index class. This class is templated with the type of elements for which the index is built.

namespace cv
{
namespace flann
{
    template <typename T>
    class Index_
    {
    public:
        Index_(const Mat& features, const IndexParams& params);
        ~Index_();

        void knnSearch(const vector<T>& query, vector<int>& indices,
                       vector<float>& dists, int knn, const SearchParams& params);
        void knnSearch(const Mat& queries, Mat& indices, Mat& dists,
                       int knn, const SearchParams& params);

        int radiusSearch(const vector<T>& query, vector<int>& indices,
                         vector<float>& dists, float radius, const SearchParams& params);
        int radiusSearch(const Mat& query, Mat& indices, Mat& dists,
                         float radius, const SearchParams& params);

        void save(std::string filename);

        int veclen() const;
        int size() const;

        const IndexParams* getIndexParameters();
    };

    typedef Index_<float> Index;

} } // namespace cv::flann

cv::flann::Index_<T>::Index_
Index_<T>::Index_(const Mat& features, const IndexParams& params) Constructs a nearest neighbor search index for a given dataset.
Parameters:
o features – Matrix containing the features (points) to index. The size of the matrix is num_features x feature_dimensionality and the data type of the elements in the matrix must coincide with the type of the

index.
o params – Structure containing the index parameters. The type of index that will be constructed depends on the type of this parameter. The possible parameter types are:

LinearIndexParams – When passing an object of this type, the index will perform a linear, brute-force search.

struct LinearIndexParams : public IndexParams
{
};

KDTreeIndexParams – When passing an object of this type, the index constructed will consist of a set of randomized kd-trees which will be searched in parallel.

struct KDTreeIndexParams : public IndexParams
{
    KDTreeIndexParams( int trees = 4 );
};

trees – The number of parallel kd-trees to use. Good values are in the range [1..16].

KMeansIndexParams – When passing an object of this type, the index constructed will be a hierarchical k-means tree.

struct KMeansIndexParams : public IndexParams
{
    KMeansIndexParams(
        int branching = 32,
        int iterations = 11,
        flann_centers_init_t centers_init = CENTERS_RANDOM,
        float cb_index = 0.2 );
};

branching – The branching factor to use for the hierarchical k-means tree.
iterations – The maximum number of iterations to use in the k-means clustering stage when building the k-means tree. A value of -1 used here means that the k-means clustering should be iterated until convergence.
centers_init – The algorithm to use for selecting the initial centers when performing a k-means clustering step. The possible values are CENTERS_RANDOM (picks the initial cluster centers randomly), CENTERS_GONZALES (picks the initial centers using Gonzales' algorithm) and CENTERS_KMEANSPP (picks the initial centers using the algorithm suggested in arthur_kmeanspp_2007).
cb_index – This parameter (cluster boundary index) influences the way exploration is performed in the hierarchical kmeans tree. When cb_index is zero, the next kmeans domain to be explored is chosen to be the one with the closest center. A value greater than zero also takes into account the size of the domain.

CompositeIndexParams – When using a parameters object of this type, the index created combines the randomized kd-trees and the hierarchical k-means tree.

struct CompositeIndexParams : public IndexParams

{
    CompositeIndexParams(
        int trees = 4,
        int branching = 32,
        int iterations = 11,
        flann_centers_init_t centers_init = CENTERS_RANDOM,
        float cb_index = 0.2 );
};

AutotunedIndexParams – When passing an object of this type, the index created is automatically tuned to offer the best performance, by choosing the optimal index type (randomized kd-trees, hierarchical kmeans, linear) and parameters for the dataset provided.

struct AutotunedIndexParams : public IndexParams
{
    AutotunedIndexParams(
        float target_precision = 0.9,
        float build_weight = 0.01,
        float memory_weight = 0,
        float sample_fraction = 0.1 );
};

target_precision – A number between 0 and 1 specifying the percentage of the approximate nearest-neighbor searches that return the exact nearest-neighbor. Using a higher value for this parameter gives more accurate results, but the search takes longer. The optimum value usually depends on the application.
build_weight – Specifies the importance of the index build time compared to the nearest-neighbor search time. In some applications it's acceptable for the index build step to take a long time if the subsequent searches in the index can be performed very fast. In other applications it's required that the index be built as fast as possible even if that leads to slightly longer search times.
memory_weight – Used to specify the tradeoff between the time (index build time and search time) and the memory used by the index. A value less than 1 gives more importance to the time spent and a value greater than 1 gives more importance to the memory usage.
sample_fraction – A number between 0 and 1 indicating what fraction of the dataset to use in the automatic parameter configuration algorithm. Running the algorithm on the full dataset gives the most accurate results, but for very large datasets it can take longer than desired. In such a case, using just a fraction of the data helps speed up this algorithm while still giving good approximations of the optimum parameters.

SavedIndexParams – This object type is used for loading a previously saved index from the disk.

struct SavedIndexParams : public IndexParams
{
    SavedIndexParams( std::string filename );
};

filename – The filename in which the index was saved.

cv::flann::Index_<T>::knnSearch
void Index_<T>::knnSearch(const vector<T>& query, vector<int>& indices, vector<float>& dists, int knn, const SearchParams& params)
void Index_<T>::knnSearch(const Mat& queries, Mat& indices, Mat& dists, int knn, const SearchParams& params)
Performs a K-nearest neighbor search for a given query point using the index.
Parameters:
o query – The query point.
o indices – Vector that will contain the indices of the K-nearest neighbors found. It must have at least knn size.
o dists – Vector that will contain the distances to the K-nearest neighbors found. It must have at least knn size.
o knn – Number of nearest neighbors to search for.
o params – Search parameters:

struct SearchParams
{
    SearchParams(int checks = 32);
};

checks – The number of times the tree(s) in the index should be recursively traversed. A higher value for this parameter would give better search precision, but also take more time. If automatic configuration was used when the index was created, the number of checks required to achieve the specified precision was also computed, in which case this parameter is ignored.

cv::flann::Index_<T>::radiusSearch
int Index_<T>::radiusSearch(const vector<T>& query, vector<int>& indices, vector<float>& dists, float radius, const SearchParams& params)
int Index_<T>::radiusSearch(const Mat& query, Mat& indices, Mat& dists, float radius, const SearchParams& params)
Performs a radius nearest neighbor search for a given query point.
Parameters:
o query – The query point.
o indices – Vector that will contain the indices of the points found within the search radius in decreasing order of the distance to the query point. If the number of neighbors in the search radius is bigger than the size of this vector, the ones that don't fit in the vector are ignored.
o dists – Vector that will contain the distances to the points found within the search radius.
o radius – The search radius.
o params – Search parameters.
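A sketch of building an index over random 2D points and running a 5-nearest-neighbor query (all parameter values here are arbitrary):

// index 1000 random 2D float points with 4 randomized kd-trees
Mat features(1000, 2, CV_32F);
randu(features, Scalar(0), Scalar(1));
cv::flann::Index index(features, cv::flann::KDTreeIndexParams(4));

// query the 5 nearest neighbors of the point (0.5, 0.5)
vector<float> query(2, 0.5f);
vector<int> indices(5);
vector<float> dists(5);
index.knnSearch(query, indices, dists, 5, cv::flann::SearchParams(64));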

cv::flann::Index_<T>::save
void Index_<T>::save(std::string filename) Saves the index to a file.
Parameters:
o filename – The file to save the index to.

cv::flann::Index_<T>::getIndexParameters
const IndexParams* Index_<T>::getIndexParameters() Returns the index parameters. This is useful in the case of auto-tuned indices, when the parameters computed can be retrieved using this method.

Clustering
cv::flann::hierarchicalClustering<ET,DT>
int hierarchicalClustering<ET,DT>(const Mat& features, Mat& centers, const KMeansIndexParams& params) Clusters the given points by constructing a hierarchical k-means tree and choosing a cut in the tree that minimizes the clusters' variance.
Parameters:
o features – The points to be clustered. The matrix must have elements of type ET.
o centers – The centers of the clusters obtained. The matrix must have type DT. The number of rows in this matrix represents the number of clusters desired. However, because of the way the cut in the hierarchical tree is chosen, the number of clusters computed will be the highest number of the form (branching-1)*k+1 that is lower than the number of clusters desired, where branching is the tree's branching factor (see the description of the KMeansIndexParams).
o params – Parameters used in the construction of the hierarchical k-means tree.
The function returns the number of clusters computed.

objdetect. Object Detection
Cascade Classification
FeatureEvaluator
Base class for computing feature values in cascade classifiers.

class CV_EXPORTS FeatureEvaluator
{
public:
    enum { HAAR = 0, LBP = 1 }; // supported feature types

    virtual ~FeatureEvaluator(); // destructor

    virtual bool read(const FileNode& node);
    virtual Ptr<FeatureEvaluator> clone() const;
    virtual int getFeatureType() const;

    virtual bool setImage(const Mat& img, Size origWinSize);
    virtual bool setWindow(Point p);

    virtual double calcOrd(int featureIdx) const;
    virtual int calcCat(int featureIdx) const;

    static Ptr<FeatureEvaluator> create(int type);
};

cv::FeatureEvaluator::read
bool FeatureEvaluator::read(const FileNode& node) Reads the parameters of the features from a FileStorage node.
Parameters:
o node – File node from which the feature parameters are read.

cv::FeatureEvaluator::clone
Ptr<FeatureEvaluator> FeatureEvaluator::clone() const Returns a full copy of the feature evaluator.

cv::FeatureEvaluator::getFeatureType
int FeatureEvaluator::getFeatureType() const Returns the feature type (HAAR or LBP for now).

cv::FeatureEvaluator::setImage
bool FeatureEvaluator::setImage(const Mat& img, Size origWinSize) Sets the image in which the features will be computed.
Parameters:
o img – Matrix of type CV_8UC1 containing the image in which to compute the features.
o origWinSize – Size of the training images.

cv::FeatureEvaluator::setWindow
bool FeatureEvaluator::setWindow(Point p) Sets the window in the current image in which the features will be computed (called by CascadeClassifier::runAt() ).
Parameters:
o p – The upper left point of the window in which the features will be computed. The size of the window is equal to the size of the training images.

cv::FeatureEvaluator::calcOrd
double FeatureEvaluator::calcOrd(int featureIdx) const Computes the value of an ordered (numerical) feature. Returns the computed value of the ordered feature.
Parameters:
o featureIdx – Index of the feature whose value will be computed.

cv::FeatureEvaluator::calcCat
int FeatureEvaluator::calcCat(int featureIdx) const Computes the value of a categorical feature. Returns the computed label of the categorical feature, i.e. a value from [0, (number of categories - 1)].
Parameters:
o featureIdx – Index of the feature whose value will be computed.

cv::FeatureEvaluator::create
static Ptr<FeatureEvaluator> FeatureEvaluator::create(int type) Constructs a feature evaluator.
Parameters:
o type – Type of features evaluated by the cascade (HAAR or LBP for now).

CascadeClassifier
The cascade classifier class for object detection.

class CascadeClassifier
{
public:
    // structure for storing a tree node
    struct CV_EXPORTS DTreeNode
    {
        int featureIdx;  // feature index on which the split is made
        float threshold; // split threshold of ordered features only
        int left;        // left child index in the tree nodes array
        int right;       // right child index in the tree nodes array
    };

    // structure for storing a decision tree
    struct CV_EXPORTS DTree
    {
        int nodeCount; // nodes count
    };

    // structure for storing a cascade stage (BOOST only for now)
    struct CV_EXPORTS Stage
    {
        int first;       // first tree index in the tree array
        int ntrees;      // number of trees
        float threshold; // threshold of stage sum
    };

    enum { BOOST = 0 }; // supported stage types

    // mode of detection (see parameter flags in function HaarDetectObjects)
    enum { DO_CANNY_PRUNING = CV_HAAR_DO_CANNY_PRUNING,
           SCALE_IMAGE = CV_HAAR_SCALE_IMAGE,
           FIND_BIGGEST_OBJECT = CV_HAAR_FIND_BIGGEST_OBJECT,
           DO_ROUGH_SEARCH = CV_HAAR_DO_ROUGH_SEARCH };

    CascadeClassifier(); // default constructor
    CascadeClassifier(const string& filename);
    ~CascadeClassifier(); // destructor

    bool empty() const;
    bool load(const string& filename);
    bool read(const FileNode& node);

    void detectMultiScale( const Mat& image, vector<Rect>& objects,
                           double scaleFactor=1.1, int minNeighbors=3,
                           int flags=0, Size minSize=Size());

    bool setImage( Ptr<FeatureEvaluator>&, const Mat& );
    int runAt( Ptr<FeatureEvaluator>&, Point );

    bool is_stump_based; // true, if the trees are stumps

    int stageType;    // stage type (BOOST only for now)
    int featureType;  // feature type (HAAR or LBP for now)
    int ncategories;  // number of categories (for categorical features only)
    Size origWinSize; // size of training images

    vector<Stage> stages;      // vector of stages (BOOST for now)
    vector<DTree> classifiers; // vector of decision trees
    vector<DTreeNode> nodes;   // vector of tree nodes
    vector<float> leaves;      // vector of leaf values
    vector<int> subsets;       // subsets of split by categorical feature

    Ptr<FeatureEvaluator> feval;             // pointer to feature evaluator
    Ptr<CvHaarClassifierCascade> oldCascade; // pointer to old cascade
};

cv::CascadeClassifier::CascadeClassifier

CascadeClassifier::CascadeClassifier(const string& filename) Loads the classifier from a file.
Parameters:
o filename – Name of the file from which the classifier is loaded.

cv::CascadeClassifier::empty
bool CascadeClassifier::empty() const Checks if the classifier has been loaded or not.

cv::CascadeClassifier::load
bool CascadeClassifier::load(const string& filename) Loads the classifier from a file. The previous content is destroyed.
Parameters:
o filename – Name of the file from which the classifier is loaded. The file may contain an old HAAR classifier (trained by the haartraining application) or a new cascade classifier (trained by the traincascade application).

cv::CascadeClassifier::read
bool CascadeClassifier::read(const FileNode& node) Reads the classifier from a FileStorage node. The file may contain a new cascade classifier (trained by the traincascade application) only.

cv::CascadeClassifier::detectMultiScale
void CascadeClassifier::detectMultiScale(const Mat& image, vector<Rect>& objects, double scaleFactor=1.1, int minNeighbors=3, int flags=0, Size minSize=Size()) Detects objects of different sizes in the input image. The detected objects are returned as a list of rectangles.
Parameters:
o image – Matrix of type CV_8U containing the image in which to detect objects.
o objects – Vector of rectangles such that each rectangle contains the detected object.
o scaleFactor – Specifies how much the image size is reduced at each image scale.
o minNeighbors – Specifies how many neighbors each candidate rectangle should have to retain it.
o flags – This parameter is not used for the new cascade and has the same meaning for the old cascade as in the function cvHaarDetectObjects.
o minSize – The minimum possible object size. Objects smaller than that are ignored.
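A minimal detection sketch (the cascade file name is a placeholder for any XML cascade produced by the haartraining or traincascade applications):

// load a cascade and detect objects in a grayscale image
CascadeClassifier cascade("haarcascade_frontalface_alt.xml");
Mat grayImg = imread("people.jpg", 0);
equalizeHist(grayImg, grayImg); // usually improves detection stability

vector<Rect> found;
if( !cascade.empty() )
    cascade.detectMultiScale(grayImg, found, 1.1, 3, 0, Size(30, 30));
for( size_t i = 0; i < found.size(); i++ )
    rectangle(grayImg, found[i].tl(), found[i].br(), Scalar::all(255));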

bool CascadeClassifier::setImage(Ptr<FeatureEvaluator>& feval, const Mat& image) Sets the image for detection (called by detectMultiScale at each image level).
Parameters:
o feval – Pointer to the feature evaluator that is used for computing the features.
o image – Matrix of type CV_8UC1 containing the image in which to compute the features.

cv::CascadeClassifier::runAt
int CascadeClassifier::runAt(Ptr<FeatureEvaluator>& feval, Point pt) Runs the detector at the specified point (the image that the detector is working with should be set by setImage).
Parameters:
o feval – Feature evaluator that is used for computing the features.
o pt – The upper left point of the window in which the features will be computed. The size of the window is equal to the size of the training images.
Returns: 1 if the cascade classifier detects an object in the given location; otherwise it returns -si, where si is the index of the stage that first predicted that the given window is background.

cv::groupRectangles
void groupRectangles(vector<Rect>& rectList, int groupThreshold, double eps=0.2) Groups the object candidate rectangles.
Parameters:
o rectList – The input/output vector of rectangles. On output it contains the retained and grouped rectangles.
o groupThreshold – The minimum possible number of rectangles, minus 1, in a group of rectangles to retain it.
o eps – The relative difference between the sides of the rectangles needed to merge them into a group.
The function is a wrapper for a generic function partition(). It clusters all the input rectangles using the rectangle equivalence criteria that combines rectangles that have similar sizes and similar locations (the similarity is defined by eps). When eps=0, no clustering is done at all; as eps tends to +inf, all the rectangles are put in one cluster. Then, the small clusters, containing less than or equal to groupThreshold rectangles, are rejected. In each other cluster the average rectangle is computed and put into the output rectangle list.

video. Video Analysis
Motion Analysis and Object Tracking
cv::calcOpticalFlowPyrLK
void calcOpticalFlowPyrLK(const Mat& prevImg, const Mat& nextImg, const vector<Point2f>& prevPts, vector<Point2f>& nextPts, vector<uchar>& status, vector<float>& err, Size winSize=Size(15, 15), int maxLevel=3, TermCriteria criteria=TermCriteria(TermCriteria::COUNT+TermCriteria::EPS, 30, 0.01), double derivLambda=0.5, int flags=0) Calculates the optical flow for a sparse feature set using the iterative Lucas-Kanade method with pyramids.

Parameters:
o prevImg – The first 8-bit single-channel or 3-channel input image.
o nextImg – The second input image of the same size and the same type as prevImg.
o prevPts – Vector of points for which the flow needs to be found.
o nextPts – The output vector of points containing the calculated new positions of the input features in the second image.
o status – The output status vector. Each element of the vector is set to 1 if the flow for the corresponding feature has been found, 0 otherwise.
o err – The output vector that will contain the difference between the patches around the original and moved points.
o winSize – Size of the search window at each pyramid level.
o maxLevel – 0-based maximal pyramid level number. If 0, pyramids are not used (single level); if 1, two levels are used, etc.
o criteria – Specifies the termination criteria of the iterative search algorithm (after the specified maximum number of iterations criteria.maxCount or when the search window moves by less than criteria.epsilon).
o derivLambda – The relative weight of the spatial image derivatives impact to the optical flow estimation. If derivLambda=0, only the image intensity is used; if derivLambda=1, only the derivatives are used. Any other value between 0 and 1 means that both the derivatives and the image intensity are used (in the corresponding proportions).
o flags – The operation flags:
OPTFLOW_USE_INITIAL_FLOW – use initial estimations stored in nextPts. If the flag is not set, then initially nextPts is a copy of prevPts.
The function implements the sparse iterative version of the Lucas-Kanade optical flow in pyramids; see Bouguet00.
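A sketch of sparse tracking between two grayscale frames prevGray and gray (assumed to be already acquired; frame is a color image used only for drawing):

// pick corners in the previous frame and track them into the current one
vector<Point2f> prevPts, nextPts;
goodFeaturesToTrack(prevGray, prevPts, 500, 0.01, 10);

vector<uchar> status;
vector<float> err;
calcOpticalFlowPyrLK(prevGray, gray, prevPts, nextPts, status, err);

for( size_t i = 0; i < nextPts.size(); i++ )
    if( status[i] ) // the flow for this point has been found
        line(frame, prevPts[i], nextPts[i], Scalar(0, 255, 0));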

cv::calcOpticalFlowFarneback
void calcOpticalFlowFarneback(const Mat& prevImg, const Mat& nextImg, Mat& flow, double pyrScale, int levels, int winsize, int iterations, int polyN, double polySigma, int flags) Computes dense optical flow using Gunnar Farneback's algorithm.
Parameters:
o prevImg – The first 8-bit single-channel input image.
o nextImg – The second input image of the same size and the same type as prevImg.
o flow – The computed flow image; will have the same size as prevImg and type CV_32FC2.
o pyrScale – Specifies the image scale (<1) to build the pyramids for each image. pyrScale=0.5 means the classical pyramid, where each next layer is twice smaller than the previous one.
o levels – The number of pyramid layers, including the initial image. levels=1 means that no extra layers are created and only the original images are used.
o winsize – The averaging window size. Larger values increase the algorithm's robustness to image noise and give more chances for fast motion detection, but yield a more blurred motion field.
o iterations – The number of iterations the algorithm does at each pyramid level.
o polyN – Size of the pixel neighborhood used to find the polynomial expansion in each pixel. Larger values mean that the image will be approximated with smoother surfaces, yielding a more robust algorithm and a more blurred motion field. Typically, polyN = 5 or 7.
o polySigma – Standard deviation of the Gaussian that is used to smooth the derivatives used as a basis for the polynomial expansion. For polyN=5 you can set polySigma=1.1; for polyN=7 a good value would be polySigma=1.5.
o flags – The operation flags; can be a combination of the following:
OPTFLOW_USE_INITIAL_FLOW – Use the input flow as the initial flow approximation.
OPTFLOW_FARNEBACK_GAUSSIAN – Use a Gaussian filter instead of a box filter of the same size for optical flow estimation. Usually this option gives a more accurate flow than with a box filter, at the cost of lower speed (and normally winsize for a Gaussian window should be set to a larger value to achieve the same level of robustness).
The function finds the optical flow for each prevImg pixel, so that prevImg(y, x) ~ nextImg(y + flow(y, x)[1], x + flow(y, x)[0]).
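A sketch of computing dense flow between two grayscale frames (the parameter values follow those commonly used in the OpenCV samples):

Mat flow;
calcOpticalFlowFarneback(prevGray, gray, flow, 0.5, 3, 15, 3, 5, 1.2, 0);

// read the displacement of a single pixel (y, x)
int y = 100, x = 100;
Point2f d = flow.at<Point2f>(y, x); // d.x, d.y are the flow components in pixels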

cv::updateMotionHistory
void updateMotionHistory(const Mat& silhouette, Mat& mhi, double timestamp, double duration) Updates the motion history image by a moving silhouette.
Parameters:
o silhouette – Silhouette mask that has non-zero pixels where the motion occurs.
o mhi – Motion history image that is updated by the function (single-channel, 32-bit floating-point).
o timestamp – Current time in milliseconds or other units.
o duration – Maximal duration of the motion track in the same units as timestamp.
The function updates the motion history image as follows:

mhi(x, y) = timestamp  if silhouette(x, y) != 0
mhi(x, y) = 0          if silhouette(x, y) == 0 and mhi(x, y) < timestamp - duration
mhi(x, y) = mhi(x, y)  otherwise

That is, MHI pixels where motion occurs are set to the current timestamp, while the pixels where motion happened last time a long time ago are cleared. The function, together with calcMotionGradient() and calcGlobalOrientation(), implements the motion templates technique described in Davis97 and Bradski00. See also the OpenCV sample motempl.c that demonstrates the use of all the motion template functions.

cv::calcMotionGradient
void calcMotionGradient(const Mat& mhi, Mat& mask, Mat& orientation, double delta1, double delta2, int apertureSize=3) Calculates the gradient orientation of a motion history image.
Parameters:
o mhi – Motion history single-channel floating-point image.
o mask – The output mask image; will have the type CV_8UC1 and the same size as mhi. Its non-zero elements mark pixels where the motion gradient data is correct.
o orientation – The output motion gradient orientation image; will have the same type and the same size as mhi. Each pixel of it will contain the motion orientation in degrees, from 0 to 360.
o delta1, delta2 – The minimal and maximal allowed difference between mhi values within a pixel neighborhood. That is, the function finds the minimum (m(x, y)) and maximum (M(x, y)) mhi values over a neighborhood of each pixel and marks the motion orientation at (x, y) as valid only if min(delta1, delta2) <= M(x, y) - m(x, y) <= max(delta1, delta2).
o apertureSize – The aperture size of the Sobel() operator.
The function calculates the gradient orientation at each pixel (in fact, fastArctan() and phase() are used, so that the computed angle is measured in degrees and covers the full range 0..360). Also, the mask is filled to indicate pixels where the computed angle is valid.

cv::calcGlobalOrientation
double calcGlobalOrientation(const Mat& orientation, const Mat& mask, const Mat& mhi, double timestamp, double duration) Calculates the global motion orientation in some selected region.
Parameters:
o orientation – Motion gradient orientation image, calculated by the function calcMotionGradient().
o mask – Mask image. It may be a conjunction of a valid gradient mask, also calculated by calcMotionGradient(), and the mask of the region whose direction needs to be calculated.
o mhi – The motion history image, calculated by updateMotionHistory().
o timestamp – The timestamp passed to updateMotionHistory().
o duration – Maximal duration of the motion track in milliseconds, passed to updateMotionHistory().
The function calculates the average motion direction in the selected region and returns the angle between 0 degrees and 360 degrees. The average direction is computed from the weighted orientation histogram, where a recent motion has a larger weight and the motion that occurred in the past has a smaller weight, as recorded in mhi.

cv::CamShift
RotatedRect CamShift(const Mat& probImage, Rect& window, TermCriteria criteria) Finds the object center, size, and orientation.
Parameters:
o probImage – Back projection of the object histogram; see calcBackProject().

o window – Initial search window.
o criteria – Stop criteria for the underlying meanShift().
The function implements the CAMSHIFT object tracking algorithm Bradski98. First, it finds an object center using meanShift() and then adjusts the window size and finds the optimal rotation. The function returns the rotated rectangle structure that includes the object position, size and orientation. The next position of the search window can be obtained with RotatedRect::boundingRect(). See the OpenCV sample camshiftdemo.c that tracks colored objects.

cv::meanShift
int meanShift(const Mat& probImage, Rect& window, TermCriteria criteria) Finds the object on a back projection image.
Parameters:
o probImage – Back projection of the object histogram; see calcBackProject().
o window – Initial search window.
o criteria – The stop criteria for the iterative search algorithm.
The function implements the iterative object search algorithm. It takes the object back projection on input and the initial position. The mass center of the back projection image inside the window is computed and the search window center shifts to the mass center. The procedure is repeated until the specified number of iterations criteria.maxCount is done or until the window center shifts by less than criteria.epsilon. The algorithm is used inside CamShift() and, unlike CamShift(), the search window size or orientation do not change during the search. You can simply pass the output of calcBackProject() to this function, but better results can be obtained if you pre-filter the back projection and remove the noise (e.g. by retrieving connected components with findContours(), throwing away contours with small area ( contourArea() ) and rendering the remaining contours with drawContours()).
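A sketch of one CamShift tracking iteration (hue, hist, phranges and trackWindow are assumed to have been prepared as in the camshiftdemo.c sample):

// back-project the hue histogram and let CamShift adapt the window
Mat backproj;
calcBackProject(&hue, 1, 0, hist, backproj, &phranges);
RotatedRect trackBox = CamShift(backproj, trackWindow,
    TermCriteria(TermCriteria::EPS | TermCriteria::COUNT, 10, 1));
ellipse(image, trackBox, Scalar(0, 0, 255), 3, CV_AA);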

KalmanFilter
Kalman filter class.

class KalmanFilter
{
public:
    KalmanFilter();
    KalmanFilter(int dynamParams, int measureParams, int controlParams=0);
    void init(int dynamParams, int measureParams, int controlParams=0);

    // predicts statePre from statePost
    const Mat& predict(const Mat& control=Mat());
    // corrects statePre based on the input measurement vector
    // and stores the result to statePost
    const Mat& correct(const Mat& measurement);

    Mat statePre;            // predicted state (x'(k)): x'(k)=A*x(k-1)+B*u(k)
    Mat statePost;           // corrected state (x(k)): x(k)=x'(k)+K(k)*(z(k)-H*x'(k))
    Mat transitionMatrix;    // state transition matrix (A)
    Mat controlMatrix;       // control matrix (B) (it is not used if there is no control)
    Mat measurementMatrix;   // measurement matrix (H)
    Mat processNoiseCov;     // process noise covariance matrix (Q)
    Mat measurementNoiseCov; // measurement noise covariance matrix (R)
    Mat errorCovPre;         // priori error estimate covariance matrix (P'(k)): P'(k)=A*P(k-1)*At + Q
    Mat gain;                // Kalman gain matrix (K(k)): K(k)=P'(k)*Ht*inv(H*P'(k)*Ht+R)
    Mat errorCovPost;        // posteriori error estimate covariance matrix (P(k)): P(k)=(I-K(k)*H)*P'(k)
    ...
};

The class implements the standard Kalman filter http://en.wikipedia.org/wiki/Kalman_filter . However, you can modify transitionMatrix, controlMatrix and measurementMatrix to get the extended Kalman filter functionality. See the OpenCV sample kalman.c.

highgui. High-level GUI and Media I/O
While OpenCV was designed for use in full-scale applications and can be used within functionally rich UI frameworks (such as Qt, WinForms or Cocoa) or without any UI at all, sometimes there is a need to try some functionality quickly and visualize the results. This is what the HighGUI module has been designed for. It provides an easy interface to:
• create and manipulate windows that can display images and "remember" their content (no need to handle repaint events from the OS)
• add trackbars to the windows, handle simple mouse events as well as keyboard commands
• read and write images to/from disk or memory
• read video from a camera or file and write video to a file.

User Interface
cv::createTrackbar
int createTrackbar(const string& trackbarname, const string& winname, int* value, int count, TrackbarCallback onChange CV_DEFAULT(0), void* userdata CV_DEFAULT(0)) Creates a trackbar and attaches it to the specified window.
Parameters:
o trackbarname – Name of the created trackbar.
o winname – Name of the window which will be used as a parent of the created trackbar.
o value – The optional pointer to an integer variable whose value will reflect the position of the slider. Upon creation, the slider position is defined by this variable.
o count – The maximal position of the slider. The minimal position is always 0.
o onChange – Pointer to the function to be called every time the slider changes position. This function should be prototyped as void Foo(int, void*), where the first parameter is the trackbar position and the second parameter is the user data (see the next parameter). If the callback is a NULL pointer, no callbacks are called, but only the value is updated.
o userdata – The user data that is passed as-is to the callback; it can be

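As a hedged illustration (not from the original text), a typical predict/correct cycle for a 1D constant-velocity model might look like the sketch below; readSensor() is a hypothetical measurement source:

    // Minimal sketch: state = [position; velocity], measurement = position.
    KalmanFilter KF(2, 1, 0);
    KF.transitionMatrix = (Mat_<float>(2, 2) << 1, 1,
                                                0, 1);      // A
    setIdentity(KF.measurementMatrix);                      // H = [1 0]
    setIdentity(KF.processNoiseCov, Scalar::all(1e-5));     // Q
    setIdentity(KF.measurementNoiseCov, Scalar::all(1e-1)); // R
    setIdentity(KF.errorCovPost, Scalar::all(1));           // P(0)

    Mat measurement(1, 1, CV_32F);
    for(;;)
    {
        Mat prediction = KF.predict();           // x'(k)
        measurement.at<float>(0) = readSensor(); // hypothetical measurement source
        Mat estimate = KF.correct(measurement);  // x(k)
        // use estimate.at<float>(0) (position) and estimate.at<float>(1) (velocity)
    }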
High-level GUI and Media I/O¶
While OpenCV was designed for use in full-scale applications and can be used within functionally rich UI frameworks (such as Qt, WinForms or Cocoa) or without any UI at all, sometimes there is a need to try some functionality quickly and visualize the results. This is what the HighGUI module has been designed for. It provides an easy interface to:
• create and manipulate windows that can display images and "remember" their content (no need to handle repaint events from the OS)
• add trackbars to the windows, handle simple mouse events as well as keyboard commands
• read and write images to/from disk or memory
• read video from a camera or file and write video to a file

User Interface¶

cv::createTrackbar Comments from the Wiki
int createTrackbar(const string& trackbarname, const string& winname, int* value, int count, TrackbarCallback onChange CV_DEFAULT(0), void* userdata CV_DEFAULT(0))
Creates a trackbar and attaches it to the specified window.
Parameters: o trackbarname– Name of the created trackbar.
o winname– Name of the window which will be used as a parent of the created trackbar.
o value– The optional pointer to an integer variable, whose value will reflect the position of the slider. Upon creation, the slider position is defined by this variable.
o count– The maximal position of the slider. The minimal position is always 0.
o onChange– Pointer to the function to be called every time the slider changes position. This function should be prototyped as void Foo(int, void*); , where the first parameter is the trackbar position and the second parameter is the user data (see the next parameter). If the callback is a NULL pointer, then no callbacks are called, but only value is updated.
o userdata– The user data that is passed as-is to the callback; it can be used to handle trackbar events without using global variables.
The function createTrackbar creates a trackbar (a.k.a. slider or range control) with the specified name and range, assigns a variable value to be synchronized with the trackbar position and specifies a callback function onChange to be called on the trackbar position change. The created trackbar is displayed on the top of the given window.
[Qt Backend Only] qt-specific details:
• winname– Name of the window which will be used as a parent for the created trackbar. Can be NULL if the trackbar should be attached to the control panel. The created trackbar is displayed at the bottom of the given window if winname is correctly provided, or displayed on the control panel if winname is NULL. By clicking on the label of each trackbar, it is possible to edit the trackbar's value manually for a more accurate control of it.

cv::getTrackbarPos Comments from the Wiki
int getTrackbarPos(const string& trackbarname, const string& winname)
Returns the trackbar position.
Parameters: o trackbarname– Name of the trackbar.
o winname– Name of the window which is the parent of the trackbar.
The function returns the current position of the specified trackbar.
[Qt Backend Only] qt-specific details:
• winname– Name of the window which is the parent of the trackbar. Can be NULL if the trackbar is attached to the control panel.

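A hedged sketch of typical usage (not from the original manual): a trackbar drives a threshold applied to a loaded image. The names on_trackbar and thresh_slider are placeholders, and "lena.jpg" is assumed to exist:

    #include "cv.h"
    #include "highgui.h"
    using namespace cv;

    Mat src, dst;
    int thresh_slider = 128;             // synchronized with the trackbar

    void on_trackbar(int pos, void*)     // called whenever the slider moves
    {
        threshold(src, dst, pos, 255, THRESH_BINARY);
        imshow("threshold demo", dst);
    }

    int main()
    {
        src = imread("lena.jpg", 0);     // load as grayscale
        namedWindow("threshold demo", CV_WINDOW_AUTOSIZE);
        createTrackbar("thresh", "threshold demo", &thresh_slider, 255, on_trackbar);
        on_trackbar(thresh_slider, 0);   // draw the initial state
        waitKey();
        return 0;
    }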
cv::namedWindow Comments from the Wiki
void namedWindow(const string& winname, int flags)
Creates a window.
Parameters: o name– Name of the window in the window caption that may be used as a window identifier.
o flags– Flags of the window. Currently the only supported flag is CV_WINDOW_AUTOSIZE . If this is set, the window size is automatically adjusted to fit the displayed image (see imshow ), and the user can not change the window size manually.
The function namedWindow creates a window which can be used as a placeholder for images and trackbars. Created windows are referred to by their names. If a window with the same name already exists, the function does nothing.
[Qt Backend Only] qt-specific details:
• flags– Flags of the window. This parameter is optional. Currently the supported flags are:
o CV_WINDOW_NORMAL or CV_WINDOW_AUTOSIZE: CV_WINDOW_NORMAL lets the user resize the window, whereas CV_WINDOW_AUTOSIZE adjusts the window's size automatically to fit the displayed image (see imshow ), and the user can not change the window size manually.
o CV_WINDOW_FREERATIO or CV_WINDOW_KEEPRATIO: CV_WINDOW_FREERATIO adjusts the image without respecting its ratio, whereas CV_WINDOW_KEEPRATIO keeps the image's ratio.
o CV_GUI_NORMAL or CV_GUI_EXPANDED: CV_GUI_NORMAL is the old way to draw the window without statusbar and toolbar, whereas CV_GUI_EXPANDED is the new enhanced GUI.
The default flags set for a new window are CV_WINDOW_AUTOSIZE , CV_WINDOW_KEEPRATIO , and CV_GUI_EXPANDED . However, if you want to modify the flags, you can combine them using the OR operator, i.e.: namedWindow("myWindow", CV_WINDOW_NORMAL | CV_GUI_NORMAL );

cv::imshow Comments from the Wiki
void imshow(const string& winname, const Mat& image)
Displays the image in the specified window.
Parameters: o winname– Name of the window.
o image– Image to be shown.
The function imshow displays the image in the specified window. If the window was created with the CV_WINDOW_AUTOSIZE flag then the image is shown with its original size, otherwise the image is scaled to fit in the window. The function may scale the image, depending on its depth:
• If the image is 8-bit unsigned, it is displayed as is.
• If the image is 16-bit unsigned or 32-bit integer, the pixels are divided by 256. That is, the value range [0,255*256] is mapped to [0,255].
• If the image is 32-bit floating-point, the pixel values are multiplied by 255. That is, the value range [0,1] is mapped to [0,255].

cv::setTrackbarPos Comments from the Wiki
void setTrackbarPos(const string& trackbarname, const string& winname, int pos)
Sets the trackbar position.
Parameters: o trackbarname– Name of the trackbar.
o winname– Name of the window which is the parent of the trackbar.
o pos– The new position.
The function sets the position of the specified trackbar in the specified window.
[Qt Backend Only] qt-specific details:
• winname– Name of the window which is the parent of the trackbar. Can be NULL if the trackbar is attached to the control panel.

cv::waitKey Comments from the Wiki
int waitKey(int delay=0)
Waits for a pressed key.
Parameters: o delay– Delay in milliseconds. 0 is the special value that means "forever".
The function waitKey waits for a key event infinitely (when delay <= 0) or for delay milliseconds, when it is positive. Returns the code of the pressed key or -1 if no key was pressed before the specified time had elapsed.

Note: This function is the only method in HighGUI that can fetch and handle events, so it needs to be called periodically for normal event processing, unless HighGUI is used within some environment that takes care of event processing.
Note 2: The function only works if there is at least one HighGUI window created and the window is active. If there are several HighGUI windows, any of them can be active.

Reading and Writing Images and Video¶

cv::imdecode Comments from the Wiki
Mat imdecode(const Mat& buf, int flags)
Reads an image from a buffer in memory.
Parameters: o buf– The input array or vector of bytes
o flags– The same flags as in imread
The function reads an image from the specified buffer in memory. If the buffer is too short or contains invalid data, the empty matrix ( Mat::data==NULL ) will be returned. See imread for the list of supported formats and the flags description.

cv::imencode Comments from the Wiki
bool imencode(const string& ext, const Mat& img, vector<uchar>& buf, const vector<int>& params=vector<int>())
Encodes an image into a memory buffer.
Parameters: o ext– The file extension that defines the output format
o img– The image to be written
o buf– The output buffer, which is resized to fit the compressed image
o params– The format-specific parameters, see imwrite
The function compresses the image and stores it in the memory buffer. See imwrite for the list of supported formats and the flags description.

cv::imread Comments from the Wiki
Mat imread(const string& filename, int flags=1)
Loads an image from a file.
Parameters: o filename– Name of the file to be loaded.
o flags– Specifies the color type of the loaded image:
>0 the loaded image is forced to be a 3-channel color image
=0 the loaded image is forced to be grayscale
<0 the loaded image will be loaded as-is
(note that in the current implementation the alpha channel, if any, is stripped from the output image; e.g. a 4-channel RGBA image will be loaded as RGB if flags >= 0 ).
The function imread loads an image from the specified file and returns it. If the image can not be read (because of a missing file, improper permissions, or an unsupported or invalid format), the function returns an empty matrix ( Mat::data==NULL ). Currently, the following file formats are supported:

• Windows bitmaps - *.bmp, *.dib (always supported)
• JPEG files - *.jpeg, *.jpg, *.jpe (see Note2)
• JPEG 2000 files - *.jp2 (see Note2)
• Portable Network Graphics - *.png (see Note2)
• Portable image format - *.pbm, *.pgm, *.ppm (always supported)
• Sun rasters - *.sr, *.ras (always supported)
• TIFF files - *.tiff, *.tif (see Note2)
Note1: The function determines the type of the image by the content, not by the file extension.
Note2: On Windows and MacOSX the image codecs shipped with OpenCV (libjpeg, libpng, libtiff and libjasper) are used by default, so OpenCV can always read JPEGs, PNGs and TIFFs. On MacOSX there is also the option to use native MacOSX image readers. But beware that currently these native image loaders give images with somewhat different pixel values, because of the color management embedded into MacOSX. On Linux, BSD flavors and other Unix-like open-source operating systems OpenCV looks for the image codecs supplied with the OS. Please install the relevant packages (do not forget the development files, e.g. "libjpeg-dev" etc. in Debian and Ubuntu) in order to get the codec support, or turn on the OPENCV_BUILD_3RDPARTY_LIBS flag in CMake.

cv::imwrite Comments from the Wiki
bool imwrite(const string& filename, const Mat& img, const vector<int>& params=vector<int>())
Saves an image to a specified file.
Parameters: o filename– Name of the file.
o img– The image to be saved.
o params– The format-specific save parameters, encoded as pairs paramId_1, paramValue_1, paramId_2, paramValue_2, ... . The following parameters are currently supported:
In the case of JPEG it can be a quality ( CV_IMWRITE_JPEG_QUALITY ), from 0 to 100 (the higher is the better), 95 by default.
In the case of PNG it can be the compression level ( CV_IMWRITE_PNG_COMPRESSION ), from 0 to 9 (a higher value means smaller size and longer compression time), 3 by default.
In the case of PPM, PGM or PBM it can be a binary format flag ( CV_IMWRITE_PXM_BINARY ), 0 or 1, 1 by default.
The function imwrite saves the image to the specified file. The image format is chosen based on the filename extension, see imread for the list of extensions. Only 8-bit (or 16-bit in the case of PNG, JPEG 2000 and TIFF) single-channel or 3-channel (with 'BGR' channel order) images can be saved using this function. If the format, depth or channel order is different, use Mat::convertTo and cvtColor to convert it before saving, or use the universal XML I/O functions to save the image to XML or YAML format.

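A hedged sketch (not from the original text) of passing format-specific parameters: saving a JPEG at an explicit quality, then encoding the same image into an in-memory PNG buffer and decoding it back. The Mat img is assumed to hold a valid 8-bit image:

    // Save with explicit JPEG quality.
    vector<int> jpeg_params;
    jpeg_params.push_back(CV_IMWRITE_JPEG_QUALITY);
    jpeg_params.push_back(90);                      // quality 0..100
    imwrite("out.jpg", img, jpeg_params);

    // Encode to a PNG buffer in memory and decode it again.
    vector<int> png_params;
    png_params.push_back(CV_IMWRITE_PNG_COMPRESSION);
    png_params.push_back(9);                        // maximal compression
    vector<uchar> buf;
    imencode(".png", img, buf, png_params);         // compressed bytes in buf
    Mat decoded = imdecode(Mat(buf), 1);            // back to a color image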
h" using namespace cv. // grab the next frame or a set of frames from a multi-head camera. }. . double value).h" #include "highgui. virtual VideoCapture& operator >> (Mat& image). // the constructor that opens video file VideoCapture(const string& filename). // closes the camera stream or the video file // (automatically called by the destructor) virtual void release().. // opens the specified video file virtual bool open(const string& filename). Here is how the class can be used: #include "cv. // returns false if there are no more frames virtual bool grab(). // sets the specified property propId to the specified value virtual bool set(int propId. The class provides C++ video capturing API. // the constructor that starts streaming from the camera VideoCapture(int device). int channel=0). // returns true if the file was open successfully or if the camera // has been initialized succesfully virtual bool isOpened() const..VideoCapture Class for video capturing from video files or cameras class VideoCapture { public: // the default constructor VideoCapture(). // reads the frame from the specified video stream // (non-zero channel is only valid for multi-head camera live streams) virtual bool retrieve(Mat& image. // equivalent to grab() + retrieve(image. 0). // starts streaming from the specified camera by its id virtual bool open(int device). // the destructor virtual ~VideoCapture(). // retrieves value of the specified property virtual double get(int propId). protected: .

#include "cv.h"
#include "highgui.h"

using namespace cv;

int main(int, char**)
{
    VideoCapture cap(0); // open the default camera
    if(!cap.isOpened())  // check if we succeeded
        return -1;

    Mat edges;
    namedWindow("edges", 1);
    for(;;)
    {
        Mat frame;
        cap >> frame; // get a new frame from camera
        cvtColor(frame, edges, CV_BGR2GRAY);
        GaussianBlur(edges, edges, Size(7, 7), 1.5, 1.5);
        Canny(edges, edges, 0, 30, 3);
        imshow("edges", edges);
        if(waitKey(30) >= 0) break;
    }
    // the camera will be deinitialized automatically in VideoCapture destructor
    return 0;
}

cv::VideoCapture::VideoCapture Comments from the Wiki
VideoCapture::VideoCapture()
VideoCapture::VideoCapture(const string& filename)
VideoCapture::VideoCapture(int device)
Parameters: o filename– TOWRITE
o device– TOWRITE
VideoCapture constructors.

cv::VideoCapture::get Comments from the Wiki
double VideoCapture::get(int property_id)
Parameters: o property_id– Property identifier. Can be one of the following:
CV_CAP_PROP_POS_MSEC Film current position in milliseconds or video capture timestamp
CV_CAP_PROP_POS_FRAMES 0-based index of the frame to be decoded/captured next
CV_CAP_PROP_POS_AVI_RATIO Relative position of the video file (0 - start of the film, 1 - end of the film)
CV_CAP_PROP_FRAME_WIDTH Width of the frames in the video stream
CV_CAP_PROP_FRAME_HEIGHT Height of the frames in the video stream
CV_CAP_PROP_FPS Frame rate

CV_CAP_PROP_FOURCC 4-character code of codec
CV_CAP_PROP_FRAME_COUNT Number of frames in the video file
CV_CAP_PROP_FORMAT The format of the Mat objects returned by retrieve()
CV_CAP_PROP_MODE A backend-specific value indicating the current capture mode
CV_CAP_PROP_BRIGHTNESS Brightness of the image (only for cameras)
CV_CAP_PROP_CONTRAST Contrast of the image (only for cameras)
CV_CAP_PROP_SATURATION Saturation of the image (only for cameras)
CV_CAP_PROP_HUE Hue of the image (only for cameras)
CV_CAP_PROP_GAIN Gain of the image (only for cameras)
CV_CAP_PROP_EXPOSURE Exposure (only for cameras)
CV_CAP_PROP_CONVERT_RGB Boolean flag indicating whether images should be converted to RGB
CV_CAP_PROP_WHITE_BALANCE Currently unsupported
CV_CAP_PROP_RECTIFICATION TOWRITE (note: only supported by the DC1394 v 2.x backend currently)
Note that when querying a property which is unsupported by the backend used by the VideoCapture class, the value 0 is returned.

cv::VideoCapture::set Comments from the Wiki
bool VideoCapture::set(int property_id, double value)
Parameters: o property_id– Property identifier. Can be one of the following:
CV_CAP_PROP_POS_MSEC Film current position in milliseconds or video capture timestamp
CV_CAP_PROP_POS_FRAMES 0-based index of the frame to be decoded/captured next
CV_CAP_PROP_POS_AVI_RATIO Relative position of the video file (0 - start of the film, 1 - end of the film)
CV_CAP_PROP_FRAME_WIDTH Width of the frames in the video stream
CV_CAP_PROP_FRAME_HEIGHT Height of the frames in the video stream
CV_CAP_PROP_FPS Frame rate
CV_CAP_PROP_FOURCC 4-character code of codec
CV_CAP_PROP_FRAME_COUNT Number of frames in the video file
CV_CAP_PROP_FORMAT The format of the Mat objects returned by retrieve()
CV_CAP_PROP_MODE A backend-specific value indicating the current capture mode
CV_CAP_PROP_BRIGHTNESS Brightness of the image (only for cameras)
CV_CAP_PROP_CONTRAST Contrast of the image (only for cameras)
CV_CAP_PROP_SATURATION Saturation of the image (only for cameras)
CV_CAP_PROP_HUE Hue of the image (only for cameras)
CV_CAP_PROP_GAIN Gain of the image (only for cameras)
CV_CAP_PROP_EXPOSURE Exposure (only for cameras)
CV_CAP_PROP_CONVERT_RGB Boolean flag indicating whether images should be converted to RGB
CV_CAP_PROP_WHITE_BALANCE Currently unsupported
CV_CAP_PROP_RECTIFICATION TOWRITE (note: only supported by the DC1394 v 2.x backend currently)
o value– Value of the property.
Sets a property in the VideoCapture backend.

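As a hedged illustration (not in the original manual), querying and adjusting capture properties might look like this; whether a given property actually takes effect depends on the backend:

    VideoCapture cap(0);
    // Ask the backend for the current frame geometry and rate;
    // unsupported properties are reported as 0.
    double w   = cap.get(CV_CAP_PROP_FRAME_WIDTH);
    double h   = cap.get(CV_CAP_PROP_FRAME_HEIGHT);
    double fps = cap.get(CV_CAP_PROP_FPS);

    // Request a different capture size; the backend may ignore or round it,
    // so read the property back to see what was actually applied.
    cap.set(CV_CAP_PROP_FRAME_WIDTH, 320);
    cap.set(CV_CAP_PROP_FRAME_HEIGHT, 240);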
VideoWriter Comments from the Wiki
VideoWriter
Video writer class

class VideoWriter
{
public:
    // default constructor
    VideoWriter();
    // constructor that calls open
    VideoWriter(const string& filename, int fourcc, double fps, Size frameSize, bool isColor=true);
    // the destructor
    virtual ~VideoWriter();

    // opens the file and initializes the video writer:
    // filename - the output file name;
    // fourcc - the codec;
    // fps - the number of frames per second;
    // frameSize - the video frame size;
    // isColor - specifies whether the video stream is color or grayscale
    virtual bool open(const string& filename, int fourcc, double fps, Size frameSize, bool isColor=true);

    // returns true if the writer has been initialized successfully
    virtual bool isOpened() const;

    // writes the next video frame to the stream
    virtual VideoWriter& operator << (const Mat& image);

protected:
    ...
};

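A hedged sketch of writing video (not from the original reference): the fourcc, frame rate and frame count below are examples only, and the chosen codec must be available on the system:

    #include "cv.h"
    #include "highgui.h"
    using namespace cv;

    int main()
    {
        VideoCapture cap(0);
        if(!cap.isOpened()) return -1;
        Mat frame;
        cap >> frame;                                // grab one frame to learn the size
        VideoWriter writer("out.avi",
                           CV_FOURCC('M','J','P','G'), // example codec
                           25, frame.size());          // example frame rate
        if(!writer.isOpened()) return -1;
        for(int i = 0; i < 100; i++)
        {
            cap >> frame;
            writer << frame;                         // append the frame to the file
        }
        return 0;
    }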
Qt new functions¶
This figure explains the new functionalities implemented with the Qt GUI. As we can see, the new GUI provides a statusbar, a toolbar, and a control panel. The control panel can have trackbars and buttonbars attached to it.

• To attach a trackbar, the window_name parameter must be NULL.
• To attach a buttonbar, a button must be created. If the last bar attached to the control panel is a buttonbar, the new button is added on the right of the last button. If the last bar attached to the control panel is a trackbar, or the control panel is empty, a new buttonbar is created. Then a new button is attached to it.
The following code is an example used to generate the figure.

int main(int argc, char *argv[])
{
    int value = 50;
    int value2 = 0;

    cvNamedWindow("main1", CV_WINDOW_NORMAL);
    cvNamedWindow("main2", CV_WINDOW_AUTOSIZE | CV_GUI_NORMAL);

    cvCreateTrackbar("track1", "main1", &value, 255, NULL); //OK tested
    char* nameb1 = "button1";
    char* nameb2 = "button2";
    cvCreateButton(nameb1, callbackButton, nameb1, CV_CHECKBOX, 1);
    cvCreateButton(nameb2, callbackButton, nameb2, CV_CHECKBOX, 0);
    cvCreateTrackbar("track2", NULL, &value2, 255, NULL);
    cvCreateButton("button5", callbackButton1, NULL, CV_RADIOBOX, 0);
    cvCreateButton("button6", callbackButton2, NULL, CV_RADIOBOX, 1);

    cvSetMouseCallback("main2", on_mouse, NULL);

    IplImage* img1 = cvLoadImage("files/flower.jpg");
    IplImage* img2 = cvCreateImage(cvGetSize(img1), 8, 3);
    CvCapture* video = cvCaptureFromFile("files/hockey.avi");
    IplImage* img3 = cvCreateImage(cvGetSize(cvQueryFrame(video)), 8, 3);

    while( cvWaitKey(33) != 27 )
    {
        cvAddS(img1, cvScalarAll(value), img2);
        cvAddS(cvQueryFrame(video), cvScalarAll(value2), img3);
        cvShowImage("main1", img2);
        cvShowImage("main2", img3);
    }

    cvReleaseImage(&img1);
    cvReleaseImage(&img2);
    cvReleaseImage(&img3);
    cvReleaseCapture(&video);
    cvDestroyAllWindows();

    return 0;
}

cv::setWindowProperty Comments from the Wiki
void setWindowProperty(const string& name, int prop_id, double prop_value)
Changes the parameters of the window dynamically.
Parameters: o name– Name of the window.

o prop_id– Window's property to edit. The operation flags:
CV_WND_PROP_FULLSCREEN Change if the window is fullscreen ( CV_WINDOW_NORMAL or CV_WINDOW_FULLSCREEN ).
CV_WND_PROP_AUTOSIZE Change if the user can resize the window ( CV_WINDOW_NORMAL or CV_WINDOW_AUTOSIZE ).
CV_WND_PROP_ASPECTRATIO Change if the image's aspect ratio is preserved ( CV_WINDOW_FREERATIO or CV_WINDOW_KEEPRATIO ).
o prop_value– New value of the window's property. The operation flags:
CV_WINDOW_NORMAL Change the window to normal size, or allow the user to resize the window.
CV_WINDOW_AUTOSIZE The user cannot resize the window; the size is constrained by the image displayed.
CV_WINDOW_FULLSCREEN Change the window to fullscreen.
CV_WINDOW_FREERATIO The image expands as much as it can (no ratio constraint).
CV_WINDOW_KEEPRATIO The ratio of the image is respected.
The function setWindowProperty allows to change the window's properties.

cv::getWindowProperty Comments from the Wiki
void getWindowProperty(const char* name, int prop_id)
Gets the parameters of the window.
Parameters: o name– Name of the window.
o prop_id– Window's property to retrieve. The operation flags:
CV_WND_PROP_FULLSCREEN Change if the window is fullscreen ( CV_WINDOW_NORMAL or CV_WINDOW_FULLSCREEN ).
CV_WND_PROP_AUTOSIZE Change if the user can resize the window ( CV_WINDOW_NORMAL or CV_WINDOW_AUTOSIZE ).
CV_WND_PROP_ASPECTRATIO Change if the image's aspect ratio is preserved ( CV_WINDOW_FREERATIO or CV_WINDOW_KEEPRATIO ).
The function getWindowProperty returns the window's properties. See setWindowProperty to know the meaning of the returned values.

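A hedged sketch (not from the original text) of toggling a window to fullscreen and back with setWindowProperty; this assumes the Qt backend is available and img holds an image to display:

    namedWindow("view", CV_WINDOW_NORMAL);   // must be resizable
    imshow("view", img);
    // Switch to fullscreen...
    setWindowProperty("view", CV_WND_PROP_FULLSCREEN, CV_WINDOW_FULLSCREEN);
    waitKey(0);
    // ...and back to a normal window.
    setWindowProperty("view", CV_WND_PROP_FULLSCREEN, CV_WINDOW_NORMAL);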
cv::fontQt Comments from the Wiki
CvFont fontQt(const string& nameFont, int pointSize = -1, Scalar color = Scalar::all(0), int weight = CV_FONT_NORMAL, int style = CV_STYLE_NORMAL, int spacing = 0)
Creates the font to be used to draw text on an image.
Parameters: o nameFont– Name of the font. The name should match the name of a system font (such as ``Times''). If the font is not found, a default one will be used.
o pointSize– Size of the font. If not specified, equal to zero or negative, the point size of the font is set to a system-dependent default value. Generally, this is 12 points.
o color– Color of the font in BGRA – A = 255 is fully transparent. Use the macro CV_RGB for simplicity.
o weight– The operation flags:
CV_FONT_LIGHT Weight of 25
CV_FONT_NORMAL Weight of 50
CV_FONT_DEMIBOLD Weight of 63
CV_FONT_BOLD Weight of 75
CV_FONT_BLACK Weight of 87
You can also specify a positive integer for more control.
o style– The operation flags:
CV_STYLE_NORMAL Font is normal
CV_STYLE_ITALIC Font is in italic
CV_STYLE_OBLIQUE Font is oblique
o spacing– Spacing between characters. Can be negative or positive.
The function fontQt creates a CvFont object. This CvFont is not compatible with putText. A basic usage of this function is:
CvFont font = fontQt(``Times'');
addText( img1, ``Hello World !'', Point(50,50), font);

cv::addText Comments from the Wiki
void addText(const Mat& img, const string& text, Point location, CvFont *font)
Draws text on an image using a specific font.
Parameters: o img– Image where the text should be drawn
o text– Text to write on the image
o location– Point(x,y) where the text should start on the image
o font– Font to use to draw the text
The function addText draws text on the image img using a specific font font (see example fontQt ).

cv::displayOverlay Comments from the Wiki
void displayOverlay(const string& name, const string& text, int delay)
Displays text on the window's image as an overlay for delay milliseconds. This is not editing the image's data. The text is displayed on the top of the image.
Parameters: o name– Name of the window
o text– Overlay text to write on the window's image
o delay– Delay to display the overlay text. If this value is zero, the text never disappears.
The function displayOverlay aims at displaying useful information/tips on the window for a certain amount of time delay . If this function is called before the previous overlay text times out, the timer is restarted and the text updated.

cv::displayStatusBar Comments from the Wiki
void displayStatusBar(const string& name, const string& text, int delayms)
Displays text on the window's statusbar for delay milliseconds.
Parameters: o name– Name of the window
o text– Text to write on the window's statusbar
o delay– Delay to display the text. If this value is zero, the text never disappears.
The function displayStatusBar aims at displaying useful information/tips on the window for a certain amount of time delay . This information is displayed on the window's statusbar (the window must be created with the CV_GUI_EXPANDED flags). If this function is called before the previous text times out, the timer is restarted and the text updated.

cv::createOpenGLCallback Comments from the Wiki
void createOpenGLCallback(const string& window_name, OpenGLCallback callbackOpenGL, void* userdata CV_DEFAULT(NULL), double angle CV_DEFAULT(-1), double zmin CV_DEFAULT(-1), double zmax CV_DEFAULT(-1))
Creates a callback function called to draw OpenGL on top of the image displayed by window_name.
Parameters: o window_name– Name of the window
o callbackOpenGL– Pointer to the function to be called every frame. This function should be prototyped as void Foo(void*).
o userdata– Pointer passed to the callback function. (Optional)
o angle– Specifies the field of view angle, in degrees, in the y direction. (Optional - Default 45 degrees)
o zmin– Specifies the distance from the viewer to the near clipping plane (always positive). (Optional - Default 0.01)
o zmax– Specifies the distance from the viewer to the far clipping plane (always positive). (Optional - Default 1000)
The function createOpenGLCallback can be used to draw 3D data on the window. An example of callback could be:

void on_opengl(void* param)
{
    glLoadIdentity();
    glTranslated(0.0, 0.0, -1.0);
    glRotatef( 55, 1, 0, 0 );
    glRotatef( 45, 0, 1, 0 );
    glRotatef( 0, 0, 0, 1 );

    static const int coords[6][4][3] = {
        { { +1, -1, -1 }, { -1, -1, -1 }, { -1, +1, -1 }, { +1, +1, -1 } },
        { { +1, +1, -1 }, { -1, +1, -1 }, { -1, +1, +1 }, { +1, +1, +1 } },
        { { +1, -1, +1 }, { +1, -1, -1 }, { +1, +1, -1 }, { +1, +1, +1 } },
        { { -1, -1, -1 }, { -1, -1, +1 }, { -1, +1, +1 }, { -1, +1, -1 } },
        { { +1, -1, +1 }, { -1, -1, +1 }, { -1, -1, -1 }, { +1, -1, -1 } },
        { { -1, -1, +1 }, { +1, -1, +1 }, { +1, +1, +1 }, { -1, +1, +1 } }
    };

    for (int i = 0; i < 6; ++i) {
        glColor3ub( i*20, 100+i*10, i*42 );
        glBegin(GL_QUADS);
        for (int j = 0; j < 4; ++j) {
            glVertex3d(0.2 * coords[i][j][0], 0.2 * coords[i][j][1], 0.2 * coords[i][j][2]);
        }
        glEnd();
    }
}

cv::saveWindowParameters Comments from the Wiki
void saveWindowParameters(const string& name)
Saves parameters of the window window_name.
Parameters: o name– Name of the window
The function saveWindowParameters saves the size, location, flags, trackbars' value, zoom and panning location of the window window_name.

cv::loadWindowParameters Comments from the Wiki
void loadWindowParameters(const string& name)
Loads parameters of the window window_name.
Parameters: o name– Name of the window
The function loadWindowParameters loads the size, location, flags, trackbars' value, zoom and panning location of the window window_name.

cv::createButton Comments from the Wiki
createButton(const string& button_name CV_DEFAULT(NULL), ButtonCallback on_change CV_DEFAULT(NULL), void* userdata CV_DEFAULT(NULL), int button_type CV_DEFAULT(CV_PUSH_BUTTON), int initial_button_state CV_DEFAULT(0))
Creates a button attached to the control panel.
Parameters: o button_name– Name of the button (if NULL, the name will be "button <number of button>")
o on_change– Pointer to the function to be called every time the button changes its state. This function should be prototyped as void Foo(int state, void*); ; state is the current state of the button.

o userdata– Pointer passed to the callback function. (Optional)
o button_type– The button type. (Optional – Will be a push button by default.)
• CV_PUSH_BUTTON The button will be a push button.
• CV_CHECKBOX The button will be a checkbox button.
• CV_RADIOBOX The button will be a radiobox button. The radioboxes on the same buttonbar (same line) are exclusive; only one can be selected at a time.
o initial_button_state– Default state of the button. Used for checkbox and radiobox; its value could be 0 or 1. It could be -1 for a push button. (Optional)
The function createButton attaches a button to the control panel. Each button is added to a buttonbar on the right of the last button. A new buttonbar is created if nothing was attached to the control panel before, or if the last element attached to the control panel was a trackbar. Here are various examples of the createButton function call:
createButton(NULL, callbackButton); //create a push button "button 0", that will call callbackButton.
createButton("button2", callbackButton, NULL, CV_CHECKBOX, 0);
createButton("button3", callbackButton, &value);
createButton("button5", callbackButton1, NULL, CV_RADIOBOX);
createButton("button6", callbackButton2, NULL, CV_PUSH_BUTTON, 1);

calib3d. Camera Calibration, Pose Estimation and Stereo¶
Camera Calibration and 3d Reconstruction¶
The functions in this section use the so-called pinhole camera model. That is, a scene view is formed by projecting 3D points into the image plane using a perspective transformation:

    s m' = A [R|t] M'

or

      [u]   [fx  0 cx] [r11 r12 r13 t1] [X]
    s [v] = [ 0 fy cy] [r21 r22 r23 t2] [Y]
      [1]   [ 0  0  1] [r31 r32 r33 t3] [Z]
                                        [1]

where (X, Y, Z) are the coordinates of a 3D point in the world coordinate space, and (u, v) are the coordinates of the projection point in pixels. A is called a camera matrix, or a matrix of intrinsic parameters; (cx, cy) is a principal point (that is usually at the image center), and fx, fy are the focal lengths expressed in pixel-related units. Thus, if an image from the camera is scaled by some factor, all of these parameters should be scaled (multiplied/divided, respectively) by the same factor. The matrix of intrinsic parameters does not depend on the scene viewed and, once estimated, can be re-used (as long as the focal length is fixed (in case of zoom lens)). The joint rotation-translation matrix [R|t] is called a matrix of extrinsic parameters. It is used to describe the camera motion around a static scene, or vice versa, rigid motion of an object in front of a still camera. That is, [R|t] translates coordinates of a point (X, Y, Z) to some coordinate system, fixed with respect to the camera. The transformation above is equivalent to the following (when z != 0):

    [x y z]^T = R [X Y Z]^T + t
    x' = x/z
    y' = y/z
    u = fx*x' + cx
    v = fy*y' + cy

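To make the mapping concrete, here is a hedged numeric sketch (not part of the original text) of the ideal, distortion-free projection; the point and intrinsics are made-up example values:

    // Project one 3D point, given in camera coordinates already
    // (i.e. after applying [R|t]), onto the image plane.
    double X = 0.1, Y = -0.2, Z = 2.0;             // example point, z != 0
    double fx = 525, fy = 525, cx = 320, cy = 240; // example intrinsics
    double xp = X / Z, yp = Y / Z;                 // normalized coordinates
    double u = fx * xp + cx;                       // u = 346.25
    double v = fy * yp + cy;                       // v = 187.5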
Mat& distCoeffs. . Mat& cameraMatrix. and be used for images of need to be scaled appropriately). And they remain the same regardless of the captured image resolution. Size imageSize. const vector<vector<Point2f> >& imagePoints. vector<Mat>& rvecs. So. for example. . . are tangential distortion coefficients.Real lenses usually have some distortion. mostly radial distortion and slight tangential distortion. a few 3D points and their projections. That is. The functions below use the above model to • Project 3D points to the image plane given intrinsic and extrinsic parameters • Compute extrinsic parameters given intrinsic parameters. a camera has been calibrated on images of resolution. it means that . . int flags=0) Finds the camera intrinsic and extrinsic parameters from several views of a calibration pattern.e. • Estimate intrinsic and extrinsic camera parameters from several views of a known calibration pattern (i. . are radial distortion coefficients. . thus they also belong to the intrinsic camera parameters. . absolutely the same distortion coefficients can resolution from the same camera (while . the above model is extended as: . if. • Estimate the relative position and orientation of the stereo camera “heads” and compute the rectification transformation that makes the camera optical axes parallel. In the functions below the coefficients are passed or returned as vector. if the vector contains 4 elements. Higher-order coefficients are not considered in OpenCV. That is. The distortion coefficients do not depend on the scene viewed. cv::calibrateCamera Comments from the Wiki double calibrateCamera(const vector<vector<Point3f> >& objectPoints. vector<Mat>& tvecs. every view is described by several 3D-2D point correspondences).

Parameters: o objectPoints– The vector of vectors of points on the calibration pattern in its coordinate system, one vector per view. If the same calibration pattern is shown in each view and it is fully visible then all the vectors will be the same, although it is possible to use partially occluded patterns, or even different patterns in different views - then the vectors will be different. The points are 3D, but since they are in the pattern coordinate system, then if the rig is planar, it may have sense to put the model to the XY coordinate plane, so that the Z-coordinate of each input object point is 0.
o imagePoints– The vector of vectors of the object point projections on the calibration pattern views, one vector per view. The projections must be in the same order as the corresponding object points.
o imageSize– Size of the image, used only to initialize the intrinsic camera matrix.
o cameraMatrix– The output 3x3 floating-point camera matrix A. If CV_CALIB_USE_INTRINSIC_GUESS and/or CV_CALIB_FIX_ASPECT_RATIO are specified, some or all of fx, fy, cx, cy must be initialized before calling the function.
o distCoeffs– The output vector of distortion coefficients of 4, 5 or 8 elements.
o rvecs– The output vector of rotation vectors (see Rodrigues2 ), estimated for each pattern view. That is, each k-th rotation vector together with the corresponding k-th translation vector (see the next output parameter description) brings the calibration pattern from the model coordinate space (in which object points are specified) to the world coordinate space, i.e. the real position of the calibration pattern in the k-th pattern view (k=0..M-1).
o tvecs– The output vector of translation vectors, estimated for each pattern view.
o flags– Different flags, may be 0 or a combination of the following values:
CV_CALIB_USE_INTRINSIC_GUESS cameraMatrix contains the valid initial values of fx, fy, cx, cy that are optimized further. Otherwise, (cx, cy) is initially set to the image center ( imageSize is used here), and focal distances are computed in some least-squares fashion. Note that if intrinsic parameters are known, there is no need to use this function just to estimate the extrinsic parameters. Use FindExtrinsicCameraParams2 instead.
CV_CALIB_FIX_PRINCIPAL_POINT The principal point is not changed during the global optimization; it stays at the center or at the other location specified when CV_CALIB_USE_INTRINSIC_GUESS is set too.
CV_CALIB_FIX_ASPECT_RATIO The function considers only fy as a free parameter; the ratio fx/fy stays the same as in the input cameraMatrix . When CV_CALIB_USE_INTRINSIC_GUESS is not set, the actual input values of fx and fy are ignored, only their ratio is computed and used further.
CV_CALIB_ZERO_TANGENT_DIST Tangential distortion coefficients will be set to zeros and stay zero.
CV_CALIB_FIX_K1,...,CV_CALIB_FIX_K6 Do not change the corresponding radial distortion coefficient during the optimization. If CV_CALIB_USE_INTRINSIC_GUESS is set, the coefficient from the supplied distCoeffs matrix is used; otherwise it is set to 0.
CV_CALIB_RATIONAL_MODEL Enable coefficients k4, k5 and k6. To provide the backward compatibility, this extra flag should be explicitly specified to make the calibration function use the rational model and return 8 coefficients. If the flag is not set, the function will compute and return only 5 distortion coefficients.

The function estimates the intrinsic camera parameters and extrinsic parameters for each of the views. The coordinates of the 3D object points and their correspondent 2D projections in each view must be specified. That may be achieved by using an object with known geometry and easily detectable feature points. Such an object is called a calibration rig or calibration pattern, and OpenCV has built-in support for a chessboard as a calibration rig (see FindChessboardCorners ). Currently, initialization of intrinsic parameters (when CV_CALIB_USE_INTRINSIC_GUESS is not set) is only implemented for planar calibration patterns (where z-coordinates of the object points must be all 0's). 3D calibration rigs can also be used as long as an initial cameraMatrix is provided. The algorithm does the following:
1. First, it computes the initial intrinsic parameters (the option only available for planar calibration patterns) or reads them from the input parameters. The distortion coefficients are all set to zeros initially (unless some of CV_CALIB_FIX_K? are specified).
2. The initial camera pose is estimated as if the intrinsic parameters have been already known. This is done using FindExtrinsicCameraParams2 .
3. After that the global Levenberg-Marquardt optimization algorithm is run to minimize the reprojection error, i.e. the total sum of squared distances between the observed feature points imagePoints and the projected (using the current estimates for camera parameters and the poses) object points objectPoints , see ProjectPoints2 .
The function returns the final re-projection error.
Note: if you are using a non-square (=non-NxN) grid and findChessboardCorners() for calibration, and calibrateCamera returns bad values (i.e. zero distortion coefficients, an image center very far from the actual one, and/or large differences between fx and fy (ratios of 10:1 or more)), then you have probably used patternSize=cvSize(rows,cols) , but should use patternSize=cvSize(cols,rows) in FindChessboardCorners .
See also: FindChessboardCorners , FindExtrinsicCameraParams2 , initCameraMatrix2D() , StereoCalibrate , Undistort2

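As a hedged end-to-end sketch (not from the original manual), calibrating from several chessboard views might look like this; images is assumed to hold 8-bit views of a 9x6 board, and squareSize is the physical square size in your chosen units:

    Size boardSize(9, 6);                    // inner corners per row, per column
    float squareSize = 1.f;                  // assumed rig square size
    vector<vector<Point3f> > objectPoints;
    vector<vector<Point2f> > imagePoints;

    vector<Point3f> rig;                     // the pattern in its own plane (z=0)
    for(int i = 0; i < boardSize.height; i++)
        for(int j = 0; j < boardSize.width; j++)
            rig.push_back(Point3f(j*squareSize, i*squareSize, 0));

    for(size_t k = 0; k < images.size(); k++)
    {
        vector<Point2f> corners;
        if(findChessboardCorners(images[k], boardSize, corners))
        {
            imagePoints.push_back(corners);
            objectPoints.push_back(rig);     // same rig model for every view
        }
    }

    Mat cameraMatrix, distCoeffs;
    vector<Mat> rvecs, tvecs;
    double rms = calibrateCamera(objectPoints, imagePoints, images[0].size(),
                                 cameraMatrix, distCoeffs, rvecs, tvecs);

The returned rms is the final re-projection error; values around a pixel or below usually indicate a plausible calibration.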
cv::calibrationMatrixValues Comments from the Wiki
void calibrationMatrixValues(const Mat& cameraMatrix, Size imageSize, double apertureWidth, double apertureHeight, double& fovx, double& fovy, double& focalLength, Point2d& principalPoint, double& aspectRatio)
Computes some useful camera characteristics from the camera matrix.
Parameters: o cameraMatrix– The input camera matrix that can be estimated by calibrateCamera() or stereoCalibrate()
o imageSize– The input image size in pixels
o apertureWidth– Physical width of the sensor
o apertureHeight– Physical height of the sensor
o fovx– The output field of view in degrees along the horizontal sensor axis
o fovy– The output field of view in degrees along the vertical sensor axis
o focalLength– The focal length of the lens in mm
o principalPoint– The principal point in pixels
o aspectRatio– fy/fx
The function computes various useful camera characteristics from the previously estimated camera matrix.

cv::composeRT Comments from the Wiki
void composeRT(const Mat& rvec1, const Mat& tvec1, const Mat& rvec2, const Mat& tvec2, Mat& rvec3, Mat& tvec3)
void composeRT(const Mat& rvec1, const Mat& tvec1, const Mat& rvec2, const Mat& tvec2, Mat& rvec3, Mat& tvec3, Mat& dr3dr1, Mat& dr3dt1, Mat& dr3dr2, Mat& dr3dt2, Mat& dt3dr1, Mat& dt3dt1, Mat& dt3dr2, Mat& dt3dt2)
Combines two rotation-and-shift transformations.
Parameters: o rvec1– The first rotation vector
o tvec1– The first translation vector
o rvec2– The second rotation vector
o tvec2– The second translation vector
o rvec3– The output rotation vector of the superposition
o tvec3– The output translation vector of the superposition
o d??d??– The optional output derivatives of rvec3 or tvec3 with respect to rvec? or tvec?
The functions compute:

    rvec3 = rodrigues^-1( rodrigues(rvec2) * rodrigues(rvec1) )
    tvec3 = rodrigues(rvec2) * tvec1 + tvec2

where rodrigues denotes a rotation vector to rotation matrix transformation, and rodrigues^-1 denotes the inverse transformation, see Rodrigues() . Also, the functions can compute the derivatives of the output vectors w.r.t the input vectors (see matMulDeriv() ). The functions are used inside stereoCalibrate() but can also be used in your own code where Levenberg-Marquardt or another gradient-based solver is used to optimize a function that contains matrix multiplication.

cv::computeCorrespondEpilines Comments from the Wiki
void computeCorrespondEpilines(const Mat& points, int whichImage, const Mat& F, vector<Vec3f>& lines)
For points in one image of a stereo pair, computes the corresponding epilines in the other image.
Parameters: o points– The input points: an N x 1 or 1 x N matrix of type CV_32FC2 or vector<Point2f>
o whichImage– Index of the image (1 or 2) that contains the points
o F– The fundamental matrix that can be estimated using FindFundamentalMat or StereoRectify .
o lines– The output vector of the epipolar lines corresponding to the points in the other image. Each line a*x + b*y + c = 0 is encoded by 3 numbers (a, b, c).
For every point in one of the two images of a stereo-pair the function finds the equation of the corresponding epipolar line in the other image. From the fundamental matrix definition (see FindFundamentalMat ), the line l2_i in the second image for the point p1_i in the first image (i.e. when whichImage=1 ) is computed as:

    l2_i = F * p1_i

and, vice versa, when whichImage=2 , l1_i is computed from p2_i as:

    l1_i = F^T * p2_i

Line coefficients are defined up to a scale. They are normalized such that a_i^2 + b_i^2 = 1 .

cv::convertPointsHomogeneous Comments from the Wiki
void convertPointsHomogeneous(const Mat& src, vector<Point3f>& dst)
void convertPointsHomogeneous(const Mat& src, vector<Point2f>& dst)
Convert points to/from homogeneous coordinates.
Parameters: o src– The input array or vector of 2D, 3D or 4D points
o dst– The output vector of 2D or 3D points
The functions convert 2D or 3D points from/to homogeneous coordinates, or simply copy or transpose the array. If the input array dimensionality is larger than the output, each coordinate is divided by the last coordinate:

    (x, y[, z], w) -> (x/w, y/w[, z/w])

If the output array dimensionality is larger, an extra 1 is appended to each point. Otherwise, the input array is simply copied (with optional transposition) to the output.

cv::decomposeProjectionMatrix Comments from the Wiki
void decomposeProjectionMatrix(const Mat& projMatrix, Mat& cameraMatrix, Mat& rotMatrix, Mat& transVect)
void decomposeProjectionMatrix(const Mat& projMatrix, Mat& cameraMatrix, Mat& rotMatrix, Mat& transVect, Mat& rotMatrixX, Mat& rotMatrixY, Mat& rotMatrixZ, Vec3d& eulerAngles)
Decomposes the projection matrix into a rotation matrix and a camera matrix.
Parameters: o projMatrix– The 3x4 input projection matrix P
o cameraMatrix– The output 3x3 camera matrix K
o rotMatrix– The output 3x3 external rotation matrix R
o transVect– The output 4x1 translation vector T
o rotMatrX– Optional 3x3 rotation matrix around x-axis
o rotMatrY– Optional 3x3 rotation matrix around y-axis

o rotMatrZ– Optional 3x3 rotation matrix around z-axis
o eulerAngles– Optional 3 points containing the three Euler angles of rotation
The function computes a decomposition of a projection matrix into a calibration and a rotation matrix and the position of the camera. It optionally returns three rotation matrices, one for each axis, and the three Euler angles that could be used in OpenGL. The function is based on RQDecomp3x3 .

cv::drawChessboardCorners Comments from the Wiki
void drawChessboardCorners(Mat& image, Size patternSize, const Mat& corners, bool patternWasFound)
Renders the detected chessboard corners.
Parameters: o image– The destination image; it must be an 8-bit color image
o patternSize– The number of inner corners per chessboard row and column ( patternSize = cv::Size(points_per_row, points_per_column) = cv::Size(columns, rows) )
o corners– The array of corners detected; this should be the output from findChessboardCorners wrapped in a cv::Mat().
o patternWasFound– Indicates whether the complete board was found or not. One may just pass the return value of FindChessboardCorners here.
The function draws the individual chessboard corners detected as red circles if the board was not found, or as colored corners connected with lines if the board was found.

cv::findChessboardCorners Comments from the Wiki
bool findChessboardCorners(const Mat& image, Size patternSize, vector<Point2f>& corners, int flags=CV_CALIB_CB_ADAPTIVE_THRESH+ CV_CALIB_CB_NORMALIZE_IMAGE)
Finds the positions of the internal corners of the chessboard.
Parameters: o image– Source chessboard view; it must be an 8-bit grayscale or color image
o patternSize– The number of inner corners per chessboard row and column ( patternSize = cvSize(points_per_row, points_per_column) = cvSize(columns, rows) )
o corners– The output array of corners detected
o flags– Various operation flags, can be 0 or a combination of the following values:
CV_CALIB_CB_ADAPTIVE_THRESH use adaptive thresholding to convert the image to black and white, rather than a fixed threshold level (computed from the average image brightness).
CV_CALIB_CB_NORMALIZE_IMAGE normalize the image gamma with EqualizeHist before applying fixed or adaptive thresholding.
CV_CALIB_CB_FILTER_QUADS use additional criteria (like contour area, perimeter, square-like shape) to filter out false quads that are extracted at the contour retrieval stage.
CALIB_CB_FAST_CHECK run a fast check on the image that looks for chessboard corners, and shortcut the call if none are found. This can drastically speed up the call in the degenerate condition when no chessboard is observed.

The function attempts to determine whether the input image is a view of the chessboard pattern and locate the internal chessboard corners. The function returns a non-zero value if all of the corners have been found and they have been placed in a certain order (row by row, left to right in every row); otherwise, if the function fails to find all the corners or reorder them, it returns 0. For example, a regular chessboard has 8 x 8 squares and 7 x 7 internal corners, that is, points where the black squares touch each other. The coordinates detected are approximate, and to determine their position more accurately, the user may use the function FindCornerSubPix .
Sample usage of detecting and drawing chessboard corners:

Size patternsize(8,6); //interior number of corners
Mat gray = ....; //source image
vector<Point2f> corners; //this will be filled by the detected corners

//CALIB_CB_FAST_CHECK saves a lot of time on images
//that don't contain any chessboard corners
bool patternfound = findChessboardCorners(gray, patternsize, corners,
        CALIB_CB_ADAPTIVE_THRESH + CALIB_CB_NORMALIZE_IMAGE
        + CALIB_CB_FAST_CHECK);

if(patternfound)
  cornerSubPix(gray, corners, Size(11, 11), Size(-1, -1),
    TermCriteria(CV_TERMCRIT_EPS + CV_TERMCRIT_ITER, 30, 0.1));

drawChessboardCorners(img, patternsize, Mat(corners), patternfound);

Note: the function requires some white space (like a square-thick border, the wider the better) around the board to make the detection more robust in various environments (otherwise, if there is no border and the background is dark, the outer black squares could not be segmented properly and so the square grouping and ordering algorithm will fail).

cv::solvePnP Comments from the Wiki
void solvePnP(const Mat& objectPoints, const Mat& imagePoints, const Mat& cameraMatrix, const Mat& distCoeffs, Mat& rvec, Mat& tvec, bool useExtrinsicGuess=false)
Finds the object pose from the 3D-2D point correspondences.
Parameters: o objectPoints– The array of object points in the object coordinate space, 3xN or Nx3 1-channel, or 1xN or Nx1 3-channel, where N is the number of points. Can also pass vector<Point3f> here.
o imagePoints– The array of corresponding image points, 2xN or Nx2 1-channel or 1xN or Nx1 2-channel, where N is the number of points. Can also pass vector<Point2f> here.
o cameraMatrix– The input camera matrix A
o distCoeffs– The input vector of distortion coefficients of 4, 5 or 8 elements. If the vector is NULL/empty, the zero distortion coefficients are assumed.
o rvec– The output rotation vector (see Rodrigues2 ) that (together with tvec ) brings points from the model coordinate system to the camera coordinate system
o tvec– The output translation vector
o useExtrinsicGuess– If true (1), the function will use the provided rvec and tvec as the initial approximations of the rotation and translation vectors, respectively, and will further optimize them.
The function estimates the object pose given a set of object points, their corresponding image projections, as well as the camera matrix and the distortion coefficients. This function finds such a pose that minimizes the reprojection error, i.e. the sum of squared distances between the observed projections imagePoints and the projected (using ProjectPoints2 ) objectPoints .

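As a hedged sketch (not part of the original reference), estimating the pose of a known planar rig from detected corners; cameraMatrix and distCoeffs are assumed to come from a prior calibration:

    // Pose of a chessboard: 3D rig points and their detected 2D projections.
    vector<Point3f> objectPts;   // filled with the rig model, z = 0 plane
    vector<Point2f> imagePts;    // filled e.g. by findChessboardCorners()
    // ... fill objectPts and imagePts here ...
    Mat rvec, tvec;
    solvePnP(Mat(objectPts), Mat(imagePts), cameraMatrix, distCoeffs, rvec, tvec);

    Mat R;
    Rodrigues(rvec, R);          // rotation vector -> 3x3 rotation matrix
    // Now [R|tvec] maps rig coordinates into the camera coordinate system.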
cv::findFundamentalMat Comments from the Wiki
Mat findFundamentalMat(const Mat& points1, const Mat& points2, vector<uchar>& status, int method=FM_RANSAC, double param1=3., double param2=0.99)
Mat findFundamentalMat(const Mat& points1, const Mat& points2, int method=FM_RANSAC, double param1=3., double param2=0.99)
Calculates the fundamental matrix from the corresponding points in two images.
Parameters: o points1– Array of N points from the first image. The point coordinates should be floating-point (single or double precision).
o points2– Array of the second image points of the same size and format as points1 .
o method– Method for computing the fundamental matrix:
CV_FM_7POINT for a 7-point algorithm
CV_FM_8POINT for an 8-point algorithm
CV_FM_RANSAC for the RANSAC algorithm
CV_FM_LMEDS for the LMedS algorithm
o param1– The parameter is used for RANSAC. It is the maximum distance from a point to an epipolar line in pixels, beyond which the point is considered an outlier and is not used for computing the final fundamental matrix. It can be set to something like 1-3, depending on the accuracy of the point localization, image resolution and the image noise.
o param2– The parameter is used for the RANSAC or LMedS methods only. It specifies the desirable level of confidence (probability) that the estimated matrix is correct.
o status– The output array of N elements, every element of which is set to 0 for outliers and to 1 for the other points. The array is computed only in the RANSAC and LMedS methods. For other methods it is set to all 1's.
The epipolar geometry is described by the following equation:

    [p2; 1]^T F [p1; 1] = 0

where F is the fundamental matrix, and p1 and p2 are corresponding points in the first and the second images, respectively. The function calculates the fundamental matrix using one of the four methods listed above and returns the found fundamental matrix. Normally just 1 matrix is found, but in the case of the 7-point algorithm the function may return up to 3 solutions (a matrix that stores all 3 matrices sequentially).
The calculated fundamental matrix may be passed further to ComputeCorrespondEpilines that finds the epipolar lines corresponding to the specified points. It can also be passed to StereoRectifyUncalibrated to compute the rectification transformation.

// Example. Estimation of fundamental matrix using the RANSAC algorithm
int point_count = 100;
vector<Point2f> points1(point_count);
vector<Point2f> points2(point_count);

// initialize the points here ...
for( int i = 0; i < point_count; i++ )
{
    points1[i] = ...;
    points2[i] = ...;
}

Mat fundamental_matrix =
    findFundamentalMat(points1, points2, FM_RANSAC, 3, 0.99);

cv::findHomography Comments from the Wiki
Mat findHomography(const Mat& srcPoints, const Mat& dstPoints, Mat& status, int method=0, double ransacReprojThreshold=3)
Mat findHomography(const Mat& srcPoints, const Mat& dstPoints, vector<uchar>& status, int method=0, double ransacReprojThreshold=3)
Mat findHomography(const Mat& srcPoints, const Mat& dstPoints, int method=0, double ransacReprojThreshold=3)
Finds the perspective transformation between two planes.
Parameters: o srcPoints– Coordinates of the points in the original plane, a matrix of type CV_32FC2 or a vector<Point2f> .
o dstPoints– Coordinates of the points in the target plane, a matrix of type CV_32FC2 or a vector<Point2f> .
o method– The method used to compute the homography matrix; one of the following:
0 a regular method using all the points
CV_RANSAC RANSAC-based robust method
CV_LMEDS Least-Median robust method
o ransacReprojThreshold– The maximum allowed reprojection error to treat a point pair as an inlier (used in the RANSAC method only). That is, if the back-projection error of a point pair exceeds ransacReprojThreshold , then that point is considered an outlier. If srcPoints and dstPoints are measured in pixels, it usually makes sense to set this parameter somewhere in the range 1 to 10.
o status– The optional output mask set by a robust method ( CV_RANSAC or CV_LMEDS ). Note that the input mask values are ignored.
The functions find and return the perspective transformation H between the source and the destination planes, so that the back-projection error is minimized. If the parameter method is set to the default value 0, the function uses all the point pairs to compute the initial homography estimate with a simple least-squares scheme.
However, if not all of the point pairs ( srcPoints_i , dstPoints_i ) fit the rigid perspective transformation (i.e. there are some outliers), this initial estimate will be poor. In this case one can use one of the 2 robust methods. Both methods, RANSAC and LMeDS , try many different random subsets of the corresponding point pairs (of 4 pairs each), estimate the homography matrix using this subset and a simple least-squares algorithm and then compute the quality/goodness of the computed homography (which is the number of inliers for RANSAC or the median re-projection error for LMeDs). The best subset is then used to produce the initial estimate of the homography matrix and the mask of inliers/outliers.
Regardless of the method, robust or not, the computed homography matrix is refined further (using inliers only in the case of a robust method) with the Levenberg-Marquardt method in order to reduce the re-projection error even more.
The method RANSAC can handle practically any ratio of outliers, but it needs the threshold to distinguish inliers from outliers. The method LMeDS does not need any threshold, but it works correctly only when there are more than 50 % of inliers. Finally, if you are sure in the computed features, where there can be only some small noise present, but no outliers, the default method could be the best choice.
The function is used to find initial intrinsic and extrinsic matrices. The homography matrix is determined up to a scale, thus it is normalized so that its last element equals 1.
See also: GetAffineTransform , GetPerspectiveTransform , EstimateRigidMotion , WarpPerspective , PerspectiveTransform

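A hedged sketch (not in the original manual) of the typical robust use: matched point pairs go in, an inlier mask and a 3x3 homography come out. srcPts, dstPts, img1 and img2 are assumed to be prepared from feature matching:

    // Robustly fit a homography to noisy correspondences.
    vector<Point2f> srcPts, dstPts;   // assumed filled with matched points
    // ... fill srcPts and dstPts here ...
    vector<uchar> inliers;
    Mat H = findHomography(Mat(srcPts), Mat(dstPts), inliers, CV_RANSAC, 3);

    // Warp the first image onto the second using the estimated transform.
    Mat warped;
    warpPerspective(img1, warped, H, img2.size());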
cv::getDefaultNewCameraMatrix Comments from the Wiki
Mat getDefaultNewCameraMatrix(const Mat& cameraMatrix, Size imgSize=Size(), bool centerPrincipalPoint=false)
Returns the default new camera matrix.
Parameters: o cameraMatrix– The input camera matrix
o imageSize– The camera view image size in pixels
o centerPrincipalPoint– Indicates whether in the new camera matrix the principal point should be at the image center or not
The function returns the camera matrix that is either an exact copy of the input cameraMatrix (when centerPrinicipalPoint=false ), or the modified one (when centerPrincipalPoint=true ). In the latter case the new camera matrix will be:

    [fx  0 (imgSize.width-1)*0.5 ]
    [ 0 fy (imgSize.height-1)*0.5]
    [ 0  0                      1]

where fx and fy are the (0,0) and (1,1) elements of cameraMatrix , respectively. By default, the undistortion functions in OpenCV (see initUndistortRectifyMap , undistort ) do not move the principal point. However, when you work with stereo, it's important to move the principal points in both views to the same y-coordinate (which is required by most of stereo correspondence algorithms), and maybe to the same x-coordinate too. So you can form the new camera matrix for each view, where the principal points will be at the center.

cv::getOptimalNewCameraMatrix Comments from the Wiki
Mat getOptimalNewCameraMatrix(const Mat& cameraMatrix, const Mat& distCoeffs, Size imageSize, double alpha, Size newImageSize=Size(), Rect* validPixROI=0)
Returns the new camera matrix based on the free scaling parameter.
Parameters: o cameraMatrix– The input camera matrix
o distCoeffs– The input vector of distortion coefficients of 4, 5 or 8 elements. If the vector is NULL/empty, the zero distortion coefficients are assumed.
o imageSize– The original image size
o alpha– The free scaling parameter between 0 (when all the pixels in the undistorted image will be valid) and 1 (when all the source image pixels will be retained in the undistorted image)
o newImageSize– The image size after rectification. By default it will be set to imageSize .
o validPixROI– The optional output rectangle that will outline the all-good-pixels region in the undistorted image. See the roi1, roi2 description in StereoRectify .
The function computes and returns the optimal new camera matrix based on the free scaling parameter. By varying this parameter the user may retrieve only sensible pixels alpha=0 , keep all the original image pixels if there is valuable information in the corners alpha=1 , or get something in between. When alpha>0 , the undistortion result will likely have some black pixels corresponding to "virtual" pixels outside of the captured distorted image. The original camera matrix, distortion coefficients, the computed new camera matrix and the newImageSize should be passed to InitUndistortRectifyMap to produce the maps for Remap .

cv::initCameraMatrix2D Comments from the Wiki
Mat initCameraMatrix2D(const vector<vector<Point3f> >& objectPoints, const vector<vector<Point2f> >& imagePoints, Size imageSize, double aspectRatio=1.)
Finds the initial camera matrix from the 3D-2D point correspondences.
Parameters: o objectPoints– The vector of vectors of the object points. See calibrateCamera()
o imagePoints– The vector of vectors of the corresponding image points. See calibrateCamera()
o imageSize– The image size in pixels; used to initialize the principal point
o aspectRatio– If it is zero or negative, both fx and fy are estimated independently. Otherwise fx = fy * aspectRatio .
The function estimates and returns the initial camera matrix for the camera calibration process. Currently, the function only supports planar calibration patterns, i.e. patterns where each object point has z-coordinate =0.

cv::initUndistortRectifyMap Comments from the Wiki
void initUndistortRectifyMap(const Mat& cameraMatrix, const Mat& distCoeffs, const Mat& R, const Mat& newCameraMatrix, Size size, int m1type, Mat& map1, Mat& map2)
Computes the undistortion and rectification transformation map.
Parameters: o cameraMatrix– The input camera matrix
o distCoeffs– The input vector of distortion coefficients of 4, 5 or 8 elements. If the vector is NULL/empty, the zero distortion coefficients are assumed.
o R– The optional rectification transformation in object space (3x3 matrix). R1 or R2 , computed by StereoRectify , can be passed here. If the matrix is empty, the identity transformation is assumed.
o newCameraMatrix– The new camera matrix
o size– The undistorted image size
o m1type– The type of the first output map, can be CV_32FC1 or CV_16SC2 . See convertMaps()
o map1– The first output map
o map2– The second output map
The function computes the joint undistortion+rectification transformation and represents the result in the form of maps for Remap . The undistorted image will look like the original, as if it was captured with a camera with camera matrix =newCameraMatrix and zero distortion. In the case of a monocular camera newCameraMatrix is usually equal to cameraMatrix , or it can be computed by GetOptimalNewCameraMatrix for a better control over scaling. In the case of a stereo camera newCameraMatrix is normally set to P1 or P2 computed by StereoRectify .
Also, this new camera will be oriented differently in the coordinate space, according to R . That, for example, helps to align two heads of a stereo camera so that the epipolar lines on both images become horizontal and have the same y-coordinate (in the case of a horizontally aligned stereo camera).
The function actually builds the maps for the inverse mapping algorithm that is used by Remap . That is, for each pixel (u, v) in the destination (corrected and rectified) image the function computes the corresponding coordinates in the source image (i.e. in the original image from the camera). The process is the following:

    x = (u - c'x)/f'x
    y = (v - c'y)/f'y
    [X Y W]^T = R^-1 [x y 1]^T
    x' = X/W
    y' = Y/W
    x'' = x'(1 + k1 r^2 + k2 r^4 + k3 r^6) + 2 p1 x' y' + p2 (r^2 + 2 x'^2)
    y'' = y'(1 + k1 r^2 + k2 r^4 + k3 r^6) + p1 (r^2 + 2 y'^2) + 2 p2 x' y'
    mapx(u,v) = fx * x'' + cx
    mapy(u,v) = fy * y'' + cy

where (k1, k2, p1, p2[, k3]) are the distortion coefficients, the primed f'x, f'y, c'x, c'y come from newCameraMatrix and the unprimed ones from cameraMatrix . In the case of a stereo camera this function is called twice, once for each camera head, after StereoRectify , which in its turn is called after StereoCalibrate . But if the stereo camera was not calibrated, it is still possible to compute the rectification transformations directly from the fundamental matrix using StereoRectifyUncalibrated . For each camera the function computes homography H as the rectification transformation in pixel domain, not a rotation matrix R in 3D space. The R can be computed from H as

    R = cameraMatrix^-1 * H * cameraMatrix

where the cameraMatrix can be chosen arbitrarily.

cv::matMulDeriv
Comments from the Wiki
void matMulDeriv(const Mat& A, const Mat& B, Mat& dABdA, Mat& dABdB)
Computes partial derivatives of the matrix product w.r.t each multiplied matrix.
Parameters:
o A – The first multiplied matrix
o B – The second multiplied matrix
o dABdA – The first output derivative matrix d(A*B)/dA of size A.rows*B.cols x A.rows*A.cols
o dABdB – The second output derivative matrix d(A*B)/dB of size A.rows*B.cols x B.rows*B.cols
The function computes the partial derivatives of the elements of the matrix product A*B with respect to the elements of each of the two input matrices. The function is used to compute Jacobian matrices in stereoCalibrate(), but can also be used in any other similar optimization function.

cv::projectPoints
Comments from the Wiki
void projectPoints(const Mat& objectPoints, const Mat& rvec, const Mat& tvec, const Mat& cameraMatrix, const Mat& distCoeffs, vector<Point2f>& imagePoints)
void projectPoints(const Mat& objectPoints, const Mat& rvec, const Mat& tvec, const Mat& cameraMatrix, const Mat& distCoeffs, vector<Point2f>& imagePoints, Mat& dpdrot, Mat& dpdt, Mat& dpdf, Mat& dpdc, Mat& dpddist, double aspectRatio=0)
Projects 3D points onto an image plane.
Parameters:
o objectPoints – The array of object points, 3xN or Nx3 1-channel or 1xN or Nx1 3-channel (or vector<Point3f>), where N is the number of points in the view
o rvec – The rotation vector, see Rodrigues2
o tvec – The translation vector
o cameraMatrix – The camera matrix
o distCoeffs – The input vector of distortion coefficients of 4, 5 or 8 elements. If the vector is NULL/empty, the zero distortion coefficients are assumed.
o imagePoints – The output array of image points, 2xN or Nx2 1-channel or 1xN or Nx1 2-channel (or vector<Point2f>)
o dpdrot – Optional 2Nx3 matrix of derivatives of image points with respect to components of the rotation vector
o dpdt – Optional 2Nx3 matrix of derivatives of image points with respect to components of the translation vector
o dpdf – Optional 2Nx2 matrix of derivatives of image points with respect to fx and fy
o dpdc – Optional 2Nx2 matrix of derivatives of image points with respect to cx and cy
o dpddist – Optional 2Nx4 matrix of derivatives of image points with respect to the distortion coefficients
The function computes projections of 3D points to the image plane given intrinsic and extrinsic camera parameters. Optionally, the function computes jacobians - matrices of partial derivatives of image points coordinates (as functions of all the input parameters) with respect to the particular parameters, intrinsic and/or extrinsic. The jacobians are used during the global optimization in CalibrateCamera2, FindExtrinsicCameraParams2 and StereoCalibrate. The function itself can also be used to compute the re-projection error given the current intrinsic and extrinsic parameters.
Note that by setting rvec=tvec=(0,0,0), or by setting cameraMatrix to a 3x3 identity matrix, or by passing zero distortion coefficients, you can get various useful partial cases of the function, i.e. you can compute the distorted coordinates for a sparse set of points, or apply a perspective transformation (and also compute the derivatives) in the ideal zero-distortion setup etc.

cv::reprojectImageTo3D
Comments from the Wiki
void reprojectImageTo3D(const Mat& disparity, Mat& _3dImage, const Mat& Q, bool handleMissingValues=false)
Reprojects disparity image to 3D space.
Parameters:
o disparity – The input single-channel 16-bit signed or 32-bit floating-point disparity image
o _3dImage – The output 3-channel floating-point image of the same size as disparity. Each element of _3dImage(x,y) will contain the 3D coordinates of the point (x,y), computed from the disparity map.
o Q – The 4x4 perspective transformation matrix that can be obtained with StereoRectify
o handleMissingValues – If true, the pixels with the minimal disparity (that corresponds to the outliers; see FindStereoCorrespondenceBM) will be transformed to 3D points with some very large Z value (currently set to 10000)
The function transforms a 1-channel disparity map to a 3-channel image representing a 3D surface. That is, for each pixel (x,y) and the corresponding disparity d=disparity(x,y) it computes:

[X Y Z W]T = Q*[x y d 1]T
_3dImage(x,y) = (X/W, Y/W, Z/W)

The matrix Q can be an arbitrary 4x4 matrix, e.g. the one computed by StereoRectify. To reproject a sparse set of points {(x,y,d),...} to 3D space, use PerspectiveTransform.
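A short sketch of projectPoints (illustrative only; rvec, tvec, cameraMatrix and distCoeffs are assumed to be known, e.g. from FindExtrinsicCameraParams2):

vector<Point3f> objectPoints;
objectPoints.push_back(Point3f(0.f, 0.f, 0.f)); // a few 3D points in the
objectPoints.push_back(Point3f(1.f, 0.f, 0.f)); // object coordinate system
vector<Point2f> imagePoints;
projectPoints(Mat(objectPoints), rvec, tvec, cameraMatrix, distCoeffs, imagePoints);
// imagePoints[i] is now the projection of objectPoints[i] onto the image plane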

cv::Rodrigues
Comments from the Wiki
void Rodrigues(const Mat& src, Mat& dst)
void Rodrigues(const Mat& src, Mat& dst, Mat& jacobian)
Converts a rotation matrix to a rotation vector or vice versa.
Parameters:
o src – The input rotation vector (3x1 or 1x3) or rotation matrix (3x3)
o dst – The output rotation matrix (3x3) or rotation vector (3x1 or 1x3), respectively
o jacobian – Optional output Jacobian matrix, 3x9 or 9x3 - partial derivatives of the output array components with respect to the input array components
The conversion from a rotation vector r to a rotation matrix R is:

theta = norm(r)
r = r/theta
R = cos(theta)*I + (1 - cos(theta))*r*rT + sin(theta)*[r]x

where [r]x is the skew-symmetric matrix of the vector r. The inverse transformation can also be done easily, since sin(theta)*[r]x = (R - RT)/2.
A rotation vector is a convenient and most compact representation of a rotation matrix (since any rotation matrix has just 3 degrees of freedom). The representation is used in the global 3D geometry optimization procedures like CalibrateCamera2, StereoCalibrate or FindExtrinsicCameraParams2.

cv::RQDecomp3x3
Comments from the Wiki
void RQDecomp3x3(const Mat& M, Mat& R, Mat& Q)
Vec3d RQDecomp3x3(const Mat& M, Mat& R, Mat& Q, Mat& Qx, Mat& Qy, Mat& Qz)
Computes the 'RQ' decomposition of 3x3 matrices.
Parameters:
o M – The 3x3 input matrix
o R – The output 3x3 upper-triangular matrix
o Q – The output 3x3 orthogonal matrix
o Qx – Optional 3x3 rotation matrix around x-axis
o Qy – Optional 3x3 rotation matrix around y-axis
o Qz – Optional 3x3 rotation matrix around z-axis
The function computes a RQ decomposition using the given rotations. This function is used in DecomposeProjectionMatrix to decompose the left 3x3 submatrix of a projection matrix into a camera matrix and a rotation matrix. It optionally returns three rotation matrices, one for each axis, and the three Euler angles (as the return value) that could be used in OpenGL.
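A two-line illustration of the round trip (not from the original text):

Mat rvec = (Mat_<double>(3,1) << 0, CV_PI/2, 0); // 90 degrees around the y-axis
Mat R;
Rodrigues(rvec, R);    // rotation vector -> 3x3 rotation matrix
Mat rvec2;
Rodrigues(R, rvec2);   // rotation matrix -> rotation vector; rvec2 == rvec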

StereoBM
Comments from the Wiki
StereoBM
The class for computing stereo correspondence using the block matching algorithm.

// Block matching stereo correspondence algorithm
class StereoBM
{
    enum { NORMALIZED_RESPONSE = CV_STEREO_BM_NORMALIZED_RESPONSE,
           BASIC_PRESET=CV_STEREO_BM_BASIC,
           FISH_EYE_PRESET=CV_STEREO_BM_FISH_EYE,
           NARROW_PRESET=CV_STEREO_BM_NARROW };

    StereoBM();
    // the preset is one of ..._PRESET above.
    // ndisparities is the size of disparity range,
    // in which the optimal disparity at each pixel is searched for.
    // SADWindowSize is the size of averaging window used to match pixel blocks
    // (larger values mean better robustness to noise, but yield blurry disparity maps)
    StereoBM(int preset, int ndisparities=0, int SADWindowSize=21);
    // separate initialization function
    void init(int preset, int ndisparities=0, int SADWindowSize=21);
    // computes the disparity for the two rectified 8-bit single-channel images.
    // the disparity will be 16-bit signed (fixed-point) or
    // 32-bit floating-point image of the same size as left.
    void operator()( const Mat& left, const Mat& right, Mat& disparity, int disptype=CV_16S );

    Ptr<CvStereoBMState> state;
};

The class is a C++ wrapper for CvStereoBMState and the associated functions. In particular, StereoBM::operator() is the wrapper for FindStereoCorrespondceBM. See the respective descriptions.

StereoSGBM
Comments from the Wiki
StereoSGBM
The class for computing stereo correspondence using the semi-global block matching algorithm.

class StereoSGBM
{
    StereoSGBM();
    StereoSGBM(int minDisparity, int numDisparities, int SADWindowSize,
               int P1=0, int P2=0, int disp12MaxDiff=0,
               int preFilterCap=0, int uniquenessRatio=0,
               int speckleWindowSize=0, int speckleRange=0,
               bool fullDP=false);
    virtual ~StereoSGBM();

    virtual void operator()(const Mat& left, const Mat& right, Mat& disp);

    int minDisparity;
    int numberOfDisparities;
    int SADWindowSize;
    int preFilterCap;
    int uniquenessRatio;
    int P1, P2;
    int speckleWindowSize;
    int speckleRange;
    int disp12MaxDiff;
    bool fullDP;
    ...
};

The class implements the modified H. Hirschmuller algorithm HH08. The main differences between the implemented algorithm and the original one are:

• by default the algorithm is single-pass, i.e. instead of 8 directions we only consider 5. Set fullDP=true to run the full variant of the algorithm (which could consume a lot of memory)
• the algorithm matches blocks, not individual pixels (though, by setting SADWindowSize=1 the blocks are reduced to single pixels)
• the mutual information cost function is not implemented. Instead, we use a simpler Birchfield-Tomasi sub-pixel metric from BT96, though the color images are supported as well
• we include some pre- and post-processing steps from the K. Konolige algorithm FindStereoCorrespondceBM, such as pre-filtering (CV_STEREO_BM_XSOBEL type) and post-filtering (uniqueness check, quadratic interpolation and speckle filtering)

cv::StereoSGBM::StereoSGBM
Comments from the Wiki
StereoSGBM::StereoSGBM()
StereoSGBM::StereoSGBM(int minDisparity, int numDisparities, int SADWindowSize, int P1=0, int P2=0, int disp12MaxDiff=0, int preFilterCap=0, int uniquenessRatio=0, int speckleWindowSize=0, int speckleRange=0, bool fullDP=false)
StereoSGBM constructors
Parameters:
o minDisparity – The minimum possible disparity value. Normally it is 0, but sometimes rectification algorithms can shift images, so this parameter needs to be adjusted accordingly
o numDisparities – This is the maximum disparity minus the minimum disparity. Always greater than 0. In the current implementation this parameter must be divisible by 16.
o SADWindowSize – The matched block size. Must be an odd number >=1. Normally, it should be somewhere in the 3..11 range.
o P1, P2 – Parameters that control disparity smoothness. The larger the values, the smoother the disparity. P1 is the penalty on the disparity change by plus or minus 1 between neighbor pixels. P2 is the penalty on the disparity change by more than 1 between neighbor pixels. The algorithm requires P2 > P1. See the stereo_match.cpp sample where some reasonably good P1 and P2 values are shown (like 8*number_of_image_channels*SADWindowSize*SADWindowSize and 32*number_of_image_channels*SADWindowSize*SADWindowSize, respectively).

o disp12MaxDiff – Maximum allowed difference (in integer pixel units) in the left-right disparity check. Set it to a non-positive value to disable the check.
o preFilterCap – Truncation value for the prefiltered image pixels. The algorithm first computes the x-derivative at each pixel and clips its value by the [-preFilterCap, preFilterCap] interval. The result values are passed to the Birchfield-Tomasi pixel cost function.
o uniquenessRatio – The margin in percents by which the best (minimum) computed cost function value should "win" the second best value to consider the found match correct. Normally, some value within the 5-15 range is good enough.
o speckleWindowSize – Maximum size of smooth disparity regions to consider them noise speckles and invalidate. Set it to 0 to disable speckle filtering. Otherwise, set it somewhere in the 50-200 range.
o speckleRange – Maximum disparity variation within each connected component. If you do speckle filtering, set it to some positive value, multiple of 16. Normally, 16 or 32 is good enough.
o fullDP – Set it to true to run the full-scale 2-pass dynamic programming algorithm. It will consume O(W*H*numDisparities) bytes, which is large for 640x480 stereo and huge for HD-size pictures. By default this is false.
The first constructor initializes StereoSGBM with all the default parameters (so actually one will only have to set StereoSGBM::numberOfDisparities at minimum). The second constructor allows you to set each parameter to a custom value.

cv::StereoSGBM::operator ()
Comments from the Wiki
void StereoSGBM::operator()(const Mat& left, const Mat& right, Mat& disp)
Computes disparity using the SGBM algorithm for a rectified stereo pair.
Parameters:
o left – The left image, 8-bit single-channel or 3-channel.
o right – The right image of the same size and the same type as the left one.
o disp – The output disparity map. It will be a 16-bit signed single-channel image of the same size as the input images. It will contain disparity values scaled by 16, so to get the floating-point disparity map, you will need to divide each disp element by 16.
The method executes the SGBM algorithm on a rectified stereo pair. See the stereo_match.cpp OpenCV sample on how to prepare the images and call the method. Note that the method is not constant, thus you should not use the same StereoSGBM instance from within different threads simultaneously.
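A compact usage sketch (assumed file names; the images must already be rectified):

StereoSGBM sgbm(0, 64, 11);               // minDisparity, numDisparities, SADWindowSize
sgbm.P1 = 8*3*11*11;                      // smoothness penalties as suggested
sgbm.P2 = 32*3*11*11;                     // in stereo_match.cpp
Mat left = imread("left.png"), right = imread("right.png"), disp;
sgbm(left, right, disp);                  // disp is CV_16S, scaled by 16
Mat disp32f;
disp.convertTo(disp32f, CV_32F, 1./16);   // true disparity values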

cv::stereoCalibrate
Comments from the Wiki
double stereoCalibrate(const vector<vector<Point3f> >& objectPoints, const vector<vector<Point2f> >& imagePoints1, const vector<vector<Point2f> >& imagePoints2, Mat& cameraMatrix1, Mat& distCoeffs1, Mat& cameraMatrix2, Mat& distCoeffs2, Size imageSize, Mat& R, Mat& T, Mat& E, Mat& F, TermCriteria term_crit = TermCriteria(TermCriteria::COUNT+TermCriteria::EPS, 30, 1e-6), int flags=CALIB_FIX_INTRINSIC)
Calibrates stereo camera.
Parameters:
o objectPoints – The vector of vectors of points on the calibration pattern in its coordinate system, one vector per view. If the same calibration pattern is shown in each view and it is fully visible then all the vectors will be the same, although it is possible to use partially occluded patterns, or even different patterns in different views - then the vectors will be different. The points are 3D, but since they are in the pattern coordinate system, then if the rig is planar, it may have sense to put the model to the XY coordinate plane, so that the Z-coordinate of each input object point is 0.
o imagePoints1 – The vector of vectors of the object point projections on the calibration pattern views from the 1st camera, one vector per a view. The projections must be in the same order as the corresponding object points.
o imagePoints2 – The vector of vectors of the object point projections on the calibration pattern views from the 2nd camera, one vector per a view. The projections must be in the same order as the corresponding object points.
o cameraMatrix1 – The input/output first camera matrix. If any of CV_CALIB_USE_INTRINSIC_GUESS, CV_CALIB_FIX_ASPECT_RATIO, CV_CALIB_FIX_INTRINSIC or CV_CALIB_FIX_FOCAL_LENGTH are specified, some or all of the matrix components must be initialized; see the flags description.
o distCoeffs1 – The input/output vector of distortion coefficients of 4, 5 or 8 elements for the first camera. On output the vector length depends on the flags.
o cameraMatrix2 – The input/output second camera matrix, as cameraMatrix1.
o distCoeffs2 – The input/output lens distortion coefficients for the second camera, as distCoeffs1.
o imageSize – Size of the image, used only to initialize the intrinsic camera matrix.
o R – The output rotation matrix between the 1st and the 2nd cameras' coordinate systems.
o T – The output translation vector between the cameras' coordinate systems.
o E – The output essential matrix.
o F – The output fundamental matrix.
o term_crit – The termination criteria for the iterative optimization algorithm.
o flags – Different flags, may be 0 or a combination of the following values:
CV_CALIB_FIX_INTRINSIC If it is set, cameraMatrix?, as well as distCoeffs?, are fixed, so that only R, T, E and F are estimated.
CV_CALIB_USE_INTRINSIC_GUESS The flag allows the function to optimize some or all of the intrinsic parameters, depending on the other flags, but the initial values are provided by the user.
CV_CALIB_FIX_PRINCIPAL_POINT The principal points are fixed during the optimization.
CV_CALIB_FIX_FOCAL_LENGTH fx and fy are fixed.
CV_CALIB_FIX_ASPECT_RATIO fy is optimized, but the ratio fx/fy is fixed.
CV_CALIB_SAME_FOCAL_LENGTH Enforces fx1=fx2 and fy1=fy2.
CV_CALIB_ZERO_TANGENT_DIST Tangential distortion coefficients for each camera are set to zeros and fixed there.
CV_CALIB_FIX_K1...CV_CALIB_FIX_K6 Do not change the corresponding radial distortion coefficient during the optimization. If CV_CALIB_USE_INTRINSIC_GUESS is set, the coefficient from the supplied distCoeffs matrix is used, otherwise it is set to 0.
CV_CALIB_RATIONAL_MODEL Enable coefficients k4, k5 and k6. To provide the backward compatibility, this extra flag should be explicitly specified to make the calibration function use the rational model and return 8 coefficients. If the flag is not set, the function will compute and return only 5 distortion coefficients.

The function estimates the transformation between the 2 cameras making a stereo pair. If we have a stereo camera, where the relative position and orientation of the 2 cameras is fixed, and if we computed poses of an object relative to the first camera and to the second camera, (R1, T1) and (R2, T2), respectively (that can be done with FindExtrinsicCameraParams2), obviously, those poses will relate to each other, i.e. given (R1, T1) it should be possible to compute (R2, T2) - we only need to know the position and orientation of the 2nd camera relative to the 1st camera. That's what the described function does. It computes (R, T) such that:

R2 = R*R1, T2 = R*T1 + T

Optionally, it computes the essential matrix E:

E = [0 -T2 T1; T2 0 -T0; -T1 T0 0] * R

where Ti are the components of the translation vector T: T = [T0, T1, T2]T. And also the function can compute the fundamental matrix F:

F = cameraMatrix2^-T * E * cameraMatrix1^-1

Besides the stereo-related information, the function can also perform full calibration of each of the 2 cameras. However, because of the high dimensionality of the parameter space and noise in the input data the function can diverge from the correct solution. Thus, if the intrinsic parameters can be estimated with high accuracy for each of the cameras individually (e.g. using CalibrateCamera2), it is recommended to do so and then pass the CV_CALIB_FIX_INTRINSIC flag to the function along with the computed intrinsic parameters. Otherwise, if all the parameters are estimated at once, it makes sense to restrict some parameters, e.g. pass the CV_CALIB_SAME_FOCAL_LENGTH and CV_CALIB_ZERO_TANGENT_DIST flags, which are usually reasonable assumptions.
Similarly to CalibrateCamera2, the function minimizes the total re-projection error for all the points in all the available views from both cameras. The function returns the final value of the re-projection error.
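A typical calling pattern (a sketch under assumed variable names, with intrinsics pre-computed by CalibrateCamera2 for each head):

Mat R, T, E, F;
double rms = stereoCalibrate(objectPoints, imagePoints1, imagePoints2,
                             cameraMatrix1, distCoeffs1,
                             cameraMatrix2, distCoeffs2,
                             imageSize, R, T, E, F,
                             TermCriteria(TermCriteria::COUNT+TermCriteria::EPS, 100, 1e-5),
                             CV_CALIB_FIX_INTRINSIC);
// rms is the final re-projection error; R and T feed directly into stereoRectify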

cv::stereoRectify
Comments from the Wiki
void stereoRectify(const Mat& cameraMatrix1, const Mat& distCoeffs1, const Mat& cameraMatrix2, const Mat& distCoeffs2, Size imageSize, const Mat& R, const Mat& T, Mat& R1, Mat& R2, Mat& P1, Mat& P2, Mat& Q, int flags=CALIB_ZERO_DISPARITY)
void stereoRectify(const Mat& cameraMatrix1, const Mat& distCoeffs1, const Mat& cameraMatrix2, const Mat& distCoeffs2, Size imageSize, const Mat& R, const Mat& T, Mat& R1, Mat& R2, Mat& P1, Mat& P2, Mat& Q, double alpha, Size newImageSize=Size(), Rect* roi1=0, Rect* roi2=0, int flags=CALIB_ZERO_DISPARITY)
Computes rectification transforms for each head of a calibrated stereo camera.
Parameters:
o cameraMatrix1, cameraMatrix2 – The camera matrices.
o distCoeffs1, distCoeffs2 – The input vectors of distortion coefficients of 4, 5 or 8 elements each. If the vectors are NULL/empty, the zero distortion coefficients are assumed.
o imageSize – Size of the image used for stereo calibration.
o R – The rotation matrix between the 1st and the 2nd cameras' coordinate systems.
o T – The translation vector between the cameras' coordinate systems.
o R1, R2 – The output rectification transforms (rotation matrices) for the first and the second cameras, respectively.
o P1, P2 – The output projection matrices in the new (rectified) coordinate systems.
o Q – The output disparity-to-depth mapping matrix, see reprojectImageTo3D().
o alpha – The free scaling parameter. If it is -1 or absent, the function performs some default scaling. Otherwise the parameter should be between 0 and 1. alpha=0 means that the rectified images will be zoomed and shifted so that only valid pixels are visible (i.e. there will be no black areas after rectification). alpha=1 means that the rectified image will be decimated and shifted so that all the pixels from the original images from the cameras are retained in the rectified images, i.e. no source image pixels are lost. Obviously, any intermediate value yields some intermediate result between those two extreme cases.
o newImageSize – The new image resolution after rectification. By default, i.e. when (0,0) is passed, it is set to the original imageSize. Setting it to a larger value can help you to preserve details in the original image, especially when there is big radial distortion. The same size should be passed to InitUndistortRectifyMap, see the stereo_calib.cpp sample in the OpenCV samples directory.
o roi1, roi2 – The optional output rectangles inside the rectified images where all the pixels are valid. If alpha=0, the ROIs will cover the whole images; otherwise they are likely to be smaller.
o flags – The operation flags; may be 0 or CV_CALIB_ZERO_DISPARITY. If the flag is set, the function makes the principal points of each camera have the same pixel coordinates in the rectified views. And if the flag is not set, the function may still shift the images in the horizontal or vertical direction (depending on the orientation of epipolar lines) in order to maximize the useful image area.

The function computes the rotation matrices for each camera that (virtually) make both camera image planes the same plane. Consequently, that makes all the epipolar lines parallel and thus simplifies the dense stereo correspondence problem. On input the function takes the matrices computed by stereoCalibrate() and on output it gives 2 rotation matrices and also 2 projection matrices in the new coordinates. The 2 cases distinguished by the function are:

Horizontal stereo, when the 1st and the 2nd camera views are shifted relative to each other mainly along the x axis (with possible small vertical shift). Then in the rectified images the corresponding epipolar lines in the left and right cameras will be horizontal and have the same y-coordinate. P1 and P2 will look like:

P1 = [f 0 cx1 0; 0 f cy 0; 0 0 1 0]
P2 = [f 0 cx2 Tx*f; 0 f cy 0; 0 0 1 0]

where Tx is the horizontal shift between the cameras, and cx1=cx2 if CV_CALIB_ZERO_DISPARITY is set.

Vertical stereo, when the 1st and the 2nd camera views are shifted relative to each other mainly in the vertical direction (and probably a bit in the horizontal direction too). Then the epipolar lines in the rectified images will be vertical and have the same x-coordinate. P1 and P2 will look like:

P1 = [f 0 cx 0; 0 f cy1 0; 0 0 1 0]
P2 = [f 0 cx 0; 0 f cy2 Ty*f; 0 0 1 0]

where Ty is the vertical shift between the cameras, and cy1=cy2 if CALIB_ZERO_DISPARITY is set.

As you can see, the first 3 columns of P1 and P2 will effectively be the new "rectified" camera matrices. The matrices, together with R1 and R2, can then be passed to InitUndistortRectifyMap to initialize the rectification map for each camera.
Below is a screenshot from the stereo_calib.cpp sample. Some of the red horizontal lines, as you can see, pass through the corresponding image regions, i.e. the images are well rectified (which is what most stereo correspondence algorithms rely on). The green rectangles are roi1 and roi2 - indeed, their interiors are all valid pixels.
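The whole calibrated-rectification pipeline then chains together like this (a sketch, reusing the outputs of the stereoCalibrate example above):

Mat R1, R2, P1, P2, Q;
Rect roi1, roi2;
stereoRectify(cameraMatrix1, distCoeffs1, cameraMatrix2, distCoeffs2,
              imageSize, R, T, R1, R2, P1, P2, Q,
              0 /* alpha */, imageSize, &roi1, &roi2);
Mat map11, map12, map21, map22;
initUndistortRectifyMap(cameraMatrix1, distCoeffs1, R1, P1, imageSize, CV_16SC2, map11, map12);
initUndistortRectifyMap(cameraMatrix2, distCoeffs2, R2, P2, imageSize, CV_16SC2, map21, map22);
Mat left_r, right_r;
remap(left, left_r, map11, map12, INTER_LINEAR);   // rectified pair, ready for
remap(right, right_r, map21, map22, INTER_LINEAR); // StereoBM/StereoSGBM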

cv::stereoRectifyUncalibrated
Comments from the Wiki
bool stereoRectifyUncalibrated(const Mat& points1, const Mat& points2, const Mat& F, Size imgSize, Mat& H1, Mat& H2, double threshold=5)
Computes the rectification transform for an uncalibrated stereo camera.
Parameters:
o points1, points2 – The 2 arrays of corresponding 2D points. The same formats as in FindFundamentalMat are supported.
o F – The input fundamental matrix. It can be computed from the same set of point pairs using FindFundamentalMat.
o imgSize – Size of the image.
o H1, H2 – The output rectification homography matrices for the first and for the second images.
o threshold – The optional threshold used to filter out the outliers. If the parameter is greater than zero, then all the point pairs that do not comply with the epipolar geometry well enough (that is, the points for which |points2[i]^T * F * points1[i]| > threshold) are rejected prior to computing the homographies. Otherwise all the points are considered inliers.
The function computes the rectification transformations without knowing the intrinsic parameters of the cameras and their relative position in space, hence the suffix "Uncalibrated". Another related difference from StereoRectify is that the function outputs not the rectification transformations in the object (3D) space, but the planar perspective transformations, encoded by the homography matrices H1 and H2. The function implements the algorithm Hartley99.
Note that while the algorithm does not need to know the intrinsic parameters of the cameras, it heavily depends on the epipolar geometry. Therefore, if the camera lenses have significant distortion, it would better be corrected before computing the fundamental matrix and calling this function. For example, the distortion coefficients can be estimated for each head of a stereo camera separately by using CalibrateCamera2, and then the images can be corrected using Undistort2, or just the point coordinates can be corrected with UndistortPoints.
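A brief sketch of the uncalibrated path (the matched point sets points1/points2 are assumed to come from some feature matcher):

Mat F = findFundamentalMat(Mat(points1), Mat(points2), FM_RANSAC, 3, 0.99);
Mat H1, H2;
if( stereoRectifyUncalibrated(Mat(points1), Mat(points2), F, imgSize, H1, H2) )
{
    Mat img1_r, img2_r;
    warpPerspective(img1, img1_r, H1, imgSize);  // apply the planar
    warpPerspective(img2, img2_r, H2, imgSize);  // rectification homographies
}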

cv::undistort
Comments from the Wiki
void undistort(const Mat& src, Mat& dst, const Mat& cameraMatrix, const Mat& distCoeffs, const Mat& newCameraMatrix=Mat())
Transforms an image to compensate for lens distortion.
Parameters:
o src – The input (distorted) image
o dst – The output (corrected) image; will have the same size and the same type as src
o cameraMatrix – The input camera matrix
o distCoeffs – The input vector of distortion coefficients of 4, 5 or 8 elements. If the vector is NULL/empty, the zero distortion coefficients are assumed.
o newCameraMatrix – Camera matrix of the distorted image. By default it is the same as cameraMatrix, but you may additionally scale and shift the result by using some different matrix
The function transforms the image to compensate radial and tangential lens distortion. The function is simply a combination of InitUndistortRectifyMap (with unity R) and Remap (with bilinear interpolation). See the former function for details of the transformation being performed. Those pixels in the destination image, for which there are no corresponding pixels in the source image, are filled with 0's (black color).
The particular subset of the source image that will be visible in the corrected image can be regulated by newCameraMatrix. You can use GetOptimalNewCameraMatrix to compute the appropriate newCameraMatrix, depending on your requirements.
The camera matrix and the distortion parameters can be determined using CalibrateCamera2. If the resolution of images is different from the one used at the calibration stage, fx, fy, cx and cy need to be scaled accordingly, while the distortion coefficients remain the same.

cv::undistortPoints
Comments from the Wiki
void undistortPoints(const Mat& src, vector<Point2f>& dst, const Mat& cameraMatrix, const Mat& distCoeffs, const Mat& R=Mat(), const Mat& P=Mat())
void undistortPoints(const Mat& src, Mat& dst, const Mat& cameraMatrix, const Mat& distCoeffs, const Mat& R=Mat(), const Mat& P=Mat())
Computes the ideal point coordinates from the observed point coordinates.
Parameters:
o src – The observed point coordinates, 1xN or Nx1 2-channel (CV_32FC2 or CV_64FC2).
o dst – The output ideal point coordinates, after undistortion and reverse perspective transformation.
o cameraMatrix – The camera matrix
o distCoeffs – The input vector of distortion coefficients of 4, 5 or 8 elements. If the vector is NULL/empty, the zero distortion coefficients are assumed.
o R – The rectification transformation in object space (3x3 matrix). R1 or R2, computed by StereoRectify(), can be passed here. If the matrix is empty, the identity transformation is used.
o P – The new camera matrix (3x3) or the new projection matrix (3x4). P1 or P2, computed by StereoRectify(), can be passed here. If the matrix is empty, the identity new camera matrix is used.
The function is similar to Undistort2 and InitUndistortRectifyMap, but it operates on a sparse set of points instead of a raster image. Also the function performs a kind of reverse transformation to ProjectPoints2 (in the case of a 3D object it will not reconstruct its 3D coordinates, but for a planar object it will, up to a translation vector, if the proper R is specified). For each observed point the function applies:

// (u,v) is the input point, (u',v') is the output point
// camera_matrix=[fx 0 cx; 0 fy cy; 0 0 1]
// P=[fx' 0 cx' tx; 0 fy' cy' ty; 0 0 1 tz]
x" = (u - cx)/fx
y" = (v - cy)/fy
(x',y') = undistort(x",y",dist_coeffs)
[X,Y,W]T = R*[x' y' 1]T
x = X/W, y = Y/W
u' = x*fx' + cx'
v' = y*fy' + cy'

where undistort() is an approximate iterative algorithm that estimates the normalized original point coordinates out of the normalized distorted point coordinates ("normalized" means that the coordinates do not depend on the camera matrix).
The function can be used both for a stereo camera head or for a monocular camera (when R is empty).
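A tiny sketch of correcting a sparse set of detected points (assumed to come e.g. from a corner detector):

vector<Point2f> observed;                 // distorted pixel coordinates
observed.push_back(Point2f(320.f, 240.f));
vector<Point2f> ideal;
// with R and P empty, the result is in normalized coordinates;
// pass cameraMatrix as P to get corrected pixel coordinates instead
undistortPoints(Mat(observed), ideal, cameraMatrix, distCoeffs, Mat(), cameraMatrix);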

ml. Machine Learning
The Machine Learning Library (MLL) is a set of classes and functions for statistical classification, regression and clustering of data.
Most of the classification and regression algorithms are implemented as C++ classes. As the algorithms have different sets of features (like the ability to handle missing measurements, or categorical input variables etc.), there is a little common ground between the classes. This common ground is defined by the class CvStatModel that all the other ML classes are derived from.

Statistical Models
CvStatModel
Comments from the Wiki
CvStatModel
Base class for the statistical models in ML.

class CvStatModel
{
public:
    /* CvStatModel(); */
    /* CvStatModel( const CvMat* train_data ... ); */

    virtual ~CvStatModel();

    virtual void clear()=0;

    /* virtual bool train( const CvMat* train_data, [int tflag,] ...,
        const CvMat* responses, ...,
        [const CvMat* var_idx,] ..., [const CvMat* sample_idx,] ...
        [const CvMat* var_type,] ..., [const CvMat* missing_mask,]
        <misc_training_alg_params> ... )=0; */

    /* virtual float predict( const CvMat* sample ... ) const=0; */

    virtual void save( const char* filename, const char* name=0 )=0;
    virtual void load( const char* filename, const char* name=0 )=0;

    virtual void write( CvFileStorage* storage, const char* name )=0;
    virtual void read( CvFileStorage* storage, CvFileNode* node )=0;
};

In this declaration some methods are commented off. Actually, these are methods for which there is no unified API (with the exception of the default constructor); however, there are many similarities in the syntax and semantics that are briefly described below in this section, as if they were a part of the base class.

CvStatModel::CvStatModel
Comments from the Wiki
CvStatModel::CvStatModel()
Default constructor.
Each statistical model class in ML has a default constructor without parameters. This constructor is useful for 2-stage model construction, when the default constructor is followed by train() or load().

CvStatModel::CvStatModel(...)
Comments from the Wiki
CvStatModel::CvStatModel(const CvMat* train_data ...)
Training constructor.
Most ML classes provide single-step construct and train constructors. This constructor is equivalent to the default constructor, followed by the train() method with the parameters that are passed to the constructor.

CvStatModel::~CvStatModel
Comments from the Wiki
CvStatModel::~CvStatModel()
Virtual destructor.

The destructor of the base class is declared as virtual, so it is safe to write the following code:

CvStatModel* model;
if( use_svm )
    model = new CvSVM(... /* SVM params */);
else
    model = new CvDTree(... /* Decision tree params */);
...
delete model;

Normally, the destructor of each derived class does nothing, but it calls the overridden method clear() that deallocates all the memory.

CvStatModel::clear
Comments from the Wiki
void CvStatModel::clear()
Deallocates memory and resets the model state.
The method clear does the same job as the destructor: it deallocates all the memory occupied by the class members. But the object itself is not destructed, and can be reused further. This method is called from the destructor, from the train methods of the derived classes, from the methods load(), read() or even explicitly by the user.

CvStatModel::save
Comments from the Wiki
void CvStatModel::save(const char* filename, const char* name=0)
Saves the model to a file.
The method save stores the complete model state to the specified XML or YAML file with the specified name or default name (that depends on the particular class). Data persistence functionality from CxCore is used.

CvStatModel::load
Comments from the Wiki
void CvStatModel::load(const char* filename, const char* name=0)
Loads the model from a file.
The method load loads the complete model state with the specified name (or default model-dependent name) from the specified XML or YAML file. The previous model state is cleared by clear().
Note that the method is virtual, so any model can be loaded using this virtual method. However, unlike the C types of OpenCV that can be loaded using the generic cvLoad, here the model type must be known, because an empty model must be constructed beforehand. This limitation will be removed in the later ML versions.

CvStatModel::write
Comments from the Wiki
void CvStatModel::write(CvFileStorage* storage, const char* name)
Writes the model to file storage.
The method write stores the complete model state to the file storage with the specified name or default name (that depends on the particular class). The method is called by save().

CvStatModel::read

The node must be located by the user using the function GetFileNodeByName . the previous model state is cleared by clear() before running the training procedure.e. However. In the latter case the type of output variable is either passed as separate parameter. Both input and output vectors/values are passed as matrices. [const CvMat* missing_mask. [const CvMat* var_type. some algorithms may optionally update the model state with the new training data. Most algorithms can handle only ordered input variables. i.. lists of 0-based indices. However. Both vectors are either integer ( CV_32SC1 ) vectors.. • tflag=CV_COL_SAMPLEmeans that the feature vectors are stored as columns. an 8-bit matrix the same size as train_data . meaning that all of the variables/samples are used for training. single-channel) format. or as a last element of var_type vector: • CV_VAR_CATEGORICALmeans that the output values are discrete class labels. If both layouts are supported... ] . the method includes tflag parameter that specifies the orientation: • tflag=CV_ROW_SAMPLEmeans that the feature vectors are stored as rows. or 8-bit ( CV_8UC1 ) masks of active variables/samples. . const CvMat* responses.e.. Usually. The user may pass NULL pointers instead of either of the arguments.. for regression problems the responses are values of the function to be approximated. [const CvMat* var_idx. ] . like various flavors of neural nets. CvStatModel::train Comments from the Wiki bool CvStatMode::train(const CvMat* train_data. The previous model state is cleared by clear() . take vector responses). To make it easier for the user. The train_data must have a CV_32FC1 (32-bit floating-point. [int tflag. CvFileNode* node) Reads the model from file storage. i. For classification problems the responses are discrete class labels.. they forgot to measure a temperature of patient A on Monday). The method trains the statistical model using a set of input feature vectors and the corresponding output values (responses).. i.Comments from the Wiki void CvStatMode::read(CvFileStorage* storage. The method read restores the complete model state from the specified node of the file storage. when all values of each particular feature (component/input variable) over the whole input set are stored continuously.. all the components (features) of a training vector are stored continuously. some . • CV_VAR_ORDERED(=CV_VAR_NUMERICAL)means that the output values are ordered.. The parameter missing_mask . Many models in the ML may be trained on a selected feature subset. ] . and the latter identifies samples of interest. one value per input vector (although some algorithms. By default the input feature vectors are stored as train_data rows. The former identifies variables (features) of interest.. and some can deal with both problems. Responses are usually stored in the 1d vector (a row or a column) of CV_32SC1 (only in the classification problem) or CV_32FC1 format.) Trains the model. is used to mark the missed values (non-zero elements of the mask).e. instead of resetting it. some algorithms can handle the transposed representation. that is when certain features of certain training samples have unknown values (for example.. and this is a regression problem The types of input variables can be also specified using var_type . CvStatModel::predict . ] .. Additionally some algorithms can handle missing measurements... Some algorithms can deal only with classification problems. and/or on a selected sample subset of the training set. 
the method train usually includes var_idx and sample_idx parameters. 2 different values can be compared as numbers. ] <misc_training_alg_params> .only with regression problems. [const CvMat* sample_idx..
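To make the conventions concrete, here is a small sketch (hypothetical values, not from the original text) of how a training set is typically prepared: one CV_32FC1 row per sample plus a response vector.

int n_samples = 4, n_features = 2;
CvMat* train_data = cvCreateMat( n_samples, n_features, CV_32FC1 );
CvMat* responses  = cvCreateMat( n_samples, 1, CV_32SC1 ); // class labels
for( int i = 0; i < n_samples; i++ )
{
    train_data->data.fl[i*n_features]     = (float)i;      // feature 0
    train_data->data.fl[i*n_features + 1] = (float)(i*i);  // feature 1
    responses->data.i[i] = i < 2 ? 0 : 1;                  // 2 classes
}
// train_data and responses can now be passed to some model's train() method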

CvStatModel::predict
Comments from the Wiki
float CvStatModel::predict(const CvMat* sample[, <prediction_params>]) const
Predicts the response for the sample.
The method is used to predict the response for a new sample. In the case of classification the method returns the class label; in the case of regression - the output function value. The input sample must have as many components as the train_data passed to train contains. If the var_idx parameter is passed to train, it is remembered and then is used to extract only the necessary components from the input sample in the method predict.
The suffix "const" means that prediction does not affect the internal model state, so the method can be safely called from within different threads.

Normal Bayes Classifier
This is a simple classification model assuming that feature vectors from each class are normally distributed (though, not necessarily independently distributed), so the whole data distribution function is assumed to be a Gaussian mixture, one component per class. Using the training data the algorithm estimates mean vectors and covariance matrices for every class, and then it uses them for prediction.
[Fukunaga90] K. Fukunaga. Introduction to Statistical Pattern Recognition, second ed. New York: Academic Press, 1990.

CvNormalBayesClassifier
Comments from the Wiki
CvNormalBayesClassifier
Bayes classifier for normally distributed data.

class CvNormalBayesClassifier : public CvStatModel
{
public:
    CvNormalBayesClassifier();
    virtual ~CvNormalBayesClassifier();

    CvNormalBayesClassifier( const CvMat* _train_data, const CvMat* _responses,
        const CvMat* _var_idx=0, const CvMat* _sample_idx=0 );

    virtual bool train( const CvMat* _train_data, const CvMat* _responses,
        const CvMat* _var_idx = 0, const CvMat* _sample_idx=0, bool update=false );

    virtual float predict( const CvMat* _samples, CvMat* results=0 ) const;
    virtual void clear();

    virtual void save( const char* filename, const char* name=0 );
    virtual void load( const char* filename, const char* name=0 );

    virtual void write( CvFileStorage* storage, const char* name );
    virtual void read( CvFileStorage* storage, CvFileNode* node );
protected:
    ...
};
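A minimal usage sketch (illustrative only; train_data and responses prepared as in the layout example above):

CvNormalBayesClassifier bayes;
bayes.train( train_data, responses );   // integer class labels in responses
CvMat* test = cvCreateMat( 1, 2, CV_32FC1 );
test->data.fl[0] = 1.f; test->data.fl[1] = 1.f;
float cls = bayes.predict( test );      // the most probable class label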

CvNormalBayesClassifier::train
Comments from the Wiki
bool CvNormalBayesClassifier::train(const CvMat* _train_data, const CvMat* _responses, const CvMat* _var_idx=0, const CvMat* _sample_idx=0, bool update=false)
Trains the model.
The method trains the Normal Bayes classifier. It follows the conventions of the generic train "method" with the following limitations: only the CV_ROW_SAMPLE data layout is supported; the input variables are all ordered; the output variable is categorical (i.e. elements of _responses must be integer numbers, though the vector may have the CV_32FC1 type); and missing measurements are not supported. In addition, there is an update flag that identifies whether the model should be trained from scratch (update=false) or should be updated using the new training data (update=true).

CvNormalBayesClassifier::predict
Comments from the Wiki
float CvNormalBayesClassifier::predict(const CvMat* samples, CvMat* results=0) const
Predicts the response for sample(s).
The method predict estimates the most probable classes for the input vectors. The input vectors (one or more) are stored as rows of the matrix samples. In the case of multiple input vectors there should be one output vector results. The predicted class for a single input vector is returned by the method.

K Nearest Neighbors
The algorithm caches all of the training samples, and predicts the response for a new sample by analyzing a certain number (K) of the nearest neighbors of the sample (using voting, calculating a weighted sum etc.). The method is sometimes referred to as "learning by example", because for prediction it looks for the feature vector with a known response that is closest to the given vector.

CvKNearest
Comments from the Wiki
CvKNearest
K Nearest Neighbors model.

class CvKNearest : public CvStatModel
{
public:
    CvKNearest();
    virtual ~CvKNearest();

    CvKNearest( const CvMat* _train_data, const CvMat* _responses,
                const CvMat* _sample_idx=0, bool _is_regression=false, int max_k=32 );

    virtual bool train( const CvMat* _train_data, const CvMat* _responses,
                        const CvMat* _sample_idx=0, bool is_regression=false,
                        int _max_k=32, bool _update_base=false );

    virtual float find_nearest( const CvMat* _samples, int k, CvMat* results=0,
        const float** neighbors=0, CvMat* neighbor_responses=0, CvMat* dist=0 ) const;

    virtual void clear();
    int get_max_k() const;
    int get_var_count() const;
    int get_sample_count() const;
    bool is_regression() const;

protected:
    ...
};

CvKNearest::train
Comments from the Wiki
bool CvKNearest::train(const CvMat* _train_data, const CvMat* _responses, const CvMat* _sample_idx=0, bool is_regression=false, int _max_k=32, bool _update_base=false)
Trains the model.
The method trains the K-Nearest model. It follows the conventions of the generic train "method" with the following limitations: only the CV_ROW_SAMPLE data layout is supported, the input variables are all ordered, the output variables can be either categorical (is_regression=false) or ordered (is_regression=true), variable subsets (var_idx) and missing measurements are not supported.
The parameter _max_k specifies the number of maximum neighbors that may be passed to the method find_nearest. The parameter _update_base specifies whether the model is trained from scratch (_update_base=false), or it is updated using the new training data (_update_base=true). In the latter case the parameter _max_k must not be larger than the original value.

CvKNearest::find_nearest
Comments from the Wiki
float CvKNearest::find_nearest(const CvMat* _samples, int k, CvMat* results=0, const float** neighbors=0, CvMat* neighbor_responses=0, CvMat* dist=0) const
Finds the neighbors for the input vectors.
For each input vector (which are the rows of the matrix _samples) the method finds the k nearest neighbors. In the case of regression, the predicted result will be a mean value of the particular vector's neighbor responses. In the case of classification the class is determined by voting.
For custom classification/regression prediction, the method can optionally return pointers to the neighbor vectors themselves (neighbors, an array of k*_samples->rows pointers), their corresponding output values (neighbor_responses, a vector of k*_samples->rows elements) and the distances from the input vectors to the neighbors (dist, also a vector of k*_samples->rows elements).
For each input vector the neighbors are sorted by their distances to the vector.
If only a single input vector is passed, all output matrices are optional and the predicted value is returned by the method.

#include "ml.h"
#include "highgui.h"

int main( int argc, char** argv )
{
    const int K = 10;
    int i, j, k, accuracy;
    float response;
    int train_sample_count = 100;
    CvRNG rng_state = cvRNG(-1);
    CvMat* trainData = cvCreateMat( train_sample_count, 2, CV_32FC1 );
    CvMat* trainClasses = cvCreateMat( train_sample_count, 1, CV_32FC1 );
    IplImage* img = cvCreateImage( cvSize( 500, 500 ), 8, 3 );
    float _sample[2];
    CvMat sample = cvMat( 1, 2, CV_32FC1, _sample );
    cvZero( img );

    CvMat trainData1, trainData2, trainClasses1, trainClasses2;

    // form the training samples
    cvGetRows( trainData, &trainData1, 0, train_sample_count/2 );
    cvRandArr( &rng_state, &trainData1, CV_RAND_NORMAL,
               cvScalar(200,200), cvScalar(50,50) );

    cvGetRows( trainData, &trainData2, train_sample_count/2, train_sample_count );
    cvRandArr( &rng_state, &trainData2, CV_RAND_NORMAL,
               cvScalar(300,300), cvScalar(50,50) );

    cvGetRows( trainClasses, &trainClasses1, 0, train_sample_count/2 );
    cvSet( &trainClasses1, cvScalar(1) );

    cvGetRows( trainClasses, &trainClasses2, train_sample_count/2, train_sample_count );
    cvSet( &trainClasses2, cvScalar(2) );

    // learn classifier
    CvKNearest knn( trainData, trainClasses, 0, false, K );
    CvMat* nearests = cvCreateMat( 1, K, CV_32FC1 );

    for( i = 0; i < img->height; i++ )
    {
        for( j = 0; j < img->width; j++ )
        {
            sample.data.fl[0] = (float)j;
            sample.data.fl[1] = (float)i;

            // estimates the response and get the neighbors' labels
            response = knn.find_nearest( &sample, K, 0, 0, nearests, 0 );

            // compute the number of neighbors representing the majority
            for( k = 0, accuracy = 0; k < K; k++ )
            {
                if( nearests->data.fl[k] == response )
                    accuracy++;
            }
            // highlight the pixel depending on the accuracy (or confidence)
            cvSet2D( img, i, j, response == 1 ?
                (accuracy > 5 ? CV_RGB(180,0,0) : CV_RGB(180,120,0)) :
                (accuracy > 5 ? CV_RGB(0,180,0) : CV_RGB(120,120,0)) );
        }
    }

    // display the original training samples
    for( i = 0; i < train_sample_count/2; i++ )
    {
        CvPoint pt;
        pt.x = cvRound(trainData1.data.fl[i*2]);
        pt.y = cvRound(trainData1.data.fl[i*2+1]);
        cvCircle( img, pt, 2, CV_RGB(255,0,0), CV_FILLED );
        pt.x = cvRound(trainData2.data.fl[i*2]);
        pt.y = cvRound(trainData2.data.fl[i*2+1]);
        cvCircle( img, pt, 2, CV_RGB(0,255,0), CV_FILLED );
    }

    cvNamedWindow( "classifier result", 1 );
    cvShowImage( "classifier result", img );
    cvWaitKey(0);

    cvReleaseMat( &trainClasses );
    cvReleaseMat( &trainData );
    return 0;
}

Support Vector Machines
Originally, support vector machines (SVM) was a technique for building an optimal (in some sense) binary (2-class) classifier. Then the technique has been extended to regression and clustering problems. SVM is a partial case of kernel-based methods. It maps feature vectors into a higher-dimensional space using some kernel function, and then it builds an optimal linear discriminating function in this space (or an optimal hyper-plane that fits into the training data). In the case of SVM the kernel is not defined explicitly. Instead, a distance between any 2 points in the hyper-space needs to be defined.
The solution is optimal in the sense that the margin between the separating hyper-plane and the nearest feature vectors from both classes (in the case of a 2-class classifier) is maximal. The feature vectors that are the closest to the hyper-plane are called "support vectors", meaning that the position of the other vectors does not affect the hyper-plane (the decision function).
There are a lot of good references on SVM. Here are only a few ones to start with.
• [Burges98] C. Burges. "A tutorial on support vector machines for pattern recognition", Knowledge Discovery and Data Mining 2(2), 1998 (available online at http://citeseer.ist.psu.edu/burges98tutorial.html ).
• LIBSVM - A Library for Support Vector Machines. By Chih-Chung Chang and Chih-Jen Lin ( http://www.csie.ntu.edu.tw/~cjlin/libsvm/ )

CvSVM
Comments from the Wiki
CvSVM
Support Vector Machines.

class CvSVM : public CvStatModel
{
public:
    // SVM type
    enum { C_SVC=100, NU_SVC=101, ONE_CLASS=102, EPS_SVR=103, NU_SVR=104 };
    // SVM kernel type
    enum { LINEAR=0, POLY=1, RBF=2, SIGMOID=3 };
    // SVM params type
    enum { C=0, GAMMA=1, P=2, NU=3, COEF=4, DEGREE=5 };

    CvSVM();
    virtual ~CvSVM();

    CvSVM( const CvMat* _train_data, const CvMat* _responses,
           const CvMat* _var_idx=0, const CvMat* _sample_idx=0,
           CvSVMParams _params=CvSVMParams() );

    virtual bool train( const CvMat* _train_data, const CvMat* _responses,
                        const CvMat* _var_idx=0, const CvMat* _sample_idx=0,
                        CvSVMParams _params=CvSVMParams() );

    virtual bool train_auto( const CvMat* _train_data, const CvMat* _responses,
        const CvMat* _var_idx, const CvMat* _sample_idx, CvSVMParams _params,
        int k_fold = 10,
        CvParamGrid C_grid      = get_default_grid(CvSVM::C),
        CvParamGrid gamma_grid  = get_default_grid(CvSVM::GAMMA),
        CvParamGrid p_grid      = get_default_grid(CvSVM::P),
        CvParamGrid nu_grid     = get_default_grid(CvSVM::NU),
        CvParamGrid coef_grid   = get_default_grid(CvSVM::COEF),
        CvParamGrid degree_grid = get_default_grid(CvSVM::DEGREE) );

    virtual float predict( const CvMat* _sample ) const;
    virtual int get_support_vector_count() const;
    virtual const float* get_support_vector(int i) const;
    virtual CvSVMParams get_params() const { return params; };
    virtual void clear();

    static CvParamGrid get_default_grid( int param_id );

    virtual void save( const char* filename, const char* name=0 );
    virtual void load( const char* filename, const char* name=0 );

    virtual void write( CvFileStorage* storage, const char* name );
    virtual void read( CvFileStorage* storage, CvFileNode* node );
    int get_var_count() const { return var_idx ? var_idx->cols : var_all; }

protected:
    ...
};
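A minimal sketch of a typical CvSVM session (not from the original reference; train_data, responses and test_sample are assumed to be prepared as described for the generic train method):

CvSVMParams params;
params.svm_type    = CvSVM::C_SVC;
params.kernel_type = CvSVM::RBF;
params.gamma       = 0.5;
params.C           = 10;
params.term_crit   = cvTermCriteria( CV_TERMCRIT_ITER+CV_TERMCRIT_EPS, 1000, 1e-6 );

CvSVM svm;
svm.train( train_data, responses, 0, 0, params ); // CV_32FC1 samples as rows
float label = svm.predict( test_sample );         // one CV_32FC1 row vector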

svm_type=CvSVM::NU_SVC ). nu . maximal index such. the parameter gamma takes the values in the set ( . degree from CvSVMParams . for example. is gamma_grid. It follows the conventions of the generic train “method” with the following limitations: only the CV _ ROW _ SAMPLE data layout is supported. All the other parameters are gathered in CvSVMParams structure.min_val . or ordered ( _params. one subset being used to train the model. // for CV_SVM_NU_SVC. // for poly/sigmoid double C. The structure must be initialized and passed to the training method of CvSVM . Paramete o k_fold– Cross-validation parameter.svm_type=CvSVM::EPS_SVR or _params. // for poly gamma. CvParamGrid nu_grid = get_default_grid(CvSVM::NU). int k_fold = 10. kernel_type.step . p . and is the . The method trains the SVM model automatically by choosing the optimal parameters C . ) where is gamma_grid. that So step must always be greater than 1. CvSVMParams _params=CvSVMParams()) Trains SVM. . degree. CvParamGrid C_grid = get_default_grid(CvSVM::C). or not required at all ( _params. // for CV_SVM_C_SVC CvTermCriteria term_crit.int int double double double svm_type. . and CV_SVM_NU_SVR double p. const CvMat* _var_idx.. // for poly/rbf/sigmoid coef0. By optimal one means that the cross-validation estimate of the test set error is minimal. The training set is divided into rs: k_fold subsets. CvSVM::train_auto Comments from the Wiki train_auto(const CvMat* _train_data. CV_SVM_ONE_CLASS. const CvMat* _var_idx=0. const CvMat* _responses. // termination criteria }.svm_type=CvSVM::NU_SVR ). the input variables are all ordered.svm_type=CvSVM::C_SVC or _params. CvParamGrid p_grid = get_default_grid(CvSVM::P).. the others forming the test set. . The parameters are iterated by a logarithmic grid. CvParamGrid degree_grid = get_default_grid(CvSVM::DEGREE)) Trains SVM with optimal parameters. const CvMat* _responses. CvParamGrid gamma_grid = get_default_grid(CvSVM::GAMMA). gamma . The method trains the SVM model. the SVM algorithm is executed k_fold times. CvSVM::train Comments from the Wiki bool CvSVM::train(const CvMat* _train_data.svm_type=CvSVM::ONE_CLASS ). CvSVMParams params. missing measurements are not supported. // for CV_SVM_C_SVC. const CvMat* _sample_idx. So. coef0 . const CvMat* _sample_idx=0. CV_SVM_EPS_SVR and CV_SVM_NU_SVR double nu. // for CV_SVM_EPS_SVR CvMat* class_weights. CvParamGrid coef_grid = get_default_grid(CvSVM::COEF). the output variables can be either categorical ( _params.

If there is no need in optimization of some parameter, the according grid step should be set to any value less than or equal to 1. For example, to avoid optimization in gamma one should set gamma_grid.step = 0, gamma_grid.min_val and gamma_grid.max_val being arbitrary numbers. In this case, the value params.gamma will be taken for gamma.
And, finally, if the optimization in some parameter is required, but there is no idea of the corresponding grid, one may call the function CvSVM::get_default_grid. In order to generate a grid, say, for gamma, call CvSVM::get_default_grid(CvSVM::GAMMA).
This function works for the case of classification (params.svm_type=CvSVM::C_SVC or params.svm_type=CvSVM::NU_SVC) as well as for the regression (params.svm_type=CvSVM::EPS_SVR or params.svm_type=CvSVM::NU_SVR). If params.svm_type=CvSVM::ONE_CLASS, no optimization is made and the usual SVM with the parameters specified in params is executed.

cv::CvSVM::get_default_grid
Comments from the Wiki
CvParamGrid CvSVM::get_default_grid(int param_id)
Generates a grid for the SVM parameters.
Parameters:
o param_id – Must be one of the following: CvSVM::C, CvSVM::GAMMA, CvSVM::P, CvSVM::NU, CvSVM::COEF, CvSVM::DEGREE. The grid will be generated for the parameter with this ID.
The function generates a grid for the specified parameter of the SVM algorithm. The grid may be passed to the function CvSVM::train_auto.

cv::CvSVM::get_params
Comments from the Wiki
CvSVMParams CvSVM::get_params() const
Returns the current SVM parameters.
This function may be used to get the optimal parameters that were obtained while automatically training CvSVM::train_auto.

cv::CvSVM::get_support_vector*
Comments from the Wiki
int CvSVM::get_support_vector_count() const
const float* CvSVM::get_support_vector(int i) const
Retrieves the number of support vectors and the particular vector.
The methods can be used to retrieve the set of support vectors.

Decision Trees
The ML classes discussed in this section implement Classification And Regression Tree algorithms, which are described in [Breiman84].
The class CvDTree represents a single decision tree that may be used alone, or as a base class in tree ensembles (see Boosting and Random Trees).

A decision tree is a binary tree (i.e. a tree where each non-leaf node has exactly 2 child nodes). It can be used either for classification, when each tree leaf is marked with some class label (multiple leafs may have the same label), or for regression, when each tree leaf is also assigned a constant (so the approximation function is piecewise constant).

Predicting with Decision Trees

To reach a leaf node, and to obtain a response for the input feature vector, the prediction procedure starts with the root node. From each non-leaf node the procedure goes to the left (i.e. selects the left child node as the next observed node), or to the right, based on the value of a certain variable, whose index is stored in the observed node. The variable can be either ordered or categorical. In the first case, the variable value is compared with the certain threshold (which is also stored in the node); if the value is less than the threshold, the procedure goes to the left, otherwise, to the right (for example, if the weight is less than 1 kilogram, the procedure goes to the left, else to the right). And in the second case the discrete variable value is tested to see if it belongs to a certain subset of values (also stored in the node) from a limited set of values the variable could take; if yes, the procedure goes to the left, else to the right (for example, if the color is green or red, go to the left, else to the right). That is, in each node, a pair of entities (variable_index, decision_rule (threshold/subset)) is used. This pair is called a split (split on the variable variable_index). Once a leaf node is reached, the value assigned to this node is used as the output of the prediction procedure.
Sometimes, certain features of the input vector are missed (for example, in the darkness it is difficult to determine the object color), and the prediction procedure may get stuck in the certain node (in the mentioned example if the node is split by color). To avoid such situations, decision trees use so-called surrogate splits. That is, in addition to the best "primary" split, every tree node may also be split on one or more other variables with nearly the same results.

Training Decision Trees

The tree is built recursively, starting from the root node. All of the training data (feature vectors and the responses) is used to split the root node. In each node the optimum decision rule (i.e. the best "primary" split) is found based on some criteria (in ML the gini "purity" criteria is used for classification, and the sum of squared errors is used for regression). Then, if necessary, the surrogate splits are found that resemble the results of the primary split on the training data; all of the data is divided using the primary and the surrogate splits (just like it is done in the prediction procedure) between the left and the right child node. Then the procedure recursively splits both left and right nodes. At each node the recursive procedure may stop (i.e. stop splitting the node further) in one of the following cases:
• depth of the tree branch being constructed has reached the specified maximum value.
• number of training samples in the node is less than the specified threshold, when it is not statistically representative to split the node further.
• all the samples in the node belong to the same class (or, in the case of regression, the variation is too small).
• the best split found does not give any noticeable improvement compared to a random choice.
When the tree is built, it may be pruned using a cross-validation procedure, if necessary. That is, some branches of the tree that may lead to the model overfitting are cut off. Normally this procedure is only applied to standalone decision trees, while tree ensembles usually build trees that are small enough and use their own protection schemes against overfitting.

Variable importance

Besides the obvious use of decision trees - prediction, the tree can be also used for various data analysis. One of the key properties of the constructed decision tree algorithms is that it is possible to compute the importance (relative decisive power) of each variable. For example, in a spam filter that uses a set of words occurred in the message as a feature vector, the variable importance rating can be used to determine the most "spam-indicating" words and thus help to keep the dictionary size reasonable.

[Breiman84] Breiman, L., Friedman, J., Olshen, R., and Stone, C. (1984), "Classification and Regression Trees", Wadsworth.

CvDTreeSplit
Decision tree node split.

struct CvDTreeSplit
{
    int var_idx;
    int inversed;
    float quality;
    CvDTreeSplit* next;
    union
    {
        int subset[2];
        struct
        {
            float c;
            int split_point;
        }
        ord;
    };
};

CvDTreeNode
Decision tree node.

struct CvDTreeNode
{
    int class_idx;
    int Tn;
    double value;

    CvDTreeNode* parent;
    CvDTreeNode* left;
    CvDTreeNode* right;

    CvDTreeSplit* split;

    int sample_count;
    int depth;
    ...
};

Other numerous fields of CvDTreeNode are used internally at the training stage.
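To make the node/split layout concrete, the following conceptual sketch (an illustration, not the actual library routine) walks a tree using the fields above, restricted to ordered variables:

// Conceptual sketch of the prediction walk over the structures above,
// for ordered variables only.
const CvDTreeNode* walk_to_leaf( const CvDTreeNode* node, const float* sample )
{
    while( node->left ) // a non-leaf node has both children
    {
        const CvDTreeSplit* split = node->split; // the primary split
        // compare the variable value with the stored threshold
        int dir = sample[split->var_idx] < split->ord.c ? -1 : +1;
        if( split->inversed )
            dir = -dir;
        node = dir < 0 ? node->left : node->right;
    }
    return node; // node->value holds the prediction
}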

CvDTreeParams
Decision tree training parameters.

struct CvDTreeParams
{
    int max_categories;
    int max_depth;
    int min_sample_count;
    int cv_folds;
    bool use_surrogates;
    bool use_1se_rule;
    bool truncate_pruned_tree;
    float regression_accuracy;
    const float* priors;

    CvDTreeParams() : max_categories(10), max_depth(INT_MAX),
        min_sample_count(10), cv_folds(10), use_surrogates(true),
        use_1se_rule(true), truncate_pruned_tree(true),
        regression_accuracy(0.01f), priors(0)
    {}

    CvDTreeParams( int _max_depth, int _min_sample_count,
                   float _regression_accuracy, bool _use_surrogates,
                   int _max_categories, int _cv_folds,
                   bool _use_1se_rule, bool _truncate_pruned_tree,
                   const float* _priors );
};

The structure contains all the decision tree training parameters. There is a default constructor that initializes all the parameters with the default values tuned for a standalone classification tree. Any of the parameters can be overridden then, or the structure may be fully initialized using the advanced variant of the constructor.
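For example, one may take the defaults and override just a few fields; a small sketch with illustrative values:

// Start from the defaults and override a few fields, e.g. for a shallow
// tree without the built-in cross-validation pruning (values are
// illustrative only).
CvDTreeParams params;           // defaults for a standalone classification tree
params.max_depth = 8;
params.min_sample_count = 5;
params.cv_folds = 0;            // disable cross-validation pruning
params.use_surrogates = false;  // no surrogate splits (no missing data)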

CvDTreeTrainData
Decision tree training data and shared data for tree ensembles.

struct CvDTreeTrainData
{
    CvDTreeTrainData();
    CvDTreeTrainData( const CvMat* _train_data, int _tflag,
                      const CvMat* _responses, const CvMat* _var_idx=0,
                      const CvMat* _sample_idx=0, const CvMat* _var_type=0,
                      const CvMat* _missing_mask=0,
                      const CvDTreeParams& _params=CvDTreeParams(),
                      bool _shared=false, bool _add_labels=false );
    virtual ~CvDTreeTrainData();

    virtual void set_data( const CvMat* _train_data, int _tflag,
                           const CvMat* _responses, const CvMat* _var_idx=0,
                           const CvMat* _sample_idx=0, const CvMat* _var_type=0,
                           const CvMat* _missing_mask=0,
                           const CvDTreeParams& _params=CvDTreeParams(),
                           bool _shared=false, bool _add_labels=false,
                           bool _update_data=false );

    virtual void get_vectors( const CvMat* _subsample_idx,
                              float* values, uchar* missing, float* responses,
                              bool get_class_idx=false );

    virtual CvDTreeNode* subsample_data( const CvMat* _subsample_idx );

    virtual void write_params( CvFileStorage* fs );
    virtual void read_params( CvFileStorage* fs, CvFileNode* node );

    // release all the data
    virtual void clear();

    int get_num_classes() const;
    int get_var_type(int vi) const;
    int get_work_var_count() const;

    virtual int* get_class_labels( CvDTreeNode* n );
    virtual float* get_ord_responses( CvDTreeNode* n );
    virtual int* get_labels( CvDTreeNode* n );
    virtual int* get_cat_var_data( CvDTreeNode* n, int vi );
    virtual CvPair32s32f* get_ord_var_data( CvDTreeNode* n, int vi );
    virtual int get_child_buf_idx( CvDTreeNode* n );

    ////////////////////////////////////

    virtual bool set_params( const CvDTreeParams& params );
    virtual CvDTreeNode* new_node( CvDTreeNode* parent, int count,
                                   int storage_idx, int offset );

    virtual CvDTreeSplit* new_split_ord( int vi, float cmp_val,
                                         int split_point, int inversed,
                                         float quality );
    virtual CvDTreeSplit* new_split_cat( int vi, float quality );
    virtual void free_node_data( CvDTreeNode* node );
    virtual void free_train_data();
    virtual void free_node( CvDTreeNode* node );

    int sample_count, var_all, var_count, max_c_count;
    int ord_var_count, cat_var_count;
    bool have_labels, have_priors;
    bool is_classifier;

    int buf_count, buf_size;
    bool shared;

    CvMat* cat_count;

    CvMat* cat_ofs;
    CvMat* cat_map;

    CvMat* counts;
    CvMat* buf;
    CvMat* direction;
    CvMat* split_buf;

    CvMat* var_idx;
    CvMat* var_type; // i-th element =
                     //   k<0  - ordered
                     //   k>=0 - categorical, see k-th element of cat_* arrays
    CvMat* priors;

    CvDTreeParams params;

    CvMemStorage* tree_storage;
    CvMemStorage* temp_storage;

    CvDTreeNode* data_root;

    CvSet* node_heap;
    CvSet* split_heap;
    CvSet* cv_heap;
    CvSet* nv_heap;

    CvRNG rng;
};

This structure is mostly used internally for storing both standalone trees and tree ensembles efficiently. Basically, it contains 3 types of information:

1 The training parameters, an instance of CvDTreeParams.
2 The training data, preprocessed in order to find the best splits more efficiently. For tree ensembles this preprocessed data is reused by all the trees. Additionally, the training data characteristics that are shared by all trees in the ensemble are stored here: variable types, the number of classes, class label compression map etc.
3 Buffers, memory storages for tree nodes, splits and other elements of the constructed trees.

There are 2 ways of using this structure. In simple cases (e.g. a standalone tree, or the ready-to-use "black box" tree ensemble from ML, like Random Trees or Boosting) there is no need to care or even to know about the structure - just construct the needed statistical model, train it and use it. The CvDTreeTrainData structure will be constructed and used internally. However, for custom tree algorithms, or other sophisticated cases, the structure may be constructed and used explicitly. The scheme is the following (see also the sketch after this list):

• The structure is initialized using the default constructor, followed by set_data (or it is built using the full form of the constructor). The parameter _shared must be set to true.
• One or more trees are trained using this data, see the special form of the method CvDTree::train.
• Finally, the structure can be released only after all the trees using it are released.
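A hedged sketch of this explicit scheme; train_data, responses and subsample_idx are assumed to be prepared elsewhere:

// Several trees share one preprocessed CvDTreeTrainData instance.
CvDTreeTrainData shared_data;
shared_data.set_data( train_data, CV_ROW_SAMPLE, responses,
                      0, 0, 0, 0, CvDTreeParams(),
                      true /* _shared must be set to true */ );

CvDTree tree1, tree2;
tree1.train( &shared_data, 0 );             // train on all the samples
tree2.train( &shared_data, subsample_idx ); // train on a subset
// ... the trees must be released before shared_data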

CvDTree
Decision tree.

class CvDTree : public CvStatModel
{
public:
    CvDTree();
    virtual ~CvDTree();

    virtual bool train( const CvMat* _train_data, int _tflag,
                        const CvMat* _responses, const CvMat* _var_idx=0,
                        const CvMat* _sample_idx=0, const CvMat* _var_type=0,
                        const CvMat* _missing_mask=0,
                        CvDTreeParams params=CvDTreeParams() );

    virtual bool train( CvDTreeTrainData* _train_data,
                        const CvMat* _subsample_idx );

    virtual CvDTreeNode* predict( const CvMat* _sample,
                                  const CvMat* _missing_data_mask=0,
                                  bool raw_mode=false ) const;
    virtual const CvMat* get_var_importance();
    virtual void clear();

    virtual void read( CvFileStorage* fs, CvFileNode* node );
    virtual void write( CvFileStorage* fs, const char* name );

    // special read & write methods for trees in the tree ensembles
    virtual void read( CvFileStorage* fs, CvFileNode* node,
                       CvDTreeTrainData* data );
    virtual void write( CvFileStorage* fs );

    const CvDTreeNode* get_root() const;
    int get_pruned_tree_idx() const;
    CvDTreeTrainData* get_data();

protected:

    virtual bool do_train( const CvMat* _subsample_idx );

    virtual void try_split_node( CvDTreeNode* n );
    virtual void split_node_data( CvDTreeNode* n );
    virtual CvDTreeSplit* find_best_split( CvDTreeNode* n );
    virtual CvDTreeSplit* find_split_ord_class( CvDTreeNode* n, int vi );
    virtual CvDTreeSplit* find_split_cat_class( CvDTreeNode* n, int vi );
    virtual CvDTreeSplit* find_split_ord_reg( CvDTreeNode* n, int vi );
    virtual CvDTreeSplit* find_split_cat_reg( CvDTreeNode* n, int vi );
    virtual CvDTreeSplit* find_surrogate_split_ord( CvDTreeNode* n, int vi );
    virtual CvDTreeSplit* find_surrogate_split_cat( CvDTreeNode* n, int vi );
    virtual double calc_node_dir( CvDTreeNode* node );
    virtual void complete_node_dir( CvDTreeNode* node );
    virtual void cluster_categories( const int* vectors, int vector_count,
                                     int var_count, int* sums, int k,
                                     int* cluster_labels );

    virtual void calc_node_value( CvDTreeNode* node );

    virtual void prune_cv();
    virtual double update_tree_rnc( int T, int fold );
    virtual int cut_tree( int T, int fold, double min_alpha );
    virtual void free_prune_data(bool cut_tree);
    virtual void free_tree();

    virtual void write_node( CvFileStorage* fs, CvDTreeNode* node );
    virtual void write_split( CvFileStorage* fs, CvDTreeSplit* split );
    virtual CvDTreeNode* read_node( CvFileStorage* fs, CvFileNode* node,
                                    CvDTreeNode* parent );
    virtual CvDTreeSplit* read_split( CvFileStorage* fs, CvFileNode* node );
    virtual void write_tree_nodes( CvFileStorage* fs );
    virtual void read_tree_nodes( CvFileStorage* fs, CvFileNode* node );

    CvDTreeNode* root;

    int pruned_tree_idx;
    CvMat* var_importance;

    CvDTreeTrainData* data;
};

CvDTree::train
bool CvDTree::train(const CvMat* _train_data, int _tflag, const CvMat* _responses, const CvMat* _var_idx=0, const CvMat* _sample_idx=0, const CvMat* _var_type=0, const CvMat* _missing_mask=0, CvDTreeParams params=CvDTreeParams())
bool CvDTree::train(CvDTreeTrainData* _train_data, const CvMat* _subsample_idx)
Trains a decision tree.

There are 2 train methods in CvDTree. The first method follows the generic CvStatModel::train conventions; it is the most complete form. Both data layouts (_tflag=CV_ROW_SAMPLE and _tflag=CV_COL_SAMPLE) are supported, as well as sample and variable subsets, missing measurements, arbitrary combinations of input and output variable types etc. The last parameter contains all of the necessary training parameters, see the CvDTreeParams description.

The second method train is mostly used for building tree ensembles. It takes the pre-constructed CvDTreeTrainData instance and an optional subset of the training set. The indices in _subsample_idx are counted relatively to the _sample_idx passed to the CvDTreeTrainData constructor. For example, if _sample_idx=[1, 5, 7, 100], then _subsample_idx=[0,3] means that the samples [1, 100] of the original training set are used.

CvDTree::predict
CvDTreeNode* CvDTree::predict(const CvMat* _sample, const CvMat* _missing_data_mask=0, bool raw_mode=false) const
Returns the leaf node of the decision tree corresponding to the input vector.

The method takes the feature vector and the optional missing measurement mask on input, traverses the decision tree and returns the reached leaf node as output. The prediction result, either the class label or the estimated function value, may be retrieved as the value field of the CvDTreeNode structure, for example: dtree->predict(sample,mask)->value.

The last parameter is normally set to false, implying a regular input. If it is true, the method assumes that all the values of the discrete input variables have already been normalized to 0-based ranges (as the decision tree uses such a normalized representation internally). It is useful for faster prediction with tree ensembles. For ordered input variables the flag is not used.

Example: Building a Tree for Classifying Mushrooms. See the mushroom.cpp sample that demonstrates how to build and use the decision tree.
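For a compact illustration of the train/predict pair, here is a hedged sketch; all the matrices are assumed to be prepared elsewhere:

// Train on prepared data and classify one sample.
CvDTree dtree;
dtree.train( train_data, CV_ROW_SAMPLE, responses,
             0, 0, var_type, missing_mask, CvDTreeParams() );

CvDTreeNode* leaf = dtree.predict( sample, sample_missing_mask );
double prediction = leaf->value; // class label or regression estimate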

Boosting

A common machine learning task is supervised learning. In supervised learning, the goal is to learn the functional relationship F: y = F(x) between the input x and the output y. Predicting the qualitative output is called classification, while predicting the quantitative output is called regression.

Boosting is a powerful learning concept, which provides a solution to the supervised classification learning task. It combines the performance of many "weak" classifiers to produce a powerful "committee" [HTF01]. A weak classifier is only required to be better than chance, and thus can be very simple and computationally inexpensive. Many of them smartly combined, however, result in a strong classifier, which often outperforms most "monolithic" strong classifiers such as SVMs and Neural Networks.

Decision trees are the most popular weak classifiers used in boosting schemes. Often the simplest decision trees with only a single split node per tree (called stumps) are sufficient.

The boosted model is based on N training examples (x_i, y_i), 1 <= i <= N, with x_i \in R^K and y_i \in {-1, +1}. x_i is a K-component vector. Each component encodes a feature relevant for the learning task at hand. The desired two-class output is encoded as -1 and +1.

Different variants of boosting are known, such as Discrete AdaBoost, Real AdaBoost, LogitBoost, and Gentle AdaBoost [FHT98]. All of them are very similar in their overall structure. Therefore, we will look only at the standard two-class Discrete AdaBoost algorithm as shown in the box below. Each sample is initially assigned the same weight (step 2). Next a weak classifier f_m(x) is trained on the weighted training data (step 3a). Its weighted training error err_m and scaling factor c_m are computed (step 3b). The weights are increased for training samples that have been misclassified (step 3c). All weights are then normalized, and the process of finding the next weak classifier continues for another M-1 times. The final classifier F(x) is the sign of the weighted sum over the individual weak classifiers (step 4).

Two-class Discrete AdaBoost Algorithm: Training (steps 1 to 3) and Evaluation (step 4)

1. Given N examples (x_1, y_1), ..., (x_N, y_N) with x_i \in R^K, y_i \in \{-1, +1\}.
2. Start with weights w_i = 1/N, i = 1, ..., N.
3. Repeat for m = 1, 2, ..., M:
   a. Fit the classifier f_m(x) \in \{-1, 1\}, using weights w_i on the training data.
   b. Compute err_m = E_w[1_{(y \neq f_m(x))}], c_m = \log((1 - err_m)/err_m).
   c. Set w_i \leftarrow w_i \exp(c_m \cdot 1_{(y_i \neq f_m(x_i))}), i = 1, 2, ..., N, and renormalize so that \sum_i w_i = 1.
4. Output the classifier sign(\sum_{m=1}^{M} c_m f_m(x)).

NOTE: As well as the classical boosting methods, the current implementation supports 2-class classifiers only. For M > 2 classes there is the AdaBoost.MH algorithm, described in [FHT98], that reduces the problem to the 2-class problem, yet with a much larger training set.
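To give the weighting rule some concrete numbers: a weak classifier with a weighted error of err_m = 0.25 receives the scaling factor c_m = log((1 - 0.25)/0.25) = log 3 ≈ 1.1, and every sample it misclassified has its weight multiplied by exp(c_m) = 3 before renormalization, so the next weak classifier concentrates on the currently hard samples.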

In order to reduce computation time for boosted models without substantially losing accuracy, the influence trimming technique may be employed. As the training algorithm proceeds and the number of trees in the ensemble is increased, a larger number of the training samples are classified correctly and with increasing confidence, thereby those samples receive smaller weights on the subsequent iterations. Examples with a very low relative weight have a small impact on training of the weak classifier. Thus such examples may be excluded during the weak classifier training without having much effect on the induced classifier. This process is controlled with the weight_trim_rate parameter. Only examples with the summary fraction weight_trim_rate of the total weight mass are used in the weak classifier training. Note that the weights for all training examples are recomputed at each training iteration. Examples deleted at a particular iteration may be used again for learning some of the weak classifiers further [FHT98].

[HTF01] Hastie, T., Tibshirani, R., and Friedman, J. H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Series in Statistics, 2001.
[FHT98] Friedman, J. H., Hastie, T., and Tibshirani, R. Additive Logistic Regression: a Statistical View of Boosting. Technical Report, Dept. of Statistics, Stanford University, 1998.

CvBoostParams
Boosting training parameters.

struct CvBoostParams : public CvDTreeParams
{
    int boost_type;
    int weak_count;
    int split_criteria;
    double weight_trim_rate;

    CvBoostParams();
    CvBoostParams( int boost_type, int weak_count, double weight_trim_rate,
                   int max_depth, bool use_surrogates, const float* priors );
};

The structure is derived from CvDTreeParams, but not all of the decision tree parameters are supported. In particular, cross-validation is not supported.

CvBoostTree
Weak tree classifier.

class CvBoostTree: public CvDTree
{
public:
    CvBoostTree();
    virtual ~CvBoostTree();

    virtual bool train( CvDTreeTrainData* _train_data,
                        const CvMat* subsample_idx, CvBoost* ensemble );
    virtual void scale( double s );
    virtual void read( CvFileStorage* fs, CvFileNode* node,
                       CvBoost* ensemble, CvDTreeTrainData* _data );
    virtual void clear();

protected:
    ...
    CvBoost* ensemble;
};

The weak classifier, a component of the boosted tree classifier CvBoost, is a derivative of CvDTree. Normally, there is no need to use the weak classifiers directly; however, they can be accessed as elements of the sequence CvBoost::weak, retrieved by CvBoost::get_weak_predictors. Note that in the case of LogitBoost and Gentle AdaBoost each weak predictor is a regression tree, rather than a classification tree. Even in the case of Discrete AdaBoost and Real AdaBoost the CvBoostTree::predict return value (CvDTreeNode::value) is not the output class label; a negative value "votes" for class # 0, a positive value - for class # 1. And the votes are weighted. The weight of each individual tree may be increased or decreased using the method CvBoostTree::scale.

CvBoost
Boosted tree classifier.

class CvBoost : public CvStatModel
{
public:
    // Boosting type
    enum { DISCRETE=0, REAL=1, LOGIT=2, GENTLE=3 };

    // Splitting criteria
    enum { DEFAULT=0, GINI=1, MISCLASS=3, SQERR=4 };

    CvBoost();
    virtual ~CvBoost();

    CvBoost( const CvMat* _train_data, int _tflag,
             const CvMat* _responses, const CvMat* _var_idx=0,
             const CvMat* _sample_idx=0, const CvMat* _var_type=0,
             const CvMat* _missing_mask=0,
             CvBoostParams params=CvBoostParams() );

    virtual bool train( const CvMat* _train_data, int _tflag,
                        const CvMat* _responses, const CvMat* _var_idx=0,
                        const CvMat* _sample_idx=0, const CvMat* _var_type=0,
                        const CvMat* _missing_mask=0,
                        CvBoostParams params=CvBoostParams(),
                        bool update=false );

    virtual float predict( const CvMat* _sample, const CvMat* _missing=0,
                           CvMat* weak_responses=0, CvSlice slice=CV_WHOLE_SEQ,
                           bool raw_mode=false ) const;

    virtual void prune( CvSlice slice );

    virtual void clear();

    virtual void write( CvFileStorage* storage, const char* name );
    virtual void read( CvFileStorage* storage, CvFileNode* node );

e. const CvMat* _sample_idx=0. . CvBoostParams params. CvSeq* weak. The responses must be categorical. virtual void write_params( CvFileStorage* fs ). The method removes the specified weak classifiers from the sequence. Note that this method should not be confused with the pruning of individual decision trees. the new weak tree classifiers added to the existing ensemble). . virtual void read_params( CvFileStorage* fs. CvBoost::prune Comments from the Wiki void CvBoost::prune(CvSlice slice) Removes the specified weak classifiers. CvFileNode* node ). const CvMat* _var_type=0.e. const CvMat* _var_idx=0.. . or the classifier needs to be rebuilt from scratch. int _tflag. i. boosted trees can not be built for regression. and there should be 2 classes.. The method CvBoost::predict runs the sample through the trees in the ensemble and returns the output class label based on the weighted voting. CvDTreeTrainData* data. CvMat* weak_responses=0. CvBoost::get_weak_predictors Comments from the Wiki CvSeq* CvBoost::get_weak_predictors() Returns the sequence of weak tree classifiers. CvBoost::train Comments from the Wiki bool CvBoost::train(const CvMat* _train_data. CvSlice slice=CV_WHOLE_SEQ. }. bool update=false) Trains a boosted tree classifier. virtual void trim_weights(). const CvMat* _responses. protected: virtual bool set_params( const CvBoostParams& _params ). which is currently not supported.CvSeq* get_weak_predictors(). The train method follows the common template. CvBoost::predict Comments from the Wiki float CvBoost::predict(const CvMat* sample. the last parameter update specifies whether the classifier needs to be updated (i. const CvBoostParams& get_params() const.. const CvMat* _missing_mask=0. virtual void update_weights( CvBoostTree* tree ). CvBoostParams params=CvBoostParams(). const CvMat* missing=0.. bool raw_mode=false) const Predicts a response for the input sample.

Random Trees

Random trees have been introduced by Leo Breiman and Adele Cutler: http://www.stat.berkeley.edu/users/breiman/RandomForests/. The algorithm can deal with both classification and regression problems. Random trees is a collection (ensemble) of tree predictors that is called forest further in this section (the term has also been introduced by L. Breiman). The classification works as follows: the random trees classifier takes the input feature vector, classifies it with every tree in the forest, and outputs the class label that received the majority of "votes". In the case of regression the classifier response is the average of the responses over all the trees in the forest.

All the trees are trained with the same parameters, but on different training sets, which are generated from the original training set using the bootstrap procedure: for each training set we randomly select the same number of vectors as in the original set (=N). The vectors are chosen with replacement. That is, some vectors will occur more than once and some will be absent. At each node of each trained tree, not all the variables are used to find the best split, but a random subset of them. With each node a new subset is generated; however, its size is fixed for all the nodes and all the trees. It is a training parameter, set to the square root of the number of variables by default. None of the trees that are built are pruned.

In random trees there is no need for any accuracy estimation procedures, such as cross-validation or bootstrap, or a separate test set to get an estimate of the training error. The error is estimated internally during the training. When the training set for the current tree is drawn by sampling with replacement, some vectors are left out (the so-called oob (out-of-bag) data). The size of the oob data is about N/3: when N vectors are drawn with replacement, each particular vector is omitted with probability (1 - 1/N)^N, which approaches 1/e ≈ 0.37 for large N. The classification error is estimated by using this oob-data as follows:

• Get a prediction for each vector, which is oob relative to the i-th tree, using the very i-th tree.
• After all the trees have been trained, for each vector that has ever been oob, find the class-"winner" for it (i.e. the class that has got the majority of votes in the trees where the vector was oob) and compare it to the ground-truth response.
• Then the classification error estimate is computed as the ratio of the number of misclassified oob vectors to all the vectors in the original data. In the case of regression the oob-error is computed as the squared error for the oob vectors difference divided by the total number of vectors.

References:

Machine Learning, Wald I, July 2002. http://stat-www.berkeley.edu/users/breiman/wald2002-1.pdf
Looking Inside the Black Box, Wald II, July 2002. http://stat-www.berkeley.edu/users/breiman/wald2002-2.pdf
Software for the Masses, Wald III, July 2002. http://stat-www.berkeley.edu/users/breiman/wald2002-3.pdf
And other articles from the web site http://www.stat.berkeley.edu/users/breiman/RandomForests/cc_home.htm

CvRTParams
Training parameters of Random Trees.

struct CvRTParams : public CvDTreeParams
{
    bool calc_var_importance;
    int nactive_vars;
    CvTermCriteria term_crit;

    CvRTParams() : CvDTreeParams( 5, 10, 0, false, 10, 0, false, false, 0 ),
        calc_var_importance(false), nactive_vars(0)
    {
        term_crit = cvTermCriteria( CV_TERMCRIT_ITER+CV_TERMCRIT_EPS, 50, 0.1 );
    }

    CvRTParams( int _max_depth, int _min_sample_count,
                float _regression_accuracy, bool _use_surrogates,
                int _max_categories, const float* _priors,
                bool _calc_var_importance,
                int _nactive_vars, int max_tree_count,
                float forest_accuracy, int termcrit_type );
};

The set of training parameters for the forest is a superset of the training parameters for a single tree. However, Random trees do not need all the functionality/features of decision trees; most noticeably, the trees are not pruned, so the cross-validation parameters are not used.
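For example, the full constructor may be used as follows (a sketch; all the values are illustrative only):

// Forest parameters with at most 100 trees, variable importance
// calculation enabled and 4 active variables per node.
CvRTParams rt_params( 10 /* max_depth */, 10 /* min_sample_count */,
                      0 /* regression_accuracy */, false /* use_surrogates */,
                      15 /* max_categories */, 0 /* priors */,
                      true /* calc_var_importance */, 4 /* nactive_vars */,
                      100 /* max tree count */, 0.01f /* forest_accuracy */,
                      CV_TERMCRIT_ITER+CV_TERMCRIT_EPS );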

CvRTrees
Random Trees.

class CvRTrees : public CvStatModel
{
public:
    CvRTrees();
    virtual ~CvRTrees();
    virtual bool train( const CvMat* _train_data, int _tflag,
                        const CvMat* _responses, const CvMat* _var_idx=0,
                        const CvMat* _sample_idx=0, const CvMat* _var_type=0,
                        const CvMat* _missing_mask=0,
                        CvRTParams params=CvRTParams() );
    virtual float predict( const CvMat* sample,
                           const CvMat* missing = 0 ) const;
    virtual void clear();

    virtual const CvMat* get_var_importance();
    virtual float get_proximity( const CvMat* sample_1,
                                 const CvMat* sample_2 ) const;

    virtual void read( CvFileStorage* fs, CvFileNode* node );
    virtual void write( CvFileStorage* fs, const char* name );

    CvMat* get_active_var_mask();
    CvRNG* get_rng();

    int get_tree_count() const;
    CvForestTree* get_tree(int i) const;

protected:

    virtual bool grow_forest( const CvTermCriteria term_crit );

    // array of the trees of the forest
    CvForestTree** trees;
    CvDTreeTrainData* data;
    int ntrees;
    int nclasses;
    ...
};

CvRTrees::train
bool CvRTrees::train(const CvMat* train_data, int tflag, const CvMat* responses, const CvMat* comp_idx=0, const CvMat* sample_idx=0, const CvMat* var_type=0, const CvMat* missing_mask=0, CvRTParams params=CvRTParams())
Trains the Random Trees model.

The method CvRTrees::train is very similar to the first form of CvDTree::train() and follows the generic method CvStatModel::train conventions. All of the training parameters specific to the algorithm are passed as a CvRTParams instance. The estimate of the training error (oob-error) is stored in the protected class member oob_error.

CvRTrees::predict
double CvRTrees::predict(const CvMat* sample, const CvMat* missing=0) const
Predicts the output for the input sample.

The input parameters of the prediction method are the same as in CvDTree::predict, but the return value type is different. This method returns the cumulative result from all the trees in the forest (the class that receives the majority of voices, or the mean of the regression function estimates).

CvRTrees::get_var_importance
const CvMat* CvRTrees::get_var_importance() const
Retrieves the variable importance array.

The method returns the variable importance vector, computed at the training stage when CvRTParams::calc_var_importance is set. If the training flag is not set, then the NULL pointer is returned. This is unlike decision trees, where variable importance can be computed anytime after the training.

CvRTrees::get_proximity
float CvRTrees::get_proximity(const CvMat* sample_1, const CvMat* sample_2) const
Retrieves the proximity measure between two training samples.

The method returns the proximity measure between any two samples: the ratio of those trees in the ensemble, in which the samples fall into the same leaf node, to the total number of the trees.

Example: Prediction of mushroom goodness using the random trees classifier

#include <float.h>
#include <stdio.h>
#include <ctype.h>
#include "ml.h"

int main( void )
{
    CvStatModel*    cls = NULL;
    CvFileStorage*  storage = cvOpenFileStorage( "Mushroom.xml",
                                                 NULL, CV_STORAGE_READ );
    CvMat*          data = (CvMat*)cvReadByName(storage, NULL, "sample", 0 );
    CvMat           train_data, test_data;
    CvMat           response;
    CvMat*          missed = NULL;
    CvMat*          comp_idx = NULL;
    CvMat*          sample_idx = NULL;
    CvMat*          type_mask = NULL;
    int             resp_col = 0;
    int             i, j;
    CvRTreesParams  params;
    CvTreeClassifierTrainParams cart_params;

    const int ntrain_samples = 1000;
    const int ntest_samples  = 1000;
    const int nvars = 23;

    if(data == NULL || data->cols != nvars)
    {
        puts("Error in source data");
        return -1;
    }

    cvGetSubRect( data, &train_data, cvRect(0, 0, nvars, ntrain_samples) );
    cvGetSubRect( data, &test_data, cvRect(0, ntrain_samples, nvars,
                                           ntrain_samples + ntest_samples) );

    resp_col = 0;
    cvGetCol( &train_data, &response, resp_col);

    /* create missed variable matrix */
    missed = cvCreateMat(train_data.rows, train_data.cols, CV_8UC1);
    for( i = 0; i < train_data.rows; i++ )
        for( j = 0; j < train_data.cols; j++ )
            CV_MAT_ELEM(*missed,uchar,i,j) =
                (uchar)(CV_MAT_ELEM(train_data,float,i,j) < 0);

    /* create comp_idx vector */
    comp_idx = cvCreateMat(1, train_data.cols-1, CV_32SC1);
    for( i = 0; i < train_data.cols; i++ )
    {
        if(i<resp_col)CV_MAT_ELEM(*comp_idx,int,0,i) = i;
        if(i>resp_col)CV_MAT_ELEM(*comp_idx,int,0,i-1) = i;
    }

    /* create sample_idx vector */
    sample_idx = cvCreateMat(1, train_data.rows, CV_32SC1);
    for( j = i = 0; i < train_data.rows; i++ )
    {
        if(CV_MAT_ELEM(response,float,i,0) < 0) continue;
        CV_MAT_ELEM(*sample_idx,int,0,j) = i;
        j++;
    }
    sample_idx->cols = j;

    /* create type mask */
    type_mask = cvCreateMat(1, train_data.cols+1, CV_8UC1);
    cvSet( type_mask, cvRealScalar(CV_VAR_CATEGORICAL), 0);

    // initialize training parameters
    cvSetDefaultParamTreeClassifier((CvStatModelParams*)&cart_params);
    cart_params.wrong_feature_as_unknown = 1;
    params.tree_params = &cart_params;
    params.term_crit.max_iter = 50;
    params.term_crit.epsilon = 0.1;
    params.term_crit.type = CV_TERMCRIT_ITER|CV_TERMCRIT_EPS;

    puts("Random forest results");
    cls = cvCreateRTreesClassifier( &train_data, CV_ROW_SAMPLE, &response,
                                    (CvStatModelParams*)&params, comp_idx,
                                    sample_idx, type_mask, missed );
    if( cls )
    {
        CvMat sample = cvMat( 1, nvars, CV_32FC1, test_data.data.fl );
        CvMat test_resp;
        int wrong = 0, total = 0;
        cvGetCol( &test_data, &test_resp, resp_col);
        for( i = 0; i < ntest_samples; i++, sample.data.fl += nvars )
        {
            if( CV_MAT_ELEM(test_resp,float,i,0) >= 0 )
            {
                float resp = cls->predict( cls, &sample, NULL );
                wrong += (fabs(resp-response.data.fl[i]) > 1e-3 ) ? 1 : 0;
                total++;
            }
        }
        printf( "Test set error = %.2f\n", wrong*100.f/(float)total );
    }
    else
        puts("Error forest creation");

    cvReleaseMat(&missed);
    cvReleaseMat(&sample_idx);
    cvReleaseMat(&comp_idx);

    cvReleaseMat(&type_mask);
    cvReleaseMat(&data);
    cvReleaseStatModel(&cls);
    cvReleaseFileStorage(&storage);
    return 0;
}

Expectation-Maximization

The EM (Expectation-Maximization) algorithm estimates the parameters of the multivariate probability density function in the form of a Gaussian mixture distribution with a specified number of mixtures.

Consider the set of the feature vectors x_1, x_2, ..., x_N: N vectors from a d-dimensional Euclidean space drawn from a Gaussian mixture:

p(x) = \sum_{k=1}^{m} \pi_k p_k(x), \quad \pi_k \ge 0, \quad \sum_{k=1}^{m} \pi_k = 1,

p_k(x) = \varphi(x; a_k, S_k) = \frac{1}{(2\pi)^{d/2} |S_k|^{1/2}} \exp\left( -\frac{1}{2} (x - a_k)^T S_k^{-1} (x - a_k) \right),

where m is the number of mixtures, p_k is the normal distribution density with the mean a_k and covariance matrix S_k, and \pi_k is the weight of the k-th mixture. Given the number of mixtures m and the samples {x_i, i = 1..N}, the algorithm finds the maximum-likelihood estimates (MLE) of all the mixture parameters, i.e. a_k, S_k and \pi_k.

The EM algorithm is an iterative procedure. Each iteration of it includes two steps. At the first step (Expectation-step, or E-step), we find a probability p_{i,k} (denoted \alpha_{k,i} in the formula below) of sample i to belong to mixture k using the currently available mixture parameter estimates:

\alpha_{k,i} = \frac{\pi_k \varphi(x_i; a_k, S_k)}{\sum_{j=1}^{m} \pi_j \varphi(x_i; a_j, S_j)}.

At the second step (Maximization-step, or M-step), the mixture parameter estimates are refined using the computed probabilities:

\pi_k = \frac{1}{N} \sum_{i=1}^{N} \alpha_{k,i}, \quad a_k = \frac{\sum_i \alpha_{k,i} x_i}{\sum_i \alpha_{k,i}}, \quad S_k = \frac{\sum_i \alpha_{k,i} (x_i - a_k)(x_i - a_k)^T}{\sum_i \alpha_{k,i}}.

Alternatively, the algorithm may start with the M-step when the initial values for p_{i,k} can be provided. Another alternative, when p_{i,k} are unknown, is to use a simpler clustering algorithm to pre-cluster the input samples and thus obtain the initial p_{i,k}. Often (and in ML) the KMeans2 algorithm is used for that purpose.

One of the main problems that the EM algorithm should deal with is the large number of parameters to estimate. The majority of the parameters sit in the covariance matrices, which are d×d elements each (where d is the feature space dimensionality).

However, in many practical problems the covariance matrices are close to diagonal, or even to \mu_k I, where I is the identity matrix and \mu_k is a mixture-dependent "scale" parameter. So a robust computation scheme could be to start with harder constraints on the covariance matrices and then use the estimated parameters as an input for a less constrained optimization problem (often a diagonal covariance matrix is already a good enough approximation).

References:
• [Bilmes98] J. A. Bilmes. A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models. Technical Report TR-97-021, International Computer Science Institute and Computer Science Division, University of California at Berkeley, April 1998.

CvEMParams
Parameters of the EM algorithm.

struct CvEMParams
{
    CvEMParams() : nclusters(10), cov_mat_type(CvEM::COV_MAT_DIAGONAL),
        start_step(CvEM::START_AUTO_STEP), probs(0), weights(0),
        means(0), covs(0)
    {
        term_crit=cvTermCriteria( CV_TERMCRIT_ITER+CV_TERMCRIT_EPS,
                                  100, FLT_EPSILON );
    }

    CvEMParams( int _nclusters, int _cov_mat_type=1/*CvEM::COV_MAT_DIAGONAL*/,
                int _start_step=0/*CvEM::START_AUTO_STEP*/,
                CvTermCriteria _term_crit=cvTermCriteria(
                        CV_TERMCRIT_ITER+CV_TERMCRIT_EPS, 100, FLT_EPSILON),
                CvMat* _probs=0, CvMat* _weights=0,
                CvMat* _means=0, CvMat** _covs=0 ) :
        nclusters(_nclusters), cov_mat_type(_cov_mat_type),
        start_step(_start_step), probs(_probs), weights(_weights),
        means(_means), covs(_covs), term_crit(_term_crit)
    {}

    int nclusters;
    int cov_mat_type;
    int start_step;
    const CvMat* probs;
    const CvMat* weights;
    const CvMat* means;
    const CvMat** covs;
    CvTermCriteria term_crit;
};

The structure has 2 constructors: the default one represents a rough rule-of-thumb; with the other one it is possible to override a variety of parameters, from a single number of mixtures (the only essential problem-dependent parameter) to the initial values for the mixture parameters.

CvEM
EM model.

class CV_EXPORTS CvEM : public CvStatModel
{
public:
    // Type of covariance matrices
    enum { COV_MAT_SPHERICAL=0, COV_MAT_DIAGONAL=1, COV_MAT_GENERIC=2 };

    // The initial step
    enum { START_E_STEP=1, START_M_STEP=2, START_AUTO_STEP=0 };

    CvEM();
    CvEM( const CvMat* samples, const CvMat* sample_idx=0,
          CvEMParams params=CvEMParams(), CvMat* labels=0 );
    virtual ~CvEM();

    virtual bool train( const CvMat* samples, const CvMat* sample_idx=0,
                        CvEMParams params=CvEMParams(),
                        CvMat* labels=0 );

    virtual float predict( const CvMat* sample, CvMat* probs ) const;
    virtual void clear();

    int get_nclusters() const { return params.nclusters; }
    const CvMat* get_means() const { return means; }
    const CvMat** get_covs() const { return covs; }
    const CvMat* get_weights() const { return weights; }
    const CvMat* get_probs() const { return probs; }

protected:

    virtual void set_params( const CvEMParams& params,
                             const CvVectors& train_data );
    virtual void init_em( const CvVectors& train_data );
    virtual double run_em( const CvVectors& train_data );
    virtual void init_auto( const CvVectors& samples );
    virtual void kmeans( const CvVectors& train_data, int nclusters,
                         CvMat* labels, CvTermCriteria criteria,
                         const CvMat* means );

    CvEMParams params;
    double log_likelihood;

    CvMat* means;
    CvMat** covs;
    CvMat* weights;
    CvMat* probs;

    CvMat* log_weight_div_det;
    CvMat* inv_eigen_values;
    CvMat** cov_rotate_mats;
};

CvEM::train
void CvEM::train(const CvMat* samples, const CvMat* sample_idx=0, CvEMParams params=CvEMParams(), CvMat* labels=0)
Estimates the Gaussian mixture parameters from the sample set.

Unlike many of the ML models, EM is an unsupervised learning algorithm and it does not take responses (class labels or function values) on input. Instead, it computes the MLE of the Gaussian mixture parameters from the input sample set and stores all the parameters inside the structure: p_{i,k} in probs, a_k in means, S_k in covs[k], \pi_k in weights. Optionally, it computes the output "class label" for each sample, labels_i = arg max_k (p_{i,k}), i.e. the indices of the most-probable mixture for each sample.

The trained model can be used further for prediction, just like any other classifier. The model trained is similar to the Bayes classifier.

Example: Clustering random samples of a multi-Gaussian distribution using EM

#include "ml.h"
#include "highgui.h"

int main( int argc, char** argv )
{
    const int N = 4;
    const int N1 = (int)sqrt((double)N);
    const CvScalar colors[] = {{{0,0,255}},{{0,255,0}},
                               {{0,255,255}},{{255,255,0}}};
    int i, j;
    int nsamples = 100;
    CvRNG rng_state = cvRNG(-1);
    CvMat* samples = cvCreateMat( nsamples, 2, CV_32FC1 );
    CvMat* labels = cvCreateMat( nsamples, 1, CV_32SC1 );
    IplImage* img = cvCreateImage( cvSize( 500, 500 ), 8, 3 );
    float _sample[2];
    CvMat sample = cvMat( 1, 2, CV_32FC1, _sample );
    CvEM em_model;
    CvEMParams params;
    CvMat samples_part;

    cvReshape( samples, samples, 2, 0 );
    for( i = 0; i < N; i++ )
    {
        CvScalar mean, sigma;

        // form the training samples
        cvGetRows( samples, &samples_part, i*nsamples/N,
                   (i+1)*nsamples/N );
        mean = cvScalar(((i%N1)+1.)*img->width/(N1+1),
                        ((i/N1)+1.)*img->height/(N1+1));
        sigma = cvScalar(30,30);
        cvRandArr( &rng_state, &samples_part, CV_RAND_NORMAL,
                   mean, sigma );
    }
    cvReshape( samples, samples, 1, 0 );

    // initialize model's parameters
    params.covs      = NULL;
    params.means     = NULL;
    params.weights   = NULL;
    params.probs     = NULL;
    params.nclusters = N;
    params.cov_mat_type       = CvEM::COV_MAT_SPHERICAL;
    params.start_step         = CvEM::START_AUTO_STEP;
    params.term_crit.max_iter = 10;
    params.term_crit.epsilon  = 0.1;
    params.term_crit.type     = CV_TERMCRIT_ITER|CV_TERMCRIT_EPS;

    // cluster the data
    em_model.train( samples, 0, params, labels );

#if 0
    // the piece of code shows how to repeatedly optimize the model
    // with less-constrained parameters
    // (COV_MAT_DIAGONAL instead of COV_MAT_SPHERICAL)
    // when the output of the first stage is used as input for the second.
    CvEM em_model2;
    params.cov_mat_type = CvEM::COV_MAT_DIAGONAL;
    params.start_step = CvEM::START_E_STEP;
    params.means = em_model.get_means();
    params.covs = (const CvMat**)em_model.get_covs();
    params.weights = em_model.get_weights();

    em_model2.train( samples, 0, params, labels );
    // to use em_model2, replace em_model.predict()
    // with em_model2.predict() below
#endif

    // classify every image pixel
    cvZero( img );
    for( i = 0; i < img->height; i++ )
    {
        for( j = 0; j < img->width; j++ )
        {
            CvPoint pt = cvPoint(j, i);
            sample.data.fl[0] = (float)j;
            sample.data.fl[1] = (float)i;
            int response = cvRound(em_model.predict( &sample, NULL ));
            CvScalar c = colors[response];

            cvCircle( img, pt, 1, cvScalar(c.val[0]*0.75, c.val[1]*0.75,
                      c.val[2]*0.75), CV_FILLED );
        }
    }

    // draw the clustered samples
    for( i = 0; i < nsamples; i++ )
    {

        CvPoint pt;
        pt.x = cvRound(samples->data.fl[i*2]);
        pt.y = cvRound(samples->data.fl[i*2+1]);
        cvCircle( img, pt, 1, colors[labels->data.i[i]], CV_FILLED );
    }

    cvNamedWindow( "EM-clustering result", 1 );
    cvShowImage( "EM-clustering result", img );
    cvWaitKey(0);

    cvReleaseMat( &samples );
    cvReleaseMat( &labels );
    return 0;
}

Neural Networks

ML implements feed-forward artificial neural networks, more particularly, multi-layer perceptrons (MLP), the most commonly used type of neural networks. MLP consists of the input layer, the output layer and one or more hidden layers. Each layer of MLP includes one or more neurons that are directionally linked with the neurons from the previous and the next layer. Here is an example of a 3-layer perceptron with 3 inputs, 2 outputs and a hidden layer including 5 neurons:

[figure: topology of a 3-layer perceptron]

All the neurons in MLP are similar. Each of them has several input links (i.e. it takes the output values from several neurons in the previous layer on input) and several output links (i.e. it passes the response to several neurons in the next layer). The values retrieved from the previous layer are summed with certain weights, individual for each neuron, plus the bias term, and the sum is transformed using the activation function f that may also be different for different neurons. Here is the picture:

[figure: model of a single neuron]

In other words, given the outputs x_j of the layer n, the outputs y_i of the layer n+1 are computed as:

u_i = \sum_j w^{n+1}_{i,j} x_j + w^{n+1}_{i,bias}
y_i = f(u_i)

Different activation functions may be used. ML implements 3 standard ones:

• Identity function ( CvANN_MLP::IDENTITY ): f(x) = x
• Symmetrical sigmoid ( CvANN_MLP::SIGMOID_SYM ): f(x) = \beta (1 - e^{-\alpha x})/(1 + e^{-\alpha x}), the default choice for MLP; the standard sigmoid with \beta = 1, \alpha = 1 is shown below:

[figure: plot of the standard symmetrical sigmoid]

• Gaussian function ( CvANN_MLP::GAUSSIAN ): f(x) = \beta e^{-\alpha x^2}, not completely supported at the moment.

In ML all the neurons have the same activation functions, with the same free parameters ( \alpha, \beta ) that are specified by the user and are not altered by the training algorithms.
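The following conceptual sketch (not a library routine) computes a single neuron response as described above, using the symmetrical sigmoid with alpha = beta = 1:

#include <math.h>

double neuron_output( const double* x, const double* w, int n, double bias )
{
    double u = bias;
    for( int j = 0; j < n; j++ )
        u += w[j]*x[j];                     // weighted sum of the inputs
    return (1. - exp(-u))/(1. + exp(-u));   // f(u), symmetrical sigmoid
}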

So the whole trained network works as follows: it takes the feature vector on input, the vector size is equal to the size of the input layer; when the values are passed as input to the first hidden layer, the outputs of the hidden layer are computed using the weights and the activation functions and passed further downstream, until we compute the output layer.

So, in order to compute the network one needs to know all the weights w^{n+1}_{i,j}. The weights are computed by the training algorithm. The algorithm takes a training set - multiple input vectors with the corresponding output vectors - and iteratively adjusts the weights to try to make the network give the desired response on the provided input vectors.

The larger the network size (the number of hidden layers and their sizes), the more is the potential network flexibility, and the error on the training set could be made arbitrarily small. But at the same time the learned network will also "learn" the noise present in the training set, so the error on the test set usually starts increasing after the network size reaches some limit. Besides, the larger networks are trained much longer than the smaller ones, so it is reasonable to preprocess the data (using CalcPCA or a similar technique) and train a smaller network on only the essential features.

Another feature of MLP's is their inability to handle categorical data as is; however, there is a workaround. If a certain feature in the input or output (i.e. in the case of an n-class classifier for n > 2) layer is categorical and can take M > 2 different values, it makes sense to represent it as a binary tuple of M elements, where the i-th element is 1 if and only if the feature is equal to the i-th value out of M possible. It will increase the size of the input/output layer, but will speed up the training algorithm convergence and at the same time enable "fuzzy" values of such variables, i.e. a tuple of probabilities instead of a fixed value.

ML implements 2 algorithms for training MLP's. The first is the classical random sequential back-propagation algorithm and the second (the default one) is the batch RPROP algorithm.

References:
• http://en.wikipedia.org/wiki/Backpropagation. Wikipedia article about the back-propagation algorithm.
• LeCun, L. Bottou, G.B. Orr and K.-R. Muller, "Efficient backprop", in Neural Networks - Tricks of the Trade, Springer Lecture Notes in Computer Sciences 1524, pp. 5-50, 1998.
• M. Riedmiller and H. Braun, "A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm", Proc. ICNN, San Francisco, 1993.

CvANN_MLP_TrainParams
Parameters of the MLP training algorithm.

struct CvANN_MLP_TrainParams
{
    CvANN_MLP_TrainParams();
    CvANN_MLP_TrainParams( CvTermCriteria term_crit, int train_method,
                           double param1, double param2=0 );
    ~CvANN_MLP_TrainParams();

    enum { BACKPROP=0, RPROP=1 };

    CvTermCriteria term_crit;
    int train_method;

    // backpropagation parameters
    double bp_dw_scale, bp_moment_scale;

    // rprop parameters
    double rp_dw0, rp_dw_plus, rp_dw_minus, rp_dw_min, rp_dw_max;
};

The structure has a default constructor that initializes the parameters for the RPROP algorithm. There is also a more advanced constructor that can customize the parameters and/or choose the backpropagation algorithm. Finally, the individual parameters can be adjusted after the structure is created.
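For example, explicit RPROP parameters might be constructed as follows (a sketch; the values are illustrative, and param1 corresponds to the initial update value rp_dw0):

// At most 1000 iterations, or stop when the residual drops below 0.01.
CvANN_MLP_TrainParams train_params(
    cvTermCriteria( CV_TERMCRIT_ITER+CV_TERMCRIT_EPS, 1000, 0.01 ),
    CvANN_MLP_TrainParams::RPROP, 0.1 );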

CvANN_MLP
MLP model.

class CvANN_MLP : public CvStatModel
{
public:
    CvANN_MLP();
    CvANN_MLP( const CvMat* _layer_sizes,
               int _activ_func=SIGMOID_SYM,
               double _f_param1=0, double _f_param2=0 );

    virtual ~CvANN_MLP();

    virtual void create( const CvMat* _layer_sizes,
                         int _activ_func=SIGMOID_SYM,
                         double _f_param1=0, double _f_param2=0 );

    virtual int train( const CvMat* _inputs, const CvMat* _outputs,
                       const CvMat* _sample_weights,
                       const CvMat* _sample_idx=0,
                       CvANN_MLP_TrainParams _params = CvANN_MLP_TrainParams(),
                       int flags=0 );
    virtual float predict( const CvMat* _inputs,
                           CvMat* _outputs ) const;

    virtual void clear();

    // possible activation functions
    enum { IDENTITY = 0, SIGMOID_SYM = 1, GAUSSIAN = 2 };

    // available training flags
    enum { UPDATE_WEIGHTS = 1, NO_INPUT_SCALE = 2, NO_OUTPUT_SCALE = 4 };

    virtual void read( CvFileStorage* fs, CvFileNode* node );
    virtual void write( CvFileStorage* storage, const char* name );

    int get_layer_count() { return layer_sizes ? layer_sizes->cols : 0; }
    const CvMat* get_layer_sizes() { return layer_sizes; }

protected:

    virtual bool prepare_to_train( const CvMat* _inputs, const CvMat* _outputs,
            const CvMat* _sample_weights, const CvMat* _sample_idx,

            CvANN_MLP_TrainParams _params,
            CvVectors* _ivecs, CvVectors* _ovecs, double** _sw, int _flags );

    // sequential random backpropagation
    virtual int train_backprop( CvVectors _ivecs, CvVectors _ovecs,
                                const double* _sw );

    // RPROP algorithm
    virtual int train_rprop( CvVectors _ivecs, CvVectors _ovecs,
                             const double* _sw );

    virtual void calc_activ_func( CvMat* xf, const double* bias ) const;
    virtual void calc_activ_func_deriv( CvMat* xf, CvMat* deriv,
                                        const double* bias ) const;
    virtual void set_activ_func( int _activ_func=SIGMOID_SYM,
                                 double _f_param1=0, double _f_param2=0 );
    virtual void init_weights();
    virtual void scale_input( const CvMat* _src, CvMat* _dst ) const;
    virtual void scale_output( const CvMat* _src, CvMat* _dst ) const;
    virtual void calc_input_scale( const CvVectors* vecs, int flags );
    virtual void calc_output_scale( const CvVectors* vecs, int flags );

    virtual void write_params( CvFileStorage* fs );
    virtual void read_params( CvFileStorage* fs, CvFileNode* node );

    CvMat* layer_sizes;
    CvMat* wbuf;
    CvMat* sample_weights;
    double** weights;
    double f_param1, f_param2;
    double min_val, max_val, min_val1, max_val1;
    int activ_func;
    int max_count, max_buf_sz;
    CvANN_MLP_TrainParams params;
    CvRNG rng;
};

Unlike many other models in ML that are constructed and trained at once, in the MLP model these steps are separated. First, a network with the specified topology is created using the non-default constructor or the method create. All the weights are set to zeros. Then the network is trained using a set of input and output vectors. The training procedure can be repeated more than once, i.e. the weights can be adjusted based on the new training data.

CvANN_MLP::create
void CvANN_MLP::create(const CvMat* _layer_sizes, int _activ_func=SIGMOID_SYM, double _f_param1=0, double _f_param2=0)
Constructs the MLP with the specified topology.

Parameters:
o _layer_sizes – The integer vector that specifies the number of neurons in each layer including the input and output layers.
o _activ_func – Specifies the activation function for each neuron; one of CvANN_MLP::IDENTITY, CvANN_MLP::SIGMOID_SYM and

CvANN_MLP::GAUSSIAN.
o _f_param1, _f_param2 – Free parameters of the activation function, \alpha and \beta, respectively. See the formulas in the introduction section.

The method creates a MLP network with the specified topology and assigns the same activation function to all the neurons.

CvANN_MLP::train
int CvANN_MLP::train(const CvMat* _inputs, const CvMat* _outputs, const CvMat* _sample_weights, const CvMat* _sample_idx=0, CvANN_MLP_TrainParams _params = CvANN_MLP_TrainParams(), int flags=0)
Trains/updates MLP.

Parameters:
o _inputs – A floating-point matrix of input vectors, one vector per row.
o _outputs – A floating-point matrix of the corresponding output vectors, one vector per row.
o _sample_weights – (RPROP only) The optional floating-point vector of weights for each sample. Some samples may be more important than others for training, and the user may want to raise the weight of certain classes to find the right balance between the hit-rate and the false-alarm rate etc.
o _sample_idx – The optional integer vector indicating the samples (i.e. the rows of _inputs and _outputs) that are taken into account.
o _params – The training params. See the CvANN_MLP_TrainParams description.
o _flags – The various parameters to control the training algorithm. May be a combination of the following:
   UPDATE_WEIGHTS = 1 - the algorithm updates the network weights, rather than computes them from scratch (in the latter case the weights are initialized using the Nguyen-Widrow algorithm).
   NO_INPUT_SCALE - the algorithm does not normalize the input vectors. If this flag is not set, the training algorithm normalizes each input feature independently, shifting its mean value to 0 and making the standard deviation equal to 1. If the network is assumed to be updated frequently, the new training data could be much different from the original one. In this case the user should take care of proper normalization.
   NO_OUTPUT_SCALE - the algorithm does not normalize the output vectors. If the flag is not set, the training algorithm normalizes each output feature independently, by transforming it to a certain range depending on the activation function used.

This method applies the specified training algorithm to compute/adjust the network weights. It returns the number of done iterations.
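A minimal end-to-end sketch of the create/train/predict sequence; the inputs, outputs and sample matrices are assumed to be prepared elsewhere, with 2 and 1 columns respectively:

// A 2-5-1 network trained with the default RPROP settings.
int layer_sz[] = { 2, 5, 1 };
CvMat layer_sizes = cvMat( 1, 3, CV_32SC1, layer_sz );

CvANN_MLP mlp;
mlp.create( &layer_sizes, CvANN_MLP::SIGMOID_SYM );
mlp.train( inputs, outputs, 0 ); // 0 sample weights => equal weights

CvMat* response = cvCreateMat( 1, 1, CV_32FC1 );
mlp.predict( sample, response ); // the network output for one input vector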

