The term segmentation is used to describe a range of different processes for partitioning the video into meaningful parts at different granularities. Segmentation of images and video is generally an ill-posed problem, i.e. for a given natural image or image sequence, there exists no unique solution to the segmentation problem. The application of any segmentation method is often preceded by a simplification step for discarding unnecessary information (e.g. low-pass filtering) and a feature extraction step for modifying or
Copyright:
Attribution Non-Commercial (BY-NC)
meaningful elementary parts termed segments. For still images, (spatial) segmentation means partitioning the image into a number of arbitrarily shaped regions, each typically assumed to constitute a meaningful part of the image, i.e. to correspond to one of the objects depicted in it or to a part of one such object. For moving images, i.e. video, the term segmentation describes a range of different processes for partitioning the video into meaningful parts at different granularities. Segmentation of video can thus be temporal, aiming to break the video down into scenes or shots; spatial, addressing the problem of independently segmenting each video frame into arbitrarily shaped regions; or spatio-temporal, extending the previous case to the generation of temporal sequences of arbitrarily shaped spatial regions. The term segmentation is also frequently used to describe foreground/background separation in video, which can be seen as a special case of spatio-temporal segmentation.

Regardless of the employed decision space, i.e. 1D, 2D or 3D for temporal, spatial and spatio-temporal segmentation, respectively, the application of any segmentation method is often preceded by a simplification step for discarding unnecessary information (e.g. low-pass filtering) and a feature extraction step for modifying or estimating features not readily available in the visual medium (e.g. texture or motion features, but also color features in a different color space), as illustrated for a variety of segmentation algorithms in Figure 1.

Segmentation of images and video is generally an ill-posed problem, i.e. for a given natural image or image sequence, there exists no unique solution to the segmentation problem; the spatial, temporal or spatio-temporal segments that should ideally be formed as a result of segmentation largely depend on the application under consideration and most frequently on the subjective view of each human observer.
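The simplification and feature extraction steps mentioned above can be illustrated with a minimal Python sketch. It assumes a 3x3 mean filter as the low-pass filter and HSV as the "different color space"; both choices, and the function names `box_filter` and `to_hsv_features`, are illustrative assumptions rather than anything prescribed by the text.

```python
import colorsys


def box_filter(channel):
    """Simplification step: 3x3 mean filter (a simple low-pass filter).

    `channel` is a 2D list of numbers. Border pixels are averaged over
    the neighbors that actually exist (assumed border handling).
    """
    rows, cols = len(channel), len(channel[0])
    smoothed = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                channel[nr][nc]
                for nr in range(max(0, r - 1), min(rows, r + 2))
                for nc in range(max(0, c - 1), min(cols, c + 2))
            ]
            smoothed[r][c] = sum(window) / len(window)
    return smoothed


def to_hsv_features(rgb_image):
    """Feature extraction step: map each (r, g, b) pixel, with components
    in [0, 1], to an (h, s, v) feature tuple."""
    return [[colorsys.rgb_to_hsv(*pixel) for pixel in row] for row in rgb_image]
```

A segmentation method would then operate on the smoothed channels or on the extracted feature tuples instead of the raw pixel values.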
Commonly considered applications of segmentation include region-based image and video description, indexing and retrieval, video summarization, interactive region-based annotation schemes, detection of objects that can serve as cues for event recognition, region-based coding, etc. Image and video description, indexing and retrieval in particular have been the focus of attention of many researchers working on segmentation, since the benefits of introducing segmentation to this application have recently been well documented and significant progress has been made on related topics such as region-based description for indexing, most notably with the introduction of the MPEG-7 Standard. Most segmentation methods serving the aforementioned applications are generic, i.e. they make no restrictive assumptions regarding the semantics of the visual content, such as that the content belongs to a specific domain; however, domain-specific methods for applications like medical image segmentation also exist.

Spatial segmentation

Segmentation methods for 2D images may be divided primarily into region-based and boundary-based methods. Region-based approaches rely on the homogeneity of spatially localized features such as intensity, texture, and position. Boundary-based methods, on the other hand, primarily use gradient information to locate object boundaries. Hybrid techniques that integrate the results of boundary detection and homogeneity-based clustering (e.g. region growing), as well as techniques exploiting additional information such as structural properties (e.g. inclusion), have also been proposed. Traditional region-based approaches include region growing and split and merge techniques.
Starting from an initial region represented by an arbitrarily chosen single pixel, region growing is the process of adding neighboring pixels to this region by examining their similarity to the ones already added; when no further additions are possible according to the defined similarity criteria, a new region is created and grows accordingly. The opposite of this approach is split and merge. Starting from a single initial region spanning the entire image, region homogeneity is evaluated; if the homogeneity criterion is not satisfied, the region is split according to a pre-defined pattern, and neighboring regions are subsequently merged, provided this does not violate the homogeneity criterion. The alternation of split and merge steps continues until the homogeneity criterion is satisfied for all regions.

Region-based approaches also include the Recursive Shortest Spanning Tree (RSST) algorithm, which starts from a very fine partitioning of the image and iteratively merges the neighboring nodes with the minimum value of a cost function; the latter preserves the homogeneity of the generated regions. To avoid premature termination of the merging process, which results in over-segmentation when the desired final number of regions is not explicitly defined, the introduction of syntactic visual features to RSST has been proposed.

The K-means algorithm, an iterative classification method, has also been used as the basis of several region-based approaches. In one such approach, the K-Means-with-Connectivity-Constraint variant of K-means is used to effect segmentation by means of pixel clustering in the combined intensity-texture-position feature space (Fig. 2). Another approach to pixel clustering is based on the Expectation-Maximization (EM) algorithm, which is a method for finding maximum likelihood estimates when there is missing or incomplete data. For the application of EM to segmentation, the cluster membership of each pixel can be seen as the missing data.
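The region growing procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: it works on a grayscale image stored as a 2D list, uses 4-connectivity, picks each seed in raster order rather than arbitrarily, and takes "similarity" to mean that a pixel differs from the current region mean by at most `threshold`. The function name and all parameters are hypothetical.

```python
from collections import deque


def region_growing(image, threshold):
    """Label a grayscale image (2D list) by simple region growing.

    Assumed similarity criterion: a neighboring pixel joins the region
    if its value is within `threshold` of the current region mean.
    """
    rows, cols = len(image), len(image[0])
    labels = [[-1] * cols for _ in range(rows)]  # -1 marks unlabeled pixels
    next_label = 0
    for seed_r in range(rows):
        for seed_c in range(cols):
            if labels[seed_r][seed_c] != -1:
                continue  # pixel already belongs to a region
            # Start a new region from a single seed pixel.
            labels[seed_r][seed_c] = next_label
            region_sum, region_size = float(image[seed_r][seed_c]), 1
            frontier = deque([(seed_r, seed_c)])
            while frontier:
                r, c = frontier.popleft()
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = r + dr, c + dc
                    if not (0 <= nr < rows and 0 <= nc < cols):
                        continue
                    if labels[nr][nc] != -1:
                        continue
                    mean = region_sum / region_size
                    if abs(image[nr][nc] - mean) <= threshold:
                        labels[nr][nc] = next_label
                        region_sum += image[nr][nc]
                        region_size += 1
                        frontier.append((nr, nc))
            # No further additions are possible: move on to a new region.
            next_label += 1
    return labels
```

For example, on a small image with a dark left half and a bright right half, `region_growing([[10, 10, 50, 50], [10, 11, 50, 52], [10, 10, 51, 50]], threshold=5)` yields two regions, one per half.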
In another approach, image segmentation is treated as a graph partitioning problem, and the normalized cut, a global criterion measuring both the total dissimilarity between the different groups and the total similarity within the groups, is employed for segmenting the graph.

In contrast to the aforementioned methods, boundary-based methods rely on detecting the discontinuities present in the feature space. The Canny edge detector is a popular scheme of this kind, based on the convolution of the image, over a small window, with the directional derivatives of a Gaussian function. Another approach to boundary detection is anisotropic diffusion, which can be seen as a robust procedure for estimating a piecewise smooth image from a noisy input image. Anisotropic diffusion employs an edge-stopping function that allows the preservation of edges while diffusing the rest of the image.

Mathematical morphology methods, including in particular the watershed algorithm, have also received considerable attention for use in image segmentation. The watershed algorithm determines the minima of the gradients of the image to be segmented, and associates a segment with each minimum. Conventional gradient operators generally produce many local minima, which are caused by noise or quantization errors; hence, the watershed transformation with a conventional gradient operator usually results in over-segmentation. To alleviate this problem, the use of multiscale morphological gradient operators has been proposed. More recently, to combat the over-segmentation effect, it was proposed to use the watershed algorithm to generate an initial over-segmentation and then to represent this result as a graph, to which partitioning via the weighted mean cut criterion is applied.
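The role of the edge-stopping function in anisotropic diffusion, described above, can be made concrete with a short Python sketch. The text does not specify a particular edge-stopping function; the sketch assumes the Perona-Malik form g(x) = 1 / (1 + (x / kappa)^2), a 4-neighbor explicit update, and hypothetical parameter names (`kappa`, `step`, `iterations`).

```python
def edge_stopping(gradient, kappa):
    """Assumed Perona-Malik edge-stopping function: close to 1 for small
    gradients (diffuse freely), close to 0 for large ones (preserve edges)."""
    return 1.0 / (1.0 + (gradient / kappa) ** 2)


def anisotropic_diffusion(image, iterations=20, kappa=10.0, step=0.2):
    """Smooth a grayscale image (2D list) while preserving strong edges.

    `step` should not exceed 0.25 for this 4-neighbor explicit scheme
    to remain stable.
    """
    rows, cols = len(image), len(image[0])
    current = [[float(v) for v in row] for row in image]
    for _ in range(iterations):
        updated = [row[:] for row in current]
        for r in range(rows):
            for c in range(cols):
                flow = 0.0
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        diff = current[nr][nc] - current[r][c]
                        # Weight each neighbor difference by the edge-stopping value.
                        flow += edge_stopping(abs(diff), kappa) * diff
                updated[r][c] = current[r][c] + step * flow
        current = updated
    return current
```

With `kappa` chosen well below the contrast of true object boundaries and above the noise level, small fluctuations are averaged out while large intensity steps are left almost untouched.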
Finally, global energy minimization schemes, also known as snakes or active contour models, involve the evolution of a curve from an initial position toward the boundary of an object in such a way that a properly defined energy functional is minimized. Depending on the definition of the energy functional, the resulting scheme may be edge-based, region-based or based on a combination of boundary detection and homogeneity-preserving criteria.
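As an illustration of what such an energy functional can look like, one classical formulation for a parametric contour v(s) = (x(s), y(s)), s in [0, 1], is given below. It is shown as a common example only; the text does not commit to a specific functional, and the weights alpha and beta and the edge-based external term are assumptions of this sketch.

```latex
E_{\text{snake}} = \int_{0}^{1} \Big[ E_{\text{int}}\big(v(s)\big) + E_{\text{ext}}\big(v(s)\big) \Big] \, ds,
\qquad
E_{\text{int}}\big(v(s)\big) = \tfrac{1}{2} \Big( \alpha \, \lvert v'(s) \rvert^{2} + \beta \, \lvert v''(s) \rvert^{2} \Big),
\qquad
E_{\text{ext}}\big(v(s)\big) = - \big\lvert \nabla I\big(v(s)\big) \big\rvert^{2}
```

Here the internal term penalizes stretching and bending of the curve, while the external term is lowest where the image gradient is strongest, which gives an edge-based scheme; replacing the external term with a measure of region homogeneity gives a region-based one.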