
Spatiotemporal Segmentation of GIS Object for Mobile Mapping System

Peng Li (a), Cheng Wang (a,b), Hanyun Wang (a), Shengyong Hao (c)


(a) School of Electronic Science and Engineering, National University of Defense Technology, Changsha, China
(b) School of Information Science and Tech, Xiamen University, Xiamen, China
(c) Space. Star. Tech. Corp. Ltd., Beijing, China
chwang_nudt@263.net, peterlee20058@hotmail.com, hanyun.wang1986@gmail.com

Abstract: As a fast-developing field of GIS, the mobile mapping system (MMS) represents the state of the art in mapping techniques, and the digital measurable image (DMI) sequence is its main data type. Given the acquired DMI data, segmenting the GIS object of interest (OOI) is a difficult but necessary task for various engineering applications. This paper proposes a novel hybrid method to segment GIS objects in DMI sequences spatiotemporally. First, the user indicates the GIS OOI in a key frame. We propose a hypothesis for foreground-background identification and apply the watershed transformation to achieve foreground-background segmentation. Second, patch-based color histogram back-projection is employed to find the candidate region of the OOI in the new frame. Then SIFT is employed to describe the OOI and detect the corresponding object in the new frame. The object template is updated as the matching proceeds, until the last frame. Experimental results demonstrate the robustness and effectiveness of our method.

Keywords: GIS; Mobile Mapping Systems (MMS); Digital Measurable Image (DMI); Segmentation; Object; Color Histogram Back-projection; SIFT; Watershed

I. INTRODUCTION

The Mobile Mapping System (MMS) represents the state of the art in mapping techniques for large-scale digital road map production. An MMS integrates geodetic-quality GPS, digital stereo cameras and an inertial navigation system on a mobile platform. It performs measurement-on-demand between various elements, especially roadside elements, while the platform is in motion, and edits the data to construct the geoinformation database. MMS-based road GIS data generation is typically divided into three stages: data acquisition, DMI sequence production, and GIS data extraction and update. So far, automatic GIS data extraction and update remains an unsolved problem and the bottleneck of MMS [1], [16]. The most important information in an MMS concerns GIS objects, e.g. guideposts, center lines and so on. Accurate and efficient segmentation of GIS objects in DMI sequences is a necessary preprocessing step for MMS, yet credible work on this issue is still lacking. We therefore aim to develop a spatiotemporal segmentation method for GIS objects in DMI sequences for MMS.

Segmentation and tracking of objects in 2D image sequences is an important and challenging field with wide applications, including change detection, object-based video coding (MPEG-4), video postproduction, content-based indexing and retrieval, surveillance, and 3D scene reconstruction for 3D TV [4], [8]. State-of-the-art object-based image sequence segmentation techniques can be grouped along several axes: supervised, unsupervised or semi-supervised; region-based or boundary-based; high-level or low-level; based on local or global information; and targeting moving or static objects. Under the hypothesis of a static camera, the epipolar-plane image (EPI) can be used for spatiotemporal volume segmentation of objects in image sequences, with T-junctions in the EPI serving as indicators of the boundaries of the spatiotemporal volume [3]. Several kinds of active contours have been widely used for semi-automatic video object segmentation and tracking [14]. Vaswani et al. proposed that a particle filter combined with a level-set-based active contour can be used to segment moving and deformable objects in an image sequence [15]. Graphs or hypergraphs are used in some region-based segmentation methods [12], [5], [7]. Some region-based methods employ clustering or region splitting and growing in a feature space, which is usually formed by motion vectors, spatial features or appearance features such as color, texture and position [2], [17]. Probabilistic formulations are often involved in region-based methods [9]. Keypoint features are frequently employed to localize the OOI in an image sequence [13]. Most of the aforementioned methods were developed for common video sequences rather than for DMI sequences acquired by a mobile imaging platform with a much larger time gap between frames, so they are not well suited to the spatiotemporal segmentation of GIS objects in DMI sequences.
The main challenges include the 3D structure of GIS objects, viewpoint change, illumination change, background clutter, motion of the imaging platform, and the large time gap in DMI sequences [1]. We aim to overcome these problems and propose a hybrid object-based DMI sequence segmentation method for MMS. The remainder of this paper is organized as follows: Section II proposes a method for foreground-background identification applying the watershed transformation. Section III

This paper is supported by National Natural Science Foundation of China Project 40971245 and Key Project of The Eleven-Five Year Research Program of China 2008BAC34B02-2.

978-1-4244-9404-0/11/$26.00 2011 IEEE

briefly presents the fast patch-based histogram back-projection method for obtaining the candidate region of the object of interest (OOI). Section IV briefly describes SIFT. The framework of the proposed hybrid segmentation method and the experimental results are presented in Section V. Section VI concludes the paper.

II. FIGURE-BACKGROUND IDENTIFICATION BY WATERSHED TRANSFORMATION
Figure 1. Figure-background markers. According to the hypothesis proposed in Section II, the red rectangle marks the foreground and the blue lines mark the background.

A. Image Segmentation Applying Watershed Transformation

The watershed transformation is a powerful tool for image segmentation that overcomes the problems of disconnected contours and false edges. It treats the gradient magnitude of an image, or a distance transformation of a binary image, as a topographic surface. Pixels with the highest gradient magnitude intensities (GMIs) correspond to watershed lines, which represent the region boundaries. Water placed on any pixel enclosed by a common watershed line flows downhill to a common local intensity minimum (LIM). Pixels draining to a common minimum form a catchment basin, which represents a segment. The output of the watershed algorithm is a tessellation of the input image into its catchment basins, each characterized by a unique label [10]. In practice, however, this transform produces severe oversegmentation due to noise or local irregularities in the gradient image. A previously defined set of markers can be used to improve the watershed segmentation results, but good markers depend closely on the application [10].

B. Hypothesis For Figure-Background Identification And Auto-Marked Watershed Transformation

Based on the characteristics of GIS OOIs and common sense, we propose the following hypotheses for foreground-background identification in the GIS OOI template:
1) In the object template, pixels near the bounding box (boundary) belong to the background.
2) The template image is not much bigger than the exact GIS object.
3) The OOI lies in the center area of the template image.

III. CANDIDATE REGION COMPUTATION BY FAST PATCH-BASED COLOR HISTOGRAM BACK-PROJECTION

A. Color Histogram

Color, as a global feature, is widely used in content-based retrieval systems. The color histogram is commonly used to represent the color distribution in an image or a video frame. For digital images, it is the number of pixels whose colors fall in each of a fixed list of color ranges that together span the image's color space, the set of all possible colors. The color histogram is relatively invariant to translation and to rotation about the viewing axis, and varies only slowly with the angle of view [2], [11]. Furthermore, compared with other invariant features, the color histogram is less time-consuming to compute.

B. Color Histogram Back-projection and Patch-based Color Histogram Back-projection

Histogram back-projection is a primitive operation that associates the pixel values in the image with the value of the corresponding histogram bin. Color histogram back-projection is a low-complexity, active vision algorithm for finding objects in complex scenes and is little affected by camera movement. The method computes, for each histogram bin $i$, the ratio of the object color histogram ($O_i$) to the image color histogram ($I_i$), as shown in (1):

$$R_i = \frac{O_i}{I_i} \tag{1}$$
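Equation (1) can be illustrated with a minimal sketch. It assumes the colors have already been quantized into integer bin indices; the toy image, the toy template and the function names are hypothetical and are not taken from the paper's implementation.

```python
from collections import Counter

def ratio_histogram(object_bins, image_bins):
    """R_i = O_i / I_i for every bin i that occurs in the image."""
    o = Counter(object_bins)
    i = Counter(image_bins)
    return {b: o[b] / i[b] for b in i}

def back_project(image, ratio):
    """Replace each pixel's bin index by the ratio value of that bin."""
    return [[ratio.get(b, 0.0) for b in row] for row in image]

# Toy 3x4 image of quantized color bin indices (hypothetical data).
image = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [0, 2, 2, 1],
]
# Toy object template: mostly bin 2, plus one pixel of bin 1.
template = [[2, 2], [2, 1]]

image_bins = [b for row in image for b in row]
object_bins = [b for row in template for b in row]
ratio = ratio_histogram(object_bins, image_bins)
prob = back_project(image, ratio)  # high values where object colors occur
```

Pixels whose color is frequent in the template but rare in the image receive the highest values, which is what makes the back-projected image usable as a presence map.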

Based on the above three hypotheses, we propose an auto-marked watershed transformation scheme for foreground-background identification. Two markers, one for the foreground and one for the background, as depicted in Fig. 1, are imposed on the gradient image automatically. The GIS object template is then segmented using the watershed transformation. All segments adjacent to the boundary of the template are discarded. The remaining segments form a mask indicating the exact OOI.
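The marker placement and the final masking step can be sketched as follows. This is only an illustration under stated assumptions: the size of the central foreground rectangle (`center_fraction`) is a hypothetical parameter that the paper does not specify, and the watershed transformation itself is not implemented here; `foreground_mask` expects a label image produced by any watershed routine.

```python
UNKNOWN, BACKGROUND, FOREGROUND = 0, 1, 2

def make_markers(height, width, center_fraction=1 / 3):
    """Border pixels become background markers, a central rectangle
    becomes the foreground marker (center_fraction is an assumption)."""
    markers = [[UNKNOWN] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if y in (0, height - 1) or x in (0, width - 1):
                markers[y][x] = BACKGROUND
    ch = max(1, round(height * center_fraction))
    cw = max(1, round(width * center_fraction))
    y0, x0 = (height - ch) // 2, (width - cw) // 2
    for y in range(y0, y0 + ch):
        for x in range(x0, x0 + cw):
            markers[y][x] = FOREGROUND
    return markers

def foreground_mask(labels):
    """Discard every segment that touches the template boundary and
    keep the remaining segments as the OOI mask."""
    border = set(labels[0]) | set(labels[-1])
    border |= {row[0] for row in labels} | {row[-1] for row in labels}
    return [[lab not in border for lab in row] for row in labels]

markers = make_markers(9, 9)

# Hypothetical label image as a watershed routine might return it.
labels = [
    [1, 1, 1, 1],
    [1, 2, 3, 1],
    [1, 2, 2, 3],
    [1, 1, 1, 1],
]
mask = foreground_mask(labels)  # segment 3 touches the border, so it is dropped
```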

Although the foreground-background segmentation result is coarse, most of the background is eliminated, and the remaining foreground is sufficient for computing a more accurate histogram or for other further processing.

For each image pixel, the probability that it belongs to the sought object is computed from the value of the histogram bin that the pixel indexes into. Replacing each pixel with the corresponding probability yields a grayscale image regarded as a probability image. However, histogram back-projection only gives the presence probability of individual pixels, because it compares histograms pixel by pixel. Patch-based color histogram back-projection, an improvement on color histogram back-projection, solves the problem of localizing a model image in another image. It considers a subregion of an image together with the color histogram of that subregion, asks whether the histogram of the subregion matches the model histogram, and associates with each such subregion a probability that the modeled object is in fact present in that subregion [6]. Because the patch-based color histogram back-projection algorithm in [6] works in a way similar to template matching, it is expensive in both computation and memory. In our experiment, localizing a 23×37 model image in a 160×120 image in MATLAB R2008a took 10.98 seconds. In MMS applications, where most images are 100 times larger than the test image, the time consumption would exceed 1000 seconds per frame, so the original patch-based color histogram back-projection is not practicable.

C. Fast Patch-based Color Histogram Back-projection

In this paper we propose an approximate but quick method, the fast patch-based color histogram back-projection. It is less time-consuming and more robust to object deformation. The improvements over the former patch-based color histogram back-projection are:
1) Both the object template and the input image are resized to half of their initial size, since the color histogram is nearly invariant to scale change. The inputs can be resized to even smaller sizes for additional speed.
2) The sliding step is set to one third of the object size in the corresponding direction rather than 1 pixel. Since most GIS object templates are much bigger than 3 pixels, a higher speed is achieved.
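The two speed-ups can be sketched in a few lines. This is a hedged illustration rather than the paper's MATLAB code: images are assumed to be 2D lists of quantized color bin indices, resizing is approximated by keeping every second pixel, and patches are scored with histogram intersection, the measure given in (2). All function names and the toy data are hypothetical.

```python
from collections import Counter

def normalized_histogram(patch):
    """Bin frequencies of a patch, normalized to sum to 1."""
    counts = Counter(b for row in patch for b in row)
    total = sum(counts.values())
    return {b: c / total for b, c in counts.items()}

def intersection(h1, h2):
    """Histogram intersection: sum over bins of the smaller value."""
    return sum(min(v, h2.get(b, 0.0)) for b, v in h1.items())

def halve(image):
    """Crude resize to half size: keep every second pixel."""
    return [row[::2] for row in image[::2]]

def fast_patch_back_projection(image, template):
    """Score patches of the half-size image against the half-size
    template, sliding by one third of the template size."""
    image, template = halve(image), halve(template)
    th, tw = len(template), len(template[0])
    step_y, step_x = max(1, th // 3), max(1, tw // 3)
    model = normalized_histogram(template)
    scores = {}
    for y in range(0, len(image) - th + 1, step_y):
        for x in range(0, len(image[0]) - tw + 1, step_x):
            patch = [row[x:x + tw] for row in image[y:y + th]]
            scores[(y, x)] = intersection(model, normalized_histogram(patch))
    return scores

# Toy 24x24 image of bin 0 with a 12x12 object of bin 5 at row 12, column 12.
image = [[5 if y >= 12 and x >= 12 else 0 for x in range(24)] for y in range(24)]
template = [[5] * 12 for _ in range(12)]

scores = fast_patch_back_projection(image, template)
best = max(scores, key=scores.get)        # position in the half-size image
best_full = (best[0] * 2, best[1] * 2)    # position in the original image
```

On this toy input only 16 patches are scored instead of the 169 positions a one-pixel step would visit at full resolution, which is where the speed-up comes from.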

For comparing two histograms, we choose the histogram intersection, shown in (2), as the correspondence measure for efficiency reasons.

$$d_{\mathrm{intersection}}(H_1, H_2) = \sum_i \min\big(H_1(i), H_2(i)\big) \tag{2}$$

The result of the fast patch-based color histogram back-projection is an image of the presence probability of the object template in the image. Because different objects may have similar or even identical color components, using patch-based back-projection alone may produce false alarms. SIFT, discussed in the next section, is employed to make the final decision on the existence of the object and its transformation.

IV. SIFT FOR OBJECT LOCALIZATION FOR MMS

A. SIFT

The Scale Invariant Feature Transform (SIFT) extracts keypoint features from the image and creates a high-dimensional description vector (descriptor) for the local image content [13]. The features are strong extrema in a Difference of Gaussians (DoG) pyramid. After extraction, a relative coordinate system (rotational invariance) is assigned to each feature, based on local gradient information extracted at the scale at which the feature point is found. The descriptor is then computed from local gradient information aligned with the new coordinate system. These descriptors are used to look for corresponding keypoints in two images. An approximate nearest neighbour algorithm, called the Best-Bin-First (BBF) algorithm, is employed to match the feature sets of different images. RANSAC (random sample consensus) is applied to reject the outliers and obtain the affine transformation matrix between the two images. The extraction and description steps are both invariant to rotation, scaling and illumination change, as well as to the addition of noise. SIFT is therefore widely used in image matching, object detection, image registration and so on.

B. Existing Problems

However, computing SIFT keypoints is time-consuming for large images. Lowe's MATLAB implementation of SIFT takes 23.7 seconds to find more than 4000 SIFT keypoints in a 1600×1200 DMI. Even worse, most of these keypoints have no corresponding points on the OOI template and tend to cause mismatches. Because most GIS OOI templates are much smaller than the DMIs, it is more reasonable to first detect a candidate region of the OOI in a new DMI. We apply the fast patch-based color histogram back-projection discussed in Section III to solve this problem.

V. HYBRID OBJECT-BASED IMAGE SEQUENCE SEGMENTATION FRAMEWORK AND EXPERIMENTAL RESULTS

A. Flow Of The Hybrid Spatiotemporal Segmentation Method Of GIS Objects In DMI Sequence

To obtain the spatiotemporal segmentation of the GIS OOI quickly and accurately, we propose a hybrid segmentation method. Watershed segmentation, fast patch-based color histogram back-projection and SIFT are combined in one framework.

Figure 2. Flowchart of the hybrid segmentation method of GIS object in DMI sequence for MMS

Figure 4. H-V histogram of the whole template and of the foreground only

Figure 3. Figure-background segmentation results. The upper row is obtained by our auto-marked method and the lower row is obtained with no markers. Pixels inside the yellow border are classified as foreground.

As indicated by the flowchart in Fig. 2, the proposed method is semi-supervised, in consideration of both accuracy and efficiency. Automatic indexing of interesting objects is very time-consuming, and the indexing result may be ambiguous; user interaction can significantly alleviate this problem. Given a DMI sequence and the GIS OOI template indicated by the user, the framework works as follows:
Step 1: We apply the proposed auto-marked watershed method to identify foreground and background in the GIS object template, as discussed in Section II.
Step 2: The fast patch-based histogram back-projection method is employed to obtain the candidate region of the GIS object in the next frame of the DMI sequence.
Step 3: SIFT is used to find and match keypoints in both the GIS OOI template and the candidate region.
Step 4: We use the RANSAC algorithm to eliminate mismatched keypoints and calculate the fundamental matrix. We localize and transform the OOI template in the candidate region, and replace the template with the transformed region in the candidate region.
Steps 1 to 4 are repeated for each frame until the last frame of the DMI sequence. The spatiotemporal segmentation of the OOI is constructed from the segmented OOI templates in consecutive frames.

B. Experiment Results

In this section, we present the experimental results of the proposed method. Its effectiveness is demonstrated using DMI sequences acquired by VISAT systems in Calgary, Canada. The size of each DMI is 1600×1238, with geo-information shown at the top and bottom of each frame; the actual image size is 1600×1200. GIS OOIs include guideposts, temporary pegs, fireplugs, mailboxes and so on. The sampling interval is about one second, much larger than in most video sequences.
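The outlier rejection in Step 4 can be sketched as a small RANSAC loop that estimates the affine transformation mentioned in Section IV.A from keypoint matches. This is an illustrative sketch only: the iteration count, the pixel tolerance, the fixed random seed and the synthetic matches are assumptions, and SIFT detection and matching are taken as already done.

```python
import random

def affine_from_3(src, dst):
    """Exact affine map (a, b, c, d, e, f) with x' = a*x + b*y + c and
    y' = d*x + e*y + f from three point pairs (Cramer's rule).
    Returns None when the three source points are collinear."""
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-9:
        return None
    params = []
    for k in (0, 1):  # k = 0 solves for x', k = 1 for y'
        u1, u2, u3 = dst[0][k], dst[1][k], dst[2][k]
        a = (u1 * (y2 - y3) - y1 * (u2 - u3) + (u2 * y3 - u3 * y2)) / det
        b = (x1 * (u2 - u3) - u1 * (x2 - x3) + (x2 * u3 - x3 * u2)) / det
        c = (x1 * (y2 * u3 - y3 * u2) - y1 * (x2 * u3 - x3 * u2)
             + u1 * (x2 * y3 - x3 * y2)) / det
        params += [a, b, c]
    return tuple(params)

def apply_affine(params, point):
    a, b, c, d, e, f = params
    x, y = point
    return a * x + b * y + c, d * x + e * y + f

def ransac_affine(matches, iterations=200, tol=2.0, seed=0):
    """Keep the affine model supported by the most matches."""
    rng = random.Random(seed)
    best_params, best_inliers = None, []
    for _ in range(iterations):
        sample = rng.sample(matches, 3)
        params = affine_from_3([s for s, _ in sample], [d for _, d in sample])
        if params is None:
            continue
        inliers = []
        for s, d in matches:
            px, py = apply_affine(params, s)
            if (px - d[0]) ** 2 + (py - d[1]) ** 2 <= tol ** 2:
                inliers.append((s, d))
        if len(inliers) > len(best_inliers):
            best_params, best_inliers = params, inliers
    return best_params, best_inliers

# Synthetic matches: 8 follow a known affine map, 3 are gross mismatches.
true_map = (1.2, -0.3, 10.0, 0.2, 0.9, -5.0)
template_pts = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 3), (2, 8), (7, 6), (9, 1)]
matches = [(p, apply_affine(true_map, p)) for p in template_pts]
matches += [((3, 3), (80.0, -40.0)), ((6, 9), (-30.0, 60.0)), ((8, 4), (55.0, 70.0))]

params, inliers = ransac_affine(matches)
```

The recovered `params` can then be used to map the template corners into the candidate region, which gives the transformed region that replaces the template for the next frame.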

Figure 5. The presence probability image and the candidate region specified by the red box.

Experiments were implemented in MATLAB R2008a to verify the proposed spatiotemporal segmentation method and were performed on a computer with an Intel Pentium 4 2.4 GHz CPU and 1 GB of RAM. The upper row of Fig. 3 shows the foreground-background segmentation results of our auto-marked watershed transformation. Compared with the results of watershed without markers, shown in the lower row of Fig. 3, the segmentation results of our method for different objects are greatly improved.

Figure 6. Spatiotemporal segmentation of GIS object guidepost

Figure 7. Spatiotemporal segmentation of GIS object of temporary peg

In Step 2 of Fig. 2, we employ the H-V histogram for patch-based color histogram back-projection to find the candidate region, where H stands for the hue channel and V for the value channel. Only the H-V histogram of the foreground is back-projected to the new frame. Fig. 4 shows the H-V histograms of the whole template and of the foreground only. They are clearly different, because no background information is included in the right figure. Fig. 5 shows the candidate region, indicated by the red rectangle, in a DMI frame for the fireplug shown in the middle of Fig. 3. Generally speaking, the candidate regions are much smaller than the original DMI. Calculating the candidate region takes about 10 seconds per frame. Once the candidate regions are obtained, SIFT keypoints are detected and matched between the object template and the candidate regions. In our experiment, SIFT detection and matching takes only a little more than one second per frame. Outliers are removed by RANSAC, and the inliers are used to find the affine transformation from the template to the candidate region. The template is then updated by its matched counterpart in the candidate region. Fig. 6 and Fig. 7 show the final spatiotemporal segmentation results for different GIS OOIs, such as a guidepost and a temporary peg, in real DMI sequences. We select the GIS object with a blue rectangle in the key frame, and the corresponding object in each frame of the DMI sequence is indicated by a magenta rectangle. The results demonstrate the robustness of our method to challenges such as viewpoint change, background clutter, motion of the imaging platform, large sampling interval between adjacent frames, illumination change and so on. The final spatiotemporal segmentation is obtained by applying the proposed auto-marked watershed transformation to each new OOI template in consecutive frames. The total time consumed per frame is less than 15 seconds, with no optimization applied yet.
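The foreground-only H-V histogram can be sketched as follows. The number of bins per channel (8 by 8) is an assumption, since the paper does not state it, and the function name and pixel data are hypothetical. The mask is the foreground mask produced by the auto-marked watershed step.

```python
import colorsys

def hv_histogram(pixels, mask=None, h_bins=8, v_bins=8):
    """2D hue/value histogram of RGB pixels given as (r, g, b) in 0..255.
    When a mask is given, only pixels with a true mask entry are counted."""
    hist = [[0] * v_bins for _ in range(h_bins)]
    for idx, (r, g, b) in enumerate(pixels):
        if mask is not None and not mask[idx]:
            continue
        h, _s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        hi = min(int(h * h_bins), h_bins - 1)  # clamp h == 1.0 or v == 1.0
        vi = min(int(v * v_bins), v_bins - 1)
        hist[hi][vi] += 1
    return hist

# Flattened toy template: two red object pixels, two blue background pixels.
pixels = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 0, 128)]
mask = [True, True, False, False]

whole = hv_histogram(pixels)             # histogram of the whole template
foreground = hv_histogram(pixels, mask)  # histogram of the foreground only
```

Back-projecting `foreground` rather than `whole` keeps background colors from contributing to the presence map.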
If the method were implemented in C with some optimization, the time consumed could be shortened dramatically.

VI. CONCLUSION

We propose a novel hybrid method for spatiotemporal segmentation of GIS objects in DMI sequences. We propose an auto-marked watershed transformation for foreground-background segmentation based on our rational hypotheses about foreground and background. We propose a fast patch-based color histogram back-projection method, which runs 100 times faster than the original patch-based back-projection and performs as well as the original. The performance of SIFT has also been enhanced in both efficiency and accuracy through the combination with color. Experimental results on real-world DMI sequences captured by VISAT demonstrate that the proposed methods are successful in difficult scenes with significant background clutter, while greatly reducing the time consumed. The proposed method can be applied in various MMS.

REFERENCES
[1] P. Li, C. Wang, and H. Y. Wang, "Semi-supervised object based digital measurable image sequence segmentation for MMS," CGC 2010, 2010.
[2] J. G. Allen, R. Y. D. Xu, and J. S. Jin, "Object Tracking Using CamShift Algorithm and Multiple Quantized Feature Spaces," Pan-Sydney Area Workshop on Visual Information Processing, 2003.
[3] N. Apostoloff and A. Fitzgibbon, "Automatic video segmentation using spatiotemporal T-junctions," BMVC, 2006.
[4] S. W. Babacan and T. N. Pappas, "Spatiotemporal algorithm for joint video segmentation and foreground detection," EUSIPCO, 2006.
[5] E. Borenstein and J. Malik, "Shape Guided Object Segmentation," CVPR, 2006.
[6] G. Bradski and A. Kaehler, Learning OpenCV. O'Reilly Media Inc., Sebastopol, pp. 194-221, 2008.
[7] W. Brendel and S. Todorovic, "Video Object Segmentation by Tracking Regions," ICCV, 2009.
[8] P. L. Correia and F. Pereira, "Classification of Video Segmentation Application Scenarios," IEEE Trans. on Circuits and Systems for Video Technology, 14(5), pp. 735-741, 2004.
[9] R. Ahmed, G. C. Karmakar, and L. S. Dooley, "Probabilistic Spatio-Temporal Video Object Segmentation Incorporating Shape Information," ICASSP, 2005.
[10] S. Beucher, "The watershed transformation applied to image segmentation," 10th Pfefferkorn Conf. on Signal and Image Processing in Microscopy and Microanalysis, 16-19 Sept. 1991, Cambridge, UK, Scanning Microscopy International, suppl. 6, pp. 299-314, 1992.
[11] J. Huang, S. Ravikumar, M. Mitra, W. J. Zhu, and R. Zabih, "Spatial Color Indexing and Applications," International Journal of Computer Vision, 35(3), pp. 245-268, 1999.
[12] Y. C. Huang, Q. S. Liu, and D. Metaxas, "Video Object Segmentation by Hypergraph Cut," CVPR, 2009.
[13] D. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," International Journal of Computer Vision, 60(2), pp. 91-110, 2004.
[14] S. J. Sun, D. R. Haynor, and Y. M. Kim, "Semiautomatic Video Object Segmentation Using VSnakes," IEEE Trans. on Circuits and Systems for Video Technology, 13(1), pp. 75-82, 2003.
[15] N. Vaswani, Y. Rathi, A. Yezzi, and A. Tannenbaum, "Deform PF-MT: Particle Filter with Mode Tracker for Tracking Non-Affine Contour Deformations," IEEE Trans. Image Processing, 19(4), pp. 841-857, 2009.
[16] C. Wang, T. Hassan, N. El-Sheimy, and M. Lavigne, "Automatic Road Vector Extraction for Mobile Mapping Systems," XXI Congress, ISPRS, 2008.
[17] T. T. Zin and H. Hama, "A Method Using Morphology and Histogram for Object-based Retrieval in Image and Video Databases," International Journal of Computer Science and Network Security, 7(9), pp. 123-129, 2007.
