A VECTOR-BASED APPROACH TO DIGITAL IMAGE SCALING

Sudha Velusamy, Ajit Bopardikar, Radhika R, Amit Prabhudesai and Basavaraja S V†

Samsung India Software Operations, Bangalore, India. Email: {sudha.v, ajit, radhikar, mr.amitp}@samsung.com, svbasavaraj@gmail.com
ABSTRACT

Image and video capture devices are often designed with different capabilities and end-users in mind. As a result, images are frequently captured at one resolution and need to be displayed at another, so resizing remains an important research area. Well-known pixel-based methods (such as bicubic interpolation) have traditionally been used for this purpose. However, these methods are often found lacking in quality from a perceptual standpoint, as they tend to soften edges and blur details. Recently, vectorization-based approaches have been explored because of their inherent property of artifact-free scaling; however, these methods fail to faithfully reproduce textures and fine details in an image. In this work, we present a novel, layered approach that combines image vectorization with a pixel-based interpolation technique to achieve high-quality scaling of digital images. The proposed method decomposes an image into two layers: a 'coarse' layer capturing the visually important structure of the image, and a 'fine' layer containing the details. We vectorize only the coarse layer and render it at the desired scale, while using classical interpolation on the fine layer. The final scaled image is composed by blending the independently scaled layers. We compare the performance of the proposed method with several state-of-the-art vectorization and scaling methods, and report better performance.

Keywords— Vector Graphics, Vectorization, Realistic Rendering, Image Resizing, Decomposition.

1. INTRODUCTION

Television screens, especially high-definition (HD) screens, are being increasingly used to view not just broadcast content, but also other content such as personal multimedia collections captured using digital cameras and mobile phones. Often the resolution of the multimedia content does not match the resolution of the display screen.
Devising scaling methods that bridge the difference in resolution between content and displays has long been an active research area. Classical pixel-based scaling techniques such as bicubic interpolation are fast and efficient, but yield poor visual quality. In recent times, several edge-based techniques have been proposed. When multiple low-resolution images of a scene are available, super-resolution

algorithms hold the promise of better image quality. However, their high computational complexity has been an impediment to their implementation in real-time and embedded systems. Recently, vector graphics has been an area of active research because it promises arbitrary scaling of a given image while allowing easy editing and a compact representation. Vector graphics already finds use in representing fonts, enabling artifact-free scaling of text. It is intuitive, then, to extend this approach to image scaling. This is, however, not a trivial task, owing to the difficulties involved in vectorizing (bitmap-to-vector conversion of) natural images. In this work, we present a vectorization-based layered image scaling technique. The approach is motivated by the fact that the human visual system resolves a scene in hierarchical layers, from coarse structures to fine details. Most scaling techniques are inefficient at simultaneously scaling edge-like structures and fine details. In the present work, we decompose an image into coarse and fine layers, and apply scaling techniques that are appropriate for each layer. Finally, the scaled layers are blended together to produce the final output. In the next section, we review related work in the area of vector- and pixel-based image scaling.

2. RELATED WORK

The problem of vectorization, or raster-to-vector conversion, has been actively studied in the computer graphics community. Researchers have addressed the vectorization of line drawings [1] and of synthetic images such as cartoons [2, 3], as well as the more challenging problem of vectorizing natural color images [4, 5, 6]. These approaches can be grouped into two main categories based on the type of images they handle. The first category deals with mostly synthetic images that have a limited color palette, smooth color fills with well-defined borders, and no (or relatively simple) texture. The approaches of Zhang et al. [2] and Koloros and Zara [3] fall in this category. These approaches cannot handle real-life images that have many colors, smooth gradients and complex textures. The second category of methods attempts to generate a photo-realistic reconstruction of such images. Battiato et al. [4] present a region-based approach that generates an over-segmented image using the watershed algorithm and fits polygons to the resulting region boundaries. The information is represented in the Scalable Vector Graphics (SVG) format, which allows a very compact encoding of the vector data. A related approach [7], used in the RaveGrid


† The author is currently with Nokia Pvt Ltd, India.

978-1-4244-7493-6/10/$26.00 © 2010 IEEE


ICME 2010

software [8], employs a Delaunay triangulation that fits triangles to image regions. The limitation of this method (and of [4]) is that a large amount of vector data (triangles or polygons) is required for a faithful reconstruction of the original image. Mesh-based techniques have also been proposed that use regular or irregular meshes to represent the image data. Price and Barrett [5] present a method for easy image editing based on mesh fitting. A similar method, based on gradient meshes, is presented by Sun et al. [6]. Though these methods render most natural images sufficiently well, rendering fine textures (such as hair or fur) remains a challenge. Vector-based approaches have primarily been used to design tools for professional illustrators, for easy creation and/or editing of image content. Text fonts are often represented in vector form because of the advantages of scaling vector data. It is then intuitive to extend this philosophy to images. Applying vectorization techniques to the image scaling problem, however, is far from trivial. Simple images are relatively easy to vectorize faithfully; real-life images are significantly more complex, and photo-realistic reconstruction of such images using vectorization methods remains a challenge. Image scaling is an area that has received considerable attention over the past several years. So-called super-resolution techniques [9, 10] use several low-resolution images of a scene to reconstruct a high-resolution image of the same scene. Single-image interpolation techniques have also been proposed, most notably a number of edge-directed interpolation (EDI) methods (see [11] for an overview) that use local statistical and geometrical properties to interpolate the missing pixel values.
These methods are superior to conventional scaling methods such as bicubic interpolation, as they preserve the sharpness and continuity of the interpolated edges. The primary disadvantage of most of these methods is their computational cost, which makes real-time implementation difficult, especially on embedded hardware platforms. Our research was guided by the quest for an efficient image scaling solution with performance comparable to the best existing techniques. The main contribution of this work is a novel application of vectorization to the image scaling problem. Our approach is motivated by studies of the human visual system [12]. We propose a layered approach wherein the image to be scaled is decomposed into two layers, and each layer is processed separately. Our approach avoids the pitfalls of vectorizing natural images by processing only the most perceptually significant information in the image with vectorization techniques. We note that Saito et al. [13] adopt a similar (layered) approach; our main contribution is the use of vectorization on top of such a layered representation.

3. THE PROPOSED SYSTEM

The proposed method is based on decomposing the input image into two layers. We assume an additive model that decomposes an image I into a coarse layer U and a texture layer V such that I ≈ U + V. The method is based on the premise that rescaling each layer separately, using methods that are most appropriate to that layer, followed by blending of the scaled layers, should yield improved quality over conventional methods.

Fig. 1. The proposed system

3.1. System Overview

Figure 1 gives a pictorial representation of the proposed system. The 'coarse' layer is extracted from the input image by passing it through the detail-wiping module. This layer is then converted to vector form and rendered at the original scale. The 'fine' or texture layer is generated by computing a pixel-wise difference between the original image and the rendered coarse layer. For scaling, the vector representation of the coarse layer is rendered at the desired magnification by the rendering engine. The texture layer is then scaled separately by the same magnification factor. Finally, the scaled layers are blended to yield the rescaled image. The blending process includes a post-processing step on the rendered coarse layer; this is done to ensure a more natural and visually pleasing output image. Our approach of separating the image into coarse and fine layers circumvents the problem of vectorizing richly textured natural images. We input only the coarse layer, which is free of fine textures and does not require any over-segmentation for faithful reconstruction, to the vectorization module. Hence, vectorization yields fast, efficient scaling of the coarse layer.

3.2. Generating the Coarse Layer

Coarse layer generation involves two major modules: the detail-wiping filter and the vectorization module.

Detail-wiping filter: We pass the input image, I, through a detail-wiping filter that retains the strong edges and salient structure, and wipes out fine details such as textures. Examples of detail-wiping filters include the Symmetric Nearest Neighbor filter, the Kuwahara filter [14], and the bilateral filter [15]. We found that the bilateral filter provides the most visually appealing results and good control over the level of detail to be wiped out.
The output of the bilateral filter, IBF, containing only strong edges and gross structure, is then presented to the vectorization module.
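To make the decomposition concrete, the following is a minimal NumPy sketch of the two-layer split. The brute-force bilateral filter below is illustrative only: the paper does not specify filter parameters, so the radius and the two sigmas are assumptions, and a real implementation would use an optimized filter. Note that the additive model I = U + V holds exactly by construction once V is defined as the residual.

```python
import numpy as np

def bilateral_filter(img, radius=4, sigma_s=2.0, sigma_r=30.0):
    """Brute-force bilateral filter for a grayscale float image.

    Each output pixel is a weighted average of its neighborhood; the
    weight combines a spatial Gaussian (sigma_s) with a range Gaussian
    on intensity differences (sigma_r), so strong edges survive while
    fine texture is averaged away.
    """
    h, w = img.shape
    pad = np.pad(img, radius, mode="reflect")
    yy, xx = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    g_spatial = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma_s ** 2))
    out = np.empty_like(img)
    for i in range(h):
        for j in range(w):
            patch = pad[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            g_range = np.exp(-((patch - img[i, j]) ** 2) / (2.0 * sigma_r ** 2))
            weights = g_spatial * g_range
            out[i, j] = np.sum(weights * patch) / np.sum(weights)
    return out

# Demo: a step edge with mild noise. The filter smooths the noise but
# keeps the edge, so U carries structure and V carries the fine residue.
rng = np.random.default_rng(0)
step = np.where(np.arange(24)[None, :] < 12, 50.0, 200.0)
image = step + rng.normal(0.0, 5.0, size=(24, 24))
coarse = bilateral_filter(image)   # U: structure layer
detail = image - coarse            # V: residual layer, so I = U + V exactly
```

The residual definition guarantees that nothing is lost in the split, which is the property the paper relies on when arguing that segmentation errors in U are compensated by V.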


Fig. 2. Example. Top: input image; bottom left: coarse layer; bottom right: detail layer

Vectorization: We use vectorization for scaling because vector primitives (such as lines, curves and polygons) can be rendered at any magnification without artifacts. Vectorization involves three major modules: i) image segmentation; ii) curve fitting for the segmented regions; and iii) rendering. The bilateral-filtered image IBF is given to the segmentation module, which decomposes it into visually homogeneous regions to be processed by the curve fitting module. In our work, we used the 'EDISON' segmentation method [16]. The algorithm gives the user control over the level of segmentation through parameters such as the color threshold and the processing window size. It segments the input image such that each connected component of pixels is assigned a unique label. Each label is then assigned a color value that is the mean of the color values of all pixels carrying that label. The output of the segmentation module is thus an approximation of the image with flat color regions. Note that the proposed method does not inherit the limitations of the particular segmentation method used, because the additive decomposition model ensures that any information lost in U (or V) is captured in its complement layer, such that I ≈ U + V. The curve fitting module fits the boundary of each segmented region with suitable geometric shapes (polygons and lines). This results in a vector representation of the filtered image, which we denote Iv. The shapes/paths, along with their color information, can be stored in the SVG format. Given a required display resolution (for example, the resolution of image I), the rendering module renders an image, U, which is the 'coarse layer' of I (see Fig. 2). We use the Potrace library [17], which internally fits cubic Bézier splines, for curve fitting, and OpenVG for rendering.

3.3. Generating the Fine Layer

The rendered coarse layer generated above is subtracted, pixel-wise, from the original bitmap image to obtain the fine-detail or texture layer, V (see Fig. 2). This layer contains the fine texture details that are complementary to the coarse layer content. Fig. 2 shows the coarse and fine-detail layers for a sample input image.
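The flat-color output of the segmentation stage can be imitated with a much simpler stand-in than EDISON (which is a mean-shift-based system). The sketch below quantizes a grayscale image, labels connected regions of equal quantized value, and fills each region with its mean intensity; the function name and the quantization step are illustrative assumptions, not part of the paper's pipeline.

```python
import numpy as np
from scipy import ndimage

def flat_color_approximation(gray, n_levels=8):
    """Fill each connected region of similar intensity with its mean,
    mimicking the flat-color approximation a segmenter produces."""
    # Quantize so that "visually homogeneous" pixels share a bin.
    q = np.clip((gray / (256.0 / n_levels)).astype(np.int64), 0, n_levels - 1)
    out = np.zeros_like(gray, dtype=np.float64)
    for level in range(n_levels):
        mask = q == level
        if not mask.any():
            continue
        # Unique label per connected component within this bin.
        labels, n = ndimage.label(mask)
        # Mean of the ORIGINAL intensities of pixels carrying each label.
        means = ndimage.mean(gray, labels, index=np.arange(1, n + 1))
        out[mask] = np.asarray(means)[labels[mask] - 1]
    return out

# Example: an image made of two flat regions comes back unchanged.
img = np.zeros((10, 10))
img[:, :5], img[:, 5:] = 40.0, 200.0
flat = flat_color_approximation(img)
```

As the paper notes, any detail discarded by this stage (here, within-bin intensity variation) would simply migrate to the residual layer V under the additive model.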

Fig. 3. Flow diagram of the adaptive filter

3.4. Scaling the Coarse and Fine Layers

Given a scale factor M, the coarse layer U and the detail layer V are independently scaled by the same factor.

Scaling the coarse layer: Scaling the coarse layer is a simple, efficient process that amounts to rendering the vector data at the required scale factor M. Vector-based scaling of the coarse layer results in an image Us with sharp, well-defined edges, as geometric vector primitives remain sharp and artifact-free irrespective of the rendering scale.

Scaling the detail layer: The fine-detail layer is independently scaled by the same factor using a suitable texture scaling algorithm. Since the residue image does not contain sharp, long edges, a basic interpolation scheme with a small kernel, such as bilinear interpolation, can be used. The scaled fine-detail layer is denoted Vs.

3.5. Blending the Layers

The final output is constructed by blending the scaled coarse layer, Us, and the scaled detail layer, Vs. However, direct blending may produce an output image that is sharp but not visually pleasing. This is due to the unnatural sharpness (a halo-like effect) introduced around the edges of the image when a very sharp vector layer is combined with a smooth fine-detail layer. To avoid this, we filter Us with a spatially adaptive blur filter, H. This may seem to run counter to our stated aim of generating a sharp, high-quality rescaled version of the input image; however, the amount of blur introduced is very small and is applied selectively, only in certain regions. The result is a final output that is more pleasing and yet sharp. The filter used is a spatially varying smoothing filter (with a Gaussian kernel), whose parameters are decided based on the 'blur map' (computed as in [18]) of the input image. A flow chart of the filtering process is shown in Fig. 3.
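The detail-layer scaling step of Sec. 3.4 can be sketched as follows; `scipy.ndimage.zoom` with `order=1` performs exactly the small-kernel bilinear interpolation suggested there (the helper name is ours, and `mode="nearest"` is an assumption about boundary handling):

```python
import numpy as np
from scipy.ndimage import zoom

def scale_detail_layer(detail, factor):
    """Scale the residual layer with bilinear (order-1) interpolation.

    Since the residue has, by construction, no long sharp edges, a
    small interpolation kernel is adequate and cheap.
    """
    return zoom(np.asarray(detail, dtype=np.float64), factor,
                order=1, mode="nearest")
```

A usage example: `scale_detail_layer(v, 4)` upsamples the residual layer by the paper's magnification factor M = 4 before blending.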
The final, high-resolution output image Iout is obtained by combining the vector-scaled and post-processed coarse layer with the interpolated residue layer:

Iout = (H ∗ Us) + Vs,    (1)

where ∗ denotes the convolution operation.
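A hedged sketch of Eq. (1): the spatially adaptive filter H, driven by the blur map of [18], is simplified here to a single mild, uniform Gaussian, so this shows only the blending arithmetic, not the full adaptive scheme.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blend_layers(coarse_scaled, detail_scaled, sigma=0.8):
    """Eq. (1), I_out = (H * U_s) + V_s, with one simplification:
    a uniform Gaussian stands in for the spatially adaptive blur H.

    A small sigma softens the 'unnatural sharpness' of the vector
    layer before the texture residue is added back.
    """
    softened = gaussian_filter(np.asarray(coarse_scaled, dtype=np.float64),
                               sigma=sigma)
    return np.clip(softened + detail_scaled, 0.0, 255.0)
```

The clipping step is our assumption for 8-bit output; the paper does not state how out-of-range values are handled after summation.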


Fig. 4. Comparison of results for a synthetic image: (a) input, (b) proposed, (c) bicubic, (d) NEDI, (e) vectorization

4. EXPERIMENTS AND RESULTS

We begin this section with a simple example that demonstrates the effectiveness of our approach. We composed a synthetic image comprising regions with different visual properties: uniform fill, a continuous color gradient, and texture. Fig. 4(a) shows this image. Fig. 4(b) shows the marked region of Fig. 4(a) magnified by a factor of 8 using the proposed method. Observe the smooth and sharp edges at the borders of the regions. We compare our method with three competing approaches. Fig. 4(c) shows the result of 'classical' bicubic interpolation. Next, the result obtained using the New Edge-Directed Interpolation (NEDI) [19] algorithm is shown in Fig. 4(d). We also compare our method with naive vectorization, in which the entire image is vectorized (Fig. 4(e)). Bicubic interpolation shows the familiar blurring across edges. The NEDI method fares considerably better, but at a higher computational cost. Naive vectorization does not produce a visually appealing result, yielding jagged edges; this is attributed to the extremely fine segmentation of the image that is necessary to faithfully reproduce the textured regions. This simple example clearly shows the potential of our differential processing approach. We now detail the experimental setup used to validate the proposed method and demonstrate its performance on several real-life images.

4.1. Image Database

We collected a large set of high-resolution images from a variety of sources: Flickr, Google Images, the Berkeley segmentation dataset [20], and our personal photo collections. Fig. 5 shows a sample set of the images used in our experiments. The low-resolution images used in our experiments were generated assuming a typical image acquisition model involving subsampling and blur. The blur models the camera point spread function (PSF) and also presents a challenging case for image scaling experiments. We used down-sampling factors of 4 and 8 in our experiments.

Fig. 5. Sample database (top: image indices 1–4, left to right; bottom: image indices 5–8, left to right)

4.2. Quality Metrics

For an objective quality comparison of the reconstructed images, we use the Structural SIMilarity (SSIM) metric [21] and the widely used Peak Signal-to-Noise Ratio (PSNR). Both PSNR and SSIM are full-reference quality metrics, and we use the original high-resolution images as the reference in computing these scores.

4.3. Results

We now compare our approach with the three approaches discussed earlier, namely bicubic interpolation, the NEDI [19] method, and naive vectorization. Fig. 6 compares the performance of the proposed method to these methods using PSNR and SSIM as objective quality metrics. The results presented are for the 8 test images shown in Fig. 5, with a magnification factor of 4. As can be seen, both SSIM and PSNR indicate the superior performance of the proposed algorithm.

Fig. 6. Comparison of PSNR (top) and SSIM (bottom) values
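The two metrics can be computed as follows. The PSNR matches the standard definition; the SSIM shown here is a simplified single-window (global) variant of the metric in [21], which is properly computed by averaging over local windows, so treat it as a sketch rather than the published metric.

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB for a full-reference comparison."""
    mse = np.mean((np.asarray(ref, dtype=np.float64)
                   - np.asarray(test, dtype=np.float64)) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def global_ssim(x, y, peak=255.0):
    """Single-window SSIM: luminance/contrast/structure terms computed
    once over the whole image instead of averaged over local windows."""
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2.0 * mx * my + c1) * (2.0 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Both functions take the original high-resolution image as the reference, as done in the experiments above.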



Fig. 7. Comparison of our approach with competing techniques: (a) input, (b) proposed, (c) bicubic, (d) NEDI, (e) naive vectorization

Figures 7 and 8 show examples of images scaled with the above methods for a magnification factor of 8. The low-resolution input images are shown in Figs. 7(a) and 8(a). Figs. 7(b) and 8(b) show the scaling results of the proposed method, while images (c)-(e) of Figs. 7 and 8 show the results of bicubic interpolation, NEDI and naive vectorization, respectively. As before, we observe that NEDI performs better than simple interpolation, as it prevents smoothing across edges. However, the crispness of the image produced by the proposed approach is immediately evident upon comparing images (b) and (c) in Fig. 7. Observe also how the differential processing yields crisp but smooth edges while preserving the fine texture fairly well; the NEDI method, on the other hand, blurs fine texture. We also observe the limitations of naive vectorization in handling natural images: even with a highly over-segmented image, it fails to yield a visually pleasing output. Finally, as noted earlier, because the proposed method vectorizes only the coarse layer, it is much faster than naive vectorization.

Fig. 8. Comparison of our approach with competing techniques: (a) input, (b) proposed, (c) bicubic, (d) NEDI, (e) naive vectorization

5. CONCLUSION

We described a novel way to apply vectorization techniques to the problem of image scaling. We adopt a layered approach that avoids the shortcomings of both naive vectorization and pure pixel-based interpolation methods. The proposed method gives superior quality at high magnification factors compared to recently proposed methods. The layered approach has been described here for single-image resizing; however, it can easily be extended to the multiple-image case. One direction we are considering in this context is the application of super-resolution techniques to this problem.

6. REFERENCES

[1] X. Hilaire and K. Tombre, "Robust and accurate vectorization of line drawings," IEEE Trans. on Pattern Analysis and Machine Intel., vol. 28, no. 6, Jun. 2007.

[2] S.-H. Zhang, T. Chen, Y.-F. Zhang, S.-M. Hu, and R. R. Martin, "Vectorizing cartoon animations," IEEE Trans. on Visualization and Computer Graphics, vol. 15, no. 4, pp. 618–629, 2009.

[3] M. Koloros and J. Zara, "Coding of vectorized cartoon video data," in Proc. of Spring Conf. on Computer Graphics, 2006, pp. 177–183.

[4] S. Battiato, G. Puglisi, and G. Impoco, "Vectorization of raster color images," in Proc. of 2nd Natl. Conf. of the Group of Color, Sept. 2006, pp. 20–22.

[5] B. Price and W. Barrett, "Object-based vectorization for interactive image editing," in Proc. of Vision Comp. (Pacific Graphics), Sept. 2006, pp. 661–670.

[6] J. Sun, L. Liang, F. Wen, and H.-Y. Shum, "Image vectorization using optimized gradient meshes," ACM Trans. on Graphics (Proc. SIGGRAPH), vol. 26, no. 3, Jul. 2007.

[7] S. Swaminarayan and L. Prasad, "Rapid automated polygonal image decomposition," in IEEE Applied Imagery and Pattern Recogn. Workshop, Oct. 2006, pp. 28–33.

[8] "RaveGrid," http://www.lanl.gov/software/RaveGrid/.

[9] S. C. Park, M. K. Park, and M. G. Kang, "Super-resolution image reconstruction: a technical overview," IEEE Signal Processing Magazine, vol. 20, no. 3, pp. 21–36, 2003.

[10] M. Protter, M. Elad, H. Takeda, and P. Milanfar, "Generalizing the non-local means to super-resolution reconstruction," IEEE Trans. on Image Processing, vol. 18, no. 1, Jan. 2009.

[11] W.-S. Tam, C.-W. Tok, and W.-C. Siu, "A modified edge directed interpolation for images," in Proc. of European Signal Processing Conf., Aug. 2009.

[12] R. A. Young, "The Gaussian derivative model for spatial vision," Spatial Vision, vol. 2, no. 4, pp. 273–293, 1987.

[13] T. Saito, Y. Ishii, Y. Nakagawa, and T. Komatsu, "Adaptable image interpolation with skeleton-texture separation," in IEEE Intl. Conf. on Image Processing, 2006.

[14] S. Chen and T.-Y. Shih, "On the evaluation of edge preserving smoothing filter," in Proc. Geoinformatics, 2002.

[15] C. Tomasi and R. Manduchi, "Bilateral filtering for gray and color images," in IEEE Intl. Conf. on Computer Vision, 1998, pp. 839–846.

[16] "Fast Edge Detection and Image Segmentation (EDISON) System," http://tinyurl.com/d9h7g4.

[17] P. Selinger, "Potrace - a polygon-based tracing algorithm," http://potrace.sourceforge.net/.

[18] P. Marziliano, F. Dufaux, S. Winkler, and T. Ebrahimi, "A no-reference perceptual blur metric," in IEEE Intl. Conf. on Image Processing, Sept. 2002, pp. 57–60.

[19] X. Li and M.-T. Orchard, "New edge-directed interpolation," in IEEE Intl. Conf. on Image Processing, Oct. 2001.

[20] "The Berkeley Segmentation Dataset and Benchmark," http://www.eecs.berkeley.edu/Research/Projects/CS/vision/.

[21] Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, "Image quality assessment: from error visibility to structural similarity," IEEE Trans. on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.

