Pranali P. Lokhande
Department of Computer Science and Engineering, Sipnas College of Engineering & Technology, Amravati, Maharashtra, India
pranali.lokhande@gmail.com

Pritish A. Tijare
Associate Professor, Department of Computer Science and Engineering, Sipnas College of Engineering & Technology, Amravati, Maharashtra, India
pritishtijare@rediffmail.com
Abstract
Nowadays people are increasingly interested in using digital images, so there is a great need for efficient techniques for finding images. Content Based Image Retrieval (CBIR) is a significant and increasingly popular approach to retrieving image data from huge collections. To find an image, the image has to be represented by certain features. Color, texture, and shape are three important visual features of an image, and an efficient image retrieval technique can be implemented using all three. Color is one of the most widely used low-level visual features and is invariant to image size and orientation; conventional color features used in CBIR include the color histogram, the color correlogram, and the Dominant Color Descriptor (DCD). Although a precise definition of texture is elusive, the notion generally refers to the presence of a spatial pattern that has some properties of homogeneity. In particular, the homogeneity cannot result from the presence of only a single color in a region, but requires the interaction of various colors. There is no universal definition of shape either. Impressions of shape can be conveyed by color or intensity patterns, or by texture, from which a geometrical representation can be derived; we can therefore consider shape as something geometrical. Accordingly, edge orientation over all pixels can be treated as texture, while edge orientation at region contours alone constitutes shape information.
Keywords: Content Based Image Retrieval, Linear Block Algorithm, Dominant Color Descriptor, Gray Level Co-occurrence Matrix, Color Co-occurrence Matrix, Block Difference of Inverse Probabilities, Block Variation of Local Correlation coefficients, Alexandria Digital Library, Color Hierarchical Representation Oriented Management Architecture, Picture & Self Organized Map, Query By Image Content.

1. Introduction
Due to the proliferation of video and image data in digital form, Content Based Image Retrieval (CBIR) has become a prominent research topic, motivating extensive work on image retrieval systems. From a historical perspective, early image retrieval systems were text-based: images had to be annotated and indexed accordingly. However, with the substantial growth in both image size and image database size, manual annotation becomes very cumbersome and, to some extent, subjective and thereby incomplete, as text often fails to convey the rich structure of images. This motivates research into what is referred to as CBIR. An important problem that therefore needs to be addressed is fast retrieval of images from large databases: image retrieval systems search a database to find images that are perceptually similar to a query image. CBIR can greatly enhance the accuracy of the information returned, and it is an important alternative and complement to traditional text-based image searching.
Special Issue
Page 1 of 86
CBIR is desirable because most web-based image search engines rely purely on metadata, which produces a lot of garbage in the results. Having humans manually enter keywords for images in a large database is also inefficient and expensive, and may not capture every keyword that describes the image. A system that can filter images based on their content would thus provide better indexing and return more accurate results. Color, texture, and shape features have been used to describe image content. Color is one of the most widely used low-level visual features and is invariant to image size and orientation; the color histogram and the color correlogram are conventional color features used in CBIR. A color space is defined as a model for representing color in terms of intensity values; a color component, or color channel, is one of its dimensions. The RGB model uses three primary colors, red, green, and blue, in an additive fashion to reproduce other colors. As this is the basis of most computer displays today, the model has the advantage of being easy to extract: in a true-color image, each pixel has a red, green, and blue value ranging from 0 to 255. Many objects in an image can be distinguished solely by their textures, without any other information. Texture may describe the structural arrangement of a region and its relationship to the surrounding regions, and may consist of some basic primitives. Texture is a very general notion that can be attributed to almost everything in nature. For a human, texture relates mostly to a specific, spatially repetitive (micro)structure of surfaces formed by repeating a particular element, or several elements, in different relative spatial positions; generally, the repetition involves local variations of scale, orientation, or other geometric and optical features of the elements. Texture is a key component of human visual perception.
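As a concrete illustration of the RGB color description above, the following sketch (using only NumPy; the function name and the bin count are our own illustrative choices, not taken from any system surveyed here) quantizes each 0-255 channel and builds a normalized color histogram that is invariant to image size:

```python
import numpy as np

def rgb_histogram(image, bins_per_channel=8):
    """Quantize each 0-255 channel into bins, count pixels per (r, g, b)
    bin combination, and normalize so the histogram sums to 1."""
    # image: H x W x 3 array of uint8 RGB values
    quantized = (image.astype(np.uint32) * bins_per_channel) // 256
    # Combine the three per-channel bins into a single bin index.
    idx = (quantized[..., 0] * bins_per_channel + quantized[..., 1]) \
          * bins_per_channel + quantized[..., 2]
    hist = np.bincount(idx.ravel(), minlength=bins_per_channel ** 3)
    return hist / hist.sum()

# Example: a 4x4 image that is half pure red, half pure blue.
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[:, :2, 0] = 255   # left half red
img[:, 2:, 2] = 255   # right half blue
h = rgb_histogram(img)
print(h.max())  # 0.5 -- the red bin and the blue bin each hold half the pixels
```

Because the histogram is normalized, two images of different sizes but similar color content produce comparable feature vectors, which is the invariance property noted above.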
Like color, this makes texture an essential feature to consider when querying image databases. Everyone can recognize texture, but it is difficult to define. Unlike color, texture occurs over a region rather than at a point; it is normally perceived through intensity levels and as such is orthogonal to color. Texture can be described in terms of direction, coarseness, contrast, and so on, and has qualities such as periodicity and scale. This makes texture a particularly interesting facet of images and has led to a plethora of ways of extracting texture features. Image textures are defined as images of natural textured surfaces and of artificially created visual patterns that approach, within certain limits, these natural objects. Image sensors introduce additional geometric and optical transformations of the perceived surfaces, and these transformations should not affect the particular class of textures to which a surface belongs. It is almost impossible to describe textures in words, although each human definition involves various informal qualitative structural features such as fineness, coarseness, smoothness, granularity, lineation, directionality, roughness, regularity, and randomness. These features, which define the spatial arrangement of texture constituents, help to single out the desired texture types, e.g. fine or coarse, close or loose, plain or twilled or ribbed textile fabrics. It is difficult to use human classifications as a basis for formal definitions of image textures, because there is no obvious way of associating these features, easily perceived by human vision, with computational models intended to describe textures. Nonetheless, after several decades of research into texture analysis and synthesis, a variety of computational characteristics and properties for indexing and retrieving textures have been found. In many cases, the textural features follow from a particular random-field model of textured images.
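One such computational texture characteristic, the Gray Level Co-occurrence Matrix (GLCM) listed in the keywords, can be sketched as follows. This is a minimal illustration with our own parameter choices, not the implementation of any system surveyed here: it counts how often pairs of gray levels co-occur at a fixed pixel offset and derives two common descriptors from the resulting matrix.

```python
import numpy as np

def glcm(image, levels=8, dx=1, dy=0):
    """Gray Level Co-occurrence Matrix: how often gray level j occurs at
    offset (dx, dy) from gray level i, normalized to probabilities."""
    g = (image.astype(np.uint32) * levels) // 256  # quantize gray levels
    m = np.zeros((levels, levels))
    h, w = g.shape
    for y in range(max(0, -dy), h - max(0, dy)):
        for x in range(max(0, -dx), w - max(0, dx)):
            m[g[y, x], g[y + dy, x + dx]] += 1
    return m / m.sum()

def glcm_stats(p):
    """Contrast and energy, two common GLCM texture descriptors."""
    i, j = np.indices(p.shape)
    contrast = float(((i - j) ** 2 * p).sum())
    energy = float((p ** 2).sum())
    return contrast, energy

flat = np.full((8, 8), 128, dtype=np.uint8)  # a perfectly uniform patch
contrast, energy = glcm_stats(glcm(flat))
print(contrast, energy)  # 0.0 1.0 -- a uniform image has no gray-level transitions
```

High contrast indicates many transitions between distant gray levels (coarse or busy texture), while energy close to 1 indicates a very regular, uniform region.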
As humans can recognize objects solely from their shapes, object shape features can also provide powerful information for image retrieval. The shape usually carries semantic information, and shape features differ from other elementary visual features such as color or texture. Basically, shape features can be categorized as boundary-based or region-based: the former extract features from the outer boundary of the region, while the latter extract features from the entire region. Shape matching is a well-explored research area, with many shape representation and similarity measurement techniques found in the literature, and shape features have been used extensively in retrieval systems. At the primitive level, shape is the most obvious requirement: it is a well-defined concept, and there is considerable evidence that natural objects are primarily recognized by their shape. We review CBIR systems based on color, texture, and shape.
The remainder of this paper is organized as follows. Section 2 presents the methods used for implementing CBIR. Section 3 outlines various CBIR systems. Section 4 describes some potential applications of CBIR. Section 5 presents our conclusion.
2.1.4. CIE Lab color space

In [9], the CIE Lab color space is used to characterize the color content of an image. The Lab color coordinate system has the benefit that equal distances in the color space correspond to equal perceived color differences, and lightness is distinct and independent from the chromaticity coordinates.
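The perceptual-distance property can be illustrated with a small sketch: convert sRGB to CIE Lab using the standard D65 formulas and take the Euclidean (CIE76) distance. The helper names are our own, and [9] does not prescribe this code; it is only a demonstration of why Lab distances are meaningful.

```python
import numpy as np

def rgb_to_lab(rgb):
    """Convert an sRGB triple (0-255) to CIE Lab under the D65 white point."""
    c = np.asarray(rgb, dtype=float) / 255.0
    # sRGB gamma expansion to linear RGB
    c = np.where(c > 0.04045, ((c + 0.055) / 1.055) ** 2.4, c / 12.92)
    # linear RGB -> XYZ (standard sRGB/D65 matrix)
    M = np.array([[0.4124, 0.3576, 0.1805],
                  [0.2126, 0.7152, 0.0722],
                  [0.0193, 0.1192, 0.9505]])
    xyz = M @ c
    xyz /= np.array([0.95047, 1.0, 1.08883])  # normalize by D65 white point
    # XYZ -> Lab nonlinearity
    f = np.where(xyz > (6 / 29) ** 3,
                 np.cbrt(xyz),
                 xyz / (3 * (6 / 29) ** 2) + 4 / 29)
    L = 116 * f[1] - 16
    a = 500 * (f[0] - f[1])
    b = 200 * (f[1] - f[2])
    return np.array([L, a, b])

def delta_e(c1, c2):
    """CIE76 color difference: Euclidean distance in Lab space."""
    return float(np.linalg.norm(rgb_to_lab(c1) - rgb_to_lab(c2)))

print(delta_e([255, 255, 255], [255, 255, 255]))  # 0.0
print(delta_e([0, 0, 0], [255, 255, 255]))        # about 100 (black vs. white)
```

Because distances in Lab track human perception, a nearest-neighbor search over Lab-based color features returns images whose colors actually look similar, which is the motivation for choosing this space in [9].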
defined as local covariance normalized by local variance. The value of BVLC is determined as the difference between the maximum and minimum values of block-based local correlation coefficients computed along four orientations.

2.2.5. Tamura and Gabor

In [8], Tamura features and Gabor filters are used for texture extraction. Tamura et al. took the approach of devising texture features that correspond to human visual perception. They defined six textural features, coarseness, contrast, directionality, line-likeness, regularity, and roughness, and compared them with psychological measurements for human subjects. The first three attained very successful results and are used in their evaluation, both separately and as joint values. The use of Gabor filters has been one of the most popular signal-processing approaches to texture feature extraction. Gabor filters enable filtering in both the spatial and frequency domains, and it has been proposed that they can model the responses of the human visual system. Turner first used a bank of Gabor filters to analyze texture. A bank of filters at different scales and orientations allows multichannel filtering of an image to extract frequency and orientation information, which can then be used to decompose the image into texture features. The feature is computed by first filtering the image with a bank of orientation- and scale-sensitive filters and then computing the mean and standard deviation of the output in the frequency domain.

2.2.6. Wavelet frames

In [9], wavelet frames are used for texture extraction. The basic tools for processing textured images are a filter bank and wavelet frames, which ensure the translation invariance of the features. The filter bank can simplify the classification problem by decomposing the image into orthogonal components: the input signal is decomposed into wavelet coefficients corresponding to different scales. Discrete wavelet frames can handle important texture characteristics such as periodicity and translational invariance.
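The Gabor filter-bank procedure of Section 2.2.5 (filter the image at several scales and orientations, then keep the mean and standard deviation of each response) can be sketched as below. The kernel size, wavelengths, and sigma are illustrative choices of ours, not the parameters used in [8].

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Real part of a Gabor kernel: a Gaussian-windowed cosine grating."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)   # rotate coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    gauss = np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
    return gauss * np.cos(2 * np.pi * xr / wavelength)

def gabor_features(image, wavelengths=(4, 8), n_orient=4, size=15, sigma=3.0):
    """Filter the image with a bank of Gabor kernels and keep the mean
    magnitude and standard deviation of each response as the feature."""
    feats = []
    for wl in wavelengths:
        for k in range(n_orient):
            theta = k * np.pi / n_orient
            kern = gabor_kernel(size, wl, theta, sigma)
            # FFT-based (circular) convolution, adequate for a sketch
            resp = np.fft.ifft2(np.fft.fft2(image) *
                                np.fft.fft2(kern, s=image.shape)).real
            feats.extend([np.abs(resp).mean(), resp.std()])
    return np.array(feats)

# A 32x32 image of vertical stripes yields a 16-dimensional feature vector
# (2 wavelengths x 4 orientations x 2 statistics).
img = np.tile(np.sin(np.arange(32) * 2 * np.pi / 8), (32, 1))
f = gabor_features(img)
print(f.shape)  # (16,)
```

Responses are strongest for the filter whose orientation and wavelength match the stripe pattern, which is exactly the frequency-and-orientation selectivity the text describes.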
3. CBIR systems
A number of content-based image retrieval systems are described in [10] and summarized below in alphabetical order. If no matching, querying, indexing data structure, feature, or result presentation is mentioned for a system, then either it is not applicable (e.g. there is no relevance feedback) or no such information is known to us (e.g. indexing data structures are often not documented).
Result presentation: The matched images are shown as thumbnails, and their footprints are shown on the map in the map browser. Drawback: Many evaluations were short-term activities closely bound to development. Most evaluations were too brief to incorporate characteristics of adaptability, though the evaluators did carry what they had learned from evaluating one digital library project into the assessment of others. Finally, there is little evidence that digital library evaluations incorporate triangulation in data collection and analysis.
3.3. DrawSearch
Developer: Department of Electrical and Electronic Engineering, Technical University of Bari, Italy. Matching: The similarity between two feature vectors is given by the cosine metric in the color/shape subsystem. The similarity score between a query and a database image is calculated as a weighted sum of the distances between the two color vectors and between the two shape descriptors. Result presentation: Images are displayed in decreasing similarity order. Drawback: There is a lack of acknowledged "ground truth" performance measures, mainly because of the inherent uncertainty of similarity-based retrieval: images considered similar by one user may differ in the judgement of another.
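The cosine metric and weighted-sum combination described for DrawSearch can be illustrated with a small sketch. The dictionary layout, field names, and weight value here are our own assumptions for the example, not DrawSearch's actual data structures:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1 = same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def combined_score(query, candidate, w_color=0.5):
    """Weighted sum of the color similarity and the shape similarity."""
    return (w_color * cosine_similarity(query["color"], candidate["color"])
            + (1 - w_color) * cosine_similarity(query["shape"], candidate["shape"]))

# Rank a toy database in decreasing similarity order, as DrawSearch displays it.
query = {"color": np.array([1.0, 0.0, 0.0]), "shape": np.array([0.5, 0.5])}
db = [
    {"color": np.array([1.0, 0.1, 0.0]), "shape": np.array([0.5, 0.4])},  # close match
    {"color": np.array([0.0, 1.0, 0.0]), "shape": np.array([0.9, 0.1])},  # poor match
]
ranked = sorted(db, key=lambda img: combined_score(query, img), reverse=True)
```

The weight lets the user trade off color against shape relevance; because the cosine metric ignores vector magnitude, it compares the distribution of feature values rather than their absolute scale.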
3.4. ImageFinder
Developer: Attrasoft Inc. Matching: Matching is done using a Boltzmann Machine, a special kind of probabilistic artificial neural network. The network must first be trained on a set of images, which is done simply by entering the training (key) images; there is no control over the features on which the network is trained. Result presentation: The result is an HTML page with a list of links to the images found. Drawback: It remains unclear on what features the matching is done.
Drawback: The MPEG-7-defined content descriptors can be used as such in the PicSOM system, even though the Euclidean distance calculation inherently used in PicSOM is not optimal for all of them. The PicSOM technique is also somewhat slower than a system based on vector quantization in starting to find relevant images.
4. Applications
In [11], a wide range of possible applications for CBIR technology has been mentioned. They include:
5. Conclusion
CBIR is an active research topic in image processing, pattern recognition, and computer vision. Color is usually represented by the color histogram, color autocorrelogram, or color moments under a certain color space. Texture can be represented by Tamura features, the GLCM (Gray Level Co-occurrence Matrix), Gabor filters, and wavelet transformations. Shape can be represented by Pseudo-Zernike moments. Combining the color and texture features of an image with its shape features provides a robust feature set for image retrieval.
6. References
[1] M. Babu Rao, B. Prabhakara Rao, A. Govardhan, "Content based image retrieval using dominant color and texture features", International Journal of Computer Science and Information Security, vol. 9, no. 2, pp. 41-46, February 2011.
[2] X.-Y. Wang et al., "An effective image retrieval scheme using color, texture and shape features", Computer Standards & Interfaces, vol. 33, pp. 59-68, 2011, doi:10.1016/j.csi.2010.03.004.
[3] Chia-Hung Wei, Yue Li, Wing-Yin Chau, Chang-Tsun Li, "Trademark image retrieval using synthetic features for describing global shape and interior structure", Pattern Recognition, vol. 42, no. 3, pp. 127, 2009.
[4] Fan-Hui Kong, "Image retrieval using both color and texture features", Proceedings of the 8th International Conference on Machine Learning and Cybernetics, Baoding, pp. 2228-2232, 12-15 July 2009.
[5] Ji-Quan Ma, "Content-based image retrieval with HSV color space and texture features", Proceedings of the International Conference on Web Information Systems and Mining, pp. 61-63, 2009.
[6] Nai-Chung Yang, Wei-Han Chang, Chung-Ming Kuo, Tsia-Hsing Li, "A fast MPEG-7 dominant color extraction with new similarity measure for image retrieval", Journal of Visual Communication and Image Representation, vol. 19, no. 2, pp. 92-105, 2008.
[7] Young Deok Chun, Nam Chul Kim, Ick Hoon Jang, "Content-based image retrieval using multiresolution color and texture features", IEEE Transactions on Multimedia, vol. 10, no. 6, pp. 1073-1084, 2008.
[8] P. Howarth, S. Ruger, "Robust texture features for still-image retrieval", IEE Proceedings - Vision, Image and Signal Processing, vol. 152, no. 6, pp. 868-874, December 2005.
[9] S. Liapis, G. Tziritas, "Color and texture image retrieval using chromaticity histograms and wavelet frames", IEEE Transactions on Multimedia, vol. 6, no. 5, pp. 676-686, 2004.
[10] Remco C. Veltkamp, Mirela Tanase, "Content-Based Image Retrieval Systems: A Survey", Technical Report UU-CS-2000-34, pp. 1-62, October 28, 2002.
[11] http://www.jisc.ac.uk/uploaded_documents/jtap-039.doc
Authors Profile
Pranali P. Lokhande is currently pursuing her M.E. in Computer Science & Engineering at Sipnas College of Engineering & Technology, Amravati. She completed her B.E. in Computer Science & Engineering at Sipnas College of Engineering & Technology, Amravati, under Amravati University, India, in 2003. She has 5 years of teaching experience.
Prof. P. A. Tijare completed his M.E. in Computer Technology & Application from CSVTU, Bhilai, in 2008, his B.E. in Computer Science & Engineering from SRTMU, Nanded, in 2003, and his D.E. in Computer Engineering from MSBTE, Mumbai, in 2000. He has more than 8 years of teaching experience and holds certifications in Oracle and WebSphere Studio V5.0.