
2013 International Conference on Control Communication and Computing (ICCC)

Content Based Image Retrieval System with Fuzzified Texture Similarity Measurement

Sreena P. H.
Department of ECE
Govt. Rajiv Gandhi Institute of Technology
Kottayam, India
sreenapathangalil@gmail.com

David Solomon George
Department of ECE
Govt. Rajiv Gandhi Institute of Technology
Kottayam, India
david@rit.ac.in

Abstract- A content based image retrieval (CBIR) system is a database management system that retrieves images based on the similarity of their content to a query image. In the proposed CBIR system, Tamura texture features are extracted as image content. To measure the similarity of the query image to the images in the database, a fuzzified distance measure, the fuzzy Hamming distance (FHD), is used. The database is sorted in increasing order of the distance measure and made available to the user. The proposed technique is implemented in Matlab and its effectiveness is verified using the standard Brodatz texture database.

Keywords- Content Based Image Retrieval, Fuzzy Hamming Distance, Tamura texture features

I. INTRODUCTION

With increased accessibility to the internet and low cost digital storage devices, a large amount of digital information is available to the common man. Taking the example of images, this large volume of data is used in the fields of education, entertainment, commerce, biomedicine, and crime prevention. The highly evolved and sophisticated human visual system helps human beings select the required image from a large image database within a few minutes or less. But it is hard to teach a machine the interpretation of what we see. Yet tremendous efforts have been made over the past few decades to make the machine understand, index and annotate pictures over a wide range, with much progress. Image retrieval systems are image database management systems that try to understand the image before presenting it to the user.

Different image retrieval techniques are in use [1]. They are categorized by the nature of the query: keyword based, free text based, image content based, or a composite of these. Newer techniques, which use user interaction, help to reduce the semantic gap between the retrieved images and the user query [2]. Early image retrieval techniques based on textual queries became inadequate as a result of advances in internet and digital image storage technologies [3]-[7]. Systems that use the properties inherent to images would be able to handle large image databases. CBIR systems were developed on this idea. A few commercial and experimental prototype CBIR systems such as QBIC [8], Photobook [9], Virage [10], Netra [11], VisualSEEK [12] and SIMPLIcity [13] have been developed.

The two important steps of CBIR are feature extraction and similarity measurement. The commonly used content based features, or low level features, are color, texture and shape. Features can be classified as general features (color, texture and shape) and domain specific features (face, fingerprint, etc.), as they depend on the application. As image perception differs from person to person, there is no single best feature representation. A given feature has multiple representations that characterize it from different perspectives.

Instead of exact matching, content based image retrieval calculates visual similarities between a query image and the images in a database. Accordingly, the retrieval result is not a single image but a list of images ranked by their similarity to the query image. In recent years, many similarity measures have been developed for image retrieval based on empirical estimates of the distribution of features. Different similarity/distance measures affect the retrieval performance of an image retrieval system significantly. Various similarity measures such as the Minkowski-form distance, the quadratic form distance and the Mahalanobis distance are found in the literature [14]. During the 1990s, fuzzy logic based approaches were brought into image retrieval. Fuzzy histogram analysis helped to include ambiguity and vagueness in perceiving color or texture [15]-[18]. A number of similarity measures for these fuzzy histograms were also defined [19]. In 2003 Ralescu proposed a fuzzified version of the existing Hamming distance measure, using color as the feature [20]. It provides a fuzzy distance measure between non-fuzzy real valued vectors. Thus the ambiguity in their similarity is identified without fuzzifying the existing feature vectors. Ionescu, in his work, used the fuzzy Hamming distance (FHD) to find the similarity between color histograms [21].

Fig. 1 gives a general overview of a CBIR system. The first step in developing a CBIR system is feature database creation. The feature database should contain a feature vector corresponding to each image in the database. The feature vector is selected depending on the nature of the database and the application of the system. On presenting a query image to the system, its feature vector is extracted and its similarity/distance to all feature vectors in the database is measured. The images with the largest similarity or shortest distance are made available to the user after sorting.

978-1-4799-0575-1/13/$31.00 ©2013 IEEE
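The general CBIR flow outlined in the Introduction (build a feature database once, then extract the query's feature vector, measure its distance to every stored vector, and sort) can be sketched as follows. This is only an illustrative sketch, not the Matlab implementation used in the paper: the function names `build_feature_database`, `retrieve`, `gray_histogram` and `euclidean` are hypothetical, and a gray-level histogram with Euclidean distance stands in for the Tamura features and fuzzy Hamming distance.

```python
import math


def build_feature_database(images, extract):
    """Offline step: map each image id to its feature vector."""
    return {image_id: extract(pixels) for image_id, pixels in images.items()}


def retrieve(query_pixels, feature_db, extract, distance, top_k=3):
    """Online step: rank database images by increasing distance to the query."""
    q = extract(query_pixels)
    scored = sorted((distance(q, f), image_id) for image_id, f in feature_db.items())
    return [(image_id, d) for d, image_id in scored[:top_k]]


# Stand-in feature (assumption): normalized gray-level histogram of a flat pixel list.
def gray_histogram(pixels, bins=4, levels=256):
    hist = [0] * bins
    for p in pixels:
        hist[min(int(p) * bins // levels, bins - 1)] += 1
    return [h / len(pixels) for h in hist]


# Stand-in distance (assumption): plain Euclidean distance.
def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy "images" given as flat lists of gray levels.
images = {"dark": [0] * 16, "mid": [100] * 16, "bright": [255] * 16}
feature_db = build_feature_database(images, gray_histogram)
ranked = retrieve([10] * 16, feature_db, gray_histogram, euclidean, top_k=2)
print(ranked)
```

Passing the extractor and the distance as arguments keeps the retrieval loop independent of the particular feature and measure, so a texture feature and a fuzzy distance could be substituted without changing `retrieve`.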


Fig. 1. CBIR System Flow Chart

In this paper, a CBIR system is proposed which extracts texture features from a query image and compares them with those of the database images. A statistical texture measurement called Tamura texture is extracted from the images of the standard Brodatz database [22]. A fuzzy modification of the Hamming distance, the fuzzy Hamming distance, is used for comparison of the images. The system sorts the database in increasing order of the comparison result, i.e., the distance measure between the query image and the database images. The images with the lowest distance measure are the most similar images.

Tamura texture, the statistical texture feature used in this paper, is discussed in Section II-A. Section II-B discusses the fuzzy Hamming distance, the fuzzy modification of the Hamming distance. The experimental results of the proposed system on the standard Brodatz texture database are presented in Section III.

II. PROPOSED CBIR SYSTEM

The CBIR system proposed in this paper uses Tamura texture features for retrieval and the fuzzy Hamming distance as the similarity measure. The Brodatz digital album is used as the image database. The steps for implementation of the proposed system are

• Feature extraction
• Similarity measurement
• Aggregation and ranking

A. Feature extraction

As mentioned earlier, Tamura texture features are used as the texture features. They are a set of six texture features derived on the basis of human psychological experiments [23]. Tamura gave a set of images from the Brodatz album to a group of people and asked them to describe the texture. From the results, mathematical expressions for six features were developed that effectively describe all input patterns and are well distributed within these patterns. The features are coarseness, directionality, contrast, line-likeness, regularity and roughness. Many CBIR systems have effectively used Tamura features for texture description [8], [24].

Of the six, three important Tamura features are extracted for all the database images. Modified versions of coarseness, contrast and directionality are used. The other three features, line-likeness, regularity and roughness, are derived from the former three and hence are not used.

Coarseness describes the size of the texture particles, and it is measured by a size operator. Around each pixel, a neighborhood of size $2^k \times 2^k$ is defined ($k = 0, 1, \ldots, 5$). The size $2^k$ whose $k$ maximizes the gradient of pixel values over non-overlapping neighborhoods in both the horizontal and vertical orientations is taken as $S_{best}$. A histogram of $S_{best}$ is made, which is the coarseness feature vector of the image.

The contrast of an image is usually modified by changing its gray level distribution. Tamura et al. also considered other factors such as the dynamic range of gray levels and the polarization of the distribution of black and white on the gray-level histogram. The first is measured using the standard deviation of the gray levels and the second using the kurtosis $\alpha_4$. The contrast measure is therefore defined as the ratio of the standard deviation to the $n$th power of $\alpha_4$.

Directionality is measured as a global property, without considering orientation; i.e., the same figures with different orientations should have the same directionality. A histogram of local edge probabilities against directional angle is used, which is shown to sufficiently represent global features of the input picture such as long lines and simple curves [25].

The features are stored in the feature database. Instead of taking the texture feature vector as an element of a 6-D space (corresponding to the 6 Tamura features), each of the features is taken as a vector. A histogram of the size measures $S_{best}$ is made with respect to the different values of $2^k$ for $k = 0, 1, \ldots, 5$. This forms the vector for coarseness. The directionality is represented by the 16-D direction histogram $H_D$. A direct histogram measure for contrast does not exist, so it is represented as a vector in which each component is the contrast measure of one of the 128 x 128 image sub-blocks. Thus a small amount of spatial information is added to the measurement.

B. Similarity measurement

After feature extraction, query processing takes place. The fuzzy Hamming distance (FHD) is used as the similarity measure. FHD is a fuzzy extension of the Hamming distance to real valued vectors [20]. The Hamming distance is the count of the number of places in which two vectors differ. FHD measures not only the number of places in which they differ, but also the magnitude of the difference.

For computation of FHD, a difference fuzzy set is defined. This is the fuzzy set with membership function equal to the degree of difference, i.e., the degree to which the $i$th components of the vectors differ:

$$D_\alpha(X,Y)(i) = d_\alpha(x_i, y_i) = 1 - e^{-\alpha (x_i - y_i)^2} \qquad (1)$$

Here $i = 1, 2, \ldots, n$, where $n$ is the dimension of the vectors, and the variable parameter $\alpha$ modulates the difference $x_i - y_i$.

FHD is defined as the fuzzy cardinality of the difference fuzzy set,

$$FHD_\alpha(X,Y) = \mathrm{Card}\, D_\alpha(X,Y) = \sum_{i=0}^{n} i \,/\, \mathrm{Card}\, D_\alpha(X,Y)(i) \qquad (2)$$

where

$$\mathrm{Card}\, D_\alpha(X,Y)(i) = \mu_{(i)} \wedge \left(1 - \mu_{(i+1)}\right) \qquad (3)$$

In (2) the sum denotes a fuzzy set, each term pairing a count $i$ with its membership degree. The values $\mu_{(i)}$ are the membership values of $D_\alpha(X,Y)$ sorted in non-increasing order, $1 = \mu_{(0)} \ge \mu_{(1)} \ge \mu_{(2)} \ge \cdots \ge \mu_{(n)} \ge \mu_{(n+1)} = 0$, and $\wedge$ is the min operation.

FHD is the number of places where the two vectors differ, and its membership function $\mathrm{Card}\, D_\alpha(X,Y)(i)$ is the degree to which $X$ and $Y$ differ at exactly $i$ components.

Selection of the variable parameter $\alpha$ needs attention because it controls the sensitivity of FHD to the extent of variation. $\alpha$ is selected such that a bound is applied to the degree of difference $D_\alpha(X,Y)$:

$$D_\alpha(X,Y)(i) \ge 1 - \varepsilon \qquad (4)$$

subject to

$$|x_i - y_i| \ge M \qquad (5)$$

Here $1 - \varepsilon$ is the lower bound on the membership value and $M$ is the lower bound on the magnitude of the difference. The lower bound $M$ is selected as

$$M = \theta \cdot MAX \qquad (6)$$

where $\theta \in [0,1]$ and $MAX$ is the maximum value in the column domain. If $X = [x_1, x_2, \ldots, x_n]$ and $Y = [y_1, y_2, \ldots, y_n]$ are two feature vectors, then $MAX$ is another vector whose elements are the maxima of the corresponding columns, i.e., $MAX = [\max(x_1, y_1), \max(x_2, y_2), \ldots, \max(x_n, y_n)]$.

From (1), (4), (5) and (6),

$$\alpha = \frac{\ln(1/\varepsilon)}{\theta^2\, MAX^2} \qquad (7)$$

A difference between $X$ and $Y$ greater than $\theta \cdot MAX$ will be used for calculating FHD, since it causes a degree of change greater than $1 - \varepsilon$.

The crisp value of the similarity is obtained by defuzzification of FHD. Here the non-fuzzy cardinality of the fuzzy set, i.e., the non-fuzzy approximation of the fuzzy Hamming distance, is taken:

$$nFHD_\alpha(X,Y) = nCARD\, D_\alpha(X,Y) = \left| \overline{\{ x ;\ D_\alpha(X,Y)(x) > 0.5 \}} \right| \qquad (8)$$

$\overline{A}$ is the closure of set $A$ and $|A|$ is the cardinality of $A$. nFHD is the number of points where $D_\alpha(X,Y)$ is greater than 0.5.

The user inputs a query image for which similar images are to be extracted from the database. The FHD between the query and all the database images is measured. FHD measurements are made for the different features of the same image. Thus three distance measurements are obtained: one in coarseness space, one in directionality space, and one in contrast space.

C. Aggregation and Ranking

The similarity measurement corresponding to each image is obtained as a set of similarity scores. The lower the score, the closer the query is to that image with respect to that feature. The aggregation module produces the weighted sum of these similarity scores. Here, a simple scoring technique is used which adds all three similarity scores. The ranking module is simply a sorting module which ranks the database images in increasing order of their aggregated scores. The first 8 images are retrieved as the images similar to the query.

III. RESULTS

The proposed system was applied to 111 images of the Brodatz database in GIF format. The system is implemented in MATLAB. Measurements for directionality and coarseness are made without partitioning the image into sub-images. Local contrast measurements are taken by dividing the images into 128 x 128 pixel blocks. Query images from the same database and certain other image sketches are also given to the system.

Three images with line-like, net-like and globe-like features are selected as query images. These images, D15, D20 and D30 of the Brodatz database, are shown as the top figure in Fig. 3 to Fig. 5.

Visually it is evident that D20 is the most directional and D30 is the least. Of these images, D30 has the highest contrast and coarseness while D20 has the lowest. The directionality feature vector, given as a plot of local edge probability against directional angle, is shown for the three test queries in Fig. 2. The X axis of the histogram is obtained by quantizing the angle range 0 to $\pi$ into 16 directions. In the histogram, a peak occurs in the direction in which most of the gradient vectors point. The most directional image, D20, has a sharp peak, whereas the non-directional image, D30, has a flat histogram.
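The fuzzy Hamming distance of Section II-B can be illustrated with a short sketch. It assumes the standard forms of the degree of difference, $1 - e^{-\alpha (x_i - y_i)^2}$, and of the fuzzy cardinality, $\mu_{(i)} \wedge (1 - \mu_{(i+1)})$; the function names are hypothetical and `alpha` is simply passed in by the caller rather than derived from a bound, so this should not be read as the authors' Matlab code.

```python
import math


def degree_of_difference(x, y, alpha):
    """Degree in [0, 1) to which two real values differ, modulated by alpha."""
    return 1.0 - math.exp(-alpha * (x - y) ** 2)


def difference_fuzzy_set(X, Y, alpha):
    """Membership value of each component i in the difference fuzzy set."""
    return [degree_of_difference(x, y, alpha) for x, y in zip(X, Y)]


def fuzzy_cardinality(memberships):
    """Card(i) = min(mu_(i), 1 - mu_(i+1)) for i = 0..n.

    mu_(1) >= ... >= mu_(n) are the sorted memberships, padded with
    mu_(0) = 1 and mu_(n+1) = 0.
    """
    mu = [1.0] + sorted(memberships, reverse=True) + [0.0]
    return [min(mu[i], 1.0 - mu[i + 1]) for i in range(len(memberships) + 1)]


def nfhd(X, Y, alpha):
    """Crisp approximation: number of components whose degree of difference exceeds 0.5."""
    return sum(1 for d in difference_fuzzy_set(X, Y, alpha) if d > 0.5)


# Toy vectors: equal in the first component, nearly equal in the second,
# clearly different in the third.
X = [0.0, 1.0, 5.0]
Y = [0.0, 1.1, 2.0]
card = fuzzy_cardinality(difference_fuzzy_set(X, Y, alpha=1.0))
print(card, nfhd(X, Y, alpha=1.0))
```

In this toy case the fuzzy cardinality peaks at index 1, meaning the vectors are judged to differ in exactly one component to the highest degree, and the crisp value agrees with it.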

Fig. 2. Directional probability histograms for the three test queries. (a) D15.gif (b) D20.gif (c) D30.gif

Fig. 3 to Fig. 5 show the query image and the images retrieved from the database, arranged in ranked order. The similarity score is given below each retrieved image. For more similar images, smaller scores are obtained, and as the score increases the similarity decreases.

Images and scores in Fig. 3: D15 (0), D106 (11), D24 (15), D53 (15), D35 (16), D54 (17)
Fig. 3. Top 6 retrieved images for query D15 from database.

Three feature vectors corresponding to the three Tamura features, directionality, coarseness and contrast, are extracted from the query image. The FHD measures between the three feature vectors of the query image and the corresponding feature vectors of the Brodatz database images are found. The three separate distance measures obtained are combined to get the final FHD measure. The more similar the query is to an image, the smaller the FHD measure. The FHD measures are aggregated by adding the distances, giving unit weight to each. The entire database is sorted in increasing order of the overall FHD measure. The images in the database with the smallest FHD measure are given to the user.

Images and scores in Fig. 4: D20 (0), D51 (10), D101 (11), D102 (11), D103 (13), D104 (13)
Fig. 4. Top 6 retrieved images for query D20 from database.

Images and scores in Fig. 5: D30 (0), D27 (5), D23 (7), D31 (8), D54 (8), D108 (10)
Fig. 5. Top 6 retrieved images for query D30 from database.
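The aggregation and ranking step, in which the per-feature distances are added with unit weights and the database is sorted in increasing order of the total, could look like the sketch below. The image names and per-feature scores are made-up placeholders rather than values from the experiments, and `aggregate` and `rank` are hypothetical helper names.

```python
def aggregate(scores, weights=None):
    """Weighted sum of per-feature distances; unit weights by default."""
    if weights is None:
        weights = [1.0] * len(scores)
    return sum(w * s for w, s in zip(weights, scores))


def rank(per_feature_scores, top_k=8, weights=None):
    """Sort images by increasing aggregated distance and keep the first top_k."""
    totals = {img: aggregate(s, weights) for img, s in per_feature_scores.items()}
    return sorted(totals.items(), key=lambda item: item[1])[:top_k]


# Placeholder distances per image: (coarseness, directionality, contrast).
scores = {
    "query": (0, 0, 0),
    "img_a": (3, 5, 2),
    "img_b": (4, 4, 3),
    "img_c": (6, 9, 5),
}
ranking = rank(scores, top_k=3)
print(ranking)
```

Supplying a `weights` argument other than the default is one way to experiment with non-uniform aggregation without touching the ranking code.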
Not all the retrieved results shown in Fig. 3 to Fig. 5 are similar to the query. This is due to the simple unit-weight aggregation of the FHD measures. Better results could be obtained by adopting a non-uniformly weighted aggregation technique.

To assess the effectiveness of the system, sample sketches of the textures are made and presented as query images. The results are presented in Fig. 6 to Fig. 8. As observed, most of the retrieved images have the prominent feature of the query image: the lines in Fig. 6, the net in Fig. 7, and the globes in Fig. 8.

Fig. 6. Result with image sketches as query. The topmost figure is the query, with lines as the prominent feature.

Fig. 8. Result with image sketches as query. The topmost figure is the query, with globes as the prominent feature.

Precision measures the retrieval accuracy of a CBIR system. It is the ratio of the number of relevant images retrieved to the total number of images retrieved. The relevant images are those which satisfy the query to the system.

$$\text{Precision} = \frac{\text{No. of relevant images retrieved}}{\text{Total No. of images retrieved}}$$

The precision measures for the three queries in Fig. 3 to Fig. 5 are given in Table I. Precision is given for 5, 9, 15, 20 and 25 retrieved images. The number of relevant images is given in brackets.
TABLE I
PRECISION AT DIFFERENT NO. OF RETRIEVED IMAGES

Total No. of images retrieved    Query 1       Query 2       Query 3
5                                0.8 (4)       1 (5)         0.8 (4)
9                                0.5556 (5)    0.7778 (7)    0.5556 (5)
15                               0.4000 (6)    0.5333 (8)    0.4118 (7)
20                               0.3000 (6)    0.4000 (8)    0.3500 (7)
25                               0.2400 (6)    0.3200 (8)    0.2800 (7)
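As a small illustration of how precision at several retrieval depths can be computed from a ranked list, consider the sketch below. The ranked list and the relevant set are invented for the example, and the helper names `precision` and `precision_at` are hypothetical.

```python
def precision(retrieved, relevant):
    """Number of relevant images retrieved divided by total number retrieved."""
    if not retrieved:
        raise ValueError("at least one image must be retrieved")
    hits = sum(1 for img in retrieved if img in relevant)
    return hits / len(retrieved)


def precision_at(ranked, relevant, cutoffs):
    """Precision over the first k ranked images, for each cutoff k."""
    return {k: precision(ranked[:k], relevant) for k in cutoffs}


# Invented ranked result list and set of relevant images.
ranked_ids = ["a", "b", "x", "c", "y", "z", "d", "w", "v", "u"]
relevant_ids = {"a", "b", "c", "d"}
result = precision_at(ranked_ids, relevant_ids, cutoffs=(5, 10))
print(result)
```

With 5 relevant images among 9 retrieved, the same function gives 5/9, which rounds to the 0.5556 listed for that case in the table.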

Fig. 7. Result with image sketches as query. The topmost figure is the query, with net-like feature.

From the precision analysis it can be seen that precision approaches zero when all the database images are retrieved. That is, the images retrieved later are irrelevant.

Since natural texture images are used, there can be some indecisiveness in some features. For example, one cannot decide about the coarseness or contrast of image D15. The same holds for the coarseness of D20. This can reduce the relevance of the retrieved images. Also, the retrieved results in each of the tests are subjective to the user. One cannot make a final decision on the relevance of the retrieved images in the

context of the query. A result relevant to one user in one particular context may be irrelevant to the same user in another context. The relevance of the retrieved images can be improved using a fuzzified query.

IV. CONCLUSION

A CBIR system that uses a fuzzified measure, the fuzzy Hamming distance, for similarity measurement is proposed in this paper. Tamura texture measures of the image are extracted and used as low level features for matching. FHD identifies the similarity between two images by evaluating the number of points where their feature vectors differ and also the degree to which they differ. The effectiveness of the system is verified using the Brodatz texture database. The image which gave the smallest distance measure was retrieved as the most similar one.

REFERENCES

[1] R. Datta, D. Joshi, J. Li, and J. Z. Wang, "Image Retrieval: Ideas, Influences, and Trends of the New Age," ACM Computing Surveys, vol. 40, no. 2, 2008.
[2] Y. Rui, T. S. Huang, M. Ortega, and S. Mehrotra, "Relevance feedback: a power tool for interactive content-based image retrieval," IEEE Transactions on Circuits and Systems for Video Technology, vol. 8, no. 5, pp. 644-655, 1998.
[3] N. S. Chang and K. S. Fu, "A Relational Database System for Images," Technical Report, Purdue University, May 1979.
[4] N. S. Chang and K. S. Fu, "Query-by-pictorial-example," IEEE Transactions on Software Engineering, vol. 6, 1980.
[5] S.-F. Chang, A. Eleftheriadis, and R. McClintock, "Next-generation content representation, creation and searching for new media applications in education," IEEE Proceedings, vol. 86, no. 5, pp. 884-904, 1998.
[6] S. F. Chang, J. R. Smith, M. Beigi, and A. Benitez, "Visual information retrieval from large distributed online repositories," Communications of the ACM (Special Issue on Visual Information Retrieval), pp. 12-20, Dec. 1997.
[7] S. K. Chang and S. H. Liu, "Picture indexing and abstraction techniques for pictorial databases," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 6, no. 4, pp. 475-483, 1984.
[8] M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang, B. Dom, M. Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steele, and P. Yanker, "Query By Image and Video Content: The QBIC System," IEEE Computer, vol. 28, no. 9, pp. 23-32, September 1995.
[9] A. Pentland, R. Picard, and S. Sclaroff, "Photobook: Content-based Manipulation of Image Databases," International Journal of Computer Vision, vol. 3, pp. 233-254, 1996.
[10] A. Gupta, "Visual Information Retrieval: A Virage Perspective," Technical Report Revision 4, Virage Inc., San Diego, CA 92121, 1996.
[11] W. Y. Ma and B. Manjunath, "Netra: a toolbox for navigating large image databases," Proceedings of the IEEE International Conference on Image Processing, 1997, pp. 568-571.
[12] J. R. Smith and S. F. Chang, "VisualSeek: a fully automatic content based query system," Proceedings of the Fourth ACM International Conference on Multimedia, 1996, pp. 87-98.
[13] J. Z. Wang, J. Li, and G. Wiederhold, "SIMPLIcity: semantics-sensitive integrated matching for picture libraries," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 9, pp. 947-963, 2001.
[14] F. Long, H. J. Zhang, and D. D. Feng, "Fundamentals of content-based image retrieval," Multimedia Information Retrieval and Management, Springer, Berlin, 2003.
[15] J. Han and K. Ma, "Fuzzy Colour Histogram and Its Use in Colour Image Retrieval," IEEE Transactions on Image Processing, vol. 11, no. 8, pp. 944-952, 2002.
[16] M. Ivanovici, N. Richard, and D. Paulus, "Color Image Segmentation," Advanced Color Image Processing and Analysis, 2013, pp. 219-277.
[17] O. Ibáñez, O. Cordon, S. Damas, and J. Santamaria, "Modeling the Skull-Face Overlay Uncertainty Using Fuzzy Sets," IEEE Transactions on Fuzzy Systems, vol. 19, no. 5, pp. 946-959, Oct. 2011.
[18] C. Vertan and N. Boujemaa, "Using Fuzzy Histograms and Distances for Colour Image Retrieval," Challenge of Image Retrieval, Brighton, 2000.
[19] I. Bloch, "On fuzzy distances and their use in image processing under imprecision," Pattern Recognition, vol. 32, no. 11, pp. 1873-1895, Nov. 1999.
[20] A. Ralescu, "Generalization of the Hamming distance using fuzzy sets," Research Report, JSPS Senior Research Fellowship, Laboratory for Mathematical Neuroscience, The Brain Science Institute, RIKEN, Japan, May-June 2003.
[21] M. Ionescu and A. Ralescu, "Fuzzy Hamming distance in a Content based Image Retrieval Systems," Proceedings of FUZZ-IEEE 2004, Budapest, pp. 1721-1726, 2004.
[22] http://www.ux.uis.no/~tranden/brodatz.html
[23] H. Tamura, S. Mori, and T. Yamawaki, "Texture features corresponding to visual perception," IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-8, no. 6, June 1978.
[24] A. Mojsilovic and B. Rogowitz, "Capturing image semantics with low-level descriptors," Proceedings of the ICIP, September 2001, pp. 18-21.
[25] S. Mori, Y. Monden, and T. Mori, "Edge representation in gradient space," Computer Graphics and Image Processing, vol. 2, pp. 321-325, Dec. 1973.

