A PROJECT REPORT
Submitted by
RAJESH.J (Reg.no:04TC09)
BAKTAVATCHALAM.G (Reg.no:03LC02)
of
BACHELOR OF ENGINEERING
THIAGARAJAR COLLEGE OF ENGINEERING
(A Govt. Aided ISO 9001:2000 Certified Autonomous Institution, Affiliated to Anna University)
MADURAI-625015
APRIL 2007
BONAFIDE CERTIFICATE
ACKNOWLEDGEMENT
and their clusters are used to find the nearest images with respect to their color, using color and other features. In many research areas, color is one of the most important features for image retrieval, and researchers have devised many methods of processing image color for similarity measurement. There is also scope for Text-Based Image Retrieval rather than CBIR in some situations, since CBIR works on pixel values alone. We also combine the similar images of the output using clustering.
LIST OF FIGURES
3. Context of GIS
4. Geo-Referencing
6. Set Analysis
7. RGB Histogram
9. Image Retrieval
1. INTRODUCTION
1.1 SPATIAL DATABASE
A spatial database is a large database that contains spatial data. Geographical information about the Earth and its entities is now stored in such a database rather than in a conventional database. In a spatial database, geographical information is stored as images obtained from an image source, such as a satellite that collects daily weather information, together with features extracted from those images for later processing.
An image source produces a vast number of large, high-definition images, and we rarely have enough storage space for all of them. To reduce the storage space, we need to process each image into a more compact form without modifying its content. We therefore extract features from the image and store those in the database instead of the original image, retrieving them when needed.
Implicit geographic references such as an address, postal code, census
tract name, forest stand identifier, or road name.
SPATIAL DATA REPRESENTATION
Objects are entities such as buildings, roads, pipes, properties; they have distinct
boundaries; they are considered discrete entities.
Fields are continuous phenomena such as elevation, temperature and soil
chemistry; they exist everywhere (every point has an elevation or temperature);
they are not discrete entities.
OBJECT TYPES:
o Points, Lines, Polygons…etc.
USED IN/FOR:
o GIS - Geographic Information Systems
o Meteorology
o Astronomy
o Environmental studies, etc.
SPATIAL ATTRIBUTES
o Topological
o adjacency or inclusion information
o Geometric
o position (longitude/latitude), area, perimeter, boundary polygon
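The geometric attributes listed above (area, perimeter, boundary polygon) can be computed directly from a boundary polygon's vertices. The sketch below, with hypothetical coordinates, uses the standard shoelace formula:

```python
# Sketch: geometric attributes of a polygonal spatial object.
# Vertex coordinates are hypothetical; area uses the shoelace formula,
# perimeter sums the Euclidean edge lengths around the boundary.
import math

def polygon_area(vertices):
    """Area of a simple polygon given as a list of (x, y) pairs."""
    n = len(vertices)
    s = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def polygon_perimeter(vertices):
    """Sum of edge lengths around the polygon boundary."""
    n = len(vertices)
    return sum(math.dist(vertices[i], vertices[(i + 1) % n]) for i in range(n))

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(polygon_area(square))       # 16.0
print(polygon_perimeter(square))  # 16.0
```

Topological attributes (adjacency, inclusion) would instead be derived from relationships between such objects, not from a single boundary.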
Figure 1.2 Two types of Spatial Database Models
Vector Model:
In this model we extract features and store it into database using the format of
Points, Lines and Polygons.
Raster Model:
In this model we extract features and store it into database using cells, line of cells
and grid of cells.
o ATTRIBUTE DATABASE
Traditional database structures e.g. network, hierarchical and
relational database management systems
These are linked to spatial database by unique identifiers of entity
e.g. LOC_ID
1.1.4 GEOGRAPHICAL INFORMATION SYSTEMS
Definition:
‘A powerful set of tools for collecting, storing, retrieving at will,
transforming and displaying spatial data from the real world for a
particular set of purposes’, Burrough and McDonnell 1998.
‘…tools that allow for the processing of spatial data into information…
and used to make decisions about, some portion of the earth’ De Mers
1997.
Context of GIS:
Database Creation:
Database creation involves several stages:
input of the spatial data
input of the attribute data
linking spatial and attribute data
For the vector data model, once points are entered and geometric
lines are created, topology must be "built".
This involves calculating and encoding relationships
between the points, lines and areas.
This information may be automatically coded into tables of
information in the database.
Data Input:
Digitizing hard copy maps
Keyboard entry of co-ordinate data
Scanning a map manuscript
Converting or reformatting existing digital data; and
From Satellites
Geo-Referencing:
All data must be input to the same geographical referencing system.
Various algorithms are available to convert from one system to the adopted base geo-reference.
Problems of generalization arise with data of different scales.
GIS Outputs:
o Maps and Tables
o Charts and Animations
o Numbers and Arrays
GIS Based Analysis:
o Attribute data queries
o Spatial data queries
o Set queries
o Network queries
1.1.5 SPATIAL DATA OPERATORS
Spatial operators define the spatial relationships that exist between map
features.
Most spatial operators (overlaps, entirely contains, entirely contained by,
contains, contained by, terminates in, terminus of, passes through, passed
through by, on boundary of, has on boundary, touches, meets) can be
combined to answer complex spatial queries.
Proximity operators (between, within, beyond, entirely between, entirely
within, and entirely beyond) are those that cannot be combined with each
other.
Examples of Spatial Data Queries:
Metrics
How long is the R. Thames including all its tributaries?
How many ha of acid marshy grassland exist in the R. Wolf catchment?
Union: uniquely combines the contents of both operands (query sets).
Intersect: keeps only items present in both operands.
Minus: keeps only items from the first operand that are not in the second.
Difference: keeps only items present in one, but not both, operands.
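The four set queries above behave exactly like Python's built-in set operations. The following sketch uses two hypothetical query sets of feature identifiers:

```python
# Illustrative set queries; the two operand query sets are hypothetical
# feature-ID results from attribute or spatial queries.
a = {"wood1", "wood2", "marsh1", "marsh2"}   # first query set
b = {"marsh1", "marsh2", "field1"}           # second query set

union = a | b      # items in either operand
intersect = a & b  # items in both operands
minus = a - b      # items in the first operand but not the second
diff = a ^ b       # items in exactly one operand (symmetric difference)

print(sorted(intersect))  # ['marsh1', 'marsh2']
```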
1.2.1 COLOR
One of the primary components of image analysis for the purpose of content-based image retrieval is color analysis. Color that is visible to the human eye represents a small range of the entire electromagnetic spectrum, which spans everything from cosmic rays to X-rays to radio waves.
As noted above, light visible to the human eye ranges in wavelength from about 4000 angstroms (violet) to 7000 angstroms (red), with all the other colors in between. Waves outside this range, from cosmic rays from the stars to the FM waves of our radios, cannot be perceived by the human eye. It is this small range of the spectrum that is referred to as the human-perceived color space.
o The image is a two-dimensional grid of square tiles called pixels.
o Each pixel has a uniform color. All colors that pixels may have form a
color space.
COLOR SPACE
Models of human perception of color differences are described in the form of color spaces. The two primary color spaces are the CIE and HSV models; hence these are the typical color spaces used in content-based image retrieval systems.
o RGB:
Each color is a sum of a red, green, and blue component.
The intensity of each component is given by a number in [0, 1].
The color is a triple (r, g, b) from the unit cube.
o HSV:
The HSV model represents color in its distinct components of hue,
saturation, and value. To understand this model, we will first explore its
components. The primary colors are identified as the primary set of colors that
when combined together can create all of the other colors within the visible
human spectrum. Similar to that of a computer monitor, the primary colors
are that of red, green, and blue. Equal mixing of these colors produce what is
known as the secondary colors of cyan, magenta, and yellow.
If we represent the primary and secondary colors within a color wheel, the secondary colors complement the primary colors: red and blue mixed evenly produce magenta, blue and green create cyan, and red and green create yellow. This process of inter-mixing colors produces tertiary, quaternary and further colors, eventually producing a solid ring of colors. This definition of color based on the combination of primary colors is known as hue.
In the HSV color wheel, saturation refers to the dominance of a particular hue within a color. A less saturated color is closer to white, while a more saturated color is closer to the pure color found at the outer edge of the wheel. Meanwhile, the value of a color refers to its intensity (the lightness or darkness of the color). While the two components appear similar, they have different effects on the visibility of a color.
A color with minimal value is close to black regardless of its saturation, while a minimally saturated color with high value is close to white. A fully saturated, fully valued color is the pure hue itself.
Hence, the HSV model utilizes its components of hue, saturation, and
value to quantify a color. This model’s more straight-forward ability to
quantify color is the reason why many CBIRS utilize this method for color
analysis.
hue, saturation (0 = gray, 1 = most vivid), value (or brightness: 0
= black, 1 = bright).
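The HSV components described above can be computed from RGB using the standard-library colorsys module; the colors used here are illustrative:

```python
# Sketch: converting between the RGB and HSV spaces described above,
# using the standard-library colorsys module (all components in [0, 1]).
import colorsys

# Pure red: hue 0, fully saturated, full value.
h, s, v = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)
print(h, s, v)  # 0.0 1.0 1.0

# Mid-gray: saturation 0 (no dominant hue), value 0.5.
h, s, v = colorsys.rgb_to_hsv(0.5, 0.5, 0.5)
print(h, s, v)  # 0.0 0.0 0.5
```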
o CIE :
The CIE color model was developed by the Commission Internationale de l'Éclairage (CIE) in the first half of the 20th century. The model is based on the tristimulus theory of color perception, in which the three different types of color receptors in our eyes, known as cones, respond differently to different wavelengths of light.
The CIE color model represents the wavelengths of human visible light (400 nm, or 4000 angstroms, for violet to 700 nm, or 7000 angstroms, for red). The color white results when all three cones are stimulated evenly.
While the CIE model is a very precise way to measure color, it is not a practical or easy method for examining color. Because of this, many current CBIRS utilize the HSV color space for image analysis.
PERCEPTUAL UNIFORMITY:
Colors are represented by points in the 3-D color space; the perceived difference or similarity between colors should correspond to the distance between the points.
The task is to find a numeric representation of color for the thousands of pixels that make up the image.
Divide the color space into some number (e.g. N) of disjoint regions.
Represent each color by the index of the region it belongs to,
o A simple way of recognizing similar colors as similar (i.e. by
pretending they are the same)
• But colors from two different (adjacent) regions can
still be fairly similar, which we would tend to ignore
o As if the image had been painted using a palette of N colors
Example: divide each coordinate axis of the RGB cube into 6 ranges →
6 × 6 × 6 = 216 palette entries.
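A minimal sketch of this 6 × 6 × 6 quantization, assuming RGB components in [0, 1] (the function name is our own):

```python
# Sketch of the 6x6x6 palette quantization: each RGB coordinate is divided
# into 6 ranges, and a color is represented by the index of the cube cell
# it falls in (0..215).
def palette_index(r, g, b, n=6):
    """Map an (r, g, b) triple in the unit cube to one of n**3 region indices."""
    def bin_of(x):
        return min(int(x * n), n - 1)  # clamp 1.0 into the last range
    return (bin_of(r) * n + bin_of(g)) * n + bin_of(b)

print(palette_index(0.0, 0.0, 0.0))  # 0   (darkest cell)
print(palette_index(1.0, 1.0, 1.0))  # 215 (brightest cell)
```

Colors in the same cell are treated as identical, while colors in adjacent cells are treated as different, which is exactly the limitation noted in the bullet above.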
HISTOGRAMS
The color histogram represents an image by breaking down the various color
components of an image and graphing the occurrences and intensity of each color.
To compare two images, one then needs only to compare the color histograms of the
two images and determine the similarity of the two histograms.
o Color histograms are a good representation of the colors present in an image.
o The image colors are therefore usually quantized, meaning that the number of
colors is reduced, often to 64 or 256 colors.
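As a sketch of this quantized-histogram idea, assuming pixels have already been mapped to palette indices (the helper and sample data below are hypothetical):

```python
# Sketch: build a color histogram over a quantized palette by counting how
# often each palette index occurs among an image's pixels, normalized so
# the bins sum to 1. The "image" is a hypothetical flat list of indices
# into a 64-color palette.
def color_histogram(pixels, n_colors=64):
    hist = [0] * n_colors
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    return [h / total for h in hist]

image = [0, 0, 1, 3, 3, 3, 63, 63]  # 8 pixels
h = color_histogram(image)
print(h[3])  # 0.375 -- palette color 3 covers 3 of the 8 pixels
```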
CALCULATION
Color Correlogram:
Another approach to color comparison is to utilize the color correlogram. As noted in Huang, most commercial CBIRS utilize color histograms for image color comparison, but this method does not take spatial information into account, i.e. the space or distance between one color and another. There are various approaches that attempt to integrate spatial information into color histograms, but color correlograms natively resolve this issue.
Correlograms, which resemble scatter plots, create a visual representation of the image's color structure. While the histogram records the number of colors and their intensities, a correlogram also records spatial information indicating the distance between the different colors. Therefore, when comparing two images, not only the color components are compared but also the distances between them.
o Manhattan: D_M = |h1 - g1| + |h2 - g2| + . . . + |hN - gN|.
Remarkably useful given their simplicity.
Only capture information about the presence of a color, but ignore its spatial
distribution.
The color histograms are then similar to the grayscale histograms.
Except that each bin represents a color and not a grey level.
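The Manhattan distance above translates directly into code; the two histograms here are hypothetical three-bin examples:

```python
# Sketch of the Manhattan (L1) distance between two color histograms,
# following the formula above; h and g are equal-length lists of bin values.
def manhattan(h, g):
    return sum(abs(hi - gi) for hi, gi in zip(h, g))

h = [0.5, 0.3, 0.2]
g = [0.4, 0.4, 0.2]
print(manhattan(h, g))  # ~0.2 (smaller means more similar histograms)
```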
o Divide the image into a grid of small windows (e.g. 4× 4 pixels)
o Describe each window with a vector (e.g. average color, etc.)
If two vectors lie close together, their corresponding windows
are probably similar
o Use clustering to form groups of adjacent vectors (hopefully
representing similar windows)
o Form a region from the windows of each cluster. Use the centroid
of the cluster to describe the region.
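The windowing step above can be sketched as follows, with a hypothetical grayscale image and the average value as the window descriptor:

```python
# Sketch: divide a (hypothetical) grayscale image into 2x2 windows and
# describe each window by its average value. Windows whose descriptors lie
# close together are candidates for grouping into the same region.
def window_vectors(image, w=2):
    rows, cols = len(image), len(image[0])
    vecs = []
    for i in range(0, rows, w):
        for j in range(0, cols, w):
            block = [image[i + di][j + dj] for di in range(w) for dj in range(w)]
            vecs.append(sum(block) / len(block))
    return vecs

image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
print(window_vectors(image))  # [10.0, 200.0, 10.0, 200.0]
```

A clustering step would then group the two dark-window descriptors (10.0) into one region and the two bright ones (200.0) into another, using each cluster centroid as the region descriptor.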
1.2.2 TEXTURE
Gabor Filters:
Similar to a Fourier transform, Gabor functions, when applied to images, convert image texture components into a frequency-domain representation. There are many widely used approaches to applying Gabor filters for texture image characterization.
Careful manipulation of these Gabor filters allows one to quantify the coarseness or smoothness of an image; one filter response can indicate a coarser texture than another. Note that the comparison of images is performed against the mathematical representation of these responses, hence the CBIRS' ability to compare the textures of two different images.
Wold Features:
Similar to the Gabor filters above, the purpose of using Wold features is to utilize a mathematical function and coefficients to represent the texture of an image. The Wold decomposition algorithm fits the context of human textural perception in that it breaks image texture down into the components of periodicity, directionality, and randomness. As noted in Liu and Picard, these three components correspond to the dimensions of human textural perception determined by a psychological study.
1.2.3 SHAPE
Used in many CBIRS, shape features are usually described after the images have already been segmented or broken out [31]. While a good shape representation of an image should handle changes in translation, rotation, and/or scaling, this is rather difficult to achieve. The primary difficulty is that images involve numerous geometric shapes which, when numerically characterized, typically lose information. A methodology that identifies information at too detailed a level (down to the individual colors and shapes of a Degas painting, for example) will only be able to identify the color palette.
Conversely, a methodology that characterizes image shape at too global a
level will only be able to quantify the entire image vs. identifying individual
components within the image. A global approach to shape analysis and
identification would require any similar images to be similar in all of its
components.
textual information is sufficient, though not necessary, for image comparison. These descriptions are also heavily used for indexing and for fast access to images in the database.
When we compare images using text, searching is very fast, occupies less space than the other methods, and requires no computation. But consider a large database that contains high-definition images: adding descriptions for all of these images rapidly becomes complex and time-consuming, so we use CBIR rather than Text-Based Image Retrieval.
2. PROBLEM DEFINITION
The following sections briefly explain the problem definition of our project and outline the general approaches used to solve it; our approach builds on these general approaches.
Image Descriptions → Text Patterns
The image descriptions are given by users and analysts interested in the domain. The descriptions are then converted into text patterns used for faster comparison; the patterns contain the individual keywords extracted from the descriptions. The text patterns are then formatted and indexed for faster accessing, and all the formatted and indexed patterns are stored in the database with the image.
Image Retrieval:
The input text query is given to the pattern matching system, which does the following:
Extract patterns from the input text query and format those patterns.
Get the patterns of an image from the database and compare those formatted patterns with the input patterns.
If the patterns match according to the threshold value, add the corresponding image to the result set.
Repeat the above until all the image patterns in the database have been compared.
The result set now holds the matched patterns and the corresponding threshold values. We divide the result set into a number of clusters using threshold ranges, and finally display the output images according to the clusters.
Clustering is done according to the threshold value. The user selects the threshold value and the ranges of thresholds used for clustering; each cluster contains the images that fall within its defined threshold range, and the number of ranges is also indicated by the user.
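The retrieval-and-clustering procedure above can be sketched as follows. The image descriptions, query, threshold and ranges are all hypothetical, and the match score (fraction of query keywords present in an image's pattern set) is our own illustrative choice:

```python
# Sketch of the pattern-matching retrieval loop with threshold-range
# clustering. Database contents, query, threshold and ranges are all
# hypothetical.
def match_score(query_words, image_words):
    """Fraction of query keywords found among the image's keywords."""
    return len(query_words & image_words) / len(query_words)

database = {
    "img1": {"sunset", "beach", "sea"},
    "img2": {"beach", "palm"},
    "img3": {"mountain", "snow"},
}

query = {"sunset", "beach"}
threshold = 0.5
ranges = [(0.5, 0.75), (0.75, 1.01)]  # user-chosen cluster ranges

# Score every image, then group the matches by threshold range.
results = {name: match_score(query, words) for name, words in database.items()}
clusters = [
    [name for name, s in results.items() if lo <= s < hi and s >= threshold]
    for lo, hi in ranges
]
print(clusters)  # [['img2'], ['img1']]
```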
Output Clustering → Image Display
Advantages:
Faster Execution
Storage Space is less
Accuracy is high
Retrieved Images are Very Similar
Disadvantages:
Adding Image Descriptions
o Highly Complex
o Long Time
o Storage space is large when describing large images
Figure 1.1 The Block Diagram for the General Approach to Content Based Image
Retrieval with Clustering and Feedback.
And domain concepts.
The developments in this field have been put forward in three levels.
Level one is the fundamental level, where low-level features like color, texture, shape and spatial location are applied to segment images in the image database, and similarity with the input image is then found based on these segmentations. Quite a bit of research work was done during the last decade. As mentioned earlier, many software packages have been developed for efficient image retrieval. Most of them use a combination of text-based and content-based retrieval: images are segmented manually beforehand, text is generated based on these manual segmentations, and retrievals are carried out accordingly.
But since the volume of images generated can be enormous in fields like satellite imaging, this kind of manual processing is time-consuming and expensive. A few automatic retrieval systems requiring no human intervention have been developed, like QBIC, Excalibur and Virage, which are now in commercial use, in addition to packages not yet released commercially. But they have limited applications, in areas like trademark registration, identification of drawings in a design archive, or color matching of fashion accessories based on an input image. No universally accepted retrieval technique has yet been developed, and the retrieval techniques developed without human intervention are far from perfect. In some packages, segmentation has been done based on color, where the segmented parts taken individually do not contribute to any meaningful identification; they generate only a vague similarity between input objects and objects in the image database. This level still needs further development to produce globally acceptable packages.
Level two concerns bringing out the semantic meaning of an image in the database. One of the best-known works in this field is that of Forsyth, who successfully identified human beings within images; the technique has been applied to other objects like horses and trees. For example, a beach can be identified if the search is based on color and texture matching and the selected color is wide blue with a yellow texture below. Attrasoft Image Finder offers an image retrieval technique where input images are stored in various files and images are kept in directory files.
There is an interface screen where users can provide the file name containing the input image and can also set various parameters like focus, background, symmetry, rotation type and reduction type. The images from the directory are then selected based on these inputs. The images in the directory are defined as containing the sample segments, or translated, rotated, scaled, rotated-and-scaled, brighter or darker segments.
This method goes some way toward bringing out semantic meaning in an image, in the sense that the user can specify an input image semantically; the corresponding input image is retrieved, and based on that input image, the image database is searched for similarity. A few other similar automatic image retrieval models are available, including the Computer Vision Online Demos. But this level also needs much more development to achieve universally accepted techniques for bringing semantic meaning out of an image.
Level three concerns retrieval with abstract attributes. This level of retrieval can be divided into two groups. One is a particular event, like ‘find pictures of a particular birthday celebration’. The second could be ‘find pictures of double-decker buses’. Interpreting an image after segmentation and analyzing it efficiently requires very complex logic, and would also require the level-two retrieval techniques to get the semantic meanings of various objects. It is obvious that this retrieval technique is far from being achievable with the modern technology available in the field.
A generic system is defined as one where the processing steps remain more or less the same (or standardized) for different choices of image content. An automatic system requires no manual intervention, while a semi-automatic system may require limited manual intervention. Thus, approaches to CBIR can be semi-automatic and non-generic, semi-automatic and generic, automatic and non-generic, or automatic and generic. The selection of an approach to CBIR is influenced by the image features to be extracted, the level of abstraction to be revealed in the features, and the extent of desired domain independence.
The use of low-level (measurable, automatically extractable) features, for example color, texture or shape, in image retrieval makes the approach automatic but not necessarily efficient. CBIR systems that make use of only high-level (composite) or semantic features are domain dependent. Inter-image distances computed using the low-level features are called “real” inter-image distances.
Inter-image distances computed using the high-level features are called “estimated” inter-image distances. There is an urgent need for an efficient and effective image retrieval technique that not only reduces the complexity of computing inter-image distances using low-level features, but also helps make the system generic and automatic. The goal is to develop an efficient, generic and fully automated CBIR system.
The image retrieval problem can be defined as follows. Let there be an image database populated with images O0, O1, O2, ..., On. Let Q denote a query image and let P denote the real inter-image distance function; the real inter-image distance between two image objects Oi and Oj is denoted by P(Oi, Oj). The goal is to efficiently and effectively retrieve the best q (q << n) images from the image database.
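This retrieval goal can be sketched directly from the definition; the feature vectors and the L1 distance used below are illustrative choices, not the system's actual features:

```python
# Sketch of "retrieve the best q of n images": rank database images by
# their distance to the query and keep the q closest. Feature vectors and
# the distance function are hypothetical.
import heapq

def l1(a, b):
    """Illustrative inter-image distance: L1 distance between feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def retrieve(query, database, distance, q):
    """Return the q database items with the smallest distance to the query."""
    return heapq.nsmallest(q, database, key=lambda o: distance(query, o))

database = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9), (0.85, 0.15)]
print(retrieve((1.0, 0.0), database, l1, q=2))  # [(0.9, 0.1), (0.85, 0.15)]
```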
As we pointed out, plenty of research work has been done on image retrieval based on the contents of the image. Attempts have been made to retrieve similar shapes when shapes are measured by coordinate systems. Content-based image retrieval is emerging as an important research area with applications to digital libraries and multimedia databases. The focus is on the image processing aspects, in particular using texture information for browsing and retrieval of large image data; the use of Gabor wavelet features for texture analysis has been proposed, along with a comprehensive experimental evaluation.
Comparisons with other multi-resolution texture features using the Brodatz texture database indicate that the Gabor features provide the best pattern retrieval accuracy; an application to browsing large air photos is illustrated. The IMEDIA project, which is related to image analysis, addresses the bottleneck of multimedia indexing: image analysis for feature spaces and probabilistic modeling, statistics and information theory for interactive browsing, and similarity measure and matching. To achieve these goals, research involves the following topics: image indexing, partial queries, interactive search, and multimedia indexing.
In a project named Efficient Content-Based Image Retrieval, the focus is the
development of a general, scalable architecture to support fast querying of very large
image databases with user-specified distance measures. They have developed algorithms
and data structures for efficient image retrieval from large databases with multiple
distance measures. They are investigating methods for merging their general distance
measure independent method with other useful techniques that may be distance measure
specific, such as keyword retrieval and relational indexing.
They are developing both new methods for combining distance measures and a
framework in which users can specify their queries without detailed knowledge of the
underlying metrics. They have built a prototype system to test their methods and
evaluated it on both a large general image database and a smaller controlled database.
A visual-based image retrieval method conforming to the MPEG-7 still image description scheme has been presented. A segmentation method based on multivariate minimum cross entropy is used hierarchically for partitioning the color image into classes and regions. Local and global descriptors are defined in order to characterize the color feature of these regions. The retrieved images are presented in a description space which allows the user to better understand and interact with the results.
A histogram generation technique using the HSV (hue, saturation and value) color space has been proposed for image retrieval. The histogram retains a perceptually smooth color transition, which makes it possible to do a window-based comparison of feature vectors for efficient image retrieval from very large databases.
For ordering image feature vectors, a vector cosine distance measure is used. In an attempt to overcome the drawback of histogram techniques of color image retrieval, which consider only global properties and hence cannot effectively define an image, a scheme to capture local properties has been developed for more accurate retrieval. The original image is segmented into several sub-image blocks, and color histograms for every sub-image block are generated; all these color histograms are then combined into a multidimensional vector used to search the database for similar images. For face detection in color images, a method has been used for integrating the well-known color models using fuzzy-set-based concepts. The shape analysis is performed using RAMHD, an enhancement of the conventional Hausdorff distance, and an algorithm for updating the elliptical model has been developed. A video-based face recognition system using support vector machines is then presented.
The authors used stereovision to coarsely segment the face area from its background, and a multiple-related template matching method is then used to locate and track the face area in the video, generating face samples of that particular person. The face recognition algorithms are based on Support Vector Machines, using both “1 vs. many” and “1 vs. 1” strategies. A methodology for finding multiple persons in an image has also been developed by finding face-like regions through skin, motion and silhouette features. Attempts have been made to eliminate false faces based on face geometry and the Support Vector Machine (SVM) by developing an algorithm. To remove the effect of lighting changes, a method of color constancy compensation is applied, and to track multiple persons a face-status table is used. The authors claim the method is much more robust and powerful than other traditional methods. An object-based image retrieval procedure has been presented which allows the user to specify and to search for certain regions of interest in images.
The marked regions are represented by wavelet coefficients and searched for in all image sections at runtime; all other image elements are ignored, and a detailed search can be performed. A system for image indexing and retrieval using speech annotations based on a pre-defined structured syntax is presented, in which N-best lists are introduced for index generation and a query expansion technique is explored to enhance the query terms and improve effectiveness. Through the addition of the most probable substitutions for the query terms, more relevant images are distinguished from the data collection. A new conception of the emergence index has been presented, whereby the index for retrieving images from the database is decided by considering the hidden or implicit meanings of an image in addition to its explicit meaning.
Relevance Feedback
In order to help users retrieve the images they seek, relevance feedback techniques have been developed. These allow users to make further selections from the initial set of images presented for a query; users can keep refining the search from the results of the previous search until they get the desired images, or the closest to what they desire.
Issues regarding relevance feedback have been presented where linear and kernel-based biased discriminant analysis, BiasMap, is proposed to fit the unique nature of relevance feedback as a small-sample biased classification problem. A word association via relevance feedback (WARF) formula has also been presented and tested for bridging the gap between low-level visual features and high-level semantic annotations during the process of relevance feedback.
Feature Extraction
Most systems perform feature extraction as a preprocessing step, obtaining global image features like color histograms or local descriptors like shape and texture. A region-based dominant color descriptor, indexed in 3-D space along with the colors' percentage coverage within the regions, has been proposed and shown to be more computationally efficient in similarity-based retrieval than traditional color histograms. The authors argue that this compact representation is more efficient than high-dimensional histograms in terms of search and retrieval, and that it also avoids some of the drawbacks of earlier propositions such as dimension reduction and color moment descriptors.
A multi-resolution histogram capturing spatial image information has been shown to be effective in retrieving textured images while retaining the typical advantages of histograms. Gaussian mixture vector quantization (GMVQ) has been used to extract color histograms and shown to yield better retrieval than uniform quantization and vector quantization with squared error. A set of color and texture descriptors has been rigorously tested for inclusion in the MPEG-7 standard and is well suited to natural images and video; these include histogram-based descriptors, dominant color descriptors, spatial color descriptors and texture descriptors suited for browsing and retrieval.
Texture features have been modeled on the marginal distribution of wavelet coefficients using generalized Gaussian distributions. Shape is a key attribute of segmented image regions, and its efficient and robust representation plays an important role in retrieval. A shape similarity measure using discrete curve evolution to simplify contours has been discussed; this contour simplification helps remove noisy and irrelevant shape features from consideration. A new shape descriptor for shape matching, referred to as shape context, has been proposed, which is fairly compact yet robust to a number of geometric transformations, and a dynamic programming (DP) approach to shape matching has been proposed.
One problem with this approach is that computation of Fourier descriptors and
moments is slow, although pre-computation may help produce real-time results.
Continuing with Fourier descriptors, exploiting both the amplitude and the phase and using the Dynamic Time Warping (DTW) distance instead of the Euclidean distance has been shown to be an accurate shape matching technique. The rotational and starting point
invariance otherwise obtained by discarding the phase information is maintained here by
adding compensation terms to the original phase, thus allowing its exploitation for better
discrimination. For characterizing shape within images, reliable segmentation is critical,
without which the shape estimates are largely meaningless.
Even though the general problem of segmentation in the context of human perception is far from solved, there have been some interesting new directions, one of the most important being segmentation based on the Normalized Cuts criterion. This
approach, based primarily on the theory of spectral clustering, has been extended to
texture image segmentation by using cues of contour and texture differences, and to
incorporate partial grouping priors into the segmentation process by solving a constrained
optimization problem.
The latter has potential for incorporating real-world application specific priors,
e.g. location and size cues of organs in pathological images. In medical imaging, 3D brain magnetic resonance (MR) images have been segmented using Hidden Markov Random Fields and the Expectation-Maximization (EM) algorithm, and the spectral clustering approach has found some success in segmenting vertebral bodies from sagittal MR images. Among other recent approaches are segmentation based on the mean shift procedure, multi-resolution segmentation of low depth-of-field images, a Bayesian-framework segmentation involving the Markov chain Monte Carlo technique, and an EM-algorithm-based segmentation using a Gaussian mixture model, forming blobs suitable for image querying and retrieval.
A sequential segmentation approach that starts with texture features and refines the segmentation using color features has also been explored. While there is no denying that achieving good segmentation is a big step forward in image understanding, issues plaguing current techniques include speed, the reliability of good segmentation, and the lack of a robust and accepted benchmark for assessing it. In image retrieval, some ways of getting around this problem have been to reduce dependence on reliable segmentation, to involve every generated segment of an image in the matching process to obtain soft similarity measures, or to characterize the spatial arrangement of color and texture using block-based multi-resolution hidden Markov models, a technique that has been extended to segment 3D volume images as well.
Another alternative has been to use principles of perceptual grouping to
hierarchically extract image structure. Features based on local invariants such as corner
points or interest points that have traditionally been used for stereo matching are being
used extensively in image retrieval. Scale and affine invariant interest points that can deal
with significant affine transformations and illumination changes have been shown as
effective features for image retrieval. Along similar lines, wavelet-based salient points have
been used for retrieval.
The significance of such special points lies in their compact representation of important image regions, leading to efficient indexing and good discriminative power, especially in object-based retrieval. A discussion of the pros and cons of different types of color interest points used in image retrieval can be found in the literature, as can comparative performance evaluations of the various proposed interest point detectors. The selection of appropriate features for content-based image retrieval and annotation systems remains largely ad hoc, with some exceptions. One heuristic in the selection process is to have application-specific feature sets. Although semantics-sensitive feature selection has been shown to be effective in image retrieval, the need for a uniform feature space for efficient search and indexing limits heterogeneous feature set size to some extent. When a large number of image
features are available, one way to improve generalization and efficiency in classification
and indexing is to work with a feature subset. To avoid a combinatorial search, an
automatic feature subset selection algorithm for SVMs has been proposed. Some of the other recent, more generic feature selection propositions involve boosting, evolutionary search, Bayes classification error, and feature dependency/similarity measures. A survey and performance comparison of some recent algorithms on the topic is available in the literature.
Approaches to Retrieval
Once a decision on the visual feature set choice has been made, how to steer them
towards accurate image retrieval is the next concern. There have been a large number of
fundamentally different frameworks proposed in the last few years. Leaving out those already discussed, we briefly describe some of the more recent approaches. A semantics-sensitive approach to content-based image retrieval has been proposed, in which a semantic categorization (e.g., graph vs. photograph, textured vs. non-textured) enables appropriate feature extraction, followed by a region-based overall similarity measure that allows robust image matching.
An important aspect of this system is its retrieval speed. The matching measure, termed integrated region matching (IRM), has been constructed for faster retrieval using region feature clustering and the most similar highest priority (MSHP) principle. Region
based image retrieval has also been extended to incorporate spatial similarity using the
Hausdorff distance on finite sized point sets, and to employ fuzziness to characterize
segmented regions for the purpose of feature matching. A framework for region-based
image retrieval using region codebooks and learned region weights has been proposed.
A new representation for object retrieval in cluttered images without relying on
accurate segmentation has been proposed. Another perspective in image retrieval has
been region-based querying using homogeneous color texture segments called blobs,
instead of image to image matching. For example, if one or more segmented blobs are
identified by the user as roughly corresponding to the concept “tiger”, then her search can
consist of looking for a tiger within other images, possibly with varying backgrounds.
While this can lead to a semantically more precise representation of the user’s
query objects in general, it also requires greater involvement from and dependence on
her. For finding images containing scaled or translated versions of query objects, retrieval
can also be performed without the user’s explicit region labeling. Instead of using image
segmentation, one approach to retrieval has been the use of hierarchical perceptual
grouping of primitive image features and their inter-relationships to characterize
structure.
Another proposition has been the use of vector quantization (VQ) on image
blocks to generate codebooks for representation and retrieval, taking inspiration from data
compression and text-based strategies. A windowed search over location and scale has been shown to be more effective in object-based image retrieval than methods based on inaccurate segmentation.
A hybrid approach involves the use of rectangular blocks for coarse
foreground/background segmentation on the user’s query region-of-interest (ROI),
followed by the database search using only the foreground regions. For textured images,
segmentation is not critical. A method for texture retrieval by a joint modeling of feature
extraction and similarity measurement using the Kullback-Leibler distance for statistical
model comparison has been proposed.
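As an illustration of the distance involved (a sketch only, not the cited joint-modeling method), the Kullback-Leibler divergence between two normalized feature histograms can be computed as follows; the small epsilon is an assumption added to avoid division by zero on empty bins.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two normalized histograms, in nats."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

h1 = [0.5, 0.3, 0.2]   # toy texture-feature histograms
h2 = [0.4, 0.4, 0.2]

d_self = kl_divergence(h1, h1)   # zero: a histogram matches itself exactly
d_other = kl_divergence(h1, h2)  # positive: the histograms differ
```

In such a retrieval scheme, database images would be ranked by increasing divergence from the query's histogram.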
Another wavelet-based retrieval method involving salient points has been
proposed. Fractal block code based image histograms have been shown effective in
retrieval on textured image databases. The use of the MPEG-7 content descriptors to train
self-organizing maps (SOM) for the purpose of image retrieval has been explored.
Among other new approaches, an anchoring-based image retrieval system has been proposed. Anchoring is based on the fairly intuitive idea of finding a set of representative
“anchor” images and deciding semantic proximity between an arbitrary image pair in
terms of their similarity to these anchors. Despite the reduced computational complexity,
the relative image distance function is not guaranteed to be a metric.
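The anchoring idea can be sketched as follows; the similarity function, the anchor features, and all names here are illustrative assumptions, not the proposed system.

```python
def anchor_signature(features, anchors):
    """Describe an image by its similarity to each anchor image."""
    def sim(a, b):  # toy similarity: inverse of the L1 distance
        return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(a, b)))
    return [sim(features, a) for a in anchors]

def dist(u, v):
    """Compare two images via their anchor signatures (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(u, v))

anchors = [[0.0, 0.0], [1.0, 1.0]]        # two "anchor" images (toy features)
img_a = anchor_signature([0.1, 0.0], anchors)
img_b = anchor_signature([0.0, 0.1], anchors)  # visually close to img_a
img_c = anchor_signature([0.9, 1.0], anchors)  # close to the second anchor
```

Images near each other in feature space get similar signatures, so `dist(img_a, img_b)` comes out smaller than `dist(img_a, img_c)` — semantic proximity is decided entirely through the anchors, which is why the resulting distance need not be a metric.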
For similar reasons, a number of approaches have relied on the assumption that the image feature space is a manifold embedded in a Euclidean space. Clustering has been
applied to image retrieval to help improve interface design, visualization, and result pre-
processing. A statistical approach involving the Wald-Wolfowitz test for comparing non-
parametric multivariate distributions has been used for color image retrieval, representing
images as sets of vectors in RGB space. Multiple-instance learning has also been introduced to the CBIR community.
A number of probabilistic frameworks for image retrieval have been proposed in
the last few years. The idea is to integrate feature selection, feature representation, and similarity measurement into a combined Bayesian formulation, with the objective of
minimizing the probability of retrieval error. One problem with this approach is the
computational complexity involved in estimating probabilistic similarity measures. Using
VQ to approximately model the probability distribution of the image features, the
complexity is reduced, making the measures more practical for real-world systems.
user searching for an image of a large sea mammal could easily be returned with pictures
of an aircraft.
The images returned would be relevant in terms of their low-level visual content,
but in terms of the semantic closeness of the retrieved image to the user’s query there is a
significant gap. As a consequence the user must either serially browse large sets of
seemingly irrelevant images or engage in numerous query refinements.
Despite these challenges there is some data to suggest that users may be able to
search on the basis of visual features, although the few studies that have been carried out
report mixed findings. Jose et al. examined users’ ability to retrieve images with a spatial
querying tool which allowed the submission of queries by drawing and labeling rectangles. They compared the performance of this tool with a second system that used text-only queries. Participants reported that they were able to formulate a mental image of a
picture that would satisfy their search need, and that the queries they submitted were an
accurate representation of that picture. Measures of user satisfaction indicated that
subjects generally preferred the spatial query tool and that they felt it improved their
performance.
However, the tool was not evaluated in the context of content-based searching alone; users were able to annotate their visual queries with text descriptions. This may well have made the users' task slightly more familiar and perhaps less difficult than the use of visual features alone. Venters et al. undertook a requirements analysis for the design of a visual search tool for the retrieval of trademark
images.
The analysis revealed that users believed their work would benefit from three different types of visual search tool: a sketch tool, a shape-building tool, and a browsing tool. They then evaluated the usefulness of these tools compared to an
interface that allowed users to simply browse a hierarchically organized collection of
images classified according to the Vienna System. They found that while the interface
tools were rated very positively in terms of their usability, participants reported that they
believed the sketch tool was inadequate and that its successful use depended wholly upon
the users’ artistic ability. Overall, participants reported that they preferred the system
which simply allowed them to browse the collection.
The possible causes for this browsing preference are unclear. It may be that users
were clinging to a more familiar retrieval strategy or it may be driven by the particular
tasks users were trying to accomplish.
2.3 CLUSTERING
Here we use clustering solely to order the result images by how likely the user is to want them. A simple algorithm divides the result images into clusters: the most promising images go into one cluster, less promising images into another, and rejected images into a third. The user can view all the clusters. The algorithm is as follows:
Get the image threshold value and compare it with the cluster ranges.
If the number of clusters is n, divide the threshold value by n.
Assign each image to the cluster whose threshold range it falls in.
Display the clusters.
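The steps above can be sketched in Python as below; the report's implementation is in Visual Basic, and the bin orientation (cluster 0 holding the most promising images) and data structures here are assumptions for illustration.

```python
def cluster_by_threshold(scores, T, n):
    """Bin images into n clusters by similarity score.

    scores: {image_name: similarity_value}; cluster 0 holds the most
    promising images, cluster n-1 the least promising (rejected) ones.
    """
    width = T / n                          # divide the threshold by n
    clusters = [[] for _ in range(n)]
    for name, s in scores.items():
        idx = min(int(s // width), n - 1)  # map each score to its range
        clusters[idx].append(name)
    return clusters

# Toy scores: lower = more similar to the query (an assumption)
scores = {"a.jpg": 0.2, "b.jpg": 1.4, "c.jpg": 2.9}
groups = cluster_by_threshold(scores, T=3.0, n=3)
```

With T = 3.0 and n = 3, each cluster covers a range of width 1.0, so the three toy images land in clusters 0, 1, and 2 respectively.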
3. REQUIREMENTS
3.1 HARDWARE REQUIREMENTS
o Processor with 3.0 GHz Speed or Higher
o Random Access Memory 512 MB or More
o Virtual Memory 50GB or More
3.2 SOFTWARE REQUIREMENTS
o Microsoft Windows XP SP2 or Higher Version
o Microsoft .Net 3.0 or Higher
4. PROPOSED METHODOLOGY
The following sections describe the proposed methodology of our project. Many of the sections contain algorithms with explanations.
4.1 IMAGE ENHANCEMENT
The first step in our project is to enhance the input image so that it can be compared reliably. Input images may differ in content and format. Sometimes the same image may be given twice with small changes, and a pixel-by-pixel comparison of the two may fail entirely. We therefore apply a few enhancement techniques and compare only the enhanced images, not the originals.
The following sections describe the techniques used to enhance the input image and increase the comparison accuracy.
For i = 1 To ImageWidth - 1
    For j = 1 To ImageHeight - 1
        ' Difference of neighboring pixels per channel, offset by 128
        R = ((Pixel (i, j - 1) And &HFF) - (Pixel (i + 1, j) And &HFF)) + 128
        G = (((Pixel (i - 1, j) And &HFF00) / &H100) Mod &H100 - ((Pixel (i, j + 1) And &HFF00) / &H100) Mod &H100) + 128
        B = (((Pixel (i, j - 1) And &HFF0000) / &H10000) Mod &H10000 - ((Pixel (i + 1, j) And &HFF0000) / &H10000) Mod &H10000) + 128
        No = Abs ((R + G + B) / 3)
        SetPixel (i, j), RGB (No, No, No)
    Next j
Next i
This technique is used as the texture detection algorithm. Here we apply the standard Emboss algorithm twice. The algorithm is given below.
For i = 0 To ImageWidth
For j = 0 To ImageHeight
R = Pixel (i, j) And &HFF
G = ((Pixel (i, j) And &HFF00) / &H100) Mod &H100
B = ((Pixel (i, j) And &HFF0000) / &H10000) Mod &H10000
No = Abs ((R + G + B) / 3)
SetPixel (i, j), RGB (No, No, No)
Next j
Next i
Emboss ()
For i = 1 To ImageWidth
    For j = 1 To ImageHeight
        ' Difference against the upper-left neighbor per channel
        R = Abs (((Pixel (i - 1, j - 1) And &HFF) - (Pixel (i, j) And &HFF)) + 128)
        G = Abs ((((Pixel (i - 1, j - 1) And &HFF00) / &H100) Mod &H100 - ((Pixel (i, j) And &HFF00) / &H100) Mod &H100) + 128)
        B = Abs ((((Pixel (i - 1, j - 1) And &HFF0000) / &H10000) Mod &H100 - ((Pixel (i, j) And &HFF0000) / &H10000) Mod &H100) + 128)
        No = Abs ((R / 4 + G / 4 + B / 4) + i * 2 / j + 20)
        SetPixel (i, j), RGB (No, No, No)
    Next j
Next i
4.2.5 INVERSE
This algorithm is used when the input image contains alpha-blended or color-inverted objects. The algorithm is given below.
For i = 0 To ImageWidth
For j = 0 To ImageHeight
SetPixel (i, j), Not (Pixel (i, j))
Next j
Next i
For i = 1 To ImageWidth - 1
    For j = 1 To ImageHeight - 1
        R = Abs (Pixel (i * j / 2 - 1, j) And &HFF + Pixel (i - 2, j + 1) And &HFF - Pixel (i + 1, j - 1) And &HFF) + 128
        G = Abs ((Pixel (i * j - i, j) And &HFF00) Mod &H100 + (Pixel (i + 1, j - 2) And &HFF00) Mod &H100 - (Pixel (i - 1, j + 1) And &HFF00) Mod &H100) + 128
        B = Abs ((Pixel (i, j * j - i) And &HFF0000) Mod &H10000 + (Pixel (i - 2, j + 1) And &HFF0000) Mod &H10000 - (Pixel (i + 1, j - 1) And &HFF0000) Mod &H10000) + 128
        No = Abs ((R + G + B) / 3)
        SetPixel (i - 1, j - 1), RGB (No, No, No)
    Next j
Next i
Inverse ()
Grayscale ()
Inverse ()
For i = 1 To ImageWidth - 1
    For j = 1 To ImageHeight - 1
        R = Abs (Pixel (i * j / 2 - 1, j) And &HFF + Pixel (i - 2, j + 1) And &HFF - Pixel (i + 1, j - 1) And &HFF) + 128
        G = Abs ((Pixel (i * j - i, j) And &HFF00) Mod &H100 + (Pixel (i + 1, j - 2) And &HFF00) Mod &H100 - (Pixel (i - 1, j + 1) And &HFF00) Mod &H100) + 128
        B = Abs ((Pixel (i, j * j - i) And &HFF0000) Mod &H10000 + (Pixel (i - 2, j + 1) And &HFF0000) Mod &H10000 - (Pixel (i + 1, j - 1) And &HFF0000) Mod &H10000) + 128
        No = Abs ((R + G + B) / 3)
        SetPixel (i - 1, j - 1), RGB (No, No, No)
    Next j
Next i
This algorithm reduces a high-definition color image to a low-color image. The algorithm is given below.
For i = 0 To ImageWidth
For j = 0 To ImageHeight
If Pixel (i, j) < 2097152 Then
No = 0
Else If Pixel (i, j) > 2097152 And Pixel (i, j) < 4194304 Then
No = 2097152
Else If Pixel (i, j) > 4194304 And Pixel (i, j) < 6291456 Then
No = 4194304
Else If Pixel (i, j) > 6291456 And Pixel (i, j) < 8388608 Then
No = 6291456
Else If Pixel (i, j) > 8388608 And Pixel (i, j) < 10485760 Then
No = 8388608
Else If Pixel (i, j) > 10485760 And Pixel (i, j) < 16777216 Then
No = 10485760
End If
SetPixel (i, j), No
Next j
Next i
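The If/ElseIf ladder above maps each packed 24-bit pixel value onto one of six coarse levels, each a multiple of 2097152 (&H200000). A hedged Python sketch of the same mapping, written as a single integer division (exact multiples of 2097152 are left unassigned by the strict comparisons in the original; here they fall into their own level):

```python
def reduce_color(pixel):
    """Quantize a packed 24-bit pixel value to one of six coarse levels,
    mirroring the If/ElseIf ladder with one integer division. The last
    branch of the original lumps everything above 10485760 into that
    level, which the min() cap reproduces."""
    return min(pixel // 0x200000, 5) * 0x200000

# A few sample values from the ranges handled by the ladder
low = reduce_color(100)          # below 2097152 -> 0
mid = reduce_color(3000000)      # between 2097152 and 4194304 -> 2097152
high = reduce_color(16000000)    # above 10485760 -> 10485760
```

Collapsing the branch ladder into arithmetic keeps the per-pixel cost constant regardless of how many levels are used.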
Here we use color-based CBIR: the similarity of images is computed from each pixel's color value. The color value of a pixel is not a single number but a vector containing all the measures used to compute similarity.
Now compute the following for each pixel of the Color Feature Vector:
o RED, GREEN and BLUE Components [RGB]
o HUE, SATURATION and VALUE Components [HSV]
o Commission Internationale de l’Eclairage Components [CIE-LUV]
o YUV Components
Now we have the new Color Feature Vector of size,
V [10, n, 12].
Now we obtain the Color Feature Vector V. This vector is used for computing image similarity.
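For a single pixel, the per-color-space components can be sketched as below. The BT.601 YUV coefficients are a common choice assumed here, and the CIE-LUV conversion is omitted because it requires an intermediate XYZ white-point step; the function name is illustrative.

```python
import colorsys

def color_components(r, g, b):
    """Per-pixel entries of the color feature vector: RGB, HSV and YUV.
    (CIE-LUV needs an XYZ white-point conversion and is omitted here.)"""
    # colorsys expects channels in [0, 1]
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # BT.601 luma and chroma (one common YUV definition, an assumption)
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v_ = 0.877 * (r - y)
    return {"rgb": (r, g, b), "hsv": (h, s, v), "yuv": (y, u, v_)}

feat = color_components(255, 0, 0)   # pure red: hue 0, luma 0.299 * 255
```

Stacking these per-pixel tuples over the image yields the multi-component color feature vector that the similarity computation consumes.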
Inputs: Threshold Value T, Threshold Range (T/n)
Consider Clusters C [(T/n), T]
Compute Color Feature Vector V for Query.Image
Compute Median Vector M for Query.Image
For I = 1 To All_Images (Database.Images)
    Compute Color Feature Vector V for Database.Images (I)
    Compute Median Vector M for Database.Images (I)
    For J = 1 To All_Patches (Database.Images (I))
        Val = Comp_Patch (Query.Image.M, Database.Images (I).M, J)
        D = Threshold_Val (Val, T, (T/n))
        Add Database.Images (I) to C [D, T]
    Next J
Next I
Display_All (C)
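A toy Python rendering of the loop above: patch medians are reduced to plain numbers, and the comparison and thresholding functions are simplified assumptions, not the report's Visual Basic implementation.

```python
def retrieve(query_patches, database, T, n):
    """Compare a query's patch medians against each database image and
    bin the results by threshold range (toy version of the loop above)."""
    width = T / n
    clusters = {i: [] for i in range(n)}
    for name, patches in database.items():
        # Similarity value = mean absolute patch difference (an assumption
        # standing in for Comp_Patch over real color median vectors)
        val = sum(abs(q - p) for q, p in zip(query_patches, patches)) / len(patches)
        d = min(int(val // width), n - 1)   # stands in for Threshold_Val
        clusters[d].append(name)
    return clusters

db = {"close.jpg": [1.0, 2.0], "far.jpg": [9.0, 9.0]}
res = retrieve([1.0, 2.0], db, T=10.0, n=2)
```

The image whose patches match the query exactly lands in cluster 0, the dissimilar one in cluster 1, so displaying the clusters in order presents the best matches first.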
Now C contains all the clusters, and the user may select the cluster he wants most. The user can also view the output images that are systematically similar.
4.4 PERFORMANCE EVALUATION
The above algorithm produced the following results.
Query Image:
[kri.JPG]
[Database images: karaikudi1.JPG, karaikudi2.JPG, karaikudi.JPG, kri1.JPG, kri.JPG, tce.JPG, tcehostel.JPG, tpk.JPG]
Enhanced Images:
[enhanced image thumbnails]
Output:
4.4.1 IMAGE SETTINGS
The input image and its enhanced images need not have the same size or color depth, but the number of enhanced images must match between the input image and the database images. For patch generation, the images are re-sized into square pixel arrays. The algorithm currently supports up to 5000 annotations, and the image size should be less than 2000x2000.
[Bar chart: threshold vs. maximum similarity value (0 to 6000) for each database image: karaikudi1.JPG, karaikudi2.JPG, karaikudi.JPG, kri1.JPG, kri.JPG, tce.JPG, tcehostel.JPG, tpk.JPG]
5. EXPERIMENTAL RESULTS
The following table contains the experimental results.
6. CONCLUSION
The experimental results above show that our new algorithms give the best results under the recommended hardware and software requirements. Existing similarity-measure algorithms take more computational space and time, whereas our algorithm performs better. Because all comparisons are made only with pixel colors, the image enhancement algorithms are very useful for overcoming the drawbacks of color-based image retrieval, although the enhancement algorithms are not strictly necessary. We also compared our efficiency against a segmentation algorithm that is already developed and in use. The patch detection algorithm runs very fast because it simply divides an n x n pixel image into a number of sub-images. The threshold is computed with a simple division rather than the complex algorithms used in other approaches, and the clusters are likewise selected by a simple division of the threshold range rather than by the complex clustering algorithms used in other existing systems. Finally, the images with the best similarity are presented to the user.
7. FUTURE WORK
The main drawback of content-based image retrieval is the speed of computing similarities. We therefore suggest running our algorithm in a parallel environment, where many computing elements can produce the result much more quickly.
Our algorithm currently supports color-based image retrieval; in the future we will develop other techniques based on shape, texture, segmentation, region, etc.
Our current implementation is in Microsoft Visual Studio 2005; in the future we will develop all our algorithms with free open-source software.
8. REFERENCES
[2] David McG. Squire, Wolfgang Müller, Henning Müller and Thierry Pun, Content-
based query of image databases: inspirations from text retrieval, Pattern Recognition
Letters (Selected Papers from The 11th Scandinavian Conference on Image Analysis
SCIA '99), 21, 13-14, pp. 1193-1198, 2000. B.K. Ersboll, P. Johansen, Eds.
[3] K. Barnard, P. Duygulu, D. Forsyth, N. de Freitas, D. M. Blei, and M. I. Jordan, “Matching words and pictures,” J. Mach. Learn. Res., vol. 3, pp. 1107–1135, 2003.
[4] A. Del Bimbo and P. Pala, “Visual image retrieval by elastic matching
of user sketches,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no.
2, pp. 121–132, Feb. 1997.