
CSE412

Selected Topics in Computer Engineering

Content-Based Image Retrieval (CBIR)


Introduction
• Image databases, once an expensive
proposition in terms of space, cost and
time, have now become a reality.
• Image databases store images of
various kinds.
• These databases can be searched
interactively, based on image content or
by indexed keywords.
Introduction
Examples:
• Art collection – paintings could be
searched by artists, style, color … etc.
• Medical images – searched for anatomy
or diseases.
• Satellite images – for analysis or
prediction.
Introduction
Example Database Projects:
• IBM Query by Image Content (QBIC) –
retrieval is based on visual content, including
properties such as color percentage, color
layout and texture.
• Virage Inc. Search Engine – can search based
on color, texture, structure, ... etc.
• Corbis – general purpose CBIR, 17 million
images, searchable by keywords.
• Getty Images – image database organized by
categories and searchable through keywords.
Text-Based Image Retrieval
Traditional text-based image search engines
– Manual annotation of images with text description
– Use text-based retrieval methods

e.g. Water lilies

Flowers in a pond
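
As a minimal sketch of the text-based approach, the snippet below matches query keywords against manually written annotations. The file names and the `keyword_search` helper are hypothetical and used only for illustration; the two annotation strings are the examples above.

```python
def keyword_search(query, annotations):
    """Return ids of images whose annotation contains every query word."""
    terms = set(query.lower().split())
    return [
        image_id
        for image_id, text in annotations.items()
        if terms <= set(text.lower().split())
    ]


# Hypothetical annotations written by two different annotators.
annotations = {
    "img1.jpg": "Water lilies",
    "img2.jpg": "Flowers in a pond",
}

print(keyword_search("water lilies", annotations))  # ['img1.jpg']
print(keyword_search("flowers", annotations))       # ['img2.jpg']
```

An image is found only if the annotator happened to use the same words as the searcher, which is the weakness the next slide discusses.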
Limitations of text-based approach
• Problem of image annotation
– Valid only for one language – with image retrieval
this limitation should not exist

• Problem of human perception


– Subjectivity of human perception
– Too much responsibility on end-user

• Problem of abstract needs


– Queries that depend on visual features of images
and cannot be described in words at all.
What is CBIR?
• Images have rich content.
• This content can be extracted as various
content features:
– Mean color, Color Histogram … etc.
• Take the responsibility of forming the
query away from the user.
• Each image will now be described by its
own features.
CBIR – A sample search query
• User wants to search for, as an example, many
rose images
– He submits an existing rose picture as a query.
– He submits his own sketch of rose as a query.
• The system will extract image features for this
query.
• It will compare these features with those of other
images in a database.
• Relevant results will be displayed to the user.
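
The query steps above can be sketched as follows. This is an illustrative outline, not the method of any particular system: it assumes mean color as the only feature and Euclidean distance as the comparison, and the names `search` and `database` are hypothetical.

```python
import math


def mean_color(pixels):
    """Feature extraction: per-channel mean of a list of (r, g, b) tuples."""
    n = len(pixels)
    return tuple(sum(channel) / n for channel in zip(*pixels))


def search(query_pixels, database, top_k=3):
    """Rank database images by feature distance to the query image."""
    query_feature = mean_color(query_pixels)
    scored = [
        (math.dist(query_feature, mean_color(pixels)), name)
        for name, pixels in database.items()
    ]
    scored.sort()  # smallest distance first
    return [name for _, name in scored[:top_k]]


# Tiny hypothetical database: image name -> list of pixels.
database = {
    "rose": [(200, 20, 30), (220, 10, 40)],
    "sky": [(30, 60, 220)],
    "grass": [(20, 200, 30)],
}
print(search([(210, 15, 35)], database, top_k=2))  # ['rose', 'sky']
```

In practice the features of the database images would be extracted once and stored, rather than recomputed for every query as in this sketch.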
Sample Query
CBIR architecture
Feature Extraction
• What are image features?
• Primitive features
– Mean color (RGB)
– Color Histogram

• Semantic features
– Color Layout, texture … etc.
(Primitive and semantic features together are general features.)
• Domain specific features
– e.g., face recognition
Mean Color

Pixel Color Information: R, G, B


$$\text{Mean component (R, G, or B)} = \frac{\text{Sum of that component for all pixels}}{\text{Number of pixels}}$$
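
A minimal sketch of the mean-color computation above, assuming the image is given as a flat list of (R, G, B) tuples with 8-bit values; the function name `mean_color` is chosen here for illustration.

```python
def mean_color(pixels):
    """Return the mean of each component (R, G, B) over all pixels."""
    n = len(pixels)
    if n == 0:
        raise ValueError("image has no pixels")
    sums = [0, 0, 0]
    for r, g, b in pixels:
        sums[0] += r
        sums[1] += g
        sums[2] += b
    # Sum of each component divided by the number of pixels.
    return tuple(s / n for s in sums)


image = [(255, 0, 0), (0, 0, 255), (0, 0, 0), (255, 0, 255)]
print(mean_color(image))  # (127.5, 0.0, 127.5)
```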
Histogram
• Frequency count of each individual color
• Most commonly used color feature
representation
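
A color histogram can be sketched as below. Counting every distinct 24-bit color separately would give a very sparse histogram, so this example first quantizes each 8-bit channel into a small number of levels; the quantization step and the `levels` parameter are assumptions made for illustration, not something the slide specifies.

```python
from collections import Counter


def color_histogram(pixels, levels=4):
    """Count pixels per quantized color.

    Each 8-bit channel value (0..255) is mapped to one of `levels` bins,
    so a bin is identified by a (r_bin, g_bin, b_bin) tuple.
    """
    hist = Counter()
    for r, g, b in pixels:
        key = (r * levels // 256, g * levels // 256, b * levels // 256)
        hist[key] += 1
    return hist


image = [(255, 0, 0), (250, 10, 5), (0, 0, 255)]
hist = color_histogram(image)
print(hist[(3, 0, 0)])  # 2: both reddish pixels fall into the same bin
print(hist[(0, 0, 3)])  # 1
```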
Color Layout
• Need for Color Layout
– Global color features give too many false
positives
• How it works:
– Divide whole image into sub-blocks
– Extract features from each sub-block
• Can we go one step further?
– Divide into regions based on color feature
concentration
– This process is called segmentation.
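
The sub-block step can be sketched as follows, assuming the image is a list of rows of (R, G, B) tuples and that the mean color is the feature extracted from each sub-block. The function name and the choice of feature are illustrative assumptions; segmentation into color regions is not covered by this sketch.

```python
def block_mean_colors(image, rows, cols):
    """Split the image into rows x cols sub-blocks and return each block's mean color.

    Blocks are listed left to right, top to bottom.
    """
    height, width = len(image), len(image[0])
    if rows > height or cols > width:
        raise ValueError("more blocks than pixels along one axis")
    features = []
    for i in range(rows):
        y0, y1 = i * height // rows, (i + 1) * height // rows
        for j in range(cols):
            x0, x1 = j * width // cols, (j + 1) * width // cols
            block = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            n = len(block)
            features.append(tuple(sum(p[c] for p in block) / n for c in range(3)))
    return features


image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
print(block_mean_colors(image, 1, 1))  # [(127.5, 127.5, 127.5)]
print(block_mean_colors(image, 2, 2))  # one mean color per pixel-sized block
```

With a single block the four distinct colors collapse into one gray mean, while the 2 x 2 layout keeps them apart, which is why layout features produce fewer false positives than global ones.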
Example: Color layout
Images returned for 40% red, 30% yellow and 10% black.
Color Histogram Similarity Measures
• Color histogram matching could be used as described earlier.
• QBIC defines its color histogram distance as the sum, over
corresponding bins, of the smaller of the two bin counts in the
histograms of the input image I and the model image M, normalized
by the number of pixels in the model image.
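
The measure described above can be sketched as below, assuming each histogram is a mapping from a bin label to a pixel count. Note that the value grows as the images become more similar (it is 1.0 when every model bin is fully covered), so a system that needs a true distance might use one minus this value; that conversion is an assumption here, not something the slide states.

```python
def histogram_intersection(hist_i, hist_m):
    """Sum of the smaller count in each corresponding bin, divided by
    the number of pixels in the model image M."""
    total_m = sum(hist_m.values())
    if total_m == 0:
        raise ValueError("model histogram is empty")
    overlap = sum(min(count, hist_i.get(bin_, 0)) for bin_, count in hist_m.items())
    return overlap / total_m


# Hypothetical histograms with named bins for readability.
hist_i = {"red": 6, "blue": 4}
hist_m = {"red": 2, "blue": 5, "green": 3}
print(histogram_intersection(hist_i, hist_m))  # (2 + 4 + 0) / 10 = 0.6
```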
Color Similarity Measures
• Color layout is another possible distance
measure.
• The user can specify regions with specific
colors.
• Divide the image into a finite number of
grid squares, then associate each grid square
with a specific color (chosen from a color
palette).
Color Similarity Measures
• It is also possible to provide this information from a sample
image.
• Color layout measures that use a grid require a grid square color
distance measure d_color that compares the grid squares of the
sample image and the matched image:

$$d_{\text{gridded\_square}}(I, Q) = \sum_{g} d_{\text{color}}\left(C^{I}(g), C^{Q}(g)\right)$$

where $C^{I}(g)$ and $C^{Q}(g)$ represent the color in grid square g of a
database image I and query image Q, respectively.
• A suitable representation is to use the mean color in the grid
square.
• Mean color could represent the mean of R, G, and B, or of a single
component among them.
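
A minimal sketch of the gridded-square distance, assuming each image has already been reduced to a list of mean colors, one per grid square in the same order. The slide leaves d_color unspecified; Euclidean distance in RGB is used here purely as an assumption.

```python
import math


def color_distance(c1, c2):
    """Assumed d_color: Euclidean distance between two (R, G, B) colors."""
    return math.dist(c1, c2)


def gridded_square_distance(grid_i, grid_q):
    """Sum d_color over corresponding grid squares g of images I and Q."""
    if len(grid_i) != len(grid_q):
        raise ValueError("both images must use the same grid")
    return sum(color_distance(ci, cq) for ci, cq in zip(grid_i, grid_q))


grid_i = [(0, 0, 0), (255, 0, 0)]  # database image I, two grid squares
grid_q = [(3, 4, 0), (255, 0, 0)]  # query image Q
print(gridded_square_distance(grid_i, grid_q))  # 5.0
```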
