You are on page 1of 73

CSE3018 Content Based Image and

Video Retrieval
         

Dr.S.Geetha
Professor
School of Computing Science and Engineering
VIT University, Chennai Campus, Chennai
AB1-205 – geetha.s@vit.ac.in
SYLLABUS
• Fundamentals of Content based image and video
retrieval
• Image Content descriptors –Key Frame features Color
• Image Content descriptors Key frame features– Texture,
Shape
• Motion features
• Similarity Measures and Indexing Schemes Feature
Extraction techniques
• Computer Vision System Toolbox
Recommended Books
• MOODLE
– Lecture Slides
– Reading materials
– Web links
– Tutorials
– Research works, Recent Trends etc.
• Digital Image Processing (3rd Edition)– Gonzalez
• Research and White Papers
Grading Policy
• CAT – 1 – 15%
• CAT – 2 – 15%
• FAT – 40%
• Digital Assignment, Quiz, Programming
Exercise – 30%
• Project
– Team of maximum 3 members
COURSE OBJECTIVES
• To understand the fundamentals for the content-based
image and video retrieval.
• To understand the role of key frame features for content
based image retrieval.
• To understand the motion features of videos.
• To give the exposure on importance of similarity
measure in content based image retrieval.
• To design the algorithm for content based image
retrieval using machine learning algorithm.
Lecture 1
Introduction to Image Retrieval
Applications
Lecture 1: Outline
Whatisisand
•• What andWhy
Why image
image retrieval?
retrieval?
• How to compare and retrieve images?
• How to compare and retrieve images?
− Digital image representation
– Digital image representation
− Common components of the CBIR systems
– Common components of the CBIR systems
− Main problems and research directions
– Main
• What areproblems and research directions
applications?
• What are applications?
What is image retrieval?
 Description Based Image Retrieval (DBIR)
 Content Based Image Retrieval (CBIR)
Query

Textual Text
query

Query Image
by
example Sketch
DBIR v. s. CBIR

DBIR CBIR
 Full text search algorithms are  Automatic index construction
applicable
+  Index is objective
 Search results correspond to
image semantics

 Manual annotating is hardly  Semantic gap


feasible
–  Manual annotations are  Querying by example is not
subjective convenient for a user
Why image retrieval?

• Huge amounts of images are everywhere: how


to manage this data?
• “A Picture is worth thousand words”
• Not everything can be described in text
Why is Image Retrieval Hard ?
? What is the topic of this image
? What are right keywords to index this image
? What words would you use to retrieve this
image ?
The Semantic Gap
How similar are these two images
Complexities for Image Retrieval
• Images were traditionally managed by first annotating their
contents and then using text-retrieval techniques to index
them.
• However, with the increase of information in digital image
format some drawbacks of this technique were revealed:
• Manual annotation requires vast amount of labor
• Different people may perceive differently the contents of an image; thus no objective
keywords for search are defined

• A new research field was born in the 90’s: Content-based


Image Retrieval aims at indexing and retrieving images
based on their visual contents
Image Database Examples
• IBM: Query by Image Content (QBIC)
– Retrieves images based on visual content, including such
properties as color percentage, color layout, and texture.
• Virage, Inc.
– Virage search engine can retrieve images based on color
composition, texture, and structure.
• Google Image search.
• National Library of Medicine provides a database of x-rays,
CT scans, MRI images, and color cross-sections, taken at very
small intervals along the bodies of male and female cadaver.
• The NASA collects huge databases of images from its satellites
and makes them available for public acquisition. (for free )
DB2 Image Extender
• DB2 Image Extender defines the distinct data type
DB2IMAGE with associated user-defined functions
for storing and manipulating image files
– (http://www-306.ibm.com/software/data/db2/extenders/
).
• The DB2 Image Extender provides similarity search
functionality based on the QBIC technology
– (http://wwwqbic.almaden.ibm.com/ )

17
Query By Example (QBE)
• The image DB user should be able to:
– show the system a sample image, or
– Paint one interactively on the screen, or
– Just sketch the outline of an object.
• The system should then be able to return
similar images or images containing similar
objects.

18
IBM-QBIC
• The Hermitage Web site was voted the best in
Russia. It uses the QBIC engine for searching
archives of world-famous art.
– http://www.hermitagemuseum.org/fcgi-bin/db2w
ww/qbicSearch.mac/qbic?selLang=English
– Color percentage
– Color layout

19
A sample query
• SELECT CONTENTS(image),
QBScoreFROMStr(`averageColor=
<255,0,0>’, image) AS SCORE
FROM signs ORDER BY SCORE

20
Photobook System

Figure 1. The texture retrieval of PhotoBook system


(http://web.media.mit.edu/~tpminka/photobook/).
Search Using Sketch

 Sketch entry

 Results of search
ImageScape System

Figure 2.5 The interface of ImgeScape visual query


system (http://skynet.liacs.nl/imagescape/).
Relevance Feedback in CBIR
• Motivation:
– Human perception of image similarity is subjective,
semantic, and task-dependent.
– The CBIR based on the similarities of pure visual
features are not necessarily perceptually and
semantically meaningful.
• Each type of visual feature tends to capture only one aspect
of image property and it is usually hard for a user to specify
clearly how different aspects are combined …
– Relevance Feedback is introduced to address these
problems.
• It is possible to establish the link between high-level
concepts and low-level features.
Relevance Feedback RF (cont.)
• RF is a supervised active learning technique
used to improve the effectiveness of
information systems.
– Main idea: use positive and negative examples
from the user to improve system performance.
Initial query
results
Query Image

Collect user’s
feedback Initial Query Results

Real-time User Relevance Feedback


learning

Query Results After User Feedback


Refine query
results
Content-Based Image Retrieval
• Content-Based Image Retrieval (CBIR)
– Image databases can be huge, containing hundreds of
thousands or millions of images.
– In most cases they are only indexed by keywords that have
to be decided upon and entered into the database system
by a human categorizer.
– However, image can be retrieved according to their
content, where content might refer to color distributions,
texture, region shapes, or object classification.
Why content based image
retrieval?
• Automatic generation of textual annotations for a wide
spectrum of images is not feasible.

• Annotating images manually is a cumbersome and expensive


task for large image databases.

• Manual annotations are often subjective, context-sensitive


and incomplete.

• Google, Yandex and others use text-based search. Results are


not perfect.
However, now it is much better, than a couple of years ago!

28/46
Content-based Image Retrieval (CBIR)

Searching a large database for images that


match a query:

– What kinds of databases?


– What kinds of queries?
– What constitutes a match?
– How do we make such searches efficient?
Applications
– Art Collections
e.g. Fine Arts Museum of San Francisco
– Medical Image Databases
CT, MRI, Ultrasound, The Visible Human
– Scientific Databases
e.g. Earth Sciences
– General Image Collections for Licensing
Corbis, Getty Images
– The World Wide Web
What is a Query?

– an image you already have


How did you get it?

– a rough sketch you draw


How might you draw it?

– a symbolic description of what you want


What’s an example?
Some Systems You Can Try
Corbis Stock Photography and Pictures

http://pro.corbis.com/

• Corbis sells high-quality images for use in advertising,


marketing, illustrating, etc.

• Search is entirely by keywords.

• Human indexers look at each new image and enter keywords.

• A thesaurus constructed from user queries is used.


QBIC

IBM’s QBIC (Query by Image Content)

http://wwwqbic.almaden.ibm.com

• The first commercial system.

• Uses or has-used color percentages, color layout,


texture, shape, location, and keywords.
Blobworld

UC Berkeley’s Blobworld

http://elib.cs.berkeley.edu/blobworld

• Images are segmented on color plus texture

• User selects a region of the query image

• System returns images with similar regions

• Works really well for tigers and zebras


Image Features / Distance Measures

Query Image Retrieved Images

User
Image Database

Distance Measure

Images Image Feature Feature Space


Extraction
Levels of image retrieval
• Level 1: Based on color, texture, shape features
– Images are compared based on low-level features,
no semantics involved
– A lot of research done, is a feasible task

• Level 2: Bring semantic meanings into the search


– E. g. identifying human beings, horses, trees, beaches
– Requires retrieval techniques of level 1
– Very active and challengeable research area

• Level 3: Retrieval with abstract and subjective attributes


– Find pictures of a particular birthday celebration
– Find a picture of a happy beautiful woman
– Requires retrieval techniques of level 2 and very complex logic
– Is far from being developed with modern technology available now

36/46
Image retrieval by Google

37/46
Image retrieval by Yandex

38/46
Lecture 1: Outline
• What is and Why image retrieval?
• How to compare and retrieve images?
– Digital image representation
– Common components of the CBIR systems
– Main problems and research directions
• What are applications?

39/46
Digital image representation
Vector image
draw circle
center 0.5, 0.5
radius 0.4
fill-color yellow
stroke-color black
stroke-width 0.05
draw circle
center 0.35, 0.4
radius 0.05
fill-color black
draw circle
center 0.65, 0.4
radius 0.05
fill-color black
draw line
start 0.3, 0.6
end 0.7, 0.6
stroke-color black
stroke-width 0.1

40/46
Digital image representation
Bitmap (raster) image

0  f (x , y )  L, and typically L  255

• Bitmap image is an array of pixels


• The value of each array element corresponds
to the color of the appropriate pixel
41/46
Digital image representation
Bitmap (raster) image
Important parameters of raster image:
• Raster dimensions
• Resolution (ppi)
• Sample depth (usually 2k)

Fixed resolution, varying dimension

Fixed dimensions, varying resolution


42/46
Digital image representation
Bitmap (raster) image

The same image with varying sample depths:

16 levels 8 levels 4 levels 2 levels

Typical levels: 8 bit (256 levels), 16 bit – png, tiff

43/46
Digital image representation
Bitmap (raster) image: color
• RGB – the most common color model (CRT monitors, LCD
screens/projectors)
• Each pixel represented by 3 values: red, green, blue

RGB bands:
color image built up of bands of red, green and blue color
44/46
Digital image representation
Bitmap (raster) image: color
• Pixel-interleaved format (chunky) – is a common one

• Color-interleaved format (planar)

45/46
Lecture 1: Outline
• What is and Why image retrieval?
• How to compare and retrieve images?
– Digital image representation
– Common components of the CBIR systems
– Main problems and research directions
• What are applications?

46/46
Common components of CBIR
system
indexation

image feature
extraction database

query feature result


retrieval

comparison
extraction

Relevance feedback: query refinement

47/46
Lecture 1: Outline
• What is and Why image retrieval?
• How to compare and retrieve images?
– Digital image representation
– Common components of the CBIR systems
– Main problems and research directions
• What are applications?

48/46
Problems and directions
• Low-level feature extraction
– How to represent an image in a compact and descriptive
way?
– How to compare features, and, thus, images?
• High dimensional indexing
• How to index huge amounts of high dimensional data?
• Visual interface for image browsing
– How to visualize the results?

49/46
How to: Image features

Textual/metadata features

 Semantics
Levels of image content

 Shape

 Texture
Low-level features / visual features
(signatures, descriptors)
 Color, lightness

50/46
How to: Image features
Image features

Visual (low-
Textual
level)
Annotations and metadata: Features extracted from pixel
− tags/keywords; values:
− creation date; − color descriptors;
− geo tags; − texture descriptors;
− name of the file; − shape descriptors;
− photography conditions − spatial layout descriptors.
(exposition, aperture, flash…).

51/46
How to: Image features
Low-level features

Global Local

Describes the whole image: Describes one part of the image:


− average intensity; − average intensity for the left
− average amount of red; upper part;
− average amount of red in the
− …
center of the image;
− …

All pixels of the image are Segmentation of the image is


processed. performed, pixels of a particular
segment are processed to extract
features.
52/46
How to: Feature spaces
• Feature vector – a vector of features, representing one image.
• Feature space – the set of all possible feature vectors with
defined similarity measure.
Image A Image B

xA1 xA2 … xAN Similarity measure xB1 xB2 … x BN


Similarity measure
xA1 … xAN yA1 … yAM zA1 … zAK Similarity measure xA1 … xAN yA1 … yAM zA1 … zAK
y A1 y A2 … yAM y B1 y B2 … y BM

z A1 z A2 … z AK Similarity measure …z B1 z B2 … zBK


53/46
How to: Combine results
Image A Image B

Similarity measure
xA1 xA2 … x AN xB1 xB2 … xBN d1
Similarity measure
y A 1 yA 2 … yAM yB1 yB2 … yBM d2
Similarity measure
z A 1 zA 2 … zAK zB1 zB2 … zBK d3


D c d
i
i i

54/46
How to: Image segmentation
• Fixed regions
– The same region boundaries for all images.
• Segmentation
– Boundaries depends on image content.
• Key points (point of interest) detection
– Points of particular interest in the image, feature extraction for areas
around key points.

55/46
Problems: semantic gap

Objects semantics
(regions)
Levels of image content

Texture semantic gap


(local regions)

Color, brightness
(one pixel) low-level features

How to understand what’s on the images?


56/46
Problems: what’s on the images?

• Sometimes it is not easy to understand the image even for humans!


• What do we want from machines?

57/46
Problems: what’s on the images?

• How do we know that all these objects are lamps?


58/46
Problems: subjectivity of
perception
Let’s compare our perception!

• Test a CBIR system

59/46
Problems: high dimensional data
• More information in feature vectors – better search results.

• Local features are usually more precise than global ->


more feature vectors.

• The dimensionality of the feature vectors is normally of the


order 102.

• ~200-500 keypoints per image

• Non-Euclidean similarity measure

60/46
How to: high dimensional
indexing
• Perform dimension reduction
– The dimension of the feature vectors is normally
very high, the embedded dimension is much lower.

• Use appropriate multi-dimensional indexing


techniques, which are capable of supporting
Non-Euclidean similarity measures
– Trees (k-d tree, VP-tree and others)
– Hashing

61/46
Problems: visualization
• Image content is very rich and its interpretation is
very contextual and subjective.

• Many independent similarity measures are


commonly used. How about to let user influence
the choice of these parameters?

• Which images to show as a result (result diversity)?

• Interactive search and relevance feedback.

62/46
1-D

How to: visual interfaces


• 1-D visualizations
– As a list (standard way)
• 2-D visualizations
– Based on dimension reduction techniques
• “3-D” visualizations
– Fish eye

2-D 3-D
63/46
Neighbour research areas
• Image processing
– Features extraction
– Pattern recognition and machine learning
• Faces, handwritings, thumbprints, …
• Classification tools
– Image enhancement
– Image classification
• The same features are used
• Classification helps to retrieve
• Information retrieval
– Scalability
– Performance measurement
– Fusion of multiple evidences

64/46
Lecture 1: Outline
• What is and Why image retrieval?
• How to compare and retrieve images?
– Digital image representation
– Common components of the CBIR systems
– Main problems and research directions
• What are applications?

65/46
What are applications? – Image Archives.
• Manage image archives
– Personal photo collections (many thousands of photos in mine)
– Professional photograph archives (millions of photos)
– Art collections (millions of photos)
• Browse images
• Organize image collection: delete duplicates, classify images, select “the best”
from the group of similar images
• Posters creation, auto cropping, album creation (www.snapfishlab.hpl.hp.com)
• Better organization of search-by-text results

66/46
What are applications? – Image Archives.
• Manage image archives

• …
• Search for particular image (by its smaller version, by its fragment)
• Search for similar images (landscape paintings, sea views, paintings by the
same author)
• Search for a painting with particular colors (“I want a sea view painting to my
bedroom with an orange carpet and yellow walls”)
• Search for group photos of my family
• Search for an image that will be a good illustration to my article/presentation
• … a lot of other use cases

67/46
What are applications? – Copyrights.
• Trademark and copyright application
– World Wide Web
– Enterprise network

• Copyright detection without watermarking and protect intellectual


property
• Forged images detection and sub-image retrieval
• Trademark image registration: a new candidate is compared with
existing marks to ensure no risk of confusing property ownership
• Search if confidential images are included into public presentations

68/46
What are applications? – Medical.

• Medical diagnosis
– Collection of X-ray images

• Search for similar past cases


• Is it similar to the “healthy” case?
• Classification of X-ray images

69/46
What are applications? – Security.
• Security issues
– Video surveillance material
– Faces, fingerprints, retina images

• Detect suspicious objects during the


video surveillance
• Detect “wanted” faces during the video
surveillance
• Grant or deny access based on
fingerprints/retina scanning

70/46
What are applications? – In
industry.
• Quality assurance

(a) CD-ROM controller(b) Pack of pills (c) Level of liquid (d) Air-bladders
in plastic
• Control that all parts of the product are on place (a)
• Control if all places in pill pack are filled (b)
• Control the level of liquid in bottles (c)
• Control the quality of plastic details (d)
• And even control the corn flakes! (e)
(e) Corn flakes

71/46
What are applications? – Others.
• Military-related issues
– Auto aiming, tracking systems

• Image-based modeling and 3-D reconstruction


– Medical imaging
– Indoor scene reconstruction from multiple images
– Outdoor scene reconstruction from aerial photography

• Geographical information and remote sensing


– Process satellite data: climate variability, sea surface
temperatures, storms watch.

72/46
Lecture 1: Summary
• CBIR is an actual problem and an active research area

• Main research directions are:


– Feature extraction
– Multidimensional indexing
– Visualization

• CBIR combines research results of image processing,


information retrieval, database communities

• CBIR has many applications in various areas

73/46
Lecture 1: Bibliography
• Gonzalez R, Woods R. Digital Image Processing, published by
Pearson Education, Inc, 2002.
• Rui Y., Huang T.S., Chang S.-F. Image Retrieval: Past, Present
and Future. In Proc. of Int. Symposium on Multimedia
Information Processing, Dec. 1997.
• Links:
– http://videolectures.net/russir08_vassilieva_cbir/
– http://videolectures.net/kdd08_malik_fis/
– http://imagedatabase.cs.washington.edu/demo/cbir.html
– courses.cs.vt.edu/~cs4624/cache/qbic.htm
– http://homepages.inf.ed.ac.uk/cgi/rbf/CVONLINE/entries.pl?TAG363
http://www.ux.uis.no/~tranden/brodatz.html
http://www.cs.washington.edu/research/imagedatabase/groundtruth/
http://peipa.essex.ac.uk/benchmark/databases/

74/46
Thank You!

You might also like