Architecture
Introduction
●
Multimedia retrieval refers to fetching continuous multimedia data
from the disk.
●
Multimedia involves very large amounts of data.
●
Multimedia retrieval must be executed reliably under real-time
constraints.
●
Multimedia retrieval scheme:
– Step 1: Host CPU sends the retrieval request to the I/O subsystem.
– Step 2: I/O subsystem moves compressed data from disk to
memory.
– Step 3: Host CPU decompresses the compressed data.
– Step 4: Host CPU waits for the ready signal from the display
subsystem, and moves the decompressed data from memory to
display device and speakers via the display subsystem.
Multimedia retrieval architecture
Principles of Multimedia Data Retrieval
●
Client/Server Model:
– Servers have resources and information that other
components called clients wish to access.
●
Multimedia Server:
– Digitally stores multimedia content on a large array of high-capacity
storage devices, referred to as multimedia storage.
– Video, audio and text differ in characteristics and require
different management techniques.
●
Multimedia Client
– A process which sets up a multimedia query to extract
multimedia information.
Multimedia Data Retrieval Architecture
●
Sequential retrieval architecture
●
Pipeline retrieval architecture
●
Concurrent retrieval architecture
Continuity Requirement
●
For continuous retrieval of media data that is delay-sensitive
or real-time stream data, it is essential
that media information be available at the display device
at or before the time of its playback.
●
CR Equation for Sequential Retrieval
●
CR Equation for Pipeline Architecture
●
CR Equation for Concurrent Architecture
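The continuity requirement can be illustrated in its general form (each block available at the display at or before its playback time). This is a minimal sketch; it does not reproduce the architecture-specific CR equations, and the function name and the timing values are hypothetical.

```python
def meets_continuity(arrival_times, playback_times):
    """Return True if every media block is available at the display
    at or before its scheduled playback time."""
    if len(arrival_times) != len(playback_times):
        raise ValueError("one arrival time is needed per playback time")
    return all(a <= p for a, p in zip(arrival_times, playback_times))


# Hypothetical timings in milliseconds for four consecutive blocks.
playback = [0, 40, 80, 120]
on_time = [0, 35, 78, 120]   # every block arrives by its deadline
late = [0, 35, 85, 120]      # third block misses its deadline

print(meets_continuity(on_time, playback))
print(meets_continuity(late, playback))
```

A retrieval architecture satisfies the requirement only if this check holds for every block of the stream, not just on average.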
Query Processing
●
Types of queries
●
Attribute based queries
– association of attributes, including text and numerical attributes
which may represent features extracted from the multimedia
units
– retrieval by an identifier (e.g., an index), and
– retrieval by conditional statements.
●
Content based queries
– queries over color composition and other image or media
characteristics
●
Temporal queries
– temporal relations among the media units within a presentation.
Image Queries
●
Images are required for:
●
illustration of text articles, conveying information or
emotions difficult to describe in words,
●
display of detailed data (such as radiology images) for
analysis,
●
formal recording of design data (such as architectural
plans) for later use, and so on
●
Types of attributes:
– the presence of a particular combination of color, texture or
shape features (e.g., green stars);
– the presence or arrangement of specific types of object (e.g.,
chairs around a table);
– the depiction of a particular type of event (e.g., a football
match);
– the presence of named individuals, locations, or events (e.g.,
the PM greeting a crowd);
– subjective emotions one might associate with the image
(e.g., happiness).
Video Queries
●
Prepare a storyboard of annotated still images (often known as key frames)
representing each scene.
●
Prepare a series of short video clips, each capturing the essential details of
a single sequence – video skimming.
●
Level 1 comprises retrieval by primitive features such as color, texture,
shape or the spatial location of image elements
●
Level 2 comprises retrieval by derived features, involving some degree of
logical inference about the identity of the objects in image.
– retrieval of objects of a given type; retrieval of individual objects or
persons
●
Level 3 comprises retrieval by abstract attributes, involving a significant
amount of high-level reasoning about the meaning and purpose of the
objects or scenes depicted.
– retrieval of named events or types of activity; retrieval of pictures with
emotional or religious significance
Queries for Video and Images Retrieval
●
Subimage Query:
●
(k, u, t) query: given a query image that contains k labeled objects
and u unlabeled objects, and a tolerance t, retrieve
all images that contain a (k, u, t) subimage which matches the query
within tolerance t.
●
Generic search algorithm:
●
R-tree search: Issue (one or more) range queries on the (k, 1) R-tree,
to obtain a list of promising images (image identifiers).
●
Cleanup: For each of the above obtained images, retrieve its
corresponding ARG from the graph file, and compute the actual
distance between this ARG and ARG of the original (k, u,t) query. If
the distance is less than the threshold t , the image is included in the
response set.
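The two-step search above follows a filter-and-refine pattern, which can be sketched as follows. This is only an illustration: the R-tree is replaced by a linear scan, the ARG distance is a placeholder (sum of absolute attribute differences), and all names, the data layout and the use of t as the range-query radius are assumptions.

```python
import math


def range_query(index, query_vec, radius):
    """Stand-in for the R-tree range query: a linear scan that returns
    the identifiers of images whose index vector lies within `radius`."""
    return [img_id for img_id, vec in index.items()
            if math.dist(vec, query_vec) <= radius]


def arg_distance(arg_a, arg_b):
    """Placeholder ARG distance (assumption): sum of absolute
    differences between corresponding attributes."""
    return sum(abs(a - b) for a, b in zip(arg_a, arg_b))


def subimage_search(index, graph_file, query_vec, query_arg, t):
    # Step 1 (search): obtain promising image identifiers.
    candidates = range_query(index, query_vec, t)
    # Step 2 (cleanup): keep images whose actual distance is below t.
    return [img_id for img_id in candidates
            if arg_distance(graph_file[img_id], query_arg) < t]


# Hypothetical data: index vectors and ARG attribute lists per image.
index = {"a": (0.0, 0.0), "b": (1.0, 1.0), "c": (5.0, 5.0)}
graph_file = {"a": [1.0, 2.0], "b": [1.0, 4.5], "c": [1.0, 2.0]}
print(subimage_search(index, graph_file, (0.0, 0.0), [1.0, 2.0], 2.0))
```

Image "c" is rejected by the range query and image "b" by the cleanup step, so only the expensive distance computation for two of the three images is ever carried out.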
Single Region Based Image Query
●
Region-location queries address the spatial properties of individual
regions; region centroids or minimum bounding rectangles are
indexed for this purpose.
●
The spatial distance between regions is given by the Euclidean
distance
$$d_s = \sqrt{(x_q - x_t)^2 + (y_q - y_t)^2}$$
where (xq, yq) and (xt, yt) are the coordinates of the two points.
●
Bounded query location
●
The user has flexibility in designating the spatial bounds
for each region in the query. A target region that falls within
these bounds is assigned a spatial distance of zero; a target
region that falls outside them is assigned the Euclidean distance.
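A minimal sketch of the spatial distance with optional bounds. It assumes the usual reading of a bounded query, in which a target inside the user-drawn bounds gets distance zero and a target outside them gets the Euclidean distance; the function name and the (xmin, ymin, xmax, ymax) layout of the bounds are hypothetical.

```python
import math


def spatial_distance(query_xy, target_xy, bounds=None):
    """Euclidean distance between the query and target locations.

    If `bounds` = (xmin, ymin, xmax, ymax) is given and the target lies
    inside it, the distance is zero (assumed bounded-query behaviour).
    """
    if bounds is not None:
        xmin, ymin, xmax, ymax = bounds
        x, y = target_xy
        if xmin <= x <= xmax and ymin <= y <= ymax:
            return 0.0
    return math.dist(query_xy, target_xy)


print(spatial_distance((0, 0), (3, 4)))                       # 5.0
print(spatial_distance((0, 0), (3, 4), bounds=(0, 0, 5, 5)))  # 0.0
```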
●
Centroid location spatial access: spatial quadtrees
●
The centroids of the image regions are indexed using a
spatial quadtree on their x and y values.
● A query for a region at location (xt, yt) is processed by first
traversing the spatial quadtree to the containing node,
then exhaustively searching the block for the points that
minimize the spatial distance to (xt, yt).
●
If the user specifies a bounded spatial query,
a range of blocks is evaluated such that points within
the spatial bounds are all assigned a spatial distance of zero.
●
Rectangle location spatial access: R-trees
– The MBR is the smallest vertically aligned rectangle that
completely encloses the region.
●
Size
– Another important perceptual dimension of the regions is their size
in terms of area and spatial extent.
●
Area
– The distance in area between two regions is given by the absolute
distance.
●
Spatial Extent
– The distance in MBR width (w) and height (h) between two regions is
given by:
Single Region Query Strategy
●
The single region distance is given by the weighted sum of
the color set, location, area and spatial extent distances.
●
single region query distance:
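The weighted sum can be sketched as below. This is an illustration under stated assumptions, not the original equation: the color set distance is taken as a precomputed input, the spatial extent distance is assumed to be Euclidean over the width and height differences, and the region layout, weight names and weight values are hypothetical.

```python
import math


def single_region_distance(query, target, color_distance, weights):
    """Weighted sum of color set, location, area and spatial extent
    distances between a query region and a target region."""
    # Location: Euclidean distance between region centroids.
    d_location = math.dist((query["x"], query["y"]),
                           (target["x"], target["y"]))
    # Area: absolute difference of the region areas.
    d_area = abs(query["area"] - target["area"])
    # Spatial extent (assumed form): Euclidean over MBR width and height.
    d_extent = math.dist((query["w"], query["h"]),
                         (target["w"], target["h"]))
    return (weights["color"] * color_distance
            + weights["location"] * d_location
            + weights["area"] * d_area
            + weights["extent"] * d_extent)


query = {"x": 0, "y": 0, "area": 10, "w": 4, "h": 2}
target = {"x": 3, "y": 4, "area": 14, "w": 4, "h": 5}
weights = {"color": 1.0, "location": 1.0, "area": 1.0, "extent": 1.0}
print(single_region_distance(query, target, 0.5, weights))
```

With unit weights the four terms here are 0.5, 5, 4 and 3. In practice the weights would be chosen so that no single term dominates merely because of its units.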
Multiple Regions Query
Multiple Regions Query Strategy – Absolute Locations
●
For each region in the query positioned by absolute
location, the query strategy outlined for single region
query is carried out, without computing the final
minimization
●
Find the image having three regions that best matches
●
Matches found:
Shape-based Query Processing
●
Shape Index
– For each color region the shape index may be computed as
follows:
– Compute the major and minor axes of each color region.
– Rotate the shape region to align the major axis with the x-axis to
achieve rotation normalization, and scale it such that the major axis
is of a standard fixed length (say 96 pixels).
– Place the grid of fixed size (96x96 pixels) over the normalized
color region and obtain the binary sequence by assigning 1's and
0's accordingly.
– Using the binary sequence, compute the row and column total
vectors. These along with the eccentricity form the shape index
for the region.
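The last step of the procedure above can be sketched as follows. The sketch assumes the region has already been rotated, scaled and rasterized into a binary grid; it uses a 4x4 grid instead of 96x96 for readability, takes the eccentricity as a given input, and all names are hypothetical.

```python
def row_column_totals(grid):
    """Return the row total vector and column total vector of a binary grid."""
    rows = [sum(row) for row in grid]
    cols = [sum(col) for col in zip(*grid)]
    return rows, cols


def shape_index(grid, eccentricity):
    """Bundle the row/column total vectors with the eccentricity."""
    rows, cols = row_column_totals(grid)
    return {"rows": rows, "cols": cols, "eccentricity": eccentricity}


# 1 marks a grid cell covered by the normalized region, 0 an empty cell.
grid = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
print(shape_index(grid, eccentricity=1.0))
```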
●
Query Process
– The query image is processed to obtain a list of matching images
based only on color features.
– For each color region in the query image, the shape
representation of each region is evaluated.
– Compare the shape index of regions in the query image to those
in the list of images retrieved on color.
– Only regions whose eccentricity matches within a threshold (t) are
compared for shape similarity.
– The matching images are ordered by the sum of the differences
in row and column vectors between the query and the matching image.
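The final two steps of the query process (eccentricity filter, then ordering) can be sketched as below. It assumes one region per candidate image and measures the vector difference as a sum of absolute element-wise differences; both choices, and all names, are assumptions made for illustration.

```python
def vector_difference(a, b):
    """Sum of absolute element-wise differences between two total vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_shape(query, candidates, t):
    """Keep candidates whose eccentricity is within t of the query's,
    then order them by the summed row and column vector differences."""
    scored = []
    for image_id, idx in candidates.items():
        if abs(idx["eccentricity"] - query["eccentricity"]) > t:
            continue  # eccentricity does not match: skip shape comparison
        score = (vector_difference(query["rows"], idx["rows"])
                 + vector_difference(query["cols"], idx["cols"]))
        scored.append((score, image_id))
    scored.sort()
    return [image_id for _, image_id in scored]


query = {"rows": [2, 4, 4, 2], "cols": [2, 4, 4, 2], "eccentricity": 1.0}
candidates = {
    "img1": {"rows": [2, 4, 4, 2], "cols": [2, 4, 3, 2], "eccentricity": 1.1},
    "img2": {"rows": [2, 4, 4, 2], "cols": [2, 4, 4, 2], "eccentricity": 1.05},
    "img3": {"rows": [2, 4, 4, 2], "cols": [2, 4, 4, 2], "eccentricity": 2.0},
}
print(rank_by_shape(query, candidates, t=0.2))
```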
Queries for multimedia objects
●
Query Model
– A query model for searching multimedia objects in a
database or a file needs to satisfy the following
requirements:
– Consider that a match between the value of an attribute of
a multimedia object and a given constant is not exact, i.e.,
must account for the grade of match.
– Allow users to specify thresholds on the grade of match of
the acceptable objects.
– Enable users to ask for only a few top-matching objects.
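The three requirements (graded match, threshold on the grade, top-k) can be sketched in a few lines. The grade function, the attribute name and the data below are hypothetical; any function returning a grade in [0, 1] would fit.

```python
def top_matches(objects, grade, threshold=0.0, k=None):
    """Grade every object, drop those below the threshold, and return
    (grade, object) pairs in decreasing order of grade, at most k of them."""
    graded = [(grade(obj), obj) for obj in objects]
    accepted = [pair for pair in graded if pair[0] >= threshold]
    accepted.sort(key=lambda pair: pair[0], reverse=True)
    return accepted if k is None else accepted[:k]


def brightness_grade(obj, wanted=0.8):
    """Hypothetical grade of match: 1.0 for an exact match on brightness,
    decreasing linearly with the difference."""
    return max(0.0, 1.0 - abs(obj["brightness"] - wanted))


images = [
    {"name": "sunset", "brightness": 0.8},
    {"name": "forest", "brightness": 0.7},
    {"name": "cave", "brightness": 0.2},
    {"name": "beach", "brightness": 0.75},
]
best = top_matches(images, brightness_grade, threshold=0.5, k=2)
print([obj["name"] for _, obj in best])
```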
Queries for multimedia documents
●
Four main phases of query processing:
– During the preprocessing phase, parsing and catalog access are
performed, and the query is modified in light of the type
hierarchy.
– The multi-cluster query resolution phase determines the set of
document clusters that must be accessed. Because document distribution
across the clusters is transparent to applications, evaluating a
query requires determining which clusters
contain documents that can potentially satisfy the query.
– Once the set of clusters involved in the query is determined, the
single-cluster query optimization phase is performed and a query
processing strategy is defined for each cluster.
– The query execution phase applies the strategies defined in the
previous phase.
●
Predicates in a query are divided into four classes:
●
Structure predicates. These predicates are evaluated by accessing
the system catalogs.
●
Index predicates. These predicates are evaluated by using the
indexes.
●
Text predicates. These predicates are evaluated by means of
signature scanning.
●
Residual predicates. These are predicates on components for
which there are no access structures and so can only be evaluated by
accessing the documents. This is the case for data attributes with no
indexes. In addition, predicates defined on spring nodes belong to
this class.
●
Index query. A query issued against the index segments by using the
access paths provided by the index handler.
●
Text query. A query issued against the signature segments by using the
access paths provided by the signature handler.
●
Document query. A query issued against the bulk storage segments by
using the access paths provided by the bulk storage handler.
●
Query Preprocessing Phase
– Parsing. The query is parsed by a conventional parser.
– Catalog Access. After parsing of the query, the definitions of the
conceptual types appearing in the query are retrieved from the system
catalogs.
– Component Checking. If the query contains a type clause, then the
conceptual components present in the query are verified as belonging to
the specified types.
Shape based multimedia retrieval
●
Registration: Given two 3D models, align them
optimally; compute the geometric similarity between
them;
●
Retrieval. Given a database of 3D models and a geometric
query, find the models that best match the query;
●
Recognition. Given a database of 3D models and a query
model, either find the query model in the database or
determine it is not there;
●
Verification. Given a 3D model and a specification,
determine whether they match to within some tolerance;
●
Clustering. Given a database of 3D models, automatically
partition them into a set of classes;
●
Feature detection. Given a 3D model, find geometric
features of interest on its surface;
●
Classification. Given a set of model class specifications
and a query model, determine the class to which the
query model belongs;
●
Segmentation. Partition a given 3D model into its salient
parts;
●
Semantic labeling. Infer semantic meaning regarding the
purpose and function of a given 3D model;
●
Synthesis. Automatically synthesize new examples typical
of a given model class specification;
Indexing and retrieval
●
Used for pdf files
●
Indexing
– Each video sample is processed by the text recognition
software. For each frame the recognized characters are
stored after deletion of all text lines with fewer than 3
characters
●
Retrieval
– Video sequences are retrieved by specifying a search string.
Two search modes are supported:
●
exact substring matching and
●
approximate substring matching.
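Both steps can be sketched as follows. The indexing filter follows the rule stated above (drop text lines with fewer than 3 characters). For approximate substring matching the sketch uses Sellers' dynamic-programming algorithm with an edit-distance budget; the source does not say which algorithm is used, so this is a swapped-in technique, and all names are hypothetical.

```python
def index_frame(text_lines):
    """Keep only recognized text lines with at least 3 characters."""
    return [line for line in text_lines if len(line) >= 3]


def exact_match(text, pattern):
    """Exact substring matching."""
    return pattern in text


def approx_match(text, pattern, max_errors):
    """True if some substring of `text` is within `max_errors` edits
    (insert, delete, substitute) of `pattern` (Sellers' algorithm)."""
    # prev[i]: best edit distance between pattern[:i] and a substring
    # of the text ending at the previous text position.
    prev = list(range(len(pattern) + 1))
    best = prev[-1]
    for ch in text:
        curr = [0]  # a match may start at any position in the text
        for i, pc in enumerate(pattern, start=1):
            cost = 0 if pc == ch else 1
            curr.append(min(prev[i - 1] + cost,  # match or substitute
                            prev[i] + 1,         # extra character in text
                            curr[i - 1] + 1))    # missing character in text
        best = min(best, curr[-1])
        prev = curr
    return best <= max_errors


frame = index_frame(["OK", "BREAKING NEW5 TONIGHT", "a"])
print(frame)
print(exact_match(frame[0], "NEWS"))
print(approx_match(frame[0], "NEWS", max_errors=1))
```

Approximate matching is useful here because text recognition often confuses similar glyphs (for example "S" and "5"), which would defeat an exact search.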
Shape based multimedia retrieval
●
FIBSSR – Feature Index-based Similar Shape Retrieval
– A general and flexible shape similarity-based approach that
enables retrieval of both rigid and articulated shapes.
●
Spatial Access based Retrieval Methods
– Space-Filling Curves
●
Assume a finite precision in the representation of each coordinate,
say K bits.
●
The address space is a square (the image), represented as a
$2^K \times 2^K$ array of 1 × 1 squares (pixels).
– R-Trees
●
Z-ordering, R-trees and variants
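Z-ordering maps each pixel of the grid to a single integer by interleaving the bits of its two coordinates. A minimal sketch, assuming K-bit coordinates; which coordinate supplies the higher bit of each pair is a convention, and the function name is hypothetical.

```python
def z_value(x, y, bits):
    """Interleave the bits of x and y (each `bits` wide) into one
    Z-order address; x supplies the higher bit of each pair."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i + 1)
        z |= ((y >> i) & 1) << (2 * i)
    return z


# Z-order addresses of a 4 x 4 grid (K = 2), one row per x value.
for x in range(4):
    print([z_value(x, y, bits=2) for y in range(4)])
```

Pixels that are close in the grid tend to receive close addresses, so a one-dimensional index (for example a B-tree) over the z-values can serve spatial queries.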
Content based retrieval methods
●
Retrieving stored images from a collection by comparing features
automatically extracted from the images themselves
●
measures of color, texture or shape
●
Color retrieval
– Each image added to the collection is analyzed to compute a color
histogram which shows the proportion of pixels of each color
within the image.
●
Texture retrieval
– comparing values of what are known as second-order statistics
calculated from query and stored images
●
Shape retrieval
– A number of features characteristic of object shape are
computed for every object identified within each stored
image
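Color retrieval as described above can be sketched as follows. The histogram is the proportion of pixels per color; comparing two histograms by the sum of absolute differences is an assumption (the source does not name a comparison measure), and pixels are assumed to be already quantized to color labels.

```python
from collections import Counter


def color_histogram(pixels):
    """Proportion of pixels of each (quantized) color within the image."""
    if not pixels:
        raise ValueError("image has no pixels")
    counts = Counter(pixels)
    total = len(pixels)
    return {color: n / total for color, n in counts.items()}


def histogram_distance(h1, h2):
    """Sum of absolute differences between two histograms (assumed measure)."""
    colors = set(h1) | set(h2)
    return sum(abs(h1.get(c, 0.0) - h2.get(c, 0.0)) for c in colors)


stored = color_histogram(["red", "red", "green", "blue"])
query = color_histogram(["red", "red", "red", "red"])
print(stored)
print(histogram_distance(stored, query))
```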
Retrieval using indexing
●
Objects are represented as collections of features
●
Similarity depends on context and frame of reference
●
Features are characterized by multiple multimodal feature
measures
●
Challenges in Indexing
– The index must be created using all features of an object
class
– Nodes in index tree show consistency with respect to the
context and frame of reference.
– Multiple multimodal feature measures should be fused
properly to generate index tree so that a valid
categorization can be possible.
Similarity based retrieval
●
Uses similarity measures
●
When presented with a sample facial image, similarity
retrieval occurs in the same way as pattern classification
happens using a decision tree.
●
Retrieval follows the tree down to the leaf nodes. At each
level, similarity measures determine the decision.
●
Using distance as the similarity measure, the index tree
selects the jth node in the next level if $d(x, t_j) = \min_i d(x, t_i)$, where
x is the sample image and $t_j$ is the template of the jth node.
●
At the leaf node level, all leaf nodes similar to the sample
image will be selected.
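The descent can be sketched as below. Templates are plain numbers and the distance is an absolute difference, purely for illustration; the threshold that decides which leaves count as "similar" is an assumption, as are all names.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    template: float
    children: list = field(default_factory=list)


def retrieve(root, x, distance, leaf_threshold):
    """Follow the minimum-distance child at each level, then return all
    leaves under the last internal node that are similar to x."""
    node = root
    # Descend while the children of the current node are internal nodes.
    while node.children and any(child.children for child in node.children):
        node = min(node.children, key=lambda child: distance(x, child.template))
    if not node.children:
        return [node]
    return [leaf for leaf in node.children
            if distance(x, leaf.template) <= leaf_threshold]


tree = Node(0.0, [
    Node(10.0, [Node(9.0), Node(10.5), Node(14.0)]),
    Node(50.0, [Node(48.0), Node(52.0)]),
])
hits = retrieve(tree, 10.2, lambda a, b: abs(a - b), leaf_threshold=1.5)
print([leaf.template for leaf in hits])
```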
Storing Multiple Media Strands