
A Memory Learning Framework for Effective Image Retrieval

ABSTRACT
Due to the rapidly growing amount of digital image data on the Internet and in digital libraries, there is a great need for large image database management and effective image retrieval tools. This project presents a framework for effective image retrieval that employs a novel idea of memory learning. It forms a knowledge memory model that stores semantic information by simply accumulating user-provided interactions. A learning strategy is then applied to predict the semantic relationships among images according to the memorized knowledge. Image queries are finally performed based on a seamless combination of low-level features and learned semantics. The feedback knowledge memory model and the learning strategy are jointly known as memory learning. Here, the spatial features of the images are used to retrieve them from large databases.

1. INTRODUCTION

1.1 OVERVIEW OF IMAGE RETRIEVAL SYSTEMS

An image retrieval system is a computer system for browsing, searching and retrieving images from a large database of digital images. Most traditional and common methods of image retrieval utilize some method of adding metadata such as captioning, keywords, or descriptions to the images so that retrieval can be performed over the annotation words. Manual image annotation is time-consuming, laborious and expensive; to address this, there has been a large amount of research on automatic image annotation. Image retrieval can be done with three main features:

Color
Texture
Edge

1.1.1 Overview of CBIR

"Content-based" means that the search analyzes the actual contents of the image. The term "content" in this context might refer to colors, shapes, textures, or any other information that can be derived from the image itself. Without the ability to examine image content, searches must rely on metadata such as captions or keywords, which may be laborious or expensive to produce. An image can be considered as a mosaic of different texture regions, and the image features associated with these regions can be used for search and retrieval. A typical query could be a region of interest provided by the user, such as an outlined vegetation patch in a satellite image. The input information in such cases is an intensity pattern or texture within a rectangular window.

Potential uses for CBIR include:

Art collections
Photograph archives
Retail catalogs
Medical records

CBIR software systems and techniques

Query techniques

Different implementations of CBIR make use of different types of user queries.

Query by example

Query by example is a query technique that involves providing the CBIR system with an example image that it will then base its search upon. Options for providing example images to the system include:

A preexisting image may be supplied by the user or chosen from a random set.
The user draws a rough approximation of the image they are looking for, for example with blobs of color or general shapes.

This query technique removes the difficulties that can arise when trying to describe images with words.
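As an illustration of query by example, the sketch below ranks database images against an example image using coarse color histograms compared by histogram intersection. This is a minimal sketch under stated assumptions, not the system described in this report: the class name QueryByExampleSketch, the 4 x 4 x 4 bin quantization, and the choice of histogram intersection as the similarity measure are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: rank database images against an example image
// by histogram intersection of coarse RGB color histograms.
class QueryByExampleSketch {
    static final int BINS_PER_CHANNEL = 4; // assumed quantization: 4 x 4 x 4 = 64 bins

    // Builds a normalized color histogram from packed 0xRRGGBB pixels.
    static double[] histogram(int[] rgbPixels) {
        double[] h = new double[BINS_PER_CHANNEL * BINS_PER_CHANNEL * BINS_PER_CHANNEL];
        for (int p : rgbPixels) {
            int r = ((p >> 16) & 0xFF) * BINS_PER_CHANNEL / 256;
            int g = ((p >> 8) & 0xFF) * BINS_PER_CHANNEL / 256;
            int b = (p & 0xFF) * BINS_PER_CHANNEL / 256;
            h[(r * BINS_PER_CHANNEL + g) * BINS_PER_CHANNEL + b] += 1.0;
        }
        for (int i = 0; i < h.length; i++) {
            h[i] /= Math.max(1, rgbPixels.length);
        }
        return h;
    }

    // Histogram intersection: close to 1.0 for identical normalized
    // histograms, 0.0 for histograms with no bins in common.
    static double similarity(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) {
            s += Math.min(a[i], b[i]);
        }
        return s;
    }

    // Returns database indices ordered from most to least similar to the example.
    static List<Integer> rank(int[] example, List<int[]> database) {
        double[] q = histogram(example);
        double[] scores = new double[database.size()];
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < database.size(); i++) {
            scores[i] = similarity(q, histogram(database.get(i)));
            order.add(i);
        }
        order.sort(Comparator.comparingDouble((Integer i) -> -scores[i]));
        return order;
    }
}
```

A caller would pass the pixels of the example image and a list of pixel arrays for the stored images, then present the images in the returned order. Color alone is a weak descriptor, which is one reason texture features are also considered.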

Current CBIR systems therefore generally make use of lower-level features like texture, color, and shape, although some systems take advantage of very common higher-level features.

Other query methods

Other methods include specifying the proportions of colors desired (e.g. "80% red, 20% blue") and searching for images that contain an object given in a query image. CBIR systems can also make use of relevance feedback, where the user progressively refines the search results by marking images in the results as "relevant", "not relevant", or "neutral" to the search query, then repeating the search with the new information.

Relevance feedback (RF) was introduced into CBIR to improve retrieval performance, but it lacks a memory mechanism. The feedback knowledge memory model is presented to gather the users' feedback information during the process of image search and feedback. It is efficient and simple to implement. A learning strategy based on the memorized information is also proposed; it can estimate the hidden semantic relationships among images. Consequently, this technique can address the problem of user log sparsity to a certain extent.

During the interactive process, a seamless combination of normal RF (low-level feature based) and memory learning (semantics based) is proposed to improve the retrieval performance.

1.2 OVERVIEW OF GABOR FILTER

Gabor filters are used to extract fractional energies in various spatial-frequency channels. The system is able to serve queries ranging from scenes of purely natural objects such as vegetation, trees, sky, etc. to images containing conspicuous structural objects such as buildings, towers, bridges, etc. The Gabor filter (and Gabor wavelet) has been a popular tool to extract such frequency components from both color and grayscale images. A Gabor filter is capable of first locating and then analyzing regions of subtle texture differences. When the texture patterns are similar, color analysis can then be used. Gabor filters have been used in many applications, such as texture segmentation, target detection, fractal dimension management, document analysis, edge detection, retina identification, image coding and image representation.

Gabor filtering

This block implements one or multiple convolutions of an input image with a two-dimensional Gabor function.

To visualize a Gabor function, select the option "Gabor function" under "Output image". The Gabor function for the specified values of the parameters "wavelength", "orientation", "phase offset", "aspect ratio", and "bandwidth" will be calculated and displayed as an intensity map image in the output window.

Here is the formula of a complex Gabor function in the space domain:

    g(x, y) = s(x, y) \, w_r(x, y)

where s(x, y) is a complex sinusoid, known as the carrier, and w_r(x, y) is a 2-D Gaussian-shaped function, known as the envelope. The complex sinusoid is defined as follows:

    s(x, y) = \exp\left( j \left( 2\pi (u_0 x + v_0 y) + P \right) \right)

where (u_0, v_0) and P define the spatial frequency and the phase of the sinusoid, respectively. This sinusoid can be thought of as two separate real functions, conveniently allocated in the real and imaginary parts of a complex function. The real part and the imaginary part of this sinusoid are:

    \mathrm{Re}(s(x, y)) = \cos\left( 2\pi (u_0 x + v_0 y) + P \right)
    \mathrm{Im}(s(x, y)) = \sin\left( 2\pi (u_0 x + v_0 y) + P \right)

The parameters u_0 and v_0 define the spatial frequency of the sinusoid in Cartesian coordinates. This spatial frequency can also be expressed in polar coordinates as magnitude F_0 and direction \omega_0:

    F_0 = \sqrt{u_0^2 + v_0^2}
    \omega_0 = \tan^{-1}(v_0 / u_0)
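The carrier and envelope defined above can be sampled on a pixel grid as in the following sketch. It is illustrative only: the text does not give the exact form of the envelope w_r, so an isotropic Gaussian with width sigma is assumed here, and the class name GaborSketch and its method names are hypothetical.

```java
// Hypothetical sketch: samples a complex Gabor function
// g(x, y) = s(x, y) * w_r(x, y) on a square grid centered at the origin.
class GaborSketch {
    // Returns {real, imaginary} kernels with (2 * half + 1) rows and columns.
    // u0, v0: spatial frequency in cycles per pixel; phase: P in radians.
    // sigma: width of an isotropic Gaussian envelope. This envelope form
    // is an assumption; the text does not specify w_r.
    static double[][][] kernel(int half, double u0, double v0, double phase, double sigma) {
        int n = 2 * half + 1;
        double[][] re = new double[n][n];
        double[][] im = new double[n][n];
        for (int yi = 0; yi < n; yi++) {
            for (int xi = 0; xi < n; xi++) {
                int x = xi - half;
                int y = yi - half;
                double envelope = Math.exp(-(x * x + y * y) / (2.0 * sigma * sigma));
                double arg = 2.0 * Math.PI * (u0 * x + v0 * y) + phase;
                re[yi][xi] = envelope * Math.cos(arg); // Re(s) times envelope
                im[yi][xi] = envelope * Math.sin(arg); // Im(s) times envelope
            }
        }
        return new double[][][] { re, im };
    }

    // Magnitude F0 of the carrier frequency.
    static double magnitude(double u0, double v0) {
        return Math.hypot(u0, v0);
    }

    // Direction omega0 of the carrier frequency. atan2 is used instead of
    // atan(v0 / u0) so that u0 = 0 and all four quadrants are handled.
    static double direction(double u0, double v0) {
        return Math.atan2(v0, u0);
    }
}
```

Convolving an image with the real and imaginary kernels and summing the squared responses gives an energy value for that frequency channel, which is one common way such kernels are used as texture features.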

Conversely, the Cartesian components follow from the polar form:

    u_0 = F_0 \cos \omega_0
    v_0 = F_0 \sin \omega_0

Using this representation, the complex sinusoid is:

    s(x, y) = \exp\left( j \left( 2\pi F_0 (x \cos \omega_0 + y \sin \omega_0) + P \right) \right)

The Gabor space is very useful in image processing applications such as iris recognition and fingerprint recognition. Relations between activations for a specific spatial location are very distinctive between objects in an image.

1.3 SOFTWARE DESCRIPTION

Java is an object-oriented, multithreaded programming language developed by Sun Microsystems in 1991. It is designed to be simple and portable across different platforms as well as operating systems. The popularity of Java is due to its unique technology, which is designed on the basis of three key elements: the usage of applets, powerful programming language constructs, and a rich set of significant object classes.

Features of Java

Java was designed to meet real-world requirements with its features, which are explained in the following paragraphs.

Simple and portable

Java keeps itself simple by not having surprising features. Since it exposes the internal workings of the machine, the programmer can perform the desired action without fear. It is portable across multiple platforms.

Multithreaded

Java supports multithreaded programming, which allows the user to write programs that perform many functions simultaneously.

Security

The security manager determines what resources a class can access, such as reading and writing to the local disk.

Dynamic Binding

The linking of data and methods to where they are located is done at run-time. New classes can be loaded while a program is running. Linking is done on the fly.

JAI:

Java Advanced Imaging (JAI) is a Java platform extension API that provides a set of object-oriented interfaces that support a simple, high-level programming model which allows images to be manipulated easily in Java applications and applets. JAI goes beyond the functionality of traditional imaging APIs to provide a high-performance, platform-independent, extensible image processing framework.

Java Swing:

Swing is a widget toolkit for Java. It is part of Sun Microsystems' Java Foundation Classes (JFC), an API for providing a graphical user interface (GUI) for Java programs.

Swing was developed to provide a more sophisticated set of GUI components than the earlier Abstract Window Toolkit. Swing provides a native look and feel that emulates the look and feel of several platforms, and also supports a pluggable look and feel that allows applications to have a look and feel unrelated to the underlying platform.

Architecture

Swing is a platform-independent, Model-View-Controller GUI framework for Java. It follows a single-threaded programming model, and possesses the following traits:

Platform independence: Swing is platform independent both in terms of its expression (Java) and its implementation (non-native universal rendering of widgets).
Extensibility: Swing users can extend the framework by extending existing (framework) classes and/or providing alternative implementations of core components.
Component-Oriented: Swing is a component-based framework. Swing components are Java Beans components, compliant with the Java Beans Component Architecture specifications.
Customizable: Users can programmatically customize a standard Swing component (such as a JTable) by assigning specific Borders, Colors, Backgrounds, opacities, etc., as the properties of that component.
Configurable: Swing's heavy reliance on runtime mechanisms and indirect composition patterns allows it to respond at runtime to fundamental changes in its settings.
Lightweight UI: Swing's configurability is a result of a choice not to use the native host OS's GUI controls for displaying itself. Swing "paints" its controls programmatically through the use of Java 2D APIs, rather than calling into a native user interface toolkit.
Loosely-Coupled/MVC: The Swing library makes heavy use of the Model/View/Controller software design pattern, which conceptually decouples the data being viewed from the user interface controls through which it is viewed.
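The decoupling of data from the controls that display it can be seen with a Swing model class on its own. The sketch below is only an illustration: it uses javax.swing.DefaultListModel with a ListDataListener standing in for a view, and the class name SwingModelSketch and the image file names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import javax.swing.DefaultListModel;
import javax.swing.event.ListDataEvent;
import javax.swing.event.ListDataListener;

// Hypothetical sketch: a Swing model notifies its listeners of changes
// without knowing which view, if any, is displaying the data.
class SwingModelSketch {
    // Returns the notifications a listener receives while the model changes.
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        DefaultListModel<String> results = new DefaultListModel<>();

        // This listener plays the role of a view reacting to model events.
        results.addListDataListener(new ListDataListener() {
            @Override
            public void intervalAdded(ListDataEvent e) {
                log.add("added " + results.get(e.getIndex0()));
            }

            @Override
            public void intervalRemoved(ListDataEvent e) {
                log.add("removed index " + e.getIndex0());
            }

            @Override
            public void contentsChanged(ListDataEvent e) {
                log.add("changed index " + e.getIndex0());
            }
        });

        results.addElement("sunset.jpg"); // hypothetical retrieved image names
        results.addElement("forest.jpg");
        results.remove(0);
        return log;
    }
}
```

In an application, a JList constructed with the same model registers its own listener in the same way, so a result list on screen would update whenever the retrieval code changes the model.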

2. LITERATURE REVIEW

Relevance feedback for content-based image retrieval using Bayesian network

3. SYSTEM SPECIFICATION

3.1 HARDWARE SPECIFICATION

Monitor          : EGA / VGA
Keyboard         : 112-key multimedia keyboard
Processor        : Pentium IV processor
Hard Disk        : 40 GB
RAM              : 256 MB

3.2 SOFTWARE SPECIFICATION

Operating System : Windows XP
Language         : Java
Software         : JCreator Pro

4. SYSTEM ANALYSIS

4.1 EXISTING SYSTEM

The limited retrieval accuracy of image-centric retrieval systems is essentially due to the inherent gap between semantic concepts and low-level features. In order to reduce the gap, interactive relevance feedback (RF) was introduced into CBIR. The basic idea of RF is to incorporate human perception subjectivity into the query process and provide users with the opportunity to evaluate the retrieval results. The similarity measures are automatically refined on the basis of these evaluations. Although RF can significantly improve the retrieval performance, its applicability still suffers from three inherent drawbacks:

1) Incapability of capturing semantics.
2) Scarcity and imbalance of feedback examples.
3) Lack of a memory mechanism.

To overcome these difficulties, another method, generally called long-term learning, was introduced into CBIR. Long-term learning algorithms memorize and accumulate users' preferences in the RF process. They are mainly based on previous users' behaviors, which basically embody more semantic information than low-level features. The idea of long-term learning in CBIR is borrowed from the work on collaborative filtering and link structure analysis in web information retrieval. However, these algorithms inevitably encounter two problems in practice:

1. The memorized feedback information is sparse.
2. Existing long-term learning systems perform no learning, or only limited learning.

4.2 PROPOSED SYSTEM

A novel memory learning framework has been proposed to address those two issues. A feedback knowledge memory model is introduced to accumulate the previous users' preferences. Furthermore, a learning strategy is presented to predict hidden semantics using the memorized information, which is able to reduce the limitation of user log sparsity to a certain extent. The feedback knowledge memory model and the learning strategy are jointly known as memory learning.

Feedback knowledge memory model

The feedback images are provided to the system by the experts who use it. These feedback images are retrieved when the user gives the same query next time. Since the user log accumulates feedback knowledge from various users, the semantic correlations can reflect the preference of the majority of the users.

Learning strategy:

A learning strategy is used to estimate the hidden semantic correlation between two images without a "direct link."

Advantages:

Memory learning provides the normal RF with a pool of positive examples according to its captured knowledge, which helps the normal RF to alleviate the problem of scarcity and imbalance of feedback examples. It is able to automatically collect and analyze the users' historical judgments offline without additional cost of user interaction. Also, it hardly influences the speed of the real-time retrieval system.
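The two parts of memory learning can be sketched as follows. This is a rough illustration under stated assumptions rather than the report's actual algorithm: the memory is assumed to be a matrix of co-relevance counts, the hidden correlation between two images with no direct link is assumed to be estimated through the best intermediate image, and the class name MemoryLearningSketch and the weight alpha are hypothetical.

```java
import java.util.List;

// Hypothetical sketch of a feedback knowledge memory and a simple learning step.
class MemoryLearningSketch {
    // coPositive[i][j]: number of feedback sessions in which images i and j
    // were both marked relevant (assumed representation of the memory).
    final int[][] coPositive;

    MemoryLearningSketch(int imageCount) {
        coPositive = new int[imageCount][imageCount];
    }

    // Memory model: accumulate one feedback session, given the ids of the
    // images the user marked as relevant.
    void record(List<Integer> relevant) {
        for (int i : relevant) {
            for (int j : relevant) {
                if (i != j) {
                    coPositive[i][j]++;
                }
            }
        }
    }

    // Direct semantic correlation, scaled to [0, 1] by the largest count seen.
    double direct(int i, int j) {
        int max = 0;
        for (int[] row : coPositive) {
            for (int c : row) {
                max = Math.max(max, c);
            }
        }
        return max == 0 ? 0.0 : (double) coPositive[i][j] / max;
    }

    // Learning step (assumed rule): when i and j share no direct link,
    // estimate their correlation through the best intermediate image k.
    double estimated(int i, int j) {
        double d = direct(i, j);
        if (d > 0.0 || i == j) {
            return d;
        }
        double best = 0.0;
        for (int k = 0; k < coPositive.length; k++) {
            best = Math.max(best, direct(i, k) * direct(k, j));
        }
        return best;
    }

    // Combined ranking score: alpha weights low-level feature similarity
    // against learned semantics (alpha is a hypothetical tuning parameter).
    double score(double featureSimilarity, int query, int candidate, double alpha) {
        return alpha * featureSimilarity + (1.0 - alpha) * estimated(query, candidate);
    }
}
```

With this rule, two images that were never marked relevant in the same session can still receive a non-zero semantic score if both co-occur with a third image, which is the sense in which a learning step can reduce the effect of a sparse user log.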
