
Proceedings of the Second International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC 2018)

IEEE Xplore Part Number: CFP18OZV-ART; ISBN: 978-1-5386-1442-6

Content Based Remote-Sensing Image Retrieval with Bag of Visual Words Representation

Amruta Rudrawar
Information Technology, Shri Guru Gobind Singh College of Engineering and Technology, Nanded
Email: 2016mit017@sggs.ac.in

Abstract—Retrieval of images plays a noteworthy part in many areas, including medical diagnosis, biometrics, geographical satellite data systems, web search and historical research. As the size of image databases grows continuously, applications involving images face new difficulties and significant issues in indexing, learning and retrieval. An efficient retrieval system is required to retrieve images from a visual or audio database. Content-based image retrieval (CBIR) is an image retrieval technique that retrieves images efficiently using low-level image features such as texture, shape and color. In a CBIR framework, a query image is described by the features stored in the database. The work reported here has three steps. First, the images in the dataset are split into training and validation sets. Second, SURF features are extracted from the images and represented as a bag of visual words using clustering and image indexing. Third, retrieval is performed using cosine similarity. All these steps are carried out on remote sensing images. The technique does not require any relevance feedback for retrieval, it reduces annotation work, and it returns results similar to the query.

Index Terms—Content Based Image Retrieval, Relevance Feedback, SVM, Remote Sensing Images.

I. INTRODUCTION

Image retrieval is a much needed task nowadays. It can be based on text, context, content, keywords, tags, images, etc. Keyword based image retrieval has limitations in the proper retrieval of results: the performance of tag-matching retrieval approaches depends heavily on the availability and quality of manual labels, whereas keywords and tags are expensive to obtain and often ambiguous. Because of these drawbacks of conventional text-based image retrieval (TBIR) systems, CBIR has attracted research attention as a way to improve image retrieval performance. Interest in the potential of digital images has grown enormously over the last few years, fuelled at least in part by the rapid development of imaging on the World-Wide Web (referred to in this report as 'the Web'). Users in many professional fields are exploiting the opportunities offered by the ability to access and manipulate remotely stored images in a wide range of new ways [1].

The three fundamental topics of this work are a novel active learning technique, content based image retrieval, and relevance feedback. Active learning is a machine learning technique that decides which results should be shown to the user so that they are more closely related to the given query. Content based means that the query given to the system is itself content: an image, an audio clip or a video. Relevance feedback is given by the user: when results are retrieved at the first trial, the user labels the outcomes as relevant, irrelevant or uncertain images, and the next results are produced on the basis of this feedback.

With the enormous development of satellite imaging and image retrieval, one of the most demanding and emerging applications is the accurate retrieval of remote sensing images from an enormous archive for a given user query. Traditional remote sensing image retrieval frameworks generally rely on keywords, phrases and labels such as sensor type, geographical zone and area, and acquiring this information takes additional time when the images are stored in the archive. Matching keywords and finding results in this way is the usual approach; however, with such information retrieval frameworks, labels are often rare, expensive to obtain and time consuming to produce. Due to these drawbacks, recent research has concluded that the content of remote sensing data is more important than manual labels [1].

II. RELATED WORK

The earliest use of the term content-based image retrieval in the literature was made in [2], to describe experiments on the automatic retrieval of images from a database by color and shape features. CBIR draws many of its techniques from the fields of image processing and computer vision, and is considered a subset of those fields. Image processing covers a significantly wider field, including image enhancement, compression, transmission and interpretation. While there are grey areas (for example, object recognition by feature analysis), the distinction between standard image analysis and CBIR is usually clear. An example may make this understandable. Many police forces now use automatic face recognition systems. Such systems can be used in one of two ways. First, the image in front of the camera may be compared with a single person's database record to confirm his or her identity; in this case only two images are matched, a process few would call CBIR. Second, the whole database may be searched to find the most closely matching images; this is a genuine example of CBIR.

Some of the key points regarding CBIR implementation are:
1) understanding images and users' needs;
2) identification of appropriate methods for representing image content;
3) extraction of such elements from raw images;
4) provision of compact storage for large image databases;
5) matching query and stored images in a way that reflects human similarity judgments;
6) efficient retrieval of images by content;
7) provision of usable human interfaces to CBIR frameworks.

In CBIR the importance of features is understood in the feature extraction phase: the query image or the images from the archives are taken as input and their features are extracted. Features such as color, texture, shape, salient, global and object features are extracted depending on the CBIR system [3], [4], [5], [6], [7].

TABLE I
FEATURES AND TECHNIQUES

Author Name                     Feature   Technique           Accuracy
Sergyan Szabolcs [3]            Color     Colour Histogram    87%
Sandhya R. Shinde et al. [4]    Color     Color Moments       81.25%
Haridas et al. [5]              Texture   Wavelet Transform   75%
P. V. N. Reddy et al. [6]       Texture   Gabor Filters       81.7%
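As an illustration of the colour-histogram technique listed in Table I, the following minimal Python sketch (OpenCV and NumPy assumed; the bin counts and normalization are an illustrative choice, not taken from [3]) computes a normalized color histogram that can serve as a global feature vector:

import cv2
import numpy as np

def color_histogram(image_path, bins=(8, 8, 8)):
    """Compute a normalized 3-D color histogram as a global feature vector."""
    img = cv2.imread(image_path)                      # BGR image
    hist = cv2.calcHist([img], [0, 1, 2], None, bins,
                        [0, 256, 0, 256, 0, 256])
    hist = hist.flatten()
    return hist / (hist.sum() + 1e-12)                # normalize to sum to 1

# Two images can then be compared with any histogram distance,
# e.g. the Euclidean distance between their feature vectors.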

Heng Xu et al. [8] presented a Relevance Feedback (RF) model that captures the user's feedback data. When the system returns the initial retrieval results to the user and those results already satisfy the user's request, the retrieval procedure stops. Otherwise, the CBIR framework asks the user to give feedback on the data related to the query image, so that the framework obtains semantically similar and dissimilar images as positive and negative feedback samples. The initial retrieval results are then refined on the basis of the new feedback given by the user. When the results satisfy the user's request, the RF process stops and the final results are returned to the user; the RF step is performed iteratively until the user is happy with the refined results. The combination of deep learning and the RF model achieved a 23% performance improvement over applying deep learning alone for CBIR systems.
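A minimal sketch of such an iterative relevance-feedback loop (plain Python; the ranking, user-interaction and retraining callbacks are placeholders, not the deep-learning model of [8]):

from dataclasses import dataclass, field
from typing import Any, List

@dataclass
class Feedback:
    satisfied: bool
    relevant: List[Any] = field(default_factory=list)
    irrelevant: List[Any] = field(default_factory=list)

def relevance_feedback_search(query, archive, rank, ask_user, retrain, max_rounds=5):
    """Iterate retrieval and user feedback until the user is satisfied.
    rank(query, archive, model) -> ranked results
    ask_user(results) -> Feedback
    retrain(model, relevant, irrelevant) -> updated model"""
    model = None
    results = rank(query, archive, model)
    for _ in range(max_rounds):
        fb = ask_user(results)
        if fb.satisfied:
            break
        model = retrain(model, fb.relevant, fb.irrelevant)
        results = rank(query, archive, model)
    return results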
Steven C. H. Hoi et al. [9] proposed a unified framework for log-based relevance feedback that incorporates the log of feedback data into conventional relevance feedback schemes in order to learn more completely the relationship between low-level image features and high-level semantic concepts. Because of the noisy nature of log data, they proposed a novel learning scheme, named Soft Label Support Vector Machine, to handle the problem of unreliable log information.

Dacheng Tao et al. [10] observed that the performance of SVM-based relevance feedback is often poor when the number of labeled positive feedback samples is small. This is mainly due to three reasons: 1) an SVM classifier is unstable on a small training set; 2) the SVM's optimal hyperplane may be biased when the positive feedback samples are far fewer than the negative feedback samples; and 3) overfitting occurs because the number of feature dimensions is much higher than the size of the training set. To address the first two problems they proposed an asymmetric bagging based SVM (AB-SVM). For the third problem they combined the random subspace method with SVM for relevance feedback, named random subspace SVM (RS-SVM). Finally, by integrating AB-SVM and RS-SVM, an asymmetric bagging and random subspace SVM (ABRS-SVM) was built to tackle all three problems and further improve relevance feedback performance.

Lining Zhang [11] proposed a novel subspace learning scheme, conjunctive patches subspace learning (CPSL) with side information, for learning an effective semantic subspace by exploiting user historical feedback log data for a collaborative image retrieval (CIR) task. CPSL can effectively integrate the discriminative information of labeled log images, the geometrical information of labeled log images, and the weakly similar information of unlabeled images to learn a reliable subspace. They formulate this problem as a constrained optimization problem and then present a new subspace learning technique to exploit the user historical feedback log data.

III. ARCHITECTURE OF CBIR SYSTEM

Fig. 1. Proposed architecture of CBIR

The CBIR system works as follows. The user inputs a query image; the images from the archive database are then compared with it by extracting the features of the query image as well as of the archive images. The feature vectors produced by the feature extraction method are passed to a similarity measure and a distance measure for comparison and retrieval. Finally, images are retrieved on the basis of their relevance to the query image, with the help of the relevance feedback method. The most commonly used distance measure is the Euclidean distance.
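A minimal sketch of this query-by-example loop as described above (NumPy only; the feature extractor is left as a parameter because the concrete descriptor used in this work is introduced later):

import numpy as np

def retrieve(query_image, archive_images, extract_features, top_k=20):
    """Rank archive images by Euclidean distance to the query in feature space."""
    q = extract_features(query_image)
    feats = np.stack([extract_features(img) for img in archive_images])
    dists = np.linalg.norm(feats - q, axis=1)   # Euclidean distance to the query
    ranked = np.argsort(dists)                  # smallest distance first
    return ranked[:top_k]                       # indices of the top-k matches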

Lihui Shi et al. [12] formulated the learning problem as a top-irrelevant database image weighting problem in order to obtain the optimal similarity function. They proposed a maximum top precision similarity learning method (MTPS): they first split the dataset into a set of queries and a set of database images, and the queries are further split into a training set and a test set. They then apply the proposed similarity learning algorithm to the training set of queries and the database images to learn the similarity function, and finally use the learned similarity function to carry out the retrieval task on the test set of queries. The performance is evaluated by a top precision measure, with top precisions higher than 0.18.

Fig. 2. Different similarity measures

IV. EXISTING SYSTEM AND PROPOSED METHOD

There are various CBIR systems, such as TCAL (Triple Criteria Active Learning), DCAL (Double Criteria Active Learning) and random sampling, which are based on the user's relevance feedback and its iterations. The criteria involved in TCAL are 1) uncertainty, 2) diversity, and 3) density of images in the database, whereas DCAL includes only the first two criteria. The similarity measure and feature extraction methods are completely independent of these criteria. The results shown in the table below are for TCAL, DCAL and random sampling, which used an SVM (Support Vector Machine) with margin sampling to obtain the set of uncertain images, k-means clustering to obtain a diverse set of images, SIFT (Scale Invariant Feature Transform) for feature extraction, and Euclidean distance as the distance measure for retrieval [13].

RF (relevance feedback) is the feedback given by the user, i.e., the annotation of retrieved images into two classes, relevant and irrelevant. On the basis of this feedback the next results are generated so that they are more relevant to the query image. This scheme is used to improve the performance of a CBIR system.

When the input to an algorithm is too large to be processed and is suspected to be redundant, it can be transformed into a reduced set of features. This procedure is called feature extraction. Extraction of features from the image plays the principal part in content based image retrieval. Feature representation includes both visual features and metadata-based features: keywords and annotations are text-based features, while visual features such as shape, texture and color are image content and play a major part in image retrieval. Feature vectors describe the image characteristics; they represent a high-level description of the visual content of an image and are used for image indexing and retrieval. Color, texture, shape and edge are the commonly used visual features in CBIR. The following sections describe the color, texture and shape features used in CBIR [13].

Any CBIR framework basically consists of (at least) two modules: 1) a feature extraction module that derives a set of features for describing the images, and 2) a retrieval module that searches for and retrieves images similar to the query image.

In the remote sensing (RS) literature, several crude (i.e., low level) features have been introduced for retrieval purposes, for example intensity features, color features, shape features, texture features, and local invariant features. However, the low level features of an image have a very limited ability to represent and analyse the high-level concepts conveyed by RS images (i.e., their semantic content). This issue is known as the semantic gap between the low level features and the high-level semantic content, and it leads to poor CBIR performance [13].

Fig. 3. Existing system

There are two consecutive steps for applying the three criteria. The first step is SVM training with the relevant and irrelevant sets of images to obtain the set of most uncertain images. To achieve this, the well-known margin sampling technique is used, which selects the separating boundary that is optimal for separating the feature space, where the margin is a notion of the distance of a point of each class from that boundary: the larger the margin, the better the result. Only the support vectors matter, and margin sampling is therefore well suited to finding such support vectors.
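A minimal sketch of margin sampling for selecting the most uncertain images (scikit-learn assumed; this illustrates the general technique, not the exact implementation of [13]):

import numpy as np
from sklearn.svm import SVC

def most_uncertain(train_feats, train_labels, pool_feats, n_uncertain=20):
    """Train an SVM on labeled relevant (1) / irrelevant (0) images and return
    the indices of the pool samples closest to the decision boundary."""
    clf = SVC(kernel="rbf", gamma="scale")
    clf.fit(train_feats, train_labels)
    margins = np.abs(clf.decision_function(pool_feats))  # distance to the hyperplane
    order = np.argsort(margins)                           # smallest margin = most uncertain
    return order[:n_uncertain]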

The second step is k-means clustering: it forms clusters (groups) of features from the set of uncertain images, where each cluster represents a word. This is where the concept of the BoVW (bag of visual words) representation of images arises, i.e., a histogram of (SIFT) features. Its origin is in texture recognition: a histogram is a distribution of grey levels in an image, and it forms a bag. Textons play an important role in the formation of the bag of visual words, since texture is characterized by the repetition of basic elements known as textons. A bag is therefore a collection of image descriptors that represents the image dataset. After clustering, the density of each cluster is estimated and the retrieved images come from the densest clusters. Euclidean distance is used as the distance measure between the query image feature vector and the images in the dataset [13]; in a kernel-induced feature space it can be computed directly from kernel values:

\|\phi(X_i) - \phi(X_j)\|^2 = K(X_i, X_i) - 2K(X_i, X_j) + K(X_j, X_j)    (1)
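A minimal sketch of building such a bag-of-visual-words histogram by k-means quantization of local descriptors (scikit-learn assumed; the vocabulary size of 150 follows the experiment described later, the rest is illustrative):

import numpy as np
from sklearn.cluster import KMeans

def build_vocabulary(all_descriptors, k=150):
    """Cluster the pooled local descriptors; each centroid is a visual word."""
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit(all_descriptors)

def bovw_histogram(image_descriptors, vocabulary):
    """Quantize each descriptor to its nearest visual word and count occurrences."""
    words = vocabulary.predict(image_descriptors)   # nearest-centroid labels
    hist, _ = np.histogram(words, bins=np.arange(vocabulary.n_clusters + 1))
    return hist.astype(float)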
In the proposed system relevance feedback is not required and the results obtained are more efficient. There are four steps. First, a classifier is trained with positive and negative sets of images, and the dataset is randomly split into a 70% training set and a 30% validation set. The second step is to index the images, which maps each visual word to the images in which it occurs: each index entry maps a word to its occurrences in the image set. In this step the bag of features is created and the images are mapped through the index.
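A minimal sketch of the image index described above, i.e., an inverted index from visual words to their occurrences in the image set (plain Python; names are illustrative and the word IDs are assumed to come from the quantization step):

from collections import defaultdict

def build_inverted_index(image_word_lists):
    """image_word_lists: {image_id: iterable of visual-word ids in that image}.
    Returns {word_id: {image_id: occurrence count}}."""
    index = defaultdict(lambda: defaultdict(int))
    for image_id, words in image_word_lists.items():
        for w in words:
            index[w][image_id] += 1
    return index

def candidates(index, query_words):
    """Candidate images are the union of the posting lists of the query's words."""
    ids = set()
    for w in set(query_words):
        ids.update(index.get(w, {}).keys())
    return ids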
SURF feature extraction: SURF uses square-shaped (box) filters as an approximation of Gaussian smoothing. Filtering the image with a box filter is much faster if the integral image is used. SURF uses a blob detector based on the Hessian matrix to find features: the determinant of the Hessian matrix is used as a measure of local change around a point, and points are chosen where this determinant is maximal. The stages are interest point detection, local neighborhood description and matching. The standard version of SURF is several times faster than SIFT; it extracts fewer features or key points, but fast extraction and accurate matching are possible even with large datasets.

Steps of feature detection:

• Square-shaped (box) filters, computed with the integral image (a sketch follows this list):

  S(x, y) = \sum_{i=0}^{x} \sum_{j=0}^{y} I(i, j)    (2)

• Hessian matrix, where L_{xx}(p, \sigma) denotes the convolution of the second-order Gaussian derivative with the image at point p and scale \sigma:

  H(p, \sigma) = \begin{pmatrix} L_{xx}(p, \sigma) & L_{xy}(p, \sigma) \\ L_{yx}(p, \sigma) & L_{yy}(p, \sigma) \end{pmatrix}    (3)

• Scale-space representation and localization of points of interest:

  \sigma_{approx} = \text{current filter size} \times \frac{\text{base filter scale}}{\text{base filter size}}    (4)
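A minimal sketch of the integral image of Eq. (2) and of the constant-time box sums that make the square-shaped filters fast (NumPy; the inclusive indexing convention is an illustrative choice):

import numpy as np

def integral_image(img):
    """S(x, y) = sum of I(i, j) for all i <= x and j <= y, as in Eq. (2)."""
    return img.cumsum(axis=0).cumsum(axis=1)

def box_sum(S, top, left, bottom, right):
    """Sum of the image inside an inclusive rectangle using four lookups in S."""
    total = S[bottom, right]
    if top > 0:
        total -= S[top - 1, right]
    if left > 0:
        total -= S[bottom, left - 1]
    if top > 0 and left > 0:
        total += S[top - 1, left - 1]
    return total

# Box filters built from such sums approximate the Gaussian second derivatives
# L_xx, L_xy and L_yy that enter the Hessian determinant of Eq. (3).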
After feature extraction is done on the image sets, the bag of visual words and its occurrences are estimated for an efficient understanding of the image dataset, as shown in Fig. 4, where a visual vocabulary of 150 words is created.

Fig. 4. Visual word occurrences

The third step is estimating the distance between the query image feature vector and the indexed image feature vectors using cosine similarity as the similarity measure. Cosine similarity is used because a word may occur in more than one class and the magnitude of the vectors does not matter; it is therefore an efficient way to estimate the distance. L2 normalization is also performed to estimate the strongest features.

D_c(A, B) = 1 - S_c(A, B)    (5)

where D_c is the cosine distance and S_c is the cosine similarity.
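A minimal sketch of this third step, L2 normalization followed by cosine ranking of BoVW histograms as in Eq. (5) (NumPy; names are illustrative):

import numpy as np

def l2_normalize(v, eps=1e-12):
    return v / (np.linalg.norm(v) + eps)

def cosine_rank(query_hist, archive_hists, top_k=20):
    """Rank archive BoVW histograms by cosine distance D_c = 1 - S_c (Eq. 5)."""
    q = l2_normalize(query_hist)
    A = np.stack([l2_normalize(h) for h in archive_hists])
    similarity = A @ q              # S_c for unit-length vectors
    distance = 1.0 - similarity     # Eq. (5)
    return np.argsort(distance)[:top_k]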

V. DISCUSSION AND RESULTS

In this work the UC Merced Land Use remote sensing image dataset is used. There are 2100 images in total, in 21 different classes, where each class consists of 100 images. The categories are agricultural, airplane, baseballdiamond, beach, buildings, chaparral, denseresidential, forest, freeway, golfcourse, harbor, intersection, mediumresidential, mobilehomepark, overpass, parkinglot, river, runway, sparseresidential, storagetanks, and tenniscourt. Each image in the dataset is 256 x 256 pixels with a spatial resolution of 30 cm. We know the class labels, but in the application nothing is known about the dataset at the starting stage; we therefore train a classifier with training data and training labels to estimate the class of the query image, so that we can judge whether the retrieved images are from the same class or not.

In this experiment, in order to obtain the BoVW representations of the images (SURF descriptors), k-means clustering was applied to 58926 randomly chosen strongest features among the 409067 features extracted from the 2100 images, choosing k = 150. The SURF descriptors were then quantized by assigning the label of the nearest cluster.

In this paper we compared our retrieval results with the existing systems, 1) TCAL and 2) DCAL [13]. First, we applied our application on the basis of categorical retrieval to estimate the retrieval accuracy.

TABLE II
CATEGORICAL RETRIEVAL ESTIMATION

Method                  Precision   Category
CBIR Proposed Method    100%        Agricultural
TCAL                    100%        Agricultural
DCAL                    90%         Agricultural

Now, the results are retrieved from the set of 2100 images and also from all 21 categories.

Fig. 5. Agricultural image retrieval: (a) query image; (b) top 20 images from the archive.

Fig. 6. Query images (a)-(d).

The results for these queries are shown in Fig. 7, where Fig. 6(a) is an image from the agricultural class, Fig. 6(b) from the chaparral class, Fig. 6(c) from the harbor class, and Fig. 6(d) from the dense residential class.

Fig. 7. Top 20 images similar to the query from all categories.

TABLE III
PRECISION ON ALL CATEGORIES RETRIEVAL

Method                  Average Precision
CBIR Proposed Method    97.5%
TCAL                    94.42%
DCAL                    92.11%
Random                  92.69%
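A minimal sketch of how such precision values can be computed for top-20 retrieval, assuming the 21 known class labels are used as ground truth (here the average precision is read as the mean of per-query precision at 20, which is one plausible interpretation of Table III):

import numpy as np

def precision_at_k(retrieved_labels, query_label, k=20):
    """Fraction of the top-k retrieved images sharing the query's class."""
    retrieved_labels = np.asarray(retrieved_labels[:k])
    return float(np.mean(retrieved_labels == query_label))

def mean_precision(results, labels, k=20):
    """results: {query_id: ranked list of retrieved image ids};
    labels: {image_id: class label}."""
    scores = [precision_at_k([labels[i] for i in retrieved], labels[q], k)
              for q, retrieved in results.items()]
    return float(np.mean(scores))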

VI. CONCLUSION

In this experiment, the results show that, without giving any feedback to the system and only on the basis of descriptors and similarity measures, the system returns the top 20 similar images from the archive of 2100 images. This system is helpful in cases where the user's annotation task is not required, or is negligible. The computation time required for extracting features is lower than that of the existing system, i.e., 0.45 seconds for grouping the features into words. Moreover, the extracted features are the strongest features, which are useful for retrieval. The retrieval accuracy depends on the category of the query image and therefore varies between approximately 80% and 98%.

Future work on this experiment would be to retrieve results with higher accuracy and on larger datasets.

ACKNOWLEDGMENT

This paper was written and the experiment carried out under the guidance of Prof. G. K. Pakle of Shri Guru Gobind Singh College of Engineering and Technology.

REFERENCES

[1] V. N. Gudivada and V. V. Raghavan, "Design and evaluation of algorithms for image retrieval by spatial similarity," ACM Transactions on Information Systems, vol. 13, no. 2, pp. 115-144, 1995.
[2] T. Kato, "Database architecture for content-based image retrieval," in Image Storage and Retrieval Systems (A. A. Jambardino and W. R. Niblack, eds.), Proc. SPIE 1662, pp. 112-123, 1992.
[3] Sandhya R. Shinde et al., "Experiments on Content Based Image Classification using Color Feature Extraction," IEEE, 2015.
[4] Sergyan Szabolcs, "Color histogram features based image classification in content-based image retrieval systems," Applied Machine Intelligence and Informatics, pp. 221-224, 2008.
[5] H. B. Kekre, Tanuja K. Sarode, and Sudeep D. Thepade, "Image Retrieval using Color-Texture Features from DCT on VQ Codevectors obtained by Kekre's Fast Codebook Generation," ICGST International Journal on Graphics, Vision and Image Processing (GVIP), vol. 9, issue 5, September 2009.
[6] P. V. N. Reddy and K. Satya Prasad, "Color and Texture Features for Content Based Image Retrieval," International Journal of Computer Technology and Applications, vol. 2, no. 4, pp. 1016-1020, 2011.
[7] K. Haridas et al., "Well-Organized Content based Image Retrieval System in RGB Color Histogram, Tamura Texture and Gabor Feature," International Journal of Advanced Research in Computer and Communication Engineering, vol. 3, 2014.
[8] Heng Xu et al., "Relevance Feedback for Content-Based Image Retrieval Using Deep Learning," 2017 2nd International Conference on Image, Vision and Computing.
[9] Steven C. H. Hoi et al., "A Unified Log-Based Relevance Feedback Scheme for Image Retrieval," IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 4, April 2006.
[10] Dacheng Tao et al., "Asymmetric Bagging and Random Subspace for Support Vector Machines-Based Relevance Feedback in Image Retrieval," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 7, July 2006.
[11] Lining Zhang, "Conjunctive Patches Subspace Learning With Side Information for Collaborative Image Retrieval," IEEE Transactions on Image Processing, vol. 21, no. 8, August 2012.
[12] Majid Fakheri et al., "Gabor wavelets and GVF for feature extraction in efficient content-based color and texture images retrieval," IEEE, 2011.
[13] Begüm Demir et al., "A Novel Active Learning Method in Relevance Feedback for Content-Based Remote Sensing Image Retrieval," IEEE Transactions on Geoscience and Remote Sensing, vol. 53, no. 5, May 2015.

