
Pattern Recognition 99 (2020) 107082


Pattern Recognition
journal homepage: www.elsevier.com/locate/patcog

Efficient nearest neighbor search in high dimensional Hamming space


Bin Fan a,b,c, Qingqun Kong b,c, Baoqian Zhang a,d, Hongmin Liu e,f,∗, Chunhong Pan a,b, Jiwen Lu g

a National Laboratory of Pattern Recognition, China
b Institute of Automation, Chinese Academy of Sciences, China
c University of Chinese Academy of Sciences, China
d China Foreign Affairs University, China
e School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China
f School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo 454000, China
g Tsinghua University, China

∗ Corresponding author at: School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China.
E-mail addresses: bfan@nlpr.ia.ac.cn (B. Fan), qingqun.kong@ia.ac.cn (Q. Kong), zbqmate@outlook.com (B. Zhang), hmliu_82@163.com (H. Liu), chpan@nlpr.ia.ac.cn (C. Pan), lujiwen@mail.tsinghua.edu.cn (J. Lu).
https://doi.org/10.1016/j.patcog.2019.107082

ARTICLE INFO

Article history:
Received 16 March 2019
Revised 17 August 2019
Accepted 12 October 2019
Available online 13 October 2019

Keywords:
Binary feature
Feature matching
Approximate nearest neighbor search
Scalable image matching

ABSTRACT

Fast approximate nearest neighbor search has been well studied for real-valued vectors; however, the methods for binary descriptors are less developed. This paper addresses the problem by resorting to well established techniques in Euclidean space. To this end, the binary descriptors are first mapped into low dimensional float vectors such that the neighborhood information in the original Hamming space is preserved in the mapped Euclidean space as much as possible. Then, a KD-Tree is used to partition the mapped Euclidean space in order to quickly find approximate nearest neighbors for a given query point. Owing to the neighborhood preserving property, this is equivalent to filtering out a subset of nearest neighbor candidates in the original Hamming space. Finally, Hamming ranking is applied to this small number of candidates to find the approximate nearest neighbor in the original Hamming space, with only a fraction of the running time of a bruteforce linear scan. Our experiments demonstrate that the proposed method significantly outperforms the state of the art, obtaining improved search accuracy at various speed up factors, e.g., at least a 16% improvement in search accuracy over previous methods (from 67.7% to 83.7%) when the search speed is 200 times faster than the linear scan on a one million database.

1. Introduction

Finding the nearest neighbor of high-dimensional features against a large scale database is of critical importance in pattern recognition and computer vision applications, such as image retrieval [1], data clustering [2], and structure from motion [3]. Since the exhaustive linear scan for the nearest neighbor has linear complexity in the size of the dataset, it is extremely computationally expensive for large scale databases. As a result, researchers have proposed various approximate nearest neighbor (ANN) search methods with sublinear complexity to overcome this limitation. These methods, such as KD-Tree [4], vocabulary tree [5], lower bound tree [6], orthogonal tree [7], and hashing [8], usually perform well with real-valued features, enabling very fast approximate nearest neighbor search for various local descriptors like SIFT [9], LIOP [10], and SOSNet [11]. To ensure a good trade-off between precision and matching speed, the dimension of the considered real-valued features should not be too high.

In addition to real-valued features, representing image data or local patches as binary codes (binary descriptors) has gained growing interest. This is because binary descriptors are storage efficient and their Hamming distance can be computed very fast with a few machine instructions. These advantages have facilitated the fast development of binary descriptors, both in terms of hand-crafted features (e.g., BRISK [12], FRIF [13]) and learning based ones (e.g., ORB [14], RFD [15], BinBoost [16]). Although computing the Hamming distance of binary descriptors is substantially faster than computing the Euclidean distance of real-valued descriptors, it remains too slow when millions of such descriptors have to be matched, which is often the case for many applications nowadays. On the other hand, directly applying the ANN methods designed for real-valued descriptors to binary ones is either inapplicable or less effective (i.e., the performance will be severely degraded), as evidenced in [17,18]. Therefore, it is highly desirable to develop specific ANN methods for binary descriptors.

Compared to the ANN methods for real-valued descriptors, little attention has been paid to ANN methods for binary descriptors. The most commonly used method [18] for this purpose is adapted from the well-known Locality Sensitive Hashing (LSH) [8] by randomly taking several bits of the binary descriptor as the hash key. This is analogous to the hashing functions used when dealing with real-valued descriptors by hashing techniques. Two widely used hashing strategies, multiple tables and multi-probes [19], are used to improve the performance as well. Another well known ANN method for binary descriptors is the hierarchical clustering trees (HCT) [20], which divide the database hierarchically by randomly selecting binary descriptors. These two kinds of ANN methods have been the primary choices for fast approximate nearest neighbor search in Hamming space. However, they are still less efficient (both in terms of speed and accuracy), as we will show in our experiments.

In this paper, we propose a fast approximate nearest neighbor search method for binary descriptors that resorts to the well developed ANN methods in Euclidean space, so as to further improve the matching speed of the state of the art without loss of search precision. Previous work on ANN has demonstrated that converting real-valued features to binary ones (i.e., the hashing technique) can accelerate the matching of real-valued features; in this paper, we demonstrate that projecting binary descriptors to real-valued ones is also helpful for fast approximate nearest neighbor search in Hamming space. The core idea of our method is based on two observations from previous methods: (1) The ANN methods in Euclidean space, such as KD-Tree, are highly reliable when the real-valued features being handled are of low dimensionality. (2) The key to fast approximate nearest neighbor search is to quickly find a small subset of candidates containing the true nearest neighbor with high probability. Motivated by these, our method first projects the binary descriptors into low-dimensional real-valued vectors while preserving their neighboring relationships as much as possible (e.g., using Locality Preserving Projections [21]), and then uses the KD-Tree (a typical ANN method in Euclidean space) to quickly find a small number of candidates for a given query point. Finally, the Hamming distances of all candidates to the query point are computed and sorted to find the nearest neighbor. In other words, our ANN method in Hamming space is leveraged by using KD-Tree in Euclidean space. As we will show in the experiments, this method significantly improves on the state of the art, with a much larger acceleration rate at the same precision. The success of our method can be attributed to two factors. First, since the purpose of applying KD-Tree to the projected low-dimensional real-valued features is to return a number of candidates that are further checked by Hamming ranking, the projected real-valued features are not required to preserve the exact neighborhood information with high confidence. They are only loosely required to keep the true nearest neighbor among the candidates returned by fast search in Euclidean space. For this purpose, learning a locality preserving projection from the dataset is not a hard task. Second, applying KD-Tree to low-dimensional vectors is highly reliable, so the risk of incorrect candidates being returned by searching the KD-Tree is minimal. Therefore, our method smartly combines two existing techniques (Locality Preserving Projection and KD-Tree) into a novel and effective solution for fast nearest neighbor search of high dimensional binary descriptors, a problem on which each of the two techniques alone performs quite poorly. Due to these properties, we term our method Binary Neighborhood Preserving KD-Tree (BNP-KDTree). It is worth pointing out that the proposed method is a framework for conducting fast approximate nearest neighbor search in Hamming space by going back to the Euclidean space, and so any ANN method in Euclidean space can replace the KD-Tree used in this paper.

The rest of this paper is organized as follows. Section 2 introduces the related work. Section 3 elaborates our method in detail with analysis and discussions about its properties. Then, experiments are conducted in Section 4 to show the effectiveness and efficiency of the proposed method as well as to compare it to the state of the art. Conclusions are drawn in Section 5.

2. Related work

2.1. Fast nearest neighbor search in Euclidean space

The methods for fast nearest neighbor search in Euclidean space can be mainly divided into two categories: one is tree based, while the other builds hash tables. KD-Tree [22] is perhaps the most popular data structure for indexing real-valued vectors, and is essentially a form of balanced binary tree. Each node of the tree splits the dataset into two parts according to a threshold value at a selected dimension. The dimension is usually selected as the one with the greatest variance among the data points belonging to this node, and nodes are iteratively added to the KD-Tree until the sizes of the leaf nodes are less than a predefined number. The KD-Tree has been widely used in the computer vision community for various applications, such as object recognition [9], image retrieval [23], and structure from motion [24]. Silpa-Anan and Hartley [4] proposed randomization techniques to make the searches in multiple KD-Trees as independent as possible, so as to improve performance when dealing with high dimensional descriptors such as the 128-d SIFT descriptor. Instead of the hyperplane defined by a coordinate axis, Jia et al. [23] proposed to partition the data space by hyperplanes that are linear combinations of several coordinate axes. While KD-Tree and its variants partition the data space by hyperplanes, another kind of tree structure uses a clustering algorithm to decompose the data space. Typical methods of this kind include the hierarchical k-means tree [25] and the vocabulary tree [5]. Besides, other tree structures (principal axis tree [26], lower bound tree [6], orthogonal tree [7]) and the triangular inequality [27] have also been used for fast nearest neighbor search. Muja and Lowe [18] presented a wide range of comparisons showing that the multiple randomized KD-Trees proposed in [4] are the most effective for matching high dimensional descriptors. They also proposed a priority search k-means tree, which performs on par with the multiple randomized KD-Trees in some cases.

Hashing is also a very popular technique for the nearest neighbor search problem. It converts real-valued vectors into binary codes, which are used to build hash tables. Then, data points falling into the same or nearby hash buckets as the query point are taken as candidate points, from which the nearest neighbor is obtained by ranking their distances. This technique has sublinear complexity since it only needs to compute a small number of distances compared to the exhaustive linear scan. How to obtain the hashing functions used to generate binary codes is at the core of this kind of method. Locality Sensitive Hashing (LSH) [8] generates random projections from Gaussian distributions as hashing functions so as to map similar data points to the same bucket with high probability. It usually needs longer hash codes to achieve satisfactory performance. Multi-probe LSH [19] improves LSH by retrieving data points from nearby hash buckets, enabling higher accuracy with lower storage. Cheng et al. [3] applied the idea of cascading to LSH and proposed CasHash for fast nearest neighbor search. To obtain better performance with smaller codes, many researchers have proposed to learn data-dependent hashing functions from labeled data by exploring various machine learning techniques and the structural information embedded in data, such as kernels [28], spectral embedding [29], ranking information [30], and deep learning [1]. Discrete optimization [31] has also been applied to learn hashing functions without continuous relaxation.

These hashing based nearest neighbor search methods can only use rather short hash codes; otherwise, the huge storage requirement of the corresponding hash tables makes them impractical.

2.2. Fast nearest neighbor search in Hamming space

Due to their efficiency in storage and in the computation of the Hamming distance, binary descriptors have developed quickly in the past years. However, their fast nearest neighbor search methods are less developed. Designing tree-based data structures is one prominent solution to this problem. Muja et al. [18] modified the hierarchical k-means tree [25] into the hierarchical clustering tree (HCT) so as to handle binary vectors. HCT hierarchically selects random binary codes from the database to form the tree nodes and uses them to cluster the data points in the database. Ma et al. [32] improved HCT by selecting distinctive bits to construct the clustering trees. Feng et al. [33] resorted to random trees with supervised learning of nodes, which makes the trees have uniform leaf sizes and low error rates.

Indexing by hash tables is another solution for fast search in Hamming space. The most common way is to randomly select a few bits as the hash key, which is often called LSH in the context of binary descriptors [33]. This method is less accurate due to the thick boundary in Hamming space [17]. To improve the search accuracy while maintaining high efficiency, Esmaeili et al. [34] proposed error weighted hashing (EWH), which checks the bucket distances between the retrieved candidates and the query point. Compared to LSH, EWH can filter out a large number of the candidate neighbors returned by the hash buckets so as to accelerate the search. Feng et al. [35] proposed to select hashing bits whose hash buckets are as uniform as possible and have high collision rates. Norouzi et al. [36] proposed multi-index hashing (MIH) for fast exact nearest neighbor search. MIH builds multiple hash tables for non-overlapping substrings of the binary codes. All the neighbors within a radius of the query point can then be quickly found by retrieving the hash buckets within a much smaller radius on these tables independently. Eghbali et al. [37] extended MIH to the online setting where the database grows dynamically. The key of these methods is to probe hash buckets within some radius around the hash key of the query point. However, for high dimensional binary descriptors, most buckets are empty and the nearest Hamming distance is usually large. Therefore, to guarantee the search precision, the search radius often has to be large enough to cover the nearest neighbor, in which case the number of probed buckets can even exceed the database size, so the speed could be even slower than linear scan.

Our method is specially designed for the fast Hamming search problem of high dimensional binary descriptors. It projects the high dimensional binary descriptors into low dimensional float vectors, which preserve the neighborhood information of the binary descriptors to some extent and are used to construct a KD-Tree for fast retrieval of a few candidates. The Hamming distances between these candidates and the query point are then computed and sorted to find the nearest neighbor. In other words, our solution for fast Hamming search is aided by a reliable fast search technique in Euclidean space (e.g., KD-Tree), which is well motivated and significantly different from previous methods.

3. The proposed method

3.1. Overview

The problem we solve is fast indexing of a database of high dimensional binary descriptors. Fig. 1 briefly depicts the proposed solution. In the train stage (also known as database construction), we need to construct a data structure that partitions the high dimensional Hamming space spanned by the database. For this purpose, we first project the binary descriptors into low dimensional float vectors by linear projections. These linear projections have to be capable of preserving the neighborhood information of the original binary descriptors, since our target is nearest neighbor search. For this reason, we use Locality Preserving Projections (LPP) [21] in this paper. LPP is an unsupervised method for learning linear projections that preserve the locality information as much as possible in the projected space. Then, the standard KD-Tree [22], an advanced method for fast indexing of low dimensional float vectors, is applied to the projected vectors to form a partition of them. Such a partition of the low dimensional Euclidean space is essentially a partition of the original binary descriptors, since all the projected vectors are generated from the original high dimensional binary descriptors. In the test stage (i.e., data query), we are given a query binary descriptor. The query descriptor is first projected into a low dimensional float vector through the linear projections learned by LPP in the train stage. The projected float vector is then used to retrieve a set of candidate binary descriptors from the database by traversing the KD-Tree constructed in the train stage. Finally, the Hamming distances between the query descriptor and the candidate descriptors are computed and sorted to obtain the nearest neighbor of the query data. The described procedures for database construction and data query are outlined in Algorithms 1 and 2 respectively. Since KD-Tree indexing can quickly filter out a small number of candidates, the proposed method only needs to compute a few Hamming distances when searching for the nearest neighbor in the database. Therefore, our method can query binary descriptors very efficiently, as we will show later by various experiments.

Algorithm 1 BNP-KDTree: Database Construction.
Input: a set of binary features b1, b2, ..., bn; the dimension k of the projected float vectors.
Output: k linear projections α1, α2, ..., αk, and the constructed KD-Tree K.
1: Compute the locality preserving projections by solving the eigenvector problem in Eq. (2), obtaining the k linear projections α1, α2, ..., αk.
2: Apply the obtained linear transformation A = [α1 α2 ... αk] to the dataset of binary features, obtaining a set of k-dimensional float vectors x1, x2, ..., xn, where xi = A^T bi.
3: Build a KD-Tree K for x1, x2, ..., xn according to Section 3.3.
4: Output: α1, α2, ..., αk, and one KD-Tree K.

Algorithm 2 BNP-KDTree: Data Query.
Input: a query binary descriptor b; k linear projections α1, α2, ..., αk; the constructed KD-Tree K; the number of returned candidates Q.
Output: nearest neighbor b′.
1: Compute the low dimensional float vector of the query descriptor by x = [α1 α2 ... αk]^T b.
2: Use x to traverse the KD-Tree K and return Q candidate data points from the database.
3: Compute the Q Hamming distances between the query data b and the returned candidates.
4: Sort the Q distances and take the candidate point with the smallest distance as the nearest neighbor of b, denoted b′.
5: Output: the nearest neighbor b′ of the query data b.
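To make Algorithms 1 and 2 concrete, the following is a minimal Python sketch of the train and query stages. It assumes the LPP projection matrix A (Eq. (2)) has already been computed, and it uses SciPy's cKDTree as a stand-in for the priority-search KD-Tree described in Section 3.3; it illustrates the framework rather than reproducing the authors' implementation.

```python
import numpy as np
from scipy.spatial import cKDTree

class BNPKDTree:
    """Minimal sketch of Algorithms 1 and 2. SciPy's cKDTree stands in for the
    priority-search KD-Tree of Section 3.3, and `A` is the m x k LPP projection
    matrix of Eq. (2); both choices are assumptions of this sketch, not the
    authors' implementation."""

    def __init__(self, binary_db, A, leafsize=50):
        self.db = binary_db                           # (n, m) array of {0, 1} binary features
        self.A = A                                    # (m, k) LPP projection matrix
        self.tree = cKDTree(binary_db @ A, leafsize=leafsize)

    def query(self, b, n_candidates=200):
        x = b @ self.A                                # project the query into R^k (Algorithm 2, step 1)
        _, cand = self.tree.query(x, k=n_candidates)  # retrieve Q candidates (step 2)
        dists = np.count_nonzero(self.db[cand] != b, axis=1)   # Hamming ranking (steps 3-4)
        return cand[np.argmin(dists)]                 # index of the approximate nearest neighbor
```

As in Algorithm 2, the number of returned candidates Q (here n_candidates) directly controls the accuracy/speed trade-off.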

Fig. 1. Overview of the proposed method for fast Hamming space search. The upper part depicts the train stage while the bottom part illustrates the data query stage. The low dimensional float vectors in the dashed red lines are intermediate vectors used for constructing the KD-Tree and are discarded afterwards. The data space partitioned by the constructed KD-Tree in the Euclidean space corresponds to a specific partition of the database, which is useful for quickly filtering out a small set of candidates in the query stage. See text for more details. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

In the following subsections, we describe the two components of our method in detail and then analyze its properties.

3.2. Locality preserving projection for binary features

Given a set of m-bit binary features b1, b2, ..., bn ∈ {0, 1}^m, our target is to find k ≪ m linear projections α1, α2, ..., αk ∈ R^m such that the projected float vectors x1, x2, ..., xn ∈ R^k preserve the neighboring relationships of the binary features. Here, xi = A^T bi, where A = [α1 α2 ... αk] ∈ R^{m×k}. The neighborhood information of a binary feature bi can be denoted by wij ∈ {0, 1}, where wij = 1 means bj is within the neighborhood of bi. All the neighboring information of b1, ..., bn constructs the affinity matrix W ∈ {0, 1}^{n×n} of the dataset. In Hamming space, the neighborhood of a binary feature is usually defined either by a distance threshold ε or simply as the K nearest features. In this paper, we use the former to define the neighbors. Note that we do not assign different weights to neighbors, since our purpose is just to preserve the neighboring relationships after projection, while their relative orders are not important. Such an objective is somewhat easier to achieve than preserving the ranking orders within the preserved neighborhoods. To this end, the objective function can be formulated as:

L = \sum_{ij} \| x_i - x_j \|^2 w_{ij} = \sum_{ij} \| A^T b_i - A^T b_j \|^2 w_{ij}    (1)

Minimizing L imposes penalties on neighboring points that have large distances after projection. This problem has been well addressed as Locality Preserving Projections (LPP) [21] in the area of dimension reduction/feature extraction. Through simple algebraic manipulation, the analytical solution minimizing Eq. (1) can be obtained by solving the following generalized eigenvector problem:

B L B^T A = \lambda B D B^T A    (2)

where B = [b_1, b_2, ..., b_n] is the data matrix, L = D − W is the Laplacian matrix of W, and D is a diagonal matrix whose entries are the row sums of W, i.e., D_{ii} = \sum_{j=1}^{n} w_{ij}. The k projections α1, α2, ..., αk are the eigenvectors corresponding to the k smallest eigenvalues.
ij ij
3.3. Fast indexing of low dimensional projected float vectors with KD-Tree

Given n low dimensional float vectors x1, x2, ..., xn obtained by projecting the high dimensional binary descriptors as described in Section 3.2, the next step of our method is to partition the space spanned by x1, x2, ..., xn with a KD-Tree so as to enable fast indexing of nearest neighbors in the projected low dimensional space. Since each data point xi contained in the constructed KD-Tree is generated from a high dimensional binary descriptor bi, the constructed KD-Tree actually partitions the original database of binary descriptors b1, b2, ..., bn. As a result, by querying the constructed KD-Tree with a low dimensional float vector x obtained by linearly projecting a high dimensional binary descriptor b, we can quickly get a small set of data points from the database. Due to the property of the KD-Tree, the projected vectors of these data points are nearest neighbor candidates of x in the low dimensional Euclidean space. On the other hand, since the linear projection preserves neighborhood information to some extent, these data points can also serve as nearest neighbor candidates in the original Hamming space. Based on these observations, a linear scan over these candidates is finally conducted to return the nearest neighbor of the query binary descriptor. Since distances only have to be computed and sorted for a small number of candidates, the query speed can be hundreds or thousands of times faster than a bruteforce linear search over the whole database.
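For the final linear scan over the candidates, the Hamming distances can be computed with XOR and popcount. A small sketch is given below; it assumes the binary descriptors are stored bit-packed as uint8 arrays (e.g., 64 bytes for a 512-bit FRIF code), which is an implementation assumption rather than a detail given in the paper.

```python
import numpy as np

# Lookup table of popcounts for one byte.
POPCOUNT8 = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)

def hamming_rank(query, candidates):
    """Rank bit-packed candidates by Hamming distance to a bit-packed query.
    Both arguments are uint8 arrays (e.g., 64 bytes per 512-bit descriptor)."""
    xor = np.bitwise_xor(candidates, query)        # differing bits, byte by byte
    dists = POPCOUNT8[xor].sum(axis=1)             # popcount per candidate
    return np.argsort(dists, kind="stable")        # candidate order, nearest first
```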

Each node of a KD-Tree typically stores a specific dimension of the data along with a threshold. By comparing the value at this dimension with the threshold, the data are split into two parts, i.e., each internal node of the KD-Tree has two child nodes. We follow the standard implementation of the KD-Tree: for a given node, it computes the variance of each dimension over the data points belonging to this node. Then, the dimension with the largest variance is adopted by this node to split its data points into two parts (thus generating two new nodes), using the mean value of this dimension as the splitting threshold. Starting from the root node containing all data points in the database, this procedure is conducted iteratively to add new nodes to the KD-Tree until reaching leaf nodes whose sizes are less than a predefined number T. At run-time, a query point traverses the constructed KD-Tree by comparing against the thresholds at the selected dimensions until reaching a leaf node. To increase the search precision at the cost of speed, priority search is used to traverse multiple leaf nodes so as to gather more candidates. Specifically, each time a leaf node is reached, the data points contained in this leaf node are added to the set of candidate nearest neighbors. If the number of candidate nearest neighbors is not yet sufficient, a descent from the highest-priority node in the queue down to a leaf node is conducted to add additional candidates. This procedure is repeated until the number of candidates is no less than the predefined number, which controls the trade-off between search accuracy and speed. The priority of a node in the queue is decided by its distance to the query point, so that the closest branch of the KD-Tree is explored first.
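A compact sketch of the splitting rule and the priority (best-bin-first) traversal just described is given below; the node layout and queue details are illustrative assumptions, not the exact implementation evaluated in the experiments.

```python
import heapq
import numpy as np

class Node:
    __slots__ = ("dim", "thresh", "left", "right", "idx")
    def __init__(self, dim=None, thresh=None, left=None, right=None, idx=None):
        self.dim, self.thresh, self.left, self.right, self.idx = dim, thresh, left, right, idx

def build(points, idx, leaf_size=50):
    """Split on the maximum-variance dimension at its mean until a node holds at most leaf_size points."""
    if len(idx) <= leaf_size:
        return Node(idx=idx)                              # leaf node: keep the point indices
    dim = int(points[idx].var(axis=0).argmax())           # dimension with the greatest variance
    thresh = float(points[idx, dim].mean())               # mean value as the splitting threshold
    mask = points[idx, dim] < thresh
    if mask.all() or not mask.any():                      # degenerate split: stop early
        return Node(idx=idx)
    return Node(dim, thresh,
                build(points, idx[mask], leaf_size),
                build(points, idx[~mask], leaf_size))

def priority_search(root, q, n_candidates):
    """Best-bin-first traversal: repeatedly descend from the closest unexplored branch."""
    cand, heap, tie = [], [(0.0, 0, root)], 1
    while heap and len(cand) < n_candidates:
        _, _, node = heapq.heappop(heap)
        while node.idx is None:                           # walk down to a leaf, queueing far branches
            diff = q[node.dim] - node.thresh
            near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
            heapq.heappush(heap, (abs(diff), tie, far))
            tie += 1
            node = near
        cand.extend(node.idx.tolist())                    # take every point stored in the reached leaf
    return cand
```

For example, root = build(xs, np.arange(len(xs))) followed by priority_search(root, x, Q) returns the indices of the Q candidates that are then re-ranked by Hamming distance.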
3.4. Analysis

3.4.1. Complexity
The query time of our method consists of three parts: (1) the time t1 used to linearly project the query binary descriptor into a low dimensional float vector, (2) the time t2 used to index the KD-Tree with the projected float vector, and (3) the time t3 used to compute and sort the Hamming distances between the query binary descriptor and the binary points returned by traversing the KD-Tree. t1 requires km multiplications and k(m − 1) additions, where k is the number of linear projections and m is the dimension of the binary descriptor. Supposing there are T data points associated with each leaf node of the KD-Tree, retrieving Q candidates requires descending the tree r = ⌈Q/T⌉ times, each of which conducts at most h comparisons (h ≤ log2(n) is the tree depth, and n is the number of binary descriptors in the database). Conducting these comparisons is fast; however, besides these comparisons, t2 includes the time used to maintain the priority queue in the KD-Tree indexing. This part cannot be quantitatively analyzed and often occupies most of t2. According to our experiments, t2 takes 1/4 ∼ 1/2 of t3. Supposing there are Q candidates returned by indexing the KD-Tree, t3 is Q/n of the time used by a linear scan of the database. Usually, due to the small value of k, t1 is negligible¹, so the total query time is approximately 1.25 ∼ 1.5 Q/n of that used by a linear scan. As a result, Q actually controls the search accuracy and the speed up over linear scan. The more candidates returned by the KD-Tree indexing (i.e., the larger Q is), the higher the accuracy and the longer the query time. According to our experiments, setting Q ≈ 0.006n achieves a precision as high as 90% with over 100 times speed up when querying a database containing one million data points.
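As a quick sanity check of this budget, using only the figures quoted above and treating the 1.25 ∼ 1.5 overhead factor as an approximation:

\frac{t_{query}}{t_{linear}} \approx 1.5 \cdot \frac{Q}{n} = 1.5 \times 0.006 = 0.009 ,

i.e., a speed up of roughly 110 times, consistent with the "over 100 times speed up" at 90% precision reported above.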
The LPP projections are pre-computed before building the indexing structure and querying, and thus form an offline process that only needs to be run once. In addition, computing the LPP projections is analytical and only requires solving an eigenproblem (e.g., Eq. (2)). In practice, according to our experiments, computing LPP on 25,000 samples requires about 68.6 s, including the time used for the matrix operations in Eq. (2) and for solving the eigenproblem. Given the projected data points (projecting 1,000,000 data points into 20-d vectors needs about 43 s), the time used to build the KD-Tree for 1,000,000 data points is around 2 s. The time used to project one 512-bit FRIF feature into a 20-d float vector is 4 us. In a word, this part of the computational cost is low and can be ignored in practical applications.

The space requirement of our method is similar to that of tree based methods. Firstly, it has to store the binary descriptors as the database and their indices in the leaf nodes. Then, each non-leaf node in the KD-Tree needs 6 bytes to store the splitting dimension (2 bytes are enough to represent dimensions up to 65,536) and the threshold value (4 bytes, as it is a float). For a database with n data points, if each leaf node contains T data points, then n/T leaf nodes are enough. For n/T leaf nodes, at most n/T non-leaf nodes are needed in the tree, so the storage required for a KD-Tree is at most 6n/T bytes. For a typical database of one million data points with the leaf size set to 50, this storage is about 6 × 1M/50 = 120K bytes. Besides this basic space requirement shared with other KD-Tree based methods, our method additionally needs 4mk bytes to store the k linear projections, for which several hundred KB is enough. Note that the low dimensional float vectors projected from the original binary descriptors are only used to construct the KD-Tree that partitions the database; they do not need to be stored once the tree has been constructed, since all the partition information is efficiently contained in the constructed KD-Tree, as analyzed above. In summary, our method is scalable to large databases.

3.4.2. Performance
The technical route of our method is identical to that of existing fast approximate nearest neighbor search methods, i.e., quickly filtering out a small number of candidates and then using a linear scan across the candidates to find the nearest neighbor. The search accuracy and speed are mainly determined by the number of candidates and the probability that they contain the groundtruth nearest neighbor. Good methods should contain the groundtruth nearest neighbor with high probability using as few candidates as possible. In our method, the KD-Tree operates in a low dimensional Euclidean space, which guarantees its high accuracy with a relatively small number of candidates. Meanwhile, since the purpose of the KD-Tree indexing in our method is to find a subset of candidates containing the true nearest neighbor, the mapping from the original high dimensional binary descriptors to the low dimensional float vectors does not need to precisely preserve the distance orders. In this case, ensuring that the nearest neighbor lies within a small radius of the query point after projection is good enough. This is a relatively easy task for LPP.² Therefore, all these factors together make our method highly effective and efficient. We will show in the next section that the proposed method significantly outperforms the state of the art. For example, when querying a database with one million data points, it achieves more than 100 times speed up over linear scan while still maintaining a precision as high as 90%. For comparison, the most competitive HCT implemented in the OpenCV FLANN library can only obtain a 50 times speed up at the same precision. Finally, it is worth pointing out that the proposed method is a framework that effectively uses the advantages of ANN methods in Euclidean space to solve the ANN problem in Hamming space. Although we use the KD-Tree in this paper, any other ANN method can be integrated into our method as well. The same holds for LPP, so any dimensionality reduction method with the neighborhood preserving property can be used in our method.

¹ For k = 20, t1 is about 4 us according to our experiments, while t2 + t3 is about 390 us for querying over a 1M database with a precision around 90%.
² Actually, most nearest neighbors in the original Hamming space will not be the nearest ones in the mapped Euclidean space. According to our experiments, only about 1% of query points have nearest neighbors in the projected Euclidean space that are identical to the ones found in the original Hamming space.

4. Experiments

4.1. Setup

4.1.1. Datasets
In our experiments, we use binary descriptors extracted from images of the large scale structure from motion dataset [38]. This dataset contains several subsets, each of which corresponds to one landmark and contains thousands of images collected from the Internet. We use the Madrid Metropolis subset containing 1344 images. The FRIF feature [13] is used to extract keypoints along with their binary descriptors from these images. In total, there are 10,366,480 FRIF keypoints detected in these images, each of which is represented by a 512-bit binary code (FRIF descriptor). We randomly sample 10,000 FRIF descriptors from all descriptors as the query set. From the remaining ones, we randomly sample 100K, 1M, and 10M descriptors respectively to form three databases of different sizes.

4.1.2. Compared methods
Besides the proposed method (BNP-KDTree), one baseline and three competitive methods (HCT [18], MLSH [19], SRT [33]) are evaluated. For the compared methods, their parameters are set according to experiments so as to achieve their best performance. The performance of the different methods is demonstrated by drawing curves of speed up factors at different precisions. All experiments are conducted on an Intel i5 2.5GHz CPU with 16GB memory, and only a single thread is used. The running times for querying 10,000 descriptors over the 100K, 1M, and 10M databases by linear scan are around 47.5s, 475s, and 4750s respectively.

• BNP-KDTree: To learn the LPP projections, we randomly select 25,000 samples from the database to construct the graph Laplacian matrix. The adjacent edges in the graph Laplacian matrix are formed by neighboring samples whose Hamming distances are smaller than 175. Binary descriptors are linearly projected into 20-d float vectors by LPP. The leaf size of the KD-Tree is set to 50 for a typical database size of 1M. These parameter settings are determined by our experiments, and we give a parameter study in the next subsection. We control the number of candidates returned by traversing the trees to obtain different speed up factors as well as various precisions.
• Bin-KDTree: This is an intuitive baseline of the proposed method. It builds the KD-Tree directly on the original 512-bit binary descriptors (simply treating them as 512-D real-valued vectors) instead of the projected low-dimensional vectors. This method is denoted as Bin-KDTree and uses the same parameters as BNP-KDTree. By comparing to this baseline, the effectiveness and necessity of applying the locality preserving projections to transform the binary descriptors into float vectors can be illustrated clearly.
• HCT: The hierarchical clustering tree [18] implemented in the OpenCV FLANN library is a popular method for approximate nearest neighbor search in Hamming space. Regarding its parameters, the number of divisions in each tree node is set to 20, the leaf size is 100, and 8 parallel trees are used. Setting different numbers of candidates returned by traversing the trees yields different operating points with different speeds and precisions.
• MLSH: Similar to HCT, multi-probe LSH [19] is another widely used method for indexing binary descriptors. We compare our method to the implementation in the OpenCV FLANN library due to its popularity (a configuration sketch is given after this list). The number of probes is set to 2, the hashing key length is fixed to 30, and we change the number of hashing tables to obtain various operating points with different speeds and precisions.
• SRT: The supervised random tree [33] is a recent work for fast indexing of high dimensional binary descriptors leveraged by supervised information. It requires groundtruth nearest neighbors to select bits as nodes in a tree structure that balances the leaf size against the probability of keeping nearest neighbors in the same leaf. To compare with this method, we compute 10 groundtruth nearest neighbors for each data point as the labels to train the indexing tree. It also proposed a priority search strategy based on the intermediate intensity differences used when computing the binary descriptors; however, this information is unavailable in the general case of approximate nearest neighbor search of binary descriptors. Therefore, this strategy is not implemented in our experiments. We change the number of trees and the tree depth to obtain various operating points on the curves of speed up factor vs. precision.
4.2. Results and analysis

The results of querying the databases of various sizes are shown in Fig. 2. Obviously, the proposed method (BNP-KDTree) obtains significantly better performance than the other evaluated methods. For instance, when querying a database with one million data points, as demonstrated in Fig. 2(b), BNP-KDTree still accelerates linear search by over 120 times while keeping the precision as high as 90%. For comparison, the best competitor SRT can only reach an 80 times speed up at this precision. The popular HCT performs even worse, achieving a speed up of about 60 at a precision of 90%. The improved acceleration rate of our method over SRT and HCT is even more significant for moderate precisions. We can see from Fig. 2(b) that for precisions of 60% ∼ 80%, BNP-KDTree has speed up factors between 250 and 700, while SRT (HCT) is only about 130 ∼ 250 (110 ∼ 190) times faster than the linear search. When the search precision is reduced further (by decreasing the number of returned candidates), HCT seems to saturate in running time. This reveals the high basic operating cost for HCT to maintain and traverse its tree data structure, which limits the maximal speed up that HCT can achieve. In this respect, BNP-KDTree and SRT are more competitive. We can also observe that MLSH is less competitive than the other methods, and is even worse than directly applying a KD-Tree to the binary descriptors (i.e., Bin-KDTree) when the database size is small. This is because the hashing key length is too small compared to the binary feature dimension (30 vs. 512), resulting in a very low collision probability for the generated hashing buckets. Therefore, the candidates returned by probing the nearby hashing buckets have a low probability of containing the true nearest neighbors, even though there are many candidates. This is reflected in the curves of Fig. 2, where MLSH is less efficient at all precisions. What is worse, for high dimensional binary descriptors this problem of MLSH cannot be addressed by simply increasing the hashing key length, since that would require a huge amount of memory to store the hash tables for long key lengths and is thus impractical. As expected, directly applying a KD-Tree to the binary descriptors also does not produce good results, since the KD-Tree is specifically designed for real-valued vectors; binary descriptors only take two possible values in each dimension, making the space division by the tree nodes less meaningful. Different from previous methods, our method smartly converts binary descriptors into float vectors and maintains a neighborhood relationship in the transformed Euclidean space similar to that in the original Hamming space. By this means, we overcome the problem of applying a KD-Tree to binary descriptors, relying on the KD-Tree division in the transformed Euclidean space to obtain a specific partition of the database. As demonstrated by the curves of BNP-KDTree in Fig. 2, such a space division is highly effective and leads to very good performance in terms of both efficiency and effectiveness. These observations about the different methods are consistent across the various database sizes, as clearly shown in Fig. 2(a) to (c).

Fig. 2. (a)-(c) Accelerations over linear scan with respect to different searching precisions for different methods with various database sizes, i.e., 100K, 1M, 10M. (d) Speed
up factors of different methods for different database sizes when precision is kept at 80%.

It is also interesting to note that the database size has a positive effect on the acceleration rate over linear scan, as clearly demonstrated in Fig. 2(d). The larger the database is, the higher the speed up factor a method can achieve, whether for the proposed BNP-KDTree or the existing HCT and SRT. This is because for a small database the returned candidates are so few that the time used to traverse the tree structure is not negligible compared to that used for Hamming ranking, while for a large database the time used to traverse the tree structure only occupies a very small fraction of the search time. Therefore, the speed up is more significant for larger database sizes. Fig. 2(d) shows the speed up factors of HCT, SRT and our method when the search precision is kept at 80%. Compared to HCT and SRT, not only the absolute speed up factors of our method but also its relative speed improvement with increasing database size is more significant, demonstrating the high efficiency of the proposed method.

4.3. Discussions and parameter study

In this subsection, we conduct a parameter study to show how the performance of our method is related to the different parameter settings. All experiments in this section are conducted on the 1M database and 10,000 queries are used.

4.3.1. Single KD-Tree vs. multiple randomized KD-Trees
In the literature, the technique of using multiple randomized KD-Trees has been proposed to increase the performance of high dimensional descriptor matching [4]. However, as shown in Fig. 3(a), using one KD-Tree is highly reliable in our method and leads to higher precision than using multiple randomized KD-Trees at all speed up factors. This is because the Euclidean space considered in our method is a relatively low dimensional one, and in this case using multiple randomized KD-Trees cannot increase the independence of the different searches in the query phase. Meanwhile, the standard KD-Tree with priority search is already very reliable for handling the low dimensional float vectors. For these reasons, it is unnecessary to use multiple randomized KD-Trees, which were specifically proposed to improve the matching performance of high dimensional descriptors. Instead, our method uses the standard KD-Tree with priority search on the projected low dimensional float vectors. Compared to multiple KD-Trees, using a single KD-Tree also requires less memory to store the tree structure and less time to build the tree.

Fig. 3. Parameter study. (a) Number of trees. (b) LPP projected dimensions (k). (c) Binary neighborhood threshold (ε). (d) Influence of different training data for LPP. Please see text for details.

4.3.2. Influence of LPP dimensions
Fig. 3(b) studies the influence of the LPP dimension and shows the nearest neighbor search performance of our method for different LPP dimensions. The performance is relatively stable across a wide range of dimensions, perhaps because our method does not require a strict neighborhood preserving projection. For larger LPP dimensions, it is interesting to see that there is no effect on the final performance. This is because the added dimensions do not contribute to the construction of the KD-Tree due to the limited tree depth (according to our experiments, the KD-Tree usually has a depth around 18 to fulfill the leaf size requirement). For very small LPP dimensions, the performance is inferior, as expected, since not enough information is captured. According to these results, we use 20 dimensions in our method.

4.3.3. Influence of the binary neighborhood threshold
Fig. 3(c) shows how the binary neighborhood threshold affects the performance of our method. According to this figure, we set the threshold ε to 175, although other settings do not degrade the performance too much.

4.3.4. Generalization of the learned LPP projections
Although the LPP projections are learned without supervision from a specific dataset, they are basically a transformation from Hamming space to Euclidean space with the neighborhood preserving property. Therefore, applying the learned projections to other datasets should also produce very good performance. To show this, we replace the learned LPP projections by new ones learned from another dataset, i.e., a set of binary descriptors extracted from images of another structure from motion dataset (here we use the Gendarmenmarkt and Alamo subsets in [38] respectively, which are landmarks different from the one used in our experiments). Except for the LPP projections, the other parameters are kept unchanged, and the results are shown in Fig. 3(d). As can be seen, although the LPP projections are learned from different images, this has almost no effect on the final performance of BNP-KDTree. Therefore, the LPP projections learned from a specific dataset can be used in BNP-KDTree to query other datasets as well. In other words, we only need to solve the LPP problem once, and the learned projections can then be fixed thanks to their generalization ability. For this reason, in the later section applying BNP-KDTree to structure from motion, we fix BNP-KDTree with the LPP projections learned from the Madrid Metropolis subset.

4.3.5. Influence of different leaf sizes
Fig. 4(a) shows the approximate nearest neighbor search performance of our method when using KD-Trees with different leaf sizes. In BNP-KDTree, when the KD-Tree is indexed to return a number of candidates, all the data points in a leaf node are either taken as candidates as a whole or not at all. For this reason, a larger leaf size actually returns more candidates to achieve a similar precision, thus requiring more processing time. Therefore, a larger leaf size leads to worse performance, as shown in Fig. 4(a). On the other hand, when the leaf size is too small, it basically corresponds to an over-fine partition of the data space. In this case, more leaf nodes usually need to be traversed in order to have a high probability of containing the true nearest neighbor among the returned candidates. As a result, too small a leaf size yields lower speed up factors when high precision is required, which is the case for the black curve (T = 25) in Fig. 4(a). When high precision is not required, a smaller leaf size is also more efficient, because fewer candidates are returned and not too many leaf nodes are traversed, as shown in the left part of the black curve (T = 25) in Fig. 4(a). However, a very small leaf size leads to a significantly over-fine partition, which in turn degrades the query performance, as clearly demonstrated by the green curve for a leaf size of 10 in Fig. 4(a). According to the results shown in Fig. 4(a), we set the leaf size to 50 in our method.

Fig. 4. (a) Speed up factor VS. precision for different leaf sizes. (b) Comparison of different methods obtained under our framework and some existing methods. (c) ANN
search performance of different methods when using BRISK descriptors. Please see text for details.

4.3.6. Effectiveness of the proposed framework
The high level idea of our method is to project the binary features into low-dimensional float vectors while preserving some neighborhood information, and then to leverage the ANN methods in Euclidean space to retrieve a small number of candidates for ANN search of binary features. To show the effectiveness of this general framework, we replaced the KD-Tree with LSH (another typical ANN method in Euclidean space) in this framework to quickly find candidates in the low-dimensional real-valued space, obtaining a method called BNP-LSH. Specifically, in the projected low-dimensional space, we use LSH to build hash tables for indexing the database. Given a query, its projected vector is mapped to a hash code to retrieve the data points stored in the corresponding hash buckets as nearest neighbor candidates, just as we do with the KD-Tree. We fixed the key length to 22 and changed the number of hash tables to obtain different trade-offs between accuracy and speed up. The results are shown in Fig. 4(b). As can be seen, BNP-LSH performs only slightly worse than BNP-KDTree and outperforms all the competing methods. These results validate the effectiveness of the proposed framework. The inferior performance of BNP-LSH compared to BNP-KDTree might be because LSH usually needs to return more candidates than the KD-Tree in order to contain the true nearest neighbors.
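A sketch of such an index in the projected space is given below. The use of sign-based random hyperplane hashing and the number of tables are assumptions of this sketch (the paper only states that the projected vector is mapped to a hash code with key length 22); the candidates gathered from the tables would then be re-ranked by Hamming distance exactly as in BNP-KDTree.

```python
import numpy as np
from collections import defaultdict

class ProjectedLSH:
    """Sketch of the BNP-LSH variant: the LPP-projected vectors are hashed by the
    signs of random hyperplane projections. Sign-based hashing and the number of
    tables are assumptions of this sketch."""

    def __init__(self, xs, key_len=22, n_tables=4, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.standard_normal((n_tables, key_len, xs.shape[1]))
        self.tables = []
        for H in self.planes:                       # one hash table per set of hyperplanes
            table = defaultdict(list)
            for i, key in enumerate(self._keys(xs, H)):
                table[key].append(i)
            self.tables.append(table)

    @staticmethod
    def _keys(x, H):
        bits = (np.atleast_2d(x) @ H.T) > 0         # key_len sign bits per vector
        return [row.tobytes() for row in bits]

    def candidates(self, x):
        cand = set()
        for H, table in zip(self.planes, self.tables):
            cand.update(table.get(self._keys(x, H)[0], []))
        return cand                                  # re-ranked by Hamming distance afterwards
```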
4.3.7. Generalization to other features
Since the feature distribution might affect the effectiveness and efficiency of a search algorithm, we also conduct experiments on another widely used binary descriptor, BRISK [12]. To further demonstrate the generalization ability of our method, in this experiment we use BRISK descriptors computed from the Gendarmenmarkt subset in [38] to learn the LPP projections, and then use the learned LPP projections to test the query performance of the proposed method on BRISK descriptors extracted from the Alamo subset in [38]. That is to say, the dataset used to learn the LPP projections is different from the one on which we test the query performance of the different methods. The database size is 1M and 10,000 queries are used. The result is shown in Fig. 4(c). As can be seen, BNP-KDTree outperforms all the other evaluated methods. We can also note that the speed ups of all the evaluated methods are not as good as in the case of using the FRIF descriptor on the Madrid dataset (cf. Fig. 2(b)). This difference reflects that the feature and dataset can affect the performance of search algorithms. Even so, the relative performance ranking among the different methods is basically unchanged, as all methods suffer from these factors. Among all the evaluated methods, the proposed one performs the best. This result further validates the effectiveness of the proposed method as a general binary feature indexing method.

Fig. 5. SFM reconstruction results when using different feature matching methods. Top: results for the "Gendarmenmarkt" dataset; bottom: results for the "Madrid Metropolis" dataset. The colored tetrahedrons show the camera poses. Although the two methods produce similar results, using our method (BNP-KDTree) to conduct feature matching is 4 times faster than using HCT. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Table 1 Note that although the components of our method are well
SFM reconstruction results of different feature matching methods on two large
established techniques, they are reasonably combined together in
scale image datasets. The column of “results” are listed in the form of (number
of recovered images, number of recovered sparse points, feature matching time). this paper to form an elegant solution for a new problem that nei-
ther of them can deal with. In the future, we would like to fur-
Dataset Number Average Results
ther explore this high level idea (i.e., solving ANN in Hamming
of images features top:BNP-KDTree
bottom:HCT space by resorting to Euclidean space), including apply different
subspace learning methods to form a better low dimensional Eu-
Madrid Metropolis 1344 7713.2 (421, 46197, 8h56m)
clidean space for this problem, investigate on how different ANN
(424, 44042, 37h29m)
Gendarmenmarkt 1463 8487.9 (890, 83270, 12h39m) methods on Euclidean space perform in our framework, etc. Mean-
4.4. Applications to large scale structure from motion

The task of structure from motion is highly dependent on image matching quality. In particular, for large scale structure from motion, the time spent on matching all image pairs of an image collection is usually the bottleneck of the whole pipeline, since the computational complexity grows quadratically with the number of images. Thus, exhaustive feature matching is extremely slow even with binary descriptors and becomes impractical in this application. We apply our fast approximate nearest neighbor searching method with FRIF binary features to a typical structure from motion system (i.e., VisualSFM) [39] on large scale image collections. For comparison, we also conduct feature matching with HCT, owing to its popularity for fast indexing of binary features. The parameters of our method and HCT are set to the same values as in the previous experiments, i.e., the LPP projections are learned from the Madrid Metropolis dataset. The results are summarized in Table 1, which reports the numbers of recovered cameras and sparse points as well as the time spent on feature matching. A larger number of recovered cameras/sparse points indicates better performance in estimating the scene geometry and camera poses from the input images. Additionally, Fig. 5 shows the 3D point clouds reconstructed from these two datasets along with the recovered camera poses. From these results, it is very encouraging to see that our method (BNP-KDTree) requires only about 1/4 of the feature matching time of HCT, while still achieving equivalent performance for this typical application. Note that HCT is already a very efficient method for large scale feature matching of binary descriptors.
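To make the quadratic pairwise cost concrete, the following Python sketch (an illustration by the editors, not the implementation used in our experiments) enumerates all image pairs of a collection and matches their binary descriptors with a brute-force Hamming matcher and a nearest/second-nearest ratio check; this exhaustive per-pair search is exactly the step that an approximate index such as BNP-KDTree or HCT replaces. All function names, parameters, and the toy data are illustrative assumptions.

```python
from itertools import combinations
import numpy as np

def hamming_matrix(a, b):
    """Pairwise Hamming distances between two sets of packed binary descriptors."""
    # XOR every descriptor in a with every descriptor in b, then count set bits.
    x = np.bitwise_xor(a[:, None, :], b[None, :, :])
    return np.unpackbits(x, axis=2).sum(axis=2)

def match_pair(desc_a, desc_b, ratio=0.8):
    """Brute-force nearest neighbor matching with a distance-ratio check.

    In the SfM experiments this exhaustive per-pair search is what the
    approximate nearest neighbor index replaces; the surrounding driver
    below stays the same.
    """
    d = hamming_matrix(desc_a, desc_b)
    order = np.argsort(d, axis=1)
    nn, nn2 = order[:, 0], order[:, 1]
    rows = np.arange(len(desc_a))
    keep = d[rows, nn] < ratio * d[rows, nn2]
    return list(zip(rows[keep].tolist(), nn[keep].tolist()))

def match_collection(descriptors_per_image):
    """Match all image pairs: the number of pairs grows quadratically with images."""
    matches = {}
    for i, j in combinations(range(len(descriptors_per_image)), 2):
        matches[(i, j)] = match_pair(descriptors_per_image[i],
                                     descriptors_per_image[j])
    return matches

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy collection: 4 images, 100 descriptors of 512 bits (64 bytes) each.
    # Purely random descriptors will pass the ratio test rarely, if at all.
    images = [rng.integers(0, 256, size=(100, 64), dtype=np.uint8) for _ in range(4)]
    all_matches = match_collection(images)
    print(sum(len(m) for m in all_matches.values()), "tentative matches")
```

The driver is unchanged whichever matcher is plugged into `match_pair`; only the per-pair search strategy determines the overall matching time of the collection.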
5. Conclusion

This paper addresses the problem of efficient approximate nearest neighbor searching in high dimensional Hamming space, which is critical for matching a large number of binary descriptors in various applications ranging from large scale image based 3D reconstruction to image based localization. Our solution is based on the observation that the same problem in Euclidean space can be solved very efficiently and reliably. We therefore propose to project the original high dimensional binary descriptors into low dimensional float vectors by locality preserving projections, so as to preserve the neighborhood information of the original Hamming space. The low dimensional float vectors are then partitioned by a KD-Tree, which induces a partition of the original binary database. Once the partition has been built, a query binary descriptor is projected into the low dimensional Euclidean space to index the constructed KD-Tree and obtain a small set of candidates. Due to the neighborhood preserving property, these candidates are very likely to be neighboring points of the query in the original Hamming space. Thus, the Hamming distances between the query and the candidates are computed and sorted to find the nearest neighbor of the query. Extensive experiments have demonstrated the effectiveness and efficiency of the proposed method compared to existing works. Meanwhile, besides the application to large scale SFM demonstrated in this paper, we plan to apply the proposed method to more applications, such as large scale vision based localization and image retrieval with binary features.
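For readers who wish to prototype the search pipeline summarized above, the following minimal Python sketch reproduces the candidate filtering and Hamming re-ranking steps using SciPy's cKDTree. It is an illustration rather than our released code: the projection matrix is a random stand-in for the learned LPP projection, and all names, sizes, and parameters are assumptions made for the example.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)

# Toy database of 256-bit binary descriptors, stored as packed bytes (32 bytes each).
n_db, n_bits = 10000, 256
db_packed = rng.integers(0, 256, size=(n_db, n_bits // 8), dtype=np.uint8)

# Unpack to {0,1} float vectors so they can be linearly projected.
db_bits = np.unpackbits(db_packed, axis=1).astype(np.float32)

# Stand-in for the learned LPP projection (256-D binary -> 16-D float).
# In the proposed method this matrix is learned so that Hamming neighbors
# remain close after projection.
proj = rng.standard_normal((n_bits, 16)).astype(np.float32)
db_low = db_bits @ proj

# Index the low dimensional float vectors with a KD-Tree.
tree = cKDTree(db_low)

def hamming(a_packed, b_packed):
    """Hamming distance between two packed binary descriptors."""
    return int(np.unpackbits(a_packed ^ b_packed).sum())

def search(query_packed, n_candidates=50):
    """Approximate nearest neighbor: KD-Tree filtering + Hamming re-ranking."""
    q_low = np.unpackbits(query_packed).astype(np.float32) @ proj
    # Retrieve a small candidate set in the projected Euclidean space.
    _, cand_idx = tree.query(q_low, k=n_candidates)
    # Re-rank the candidates by exact Hamming distance in the original space.
    dists = [hamming(query_packed, db_packed[i]) for i in cand_idx]
    best = int(np.argmin(dists))
    return int(cand_idx[best]), int(dists[best])

query = rng.integers(0, 256, size=n_bits // 8, dtype=np.uint8)
idx, d = search(query)
print(f"approximate NN: index {idx}, Hamming distance {d}")
```

In practice, the random matrix above would be replaced by the projection learned with LPP on a training set of binary descriptors, and `n_candidates` controls the trade-off between search accuracy and the speed-up over a linear scan.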
Acknowledgments

This work is supported by the National Science Foundation of China (61573352, 61876180) and the Young Elite Scientists Sponsorship Program by CAST (2018QNRC001). The authors would like to thank the anonymous reviewers for their suggestions to improve the quality of this paper. Thanks to Dr. Youji Feng for providing their codes of supervised random tree.

References

[1] J. Tang, Z. Li, X. Zhu, Supervised deep hashing for scalable face image retrieval, Pattern Recognit. 75 (C) (2018) 25–32.
[2] A.K. Jain, Data clustering: 50 years beyond k-means, Pattern Recognit. Lett. 31 (8) (2010) 651–666.
[3] J. Cheng, C. Leng, J. Wu, H. Cui, H. Lu, Fast and accurate image matching with cascade hashing for 3D reconstruction, in: IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 1–8.
[4] C. Silpa-Anan, R. Hartley, Optimised KD-trees for fast image descriptor matching, in: IEEE Conference on Computer Vision and Pattern Recognition, 2008, pp. 1–8.
[5] D. Nistér, H. Stewénius, Scalable recognition with a vocabulary tree, in: IEEE Conference on Computer Vision and Pattern Recognition, 2006, pp. 2161–2168.
[6] Y.-S. Chen, Y.-P. Hung, T.-F. Yen, C.-S. Fuh, Fast and versatile algorithm for nearest neighbor search based on a lower bound tree, Pattern Recognit. 40 (2) (2007) 360–375.
[7] Y.-C. Liaw, M.-L. Leou, C.-M. Wu, Fast exact k nearest neighbors search using an orthogonal search tree, Pattern Recognit. 43 (6) (2010) 2351–2358.
[8] A. Gionis, P. Indyk, R. Motwani, Similarity search in high dimensions via hashing, in: International Conference on Very Large Data Bases, 1999, pp. 518–529.
[9] D.G. Lowe, Distinctive image features from scale-invariant keypoints, Int. J. Comput. Vis. 60 (2) (2004) 91–110.
[10] Z. Wang, B. Fan, G. Wang, F. Wu, Exploring local and overall ordinal information for robust feature description, IEEE Trans. Pattern Anal. Mach. Intell. 38 (11) (2016) 2198–2211.
[11] Y. Tian, X. Yu, B. Fan, F. Wu, H. Heijnen, V. Balntas, SOSNet: second order similarity regularization for local descriptor learning, in: IEEE Conference on Computer Vision and Pattern Recognition, 2019.
[12] S. Leutenegger, M. Chli, R. Siegwart, BRISK: binary robust invariant scalable keypoints, in: International Conference on Computer Vision, 2011, pp. 2548–2555.
[13] Z. Wang, B. Fan, F. Wu, FRIF: fast robust invariant feature, in: British Machine Vision Conference, 2013.
[14] E. Rublee, V. Rabaud, K. Konolige, G. Bradski, ORB: an efficient alternative to SIFT or SURF, in: International Conference on Computer Vision, 2011, pp. 2564–2571.
[15] B. Fan, Q. Kong, T. Trzcinski, Z. Wang, C. Pan, P. Fua, Receptive fields selection for binary feature description, IEEE Trans. Image Process. 23 (6) (2014) 2583–2595.
[16] T. Trzcinski, M. Christoudias, V. Lepetit, Learning image descriptors with boosting, IEEE Trans. Pattern Anal. Mach. Intell. 37 (3) (2015) 597–610.
[17] T. Trzcinski, V. Lepetit, P. Fua, Thick boundaries in binary space and their influence on nearest-neighbor search, Pattern Recognit. Lett. 33 (16) (2012) 2173–2180.
[18] M. Muja, D.G. Lowe, Scalable nearest neighbor algorithms for high dimensional data, IEEE Trans. Pattern Anal. Mach. Intell. 36 (11) (2014) 2227–2240.
[19] Q. Lv, W. Josephson, Z. Wang, M. Charikar, K. Li, Multiprobe LSH: efficient indexing for high-dimensional similarity search, in: International Conference on Very Large Data Bases, 2007, pp. 950–961.
[20] M. Muja, D.G. Lowe, Fast matching of binary features, in: Ninth Conference on Computer and Robot Vision, 2012, pp. 404–410.
[21] X. He, P. Niyogi, Locality preserving projections, in: Neural Information Processing Systems, 2003, pp. 153–160.
[22] J.L. Bentley, Multidimensional binary search trees used for associative searching, Commun. ACM 18 (9) (1975) 509–517.
[23] Y. Jia, J. Wang, G. Zeng, H. Zha, X.-S. Hua, Optimizing kd-trees for scalable visual descriptor indexing, in: IEEE Conference on Computer Vision and Pattern Recognition, 2010, pp. 3392–3399.
[24] S. Agarwal, Y. Furukawa, N. Snavely, I. Simon, B. Curless, S.M. Seitz, R. Szeliski, Building Rome in a day, Commun. ACM 54 (10) (2011) 105–112.
[25] K. Fukunaga, P.M. Narendra, A branch and bound algorithm for computing k-nearest neighbors, IEEE Trans. Comput. 24 (7) (1975) 750–753.
[26] J. McNames, A fast nearest-neighbor algorithm based on a principal axis search tree, IEEE Trans. Pattern Anal. Mach. Intell. 23 (9) (2001) 964–976.
[27] J.Z. Lai, Y.-C. Liaw, J. Liu, Fast k-nearest-neighbor search based on projection and triangular inequality, Pattern Recognit. 40 (2) (2007) 351–359.
[28] X. Liu, J. He, B. Lang, Multiple feature kernel hashing for large-scale visual search, Pattern Recognit. 47 (2) (2014) 747–757.
[29] Y. Weiss, A. Torralba, R. Fergus, Spectral hashing, in: Neural Information Processing Systems, 2008, pp. 1753–1760.
[30] J. Wang, J. Wang, N. Yu, S. Li, Order preserving hashing for approximate nearest neighbor search, in: ACM International Conference on Multimedia, 2013, pp. 133–142.
[31] Y. Luo, Y. Yang, F. Shen, Z. Huang, P. Zhou, H.T. Shen, Robust discrete code modeling for supervised hashing, Pattern Recognit. 75 (2018) 128–135.
[32] Y. Ma, H. Xie, Z. Chen, Q. Dai, Y. Huang, G. Ji, Fast search of binary codes with distinctive bits, in: Pacific Rim Conference on Multimedia, 2014, pp. 274–283.
[33] Y. Feng, L. Fan, Y. Wu, Fast localization in large-scale environments using supervised indexing of binary features, IEEE Trans. Image Process. 25 (1) (2016) 343–358.
[34] M. Esmaeili, R. Ward, M. Fatourechi, A fast approximate nearest neighbor search algorithm in the hamming space, IEEE Trans. Pattern Anal. Mach. Intell. 34 (12) (2012) 2481–2488.
[35] Y. Feng, Y. Wu, L. Fan, Real-time SLAM relocalization with online learning of binary feature indexing, Mach. Vis. Appl. 28 (8) (2017) 953–963.
[36] M. Norouzi, A. Punjani, D. Fleet, Fast exact search in hamming space with multi-index hashing, IEEE Trans. Pattern Anal. Mach. Intell. 36 (6) (2014) 1107–1119.
[37] S. Eghbali, H. Ashtiani, L. Tahvildari, Online nearest neighbor search in binary space, in: IEEE International Conference on Data Mining, 2017, pp. 853–858.
[38] K. Wilson, N. Snavely, Robust global translations with 1DSfM, in: European Conference on Computer Vision, 2014, pp. 61–75.
[39] C. Wu, Towards linear-time incremental structure from motion, in: International Conference on 3D Vision, 2013, pp. 127–134.

Bin Fan received the B.S. degree in automation from the Beijing University of Chemical Technology, Beijing, China, in 2006, and the Ph.D. degree in pattern recognition and intelligent systems from the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, in 2011. From 2015 to 2016, he spent one year as a visiting scholar at CVLab, EPFL. He is currently an associate professor with the Institute of Automation, Chinese Academy of Sciences. His current research interests include computer vision, image processing, and pattern recognition.

Qingqun Kong received her B.S. degree in automation from Ocean University of China in 2008, and the Ph.D. degree in pattern recognition and intelligent systems from the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, China, in 2013. Currently, she is an associate professor with the Institute of Automation, Chinese Academy of Sciences. She researches on computer vision, pattern recognition, and brain inspired intelligence.

Baoqian Zhang is a junior undergraduate student at China Foreign Affairs University and an intern at the NLPR, Institute of Automation, Chinese Academy of Sciences. This work was done during his internship at NLPR.

Hongmin Liu received the B.S. degree from Xidian University, China, in 2004 and the Ph.D. degree from the Institute of Electronics, Chinese Academy of Sciences, Beijing, China, in 2009. She worked as an associate professor in the School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, China. She is currently a professor both with the School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China, and the School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo 454000, China. Her research is focused on image processing and pattern recognition.

Chunhong Pan received the B.S. degree in automatic control from Tsinghua University, Beijing, China, in 1987, the M.S. degree from the Shanghai Institute of Optics and Fine Mechanics, Chinese Academy of Sciences, Shanghai, China, in 1990, and the Ph.D. degree in pattern recognition and intelligent systems from the Institute of Automation, Chinese Academy of Sciences, Beijing, China, in 2000. He is currently a professor with the Institute of Automation, Chinese Academy of Sciences. His current research interests include computer vision, image processing, computer graphics, and remote sensing.

Jiwen Lu received the B.Eng. degree in mechanical engineering and the M.Eng. degree in electrical engineering from the Xi'an University of Technology, Xi'an, China, in 2003 and 2006, respectively, and the Ph.D. degree in electrical engineering from Nanyang Technological University, Singapore, in 2012. He is currently an associate professor in the Department of Automation, Tsinghua University, Beijing, China. His current research interests include computer vision, pattern recognition, and machine learning. He serves as an associate editor of Pattern Recognition, Pattern Recognition Letters, Neurocomputing, and IEEE Access, an elected member of the Information Forensics and Security Technical Committee of the IEEE Signal Processing Society, and an elected member of the Multimedia Systems and Applications Technical Committee of the IEEE Circuits and Systems Society.
