
Signal Processing: Image Communication 59 (2017) 131–139


A medical image retrieval method based on texture block coding tree


Wenbo Li a,b, Haiwei Pan a,*, Pengyuan Li c, Xiaoqin Xie a, Zhiqiang Zhang a

a College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
b Department of Informatics, Kyushu University, Japan
c University of Delaware, USA

Keywords:
Content-based image retrieval (CBIR)
Texture block coding
Image processing
Medical image

Abstract

Content-based medical image retrieval (CBMIR) has been widely studied for computer-aided diagnosis. Accurate and comprehensive retrieval results are effective in facilitating diagnosis and treatment. Texture is one of the most important features used in CBMIR. Most existing methods utilize the distances between matching point pairs for texture similarity measurement. However, distance-based similarity measurements have low tolerance to slight texture shifts, which results in excessive sensitivity. Furthermore, as the number of texture points increases, their time complexity grows explosively. In this paper, a new medical image retrieval model is presented based on an iterative texture block coding tree. The corresponding methods for coarse-grained and fine-grained similarity matching are also proposed. Moreover, a multi-level index structure is designed to enhance retrieval efficiency. Experimental results show that our methods achieve high efficiency and appropriate tolerance of slight shifts, and deliver relatively better retrieval performance than other existing methods.
© 2017 Elsevier B.V. All rights reserved.

1. Introduction

With the rapid development and popularization of digital medical equipment, medical image data is growing explosively. Similarity retrieval of medical images can help doctors find cases similar to their current patient, allowing them to make diagnoses based on previous relevant results. At present, image retrieval methods can be divided into two categories: label-based image retrieval and content-based image retrieval (CBIR). The former relies on marked labels to measure the similarity between images, which neglects semantic information. In contrast, CBIR methods utilize image content features to represent images, which enables complicated, unmarked image data to be quantified and retrieved.

In medical image retrieval, CBIR has been widely studied during the past decades [1–6]. Currently, CBIR utilizes a variety of content features (e.g. grayscale, texture, color, SIFT) to represent images and measure similarities [7–14]. Unay et al. (2010) utilized the LBP descriptor with a spatial index in brain MR image retrieval [15]. Li (2013) proposed an Uncertain Location Graph (ULGR) model which takes the uncertainty and the structure of the texture into account, significantly enhancing retrieval precision [16]. Jiang (2015) combined VocTree [17] and SIFT [18,19] and proposed a new BoW [20] image retrieval model for mammographic diagnosis [21]. Similar to the LBP descriptor, Subramaniam (2017) proposed a local mesh peak valley edge pattern based biomedical image indexing and retrieval method [22]. These efforts all increase the accuracy of CBMIR. However, they all suffer from high time complexity and inappropriate tolerance in some medical image retrieval tasks.

Texture is one of the most frequently used features in medical image analysis, because content-based medical image retrieval (CBMIR) needs to distinguish abnormal deformations in medical images, for which general image retrieval finds it difficult to achieve a comfortable balance between tolerance and sensitivity [23,24]. The tolerance of slight texture shifts also has a significant effect on retrieval performance. Usually, even for the same patient, the results of scanning are not always identical. These slight differences in texture images typically appear as the splitting of long edges and the offset of texture points. For example, suppose two long textures in two different images should be the same. When one long texture is divided into several short textures by break points, it becomes hard to calculate the similarity between the two long textures. Conventional methods typically match each texture separately and obtain their matching similarities respectively. Obviously, these similarities are inaccurate, since the matched textures are only parts of the long texture. Therefore, if textures have break points and slight shifts, the accuracy of texture similarity matching drops

* Corresponding author.
E-mail address: panhaiwei@hrbeu.edu.cn (H. Pan).

http://dx.doi.org/10.1016/j.image.2017.06.013
Received 13 January 2017; Received in revised form 6 June 2017; Accepted 26 June 2017
Available online 8 July 2017
0923-5965/© 2017 Elsevier B.V. All rights reserved.

Fig. 1. Image data preprocessing.

considerably. This is an important problem which is ignored by most existing methods. In addition, high time complexity is another common problem of CBMIR methods.

In texture image retrieval, existing similarity measurement methods can be broadly divided into two classes: one is texture point matching based retrieval (e.g. KLT [25] and ULGR-index [16]); the other is based on local feature descriptors, such as SIFT [18,19] and LBP [15]. For the former class, the similarity of texture distribution between images can be accurately described, but it lacks tolerance of slight texture deformation. Besides, it has a high time complexity when dealing with large-scale texture points. For the latter class, deformation tolerance can be improved thanks to the descriptors' invariance, and the time complexity is independent of the number of texture points. However, large numbers of mismatched points might result in a lack of discrimination in CBMIR. Furthermore, some local feature descriptors (e.g. SIFT) are also inefficient when matching large numbers of feature points.

Generally, texture image matching is measured by texture similarities, which mainly rely on the distance and the number of matching point pairs. Texture coordinate based methods are excessively sensitive to slight shifting or splitting of textures and are time consuming. The local feature descriptors may not have the over-sensitivity problem, but their invariance might lead to excessively high tolerance of local deformation. Moreover, the matching processes for those feature vectors are also of high time complexity. Therefore, we need a new texture similarity measure that is efficient and appropriately tolerant.

In fact, there is another way to describe the similarity between two points: regional information. Each texture image can be normalized to a square and iteratively divided into four sub-blocks until the divided blocks reduce to pixel points. By encoding each of the four sub-blocks with a binary code, such as ''00'', ''01'', ''10'', ''11'', we obtain a coding of the regional information of the texture points. Hence each texture point coordinate can be converted into a unique binary code that describes its region. Based on these binary encodings, the similarity between textures can be calculated by bit manipulation. In the texture point matching process, we only care about the corresponding region of a texture point, which does not require traversing all points and thus greatly reduces the time complexity.

This paper is organized as follows: Section 2 introduces the preprocessing work. Section 3 proposes the model of the iterative texture block coding tree. In Section 4, we present the similarity measurements, including a coarse-grained matching method and a fine-grained matching method. Section 5 presents a multi-level index structure for our retrieval method. Experiments are shown in Section 6, and in Section 7 we briefly summarize the paper and highlight future work.

2. Preprocessing

Our dataset consists of 20 000 brain CT images from the People's Hospital of Chaoyang in Liaoning province, collected in 2013. The preprocessing contains five steps:

1. Extract pixel data from DICOM files and save them as JPEG files.
2. Remove the image region of the skull by threshold segmentation and region growing [26].
3. Extract the brain midline and adjust the symmetrical structure [27].
4. Extract texture features and normalize the processed image to a square.
5. Remove the outermost contour edge and save the normalized image as a gray texture image.

Fig. 1 shows an example of brain CT image preprocessing, where (a) is the JPEG data extracted from a DICOM file (step 1), (b) shows the result of skull removal and symmetrical structure adjustment (steps 2–3), (c) shows the normalized texture image (step 4), and (d) is the preprocessing result after outermost contour edge elimination (step 5). By applying this preprocessing, each DICOM file in our dataset is converted to a size-normalized gray texture image.

3. Texture block coding tree

Before introducing our model, the descriptions of all mathematical symbols used in this paper are given in Table 1.

The Texture Block Coding Tree (TBC-Tree) is defined as a quadtree, where each node represents a block region of the texture image; each parent node is divided into 4 sub-blocks, coded ''00'', ''01'', ''10'', ''11'' from left to right and from top to bottom, as shown in Fig. 2. This division is iterated until the leaf nodes are reduced to pixels. The detailed steps of the Iterative Texture Block Coding (IBC) algorithm are shown in Algorithm 1.
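Algorithm 1 itself is not reproduced in this excerpt. The following is a minimal sketch of the quadrant division rule as described above, written in Python rather than the paper's C#; the function names and the dictionary output format are ours, not the paper's:

```python
def ibc(img, top, left, size, code, out):
    """Recursively divide a square binary image into quadrants coded
    '00', '01', '10', '11' (left to right, top to bottom) and record the
    full path code of every texture (non-zero) pixel."""
    if size == 1:
        if img[top][left]:           # leaf reached: a single pixel
            out[code] = (top, left)
        return
    h = size // 2
    ibc(img, top,     left,     h, code + "00", out)   # top-left
    ibc(img, top,     left + h, h, code + "01", out)   # top-right
    ibc(img, top + h, left,     h, code + "10", out)   # bottom-left
    ibc(img, top + h, left + h, h, code + "11", out)   # bottom-right

def point_codes(img):
    """Point code set P(I) of a binary S x S image (S a power of two),
    mapping each 2*log2(S)-bit code to its pixel coordinate."""
    out = {}
    ibc(img, 0, 0, len(img), "", out)
    return out
```

For a 4 × 4 image with a single texture point at row 2, column 1, the point receives the code ''1001'': it lies in the bottom-left quadrant (''10'') and, within it, in the top-right sub-quadrant (''01'').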


Table 1
Table of used mathematical symbols.

Symbol        Name
𝐼             Preprocessed brain CT image.
𝐼𝑖𝑗           Point of image 𝐼 located at (𝑖, 𝑗).
𝑏𝑖𝑗(𝑙)        Block whose upper left point is (𝑖, 𝑗) at level 𝑙.
𝑏𝑐(𝑙)         Binary code of a block at level 𝑙.
𝑝𝑐            Binary code of a point.
𝑃(𝐼)          Point code set of image 𝐼.
𝑃(𝑏𝑖𝑗(𝑙))     Point code set of 𝑏𝑖𝑗(𝑙).
𝑆             Size of image 𝐼.
𝑁             Depth of the texture block coding tree.
𝑀𝑐(𝑙)         Coarse-grained texture matrix at level 𝑙.
𝑀𝑓(𝑙)         Fine-grained texture matrix at level 𝑙.
𝛿             Grain-size coefficient.
𝑇             Multi-level block matrix representation model.
𝑀𝑏𝑏′          Bidirectional matching matrix of blocks 𝑏 and 𝑏′.

Fig. 2. Texture block coding tree.
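The path code produced by walking the quadtree of Fig. 2 can also be computed directly, without building the tree, by interleaving the bits of the row and column coordinates. This sketch assumes, as in our reading of the coding rule, that the first bit of each level's code pair selects the vertical half and the second bit the horizontal half:

```python
def point_code(i, j, S):
    """Direct computation of the 2*log2(S)-bit TBC point code of pixel
    (i, j) in an S x S image (S a power of two). The first 2*l bits of the
    result are the code of the level-l block containing the point, which
    is the prefix property used for block membership tests."""
    levels = S.bit_length() - 1          # log2(S)
    bits = []
    for l in range(levels - 1, -1, -1):
        bits.append(str((i >> l) & 1))   # vertical half at this level
        bits.append(str((j >> l) & 1))   # horizontal half at this level
    return "".join(bits)
```

For example, point_code(2, 1, 4) yields ''1001'', and truncating to the first two bits gives ''10'', the level-1 block (the bottom-left quadrant) containing the point.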

The parameter 𝐿 is initialized to 0, indicating the starting point at the root node, and the initial block 𝑟 is the complete image. Every path from the root node to a child represents a block, and the encodings of the nodes on the path constitute the block code 𝑏𝑐, named the TBC-Code. In particular, when the terminal node is a leaf node, the block code represents a point code 𝑝𝑐.

As shown in Fig. 3, for an image 𝐼 of size 𝑆 × 𝑆, each point of 𝐼 is converted to a binary point code 𝑝𝑐 of length 2 log 𝑆. Therefore, image 𝐼 is represented by a point code set 𝑃(𝐼) = {𝑝𝑐 | 𝑝𝑐 ∈ [00⋯0, 11⋯1]}, where both bounds are codes of 2 log 𝑆 bits. The length of an 𝑁th-level block code is 2𝑁, and the code of a block at level 𝑘 is called the relative code of level 𝑘; these codes are shown in brackets in Fig. 3. Thus, the definition of an image block based on the TBC-Tree is as follows.

Definition 1. For a block divided from image 𝐼 of size 𝑆 × 𝑆 at level 𝑙, the divided block 𝑏𝑖𝑗(𝑙) is represented by the point code set 𝑃(𝑏𝑖𝑗(𝑙)) = {𝑝𝑐 | SubLeft(𝑝𝑐, 1, 2𝑙) = IBC(𝐼𝑖𝑗, 0, 𝐼, 𝑙), 𝑝𝑐 ∈ 𝑃(𝐼)}, where 𝑙 is the block division level, ranging from 0 to log 𝑆, SubLeft(𝑠, 𝑖, 𝑗) is a function that returns the sub binary bits from position 𝑖 to 𝑗 of code 𝑠, and IBC(𝐼𝑖𝑗, 0, 𝐼, 𝑙) is a function that returns the code of the block whose upper left point is located at (𝑖, 𝑗).

Definition 2. For an 𝑆 × 𝑆 binary gray texture image 𝐼, the Coarse-grained Texture Matrix (CM, denoted 𝑀𝑐(𝑙)) is a 2𝑙 × 2𝑙 binary matrix whose element 𝑞𝑖𝑗(𝑙) represents the presence (𝑞𝑖𝑗(𝑙) = 1) or absence (𝑞𝑖𝑗(𝑙) = 0) of texture points in the block 𝑏𝑖𝑗(𝑙) of image 𝐼. In particular, when the division level 𝑙 equals the given grain-size coefficient 𝛿, the matrix 𝑀𝑐(𝛿) is called the Coarse-grained Matching Matrix (CMM). The 𝛿 determines the grain size: the larger the 𝛿, the finer the CMM.

For each element 𝑞𝑖𝑗(𝑙) of the CM 𝑀𝑐(𝑙), a fine-grained texture matrix is defined to precisely describe each block's local texture features.

Definition 3. For an 𝑆 × 𝑆 binary gray texture image 𝐼, the Fine-grained Texture Matrix (FM, denoted 𝑀𝑓(𝑙)) is a 2𝑙 × 2𝑙 matrix whose element is a block 𝑏𝑖𝑗(𝑙). In particular, when the division level 𝑙 equals the given grain-size coefficient 𝛿, the matrix 𝑀𝑓(𝛿) is called the Fine-grained Matching Matrix (FMM).

With the definitions above, an 𝑆 × 𝑆 binary gray texture image 𝐼 is described as a Multi-Level Block Matrix Representation Model (MBM).

Definition 4. For an 𝑆 × 𝑆 binary gray texture image 𝐼, a Multi-level Block Matrix Representation Model (MBM) is a system 𝑇 = {𝑆𝑐, 𝑀𝑐(𝛿), 𝑀𝑓(𝛿), 𝛿}, where 𝑆𝑐 = {𝑀𝑐(0), 𝑀𝑐(1), 𝑀𝑐(2), …, 𝑀𝑐(log 𝑆)} is the CM set used for indexing and searching, 𝑀𝑐(𝛿) ∈ 𝑆𝑐 is the CMM, 𝑀𝑓(𝛿) is the FMM, and 𝛿 ∈ {0, 1, 2, …, log 𝑆} is the grain-size coefficient. The grain-size coefficient 𝛿 reflects the size of the divided regions, which determines the fineness of 𝑀𝑐(𝛿) and 𝑀𝑓(𝛿).

4. Similarity matching

4.1. Framework of matching method

According to Definitions 1–4, the preprocessed texture images are described as MBMs. As shown in Fig. 4, the MBM matching process consists of two steps: coarse-grained matching and fine-grained matching. First, the coarse-grained matching method is applied to calculate the coarse-grained similarity between two MBMs. The fine-grained matching then utilizes all matched blocks resulting from coarse-grained matching for exact texture similarity computation.

Meanwhile, the FMM contains less pixel information in each divided block. A high 𝛿 value retains more texture details in the coarse-grained


Fig. 3. Iterative block division.

Fig. 4. Overview of similarity matching.

matching phase, which reduces the allowable texture offset and improves the matching accuracy, while a low 𝛿 makes the fine-grained matching process more discriminative. Therefore, better matching results can be obtained by choosing an appropriate 𝛿, which will be discussed in detail in Section 6.

4.2. Coarse-grained matching

The coarse-grained matching of two MBMs is in fact the intersection of their CMM blocks containing textures. With the following definitions, the coarse-grained similarity between two MBMs can be calculated.

Definition 5. For two CMMs 𝑀𝑐(𝛿) and 𝑀𝑐′(𝛿) of MBMs 𝑇 and 𝑇′, their elements 𝑞𝑖𝑗 and 𝑞𝑖𝑗′ are coarse-grained matched if 𝑞𝑖𝑗 = 1 and 𝑞𝑖𝑗′ = 1.

Definition 6. For two FMMs 𝑀𝑓(𝛿) and 𝑀𝑓′(𝛿) of MBMs 𝑇 and 𝑇′, their elements 𝑏𝑖𝑗(𝑙) and 𝑏′𝑖𝑗(𝑙) are matching blocks if the corresponding elements 𝑞𝑖𝑗 and 𝑞𝑖𝑗′ in 𝑀𝑐(𝛿) and 𝑀𝑐′(𝛿) are matched.

The coarse-grained matching process is shown in Fig. 5. Given a coefficient 𝛿, the CMM 𝑀𝑐(𝛿) is generated. As shown in Fig. 5, the white regions represent elements of 𝑀𝑐(𝛿) with value 1, black regions are elements with value 0, and red regions represent the matching blocks.

Definition 7. The coarse-grained similarity between two MBMs 𝑇 = {𝑆𝑐, 𝑀𝑐(𝛿), 𝑀𝑓(𝛿), 𝛿} and 𝑇′ = {𝑆𝑐′, 𝑀𝑐′(𝛿), 𝑀𝑓′(𝛿), 𝛿} is calculated as below, where 𝑞 and 𝑞′ represent the elements of 𝑀𝑐(𝛿) and 𝑀𝑐′(𝛿):

𝑐𝑠(𝑇, 𝑇′) = ∑_{𝑖=0,𝑗=0}^{𝑖=2^𝛿, 𝑗=2^𝛿} 𝑞𝑖𝑗 ⋅ 𝑞′𝑖𝑗 / (𝑞𝑖𝑗 + 𝑞′𝑖𝑗).  (1)

Definition 7 gives the calculation mode of the coarse-grained similarity between two MBMs; it is computed from the proportion of matching blocks. Coarse-grained similarity matching ignores local texture details and focuses on the global distribution of texture, which makes texture image matching more efficient and flexible. However, a coarse-grained similarity description alone cannot accurately represent the similarity between texture images, and it is difficult to obtain good discrimination in similarity image retrieval. Therefore, a fine-grained similarity matching method is proposed to describe the similarity of local texture details.

4.3. Fine-grained matching

The fine-grained matching is a similarity measurement of the texture details in each matching block. Generally, points 𝑝 and 𝑝′ are matching points if their coordinates are identical or adjacent, namely ‖𝑝 − 𝑝′‖₂ ≤ √2. The similarity between two textures is calculated from the proportion of matching points among all points. The similarity


Fig. 5. Coarse-grained similarity matching process.

between images is determined by texture similarities. These definitions are given as follows.

Definition 8. Points 𝑝(𝑥, 𝑦) and 𝑝′(𝑥′, 𝑦′) are Identical Matching (IA) if |𝑥 − 𝑥′| = 0 and |𝑦 − 𝑦′| = 0.

Definition 9. Points 𝑝(𝑥, 𝑦) and 𝑝′(𝑥′, 𝑦′) are Row Adjacent Matching (RA) if |𝑥 − 𝑥′| = 1 and |𝑦 − 𝑦′| = 0.

Definition 10. Points 𝑝(𝑥, 𝑦) and 𝑝′(𝑥′, 𝑦′) are Column Adjacent Matching (CA) if |𝑥 − 𝑥′| = 0 and |𝑦 − 𝑦′| = 1.

Definition 11. Points 𝑝(𝑥, 𝑦) and 𝑝′(𝑥′, 𝑦′) are Angle Adjacent Matching (AA) if |𝑥 − 𝑥′| = 1 and |𝑦 − 𝑦′| = 1.

In our model, points are described as sequences of binary code. Apparently, when the locations are identical, the encodings are the same. The condition for adjacent points is discussed in detail as follows. As shown in Fig. 6, (a)–(c) respectively represent the three kinds of adjacent points. The white blocks are adjacent points and the numbers represent their relative codes in the upper-level blocks. By applying the XOR operation to these adjacent code pairs, it is found that different adjacent types correspond to different XOR values, as shown in Fig. 6, where RA corresponds to ''01'', CA corresponds to ''10'' and AA corresponds to ''11''. Therefore, we have the following theorems.

Theorem 1. For any adjacent blocks (or points) 𝑏𝑖𝑗(𝑙) and 𝑏′𝑖𝑗(𝑙), their upper-level blocks 𝑏𝑖𝑗(𝑙 − 1) and 𝑏′𝑖𝑗(𝑙 − 1) are either still adjacent blocks with the same adjacent type, or identical.

Proof. For the RA case, there is necessarily at least one point pair 𝑝(𝑥, 𝑦) and 𝑝′(𝑥′, 𝑦′) in 𝑏𝑖𝑗(𝑙 − 1) and 𝑏′𝑖𝑗(𝑙 − 1) which satisfies |𝑥 − 𝑥′| = 1 and |𝑦 − 𝑦′| = 0. Hence, 𝑏𝑖𝑗(𝑙 − 1) and 𝑏′𝑖𝑗(𝑙 − 1) are identical or adjacent. In the adjacent case, assuming 𝑏𝑖𝑗(𝑙 − 1) and 𝑏′𝑖𝑗(𝑙 − 1) are CA or AA, |𝑦 − 𝑦′| = 0 would always be false, so the assumption is invalid. Therefore 𝑏𝑖𝑗(𝑙 − 1) and 𝑏′𝑖𝑗(𝑙 − 1) are identical or RA. The cases of CA and AA are proved similarly.

Theorem 2. For any adjacent blocks (or points) 𝑏𝑖𝑗(𝑁) and 𝑏′𝑖′𝑗′(𝑁), their next-level blocks 𝑏𝑢𝑣(𝑁 + 1) and 𝑏′𝑢′𝑣′(𝑁 + 1) are adjacent if their XOR result of relative codes is the same as that of 𝑏𝑖𝑗(𝑁) and 𝑏′𝑖′𝑗′(𝑁).

Proof. Assume that 𝑏𝑢𝑣(𝑁 + 1) and 𝑏′𝑢′𝑣′(𝑁 + 1) are adjacent while the XOR result of their relative codes differs from that of their upper-level blocks 𝑏𝑖𝑗(𝑁) and 𝑏′𝑖′𝑗′(𝑁). Then their adjacent types are not the same, which contradicts Theorem 1. Therefore the assumption is invalid, and the theorem is proved.

According to these theorems, for any two nodes 𝑑 and 𝑑′ in the TBC-Tree, their corresponding blocks (or points) 𝑏𝑖𝑗(𝑙) and 𝑏′𝑖𝑗(𝑙) are matched if either of the following two conditions is satisfied: the XOR result of nodes 𝑑 and 𝑑′ equals that of their parent nodes; or the XOR result of their parent nodes is ''00''. Therefore, the similarity of point codes can be calculated by counting the depth of the matching nodes. The details are shown in the Code Similarity Matching (CSM) algorithm.

In this phase, the TBC-Code bits from 1 to 2𝛿 are identical because the points lie in the same matching block. Fig. 7 is an example of CA points in a 256 × 256 image; the former part of each code is the matching block code, determined by the coarse-grained matching phase, while only the relative code takes effect in fine-grained matching. Thus, to enhance the discrimination of the fine-grained similarity, we utilize the relative code to calculate the block similarity.

To measure the similarity of two blocks, the primal problem is point pair matching. In this paper, we use a bidirectional matching rule to generate the matching point pair set. For two blocks 𝑏(𝑙) and 𝑏′(𝑙), 𝑀𝑏𝑏′ = [𝑄1, 𝑄2, 𝑄3, …, 𝑄𝑛] is an 𝑛′ × 𝑛 matrix, where 𝑛 and 𝑛′ are the numbers of points in 𝑏 and 𝑏′. The vector 𝑄𝑖 = [𝑞1𝑖, 𝑞2𝑖, 𝑞3𝑖, …, 𝑞𝑛′𝑖]ᵀ records the numbers of matched bits between point 𝑝𝑖𝑐(𝑏) in 𝑏 and each point in 𝑏′, namely 𝑞𝑖𝑗 = 𝐶𝑆𝑀(𝑝𝑖𝑐(𝑏), 𝑝𝑗𝑐(𝑏′)). We define the matrix 𝑀𝑏′𝑏 as the transpose of 𝑀𝑏𝑏′, namely 𝑀𝑏′𝑏 = 𝑀𝑇𝑏𝑏′ = [𝑄′1, 𝑄′2, 𝑄′3, …, 𝑄′𝑛′]. Therefore, for an 𝑆 × 𝑆 binary gray texture image, the block similarity is calculated as follows:

𝑏𝑠(𝑏, 𝑏′) = ∑_{𝑖=1,𝑘=1}^{𝑖=𝑛, 𝑘=𝑛′} 𝑆𝑖𝑚(𝑀𝑎𝑥(𝑄𝑖), 𝑀𝑎𝑥(𝑄′𝑘)) / ((2 log 𝑆 − 2𝛿) 𝑀𝑎𝑥(𝑛, 𝑛′)).  (2)

In formula (2), 𝑀𝑎𝑥(𝑄) returns the maximal element of vector 𝑄, and 𝑆𝑖𝑚(𝑞𝑖𝑗, 𝑞𝑘𝑚) is defined as formula (3):

𝑆𝑖𝑚(𝑞𝑖𝑗, 𝑞𝑘𝑚) = 𝑞𝑖𝑗 − 2𝛿 if 𝑗 = 𝑘 and 𝑚 = 𝑖; 0 otherwise.  (3)
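The fine-grained matching above can be sketched as follows. Here csm implements our reading of the matching rule derived from Theorems 1 and 2 (a level matches if its XOR value repeats the previous level's XOR, or the previous XOR was ''00''), and block_similarity follows formulas (2) and (3) by counting only mutually best-matching point pairs (the bidirectional rule). All function names are ours; this is a sketch rather than the paper's implementation:

```python
import math

def csm(pc_a, pc_b):
    """Code Similarity Matching sketch: walk two TBC point codes two bits
    (one level) at a time and count matched bits from the root until the
    matching chain breaks. Identical codes match in full; adjacent points
    keep matching because their XOR value propagates down the levels."""
    matched, prev_xor = 0, 0                 # root blocks are identical
    for k in range(0, len(pc_a), 2):
        x = int(pc_a[k:k + 2], 2) ^ int(pc_b[k:k + 2], 2)
        if x == prev_xor or prev_xor == 0:
            matched += 2
            prev_xor = x
        else:
            break
    return matched

def block_similarity(codes_a, codes_b, S, delta):
    """Sketch of formulas (2)-(3): bidirectional matching over the point
    codes of two matched blocks. A pair contributes only if each point is
    the other's best match; its score is the matched bit count minus the
    2*delta bits shared by construction, normalized by the maximum
    achievable score."""
    n, n2 = len(codes_a), len(codes_b)
    if n == 0 or n2 == 0:
        return 0.0
    M = [[csm(a, b) for b in codes_b] for a in codes_a]
    total = 0
    for i in range(n):
        j = max(range(n2), key=lambda c: M[i][c])      # best match of i
        if max(range(n), key=lambda r: M[r][j]) == i:  # i is best of j too
            total += M[i][j] - 2 * delta
    return total / ((2 * math.log2(S) - 2 * delta) * max(n, n2))
```

Because the matching chain only ever breaks once, csm runs in time linear in the code length, which is where the bit-manipulation efficiency claimed in the introduction comes from.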


Fig. 6. Three types of adjacent regions.

Fig. 7. Code matching example of CA.

With the definitions above, the definition of the fine-grained similarity is given as follows.

Definition 12. The fine-grained similarity between 𝑇 = {𝑆𝑐, 𝐶𝛿, 𝐹𝛿, 𝛿} and 𝑇′ = {𝑆𝑐′, 𝐶𝛿′, 𝐹𝛿′, 𝛿} is calculated as:

𝑓𝑠(𝑇, 𝑇′) = ∑_{𝑖=0,𝑗=0}^{𝑖=2^𝛿, 𝑗=2^𝛿} 𝑞𝑐𝑖𝑗 ⋅ 𝑞𝑐′𝑖𝑗 ⋅ 𝑏𝑠(𝑞𝑓𝑖𝑗, 𝑞𝑓′𝑖𝑗) / 𝑛,  (4)

where 𝑓𝑠 ∈ [0, 1], 𝑞𝑐 and 𝑞𝑐′ represent the elements of 𝐶𝛿 and 𝐶𝛿′, and 𝑞𝑓 and 𝑞𝑓′ represent the elements of 𝐹𝛿 and 𝐹𝛿′. 𝑛 represents the number of matching blocks, which is defined as:

𝑛 = ∑_{𝑖=0,𝑗=0}^{𝑖=2^𝛿, 𝑗=2^𝛿} 𝑞𝑐𝑖𝑗 ⋅ 𝑞𝑐′𝑖𝑗.  (5)

Fig. 8 shows the result of fine-grained matching. The greater the gray value, the higher the block similarity.

Fig. 8. Fine-grained similarity matching example.

5. Index structure

Based on the TBC-Tree, we design a multi-level index structure for fast matching and retrieval, as shown in Fig. 9. Each level 𝑘 in the index structure corresponds to a group of index tables 𝐺𝑘 = {𝐺𝑘(1), 𝐺𝑘(2), …, 𝐺𝑘(𝑛)}, which records the 𝑀𝑐(𝑘) of all images at this level. In level 𝑘, 𝐺𝑘(𝑖) is pointed to by an upper-level index item, and only the deeper index tables with the largest coarse-grained similarity are retrieved. This is a gradually subdividing process that reduces the number of index tables which must be traversed during retrieval. Moreover, for each retrieved table, the number of items matched in 𝐺𝑘(𝑖) will not exceed the upper bound of 4^𝑘, according to Definition 1. For a query MBM 𝑇 = {𝑆𝑐, 𝐶𝛿, 𝐹𝛿, 𝛿}, each element 𝑀𝑐(𝑘) of 𝑆𝑐 is sequentially matched against the index tables from level 0 to 𝛿 by the coarse-grained matching method. The output of this matching process for 𝑆𝑐 is a candidate result dataset, which is then utilized in the fine-grained matching process.

Notably, TF-IDF [28] is widely adopted in text and image retrieval methods. In our model, every divided block code is unique within an image, so only the IDF is taken into account to reflect discrimination. Therefore, the coarse-grained and fine-grained similarities are redefined to enhance the importance of abnormal regions by a slight modification: each sum term is multiplied by an IDF weight.

6. Results

The experimental algorithms in this paper are developed in C#.NET and run on a TOSHIBA PC. The specific environment configuration is as follows: memory, 4 GB; CPU, Intel(R) Core(TM) i5-4200; operating system, Microsoft Windows 7. Our experimental data is a total of 20 000 images, all normalized to 256 × 256 gray texture images. In our experiments, we test the tolerance, discrimination, running time and F-score, and we compare our method with the state-of-the-art methods including LBP [15], ULGR-index [16], VocTree+SIFT [22] and KLT [25].

6.1. Tolerance analysis

In the process of medical image generation, minor deformations usually occur due to different scanning times or different devices. Because of these normal changes, the corresponding texture images are of slight

Fig. 9. Multi-level index structure.

differences (i.e. splitting of long edges, offset of texture points, and so on). Therefore, these small texture deformations should be ignored in order to improve retrieval quality. High tolerance is a guarantee of high recall.

In this test, we use 3000 images to validate the tolerance of minor normal deformations, of which 2000 are interfering images and the other 1000 are from 60 different patients. For each patient, we randomly sample images with slight shifts into the same group. Thus, we have 60 groups and one interfering group in our test dataset. We consider a retrieval result positive if it is in the same group as the query image, and negative otherwise. The proportion of positive cases is utilized to evaluate the tolerance of our method. We conducted 150 queries and compared the results with other methods.

As shown in Fig. 10, compared with the other algorithms, TBC-Tree and VocTree have relatively high tolerance, with maximum values of 0.92 and 0.95, while those of the other three methods (KLT, ULGR and LBP) are 0.68, 0.62 and 0.4. With respect to stability, VocTree has the largest fluctuation range, with a minimum value of 0.22, while the minima of the other four methods (TBC-Tree, KLT, ULGR and LBP) are 0.6, 0.3, 0.14 and 0.08. Meanwhile, the upper and lower quartiles of these methods are also calculated: those of TBC-Tree are 0.8 and 0.68, VocTree 0.77 and 0.36, KLT 0.55 and 0.45, ULGR 0.4 and 0.2, and LBP 0.36 and 0.26.

Fig. 10. Tolerance in slight normal deformation.

VocTree+SIFT has a high tolerance compared to the other methods, owing to the invariance of SIFT. However, its large scale of mismatched points leads to deficient stability. LBP also has some invariance, but it cannot achieve the desired results due to its excessive sensitivity to small variations in texture. In the KLT and ULGR models, the similarity matching methods strongly depend on texture coordinates, so subtle texture offsets seriously affect their matching results. In the TBC-Tree model, the block based coarse-grained matching avoids the matching error caused by fine texture deviation, which ensures that the TBC-Tree model attains preferable tolerance and stability.

6.2. Discrimination comparison

In CBIR systems, the discrimination reflects sensitivity to deformations in important regions. Especially in medical image retrieval, any slight abnormal change should be identified. Better discrimination contributes to higher precision. To evaluate the discrimination of our method, we randomly select 500 normal images from the dataset. For each selected image, we simulate an abnormal copy by randomly adding an abnormal tissue (i.e. tumor, clot and so on); see Fig. 11.

𝐷𝑖𝑠𝑐𝑟𝑖𝑚𝑖𝑛𝑎𝑡𝑖𝑜𝑛 = |𝑆(𝑂𝑟𝑖𝑔𝑖𝑛𝑎𝑙) Δ 𝑆(𝑆𝑖𝑚𝑢𝑙𝑎𝑡𝑒)| / |𝑆(𝑂𝑟𝑖𝑔𝑖𝑛𝑎𝑙) ∪ 𝑆(𝑆𝑖𝑚𝑢𝑙𝑎𝑡𝑒)|.  (6)

The discrimination is defined as formula (6), where 𝑆(𝑂𝑟𝑖𝑔𝑖𝑛𝑎𝑙) is the relevant result set of the query with the original image and 𝑆(𝑆𝑖𝑚𝑢𝑙𝑎𝑡𝑒) is the relevant result set of the simulated image. We select the top-20 as relevant results. As shown in Fig. 12, the discrimination of the five methods (TBC-Tree, VocTree, KLT, ULGR, LBP) is 0.54, 0.58, 0.28, 0.85 and 0.17. ULGR-index shows the highest discrimination among the methods, since its coordinate based matching algorithm is sensitive to position changes of texture points. TBC-Tree also keeps the discrimination up to


0.5, owing to the weight improvement of the ROI regions by IDF, as the red block in Fig. 12 shows.

Fig. 11. Examples of simulated images.

Fig. 12. The discrimination comparison results for 5 methods.

In our method, each matching block in the coarse-grained matching phase corresponds to a fine-grained matching result, and there is limited discrimination of fine-grained matching within a matching block. Therefore, TBC-Tree achieves high tolerance at the expense of some discrimination.

6.3. Retrieval performance

The F-score is utilized to validate the retrieval performance of our method. It is defined as formula (7), where 𝑝 and 𝑟 represent the precision and recall:

𝐹-𝑆𝑐𝑜𝑟𝑒 = 2 ⋅ 𝑝 ⋅ 𝑟 / (𝑝 + 𝑟).  (7)

In this experiment, 3000 labeled brain CT images are utilized. The result is shown in Fig. 13. From 𝛿 = 1 to 𝛿 = 8, the recall of TBC-Tree decreases from 0.95 to 0.12 and the precision rises from 0.11 to 0.82. The F-Score achieves its maximum value of 0.64 when 𝛿 = 5.

Fig. 13. F-Score in different coarse-size coefficient 𝛿.

Given a grain-size coefficient 𝛿, the depth of the index is determined. The higher the 𝛿, the deeper the index level, and a deeper index level means a narrower candidate dataset for fine-grained matching. When 𝛿 increases to 8, the index items in the level-8 index tables reduce to pixel distributions, so a high precision is kept. However, this also means the recall decreases to its lowest value, since large numbers of images are excluded by coarse-grained matching during the multi-level index process. When 𝛿 is in the range of 5–6, both precision and recall are maintained at a high level and the F-Score keeps a high value, from 0.6 to 0.7.

Fig. 14 shows the comparison of retrieval performance of these 5 methods. The precision, recall and F-Score of TBC-Tree are 0.55, 0.8 and 0.65; of VocTree, 0.68, 0.37 and 0.48; of KLT, 0.42, 0.12 and 0.19; of ULGR, 0.82, 0.21 and 0.33; and of LBP, 0.22, 0.18 and 0.2. Compared to the other methods, TBC-Tree, VocTree+SIFT and ULGR-index have relatively higher precision, with ULGR-index the highest at 0.82. However, due to its excessive sensitivity, the recall of ULGR-index is at a relatively low level, while our method keeps the precision above 0.5 and shows the highest recall.

Fig. 14. Comparison of retrieval performance in 5 methods.

6.4. Running time

This test is conducted on the total dataset of 20 000 images. As shown in Fig. 15, three curves represent the coarse-grained matching time, the fine-grained matching time and the total time of our method. With the increase of the data scale, the coarse-grained matching time grows faster than the fine-grained matching time. However, in our index structure, the search

range becomes narrower as the level gets deeper. Although the fine-grained matching is time consuming, its growth rate decreases thanks to the optimized candidate dataset obtained by the index process. Compared with other methods, TBC-Tree also shows high efficiency, as shown in Table 2.

Fig. 15. Time cost in different data scale.

Table 2
The comparison of time consumption of 5 methods.

Methods         Time (s)
ULGR-index      74.6
VocTree+SIFT    17.7
KLT             921
LBP             4.9
TBC-Tree        6.2

[2] A.W.M. Smeulders, M. Worring, S. Santini, et al., Content-based image retrieval at the end of the early years, IEEE Trans. Pattern Anal. Mach. Intell. 22 (12) (2000) 1349–1380.
[3] F. Long, H. Zhang, D.D. Feng, Fundamentals of content-based image retrieval, in: Multimedia Information Retrieval and Management, 2003, pp. 1–26.
[4] Y. Liu, D. Zhang, G. Lu, et al., A survey of content-based image retrieval with high-level semantics, Pattern Recognit. 40 (1) (2007) 262–282.
[5] R. Datta, D. Joshi, J. Li, et al., Image retrieval: Ideas, influences, and trends of the new age, ACM Comput. Surv. 40 (2) (2008).
[6] J. Wang, Y. Li, Y. Zhang, et al., Bag-of-features based medical image retrieval via multiple assignment and visual words weighting, IEEE Trans. Med. Imaging 30 (2011) 1996–2011.
[7] A.T.D. Silva, A.X. Falcão, L.P. Magalhães, Active learning paradigms for CBIR systems based on optimum-path forest classification, Pattern Recognit. 44 (12) (2011) 2971–2978.
[8] E. Sokic, S. Konjicija, Phase preserving Fourier descriptor for shape-based image retrieval, Signal Process., Image Commun. 40 (2016) 82–96.
[9] T.D. Mascio, D. Frigioni, L. Tarantino, VISTO: A new CBIR system for vector images, Inf. Syst. 35 (7) (2010) 709–734.
[10] G. Hu, Q. Gao, An interactive image feature visualization system for supporting CBIR study, in: Image Analysis and Recognition, International Conference, ICIAR 2009, Halifax, Canada, July 6–8, 2009, Proceedings, 2009, pp. 133–140.
[11] H.B. Kekre, D. Mishra, CBIR using upper six FFT sectors of color images for feature vector generation, Int. J. Eng. Technol. 2 (2) (2010).
[12] P.V.N. Reddy, K.S. Prasad, Color and texture features for content based image retrieval, Int. J. Comput. Technol. Appl. 4 (04) (2011) 146–149.
[13] W. Xu, W. Jin, X. Liu, et al., Application of image SIFT features to the context of CBIR, in: International Conference on Computer Science and Software Engineering, IEEE Computer Society, 2008, pp. 552–555.
[14] L. Wang, H. Wang, Improving feature matching strategies for efficient image retrieval, Signal Process., Image Commun. 53 (2017) 86–94.
[15] D. Unay, A. Ekin, R.S. Jasinschi, Local structure-based region-of-interest retrieval in brain MR images, IEEE Trans. Inf. Technol. Biomed. 14 (4) (2010) 897–903.
[16] P. Li, H. Pan, J. Li, et al., A novel model for medical image similarity retrieval,
7. Conclusion in: International Conference on Web-Age Information Management, Springer-
Verlag, 2013, pp. 595–606.
[17] X. Wang, M. Yang, T. Cour, et al. Contextual weighting for vocabulary tree based
In this paper, a new medical retrieval method based on TBC-Tree is
image retrieval, in: International Conference on Computer Vision, 2011, pp. 209–
proposed. A good tradeoff is achieved between tolerance and discrim-
216.
ination in medical image retrieval. Our method solves the problem of [18] D.G. Lowe, Object recognition from local scale-invariant features, in: The Proceed-
slight texture shift and the split of long edges in texture feature similarity ings of the Seventh IEEE International Conference on Computer Vision, 1999, Vol.
measurement. With the multi-level index structure, it also keeps a high 2, IEEE, 1999, pp. 1150–1157.
efficiency and retrieval performance. [19] D.G. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, Int. J. Com-
In our future work, how to apply our model to other areas of image put. Vis. 60 (60) (2004) 91–110.
[20] J. Sivic, A. Zisserman, Video Google: A text retrieval approach to object matching
retrieval problems is one of our next key research content. In addition,
in videos, in: IEEE International Conference on Computer Vision, Vol. 1470, IEEE
we consider combining TBC-tree with the state-of-art deep learning Computer Society, 2003.
theory to further explore new approaches of image classification. [21] M. Jiang, S. Zhang, H. Li, et al., Computer-aided diagnosis of mammographic masses
using scalable image retrieval, IEEE Trans. Bio-Med. Eng. 62 (2) (2015) 783–792.
Acknowledgments [22] S. Murala, Q.M.J. Wu, MRI and CT image indexing and retrieval using local mesh
peak valley edge patterns, Signal Process. Image Commun. 29 (3) (2013) 400–409.
[23] A.K. Tiwari, V. Kanhangad, R.B. Pachori, Histogram refinement for texture descriptor
The paper is supported by the National Natural Science Foundation
based image retrieval, Signal Processing Image Communication 53 (2017) 73–85.
of China under Grant No. 61672181, No. 51679058, Natural Science
[24] D. Markonis, M. Holzer, F. Baroz, et al., User-oriented evaluation of a medical image
Foundation of Heilongjiang Province under Grant No. F2016005. And retrieval system for radiologists, Int. J. Med. Inf. 84 (10) (2015) 774–783.
thanks to China Scholarship Council (CSC) for funding our work (No. [25] J. Shi, C. Tomasi, Good Features to Track. Volume 84(9), 1994, 593–600.
201706680067). [26] S. Kamdi, R.K. Krishna, Image segmentation and region growing algorithm, Int. J.
Comput. Technol. Electron. Eng. 1 (2) (2012).
References [27] W. Li, H. Pan, X. Xie, et al., Simple and robust ideal mid-sagittal line (iML) extraction
method for brain CT images, in: IEEE, International Conference on Bioinformatics
and Bioengineering, IEEE Computer Society, 2016, pp. 266–273.
[1] R. Yong, T.S. Huang, S.F. Chang, Image retrieval: Current techniques, promising
directions, and open issues, J. Vis. Commun. Image Represent. 10 (1) (1999) 39–62. [28] G. Salton, C. Buckley, C. Buckley, Term-weighting approaches in automatic text
retrieval, Inf. Process. Manag. 24 (5) (1988) 513–523.
