ABSTRACT
Retrieving a 3D model from a 3D database and simultaneously augmenting the retrieved model in an Augmented Reality (AR) system has become an issue in developing plausible AR environments in a convenient fashion. Sketch-based 3D object retrieval is considered an intuitive way of searching for 3D objects using human-drawn sketches as queries. In this paper, we propose a novel deep learning based approach for retrieving a sketch-based 3D object as an Augmented Reality model. For this work, we introduce a new method which uses a Sketch CNN, a Wasserstein CNN, and a Wasserstein center loss for retrieving a sketch-based 3D object. In particular, the Wasserstein center loss is used for learning the center of each object category and reducing the Wasserstein distance between the center and the features of the same category. The proposed 3D object retrieval and augmentation consist of three major steps. First, the Wasserstein CNN extracts features from 2D images taken from various directions of the 3D object using a CNN, and extracts the features of the 3D data by computing the Wasserstein barycenters of the features of each image. Second, the features of the sketch are extracted using a separate Sketch CNN. Finally, we adopt a sketch-based object matching method to localize the natural marker in the images in order to register a 3D virtual object in the AR system. Using the detected marker, the retrieved 3D virtual object is augmented in the AR system automatically. Through experiments, we show that the proposed method is efficient for retrieving and augmenting objects.
☞ keyword : Convolutional Neural Network, object retrieval, Deep Learning, Sketch-based 3D object retrieval, Augmented Reality, Wasserstein distance, Wasserstein center

In section 3, the Wasserstein distance and the Wasserstein center are introduced, along with the structure of the Wasserstein CNN and sketch-based 3D augmentation. In section 4, the experimental results of the proposed method are included. Finally, in section 5, concluding remarks and future work are discussed.

2. Related Works

The study of sketch-based 3D object retrieval has become a major issue in the field of content-based model retrieval. However, the difficulty in using sketches for retrieving a 3D object is that the sketch of an object is not uniquely defined; it depends on the person's subjective view. For this reason, 2D sketches of the same 3D object can be drawn in many different fashions.

In studies of the 2D projection of 3D objects, a composite descriptor called ZFEC, which includes a local region-based Zernike moment, a boundary-based Fourier descriptor, and eccentricity and roundness features, was introduced[2]. In another study, the silhouette of a 3D model is used as a 2D sketch of the model[8]. In a work on sketch-based 3D retrieval by learning features, Eitz utilized sketches and 2D projections of the 3D objects using Gabor local line-based features and a bag-of-features (BOF) histogram[1]. In addition, Furuya proposed BF-SIFT to describe sketches and 2D projections of 3D objects[9].

Recently, studies of sketch-based 3D object retrieval using CNNs (Convolutional Neural Networks) have been introduced. Wang retrieved 3D objects using two Siamese CNNs to extract the features of sketches and 3D objects[3]. In Xie's study, he extracts features from the images of the 3D object by CNN and obtains the Wasserstein center of the features to match the objects and sketches[10].

In studies of similarity measures based on loss functions, Hadsell[4] proposed the Contrastive loss and Schroff[5] proposed the Triplet loss for classifying the input data. Wen[6] introduced the Center loss for face recognition. He et al.[16] introduced the Triplet-Center loss, which combines the Triplet loss and the Center loss, for sketch-based 3D object retrieval. However, the demerit of the Contrastive loss is that learning can slow down when the pairs of data are not properly designed, and the Triplet loss conventionally needs a long training time because of the triplets of data.

Meanwhile, sketch-based image matching, which is known as a content-based retrieval[23,24,25] method that compares database images with sketch images drawn by users, is used to detect a desired object in an input image, and the detected object is used as a natural marker of AR for augmenting a virtual 3D object.

3. Proposed Method

The proposed sketch-based 3D model retrieval system described in Fig 1 consists of three major parts: two CNNs and the Wasserstein loss. The Wasserstein CNN extracts the features of the 3D models and the Sketch CNN extracts the features from the sketch, respectively. The Wasserstein center loss is used for learning both features obtained from the CNNs.
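The three-part flow above (a view CNN for the 3D model, a Sketch CNN for the query, and a shared loss on their features) can be sketched as follows. The CNN backbones are stood in for by hypothetical fixed random projections, and all dimensions (64-dim inputs, 32-dim features, 12 views) are illustrative; the point is the data flow, not the learned weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the two learned backbones (view CNN / Sketch CNN).
W_view = rng.standard_normal((64, 32))    # maps a flattened 64-dim view to 32-dim
W_sketch = rng.standard_normal((64, 32))  # maps a flattened 64-dim sketch to 32-dim

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def view_features(views):
    """One feature distribution per rendered view (12 views in the paper)."""
    return np.array([softmax(v @ W_view) for v in views])

def sketch_feature(sketch):
    return softmax(sketch @ W_sketch)

# 12 rendered views of one 3D model and one query sketch (random placeholders).
views = rng.random((12, 64))
sketch = rng.random(64)

feats = view_features(views)         # (12, 32): per-view feature distributions
model_feat = feats.mean(axis=0)      # placeholder for the Wasserstein barycenter step
query_feat = sketch_feature(sketch)  # (32,): sketch feature in the same domain

print(model_feat.shape, query_feat.shape)  # (32,) (32,)
```

In the actual system, the per-view features are aggregated with the Wasserstein barycenter of section 3.2 rather than the arithmetic mean used here, and both branches are trained with the Wasserstein center loss.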
34 2020. 2
A Sketch-based 3D Object Retrieval Approach for Augmented Reality Models Using Deep Learning
In sections 3.1 and 3.2, the Wasserstein distance and the Wasserstein center are introduced. In section 3.3, the characteristics of the Wasserstein CNN and the Sketch CNN, which extract the features of the 3D model and the features of the user's sketch, respectively, are explained. In section 3.4, the Wasserstein center loss used for sketch-based 3D model retrieval is described.

3.1 Wasserstein Distance

For two probability distributions p and q over n bins, the set of admissible transport plans is defined as

U(p, q) = \{ T \in \mathbb{R}_{+}^{n \times n} \mid T \mathbf{1}_n = p, \; T^{\top} \mathbf{1}_n = q \}   (1)

In Eq. (1), T is the transport plan and \mathbf{1}_n is a column vector in which all elements are 1. The Wasserstein distance between p and q can then be defined as follows.

W(p, q) = \min_{T \in U(p, q)} \langle T, M \rangle   (2)

In Eq. (2), M \in \mathbb{R}^{n \times n} is the pairwise distance matrix between the bins of p and q, called the ground matrix, and \langle T, M \rangle is the dot product of T and M. The Wasserstein distance is the optimal transport cost for transporting the mass of p to q. In many cases, Eq. (2) does not have a unique solution, so we use Eq. (3) [14], which adds an entropy regularization term h(T) with weight \gamma.

W_{\gamma}(p, q) = \min_{T \in U(p, q)} \langle T, M \rangle - \gamma h(T)   (3)

3.2 Wasserstein Center

The Wasserstein barycenter is the center point of a set of probability distributions, calculated using the Wasserstein distance. When the probability distribution set is \{p_1, \dots, p_N\} \subset \mathbb{R}^{n}, the barycenter \bar{p} of this set is defined as follows[11].

\bar{p} = \arg\min_{p} \frac{1}{N} \sum_{i=1}^{N} W(p, p_i)   (4)

With the entropic regularization of Eq. (3) and the kernel K = e^{-M/\gamma}, the barycenter can be computed by iterative scaling updates.

v_i^{(t+1)} = p_i \oslash (K^{\top} u_i^{(t)})   (5)

\bar{p}^{(t+1)} = \prod_{i=1}^{N} \big( u_i^{(t)} \odot K v_i^{(t+1)} \big)^{1/N}, \quad u_i^{(t+1)} = \bar{p}^{(t+1)} \oslash (K v_i^{(t+1)})   (6)

In Eq. (6), t is the iteration index of the Wasserstein center, and u_i, v_i are auxiliary scaling variables[17].

3.3 Wasserstein CNN

As shown in Fig 1, the features of the 3D model can be extracted from the rendered multi-view images of the target model using a CNN, and the Wasserstein center is then obtained from the features[10]. In the first stage, in order to extract the features of the model, 12 images are taken around the 3D model at 30-degree rotational intervals, as illustrated in Fig 2. Those images are fed into the CNN to obtain the 3D feature of the model.
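Eqs. (1)-(6) can be sketched in a few lines of NumPy. The regularization weight gamma, the 16-bin grid, and the two toy histograms below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def sinkhorn_distance(p, q, M, gamma=0.2, iters=200):
    """Entropic-regularized Wasserstein distance of Eq. (3),
    computed with Sinkhorn scaling iterations [14]."""
    K = np.exp(-M / gamma)            # Gibbs kernel built from the ground matrix M
    u = np.ones_like(p)
    for _ in range(iters):            # alternately enforce the two marginals of Eq. (1)
        v = q / (K.T @ u)
        u = p / (K @ v)
    T = u[:, None] * K * v[None, :]   # regularized optimal transport plan
    return float((T * M).sum())       # transport cost <T, M> of Eq. (2)

def wasserstein_barycenter(ps, M, gamma=0.2, iters=200):
    """Iterative scaling updates of Eqs. (5)-(6) for the barycenter of Eq. (4),
    with u, v as the auxiliary variables [11, 17]."""
    K = np.exp(-M / gamma)
    N, n = ps.shape
    u = np.ones((N, n))
    for _ in range(iters):
        v = ps / (u @ K)                                 # Eq. (5): v_i = p_i / (K^T u_i)
        p = np.prod((u * (v @ K)) ** (1.0 / N), axis=0)  # Eq. (6): geometric mean
        u = p / (v @ K)                                  # Eq. (6): u_i = p / (K v_i)
    return p / p.sum()

# Toy 1-D histograms on a 16-bin grid; M holds squared distances between bins.
x = np.arange(16)
M = (x[:, None] - x[None, :]) ** 2 / 16.0
p = np.exp(-((x - 4.0) ** 2) / 4.0); p /= p.sum()
q = np.exp(-((x - 11.0) ** 2) / 4.0); q /= q.sum()

d = sinkhorn_distance(p, q, M)                      # positive cost between p and q
bary = wasserstein_barycenter(np.stack([p, q]), M)  # mass between the two modes
```

In the proposed system the inputs `ps` would be the N = 12 per-view feature distributions of one 3D model, and `bary` would be its Wasserstein-center feature.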
The proposed Wasserstein CNN plays the role of four major parts: a CNN for extracting a feature from each view, a Wasserstein barycenter module for extracting the features of the 3D model by calculating the Wasserstein center of all the views, CNN2 for mapping the obtained 3D features to the same domain as the sketch features, and a classifier for classifying the mapped features. Fig. 3(a) shows the structure of the proposed Wasserstein CNN.

3.4 Wasserstein Center Loss

The proposed loss is based on the center loss, which has been used in the face recognition area to compensate for the Softmax loss of supervised learning. The center loss obtains the center of a class and minimizes the distance between the center and each feature to be classified. The formula of the center loss can be defined as Eq. (7).

L_C = \frac{1}{2} \sum_{i=1}^{m} \| x_i - c_{y_i} \|_2^2   (7)

In Eq. (7), x_i is the i-th feature in a mini-batch of size m, and c_{y_i} is the center of its class y_i.
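Eq. (7) is straightforward to compute. A minimal NumPy sketch follows; the feature vectors, labels, and class centers are hypothetical stand-ins (in the paper the features come from the CNNs and the centers are learned parameters):

```python
import numpy as np

def center_loss(features, labels, centers):
    """Center loss of Eq. (7): L_C = 1/2 * sum_i ||x_i - c_{y_i}||^2.

    features: (m, d) mini-batch of feature vectors x_i
    labels:   (m,)   class index y_i of each feature
    centers:  (k, d) one center per class
    """
    diff = features - centers[labels]   # x_i - c_{y_i}, gathered per sample
    return 0.5 * float((diff ** 2).sum())

# Toy batch: two classes with 2-D features; centers near the class means.
feats = np.array([[1.0, 0.0], [1.2, 0.2], [-1.0, 0.0]])
labels = np.array([0, 0, 1])
centers = np.array([[1.1, 0.1], [-1.0, 0.0]])

loss = center_loss(feats, labels, centers)  # small: features sit near their centers
```

The proposed Wasserstein center loss follows the same pattern, but, as stated in the abstract, it replaces the squared Euclidean distance to the center with the Wasserstein distance between each feature distribution and its category center.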
H(x, y) = \begin{bmatrix} I_{xx} & I_{xy} \\ I_{xy} & I_{yy} \end{bmatrix}   (11)

\det H(x, y) = I_{xx} I_{yy} - I_{xy}^{2}   (12)
In formula (11), I_{xx}, I_{yy}, and I_{xy} are the second-order derivatives of the grayscale image. Finally, the exact shape of the marker can be extracted by GrabCut, which utilizes a user-specified bounding box around the object to be segmented. GrabCut estimates the color distribution of the target object and that of the background using a Gaussian mixture model.

Fig 5 shows an example of sketch-based 3D object augmentation in the AR system.

The experiments are implemented with Python, the OpenCV library, and the PyTorch deep learning library. For the test of sketch-based 3D augmentation, a Logitech C920 PRO HD web camera is used.

(Table 1) Environments of the experiments

Resources      Description
CPU            AMD Ryzen 7 1700 3GHz
GPU            NVIDIA GeForce GTX 1080 Ti
RAM            32.00 GB
OS             Ubuntu 16.04
Language       Python 3.5
Develop Tool   Jupyter Notebook
Library        OpenCV, PyTorch
Camera         Logitech C920 PRO HD
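The second-order derivatives of formula (11) can be approximated with finite differences. Below is a small NumPy sketch of a determinant-of-Hessian response map, whose strong blob-like maxima serve as candidate marker locations; the image here is a synthetic placeholder for a captured frame:

```python
import numpy as np

def hessian_response(img):
    """Determinant-of-Hessian map built from the second-order
    derivatives I_xx, I_yy, I_xy of a grayscale image."""
    Iy, Ix = np.gradient(img.astype(float))  # first derivatives along rows, cols
    Ixy, Ixx = np.gradient(Ix)               # second derivatives of Ix
    Iyy, _ = np.gradient(Iy)                 # second derivative of Iy along rows
    return Ixx * Iyy - Ixy ** 2              # det H = I_xx * I_yy - I_xy^2

# Synthetic grayscale image with one bright blob standing in for the marker.
y, x = np.mgrid[0:33, 0:33]
img = np.exp(-((x - 16) ** 2 + (y - 16) ** 2) / 20.0)

resp = hessian_response(img)
# The strongest blob-like response sits at the blob center.
peak = np.unravel_index(np.argmax(resp), resp.shape)
```

In the pipeline described above, OpenCV's `cv2.grabCut` would then refine the marker region from a bounding box placed around the peak response.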
(Figure 8) 3D object retrieval results using TCL and the proposed WCL: (a) results using SHREC 13; (b) results using SHREC 14
5. Concluding Remarks
In this paper, we proposed a novel deep learning based approach for retrieving a sketch-based 3D object. We use two networks to extract the features of the 3D data and of the user-drawn sketch from each image with ResNet. The Wasserstein barycenters of the 2D images taken from various directions of the 3D data are evaluated from the extracted features of the 3D data. The second CNN, called 'CNN2', maps the Wasserstein barycenters of the 2D images and the sketch features to the corresponding outputs. In order to train the two networks, a Wasserstein distance loss function on the outputs is adopted. With respect to the accuracy of 3D object retrieval, the proposed method shows improved performance on both the SHREC 13 and SHREC 14 datasets. Moreover, we proposed a sketch-based object matching scheme that localizes the natural marker in the images in order to register a 3D virtual object in Augmented Reality. Using the detected sketch as a marker, the retrieved 3D object is augmented in AR automatically. From the experiments, we show that the proposed method is efficient for retrieving and augmenting objects.

References

[1] M. Eitz, R. Richter, T. Boubekeur, K. Hildebrand, and M. Alexa, "Sketch-based shape retrieval," ACM Transactions on Graphics, vol. 31, no. 4, pp. 1–10, 2012. https://doi.org/10.1145/2185520.2335382
[2] B. Li, Y. Lu, A. Godil, T. Schreck, B. Bustos, A. Ferreira, T. Furuya, M. J. Fonseca, H. Johan, T. Matsuda, R. Ohbuchi, P. B. Pascoal, and J. M. Saavedra, "A comparison of methods for sketch-based 3D shape retrieval," Computer Vision and Image Understanding, vol. 119, pp. 57–80, 2014. https://doi.org/10.1016/j.cviu.2013.11.008
[3] F. Wang, L. Kang, and Y. Li, "Sketch-based 3D shape retrieval using Convolutional Neural Networks," 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. https://doi.org/10.1109/cvpr.2015.7298797
[4] R. Hadsell, S. Chopra, and Y. LeCun, "Dimensionality Reduction by Learning an Invariant Mapping," 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), vol. 2, 2006. https://doi.org/10.1109/cvpr.2006.100
[5] F. Schroff, D. Kalenichenko, and J. Philbin, "FaceNet: A unified embedding for face recognition and clustering," 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. https://doi.org/10.1109/cvpr.2015.7298682
[6] Y. Wen, K. Zhang, Z. Li, and Y. Qiao, "A Discriminative Feature Learning Approach for Deep Face Recognition," Lecture Notes in Computer Science, pp. 499–515, 2016. https://doi.org/10.1007/978-3-319-46478-7_31
[7] A. Rolet, M. Cuturi, and G. Peyré, "Fast dictionary learning with a smoothed Wasserstein loss," International Conference on Artificial Intelligence and Statistics, Cadiz, Spain, pp. 630–638, 2016. http://www.jmlr.org/proceedings/papers/v51/rolet16.pdf
[8] B. Li, Y. Lu, A. Godil, T. Schreck, M. Aono, H. Johan, J. M. Saavedra, and S. Tashiro, "SHREC'13 track: Large scale sketch-based 3D shape retrieval," Eurographics Workshop on 3D Object Retrieval, Girona, Spain, pp. 89–96, 2013. https://dx.doi.org/10.2312/3DOR/3DOR13/089-096
[9] T. Furuya and R. Ohbuchi, "Ranking on cross-domain manifold for sketch-based 3D model retrieval," International Conference on Cyberworlds, Yokohama, Japan, pp. 274–281, 2013. https://doi.org/10.1109/cw.2013.60
[10] J. Xie, G. Dai, F. Zhu, and Y. Fang, "Learning Barycentric Representations of 3D Shapes for Sketch-Based 3D Shape Retrieval," 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. https://doi.org/10.1109/cvpr.2017.385
[11] J.-D. Benamou, G. Carlier, M. Cuturi, L. Nenna, and G. Peyré, "Iterative Bregman Projections for Regularized Transportation Problems," SIAM Journal on Scientific Computing, vol. 37, no. 2, pp. A1111–A1138, 2015. https://doi.org/10.1137/141000439
[12] V. I. Bogachev and A. V. Kolesnikov, "The Monge-Kantorovich problem: achievements, connections, and perspectives," Russian Mathematical Surveys, vol. 67, no. 5, pp. 785–890, 2012. https://doi.org/10.1070/rm2012v067n05abeh004808
[13] Y. Rubner, C. Tomasi, and L. J. Guibas, "The Earth Mover's Distance as a metric for image retrieval," International Journal of Computer Vision, vol. 40, no. 2, pp. 99–121, 2000. https://doi.org/10.1023/a:1026543900054
[14] M. Cuturi, "Sinkhorn distances: Lightspeed computation of optimal transport," Advances in Neural Information Processing Systems, Lake Tahoe, Nevada, USA, pp. 2292–2300, 2013. https://papers.nips.cc/paper/4927-sinkhorn-distances-lightspeed-computation-of-optimal-transport.pdf
[15] R. Sinkhorn, "Diagonal Equivalence to Matrices with Prescribed Row and Column Sums," The American Mathematical Monthly, vol. 74, no. 4, p. 402, 1967. https://doi.org/10.2307/2314570
[16] X. He et al., "Triplet-Center Loss for Multi-View 3D Object Retrieval," arXiv preprint arXiv:1803.06189, 2018. http://openaccess.thecvf.com/content_cvpr_2018/CameraReady/1632.pdf
[17] N. Bonneel, G. Peyré, and M. Cuturi, "Wasserstein barycentric coordinates," ACM Transactions on Graphics, vol. 35, no. 4, pp. 1–10, 2016. https://doi.org/10.1145/2897824.2925918
[18] P.-T. de Boer, D. P. Kroese, S. Mannor, and R. Y. Rubinstein, "A Tutorial on the Cross-Entropy Method," Annals of Operations Research, vol. 134, no. 1, pp. 19–67, 2005. https://doi.org/10.1007/s10479-005-5724-z
[19] L. van der Maaten and G. Hinton, "Visualizing high-dimensional data using t-SNE," Journal of Machine Learning Research, vol. 9, pp. 2579–2605, 2008. http://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf
[20] B. Li, Y. Lu, C. Li, A. Godil, T. Schreck, M. Aono, M. Burtscher, H. Fu, T. Furuya, H. Johan, J. Liu, R. Ohbuchi, A. Tatsuma, and C. Zou, "Extended large scale sketch-based 3D shape retrieval," Eurographics Workshop on 3D Object Retrieval, Strasbourg, France, pp. 121–130, 2014. http://dx.doi.org/10.2312/3dor.20141058
[21] K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. https://doi.org/10.1109/cvpr.2016.90
[22] S. Ferradans, G.-S. Xia, G. Peyré, and J.-F. Aujol, "Static and Dynamic Texture Mixing Using Optimal Transport," Scale Space and Variational Methods in Computer Vision, pp. 137–148, 2013. https://doi.org/10.1007/978-3-642-38267-3_12
[23] K. V. Shriram, P. L. K. Priyadarsini, and A. Baskar, "An intelligent system of content-based image retrieval for crime investigation," Int. J. of Advanced Intelligence Paradigms, vol. 7, no. 3/4, pp. 264–279, 2015. https://doi.org/10.1504/IJAIP.2015.073707
[24] M. Eitz, K. Hildebrand, et al., "Sketch-Based Image Retrieval: Benchmark and Bag-of-Features Descriptors," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 11, pp. 1624–1636, 2010. https://doi.org/10.1109/TVCG.2010.266
[25] L. Nanni, A. Lumini, and S. Brahnam, "Ensemble of shape descriptors for shape retrieval and classification," Int. J. of Advanced Intelligence Paradigms, vol. 6, no. 2, pp. 136–156, 2014. https://doi.org/10.1504/IJAIP.2014.062177
◐ About the Authors ◑
Myunggeun Ji (지명근)
2017 B.S. in Computer Science, Kyonggi University, Suwon, Korea
2018 M.S. in Computer Science, Kyonggi University, Suwon, Korea
2018.03~Present Researcher at Huray, Seoul, Korea
Research Interests : Computer Vision, Augmented Reality
E-mail : jmg2968@gmail.com
Junchul Chun (전준철)
1984 B.S. in Computer Science, Chung-Ang University, Seoul, Korea
1986 M.S. in Computer Science(Software Engineering), Chung-Ang University, Seoul, Korea
1992 M.S. in Computer Science and Engineering (Computer Graphics), The Univ. of Connecticut, USA
1995 Ph.D. in Computer Science and Engineering (Computer Graphics), The Univ. of Connecticut, USA
2001.02~2002.02 Visiting Scholar, Michigan State Univ. Pattern Recognition and Image Processing Lab.
2009.02~2010.02 Visiting Scholar, Univ. of Colorado, Wellness Innovation and Interaction Lab.
1995.03~present, Professor at the Department of Computer Science, Kyonggi University.
Research Interests : Augmented Reality, Computer Vision, Human Computer Interaction
E-mail : jcchun@kgu.ac.kr