Abstract—Solving any problem using machine learning requires datasets. Most datasets are labeled manually for supervised learning tasks. This paper describes a method for generating labeled data for edge detection tasks on an image. This eliminates a number of problems associated with manual data labeling.
I. INTRODUCTION
Object edges in an image are one of the main criteria used to evaluate the shape, size, and other parameters of almost any object. There are many algorithms and methods for detecting and extracting object edges in an image, each with its own advantages and disadvantages. They can be divided into two large groups: local and global. Local edge detection methods determine the presence of an edge at a given pixel by taking its neighborhood into account. These methods include all algorithms based on two-dimensional convolution, such as the Prewitt filter, the Roberts filter, the Canny detector [1], and so on.
Fig. 1. Image with labels example from Microsoft COCO dataset [6]
However, the quality of the resulting binary classification depends strongly on the selected threshold. Even algorithms that use automatic threshold selection do not always work well. Also, when the signal-to-noise ratio is low, the detected edges contain a number of gaps. This happens because the neighborhood of the selected pixel does not always carry enough information for edge detection.
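A minimal sketch of such a convolution-plus-threshold local detector (the standard Prewitt masks are used; the threshold value is illustrative, and the quality of the result depends on it, as noted above):

```python
import numpy as np

# Standard Prewitt masks for horizontal and vertical gradients.
PREWITT_X = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], dtype=float)
PREWITT_Y = PREWITT_X.T

def conv2d(img, kernel):
    """Naive 'valid' 2-D filtering (cross-correlation, as is
    conventional in image filtering libraries)."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = (img[y:y + kh, x:x + kw] * kernel).sum()
    return out

def prewitt_edges(img, threshold):
    """Binary edge map obtained by thresholding the gradient magnitude."""
    gx = conv2d(img, PREWITT_X)
    gy = conv2d(img, PREWITT_Y)
    return np.hypot(gx, gy) > threshold

# A vertical step edge is detected only for a suitable threshold.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = prewitt_edges(img, threshold=2.0)
```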
Global methods analyze the image as a whole. In this case, a hypothesis about the presence of an edge of a certain shape with specified parameters is put forward for the image [2, 3]. Global methods are much more resistant to noise than local ones. However, the hypotheses themselves are usually quite primitive due to their computational complexity. These methods are typically used for detecting lines and circles, since more complex shapes add new degrees of freedom to the parametric space.
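As a sketch of the global approach, the classic Hough line parameterization rho = x*cos(theta) + y*sin(theta) accumulates votes for line hypotheses over all points (a simplified illustration only, not the weighted Radon variant of [2]; the accumulator resolution is arbitrary here):

```python
import math
from collections import Counter

def hough_lines(points, n_theta=180, rho_step=1.0):
    """Accumulate votes for (theta, rho) line hypotheses.
    Each point votes for every quantized line passing through it."""
    votes = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(t, round(rho / rho_step))] += 1
    return votes

# Points lying on the horizontal line y = 5 vote most strongly
# for theta near 90 degrees and rho = 5.
pts = [(x, 5) for x in range(20)]
votes = hough_lines(pts)
(best_t, best_rho), top_count = votes.most_common(1)[0]
```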
There are some attempts to use a combination of these two approaches [4, 5]. However, the search for an optimal solution continues.
Usage of artificial neural networks could significantly improve the quality of the solution. However, supervised machine learning methods always require an image dataset with ground truth data. In the case of edge detection algorithms, the dataset must contain ground truth data for the object edges [6, 7, 8, 9, 10]. But most datasets [6, 7, 8] are designed for the segmentation task, not for edge detection. Information about internal edges is ignored in this case (Fig. 1). This type of labeling makes it difficult to solve problems such as detecting power lines in an image [4], which is relevant for aircraft. In addition, the labeling is manual [6, 7, 8, 9, 10]. There are two problems with this. First, the representativeness of such a training set is questionable: manual labeling imposes serious restrictions on the amount of data received. Second, manual labels are ambiguous. The same image can be marked differently by several experts (Fig. 2). It is difficult to estimate the correctness of the various markups. This complicates the introduction of an accuracy metric for a particular algorithm and the development of a loss function. As a result, many works devoted to detecting edges in an image either use controversial metrics or do not perform comparative analysis at all [2, 5].
Fig. 2. Example of an original image and various ground truth markups (Subjects 1, 2, and 3) from the Berkeley Segmentation Data Set and Benchmarks 500 [10]
978-1-7281-6949-1/20/$31.00 ©2020 IEEE
Authorized licensed use limited to: University of Vermont Libraries. Downloaded on July 26,2020 at 02:20:26 UTC from IEEE Xplore. Restrictions apply.
2020 9th MEDITERRANEAN CONFERENCE ON EMBEDDED COMPUTING (MECO), 8-11 JUNE 2020, BUDVA, MONTENEGRO
This work is devoted to the development of an algorithm for generating a training dataset with unambiguous ground truth (Fig. 3).
Fig. 3. Dataset generation scheme (blocks: vector data, contrast shapes, interpolation and rasterization of contrast shapes, TensorFlow, edges, background)
The noise has a Gaussian distribution. The background generation is based on the approximation of random brightness values placed at random points in the image.
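Since the exact approximation scheme is not fixed above, the background step can be sketched, for example, with inverse-distance weighting of the random brightness values followed by additive Gaussian noise (both the weighting scheme and the parameter values are assumptions for illustration):

```python
import numpy as np

def generate_background(height, width, n_points=10, noise_sigma=5.0, seed=0):
    """Approximate a smooth background from random brightness values at
    random points, then add Gaussian noise. Inverse-distance weighting
    is used here as one possible approximation scheme."""
    rng = np.random.default_rng(seed)
    ys = rng.uniform(0, height, n_points)   # random point coordinates
    xs = rng.uniform(0, width, n_points)
    vals = rng.uniform(0, 255, n_points)    # random brightness values

    gy, gx = np.mgrid[0:height, 0:width]
    d2 = (gy[..., None] - ys) ** 2 + (gx[..., None] - xs) ** 2
    wts = 1.0 / (d2 + 1.0)                  # +1 avoids division by zero
    background = (wts * vals).sum(axis=-1) / wts.sum(axis=-1)

    # Additive noise with a Gaussian distribution, as in the paper.
    background += rng.normal(0.0, noise_sigma, background.shape)
    return np.clip(background, 0.0, 255.0)

bg = generate_background(32, 40)
```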
The proposed edge image generation algorithm allows receiving the reference data not only in the form of images, but also in a vector (parametric) representation. This significantly expands the scope of its usage.
III. EXPERIMENTAL EXAMINATIONS
GPU computing is used at one of the data generation stages; the CPU is used at all other stages. Moreover, GPU computing has overhead associated with I/O operations, so the benefits of using the GPU are not obvious. To test the computational efficiency of GPU usage compared to the CPU, 10 generations with different numbers of simultaneously generated image pairs (batch size) were performed. A computer with the following characteristics was used: Intel Xeon 2.2 GHz (1 core, 2 threads); Nvidia Tesla K80; 12.72 GB RAM.
TABLE I. AVERAGE RUNTIME PER ONE IMAGE, SECONDS
Device  Batch size  Total time  Edges (TensorFlow)  Postprocess (noise + background)
GPU          1       2.503501       2.179895            0.232935
GPU          2       1.438942       1.110552            0.234946
GPU          4       0.957967       0.629361            0.234869
GPU          8       0.684292       0.363456            0.232995
GPU         16       0.562541       0.243959            0.232308
GPU         32       0.502753       0.187143            0.231478
GPU         64       0.492677       0.174597            0.232624
CPU          1       1.838084       1.509922            0.256365
CPU          2       1.623060       1.298968            0.250169
CPU          4       1.534589       1.215724            0.249382
CPU          8       1.522095       1.202656            0.249195
CPU         16       1.788908       1.468058            0.249954
CPU         32       1.731476       1.407539            0.250859
CPU         64       1.665243       1.344029            0.249644
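The per-pair averages of Table I can be reproduced with a simple timing harness; generate_batch below is a hypothetical stand-in for one step of the dataset generator [14], not its real API:

```python
import time

def generate_batch(batch_size):
    """Hypothetical stand-in for one generation step: produces
    `batch_size` (image, ground truth) pairs. A sleep simulates work."""
    time.sleep(0.001 * batch_size)
    return [("image", "ground_truth")] * batch_size

def avg_time_per_pair(batch_size, runs=10):
    """Average wall-clock time per generated pair, as in Table I."""
    start = time.perf_counter()
    for _ in range(runs):
        generate_batch(batch_size)
    elapsed = time.perf_counter() - start
    return elapsed / (runs * batch_size)

# Measure several batch sizes, as in the experiment.
per_pair = {bs: avg_time_per_pair(bs, runs=3) for bs in (1, 2, 4)}
```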
Each image contained two or three edges. Each edge could contain up to two smoothed vertices and at most one angular vertex. The generated images were upscaled four times and then downscaled once; the final resolution was 256×256 pixels. The average time spent on one pair was calculated for the different batch sizes. The result is shown in Table I: using the GPU reduces the generation duration by about three times.
Additional research was performed to test the usefulness of the generated data. A neural network of the U-Net class [12] was trained on a generated training set consisting of 4000 training pairs and 800 verification pairs. The architecture of this network is presented in Table II. The last convolutional layer used the Sigmoid activation function, while the rest used ReLU. The convolution kernel size is 3×3; two-dimensional pooling uses a 2×2 kernel with a 2×2 stride. The structure of the ResBlock is shown in Table III. The optimizer is Adam, and the learning rate is 1e-5.
The reference data was binarized with a zero threshold for greater contrast. The convolutional part of a pre-trained VGG16 [13] is applied to the network output and to the reference data, and the mean square of the difference between the two VGG16 outputs is taken as the loss function. After each epoch, the network was applied to real images without reference data. An example of the trained network's performance is shown in Fig. 5.
TABLE II. U-NET ARCHITECTURE
Layer ID  Type          Out  Skip
 1        Conv            8  None
 2        Pooling         8  None
 3        Conv           16  None
 4        Pooling        16  None
 5        ResBlock       32  None
 6        Pooling        32  None
 7        ResBlock       64  None
 8        Pooling        64  None
 9        4 × ResBlock  128  None
10        Conv          128  None
11        UpSampling    128  None
12        Conv           64  None
13        Concat        128  7
14        ResBlock       64  None
15        UpSampling     64  None
16        Conv           32  None
17        Concat         64  5
18        ResBlock       32  None
19        UpSampling     32  None
20        Conv           16  None
21        Concat         32  3
22        Conv           16  None
23        UpSampling     16  None
24        Conv            8  None
25        Concat         16  1
26        Conv            8  None
27        Conv            1  None
28        Conv            1  None
Conv – two-dimensional convolutional layer; Pooling – two-dimensional pooling; Add – element-wise addition; Concat – concatenation; UpSampling – increasing image resolution.
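The zero-threshold binarization and the feature-space loss can be sketched as follows; grad_features is a toy stand-in for the convolutional part of the pre-trained VGG16 (an assumption for illustration only — the paper uses the real network [13]):

```python
import numpy as np

def binarize(reference):
    """Zero-threshold binarization of the reference edge data."""
    return (reference > 0).astype(np.float32)

def grad_features(img):
    """Toy feature extractor standing in for VGG16's convolutional
    part: horizontal and vertical finite differences."""
    gx = np.diff(img, axis=1)[:-1, :]
    gy = np.diff(img, axis=0)[:, :-1]
    return np.stack([gx, gy])

def perceptual_loss(pred, ref):
    """Mean square of the difference between the feature maps of the
    network output and the binarized reference, as in the paper's loss."""
    fp = grad_features(pred)
    fr = grad_features(binarize(ref))
    return float(np.mean((fp - fr) ** 2))
```

The loss is zero when the prediction matches the binarized reference exactly, and grows with any structural mismatch in feature space.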
TABLE III. RESBLOCK ARCHITECTURE
Layer ID  Type  Out  Skip
1         Conv  128  None
2         Conv  128  None
3         Add   128  1
4         Conv  128  None
IV. CONCLUSION
In this paper, an algorithm for generating images containing edges, as well as reference data describing these edges graphically or parametrically, is proposed. A software implementation of the algorithm [14] is developed in Python 3. Neural network training can use both pre-generated data and data generated on the fly. One of the most difficult stages of image generation can be implemented on the GPU, reducing the generation duration by about three times. The experimental examinations of the generated data have shown that it can be used in machine learning tasks.
Fig. 5. Edge detection example
REFERENCES
[1] G. N. Chaple, R. D. Daruwala and M. S. Gofane, "Comparisions of Robert, Prewitt, Sobel operator based edge detection methods for real time uses on FPGA," 2015 International Conference on Technologies for Sustainable Development (ICTSD), Mumbai, 2015, pp. 1-4.
[2] Babayan, Pavel, and Nikita Shubin. "Line detection in a noisy environment with weighted Radon transform." Image Processing: Machine Vision Applications VII. Vol. 9024. International Society for Optics and Photonics, 2014.
[3] Lee, Daeho, and Youngtae Park. "Discrete Hough transform using line segment representation for line detection." Optical Engineering 50.8 (2011): 087004.
[4] Babayan, P. V., and N. Yu. Shubin. "Neural network in a multi-agent system for line detection task in images." Pattern Recognition and Tracking XXX. Vol. 10995. International Society for Optics and Photonics, 2019.
[5] Von Gioi, Rafael Grompone, et al. "LSD: A fast line segment detector with a false detection control." IEEE Transactions on Pattern Analysis and Machine Intelligence 32.4 (2008): 722-732.
[6] Lin, Tsung-Yi, et al. "Microsoft COCO: Common objects in context." European Conference on Computer Vision. Springer, Cham, 2014.
[7] https://sviro.kl.dfki.de/
[8] https://github.com/hendrycks/anomaly-seg
[9] Jeong, Chiyoon, Hyun S. Yang, and KyeongDeok Moon. "A novel approach for detecting the horizon using a convolutional neural network and multi-scale edge detection." Multidimensional Systems and Signal Processing 30.3 (2019): 1187-1204.
[10] Arbelaez, Pablo, et al. "Contour detection and hierarchical image segmentation." IEEE Transactions on Pattern Analysis and Machine Intelligence 33.5 (2010): 898-916.
[11] https://habr.com/ru/post/321734/
[12] Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. "U-net: Convolutional networks for biomedical image segmentation." International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, 2015.
[13] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
[14] https://github.com/NikitaShubin/Edge-Detection-Dataset-Generator