
2020 9th M EDITERRANEAN C ONFERENCE ON E MBEDDED C OMPUTING (MECO), 8-11 J UNE 2020, B UDVA , M ONTENEGRO

TensorFlow for Generating Edge Detection Dataset


Boris A. Alpatov Nikita Yu. Shubin Andrey V. Yakovlev
Department of Automation and Department of Automation and Department of Automation and
Information Technologies in control Information Technologies in control Information Technologies in control
Ryazan State Radio Engineering Ryazan State Radio Engineering Ryazan State Radio Engineering
University University University
Ryazan, Russia Ryazan, Russia Ryazan, Russia
aitu@rsreu.ru shubin.kit@ya.ru andrey.yakovlev.98@mail.ru

Abstract—Solving any problem using machine learning requires datasets. Most datasets are labeled manually for supervised learning tasks. This paper describes a method for generating labeled data for edge detection tasks on an image. This eliminates a number of problems associated with manual data labeling.

Keywords—TensorFlow, dataset, edge detection, image processing.

I. INTRODUCTION

Object edges in an image are one of the main criteria used to evaluate the shape, size, and other parameters of almost any object. There are many algorithms and methods for detecting and extracting object edges in an image, each with its own advantages and disadvantages. They can be divided into two large groups: local and global. Local edge detection methods determine the presence of an edge at a given pixel by taking its neighborhood into account. These methods include all algorithms based on two-dimensional convolution, such as the Prewitt filter, the Roberts filter, the Canny filter [1], and so on. However, the quality of the resulting binary classification strongly depends on the selected threshold. Even algorithms that use automatic threshold selection do not always work well. Also, when the signal-to-noise ratio is low, the detected edges contain a number of gaps. This happens because the information in the selected pixel neighborhood is not always sufficient for edge detection.

Fig. 1. Image with labels example from the Microsoft COCO dataset [6]
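As an illustration of such a local, convolution-based detector (a sketch, not code from the paper), a Prewitt filter followed by a fixed global threshold can be written with NumPy; the toy image and threshold value here are assumptions:

```python
import numpy as np

def prewitt_edges(img: np.ndarray, threshold: float) -> np.ndarray:
    """Binary edge map: Prewitt gradient magnitude vs. a fixed threshold."""
    kx = np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]], dtype=float)
    ky = kx.T

    def conv2(a, k):
        # 'valid' 3x3 cross-correlation written with shifted slices
        h, w = a.shape
        out = np.zeros((h - 2, w - 2))
        for i in range(3):
            for j in range(3):
                out += k[i, j] * a[i:h - 2 + i, j:w - 2 + j]
        return out

    gx, gy = conv2(img, kx), conv2(img, ky)
    mag = np.hypot(gx, gy)          # gradient magnitude
    return mag > threshold          # quality hinges on this threshold

# toy usage: a vertical step edge
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = prewitt_edges(img, threshold=1.0)
```

Lowering the threshold admits noise, while raising it breaks weak edges into gaps, which is exactly the failure mode discussed above.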
Global methods analyze the image as a whole. In this case a hypothesis about the presence of an edge of a certain shape with specified parameters in the image [2, 3] is put forward. Global methods are much more resistant to noise than local methods. However, the hypotheses themselves are usually quite primitive due to their computational complexity, so these methods are typically used for line and circle detection. More complex shapes add new degrees of freedom to the parametric space.

There have been attempts to use a combination of these two approaches [4, 5]. However, the search for an optimal solution continues.
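To make the hypothesis-driven global approach concrete, here is a minimal Hough-style line detector (an illustration, not the paper's code): edge points vote for (rho, theta) line hypotheses in an accumulator, and every extra shape parameter would add another accumulator dimension, which is why the hypotheses stay primitive:

```python
import numpy as np

def hough_lines(edge_points, thetas, rho_max, n_rho):
    """Accumulate votes for line hypotheses rho = x*cos(t) + y*sin(t)."""
    acc = np.zeros((n_rho, len(thetas)), dtype=int)
    rhos = np.linspace(-rho_max, rho_max, n_rho)
    for x, y in edge_points:
        for j, t in enumerate(thetas):
            rho = x * np.cos(t) + y * np.sin(t)
            i = int(np.argmin(np.abs(rhos - rho)))  # nearest rho bin
            acc[i, j] += 1
    return acc, rhos

# toy usage: ten points on the vertical line x = 5
pts = [(5, y) for y in range(10)]
thetas = np.linspace(0.0, np.pi, 18, endpoint=False)
acc, rhos = hough_lines(pts, thetas, rho_max=15.0, n_rho=31)
i, j = np.unravel_index(acc.argmax(), acc.shape)  # winning hypothesis
```

Because whole-image voting integrates evidence over many pixels, a few noisy points barely disturb the winning (rho, theta) cell; this is the noise resistance mentioned above.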
Using artificial neural networks could significantly improve the quality of the solution. However, supervised machine learning methods always require an image dataset with ground truth data. In the case of edge detection algorithms the dataset must contain the ground truth of the object edges [6, 7, 8, 9, 10]. But most of the datasets [6, 7, 8] are designed for the segmentation task, not for edge detection. Information about internal edges is ignored in this case (Fig. 1). Such labeling makes it difficult to solve problems such as detecting power lines in an image [4], which is relevant for aircraft. In addition, the labeling is manual [6, 7, 8, 9, 10]. There are two problems with this. First, the representativeness of such a training set is questionable: manual labeling imposes serious restrictions on the amount of data obtained. Second, manual labels are ambiguous. The same image can be marked differently by several experts (Fig. 2), and it is difficult to estimate the correctness of the various markups. This complicates the introduction of an accuracy metric for a particular algorithm and the development of a loss function. As a result, many works devoted to detecting edges in an image either use controversial metrics or do not perform comparative analysis at all [2, 5].

Fig. 2. Example of an original image and various ground truths (Subjects 1, 2 and 3) from the Berkeley Segmentation Data Set and Benchmarks 500 [10]
978-1-7281-6949-1/20/$31.00 ©2020 IEEE

This work is devoted to the development of an algorithm for generating a training dataset with unambiguous ground truth (Fig. 3).

Fig. 3. Example of generated images with ground truth

Fig. 4. Main steps of dataset generation (flowchart: vector data; interpolation and rasterization of contrast shapes; edge gradients, contrasts and edges computed with TensorFlow; background and noise combined into the final image)

II. THE IMAGE GENERATION ALGORITHM

In accordance with the proposed algorithm (Fig. 4), the first step is the generation of reference data in vector format. To perform this step, a random sequence of points in the image space, with a contrast value (the brightness difference across the edge) at each of them, is created. Several corner points are randomly selected from this sequence. The random arrangement of points provides a dataset of lines of any shape and curvature, as well as corners of any angle. Then the sequence of points is interpolated along with its contrasts; smoothing is not applied at corner points. After that, a gradient field is built, producing a ground truth image and contrast shapes. A vector representation of the ground truth data can also be extracted from the list of points.
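The vector-format step just described can be sketched as follows (an illustrative sketch under assumed parameters, not the published implementation; corner handling is reduced to a comment):

```python
import numpy as np
from scipy.interpolate import CubicSpline

rng = np.random.default_rng(0)

# random key points in image space, each carrying a contrast value
n_pts = 6
pts = rng.uniform(0, 256, size=(n_pts, 2))     # (x, y) key points
contrast = rng.uniform(0.2, 1.0, size=n_pts)   # brightness difference across the edge

# smooth interpolation of the point sequence together with its contrasts;
# in the full algorithm, randomly chosen corner points would split the
# sequence so that no smoothing is applied across them (omitted here)
t = np.arange(n_pts)
x_s, y_s = CubicSpline(t, pts[:, 0]), CubicSpline(t, pts[:, 1])
c_s = CubicSpline(t, contrast)

ts = np.linspace(0, n_pts - 1, 512)            # dense samples for rasterization
curve = np.stack([x_s(ts), y_s(ts)], axis=1)
curve_contrast = c_s(ts)
```

The same point list doubles as the vector (parametric) ground truth mentioned above.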
A contrast shape can be defined as the difference in brightness around the edge. It plays the role of a frame when creating an edge image. The contrast shape can be represented as a maintained temperature drop in a homogeneous medium, with the brightness of each pixel representing the temperature of the medium at that point. Therefore, numerical methods for solving the partial differential equation can be applied at the next algorithm step.

This approach was successfully implemented using TensorFlow [11]. It is one of the most computationally intensive steps of the algorithm, and the ability to execute it on the GPU using TensorFlow significantly reduces the overall program runtime. However, this approach requires a large number of iterations even for medium-sized images (from 500x500 pixels), because the "heat" is transferred slowly from pixel to pixel. To reduce the calculation time, we suggest starting with a lower resolution and upscaling the image by a factor of 2 after several iterations of heat transfer; the gradient fields and contrast shapes are re-formed for the new dimensions each time.

However, to prevent strong aliasing of the resulting edge, one or more additional image upsamplings followed by downsamplings must be performed. This additional processing smooths the resulting image.

At the last step of the algorithm all the edge images are combined, and a non-uniform background and noise are added.
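The "maintained temperature drop" model above amounts to iterating a discrete heat equation with the contrast shape clamped as a boundary condition. The sketch below shows the idea, including the coarse-to-fine 2x upscaling trick, in NumPy for brevity; the paper's implementation runs the equivalent ops in TensorFlow on the GPU, and the function names and parameters here are assumptions:

```python
import numpy as np

def diffuse(field, mask, values, iters):
    """Jacobi iterations of the heat equation; 'mask' pixels are clamped
    to 'values' (the maintained temperature drop across the edge)."""
    f = field.copy()
    for _ in range(iters):
        f[mask] = values[mask]              # re-impose the boundary condition
        f = 0.25 * (np.roll(f, 1, 0) + np.roll(f, -1, 0)
                    + np.roll(f, 1, 1) + np.roll(f, -1, 1))
    f[mask] = values[mask]
    return f

def coarse_to_fine(mask_hi, values_hi, levels=3, iters=30):
    """Start at low resolution and upsample 2x per level, since 'heat'
    crosses only one pixel per iteration at full resolution."""
    h, w = mask_hi.shape
    f = np.zeros((h >> (levels - 1), w >> (levels - 1)))
    for lvl in reversed(range(levels)):
        s = 1 << lvl
        # re-form the boundary condition for the current resolution
        # (nearest-neighbour subsampling; boundary alignment simplified)
        f = diffuse(f, mask_hi[::s, ::s], values_hi[::s, ::s], iters)
        if lvl:
            f = f.repeat(2, axis=0).repeat(2, axis=1)  # 2x upsampling
    return f
```

A full version would also apply the extra up/downsampling passes described above to smooth aliasing of the resulting edge.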

The noise has a Gaussian distribution. The background generation is based on the approximation of random brightness values at random points in the image.
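A minimal sketch of this postprocessing (my illustration, not the paper's code: the sizes and noise level are assumptions, and a regular coarse grid of control points stands in for the paper's random point placement):

```python
import numpy as np
from scipy.ndimage import zoom

rng = np.random.default_rng(0)
size, coarse = 256, 8

# non-uniform background: smooth approximation of random brightness
# values given at sparse control points
control = rng.uniform(0.2, 0.8, size=(coarse, coarse))
background = zoom(control, size / coarse, order=3)   # bicubic upsampling

# additive Gaussian noise; in the full algorithm the combined edge
# image would be added in as well
noise = rng.normal(0.0, 0.03, size=(size, size))
final = np.clip(background + noise, 0.0, 1.0)
```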
The proposed edge image generation algorithm allows receiving the reference data not only in the form of images, but also in a vector (parametric) representation. This significantly expands the scope of its usage.

III. EXPERIMENTAL EXAMINATIONS

The GPU is used at only one of the data generation stages; the CPU is used at all the others. Moreover, GPU computing has overhead associated with I/O operations, so the benefit of using the GPU is not obvious. To test the computational efficiency of GPU usage compared to the CPU, 10 generations with different numbers of simultaneously generated image pairs (batch size) were performed. A computer with the following characteristics was used: Intel Xeon 2.2 GHz (1 core, 2 threads); Nvidia Tesla K80; 12.72 GB RAM.

TABLE I. AVERAGE RUNTIME PER ONE IMAGE, SECONDS

      Batch size | Total time | Edges (TensorFlow) | Postprocess (noise + background)
GPU       1      |  2.503501  |      2.179895      |            0.232935
          2      |  1.438942  |      1.110552      |            0.234946
          4      |  0.957967  |      0.629361      |            0.234869
          8      |  0.684292  |      0.363456      |            0.232995
         16      |  0.562541  |      0.243959      |            0.232308
         32      |  0.502753  |      0.187143      |            0.231478
         64      |  0.492677  |      0.174597      |            0.232624
CPU       1      |  1.838084  |      1.509922      |            0.256365
          2      |  1.623060  |      1.298968      |            0.250169
          4      |  1.534589  |      1.215724      |            0.249382
          8      |  1.522095  |      1.202656      |            0.249195
         16      |  1.788908  |      1.468058      |            0.249954
         32      |  1.731476  |      1.407539      |            0.250859
         64      |  1.665243  |      1.344029      |            0.249644
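The per-batch-size figures in Table I can be obtained with a simple harness like the one below (a generic sketch; `generate_batch` is a hypothetical stand-in, not a function from the paper's code):

```python
import time

def avg_time_per_pair(generate_batch, batch_sizes, repeats=10):
    """Average wall-clock generation time per image pair for each batch size."""
    results = {}
    for b in batch_sizes:
        start = time.perf_counter()
        for _ in range(repeats):
            generate_batch(b)                  # produces b (image, ground truth) pairs
        elapsed = time.perf_counter() - start
        results[b] = elapsed / (repeats * b)   # seconds per single pair
    return results

# toy stand-in generator, for illustration only
timings = avg_time_per_pair(lambda b: [None] * b, [1, 2, 4, 8])
```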

Each image contained two or three edges. Each edge could contain up to two smoothed vertexes and a maximum of one angular vertex. The generated images were upscaled four times and downscaled once; the final resolution was 256x256 pixels. The average time spent on one pair for different batch sizes was calculated. The result is shown in Table I: using the GPU reduces the generation duration by about three times.

Additional research was performed to test the usefulness of the generated data. A neural network of the U-Net class was trained on a generated set consisting of 4000 training pairs and 800 verification pairs [12]. The architecture of this network is presented in Table II. The last convolutional layer used the Sigmoid activation function, while the rest used ReLU. The convolution kernel size is 3×3; two-dimensional pooling with a kernel size of 2×2 and a stride of 2×2 is used. The structure of the ResBlock is shown in Table III. The optimizer is Adam, and the learning rate is 1e-5.

The reference data was binarized with a zero threshold for greater contrast. The convolutional part of a pre-trained VGG16 is applied to both the network output and the reference data [13]. The mean squared difference between the outputs of the two VGG16 instances is taken as the loss function. After each epoch the network was applied to real images without reference data. An example of the trained network's performance is shown in Fig. 5.

TABLE II. U-NET ARCHITECTURE

Layer ID | Type          | Out | Skip
    1    | Conv          |   8 | None
    2    | Pooling       |   8 | None
    3    | Conv          |  16 | None
    4    | Pooling       |  16 | None
    5    | ResBlock      |  32 | None
    6    | Pooling       |  32 | None
    7    | ResBlock      |  64 | None
    8    | Pooling       |  64 | None
    9    | 4 × ResBlock  | 128 | None
   10    | Conv          | 128 | None
   11    | UpSampling    | 128 | None
   12    | Conv          |  64 | None
   13    | Concat        | 128 | 7
   14    | ResBlock      |  64 | None
   15    | UpSampling    |  64 | None
   16    | Conv          |  32 | None
   17    | Concat        |  64 | 5
   18    | ResBlock      |  32 | None
   19    | UpSampling    |  32 | None
   20    | Conv          |  16 | None
   21    | Concat        |  32 | 3
   22    | Conv          |  16 | None
   23    | UpSampling    |  16 | None
   24    | Conv          |   8 | None
   25    | Concat        |  16 | 1
   26    | Conv          |   8 | None
   27    | Conv          |   1 | None
   28    | Conv          |   1 | None

Conv – two-dimensional convolutional layer; Pooling – two-dimensional pooling; Add – element-wise addition; Concat – concatenation; UpSampling – increasing image resolution.
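The loss just described can be stated framework-independently: apply a fixed pre-trained feature extractor phi (in the paper, the convolutional part of VGG16 [13]) to both the prediction and the binarized reference, then take the mean squared difference of the features. Below is a sketch with a toy stand-in for phi; the real extractor would be VGG16, and everything else here is an assumption for illustration:

```python
import numpy as np

def perceptual_loss(y_pred, y_true, phi):
    """MSE between feature maps produced by a fixed extractor phi."""
    return float(np.mean((phi(y_pred) - phi(y_true)) ** 2))

# toy stand-in extractor: one fixed 3x3 "convolution" built from slices;
# the paper uses the frozen convolutional part of VGG16 instead
def phi(img):
    return img[:-2, :-2] + img[1:-1, 1:-1] + img[2:, 2:]

a = np.zeros((8, 8))
b = np.ones((8, 8))
loss_same = perceptual_loss(a, a, phi)   # identical inputs give zero loss
loss_diff = perceptual_loss(a, b, phi)
```

Comparing feature maps rather than raw pixels is commonly motivated by tolerance to small spatial ambiguity in edge positions.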
TABLE III. RESBLOCK ARCHITECTURE

Layer ID | Type | Out | Skip
    1    | Conv | 128 | None
    2    | Conv | 128 | None
    3    | Add  | 128 | 1
    4    | Conv | 128 | None

IV. CONCLUSION

In this paper an algorithm for generating images containing edges, together with reference data describing these edges graphically or parametrically, is proposed. A software implementation of the algorithm [14] is developed in Python 3. Neural network training can use both pre-generated data and data generated on the fly. One of the most computationally intensive stages of image generation can be implemented on the GPU, reducing the generation duration by about three times. Experimental examinations of the generated data have shown that it can be used in machine learning tasks.

Fig. 5. Edge detection example

REFERENCES

[1] G. N. Chaple, R. D. Daruwala and M. S. Gofane, "Comparisons of Robert, Prewitt, Sobel operator based edge detection methods for real time uses on FPGA," 2015 International Conference on Technologies for Sustainable Development (ICTSD), Mumbai, 2015, pp. 1-4.
[2] P. Babayan and N. Shubin, "Line detection in a noisy environment with weighted Radon transform," Image Processing: Machine Vision Applications VII, vol. 9024, International Society for Optics and Photonics, 2014.
[3] D. Lee and Y. Park, "Discrete Hough transform using line segment representation for line detection," Optical Engineering, vol. 50, no. 8, 2011, 087004.
[4] P. V. Babayan and N. Yu. Shubin, "Neural network in a multi-agent system for line detection task in images," Pattern Recognition and Tracking XXX, vol. 10995, International Society for Optics and Photonics, 2019.
[5] R. Grompone von Gioi et al., "LSD: A fast line segment detector with a false detection control," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 4, 2008, pp. 722-732.
[6] T.-Y. Lin et al., "Microsoft COCO: Common objects in context," European Conference on Computer Vision, Springer, Cham, 2014.
[7] https://sviro.kl.dfki.de/
[8] https://github.com/hendrycks/anomaly-seg
[9] C. Jeong, H. S. Yang and K. Moon, "A novel approach for detecting the horizon using a convolutional neural network and multi-scale edge detection," Multidimensional Systems and Signal Processing, vol. 30, no. 3, 2019, pp. 1187-1204.
[10] P. Arbelaez et al., "Contour detection and hierarchical image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 5, 2010, pp. 898-916.
[11] https://habr.com/ru/post/321734/
[12] O. Ronneberger, P. Fischer and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, Cham, 2015.
[13] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[14] https://github.com/NikitaShubin/Edge-Detection-Dataset-Generator

