Drone and Improved Human Detection in Sea Using Pi Pico
Abstract—In search and rescue missions, the area to be searched is often large while the target to be found is small. Combining search and rescue with object detection technology, this paper proposes a method for finding drowning people. First, we build a dataset containing a large number of human targets at sea. Then, we improve the Yolov3 algorithm: in the feature extraction network, we use a residual module with a channel attention mechanism; in the feature fusion network, we add a bottom-up structure to the FPN structure; for the loss function, we use the CIoU loss; finally, for the anchor box settings, we apply a linear transformation to the anchor boxes generated by the clustering algorithm. The detection accuracy of the improved algorithm for human targets at sea is 72.17%, which is a good detection effect.

Keywords—deep learning; object detection; human target; search and rescue; detection method

I. INTRODUCTION

At present, search and rescue missions mainly depend on human vision to find drowning people. However, in large-scale searches, the search efficiency and accuracy are low because human targets at sea are small. In addition, the emotion and state of the searchers also affect the efficiency and accuracy of the search, making it difficult to find human targets quickly. With the development of computer vision, object detection technology has attracted much attention and has long been a hot research topic. Applying this technology to search and rescue scenarios can improve search efficiency.

Object detection methods are generally divided into two types: traditional object detection methods and object detection methods based on deep learning. Representative traditional algorithms include Haar+SVM [1], HOG+SVM [2], and Shapelet+AdaBoost [3]. Traditional methods require hand-designed features, which place very strict requirements on practitioners.

Object detection methods based on deep learning automatically extract features through a convolutional neural network, and the semantic information of these features is richer. They are less affected by complex environments and have better robustness and higher accuracy. Deep-learning-based methods can be divided into two-stage and one-stage detection algorithms. Representative two-stage algorithms include R-CNN [4], Fast R-CNN [5], and Faster R-CNN [6]. This type of algorithm first extracts candidate regions roughly, generating regions where the target may exist; these regions are then refined to produce the final location and classification. It has high accuracy, but the detection speed is slow. One-stage detection algorithms obtain the location and category of the final target directly from the input image; representative algorithms include YOLO [7] and SSD [8]. This type of algorithm has faster detection speed, but its accuracy is poorer than that of two-stage algorithms. Most subsequent algorithms are improvements of the above.

II. ALGORITHM DESCRIPTION

Yolov3 [9] is a representative deep-learning-based object detection algorithm. Because of its high accuracy and fast detection speed, we make our improvements on the basis of Yolov3: we improve the feature extraction network, the feature fusion network, and the loss function.

The structure of the improved detection network is shown in Fig. 1.
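The paper does not reproduce the exact layer configuration of the residual module with channel attention, so as an illustration only, a Darknet-style residual block with an SE-style channel attention branch might look like the following PyTorch sketch (the module names, reduction ratio, and activation choices are assumptions, not taken from the paper):

```python
import torch
import torch.nn as nn

class SEResidualBlock(nn.Module):
    """Residual block with an SE-style channel attention branch.
    Illustrative sketch; reduction ratio and activations are assumptions."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        hidden = channels // 2
        self.conv1 = nn.Conv2d(channels, hidden, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(hidden)
        self.conv2 = nn.Conv2d(hidden, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.act = nn.LeakyReLU(0.1)
        # Channel attention: squeeze (global average pool), then excite
        # (two 1x1 convolutions) to produce per-channel weights in (0, 1).
        self.attention = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, max(channels // reduction, 1), 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(max(channels // reduction, 1), channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        out = self.act(self.bn1(self.conv1(x)))
        out = self.act(self.bn2(self.conv2(out)))
        out = out * self.attention(out)  # reweight channels
        return x + out                   # residual connection
```

The attention branch lets the block emphasize channels that respond to small, low-contrast targets such as people at sea.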
Authorized licensed use limited to: Carleton University. Downloaded on June 03,2021 at 11:38:22 UTC from IEEE Xplore. Restrictions apply.
Figure 1. The structure of the improved detection network

B. Feature Fusion Network

In a convolutional neural network, the bottom layers are the shallower part of the network: the feature maps obtained there are large and have rich detailed information, but their semantic information is weak. The top layers are the deeper part of the network: their feature maps are small, and their features are abstract with rich semantic information. The feature fusion network enhances the information of the different layers by fusing their features, which is very important for improving detection performance, especially for small objects. The improved feature fusion network is shown in Fig. 3.
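A bottom-up path added after the FPN top-down path can be sketched as follows (a PANet-style illustration; the channel counts and layer choices are assumptions rather than the paper's exact design):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FPNWithBottomUp(nn.Module):
    """FPN top-down pathway followed by an added bottom-up pathway.
    Illustrative sketch; channel counts are assumptions."""

    def __init__(self, channels=(128, 256, 512), out_channels=128):
        super().__init__()
        # 1x1 laterals unify channel counts across levels.
        self.lateral = nn.ModuleList(
            nn.Conv2d(c, out_channels, 1) for c in channels)
        # Stride-2 convolutions carry shallow features back down.
        self.down = nn.ModuleList(
            nn.Conv2d(out_channels, out_channels, 3, stride=2, padding=1)
            for _ in channels[:-1])

    def forward(self, c3, c4, c5):
        # Top-down: upsample deep features and fuse with laterals.
        p5 = self.lateral[2](c5)
        p4 = self.lateral[1](c4) + F.interpolate(p5, scale_factor=2)
        p3 = self.lateral[0](c3) + F.interpolate(p4, scale_factor=2)
        # Bottom-up: downsample shallow features and fuse them back in,
        # passing detailed localization information to deeper levels.
        n3 = p3
        n4 = p4 + self.down[0](n3)
        n5 = p5 + self.down[1](n4)
        return n3, n4, n5
```

The extra bottom-up step shortens the path from the detail-rich shallow layers to the prediction heads, which is what helps small targets.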
III. EXPERIMENT AND RESULT ANALYSIS

In this paper, we use the Ubuntu 16.04 operating system and two Titan V graphics cards for model training. The deep learning framework is PyTorch 1.2, and the Python version is 3.7.

A. Dataset

The dataset used in this experiment is a seaside human dataset made by our team. It was built by collecting images, processing them, and labeling the human targets in them. Fig. 4 and Fig. 5 show some samples from the dataset.

Figure 4. An image of the dataset

Figure 5. Other images of the dataset

The dataset includes 6079 images and 528422 labeled boxes, and it divides the human targets into 4 categories: sea person (Seap), uncertain sea person (Unseap), land person (Landp), and uncertain land person (Unlandp). A sea person is a human target at sea; an uncertain sea person is an uncertain human target at sea; a land person is a human target on land; an uncertain land person is an uncertain human target on land. The statistics of the different types of labeled boxes are shown in Fig. 6. The number of labeled boxes of human targets at sea is 226994, which lays a foundation for training the human target detection model at sea.

B. Anchor Box Setting

In this paper, we use the K-means++ algorithm to generate anchor boxes; the resulting anchor boxes are (3,7), (5,10), (4,19), (6,15), (9,17), (7,25), (11,25), (8,39), (11,49).

However, among the anchor boxes generated by K-means++, some have similar sizes, such as (4,19), (6,15), and (9,17). Anchor boxes with similar sizes add useless computation during object box regression. In this paper, we use a linear transformation to deal with the generated anchor boxes. The transformation formulas are as follows:

x1' = 0.5 x1    (4)

x9' = x9    (5)

xi' = ((x9' − x1') / (x9 − x1)) (xi − x1) + x1'    (6)

yi' = (xi' / xi) yi    (7)

In the above formulas, xi and yi are the width and height of the i-th anchor box before the transformation, and xi' and yi' are the width and height after the transformation; the subscripts 1 and 9 denote the smallest and largest anchor boxes. The anchor boxes obtained by this transformation, rounded to integers, are (1,3), (3,7), (2,10), (4,11), (8,16), (6,21), (11,25), (7,35), (11,49).

C. Analysis of Results

Since the purpose of this paper is to propose a detection technology for drowning people, and considering the effect of small target detection, the detection accuracy of the sea person category at an IoU threshold of 0.25 (Seap AP25) is used as the evaluation index. The experimental results are shown in Table 1.

TABLE I. EXPERIMENT RESULTS

Model    | Improved FEN | Improved FFN | CIoU | New anchors | Seap AP25 (%)
Yolov3   |              |              |      |             | 59.22
Yolov3-1 | ✓            |              |      |             | 60.84
Yolov3-2 | ✓            | ✓            |      |             | 64.04
Yolov3-3 | ✓            | ✓            | ✓    |             | 69.16
Yolov3-4 | ✓            | ✓            | ✓    | ✓           | 72.17
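The anchor transformation described above can be sketched in plain Python as follows. The exact rounding convention is an assumption, so the rounded values may differ slightly from the anchors listed in the paper:

```python
def spread_anchors(anchors):
    """Spread clustered anchors apart: shrink the smallest anchor's
    width to half, keep the largest, linearly interpolate the rest,
    and scale each height by its anchor's width ratio. The rounding
    convention is an assumption, not taken from the paper."""
    widths = [w for w, _ in anchors]
    w1, w9 = widths[0], widths[-1]   # smallest and largest anchor widths
    w1p = 0.5 * w1                   # x1' = 0.5 x1
    w9p = float(w9)                  # x9' = x9 (largest anchor kept)
    out = []
    for w, h in anchors:
        wp = (w9p - w1p) / (w9 - w1) * (w - w1) + w1p  # interpolated width
        hp = wp / w * h                                # height scaled by width ratio
        out.append((round(wp), round(hp)))
    return out

anchors = [(3, 7), (5, 10), (4, 19), (6, 15), (9, 17),
           (7, 25), (11, 25), (8, 39), (11, 49)]
print(spread_anchors(anchors))
```

The effect is to push similar-sized anchors apart while leaving the largest anchor untouched.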
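The CIoU loss used among the improvements combines IoU with a center-distance penalty and an aspect-ratio penalty. A minimal scalar sketch of the standard formulation (not the paper's exact implementation) for corner-format boxes:

```python
import math

def ciou_loss(pred, target):
    """CIoU loss for two axis-aligned boxes (x1, y1, x2, y2).
    Assumes boxes have positive width and height."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    # IoU from intersection and union areas.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / (union + 1e-9)
    # Squared distance between box centers.
    rho2 = ((px1 + px2 - tx1 - tx2) ** 2 + (py1 + py2 - ty1 - ty2) ** 2) / 4
    # Squared diagonal of the smallest enclosing box.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2
    # Aspect-ratio consistency term.
    v = (4 / math.pi ** 2) * (
        math.atan((tx2 - tx1) / (ty2 - ty1))
        - math.atan((px2 - px1) / (py2 - py1))) ** 2
    alpha = v / (1 - iou + v + 1e-9)
    return 1 - iou + rho2 / c2 + alpha * v
```

Unlike plain IoU loss, the extra terms still give a useful gradient when the predicted and ground-truth boxes do not overlap, which matters for very small targets.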
With the CIoU loss added (Yolov3-3), the detection accuracy is increased from 64.04% to 69.16%, an increase of 5.12 percentage points. Yolov3-4 improves on Yolov3-3 by using the anchors generated by the linear transformation, and the detection accuracy is increased from 69.16% to 72.17%, an increase of 3.01 percentage points. In summary, the detection accuracy of the final improved algorithm we propose is 72.17%, nearly 13 percentage points higher than that of Yolov3.

In addition to describing the detection results with indicators, the effectiveness of the technology can be shown directly in the images after detection. Fig. 7 and Fig. 8 show the detection results on two different pictures.

IV. CONCLUSION

Although the improved algorithm shows its effectiveness, there are still many shortcomings, such as inaccurate predictions. Our subsequent research work will continue to optimize it.

ACKNOWLEDGMENT

At the end of this article, I would like to thank some people who are important to me. First of all, I would like to thank my teacher Niu Fu, who has given me a lot of help in my study and life. I would also like to thank my partners at the Beijing Institute of Electronics and Control Technology, who have helped me a lot. In addition, I would like to thank my parents; it is with their support that I can study at ease. Finally, I would like to sincerely thank the teachers who have worked so hard to review this article!
REFERENCES