Wang 2019

The 4th International Conference on Control and Robotics Engineering
The Image Recognition Based on Restricted Boltzmann Machine and Deep Learning Framework
Renshu Wang
Electric Power Research Institute of State Grid Fujian Electric Power Co., Ltd.
Fujian Provincial Enterprise Key Laboratory of High Reliable Electric Power Distribution Technology
Fuzhou, China
e-mail: 932521880@qq.com

Jingdong Guo
Electric Power Research Institute of State Grid Fujian Electric Power Co., Ltd.
Fujian Provincial Enterprise Key Laboratory of High Reliable Electric Power Distribution Technology
Fuzhou, China
e-mail: guo_jingdong@fj.sgcc.com.cn

Bin Chen
Electric Power Research Institute of State Grid Fujian Electric Power Co., Ltd.
Fujian Provincial Enterprise Key Laboratory of High Reliable Electric Power Distribution Technology
Fuzhou, China
e-mail: dky_chenbin@fj.sgcc.com.cn

Jing Zhao
Management training department
State Grid Fujian Management Training Center
Fuzhou, China
e-mail: zhao_jing1@fj.sgcc.com.cn

978-1-7281-1593-1/19/$31.00 ©2019 IEEE

Abstract—Because the Unmanned Aerial Vehicle (UAV) has high mobility, it has been adopted to reduce the difficulty of patrolling electric power lines, which is hard work when done manually. However, damaged poles are still judged by the patroller, which is inefficient and fallible. This paper therefore introduces an artificial intelligence method and proposes a novel model to improve the recognition effect in complex backgrounds. A Restricted Boltzmann Machine (RBM) is used to replace the fully connected layers of the faster regions with convolutional neural network (Faster RCNN). Because the RBM is capable of unsupervised learning, combining the RBM with Faster RCNN can reduce the number of training samples and the influence of different backgrounds in the images to be identified. The experimental results show that the proposed model is effective in recognizing wire poles in the distribution network, which has practical value.

Keywords- intelligent recognition; wire pole; Restricted Boltzmann Machine; distribution network; faster regions with convolutional neural network

I. INTRODUCTION
Patrolling electric power lines is an important task in the operation and maintenance work of the power department. Because power lines cross different types of areas, especially in the distribution network, their dispersion also increases the difficulty of patrolling. When a disaster such as a typhoon strikes the distribution network, many wire poles are damaged, and the damaged poles need to be found as soon as possible. With the adoption of the Unmanned Aerial Vehicle (UAV), mobility is much improved. However, the damaged poles are still judged by the patroller, resulting in low efficiency. An image recognition method is therefore considered to speed up patrolling in the field.

Deep learning neural networks are widely applied to image recognition in many fields and show good performance [1]. With the convolutional neural network (CNN) adopted for image processing, more detailed features can be extracted, and there are several popular CNN models, such as Fast RCNN [2], Yolo [3], SSD [4] and so on. Owing to its intelligence and high recognition accuracy, the deep learning network makes recognition an easier task in many applications, including medical image analysis, DNA methylation state prediction in biology [5], human body action prediction [6] and defect detection [7].

Deep learning shows better performance than traditional methods. However, there is still room for improvement, since a large number of samples are needed and reaching an acceptable accuracy is time consuming. Besides, for different recognition objects, the generalization ability of a model constructed by a deep learning network may not meet practical demand. To settle these problems, there is research focusing on improving deep learning networks in different applications.

In the recognition of electric equipment, there have been research results. In paper [8], AlexNet and random forest are combined to realize electric equipment image recognition. In paper [9], UAV images are processed by CNN and constructive advice on tuning parameters in Faster R-CNN is presented. In paper [10], the convolutional recursive network is used for processing massive infrared fault images of transformers. In paper [11]
double supervised signal deep learning was proposed for infrared fault image recognition. In paper [12], a deep learning model in the framework of the internet of things is used to predict and diagnose faults in wind power generation. In different applications, researchers make efforts to pursue better performance to meet practical demand.

In this paper, considering the complex background of the distribution network, a deep learning framework is proposed in which an unsupervised learning network, the Restricted Boltzmann Machine (RBM), is adopted to cluster feature maps and thereby reduce the work of preparing samples. Building on the advantage of Faster RCNN in feature extraction, applying the RBM for clustering before classification can improve the robustness of recognition and reduce the number of samples. The training process is presented and the test results are listed to demonstrate the effect.

II. BACKGROUND

A. Faster RCNN
Based on the advantage of Fast RCNN, Faster RCNN introduces a Region Proposal Network (RPN), accelerating the search for proposals and speeding up image recognition [13]. The framework of Faster RCNN is shown in Fig. 1.

[Figure: Input Image -> Resize Image -> CNN (13 conv layers, 13 relu layers, 4 pooling layers) -> Feature Map -> RPN (3×3 conv, 1×1 conv, 1×1 conv, Sigmoid, Proposal) -> ROI Pooling -> Full connection -> classification and regression]
Figure 1. Framework of Faster RCNN

The input image is processed by the CNN, a multi-layer neural network containing convolutional layers, ReLU (Rectified Linear Unit) layers and pooling layers. With the feature map derived, a parallel processing unit is added in which the RPN generates the region proposals. Faster RCNN can be seen as the combination of Fast RCNN and RPN. The loss function of Fast RCNN can be described as follows, and the training goal is to minimize it:

$$L(\{p_i\},\{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \lambda \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*) \tag{1}$$

where $p_i$ is the probability that the target is predicted by anchor $i$, and $p_i^*$ is the ground truth:

$$p_i^* = \begin{cases} 0, & \text{negative label} \\ 1, & \text{positive label} \end{cases} \tag{2}$$

$t_i = \{t_x, t_y, t_w, t_h\}$ is a vector representing the 4 coordinates of the bounding box, and $t_i^*$ is the ground-truth coordinate vector associated with a positive anchor.

Then the loss functions are as follows:

$$L_{cls}(p_i, p_i^*) = -\log[p_i p_i^* + (1 - p_i^*)(1 - p_i)] \tag{3}$$

$$L_{reg}(t_i, t_i^*) = R(t_i - t_i^*) \tag{4}$$

$L_{cls}$ is the classification loss and $L_{reg}$ is the regression loss. $R$ is the smooth L1 function.

The RPN and Fast RCNN share common convolutional layers. The RPN is trained end-to-end through back-propagation and stochastic gradient descent.

B. RBM
The RBM is characterized by unsupervised learning. An RBM has n visible nodes and m hidden nodes, with $v_i \in \{0,1\}$ and $h_j \in \{0,1\}$, where 1 means that the node is activated and 0 means that it is not. The energy of the RBM is:

$$E(v, h | \Delta) = -\sum_{i=1}^{n} a_i v_i - \sum_{j=1}^{m} b_j h_j - \sum_{i=1}^{n} \sum_{j=1}^{m} v_i W_{ij} h_j \tag{5}$$

where $\Delta = \{W_{ij}, a_i, b_j\}$, $W_{ij}$ is the weight between $v_i$ and $h_j$, $a_i$ is the bias of $v_i$ and $b_j$ is the bias of $h_j$. The activation probabilities are:

$$P(v_i = 1 | h, \Delta) = \sigma\Big(a_i + \sum_{j=1}^{m} W_{ij} h_j\Big) \tag{6}$$

$$P(h_j = 1 | v, \Delta) = \sigma\Big(b_j + \sum_{i=1}^{n} v_i W_{ij}\Big) \tag{7}$$

where $\sigma$ is the sigmoid function.

Training the RBM can be treated as the problem of minimizing $E(v, h | \Delta)$. The joint probability distribution of the hidden layer and the visible layer can be defined as:

$$P(v, h | \Delta) = \frac{e^{-E(v, h | \Delta)}}{Z(\Delta)} \tag{8}$$

where $Z$ is:

$$Z(\Delta) = \sum_{v, h} e^{-E(v, h | \Delta)} \tag{9}$$

The marginal distribution of $P$ is:

$$P(v | \Delta) = \frac{1}{Z(\Delta)} \sum_{h} e^{-E(v, h | \Delta)} \tag{10}$$

The likelihood function can be defined as:

$$L(\Delta | v) = \prod_{v} P(v | \Delta) \tag{11}$$

The logarithm of the likelihood is then adopted to carry out the derivation.
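The loss in Eqs. (1) to (4) of Section II.A can be illustrated with a minimal Python sketch. It is not the implementation used in the paper: the function names are hypothetical, the smooth L1 threshold `beta=1.0`, the weight `lam=1.0` and the default normalizers (the number of anchors) are assumptions, and summing the smooth L1 term over the four box coordinates follows the usual convention rather than anything stated in the text.

```python
import math

def smooth_l1(x, beta=1.0):
    """Smooth L1 function R: quadratic for |x| < beta, linear otherwise (beta is assumed)."""
    ax = abs(x)
    return 0.5 * ax * ax / beta if ax < beta else ax - 0.5 * beta

def cls_loss(p, p_star):
    """Eq. (3): log loss for one anchor with predicted probability p and label p_star."""
    return -math.log(p * p_star + (1.0 - p_star) * (1.0 - p))

def reg_loss(t, t_star):
    """Eq. (4): smooth L1 of the coordinate differences, summed over the 4 coordinates."""
    return sum(smooth_l1(a - b) for a, b in zip(t, t_star))

def total_loss(p, p_star, t, t_star, lam=1.0, n_cls=None, n_reg=None):
    """Eq. (1): normalized classification loss plus lam times the regression loss."""
    n_cls = n_cls or len(p)  # assumed default normalizer
    n_reg = n_reg or len(p)  # assumed default normalizer
    l_cls = sum(cls_loss(pi, ps) for pi, ps in zip(p, p_star)) / n_cls
    # The factor p_star switches the regression term off for negative anchors.
    l_reg = sum(ps * reg_loss(ti, ts) for ps, ti, ts in zip(p_star, t, t_star)) / n_reg
    return l_cls + lam * l_reg
```

Because the regression term is multiplied by $p_i^*$, the box coordinates of negative anchors do not affect the result, which matches the role of the ground-truth label in Eq. (1).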
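The RBM quantities in Eqs. (5) to (10) of Section II.B can be checked numerically on a toy model. The sketch below is only an illustration under stated assumptions: the function names and the toy parameters `W`, `a`, `b` are hypothetical, and the partition function is evaluated by brute force over all states, which is feasible only for a handful of nodes.

```python
import math
from itertools import product

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def energy(v, h, W, a, b):
    """Eq. (5): energy of a joint configuration (v, h)."""
    e = -sum(ai * vi for ai, vi in zip(a, v))
    e -= sum(bj * hj for bj, hj in zip(b, h))
    e -= sum(v[i] * W[i][j] * h[j] for i in range(len(v)) for j in range(len(h)))
    return e

def p_v_given_h(h, W, a):
    """Eq. (6): P(v_i = 1 | h) for every visible node."""
    return [sigmoid(a[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(a))]

def p_h_given_v(v, W, b):
    """Eq. (7): P(h_j = 1 | v) for every hidden node."""
    return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b))]

def partition(W, a, b):
    """Eq. (9): sum of exp(-E) over all 2^(n+m) states (toy sizes only)."""
    n, m = len(a), len(b)
    return sum(math.exp(-energy(v, h, W, a, b))
               for v in product((0, 1), repeat=n)
               for h in product((0, 1), repeat=m))

def marginal_v(v, W, a, b):
    """Eq. (10): P(v) obtained by summing the joint distribution over h."""
    z = partition(W, a, b)
    return sum(math.exp(-energy(v, h, W, a, b))
               for h in product((0, 1), repeat=len(b))) / z

# Hypothetical toy parameters: n = 3 visible nodes, m = 2 hidden nodes.
W = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]
a = [0.0, 0.1, -0.1]
b = [0.2, -0.3]
```

With these definitions, the marginals from Eq. (10) sum to one over all visible states, and the conditional in Eq. (7) agrees with the ratio computed directly from the joint distribution in Eq. (8).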
Differentiating the log-likelihood with respect to $\Delta$ gives:

$$\frac{\partial \ln L(\Delta | v)}{\partial \Delta} = \sum_{v} \frac{\partial \ln P(v | \Delta)}{\partial \Delta} \tag{12}$$

According to the learning algorithm proposed in [14], the weight update of every parameter can be derived:

$$\Delta w_{ij} = E_{data}(v_i, h_j) - E_{model}(v_i, h_j) \tag{13}$$

where $E_{data}$ and $E_{model}$ denote expectations under the data distribution and the model distribution, respectively.

III. PROPOSED METHOD FOR DISASTER DAMAGE RECOGNITION
In the proposed method, the RBM network is adopted in place of the fully connected layer to realize the classification. While the fully connected layer is replaced, the CNN is retained for region proposal and feature extraction. The framework is shown in Fig. 2.

[Figure: Input Image -> CNN (Conv layers, Pooling) -> Feature maps -> RPN and ROI Pooling -> RBM (Visible layer, Hidden layer) -> classification]
Figure 2. Proposed framework of intelligent recognition

According to the framework shown in Fig. 2, the process of intelligent recognition is divided into the following steps:
• Initialize the parameters of the model in Fig. 2;
• Process the input image with the CNN of Faster RCNN;
• Extract features from the input image;
• Train on and recognize normal wire poles and damaged poles;
• Classify and evaluate the recognition targets.

The training process is shown in Fig. 3.

[Figure: Preparing the samples -> Training the Faster RCNN (including RPN and CNN) -> Fine-tuning Faster RCNN -> Replacing the FC layer with RBM -> Fine-tuning the joint recognition model (with new samples)]
Figure 3. The training process of the proposed model

Through the training process, the trained model is derived and a detailed test is carried out. The results of applying the model to practical recognition are presented in the next part.

IV. RESULTS OF PRACTICAL APPLICATION
We collected 7615 sample pole images, of which 5239 are used as the training dataset and the others are used as the testing dataset. The image size ranges from 512 pixels×512 pixels to 1024 pixels×1024 pixels. We use Ubuntu 16.04 as the operating system and Tensorflow as the deep learning framework. The results are as follows.

Figure 4. Performance of the joint recognition model

In Fig. 4, avg_loss means the average loss, loss means the total loss and rate means the current learning rate. The abscissa counts training batches, and it can be seen that after nearly 240 batches the loss falls to an acceptable range.

(a) Multi-scale target recognition (b) Recognition in dark conditions (c) Overlapped target recognition (d) Recognition in similar background
Figure 5. Pole recognition in different situations

In Fig. 5, the proposed model shows its ability to recognize normal and damaged pole images captured by UAV in different situations. From the perspective of the UAV, many objects, including the target poles, are captured in one frame. With this method, the UAV can quickly find the damaged poles, and the method is suitable for practical application in different situations, including multi-scale targets, darkness, overlapping targets and the influence of similar backgrounds. It can be seen that a UAV equipped with intelligent recognition makes patrolling in the field easier work, especially after a disaster such as a typhoon. The UAV can quickly figure out the
damaged poles and generate a record sent to the repair team. This accelerates the collection of disaster information, which is quite beneficial to disaster recovery.

V. CONCLUSIONS
To improve the recognition of images of wire poles in the distribution network, the framework of Faster RCNN is modified and an RBM is introduced for unsupervised learning to make the model more adaptable to complex backgrounds. Tests are carried out on videos collected by UAV, and the results show that the proposed method can be applied to the detection of damaged poles after a disaster such as a typhoon.

ACKNOWLEDGMENT
This work is supported by the project (SGGR0000JSJS1800569) from State Grid Corporation of China and the project (52130018008K) from State Grid Fujian Electric Power Co., Ltd.

REFERENCES
[1] Montavon G, Samek W, Müller K R. Methods for Interpreting and Understanding Deep Neural Networks[J]. Digital Signal Processing, 2018, 73:1-15.
[2] Ren S, He K, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks. International Conference on Neural Information Processing Systems. MIT Press, 2015:91-99.
[3] Jeong H J, Park K S, Ha Y G. Image Preprocessing for Efficient Training of YOLO Deep Learning Networks. IEEE International Conference on Big Data and Smart Computing. IEEE Computer Society, 2018:635-637.
[4] Litjens G, Kooi T, Bejnordi B E, et al. A survey on deep learning in medical image analysis[J]. Medical Image Analysis, 2017, 42(9):60-88.
[5] Angermueller C, Lee H J, Reik W, et al. DeepCpG: accurate prediction of single-cell DNA methylation states using deep learning[J]. Genome Biology, 2017, 18(1):90.
[6] Kruthiventi S S S, Ayush K, Babu R V. DeepFix: A Fully Convolutional Neural Network for Predicting Human Eye Fixations[J]. IEEE Transactions on Image Processing, 2017, 26(9):4446-4456.
[7] Cha Y J, Choi W, Büyüköztürk O. Deep Learning-Based Crack Damage Detection Using Convolutional Neural Networks[J]. Computer-Aided Civil & Infrastructure Engineering, 2017, 32(5):361-378.
[8] Wang W, Tian B, Liu Y, et al. Study on the Electrical Devices Detection in UAV Images based on Region Based Convolutional Neural Networks[J]. Journal of Geo-Information Science, 2017.
[9] Ying L, Guo Z, Chen Y. Convolutional-recursive network based current transformer infrared fault image diagnosis[J]. Power System Protection & Control, 2015.
[10] Dong W, Gong Q, Lai W, et al. Research on Internal and External Fault Diagnosis and Fault-selection of Transmission Line Based on Convolutional Neural Network. Proceedings of the CSEE, 2016.
[11] Jia X, Zhang J, Wen X. Infrared faults recognition for electrical equipments based on dual supervision signals deep learning[J]. Infrared & Laser Engineering, 2018.
[12] Chen F, Fu Z, Yang Z. Wind power generation fault diagnosis based on deep learning model in internet of things (IoT) with clusters[J]. Cluster Computing, 2018(9):1-13.
[13] Girshick R, Donahue J, Darrell T, et al. Region-based convolutional networks for accurate object detection and segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(1):142-158.
[14] Hinton G E. Training Products of Experts by Minimizing Contrastive Divergence[J]. Neural Computation (S0899-7667), 2002, 14(8):1771-1800.
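As a supplementary illustration of the weight update in Eq. (13) of Section II.B, the following Python sketch performs one step of contrastive divergence in the spirit of [14]. It is a sketch under assumptions, not the training code of the paper: the function names and the learning rate `lr` are hypothetical, the model expectation is approximated by a single Gibbs step (CD-1), and the bias updates follow the same data-minus-model pattern as the weights.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hidden_probs(v, W, b):
    """P(h_j = 1 | v) for every hidden node, as in Eq. (7)."""
    return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b))]

def visible_probs(h, W, a):
    """P(v_i = 1 | h) for every visible node, as in Eq. (6)."""
    return [sigmoid(a[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(a))]

def sample(probs, rng):
    """Draw binary states from independent Bernoulli probabilities."""
    return [1 if rng.random() < p else 0 for p in probs]

def cd1_step(v0, W, a, b, lr=0.1, rng=random):
    """One CD-1 update in place; returns the reconstructed visible vector."""
    ph0 = hidden_probs(v0, W, b)   # data-driven hidden probabilities
    h0 = sample(ph0, rng)
    v1 = sample(visible_probs(h0, W, a), rng)  # one-step reconstruction
    ph1 = hidden_probs(v1, W, b)   # reconstruction-driven hidden probabilities
    for i in range(len(a)):
        for j in range(len(b)):
            # Eq. (13): data statistics minus (approximate) model statistics.
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
    for i in range(len(a)):
        a[i] += lr * (v0[i] - v1[i])
    for j in range(len(b)):
        b[j] += lr * (ph0[j] - ph1[j])
    return v1
```

Calling `cd1_step` repeatedly over the training vectors, for example with `rng=random.Random(0)` for reproducibility, would give a basic training loop. When the reconstruction equals the input, the data and model terms cancel and the parameters are left unchanged.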
