
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TFUZZ.2017.2738605, IEEE Transactions on Fuzzy Systems

IEEE TRANSACTIONS ON FUZZY SYSTEMS 1

A Pythagorean-Type Fuzzy Deep Denoising Autoencoder for Industrial Accident Early Warning
Yu-Jun Zheng, Member, IEEE, Sheng-Yong Chen, Senior Member, IEEE, Yu Xue, and Jin-Yun Xue

Abstract—Early warning is crucial for preventing industrial accidents and mitigating damage, but current methods are often time-consuming, error-prone, and ill-equipped to deal with uncertainty. The paper presents a fuzzy deep neural network for early warning of industrial accidents, which equips the classical deep denoising autoencoder (DDAE) model with Pythagorean-type fuzzy parameters in order to enhance the model's representation ability and robustness. To efficiently train the fuzzy deep model, we propose a hybrid algorithm combining Hessian-free (HF) optimization and the biogeography-based optimization (BBO) metaheuristic to balance global search and local search. Experiments on datasets from several industrial zones in China show that the proposed Pythagorean-type fuzzy DDAE (PFDDAE) can achieve much higher accuracy of accident risk classification than the classical DDAE and the fuzzy DDAE using regular fuzzy parameters, and the proposed hybrid learning algorithm exhibits a significant performance advantage over some other learning algorithms in training PFDDAE. In particular, a test on the 2014 Kunshan aluminum dust explosion accident shows that the deep learning model would very likely have prevented the accident had it been adopted in advance.

Index Terms—Deep learning, deep denoising autoencoder (DDAE), Pythagorean fuzzy set (PFS), evolutionary learning, biogeography-based optimization (BBO), accident early warning.

Manuscript received ...(date to be filled in by Editor).
Y.J. Zheng is with the College of Computer Science & Technology, Zhejiang University of Technology, Hangzhou, 310023 China (e-mail: yujun.zheng@computer.org).
S.Y. Chen is with the School of Computer Science & Engineering, Tianjin University of Science & Technology, Tianjin, 300384 China.
Y. Xue is with the School of Computer & Software, Nanjing University of Information Science & Technology, Nanjing 210044, China.
J.Y. Xue is with the Jiangxi Provincial Lab of High-Performance Computing, Jiangxi Normal University, Nanchang, 330022 China.

I. INTRODUCTION

WITH the rapid development of China's economy, an increasing number of industrial accidents have occurred in the country in recent years, such as the 2013 Shanghai Baoshan liquid ammonia leakage, the 2014 Kunshan aluminum dust explosion, and the 2015 Tianjin Port dangerous goods explosion, which have caused serious damage to lives and property. Investigations of these accidents reveal that, in almost all cases, a variety of risk symptoms were actually observable before the accidents but were unfortunately ignored by managers and regulators. Some of this has been attributed to dereliction of duty, but much is due to the uncertainty, imprecision, and inconsistency of the risk symptoms, which make it very difficult to judge situations and make decisions [1]. For example, overload operation has been identified as one of the top causes of industrial accidents in manufacturing, but for most machines it is difficult to define an accurate threshold of "overload" that will result in an accident.

Early warning is one of the most important arms of our society for preventing industrial accidents and mitigating damage [2]. Traditional approaches to accident early warning based on risk assessment and control rely heavily on expert experience and knowledge, using a typical process that (1) identifies a variety of potential risk factors; (2) verifies the relationships between risk factors and accidents, and thus determines key risk factors that are highly correlated with accidents; and (3) analyzes the impact of the key factors on accident occurrence and severity, based on which the rules for early warning and/or accident prevention are developed [3]. Chae [4] developed a construction site accident warning system that estimates the distance between worker and equipment based on RFID, judges the collision potential according to the distance, and sends a warning message to the worker and relevant parties if the potential is high. However, the study did not state how to set a threshold of the potential. By establishing the similarity between early-warning systems and biological immune systems, Chen et al. [5] designed an accident early-warning mechanism for chemical industrial parks, which uses a memory to store not only risk factors but also the feedback results of reactive warning/prevention policies for further improvement. For accident risk early warning of ventilation, gas, dust and fire in coal mines, Lin et al. [6] developed a "macroscopic" model that employs a closed-loop running process to monitor whether the production system is in a safe state. Based on accident-causing theory, Yuan et al. [7] developed a hierarchical risk source identification method that builds an event causal chain in accordance with the order of event occurrence and then mines the risk sources from the nodes of the chain. The feasibility of the method was verified on the oil pipeline explosion accident of Dalian Xinggang harbor on July 16, 2010. In [8], Dodshon and Burgess-Limerick applied the Bow Tie Analysis technique, which combines fault-tree analysis and event-tree analysis, to identify incident initiating events in the mining industry. Ding and Zhou [9] developed a web-based system for safety risk early warning in urban metro construction, where a data fusion model was employed to integrate multi-source information including monitoring measurements, visual inspections, and calculated predictions. Other safety early warning systems have also been reported, such as the coal-mining production risk monitoring and early warning technology system [10], the rule-based reasoning system for risk identification of maritime accidents [11], and the knowledge-based system for predicting the risk of component and system failures in power grids [12]. Rescue Wings, a web-based rescue decision support system

1063-6706 (c) 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

developed by the authors [13], supports receiving disaster alerts from personnel on site and then broadcasting customized alerts to the affected victims and rescuers via mobile services.

The crisp models used in the above studies are ill-equipped to deal with the uncertainties arising from dynamic environments, incomplete and noisy data, imprecise measurements, etc., which often make the results inconsistent and subjective [14]–[18]. Consequently, some recent studies have employed fuzzy set theory [19] to handle such uncertainties. Zheng et al. [20] proposed a fuzzy analytic hierarchy process (AHP) for work safety evaluation and early warning rating of hot and humid environments, where trapezoidal fuzzy numbers are used to determine the weights of the hierarchy evaluation indexes and evaluate the performance of the indexes. Guo and Li [21] combined AHP and fuzzy synthetic evaluation (FSE) to calculate the factors affecting early-warning classification of coal mine accidents, and validated the effectiveness of the model on a case of rock burst accident. Wang et al. [22] developed a safety early-warning model of workface stray current based on an adaptive-network-based fuzzy inference system (ANFIS) [23], whose parameters are optimized via reverse transmission mixed with the least squares method. The current trend is to integrate different but complementary soft computing techniques such as fuzzy logic, neural networks, and heuristic search to cope with the increasing complexity of today's industrial processes and environments [24].

Nevertheless, almost all of the above approaches require great human effort to identify key factors and their abstractions from a large number of candidate factors, a process that is usually time-consuming, subjective, and error-prone because of the unknown complex multivariate probability distributions over the factors. Recent advances in deep neural networks provide us with a powerful tool for modeling such complex distributions by automatically discovering intermediate abstractions [25]–[28]. Very recently, Chen et al. [29] proposed a fuzzy restricted Boltzmann machine (RBM) model where the governing parameters are fuzzy numbers, and demonstrated that the fuzzy model is more representative than its crisp counterpart. There are also other studies on modeling uncertainties in deep learning models. For example, Khodayar and Teshnehlab [30] used a rough regression layer in their model for handling uncertainty factors, and Song et al. [31], [32] used a Gaussian mixture model for the same purpose. In [33], Chen et al. used a correntropy-induced loss function to improve robustness to outliers. However, studies on the integration of deep learning and fuzzy sets are still very scarce.

In this paper, we propose a fuzzy deep neural network based on the denoising autoencoder model [34] and the Pythagorean-type fuzzy set (PFS) [35]–[37] for early warning of industrial accidents for managers and regulators in a variety of industrial fields. The main contributions of the paper are as follows:
• We propose a novel fuzzy deep denoising autoencoder (DDAE) model that improves the representation ability of the classical DDAE [38] by using Pythagorean fuzzy parameters, which allow for a larger body of membership grades than regular and other non-standard fuzzy numbers [35] and enable each neuron to learn not only how a feature contributes to producing the correct answer but also how it does not contribute to producing the answer. Our fuzzy strategy can also be applied to other deep learning models such as deep Boltzmann machines and their variations [27], [39].
• We develop a new hybrid learning algorithm for training the Pythagorean-type fuzzy DDAE (PFDDAE) model, which combines a metaheuristic for facilitating exploration (global search) and a gradient-based method for enhancing exploitation (local search), and thus suppresses premature convergence and improves the learning performance effectively.
• We construct a PFDDAE-based neural network for accident early warning, which is of critical importance in industry. Experimental results show that it can achieve higher accuracy than the models with crisp parameters and regular fuzzy parameters. In particular, a test on the 2014 Kunshan aluminum dust explosion accident shows that our approach would very likely have prevented the accident had it been adopted in advance.

It should be noted that, although neural networks have been widely used in areas such as fault detection and assessment [40], their use for safety warning has been criticized and limited because their results are generally not interpretable and thus cannot provide sufficient information about the causes of the risks [24], [41]. Our deep neural network is therefore designed for estimating both the type and the level of the potential risks, which can provide important support to the decision-maker in taking preventive and reactive measures. However, it may still require in-depth inspection and analysis to determine the root causes and their locations.

The rest of the paper is structured as follows. Section II introduces the preliminaries of DDAE and PFS. Section III presents our PFDDAE model, Section IV proposes the hybrid learning algorithm for the model, Section V describes the PFDDAE-based neural network for accident early warning, the performance of which is tested by experiments in Section VI, and Section VII concludes with discussion.

II. PRELIMINARIES

A. Autoencoder (AE), Denoising Autoencoder (DAE), and Deep Denoising Autoencoder (DDAE)

A basic autoencoder (AE) [42] takes an input vector $x \in [0, 1]^{D}$ and first transforms (encodes) it to a hidden representation $y \in [0, 1]^{D'}$ through an affine mapping followed by a squashing nonlinearity $s$:

$$f_{\theta}(x) = s(Wx + b) \tag{1}$$

where $\theta = [W, b]$, $W$ is a $D' \times D$ weight matrix and $b$ is a $D'$-dimensional bias vector.

The resulting hidden representation $y$ is then mapped back (decoded) to a reconstructed vector $z \in [0, 1]^{D}$ in input space with appropriately sized parameters $\theta' = [W', b']$:

$$g_{\theta'}(y) = s(W'y + b') \tag{2}$$

AE training consists in minimizing the average reconstruction error over a training set $\mathcal{D}$:

$$\arg\min_{\theta,\theta'} \frac{1}{n} \sum_{x \in \mathcal{D}} L\big(x, g_{\theta'}(f_{\theta}(x))\big) \tag{3}$$
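As an illustration only, the encoding, decoding, and averaging steps in (1)–(3) can be sketched in plain Python. The logistic sigmoid used for $s$, the squared-error loss, and all function names below are assumptions made for this sketch; they are not details given in the paper.

```python
import math

def sigmoid(a):
    """Logistic squashing function, assumed here for s in (1) and (2)."""
    return 1.0 / (1.0 + math.exp(-a))

def affine_sigmoid(W, b, v):
    """Compute s(W v + b) for a weight matrix W (list of rows) and bias b."""
    return [sigmoid(sum(w * v_j for w, v_j in zip(row, v)) + b_i)
            for row, b_i in zip(W, b)]

def encode(theta, x):
    """y = f_theta(x) as in (1); theta is the pair (W, b)."""
    W, b = theta
    return affine_sigmoid(W, b, x)

def decode(theta_prime, y):
    """z = g_theta'(y) as in (2); theta_prime is the pair (W', b')."""
    W_p, b_p = theta_prime
    return affine_sigmoid(W_p, b_p, y)

def squared_error(x, z):
    """Squared Euclidean reconstruction error (an assumed choice of L)."""
    return sum((x_k - z_k) ** 2 for x_k, z_k in zip(x, z))

def average_reconstruction_error(theta, theta_prime, dataset, loss=squared_error):
    """Quantity minimized in (3): mean loss between x and its reconstruction."""
    total = sum(loss(x, decode(theta_prime, encode(theta, x))) for x in dataset)
    return total / len(dataset)

# Toy example: D = 3 inputs, D' = 2 hidden units, all-zero parameters.
theta = ([[0.0] * 3 for _ in range(2)], [0.0] * 2)
theta_prime = ([[0.0] * 2 for _ in range(3)], [0.0] * 3)
data = [[0.0, 1.0, 0.5], [1.0, 0.0, 0.5]]
err = average_reconstruction_error(theta, theta_prime, data)
```

With all-zero parameters every hidden and reconstructed value equals 0.5, so the sketch only exercises the data flow; a real model would minimize this quantity over $\theta$ and $\theta'$.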


In (3), $n$ is the size of the dataset and $L$ is a loss function. For real-valued vectors, $L$ can be the traditional squared error:

$$L(x, z) = \|x - z\|^{2} \tag{4}$$

For binary vectors or vectors of binary probabilities (Bernoullis), $L$ can be the reconstruction cross-entropy:

$$L(x, z) = -\sum_{k=1}^{D} \big( x_k \log z_k + (1 - x_k) \log(1 - z_k) \big) \tag{5}$$

A denoising autoencoder (DAE) [34] is a simple variant of the basic AE that reconstructs a clean "repaired" input from a corrupted one. As shown in Fig. 1, DAE first corrupts an initial input $x$ into $\tilde{x}$ by means of a stochastic mapping $\tilde{x} \sim q_{\mathcal{D}}(\tilde{x} \mid x)$, and then maps the corrupted input $\tilde{x}$, as with the basic AE, to a hidden representation $y = f_{\theta}(\tilde{x}) = s(W\tilde{x} + b)$, from which we reconstruct $z = g_{\theta'}(y) = s(W'y + b')$.

DAE training still consists in minimizing the average reconstruction error, but the key difference is that $z$ is now a deterministic function of $\tilde{x}$ rather than $x$, and thus the result of a stochastic mapping of $x$:

$$\arg\min_{\theta,\theta'} \frac{1}{n} \sum_{x \in \mathcal{D}} L\big(x, g_{\theta'}(f_{\theta}(\tilde{x}))\big) \tag{6}$$

A DDAE [38] is an extension of DAE that has multiple hidden layers, where each layer captures complicated, higher-order correlations between the activities of hidden features in the layer below. Note that input corruption is only used for the initial denoising training of each individual layer. Once the mapping $f_{\theta}$ has been learnt, it is henceforth used on uncorrupted inputs; no corruption is applied to produce the representation that will serve as clean input for training the next layer. Fig. 2 illustrates a two-layer DDAE, where the 2nd layer is trained to optimize the following objective function:

$$\arg\min_{\theta^{(2)},\theta'^{(2)}} \frac{1}{n} \sum_{x \in \mathcal{D}} L\big(x^{(2)}, g^{(2)}_{\theta'}(f^{(2)}_{\theta}(\tilde{x}^{(2)}))\big) \tag{7}$$

where $\tilde{x}^{(2)}$ is a corrupted version of the clean input $x^{(2)} = f_{\theta}(x)$ to the 2nd layer (after $f_{\theta}$ has been learnt at the 1st layer). By corrupting the input of autoencoders, DDAE has shown a surprising advantage over the basic deep autoencoders on various classification and prediction problems [34], [43].

B. Pythagorean Fuzzy Set (PFS)

The theory of fuzzy sets, initially proposed by Zadeh [19], is a generalization of classical set theory in that the membership of an element to a set $S$ (called the support set) is graded between 0 and 1 as opposed to being purely Boolean. The intuitionistic fuzzy set (IFS) [44] extends the basic fuzzy set by utilizing a membership degree and a non-membership degree whose sum is less than or equal to 1, to assess an element from the positive side and the negative side simultaneously. More recently, Yager [35], [36] proposed the Pythagorean fuzzy set (PFS), which is more general than IFS in that the sum of squares of the membership degree and the non-membership degree is less than or equal to 1. Formally, let $S$ be an arbitrary non-empty set; a PFS $P$ is a mathematical object of the form:

$$P = \{\langle x, P(\mu_P(x), \nu_P(x)) \rangle \mid x \in S\} \tag{8}$$

where $\mu_P(x): S \to [0, 1]$ and $\nu_P(x): S \to [0, 1]$ are respectively the membership degree and the non-membership degree of the element $x$ to $S$ in $P$, satisfying $\mu_P^{2}(x) + \nu_P^{2}(x) \le 1$. The hesitant degree of $x \in S$ is expressed as:

$$\pi_P(x) = \sqrt{1 - \mu_P^{2}(x) - \nu_P^{2}(x)} \tag{9}$$

For convenience, $\beta = P(\mu_\beta, \nu_\beta)$ is called a Pythagorean fuzzy number (PFN) [45], which satisfies $\mu_\beta, \nu_\beta \in [0, 1]$ and $\mu_\beta^{2} + \nu_\beta^{2} \le 1$. The following operations are defined on PFNs:

$$\beta^{C} = P(\nu_\beta, \mu_\beta)$$
$$\beta_1 + \beta_2 = P\Big(\sqrt{\mu_{\beta_1}^{2} + \mu_{\beta_2}^{2} - \mu_{\beta_1}^{2}\mu_{\beta_2}^{2}},\; \nu_{\beta_1}\nu_{\beta_2}\Big)$$
$$\beta_1 \times \beta_2 = P\Big(\mu_{\beta_1}\mu_{\beta_2},\; \sqrt{\nu_{\beta_1}^{2} + \nu_{\beta_2}^{2} - \nu_{\beta_1}^{2}\nu_{\beta_2}^{2}}\Big)$$
$$\lambda\beta = P\Big(\sqrt{1 - (1 - \mu_\beta^{2})^{\lambda}},\; \nu_\beta^{\lambda}\Big)$$
$$\beta^{\lambda} = P\Big(\mu_\beta^{\lambda},\; \sqrt{1 - (1 - \nu_\beta^{2})^{\lambda}}\Big)$$

The following score function [45] and accuracy function [37] can be used for ranking different PFNs:

$$s(\beta) = \mu_\beta^{2} - \nu_\beta^{2} \tag{10}$$
$$h(\beta) = \mu_\beta^{2} + \nu_\beta^{2} \tag{11}$$

Based on the two functions, the ranking of any two PFNs $\beta_1 = P(\mu_{\beta_1}, \nu_{\beta_1})$ and $\beta_2 = P(\mu_{\beta_2}, \nu_{\beta_2})$ is defined as follows [37], [46]:
1) If $s(\beta_1) < s(\beta_2)$, then $\beta_1 < \beta_2$;
2) If $s(\beta_1) = s(\beta_2)$, then
   2.1) If $h(\beta_1) < h(\beta_2)$, then $\beta_1 < \beta_2$;
   2.2) If $h(\beta_1) = h(\beta_2)$, then $\beta_1 = \beta_2$.

III. PYTHAGOREAN-TYPE FUZZY DEEP DENOISING AUTOENCODER (PFDDAE)

The proposed PFDDAE extends DDAE by using PFNs to express the model's governing parameters, in order to improve the representation ability and robustness of the model by supporting fuzzy probability distributions [47], [48] over cross-layer units. In comparison with regular fuzzy numbers, PFN parameters characterized by both membership and non-membership functions enable neurons to learn not only how a feature contributes to producing the correct answer but also how it does not contribute to producing the answer. Moreover, PFS admits a larger set of membership grades than IFS and thus can be used in more situations without requiring the user to change their information to satisfy the constraints of IFS [36]. For example, if a user indicates that the support for membership of $x$ is 0.8 and the support against membership is 0.6, the IFS constraint is violated ($0.8 + 0.6 > 1$) while the PFS constraint holds ($0.8^{2} + 0.6^{2} = 1$).

Let $\bar{\theta}^{(l)} = [\bar{W}^{(l)}, \bar{b}^{(l)}]$ and $\bar{\theta}'^{(l)} = [\bar{W}'^{(l)}, \bar{b}'^{(l)}]$ denote the fuzzy parameters of the $l$th layer of a PFDDAE illustrated by Fig. 3. When training the $l$th layer, the objective function to be optimized is:

$$\arg\min_{\bar{\theta}^{(l)},\bar{\theta}'^{(l)}} \frac{1}{n} \sum_{x \in \mathcal{D}} \bar{L}\big(\bar{x}^{(l)}, \bar{g}^{(l)}_{\bar{\theta}'}(\bar{f}^{(l)}_{\bar{\theta}}(\tilde{\bar{x}}^{(l)}))\big) \tag{12}$$
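The PFN operations and the score-then-accuracy ranking rule of Section II-B, on which the fuzzy parameters in (12) are built, can be sketched as follows. This is a minimal illustration: the class name `PFN`, the method names, and the small floating-point tolerance are assumptions of the sketch, not part of the paper.

```python
import math
from dataclasses import dataclass

_TOL = 1e-12  # assumed tolerance for floating-point round-off

def _root(v):
    """Square root with the argument clamped to [0, 1] against round-off."""
    return math.sqrt(min(1.0, max(0.0, v)))

@dataclass(frozen=True)
class PFN:
    mu: float  # membership degree
    nu: float  # non-membership degree

    def __post_init__(self):
        if not (0.0 <= self.mu <= 1.0 and 0.0 <= self.nu <= 1.0):
            raise ValueError("degrees must lie in [0, 1]")
        if self.mu ** 2 + self.nu ** 2 > 1.0 + _TOL:
            raise ValueError("mu^2 + nu^2 must not exceed 1")

    def hesitancy(self):
        """Hesitant degree as in (9)."""
        return _root(1.0 - self.mu ** 2 - self.nu ** 2)

    def complement(self):
        return PFN(self.nu, self.mu)

    def __add__(self, other):
        m1, m2 = self.mu ** 2, other.mu ** 2
        return PFN(_root(m1 + m2 - m1 * m2), self.nu * other.nu)

    def __mul__(self, other):
        n1, n2 = self.nu ** 2, other.nu ** 2
        return PFN(self.mu * other.mu, _root(n1 + n2 - n1 * n2))

    def scale(self, lam):
        """Scalar multiple lambda * beta."""
        return PFN(_root(1.0 - (1.0 - self.mu ** 2) ** lam), self.nu ** lam)

    def power(self, lam):
        """Power beta ** lambda."""
        return PFN(self.mu ** lam, _root(1.0 - (1.0 - self.nu ** 2) ** lam))

    def score(self):
        return self.mu ** 2 - self.nu ** 2   # (10)

    def accuracy(self):
        return self.mu ** 2 + self.nu ** 2   # (11)

def rank_key(beta):
    """Sort key: compare by score first, then by accuracy."""
    return (beta.score(), beta.accuracy())

a = PFN(0.8, 0.6)   # valid PFN although 0.8 + 0.6 > 1 rules it out as an IFS pair
b = PFN(0.5, 0.5)
c = PFN(0.6, 0.6)
ordered = sorted([c, a, b], key=rank_key)   # b and c tie on score; accuracy decides
```

Here `b` and `c` both have score 0, so the accuracy function breaks the tie and places `b` before `c`, while `a` ranks highest because its score is positive.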


Fig. 1. Denoising autoencoder (DAE) [34].

Fig. 2. Deep denoising autoencoder (DDAE) [38]: (a) architecture; (b) training of the 2nd layer.

Fig. 3. Pythagorean-type fuzzy deep denoising autoencoder (PFDDAE).

In (12), $\tilde{\bar{x}}^{(l)}$ is a corrupted version of the clean input $\bar{x}^{(l)} = \bar{f}^{(l-1)}_{\bar{\theta}}(\bar{x}^{(l-1)})$ to the $l$th layer.

This yields a fuzzy maximum likelihood problem because all the input vectors except the input to the 1st layer and all the reconstructed vectors are composed of PFN elements. In general, such a problem is quite intractable and thus the learning process may be very inefficient [29]. To tackle this issue, here we use the similarity measure of fuzzy numbers based on centroid methods to evaluate the fuzzy loss function $\bar{L}$. Since a PFN $\beta = P(\mu_\beta, \nu_\beta)$ is associated with a membership function and a non-membership function, we first calculate the centroid points of the two functions respectively [49], [50]:

$$c_\mu(\beta) = \big(x_\mu(\beta), y_\mu(\beta)\big) = \left(\frac{\int x\,\mu_\beta(x)\,dx}{\int \mu_\beta(x)\,dx},\; \frac{\int y\,\mu'_\beta(y)\,dy}{\int \mu'_\beta(y)\,dy}\right) \tag{13}$$

$$c_\nu(\beta) = \big(x_\nu(\beta), y_\nu(\beta)\big) = \left(\frac{\int x\,\nu_\beta(x)\,dx}{\int \nu_\beta(x)\,dx},\; \frac{\int y\,\nu'_\beta(y)\,dy}{\int \nu'_\beta(y)\,dy}\right) \tag{14}$$

where $\mu'_\beta$ and $\nu'_\beta$ are the inverse functions of $\mu_\beta$ and $\nu_\beta$ respectively.

Based on their centroids, we can calculate the distance between two PFNs $\beta_1$ and $\beta_2$ as [51]:

$$\|\beta_1, \beta_2\| = |c_\mu(\beta_1), c_\mu(\beta_2)|^{2} + |c_\nu(\beta_1), c_\nu(\beta_2)|^{2} \tag{15}$$

where $|c_1, c_2|$ denotes the Euclidean distance between the two points $c_1$ and $c_2$. Accordingly, the distance between two fuzzy vectors $\bar{x}$ and $\bar{z}$ is evaluated as the square root of the sum of squares of the distances between the corresponding fuzzy components of the two vectors:

$$\|\bar{x}, \bar{z}\| = \left(\sum_{i=1}^{D} \|\bar{x}_i, \bar{z}_i\|^{2}\right)^{\frac{1}{2}} \tag{16}$$

Fig. 4 illustrates the calculation of the centroid of a trapezoidal PFN [52], [53]. As special cases of the trapezoidal PFN, interval and triangular PFNs have simpler equations. Thus any form of PFN can be used in the PFDDAE model.

After defuzzification, the objective function of model learning becomes

$$\arg\min_{\bar{\theta}^{(l)},\bar{\theta}'^{(l)}} \frac{1}{n} \sum_{x \in \mathcal{D}} \bar{L}\big(\bar{x}^{(l)}, \bar{g}^{(l)}_{\bar{\theta}'}(\bar{f}^{(l)}_{\bar{\theta}}(\tilde{\bar{x}}^{(l)}))\big) = \arg\min_{\bar{\theta}^{(l)},\bar{\theta}'^{(l)}} \frac{1}{n} \sum_{x \in \mathcal{D}} \big\|\bar{x}^{(l)}, \bar{g}^{(l)}_{\bar{\theta}'}(\bar{f}^{(l)}_{\bar{\theta}}(\tilde{\bar{x}}^{(l)}))\big\|^{2} \tag{17}$$
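The centroid-based distances in (15) and (16) and the defuzzified objective value in (17) can be sketched as below. The sketch assumes the centroid points of each PFN have already been computed (for example from (13) and (14)) and represents a PFN simply as a pair of 2-D points; this representation and the function names are illustrative assumptions.

```python
import math

def euclid(c1, c2):
    """Euclidean distance |c1, c2| between two 2-D centroid points."""
    return math.hypot(c1[0] - c2[0], c1[1] - c2[1])

def pfn_distance(beta1, beta2):
    """Distance (15); each beta is (c_mu, c_nu) with (x, y) centroid points."""
    (c_mu1, c_nu1), (c_mu2, c_nu2) = beta1, beta2
    return euclid(c_mu1, c_mu2) ** 2 + euclid(c_nu1, c_nu2) ** 2

def vector_distance(x_bar, z_bar):
    """Distance (16): root of the summed squared component distances."""
    return math.sqrt(sum(pfn_distance(xi, zi) ** 2
                         for xi, zi in zip(x_bar, z_bar)))

def defuzzified_objective(inputs, reconstructions):
    """Value minimized in (17): mean squared distance over the dataset."""
    pairs = list(zip(inputs, reconstructions))
    return sum(vector_distance(x, z) ** 2 for x, z in pairs) / len(pairs)

# Toy data: one-component fuzzy vectors given directly by their centroids.
beta_a = ((0.0, 0.0), (0.0, 0.0))
beta_b = ((3.0, 4.0), (0.0, 0.0))
d_ab = pfn_distance(beta_a, beta_b)
obj = defuzzified_objective([[beta_a]], [[beta_b]])
```

Because the result is an ordinary real number, any crisp optimizer can be applied to it, which is the point of the defuzzification step.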


Fig. 4. The centroid of a trapezoidal Pythagorean fuzzy number $\beta = \langle (a, b, c, d, \mu_\beta), (a', b, c, d', \nu_\beta) \rangle$, where $u$ is the maximum degree of the membership function and $v$ the minimum degree of the non-membership function:

$$c_\mu(\beta) = \left(\frac{a + 2b + 2c + d}{6},\; \frac{(2a + b - 3c)(c + 2d - 3b) + 5u^{2}}{12u}\right)$$

$$c_\nu(\beta) = \left(\frac{a' + 2b + 2c + d'}{6},\; \frac{(2a' + b - 3c)(c + 2d' - 3b) + (1 - v)(5v + 7)}{12(1 - v)}\right)$$

Consequently, the fuzzy optimization problem is transformed into a crisp optimization problem, for which we can apply classical gradient-based methods directly. In the next section, we will propose a learning algorithm integrating a gradient-based method with a metaheuristic for PFDDAE.

It should be noted that using fuzzy membership functions will inevitably reduce the generalization ability of DDAE, and thus make it difficult to learn an over-complete representation of the input data. Consequently, the proposed PFDDAE is more suitable for learning in discrete attribute spaces [54] (especially those contaminated by uncertainty) than in continuous ones (e.g., image recognition) [55].

IV. A HYBRID LEARNING ALGORITHM FOR FUZZY DEEP DENOISING AUTOENCODER

Although deep learning has achieved great success, its application to new problems can still be difficult due to issues including speed of training, local optima, and manual selection of network structures [56]. Evolutionary algorithms have been widely used and have demonstrated their ability in training artificial neural networks [57], but their applications in deep neural networks are still limited [56], [58]–[60].

We propose for PFDDAE a hybrid learning algorithm, which combines biogeography-based optimization (BBO) [61] for facilitating exploration and the Hessian-free (HF) optimization algorithm [62] for enhancing exploitation in the solution space. BBO is a metaheuristic inspired by the science of biogeography. Given an optimization problem, BBO generates a population of initial solutions, and then continually evolves the solutions by "migrating" components from probably more suitable (high quality) solutions to less suitable ones. The equilibrium theory of the metaheuristic indicates that high (low) quality solutions have high emigration (immigration) rates. In a common linear migration model shown in Fig. 5, the immigration rate $\gamma_I(x)$ and emigration rate $\gamma_E(x)$ of a solution $x$ are calculated as:

$$\gamma_I(x) = I \left(\frac{f_{\max} - f(x) + \epsilon}{f_{\max} - f_{\min} + \epsilon}\right) \tag{18}$$

$$\gamma_E(x) = E \left(\frac{f(x) - f_{\min} + \epsilon}{f_{\max} - f_{\min} + \epsilon}\right) \tag{19}$$

where $f$ denotes the fitness function, $f_{\max}$ and $f_{\min}$ are the maximum and minimum function values in the population, $I$ and $E$ are the maximum possible immigration rate and emigration rate, which are typically both set to 1, and $\epsilon$ is a small constant to avoid division by zero.

Fig. 5. A linear model of emigration and immigration rates in BBO.

The HF algorithm is a truncated Newton's method that optimizes a local quadratic model of the objective, computing the required curvature information by finite differences at the cost of a single extra gradient evaluation and employing the linear conjugate gradient algorithm for optimizing quadratic objectives. It is adapted for deep learning by exploiting the idea that the gradients and log-likelihoods can be simultaneously computed when using mini-batches to compute the products, which can significantly decrease the computational cost of handling large datasets [62].

The hybrid learning algorithm is applied to PFDDAE layer by layer. At each hidden layer, the algorithm performs the following steps to find an optimal or near-optimal structure and parameter setting:

1) Randomly generate a population of $n$ solutions, each of which represents a design of the layer structure (including the number of hidden units and weight and bias settings);
2) For each solution (layer structure) in the population do
   2.1) Employ the HF algorithm to train the network constructed so far to minimize the reconstruction error (i.e., maximize the correct rate) on the training set;
   2.2) Assign the correct rate calculated as the fitness value to the solution.
3) Compute the immigration and emigration rates of all the solutions.
4) For each solution $x$ in the population do
   4.1) For each component (unit) of the layer, with a probability of $\gamma_I(x)$ replace the unit with the corresponding unit of another solution $x'$ selected with a probability proportional to $\gamma_E(x')$.
   4.2) With a given mutation rate, mutate the solution by randomly adding or removing a unit, or replacing a small number of weights with zero to increase the sparsity of the network;


   4.3) Employ the HF algorithm to train the new network; if the error decreases, replace the original solution with the modified $x$ in the population;
5) If the stop condition is not met, go to Step 3); otherwise stop learning of the current layer.
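Step 3) and the migration in Step 4.1), together with the linear rate model in (18) and (19), can be sketched as follows. This is a simplified illustration under stated assumptions: a solution is a plain list of opaque "units", fitness is a correct rate to be maximized, and the HF training and mutation steps are omitted. The function names and the default value of `eps` are choices made for the sketch.

```python
import random

def migration_rates(fitness, I=1.0, E=1.0, eps=1e-9):
    """Linear immigration and emigration rates as in (18) and (19)."""
    f_max, f_min = max(fitness), min(fitness)
    span = f_max - f_min + eps
    gamma_I = [I * (f_max - f + eps) / span for f in fitness]
    gamma_E = [E * (f - f_min + eps) / span for f in fitness]
    return gamma_I, gamma_E

def migrate(population, fitness, rng=random):
    """Step 4.1: each unit immigrates with probability gamma_I of its owner,
    taken from a donor chosen with probability proportional to gamma_E."""
    gamma_I, gamma_E = migration_rates(fitness)
    offspring = []
    for i, x in enumerate(population):
        child = list(x)
        donors = [j for j in range(len(population)) if j != i]
        weights = [gamma_E[j] for j in donors]
        for k in range(len(child)):
            if rng.random() < gamma_I[i]:
                j = rng.choices(donors, weights=weights, k=1)[0]
                if k < len(population[j]):   # skip if the donor has no such unit
                    child[k] = population[j][k]
        offspring.append(child)
    return offspring

# Toy population: three layer designs with two or three named units each.
pop = [["a1", "a2", "a3"], ["b1", "b2"], ["c1", "c2", "c3"]]
fit = [0.2, 0.5, 0.8]               # assumed correct rates after training
g_I, g_E = migration_rates(fit)
children = migrate(pop, fit, rng=random.Random(0))
```

In this toy run the worst solution (fitness 0.2) has an immigration rate of 1 and so receives a donor unit wherever the donor has one, whereas the best solution (fitness 0.8) has an immigration rate close to 0 and is left essentially untouched.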
Our experiments show that the hybrid algorithm (like most evolutionary algorithms) can achieve significant improvement even from a totally random initialization [58]. However, we find that setting most connection weights to zero (in our model we set the number of non-zero incoming connection weights to each unit to 20–30) can further improve performance, because doing this allows the units to be both highly differentiated and unsaturated [62].

Fig. 6 illustrates the migration operation described in Step 4.1). When training the $l$th layer, a solution $x$ contains a number of components $\{x_1, x_2, \ldots\}$, and each component represents a unit of the layer together with the connection weights between the unit and the units of the $(l-1)$th layer. Suppose the 2nd unit of $x$ is to be immigrated and $x'$ is selected as the emigrating solution; then $x'_2$ will replace $x_2$ in $x$ (the migration will be skipped if there is no corresponding unit in $x'$).

For the mutation operation described in Step 4.2), we use a constant mutation rate $\gamma_M$ for the worst half of the population, and the best half will never be mutated.

The proposed algorithm further improves the exploration ability of the basic BBO in two aspects. First, we employ a local ring topology for the population [63], where each solution is only directly connected to two other neighboring solutions, as shown in Fig. 7. Second, we introduce a control parameter $\eta$ in the range of $[0, 1]$ that denotes the "maturity" of the evolution: for each migration operation, the emigrating solution $x'$ has a probability of $\eta$ of being selected from the neighbors of the current solution $x$ and a probability of $(1 - \eta)$ of being selected from the rest. In the former case, let $x_L$ and $x_R$ be the left neighbor and right neighbor of $x$; then $x_L$ has a probability of $f(x_L)/(f(x_L) + f(x_R))$ of becoming the emigrating solution and $x_R$ has a probability of $f(x_R)/(f(x_L) + f(x_R))$. In the latter case, let $S(x)$ be the set of the $(n - 3)$ solutions not directly connected to $x$; then each solution $x_1$ in the set has a probability of $f(x_1) / \big(\sum_{x' \in S(x)} f(x')\big)$ of becoming the emigrating solution. In our algorithm, the value of $\eta$ linearly increases from a lower limit $\eta_{\min}$ to an upper limit $\eta_{\max}$ with iteration as

$$\eta = \eta_{\min} + \frac{t}{t_{\max}} (\eta_{\max} - \eta_{\min}) \tag{20}$$

where $t$ is the current iteration number and $t_{\max}$ is the maximum number of iterations. This way, a migration is more likely to be performed between non-neighbors in early stages to diversify the search and between neighbors in later stages to enhance exploitation [64].

Fig. 7. The ring topology.

V. A PFDDAE-BASED DEEP NEURAL NETWORK FOR INDUSTRIAL ACCIDENT EARLY WARNING

We develop a PFDDAE-based deep neural network for estimating the type and the level of accident risks in an industrial zone, and thus provide early warning to the managers for taking effective measures. It is known that an industrial accident stems from unsafe states of humans, machines and materials, and environments [65], [66]. A wide set of observable features derived from these factors together with the historical accident data (as summarized in Table I) are used as the input to the network. These features can be collected from sources including built-in sensing devices (e.g., temperature and smoke sensors), public monitoring devices (e.g., surveillance cameras and RFID devices), personal devices (e.g., mobile phones and intelligent protection devices), and other electronic records (e.g., statistical data from the factories and other relevant departments such as customs and revenue).

The input to the network should be configured according to the actual situation of the target industrial zone. For a zone with a small number of workers/machines, the features of all the workers/machines can be used as inputs; for a zone with a large number of workers/machines, the statistics (e.g., means and standard deviations) can be used, but critical features should still be used as separate inputs. Empirically, the input dimension of PFDDAE is about 5000∼20000 for a medium-size industrial zone in China, depending on the number and scale of plants as well as the level of intelligent surveillance.

Note that features such as wages and unemployment rate are used as influence features of "mental states", because they are closely correlated with the psychological stress of workers, even though they do not appear to lead directly to an accident. Similarly, features such as tax rate and interest rate are used as influence features of "social environment" because they are closely correlated with the management pressure of enterprises. In practice, an input vector often contains noise and missing values, which is the reason why we develop PFDDAE to deal with such imperfect information.

It should be noted that our network is designed as a part of an integrated monitoring and early warning system, which also implements a variety of other rules for accident early warning. For example, if a sensor detects that the air temperature of a dangerous goods warehouse exceeds the threshold, a warning message is directly produced. The PFDDAE-based network is mainly used for early warning in complex situations where the other rules cannot apply.

According to their data types, the input features can be divided into the following three classes that will be pre-processed by different methods:
• Static real-valued features, which are directly input to the PFDDAE.


Fig. 6. Illustration of a migration of the BBO PFDDAE learning algorithm.
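The emigrant selection rule and the linear schedule of Eq. (20) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: it assumes that the population is stored as a list arranged on a ring (so the left and right neighbors of position i are i−1 and i+1 modulo n), that fitness values are positive, and that n ≥ 4. The function names `eta_schedule`, `roulette` and `select_emigrant` are hypothetical.

```python
import random


def eta_schedule(t, t_max, eta_min, eta_max):
    """Eq. (20): eta grows linearly from eta_min (t = 0) to eta_max (t = t_max)."""
    return eta_min + (eta_max - eta_min) * t / t_max


def roulette(candidates, fitness, rng):
    """Pick one index from `candidates` with probability proportional to fitness."""
    total = sum(fitness[i] for i in candidates)
    r = rng.random() * total
    acc = 0.0
    for i in candidates:
        acc += fitness[i]
        if r < acc:
            return i
    return candidates[-1]  # guard against floating-point round-off


def select_emigrant(i, fitness, eta, rng=random):
    """Return the index of the emigrating solution for the solution at position i.

    Assumption (hypothetical layout): solutions sit on a ring, so each solution
    has exactly one left and one right neighbor, and n - 3 non-neighbors.
    """
    n = len(fitness)
    left, right = (i - 1) % n, (i + 1) % n
    if rng.random() < eta:
        # With probability eta, choose between the two neighbors.
        return roulette([left, right], fitness, rng)
    # Otherwise choose among the n - 3 solutions not directly connected to i.
    rest = [j for j in range(n) if j not in (i, left, right)]
    return roulette(rest, fitness, rng)
```

With a small η in early iterations most emigrants come from non-neighbors, and as η approaches its upper limit the selection concentrates on the two neighbors, which mirrors the exploration-to-exploitation shift described in the text.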

TABLE I
SUMMARY OF THE INPUT FEATURES FOR THE PFDDAE-BASED ACCIDENT EARLY WARNING NETWORK.

Category               Factor                    Typical influence features
Human                  Basic                     type of work, operating mode, age, gender, ...
                       Skill                     vocational grade, working years, academic degree, training cycle, ...
                       Physical states           body temperature, blood pressure, pulse rate, sweat, ...
                       Mental states             working hours, overtime, wages, unemployment rate, ...
Machine and Material   Basic                     supplier, purchasing date, quality level, proportion of dangerous goods, ...
                       Designed performance      rotation speed, load capacity, rated power, service life, ...
                       Operational performance   frequency of use, operating hours, overload, failure rate, ...
                       Physical states           surface temperature, curvature, fatigue, noise, ...
Environment            Basic                     plant density, number of workers, product type, power load, ...
                       Physical                  air temperature, air pressure, humidity, dust concentration, ...
                       Managerial                production volume, productivity, production cycle, service interval, ...
                       External                  visibility, rain fall, vehicle flow density, human flow density, ...
                       Social                    profit rate, tax rate, loans, interest rate, crime rate, ...
                       History                   accident frequency, alarm rate, casualty rate, rectification effects, ...

• Static labeled features, which are transformed into real values using membership functions of fuzzy sets defined on their domains.
• Dynamic (temporal) features, which are first transformed into labeled features using a maximizing-discriminability-based recurrent fuzzy network [67], and then transformed into real values.

Thus there are at most two preprocessing layers for PFDDAE; moreover, a stand-alone multi-class logistic regression layer is added on top of PFDDAE, as shown in Fig. 8. The top output layer takes inputs from the highest layer of PFDDAE and is trained by supervised learning, the aim of which is to identify one or a combination of the eleven general types of safety risks: Fire, Explosion, Hazardous Gas Leakage, Hazardous Liquid Leakage, Hazardous Solid Leakage, Radioactive Leakage, Collapse, Collision, Falling, Worker Injury, and Stampede. Some types can be further divided into subtypes in specific industrial applications. Typically, for each type we classify six risk levels:
• Very Low: The risk can be neglected; no early warning is needed.
• Low: The risk does exist but is far from resulting in an accident; a reminder may be sent to the responsible person.
• Medium: The risk deserves attention but is easy to control; caution should be exercised.
• High: The risk is noteworthy and may result in an accident; an early warning is needed to stimulate proper actions in order to prevent the accident.
• Very High: The risk is significant and is very likely to result in an accident; immediate action should be taken, otherwise the accident would be inevitable.
• Critical: The risk is already beyond the threshold and an accident seems to be inevitable; subsequent accident rescue operations should be prepared.

The exact form of an output vector of the network also depends on the characteristics (such as the number and scale of plants) of the industrial zone. A set of rules for producing suggested measures to the managers according to the output of the network has also been implemented in the integrated system, e.g., if the level of fire risk is above medium, it is suggested to shut down those overheated machines; if the level of fire risk is above high, it is also suggested to notify the firemen on duty.

VI. EXPERIMENTS

A. Data Sets and Comparative Methods

We use the proposed PFDDAE with hybrid HF and enhanced BBO learning (denoted by PFDDAE-EBO) for accident early warning in four industrial zones in Jiangsu Province and Zhejiang Province, East China. Table II summarizes the basic characteristics of the four network instances, where the governing parameters are all expressed by interval PFNs.

For each of the four network instances, we construct a dataset consisting of 600 tuples based on historical records. For comparison, we have also implemented the following models for identifying the accident risk on the test datasets:


TABLE II
SUMMARY OF THE PFDDAE-BASED DEEP NEURAL NETWORKS FOR THE FOUR INDUSTRIAL ZONES.

      Zone                                                     Network
ID    Area (km²)   Number of plants   Number of workers   Input dimension   Output dimension   Number of layers¹
#1    1.8          21                 20,000              3030              496                4
#2    3.2          30                 63,000              5353              976                4
#3    8.1          76                 106,000             11676             1992               5
#4    22.5         165                221,000             21188             2895               5
¹ The number does not include the preprocessing layers and the logistic output layer.

• The classical DDAE model with the greedy layer-wise training algorithm by Bengio et al. [42].
• The basic fuzzy DDAE model (whose parameters are represented by regular fuzzy numbers) with the Hessian-free learning algorithm, denoted by FDDAE.
• The PFDDAE model trained only by the HF algorithm [62], denoted by PFDDAE-HF.
• The PFDDAE model with the genetic algorithm (GA) assisted learning method [68], denoted by PFDDAE-GA.
• The PFDDAE model with the hybrid algorithm that employs the basic BBO procedure, i.e., the local topology and the maturity parameter are not used in the algorithm, denoted by PFDDAE-BBO.

The test uses 5-fold cross-validation. The maximum number of training iterations is 1500 on instances #1 and #2 and 3000 on #3 and #4 for all the models. The three evolutionary algorithms use the same population size of 30 (and thus their maximum number of iterations is 50 on #1 and #2 and 100 on #3 and #4). For both the basic BBO and our EBO algorithms, we set γI = γE = 1 and γM = 0.05. For GA we set the crossover rate to 0.8 and the mutation rate to 0.01, as suggested in [68]. For each training case, each evolutionary algorithm is run 30 times, and the results are averaged over the 30 runs. The experiments are conducted on a server with 4×Intel Xeon 3430 CPU and 8GB DDR3 memory.

Fig. 8. The structure of the PFDDAE-based deep neural network for industrial accident early warning.

B. Comparative Results in Terms of Error Rates

Table III presents the average error rates of the six different models on the datasets; in each row, the minimum (best) error rate among the six models is shown in boldface. As we can see, PFDDAE-EBO achieves the minimum error rate on each test case, which clearly demonstrates the performance advantage of the proposed model over the others.

TABLE III
AVERAGE ERROR RATES OF DIFFERENT MODELS ON THE FOUR DATASETS.

ID        DDAE      FDDAE     PFDDAE-HF   PFDDAE-GA   PFDDAE-BBO   PFDDAE-EBO
#1        15.83%    13.33%    12.17%      11.94%      11.67%       11.50%
#2        19.17%    13.50%    12.67%      11.67%      12.00%       11.11%
#3        26.67%    24.33%    23.17%      21.22%      19.67%       18.22%
#4        20.83%    19.67%    18.17%      17.50%      17.17%       16.50%
Overall   20.63%    17.71%    16.54%      15.58%      15.13%       14.33%

First let us look at the difference among the first three models that do not use evolutionary learning algorithms. From Table III we can see that the classical DDAE model always has the maximum error rate, i.e., both the FDDAE and the PFDDAE-HF models produce better results than DDAE, which demonstrates that using fuzzy parameters to govern the model can effectively improve the classification accuracy of DDAE. For the two fuzzy models without evolutionary learning, PFDDAE-HF achieves a lower error rate than FDDAE on each test case, which indicates that Pythagorean fuzzy numbers can provide better classification ability than regular fuzzy numbers.

A motivation to introduce fuzzy parameters to deep neural networks is to improve the robustness of the model when training data samples are corrupted by noise [29]. To validate this, we further divide the total 2400 tuples into two parts: one containing 780 tuples which have more than 10% noise, and the other containing the remaining 1620 tuples. In the 5-fold cross-validation, the numbers of tuples in the noisy part misclassified by DDAE, FDDAE and PFDDAE-HF are respectively 156, 112 and 91 (error rates of 20.00%, 14.36% and 11.67% on that part), while the numbers of tuples in the non-noisy part misclassified by the three models are respectively 339, 313 and 306 (20.93%, 19.32% and 18.89%). The contributions of the two parts to the overall error rates of the three models are also given by the histograms in Fig. 9. That is, for both the noisy part and the non-noisy part, the error rate of FDDAE is lower than that of DDAE, and that of PFDDAE-HF is lower than that of FDDAE; nevertheless, the differences between the models are much more significant in the noisy part than in the non-noisy part. This reveals that both the representation ability and robustness can be improved by replacing crisp model parameters with regular fuzzy parameters and can be further improved by replacing regular fuzzy parameters with Pythagorean fuzzy parameters, and the improvement of robustness is more remarkable.

Second, the last three PFDDAE models with evolutionary learning algorithms all achieve lower error rates than PFDDAE-HF, demonstrating that the evolutionary algorithms can explore the solution space more effectively than the single gradient-based method in training the deep model. This is because the gradient-based method that iteratively tries to
improve a single solution based on its local information is easily trapped in local optima, while the evolutionary algorithms can evolve a population of solutions to simultaneously explore multiple regions in the solution space with the help of information exchange among the solutions [69]. The performance difference can be much more pronounced in optimizing both the weights and structure of the network than in optimizing only the network weights [56], [70].

Fig. 9. The contributions of the noisy and non-noisy tuples to the overall error rates of the DDAE, FDDAE and PFDDAE-HF models.

Third, among the three PFDDAE models with evolutionary learning, the overall error rate of PFDDAE-BBO is slightly better than that of PFDDAE-GA (the latter performs better on #2 and the former performs better on the other three cases), while PFDDAE-EBO achieves a much lower error rate than PFDDAE-BBO and PFDDAE-GA. This shows that the proposed learning algorithm is more suitable for training PFDDAE than GA and the basic BBO, mainly because the local topology and the maturity control mechanism can avoid most solutions being strongly attracted by one or several high fitness solutions in early stages, and thus enhance the exploration ability and suppress premature convergence [71].

Fig. 10 (a)–(d) present the convergence curves (the changing of error rates with training iterations) of the six comparative methods on the four instances respectively. From the curves we can see that:
• In general, except the very early stages (about 0∼200 iterations), FDDAE converges faster than DDAE and PFDDAE-HF converges faster than FDDAE, but the descending rates of the first three models are similar.
• DDAE, FDDAE and PFDDAE-HF converge faster than the other three models in early stages (about 800∼1000 iterations on #1 and #2 and 1000∼1500 iterations on #3 and #4), but afterwards they cannot further decrease their error rates. However, the convergence curves of the three PFDDAE models with evolutionary learning continue to descend in later stages and fall below the curves of the first three models (at about the 1000th∼1300th iteration on #1 and #2 and the 1800th∼2300th iteration on #3 and #4). That is, the gradient-based learning algorithms converge faster in early stages but easily get trapped in local optima; by contrast, hybrid evolutionary learning algorithms explore the solution space much more widely, so they converge slower in early stages but are more likely to jump out of local optima and continue the search.
• Among the three PFDDAE models with evolutionary learning, PFDDAE-GA converges the fastest and PFDDAE-BBO converges the slowest, but both of them get trapped in local optima after about two thirds of the whole iterations. In comparison, PFDDAE-EBO not only has a relatively high convergence speed, but also has the longest convergence period before search stagnation. That is, PFDDAE-EBO balances the exploration and exploitation much better than the two other evolutionary algorithms. In particular, the introduction of the local topology and the maturity parameter is useful in improving the search ability of the BBO algorithm.

We have conducted nonparametric Wilcoxon rank sum tests on the classification error rates of PFDDAE-EBO and the other five methods on the four test cases, the p-value results of which are given in Table IV (a p-value less than 0.05 indicates that the results of PFDDAE-EBO and the comparative method are statistically different with 95% confidence). As we can see, on each instance, the classification accuracy of PFDDAE-EBO is significantly better than all the other methods.

Table V presents the average CPU time per training iteration consumed by the six models on the datasets. As we can see, the evolutionary learning algorithms inevitably consume more computational time than the gradient-based algorithms in training the models, but the improvement of early warning accuracy is worth paying the computational cost. There is no significant difference among the CPU times of the three evolutionary learning algorithms.

In summary, on all the test cases PFDDAE-EBO exhibits the best early warning accuracy among the six models. The overall accuracy improvements (in terms of error rate) achieved by PFDDAE-EBO over DDAE, FDDAE, PFDDAE-HF, PFDDAE-GA and PFDDAE-BBO are 30.51%, 19.06%, 13.35%, 8.02% and 5.23%, respectively. Such an improvement can remarkably increase the efficiency of industrial accident prevention and rescue operation preparation.


Fig. 10. The convergence curves (classification error rate versus training iteration) of comparative methods on the four instances: (a) Instance #1, (b) Instance #2, (c) Instance #3, (d) Instance #4.

TABLE IV
p-VALUES OF STATISTICAL TESTS ON THE CLASSIFICATION ACCURACY OF PFDDAE-EBO AND THE OTHER FIVE METHODS ON THE FOUR DATASETS.

      PFDDAE-EBO vs.
ID    DDAE        FDDAE       PFDDAE-HF   PFDDAE-GA   PFDDAE-BBO
#1    1.065E-12   1.065E-12   1.065E-12   1.270E-06   4.430E-02
#2    1.087E-12   1.087E-12   1.087E-12   2.919E-05   1.154E-08
#3    1.052E-12   1.052E-12   1.052E-12   2.378E-11   3.713E-08
#4    1.119E-12   1.119E-12   1.119E-12   1.503E-06   7.760E-04
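For readers unfamiliar with the test behind Table IV, the sketch below shows one common form of the two-sided Wilcoxon rank sum test, using the normal approximation with average ranks for ties and no continuity or tie correction. It is an illustration only: the paper does not state which software or variant was used, so this code is not expected to reproduce the p-values in Table IV, and the function name `rank_sum_test` and the sample data in the usage note are hypothetical.

```python
import math


def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank sum test (normal approximation).

    Returns (z, p). Tied values receive the average of their ranks;
    no tie correction or continuity correction is applied.
    """
    # Pool both samples, remembering which sample each value came from.
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    n1, n2 = len(a), len(b)
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the ranks i+1 .. j
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_a - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return z, p
```

A typical use would pass the 30 per-run error rates of PFDDAE-EBO as `a` and those of a comparative method as `b`; a returned p below 0.05 corresponds to the significance criterion used in the text.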

TABLE V
AVERAGE CPU TIME (IN SECONDS) PER TRAINING ITERATION OF DIFFERENT MODELS ON THE FOUR DATASETS.

ID    DDAE     FDDAE    PFDDAE-HF   PFDDAE-GA   PFDDAE-BBO   PFDDAE-EBO
#1    0.219    0.385    0.466       2.65        3.08         2.57
#2    0.313    0.529    0.730       3.55        3.98         3.37
#3    0.367    0.578    0.800       6.29        7.60         6.36
#4    0.478    0.730    0.966       9.71        8.50         9.97


C. Comparative Results in Terms of Missing-Warning Rates and False-Warning Rates

Recall that we classify six risk levels: Very Low, Low, Medium, High, Very High, and Critical. Among all the misclassification cases, we are mostly interested in two special cases of warning errors:
• High or Very High risk tuples that are classified as Very Low or Low, which we call Missing-Warning cases. In such cases, early warning is required but is very likely not to be issued due to misclassification, and thus a risk often blossoms into an accident and causes serious losses.
• Very Low or Low risk tuples that are classified as High or Very High, which we call False-Warning cases. In such cases, a risk is overrated and unnecessary actions are very likely to be taken, which may seriously affect regular production and may also cause substantial economic losses.

Generally, other misclassification cases (such as classifying a Very Low risk tuple as Low) are not as serious as the above two. Misclassification of Critical risk tuples is not specifically analyzed here because in such cases accidents are often inevitable no matter whether early warnings are made.

Table VI and Table VII present the six models’ error rates caused by missing-warning and false-warning, respectively. From the results, we can also see that the two basic fuzzy models outperform the crisp DDAE model, the three PFDDAE methods with evolutionary learning outperform the gradient-based PFDDAE-HF model, and PFDDAE-EBO performs best among the six models, in terms of either the missing-warning rate or the false-warning rate.

Moreover, PFDDAE-EBO’s performance advantage over the other models on either the missing-warning rate or the false-warning rate is more significant than that on the error rate caused by all misclassification cases: its improvements over DDAE, FDDAE, PFDDAE-HF, PFDDAE-GA and PFDDAE-BBO in terms of missing-warning rate reduction are 38.63%, 29.39%, 26.22%, 18.59%, and 11.26%, respectively, and those in terms of false-warning rate reduction are 38.56%, 30.24%, 27.21%, 16.19%, and 12.80%, respectively. Fig. 11 compares PFDDAE-EBO’s performance improvement over the other five models in missing-warning, false-warning, and all misclassification cases. This indicates that the proposed model is more reliable in identifying key risks for industrial accident early warning.

Fig. 11. The comparison of PFDDAE-EBO’s classification accuracy improvement over the other five models in terms of error rates caused by missing-warning, false-warning, and all misclassification cases.

D. Test on A Real-World Accident

On August 2, 2014, a dust explosion occurred in a large industrial plant for polishing various aluminium alloy parts in Kunshan, Jiangsu Province, East China. 75 people lost their lives immediately and another 185 were injured. Subsequently, 71 of the seriously injured also died, which increased the total loss of lives to 146. The direct economic loss was 351 million yuan. This is probably one of the most serious dust explosion catastrophes known in the world [72].

We test the proposed PFDDAE-EBO model together with the other five comparative models in early warning of the accident. The test dataset consists of 10 data tuples, which are obtained at about 480, 240, 180, 120, 90, 60, 45, 30, 15, and 10 minutes before the accident, respectively (the data within 10 minutes before the accident could directly activate some other rules to produce critical warning in the system). Due to the deficiency and the lag of information, there are about 10%∼15% features missing, which imposes high requirements on the robustness of the models.

Table VIII presents the potential risks (above the Low level) identified by the early warning models at different times before the accident. The progress of early warning can be described as follows:
• At the time 8 hours before the accident, there are no symptoms of the explosion accident. The basic DDAE model produces a warning of fire risk at high level, which is acknowledged as a false-warning. Only our PFDDAE-EBO model produces a warning of hazardous gas leakage at medium level. According to a subsequent investigation, there was indeed a device failure that could cause gas leakage during 8 hours to 4 hours before the explosion, and the danger was removed just after the period (this also revealed the bad management of the enterprise).
• At the time 4 hours before the accident, except DDAE, all the five fuzzy deep models identify the risk of gas leakage, and the four Pythagorean-type fuzzy models also identify a stampede risk at medium level. In fact, around that time hundreds of workers entered the plant and began to work, and the crowd density was beyond the legal limit.
• At the time 2 hours before the accident, all the six models identify the stampede risk, where DDAE and FDDAE consider it as medium level and the four Pythagorean-type fuzzy models consider it as high level. Moreover, DDAE produces a warning of fire risk at medium level and FDDAE produces a warning of collapse risk at high level, both of which are acknowledged as false-warnings.
• At the time one and a half hours before the accident, all the six models identify the stampede risk, while FDDAE still produces a false-warning of collapse risk.


TABLE VI
ERROR RATES CAUSED BY MISSING-WARNING OF THE DIFFERENT MODELS ON THE FOUR DATASETS.

ID        DDAE      FDDAE     PFDDAE-HF   PFDDAE-GA   PFDDAE-BBO   PFDDAE-EBO
#1        3.50%     2.83%     2.67%       2.56%       2.28%        2.28%
#2        3.83%     3.17%     3.50%       2.94%       2.89%        2.50%
#3        6.00%     5.33%     4.83%       4.67%       4.17%        3.67%
#4        4.50%     4.17%     3.83%       3.28%       3.00%        2.50%
Overall   4.46%     3.88%     3.71%       3.36%       3.08%        2.74%

TABLE VII
ERROR RATES CAUSED BY FALSE-WARNING OF THE DIFFERENT MODELS ON THE FOUR DATASETS.

ID        DDAE      FDDAE     PFDDAE-HF   PFDDAE-GA   PFDDAE-BBO   PFDDAE-EBO
#1        4.00%     3.33%     2.70%       2.58%       2.63%        2.55%
#2        4.17%     3.17%     3.67%       2.60%       2.50%        2.06%
#3        5.33%     5.50%     4.67%       4.83%       4.39%        3.50%
#4        4.67%     4.00%     4.33%       3.30%       3.28%        3.06%
Overall   4.54%     4.00%     3.83%       3.33%       3.20%        2.79%
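The missing-warning and false-warning definitions behind Tables VI and VII can be turned into a small counting routine. The sketch below is illustrative only: it assumes that both rates are expressed as fractions of all tuples in the dataset (the paper does not spell out the denominator), and the function name `warning_error_rates` is hypothetical.

```python
LEVELS = ("Very Low", "Low", "Medium", "High", "Very High", "Critical")
LOW_LEVELS = {"Very Low", "Low"}
HIGH_LEVELS = {"High", "Very High"}


def warning_error_rates(true_levels, predicted_levels):
    """Return (missing_warning_rate, false_warning_rate).

    Missing-warning: a High / Very High tuple classified as Very Low / Low.
    False-warning:   a Very Low / Low tuple classified as High / Very High.
    Assumption: both rates are relative to the total number of tuples.
    """
    if len(true_levels) != len(predicted_levels) or not true_levels:
        raise ValueError("inputs must be non-empty and of equal length")
    for level in list(true_levels) + list(predicted_levels):
        if level not in LEVELS:
            raise ValueError(f"unknown risk level: {level!r}")
    pairs = list(zip(true_levels, predicted_levels))
    missing = sum(1 for t, p in pairs if t in HIGH_LEVELS and p in LOW_LEVELS)
    false = sum(1 for t, p in pairs if t in LOW_LEVELS and p in HIGH_LEVELS)
    n = len(pairs)
    return missing / n, false / n
```

Note that, following the text, misclassifications involving Medium or Critical tuples count toward the overall error rate but toward neither of these two rates.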

• At the time one hour before the accident, all the six models identify the stampede risk at high level, and our PFDDAE-EBO model uniquely identifies the explosion risk at high level.
• At the time 45 minutes before the accident, both PFDDAE-GA and PFDDAE-EBO identify the explosion risk at high level, while both PFDDAE-HF and PFDDAE-BBO identify a fire risk at high level. In fact, the explosion risk and the fire risk shared many symptoms in this accident, and here we do not consider the fire risk warning as an invalid false-warning.
• The early warning results of the models at the time 30 minutes before the accident are similar to those at the time 45 minutes before the accident, except that PFDDAE-BBO correctly identifies the explosion risk.
• At the time 15 minutes before the accident, all the four Pythagorean-type fuzzy models produce early warning of the explosion risk, and PFDDAE-BBO and PFDDAE-EBO consider that the risk is very high. However, DDAE and FDDAE still fail to identify the explosion risk.
• At the time 10 minutes before the accident, PFDDAE-HF warns of the explosion risk at very high level, and the three PFDDAE models with evolutionary learning warn at critical level. At this time, FDDAE eventually identifies the explosion risk at high level, while DDAE identifies it as a fire risk.

As we can see from the warning results, our PFDDAE-EBO model could produce early warning of the gas leakage risk at 8 hours before the accident and the explosion risk at one hour before the accident, both of which are the earliest among the six models. The other three Pythagorean-type fuzzy models could produce early warning of the explosion risk or the similar fire risk at 45 minutes before the accident, and thus would also provide enough time for preventing the accident. The crisp DDAE and the regular fuzzy model could identify the explosion risk only 10 minutes before the accident, which might be too late to prevent the disaster. Moreover, DDAE and FDDAE also produce some false-warnings during the test, while the four PFDDAE models never do so. Nevertheless, all the six models could identify the stampede risk at about two or three hours before the accident and thus might provide opportunity to prevent the explosion or to reduce the casualties. In summary, all the deep learning models could produce early warning that would be helpful in mitigating the damage, and the four Pythagorean-type fuzzy models would be very likely to prevent the accident, if they were adopted in advance and their results were treated seriously. In this test, PFDDAE-EBO also demonstrates that it can produce accident early warning in the most accurate and timely way among the six models.

VII. CONCLUSION AND FURTHER STUDY

The paper presents a new deep denoising autoencoder model equipped with Pythagorean fuzzy parameters, and proposes a hybrid algorithm combining Hessian-free optimization and BBO for efficiently training the model. Based on the PFDDAE model and the evolutionary learning algorithm, we construct a fuzzy deep neural network for accident early warning that is of crucial importance in industrial operations. Experimental results show that the PFDDAE with evolutionary learning can effectively improve the model’s representation ability and robustness in comparison with state-of-the-art methods and achieve high classification accuracy on a set of real-world data tuples. To our knowledge the present work is the first attempt to apply Pythagorean fuzzy sets in deep neural network models, and the result shows that it does improve the ability of DDAE. Our ongoing work includes testing its effectiveness on other deep models such as restricted Boltzmann machines and deep belief networks.

For training the proposed model we employ an enhanced BBO in the hybrid evolutionary learning algorithm and show that it outperforms GA and the basic BBO. Nevertheless, there are many other competitive evolutionary metaheuristics (such as differential evolution [73] and water wave optimization [74]), and thus another future work is to develop hyper-heuristic learning algorithms by choosing and combining those metaheuristics [75].

ACKNOWLEDGMENT

This work was supported by grants from National Natural Science Foundation of China (Grant No. 61325019, 61472167,


TABLE VIII
THE EARLY WARNING RESULTS OF THE DEEP LEARNING MODELS PRODUCED AT DIFFERENT TIMES (IN MINUTES) BEFORE THE 2014 KUNSHAN DUST EXPLOSION ACCIDENT.

Time  DDAE                    FDDAE                   PFDDAE-HF               PFDDAE-GA               PFDDAE-BBO              PFDDAE-EBO
480   Fire                    --                      --                      --                      --                      Hazardous gas leakage
      (High)                                                                                                                  (Medium)
240   --                      Hazardous gas leakage   Hazardous gas leakage   Hazardous gas leakage   Hazardous gas leakage   Hazardous gas leakage
                              (High)                  (Medium)                (Medium)                (High)                  (High)
                                                      Stampede                Stampede                Stampede                Stampede
                                                      (Medium)                (Medium)                (Medium)                (Medium)
180   --                      --                      Stampede                Stampede                Stampede                Stampede
                                                      (Medium)                (High)                  (High)                  (High)
120   Stampede                Stampede                Stampede                Stampede                Stampede                Stampede
      (Medium)                (Medium)                (High)                  (High)                  (High)                  (High)
      Fire                    Collapse
      (Medium)                (High)
90    Stampede                Stampede                Stampede                Stampede                Stampede                Stampede
      (High)                  (Medium)                (Medium)                (High)                  (High)                  (High)
                              Collapse
                              (Medium)
60    Stampede                Stampede                Stampede                Stampede                Stampede                Stampede
      (High)                  (High)                  (High)                  (High)                  (High)                  (High)
                                                                                                                              Explosion
                                                                                                                              (High)
45    Stampede                Stampede                Stampede                Stampede                Stampede                Stampede
      (High)                  (Medium)                (High)                  (High)                  (High)                  (High)
                                                      Fire                    Explosion               Fire                    Explosion
                                                      (High)                  (High)                  (High)                  (High)
30    Stampede                Stampede                Stampede                Stampede                Stampede                Stampede
      (Very High)             (High)                  (High)                  (High)                  (High)                  (High)
                                                      Fire                    Explosion               Explosion               Explosion
                                                      (Very High)             (High)                  (High)                  (High)
15    Stampede                Stampede                Stampede                Stampede                Stampede                Stampede
      (Very High)             (High)                  (High)                  (High)                  (High)                  (High)
                                                      Explosion               Explosion               Explosion               Explosion
                                                      (High)                  (High)                  (Very High)             (Very High)
10    Stampede                Stampede                Stampede                Stampede                Stampede                Stampede
      (High)                  (High)                  (High)                  (High)                  (High)                  (High)
      Fire                    Explosion               Explosion               Explosion               Explosion               Explosion
      (Very High)             (High)                  (Very High)             (Critical)              (Critical)              (Critical)

61473263, and U1509207).

Yu-Jun Zheng is an associate professor at Zhejiang University of
Technology. He received the Ph.D. degree from the Institute of Software,
Chinese Academy of Sciences, in 2010. He is an IEEE member and an ACM
member, and his research interests include bio-inspired computing and
operations research. He has authored over 50 papers in journals such as
IEEE Trans. Fuzzy Syst., IEEE Trans. Neural Netw. Learn. Syst., and
IEEE Trans. Evol. Comput. In 2014 he was the runner-up for the IFORS
Prize for Development.

Sheng-Yong Chen is a professor and Ph.D. advisor at Zhejiang University
of Technology. He received the Ph.D. degree from City University of
Hong Kong in 2003. Dr. Chen is an IET Fellow and an IEEE senior member.
His research interests include evolutionary computation and intelligent
systems. He has authored over 100 scientific papers in international
journals and conferences. In 2013 he received the National Outstanding
Youth Fund of China.

Yu Xue received the Ph.D. degree from Nanjing University of Aeronautics
& Astronautics, China, in 2013. Since 2014 he has worked as an associate
professor in the School of Computer and Software, Nanjing University of
Information Science and Technology. Since 2016 he has been a visiting
scholar at Victoria University of Wellington, New Zealand. His research
interests include computational intelligence and data mining.

Jin-Yun Xue is a professor at Jiangxi Normal University and a Ph.D.
supervisor at Wuhan University. He received his B.S. degree in
Mathematics from Nanjing University in 1970. From 1985 to 1988, he
worked as a visiting scholar at Cornell University. After June 1995, he
spent 10 months as a visiting scholar at Santa Clara University. His
research interests include high-performance and dependable computing.
