JOURNAL OF DIFFERENCE EQUATIONS AND APPLICATIONS, 2016
http://dx.doi.org/10.1080/10236198.2016.1214725

A novel adaptive differential evolution SVM model for predicting coal and gas outbursts

Zhigang Yan^a,b, Kan Yao^a,b and Yuanxuan Yang^a,b

^a School of Environmental Science and Spatial Informatics, China University of Mining and Technology, Xuzhou, China; ^b Jiangsu Key Laboratory of Resources & Environmental Information Engineering, China University of Mining and Technology, Xuzhou, China

ABSTRACT

Parameter selection is a key factor affecting the performance of support vector machines (SVMs). To further improve the classification accuracy and generalization ability of SVMs, a parameter selection model for SVMs with RBF kernel is proposed based on the adaptive differential evolution (ADE) algorithm and applied to predict coal and gas outbursts. The function of each parameter of the differential evolution (DE) algorithm and its adjustment scheme are analyzed, and the algorithm is improved by using the decision error rate of samples as the objective function. Adaptive calculation equations for the mutation factor and crossover factor are designed. The mutation and crossover factors are automatically adjusted during the execution of the algorithm, so that population diversity is maintained in the early stages to enhance the ability to search for global optimal values, while the stability of the algorithm is guaranteed in the late stages by promoting the ability to search for local optimal values. A novel ADESVM model for predicting coal and gas outbursts is established by using the ADE algorithm to select the SVM parameters. Experimental results show that the designed ADE algorithm has high convergence speed and high computational accuracy. The proposed ADESVM model has higher training speed and is more robust than other similar SVM models. It also has higher prediction accuracy and shorter training time than back propagation neural networks, providing a new method for the intelligent prediction of coal and gas outbursts.

ARTICLE HISTORY
Received 6 June 2016; Accepted 5 July 2016

KEYWORDS
Support vector machine; coal and gas outbursts; differential evolution; mutation factor; crossover factor

AMS SUBJECT CLASSIFICATION
93C95

1. Introduction
For a long time, coal and gas outbursts have been a major disaster threatening coal mining safety. Timely and accurate prediction of coal and gas outbursts is key to increasing the economic benefits of coal mines and guaranteeing coal mine safety. Establishing fast and effective prediction models for coal and gas outbursts and evaluating outburst risk are important parts of preventing and controlling coal and gas outbursts, with both theoretical significance and practical value [17]. Commonly used prediction methods for coal and gas outbursts include index prediction methods,

gas geology unit methods, geophysical methods, etc. [19]. Coal and gas outbursts are complex non-linear dynamic systems. Therefore, using non-linear artificial intelligence technology to recognize the patterns of outbursts and effectively predict them has become a research hotspot. Numerous researchers have carried out studies on predicting coal and gas outbursts using artificial intelligence methods and achieved good results. For example, Wen and Zhang et al. proposed pattern recognition models [13,20]. Dong et al. proposed G-K evaluation and a rough set model [5]. Wang et al. suggested a distance discriminant analysis method [12]. Guo established a fuzzy synthetic mathematical evaluation and clustering method [8]. You et al. developed a neural network based prediction method [18]. Yang et al. proposed an IDEPB NN model [15]. Recently, support vector machines (SVMs) have been applied to coal and gas outburst prediction [11,14]. SVM is a learning method for small sample sets based on statistical learning theory, with relatively strong non-linear modelling ability. It is well suited to coal and gas outburst prediction and has achieved very good application performance. However, large numbers of studies
have shown that parameters are the major factor affecting SVM performance. There are at present no unified standards or theories for SVM parameter selection [4]. Optimal parameters are usually obtained empirically or by cross-validation over large numbers of experiments, such as the grid search method [1], which is time consuming and does not guarantee optimal parameters. In recent years, many researchers have proposed other parameter optimization methods. For example, the gradient descent method was used for parameter optimization [6]. Although this method reduces parameter searching time, it is highly dependent on initial values and, as a linear search method, is easily trapped in local optima. The particle swarm optimization (PSO) [3] and genetic algorithm (GA) [7] methods were proposed for SVM parameter optimization. Although these intelligent methods reduce the dependence on initial values, their theory and implementation are relatively complicated. Moreover, different optimization problems require different crossover, mutation and selection methods, and these methods are easily trapped in local optima.
Differential evolution (DE) is a heuristic, parallel, random-search optimization method using floating-point vector coding, proposed by Storn and Price [9] in 1995. It extracts differential information from the current population to guide the next step of the search. Its principle is relatively simple and it has only a few control parameters, with relatively strong global searching ability, robustness and high optimization speed. We previously analyzed the 'reasonable region' of parameter optimization for SVMs with RBF kernel, and related new parameter optimization methods were studied [16]. On this basis, an adaptive differential evolution (ADE) method for SVM parameter optimization is designed in this study. Taking the global searching ability of DE into account, and using the minimum sample decision error rate as the optimization criterion to construct the objective function, the DE algorithm is improved for SVM parameter selection by adaptively designing the mutation and crossover factors, so that the classification accuracy and generalization ability of the SVM are improved. The improved algorithm is applied to coal and gas outburst prediction. Comparative experiments show that the proposed method has better performance and higher accuracy than prevalent prediction methods, with relatively good application effects.

2. SVM algorithm
The idea of SVM [10] is based on Mercer theory. The input space is transformed into a feature space of higher dimension by an appropriate non-linear transformation. In the feature space, the optimal classification hyperplane is solved for, so that the hyperplane correctly classifies as many data points of the two classes as possible, and the distances between the classified data points of the two classes and the hyperplane are as large as possible.

Given k samples (x_1, y_1), (x_2, y_2), ..., (x_k, y_k), x ∈ R^n, y ∈ {−1, 1}, a hyperplane (decision hyperplane) Wx + b = 0, W ∈ R^n, b ∈ R, is to be found such that it separates the samples. The corresponding recognition function is:

$$f(x) = \mathrm{sign}(Wx + b) \qquad (1)$$



The decision hyperplane should satisfy:

$$y_i [W x_i + b] \ge 1 - \xi_i, \quad i = 1, 2, \ldots, k \qquad (2)$$

The optimal decision hyperplane satisfies the condition that the minimum distance between the samples of the two classes and the decision hyperplane is maximized. The classification problem is then converted into a minimization problem subject to Equation (2) and the constraint ξ_i ≥ 0, i.e.:

$$\min \; \tau(W) = \frac{1}{2}\|W\|^2 + C \sum_{i=1}^{k} \xi_i \qquad (3)$$

where
$\|W\|^2$ is called the structural risk, representing the complexity of the model and making the function smoother to improve generalization ability;
$\sum_{i=1}^{k} \xi_i$ is called the empirical risk, representing the error of the model;
C is the penalty parameter balancing the above two terms.
Equation (3) is a constrained optimization problem, which can be solved by the Lagrangian optimization method. The corresponding classification function can be converted into:

$$f(x) = \mathrm{sign}\left(\sum_{i=1}^{k} \alpha_i y_i (x_i \cdot x) + b\right) \qquad (4)$$

When samples are not linearly separable, the raw samples can be mapped into a high-dimensional feature space by a non-linear function φ(x), and the classification is carried out in the high-dimensional feature space. The corresponding classification function becomes:

$$f(x) = \mathrm{sign}\left(\sum_{i=1}^{k} \alpha_i y_i \,(\phi(x) \cdot \phi(x_i)) + b\right) \qquad (5)$$

The inner product operation in the high-dimensional feature space can be defined as a kernel function K(x, y) = φ(x) · φ(y), so that the kernel function can be applied to the variables in the low-dimensional space instead of using the function φ directly. Therefore, Equation (5) can be converted into Equation (6) for solution:

$$f(x) = \mathrm{sign}\left(\sum_{i=1}^{k} \alpha_i y_i K(x_i, x) + b\right) \qquad (6)$$

Commonly used kernel functions include linear kernels, polynomial kernels, Sigmoid
kernels, and RBF kernels.
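
To make Equation (6) concrete, the following is a minimal sketch of training and evaluating an RBF-kernel SVM. The use of scikit-learn and the toy data are illustrative assumptions, not the paper's MATLAB/libsvm setup.

```python
import numpy as np
from sklearn.svm import SVC

# Toy two-class data (placeholder values, not the paper's samples).
X = np.array([[0.2, 1.1], [0.5, 0.9], [0.3, 1.4],
              [1.8, 0.3], [2.1, 0.4], [2.0, 0.6]])
y = np.array([1, 1, 1, -1, -1, -1])

# RBF kernel K(x, y) = exp(-gamma * ||x - y||^2); C is the penalty of Eq. (3).
clf = SVC(kernel="rbf", C=1.0, gamma=0.5).fit(X, y)

# decision_function() is the sum inside Eq. (6); predict() applies the sign, i.e. f(x).
print(clf.decision_function([[1.0, 0.8]]), clf.predict([[1.0, 0.8]]))
```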

3. ADE model for SVM parameter optimization


3.1. DE algorithm
The DE algorithm is a heuristic algorithm based on population search. It solves optimization problems through the cooperation and competition of individuals in the population. The basic idea of DE is to extract search step and direction information from the current population. Random differences and crossover operations between individuals improve the diversity of the population. Using these mutation and crossover operations, a temporary population is generated. Selection operations based on a greedy criterion are then used to choose between the two populations individual by individual, generating a new population. In this way, the population evolves continuously until the stopping criterion of the algorithm is satisfied.

3.1.1. Mutation operation


Four different individuals are randomly selected from the population to generate differential vectors. The mutation operation is conducted around the optimal individual of each generation, so that the convergence speed of the algorithm is increased while the diversity of the population is guaranteed to some extent. The mutation operation is:

$$v_i^{g+1} = x_{best}^{g} + F\left(x_{s_1}^{g} - x_{s_2}^{g} + x_{s_3}^{g} - x_{s_4}^{g}\right) \qquad (7)$$

where
$v_i^{g+1}$ is the mutated individual of generation g + 1, generated by applying the mutation operation of Equation (7) to the individual $x_i^{g}$ of generation g;
$x_{best}^{g}$ is the optimal individual of the current generation g;
$s_1, s_2, s_3, s_4 \in \{1, 2, \ldots, N\}$ are distinct random indices not equal to i;
F is the mutation factor, which enhances or reduces the differential quantities.

3.1.2. Crossover operation


To improve the diversity of the population, the crossover operation is:

$$y_{i,j}^{g+1} = \begin{cases} v_{i,j}^{g+1}, & \mathrm{rand}(j) \le CR \\ x_{i,j}^{g}, & \mathrm{rand}(j) > CR \end{cases} \qquad (8)$$

where
$y_i^{g+1}$ is the trial individual generated by applying the crossover operation of Equation (8) to $x_i^{g}$ and the mutated individual generated by Equation (7);
rand(j) is a random number evenly distributed in [0, 1];
CR ∈ [0, 1] is the crossover factor. The larger CR is, the more $v_i^{g+1}$ contributes to $y_i^{g+1}$. When CR = 1, $y_i^{g+1} = v_i^{g+1}$, which is beneficial to local searching and fast convergence. The smaller CR is, the more $x_i^{g}$ contributes to $y_i^{g+1}$. When CR = 0, $y_i^{g+1} = x_i^{g}$, which is beneficial to the diversity of the population and to global searching ability.
3.1.3. Selection operation

$$x_i^{g+1} = \begin{cases} y_i^{g+1}, & f(y_i^{g+1}) < f(x_i^{g}) \\ x_i^{g}, & f(y_i^{g+1}) \ge f(x_i^{g}) \end{cases} \qquad (9)$$

where f is the objective function. A greedy search strategy is used: the trial individual $y_i^{g+1}$ generated by the mutation and crossover operations competes with $x_i^{g}$, and is selected as the offspring only if its fitness is better than that of $x_i^{g}$; otherwise, $x_i^{g}$ is selected as the offspring.

Through the above operations, a new population is generated. The stopping criterion is then checked; if it is not satisfied, the mutation, crossover and selection operations are executed iteratively until the stopping criterion is satisfied and the optimal solution is obtained.
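
As a summary of Section 3.1, here is a minimal sketch of one DE generation implementing Equations (7)-(9). The vectorized population layout and the toy sphere objective are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def de_generation(pop, fit, objective, F=0.5, CR=0.9, rng=None):
    """One DE generation: mutation (Eq. 7), crossover (Eq. 8), selection (Eq. 9)."""
    rng = np.random.default_rng() if rng is None else rng
    N, D = pop.shape
    best = pop[np.argmin(fit)]                       # x_best of the current generation
    new_pop, new_fit = pop.copy(), fit.copy()
    for i in range(N):
        # Four distinct random indices s1..s4, all different from i (Eq. 7).
        s1, s2, s3, s4 = rng.choice([j for j in range(N) if j != i],
                                    size=4, replace=False)
        v = best + F * (pop[s1] - pop[s2] + pop[s3] - pop[s4])   # mutant v_i
        # Binomial crossover between the mutant and x_i (Eq. 8).
        y = np.where(rng.random(D) <= CR, v, pop[i])             # trial y_i
        # Greedy selection (Eq. 9): the trial survives only if strictly better.
        fy = objective(y)
        if fy < fit[i]:
            new_pop[i], new_fit[i] = y, fy
    return new_pop, new_fit

# Toy usage on the sphere function (illustrative only).
rng = np.random.default_rng(0)
pop = rng.uniform(-5.0, 5.0, size=(20, 2))
sphere = lambda x: float(np.sum(x ** 2))
fit = np.array([sphere(x) for x in pop])
for _ in range(50):
    pop, fit = de_generation(pop, fit, sphere, rng=rng)
print(pop[np.argmin(fit)], fit.min())
```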

3.2. ADE algorithm for SVM parameter optimization


Kernel functions and their parameters are key factors affecting the performance of SVMs. The RBF kernel is a commonly used classification kernel, with the general expression K(x, y) = exp(−‖x − y‖²/(2σ²)); for convenience, the equivalent form K(x, y) = exp(−γ‖x − y‖²), γ > 0, is used here. The main factors affecting SVMs with RBF kernels are the error penalty parameter C and the kernel parameter γ. The error penalty parameter C balances the ratio of misclassified samples against the algorithm complexity; that is, it adjusts the ratio between the confidence range and the empirical risk in a determined feature subspace, so that the learning machine has optimal generalization ability. The kernel parameter γ implicitly changes the mapping function, and thereby changes the distribution complexity of the samples in the feature subspace. Optimizing the SVM parameters therefore means finding an optimal parameter combination (C, γ) such that the SVM has optimal classification performance, improving the learning and generalization ability of the SVM. The ADE algorithm for SVM parameter selection is introduced next. Research shows that the performance of DE depends directly on the differential operation mode and the control parameters [9]. Inappropriate parameter settings may cause stagnation or premature convergence of the optimization model. The standard DE algorithm combines individuals of the parental generation with completely parallel random probability, which uses only shallow information and lacks deeper rational analysis. To overcome this shortcoming, we establish the ADE algorithm by improving the objective function f, the mutation factor F and the crossover factor CR for SVM parameter selection.
3.2.1. Objective function
To find the optimal parameter combination (C, γ), an appropriate objective function (i.e. an individual fitness function) should be defined:

$$\min f = k\left(\lambda_1 \frac{m_1}{n_1} + \lambda_2 \frac{m_2}{n_2}\right) \qquad (10)$$

where
$m_1$ and $m_2$ are the numbers of misclassified samples of the two classes, respectively;
$n_1$ and $n_2$ are the numbers of samples in the two classes, respectively;
$m_1/n_1$ and $m_2/n_2$ are therefore the misclassification rates of the samples in the two classes, respectively;
k is a scaling factor controlling how significantly the objective function value changes;
$\lambda_1, \lambda_2 \in [0, 1]$ are the controlling factors of the error rate for the two classes, respectively. They adjust the weight given to the misclassification of the two classes.
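
A sketch of the objective of Equation (10); the defaults k = 3 and λ1 = λ2 = 1 mirror the experimental settings in Section 4.3, while evaluating the decision error rates by stratified 5-fold cross-validation with scikit-learn is an assumption made for illustration.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

def fitness(C, gamma, X, y, k=3.0, lam1=1.0, lam2=1.0):
    """Eq. (10): scaled, weighted sum of the two classes' decision error rates.
    Assumes binary labels y in {+1, -1}."""
    m1 = m2 = n1 = n2 = 0
    for tr, te in StratifiedKFold(n_splits=5).split(X, y):
        pred = SVC(kernel="rbf", C=C, gamma=gamma).fit(X[tr], y[tr]).predict(X[te])
        # Accumulate per-class sample counts n1, n2 and error counts m1, m2.
        n1 += np.sum(y[te] == 1)
        m1 += np.sum((y[te] == 1) & (pred != y[te]))
        n2 += np.sum(y[te] == -1)
        m2 += np.sum((y[te] == -1) & (pred != y[te]))
    return k * (lam1 * m1 / n1 + lam2 * m2 / n2)
```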

3.2.2. Mutation factor


Generally, a larger F is more beneficial to convergence to optimal solutions. When F > 1, however, the convergence speed is slow and premature phenomena easily happen. The smaller F is, the higher the convergence speed, but too small an F leads to non-optimal solutions. Therefore, to avoid prematurity, F should be relatively large at early stages, which maintains the diversity of individuals and facilitates the search for global optimal solutions; F should be small at late stages, which favours the stable search for local optima. The calculation equation for the designed adaptive mutation factor F is:

$$F = \min\left(F_{\max},\ \left|\frac{f_{\max}}{f_{\min}}\right| - 1\right) \qquad (11)$$
where
$F_{\max}$ is the predefined maximum mutation factor;
$f_{\min}$ and $f_{\max}$ are the minimum and maximum objective function values of the current generation, respectively.
It is clear that F ∈ [0, F_max]. At the beginning of training, the differences between the objective function values of individuals are large, so $f_{\max}/f_{\min}$ is large, which results in a large F value. At late stages, $f_{\max}/f_{\min}$ approaches 1, which results in a small F value. Thus, as the iterations proceed, F shows a descending trend.
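
Equation (11) transcribes directly into a small helper; the eps guard against a zero denominator and the clamp at 0 are added implementation assumptions reflecting the stated range F ∈ [0, F_max].

```python
def adaptive_F(f_values, F_max=1.0, eps=1e-12):
    """Adaptive mutation factor of Eq. (11) for the current generation."""
    f_min, f_max = min(f_values), max(f_values)
    # Early on the spread f_max/f_min is large, so F is near F_max;
    # late in the run the ratio approaches 1, so F approaches 0.
    ratio = abs(f_max / (f_min + eps))   # eps guards a zero denominator (assumption)
    return max(0.0, min(F_max, ratio - 1.0))
```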

3.2.3. Crossover factor


A good search strategy should maintain the diversity of the population at the early stages of the search, which favours global searching, while improving the local searching ability at late stages, which promotes algorithm accuracy. Based on this idea, a crossover probability factor varying with the iteration number is designed in this study; that is, CR increases from small to large as the iterations go on. In this way, at early stages of evolution, $x_i^{g}$ contributes more to $y_i^{g+1}$, improving global searching ability; at late stages, $v_i^{g+1}$ contributes more to $y_i^{g+1}$, improving local searching ability. To satisfy this requirement, the designed CR is:

$$CR = CR_{\min} + \frac{g\,(CR_{\max} - CR_{\min})}{g_{\max}} \qquad (12)$$

where
$CR_{\min}$ is the defined minimum crossover probability;
$CR_{\max}$ is the defined maximum crossover probability;
g is the current iteration number;
$g_{\max}$ is the maximum iteration number.
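
Equation (12) as a helper; the defaults CR_min = 0.3 and CR_max = 0.9 mirror the experimental settings given later in Section 4.3.

```python
def adaptive_CR(g, g_max, CR_min=0.3, CR_max=0.9):
    """Adaptive crossover factor of Eq. (12): linear in the iteration number g."""
    return CR_min + g * (CR_max - CR_min) / g_max
```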


The above design adaptively and effectively controls the DE process, overcoming stagnation and premature phenomena in the evolution and improving the convergence performance of the algorithm.

3.3. Algorithm steps


Step 1: Initialize the population size N, the maximum generation number $g_{\max}$, the mutation factor F, the crossover probability CR, the stopping threshold, and the upper/lower bounds of the SVM parameters (C, γ), so that initial (C, γ) values can be randomly generated.
Step 2: Use the current (C, γ) combination as the SVM parameters. Train and test on the sample data, and obtain the test result, i.e. the classification result of the samples.
Step 3: Compare the classification result of Step 2 with the actual classification and calculate the objective function value f. Judge whether the predefined accuracy is reached or g = $g_{\max}$ (i.e. the maximum generation number is reached). If either is satisfied, go to Step 9; otherwise, go to the next step.
Step 4: Set g = g + 1. Calculate the new mutation factor and crossover factor, and perform the evolution of the next generation.
Step 5: Select four different individuals from the current generation g and use Equation (7) to perform the mutation operation, generating the mutated individual $v_i^{g+1}$ of generation g + 1.
Step 6: Perform the crossover operation on the mutated individual $v_i^{g+1}$ according to Equation (8), generating the trial individual $y_i^{g+1}$ of generation g + 1.
Step 7: Perform the selection operation on the trial individual $y_i^{g+1}$ according to Equation (9), generating the individual $x_i^{g+1}$ of generation g + 1.
Step 8: Calculate the new (C, γ) in the individuals of generation g + 1, then go to Step 2.
Step 9: Output the optimal SVM parameters (C, γ).
The algorithm flowchart is shown in Figure 1.
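
Putting Steps 1-9 together, a compact sketch of the ADE-driven parameter search follows. It reuses the illustrative helpers sketched above (fitness, adaptive_F, adaptive_CR, de_generation), which are assumptions rather than the authors' implementation; searching over base-2 exponents so that C, γ ∈ [2^−10, 2^10] matches the experimental settings reported in Section 4.3.

```python
import numpy as np

def ade_svm_search(X, y, g_max=100, N=40, F_max=1.0,
                   CR_min=0.3, CR_max=0.9, threshold=1e-4, seed=0):
    """Steps 1-9: ADE search for (C, gamma), parameterized as exponents of 2."""
    rng = np.random.default_rng(seed)
    # Step 1: random population of (log2 C, log2 gamma) in [-10, 10]^2.
    pop = rng.uniform(-10.0, 10.0, size=(N, 2))
    # Steps 2-3: the fitness of an individual is the objective of Eq. (10).
    obj = lambda p: fitness(2.0 ** p[0], 2.0 ** p[1], X, y)
    fit = np.array([obj(p) for p in pop])
    for g in range(1, g_max + 1):                        # Step 4
        if fit.min() <= threshold:                       # Step 3: stop early
            break
        F = adaptive_F(fit, F_max)                       # Eq. (11)
        CR = adaptive_CR(g, g_max, CR_min, CR_max)       # Eq. (12)
        pop, fit = de_generation(pop, fit, obj, F=F, CR=CR, rng=rng)  # Steps 5-7
    best = pop[np.argmin(fit)]                           # Step 9
    return 2.0 ** best[0], 2.0 ** best[1]
```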

4. SVM model with ADE optimization for predicting coal and gas outbursts
4.1. Analysis of factors of coal and gas outbursts
There are many factors affecting coal and gas outbursts [19], such as coal type, initial velocity of gas diffusion, coal firmness coefficient, coal seam gas pressure, soft coal layer thickness and the wall rock permeability of coal seams. Although coal and gas outbursts are related to these factors, the risk is difficult to express linearly through them. Therefore, accurately determining the main factors affecting coal and gas outbursts is the key to predicting them. Referring to [15], we collected multiple groups of coal and gas outburst data from the experimental mining area at the Luling mine site, Huaibei Mining Group Company. According to expert analysis, 24 relatively independent factors were selected to establish a tree model of coal and gas outburst accidents. Principal component analysis was used to select the final eight main factors controlling coal and gas outbursts, namely: gas pressure p, coal mechanical strength f, coal crumbliness comprehensive feature coefficient Kc, coal permeability coefficient λ, coal split and merge feature coefficient Ks, coal thickness and coal thickness variation comprehensive feature coefficient Kt, fault complexity coefficient Kf, and interlayer sliding comprehensive feature coefficient Ki.

Figure 1. Flowchart of ADESVM parameter optimization. [The flowchart: initialize DE parameters and SVM parameters (C, γ); train the SVM and obtain the classification results; calculate the objective function values, i.e. the fitness of individuals; if the stopping criterion is not satisfied, perform mutation, crossover and selection operations on the individuals, evolve the next generation and generate new alternative parameters (C, γ); if satisfied, output the optimal parameters (C, γ).]

4.2. Experimental data sets for coal and gas outburst prediction
After the main controlling factors were determined, the experimental data were organized and 36 groups of coal and gas outburst sample data were obtained, as shown in Table 1. Among them, 26 groups are labelled as outburst and 10 groups as non-outburst. Sixteen groups of outburst data and 5 groups of non-outburst data were arbitrarily chosen as the training set (numbers 1-21); the other 15 groups make up the test set (numbers 22-36).

4.3. Prediction examples


We use an SVM, whose parameter selection is optimized by the ADE algorithm (namely ADESVM), to analyze and process the above data. The proposed ADESVM model is implemented using libsvm [2] on the MATLAB platform. For the training and test data sets shown in Table 1, the ADE parameters are initialized as follows: maximum generation number $g_{\max}$ = 100; population size N = 40; maximum mutation factor $F_{\max}$ = 1; minimum crossover probability $CR_{\min}$ = 0.3; maximum crossover probability $CR_{\max}$ = 0.9; stopping threshold 0.0001; and the optimization ranges of C and γ both set to [2^{-10}, 2^{10}]. In the fitness function, k is set to 3, which makes the objective function vary significantly, and λ1 = λ2 = 1, making the contributions of the misclassifications of the two classes the same.
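
For illustration only, these settings map onto the hypothetical ade_svm_search sketch from Section 3.3 as follows, where X_train and y_train stand for the normalized training samples of Table 1 (placeholders, not provided here):

```python
# Hypothetical call mirroring the initialization in Section 4.3.
C_opt, gamma_opt = ade_svm_search(X_train, y_train, g_max=100, N=40,
                                  F_max=1.0, CR_min=0.3, CR_max=0.9,
                                  threshold=1e-4)
```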

Table 1. Sample data of coal and gas outburst collected in Luling coal mine.
No. p f Kc λ Ks Kt Kf Ki Outburst grade
1 2.16 0.34 1.05 0.22 18.7 6.25 0.014 7.74 Dangerous
2 1.75 0.3 1.26 0.51 19.8 6.03 0.039 6.75 Dangerous
3 1.35 0.45 1.48 0.41 5.1 4.02 0.022 2.53 Dangerous
4 0.97 0.41 1.55 0.72 5.1 4.15 0.022 2.53 Dangerous
5 1.02 0.35 1.28 0.55 20.4 5.79 0.035 2.53 Dangerous
6 1.12 0.29 1.36 0.47 6.8 4.99 0.041 10.22 Dangerous
7 0.8 0.2 1.18 0.7 5.1 6.04 0.025 8.86 Dangerous
8 1.4 0.42 1.65 0.39 5.1 7.01 0.076 2.53 Big
9 2.9 0.31 1.72 0.21 25.6 6.89 0.089 21.34 Big
10 3.65 0.22 1.36 0.09 5.1 5.87 0.044 2.53 Big
11 1.27 0.22 1.7 0.55 21.9 6.05 0.057 48.3 Big
12 3.61 0.24 1.81 0.12 15.7 7.77 0.037 2.53 Big
13 1.4 0.24 1.32 0.48 19.2 5.22 0.025 16.29 Ordinary
14 1.24 0.27 1.6 0.46 5.1 6.43 0.026 13.98 Ordinary
15 1.78 0.23 1.52 0.43 10.2 4.78 0.046 25.45 Ordinary
16 2.1 0.33 1.49 0.19 7.3 5.66 0.048 18.76 Ordinary
17 0.95 0.58 0.51 0.48 5.1 4 0.017 2.53 Non
18 1.02 0.43 0.92 0.47 5.1 3.83 0.005 3.82 Non
19 0.5 0.65 0.68 0.66 5.1 5.12 0.023 4.54 Non
20 0.68 0.33 0.39 0.74 5.1 4.79 0.028 2.53 Non
21 1.75 0.78 0.21 0.35 5.1 5.22 0.019 2.53 Non
22 1.65 0.54 1.55 0.45 4.7 3.94 0.044 3.02 Dangerous
23 0.77 0.52 1.65 0.53 4.7 3.15 0.015 3.02 Dangerous
24 1.13 0.33 1.34 0.75 22.1 5.46 0.047 3.02 Dangerous
25 1.08 0.31 1.66 0.45 5.9 4.03 0.041 17.32 Dangerous
26 0.79 0.2 1.08 0.69 4.7 6.55 0.031 11.86 Dangerous
27 1.46 0.2 1.79 0.72 19.4 6.05 0.06 33.59 Big
28 3.45 0.22 1.78 0.24 18.7 7.06 0.046 3.02 Big
29 1.14 0.39 1.45 0.69 4.7 6.88 0.032 9.98 Ordinary
30 1.63 0.23 1.25 0.53 9.2 4.42 0.035 31.54 Ordinary
31 2.25 0.33 1.49 0.21 7.3 5.45 0.035 20.04 Ordinary
32 0.89 0.55 0.46 0.51 4.7 4.44 0.025 3.02 Non
33 1.21 0.38 1.02 0.47 4.7 3.35 0.011 4.83 Non
34 0.46 0.7 0.85 0.66 4.7 4.92 0.017 6.98 Non
35 0.83 0.23 0.41 0.77 4.7 5.06 0.033 3.02 Non
36 1.82 0.71 0.23 0.52 4.7 4.25 0.024 3.02 Non

Table 2. Results and prediction accuracies of different parameter selection algorithms for the SVM model.

Algorithm     Algorithm parameters                                                      C        γ        Time (s)   Accuracy (%)
Grid search   C step: 0.5, γ step: 0.5, v-fold = 5                                      1.3195   2.2974   14.936     93.33
PSO           c1 = 1.5, c2 = 1.7, maxgen = 100, N = 40; others same as DE or default    1.2888   2.8635    6.784     93.33
GA            maxgen = 100, N = 40; others same as DE or default                        1.3396   3.3108    4.355     93.33
DE            C, γ ∈ [2^-10, 2^10]; maxgen = 100, N = 40; threshold 0.0001              1.3212   2.8326    3.837     93.33
ADE           C, γ ∈ [2^-10, 2^10]; maxgen = 100, N = 40; threshold 0.0001              1.3025   2.4605    2.426     93.33

The sample data sets are normalized, and the SVM parameters (C, γ) are optimized by the above algorithm. The SVM is trained and the prediction results are obtained. The experimental results are compared with those of other parameter optimization methods, including the grid search algorithm [1], PSO [3], GA [7] and DE [9]. The comparative results are shown in Table 2.
Table 2 shows that the prediction accuracies of the parameter optimization methods are generally the same. However, the proposed ADE algorithm consumes significantly less time, improving the learning and generalization ability of the SVM. Compared with the grid search algorithm, the evolutionary algorithms (PSO, GA, DE) require much less time, indicating their advantage in parameter optimization. Among them, DE has better adaptability; the proposed ADE algorithm, as an improvement of DE, has better performance still.

For the same training and test data sets, a BPNN model was used for prediction [15], with a prediction accuracy of 86.67%, whereas the prediction accuracy of ADESVM is 93.33%. Therefore, the prediction accuracy of ADESVM is higher, and its training time is shorter.

5. Conclusion and discussion



(1) In this study, the standard DE algorithm is improved in terms of the optimization objective function, the mutation factor and the crossover factor. The ADE parameter selection model for SVMs is designed and implemented. The model reduces training time and improves performance, thus facilitating the parameter optimization of SVMs.
(2) The proposed coal and gas outburst prediction model, ADESVM, is superior to BPNN in terms of prediction accuracy and training time, and better than similar SVM algorithms in time efficiency. Experimental results show that the proposed algorithm is feasible, robust and effective. It combines global and local searching ability well, with better generalization ability.
(3) Using ADESVM in coal and gas outburst risk prediction has very good application prospects, due to the high prediction accuracy, strong adaptability and robustness of the model. It can meet the demands of coal mine production.

Acknowledgements
The authors would like to thank the reviewers for their constructive comments.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This work was supported by National Natural Science Foundation of China (NSFC) under Con-
tract [41271445], and partially supported by a Project Funded by the Priority Academic Program
Development of Jiangsu Higher Education Institutions.

References
[1] N.E. Ayat, M. Cheriet, and C.Y. Suen, Automatic model selection for the optimization of SVM kernels, Pattern Recognit. 38(10) (2005), pp. 1733–1745.
[2] C.-C. Chang and C.-J. Lin, LIBSVM: A library for support vector machines, December 14, 2015. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[3] L.-H. Chen and H.-D. Hsiao, Feature selection to diagnose a business crisis by using a real GA-
based support vector machine: An empirical study, Exp. Syst. Appl. 35(3) (2008), pp. 1145–1155.
[4] V. Cherkassky and Y. Ma, Practical selection of SVM parameters and noise estimation for SVM
regression, Neural Networks 17(1) (2004), pp. 113–126.
[5] C.-Y. Dong, Z.-G. Cao, Y.-H. Shang, and X. Liu, Coal and gas outburst classification analysis
based on G-K evaluation and rough set, J. Chin. Coal Soc. 36(7) (2011), pp. 1156–1160.
[6] T. Glasmachers and C. Igel, Gradient-based adaptation of general Gaussian kernels, Neural
Comput. 17(10) (2005), pp. 2099–2105.
[7] X.C. Guo, J.H. Yang, C.G. Wu, C.Y. Wang, and Y.C. Liang, A novel LS-SVMs hyper-
parameter selection based on particle swarm optimization, Neurocomputing 71(16–18) (2008),
pp. 3211–3215.
[8] D.-Y. Guo, M.-J. Zheng, C. Guo, D.-M. Hu, and X.-K. Zhang, Extension clustering method
for coal and gas outburst prediction and its application, J. Chin. Coal Soc. 34(6) (2009), pp.
783–787.
[9] R. Storn and K. Price, Differential evolution-a simple and efficient heuristic for global
optimization over continuous spaces, J. Global Optim. 11 (1997), pp. 341–359.
[10] V. Vapnik, An overview of statistical learning theory, IEEE Trans. Neural Networks 10(5)
(1999), pp. 988–999.
[11] Z.-H. Wang and N. Qiao, Prediction model of coal and gas outburst intensity based on IGA-
LSSVM, J. Liaoning Tech. Univ. 34(7) (2015), pp. 791–796.
[12] C. Wang, D.-Z. Song, X.-S. Du, Z.-G. Zhang, D. Zhu, and D.-W. Yang, Prediction of coal and
gas outburst based on distance discriminant analysis method and its application, J. Min. Saf.
Eng. 26(4) (2009), pp. 470–474.
[13] C.-P. Wen, Attribute recognition model and its application of fatalness assessment of gas burst
in tunnel, J. Chin. Coal Soc. 36(8) (2011), pp. 1322–1328.
[14] L. Yang, J.-C. Geng, and K.-L. Wang, Research on coal and gas outburst prediction using fuzzy
support vector machines, J. Saf. Sci. Technol. 10(4) (2014), pp. 103–108.
[15] M. Yang, Y.-J. Wang, and Y.-P. Cheng, Improved differential evolution neural network and its
application in prediction of coal and gas outburst, J. Chin. Univ. Min. Technol. 38(3) (2009),
pp. 399–444.
[16] Z. Yan, Y. Yang, and Y. Ding, An experimental study of the hyper-parameters distribution
region and its optimization method for support vector machine with Gaussian kernel, Int. J.
Signal Process. Image Process. Pattern Recognit. 6(5) (2013), pp. 437–446.
[17] J.-W. Yan, X.-B. Zhang, and Z.-M. Zhang, Research on geological control mechanism of coal-gas
outburst, J. Chin. Coal Soc. 38(7) (2013), pp. 1174–1178.
[18] W. You, Y.-X. Liu, Y. Li, C.-H. Liu, and T.-B. Zhou, Predicting the coal and gas outburst using
artificial neural network, J. Chin. Coal Soc. 32(3) (2007), pp. 285–287.
[19] Q.-X. Yu, K. Wang, and S.-Q. Yang, Study on pattern and control of gas emission at coal face in
China, J. Chin. Univ. Min. Technol. 29(1) (2000), pp. 9–14.
[20] Z.-X. Zhang, G.-F. Liu, R.-S. Lu, and J. Zhang, Regional forecast of coal and gas burst based on
fuzzy pattern recognition, J. Chin. Coal Soc. 32(6) (2007), pp. 592–595.
