Yang Cheng-Xiang
School of Resources & Civil Engineering, Northeastern University, Shenyang, China
Feng Xia-Ting
Institute of Rock and Soil Mechanics, Chinese Academy of Sciences, Wuhan, China
ABSTRACT: This study applies support vector machines (SVMs) to develop the highly nonlinear mapping between the peak particle velocity (PPV) induced in structures and its influencing factors, using data taken from field measurements on real structures, in order to predict the structural response to mining-induced blast vibration. In addition, this study examines the feasibility of applying SVMs to the analysis of blast-induced structural vibration response by comparing them with back-propagation neural networks. The application results show that SVMs provide a promising alternative for solving vibration-related problems.
Figure 1. The ε-insensitive tube (−ε, 0, +ε) and the slack variables ζ of the soft-margin loss.

Figure 2. Graphical overview of the regression stage: the input vector x and the support vectors x1 … xn are mapped into feature space as Φ(x) and Φ(xi); the dot products Φ(x)·Φ(xi) = K(x, xi) are computed and combined through the weights w1 … wm to form the output.
Finally, by applying Lagrangian theory and exploiting the optimality constraints, the decision function given by Equation 1 takes the following explicit form:

f(x) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) \, K(x_i, x) + b    (4)

In this formula, αi and αi* are the Lagrange multipliers associated with each training point; they are obtained by solving the following dual optimization problem:

\max_{\alpha_i, \alpha_i^*} \; -\frac{1}{2} \sum_{i,j=1}^{n} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) K(x_i, x_j) - \varepsilon \sum_{i=1}^{n} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{n} y_i (\alpha_i - \alpha_i^*)    (5)

subject to \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) = 0 and \alpha_i, \alpha_i^* \in [0, C].

Because of the specific formulation of the cost function and the use of Lagrangian theory, the solution has several interesting properties:

• Globality. The solution found is always global because the problem formulation is convex (Burges 1998).
• Uniqueness. The solution found is also unique if the cost function is strictly convex.
• Sparseness. Only the training points lying on or outside the ε-bound of the decision function contribute to the solution, because the Lagrange multipliers of all other data points are equal to zero.
• Dimension-free. The dimension of the input becomes irrelevant to the solution (due to the use of the inner product).

Training points with nonzero Lagrange multipliers are called support vectors and give shape to the solution. The smaller the fraction of support vectors, the more general the obtained solution is and the fewer computations are required to evaluate it for a new, unknown object. However, many support vectors do not necessarily imply an over-trained solution. Generally, the larger ε is, the fewer the support vectors and thus the sparser the representation of the solution. A larger ε, however, also reduces the approximation accuracy on the training points. In this sense, ε is a trade-off between the sparseness of the representation and closeness to the data.

In Equation 4, K is the so-called kernel function. Its value equals the inner product of two vectors xi and xj in the feature space, Φ(xi) and Φ(xj); that is, K(xi, xj) = Φ(xi)·Φ(xj). This simplifies the use of the map Φ(x): feature spaces of arbitrary dimensionality can be handled without ever computing the map explicitly, and the problem is reduced to finding kernels that identify families of regression formulas. Any function satisfying Mercer's condition (Schölkopf & Smola 2002) can be used as a kernel function. The most widely used kernels are the polynomial kernel K(x, y) = (x·y + 1)^d and the Gaussian kernel K(x, y) = exp(−‖x − y‖²/δ²), where d is the degree of the polynomial kernel and δ² is the bandwidth of the Gaussian kernel. The kernel parameter should be chosen carefully, as it implicitly defines the structure of the high-dimensional feature space Φ(x) and thus controls the complexity of the final solution.
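As a concrete check of this kernel identity, the short sketch below (an illustration only, not from the paper) verifies that, for the polynomial kernel with d = 2 on 2-D inputs, an explicit degree-2 feature map Φ reproduces K(x, y) = (x·y + 1)² without the kernel ever forming Φ:

```python
import numpy as np

def phi(v):
    # Explicit degree-2 feature map for a 2-D input, chosen so that
    # phi(x) . phi(y) == (x . y + 1)**2 (polynomial kernel, d = 2).
    x1, x2 = v
    return np.array([1.0,
                     np.sqrt(2) * x1, np.sqrt(2) * x2,
                     x1 ** 2, x2 ** 2,
                     np.sqrt(2) * x1 * x2])

x = np.array([0.3, -1.2])
y = np.array([2.0, 0.5])

explicit = phi(x) @ phi(y)           # inner product in the feature space
kernel = (x @ y + 1.0) ** 2          # same value, computed from the kernel alone
print(np.isclose(explicit, kernel))  # True
```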
From the implementation point of view, training an SVM is equivalent to solving a linearly constrained quadratic programming (QP) problem with twice as many variables as there are training data points. The sequential minimal optimization (SMO) algorithm proposed by Smola and Schölkopf (Smola & Schölkopf 1998, Schölkopf & Smola 2002) is reported to be very effective in training SVMs for regression problems. Figure 2 gives a graphical overview of the different steps in the regression stage: the input pattern (for which a prediction should be made) is mapped into feature space, its dot products with the mapped support vectors are evaluated through the kernel function, and the weighted sum of the resulting kernel values yields the prediction.
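To make this regression stage concrete, the following sketch (illustrative only, not the authors' code) uses scikit-learn's SVR, which trains an ε-insensitive support vector regressor with an SMO-type solver; its parameters C, epsilon, and gamma correspond to C, ε, and 1/δ² in this paper, and the training data here are synthetic placeholders rather than the field measurements:

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic placeholder data: 8 influencing factors -> PPV (m/s).
rng = np.random.default_rng(0)
X_train = rng.random((100, 8))
y_train = rng.random(100)

# kernel="rbf" is the Gaussian kernel K(x, y) = exp(-gamma * ||x - y||^2),
# so gamma plays the role of 1/delta^2; C and epsilon are as in the text.
svr = SVR(kernel="rbf", C=0.84, epsilon=0.001, gamma=6.28)
svr.fit(X_train, y_train)

# Prediction implements Equation 4: a kernel-weighted sum over the
# support vectors plus the bias term b.
x_new = rng.random((1, 8))
ppv_pred = svr.predict(x_new)

# Sparseness: only points with nonzero Lagrange multipliers are kept.
print(f"support vectors: {len(svr.support_)} of {len(X_train)}")
print(f"predicted PPV: {ppv_pred[0]:.3f}")
```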
For the non-numerical input, type of structure, …
Figure 4. Learned and predicted results of SVMs: the measured PPV compared with the SVM outputs over the training and testing cases.

Figure 5. Linear regression result between the PPV values for the training and testing cases: measured PPV (m/s) against SVM-predicted PPV (m/s), with fitted line y = 1.0013x + 0.0769 and R² = 0.9984.
Figure 4 shows the learned and predicted results of the SVMs; one can note that the SVMs attain a satisfactory approximation. Figure 5 shows the linear regression result between the PPV values for the training and testing cases. It is noted that the intercept of the fitted straight line is almost zero and its slope is close to one, which shows that the prediction is fairly accurate. It can therefore be concluded that SVMs provide a promising technique for the analysis of structural response to blast-induced vibration.
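The quantities read off Figure 5 are easy to reproduce; the sketch below (illustrative only, with placeholder arrays standing in for the measured and SVM-predicted PPV values) fits the least-squares line and reports its slope, intercept, and R²:

```python
import numpy as np

# Placeholder arrays; in the paper these are the measured PPV values
# and the corresponding SVM predictions for the training/testing cases.
measured = np.array([3.2, 8.5, 14.1, 22.7, 31.4, 40.2, 48.9, 55.3])
predicted = np.array([3.1, 8.6, 14.0, 22.5, 31.6, 40.0, 49.2, 55.1])

# Least-squares line: measured = slope * predicted + intercept.
slope, intercept = np.polyfit(predicted, measured, deg=1)

# Coefficient of determination R^2 of the fitted line.
residuals = measured - (slope * predicted + intercept)
r2 = 1.0 - residuals.var() / measured.var()

# A slope near 1 and an intercept near 0 indicate unbiased predictions.
print(f"slope={slope:.4f}, intercept={intercept:.4f}, R^2={r2:.4f}")
```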
5 DISCUSSION

As mentioned in Section 2, the free parameters of SVMs and the kernel parameter play important roles in the performance of SVMs. The regularization constant C controls the trade-off between the empirical risk and the regularization term, while the tube size ε is equivalent to the approximation accuracy placed on the training data points and controls the trade-off between the sparseness of the representation and closeness to the data. Both C and ε are usually data and problem dependent. The kernel parameter δ² implicitly defines the structure of the high-dimensional feature space and thus controls the complexity of the final solution. This section compares the prediction performance with respect to the various free parameters of SVMs and discusses how these parameters affect its solutions.

Figure 6 gives the MSE over the training and testing cases of SVMs for different values of 1/δ² (0.001 to 1000), with C and ε fixed at 0.84 and 0.001 respectively, based on the application results. The MSE on the testing cases decreases with 1/δ² initially but increases after a certain value of 1/δ². That is, a small value of 1/δ² under-fits the training data while a large value over-fits it. In this case, a value of 6.28 for 1/δ² is appropriate.

Figure 7 gives the results for various C, with 1/δ² and ε set to 6.28 and 0.001 respectively. The figure shows that the MSE on the test cases first decreases, then increases after a certain point, and finally settles at an almost constant value as C increases. This is because increasing C biases the optimization toward the empirical risk, which leads to over-fitting; a very large value of C reduces the formulation to the empirical risk minimization (ERM) problem and stops at an over-fitted result.

Figure 8a gives the results of SVMs at various ε, where 1/δ² and C are fixed at 6.28 and 0.84 respectively. It can be observed that the MSE on both the training set and the test set remains stable at small values of ε but increases at larger ε. This indicates that the performance of SVMs is insensitive to small values of ε (0.0001∼0.01), but that too large a value of ε (≥0.01) noticeably degrades the prediction accuracy.
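Such one-at-a-time parameter sweeps are straightforward to script. The following sketch (assuming scikit-learn and synthetic placeholder data rather than the paper's field measurements) sweeps 1/δ² as in Figure 6 while holding C and ε fixed, and reports the training and testing MSE; analogous loops over C and ε reproduce the studies of Figures 7 and 8a:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic placeholder data standing in for the field measurements.
rng = np.random.default_rng(1)
X = rng.random((200, 8))
y = X @ rng.random(8) + 0.05 * rng.standard_normal(200)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

# Sweep 1/delta^2 (sklearn's gamma) with C and epsilon held fixed.
for inv_delta2 in [0.001, 0.01, 0.1, 1.0, 6.28, 10.0, 100.0, 1000.0]:
    svr = SVR(kernel="rbf", C=0.84, epsilon=0.001, gamma=inv_delta2)
    svr.fit(X_tr, y_tr)
    mse_tr = mean_squared_error(y_tr, svr.predict(X_tr))
    mse_te = mean_squared_error(y_te, svr.predict(X_te))
    print(f"1/d^2={inv_delta2:>8}: train MSE={mse_tr:.4f}, test MSE={mse_te:.4f}")
```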
Figure 7. The results for various C, in which 1/δ² = 6.28 and ε = 0.001. A small value of C under-fits the training data while a large value of C over-fits the testing data.

Figure 8. The results of SVMs for various ε: (a) the MSE over the training and testing cases; (b) the number of support vectors.

6 CONCLUSIONS

In this study, SVMs were applied to the analysis of the structural response to blast-induced vibration. The highly nonlinear relationship between the PPV in the structures and the blast geometry, blast size, location, monitoring stations and other relevant aspects was regressed from the field data. The application results show that SVMs provide a promising alternative to the BP neural network for blast-induced vibration analysis. SVMs give more stable and accurate solutions than the BP network in both training and generalization.

The sensitivity of SVMs to their parameters was also discussed. The results show that the kernel parameter δ² and the regularization constant C play an important role in the training and especially the generalization performance of SVMs. The performance of SVMs is insensitive to a small value of the tube size ε, while a larger ε will significantly reduce the number of support vectors and lead to a sparse representation of the solution; the larger ε, however, also reduces the approximation accuracy on the training points.

ACKNOWLEDGEMENT

The financial support from the Special Funds for Major State Basic Research Project under Grant no. 2002CB4127008, the Teaching and Research Award Program for Outstanding Young Teachers in Higher Education Institutions of MOE, and the Research Fund of the Key Laboratory of Rock and Soil Mechanics in … is gratefully acknowledged.

REFERENCES

Burges, C.J.C. 1998. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery 2(2): 121–167.
Schölkopf, B. & Smola, A.J. 2002. Learning with Kernels. Cambridge, MA: MIT Press.
Smola, A.J. & Schölkopf, B. 1998. A tutorial on support vector regression. NeuroCOLT2 Technical Report NC2-TR-1998-030.