
Application of Computers and Operations Research in the Mineral Industry – Dessureault, Ganguli, Kecojevic & Dwyer (eds)
© 2005 Taylor & Francis Group, London, ISBN 04 1537 449 9

Estimation of structural response to mining-induced blast vibration using support vector machines

Yang Cheng-Xiang
School of Resources & Civil Engineering, Northeastern University, Shenyang, China

Feng Xia-Ting
Institute of Rock and Soil Mechanics, Chinese Academy of Sciences, Wuhan, China

ABSTRACT: This study applies support vector machines (SVMs) to learn the highly nonlinear mapping between the peak particle velocity (PPV) induced in structures and its affecting factors, using data taken from field measurements on real structures, in order to predict the structural response to mining-induced blast vibration. In addition, this study examines the feasibility of applying SVMs to the analysis of blast-induced structural vibration response by comparing them with back-propagation neural networks. The application results show that SVMs provide a promising alternative for solving vibration-related problems.

1 INTRODUCTION

When blasting takes place in opencast or underground mines, adjacent structures may be damaged by the induced vibration, which transmits through the ground and creates a dynamic response in structures. The general conclusion in mining engineering is that the peak particle velocity (PPV) is an appropriate index for assessing ground motions that damage structures (Nicholls et al. 1971). The PPV induced in structures by blast vibration is usually estimated through empirical equations, which often produce results that disagree with the real values (Yang et al. 1994, Richards et al. 1994). This is due primarily to the fact that such equations exclude the effects of structural type, structural conditions, site conditions, ground vibration frequency content and ground vibration duration on structural responses. In fact, the characteristics of structural response are very difficult to identify because of the complexity of these affecting factors (Wingh et al. 1996, Pijush 1998).

In this case an experimentally supported analysis can be superior to a purely computational analysis, which can be very complex and inaccurate because of difficulties with material, structural and load modelling. Artificial neural networks (ANNs) provide a rich, powerful and robust nonparametric modelling framework with proven and potential applications across the sciences. ANNs have proved to be a promising tool for solving vibration-related problems with data taken from measurements on real structures (Tienfuan & David 2002, Kuźniar & Waszczyszyn 2003). Essentially, ANNs are highly parametric functions of the input variables built from processing units whose high connectivity makes them suitable for describing complex input-output mappings without resorting to a physical description of the phenomenon.

Among the many training methods, gradient techniques are most commonly used, typically some variation of back propagation (BP) (Wong et al. 1995). However, the BP neural network suffers from a number of weaknesses, including the need for a large number of controlling parameters, difficulty in obtaining a stable solution and the danger of over-fitting (Curry & Morgan 1997, Jasmina & Ramazan 2001, Paulo et al. 2002). The over-fitting problem, which usually leads to poor generalization, can cause serious problems in real-world applications: a network with too large a capacity captures not only the useful information contained in the training data but also unwanted noise. As a result, it ends up merely memorizing the training data and generalizing poorly to unseen data. This issue of generalization has long been a concern to researchers. Many methods have been proposed for enhancing the generalization ability of neural networks, but most of them involve a substantial amount of computation.

Recently, Support Vector Machines (SVMs), developed by Vapnik (1995) as a novel type of neural network, have been gaining popularity due to many attractive features and promising empirical performance.



SVMs are based on the Structural Risk Minimization (SRM) principle, which has been shown to be superior (Burges 1998) to the traditional Empirical Risk Minimization (ERM) principle employed by conventional neural networks. SRM minimizes an upper bound on the Vapnik–Chervonenkis (VC) dimension (the 'generalization error'), as opposed to ERM, which minimizes the error on the training data. This induction principle is based on the fact that the generalization error is bounded by the sum of the training error and a confidence interval term that depends on the VC dimension. Based on the SRM principle, SVMs achieve an optimum network structure by striking the right balance between the empirical error and the VC-confidence interval. This incorporates capacity control to prevent over-fitting and eventually results in better generalization performance than other neural network models. Another merit of SVMs is that their training is equivalent to solving a quadratic programming problem with linear equality and inequality constraints rather than a non-convex, unconstrained optimization problem. Consequently, the solution of SVMs is always unique and globally optimal, without the danger of getting stuck in local minima. In addition, the flexibility of kernel functions allows the SVM to search a wide variety of hypothesis spaces. Due to its remarkable generalization performance, the SVM has received increasing attention in areas ranging from its original application of pattern recognition (Burges 1998) to the extended application of regression estimation (Smola & Schölkopf 1998).

In the current paper, we apply SVM regression to approximate the highly nonlinear mapping relationship between the peak particle velocity (PPV) induced in structures and the affecting factors, using data taken from field measurements on real structures, in order to predict the structural response to mining-induced blast vibration. We compare the performance of the SVM with a BP neural network to study the feasibility of applying SVMs in the analysis of blast-induced structural vibration response. Some experiments are also carried out to investigate the variability in performance with respect to the free parameters of SVMs.

2 THEORY OF SVMS

SVMs were originally developed to solve classification problems, but their principles can be extended easily to the task of regression by introducing an alternative loss function modified to include a distance measure. This section focuses on some highlights representing crucial elements in using this method. Details of support vector algorithms and tutorials can be found in Burges (1998), Smola & Schölkopf (1998), Schölkopf & Smola (2002), Cristianini & Shawe-Taylor (2000), Campbell (2002) and Suykens et al. (2002).

Given a training set (x_1, y_1), (x_2, y_2), …, (x_l, y_l) (x_i ∈ X ⊆ R^n, y_i ∈ Y ⊆ R, l is the total number of training samples), the SVM solves the regression problem using the following function:

    f(x) = w · Φ(x) + b                                                     (1)

where Φ(x) is the high-dimensional feature space which is non-linearly mapped from the input space X and which extends the approach to nonlinear functions. The best coefficients w and b are estimated by minimizing the following regularized risk function:

    R(C) = C (1/l) Σ_{i=1..l} L_ε(y_i, f(x_i)) + (1/2) ‖w‖²                 (2)

where the first term is a penalty function which penalizes empirical errors larger than ±ε using the so-called ε-insensitive loss function L_ε for each of the l training points:

    L_ε(y, f(x)) = |y − f(x)| − ε   if |y − f(x)| ≥ ε,   and 0 otherwise.

The second term, on the other hand, is the regularization term, which is used to regularize weight sizes and penalizes large weights. Due to this regularization, the weights converge to smaller values. Large weights deteriorate the generalization ability of the SVM because, usually, they cause excessive variance. The positive constant C is referred to as the regularization constant and controls the trade-off between the empirical risk and the regularization term by determining the extent to which deviations larger than ε are tolerated. Increasing the value of C causes the relative importance of the empirical risk with respect to the regularization term to grow. ε is called the tube size and is equivalent to the approximation accuracy placed on the training data points. Both C and ε have to be chosen by the user, and the optimal values are usually data and problem dependent.

By introducing the positive slack variables ζ_i and ζ_i* to denote errors larger than ±ε, the cost function given by Equation 2 is transformed into the so-called primal function:

    minimize    (1/2) ‖w‖² + C Σ_{i=1..l} (ζ_i + ζ_i*)                      (3)
    subject to  y_i − w · Φ(x_i) − b ≤ ε + ζ_i,
                w · Φ(x_i) + b − y_i ≤ ε + ζ_i*,
                ζ_i, ζ_i* ≥ 0,   i = 1, …, l.

This loss function provides the advantage of enabling one to use sparse data points to represent the decision function given by Equation 1. Figure 1 shows the use of the slack variables and the linear ε-insensitive loss function that are used throughout this paper.
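As a concrete illustration of this loss, the short sketch below (not from the paper; the function name and the use of NumPy are our own choices) computes L_ε element-wise: errors inside the ±ε tube cost nothing, while errors outside it grow linearly.

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, eps):
    """L_eps(y, f(x)) = max(|y - f(x)| - eps, 0): errors inside the
    +/- eps tube cost nothing, errors outside grow linearly."""
    residual = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    return np.maximum(residual - eps, 0.0)

# With eps = 0.5, an error of 0.3 costs nothing while an error of 1.2 costs 0.7.
print(epsilon_insensitive_loss([1.0, 2.0], [1.3, 0.8], eps=0.5))   # -> [0.  0.7]
```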



Figure 1. The ε-insensitive loss function (only the black dots located on or outside the tube, i.e. the support vectors, contribute to the cost).

Figure 2. Architecture of a support vector regression: the input vector x and the support vectors x_1, …, x_n are mapped to Φ(x) and Φ(x_i); the dot products Φ(x) · Φ(x_i) = K(x, x_i) are combined with the weights w_i to give the output Σ w_i K(x, x_i) + b.

Finally, by applying Lagrangian theory and exploiting the optimality constraints, the decision function given by Equation 1 takes the following explicit form:

    f(x) = Σ_{i=1..l} (α_i − α_i*) K(x_i, x) + b                            (4)

In this formula, α_i and α_i* are the Lagrange multipliers associated with a specific training point. They are obtained by solving the following dual optimization problem:

    maximize    −(1/2) Σ_{i,j=1..l} (α_i − α_i*)(α_j − α_j*) K(x_i, x_j)
                − ε Σ_{i=1..l} (α_i + α_i*) + Σ_{i=1..l} y_i (α_i − α_i*)   (5)
    subject to  Σ_{i=1..l} (α_i − α_i*) = 0,   0 ≤ α_i, α_i* ≤ C.

Because of the specific formulation of the cost function and the use of Lagrangian theory, the solution has several interesting properties.

• Globality. The solution found is always global because the problem formulation is convex (Burges 1998).
• Uniqueness. The solution found is also unique if the cost function is strictly convex.
• Sparseness. Only a sparse subset of training points, those lying on or outside the ε-bound of the decision function, contribute to the solution, because the Lagrange multipliers of the other data points are all equal to zero.
• Dimension-free. The dimension of the input becomes irrelevant in the solution (due to the use of the inner product).

Training points with nonzero Lagrange multipliers are called support vectors and give shape to the solution. The smaller the fraction of support vectors, the more general the obtained solution is, and the fewer computations are required to evaluate the solution for a new and unknown object. However, many support vectors do not necessarily result in an over-trained solution. Generally, the larger the ε, the fewer the number of support vectors and thus the sparser the representation of the solution. However, a larger ε can also reduce the approximation accuracy placed on the training points. In this sense, ε is a trade-off between the sparseness of the representation and closeness to the data.

In Equation 4, K is the so-called kernel function. The value of the kernel is equal to the inner product of two vectors x_i and x_j in the feature space, that is, K(x_i, x_j) = Φ(x_i) · Φ(x_j). This simplifies the use of the map Φ(x) by dealing with feature spaces of arbitrary dimensionality without having to compute the map explicitly, and the problem is reduced to finding kernels that identify families of regression formulas. Any function satisfying Mercer's condition (Schölkopf & Smola 2002) can be used as the kernel function. The most used kernel functions are the polynomial kernel K(x, y) = (x · y + 1)^d and the Gaussian kernel K(x, y) = exp(−(1/δ²)(x − y)²), where d is the degree of the polynomial kernel and δ² is the bandwidth of the Gaussian kernel. The kernel parameter should be chosen carefully, as it implicitly defines the structure of the high-dimensional feature space Φ(x) and thus controls the complexity of the final solution.

From the implementation point of view, training SVMs is equivalent to solving a linearly constrained quadratic programming (QP) problem with twice as many variables as training data points. The sequential minimal optimization algorithm described by Smola and Schölkopf (Smola & Schölkopf 1998, Schölkopf & Smola 2002) is reported to be very effective in training SVMs for the regression problem.
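To make the prediction step concrete, the following sketch (an illustration under our own assumptions, not code from the paper; the support vectors and coefficients shown are arbitrary toy values) evaluates Equation 4 with the Gaussian kernel: each kernel value K(x_i, x) is weighted by the dual coefficient α_i − α_i*, and the bias b is added at the end.

```python
import numpy as np

def gaussian_kernel(x, y, inv_delta_sq):
    """K(x, y) = exp(-(1/delta^2) * ||x - y||^2)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return np.exp(-inv_delta_sq * np.dot(diff, diff))

def svr_predict(x, support_vectors, dual_coefs, b, inv_delta_sq):
    """Equation 4: f(x) = sum_i (alpha_i - alpha_i*) K(x_i, x) + b,
    where dual_coefs[i] holds the difference alpha_i - alpha_i*."""
    k = np.array([gaussian_kernel(sv, x, inv_delta_sq) for sv in support_vectors])
    return float(np.dot(dual_coefs, k) + b)

# Toy usage with made-up support vectors and coefficients (illustrative only).
svs = np.array([[0.2, 0.5], [0.8, 0.1]])
coefs = np.array([0.4, -0.7])            # alpha_i - alpha_i*
print(svr_predict([0.3, 0.4], svs, coefs, b=0.05, inv_delta_sq=6.28))
```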



Table 1. Research data used in this study.

Total charge (kg)   Maximum charge per delay (kg)   Distance (m)   Compacted-ground PPV (mm/s)   Compacted-ground frequency (Hz)   Structure type*   Structure frequency (Hz)   Height difference (m)   Structure PPV (mm/s)
14350.00 1800.00 940 4.03 5 BC 5 0.92 5.37


56875.00 5600.00 3000 1.596 5 BMC 15 0.76 1.640
19528.00 5200.00 1450 3.73 5 BMC 5 0.92 4.03
632.75 382.00 325 2.41 9 CN 8 0.61 2.48
2137.50 737.50 300 13.72 9 CN 9 0.61 14.73
1350.00 187.50 130 15.00 15 BC 12 4.0 40.10
1350.00 187.50 118 21.9 16 BC 12 3.05 55.9
731.50 300.00 312 2.00 15 BC 15 2.44 8.90
775.00 225.00 450 0.67 25 BMC 17 1.2 1.312
2150.00 445.00 1380 0.445 5 M 9 2.44 0.925
4520.84 645.00 1190 0.880 7 BC 8 3.65 1.978
2029.30 764.40 120 11.78 6 BC 7 0.76 14.90
1825.00 650.00 200 9.99 7 BC 8 0.76 11.33
260.13 140.70 145 6.71 12 BC 15 3.20 14.32
230.25 118.75 128 7.75 10 M 18 2.74 12.82
1925.00 175.00 135 10.36 28 CN 25 3.76 27.29
390.00 65.00 350 0.60 7 BMC 5 1.22 1.19
1237.50 225.00 400 4.03 7 BC 5 3.96 12.53
1475.00 337.50 250 3.88 5 BC 7 1.52 10.59
125.00 41.70 275 0.65 32 BC 27 1.83 1.64
712.50 50.00 115 5.22 10 BMC 7 3.66 7.01
2300.00 50.00 230 0.89 5 BC 7 3.05 3.53
2300.00 50.00 240 1.49 8 BC 10 7.0 9.84
2268.48 200.16 850 0.25 22 BC 16 3.66 0.60
2637.50 250.00 100 7.61 7 BC 7 4.57 31.62
2443.75 125.00 120 8.35 8 BC 8 1.78 21.62
2443.75 125.00 110 7.01 17 BC 20 1.40 9.54
530.00 80.00 200 4.18 35 BC 17 3.05 7.61
798.00 114.00 225 9.91 20 BMC 15 3.50 16.26

Figure 2 contains a graphical overview of the different steps in the regression stage. The input pattern (for which a prediction should be made) is mapped into the feature space by the map Φ. Dot products are then computed with the images of the training patterns under the map Φ; this corresponds to evaluating the kernel function K at locations K(x_i, x). Finally, the dot products are added up using the weights α_i − α_i*, and adding the constant term b yields the final prediction output. The process described here is very similar to regression in a three-layered neural network, with the difference that in the SVM case the weights in the input layer are predetermined by the training patterns.

In contrast to the Lagrange multipliers, the choice of a kernel and its specific parameters, as well as ε and C, do not follow from the optimization problem and have to be tuned by the user. Except for the choice of the kernel function, these parameters can be optimized by the use of Vapnik–Chervonenkis bounds, cross-validation, an independent optimization set, or Bayesian learning (Schölkopf & Smola 2002, Cristianini & Shawe-Taylor 2000, Suykens et al. 2002). Pretreatment of both x and y can also improve the regression results, just as in other regression methods, but this has to be investigated for each problem separately.

3 RESEARCH DATA AND IMPLEMENTATION SETTINGS

3.1 Research data

The research data used in this study were collected from Pijush (1998) and include experimental data from 20 different mines. Since we attempt to predict the PPV induced in structures by blast vibration, the effects of structural type, structural conditions, site conditions, ground vibration frequency content and ground vibration duration are used as input variables. This study selects 8 technical indicators to make up the initial attributes, as determined by the review of domain experts and prior research (Nicholls et al. 1971, Yang et al. 1994, Richards et al. 1994, Wingh et al. 1996, Pijush 1998). These are the total charge, maximum charge per delay, distance, PPV and frequency of the compacted ground, structure type (in abbreviated form; see Table 2 for the structure type codes), frequency of the structure and the height difference. The data are presented in Table 1.
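Purely as an illustration of how a record from Table 1 can be turned into an 8-dimensional input vector and its PPV target (the function and dictionary below are hypothetical, not part of the paper), the structure type is replaced by its index from Table 2:

```python
# Hypothetical encoding of a record from Table 1 (illustrative only).
STRUCTURE_TYPE_INDEX = {"BC": 1, "BMC": 2, "CN": 3, "M": 4}   # indices from Table 2

def encode_record(total_charge, max_charge_per_delay, distance,
                  ground_ppv, ground_freq, structure_type,
                  structure_freq, height_difference, structure_ppv):
    """Return (the 8 input indicators, the target structure PPV in mm/s)."""
    x = [total_charge, max_charge_per_delay, distance,
         ground_ppv, ground_freq,
         STRUCTURE_TYPE_INDEX[structure_type],
         structure_freq, height_difference]
    return x, structure_ppv

# First record of Table 1.
x, y = encode_record(14350.00, 1800.00, 940, 4.03, 5, "BC", 5, 0.92, 5.37)
```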



Table 2. Types of structure and their respective abbreviated forms.

Abbreviated form   Index   Type of construction
BC                 1       Brick and cement construction with concrete roof
BMC                2       Brick, mud and cement plastered with wooden ceiling
CN                 3       Framework structure with reinforced concrete
M                  4       Mud structure with tiled roof

For the non-numerical input, the type of structure in Table 1, we simply evaluate it with an index number (as given in Table 2), since in SVMs the dimension of the input becomes irrelevant in the solution.

The original data are scaled into the range [0.1, 0.9]. The goal of linear scaling is to normalize each feature component independently to the specified range. It ensures that input attributes with larger values do not overwhelm those with smaller values, which helps to reduce prediction errors.

There are a total of 29 data points. Four of these cases (the last four cases in Table 1) are randomly selected to serve as the testing cases, to examine the generalization capability of the SVMs. The others are used as the training cases to obtain the decision function. The generalization performance is evaluated using the mean squared error (MSE):

    MSE = (1/N) Σ_{i=1..N} (y_i − ŷ_i)²

where y_i is the measured PPV, ŷ_i is the estimated PPV and N is the number of cases.

3.2 Implementation settings

3.2.1 SVM
In this study, the Gaussian radial basis function is used as the kernel function of the SVM, because Gaussian kernels tend to give good performance under general smoothness assumptions. Since the free parameters of the SVM (the regularization constant C and the tube size ε) and the kernel parameter (the bandwidth δ²) play important roles in the performance of SVMs, and there is little general guidance for determining these parameters, this study varies the parameters to select optimal values for the best prediction performance and discusses the sensitivity of SVMs to the parameters. This study uses the LIBSVM software system (Chang & Lin 2001) to perform the applications.

3.2.2 BP
In this study, standard three-layer BP networks are used as benchmarks. There are 8 nodes in the input layer and 1 output node, according to the real problem. The number of hidden nodes is selected by trial and error, since the BP network does not have a general rule for determining the optimal number of hidden nodes. It was determined that using 18 hidden nodes in the BP network gives the best performance. During training, an additional testing procedure on the MSE over the testing cases is used as the stopping criterion of BP, to detect the over-fitting phenomenon. The learning rate is set to 0.1 and the momentum term to 0.1, according to experience.

Figure 3. The behavior of the MSE over training cases and testing cases: (a) BP network; (b) SVMs.

4 RESULTS

In the training of the BP network, over-fitting occurred at epoch 50,500 and the corresponding value of the MSE is 12.4. The behavior of the MSE over the training cases and testing cases is given in Figure 3a. The values 1/δ² = 6.28, C = 0.84 and ε = 0.001 were chosen in the training of the SVMs because they produced the best prediction results over the testing cases. As seen in Figure 3b, the SVMs gave more stable solutions and a much smaller MSE of 7.3 than the BP network. This indicates that SVMs significantly outperform BP in training and generalization.
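A minimal sketch of how this setup could be reproduced with a libsvm-based toolkit is given below; it is an assumption-laden illustration, not the authors' actual script, and the helper load_table1_data() is hypothetical. In scikit-learn's RBF kernel exp(−γ‖x − y‖²), γ plays the role of the paper's 1/δ², so the reported settings map to gamma = 6.28, C = 0.84 and epsilon = 0.001; the [0.1, 0.9] scaling and the 25/4 train/test split follow Section 3.1.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

# X: 29 x 8 array of the Table 1 indicators, y: structure PPV (mm/s).
# load_table1_data() is a hypothetical helper standing in for reading Table 1.
X, y = load_table1_data()
X = np.asarray(X, dtype=float)
y = np.asarray(y, dtype=float)

# Scale inputs and target linearly into [0.1, 0.9], as described in Section 3.1.
x_scaler = MinMaxScaler(feature_range=(0.1, 0.9))
y_scaler = MinMaxScaler(feature_range=(0.1, 0.9))
X_s = x_scaler.fit_transform(X)
y_s = y_scaler.fit_transform(y.reshape(-1, 1)).ravel()

# The last four records serve as testing cases, the rest as training cases.
X_train, X_test = X_s[:-4], X_s[-4:]
y_train, y_test = y_s[:-4], y_s[-4:]

# gamma corresponds to 1/delta^2 in the Gaussian kernel of the paper.
model = SVR(kernel="rbf", gamma=6.28, C=0.84, epsilon=0.001)
model.fit(X_train, y_train)

print("test MSE:", mean_squared_error(y_test, model.predict(X_test)))
```

Scaling the target as well as the inputs mirrors the text; predictions can be mapped back to mm/s with y_scaler.inverse_transform if needed.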



Figure 4. Learned and predicted results of the SVMs (measured, learned and predicted PPV versus case number).

Figure 5. Linear regression between the measured and SVM-predicted PPV values for the training and testing cases (y = 1.0013x + 0.0769, R² = 0.9984).

Figure 4 shows the learned and predicted results of the SVMs; one can note that the SVMs attain a satisfactory approximation.

Figure 5 shows the linear regression result between the PPV values for the training and testing cases. It is noted that the intercept of the straight line is almost zero and its slope is close to one. The results show that the prediction is fairly accurate. Therefore, it can be concluded that SVMs provide a promising technique for the analysis of structural response to blast-induced vibration.

5 DISCUSSION

As mentioned in Section 2, the free parameters of SVMs and the kernel parameter play important roles in the performance of SVMs. The regularization constant C controls the trade-off between the empirical risk and the regularization term, while the tube size ε is equivalent to the approximation accuracy placed on the training data points and controls the trade-off between the sparseness of the representation and closeness to the data. Both C and ε are usually data and problem dependent. The kernel parameter δ² implicitly defines the structure of the high-dimensional feature space and thus controls the complexity of the final solution. This section compares the prediction performance with respect to the various free parameters of SVMs and discusses how these parameters affect the solutions of SVMs.

Figure 6. The MSE for various values of 1/δ², with C = 0.84 and ε = 0.001. A small value of 1/δ² under-fits the training data, while a large value of 1/δ² over-fits it.

Figure 6 gives the MSE over the training and testing cases of SVMs with different values of 1/δ², while C and ε are fixed at 0.84 and 0.001 respectively, based on the application results. We can see that the MSE on the testing cases decreases with 1/δ² initially but increases after a certain value of 1/δ². That is, a small value of 1/δ² under-fits the training data, while a large value over-fits it. In this case, the value of 6.28 for 1/δ² is appropriate.

Figure 7 gives the results for various values of C, with 1/δ² and ε set to 6.28 and 0.001 respectively. The figure shows that the MSE on the test cases decreases first, then increases after a certain point, and finally maintains an almost constant value as C increases. This is because increasing the value of C biases the optimization towards the empirical risk, which leads to over-fitting. A too large value of C returns the problem to ERM and stops at the over-fitted result.

Figure 8a gives the results of SVMs at various ε, where 1/δ² and C are fixed at 6.28 and 0.84 respectively. It can be observed that the MSE on both the training set and the test set remains stable at small values of ε but increases at larger ε. This indicates that the performance of SVMs is insensitive to small values of ε (0.0001–0.01), but too large a value of ε (≥ 0.01) causes SVMs to under-fit the training data.
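Such a sensitivity study can be sketched as follows (an illustrative reconstruction reusing the training/testing split from the earlier sketch, not the authors' code; the grid values are examples): one parameter is varied at a time while the other two are held at the values reported above, and the training and testing MSE are recorded.

```python
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

def sweep(param_name, values, fixed, X_train, y_train, X_test, y_test):
    """Train an RBF-kernel SVR for each value of one parameter and
    report training/testing MSE (gamma stands in for 1/delta^2)."""
    results = []
    for v in values:
        params = dict(fixed, **{param_name: v})
        model = SVR(kernel="rbf", **params).fit(X_train, y_train)
        results.append((v,
                        mean_squared_error(y_train, model.predict(X_train)),
                        mean_squared_error(y_test, model.predict(X_test))))
    return results

# Example: vary 1/delta^2 (gamma) with C and epsilon fixed at 0.84 and 0.001.
# grid = sweep("gamma", [0.001, 0.01, 0.1, 1, 6.28, 10, 100, 1000],
#              {"C": 0.84, "epsilon": 0.001},
#              X_train, y_train, X_test, y_test)
```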



A similar result occurs for the number of support vectors, as seen in Figure 8b. This is consistent with the fact that the number of support vectors is found to be a decreasing function of ε.

Figure 7. The MSE for various values of C, with 1/δ² = 6.28 and ε = 0.001. A small value of C under-fits the training data, while a large value of C over-fits it.

Figure 8. The results for various values of ε, with 1/δ² = 6.28 and C = 10. (a) The MSE over training and testing cases; the MSE is not affected much by small ε. (b) The number of support vectors, which decreases dramatically as ε increases.

6 CONCLUSION

This paper applies SVMs to model the structural response to blast-induced vibration. The highly nonlinear relationship between the PPV in the structures and the blast geometry, blast size, location, monitoring stations and other relevant aspects was regressed from the field data. The application results show that SVMs provide a promising alternative to the BP neural network for blast-induced vibration analysis. SVMs give more stable and accurate solutions than the BP network, both in training and in generalization.

The sensitivity of SVMs to their parameters was also discussed, and the results show that the kernel parameter δ² and the regularization constant C play an important role in the training and especially the generalization performance of SVMs. The performance of SVMs is insensitive to a small value of the tube size ε, while a larger ε significantly reduces the number of support vectors and leads to a sparse representation of the solution. However, a larger ε can also reduce the approximation accuracy placed on the training cases. Therefore, the parameters of SVMs should be chosen carefully according to their effect on performance, and an extra optimization procedure would be useful; this will be the subject of further work.

ACKNOWLEDGEMENT

The financial support from the Special Funds for Major State Basic Research Project under Grant no. 2002CB4127008, the Teaching and Research Award Program for Outstanding Young Teachers in Higher Education Institutions of MOE, and the Research Fund of the Key Laboratory of Rock and Soil Mechanics in the Institute of Rock and Soil Mechanics of CAS under Grant no. Z110407 is gratefully acknowledged.

REFERENCES

Nicholls, H.R., Johnson, C.F. & Duvall, W.I. 1971. Blasting vibrations and their effects on structures. Washington DC: Bureau of Mines, Bulletin 656.
Yang, R.L., Rocque, P., Katsabanis, P., et al. 1994. Measurement and analysis of near-field blast vibration and damage. Geotechnical & Geological Engineering 12(3): 169–182.
Richards, A.B., Evans, R. & Moore, A.J. 1994. Blast vibration control and assessment techniques. In Managing risk, Proc. 4th open pit conference: 209–215. Perth, 1994 (AIMM).
Wingh, P.K., Voge, W., Singh, R.B., et al. 1996. Blasting side effects – investigation in an opencast coal mine in India. International Journal of Surface Mining, Reclamation and Environment 10(4): 155–159.
Pijush, P.R. 1998. Characteristics of ground vibrations and structural response to surface and underground blasting. Geotechnical and Geological Engineering 16: 151–166.



Tienfuan, K. & David, C. 2002. Neural networks approach and microtremor measurements in estimating peak ground acceleration due to strong motion. Advances in Engineering Software 33: 733–742.
Kuźniar, K. & Waszczyszyn, Z. 2003. Neural simulation of dynamic response of prefabricated buildings subjected to paraseismic excitations. Computers and Structures 81: 2353–2360.
Wong, B.K., Bodnovich, T.A.E. & Selvi, Y. 1995. A bibliography of neural network applications research: 1988–1994. Expert Systems 12: 253–261.
Curry, B. & Morgan, P. 1997. Neural networks: a need for caution. OMEGA, International Journal of Management Sciences 25: 123–133.
Jasmina, A. & Ramazan, G. 2001. Using genetic algorithms to select architecture of a feedforward artificial neural network. Physica A 289: 574–594.
Paulo, C., Miguel, R. & Jose, N. 2002. A lamarckian approach for neural network training. Neural Processing Letters 15: 105–116.
Vapnik, V.N. 1995. The Nature of Statistical Learning Theory. New York: Springer.
Burges, C.J.C. 1998. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery 2: 121–167.
Smola, A.J. & Schölkopf, B. 1998. A tutorial on support vector regression. NeuroCOLT Technical Report NC-TR-98-030. London: Royal Holloway College, University of London.
Schölkopf, B. & Smola, A.J. 2002. Learning with Kernels. Cambridge: MIT Press.
Cristianini, N. & Shawe-Taylor, J. 2000. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge: Cambridge Univ. Press.
Campbell, C. 2002. Kernel methods: a survey of current techniques. Neurocomputing 48: 63–84.
Suykens, J.A.K., Gestel, T., Brabanter, J., et al. 2002. Least Squares Support Vector Machines. Singapore: World Scientific.
Chang, C.C. & Lin, C.J. 2001. LIBSVM: a library for support vector machines. Technical Report, Department of Computer Science and Information Engineering, National Taiwan University. Available at http://www.csie.edu.tw/∼cjlin/papers/libsvm.pdf.
