
J Seismol (2012) 16:489–511

DOI 10.1007/s10950-012-9291-x

ORIGINAL ARTICLE

Prediction of modified Mercalli intensity from PGA, PGV,
moment magnitude, and epicentral distance using several
nonlinear statistical algorithms

Diego A. Alvarez · Jorge E. Hurtado · Daniel Alveiro Bedoya-Ruíz

Received: 8 March 2011 / Accepted: 24 February 2012 / Published online: 21 March 2012
© Springer Science+Business Media B.V. 2012

Abstract Despite technological advances in seismic instrumentation, the assessment of the intensity of an earthquake using an observational scale
as given, for example, by the modified Mercalli
intensity scale is highly useful for practical purposes. In order to link the qualitative numbers extracted from the acceleration record of an earthquake and other instrumental data such as peak
ground velocity, epicentral distance, and moment
magnitude on the one hand and the modified
Mercalli intensity scale on the other, simple statistical regression has been generally employed.
In this paper, we will employ three methods
of nonlinear regression, namely support vector
regression, multilayer perceptrons, and genetic
programming in order to find a functional dependence between the instrumental records and the
modified Mercalli intensity scale. The proposed
methods predict the intensity of an earthquake while dealing with nonlinearity and the noise
inherent to the data. The nonlinear regressions with good estimation results have been performed
using the "Did You Feel It?" database of the US Geological Survey and the database of the
Center for Engineering Strong Motion Data for the California region.

Keywords Modified Mercalli scale · Seismic intensity · Multilayer perceptron · Genetic
programming · Support vector regression · Model identification · Ground motion · California

D. A. Alvarez (✉) · J. E. Hurtado · D. A. Bedoya-Ruíz
Universidad Nacional de Colombia,
Apartado 127, Manizales, Colombia
e-mail: daalvarez@unal.edu.co
J. E. Hurtado
e-mail: jehurtadog@unal.edu.co
D. A. Bedoya-Ruíz
e-mail: dabedoyar@unal.edu.co

1 Introduction
The macroseismic intensity is an essential parameter of earthquake ground motion that allows
a simple and understandable description of earthquake damage on the basis of observed effects at a
given place. It is measured, for example, using the
European Macroseismic Scale, the China Seismic
Intensity Scale, the Mercalli-Cancani-Sieberg scale,
the Modified Mercalli Intensity (MMI) scale, or
the Japan Meteorological Agency (JMA) seismic
intensity scale (see e.g., Grünthal 2011).
The intensity scales quantify the effects of a strong motion on the Earth's surface, humans,
objects of nature, and man-made structures at a given location based on detailed descriptions of
indoor and outdoor effects that occur during the shaking. For example, the modified Mercalli
intensity (MMI) scale (Wood and Neumann 1931; Richter 1958) measures the effects of an
earthquake using 12 degrees, with 1 denoting "not felt" and 12 "total destruction". Note,
however, that the US Geological Survey (USGS) no longer assigns intensities higher than 10 and
has not assigned 10 in many decades (see e.g., Stover and Coffman 1993; Dengler and Dewey 1998);
in addition, the current practice of using the MMI scale follows neither the wording of the
version by Wood and Neumann (1931) nor that of Richter (1958) (Grünthal 2011). The values will
differ based on the distance to the earthquake, with the highest intensities usually occurring
around the epicentral area, and based on subjective data gathered from individuals who felt the
quake at that location. Note also that all modern scales have 12 degrees, with the exception of
the JMA scale, which was upgraded from 7 to 10 degrees in 1996 (see e.g., Musson et al. 2010).
The assessment of the intensity of a quake at a given location used to be a slow process, as it
was usually performed by means of personalized surveys; however, with the advent of macroseismic
intensity internet surveys (a list of some of these services can be found in Table 1), this is
no longer the case. For example, when strong shaking occurs in the California region, the USGS
receives thousands of responses from people after the event (there can be in excess of 10,000
responses in less than half an hour). In any case, it is convenient to have an automated
intensity measure based on instrumental data, so that intensity distribution maps can be drawn
almost immediately after the shaking. With that information, emergency services could
prioritize their attention to those places where the largest damage is expected. In addition,
the automatic shut-off of natural gas supplies after an earthquake in certain localities could
be planned.
In this paper, three methods are analyzed, namely support vector regression, multilayer
perceptrons, and genetic programming. The accuracy of the algorithms has been tested using the
"Did You Feel It?" (DYFI) database of the USGS and the database of the Center for Engineering
Strong Motion Data (CESMD) for the California region.
The plan of the paper is as follows: we begin in Section 2 with a succinct review of previous
studies on the topic, and then in Section 3 we explain the mathematical background of the
nonlinear regression methods employed. With the database described in Section 4, we introduce
and perform some numerical experiments in Section 5; the analysis of results follows in
Section 6. The paper concludes in Section 7.

Table 1 Some web-based macroseismic intensity questionnaires

Institution or agency                               URL
Amateur Seismic Centre (India)                      http://asc-india.org/menu/felt-india.htm
British Geological Survey                           http://www.earthquakes.bgs.ac.uk/questionnaire/EqQuestIntro.html
Central Institute for Meteorology and Geodynamics   http://www.zamg.ac.at/erdbeben/bebenbericht/index.php
European-Mediterranean Seismological Center         http://www.emsc-csem.org/Earthquake/Contribute/testimonies.php
GeoNet (New Zealand)                                http://www.geonet.org.nz/earthquake/
Istituto Nazionale di Geofisica e Vulcanologia      http://www.haisentitoilterremoto.it/
Le Bureau Central Sismologique Français             http://www.seisme.prd.fr/english.php
Natural Resources Canada                            http://earthquakescanada.nrcan.gc.ca/dyfi/
Royal Observatory of Belgium                        http://seismologie.oma.be/index.php?LANG=EN&CNT=BE&LEVEL=0
Servicio Geológico Colombiano                       http://seisan.ingeominas.gov.co/RSNC/index.php
Swiss Seismological Service                         http://www.seismo.ethz.ch/eq/detected/eq_form/index_EN
U.S. Geological Survey (USGS)                       http://earthquake.usgs.gov/earthquakes/dyfzi/


2 Previous studies

Several attempts have been made to relate the intensity scale with earthquake parameters such
as peak ground acceleration (PGA), peak ground velocity (PGV), peak ground displacement (PGD),
the magnitude scale, and the epicentral distance, among others. For instance, Trifunac and
Brady (1975) and Murphy and O'Brien (1977) tried to correlate intensity with PGA, PGV, and PGD.
In particular, for the California region, we can list the works of Wald et al. (1999b),
Atkinson and Sonley (2000), and Atkinson and Kaka (2007). For example, Wald et al. (1999b)
suggested the following piecewise linear relationships between PGA, PGV, and MMI:

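\[
\mathrm{MMI} =
\begin{cases}
1.00 + 2.20 \log_{10}(\mathrm{PGA}) & \text{for } \log_{10}(\mathrm{PGA}) \le 1.82 \\
-1.66 + 3.66 \log_{10}(\mathrm{PGA}) & \text{for } \log_{10}(\mathrm{PGA}) > 1.82
\end{cases}
\tag{1}
\]

and

\[
\mathrm{MMI} =
\begin{cases}
3.40 + 2.10 \log_{10}(\mathrm{PGV}) & \text{for } \log_{10}(\mathrm{PGV}) \le 0.76 \\
2.35 + 3.47 \log_{10}(\mathrm{PGV}) & \text{for } \log_{10}(\mathrm{PGV}) > 0.76
\end{cases}
\tag{2}
\]

while Atkinson and Kaka (2007) proposed:

\[
\mathrm{MMI} =
\begin{cases}
2.65 + 1.39 \log_{10}(\mathrm{PGA}) & \text{for } \log_{10}(\mathrm{PGA}) \le 1.69 \\
-1.91 + 4.09 \log_{10}(\mathrm{PGA}) & \text{for } \log_{10}(\mathrm{PGA}) > 1.69
\end{cases}
\tag{3}
\]

and

\[
\mathrm{MMI} =
\begin{cases}
4.37 + 1.32 \log_{10}(\mathrm{PGV}) & \text{for } \log_{10}(\mathrm{PGV}) \le 0.48 \\
3.54 + 3.03 \log_{10}(\mathrm{PGV}) & \text{for } \log_{10}(\mathrm{PGV}) > 0.48
\end{cases}
\tag{4}
\]

Recently, Worden et al. (2012) suggested:

\[
\mathrm{MMI} =
\begin{cases}
1.78 + 1.55 \log_{10}(\mathrm{PGA}) & \text{for } \log_{10}(\mathrm{PGA}) \le 1.57 \\
-1.60 + 3.70 \log_{10}(\mathrm{PGA}) & \text{for } \log_{10}(\mathrm{PGA}) > 1.57
\end{cases}
\tag{5}
\]

and

\[
\mathrm{MMI} =
\begin{cases}
3.78 + 1.47 \log_{10}(\mathrm{PGV}) & \text{for } \log_{10}(\mathrm{PGV}) \le 0.53 \\
2.89 + 3.16 \log_{10}(\mathrm{PGV}) & \text{for } \log_{10}(\mathrm{PGV}) > 0.53
\end{cases}
\tag{6}
\]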

In all the above equations, the PGA is expressed in centimeters per second squared (cm/s^2),
while the PGV is given in centimeters per second (cm/s).
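The piecewise relationships (1) to (6) share one functional form, so they can be evaluated with a single small routine. The following Python sketch is only illustrative: the dictionary keys and the function name `mmi_from_peak` are hypothetical, and the negative intercepts of the upper PGA branches are an assumption, chosen because they make the two branches of each relationship meet at the breakpoint.

```python
import math

# Coefficients (a_low, b_low, a_high, b_high, breakpoint) of the piecewise form
#   MMI = a_low  + b_low  * log10(x)   if log10(x) <= breakpoint
#   MMI = a_high + b_high * log10(x)   otherwise
# PGA in cm/s^2, PGV in cm/s. The negative upper-branch intercepts for PGA are
# an assumption of this sketch (they make each relation continuous).
RELATIONS = {
    ("wald1999", "pga"): (1.00, 2.20, -1.66, 3.66, 1.82),
    ("wald1999", "pgv"): (3.40, 2.10, 2.35, 3.47, 0.76),
    ("atkinson2007", "pga"): (2.65, 1.39, -1.91, 4.09, 1.69),
    ("atkinson2007", "pgv"): (4.37, 1.32, 3.54, 3.03, 0.48),
    ("worden2012", "pga"): (1.78, 1.55, -1.60, 3.70, 1.57),
    ("worden2012", "pgv"): (3.78, 1.47, 2.89, 3.16, 0.53),
}


def mmi_from_peak(value, study, kind):
    """Estimate MMI from a peak ground motion value with one of Eqs. 1-6."""
    if value <= 0.0:
        raise ValueError("peak ground motion must be positive")
    a_low, b_low, a_high, b_high, breakpoint = RELATIONS[(study, kind)]
    log_value = math.log10(value)
    if log_value <= breakpoint:
        return a_low + b_low * log_value
    return a_high + b_high * log_value
```

For instance, `mmi_from_peak(100.0, "wald1999", "pga")` evaluates the upper branch of Eq. 1 for a PGA of 100 cm/s^2.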
Karim and Yamazaki (2002) performed similar research relating PGA, PGV, PGD, and the so-called
spectrum intensity to the JMA seismic intensity scale. Other authors, such as Atkinson and
Sonley (2000), Sokolov (2002), and Kaka and Atkinson (2004), have developed empirical
relationships between response spectra or Fourier acceleration spectra and modified Mercalli
intensity. Shabestari and Yamazaki (2001) developed some expressions that relate the JMA
intensity scale and the MMI scale. Tselentis and Danciu (2008) derived empirical regression
equations for MMI and for various ground motion parameters such as duration, cumulative
absolute velocity, Housner's spectrum intensity, and total elastic input energy index.
Note that most of the aforementioned relationships were obtained using univariate and
multivariate linear regression analysis and show large scatter. It is the authors' belief that
those studies would have benefited from robust regression methods instead of standard linear
regression in order to account for possible outliers, that is, observations that do not follow
the pattern of the other observations and that usually occur when large measurement errors are
present. Our preference for robust regression methods is based on the fact that, in particular,
least squares estimates for regression models are highly biased by the outliers that dominate
the recordings from a few earthquakes, since the regression might be dragged towards them.
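The effect of a single outlier on a least squares fit can be illustrated with a small numerical experiment. The sketch below compares ordinary least squares with the Theil-Sen estimator (median of pairwise slopes); Theil-Sen is used here only as one convenient example of a robust method and is not the method of the cited studies. The data are synthetic and chosen for illustration.

```python
from itertools import combinations
from statistics import mean, median


def ols_fit(xs, ys):
    """Ordinary least squares line: returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def theil_sen_fit(xs, ys):
    """Theil-Sen line: median of all pairwise slopes, then median intercept."""
    slopes = [
        (ys[j] - ys[i]) / (xs[j] - xs[i])
        for i, j in combinations(range(len(xs)), 2)
        if xs[j] != xs[i]
    ]
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept


# Synthetic data on the line y = 2x + 1, with one gross outlier at x = 5.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0, 40.0]

ols_slope, ols_intercept = ols_fit(xs, ys)
robust_slope, robust_intercept = theil_sen_fit(xs, ys)
```

In this example the least squares slope is pulled well above 2 by the single outlier, whereas the median-based estimate stays on the underlying line.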
It seems, however, that the linear model chosen
in the previously mentioned papers comes just
from mathematical convenience. The relationship
that links the MMI scale and the instrumental data
is complicated enough that nonlinear methods
seem to be the best choice for a model. This is
the reason that motivated us to propose a new
approach in Section 5.


Nonlinear regression methods have already been used. For example, artificial neural networks
have been employed by Tung et al. (1993), Davenport (2004), and Tselentis and Vladutu (2010)
in order to relate MMI to PGA, PGV, PGD, response spectral acceleration and response spectral
velocity for selected frequencies, Arias intensity, and JMA intensity. All of them have reported
satisfactory results. In the present paper, we will compare this approach to other nonlinear
regression methods.
Finally, it is worth mentioning that Faenza and Michelini (2010), Kuehn and Scherbaum (2010),
and Worden et al. (2012) recently proposed methods that model the relationship between
intensity and instrumental information without using an ordinary least squares regression
methodology.
Commonly employed strong motion features
There are several quantities that are either instrumental or obtained after processing the
acceleration records of the shaking and which play a preponderant role in the estimation of the
seismic intensity. Some of the most popular, which were used in the investigations listed above,
are the following:

• PGA, PGV, and PGD, defined as the geometric mean of the two horizontal components in the
  directions originally recorded. The first parameter is especially important because, according
  to Newton's second law, the PGA multiplied by the mass provides the maximum inertial force
  that affects the structure during the shaking;
• duration of the shaking;
• spectral content (characteristic periods) at some specific frequencies; according to Sokolov
  (2002), the frequency range of the amplitude spectrum |X(f)| of the acceleration record that
  is of interest from an engineering point of view lies between 0.3 and 14 Hz. In fact, the
  frequency band 0.78–2.0 Hz is representative for MMI greater than 8, while the 3.0–6.0 Hz
  range best represents MMI from 5 to 7 and the 7.0–8.0 Hz range correlates best with the
  lowest MMI;
• magnitude (moment magnitude, local magnitude, surface wave magnitude, or body wave magnitude);
• epicentral distance;
• amplitudes of acceleration, velocity, or displacement response spectra;
• regional propagation path (geological conditions) and local soil conditions, among others.

Some other features that can be extracted from numerical processing of the acceleration record
are as follows: the spectrum intensity, the Arias intensity, the cumulative absolute velocity,
and the Japan Meteorological Agency seismic intensity reference acceleration value, among
others. It is the authors' opinion that more informative features can be extracted from the
acceleration records by using the different techniques that the field of digital signal
processing offers, even though there is no clear physical interpretation of those numbers. Some
of these features might be extracted by means of the wavelet and short-time Fourier transforms
(see e.g., Mallat 2008) and the time-varying ARMA process (see e.g., Poulimenos and Fassois
2006), among others. This will be the topic of a future article.

3 Some mathematical background on nonlinear regression
It was said in Section 2 that the relationship that links the MMI scale and the instrumental
data is complicated enough that nonlinear methods seem to be the best choice for a model. In
the following, we briefly introduce three methods of nonlinear regression, namely support
vector regression, multilayer perceptrons, and genetic programming, that will help us in
Section 5 to find a functional dependence between the instrumental records and the modified
Mercalli intensity scale.

3.1 Support vector regression


In the following, we give a succinct overview of the theory behind support vector regression
(SVR); for a more thorough coverage of the algorithm, the reader is referred to the excellent
tutorial by Smola and Schölkopf (2004).
Suppose we are given a training dataset $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$,
where $x_i \in \mathbb{R}^d$ is the i-th input vector and $y_i \in \mathbb{R}$ is its
corresponding output; the idea of SVR is to fit the data to a function of the form:

\[
f(x; \alpha, b) = \sum_{i=1}^{n} \alpha_i K(x, x_i) + b,
\tag{7}
\]

where K is a so-called kernel function. There are several options for a kernel function, and
here, we considered the popular radial basis kernel function, which is given by
$K(x, y) = \exp\left(-\gamma \|x - y\|^2\right)$, where $\gamma > 0$ is a scalar parameter that
defines the kernel function. The weights $\alpha_i$, the bias b, and the parameter $\gamma$ are
chosen so that the deviation from each target $y_i$ is at most $\varepsilon$ for all training
data and, at the same time, f is as flat as possible. In this way, the data will lie around an
$\varepsilon$-insensitive tube as seen in Fig. 1. Here, $\varepsilon > 0$ is the width of an
insensitive zone that controls the amount of noise that the algorithm can handle. For instance,
when a training sample has no noise, one may set $\varepsilon = 0$.
The flatness of the function f can be ensured by minimizing the fitting error using an
$\varepsilon$-insensitive loss function, which gives zero error if the absolute difference
between the prediction f(x) and the target y is less than the constant $\varepsilon$. This
loss function is given by the following:

\[
E(x; \alpha, b) = \frac{1}{n} \sum_{i=1}^{n} E_\varepsilon\left(f(x_i; \alpha, b) - y_i\right) + \Omega(\alpha)
\tag{8}
\]

where $\Omega$ is a regularization term and

\[
E_\varepsilon(f(x) - y) =
\begin{cases}
0 & \text{if } |f(x) - y| < \varepsilon \\
|f(x) - y| - \varepsilon & \text{otherwise}
\end{cases}
\tag{9}
\]

as illustrated in Fig. 2. The choice of this error function makes the SVR rather different from
traditional error minimization problems and very robust to outliers.
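The three ingredients introduced so far (the radial basis kernel, the kernel expansion of Eq. 7, and the insensitive loss of Eq. 9) are short enough to write out directly. The following Python sketch is illustrative only; the function and argument names (`rbf_kernel`, `svr_predict`, `eps_insensitive`, `gamma`, `eps`) are hypothetical labels for the kernel parameter and the tube width, and the weights passed to `svr_predict` are assumed to have been obtained elsewhere.

```python
import math


def rbf_kernel(x, y, gamma):
    """Radial basis kernel exp(-gamma * ||x - y||^2) for two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def svr_predict(x, train_x, weights, bias, gamma):
    """Kernel expansion of Eq. 7: weighted sum of kernels plus a bias."""
    return sum(w * rbf_kernel(x, xi, gamma) for w, xi in zip(weights, train_x)) + bias


def eps_insensitive(residual, eps):
    """Loss of Eq. 9: zero inside the tube, linear in |residual| outside it."""
    return max(0.0, abs(residual) - eps)
```

A residual of 0.05 with a tube width of 0.1 costs nothing, while a residual of -0.5 costs 0.4; this linear (rather than quadratic) growth is what limits the influence of outliers.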
As stated before, the SVR algorithm tries to position the $\varepsilon$-insensitive tube around
the data as shown in Fig. 1. This is achieved by minimizing the error (Eq. 8). In order to
allow points to lie outside the $\varepsilon$-insensitive tube, let us define the so-called
slack variables $\xi_i$, $\hat{\xi}_i$ that represent the distance from the actual values
$y_i$ to the corresponding boundary values of the $\varepsilon$-insensitive tube. For each
point $x_i$, we need two slack variables $\xi_i \ge 0$ and $\hat{\xi}_i \ge 0$. If $x_i$ lies
above the $\varepsilon$-tube, then $\xi_i > 0$ and $\hat{\xi}_i = 0$; if $x_i$ lies below the
$\varepsilon$-tube, then $\xi_i = 0$ and $\hat{\xi}_i > 0$. Finally, if $x_i$ lies inside the
$\varepsilon$-tube, then $\xi_i = \hat{\xi}_i = 0$. Introducing the slack variables allows
points to lie outside the tube, provided the slack variables are nonzero, and in this way:

\[
y_i \le f(x_i; \alpha, b) + \varepsilon + \xi_i
\tag{10}
\]

\[
y_i \ge f(x_i; \alpha, b) - \varepsilon - \hat{\xi}_i
\tag{11}
\]

Fig. 1 An SVR, showing the regression curve together with the $\varepsilon$-insensitive tube.
Examples of slack variables $\xi$ and $\hat{\xi}$ are also shown

Fig. 2 Plot of an $\varepsilon$-insensitive error function (continuous line), in which the
error increases linearly with distance beyond the insensitive region. Also shown for comparison
is the quadratic error function (dashed line)

By substituting Eq. 9 into Eq. 8, the minimization of the error (Eq. 8) becomes (see e.g.,
Smola and Schölkopf 2004):

\[
\min_{\alpha, b} \; C \sum_{i=1}^{n} \left(\xi_i + \hat{\xi}_i\right) + \Omega(\alpha)
\tag{12}
\]

subject to $C > 0$, $\xi_i \ge 0$, $\hat{\xi}_i \ge 0$ and the constraints (10) and (11).
Here, C is another parameter that is used to control noise and that determines the trade-off
between the flatness of f and the amount up to which deviations larger than $\varepsilon$ are
tolerated. The optimization problem (12) can be expressed using its dual Lagrangian formulation
as the quadratic optimization problem (see e.g., Smola and Schölkopf 2004):

\[
\max_{\lambda, \hat{\lambda}} \;
-\frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j K(x_i, x_j)
+ \sum_{i=1}^{n} \alpha_i y_i
- \varepsilon \sum_{i=1}^{n} \left(\lambda_i + \hat{\lambda}_i\right)
\tag{13}
\]

with $\alpha_i = \lambda_i - \hat{\lambda}_i$ and subject to the restrictions
$0 \le \lambda_i \le C$, $0 \le \hat{\lambda}_i \le C$ for $i = 1, 2, \ldots, n$, and
$\sum_{i=1}^{n} \alpha_i = 0$. Here, the $\lambda_i$ and $\hat{\lambda}_i$ are Lagrange
multipliers that appeared when including the constraints mentioned above into the Lagrangian
formulation.
Usually, $\varepsilon$ is set a priori, and the parameters C and $\gamma$ are found by means of
an optimization that produces the minimum error in Eq. 8 after performing statistical
cross-validation. Cross-validation is a technique for assessing how the results of a
statistical analysis will generalize to an independent dataset. One round of cross-validation
involves partitioning the dataset into complementary subsets, performing the analysis on one
subset and validating the analysis on the other subset. To reduce variability, multiple rounds
of cross-validation are performed using different partitions, and the validation results are
averaged over the rounds.
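It can be shown that Eq. 13 is a convex quadratic optimization problem, and in consequence, it
has a unique solution. Those training points $x_i$ whose coefficients $\alpha_i$ are different
from zero are the so-called support vectors. The support vector regression has the property
that the solution is sparse, that is, most of the $\alpha_i$ are zero.
The parameter b is found from a data point $x_k$ for which $0 < \lambda_k < C$ or
$0 < \hat{\lambda}_k < C$. In the first case, constraint (10) is active with $\xi_k = 0$, so
that b follows from the equation:

\[
b = y_k - \varepsilon - \sum_{i=1}^{n} \alpha_i K(x_k, x_i)
\tag{14}
\]

while in the second case, constraint (11) is active and the sign of $\varepsilon$ is reversed.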

In practice, it is better to average over all such estimates of b. In short, the steps to train
the SVR are as follows:
1. Choose $\varepsilon > 0$, $\gamma > 0$, and $C > 0$.
2. Estimate the generalization error for $\varepsilon$, $\gamma$, and C using only the training
   set and the leave-one-out methodology. On each step of the leave-one-out, one has to solve
   the optimization problem (13) in order to find the $\alpha_i$ and then find b from Eq. 14.
   The generalization error is estimated as the average error (Eq. 8) obtained with the
   training set during all stages of the leave-one-out.
3. Choose new values of $\varepsilon > 0$, $\gamma > 0$, and $C > 0$ and repeat step 2 until a
   low generalization error is found.
Once appropriate values of $\varepsilon$, $\gamma$, and C are chosen, the predicted value of
the regression is found when x is used as an input to Eq. 7, which uses the weights $\alpha$
and the bias b obtained after performing the previous optimizations. All these steps can be
easily performed using the LIBSVM software (Chang and Lin 2011).
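The parameter selection loop in steps 1 to 3 can be sketched independently of the regression model that is being tuned. The Python code below is a minimal illustration under loud assumptions: to stay self-contained it does not solve the quadratic problem (13); instead, `fit_smoother` is a stand-in model (a Nadaraya-Watson kernel smoother with a radial basis kernel) that has a single parameter `gamma`. All function names are hypothetical, and the leave-one-out error is measured here as the mean absolute error on the held-out points.

```python
import math
from itertools import product


def rbf(x, y, gamma):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def fit_smoother(xs, ys, gamma):
    """Stand-in for SVR training (NOT an SVR): returns a kernel-smoother predictor."""
    def predict(x):
        weights = [rbf(x, xi, gamma) for xi in xs]
        total = sum(weights)
        if total == 0.0:  # all kernels underflowed: fall back to the mean
            return sum(ys) / len(ys)
        return sum(w * y for w, y in zip(weights, ys)) / total
    return predict


def loo_error(xs, ys, fit, **params):
    """Leave-one-out estimate: train on n-1 points, test on the held-out one."""
    errors = []
    for k in range(len(xs)):
        model = fit(xs[:k] + xs[k + 1:], ys[:k] + ys[k + 1:], **params)
        errors.append(abs(model(xs[k]) - ys[k]))
    return sum(errors) / len(errors)


def grid_search(xs, ys, fit, grid):
    """Try every parameter combination and keep the lowest leave-one-out error."""
    names = list(grid)
    best = None
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        error = loo_error(xs, ys, fit, **params)
        if best is None or error < best[0]:
            best = (error, params)
    return best
```

With an actual SVR trainer in place of `fit_smoother`, the grid would contain the three parameters of step 1, for example `{"eps": [...], "gamma": [...], "C": [...]}`.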


3.2 Multilayer perceptrons


An artificial neural network is a mathematical
model that is inspired by the structure and/or
functional aspects of biological neural networks.
Modern neural networks are nonlinear statistical
data modeling tools. They are usually used to
model complex relationships between inputs and
outputs or to find patterns in the data.
The most popular neural network is the so-called multilayer perceptron (MLP). An MLP (see
Fig. 3) consists of interconnected layers (an input, hidden, and output layer) of processing
units or neurons. For example, the network shown in Fig. 3 has d units in the input layer, p
neurons in the hidden layer, and a single neuron in the output layer. As seen in Fig. 3, the
flow of information goes from left to right and is altered by means of some parameters (the
so-called weights $w^0_{ji}$, $w^1_{kj}$ and biases $b^0_j$ and $b^1_k$) and the functions g
and h. These functions (called activation functions in neural network terminology) are usually
set to g(a) = tanh(a) and h(a) = a. An MLP is a parameterized, adaptable vector function which
may be trained to perform classification or regression tasks. Given a training dataset
$D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, where $x_r \in \mathbb{R}^d$ is the r-th
input vector and $y_r \in \mathbb{R}$ is its corresponding output, the idea of an MLP with a
single output unit is to estimate a function $f: \mathbb{R}^d \to \mathbb{R}$ of the form:

\[
f\left(x; w^0, w^1, b^0, b^1\right)
= h\left( \sum_{j=1}^{p} w^1_{1j} \, g\left( \sum_{i=1}^{d} w^0_{ji} x_i + b^0_j \right) + b^1_1 \right)
\tag{15}
\]

Fig. 3 Topology of a multilayer perceptron with a single output unit. This network has d
inputs, p neurons in the hidden layer, and a single output. $w^0_{ji}$ represents the weights
between the j-th neuron of the hidden layer and the i-th input, while $w^1_{1j}$ stands for the
weights between the output neuron and the j-th neuron of the hidden layer. $b^0_j$ represents
the bias weight of the j-th neuron of the hidden layer and $b^1_1$ symbolizes the bias weight
of the output layer

The number of neurons in the hidden layer must be chosen so that the fitting of the network to
the data is adequate: if too few neurons are used, the model will be unable to represent
complex data, and the resulting fit will be poor; if too many neurons are used, the network may
overfit the data, that is, the network fits the training data extremely well but generalizes
poorly to new, unseen data. Therefore, a validation set must be used to find the appropriate
number of neurons in the hidden layer.
Note that the parameters $w^0_{ji}$, $w^1_{1j}$, $b^0_j$, and $b^1_1$ were grouped as the
matrices $w^0$, $w^1$, $b^0$, and $b^1$, respectively. They are found by minimizing the mean
squared error function (see dashed line of Fig. 2):

\[
E\left(x; w^0, w^1, b^0, b^1\right)
= \frac{1}{n} \sum_{r=1}^{n} \left( y_r - f\left(x_r; w^0, w^1, b^0, b^1\right) \right)^2
\tag{16}
\]

with respect to the weights and biases. For more details of neural network training, see for
example Haykin (1998) and Bishop (2007).
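The forward pass of Eq. 15 and the error of Eq. 16 translate almost literally into code. The sketch below assumes g = tanh and h equal to the identity, as stated in the text; the argument layout (hidden weights as a list of p rows of d values, output weights as a list of p values) and the function names are choices made for this illustration only. Training, that is, the minimization of Eq. 16, is not shown.

```python
import math


def mlp_forward(x, w0, b0, w1, b1):
    """Single-output MLP of Eq. 15 with g = tanh and h = identity.

    x  : list of d inputs
    w0 : p rows, each with d hidden-layer weights
    b0 : p hidden-layer biases
    w1 : p output-layer weights
    b1 : scalar output bias
    """
    hidden = [
        math.tanh(sum(w_ji * x_i for w_ji, x_i in zip(row, x)) + b_j)
        for row, b_j in zip(w0, b0)
    ]
    return sum(w_j * h_j for w_j, h_j in zip(w1, hidden)) + b1


def mean_squared_error(xs, ys, w0, b0, w1, b1):
    """Mean squared error of Eq. 16 over the training pairs (xs[r], ys[r])."""
    residuals = [y - mlp_forward(x, w0, b0, w1, b1) for x, y in zip(xs, ys)]
    return sum(r * r for r in residuals) / len(residuals)
```

For example, with all hidden weights and biases set to zero the hidden activations vanish and the network output reduces to the output bias.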
Several authors have shown that under some
assumptions, MLPs are universal approximators;
that is, if the number of hidden nodes p is allowed
to increase towards infinity, they can approximate
any continuous function with arbitrary precision
(see e.g., Hornik et al. 1989).
However, there are a number of problems with MLPs: (a) There is no theoretically sound way of
choosing the network topology. (b) For a given architecture, learning algorithms often end up
in a local minimum of E instead of a global minimum. (c) They are "black box" solutions to the
problem. In any case, these drawbacks do not imply that the fits performed in this paper are
unreliable. They simply indicate that other nonlinear models like SVR have better theoretical
properties for regression than MLPs.
3.3 Genetic programming
Genetic programming (GP) is a problem-solving approach inspired by biological evolution in
which computer programs (mathematical formulas, logical expressions, etc.) are evolved in order
to find solutions that perform a user-defined task. The solution method is based on the
Darwinian principle of "survival of the fittest" and is closely related to the field of genetic
algorithms (GA). There are three main differences between GA and GP: (a) Structure: GP usually
evolves tree structures, while GAs evolve binary or real number strings. (b) Programs vs.
binary strings: GP usually evolves computer programs, while GAs typically operate on coded
binary strings. (c) Variable vs. fixed length: In traditional GAs, the length of the binary
string is fixed before the solution procedure begins. However, a GP tree can vary in length
throughout the execution. The theory behind genetic programming is extensive. Here, just a
brief review of its main concepts will be given. The interested reader is referred to Koza
(1992) for an ample discussion on the topic.
Genetic programming uses the following steps to solve problems:
1. Generate an initial population of computer programs.
2. Iteratively perform the following sub-steps on the population until the termination
   criterion is satisfied:
   a. Execute each program in the population and assign it a fitness value according to how
      well it solves the problem.
   b. Create a new population by executing the following evolutionary operators with certain
      probability:
      • Reproduction: it selects an individual from within the current population so that it
        can have an offspring. There are several forms of choosing which individual deserves to
        breed, including fitness proportionate selection, rank selection, and tournament
        selection.
      • Crossover: mimics sexual combination in nature, where two parents are chosen and parts
        of their trees are swapped in a form that each crossover operation should result in a
        legal structure.
      • Mutation: it causes random changes in an individual before it is introduced into the
        subsequent population. During mutation, it may happen that all functions and terminals
        are removed beneath an arbitrarily determined node and a new branch is randomly
        created, or a single node is swapped for another.
3. The best computer program that appears in any generation is designated as the result of
   genetic programming.
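The steps above can be sketched in a few dozen lines for the case where the evolved programs are arithmetic expressions of one variable. The code below is a deliberately simplified illustration and not the configuration used in this paper: the function set, terminal set, operator probabilities, depth limit, and all names are assumptions; crossover is reduced to grafting a subtree of a second parent onto the first, and selection is by tournament.

```python
import random

FUNCTIONS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}
TERMINALS = ["x", 1.0, 2.0]
MAX_DEPTH = 6  # offspring deeper than this are rejected to limit tree growth


def random_tree(rng, depth):
    """Grow a random expression: a terminal or a tuple (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(sorted(FUNCTIONS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def depth(tree):
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(tree[1]), depth(tree[2]))


def evaluate(tree, x):
    if isinstance(tree, tuple):
        op, left, right = tree
        return FUNCTIONS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree


def fitness(tree, data):
    """Mean squared error over (x, y) pairs; lower is better."""
    total = 0.0
    for x, y in data:
        err = evaluate(tree, x) - y
        total += err * err
    return total / len(data)


def random_subtree(rng, tree):
    while isinstance(tree, tuple) and rng.random() < 0.7:
        tree = rng.choice(tree[1:])
    return tree


def replace_random_node(rng, tree, new_subtree):
    """Return a copy of tree in which one randomly chosen node is replaced."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return new_subtree
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, replace_random_node(rng, left, new_subtree), right)
    return (op, left, replace_random_node(rng, right, new_subtree))


def tournament(rng, population, data, k=3):
    return min(rng.sample(population, k), key=lambda t: fitness(t, data))


def evolve(data, generations=20, size=40, seed=0):
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(size)]      # step 1
    best = min(population, key=lambda t: fitness(t, data))
    for _ in range(generations):                                 # step 2
        offspring = []
        while len(offspring) < size:
            parent = tournament(rng, population, data)
            r = rng.random()
            if r < 0.1:      # reproduction: copy the parent unchanged
                child = parent
            elif r < 0.8:    # crossover: graft a subtree of a second parent
                donor = random_subtree(rng, tournament(rng, population, data))
                child = replace_random_node(rng, parent, donor)
            else:            # mutation: replace a node by a new random branch
                child = replace_random_node(rng, parent, random_tree(rng, 2))
            offspring.append(child if depth(child) <= MAX_DEPTH else parent)
        population = offspring
        champion = min(population, key=lambda t: fitness(t, data))
        if fitness(champion, data) < fitness(best, data):
            best = champion
    return best                                                  # step 3
```

Calling `evolve` on pairs sampled from a target function returns the expression tree with the lowest error seen in any generation, which can then be inspected as a formula.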
One of the main uses of genetic programming is to evolve relationships between variables: this
is called symbolic regression. Symbolic regression via genetic programming is a branch of
empirical modeling that evolves summary expressions for available data. For a long time,
symbolic regression was a domain only for humans; however, over the last few decades, it has
become that of computers as well. Unique benefits of symbolic regression include human insight
and, in some cases, interpretability of model results, identification of key variables and
variable combinations, and the generation of computationally simple models for deployment into
operational models.

4 Earthquake data
Two earthquake datasets have been used in this study. The first one is the USGS's "Did You Feel
It?" database (U.S. Geological Survey 2011), which collects, by means of internet surveys,
information about how people actually experienced the earthquakes. The form of the
questionnaire employed in the DYFI database and the method for assignment of intensities are
based on an algorithm developed by Dengler and Dewey (1998).
In this dataset, one can find for a given earthquake a table of modified Mercalli intensities
aggregated by city or postal code, the number of responses for that region, the epicentral
distance, and a representative latitude and longitude of the surveyed region. In addition, it
is possible to find in the same database the depth of the earthquake and the latitude and
longitude of the epicenter.
The second database employed is that of the Center for Engineering Strong Motion Data
(CESMD - Center for Engineering Strong Motion Data 2011). Here, one can find for some
representative earthquakes the actual accelerograms of the shaking measured at different
stations. For each station, one can find its code and name, its latitude and longitude, and for
the given earthquake, its epicentral distance, magnitude, PGA, PGV, PGD, and the amplitudes of
the acceleration response spectra for periods of 0.3, 1, and 3 s. However, one must take into
consideration that, usually, not all of the above-mentioned data are available at the same time
in the CESMD database.
The records of the earthquakes that happened in the California region after 2000, plus the
records of the Loma Prieta earthquake of October 17, 1989, and the Petrolia earthquake of
April 25, 1992, were employed. All of the available records of the CESMD database were used,
but it was found that the records prior to 2000, except the two formerly mentioned, gave
incorrect MMI predictions in the regressions performed (as we found in our initial numerical
experiments). This is in agreement with a comment made by Vince Quitoriano of the USGS, who
told us in an email, "We [the USGS] did not start collecting internet responses [in the DYFI
database] until mid-1998. In that time period, Hector Mine was the only large event that people
responded to immediately. Most of the pre-2000 data are actually entries from people recalling
historical events (Northridge, Whittier Narrows, etc.) much later, rather than responding to a
current earthquake." And later he said, "Note that all the data in DYFI are from internet
questionnaires; we did not process any paper surveys. The questionnaire itself and the
underlying algorithm have not changed from Wald et al. (1999a)." We retained the records of the
Loma Prieta and the Petrolia earthquakes in any case because the records with MMI intensity 7
and above were scarce, and those earthquakes provided us with that information.

Fig. 4 Histogram of the number of responses for a single MMI reading in the training dataset
(horizontal axis: number of responses; vertical axis: frequency)


Fig. 5 Locations where the MMI readings used in this study were reported. All of the 843
readings were located in California

A MATLAB program was written in order to automatically download, match, and parse the data from
the databases using the Event ID of the earthquake. We chose to correlate only those records
where the strong motion record was near an MMI observation. For each station, the nearest
observation intensity with at least four reports was chosen, with the intention of decreasing
the internal variability of the MMI readings (remember that the standard deviation of the mean
of a normally distributed and independent sample is given by $\sigma / \sqrt{n}$, where
$\sigma^2$ is the variance of the population and n is the number of samples used to estimate
the mean). Figure 4 shows the histogram of the number of responses for a single MMI reading in
the training dataset. It can be seen from this histogram that most of the MMI observations are
the average of a large number of responses. In consequence, much of the inherent variability of
the MMI readings has been removed from the dataset. We used only the station readings that were
within 1.0 km of the MMI reading when the modified Mercalli intensity was less than 6 and
within 3 km when the MMI reading was greater than or equal to 6. For measuring that distance,
we used the latitude and longitude position of the stations/MMI readings. All other data were
disregarded. Using those criteria, we came up with a database composed of 843 station
record-MMI observation pairs coming from 63 earthquakes. Figure 5 shows a map that indicates
the locations where the MMI readings that were used in this study were reported. Since most of
the readings were done in places with high concentration of population, the reported MMI data
in the DYFI database are the average of the reported MMI of small ZIP regions. This fact
explains as well the small variability of the reported MMI readings in the DYFI database.
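The matching criteria described above (nearest MMI observation with at least four reports, within 1 km for MMI below 6 and within 3 km otherwise) can be sketched as follows. This is not the authors' MATLAB program: the dictionary keys (`lat`, `lon`, `mmi`, `responses`), the function names, and the use of the haversine great-circle formula with a mean Earth radius of 6371 km are assumptions made for this illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def match_station(station, observations, min_reports=4):
    """Pair a station with its nearest qualifying MMI observation, or return None.

    An observation qualifies if it aggregates at least min_reports responses;
    the pair is kept only if the distance is within 1 km (MMI < 6) or 3 km (MMI >= 6).
    """
    candidates = [obs for obs in observations if obs["responses"] >= min_reports]
    if not candidates:
        return None

    def distance(obs):
        return haversine_km(station["lat"], station["lon"], obs["lat"], obs["lon"])

    nearest = min(candidates, key=distance)
    limit_km = 1.0 if nearest["mmi"] < 6 else 3.0
    d = distance(nearest)
    return (nearest, d) if d <= limit_km else None
```

An observation roughly 0.5 km from a station is accepted regardless of its intensity, whereas one about 2 km away is accepted only if its MMI is 6 or greater.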
From the above database, four representative features were coupled to the MMI reading, namely
moment magnitude, epicentral distance, PGA, and PGV. We intended to include depth as well, but
the high variability of this variable led us to exclude it from the analysis. Sometimes, there
was conflicting information, and in that case, we deferred to the DYFI database.

Table 2 Spearman correlation coefficients between the analyzed variables

Variable              MMI    Epicentral distance   PGA    PGV    Moment magnitude   Depth
MMI                   1.00   0.32                  0.79   0.73   0.26               0.14
Epicentral distance   0.32   1.00                  0.52   0.01   0.66               0.23
PGA                   0.79   0.52                  1.00   0.72   0.01               0.30
PGV                   0.73   0.01                  0.72   1.00   0.55               0.15
Moment magnitude      0.26   0.66                  0.01   0.55   1.00               0.17
Depth                 0.14   0.23                  0.30   0.15   0.17               1.00


Fig. 8 Plot of PGV (cm/s, log10 scale) vs. MMI for the training dataset. The
lines represent the relationships between these variables
given by Eqs. 2, 4, 6, and 24

In order to gain insight into the employed
dataset, Table 2 shows the Spearman correlation
coefficients between the analyzed variables. This
information shows not only the contribution of
each component to explaining the MMI
but also the redundancy of information
between variables. It can be seen from this table
that the MMI tends to grow when the epicentral
distance decreases and when the PGA, PGV, or
moment magnitude increases. Figures 6, 7, and 8
confirm this fact. On the other hand, the depth
does not seem to have a strong relationship with
MMI. We have included in Fig. 7 the linear regressions (1), (3), and (5) that associate PGA and
MMI, while in Fig. 8, we have included Eqs. 2,
4, and 6. Note that some of these regressions tend
to overestimate the MMI for the present
database.
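For reference, a Spearman coefficient such as those in Table 2 is the Pearson correlation of the ranks of the two variables. The sketch below uses only the Python standard library; the function names are introduced here for illustration, and ties are handled with average ranks, which is a common convention and an assumption about how the table was computed.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))
```

Because only ranks enter the computation, any monotone relationship (for example, MMI against the logarithm of PGA) yields the same coefficient as the untransformed variable.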
Figure 9 shows a plot relating moment magnitude and MMI. It can be seen from this plot
that the correlation between these two variables is
weak. However, we opt to keep this information in
our database because this variable, together with
the epicentral distance, provides a better impression of the intensity of the quake.

Fig. 6 Plot of epicentral distance (km) vs. MMI for the training
dataset

Fig. 7 Plot of PGA (cm/s2, log10 scale) vs. MMI for the training dataset. The
lines represent the relationships between these variables
given by Eqs. 1, 3, 5, and 23

Fig. 9 Plot of moment magnitude vs. MMI for the training
dataset

Fig. 10 Histogram of the modified Mercalli intensity for
the training dataset

It is important, as well, to know the distribution
of the training data, so that we can know in which
regions there is enough information to estimate
the MMI with the relationships that will be proposed in Section 5. In this sense, Figs. 10, 11, 12,
13, and 14 show the histograms of the different
variables. One can deduce that the regressions
performed are reliable when the expected MMI
lies in the range 2–6.5, the epicentral distance is
less than 400 km, the PGA is less than 0.2 g,
the PGV is less than 20 cm/s, and the moment magnitude is less than 7.2. In principle, one
could use a regression model to predict beyond
the extreme values found in the dataset in order
to estimate higher intensities (in this case, MMIs
greater than 6.5 in the current database) or use,
for example, the model for PGAs larger than 0.2 g.
Even though this is tempting, it is not advisable
because extrapolation of a model outside the parameter boundaries of its underlying dataset can
be dangerous (see for example Bommer et al.
2007, for a discussion about the extrapolation of
ground motion prediction equations). This is a
kind of curse for all prediction equations based
on data, and in this case, the only safe solution is
to employ a regression model fitted to a database
that covers the range of information in which the
prediction is to be performed. In other words,
regression algorithms, in general, are expected to
perform well in regions of the space of variables
where there exists a set of points modeling similar
characteristics, but are not so good at extrapolation, inasmuch as these results are subject to
greater uncertainty.

Fig. 11 Histogram of epicentral distance (km) for the training dataset

Fig. 12 Histogram of PGA (g) for the training dataset

Fig. 13 Histogram of PGV (cm/s) for the training dataset

Fig. 14 Histogram of moment magnitude for the training dataset

It is necessary to note that the differences in
earthquake parameters such as source mechanism, regional tectonics, propagation path properties, and geological and geotechnical conditions
are not taken into account in the present study.
Therefore, these factors are considered to be random variables affecting the ground motion parameters for a given location and intensity level.
In the next lines, we will present the results
of our numerical experimentation performed with
those 843 observations.

5 The proposed approach and its numerical experimentation

Using the algorithms described in Section 3,
we related the MMI to the moment magnitude,
epicentral distance, PGA, and PGV measured at
the closest station to the observation. Even though
the MMI is reported as a Roman numeral,
we will use it here as a continuous real number (in
Arabic notation), so that when it is rounded to the
closest integer, it coincides with its corresponding
degree in the modified Mercalli intensity
scale. In fact, the MMI in the DYFI database is
expressed by a real number with one decimal digit
of approximation (see Wald et al. 2006).
In order to validate the training, we randomly
split the 843 observations into three sets: a training, a validation, and a testing set with 506 (60 %), 126
(15 %), and 211 (25 %) elements, respectively.
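A random split of this kind can be sketched as below. The function name, the fixed seed, and the choice of assigning the remainder to the testing set after rounding the first two fractions are assumptions made for illustration; the paper does not state how the split was implemented.

```python
import random

def split_indices(n, fractions=(0.60, 0.15, 0.25), seed=0):
    """Shuffle the indices 0..n-1 and cut them into training/validation/testing.

    The first two set sizes are the rounded fractions of n; the testing set
    receives whatever remains, so the three sets always partition the data.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # reproducible shuffle (seed is arbitrary)
    n_train = round(fractions[0] * n)
    n_val = round(fractions[1] * n)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test

train, val, test = split_indices(843)
```

Keeping the testing indices untouched until the final comparison is what allows the errors reported later to be read as out-of-sample performance.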
Before using the algorithms that we will describe below, each variable in the training set was
either normalized or standardized.
In the first case (which was applied before
training the MLP), each variable in the training
set was normalized (so that it had a value in the
interval [−1, 1]) by means of the equation:

$$z_k = 2\,\frac{X_k - \min(X_k)}{\max(X_k) - \min(X_k)} - 1 \qquad (17)$$

employing the minimum and maximum values
that can be found in Table 3.
In the second case (which was applied before training the SVR and GP algorithms), the
standardization was performed by subtracting the
mean and dividing by the standard deviation using
the equation:

$$z_k = \frac{X_k - \mathrm{mean}(X_k)}{\mathrm{std}(X_k)} \qquad (18)$$

employing the means and standard deviations that can be found in Table 3. Even though
this procedure is inspired by the standardization of
Gaussian random variables, it is applicable to any
kind of distribution, since the idea is to reduce
the spread in the data. Both methods are popular in nonlinear regression for making the input
variables rather small in order to improve the
numerical stability of the employed algorithms,
regardless of the distribution of the data. In other
words, this process tends to make the training
process better behaved by improving the numerical condition of the underlying optimization algorithms employed and ensuring that various default
values involved in initialization and termination of
the algorithms are appropriate (see for instance,
Sarle 2002).

Table 3 Mean, standard deviation, minimum, and maximum of each variable of the dataset

Variable                         Mean      Standard deviation   Minimum   Maximum
MMI                              3.6425    0.9669               2.0       7.5
x1 = epicentral distance (km)    87.4656   102.9749             1.6       393.7
x2 = PGA (g)                     0.0460    0.0619               0.002     0.588
x3 = PGV (cm/s)                  3.4752    5.4888               0.080     62.88
x4 = moment magnitude            5.2261    0.9574               4.0       7.2

The mean and standard deviation correspond to the training set and the minimum and maximum values correspond to the
whole database
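The two scalings can be sketched in a few lines of Python using the values of Table 3. The dictionary keys and function names are hypothetical and chosen here for readability; the inverse maps are included because the regressions return scaled quantities that must be mapped back to MMI.

```python
# (mean, std, min, max) for each variable, copied from Table 3.
TABLE3 = {
    "MMI": (3.6425, 0.9669, 2.0, 7.5),
    "distance_km": (87.4656, 102.9749, 1.6, 393.7),
    "pga_g": (0.0460, 0.0619, 0.002, 0.588),
    "pgv_cm_s": (3.4752, 5.4888, 0.080, 62.88),
    "magnitude": (5.2261, 0.9574, 4.0, 7.2),
}

def normalize(x, name):
    """Min-max scaling (Eq. 17): maps [min, max] linearly onto [-1, 1]."""
    _, _, lo, hi = TABLE3[name]
    return 2.0 * (x - lo) / (hi - lo) - 1.0

def denormalize(z, name):
    """Inverse of normalize: maps [-1, 1] back onto [min, max]."""
    _, _, lo, hi = TABLE3[name]
    return (z + 1.0) * (hi - lo) / 2.0 + lo

def standardize(x, name):
    """Standardization (Eq. 18): subtract the mean, divide by the std."""
    mean, std, _, _ = TABLE3[name]
    return (x - mean) / std

def destandardize(z, name):
    """Inverse of standardize."""
    mean, std, _, _ = TABLE3[name]
    return z * std + mean
```

For example, `normalize(1.6, "distance_km")` returns the lower end of the interval, and `standardize(5.2261, "magnitude")` returns zero because the input equals the tabulated mean.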
We performed the nonlinear regressions with
the algorithms described in Section 3 in the following way:

Training with support vector regression


The quadratic optimization (Eq. 13) is convex,
and therefore, its solution is unique. Hence, there
was no need to use a validation set, and in consequence, the training and the validation sets were
merged. After setting ε a priori to 0.5 (we chose
this value because 0.5 is half the distance between
two MMI degrees), the constants C = 494.559 and
γ = 0.0625 were found by means of an optimization that selected the parameters which produced
the minimum least squares error in a leave-one-out cross-validation. The training was performed
using the LIBSVM software (Chang and Lin
2011), obtaining 147 support vectors, which can be
found together with their corresponding weights
in Appendix 1. Standardization by means of Eq. 18
was only performed on the inputs, since it was
found in this case that without standardization of
the outputs, the algorithm provided slightly better
results.
In Appendix 2.1, we have included the MATLAB code that calculates an estimation of
the MMI from the epicentral distance, PGA,
PGV, and moment magnitude using the SVR
algorithm.
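The way a trained SVR turns support vectors into a prediction can be sketched as follows. This is not the code of Appendix 2.1: the Gaussian (RBF) kernel, the sign convention of the bias, and the function names are assumptions made here, and the support vectors in the usage lines are hypothetical placeholders rather than the entries of Appendix 1.

```python
import math

def rbf_kernel(u, v, gamma):
    """Gaussian kernel exp(-gamma * ||u - v||^2); assumed kernel choice."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def svr_predict(z, support_vectors, weights, bias, gamma):
    """Weighted sum of kernel evaluations against each support vector, plus bias.

    z: standardized input (z1, z2, z3, z4).
    support_vectors, weights: one weight per support vector.
    """
    total = sum(w * rbf_kernel(sv, z, gamma)
                for sv, w in zip(support_vectors, weights))
    return total + bias

# Hypothetical two-vector model, for illustration only.
example_svs = [[0.0, 0.0, 0.0, 0.0], [1.0, -0.5, 0.2, 0.3]]
example_weights = [0.8, -0.3]
example_mmi = svr_predict([0.1, 0.0, 0.0, 0.2], example_svs, example_weights,
                          bias=3.6, gamma=0.0625)
```

Since the outputs were left unscaled during training, the value returned by such a function would be read directly as an MMI estimate.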

Training with multilayer perceptrons


Using the neural network toolbox of MATLAB
(specifically the utility nftool), we trained a multilayer perceptron with three
hidden units several times with the help of the Levenberg–Marquardt
training method (see e.g., Bishop
2007). The number of hidden units, p, was chosen so that the model neither underfit
nor overfit the validation set. In this case, we
set p = 3. The training basically tried to minimize
Eq. 16, and it halted when the error on the validation set began to increase (this is called an early-stopping strategy). The MLP that produced the
smallest error (according to Eq. 16) on the testing
set was chosen for the results reported below. The
weights of that network are as follows:

$$w_0 = \begin{bmatrix} 4.2849 & 0.1396 & 0.8048 & 1.6088 \\ 1.3383 & 1.1684 & 3.4321 & 0.8322 \\ 0.1812 & 5.4063 & 0.0084 & 0.3572 \end{bmatrix}, \qquad w_1 = \begin{bmatrix} 0.4873 & 0.2759 & 1.4386 \end{bmatrix}$$

and the biases are the following:

$$b_0 = \begin{bmatrix} 3.9342 & 3.0765 & 6.5198 \end{bmatrix}^T \quad \text{and} \quad b_1 = 1.1465$$
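The forward pass of a perceptron with one hidden layer can be sketched as below. The hyperbolic tangent hidden activation and the linear output unit are assumptions (they are common defaults for fitting networks, but the activation of Eq. 15 is not restated here), and the weights in the usage lines are hypothetical placeholders, not the trained values.

```python
import math

def mlp_forward(z, w0, b0, w1, b1):
    """One-hidden-layer perceptron: tanh hidden units, linear output.

    z:  normalized inputs (length 4).
    w0: hidden weights, one row of length 4 per hidden unit.
    b0: hidden biases, one per hidden unit.
    w1: output weights, one per hidden unit.
    b1: output bias (scalar).
    """
    hidden = [math.tanh(sum(w * x for w, x in zip(row, z)) + b)
              for row, b in zip(w0, b0)]
    return sum(w * h for w, h in zip(w1, hidden)) + b1

# Hypothetical network with three hidden units, for illustration only.
w0_demo = [[0.5, -0.1, 0.2, 0.0],
           [0.0, 0.3, -0.4, 0.1],
           [-0.2, 0.0, 0.1, 0.6]]
b0_demo = [0.1, -0.2, 0.0]
w1_demo = [0.7, -0.5, 0.3]
b1_demo = 0.05
output_nn = mlp_forward([0.2, -0.6, -0.4, 0.1], w0_demo, b0_demo, w1_demo, b1_demo)
```

The output lives on the normalized scale of the training targets, so it still has to be mapped back to the MMI range before it can be interpreted.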
Remember that variable normalization using
Eq. 17 is mandatory before using the aforementioned weights
in Eq. 15. Then, the MMI
is calculated from the network output, outputNN, by inverting Eq. 17:

$$\mathrm{MMI} = \frac{(\mathrm{outputNN}+1)\,(\max(\mathrm{MMI})-\min(\mathrm{MMI}))}{2} + \min(\mathrm{MMI}) = 2.75\,(\mathrm{outputNN}+1) + 2.0 = 2.75\,\mathrm{outputNN} + 4.75$$

In Appendix 2.2, we have included the
MATLAB code that estimates the MMI from
the epicentral distance, PGA, PGV, and moment magnitude using the multilayer perceptron
algorithm.

Training with genetic programming

We ran the genetic programming method using
the GPTIPS software (Searson 2010) with a population size of 400 individuals, 3,000 generations,
the plain lexicographic tournament selection method of Luke and Panait (2002) (choosing from a pool
of size 7), a mean squares fitness function, a maximum tree depth of 3, multigene individuals with a maximum of four genes per
individual, 5 % elitism (that is, the fraction of the
population copied directly to the next generation
without modification), a mutation probability
of 10 %, a crossover probability of 85 %, and
a probability of direct tree copy of 5 %. We ran
the algorithm several times, and the symbolic regression that produced the minimum error on the
testing set was as follows:

$$\begin{aligned} y ={}& 0.3339\,\max(z_1, z_4) + 0.5415 + 0.4488\,\tanh(z_2) - 0.578\,z_1 + 0.1507\,\ln(|z_3 + 0.5909|) \\ &+ 0.3339\,\mathrm{ifte}(z_4 \le z_2,\ z_4,\ z_3) \\ &+ 0.1604\,\mathrm{ifte}\big(\mathrm{ifte}(z_3 \le 4.234,\ 0.571,\ z_4) \le z_1,\ \mathrm{ifte}(z_4 \le 0.499,\ 3.29,\ 1.231),\ \mathrm{ifte}(z_4 \le 0.6676,\ 1.823,\ z_2)\big) \end{aligned} \qquad (19)$$

where ifte(·) stands for the if-then-else function
and z1, z2, z3, and z4 are the standardized epicentral distance, PGA, PGV, and magnitude, respectively, which were obtained by means of
Eq. 18.
The MMI can be retrieved from Eq. 19 using
the formula:

$$\mathrm{MMI} = y\,\mathrm{std}(\mathrm{MMI}) + \mathrm{mean}(\mathrm{MMI}) = 0.9669\,y + 3.6425 \qquad (20)$$

and the values that appear in Table 3.
In Appendix 2.3, we have included the MATLAB code that retrieves the estimated MMI from
the epicentral distance, PGA, PGV, and moment
magnitude using the symbolic regression (19) and
Eq. 20.
Take into account that Eq. 19
should not be used outside of the present
study, inasmuch as the functional form and the
coefficients of this equation depend on the employed database. If this methodology is used with
another dataset, the algorithm will most probably
converge to a different functional form.
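Two building blocks of the symbolic regression are easy to misread, so a short sketch may help: the three-argument if-then-else operator and the step that maps the standardized output y back to MMI using the mean and standard deviation of Table 3. The function names are introduced here for illustration; the full evolved expression is deliberately not reproduced.

```python
def ifte(condition, then_value, else_value):
    """If-then-else: return then_value when condition holds, else else_value."""
    return then_value if condition else else_value

def mmi_from_standardized(y, mean_mmi=3.6425, std_mmi=0.9669):
    """Undo the standardization of the target: MMI = y * std(MMI) + mean(MMI)."""
    return y * std_mmi + mean_mmi

# A standardized output of zero corresponds to the mean MMI of the training set.
mmi_at_mean = mmi_from_standardized(0.0)

# Nested use, mirroring how ifte terms can feed one another in an evolved tree
# (the numbers are arbitrary and only demonstrate evaluation order).
nested = ifte(ifte(0.2 <= 1.0, 0.5, -0.5) <= 0.0, 10.0, 20.0)
```

Because the inner call is evaluated first, `nested` takes the else branch of the outer operator; the same inside-out evaluation applies to any nested term of an evolved expression.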

5.1 Weighted linear regression


In order to make a fair comparison of the nonlinear regression algorithms with the linear case,
we estimated the ordinary least squares linear
regression and several robust linear regressions
(varying the weighting function). The ordinary
least squares linear regression between the input
variables was as follows:

$$\mathrm{MMI} = 2.0303 - 0.0063\,x_1 + 0.4465\,\log_{10}(x_2) + 0.7688\,\log_{10}(x_3) + 0.5247\,x_4 \qquad (21)$$

while the robust linear regression that had the
smallest error on the testing set was the one with
the weighting function w(r) = (|r| < 1), that is,

$$\mathrm{MMI} = 2.2733 - 0.0059\,x_1 + 0.4069\,\log_{10}(x_2) + 0.8383\,\log_{10}(x_3) + 0.4538\,x_4 \qquad (22)$$

Here, x1 to x4 denote, respectively, the epicentral
distance in kilometers, the PGA in units of g, the
PGV in centimeters per second, and the moment
magnitude.
For the purposes of completeness, we have
used robust linear regression to provide relationships similar to Eqs. 1 and 2 as follows:

$$\mathrm{MMI} = \begin{cases} 1.78 + 1.29\,\log_{10}(\mathrm{PGA}) & \text{for } \log_{10}(\mathrm{PGA}) \le 1.73 \\ -0.39 + 2.54\,\log_{10}(\mathrm{PGA}) & \text{for } \log_{10}(\mathrm{PGA}) > 1.73 \end{cases} \qquad (23)$$

and

$$\mathrm{MMI} = \begin{cases} 3.45 + 1.38\,\log_{10}(\mathrm{PGV}) & \text{for } \log_{10}(\mathrm{PGV}) \le 0.96 \\ 1.73 + 3.18\,\log_{10}(\mathrm{PGV}) & \text{for } \log_{10}(\mathrm{PGV}) > 0.96 \end{cases} \qquad (24)$$

Here, the PGA must be expressed in centimeters per second squared, while the PGV must
be given in centimeters per second. These relationships have been plotted in Figs. 7 and 8,
respectively.
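As an illustration of how such a two-branch relationship is evaluated, the PGV relation (Eq. 24) can be written as a small function. The function name is hypothetical; the coefficients and the breakpoint at log10(PGV) = 0.96 are those quoted above, with PGV in cm/s.

```python
import math

def mmi_from_pgv(pgv_cm_s):
    """Bilinear relation between MMI and log10(PGV), PGV in cm/s (Eq. 24)."""
    log_pgv = math.log10(pgv_cm_s)
    if log_pgv <= 0.96:
        return 3.45 + 1.38 * log_pgv   # lower branch
    return 1.73 + 3.18 * log_pgv       # upper branch

# The two branches nearly meet at the breakpoint, so the curve has no visible jump.
lower_at_break = 3.45 + 1.38 * 0.96
upper_at_break = 1.73 + 3.18 * 0.96
```

At a PGV of 1 cm/s the logarithm vanishes and the function returns the intercept of the lower branch; the near-equality of `lower_at_break` and `upper_at_break` is a quick consistency check on the coefficients.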
5.2 A note on the usage of the regressions
Note that according to Fig. 10, the MMI in our
database is bounded between 2 and 6.5, and in
consequence, these bounds should constrain the
values produced by the regression methods described above. Take into account that the minimum and the maximum values of each variable
should be taken as guidelines on the interpolative
power of the presented results, inasmuch as it is
not advisable to extrapolate with them. If extrapolation is
required, one should calculate new equations or
parameters/weights with a dataset that contains
representative samples of those extreme values.
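One way to enforce these guidelines in practice is to check the inputs against the ranges of Table 3 before predicting and to keep the predicted MMI inside the interval covered by the database. The sketch below is an assumption about how a user might do this, not part of the published code; the key and function names are hypothetical.

```python
# (min, max) over the whole database, copied from Table 3.
BOUNDS = {
    "distance_km": (1.6, 393.7),
    "pga_g": (0.002, 0.588),
    "pgv_cm_s": (0.080, 62.88),
    "magnitude": (4.0, 7.2),
}

def out_of_range(inputs):
    """Return the sorted names of the inputs that fall outside the database range."""
    return sorted(name for name, value in inputs.items()
                  if not BOUNDS[name][0] <= value <= BOUNDS[name][1])

def clamp_mmi(mmi, lo=2.0, hi=6.5):
    """Keep a predicted MMI inside the interval covered by the training data."""
    return min(max(mmi, lo), hi)

flagged = out_of_range({"distance_km": 500.0, "pga_g": 0.1,
                        "pgv_cm_s": 5.0, "magnitude": 6.0})
```

A non-empty result signals that the regressions would be extrapolating for that record, in which case the prediction should be treated with caution or discarded.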

6 Analysis of results
Although input MMI levels are discrete natural
numbers, the output is in the form of continuous
real numbers. We define a prediction as successful
when the estimate is within ±0.5 of the MMI level
reported by the DYFI database. In this sense, we
evaluated the performance of the algorithms as follows:
first, the loss function (8) with ε = 0.5 was
evaluated on the testing set (a set of data which
was not employed in the training phase of the
algorithm). For comparison, the quadratic
mean error function (16) was calculated as well.
These numbers, together with the coefficients of
correlation of the predicted MMI vs. the actual
MMI and the percentage of misclassification on
the testing set, are shown in Table 4.
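The three error measures can be sketched as follows. The exact forms of Eqs. 8 and 16 are not restated in this section, so the versions below (the standard ε-insensitive loss and the squared error, both averaged over the set) are assumptions, as are the function names and the toy numbers in the usage lines.

```python
def eps_insensitive_loss(actual, predicted, eps=0.5):
    """Mean of max(0, |error| - eps): errors smaller than eps cost nothing."""
    return sum(max(0.0, abs(a - p) - eps)
               for a, p in zip(actual, predicted)) / len(actual)

def mean_square_error(actual, predicted):
    """Mean of the squared prediction errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def misclassification_pct(actual, predicted, tol=0.5):
    """Percentage of predictions farther than tol from the reported MMI."""
    wrong = sum(1 for a, p in zip(actual, predicted) if abs(a - p) > tol)
    return 100.0 * wrong / len(actual)

# Toy values for illustration only (not taken from the testing set).
actual_demo = [3.0, 4.0, 5.0, 6.0]
predicted_demo = [3.2, 4.9, 5.0, 5.4]
loss_demo = eps_insensitive_loss(actual_demo, predicted_demo)
mse_demo = mean_square_error(actual_demo, predicted_demo)
miss_demo = misclassification_pct(actual_demo, predicted_demo)
```

With ε equal to the misclassification tolerance, a prediction contributes to the ε-insensitive loss exactly when it is counted as misclassified, which is why the two measures tend to rank the algorithms similarly.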
In comparison to Eq. 23, which depends only on the PGA, the inclusion of more information in the MMI assessment is beneficial
for its prediction. The best nonlinear regression
method seems to be the one produced by the MLP,
followed closely by genetic programming and
the SVR. In general, the MMI estimation shows
a good agreement with the reported intensity: the
ε-insensitive loss function, the mean square error,
and the misclassification error obtained with the
nonlinear algorithms are lower than
the values obtained with the linear regression (23).
Figure 15 illustrates how well the predicted MMI
corresponds to the actual MMI in the case of the
estimation done by the MLP.

Table 4 Performance of the different nonlinear regression algorithms on the testing dataset

Algorithm                                ε-insensitive loss function   Mean square error   Correlation of predicted vs. actual MMI   Percentage of misclassification (%)
SVR                                      0.037                         0.146               0.928                                      19.91
MLP                                      0.035                         0.141               0.928                                      17.06
Genetic programming                      0.033                         0.139               0.929                                      19.43
Ordinary least squares regression (21)   0.046                         0.169               0.913                                      22.75
Robust regression (22)                   0.046                         0.170               0.914                                      21.33
Equation 23                              0.067                         0.224               0.883                                      23.22
Equation 24                              0.097                         0.278               0.853                                      31.28

Fig. 15 Predicted MMI vs. actual MMI in the case of the
estimation done by the genetic programming algorithm (regression: R = 0.9280).
The least mean squares fit for both variables is Predicted MMI ≈ 0.88 Actual MMI + 0.46

The nature of the points misclassified by the
nonlinear algorithms was analyzed. Initially, we
thought that most of those points belonged either
to regions where there was a unique ZIP code for
a large area or to large intensities (as there
were not many data corresponding to large intensities to train the algorithms). However, this turned
out not to be so in most of the cases. We
attribute the errors in the assignment of the MMI
to the fact that the moment magnitude, epicentral
distance, PGA, and PGV are not enough to fully
describe the different aspects of the earthquake,
and therefore, it would be convenient to include in
the analysis other descriptive parameters of the
earthquake, like the ones mentioned in Section 2.
With respect to the variability of PGA and
PGV, we are using the values stated in the public
databases. Although each of these values is a realization
of a random variable, it is the best estimate of the PGA and
PGV that is at our disposal, given the available
information. If we had at our disposal probability density functions, possibility distributions,
or interval information on those values, we could
use that information in our proposed approach. In
that case, simple Monte Carlo sampling
or even the extension principle used in fuzzy set
theory and in the theory of random sets would
be excellent tools to propagate the uncertainty
through the nonlinear regression algorithms. The
expected MMI would then be presented
in the form of a probability density function, a normalized fuzzy set, or an interval.

7 Conclusions

In this paper, we presented three nonlinear regression methods, namely support vector regression,
multilayer perceptrons, and genetic programming
to model the relationship between the modified
Mercalli intensity scale and the earthquake moment magnitude, epicentral distance, PGA, and
PGV measured at the closest stations to the MMI
reading. In general, the MMI estimation shows
a good agreement with the reported intensity.
The best results were obtained by the multilayer
perceptron.
As seen from the results, nonlinear regression
should be applied in order to find a relationship
between MMI and instrumental information instead of the linear regressions that are popular in
this class of studies. Our numerical experiments
have shown for example that all of the nonlinear
regression algorithms employed perform better
than the linear regressions (21) and (22).

Acknowledgements The authors would like to thank
John R. Evans, David Wald, Vince Quitoriano, and Bruce
Worden of the USGS for their helpful advice over the
internet. Also, we would like to thank the anonymous
reviewers and the associate editor, Dr. Gottfried Grünthal,
for their constructive comments that have notably
improved the paper. Financial support for the realization of the present research has been received from the
Universidad Nacional de Colombia; the support is gratefully acknowledged.


Appendix 1: Support vectors of the SVR method

The bias b of the SVR was b = 6.6477 and the support vectors together with their weights α_i are listed in
the following:
SV    α_i        z1      z2      z3      z4
1     4.4895     0.6223  1.5550  2.7381  2.0587
2     405.5060   0.6048  0.6038  0.5537  0.2432
3     494.5590   0.1701  0.6708  0.5894  0.2432
4     45.3807    0.1353  0.6178  0.7897  1.9541
5     7.2997     0.0990  1.3207  2.8716  1.9541
6     494.5590   0.0249  0.5704  0.5443  0.1386
7     494.5590   0.6436  0.4700  0.4973  1.0803
8     494.5590   0.3647  0.5536  0.5462  0.3478
9     494.5590   0.2665  0.6206  0.5274  0.3478
10    494.5590   0.2675  0.5871  0.5199  0.3478
11    494.5590   0.2684  0.6373  0.5518  0.3478
12    494.5590   0.3014  0.4198  0.5462  0.3478
13    494.5590   0.6561  0.1520  0.4165  0.8710
14    494.5590   0.6358  0.0823  0.2211  0.8710
15    494.5590   0.3140  0.5202  0.4936  0.1753
16    494.5590   0.3875  0.5536  0.5593  0.1753
17    10.7370    1.8688  0.6541  0.4823  1.3263
18    131.7847   1.8978  0.5202  0.1948  1.3263
19    83.2952    2.1534  0.6708  0.3451  1.3263
20    494.5590   0.7113  0.6346  0.0382  0.3478
21    277.7219   0.6348  0.0656  0.3527  0.3478
22    494.5590   0.6232  0.2664  0.3395  0.3478
23    494.5590   0.5893  0.3835  0.0783  0.3478
24    494.5590   0.5777  0.1492  0.3132  0.3478
25    494.5590   0.3328  0.0990  0.2794  0.3478
26    388.3332   0.0646  0.1353  0.4372  0.3478
27    494.5590   0.2065  0.2691  0.3959  0.3478
28    494.5590   0.2539  0.2859  0.4729  0.3478
29    494.5590   0.7936  1.1701  0.0182  1.0803
30    494.5590   0.6813  0.3361  0.4729  1.0803
31    249.7611   0.6736  0.4532  0.4071  1.0803
32    494.5590   0.6600  0.5536  0.5199  1.0803
33    494.5590   0.6048  0.6206  0.5537  1.0803
34    494.5590   0.5932  0.5704  0.5537  1.0803
35    494.5590   0.5341  0.6875  0.6044  1.0803
36    494.5590   0.5003  0.6541  0.5744  1.0803
37    494.5590   0.7791  0.4867  0.4898  1.0803
38    274.6066   0.7752  2.9440  0.8085  1.0803
39    494.5590   0.7462  0.1185  0.3771  1.0803
40    494.5590   0.7239  0.1353  0.4259  1.0803
41    494.5590   0.6639  0.6541  0.6044  1.0803
42    494.5590   0.6329  0.5704  0.5086  1.0803
43    494.5590   0.3366  0.5704  0.5499  0.5571

SV    α_i        z1      z2      z3      z4
44    494.5590   0.7045  1.5717  0.3463  0.3846
45    494.5590   0.6358  0.0488  0.1171  0.3846
46    116.8932   0.6329  0.0656  0.0757  0.3846
47    494.5590   0.6116  0.3668  0.0532  0.3846
48    494.5590   0.8890  0.7043  0.6044  0.1753
49    126.2576   0.7433  1.4713  1.6597  0.1753
50    489.1067   0.6978  0.4672  0.7503  0.1753
51    494.5590   0.6600  1.3040  0.5981  0.1753
52    24.8224    0.6552  4.0318  3.7020  0.1753
53    242.1119   0.6436  1.9567  0.4553  0.1753
54    494.5590   0.5738  1.1199  0.6807  0.1753
55    494.5590   0.5012  0.8187  0.2531  0.1753
56    209.7443   0.4916  1.1032  1.0509  0.1753
57    494.5590   0.4402  0.5676  0.0551  0.1753
58    494.5590   0.3831  0.0656  0.0720  0.1753
59    494.5590   0.3802  0.0851  0.0062  0.1753
60    494.5590   0.3260  0.3361  0.3959  0.1753
61    494.5590   0.3202  0.3696  0.3921  0.1753
62    494.5590   0.2104  0.2691  0.2380  0.1753
63    494.5590   0.8323  1.6052  0.4515  0.7664
64    494.5590   0.7162  0.0321  0.4616  0.7664
65    494.5590   0.2602  0.6373  0.5894  0.7664
66    494.5590   0.0326  0.6206  0.5744  0.7664
67    494.5590   0.0206  0.6038  0.5669  0.7664
68    10.5189    0.8139  1.0864  0.2298  1.2895
69    494.5590   0.7045  0.6206  0.5631  1.2895
70    494.5590   0.3153  0.6541  0.6063  0.8710
71    23.3037    0.8101  2.5424  0.7390  0.5571
72    7.0743     0.7384  2.6763  0.9006  0.5571
73    494.5590   0.7239  2.0069  0.7052  0.5571
74    494.5590   0.6900  0.9860  0.0250  0.5571
75    494.5590   0.6852  2.0403  0.2016  0.5571
76    494.5590   0.0768  0.5704  0.5631  0.5571
77    494.5590   0.7839  0.0014  0.3357  1.2895
78    178.8668   0.7588  0.1492  0.3827  1.2895
79    494.5590   0.5361  0.4365  0.5481  1.1849
80    123.2760   0.4548  0.6206  0.6025  1.1849
81    10.9674    0.3231  4.7347  4.8181  1.3263
82    494.5590   0.4490  0.5704  0.5988  0.9756
83    66.9874    0.8120  3.9649  0.6751  0.8710
84    494.5590   0.7278  0.0321  0.4691  0.8710
85    494.5590   0.7084  0.1158  0.4015  0.8710
86    494.5590   0.6949  1.9064  0.3407  0.8710
87    494.5590   0.6949  2.1240  0.3839  0.8710
88    494.5590   0.6910  0.4505  0.1516  0.8710
89    494.5590   0.6871  0.0321  0.2907  0.8710
90    494.5590   0.6803  0.1185  0.2474  0.8710

SV    α_i        z1      z2      z3      z4
91    494.5590   0.6542  0.3528  0.4823  0.8710
92    494.5590   0.6223  0.3696  0.4466  0.8710
93    494.5590   0.6155  0.6038  0.5932  0.8710
94    494.5590   0.5913  0.4867  0.5481  0.8710
95    494.5590   0.5680  0.3863  0.4992  0.8710
96    494.5590   0.5554  0.2691  0.4278  0.8710
97    0.0167     0.2534  2.3918  5.5809  2.0587
98    12.2687    0.8823  0.0823  0.9720  2.0587
99    159.9334   2.2667  0.4867  0.1039  2.0587
100   289.0971   2.4081  0.6206  0.0032  2.0587
101   494.5590   2.4894  0.6206  0.0344  2.0587
102   494.5590   2.5833  0.6206  0.1434  2.0587
103   494.5590   2.5852  0.6206  0.0577  2.0587
104   494.5590   2.6346  0.6038  0.0551  2.0587
105   415.6563   0.5240  0.6038  0.5781  0.7664
106   494.5590   0.2781  0.5704  0.4259  0.4892
107   494.5590   1.2656  0.6708  0.5781  0.4892
108   494.5590   1.6171  0.6373  0.5199  0.4892
109   494.5590   0.6252  0.6513  0.4121  0.1753
110   427.5960   0.1106  0.4198  0.3489  0.1753
111   494.5590   0.1116  0.5871  0.4842  0.1753
112   494.5590   0.1164  0.4867  0.4391  0.1753
113   10.2066    0.4292  0.6206  0.5499  0.1753
114   494.5590   0.6170  0.4030  0.3827  0.1753
115   494.5590   0.7438  0.6373  0.4616  0.1753
116   392.9353   1.2540  0.6373  0.4710  0.1753
117   494.5590   0.5554  0.6875  0.6082  1.1849
118   494.5590   0.4296  0.6373  0.5932  1.1849
119   494.5590   0.2830  0.6206  0.5650  0.3478
120   494.5590   0.2727  0.5704  0.5687  0.1753
121   494.5590   0.4718  0.5704  0.5293  0.1753
122   494.5590   0.7171  0.4839  0.0595  0.3478
123   494.5590   0.5787  0.9191  0.2662  0.3478
124   494.5590   0.2249  0.2859  0.4992  0.3478
125   494.5590   0.2597  0.5536  0.5255  0.3478
126   494.5590   0.5496  0.6373  0.5744  1.0803
127   494.5590   0.7849  0.3026  0.3733  1.0803
128   494.5590   0.7646  0.4672  0.0633  1.0803
129   494.5590   0.7626  0.3361  0.4541  1.0803
130   494.5590   0.4170  0.6206  0.5875  0.6617
131   494.5590   0.2844  0.4532  0.4691  0.7664
132   494.5590   0.2350  0.6373  0.5838  0.4525
133   331.6516   0.2147  0.6206  0.6007  0.8710
134   494.5590   0.7462  0.2859  0.5274  1.0803
135   494.5590   0.7926  1.2203  0.1742  0.5571
136   494.5590   0.2253  0.7684  0.5022  1.3263
137   494.5590   0.1585  0.1492  0.2662  0.6985

SV    α_i        z1      z2      z3      z4
138   494.5590   0.6861  0.1687  0.3771  0.8710
139   494.5590   0.6193  0.1185  0.4673  0.8710
140   494.5590   0.5613  0.6038  0.5706  0.8710
141   12.4008    2.6094  0.5704  0.8592  2.0587
142   116.9598   2.6859  0.6206  0.2888  2.0587
143   494.5590   2.9134  0.6541  0.1114  2.0587
144   494.5590   0.4625  0.5704  0.5481  0.7664
145   494.5590   0.2514  0.2859  0.2324  0.1753
146   494.5590   0.8658  0.6541  0.5255  0.1753
147   494.5590   0.6484  0.5536  0.5593  1.1849

Appendix 2: MATLAB implementation of the MLP and the GP approach

2.1 Support vector regression

2.2 Multilayer perceptron

2.3 Symbolic regression by genetic programming

References
Atkinson GM, Kaka SI (2007) Relationships between felt intensity and instrumental ground motion in the central United States and California. Bull Seismol Soc Am 97(2):497–510
Atkinson GM, Sonley E (2000) Empirical relationships between modified Mercalli intensity and response spectra. Bull Seismol Soc Am 90(2):537–544
Bishop CM (2007) Pattern recognition and machine learning. Springer, NY
Bommer JJ, Stafford PJ, Alarcón JE, Akkar S (2007) The influence of magnitude range on empirical ground-motion prediction. Bull Seismol Soc Am 97(6):2152–2170
Center for Engineering Strong Motion Data (2011) Internet data reports. http://www.strongmotioncenter.org/. Accessed 15 Jan 2011
Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm. Accessed 15 Jan 2011
Davenport P (2004) Neural network analysis of seismic intensity from instrumental records. In: Proceedings of the 13th world conference on earthquake engineering, paper no. 692. Vancouver, Canada
Dengler LA, Dewey JW (1998) An intensity survey of households affected by the Northridge, California, earthquake of 17 January 1994. Bull Seismol Soc Am 88(2):441–462
Faenza L, Michelini A (2010) Regression analysis of MCS intensity and ground motion parameters in Italy and its application in ShakeMap. Geophys J Int 180(3):1138–1152
Grünthal G (2011) Earthquakes, intensity. In: Gupta HK (ed) Encyclopedia of solid earth geophysics. Encyclopedia of earth sciences series. Springer, Dordrecht, The Netherlands, pp 237–242
Haykin S (1998) Neural networks: a comprehensive foundation, 2nd edn. Prentice Hall PTR, Upper Saddle River, NJ, USA
Hornik K, Stinchcombe MB, White H (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2(5):359–366
Kaka SI, Atkinson GM (2004) Relationships between instrumental ground-motion parameters and modified Mercalli intensity in eastern north America. Bull Seismol Soc Am 94(5):1728–1736
Karim KR, Yamazaki F (2002) Correlation of JMA instrumental seismic intensity with strong motion parameters. Earthq Eng Struct Dyn 31:1191–1212
Koza JR (1992) Genetic Programming: on the programming of computers by means of natural selection. MIT Press, Cambridge, MA, USA
Kuehn NM, Scherbaum F (2010) Short note: a naive bayes classifier for intensities using peak ground velocity and acceleration. Bull Seismol Soc Am 100(6):3278–3283
Luke S, Panait L (2002) Lexicographic parsimony pressure. In: Proceedings of the genetic and evolutionary computation conference (GECCO 2002). Morgan Kaufmann Publishers
Mallat S (2008) A wavelet tour of signal processing: the sparse way. 3rd edn. Academic Press, San Diego, California
Murphy JR, O'Brien LJ (1977) The correlation of peak ground acceleration amplitude with seismic intensity and other physical parameters. Bull Seismol Soc Am 67(3):877–915
Musson RMW, Grünthal G, Stucchi M (2010) The comparison of macroseismic intensity scales. J Seismol 14(2):413–428
Poulimenos A, Fassois S (2006) Parametric time-domain methods for non-stationary random vibration modelling and analysis: a critical survey and comparison. Mech Syst Signal Process 20(4):763–816
Richter CF (1958) Elementary seismology. W. H. Freeman and Co., San Francisco, pp 135–149, 650–653
Sarle WS (2002) USENET comp.ai.neural-nets neural networks FAQ, part II: subject: should I normalize/standardize/rescale the data?. ftp://ftp.sas.com/pub/neural/FAQ2.html#A_std. Accessed 24 June 2011
Searson D (2010) GPTIPS: genetic programming and symbolic regression for MATLAB. Software available at http://sites.google.com/site/gptips4matlab/home. Accessed 15 Jan 2011
Shabestari KT, Yamazaki F (2001) A proposal of instrumental seismic intensity scale compatible with MMI evaluated from three-component acceleration records. Earthq Spectra 17(4):711–723
Smola AJ, Schölkopf B (2004) A tutorial on support vector regression. Stat Comput 14:199–222
Sokolov VY (2002) Seismic intensity and Fourier acceleration spectra: revised relationship. Earthq Spectra 18(1):161–187
Stover C, Coffman J (1993) Seismicity of the United States, 1568–1989 (Revised). U.S. Geological Survey Professional Paper 1527
Trifunac MD, Brady AG (1975) On the correlation of seismic intensity scales with the peaks of recorded strong ground motion. Bull Seismol Soc Am 65(1):139–162
Tselentis GA, Danciu L (2008) Empirical relationships between modified Mercalli intensity and engineering ground-motion parameters in Greece. Bull Seismol Soc Am 98(4):1863–1875
Tselentis GA, Vladutu L (2010) An attempt to model the relationship between MMI attenuation and engineering ground-motion parameters using artificial neural networks and genetic algorithms. Nat Hazards Earth Syst Sci 10(12):2527–2537
Tung ATY, Wong FS, Dong W (1993) A neural networks based MMI attenuation model. In: National earthquake conference: earthquake hazard reduction in the central and eastern United States: a time for examination and action. Memphis, Tennessee, US
US Geological Survey (2011) Did you feel it? database. http://earthquake.usgs.gov/earthquakes/dyfi/. Accessed 15 Jan 2011
Wald D, Quitoriano V, Dengler L, Dewey JM (1999a) Utilization of the internet for rapid community intensity maps. Seismol Res Lett 70(6):680–697
Wald DJ, Quitoriano V, Heaton TH, Kanamori H (1999b) Relationships between peak ground acceleration, peak ground velocity, and modified Mercalli intensity in California. Earthq Spectra 15(3):557–564
Wald DJ, Quitoriano V, Dewey J (2006) USGS did you feel it? community internet intensity maps: macroseismic data collection via the internet. In: Proceedings of the first European conference on earthquake engineering and seismology. Geneva, Switzerland
Wood HO, Neumann F (1931) Modified Mercalli intensity scale of 1931. Bull Seismol Soc Am 21:277–283
Worden CB, Gerstenberger MC, Rhoades DA, Wald DJ (2012) Probabilistic relationships between ground-motion parameters and modified Mercalli intensity in California. Bull Seismol Soc Am 102:204–221