
DOI 10.1007/s10950-012-9291-x

ORIGINAL ARTICLE

Prediction of modified Mercalli intensity from PGA, PGV, moment magnitude, and epicentral distance using several nonlinear statistical algorithms

Diego A. Alvarez · Jorge E. Hurtado · Daniel Alveiro Bedoya-Ruíz

Received: 8 March 2011 / Accepted: 24 February 2012 / Published online: 21 March 2012

© Springer Science+Business Media B.V. 2012

Abstract Despite technological advances in seismic instrumentation, the assessment of the intensity of an earthquake using an observational scale, as given, for example, by the modified Mercalli intensity scale, is highly useful for practical purposes. In order to link the quantitative numbers extracted from the acceleration record of an earthquake and other instrumental data, such as peak ground velocity, epicentral distance, and moment magnitude, on the one hand, and the modified Mercalli intensity scale on the other, simple statistical regression has generally been employed.

In this paper, we will employ three methods

of nonlinear regression, namely support vector

regression, multilayer perceptrons, and genetic

programming in order to find a functional dependence between the instrumental records and the

modified Mercalli intensity scale. The proposed

methods predict the intensity of an earthquake

D. A. Alvarez · J. E. Hurtado · D. A. Bedoya-Ruíz
Universidad Nacional de Colombia,
Apartado 127, Manizales, Colombia
e-mail: daalvarez@unal.edu.co

J. E. Hurtado
e-mail: jehurtadog@unal.edu.co

D. A. Bedoya-Ruíz
e-mail: dabedoyar@unal.edu.co

inherent to the data. The nonlinear regressions

with good estimation results have been performed

using the Did You Feel It? database of the

US Geological Survey and the database of the

Center for Engineering Strong Motion Data for

the California region.

Keywords Modified Mercalli scale · Seismic intensity · Multilayer perceptron · Genetic programming · Support vector regression · Model identification · Ground motion · California

1 Introduction

The macroseismic intensity is an essential parameter of earthquake ground motion that allows

a simple and understandable description of earthquake damage on the basis of observed effects at a

given place. It is measured, for example, using the European Macroseismic Scale, the China Seismic Intensity Scale, the Mercalli-Cancani-Sieberg scale, the Modified Mercalli Intensity (MMI) scale, or the Japan Meteorological Agency (JMA) seismic intensity scale (see e.g., Grünthal 2011).

The intensity scales quantify the effects of a

strong motion on the Earth's surface, humans,

objects of nature, and man-made structures at a

given location based on detailed description of

indoor and outdoor effects that occur during the


shaking. For example, the modified Mercalli intensity (MMI) scale (Wood and Neumann 1931; Richter 1958) measures the effects of an earthquake using 12 degrees, with 1 denoting "not felt" and 12 "total destruction" (take into account that the US Geological Survey (USGS) no longer assigns intensities higher than 10 and has not assigned 10 in many decades; see e.g., Stover and Coffman (1993), Dengler and Dewey (1998); in addition, the current practice of using the MMI scale follows neither the wording of the version by Wood and Neumann (1931) nor that by Richter (1958) (Grünthal 2011)). The values will differ

based on the distance to the earthquake, with

the highest intensities being usually around the

epicentral area, and based on subjective data that

are gathered from individuals who have felt the

quake at that location. Take into consideration

that all modern scales have 12 degrees with the

exception of the JMA scale which was upgraded

from 7 to 10 degrees in 1996 (see e.g., Musson et al.

2010).

The assessment of the intensity of a quake at

a given location used to be a slow process, as it

was usually performed by means of personalized

surveys; however, with the advent of macroseismic intensity internet surveys (a list of some of these services can be found in Table 1), this is no longer the case. For example, when a strong shaking happens, these services quickly collect responses from people after the event (there can be in excess of 10,000 responses in less than half an hour). In any case, it is convenient to have an automated intensity measure based on instrumental data, so that intensity distribution maps can
be drawn almost immediately after the shaking.

With that information, emergency service systems

could prioritize their attention to those places

where the largest damage is expected. In addition,

the automatic shut-off of natural gas supplies after an earthquake in certain localities could be

planned.

In this paper, three methods are analyzed,

namely support vector regression, multilayer perceptrons, and genetic programming. The accuracy of the algorithms has been tested using the Did

You Feel It? (DYFI) database of the USGS and

the database of the Center for Engineering Strong

Motion Data (CESMD) for the California region.

The plan of the paper is as follows: first, we will

begin in Section 2 with a succinct review on previous studies on the topic and then in Section 3 we

will explain the mathematical background of the

nonlinear regression methods employed. With the

database described in Section 4, we will introduce

and then perform some numerical experiments in

Section 5 and thereafter, the analysis of results

will be done in Section 6. The paper ends in Section 7 with the conclusions.

Table 1 Macroseismic intensity internet survey services

Institution or agency: URL

Amateur Seismic Centre (India): http://asc-india.org/menu/felt-india.htm
British Geological Survey: http://www.earthquakes.bgs.ac.uk/questionnaire/EqQuestIntro.html
Central Institute for Meteorology and Geodynamics: http://www.zamg.ac.at/erdbeben/bebenbericht/index.php
European-Mediterranean Seismological Center: http://www.emsc-csem.org/Earthquake/Contribute/testimonies.php
GeoNet (New Zealand): http://www.geonet.org.nz/earthquake/
Istituto Nazionale di Geofisica e Vulcanologia: http://www.haisentitoilterremoto.it/
Le Bureau Central Sismologique Français: http://www.seisme.prd.fr/english.php
Natural Resources Canada: http://earthquakescanada.nrcan.gc.ca/dyfi/
Royal Observatory of Belgium: http://seismologie.oma.be/index.php?LANG=EN&CNT=BE&LEVEL=0
Servicio Geológico Colombiano: http://seisan.ingeominas.gov.co/RSNC/index.php
Swiss Seismological Service: http://www.seismo.ethz.ch/eq/detected/eq_form/index_EN
U.S. Geological Survey (USGS): http://earthquake.usgs.gov/earthquakes/dyfzi/


2 Previous studies

Many studies have attempted to relate the intensity scale with some earthquake parameters such as peak ground acceleration (PGA), peak ground velocity (PGV), peak ground displacement (PGD), the magnitude scale, and the epicentral distance, among others. For instance, Trifunac and Brady (1975) and Murphy and O'Brien (1977) have tried to correlate intensity with PGA, PGV, and PGD. In particular, for the California region, we can list the works of Wald et al. (1999b), Atkinson and Sonley (2000), and Atkinson and Kaka (2007). For example, Wald et al. (1999b) suggested the following piecewise linear relationships between PGA, PGV, and MMI:

MMI = -1.66 + 3.66 log10(PGA)   for log10(PGA) > 1.82   (1)

and

MMI = 2.35 + 3.47 log10(PGV)   for log10(PGV) > 0.76   (2)

Similarly, Atkinson and Kaka (2007) proposed:

MMI = -1.91 + 4.09 log10(PGA)   for log10(PGA) > 1.69   (3)

and

MMI = 3.54 + 3.03 log10(PGV)   for log10(PGV) > 0.48   (4)

while Worden et al. (2012) suggested:

MMI = -1.60 + 3.70 log10(PGA)   for log10(PGA) > 1.57   (5)

and

MMI = 2.89 + 3.16 log10(PGV)   for log10(PGV) > 0.53   (6)

In all the above equations, the PGA is expressed in centimeters per square second, while the PGV is given in centimeters per second.
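As a quick worked illustration, the Wald et al. (1999b) pair (Eqs. 1 and 2) can be coded directly. Note that only the branches quoted above are implemented; inputs below the quoted thresholds return None, since the lower branches are not reproduced in this excerpt:

```python
import math

def mmi_from_pga_wald(pga_cm_s2):
    """Eq. 1: MMI from PGA (cm/s^2); only the branch log10(PGA) > 1.82 is given above."""
    lp = math.log10(pga_cm_s2)
    return -1.66 + 3.66 * lp if lp > 1.82 else None

def mmi_from_pgv_wald(pgv_cm_s):
    """Eq. 2: MMI from PGV (cm/s); only the branch log10(PGV) > 0.76 is given above."""
    lv = math.log10(pgv_cm_s)
    return 2.35 + 3.47 * lv if lv > 0.76 else None
```

For example, a PGA of 100 cm/s² maps to MMI = -1.66 + 3.66 · 2 ≈ 5.7.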

Karim and Yamazaki (2002) performed similar research relating PGA, PGV, PGD, and the so-called spectrum intensity to the JMA seismic intensity scale. Other authors like Atkinson

and Sonley (2000), Sokolov (2002), Kaka and

Atkinson (2004) have developed empirical relationships between response spectra or Fourier acceleration spectra and modified Mercalli intensity.

Shabestari and Yamazaki (2001) developed some

expressions that related the JMA intensity scale

and the MMI scale. Tselentis and Danciu (2008)

derived empirical regression equations for MMI

and for various ground motion parameters such as

duration, cumulative absolute velocity, Housner's

spectrum intensity, and total elastic input energy

index.

Note that most of the aforementioned relationships were obtained using univariate and multivariate linear regression analysis and show large scatter. It is the authors' belief that those studies would have been better served by robust regression methods instead of standard linear regression in order to account for possible outliers, that is, observations that do not follow the pattern of the other observations and that usually occur when large measurement errors are present. Our preference for robust regression methods is based on the fact that least squares estimates for regression models are highly biased by the outliers that dominate the recordings from a few earthquakes, since the regression might be dragged towards them.

It seems, however, that the linear model chosen

in the previously mentioned papers comes just

from mathematical convenience. The relationship

that links the MMI scale and the instrumental data

is complicated enough that nonlinear methods

seem to be the best choice for a model. This is

the reason that motivated us to propose a new

approach in Section 5.


Nonlinear regression methods have also been used. For example, artificial neural networks have been employed by Tung et al. (1993), Davenport (2004), and Tselentis and Vladutu (2010)

in order to relate MMI and PGA, PGV, PGD,

response spectral acceleration, and response spectral velocity for selected frequencies, Arias intensity, and JMA intensity. All of them have reported

satisfactory results. In the present paper, we will

compare this approach to other nonlinear regression methods.

Finally, it is convenient to mention that, recently, Faenza and Michelini (2010), Kuehn and Scherbaum (2010), and Worden et al. (2012) proposed methods that treat the relationship of intensity and instrumental information without using an ordinary least squares regression methodology.

2.1 Commonly employed strong motion features

There are several quantities that are either directly instrumental or obtained after processing the acceleration records of the shaking and which play a preponderant role in the estimation of the seismic intensity. Some of the most popular ones, used in the investigations listed above, are the following:

- PGA, PGV, and PGD, defined as the geometric mean of the two horizontal components in the directions originally recorded. The first parameter is especially important because, according to Newton's second law, the PGA multiplied by the mass provides the maximum inertial force that affects the structure during the shaking;
- duration of the shaking;
- spectral content (characteristic periods) at some specific frequencies; according to Sokolov (2002), the frequency range of the amplitude spectrum |X(f)| of the acceleration record that is of interest from an engineering point of view lies between 0.3 and 14 Hz. In fact, the frequency band 0.78-2.0 Hz is representative for MMI greater than 8, while the 3.0-6.0 Hz range best represents MMI from 5 to 7 and the 7.0-8.0 Hz band correlates best with the lowest MMI;
- magnitude (moment magnitude, local magnitude, surface wave magnitude, or body wave magnitude);
- epicentral distance;
- amplitudes of acceleration, velocity, or displacement response spectra;
- regional propagation path (geological conditions) and local soil conditions, among others.

Other popular features obtained after the numerical processing of the acceleration record are as follows: the spectrum intensity, the Arias intensity, the cumulative absolute velocity, and the Japan Meteorological Agency seismic intensity reference acceleration value, among others.
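Two of the processed features just mentioned, the Arias intensity and the cumulative absolute velocity (CAV), follow directly from their usual definitions, Ia = (π/2g) ∫ a(t)² dt and CAV = ∫ |a(t)| dt; these formulas are not restated in the text above, so the sketch below works from the standard definitions, using simple rectangle-rule integration:

```python
import numpy as np

G = 981.0  # gravitational acceleration in cm/s^2

def arias_intensity(acc, dt):
    """Ia = (pi / (2 g)) * integral of a(t)^2 dt (rectangle rule); acc in cm/s^2."""
    return np.pi / (2.0 * G) * float(np.sum(acc ** 2)) * dt

def cumulative_absolute_velocity(acc, dt):
    """CAV = integral of |a(t)| dt (rectangle rule); acc in cm/s^2, result in cm/s."""
    return float(np.sum(np.abs(acc))) * dt
```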

It is the authors' opinion that more informative features can be extracted from the acceleration records by using the different techniques that the field of digital signal processing offers, even though those numbers may lack a clear physical interpretation. Some of these features might be extracted by means of the wavelet and short-time Fourier transforms (see e.g., Mallat 2008) and time-varying ARMA processes (see e.g., Poulimenos and Fassois 2006), among others. This will be the topic of a future article.
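As a hint of how such spectral features could look, here is a minimal sketch that averages the Fourier amplitude spectrum |X(f)| over the Sokolov (2002) frequency bands quoted earlier; the band list and the choice of averaging are illustrative, not the paper's method:

```python
import numpy as np

def band_amplitudes(acc, dt, bands=((0.78, 2.0), (3.0, 6.0), (7.0, 8.0))):
    """Mean Fourier amplitude |X(f)| of an accelerogram within each band (Hz)."""
    freqs = np.fft.rfftfreq(len(acc), d=dt)
    amp = np.abs(np.fft.rfft(acc)) * dt          # discrete amplitude spectrum
    feats = []
    for lo, hi in bands:
        mask = (freqs >= lo) & (freqs <= hi)
        feats.append(float(amp[mask].mean()) if mask.any() else 0.0)
    return feats
```

A 4-Hz test signal, for instance, should dominate the middle (3.0-6.0 Hz) band.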

3 Methods of nonlinear regression

It was said in Section 2 that the relationship that links the MMI scale and the instrumental data is complicated enough that nonlinear methods seem to be the best choice for a model. In the following, we will briefly introduce three methods of nonlinear regression, namely support vector regression, multilayer perceptrons, and genetic programming, that will help us find a functional dependence between the instrumental records and the modified Mercalli intensity scale.

3.1 Support vector regression

In the following, we will make a succinct overview of the theory behind support vector regression (SVR); for a more thorough coverage of the method, the reader is referred to the tutorial by Smola and Schölkopf (2004).

Suppose we are given a training dataset D = {(x1, y1), (x2, y2), . . . , (xn, yn)}, where xi ∈ R^d is the i-th input vector and yi ∈ R is its corresponding output; the idea of SVR is to fit the data to a function of the form:

    f(x; λ, b) = Σ_{i=1}^n λi K(x, xi) + b   (7)

There are several options for a kernel function K, and here, we considered the popular radial basis kernel function, which is given by K(x, y) = exp(-γ ||x - y||²), where γ > 0 is a scalar parameter that defines the kernel function. The weights λi, the bias b, and the parameter γ are chosen so that the deviation from each target yi is at most ε for all training data and, at the same time, f is as flat as possible. In this way, the data will lie around an ε-insensitive tube, as seen in Fig. 1. Here, ε > 0 is the width of an insensitive zone that controls the amount of noise that the algorithm can handle. For instance, when a training sample has no noise, one may set ε = 0.

The flatness of the function f can be ensured by minimizing the fitting error

    E(λ, b) = (1/n) Σ_{i=1}^n E_ε( f(xi; λ, b) - yi ) + Ω(λ)   (8)

where Ω(λ) is a term that penalizes non-flat functions, using the ε-insensitive loss function E_ε, which gives zero error if the absolute deviation from the target y is less than the constant ε:

    E_ε( f(x) - y ) = 0 if | f(x) - y | < ε;  | f(x) - y | - ε otherwise   (9)

This loss function makes the SVR rather different from traditional error minimization problems and very robust to outliers.
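Eq. 9 in code form, a one-liner, included because this loss is the crux of the SVR's robustness to outliers:

```python
def eps_insensitive_loss(residual, eps):
    """Eq. 9: zero inside the epsilon-tube, linear growth outside it."""
    return max(0.0, abs(residual) - eps)
```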

As stated before, the SVR algorithm tries to position the ε-insensitive tube around the data as shown in Fig. 1. This is achieved by minimizing the error (Eq. 8). In order to allow points to lie outside the ε-insensitive tube, let us define the so-called slack variables ξi, ξi*, which represent the distance from the actual values yi to the corresponding boundary values of the ε-insensitive tube. For each point xi, we need two slack variables ξi ≥ 0 and ξi* ≥ 0. If xi lies above the ε-tube, then ξi > 0 and ξi* = 0; if xi lies below

Fig. 1 The ε-insensitive tube; examples of the slack variables ξ and ξ* are also shown

Fig. 2 Plot of the ε-insensitive error function (solid line), in which the error increases linearly with distance beyond the insensitive region. Also shown for comparison is the quadratic error function (dashed line)


the ε-tube, then ξi* > 0 and ξi = 0; and if xi lies inside the ε-tube, then ξi = ξi* = 0. Introducing the slack variables allows points to lie outside the tube, provided the slack variables are nonzero, and in this way:

    yi ≤ f(xi; λ, b) + ε + ξi    (10)

    yi ≥ f(xi; λ, b) - ε - ξi*   (11)

Then, the minimization of the error (Eq. 8) becomes (see e.g., Smola and Schölkopf 2004):

    min_{λ,b}  C Σ_{i=1}^n (ξi + ξi*) + Ω(λ)   (12)

subject to the constraints

(10) and (11). Here, C is another parameter

that is used to control noise and that determines

the trade-off between the flatness of f and the

amount up to which deviations larger than are

tolerated. The optimization problem (12) can be expressed using its dual Lagrangian formulation as the quadratic optimization problem (see e.g., Smola and Schölkopf 2004):

    max_{α,α*}  -(1/2) Σ_{i=1}^n Σ_{j=1}^n (αi - αi*)(αj - αj*) K(xi, xj) + Σ_{i=1}^n (αi - αi*) yi - ε Σ_{i=1}^n (αi + αi*)   (13)

subject to 0 ≤ αi ≤ C, 0 ≤ αi* ≤ C for i = 1, 2, . . . , n, and Σ_{i=1}^n (αi - αi*) = 0, where the αi and αi* are the Lagrange multipliers that appeared when including the constraints mentioned above into the Lagrangian formulation; the weights of Eq. 7 are then recovered as λi = αi - αi*.

Usually, ε is set a priori, and the parameters C and γ are found by means of an optimization that produces the minimum error in Eq. 8 after performing statistical cross-validation. Cross-validation is a technique for assessing how the results of a statistical analysis will generalize to an independent dataset. One round of cross-validation involves partitioning the dataset into complementary subsets, performing the analysis on one subset and validating the analysis on the other subset. To reduce variability, multiple rounds of cross-validation are performed using different partitions, and the validation results are averaged over the rounds.
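The cross-validation procedure just described can be sketched generically; `fit` and `predict` are hypothetical user-supplied callables standing in for any regression method, and the fold counts are illustrative:

```python
import numpy as np

def kfold_cv_error(fit, predict, X, y, k=5, rounds=3, seed=0):
    """Average held-out MSE over several rounds of k-fold cross-validation.

    fit(X, y) -> params and predict(params, X) -> predictions are
    placeholders for whatever regression method is being assessed.
    """
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(rounds):                        # several rounds, new partition each time
        folds = np.array_split(rng.permutation(len(y)), k)
        for i in range(k):
            val = folds[i]                         # held-out fold
            trn = np.concatenate([folds[j] for j in range(k) if j != i])
            params = fit(X[trn], y[trn])           # analysis on one subset
            pred = predict(params, X[val])         # validation on the complementary one
            errors.append(float(np.mean((pred - y[val]) ** 2)))
    return float(np.mean(errors))                  # averaged over all rounds
```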

It can be shown that Eq. 13 is a convex quadratic optimization problem and, in consequence, it has a unique solution. The data points xi whose Lagrange multipliers are different from zero are the so-called support vectors. Support vector regression has the property that the solution is sparse, that is, most of the λi are zero.

The parameter b is found for a data point for which 0 < αi < C or 0 < αi* < C by solving the equation:

    b = yn - Σ_{i=1}^n λi K(xn, xi)   (14)

In practice, it is advisable to average over all such

estimates of b. In short, the steps to train the SVR are as follows:

1. Choose ε > 0, γ > 0, and C > 0.
2. Estimate the generalization error for ε, γ, and C using only the training set and the leave-one-out methodology. On each step of the leave-one-out, one has to solve the optimization problem (13) in order to find the λi-s and then find b from Eq. 14. The generalization error is estimated as the average error (Eq. 8) obtained with the training set during all stages of the leave-one-out.
3. Choose new values of ε > 0, γ > 0, and C > 0 and repeat step 2, until a low generalization error is found.

Once appropriate values of ε, γ, and C are chosen, the predicted value of the regression is found when x is used as an input to Eq. 7, which uses the weights λ and the bias b obtained after performing the previous optimizations. All these steps can be easily performed using the LIBSVM software (Chang and Lin 2011).
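The tuning loop of steps 1-3 can also be carried out with scikit-learn, whose SVR estimator wraps LIBSVM internally; the toy data and grid values below are illustrative, not the settings used in the paper:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV

# synthetic 1-D regression problem standing in for the (features, MMI) data
rng = np.random.default_rng(1)
X = rng.uniform(-3.0, 3.0, size=(80, 1))
y = np.sin(X[:, 0]) + 0.05 * rng.standard_normal(80)

param_grid = {"C": [1.0, 10.0, 100.0],          # noise/flatness trade-off
              "gamma": [0.1, 1.0],              # RBF kernel parameter
              "epsilon": [0.01, 0.1]}           # width of the insensitive tube
search = GridSearchCV(SVR(kernel="rbf"), param_grid, cv=5,
                      scoring="neg_mean_squared_error")
search.fit(X, y)
model = search.best_estimator_                   # SVR with the tuned C, gamma, epsilon
```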

3.2 Multilayer perceptrons

An artificial neural network is a mathematical

model that is inspired by the structure and/or

functional aspects of biological neural networks.

Modern neural networks are nonlinear statistical

data modeling tools. They are usually used to

model complex relationships between inputs and

outputs or to find patterns in the data.

The most popular neural network is the so-called multilayer perceptron (MLP). An MLP (see Fig. 3) consists of interconnected layers (an input, a hidden, and an output layer) of processing units or neurons. For example, the network shown in Fig. 3 has d units in the input layer, p neurons in the hidden layer, and a single neuron in the output layer. As seen in Fig. 3, the flow of information goes from left to right and is altered by means of some parameters (the so-called weights w0ji and w1kj, and biases b0j and b1k) and the functions g and h. These functions (called activation functions in neural network terminology) are usually set to g(a) = tanh(a) and h(a) = a. An MLP is a parameterized, adaptable vector function which may be trained to perform classification

or regression tasks. Given a training dataset D =

Fig. 3 Topology of a multilayer perceptron with a single output unit. This network has d inputs, p neurons in the hidden layer, and a single output. w0ji represents the weights between the j-th neuron of the hidden layer and the i-th input, while w11j stands for the weights between the output neuron and the j-th neuron of the hidden layer; b0j represents the bias weight of the j-th neuron of the hidden layer, and b11 symbolizes the bias weight of the output layer

{(x1, y1), (x2, y2), . . . , (xn, yn)}, where xr ∈ R^d is the r-th input vector and yr ∈ R is its corresponding output, the idea of an MLP with a single output unit is to estimate a function f : R^d → R of the form:

    f(x; w0, w1, b0, b1) = h( Σ_{j=1}^p w11j g( Σ_{i=1}^d w0ji xi + b0j ) + b11 )   (15)

The number of neurons p in the hidden layer must

be chosen so that the fitting of the network to the data is adequate: if too few neurons are used, the model will be unable to represent complex data, and the resulting fit will be poor; if too many neurons are used, the network may overfit the data, that is, fit the training data extremely well but generalize poorly to new, unseen data. Therefore, a validation set must be used to find the appropriate number of neurons in the hidden layer.

Note that the parameters w0ji, w11j, b0j, and b11 were grouped as the arrays w0, w1, b0, and b1, respectively. They are found by minimizing the quadratic error function (see Fig. 2):

    E(w0, w1, b0, b1) = (1/n) Σ_{r=1}^n ( yr - f(xr; w0, w1, b0, b1) )²   (16)

For further

details of neural network training, see for example

Haykin (1998), Bishop (2007).

Several authors have shown that under some

assumptions, MLPs are universal approximators;

that is, if the number of hidden nodes p is allowed

to increase towards infinity, they can approximate

any continuous function with arbitrary precision

(see e.g., Hornik et al. 1989).
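A minimal numpy sketch of the network of Eq. 15 trained by plain gradient descent on the quadratic error of Eq. 16; the layer sizes, learning rate, and sine target are illustrative choices, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)
d, p, n = 1, 8, 200                        # inputs, hidden neurons, samples
X = rng.uniform(-2.0, 2.0, size=(n, d))
y = np.sin(X[:, 0:1])                      # toy target function

W0 = rng.normal(0.0, 0.5, size=(p, d)); b0 = np.zeros((p, 1))
w1 = rng.normal(0.0, 0.5, size=(1, p)); b1 = np.zeros((1, 1))

def forward(X):
    A = np.tanh(W0 @ X.T + b0)             # hidden layer, g = tanh
    return A, (w1 @ A + b1).T              # output layer, h = identity (Eq. 15)

_, f = forward(X)
mse0 = float(np.mean((f - y) ** 2))        # error of Eq. 16 before training

lr = 0.02
for _ in range(5000):
    A, f = forward(X)
    dEdf = 2.0 * (f - y) / n               # gradient of Eq. 16 w.r.t. the output
    dw1 = dEdf.T @ A.T                     # (1, p)
    db1 = dEdf.sum(axis=0, keepdims=True)  # (1, 1)
    delta = (w1.T @ dEdf.T) * (1.0 - A ** 2)   # backprop through tanh, (p, n)
    dW0 = delta @ X                        # (p, d)
    db0 = delta.sum(axis=1, keepdims=True)
    W0 -= lr * dW0; b0 -= lr * db0; w1 -= lr * dw1; b1 -= lr * db1

_, f = forward(X)
mse = float(np.mean((f - y) ** 2))         # error after training
```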

However, there are a number of problems with MLPs: (a) there is no theoretically sound way of choosing the network topology; (b) for a given architecture, learning algorithms often end up in a local minimum of E instead of a global minimum; and (c) they are black-box solutions to the problem. In any case, these drawbacks do not compromise the reliability of the fittings performed in this paper. They simply indicate that other nonlinear models like SVR have better theoretical properties for regression than MLPs.

3.3 Genetic programming

Genetic programming (GP) is a problem-solving

approach inspired by biological evolution in

which computer programs (mathematical formulas, computer programs, logical expressions, etc.)

are evolved in order to find solutions to problems

that perform a user-defined task. The solution

method is based on the Darwinian principle of

survival of the fittest and is closely related to the

field of genetic algorithms (GA). There are three

main differences between GA and GP: (a) Structure: GP usually evolves tree structures, while GA

evolve binary or real number strings. (b) Programs

vs. binary strings: GP usually evolves computer

programs while GA typically operate on coded

binary strings. (c) Variable vs. fixed length: In

traditional GAs, the length of the binary string is

fixed before the solution procedure begins. However, a GP tree can vary in length throughout the

execution. The theory behind genetic programming is large. Here, just a brief review of its main

concepts will be given. The interested reader is

referred to Koza (1992) for an ample discussion

on the topic.

Genetic programming uses the following steps

to solve problems:

1. Generate an initial population of computer

programs

2. Iteratively perform the following sub-steps on

the population until the termination criterion is

satisfied:

a. Execute each program in the population

and assign it a fitness value according to

how well it solves the problem

b. Create a new population by applying the following evolutionary operators with a certain probability:

- Reproduction: selects an individual from within the current population so that it can have an offspring. There are several ways of choosing which individual deserves to breed, including fitness-proportionate selection, rank selection, and tournament selection.
- Crossover: mimics sexual recombination in nature; two parents are chosen and parts of their trees are swapped in such a way that each crossover operation results in a legal structure.
- Mutation: causes random changes in an individual before it is introduced into the subsequent population. During mutation, it may happen that all functions and terminals beneath an arbitrarily determined node are removed and a new branch is randomly created, or that a single node is swapped for another.

3. The best computer program that appears in

any generation is designated as the result of

genetic programming.

One of the main uses of genetic programming is to evolve relationships between variables; this task is known as symbolic regression.
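The three-step loop above can be sketched for symbolic regression over expression trees; everything here (the operator set, population size, and probabilities) is an illustrative toy, not the configuration used in the paper:

```python
import random

OPS = {"add": lambda a, b: a + b,      # function set
       "sub": lambda a, b: a - b,
       "mul": lambda a, b: a * b}

def random_tree(depth):                # step 1: a random program
    if depth == 0 or random.random() < 0.3:
        return ("x",) if random.random() < 0.5 else ("const", random.uniform(-2, 2))
    return (random.choice(sorted(OPS)), random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):                 # step 2a: execute a program
    if tree[0] == "x":
        return x
    if tree[0] == "const":
        return tree[1]
    return OPS[tree[0]](evaluate(tree[1], x), evaluate(tree[2], x))

def paths(tree, prefix=()):            # addresses of every subtree
    yield prefix
    if tree[0] in OPS:
        yield from paths(tree[1], prefix + (1,))
        yield from paths(tree[2], prefix + (2,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def put(tree, path, sub):              # replace the subtree at `path` by `sub`
    if not path:
        return sub
    parts = list(tree)
    parts[path[0]] = put(parts[path[0]], path[1:], sub)
    return tuple(parts)

def fitness(tree, xs, ys):             # mean squared error (lower is fitter)
    total = 0.0
    for x, y in zip(xs, ys):
        d = evaluate(tree, x) - y
        total += d * d
    return total / len(xs)

def tournament(pop, fits, k=3):        # tournament selection
    return pop[min(random.sample(range(len(pop)), k), key=lambda i: fits[i])]

def gp_regress(xs, ys, pop_size=60, gens=30):
    pop = [random_tree(3) for _ in range(pop_size)]
    best = min(pop, key=lambda t: fitness(t, xs, ys))
    for _ in range(gens):
        fits = [fitness(t, xs, ys) for t in pop]
        new_pop = []
        for _ in range(pop_size):
            a, b = tournament(pop, fits), tournament(pop, fits)
            child = put(a, random.choice(list(paths(a))),      # crossover
                        get(b, random.choice(list(paths(b)))))
            if random.random() < 0.2:                          # mutation
                child = put(child, random.choice(list(paths(child))), random_tree(2))
            new_pop.append(child)
        pop = new_pop
        cand = min(pop, key=lambda t: fitness(t, xs, ys))
        if fitness(cand, xs, ys) < fitness(best, xs, ys):      # step 3: keep the best
            best = cand
    return best
```

Run on samples of a simple target such as y = x² + x, the loop evolves trees whose error falls well below that of any single terminal.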

4 Earthquake data

Two earthquake datasets have been used in this

study. The first one is the USGSs Did You Feel

It? database (U.S. Geological Survey 2011) which

collects, by means of internet surveys, information

about how people actually experienced the earthquakes. The form of the questionnaire employed

in the DYFI database and the method for assignment of intensities are based on an algorithm

developed by Dengler and Dewey (1998).

In this dataset, one can find for a given earthquake a table of modified Mercalli intensities aggregated by city or postal code, the number of responses for that region, the epicentral distance, and a representative latitude and longitude of the

surveyed region. In addition, it is possible to find

in the same database the depth of the earthquake

and the latitude and longitude of the epicenter.

The second employed database is the one of

the Center for Engineering Strong Motion Data

(CESMD - Center for Engineering Strong Motion

Data 2011). Here, one can find for some representative earthquakes the actual accelerograms of the

shakings measured at different stations. For each

station, one can find its code and name, its latitude

and longitude, and for the given earthquake, its

epicentral distance, magnitude, PGA, PGV, PGD, and the amplitudes of the acceleration response spectra at 0.3, 1, and 3 s. However, one must

take into consideration that, usually, not all of the

above-mentioned data are available at the same

time in the CESMD database.

The records of the earthquakes that occurred in the California region after 2000, plus the records of the Loma Prieta earthquake of October 17, 1989, and the Petrolia earthquake of April 25, 1992, were employed. All of the available records

of the CESMD database were used, but it was

found out that the records prior to 2000, except

the two formerly mentioned, gave incorrect MMI

predictions in the regressions performed (as we

could find out in our initial numerical experiments). This is in agreement with a comment made by Vince Quitoriano of the USGS, who told us in an email: "We [the USGS] did not start collecting internet responses [in the DYFI database] until mid-1998. In that time period, Hector Mine was the only large event that people responded to immediately. Most of the pre-2000 data are actually entries from people recalling historical events (Northridge, Whittier Narrows, etc.) much later, rather than responding to a current earthquake." And later he said: "Note that all the data in DYFI are from internet questionnaires; we did not process any paper surveys. The questionnaire itself and the underlying algorithm have not changed from Wald et al. (1999a)." We allowed

the records of the Loma Prieta and the Petrolia

earthquake in any case because the records with

MMI intensity 7 and above were scarce, and those

earthquakes provided us with that information.

Symbolic regression via genetic programming is a branch of empirical modeling that evolves summary expressions for available data. For a long time, symbolic regression was a domain only for us humans; however, over the last few decades, it has become that of computers as well. Unique benefits of symbolic regression include human insight and, in some cases, interpretability of model results, identification of key variables and variable combinations, and the generation of computationally simple models for deployment into operational models.

Fig. 4 Histogram of the number of responses for a single MMI reading in the training dataset

Fig. 5 Map of the locations where the MMI readings used in this study were reported. All of the 843 readings were located in California

A computer program was employed to automatically download, match, and parse the data from the databases using the Event ID of the earthquake. We chose to correlate only those records where the strong motion station was near an MMI observation. For each station, the nearest observation intensity with at least four reports was chosen, with the intention of decreasing the internal variability of the MMI readings (remember that the standard error of the mean of an independent and identically distributed sample is given by σ/√n, where σ is the standard deviation of the sample and n is the number of samples used to estimate the mean). Figure 4 shows the histogram of the

number of responses for a single MMI reading

in the training dataset. It can be seen from this

histogram that most of the MMI observations are

the average of a large number of responses. In

consequence much of the inherent variability of

the MMI readings has been removed from the

dataset. We used only the station readings that were within 1.0 km of the MMI reading when the modified Mercalli intensity was less than 6 and within 3 km when the MMI reading was greater than or equal to 6. For measuring that distance,

we used the latitude and longitude position of

the stations/MMI readings. All other data were

disregarded. Using those criteria, we came up with

a database composed of 843 station recordMMI

observation pairs coming from 63 earthquakes.

Figure 5 shows a map that indicates the locations

where the MMI readings that were used in this

study were reported. Since most of the readings

were done in places with high concentration of

population, the reported MMI data in the DYFI

database are the average of the reported MMI of

small ZIP regions. This fact explains as well the

small variability of the reported MMI readings in

the DYFI database.

From the above database, four representative features were coupled to the MMI reading, namely moment magnitude, epicentral distance, PGA, and PGV. We intended to include depth as well, but the high variability of this variable kept us from including it in the analysis. Sometimes, there was conflicting information, and in that case, we deferred to the DYFI database.

Table 2 Spearman correlation coefficients between the analyzed variables

                      MMI    Epic. dist.   PGA    PGV    Mom. mag.   Depth
MMI                   1.00   -0.32         0.79   0.73   0.26        0.14
Epicentral distance  -0.32    1.00        -0.52   0.01   0.66        0.23
PGA                   0.79   -0.52         1.00   0.72   0.01        0.30
PGV                   0.73    0.01         0.72   1.00   0.55        0.15
Moment magnitude      0.26    0.66         0.01   0.55   1.00        0.17
Depth                 0.14    0.23         0.30   0.15   0.17        1.00
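For reference, rank correlations like those of Table 2 can be computed with scipy; the five records below are made up for illustration, not taken from the paper's dataset:

```python
import numpy as np
from scipy.stats import spearmanr

# hypothetical mini-sample of five station record-MMI observation pairs
mmi  = np.array([3.1, 4.4, 5.0, 5.8, 6.3])
dist = np.array([180.0, 95.0, 60.0, 22.0, 8.0])   # epicentral distance, km
pga  = np.array([4.0, 12.0, 30.0, 80.0, 150.0])   # cm/s^2

# perfectly monotone relations give rank correlations of exactly -1 and +1
rho_mmi_dist = spearmanr(mmi, dist).correlation
rho_mmi_pga = spearmanr(mmi, pga).correlation
```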


Fig. 6 Plot for the training dataset; the lines show the relationships given by Atkinson and Kaka (2007), Worden et al. (2012), and Equation (24)

To describe the training dataset, Table 2 shows the Spearman correlation coefficients between the analyzed variables. This information shows not only the contribution of each component in explaining the MMI but also the redundancy of information between variables. It can be seen from this table that the MMI tends to grow when the epicentral distance decreases and when the PGA, PGV, or the moment magnitude increases. Figures 6, 7, and 8 plot these variables against the MMI.

Fig. 8 Plot of PGV vs. MMI for the training dataset. The lines represent the relationships between these variables given by Eqs. 2, 4, 6, and 24

The epicentral distance does not seem to have a strong relationship with MMI. We have included in Fig. 7 the linear regressions (1), (3), and (5) that associate PGA and MMI, while in Fig. 8, we have included Eqs. 2, 4, and 6. Note that some of these regressions tend to overestimate the MMI with respect to the actual database.

Figure 9 shows a plot relating moment magnitude and MMI. It can be seen from this plot that the correlation between these two variables is rather weak, in agreement with the Spearman coefficient reported in Table 2.

Fig. 7 Plot of PGA vs. MMI for the training dataset. The lines represent the relationships between these variables given by Eqs. 1, 3, 5, and 23

[Figure panels: histograms (frequency) of variables of the training dataset.]

our database because this variable, together with the epicentral distance, provides a better impression of the intensity of the quake.

It is important, as well, to know the distribution of the training data, so that we can know in which regions there is enough information to estimate the MMI with the relationships that will be proposed in Section 5. In this sense, Figs. 10, 11, 12, 13, and 14 show the histograms of the different variables. One can deduce that the regressions performed are reliable when the expected MMI is between 2 and 6.5, the epicentral distance is less than 400 km, the PGA is less than 0.2 g, the PGV is less than 20 cm/s, and the moment magnitude is less than 7.2. In principle, one could use a regression model beyond the extreme values found in the dataset in order to predict higher intensities (in this case, MMIs greater than 6.5 in the current database) or use, for example, the model for PGAs larger than 0.2 g. Even though this is tempting, it is not advisable, because extrapolation of a model outside the parameter boundaries of its underlying dataset can be dangerous (see, for example, Bommer et al. 2007, for a discussion about the extrapolation of

[Figure panels: histograms (frequency) of the epicentral distance (0–400 km), the PGV (0–40 cm/s), and the moment magnitude (4–7.5) of the training dataset.]

ground-motion models). This is a kind of curse for all prediction equations based on data, and in this case, the only safe solution is to employ a regression model fitted to a database that covers the range of information in which the extrapolation is to be performed. In other words, regression algorithms are, in general, expected to perform well in regions of the space of variables where there exists a set of points modeling similar characteristics, but they are not so good at extrapolation, inasmuch as those results are subject to greater uncertainty.

It is necessary to note that the differences in

earthquake parameters such as source mechanism, regional tectonics, propagation path properties, and geological and geotechnical conditions

are not taken into account in the present study.

Therefore, these factors are considered to be random variables affecting the ground motion parameters for a given location and intensity level.

In the next lines, we will present the results

of our numerical experimentation performed with

those 843 observations.

5 Numerical experimentation

Using the algorithms described in Section 3, we related the MMI to the moment magnitude, the epicentral distance, and the PGA and PGV measured at the closest station to the observation. Even though the MMI is reported as a Roman numeral, we will use it here as a continuous real number (in Arabic notation), so that when it is rounded to the closest integer, it coincides with its corresponding measurement on the modified Mercalli intensity scale. In fact, the MMI in the DYFI database is expressed by a real number with one decimal digit of precision (see Wald et al. 2006).

In order to validate the training, we randomly split the 843 observations into three sets: a training, a validation, and a testing set with 506 (60 %), 126 (15 %), and 211 (25 %) elements, respectively.
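The random split described above can be sketched as follows (an illustrative reconstruction, not the authors' code; the seed and the use of numpy are our own choices):

```python
import numpy as np

def split_dataset(n_obs, fractions=(0.60, 0.15, 0.25), seed=0):
    """Randomly split n_obs sample indices into training,
    validation, and testing index sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_obs)
    n_train = round(fractions[0] * n_obs)      # 60 % for training
    n_valid = round(fractions[1] * n_obs)      # 15 % for validation
    return (idx[:n_train],                     # training indices
            idx[n_train:n_train + n_valid],    # validation indices
            idx[n_train + n_valid:])           # testing indices

train, valid, test = split_dataset(843)
```

With 843 observations, the rounded fractions yield the 506/126 training and validation sizes quoted above, the remainder going to the testing set.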

Before using the algorithms that we will describe below, each variable in the training set was either normalized or standardized.

In the first case (which was applied before training the MLP), each variable in the training set was normalized (so that it takes a value in the interval [−1, 1]) by means of the equation:

z_k = 2 (X_k − min(X_k)) / (max(X_k) − min(X_k)) − 1    (17)

using the minima and maxima that can be found in Table 3.

In the second case (which was applied before training the SVR and GP algorithms), the standardization was performed by subtracting the mean and dividing by the standard deviation using the equation:

z_k = (X_k − mean(X_k)) / std(X_k)    (18)

and employing the means and standard deviations that can be found in Table 3. Even though this procedure is inspired by the normalization of Gaussian random variables, it is applicable to any kind of distribution, since the idea is to reduce the spread in the data. Both methods are popular in nonlinear regression for making the input variables rather small in order to improve the numerical stability of the employed algorithms, regardless of the distribution of the data. In other words, this process tends to make the training

Table 3 Mean, standard deviation, minimum, and maximum of each variable of the dataset

Variable                        Mean      Standard deviation   Minimum   Maximum
MMI                             3.6425    0.9669               2.0       7.5
x1 = epicentral distance (km)   87.4656   102.9749             1.6       393.7
x2 = PGA (g)                    0.0460    0.0619               0.002     0.588
x3 = PGV (cm/s)                 3.4752    5.4888               0.080     62.88
x4 = moment magnitude           5.2261    0.9574               4.0       7.2

The mean and standard deviation correspond to the training set, and the minimum and maximum values correspond to the whole database

process better behaved by improving the numerical condition of the underlying optimization algorithms employed and ensuring that various default

values involved in initialization and termination of

the algorithms are appropriate (see for instance,

Sarle 2002).
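Equations 17 and 18 amount to the following two helper functions (a minimal sketch; the example min/max and mean/std values are taken from the moment-magnitude row of Table 3):

```python
import numpy as np

def normalize(X, x_min, x_max):
    """Eq. 17: map each variable linearly onto the interval [-1, 1]."""
    return 2.0 * (X - x_min) / (x_max - x_min) - 1.0

def standardize(X, mean, std):
    """Eq. 18: subtract the mean and divide by the standard deviation."""
    return (X - mean) / std

# Example with the moment magnitude (x4) statistics reported in Table 3
x4 = np.array([4.0, 5.2261, 7.2])
print(normalize(x4, x_min=4.0, x_max=7.2))       # endpoints map to -1 and +1
print(standardize(x4, mean=5.2261, std=0.9574))  # the mean maps to 0
```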

We performed the nonlinear regressions with

the algorithms described in Section 3 in the following way:

The quadratic optimization (Eq. 13) is convex, and therefore, its solution is unique. Hence, there was no need to use a validation set, and in consequence, the training and the validation sets were merged. After setting ε a priori to 0.5 (we chose this value because 0.5 is half the distance between two MMI degrees), the constants C = 494.559 and γ = 0.0625 were found by means of an optimization that selected the parameters which produced the minimum least squares error in a leave-one-out cross-validation. The training was performed using the LIBSVM software (Chang and Lin 2011), obtaining 147 support vectors, which can be found together with their corresponding weights in Appendix 1. Normalization by means of Eq. 17 was only performed on the inputs, since it was found in this case that without normalization of the outputs, the algorithm provided slightly better results.

In Appendix 2.1, we have included the MATLAB code that calculates an estimation of

the MMI from the epicentral distance, PGA,

PGV, and moment magnitude using the SVR

algorithm.
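Once trained, the SVR predicts through its support vectors as f(z) = Σ_i α_i K(z_i, z) + b; a numpy sketch of that evaluation, assuming a Gaussian (RBF) kernel, where the support vectors, weights, bias, and kernel parameter below are toy placeholders rather than the actual values of Appendix 1:

```python
import numpy as np

def svr_predict(z, sv, alpha, b, gamma):
    """Evaluate an RBF-kernel SVR: f(z) = sum_i alpha_i * K(sv_i, z) + b."""
    # Gaussian (RBF) kernel between every support vector and the query point
    k = np.exp(-gamma * np.sum((sv - z) ** 2, axis=1))
    return float(alpha @ k + b)

# Toy example: 3 "support vectors" in the 4-D standardized input space
sv = np.array([[0.0, 0.0, 0.0, 0.0],
               [1.0, 0.0, 0.0, 0.0],
               [0.0, 1.0, 0.0, 0.0]])
alpha = np.array([1.0, -0.5, 0.25])
print(svr_predict(np.zeros(4), sv, alpha, b=0.1, gamma=0.0625))
```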

Using the neural network toolbox of MATLAB (specifically the utility nftool), we trained a multilayer perceptron with three hidden units several times with the help of the Levenberg–Marquardt training method (see, e.g., Bishop 2007). The number of hidden units, p, was chosen so that the model was neither underfitting nor overfitting the validation set. In this case, we set p = 3. The training basically tried to minimize Eq. 16, and it halted when the error on the validation set began to increase (this is called an early-stopping strategy). The MLP that produced the smallest error (according to Eq. 16) on the testing set was chosen for the results reported below. The weights of that network are as follows:

w0 = [0.1812  5.4063  0.0084  0.3572; …],   w1 = [0.4873  0.2759  1.4386]

and the biases are the following:

b0 = [3.9342  3.0765  6.5198]^T   and   b1 = 1.1465

Remember that variable normalization using Eq. 17 must be performed before using the aforementioned weights. The MMI is calculated from the output of the network using the equation:

MMI = (output_NN + 1)(max(MMI) − min(MMI))/2 + min(MMI)

In the symbolic regression given below, z1, z2, z3, and z4 are the standardized epicentral distance, PGA, PGV, and magnitude, respectively, which were obtained by means of Eq. 18, and the MMI can be retrieved from the output y using the formula:

MMI = y std(MMI) + mean(MMI) = 0.9669 y + 3.6425    (19)

In Appendix 2.2, we have included the

MATLAB code that estimates the MMI from

the epicentral distance, PGA, PGV, and moment magnitude using the multilayer perceptron

algorithm.
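The evaluation of the trained network is a single tanh hidden layer followed by a linear output unit; a sketch of that forward pass with hypothetical weights of the reported shapes (the actual values are listed above and in Appendix 2.2):

```python
import numpy as np

def mlp_forward(z, w0, b0, w1, b1):
    """One-hidden-layer MLP: p = 3 tanh hidden units and a linear output."""
    h = np.tanh(w0 @ z + b0)   # hidden-layer activations
    return float(w1 @ h + b1)  # scalar network output

# Hypothetical weights with the reported shapes (w0: 3x4, b0: 3, w1: 3, b1: scalar)
w0 = np.full((3, 4), 0.25)
b0 = np.zeros(3)
w1 = np.array([1.0, 1.0, 1.0])
b1 = 0.0
out = mlp_forward(np.ones(4), w0, b0, w1, b1)  # equals 3 * tanh(1.0)
```

The network output must still be mapped back to MMI units with the denormalization formula given above.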

We ran the genetic programming method using the GPTIPS software (Searson 2010), a population size of 400 individuals, 3,000 generations, Luke and Panait's (2002) plain lexicographic tournament selection method (choosing from a pool of size 7), a mean-squares fitness function, a maximum tree depth of 3, multigene individuals with a maximum of four genes per individual, 5 % of elitism (that is, the fraction of the population copied directly to the next generation without modification), a probability of mutation of 10 %, a probability of crossover of 85 %, and a probability of direct tree copy of 5 %. We ran the algorithm several times, and the symbolic regression that produced the minimum error on the testing set was as follows:

y = 0.3339 max(z1, z4) + 0.5415 + 0.4488 tanh(z2) − 0.578 z1 + 0.1507 ln(|z3 + 0.5909|)
    + 0.3339 ifte(z4 ≥ z2, z4, z3)
    + 0.1604 ifte(ifte(z3 ≥ 4.234, 0.571, z4) ≥ z1,
                  ifte(z4 ≥ 0.499, 3.29, 1.231),
                  ifte(z4 ≥ 0.6676, 1.823, z2))    (20)

In Appendix 2.3, we have included the MATLAB code that retrieves the estimated MMI from

the epicentral distance, PGA, PGV, and moment

magnitude using the symbolic regression (19) and

Eq. 20.
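The ifte primitive appearing in Eq. 20 is an if-then-else operator; a sketch of how such an expression is evaluated (the condition convention and the toy expression below are assumptions for illustration, not GPTIPS internals):

```python
import numpy as np

def ifte(cond, a, b):
    """if-then-else primitive of the evolved trees: a if cond holds, else b."""
    return a if cond else b

# Toy evaluation of a small multigene-style expression using such primitives
z1, z2, z3, z4 = 0.2, -0.1, 0.5, 1.0
y = 0.3 * max(z1, z4) + 0.4 * np.tanh(z2) + ifte(z4 >= z2, z4, z3)
```

The standardized inputs z1–z4 would come from Eq. 18, and the output y would be mapped to MMI units through Eq. 19.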

Take into account that the form of Eq. 19

should not be used outside of the present

study, inasmuch as the functional form and the

coefficients of this equation depend on the employed database. If this methodology is used with

another dataset, most probably the algorithm will

converge to a different functional form.

In order to make a fair comparison of the nonlinear regression algorithms with the linear case, we estimated the ordinary least squares linear regression and several robust linear regressions (varying the weighting function). The ordinary least squares linear regression between the input variables was as follows:

MMI = 2.0303 − 0.0063 x1 + 0.4465 log10(x2) + 0.7688 log10(x3) + 0.5247 x4    (21)
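Equation 21 can be evaluated directly; a small illustrative helper (assuming the negative sign on the distance coefficient, consistent with the attenuation of intensity with distance, and valid only within the parameter ranges of Table 3):

```python
import math

def mmi_ols(x1, x2, x3, x4):
    """Ordinary least squares regression of Eq. 21.
    x1: epicentral distance (km), x2: PGA (g),
    x3: PGV (cm/s), x4: moment magnitude."""
    return (2.0303 - 0.0063 * x1 + 0.4465 * math.log10(x2)
            + 0.7688 * math.log10(x3) + 0.5247 * x4)

print(round(mmi_ols(x1=50.0, x2=0.05, x3=3.0, x4=5.0), 2))
```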

The robust regression that produced the smallest error in the testing set was the one with the weighting function w(r) = (|r| < 1), that is,

MMI = 2.2733 − 0.0059 x1 + 0.4069 log10(x2) + ⋯    (22)

Here, x1 is the epicentral distance in kilometers, x2 the PGA in gravities, x3 the PGV in centimeters per second, and x4 the moment magnitude.

For the purposes of completeness, we have used robust linear regression to provide relationships similar to Eqs. 1 and 2 as follows:

MMI = 0.39 + 2.54 log10(PGA),  for log10(PGA) > 1.73    (23)

and

MMI = 1.73 + 3.18 log10(PGV),  for log10(PGV) > 0.96    (24)

Here, the PGA must be expressed in centimeters per square second, while the PGV must be given in centimeters per second. These relationships have been plotted in Figs. 7 and 8, respectively.
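The branches of Eqs. 23 and 24 reproduced above can be coded as follows (a sketch covering only the stated validity ranges; PGA in cm/s² and PGV in cm/s):

```python
import math

def mmi_from_pga(pga_cms2):
    """Eq. 23 branch: valid for log10(PGA) > 1.73 (PGA in cm/s^2)."""
    lp = math.log10(pga_cms2)
    if lp <= 1.73:
        raise ValueError("outside the range of the reproduced branch")
    return 0.39 + 2.54 * lp

def mmi_from_pgv(pgv_cms):
    """Eq. 24 branch: valid for log10(PGV) > 0.96 (PGV in cm/s)."""
    lv = math.log10(pgv_cms)
    if lv <= 0.96:
        raise ValueError("outside the range of the reproduced branch")
    return 1.73 + 3.18 * lv
```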

5.2 A note on the usage of the regressions

Note that, according to Fig. 10, the MMI in our database is bounded between 2 and 6.5, and in consequence, these bounds should constrain the values produced by the regression methods described above. Take into account that the minimum and maximum values of each variable should be regarded as guidelines on the interpolative power of the presented results, inasmuch as it is not advisable to extrapolate with them. If that is required, one should re-estimate the parameters/weights with a dataset that contains representative samples of those outliers.

6 Analysis of results

Although the input MMI levels are discrete natural numbers, the output is in the form of continuous real numbers. We define a successful prediction as one whose estimation is within ±0.5 of the MMI level reported by the DYFI database. In this sense, we evaluated the performance of the algorithms as follows: first, the loss function (8) with ε = 0.5 was evaluated on the testing set (a set of data which was not employed in the training phase of the algorithm); for comparison purposes, the quadratic mean error function (16) was calculated as well. These numbers, together with the coefficients of correlation of the predicted MMI vs. the actual MMI and the percentage of misclassification on the testing set, are shown in Table 4.
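The figures of merit of Table 4 can be computed, for instance, as follows (a sketch; the ε-insensitive loss and the ±0.5 misclassification rule follow the definitions given above, and y_true/y_pred are placeholder arrays):

```python
import numpy as np

def eps_insensitive_loss(y_true, y_pred, eps=0.5):
    """Mean epsilon-insensitive loss: errors below eps do not count."""
    return np.mean(np.maximum(np.abs(y_true - y_pred) - eps, 0.0))

def misclassification_pct(y_true, y_pred, tol=0.5):
    """Percentage of predictions farther than +/- tol from the reported MMI."""
    return 100.0 * np.mean(np.abs(y_true - y_pred) > tol)

y_true = np.array([3.0, 4.0, 5.0, 6.0])
y_pred = np.array([3.2, 4.9, 5.1, 5.4])
print(eps_insensitive_loss(y_true, y_pred))
print(misclassification_pct(y_true, y_pred))
```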

In comparison to Eq. 23, which depends only on the PGA, the inclusion of more information in the MMI assessment is beneficial for its prediction. The best nonlinear regression method seems to be the one produced by the MLP, followed closely by the genetic programming and the SVR. In general, the MMI estimation shows a good agreement with the reported intensity; the ε-insensitive loss function, the mean square error, and the misclassification error obtained with the nonlinear algorithms are lower in comparison to the values obtained with the linear regression (23). Figure 15 illustrates how well the predicted MMI

Table 4 Performance of the different nonlinear regression algorithms on the testing dataset

Algorithm                                ε-insensitive   Mean square   Correlation of predicted   Percentage of
                                         loss function   error         vs. actual MMI             misclassification (%)
SVR                                      0.037           0.146         0.928                      19.91
MLP                                      0.035           0.141         0.928                      17.06
Genetic programming                      0.033           0.139         0.929                      19.43
Ordinary least squares regression (21)   0.046           0.169         0.913                      22.75
Robust regression (22)                   0.046           0.170         0.914                      21.33
Equation 23                              0.067           0.224         0.883                      23.22
Equation 24                              0.097           0.278         0.853                      31.28

[Fig. 15: regression of predicted vs. actual MMI, R = 0.9280.]

information. If we had at our disposal probability density functions, possibility distributions, or interval information on those values, we could use that information in our proposed approach. If that were the case, simple Monte Carlo sampling or even the extension principle used in fuzzy set theory and in the theory of random sets would be excellent tools to propagate the uncertainty through the nonlinear regression algorithms. In this case, the expected MMI would be presented in the form of a probability density function, a normalized fuzzy set, or an interval.
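The Monte Carlo propagation suggested above can be sketched as follows: sample the uncertain inputs from assumed distributions, evaluate the regression on each sample, and summarize the resulting MMI population (the regression g and the input distributions below are placeholders for illustration):

```python
import numpy as np

def propagate_mc(g, sample_inputs, n=10_000, seed=0):
    """Propagate input uncertainty through a regression g by Monte Carlo."""
    rng = np.random.default_rng(seed)
    mmi = np.array([g(*sample_inputs(rng)) for _ in range(n)])
    return mmi.mean(), mmi.std()

# Placeholder regression and input distributions (illustration only)
g = lambda x1, x4: 2.0 - 0.006 * x1 + 0.52 * x4
sample = lambda rng: (rng.normal(80.0, 10.0),   # epicentral distance (km)
                      rng.normal(5.0, 0.2))     # moment magnitude
mean_mmi, std_mmi = propagate_mc(g, sample)
```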

[Fig. 15: Predicted MMI vs. actual MMI (ideal fit and data fit) for the estimation done by the genetic programming algorithm; the least-mean-squares fit for both variables is Predicted MMI = 0.88 Actual MMI + 0.46. A similar figure corresponds to the estimation done by the MLP.]

7 Conclusions

The nature of the points misclassified by the nonlinear algorithms was analyzed. Initially, we thought that most of those points belonged either to regions where there was a unique ZIP code for a large area or to large intensities (as there were not many data corresponding to large intensities with which to train the algorithms). However, it turned out not to be like that in most of the cases. We attribute the errors in the assignment of the MMI to the fact that the moment magnitude, epicentral distance, PGA, and PGV are not enough to fully describe the different aspects of the earthquake, and therefore, it would be convenient if other descriptive parameters of the earthquake, like the ones mentioned in Section 2, were included in the analysis as well.

With respect to the variability of the PGA and PGV, we are using the values stated in the public databases. It is to be expected that this random number is the best estimate of the PGA and

In this paper, we presented three nonlinear regression methods, namely support vector regression, multilayer perceptrons, and genetic programming, to model the relationship between the modified Mercalli intensity scale and the earthquake moment magnitude, epicentral distance, PGA, and PGV measured at the stations closest to the MMI reading. In general, the MMI estimation shows a good agreement with the reported intensity. The best results were obtained by the multilayer perceptron.

As seen from the results, nonlinear regression should be applied in order to find a relationship between the MMI and instrumental information, instead of the linear regressions that are popular in this class of studies. Our numerical experiments have shown, for example, that all of the nonlinear regression algorithms employed perform better than the linear regressions (21) and (22).

Acknowledgments We would like to thank John R. Evans, David Wald, Vince Quitoriano, and Bruce Worden of the USGS for their helpful advice over the internet. Also, we would like to thank the anonymous reviewers and the associate editor, Dr. Gottfried Grünthal, for their constructive comments that have notably improved the paper. Financial support for the realization of the present research has been received from the Universidad Nacional de Colombia; the support is gratefully acknowledged.


The bias of the SVR was b = 6.6477, and the support vectors, together with their weights αi, are listed in the following:

SV

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

4.4895

405.5060

494.5590

45.3807

7.2997

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

10.7370

131.7847

83.2952

494.5590

277.7219

494.5590

494.5590

494.5590

494.5590

388.3332

494.5590

494.5590

494.5590

494.5590

249.7611

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

274.6066

494.5590

494.5590

494.5590

494.5590

494.5590

z1

0.6223

0.6048

0.1701

0.1353

0.0990

0.0249

0.6436

0.3647

0.2665

0.2675

0.2684

0.3014

0.6561

0.6358

0.3140

0.3875

1.8688

1.8978

2.1534

0.7113

0.6348

0.6232

0.5893

0.5777

0.3328

0.0646

0.2065

0.2539

0.7936

0.6813

0.6736

0.6600

0.6048

0.5932

0.5341

0.5003

0.7791

0.7752

0.7462

0.7239

0.6639

0.6329

0.3366

z2

1.5550

0.6038

0.6708

0.6178

1.3207

0.5704

0.4700

0.5536

0.6206

0.5871

0.6373

0.4198

0.1520

0.0823

0.5202

0.5536

0.6541

0.5202

0.6708

0.6346

0.0656

0.2664

0.3835

0.1492

0.0990

0.1353

0.2691

0.2859

1.1701

0.3361

0.4532

0.5536

0.6206

0.5704

0.6875

0.6541

0.4867

2.9440

0.1185

0.1353

0.6541

0.5704

0.5704

z3

2.7381

0.5537

0.5894

0.7897

2.8716

0.5443

0.4973

0.5462

0.5274

0.5199

0.5518

0.5462

0.4165

0.2211

0.4936

0.5593

0.4823

0.1948

0.3451

0.0382

0.3527

0.3395

0.0783

0.3132

0.2794

0.4372

0.3959

0.4729

0.0182

0.4729

0.4071

0.5199

0.5537

0.5537

0.6044

0.5744

0.4898

0.8085

0.3771

0.4259

0.6044

0.5086

0.5499

z4

2.0587

0.2432

0.2432

1.9541

1.9541

0.1386

1.0803

0.3478

0.3478

0.3478

0.3478

0.3478

0.8710

0.8710

0.1753

0.1753

1.3263

1.3263

1.3263

0.3478

0.3478

0.3478

0.3478

0.3478

0.3478

0.3478

0.3478

0.3478

1.0803

1.0803

1.0803

1.0803

1.0803

1.0803

1.0803

1.0803

1.0803

1.0803

1.0803

1.0803

1.0803

1.0803

0.5571

SV

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

66

67

68

69

70

71

72

73

74

75

76

77

78

79

80

81

82

83

84

85

86

87

88

89

90

494.5590

494.5590

116.8932

494.5590

494.5590

126.2576

489.1067

494.5590

24.8224

242.1119

494.5590

494.5590

209.7443

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

10.5189

494.5590

494.5590

23.3037

7.0743

494.5590

494.5590

494.5590

494.5590

494.5590

178.8668

494.5590

123.2760

10.9674

494.5590

66.9874

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590


z1

0.7045

0.6358

0.6329

0.6116

0.8890

0.7433

0.6978

0.6600

0.6552

0.6436

0.5738

0.5012

0.4916

0.4402

0.3831

0.3802

0.3260

0.3202

0.2104

0.8323

0.7162

0.2602

0.0326

0.0206

0.8139

0.7045

0.3153

0.8101

0.7384

0.7239

0.6900

0.6852

0.0768

0.7839

0.7588

0.5361

0.4548

0.3231

0.4490

0.8120

0.7278

0.7084

0.6949

0.6949

0.6910

0.6871

0.6803

z2

1.5717

0.0488

0.0656

0.3668

0.7043

1.4713

0.4672

1.3040

4.0318

1.9567

1.1199

0.8187

1.1032

0.5676

0.0656

0.0851

0.3361

0.3696

0.2691

1.6052

0.0321

0.6373

0.6206

0.6038

1.0864

0.6206

0.6541

2.5424

2.6763

2.0069

0.9860

2.0403

0.5704

0.0014

0.1492

0.4365

0.6206

4.7347

0.5704

3.9649

0.0321

0.1158

1.9064

2.1240

0.4505

0.0321

0.1185

z3

0.3463

0.1171

0.0757

0.0532

0.6044

1.6597

0.7503

0.5981

3.7020

0.4553

0.6807

0.2531

1.0509

0.0551

0.0720

0.0062

0.3959

0.3921

0.2380

0.4515

0.4616

0.5894

0.5744

0.5669

0.2298

0.5631

0.6063

0.7390

0.9006

0.7052

0.0250

0.2016

0.5631

0.3357

0.3827

0.5481

0.6025

4.8181

0.5988

0.6751

0.4691

0.4015

0.3407

0.3839

0.1516

0.2907

0.2474

z4

0.3846

0.3846

0.3846

0.3846

0.1753

0.1753

0.1753

0.1753

0.1753

0.1753

0.1753

0.1753

0.1753

0.1753

0.1753

0.1753

0.1753

0.1753

0.1753

0.7664

0.7664

0.7664

0.7664

0.7664

1.2895

1.2895

0.8710

0.5571

0.5571

0.5571

0.5571

0.5571

0.5571

1.2895

1.2895

1.1849

1.1849

1.3263

0.9756

0.8710

0.8710

0.8710

0.8710

0.8710

0.8710

0.8710

0.8710


SV

91

92

93

94

95

96

97

98

99

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

0.0167

12.2687

159.9334

289.0971

494.5590

494.5590

494.5590

494.5590

415.6563

494.5590

494.5590

494.5590

494.5590

427.5960

494.5590

494.5590

10.2066

494.5590

494.5590

392.9353

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

494.5590

331.6516

494.5590

494.5590

494.5590

494.5590

z1

0.6542

0.6223

0.6155

0.5913

0.5680

0.5554

0.2534

0.8823

2.2667

2.4081

2.4894

2.5833

2.5852

2.6346

0.5240

0.2781

1.2656

1.6171

0.6252

0.1106

0.1116

0.1164

0.4292

0.6170

0.7438

1.2540

0.5554

0.4296

0.2830

0.2727

0.4718

0.7171

0.5787

0.2249

0.2597

0.5496

0.7849

0.7646

0.7626

0.4170

0.2844

0.2350

0.2147

0.7462

0.7926

0.2253

0.1585

z2

0.3528

0.3696

0.6038

0.4867

0.3863

0.2691

2.3918

0.0823

0.4867

0.6206

0.6206

0.6206

0.6206

0.6038

0.6038

0.5704

0.6708

0.6373

0.6513

0.4198

0.5871

0.4867

0.6206

0.4030

0.6373

0.6373

0.6875

0.6373

0.6206

0.5704

0.5704

0.4839

0.9191

0.2859

0.5536

0.6373

0.3026

0.4672

0.3361

0.6206

0.4532

0.6373

0.6206

0.2859

1.2203

0.7684

0.1492

z3

0.4823

0.4466

0.5932

0.5481

0.4992

0.4278

5.5809

0.9720

0.1039

0.0032

0.0344

0.1434

0.0577

0.0551

0.5781

0.4259

0.5781

0.5199

0.4121

0.3489

0.4842

0.4391

0.5499

0.3827

0.4616

0.4710

0.6082

0.5932

0.5650

0.5687

0.5293

0.0595

0.2662

0.4992

0.5255

0.5744

0.3733

0.0633

0.4541

0.5875

0.4691

0.5838

0.6007

0.5274

0.1742

0.5022

0.2662

z4

0.8710

0.8710

0.8710

0.8710

0.8710

0.8710

2.0587

2.0587

2.0587

2.0587

2.0587

2.0587

2.0587

2.0587

0.7664

0.4892

0.4892

0.4892

0.1753

0.1753

0.1753

0.1753

0.1753

0.1753

0.1753

0.1753

1.1849

1.1849

0.3478

0.1753

0.1753

0.3478

0.3478

0.3478

0.3478

1.0803

1.0803

1.0803

1.0803

0.6617

0.7664

0.4525

0.8710

1.0803

0.5571

1.3263

0.6985

SV

138

139

140

141

142

143

144

145

146

147

494.5590

494.5590

494.5590

12.4008

116.9598

494.5590

494.5590

494.5590

494.5590

494.5590


z1

0.6861

0.6193

0.5613

2.6094

2.6859

2.9134

0.4625

0.2514

0.8658

0.6484


z2

0.1687

0.1185

0.6038

0.5704

0.6206

0.6541

0.5704

0.2859

0.6541

0.5536

z3

0.3771

0.4673

0.5706

0.8592

0.2888

0.1114

0.5481

0.2324

0.5255

0.5593

z4

0.8710

0.8710

0.8710

2.0587

2.0587

2.0587

0.7664

0.1753

0.1753

1.1849


References

Atkinson GM, Kaka SI (2007) Relationships between felt intensity and instrumental ground motion in the central United States and California. Bull Seismol Soc Am 97(2):497–510

Atkinson GM, Sonley E (2000) Empirical relationships between modified Mercalli intensity and response spectra. Bull Seismol Soc Am 90(2):537–544

Bishop CM (2007) Pattern recognition and machine learning. Springer, NY

Bommer JJ, Stafford PJ, Alarcón JE, Akkar S (2007) The influence of magnitude range on empirical ground-motion prediction. Bull Seismol Soc Am 97(6):2152–2170

Center for Engineering Strong Motion Data (2011) Internet data reports. http://www.strongmotioncenter.org/. Accessed 15 Jan 2011

Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. Software available at http://www.csie.ntu.edu.tw/cjlin/libsvm. Accessed 15 Jan 2011

Davenport P (2004) Neural network analysis of seismic intensity from instrumental records. In: Proceedings of the 13th world conference on earthquake engineering, paper no. 692. Vancouver, Canada

Dengler LA, Dewey JW (1998) An intensity survey of households affected by the Northridge, California, earthquake of 17 January 1994. Bull Seismol Soc Am 88(2):441–462

Faenza L, Michelini A (2010) Regression analysis of MCS intensity and ground motion parameters in Italy and its application in ShakeMap. Geophys J Int 180(3):1138–1152

Grünthal G (2011) Earthquakes, intensity. In: Gupta HK (ed) Encyclopedia of solid earth geophysics. Encyclopedia of earth sciences series. Springer, Dordrecht, The Netherlands, pp 237–242

Haykin S (1998) Neural networks: a comprehensive foundation, 2nd edn. Prentice Hall PTR, Upper Saddle River, NJ, USA

Hornik K, Stinchcombe MB, White H (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2(5):359–366

Kaka SI, Atkinson GM (2004) Relationships between instrumental ground-motion parameters and modified Mercalli intensity in eastern North America. Bull Seismol Soc Am 94(5):1728–1736

Karim KR, Yamazaki F (2002) Correlation of JMA instrumental seismic intensity with strong motion parameters. Earthq Eng Struct Dyn 31:1191–1212

Koza JR (1992) Genetic programming: on the programming of computers by means of natural selection. MIT Press, Cambridge, MA, USA

Kuehn NM, Scherbaum F (2010) Short note: a naive Bayes classifier for intensities using peak ground velocity and acceleration. Bull Seismol Soc Am 100(6):3278–3283

Luke S, Panait L (2002) Lexicographic parsimony pressure. In: Proceedings of the genetic and evolutionary computation conference (GECCO 2002). Morgan Kaufmann Publishers

Mallat S (2008) A wavelet tour of signal processing: the sparse way, 3rd edn. Academic Press, San Diego, California

Murphy JR, O'Brien LJ (1977) The correlation of peak ground acceleration amplitude with seismic intensity and other physical parameters. Bull Seismol Soc Am 67(3):877–915

Musson RMW, Grünthal G, Stucchi M (2010) The comparison of macroseismic intensity scales. J Seismol 14(2):413–428

Poulimenos A, Fassois S (2006) Parametric time-domain methods for non-stationary random vibration modelling and analysis: a critical survey and comparison. Mech Syst Signal Process 20(4):763–816

Richter CF (1958) Elementary seismology. W. H. Freeman and Co., San Francisco, pp 135–149, 650–653

Sarle WS (2002) USENET comp.ai.neural-nets neural networks FAQ, part II: subject: should I normalize/standardize/rescale the data? ftp://ftp.sas.com/pub/neural/FAQ2.html#A_std. Accessed 24 June 2011

Searson D (2010) GPTIPS: genetic programming and symbolic regression for MATLAB. Software available at http://sites.google.com/site/gptips4matlab/home. Accessed 15 Jan 2011

Shabestari KT, Yamazaki F (2001) A proposal of instrumental seismic intensity scale compatible with MMI evaluated from three-component acceleration records. Earthq Spectra 17(4):711–723

Smola AJ, Schölkopf B (2004) A tutorial on support vector regression. Stat Comput 14:199–222

Sokolov VY (2002) Seismic intensity and Fourier acceleration spectra: revised relationship. Earthq Spectra 18(1):161–187

Stover C, Coffman J (1993) Seismicity of the United States, 1568–1989 (revised). U.S. Geological Survey Professional Paper 1527

Trifunac MD, Brady AG (1975) On the correlation of seismic intensity scales with the peaks of recorded strong ground motion. Bull Seismol Soc Am 65(1):139–162

Tselentis GA, Danciu L (2008) Empirical relationships between modified Mercalli intensity and engineering ground-motion parameters in Greece. Bull Seismol Soc Am 98(4):1863–1875

Tselentis GA, Vladutu L (2010) An attempt to model the relationship between MMI attenuation and engineering ground-motion parameters using artificial neural networks and genetic algorithms. Nat Hazards Earth Syst Sci 10(12):2527–2537

Tung ATY, Wong FS, Dong W (1993) A neural networks based MMI attenuation model. In: National earthquake conference: earthquake hazard reduction in the central and eastern United States: a time for examination and action. Memphis, Tennessee, US

US Geological Survey (2011) Did you feel it? database. http://earthquake.usgs.gov/earthquakes/dyfi/. Accessed 15 Jan 2011

Wald D, Quitoriano V, Dengler L, Dewey JM (1999a) Utilization of the internet for rapid community intensity maps. Seismol Res Lett 70(6):680–697

Wald DJ, Quitoriano V, Heaton TH, Kanamori H (1999b) Relationships between peak ground acceleration, peak ground velocity, and modified Mercalli intensity in California. Earthq Spectra 15(3):557–564

Wald DJ, Quitoriano V, Dewey J (2006) USGS did you feel it? community internet intensity maps: macroseismic data collection via the internet. In: Proceedings of the first European conference on earthquake engineering and seismology. Geneva, Switzerland

Wood HO, Neumann F (1931) Modified Mercalli intensity scale of 1931. Bull Seismol Soc Am 21:277–283

Worden CB, Gerstenberger MC, Rhoades DA, Wald DJ (2012) Probabilistic relationships between ground-motion parameters and modified Mercalli intensity in California. Bull Seismol Soc Am 102:204–221

- MERCEDES-BENZ-FaultCodes-0520Uploaded byactuator79
- Pole Foundation Excel SheetsUploaded bycoolkaisy
- Course Outline: ECOR 2606-Uploaded byf22archrer
- ASME 1.20.1.pdfUploaded bynelson
- Magic MathsUploaded byAnonymous 5Qtn0QVDuO
- Physics testUploaded bySignor Plaban Gogoi
- BT-14Uploaded bydalton2003
- A Study on Post Tensioned T-Beams Strengthened with AFRP FabricUploaded byIJRASETPublications
- Tbxlha 6565c Vtm.aspxUploaded byRitesh Ranjan
- IGCP 580 Annual MeetingUploaded byPratheesh Prabhakar
- Tru Laser [Basic Machine Operation & PM]Uploaded byCremuss Tarapu
- RILEM TC 162-TDF Test and design methods for steel fibre reinforced concreteUploaded byAbdul Ghaffar
- Chapter Outline 1.1-1.3Uploaded byAja Imani
- 19 AC Circuits Analysis Using Complex Variables (1)Uploaded byTalha Yousuf
- Chu Siga2 PsUploaded byjuan yenque
- OutlineUploaded bymichsantos
- CBNSTUploaded byfreedomtok
- wwwww123.pdfUploaded byChandu CK
- Digital Signal Processing Lab manual using matlabUploaded byrpecglobal
- Chiral Negative RefractionUploaded byahsanqau
- Measurement and Simulation of Grounding Resistance With Two and Four Mesh GridsUploaded bycphcricri