Multivariate Bayesian Regression Analysis Applied to Ground-Motion Prediction Equations, Part 1: Theory and Synthetic Example

Bulletin of the Seismological Society of America, Vol. 100, No. 4, pp. 1551–1567, August 2010, doi: 10.1785/0120080354
by Danny Arroyo and Mario Ordaz
Abstract
An application of a linear multivariate Bayesian regression model to
compute pseudoacceleration (SA) ground-motion prediction equations (GMPEs) is
presented. The model is able to include the correlation between observations for a
given earthquake, the correlation between SA ordinates at different periods, and the
correlation between regression coefficients of the ground-motion prediction model.
We evaluate the advantages of the Bayesian approach over the traditional regression
methods, and we discuss the differences between univariate and multivariate analyses.
Because the application of the Bayesian method is in general complex and implies an
increase in the numerical effort with respect to the traditional methods, our computer
code to perform linear Bayesian analyses is freely available on request.
Introduction
In the past, empirical pseudoacceleration (SA) ground-motion prediction models were constructed by fitting available data to certain functional forms, using the least-squares method. Several authors observed that in some cases the decay of SA with distance could not be correctly determined with this method because it disregarded the correlation between observations recorded at different sites for a given earthquake (Campbell, 1981, 1985; Joyner and Boore, 1993, 1994). The two-stage regression method and the one-stage maximum-likelihood method were then developed to solve this problem (Brillinger and Priesler, 1984; Abrahamson and Youngs, 1992; Joyner and Boore, 1993, 1994). The one-stage maximum-likelihood approach was introduced by Brillinger and Priesler (1984), and Abrahamson and Youngs (1992) and Joyner and Boore (1993, 1994) proposed computational algorithms to implement it.
In many cases, however, the information contained in data sets is not enough to properly constrain all regression coefficients for a given functional form and, in practice, certain coefficients are fixed in order to stabilize the regression analysis. In this approach, which can be considered as the constrained version of the maximum-likelihood method, the fixed values are defined by careful reviews of individual terms of the adopted functional form.
Some ground-motion prediction equations (GMPEs) have been derived using univariate Bayesian analysis (Veneziano and Heidari, 1985; Ordaz et al., 1994; Reyes, 1999; Sibilio, 2006; Wang and Takada, 2009). Ordaz et al. (1994) discussed the advantages of the Bayesian analysis with respect to the least-squares method. However, the correlation between observations recorded at different sites for a given earthquake was not included in the model presented by Ordaz et al. (1994), nor in other similar studies. The model that we present in this article can be considered an extension of the original work of Ordaz et al. (1994). Nevertheless, our model is more general and is able to include the correlation between observations recorded at different sites for a given earthquake, the correlation between SA ordinates at different periods, and the correlation between regression coefficients of the GMPE.
In this article we discuss the theory of the Bayesian model, and we compare a GMPE obtained through the proposed technique with GMPEs obtained with the least-squares and the one-stage maximum-likelihood methods. For the comparisons, we use as a benchmark a set of synthetic SA spectra with predefined statistical parameters. For the presented examples we used only the one-stage maximum-likelihood method because it has been documented that this method and the two-stage method lead essentially to the same results (Joyner and Boore, 1993, 1994). Finally, in the last part of the article we discuss the differences between multivariate and univariate analyses.

The Regression Model
For a given period $T$, a standard shape shown in equation (1) was adopted as the GMPE,
$$y(T) = \alpha_1(T) + \alpha_2(T)(M_w - 6) + \alpha_3(T)(M_w - 6)^2 + \alpha_4(T)\ln R + \alpha_5(T)R \tag{1}$$
where $y(T)$ is the natural logarithm of $SA(T)$, $M_w$ is the moment magnitude, $R$ is the closest distance to the rupture area, and $\alpha_i(T)$ are the coefficients to be determined by regression analysis. Although in this article we have used the functional form shown in equation (1), the procedure presented can be readily applied to other linear functional forms.
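As a minimal sketch of how equation (1) turns one record into a row of regressors, the Python fragment below builds the five terms and evaluates $y(T)$ for a single period. The function names and the sample coefficients are illustrative assumptions, not values or code from the article.

```python
import math

def design_row(mw, r):
    """Regressors of equation (1): [1, Mw - 6, (Mw - 6)^2, ln R, R]."""
    m = mw - 6.0
    return [1.0, m, m * m, math.log(r), r]

def predict_ln_sa(alpha, mw, r):
    """y(T) = sum_i alpha_i(T) * x_i for one period T."""
    return sum(a * x for a, x in zip(alpha, design_row(mw, r)))

# Hypothetical coefficients for a single period (illustration only).
alpha_example = [6.0, 1.2, -0.1, -0.5, -0.006]
y = predict_ln_sa(alpha_example, mw=7.0, r=300.0)
sa = math.exp(y)  # y is the natural logarithm of SA
```

Stacking one such row per record gives a matrix whose first column is all ones, which is why the model stays linear in the coefficients even though it is nonlinear in $M_w$ and $R$.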
The multivariate regression model is defined in equation (2):
$$Y = X\alpha^T + E \tag{2}$$
where the superscript $T$ stands for transpose, $Y$ is a known $n_o \times n_T$ matrix that includes $n_o$ observations of $y(T)$ for $n_T$ periods, $X$ is a known $n_o \times n_p$ matrix that comprises $n_o$ observations of $n_p$ parameters in the model (note that according to equation 1 the elements of the first column of the matrix $X$ are equal to unity), $\alpha$ is an unknown $n_T \times n_p$ matrix that comprises the coefficients determined by regression analysis (each row of $\alpha$ contains the $\alpha_i(T)$ coefficients for a given $T$), and $E$ is an unknown $n_o \times n_T$ matrix that is comprised of the regression residuals.
It is assumed that the elements of $E$ are correlated, normally distributed random variables, with zero mean. The correlation between elements of $E$ is defined through an unknown $n_o n_T \times n_o n_T$ matrix $\Omega$, which is defined in equation (3):
$$\Omega = \Phi \otimes \Sigma \tag{3}$$
where $\Phi$ is an unknown $n_o \times n_o$ matrix that accounts for the correlation between the rows of $Y$, $\Sigma$ is an unknown $n_T \times n_T$ matrix that accounts for the correlation between spectral ordinates, and the symbol $\otimes$ stands for Kronecker product.

The One-Stage Maximum-Likelihood Method
In this section we extend the equations proposed by Joyner and Boore (1993, 1994) for the univariate model to the multivariate case.
For the model described in equation (2), the likelihood of $Y$ is defined in equation (4):
$$L(Y \mid \alpha, \Sigma, \Phi, X) \propto |\Sigma|^{-n_o/2}\,|\Phi|^{-n_T/2} \exp\left\{-\frac{1}{2}\mathrm{Tr}\left[\Phi^{-1}(Y - X\alpha^T)\,\Sigma^{-1}\,(Y - X\alpha^T)^T\right]\right\} \tag{4}$$
where Tr denotes trace and the symbol $\propto$ stands for proportionality because we have omitted the normalization constant.
Following Joyner and Boore (1993, 1994) we considered that the elements of $E$, $\varepsilon_{ij}$, can be expressed as the sum of earthquake-to-earthquake variability ($\varepsilon_e$) and record-to-record variability ($\varepsilon_r$). In addition, the following considerations were made:
1. For a given earthquake and a given site, the coefficient of correlation between residuals for two different periods, say $T_1$ and $T_2$, is equal to $\rho_{T_1,T_2}$.
2. For a given earthquake, the coefficient of correlation between residuals for the same period at different sites is equal to $\gamma_e$.
3. For a given earthquake, the coefficient of correlation between residuals for two different periods, say $T_1$ and $T_2$, at different sites is equal to $\gamma_e \rho_{T_1,T_2}$.
4. Residuals related to different earthquakes are independent.
According to these assumptions, matrix $\Phi$ is a block diagonal matrix,
$$\Phi = \begin{bmatrix} \phi_1 & 0 & \cdots & 0 \\ 0 & \phi_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \phi_{n_e} \end{bmatrix} \tag{5}$$
where $n_e$ is the number of earthquakes and the squared submatrix $\phi_i$ related to earthquake $i$ is given by
$$\phi_i = \begin{bmatrix} 1 & \gamma_e & \cdots & \gamma_e \\ \gamma_e & 1 & \cdots & \gamma_e \\ \vdots & \vdots & \ddots & \vdots \\ \gamma_e & \gamma_e & \cdots & 1 \end{bmatrix} \tag{6}$$
The rank of $\phi_i$ is equal to the number of records of the earthquake $i$. Note that $\gamma_e$ is equal to the $\gamma$ parameter in Joyner and Boore (1993, 1994), and $\rho_{T_1,T_2}$ is the coefficient of correlation between spectral ordinates $SA(T)$ for a given pair of periods, namely $T_1$ and $T_2$. In summary, we used for the multivariate case the same structure of matrix $\Phi$ that was used by Joyner and Boore (1993) for the univariate case.
Some authors have identified that the intraevent correlation is a function of the distance between stations (Boore et al., 2003; Kawakami and Mogi, 2003; Wang and Tanaka, 2005). The model can be extended to include the spatial correlation between observations by using a value of $\gamma_e$ that depends on the distance between stations and performing the Bayesian analysis consistently. This variant, however, has not been pursued in this article.
For a given $\gamma_e$, the values of $\alpha$ and $\Sigma$ that maximize the likelihood are the well-known weighted least-squares estimators, defined in equations (7) and (8):
$$\hat{\alpha}^T = (X^T\Phi^{-1}X)^{-1}X^T\Phi^{-1}Y \tag{7}$$
$$\hat{\Sigma} = \frac{(Y - X\hat{\alpha}^T)^T\,\Phi^{-1}\,(Y - X\hat{\alpha}^T)}{n_o} \tag{8}$$
In the one-stage maximum-likelihood method, the value of $\gamma_e$ that maximizes the likelihood is found iteratively. The values of $\alpha$ and $\Sigma$ for the regression analysis are then computed from equations (7) and (8). Normally, instead of directly maximizing equation (4) the maximization is performed over its natural logarithm, given by
$$\ln L(Y \mid \alpha, \Sigma, \Phi, X) \propto -\frac{n_o}{2}\ln|\Sigma| - \frac{n_T}{2}\ln|\Phi| - \frac{1}{2}\mathrm{Tr}\left[\Sigma^{-1}(Y - X\hat{\alpha}^T)^T\,\Phi^{-1}\,(Y - X\hat{\alpha}^T)\right] \tag{9}$$
Although not shown, it can be demonstrated that the probability distribution of $\alpha$ is a matrix Student t-distribution. The mean value of $\alpha$ is $\hat{\alpha}$, and the covariance matrix of $\hat{\alpha}_V = \mathrm{vec}(\hat{\alpha})$ (note that vec stands for stack) is given by equation (10).
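The covariance structure of equations (3), (5), and (6) can be sketched in a few lines of plain Python. This is only an illustration under assumed inputs: the helper names, the two-earthquake layout, and the numerical values of $\gamma_e$ and $\Sigma$ are hypothetical, and $\Sigma$ is taken here as a correlation matrix (unit variances) so that the entries of $\Omega$ can be read directly as the correlations listed in considerations 1 to 4.

```python
def phi_block(n_records, gamma_e):
    """Equation (6): ones on the diagonal, gamma_e elsewhere."""
    return [[1.0 if i == j else gamma_e for j in range(n_records)]
            for i in range(n_records)]

def phi_matrix(records_per_event, gamma_e):
    """Equation (5): block-diagonal Phi, one block per earthquake."""
    n = sum(records_per_event)
    phi = [[0.0] * n for _ in range(n)]
    start = 0
    for n_rec in records_per_event:
        block = phi_block(n_rec, gamma_e)
        for i in range(n_rec):
            for j in range(n_rec):
                phi[start + i][start + j] = block[i][j]
        start += n_rec
    return phi

def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a_ij * b_kl for a_ij in a_row for b_kl in b_row]
            for a_row in a for b_row in b]

# Hypothetical inputs: two earthquakes (2 and 1 records), two periods.
gamma_e = 0.2
sigma = [[1.0, 0.5],   # unit variances, rho(T1, T2) = 0.5
         [0.5, 1.0]]
phi = phi_matrix([2, 1], gamma_e)
omega = kron(phi, sigma)  # equation (3), size (3*2) x (3*2)
```

With these inputs, the entry of `omega` linking two sites of the same earthquake at different periods equals `gamma_e * 0.5`, and every entry linking records of different earthquakes is zero.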

$$\mathrm{COV}(\hat{\alpha}_V) = \frac{1}{n_o - n_p - 2}\left\{(X^T\Phi^{-1}X)^{-1} \otimes \left[(Y - X\hat{\alpha}^T)^T\,\Phi^{-1}\,(Y - X\hat{\alpha}^T)\right]\right\} \tag{10}$$
Note that $\mathrm{COV}(\hat{\alpha}_V)$ exists only if the number of degrees of freedom of the distribution (i.e., $n_o - n_p$) is greater than 2. It is also worth noting that, even in the multivariate case, the least-squares method is a particular case of the one-stage maximum-likelihood method. The well-known least-squares estimators can be found setting $\gamma_e = 0.0$ (i.e., $\Phi = I$) in equations (4)–(10).

The Bayesian Model
In the Bayesian approach $\alpha$, $\Sigma$, and $\Phi$ are regarded as matrix random variables with known joint prior density $p(\alpha, \Sigma, \Phi)$. This prior density is updated through Bayes theorem, and the posterior density is given by the product between the likelihood and the prior density,
$$p(\alpha, \Sigma, \Phi \mid X, Y) \propto L(Y \mid \alpha, \Sigma, \Phi, X)\,p(\alpha, \Sigma, \Phi) \tag{11}$$
In standard Bayesian analysis, three types of $p(\alpha, \Sigma, \Phi)$ are commonly used: vague or noninformative densities, conjugate densities, and generalized conjugate densities. Vague densities are used when prior knowledge about parameters is diffuse, and conjugate and generalized conjugate densities are used when prior information about parameters is available. A more detailed description of each family of probability density functions and their implications in the regression analysis can be found elsewhere (see, for example, Broemeling, 1985; Rowe, 2002). In this article, we adopted a generalized conjugate probability density function as basic density. In order to keep the structure of $\Phi$ shown in equation (5), we used a scalar beta density for $\gamma_e$. Thus, the prior density of $\Phi$ is not of standard form. The prior joint probability density used in our analysis is given by
$$p(\alpha_V, \Sigma, \Phi) = p(\alpha_V)\,p(\Sigma)\,p(\Phi) \tag{12}$$
According to equation (12), the regression model considers that, a priori, $\alpha$, $\Sigma$, and $\Phi$ are independent; note that in equation (12) $\alpha_V = \mathrm{vec}(\alpha)$.
Following Rowe (2002), we assume that the prior density of $\alpha_V$ is the normal density defined in equation (13) with mean $\alpha_{V0}$ and covariance matrix $\Delta$. Thus, $\alpha_{V0} = \mathrm{vec}(\alpha_0)$, where $\alpha_0$ is the prior mean value of $\alpha$, and the positive $n_T n_p \times n_T n_p$ matrix $\Delta$ is the prior covariance matrix of $\alpha_V$. In other words,
$$p(\alpha_V) \propto |\Delta|^{-1/2} \exp\left\{-\frac{1}{2}(\alpha_V - \alpha_{V0})^T\,\Delta^{-1}\,(\alpha_V - \alpha_{V0})\right\} \tag{13}$$
For $\Sigma$ we used as prior density the inverted Wishart (Rowe, 2002) shown in equation (14) with parameters $\nu$ and $Q$,
$$p(\Sigma) \propto |\Sigma|^{-\nu/2} \exp\left\{-\frac{1}{2}\mathrm{Tr}\left(\Sigma^{-1}Q\right)\right\} \tag{14}$$
According to the properties of the inverted Wishart density, the positive $n_T \times n_T$ matrix $Q$ can be computed from the prior mean value of $\Sigma$ as follows:
$$Q = (\nu - 2n_T - 2)\,\Sigma_0 \tag{15}$$
where $\Sigma_0$ is the prior mean value of $\Sigma$ and the scalar $\nu$ is a measure of our degree of certainty on $\Sigma_0$. In order to give a finite value to the variance of the elements of $\Sigma$, the value of $\nu$ should be greater than $2n_T + 4$; the larger the value of $\nu$, the greater the degree of certainty on $\Sigma_0$.
In Bayesian analysis, usually an inverted Wishart density is also used for $\Phi$ (Rowe, 2002). However, if it is desired that $\Phi$ has the structure shown in equation (5), an inverted Wishart density cannot be used. After noticing that $\Phi$ is a function only of $\gamma_e$, we decided to use a scalar beta density for $\gamma_e$,
$$p(\gamma_e) \propto \gamma_e^{\,a-1}(1 - \gamma_e)^{b-1} \tag{16}$$
where parameters $a$ and $b$ can be computed from the prior mean value and standard deviation of $\gamma_e$.
In summary, the prior information about regression parameters is included in the analysis through $\alpha_{V0}$, $\Delta$, $Q$, $\nu$, $a$, and $b$, which are known as hyperparameters, and equations (12)–(16). Substituting equations (4), (13), (14), and (16) into equation (11), we obtain the posterior joint density of the regression parameters,
$$p(\alpha, \Sigma, \gamma_e \mid X, Y) \propto |\Sigma|^{-(n_o+\nu)/2}\,|\Phi|^{-n_T/2}\,\gamma_e^{\,a-1}(1 - \gamma_e)^{b-1} \exp\left\{-\frac{1}{2}(\alpha_V - \alpha_{V0})^T\,\Delta^{-1}\,(\alpha_V - \alpha_{V0})\right\} \exp\left\{-\frac{1}{2}\mathrm{Tr}\left\{\Sigma^{-1}\left[(Y - X\alpha^T)^T\,\Phi^{-1}\,(Y - X\alpha^T) + Q\right]\right\}\right\} \tag{17}$$
This joint density should be marginalized in order to obtain the posterior marginal mean values of $\alpha$, $\Sigma$, and $\gamma_e$. However, for this density, it is not possible to obtain marginal distributions in an analytical closed form. Posterior marginal mean values can only be numerically computed, for which we use the stochastic integration method known as Gibbs sampling.

Gibbs Sampling Method
Given the posterior joint density defined in equation (17), posterior marginal mean values can be estimated by averaging random variates generated from the posterior conditional densities of $\alpha$, $\Sigma$, and $\gamma_e$, given in equations (18)–(20).
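The idea of estimating marginal means by averaging draws from conditional densities can be illustrated on a toy target. The sketch below is not the sampler of this article: it assumes a bivariate normal with unit variances and correlation `rho`, whose two full conditionals are known normals, and the names `burn_in` and `n_keep` are hypothetical labels for the discarded and averaged draws.

```python
import random

def gibbs_bivariate_normal(mu, rho, burn_in, n_keep, seed=0):
    """Toy Gibbs sampler for a bivariate normal with unit variances.

    Each full conditional is normal:
    x1 | x2 ~ N(mu1 + rho * (x2 - mu2), 1 - rho^2), and symmetrically for x2.
    Returns the average of the retained draws as an estimate of the mean.
    """
    rng = random.Random(seed)
    cond_sd = (1.0 - rho * rho) ** 0.5
    x1, x2 = 0.0, 0.0  # arbitrary starting values
    sum1 = sum2 = 0.0
    for it in range(burn_in + n_keep):
        # Draw each variable from its conditional given the latest other value.
        x1 = rng.gauss(mu[0] + rho * (x2 - mu[1]), cond_sd)
        x2 = rng.gauss(mu[1] + rho * (x1 - mu[0]), cond_sd)
        if it >= burn_in:  # discard the burn-in draws
            sum1 += x1
            sum2 += x2
    return sum1 / n_keep, sum2 / n_keep

m1, m2 = gibbs_bivariate_normal(mu=(1.0, -2.0), rho=0.5,
                                burn_in=200, n_keep=20000)
```

In the regression model the same cycle runs over three blocks ($\alpha$, $\Sigma$, and $\gamma_e$) instead of two scalars, and drawing from each block requires the specific conditional densities rather than a simple normal.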

$$p(\alpha_V \mid \Sigma, \Phi, X, Y) \propto \exp\left\{-\frac{1}{2}(\alpha_V - \tilde{\alpha}_V)^T\left(\Delta^{-1} + X^T\Phi^{-1}X \otimes \Sigma^{-1}\right)(\alpha_V - \tilde{\alpha}_V)\right\} \tag{18}$$
$$p(\Sigma \mid \alpha, \Phi, X, Y) \propto |\Sigma|^{-(n_o+\nu)/2} \exp\left\{-\frac{1}{2}\mathrm{Tr}\left\{\Sigma^{-1}\left[(Y - X\alpha^T)^T\,\Phi^{-1}\,(Y - X\alpha^T) + Q\right]\right\}\right\} \tag{19}$$
$$p(\gamma_e \mid \alpha, \Sigma, X, Y) \propto \gamma_e^{\,a-1}(1 - \gamma_e)^{b-1}\,|\Phi|^{-n_T/2} \exp\left\{-\frac{1}{2}\mathrm{Tr}\left[\Phi^{-1}(Y - X\alpha^T)\,\Sigma^{-1}\,(Y - X\alpha^T)^T\right]\right\} \tag{20}$$
where
$$\tilde{\alpha}_V = \left(\Delta^{-1} + X^T\Phi^{-1}X \otimes \Sigma^{-1}\right)^{-1}\left[\Delta^{-1}\alpha_{V0} + \left(X^T\Phi^{-1}X \otimes \Sigma^{-1}\right)\hat{\alpha}_V\right] \tag{21}$$
$$\hat{\alpha}_V = \mathrm{vec}\left[Y^T\Phi^{-1}X\,(X^T\Phi^{-1}X)^{-1}\right] \tag{22}$$
In order to compute posterior marginal mean values, starting values of $\Sigma$ and $\gamma_e$ must be assumed, say $\Sigma^{(0)}$ and $\gamma_e^{(0)}$, and then one has to cycle through,
1. $\alpha^{(l+1)}$ = a random variate from equation (18) with $\Sigma = \Sigma^{(l)}$ and $\Phi = \Phi^{(l)}$,
2. $\Sigma^{(l+1)}$ = a random variate from equation (19) with $\alpha = \alpha^{(l+1)}$ and $\Phi = \Phi^{(l)}$,
3. $\gamma_e^{(l+1)}$ = a random variate from equation (20) with $\alpha = \alpha^{(l+1)}$ and $\Sigma = \Sigma^{(l+1)}$,
where $\Phi^{(l)}$ is the value of $\Phi$ related to $\gamma_e^{(l)}$.
The first $s$ random variates, called the burn in sample, are discarded, and the following $K$ terms are averaged in order to compute the marginal mean values. In addition, the covariance matrix of $\alpha$ can be computed by averaging the covariance matrix related to each term of the Gibbs sampling method. Techniques to generate random variates from the densities shown in equations (18)–(20) can be found elsewhere (Rowe, 2002). The Gibbs sampling method almost surely converges to the mean value of the population parameter (Rowe, 2002) regardless of the values of $\Sigma$ and $\Phi$ used as starting values. A more detailed discussion about the convergence of the Gibbs sampling can be found in Geman and Geman (1984). The value of $K$ required to attain convergence of the Gibbs sampling depends on the correlation between observations and on the covariance of the regression parameters, so it has to be defined iteratively. In the computations presented in this article we attained convergence of the Gibbs sampling with $K$ ranging from 200 to 500.
According to equation (18), $\tilde{\alpha}_V$ is the mean of the posterior conditional density of $\alpha$, and it is computed as the weighted average between the prior mean value and the conditional weighted least-squares estimate (see equation 21). Hence, it is interesting to assess the contribution to the final estimate of $\alpha$ of the prior information, in comparison with the contribution of the data. These contributions can be evaluated with vectors $W_p$ and $W_d$, defined in equations (27) and (28), respectively,
$$W_p = \mathrm{diag}\left[\left(\Delta^{-1} + \Delta_d^{-1}\right)^{-1}\Delta^{-1}\right] \tag{27}$$
$$W_d = \mathrm{diag}\left[\left(\Delta^{-1} + \Delta_d^{-1}\right)^{-1}\Delta_d^{-1}\right] \tag{28}$$
where $\Delta_d^{-1} = X^T\Phi^{-1}X \otimes \Sigma^{-1}$ is the inverse of the covariance matrix of the conditional weighted least-squares estimate of $\alpha$. The posterior marginal mean values of $W_p$ and $W_d$ can be computed in a way similar to the one used to compute $\alpha$, $\Sigma$, and $\gamma_e$.

Synthetic Data
In order to assess the performance of the least-squares method, the one-stage maximum-likelihood method, and the Bayesian technique, we generated different sets of synthetic $SA(T)$ spectra with predefined statistical properties. We considered 25 structural periods ranging between 0 and 5.0 sec. We generated six sets of synthetic spectra assuming the numbers of earthquakes shown in Table 1, and we assumed that each earthquake was recorded at the number of sites shown in Table 1. Thus, the number of records for a given set is the product between the number of earthquakes and the number of sites. In order to obtain a reasonable event sample, we considered that $M_w$ followed a modified Gutenberg–Richter distribution, with a minimum value of $M_w$ equal to 6, a maximum value of $M_w$ equal to 8.2, and $\beta = 2$, where $\beta$ is the parameter controlling the relative frequencies of earthquakes of different sizes. Also, we assumed that $R$ followed a uniform distribution between 250 and 400 km.

Table 1
Synthetic Sets of SA(T) Spectra Used
Set   Earthquakes   Sites   n_o
1     3             4       12
2     5             10      50
3     10            10      100
4     20            10      200
5     50            10      500
6     100           10      1000

The predefined statistical properties of the set of ground motions were $\alpha_p$, $\Sigma_p$, and $\Phi_p$. We set $\alpha_p$ as the value of the GMPE proposed by Reyes (1999) for station CU of Mexico City, whose coefficients are shown in Table 2. Because the
correlation between the residuals of the predictive model is equal to the correlation between the logarithm of spectral ordinates (Baker and Cornell, 2006), the diagonal terms of $\Sigma_p$ were set as the variances of the residuals ($\sigma^2$) related to the Reyes (1999) model (which are also presented in Table 2), while the off-diagonal terms were computed with the equation proposed by Baker and Cornell (2006) to estimate the coefficient of correlation $\rho_{T_1,T_2}$. For $\Phi_p$, we used the structure shown in equation (5) with $\gamma_e = 0.2234$, which is the value that we infer from the results presented by Joyner and Boore (1993, 1994). Given the number of earthquakes and the number of records shown in Table 1, an $n_o \times n_T$ matrix random variate from a matrix normal distribution was generated. The mean value of the distribution was set equal to $\alpha_p$, and the covariance matrix was obtained from $\Phi_p \otimes \Sigma_p$. The attenuation of synthetic SA values with $R$ is presented in Figure 1 for the case of $T = 0$ (i.e., PGA). Note that, regardless of the set considered, $\alpha_p$, $\Sigma_p$, and $\Phi_p$ represent the statistical properties of the entire population of $SA(T)$ spectra; hence these parameters were used as benchmark for the regression analysis presented in the following sections.

Results for the Least-Squares Method
A comparison between regression parameters obtained with the least-squares method and the benchmark is presented in Figure 2. The least-squares method is able to attain the benchmark values only for the coefficient of the magnitude ($\alpha_2(T)$) and for $\sigma$. For $n_o$ greater than or equal to 100, reasonable values of $\alpha_2(T)$ and $\sigma$ are observed. On the other hand, very unrealistic values for $\alpha_3(T)$, $\alpha_4(T)$, and $\alpha_5(T)$ are observed. Positive values of $\alpha_4(T)$ and $\alpha_5(T)$ are observed even for $n_o = 1000$, while positive values of $\alpha_3(T)$ are observed for $n_o < 500$. In some cases we have shortened the vertical axis in order to improve the clarity of the plots; therefore, although not shown, very unrealistic values were observed for $n_o = 12$ and $n_o = 50$. This is due to the fact that least-squares results are based only on the information contained in the data set, and for $n_o = 12$ we have only 3 earthquakes recorded at 4 different stations and for $n_o = 50$ we have only 5 earthquakes recorded at 10 different stations. In addition, in Figure 3 we have plotted the standard deviation of the regression coefficients, that is, the square root of the diagonal elements of the covariance matrix defined in equation (10), as a function of $T$. As $n_o$ increases, the scatter of regression parameters decreases.

Results for the One-Stage Maximum-Likelihood Method
Figure 4 shows a comparison of regression parameters obtained with the one-stage maximum-likelihood method and the benchmark. The results are very similar to those observed for the least-squares method. In general, the one-stage maximum-likelihood method is not able to attain the benchmark values, except for $\alpha_2(T)$ and $\sigma$. The greater differences between least-squares and one-stage maximum-likelihood methods are observed for $\alpha_1(T)$ and $\alpha_4(T)$, for
Table 2
α_p and σ² Used to Generate the Synthetic SA Spectra
T (sec)   α1       α2       α3         α4     α5         σ²
0.0       5.8929   1.2457   -0.09757   -0.5   -0.00632   0.17626
0.1       6.0831   1.1954   -0.09668   -0.5   -0.00643   0.18784
0.2       6.7942   1.0675   -0.09858   -0.5   -0.00732   0.18494
0.3       6.9623   1.1303   -0.10357   -0.5   -0.00768   0.15107
0.4       6.7632   1.2513   -0.09682   -0.5   -0.00727   0.19549
0.5       6.9039   1.2236   -0.08753   -0.5   -0.00753   0.17397
0.6       6.5941   1.2748   -0.06768   -0.5   -0.00693   0.18936
0.7       6.7755   1.3445   -0.04662   -0.5   -0.0076    0.20463
0.8       6.5941   1.3676   -0.03662   -0.5   -0.00705   0.20044
0.9       6.4534   1.347    -0.0244    -0.5   -0.00648   0.18376
1.0       6.5638   1.3387   -0.05429   -0.5   -0.00665   0.18297
1.2       6.6903   1.3167   -0.05225   -0.5   -0.00723   0.19870
1.4       6.4825   1.3203   -0.06347   -0.5   -0.00662   0.21055
1.6       6.4614   1.4268   -0.10542   -0.5   -0.00632   0.23412
1.8       6.0912   1.4088   -0.09393   -0.5   -0.00516   0.27786
2.0       5.8698   1.3854   -0.05267   -0.5   -0.00505   0.28847
2.2       5.8367   1.4032   -0.04392   -0.5   -0.00547   0.35229
2.4       5.838    1.4032   -0.07922   -0.5   -0.00559   0.33916
2.6       5.858    1.3951   -0.06917   -0.5   -0.00601   0.34523
2.8       5.6616   1.3937   -0.0717    -0.5   -0.00583   0.33564
3.0       5.4214   1.4344   -0.0608    -0.5   -0.00568   0.34968
3.5       4.6026   1.4899   -0.06365   -0.5   -0.00438   0.33114
4.0       3.6367   1.5601   -0.05951   -0.5   -0.00283   0.35594
4.5       3.098    1.4864   -0.05698   -0.5   -0.00187   0.40088
5.0       3.2887   1.5282   -0.03953   -0.5   -0.00364   0.44147
which slightly better estimates are obtained with the least-squares method. In Figure 5, the standard deviation of the regression coefficients is shown; the trends are similar to those observed for the least-squares method. Slightly larger values of standard deviation are observed for the maximum-likelihood method than those obtained with the least-squares method. In Table 3 estimates of $\gamma_e$ related to different sets are shown; very accurate estimates are observed with $n_o \geq 200$.

Figure 1. PGA attenuation with distance for the synthetic data used in this study (sixth synthetic set, Table 1).

Prior Information for the Bayesian Method
In this section we present a short discussion about how to define the prior information for the Bayesian analysis. We have not included a sound discussion because a synthetic example is presented. In a companion article (Arroyo and Ordaz, 2010) we use a set of actual ground-motion records and we present a complete discussion on how the prior information can be defined.
The elements of $\alpha_0$ were set as follows. With the amplitude Fourier spectra defined by Brune's model (Brune, 1970) and common attenuation factors, and using random vibration theory, we constructed a set of SA spectra related to several values of $M_w$ and $R$ (McGuire and Hanks, 1980; Boore, 1983). Then, we fit the values so computed to the functional form using the least-squares method in order to compute $\alpha_0$; the elements of $\alpha_0$ obtained are shown in Table 4. This implies that a priori we believe that the attenuation of SA spectra can be properly characterized by Brune's model and common attenuation factors.
The block-symmetric matrix $\Delta$ is the prior covariance matrix of $\alpha_V$ and can be written as
$$\Delta = \begin{bmatrix} \delta_{11} & \delta_{12} & \cdots & \delta_{1 n_p} \\ & \delta_{22} & \cdots & \delta_{2 n_p} \\ & & \ddots & \vdots \\ & & & \delta_{n_p n_p} \end{bmatrix} \tag{29}$$
where $\delta_{ij}$ is an $n_T \times n_T$ symmetric matrix defined in equation (30):
$$\delta_{ij} = \begin{bmatrix} \mathrm{cov}(\alpha_{iT_1}, \alpha_{jT_1}) & \mathrm{cov}(\alpha_{iT_1}, \alpha_{jT_2}) & \cdots & \mathrm{cov}(\alpha_{iT_1}, \alpha_{jT_{n_T}}) \\ & \mathrm{cov}(\alpha_{iT_2}, \alpha_{jT_2}) & \cdots & \mathrm{cov}(\alpha_{iT_2}, \alpha_{jT_{n_T}}) \\ & & \ddots & \vdots \\ & & & \mathrm{cov}(\alpha_{iT_{n_T}}, \alpha_{jT_{n_T}}) \end{bmatrix} \tag{30}$$
In equation (30) $\mathrm{cov}(\alpha_{iT_k}, \alpha_{jT_l})$ is the prior covariance between the coefficient $\alpha_i(T)$ for the period $T_k$ and the coefficient $\alpha_j(T)$ for the period $T_l$. The matrix $\Delta$ was set as diagonal; therefore, only the diagonal elements of $\delta_{11}$, $\delta_{22}$, $\delta_{33}$, $\delta_{44}$, and $\delta_{55}$ are different from zero. This implies that a priori we believe that the different $\alpha_i(T)$ random variables are uncorrelated. We set that structure for $\Delta$ because, from the presented example, we do not have information about correlation between the different $\alpha_i(T)$. Because the coefficient $\alpha_1(T)$ in equation (1) depends on site effects (Ordaz et al., 1994), we assigned a large value, with respect to the prior mean value of $\alpha_1(T)$, to the diagonal elements of $\delta_{11}$. This implies that $\alpha_1(T)$ is not controlled by its prior mean value, so it is free to attain the value that yields the best fit in the regression analysis. In the computations we used $\mathrm{cov}(\alpha_{1T_k}, \alpha_{1T_k}) = 10{,}000$. For $\delta_{22}$, $\delta_{33}$, and $\delta_{44}$ we assigned a value that implies a coefficient of variation of 0.59 to their diagonal elements, as it was done in a previous study (Ordaz et al., 1994). Similarly to $\delta_{11}$, for the diagonal elements of $\delta_{55}$, we also used a large value with respect to the prior mean value of $\alpha_5(T)$; in the computations we used $\mathrm{cov}(\alpha_{5T_k}, \alpha_{5T_k}) = 1$.

Figure 2. Results for least-squares method (benchmark values are plotted with a thick line).
Figure 3. Standard deviation of the regression coefficients for the least-squares method (the symbols are the same as in Fig. 2).
Figure 4. Results for one-stage maximum-likelihood method (benchmark values are plotted with a thick line).
Figure 5. Standard deviation of the regression coefficients for the one-stage maximum-likelihood method (the symbols are the same as in Fig. 4).

We set the diagonal elements of $\Sigma_0$ equal to 0.49, which means that a priori we believe that the expected standard deviation of the residuals is equal to 0.7, independently of
$T$. In addition, the off-diagonal terms were defined through the coefficient of correlation shown in equation (31):
$$\rho^0_{T_1,T_2} = \exp\left(-q\,|T_1 - T_2|\right) \tag{31}$$
Note that we use a very simple function to define the prior correlation between spectral ordinates of SA. According to equation (31), together with $q = 1.0$, the coefficient of correlation varies from unity, when $T_1$ and $T_2$ are equal, to nearly 0.05 when the difference between $T_1$ and $T_2$ is about 3 sec. It must be acknowledged that equation (31) is quite arbitrary, and it was created only for the synthetic example presented in this article. When working with actual ground motions it could be more reasonable to use the correlation coefficients defined by Baker and Cornell (2006) as prior values. In the presented example we decided not to use the equations of Baker and Cornell (2006) as prior information because they were used to generate the synthetic data.
The degree of certainty of $\Sigma_0$ depends on $\nu$. As it has been discussed, the larger the value of $\nu$, the larger the degree of certainty on $\Sigma_0$. The posterior mean value of $\Sigma$ can be expressed as a weighted average between the prior mean value and the conditional weighted least-squares estimator. The weighting factors are $\nu/(n_o + \nu)$ for the prior mean value and $n_o/(n_o + \nu)$ for the conditional weighted least-squares estimate. In the computations we have used a value equal to 55 (the minimum required for $n_T = 25$ in order to give a finite value to the covariance of $\Sigma_0$) because we believe that $\Sigma_0$ is very uncertain. For $n_o = 100$ the weighting factor for prior information is about 0.35.
Finally, our prior information of $\gamma_e$ is vague, so we have set $a = b = 1.5$, which is a very flat density, with mean value equal to 0.5, in order to force $\gamma_e$ to take the value that yields the best fit to data.

Table 4
α_0 Used in the Analysis
T (sec)   α1      α2      α3       α4       α5
0.0       7.463   0.785   -0.026   -0.758   -0.005
0.1       6.303   0.835   -0.035   -0.601   -0.005
0.2       5.521   0.914   -0.050   -0.541   -0.004
0.3       5.114   0.977   -0.062   -0.525   -0.004
0.4       4.826   1.032   -0.073   -0.518   -0.003
0.5       4.597   1.083   -0.082   -0.514   -0.003
0.6       4.405   1.130   -0.092   -0.512   -0.003
0.7       4.236   1.176   -0.101   -0.510   -0.003
0.8       4.084   1.221   -0.111   -0.509   -0.003
0.9       3.945   1.266   -0.120   -0.508   -0.002
1.0       3.817   1.310   -0.130   -0.507   -0.002
1.2       3.575   1.398   -0.150   -0.506   -0.002
1.4       3.363   1.485   -0.170   -0.505   -0.002
1.6       3.149   1.568   -0.189   -0.505   -0.002
1.8       3.015   1.641   -0.205   -0.504   -0.002
2.0       2.873   1.699   -0.216   -0.503   -0.002
2.2       2.740   1.729   -0.216   -0.503   -0.002
2.4       2.542   1.718   -0.200   -0.503   -0.002
2.6       2.482   1.616   -0.154   -0.502   -0.002
2.8       2.429   1.641   -0.160   -0.503   -0.002
3.0       2.183   1.768   -0.191   -0.503   -0.002
3.5       1.938   1.832   -0.195   -0.503   -0.002
4.0       2.002   1.944   -0.236   -0.499   -0.002
4.5       1.485   1.995   -0.251   -0.532   -0.001
5.0       1.700   1.987   -0.207   -0.501   -0.001

Results for the Bayesian Model
In order to exemplify the convergence of the Gibbs sampling, Figure 6 shows a comparison between estimates of regression parameters for $K$ ranging between 50 and 200. As has been stated, any value of $\Sigma^{(0)}$ and $\gamma_e^{(0)}$ could be used as the starting value; for the computations we set $\gamma_e^{(0)} = 0$ and $\Sigma^{(0)} = I$. In addition, the prior mean values used in the analysis are plotted in Figure 6. We observed that the Gibbs sampling method converges for relatively small values of $K$. Although not shown, similar trends were observed for other values of $n_o$. In Table 5 convergence of the Gibbs sampling regarding $\gamma_e$ is presented; $\gamma_e$ also converges for $K$ between 100 and 200. We note that these trends are valid only for the presented example; for other cases of analysis it is recommended to build plots similar to Figure 6 in order to assess the convergence of the Gibbs sampling. The value of $K$ should be chosen carefully because computation time required for the analysis grows with $K$. In order to minimize the computational effort, the peak value of $K$ used in the Bayesian analysis was 500.

Table 3
Values of γ_e Obtained through One-Stage Maximum-Likelihood Method and Bayesian Method
Benchmark γ_e = 0.2234   One-Stage Maximum-Likelihood Method   Bayesian Method
n_o = 12                 0.7490                                0.3892
n_o = 50                 0.0000                                0.2336
n_o = 100                0.1148                                0.2596
n_o = 200                0.1872                                0.2351
n_o = 500                0.2114                                0.2382
n_o = 1000               0.2235                                –

In Figure 7 the weighting factors computed through Gibbs sampling and equations (27) and (28) are shown for $n_o = 200$. As expected, for $\alpha_1(T)$ and $\alpha_5(T)$ the contribution of the prior mean values is marginal because we set, a priori, a very large value for their covariance. In the case of $\alpha_2(T)$ the contribution of the prior mean value is also marginal. However, as can be observed in Figure 6, the posterior mean values of $\alpha_2(T)$ are similar to the prior mean values especially for systems in the range between 0.1 and 1.8 sec. This means that prior mean values of $\alpha_2(T)$ are similar to the values obtained from data. On the other hand, greater contribution of prior mean values is observed for $\alpha_3(T)$ and $\alpha_4(T)$. For $\alpha_3(T)$ the greater contribution of prior mean values is observed for $T$ shorter than 0.6 sec, while for longer periods almost a constant value of 0.4 was observed.
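To see why a very large prior variance makes the prior contribution marginal, the weights of equations (27) and (28) can be reduced to the scalar case. The sketch below assumes that both the prior covariance and the data covariance are diagonal, so each weight becomes a simple ratio of precisions; this is a simplification (the data covariance is generally not diagonal), and the function names and the data variance of 0.04 are hypothetical. The second function evaluates the weight $\nu/(n_o + \nu)$ that the text gives for the prior mean of $\Sigma$.

```python
def prior_and_data_weights(prior_var, data_var):
    """Scalar analogue of equations (27)-(28) for one coefficient.

    Assumes diagonal prior and data covariances, so each weight is a
    ratio of precisions (inverse variances). Returns (w_prior, w_data).
    """
    prior_prec = 1.0 / prior_var
    data_prec = 1.0 / data_var
    total = prior_prec + data_prec
    return prior_prec / total, data_prec / total

def sigma_prior_weight(nu, n_o):
    """Weight of the prior mean of Sigma: nu / (n_o + nu)."""
    return nu / (n_o + nu)

# Hypothetical data variance of 0.04 in both cases.
w_p_loose, w_d_loose = prior_and_data_weights(prior_var=10000.0, data_var=0.04)
w_p_tight, w_d_tight = prior_and_data_weights(prior_var=0.04, data_var=0.04)
w_sigma = sigma_prior_weight(nu=55, n_o=100)  # about 0.35, as stated in the text
```

With a prior variance of 10,000 the prior weight is of the order of 1e-6, whereas equal prior and data variances split the weight evenly.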
Figure 6. Convergence of Gibbs sampling for the set with n_o = 200 (prior values are plotted with a thick line).

Table 5
Convergence of γ_e for n_o = 200
Prior     γ_e = 0.5
K = 12    γ_e = 0.2367
K = 50    γ_e = 0.2338
K = 100   γ_e = 0.2353
K = 200   γ_e = 0.2351

For α4(T) a nearly constant value of 0.9 is observed. We note that in both cases the posterior mean values are close to their prior counterparts, independent of W_p; this observation will be discussed later in the article.

The results obtained with the Bayesian model are presented in Figures 8 and 9. In contrast to what happened with the other models, the Bayesian model was able to attain the benchmark values, except for α3(T). Note that with n_o between 100 and 200 very accurate estimates of the regression parameters are observed. In general, the scatter of the Bayesian regression coefficients is smaller than that observed with other methods, which is an advantage of the Bayesian approach. In Table 3 estimates of γ_e related to different sets are shown; accurate estimates are observed for n_o greater than 50. Note that in spite of our use of a prior mean value of γ_e = 0.5, the data shifted the prior mean value to the correct value of γ_e.

As has been stated, the Bayesian method was not able to attain the correct value of α3(T). According to Figure 8, as n_o increases, slightly better estimates are observed, especially for T larger than 2 sec. However, in general, posterior mean values are close to prior mean values of α3(T), even if the weighting factor decreases. For example, in Figure 7 it can be seen that for n_o = 200, W_p is about 0.4 in the long-period range, while for n_o = 500 (although not shown) a W_p value of about 0.2 was observed. This means that the information contained in the data is not enough to completely define α3(T); in other words, α3(T) has little effect on y and almost any value could have been used. Hence, the Bayesian method, instead of leading to an arbitrary value of α3(T) (as other methods do), leads to values of α3(T) that are close to its prior mean value.

Figure 7. Weighting factors observed for n_o = 200 (black circles, weighting factor for prior information; thick line, weighting factor for data information).

Discussion

Although it is clear that, at least for the presented example, the Bayesian method usually yields regression parameters that are closer to the population parameters than those obtained with other methods, it must be acknowledged that Bayesian estimates are not related to the minimum standard error. Furthermore, as can be observed in Figures 2, 4, and 8, the three methods yield very similar values of σ. Hence, we decided to assess the global accuracy of the models as follows. Suppose that a regression analysis is performed with a set of size n_o and for a given T we obtain estimates of the regression coefficients. Because for the synthetic example the true coefficients are known, the expected value of the error that would be observed if the estimated coefficients (related to some value of n_o) were applied is given by

$$
\begin{aligned}
E(z) = {} & (\alpha_{1R} - \hat{\alpha}_1) + (\alpha_{2R} - \hat{\alpha}_2)\,E(M_w - 6) \\
& + (\alpha_{3R} - \hat{\alpha}_3)\,E[(M_w - 6)^2] + (\alpha_{4R} - \hat{\alpha}_4)\,E(\ln R) \\
& + (\alpha_{5R} - \hat{\alpha}_5)\,E(R) + E(\varepsilon), \qquad (32)
\end{aligned}
$$

where α_{jR} is the true value of the jth coefficient of the regression and α̂_j is its corresponding estimate. Note that for a very large set α̂_j tends to α_{jR}; hence, E(z) = E(ε) = 0.
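Furthermore, the variance of z is given by

$$
\begin{aligned}
E(z^2) = {} & (\alpha_{1R} - \hat{\alpha}_1)^2 + (\alpha_{2R} - \hat{\alpha}_2)^2\,E[(M_w - 6)^2] + (\alpha_{3R} - \hat{\alpha}_3)^2\,E[(M_w - 6)^4] \\
& + (\alpha_{4R} - \hat{\alpha}_4)^2\,E[(\ln R)^2] + (\alpha_{5R} - \hat{\alpha}_5)^2\,E(R^2) + E(\varepsilon_i^2) \\
& + 2(\alpha_{1R} - \hat{\alpha}_1)(\alpha_{2R} - \hat{\alpha}_2)\,E(M_w - 6) \\
& + 2(\alpha_{1R} - \hat{\alpha}_1)(\alpha_{3R} - \hat{\alpha}_3)\,E[(M_w - 6)^2] \\
& + 2(\alpha_{1R} - \hat{\alpha}_1)(\alpha_{4R} - \hat{\alpha}_4)\,E(\ln R) \\
& + 2(\alpha_{1R} - \hat{\alpha}_1)(\alpha_{5R} - \hat{\alpha}_5)\,E(R) \\
& + 2(\alpha_{2R} - \hat{\alpha}_2)(\alpha_{3R} - \hat{\alpha}_3)\,E[(M_w - 6)^3] \\
& + 2(\alpha_{2R} - \hat{\alpha}_2)(\alpha_{4R} - \hat{\alpha}_4)\,E[(M_w - 6)\ln R] \\
& + 2(\alpha_{2R} - \hat{\alpha}_2)(\alpha_{5R} - \hat{\alpha}_5)\,E[(M_w - 6)R] \\
& + 2(\alpha_{3R} - \hat{\alpha}_3)(\alpha_{4R} - \hat{\alpha}_4)\,E[(M_w - 6)^2 \ln R] \\
& + 2(\alpha_{3R} - \hat{\alpha}_3)(\alpha_{5R} - \hat{\alpha}_5)\,E[(M_w - 6)^2 R] \\
& + 2(\alpha_{4R} - \hat{\alpha}_4)(\alpha_{5R} - \hat{\alpha}_5)\,E(R \ln R). \qquad (33)
\end{aligned}
$$

For a very large set, α̂_j tends to α_{jR}, so E(z²) = E(ε²), which is the value of σ² shown in Table 2.

In order to assess the global accuracy of the predictions related to the three methods, in Figure 10 we present a comparison of σ_p = √E(z²) for different values of n_o. For small sets (i.e., n_o = 12 and 50) the global accuracy of the Bayesian method is better than that observed for the other two methods. Nevertheless, for n_o greater than 100 the global accuracy of the three methods is practically the same. Note that the global accuracy for the least-squares and one-stage maximum-likelihood methods is very similar in all cases and that the same level of accuracy is attained with the three methods regardless of differences observed in the regression coefficients. As a reference, in Table 6 the regression coefficients related to different methods are compared for n_o = 200 and T = 0 (i.e., PGA). Hence, it can be concluded that there are multiple solutions for the regression analysis that are close to the minimum error solution, and as can be observed in Figures 2, 4, and 8 those solutions might be very different.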
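The expansion in equation (33) can be written compactly as E(z²) = dᵀ E[x xᵀ] d + E(ε²), where d is the vector of differences α_{jR} − α̂_j and x = [1, M_w − 6, (M_w − 6)², ln R, R]. The following Python sketch evaluates σ_p this way. It is only an illustration: the magnitude and distance distributions are hypothetical placeholders (the actual synthetic set is not reproduced here), the function names are ours, the coefficients are the T = 0 values of Tables 6 and 7 with α3, α4, and α5 taken as negative, and ε is assumed independent of M_w and R with zero mean.

```python
import numpy as np

def design_row(mw, r):
    """Regressors implied by equation (32): 1, Mw-6, (Mw-6)^2, ln R, R."""
    m = mw - 6.0
    return np.stack([np.ones_like(m), m, m**2, np.log(r), r], axis=-1)

def sigma_p(alpha_true, alpha_hat, mw, r, sigma_eps):
    """sqrt(E[z^2]), with expectations replaced by sample moments of (mw, r)."""
    d = np.asarray(alpha_true) - np.asarray(alpha_hat)
    x = design_row(mw, r)
    moments = x.T @ x / len(x)  # sample estimate of E[x x^T]
    return float(np.sqrt(d @ moments @ d + sigma_eps**2))

# Hypothetical magnitude and distance samples (assumption, not the paper's set).
rng = np.random.default_rng(0)
mw = rng.uniform(5.0, 8.0, size=20000)
r = rng.uniform(250.0, 400.0, size=20000)

# T = 0 values: benchmark (Table 7) and Bayesian estimate (Table 6).
alpha_true = [5.8929, 1.2457, -0.09757, -0.5, -0.00632]
alpha_bayes = [7.5033, 1.0779, -0.0243, -0.8729, -0.0042]
sigma_eps = 0.41983

sp_exact = sigma_p(alpha_true, alpha_true, mw, r, sigma_eps)   # equals sigma_eps
sp_bayes = sigma_p(alpha_true, alpha_bayes, mw, r, sigma_eps)  # never below sigma_eps
print(sp_exact, sp_bayes)
```

When the estimates coincide with the true coefficients, d vanishes and σ_p reduces to σ, which is the limiting case noted in the text. The numerical value obtained for the Bayesian coefficients depends entirely on the assumed distributions of M_w and R.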
Figure 8. Results for Bayesian method (benchmark values are plotted with a thick line).
Figure 9. Standard deviation of the regression coefficients for Bayesian method (the symbols are the same as in Fig. 8).
Figure 10. Comparison of σ_p related to different methods (benchmark values are plotted with a thick line).

Based on the last observation one might consider that the great computational and analytical work required by the Bayesian model is not warranted. However, the Bayesian approach has some advantages that make its application worthwhile:

1. In the case of few data points, the accuracy of the Bayesian method is greater than that obtained with other methods. Usually, the scatter of the regression coefficients is smaller for the Bayesian model than for other methods, so narrower confidence intervals would be obtained for the Bayesian coefficients.
2. From theoretical grounds α4(T) must be negative. However, the least-squares and one-stage maximum-likelihood methods yield positive values of this coefficient even for n_o = 1000, that is, a very large sample. In order to avoid this problem, some authors have fixed α4(T) during regression analysis, normally at a value of −0.5 for far-field ground motions and −1.0 for near-field records. But, even without fixing a predefined value of α4(T), the Bayesian analysis yielded a value for this coefficient that matches seismological theory. For instance, regarding the results presented in Table 6, even when the three methods yield the same level of accuracy, most analysts would definitely not use results obtained with the least-squares and one-stage maximum-likelihood methods because α2(T) and α4(T) are very far from what would be expected from seismological theory.
3. It is true that as long as values of M and R lie in the ranges observed in the sample, the three methods yield the same level of accuracy, even when some coefficients seem theoretically unacceptable. However, we decided to compare the predicted SA spectra, obtained with coefficients shown in Table 6, with the corresponding benchmark spectra. For the comparison we chose M = 7 and four values of R (150, 200, 450, and 500 km). These values are out of the range of the data contained in the synthetic set; thus, this comparison can be regarded as an evaluation of the possibility of extrapolating the GMPEs. The results are summarized in Figure 11. The Bayesian regression yields acceptable results, but the least-squares and one-stage maximum-likelihood methods might lead to very inaccurate results. Note that for R = 200, large differences are observed for the least-squares and one-stage maximum-likelihood methods, in spite of the fact that this distance is only 20% smaller than the minimum value of R included in the synthetic set.

Based on the results discussed in this section we conclude that the analytical and computational effort required by the Bayesian method is completely warranted. Moreover, in the future, the computation time required by the Bayesian analysis will certainly decrease as more powerful computers are developed.
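As a rough illustration of the extrapolation check described in item 3, the Python sketch below evaluates the predicted ordinate at T = 0 for M = 7 and the four distances, and reports the deviation of each method from the benchmark. Several things here are assumptions rather than statements from the article: the functional form y = α1 + α2(M_w − 6) + α3(M_w − 6)² + α4 ln R + α5 R is inferred from equation (32), the α3, α4, and α5 entries of Tables 6 and 7 are taken as negative, and the names `COEFFS`, `predict`, and `deviations` are hypothetical.

```python
import math

# T = 0, n_o = 200 coefficients (alpha1..alpha5); benchmark row from Table 7,
# remaining rows from Table 6. Signs of alpha3..alpha5 are assumed negative.
COEFFS = {
    "benchmark": (5.8929, 1.2457, -0.09757, -0.5, -0.00632),
    "least-squares": (3.7863, 1.4263, -0.2264, -0.1136, -0.0065),
    "maximum-likelihood": (9.6409, 1.4263, -0.2252, -1.3340, -0.0028),
    "Bayesian": (7.5033, 1.0779, -0.0243, -0.8729, -0.0042),
}

def predict(alpha, mw, r):
    """Median prediction for the functional form inferred from equation (32)."""
    a1, a2, a3, a4, a5 = alpha
    m = mw - 6.0
    return a1 + a2 * m + a3 * m**2 + a4 * math.log(r) + a5 * r

def deviations(mw, distances):
    """Prediction minus benchmark prediction, per method and distance."""
    base = COEFFS["benchmark"]
    return {
        name: [predict(alpha, mw, r) - predict(base, mw, r) for r in distances]
        for name, alpha in COEFFS.items()
        if name != "benchmark"
    }

dev = deviations(7.0, [150.0, 200.0, 450.0, 500.0])
for name, values in dev.items():
    print(name, [round(v, 3) for v in values])
```

This covers a single period only, whereas Figure 11 compares complete spectra, so the sketch should not be read as a reproduction of that figure.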
Table 6
Regression Parameters for n_o = 200 and T = 0

Method                          α1(T)    α2(T)    α3(T)     α4(T)     α5(T)
Least-squares                   3.7863   1.4263   −0.2264   −0.1136   −0.0065
One-stage maximum-likelihood    9.6409   1.4263   −0.2252   −1.3340   −0.0028
Bayesian                        7.5033   1.0779   −0.0243   −0.8729   −0.0042
Figure 11. Comparison of SA spectra related to different methods (the symbols are the same as in Fig. 10).

One question that naturally arises after going through a Bayesian analysis is how dependent the results are on the prior information. It has been argued that, because of the inherent subjectivity of the Bayesian approach, it lends itself to the perpetuation of unrecognized error. However, in a previous work of the second author it has been shown that this objection is not justified (see Ordaz et al., 1994, for further details).

Differences between Univariate and Multivariate Analysis

To finish the article, we discuss differences between performing a multivariate analysis and the common practice of performing the analysis period by period. For the case of the least-squares method it is clear, from equation (6) together with γ_e = 0, that the results are equal for both types of analysis and that only the diagonal elements of matrix Σ can be estimated through univariate analysis. Conversely, in the case of the one-stage maximum-likelihood and the Bayesian methods, the results obtained are different because the correlation structures of E are also different. For the multivariate analysis the correlations between elements of E are defined in equation (3) (the value of γ_e is the same for all values of T), while in the univariate analysis the correlation structure cannot be described by equation (3) because different values of γ_e are obtained for each value of T. That is, the covariance matrix Ω is coupled and it cannot be expressed as the Kronecker product of two matrices.

If the matrix Ω cannot be separated, the likelihood function should be written as

$$
L(Y \mid \alpha, \Omega, X) \propto |\Omega|^{-1/2} \exp\left\{ -\frac{1}{2} \left[\operatorname{vec}(Y) - \operatorname{vec}(X\alpha^{T})\right]^{T} \Omega^{-1} \left[\operatorname{vec}(Y) - \operatorname{vec}(X\alpha^{T})\right] \right\}. \qquad (34)
$$

Note that Ω is a function of γ_e for each period (γ_{eT_i}), of the coefficients of correlation between residuals for different periods, and of the variances of the residuals for each period. Although not shown, the elements of Ω can be defined according to the following rules:

1. For a given earthquake and a given site, the coefficient of correlation between residuals for two different periods, T_1 and T_2, is equal to $\rho_{T_1,T_2}\left[\sqrt{\gamma_{eT_1}\gamma_{eT_2}} + \sqrt{(1-\gamma_{eT_1})(1-\gamma_{eT_2})}\right]$.
2. For a given earthquake the coefficient of correlation between residuals for the same period (T_i) at different sites is equal to $\gamma_{eT_i}$.
3. For a given earthquake the coefficient of correlation between residuals for two different periods, say T_1 and T_2, at different sites is equal to $\rho_{T_1,T_2}\sqrt{\gamma_{eT_1}\gamma_{eT_2}}$.
4. Residuals related to different earthquakes are independent.
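The four rules for the elements of Ω can be turned into a small construction for a single earthquake, which also makes the coupling visible. The Python sketch below is an illustration under stated assumptions: the residual vector is ordered period by period (all sites for the first period, then all sites for the second, and so on), the operator joining the two square roots in the same-site rule is taken to be a sum, and the names `omega_one_event` and `log_likelihood` as well as the numerical inputs are hypothetical. The log-likelihood follows equation (34) up to an additive constant.

```python
import numpy as np

def omega_one_event(gamma, rho, sigma, n_sites):
    """Covariance of residuals for one earthquake, period-major ordering.

    gamma: gamma_e for each period; rho: between-period correlation matrix
    (ones on the diagonal); sigma: residual standard deviation per period.
    """
    gamma = np.asarray(gamma, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    rho = np.asarray(rho, dtype=float)
    scale = np.outer(sigma, sigma)
    # Part shared by all sites of the event (rules 2 and 3).
    inter = rho * np.outer(np.sqrt(gamma), np.sqrt(gamma)) * scale
    # Part present only when both residuals belong to the same site (rule 1).
    intra = rho * np.outer(np.sqrt(1.0 - gamma), np.sqrt(1.0 - gamma)) * scale
    ones = np.ones((n_sites, n_sites))
    return np.kron(inter, ones) + np.kron(intra, np.eye(n_sites))

def log_likelihood(residual, omega):
    """Logarithm of equation (34), up to an additive constant."""
    _, logdet = np.linalg.slogdet(omega)
    quad = residual @ np.linalg.solve(omega, residual)
    return -0.5 * logdet - 0.5 * quad

# Hypothetical inputs: two periods, three sites.
rho = np.array([[1.0, 0.8], [0.8, 1.0]])
sigma = np.array([0.4, 0.5])
omega = omega_one_event([0.2, 0.3], rho, sigma, n_sites=3)

# With a single gamma_e for all periods, Omega separates into one Kronecker product.
g = 0.25
omega_equal = omega_one_event([g, g], rho, sigma, n_sites=3)
site_corr = g * np.ones((3, 3)) + (1.0 - g) * np.eye(3)
separable = np.allclose(omega_equal, np.kron(rho * np.outer(sigma, sigma), site_corr))
print(separable)
```

In this construction Ω is a sum of two Kronecker products, which collapses to a single product only when γ_e is the same for every period. For several earthquakes, rule 4 implies that the full matrix is block diagonal, with one such block per event.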
Table 7
Comparison of Results Obtained through Multivariate and Univariate Analysis, T = 0 and n_o = 200

Method*     α1(T)     α2(T)    α3(T)      α4(T)      α5(T)      σ         γ_e
Benchmark   5.8929    1.2457   −0.09757   −0.5       −0.00632   0.41983   0.2234
MML         9.6409    1.4263   −0.2252    −1.3340    −0.0028    0.3925    0.1872
UML         11.6577   1.4237   −0.2232    −1.7608    −0.0015    0.3714    0.0362
MB          7.5033    1.0779   −0.0243    −0.8729    −0.0042    0.4131    0.2351
UB          7.1722    1.0962   −0.0275    −0.80207   −0.0044    0.3953    0.1362

*MML, multivariate maximum-likelihood; UML, univariate one-stage maximum-likelihood; MB, multivariate Bayesian; UB, univariate Bayesian.
The one-stage maximum-likelihood method for this correlation structure consists of maximizing equation (34). However, the maximization process is much more complex than in the case of a single γ_e for all periods because maximization must be carried out considering simultaneously all the coefficients of correlation between residuals, all the γ_{eT_i} parameters, and all the variances of the residuals. Also, the Bayesian model becomes very complex if different values of γ_e are adopted for each period because prior densities are needed for all γ_{eT_i}, and those parameters are coupled with the variances of the residuals and with the coefficients of correlation between residuals for different periods. Because we consider that the application of this Bayesian model would be impractical, the complete analysis is not presented in this article. However, the complete analysis can be obtained following the procedure presented in the section titled The Bayesian Model.

Thus, the common practice of performing the analysis period by period implicitly disregards the correlation between residuals for different periods. The influence of this assumption on the accuracy of the GMPE depends on how large this correlation is. In Table 7 we present a comparison of regression parameters obtained through multivariate maximum-likelihood, multivariate Bayesian, univariate one-stage maximum-likelihood, and univariate Bayesian methods for n_o = 200 and T = 0 (i.e., PGA). Unsurprisingly, the multivariate results are closer to the benchmark values, especially for γ_e, because the synthetic data were generated with the correlation structure defined in equation (3). Nevertheless, in the case of data obtained from actual ground motions, the true correlation structure is unknown; hence, we cannot consider that the multivariate analysis is more accurate than the univariate analysis based only on the results presented in Table 7. However, from the discussion presented in this section we consider that the multivariate regression model is theoretically more robust than the common practice of performing the analysis period by period.

It is interesting to note that during the computations we observed that the time required in the Bayesian analysis for the univariate and the multivariate cases is almost the same because the rank of matrix Φ is equal in both cases. In practice, an approach that can be used is to perform the univariate analysis and include the correlation between spectral ordinates through the copula technique described in Goda and Atkinson (2009).

Conclusions

We have presented a linear multivariate Bayesian regression method that includes the correlation between observations for a given earthquake, the correlation between SA ordinates at different periods, and the correlation between regression coefficients of the GMPE. Through comparisons of GMPEs obtained with the least-squares and the one-stage maximum-likelihood methods we have shown that multiple solutions close to minimum error could exist and that the Bayesian method could be used to obtain a GMPE consistent with seismological theory. In addition, for the presented synthetic example, it is shown that GMPEs obtained through Bayesian analysis yield more accurate results than GMPEs related to other methods when the GMPEs are extrapolated. However, the Bayesian method requires significantly more analytical and computational work than traditional methods. Hence, our computer code to perform linear Bayesian analyses is freely available on request.

Data and Resources

No actual data were used in this article. The synthetic data are available on request.

Acknowledgments

The authors appreciate the efforts of two anonymous reviewers, who greatly contributed to improving the original manuscript. Constructive comments by Gail Atkinson are also appreciated.

References

Abrahamson, N. A., and R. R. Youngs (1992). A stable algorithm for regression analysis using the random effects model, Bull. Seismol. Soc. Am. 82, 505–510.
Arroyo, D., and M. Ordaz (2010). Multivariate Bayesian regression analysis applied to ground motion prediction equations, Part 2: Numerical example with actual data, Bull. Seismol. Soc. Am. 100, 1568–1577.
Baker, J. W., and C. A. Cornell (2006). Correlation of response spectral values for multicomponent ground motions, Bull. Seismol. Soc. Am. 96, 215–227.
Boore, D. M. (1983). Stochastic simulation of high-frequency ground motions based on seismological models of the radiated spectra, Bull. Seismol. Soc. Am. 73, no. 6, 1865–1894.
Boore, D. M., J. F. Gibbs, W. B. Joyner, J. C. Tinsley, and D. J. Ponti (2003). Estimated ground motion from the 1994 Northridge, California, earthquake at the site of the Interstate 10 and La Cienega boulevard bridge collapse, west Los Angeles, California, Bull. Seismol. Soc. Am. 93, 2737–2751.
Brillinger, D. R., and H. K. Preisler (1984). An exploratory analysis of the Joyner–Boore attenuation data, Bull. Seismol. Soc. Am. 74, 1441–1450.
Broemeling, L. D. (1985). Bayesian Analysis of Linear Models, Marcel Dekker, New York, 454 pp.
Brune, J. N. (1970). Tectonic stresses and spectra of seismic waves from earthquakes, J. Geophys. Res. 75, 4997–5009.
Campbell, K. W. (1981). Near-source attenuation of peak horizontal acceleration, Bull. Seismol. Soc. Am. 71, 2039–2070.
Campbell, K. W. (1985). Strong motion attenuation relations: A ten-year perspective, Earthq. Spectra 1, no. 4, 759–804.
Geman, S., and D. Geman (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Trans. Pattern Anal. Mach. Intell. 6, 721–741.
Goda, K., and G. M. Atkinson (2009). Interperiod dependence of ground-motion prediction equations: A copula perspective, Bull. Seismol. Soc. Am. 99, no. 2A, 922–927.
Joyner, W. B., and D. M. Boore (1993). Methods for regression analysis of strong-motion data, Bull. Seismol. Soc. Am. 83, no. 2, 469–487.
Joyner, W. B., and D. M. Boore (1994). Errata: Methods for regression analysis of strong-motion data, Bull. Seismol. Soc. Am. 84, no. 3, 955–956.
Kawakami, H., and H. Mogi (2003). Analyzing spatial intraevent variability of peak ground accelerations as a function of separation distance, Bull. Seismol. Soc. Am. 93, 1079–1090.
McGuire, R. K., and T. C. Hanks (1980). RMS accelerations and spectral amplitudes of strong ground motion during the San Fernando, California, earthquake, Bull. Seismol. Soc. Am. 70, no. 5, 1907–1919.
Ordaz, M., S. K. Singh, and A. Arciniega (1994). Bayesian attenuation regressions: An application to Mexico City, Geophys. J. Int. 117, 335–344.
Reyes, C. (1999). El estado límite de servicio en el diseño sísmico de edificios, Ph.D. Thesis, School of Engineering, UNAM.
Rowe, D. B. (2002). Multivariate Bayesian Statistics: Models for Source Separation and Signal Unmixing, Chapman & Hall/CRC, New York, 329 pp.
Sibilio, E. (2006). Seismic risk assessment of structures by means of stochastic simulation techniques, Ph.D. Thesis, Braunschweig Technical University and Florence University.
Veneziano, D., and M. Heidari (1985). Statistical analysis of attenuation in the eastern United States, in Methods of Earthquake Ground-Motion Estimation for the Eastern United States, EPRI Research Project RP2556-16, Palo Alto, California.
Wang, M., and T. Takada (2005). Macrospatial correlation model of seismic ground motions, Earthq. Spectra 21, 1137–1156.
Wang, M., and T. Takada (2009). A Bayesian framework for prediction of seismic ground motion, Bull. Seismol. Soc. Am. 99, no. 4, 2348–2364.

Departamento de Materiales
Universidad Autónoma Metropolitana-Azcapotzalco
Av. San Pablo # 180. Colonia Reynosa Tamaulipas
Azcapotzalco CP 02200
México D.F., México
aresda@correo.azc.uam.mx
(D.A.)

Instituto de Ingeniería
UNAM Ciudad Universitaria
Coyoacán CP 04510
México D.F. México
mors@pumas.iingen.unam.mx
(M.O.)

Manuscript received 3 December 2008
