State-of-the-art review on Bayesian inference in structural system identification and damage assessment

Yong Huang1,2, Changsong Shao1,2, Biao Wu3, James L. Beck4 and Hui Li1,2

Invited Reviews
Advances in Structural Engineering, 2019, Vol. 22(6) 1329–1351
© The Author(s) 2018
Article reuse guidelines: sagepub.com/journals-permissions
DOI: 10.1177/1369433218811540
journals.sagepub.com/home/ase

Abstract
Bayesian inference provides a powerful approach to system identification and damage assessment for structures. The application of the Bayesian method is motivated by the fact that inverse problems in structural engineering, including structural health monitoring, are typically ill-conditioned and ill-posed when using noisy, incomplete data because of various sources of modeling uncertainty. One should not just search for a single "optimal" value for the vector of model parameters but rather attempt to describe the whole family of plausible model parameters based on measured data using a Bayesian probabilistic framework. In this article, the fundamental principles of Bayesian analysis and computation are summarized; then a review is given of recent state-of-the-art practices of Bayesian inference in system identification and damage assessment for civil infrastructure. Discussions of the benefits and deficiencies of these approaches, as well as potentially useful avenues for future studies, are also provided. Our focus is on meeting challenges that arise from system identification and damage assessment for civil infrastructure, but the theories presented also have considerably broader applicability to inverse problems in science and technology.

Keywords
Bayesian inference, Bayesian model class assessment, Bayesian model updating, damage assessment, sparse Bayesian learning, structural health monitoring, structural system identification, uncertainty quantification

1 Key Lab of Structures Dynamic Behavior and Control of the Ministry of Education, School of Civil Engineering, Harbin Institute of Technology, Harbin, China
2 Key Lab of Smart Prevention and Mitigation of Civil Engineering Disasters of the Ministry of Industry and Information Technology, Harbin Institute of Technology, Harbin, China
3 School of Civil Engineering, Nanjing Tech University, Nanjing, China
4 Division of Engineering and Applied Science, California Institute of Technology, Pasadena, CA, USA

Corresponding authors:
Yong Huang, Key Lab of Structures Dynamic Behavior and Control of the Ministry of Education, School of Civil Engineering, Harbin Institute of Technology, Harbin 150090, China. Email: huangyong@hit.edu.cn
James L. Beck, Division of Engineering and Applied Science, California Institute of Technology, Pasadena, CA 91125, USA. Email: jimbeck@caltech.edu

Introduction
In the last few decades, worldwide efforts to implement structural health monitoring systems on civil infrastructure have produced large amounts of data. This abundance of data has motivated the development of an increasing number of data processing (information extracting) techniques to understand the behavior and performance of civil infrastructure under real environmental conditions. These computer-based techniques, developed in system identification research (e.g. Beck, 2010; Ghanem and Shinozuka, 1995; Sirca and Adeli, 2012), are key components in model-based inversions for damage detection and assessment. Here, system identification refers to the inverse problem of finding a mathematical model of the structural system on the basis of measured data. However, there are always some challenges encountered in system identification. One challenge is that sensors are typically installed at only a limited number of locations, so that we are unable to resolve detailed spatial information about the structure. In addition, there are always modeling uncertainties involved because of sensor noise, inadequate theory for certain system behaviors, simplifying approximations for developing structural models, thermally induced variations in structural properties, and so on. Due to these facts, no model is expected to
exactly represent the system input/output (I/O) behavior.

Inverse problems in structural system identification are typically ill-conditioned and ill-posed when treated deterministically because there is insufficient information in the collected data to precisely determine a model within a realistic class of structural models. One commonly employed approach to handle this difficulty involves regularized parameter estimation approaches, such as Tikhonov regularization (Tarantola, 2005), where a regularization term is added to the data-matching term in the objective function to be minimized. However, the relationship of the resulting unique solution to the solution of the original unregularized inverse problem is uncertain.

The presence of substantial modeling uncertainties suggests that when solving inverse problems in structural system identification, the objective should not be limited to the search for a single "optimal" parameter vector. Instead, an attempt should be made to describe the family of all plausible values of the model parameters based on the available data. Bayesian inference provides a general, rational, and robust tool that is capable of handling the difficulty of non-unique solutions (Beck, 1989; Beck and Katafygiotis, 1991, 1998; Katafygiotis and Beck, 1998). It treats the parameter estimation problem using Bayes' theorem to determine the posterior distribution for the parameter vector based on the available data. Probability in the Bayesian perspective represents a degree of plausibility of an uncertain proposition conditional on stated information (Cox, 1946; Jaynes, 1957, 2003; Beck, 2010). The proposition may refer to events, structural model parameters or even the model itself. The posterior distribution therefore quantifies the updated relative plausibility of the different values of the model parameters on the basis of the available incomplete information. Similarly, the posterior probability distribution obtained from Bayes' theorem at the model class level can be used to quantify the plausibility of each model class within a set of candidate model classes for their consistency with both the observed data and the prior information.

The objective of this article is to review Bayesian inference approaches in system identification, especially structural damage assessment, including both vibration-based and wave propagation–based methods. By treating the damage assessment problem within a framework of plausible inference in the presence of incomplete information, the Bayesian framework provides a promising way to locate and assess structural damage, which may occur away from the sensor locations or be hidden from sight. Furthermore, being able to quantify the uncertainties of the structural model parameters is essential for a robust prediction of future safety and reliability of structural systems. This review starts with a section that introduces the underlying theory and perspectives of Bayesian analysis and computation. Section "Applications of Bayesian inference in system identification and damage assessment" gives a literature review for the application of Bayesian methods to real problems of system identification and damage identification for civil infrastructures. This section presents an end-to-end Bayesian framework that starts with building Bayesian models and ends with characterizing the final posterior distribution of the model parameters. Section "Sparse Bayesian learning and applications in structural damage assessment" introduces a recently developed hierarchical sparse Bayesian learning (SBL) methodology to perform sparse stiffness change inference based on the vibration data and to perform flaw (or defect) detection using wave propagation data. In the final section, some conclusions are drawn and suggestions for future research are provided.

Bayesian inference

Bayesian and Frequentist probability
In the area of statistical analysis, there are two broad categories of probability interpretations, namely "Frequentist" and "Bayesian" probabilities. Probability in the Frequentist definition is interpreted as the relative frequency of occurrence of an "inherently random" event in the "long run," and probability distributions are considered as inherent properties of "random" phenomena (Mises, 1981 [1939]). However, this definition is not always operational because it requires well-defined "random" experiments that can be conceived as repeatable; for example, the probability of a model is not meaningful. It is also not practical for establishing distributions of multi-dimensional continuous variables because of the huge amount of effort that would be required for gathering the necessary relative frequencies in trials. Furthermore, the definition involves the concept of "inherent randomness" of events, which is assumed but cannot be proved (Beck, 2010, 2014).

In contrast, Bayesian probability quantifies the states of plausible knowledge about phenomena because of our limited capacity to collect or understand the relevant information, rather than the existence of "inherent randomness" in nature and the probability axioms can be derived as a multi-valued logic for quantitative plausible reasoning under uncertainty (Beck, 2010; Cox, 1946, 1961; Jaynes, 2003). These probability logic axioms provide a rigorous foundation for applying Bayesian inference. They incorporate not only parametric uncertainty (uncertainty regarding which model in a proposed set should be used to represent
the system behavior), but also non-parametric uncertainty because of the existence of model prediction errors resulting from the approximate nature of any model. Under this interpretation, the probability of a model is a measure of its plausibility relative to other models within a set and one's inferences regarding the relative plausibility of each model are updated through Bayes' theorem as data evidence accumulates. Such a concept makes Bayesian inference more suitable for inverse problems in structural system identification than the Frequentist approach to inference.

Bayesian model updating
A key concept in Bayesian model updating for systems is a stochastic system model class $\mathcal{M}$, which consists of a set of probabilistic predictive I/O models for the structural system together with a prior distribution over this set that quantifies the initial relative plausibility of each predictive model. The data $\mathcal{D}$ can be used to update the relative plausibility of each predictive model in $\mathcal{M}$ by computing the posterior probability density function (PDF), $p(\mathbf{w}\mid\mathcal{D},\mathcal{M})$, for the uncertain model parameters, $\mathbf{w}\in\mathbf{W}\subset\mathbb{R}^{N}$, using Bayes' theorem

$$p(\mathbf{w}\mid\mathcal{D},\mathcal{M}) = \frac{p(\mathcal{D}\mid\mathbf{w},\mathcal{M})\,p(\mathbf{w}\mid\mathcal{M})}{p(\mathcal{D}\mid\mathcal{M})} = c^{-1}\,p(\mathcal{D}\mid\mathbf{w},\mathcal{M})\,p(\mathbf{w}\mid\mathcal{M}) \tag{1}$$

where $c = p(\mathcal{D}\mid\mathcal{M})$ is the normalizing constant, which is called the evidence or marginal likelihood for the model class $\mathcal{M}$ given by data $\mathcal{D}$; $p(\mathcal{D}\mid\mathbf{w},\mathcal{M})$, as a function of $\mathbf{w}$, is the likelihood function which expresses the probability of obtaining data $\mathcal{D}$ based on the predictive PDF for the response given by model parameters $\mathbf{w}$ within $\mathcal{M}$; and $p(\mathbf{w}\mid\mathcal{M})$ is the prior PDF, which quantifies the initial plausibility of each model defined by the value of the model parameters $\mathbf{w}$. For ill-conditioned inverse problems, the prior PDF can be selected to provide "soft" constraints on the model updating and thereby provide regularization (Bishop, 2006). One important feature of the posterior PDF $p(\mathbf{w}\mid\mathcal{D},\mathcal{M})$ is the maximum a posteriori (MAP) value of the model parameters $\mathbf{w}$, that is, $\hat{\mathbf{w}} = \arg\max_{\mathbf{w}} p(\mathbf{w}\mid\mathcal{D},\mathcal{M})$, which is the most probable value of $\mathbf{w}$ conditional on the data $\mathcal{D}$. Another useful feature is to define the more plausible values of $\mathbf{w}$ by a "confidence interval," which is an interval of parameter values centered on the MAP value that corresponds to a specified posterior probability of, say, 0.90 or 0.95.

Bayesian model class assessment and Bayesian Ockham Razor
In system identification, we are often faced with the problem of choosing the most plausible model class from a set of competing candidates to represent the behavior of the system of interest based on the measured data $\mathcal{D}$, that is, model class assessment (or selection, or comparison) (Beck and Yuen, 2004). Given a discrete set of chosen probabilistic model classes, $\mathbf{M} = \{\mathcal{M}_m : m = 1, 2, \ldots, M\}$, for a system, the posterior probability $P(\mathcal{M}_m\mid\mathcal{D},\mathbf{M})$ is computed from Bayes' theorem at the model class level (our convention is to use $P(\cdot)$ for probabilities and $p(\cdot)$ for PDFs)

$$P(\mathcal{M}_m\mid\mathcal{D},\mathbf{M}) = \frac{p(\mathcal{D}\mid\mathcal{M}_m)\,P(\mathcal{M}_m\mid\mathbf{M})}{p(\mathcal{D}\mid\mathbf{M})} \tag{2}$$

In the above, $p(\mathcal{D}\mid\mathcal{M}_m)$ is the evidence for model class $\mathcal{M}_m$ provided by the data $\mathcal{D}$ (additional conditioning on $\mathbf{M}$ is irrelevant), which is given by the Total Probability Theorem

$$p(\mathcal{D}\mid\mathcal{M}_m) = \int p(\mathcal{D}\mid\mathbf{w},\mathcal{M}_m)\,p(\mathbf{w}\mid\mathcal{M}_m)\,d\mathbf{w} \tag{3}$$

Usually, the model classes are considered equally plausible a priori, that is, $P(\mathcal{M}_m\mid\mathbf{M}) = 1/M$, so the computation of the multi-dimensional integral in equation (3) for the evidence function is vital in Bayesian model class assessment. It has been shown that the log evidence can be expressed as the difference between two terms (Beck, 2010; Muto and Beck, 2008)

$$\log\left[p(\mathcal{D}\mid\mathcal{M}_m)\right] = \int \log\left[p(\mathcal{D}\mid\mathbf{w},\mathcal{M}_m)\right] p(\mathbf{w}\mid\mathcal{D},\mathcal{M}_m)\,d\mathbf{w} - \int \log\left[\frac{p(\mathbf{w}\mid\mathcal{D},\mathcal{M}_m)}{p(\mathbf{w}\mid\mathcal{M}_m)}\right] p(\mathbf{w}\mid\mathcal{D},\mathcal{M}_m)\,d\mathbf{w} \tag{4}$$

The first term is the posterior mean of the log likelihood function, which is a measure of the average goodness-of-fit of the model class $\mathcal{M}_m$, and the latter term is the relative entropy of the posterior $p(\mathbf{w}\mid\mathcal{D},\mathcal{M}_m)$ relative to the prior $p(\mathbf{w}\mid\mathcal{M}_m)$, which is a measure of the amount of information gain about $\mathbf{w}$ from the data $\mathcal{D}$ (information-theoretic model complexity). The merit of equation (4) is that it shows rigorously, without introducing any ad-hoc concepts, that the log evidence for model class $\mathcal{M}_m$ explicitly builds in a trade-off between the data-fit of the model class and its information-theoretic complexity. This means that the average log goodness-of-fit is penalized by the information gained about the model parameters in the sense of Shannon (Cover and Thomas, 2006). This trade-off between data-fit and model complexity is known as the Bayesian Ockham Razor (Beck, 2010; Gull, 1988; Jefferys and Berger, 1992; Mackay, 1992). This is important in system identification applications, since overly complex models often lead to overfitting of the data and the subsequent response predictions may be extremely sensitive to the modeling error.
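As a hedged illustration of equations (1) to (4), the sketch below uses a deliberately simple model class that is not taken from the reviewed literature: scalar data $y_i = w + e_i$ with known Gaussian noise and a zero-mean Gaussian prior on $w$, for which the posterior, the evidence, and both terms of equation (4) are available in closed form. The function names `evidence_terms` and `model_class_posteriors`, the data values, and the candidate prior widths are all assumptions made for illustration only.

```python
import math


def evidence_terms(y, noise_std, prior_std):
    """Closed-form terms of equation (4) for y_i = w + e_i, w ~ N(0, prior_std^2).

    Returns (log_evidence, mean_log_likelihood, information_gain).
    """
    n = len(y)
    s2, t2 = noise_std ** 2, prior_std ** 2
    sum_y = sum(y)
    sum_y2 = sum(v * v for v in y)

    # Conjugate Gaussian posterior p(w | D, M) = N(mu_n, v_n), equation (1)
    v_n = 1.0 / (1.0 / t2 + n / s2)
    mu_n = v_n * sum_y / s2

    # Evidence, equation (3): marginally y ~ N(0, s2*I + t2*1 1^T)
    log_det = (n - 1) * math.log(s2) + math.log(s2 + n * t2)
    quad = sum_y2 / s2 - t2 * sum_y ** 2 / (s2 * (s2 + n * t2))
    log_evidence = -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)

    # First term of equation (4): posterior mean of the log-likelihood
    expected_sq = sum((v - mu_n) ** 2 for v in y) + n * v_n
    mean_log_lik = -0.5 * n * math.log(2.0 * math.pi * s2) - 0.5 * expected_sq / s2

    # Second term of equation (4): relative entropy of posterior w.r.t. prior
    info_gain = 0.5 * (math.log(t2 / v_n) + (v_n + mu_n ** 2) / t2 - 1.0)
    return log_evidence, mean_log_lik, info_gain


def model_class_posteriors(log_evidences):
    """Equation (2) with equal prior probabilities P(M_m | M) = 1/M."""
    top = max(log_evidences)  # subtract the maximum for numerical stability
    weights = [math.exp(le - top) for le in log_evidences]
    total = sum(weights)
    return [wt / total for wt in weights]


if __name__ == "__main__":
    y = [0.8, 1.1, 0.9, 1.3, 1.0]      # hypothetical measurements
    prior_widths = [0.1, 1.0, 100.0]   # three hypothetical model classes
    log_evs = []
    for width in prior_widths:
        log_ev, fit, gain = evidence_terms(y, noise_std=0.2, prior_std=width)
        log_evs.append(log_ev)
        print(f"prior_std={width}: log evidence={log_ev:.3f}, "
              f"fit={fit:.3f}, information gain={gain:.3f}")
    print("posterior model class probabilities:", model_class_posteriors(log_evs))
```

In this toy setting the candidate prior widths play the role of competing model classes. A prior that is too tight lowers the evidence through poor data-fit, while a very diffuse prior lowers it through a larger information-gain penalty, which is the trade-off that equation (4) makes explicit.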
Response predictions from overly complex models may also be extremely sensitive to details of the specific data (measurement noise, environmental effects, etc.). An optimal model class should have good data-fitting capability but small prediction differences due to perturbation of the model parameters.

Once Bayesian model class assessment is implemented, the most plausible model class based on the available data $\mathcal{D}$ is accessible. In cases where there are multiple plausible model classes, Bayesian model averaging may be used where the expected value of a quantity of interest $h$ conditioned on the data $\mathcal{D}$ and all the chosen model classes $\mathbf{M}$ can be estimated by

$$E(h\mid\mathcal{D},\mathbf{M}) = \sum_{m=1}^{M} E(h\mid\mathcal{D},\mathcal{M}_m)\,P(\mathcal{M}_m\mid\mathcal{D}) \tag{5}$$

Hierarchical Bayesian model and empirical Bayes method
The Empirical Bayes Method (Bishop, 2005) is an inference procedure in which the prior distribution of the model parameters $\mathbf{w}$ is selected using the data $\mathcal{D}$. This method can be viewed as an approximation of a full Bayesian framework involving a hierarchical Bayesian model, which is a Bayesian model expressed in multiple levels (by placing a hyper-prior on the prior at each level). Figure 1 is a graphical representation of a hierarchical model for one level, where $\boldsymbol{\gamma}$ is called the hyper-parameter vector because it is the parameter of the prior distribution, $p(\mathbf{w}\mid\boldsymbol{\gamma})$, for $\mathbf{w}$. The posterior PDF of the model parameter vector, $\mathbf{w}$, is inferred using a full Bayesian approach

$$p(\mathbf{w}\mid\mathcal{D}) = \int p(\mathbf{w}\mid\boldsymbol{\gamma},\mathcal{D})\,p(\boldsymbol{\gamma}\mid\mathcal{D})\,d\boldsymbol{\gamma} = \int \frac{p(\mathcal{D}\mid\mathbf{w})\,p(\mathbf{w}\mid\boldsymbol{\gamma})}{p(\mathcal{D}\mid\boldsymbol{\gamma})}\,p(\boldsymbol{\gamma}\mid\mathcal{D})\,d\boldsymbol{\gamma} \tag{6}$$

Figure 1. Graphical hierarchical model representation, where each arrow denotes the conditional dependencies used in the joint probability model, $p(\mathcal{D},\mathbf{w},\boldsymbol{\gamma}) = p(\mathcal{D}\mid\mathbf{w})\,p(\mathbf{w}\mid\boldsymbol{\gamma})\,p(\boldsymbol{\gamma})$.

However, the higher level involves the evaluation of a multi-dimensional integral over the space of hyper-parameters $\boldsymbol{\gamma}$ that is analytically intractable. One approach is to use the Laplace asymptotic method to approximate the integral, assuming that the posterior $p(\boldsymbol{\gamma}\mid\mathcal{D})$ has a pronounced peak at its MAP value $\tilde{\boldsymbol{\gamma}}$ (Beck and Katafygiotis, 1998)

$$p(\mathbf{w}\mid\mathcal{D}) \approx p(\mathbf{w}\mid\tilde{\boldsymbol{\gamma}},\mathcal{D}) = \frac{p(\mathcal{D}\mid\mathbf{w})\,p(\mathbf{w}\mid\tilde{\boldsymbol{\gamma}})}{p(\mathcal{D}\mid\tilde{\boldsymbol{\gamma}})} \tag{7}$$

where the MAP estimate $\tilde{\boldsymbol{\gamma}}$ is learned from the data $\mathcal{D}$

$$\tilde{\boldsymbol{\gamma}} = \arg\max_{\boldsymbol{\gamma}} p(\boldsymbol{\gamma}\mid\mathcal{D}) = \arg\max_{\boldsymbol{\gamma}} p(\mathcal{D}\mid\boldsymbol{\gamma})\,p(\boldsymbol{\gamma}) \tag{8}$$

This procedure is sometimes called Empirical Bayes using Type-II Maximum Likelihood Approximation (Mackay, 1992; Tipping, 2004) when $p(\boldsymbol{\gamma})$ is chosen as constant. It is seen that the final prior PDF $p(\mathbf{w}\mid\tilde{\boldsymbol{\gamma}})$ is not directly specified but instead is learned from the data from a specified class of priors $p(\mathbf{w}\mid\boldsymbol{\gamma})$. Note that the MAP estimation of the hyper-parameter vector $\boldsymbol{\gamma}$ involves maximization of the evidence function $p(\mathcal{D}\mid\boldsymbol{\gamma})$ when $p(\boldsymbol{\gamma})$ is constant so that all values of $\boldsymbol{\gamma}$ are considered to be equally plausible a priori, where

$$p(\mathcal{D}\mid\boldsymbol{\gamma}) = \int p(\mathcal{D}\mid\mathbf{w})\,p(\mathbf{w}\mid\boldsymbol{\gamma})\,d\mathbf{w} \tag{9}$$

For a Gaussian likelihood function that has a mean linear in $\mathbf{w}$ and for a Gaussian prior, this integral can be evaluated analytically (Tipping, 2001). Because of the Bayesian Ockham Razor (Beck, 2010; Gull, 1988; Jefferys and Berger, 1992; Mackay, 1992), the learning of the hyper-parameters $\boldsymbol{\gamma}$ automatically implements a penalty against data overfitting of the models in Bayesian inference.

Useful Bayesian approximation tools
In Bayesian inference, the normalization of the posterior PDF is usually intractable because it involves a high-dimensional integral and so we must use approximations to proceed. In fact, the main computational issue in Bayesian analysis is the evaluation of multi-dimensional integrals. The normalizing constant $p(\mathcal{D}\mid\mathcal{M}) = \int p(\mathcal{D}\mid\mathbf{w},\mathcal{M})\,p(\mathbf{w}\mid\mathcal{M})\,d\mathbf{w}$ in equation (1) is analytically intractable except in special cases, where conjugate priors are used (Bishop, 2006).

Laplace's method of asymptotic approximation. Based on the topology of the likelihood function, $p(\mathcal{D}\mid\mathbf{w},\mathcal{M})$, in the parameter space, three categories for a model class have been defined (Katafygiotis and Beck, 1998):
globally identifiable, locally identifiable, and unidentifiable based on the sensor data $\mathcal{D}$, corresponding respectively to unique, multiple but isolated, and a continuum of maximum likelihood estimates (MLEs) ($\{\hat{\mathbf{w}} = \arg\max p(\mathcal{D}\mid\mathbf{w},\mathcal{M})\}$). Full Bayesian updating can treat all these cases (Yuen et al., 2004) but sometimes approximate inference methods are applied for special cases because they require less computational effort.

If the model class is globally identifiable based on $\mathcal{D}$, Laplace's method can be used to approximate the integral in the evidence function $p(\mathcal{D}\mid\mathcal{M})$ as (e.g. Beck and Katafygiotis, 1991, 1998; Yuen, 2010)

$$p(\mathcal{D}\mid\mathcal{M}) \approx p(\mathcal{D}\mid\tilde{\mathbf{w}},\mathcal{M})\,p(\tilde{\mathbf{w}}\mid\mathcal{M})\,(2\pi)^{N/2}\,\left|\mathbf{H}(\tilde{\mathbf{w}})\right|^{-1/2} \tag{10}$$

where $\mathbf{H}(\tilde{\mathbf{w}})$ is the Hessian matrix of the function $-\ln\left[p(\mathcal{D}\mid\mathbf{w},\mathcal{M})\,p(\mathbf{w}\mid\mathcal{M})\right]$ calculated at the MAP estimated values $\tilde{\mathbf{w}}$. Correspondingly, the posterior PDF $p(\mathbf{w}\mid\mathcal{D},\mathcal{M})$ in equation (1) can be approximated as Gaussian, where the mean is the MAP estimated value $\tilde{\mathbf{w}}$ and the covariance matrix is equal to the inverse of the Hessian matrix $\mathbf{H}(\tilde{\mathbf{w}})$ calculated at $\tilde{\mathbf{w}}$. Laplace asymptotic approximations for the posterior PDF are also available for the locally identifiable case (Beck and Katafygiotis, 1998; Yang and Beck, 1998) and the unidentifiable case (Katafygiotis and Lam, 2002). Because these approximations need all MLEs to be found, they are only feasible for low-dimensional parameter spaces.

Markov chain Monte Carlo method. For the general case, Markov chain Monte Carlo (MCMC) samplers can be used. These algorithms generate samples that are consistent with any probability distribution (e.g. the posterior distribution $p(\mathbf{w}\mid\mathcal{D},\mathcal{M})$) by constructing a Markov chain that has the desired distribution as its equilibrium distribution. In recent years, this class of algorithms has received considerable attention for Bayesian model updating because it can provide a full characterization of the posterior uncertainty, no matter whether or not the data available is sufficient to constrain the updated parameters to give a globally identifiable model class (Robert and Casella, 2004).

Several MCMC methods have been proposed with the goal of improving the computational efficiency of posterior sampling in Bayesian model updating of systems (e.g. Beck and Au, 2002; Beck and Zuev, 2013; Catanach and Beck, 2017; Cheung and Beck, 2009; Ching and Chen, 2007; Straub and Papaioannou, 2015). Most of these are based on the Metropolis–Hastings MCMC algorithm (Hastings, 1970). The advantage of the Metropolis–Hastings algorithm is that it can draw samples from any PDF $p(\mathbf{w}) = f(\mathbf{w})/K$, where $K$ is the normalization constant that cannot be readily evaluated, provided that the value of the function $f(\mathbf{w})$ (e.g. $p(\mathcal{D}\mid\mathbf{w},\mathcal{M})\,p(\mathbf{w}\mid\mathcal{M})$ in equation (1)) can be computed. Transitional MCMC (Ching and Chen, 2007) is one of the most widely used methods in Bayesian system identification. This method was inspired by an adaptive Metropolis–Hastings method (Beck and Au, 2002) and is applicable for Bayesian inversion problems with higher dimensions. It also enables model class assessment by providing an estimate of the multi-dimensional integral in equation (3) for the evidence as a by-product. Transitional MCMC is fundamentally a sequential Monte Carlo method where samples are taken from a series of intermediate PDFs $p_j$ in an adaptive manner, where

$$p_j(\mathbf{w}\mid\mathcal{D},\mathcal{M}) \propto p(\mathbf{w}\mid\mathcal{M})\,p(\mathcal{D}\mid\mathbf{w})^{s_j}, \qquad j = 0, \ldots, J;\quad 0 = s_0 < s_1 < \cdots < s_J = 1 \tag{11}$$

where $j$ is the stage number and $s_j$ is the corresponding tempering or annealing parameter for the $j$th stage. This parameter controls the speed of the gradual transition from the prior $p(\mathbf{w}\mid\mathcal{M})$ (when $j = 0$ and $s_0 = 0$) to the posterior $p(\mathbf{w}\mid\mathcal{D},\mathcal{M})$ (when $j = J$ and $s_J = 1$), and it is automatically computed in the process to form the intermediate PDFs so they are not too different.

Other recent popular techniques for Bayesian approximations are Approximate Bayesian Computation (ABC) methods (Chiachio et al., 2014; Marin et al., 2012; Vakilzadeh et al., 2017) and Variational Bayesian methods (Bishop, 2006; Fujimoto et al., 2011; Li and Der Kiureghian, 2017). ABC methods are applicable even when an analytical formula for the likelihood function ($p(\mathcal{D}\mid\mathbf{w},\mathcal{M})$ in equation (1)) is elusive or it is computationally costly to evaluate it in Bayesian inference. The key idea of this class of methods is to sample from a posterior distribution conditional on model predicted outputs (rather than on the observed data vector $\mathcal{D}$) that are acceptably close to the observed data $\mathcal{D}$ in the output space under some metric. The key idea for variational Bayesian methods is to find a surrogate distribution close to the true posterior PDF by approximately minimizing the Kullback–Leibler divergence between these two distributions over a specified class of distributions, such as a Gaussian family.

Applications of Bayesian inference in system identification and damage assessment
Bayesian inference for accurately detecting, locating, and assessing damage from severe loading events or progressive structural deterioration has been studied
for nearly two decades. Numerous techniques have been developed using the probability logic-based unifying Bayesian system identification framework presented in Beck (1989), Beck and Katafygiotis (1991, 1998), and Beck (2010). Typically, there are two categories of methods for structural damage detection: one is vibration based and the other is wave propagation based.

Vibration-based damage assessment using Bayesian inference
For vibration-based damage assessment, many methods utilize the dependence of the identified structural modal parameters, such as natural frequencies and mode shapes, on physical properties of structures, that is, stiffness, mass, and damping. On the other hand, other methods directly utilize the measured time-domain vibration response to infer the physical properties of structures. For small-amplitude vibrations, such as ambient vibrations of structures, it is reasonable to choose linear structural models with classical damping and to parameterize the uncertain stiffness matrix, $\mathbf{K}\in\mathbb{R}^{N_d\times N_d}$, using a sub-structuring approach

$$\mathbf{K}(\boldsymbol{\theta}) = \mathbf{K}_0 + \sum_{j=1}^{N_K} \eta_j \mathbf{K}_j \tag{12}$$

where $\mathbf{K}_j\in\mathbb{R}^{N_d\times N_d}$, $j = 1, \ldots, N_K$, is the nominal contribution of the $j$th substructure to the overall stiffness matrix, $\mathbf{K}$. The corresponding stiffness scaling parameter $\eta_j$, $j = 1, \ldots, N_K$, is a factor that allows modification of the $j$th substructure stiffness to make it more consistent with the real structure behavior. Note that the mass matrix $\mathbf{M}$ can also be parameterized in the same affine manner with mass scaling parameters $\rho_j$, $j = 1, \ldots, N_M$. In this case, the structural model parameter vector $\boldsymbol{\theta}$ to be updated should include not only the stiffness scaling parameters $\eta_j$, $j = 1, \ldots, N_K$, but also the mass scaling parameters $\rho_j$, $j = 1, \ldots, N_M$.

Modal-based Bayesian approaches. Bayesian inference for structural damage assessment first appeared in Vanik (1997), Vanik et al. (2000), and Sohn and Law (1997) using modal data information. In the framework presented in Vanik (1997) and Vanik et al. (2000), the likelihood function for the structural model parameters $\boldsymbol{\theta}$ is written as the product of the PDFs for the modal frequencies $\{\hat{\omega}_r^2\}_{r=1}^{N_m}$ and mode shape components $\{\hat{\boldsymbol{\psi}}_r\}_{r=1}^{N_m}$

$$p(\mathcal{D}\mid\boldsymbol{\theta}) = \prod_{r=1}^{N_m} p(\hat{\omega}_r^2\mid\boldsymbol{\theta})\,p(\hat{\boldsymbol{\psi}}_r\mid\boldsymbol{\theta}) \tag{13}$$

where $N_m$ is the number of modes observed and the modal parameters are modeled as independently distributed from mode to mode and from modal frequency to mode shape. The PDFs for the $r$th modal frequencies $\hat{\omega}_r^2$ and mode shape components $\hat{\boldsymbol{\psi}}_r$ are obtained from the following two model equations, respectively

$$\hat{\omega}_r^2 = \omega_r^2(\boldsymbol{\theta}) + \varepsilon_{\hat{\omega}_r^2} \tag{14a}$$

$$\hat{\boldsymbol{\psi}}_r = a_r\,\boldsymbol{\Gamma}\,\boldsymbol{\psi}_r(\boldsymbol{\theta}) + \boldsymbol{\varepsilon}_{\hat{\boldsymbol{\psi}}_r} \tag{14b}$$

where $a_r$ is a scaling factor, and $\boldsymbol{\Gamma}\in\mathbb{R}^{N_o\times N_d}$ with "1s" and "0s" picks the observed degrees of freedom (DOFs) in the "measured" mode shape $\hat{\boldsymbol{\psi}}_r\in\mathbb{R}^{N_o}$ from the full model mode shapes, $\boldsymbol{\psi}_r(\boldsymbol{\theta})\in\mathbb{R}^{N_d}$. Using the Principle of Maximum Information Entropy (Jaynes, 1983, 2003), the combined prediction errors and measurement errors for the model modal parameters $\boldsymbol{\psi}_r(\boldsymbol{\theta})$ and $\omega_r^2(\boldsymbol{\theta})$ are modeled as independent zero-mean Gaussian variables with unknown variances, which gives the largest uncertainty for the set $\{\varepsilon_{\hat{\omega}_r^2}\}_{r=1}^{N_m}$ and $\{\boldsymbol{\varepsilon}_{\hat{\boldsymbol{\psi}}_r}\}_{r=1}^{N_m}$ subject to the first two moment constraints. This produces the Gaussian likelihood functions $p(\hat{\omega}_r^2\mid\boldsymbol{\theta})$ and $p(\hat{\boldsymbol{\psi}}_r\mid\boldsymbol{\theta})$. After defining the prior PDF of $\boldsymbol{\theta}$, Bayes' theorem can be applied to infer the posterior PDF of $\boldsymbol{\theta}$.

Bao et al. (2013) employed this formulation for data fusion-based structural damage detection under varying temperature conditions. Here, the temperature change effects on the modal parameters were considered in the construction of the likelihood function for $\boldsymbol{\theta}$ in equation (13). Behmanesh and Moaveni (2015) applied this method for identification of simulated damage on a footbridge where an adaptive Metropolis–Hastings algorithm (Andrieu and Thoms, 2008; Haario et al., 2001) was used to sample the posterior PDF of the structural model parameters. Lam et al. (2014) extended this method to detect railway ballast damage under a concrete sleeper. They employed a model class assessment technique to select the most plausible number of ballast regions given multiple sets of modal data. Behmanesh et al. (2017) also incorporated a model class assessment method, where multiple model classes were defined as different subsets of the contributing modes.

One major difficulty for the approaches above is that mode matching is required, that is, it is necessary to match model modes (e.g. $\boldsymbol{\psi}_r(\boldsymbol{\theta})$) and experimental modes (e.g. $\hat{\boldsymbol{\psi}}_r$), one by one. This is a nontrivial task because usually only partial mode shapes are measured. Moreover, the order of modes may switch due to the fact that the damage-induced local stiffness loss may affect some modes more than others; this makes mode matching even more challenging.
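The modal likelihood of equations (12) to (14) can be sketched for a small example. The code below is a minimal illustration under stated assumptions, not the implementation of any cited study: it uses a hypothetical three-story shear building with unit floor masses, takes $\mathbf{K}_0 = \mathbf{0}$, treats the prediction-error standard deviations as known (the text above treats the variances as unknown), picks each scaling factor $a_r$ by least squares, and assumes the modes are already matched by their ascending order. All function names and numerical values are invented for illustration.

```python
import numpy as np


def story_stiffness_matrices(n_dof, k0=1.0):
    """Nominal substructure matrices K_j of a shear building, one per story."""
    mats = []
    for j in range(n_dof):
        Kj = np.zeros((n_dof, n_dof))
        Kj[j, j] = k0                      # story j acts on floor j ...
        if j > 0:                          # ... and on the floor below it
            Kj[j - 1, j - 1] = k0
            Kj[j - 1, j] = Kj[j, j - 1] = -k0
        mats.append(Kj)
    return mats


def model_modes(eta, K_subs, mass):
    """Solve K(theta) psi = omega^2 M psi for a diagonal mass matrix."""
    K = sum(e * Kj for e, Kj in zip(eta, K_subs))   # equation (12), assuming K_0 = 0
    inv_sqrt_m = 1.0 / np.sqrt(mass)
    A = K * np.outer(inv_sqrt_m, inv_sqrt_m)        # M^{-1/2} K M^{-1/2}
    omega2, vecs = np.linalg.eigh(A)                # eigenvalues in ascending order
    return omega2, vecs * inv_sqrt_m[:, None]       # mode shapes in physical DOFs


def modal_log_likelihood(eta, omega2_hat, psi_hat, obs_dofs, K_subs, mass,
                         sigma_freq, sigma_shape):
    """Log of equation (13) with Gaussian errors as in equations (14a), (14b)."""
    omega2, psi = model_modes(eta, K_subs, mass)
    logp = 0.0
    for r in range(len(omega2_hat)):       # assumption: modes matched by order
        res_f = omega2_hat[r] - omega2[r]
        logp += (-0.5 * np.log(2.0 * np.pi * sigma_freq ** 2)
                 - 0.5 * (res_f / sigma_freq) ** 2)
        g = psi[obs_dofs, r]                   # Gamma psi_r(theta): observed DOFs
        a_r = (psi_hat[:, r] @ g) / (g @ g)    # least-squares scaling factor a_r
        res_s = psi_hat[:, r] - a_r * g
        logp += (-0.5 * len(g) * np.log(2.0 * np.pi * sigma_shape ** 2)
                 - 0.5 * (res_s @ res_s) / sigma_shape ** 2)
    return logp


if __name__ == "__main__":
    K_subs = story_stiffness_matrices(3)
    mass = np.ones(3)
    obs_dofs = [0, 2]                      # hypothetical sensors on floors 1 and 3
    # Synthetic noise-free "measurements": 30% stiffness loss in the second story
    omega2_true, psi_true = model_modes([1.0, 0.7, 1.0], K_subs, mass)
    omega2_hat, psi_hat = omega2_true[:2], psi_true[obs_dofs][:, :2]
    grid = np.linspace(0.4, 1.2, 81)
    logp = [modal_log_likelihood([1.0, e, 1.0], omega2_hat, psi_hat, obs_dofs,
                                 K_subs, mass, 0.05, 0.05) for e in grid]
    print("most plausible eta_2 on the grid:", grid[int(np.argmax(logp))])
```

With a flat prior over the grid, the maximizer of this log-likelihood is also the MAP value of the second-story scaling parameter. The order-based pairing inside `modal_log_likelihood` is exactly the mode-matching step that becomes unreliable when damage reorders the modes.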
To deal with the mode-matching difficulty, the concept of system mode shapes was introduced in Beck et al. (2001) as additional variables for Bayesian model updating. The system mode shape parameters $\boldsymbol{\phi}$ represent the actual underlying mode shapes of the linear dynamic structural system at all DOFs corresponding to those of the structural model, but they are distinct from the model mode shapes, $\{\boldsymbol{\psi}_r(\boldsymbol{\theta})\}_{r=1}^{N_m}$, in equation (14). The other benefit of introducing system mode shapes is that they do not require solution of the nonlinear eigenvalue problem of a structural model. Instead of the model modal frequencies, $\omega_r^2(\boldsymbol{\theta})$, the Rayleigh quotient frequencies, $\omega_r^2(\boldsymbol{\theta},\boldsymbol{\phi})$, can be employed, which are computed from the structural model parameters and the system mode shape parameters

$$\omega_r^2(\boldsymbol{\theta},\boldsymbol{\phi}) = \frac{\boldsymbol{\phi}_r^{T}\,\mathbf{K}(\boldsymbol{\theta})\,\boldsymbol{\phi}_r}{\boldsymbol{\phi}_r^{T}\,\mathbf{M}\,\boldsymbol{\phi}_r} \tag{15}$$

Then Bayes' theorem can be used to express the posterior probabilities of the uncertain parameters $\boldsymbol{\theta}$ and $\boldsymbol{\phi}$. Ching and Beck (2004a, 2004b) proposed an Expectation–Maximization algorithm to find the MAP values of these parameters, together with the prediction error variance parameters. They analyzed the Phase II simulated benchmarks (Bernal et al., 2002) and experimental benchmark (Ching and Beck, 2003; Dyke et al., 2003) that were sponsored by the IASC-ASCE Task Group on Structural Health Monitoring. Most of the damage was detected and assessed successfully. Goller et al. (2012) found that it is vital to weigh differently the relative contributions of the likelihoods relating to modal frequencies and mode shape data in equation (13) to provide balanced model updating results. Bayesian model class assessment was employed for selecting the most plausible weight parameter (defined as the ratio between the prediction error variances of mode shape vectors and modal frequencies) based on the modal data.

For damage assessment of complex civil infrastructures, we would like to treat each structural member as a substructure so that we can infer which, if any, members have been damaged. Therefore, high-dimensional model parameter vectors $\boldsymbol{\theta}$ often arise. Ching et al. (2006b) proposed a Gibbs sampler method to efficiently sample the posterior PDF of the high-dimensional vector $\boldsymbol{\theta}$. The effective dimension is kept low by decomposing the uncertain parameters into three groups and iteratively sampling the posterior distribution of one parameter group conditional on the other two groups and the measured data. The three parameter groups are the model parameters $\boldsymbol{\theta}$, system mode shapes $\boldsymbol{\phi}$, and variances for the prediction errors $\{\boldsymbol{\varepsilon}_r\}_{r=1}^{N_m}$. For each mode, the prediction error $\boldsymbol{\varepsilon}_r$ is defined for the prediction of the measured frequency based on the system mode shape $\boldsymbol{\phi}_r$ and model parameters $\boldsymbol{\theta}$ as follows

$$\boldsymbol{\varepsilon}_r = \left(\mathbf{K}(\boldsymbol{\theta}) - \hat{\omega}_r^2\,\mathbf{M}\right)\boldsymbol{\phi}_r, \qquad r = 1, \ldots, N_m \tag{16}$$

where $\boldsymbol{\varepsilon}_r$ is modeled as Gaussian based on the maximum entropy distribution (Jaynes, 1983, 2003) subject to the first two moments as constraints. By employing this prediction error equation, Yan and Katafygiotis (2015) developed a Bayesian damage assessment method for using modal information from multiple sensor setups. The problem is formulated as minimizing an objective function with respect to the three parameter groups above, which incorporates the information of local mode shape components corresponding to different sensor setups automatically.

Yuen et al. (2006) introduced system frequencies $\{\omega_r^2\}_{r=1}^{N_m}$ as uncertain parameters but they used the eigen-equations of the structural dynamics model only in the prior to provide a soft constraint

$$p(\boldsymbol{\omega}^2,\boldsymbol{\phi}\mid\boldsymbol{\theta},\beta) = c_1 \exp\left\{-\frac{\beta}{2}\sum_{r=1}^{N_m}\left\|\left(\mathbf{K}(\boldsymbol{\theta}) - \omega_r^2\,\mathbf{M}\right)\boldsymbol{\phi}_r\right\|^2\right\} \tag{17}$$

where $c_1$ is a constant, which is independent of the system modal parameters, $\boldsymbol{\omega}^2$ and $\boldsymbol{\phi}$. An iterative scheme involving a series of coupled linear optimization problems was employed to find the MAP values of the structural model parameters $\boldsymbol{\theta}$ and system modal parameters, $\boldsymbol{\omega}^2$ and $\boldsymbol{\phi}$. By incorporating a finite element (FE) model reduction technique in this formulation, Yin et al. (2017) developed a methodology for detection of bolted connection damage in steel frame structures. The novel feature is that only partial components of system mode shapes $\boldsymbol{\phi}$ are inferred. This is practical in cases where the dimension of full-system mode shapes $\boldsymbol{\phi}$ is extremely large but considerably fewer DOFs are measured, resulting in unreliable full-system mode shape inference.

Time-domain Bayesian approaches. The other category of Bayesian damage assessment is the time-domain approach, which is particularly appropriate for situations where structural properties vary with time and a sequential dataset is observed. Typically, a stochastic state-space model is defined for the state time history $\{\mathbf{x}_n\}_{n=1}^{N_t}$ by implying a state transition PDF

$$\forall n\in\mathbb{Z}^{+},\quad p(\mathbf{x}_n\mid\mathbf{x}_{n-1},\mathbf{u}_{n-1},\boldsymbol{\theta}) = \mathcal{N}\left(\mathbf{x}_n\mid \mathbf{f}_n(\mathbf{x}_{n-1},\mathbf{u}_{n-1},\boldsymbol{\theta}),\,\mathbf{Q}_n\right) \tag{18}$$

along with a state-to-output PDF
$$ \forall n \in \mathbb{Z}^+, \quad p(y_n \mid x_n, u_n, \theta) = N\left( y_n \mid g_n(x_n, u_n, \theta), R_n \right) \tag{19} $$

where $u_n$ and $y_n$ denote the (external) input and output vectors, respectively, at time instant $t_n$, and $Q_n$ and $R_n$ are the prior covariance matrices of the uncertain state and output prediction errors, respectively. Ching et al. (2006a) presented a comparison study of two Bayesian filtering algorithms: extended Kalman filter (EKF) and a particle filter for the estimation of the augmented state vector containing both the state (displacement and velocity) vector and structural model parameters $\theta$. Yuen and Kuok (2016) proposed a Bayesian probabilistic algorithm for online estimation of noise parameters of EKF, motivated by the fact that improper assignment of noise covariance matrices $Q_n$ and $R_n$ leads to divergence in the estimates and misleading uncertainty quantification for the system state and model parameters. Using the general hierarchical state-space model in equations (18) and (19), Vakilzadeh et al. (2017) examined the performance of the ABC-SubSim algorithm (Chiachio et al., 2014) for Bayesian updating of the model parameters of dynamical systems, together with the noise parameters. For the case of unknown, or partially unknown, input excitations, Astroza et al. (2017) presented a Bayesian method for nonlinear FE model updating and seismic input identification.

Bayesian pattern recognition methods. Closely related to artificial intelligence and machine learning, pattern recognition is another popular approach in structural damage detection and assessment. Lam et al. (2006) introduced a Bayesian artificial neural network (ANN) design method for pattern recognition-based damage detection, where an “optimal” ANN model class is automatically selected based on the set of ANN training data. In the ANN training process, the calculated features of damage-induced changes in Ritz vectors and corresponding damage scenarios in the structural model are treated as inputs and targets, respectively. The calculated features are defined by

$$ \Delta R(k) = \left[ r_1^T(k), \ldots, r_{N_R}^T(k) \right]^T - \left[ r_1^T(0), \ldots, r_{N_R}^T(0) \right]^T, \quad k = 1, \ldots, N_{dp} \tag{20} $$

where $N_{dp}$ and $N_R$ are the total number of damage patterns considered and the number of extracted Ritz vectors, respectively; the Ritz vectors $r_i$ are computed as the mass normalized product of the flexibility matrix and spatial load vector. In the real identification process, the damage scenario is identified by inputting the measured Ritz vector changes into the trained ANN model. Lam and Ng (2008) extended the Bayesian ANN design method to include the selection of activation (transfer) functions for neurons in the hidden layer. A comparison study showed that ANN performance trained by modal parameters is better than that trained by Ritz vectors. Bayesian neural network models were also examined in Arangio and Beck (2012) and the automatic relevance determination method (MacKay, 1994; Neal, 1996) was applied to evaluate the relative importance of every input in the neural networks and separate relevant variables from those that are redundant. The applicability of these Bayesian neural networks was investigated in Arangio and Bontempi (2015) for the identification of damage of a cable-stayed bridge in China. This study demonstrated that the method is able to detect anomalies in the structural behavior produced by damage. Figueiredo et al. (2014) proposed an MCMC-based Bayesian pattern recognition method for damage detection, where the Bayesian approach is used to cluster structural responses of the bridges into a reduced number of state conditions by inferring the parameters of a finite mixture of Gaussian distributions. The method can be viewed as an improvement over the classical MLE-based expectation–maximization algorithm and it has the potential to overcome some difficulties when dealing with the structural responses containing the effects of the environmental temperature variability.

Wave propagation-based damage detection using Bayesian inference

Wave propagation-based damage detection techniques, such as the Lamb wave method and ultrasonic NDT (non-destructive testing), are widely acknowledged as a most encouraging tool for quantitative identification of damage in civil engineering structures, and much research has been conducted intensively over the last several decades. The use of Bayesian inference in these approaches is also increasing, due to the fact that there are always unavoidable uncertainties in the measurement and modeling processes.

For Lamb wave methods, Ng et al. (2009) introduced a Bayesian framework to detect and characterize laminar damage in beam structures using guided wave signals measured at a single point. The uncertain model parameters $\theta$ to be inferred include damage location, length, depth, and Young’s modulus of the material. Given the measured guided wave data $q_m$, the likelihood function is defined as

$$ p(q_m \mid \theta, \beta) = \frac{1}{(2\pi\beta^{-1})^{N_t N_0 / 2}} \exp\left\{ -\frac{\beta}{2} \sum_{t=1}^{N_t} \left\| q_m(t) - q(t, \theta) \right\|^2 \right\} \tag{21} $$

where $q(t, \theta)$ is the calculated guided wave response using spectral FE method. A two-stage optimization
process consisting of simulated annealing followed by a standard simplex search method was employed for determining the MAP values of the posterior PDF of the model parameters $\theta$ to characterize the multivariate damage and quantify the associated uncertainties. The method is only applicable for a single crack case. To identify multiple cracks, He and Ng (2017) introduced a Bayesian model class assessment technique to determine the most plausible solution for the number of cracks based on the guided wave data information before further crack parameter identification and associated uncertainty quantification. Another measurement information for damage detection is the time-of-flight of scattered Lamb waves. Yan (2013) utilized this information to produce a Bayesian inference method in which an MCMC algorithm developed by Nichols et al. (2010) was employed to characterize the posterior distributions of the unknown damage location and wave velocity parameters.

Regarding ultrasonic NDT, Wang et al. (2015) combined cluster analysis and Bayesian theory for assessing external corrosion location and depth in buried pipeline structures. Chiachio et al. (2017) presented a multilevel Bayesian approach for identifying Young’s moduli, number, and position of damaged layers in composite laminates using through-transmission ultrasonic measurements. Three steps were defined in their Bayesian inverse procedure: (1) inferring the posterior PDF $p(\theta \mid \mathcal{M}_j, q_m)$ of model parameters (defined by Young’s moduli of damaged layers) for a specific damage hypothesis using equation (1); (2) obtaining the plausibility $P(\mathcal{M}_j \mid \mathcal{P}_i, q_m)$ of a particular damage hypothesis (associated with damage positions) among the set of candidate hypotheses using equation (2); (3) assessing the degree of plausibility of the given damage pattern (defined by the number of damaged layers) within a predefined damage pattern set $\mathbf{P} = \{\mathcal{P}_i\}$ by

$$ P(\mathcal{P}_i \mid \mathbf{P}, q_m) \propto p(q_m \mid \mathcal{P}_i, \mathbf{P})\, P(\mathcal{P}_i \mid \mathbf{P}) \tag{22} $$

where

$$ p(q_m \mid \mathcal{P}_i, \mathbf{P}) = \sum_j p(q_m \mid \mathcal{M}_j, \mathcal{P}_i)\, P(\mathcal{M}_j \mid \mathcal{P}_i, \mathbf{P}) \tag{23} $$

The most plausible damage hypothesis and pattern can be selected through the Bayesian model class assessment at different levels.

Sparse Bayesian learning and applications in structural damage assessment

In vibration-based damage assessment, there is a fundamental trade-off between the spatial resolution of the inferred damage locations and the reliability of the probabilistically inferred damaged state. In reality, the information available from the structure’s local network of sensors will generally be insufficient to support a member-level resolution of stiffness loss from damage. Accordingly, larger substructures consisting of assemblages of structural members may be necessary in order to reduce the number of model parameters. In this case, defining a proper threshold to determine whether the damage features shift from their healthy state is very important to alleviate false positive and false negative detections. However, it is very challenging to establish a reliable threshold value in a rigorous manner in order to issue a timely damage alarm (Sohn et al., 2005). A general strategy to alleviate this problem is to incorporate as much prior knowledge as possible in order to constrain the set of solutions; in particular, it is helpful to exploit the prior information that structural stiffness change from damage typically occurs at a limited number of locations in a structure in the absence of its collapse. Recently, by exploiting this prior knowledge about the spatial sparseness of damage, the effectiveness of sparse recovery techniques, for example, $l_1$ norm least square regularization (Candes et al., 2006; Chen et al., 1999; Tropp and Gilbert, 2007) and SBL (sparse Bayesian learning) techniques (Tipping, 2001), has been explored to produce more robust damage assessment even for high-dimensional model parameter spaces (higher-resolution damage localization) (e.g. Hou et al., 2018a, 2018b; Huang and Beck, 2015, 2018a; Huang et al., 2017a, 2017b; Zhou et al., 2015).

In the field of guided wave/ultrasonic NDT signal processing for damage or defect detection, sparse signal recovery algorithms have also attracted increasing attention during the recent decade (Hong et al., 2006; Raghavan and Cesnik, 2007; Wu et al., 2017a, 2017b, 2018; Zhang et al., 2008). The prior information exploited in the application is that the object being inspected contains a limited amount of damage or defects, and so the measured signal should be a linear combination of echoes reflected from these damage or defects.

In this section, the recent progress of SBL-based structural damage detection and assessment is reviewed. The general theory of SBL is first introduced, and thereafter, the recently developed vibration-based SBL methods are discussed, followed by the wave propagation–based SBL methods.

Sparse Bayesian learning

In this section, we only briefly review the theory of SBL and refer to Tipping (2001) and Faul and Tipping (2002) for a more detailed description. It is supposed that the model prediction of the measured output is
$y = f + e + m \in \mathbb{R}^{N_o}$, which involves a deterministic function

$$ f = \sum_{j=1}^{N_p} w_j \Theta_j = \Theta w \tag{24} $$

along with uncertain model prediction error $e$ and measurement noise $m$, where $\Theta = [\Theta_1, \ldots, \Theta_{N_p}]$ is a general $N_o \times N_p$ design matrix with column vectors $\{\Theta_j\}_{j=1}^{N_p}$ that may depend on inputs $\hat{u}$ and $w = [w_1, \ldots, w_{N_p}]^T$ is a corresponding coefficient vector. Based on the principle of maximum information (largest uncertainty) entropy subject to the first two moment constraints (Jaynes, 1983, 2003), the combination of the prediction error $e$ and measurement noise $m$ is modeled as a zero-mean Gaussian vector with covariance matrix, $\beta^{-1} I_{N_o}$. This yields a Gaussian likelihood function based on the data $\hat{y}$:

$$ p(\hat{y} \mid w, \beta) = \left( 2\pi\beta^{-1} \right)^{-N_o/2} \exp\left\{ -\frac{\beta}{2} \left\| \hat{y} - \Theta w \right\|_2^2 \right\} = N\left( \hat{y} \mid \Theta w, \beta^{-1} I_{N_o} \right) \tag{25} $$

The prior distribution for the parameter vector, $w$, is assigned as follows

$$ p(w \mid \alpha) = \prod_{j=1}^{N_p} p(w_j \mid \alpha_j) = \prod_{j=1}^{N_p} N(w_j \mid 0, \alpha_j) = \prod_{j=1}^{N_p} (2\pi\alpha_j)^{-1/2} \exp\left\{ -\frac{1}{2} \alpha_j^{-1} w_j^2 \right\} \tag{26} $$

The key to the model sparseness is the utilization of the $N_p$ independent variance hyper-parameters $\{\alpha_1, \ldots, \alpha_{N_p}\}$ that moderate the strength of the Gaussian prior. Note that an extremely small value of $\alpha_j$ implies that the corresponding term $w_j \Theta_j$ in equation (24) has an insignificant contribution to the modeling of measurements $\hat{y}$ because it essentially produces a Dirac delta-function at zero for the prior of $w_j$, and so for its posterior.

The learning of the coefficient vector $w$ from measured output $\hat{y}$ is characterized by applying Bayes’ Theorem to infer the posterior PDF $p(w \mid \hat{y})$. Based on the Empirical Bayes method (Laplace asymptotic approximation)

$$ p(w \mid \hat{y}) = \int p(w \mid \hat{y}, \alpha, \beta)\, p(\alpha, \beta \mid \hat{y})\, d\alpha\, d\beta \approx p(w \mid \hat{y}, \tilde{\alpha}, \tilde{\beta}) = \frac{p(\hat{y} \mid w, \tilde{\beta})\, p(w \mid \tilde{\alpha})}{p(\hat{y} \mid \tilde{\alpha}, \tilde{\beta})} = N(w \mid \mu, \Sigma) \tag{27} $$

with

$$ \Sigma = \left( \tilde{\beta}\, \Theta^T \Theta + \tilde{A}^{-1} \right)^{-1}, \quad \mu = \tilde{\beta}\, \Sigma\, \Theta^T \hat{y} \tag{28} $$

where $\tilde{A} = \mathrm{diag}(\tilde{\alpha}_1, \ldots, \tilde{\alpha}_{N_p})$ is the prior covariance matrix for $w$. The approximation of equation (27) is based on the assumption that the posterior $p(\alpha, \beta \mid \hat{y})$ is highly peaked at the MAP value

$$ [\tilde{\alpha}, \tilde{\beta}] = \arg\max_{[\alpha, \beta]} p(\alpha, \beta \mid \hat{y}) = \arg\max_{[\alpha, \beta]} \left\{ p(\hat{y} \mid \alpha, \beta)\, p(\alpha)\, p(\beta) \right\} \tag{29} $$

Two optimization algorithms have been proposed in the SBL literature to find the MAP values $\tilde{\alpha}$ and $\tilde{\beta}$. One is Tipping’s original iterative algorithm (Tipping, 2001) and the other is Tipping and Faul’s “Fast Algorithm” (Tipping and Faul, 2003). It is found that the maximization in equation (29) drives many hyper-parameters $\alpha_j$ toward zero during the learning process. Thereby, a sparse model vector, $w$, is produced, that is, many of its components become zero. This is the Bayesian Ockham Razor at work (Beck, 2010; Gull, 1988; Jefferys and Berger, 1992; MacKay, 1992).

Huang et al. (2014) found that the SBL algorithm suffers from a robustness problem: there are local maxima for equation (29) that may trap the hyper-parameter optimization if the number of measurements $N_o$ is considerably smaller than the number of model parameters $N_p$; this leads to non-robust Bayesian updating results. Several robustness enhancement algorithms (Huang et al., 2011, 2014, 2016, 2018b, 2018c, 2018d) have been developed, with the goal of increasing the signal reconstruction accuracy in compressive sensing for highly and approximately sparse signals.

Sparse Bayesian learning methods for vibration-based damage detection and assessment

Hierarchical Bayesian model class. For damage detection purposes, the hierarchical Bayesian model in Figure 2

Figure 2. Hierarchical Bayesian model representation of the structural system identification problem.
was presented in Huang and Beck (2015), where $\omega^2$ and $\phi$ denote the system natural frequencies and mode shapes, respectively, that correspond to the identified natural frequencies and mode shapes $\hat{\omega}^2$ and $\hat{\psi}$. A joint prior PDF is assigned to the system modal parameters $\omega^2$ and $\phi$ and structural stiffness parameters $\theta$ of the structural model. This is accomplished by introducing an equation error precision parameter $\beta$ to explicitly control how closely the system and model modal parameters agree:

$$ p(\omega^2, \phi, \theta \mid \beta) \propto (2\pi/\beta)^{-N_m N_d / 2} \exp\left\{ -\frac{\beta}{2} \sum_{r=1}^{N_m} \left\| \left( K(\theta) - \omega_r^2 M \right) \phi_r \right\|^2 \right\} \tag{30} $$

Although the system modal parameters $\omega^2$ and $\phi$ are a nonlinear function of the structural model parameters $\theta$, the joint prior $p(\omega^2, \phi, \theta \mid \beta)$ can be decomposed into the product of a conditional PDF for any one of the parameter vectors and a marginal PDF for the other two. Therefore, a series of coupled linear-in-the-parameter problems can be set up (Huang et al., 2017a, 2017b).

To promote model sparseness in the stiffness changes, the MAP value $\hat{\theta}_u$ from the calibration state is chosen as the pseudo-data for $\theta$ to define a likelihood function, then motivated by the SBL framework (Tipping, 2001)

$$ p(\hat{\theta}_u \mid \theta, \alpha) = \prod_{s=1}^{N_\theta} N\left( \hat{\theta}_{u,s} \mid \theta_s, \alpha_s \right) \tag{31} $$

where the hyper-parameters $\alpha_s$ are learned from the modal data. If hyper-parameter $\alpha_s \to 0$, then $\theta_s \to \hat{\theta}_{u,s}$, which is interpreted as the $s$th substructure being undamaged. Gaussian likelihood functions, $p(\hat{\omega}^2 \mid \omega^2, \rho)$ and $p(\hat{\psi} \mid \phi, \eta)$, are also defined for the system parameters $\omega^2$ and $\phi$ with corresponding precision parameters $\rho$ and $\eta$, respectively. In addition, we model our prior uncertainty for the equation error precision $\beta$ by an exponential hyper-prior, $p(\beta \mid b_0)$, with rate parameter, $b_0$.

Fast sparse Bayesian learning algorithm. Based on the hierarchical model in Figure 2, Huang et al. (2017a) proposed a fast sparse Bayesian learning algorithm that focuses on an analytical derivation of the posterior PDF of the stiffness parameters $\theta$ and collects all other uncertain parameters in the vector $\delta = [(\omega^2)^T, \rho^T, \phi^T, \eta, \alpha^T, \beta, b_0]^T$. The latter are treated as “nuisance” parameters, which are integrated out using Laplace’s approximation method (Beck and Katafygiotis, 1998). It is assumed that the posterior $p(\delta \mid \hat{\omega}^2, \hat{\psi}, \hat{\theta}_u)$ is highly peaked at the MAP value $\tilde{\delta}$; then the posterior PDF of $\theta$ is approximated by

$$ p(\theta \mid \hat{\omega}^2, \hat{\psi}, \hat{\theta}_u) = \int p(\theta \mid \delta, \hat{\omega}^2, \hat{\psi}, \hat{\theta}_u)\, p(\delta \mid \hat{\omega}^2, \hat{\psi}, \hat{\theta}_u)\, d\delta \approx p(\theta \mid \tilde{\delta}, \hat{\omega}^2, \hat{\psi}, \hat{\theta}_u) \tag{32} $$

where $\tilde{\delta} = \arg\max p(\delta \mid \hat{\omega}^2, \hat{\psi}, \hat{\theta}_u) = \arg\max p(\hat{\omega}^2, \hat{\psi}, \hat{\theta}_u \mid \delta)\, p(\delta)$. Treating the prior $p(\delta)$ as uniform, the maximization of the evidence $p(\hat{\omega}^2, \hat{\psi}, \hat{\theta}_u \mid \delta)$ here is effectively implementing the Bayesian Ockham Razor. This suppresses the occurrence of false and missed alarms for stiffness reductions, as shown in Appendix 1.

Sparse Bayesian learning algorithms using partial Gibbs sampling combined with Laplace’s approximation. To provide a fuller treatment of the posterior uncertainty, it is necessary to avoid the Laplace approximation in the fast SBL algorithm that involves the system modal parameters, $\{\omega^2, \phi\}$, and the equation error precision parameter $\beta$. Huang et al. (2017b) accomplished this using Gibbs sampling (GS) to draw posterior samples from $p(\phi, \omega^2, \theta, \beta \mid \hat{\omega}^2, \hat{\psi}, \hat{\theta}_u)$ by decomposing the whole model parameter vector into four groups and repeatedly sampling from one parameter group conditional on the other three groups and the available data $\{\hat{\omega}^2, \hat{\psi}, \hat{\theta}_u\}$. The effective dimension is then four, rather than the considerably higher total number of model parameters. Laplace’s approximation is used only for the integrals that marginalize the hyper-parameters from the conditional posterior PDF. In this GS method, the conditional posterior PDFs

$$ p(\phi \mid \hat{\omega}^2, \hat{\psi}, \hat{\theta}_u, \omega^2, \theta, \beta) = p(\phi \mid \hat{\psi}, \omega^2, \theta, \beta) \tag{33a} $$

$$ p(\omega^2 \mid \hat{\omega}^2, \hat{\psi}, \hat{\theta}_u, \phi, \theta, \beta) = p(\omega^2 \mid \hat{\omega}^2, \phi, \theta, \beta) \tag{33b} $$

$$ p(\theta \mid \hat{\omega}^2, \hat{\psi}, \hat{\theta}_u, \phi, \omega^2, \beta) = p(\theta \mid \hat{\theta}_u, \phi, \omega^2, \beta) \tag{33c} $$

$$ p(\beta \mid \hat{\omega}^2, \hat{\psi}, \hat{\theta}_u, \phi, \omega^2, \theta) \tag{33d} $$

are successively sampled to generate samples from the full posterior PDF $p(\phi, \omega^2, \theta, \beta \mid \hat{\omega}^2, \hat{\psi}, \hat{\theta}_u)$ when the number of samples $n$ is sufficiently large (beyond burn-in) since the Markov chain created by the GS is ergodic.

For the updating of stiffness scaling parameters $\theta$ and system modal parameters, $\omega^2$ and $\phi$, the corresponding model classes are investigated, as seen from the hierarchical Bayesian model in Figure 2. The application of Bayes’ Theorem at the model class level automatically penalizes models of $\theta$ ($\omega^2$ or $\phi$), which
“underfit” or “overfit” the associated data $\hat{\theta}_u$ ($\hat{\omega}^2$ or $\hat{\psi}$). Consequently, more reliable updating results for the three parameter vectors are obtained. This is the Bayesian Ockham Razor (Beck, 2010). Note that it is also tractable to marginalize out the equation error precision parameter $\beta$ to remove it from the posterior distributions as a “nuisance” parameter. This leads to Student’s t-distributions for the posteriors $p(\phi \mid \hat{\psi}, \omega^2, \theta)$, $p(\omega^2 \mid \hat{\omega}^2, \phi, \theta)$, and $p(\theta \mid \hat{\theta}_u, \phi, \omega^2)$ (Huang et al., 2017b). Student’s t-PDFs have heavy tails and so the associated algorithm is more robust against noise and outliers.

Full Gibbs sampling procedure for sparse Bayesian learning. In order to characterize the full posterior uncertainty, the GS is implemented to draw posterior samples from the joint posterior PDF $p(\omega^2, \phi, \theta, \beta, \rho, \eta, \alpha, b_0 \mid \hat{\omega}^2, \hat{\psi}, \hat{\theta}_u)$ in Huang and Beck (2018). To alleviate any inefficiency where the Markov Chain samples may get trapped in local maxima of the posterior PDF for the hyper-parameter $\alpha$ because of a very large number of uncertain parameters to be inferred, a sequential Bayesian inference procedure was introduced based on the hierarchical Bayesian model in Figure 2. The full joint posterior PDF is given by

$$ p(\omega^2, \phi, \theta, \rho, \eta, \alpha, \beta, b_0 \mid \hat{\omega}^2, \hat{\psi}, \hat{\theta}_u) = p(\omega^2, \phi, \rho, \eta \mid \theta, \alpha, \beta, b_0, \hat{\omega}^2, \hat{\psi}, \hat{\theta}_u)\, p(\theta, \alpha, \beta, b_0 \mid \hat{\omega}^2, \hat{\psi}, \hat{\theta}_u) \tag{34} $$

The full posterior uncertainty is characterized by first taking the generated samples, $\{\theta^{(n)}, \alpha^{(n)}, \beta^{(n)}, b_0^{(n)}\}$, $n = 1, \ldots, N$, from the PDF, $p(\theta, \alpha, \beta, b_0 \mid \hat{\omega}^2, \hat{\psi}, \hat{\theta}_u)$. Thereafter, posterior samples, $\{\omega^{2(n)}, \phi^{(n)}, \rho^{(n)}, \eta^{(n)}\}$, $n = 1, \ldots, N$, are drawn from the conditional posterior PDF, $p(\omega^2, \phi, \rho, \eta \mid \theta^{(n)}, \alpha^{(n)}, \beta^{(n)}, b_0^{(n)}, \hat{\omega}^2, \hat{\psi}, \hat{\theta}_u)$, $n = 1, \ldots, N$, using GS.

Multi-task sparse Bayesian learning methods. Multi-task learning is a method that attempts to examine informative relationships or data redundancy between $M$ different groups of measurements $\{y_m\}_{m=1}^{M}$, which may improve the SBL performance. Huang et al. (2018a) presented a multi-task SBL method by assigning a shared hyper-prior and prediction error precision parameter, which characterizes the common sparseness profile across multiple tasks. To enhance the learning robustness and posterior uncertainty quantification accuracy, the algorithm marginalized out the common prediction error precision parameter instead of merely finding its MAP value. Then the posterior PDF of the combined parameter vector $\{w_m\}_{m=1}^{M}$, where $y_m = \Theta_m w_m + e_m + m_m$ (see equation (24)), is

$$ p\left( \{w_m\}_{m=1}^{M} \mid \{y_m\}_{m=1}^{M}, \alpha, a_0, b_0 \right) = \int p\left( \{w_m\}_{m=1}^{M} \mid \{y_m\}_{m=1}^{M}, \alpha, \beta \right) p\left( \beta \mid \{y_m\}_{m=1}^{M}, \alpha, a_0, b_0 \right) d\beta \tag{35} $$

where $b_0$ is the prior rate parameter for the Gamma prior of the prediction error precision parameter, $\beta$. The MAP values of the hyper-parameters are inferred using the datasets from all learning tasks

$$ \{\tilde{\alpha}, \tilde{b}_0\} = \arg\max p\left( \alpha, b_0 \mid \{y_m\}_{m=1}^{M} \right) \tag{36} $$

This approach was applied to identify structural stiffness losses by exploiting a commonality among stiffness reduction models in the temporal domain, that is, the damage changes by a “small” amount over adjacent time periods. It has been shown that damage patterns are more reliably detected in both qualitative and quantitative ways by this sharing of related information using multi-task learning. Huang et al. (2018b) employed a multi-task SBL to adaptively borrow the respective strengths of two fractal dimension-based damage indices to acquire a unifying damage identification index.

Application to IASC-ASCE Phase II benchmark problems. The fast SBL algorithm with and without the sparseness constraint (Huang et al., 2017a) and the full GS SBL algorithm (Huang and Beck, 2018) were applied to the brace damage patterns in the IASC-ASCE Phase II simulated benchmark (Bernal et al., 2002) and experimental benchmark problems (Ching and Beck, 2003; Dyke et al., 2003). The benchmark structure is a four-story, two-bay by two-bay steel braced-frame. Results for the damage scenarios DP1B.ps (.ps denotes partial-sensor measurement, which are at the third floor and the roof) and Config. 5 from simulated and experimental benchmarks, respectively, are reported here. The stiffness scaling parameter vector $\theta$ has 16 components, one for each of the four faces of each of the four stories. The true damage ratio values for the damaged substructures are 88.7% for $\theta_{1,+y}$ and $\theta_{1,-y}$ for DP1B.ps and 77.4% for $\theta_{1,-y}$ for Config. 5 in terms of stiffness reduction from the calibration configuration.

In Figures 3 and 4, all the samples generated from the full GS SBL algorithm (Huang and Beck, 2018), excluding those in the burn-in period (4000 samples), are plotted in the $\{\theta_{1,+x}, \theta_{1,+y}\}$ and $\{\theta_{1,-x}, \theta_{1,-y}\}$ spaces for DP1B.ps and Config. 5, respectively. They show that the stiffness reduction corresponding to $\theta_{1,+y}$ and $\theta_{1,-y}$ for DP1B.ps and $\theta_{1,-y}$ for Config. 5 scenarios are correctly identified and quantified as far
Figure 3. Post burn-in samples of some posterior stiffness parameters for the DP1B.ps scenario, plotted in (a) $\{\theta_{1,+x}, \theta_{1,+y}\}$ and (b) $\{\theta_{1,-x}, \theta_{1,-y}\}$ spaces by running the full GS SBL algorithm. The reduction in stiffness shown by $\theta_{1,+y}$ and $\theta_{1,-y}$ reflects the damage in the corresponding substructures.

Figure 4. Post burn-in samples for some posterior stiffness parameters for Config. 5 scenario plotted in (a) $\{\theta_{1,+x}, \theta_{1,+y}\}$ and (b) $\{\theta_{1,-x}, \theta_{1,-y}\}$ spaces by running the full GS SBL algorithm. The reduction in stiffness shown by $\theta_{1,-y}$ reflects the damage in the corresponding substructure.
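The post burn-in samples in Figures 3 and 4 come from repeatedly drawing each parameter group from its conditional posterior. The sketch below illustrates those mechanics on a deliberately small stand-in problem, not the structural model of the cited papers: a one-coefficient linear regression with a Gaussian prior on the coefficient and a Gamma prior on the noise precision. The function name `gibbs_linear`, the prior settings, and the synthetic data are all assumptions made for illustration.

```python
import random


def gibbs_linear(x, y, alpha=10.0, a0=1.0, b0=1e-3,
                 n_samples=3000, burn_in=500, seed=0):
    """Gibbs sampler for y_i = w * x_i + noise, noise precision beta.

    Assumed priors: w ~ N(0, alpha) with alpha a variance,
    beta ~ Gamma(shape=a0, rate=b0). Returns post burn-in (w, beta) pairs.
    """
    rng = random.Random(seed)
    n = len(x)
    sxx = sum(v * v for v in x)
    sxy = sum(u * v for u, v in zip(x, y))
    w, beta = 0.0, 1.0
    kept = []
    for it in range(n_samples):
        # Group 1: w | beta, data is Gaussian
        var_w = 1.0 / (beta * sxx + 1.0 / alpha)
        mean_w = var_w * beta * sxy
        w = rng.gauss(mean_w, var_w ** 0.5)
        # Group 2: beta | w, data is Gamma(shape, rate)
        sse = sum((yi - w * xi) ** 2 for xi, yi in zip(x, y))
        shape = a0 + 0.5 * n
        rate = b0 + 0.5 * sse
        # random.gammavariate takes (shape, scale), so scale = 1 / rate
        beta = rng.gammavariate(shape, 1.0 / rate)
        if it >= burn_in:
            kept.append((w, beta))
    return kept


# Synthetic data (assumed): true coefficient 2.0, noise standard deviation 0.1
data_rng = random.Random(1)
x = [0.1 * i for i in range(1, 51)]
y = [2.0 * xi + data_rng.gauss(0.0, 0.1) for xi in x]

samples = gibbs_linear(x, y)
w_mean = sum(s[0] for s in samples) / len(samples)
beta_mean = sum(s[1] for s in samples) / len(samples)
```

Discarding the first `burn_in` draws mirrors the exclusion of the burn-in period in the figures; the retained pairs can then be summarized by sample means or plotted as scatter clouds.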
as the sample means are concerned. Considerably larger posterior uncertainties can be observed in the stiffness scaling parameters for Config. 5 scenario compared with those for DP1B.ps. This is because of the larger modeling errors in this real data case, especially for those components corresponding to real damage locations.

Figures 5 and 6 show the posterior probability densities of the damage extent fraction $f$ for each substructure, which is the decrease in each stiffness parameter divided by its original calibration value. The posterior probability densities are estimated by the computed posterior PDFs (fast algorithms) or posterior samples (full GS algorithm). Damaged substructures should have large posterior probability density values where the stiffness reduction value is close to the real value. By comparing the results, it is seen that the fast SBL and GS algorithms give more accurate stiffness reduction ratios than the method without the sparseness constraint. Moreover, the false and missed damage indications have been effectively suppressed. This shows the benefit of exploiting damage sparseness. The performance of the two SBL algorithms is similar, although false damage detections (actual undamaged substructures that have probability densities shifted to larger damage extents) occur less often for the full GS SBL algorithm for Config. 5 (Figure 6). This is because of the robust treatment of the hyper-parameters by a fuller posterior uncertainty quantification. For example, the probability densities of $\theta_{1,+y}$, $\theta_{3,+y}$, $\theta_{4,+y}$, $\theta_{2,-y}$, $\theta_{3,-y}$, and $\theta_{4,-y}$ are shifted to larger damage extents for the fast SBL algorithm, which tends to produce false detections. Moreover, the damage extent estimation for $\theta_{1,-y}$ (22.6%) is more accurately quantified for the full GS SBL method than for the fast SBL method.

Sparse Bayesian learning application in guided wave/ultrasonic NDT signal processing

In guided wave/ultrasonic NDT signal processing, the signal obtained from pulse-echo mode testing can be
Figure 5. DP1B.ps scenario: (a) Approximated Gaussian PDFs for the fast SBL algorithm with sparseness turned off and (b) Approximated Gaussian PDFs for the fast SBL algorithm; (c) Kernel probability densities built from 6000 post burn-in stiffness parameter samples by running the full GS SBL algorithm. The two damaged substructures correspond to $\theta_{1,+y}$ and $\theta_{1,-y}$.

Figure 6. Config. 5 scenario: (a) Approximated Gaussian PDFs for the fast SBL algorithm with sparseness turned off and (b) Approximated Gaussian PDFs for the fast SBL algorithm; (c) Kernel probability densities built from 6000 post burn-in stiffness parameter samples by running the full GS SBL algorithm. The only damaged substructure corresponds to $\theta_{1,-y}$.
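Panels (c) of Figures 5 and 6 are kernel probability densities built from posterior samples. A minimal sketch of such an estimate is given below, assuming a Gaussian kernel and a rule-of-thumb bandwidth; the kernel and bandwidth actually used for the figures are not stated in the text, so both are assumptions. The samples are synthetic stand-ins drawn around a damage extent fraction of 0.226, not benchmark results.

```python
import math
import random


def gaussian_kde(samples, grid, bandwidth=None):
    """Evaluate a Gaussian kernel density estimate of `samples` on `grid`."""
    n = len(samples)
    if bandwidth is None:
        mean = sum(samples) / n
        std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
        bandwidth = 1.06 * std * n ** (-0.2)   # rule-of-thumb choice (assumed)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((g - s) / bandwidth) ** 2)
                       for s in samples)
            for g in grid]


# Synthetic stand-in for post burn-in samples of a damage extent fraction
rng = random.Random(2)
fraction_samples = [rng.gauss(0.226, 0.02) for _ in range(2000)]

step = 0.0025
grid = [step * i for i in range(201)]          # fractions from 0.0 to 0.5
density = gaussian_kde(fraction_samples, grid)

# Trapezoidal check that the estimate integrates to about one on the grid
area = step * (sum(density) - 0.5 * (density[0] + density[-1]))
mode = grid[max(range(len(grid)), key=density.__getitem__)]
```

A density concentrated near the true stiffness reduction, with little mass elsewhere, is the pattern the text describes for correctly identified damaged substructures.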
represented as a linear combination of echoes reflected from damage or defects in the sample being examined. A generalized expression of the signal is given by

$$ y(t) = \sum_{l=1}^{L} c_l \phi_l(t) + \xi(t) \tag{37} $$

where $c_l$ is the weighting coefficient of the $l$th echo $\phi_l(t)$ and $\xi(t)$ is a term representing noise in the signal. When accurate representation of each echo is available, the recorded signal $y(t)$ can be represented using only a few terms, that is, the sparseness of the representation of $y(t)$ can be exploited. Therefore, the SBL can be employed to infer the weighting coefficients $c_l$ from the measured signal vector $y$ using the following linear equation

$$ y = \Phi c + \xi \tag{38} $$

where $\Phi \in \mathbb{R}^{K \times L}$ is an over-complete dictionary matrix ($K \ll L$) that consists of $L$ basis vectors, $\phi_l$ (also called atoms), and $c = [c_1, \ldots, c_L]$ is a sparse weighting coefficient vector.

SBL applications in ultrasonic NDT. In ultrasonic NDT, the received signals are often contaminated by noise from both the measurement system and test sample (structure noise, due to scattering of ultrasonic waves by the “grain” microstructure of the tested material). To suppress noise and to increase the visibility of echoes for detection, Zhang et al. (2008) proposed a methodology in which the SBL algorithm developed by Wipf and Rao (2004) was employed to decompose the noisy NDT signals. The dictionary $\Phi$ consisted of several fixed-scale critically sampled cosine Gabor bases where each atom is defined by the parameters $\{s, u, \omega\}$, as follows

$$ g = \frac{A}{\sqrt{s}} \exp\left\{ -\pi (t - z)^2 / s^2 \right\} \cos\left( \omega (t - z) \right) \tag{39} $$

where $s$ is the scale of the function, $z$ is its translation, and $\omega$ is the frequency modulation. One limitation of this SBL algorithm is that it is extremely computationally demanding, especially when the dimension of dictionary $\Phi$ is large. As such, Wu et al. (2017a) developed a signal processing method that employs a robust sparse Bayesian learning algorithm (RSBL) to process noisy NDT signals for flaw detection. The RSBL algorithm was developed by Huang et al. (2014). It is based on the “fast algorithm” by Tipping and Faul (2003) but enhanced for better robustness by a successive relaxation strategy and stochastic optimization searching scheme to alleviate the optimization problem of the hyper-parameters being stuck in local maxima. Moreover, instead of using a uniformly distributed parameter set for dictionary design, Wu et al. (2017a) chose dictionary parameters in accordance with the energy distribution of the signal. Although the structure noise is not sparse in the spatial domain, the proposed dictionary design strategy can produce sparse representations of the structure noise by utilizing its limited frequency range and bandwidth. This modeling is useful to increase the sparse Bayesian learning accuracy of the weighting coefficient vector, $c$.

SBL applications in guided wave damage/defect detection. In guided wave (GW) testing, the identification and recovery of each guided wave mode in the received signal is vital for damage or defect characterization and localization. Once individual modes are identified, defect localization is straightforward. In particular, if the amplitudes of each mode are specified, it is possible to further characterize the size of the defect.

To process narrowband guided wave signals in which signal dispersion is negligible, Wu et al. (2017b) introduced an SBL-based method, where the Gabor model given in equation (39) was utilized to approximate the GW pulses. To form an efficient over-complete dictionary, the three Gabor parameters $(\omega, s, u)$ were designed as follows: the natural frequencies $\omega$ evenly divide the total power of the signal; the scale parameters $s$ are uniformly distributed in the range $(0.5 s_0, 2 s_0)$, where $s_0$ is the bandwidth of the generated signal, and the parameters $u$ evenly separate the area between the upper and lower envelopes of the signal.

For dispersive guided wave signal processing, the Gabor dictionary becomes inefficient. Wu et al. (2018) also proposed a parameterized chirp model for the approximation of the dispersive guided wave signal using a polynomial approximation of the frequency–wavenumber dispersion $k(f)$. This wavenumber, as a function of frequency $f$, characterizes the dispersion property of the GW mode in the waveguide. By utilizing a third-order polynomial approximation of the frequency–wavenumber dispersion $k(f)$, the time-domain waveform of the pulse at the travel distance $x = x_0$ can be obtained as

$$ g(x_0, t) = \mathrm{Re}\left[ G(0, f)\, e^{-j k(f) x_0} \right] \approx \mathrm{Re}\left[ G(0, f)\, e^{-j x_0 \left( \frac{1}{3} d_2 f^3 + \frac{1}{2} d_1 f^2 + d_0 f + c_0 \right)} \right] \tag{40} $$

where $G(0, f)$ is the Fourier transform of the pulse at $x = 0$. Based on this model, Wu et al. (2018) presented a signal processing method, which utilizes an advanced SBL algorithm presented in Huang et al. (2016) to recover multiple dispersive GW modes from noisy signals for damage detection and localization. The dictionary design was based on the propagation path of each mode, which is closely linked to signal dispersion.
This leads to the desirable consequence that the distances between defect and actuator and between defect and receiver can be easily obtained, making defect localization straightforward. In addition, the SBL algorithm presented in Huang et al. (2016) can treat both highly sparse and approximately sparse signal models, so the method is robust against the noise in the signals.

This method was verified through the experimental study in Figure 7, where a notch was prefabricated as damage. Figure 8 shows the recovered signals and individual modes obtained. Using the propagation information of the recovered modes and the triangulation method, the location of the damage was obtained, as presented in Figure 9. It is observed that the detected notch is close to its actual position (the distance between these two positions is approximately 17 mm). It is noteworthy that measurements from only two sensors are sufficient to localize the notch.

Figure 7. Experimental setup for damage localization.

Figure 8. (a) Processed signal and recovered modes for sensor 1 and (b) processed signal and recovered modes for sensor 2.

Figure 9. Notch localization.

Discussion and future prospects

This article presented a state-of-the-art review on Bayesian inference and its application in structural system identification and damage assessment of civil infrastructure. Because of the limited page space allowed for this article, other applications of Bayesian inference in system identification, such as modal identification, are not included. Based on the literature review, the following concluding remarks can be made:

1. A powerful Bayesian probabilistic framework is available for treating modeling uncertainty in system identification that is based solely on the probability logic axioms. It allows plausible reasoning regarding system behavior based on noisy incomplete data without invoking the concept of “inherent randomness.” Rather than considering only a point estimate based on a single model, Bayes’ theorem is used to compute the posterior probability distribution and quantify the relative plausibility of each model in a parameterized set of system models.
2. Comparing the posterior probability at the model class level automatically implements a quantitative form of the Ockham Razor. Roughly speaking, this principle states that models should not be more complex than is sufficient to explain the data. The Bayesian Ockham Razor penalizes model classes that “overfit” the data, which is important in real applications since overly complex models often lead to overfitting of the data, and subsequent response predictions may then be unreliable.
3. To allow a computationally feasible Bayesian implementation, various Bayesian approximation tools have been developed for robust analysis and characterization of the posterior distribution in Bayesian updating and model class assessment involving a large number of uncertain parameters. Which tool is appropriate depends on the situation. For example, Laplace’s asymptotic approximation is useful if the amount of data is not too small and the model class is globally identifiable. When the chosen class of models is unidentifiable or locally identifiable based on the data, so that there are multiple maximum likelihood estimates (MLEs), stochastic simulation methods, such as MCMC methods, are more practical for calculating the model class evidence.
4. The application of Bayesian inference for both vibration-based and wave propagation-based damage assessment is addressed and reviewed. Using a Bayesian probabilistic formulation, the updated posterior probability distribution of the uncertain damage-related model parameters is obtained. Not only are the most probable estimates inferred, but the associated posterior uncertainties are also quantified, including the probability of substructure damage of various amounts. The concept of system mode shape was utilized in the vibration-based damage assessment methods. This avoids the challenging mode-matching problem and the necessity of solving the nonlinear inverse problem related to a structural model eigenvalue equation.
5. The hierarchical sparse Bayesian learning methodologies have attracted interest in recent years for performing sparse stiffness loss inference for vibration-based damage assessment and also for flaw detection using guided wave/ultrasonic NDT signal processing. It is found that the incorporation of prior information pertaining to the spatial sparseness of structural damage helps to suppress the possible occurrences of false damage detections. Moreover, the algorithms have the appealing feature that they automatically select all algorithmic parameters, so that no user intervention is required.

To enhance the application of Bayesian inference in civil engineering and other related areas in science and technology, the following directions for future research are suggested:

1. Most past Bayesian inference applications in system identification and damage assessment have involved low-dimensional model parameters. There are computational challenges to applying the Bayesian approach to high-dimensional inverse problems, such as how to efficiently sample the posterior in high-dimensional parameter spaces and how to explore robustly the features implied by the collection of models corresponding to the posterior samples of the model parameters. Further research is desirable to develop new methods for exploring high-dimensional model parameter spaces, such as iterative block-parameter Gibbs sampling algorithms, and for refining model parameter spaces, such as variable-resolution approaches that permit a progressive refinement of model parameterization.
2. The assessment of the bottlenecks in Bayesian model updating and uncertainty quantification of nonlinear structural models requires further study. For complex nonlinear models, an analytical formula for the likelihood function might be difficult to obtain, or even elusive. Methods such as Approximate Bayesian Computation (ABC) have the potential of bypassing the evaluation of likelihood functions and should be further explored in applications.
3. Bayesian inference is a powerful statistical framework for dealing with big datasets to avoid data overfitting and to allow model uncertainty to be explicitly quantified. To overcome the computing challenges with large-scale spatial and temporal datasets in structural health monitoring, the advance of scalable Bayesian inference algorithms should be explored. Topics for research include subsampling big datasets with a stochastic method that exploits the redundancy in large-scale datasets, developing recursive Bayesian estimation for inferring an unknown PDF over time using sequential datasets, and producing modular and portable software for distributed/parallel computing platforms.
4. Bayesian methods can enhance many machine learning methods (including deep learning) by handling missing data, extracting much more information from small datasets, and automatically tuning hyper-parameters. Moreover, Bayesian methods allow us to quantify both modeling and measurement uncertainty in learning and making predictions, which is a desirable feature in various fields. Future research in machine learning methodologies and applications can benefit by exploring Bayesian methods.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research is financially supported by the
National Key Research and Development Program of China (2017YFC1500605) and the National Natural Science Foundation of China (Grant Nos. 51778192, 51638007 and 51308161).

ORCID iDs

Yong Huang https://orcid.org/0000-0002-7963-0720
Biao Wu https://orcid.org/0000-0002-7725-9980

References

Andrieu C and Thoms J (2008) A tutorial on adaptive MCMC. Statistics and Computing 18: 343–373.
Arangio S and Beck JL (2012) Bayesian neural networks for bridge integrity assessment. Structural Control and Health Monitoring 19(1): 3–21.
Arangio S and Bontempi F (2015) Structural health monitoring of a cable-stayed bridge with Bayesian neural networks. Structure and Infrastructure Engineering 11(4): 575–587.
Astroza R, Ebrahimian H, Li Y, et al. (2017) Bayesian nonlinear structural FE model and seismic input identification for damage assessment of civil structures. Mechanical Systems and Signal Processing 93: 661–687.
Bao YQ, Xia Y, Li H, et al. (2013) Data fusion-based structural damage detection under varying temperature conditions. International Journal of Structural Stability and Dynamics 12(6): 1250052.
Beck JL (1989) Statistical system identification of structures. In: Proceedings of 5th international conference on structural safety and reliability, San Francisco, CA, 7–11 August.
Beck JL (2010) Bayesian system identification based on probability logic. Structural Control and Health Monitoring 17: 825–847.
Beck JL (2014) Bayesian system identification and the Bayesian Ockham Razor. In: Proceedings of the 9th international conference on structural dynamics, Porto, 30 June–2 July.
Beck JL and Au SK (2002) Bayesian updating of structural models and reliability using Markov Chain Monte Carlo simulation. Journal of Engineering Mechanics 128(4): 380–391.
Beck JL and Katafygiotis LS (1991) Updating of a model and its uncertainties utilizing dynamic test data. In: Spanos PD and Brebbia CA (eds) Computational Stochastic Mechanics. Dordrecht: Springer, pp. 125–136.
Beck JL and Katafygiotis LS (1998) Updating models and their uncertainties. I: Bayesian statistical framework. Journal of Engineering Mechanics 124(4): 455–461.
Beck JL and Yuen KV (2004) Model selection using response measurements: Bayesian probabilistic approach. Journal of Engineering Mechanics 130(2): 192–203.
Beck JL and Zuev KM (2013) Asymptotically independent Markov sampling: a new Markov Chain Monte Carlo scheme for Bayesian inference. International Journal for Uncertainty Quantification 3(5): 445–474.
Beck JL, Au S and Vanik MW (2001) Monitoring structural health using a probabilistic measure. Computer-Aided Civil and Infrastructure Engineering 16(1): 1–11.
Behmanesh I and Moaveni B (2015) Probabilistic identification of simulated damage on the Dowling Hall footbridge through Bayesian finite element model updating. Structural Control and Health Monitoring 22: 463–483.
Behmanesh I, Moaveni B and Papadimitriou C (2017) Probabilistic damage identification of a designed 9-story building using modal data in the presence of modeling errors. Engineering Structures 131: 542–552.
Bernal D, Dyke SJ, Lam HF, et al. (2002) Phase II of the ASCE benchmark study on SHM. In: Proceedings of 15th ASCE engineering mechanics conference, New York, 2–5 June, pp. 1048–1055.
Bishop CM (2005) Neural Networks for Pattern Recognition. New York: Oxford University Press.
Bishop CM (2006) Pattern Recognition and Machine Learning. New York: Springer.
Candes EJ, Romberg J and Tao T (2006) Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Transactions on Information Theory 52(2): 489–509.
Catanach TA and Beck JL (2017) Bayesian system identification using auxiliary stochastic dynamical systems. International Journal of Nonlinear Mechanics 94: 72–83.
Chen SS, Donoho DL and Saunders MA (1999) Atomic decomposition by basis pursuit. SIAM Journal on Scientific and Statistical Computing 20(1): 33–61.
Cheung SH and Beck JL (2009) Bayesian model updating using Hybrid Monte Carlo Simulation with application to structural dynamics models with many uncertain parameters. Journal of Engineering Mechanics 135: 243–255.
Chiachio J, Bochud N, Chiachio M, et al. (2017) A multilevel Bayesian method for ultrasound-based damage identification in composite laminates. Mechanical Systems and Signal Processing 88: 462–477.
Chiachio M, Beck JL, Chiachio J, et al. (2014) Approximate Bayesian computation by subset simulation. SIAM Journal on Scientific Computing 36(3): A1339–A1358.
Ching J and Beck JL (2003) Two-step Bayesian structure health monitoring approach for IASC-ASCE phase II simulated and experimental benchmark studies. Technical Report EERL 2003-02, Earthquake Engineering Research Laboratory, California Institute of Technology, Pasadena, CA.
Ching J and Beck JL (2004a) Bayesian analysis of the Phase II IASC–ASCE Structural Health Monitoring experimental benchmark data. Journal of Engineering Mechanics 130(10): 1233–1244.
Ching J and Beck JL (2004b) New Bayesian model updating algorithm applied to a Structural Health Monitoring benchmark. Structural Health Monitoring 3(4): 313–332.
Ching J and Chen YC (2007) Transitional Markov Chain Monte Carlo method for Bayesian model updating, model class selection, and model averaging. Journal of Engineering Mechanics 133(7): 816–832.
Ching J, Beck JL and Porter KA (2006a) Bayesian state and parameter estimation of uncertain dynamical systems. Probabilistic Engineering Mechanics 21(1): 81–96.
Ching J, Muto M and Beck JL (2006b) Structural model updating and health monitoring with incomplete modal data using Gibbs sampler. Computer-Aided Civil and Infrastructure Engineering 21(4): 242–257.
Cover TM and Thomas JA (2006) Elements of Information Theory. Hoboken, NJ: Wiley-Interscience.
Cox RT (1946) Probability, frequency and reasonable expectation. American Journal of Physics 14(1): 1–13.
Cox RT (1961) The Algebra of Probable Inference. Baltimore, MD: Johns Hopkins Press.
Dyke SJ, Bernal D, Beck JL, et al. (2003) Experimental phase II of the structural health monitoring benchmark problem. In: Proceedings of 16th Engineering Mechanics conference, ASCE, Seattle, USA.
Faul AC and Tipping ME (2002) Analysis of sparse Bayesian learning. In: Dietterich TG, Becker S and Ghahramani Z (eds) Advances in Neural Information Processing Systems 14. Cambridge, MA: MIT Press, pp. 383–389.
Figueiredo E, Radu L, Worden K, et al. (2014) A Bayesian approach based on a Markov-chain Monte Carlo method for damage detection under unknown sources of variability. Engineering Structures 80: 1–10.
Fujimoto K, Satoh A and Fukunaga S (2011) System identification based on variational Bayes method and the invariance under coordinate transformations. In: Proceedings of the 50th IEEE conference on CDC-ECC, Orlando, FL, pp. 3882–3888. New York: IEEE.
Ghanem R and Shinozuka M (1995) Structural-system identification. I: theory. Journal of Engineering Mechanics 121(2): 255–264.
Goller B, Beck JL and Schuëller GI (2012) Evidence-based identification of weighting factors in Bayesian model updating using modal data. Journal of Engineering Mechanics 138(5): 430–440.
Gull SF (1988) Bayesian inductive inference and maximum entropy. In: Erickson GJ and Smith CR (eds) Maximum Entropy and Bayesian Methods. Dordrecht: Kluwer Academic Publishers, pp. 53–74.
Haario H, Saksman E and Tamminen J (2001) An adaptive Metropolis algorithm. Bernoulli 7: 223–242.
Hastings WK (1970) Monte Carlo sampling methods using Markov Chains and their applications. Biometrika 57(1): 97–109.
He S and Ng CT (2017) Guided wave-based identification of multiple cracks in beams using a Bayesian approach. Mechanical Systems and Signal Processing 84: 324–345.
Hong JC, Sun KH and Kim YY (2006) Waveguide damage detection by the matching pursuit approach employing the dispersion-based chirp functions. IEEE Transactions on Ultrasonics Ferroelectrics and Frequency Control 53(3): 592–605.
Hou RR, Xia Y and Zhou XQ (2018a) Structural damage detection based on l1 regularization using natural frequencies and mode shapes. Structural Control and Health Monitoring 25(3): e2107.
Hou RR, Xia Y, Bao YQ, et al. (2018b) Selection of regularization parameter for l1-regularized damage detection. Journal of Sound and Vibration 423: 141–160.
Huang Y and Beck JL (2015) Hierarchical sparse Bayesian learning for structural health monitoring with incomplete modal data. International Journal for Uncertainty Quantification 5(2): 139–169.
Huang Y, Beck JL, Wu S and Li H (2011) Robust Diagnostics for Bayesian Compressive Sensing Technique in Structural Health Monitoring. In: The 8th international workshop on structural health monitoring, Stanford, USA, 13–15 September.
Huang Y and Beck JL (2018) Full Gibbs sampling procedure for Bayesian system identification incorporating sparse Bayesian learning with automatic relevance determination. Computer-Aided Civil and Infrastructure Engineering 33(9): 712–730.
Huang Y, Beck JL and Li H (2017a) Hierarchical sparse Bayesian learning for structural damage detection: theory, computation and application. Structural Safety 64: 37–53.
Huang Y, Beck JL and Li H (2017b) Bayesian system identification based on hierarchical sparse Bayesian learning and Gibbs sampling with application to structural damage assessment. Computer Methods in Applied Mechanics and Engineering 318: 382–411.
Huang Y, Beck JL and Li H (2018a) Multi-task sparse Bayesian learning with applications in Structural Health Monitoring. Computer-Aided Civil and Infrastructure Engineering. Epub ahead of print 21 August 2018. DOI: 10.1111/mice.12408
Huang Y, Beck JL, Wu S, et al. (2014) Robust Bayesian compressive sensing for signals in Structural Health Monitoring. Computer-Aided Civil and Infrastructure Engineering 29(3): 160–179.
Huang Y, Beck JL, Wu S, et al. (2016) Bayesian compressive sensing for approximately sparse signals and application to structural health monitoring signals for data loss recovery. Probabilistic Engineering Mechanics 46: 62–79.
Huang Y, Li H, Wu S, et al. (2018b) Fractal dimension based damage identification incorporating multi-task sparse Bayesian learning. Smart Materials and Structures 27: 075020.
Huang Y, Ren Y, Beck JL, et al. (2018c) Sequential Bayesian compressed sensing. In: The 7th world conference on structural control and monitoring, 7WCSCM, Qingdao, China, 22–25 July.
Huang Y, Shao CS, Wu S and Li H (2018d) Diagnosis and accuracy enhancement of compressive-sensing signal reconstruction in structural health monitoring using multi-task sparse Bayesian learning. Smart Materials and Structures. Available at: https://doi.org/10.1088/1361-665X/aae9b4
Jaynes ET (1957) Information theory and statistical mechanics. Physical Review 106(4): 620–630.
Jaynes ET (1983) In: Rosenkrantz RD (ed.) Papers on Probability, Statistics and Statistical Physics. Dordrecht, Holland: Reidel Publishing.
Jaynes ET (2003) Probability Theory: The Logic of Science. Cambridge: Cambridge University Press.
Jefferys WH and Berger JO (1992) Ockham’s Razor and Bayesian analysis. American Scientist 80: 64–72.
Katafygiotis LS and Beck JL (1998) Updating models and their uncertainties. II: Model identifiability. Journal of Engineering Mechanics 124(4): 463–467.
Katafygiotis LS and Lam HF (2002) Tangential-projection algorithm for manifold representation in unidentifiable model updating problems. Earthquake Engineering & Structural Dynamics 31(4): 791–812.
Lam HF and Ng CT (2008) The selection of pattern features for structural damage detection using an extended Bayesian ANN algorithm. Engineering Structures 30(10): 2762–2770.
Lam HF, Hu Q and Wong MT (2014) The Bayesian methodology for the detection of railway ballast damage under a concrete sleeper. Engineering Structures 81: 289–301.
Lam HF, Yuen KV and Beck JL (2006) Structural health monitoring via measured Ritz vectors utilizing artificial neural networks. Civil and Infrastructure Engineering 21: 232–241.
Li BB and Der Kiureghian A (2017) Operational modal identification using variational Bayes. Mechanical Systems and Signal Processing 88: 377–398.
MacKay DJC (1992) Bayesian methods for adaptive models. PhD Thesis, Computation and Neural Systems, California Institute of Technology, Pasadena, CA.
MacKay DJC (1994) Chapter 6: Bayesian methods for backpropagation networks. In: MacKay DJC (ed.) Model of Neural Networks III. Berlin: Springer, pp. 211–254.
Marin JM, Pudlo P, Robert CP, et al. (2012) Approximate Bayesian computational methods. Statistics and Computing 22: 1167–1180.
Mises VR (1981 [1939]) Probability, Statistics, and Truth. New York: Dover Publications (in German).
Muto M and Beck JL (2008) Bayesian updating and model class selection using stochastic simulation. Journal of Vibration and Control 14: 7–34.
Neal RM (1996) Bayesian learning for neural networks. Lecture Notes in Statistics, Berlin.
Ng CT, Veidt M and Lam HF (2009) Guided wave damage characterization in beams utilizing probabilistic optimization. Engineering Structures 31(12): 2842–2850.
Nichols JM, Link WA, Murphy KD, et al. (2010) A Bayesian approach to identifying structural nonlinearity using free-decay response: application to damage detection in composites. Journal of Sound and Vibration 329(15): 2995–3007.
Raghavan A and Cesnik CES (2007) Guided-wave signal processing using chirplet matching pursuits and mode correlation for structural health monitoring. Smart Materials and Structures 16(2): 355–366.
Robert CP and Casella G (2004) Monte Carlo Statistical Methods (ed S Fienberg). 2nd ed. New York: Springer.
Sirca GF and Adeli H (2012) System identification in structural engineering. Scientia Iranica 19(6): 1355–1364.
Sohn H, Allen DW, Worden K, et al. (2005) Structural damage classification using extreme value statistics. Journal of Dynamic Systems Measurement and Control 127(1): 125–132.
Sohn H and Law KH (1997) A Bayesian probabilistic approach for structure damage detection. Earthquake Engineering & Structural Dynamics 26(12): 1259–1281.
Straub D and Papaioannou I (2015) Bayesian updating with structural reliability methods. Journal of Engineering Mechanics 141(3): 04014134.
Tarantola A (2005) Inverse Problem Theory. Philadelphia, PA: Society for Industrial and Applied Mathematics.
Tipping ME (2001) Sparse Bayesian learning and the relevance vector machine. Journal of Machine Learning Research 1: 211–244.
Tipping ME (2004) Bayesian inference: an introduction to principles and practice in machine learning. In: Bousquet O et al. (ed.) Advanced Lectures on Machine Learning. Springer-Verlag Berlin Heidelberg, pp. 41–62.
Tipping ME and Faul AC (2003) Fast marginal likelihood maximization for sparse Bayesian models. In: Proceedings of 9th international workshop on artificial intelligence and statistics, Key West, FL, 3–6 January.
Tropp JA and Gilbert AC (2007) Signal recovery from random measurements via orthogonal matching pursuit. IEEE Transactions on Information Theory 53(12): 4655–4666.
Vakilzadeh MK, Huang Y, Beck JL, et al. (2017) Approximate Bayesian Computation by Subset Simulation using hierarchical state-space models. Mechanical Systems and Signal Processing 84: 2–20.
Vanik MW (1997) A Bayesian probabilistic approach to structural health monitoring. Technical Report EERL-9707. Pasadena, CA: Earthquake Engineering Research Laboratory, Caltech.
Vanik MW, Beck JL and Au SK (2000) Bayesian probabilistic approach to structural health monitoring. Journal of Engineering Mechanics 126(7): 738–745.
Wang H, Yajima A, Liang RY, et al. (2015) A Bayesian model framework for calibrating ultrasonic in-line inspection data and estimating actual external corrosion depth in buried pipeline utilizing a clustering technique. Structural Safety 54: 19–31.
Wipf DP and Rao BD (2004) Sparse Bayesian learning for basis selection. IEEE Transactions on Signal Processing 52(8): 2153–2164.
Wu B, Huang Y and Krishnaswamy S (2017a) A Bayesian approach for sparse flaw detection from noisy signals for ultrasonic NDT. NDT&E International 85: 76–85.
Wu B, Huang Y, Chen X, et al. (2017b) Guided-wave signal processing by the sparse Bayesian learning approach employing Gabor pulse model. Structural Health Monitoring 16(3): 347–362.
Wu B, Li H and Huang Y (2018) Sparse recovery of multiple dispersive guided-wave modes for defect localization using a Bayesian approach. Structural Health Monitoring. Available at: https://doi.org/10.1177/1475921718790212
Yan G (2013) A Bayesian approach for damage localization in plate-like structures using Lamb waves. Smart Materials and Structures 22(3): 035012.
Yan WJ and Katafygiotis LS (2015) A novel Bayesian approach for structural model updating utilizing
statistical modal information from multiple setups. Structural Safety 52: 260–271.
Yang CM and Beck JL (1998) Generalized trajectory methods for finding multiple extrema and roots of functions. Journal of Optimization Theory and Applications 97(1): 211–227.
Yin T, Jiang QH and Yuen KV (2017) Vibration-based damage detection for structural connections using incomplete modal data by Bayesian approach and model reduction technique. Engineering Structures 132: 260–277.
Yuen KV (2010) Recent developments of Bayesian model class selection and applications in civil engineering. Structural Safety 32(5): 338–346.
Yuen KV and Kuok SC (2016) Online updating and uncertainty quantification using nonstationary output-only measurement. Mechanical Systems and Signal Processing 66–67: 62–77.
Yuen KV, Beck JL and Au SK (2004) Structural damage detection and assessment using adaptive Markov Chain Monte Carlo simulation. Structural Control and Health Monitoring 11(4): 327–347.
Yuen KV, Beck JL and Katafygiotis LS (2006) Efficient model updating and health monitoring methodology using incomplete modal data without mode matching. Structural Control and Health Monitoring 13(1): 91–107.
Zhang GM, Harvey DM and Braden DR (2008) Signal denoising and ultrasonic flaw detection via overcomplete and sparse representations. Journal of the Acoustical Society of America 124(5): 2963–2972.
Zhou X, Xia Y and Weng S (2015) L1 regularization approach to structural damage detection using frequency data. Structural Health Monitoring 14(6): 571–582.

Appendix 1

Information-theoretic interpretation of producing sparseness in the inferred stiffness changes for the fast sparse Bayesian learning algorithm

In the fast sparse Bayesian learning algorithm, the objective function for the derivation of maximum a posteriori (MAP) estimation of hyper-parameters $\alpha$ and $\beta$ is defined as

$$ J(\alpha, \beta) = \log\left[ p(\hat{\omega}^2, \hat{\psi}, \hat{\theta}_u \mid \delta)\, p(\delta) \right] = \log\left[ p(\hat{\theta}_u \mid \omega^2, \phi, \beta, \alpha)\, p(\omega^2, \phi \mid \beta)\, p(\beta \mid b_0) \right] + c_2 \tag{41} $$

where $c_2$ is a constant independent of $\alpha$ and $\beta$. For the logarithm function of the product of pseudo-evidence $p(\hat{\theta}_u \mid \omega^2, \phi, \beta, \alpha)$ and probability density function (PDF) $p(\omega^2, \phi \mid \beta)$ in $J(\alpha, \beta)$ in equation (41), the information-theoretical interpretation of the trade-off between data fitting and model complexity (Beck, 2010) can be demonstrated as follows, where we use the hierarchical model structure in Figure 2

$$
\begin{aligned}
\log\left[ p(\hat{\theta}_u \mid \omega^2, \phi, \beta, \alpha)\, p(\omega^2, \phi \mid \beta) \right]
&= \int \log\left[ p(\hat{\theta}_u \mid \omega^2, \phi, \beta, \alpha)\, p(\omega^2, \phi \mid \beta) \right] p(\theta \mid \hat{\theta}_u, \omega^2, \phi, \beta, \alpha)\, d\theta \\
&= \int \log\left[ \frac{p(\theta \mid \omega^2, \phi, \beta)\, p(\omega^2, \phi \mid \beta)\, p(\hat{\theta}_u \mid \theta, \alpha)}{p(\theta \mid \hat{\theta}_u, \omega^2, \phi, \beta, \alpha)} \right] p(\theta \mid \hat{\theta}_u, \omega^2, \phi, \beta, \alpha)\, d\theta \\
&= \int \log\left[ p(\omega^2, \phi, \theta \mid \beta) \right] p(\theta \mid \hat{\theta}_u, \omega^2, \phi, \beta, \alpha)\, d\theta
 - \int \log\left[ \frac{p(\theta \mid \hat{\theta}_u, \omega^2, \phi, \beta, \alpha)}{p(\hat{\theta}_u \mid \theta, \alpha)} \right] p(\theta \mid \hat{\theta}_u, \omega^2, \phi, \beta, \alpha)\, d\theta \\
&= \int \log\left[ p(\omega^2, \phi, \theta \mid \beta) \right] p(\theta \mid \hat{\theta}_u, \omega^2, \phi, \beta, \alpha)\, d\theta
 - \int \log\left[ \frac{p(\theta \mid \hat{\theta}_u, \omega^2, \phi, \beta, \alpha)}{p(\theta \mid \hat{\theta}_u, \alpha)} \right] p(\theta \mid \hat{\theta}_u, \omega^2, \phi, \beta, \alpha)\, d\theta
\end{aligned}
\tag{42}
$$

where the pseudo likelihood $p(\hat{\theta}_u \mid \theta, \alpha) = N(\hat{\theta}_u \mid \theta, A) = N(\theta \mid \hat{\theta}_u, A) = p(\theta \mid \hat{\theta}_u, \alpha)$. Equation (42) shows that the logarithm function, which is to be maximized, is the difference between the posterior mean of the log joint prior PDF $p(\omega^2, \phi, \theta \mid \beta)$ (the first term) and the relative entropy (or Kullback–Leibler information) of the posterior PDF $p(\theta \mid \hat{\theta}_u, \omega^2, \phi, \beta, \alpha)$ of $\theta$ with respect to the PDF $p(\theta \mid \hat{\theta}_u, \alpha)$ conditional only on $\alpha$ and the MAP vector $\hat{\theta}_u$, which is obtained from the calibration stage (the second term). The first term quantifies the ability of the modal parameters corresponding to the structural model specified by $\theta$ to match the system modal parameters $\omega^2$ and $\phi$; it is maximized if the model modal parameters become tightly clustered around the system modal parameters $\omega^2$ and $\phi$, that is, the equation error precision parameter $\beta \to \infty$. The second term reflects the amount of information extracted from the system modal parameters $\omega^2$ and $\phi$, and so from the “measured” modal data $\hat{\omega}^2$ and $\hat{\psi}$, as implied by the observation of the hierarchical model structure exhibited in Figure 2. It penalizes models that have more parameter components $\theta_j$ differing from those in $\hat{\theta}_u$, and therefore forces the model updating to extract less information from the system modal parameters $\omega^2$ and $\phi$. Over-extraction of information from the system modal parameters $\omega^2$ and $\phi$ will produce a structural model vector $\theta$ that differs too much from $\hat{\theta}_u$ and is overly sensitive to the details of the information in the specified system modal parameters $\omega^2$ and $\phi$, and therefore in the “measured” modal parameters $\hat{\omega}^2$ and $\hat{\psi}$. In other words, the measurement noise and other environmental effects may not be “smoothed out” so they may have an excessive effect on the damage detection performance.

Summarizing, the maximization of equation (42) automatically produces the optimal trade-off between data fitting and model complexity that causes many hyper-parameters $\alpha_j$ to approach zero with a reasonably large value of $\beta$, giving a model $\theta$ that has both a sufficiently small number of components $\theta_j$ differing from those in $\hat{\theta}_u$ and fits the system modal parameters $\omega^2$ and $\phi$ well. This is the Bayesian Ockham razor (Beck, 2010) at work. The Bayesian procedure is effectively implementing Ockham’s razor by assigning lower probabilities to a structural model whose parameter vector $\theta$ has too large or too small differences from $\hat{\theta}_u$ obtained at the calibration stage (too few or too many $\alpha_j \to 0$), thereby suppressing the occurrence of false and missed damage detection alarms.
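
The decomposition in equation (42), a data-fit term minus a Kullback–Leibler term, can be checked numerically on a toy problem. The sketch below is only an illustration under strong simplifying assumptions: it replaces the hierarchical model with a scalar linear-Gaussian stand-in, in which a data-fit factor N(y | theta, 1/beta) plays the role of the joint prior PDF and N(theta | theta_hat, alpha) plays the role of the pseudo-likelihood, with alpha treated as a variance. The function names and the variables `y`, `theta_hat`, `beta` and `alpha` are hypothetical and do not come from the reviewed papers.

```python
import math


def log_normal_pdf(x, mean, var):
    """Log of the Gaussian PDF N(x | mean, var)."""
    return -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (x - mean) ** 2 / var


def evidence_decomposition(y, theta_hat, beta, alpha):
    """Return (log_evidence, fit_term, kl_term) for the scalar toy model.

    Toy stand-ins (assumptions, not the paper's model):
      data-fit factor    L(theta) = N(y | theta, 1/beta)
      pseudo-likelihood  q(theta) = N(theta | theta_hat, alpha)
    """
    # The posterior of theta is Gaussian (product of two Gaussians in theta).
    post_var = 1.0 / (beta + 1.0 / alpha)
    post_mean = post_var * (beta * y + theta_hat / alpha)

    # First term: posterior mean of log L(theta).
    fit_term = (-0.5 * math.log(2.0 * math.pi / beta)
                - 0.5 * beta * ((y - post_mean) ** 2 + post_var))

    # Second term: KL divergence of the posterior from q(theta).
    kl_term = 0.5 * (math.log(alpha / post_var)
                     + (post_var + (post_mean - theta_hat) ** 2) / alpha
                     - 1.0)

    # Closed-form log evidence: log of the integral of L(theta) q(theta).
    log_evidence = log_normal_pdf(y, theta_hat, 1.0 / beta + alpha)
    return log_evidence, fit_term, kl_term


if __name__ == "__main__":
    for alpha in (0.01, 1.0, 100.0):
        log_ev, fit, kl = evidence_decomposition(
            y=1.0, theta_hat=0.0, beta=4.0, alpha=alpha)
        print(f"alpha={alpha:7.2f}  log_evidence={log_ev:8.4f}  "
              f"fit={fit:8.4f}  KL={kl:7.4f}  fit-KL={fit - kl:8.4f}")
```

In this toy setting the log evidence equals the fit term minus the KL term for every value of `alpha`. A small `alpha` pins theta to `theta_hat`, so little information is extracted (small KL) at the cost of a poorer fit, while a large `alpha` improves the fit but incurs a larger KL penalty, which mirrors the trade-off described above.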
