Source: ResearchGate, net/publication/306245364; uploaded by Chanyoung Park on 20 August 2016. Authors include Nam H. Kim (University of Florida).
RESEARCH PAPER
Nomenclature

Y(x)|y  An updated model with a data set. The trend function and the corresponding uncertainty of a prior model are updated with samples.
yH  A high fidelity data set.
y_h^i  The i-th data point of a high fidelity data set.
yH(x)  The unknown true high fidelity function value at x.
ŷH(x)  An MFS predictor for the high fidelity function value at x.
YH(x)  A prior model (GP model) for predicting the high fidelity function value at x for Bayesian MFS frameworks. This model can be a linear combination of a low fidelity model and a discrepancy function model.
YH(x)|yH, yL  An updated high fidelity model with low and high fidelity data sets.
yL  A low fidelity data set.
ycL  A low fidelity data set at locations common to those of the high fidelity data points.
y_l^i  The i-th data point of a low fidelity data set.
yL(x)  The unknown true low fidelity function value at x.
ŷL(x)  An MFS predictor for the low fidelity function value at x.
ŷL(x, θ)  An MFS predictor for the low fidelity function value at x for a given calibrated parameter vector θ.
YL(x)  A prior model (GP model) for predicting the low fidelity function value at x for Bayesian MFS frameworks.
YL(x, θ)  A prior model (GP model) for predicting the low fidelity function value at x for a given calibrated parameter vector θ.
YL(x)|yL  An updated low fidelity model with a low fidelity data set.

1 Introduction

Surrogate models, also known as meta-models, have been used as cheap approximate models that can be built with several dozen samples. However, for many high fidelity simulations, the cost of obtaining enough samples to achieve reasonable accuracy is high. Multi-fidelity surrogate (MFS) models have been developed to compensate for expensive high fidelity samples with cheap low fidelity samples. Although several Gaussian process (GP) based Bayesian MFS frameworks have been introduced, the benefits of the Bayesian frameworks over simple frameworks have rarely been studied. In addition, the performances of different frameworks have rarely been compared, which is the main focus of this paper.

MFS frameworks based on a discrepancy function are built by combining a low-fidelity simulation (or surrogate) with a discrepancy surrogate, which models the difference between the low and high fidelity sample sets. Discrepancy-based MFS frameworks have been used in design optimization to alleviate computational burden. For example, Balabanov et al. (1998) used linear regression to combine coarse and fine finite element models, while Mason et al. (1998) used 2D and 3D finite element models as the low and high fidelity models, respectively, for aircraft structural optimization. The same approach was used to combine aerodynamic predictions from cheap linear theory with expensive Euler solutions for aircraft aerodynamic optimization (Knill et al. 1999). A Bayesian discrepancy-based MFS using GP was introduced by Kennedy and O'Hagan (2000). The Bayesian model allows incorporating prior information (Kennedy and O'Hagan 2000; Qian and Wu 2008). Co-Kriging (Sacks et al. 1989; Lophaven et al. 2002) provides a surrogate equivalent to the Bayesian formulation with a non-informative prior and has good computational characteristics (Forrester et al. 2007; Kuya et al. 2011; Han et al. 2012; Le Gratiet 2013). Note that discrepancy function based MFS frameworks can handle data from more than two fidelities.

Model calibration is another strategy for building an MFS by fitting a low fidelity surrogate with tuned parameters that improve agreement between the surrogate and the high fidelity sample set. A simple framework is to find parameters that minimize the discrepancy between a calibrated low fidelity surrogate and the high fidelity sample set (Zheng et al. 2013). GP-based Bayesian calibration frameworks were also introduced (Kennedy and O'Hagan 2001a; Higdon et al. 2004; Bayarri et al. 2007; McFarland et al. 2008). The Bayesian frameworks find the calibration parameters that are most statistically consistent with the high fidelity samples (McFarland et al. 2008; Prudencio and Schulz 2012). A comprehensive Bayesian MFS model that uses both calibration and discrepancy was proposed by Kennedy and O'Hagan (2001a), offering greater flexibility, although it is the most complex framework.

The objectives of this paper are: (1) to review the characteristics and differences of GP-based Bayesian and simple MFS frameworks, (2) to investigate the performance of MFS frameworks in terms of accuracy and predictability of error, and (3) to investigate the performance of the prediction sum of squares (PRESS), based on cross validation errors, as a surrogate performance estimator. The paper is organized as follows. Section 2 presents the MFS frameworks used in this paper. Section 3 describes the methodology of the investigation and the metrics. Section 4 presents numerical examples, followed by discussions in Section 5. Each section is intended to be self-explanatory, so that skipping earlier sections does not hamper reading later sections.
Remarks on multi-fidelity surrogates
2 Multi-fidelity surrogate frameworks

The first objective of this paper is to review the differences between MFS frameworks based on three commonly used approaches: (1) a model discrepancy function, (2) low fidelity model parameter calibration, and (3) a combination of the two. We considered three simple MFS frameworks and three Bayesian frameworks using these approaches. The simple frameworks refer to frameworks that build an MFS using any surrogate. For example, Balabanov et al. (1998) used a polynomial response surface MFS that is the sum of a low fidelity surrogate and a model discrepancy surrogate. The characteristics and assumptions of the considered frameworks, and their differences, are described in the following subsections.

All Bayesian MFS frameworks are based on Gaussian processes (GP) for modeling their prediction uncertainties. In contrast, the simple frameworks construct an MFS with regular single fidelity surrogates. To minimize the effect of the choice of surrogate on the simple frameworks, we use Kriging surrogates, which are also based on GP. Furthermore, the combination of Kriging and the simple discrepancy framework is a special case of the Bayesian discrepancy framework.

The Bayesian variant of Kriging shares many features of the uncertainty modeling and fitting process with the Bayesian MFS frameworks. We therefore start by describing the Bayesian Kriging surrogate to help in understanding the Bayesian MFS frameworks.

2.1 Bayesian Kriging surrogate

The Kriging surrogate is a well-known generalized linear regression model based on GP (Sacks et al. 1989; Martin and Simpson 2005; Lophaven et al. 2002; Ryu et al. 2002). Not surprisingly, there is a parallel effort to derive the Kriging surrogate model using Bayesian techniques. The mathematical form of the Bayesian Kriging surrogate is identical to that of the generalized linear regression model for a non-informative prior (Lophaven et al. 2002; O'Hagan 1992). The Bayesian Kriging surrogate provides a prediction with errors in the form of a t distribution (O'Hagan 1992); the mean and standard deviation of the t distribution represent the Kriging predictor and the prediction uncertainty, respectively. The Kriging predictor is expressed as

ŷ(x) = E[Y(x) | y]    (1)

where Y(x) is a prior GP model and Y(x)|y is the updated model with sample set y.

Figure 1a illustrates a prior GP model with a quadratic trend function ŷ(x), which is the mean function, and two standard deviation intervals (blue-colored region). The prior model characterizes the trend and prediction uncertainty, where the hyper parameters are estimated for the best consistency of the GP model with the samples. Figure 1b shows a Kriging prediction and two standard deviation intervals obtained by updating the GP model Y(x) in Fig. 1a with sample set y.

A GP model typically includes a linear combination of shape functions ξ(x)ᵀb as a trend function and a normal random variable Z(x) ~ N(0, σ²), which models the prediction uncertainty about the trend function, as

Y(x) = ξ(x)ᵀb + Z(x)    (2)

where ξ(x) is a given vector of shape functions (e.g. polynomials) and b is a coefficient vector. The prediction uncertainties at two points x and x′ are assumed to be correlated via a covariance function. A common covariance function is

cov(Z(x), Z(x′)) = σ² exp(−(x − x′)ᵀ Ω (x − x′))    (3)

where Ω = diag(λ1, …, λn) is a diagonal matrix for n-dimensional x. Its diagonal elements λ = {λ1, λ2, …, λn}ᵀ are hyper parameters that define roughness by regulating the range of correlation, and σ is the process standard deviation determining the magnitude of the error.

Fitting a Kriging surrogate ŷ(x) has two steps: (1) a prior GP model is fitted to the samples by estimating the hyper parameters {b, σ, λ} using maximum a posteriori estimation, which takes the mode of the posterior distribution as the best hyper parameter estimate, and (2) the prior GP model is updated with the samples using the formula for conditional distributions of a multivariate normal distribution. For details about these processes, readers are referred to O'Hagan (1992) and Rasmussen (2004).

2.2 Discrepancy function based frameworks

2.2.1 Simple discrepancy MFS framework

This framework provides a convenient way of fitting an MFS with any surrogate model. The MFS employs two surrogates, ŷL(x) and δ̂(x), which approximate the low fidelity function and the discrepancy, respectively, as

ŷH(x) = ρŷL(x) + δ̂(x)    (4)

where ρ is a regression scalar minimizing the discrepancy between the scaled low fidelity surrogate ρŷL(x) and the high fidelity sample set at the common sampling points, i.e., the locations where high and low fidelity samples coincide. Note that modeling ρ as a function of the inputs is an active research area (Fischer and Grandhi 2014, 2015).
Park et al
Fig. 1 a Estimating the trend function and hyper parameters for a prior GP model using samples. b Updating the prior GP model (Kriging surrogate) using samples
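The trend-plus-GP structure of (2) and the Gaussian covariance of (3) can be illustrated numerically. A minimal sketch, where the quadratic trend coefficients, sample locations, and hyper parameter values are arbitrary choices of ours, not values from the paper:

```python
import numpy as np

def gaussian_cov(X1, X2, sigma2, lam):
    """Covariance of (3): sigma^2 * exp(-(x - x')^T Omega (x - x')),
    with Omega = diag(lam)."""
    d = X1[:, None, :] - X2[None, :, :]               # pairwise differences
    return sigma2 * np.exp(-np.einsum('ijk,k,ijk->ij', d, lam, d))

def prior_mean(X, b):
    """Trend of (2) with quadratic shape functions xi(x) = [1, x, x^2]."""
    xi = np.hstack([np.ones((len(X), 1)), X, X**2])
    return xi @ b

X = np.linspace(0.0, 1.0, 5).reshape(-1, 1)           # 5 points in [0, 1]
K = gaussian_cov(X, X, sigma2=4.0, lam=np.array([10.0]))
m = prior_mean(X, b=np.array([1.0, -2.0, 3.0]))
```

The covariance matrix K is symmetric with σ² on its diagonal, and the correlation decays with distance at a rate set by λ.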
2.2.2 Simple discrepancy MFS framework with Kriging surrogates

The combination of the simple framework and Kriging surrogates is a special case of the GP based Bayesian discrepancy framework. This is because Kriging surrogates can be interpreted as single fidelity surrogates constructed with a GP based Bayesian framework, as shown in Section 2.1. Since the low fidelity Kriging surrogate is constructed with the low fidelity sample set, and the discrepancy Kriging surrogate is constructed with the discrepancy between the low and high fidelity sample sets at the common points, using (1), (4) can be rewritten as

ŷH(x) = ρ E[YL(x) | yL] + E[Δ(x) | δ]    (5)

where yL is the low fidelity data set; δ = yH − ρycL is the discrepancy sample set; ycL is the low fidelity data set at locations common to those of the high fidelity data points; yH is the high fidelity data set; and ρ is a regression scalar. We assume that the high fidelity model is evaluated at a subset of the low fidelity evaluation points. In the example of Fig. 2a, the discrepancy sample set has four elements, obtained as the differences between the high fidelity samples and the second, fourth, fifth and seventh low fidelity samples, respectively. YL(x) and Δ(x) are GP models for the low fidelity and discrepancy functions, and YL(x)|yL and Δ(x)|δ are the corresponding GP models updated based on the sample sets. Note that the two GP models are written with different symbols, but their functional forms may be the same; for example, the two GP models can have the same linear polynomial trend functions and Gaussian correlation functions.

Figure 2a shows high and low fidelity samples with the corresponding functions. Figure 2b shows the prediction and the two standard deviation uncertainty based on the samples. The variance of the total prediction uncertainty from the two surrogates is the sum of the two variances; that is, the prediction uncertainty is the square root of the sum of the squared standard errors of the low fidelity and discrepancy Kriging surrogates. The root-mean-square error (RMSE) was calculated using 1000 equally spaced points over [0, 1]. Note that the high fidelity samples are a subset of the low fidelity sample set, from which the discrepancy samples are obtained as the difference between them.

In the simple framework, the MFS is constructed in three steps: (1) the low fidelity Kriging surrogate is constructed with the low fidelity sample set, (2) the regression scalar ρ is determined by minimizing the errors between the scaled low fidelity surrogate and the high fidelity sample set, and (3) a Kriging surrogate of the discrepancy function is
Fig. 2 a Low and high fidelity functions and samples. b MFS fit (RMSE = 0.89 for ρ = 1.8)
fitted using the difference between the low and high fidelity sample sets at the common sampling points. Note that the simple framework allows one to use any surrogate instead of Kriging, and even to use different surrogates for the low fidelity fit and the discrepancy fit.

2.2.3 Bayesian discrepancy MFS framework

The previously presented combination of the simple framework and Kriging surrogates can be generalized using the GP based Bayesian approach introduced by Kennedy and O'Hagan (2000) and Qian and Wu (2008). Note that this Bayesian framework provides predictions equivalent to the co-Kriging surrogate (Forrester et al. 2007; Le Gratiet 2013) with no prior information. This section emphasizes the difference between the GP based Bayesian discrepancy framework and the combination of the simple framework and Kriging.

The high fidelity GP model is defined as a linear combination of two GP models YL(x) and Δ(x) as

YH(x) = ρYL(x) + Δ(x)    (6)

where ρ is a regression scalar. The Bayesian framework updates the GP model with the high and low fidelity sample sets. The mean of the updated GP model is used as the predictor of the high fidelity response:

ŷH(x) = E[YH(x) | yH, yL]    (7)

[…] effect on predicting the response of the low fidelity function. When the GP models of the Bayesian framework and those of the Kriging surrogates of the simple framework are the same, the two frameworks make identical predictions about the low fidelity function.

The fitting process of the Bayesian discrepancy MFS takes two steps: (1) fitting the low fidelity GP model to the low fidelity sample set by estimating the low fidelity hyper parameters and updating the GP model, and (2) fitting the discrepancy function GP model to the samples at the common points. With the sampling restriction, the MFS is the sum of these two updated models as in (8), which is equivalent to the updated model of (6).

Note that Bayesian frameworks, including this framework, can incorporate a priori information to reduce the prediction uncertainty of an MFS through prior distributions on the hyper parameters. However, unlike physical parameters, for which prior distributions can be obtained from experience, experience is rarely useful for selecting hyper parameters. Indeed, co-Kriging, which is an MFS framework equivalent to the Bayesian discrepancy framework, has become popular even though it cannot incorporate prior distributions on the hyper parameters. There is also an issue of objectivity of prior distributions in Bayesian estimation. In this paper, we use unbounded constant non-informative priors on the hyper parameters for the Bayesian discrepancy MFS framework and the other Bayesian frameworks. For details about the hyper parameter estimation and updating processes, readers are referred to Kennedy and O'Hagan (2000).
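The three-step construction of the simple discrepancy framework (Section 2.2.2) can be sketched end to end. In this minimal sketch, plain polynomial least-squares fits stand in for the Kriging surrogates (the simple framework admits any surrogate); the example functions are those of Fig. 3, and the sample locations are our own choice:

```python
import numpy as np

y_hi = lambda x: (6 * x - 2) ** 2 * np.sin(12 * x - 4)   # high fidelity (Fig. 3)
y_lo = lambda x: 0.5 * y_hi(x) + 10 * (x - 0.5) + 5      # low fidelity (Fig. 3)

def poly_surrogate(x, y, deg):
    c = np.polyfit(x, y, deg)
    return lambda xx: np.polyval(c, xx)

x_lo = np.linspace(0.0, 1.0, 11)      # low fidelity DOE
x_hi = x_lo[[1, 4, 6, 9]]             # high fidelity DOE (subset of x_lo)

# Step 1: fit the low fidelity surrogate to the low fidelity samples.
yL_hat = poly_surrogate(x_lo, y_lo(x_lo), deg=3)

# Step 2: regression scalar rho minimizing sum of (yH - rho*yL_hat)^2 at
# the common points (closed-form least squares solution).
l, h = yL_hat(x_hi), y_hi(x_hi)
rho = float(l @ h / (l @ l))

# Step 3: fit a surrogate to the discrepancy samples at the common points.
delta_hat = poly_surrogate(x_hi, h - rho * l, deg=1)

yH_hat = lambda x: rho * yL_hat(x) + delta_hat(x)        # MFS of (4)
```

Because the discrepancy fit is itself a least-squares projection, adding it can never increase the sum of squared errors at the common points relative to the scaled low fidelity surrogate alone.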
Fig. 3 The low fidelity function yL(x) = 0.5yH(x) + 10(x − 0.5) + 5 and the high fidelity function yH(x) = (6x − 2)² sin(12x − 4)

[…] the trend function for the discrepancy yH − ρyL. With this formulation, the simple framework would find ρ = 2, which leads to a linear discrepancy function that can be fitted exactly with the linear trend function of the discrepancy Kriging surrogate. With this approach, a surrogate equivalent to that of the Bayesian framework can be built with the simple framework.
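The claim that ρ = 2 turns the discrepancy of the Fig. 3 example into an exactly linear function, which a linear trend then fits perfectly, can be checked numerically. A minimal sketch, where the high fidelity sample locations are our own choice and a plain linear least-squares fit stands in for the discrepancy Kriging surrogate:

```python
import numpy as np

yH = lambda x: (6 * x - 2) ** 2 * np.sin(12 * x - 4)     # Fig. 3 high fidelity
yL = lambda x: 0.5 * yH(x) + 10 * (x - 0.5) + 5          # Fig. 3 low fidelity

xH = np.array([0.1, 0.4, 0.7, 0.9])   # assumed high fidelity sample locations
rho = 2.0

# With rho = 2 the discrepancy yH - rho*yL = -20x is exactly linear,
# so a linear fit recovers it with no error.
delta = yH(xH) - rho * yL(xH)
c = np.polyfit(xH, delta, 1)
yH_hat = lambda x: rho * yL(x) + np.polyval(c, x)        # exact MFS of (4)
```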
Fig. 4 An example of a substantial difference between the Bayesian and simple frameworks: the Bayesian discrepancy framework significantly outperforms the simple framework by finding ρ that makes the discrepancy more suitable to be fitted (as shown in (e)) rather than initially finding ρ by minimizing the discrepancy between the high fidelity samples and the low fidelity function (as shown in (f)). a ŷH(x) and 2σ based on the Bayesian framework (RMSE = 0.3 for ρ = 2.0). b ŷH(x) and 2σ based on the simple framework with Kriging surrogate (RMSE = 4.2 for ρ = 0.5). c ŷL(x) and 2σ based on the Bayesian framework. d ŷL(x) and 2σ based on the simple framework with Kriging surrogate. e δ̂(x) and 2σ based on the Bayesian framework. f δ̂(x) and 2σ based on the simple framework with Kriging surrogate
2.3.2 Bayesian calibration MFS framework

The Bayesian MFS framework makes inference about yH(x) based on both the low and high fidelity sample sets (McFarland et al. 2008; Prudencio and Schulz 2012). A prediction is made by taking the mean of the updated high fidelity GP model, as in (7). The high fidelity GP model is defined using the low fidelity model and a regression scalar ρ.

As in the previous Bayesian discrepancy framework, the high fidelity GP model needs to be fitted to the samples for best consistency. To include the effect of calibration in the model, the low fidelity GP model is a function of x and q, as in the expression of the Kriging surrogate for the previous simple framework. The low fidelity model is the sum of a trend function and a normal random variable ZL(x, q) ~ N(0, σL²). The correlation function of ZL(x, q), with n dimensions of x and m dimensions of q, is defined as the product of two separate correlation functions:

cov(ZL(x, q), ZL(x′, q′)) = σL² exp(−(x − x′)ᵀ Ωx (x − x′)) exp(−(q − q′)ᵀ Ωq (q − q′))

where Ωx = diag(λx,1, …, λx,n) and Ωq = diag(λq,1, …, λq,m) are diagonal matrices with n and m elements, respectively.

For details about the hyper parameter estimation process, readers are referred to McFarland et al. (2008) and Prudencio and Schulz (2012).
2.4 Comprehensive frameworks

The most flexible and widely used Bayesian comprehensive framework is the Bayesian calibration framework with a discrepancy function model, also known as the Kennedy and O'Hagan framework (Kennedy and O'Hagan 2001a, b). The framework uses a GP model with a Bayesian approach by considering the cross-interaction between the low fidelity and discrepancy function models. As a counterpart, we also describe a simple framework that builds an MFS using Kriging surrogates while ignoring the cross-interaction. To distinguish the two, we refer to them as the simple comprehensive framework and the Bayesian comprehensive framework.

2.4.1 Simple comprehensive MFS framework

This framework applies the simple calibration framework first and fits the remaining difference with the simple discrepancy function framework with Kriging surrogates as

ŷH(x) = ρŷL(x, θ) + δ̂(x)    (13)

where ρ and ŷL(x, θ) are obtained as in the simple calibration framework and δ̂(x) is constructed with a Kriging surrogate using the discrepancy between the high fidelity sample set and ρŷL(x, θ).

2.4.2 Bayesian comprehensive MFS framework

The prediction of the Bayesian MFS is the mean of the posterior distribution obtained by updating the high fidelity model as in (7). The high fidelity model yH(x) is composed of the discrepancy function model and the low fidelity model; these two models are assumed to be independent. The high fidelity GP model is expressed as

YH(x) = ρYL(x, θ) + Δ(x)    (14)

This comprehensive framework provides great flexibility, but it has the largest number of hyper parameters and model parameters. The weakness of this framework is therefore having to estimate so many parameters simultaneously. Since it can be impractical to estimate all parameters simultaneously, it is possible to estimate them in groups (Kennedy and O'Hagan 2001a; Bayarri et al. 2007). For details about the estimation processes and the posterior distribution, readers are referred to […]

[…] presented in the discrepancy function based framework section. The low and high fidelity functions defined in Fig. 3 were used in this section. The constants of the low fidelity function were replaced with calibration parameters, and the low fidelity function is expressed as yL(x, θ1, θ2) = 0.5yH(x) + θ1(x − 0.5) + θ2. Thus ρ = 2, θ1 = 0 and θ2 = 0 yield the exact high fidelity function for calibration based MFS frameworks with a discrepancy function. Bounds of [0, 10], [−5, 5] and [1.5, 2.5] were used for θ1, θ2 and ρ, respectively.

A constant trend function was used for the low fidelity GP models and the Kriging surrogates of the simple frameworks, and a linear polynomial trend function was used for the discrepancy function GP models, with the Gaussian correlation function.

We generated 30 low fidelity samples in {x, θ1, θ2} space using Latin hypercube sampling (LHS), as shown in Fig. 5. Four high fidelity samples were independently generated by LHS, and then the low fidelity sample evaluation points were updated using nearest neighbor sampling. This process was repeated 100 times to generate 100 different low and high fidelity sets, and the MFS frameworks were analyzed for each set of points. The differences between the simple and Bayesian frameworks were minor for most of the 100 sets, but for some sets they were substantial. Here, the case showing the most notable difference is presented, with a high fidelity sample set at xH = {0.06, 0.35, 0.69, 0.9}ᵀ. Figure 6 compares the performance of the four frameworks with the identified values of the calibration parameters.

Figure 6a shows the simple calibration MFS. Even though the identified model parameters (θs) are very close to the exact values, the fitted MFS is not correspondingly close to the true high fidelity function, due to the finite number of samples used to construct the low fidelity surrogate. The 2σ shaded region is twice the standard error of the predicted high fidelity response, obtained by multiplying ρ by the standard error of the low fidelity Kriging surrogate based on (9). Note, however, that the uncertainties in the calibrated parameters are not taken into account in this shading.

A notable difference between the simple and Bayesian calibration frameworks is that the Bayesian frameworks make the prediction pass exactly through the high fidelity samples
Fig. 6 c Simple comprehensive MFS fit (RMSE = 0.37 for θ1 = 0.0, θ2 = 0.1, ρ = 2.0). d Bayesian comprehensive MFS fit (RMSE = 0.21 for θ1 = 5.4, θ2 = 4.5, ρ = 2.0)
while the simple frameworks do not. In Fig. 6a and c, the predictions do not pass through the high fidelity samples exactly, though the difference is not visible in the figure. These characteristics are reflected in the prediction variances: the prediction variances of the Bayesian frameworks are zero at the high fidelity samples, while those of the simple frameworks are nonzero.

The Bayesian comprehensive framework has a distinct characteristic in tuning parameters compared to the other frameworks. In terms of RMSE, the Bayesian comprehensive framework significantly outperforms the other frameworks. However, the Bayesian comprehensive framework produced parameters very different from the exact ones, while the other frameworks came close to the exact values. This is because of the presence of the discrepancy function model in the framework. When ρ = 2, the discrepancy function becomes a linear polynomial. Since the Bayesian comprehensive framework can perfectly fit the linear discrepancy, it tunes parameters to give the best overall fit regardless of whether the tuned parameters maximize the agreement of the low fidelity model, while the other frameworks tune parameters to maximize that agreement.

2.6 Design of experiments for MFS surrogate frameworks

For simple discrepancy function based frameworks, the sampling condition that the low fidelity sample set is a superset of the high fidelity sample set is commonly imposed so that the model discrepancy can be observed at the common points (Balabanov et al. 1998; Mason et al. 1998). The same assumption is favored by the Bayesian framework (Kennedy and O'Hagan 2000).

The nearest neighbor sampling takes two steps. First, it generates independent initial low and high fidelity sampling points using LHS. In the second step, the low fidelity point nearest to each high fidelity sampling point is moved to that point (Le Gratiet 2013). Figure 7 shows an example of 40 low fidelity and 20 high fidelity sampling points based on the nearest neighbor design.

Another sampling strategy for achieving the sampling condition is nested design sampling, which was initially developed as a space filling technique for adding samples to an existing sample set in a way that ensures optimal coverage of the union of the two sample sets (Jin et al. 2005). This strategy can be applied to generate low and high fidelity sampling points. Sampling points for fitting a low fidelity surrogate are generated using LHS, and additional sampling points are generated by maximizing the minimum distance between all existing and new points. The union of the existing sampling point set and the additional sampling points is the low fidelity sampling point set, and the additional sampling point set is the high fidelity sample set.

These two sampling strategies provide low and high fidelity sample sets that satisfy the sampling condition of the high-fidelity sampling points being a subset of the low-fidelity sampling points. In this paper, the nearest neighbor design is used for the discrepancy function based frameworks in the numerical examples.

For calibration frameworks, the low-fidelity samples include both input variables and calibration parameters, while the high-fidelity samples involve only the input variables. We generated low fidelity sampling points using LHS in the
Fig. 7 a 40 low fidelity (blue and green crosses) and 10 high fidelity (red circles) sampling points; the green crosses (nearest neighbors of the high fidelity sampling points) are replaced with the high fidelity sampling points. b Updated 40 low fidelity sampling points (blue crosses) and 10 high fidelity sampling points (red circles) obtained using the nearest neighbor sampling technique
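The two-step nearest neighbor design can be sketched with SciPy's Latin hypercube sampler. A minimal sketch, where the sample counts match Fig. 7 and the fixed seeds and the bookkeeping of already-replaced points are our own choices:

```python
import numpy as np
from scipy.stats import qmc

def nearest_neighbor_design(n_low, n_high, dim, seed=0):
    """Step 1: generate independent LHS designs for the low and high
    fidelity sets. Step 2: replace the low fidelity point nearest to each
    high fidelity point with that high fidelity point, so the high
    fidelity set becomes a subset of the low fidelity set."""
    x_low = qmc.LatinHypercube(d=dim, seed=seed).random(n_low)
    x_high = qmc.LatinHypercube(d=dim, seed=seed + 1).random(n_high)
    used = []                                  # indices already replaced
    for xh in x_high:
        d2 = np.sum((x_low - xh) ** 2, axis=1)
        d2[used] = np.inf                      # keep earlier replacements intact
        i = int(np.argmin(d2))
        x_low[i] = xh
        used.append(i)
    return x_low, x_high

x_low, x_high = nearest_neighbor_design(n_low=40, n_high=10, dim=2)
```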
dimensions of the input variables and the calibration parameters. Then, high fidelity sampling points were generated in the dimensions of the input variables using LHS. The coordinates of the low fidelity sampling points in the dimensions of the input variables were adjusted using the high fidelity sampling points and the nearest neighbor sampling technique. Figure 8 shows the low and high fidelity sampling points for 2 input variables and 1 model parameter for a calibration based MFS framework. Figure 8a shows 40 low fidelity samples obtained from LHS, and Fig. 8b shows the 40 samples projected on the input dimensions and 20 high fidelity sampling points obtained by the nearest neighbor sampling.

3 Performance measures for MFS frameworks

In this section, we describe performance measures for comparing MFS frameworks. Since the performance of surrogates varies for different designs of experiments (DOE), we collect statistics of the frameworks for 100 different DOEs randomly generated using the nearest neighbor design. We evaluate the accuracy of an MFS model for a given cost, and the cost for a given accuracy, as well as the variability in these measures.

Since the cross validation error has been considered a useful performance measure for surrogates (Sanchez et al. 2008; Acar and Rais-Rohani 2009; Viana et al. 2009), we investigated whether it can also be employed for MFSs to rank the frameworks. We ranked the frameworks for a given DOE using their cross validation errors, and compared the ranks to the actual ranks based on their RMSEs, which is our reference accuracy metric.

3.1 Measures of accuracy

MFSs are fitted to function values at n sample points. We measure the accuracy of an MFS using the root mean square error (RMSE) integrated over the sampling domain, computed by Monte Carlo integration with n_test test points as

RMSE = sqrt( (1/n_test) Σ_{i=1}^{n_test} [ y_h,test^i − ŷH(x_h,test^i) ]² )    (15)

where x_h,test^i and y_h,test^i are the sampling point vector and the high fidelity function value of the i-th test point. Note that the RMSE can only be calculated when the true function is available.
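Equation (15) is a plain Monte Carlo estimate. A minimal sketch, where the test function, the predictor's role, and the test-point count are arbitrary stand-ins:

```python
import numpy as np

def rmse(y_true, y_pred):
    """RMSE of (15): square root of the mean squared prediction error
    over the test points."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

rng = np.random.default_rng(0)
x_test = rng.random(1000)                         # n_test Monte Carlo points
y_true = (6 * x_test - 2) ** 2 * np.sin(12 * x_test - 4)
```

In practice y_pred would come from the fitted MFS predictor ŷH evaluated at the same test points.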
3.2 Cross validation errors into k-folds using a “maximin” criterion (maximization of
the minimum inter-distance) (Viana et al. 2009), which is
Among all available frameworks, how to select the best used in the following numerical examples. Note that we
framework or how to filter out bad frameworks is an im- use unbounded constant non-informative priors for the
portant question. Since the cross validation error performs Bayesian frameworks. Therefore the influence of error in
well for selecting surrogates (Viana et al. 2009), it is also a priori information is excluded in the constructed MFSs
considered here for the MFS frameworks. For single fi- and the same for PRESS.
delity surrogates, a leave-one-out cross validation error is
the error measured at a sample point using the surrogate
constructed with all samples except for the point. By re-
peating this process for every sample, we can estimate n 4 Numerical examples
cross validation errors, and the RMSE of a surrogate can
be estimated by calculating the global cross-validation In this section, we statistically evaluate the performances
error measure called PRESS using cross validation errors of frameworks based on randomly generated DOEs. The
(Viana et al. 2009). Hartmann 6 function and the borehole function were used
Since the discrepancy MFS frameworks use samples for the statistical study of frameworks. The statistical
with the sampling restriction that a high fidelity sampling study of the previously presented 1D example is also pre-
point set is a subset of a low fidelity sampling set, a cross sented in Appendix A. Hartmann 6 function is an algebra-
validation error of a discrepancy MFS is obtained by leav- ic function with 6 input variables. The Borehole function
ing out both low and high fidelity samples at a common is a physical function with 3 input variables initially de-
point (Le Gratiet 2013). For the calibration based frame- veloped to calculate the flow of water through a borehole
works, since the low fidelity sample set has dimensions drilled from the ground surface through two aquifers.
for input variables and calibration parameters while the We refer the frameworks and their variants with labels,
high fidelity sample set has dimensions for only input which are presented in Table 1. First letters “S” and “B”
variables. For example, Fig. 8 shows a low sample set indicate that a framework uses simple framework using
for input variables x1 and x2, and calibration parameter q ready-made surrogates or Bayesian framework, respec-
and a high fidelity sample set for input variables. We tively. Then correction approaches are indicated by “C”
leave out only high fidelity samples for obtaining cross and “D” for calibration and discrepancy function, respec-
validation errors. Therefore, the number of cross valida- tively. For the comprehensive frameworks, which use
tion errors is same as that of high fidelity samples. The both approaches, we use the letters together as “CD”. To
RMSE of an MFS framework is estimated by calculating test the effect of including a regression scalar ρ in a
PRESS as framework, we compared MFSs constructed with and
sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi without ρ. We remove the scaling effect of using a regres-
1 XH 2
n
sion scalar by applying ρ = 1 to a framework. The last
PRESS ¼ e ð16Þ
nH i¼1 i letter “R” indicates that ρ is included in a framework.
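The leave-one-out PRESS computation of (16) is easy to prototype. Below is a minimal pure-Python sketch in which a least-squares line stands in for the Kriging surrogates used in the paper; `fit_linear` and `press` are illustrative names of ours, not code from the authors.

```python
import math

def fit_linear(xs, ys):
    """Least-squares line a + b*x, a stand-in for a real surrogate fit."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return (lambda x, a=my - b * mx, b=b: a + b * x)

def press(xs, ys, fit=fit_linear):
    """PRESS of eq. (16): leave one sample out, refit, record the error e_i,
    then return sqrt(mean(e_i^2)) over all n samples."""
    errors = []
    for i in range(len(xs)):
        surrogate = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors.append(surrogate(xs[i]) - ys[i])
    return math.sqrt(sum(e * e for e in errors) / len(errors))
```

For data the surrogate can represent exactly (here, linear data), the cross validation errors vanish and PRESS is zero; any behavior the surrogate cannot capture shows up directly in PRESS.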
4.1 Hartmann 6 function

MFS frameworks are often employed to reduce computational cost and/or to improve accuracy. In this section, we examine MFS frameworks with three factors that affect performance: 1) total computational budget, 2) high-to-low sample evaluation cost ratio, and 3) high-to-low sample size ratio.

Table 2 shows combinations of low and high fidelity samples for a given total budget and cost ratio. Since the key question in MFS is whether low fidelity simulations can reduce the cost of high fidelity simulations, the total computational budget is expressed in terms of the number of high fidelity samples. For example, 56H denotes a computational budget for evaluating 56 high fidelity samples. The sample cost ratio tells how many low fidelity samples can be evaluated for the cost of a single high fidelity sample. For example, a cost ratio of 4 means that 4 low fidelity samples can be evaluated with the budget for evaluating a single high fidelity sample. For that ratio, a total budget of 56H can be used for either 56 high fidelity samples, or 224 low fidelity samples, or mixes of the two such as the 36 high fidelity and 80 low fidelity samples shown in Table 2. These combinations are expressed with the numbers of high (nH) and low (nL) fidelity samples, such as 36/80. We selected mixes ranging from spending most of the budget on high fidelity simulations (36/80) to spending most of it on low fidelity simulations (6/200).

Table 2 Cases of sample cost and size ratio combinations for two total computational budgets (Hartmann 6 function example)

Total budget   Sample cost ratio   Sample size ratio nH/nL
56H            4                   36/80, 26/120, 16/160, 6/200
               10                  49/70, 46/100, 42/140, 35/210, 28/280, 21/350, 14/420, 7/490
               30                  48/240, 46/300, 44/360, 42/420, 40/480, 38/540, 28/840, 18/1140
28H            4                   18/40, 13/60, 8/80, 3/100
               10                  22/60, 19/90, 16/120, 13/150, 10/180, 7/210, 4/240
               30                  24/120, 22/180, 20/240, 18/300, 16/360, 14/420, 10/540, 6/660, 4/720

We use the Hartmann 6 function over [0.1, 1] for all dimensions as the high fidelity function and an approximated function as the low fidelity function. As for the previous example, for each case we generated 100 different DOEs using nearest neighbor sampling and LHS. RMSE was calculated based on 10,000 test points, which gives an accurate RMSE estimate. Among the 100 RMSEs, the median RMSE was chosen as a representative value: the median RMSE is the RMSE greater than the lower 50 % of the RMSE population, and the median is less sensitive to extreme values than the mean. For comparison's sake, the same test points were used to calculate the RMSEs for different DOEs.

The high fidelity function of this example, the Hartmann 6 function, is

$$f_H(\mathbf{x}) = -\frac{1}{1.94}\left(2.58 + \sum_{i=1}^{4} \alpha_i \exp\left(-\sum_{j=1}^{6} A_{ij}\left(x_j - P_{ij}\right)^2\right)\right) \qquad (17)$$

where $\alpha = \{1 \;\; 1.2 \;\; 3 \;\; 3.2\}^T$ are model parameters, and the following A and P matrices are constant:

$$A = \begin{pmatrix} 10 & 3 & 17 & 3.5 & 1.7 & 8 \\ 0.05 & 10 & 17 & 0.1 & 8 & 14 \\ 3 & 3.5 & 1.7 & 10 & 17 & 8 \\ 17 & 8 & 0.05 & 10 & 0.1 & 14 \end{pmatrix} \quad\text{and}\quad P = 10^{-4}\begin{pmatrix} 1312 & 1696 & 5569 & 124 & 8283 & 5886 \\ 2329 & 4135 & 8307 & 3736 & 1004 & 9991 \\ 2348 & 1451 & 3522 & 2883 & 3047 & 6650 \\ 4047 & 8828 & 8732 & 5743 & 1091 & 381 \end{pmatrix} \qquad (18)$$

The low fidelity function is

$$f_L(\mathbf{x}) = -\frac{1}{1.94}\left(2.58 + \sum_{i=1}^{4} \alpha_i' \, f_{\exp}(\nu_i)\right) \quad\text{with}\quad \nu_i = -\sum_{j=1}^{6} A_{ij}\left(x_j - P_{ij}\right)^2 \qquad (19)$$

where $\alpha' = \{0.5 \;\; 0.5 \;\; 2.0 \;\; 4.0\}^T$ and $f_{\exp}(x)$ is an approximation of the exponential function, expressed as

$$f_{\exp}(x) = \left(\exp\left(-\frac{4}{9}\right) + \exp\left(-\frac{4}{9}\right)\frac{x+4}{9}\right)^9 \qquad (20)$$

Note that the total function variation of the Hartmann 6 function is 0.33 and the RMSE of the low fidelity function with respect to the high fidelity function is 0.11. That means the best possible RMSE of a surrogate based solely on low fidelity samples is 0.11. The RMSE was obtained using 10,000 test points.

Bounds of [0.5, 1.5] were used for the regression scalar in SDR, BDR, SCR, BCR, SCDR and BCDR. For the frameworks using calibration, α′3 and α′4 were selected as the model parameters to be calibrated, with bounds of [1, 6] and [2, 7], respectively. Note that we use 0.5 for both α′1 and α′2. We calibrate only α′3 and α′4 to simulate a common situation in which we cannot afford to calibrate all parameters.
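Equations (17)–(20) are straightforward to implement. The sketch below (function names are ours) codes both fidelities; note that P is stored with the 10^-4 factor already applied, and that f_exp is exact at the expansion point x = −4, since f_exp(−4) = exp(−4/9)^9 = exp(−4).

```python
import math

ALPHA   = [1.0, 1.2, 3.0, 3.2]   # alpha of eq. (17)
ALPHA_P = [0.5, 0.5, 2.0, 4.0]   # alpha' of eq. (19)
A = [[10, 3, 17, 3.5, 1.7, 8],
     [0.05, 10, 17, 0.1, 8, 14],
     [3, 3.5, 1.7, 10, 17, 8],
     [17, 8, 0.05, 10, 0.1, 14]]
# P of eq. (18) with the 1e-4 factor applied
P = [[0.1312, 0.1696, 0.5569, 0.0124, 0.8283, 0.5886],
     [0.2329, 0.4135, 0.8307, 0.3736, 0.1004, 0.9991],
     [0.2348, 0.1451, 0.3522, 0.2883, 0.3047, 0.6650],
     [0.4047, 0.8828, 0.8732, 0.5743, 0.1091, 0.0381]]

def nu(i, x):
    """Inner exponent shared by eqs. (17) and (19)."""
    return -sum(A[i][j] * (x[j] - P[i][j]) ** 2 for j in range(6))

def f_exp(v):
    """Approximate exponential of eq. (20)."""
    c = math.exp(-4.0 / 9.0)
    return (c + c * (v + 4.0) / 9.0) ** 9

def f_high(x):
    """High fidelity Hartmann 6 function, eq. (17)."""
    return -(2.58 + sum(a * math.exp(nu(i, x))
                        for i, a in enumerate(ALPHA))) / 1.94

def f_low(x):
    """Low fidelity variant, eq. (19): perturbed alpha' and approximate exp."""
    return -(2.58 + sum(a * f_exp(nu(i, x))
                        for i, a in enumerate(ALPHA_P))) / 1.94
```

Evaluating both fidelities on the same test points and averaging the squared differences reproduces the kind of RMSE comparison quoted in the text.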
Remarks on multi-fidelity surrogates
Fig. 9 Median RMSEs for different sample size ratios and cost ratio of 30. a Best frameworks for 56H total budget. b Best frameworks for 28H total budget. c Discrepancy based frameworks for 56H total budget. d Discrepancy based frameworks for 28H total budget. e Frameworks using calibration for 56H total budget. f Frameworks using calibration for 28H total budget. (Each panel plots median RMSE against sample size ratio nH/nL; legends include the single fidelity surrogates L and H and the frameworks SD, SDR, BD, BDR, SCR, SCDR, BC, BCR and BCDR.)
Figure 9 shows the effect of the sample size ratio on the median RMSE computed with (15) for total computational budgets of 56H and 28H with a cost ratio of 30. A similar trend is observed for the cost ratio of 10, which is shown in Fig. 13 in Appendix B. Figure 9a and b compare the median RMSE of the best frameworks, selected from the discrepancy function based frameworks and the frameworks using calibration, to that of the single fidelity surrogates. The red and black dashed lines represent the median RMSEs of the single high fidelity and low fidelity Kriging surrogates, respectively. These median RMSEs were obtained from Kriging surrogates constructed with 100 randomly generated DOEs using LHS. For the single fidelity surrogates, the total budget was spent on samples of one fidelity, so there is no effect of sample size ratio. With a cost ratio of 30, we can afford 56 × 30 = 1680 low fidelity simulations, which give a median RMSE close to the true RMSE of 0.11 for the exact low fidelity function. This means that for such a high cost ratio we can expect better accuracy from low fidelity surrogates than from high fidelity surrogates. Figure 9c and d present the median RMSE of the discrepancy function based frameworks, and Fig. 9e and f present that of the frameworks using calibration.

The overall observations for the total budgets of 56H and 28H are similar. For most sample size ratios, all the MFS frameworks were better in terms of median RMSE than the Kriging surrogates using only low or only high fidelity samples. The BDR framework mostly performed best for sample size ratios above about 0.1; the frameworks using calibration outperformed it for smaller ratios, as the performance of BDR rapidly decreased when the sample size ratio
Park et al
decreased. Each framework had an optimal sample size ratio. For this example, the optimal sample size ratio of BDR was around 0.1, and those of the frameworks using calibration were much lower. Note that, for a sample size ratio of 0, the median RMSE of each framework converges to that of the low fidelity surrogates. Note also that, for a cost ratio of 30, most of the budget was still spent on high fidelity simulations for sample size ratios larger than 0.033 (i.e., 1/30).

For this example, the combination of the full Bayesian approach and a regression scalar ρ improved the accuracy of the discrepancy function based frameworks significantly. For all cases, the performance of BDR was significantly better than that of BD, SD and SDR. That is, applying the Bayesian approach alone (SD to BD) or ρ alone (SD to SDR) did not make a noticeable improvement, but their combination (BDR) worked for this example. BCDR worked generally well among the frameworks using calibration. For 28H and a cost ratio of 10, BDR performs best for sample size ratios larger than 0.13, and BCDR performs best for ratios less than 0.13. SCR outperformed BCR for small sample size ratios (≤ 0.1), while the two behaved similarly for larger ratios. BCDR outperformed SCDR for almost all of the plotted sample size ratios. Unlike for the discrepancy based frameworks, the effect of a regression scalar ρ was limited for the frameworks using calibration.

Each framework had a break-even sample size ratio at which its median RMSE becomes smaller than the best low fidelity median RMSE. The break-even sample size ratio decreased as the total budget decreased. For 56H and a cost ratio of 10, most break-even points were within the sample size ratios 0.4 and 0.8, while for 28H and a cost ratio of 10 they were within 0.2 and 0.4. That means securing a sufficient number of low fidelity samples was important for this example; little benefit can be expected from applying MFS frameworks at large sample size ratios.

In this paper, the cost saving of an MFS framework is calculated from the computational cost needed to achieve an equivalent median RMSE. For example, the best median RMSE of BDR for the case of 56H and a cost ratio of 30 is 0.059. Since such a low median RMSE cannot be achieved with the low fidelity surrogate, we calculated the number of high fidelity samples that provides an equivalent median RMSE.

Figure 10a shows box plots of the RMSEs of 100 Kriging surrogates built with 100 different high fidelity sample sets, randomly generated using LHS. Figure 10b shows the corresponding median RMSE as a function of the number of high fidelity samples. We obtain the cost saving using Fig. 10b. For example, the best median RMSE of BDR for the case of 56H and a cost ratio of 30 is 0.059 based on Fig. 9c. The number of high fidelity samples that gives 0.059 in the graph is 320H. The BDR framework achieves the equivalent median RMSE with 56H, so the cost saving is 83 %.

Table 3 summarizes the results for the different cases. For each case, we selected the best frameworks in terms of median RMSE. Commonly, frameworks using calibration performed best for small sample size ratios, and BDR performed generally well. The best frameworks for different sample size ratio ranges are also presented. For example, for the combination of a total budget of 56H and a cost ratio of 4, BC is the best framework for sample size ratios smaller than 0.1, and BDR is the best for ratios larger than 0.1. The maximum median RMSE improvement and the cost saving in the corresponding ranges are also presented. The median RMSE of an MFS framework is a function of the sample size ratio; we found the minimum median RMSE of each framework over all considered sample size ratios. For each MFS framework, the difference between its minimum median RMSE and the median RMSE of Kriging surrogates using only high fidelity samples was taken as the maximum median RMSE reduction. Cost savings were then calculated by finding how many high fidelity samples a Kriging surrogate needs to achieve the same minimum median RMSE. Note that the median RMSEs of the MFS frameworks were smaller than the RMSE of the low fidelity function, which is the minimum RMSE achievable with only low fidelity samples.
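The cost-saving bookkeeping described above amounts to inverting the single fidelity median-RMSE curve of Fig. 10b. A sketch, with a hypothetical curve standing in for the real Fig. 10b data (the numbers below are illustrative, not the paper's):

```python
def equivalent_budget(curve, target_rmse):
    """Linearly interpolate the number of high fidelity samples at which a
    monotonically decreasing (n_samples, median_rmse) curve reaches target_rmse."""
    for (n0, r0), (n1, r1) in zip(curve, curve[1:]):
        if r1 <= target_rmse <= r0:
            t = (r0 - target_rmse) / (r0 - r1)
            return n0 + t * (n1 - n0)
    raise ValueError("target RMSE outside the tabulated curve")

def cost_saving(mfs_budget, curve, mfs_rmse):
    """Fraction of the single fidelity cost saved by an MFS framework with the
    given budget (in high fidelity sample equivalents) and median RMSE."""
    return 1.0 - mfs_budget / equivalent_budget(curve, mfs_rmse)
```

With a hypothetical curve [(100, 0.10), (200, 0.08), (320, 0.059)], an MFS reaching 0.059 on a 56H budget saves 1 − 56/320 ≈ 83 % of the cost, matching the bookkeeping in the text.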
Fig. 10 RMSE of Kriging fits built based on only high fidelity samples. a Box plots of RMSEs for 84 to 364 high fidelity samples. b Median RMSEs as a function of the number of high fidelity samples.
This example shows that the benefit of employing MFS frameworks for saving computational cost was higher than for improving accuracy. MFS frameworks are mostly beneficial for high cost ratios. The observed maximum cost saving is 86 %, for the case of a 28H total budget and a cost ratio of 30. The maximum RMSE improvement is 51 %, with a corresponding best median RMSE of 0.055. The MFS frameworks are generally useful for high cost ratio cases.

4.2 Borehole function

In this section, we examine the MFS frameworks with a physical example, the flow of water through a borehole penetrating two aquifers. We considered the combinations of low and high fidelity samples for the given total budgets and cost ratios shown in Table 4. The same notation is used for total budget, cost ratio and sample size ratio as in the previous example.

The model for the flow was obtained based on assumptions of steady-state flow from the upper aquifer into the borehole and from the borehole into the lower aquifer, no groundwater gradient, and laminar, isothermal flow through the borehole (Morris et al. 1993). The flow rate through the borehole in m³/yr for given conditions is

$$f_H(R_w, L, K_w) = \frac{2\pi T_u (H_u - H_l)}{\ln(R/R_w)\left(1 + \dfrac{2 L T_u}{\ln(R/R_w)\, R_w^2 K_w} + \dfrac{T_u}{T_l}\right)} \qquad (21)$$

We assume that the flow rate fH(Rw, L, Kw) is calculated for the input variables Rw, L and Kw, which define the dimensions and conductivity of the borehole, while the other environmental parameters were measured and given. The input variables and parameters of the function are presented in Table 5. The values of the parameters were set to the nominal values of the parameters' ranges of interest; Morris et al. (1993) presents these ranges.

Table 5 Input variables and environmental parameters

Input   Description
The borehole function was used as the high fidelity function, and we selected a low fidelity function from the literature (Xiong et al. 2013). The RMSE of the low fidelity function with respect to the high fidelity function is 15.0. The low fidelity function is expressed as

$$f_L(R_w, L, K_w) = \frac{5 T_u (H_u - H_l)}{\ln(R/R_w)\left(1.5 + \dfrac{2 L T_u}{\ln(R/R_w)\, R_w^2 K_w} + \dfrac{T_u}{T_l}\right)} \qquad (22)$$

For all frameworks, regression scalar (ρ) bounds of [0.5, 1.5] were used. Hu and Hl were selected as the model parameters to be tuned, since the flow rate is sensitive to them; bounds of [800, 1200] and [600, 1000] were used for Hu and Hl, respectively.

Figure 11 shows the effect of the sample size ratio on the median RMSE for different cases; the cases of total computational budgets 5H and 10H with a cost ratio of 30 are shown. Figure 11a and b show the median RMSE of the best frameworks, with the red and black dashed lines representing the median RMSEs of 100 single low and high fidelity Kriging surrogates, respectively. The median RMSE of the low fidelity surrogates is almost the same as the RMSE of the low fidelity function, 15.0, for this case. That means the low fidelity surrogates have almost the same error as the exact low fidelity function. For 10H
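Equations (21) and (22) differ only in the 2π → 5 factor and the 1 → 1.5 constant in the denominator, so both fidelities can share one implementation. In the sketch below the environmental parameters (Tu, Tl, Hu, Hl, R) are keyword arguments with placeholder defaults, since Table 5 with the nominal values did not survive extraction; the defaults here are assumptions, not the paper's values.

```python
import math

def borehole_flow(rw, L, Kw, *, Tu=1000.0, Tl=100.0, Hu=1000.0, Hl=800.0,
                  R=1000.0, high_fidelity=True):
    """Flow rate through the borehole (m^3/yr).
    high_fidelity=True evaluates eq. (21); False evaluates eq. (22),
    which replaces 2*pi with 5 and the 1 in the denominator with 1.5."""
    ln_ratio = math.log(R / rw)
    shared = 2.0 * L * Tu / (ln_ratio * rw ** 2 * Kw) + Tu / Tl
    if high_fidelity:
        return 2.0 * math.pi * Tu * (Hu - Hl) / (ln_ratio * (1.0 + shared))
    return 5.0 * Tu * (Hu - Hl) / (ln_ratio * (1.5 + shared))
```

Because 5 < 2π and 1.5 > 1, the low fidelity flow rate is always below the high fidelity one for the same positive head difference Hu − Hl.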
Fig. 11 Median RMSEs for different sample size ratios and cost ratio of 30. a Best frameworks for 10H total budget. b Best frameworks for 5H total budget. c Discrepancy based frameworks for 10H total budget. d Discrepancy based frameworks for 5H total budget. e Frameworks using calibration for 10H total budget. f Frameworks using calibration for 5H total budget. (Each panel plots median RMSE against sample size ratio nH/nL.)
cases, high fidelity surrogates outperform low fidelity surrogates in terms of the median RMSE.

The discrepancy function based MFS frameworks with a regression scalar gave more accurate surrogates in terms of median RMSE, especially for small sample size ratios, as in the previous example. Figure 11 shows that the BDR and SDR frameworks significantly outperformed the other frameworks for all sample size ratios, with no noticeable difference between the two; the effect of applying the full Bayesian treatment was therefore minimal. Similarly, Fig. 11e and f show no noticeable difference between the Bayesian and simple frameworks using calibration. For this example, the effect of the regression scalar ρ was significant, since the discrepancy based frameworks with ρ = 1, BD and SD, were significantly outperformed by BDR and SDR. For the frameworks using calibration, there was little effect of having a regression scalar ρ, because the same scaling effect can be achieved through calibration of Hu and Hl. Note that the median RMSE behavior for a cost ratio of 30 (Fig. 11) and for a cost ratio of 10 were similar; the median RMSEs for a cost ratio of 10 are given in Fig. 14 in Appendix B.

Table 6 summarizes the results for all the cases in Table 4. The performance of BDR and SDR was significantly better than that of the frameworks using calibration. The median RMSE improvements were higher than the cost savings, but the effect of cost saving was still significant. The effect of having a regression scalar was significant, while applying the full Bayesian approach had little effect for the discrepancy function based frameworks. The borehole function is a simple, monotonically increasing function in the domain of the input variables. For such a simple function, discrepancy based approaches were better than frameworks using calibration, and the gains from applying a Bayesian framework were limited.

4.3 Predicting the performance of the frameworks based on PRESS

The previous sections showed the performance of the frameworks based on the median over 100 randomly generated DOEs. We have observed, however, that even if one MFS is superior for most DOEs, it can perform poorly for others. Thus a performance prediction for a given problem and DOE would help to choose a proper MFS framework. Since PRESS is considered a good performance predictor for surrogate models (e.g. Viana et al. 2009; Viana and Haftka 2009), we examined whether PRESS can predict the performance of MFS frameworks. We selected sample size ratios and predicted the ranking of all the frameworks based on their PRESS. We then evaluated the predictability by comparing the predicted best framework to the actual best framework based on the RMSEs for each DOE. We counted in how many of the 100 DOEs PRESS made the right prediction and obtained the corresponding success rates.

Table 7 shows the performance of PRESS for the Hartmann 6 function example. The ratios 10/560 and 20/240 were

Table 7 The numbers of cases out of 100 in which the predicted best/worst framework is among the actual best/worst 1, 2 and 3 (Hartmann 6 function example, 9 possible frameworks)

                Correct best/worst   Within best/worst 2   Within best/worst 3
10/560  Best    15                   26                    43
        Worst   28                   49                    60
20/240  Best    16                   36                    49
        Worst   20                   47                    62
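The bookkeeping behind Tables 7 and 8 — pick the framework with the lowest PRESS, then check whether it lands in the true top k by RMSE — can be sketched as below; the function names and the data in the usage assertions are invented for illustration, not the paper's.

```python
def predicted_in_top_k(press_by_fw, rmse_by_fw, k):
    """True if the framework with the smallest PRESS is among the k
    frameworks with the smallest true RMSE, for one DOE."""
    predicted = min(press_by_fw, key=press_by_fw.get)
    top_k = sorted(rmse_by_fw, key=rmse_by_fw.get)[:k]
    return predicted in top_k

def success_rate(doe_results, k):
    """Fraction of DOEs for which the PRESS-predicted best framework is
    within the actual best k. doe_results: list of (press, rmse) dict pairs."""
    hits = sum(predicted_in_top_k(p, r, k) for p, r in doe_results)
    return hits / len(doe_results)
```

Running this over 100 DOEs for k = 1, 2, 3 yields exactly the counts reported per row of Tables 7 and 8 (the worst-framework columns follow by using max instead of min).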
Table 8 The numbers of cases out of 100 in which the predicted best/worst framework is among the actual best/worst 1, 2 and 3 (Borehole function example)

              Correct best/worst 1   Correct best/worst 2   Correct best/worst 3
5/50  Best    100                    –                      –
      Worst   45                     72                     78

selected since they are the optimal sample size ratios for the total budget of 28H and the cost ratio of 30 for BCDR and BDR, respectively. 10-fold cross validation errors were used to predict the ranks of the frameworks. We predicted the best and worst frameworks for a given DOE based on PRESS, and checked whether the selected best or worst frameworks are within the actual best 1, 2 and 3 or worst 1, 2 and 3 frameworks in terms of RMSE (out of the 9 frameworks investigated here). For 10/560, PRESS predicted the actual best framework for only 15 DOEs out of 100. Note that the cases for best 2 include the cases for best 1. Table 7 shows that the rate of making the right prediction based on PRESS is not high: the chance that the predicted best framework is within the actual best 3 is 43 %, and the chance that the predicted worst framework is within the actual worst 3 is 60 %. There is also little difference between the two cases 10/560 and 20/240. The performance of the frameworks for the sample size ratios 10/560 and 20/240 was competitive except for SD, SDR and BD, based on Fig. 9. The result tells us that PRESS could not rank the frameworks accurately.

Table 8 shows the predictive performance of PRESS for the borehole function example. We considered 5/50 for the total budget of 10H and the cost ratio of 10. As Fig. 11 shows, BDR and SDR significantly outperform the other frameworks, and PRESS performs much better than for the Hartmann 6 function example.

5 Concluding remarks

In this paper, we attempted to provide insight on three Bayesian MFS frameworks and the corresponding simple frameworks, based on approaches using 1) an additive discrepancy function, 2) calibration of low fidelity simulations, and 3) a comprehensive approach using both. We also examined the effect of the regression scalar ρ applied in MFS frameworks. The frameworks were examined with the 6D Hartmann 6 algebraic function and the 3D borehole physical function.

The MFS frameworks were more potent for reducing computational cost than for improving accuracy, and this was more obvious for the Hartmann 6 function than for the borehole function. For the Hartmann 6 function, the discrepancy based MFS framework could build MFS surrogates equivalent, in terms of median RMSE, to Kriging surrogates based on only high fidelity samples with 14 % of the computational cost of the single fidelity Kriging surrogate (equivalent to 86 % cost saving). The maximum accuracy improvement over high fidelity Kriging surrogates is a 51 % reduction in median RMSE. For the borehole function, the accuracy improvement is better than the computational cost saving, but the difference is marginal compared to the huge difference between the corresponding quantities for the Hartmann 6 function. The Bayesian discrepancy framework performed generally the best; however, its performance deteriorated rapidly when the number of available high fidelity samples was low. In contrast, for the Hartmann 6 function, the Bayesian calibration framework performed reliably well with few high fidelity samples and then outperformed the Bayesian discrepancy framework. Since the discrepancy between the low and high fidelity functions was much more complex for the Hartmann 6 function than for the borehole function, the discrepancy function prediction with few high fidelity samples had a huge error for the Hartmann 6 function. Using a regression scalar ρ was particularly beneficial for the discrepancy based frameworks. For the borehole function example, the presence of the scalar led to a significant difference in the quality of an MFS, and there was little benefit from using a Bayesian framework.

We also found substantial differences in performance between frameworks for different DOEs, which raised the question whether PRESS based on cross validation errors can help us choose the best framework. We found that the usefulness of PRESS is limited; it is not as helpful as it is for single fidelity surrogates.

Finally, the Bayesian frameworks were compared to the simple frameworks using Kriging surrogates. One of the advantages of the simple frameworks is that they can be much more readily implemented with existing surrogates. This may compensate for the relatively small advantage observed for the Bayesian frameworks.

This paper focuses on predicting the response of a high fidelity model with the aid of a single low fidelity model using the MFS frameworks. However, since high fidelity models can hardly be perfect, an even higher fidelity model may need to be considered to measure the error of the high fidelity model. Alternatively, only general knowledge of the magnitude of the error in the high fidelity model may be available; that knowledge can be useful to avoid trying to reduce the error of the MFS much below the high fidelity error.

Acknowledgments This work is supported by the U.S. Department of Energy, National Nuclear Security Administration, Advanced Simulation and Computing Program, as a Cooperative Agreement under the Predictive Science Academic Alliance Program, under Contract No. DE-NA0002378.
(Figure: RMSE box plots for the SDR, BDR, SCR, BCR, SCDR and BCDR frameworks.)
Fig. 13 Median RMSEs for different sample size ratios and cost ratio of 10. a Best frameworks for 56H total budget. b Best frameworks for 28H total budget. c Discrepancy based frameworks for 56H total budget. d Discrepancy based frameworks for 28H total budget. e Frameworks using calibration for 56H total budget. f Frameworks using calibration for 28H total budget. (Each panel plots median RMSE against sample size ratio nH/nL.)
Remarks on multi-fidelity surrogates
16 25
Fig. 14 Median RMSEs for L
different sample size ratios and 14 H
SDR
cost ratio of 10. a Best 12 BDR
20
Median RMSE
Median RMSE
b Best frameworks for 5H total 10
15
budget. c Discrepancy based 8
frameworks for 10H total budget. 10
6
d Discrepancy based frameworks L
H
for 5H total budget. e 4
5
SDR
Frameworks using calibration for BDR
2 BCDR
10H total budget. f Frameworks
0 0
using calibration for 5H total 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4
Sample size ratio (nH/nL) Sample size ratio (nH/nL)
budget
(a) Best frameworks for 10H total budget (b) Best frameworks for 5H total budget
10 10
SD
9 SDR 9
8 BD 8
BDR
7 7
Median RMSE
Median RMSE
6 6
5 5
4 4
3 3 SD
SDR
2 2
BD
1 1 BDR
0 0
0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4
Sample size ratio (nH/nL) Sample size ratio (nH/nL)
(c) Discrepancy based frameworks for 10H (d) Discrepancy based frameworks for 5H total
total budget budget
15 35
SCR
SCDR
30 BC
BCR
10 BCDR
Median RMSE
Median RMSE
25
20
5 SCR
SCDR
BC
15
BCR
BCDR
0 10
0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4
Sample size ratio (nH/nL) Sample size ratio (nH/nL)
(e) Frameworks using calibration for 10H total (f) Frameworks using calibration for 5H total
budget budget