
INFORMATICS AND MATHEMATICAL MODELLING
Technical University of Denmark
DK-2800 Kongens Lyngby, Denmark

DACE – A MATLAB KRIGING TOOLBOX
VERSION 2.0

Søren N. Lophaven
Hans Bruun Nielsen
Jacob Søndergaard

TECHNICAL REPORT IMM-REP-2002-12
1.8.2002
Contents

1. Introduction
2. Modelling and Prediction
   2.1. The Kriging Predictor
   2.2. Regression Models
   2.3. Correlation Models
3. Generalized Least Squares Fit
   3.1. Computational Aspects
4. Experimental Design
   4.1. Rectangular Grid
   4.2. Latin Hypercube Sampling
5. Reference Manual
   5.1. Model Construction
   5.2. Evaluate the Model
   5.3. Regression Models
   5.4. Correlation Models
   5.5. Experimental Design
   5.6. Auxiliary Functions
   5.7. Data Files
6. Examples of Usage
   6.1. Work-through Example
   6.2. Adding a Regression Function
7. Notation
References

1. Introduction

This report describes the background for and use of the software package DACE (Design and Analysis of Computer Experiments), which is a Matlab toolbox for working with kriging approximations to computer models.

The typical use of this software is to construct a kriging approximation model based on data from a computer experiment, and to use this approximation model as a surrogate for the computer model. Here, a computer experiment is a collection of pairs of inputs and responses from runs of a computer model. Both the input and the response from the computer model are likely to be high dimensional.

The computer models we address are deterministic, and thus a response from a model lacks random error, i.e., repeated runs for the same input parameters give the same response from the model.

Often the approximation models are needed as part of a design problem, in which the best set of parameters for running the computer model is determined. This is for example the case when a computer model is fitted to physical data. This design problem is related to the more general problem of predicting output from a computer model at untried inputs.

In Section 2 we consider models for computer experiments and efficient predictors, Section 3 discusses generalized least squares and implementation aspects, and in Section 4 we consider experimental design for the predictors. Section 5 is a reference manual for the toolbox, and finally examples of usage and a list of notation are given in Sections 6 and 7.

2. Modelling and Prediction

Given a set of m design sites S = [s_1 ... s_m]^T with s_i ∈ R^n and responses Y = [y_1 ... y_m]^T with y_i ∈ R^q, the data is assumed to
satisfy the normalization conditions¹

   μ[S_{:,j}] = 0,   V[S_{:,j}, S_{:,j}] = 1,   j = 1, ..., n,
   μ[Y_{:,j}] = 0,   V[Y_{:,j}, Y_{:,j}] = 1,   j = 1, ..., q,        (2.1)

where X_{:,j} is the vector given by the jth column in matrix X, and μ[·] and V[·,·] denote respectively the mean and the covariance.

Following [9] we adopt a model ŷ that expresses the deterministic response y(x) ∈ R^q, for an n-dimensional input x ∈ D ⊆ R^n, as a realization of a regression model F and a random function (stochastic process),

   ŷ_ℓ(x) = F(β_{:,ℓ}, x) + z_ℓ(x),   ℓ = 1, ..., q.        (2.2)

We use a regression model which is a linear combination of p chosen functions f_j : R^n ↦ R,

   F(β_{:,ℓ}, x) = β_{1,ℓ} f_1(x) + ··· + β_{p,ℓ} f_p(x)
                 = [f_1(x) ··· f_p(x)] β_{:,ℓ}
                 ≡ f(x)^T β_{:,ℓ}.        (2.3)

The coefficients {β_{k,ℓ}} are regression parameters.

The random process z is assumed to have mean zero and covariance

   E[z_ℓ(w) z_ℓ(x)] = σ_ℓ² R(θ, w, x),   ℓ = 1, ..., q        (2.4)

between z(w) and z(x), where σ_ℓ² is the process variance for the ℓth component of the response and R(θ, w, x) is the correlation model with parameters θ. An interpretation of the model (2.2) is that deviations from the regression model, though the response is deterministic, may resemble a sample path of a (suitably chosen) stochastic process z. In the following we will focus on the kriging predictor for y.

First, however, we must bear in mind that the true value can be written as

   y_ℓ(x) = F(β_{:,ℓ}, x) + α(β_{:,ℓ}, x),        (2.5)

where α is the approximation error. The assumption is that by proper choice of β this error behaves like "white noise" in the region of interest, i.e., for x ∈ D.

¹The user does not have to think of this: the first step in the model construction is to normalize the given S, Y so that (2.1) is satisfied, see (5.1) below.

2.1. The Kriging Predictor

For the set S of design sites we have the expanded m×p design matrix F with F_ij = f_j(s_i),

   F = [f(s_1) ··· f(s_m)]^T,        (2.6)

with f(x) defined in (2.3). Further, define R as the matrix of stochastic-process correlations between z's at design sites,

   R_ij = R(θ, s_i, s_j),   i, j = 1, ..., m.        (2.7)

At an untried point x let

   r(x) = [R(θ, s_1, x) ··· R(θ, s_m, x)]^T        (2.8)

be the vector of correlations between z's at design sites and x.

Now, for the sake of convenience, assume that q = 1, implying that β = β_{:,1} and Y = Y_{:,1}, and consider the linear predictor

   ŷ(x) = c^T Y,        (2.9)

with c = c(x) ∈ R^m. The error is

   ŷ(x) − y(x) = c^T Y − y(x)
               = c^T (Fβ + Z) − (f(x)^T β + z)
               = c^T Z − z + (F^T c − f(x))^T β,
where Z = [z_1 ... z_m]^T are the errors at the design sites. To keep the predictor unbiased we demand that F^T c − f(x) = 0, or

   F^T c(x) = f(x).        (2.10)

Under this condition the mean squared error (MSE) of the predictor (2.9) is

   φ(x) = E[(ŷ(x) − y(x))²]
        = E[(c^T Z − z)²]
        = E[z² + c^T Z Z^T c − 2 c^T Z z]
        = σ² (1 + c^T R c − 2 c^T r).        (2.11)

The Lagrangian function for the problem of minimizing φ with respect to c and subject to the constraint (2.10) is

   L(c, λ) = σ² (1 + c^T R c − 2 c^T r) − λ^T (F^T c − f).        (2.12)

The gradient of (2.12) with respect to c is

   L'_c(c, λ) = 2σ² (R c − r) − F λ,

and from the first order necessary conditions for optimality (see e.g. [7, Section 12.2]) we get the following system of equations

   [ R     F ] [ c ]     [ r ]
   [ F^T   0 ] [ λ̃ ]  =  [ f ],        (2.13)

where we have defined

   λ̃ = −λ / (2σ²).

The solution to (2.13) is

   λ̃ = (F^T R^{-1} F)^{-1} (F^T R^{-1} r − f),
   c = R^{-1} (r − F λ̃).        (2.14)

The matrix R and therefore R^{-1} is symmetric, and by means of (2.9) we find

   ŷ(x) = (r − F λ̃)^T R^{-1} Y
        = r^T R^{-1} Y − (F^T R^{-1} r − f)^T (F^T R^{-1} F)^{-1} F^T R^{-1} Y.        (2.15)

In Section 3 we show that for the regression problem

   F β ≃ Y

the generalized least squares solution (with respect to R) is

   β* = (F^T R^{-1} F)^{-1} F^T R^{-1} Y,

and inserting this in (2.15) we find the predictor

   ŷ(x) = r^T R^{-1} Y − (F^T R^{-1} r − f)^T β*
        = f^T β* + r^T R^{-1} (Y − F β*)
        = f(x)^T β* + r(x)^T γ*.        (2.16)

For multiple responses (q > 1) the relation (2.16) holds for each column in Y, so that (2.16) holds with β* ∈ R^{p×q} given by (2.15) and γ* ∈ R^{m×q} computed via the residuals, R γ* = Y − F β*.

Note that for a fixed set of design data the matrices β* and γ* are fixed. For every new x we just have to compute the vectors f(x) ∈ R^p and r(x) ∈ R^m and add two simple products.
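As an illustration, the two products in (2.16) can be evaluated along the following lines. This is a minimal sketch with illustrative variable names (not the toolbox API), assuming a model with the linear regression model and the Gaussian correlation, and assuming β* and γ* have already been computed from the design data:

   % Sketch: evaluate the predictor (2.16) at one untried site x (1*n).
   % Assumed precomputed quantities (illustrative names, not the toolbox API):
   %   S (m*n) design sites, theta correlation parameters,
   %   beta (p*q) = beta* from (2.15), gamma (m*q) solving R*gamma = Y - F*beta.
   f = regpoly1(x);                    % f(x): 1*p regression values, cf. (2.22)
   d = repmat(x, size(S,1), 1) - S;    % rowwise differences x - s_i
   r = corrgauss(theta, d);            % r(x): m*1 correlations, cf. (2.8)
   y = f*beta + r'*gamma;              % the two simple products in (2.16)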
Getting an estimate of the error involves a larger computational effort. Again we first let q = 1, and from (2.11) and (2.14) we get the following expression for the MSE of the predictor,

   φ(x) = σ² (1 + c^T (R c − 2 r))
        = σ² (1 + (F λ̃ − r)^T R^{-1} (F λ̃ + r))
        = σ² (1 + λ̃^T F^T R^{-1} F λ̃ − r^T R^{-1} r)
        = σ² (1 + u^T (F^T R^{-1} F)^{-1} u − r^T R^{-1} r),        (2.17)

where u = F^T R^{-1} r − f and σ² is found by means of (3.7) below.

This expression generalizes immediately to the multiple response case:
for the ℓth response function we replace σ² by σ_ℓ², the process variance for the ℓth response function. Computational aspects are given in Section 3.1.

Remark 2.1.1. Let x = s_i, the ith design site. Then r(x) = R_{:,i}, the ith column of R, and R^{-1} r(x) = e_i, the ith column of the unit matrix, e_i = I_{:,i}. Using these relations and (2.6) in (2.16) we find

   ŷ(s_i) = f(s_i)^T β* + r(s_i)^T R^{-1} (Y − F β*)
          = f(s_i)^T β* + e_i^T (Y − F β*)
          = f(s_i)^T β* + y_i − F_{i,:} β* = y_i.

This shows that the kriging predictor interpolates the design data. Further, in (2.17) we get u = F^T e_i − f(s_i) = 0 and the associated MSE

   φ(s_i) = σ² (1 − R_{:,i}^T e_i) = σ² (1 − R_ii) = 0,

since R_ii = 1.

Remark 2.1.2. As indicated by the name MSE (mean squared error) we expect that φ(x) ≥ 0, but in (2.17) it may happen that r^T R^{-1} r > 1 + u^T (F^T R^{-1} F)^{-1} u, in which case φ(x) < 0. This point needs further investigation, but as a first explanation we offer the following: equation (2.11) is based on the assumption that the difference between the regression model and the true value is "white noise", and if there is significant approximation error, (2.5), then this assumption and its implications do not hold.

Remark 2.1.3. From (2.16) it follows that the gradient

   ŷ'(x) = [ ∂ŷ/∂x_1  ···  ∂ŷ/∂x_n ]^T

can be expressed as

   ŷ'(x) = J_f(x)^T β* + J_r(x)^T γ*,        (2.18)

where J_f and J_r are the Jacobians of f and r, respectively,

   (J_f(x))_ij = ∂f_i/∂x_j (x),   (J_r(x))_ij = ∂R/∂x_j (θ, s_i, x).        (2.19)

From (2.17) it follows that the gradient of the MSE can be expressed as

   φ'(x) = 2σ² ( u'(x)^T (F^T R^{-1} F)^{-1} u − J_r(x)^T R^{-1} r )
         = 2σ² ( (F^T R^{-1} J_r(x) − J_f(x))^T (F^T R^{-1} F)^{-1} u − J_r(x)^T R^{-1} r ),        (2.20)

where u'(x) = F^T R^{-1} J_r(x) − J_f(x) is the Jacobian of u. These expressions are implemented in the toolbox, see Section 5.2.

2.2. Regression Models

The toolbox provides regression models with polynomials of orders 0, 1 and 2. More specifically, with x_j denoting the jth component of x,

Constant, p = 1:

   f_1(x) = 1,        (2.21)

Linear, p = n+1:

   f_1(x) = 1,   f_2(x) = x_1,  ...,  f_{n+1}(x) = x_n,        (2.22)

Quadratic, p = ½(n+1)(n+2):

   f_1(x) = 1,
   f_2(x) = x_1,  ...,  f_{n+1}(x) = x_n,
   f_{n+2}(x) = x_1²,  ...,  f_{2n+1}(x) = x_1 x_n,
   f_{2n+2}(x) = x_2²,  ...,  f_{3n}(x) = x_2 x_n,
   ...,  f_p(x) = x_n².        (2.23)

The corresponding Jacobians are (the subscripts denote the size of the matrix and O is the matrix of all zeros)

   constant:   J_f = [O_{n×1}],
   linear:     J_f = [O_{n×1}  I_{n×n}],
   quadratic:  J_f = [O_{n×1}  I_{n×n}  H],
where we illustrate H ∈ R^{n×(p−n−1)} by

   n = 2:   H = [ 2x_1   x_2    0   ]
                [  0     x_1   2x_2 ],

   n = 3:   H = [ 2x_1   x_2   x_3    0     0     0   ]
                [  0     x_1    0    2x_2   x_3    0   ]
                [  0      0    x_1    0     x_2   2x_3 ].
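To see how the blocks of H arise for general n, the following sketch builds the quadratic basis values and the Jacobian J_f = [O I H] for a single site x, stored as a row vector. It is illustrative only; the toolbox function regpoly2 (Section 5.3) provides this functionality:

   % Sketch: quadratic basis f(x) and Jacobian J_f = [O I H], cf. (2.23).
   % Illustrative only; regpoly2 implements this in the toolbox.
   n = length(x);
   f = [1, x];                            % constant and linear terms
   J = [zeros(n,1), eye(n)];              % their derivatives, [O I]
   for j = 1:n                            % block of terms x_j*x_k, k = j,...,n
     f = [f, x(j)*x(j:n)];
     Hj = zeros(n, n-j+1);
     Hj(j,:) = x(j:n);                    % d(x_j*x_k)/dx_j = x_k
     for k = j:n
       Hj(k,k-j+1) = Hj(k,k-j+1) + x(j);  % d(x_j*x_k)/dx_k adds x_j
     end
     J = [J, Hj];                         % append this block of H
   end

For n = 2 the loop reproduces H = [2x_1 x_2 0; 0 x_1 2x_2] as above.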

2.3. Correlation Models

As in [9] we restrict our attention to correlations of the form

   R(θ, w, x) = ∏_{j=1}^{n} R_j(θ, w_j − x_j),

i.e., to products of stationary, one-dimensional correlations. More specifically, the toolbox contains the following 7 choices:

   Name         R_j(θ, d_j)
   exp          exp(−θ_j |d_j|)
   expg         exp(−θ_j |d_j|^{θ_{n+1}}),   0 < θ_{n+1} ≤ 2
   gauss        exp(−θ_j d_j²)
   lin          max{0, 1 − θ_j |d_j|}
   spherical    1 − 1.5 ξ_j + 0.5 ξ_j³,   ξ_j = min{1, θ_j |d_j|}
   cubic        1 − 3 ξ_j² + 2 ξ_j³,   ξ_j = min{1, θ_j |d_j|}
   spline       ς(ξ_j), see (2.24),   ξ_j = θ_j |d_j|

   Table 2.1. Correlation functions. d_j = w_j − x_j.

The spline correlation model is defined by

   ς(ξ_j) =  1 − 15 ξ_j² + 30 ξ_j³   for 0 ≤ ξ_j ≤ 0.2,
             1.25 (1 − ξ_j)³         for 0.2 < ξ_j < 1,        (2.24)
             0                       for ξ_j ≥ 1.

Some of the choices are illustrated in Figure 2.1 below. Note that in all cases the correlation decreases with |d_j| and a larger value for θ_j leads to a faster decrease. The normalization (2.1) of the data implies that |s_ij| ≲ 1 and therefore we are interested in cases where |d_j| ≲ 2, as illustrated in the figure.

   [Figure 2.1. Correlation functions for 0 ≤ d_j ≤ 2; panels exp, gauss, lin and spline. Dashed, full and dash-dotted lines: θ_j = 0.2, 1, 5.]

The correlation functions in Table 2.1 can be separated into two groups, one containing functions that have a parabolic behaviour near the origin (gauss, cubic and spline), and the other containing functions with a linear behaviour near the origin (exp, lin and spherical). The general exponential expg can have both shapes, depending on the last parameter: θ_{n+1} = 2 and θ_{n+1} = 1 give the Gaussian and the exponential function, respectively.

The choice of correlation function should be motivated by the underlying phenomenon, e.g., a function we want to optimize or a physical process we want to model. If the underlying phenomenon is continuously differentiable, the correlation function will likely show a parabolic behaviour near the origin, which means that the Gaussian, the cubic or the spline function should be chosen.
Conversely, physical phenomena usually show a linear behaviour near the origin, and exp, expg, lin or spherical would usually perform better, see [2]. Also note that for large distances the correlation is 0 according to the linear, cubic, spherical and spline functions, while it is asymptotically 0 when applying the other functions.

Often the phenomenon is anisotropic. This means that different correlations are identified in different directions, i.e., the shape of the functions in Figure 2.1 differs between different directions. This is accounted for in the functions in Table 2.1, since we allow different parameters θ_j in the n dimensions.

Assuming a Gaussian process, the optimal coefficients θ* of the correlation function solve

   min_θ { ψ(θ) ≡ |R|^{1/m} σ² },        (2.25)

where |R| is the determinant of R. This definition of θ* corresponds to maximum likelihood estimation. In Figure 2.2 we illustrate the typical behaviour of the functions involved in (2.25).

   [Figure 2.2. Typical behaviour of ψ, |R|^{1/m} and σ² for 0 < θ_j ≤ 1. Data normalized as defined in (5.1).]

Note that |R|^{1/m} is monotonously increasing in the interval. This is in accordance with expectation, since R is close to the unit matrix for large θ, while it is indefinite for small θ. For θ = 0, R is the matrix of all ones, which has rank one. In case of an indefinite R we define σ² = ψ(θ) = "∞".

See [4] for a thorough discussion of properties of different correlation models and the optimization problem (2.25).
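To make (2.25) concrete, here is a minimal sketch of how the objective ψ(θ) could be evaluated for given data. It assumes normalized S and Y, the constant regression model and the Gaussian correlation, and it omits the safeguards (modified Cholesky factorization, sparse storage) used by the actual implementation, cf. Sections 3.1 and 5.1:

   % Sketch: evaluate psi(theta) = |R|^(1/m) * sigma2 from (2.25).
   % Assumes normalized S (m*n) and Y (m*1); illustrative only.
   m = size(S,1);
   R = eye(m);
   for i = 1:m                          % correlation matrix (2.7), gauss model
     d = repmat(S(i,:), m, 1) - S;      % differences s_i - s_j
     R(:,i) = exp(-(d.^2) * theta(:));  % prod_j exp(-theta_j * d_j^2)
   end
   C = chol(R)';                        % R = C*C', cf. (3.8)
   Ft = C \ ones(m,1);                  % decorrelated F for regpoly0, (3.10)
   Yt = C \ Y;
   beta = (Ft'*Ft) \ (Ft'*Yt);          % GLS estimate, cf. (3.7)
   sigma2 = sum((Yt - Ft*beta).^2) / m; % process variance, (3.13)
   psi = prod(diag(C))^(2/m) * sigma2;  % |R|^(1/m) from the Cholesky diagonal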
3. Generalized Least Squares Fit

In this section we take a linear algebra view of generalized least squares estimation, and get results that are well known in statistical literature, where they are derived with probabilistic tools.

Consider an m-vector Y with outcomes of a stochastic process and let

   Y = F β + e,        (3.1)

where F ∈ R^{m×p} (with p < m) is given. Assume that

   E[e_i] = 0,   E[e_i e_j] = σ² R_ij,        (3.2)

where R_ij is the (i,j)th element in the covariance matrix R, which is assumed to be known. Note that we can express (3.2) in matrix-vector form

   E[e] = 0,   E[e e^T] = σ² R.

First, assume that the errors are uncorrelated and all have the same variance. This is equivalent with R = I, and the maximum likelihood estimate of the parameter vector β is the least squares solution, i.e., β* is the solution to the simple normal equations

   (F^T F) β* = F^T Y.        (3.3)
The corresponding maximum likelihood estimate of the variance is

   σ² = (1/m) (Y − F β*)^T (Y − F β*).        (3.4)

To get a central estimate of the variance, the denominator m should be replaced by m − p, the number of degrees of freedom.

Next, assume that the errors are uncorrelated, but have different variances, i.e., E[e_i e_i] = σ_i² and E[e_i e_j] = 0 for i ≠ j. Then R is the diagonal matrix

   R = diag(σ_1²/σ², ..., σ_m²/σ²).

We introduce the weight matrix W given by

   W = diag(σ/σ_1, ..., σ/σ_m),   W² = R^{-1},        (3.5)

and the weighted observations Ỹ = W Y = W F β + ẽ are easily seen to satisfy

   E[ẽ] = 0,   E[ẽ ẽ^T] = W E[e e^T] W^T = σ² I,        (3.6)

i.e., this transformed set of observations satisfies the assumptions for the simplest case, and it follows that β* and σ² are found by replacing F, Y in (3.3)–(3.4) by W F, W Y. This results in the weighted normal equations,

   (F^T W² F) β* = F^T W² Y,   σ² = (1/m) (Y − F β*)^T W² (Y − F β*).

From (3.5) we see that these relations can be expressed as

   (F^T R^{-1} F) β* = F^T R^{-1} Y,
   σ² = (1/m) (Y − F β*)^T R^{-1} (Y − F β*).        (3.7)

Finally, consider the case where the errors have nonzero correlation, i.e., R is not diagonal. For any c ∈ R^m let υ = c^T Y be a linear combination of the elements in Y. Then

   υ = c^T F β + ε   with   ε = c^T e,

and

   E[ε²] = E[c^T e e^T c] = σ² c^T R c.

Since E[ε²] > 0 whenever c ≠ 0, we have shown that R is positive definite. Further, from its definition it is immediately seen that R is symmetric. These two properties imply that we can write it in factorized form,

   R = C C^T,        (3.8)

where the matrix C^T may be chosen as the Cholesky factor. As in (3.6) we see that the "decorrelation transformation"

   ẽ = C^{-1} e = C^{-1} Y − C^{-1} F β ≡ Ỹ − F̃ β        (3.9)

yields E[ẽ] = 0 and E[ẽ ẽ^T] = σ² I, and by similar arguments we see that (3.7) is also applicable in this general case.
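As a small illustration of (3.7) and the decorrelation transformation (3.9), the following sketch (with made-up data, not part of the toolbox) computes the generalized least squares estimate both ways; the two results agree to rounding error:

   % Sketch: GLS via the normal equations (3.7) and via decorrelation (3.9).
   % Illustrative, made-up data.
   m = 5;
   F = [ones(m,1) (1:m)'];                 % simple regression matrix, p = 2
   Y = [1.1 1.9 3.2 3.8 5.1]';             % made-up observations
   R = 0.5*eye(m) + 0.5*ones(m);           % a symmetric positive definite R
   beta1 = (F'*(R\F)) \ (F'*(R\Y));        % normal equations (3.7)
   C  = chol(R)';                          % R = C*C', cf. (3.8)
   Ft = C\F;  Yt = C\Y;                    % decorrelated data, (3.9)
   beta2 = Ft \ Yt;                        % least squares solution of (3.10)
   sigma2 = (Yt - Ft*beta2)'*(Yt - Ft*beta2) / m;   % variance estimate (3.7)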
3.1. Computational Aspects

The formulation (3.7) should not be used directly for practical computation if the problem is large and/or ill-conditioned. Instead β* should be found by orthogonal transformation as the least squares solution to the overdetermined system

   F̃ β ≃ Ỹ,        (3.10)

with the matrix and right-hand side obtained by solving the matrix equations

   C F̃ = F,   C Ỹ = Y.
The least squares solution to (3.10) can be found in the following steps:

1. Compute the "economy size" (or "thin") QR factorization of F̃ (see e.g. [1, Section 5.2.6]),

      F̃ = Q G^T,        (3.11)

   where Q ∈ R^{m×p} has orthonormal columns and G^T ∈ R^{p×p} is upper triangular.

2. Check that G and thereby F̃ has full rank. If not, this is an indication that the chosen regression functions were not sufficiently linearly independent, and computation should stop. Otherwise, compute the least squares solution by back substitution in the system

      G^T β* = Q^T Ỹ.        (3.12)

The auxiliary matrices can also be used to compute the process variance (3.7),

   σ_ℓ² = (1/m) ‖Ỹ_{:,ℓ} − F̃ β*_{:,ℓ}‖₂²,        (3.13)

and the MSE (2.17),

   φ_ℓ(x) = σ_ℓ² (1 + u^T (F̃^T F̃)^{-1} u − r̃^T r̃)
          = σ_ℓ² (1 + u^T (G G^T)^{-1} u − r̃^T r̃)
          = σ_ℓ² (1 + ‖G^{-1} u‖₂² − ‖r̃‖₂²),        (3.14)

with

   r̃ = C^{-1} r,   u = F^T R^{-1} r − f = F̃^T r̃ − f,

and we have used (3.11) in the first transformation: F̃^T F̃ = G Q^T Q G^T = G G^T since Q has orthonormal columns.

For large sets of design sites R will be large, and – at least with the last four choices in Table 2.1 – it can be expected to be sparse. This property is preserved in the Cholesky factor, but R^{-1} = C^{-T} C^{-1} is dense.

An alternative to the Cholesky factor in (3.8) is the eigenvector-eigenvalue decomposition

   R = V Λ V^T   with   V^T V = I,   Λ = diag(λ_1, ..., λ_m),        (3.15)

corresponding to C = V Λ^{1/2}, where Λ^{1/2} = diag(√λ_1, ..., √λ_m); all the λ_j are real and positive. However, this factorization is more costly to compute, and the eigenvector matrix V is often a dense matrix.

Depending on the choice of R and the parameters θ the matrix R may be very ill-conditioned. This is investigated in [4, Sections 4-5], and in order to reduce effects of rounding errors the implementation uses a modified Cholesky factorization, where (3.8) is replaced by

   C C^T = R + μ I   with   μ = (10 + m) ε_M,        (3.16)

where ε_M is the so-called machine accuracy (or unit round-off), ε_M = 2^{−52} ≈ 2.22·10^{−16} in Matlab.
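Collecting the pieces, a bare-bones version of the fitting steps above might look as follows. This is an illustrative sketch only; the toolbox function dacefit adds normalization, sparse storage and the optimization of θ:

   % Sketch: numerically sound GLS fit following (3.10)-(3.16).
   % Inputs: F (m*p), Y (m*1), R (m*m); illustrative, cf. dacefit.
   [m, p] = size(F);
   mu = (10 + m)*eps;                  % regularization (3.16); eps = 2^-52
   C = chol(R + mu*eye(m))';           % modified Cholesky: R + mu*I = C*C'
   Ft = C\F;  Yt = C\Y;                % decorrelation, (3.10)
   [Q, Gt] = qr(Ft, 0);                % economy-size QR (3.11), Gt = G^T
   if rank(Gt) < p
     error('F does not have full rank')   % cf. step 2 above
   end
   beta = Gt \ (Q'*Yt);                % back substitution in (3.12)
   sigma2 = sum((Yt - Ft*beta).^2)/m;  % process variance (3.13)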


4. Experimental Design

Experimental design arises in this context in deciding how to select the inputs at which to run the deterministic computer code in order to most efficiently control or reduce the statistical uncertainty of the computed prediction. This section introduces two algorithms with "space filling" properties. Note that Latin hypercube designs are based on random numbers, and the other algorithm produces deterministic designs. See [6], [8], [9] or [10] for further discussion and more advanced designs.
4.1. Rectangular Grid

Assume that the region D ⊆ R^n under interest is a box, defined by ℓ_j ≤ x_j ≤ u_j, j = 1, ..., n. The simplest distribution of design sites is defined by all different combinations of

   s_j^{(i)} = ℓ_j + k_j^{(i)} (u_j − ℓ_j)/ν_j,   k_j^{(i)} = 0, 1, ..., ν_j,

where the {ν_j} are integers. If all ν_j = ν, then the number of these design points is (ν+1)^n.
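For example, all combinations over a two-dimensional box can be generated along these lines; this is a sketch of what the toolbox function gridsamp (Section 5.5) provides for general n:

   % Sketch: all (nu+1)^n grid combinations for n = 2.
   % Illustrative only; gridsamp (Section 5.5) handles general n.
   lo = [0 0];  up = [100 100];  nu = [4 4];    % box limits and intervals
   x1 = lo(1) + (0:nu(1))*(up(1)-lo(1))/nu(1);  % levels in direction 1
   x2 = lo(2) + (0:nu(2))*(up(2)-lo(2))/nu(2);  % levels in direction 2
   [X1, X2] = ndgrid(x1, x2);                   % all combinations
   S = [X1(:) X2(:)];                           % (nu+1)^2 by 2 design matrix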
4.2. Latin Hypercube Sampling

Latin hypercube sampling, due to McKay et al. [5], is a strategy for generating random sample points ensuring that all portions of the vector space are represented. Consider the case where we wish to sample m points in the n-dimensional vector space D ⊆ R^n. The Latin hypercube sampling strategy is as follows (a Matlab sketch is given after the list):

1. Divide the interval of each dimension into m non-overlapping intervals having equal probability (here we consider a uniform distribution, so the intervals should have equal size).

2. Sample randomly from a uniform distribution a point in each interval in each dimension.

3. Pair randomly (equally likely combinations) the points from each dimension.
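A compact way to realize the three steps for the unit box is sketched below; it mirrors what the toolbox function lhsamp (Section 5.5) does, but is illustrative only:

   % Sketch: Latin hypercube sample of m points in ]0,1[^n, cf. steps 1-3.
   % Illustrative only; the toolbox provides lhsamp (Section 5.5).
   m = 10;  n = 2;
   S = zeros(m, n);
   for j = 1:n
     pts = ((1:m)' - 1 + rand(m,1)) / m;  % steps 1-2: one point per interval
     S(:,j) = pts(randperm(m));           % step 3: random pairing across dims
   end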

5. Reference Manual

This section is a presentation of the functions in the DACE toolbox. The contents are

5.1. Model Construction
   dacefit        Find the DACE model to a given set of design data
                  and given regression and correlation models

5.2. Evaluate the Model
   predictor      Use the DACE model to predict the function at one
                  or more untried sites

5.3. Regression Models
   regpoly0       Zero order polynomial
   regpoly1       First order polynomial
   regpoly2       Second order polynomial

5.4. Correlation Models
   correxp        Exponential
   correxpg       Generalized exponential
   corrcubic      Cubic
   corrgauss      Gaussian
   corrlin        Linear
   corrspherical  Spherical
   corrspline     Cubic spline

5.5. Experimental Design
   gridsamp       Design sites in a rectangular grid
   lhsamp         Latin hypercube distributed random points

5.6. Auxiliary Functions
   dsmerge        Merge data for multiple design sites

5.7. Data Files
   data1.mat      Example data S and Y

In the following we give details of the Matlab functions. The file data1.mat is used in the first example of Section 6.
Installation: Obtain the archive dace.zip containing the software at the web-site http://www.imm.dtu.dk/~hbn/dace
Unzip the archive at a convenient place. On UNIX systems use the command unzip dace.zip, on Windows systems use WinZip or a similar program. The archive contains a folder named dace containing the software and this documentation. The path to the software (i.e., to the dace folder) should be included in the Matlab search path. Use either the pathtool or the addpath command. See the Matlab manual for further directions.

5.1. Model Construction

Purpose: Find the DACE model to a given set of design data and given regression and correlation models.

Call:
   [dmodel, perf] = dacefit(S, Y, regr, corr, theta0)
   [dmodel, perf] = dacefit(S, Y, regr, corr, theta0, lob, upb)

Input parameters:
   S         Design sites: an m×n array with S(i,:) = s_i^T.
   Y         m×q array with responses at S.
   regr      Handle to a function like (5.2) below.
   corr      Handle to a function like (5.3) below.
   theta0    If lob and upb are not present, then theta0 should hold the
             correlation function parameters, θ. Otherwise theta0 should
             hold an initial guess on θ. See Section 5.4 about permissible
             lengths of theta0.
   lob,upb   Optional. If present, then they should be vectors of the same
             length as theta0 and should hold respectively lower and upper
             bounds on θ. If they are not present, then θ is given by the
             values in theta0.

Output:
   dmodel    DACE model. Struct with the elements
             regr     handle to the regression function,
             corr     handle to the correlation function,
             theta    correlation function parameters, θ*,
             beta     generalized least squares estimate, β* in (2.16),
             gamma    correlation factors, γ* in (2.16),
             sigma2   estimate of the process variance σ²,
             S        scaled design sites, see (5.1) below,
             Ssc      2×n array with scaling factors for design sites,
             Ysc      2×q array with scaling factors for design responses,
             C        Cholesky factor of the correlation matrix, (3.16),
             Ft       decorrelated regression matrix, F̃ in (3.10),
             G        matrix G from (3.11).
   perf      Information about the optimization. Struct with the elements
             nv       number of evaluations of the objective function (2.25)
                      used to find θ*,
             perf     array with nv columns with current values of
                      [θ; ψ(θ); type]. |type| = 1, 2 or 3 indicates
                      "start", "explore" or "move". A negative value for
                      type indicates an uphill trial step.

Remarks. The first step in dacefit is to normalize the input so that (2.1) is satisfied,

   mS = mean(S);  sS = std(S);
   for j=1:n, S(:,j) = (S(:,j) - mS(j))/sS(j); end
   mY = mean(Y);  sY = std(Y);                              (5.1)
   for j=1:q, Y(:,j) = (Y(:,j) - mY(j))/sY(j); end

The values in mS and sS are returned in dmodel.Ssc, and dmodel.Ysc = [mY; sY].
All computation is performed on the normalized data, but the process variance is for the original data, (dmodel.sigma2)_j = sY_j² σ_j²*, where σ_j²* is the estimator for the jth column of the normalized responses.

The matrices R and C are stored in sparse format, and it is exploited that we only need to store the upper triangle of the symmetric matrix R.

As indicated in Section 2.3, the determination of the optimal correlation parameters θ* is an optimization problem with box constraints, ℓ_j ≤ θ_j ≤ u_j. We have developed a simple but efficient algorithm with successive coordinate search and pattern moves, as in the Hooke & Jeeves method, see e.g. [3, Section 2.4]. Details are given in [4, Section 6]. The objective function ψ(θ) ≡ |R|^{1/m} σ² was presented in [9] for the case q = 1. In the multiple response case we let σ² := σ_1² + ··· + σ_q², with each component of σ² computed by (3.13).

5.2. Evaluate the Model

Purpose: Use the DACE model to predict the function at one or more untried sites.

Call:
   y = predictor(x, dmodel)
   [y, or] = predictor(x, dmodel)
   [y, dy, mse] = predictor(x, dmodel)
   [y, dy, mse, dmse] = predictor(x, dmodel)

Input parameters:
   x        m trial site(s) with n dimensions. If m = 1 and n > 1, then
            both a row and a column vector is accepted. Otherwise x must
            be an m×n array with the sites stored rowwise.
   dmodel   Struct with the DACE model, see Section 5.1.

Output:
   y      Predicted response, y(i) = ŷ(x(i,:)), see (2.16).
   or     Optional result. Depends on the number of sites,
          m = 1:  gradient ŷ', (2.18); column vector with n elements,
          m > 1:  m-vector with estimated MSE, (3.14).
   dy     Optional result, allowed only if m = 1: n×q array with the
          Jacobian of ŷ (gradient ŷ'(x) (2.18) if q = 1).
   mse    Optional result, allowed only if m = 1: estimated MSE, (3.14).
   dmse   Optional result, allowed only if m = 1: n×q array with the
          Jacobian of φ (gradient φ'(x) (2.20) if q = 1).

Remarks. The computation is performed on normalized trial sites, cf. (5.1), but the returned results are in the original "units".

The special treatment of optional results when there is only one trial site was made so that predictor can be used as the objective function in a Matlab optimization function demanding the gradient.
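For example, one might minimize the predicted response with a gradient-based optimizer along these lines. This is an illustrative sketch; it assumes a fitted dmodel, the Optimization Toolbox function fminunc, and a Matlab version supporting anonymous functions:

   % Sketch: use predictor as objective in a gradient-based optimizer.
   opts = optimset('GradObj', 'on');           % a gradient is returned
   fun = @(x) predictor(x, dmodel);            % [y, dy] = fun(x), single site
   [xmin, ymin] = fminunc(fun, [50 50], opts); % minimize predicted response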
5.3. Regression Models

The toolbox provides functions that implement the polynomials discussed in Section 2.2. All of these conform with the specifications given in (5.2) at the end of this section.

Purpose: Get values of the zero order polynomial, (2.21).

Call:
   f = regpoly0(S)
   [f, df] = regpoly0(S)

Input parameter:
   S   m×n matrix with design sites stored rowwise.

Output:
   f    m×1 vector with all ones.
   df   Optional result, Jacobian for the first site: n×1 vector with all zeros.
Purpose: Get values of first order polynomials, (2.22).

Call:
   f = regpoly1(S)
   [f, df] = regpoly1(S)

Input parameter:
   S   m×n matrix with design sites stored rowwise.

Output:
   f    m×(n+1) matrix with f(i,j) = f_j(x_i).
   df   Optional result, Jacobian for the first site, Section 2.2.

Purpose: Get values of second order polynomials, (2.23).

Call:
   f = regpoly2(S)
   [f, df] = regpoly2(S)

Input parameter:
   S   m×n matrix with design sites stored rowwise.

Output:
   f    m×p matrix with f(i,j) = f_j(x_i); p = ½(n+1)(n+2).
   df   Optional result, Jacobian for the first site, Section 2.2.

Remark. The user may supply a new regression model in the form of a function that must have a declaration of the form

   function [f, df] = regress(S)        (5.2)

For a given set of m sites S with the ith site stored in S(i,:) it should return the m×p matrix f with elements f(i,j) = f_j(S(i,:)), cf. (2.3). If it is called with two output arguments, then the Jacobian J_f, (2.19), should be returned in df. This option is needed only if the predictor is called with the option of returning also the gradient.

5.4. Correlation Models

The toolbox provides seven functions that implement the models presented in Table 2.1, see the list at the beginning of Section 5. All of these conform with the specifications given in (5.3) at the end of this section. We only present one of them in detail.

Purpose: Get values of the function denoted exp in Table 2.1.

Call:
   r = correxp(theta, d)
   [r, dr] = correxp(theta, d)

Input parameters:
   theta   Parameters in the correlation function. A scalar value is
           allowed. This corresponds to an isotropic model: all θ_j equal
           to theta. Otherwise, the number of elements in theta must equal
           the dimension n given in d.
   d       m×n array with differences between sites.

Output:
   r    Correlations, r(i) = R(θ, d(i,:)).
   dr   Optional result: m×n array with the Jacobian J_r, (2.19).

Remarks. The Jacobian is meaningful only when d holds the differences between a point x and the design sites, as given in S, d_{i,:} = x^T − S_{i,:}. The expression given in Table 2.1 can be written as

   r_i = ∏_{j=1}^{n} exp(−θ_j |d_ij|) = exp( −∑_{j=1}^{n} θ_j ς_ij (x_j − S_ij) ),

where ς_ij is the sign of d_ij. The corresponding Jacobian is given by

   (J_r)_ij = −θ_j ς_ij r_i.
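As an illustration of these formulas, the correlations and the Jacobian for the exp model can be computed in vectorized form roughly as follows. This is a sketch of what correxp does, not the toolbox source:

   % Sketch: correlations r and Jacobian dr for the exp model, cf. Table 2.1.
   % d is m*n with d(i,:) = x' - S(i,:); illustrative version of correxp.
   [m, n] = size(d);
   th = theta(:)';                            % row vector of parameters
   if length(th) == 1, th = th*ones(1,n); end % isotropic model
   r  = exp(-abs(d) * th');                   % r_i = exp(-sum_j theta_j|d_ij|)
   dr = -repmat(th, m, 1) .* sign(d) .* repmat(r, 1, n);  % (J_r)_ij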
Remarks. The handling of isotropic models is similar in corrcubic, corrgauss, corrlin, corrspherical and corrspline, while correxpg needs special treatment of the exponent θ_{n+1}. Here theta must be a vector with either n+1 or 2 elements. In the latter case θ_j = theta(1), j = 1, ..., n, and θ_{n+1} = theta(2).

The user may supply a new correlation model in the form of a function that must have a declaration of the form

   function [r, dr] = corr(theta, d)        (5.3)

For a given set of parameters theta = θ and differences d(i,:) = x^T − S_{i,:} it should return the column vector r with elements r(i) = R(θ, x, S_{i,:}). If it is called with two output arguments, then the Jacobian J_r, (2.19), should be returned in dr. This option is needed only if the predictor is called with the option of returning also the gradient.

5.5. Experimental Design

Currently the toolbox provides implementations of the two designs of Section 4:

Purpose: Get design sites in a rectangular grid, Section 4.1.

Call:
   X = gridsamp(range, q)

Input parameters:
   range   2×n array with lower and upper limits, range(:,j) = [ℓ_j u_j]^T.
   q       Vector with q(j) holding the number of intervals in the jth
           direction. If q is a scalar, then this number is used in all
           directions.

Output:
   X   m×n array with grid points.

Purpose: Compute Latin hypercube distributed random points, Section 4.2.

Call:
   S = lhsamp
   S = lhsamp(m)
   S = lhsamp(m, n)

Input parameters:
   m   Number of sample points to generate. If not present, then m = 1.
   n   Number of dimensions. If not present, then n = m.

Output:
   S   m×n matrix with the generated sample points chosen from uniform
       distributions on m subdivisions of the interval ]0.0, 1.0[.

5.6. Auxiliary Functions

The kriging model presumes distinct design sites, and the toolbox provides a function that can compress the data if this condition is not satisfied.

Purpose: Merge data for multiple design sites.

Call:
   [mS, mY] = dsmerge(S, Y)
   [mS, mY] = dsmerge(S, Y, ds)
   [mS, mY] = dsmerge(S, Y, ds, nms)
   [mS, mY] = dsmerge(S, Y, ds, nms, wtds)
   [mS, mY] = dsmerge(S, Y, ds, nms, wtds, wtdy)

Input parameters:
   S     m×n array with design sites stored rowwise.
   Y     m×q array with responses at S.
   ds    Threshold for equal, normalized sites. Default is 1e-14.
   nms   Norm, in which the distance is measured.
         nms = 1:  1-norm (sum of absolute coordinate differences),
               2:  2-norm (Euclidean distance) (default).
   wtds  What to do with the S-values in case of multiple points.
         wtds = 1:  return the mean value (default),
                2:  return the median value,
                3:  return the 'cluster center'.
   wtdy  What to do with the Y-values in case of multiple points.
         wtdy = 1:  return the mean value (default),
                2:  return the median value,
                3:  return the 'cluster center' value,
                4:  return the minimum value,
                5:  return the maximum value.

Output:
   mS   Compressed design sites, with multiple points merged according to wtds.
   mY   Responses, compressed according to wtdy.

5.7. Data Files

Currently the toolbox contains one test data set, illustrated in Section 6. The command

   load data1

makes the arrays S ∈ R^{75×2} and Y ∈ R^{75×1} available in the workspace; i.e., m = 75, n = 2, q = 1. The design sites stored in S are sampled in the two-dimensional area [0, 100]².

6. Examples of Usage

6.1. Work-through Example

This example demonstrates simple usage of the two most important functions in the toolbox, namely dacefit and predictor. The example shows how you can obtain a surface approximation and corresponding error estimates for the approximation to a given data set. The example also shows how gradient approximations at given points can be obtained for both the predictor and the error estimate.

We start by loading the data set data1.mat provided with the toolbox, cf. Section 5.7,

   load data1

Now the 75×2 array S and 75×1 array Y are present in the workspace. We choose the poly0 regression function and the gauss correlation function. Assuming anisotropy we choose the following starting point and bounds for θ,

   theta = [10 10];  lob = [1e-1 1e-1];  upb = [20 20];

We are now ready to make the model by calling dacefit,

   [dmodel, perf] = ...
         dacefit(S, Y, @regpoly0, @corrgauss, theta, lob, upb)

From the returned results we can extract information about the generated model. The number of evaluations of the objective function (2.25) to find θ* and the values of θ* are

   perf.nv = 15
   dmodel.theta = [3.5355 2.1022]

The generalized least squares estimate β* and the estimated process variance σ² are
   dmodel.beta = 0.0918
   dmodel.sigma2 = 3.3995

Having the model stored in the structure array dmodel we may use it for prediction at new (untried) sites. We generate a grid of points on which to evaluate the predictor (2.16). We choose a 40×40 mesh of points distributed equidistantly in the area [0, 100]² covered by the design sites, cf. Section 5.7, and call the kriging predictor with the mesh points and the dmodel,

   X = gridsamp([0 0;100 100], 40);
   [YX MSE] = predictor(X, dmodel);

The returned vector YX holds the predicted values at the mesh points, and MSE holds the mean squared error for each predicted point. To plot the results we reshape the coordinate matrix and YX to match the grid,

   X1 = reshape(X(:,1),40,40);  X2 = reshape(X(:,2),40,40);
   YX = reshape(YX, size(X1));

and make a mesh plot of the predicted values at the grid points, adding the design sites,

   figure(1), mesh(X1, X2, YX)
   hold on,
   plot3(S(:,1),S(:,2),Y,'.k', 'MarkerSize',10)
   hold off

The resulting plot is shown in Figure 6.1.

   [Figure 6.1. Predicted values]

Next, to get a mesh plot of the mean squared error in a new figure window we issue the commands

   figure(2), mesh(X1, X2, reshape(MSE, size(X1)) )

The resulting plot is shown in Figure 6.2. From the figure we note how areas with few design sites (e.g. the center area) have high MSE values.

   [Figure 6.2. Mean squared error]
The predictor function also allows for prediction of gradients provided a single point, e.g. here is how to predict the gradient at the first design site,

   [y, dy] = predictor(S(1,:), dmodel)
   y = 34.1000
   dy = 0.2526
        0.1774

The gradient of the MSE is also available, e.g. for the point (50, 50),

   [y, dy, mse, dmse] = predictor([50 50], dmodel)
   y = 38.0610
   dy = -0.0141
        -0.0431
   mse = 0.7526
   dmse = 0.0004
          0.0187

6.2. Adding a Regression Function

This example shows how a user provided regression function can be added to the toolbox. Adding a correlation function is done in virtually the same way, for which reason a specific example of such is not given here.

As noted in the reference manual in Section 5, the regression functions and the correlation functions must be implemented with a specific interface. Below is an example of how to implement a reduced (without the cross-dimensional products) second order polynomial regression function f(x) = [1 x_1 ... x_n x_1² ... x_n²]^T.

   function [f, df] = regpoly2r(S)
   %REGPOLY2R Reduced 2nd order polynomial regression function
   % Call:  [f, df] = regpoly2r(S)
   % S  : m*n matrix with design sites
   % f  = [1 S S.^2]
   % df is the Jacobian at the first point (first row in S)
   [m n] = size(S);
   f = [ones(m,1) S S.^2];
   if nargout > 1
     df = [zeros(n,1) eye(n) 2*diag(S(1,:))];
   end

Note that the array f is m×p with p = 1 + 2n. The last part with the gradient calculation is needed only if the gradient feature of the predictor function is used.

7. Notation

   m, n         number of design sites and their dimensionality
   p            number of basis functions in the regression model
   q            dimensionality of responses
   F(β, x)      regression model, F(β, x) = f(x)^T β, see Section 2
   R(θ, w, x)   correlation function, see Section 2.3
   C            factorization (e.g. Cholesky) of R, R = C C^T
   E[·]         expectation operator
   f_j          basis function for the regression model
   f            p-vector, f(x) = [f_1(x) ··· f_p(x)]^T
   F            expanded design m×p-matrix, see (2.6)
   F̃, Ỹ         transformed data, see (3.9)
   R            m×m-matrix of stochastic-process correlations, see (2.7)
   r            m-vector of correlations, see (2.8)
   S            m×n matrix of design sites, see Section 2
   S_{i,:}, S_{:,j}   ith row and jth column in S, respectively
   s_i          ith design site, vector of length n, s_i^T = S_{i,:}
   V[w, x]      covariance between w and x, (2.4)
   w, x         n-dimensional trial points
   x_j          jth component in x
   Y            m×q-matrix of responses, see Section 2
   y_i          response at the ith design site, q-vector
   ŷ            predicted response, see (2.2)
   z            q-dimensional stochastic process, see (2.2)
   β            p×q-matrix of regression parameters, see (2.2), (2.16)
   γ            m×q-matrix of correlation constants, see (2.16)
   θ            parameters of the correlation model, see Section 2.3
   σ²           process variance (of z), see (2.4)
   φ(x)         mean squared error of ŷ, see (2.11), (3.14)

References

[1] G. Golub, C. Van Loan, Matrix Computations, Johns Hopkins University Press, Baltimore, USA, 3rd edition, 1996.

[2] E.H. Isaaks, R.M. Srivastava, An Introduction to Applied Geostatistics, Oxford University Press, New York, USA, 1989.

[3] J. Kowalik, M.R. Osborne, Methods for Unconstrained Optimization Problems, Elsevier, New York, USA, 1968.

[4] S.N. Lophaven, H.B. Nielsen, J. Søndergaard, Aspects of the Matlab Toolbox DACE, Report IMM-REP-2002-13, Informatics and Mathematical Modelling, DTU, 2002, 44 pages. Available as http://www.imm.dtu.dk/~hbn/publ/TR0213.ps

[5] M.D. McKay, W.J. Conover, R.J. Beckman, A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output from a Computer Code, Technometrics, vol. 21, no. 2, 1979.

[6] W.G. Müller, Collecting Spatial Data. Optimum Design of Experiments for Random Fields, Physica-Verlag, Heidelberg, Germany, 2001.

[7] J. Nocedal, S.J. Wright, Numerical Optimization, Springer, New York, USA, 1999.

[8] J.A. Royle, D. Nychka, An Algorithm for the Construction of Spatial Coverage Designs with Implementation in Splus, Computers and Geosciences, vol. 24, no. 5, pp. 479-488, 1998.

[9] J. Sacks, W.J. Welch, T.J. Mitchell, H.P. Wynn, Design and Analysis of Computer Experiments, Statistical Science, vol. 4, no. 4, pp. 409-435, 1989.

[10] T.W. Simpson, J.D. Peplinski, P.N. Koch, J.K. Allen, Metamodels for Computer-Based Engineering Design: Survey and Recommendations, Engineering with Computers, vol. 17, pp. 129-150, 2001.